
Matrix Algebra: Exercises and Solutions

Springer Science+Business Media, LLC


David A. Harville

Matrix Algebra:
Exercises and Solutions

" Springer
David A. Harville
Mathematical Sciences Department
IBM T.J. Watson Research Center
Yorktown Heights, NY 10598-0218
USA

Library of Congress Cataloging-in-Publication Data


Harville, David A.
Matrix algebra: exercises and solutions / David A. Harville.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-387-95318-2 ISBN 978-1-4613-0181-3 (eBook)
DOI 10.1007/978-1-4613-0181-3
1. Matrices--Problems, exercises, etc. I. Title.
QA188 .H38 2001
519.9'434--dc21 2001032838

Printed on acid-free paper.

© 2001 Springer Science+Business Media New York


Originally published by Springer-Verlag New York, Inc. in 2001
All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher Springer Science+Business Media, LLC, except for brief excerpts
in connection with reviews or scholarly analysis. Use in connection with any form of information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the
former are not especially identified, is not to be taken as a sign that such names, as understood by the
Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Production managed by Yong-Soon Hwang; manufacturing supervised by Jeffrey Taub.


Photocomposed copy prepared from the author's LaTeX file.

9 8 7 6 5 4 3 2 1

ISBN 978-0-387-95318-2
Preface

This book comprises well over three hundred exercises in matrix algebra and their
solutions. The exercises are taken from my earlier book Matrix Algebra From a
Statistician's Perspective. They have been restated (as necessary) to make them
comprehensible independently of their source. To further insure that the restated
exercises have this stand-alone property, I have included in the front matter a
section on terminology and another on notation. These sections provide definitions,
descriptions, comments, or explanatory material pertaining to certain terms and
notational symbols and conventions from Matrix Algebra From a Statistician's
Perspective that may be unfamiliar to a nonreader of that book or that may differ in
generality or other respects from those to which he/she is accustomed. For example,
the section on terminology includes an entry for scalar and one for matrix. These
are standard terms, but their use herein (and in Matrix Algebra From a Statistician's
Perspective) is restricted to real numbers and to rectangular arrays of real numbers,
whereas in various other presentations, a scalar may be a complex number or more
generally a member of a field, and a matrix may be a rectangular array of such
entities.
It is my intention that Matrix Algebra: Exercises and Solutions serve not only
as a "solution manual" for the readers of Matrix Algebra From a Statistician's
Perspective, but also as a resource for anyone with an interest in matrix algebra
(including teachers and students of the subject) who may have a need for exercises
accompanied by solutions. The early chapters of this volume contain a relatively
small number of exercises-in fact, Chapter 7 contains only one exercise and
Chapter 3 only two. This is because the corresponding chapters of Matrix Alge-
bra From a Statistician's Perspective cover relatively standard material, to which
many readers will have had previous exposure, and/or are relatively short. It is
the final ten chapters that contain the vast majority of the exercises. The topics
of many of these chapters are ones that may not be covered extensively (if at all)
in more standard presentations or that may be covered from a different perspec-
tive. Consequently, the overlap between the exercises from Matrix Algebra From
a Statistician's Perspective (and contained herein) and those available from other
sources is relatively small.
A considerable number of the exercises consist of verifying or deriving results
supplementary to those included in the primary coverage of Matrix Algebra From
a Statistician's Perspective. Thus, their solutions provide what are in effect proofs.
For many of these results, including some of considerable relevance and interest
in statistics and related disciplines, proofs have heretofore only been available (if
at all) through relatively high-level books or through journal articles.
The exercises are arranged in 22 chapters and, within each chapter, are numbered
successively (starting with 1). The arrangement, the numbering, and the chapter
titles match those in Matrix Algebra From a Statistician's Perspective. An exercise
from a different chapter is identified by a number obtained by inserting the chapter
number (and a decimal point) in front of the exercise number.
A considerable effort was expended in designing the exercises to insure an
appropriate level of difficulty-the book Matrix Algebra From a Statistician's
Perspective is essentially a self-contained treatise on matrix algebra; however, it
is aimed at a reader who has had at least some previous exposure to the subject
(of the kind that might be attained in an introductory course on matrix or linear
algebra). This effort included breaking some of the more difficult exercises into
relatively palatable parts and/or providing judicious hints.
The solutions presented herein are ones that should be comprehensible to those
with exposure to the material presented in the corresponding chapter of Matrix
Algebra From a Statistician's Perspective (and possibly to that presented in one
or more earlier chapters). When deemed helpful in comprehending a solution,
references are included to the appropriate results in Matrix Algebra From a Statis-
tician's Perspective-unless otherwise indicated a reference to a chapter, section,
or subsection or to a numbered result (theorem, lemma, corollary, "equation",
etc.) pertains to a chapter, section, or subsection or to a numbered result in Matrix
Algebra From a Statistician's Perspective (and is made by following the same con-
ventions as in the corresponding chapter of Matrix Algebra From a Statistician's
Perspective). What constitutes a "legitimate" solution to an exercise depends of
course on what one takes to be "given". If additional results are regarded as given,
then additional, possibly shorter solutions may become possible.
The ordering of topics in Matrix Algebra From a Statistician's Perspective is
somewhat nonstandard. In particular, the topic of eigenvalues and eigenvectors is
deferred until Chapter 21, which is the next-to-last chapter. Among the key results
on that topic is the existence of something called the spectral decomposition. This
result, if included among those regarded as given, could be used to devise alternative
solutions for a number of the exercises in the chapters preceding Chapter 21.
However, its use comes at a "price"; the existence of the spectral decomposition
can only be established by resort to mathematics considerably deeper than that
underlying the results of Chapters 1-20 in Matrix Algebra From a Statistician's
Perspective.
I am indebted to Emmanuel Yashchin for his support and encouragement in the
preparation of the manuscript for Matrix Algebra: Exercises and Solutions. I am
also indebted to Lorraine Menna, who entered much of the manuscript in LaTeX,
and to Barbara White, who participated in the latter stages of the entry process.
Finally, I wish to thank John Kimmel, who has been my editor at Springer-Verlag,
for his help and advice.
Contents

Preface v

Some Notation xi

Some Terminology xvii

1 Matrices 1

2 Submatrices and Partitioned Matrices 7

3 Linear Dependence and Independence 11

4 Linear Spaces: Row and Column Spaces 13

5 Trace of a (Square) Matrix 19

6 Geometrical Considerations 21

7 Linear Systems: Consistency and Compatibility 27

8 Inverse Matrices 29

9 Generalized Inverses 35

10 Idempotent Matrices 49

11 Linear Systems: Solutions 55

12 Projections and Projection Matrices 63

13 Determinants 69

14 Linear, Bilinear, and Quadratic Forms 79

15 Matrix Differentiation 113

16 Kronecker Products and the Vec and Vech Operators 139

17 Intersections and Sums of Subspaces 161

18 Sums (and Differences) of Matrices 179

19 Minimization of a Second-Degree Polynomial (in n Variables) Subject to Linear Constraints 209

20 The Moore-Penrose Inverse 221

21 Eigenvalues and Eigenvectors 231

22 Linear Transformations 251

References 265

Index 267
Some Notation

{xi} A row or (depending on the context) column vector whose ith element is xi
{aij} A matrix whose ijth element is aij (and whose dimensions are arbitrary
or may be inferred from the context)
A' The transpose of a matrix A
A^p The pth (for a positive integer p) power of a square matrix A; i.e., the matrix product AA···A defined recursively by setting A^0 = I and taking A^k = AA^{k-1} (k = 1, ..., p)
C(A) Column space of a matrix A
R(A) Row space of a matrix A
R^{m x n} The linear space comprising all m x n matrices
R^n The linear space R^{n x 1} comprising all n-dimensional column vectors or (depending on the context) the linear space R^{1 x n} comprising all n-dimensional row vectors
sp(S) Span of a finite set S of matrices; sp({A1, ..., Ak}), which represents the span of the set {A1, ..., Ak} comprising the k matrices A1, ..., Ak, is generally abbreviated to sp(A1, ..., Ak)
⊂ Writing S ⊂ T (or T ⊃ S) indicates that a set S is a (not necessarily proper) subset of a set T
dim (V) Dimension of a linear space V
rank A The rank of a matrix A
rank T The rank of a linear transformation T
tr(A) The trace of a (square) matrix A


0 The scalar zero (or, depending on the context, the zero transformation
from one linear space into another)
0 A null matrix (whose dimensions are arbitrary or may be inferred from
the context)
I An identity transformation
I An identity matrix (whose order is arbitrary or may be inferred from the
context)
In An identity matrix of order n
A · B Inner product of a pair of matrices A and B (or if so indicated, quasi-inner product of the pair A and B)
‖A‖ Norm of a matrix A (or, in the case of a quasi-inner product, the quasi norm of A)
δ(A, B) Distance between two matrices A and B
T^{-1} The inverse of an invertible transformation T
A^{-1} The inverse of an invertible matrix A
A- An arbitrary generalized inverse of a matrix A
N(A) Null space of a matrix A
N(T) Null space of a linear transformation T
⊥ A symbol for "is orthogonal to"
⊥_W A symbol used to indicate (by writing x ⊥_W y, x ⊥_W U, or U ⊥_W V) that 2 vectors x and y, a vector x and a subspace U, or 2 subspaces U and V are orthogonal with respect to a symmetric nonnegative definite matrix W
P_X The matrix X(X'X)-X' [which is invariant to the choice of the generalized inverse (X'X)-]
P_{X,W} The matrix X(X'WX)-X'W [which, if W is symmetric and positive definite, is invariant to the choice of the generalized inverse (X'WX)-]
U⊥ The orthogonal complement of a subspace U of a linear space V
C⊥(X) The orthogonal (with respect to the usual inner product) complement of the column space C(X) of an n x p matrix X [when C(X) is regarded as a subspace of R^n]
C⊥_W(X) The orthogonal complement of the column space C(X) of an n x p matrix X when, for an n x n symmetric positive definite matrix W, the inner product is taken to be the bilinear form x'Wy [and when C(X) is regarded as a subspace of R^n]
σ_n(i1, j1; ...; in, jn) A function whose value σ_n(i1, j1; ...; in, jn) for any two (not necessarily different) permutations of the first n positive integers is the number of negative pairs among the n(n - 1)/2 pairs that can be formed from the i1j1, ..., injn th elements of an n x n matrix
φ_n(i1, ..., in) A function whose value φ_n(i1, ..., in) for any sequence of n distinct integers i1, ..., in is p1 + ··· + p_{n-1}, where (for k = 1, ..., n - 1) p_k represents the number of integers in the subsequence i_{k+1}, ..., i_n that are smaller than i_k
|A| The determinant of a square matrix A - with regard to partitioned matrices, the determinant of a partitioned matrix [A11 ... A1c; ... ; Ar1 ... Arc] may be abbreviated by writing the array of blocks between vertical bars rather than between parentheses
det(A) The determinant of a square matrix A
adj(A) The adjoint matrix of a square matrix A
J A matrix, all of whose elements equal one (and whose dimensions are
arbitrary or may be inferred from the context)
Jmn An m x n matrix, all of whose mn elements equal one
A ⊗ B The Kronecker product of a matrix A and a matrix B - this notation
extends in an obvious way to the Kronecker product of 3 or more matrices
vecA The vec of a matrix A
vechA The vech of a (square) matrix A
Kmn The mn x mn vec-permutation (or commutation) matrix
Gn The n² x n(n + 1)/2 duplication matrix
Hn An arbitrary left inverse of Gn, so that Hn is any n(n + 1)/2 x n² matrix such that HnGn = I or equivalently such that, for every n x n symmetric matrix A, vech A = Hn vec A - one choice for Hn is Hn = (Gn'Gn)^{-1}Gn'
Djf(c) The jth (first-order) partial derivative of a function f, with domain S in R^{m x 1}, at an interior point c of S - the function whose value at a point c is Djf(c) is represented by the symbol Djf
∂f/∂xj The jth partial derivative of a function f of an m x 1 vector x = (x1, ..., xm)' - an alternative notation to Djf or Djf(x)
Df(c) The 1 x m vector [D1f(c), ..., Dmf(c)] (where f is a function with domain S in R^{m x 1} and where c is an interior point of S) - similarly, Df is the 1 x m vector (D1f, ..., Dmf)
∂f/∂x The m x 1 vector (∂f/∂x1, ..., ∂f/∂xm)' of partial derivatives of a function f of an m x 1 vector x = (x1, ..., xm)' - an alternative [to (Df)' or (Df(x))'] notation for the gradient vector
∂f/∂x' The 1 x m vector (∂f/∂x1, ..., ∂f/∂xm) of partial derivatives of a function f of an m x 1 vector x = (x1, ..., xm)' - equals (∂f/∂x)' and is an alternative notation to Df or Df(x)
D²ijf(c) The ijth second-order partial derivative of a function f, with domain S in R^{m x 1}, at an interior point c of S - the function whose value at a point c is D²ijf(c) is represented by the symbol D²ijf
∂²f/∂xi∂xj An alternative [to D²ijf or D²ijf(x)] notation for the ijth (second-order) partial derivative of a function f of an m x 1 vector x = (x1, ..., xm)' - this notation extends in a straightforward way to third- and higher-order partial derivatives
Hf The Hessian matrix of a function f - accordingly, Hf(c) represents the value of Hf at an interior point c of the domain of f
Djf The p x 1 vector (Djf1, ..., Djfp)', whose ith element is the jth partial derivative Djfi of the ith element fi of a p x 1 vector f = (f1, ..., fp)' of functions, each of whose domain is a set S in R^{m x 1} - similarly, Djf(c) = [Djf1(c), ..., Djfp(c)]', where c is an interior point of S
∂F/∂xj The p x q matrix whose stth element is the partial derivative ∂fst/∂xj of the stth element of a p x q matrix F = {fst} of functions of a vector x = (x1, ..., xm)' of m variables
∂²F/∂xi∂xj The p x q matrix whose stth element is the second-order partial derivative ∂²fst/∂xi∂xj of the stth element of a p x q matrix F = {fst} of functions of a vector x = (x1, ..., xm)' of m variables - this notation extends in a straightforward way to a p x q matrix whose stth element is one of the third- or higher-order partial derivatives of the stth element of F
Df The Jacobian matrix (D1f, ..., Dmf) of a vector f = (f1, ..., fp)' of functions, each of whose domain is a set S in R^{m x 1} - similarly, Df(c) = [D1f(c), ..., Dmf(c)], where c is an interior point of S
∂f/∂x' An alternative [to Df or Df(x)] notation for the Jacobian matrix of a vector f = (f1, ..., fp)' of functions of an m x 1 vector x = (x1, ..., xm)' - ∂f/∂x' is the p x m matrix whose ijth element is ∂fi/∂xj
∂f'/∂x An alternative [to (Df)' or (Df(x))'] notation for the gradient (matrix) of a vector f = (f1, ..., fp)' of functions of an m x 1 vector x = (x1, ..., xm)' - ∂f'/∂x is the m x p matrix whose jith element is ∂fi/∂xj
∂f/∂X The derivative of a function f of an m x n matrix X of mn "independent" variables or (depending on the context) of an n x n symmetric matrix X - the matrix ∂f/∂X' is identical to (∂f/∂X)'
U ∩ V The intersection of 2 sets U and V of matrices - this notation extends in an obvious way to the intersection of 3 or more sets
U ∪ V The union of 2 sets U and V of matrices (of the same dimensions) - this notation extends in an obvious way to the union of 3 or more sets
U +V The sum of 2 nonempty sets U and V of matrices (of the same dimen-
sions)-this notation extends in an obvious way to the sum of 3 or more
nonempty sets
U ⊕ V The direct sum of 2 (essentially disjoint) linear spaces U and V in R^{m x n} - writing U ⊕ V (rather than U + V) serves to emphasize, or (in the absence of any previous indication) imply, that U and V are essentially disjoint and hence that their sum is a direct sum
A+ The Moore-Penrose inverse of a matrix A
(kT) The scalar multiple of a scalar k and a transformation T from a linear
space V into a linear space W; in the absence of any ambiguity, the
parentheses may be dropped, that is, kT may be written in place of (kT)
(T + S) The sum of two transformations T and S from a linear space V into a
linear space W; in the absence of any ambiguity, the parentheses may be
dropped, that is, T + S may be written in place of (T + S) - this notation
extends in an obvious way to the sum of three or more transformations
(T S) The product of a transformation T from a linear space V into a linear
space W and a transformation S from a linear space U into V; in the
absence of any ambiguity, the parentheses may be dropped, that is, T S
may be written in place of (T S) - this notation extends in an obvious
way to the product of three or more transformations
L_B A transformation defined for any (nonempty) linearly independent set B of matrices (of the same dimensions), say the matrices Y1, Y2, ..., Yn: it is the transformation from R^{n x 1} onto the linear space W = sp(B) that assigns to each vector x = (x1, x2, ..., xn)' in R^{n x 1} the matrix x1Y1 + x2Y2 + ··· + xnYn in W.
Some Terminology

adjoint matrix The adjoint matrix of an n x n matrix A = {aij} is the transpose


of the cofactor matrix of A (or equivalently is the n x n matrix whose ijth
element is the cofactor of aji).
algebraic multiplicity The characteristic polynomial, say p(·), of an n x n matrix A has a unique (aside from the order of the factors) representation of the form
p(λ) = (-1)^n (λ - λ1)^{γ1} ··· (λ - λk)^{γk} q(λ)   (-∞ < λ < ∞),
where {λ1, ..., λk} is the spectrum of A (comprising the distinct scalars that are eigenvalues of A), γ1, ..., γk are (strictly) positive integers, and q is a polynomial (of degree n - Σ_{i=1}^k γi) that has no real roots; for i = 1, ..., k, γi is referred to as the algebraic multiplicity of the eigenvalue λi.
basis A basis for a linear space V is a finite set of linearly independent matrices
in V that spans V.
basis (natural) The natural basis for R^{m x n} comprises the mn matrices U11, U21, ..., Um1, ..., U1n, U2n, ..., Umn, where (for i = 1, ..., m and j = 1, ..., n) Uij is the m x n matrix whose ijth element equals 1 and whose remaining mn - 1 elements equal 0; the natural (or usual) basis for the linear space of all n x n symmetric matrices comprises the n(n + 1)/2 matrices U*11, U*21, ..., U*n1, ..., U*ii, U*i+1,i, ..., U*ni, ..., U*nn, where (for i = 1, ..., n) U*ii is the n x n matrix whose ith diagonal element equals 1 and whose remaining n² - 1 elements equal 0 and (for j < i = 1, ..., n) U*ij is the n x n matrix whose ijth and jith elements equal 1 and whose remaining n² - 2 elements equal 0.
bilinear form A bilinear form in an m x 1 vector x = (x1, ..., xm)' and an n x 1 vector y = (y1, ..., yn)' is a function of x and y (defined for x ∈ R^m and y ∈ R^n) that, for some m x n matrix A = {aij} (called the matrix of the bilinear form), is expressible as x'Ay = Σ_{i,j} aij xi yj - the bilinear form is said to be symmetric if m = n and x'Ay = y'Ax for all x and all y, or equivalently if the matrix A is symmetric.

block-diagonal A partitioned matrix of the form [A11 0 ... 0; 0 A22 ... 0; ... ; 0 0 ... Arr] (all of whose off-diagonal blocks are null matrices) is said to be block-diagonal and may be expressed in abbreviated notation as diag(A11, A22, ..., Arr).
block-triangular A partitioned matrix of the form [A11 A12 ... A1r; 0 A22 ... A2r; ... ; 0 0 ... Arr] or [A11 0 ... 0; A21 A22 ... 0; ... ; Ar1 Ar2 ... Arr] is respectively upper or lower block-triangular - to indicate that a partitioned matrix is upper or lower block-triangular (without specifying which), the matrix is referred to simply as block-triangular.
characteristic polynomial (and equation) Corresponding to any n x n matrix A is its characteristic polynomial, say p(·), defined (for -∞ < λ < ∞) by p(λ) = |A - λI|, and its characteristic equation p(λ) = 0 obtained by setting its characteristic polynomial equal to 0; p(λ) is a polynomial in λ of degree n and hence is of the form p(λ) = c0 + c1λ + ··· + c_{n-1}λ^{n-1} + c_nλ^n, where the coefficients c0, c1, ..., c_{n-1}, c_n depend on the elements of A.
Cholesky decomposition The Cholesky decomposition of a symmetric positive
definite matrix, say A, is the unique decomposition of the form A = T'T,
where T is an upper triangular matrix with positive diagonal elements. More
generally, the Cholesky decomposition of an n x n symmetric nonnegative
definite matrix, say A, of rank r is the unique decomposition of the form
A = T'T, where T is an n x n upper triangular matrix with r positive
diagonal elements and n - r null rows.
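
A small numerical check of this definition can be carried out with NumPy (an illustrative sketch only; the matrix below is arbitrary, and NumPy's cholesky returns the lower triangular factor L with A = LL', so its transpose gives the upper triangular T of the A = T'T convention used here):

import numpy as np

# An arbitrary symmetric positive definite matrix.
A = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])

L = np.linalg.cholesky(A)   # lower triangular, A = L L'
T = L.T                     # upper triangular factor in the A = T'T convention

print(np.allclose(A, T.T @ T))    # True
print(np.all(np.diag(T) > 0))     # True: positive diagonal elements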
cofactor (and minor) The cofactor and minor of the ijth element, say aij, of an n x n matrix A are defined in terms of the (n - 1) x (n - 1) submatrix, say Aij, of A obtained by striking out the ith row and jth column (i.e., the row and column containing aij): the minor of aij is |Aij|, and the cofactor is the "signed" minor (-1)^{i+j} |Aij|.
cofactor matrix The cofactor matrix (or matrix of cofactors) of an n x n matrix
A = {aij} is the n x n matrix whose ijth element is the cofactor of aij.
column space The column space of an m x n matrix A is the set whose elements
consist of all m-dimensional column vectors that are expressible as linear
combinations of the n columns of A.


commute Two n x n matrices A and B are said to commute if AB = BA.
commute in pairs n x n matrices, say A1, ..., Ak, are said to commute in pairs if AsAi = AiAs for s > i = 1, ..., k.
consistent A linear system is said to be consistent if it has one or more solutions.
continuous A function f, with domain S in R^{m x 1}, is continuous at an interior point c of S if lim_{x→c} f(x) = f(c).
continuously differentiable A function f, with domain S in R^{m x 1}, is continuously differentiable at an interior point c of S if D1f(c), D2f(c), ..., Dmf(c) exist and are continuous at every point x in some neighborhood of c - a vector or matrix of functions is continuously differentiable at c if all of its elements are continuously differentiable at c.
derivative of a function of a matrix The derivative of a function f of an m x n matrix X = {xij} of mn "independent" variables is the m x n matrix whose ijth element is the partial derivative ∂f/∂xij of f with respect to xij when f is regarded as a function of an mn-dimensional column vector x formed from X by rearranging its elements; the derivative of a function f of an n x n symmetric (but otherwise unrestricted) matrix of variables is the n x n (symmetric) matrix whose ijth element is the partial derivative ∂f/∂xij or ∂f/∂xji of f with respect to xij or xji when f is regarded as a function of an n(n + 1)/2-dimensional column vector x formed from any set of n(n + 1)/2 nonredundant elements of X.
determinant The determinant of an n x n matrix A = {aij} is (by definition) the (scalar-valued) quantity Σ (-1)^{φn(j1,...,jn)} a_{1,j1} ··· a_{n,jn}, or equivalently the quantity Σ (-1)^{φn(i1,...,in)} a_{i1,1} ··· a_{in,n}, where j1, ..., jn or i1, ..., in is a permutation of the first n positive integers and the summation is over all such permutations.
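
The definition can be checked directly on a small matrix; the following sketch (hypothetical helper code, assuming NumPy and the standard library's itertools) counts, for each permutation, the number of later entries smaller than the current one, which is the quantity φn defined above, and compares the resulting signed sum with np.linalg.det:

import numpy as np
from itertools import permutations

def phi(perm):
    # phi_n(j1, ..., jn): for each position k, count the later entries that are smaller.
    n = len(perm)
    return sum(perm[l] < perm[k] for k in range(n) for l in range(k + 1, n))

def det_by_definition(A):
    n = A.shape[0]
    total = 0.0
    for perm in permutations(range(n)):
        term = (-1.0) ** phi(perm)
        for i, j in enumerate(perm):
            term *= A[i, j]        # the product a_{1,j1} a_{2,j2} ... a_{n,jn}
        total += term
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 4.0],
              [0.0, 5.0, 6.0]])
print(det_by_definition(A), np.linalg.det(A))   # both approximately -10.0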
diagonalization An n x n matrix, say A, is said to be diagonalizable if there exists an n x n nonsingular matrix Q such that Q^{-1}AQ is diagonal, in which case Q is said to diagonalize A (or A is said to be diagonalized by Q); a matrix that can be diagonalized by an orthogonal matrix is said to be orthogonally diagonalizable.
diagonalization (simultaneous) k matrices, say A1, ..., Ak, of dimensions n x n, are said to be simultaneously diagonalizable if all k of them can be diagonalized by the same matrix, that is, if there exists an n x n nonsingular matrix Q such that Q^{-1}A1Q, ..., Q^{-1}AkQ are all diagonal, in which case Q is said to simultaneously diagonalize A1, ..., Ak (or A1, ..., Ak are said to be simultaneously diagonalized by Q).
dimension (of a linear space) The dimension of a linear space V is the number
of matrices in a basis for V.
dimension (of a row or column vector) A row or column vector having n ele-
ments is said to be of dimension n.
dimensions (of a matrix) A matrix having m rows and n columns is said to be of dimensions m x n.
direct sum If 2 linear spaces in R^{m x n} are essentially disjoint, their sum is said to be a direct sum.
distance The distance between two matrices A and B in a linear space V is ‖A - B‖.
dual transformation Corresponding to any linear transformation T from an n-dimensional linear space V into an m-dimensional linear space W is a linear transformation from W into V called the dual transformation: denoting by X ∘ Z the inner product of an arbitrary pair of matrices X and Z in V and by U ∗ Y the inner product of an arbitrary pair of matrices U and Y in W, the dual transformation is the (unique) linear transformation, say S, from W into V such that (for every matrix X in V and every matrix Y in W) X ∘ S(Y) = T(X) ∗ Y; further, for all Y in W, S(Y) = Σ_{j=1}^n [Y ∗ T(Xj)] Xj, where X1, X2, ..., Xn are any matrices that form an orthonormal basis for V.
duplication matrix The n² x n(n + 1)/2 duplication matrix is the matrix, denoted by the symbol Gn, such that, for every n x n symmetric matrix A, vec A = Gn vech A.
eigenspace The eigenspace of an eigenvalue, say λ, of an n x n matrix A is the linear space N(A - λI) - with the exception of the n x 1 null vector, every member of this space is an eigenvector (of A) corresponding to λ.
eigenvalues and eigenvectors An eigenvalue of an n x n matrix A is (by definition) a scalar (real number), say λ, for which there exists an n x 1 nonnull vector, say x, such that Ax = λx, or equivalently such that (A - λI)x = 0; any such vector x is referred to as an eigenvector (of A) and is said to belong to (or correspond to) the eigenvalue λ - eigenvalues (and eigenvectors), as defined herein, are restricted to real numbers (and vectors of real numbers).
eigenvalues (not necessarily distinct) The characteristic polynomial, say p(·), of an n x n matrix A is expressible as
p(λ) = (-1)^n (λ - d1)(λ - d2) ··· (λ - dm) q(λ)   (-∞ < λ < ∞),
where d1, d2, ..., dm are not-necessarily-distinct scalars and q(·) is a polynomial (of degree n - m) that has no real roots; d1, d2, ..., dm are referred to as the not-necessarily-distinct eigenvalues of A or (at the possible risk of confusion) simply as the eigenvalues of A - if the spectrum of A has k members, say λ1, ..., λk, with algebraic multiplicities of γ1, ..., γk, respectively, then m = Σ_{i=1}^k γi, and (for i = 1, ..., k) γi of the m not-necessarily-distinct eigenvalues equal λi.
essentially disjoint Two subspaces, say U and V, of R^{m x n} are (by definition) essentially disjoint if U ∩ V = {0}, i.e., if the only matrix they have in common is the (m x n) null matrix - note that every subspace of R^{m x n} contains the (m x n) null matrix, so that no two subspaces can be entirely disjoint.
full column rank An m x n matrix A is said to have full column rank if rank(A) = n.
full row rank An m x n matrix A is said to have full row rank if rank(A) = m.
generalized eigenvalue problem The generalized eigenvalue problem consists of finding, for a symmetric matrix A and a symmetric positive definite matrix B, the roots of the polynomial |A - λB| (i.e., the solutions for λ to the equation |A - λB| = 0).
generalized inverse A generalized inverse of an m x n matrix A is any n x m matrix G such that AGA = A - if A is nonsingular, its only generalized inverse is A^{-1}; otherwise, it has infinitely many generalized inverses.
geometric multiplicity The geometric multiplicity of an eigenvalue, say λ, of an n x n matrix A is (by definition) dim[N(A - λI)] (i.e., the dimension of the eigenspace of λ).
gradient (or gradient matrix) The gradient of a vector f = (f1, ..., fp)' of functions, each of whose domain is a set in R^{m x 1}, is the m x p matrix [(Df1)', ..., (Dfp)'], whose jith element is Djfi - the gradient of f is the transpose of the Jacobian matrix of f.
gradient vector The gradient vector of a function f, with domain in R^{m x 1}, is the m-dimensional column vector (Df)', whose jth element is the partial derivative Djf of f.
Hessian matrix The Hessian matrix of a function f, with domain in R^{m x 1}, is the m x m matrix whose ijth element is the ijth partial derivative D²ijf of f.
homogeneous linear system A linear system (in a matrix X) of the form AX = 0; i.e., a linear system whose right side is a null matrix.
idempotent A (square) matrix A is idempotent if A² = A.
identity transformation An identity transformation is a transformation from a linear space V onto V defined by T(X) = X.
indefinite A square (symmetric or nonsymmetric) matrix or a quadratic form is (by definition) indefinite if it is neither nonnegative definite nor nonpositive definite - thus, an n x n matrix A and the quadratic form x'Ax (in an n x 1 vector x) are indefinite if x'Ax < 0 for some x and x'Ax > 0 for some (other) x.
inner product The inner product A · B of an arbitrary pair of matrices A and B in a linear space V is the value assigned to A and B by a designated function having the following 4 properties: (1) A · B = B · A; (2) A · A ≥ 0, with equality holding if and only if A = 0; (3) (kA) · B = k(A · B) (where k is an arbitrary scalar); (4) (A + B) · C = (A · C) + (B · C) (where C is an arbitrary matrix in V) - the quasi-inner product A · B is defined in the same way as the inner product except that Property (2) is replaced by the weaker property (2') A · A ≥ 0, with equality holding if A = 0.
inner product (usual) The usual inner product of a pair of matrices A and B in a
linear space is tr(A'B) (which in the special case of a pair of column vectors
a and b reduces to a'b).


interior point A matrix, say X, in a set S of m x n matrices is an interior point of S if there exists a neighborhood, say N, of X such that N ⊂ S.
intersection The intersection of 2 sets, say U and V, of m x n matrices is the set comprising all matrices that are contained in both U and V; more generally, the intersection of k sets, say U1, ..., Uk, of m x n matrices is the set comprising all matrices that are contained in every one of U1, ..., Uk.
invariant subspace A subspace U of the linear space R^{n x 1} is said to be invariant relative to an n x n matrix A if, for every vector x in U, the vector Ax is also in U; a subspace U of an n-dimensional linear space V is said to be invariant relative to a linear transformation T from V into V if T(U) ⊂ U, that is, if the image T(U) of U is a subspace of U itself.
inverse (matrix) A matrix B that is both a right and left inverse of a matrix A (so
that AB = I and BA = I) is called an inverse of A.
inverse (transformation) The inverse of an invertible transformation T from a
linear space V into a linear space W is the transformation from W into V
that assigns to each matrix Y in W the (unique) matrix X (in V) such that
T(X) = Y.
invertible (matrix) A matrix that has an inverse is said to be invertible-a matrix
is invertible if and only if it is nonsingular.
invertible (transformation) A transformation from a linear space V into a linear
space W is (by definition) invertible if it is both 1-1 and onto.
involutory A (square) matrix A is involutory if A² = I, i.e., if it is invertible and is its own inverse.
isomorphic If there exists a 1-1 linear transformation, say T, from a linear space
V onto a linear space W, then V and W are said to be isomorphic, and T is
said to be an isomorphism of V onto W.
Jacobian matrix The Jacobian matrix of a p-dimensional vector f = (f1, ..., fp)' of functions, each of whose domain is a set in R^{m x 1}, is the p x m matrix (D1f, ..., Dmf), whose ijth element is Djfi - in the special case where p = m, the determinant of this matrix is referred to as the Jacobian (or Jacobian determinant) of f.
Kronecker product The Kronecker product of two matrices, say an m x n matrix A = {aij} and a p x q matrix B, is the mp x nq matrix [a11B a12B ... a1nB; a21B a22B ... a2nB; ... ; am1B am2B ... amnB] obtained by replacing each element aij of A with the p x q matrix aijB - the Kronecker-product operation is associative [for any 3 matrices A, B, and C, A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C], so that the notion of a Kronecker product extends in an unambiguous way to 3 or more matrices.
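
A quick numerical illustration (a sketch using NumPy's kron, which follows the same block-replacement convention) of the definition and of the associativity just noted:

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 5],
              [6, 7]])
C = np.array([[1, 0, 2]])

K = np.kron(A, B)
# The upper-left p x q block of the Kronecker product is a11*B.
print(np.array_equal(K[:2, :2], A[0, 0] * B))                  # True
# Associativity: A (x) (B (x) C) = (A (x) B) (x) C.
print(np.array_equal(np.kron(A, np.kron(B, C)),
                     np.kron(np.kron(A, B), C)))               # True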
k times continuously differentiable A function f, with domain S in R^{m x 1}, is k times continuously differentiable at an interior point c of S if it and all of its first- through (k - 1)th-order partial derivatives are continuously differentiable at c or, equivalently, if all of the first- through kth-order partial derivatives of f exist and are continuous at every point in some neighborhood of c - a vector or matrix of functions is k times continuously differentiable at c if all of its elements are k times continuously differentiable at c.
LDU decomposition An LDU decomposition of a square matrix, say A, is a
decomposition of the form A = LDU, where L is a unit lower triangular
matrix, D a diagonal matrix, and U an upper triangular matrix.
least squares generalized inverse A generalized inverse, say G, of an m x n
matrix A is said to be a least squares generalized inverse (of A) if (AG)' =
AG; or, equivalently, an n x m matrix is a least squares generalized inverse
of A if it satisfies Moore-Penrose Conditions (1) and (3).
left inverse A left inverse of an m x n matrix A is an n x m matrix L such that LA = In - a matrix has a left inverse if and only if it has full column rank.
linear dependence or independence A nonempty (but finite) set of matrices (of the same dimensions), say A1, A2, ..., Ak, is (by definition) linearly dependent if there exist scalars x1, x2, ..., xk, not all 0, such that Σ_{i=1}^k xiAi = 0; otherwise (if no such scalars exist), the set is linearly independent - by convention, the empty set is linearly independent.
linear space The use of this term is confined (herein) to sets of matrices (all of
which have the same dimensions). A nonempty set, say V, is called a linear
space if: (1) for every matrix A in V and every matrix B in V, the sum A + B
is in V; and (2) for every matrix A in V and every scalar k, the product kA
is in V.
linear system A linear system is (for some positive integers m, n, and p) a set of
mp simultaneous equations expressible in nonmatrix form as Σ_{j=1}^n aij xjk = bik (i = 1, ..., m; k = 1, ..., p), or in matrix form as AX = B, where A
= {aij} is an m x n matrix comprising the "coefficients", X = {Xjk} is an
n x p matrix comprising the "unknowns", and B = {bik} is an m x p matrix
comprising the "right (hand) sides"-A is referred to as the coefficient matrix
and B as the right side of AX = B; and to emphasize that X comprises the
unknowns, AX = B is referred to as a linear system in X.
linear transformation A transformation, say T, from a linear space V (of m x n
matrices) into a linear space W (of p x q matrices) is said to be linear if it
satisfies the following two conditions: (1) for all X and Z in V, T(X + Z)
= T(X) + T(Z); and (2) for every scalar c and for all X in V, T(cX) =
cT(X) - in the special case where W = R, it is customary to refer to a linear transformation from V into W as a linear functional on V.
matrix The use of the term matrix is confined (herein) to real matrices, i.e., to rectangular arrays of real numbers.
matrix representation The matrix representation of a linear transformation from an n-dimensional linear space V, with a basis B comprising matrices V1, V2, ..., Vn, into a linear space W, with a basis C comprising matrices W1, W2, ..., Wm, is the m x n matrix A = {aij} whose jth column is (for j = 1, 2, ..., n) uniquely determined by the equality
T(Vj) = a1jW1 + a2jW2 + ··· + amjWm;
this matrix (which depends on the choice of B and C) is such that if x = {xj} is the n x 1 vector that comprises the coordinates of a matrix V (in V) in terms of the basis B (i.e., V = Σ_j xjVj), then the m x 1 vector y = {yi} given by the formula y = Ax comprises the coordinates of T(V) in terms of the basis C [i.e., T(V) = Σ_i yiWi].
minimum norm generalized inverse A generalized inverse, say G, of an m x n matrix A is said to be a minimum norm generalized inverse (of A) if (GA)' = GA; or, equivalently, an n x m matrix is a minimum norm generalized inverse of A if it satisfies Moore-Penrose Conditions (1) and (4).
Moore-Penrose inverse (and conditions) Corresponding to any m x n matrix A, there is a unique n x m matrix, say G, such that (1) AGA = A (i.e., G is a generalized inverse of A), (2) GAG = G (i.e., A is a generalized inverse of G), (3) (AG)' = AG (i.e., AG is symmetric), and (4) (GA)' = GA (i.e., GA is symmetric). This matrix is called the Moore-Penrose inverse (or pseudoinverse) of A, and the four conditions that (in combination) define this matrix are referred to as Moore-Penrose (or Penrose) Conditions (1)-(4).
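
For instance, the four conditions can be verified numerically with NumPy's pinv (an illustrative sketch; the rank-deficient matrix below is arbitrary):

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])      # rank 1, so A has no ordinary inverse
G = np.linalg.pinv(A)

print(np.allclose(A @ G @ A, A))       # Condition (1): AGA = A
print(np.allclose(G @ A @ G, G))       # Condition (2): GAG = G
print(np.allclose((A @ G).T, A @ G))   # Condition (3): AG is symmetric
print(np.allclose((G @ A).T, G @ A))   # Condition (4): GA is symmetric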
negative definite An n x n (symmetric or nonsymmetric) matrix A and the quadratic form x'Ax (in an n x 1 vector x) are (by definition) negative definite if -x'Ax is a positive definite quadratic form (or equivalently if -A is a positive definite matrix) - thus, A and x'Ax are negative definite if x'Ax < 0 for every nonnull x in R^n.
negative or positive pair Any pair of elements of an n x n matrix A = {aij} that do not lie either in the same row or the same column, say aij and ai'j' (where i' ≠ i and j' ≠ j), is (by definition) either a negative pair or a positive pair: it is a negative pair if one of the elements is located above and to the right of the other, or equivalently if either i' > i and j' < j or i' < i and j' > j; otherwise (if one of the elements is located above and to the left of the other, or equivalently if either i' > i and j' > j or i' < i and j' < j), it is a positive pair - note that whether a pair of elements is a negative pair or a positive pair is completely determined by the elements' relative locations and has nothing to do with whether the numerical values of the elements are positive or negative.
negative semidefinite An n x n (symmetric or nonsymmetric) matrix A and the quadratic form x'Ax (in an n x 1 vector x) are (by definition) negative semidefinite if -x'Ax is a positive semidefinite quadratic form (or equivalently if -A is a positive semidefinite matrix) - thus, A and x'Ax are negative semidefinite if they are nonpositive definite but not negative definite, or equivalently if x'Ax ≤ 0 for every x in R^n with equality holding for some nonnull x.
neighborhood A neighborhood of an m x n matrix C is a set of the general form {X ∈ R^{m x n} : ‖X - C‖ < r}, where r is a positive number called the radius of the neighborhood (and where the norm is the usual norm).
nonhomogeneous linear system A linear system whose right side (which is a column vector or more generally a matrix) is nonnull.
nonnegative definite An n x n (symmetric or nonsymmetric) matrix A and the quadratic form x'Ax (in an n x 1 vector x) are (by definition) nonnegative definite if x'Ax ≥ 0 for every x in R^n.
nonpositive definite An n x n (symmetric or nonsymmetric) matrix A and the quadratic form x'Ax (in an n x 1 vector x) are (by definition) nonpositive definite if -x'Ax is a nonnegative definite quadratic form (or equivalently if -A is a nonnegative definite matrix) - thus, A and x'Ax are nonpositive definite if x'Ax ≤ 0 for every x in R^n.
nonnull matrix A matrix having 1 or more nonzero elements.
nonsingular A matrix is nonsingular if it has both full row rank and full column
rank or equivalently if it is square and its rank equals its order.
norm The norm of a matrix A in a linear space V is (A · A)^{1/2} - the use of this term is limited herein to norms defined in terms of an inner product; in the case of a quasi-inner product, (A · A)^{1/2} is referred to as the quasi norm.
normal equations A linear system (or the equations comprising the linear system) of the form X'Xb = X'y (in a p x 1 vector b), where X is an n x p matrix and y an n x 1 vector.
null matrix A matrix all of whose elements are 0.
null space (of a matrix) The null space of an m x n matrix A is the solution space of the homogeneous linear system Ax = 0 (in an n-dimensional column vector x), or equivalently is the set {x ∈ R^{n x 1} : Ax = 0}.
null space (of a transformation) The null space - also known as the kernel - of a linear transformation T from a linear space V into a linear space W is the set {X ∈ V : T(X) = 0}, which is a subspace of V.
one to one A transformation T from a set V into a set W is said to be 1-1 (one to
one) if each member of the range of T is the image of only one member of
V.
onto A transformation T from a set V into a set W is said to be onto if T (V) = W
(i.e., if the range of T is all of W), in which case T may be referred to as a
transformation from V onto W.
open set A set S of m x n matrices is an open set if every matrix in S is an interior
point of S.
order A (square) matrix of dimensions n x n is said to be of order n.
orthogonal complement The orthogonal complement of a subspace U of a linear
space V is the set comprising all matrices in V that are orthogonal to U -
note that the orthogonal complement of U depends on V as well as U (and
also on the choice of inner product).


orthogonality of a matrix and a subspace A matrix Y in a linear space V is or-
thogonal to a subspace U (of V) if Y is orthogonal to every matrix in U.
orthogonality of two subspaces A subspace U of a linear space V is orthogonal
to a subspace W (of V) if every matrix in U is orthogonal to every matrix
inW.
orthogonality with respect to a matrix For any n x n symmetric nonnegative definite matrix W, two n x 1 vectors, say x and y, are said to be orthogonal with respect to W if x'Wy = 0; an n x 1 vector, say x, and a subspace, say U, of R^{n x 1} are said to be orthogonal with respect to W if x'Wy = 0 for every y in U; and two subspaces, say U and V, of R^{n x 1} are said to be orthogonal with respect to W if x'Wy = 0 for every x in U and every y in V.
orthogonal matrix A (square) matrix A is orthogonal if A' A = AA' = I.
orthogonal set A finite set of matrices in a linear space V is orthogonal if the
inner product of every pair of matrices in the set equals O.
orthonormal set A finite set of matrices in a linear space V is orthonormal if it
is orthogonal and if the norm of every matrix in the set equals 1.
partitioned matrix A partitioned matrix, say [A11 A12 ... A1c; A21 A22 ... A2c; ... ; Ar1 Ar2 ... Arc], is a matrix that has (for some positive integers r and c) been subdivided into rc submatrices Aij (i = 1, 2, ..., r; j = 1, 2, ..., c), called blocks, by implicitly superimposing on the matrix r - 1 horizontal lines and c - 1 vertical lines (so that all of the blocks in the same "row" of blocks have the same number of rows and all of those in the same "column" of blocks have the same number of columns) - in the special case where c = r, the blocks A11, A22, ..., Arr are referred to as the diagonal blocks (and the other blocks are referred to as the off-diagonal blocks).
permutation matrix An n x n permutation matrix is a matrix that is obtainable from the n x n identity matrix by permuting its columns; i.e., a matrix of the form (u_{k1}, u_{k2}, ..., u_{kn}), where u1, u2, ..., un are respectively the first, second, ..., nth columns of In and where k1, k2, ..., kn is a permutation of the first n positive integers.
positive definite An n x n (symmetric or nonsymmetric) matrix A and the quadratic form x'Ax (in an n x 1 vector x) are (by definition) positive definite if x'Ax > 0 for every nonnull x in R^n.
positive semidefinite An n x n (symmetric or nonsymmetric) matrix A and the quadratic form x'Ax (in an n x 1 vector x) are (by definition) positive semidefinite if they are nonnegative definite but not positive definite, or equivalently if x'Ax ≥ 0 for every x in R^n with equality holding for some nonnull x.
principal submatrix A submatrix of a square matrix is a principal submatrix if


it can be obtained by striking out the same rows as columns (so that the ith
row is struck out whenever the ith column is struck out, and vice versa); the
r x r (principal) submatrix of an n x n matrix obtained by striking out its
last n - r rows and columns is referred to as a leading principal submatrix
(r = 1, ..., n).
product (of transformations) The product (or composition) of a transformation,
say T, from a linear space V into a linear space W and a transformation, say
S, from a linear space U into V is the transformation from U into W that
assigns to each matrix X in U the matrix T[S(X)] (in W)-the definition of
the term product (or composition) extends in a straightforward way to three
or more transformations.
projection (orthogonal) The projection - also known as the orthogonal projection - of a matrix Y in a linear space V on a subspace U (of V) is the unique matrix, say Z, in U such that Y - Z is orthogonal to U; in the special case where (for some positive integer n and for some symmetric positive definite matrix W) V = R^{n x 1} and the inner product is the bilinear form x'Wy, the projection of y (an n x 1 vector) on U is referred to as the projection of y on U with respect to W - this terminology can be extended to a symmetric nonnegative definite matrix W by defining a projection of y on U with respect to W to be any vector z in U such that (y - z) ⊥_W U.
projection along a subspace For a linear space V of m x n matrices and for subspaces U and W such that U ⊕ W = V (essentially disjoint subspaces whose sum is V), the projection of a matrix in V, say the matrix Y, on U along W is (by definition) the (unique) matrix Z in U such that Y - Z ∈ W.
projection matrix (orthogonal) The projection matrix - also known as the orthogonal projection matrix - for a subspace U of R^{n x 1} is the unique (n x n) matrix, say A, such that, for every n x 1 vector y, Ay is the projection (with respect to the usual inner product) of y on U - simply saying that a matrix is a projection matrix means that there is some subspace of R^{n x 1} for which it is the projection matrix.
projection matrix (general orthogonal) The (orthogonal) projection matrix for a subspace U of R^{n x 1} with respect to an n x n symmetric positive definite matrix W is the unique (n x n) matrix, say A, such that, for every n x 1 vector y, Ay is the projection of y on U with respect to W - simply saying that a matrix is a projection matrix with respect to W means that there is some subspace of R^{n x 1} for which it is the projection matrix with respect to W - more generally, a projection matrix for U with respect to an n x n symmetric nonnegative definite matrix W is an (n x n) matrix, say A, such that, for every n x 1 vector y, Ay is a projection of y on U with respect to W.
projection matrix for one subspace along another For subspaces U and W (of R^{n x 1}) such that U ⊕ W = R^{n x 1} (essentially disjoint subspaces whose sum is R^{n x 1}), the projection matrix for U along W is the (unique) n x n matrix, say A, such that, for every n x 1 vector y, Ay is the projection of y on U along W.
QR decomposition The QR decomposition of a matrix of full column rank, say an m x k matrix A of rank k, is the unique decomposition of the form A = QR, where Q is an m x k matrix whose columns are orthonormal (with respect to the usual inner product) and R is a k x k upper triangular matrix with positive diagonal elements.
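
A numerical sketch (using NumPy; np.linalg.qr need not return positive diagonal elements in R, so signs are adjusted below to match the unique decomposition described here):

import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])            # full column rank: m = 3, k = 2

Q, R = np.linalg.qr(A)                # Q is 3 x 2 with orthonormal columns, R is 2 x 2
signs = np.sign(np.diag(R))
Q, R = Q * signs, signs[:, None] * R  # make the diagonal elements of R positive

print(np.allclose(A, Q @ R))              # True
print(np.allclose(Q.T @ Q, np.eye(2)))    # True: orthonormal columns
print(np.all(np.diag(R) > 0))             # True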
quadratic form A quadratic form in an n x 1 vector x = (x1, ..., xn)' is a function of x (defined for x ∈ R^n) that, for some n x n matrix A = {aij}, is expressible as x'Ax = Σ_{i,j} aij xi xj - the matrix A is called the matrix of the quadratic form and, unless n = 1 or the choice for A is restricted (e.g., to symmetric matrices), is nonunique.
range The range of a transformation T from a set V into a set W is the set
T (V) (i.e., the image of the domain of T)-in the special case of a linear
transformation from a linear space V into a linear space W, the range T(V)
of T is a linear space and is referred to as the range space of T.
rank (of a linear transformation) The rank of a linear transformation T from
a linear space V into a linear space W is (by definition) the dimension
dim[T(V)] of the range space T(V) of T.
rank (of a matrix) The rank of a matrix A is the dimension of C(A) or equiva-
lently of R(A).
rank additivity Two matrices A and B (of the same size) are said to be rank additive if rank(A + B) = rank(A) + rank(B); more generally, k matrices A1, A2, ..., Ak (of the same size) are said to be rank additive if rank(Σ_{i=1}^k Ai) = Σ_{i=1}^k rank(Ai) (i.e., if the rank of their sum equals the sum of their ranks).
reflexive generalized inverse A generalized inverse, say G, of an m x n matrix
A is said to be reflexive if GAG = G; or, equivalently, an n x m matrix is a
reflexive generalized inverse of A if it satisfies Moore-Penrose Conditions
(1) and (2).
restriction If T is a linear transformation from a linear space V into a linear space W and if U is a subspace of V, then the transformation, say R, from U into W defined by R(X) = T(X) (which assigns to each matrix in U the same matrix in W assigned by T) is called the restriction of T to U.
right inverse A right inverse of an m x n matrix A is an n x m matrix R such that AR = Im - a matrix has a right inverse if and only if it has full row rank.
row space The row space of an m x n matrix A is the set whose elements consist
of all n-dimensional row vectors that are expressible as linear combinations
of the m rows of A.
scalar The term scalar is (herein) used interchangeably with real number.
scalar multiple (of a transformation) The scalar multiple of a scalar k and a transformation, say T, from a linear space V into a linear space W is the transformation from V into W that assigns to each matrix X in V the matrix kT(X) (in W).
Schur complement In connection with a partitioned matrix A of the form A = [T U; V W] or A = [W V; U T], the matrix Q = W - VT-U is referred to as the Schur complement of T in A relative to T- or (especially in a case where Q is invariant to the choice of the generalized inverse T-) simply as the Schur complement of T in A or (in the absence of any ambiguity) even more simply as the Schur complement of T.
second-degree polynomial A second-degree polynomial in an n x 1 vector x = (x1, ..., xn)' is a function, say f(x), of x that is defined for all x in R^n and that, for some scalar c, some n x 1 vector b = {bi}, and some n x n matrix V = {vij}, is expressible as f(x) = c - 2b'x + x'Vx, or in nonmatrix notation as f(x) = c - 2 Σ_{i=1}^n bi xi + Σ_{i=1}^n Σ_{j=1}^n vij xi xj - in the special case where c = 0 and V = 0, f(x) = -2b'x, which is a linear form (in x), and in the special case where c = 0 and b = 0, f(x) = x'Vx, which is a quadratic form (in x).
similar An n x n matrix B is said to be similar to an n x n matrix A if there exists an n x n nonsingular matrix C such that B = C^{-1}AC or, equivalently, such that CB = AC - if B is similar to A, then A is similar to B.
singular A square matrix is singular if its rank is less than its order.

t t
singular value decomposition An m x n matrix A of rank r is expressible as

A= p(~l ~)Q' = PIDIQ'I = siPiq; = ajUj ,

where Q = (ql, ... , qn) is an n x n orthogonal matrix and Dl = diag(sl,


... , sr) an r x r diagonal matrix such that Q' A' AQ = (~i ~), where
Sl, ... , Sr are (strictly) positive, where QI = (ql, ... , qr)' PI = (PI, ... ,
Pr) = AQIDjl, and, for any m x (m - r) matrix P2 such that ~P2 = 0,
P = (PI, P2), where ai, ... ,ak are the distinct values represented among
SI, ... ,Sr, and where (for j = 1, ... ,k)Uj = I:{i:si=aj}Piq;; any of
these four representations may be referred to as the singular value decom-
position of A, and Sl, ... , Sr are referred to as the singular values of A -
SI, ... , Sr are the positive square roots of the nonzero eigenvalues of A' A
(or equivalently AA'), ql' ... , qn are eigenvectors of A' A, and the columns
of P are eigenvectors of AA'.
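
A numerical sketch with NumPy (np.linalg.svd returns P, the singular values s1 ≥ ... ≥ sr, and Q' directly; the 2 x 3 matrix below is arbitrary):

import numpy as np

A = np.array([[3.0, 0.0, 2.0],
              [0.0, 1.0, 0.0]])       # a 2 x 3 matrix of rank 2

P, s, Qt = np.linalg.svd(A)           # full SVD: P is 2 x 2 orthogonal, Qt = Q'
D = np.zeros_like(A)
D[:len(s), :len(s)] = np.diag(s)

print(np.allclose(A, P @ D @ Qt))     # A = P [D1 0; 0 0] Q'
# The singular values are the positive square roots of the nonzero eigenvalues of A'A.
print(np.allclose(np.sort(s**2),
                  np.sort(np.linalg.eigvalsh(A.T @ A))[-len(s):]))   # True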
skew-symmetric An n x n matrix, say A = {aij}, is (by definition) skew-symmetric if A' = -A; that is, if aji = -aij for all i and j (or equivalently if aii = 0 for i = 1, ..., n and aji = -aij for j ≠ i = 1, ..., n).
solution A matrix, say X*, is said to be a solution to a linear system AX = B (in X) if AX* = B.
solution set or space The collection of all solutions to a linear system AX = B
(in X) is called the solution set of the linear system; in the special case of
a homogeneous linear system AX = 0, the solution set may be called the


solution space.
span The span of a finite set of matrices (having the same dimensions) is defined as follows: the span of a finite nonempty set {A1, ..., Ak} is the set consisting of all matrices that are expressible as linear combinations of A1, ..., Ak, and the span of the empty set is the set {0}, whose only element is the null matrix. And, a finite set S of matrices in a linear space V is said to span V if sp(S) = V.
spectral decomposition An n x n symmetric matrix A is expressible as
A = QDQ' = Σ_{i=1}^n di qi qi' = Σ_{j=1}^k λj Ej,
where d1, ..., dn are the not-necessarily-distinct eigenvalues of A, q1, ..., qn are orthonormal eigenvectors corresponding to d1, ..., dn, respectively, Q = (q1, ..., qn), D = diag(d1, ..., dn), {λ1, ..., λk} is the spectrum of A, and (for j = 1, ..., k) Ej = Σ_{i: di=λj} qi qi'; any of these three representations may be referred to as the spectral decomposition of A.
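
A numerical sketch using NumPy's eigh, which returns the (not-necessarily-distinct) eigenvalues of a symmetric matrix together with orthonormal eigenvectors:

import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])       # an arbitrary symmetric matrix

d, Q = np.linalg.eigh(A)              # d = (d1, ..., dn); the columns of Q are q1, ..., qn
D = np.diag(d)

print(np.allclose(A, Q @ D @ Q.T))                 # A = QDQ'
print(np.allclose(Q.T @ Q, np.eye(3)))             # Q is orthogonal
# The rank-one terms d_i q_i q_i' also sum to A.
print(np.allclose(sum(d[i] * np.outer(Q[:, i], Q[:, i]) for i in range(3)), A))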
spectrum The spectrum of an n x n matrix A is the set whose members are the
distinct (different) scalars that are eigenvalues of A.
subspace A subspace of a linear space V is a subset of V that is itself a linear
space.
sum (of sets) The sum of 2 nonempty sets, say U and V, of m x n matrices is the set {A + B : A ∈ U, B ∈ V} comprising every (m x n) matrix that is expressible as the sum of a matrix in U and a matrix in V; more generally, the sum of k sets, say U1, ..., Uk, of m x n matrices is the set {Σ_{i=1}^k Ai : A1 ∈ U1, ..., Ak ∈ Uk}.
sum (of transformations) The sum of two transformations, say T and S, from a
linear space V into a linear space W is the transformation from V into W
that assigns to each matrix X in V the matrix T(X) + S(X) (in W)-since
the addition of transformations is associative, the definition of the term sum
extends in an unambiguous way to three or more transformations.
symmetric A matrix, say A, is symmetric if A' = A, or equivalently if it is square
and (for every i and j) its ijth element equals its jith element.
trace The trace of a (square) matrix is the sum of its diagonal elements.
transformation A transformation (also known as a function, operator, map, or
mapping), say T, from a set V, called the domain, into a set W is a corre-
spondence that assigns to each member X of V a unique member of W; the
member of W assigned to X is denoted by the symbol T (X) and is referred
to as the image of X, and, for any subset U of V, the set of all members of W
that are the images of one or more members of U is denoted by the symbol
T (U) and is referred to as the image of U - V and W consist of scalars,
row or column vectors, matrices, or other "objects".
transpose The transpose of an m x n matrix A is the n x m matrix whose ijth element is the jith element of A.

union The union of 2 sets, say U and V, of m x n matrices is the set comprising all matrices that belong to either or both of U and V; more generally, the union of k sets, say U1, ..., Uk, of m x n matrices comprises all matrices that belong to at least one of U1, ..., Uk.
unit (upper or lower) triangular matrix A unit triangular matrix is a triangular
matrix all of whose diagonal elements equal one.
U'DU decomposition A U'DU decomposition of a symmetric matrix, say A, is a
decomposition of the form A = U'DU, where U is a unit upper triangular
matrix and D is a diagonal matrix.
Vandermonde matrix A Vandermonde matrix is a matrix of the general form
$$\begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{pmatrix}$$
(where x_1, x_2, ..., x_n are arbitrary scalars).
vec The vec of an m x n matrix A = (a_1, a_2, ..., a_n) is the mn-dimensional (column) vector
$$\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}$$
obtained by successively stacking the first, second, ..., nth columns of A one under the other.


vech The vech of an n x n matrix A = {a_ij} is the n(n + 1)/2-dimensional (column) vector
$$\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix},$$
where (for i = 1, 2, ..., n) a_i = (a_ii, a_{i+1,i}, ..., a_{ni})' is the subvector of the ith column of A obtained by striking out its first i - 1 elements.
vec-permutation matrix The mn x mn vec-permutation matrix is the unique permutation matrix, denoted by the symbol K_mn, such that, for every m x n matrix A, vec(A') = K_mn vec(A); the vec-permutation matrix is also known as the commutation matrix.
zero transformation The linear transformation from a linear space V into a linear
space W that assigns to every matrix in V the null matrix (in W) is called
the zero transformation.
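The defining identity vec(A') = K_mn vec(A) is easy to check numerically. The following Python/NumPy sketch (an illustration added here, not part of the original text; the helper names are ad hoc) builds vec, vech, and the commutation matrix for a small example.

```python
import numpy as np

def vec(A):
    # Stack the columns of A one under the other (column-major order).
    return A.reshape(-1, order="F")

def vech(A):
    # Stack, for i = 1, ..., n, the part of column i from the diagonal down.
    n = A.shape[0]
    return np.concatenate([A[i:, i] for i in range(n)])

def commutation_matrix(m, n):
    # K_mn is the mn x mn permutation matrix with vec(A') = K_mn vec(A).
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            K[i * n + j, j * m + i] = 1.0   # position of a_ij in vec(A') vs vec(A)
    return K

A = np.arange(6.0).reshape(2, 3)            # a 2 x 3 example
K = commutation_matrix(2, 3)
print(np.allclose(K @ vec(A), vec(A.T)))    # True
print(vech(np.array([[1.0, 2.0], [3.0, 4.0]])))   # [1. 3. 4.]
```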
1
Matrices

EXERCISE 1. Show that, for any matrices A, B, and C (of the same dimensions),
(A + B) +C = (C + A) + B.
Solution. Since matrix addition is commutative and associative,
(A + B) +C = C + (A + B) = (C + A) + B.

EXERCISE 2. For any scalars c and k and any matrix A,


c(kA) = (ck)A = (kc)A = k(cA),
and, for any scalar c, m x n matrix A, and n x p matrix B,
cAB = (cA)B = A(cB).
Using results (*) and (**) (or other means), show that, for any m x n matrix A
and n x p matrix B and for arbitrary scalars c and k,
(cA)(kB) = (ck)AB.

Solution. Making use of results (**) and (*), we find that


(cA)(kB) = k(cA)B = k(cAB) = (ck)AB.

EXERCISE 3. (a) Verify the associativeness of matrix multiplication; that is,


show that, for any m x n matrix A = {a_ij}, n x q matrix B = {b_jk}, and q x r matrix C = {c_ks}, A(BC) = (AB)C.
(b) Verify the distributiveness with respect to addition of matrix multiplication; that is, show that, for any m x n matrix A = {a_ij} and n x q matrices B = {b_jk} and C = {c_jk}, A(B + C) = AB + AC.
Solution. (a) The jsth element of BC equals $\sum_k b_{jk}c_{ks}$, and similarly the ikth element of AB equals $\sum_j a_{ij}b_{jk}$. Thus, the isth element of A(BC) equals
$$\sum_j a_{ij}\Bigl(\sum_k b_{jk}c_{ks}\Bigr) = \sum_j\Bigl(\sum_k a_{ij}b_{jk}c_{ks}\Bigr) = \sum_k\Bigl(\sum_j a_{ij}b_{jk}\Bigr)c_{ks},$$
and $\sum_k(\sum_j a_{ij}b_{jk})c_{ks}$ equals the isth element of (AB)C. Since each element of A(BC) equals the corresponding element of (AB)C, we conclude that A(BC) = (AB)C.
(b) Observing that the jkth element of B + C equals b_jk + c_jk, we find that the ikth element of A(B + C) equals
$$\sum_j a_{ij}(b_{jk} + c_{jk}) = \sum_j (a_{ij}b_{jk} + a_{ij}c_{jk}) = \sum_j a_{ij}b_{jk} + \sum_j a_{ij}c_{jk}.$$
Further, observing that $\sum_j a_{ij}b_{jk}$ is the ikth element of AB and that $\sum_j a_{ij}c_{jk}$ is the ikth element of AC, we find that $\sum_j a_{ij}b_{jk} + \sum_j a_{ij}c_{jk}$ equals the ikth element of AB + AC. Since each element of A(B + C) equals the corresponding element of AB + AC, we conclude that A(B + C) = AB + AC.

EXERCISE 4. Let A = {aij} represent an m x n matrix and B = {bij} a p x m


matrix.
(a) Let x = {Xi} represent an n-dimensional column vector. Show that the ith
element of the p-dimensional column vector BAx is
$$\sum_{j=1}^{m} b_{ij} \sum_{k=1}^{n} a_{jk} x_k. \qquad (E.1)$$

(b) Let X = {x_ij} represent an n x q matrix. Generalize formula (E.1) by expressing the irth element of the p x q matrix BAX in terms of the elements of A, B, and X.
(c) Let x = {x_i} represent an n-dimensional column vector and C = {c_ij} a q x p matrix. Generalize formula (E.1) by expressing the ith element of the q-dimensional column vector CBAx in terms of the elements of A, B, C, and x.
(d) Let y = {y_i} represent a p-dimensional column vector. Express the ith element of the n-dimensional row vector y'BA in terms of the elements of A, B, and y.

Solution. (a) The jth element of the vector Ax is $\sum_{k=1}^{n} a_{jk}x_k$. Thus, upon regarding BAx as the product of B and Ax, we find that the ith element of BAx is $\sum_{j=1}^{m} b_{ij} \sum_{k=1}^{n} a_{jk}x_k$.
(b) The irth element of BAX is
$$\sum_{j=1}^{m} b_{ij} \sum_{k=1}^{n} a_{jk}x_{kr},$$
as is evident from Part (a) upon regarding the irth element of BAX as the ith element of the product of BA and the rth column of X.
(c) According to Part (a), the sth element of the vector BAx is
$$\sum_{j=1}^{m} b_{sj} \sum_{k=1}^{n} a_{jk}x_k.$$
Thus, upon regarding CBAx as the product of C and BAx, we find that the ith element of CBAx is
$$\sum_{s=1}^{p} c_{is} \sum_{j=1}^{m} b_{sj} \sum_{k=1}^{n} a_{jk}x_k.$$
(d) The ith element of the row vector y'BA is the same as the ith element of the column vector (y'BA)' = A'B'y. Thus, according to Part (a), the ith element of y'BA is
$$\sum_{j=1}^{m} a_{ji} \sum_{k=1}^{p} b_{kj}y_k.$$

EXERCISE 5. Let A and B represent n x n matrices. Show that

(A + B)(A - B) = A2 - B2

if and only if A and B commute.


Solution. Clearly,

(A + B)(A - B) = A(A - B) + B(A - B) = A2 - AB + BA - B2.

Thus,
(A + B)(A - B) = A 2 - B2
if and only if -AB + BA = 0 or equivalently if and only if AB = BA (i.e., if and
only if A and B commute).
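A quick numerical illustration of this equivalence (added here; not part of the original solution) is given by the NumPy sketch below: for a generic non-commuting pair the identity fails, while for commuting (e.g., diagonal) matrices it holds.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))            # generic: A and B do not commute
print(np.allclose((A + B) @ (A - B), A @ A - B @ B))   # typically False

C = np.diag([1.0, 2.0, 3.0])
D = np.diag([4.0, 5.0, 6.0])               # diagonal matrices commute
print(np.allclose((C + D) @ (C - D), C @ C - D @ D))   # True
```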

EXERCISE 6. (a) Show that the product AB oftwo n x n symmetric matrices A


and B is itself symmetric if and only if A and B commute.
(b) Give an example of two symmetric matrices (of the same order) whose
product is not symmetric.

Solution. (a) Since A and B are symmetric, (AB)' = B' A' = BA. Thus, if AB
is symmetric, that is, if AB = (AB)', then AB = BA, that is, A and B commute.
Conversely, if AB = BA, then AB = (AB)'.

(b) Take, for example, $A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Then,
$$AB = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \ne \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = BA,$$
so that AB is not symmetric.
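The same check can be run numerically; the short NumPy sketch below (added for illustration) uses the pair of symmetric matrices above.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
AB = A @ B
print(np.allclose(A, A.T), np.allclose(B, B.T))   # True True: both symmetric
print(np.allclose(AB, AB.T))                      # False: AB is not symmetric
print(np.allclose(AB, B @ A))                     # False: A and B do not commute
```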
EXERCISE 7. Verify (a) that the transpose of an upper triangular matrix is lower
triangular and (b) that the sum of two upper triangular matrices (of the same order)
is upper triangular.
Solution. Let A = {aij} represent an upper triangular matrix of order n. Then,
by definition, the ijth element of A' is aji. Since A is upper triangular, aji = 0
for i < j = 1, ... , n or equivalently for j > i = 1, ... , n. Thus, A' is lower
triangular, which verifies Part (a).
Let B = {bij} represent another upper triangular matrix of order n. Then, by
definition, the ijth element of A + B is aij + bij. Since both A and B are upper
triangular, aij = 0 and bij = 0 for j < i = 1, ... , n, and hence aij + bij = 0 for
j < i = 1, ... , n. Thus, A + B is upper triangular, which verifies Part (b).

EXERCISE 8. Let A = {a_ij} represent an n x n upper triangular matrix, and suppose that the diagonal elements of A equal zero (i.e., that a_11 = a_22 = ... = a_nn = 0). Further, let p represent an arbitrary positive integer.
(a) Show that, for i = 1, ..., n and j = 1, ..., min(n, i + p - 1), the ijth element of A^p equals zero.
(b) Show that, for i ≥ n - p + 1, the ith row of A^p is null.
(c) Show that, for p ≥ n, A^p = 0.
Solution. For i, k = 1, ..., n, let b_ik represent the ikth element of A^p.
(a) The proof is by mathematical induction. Clearly, for i = 1, ..., n and j = 1, ..., min(n, i + 1 - 1), the ijth element of A^1 equals zero. Now, suppose that, for i = 1, ..., n and j = 1, ..., min(n, i + p - 1), the ijth element of A^p equals zero. Then, to complete the induction argument, it suffices to show that, for i = 1, ..., n and j = 1, ..., min(n, i + p), the ijth element of A^{p+1} equals zero. Observing that A^{p+1} = A^p A, we find that, for i = 1, ..., n and j = 1, ..., min(n, i + p), the ijth element of A^{p+1} equals
$$\sum_{k=1}^{n} b_{ik}a_{kj} = \sum_{k=1}^{\min(n,\, i+p-1)} 0\, a_{kj} + \sum_{k=i+p}^{n} b_{ik}a_{kj}$$
(where, if i > n - p, the sum $\sum_{k=i+p}^{n} b_{ik}a_{kj}$ is degenerate and is to be interpreted as 0)
$$= 0$$
(since, for k ≥ j, a_kj = 0).
(b) For i ≥ n - p + 1, min(n, i + p - 1) = n (since i ≥ n - p + 1 ⇔ i + p - 1 ≥ n). Thus, for i ≥ n - p + 1, it follows from Part (a) that all n elements of the ith row of A^p equal zero and hence that the ith row of A^p is null.
(c) Clearly, for p ≥ n, n - p + 1 ≤ 1. Thus, for p ≥ n, it follows from Part (b) that all n rows of A^p are null and hence that A^p = 0.
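Part (c) says that an upper triangular matrix with zero diagonal is nilpotent of index at most n; the following sketch (an illustration added here, assuming NumPy) checks this for a random example.

```python
import numpy as np

n = 5
rng = np.random.default_rng(1)
A = np.triu(rng.standard_normal((n, n)), k=1)   # upper triangular, zero diagonal
P = np.linalg.matrix_power(A, n)
print(np.allclose(P, 0))    # True: A^n = 0, as Part (c) asserts
```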
2
Submatrices and Partitioned Matrices

EXERCISE 1. Let A_* represent an r x s submatrix of an m x n matrix A obtained by striking out the i_1, ..., i_{m-r}th rows and j_1, ..., j_{n-s}th columns (of A), and let B_* represent the s x r submatrix of A' obtained by striking out the j_1, ..., j_{n-s}th rows and i_1, ..., i_{m-r}th columns (of A'). Verify that B_* = A_*'.

Solution. Let i_1^*, ..., i_r^* (i_1^* < ... < i_r^*) represent those r of the first m positive integers that are not represented in the sequence i_1, ..., i_{m-r}. Similarly, let j_1^*, ..., j_s^* (j_1^* < ... < j_s^*) represent those s of the first n positive integers that are not represented in the sequence j_1, ..., j_{n-s}. Denote by a_ij and b_ij the ijth elements of A and A', respectively. Then,
$$A_*' = \begin{pmatrix} a_{i_1^* j_1^*} & \cdots & a_{i_r^* j_1^*} \\ \vdots & & \vdots \\ a_{i_1^* j_s^*} & \cdots & a_{i_r^* j_s^*} \end{pmatrix} = \begin{pmatrix} b_{j_1^* i_1^*} & \cdots & b_{j_1^* i_r^*} \\ \vdots & & \vdots \\ b_{j_s^* i_1^*} & \cdots & b_{j_s^* i_r^*} \end{pmatrix} = B_*.$$

EXERCISE 2. Verify (a) that a principal submatrix of a symmetric matrix is


symmetric, (b) that a principal submatrix of a diagonal matrix is diagonal, and (c)
that a principal submatrix of an upper triangular matrix is upper triangular.

Solution. Let B = {b_ij} represent the r x r principal submatrix of an n x n matrix A = {a_ij} obtained by striking out all of the rows and columns except the k_1, k_2, ..., k_rth rows and columns (where k_1 < k_2 < ... < k_r). Then, b_ij = a_{k_i k_j} (i, j = 1, ..., r).
(a) Suppose that A is symmetric. Then, for i, j = 1, ..., r, b_ij = a_{k_i k_j} = a_{k_j k_i} = b_ji.
(b) Suppose that A is diagonal. Then, for j ≠ i = 1, ..., r, b_ij = a_{k_i k_j} = 0.
(c) Suppose that A is upper triangular. Then, for j < i = 1, ..., r, b_ij = a_{k_i k_j} = 0.

EXERCISE 3. Let
$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1r} \\ 0 & A_{22} & \cdots & A_{2r} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_{rr} \end{pmatrix}$$
represent an n x n upper block-triangular matrix whose ijth block A_ij is of dimensions n_i x n_j (j ≥ i = 1, ..., r). Show that A is upper triangular if and only if each of its diagonal blocks A_11, A_22, ..., A_rr is upper triangular.
Solution. Let a_ts represent the tsth element of A (t, s = 1, ..., n). Then,
$$A_{ij} = \begin{pmatrix} a_{n_1+\cdots+n_{i-1}+1,\; n_1+\cdots+n_{j-1}+1} & \cdots & a_{n_1+\cdots+n_{i-1}+1,\; n_1+\cdots+n_{j-1}+n_j} \\ \vdots & & \vdots \\ a_{n_1+\cdots+n_{i-1}+n_i,\; n_1+\cdots+n_{j-1}+1} & \cdots & a_{n_1+\cdots+n_{i-1}+n_i,\; n_1+\cdots+n_{j-1}+n_j} \end{pmatrix} \quad (j \ge i = 1, \ldots, r).$$
Suppose that A is upper triangular. Then, by definition, a_ts = 0 for s < t = 1, ..., n. Thus, a_{n_1+...+n_{i-1}+k, n_1+...+n_{i-1}+l} (which is the klth element of the ith diagonal block A_ii) equals zero for l < k = 1, ..., n_i, implying that A_ii is upper triangular (i = 1, ..., r).
Conversely, suppose that A_11, A_22, ..., A_rr are upper triangular. Let t and s represent any integers (between 1 and n, inclusive) such that a_ts ≠ 0. Then, clearly, for some integers i and j ≥ i, a_ts is an element of the submatrix A_ij, say the klth element, in which case t = n_1 + ... + n_{i-1} + k and s = n_1 + ... + n_{j-1} + l. If j > i, then (since k ≤ n_i) t < s. Moreover, if j = i, then (since A_ii is upper triangular) k ≤ l, implying that t ≤ s. Thus, in either case, t ≤ s. We conclude that A is upper triangular.

EXERCISE 4. Let
$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1c} \\ A_{21} & A_{22} & \cdots & A_{2c} \\ \vdots & \vdots & & \vdots \\ A_{r1} & A_{r2} & \cdots & A_{rc} \end{pmatrix}$$

represent a partitioned m x n matrix whose ijth block A_ij is of dimensions m_i x n_j. Verify that
$$A' = \begin{pmatrix} A_{11}' & A_{21}' & \cdots & A_{r1}' \\ A_{12}' & A_{22}' & \cdots & A_{r2}' \\ \vdots & \vdots & & \vdots \\ A_{1c}' & A_{2c}' & \cdots & A_{rc}' \end{pmatrix};$$
in other words, verify that A' can be expressed as a partitioned matrix, comprising c rows and r columns of blocks, the ijth of which is the transpose A_ji' of the jith block A_ji of A. And, letting
$$B = \begin{pmatrix} B_{11} & B_{12} & \cdots & B_{1v} \\ B_{21} & B_{22} & \cdots & B_{2v} \\ \vdots & \vdots & & \vdots \\ B_{u1} & B_{u2} & \cdots & B_{uv} \end{pmatrix}$$
represent a partitioned p x q matrix whose ijth block B_ij is of dimensions p_i x q_j, verify also that if c = u and n_k = p_k (k = 1, ..., c) [in which case all of the products A_ik B_kj (i = 1, ..., r; j = 1, ..., v; k = 1, ..., c), as well as the product AB, exist], then
$$AB = \begin{pmatrix} F_{11} & F_{12} & \cdots & F_{1v} \\ F_{21} & F_{22} & \cdots & F_{2v} \\ \vdots & \vdots & & \vdots \\ F_{r1} & F_{r2} & \cdots & F_{rv} \end{pmatrix},$$
where $F_{ij} = \sum_{k=1}^{c} A_{ik}B_{kj} = A_{i1}B_{1j} + A_{i2}B_{2j} + \cdots + A_{ic}B_{cj}$.


Solution. Let a_ij, b_ij, h_ij, and s_ij represent the ijth elements of A, B, A', and AB, respectively. Define H_ij to be a matrix of dimensions n_i x m_j (i = 1, ..., c; j = 1, ..., r) such that
$$A' = \begin{pmatrix} H_{11} & H_{12} & \cdots & H_{1r} \\ H_{21} & H_{22} & \cdots & H_{2r} \\ \vdots & \vdots & & \vdots \\ H_{c1} & H_{c2} & \cdots & H_{cr} \end{pmatrix}.$$
Clearly, H_ij is the submatrix of A' obtained by striking out the first n_1 + ... + n_{i-1} and last n_{i+1} + ... + n_c rows of A' and the first m_1 + ... + m_{j-1} and last m_{j+1} + ... + m_r columns of A'; and A_ji is the submatrix of A obtained by striking out the first m_1 + ... + m_{j-1} and last m_{j+1} + ... + m_r rows of A and the first n_1 + ... + n_{i-1} and last n_{i+1} + ... + n_c columns of A. Thus, it follows from result (1.1) that H_ij = A_ji' (i = 1, ..., c; j = 1, ..., r), which verifies the stated expression for A'.
Further, define S_ij to be a matrix of dimensions m_i x q_j (i = 1, ..., r; j = 1, ..., v) such that
$$AB = \begin{pmatrix} S_{11} & S_{12} & \cdots & S_{1v} \\ S_{21} & S_{22} & \cdots & S_{2v} \\ \vdots & \vdots & & \vdots \\ S_{r1} & S_{r2} & \cdots & S_{rv} \end{pmatrix}.$$
Then, for w = 1, ..., m_i and z = 1, ..., q_j, the wzth element of S_ij is
$$s_{m_1+\cdots+m_{i-1}+w,\; q_1+\cdots+q_{j-1}+z} = \sum_{\ell=1}^{n_1+\cdots+n_c} a_{m_1+\cdots+m_{i-1}+w,\,\ell}\; b_{\ell,\; q_1+\cdots+q_{j-1}+z}$$
$$= \sum_{k=1}^{c}\; \sum_{\ell=n_1+\cdots+n_{k-1}+1}^{n_1+\cdots+n_{k-1}+n_k} a_{m_1+\cdots+m_{i-1}+w,\,\ell}\; b_{\ell,\; q_1+\cdots+q_{j-1}+z}$$
$$= \sum_{k=1}^{c} \sum_{t=1}^{n_k} a_{m_1+\cdots+m_{i-1}+w,\; n_1+\cdots+n_{k-1}+t}\; b_{n_1+\cdots+n_{k-1}+t,\; q_1+\cdots+q_{j-1}+z}.$$
And, upon observing that $a_{m_1+\cdots+m_{i-1}+w,\; n_1+\cdots+n_{k-1}+t}$ is the wtth element of A_ik and that $b_{n_1+\cdots+n_{k-1}+t,\; q_1+\cdots+q_{j-1}+z}$ is the tzth element of B_kj, it is clear that
$$\sum_{t=1}^{n_k} a_{m_1+\cdots+m_{i-1}+w,\; n_1+\cdots+n_{k-1}+t}\; b_{n_1+\cdots+n_{k-1}+t,\; q_1+\cdots+q_{j-1}+z}$$
is the wzth element of A_ik B_kj and hence that $s_{m_1+\cdots+m_{i-1}+w,\; q_1+\cdots+q_{j-1}+z}$ is the wzth element of F_ij. Thus, S_ij = F_ij (i = 1, ..., r; j = 1, ..., v), which verifies the stated expression for AB.
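The block-multiplication rule just verified is easy to confirm numerically; the NumPy sketch below (added for illustration, with arbitrarily chosen block sizes) compares the full product AB with the block-by-block formula F_ij = Σ_k A_ik B_kj.

```python
import numpy as np

rng = np.random.default_rng(2)
m_sizes, n_sizes, q_sizes = [2, 3], [1, 2, 2], [3, 1]   # row/column block sizes
A = rng.standard_normal((sum(m_sizes), sum(n_sizes)))
B = rng.standard_normal((sum(n_sizes), sum(q_sizes)))

def blocks(M, row_sizes, col_sizes):
    # Split M into a nested list of blocks with the given block sizes.
    rows = np.split(M, np.cumsum(row_sizes)[:-1], axis=0)
    return [np.split(r, np.cumsum(col_sizes)[:-1], axis=1) for r in rows]

Ab = blocks(A, m_sizes, n_sizes)
Bb = blocks(B, n_sizes, q_sizes)
F = [[sum(Ab[i][k] @ Bb[k][j] for k in range(len(n_sizes)))
      for j in range(len(q_sizes))] for i in range(len(m_sizes))]
print(np.allclose(np.block(F), A @ B))    # True
```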
3
Linear Dependence and Independence

EXERCISE 1. For what values of the scalar k are the three row vectors (k, 1, 0),
(1, k, 1), and (0, 1, k) linearly dependent, and for what values are they linearly
independent? Describe your reasoning.
Solution. Let Xl, x2, and X3 represent any scalars such that

xI(k, 1,0) +x2(1,k, 1) +X3(0, l,k) = 0,

or equivalently such that

xlk +X2 = 0,
Xl +X2k +X3 = 0,
X2 +X3k = 0,

or also equivalently such that
x_2 = -k x_3 = -k x_1,  (S.1)
k x_2 = -x_1 - x_3.  (S.2)
Suppose that k = 0. Then, conditions (S.1) and (S.2) are equivalent to the conditions x_2 = 0 and x_3 = -x_1.
Alternatively, suppose that k ≠ 0. Then, conditions (S.1) and (S.2) imply that x_3 = x_1 and -k²x_1 = kx_2 = -2x_1 and hence that k² = 2 or x_3 = x_2 = x_1 = 0. Moreover, if k² = 2, then either k = √2, in which case conditions (S.1) and (S.2) are equivalent to the conditions x_3 = x_1 and x_2 = -√2 x_1, or k = -√2, in which case conditions (S.1) and (S.2) are equivalent to the conditions x_3 = x_1 and x_2 = √2 x_1.

Thus, there exist values of x_1, x_2, and x_3 other than x_1 = x_2 = x_3 = 0 if and only if k = 0 or k = ±√2. And, we conclude that the three vectors (k, 1, 0), (1, k, 1), and (0, 1, k) are linearly dependent if k = 0 or k = ±√2, and linearly independent otherwise.
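The conclusion can also be checked numerically: the three vectors are the rows of a 3 x 3 matrix whose determinant is k(k² - 2), which vanishes exactly at k = 0 and k = ±√2. A small sketch (added here for illustration, assuming NumPy):

```python
import numpy as np

def dep_matrix(k):
    return np.array([[k, 1.0, 0.0],
                     [1.0, k, 1.0],
                     [0.0, 1.0, k]])

for k in [0.0, np.sqrt(2), -np.sqrt(2), 1.0, 3.0]:
    d = np.linalg.det(dep_matrix(k))
    status = "dependent" if abs(d) < 1e-12 else "independent"
    print(f"k = {k:+.4f}  det = {d:+.4f}  {status}")
```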

EXERCISE 2. Let A, B, and C represent three linearly independent m x n


matrices. Determine whether or not the three pairwise sums A + B, A + C, and
B + C are linearly independent. [Hint. Take advantage of the following general
result on the linear dependence or independence of linear combinations: Letting
AI, A2, ... , Ak represent m x n matrices and for j = 1, ... , r, taking Cj =
xljAI + X2jA2 + ... + XkjAk (where Xlj, X2j, ... , Xkj are scalars) and letting
Xj = (Xlj, X2j, ... ,Xkj )', the linear combinations C I , C2, ... , C r are linearly
independent if AI, A2, ... , Ak are linearly independent and XI, X2, ... , Xr are
linearly independent, and they are linearly dependent if XI, X2, ... , Xr are linearly
dependent.]
Solution. It follows from the result cited in the hint that A + B, A + C, and B + C
are linearly independent if (and only if) the three vectors (1, 1, 0)', (1, 0, 1)', and (0, 1, 1)' are linearly independent. Moreover, for any scalars x_1, x_2, and x_3 such that
x_1(1, 1, 0)' + x_2(1, 0, 1)' + x_3(0, 1, 1)' = 0,
we have that x_1 + x_2 = x_1 + x_3 = x_2 + x_3 = 0, implying that x_1 = x_2 = -x_3 and hence that x_3 = 0 and x_1 = x_2 = 0. Thus, (1, 1, 0)', (1, 0, 1)', and (0, 1, 1)' are linearly independent. And, we conclude that A + B, A + C, and B + C are linearly independent.
(1,0, 1)', and (0, 1, 1)' are linearly independent. And, we conclude thatA+B, A+
C, and B + C are linearly independent.
4
Linear Spaces: Row and Column Spaces

EXERCISE 1. Which of the following two sets are linear spaces: (a) the set of all
n x n upper triangular matrices; (b) the set of all n x n nonsymmetric matrices?
Solution. Clearly, the sum of two n x n upper triangular matrices is upper triangular.
And, the matrix obtained by multiplying any n x n upper triangular matrix by any
scalar is upper triangular. However, the sum of two n x n nonsymmetric matrices is
not necessarily nonsymmetric. For example, if A is an n x n nonsymmetric matrix,
then -A and A' are nonsymmetric, yet the sums A + (-A) = 0 and A + A' are
symmetric. Also, the product of the scalar 0 and any n x n matrix is the null matrix,
which is symmetric. Thus, the set of all n x n upper triangular matrices is a linear
space, but the set of all n x n nonsymmetric matrices is not.

EXERCISE 2. Letting A represent an m x n matrix and B an m x p matrix,


verify that (1) C(A) c C(B) if and only if R(A') c R(B'), and (2) C(A) = C(B)
if and only if R(A') = R(B').
Solution. (1) Suppose that R(A') c R(B'). Then, for any vector x in C(A), we
have (in light of Lemma 4.1.1) that x' E R(A'), implying that x' E R(B') and
hence (in light of Lemma 4.1.1) that x E C(B). Thus, C(A) c C(B).
Conversely, suppose that C(A) C C(B). Then, for any m-dimensional column
vector x such that x' E R(A'), we have that x E C(A), implying that x E C(B) and
hence that x' E R(B'). Thus, R(A') c R(B').
We conclude that C(A) C C(B) if and only if R(A') c R(B').
An alternative verification of Part (1) is obtained by taking advantage of Lemma

4.2.2. We have that

C(A) ⊂ C(B) ⇔ A = BK for some matrix K
⇔ A' = K'B' for some matrix K
⇔ R(A') ⊂ R(B').

(2) IfR(A') = R(B'), then R(A') c R(B') and R(B') c R(A'), implying [in
light of Part (1)] thatC(A) C C(B) andC(B) c C(A) and hence thatC(A) = C(B).
Similarly, if C(A) = C(B), then C(A) c C(B) and C(B) C C(A), implying that
R(A') c R(B') and R(B') c R(A') and hence that R(A') = R(B'). Thus,
C(A) = C(B) if and only if R(A') = R(B').

EXERCISE 3. Let U and W represent subspaces of a linear space V. Show that


if every matrix in V belongs to U or W, then U = V or W = V.
Solution. Suppose that every matrix in V belongs to U or W. And, assume (for
purposes of establishing a contradiction) that neither U = V nor W = V. Then,
there exist matrices A and B in V such that Art. U and B rt. W. And, since A and
B each belong toU or W, A E Wand BE U.
Clearly, A = B- (B-A) andB = A+(B-A), andB -A E U orB-A E W.
If B - A E U, then B - (B - A) E U and hence A E U. If B - A E W, then
A + (B - A) E Wand hence B E W. In either case, we arrive at a contradiction.
We conclude that U = V or W = v.

EXERCISE 4. Let A, B, and C represent three matrices (having the same dimen-
sions) such that A + B + C = O. Show that sp(A, B) = sp(A, C).
Solution. Let E represent an arbitrary matrix in sp(A, B). Then, E = dA + kB for
some scalars d and k, implying (since B = -A - C) that

E = dA + k(-A - C) = (d - k)A + (-k)C E sp(A, C).

Thus, sp(A, B) C sp(A, C). And, it follows from an analogous argument that
sp(A, C) C sp(A, B). We conclude that sp(A, B) = sp(A, C).

EXERCISE 5. Let AI, ... , Ak represent any matrices in a linear space V. Show
that sp(A I, ... , A k ) is a subspace of V and that, among all subspaces of V that
contain AI, ... Ak, it is the smallest [in the sense that, for any subspace U (of V)
that contains AI, ... , Ab SP(AI, ... , Ak) c U].
Solution. Let U represent any subspace of V that contains AI, ... , Ak. It suffices
(since V itself is a subspace of V) to show that SP(AI, ... , Ak) is a subspace of U.
Let A represent an arbitrary matrix in Sp(AI, ... , Ak). Then, A = xlAI + ... +
XkAk for some scalars XI, ... , Xk, implying that A E U. Thus, Sp(AI, ... , Ak)
is a subset of U, and, since Sp(AI, ... ,Ak) is a linear space, it follows that
Sp(AI, ... , Ak) is a subspace of U.

EXERCISE 6. Let AI, ... , Ap and BI, ... , Bq represent matrices in a linear
space V. Show that if the set {AI, ... , Ap} spans V, then so does the set {AI, ... ,
A p , BI, ... , B q }. Show also that if the set {AI, ... , A p , BI, ... , B q } spans V and
if BI, ... ,Bq are expressible as linear combinations of AI, ... , A p , then the set
{AI, ... , Ap} spans V.
Solution. It suffices (as observed in Section 4.3c) to show that if BI, ... ,Bq are
expressible as linear combinations of AI, ... , A p , then any linear combination
of the matrices AI, ... , A p , BI, ... , Bq is expressible as a linear combination of
AI, ... ,Ap and vice versa. Suppose then that there exist scalars klj, ... ,kpj such
that B j = Li kijAi (j = 1, ... , q). Then, for any scalars Xl, ... ,x p' YI, ... , Yq'

I>iAi + I>jBj = L(Xi + LYjkij)Ai,


i j i j

which verifies that any linear combination of AI, ... ,A p , BI, ... ,Bq is express-
ible as a linear combination of AI, ... , Ap. That any linear combination of AI, ... ,
Ap is expressible as a linear combination of AI, ... , A p , BI, ... , Bq is obvious.

EXERCISE 7. Suppose that {A 1, ... , Ak} is a set of matrices that spans a linear
space V but is not a basis for V. Show that, for any matrix A in V, the representation
of A in terms of AI, ... , Ak is nonunique.
Solution. Let x_1, ..., x_k represent any scalars such that $A = \sum_{i=1}^{k} x_i A_i$. [Since sp(A_1, ..., A_k) = V, such scalars necessarily exist.] Since the set {A_1, ..., A_k} spans V but is not a basis for V, it is linearly dependent, and hence there exist scalars z_1, ..., z_k, not all zero, such that $\sum_{i=1}^{k} z_i A_i = 0$. Letting y_i = x_i + z_i (i = 1, ..., k), we obtain a representation $A = \sum_{i=1}^{k} y_i A_i$ different from the representation $A = \sum_{i=1}^{k} x_i A_i$.

EXERCISE 8. Let

A-
- (~ 0
1
-2
2
-3 2)
o0 6 2
5 2 .
2
o -4 -2 1 0

(a) Show that each of the two column vectors (2, -1,3, -4)' and (0, 9, -3, 12)'
is expressible as a linear combination of the columns of A [and hence is in C(A)].
(b) A basis, say S*, for a linear space V can be obtained from any finite set S
that spans V by successively applying to each of the matrices in S the following
algorithm: include the matrix in S* if it is nonnull and if it is not expressible as a
linear combination of the matrices already included in S*. Use this algorithm to
find a basis for C(A). (In applying the algorithm, take the spanning set S to be the
set consisting of the columns of A.)
(c) What is the value of rank(A)? Explain your reasoning.

(d) A basis for a linear space V that includes a specified set, say T, of r linearly
independent matrices in V can be obtained by applying the algorithm described
in Part (b) to the set S whose first r elements are the elements of T and whose
remaining elements are the elements of any finite set U that spans V. Use this
generalization of the procedure from Part (b) to find a basis for C(A) that includes
the two column vectors from Part (a). (In applying the generalized procedure, take
the spanning set U to be the set consisting of the columns of A.)
Solution. (a) Clearly,

and

(b) The basis obtained by applying the algorithm comprises the following 3

(j} (j). (0
vectors:

(c) Rank A = 3. The number of vectors in a basis for C(A) equals 3 [as is
evident from Part (b)], implying that the column rank of A equals 3.
(d) The basis obtained by applying the generalized procedure comprises the

cn
following 3 vectors:

(j). (~D·
EXERCISE 9. Let A represent a q x p matrix, B a p x n matrix, and C an m x q
matrix. Show that (a) if rank(CAB) = rank(C), then rank(CA) = rank(C) and
(b) ifrank(CAB) = rank(B), then rank(AB) = rank(B).
Solution. (a) Suppose that rank(CAB) = rank(C). Then, it follows from Corollary
4.4.5 that
rank(C) ~ rank(CA) ~ rank(CAB) = rank(C)
and hence that rank(CA) = rank(C).

(b) Similarly, suppose that rank (CAB) = rank(B). Then, it follows from Corol-
lary 4.4.5 that
rank(B) ~ rank(AB) ~ rank(CAB) = rank(B)
and hence that rank(AB) = rank(B).

EXERCISE 10. Let A represent an m x n matrix of rank r. Show that A can be


expressed as the sum of r matrices of rank 1.
Solution. According to Theorem 4.4.8, there exist an m x r matrix B and an r x n
matrix T such that A = BT. Let b_1, ..., b_r represent the first, ..., rth columns of B and t_1', ..., t_r' the first, ..., rth rows of T. Then, applying formula (2.2.9), we find that
$$A = \sum_{j=1}^{r} A_j,$$
where (for j = 1, ..., r) A_j = b_j t_j'. Moreover, according to Theorem 4.4.8, rank(B) = rank(T) = r, and it follows that b_1, ..., b_r and t_1', ..., t_r' are nonnull and hence that A_1, ..., A_r are nonnull. And, upon observing (in light of Corollary 4.4.5 and Lemma 4.4.3) that rank(A_j) ≤ rank(b_j) ≤ 1, it is clear that rank(A_j) = 1 (j = 1, ..., r).
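As a numerical companion to this argument, the sketch below (illustrative only; it uses the SVD rather than the factorization of Theorem 4.4.8, which it is not claimed to reproduce) produces a full-rank factorization A = BT and verifies that A is the sum of r rank-1 terms.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))   # rank 2
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))
B = U[:, :r] * s[:r]          # m x r, full column rank
T = Vt[:r, :]                 # r x n, full row rank
terms = [np.outer(B[:, j], T[j, :]) for j in range(r)]   # b_j t_j'
print(np.allclose(A, sum(terms)))                        # True
print([np.linalg.matrix_rank(M) for M in terms])         # [1, 1]
```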

EXERCISE 11. Let A represent an m x n matrix and C a q x n matrix.


(a) Confirm that R(C) = R$\binom{A}{C}$ ⇔ R(A) ⊂ R(C).
(b) Confirm that rank(C) ≤ rank$\binom{A}{C}$, with equality holding if and only if R(A) ⊂ R(C).
Solution. (a) Suppose that R(A) ⊂ R(C). Then, according to Lemma 4.2.2, there exists an m x q matrix L such that A = LC and hence such that $\binom{A}{C} = \binom{L}{I}C$. Thus, R$\binom{A}{C}$ ⊂ R(C), implying [since R(C) ⊂ R$\binom{A}{C}$] that R(C) = R$\binom{A}{C}$.
Conversely, suppose that R(C) = R$\binom{A}{C}$. Then, since R(A) ⊂ R$\binom{A}{C}$, R(A) ⊂ R(C). Thus, we have established that R(C) = R$\binom{A}{C}$ ⇔ R(A) ⊂ R(C).
(b) Since (according to Lemma 4.5.1) R(C) ⊂ R$\binom{A}{C}$, it follows from Theorem 4.4.4 that rank(C) ≤ rank$\binom{A}{C}$. Moreover, if R(A) ⊂ R(C), then [according to Part (a) or to Lemma 4.5.1] R(C) = R$\binom{A}{C}$ and consequently rank(C) = rank$\binom{A}{C}$. And, conversely, if rank(C) = rank$\binom{A}{C}$, then, since R(C) ⊂ R$\binom{A}{C}$, it follows from Theorem 4.4.6 that R(C) = R$\binom{A}{C}$ and hence [in light of Part (a) or of Lemma 4.5.1] that R(A) ⊂ R(C). Thus, rank(C) ≤ rank$\binom{A}{C}$, with equality holding if and only if R(A) ⊂ R(C).
5
Trace of a (Square) Matrix

EXERCISE 1. Show that for any m x n matrix A, n x p matrix B, and p x q


matrix C,
tr(ABC) = tr(B' A' C') = tr(A' C'B').

Solution. Making use of results (2.9) and (1.5), we find that
tr(ABC) = tr(CAB) = tr[(CAB)'] = tr(B'A'C') = tr(A'C'B').
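These trace identities are easy to spot-check numerically; the short sketch below (not part of the original solution, assuming NumPy) does so for random conformable matrices.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((5, 3))
t = np.trace(A @ B @ C)
print(np.allclose(t, np.trace(B.T @ A.T @ C.T)))   # tr(ABC) = tr(B'A'C')
print(np.allclose(t, np.trace(A.T @ C.T @ B.T)))   # tr(ABC) = tr(A'C'B')
```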

EXERCISE 2. Let A, B, and C represent n x n matrices.


(a) Using the result of Exercise 1 (or otherwise), show that if A, B, and C are
symmetric, then tr(ABC) = tr(BAC).
(b) Show that [aside from special cases like that considered in Part (a)] tr(BAC)
is not necessarily equal to tr(ABC).
Solution. (a) If A, B, and C are symmetric, then B' A'e' = BAC and it follows
from the result of Exercise 1 that tr(ABC) = tr(BAC).
(b) Let A = diag(A*, 0), B = diag(B*, 0), and C = diag(C*, 0), where

A* = G~). B* = (_ ~ ~ ). C* = G_~ ).

we find that

EXERCISE 3. Let A represent an n x n matrix such that A' A = A 2 .


(a) Show that tr[(A - A')'(A - A')] = O.
(b) Show that A is symmetric.

Solution. (a) Making use of results (2.3) and (1.5), we find that

tr[ (A - A')' (A - A')] = tr[A' A - A' A' - AA + AA']


= tr(A' A) - tr[ (AA)'] - tr(A 2) + tr(AA')
= tr(AA') - tr[(AA)']
= tr(A' A) - tr[(AA)']
= tr(A' A) - tr(AA) = O.
(b) In light of Lemma 5.3.1, it follows from Part (a) that A - A' = 0 or equiva-
lently that A' = A.
6
Geometrical Considerations

EXERCISE 1. Use the Schwarz inequality to show that, for any two matrices A
and B in a linear space V,

II A + B II :<:: II A II + II B II,

with equality holding if and only if B = 0 or A = kB for some nonnegative scalar


k. (This inequality is known as the triangle inequality.)
Solution. We have that
||A + B||² = (A + B) ∘ (A + B)
= ||A||² + 2(A ∘ B) + ||B||²
≤ ||A||² + 2|A ∘ B| + ||B||²  (S.1)
≤ ||A||² + 2||A|| ||B|| + ||B||²  (S.2)
(using the Schwarz inequality)
= (||A|| + ||B||)²,
or equivalently that
||A + B|| ≤ ||A|| + ||B||.
For this inequality to hold as an equality, it is necessary and sufficient that both of inequalities (S.1) and (S.2) hold as equalities. Recalling (from, for instance, Theorem 6.3.1) the conditions under which the Schwarz inequality holds as an equality, we find that inequalities (S.1) and (S.2) both hold as equalities if and only if B = 0 or A = kB with k ≥ 0.

EXERCISE 2. Letting A, B, and C represent arbitrary matrices in a linear space


V, show that
(a) δ(B, A) = δ(A, B), that is, the distance between B and A is the same as that between A and B;
(b) δ(A, B) > 0 if A ≠ B, and δ(A, B) = 0 if A = B, that is, the distance between any two matrices is greater than zero, unless the two matrices are identical, in which case the distance between them is zero;
(c) δ(A, B) ≤ δ(A, C) + δ(C, B), that is, the distance between A and B is less than or equal to the sum of the distances between A and C and between C and B;
(d) δ(A, B) = δ(A + C, B + C), that is, distance is unaffected by a translation of "axes."
[For Part (c), use the result of Exercise l, i.e., the triangle inequality.]
Solution. (a)
δ(B, A) = ||B - A|| = ||(-1)(A - B)|| = |-1| ||A - B|| = ||A - B|| = δ(A, B).
(b)
δ(A, B) = ||A - B|| > 0 if A - B ≠ 0, or equivalently if A ≠ B, and δ(A, B) = ||A - B|| = 0 if A - B = 0, or equivalently if A = B.
(c)
δ(A, B) = ||A - B|| = ||(A - C) + (C - B)|| ≤ ||A - C|| + ||C - B|| = δ(A, C) + δ(C, B).
(d)
δ(A + C, B + C) = ||(A + C) - (B + C)|| = ||A - B|| = δ(A, B).

EXERCISE 3. Let w_1', w_2', and w_3' represent the three linearly independent 4-dimensional row vectors (6, 0, -2, 3), (-2, 4, 4, 2), and (0, 5, -1, 2), respectively, in the linear space R4, and adopt the usual definition of inner product.
(a) Use Gram-Schmidt orthogonalization to find an orthonormal basis for the linear space sp(w_1', w_2', w_3').

(b) Find an orthonormal basis for R4 that includes the three orthonormal vectors
from Part (a). Do so by extending the results of the Gram-Schmidt orthogonaliza-
tion [from Part (a)] to a fourth linearly independent row vector such as (0, 1,0,
0).

Solution. (a) The 3 orthogonal vectors obtained by applying the formulas (for Gram-Schmidt orthogonalization) of Theorem 6.4.1 are:
y_1' = w_1' = (6, 0, -2, 3),
y_2' = w_2' - (-2/7)y_1' = (1/7)(-2, 28, 24, 20),
y_3' = w_3' - (13/21)y_1' - (8/49)y_2' = (1/147)(-118, 371, -411, -38).
By normalizing y_1', y_2', and y_3', we obtain a basis for sp(w_1', w_2', w_3') consisting of the following 3 vectors:
z_1' = (1/7)y_1' = (1/7)(6, 0, -2, 3),
z_2' = (1/6)y_2' = (1/42)(-2, 28, 24, 20),
z_3' = (21609/321930)^{1/2} y_3' = (321930)^{-1/2}(-118, 371, -411, -38).
(b) An orthonormal basis for R4 can be obtained by extending the results of the Gram-Schmidt orthogonalization to a fourth linearly independent vector w_4'. Taking w_4' = (0, 1, 0, 0) and applying the formulas of Theorem 6.4.1, we obtain the following vector, which is orthogonal to y_1', y_2', and y_3':
y_4' = w_4' - (371/2190)y_3' - (1/9)y_2' - (0)y_1' = (1/321930)(53998, 41209, 29841, -88102).
The set consisting of the normalized vector
z_4' = [321930/(13266413370)^{1/2}] y_4' = (13266413370)^{-1/2}(53998, 41209, 29841, -88102),
together with z_1', z_2', and z_3', is an orthonormal basis for R4.
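The orthonormal vectors exhibited above can be reproduced (up to rounding) with a short Gram-Schmidt routine using the usual inner product; the following sketch is illustrative only and is not part of the original solution.

```python
import numpy as np

def gram_schmidt(rows):
    # Orthogonalize, then normalize, a list of linearly independent vectors.
    ys = []
    for w in rows:
        y = w - sum(((w @ u) / (u @ u)) * u for u in ys)
        ys.append(y)
    return [y / np.linalg.norm(y) for y in ys]

W = [np.array([6.0, 0.0, -2.0, 3.0]),
     np.array([-2.0, 4.0, 4.0, 2.0]),
     np.array([0.0, 5.0, -1.0, 2.0]),
     np.array([0.0, 1.0, 0.0, 0.0])]   # fourth vector used in Part (b)
Z = np.array(gram_schmidt(W))
print(np.allclose(Z @ Z.T, np.eye(4)))                     # rows are orthonormal
print(np.allclose(Z[0], np.array([6.0, 0, -2, 3]) / 7))    # matches z_1'
```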

EXERCISE 4. Let {AI, ... , Ak} represent a nonempty (possibly linearly depen-
dent) set of matrices in a linear space V.
(a) Generalize the results underlying Gram-Schmidt orthogonalization (which are for the special case where the set {A_1, ..., A_k} is linearly independent) by showing (1) that there exist scalars x_ij (i < j = 1, ..., k) such that the set comprising the k matrices
B_1 = A_1,
B_2 = A_2 - x_12 B_1,
...,
B_k = A_k - x_{1k}B_1 - x_{2k}B_2 - ... - x_{k-1,k}B_{k-1}
is orthogonal; (2) that, for j = 1, ..., k and for those i < j such that B_i is nonnull, x_ij is given uniquely by x_ij = (A_j ∘ B_i)/(B_i ∘ B_i);
and (3) that the number of non null matrices among BI, ... , Bk equals dim[sp(AI,
... , Ak)].
(b) Describe a procedure for constructing an orthonormal basis for sp(A_1, ..., A_k).
Solution. (a) The proof of (I) and (2) is by mathematical induction. Assertions
(1) and (2) are clearly true for k = 1. Suppose now that they are true for a set
of k - 1 matrices. Then, there exist scalars xij (i < j = 1, ... , k - 1) such
that the set comprising the k - 1 matrices BI, ... , Bk-I is orthogonal, and, for
j = 1, ... , k - I and for those i < j such that Bi is nonnull, Xij is given uniquely
by
x_ij = (A_j ∘ B_i)/(B_i ∘ B_i).
Moreover, for i = 1, ..., k - 1, we find (as in the proof of the results underlying Gram-Schmidt orthogonalization in Theorem 6.4.1) that B_k ∘ B_i = 0 if and only if
x_ik (B_i ∘ B_i) = A_k ∘ B_i.
For those i (between 1 and k - 1) such that B_i = 0, this equation is satisfied by any x_ik, and, for those i such that B_i ≠ 0, it has the unique solution
x_ik = (A_k ∘ B_i)/(B_i ∘ B_i).

This completes the induction argument, thereby establishing (I) and (2).
Consider now Assertion (3). Each of the matrices BI, ... , Bk can (by repeated
substitution) be expressed as a linear combination of AI, ... , Ak. Conversely, each
of the matrices AI, ... , Ak can be expressed as a linear combination ofBI , ... , Bk.
Thus, Sp(BI, ... , Bk) = SP(AI, ... , Ak). Since the set {BI, ... , Bd is orthogonal,
we conclude - in light of Lemma 6.2.1 and Theorem 4.3.2 - that the nonnull
matrices among BI, ... , Bk form a basis for Sp(AI, ... , Ak) and hence that the
number of such matrices equals dim[sp(AI, ... , Ak)].
(b) An orthonormal basis for Sp(AI, ... , Ad can be constructed by making use
of the formulas for BI, ... , Bk from Part (a). The basis consists of those matrices
obtained by normalizing the nonnull matrices among BI, ... , Bk.

EXERCISE 5. Let A represent an m x k matrix of rank r (where r is possibly less


than k). Generalize the so-called QR decomposition of A, which is for the special
case where r = k and is obtainable through the application of Gram-Schmidt
orthogonalization to the columns of A. Do so by using the results of Exercise 4 to
obtain a decomposition of the form A = QRI, where Q is an m x r matrix with

orthonormal columns and R 1 is an r x k submatrix whose rows are the r nonnull


rows of a k x k upper triangular matrix R having r positive diagonal elements and
k - r null rows.

Solution. Denote the first, ... , kth columns of A by aI, ... , at, respectively. Then,
according to the results of Exercise 4, there exist scalars xij (i < j = I, ... , k)
such that the k column vectors bl, ... , bk defined recursively by the equalities

b_1 = a_1,
b_2 = a_2 - x_12 b_1,
...,
b_k = a_k - x_{1k}b_1 - ... - x_{k-1,k}b_{k-1},
or equivalently by the equalities
a_1 = b_1,
a_2 = b_2 + x_12 b_1,
...,
a_k = b_k + x_{k-1,k}b_{k-1} + ... + x_{1k}b_1,
form an orthogonal set. Further, r of the vectors bl, ... , bt, say the SI th, ... , srth
of them, are nonnull, and, for j = 1, ..., k and for those i < j such that b_i is nonnull, x_ij is given uniquely by x_ij = (a_j ∘ b_i)/(b_i ∘ b_i).
Now, let B represent the m x k matrix whose first, ... , kth columns are bl, ... ,
bk, respectively, and let X represent the k x k unit upper triangular matrix whose ij th
element is (for i < j = 1, ..., k) x_ij. Then, observing that the first column of BX is b_1 and that (for j = 2, ..., k) the jth column of BX is b_j + x_{j-1,j}b_{j-1} + ... + x_{1j}b_1 and recalling result (2.2.9), we find that
A = BX = B_1X_1,
where B_1 is the m x r submatrix (of B) whose columns are the s_1th, ..., s_rth
columns of B and Xl is the r x k submatrix (of X) whose rows are the slth,
... ,srth rows of X.
And, the decomposition A = B_1X_1 can be reexpressed as A = QR_1,
where Q = B_1 D, with D = diag(||b_{s_1}||^{-1}, ..., ||b_{s_r}||^{-1}), and R_1 = E X_1, with E = diag(||b_{s_1}||, ..., ||b_{s_r}||); or equivalently where Q is the m x r matrix with jth column ||b_{s_j}||^{-1} b_{s_j} and R_1 = {r_ij} is the r x k matrix with
$$r_{ij} = \begin{cases} \|b_{s_i}\|\, x_{s_i j}, & \text{for } j > s_i, \\ \|b_{s_i}\|, & \text{for } j = s_i, \\ 0, & \text{for } j < s_i. \end{cases}$$

Moreover, the columns of Q are orthonormal, and R_1 is an r x k submatrix whose rows are the r nonnull rows of a k x k upper triangular matrix R having r positive diagonal elements and k - r null rows; the s_1th, ..., s_rth rows of R (which are the nonnull rows) are respectively the first, ..., rth rows of R_1.
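A small Python/NumPy sketch of this rank-deficient QR construction is given below (added for illustration; the function name and example matrix are ad hoc). It follows the recipe of Exercise 4: compute the b_j's column by column, keep the nonnull ones, and rescale to obtain Q and R_1.

```python
import numpy as np

def rank_deficient_qr(A, tol=1e-10):
    m, k = A.shape
    bs, X = [], np.eye(k)                  # X: unit upper triangular factors
    for j in range(k):
        b = A[:, j].copy()
        for i in range(j):
            if np.linalg.norm(bs[i]) > tol:
                X[i, j] = (A[:, j] @ bs[i]) / (bs[i] @ bs[i])
                b -= X[i, j] * bs[i]
        bs.append(b)
    keep = [i for i, b in enumerate(bs) if np.linalg.norm(b) > tol]
    Q = np.column_stack([bs[i] / np.linalg.norm(bs[i]) for i in keep])
    R1 = np.array([np.linalg.norm(bs[i]) * X[i, :] for i in keep])
    return Q, R1

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0],
              [1.0, 2.0, 3.0]])            # rank 2
Q, R1 = rank_deficient_qr(A)
print(np.allclose(A, Q @ R1), np.allclose(Q.T @ Q, np.eye(Q.shape[1])))  # True True
```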
7
Linear Systems: Consistency and
Compatibility

EXERCISE 1. (a) Let A represent an m x n matrix, C an n x q matrix, and B a


q x p matrix. Show that if rank(AC) = rank(C), then

R(ACB) = R(CB) and rank(ACB) = rank(CB)

and that if rank(CB) = rank(C), then

C(ACB) = C(AC) and rank(ACB) = rank(AC) .

(b) Let A and B represent m x n matrices. (1) Show that if C is an r x q matrix and
Daqxmmatrixsuchthatrank(CD) = rank (D) , then CDA = CDBimpliesDA =
DB. {Hint. To show that DA = DB, it suffices to show that rank [D(A - B)] = O.}
(2) Similarly, show that if C is an n x q matrix and D a q x p matrix such that
rank (CD) = rank(C), then ACD = BCD implies AC = BC.
Solution. (a) It is clear from Corollary 4.2.3 that R(ACB) c R(CB) and C(ACB)
c C(AC).
Now, suppose that rank(AC) = rank(C). Then, according to Corollary 4.4.7,
R(AC) = R(C), and it follows from Lemma 4.2.2 that C = LAC for some matrix
L. Thus,
R(CB) = R(LACB) c R(ACB),
implying that R(ACB) = R(CB) [which implies, in turn, that rank(ACB) =
rank(CB)].
Similarly, if rank(CB) = rank(C), then C(CB) = C(C), in which case C =
CBR for some matrix R, implying that C(AC) = C(ACBR) c C(ACB) and
hence that C(ACB) = C(AC) [and rank(ACB) = rank(AC)].

(b) Let F = A-B. (1) Suppose that rank(CD) = rank(D). Then, if CDA =
CDB, we find, in light of Part (a), that

rank(DF) = rank(CDF) = rank(CDA - CDB) = rank(O) = 0,


implying that DF = 0 or equivalently that DA = DB.
(2) Similarly, suppose that rank(CD) = rank(C). Then, if ACD = BCD, we
find, in light of Part (a), that

rank(FC) = rank(FCD) = rank(ACD - BCD) = rank(O) = 0,


implying that FC = 0 or equivalently that AC = BC.
8
Inverse Matrices

EXERCISE 1. Let A represent an m x n matrix. Show that (a) if A has a right inverse, then n ≥ m and (b) if A has a left inverse, then m ≥ n.
Solution. (a) If A has a right inverse, then, according to Lemma 8.1.1, rank(A) = m, and, since (according to Lemma 4.4.3) n ≥ rank(A), it follows that n ≥ m. (b) Similarly, if A has a left inverse, then, according to Lemma 8.1.1, rank(A) = n, and, since (according to Lemma 4.4.3) m ≥ rank(A), it follows that m ≥ n.

EXERCISE 2. An n x n matrix A is said to be involutory if A2 = I, that is, if A


is invertible and is its own inverse.
(a) Show that an n x n matrix A is involutory if and only if (I - A)(I + A) = O.

(b) Show that a 2 x 2 matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is involutory if and only if (1) a² + bc = 1 and d = -a or (2) b = c = 0 and d = a = ±1.


Solution. (a) Clearly,
(I - A)(I + A) = I - A + (I - A)A = I - A + A - A² = I - A².
Thus,
(I - A)(I + A) = 0 ⇔ I - A² = 0 ⇔ A² = I.
(b) Clearly,
$$A^2 = \begin{pmatrix} a^2 + bc & ab + bd \\ ac + cd & bc + d^2 \end{pmatrix}.$$
And, if Condition (1) or (2) is satisfied, it is easy to see that A is involutory.


Conversely, suppose that A is involutory. Then, ab = -db and ac = -dc, implying that d = -a or b = c = 0. Moreover, a² + bc = 1 and d² + bc = 1. Consequently, if d = -a, Condition (1) is satisfied. Alternatively, if b = c = 0, then d² = a² = 1, implying that d = a = ±1 (in which case Condition (2) is satisfied) or that d = -a = ±1 (in which case Condition (1) is satisfied).
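A quick numerical illustration of Part (b) (added here, assuming NumPy): a 2 x 2 matrix with a² + bc = 1 and d = -a squares to the identity.

```python
import numpy as np

a, b = 2.0, 3.0
c = (1.0 - a**2) / b           # enforce a^2 + bc = 1
A = np.array([[a, b],
              [c, -a]])        # d = -a
print(np.allclose(A @ A, np.eye(2)))   # True: A is involutory
```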

EXERCISE 3. Let A represent an n x n nonnull symmetric matrix, and let B


represent an n x r matrix of full column rank rand T an r x n matrix of full row
rank r such that A = BT. Show that the r x r matrix TB is nonsingular. (Hint.
Observe that A' A = A 2 = BTBT.)
Solution. Since A' A = A2 = BTBT, we have (in light of Corollaries 7.4.5 and
8.3.4) that
rank(BTBT) = rank(A' A) = rank(A) = r.

And, making use of Lemma 8.3.2, we find that

rank(BTBT) = rank(TBT) = rank(TB).

Thus, rank(TB) = r.

EXERCISE 4. Let A represent an n x n matrix, and partition A as A = (AI, A2).

(a) Show that if A is invertible and A^{-1} is partitioned as $A^{-1} = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}$ (where B_1 has the same number of rows as A_1 has columns), then
B_1A_1 = I, B_1A_2 = 0, B_2A_1 = 0, B_2A_2 = I,  (E.1)
A_1B_1 = I - A_2B_2, A_2B_2 = I - A_1B_1.  (E.2)
(b) Show that if A is orthogonal, then
A_1'A_1 = I, A_1'A_2 = 0, A_2'A_1 = 0, A_2'A_2 = I,  (E.3)
A_1A_1' = I - A_2A_2', A_2A_2' = I - A_1A_1'.  (E.4)
Solution. (a) To establish results (E.1) and (E.2), it suffices to observe that if A is invertible and A^{-1} is partitioned as $A^{-1} = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}$, then
$$\begin{pmatrix} B_1A_1 & B_1A_2 \\ B_2A_1 & B_2A_2 \end{pmatrix} = A^{-1}A = I$$
and
$$A_1B_1 + A_2B_2 = AA^{-1} = I.$$
(b) Results (E.3) and (E.4) can be obtained as a special case of results (E.1) and (E.2) by observing that if A is orthogonal, then A is invertible and $A^{-1} = A' = \begin{pmatrix} A_1' \\ A_2' \end{pmatrix}$.
EXERCISE 5. Let A represent an m x n non null matrix of rank r. Show that
there exists an m x m orthogonal matrix whose first r columns span C(A).
Solution. According to Theorem 6.4.3, there exist r m-dimensional vectors that
are orthonormal with respect to the usual inner product for n mxI and form a
basis for C(A). And, according to Theorem 6.4.5, there exist m - r additional
m-dimensional vectors, say b r+1, ... , b m, such that bI, ... , b r , b r+1, ... , bm are
orthonormal with respect to the usual inner product for n'" x I and form a basis for
nmx 1. Clearly, the m x m matrix whose first, ... , rth, (r + l)th, ... , mth columns
are respectively bI, ... , b r , b r+ 1, ... , b m is orthogonal, and its first r columns
span C(A).

EXERCISE 6. Let T represent an n x n triangular matrix. Show that rank(T) is


greater than or equal to the number of nonzero diagonal elements in T.
Solution. Suppose that T has m nonzero diagonal elements and that they are
located in the i 1 th, i 2 th, ... , im th rows of T. Let T * represent the m x m submatrix
obtained by striking out all of the rows and columns of T except the iIth, i2th,
... , im th rows and columns. Then, T * is triangular, and the diagonal elements of
T *, which are identical to the i 1 th, i2th, ... , im th diagonal elements of T, are all
nonzero. Thus, it follows from Corollary 8.5.6 that rank(T*) = m. We conclude,
on the basis of Theorem 4.4.10, that rank(T) ~ m.

EXERCISE 7. Let

$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1r} \\ 0 & A_{22} & \cdots & A_{2r} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_{rr} \end{pmatrix}, \qquad B = \begin{pmatrix} B_{11} & 0 & \cdots & 0 \\ B_{21} & B_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ B_{r1} & B_{r2} & \cdots & B_{rr} \end{pmatrix}$$
represent respectively an n x n upper block-triangular matrix whose ijth block A_ij is of dimensions n_i x n_j (j ≥ i = 1, ..., r) and an n x n lower block-triangular matrix whose ijth block B_ij is of dimensions n_i x n_j (j ≤ i = 1, ..., r).
(a) Assume that A and B are invertible, and "recall" that
$$A^{-1} = \begin{pmatrix} F_{11} & F_{12} & \cdots & F_{1r} \\ 0 & F_{22} & \cdots & F_{2r} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & F_{rr} \end{pmatrix}, \qquad B^{-1} = \begin{pmatrix} G_{11} & 0 & \cdots & 0 \\ G_{21} & G_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ G_{r1} & G_{r2} & \cdots & G_{rr} \end{pmatrix},$$
where
$$F_{ii} = A_{ii}^{-1}, \qquad F_{ij} = -A_{ii}^{-1} \sum_{k=i+1}^{j} A_{ik} F_{kj} \quad (j > i = 1, \ldots, r), \qquad (*)$$
$$G_{ii} = B_{ii}^{-1}, \qquad G_{ij} = -B_{ii}^{-1} \sum_{k=j}^{i-1} B_{ik} G_{kj} \quad (j < i = 1, \ldots, r). \qquad (**)$$
Show that the submatrices F_ij (j ≥ i = 1, ..., r) and G_ij (j ≤ i = 1, ..., r) are also expressible as
$$F_{jj} = A_{jj}^{-1}, \qquad F_{ij} = -\Bigl(\sum_{k=i}^{j-1} F_{ik} A_{kj}\Bigr) A_{jj}^{-1} \quad (i < j = 1, \ldots, r), \qquad (E.5)$$
$$G_{jj} = B_{jj}^{-1}, \qquad G_{ij} = -\Bigl(\sum_{k=j+1}^{i} G_{ik} B_{kj}\Bigr) B_{jj}^{-1} \quad (i > j = 1, \ldots, r). \qquad (E.6)$$
Do so by applying results (**) and (*) to A' and B', respectively.
(b) Formulas (*) form the basis for an algorithm for computing A -I in r steps: the
first step is to compute the matrix F rr = A;,I; the (r - i + l)th step is to compute
the matrices F ii , Fi,i+I, ... , Fir from formulas (*) (i = r - 1, r - 2, ... ,1).
Similarly, formulas (**) form the basis for an algorithm for computing B- 1 in r
steps: the first step is to compute the matrix GIl = B1/; the ith step is to compute
the matrices Gil, Gi2, ... , Gi i from formulas (**) (i = 2, ... , r). Describe how
formulas (E.5) and (E.6) in Part (a) can be used to devise r-step algorithms for
computing A -I and B- 1, and indicate how these algorithms differ from those
based on formulas (*) and (**).
Solution. (a) Clearly, it suffices to show that
$$(A')^{-1} = \begin{pmatrix} F_{11}' & 0 & \cdots & 0 \\ F_{12}' & F_{22}' & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ F_{1r}' & F_{2r}' & \cdots & F_{rr}' \end{pmatrix}, \qquad (B')^{-1} = \begin{pmatrix} G_{11}' & G_{21}' & \cdots & G_{r1}' \\ 0 & G_{22}' & \cdots & G_{r2}' \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & G_{rr}' \end{pmatrix},$$
where
$$F_{jj}' = (A_{jj}')^{-1}, \qquad F_{ij}' = -(A_{jj}')^{-1} \sum_{k=i}^{j-1} A_{kj}' F_{ik}' \quad (i < j = 1, \ldots, r),$$
$$G_{jj}' = (B_{jj}')^{-1}, \qquad G_{ij}' = -(B_{jj}')^{-1} \sum_{k=j+1}^{i} B_{kj}' G_{ik}' \quad (i > j = 1, \ldots, r),$$
or equivalently (after relabeling the i and j subscripts) where
$$F_{ii}' = (A_{ii}')^{-1}, \qquad F_{ji}' = -(A_{ii}')^{-1} \sum_{k=j}^{i-1} A_{ki}' F_{jk}' \quad (j < i = 1, \ldots, r),$$
$$G_{ii}' = (B_{ii}')^{-1}, \qquad G_{ji}' = -(B_{ii}')^{-1} \sum_{k=i+1}^{j} B_{ki}' G_{jk}' \quad (j > i = 1, \ldots, r).$$
Upon observing that
$$A' = \begin{pmatrix} A_{11}' & 0 & \cdots & 0 \\ A_{12}' & A_{22}' & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ A_{1r}' & A_{2r}' & \cdots & A_{rr}' \end{pmatrix}, \qquad B' = \begin{pmatrix} B_{11}' & B_{21}' & \cdots & B_{r1}' \\ 0 & B_{22}' & \cdots & B_{r2}' \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & B_{rr}' \end{pmatrix},$$
the validity of these formulas for (A')^{-1} and (B')^{-1} is seen to be an immediate consequence of formulas (**) and (*), respectively.
(b) To compute A^{-1}, we can employ an r-step algorithm, whose first step is to compute F_11 = A_11^{-1} and whose jth step is to compute the matrices F_1j, F_2j, ..., F_jj from formulas (E.5) (j = 2, ..., r). To compute B^{-1}, we can employ an r-step algorithm, whose first step is to compute G_rr = B_rr^{-1} and whose (r - j + 1)th step is to compute the matrices G_jj, G_{j+1,j}, ..., G_rj from formulas (E.6) (j = r - 1, r - 2, ..., 1). These algorithms differ from those based on formulas (*) and (**) in that they generate A^{-1} and B^{-1} one "column" of blocks at a time, rather than one "row" at a time.
9
Generalized Inverses

EXERCISE 1. Let A represent any m x n matrix and B any m x p matrix. Show


that if AHB = B for some n x m matrix H, then AGB = B for every generalized inverse G of A.
Solution. Suppose that AHB = B for some n x m matrix H, and let G represent an arbitrary generalized inverse of A. Then,
AGB = AGAHB = AHB = B.

[Or, alternatively, observe that HB is a solution to the linear system AX = B (in


X), so that this linear system is consistent and it follows from Theorem 9.1.2 that
GB is a solution to AX = B or equivalently that AGB = B.]

EXERCISE 2. (a) Let A represent an m x n matrix. Show that any n x m matrix


X such that A' AX = A' is a generalized inverse of A and similarly that any n x m
matrix Y such that AA'Y' = A is a generalized inverse of A.
(b) Use Part (a), together with the result that (for any matrix A) the linear system
A' AX = A' (in X) is consistent, to conclude that every matrix has at least one
generalized inverse.
Solution. (a) Suppose that X is such that A' AX = A'. Then,

A'AXA = A'A = A'AI,

and it follows from Corollary 5.3.3 that

AXA=AI=A

(i.e., that X is a generalized inverse of A). Similarly, if Y is such that AA'Y' = A,


then
AA'y' A' = AA' = AA'I,
implying that A'y' A' = A'I = A' and hence that
AYA = (A'Y' A')' = (A')' = A.
(b) The consistency (for any matrix A) of the linear system A' AX = A' implies
that corresponding to any matrix A, there exists a matrix X such that A' AX = A'
(and a matrix Y such that AA'Y' = A). Thus, it follows from Part (a) that every
matrix has at least one generalized inverse.
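Part (a) suggests a practical way to produce a generalized inverse: solve the (always consistent) system A'AX = A'. A minimal sketch, assuming NumPy (the least-squares solver handles the singular coefficient matrix and returns one exact solution of the consistent system):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))   # 5 x 4, rank 2
# Any X with A'AX = A' is a generalized inverse of A.
X, *_ = np.linalg.lstsq(A.T @ A, A.T, rcond=None)
print(np.allclose(A @ X @ A, A))   # True: X is a generalized inverse of A
```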

EXERCISE 3. Let A represent an m x n nonnull matrix, let B represent a matrix


of full column rank and T a matrix of full row rank such that A = BT, and let L
represent a left inverse of Band R a right inverse of T.
(a) Show that the matrix R(B'B) -I R' is a generalized inverse of the matrix A' A
and that the matrix L' (TT') -I L is a generalized inverse of the matrix AA'.
(b) Show that if A is symmetric, then the matrix R(TB)-IL is a generalized
inverse of the matrix A2. (If A is symmetric, then it follows from the result of
Exercise 8.3 that TB is nonsingular.)
Solution. (a) Clearly,

A' A[R(B'B) -I R']A' A = T'B'BTR(B'B) -I R'T'B'BT


= T'B'BI(B'B)-I(TR)'B'BT
= T'I'B'BT = T'B'BT = A' A,
and similarly

AA'[L'(TT')-IL]AA' = BTT'B'L'(TT')-ILBTT'B'
= BTT'(LB)'(TT')-IITT'B'
= BTT'I'B' = BTT'B' = AA'.

(b) Clearly,

A 2[R(TB)-IL]A 2 = BTBTR(TB)-ILBTBT
= BTBI(TB)-IITBT = BTBT = A2.

EXERCISE 4. A generalized inverse, say G, of an m x n matrix A of rank r


can be obtained by an approach consisting of (1) finding r linearly independent
rows (of A), say rows ii, i2, ... , iT (where il < i2 < ... < ir), and r linearly
independent columns, say h, jz, ... , jr (where h < jz < ... < jr), (2) inverting
the submatrix, say BII, of A obtained by striking out all of the rows and columns

(of A) save rows il,i2, ... ,i, and columns h,h, ... ,j" and (3) taking (for
s = 1,2, ... , rand t = 1,2, ... , r) the jsitth element of G to be the stth element
ofB l / and taking its other (n - r)(m - r) elements to be O. Use this approach to
find a generalized inverse of the matrix

Solution. The second and third columns of A are linearly independent (as can
be easily verified), implying (since the first column is null) that r = 2. Choose,
for example, the linearly independent rows and linearly independent columns so
that il = 2 and i2 = 4 - clearly, the second and fourth rows of A are linearly
independent- and h = 2 and h = 3. Then,

Bll = (~ D·
Applying formula (8.1.2) for the inverse of a 2 x 2 nonsingular matrix, we find
that
Bl/ = (1/6) (_~ -2)4 .
Thus, one generalized inverse of A is

~)
0 0
G = (1/6) (~ 3
-3
0
0
-24
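The three-step recipe of this exercise is easy to mechanize; the following sketch (illustrative only, with an ad hoc example matrix rather than the one in the exercise) picks r linearly independent rows and columns, inverts the corresponding submatrix, and scatters its entries into G.

```python
import numpy as np

def gen_inverse_from_submatrix(A, rows, cols):
    # rows, cols: index lists of r linearly independent rows and columns of A,
    # where r = rank(A).  G gets the (s, t) entry of B11^{-1} in position
    # (cols[s], rows[t]) and zeros elsewhere.
    B11_inv = np.linalg.inv(A[np.ix_(rows, cols)])
    G = np.zeros((A.shape[1], A.shape[0]))
    for s, js in enumerate(cols):
        for t, it in enumerate(rows):
            G[js, it] = B11_inv[s, t]
    return G

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])            # rank 2
G = gen_inverse_from_submatrix(A, rows=[0, 2], cols=[0, 1])
print(np.allclose(A @ G @ A, A))           # True: G is a generalized inverse
```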
EXERCISE 5. Let A represent an m x n nonnull matrix of rank r. Take B and K to be nonsingular matrices (of orders m and n, respectively) such that
$$A = B \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} K$$
(the existence of which is guaranteed). Show (a) that an n x m matrix G is a


generalized inverse of A if and only if G is expressible in the form
$$G = K^{-1} \begin{pmatrix} I_r & U \\ V & W \end{pmatrix} B^{-1} \qquad (E.1)$$

for some r x (m - r) matrix U, (n - r) x r matrix V, and (n - r) x (m - r) matrix


W, and (b) that distinct choices for U, V, and/or W lead to distinct generalized
inverses.

Solution. (a) Let H = KGB, and partition H as
$$H = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix},$$
where H_11 is of dimensions r x r.
Clearly, G is a generalized inverse of A if and only if
$$B \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} K\, G\, B \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} K = B \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} K,$$
or equivalently (since B and K are nonsingular) if and only if
$$\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} H \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix},$$
and hence if and only if H_11 = I_r.
Moreover, if G is expressible in the form (E.1), then $H = \begin{pmatrix} I_r & U \\ V & W \end{pmatrix}$, so that H_11 = I_r. Conversely, if H_11 = I_r, then
$$G = K^{-1}HB^{-1} = K^{-1} \begin{pmatrix} I_r & U \\ V & W \end{pmatrix} B^{-1}$$
with U = H_12, V = H_21, and W = H_22, so that G is expressible in the form (E.1). We conclude that G is a generalized inverse of A if and only if G is expressible in the form (E.1).

(b) Let $G_1 = K^{-1} \begin{pmatrix} I_r & U_1 \\ V_1 & W_1 \end{pmatrix} B^{-1}$ and $G_2 = K^{-1} \begin{pmatrix} I_r & U_2 \\ V_2 & W_2 \end{pmatrix} B^{-1}$, where U_1 and U_2 are r x (m - r) matrices, V_1 and V_2 are (n - r) x r matrices, and W_1 and W_2 are (n - r) x (m - r) matrices. Then, G_1 = G_2 only if KG_1B = KG_2B, or equivalently only if
$$\begin{pmatrix} I_r & U_1 \\ V_1 & W_1 \end{pmatrix} = \begin{pmatrix} I_r & U_2 \\ V_2 & W_2 \end{pmatrix},$$
that is, only if U_2 = U_1, V_2 = V_1, and W_2 = W_1.
EXERCISE 6. Let k represent a nonzero scalar. For any matrix A, (1/ k)A - is a
generalized inverse of the matrix kA. Generalize this result to partitioned matrices
of the form (A, kB) and (k~)' where A is an m x n matrix, B an m x p matrix,

and C a q x n matrix. Do so by showing (I) that, for any generalized inverse (~~)
of the partitioned matrix (A, B)(where Gll is of dimensions n x m), (k~~2) is
a generalized inverse of (A, kB) and (2) that, for any generalized inverse (HI, H2)

of the partitioned matrix (~) (where HI is of dimensions n x m) (HI, k- I H2)

is a generalized inverse of (k~).


Solution. (1) Clearly,

(A, kB) = (A, B) eO' kt)·

Thus, it follows from Part (2) of Lemma 9.2.4 that the matrix

is a generalized inverse of (A, kB).


(2) Similarly,

Thus, it follows from Part (1) of Lemma 9.2.4 that the matrix

(HI, H2) (
I
0' 0
kIq
)-1 = (HI, H2)
(1m
0

is a generalized inverse of (k~ ).


EXERCISE 7. Let T represent an m x p matrix and W an n x q matrix.
(a) Show that, unless T and W are both nonsingular, there exist generalized
inverses of (! ~) that are not of the form (~- ~_). [Hint. Make use
of the result that, for any m x n matrix A and for any particular generalized
inverse G of A, an n x m matrix G* is a generalized inverse of A if and only if
G* = G + (I - GA)T + S(I - AG) for some n x m matrices T and S.]
(b) Take U to be an m x q matrix and V an n x p matrix such that C(U) c C(T)
and R(V) C R(T), define Q = W - VT-U, and "recall" that the partitioned
matrix

is a generalized inverse ofthe matrix (~ ~). Generalize the result of Part (a) by
showing that, unless T and Q are both nonsingular, there exist generalized inverses
of (~ ~) that are not of the form (*). [Hint. Use Part (a), together with the
result that, for any r x s matrix B, any r x r nonsingular matrix A, and any s x s

nonsingular matrix C, a matrix G is a generalized inverse of ABC if and only if


G = C- I HA -I for some generalized inverse H of B.]

Solution. (a) Making use of the result cited in the hint, we find that, for any p x n
matrix X and q x m matrix Y, the partitioned matrix

is a generalized inverse of (! ~). 1fT is not nonsingular, then either 1- T-T =1=
oor I - TT- =1= 0, as is evident from Corollary 8.1.2. Moreover, if 1- T-T =1= 0,
then X can be chosen so that (I - T-T) X =1= 0, and similarly if 1- TT- =1= 0,
then V can be chosen so that V(I - TT-) =1= o. We conclude that if T is not
nonsingular, then there exists a generalized inverse of (! ~) that is not of

the form (TO- ~_). It follows from an analogous argument that if W is not
nonsingular, then again there exists a generalized inverse of (! ~) that is not
of the form (T- 0)
0 W-·

(b) Suppose that either Tor Q is not nonsingular, and assume (for purposes of
establishing a contradiction) that every generalized inverse of (~ ~) is of the
form (*). Upon observing that (in light of Lemma 9.3.5)

and that (according to Lemma 8.5.2) the matrices (V~- ~) and (! - T;U)
are nonsingular and upon applying the result cited in the hint and making use of
formulas (8.5.6), we find that every generalized inverse of (! g) is of the form
(0 1 -T-U)-I (T- + T-UQ-VT-
I -Q-VT-
-T-UQ-)
Q-
(I 0)-1
- VT- I
= (Io T-U) (T- + T-UQ-VT-
I -Q-VT-
(I
- T-UQ-)
Q- VT-
0)
1

- (T-0 0)Q- ,

which contradicts Part (a). We conclude that, unless T and Q are both nonsingular,
there exist generalized inverses of (~ ~) that are not of the fonn (*).

EXERCISE 8. Let T represent an m x p matrix, U an m x q matrix, V an n x p


matrix, and Wan n x q matrix, take A = (~ ~), and define Q = W - VT-U.

(a) Show that the matrix

is a generalized inverse of the matrix A if and only if


(1) (I - TT-)U(I - Q-Q) = 0,
(2) (I - QQ-)V(I - T-T) = 0, and
(3) (I - TT-)UQ-V(I - T-T) = 0.
(b) Verify that (together) the two conditions C(U) c C(T) and R(V) C R(T)
imply Conditions (1) - (3) of Part (a).
(c) Exhibit matrices T, U, V, and W that (regardless of how the generalized
inverses T- and Q- are chosen) satisfy Conditions (1) - (3) of Part (a) but do not
satisfy (both of) the conditions C(U) c C(T) and R(V) c R(T).
Solution. (a) It is a straightforward exercise to show that

U - (I - TT-)U(I - Q-Q»)
W .

Thus, AGA = A if and only if Conditions (1) - (3) are satisfied.

°
(b) Suppose that C (U) C C(T) and R(V) c R(T). Then, it follows from Lemma
9.3.5 that (I - TT-)U = and V(I - T-T) = 0, and hence that Conditions (1)
- (3) are satisfied.
°
(c) Take T = and U = 0, take W to be an arbitrary nonnull matrix, and take V
to be any nonnull matrix such thatC(V) C C(W). Conditions (1) and (3) are clearly
satisfied. Moreover, Q = W, and (in light of Lemma 9.3.5) (I - QQ-)V = 0, so
that condition (2) is also satisfied. On the other hand, the condition R(V) c R(T)
is obviously not satisfied.

EXERCISE 9. Suppose that a matrix A is partitioned as

and that C(AI2) c C(AII) and R(A2t> C R(AII)' Take Q to be the Schur

complement of All in A relative to All' and partition Q as

(where Qll, Q12, Q21' and Q22 are of the same dimensions as A22, A23, A32, and
A33, respectively), so that QII = A22 - A21AlI AJ2, QJ2 = A23 - A21AlIAI3,
Q21 = A32 - A3IAlIAJ2, and Q22 = A33 - A3IAlIA!3. Let

Define T = (All AJ2), U


A21 A22
(1~~). and V = (A31,A32), or equivalently
define T, U, and V to satisfy

Show that (1) G is a generalized inverse of T; (2) the Schur complement Q22 -
Q21 Q ll QI2 of QI I in Q relative to Q ll equals the Schur complement A33 - VGU
of T in A relative to G; and (3)

GU = (AlIAI3 - ~IIAI2QlIQ12) ,
QII QI2
VG = (A3IAII - Q21 Q lIA2I All ' Q21 Q ll)'

(b) Let A represent an n x n matrix (where n ::: 2), let nl, ... , nk represent
positive integers such that n I + ... + nk = n (where k ::: 2), and (for i = 1, ... , k)
let n7 = nl + '" + ni. Define (for i = 1, ... , k) Ai to be the leading principal
submatrix of A ofordern7 and define (fori = 1, ... , k-I)Ui tobethen7x(n-n7)
matrix obtained by striking out all of the rows and columns of A except the first n7
rows and the last n - n7 columns, Vi to be the (n - nn x n7 matrix obtained by
striking out all of the rows and columns of A except the last n- n7 rows and first
n7 columns, and Wi to be the (n - n7) x (11 - n7) submatrix obtained by striking
out all of the rows and columns of A except the last n - n7 rows and columns, so
that (for i = 1, ... , k - 1)

A= (~:
Suppose that (for i = 1, ... , k - 1) C(U i ) C C(Ai) and R(Vi) C R(Ai). Let

i)
B (i) _ ( BI\
- (i)
B21

(l. -- 1, •.. , k - 1) and B(k) = BII


(k) (I) = AI'
' where Bll - BI2(I)
= A-VI I, B(I)
21 =
V 1A -l,and B(l) 22 = W V A-V d h ~. B(i) B(i)
1 - I I I an w ere (lOfl ~ 2) Ii' 12' 21,an
B(i) d B(i)
22
are defined recursively by partitioning B~i2-1), B~I-I), and B~;I) as

B (i-I) - (X(i-I) X(i-I»


12 - I ' 2 '

(i-I) (y(Ii-I») B(i-I) _ (Q (i-I)


II Q(112"-I»)
B21 = y~-I) , 22 - Q(i-I) Q(i-I)
21 22
(in such a way that X~i-I) has ni columns, y~i-I) has ni rows, and Q~il-I) is
of dimensions ni x ni) and (using QII(i-l) to represent a generalized inverse of
Q (i-I»
II
b taki
Y ng

.
B(l) _
(B(i-I)
II
+ X(i-I)Q-(i-l)y(i-I)
I II I
_X(i-I)Q-(i-I»)
I 11
11 - _Q-(i-I)y(i-I) Q -(i-I) ,
II I II
" (X(i-I) _ X(i-I)Q-(i-I)Q(i-I»)
B(l) _ 2 I II 12
12 - Q-(i-I)Q(i-I) ,
II 12
B(i) _ (y(i-l) _ Q(i-I)Q-(i-I)y(i-l) Q(i-I)Q-(i-I»
21 - 2 21 11 I' 21 I '
B(i) _ Q(i-I) _ Q(i-I)Q-(i-I)Q(i-l)
22 - 22 21 II 12·

Show that (1) B~? is a generalized inverse of Ai (i = 1, ... , k); (2) B~d is the
Schur complement of Ai in A relative to B~ii (i = 1, ... , k -1); (3) B~id = B~iivi
and B~i = ViB~ii (i = 1, ... , k - 1).
[Note. The recursive formulas given in Part (b) for the sequence of matrices B(I) ,
... , B(k-I), B(k) can be used to generate B(k-I) in k - 1 steps or to generate B(k)
in k steps - the formula for generating B(i) from B(i-l) involves a generalized
inverse of the ni x ni matrix Q~\-I). The various parts of B(k-I) consist of a
generalized inverse B~~-I) of Ak-I, the Schur complement Bi~-l) of Ak-I in A
relative to B~~-I), a solution B~~-I) of the linear system Ak-IX = Vk-I (in X),
and a solution Bi~-l) of the linear system YAk-1 = Vk-I (in V). The matrix B(k)
is a generalized inverse of A. In the special case where ni = 1, the process of
generating the elements of the n x n matrix B(i) from those of the n x n matrix
B(i-I) is called a sweep operation - see, e.g., Goodnight (1979).]
Solution. (a) (1) That G is a generalized inverse of T is evident upon setting
T = All, V = Al2, V = A2J, and W = A22 in formula (6.2a) of Theorem 9.6.1 [or
equivalently in formula (*) of Exercise 7 or 8]-the conditions C(AI2) c C(All)
and R(A21) c R(A II ) insure that this formula is applicable.
(2)

Q22 - Q21 Qil Ql2


44 9. Generalized Inverses

= A33 - A3I A iI A \3 - (A32 - A3I AiI A12)Qil (A23 - A2IAiIAl3)


= A33 -A31(Ail + AiI A 12QiI A2I A il)Al3
-A31(-AiI A12Qil)A23 - A32(-QiI A2I A il)A13 - A32Qil A 23
= A33 - VGu.

(3) Partition GV as GV = (~~) and VG as VG = (YI, Y2) (where XI, X2,


Y I, and Y2 are of the same dimensions as Al3, A23, A31, and A32, respectively).
Then,

XI = (Ail + AiI A 12QiI A21 A il)A13 + (-Ail A12Qil)A23


= AiIA13 - Ail A 12Qil (A23 - A2I A iI A l3)
= AiIA13 - AiIA12QiI Q12

and

It can be established in similar fashion that Y I = A31Aii - Q21 Qil A21Ali and
Y2 = Q2IQil'
(b) The proof of results (1), (2), and (3) is by mathematical induction. By defi-
nition, BW is a generalized inverse of AI, Bi~ is the Schur complement of Al in
ArelativetoBii),andBi~ = Bg)VI andBW =VIBW·
Suppose now that B;\-I) is a generalized inverse of Ai-I, that B~2-1) is the
Schurcompemento
I . A relahveto
f A i-I m . B(i-I)
II ,and th at B(i-l)
12 = B(i-I)V
II i-I
and B~I-I) = Vi-IBi\-I) (where 2 :: i :: k - 1). Partition Ai, Vi, and Vi as

. _ (A i-I A (i-I»)
12 · - (A(i-I) A(i-I»
AI - A(i-I) A(i-I) , an d V I - 31 ' 32
21 22

(were
h 13 has n *i _ 1 rows an d A(i-I)
A (i-I) 31 has n *i _ 1 coI umns.
) Then, cIearI y,

· - (A(i-I) A(i-I» A(i-I»)


V I-I - 12 ' 13 and Vi-I = ( N-l)'
A31

th t X (i-I) - B(i-I)A(i-l) X(i-I) _ B(i-I)A(i-I) y(i-I) - A(i-I)B(i-1)


so a I - 11 12' 2 - II 13' I - 21 11'
and y~-I) = A~\-I)Bi\-l). Thus, it follows from Part (a) that Bi? is a generalized
inverse of Ai, that B~i is the Schur complement of Ai in A relative to BW, and
(i) (i) (i) (i)
that B12 = BII Vi and B21 = ViB II ·
We conclude (based on mathematical induction) that (for i = 1, ... , k - 1) Biii
is a generalized inverse of Ai, B~i is the Schur complement of Ai in A relative to
9. Generalized Inverses 45

B (i)
Ii ' andB(i)
12 = B(i)V
11 i an
dB(i)
21 = V i B(i)
II. Moreover, SInce
. B(k-I)
11
. a generaI·Ized
IS
inverse of Ak-I, since Qi~-l) = Bi~-I) and Bi~-l) is the Schur complement of
. A I· B(k-I). X(k-I) B(k-I) B(k-I)V
Ak-I In re atlve to 11 ,SInce I = 12 = 11 k-I and y(k-I) I =
Bi~-l) = Vk_IB~~-l), and since Ak = A, it is evident upon setting T = Ak-I,
V = Uk-I, V = Vk-I, and W = Wk-I in formula (6.2a) of Theorem 9.6.1 [or
equivalently in formula (*) of Exercise (7) or (8)] that B~~) is a generalized inverse
of Ak.

EXERCISE 10. Let T represent an m x p matrix and W an n x q matrix, and


let G = (g~~ g~~) (where Gll is of dimensions p x m) represent an arbitrary
generalized inverse ofthe (m +n) x (p +q) block-diagonal matrix A = (! ~).
Show that GIl is a generalized inverse of T and G22 a generalized inverse of W.
Show also that TG12W = 0 and WG21T = O.
Solution. Clearly,

o W0)
(T = A = AGA = (TG ll
WG21
Thus, TGII T = T (i.e., Gil is a generalized inverse of T), WG22 W = W (i.e.,
G22 is a generalized inverse of W), TG12 W = 0, and WG21 T = O.

EXERCISE 11. Let T represent an m x p matrix, V an m x q matrix, V an


n x p matrix, and Wan n x q matrix, and define Q = W - VT-V. Suppose
that C(U) C C(T) and R(V) c R(T). Prove that for any generalized inverse
G = (g~~ g~~) of the partitioned matrix (~ ~).the(q xn)submatrixG22
is a generalized inverse of Q. Do so via an approach that consists of showing that

I
( - VT- 0) (T
I V
V) (I0
W
-T-V) = (T
I 0
0)
Q

and of then using the result cited in the hint for Part (b) of Exercise 7, along with
the result of Exercise 10.
Solution. Observing (in light of Lemma 9.3.5) that V - VT-T = 0 and that
V - TT-V = 0, we find that

-T-V) _ (T
I - 0
V)
Q 0
(I

= (! ~).
Moreover, recalling Lemma 8.5.2, it follows from the result cited in the hint for
Part (b) of Exercise 7, or (equivalently) from Part (3) of Lemma 9.2.4, that the
46 9. Generalized Inverses

matrix

is a generalized inverse of the matrix (! g). Based on the result of Exercise


10, we conclude that G22 is a generalized inverse of Q..

EXERCISE 12. Let T represent an m x p matrix, U an m x q matrix, V an n x p


matrix, and W an n x q matrix, and take A = (~ ~) .

= W - VT-U, and let G = (~~: ~~~), where Gll = T- +


(a) Define Q
T-UQ-VT-, G12 = -T-UQ-, G21 = -Q-VT-, and G22 = Q-. Show that
the matrix
Gll - G12G22G21
is a generalized inverse of T. (Note. Ifthe conditions C(U) c C(T) and R(V) C
R(T) are satisfied or more generally if Conditions (I) - (3) of Part (a) of Exercise
8 are satisfied, then G is a generalized inverse of A}.
(b) Show by example that, for some values of T, U, V, and W, there exists a
generalized inverse G = (~~: ~~~) of A (where Gll is of dimensions p x m,
GI2 of dimensions p x n, G21 of dimensions q x m, and G22 of dimensions q x n)
such that the matrix
Gll - G12G22G21
is not a generalized inverse of T.
Solution. (a) We find that

Gll - G12G22G21 = T- + T-UQ-VT- - T-UQ-(Q-)-Q-VT-


= T- + T-UQ-VT- - T-UQ-VT-
= T-.

(b) Take T = (! ~), U = 0, V = 0, and W = 0, and take

G = (Gil G12).
G21 G22
9. Generalized Inverses 47

where GIl = T-, G22 = 0, and GI2 and G2I are arbitrary. Then, clearly G is a
generalized inverse of A. Further,

so that GIl - GI2G22G2I is a generalized inverse of T if and only if

Suppose, for example, that G12, GZi, and G2I are chosen so that the (I, I)th ele-
ment of GI2 G 22 G2I is nonzero (which - since any n x q matrix is a generalized in-
verseofG22 -is clearly possible). Then, the (1, I)th elementofTGI2G22G2I T is
nonzero and hence TG12G22G2I Tis nonnull. We conclude that Gil -G12G22G2I
is not a generalized inverse of T.
10
Idempotent Matrices

EXERCISE 1. Show that if an n x n matrix A is idempotent, then (a) for any n x n


nonsingular matrix B, B- 1AB is idempotent; and (b) for any integer k greater than
or equal to 2, Ak = A.
Solution. Suppose that A is idempotent.
(a) B-1AB(B-1AB) = B- 1A 2B = B-1AB.
(b) The proof is by mathematical induction. By definition, Ak = A for k = 2.
Suppose that A k = A for k = k*. Then,
Ak*+1 = AAk* = AA = A,

that is, Ak = A for k = k* + 1. Thus, for any integer k ::: 2, Ak = A.

EXERCISE 2. LetPrepresentanm xn matrix (wherem ::: n) such thatP'P = In,


or equivalently an m x n matrix whose columns are orthonormal (with respect to
the usual inner product). Show that the m x m symmetric matrix PP' is idempotent.
Solution. Clearly,
(PP')PP' = P(P'P)p' = PlnP' = PP'.

EXERCISE 3. Show that, for any symmetric idempotent matrix A, the matrix
I - 2A is orthogonal.
Solution. Clearly,
(I-2A)'(1-2A) = (1-2A)(I-2A) = 1-2A-2A+4A2 = I-2A-2A+4A = I.
50 10. Idempotent Matrices

EXERCISE 4. Let A represent an m x n matrix. Show that if A' A is idempotent,


then AA' is idempotent.

Solution. Suppose that A' A is idempotent. Then,


A'AA'A = A'A = A'AI,

implying (in light of Corollary 5.3.3) that

AA'A = AI = A
and hence that
(AA')2 = (AA' A)A' = AA'.
EXERCISE 5. Let A represent a symmetric matrix and k an integer greater than
or equal to 1. Show that if Ak+ I = Ak, then A is idempotent.

Solution. It suffices to show that, for every integer m between k and 2 inclusive,
A m+ 1 = Am implies Am = Am-I. (If Ak + 1 = Ak butA 2 were not equal to A, then
there would exist an integer m between k and 2 inclusive such that Am+ I = Am
but Am "I Am-I.)
Suppose that Am+ 1 = Am. Then, since A is symmetric,

A'AA m- 1 = A'AAm- 2
(where A0 = I), and it follows from Corollary 5.3.3 that
AA m- 1 = AA m- 2

or equivalently that

EXERCISE 6. Let A represent an n x n matrix. Show that 0/2)(1 + A) is


idempotent if and only if A is involutory (where involutory is as defined in Exercise
8.2).
Solution. Clearly,

[(1/2)(1 + A)]2 = (1/4)1 + (1/2)A + (1/4)A 2.

Thus, (1/2)(1 + A) is idempotent if and only if

(1/4)1 + (1/2)A + (1/4)A 2 = (1/2)1 + (1/2)A,

or equivalently if and only if (1/4)A 2 = (1/4)1, and hence if and only if A2 =I


(i.e., if and only if A is involutory).

EXERCISE 7. Let A and B represent n x n symmetric idempotent matrices.


Show that if C(A) = C(B). then A = B.
10. Idempotent Matrices 51

Solution. Suppose that C(A) = C(B). Then, according to Lemma 4.2.2, A = BR


and B = AS for some n x n matrices R and S. Further,
B = B' = (AS)' = S'A' = S'A.
Thus,
A = BBR = BA = S' AA = S' A = B.
EXERCISE 8. Let A represent an r x m matrix and B an m x n matrix.
(a) Show that B-A-is a generalized inverse of AB if and only if A - ABB- is
idempotent.
(b) Show that if A has full column rank or B has full row rank, then B-A-is a
generalized inverse of AB.
Solution. (a) In light of the definition of a generalized inverse and the definition
of an idempotent matrix, it suffices to show that
ABB-A-AB = AB
if and only if
A-ABB-A-ABB- = A-ABB-.
Premultiplication and postmultiplication of both sides of the first of these two
equalities by A-and B- , respectively, give the second equality, and premultipli-
cation and postmultiplication of both sides of the second equality by A and B,
respectively, give the first equality. Thus, these two equalities are equivalent.
(b) Suppose that A has full column rank. Then, according to Lemma 9.2.8, A-
is a left inverse of A (i.e., A-A = I). It follows that A- ABB- = BB- and hence,
in light of Lemma 10.2.5, that A- ABB- is idempotent. We conclude, on the basis
of Part (a), that B-A-is a generalized inverse of AB.1t follows from an analogous
argument that ifB has full row rank, then B- A - is a generalized inverse of AB.

EXERCISE 9. Let T represent an m x p matrix, U an m x q matrix, V an


n x p matrix, and W an n x q matrix, and define Q = W - VT-U. Using the
result of Part (a) of Exercise 9.8, together with the result that (for any matrix B)
rank(B) = tr (B-B) = tr (BB-), show that if
(1) (I - TT-)U(I - Q-Q) = 0,
(2) (I - QQ-)V(I - T-T) = 0, and
(3) (I - TT-)UQ-V(I - T-T) = 0,
then
rank (~ ~) = rank(T) + rank(Q) .

Solution. Let A = (~ ~). and define G as in Part (a) of Exercise 9.8. Sup-
pose that Conditions (1) - (3) are satisfied. Then, in light of Exercise 9.8, G is
52 10. Idempotent Matrices

a generalized inverse of A, and, making use of the result that (for any matrix B)
rank(B) = tr (BB - ) [which is part of result (2.1)], we find that

rank (A) = tr (AG)


_ (TT- - (I - TT-)UQ-VT- (I - TT-)UQ-)
- tr (I - QQ-)VT- QQ-
= tr(TT-) - tr[(l- TT-)UQ-VT-] + tr(QQ-)
= rank(T) - tr [(I - TT-)UQ-VT-] + rank(Q).
Moreover, it follows from Condition (3) that

and hence that

tr [(I - TT-)UQ-VT-] = tr [(I - TT-)UQ-VT-TT-]


= tr[TT-(I - TT-)UQ-VT-]
= tr [(TT- - TT-)UQ-VT-]
= tr (0)
= o.
We conclude that
rank(A) = rank(T) + rank(Q).

EXERCISE 10. Let T represent an m x p matrix, U an m x q matrix, V an n x p


matrix, and W an n x q matrix, take A = (~ ~), and define Q = W - VT-U.
Further, let

ET =I - TT-, FT =I- T-T, X = ETU, Y = VFT, Ey =I - YY-,


Fx =I - X-X, Z = EyQFx, and Q* = FxZ-Ey .

(a) (Meyer 1973, Theorem 3.l) Show that the matrix

(E.l)

where
T- - T-U(I - Q*Q)X-ET
- FTY-(I - QQ*)VT- FTY-(I - QQ*)
- FTY-(I - QQ*)QX-ET

and
10. Idempotent Matrices 53

is a generalized inverse of A.
(b) (Meyer 1973, Theorem 4.1) Show that

rank (A) = rank(T) + rank(X) + rank(Y) + rank(Z) . (E.2)

[Hint. Use Part (a), together with the result that (for any matrix B) rank(B) =
tr (B-B) = tr (BB-).]
(c) Show that if C(U) c C(T) and R(V) c R(T), then formula (E.2) for
rank(A) reduces to the formula

rank(A) = rank(T) + rank(Q),


and the formula

which is reexpressible as

can be obtained as a special case of formula (E. 1) for a generalized inverse of A.


Solution. (a) It can be shown, via some painstaking algebraic manipulation, that

and hence that

(b) Taking G to be the generalized inverse (E. 1), it is easy to show that

Thus, making use of the result that (for any matrix B) rank (B) = tr (BB-) [which
is part of result (2.1)], we find that

rank(A) = tr (AG)
= tr (TT-) + tr (XX-ET) + tr (YY-) + tr (EyQQ*)
= rank(T) + tr (XX-ET) + rank(Y) + tr (EyQQ*).
Moreover,

tr(XX-ET) = tr(ETXX-) = tr(ETETUX-)


= tr(ETUX-) = tr(XX-) = rank(X),
54 10. Idempotent Matrices

and similarly

tr(EyQQ*) = tr(EyQFxZ-Ey) = tr(EyEyQFxZ-)


= tr (EyQFxZ-)
= tr(ZZ-) = rank(Z).
(c) Suppose thatC(U) c C(T) and R(V) c R(T). Then, it follows from Lemma
9.3.5 that X = 0 and Y = O. Accordingly, Fx = I and Ey = I, implying that
Z = Q. Thus, formula (E.2) reduces to

rank(A) = rank(T) + rank(Q),


[which is formula (9.6.1)].
Clearly, Q* is an arbitrary generalized inverse of Q, and the q x m and p x n null
matrices are generalized inverses of X and Y, respectively. Thus, formula (**) can
be obtained as a special case of formula (E.l) by setting X- = 0 and Y- = 0 -
formula (**) is identical to formula (9.6.2b), and formula (*) identical to formula
(9.6.2a) [and to formula (*) of Exercise 9.7 or 9.8].
11
Linear Systems: Solutions

EXERCISE 1. Show that, for any matrix A,

Solution. Letting x represent a column vector (whose dimension equals the number
of rows in A), it follows from Corollary 9.3.6 that x E C(A) if and only if x =
AA -x, or equivalently if and only if (I - AA -)x = 0, and hence if and only if
x E N(I - AA -). We conclude that C(A) = N(I - AA -).

EXERCISE 2. Show that if Xl, ... , Xk are solutions to a linear system AX = B


(in X) and Cj, ... , Ck are scalars such that :L7= I Ci = 1, then the matrix :L7= I Ci Xi
is a solution to AX = B.
Solution. If Xl, ... , Xk are solutions to AX = Band CI, ... , Ck are scalars such
that :L7=I Ci = 1, then

EXERCISE 3. Let A and Z represent n x n matrices. Suppose that rank(A) =


n - 1, and let x and y represent nonnull n-dimensional column vectors such that
Ax = 0 and A'y = O.
(a) Show that AZ = 0 if and only if Z = xk' for some n-dimensional row vector
k'.
56 11. Linear Systems: Solutions

(b) Show that AZ = ZA = 0 if and only if Z = cxy' for some scalar c.


Solution. (a) Suppose that Z = xk' for some row vector k'. Then,

AZ = (Ax)k' = Ok' = O.

Conversely, suppose that AZ = O. Let Zj represent the jth column of Z. Since


(in light of Lemma 11.3.1 and Theorem 4.3.9) {x} is a basis for N(A), Zj = kjxfor
some scalar k j (j = 1, ... , n), in which case Z = xk', where k' = (k" ... , kn ).
(b) Suppose that Z = cxy' for some scalar c. Then, AZ = 0 [as is evident from
Part (a)], and
ZA = cxy' A = cx(A'y)' = cxO' = O.
Conversely, suppose that AZ = ZA = O. Then, it follows from Part (a) that
Z = xk' for some row vector k'. Moreover, k' = (x'x)-l (x'x)k' = (x'x)-'x'Z,
so that k'A = (x'x)-lx'ZA = 0, implying that A'k = (k'A)' = 0 and hence that
k E N(A'). Since (in light of Lemma 11.3.1 and Theorem 4.3.9) {y} is a basis for
N(A'), k = cy for some scalar c. Thus, Z = cxy'.

EXERCISE 4. Suppose that AX = B is a nonhomogeneous linear system (in


an n x p matrix X). Let s = p[n - rank(A)], and take Z" ... , Zs to be any s
n x p matrices that form a basis for the solution space of the homogeneous linear
system AZ = 0 (in an n x p matrix Z). Define Xo to be any particular solution to
AX = B, and let Xi = Xo + Zi (i = 1, ... , s).
(a) Show that the s + 1 matrices Xo, Xl, ... , Xs are linearly independent solu-
tions to AX = B.
(b) Show that every solution to AX = B is expressible as a linear combination
ofXo,X" ... ,Xs ·
(c) Show that a linear combination Lf=0 ki Xi of Xo, X" ... , Xs is a solution
to AX = B if and only if the scalars ko, k" ... , ks are such that Lf=o ki = 1.
(d) Show that the solution set of AX = B is a proper subset of the linear space
sp(Xo, X" ... , Xs).
Solution. (a) It follows from Theorem 11.2.3 that Xl, ... , Xs , like Xo, are solutions
to AX = B.
For purposes of showing that Xo, Xl, ... , Xs are linearly independent, suppose
that ko, k" ... , ks are scalars such that Lf=0 ki Xi = O. Then,

(S.l)

Consequently,
11. Linear Systems: Solutions 57

implying (since B =1= 0) that

(S.2)

and hence [in light of equality (S.l)] that L:f=l kiZi = O. Since ZI,"" Zs are
linearly independent, we have that kl = ... = ks = 0, which, together with
equality (S.2), further implies that ko = 0. We conclude that Xo, Xl, ... , Xs are
linearly independent.
(b) Let X* represent any solution to AX = B. Then, according to Theorem
11.2.3, X* = Xo + Z* for some solution Z* to AZ = O. Since ZI, ... , Zs form
a basis for the solution space of AZ = 0, Z* = L:f=1 kiZi for some scalars
ki' ... , k s . Thus,

X* s
= Xo + {;kiZi =(
1-s ) Xo + (;kiXi'
{;ki s

(c) We find that

A(tkiXi) = A[(tki)XO + tkiZi] = (tki)B.


1=0 1=0 ,=1 ,=0

Thus, if L:f=o ki = 1, then L:f=o ki Xi is a solution to AX = B. Conversely, if


L:f=o ki X i is a solution to AX = B, then clearly (L:f=o ki) B = B, implying
(since B =1= 0) that L:f=o ki = 1.
(d) It is clear from Part (b) that every solution to AX = B belongs to sp(Xo,
Xl, ... , Xs). However, not every matrix in sp(Xo, XI, ... , Xs) is a solution to
AX = B, as is evident from Part (c). Thus, the solution set of AX = B is a proper
subset of sp(Xo, XI, ... , Xs).

EXERCISE 5. Suppose that AX = B is a consistent linear system (in an n x p


matrix X). Show that if rank(A) < nand rank(B) < p, then there exists a solution
X* to AX = B that is not expressible as X* = GB for any generalized inverse G
ofA.
Solution. Suppose that rank(A) < nand rank(B) < p. Then, since the columns
of B are linearly dependent, there exists a nonnull vector kl such that Bkl =
0, and, according to Theorem 4.3.12, there exist p - 1 p-dimensional column
vectors k2, ... , kp such that the set {kl' k2, .... k p } is a basis for 'Rf. Define
K = (kl' K2), where K2 is the p x (p -1) matrix whose columns are k2, ... , k p .
Clearly, the matrix K is nonsingular.
Since the columns of A are linearly dependent, there exists a nonnull vector yj
such that Ayj = O. Let y* = (Yj, Y;), where Y; is any solution to the linear
system AY2 = BK2 (in Y2). (Since AX = B is consistent, so is AY2 = BK2.)
Clearly, AY* = BK.
58 11. Linear Systems: Solutions

Define X* = Y*K- 1 . Then,

AX* = AY*K- 1 = BKK- 1 = B,

so that X* is a solution to AX = B.
To complete the proof, it suffices to show that X* is not expressible as X* = GB
for any generalized inverse G of A. Assume the contrary, that is, assume that
X* = GB for some generalized inverse G of A. Then, since

we have that
yj = X*kl = GBkl = 0,
which (since, by definition, yj is nonnull) establishes a contradiction.

EXERCISE 6. Let A represent an m x n matrix and B an m x p matrix. If C


is an r x m matrix of full column rank (i.e., of rank m), then the linear system
CAX = CB is equivalent to the linear system AX = B (in X). Use the result of
Part (b) of Exercise 7.1 to generalize this result.
Solution. If C is an r x q matrix and D a q x m matrix such that rank(CD) =
rank(D), then the linear system CDAX = CDB (in X) is equivalent to the linear
system DAX = DB (in X) [as is evident from Part (b) of Exercise 7.1].

EXERCISE 7. Let A represent an m x n matrix, B an m x p matrix, and C a


q x m matrix, and suppose that AX = B and CAX = CB are linear systems (in
X).
(a) Show that if rank[C(A, B)] = rank(A, B), then CAX = CB is equivalent
to AX = B - this result is a generalization of the result that CAX = CB is
equivalent to AX = B if C is of full column rank (i.e., of rank m) and also of the
result that (for any n x s matrix F, the linear system A'AX = A'AF is equivalent
to the linear system AX = AF (in X).
(b) Show that if rank[C(A, B)] < rank(A, B) and if CAX = CB is consistent,
then the solution set of AX = B is a proper subset of that of CAX = CB (i.e.,
there exists a solution to CAX = CB that is not a solution to AX = B).
(c) Show, by example, that if rank[C(A, B)] < rank(A, B) and if AX = B is
inconsistent, then CAX = CB can be either consistent or inconsistent.
Solution. (a) Suppose that rank[C(A, B)] = rank(A, B). Then, according to Cor-
ollary 4.4.7, R[C(A, B)] = R(A, B) and hence R(A, B) c R[C(A, B)], im-
plying (in light of Lemma 4.2.2) that (A, B) = LC(A, B) for some matrix L.
Therefore,
A = LCA and B = LCB.
For any solution X* to CAX = CB, we find that

AX* = LCAX* = LCB = B.


11. Linear Systems: Solutions 59

Thus, any solution to CAX = CB is a solution to AX = B, and hence (since any


solution to AX = B is a solution to CAX = CB) CAX = CB is equivalent to
AX=B.
(b) Suppose that rank[C(A, B)) < rank(A, B) and that CAX = CB is consis-
tent. And, assume that AX = B is consistent - if AX = B is inconsistent, then
clearly the solution set of AX = B is a proper subset of that of CAX = CB. Then,
making use of Theorem 7.2.1, we find that

rank(A) = rank(A, B) > rank[C(A, B)] = rank(CA, CB) = rank(CA),

implying that
n - rank(A) < n - rank(CA). (S.3)
Let Xo represent any particular solution to AX = B. According to Theorem
11.2.3, the solution set of AX = B is comprised of every n x p matrix X* that is
expressible as
X* = Xo +Z*
for some solution Z* to the homogeneous linear system AZ = 0 (in an n x p
matrix Z). Similarly, since Xo is also a solution to CAX = CB, the solution set of
CAX = CB is comprised of every matrix X* that is expressible as

X* = Xo+Z*

for some solution Z* to the homogeneous linear system CAZ = O.


It follows from Lemma 11.3.2 that the dimension of the solution space of AZ = 0
equals p[n - rank(A)] and the dimension of the solution space of CAZ = 0 equals
p[n-rank(CA)]. Clearly, the solution space of AZ = Ois a subspace of the solution
space of CAZ = 0 and hence, in light of inequality (S.3), it is a proper subspace.
We conclude that the solution set of AX = B is a proper subset of the solution set
ofCAX = CB.
(c) Suppose that AX = B is any inconsistent linear system and that C = 0, in
which case
rank[C(A, B)] = 0 < rank(A, B).
Then, CAX = CB is clearly consistent.

e
Alternatively, suppose that

A= ~), B= (~) , and C = (1,0),

in which case AX = B is obviously inconsistent and

rank[C(A. B)] = 1 < 2 = rank(A. B).

Then, CAX = CB is clearly inconsistent.

EXERCISE 8. Let A represent a q x n matrix, B an m x p matrix. and C an m x q


matrix; and suppose that the linear system CAX = B (in an n x p matrix X) is
60 II. Linear Systems: Solutions

consistent. Show that the value of AX is the same for every solution to CAX = B
if and only if rank:(CA) = rank(A).
Solution. It suffices (in light of Theorem ILl 0.1) to show that rank(CA) =
rank(A) if and only if R(A) c R(CA) or equivalently [since R(CA c R(A)]
if and only if R(A) = R(CA). If R(A) = R(CA), then it follows from the
very definition of the rank of a matrix that rank(CA) = rank(A). Conversely, if
rank(CA) = rank:(A), then it follows from Corollary 4.4.7 that R(A) = R(CA).

EXERCISE 9. Let A represent an m x n matrix, B an m x p matrix, and K an


n x q matrix. Verify (1) that if X* and L * are the first and second parts, respectively,
of any solution to the linear system

(in X and L), then X* is a solution to the linear system AX = B (in X), and
L * = K'X*. and, conversely, if X* is any solution to AX = B, then X* and K'X*
are the first and second parts, respectively, of some solution to linear system (*);
and (2) (restricting attention to the special case where m = n) that If X* and L *
are the first and second parts, respectively, of any solution to the linear system

(in X and L), then X* is a solution to the linear system AX = B (in X) and
L * = K'X*, and, conversely, if X* is any solution to AX = B, then X* and K'X*
are the first and second parts, respectively, of some solution to linear system (**).
Solution. (1) Suppose that X* and L* are the first and second parts, respectively,
of any solution to linear system (*). Then, clearly

-K'X* + L* = 0,

or equivalently L * = K'X*, and

AX* = AX* + OL* = B,


so that X* is a solution to AX = B.
Conversely, suppose that X* is a solution to AX = B. Then, clearly

so that X* and K'X* are the first and second parts, respectively, of some solution
to linear system (*).
(2) Suppose that X* and L * are the first and second parts, respectively, of any
solution to linear system (**). Then, clearly
-K'X* + L* = 0,
11. Linear Systems: Solutions 61

or equivalently L * = K'X*, and

AX* = (A + KK')X* - K(K'X*) = (A + KK')X* - KL* = B,

so that X* is a solution to AX = B.
Conversely, suppose that X* is a solution to AX = B. Then, clearly,

( A + KK'
-K'
-K) ( X* ) _ (A + KK')X* - KK'X*) = (AX*) =
I K'X* - -K'X* + K'X* 0
(B)
0'

so that X* and K'X* are the first and second parts, respectively, of some solution
to linear system (**).
12
Projections and Projection Matrices

EXERCISE 1. Let Y represent a matrix in a linear space V, let U and W represent


subspaces of V, and take {X I, ... , Xs} to be a set of matrices that spans U and
{ZI, ... , Zt} to be a set that spans W. Verify that Y ..1 U if and only if Y • Xi = 0
for i = I, ... , s (i.e., that Y is orthogonal to U if and only if Y is orthogonal
to each of the matrices X I, ... , X s ); and, similarly, that U ..1 W if and only if
Xi· Zj = 0 for i = 1, ... , sand j = 1, ... , t (i.e., that U is orthogonal to W if
and only if each of the matrices XI, ... , Xs is orthogonal to each of the matrices
ZI, ... , Zt)·
Solutiou. Suppose that Y ..1 U. Then, since Xi E U, we have that y. Xi = 0 (i =
1, ... ,s).
Conversely, suppose that Y • Xi = 0 for i = 1, ... , s. For each matrix X E U,
there exist scalars CJ, ... , Cs such that X = CIXI + ... + csX s , so that

Thus, Y is orthogonal to every matrix in U, that is, Y ..1 U.


The verification of the first assertion is now complete. For purposes of verifying
the second assertion, suppose that U ..1 W. Then, since Xi E U and Y JEW, we
have that Xi • Y j = 0 (i = 1, .... s; j = 1, ... , t).
Conversely, suppose that Xi· Zj = 0 for i = I, ... , sand j = 1, ... , t. For
each matrix X E U, there exist scalars Cl, ... , Cs such that X = CIXI + ... + csXs
and, for each matrix Z in W, there exist scalars dl, ... , dt such that Z = dlZI +
64 12. Projections and Projection Matrices

... + dtZ t , so that

XoZ = ~ Ci(XiO ~djZj) = LCi ~dj(XiOZj) = O.


r ] r ]

Thus, U J.. W.

EXERCISE 2. Let U and V represent subspaces of nm


xn. Show that if dim (V)
> dim(U), then V contains a nonnull matrix that is orthogonal to U.

Solution. Let r = dim(U) and s = dim(V). And, let {AI, ... , AT} and {BI, ... ,
Bs} represent bases for U and V, respectively. Further, define H = {hi}} to be the
r x s matrix whose ijth element equals Ai ° B j.
Now, suppose that s > r. Then, since rank(H) ::s r < s, there exists an s x 1
nonnull vector x = {x j} such that Hx = O.
Let C = xlB I + ... + xsBs. Then, C is nonnull. Moreover, for i = 1, ... , r,

AiOC = XI(AioBI) + ... + xs(Ai oBs) = LhijXj.


j

Since Lj hi}x j is the ith element of the vector Hx, Lj hijX j = 0, and hence
Ai ° C = 0 (i = 1, ... , r). Thus, C is orthogonal to each of the matrices
A I, ... , AT' We conclude on the basis of Lemma 12.1.1 (or equivalently the result
of Exercise 1) that C is orthogonal to U.

EXERCISE 3. Let U represent a subspace of the linear space nm


of all m-
dimensional column vectors. Take M to be the subspace of nmxn defined by
WE M if and only ifW = (WI, ... , w n ) for some vectors WI, ... , Wn in U. Let
Z represent the projection (with respect to the usual inner product) of an m x n
matrix Y on M, and let X represent any m x p matrix whose columns span U.
Show that Z = XB* for any solution B* to the linear system

X'XB = X'Y (in B).

Solution. Let Yi represent the ith column of Y, and take Vi to be the projection
(with respect to the usual inner product) of Yi on U (i = I, ... , n). Define V =
(VI, ... , vn). Then, by definition, (Yi - Vi)'W = 0 for every vector WinU, so that,
for every matrix W = (WI, ... , w n ) in M,

n
tr[(Y - V)'W] = L(Yi - Vi)'Wi = 0,
;=1

implying that Z = V.
Now, suppose that B* is a solution to X'XB = X'Y. Then, for i = I, ... , n, the
ith column bt of B* is clearly a solution to the linear system X'Xbi = X'y; (in
12. Projections and Projection Matrices 65

bd· We conclude, on the basis of Theorem 12.2.1, that Vi = Xb7 (i = 1, ... , n)


and hence that

z = V = (VI, ... , v = (XbT, ... , Xb~) = XB*.


n)

EXERCISE 4. The projection (with respect to the usual inner product) of an


n-dimensional column vector y on a subspace U of nn
in the special case where
n = 3, y = (3, -38/5,74/5)' and U = Sp{XI, X2, X3}, with

was determined to be the vector (3, 22/5, 44/5)' -and it was observed that XI and
X2 are linearly independent and that X3 = X2 - (1/3 )XI, with the consequence that
dim(U) = 2. Recompute the projection of yon U (in this special case) by taking
X to be the 3 x 2 matrix

and carrying out the following two steps: (1) compute the solution to the normal
equations X'Xb = X'y; and (2) postmultiply X by the solution you computed in
Step (1).

Solution. (1) The normal equations are

They have the unique solution

b = (45 30)-1 (66) = (2/15 -1/6) (66) = (37/15)


30 24 38 -1/6 1/4 38 -3/2·

(2) The projection of y on U is

z= (~6 -;)
4
(~/13/25 ) = (2i/5)
44/5
.
EXERCISE 5. Let X represent any n x p matrix. If a p x n matrix B* is a solution
to the linear system X'XB = X' (in B), then B* is a generalized inverse of X and
XB* is symmetric. Show that, conversely, if a p x n matrix G is a generalized
inverse of X and if XG is symmetric, then X'XG = X' (i.e., G is a solution to
X'XB = X').
66 12. Projections and Projection Matrices

Solution. Suppose that G is a generalized inverse of X and XG is symmetric. Then,

X'XG = X'(XG)' = (XGX)' = X'.

EXERCISE 6. Using the result of Part (b) of Exercise 9.3 (or otherwise), show
that, for any nonnull symmetric matrix A,

where B is any matrix of full column rank and T any matrix of full row rank such
that A = BT. (That TB is nonsingular follows from the result of Exercise 8.3.)
Solution. Let L represent a left inverse of B and R a right inverse of T. Then,
according to Part (b) of Exercise 9.3, the matrix R(TB) - I L is a generalized inverse
of A2 or equivalently (since A is symmetric) of A'A. Thus,

EXERCISE 7. Let V represent a k-dimensional subspace of the linear space nn


of all n-dimensional column vectors. Take X to be any n x p matrix whose columns
span V, let U represent a subspace of V, and define A to be the projection matrix for
U. Show (1) that a matrix B (of dimensions n x n) is such that By is the projection
of y on U for every y E V if and only if B = A + Z: for some solution Z* to the
homogeneous linear system X'Z = 0 (in an n x n matrix Z) and (2) that, unless
k = n, there is more than one matrix B such that By is the projection of yon U for
every y E V.
Solution. (1) The vector Ay is the projection of yon U for every y E nn.
Thus,
By is the projection of y on U for every y E V if and only if By = Ay for every
y E V, or equivalently if and only if BXr = AXr for every p x 1 vector r, and
hence (in light of Lemma 2.3.2) if and only if BX = AX.
Furthermore, BX = AX if and only if X' (B - A)' = 0, or equivalently if and
only if (B - A)' is a solution to the homogeneous linear system X'Z = 0 (in an
n x n matrix Z), and hence if and only if B' = A' + Z* for some solution Z* to
X'Z = 0, that is, if and only if B = A + Z: for some solution Z* to X'Z = O.
(2) According to Lemma 11.3.2, the solution space of the homogeneous linear
system X'Z = 0 (in an n x n matrix Z) is of dimension n [n - rank(X)] = n (n - k).
Thus, unless k = n, there is more than one solution to AZ = 0, and hence [in
light of the result of Part (1)] there is more than one matrix B such that By is the
projection of y on U for every y E V.

EXERCISE 8. Let {A I, ... , Ak} represent a nonempty linearly independent set of


matrices in a linear space V. And, define (as in Gram-Schmidt orthogonalization) k
12. Projections and Projection Matrices 67

nonnullorthogonallinearcombinations, say BI, ... ,Bb of AI, ... , Ak as follows:

BI =AI,
B2 = A2 - XI2BI,

where (for i < j = 1, ... , k)

Show that B j is the (orthogonal) projection of A j on some subspace Uj (of V) and


describe Uj (j = 2, ... , k).
Solution. For j = 1, ... , k, define C j = II B j II-I B j (as in Corollary 6.4.2). And,
define Wj = SP(CI, ... , C j ). Then, for j = 2, ... , k,

Moreover, upon observing that the set {C I, ... , C j } is orthonormal and applying
result (1.1), we find that Lf::II(Aj ·Ci)Ci is the projection of Aj on Wj_l.
Thus, it follows from Theorem 12.5.8 that B j is the projection of Aj on Wf-I·
And, since (in light of the discussion of Section 6.4b) W j -I = sp(A I, ... , A j -I),
we conclude that B j is the projection of A j on the orthogonal complement of the
subspace (of V) spanned by AI, ... , Aj_l.
13
Determinants

EXERCISE 1. Let

j ::: I::: I:~ II:: I).


A
l a41 Ia421 a43 a44

(a) Write out all of the pairs that can be formed from the four boxed elements
ofA.
(b) Indicate which ofthe pairs from Part (a) are positive and which are negative.
(c) Use the formula
a n (1, il; ... ; n, in) = a n (il, 1; ... ; in, n) = ¢nUl,.··, in)
(in which iI, ... ,in represents an arbitrary permutation of the first n positive
integers) to compute the number of pairs from Part (a) that are negative, and check
that the result of this computation is consistent with your answer to Part (b).
Solution. (a) and (b)
Pair "Sign"
a14, a21
a14, a33
a14, a42
a21, a33 +
a21, a42 +
a33, a42
70 13. Determinants

(c) </>4(4, 1,3,2) = 3+0+ 1 = 4 [or alternatively </>4(2, 4,3,1) = 1 +2+ 1 =


4].

EXERCISE 2. Consider the n x n matrix

"Recall" that
lSI = IRI
for any n x n matrix R and for any matrix S formed from R by adding to anyone
of its rows or columns, scalar mUltiples of one or more other rows or columns; and
use this result to show that

(Hint. Add the last n - 1 columns of A to the first column, and then subtract the
first row of the resultant matrix from each of the last n - 1 rows).
Solution. The matrix obtained from A by adding the last n - 1 columns of A to
the first column is

B= (:;:~ x ~A
nx +"A x

The matrix obtained from B by subtracting the first row of B from each of the
next i rows is

nx +"A x x x x

I
0 "A 0 0 0
i rows
Ci = 0 0 "A 0 0

I
nx+"A x x x+"A x
n - 1 - i rows
nx +"A x x x x+"A

Observing that C i can be obtained from Ci -I by subtracting the first row of


C i -1 from the (i + 1)th row and making use of result (*) (or equivalently Theorem
13.2.10) and Lemma 13.1.1, we find that
13. Detenninants 71

EXERCISE 3. Let A represent an n x n nonsingular matrix. Show that if the


elements of A and A -I are all integers, then IA I = ± 1.
Solution. Suppose that the elements of A and A -I are all integers. Then, it follows
from the very definition of a determinant that IAI and lA-II are both integers.
Thus, since (according to Theorem 13.3.7) lA-II = lilA!, IAI and I/IAI are both
integers. We conclude that IAI = ±l.

EXERCISE 4. Let T represent an m x m matrix, U an m x n matrix, V an n x m


matrix, and W an n x n matrix. Show that if T is nonsingular, then

Solution. It follows from Theorem 13.2.7 that

Thus, making use of Theorem 13.3.8, we find that

EXERCISE 5. Compute the determinant of the n x n matrix A = {aij} in the


special case where n = 4 and

0 4 0
A = ( I 0 -1
030
o 0 -6
Do so in each of the following two ways:
(a) by finding and summing the nonzero terms in the expression

L (_l)<Pn(jI, ... ,jn)aljl· ··anjn or L (_l)<Pn(iI .... ,in)aill·· ·ainn,

(where iI, ... , in or ii, ... , in is a permutation of the first n positive integers and
the summation is over all such permutations);
(b) by repeated expansion in terms of cofactors-use the (general) formula
n n
IAI = L aijaij or IAI = L aijaij
j=1 i=1

(where i or i, respectively, is any integer between 1 and n inclusive and where aij is
the cofactor of aij) to expand IAI (in the special case) in terms of the determinants
72 13. Determinants

of 3 x 3 matrices, to expand the detenninants of the 3 x 3 matrices in tenns of


the detenninants of 2 x 2 matrices, and finally to expand the detenninants of the
2 x 2 matrices in tenns of the detenninants of 1 x 1 matrices.
Solution. (a)
IAI = (_1)1/>4(2,1,4,3)4(1)(_2)(_6) + (_1)1/>4(4,1,2.3)5(1)(3)(-6)
= (_1)1+0+148 + (_1)3+0+0(_90)
= 48+90
= 138.

(b)

405
IAI = (1)(-1)2+1 3 0-2
o -6 0
= (_1)3(_6)(_1)3+2143 51
-2
= (-1)3(-6)(-1)5[4(-1)1+1(-2) + 5(_1)1+2(3)]
= (-6)(-8 -15)
= 138.

EXERCISE 6. ,Let A = {aij} represent an n x n matrix. Verify that if A is


symmetric, then the matrix of cofactors (of A) is also symmetric.
Solution. Letaij representthecofactorof aij, letAij representthe (n -1) x (n-l)
submatrix of A obtained by striking out the ith row and the jth column (of A),
and let B ji represent the (n - 1) x (n - 1) submatrix of A' obtained by striking
out the jth row and the ith column of A'. Then, making use of Lemma 13.2.1 and
result (2.1.1), we find that

aij = (-I)i+ j IAijl = (-I)i+ j IA;jl = (-I)i+ j IBj;l.

Moreover, if A is symmetric, then Bji = Aji, implying that

aij = (-I)'.+.JIAj;l = aji


and hence that the matrix of cofactors is symmetric.

EXERCISE 7. Let A represent an n x n matrix.


(a) Show that if A is singular, then adj(A) is singular.
(b) Show that det[adj(A)] = [det(A)]n-l.
Solution. (a) If A is null, then it is clear that adj(A) = 0 and hence that adj(A) is
singular.
13. Detenninants 73

Suppose now that A is singular but nonnull, in which case A contains a nonnull
row, say the ith row a;.
Since A is singular, IAI = 0, and it follows from Theorem
13.5.3 that A adj (A) = 0 and hence that

a; adj(A) = 0,
a;
implying (since is nonnull) that the rows of adj(A) are linearly dependent. We
conclude that adj(A) is singular.
(b) Making use of Theorems 13.3.4 and 13.5.3, Corollary 13.2.4, and result
(1.9), we find that

IAlladj(A)1 = IA adj(A)1 = det(IAIIn) = IAlnlInl = lAin. (S.l)

If A is nonsingular, then IAI i- 0, and it follows from result (S.l) that


ladj(A) I = lAin-I.

Alternatively, if A is singular, then it follows from Part (a) that adj(A) is singular
and hence that
°
ladj(A) I = = IAr-l.

EXERCISE 8. For any n x n nonsingular matrix A,

A-I = (lilA!) adj(A). (*)

Use formula (*) to verify that, for any 2 x 2 nonsingular matrix A = (all a l2 ),
a21 a22

where k = alla22 - a12a21.

Solution. Let aij represent the ijth element of a 2 x 2 matrix A, and let aij
represent the cofactor of aij (i, j = 1, 2). Then, as a special case of formula (*)
[or equivalently formula (5.7)], we have that

A-I = (lIlA!) (all (21 ) . (S.2)


al2 a22

Moreover, it follows from the very definition of a cofactor and from formulas (1.3)
and (l.4) that

all = (-1)1+l a22 = a22, a21 = (-1)2+ l aI2 = -al2,


al2 = (-1)1+2 a21 = -a21, a22 = (_1)2+2all = all, and
IAI = alla22 - a12a 21·

Upon substituting these expressions in formula (S.2), we obtain formula (**) [or
equivalently formula (8.1.2)].
74 13. Detenninants

( o2 0 -1)
EXERCISE 9. Let

A= -1 3 1
-4 5

(a) Compute the cofactor of each element of A.


(b) Compute IAI by expanding IAI in terms of the cofactors of the elements of
the second row of A, and then check your answer by expanding IAI in terms of the
cofactors of the elements of the second column of A.
(c) Use formula (*) of Exercise 8 to compute A-I.
Solution. (a) Let aij represent the cofactor of the ijth element of A. Then,

all = (_1)1+1 I 3
-4 ~I = 19, a12 = (_1)1+21-~ ~I =5,
al3 = (_1)1+31-~ _;1 =4, a21 = (_1)2+1 I_~ -11
5 =4,

a22 = (_1)2+21~ -11


5 = 10, a23 = (_1)2+31~ _~I =8,

a31 = (_1)3+1 I~ -11


1 = 3, a32 = (-1)3+21_i -11
1 =-1, and

a33 = (_1)3+31 2
-1 ~I =6.
(b) Expanding IAI in terms of the cofactors of the elements of the second row
of A gives
IAI = (-1)4 + 3(10) + 1(8) = 34.
Expanding IAI in terms of the cofactors of the elements of the second column of
A gives
IAI = 0(5) + 3(10) + (-4)( -I) = 34.
(c) Substituting from Parts (a) and (b) in formula (*) of Exercise 8 [or equiva-
lently in formula (5.7)], we find that

A-I = (1/34) ( 5
19
10
4 3)-1 .
4 8 6

EXERCISE 10. Let A = {aij} represent an n x n matrix (where n :::: 2), and let
aij represent the cofactor of aij .
13. Determinants 75

(a) Show [by for instance, making use of the result of Part (b) of Exercise 11.3]
that if rank(A) = n - 1, then there exists a scalar c such that adj(A) = cxy',
where x = {x j} and y = {Yi} are any nonnull n-dimensional column vectors such
that Ax = 0 and A'y = O. Show also that c is nonzero and is expressible as
c = aij /(YiXj) for any i and j such that Yi i= 0 and Xj i= O.
(b) Show that ifrank(A) :s n - 2, then adj(A) = O.
Solution. (a) Suppose that rank(A) = n - 1. Then, det(A) = 0 and hence (accord-
ing to Theorem 13.5.3) A adj(A) = adj(A)A = O. Thus, it follows from the result
of Part (b) of Exercise 11.3 that there exists a scalar c such that adj(A) = cxy' [or
equivalently such that (adj A)' = cyx'] and hence such that (for all i and j)

(S.3)

Moreover, since (according to Theorem 4.4.10) A contains an (n - 1) x (n - 1)


nonsingular submatrix, aij i= 0 for some i and j, implying that c i= O. And, for
any i and j such that Yi i= 0 and x j i= 0, we have [in light of result (S.3)] that
c = aij/(YiXj).
(b) If rank(A) :s n - 2, then it follows from Theorem 4.4.10 that every (n -
1) x (n - 1) submatrix of A is singular, implying that aij = 0 for all i and j or
equivalently that adj(A) = O.

EXERCISE 11. Let A represent an n x n nonsingular matrix and b an n x 1


vector. Show that the solution to the linear system Ax = b (in x) is the n x 1 vector
whose jth component is

where A j is a matrix formed from A by substituting b for the jth column of A


(j = 1, ... , n). [This result is called Cramer's rule, after Gabriel Cramer (1704-
1752).]
Solution. The (unique) solution to Ax = b is expressible as A -lb. Let hi represent
the ith e1ementofb and aij the cofactor of the ijth element of A (i, j = 1, ... , n).
It follows from Corollary 13.5.4 that the jth element of A -I b is

Clearly, the cofactor of the ijth element of Aj is the same as the cofactor of the
ijth element of A (i = 1, ... , n), so that, according to Theorem 13.5.1, the jth
element of A -Ib is IAj 1/ IAI.

EXERCISE 12. Let c represent a scalar, let x and y represent n x 1 vectors, and
let A represent an n x n matrix.
76 13. Determinants

(a) Show that


I~ ~ I = clAI - x'adj(A)y. (E.I)

(b) Show that, in the special case where A is nonsingular, result (E. 1) can be
reexpressed as
I~ ~I = IAI(c-x'A-1y),

in agreement with the more general result that, for any n x n nonsingular matrix
T, n x m matrix U, m x n matrix V, and m x m matrix W,

IVT WUI = IWU VIT =ITIIW-VT -I UI·

Solution. (a) Denote by Xi the ith element of x, and by Yi the ith element of y.
Let Aj represent the n x (n - I) submatrix of A obtained by striking out the jth
column, let Aij represent the (n - I) x (n - I) submatrix of A obtained by striking
out the ith row and the jth column, and let aij represent the cofactor of the ijth
element of A.
Expanding I~ ~ Iin tenns of the cofactors of the last row of (~
c '
wey)
obtain

I~ ~ I= ~Xj(-l)n+l+jdet(Aj, y) + c(_1)2(n+I)IAI. (S.4)


J

Further, expanding det(Aj' y) in tenns of the cofactors of the last column of


(Aj, y), we obtain
(S.5)

Substituting expression (S.5) in equality (S.4), we find that

I~ ~I = ~Yixj(-1)2n+l+i+jIAijl+cIAI
I.J

= clAI- LYixj(-I)i+jIAijl
i.j

= clAI- LYiXjaij
i,j
= clAI - x' adj(A)y.

(b) Suppose that A is nonsingular, in which case IA I 1= O. Then, using Corollary


13.5.4, result (E.I) can be reexpressed as

I~ ~ I = IAl{c - x'[(l/IAI) adj(A)]y}

= IAI(c - x'A -I y).


13. Detenninants 77

Notethatthis same expression can be obtained by setting T = A, U = y, V = x',


and W = c in result (*) [or equivalently result (3.13)].

EXERCISE 13. Let Vk represent the (n - I) x (n - I) submatrix of the n x n


Vandermonde matrix

XI
X2
x2
I
x2
2
...
...
'_I)
xI
n-I
X2
v= ((
x2 n-I
xn n Xn

(where XI, X2, ... ,Xn are arbitrary scalars) obtained by striking out the kth row
and the nth (last) column (of V). Show that

IVI = IVkl(-l)n-kn(Xk -Xi).


i#

Solution. Let V* represent the n x n matrix whose first, ... , (k - I)th rows are
respectively the first, ... , (k - I)th rows of V, whose kth, ... , (n - I)th rows
are respectively the (k + I)th, ... , nth rows of V, and whose nth row is the kth
row of V. Then, V* (like V) is an n x n Vandermonde matrix, and Vk equals
the (n - I) x (n - I) submatrix of V* obtained by striking out the last row
and the last column (of V*). Moreover, V can be obtained from V* by n - k
successive interchanges of pairs of rows - specifically, V can be obtained from
V* by successively interchanging the nth row of V* with the (n - I)th, ... , kth
rows of V*. Thus, making use of Theorem 13.2.6 and of result (6.4), we find that

IVI = (_I)n-k IV* I


= (-It-k(Xk -xJ)"'(Xk -Xk-J}(Xk -Xk+J)"'(Xk -Xn)IVkl
= IVkl (_I)n-k n(Xk - Xi).
i#

EXERCISE 14. Show that, for n x n matrices A and B,

adj(AB) = adj(B)adj(A).

(Hint. Use the Binet-Cauchy formula to establish that the ijth element of adj(AB)
equals the ijth element of adj (B)adj (A).)
Solution. Let A j represent the (n - I) x n submatrix of A obtained by striking out
the jth row of A, and let Bi represent the n x (n - I) submatrix of B obtained by
striking out the ith column ofB. Further, let Ajs represent the (n - I) x (n - I)
submatrix of A obtained by striking out the jth row and the sth column of A,
and let Bsi represent the (n - I) x (n - I) submatrix of B obtained by striking
78 13. Detenninants

out the sth row and the ith column of B. Then, application of formula (8.3) (the
Binet-Cauchy formula) gives
n
IAjBil = L IAjsllBsil.
s=\

implying that
n
(-I)i+iIAjB;I = L(_1)S+iIB sil (-1)j+s IAjsl. (S.6)
s=1

Note that A j Bi equals the (n - I) x (n - 1) submatrix of AB obtained by striking


out the jth row and the ith column of AB, so that the left side of equality (S.6) is
the cofactor of the jith element of AB and hence is the ijth element of adj(AB).
Note also that (_1)s+i IBsi I is the cofactor of the sith element of B and hence is
the isth element of adj(B) and similarly that (_l)j+s IA js I is the cofactor of the
jsth element of A and hence is the sjth element of adj(A). Thus, the right side of
equality (S.6) is the ijth element of adj(B)adj(A).
We conclude that
adj(AB) = adj(B)adj(A).
14
Linear, Bilinear, and Quadratic Forms

EXERCISE 1. Show that a symmetric bilinear form x' Ay (in n-dimensional


vectors x and y) can be expressed in terms of the corresponding quadratic form,
that is, the quadratic form whose matrix is A. Do so by verifying that

x' Ay = (l/2)[(x + y)' A(x + y) - x' Ax - y' Ay] .

Solution. Since the bilinear form x' Ay is symmetric, we have that

(1/2)[(x + y)' A(x + y) - x' Ax - y' Ay]


= (l/2)(x'Ax + x'Ay + y'Ax + y'Ay - x'Ax - y'Ay)
= (l/2)(x' Ay + y' Ax) = (l/2)(x' Ay + x' Ay) = x' Ay.

EXERCISE 2. Show that corresponding to any quadratic form x' Ax (in the n-
dimensional vector x) there exists a unique upper triangular matrix B such that
x' Ax and x'Bx are identically equal, and express the elements of B in terms of the
elements of A.
= 1, ... ,n). WhenB = {bij}
Solution. Letaij represent the ijth element of A (i, j
is upper triangUlar, the conditions aii = bii and aij + aji = bij + bji (j =l=-
i = 1, ... , n) of Lemma 14.1.1 are equivalent to the conditions aii = bii and
aij + aji = bij (j > i = 1, ... , n). Thus, it follows from the lemma that there
exists a unique upper triangular matrix B such that x' Ax and x'Bx are identically
equal, namely, the upper triangular matrix B = {bij}, where bii = aii and bij =
aij +aji (j > i = 1, ... ,n).
80 14. Linear, Bilinear, and Quadratic Forms

EXERCISE 3. Show, by example, that the sum of two positive semidefinite


matrices can be positive definite.

Solution. Consider the two n x n matrices G~) and (~ ~n -I). Clearly, both
of these two matrices are positive semidefinite, however, their sum is the n x n
identity matrix In, which is positive definite.

EXERCISE 4. Show, via an example, that there exist (nonsymmetric) nonsingular


positive semidefinite matrices.

Solution. Consider the n x n upper triangular matrix

1 2 0 0
0 1 0 0
A= 0 0 0

0 0 0

For an arbitrary n-dimensional vector x = (XI, X2, X3, ... ,xn )', we find that

x' Ax = (XI + X2)2 + xi + ... + x; ~0

and that x'Ax = 0 if XI = -X2 and X3 = ... = Xn = O. Thus, A is positive


semidefinite. Moreover, it follows from Corollary 8.5.6 that A is nonsingular.

EXERCISE 5. Show, by example, that there exist an n x n positive semidefinite


matrix A and an n x m matrix P (where m < n) such that P' AP is positive definite.

Solution. Take A to be the n x n diagonal matrix diag(lm, 0), which is clearly


positive semidefinite, and take P to be the n x m (partitioned) matrix (~m). Then,
P' AP = 1m , which is an m x m positive definite matrix.

EXERCISE 6. For an n x n matrix A and an n x m matrix P, it is the case


that (1) if A is nonnegative definite, then P' AP is nonnegative definite; (2) if A
is nonnegative definite and rank(P) < m, then P' AP is positive semidefinite; and
(3) if A is positive definite and rank(P) = m, then P' AP is positive definite. Con-
vert these results, which are for nonnegative definite (positive definite or positive
semidefinite) matrices, into equivalent results for nonpositive definite matrices.

Solution. As in results (1) - (3) (of the exercise or equivalently of Theorem 14.2.9),
letA represent an n xn matrixandPann xm matrix. Upon applying results (1)-(3)
with -A in place of A, we find that (1') if -A is nonnegative definite, then -P' AP
is nonnegative definite; (2') if -A is nonnegative definite and rank(P) < m, then
- P' AP is positive semidefinite; and (3') if - A is positive definite and rank(P) = m,
then -P' AP is positive definite. These three results can be restated as follows:
14. Linear, Bilinear, and Quadratic Fanns 81

(1') if A is nonpositive definite, then P' AP is nonpositive definite; (2') if A is


nonpositive definite and rank(P) < m, then P' AP is negative semidefinite; and
(3') if A is negative definite and rank(P) = m, then P' AP is negative definite.

EXERCISE 7. Let {X I, ... , X r } represent a set of matrices from a linear space V.


And, let A = {ai j } represent the r x r matrix whose i j th element is Xi • X j - this
matrix is referred to as the Gram matrix (or the Gramian) of the set {Xl, ... , X r }
and its determinant is referred to as the Gramian (or the Gram determinant) of
{Xl,· .. , X r }·
(a) Show that A is symmetric and nonnegative definite.
(b) Show that X I, ... , Xr are linearly independent if and only if A is nonsingular.
Solution. Let Y I, ... ,Yn represent any matrices that form an orthonormal basis
for V. Then, for j = 1, ... , r, there exist scalars blj' ... , bnj such that

And, for i, j = 1, ... , r,

n n
= I)ki I)Sj(Ys·Yk)
k=l s=I
n
= Lbkibkj.
k=l

Moreover, Lk=l bkibkj is the ijth element of the r x r matrix B'B, where B is
the n x r matrix whose kjth element is bkj (and hence where B' is the r x n
matrix whose ikth element is bki). Thus, A = B'B, and since B'B is symmetric
(and in light of Corollary 14.2.14) nonnegative definite, the solution of Part (a) is
complete.
Now, consider Part (b). For j = 1, ... , r, let b j = (blj, ... ' bnj. Then,
since clearly Y I, ... , Yn are linearly independent, it follows from Lemma 3.2.4
that Xl, ... ,Xr are linearly independent if and only if bI, ... , b r are linearly
independent. Thus, since bl, ... , b r are the columns ofB, Xl, ... , Xr are linearly
independent if and only if rank(B) = r or equivalently (in light of Corollary 7.4.5)
if and only if rank(B'B) = r. And, since A = B'B, we conclude that X I, ... , Xr
are linearly independent if and only if A is nonsingular.

EXERCISE 8. Let A = {aij} represent an n x n symmetric positive definite


82 14. Linear, Bilinear, and Quadratic Forms

matrix, and let B = {b ij } = A -I . Show that, for i = 1, ... , n,


bii 2: ll a u ,

with equality holding if and only if, for all j =1= i, aij = O.
Solution. Let V = (UI, V2), where UI is the ith column of In and V2 is the
submatrix of In obtained by striking out the ith column, and observe that V is a
permutation matrix.
Define R = V' AV and S = R- I . Partition R and S as

and

[where the dimensions of both R* and S* are (n - 1) x (n - 1)]. Then,

r' = U~AV2 = (ail, ai2, ... , ai. i-I, ai. i+l, ... , ai. n-I, ain), (S.2)

and (since S = V'BV)


(S.3)

It follows from Corollary 14.2.10 that R is positive definite, implying (in light
of Corollary 14.2.12) that R* is positive definite and hence (in light of Corollary
14.2.11) that R* is invertible and that R;I is positive definite. Thus, making use
of Theorem 8.5.11, we find [in light of results (S.l) and (S.3)] that

bU = (aii - r 'R-I
* r )-1

and also that r'R; Ir 2: 0 with equality holding if and only if r = O. Since bii > 0
(and hence aii - r'R;lr > 0), we conclude that bu 2: l/aii with equality holding
if and only if r = 0 or equivalently [in light of result (S.2)] if and only if, for
j =1= i, aij = O.

EXERCISE 9. Let A represent an m x n matrix and D a diagonal matrix such


that A = PDQ for some matrix P of full column rank and some matrix Q of full
row rank. Show that rank(A) equals the number of nonzero diagonal elements in
D.
Solution. Making use of Lemma 8.3.2, we find that

rank(A) = rank(PDQ) = rank(DQ) = rankeD).


Moreover, rank (D) equals the number of nonzero diagonal elements in D.

EXERCISE 10. Let A represent an n x n symmetric idempotent matrix and V


an n x n symmetric positive definite matrix. Show that rank(A VA) = tr(A).
14. Linear, Bilinear, and Quadratic Fonns 83

Solution. According to Corollary 14.3.13, V = P'P for some nonsingular matrix


P. Thus, making use of Corollary 7.4.5, Corollary 8.3.3, and Corollary 10.2.2, we
find that

rank (AVA) = rank[(PA)'PA] = rank(PA) = rank(A) = tr(A).

EXERCISE 11. Show that if an n x n matrix A is such that x' Ax 'I 0 for every
n x 1 nonnull vector x, then A is either positive definite or negative definite.
Solution. Let A represent an n x n matrix such that x' Ax 'I 0 for every n x 1
nonnull vector x.
Define B = (l/2)(A + A'). Then, x'Bx = x' Ax for every n x 1 vector x.
Moreover, B is symmetric, implying (in light of Corollary 14.3.5) that there exists
a nonsingular matrix P and a diagonal matrix D = diag(dl, ... , dn ) such that B =
P'DP. Thus, (Px)'DPx = x' Ax for every n x 1 vector x and hence (Px)'DPx 'I 0
for every n x 1 nonnull vector x.
There exists no i such that di = 0 [since, if di = 0, then, taking x to be
the nonnull vector P-Iei, where ei is the ith column of In, we would have that
(Px)'DPx = e;Dei = di = 0]. Moreover, there exists no i and j such that di > 0
and dj < 0 [since, if di > 0 and d j < 0, then, taking x to be the (nonnull)
vector P-Iy, where y is the n x 1 vector with ith element II v'd; and jth element
II J-dj, we would have that (Px)'DPx = y'Dy = 1 - 1 = 0].
It follows that the n scalars dl, ... , dn are either all positive, in which case B
is (according to Corollary 14.2.15) positive definite, or all negative, in which case
-B is positive definite and hence B is negative definite. We conclude (on the basis
of Corollary 14.2.7) that A is either positive definite or negative definite.

EXERCISE 12. (a) Let A represent an n x n symmetric matrix of rank r. Take


P to be an n x n nonsingular matrix and D an n x n diagonal matrix such that
A = P'DP - the existence of such matrices is guaranteed. The number, say m,
of diagonal elements of D that are positive is called the index of inertia of A (or
of the quadratic form x' Ax whose matrix is A). Show that the index of inertia is
well-defined in the sense that m does not vary with the choice of P or D. That is,
show that, if P_1 and P_2 are nonsingular matrices and D_1 and D_2 diagonal matrices
such that A = P_1'D_1P_1 = P_2'D_2P_2, then D_2 contains the same number of positive
diagonal elements as D_1. Show also that the number of diagonal elements of D
that are negative equals r - m.
(b) Let A represent an n x n symmetric matrix. Show that A = P'diag(I_m,
-I_{r-m}, 0)P for some n x n nonsingular matrix P and some nonnegative integers
m and r. Show further that m equals the index of inertia of the matrix A and that
r = rank(A).
(c) An n x n symmetric matrix B is said to be congruent to an n x n symmetric
matrix A if there exists an n x n nonsingular matrix P such that B = P' AP. (If B
is congruent to A, then clearly A is congruent to B.) Show that B is congruent to

A if and only if B has the same rank and the same index of inertia as A. This result
is called Sylvester's law of inertia, after James Joseph Sylvester (1814-1897).
(d) Let A represent an n x n symmetric matrix of rank r and with index of inertia
m. Show that A is nonnegative definite if and only if m = r and is positive definite
if and only if m = r = n.

Solution. (a) Take P_1 and P_2 to be n x n nonsingular matrices and D_1 = {d_i^{(1)}}
and D_2 = {d_i^{(2)}} to be n x n diagonal matrices such that A = P_1'D_1P_1 = P_2'D_2P_2.
Let m_1 represent the number of diagonal elements of D_1 that are positive and
m_2 the number of diagonal elements of D_2 that are positive. Take i_1, i_2, ..., i_n
to be a permutation of the first n positive integers such that d_{i_j}^{(1)} > 0 for j =
1, 2, ..., m_1, and similarly take k_1, k_2, ..., k_n to be a permutation such that d_{k_j}^{(2)} >
0 for j = 1, 2, ..., m_2. Further, take U_1 to be the n x n permutation matrix whose
first, second, ..., nth columns are respectively the i_1th, i_2th, ..., i_nth columns
of I_n and U_2 to be the n x n permutation matrix whose first, second, ..., nth
columns are respectively the k_1th, k_2th, ..., k_nth columns of I_n, and define D_1* =
U_1'D_1U_1 and D_2* = U_2'D_2U_2. Then, D_1* = diag(d_{i_1}^{(1)}, d_{i_2}^{(1)}, ..., d_{i_n}^{(1)}) and D_2* =
diag(d_{k_1}^{(2)}, d_{k_2}^{(2)}, ..., d_{k_n}^{(2)}).
Suppose, for purposes of establishing a contradiction, that m_1 < m_2, and observe
that

D_2* = U_2'(P_2^{-1})'AP_2^{-1}U_2 = U_2'(P_2^{-1})'P_1'D_1P_1P_2^{-1}U_2
     = U_2'(P_2^{-1})'P_1'U_1D_1*U_1'P_1P_2^{-1}U_2 = R'D_1*R,

where R = U_1'P_1P_2^{-1}U_2. Partition the n x n matrix R as R = (R_11  R_12; R_21  R_22),
where R_11 is of dimensions m_1 x m_2.
Take x = {x_j} to be an m_2-dimensional nonnull column vector such that R_11x =
0 - since (by supposition) m_1 < m_2, such a vector necessarily exists. Letting
y_1, y_2, ..., y_{n-m_1} represent the elements of the vector R_21x, we find that

∑_{j=1}^{m_2} d_{k_j}^{(2)} x_j^2 = (x; 0)'D_2*(x; 0) = (x; 0)'R'D_1*R(x; 0)
                               = (0; R_21x)'D_1*(0; R_21x)
                               = ∑_{j=m_1+1}^{n} d_{i_j}^{(1)} y_{j-m_1}^2.      (S.4)

Moreover, ∑_{j=1}^{m_2} d_{k_j}^{(2)} x_j^2 > 0 (since x is nonnull and d_{k_1}^{(2)}, ..., d_{k_{m_2}}^{(2)} are positive),
and, since the last n - m_1 diagonal elements of D_1* are either zero or negative,

∑_{j=m_1+1}^{n} d_{i_j}^{(1)} y_{j-m_1}^2 ≤ 0.

These two inequalities, in combination with equality (S.4), establish the sought-
after contradiction.
We conclude that m_1 ≥ m_2. It can be established, via an analogous argument,
that m_1 ≤ m_2. Together, those two inequalities imply that m_2 = m_1.
Consider now the number of negative diagonal elements in the diagonal matrix
D. According to Lemma 14.3.1, the number of nonzero diagonal elements in D
equals r. Thus, the number of negative diagonal elements in D equals r - m.
(b) According to Corollary 14.3.5, there exist an n x n nonsingular matrix P* and
an n x n diagonal matrix D = {d_i} such that A = P*'DP*. Take i_1, i_2, ..., i_n to be
any permutation of the first n positive integers such that - for some integers m and
r (0 ≤ m ≤ r ≤ n) - d_{i_j} > 0, for j = 1, ..., m, d_{i_j} < 0, for j = m+1, ..., r,
and d_{i_j} = 0, for j = r+1, ..., n. Further, take U to be the n x n permutation
matrix whose first, second, ..., nth columns are respectively the i_1th, i_2th, ..., i_nth
columns of I_n, and define D* = U'DU. Then, D* = diag(d_{i_1}, d_{i_2}, ..., d_{i_n}).
We find that

A = P*'UU'DUU'P* = (U'P*)'D*U'P*.

And, taking Δ to be the diagonal matrix whose first m diagonal elements are
√d_{i_1}, √d_{i_2}, ..., √d_{i_m}, whose (m+1)th, (m+2)th, ..., rth diagonal elements
are √(-d_{i_{m+1}}), √(-d_{i_{m+2}}), ..., √(-d_{i_r}), and whose last n - r diagonal elements equal
one, we have that

D* = Δ diag(I_m, -I_{r-m}, 0) Δ

and hence that

A = (U'P*)'D*U'P* = (ΔU'P*)'Δ^{-1}D*Δ^{-1}ΔU'P*
  = P'diag(I_m, -I_{r-m}, 0)P,

where P = ΔU'P*. Clearly, P is nonsingular.


That m equals the index of inertia and that r = rank(A) are immediate conse-
quences of the results of Part (a).
(c) Suppose that B is congruent to A. Then, by definition, B = P'AP for some
n x n nonsingular matrix P. Moreover, according to Corollary 14.3.5, A = Q'DQ
for some n x n nonsingular matrix Q and some n x n diagonal matrix D. Thus,

B = P'Q'DQP = (QP)'D(QP) = P*'DP*,

where P* = QP. Clearly, P* is nonsingular. And, in light of Part (a), we conclude
that B has the same rank and the same index of inertia as A.

Conversely, suppose that A and B have the same rank, say r, and the same index
of inertia, say m. Then, according to Part (b), A = P'diag(I_m, -I_{r-m}, 0)P and
B = Q'diag(I_m, -I_{r-m}, 0)Q for some n x n nonsingular matrices P and Q. Thus,

diag(I_m, -I_{r-m}, 0) = (P^{-1})'AP^{-1},

and consequently

B = Q'(P^{-1})'AP^{-1}Q = P*'AP*,

where P* = P^{-1}Q. Clearly, P* is nonsingular. We conclude that B is congruent
to A.
(d) According to Part (b),

A = P'diag(I_m, -I_{r-m}, 0)P

for some n x n nonsingular matrix P. Thus, we have as an immediate consequence


of Corollary 14.2.15 that A is nonnegative definite if and only if m = r and is
positive definite if and only if m = r = n.
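The following sketch (an illustration only, assuming numpy; the helper name inertia is hypothetical) computes the rank and index of inertia of a symmetric matrix from its eigenvalues, since the spectral decomposition A = Q diag(λ_1, ..., λ_n)Q' is one admissible choice of the decomposition A = P'DP, and checks that a congruence transformation P'AP leaves both quantities unchanged, as Sylvester's law of inertia asserts.

import numpy as np

def inertia(A, tol=1e-10):
    # index of inertia m (number of positive eigenvalues) and rank r
    lam = np.linalg.eigvalsh(A)
    return int(np.sum(lam > tol)), int(np.sum(np.abs(lam) > tol))

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)); A = A + A.T     # symmetric, generally indefinite
P = rng.standard_normal((5, 5))                  # nonsingular with probability one
print(inertia(A), inertia(P.T @ A @ P))          # the two pairs (m, r) agree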

EXERCISE 13. Let A represent an n x n symmetric nonnegative definite matrix


of rank r. Then, there exists an n x r matrix B (of rank r) such that A = BB'. Let
X represent any n x m matrix (where m ≥ r) such that A = XX'.
(a) Show that X = P_B X.
(b) Show that X = (B, 0)Q for some orthogonal matrix Q.
Solution. (a) It follows from Corollary 7.4.5 that C(X) = C(A) = C(B), implying
(in light of Corollary 12.3.6) that P_X = P_B. Thus, making use of Part (1) of
Theorem 12.3.4, we find that

X = P_XX = P_B X.
(b) Since rank(B'B) = rank(B) = r, B'B (which is of dimensions r x r) is
invertible. Thus, it follows from Part (a) that

X = B(B'B)^{-1}B'X = BQ_1,                                        (S.5)

where Q_1 = (B'B)^{-1}B'X. Moreover,

Q_1Q_1' = (B'B)^{-1}B'XX'B(B'B)^{-1}
        = (B'B)^{-1}B'AB(B'B)^{-1}
        = (B'B)^{-1}B'BB'B(B'B)^{-1} = I_r,

so that the rows of the r x m matrix Q_1 are orthonormal (with respect to the usual
inner product).
It follows from Theorem 6.4.5 that there exists an (m - r) x m matrix Q_2 whose
rows, together with the rows of Q_1, form an orthonormal (with respect to the usual

inner product) basis for R^m. Take Q = (Q_1; Q_2). Then, clearly, Q is orthogonal.
Further, (B, 0)Q = BQ_1, implying, in light of result (S.5), that X = (B, 0)Q.

EXERCISE 14. Show that if a symmetric matrix A has a nonnegative definite


generalized inverse, then A is nonnegative definite.

Solution. Suppose that the symmetric matrix A has a nonnegative definite gener-
alized inverse, say G. Then, A = AGA = A'GA, implying (in light of Theorem
14.2.9) that A is nonnegative definite.

EXERCISE 15. Suppose that an n x n matrix A has an LDU decomposition, say
A = LDU, and let d_1, d_2, ..., d_n represent the diagonal elements of the diagonal
matrix D. Show that det(A) = d_1 d_2 ⋯ d_n.

Solution. Making use of Theorem 13.2.11 and of Corollary 13.1.2, we find that

|A| = |LDU| = |L| |D| |U| = |D| = d_1 d_2 ⋯ d_n.


EXERCISE 16. (a) Suppose that an n x n matrix A (where n ≥ 2) has a unique
LDU decomposition, say A = LDU, and let d_1, d_2, ..., d_n represent the first,
second, ..., nth diagonal elements of D. Show that d_i ≠ 0 (i = 1, 2, ..., n-1)
and that d_n ≠ 0 if and only if A is nonsingular.
(b) Suppose that an n x n (symmetric) matrix A (where n ≥ 2) has a unique
U'DU decomposition, say A = U'DU, and let d_1, d_2, ..., d_n represent the first,
second, ..., nth diagonal elements of D. Show that d_i ≠ 0 (i = 1, 2, ..., n-1)
and that d_n ≠ 0 if and only if A is nonsingular.

Solution. Let us restrict attention to Part (a) - Part (b) can be proved in essentially
the same way as Part (a).
Suppose - for purposes of establishing a contradiction - that, for some i (1 ≤
i ≤ n-1), d_i = 0. Take L* to be a unit lower triangular matrix and U* a unit upper
triangular matrix that are identical to L and U, respectively, except that, for some
j (j > i), the ijth element of U* differs from the ijth element of U and/or the jith
element of L* differs from the jith element of L. Then, according to Theorem
14.5.5, A = L*DU* is an LDU decomposition of A. Since this decomposition
differs from the supposedly unique LDU decomposition A = LDU, we have
arrived at the desired contradiction. We conclude that d_i ≠ 0 (i = 1, ..., n-1).
And, since (in light of Lemma 14.3.1) A is nonsingular if and only if all n diagonal
elements of D are nonzero, we further conclude that A is nonsingular if and only
if d_n ≠ 0.

EXERCISE 17. Suppose that an n x n (symmetric) matrix A has a unique U'DU


decomposition, say A = U'DU. Use the result of Part (b) of Exercise 16 to show
that A has no LDU decompositions other than A = U'DU.

Solution. Let us restrict attention to the case where n ≥ 2 - if n = 1, then it is
clear that A has no LDU decompositions other than A = U'DU.
The result of Part (b) of Exercise 16 implies that the first n - 1 diagonal elements
of D are nonzero. We conclude, on the basis of Theorem 14.5.5, that A has no LDU
decompositions other than A = U'DU.

EXERCISE 18. Show that if a nonsingular matrix has an LDU decomposition,
then that decomposition is unique.

Solution. Any 1 x 1 matrix (nonsingular or not) has a unique LDU decomposition,
as discussed in Section 14.5b. Consider now a nonsingular matrix A of order
n ≥ 2 that has an LDU decomposition, say A = LDU. Let A_11, L_11, U_11, and
D_1 represent the (n-1)th-order leading principal submatrices of A, L, U, and D,
respectively. Then, according to Theorem 14.5.3, A_11 = L_11D_1U_11 is an LDU
decomposition of A_11. Since A is nonsingular, D is nonsingular, implying that D_1
is nonsingular and hence that A_11 is nonsingular. We conclude, on the basis of
Corollary 14.5.6, that A = LDU is the unique LDU decomposition of A.

EXERCISE 19. Let A represent an n x n matrix (where n ≥ 2). By, for instance,
using the results of Exercises 16, 17, and 18, show that if A has a unique LDU
decomposition or (in the special case where A is symmetric) a unique U'DU de-
composition, then the leading principal submatrices (of A) of orders 1, 2, ..., n-1
are nonsingular and have unique LDU decompositions.
Solution. In light of the result of Exercise 17, it suffices to restrict attention to the
case where A has a unique LDU decomposition, say A = LDU.
For i = 1, 2, ..., n-1, let A_i, L_i, U_i, and D_i represent the ith-order leading
principal submatrices of A, L, U, and D, respectively. Then, according to Theorem
14.5.3, an LDU decomposition of A_i is A_i = L_iD_iU_i, and, according to the result
of Exercise 16, D_i is nonsingular. Thus, A_i is nonsingular and, in light of the result
of Exercise 18, has a unique LDU decomposition.

EXERCISE 20. (a) Let A = {aij} represent an m x n nonnull matrix of rank r.


Show that there exist an m x m permutation matrix P and an n x n permutation
matrix Q such that

PAQ = (B_11  B_12; B_21  B_22),

where B_11 is an r x r nonsingular matrix whose leading principal submatrices (of


orders 1, 2, ... , r - 1) are nonsingular.

(b) Let B = (B_11  B_12; B_21  B_22) represent any m x n nonnull matrix of rank r such that
BII is an r x r nonsingular matrix whose leading principal submatrices (of orders
1, 2, ... , r - 1) are nonsingular. Show that there exists a unique decomposition of

B of the form

B = (L_1; L_2) D (U_1, U_2),

where L_1 is an r x r unit lower triangular matrix, U_1 is an r x r unit upper triangular
matrix, and D is an r x r diagonal matrix. Show further that this decomposition is
such that B_11 = L_1DU_1 is the unique LDU decomposition of B_11, D is nonsingular,
L_2 = B_21U_1^{-1}D^{-1}, and U_2 = D^{-1}L_1^{-1}B_12.

Solution. (a) The matrix A contains r linearly independent rows, say rows i_1,
i_2, ..., i_r. For k = 1, ..., r, denote by A_k the k x n matrix whose rows are
respectively rows i_1, i_2, ..., i_k of A.
There exists a subset j_1, j_2, ..., j_r of the first n positive integers such that,
for k = 1, ..., r, the matrix, say A_k*, whose columns are respectively columns
j_1, j_2, ..., j_k of A_k, is nonsingular. As evidence of this, let us outline a recursive
scheme for constructing such a subset.
Row i_1 of A is nonnull, so that j_1 can be chosen in such a way that a_{i_1 j_1} ≠ 0
and hence in such a way that A_1* = (a_{i_1 j_1}) is nonsingular. Suppose now that
j_1, j_2, ..., j_{k-1} have been chosen in such a way that A_1*, A_2*, ..., A_{k-1}* are non-
singular. Since A_{k-1}* is nonsingular, columns j_1, j_2, ..., j_{k-1} of A_k are linearly
independent, and, since rank(A_k) = k, A_k has a column that is not expressible as
a linear combination of columns j_1, j_2, ..., j_{k-1}. Thus, it follows from Corollary
3.2.3 that j_k can be chosen in such a way that A_k* is nonsingular.
Take P to be any m x m permutation matrix whose first r rows are respectively
rows i_1, i_2, ..., i_r of I_m, and take Q to be any n x n permutation matrix whose
first r columns are respectively columns j_1, j_2, ..., j_r of I_n. Then,

PAQ = (B_11  B_12; B_21  B_22),

where B_11 = A_r* is a nonsingular matrix whose leading principal submatrices (of
orders 1, 2, ..., r-1) are respectively the nonsingular matrices A_1*, A_2*, ..., A_{r-1}*.
(b) Clearly, showing that B has a unique decomposition of the form specified in
the exercise is equivalent to showing that there exist a unique unit lower triangular
matrix L_1, a unique unit upper triangular matrix U_1, a unique diagonal matrix D,
and unique matrices L_2 and U_2 such that

B_11 = L_1DU_1,
B_12 = L_1DU_2,
B_21 = L_2DU_1, and
B_22 = L_2DU_2.

It follows from Corollary 14.5.7 that there exist a unique unit lower triangular
matrix L_1, a unique unit upper triangular matrix U_1, and a unique diagonal matrix
D such that B_11 = L_1DU_1 - by definition, B_11 = L_1DU_1 is the unique LDU
decomposition of B_11. Moreover, D is nonsingular (since B_11 = L_1DU_1 is non-
singular). Thus, there exist unique matrices L_2 and U_2 such that B_21 = L_2DU_1
and B_12 = L_1DU_2, namely, L_2 = B_21U_1^{-1}D^{-1} and U_2 = D^{-1}L_1^{-1}B_12. Finally,
it follows from Lemma 9.2.2 that

B_22 = B_21B_11^{-1}B_12 = L_2DU_1(L_1DU_1)^{-1}L_1DU_2
     = L_2DU_1U_1^{-1}D^{-1}L_1^{-1}L_1DU_2 = L_2DU_2.

EXERCISE 21. Show, by example, that there exist n x n (nonsymmetric) positive
semidefinite matrices that do not have LDU decompositions.
Solution. Let

A = (0         -1_{n-1}'
     1_{n-1}    I_{n-1}).

Consider the quadratic form x'Ax in x. Partitioning x as x = (x_1; x_2), where x_2 is of
dimensions (n-1) x 1, we find that

x'Ax = -x_1 1_{n-1}'x_2 + x_1 1_{n-1}'x_2 + x_2'x_2 = x_2'x_2.

Thus, x'Ax ≥ 0 for all x, with equality holding when, for example, x_1 = 1 and
x_2 = 0, so that x'Ax is a positive semidefinite quadratic form and hence A is a
positive semidefinite matrix. Moreover, since the leading principal submatrix of
A of order two is (0  -1; 1  1) and since -1 ∉ C(0), it follows from Part (2) of
Theorem 14.5.4 that A does not have an LDU decomposition.

EXERCISE 22. Let A represent an n x n nonnegative definite (possibly non-
symmetric) matrix that has an LDU decomposition, say A = LDU. Show that the
diagonal elements of the diagonal matrix D are nonnegative.
Solution. Consider the matrix B = DU(L^{-1})'. Since (in light of Corollary 8.5.9)
(L^{-1})' - like U - is unit upper triangular, it follows from Lemma 1.3.1 that the
diagonal elements of B are the same as the diagonal elements, say d_1, ..., d_n, of
D.
Moreover,

B = DU(L^{-1})' = L^{-1}LDU(L^{-1})' = L^{-1}A(L^{-1})',

implying (in light of Theorem 14.2.9) that B is nonnegative definite. We conclude
- on the basis of Corollary 14.2.13 - that d_1, ..., d_n are nonnegative.

EXERCISE 23. Let A represent an m x k matrix of full column rank. And, let
A = QR represent the QR decomposition of A; that is, let Q represent the unique
m x k matrix whose columns are orthonormal with respect to the usual inner
product and let R represent the unique k x k upper triangular matrix with positive
diagonal elements such that A = QR. Show that A' A = R/R (so that A' A = R/R
is the Cholesky decomposition of A' A).

Solution. Since the inner product with respect to which the columns of Q are
orthonormal is the usual inner product, Q'Q = I_k, and consequently

A'A = R'Q'QR = R'R.
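As an illustration (assuming numpy; the sign normalization below is needed only because numpy's QR routine does not force a positive diagonal on R), the R factor of A then coincides with the transposed Cholesky factor of A'A:

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((7, 4))            # full column rank with probability one

Q, R = np.linalg.qr(A)                     # reduced QR: Q is 7 x 4, R is 4 x 4
s = np.sign(np.diag(R))
Q, R = Q * s, s[:, None] * R               # make the diagonal of R positive

C = np.linalg.cholesky(A.T @ A)            # lower triangular, A'A = CC'
print(np.allclose(R, C.T))                 # so A'A = R'R is the Cholesky decomposition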

EXERCISE 24. Let A represent an m x k matrix of rank r (where r is possibly
less than k). Consider the decomposition A = QR_1, where Q is an m x r matrix
with orthonormal columns and R_1 is an r x k submatrix whose rows are the r
nonnull rows of a k x k upper triangular matrix R having r positive diagonal
elements and k - r null rows. (Such a decomposition can be obtained by using the
results of Exercise 6.4 - refer to Exercise 6.5.) Generalize the result of Exercise
23 by showing that if the inner product with respect to which the columns of Q
are orthonormal is the usual inner product, then A'A = R'R (so that A'A = R'R
is the Cholesky decomposition of A'A).

Solution. Suppose that the inner product with respect to which the columns of Q
are orthonormal is the usual inner product. Then, Q'Q = I_r. Thus, recalling result
(2.2.9), we find that

A'A = R_1'Q'QR_1 = R_1'R_1 = R'R.

EXERCISE 25. Let A = {a_ij} represent an n x n matrix that has an LDU decom-
position, say A = LDU. And, define G = U^{-1}D^{-}L^{-1} (which is a generalized
inverse of A).
(a) Show that

G = D^{-}L^{-1} + (I - U)G = U^{-1}D^{-} + G(I - L).

(b) For i = 1, ..., n, let d_i represent the ith diagonal element of the diagonal
matrix D; and, for i, j = 1, ..., n, let l_ij, u_ij, and g_ij represent the ijth elements
of L, U, and G, respectively. Take D^{-} = diag(d_1^{-}, ..., d_n^{-}), where d_i^{-} = 1/d_i, if
d_i ≠ 0, and d_i^{-} is an arbitrary scalar, if d_i = 0. Show that

g_ii = d_i^{-} - ∑_{k=i+1}^{n} u_ik g_ki = d_i^{-} - ∑_{k=i+1}^{n} g_ik l_ki        (E.1)

and that

g_ij = -∑_{k=j+1}^{n} g_ik l_kj,  for j < i,                                    (E.2a)
g_ij = -∑_{k=i+1}^{n} u_ik g_kj,  for j > i                                     (E.2b)

(where the degenerate sums ∑_{k=n+1}^{n} g_ik l_ki and ∑_{k=n+1}^{n} u_ik g_ki are to be inter-
preted as 0).

(c) Devise a recursive procedure that uses the formulas from Part (b) to generate
a generalized inverse of A.

Solution. (a) Clearly,

D^{-}L^{-1} + (I - U)G = D^{-}L^{-1} + G - UG = D^{-}L^{-1} + G - UU^{-1}D^{-}L^{-1} = G.

Similarly,

U^{-1}D^{-} + G(I - L) = U^{-1}D^{-} + G - GL = U^{-1}D^{-} + G - U^{-1}D^{-}L^{-1}L = G.


(b) Since L and U are unit triangular, their diagonal elements equal 1 (and the
diagonal elements of I - L and I - U equal 0). Thus, it follows from Part (a) that

g_ij = d_i^{-} - ∑_{k=i+1}^{n} u_ik g_kj,  if j = i,
g_ij = -∑_{k=i+1}^{n} u_ik g_kj,           if j > i,

and similarly that

g_ij = d_i^{-} - ∑_{k=j+1}^{n} g_ik l_kj,  if j = i,
g_ij = -∑_{k=j+1}^{n} g_ik l_kj,           if j < i.

(c) The formulas from Part (b) can be used to generate a generalized inverse of A
in n steps. During the first step, the nth diagonal element g_nn is generated from the
formula g_nn = d_n^{-}, and then g_{n-1,n}, ..., g_{1n} and g_{n,n-1}, ..., g_{n1} (the off-diagonal
elements of the nth column and row of G) are generated recursively using formulas
(E.2b) and (E.2a), respectively. During the (n-s+1)th step (n-1 ≥ s ≥ 2), the
sth diagonal element g_ss is generated from g_{s+1,s}, ..., g_{ns} or alternatively from
g_{s,s+1}, ..., g_{sn} using result (E.1), and then g_{s-1,s}, ..., g_{1s} and g_{s,s-1}, ..., g_{s1} are
generated recursively using formulas (E.2b) and (E.2a), respectively. During the
nth (and final) step, the first diagonal element g_11 is generated from the last n-1
elements of the first column or row using result (E.1).
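The recursive procedure just described can be sketched in code as follows (an illustration only, assuming numpy, taking d_i^- = 1/d_i when d_i ≠ 0 and d_i^- = 0 otherwise; the function name ldu_generalized_inverse is hypothetical, not from the book).

import numpy as np

def ldu_generalized_inverse(L, D, U):
    # Generate G = U^{-1} D^- L^{-1}, a generalized inverse of A = LDU, from
    # formulas (E.1), (E.2a), and (E.2b), working outward from g_nn.
    n = L.shape[0]
    d = np.diag(D)
    dminus = np.array([1.0 / di if di != 0 else 0.0 for di in d])
    G = np.zeros((n, n))
    for s in range(n - 1, -1, -1):
        # diagonal element, formula (E.1)
        G[s, s] = dminus[s] - sum(U[s, k] * G[k, s] for k in range(s + 1, n))
        # elements above the diagonal in column s, formula (E.2b)
        for i in range(s - 1, -1, -1):
            G[i, s] = -sum(U[i, k] * G[k, s] for k in range(i + 1, n))
        # elements to the left of the diagonal in row s, formula (E.2a)
        for j in range(s - 1, -1, -1):
            G[s, j] = -sum(G[s, k] * L[k, j] for k in range(j + 1, n))
    return G

L = np.array([[1.0, 0, 0], [2, 1, 0], [-1, 3, 1]])
U = np.array([[1.0, 4, -2], [0, 1, 5], [0, 0, 1]])
D = np.diag([2.0, 0.0, 3.0])                 # a singular D is allowed
A = L @ D @ U
G = ldu_generalized_inverse(L, D, U)
print(np.allclose(A @ G @ A, A))             # G is a generalized inverse of A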

EXERCISE 26. Verify that a principal submatrix of a skew-symmetric matrix is


skew-symmetric.

Solution. Let B = {b_ij} represent the r x r principal submatrix of an n x n skew-
symmetric matrix A = {a_ij} obtained by striking out all of its rows and columns
except the k_1th, k_2th, ..., k_rth rows and columns (where k_1 < k_2 < ... < k_r).
Then, for i, j = 1, ..., r,

b_ji = a_{k_j k_i} = -a_{k_i k_j} = -b_ij.

Since b_ji is the ijth element of B' and -b_ij the ijth element of -B, we conclude
that B' = -B.

EXERCISE 27. (a) Show that the sum of skew-symmetric matrices is skew-
symmetric.
(b) Show that the sum Al + A2 + ... + Ak of n x n nonnegative definite
matrices AI, A2, ... , Ak is skew-symmetric if and only if AI, A2, ... , Ak are
skew-symmetric.
(c) Show that the sum Al + A2 + ... + Ak of n x n symmetric nonnegative
definite matrices AI, A2, ... , Ak is a null matrix if and only if AI, A2, ... ,Ak are
null matrices.

Solution. (a) Let A_1, A_2, ..., A_k represent n x n skew-symmetric matrices. Then,

(∑_i A_i)' = ∑_i A_i' = ∑_i (-A_i) = -∑_i A_i,

so that ∑_i A_i is skew-symmetric.
(b) If the nonnegative definite matrices A I, A2, ... ,Ak are skew-symmetric,
then it follows from Part (a) that their sum Li Ai is skew-symmetric.
Conversely, suppose that ∑_i A_i is skew-symmetric. Let d_ij represent the jth
diagonal element of A_i (i = 1, ..., k; j = 1, ..., n). Since (according to the
definition of skew-symmetry) the diagonal elements of ∑_i A_i equal zero, we have
that

d_1j + d_2j + ... + d_kj = 0

(j = 1, ..., n). Moreover, it follows from Corollary 14.2.13 that

d_1j ≥ 0, d_2j ≥ 0, ..., d_kj ≥ 0,

leading to the conclusion that d_1j, d_2j, ..., d_kj equal zero (j = 1, ..., n). Thus,
it follows from Lemma 14.6.4 that A_1, A_2, ..., A_k are skew-symmetric.
(c) Since (according to Lemma 14.6.1) the only n x n symmetric matrix that is
skew-symmetric is the n x n null matrix, Part (c) is a special case of Part (b).

EXERCISE 28. (a) Let A_1, A_2, ..., A_k represent n x n nonnegative definite
matrices. Show that tr(∑_{i=1}^{k} A_i) ≥ 0, with equality holding if and only if ∑_{i=1}^{k} A_i
is skew-symmetric or equivalently if and only if A_1, A_2, ..., A_k are skew-symmet-
ric. [Note. That ∑_{i=1}^{k} A_i being skew-symmetric is equivalent to A_1, A_2, ..., A_k
being skew-symmetric is the result of Part (b) of Exercise 27.]
(b) Let A_1, A_2, ..., A_k represent n x n symmetric nonnegative definite matrices.
Show that tr(∑_{i=1}^{k} A_i) ≥ 0, with equality holding if and only if ∑_{i=1}^{k} A_i = 0 or
equivalently if and only if A_1, A_2, ..., A_k are null matrices.
Solution. (a) According to Corollary 14.2.5, ∑_{i=1}^{k} A_i is nonnegative definite.
Thus, it follows from Theorem 14.7.2 that tr(∑_{i=1}^{k} A_i) ≥ 0, with equality holding
if and only if ∑_{i=1}^{k} A_i is skew-symmetric or equivalently [in light of the result of
Part (b) of Exercise 27] if and only if A_1, A_2, ..., A_k are skew-symmetric.
(b) Part (b) follows from Part (a) upon observing (on the basis of Lemma 14.6.1)
that a symmetric matrix is skew-symmetric if and only if it is null.

EXERCISE 29. Show, via an example, that (for n > 1) there exist n x n (non-
symmetric) positive definite matrices A and B such that tr(AB) < 0.
Solution. Take A = {a_ij} to be an n x n matrix such that

a_ij = 1, for j = i,
     = 2, for j = i+1, ..., n,
     = -2, for j = 1, ..., i-1

(i = 1, ..., n), and take B = A. That is, take

B = A = ( 1   2   2  ...  2
         -2   1   2  ...  2
         -2  -2   1  ...  2
          .   .   .       .
         -2  -2  -2  ...  1 ).

Then, (1/2)(A + A') = I_n, which is a positive definite matrix, implying (in light
of Corollary 14.2.7) that A is positive definite. Moreover, all n diagonal elements of
AB equal 1 - 4(n-1), which (for n > 1) is a negative number. Thus, tr(AB) < 0.
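A numerical check of this example (assuming numpy), here with n = 4:

import numpy as np

n = 4
A = np.eye(n) + 2 * np.triu(np.ones((n, n)), 1) - 2 * np.tril(np.ones((n, n)), -1)
B = A.copy()

# the symmetric part of A is the identity, so A and B are positive definite,
# yet every diagonal element of AB equals 1 - 4(n - 1) < 0
print(np.allclose((A + A.T) / 2, np.eye(n)), np.trace(A @ B))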

EXERCISE 30. (a) Show, via an example, that (for n > 1) there exist n x n
symmetric positive definite matrices A and B such that the product AB has one
or more negative diagonal elements (and hence such that AB is not nonnegative
definite).
(b) Show, however, that the product of two n x n symmetric positive definite
matrices cannot be nonpositive definite.
Solution. (a) Take A = diag(A_11, I_{n-2}) and B = diag(B_11, I_{n-2}), where

A_11 = ( 2  -1       and      B_11 = ( 12  3
        -1   2 )                        3  1 ),

and consider the quadratic forms x'Ax and x'Bx in the n-dimensional vector x =
(x_1, x_2, ..., x_n)'. We find that

x'Ax = 2[x_1 - (1/2)x_2]^2 + (3/2)x_2^2 + x_3^2 + x_4^2 + ... + x_n^2,

x'Bx = 12[x_1 + (1/4)x_2]^2 + (1/4)x_2^2 + x_3^2 + x_4^2 + ... + x_n^2.

Clearly, x'Ax ≥ 0 with equality holding only if x = 0, and similarly x'Bx ≥ 0
with equality holding only if x = 0. Thus, the quadratic forms x'Ax and x'Bx are

positive definite, and hence, by definition, the matrices A and B of the quadratic
forms are positive definite.
Consider now the product AB. We find that AB = diag(A_11B_11, I_{n-2}) and that
A_11B_11 = (21  5; -6  -1), thereby revealing that the second diagonal element of AB
equals the negative number -1.
(b) Let A and B represent n x n symmetric positive definite matrices. Suppose,
for purposes of establishing a contradiction, that AB is nonpositive definite. Then,
by definition, - AB is nonnegative definite, implying (in light of Theorem 14.7.2)
that tr(-AB) ≥ 0 and hence that

tr(AB) = -tr(-AB) ≤ 0.


However, according to Theorem 14.7.4, tr(AB) > O. Thus, we have arrived at the
sought-after contradiction. We conclude that AB cannot be nonpositive definite.

EXERCISE 31. Let A = {aij} and B = {bij} represent n x n matrices, and


take C to be the n x n matrix whose ijth element Cij = aijbij is the product
of the ijth elements of A and B - C is the so-called Hadamard product of A
and B. Show that if A is nonnegative definite and B is symmetric nonnegative
definite, then C is nonnegative definite. Show further that if A is positive definite
and B is symmetric positive definite, then C is positive definite. [Hint. Taking
x = (x_1, ..., x_n)' to be an arbitrary n x 1 vector and F = (f_1, ..., f_n) to be a
matrix such that B = F'F, begin by showing that x'Cx = tr(AH), where H = G'G
with G = (x_1f_1, ..., x_nf_n).]
Solution. Suppose that B is symmetric nonnegative definite. Then, according to
Corollary 14.3.8, there exists a matrix F = (f_1, ..., f_n) such that B = F'F.
Let x = (x_1, ..., x_n)' represent an arbitrary n-dimensional column vector, let
G = (x_1f_1, ..., x_nf_n), and let H = G'G. Then, the ijth element of H is

h_ij = (x_if_i)'(x_jf_j) = x_ix_j f_i'f_j = b_ij x_ix_j.

Thus,

x'Cx = ∑_{i,j} x_i c_ij x_j = ∑_{i,j} a_ij b_ij x_i x_j = ∑_{i,j} a_ij h_ji = tr(AH).

Clearly, the matrix H is symmetric nonnegative definite, implying (in light of The-
orem 14.7.6) that if A is nonnegative definite, then tr(AH) ≥ 0 and consequently
x'Cx ≥ 0.
Consider now the special case where B is symmetric positive definite. In this
special case, rank(F) = rank(B) = n, implying that the columns of F are linearly
independent and hence nonnull. Thus, unless x = 0, G is nonnull and hence H is

nonnull. It follows (in light of Theorem 14.7.4) that if A is positive definite, then,
unless x = 0, tr(AH) > 0 and consequently x'Cx > O.
We conclude that if A is nonnegative definite and B is symmetric nonnegative
definite, then C is nonnegative definite and that if A is positive definite and B is
symmetric positive definite, then C is positive definite.
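A small numerical check (assuming numpy) of the positive definite case: the elementwise product of two randomly generated symmetric positive definite matrices has only positive eigenvalues.

import numpy as np

rng = np.random.default_rng(4)
M1 = rng.standard_normal((5, 5))
M2 = rng.standard_normal((5, 5))
A = M1 @ M1.T + 5 * np.eye(5)       # symmetric positive definite
B = M2 @ M2.T + 5 * np.eye(5)       # symmetric positive definite

C = A * B                           # elementwise (Hadamard) product
print(np.all(np.linalg.eigvalsh(C) > 0))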

EXERCISE 32. Let A_1, A_2, ..., A_k and B_1, B_2, ..., B_k represent n x n sym-
metric nonnegative definite matrices. Show that tr(∑_{i=1}^{k} A_iB_i) ≥ 0, with equality
holding if and only if, for i = 1, 2, ..., k, A_iB_i = 0, thereby generalizing the
results of Part (b) of Exercise 28.
Solution. According to Corollary 14.7.7, tr(A_iB_i) ≥ 0 (i = 1, 2, ..., k). Thus,

tr(∑_{i=1}^{k} A_iB_i) = ∑_{i=1}^{k} tr(A_iB_i) ≥ 0,

with equality holding if and only if, for i = 1, 2, ..., k, tr(A_iB_i) = 0 or equiva-
lently (in light of Corollary 14.7.7) if and only if, for i = 1, 2, ..., k, A_iB_i = 0.

EXERCISE 33. Let A represent a symmetric nonnegative definite matrix that has
been partitioned as

A = (T  U
     V  W),

where T (and hence W) is square. Show that VT^-U and UW^-V are symmetric
and nonnegative definite.
Solution. According to Lemma 14.8.1, there exist matrices R and S such that

T = R'R,  U = R'S,  V = S'R,  W = S'S.

Thus, making use of Parts (6) and (3) of Theorem 12.3.4, we find that

VT^-U = S'R(R'R)^-R'S = S'P_RS = S'P_RP_RS = S'P_R'P_RS = (P_RS)'P_RS

and similarly that

UW^-V = R'S(S'S)^-S'R = R'P_SR = R'P_SP_SR = R'P_S'P_SR = (P_SR)'P_SR.

We conclude that VT^-U and UW^-V are symmetric and (in light of Corollary
14.2.14) nonnegative definite.

EXERCISE 34. Show, via an example, that there exists an (m + n) x (m + n)
(nonsymmetric) positive semidefinite matrix A of the form A = (T  U; V  W), where
T is of dimensions m x m, W of dimensions n x n, U of dimensions m x n, and V of
dimensions n x m, for which C(U) ⊄ C(T) and/or R(V) ⊄ R(T), the expression
rank(T) + rank(W - VT^-U) does not necessarily equal rank(A), and the formula

(T^- + T^-UQ^-VT^-    -T^-UQ^-
      -Q^-VT^-           Q^-  ),                                  (*)

where Q = W - VT^-U, does not necessarily give a generalized inverse of A.

Solution. Consider the matrix A = (T  U; V  W), where T = 0, U = J_mn, V =
-J_nm, and W = I_n. Clearly, C(U) ⊄ C(T) and R(V) ⊄ R(T). Moreover,
(1/2)(A + A') = (0  0; 0  I_n) (which is a positive semidefinite matrix), so that
(according to Corollary 14.2.7) A is positive semidefinite.
Now, take T^- = 0, in which case the Schur complement of T relative to T^- is
W - VT^-U = I_n. Using Theorem 9.6.1, we find that

rank(A) = n + rank(T - UW^{-1}V)
        = n + rank(J_mnJ_nm)
        = n + rank(nJ_mm) = n + 1.

However,

rank(T) + rank(W - VT^-U) = rank(0) + rank(I_n) = n.

Thus,

rank(A) ≠ rank(T) + rank(W - VT^-U).

Further, the matrix obtained by applying formula (*) [or equivalently formula
(9.6.2)] is (0  0; 0  I_n). Since

A (0  0; 0  I_n) A = (-nJ_mm  J_mn; -J_nm  I_n) ≠ A,

the matrix (0  0; 0  I_n) is not a generalized inverse of A.


EXERCISE 35. Show, via an example, that there exists an (m + n) x (m + n)
symmetric partitioned matrix A of the form A = (T  U; U'  W), where T is of
dimensions m x m, U of dimensions m x n, and W of dimensions n x n, such
that T is nonnegative definite and (depending on the choice of T^-) the Schur
complement W - U'T^-U of T relative to T^- is nonnegative definite, but A is not
nonnegative definite.

Solution. Consider the symmetric matrix A = (T  U; U'  W), where T = 0, U = J_mn,
and W = I_n. And, take T^- = 0, in which case the Schur complement of T relative
to T^- is W - U'T^-U = I_n.

Then, clearly, T is nonnegative definite, and the Schur complement of T relative


to T- is nonnegative definite. However, A is not nonnegative definite, as is evident
from Corollary 14.8.2.

EXERCISE 36. An n x n matrix A = {a_ij} is said to be diagonally dominant
if, for i = 1, 2, ..., n, |a_ii| > ∑_{j=1, j≠i}^{n} |a_ij|. (In the degenerate special case
where n = 1, A is said to be diagonally dominant if it is nonnull.)
(a) Show that a principal submatrix of a diagonally dominant matrix is diagonally
dominant.

(b) Let A = {a_ij} represent an n x n diagonally dominant matrix, partition A
as A = (A_11  a; b'  a_nn) [so that A_11 is of dimensions (n-1) x (n-1)], and let
C = A_11 - (1/a_nn)ab' represent the Schur complement of a_nn. Show that C is
diagonally dominant.
(c) Show that a diagonally dominant matrix is nonsingular.
(d) Show that a diagonally dominant matrix has a unique LDU decomposition.
(e) Let A = {aij} represent an n x n symmetric matrix. Show that if A is
diagonally dominant and if the diagonal elements all, a22, ... , ann of A are all
positive, then A is positive definite.

Solution. (a) Let A = {a_ij} represent an n x n diagonally dominant matrix, and
let B = {b_kl} represent the m x m principal submatrix obtained by striking out
all of the rows and columns of A except the i_1th, i_2th, ..., i_mth rows and columns
(where i_1 < i_2 < ... < i_m). Then, for k = 1, 2, ..., m,

|b_kk| = |a_{i_k i_k}| > ∑_{j=1, j≠i_k}^{n} |a_{i_k j}| ≥ ∑_{l=1, l≠k}^{m} |a_{i_k i_l}| = ∑_{l=1, l≠k}^{m} |b_kl|.

Thus, B is diagonally dominant.


(b) For i, j I, 2, ... , n - 1, let Cij represent the ijth element of C. By
definition,

Then, for i = 1, 2, ... , n - I,

n-I n-I

L ICij I < L (Iaij 1+ lainanj lann I)


j= I (j i=i) j= I (j i=i)
n n-I
= L laij I - lain 1+ L lainanj/ann I
j=1 (ji=il j=1 (ji=i)
n-I
< laii I - lain I + L lainanj I ann I
j= I (j i=i)
14. Linear, Bilinear, and Quadratic Forms 99

n-l
S Icul-Iainl + I: lainanjlannl
j=1

(since laiil = laii - ai"ani I ann + ainand ann I


S laii - ainani lann 1+ lainandann I
= ICiil + lainand ann I)
n-l
= ICiil-lainl + lainlI: lanjlan" I
j=l

< ICiil-lainl + lainl


(since I:j:: lanjlannl = I:j:: lanji/lanni
< lanni/lanni = I)
= Icul·
(c) The proof is by mathematical induction. Clearly, any 1 x 1 diagonally dom-
inant matrix is nonsingular. Suppose now that any (n-1) x (n-1) diagonally
dominant matrix is nonsingular, and let A = {a_ij} represent an arbitrary n x n
diagonally dominant matrix. It suffices to show that A is nonsingular.
Partition A as A = (A_11  a; b'  a_nn) [so that A_11 is of dimensions (n-1) x (n-1)].
Since A is diagonally dominant, a_nn ≠ 0. Let C = A_11 - (1/a_nn)ab'. It follows
from Part (b) that C is diagonally dominant, so that by supposition C [which is
of dimensions (n-1) x (n-1)] is nonsingular. Based on Theorem 8.5.11, we
conclude that A is nonsingular.
(d) It follows from Part (a) that every principal submatrix of a diagonally dom-
inant matrix is diagonally dominant and hence - in light of Part (c) - nonsingular.
We conclude - on the basis of Corollary 14.5.7 - that a diagonally dominant matrix
has a unique LDU decomposition.
(e) The proof is by mathematical induction. Clearly, any 1 x I diagonally dom-
inant matrix with a positive (diagonal) element is positive definite. Suppose now
that any (n - 1) x (n - 1) symmetric diagonally dominant matrix with positive
diagonal elements is positive definite. Let A = {aij} represent an n x n symmetric
diagonally dominant matrix with positive diagonal elements. It suffices to show
that A is positive definite.
Partition A as A = (A_11  a; a'  a_nn) [so that A_11 is of dimensions (n-1) x (n-1)],
and let C = A_11 - (1/a_nn)aa' represent the Schur complement of a_nn. It follows
from Part (b) that C is diagonally dominant. Moreover, the ith diagonal element
of C is

a_ii - a_in a_ni / a_nn ≥ a_ii - |a_in| |a_ni| / |a_nn|
                       ≥ a_ii - |a_in|
                       > 0

(i = 1,2, ... , n - 1). Thus, by supposition, C [which is symmetric and of di-


mensions (n - 1) x (n - 1)] is positive definite. Based on Corollary 14.8.6, we
conclude that A is positive definite.
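A sketch of Parts (c) and (e) in code (assuming numpy; the helper name diagonally_dominant is hypothetical): a symmetric matrix forced to be strictly diagonally dominant with positive diagonal elements is, as claimed, positive definite.

import numpy as np

def diagonally_dominant(A):
    # |a_ii| strictly exceeds the sum of the absolute values of the other
    # elements of row i, for every i
    off = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return bool(np.all(np.abs(np.diag(A)) > off))

rng = np.random.default_rng(5)
A = rng.uniform(-1, 1, size=(6, 6))
A = (A + A.T) / 2
np.fill_diagonal(A, 6.0)             # each row's off-diagonal sum stays below 5

print(diagonally_dominant(A), np.all(np.linalg.eigvalsh(A) > 0))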

EXERCISE 37. Let A = {aij} represent an n x n symmetric positive definite


matrix. Show that det(A) ≤ ∏_{i=1}^{n} a_ii, with equality holding if and only if A is
diagonal.

Solution. That det(A) = ∏_{i=1}^{n} a_ii if A is diagonal is an immediate consequence of
Corollary 13.1.2. Thus, it suffices to show that if A is not diagonal, then det(A) <
∏_{i=1}^{n} a_ii. This is accomplished by mathematical induction.
Consider a 2 x 2 symmetric matrix

A = (a_11  a_12
     a_12  a_22)

that is not diagonal. (Every 1 x 1 matrix is diagonal.) Even if A is not positive
definite, we have that

det(A) = a_11a_22 - a_12^2 < a_11a_22.


Suppose now that, for every (n-1) x (n-1) symmetric positive definite matrix
that is not diagonal, the determinant of the matrix is less than the product of its
diagonal elements, and consider the determinant of an n x n symmetric positive
definite matrix A = {a_ij} that is not diagonal (where n ≥ 3).
Partition A as

A = (A_*  a
     a'   a_nn)

[where A_* is of dimensions (n-1) x (n-1)]. Then, in light of the discussion of
Section 14.8a, it follows from Theorem 13.3.8 that

|A| = |A_*|(a_nn - a'A_*^{-1}a).                                  (S.6)

And, it follows from Corollary 14.8.6 and Lemma 14.9.1 that |A_*| > 0 and a_nn -
a'A_*^{-1}a > 0.
In the case where A_* is diagonal, we have (since A is not diagonal) that a ≠ 0,
implying (since A_*^{-1} is positive definite) that a'A_*^{-1}a > 0 and hence that a_nn >
a_nn - a'A_*^{-1}a, so that [in light of result (S.6)]

|A| < a_nn|A_*| = a_nn ∏_{i=1}^{n-1} a_ii = ∏_{i=1}^{n} a_ii.

In the alternative case where A_* is not diagonal, we have that a'A_*^{-1}a ≥ 0, implying
that a_nn ≥ a_nn - a'A_*^{-1}a, and we have, by supposition, that |A_*| < ∏_{i=1}^{n-1} a_ii, so
that

|A| < (a_nn - a'A_*^{-1}a) ∏_{i=1}^{n-1} a_ii ≤ a_nn ∏_{i=1}^{n-1} a_ii = ∏_{i=1}^{n} a_ii.

Thus, in either case, |A| < ∏_{i=1}^{n} a_ii.
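A numerical check (assuming numpy) of this determinant inequality for a randomly generated symmetric positive definite matrix that is not diagonal:

import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((5, 5))
A = M @ M.T + np.eye(5)              # symmetric positive definite, not diagonal

print(np.linalg.det(A) < np.prod(np.diag(A)))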

EXERCISE 38. Let A = (a  b; c  d), where a, b, c, and d are scalars.

(a) Show that A is positive definite if and only if a > 0, d > 0, and |b + c|/2 < √(ad).
(b) Show that, in the special case where A is symmetric (i.e., where c = b), A
is positive definite if and only if a > 0, d > 0, and |b| < √(ad).
Solution. (a) Let

B = (1/2)(A + A') = (a          (b + c)/2
                     (b + c)/2   d        ).

Observe that

det(B) = ad - [(b + c)/2]^2                                       (S.7)

and (in light of Corollary 14.2.7) that A is positive definite if and only if B is
positive definite.
Suppose that A is positive definite (and hence that B is positive definite). Then,
it follows from Corollary 14.2.13 that a > 0 and d > 0, and [in light of equality
(S.7)] it follows from Lemma 14.9.1 that ad - [(b + c)/2]^2 > 0, or equivalently
that [(b + c)/2]^2 < ad, and hence that |b + c|/2 < √(ad).
Conversely, suppose that a > 0, d > 0, and |b + c|/2 < √(ad), in which case
[(b + c)/2]^2 < ad or equivalently [in light of equality (S.7)] det(B) > 0.
Then, it follows from Theorem 14.9.5 that B is positive definite and hence that A
is positive definite.
(b) Part (b) follows from Part (a) upon observing that, in the special case where
c = b, the condition |b + c|/2 < √(ad) simplifies to the condition |b| < √(ad).
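The criterion of Part (a) is easy to check numerically (a sketch only, assuming numpy; the function name positive_definite_2x2 is hypothetical); here it is compared with the eigenvalues of the symmetric part B for one nonsymmetric choice of A.

import numpy as np

def positive_definite_2x2(a, b, c, d):
    # criterion from Part (a): a > 0, d > 0, |b + c|/2 < sqrt(ad)
    return a > 0 and d > 0 and abs(b + c) / 2 < np.sqrt(a * d)

a, b, c, d = 2.0, 3.0, -2.5, 1.0
B = np.array([[a, (b + c) / 2], [(b + c) / 2, d]])
print(positive_definite_2x2(a, b, c, d), np.all(np.linalg.eigvalsh(B) > 0))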

EXERCISE 39. By, for example, making use of the result of Exercise 38, show
that if an n x n matrix A = {a_ij} is symmetric positive definite, then, for j ≠ i =
1, ..., n,

|a_ij| < √(a_ii a_jj) ≤ max(a_ii, a_jj).

Solution. Suppose that A is symmetric positive definite. Clearly, the 2 x 2 matrix
(a_ii  a_ij; a_ji  a_jj) is a principal submatrix of A and hence (in light of Corollary 14.2.12)
is symmetric positive definite. Thus, it follows from Part (b) of Exercise 38 that
a_ii > 0, a_jj > 0, and |a_ij| < √(a_ii a_jj).
Moreover, if a_ii ≥ a_jj, then

√(a_ii a_jj) ≤ √(a_ii^2) = a_ii = max(a_ii, a_jj);

and similarly if a_ii < a_jj, then

√(a_ii a_jj) ≤ √(a_jj^2) = a_jj = max(a_ii, a_jj).



EXERCISE 40. Show, by example, that it is possible for the determinants of


both leading principal submatrices of a 2 x 2 symmetric matrix to be nonnegative
without the matrix being nonnegative definite and that, for n ≥ 3, it is possible
for the determinants of all n leading principal submatrices of an n x n symmetric
matrix to be nonnegative and for the matrix to be nonsingular without the matrix
being nonnegative definite.

Solution. Consider the 2 x 2 symmetric matrix (0  0; 0  -1). The determinants of
both of its leading principal submatrices are zero (and hence nonnegative), but it
is obviously not nonnegative definite.
Next, consider the 3 x 3 symmetric matrix

A_* = (0   0   1
       0  -1   0
       1   0   0).

By, for example, expanding |A_*| in terms of the cofactors of the three elements of
the first row of A_*, we find that |A_*| = 1. Thus, the determinants of the leading
principal submatrices of A_* (of orders 1, 2, and 3) are 0, 0, and 1, respectively, all
of which are nonnegative; and A_* is nonsingular. However, A_* is not nonnegative
definite (since, e.g., one of its diagonal elements is negative).
Finally, for n ≥ 4, consider the n x n symmetric matrix

A = (A_*  0
     0    I_{n-3}).
Clearly, the leading principal submatrices of A of orders 1, 2, and 3 are the same
as those of A*, so that their determinants are 0, 0, and 1, respectively. Moreover, it
follows from results (13.3.5) and (13.1.9) that the determinants of all ofthe leading
principal submatrices of A of order 4 or more equallA* I and hence equal 1. Thus,
the determinants of all n leading principal submatrices of A are nonnegative, and
A is nonsingular. However, A is not nonnegative definite (since, e.g., one of its
diagonal elements is negative).

EXERCISE 41. Let V represent a subspace of R^{n×1} of dimension r (where r ≥
1). Take B = (b_1, b_2, ..., b_r) to be any n x r matrix whose columns b_1, b_2, ..., b_r
form a basis for V, and let L represent any left inverse of B. Let g represent a
function that assigns the value x * y to an arbitrary pair of vectors x and y in V.
(a) Let f represent an arbitrary inner product for R^{r×1}, and denote by s∘t the
value assigned by f to an arbitrary pair of r-dimensional vectors s and t. Show
that g is an inner product (for V) if and only if there exists an f such that (for all
x and y in V)

x * y = (Lx)∘(Ly).

(b) Show that g is an inner product (for V) if and only if there exists an r x r
symmetric positive definite matrix W such that (for all x and y in V)

x * y = x'L'WLy.

(c) Show that g is an inner product (for V) if and only if there exists an n x n
symmetric positive definite matrix W such that (for all x and y in V)

x * y = x'Wy.

Solution. (a) Suppose that, for some f, x * y = (Lx)∘(Ly) (for all x and y in V).
Then,
(1) x * y = (Lx)∘(Ly) = (Ly)∘(Lx) = y * x;
(2) x * x = (Lx)∘(Lx) ≥ 0, with equality holding if and only if Lx = 0 or
equivalently (since x = Bk for some vector k, so that Lx = 0 ⟹ LBk =
0 ⟹ Ik = 0 ⟹ k = 0 ⟹ Bk = 0 ⟹ x = 0) if and only if x = 0;
(3) (kx) * y = (kLx)∘(Ly) = k[(Lx)∘(Ly)] = k(x * y);
(4) (x + y) * z = (Lx + Ly)∘(Lz) = [(Lx)∘(Lz)] + [(Ly)∘(Lz)] = (x*z) + (y*z)
(where x, y, and z represent arbitrary vectors in V and k represents an arbitrary
scalar). Thus, g is an inner product.
Conversely, suppose that g is an inner product, and consider the function f* that
assigns to an arbitrary pair of vectors s and t in R^{r×1} the value

s ⋆ t = (Bs) * (Bt).

We find that
(1) s ⋆ t = (Bs) * (Bt) = (Bt) * (Bs) = t ⋆ s;
(2) s ⋆ s = (Bs) * (Bs) ≥ 0, with equality holding if and only if Bs = 0 or
equivalently (since the columns of B are linearly independent) if and only if
s = 0;
(3) (ks) ⋆ t = (kBs) * (Bt) = k[(Bs) * (Bt)] = k(s ⋆ t);
(4) (s + t) ⋆ u = (Bs + Bt) * (Bu) = [(Bs) * (Bu)] + [(Bt) * (Bu)] = (s ⋆ u) + (t ⋆ u)
(where s, t, and u represent arbitrary vectors in R^{r×1} and k represents an arbitrary
scalar). Thus, f* is an inner product (for R^{r×1}).
Now, set f = f*. Then, letting x and y represent arbitrary vectors in V and
defining s and t to be the unique vectors that satisfy Bs = x and Bt = y (so that
s = Is = LBs = Lx and similarly t = Ly), we find that

x * y = (Bs) * (Bt) = s ⋆ t = s∘t = (Lx)∘(Ly).

(b) Let f represent an arbitrary inner product for R^{r×1}, and denote by s∘t the
value assigned by f to an arbitrary pair of r-dimensional vectors s and t. According
to Part (a), g is an inner product (for V) if and only if there exists an f such that
(for all x and y in V) x * y = (Lx)∘(Ly). Moreover, according to the discussion of
Section 14.10a, every inner product for R^{r×1} is expressible as a bilinear form, and
a bilinear form (in r-dimensional vectors) qualifies as an inner product for R^{r×1}
if and only if the matrix of the bilinear form is symmetric and positive definite.
Thus, g is an inner product (for V) if and only if there exists an r x r symmetric
positive definite matrix W such that (for all x and y in V) x * y = (Lx)'WLy.
(c) Suppose that there exists an n x n symmetric positive definite matrix W such
that (for all x and y in V) x * y = x'Wy. According to the discussion of Section
14.10a, the function that assigns the value x'Wy to an arbitrary pair of vectors x
and y in R^{n×1} is an inner product for R^{n×1}. Thus, it follows from the discussion
of Section 6.1b that g is an inner product (for V).
Conversely, suppose that g is an inner product. According to Theorem 4.3.12,
there exist n - r n-dimensional column vectors b_{r+1}, ..., b_n such that b_1, ..., b_r,
b_{r+1}, ..., b_n form a basis for R^{n×1}. Let C = (b_{r+1}, b_{r+2}, ..., b_n), and define

F = (B, C).

Partition F^{-1} as

F^{-1} = (L_*
         M ),

where L_* is of dimensions r x n. (The matrix F is invertible since its columns are
linearly independent.) Note that (by definition)

L_*B = I_r,

so that L_* is a left inverse of B, and

MB = 0.

According to Part (b), there exists an r x r symmetric positive definite matrix
W_* such that (for all x and y in V)

x * y = x'L_*'W_*L_*y.

Moreover, letting x and y represent arbitrary vectors in V and defining s and t to
be the unique vectors that satisfy Bs = x and Bt = y, we find that

x'L_*'W_*L_*y = s'(L_*B)'W_*(L_*B)t
             = s'[(L_*; M)B]'(W_*  0; 0  I_{n-r})[(L_*; M)B]t
             = s'B'(F^{-1})'(W_*  0; 0  I_{n-r})F^{-1}Bt
             = x'Wy,

where W = (F^{-1})'diag(W_*, I_{n-r})F^{-1}. Thus, x * y = x'Wy. Furthermore, W is
symmetric, and it follows from Lemma 14.8.3 and Corollary 14.2.10 that W is
positive definite.

EXERCISE 42. Let V represent a linear space of m x n matrices, and let A∘B
represent the value assigned by a quasi-inner product to any pair of matrices A and
B (in V). Show that the set

U = {A ∈ V : A∘A = 0},

which comprises every matrix in V with a zero quasi norm, is a linear space.
Solution. Let A and B represent arbitrary matrices in U, and let k represent an
arbitrary scalar.
Since (by definition) ‖A‖ = 0, it follows from the discussion in Section 14.10c
that A∘B = 0. Thus,

(A + B)∘(A + B) = (A∘A) + 2(A∘B) + (B∘B) = 0 + 0 + 0 = 0,

implying that (A + B) ∈ U. Moreover,

(kA)∘(kA) = k^2(A∘A) = k^2(0) = 0,

so that kA ∈ U.
We conclude that U is a linear space.

EXERCISE 43. Let W represent an m x m symmetric positive definite matrix


and V an n x n symmetric positive definite matrix.
(a) Show that the function that assigns the value tr(A'WBV) to an arbitrary pair
of m x n matrices A and B qualifies as an inner product for the linear space R^{m×n}.
(b) Show that the function that assigns the value tr(A'WB) to an arbitrary pair
of m x n matrices A and B qualifies as an inner product for R^{m×n}.
(c) Show that the function that assigns the value tr(A'WBW) to an arbitrary pair
of m x m matrices A and B qualifies as an inner product for R^{m×m}.
Solution. (a) Let us show that the function that assigns the value tr(A'WBV) to an
arbitrary pair of m x n matrices A and B has the four basic properties (described
in Section 6.1 b) of an inner product. For this purpose, let A, B, and C represent
arbitrary m x n matrices, and let k represent an arbitrary scalar.
(1) Using results (5.1.5) and (5.2.3), we find that

tr(A'WBV) = tr[(A'WBV)'] = tr(VB'WA) = tr(B'WAV).


(2) According to Corollary 14.3.13, W = Q'Q for some m x m nonsingular
matrix Q, and V = P'P for some n x n nonsingular matrix P. Thus, using results
(5.2.3) and (5.2.5) along with Lemma 5.3.1, we find that

tr(A'WAV) = tr(A'Q'QAP'P) = tr(PA'Q'QAP')
          = tr[(QAP')'QAP'] ≥ 0,

with equality holding if and only if QAP' = 0 or, equivalently, if and only if
A = 0.

(3) Clearly, tr[(kA)'WBV] = k tr(A'WBV).


(4) Clearly, tr[(A + B)'WCV] = tr(A'WCV) + tr(B'WCV).
(b) and (c) The functions described in Parts (b) and (c) are special cases of the
function described in Part (a) - those where V = I and V = W, respectively.
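A numerical illustration (assuming numpy) of the inner product of Part (a): symmetry and positivity can be checked directly for randomly generated W, V, A, and B.

import numpy as np

rng = np.random.default_rng(7)
m, n = 3, 4
W = rng.standard_normal((m, m)); W = W @ W.T + m * np.eye(m)   # symmetric pd
V = rng.standard_normal((n, n)); V = V @ V.T + n * np.eye(n)   # symmetric pd

def ip(A, B):
    # candidate inner product on m x n matrices
    return np.trace(A.T @ W @ B @ V)

A = rng.standard_normal((m, n))
B = rng.standard_normal((m, n))
print(np.isclose(ip(A, B), ip(B, A)), ip(A, A) > 0)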

EXERCISE 44. Let A represent a q x p matrix, B a p x n matrix, and C an m x q


matrix. Show that (a) CAB(CAB)-C = C if and only if rank(CAB) = rank(C),
and (b) B(CAB)-CAB = B if and only if rank(CAB) = rank(B).

Solution. (a) Suppose that rank(CAB) = rank(C). Then, it follows from Corollary
4.4.7 that C(CAB) = C(C) and hence that C = CABR for some matrix R. Thus,

CAB(CAB)-C = CAB(CAB)-CABR = CABR = C.

Conversely, suppose that CAB(CAB)-C = C. Then,


rank(CAB) ≥ rank[CAB(CAB)-C] = rank(C).

Since clearly rank(CAB) ≤ rank(C), we have that rank(CAB) = rank(C).


(b) Similarly, suppose that rank(CAB) = rank(B). Then, it follows from Corol-
lary 4.4.7 that R(CAB) = R(B) and hence that B = LCAB for some matrix L.
Thus,
B(CAB)-CAB = LCAB(CAB)-CAB = LCAB = B.
Conversely, suppose that B(CAB)-CAB = B. Then,
rank(CAB) ≥ rank[B(CAB)-CAB] = rank(B).
Since clearly rank(CAB) ≤ rank(B), we have that rank(CAB) = rank(B).

EXERCISE 45. Let U represent a subspace of R^{n×1}, let X represent an n x p
matrix whose columns span U, and let W and V represent n x n symmetric positive
definite matrices. Show that each of the following two conditions is necessary and
sufficient for the projection P_{X,W}y of y on U with respect to W to be the same (for
every y in R^n) as the projection P_{X,V}y of y on U with respect to V:
(a) V = P_{X,W}'VP_{X,W} + (I - P_{X,W})'V(I - P_{X,W});
(b) there exist a scalar c, a p x p matrix K, and an n x n matrix H such that

V = cW + WXKX'W + (I - P_{X,W})'H(I - P_{X,W}).

Solution. (a) In light of Theorem 14.12.18, it suffices to show that this condition
is equivalent to the condition (I - P_{X,W})'VP_{X,W} = 0.
Suppose that V = P_{X,W}'VP_{X,W} + (I - P_{X,W})'V(I - P_{X,W}). Then,

(I - P_{X,W})'VP_{X,W}
    = [P_{X,W}(I - P_{X,W})]'VP_{X,W}^2 + [(I - P_{X,W})']^2 V(I - P_{X,W})P_{X,W}.

Moreover, since [according to Part (6) of Theorem 14.12.11] P_{X,W} is idempotent,

P_{X,W}(I - P_{X,W}) = P_{X,W} - P_{X,W}^2 = P_{X,W} - P_{X,W} = 0,

and similarly (I - P_{X,W})P_{X,W} = 0. Thus, (I - P_{X,W})'VP_{X,W} = 0.
Conversely, suppose that (I - P_{X,W})'VP_{X,W} = 0. Then,

V = [P_{X,W} + (I - P_{X,W})]'V[P_{X,W} + (I - P_{X,W})]
  = P_{X,W}'VP_{X,W} + (I - P_{X,W})'V(I - P_{X,W})
      + (I - P_{X,W})'VP_{X,W} + [(I - P_{X,W})'VP_{X,W}]'
  = P_{X,W}'VP_{X,W} + (I - P_{X,W})'V(I - P_{X,W}).

(b) Suppose that P_{X,W}y is the same for every y in R^n as P_{X,V}y. Then, Condition
(a) of the exercise is satisfied, so that

V = P_{X,W}'VP_{X,W} + (I - P_{X,W})'V(I - P_{X,W})
  = cW + WXKX'W + (I - P_{X,W})'H(I - P_{X,W})

for c = 0, K = [(X'WX)^-]'X'VX(X'WX)^-, and H = V.
Conversely, suppose that

V = cW + WXKX'W + (I - P_{X,W})'H(I - P_{X,W})

for some scalar c and some matrices K and H. Then, since [according to Part (1)
of Theorem 14.12.11] X - P_{X,W}X = 0, we have that

VX = cWX + WXKX'WX + (I - P_{X,W})'H(X - P_{X,W}X)
   = cWX + WXKX'WX
   = WX(cI + KX'WX)
   = WXQ

for Q = cI + KX'WX. Thus, it follows from Theorem 14.12.18 that P_{X,W}y is the
same for every y in R^n as P_{X,V}y.

EXERCISE 46. Let y represent an n-dimensional column vector, let U represent
a subspace of R^{n×1}, and let X represent an n x p matrix whose columns span
U. "Recall" that, for any m x n matrix L, for any subspace W of R^{n×1}, and for
V = {v ∈ R^m : v = Lx for some x ∈ W},

x ⊥_H W  ⇔  Lx ⊥_I V,                                             (*)

where H = L'L (and where x represents an arbitrary n x 1 vector), and

C(LZ) = V,                                                        (**)

where Z is any n x q matrix whose columns span W. Use results (*) and (**) to
show that, for any n x n symmetric nonnegative definite matrix W, y ⊥_W U if and
only if X'Wy = 0.

Solution. According to Corollary 14.3.8, W = L'L for some matrix L. Denote by
m the number of rows in L, and let

V = {v ∈ R^m : v = Lx for some x ∈ U}.

Then, it follows from result (*) [or equivalently from Part (3) of Lemma 14.12.2]
that y ⊥_W U if and only if Ly ⊥_I V and hence, in light of result (**) [or equiva-
lently in light of Part (5) of Lemma 14.12.2], if and only if (LX)'Ly = 0. Since
(LX)'Ly = X'Wy, we conclude that y ⊥_W U if and only if X'Wy = 0.

EXERCISE 47. Let W represent an n x n symmetric nonnegative definite matrix.
(a) Show that, for any n x p matrix X and any n x q matrix U such that
C(U) ⊂ C(X),
(1) WP_{X,W}U = WU, and U'WP_{X,W} = U'W;
(2) P_{U,W}P_{X,W} = P_{U,W}, and P_{X,W}'WP_{U,W} = WP_{X,W}P_{U,W} = WP_{U,W}.
(b) Show that, for any n x p matrix X and any n x q matrix U such that C(U) = C(X),

WP_{U,W} = WP_{X,W}.

Solution. (a) (1) According to Lemma 4.2.2, there exists a matrix F such that
U = XF. Thus, making use of Parts (1) and (5) of Theorem 14.12.25, we find that

WP_{X,W}U = WP_{X,W}XF = WXF = WU

and

U'WP_{X,W} = F'X'WP_{X,W} = F'X'W = U'W.

(2) Making use of Part (1), we find that

P_{U,W}P_{X,W} = U(U'WU)^-U'WP_{X,W} = U(U'WU)^-U'W = P_{U,W}

and similarly that

WP_{X,W}P_{U,W} = WP_{X,W}U(U'WU)^-U'W = WU(U'WU)^-U'W = WP_{U,W}.

Further, making use of Part (3) of Theorem 14.12.25, we find that

P_{X,W}'WP_{U,W} = (WP_{X,W})'P_{U,W} = WP_{X,W}P_{U,W}.

(b) Making use of Part (a), we find that

WP_{U,W} = W(P_{X,W}P_{U,W}) = WP_{X,W}.
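The projection matrix P_{X,W} = X(X'WX)^-X'W that underlies Exercises 45 through 51 can be formed directly (a sketch only, assuming numpy, with pinv supplying one choice of generalized inverse; the helper name projector is hypothetical). The checks below correspond to Parts (a)(1) and (a)(2) of this exercise.

import numpy as np

def projector(X, W):
    # P_{X,W} = X (X'WX)^- X'W
    return X @ np.linalg.pinv(X.T @ W @ X) @ X.T @ W

rng = np.random.default_rng(8)
X = rng.standard_normal((6, 3))
U = X @ rng.standard_normal((3, 2))        # C(U) is contained in C(X)
M = rng.standard_normal((6, 6))
W = M @ M.T                                # symmetric nonnegative definite

PX, PU = projector(X, W), projector(U, W)
print(np.allclose(W @ PX @ U, W @ U),      # Part (a)(1)
      np.allclose(PU @ PX, PU))            # Part (a)(2)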

EXERCISE 48. Let U represent a subspace of R^{n×1}, let A represent an n x n
matrix and W an n x n symmetric nonnegative definite matrix, and let X represent
any n x p matrix whose columns span U.
(a) Show that A is a projection matrix for U with respect to W if and only if

A = P_{X,W} + (I - P_{X,W})XK

for some p x n matrix K.


(b) Show that if A is a projection matrix for U with respect to W, then WA =
WP_{X,W}.
Solution. (a) Suppose that

A = P_{X,W} + (I - P_{X,W})XK

for some matrix K. Then, for any n-dimensional column vector y,

Ay = P_{X,W}y + (I - P_{X,W})X(Ky).

Thus, it follows from Corollary 14.12.27 that A is a projection matrix for U with
respect to W.
Conversely, suppose that A is a projection matrix for U with respect to W. Then,
Ay ∈ U for every y in R^n, so that C(A) ⊂ U = C(X) and hence A = XF for
some matrix F. Moreover, it follows from Parts (2) and (3) of Theorem 14.12.26
that WAy = WX(X'WX)^-X'Wy for every y in R^n [since one solution to linear
system (12.4) is (X'WX)^-X'Wy], implying that

WA = WX(X'WX)^-X'W

and hence that

WXF = WX(X'WX)^-X'W.                                              (S.8)

Since [according to Part (1) of Theorem 14.12.25] (X'WX)^-X' is a generalized
inverse of WX, we conclude, on the basis of Theorem 11.2.4 and Part (5) of
Theorem 14.12.25, that there exists a matrix K such that

F = (X'WX)^-X'W + [I - (X'WX)^-X'WX]K

and hence such that

A = X(X'WX)^-X'W + X[I - (X'WX)^-X'WX]K = P_{X,W} + (I - P_{X,W})XK.

(b) Suppose that A is a projection matrix for U with respect to W. Then, it
follows from Part (a) that

A = P_{X,W} + (I - P_{X,W})XK

for some matrix K. Thus, making use of Part (1) of Theorem 14.12.25, we find
that

WA = WP_{X,W} + (WX - WP_{X,W}X)K = WP_{X,W}.

EXERCISE 49. Let A represent an n x n matrix and W an n x n symmetric
nonnegative definite matrix.
(a) Show (by, e.g., using the results of Exercise 48) that if A'WA = WA [or,
equivalently, if (I - A)'WA = 0], then A is a projection matrix with respect
to W, and in particular A is a projection matrix for C(A) with respect to W,
and, conversely, show that if A is a projection matrix with respect to W, then
A'WA = WA.
(b) Show that if A is a projection matrix with respect to W, then in particular A
is a projection matrix for C(A) with respect to W.
(c) Show that A is a projection matrix with respect to W if and only if WA is
symmetric and WA^2 = WA.
Solution. (a) Suppose that A'WA = WA and hence that

A'W = (WA)' = (A'WA)' = A'WA.

Then,

A = P_{A,W} + A - A(A'WA)^-A'W
  = P_{A,W} + A - A(A'WA)^-A'WA
  = P_{A,W} + (I - P_{A,W})AI_n,

and it follows from Part (a) of Exercise 48 that A is a projection matrix for C(A)
with respect to W.
Conversely, suppose that A is a projection matrix with respect to W. Let U
represent any subspace of R^{n×1} for which A is a projection matrix with respect to
W, and let X represent any n x p matrix whose columns span U. Then, according
to Part (b) of Exercise 48,

WA = WP_{X,W},

and, making use of Part (6') of Theorem 14.12.25, we find that

A'WA = A'WP_{X,W} = (WA)'P_{X,W} = (WP_{X,W})'P_{X,W}
     = P_{X,W}'WP_{X,W} = WP_{X,W} = WA.

(b) Suppose that A is a projection matrix with respect to W. Then, it follows
from Part (a) that A'WA = WA, and we conclude [on the basis of Part (a)] that A
is a projection matrix for C(A) with respect to W.
(c) In light of Part (a), it suffices to show that A'WA = WA if and only if WA
is symmetric and WA^2 = WA.
If WA is symmetric and WA^2 = WA, then

A'WA = (WA)'A = WAA = WA^2 = WA.

Conversely, if A'WA = WA, then

(WA)' = (A'WA)' = A'WA = WA



(i.e., WA is symmetric), and

WA^2 = WAA = (WA)'A = A'WA = WA.

EXERCISE 50. Let U represent a subspace of R^{n×1}, let X represent an n x
p matrix whose columns span U, and let W and V represent n x n symmetric
nonnegative definite matrices. Show (by, e.g., making use of the result of Exercise
46) that each of the following two conditions is necessary and sufficient for every
projection of y on U with respect to W to be a projection (for every y in R^n) of y
on U with respect to V:
(a) X'VP_{X,W} = X'V, or, equivalently, X'V(I - P_{X,W}) = 0;
(b) there exists a p x p matrix Q such that VX = WXQ, or, equivalently, C(VX) ⊂
C(WX).

Solution. (a) It follows from the result of Exercise 46 that a vector z (in U) is a
projection of a vector y (in R^n) on U with respect to V if and only if

X'V(y - z) = 0.

Further, it follows from Corollary 14.12.27 that every projection of y on U with
respect to W is a projection (for every y in R^n) of y on U with respect to V if and
only if, for every y and every vector-valued function k(y),

X'V[y - P_{X,W}y - (I - P_{X,W})Xk(y)] = 0,

or, equivalently, if and only if, for every y and every vector-valued function k(y),

X'V(I - P_{X,W})[y - Xk(y)] = 0.                                   (S.9)

Thus, it suffices to show that condition (S.9) is satisfied for every y and every
vector-valued function k(y) if and only if X'V(I - P_{X,W}) = 0.
If X'V(I - P_{X,W}) = 0, then condition (S.9) is obviously satisfied. Conversely,
suppose that condition (S.9) is satisfied for every y and every vector-valued function
k(y). Then, since one choice for k(y) is k(y) ≡ 0,

X'V(I - P_{X,W})y = 0

for every y, implying that X'V(I - P_{X,W}) = 0.


(b) It suffices to show that Condition (b) is equivalent to Condition (a) or, equiv-
alently [since VX = (X'V)' and P_{X,W}'VX = (X'VP_{X,W})'], to the condition

VX = P_{X,W}'VX.                                                   (S.10)

If condition (S.10) is satisfied, then

VX = WX[(X'WX)^-]'X'VX = WXQ

for Q = [(X'WX)^-]'X'VX. Conversely, if VX = WXQ for some matrix Q, then,
making use of Part (4) of Theorem 14.12.25, we find that

P_{X,W}'VX = P_{X,W}'WXQ = WXQ = VX,

that is, condition (S.10) is satisfied.

EXERCISE 51. Let X represent an n x p matrix and W an n x n symmetric
nonnegative definite matrix. As in the special case where W is positive definite,
let

C_W^⊥(X) = {y ∈ R^{n×1} : y ⊥_W C(X)}.

(a) By, for example, making use of the result of Exercise 46, show that

C_W^⊥(X) = N(X'W) = C(I - P_{X,W}).

(b) Show that

dim[C_W^⊥(X)] = n - rank(WX) ≥ n - rank(X) = n - dim[C(X)].

(c) By, for example, making use of the result of Exercise 46, show that, for any
solution b* to the linear system X'WXb = X'Wy (in b), the vector y - Xb* is a
projection of y on C_W^⊥(X) with respect to W.
Solution. (a) It follows from the result of Exercise 46 that an n-dimensional column
vector y and C(X) are orthogonal with respect to W if and only if X'Wy = 0. Thus,
C^⊥_W(X) = N(X'W). Moreover, since [according to Part (5) of Theorem 14.12.25]
X(X'WX)^- is a generalized inverse of X'W, we have (in light of Corollary 11.2.2)
that N(X'W) = C(I - P_{X,W}).
(b) Making use of Part (a), together with Part (10) of Theorem 14.12.25 and
Corollary 4.4.5, we find that

dim[C^⊥_W(X)] = dim[C(I - P_{X,W})]
             = rank(I - P_{X,W})
             = n - rank(WX) ≥ n - rank(X) = n - dim[C(X)].

(c) Let z = Xb*. According to Theorem 14.12.26, z is a projection of y on C(X)
with respect to W. Thus, (y - z) ⊥_W C(X), and, consequently, (y - z) ∈ C^⊥_W(X). It
remains to show that [y - (y - z)] ⊥_W C^⊥_W(X) or, equivalently, that z ⊥_W C^⊥_W(X).
Making use of Part (4) of Theorem 14.12.25, we find that

(I - P_{X,W})'Wz = (I - P_{X,W})'WXb* = 0,

implying (in light of the result of Exercise 46) that

z ⊥_W C(I - P_{X,W})

and hence [in light of the result of Part (a)] that

z ⊥_W C^⊥_W(X).
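Part (c) is easy to illustrate numerically. The sketch below (not part of the original solution; it assumes NumPy and a randomly generated symmetric nonnegative definite W) solves the linear system X'WXb = X'Wy via the Moore-Penrose inverse and confirms that y - Xb* is W-orthogonal to C(X).

import numpy as np

rng = np.random.default_rng(1)
n, p = 7, 3
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
L = rng.standard_normal((n, n))
W = L @ L.T                                             # symmetric nonnegative definite W

b_star = np.linalg.pinv(X.T @ W @ X) @ X.T @ W @ y      # one solution of X'WXb = X'Wy
r = y - X @ b_star
print(np.allclose(X.T @ W @ r, 0))                      # y - Xb* is W-orthogonal to C(X)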
15
Matrix Differentiation

EXERCISE 1. Using the result of Part (c) of Exercise 6.2, verify that every
neighborhood of a point x in R^{m×1} is an open set.
Solution. Take the norm for R^{m×1} to be the usual norm, let N represent the
neighborhood of x of radius r, and let y represent an arbitrary point in N. Further,
take M to be the neighborhood of y of radius

s = r - ||y - x||,

and let z represent an arbitrary point in M. Then, using the result of Part (c) of
Exercise 6.2, we find that

||z - x|| ≤ ||z - y|| + ||y - x||
         < s + ||y - x||
         = r - ||y - x|| + ||y - x||
         = r,

implying that z ∈ N. It follows that M ⊂ N and hence that y is an interior point
of N. We conclude that N is an open set.

EXERCISE 2. Let f represent a function, defined on a set S, of a vector x =


(XI, ... , xm)' of m variables, suppose that the set S contains at least some interior
points, and let c represent an arbitrary one of those points. Verify that if f is k
times continuously differentiable at c, then f is k times continuously differentiable
at every point in some neighborhood of c.

Solution. Suppose that f is k times continuously differentiable at c. Then, there
exists a neighborhood N of c such that all of the first- through kth-order partial
derivatives of f exist and are continuous at every point in N.
Let x* represent an arbitrary point in N. Since any neighborhood is an open
set, x* is an interior point of N. Thus, there exists a neighborhood N* of x*, all
of whose points belong to N. It follows that the first- through kth-order partial
derivatives of f exist and are continuous at every point in N* and hence that f is
k times continuously differentiable at x*.

EXERCISE 3. Let X = {Xij} represent an m x n matrix of mn variables, and let x


represent an mn-dimensional column vector obtained by rearranging the elements
of X (in the form of a column vector). Further, let S represent a set of X-values,
and let S* represent the corresponding set of x-values (i.e., the set obtained by
rearranging the elements of each m x n matrix in S in the form of a column
vector). Verify that an mn-dimensional column vector is an interior point of S* if
and only if it is a rearrangement of an m x n matrix that is an interior point of S.
Solution. Let C = {c_ij} represent a value of X, and let c represent the correspond-
ing value of x. Then (when the inner products for R^{mn×1} and R^{m×n} are taken to
be the usual inner products)

||x - c|| = [(x - c)'(x - c)]^{1/2} = [Σ_{i,j} (x_ij - c_ij)^2]^{1/2}
          = {tr[(X - C)'(X - C)]}^{1/2}
          = ||X - C||.

Thus, a set of X-values is a neighborhood of C of radius r if and only if the


corresponding set of x-values is a neighborhood of c of radius r. It follows that
there exists a neighborhood of c, all of whose points belong to S*, if and only if
there exists a neighborhood of C, all of whose points belong to S. We conclude
that c is an interior point of S* if and only if C is an interior point of S.

EXERCISE 4. Let f represent a function whose domain is a set S in R^{m×1} (that
contains at least some interior points). Show that the Hessian matrix Hf of f is
the gradient matrix of the gradient vector (Df)' of f.
Solution. The gradient vector of f is (Df)' = (D_1 f, ..., D_m f)'. The gradient
matrix of this vector is the m x m matrix whose ijth element is the ith (first-order)
partial derivative D^2_{ij} f of D_j f, which by definition is the Hessian matrix of f.

EXERCISE 5. Let g represent a function, defined on a set S, of a vector x =
(x_1, ..., x_m)' of m variables, let S* = {x ∈ S : g(x) ≠ 0}, and let c represent any
interior point of S* at which g is continuously differentiable. Show that (for any
positive integer k) g^{-k} is continuously differentiable at c and

∂(g^{-k})/∂x_j = -k g^{-k-1} ∂g/∂x_j .

Do so based on the result that

∂(1/g)/∂x_j = -(1/g)^2 ∂g/∂x_j ,    (*)

and, letting f (like g) represent a function (defined on S) of x, the results that
if f is continuously differentiable at c, then the ratio f/g is also continuously
differentiable at c, that, for any positive integer k, f^k is continuously differentiable
at any point at which f is continuously differentiable, and that

∂(f^k)/∂x_j = k f^{k-1} ∂f/∂x_j .    (**)

Solution. In light of the results cited in the statement of the exercise (or equiva-
lently in light of Lemma 15.2.2 and the ensuing discussion), we have that 1/g is
continuously differentiable at c and that, as a consequence, (1/g)^k or equivalently
g^{-k} is continuously differentiable at c. Moreover, using results (**) and (*) [or,
equivalently, results (2.16) and (2.8)], we find that

∂(g^{-k})/∂x_j = ∂[(1/g)^k]/∂x_j = k(1/g)^{k-1} ∂(1/g)/∂x_j = k(1/g)^{k-1}(-1)(1/g^2) ∂g/∂x_j
              = -k(1/g)^{k+1} ∂g/∂x_j
              = -k g^{-k-1} ∂g/∂x_j .

EXERCISE 6. Let F represent a p x p matrix of functions, defined on a set S, of
a vector x = (x_1, ..., x_m)' of m variables. Let c represent any interior point of S
at which F is continuously differentiable. Show that if F is idempotent at all points
in some neighborhood of c, then (at x = c)

F (∂F/∂x_j) F = 0    (j = 1, ..., m).
Solution. Suppose that F is idempotent at all points in some neighborhood of c.
Then, differentiating both sides of the equality F = FF [with the help of result
(4.3)], we find that (at x = c)

∂F/∂x_j = (∂F/∂x_j)F + F(∂F/∂x_j).    (S.1)

Premultiplying both sides of equality (S.1) by F gives

F(∂F/∂x_j) = F(∂F/∂x_j)F + F(∂F/∂x_j),

or equivalently

F (∂F/∂x_j) F = 0.

EXERCISE 7. Let g represent a function, defined on a set S, of a vector x
= (x_1, ..., x_m)' of m variables, and let f represent a p x 1 vector of functions
(defined on S) of x. Let c represent any interior point (of S) at which g and f are
continuously differentiable. Show that gf is continuously differentiable at c and
that (at x = c)

∂(gf)/∂x' = f ∂g/∂x' + g ∂f/∂x' .
Solution. It follows from result (4.9) (and the discussion thereof) that gf is con-
tinuously differentiable at c and that (at x = c)

∂(gf)/∂x_j = (∂g/∂x_j)f + g(∂f/∂x_j)

(j = 1, ..., m). Moreover, since ∂(gf)/∂x_j, (∂g/∂x_j)f, and g(∂f/∂x_j) are the
jth columns of ∂(gf)/∂x', f(∂g/∂x'), and g(∂f/∂x'), respectively, we have that

∂(gf)/∂x' = f ∂g/∂x' + g ∂f/∂x' .

EXERCISE 8. (a) Let X = {x_ij} represent an m x n matrix of mn "independent"
variables, and suppose that X is free to range over all of R^{m×n}.
(1) Show that, for any p x m and n x p matrices of constants A and B,

∂tr(AXB)/∂X = A'B'.

[Hint. Observe that tr(AXB) = tr(BAX).]
(2) Show that, for any m- and n-dimensional column vectors a and b,

∂(a'Xb)/∂X = ab'.

[Hint. Observe that a'Xb = tr(a'Xb).]
(b) Suppose now that X is a symmetric (but otherwise unrestricted) matrix (of
dimensions m x m).
(1) Show that, for any p x m and m x p matrices of constants A and B,

∂tr(AXB)/∂X = C + C' - diag(c_11, c_22, ..., c_mm),

where C = {c_ij} = BA.
(2) Show that, for any m-dimensional column vectors a = {a_i} and b = {b_i},

∂(a'Xb)/∂X = ab' + ba' - diag(a_1 b_1, a_2 b_2, ..., a_m b_m).

Solution. (a) (1) Using result (6.5), we find that

∂tr(AXB)/∂X = ∂tr(BAX)/∂X = (BA)' = A'B'.

(2) Observing that a'Xb = tr(a'Xb) and applying Part (1) (with A = a' and
B = b), we find that

∂(a'Xb)/∂X = ab'.

(b) (1) Using result (6.7), we find that

∂tr(AXB)/∂X = ∂tr(BAX)/∂X = C + C' - diag(c_11, c_22, ..., c_mm).

(2) Observing that a'Xb = tr(a'Xb) and applying Part (1) (with A = a' and
B = b), we find that

∂(a'Xb)/∂X = ba' + ab' - diag(a_1 b_1, a_2 b_2, ..., a_m b_m).
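Because tr(AXB) is linear in X, the formula in Part (a)(1) says that tr(AXB) equals the sum of the elementwise products of X with A'B'. A quick numerical confirmation (an illustrative sketch assuming NumPy; it is not part of the original solution):

import numpy as np

rng = np.random.default_rng(2)
m, n, p = 4, 5, 3
A = rng.standard_normal((p, m))
B = rng.standard_normal((n, p))
X = rng.standard_normal((m, n))

grad = A.T @ B.T                                 # claimed value of d tr(AXB) / dX
print(np.allclose(np.trace(A @ X @ B), np.sum(grad * X)))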

EXERCISE 9. (a) Let X = {x_st} represent an m x n matrix of "independent"
variables, and suppose that X is free to range over all of R^{m×n}. Show that, for any
n x m matrix of constants A,

∂tr[(AX)^2]/∂X = 2(AXA)'.

(b) Let X = {x_st} represent an m x m symmetric (but otherwise unrestricted)
matrix of variables. Show that, for any m x m matrix of constants A,

∂tr[(AX)^2]/∂X = 2[B + B' - diag(b_11, b_22, ..., b_mm)],

where B = {b_st} = AXA.


Solution. Let Uj represent the jth column of an identity matrix (of unspecified
dimensions).

(a) According to results (4.7) and (5.3),

∂(AXA)/∂x_ij = A (∂X/∂x_ij) A = A u_i u_j' A.

Thus, making use of results (6.3), (5.3), and (5.2.3), we find that

∂tr[(AX)^2]/∂x_ij = ∂tr[(AXA)X]/∂x_ij
                  = tr(AXA ∂X/∂x_ij) + tr[X ∂(AXA)/∂x_ij]
                  = tr(AXA u_i u_j') + tr(XA u_i u_j' A)
                  = 2 tr(u_j' AXA u_i)
                  = 2 u_j' AXA u_i .

Since u_j' AXA u_i is the jith element of AXA or, equivalently, the ijth element of
(AXA)', we find that

∂tr[(AX)^2]/∂X = 2(AXA)'.
(b) For purposes of differentiating tr[(AX)^2], tr[(AX)^2] is interpreted as a
function of an m(m+1)/2-dimensional column vector x whose elements are x_ij
(j ≤ i = 1, ..., m).
According to results (4.7), (5.6), and (5.7),

∂(AXA)/∂x_ij = A (∂X/∂x_ij) A = { A u_i u_i' A,              if j = i,
                                  A(u_i u_j' + u_j u_i')A,   if j < i,

and, according to result (6.3),

∂tr[(AX)^2]/∂x_ij = ∂tr[(AXA)X]/∂x_ij = tr(AXA ∂X/∂x_ij) + tr[X ∂(AXA)/∂x_ij].

Thus, making use of results (5.6), (5.7), and (5.2.3), we find that

∂tr[(AX)^2]/∂x_ii = tr(AXA u_i u_i') + tr(XA u_i u_i' A)
                  = 2 tr(u_i' AXA u_i)
                  = 2 u_i' B u_i

and that (for j < i)

∂tr[(AX)^2]/∂x_ij = tr[AXA(u_i u_j' + u_j u_i')] + tr[XA(u_i u_j' + u_j u_i')A]
                  = 2 tr[AXA(u_i u_j' + u_j u_i')]
                  = 2 tr(AXA u_i u_j') + 2 tr(AXA u_j u_i')
                  = 2 tr(u_j' AXA u_i) + 2 tr(u_i' AXA u_j)
                  = 2 u_j' B u_i + 2 u_i' B u_j .

Since (for i, j = 1, ..., m) u_j' B u_i is the jith element of B or, equivalently, the
ijth element of B' and since u_i' B u_j is the ijth element of B, we conclude that

∂tr[(AX)^2]/∂X = 2[B + B' - diag(b_11, b_22, ..., b_mm)].
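An illustrative finite-difference check of the formula in Part (a) (a sketch assuming NumPy; it is not part of the original solution):

import numpy as np

rng = np.random.default_rng(3)
m, n = 4, 3
A = rng.standard_normal((n, m))
X = rng.standard_normal((m, n))

f = lambda X: np.trace(A @ X @ A @ X)            # tr[(AX)^2]
grad = 2 * (A @ X @ A).T                         # claimed derivative 2(AXA)'
eps = 1e-6
num = np.zeros_like(X)
for i in range(m):
    for j in range(n):
        E = np.zeros_like(X)
        E[i, j] = eps
        num[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
print(np.allclose(num, grad, atol=1e-4))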

EXERCISE 10. Let X = {x_ij} represent an m x m matrix of "independent"
variables, and suppose that X is free to range over all of R^{m×m}. Show that, for
k = 2, 3, ...,

∂tr(X^k)/∂X = k[(X')^{k-1}].
Solution. Let u_j represent the jth column of I_m. Then, making use of results (6.1),
(4.8), (5.2.3), and (5.3), we find that

∂tr(X^k)/∂x_ij = tr(∂X^k/∂x_ij)
               = tr[X^{k-1} ∂X/∂x_ij + X^{k-2} (∂X/∂x_ij) X + ... + (∂X/∂x_ij) X^{k-1}]
               = tr(X^{k-1} ∂X/∂x_ij) + tr[X^{k-2} (∂X/∂x_ij) X] + ... + tr[(∂X/∂x_ij) X^{k-1}]
               = tr(X^{k-1} ∂X/∂x_ij) + tr(X^{k-1} ∂X/∂x_ij) + ... + tr(X^{k-1} ∂X/∂x_ij)
               = k tr(X^{k-1} ∂X/∂x_ij)
               = k tr(X^{k-1} u_i u_j')
               = k tr(u_j' X^{k-1} u_i)
               = k (u_j' X^{k-1} u_i).

Moreover, u_j' X^{k-1} u_i equals the jith element of X^{k-1} or equivalently the ijth
element of (X^{k-1})'. Since (X^{k-1})' = (X')^{k-1}, we conclude that ∂tr(X^k)/∂x_ij,
which is the ijth element of ∂tr(X^k)/∂X, equals the ijth element of k(X')^{k-1} and
hence that

∂tr(X^k)/∂X = k[(X')^{k-1}].
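An illustrative finite-difference check of this formula for k = 3 (a sketch assuming NumPy; it is not part of the original solution):

import numpy as np

rng = np.random.default_rng(4)
m, k = 4, 3
X = rng.standard_normal((m, m))

f = lambda X: np.trace(np.linalg.matrix_power(X, k))
grad = k * np.linalg.matrix_power(X.T, k - 1)    # claimed derivative k(X')^{k-1}
eps = 1e-6
num = np.zeros_like(X)
for i in range(m):
    for j in range(m):
        E = np.zeros_like(X)
        E[i, j] = eps
        num[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
print(np.allclose(num, grad, atol=1e-4))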

EXERCISE 11. Let X = {x_st} represent an m x n matrix of "independent"
variables, and suppose that X is free to range over all of R^{m×n}.
(a) Show that, for any m x m matrix of constants A,

∂tr(X'AX)/∂X = (A + A')X.

(b) Show that, for any n x n matrix of constants A,

∂tr(XAX')/∂X = X(A + A').

(c) Show (in the special case where n = m) that, for any m x m matrix of
constants A,

∂tr(XAX)/∂X = (AX)' + (XA)'.

(d) Use the results of Parts (a)-(c) to devise simple formulas for ∂tr(X'X)/∂X,
∂tr(XX')/∂X, and (in the special case where n = m) ∂tr(X^2)/∂X.
Solution. Let u_j represent the jth column of an identity matrix (of unspecified
dimensions).
(a) Since tr(X'AX) = tr(AXX'), it follows from result (6.2) that

∂tr(X'AX)/∂x_ij = tr[A ∂(XX')/∂x_ij].

Thus, making use of results (4.3), (4.10), (5.3), and (5.2.3), we find that

∂tr(X'AX)/∂x_ij = tr[AX (∂X/∂x_ij)'] + tr(A (∂X/∂x_ij) X')
                = tr(AX u_j u_i') + tr(A u_i u_j' X')
                = tr(u_i' AX u_j) + tr(u_j' X'A u_i)
                = u_i' AX u_j + u_j' X'A u_i .

Since u_i' AX u_j is the ijth element of AX and since u_j' X'A u_i is the jith element
of X'A or equivalently the ijth element of (X'A)', we conclude that

∂tr(X'AX)/∂X = AX + (X'A)' = (A + A')X.

(b) Since tr(XAX') = tr(AX'X), it follows from result (6.2) that

∂tr(XAX')/∂x_ij = tr[A ∂(X'X)/∂x_ij].

Thus, making use of results (4.3), (5.3), and (5.2.3), we find that

∂tr(XAX')/∂x_ij = tr(AX' ∂X/∂x_ij) + tr[A (∂X/∂x_ij)' X]
                = tr(AX' u_i u_j') + tr(A u_j u_i' X)
                = u_j' AX' u_i + u_i' XA u_j .

Since u_j' AX' u_i is the jith element of AX' or equivalently the ijth element of
(AX')' and since u_i' XA u_j is the ijth element of XA, we conclude that

∂tr(XAX')/∂X = (AX')' + XA = X(A + A').
(c) Since tr(XAX) = tr(AXX), it follows from result (6.2) that

∂tr(XAX)/∂x_ij = tr[A ∂(XX)/∂x_ij].

Thus, making use of results (4.3), (5.3), and (5.2.3), we find that

∂tr(XAX)/∂x_ij = tr(AX ∂X/∂x_ij) + tr(A (∂X/∂x_ij) X)
               = tr(AX u_i u_j') + tr(A u_i u_j' X)
               = u_j' AX u_i + u_j' XA u_i .

Since u_j' AX u_i is the jith element of AX or equivalently the ijth element of (AX)'
and since u_j' XA u_i is the jith element of XA or equivalently the ijth element of
(XA)', we conclude that

∂tr(XAX)/∂X = (AX)' + (XA)'.

(d) Upon setting A = I in the formulas from Parts (a)-(c), we find that

∂tr(X'X)/∂X = ∂tr(XX')/∂X = 2X

and (in the special case where n = m)

∂tr(X^2)/∂X = 2X'.
EXERCISE 12. Let X = {x_ij} represent an m x m matrix. Let f represent
a function of X defined on a set S comprising some or all m x m symmetric
matrices. Suppose that, for purposes of differentiation, f is to be interpreted as a
function of the [m(m+1)/2]-dimensional column vector x whose elements are x_ij
(j ≤ i = 1, ..., m). Suppose further that there exists a function g, whose domain
is a set T of not-necessarily-symmetric matrices that contains S as a proper subset,
such that g(X) = f(X) for X ∈ S, so that g is a function of X and f is the
function obtained by restricting the domain of g to S. Define S* = {x : X ∈ S}.
Let c represent an interior point of S*, and let C represent the corresponding value
of X. Show that if C is an interior point of T and if g is continuously differentiable
at C, then f is continuously differentiable at c and that (at x = c)

∂f/∂X = ∂g/∂X + (∂g/∂X)' - diag(∂g/∂x_11, ∂g/∂x_22, ..., ∂g/∂x_mm).

Solution. Let H represent the m x m matrix of functions defined, on S*, by H(x) =
X. Then, H is continuously differentiable at c. Thus, it follows from the results of
Section 15.7 that if C is an interior point of T and if g is continuously differentiable
at C, then f is continuously differentiable at c and (at x = c)

∂f/∂x_ij = tr[(∂g/∂X)' ∂H/∂x_ij]    (j ≤ i = 1, ..., m).

Moreover, in light of results (5.6), (5.7), and (5.2.3), we have that

∂f/∂x_ii = tr[(∂g/∂X)' u_i u_i'] = u_i'(∂g/∂X)'u_i

and that (for j < i)

∂f/∂x_ij = tr[(∂g/∂X)'(u_i u_j' + u_j u_i')] = u_j'(∂g/∂X)'u_i + u_i'(∂g/∂X)'u_j .

Since u_i'(∂g/∂X)'u_i is the ith diagonal element of (∂g/∂X)' (or equivalently the
ith diagonal element of ∂g/∂X) and since u_j'(∂g/∂X)'u_i and u_i'(∂g/∂X)'u_j are
the ijth elements of ∂g/∂X and (∂g/∂X)', respectively, it follows that (at x = c)

∂f/∂X = ∂g/∂X + (∂g/∂X)' - diag(∂g/∂x_11, ∂g/∂x_22, ..., ∂g/∂x_mm).
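The relationship between the "unrestricted" and "symmetric" derivatives can be illustrated numerically. In the sketch below (assuming NumPy; not part of the original solution), g(X) = tr(AXB) plays the role of g, its unrestricted derivative ∂g/∂X = A'B' comes from Exercise 8, and the symmetric-matrix derivative is checked by finite differences in the m(m+1)/2 free elements x_ij (j ≤ i).

import numpy as np

rng = np.random.default_rng(5)
m = 4
A = rng.standard_normal((3, m))
B = rng.standard_normal((m, 3))
S = rng.standard_normal((m, m))
X = S + S.T                                        # a symmetric point

G = A.T @ B.T                                      # dg/dX for unrestricted X (Exercise 8)
claim = G + G.T - np.diag(np.diag(G))              # the formula of this exercise
num = np.zeros((m, m))
eps = 1e-6
for i in range(m):
    for j in range(i + 1):
        E = np.zeros((m, m))
        E[i, j] = eps
        E[j, i] = eps                              # perturb x_ij; X stays symmetric
        num[i, j] = (np.trace(A @ (X + E) @ B) - np.trace(A @ (X - E) @ B)) / (2 * eps)
print(np.allclose(num, np.tril(claim), atol=1e-4))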

EXERCISE 13. Let h = {h_i} represent an n x 1 vector of functions, defined on
a set S, of a vector x = (x_1, ..., x_m)' of m variables. Let g represent a function,
defined on a set T, of a vector y = (y_1, ..., y_n)' of n variables. Suppose that
h(x) ∈ T for every x in S, and take f to be the composite function defined (on
S) by f(x) = g[h(x)]. Show that if h is twice continuously differentiable at an
interior point c of S and if [assuming that h(c) is an interior point of T] g is twice
continuously differentiable at h(c), then f is twice continuously differentiable at
c and

Hf(c) = [Dh(c)]'Hg[h(c)]Dh(c) + Σ_{i=1}^{n} D_i g[h(c)] Hh_i(c).

Solution. Suppose that h is twice continuously differentiable at c or equivalently
that h_1, ..., h_n are twice continuously differentiable at c. Suppose further that g
is twice continuously differentiable at h(c).
Then, g is continuously differentiable at h(c) and hence is continuously differ-
entiable at every point in some neighborhood N_g of h(c). Moreover, h_1, ..., h_n
are continuously differentiable at c and (in light of Lemma 15.1.1) continuous at
c. Consequently, there exists a neighborhood N_h of c such that (i) h_1, ..., h_n are
continuously differentiable at every point in N_h and (ii) h(x) ∈ N_g for every x in
N_h.
Thus, it follows from Theorem 15.7.1 that f is continuously differentiable at
every x in N_h and that (for x ∈ N_h)

D_j f(x) = Σ_{i=1}^{n} u_i(x) D_j h_i(x),

where u_i(x) = D_i g[h(x)]. Since D_i g is continuously differentiable at h(c), we
have (as a further consequence of Theorem 15.7.1) that

D_s u_i(c) = Σ_{k=1}^{n} D^2_{ki} g[h(c)] D_s h_k(c).

Since D_j h_i is continuously differentiable at c, we conclude that D_j f (like f)
is continuously differentiable at c and hence that f is twice continuously differ-
entiable at c. Moreover,

D^2_{sj} f(c) = Σ_{i=1}^{n} [u_i(c) D^2_{sj} h_i(c) + D_s u_i(c) D_j h_i(c)]
            = Σ_{i=1}^{n} D_i g[h(c)] D^2_{sj} h_i(c) + Σ_{k=1}^{n} Σ_{i=1}^{n} D^2_{ki} g[h(c)] D_s h_k(c) D_j h_i(c)
            = Σ_{i=1}^{n} D_i g[h(c)] D^2_{sj} h_i(c) + [D_s h(c)]'Hg[h(c)] D_j h(c).

To complete the argument, observe that D^2_{sj} h_i(c) is the sjth element of Hh_i(c)
and that [D_s h(c)]'Hg[h(c)] D_j h(c) is the sjth element of [Dh(c)]'Hg[h(c)]Dh(c)
and hence that

Hf(c) = [Dh(c)]'Hg[h(c)]Dh(c) + Σ_{i=1}^{n} D_i g[h(c)] Hh_i(c).

EXERCISE 14. Let X = {x_ij} represent an m x m matrix of m^2 "independent"
variables (where m ≥ 2), and suppose that the range of X comprises all of R^{m×m}.
Show that (for any positive integer k) the function f defined (on R^{m×m}) by f(X) =
|X|^k is continuously differentiable at every X and that

∂|X|^k/∂X = k|X|^{k-1}[adj(X)]'.
Solution. For purposes of differentiation, rearrange the elements of X in the form
of an m^2-dimensional column vector x, and reinterpret f as a function of x (in
which case the domain of f comprises all of R^{m^2}). Let h represent a function of
x defined (on R^{m^2}) by h(x) = det(X), let g represent a function of a variable y
defined (for all y) by g(y) = y^k, and express f as the composite of g and h, so
that f(x) = g[h(x)].
The function g is continuously differentiable at every y, and

dg/dy = k y^{k-1}.

And, the function h is continuously differentiable at every x, and

∂h/∂x_ij = ξ_ij,

where ξ_ij is the cofactor of the ijth element x_ij of X. Thus, it follows from the
chain rule that f is continuously differentiable at every x (or equivalently at every
X) and that

∂f/∂x_ij = k|X|^{k-1} ξ_ij,

or equivalently that

∂f/∂X = k|X|^{k-1}[adj(X)]'.
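An illustrative finite-difference check of this formula (a sketch assuming NumPy; not part of the original solution). The adjoint is computed here as det(X) X^{-1}, which is valid for the nonsingular X generated below.

import numpy as np

rng = np.random.default_rng(6)
m, k = 4, 2
X = rng.standard_normal((m, m))

adjX = np.linalg.det(X) * np.linalg.inv(X)       # adj(X) for nonsingular X
grad = k * np.linalg.det(X) ** (k - 1) * adjX.T  # claimed derivative k|X|^{k-1} adj(X)'
f = lambda X: np.linalg.det(X) ** k
eps = 1e-6
num = np.zeros_like(X)
for i in range(m):
    for j in range(m):
        E = np.zeros_like(X)
        E[i, j] = eps
        num[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
print(np.allclose(num, grad, atol=1e-4))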
EXERCISE 15. Let F = {f_is} represent a p x p matrix of functions, defined on
a set S, of a vector x = (x_1, ..., x_m)' of m variables. Let c represent any interior
point (of S) at which F is continuously differentiable. Use the results of Exercise
13.10 to show that (a) if rank[F(c)] = p - 1, then (at x = c)

∂det(F)/∂x_j = k y'(∂F/∂x_j)z,

where z = {z_s} and y = {y_i} are any nonnull p-dimensional vectors such that
F(c)z = 0 and [F(c)]'y = 0 and where [letting φ_is represent the cofactor of
f_is(c)] k is a scalar that is expressible as k = φ_is/(y_i z_s) for any i and s such that
y_i ≠ 0 and z_s ≠ 0; and (b) if rank[F(c)] ≤ p - 2, then (at x = c)

∂det(F)/∂x_j = 0.

Solution. Recall that det(F) is continuously differentiable at c and that (at x = c)

∂det(F)/∂x_j = tr[adj(F) ∂F/∂x_j].

(a) Suppose that rank[F(c)] = p - 1. Then, according to the result of Part (a)
of Exercise 13.10,

adj[F(c)] = kzy',

so that (at x = c)

∂det(F)/∂x_j = tr(kzy' ∂F/∂x_j) = k tr[y'(∂F/∂x_j)z] = k y'(∂F/∂x_j)z.

(b) Suppose that rank[F(c)] ≤ p - 2. Then, it follows from the result of Part
(b) of Exercise 13.10 that (at x = c)

∂det(F)/∂x_j = tr(0 ∂F/∂x_j) = 0.

EXERCISE 16. Let X = {x_st} represent an m x n matrix of "independent"
variables, let A represent an m x m matrix of constants, and suppose that the range
of X is a set S comprising some or all X-values for which det(X'AX) > 0. Show
that log det(X'AX) is continuously differentiable at any interior point C of S and
that (at X = C)

∂ log det(X'AX)/∂X = AX(X'AX)^{-1} + [(X'AX)^{-1}X'A]'.
Solution. For purposes of differentiating a function of X, rearrange the elements of
X in the form of an mn-dimensional column vector x and reinterpret the function
as a function of x, in which case the domain of the function is the set S* obtained
by rearranging the elements of each m x n matrix in S in the form of a column
vector.
Let c represent the value of x corresponding to the interior point C of S (and
note that c is an interior point of S*). Since X is continuously differentiable at c,
X' AX is continuously differentiable at c, and hence log det(X' AX) is continuously
differentiable at c (or equivalently at C).

Moreover, making use of results (8.6), (4.6), (4.10), (5.3), and (5.2.3) and letting
u_j represent the jth column of I_m or I_n, we find that (at x = c)

∂ log det(X'AX)/∂x_ij = tr[(X'AX)^{-1} ∂(X'AX)/∂x_ij]
                      = tr{(X'AX)^{-1}[X'A ∂X/∂x_ij + (∂X/∂x_ij)'AX]}
                      = tr[(X'AX)^{-1}X'A u_i u_j'] + tr[(X'AX)^{-1} u_j u_i' AX]
                      = u_j'(X'AX)^{-1}X'A u_i + u_i' AX(X'AX)^{-1} u_j .

Upon observing that u_i' AX(X'AX)^{-1} u_j and u_j'(X'AX)^{-1}X'A u_i are the ijth ele-
ments of AX(X'AX)^{-1} and [(X'AX)^{-1}X'A]', respectively, we conclude that (at
x = c)

∂ log det(X'AX)/∂X = AX(X'AX)^{-1} + [(X'AX)^{-1}X'A]'.
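An illustrative finite-difference check of this derivative (a sketch assuming NumPy; not part of the original solution). To keep det(X'AX) > 0 in the check, A is taken to be symmetric positive definite, a special case of the exercise.

import numpy as np

rng = np.random.default_rng(7)
m, n = 5, 3
A0 = rng.standard_normal((m, m))
A = A0 @ A0.T                                    # symmetric pd, so det(X'AX) > 0
X = rng.standard_normal((m, n))

f = lambda X: np.log(np.linalg.det(X.T @ A @ X))
M = np.linalg.inv(X.T @ A @ X)
grad = A @ X @ M + (M @ X.T @ A).T               # claimed derivative
eps = 1e-6
num = np.zeros_like(X)
for i in range(m):
    for j in range(n):
        E = np.zeros_like(X)
        E[i, j] = eps
        num[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
print(np.allclose(num, grad, atol=1e-4))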
EXERCISE 17. (a) Let X represent an m x n matrix of "independent" variables,
let A and B represent q x m and n x q matrices of constants, and suppose that the
range of X is a set S comprising some or all X-values for which det(AXB) > 0.
Show that log det(AXB) is continuously differentiable at any interior point C of
S and that (at X = C)

∂ log det(AXB)/∂X = [B(AXB)^{-1}A]'.

(b) Suppose now that X is an m x m symmetric matrix; that A and B are q x m and
m x q matrices of constants; that, for purposes of differentiating any function of X,
the function is to be interpreted as a function of the column vector x whose elements
are x_ij (j ≤ i = 1, ..., m); and that the range of x is a set S comprising some
or all x-values for which det(AXB) > 0. Show that log det(AXB) is continuously
differentiable at any interior point c (of S) and that (at x = c)

∂ log det(AXB)/∂X = K + K' - diag(k_11, k_22, ..., k_mm),

where K = {k_ij} = B(AXB)^{-1}A.


Solution. (a) For purposes of differentiating a function of X, rearrange the elements
of X in the form of an mn-dimensional column vector x and reinterpret the function
as a function of x, in which case the domain of the function is the set S* obtained
by rearranging the elements of each m x n matrix in S in the form of a column
vector.
Let c represent the value of x corresponding to the interior point C of S (and
note that c is an interior point of S*). Then, X is continuously differentiable at c,
implying that AXB is continuously differentiable at c and hence that log det(AXB)
is continuously differentiable at c (or equivalently at C).

Moreover, in light of results (8.6), (4.7), (5.3), and (5.2.3), we have that (at
x = c)

∂ log det(AXB)/∂x_ij = tr[(AXB)^{-1} ∂(AXB)/∂x_ij] = tr[(AXB)^{-1} A (∂X/∂x_ij) B]
                     = tr[(AXB)^{-1} A u_i u_j' B]
                     = u_j' B(AXB)^{-1} A u_i

and hence {since u_j' B(AXB)^{-1} A u_i is the ijth element of [B(AXB)^{-1}A]'} that (at
x = c)

∂ log det(AXB)/∂X = [B(AXB)^{-1}A]'.

(b) By employing essentially the same reasoning as in Part (a), it can be estab-
lished that log det(AXB) is continuously differentiable at the interior point c and
that (at x = c)

∂ log det(AXB)/∂x_ij = tr[(AXB)^{-1} A (∂X/∂x_ij) B].

Moreover, in light of results (5.6), (5.7), and (5.2.3), we have that

∂ log det(AXB)/∂x_ii = tr[(AXB)^{-1} A u_i u_i' B] = u_i' B(AXB)^{-1} A u_i = u_i' K u_i

and that (for j < i)

∂ log det(AXB)/∂x_ij = tr[(AXB)^{-1} A (u_i u_j' + u_j u_i') B] = u_j' K u_i + u_i' K u_j .

Since u_i' K u_i is the ith diagonal element of K and since u_i' K u_j and u_j' K u_i are the
ijth elements of K and K', respectively, it follows that (at x = c)

∂ log det(AXB)/∂X = K + K' - diag(k_11, k_22, ..., k_mm).

EXERCISE 18. Let F = {f_is} represent a p x p matrix of functions, defined on a
set S, of a vector x = (x_1, ..., x_m)' of m variables, and let A and B represent q x p
and p x q matrices of constants. Suppose that S is the set of all x-values for which
F(x) is nonsingular and det[AF^{-1}(x)B] > 0 or is a subset of that set. Show that if
F is continuously differentiable at an interior point c of S, then log det(AF^{-1}B) is
continuously differentiable at c and (at x = c)

∂ log det(AF^{-1}B)/∂x_j = -tr[F^{-1}B(AF^{-1}B)^{-1}AF^{-1} ∂F/∂x_j].

Solution. Suppose that F is continuously differentiable at c. Then, in light of
the results of Section 15.8, AF^{-1}B is continuously differentiable at c and hence
log det(AF^{-1}B) is continuously differentiable at c. Moreover, making use of results
(8.6), (8.18), and (5.2.3), we find that (at x = c)

∂ log det(AF^{-1}B)/∂x_j = tr[(AF^{-1}B)^{-1} ∂(AF^{-1}B)/∂x_j]
                        = tr[(AF^{-1}B)^{-1}(-AF^{-1}(∂F/∂x_j)F^{-1}B)]
                        = -tr[F^{-1}B(AF^{-1}B)^{-1}AF^{-1} ∂F/∂x_j].

EXERCISE 19. Let A and B represent q x m and m x q matrices of constants.
(a) Let X represent an m x m matrix of m^2 "independent" variables, and suppose
that the range of X is a set S comprising some or all X-values for which X is
nonsingular and det(AX^{-1}B) > 0. Use the result of Exercise 18 to show that
log det(AX^{-1}B) is continuously differentiable at any interior point C of S and that
(at X = C)

∂ log det(AX^{-1}B)/∂X = -[X^{-1}B(AX^{-1}B)^{-1}AX^{-1}]'.

(b) Suppose now that X is an m x m symmetric matrix; that, for purposes of
differentiating any function of X, the function is to be interpreted as a function
of the column vector x whose elements are x_ij (j ≤ i = 1, ..., m); and that the
range of x is a set S comprising some or all x-values for which X is nonsingular
and det(AX^{-1}B) > 0. Use the result of Exercise 18 to show that log det(AX^{-1}B)
is continuously differentiable at any interior point c of S and that (at x = c)

∂ log det(AX^{-1}B)/∂X = -K - K' + diag(k_11, k_22, ..., k_mm),

where K = {k_ij} = X^{-1}B(AX^{-1}B)^{-1}AX^{-1}.

Solution. (a) For purposes of differentiating a function of X, rearrange the elements
of X in the form of an m^2-dimensional column vector x and reinterpret the function
as a function of x, in which case the domain of the function is the set S* obtained
by rearranging the elements of each m x m matrix in S in the form of a column
vector.
Let c represent the value of x corresponding to the interior point C of S (and
note that c is an interior point of S*). Since X is continuously differentiable at
c, it follows from the result of Exercise 18 that log det(AX^{-1}B) is continuously
differentiable at c and that (at x = c)

∂ log det(AX^{-1}B)/∂x_ij = -tr[X^{-1}B(AX^{-1}B)^{-1}AX^{-1} ∂X/∂x_ij].

Moreover, in light of results (5.3) and (5.2.3), we have that

∂ log det(AX^{-1}B)/∂x_ij = -tr[X^{-1}B(AX^{-1}B)^{-1}AX^{-1} u_i u_j'] = -u_j' X^{-1}B(AX^{-1}B)^{-1}AX^{-1} u_i .

And, upon observing that u_j' X^{-1}B(AX^{-1}B)^{-1}AX^{-1} u_i is the ijth element of
[X^{-1}B(AX^{-1}B)^{-1}AX^{-1}]', we conclude that

∂ log det(AX^{-1}B)/∂X = -[X^{-1}B(AX^{-1}B)^{-1}AX^{-1}]'.

(b) By employing essentially the same reasoning as in Part (a), it can be estab-
lished that log det(AX^{-1}B) is continuously differentiable at the interior point c
and that (at x = c)

∂ log det(AX^{-1}B)/∂x_ij = -tr[X^{-1}B(AX^{-1}B)^{-1}AX^{-1} ∂X/∂x_ij] = -tr[K ∂X/∂x_ij].

Moreover, in light of results (5.6), (5.7), and (5.2.3), we have that

∂ log det(AX^{-1}B)/∂x_ii = -tr(K u_i u_i') = -u_i' K u_i

and that (for j < i)

∂ log det(AX^{-1}B)/∂x_ij = -tr[K(u_i u_j' + u_j u_i')] = -(u_j' K u_i + u_i' K u_j).

Since u_i' K u_i is the ith diagonal element of K and since u_i' K u_j and u_j' K u_i are the
ijth elements of K and K', respectively, it follows that (at x = c)

∂ log det(AX^{-1}B)/∂X = -K - K' + diag(k_11, k_22, ..., k_mm).

EXERCISE 20. Let F = {f_is} represent a p x p matrix of functions, defined on
a set S, of a vector x = (x_1, ..., x_m)' of m variables. Let c represent any interior
point (of S) at which F is continuously differentiable. By, for instance, using the
result of Part (b) of Exercise 13.10, show that if rank[F(c)] ≤ p - 3, then

∂adj(F)/∂x_j = 0.

Solution. Let φ_si represent the cofactor of f_si and hence the isth element of adj(F),
and let F_si represent the (p - 1) x (p - 1) submatrix of F obtained by striking
out the sth row and the ith column (of F). Then, as discussed in Section 15.8, φ_si
is continuously differentiable at c and (at x = c)

∂φ_si/∂x_j = (-1)^{s+i} tr[adj(F_si) ∂F_si/∂x_j].

Now, suppose that rank[F(c)] ≤ p - 3. Then, rank[F_si(c)] ≤ p - 3 [since
otherwise F_si(c), and hence F(c), would contain an r x r nonsingular submatrix,
where r > p - 3, in which case the rank of F(c) would exceed p - 3]. Thus,
it follows from Part (b) of Exercise 13.10 that adj[F_si(c)] = 0, leading to the
conclusion that (at x = c) ∂φ_si/∂x_j = 0 and hence that (at x = c)

∂adj(F)/∂x_j = 0.

EXERCISE 21. (a) Let X represent an m x m matrix of m^2 "independent"
variables, and suppose that the range of X is a set S comprising some or all
X-values for which X is nonsingular. Show that (when the elements of X^{-1} are
regarded as functions of X) X^{-1} is continuously differentiable at any interior point
C of S and that (at X = C)

∂X^{-1}/∂x_ij = -y_i z_j',

where y_i represents the ith column and z_j' the jth row of X^{-1}.
(b) Suppose now that X is an m x m symmetric matrix; that, for purposes of
differentiating a function of X, the function is to be interpreted as a function of the
column vector x whose elements are x_ij (j ≤ i = 1, ..., m); and that the range of
x is a set S comprising some or all x-values for which X is nonsingular. Show that
X^{-1} is continuously differentiable at any interior point c of S and that (at x = c)

∂X^{-1}/∂x_ij = -y_i y_i'               if j = i,
∂X^{-1}/∂x_ij = -y_i y_j' - y_j y_i'    if j < i

(where y_i represents the ith column of X^{-1}).

Solution. Denote by u_j the jth column of I_m.
(a) For purposes of differentiating a function of X, rearrange the elements of
X in the form of an m^2-dimensional column vector x and regard the function as
a function of x, in which case the domain of the function is the set obtained by
rearranging the elements of each m x m matrix in S in the form of a column vector.
Let c represent the value of x corresponding to the interior point C of S. Then, X
is continuously differentiable at c, implying that X^{-1} is continuously differentiable
at c and [in light of results (8.15) and (5.3)] that (at x = c)

∂X^{-1}/∂x_ij = -X^{-1}(∂X/∂x_ij)X^{-1} = -X^{-1}u_i u_j'X^{-1} = -y_i z_j'.

(b) The matrix X is continuously differentiable at the interior point c, implying
that X^{-1} is continuously differentiable at c and [in light of results (8.15), (5.6),
and (5.7)] that (at x = c)

∂X^{-1}/∂x_ii = -X^{-1}(∂X/∂x_ii)X^{-1} = -X^{-1}u_i u_i'X^{-1} = -y_i y_i'

and similarly (for j < i)

∂X^{-1}/∂x_ij = -X^{-1}(u_i u_j' + u_j u_i')X^{-1} = -y_i y_j' - y_j y_i'.
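An illustrative finite-difference check of the formula in Part (a) (a sketch assuming NumPy; not part of the original solution; the indices i and j are zero-based here):

import numpy as np

rng = np.random.default_rng(8)
m, i, j = 4, 2, 1
X = rng.standard_normal((m, m)) + 5 * np.eye(m)   # comfortably nonsingular

Y = np.linalg.inv(X)
claim = -np.outer(Y[:, i], Y[j, :])                # -y_i z_j' from Part (a)
eps = 1e-6
E = np.zeros((m, m))
E[i, j] = eps
num = (np.linalg.inv(X + E) - np.linalg.inv(X - E)) / (2 * eps)
print(np.allclose(num, claim, atol=1e-5))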

EXERCISE 22. Let X represent an m x m matrix of m^2 "independent" variables.
Suppose that the range of X is a set S comprising some or all X-values for which X
is nonsingular, and let C represent an interior point of S. Denote the ijth element
of X^{-1} by y_ij, the jth column of X^{-1} by y_j, and the ith row of X^{-1} by z_i'.
(a) Show that X^{-1} is twice continuously differentiable at C and that (at X = C)

∂^2 X^{-1}/(∂x_ij ∂x_st) = y_js y_i z_t' + y_ti y_s z_j'.

(b) Suppose that det(X) > 0 for every X in S. Show that log det(X) is twice
continuously differentiable at C and that (at X = C)

∂^2 log det(X)/(∂x_ij ∂x_st) = -y_ti y_js .
Solution. For purposes of differentiating a function of X, rearrange the elements
of X in the form of an m^2-dimensional column vector x and reinterpret the function
as a function of x, in which case the domain of the function is the set S* obtained
by rearranging the elements of each m x m matrix in S in the form of a column
vector.
Let c represent the value of x corresponding to the interior point C of S (and
note that c is an interior point of S*). Denote by u_j the jth column of I_m.
It follows from the results of Section 15.5 (together with Lemma 15.4.1) that X
is twice continuously differentiable at c and that (at x = c) ∂X/∂x_ij = u_i u_j' and
∂^2 X/(∂x_ij ∂x_st) = 0.
(a) Based on the results of Section 15.9, we conclude that X^{-1} is twice contin-
uously differentiable at c and that (at x = c)

∂^2 X^{-1}/(∂x_ij ∂x_st) = X^{-1} u_i u_j' X^{-1} u_s u_t' X^{-1} + X^{-1} u_s u_t' X^{-1} u_i u_j' X^{-1}
                        = y_js y_i z_t' + y_ti y_s z_j'.

(b) Similarly, based on the results of Section 15.9 (along with Lemma 5.2.1),
we conclude that log det(X) is twice continuously differentiable at c and that (at
x = c)

∂^2 log det(X)/(∂x_ij ∂x_st) = -y_ti y_js .

EXERCISE 23. Let F = {f_is} represent a p x p matrix of functions, defined
on a set S, of a vector x = (x_1, ..., x_m)' of m variables. For any nonempty set
T = {t_1, ..., t_s}, whose members are integers between 1 and m, inclusive, define
D(T) = ∂^s F/(∂x_{t_1} ... ∂x_{t_s}). Let k represent a positive integer and, for i = 1, ..., k,
let j_i represent an arbitrary integer between 1 and m, inclusive.
(a) Suppose that F is nonsingular for every x in S, and denote by c any interior
point (of S) at which F is k times continuously differentiable. Show that F^{-1} is k
times continuously differentiable at c and that (at x = c)

∂^k F^{-1}/(∂x_{j_1} ... ∂x_{j_k})
  = Σ_{r=1}^{k} Σ_{T_1, ..., T_r} (-1)^r F^{-1}D(T_1)F^{-1}D(T_2)...F^{-1}D(T_r)F^{-1},    (E.1)

where T_1, ..., T_r are r nonempty mutually exclusive and exhaustive subsets of
{j_1, ..., j_k} (and where the second summation is over all possible choices for
T_1, ..., T_r).
(b) Suppose that det(F) > 0 for every x in S, and denote by c any interior point
(of S) at which F is k times continuously differentiable. Show that log det(F) is k
times continuously differentiable at c and that (at x = c)

∂^k log det(F)/(∂x_{j_1} ... ∂x_{j_k})
  = Σ_{r=1}^{k} Σ_{T_1, ..., T_r} (-1)^{r+1} tr[F^{-1}D(T_1)F^{-1}D(T_2)...F^{-1}D(T_r)],    (E.2)

where T_1, ..., T_r are r nonempty mutually exclusive and exhaustive subsets of
{j_1, ..., j_k} with j_k ∈ T_r (and where the second summation is over all possible
choices for T_1, ..., T_r).
Solution. (a) The proof is by mathematical induction. For k = 1 and k = 2, it
follows from the results of Sections 15.8 and 15.9 that F^{-1} is k times continuously
differentiable and formula (E.1) valid at any interior point at which F is k times
continuously differentiable.
Suppose now that, for an arbitrary value of k, F^{-1} is k times continuously
differentiable and formula (E.1) valid at any interior point at which F is k times
continuously differentiable. Denote by c* an interior point at which F is k + 1
times continuously differentiable. Then, it suffices to show that F^{-1} is k + 1 times
continuously differentiable at c* and that (at x = c*)

∂^{k+1} F^{-1}/(∂x_{j_1} ... ∂x_{j_{k+1}})
  = Σ_{r=1}^{k+1} Σ_{T_1*, ..., T_r*} (-1)^r F^{-1}D(T_1*)F^{-1}D(T_2*)...F^{-1}D(T_r*)F^{-1},    (S.2)

where j_{k+1} is an integer between 1 and m, inclusive, and where T_1*, ..., T_r* are r
nonempty mutually exclusive and exhaustive subsets of {j_1, ..., j_{k+1}}.
The matrix F is k times continuously differentiable at c* and hence at every
point in some neighborhood N of c*. By supposition, F^{-1} is k times continuously
differentiable and formula (E.1) valid at every point in N. Moreover, all partial
derivatives of F of order less than or equal to k are continuously differentiable
at c*. Thus, it follows from results (4.8) and (8.15) that ∂^k F^{-1}/(∂x_{j_1} ... ∂x_{j_k}) is
continuously differentiable at c* and that (at x = c*)

∂^{k+1} F^{-1}/(∂x_{j_1} ... ∂x_{j_{k+1}})
  = Σ_{r=1}^{k} Σ_{T_1, ..., T_r} (-1)^r
    × [ -F^{-1}(∂F/∂x_{j_{k+1}})F^{-1}D(T_1)F^{-1}D(T_2)...F^{-1}D(T_r)F^{-1}
        - F^{-1}D(T_1)F^{-1}(∂F/∂x_{j_{k+1}})F^{-1}D(T_2)...F^{-1}D(T_r)F^{-1}
        - ... - F^{-1}D(T_1)F^{-1}D(T_2)...F^{-1}D(T_r)F^{-1}(∂F/∂x_{j_{k+1}})F^{-1}
        + F^{-1}D(T_1 ∪ {j_{k+1}})F^{-1}D(T_2)...F^{-1}D(T_r)F^{-1}
        + F^{-1}D(T_1)F^{-1}D(T_2 ∪ {j_{k+1}})...F^{-1}D(T_r)F^{-1}
        + ... + F^{-1}D(T_1)F^{-1}D(T_2)...F^{-1}D(T_r ∪ {j_{k+1}})F^{-1} ].    (S.3)

The terms of sum (S.3) can be put into one-to-one correspondence with the terms
of sum (S.2) (in such a way that the corresponding terms are identical), so that
formula (S.2) is valid and the mathematical induction argument is complete.
(b) The proof is by mathematical induction. For k = 1 and k = 2, it follows
from the results of Sections 15.8 and 15.9 that log det(F) is k times continuously
differentiable and formula (E.2) valid at any interior point at which F is k times
continuously differentiable.
Suppose now that, for an arbitrary value of k, log det(F) is k times continuously
differentiable and formula (E.2) valid at any interior point at which F is k times
continuously differentiable. Denote by c* an interior point at which F is k + 1
times continuously differentiable. Then, it suffices to show that log det(F) is k + 1
times continuously differentiable at c* and that (at x = c*)

∂^{k+1} log det(F)/(∂x_{j_1} ... ∂x_{j_{k+1}})
  = Σ_{r=1}^{k+1} Σ_{T_1*, ..., T_r*} (-1)^{r+1} tr[F^{-1}D(T_1*)F^{-1}D(T_2*)...F^{-1}D(T_r*)],    (S.4)

where T_1*, ..., T_r* are r nonempty mutually exclusive and exhaustive subsets of
{j_1, ..., j_{k+1}} with j_{k+1} ∈ T_r*.
The matrix F is k times continuously differentiable at c* and hence at every point
in some neighborhood N of c*. By supposition, log det(F) is k times continuously
differentiable and formula (E.2) valid at every point in N. Moreover, all partial
derivatives of F of order less than or equal to k are continuously differentiable at
c*, and F^{-1} is continuously differentiable at c*. Thus, it follows from results (4.8)
and (8.15) that ∂^k log det(F)/(∂x_{j_1} ... ∂x_{j_k}) is continuously differentiable at c* and
that (at x = c*)

∂^{k+1} log det(F)/(∂x_{j_1} ... ∂x_{j_{k+1}})
  = Σ_{r=1}^{k} Σ_{T_1, ..., T_r} (-1)^{r+1}
    × tr[ -F^{-1}(∂F/∂x_{j_{k+1}})F^{-1}D(T_1)F^{-1}D(T_2)...F^{-1}D(T_r)
          - ... - F^{-1}D(T_1)F^{-1}D(T_2)...F^{-1}(∂F/∂x_{j_{k+1}})F^{-1}D(T_r)
          + F^{-1}D(T_1 ∪ {j_{k+1}})F^{-1}D(T_2)...F^{-1}D(T_r)
          + F^{-1}D(T_1)F^{-1}D(T_2 ∪ {j_{k+1}})...F^{-1}D(T_r)
          + ... + F^{-1}D(T_1)F^{-1}D(T_2)...F^{-1}D(T_r ∪ {j_{k+1}}) ].    (S.5)

The terms of sum (S.5) can be put into one-to-one correspondence with the terms
of sum (S.4) (in such a way that the corresponding terms are identical), so that
formula (S.4) is valid and the mathematical induction argument is complete.

EXERCISE 24. Let X = {xij} represent an m x m symmetric matrix, and let


x represent the m(m + 1)/2-dimensional column vector whose elements are Xij
(j ~ i = 1, ... , m). Define S to be the set of all x-values for which X is nonsingular
and S* to be the set of all x-values for which X is positive definite. Show that S
and S* are both open sets.
Solution. Let c represent an arbitrary point in S, and c* an arbitrary point in S*. It
suffices to show that c and c* are interior points (of S and S*, respectively). Denote
by C and C* the values of X at x = c and x = c*, respectively.
According to Lemma 15.10.2, there exists a neighborhood N of c such that X is
nonsingular for x ∈ N. And, it follows from the very definition of S that N ⊂ S.
Thus, c is an interior point of S.
Now, let X_k and C*_k represent the kth-order leading principal submatrices of X
and C*, respectively. Then, det(X_k) is a continuous function of x (at all points in
R^{m(m+1)/2}) and hence

lim_{x→c*} det(X_k) = det(C*_k).

Since (according to Theorem 14.9.5) det(C*_k) > 0, there exists a neighborhood
N* of c* such that |det(X_k) - det(C*_k)| < det(C*_k) for x ∈ N* and hence {since
-[det(X_k) - det(C*_k)] ≤ |det(X_k) - det(C*_k)|} such that -det(X_k) + det(C*_k) <
det(C*_k) for x ∈ N*. Thus, -det(X_k) < 0 for x ∈ N* or, equivalently, det(X_k) > 0
for x ∈ N* (k = 1, ..., m). Based on Theorem 14.9.5, we conclude that X is
positive definite for x ∈ N*, or equivalently that N* ⊂ S*, and hence that c* is an
interior point of S*.

EXERCISE 25. Let X represent an n x p matrix of constants, and let W represent
an n x n symmetric positive definite matrix whose elements are functions, defined
on a set S, of a vector z = (z_1, ..., z_m)' of m variables. Further, let c represent
any interior point (of S) at which W is twice continuously differentiable. Show
that W - WP_{X,W} is twice continuously differentiable at c and that (at z = c)

∂^2(W - WP_{X,W})/(∂z_i ∂z_j)
  = (I - P_{X,W})'(∂^2 W/∂z_i∂z_j)(I - P_{X,W})
    - (I - P_{X,W})'(∂W/∂z_i)X(X'WX)^-X'(∂W/∂z_j)(I - P_{X,W})
    - [(I - P_{X,W})'(∂W/∂z_i)X(X'WX)^-X'(∂W/∂z_j)(I - P_{X,W})]'.

Solution. Since W is twice continuously differentiable at c, it is continuously
differentiable at c and hence continuously differentiable at every point in some
neighborhood N of c. Then, it follows from Theorem 15.11.1 that W - WP_{X,W} is
continuously differentiable at every point in N and that (for z ∈ N)

∂(W - WP_{X,W})/∂z_j = (I - P_{X,W})'(∂W/∂z_j)(I - P_{X,W}).

Further, P_{X,W} and ∂W/∂z_j are continuously differentiable at c.
Thus, ∂(W - WP_{X,W})/∂z_j is continuously differentiable at c, and hence W -
WP_{X,W} is twice continuously differentiable at c. Moreover, making use of results
(4.6) and (11.1) [along with Part (3') of Theorem 14.12.11], we find that (at z = c)

∂^2(W - WP_{X,W})/(∂z_i ∂z_j) = ∂[∂(W - WP_{X,W})/∂z_j]/∂z_i
  = -(I - P_{X,W})'(∂W/∂z_j)(∂P_{X,W}/∂z_i)
    + (I - P_{X,W})'(∂^2 W/∂z_i∂z_j)(I - P_{X,W})
    - (∂P_{X,W}/∂z_i)'(∂W/∂z_j)(I - P_{X,W})
  = -[(∂P_{X,W}/∂z_i)'(∂W/∂z_j)(I - P_{X,W})]'
    + (I - P_{X,W})'(∂^2 W/∂z_i∂z_j)(I - P_{X,W})
    - (∂P_{X,W}/∂z_i)'(∂W/∂z_j)(I - P_{X,W})
  = -[(I - P_{X,W})'(∂W/∂z_i)X(X'WX)^-X'(∂W/∂z_j)(I - P_{X,W})]'
    + (I - P_{X,W})'(∂^2 W/∂z_i∂z_j)(I - P_{X,W})
    - (I - P_{X,W})'(∂W/∂z_i)X(X'WX)^-X'(∂W/∂z_j)(I - P_{X,W}).

EXERCISE 26. Let X represent an n x p matrix and W an n x n symmetric positive
definite matrix, and suppose that the elements of X and W are functions, defined
on a set S, of a vector z = (z_1, ..., z_m)' of m variables. And, let c represent any
interior point (of S) at which W and X are continuously differentiable, and suppose
that X has constant rank on some neighborhood of c. Further, let B represent any
p x n matrix such that X'WXB = X'W. Then, at z = c,

∂(WP_{X,W})/∂z_j = ∂W/∂z_j - (I - P_{X,W})'(∂W/∂z_j)(I - P_{X,W})
                   + W(I - P_{X,W})(∂X/∂z_j)B + [W(I - P_{X,W})(∂X/∂z_j)B]'.    (*)

Derive result (*) by using the result

P'_{X,W}WP_{X,W} = WP_{X,W}    (**)

to obtain the representation

∂(WP_{X,W})/∂z_j = P'_{X,W}W(∂P_{X,W}/∂z_j) + P'_{X,W}(∂W/∂z_j)P_{X,W} + (∂P_{X,W}/∂z_j)'WP_{X,W},

and by then making use of the result

(∂P_{X,W}/∂z_j)'WP_{X,W} = (I - P_{X,W})'(∂W/∂z_j)P_{X,W} + W(I - P_{X,W})(∂X/∂z_j)B.

Solution. According to result (**) [or, equivalently, according to Part (6') of The-
orem 14.12.11], WP_{X,W} = P'_{X,W}WP_{X,W}. Thus, it follows from results (4.6) and
(4.10) that

∂(WP_{X,W})/∂z_j = P'_{X,W}W(∂P_{X,W}/∂z_j) + P'_{X,W}(∂W/∂z_j)P_{X,W} + (∂P_{X,W}/∂z_j)'WP_{X,W}.

Substituting from the last of the results cited in the statement of the exercise [or
equivalently from result (11.16)], we find that (for any p x n matrix B such that
X'WXB = X'W)

∂(WP_{X,W})/∂z_j
  = [(I - P_{X,W})'(∂W/∂z_j)P_{X,W} + W(I - P_{X,W})(∂X/∂z_j)B]'
    + P'_{X,W}(∂W/∂z_j)P_{X,W} + (I - P_{X,W})'(∂W/∂z_j)P_{X,W} + W(I - P_{X,W})(∂X/∂z_j)B
  = P'_{X,W}(∂W/∂z_j)(I - P_{X,W}) + [W(I - P_{X,W})(∂X/∂z_j)B]'
    + (∂W/∂z_j)P_{X,W} + W(I - P_{X,W})(∂X/∂z_j)B.

And, upon reexpressing (∂W/∂z_j)P_{X,W} as

(∂W/∂z_j)P_{X,W} = ∂W/∂z_j - (∂W/∂z_j)(I - P_{X,W}),

it is clear that

∂(WP_{X,W})/∂z_j = ∂W/∂z_j - (I - P_{X,W})'(∂W/∂z_j)(I - P_{X,W})
                   + W(I - P_{X,W})(∂X/∂z_j)B + [W(I - P_{X,W})(∂X/∂z_j)B]'.
16
Kronecker Products and the Vec and
Vech Operators

EXERCISE 1. (a) Verify that, for any m x n matrices A and B and any p x q
matrices C and D,

(A + B) ⊗ (C + D) = (A ⊗ C) + (A ⊗ D) + (B ⊗ C) + (B ⊗ D).

(b) Verify that, for any m x n matrices A_1, A_2, ..., A_r and p x q matrices
B_1, B_2, ..., B_s,

(Σ_{i=1}^{r} A_i) ⊗ (Σ_{j=1}^{s} B_j) = Σ_{i=1}^{r} Σ_{j=1}^{s} (A_i ⊗ B_j).
Solution. (a) It follows from results (1.11) and (1.12) that

(A + B) ⊗ (C + D) = [A ⊗ (C + D)] + [B ⊗ (C + D)]
                  = (A ⊗ C) + (A ⊗ D) + (B ⊗ C) + (B ⊗ D).

(b) Let us begin by showing that, for any m x n matrix A,

A ⊗ (Σ_{j=1}^{s} B_j) = Σ_{j=1}^{s} (A ⊗ B_j).    (S.1)

The proof is by mathematical induction. Result (S.1) is valid for s = 2, as is
evident from result (1.12). Suppose now that result (S.1) is valid for s = s*. Then,
making use of result (1.12), we find that

A ⊗ (Σ_{j=1}^{s*+1} B_j) = A ⊗ [(Σ_{j=1}^{s*} B_j) + B_{s*+1}]
                      = [A ⊗ (Σ_{j=1}^{s*} B_j)] + (A ⊗ B_{s*+1})
                      = Σ_{j=1}^{s*+1} (A ⊗ B_j),

which indicates that result (S.1) is valid for s = s* + 1, thereby completing the
induction argument. Moreover, it can be shown in analogous fashion that, for any
p x q matrix B,

(Σ_{i=1}^{r} A_i) ⊗ B = Σ_{i=1}^{r} (A_i ⊗ B).    (S.2)

Now, making use of results (S.2) and (S.1), we find that

(Σ_{i=1}^{r} A_i) ⊗ (Σ_{j=1}^{s} B_j) = Σ_{i=1}^{r} [A_i ⊗ (Σ_{j=1}^{s} B_j)] = Σ_{i=1}^{r} Σ_{j=1}^{s} (A_i ⊗ B_j).

EXERCISE 2. Show that, for any m x n matrix A and p x q matrix B,

A ⊗ B = (A ⊗ I_p) diag(B, B, ..., B).

Solution. Making use of results (1.20) and (1.7), we find that

A ⊗ B = (A ⊗ I_p)(I_n ⊗ B) = (A ⊗ I_p) diag(B, B, ..., B).
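A quick numerical confirmation of this factorization (an illustrative sketch assuming NumPy; not part of the original solution). Here diag(B, ..., B), with n copies of B, is built as I_n ⊗ B.

import numpy as np

rng = np.random.default_rng(9)
m, n, p, q = 2, 3, 2, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, q))

lhs = np.kron(A, B)
rhs = np.kron(A, np.eye(p)) @ np.kron(np.eye(n), B)   # (A ⊗ I_p) diag(B, ..., B)
print(np.allclose(lhs, rhs))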

EXERCISE 3. Show that, for any m x 1 vector a and any p x 1 vector b, (1)
a ⊗ b = (a ⊗ I_p)b and (2) a' ⊗ b' = b'(a' ⊗ I_p).
Solution. Making use of results (1.20) and (1.1), we find (1) that

a ⊗ b = (a ⊗ I_p)(1 ⊗ b) = (a ⊗ I_p)b

and similarly (2) that

a' ⊗ b' = (1 ⊗ b')(a' ⊗ I_p) = b'(a' ⊗ I_p).

EXERCISE 4. Let A and B represent square matrices.


(a) Show that if A and B are orthogonal, then A ® B is orthogonal.
(b) Show that if A and B are idempotent, then A ® B is idempotent.
Solution. Note that (since A and B are square) A ® B is square.
(a) If A and B are orthogonal, then we have [in light of results (1.15), (1.19),
and (1.8)] that

(A ® B)' (A ® B) = (A' ® B')(A ® B) = (A'A) ® (B'B) = I ® I = I



and hence that A ® B is orthogonal.


(b) If A and B are idempotent, then we have [in light of result (1.19)] that

(A ® B)(A ® B) = (AA) ® (BB) = A ® B

and hence that A ® B is idempotent.

EXERCISE 5. Letting m, n, p, and q represent arbitrary positive integers, show


(a) that, for any p x q matrix B (having p > I or q > 1), there exists an m x n
matrix A such that A ® B has generalized inverses that are not expressible in the
form A - ® B- and (b) that, for any m x n matrix A (having m > I or n > 1),
there exists a p x q matrix B such that A ® B has generalized inverses that are not
expressible in the form A - ® B-.

Solution. (a) Take A = O. Then, A ® B = 0, so that any nq x mp matrix is a


generalized inverse of A ® B. Since every one of the mn (q x p dimensional)
blocks of the Kronecker product A - ® B- is a scalar multiple of the same q x p
matrix (namely, B-), A ® B has generalized inverses that are not expressible in the
form A - ® B-. Consider, for example, an nq x mp partitioned matrix comprising
mn (q x p dimensional) blocks, including one block that has a single nonzero
entry and a second block that also has a single nonzero entry but in a different
location than the first. Clearly, this matrix is a generalized inverse of A ® B that is
not expressible in the form A - ® B- .
(b) Take B = O. Then, A ® B = 0, so that any nq x mp matrix is a generalized
inverse of A ® B. Now, letting CIS represent the t sth element of B- and observing
that (for t = I, ... , q and s = I, ... , p) the n x m submatrix of A - ® B-
obtained by striking out all of the rows and columns except the tth, (q + t)th, ... ,
[(n - l)q + t]th rows and sth, (p + s)th, ... , [(m - l)p + s]th columns equals
clsA -, it follows that A ® B has generalized inverses that are not expressible in
the form A - ® B-. Consider, for example, an nq x mp matrix for which the
n x m submatrix obtained by striking out (for some t and s) all of the rows and
columns except the tth, (q + t)th, ... , [(n - 1)q + t]th rows and sth, (p + s)th,
... , [(m - I) p + s]th columns has a single nonzero entry and for which the n x m
submatrix obtained by striking out (for some t' and s' with t' =1= t or s' =1= s) all of
the rows and columns except the t'th, (q + t')th, ... , [(n - l)q + t']th rows and
s'th, (p + s')th, ... , [(m - 1) p + s']th columns also has a single nonzero entry but
in a different location than the first submatrix. Clearly, this matrix is a generalized
inverse of A ® B that is not expressible in the form A - ® B-.

EXERCISE 6. Let X = A ⊗ B, where A is an m x n matrix and B a p x q matrix.
Show that P_X = P_A ⊗ P_B.
Solution. According to result (1.15), X' = A' ⊗ B'. Thus, making use of result
(1.19), we find that

X'X = (A' ⊗ B')(A ⊗ B) = (A'A) ⊗ (B'B),

so that (A'A)^- ⊗ (B'B)^- is a generalized inverse of X'X. And, again making use
of result (1.19), it follows that

P_X = X(X'X)^-X'
    = (A ⊗ B)[(A'A)^- ⊗ (B'B)^-](A' ⊗ B')
    = [A(A'A)^-A'] ⊗ [B(B'B)^-B']
    = P_A ⊗ P_B .
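A numerical illustration of this identity (a sketch assuming NumPy; not part of the original solution). The orthogonal projection matrix onto the column space of a matrix M is computed here as M M^+, which coincides with M(M'M)^-M'.

import numpy as np

def proj(M):
    # orthogonal projection matrix onto C(M)
    return M @ np.linalg.pinv(M)

rng = np.random.default_rng(10)
A = rng.standard_normal((5, 2))
B = rng.standard_normal((4, 3))
print(np.allclose(proj(np.kron(A, B)), np.kron(proj(A), proj(B))))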

EXERCISE 7. Show that the Kronecker product A ® B of an m x m matrix


A and an n x n matrix B is (a) symmetric nonnegative definite if A and B are
both symmetric nonnegative definite or both symmetric nonpositive definite and
(b) symmetric positive definite if A and B are both symmetric positive definite or
both symmetric negative definite.

Solution. (a) Suppose that A and B are both symmetric nonnegative definite. Then,
according to Corollary 14.3.8, there exist matrices P and Q such that A = P'P and
B = Q'Q. Thus, making use of results (1.19) and (1.15), we find that
A ® B = (p' ® Q')(P® Q) = (P® Q)'(P® Q).
We conclude (in light of Corollary 14.3.8 or 14.2.14) that A ® B is symmetric
nonnegative definite.
Alternatively, if A and B are both symmetric non positive definite, then -A
and -B are symmetric and (by definition) nonnegative definite, and the proof
[of Part (a)] is complete upon observing [in light of result (1.10)] that A ® B =
(-A) ® (-B).
(b) Suppose now that A and B are both symmetric positive definite. Then,
according to Corollary 14.3.13, there exist nonsingular matrices P and Q such
that A = P'P and B = Q'Q. Further, A ® B = (P ® Q)'(P ® Q), and P ® Q is
nonsingular. We conclude (in light of Corollary 14.3.13 or 14.2.14) that A ® B is
symmetric positive definite.
Alternatively, if A and B are both symmetric negative definite, then -A and-B
are symmetric and (by definition) positive definite, and the proof is complete upon
observing that A ® B = (-A) ® (-B).

EXERCISE 8. Let A and B represent m x m symmetric matrices and C and D


n x n symmetric matrices. Using the result of Exercise 7 (or otherwise), show that
if A - B, C - D, B, and C are nonnegative definite, then (A ® C) - (B ® D) is
symmetric nonnegative definite.

Solution. Using properties (1.10) - (1.12), we find that


(A ® C) - (B ® D) = {[(A - B) + B] ® C} - {B ® [C - (C - D)]}
= [(A - B) ® C] + (B ® C)
-{(B ® C) - [B ® (C - D)]}
= [(A - B) ® C] + [B ® (C - D)]. (S.3)

Now, suppose that A - B, C - D, B, and C are nonnegative definite. Then, it


follows from the result of Part (a) of Exercise 7 that (A - B) ® C and B ® (C - D)
are both symmetric nonnegative definite and hence (in light of Lemma 14.2.4) that
their sum [(A - B) ® C] + [B ® (C - D)] is symmetric nonnegative definite.
And, based on equality (S.3), we conclude that (A ® C) - (B ® D) is symmetric
nonnegative definite.

EXERCISE 9. Let A represent an m x n matrix and B a p x q matrix. Show that,
in the case of the usual norm,

||A ⊗ B|| = ||A|| ||B||.

Solution. Making use of results (1.15), (1.19), and (1.25), we find that

||A ⊗ B|| = {tr[(A ⊗ B)'(A ⊗ B)]}^{1/2}
          = {tr[(A' ⊗ B')(A ⊗ B)]}^{1/2}
          = {tr[(A'A) ⊗ (B'B)]}^{1/2}
          = {tr(A'A)}^{1/2} {tr(B'B)}^{1/2}
          = ||A|| ||B||.
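A quick numerical confirmation (an illustrative sketch assuming NumPy; not part of the original solution). NumPy's default matrix norm is the Frobenius norm, which is the usual norm here.

import numpy as np

rng = np.random.default_rng(11)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((2, 5))
lhs = np.linalg.norm(np.kron(A, B))              # Frobenius norm of A ⊗ B
rhs = np.linalg.norm(A) * np.linalg.norm(B)
print(np.isclose(lhs, rhs))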

EXERCISE 10. Verify that, for an m x n partitioned matrix

    A = [A_11  A_12  ...  A_1c]
        [A_21  A_22  ...  A_2c]
        [ ...   ...  ...   ...]
        [A_r1  A_r2  ...  A_rc]

and a p x q matrix B,

    A ⊗ B = [A_11 ⊗ B  A_12 ⊗ B  ...  A_1c ⊗ B]
            [A_21 ⊗ B  A_22 ⊗ B  ...  A_2c ⊗ B]
            [   ...       ...    ...     ...  ]
            [A_r1 ⊗ B  A_r2 ⊗ B  ...  A_rc ⊗ B];

that is, A ⊗ B equals the mp x nq matrix obtained by replacing each block A_ij of
A with the Kronecker product of A_ij and B.
Solution. For i = 1, ..., r, let m_i represent the number of rows in A_i1, A_i2, ...,
A_ic; and, for j = 1, ..., c, let n_j represent the number of columns in A_1j, A_2j,
..., A_rj. Define F = A ⊗ B; and partition F as

    F = [F_11  F_12  ...  F_1c]
        [F_21  F_22  ...  F_2c]
        [ ...   ...  ...   ...]
        [F_r1  F_r2  ...  F_rc],

that is, partition F into r rows and c columns of blocks, the ijth of which is
of dimensions m_i p x n_j q and is denoted by F_ij. Then (for i = 1, ..., r and
j = 1, ..., c), F_ij equals a partitioned matrix comprising m_i rows and n_j columns
of p x q dimensional blocks, the uvth of which is

a_{m_1+...+m_{i-1}+u, n_1+...+n_{j-1}+v} B.

(When i = 1 or j = 1, interpret the degenerate sum m_1 + ... + m_{i-1} or n_1 +
... + n_{j-1} as zero.) It follows that F_ij = A_ij ⊗ B.
EXERCISE 11. Show that (a) if T and V are both upper triangular matrices, then
T ® V is an upper triangular matrix and (b) if T and L are both lower triangular
matrices, then T ® L is a lower triangular matrix.
Solution. (a) Suppose that T = {tij} is an upper triangular matrix of order m
and V = {Uij} an upper triangular matrix of order n. Then, T ® V is a square
matrix of order mn, and the element that appears in the [n(i - 1) + r]th row and
[n(j - 1) + s]th column ofT ® V is tijU rs .
Clearly, tijU rs 1= 0 only if j :::: i and s :::: r. Thus, the element that appears in
the [n(i - 1) + r]th row and [n(j - 1) + s ]th column of T ® V is nonzero only if
n(j - 1) + s :::: n(i - 1) + r. It follows that T ® V is an upper triangular matrix.
(b) Suppose that T and L are both lower triangular matrices. Then, T' and L'
are both upper triangular, and consequently it follows from Part (a) that T' ® L' is
upper triangular. Since [in light of result (1.15)] T ® L = (T' ® L')', we conclude
that T ® L is lower triangular.

EXERCISE 12. Let A represent an m x m matrix and B an n x n matrix. Suppose


that A and B have LDU decompositions, say A = LI DI V I and B = L2D2 V2.
Using the results of Exercise 11, show that A ® B has the LDU decomposition
A ® B = LDV, where L = LI ® L2, D = DI ® D2, and V = VI ® V2.
Solution. That A ® B = LDV is an immediate consequence of result 0.19).
Moreover, D is (by definition) the Kronecker product of two diagonal matrices
(namely, DI and D2) and hence is diagonal. And, V is (by definition) the Kronecker
product of two upper triangular matrices (namely, V I and V2) and hence [as a
consequence of Part (a) of Exercise 11] is upper triangular. Similarly, L is (by
definition) the Kronecker product of two lower triangular matrices (namely, LI
and L2) and hence [as a consequence of Part (b) of Exercise 11] is lower triangular.
It remains to show that the diagonal elements of L and V equal one. In this
regard, observe that the [(n - l)i + r ]th diagonal element of L is the product of
the ith diagonal element of LI and the rth diagonal element of L2 and that the
[en - l)j + r]th diagonal element of V is the product of the ith diagonal element
of V I and the rth diagonal element of V2. Since L I, L2, V I, and V2 are unit
triangular, their diagonal elements equal one. Thus, the diagonal elements of L
and V equal one.

EXERCISE 13. Let A_1, A_2, ..., A_k represent k matrices (of the same dimensions).
Show that A_1, A_2, ..., A_k are linearly independent if and only if vec(A_1),
vec(A_2), ..., vec(A_k) are linearly independent.
Solution. It suffices to show that A_1, A_2, ..., A_k are linearly dependent if and
only if vec(A_1), vec(A_2), ..., vec(A_k) are linearly dependent.
Suppose that A_1, A_2, ..., A_k are linearly dependent. Then, there exist scalars
c_1, c_2, ..., c_k, not all zero, such that Σ_{i=1}^{k} c_i A_i = 0. Since [in light of result (2.6)]

Σ_{i=1}^{k} c_i vec(A_i) = vec(Σ_{i=1}^{k} c_i A_i) = vec(0) = 0,

we conclude that vec(A_1), vec(A_2), ..., vec(A_k) are linearly dependent.
Conversely, suppose that vec(A_1), vec(A_2), ..., vec(A_k) are linearly depen-
dent. Then, there exist scalars c_1, c_2, ..., c_k, not all zero, such that

Σ_{i=1}^{k} c_i vec(A_i) = 0,

or equivalently [in light of result (2.6)] such that vec(Σ_{i=1}^{k} c_i A_i) = 0, and hence
such that Σ_{i=1}^{k} c_i A_i = 0. We conclude that A_1, A_2, ..., A_k are linearly dependent.

EXERCISE 14. Let m represent a positive integer, let e_i represent the ith column
of I_m (i = 1, ..., m), and (for i, j = 1, ..., m) let U_ij = e_i e_j' (in which case U_ij
is an m x m matrix whose ijth element is 1 and whose remaining m^2 - 1 elements
are 0).
(a) Show that

vec(I_m) = Σ_{i=1}^{m} e_i ⊗ e_i .

(b) Show that (for i, j, r, s = 1, ..., m)

vec(U_ri)[vec(U_sj)]' = U_ij ⊗ U_rs .

(c) Show that

Σ_{i=1}^{m} Σ_{j=1}^{m} U_ij ⊗ U_ij = vec(I_m)[vec(I_m)]'.

Solution. (a) Making use of results (2.4.4), (2.6), and (2.3), we find that

vec(I_m) = vec(Σ_i e_i e_i') = Σ_i vec(e_i e_i') = Σ_i e_i ⊗ e_i .

(b) Making use of results (2.3), (1.15), and (1.19), we find that

vec(U_ri)[vec(U_sj)]' = (e_i ⊗ e_r)(e_j ⊗ e_s)'
                      = (e_i ⊗ e_r)(e_j' ⊗ e_s')
                      = (e_i e_j') ⊗ (e_r e_s') = U_ij ⊗ U_rs .

(c) Making use of Part (b) and results (2.6) and (2.4.4), we find that

Σ_{i,j} U_ij ⊗ U_ij = Σ_{i,j} vec(U_ii)[vec(U_jj)]'
                    = Σ_i vec(U_ii) [Σ_j vec(U_jj)]'
                    = vec(Σ_i U_ii)[vec(Σ_j U_jj)]'
                    = vec(I_m)[vec(I_m)]'.

EXERCISE 15. Let A represent an n x n matrix.
(a) Show that if A is orthogonal, then (vec A)'vec A = n.
(b) Show that if A is idempotent, then [vec(A')]'vec A = rank(A).

Solution. (a) If A is orthogonal, then, making use of result (2.14), we find that

(vec A)'vec A = tr(A'A) = tr(I_n) = n.

(b) If A is idempotent, then, making use of result (2.14) and Corollary 10.2.2,
we find that

[vec(A')]'vec A = tr(AA) = tr(A) = rank(A).

EXERCISE 16. Show that for any m x n matrix A, p x n matrix X, p x p matrix
B, and n x m matrix C,

tr(AX'BXC) = (vec X)'[(A'C') ⊗ B]vec X = (vec X)'[(CA) ⊗ B']vec X.

Solution. Making use of results (5.2.3) and (2.15), we find that

tr(AX'BXC) = tr(X'BXCA) = tr[X'BX(A'C')'] = (vec X)'[(A'C') ⊗ B]vec X.

Further, as a consequence of Lemma 14.1.1 and result (1.15), we have that

(vec X)'[(A'C') ⊗ B]vec X = (vec X)'[(A'C') ⊗ B]'vec X
                          = (vec X)'[(A'C')' ⊗ B']vec X
                          = (vec X)'[(CA) ⊗ B']vec X.
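A numerical illustration of both forms of the identity (a sketch assuming NumPy; not part of the original solution). Since vec stacks the columns of X, it is formed here with column-major ("Fortran") ordering.

import numpy as np

rng = np.random.default_rng(12)
m, n, p = 3, 4, 5
A = rng.standard_normal((m, n))
X = rng.standard_normal((p, n))
B = rng.standard_normal((p, p))
C = rng.standard_normal((n, m))

vecX = X.reshape(-1, order='F')                  # vec(X): stack the columns of X
lhs = np.trace(A @ X.T @ B @ X @ C)
mid = vecX @ np.kron(A.T @ C.T, B) @ vecX        # (vec X)'[(A'C') ⊗ B]vec X
rhs = vecX @ np.kron(C @ A, B.T) @ vecX          # (vec X)'[(CA) ⊗ B']vec X
print(np.isclose(lhs, mid), np.isclose(lhs, rhs))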

EXERCISE 17. (a) Let V represent a linear space of m x n matrices, and let g
represent a function that assigns the value A * B to each pair of matrices A and B
in V. Take U to be the linear space of mn x 1 vectors defined by

U = {x ∈ R^{mn×1} : x = vec(A) for some A ∈ V},

and let x ∘ y represent the value assigned to each pair of vectors x and y in U by
an arbitrary inner product f. Show that g is an inner product (for V) if and only if
there exists an f such that (for all A and B in V)

A * B = vec(A) ∘ vec(B).

(b) Let g represent a function that assigns the value A * B to an arbitrary pair
of matrices A and B in nmxn. Show that g is an inner product (for nmxn) if and
only if there exists an mn x mn partitioned symmetric positive definite matrix

WI2 ... Win)


W22 ... W2n
. . .
. .
Wn2 ... Wnn

(where each submatrix is of dimensions m x m) such that (for alIA andB in nmxn)

A*B = L3;Wijbj,
i,j

where 31,32, ... ,3n and bl, b2, ... , b n represent the first, second, ... , nth col-
umns of A and B, respectively.
(c) Let g represent a function that assigns the value x' * y' to an arbitrary pair
of (row) vectors in n I xn. Show that g is an inner product (for n I xn) if and only
if there exists an n x n symmetric positive definite matrix W such that (for every
pair of n-dimensional row vectors x' and y')

x' * y' = x'Wy .

Solution. (a) Suppose that, for some f,
A * B = vec(A) ∘ vec(B)
(for all A and B in V). Then,
(1) A * B = vec(A) ∘ vec(B) = vec(B) ∘ vec(A) = B * A;
(2) A * A = vec(A) ∘ vec(A) ≥ 0, with equality holding if and only if vec(A) = 0 or equivalently if and only if A = 0;
(3) (kA) * B = vec(kA) ∘ vec(B) = [k vec(A)] ∘ vec(B) = k[vec(A) ∘ vec(B)] = k(A * B);
(4) (A + B) * C = vec(A + B) ∘ vec(C)
= [vec(A) + vec(B)] ∘ vec(C)
= [vec(A) ∘ vec(C)] + [vec(B) ∘ vec(C)]
= (A * C) + (B * C)
(where A, B, and C represent arbitrary matrices in V and k represents an arbitrary scalar). Thus, g is an inner product.
Conversely, suppose that g is an inner product, and consider the function f that assigns to each pair of vectors x and y in U the value
x ∘ y = X * Y,
where X and Y are the (unique) m × n matrices such that x = vec(X) and y = vec(Y). Then, letting x, y, and z represent arbitrary vectors in U, taking X, Y, and Z to be m × n matrices such that x = vec(X), y = vec(Y), and z = vec(Z), and denoting by k an arbitrary scalar, we find that
(1) x ∘ y = X * Y = Y * X = y ∘ x;
(2) x ∘ x = X * X ≥ 0, with equality holding if and only if X = 0 or equivalently if and only if x = 0;
(3) (kx) ∘ y = (kX) * Y = k(X * Y) = k(x ∘ y);
(4) (x + y) ∘ z = (X + Y) * Z = (X * Z) + (Y * Z) = (x ∘ z) + (y ∘ z).
Thus, f is an inner product (for U). Moreover, for this f, we have that
A * B = vec(A) ∘ vec(B)
(for all A and B in V).


(b) Let f represent an arbitrary inner product for R^{mn×1}, and let x·y represent the value assigned by f to an arbitrary pair of mn-dimensional column vectors x and y. According to Part (a), g is an inner product (for R^{m×n}) if and only if there exists an f such that (for all A and B in R^{m×n})

A * B = vec(A)·vec(B).

Moreover, according to the discussion of Section 14.10a, every inner product for R^{mn×1} is expressible as a bilinear form, and a bilinear form (in mn × 1 vectors) qualifies as an inner product for R^{mn×1} if and only if the matrix of the bilinear form is symmetric and positive definite. Thus, g is an inner product (for R^{m×n}) if and only if there exists an mn × mn symmetric positive definite matrix W such that (for all A and B in R^{m×n})

A * B = (vec A)'W vec B.

Further, partitioning W as

W = (W11  ...  W1n)
    ( .   ...   . )
    (Wn1  ...  Wnn),

and denoting by a1, a2, ..., an and b1, b2, ..., bn the first, second, ..., nth columns of A and B, respectively, we find that

(vec A)'W vec B = ∑_{i,j} ai'Wij bj.

(c) It follows from Part (b) that g is an inner product (for R^{1×n}) if and only if there exists an n × n symmetric positive definite matrix W = {wij} such that (for every pair of n-dimensional row vectors x' = {xi} and y' = {yi})

x' * y' = ∑_{i,j} xi wij yj,

or equivalently such that (for every x' and y')

x' * y' = x'Wy.
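[A supplementary numerical illustration of Part (b), not part of the original solution: for a symmetric positive definite W built here simply as MM' plus a multiple of the identity, (vec A)'W vec B coincides with the block-by-block sum over the m × m submatrices Wij.]

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 2, 3

M = rng.standard_normal((m*n, m*n))
W = M @ M.T + m*n*np.eye(m*n)                  # an mn x mn symmetric positive definite matrix

A = rng.standard_normal((m, n))
B = rng.standard_normal((m, n))
vecA, vecB = A.flatten(order='F'), B.flatten(order='F')

# (vec A)' W (vec B) written block-by-block as sum_{i,j} a_i' W_ij b_j
blocksum = sum(A[:, i] @ W[i*m:(i+1)*m, j*m:(j+1)*m] @ B[:, j]
               for i in range(n) for j in range(n))
print(np.isclose(vecA @ W @ vecB, blocksum))   # True
```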

EXERCISE 18. (a) Define (for m ≥ 2) P to be the mn × mn permutation matrix such that, for every m × n matrix A,

((vec A*)', r')' = P vec A,

where A* is the (m − 1) × n matrix whose rows are respectively the first, ..., (m − 1)th rows of A and r' is the mth row of A [and hence where A = ((A*)', r)' and A' = ((A*)', r)].

(1) Show that Kmn = diag(K_{m−1,n}, In) P.
(2) Show that |P| = (−1)^{(m−1)n(n−1)/2}.
(3) Show that |Kmn| = (−1)^{(m−1)n(n−1)/2} |K_{m−1,n}|.

(b) Show that |Kmn| = (−1)^{m(m−1)n(n−1)/4}.

(c) Show that |Kmm| = (−1)^{m(m−1)/2}.

Solution. (a) (1) Since vec((A*)') = K_{m−1,n} vec A*, we have [in light of the defining relation (3.1)] that

Kmn vec A = vec(A') = ((vec((A*)'))', r')' = diag(K_{m−1,n}, In) ((vec A*)', r')' = diag(K_{m−1,n}, In) P vec A.

Thus,
Kmn a = diag(K_{m−1,n}, In) P a
for every mn-dimensional column vector a, implying (in light of Lemma 2.3.2) that
Kmn = diag(K_{m−1,n}, In) P.

(2) The vector r is the n × 1 vector whose first, second, ..., nth elements are respectively the mth, (2m)th, ..., (nm)th elements of vec A, and vec A* is the (m − 1)n-dimensional subvector of vec A obtained by striking out those n elements. Accordingly, P = (P1', P2')', where P2 is the n × mn matrix whose first, second, ..., nth rows are respectively the mth, (2m)th, ..., (nm)th rows of Imn, and P1 is the (m − 1)n × mn submatrix of Imn obtained by striking out those n rows. Now, applying Lemma 13.1.3, we find that |P| = (−1)^φ, where

φ = (m − 1)·1 + (m − 1)·2 + ... + (m − 1)(n − 1) = (m − 1)n(n − 1)/2.

(3) It follows from Parts (1) and (2) that

|Kmn| = |P| |diag(K_{m−1,n}, In)| = (−1)^{(m−1)n(n−1)/2} |K_{m−1,n}|.

(b) It follows from Part (a) that, for i ≥ 2,
|Kin| = (−1)^{(i−1)n(n−1)/2} |K_{i−1,n}|.
By applying this equality m − 1 times (with i = m, m − 1, ..., 2, respectively), we find that, for m ≥ 2,
|Kmn| = (−1)^{(m−1)n(n−1)/2} |K_{m−1,n}|
= (−1)^{(m−1)n(n−1)/2} (−1)^{(m−2)n(n−1)/2} |K_{m−2,n}|
= (−1)^{[(m−1)+(m−2)+...+1]n(n−1)/2} |K_{1n}|
= (−1)^{[m(m−1)/2]n(n−1)/2} |K_{1n}|
= (−1)^{m(m−1)n(n−1)/4} |K_{1n}|.
Since K_{1n} = In [and since (−1)^0 = 1], we conclude that (for m ≥ 1)

|Kmn| = (−1)^{m(m−1)n(n−1)/4}.

(c) It follows from Part (b) that

|Kmm| = (−1)^{[m(m−1)/2]²}.

Since the product of two odd numbers is odd and the product of two even numbers even, we conclude that
|Kmm| = (−1)^{m(m−1)/2}.
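[A supplementary numerical check of Part (b), not part of the original solution: the helper below builds the commutation matrix from its defining property Kmn vec A = vec(A') under the column-stacking vec convention; the construction is an illustrative sketch, not the book's.]

```python
import numpy as np

def commutation(m, n):
    """K with K @ vec(A) = vec(A.T) for an m x n A (column-major vec)."""
    K = np.zeros((m*n, m*n))
    for i in range(m):
        for j in range(n):
            K[i*n + j, j*m + i] = 1.0
    return K

for m in range(1, 5):
    for n in range(1, 5):
        det = round(np.linalg.det(commutation(m, n)))
        assert det == (-1)**(m*(m-1)*n*(n-1)//4)   # holds for every pair (m, n) tried
print("determinant formula confirmed for m, n = 1, ..., 4")
```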

EXERCISE 19. Show that, for any m × n matrix A, p × 1 vector a, and q × 1 vector b,
(1) b' ⊗ A ⊗ a = Kmp[(ab') ⊗ A];
(2) a ⊗ A ⊗ b' = Kpm[A ⊗ (ab')].
Solution. Making use of Corollary 16.3.3 and of results (1.16) and (1.4), we find that
b' ⊗ A ⊗ a = (b' ⊗ A) ⊗ a = Kmp[a ⊗ (b' ⊗ A)]
= Kmp[(a ⊗ b') ⊗ A] = Kmp[(ab') ⊗ A]
and similarly that
a ⊗ A ⊗ b' = a ⊗ (A ⊗ b') = Kpm[(A ⊗ b') ⊗ a]
= Kpm[A ⊗ (b' ⊗ a)] = Kpm[A ⊗ (ab')].

EXERCISE 20. Let m and n represent positive integers, and let ei represent the ith column of Im (i = 1, ..., m) and uj represent the jth column of In (j = 1, ..., n). Show that
Kmn = ∑_{j=1}^n uj' ⊗ Im ⊗ uj = ∑_{i=1}^m ei ⊗ In ⊗ ei'.

Solution. Starting with result (3.3) and using results (1.4) and (2.4.4), we find that

Kmn = ∑_{i,j} (ei uj') ⊗ (uj ei')
= ∑_{i,j} uj' ⊗ ei ⊗ ei' ⊗ uj
= ∑_j uj' ⊗ (∑_i ei ⊗ ei') ⊗ uj
= ∑_j uj' ⊗ (∑_i ei ei') ⊗ uj = ∑_j uj' ⊗ Im ⊗ uj.

Similarly,

Kmn = ∑_{i,j} (ei uj') ⊗ (uj ei')
= ∑_{i,j} ei ⊗ uj' ⊗ uj ⊗ ei'
= ∑_i ei ⊗ (∑_j uj' ⊗ uj) ⊗ ei'
= ∑_i ei ⊗ (∑_j uj uj') ⊗ ei' = ∑_i ei ⊗ In ⊗ ei'.
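[A supplementary numerical check of the two representations of Kmn, not part of the original solution; the commutation-matrix helper is the same illustrative construction used above.]

```python
import numpy as np

def commutation(m, n):
    K = np.zeros((m*n, m*n))
    for i in range(m):
        for j in range(n):
            K[i*n + j, j*m + i] = 1.0
    return K

m, n = 3, 4
Im, In = np.eye(m), np.eye(n)
e = [Im[:, [i]] for i in range(m)]     # columns of I_m
u = [In[:, [j]] for j in range(n)]     # columns of I_n

S1 = sum(np.kron(np.kron(u[j].T, Im), u[j]) for j in range(n))
S2 = sum(np.kron(np.kron(e[i], In), e[i].T) for i in range(m))
K = commutation(m, n)
print(np.allclose(S1, K), np.allclose(S2, K))   # True True
```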

EXERCISE 21. Let m, n, and p represent positive integers. Using the result of Exercise 20, show that
(a) K_{mp,n} = K_{p,mn} K_{m,np};
(b) K_{mp,n} K_{np,m} K_{mn,p} = I;
(c) K_{n,mp} = K_{np,m} K_{mn,p};
(d) K_{p,mn} K_{m,np} = K_{m,np} K_{p,mn};
(e) K_{np,m} K_{mn,p} = K_{mn,p} K_{np,m};
(f) K_{m,np} K_{mp,n} = K_{mp,n} K_{m,np}.
[Hint. Begin by letting uj represent the jth column of In and showing that K_{mp,n} = ∑_j (uj' ⊗ Ip) ⊗ (Im ⊗ uj) and then making use of the result that, for any m × n matrix A and p × q matrix B, B ⊗ A = K_{pm}(A ⊗ B)K_{nq}.]
Solution. (a) Letting uj represent the jth column of In and making use of the result of Exercise 20 and the result cited in the hint [or equivalently result (3.10)], we find [in light of results (1.8), (1.16), (1.4), and (2.4.4)] that

K_{mp,n} = ∑_j uj' ⊗ I_{mp} ⊗ uj
= ∑_j uj' ⊗ (Ip ⊗ Im) ⊗ uj
= ∑_j (uj' ⊗ Ip) ⊗ (Im ⊗ uj)
= ∑_j K_{p,mn}[(Im ⊗ uj) ⊗ (uj' ⊗ Ip)]K_{m,np}
= ∑_j K_{p,mn}[Im ⊗ (uj uj') ⊗ Ip]K_{m,np}
= K_{p,mn}[Im ⊗ (∑_j uj uj') ⊗ Ip]K_{m,np}
= K_{p,mn}(Im ⊗ In ⊗ Ip)K_{m,np}
= K_{p,mn} I_{mnp} K_{m,np} = K_{p,mn} K_{m,np}.

(b) Making use of Part (a) and result (3.6), we find that

K_{mp,n} K_{np,m} K_{mn,p} = K_{p,mn} K_{m,np} K_{np,m} K_{mn,p}
= K_{p,mn} I K_{mn,p} = K_{p,mn} K_{mn,p} = I.

(c) Making use of Part (b) and result (3.6), we find that

K_{n,mp} = K_{n,mp} I_{mnp} = K_{n,mp} K_{mp,n} K_{np,m} K_{mn,p}
= I K_{np,m} K_{mn,p} = K_{np,m} K_{mn,p}.

(d) Using Part (a) (twice), we find that

K_{p,mn} K_{m,np} = K_{mp,n} = K_{pm,n} = K_{m,pn} K_{p,nm} = K_{m,np} K_{p,mn}.

(e) Using Part (c) (twice), we find that

K_{np,m} K_{mn,p} = K_{n,mp} = K_{n,pm} = K_{nm,p} K_{pn,m} = K_{mn,p} K_{np,m}.

(f) Making use of Parts (c) and (a) and result (3.6), we find that

K_{m,np} K_{mp,n} = (K_{mp,n} K_{nm,p})(K_{p,mn} K_{m,np})
= K_{mp,n} K_{nm,p} K_{p,mn} K_{m,np} = K_{mp,n} K_{m,np}.

EXERCISE 22. Let A represent an m × n matrix, and define B = Kmn(A' ⊗ A). Show (a) that B is symmetric, (b) that rank(B) = [rank(A)]², (c) that B² = (AA') ⊗ (A'A), and (d) that tr(B) = tr(A'A).

Solution. (a) Making use of results (1.15), (3.6), and (3.9), we find that

B' = (A' ⊗ A)'K'_{mn} = (A ⊗ A')K_{nm} = Kmn(A' ⊗ A) = B.

(b) Since Kmn is nonsingular, rank(B) = rank(A' ⊗ A). Moreover, it follows from result (1.26) that rank(A' ⊗ A) = rank(A') rank(A). Since rank(A') = rank(A), we conclude that rank(B) = [rank(A)]².
(c) Making use of results (3.10) and (1.19), we find that

B² = Kmn(A' ⊗ A)Kmn(A' ⊗ A) = (A ⊗ A')(A' ⊗ A) = (AA') ⊗ (A'A).

(d) That tr(B) = tr(A'A) is an immediate consequence of the second equality
in result (3.15).
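[A supplementary numerical check of Parts (a)-(d), not part of the original solution; the commutation-matrix helper is the same illustrative construction used earlier, and the third column of A is made a sum of the first two so that the rank statement is checked in a nondegenerate case.]

```python
import numpy as np

def commutation(m, n):
    K = np.zeros((m*n, m*n))
    for i in range(m):
        for j in range(n):
            K[i*n + j, j*m + i] = 1.0
    return K

rng = np.random.default_rng(3)
m, n = 4, 3
A = rng.standard_normal((m, n))
A[:, 2] = A[:, 0] + A[:, 1]                                   # rank(A) = 2

B = commutation(m, n) @ np.kron(A.T, A)
print(np.allclose(B, B.T))                                            # (a)
print(np.linalg.matrix_rank(B) == np.linalg.matrix_rank(A)**2)        # (b)
print(np.allclose(B @ B, np.kron(A @ A.T, A.T @ A)))                  # (c)
print(np.isclose(np.trace(B), np.trace(A.T @ A)))                     # (d)
```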

EXERCISE 23. Show that, for any m × n matrix A and any p × q matrix B,

vec(A ⊗ B) = (In ⊗ G)vec A = (H ⊗ Ip)vec B,

where G = (Kqm ⊗ Ip)(Im ⊗ vec B) and H = (In ⊗ Kqm)[vec(A) ⊗ Iq].

Solution. Making use of results (1.20), (1.1), and (1.8), we find that

vec(A) ⊗ vec(B) = (Imn ⊗ vec B)[vec(A) ⊗ 1]
= (Imn ⊗ vec B)vec A = (In ⊗ Im ⊗ vec B)vec A

and similarly that

vec(A) ⊗ vec(B) = [vec(A) ⊗ Ipq](1 ⊗ vec B)
= [vec(A) ⊗ Ipq]vec B = [vec(A) ⊗ Iq ⊗ Ip]vec B.

Now, substituting these expressions [for vec(A) ⊗ vec(B)] into formula (3.16) and making use of result (1.19), we obtain

vec(A ⊗ B) = [In ⊗ (Kqm ⊗ Ip)][In ⊗ (Im ⊗ vec B)]vec A = (In ⊗ G)vec A

and

vec(A ⊗ B) = [(In ⊗ Kqm) ⊗ Ip]{[vec(A) ⊗ Iq] ⊗ Ip}vec B = (H ⊗ Ip)vec B.
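[A supplementary numerical check of both expressions for vec(A ⊗ B), not part of the original solution; it uses the column-stacking vec convention and the same illustrative commutation-matrix helper as above.]

```python
import numpy as np

def commutation(m, n):
    K = np.zeros((m*n, m*n))
    for i in range(m):
        for j in range(n):
            K[i*n + j, j*m + i] = 1.0
    return K

rng = np.random.default_rng(4)
m, n, p, q = 2, 3, 2, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, q))
vec = lambda M: M.flatten(order='F').reshape(-1, 1)

G = np.kron(commutation(q, m), np.eye(p)) @ np.kron(np.eye(m), vec(B))
H = np.kron(np.eye(n), commutation(q, m)) @ np.kron(vec(A), np.eye(q))
lhs = vec(np.kron(A, B))
print(np.allclose(lhs, np.kron(np.eye(n), G) @ vec(A)))   # True
print(np.allclose(lhs, np.kron(H, np.eye(p)) @ vec(B)))   # True
```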

EXERCISE 24. Show that, for Hn = (Gn'Gn)⁻¹Gn',
EXERCISE 25. There exists a unique matrix Ln such that
vech A = Ln vec A
for every n × n matrix A (symmetric or not). [The matrix Ln is one choice for the matrix Hn, i.e., for a left inverse of Gn. It is referred to by Magnus and Neudecker (1980) as the elimination matrix: the effect of premultiplying the vec of an n × n matrix A by Ln is to eliminate (from vec A) the "supradiagonal" elements of A.]
(a) Write out the elements of L1, L2, and L3.
(b) For an arbitrary positive integer n, describe Ln in terms of its rows.
Solution. (a) L1 = (1),

L2 = (1 0 0 0)
     (0 1 0 0)
     (0 0 0 1),

and

L3 = (1 0 0 0 0 0 0 0 0)
     (0 1 0 0 0 0 0 0 0)
     (0 0 1 0 0 0 0 0 0)
     (0 0 0 0 1 0 0 0 0)
     (0 0 0 0 0 1 0 0 0)
     (0 0 0 0 0 0 0 0 1).

(b) For i ≥ j, the [(j − 1)(n − j/2) + i]th row of Ln is the [(j − 1)n + i]th row of I_{n²}.
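[A supplementary sketch, not part of the original solution: the row description of Part (b) translates directly into a construction of Ln, checked here on a nonsymmetric matrix; vech stacks the columns of the lower triangle (diagonal included), and vec is column-stacking.]

```python
import numpy as np

def vech(A):
    n = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(n)])

def elimination(n):
    """L_n with vech(A) = L_n vec(A) for every n x n A (column-major vec)."""
    L = np.zeros((n*(n+1)//2, n*n))
    row = 0
    for j in range(n):
        for i in range(j, n):
            L[row, j*n + i] = 1.0          # picks a_ij out of vec(A)
            row += 1
    return L

rng = np.random.default_rng(5)
n = 3
A = rng.standard_normal((n, n))            # not symmetric
print(np.allclose(vech(A), elimination(n) @ A.flatten(order='F')))   # True
print(elimination(2))                      # the matrix L_2 displayed in Part (a)
```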

EXERCISE 26. Let A represent an n × n matrix and b an n × 1 vector.
(a) Show that (1/2)[(A ⊗ b') + (b' ⊗ A)]Gn = (A ⊗ b')Gn.
(b) Show that, for Hn = (Gn'Gn)⁻¹Gn',
(1) (1/2)Hn[(b ⊗ A) + (A ⊗ b)] = Hn(b ⊗ A);
(2) (A ⊗ b')GnHn = (1/2)[(A ⊗ b') + (b' ⊗ A)];
(3) GnHn(b ⊗ A) = (1/2)[(b ⊗ A) + (A ⊗ b)].

Solution. (a) Using results (3.13) and (4.16), we find that

(1/2)[(A ⊗ b') + (b' ⊗ A)]Gn = (A ⊗ b')[(1/2)(I_{n²} + Knn)]Gn = (A ⊗ b')Gn.

(b) Using results (3.12), (4.17), (4.22), and (3.13), we find that, for Hn = (Gn'Gn)⁻¹Gn',
(1) (1/2)Hn[(b ⊗ A) + (A ⊗ b)] = Hn[(1/2)(I_{n²} + Knn)](b ⊗ A) = Hn(b ⊗ A);
(2) (A ⊗ b')GnHn = (A ⊗ b')[(1/2)(I_{n²} + Knn)] = (1/2)[(A ⊗ b') + (b' ⊗ A)];
(3) GnHn(b ⊗ A) = (1/2)(I_{n²} + Knn)(b ⊗ A) = (1/2)[(b ⊗ A) + (A ⊗ b)].

EXERCISE 27. Let A = {aij} represent an n × n (possibly nonsymmetric) matrix.
(a) Show that, for Hn = (Gn'Gn)⁻¹Gn',
Hn vec A = (1/2) vech(A + A').
(b) Show that
Gn'Gn vech A = vech[2A − diag(a11, a22, ..., ann)].
(c) Show that
Gn' vec A = vech[A + A' − diag(a11, a22, ..., ann)].

Solution. (a) Since (1/2)(A + A') is an n × n symmetric matrix, we have [in light of result (4.17)] that, for Hn = (Gn'Gn)⁻¹Gn',

Hn vec A = Hn[(1/2)(I_{n²} + Knn)]vec A
= (1/2)Hn(vec A + Knn vec A)
= (1/2)Hn(vec A + vec A')
= (1/2)Hn vec(A + A')
= (1/2)HnGn vech(A + A') = (1/2)vech(A + A').

(b) The matrix Gn'Gn is diagonal. Further, the [(i − 1)(n − i/2) + i]th diagonal element of Gn'Gn equals 1, and the [(i − 1)(n − i/2) + i]th elements of vech A and vech[2A − diag(a11, a22, ..., ann)] both equal aii, so that the [(i − 1)(n − i/2) + i]th elements of Gn'Gn vech A and vech[2A − diag(a11, a22, ..., ann)] both equal aii. And, for i > j, the [(j − 1)(n − j/2) + i]th diagonal element of Gn'Gn equals 2, the [(j − 1)(n − j/2) + i]th element of vech A equals aij, and the [(j − 1)(n − j/2) + i]th element of vech[2A − diag(a11, a22, ..., ann)] equals 2aij, so that (for i > j) the [(j − 1)(n − j/2) + i]th elements of Gn'Gn vech A and vech[2A − diag(a11, a22, ..., ann)] both equal 2aij. We conclude that

Gn'Gn vech A = vech[2A − diag(a11, a22, ..., ann)].

(c) Using the results of Parts (a) and (b), we find that

Gn' vec A = Gn'Gn[(Gn'Gn)⁻¹Gn' vec A]
= (1/2)Gn'Gn vech(A + A')
= (1/2)vech[2(A + A') − diag(2a11, 2a22, ..., 2ann)]
= vech[A + A' − diag(a11, a22, ..., ann)].
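[A supplementary numerical check of Parts (a)-(c), not part of the original solution: the duplication matrix Gn is built from its defining property vec A = Gn vech A for symmetric A, and Hn is computed as (Gn'Gn)⁻¹Gn'; the construction is an illustrative sketch under the column-stacking vec convention.]

```python
import numpy as np

def vech(A):
    n = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(n)])

def duplication(n):
    """G_n with vec(A) = G_n vech(A) for every symmetric n x n A."""
    G = np.zeros((n*n, n*(n+1)//2))
    col = 0
    for j in range(n):
        for i in range(j, n):
            G[j*n + i, col] = 1.0
            G[i*n + j, col] = 1.0
            col += 1
    return G

rng = np.random.default_rng(6)
n = 4
A = rng.standard_normal((n, n))                      # possibly nonsymmetric
G = duplication(n)
H = np.linalg.solve(G.T @ G, G.T)                    # H_n = (G_n'G_n)^(-1) G_n'
vecA = A.flatten(order='F')

print(np.allclose(H @ vecA, 0.5*vech(A + A.T)))                          # Part (a)
print(np.allclose(G.T @ G @ vech(A), vech(2*A - np.diag(np.diag(A)))))   # Part (b)
print(np.allclose(G.T @ vecA, vech(A + A.T - np.diag(np.diag(A)))))      # Part (c)
```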

EXERCISE 28. Let A represent a square matrix of order n. Show that, for Hn = (Gn'Gn)⁻¹Gn',

GnHn(A ⊗ A)Hn' = (A ⊗ A)Hn'.

Solution. Using result (4.26), we find that, for Hn = (Gn'Gn)⁻¹Gn',

GnHn(A ⊗ A)Hn' = GnHn(A ⊗ A)Gn(Gn'Gn)⁻¹ = (A ⊗ A)Gn(Gn'Gn)⁻¹ = (A ⊗ A)Hn'.

EXERCISE 29. Show that if an n × n matrix A = {aij} is upper triangular, lower triangular, or diagonal, then Hn(A ⊗ A)Gn is respectively upper triangular, lower triangular, or diagonal with diagonal elements aii ajj (i = 1, ..., n; j = i, ..., n).

Solution. Let us use mathematical induction to show that for any n × n upper triangular matrix A = {aij}, Hn(A ⊗ A)Gn is upper triangular with diagonal elements aii ajj (i = 1, ..., n; j = i, ..., n).
For every 1 × 1 upper triangular matrix A = (a11), H1(A ⊗ A)G1 is the 1 × 1 matrix (a11²), which is upper triangular with diagonal element aii ajj (i = 1; j = 1). Suppose now that, for every n × n upper triangular matrix A = {aij}, Hn(A ⊗ A)Gn is upper triangular with diagonal elements aii ajj (i = 1, ..., n; j = i, ..., n), and let B = {bij} represent an (n + 1) × (n + 1) upper triangular matrix. Then, to complete the induction argument, it suffices to show that H_{n+1}(B ⊗ B)G_{n+1} is upper triangular with diagonal elements bii bjj (i = 1, ..., n + 1; j = i, ..., n + 1).
For this purpose, partition B as

B = (c  b')
    (a  A )

(where A is n × n with ijth element b_{i+1,j+1}). Then (since B is upper triangular), a = 0, and it follows from result (4.29) that

H_{n+1}(B ⊗ B)G_{n+1} = (c²  2cb'  (b' ⊗ b')Gn )
                        (0   cA    (b' ⊗ A)Gn  )
                        (0   0     Hn(A ⊗ A)Gn ).

Moreover, A is upper triangular, and hence (by supposition) Hn(A ⊗ A)Gn is upper triangular with diagonal elements bii bjj (i = 2, ..., n + 1; j = i, ..., n + 1).
Thus, H_{n+1}(B ⊗ B)G_{n+1} is upper triangular. And, its diagonal elements are c² = b11², cbjj = b11 bjj (j = 2, ..., n + 1), and bii bjj (i = 2, ..., n + 1; j = i, ..., n + 1); that is, its diagonal elements are bii bjj (i = 1, ..., n + 1; j = i, ..., n + 1).
It can be established via an analogous argument that, for any n × n lower triangular matrix A = {aij}, Hn(A ⊗ A)Gn is lower triangular with diagonal elements aii ajj (i = 1, ..., n; j = i, ..., n).
Finally, note that if an n × n matrix A = {aij} is diagonal, then A is both upper and lower triangular, in which case Hn(A ⊗ A)Gn is both upper and lower triangular, and hence diagonal, with diagonal elements aii ajj (i = 1, ..., n; j = i, ..., n).

EXERCISE 30. Let A1, ..., Ak, and B represent m × n matrices, and let b = vec B.
(a) Show that the matrix equation ∑_{i=1}^k xi Ai = B (in unknowns x1, ..., xk) is equivalent to a linear system of the form Ax = b, where x = (x1, ..., xk)' is a vector of unknowns.
(b) Show that if A1, ..., Ak, and B are symmetric, then the matrix equation ∑_{i=1}^k xi Ai = B (in unknowns x1, ..., xk) is equivalent to a linear system of the form A*x = b*, where b* = vech B and x = (x1, ..., xk)' is a vector of unknowns.

Solution. Let A = (vec A1, ..., vec Ak).
(a) Making use of result (2.6), we find that

vec(∑_i xi Ai) = ∑_i xi vec Ai = Ax.

Since clearly the (matrix) equation ∑_i xi Ai = B is equivalent to the (vector) equation vec(∑_i xi Ai) = vec B, we conclude that the equation ∑_i xi Ai = B is equivalent to the linear system Ax = b.
(b) Suppose that A1, ..., Ak, and B are symmetric (in which case m = n). And, let A* = (vech A1, ..., vech Ak). Then, for any value of x such that A*x = b*,

Ax = GnA*x = Gnb* = b,

and conversely, for any value of x such that Ax = b,

A*x = HnAx = Hnb = b*.

We conclude that the linear system A*x = b* is equivalent to the linear system Ax = b and hence [in light of Part (a)] equivalent to the equation ∑_i xi Ai = B.
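[A supplementary illustration of Part (a), not part of the original solution: a minimal NumPy sketch in which a matrix equation ∑ xi Ai = B with a known solution is rewritten, via vec, as the linear system Ax = b and solved.]

```python
import numpy as np

rng = np.random.default_rng(7)
m, n, k = 3, 4, 3
A_list = [rng.standard_normal((m, n)) for _ in range(k)]
x_true = np.array([2.0, -1.0, 0.5])
B = sum(x * Ai for x, Ai in zip(x_true, A_list))

# A = (vec A_1, ..., vec A_k), b = vec B; then sum_i x_i A_i = B  <=>  A x = b
A = np.column_stack([Ai.flatten(order='F') for Ai in A_list])
b = B.flatten(order='F')
x = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(x, x_true))   # True
```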

EXERCISE 31. Let F represent a p × p matrix of functions, defined on a set S, of a vector x = (x1, ..., xm)' of m variables. Show that, for k = 2, 3, ...,

∂vec(F^k)/∂x' = ∑_{s=1}^k [(F^{s−1})' ⊗ F^{k−s}] ∂vec F/∂x'

(where F⁰ = Ip).


Solution. Making use of results (6.1), (15.4.8), and (2.10), we find that

∂vec(F^k)/∂xj = vec[∂(F^k)/∂xj]
= vec(F^{k−1} ∂F/∂xj + F^{k−2} (∂F/∂xj)F + ... + (∂F/∂xj)F^{k−1})
= vec(∑_{s=1}^k F^{k−s} (∂F/∂xj) F^{s−1})
= ∑_{s=1}^k vec(F^{k−s} (∂F/∂xj) F^{s−1})
= ∑_{s=1}^k [(F^{s−1})' ⊗ F^{k−s}] vec(∂F/∂xj)
= ∑_{s=1}^k [(F^{s−1})' ⊗ F^{k−s}] ∂vec F/∂xj,

implying that

∂vec(F^k)/∂x' = ∑_{s=1}^k [(F^{s−1})' ⊗ F^{k−s}] ∂vec F/∂x'.

EXERCISE 32. Let F = {fis} and G represent p × q and r × s matrices of functions, defined on a set S, of a vector x = (x1, ..., xm)' of m variables.
(a) Show that (for j = 1, ..., m)
∂(F ⊗ G)/∂xj = (F ⊗ ∂G/∂xj) + (∂F/∂xj ⊗ G).
(b) Show that (for j = 1, ..., m)
∂vec(F ⊗ G)/∂xj = (Iq ⊗ Ksp ⊗ Ir)[(vec F) ⊗ ∂vec G/∂xj + ∂vec F/∂xj ⊗ (vec G)].
(c) Show that
∂vec(F ⊗ G)/∂x' = (Iq ⊗ Ksp ⊗ Ir)[(vec F) ⊗ ∂vec G/∂x' + ∂vec F/∂x' ⊗ (vec G)].
(d) Show that, in the special case where x' = [(vec X)', (vec Y)'], F(x) = X, and G(x) = Y for some p × q and r × s matrices X and Y of variables, the formula in Part (c) simplifies to
∂vec(X ⊗ Y)/∂x' = (Iq ⊗ Ksp ⊗ Ir)[Ipq ⊗ (vec Y), (vec X) ⊗ Irs].

Solution. (a) Partition each of the three matrices ∂(F ⊗ G)/∂xj, F ⊗ (∂G/∂xj), and (∂F/∂xj) ⊗ G into p rows and q columns of r × s dimensional blocks. Then, for i = 1, ..., p and s = 1, ..., q, the isth blocks of ∂(F ⊗ G)/∂xj, F ⊗ (∂G/∂xj), and (∂F/∂xj) ⊗ G are respectively ∂(fis G)/∂xj, fis(∂G/∂xj), and (∂fis/∂xj)G, implying [in light of result (15.4.9)] that the isth block of ∂(F ⊗ G)/∂xj equals the sum of the isth blocks of F ⊗ (∂G/∂xj) and (∂F/∂xj) ⊗ G. We conclude that

∂(F ⊗ G)/∂xj = (F ⊗ ∂G/∂xj) + (∂F/∂xj ⊗ G).

(b) Making use of Part (a) and Theorem 16.3.5, we find that

∂vec(F ⊗ G)/∂xj = vec[∂(F ⊗ G)/∂xj]
= vec(F ⊗ ∂G/∂xj) + vec(∂F/∂xj ⊗ G)
= (Iq ⊗ Ksp ⊗ Ir)[(vec F) ⊗ vec(∂G/∂xj) + vec(∂F/∂xj) ⊗ (vec G)]
= (Iq ⊗ Ksp ⊗ Ir)[(vec F) ⊗ ∂vec G/∂xj + ∂vec F/∂xj ⊗ (vec G)].

(c) In light of result (1.28), (vec F) ⊗ [∂(vec G)/∂xj] is the jth column of (vec F) ⊗ [∂(vec G)/∂x']. And, in light of result (1.27), [∂(vec F)/∂xj] ⊗ (vec G) is the jth column of [∂(vec F)/∂x'] ⊗ (vec G). Thus, it follows from Part (b) that

∂vec(F ⊗ G)/∂x' = (Iq ⊗ Ksp ⊗ Ir)[(vec F) ⊗ ∂vec G/∂x' + ∂vec F/∂x' ⊗ (vec G)].

(d) In this special case,

∂vec G/∂x' = (0, Irs),   ∂vec F/∂x' = (Ipq, 0),

implying [in light of results (1.28) and (1.27)] that

(vec F) ⊗ ∂vec G/∂x' = [0, (vec F) ⊗ Irs]

and that

∂vec F/∂x' ⊗ (vec G) = [Ipq ⊗ (vec G), 0],

so that it follows from Part (c) that

∂vec(F ⊗ G)/∂x' = (Iq ⊗ Ksp ⊗ Ir)[Ipq ⊗ (vec G), (vec F) ⊗ Irs].
17
Intersections and Sums of Subspaces

EXERCISE 1. Let U and V represent subspaces of R m xn .


(a) Show thatU U V c U + V.
(b) Show that U + V is the smallest subspace (of R m x n) that contains U U V,
or, equivalently [in light of Part (a)], show that, for any subspace W such that
U U V c W, U + V c W.
Solution. (a) Let A represent an arbitrary m x n matrix in U U V, so that (by
definition) A E U or A E V. Upon observing that A = A + 0 = 0 + A and that the
m x n null matrix 0 is a member of V and also of U, we conclude that A E U + V.
(b) Let A represent an arbitrary matrix in U + V. Then, A = U + V for some
matrix U E U and some matrix V E V. Moreover, both U and V are in U U V, and
hence both are in W (where W is an arbitrary subspace such that U U V c W).
Since W is a linear space, it follows that A (= U + V) is in W.

EXERCISE 2. Let

A = (1  0)            (0  2)
    (0  1)   and  B = (1  1).
    (0  0)            (2  3)

Find (a) a basis for C(A) + C(B), (b) a basis for C(A) ∩ C(B), and (c) a vector in C(A) + C(B) that is not in C(A) ∪ C(B).
Solution. (a) According to result (1.4), C(A) + C(B) = C(A, B). Since the partitioned matrix (A, B) has only 3 rows, its rank cannot exceed 3. Further, the first 3 columns of (A, B) are linearly independent. We conclude that rank(A, B) = 3 and that the first 3 columns of (A, B), namely, (1, 0, 0)', (0, 1, 0)', and (0, 1, 2)', form a basis for C(A, B) and hence for C(A) + C(B).
(b) The column space C(A) of A comprises vectors of the form (x1, x2, 0)' (where x1 and x2 are arbitrary scalars), and C(B) comprises vectors of the form (2y2, y1 + y2, 2y1 + 3y2)' (where y1 and y2 are arbitrary scalars). Thus, C(A) ∩ C(B) comprises those vectors that are expressible as (2y2, y1 + y2, 2y1 + 3y2)' for some scalars y1 and y2 such that 2y1 + 3y2 = 0, or equivalently (since 2y1 + 3y2 = 0 ⇔ y2 = −2y1/3) of those vectors that are expressible as (−4y1/3, y1/3, 0)' [= y1(−4/3, 1/3, 0)'] for some scalar y1. We conclude that C(A) ∩ C(B) is of dimension one and that the set whose only member is (−4, 1, 0)' (obtained by setting y1 = 3) is a basis for C(A) ∩ C(B).
(c) In light of the solution to Part (a), it suffices to find any 3-dimensional column vector that is not contained in C(A) or C(B). It follows from the solution to Part (b) that the vector (2y2, z, 2y1 + 3y2)', where y1, y2, and z are any scalars such that 2y1 + 3y2 ≠ 0 and z ≠ y1 + y2, is not contained in C(A) or C(B). For example, the vector (0, 2, 2)' (obtained by taking y1 = 1, y2 = 0, and z = 2) is not contained in C(A) or C(B).
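[A supplementary numerical check, not part of the original solution, using the matrices A and B as reconstructed above: the dimension count and the membership claims of Parts (b) and (c) can be confirmed with rank computations and least-squares fits.]

```python
import numpy as np

A = np.array([[1., 0.], [0., 1.], [0., 0.]])
B = np.array([[0., 2.], [1., 1.], [2., 3.]])
AB = np.hstack([A, B])

print(np.linalg.matrix_rank(AB))   # 3 = dim[C(A) + C(B)]
# dim[C(A) n C(B)] = rank(A) + rank(B) - rank(A, B) = 2 + 2 - 3 = 1
print(np.linalg.matrix_rank(A) + np.linalg.matrix_rank(B) - np.linalg.matrix_rank(AB))

v = np.array([-4., 1., 0.])        # lies in both column spaces
w = np.array([0., 2., 2.])         # lies in neither
for target in (v, w):
    for M in (A, B):
        c = np.linalg.lstsq(M, target, rcond=None)[0]
        print(np.allclose(M @ c, target))   # True True for v; False False for w
```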

EXERCISE 3. Let U, W, and X represent subspaces of a linear space V of matrices, and let Y represent an arbitrary matrix in V.
(a) Show (1) that if Y ⊥ W and Y ⊥ X, then Y ⊥ (W + X), and (2) that if U ⊥ W and U ⊥ X, then U ⊥ (W + X).
(b) Show (1) that (U + W)⊥ = U⊥ ∩ W⊥ and (2) that (U ∩ W)⊥ = U⊥ + W⊥.
Solution. (a) (1) Suppose that Y ⊥ W and Y ⊥ X. Let Z represent an arbitrary matrix in W + X. Then, there exists a matrix W in W and a matrix X in X such that Z = W + X. Moreover, Y ⊥ W and Y ⊥ X, implying that Y ⊥ Z. We conclude that Y ⊥ (W + X).
(2) Suppose that U ⊥ W and U ⊥ X. Let U represent an arbitrary matrix in U. Then, U ⊥ W and U ⊥ X, implying [in light of Part (1)] that U ⊥ (W + X). We conclude that U ⊥ (W + X).
(b) (1) Observing that U ⊂ (U + W) and W ⊂ (U + W) and making use of Part (a)-(1), we find that

Y ∈ (U + W)⊥ ⇔ Y ⊥ (U + W)
⇔ Y ⊥ U and Y ⊥ W
⇔ Y ∈ U⊥ and Y ∈ W⊥
⇔ Y ∈ (U⊥ ∩ W⊥).

We conclude that (U + W)⊥ = U⊥ ∩ W⊥.

(2) Making use of Part (1) and Theorem 12.5.4, we find that

(U ∩ W)⊥ = [(U⊥)⊥ ∩ (W⊥)⊥]⊥ = [(U⊥ + W⊥)⊥]⊥ = U⊥ + W⊥.

EXERCISE 4. Let U, W, and X represent subspaces of a linear space V of


matrices.
(a) Show that (U n W) + (U n X) c Un (W + X).
(b) Show (via an example) that U n W = {OJ and U n X = {OJ does not
necessarily imply that Un (W + X) = {OJ.
(c) Show that if W C U, then (1) U +W = U and (2) un (W + X) =
w+(UnX).
Solution. (a) Let Y represent an arbitrary matrix in (U n W) + (U n X). Then,
Y = W + X for some matrix W in U n W and some matrix X in U n X. Since
both W and X are in U, Y is in U, and since W is in W and X in X, Y is in
W + X. Thus, Y is in U n (W + X). We conclude that (U n W) + (U n X) c
Un(W+X).
(b) Suppose that V = R}x2 and that U, W, and X are the one-dimensional
subspaces spanned by (1,1), (1,0), and (0, 1), respectively. Then, clearly, Un
W = {OJ and Un X = {OJ. However, W + X = R lx2 , and consequently
Un (W + X) = U =1= {OJ.
(c) Suppose that W CU.
(1) Since clearly U C U + W, it suffices to show that U + W c U. Let Y
represent an arbitrary matrix in U + W. Then, Y = V + W for some matrix V in U
and some matrix W in W. Moreover, W is in U (since W C U), and consequently
Vis inU. We conclude thatU + We U.
(2) It follows from Part (a)(and the supposition that W C U) that w+(unX) c
U n (W + X). Thus, it suffices to show that U n (W + X) c W + (U n X).
Let Y represent an arbitrary matrix in U n (W + X). Then, YEW + X, so that
Y = W + X for some matrix W in W and some matrix X in X, and also Y E U.
Thus, X = Y - W, and, since (in light of the supposition that W C U) W (like
Y) is in U, X is in U (as well as in X) and hence is in un X. It follows that Y is
in W + (U n X). We conclude that U n (W + X) c W + (U n X).

EXERCISE 5. Let U1, U2, ..., Uk represent subspaces of R^{m×n}. Show that if, for j = 1, 2, ..., k, Uj is spanned by a (finite nonempty) set of (m × n) matrices V_1^{(j)}, ..., V_{rj}^{(j)}, then

U1 + U2 + ... + Uk = sp(V_1^{(1)}, ..., V_{r1}^{(1)}, V_1^{(2)}, ..., V_{r2}^{(2)}, ..., V_1^{(k)}, ..., V_{rk}^{(k)}).

Solution. Suppose that, for j = 1, 2, ..., k, Uj is spanned by the set {V_1^{(j)}, ..., V_{rj}^{(j)}}. The proof that U1 + U2 + ... + Uk is spanned by V_1^{(1)}, ..., V_{r1}^{(1)}, V_1^{(2)}, ..., V_{r2}^{(2)}, ..., V_1^{(k)}, ..., V_{rk}^{(k)} is by mathematical induction.
It follows from Lemma 17.1.1 that

U1 + U2 = sp(V_1^{(1)}, ..., V_{r1}^{(1)}, V_1^{(2)}, ..., V_{r2}^{(2)}).

Now, suppose that (for an arbitrary integer j between 2 and k − 1, inclusive)

U1 + U2 + ... + Uj = sp(V_1^{(1)}, ..., V_{r1}^{(1)}, V_1^{(2)}, ..., V_{r2}^{(2)}, ..., V_1^{(j)}, ..., V_{rj}^{(j)}).

Then, the proof is complete upon observing (in light of Lemma 17.1.1) that

U1 + U2 + ... + U_{j+1} = (U1 + U2 + ... + Uj) + U_{j+1}
= sp(V_1^{(1)}, ..., V_{r1}^{(1)}, V_1^{(2)}, ..., V_{r2}^{(2)}, ..., V_1^{(j+1)}, ..., V_{r_{j+1}}^{(j+1)}).
EXERCISE 6. Let U1, ..., Uk represent subspaces of R^{m×n}. The k subspaces U1, ..., Uk are said to be independent if, for matrices U1 ∈ U1, ..., Uk ∈ Uk, the only solution to the matrix equation

U1 + U2 + ... + Uk = 0    (E.1)

is U1 = ... = Uk = 0.
(a) Show that U1, ..., Uk are independent if and only if, for i = 2, ..., k, Ui and U1 + ... + U_{i−1} are essentially disjoint.
(b) Show that U1, ..., Uk are independent if and only if, for i = 1, ..., k, Ui and U1 + ... + U_{i−1} + U_{i+1} + ... + Uk are essentially disjoint.
(c) Use the results of Exercise 3 [along with Part (a) or (b)] to show that if U1, ..., Uk are (pairwise) orthogonal, then they are independent.
(d) Assuming that U1, U2, ..., Uk are of dimension one or more and letting {U_1^{(j)}, ..., U_{rj}^{(j)}} represent any linearly independent set of matrices in Uj (j = 1, 2, ..., k), show that if U1, U2, ..., Uk are independent, then the combined set {U_1^{(1)}, ..., U_{r1}^{(1)}, U_1^{(2)}, ..., U_{r2}^{(2)}, ..., U_1^{(k)}, ..., U_{rk}^{(k)}} is linearly independent.
(e) Assuming that U1, U2, ..., Uk are of dimension one or more, show that U1, U2, ..., Uk are independent if and only if, for every nonnull matrix U1 in U1, every nonnull matrix U2 in U2, ..., and every nonnull matrix Uk in Uk, U1, U2, ..., Uk are linearly independent.
(f) For j = 1, ..., k, let pj = dim(Uj), and let Sj represent a basis for Uj (j = 1, ..., k). Define S to be the set of ∑_{j=1}^k pj matrices obtained by combining all of the matrices in S1, ..., Sk into a single set. Use the result of Exercise 5 [along with Part (d)] to show that (1) if U1, ..., Uk are independent, then S is a basis for U1 + ... + Uk; and (2) if U1, ..., Uk are not independent, then S contains a proper subset that is a basis for U1 + ... + Uk.
(g) Show that (1) if U1, ..., Uk are independent, then

dim(U1 + ... + Uk) = dim(U1) + ... + dim(Uk);

and (2) if U1, ..., Uk are not independent, then

dim(U1 + ... + Uk) < dim(U1) + ... + dim(Uk).


Solution. (a) It suffices to show that U1, ..., Uk are not independent if and only if, for some i (2 ≤ i ≤ k), Ui and U1 + ... + U_{i−1} are not essentially disjoint.
Suppose that U1, ..., Uk are not independent. Then, by definition, equation (E.1) has a solution, say U1 = U1*, ..., Uk = Uk*, other than U1 = ... = Uk = 0. Let r represent the largest value of i for which Ui* is nonnull. (Clearly, r ≥ 2.) Since U1* + ... + Ur* = U1* + ... + Uk* = 0,

Ur* = −(U1* + ... + U_{r−1}*) ∈ U1 + ... + U_{r−1}.

Thus, for i = r, Ui and U1 + ... + U_{i−1} are not essentially disjoint.

Conversely, suppose that for some i, say i = s, Ui and U1 + ... + U_{i−1} are not essentially disjoint. Then, there exists a nonnull matrix Us such that Us ∈ Us and Us ∈ U1 + ... + U_{s−1}. Further, there exist matrices U1 ∈ U1, ..., U_{s−1} ∈ U_{s−1} such that Us = U1 + ... + U_{s−1}, or equivalently such that U1 + ... + U_{s−1} + (−Us) = 0. Thus, equation (E.1) has a solution other than U1 = ... = Uk = 0.
(b) It suffices to show that U1, ..., Uk are not independent if and only if, for some i (1 ≤ i ≤ k), Ui and U1 + ... + U_{i−1} + U_{i+1} + ... + Uk are not essentially disjoint.
Suppose that U1, ..., Uk are not independent. Then, by definition, equation (E.1) has a solution, say U1 = U1*, ..., Uk = Uk*, other than U1 = ... = Uk = 0. Let r represent an integer (between 1 and k, inclusive) such that Ur* ≠ 0. Since Ur* = −∑_{i≠r} Ui*, Ur* is in the subspace ∑_{i≠r} Ui, as well as in the subspace Ur. Thus, for i = r, Ui and U1 + ... + U_{i−1} + U_{i+1} + ... + Uk are not essentially disjoint.
Conversely, suppose that for some i, say i = s, Ui and U1 + ... + U_{i−1} + U_{i+1} + ... + Uk are not essentially disjoint. Then, there exists a nonnull matrix Us such that Us ∈ Us and Us ∈ ∑_{i≠s} Ui. Further, there exist matrices U1 ∈ U1, ..., U_{s−1} ∈ U_{s−1}, U_{s+1} ∈ U_{s+1}, ..., Uk ∈ Uk such that Us = ∑_{i≠s} Ui, or equivalently such that U1 + ... + U_{s−1} + (−Us) + U_{s+1} + ... + Uk = 0. Thus, equation (E.1) has a solution other than U1 = ... = Uk = 0.
(c) Suppose that UI, ... , Uk are orthogonal. Then, applying the result of Part
(a)-(2) of Exercise 3 (i - 2 times), we find that Ui and UI + ... + Ui-I are
orthogonal, implying (in light of Lemma 17.1.9) that Ui and UI + ... + Ui-I are
essentially disjoint (i = 2, ... , k). Based on Part (a), we conclude that UI, ... , Uk
are independent.
(d) Suppose that U1, U2, ..., Uk are independent. The proof that the set {U_1^{(1)}, ..., U_{r1}^{(1)}, U_1^{(2)}, ..., U_{r2}^{(2)}, ..., U_1^{(k)}, ..., U_{rk}^{(k)}} is linearly independent is by mathematical induction.
By definition, the set {U_1^{(1)}, ..., U_{r1}^{(1)}} is linearly independent. Now, suppose that (for an arbitrary integer j between 1 and k − 1, inclusive) the set {U_1^{(1)}, ..., U_{r1}^{(1)}, U_1^{(2)}, ..., U_{r2}^{(2)}, ..., U_1^{(j)}, ..., U_{rj}^{(j)}} is linearly independent. Then, it suffices to show that the set {U_1^{(1)}, ..., U_{r1}^{(1)}, U_1^{(2)}, ..., U_{r2}^{(2)}, ..., U_1^{(j+1)}, ..., U_{r_{j+1}}^{(j+1)}} is linearly independent.
According to Part (a), U_{j+1} and U1 + ... + Uj are essentially disjoint. And, clearly, U_1^{(1)}, ..., U_{r1}^{(1)}, U_1^{(2)}, ..., U_{r2}^{(2)}, ..., U_1^{(j)}, ..., U_{rj}^{(j)} are in the subspace U1 + U2 + ... + Uj. Thus, it follows from Lemma 17.1.3 that the set {U_1^{(1)}, ..., U_{r1}^{(1)}, U_1^{(2)}, ..., U_{r2}^{(2)}, ..., U_1^{(j+1)}, ..., U_{r_{j+1}}^{(j+1)}} is linearly independent.

(e) Suppose that U1, U2, ..., Uk are independent. Let U1, U2, ..., Uk represent nonnull matrices in U1, U2, ..., Uk, respectively. Then, it follows from Part (d) that the set {U1, U2, ..., Uk} is linearly independent.
Conversely, suppose that, for every nonnull matrix U1 in U1, every nonnull matrix U2 in U2, ..., and every nonnull matrix Uk in Uk, U1, U2, ..., Uk are linearly independent. If U1, U2, ..., Uk were not independent, then, for some nonempty subset {j1, ..., jr} of the first k positive integers, there would exist nonnull matrices U_{j1}, ..., U_{jr} in U_{j1}, ..., U_{jr}, respectively, such that

U_{j1} + ... + U_{jr} = 0,

and the set {U1, U2, ..., Uk} (where, for i ∉ {j1, ..., jr}, Ui is an arbitrary nonnull matrix in Ui) would be linearly dependent, which would be contradictory. Thus, U1, U2, ..., Uk are independent.
(f) It is clear from the result of Exercise 5 that S spans U1 + ... + Uk.
(1) Now, suppose that U1, ..., Uk are independent. Then, it is evident from Part (d) that S is a linearly independent set. Thus, S is a basis for U1 + ... + Uk.
(2) Alternatively, suppose that U1, ..., Uk are not independent. Then, for some (nonempty) subset {j1, ..., jr} of the first k positive integers, there exist nonnull matrices U_{j1}, ..., U_{jr} in U_{j1}, ..., U_{jr}, respectively, such that

U_{j1} + ... + U_{jr} = 0.

Further, for m = 1, ..., r,

U_{jm} = ∑_{i=1}^{p_{jm}} c_i^{(m)} V_i^{(m)},

where c_1^{(m)}, ..., c_{p_{jm}}^{(m)} are scalars (not all of which can be zero) and V_1^{(m)}, ..., V_{p_{jm}}^{(m)} are the matrices in S_{jm}. Thus,

∑_{m=1}^r ∑_{i=1}^{p_{jm}} c_i^{(m)} V_i^{(m)} = 0,

implying that S is a linearly dependent set. We conclude that S itself is not a basis and consequently (in light of Theorem 4.3.11) that S contains a proper subset that is a basis for U1 + ... + Uk.
(g) Part (g) is an immediate consequence of Part (f).

EXERCISE 7. Let A1, ..., Ak represent matrices having the same number of rows, and let B1, ..., Bk represent matrices having the same number of columns. Adopting the terminology of Exercise 6, use Part (g) of that exercise to show (a) that if C(A1), ..., C(Ak) are independent, then

rank(A1, ..., Ak) = rank(A1) + ... + rank(Ak),

and if C(A1), ..., C(Ak) are not independent, then

rank(A1, ..., Ak) < rank(A1) + ... + rank(Ak);

and (b) that if R(B1), ..., R(Bk) are independent, then

rank[(B1', ..., Bk')'] = rank(B1) + ... + rank(Bk),

and if R(B1), ..., R(Bk) are not independent, then

rank[(B1', ..., Bk')'] < rank(B1) + ... + rank(Bk).

Solution. (a) Clearly,

rank(A1) + ... + rank(Ak) = dim[C(A1)] + ... + dim[C(Ak)].

And, in light of equality (1.6),

rank(A1, ..., Ak) = dim[C(A1, ..., Ak)] = dim[C(A1) + ... + C(Ak)].

Thus, it follows from Part (g) of Exercise 6 that if C(A1), ..., C(Ak) are independent, then
rank(A1, ..., Ak) = rank(A1) + ... + rank(Ak),
and if C(A1), ..., C(Ak) are not independent, then
rank(A1, ..., Ak) < rank(A1) + ... + rank(Ak).
(b) The proof of Part (b) is analogous to that of Part (a).

EXERCISE 8. Letting A represent an m x n matrix and B an m x p matrix, show,


by for instance using the result of Part (c)-(2) of Exercise 4 in combination with
the result

C(A, B) = C[A, (I - AA -)B] = C(A) EB C[(I - AA -)B], (*)

that
(a) C[(I - AA -)B] = C(I - AA -) n C(A, B) and

(b) C[(I - PA)B] = N(A' ) n C(A, B).

Solution. (a) According to result (*) [or, equivalently, the first part of Corollary
17.2.9],
C(A, B) = C(A) + C[(I - AA -)B].
Thus, observing thatC[(1 - AA -)B] c C(I - AA -) and making use of the result
of Part (c)-(2) of Exercise 4 (and also of Lemma 17.2.7), we find that

C(I - AA -) n C(A, B) = C(I - AA -) n {C[(I - AA -)B] + C(A)}


= C[(I - AA -)B] + [C(I - AA -) n C(A)]
= C[(I - AA -)B] + {OJ
= C[(I - AA -)B].

(b) According to Part (1) of Theorem 12.3.4, (A' A)- A' is a generalized inverse
of A. Substituting this generalized inverse for A-in the result of Part (a) and
making use of Lemma 12.5.2, we find that

C[(I - PA)B] = C(I - PA) n C(A, B)


= N(A' ) n C(A, B).

EXERCISE 9. Let A = (T, U) and B = (V, 0), where T is an m x p matrix, U


an m x q matrix, and V an n x p matrix, and suppose that U is of full row rank.
Show that R(A) and R(B) are essentially disjoint [even ifR(T) and R(V) are not
essentially disjoint].
Solution. Let x' represent an arbitrary [1 x (p + q)] vector in R(A) n R(B).
Then, x' = r' A and x' = s'B for some (row) vectors r' and S'. Partitioning x' as
x' = (x~, x;) (where x~ is of dimensions 1 x p), we find that

(x~, x;) = r'(T, U) = (r'T, r'U)


and similarly that
(x~, x;) = S'(V, 0) = (S'V, 0).
Thus, r'U = x; = 0, implying (since the rows of U are linearly independent) that
r' = 0 and hence that x' = O. We conclude that R(A) and R(B) are essentially
disjoint [even if R(T) and R(V) are not essentially disjoint].

EXERCISE 10. To what extent does the formula

rank (T  U) = rank(U) + rank(V) + rank[(I − UU⁻)T(I − V⁻V)]    (*)
     (V  0)

[where T is an m × p matrix, U an m × q matrix, and V an n × p matrix] simplify in (a) the special case where C(T) and C(U) are essentially disjoint [but R(T) and R(V) are not necessarily essentially disjoint] and (b) the special case where R(T) and R(V) are essentially disjoint.
Solution. (a) If C(T) and C(U) are essentially disjoint, then {since C[T(I − V⁻V)] ⊂ C(T)} C[T(I − V⁻V)] and C(U) are essentially disjoint, and (in light of Corollary 17.2.10) formula (*) [or, equivalently, formula (2.15)] simplifies to

rank (T  U) = rank(U) + rank(V) + rank[T(I − V⁻V)].
     (V  0)

(b) If R(T) and R(V) are essentially disjoint, then it follows from an analogous line of reasoning that formula (*) [or, equivalently, formula (2.15)] simplifies to

rank (T  U) = rank(U) + rank(V) + rank[(I − UU⁻)T].
     (V  0)

EXERCISE 11. Let T represent an m x p matrix, U an m x q matrix, and V


an n x p matrix. Further, define ET = I - TT-, FT = 1- T-T, X = ETU,
.. d . (T- - T-UX-ET) .
and Y = VF T. Show () a th at the partltlOne matrIx X-ET IS a
generalized inverse of the partitioned matrix (T, U) and (b) that the partitioned
matrix (T- - F T Y-VT-, F T Y-) is a generalized inverse of the partitioned matrix
(~). Do so by applying formula (E.l) from Part (a) of Exercise 10.10 to the
partitioned matrices (! ~) and (~ ~) and by making use of the result that
for any generalized inverse G = (~~) of the partitioned matrix (A, B) and any

generalized inverse H = (HI, H2) of the partitioned matrix (~) (where A is an


m x n matrix, B an m x p matrix, and C a q x n matrix and where Gl has n rows
and HI m columns), (1) Gl is a generalized inverse of A and G2 a generalized
inverse of B if and only if C(A) and C(B) are essentially disjoint, and, similarly,
(2) HI is a generalized inverse of A and H2 a generalized inverse of C if and only
if R(A) and R(C) are essentially disjoint.
Solution. (a) Upon setting V = 0 and W = 0 (in which case Y = 0, Q = 0,
and Z = 0) and choosing Y- = 0 and Z- = 0 in formula (E.1) [from Part (a) of
Exercise 10.10], we obtain as a generalized inverse for (! ~) the partitioned
matrix
_ (T- - T-UX-ET
G- X-ET
0)
0 .
We conclude, on the basis of the cited result (or, equivalently, Theorem 17.3.3),
that ( T- - X-ET
T-UX-ET) IS . a general'Ized'Inverse 0 f (T , U) .

(b) Upon setting U = 0 and W = 0 (in which case X = 0, Q = 0, and Z = 0)


and choosing X- = 0 and Z- = 0 in formula (E. I) [from Part (a) of Exercise
10.10], we obtain as a generalized inverse for (~ ~) the partitioned matrix
_ (T- - FT Y-VT- FT Y-)
G- 0 O·
We conclude, on the basis of the cited result (or, equivalently, Theorem 17.3.3),
that (T- - FT Y-VT-, FT Y-) is a generalized inverse of (~).

EXERCISE 12. Let T represent an m x p matrix, U an m x q matrix, and V


an n x p matrix. And, let (g~~ g~~) (where Gll is of dimensions p x m)

represent a generalized inverse of the partitioned matrix (~ ~). Show that (a)
if GIl is a generalized inverse of T and Gl2 a generalized inverse of V, then R(T)
and R(V) are essentially disjoint, and (b) if GIl is a generalized inverse of T and
G21 a generalized inverse of U, then C(T) and C(U) are essentially disjoint.
Solution. Clearly,

( TGll T + UG21 T + TG12 V + UG22 V TGllU + UG2I U )


VGIIT + VGl2V VGllU

= (~ ~) (g~~ (S.I)

(a) Result (S.I) implies in particular that


VGllT =V - VGI2V. (S.2)
Now, suppose that GIl is a generalized inverse of T and G12 a generalized inverse
of V. Then, equality (S.2) reduces to
VGllT = 0,
and it follows from Corollary 17.2.12 that R(T) and R(V) are essentially disjoint.
(b) The proof of Part (b) is analogous to that of Part (a).

EXERCISE 13. (a) Generalize the result that, for any two subspaces U and V of R^{m×n},

dim(U + V) = dim(U) + dim(V) − dim(U ∩ V).    (*)

Do so by showing that, for any k subspaces U1, ..., Uk,

dim(U1 + ... + Uk) = dim(U1) + ... + dim(Uk) − ∑_{i=2}^k dim[(U1 + ... + U_{i−1}) ∩ Ui].    (E.2)

(b) Generalize the result that, for any m × n matrix A, m × p matrix B, and q × n matrix C,

rank(A, B) = rank(A) + rank(B) − dim[C(A) ∩ C(B)],
rank[(A', C')'] = rank(A) + rank(C) − dim[R(A) ∩ R(C)].

Do so by showing that, for any matrices A1, ..., Ak having the same number of rows,

rank(A1, ..., Ak) = rank(A1) + ... + rank(Ak) − ∑_{i=2}^k dim[C(A1, ..., A_{i−1}) ∩ C(Ai)]

and, for any matrices B1, ..., Bk having the same number of columns,

rank[(B1', ..., Bk')'] = rank(B1) + ... + rank(Bk) − ∑_{i=2}^k dim{[R(B1) + ... + R(B_{i−1})] ∩ R(Bi)}.

Solution. (a) The proof is by mathematical induction. In the special case where k = 2, equality (E.2) reduces to the equality

dim(U1 + U2) = dim(U1) + dim(U2) − dim(U1 ∩ U2),

which is equivalent to equality (*) and whose validity was established in Theorem 17.4.1.
Suppose now that equality (E.2) is valid for k = k' (where k' ≥ 2). Then, making use of Theorem 17.4.1, we find that

dim(U1 + ... + U_{k'} + U_{k'+1})
= dim(U1 + ... + U_{k'}) + dim(U_{k'+1}) − dim[(U1 + ... + U_{k'}) ∩ U_{k'+1}]
= dim(U1) + ... + dim(U_{k'}) − ∑_{i=2}^{k'} dim[(U1 + ... + U_{i−1}) ∩ Ui]
  + dim(U_{k'+1}) − dim[(U1 + ... + U_{k'}) ∩ U_{k'+1}]
= dim(U1) + ... + dim(U_{k'+1}) − ∑_{i=2}^{k'+1} dim[(U1 + ... + U_{i−1}) ∩ Ui],

thereby completing the induction argument.

(b) Applying Part (a) with U1 = C(A1), ..., Uk = C(Ak) [and recalling result (1.6)], we find that

rank(A1, ..., Ak) = dim[C(A1, ..., Ak)]
= dim[C(A1) + ... + C(Ak)]
= dim[C(A1)] + ... + dim[C(Ak)] − ∑_{i=2}^k dim{[C(A1) + ... + C(A_{i−1})] ∩ C(Ai)}
= rank(A1) + ... + rank(Ak) − ∑_{i=2}^k dim[C(A1, ..., A_{i−1}) ∩ C(Ai)].

And, similarly, applying Part (a) with U1 = R(B1), ..., Uk = R(Bk) [and recalling result (1.7)], we find that

rank[(B1', ..., Bk')'] = dim[R((B1', ..., Bk')')]
= dim[R(B1) + ... + R(Bk)]
= dim[R(B1)] + ... + dim[R(Bk)] − ∑_{i=2}^k dim{[R(B1) + ... + R(B_{i−1})] ∩ R(Bi)}
= rank(B1) + ... + rank(Bk) − ∑_{i=2}^k dim{[R(B1) + ... + R(B_{i−1})] ∩ R(Bi)}.

EXERCISE 14. Show that, for any m x n matrix A, n x q matrix C, and q x p


matrix B,

rank{[I- CB(CB)-]C[I - (AC)- AC]}


= rank(A) + rank(C) - rank(AC) - n
+ rank{[I -CB(CB)-](I - A- A)}.

Hint. Apply the equality

rank(AC) = rank(A) + rank(C) - n + rank[(I - CC-)(I - A-A)] (*)

to the product A(CB), and make use of the equality

rank(ACB) = rank(AC) + rank(CB) - rank(C)


+ rank {[I - CB(CB)-] C [I - (AC)- AC]}. (**)

Solution. Making use of equality (*) [or equivalently equality (5.8)], we find that

rank(ACB) = rank[A(CB)]
= rank (A) + rank(CB) - n
+ rank{[1 - CB(CB)-](I - A-A)}. (S.3)

And upon equating expression (**) [or equivalently expression (5.5)] to expression
(S.3), we find that

rank{[1 - CB(CB)-]C[I - (AC)- AC]}


= rank(A) + rank(C) - rank(AC) - n
+ rank{[1 - CB(CB)-](I - A - A)}.

EXERCISE 15. Show that if an n x n matrix A is the projection matrix for a


subspaceU ofRnxl along a subspace VofR nxl (whereU EEl V = Rnxl), then A'
is the projection matrix for Vl. along Ul. [where Ul. and Vl. are the orthogonal
complements (with respect to the usual inner product and relative to R n xl) of U
and V, respectively].

Solution. Suppose that A is the projection matrix for U along V (where U EEl V =
Rnxl). Then, according to Theorem 17.6.14, A is idempotent, U = C(A), and
V = C(I - A). And, since (according to Lemma 10.1.2) A' is idempotent, it
follows from Theorem 17.6.14 that A' is the projection matrix for C(A') along
N(A' ). Moreover, making use of Corollary 11.7.2 and of Lemma 12.5.2, we find
that
C(A') = N(I - A') = Cl.(1 - A) = Vl.

and that

EXERCISE 16. Show that, for any n x p matrix X, XX- is the projection matrix
forC(X) alongN(XX-).

Solution. According to Lemma 10.2.5, XX- is idempotent. Thus, it follows from


Theorem 17.6.14 that XX- is the projection matrix for C(XX-) along N(XX-).
Moreover, according to Lemma 9.3.7, C(XX-) = C(X).
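[A supplementary numerical illustration, not part of the original solution: it uses the Moore-Penrose inverse as one particular generalized inverse X⁻ and checks that XX⁻ is idempotent with column space C(X).]

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 5, 3
X = rng.standard_normal((n, p))
Xg = np.linalg.pinv(X)            # one particular generalized inverse of X
P = X @ Xg

print(np.allclose(P @ P, P))                                                   # XX^- is idempotent
print(np.linalg.matrix_rank(np.hstack([P, X])) == np.linalg.matrix_rank(X))    # C(XX^-) subset of C(X), same rank
print(np.allclose(P @ X, X))                                                   # XX^- acts as the identity on C(X)
```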

EXERCISE 17. Let Y represent a matrix in a linear space V of m x n matrices, and


let Ul, ... , Uk represent subspaces of V. Adopting the tenninology and using the
results of Exercise 6, show that if Ul, ... , Uk are independent and if Ul +-. ,+Uk =
V, then (a) there exist unique matrices Zl, ... , Zk inUl, ... , Uk, respectively, such
that Y = Zl + ... + Zk and (b) for i = 1, .... k, Zi equals the projection of Y
on Ui along Ul + ... + Ui-l + Ui+l + ... + Uk.

Solution. Suppose that UI, ... ,Uk are independent and that UI + ... + Uk = V.
(a) It follows from the very definition of a sum (of subspaces) that there exist
matrices ZI, ... , Zk in UI, ... ,Uk. respectively, such that Y = ZI + ... + Zk. For
purposes of establishing the uniqueness of Z" ... , Zk. let Zr, ... , Zk represent
matrices (potentially different from Z" ... , Zk) in UI, ... ,Uk, respectively, such
that Y = Zr + ... + ZZ. Then,

(Zr - ZI> + ... + (Zk - Zk) = Y - Y = 0,


and (for i = 1, ... , k) Z; - Zi E Ui. Thus, Z; - Zi = 0 and hence Z; = Zi (i =
1, ... , k), thereby establishing the uniqueness of ZI, ... , Zk.
(b) That (for i = 1, ... , k) Zi equals the projection ofY onUi along U, + ... +
Ui -I + Ui + I + ... + Uk is evident upon observing that [as a consequence of Part
(b) of Exercise 6] Ui andUI + ... +Ui-' +Ui+' + ... +Uk are essentially disjoint
and that

EXERCISE 18. Let U and W represent essentially disjoint subspaces (of R n X,)
whose sum is R nx I, and let U represent any n x s matrix such that C(U) = U and
W any n x t matrix such that C(W) = W.
(a) Show that the n x (s + t) partitioned matrix (U, W) has a right inverse.
(b) Taking R to be an arbitrary right inverse of (U, W) and partitioning R as
R = (:~) (where RI has s rows), show that the projection matrix for U along
W equals URI and that the projection matrix for W along U equals WR2.
Solution. (a) In light of result (1.4), we have that

rank(U, W) = dim[C(U, W)] = dim(U + W) = dim(Rn) = n.


Thus, (U, W) is of full row rank, and it follows from Lemma 8.1.1 that (U, W)
has a right inverse.
(b) For j = 1, ... , n, let e j represent the jth column of In; let Zj represent the
projection of e j on U along W; let r j, rlj, and r2j represent the jth columns of

R, R I, and R2, respectively, and observe that rj rl .) .


= ( r2~
By definition, (U, W)R = In, implying that (for j = 1, ... , n) (U, W)r j = e j.
Thus, it follows from Corollary 17.6.5 that (for j = 1, ... , n) Zj = Urlj. We
conclude (on the basis of Theorem 17.6.9) that the projection matrix for U along
Wequals
(ZI, .... zn) = (Urll, ... , Urln) = URI·
And, since URI + WR2 = In, we further conclude (on the basis of Theorem
17.6.10) that the projection matrix for W along U equals I - URI = WR2.

EXERCISE 19. Let A represent the (n x n) projection matrix for a subspace U


ofnn x 1 along a subspace V of nnx 1 (where U E9 V = nn xl), let B represent the
(n x n) projection matrix for a subspace W of nn x 1 along a subspace X of nn x1
(where W E9 X = nn xl), and suppose that A and B commute (i.e., that BA= AB).

(a) Show that AB is the projection matrix for U n W along V + X.


(b) Show that A + B - AB is the projection matrix for U + W along V n X.
[Hintfor Part (b). Observe that I - (A + B - AB) = (I - A)(I - B), and make
use of Part (a).]
Solution. (a) According to Theorem 17.6.13, A and B are both idempotent, so that

(AB)2 = A(BA)B = A(AB)B = A2B2 = AB.


Thus, AB is idempotent, and it follows from Theorem 17.6.14 that AB is the
projection matrix for C(AB) along N(AB).
It remains to show that C(AB) = U n W and N (AB) = V + X or equivalently
(in light of Theorem 17.6.14) that C(AB) = C(A) nC(B) andN(AB) = N(A) +
N(B). Clearly, C(AB) c C(A) and (since AB = BA) C(AB) c C(B), so that
C(AB) c C(A) n C(B). And, for any vector y in C(A) n C(B), it follows from
Lemma 17.6.7 that y = Ay and y = By, implying that y = ABy and hence that y E
C(AB). Thus, C(A) n C(B) c C(AB), and hence [since C(AB) c C(A) n C(B)]
C(AB) = C(A) n C(B).
Further, for any vector x in N (A) and any vector y in N (B),

AB(x + y) = ABx + ABy = BAx + ABy = 0 + 0 = 0,


implying that x + y E N(AB). Thus, N(A) + N(B) c N(AB). And, for any
vector z inN (AB)(i.e., any vector z such that ABz = 0), Bz E N (A), which since
z = Bz+ (I - B)z and since (I - B)z E N (B) [as is evident from Theorem 11.7.1 or
upon observing thatB(I-B)z = (B-B2)z = 0] implies thatz E N(A)+N(B). It
followsthatN(AB) c N(A)+N(B), and hence [sinceN(A)+N(B) c N(AB)]
thatN(AB) = N(A) +N(B).
(b) As a consequence of Theorem 17.6.10, I - A is the projection matrix for
V along U, and I - B is the projection matrix for X along W. Thus, it follows
from Part (a) that (I - A)(I - B) is the projection matrix for V n X along U + W.
Observing that A + B - AB = I - (I - A)(I - B), we conclude, on the basis
of Theorem 17.6.10, that A + B - AB is the projection matrix for U + W along
Vnx.

EXERCISE 20. Let V represent a linear space of n-dimensional column vectors, and let U and W represent essentially disjoint subspaces whose sum is V. Then, an n × n matrix A is said to be a projection matrix for U along W if Ay is the projection of y on U along W for every y ∈ V; this represents an extension of the definition of a projection matrix for U along W in the special case where V = R^n. Further, let U represent an n × s matrix such that C(U) = U, and let W represent an n × t matrix such that C(W) = W.
(a) Show that an n × n matrix A is a projection matrix for U along W if and only if AU = U and AW = 0 or, equivalently, if and only if A' is a solution to the linear system (U, W)'B = (U, 0)' (in an n × n matrix B).
(b) Establish the existence of a projection matrix for U along W.
(c) Show that if A is a projection matrix for U along W, then I − A is a projection matrix for W along U.
(d) Let X represent any n × p matrix whose columns span N(W') or, equivalently, W⊥. Show that an n × n matrix A is a projection matrix for U along W if and only if A' = XR* for some solution R* to the linear system U'XR = U' (in a p × n matrix R).

Solution. (a) Clearly, an n × 1 vector y is in V if and only if y is expressible as y = Ub + Wc for some vectors b and c. Thus, an n × n matrix A is a projection matrix for U along W if and only if, for every (s × 1) vector b and every (t × 1) vector c, A(Ub + Wc) is the projection of Ub + Wc on U along W, or equivalently (in light of Corollary 17.6.2) if and only if, for every b and every c, A(Ub + Wc) = Ub.
Now, if AU = U and AW = 0, then obviously A(Ub + Wc) = Ub for every b and every c. Conversely, suppose that A(Ub + Wc) = Ub for every b and every c. Then, A(Ub + Wc) = Ub for every b and for c = 0, or equivalently AUb = Ub for every b, implying (in light of Lemma 2.3.2) that AU = U. Similarly, A(Ub + Wc) = Ub for b = 0 and for every c, or equivalently AWc = 0 for every c, implying that AW = 0.
(b) Clearly, the linear systems U'B = U' and W'B = 0 (in B) are both consistent. And, since (in light of Lemma 17.2.1) R(U') and R(W') are essentially disjoint, we have, as a consequence of Theorem 17.3.2, that the combined linear system (U, W)'B = (U, 0)' is consistent. Thus, the existence of a projection matrix for U along W follows from Part (a).
(c) Suppose that A is a projection matrix for U along W. Then, according to Part (a), AU = U and AW = 0. Thus, (I − A)W = W, and (I − A)U = 0. We conclude [on the basis of Part (a)] that I − A is a projection matrix for W along U.
(d) In light of Part (a), it suffices to show that A' is a solution to the linear system (U, W)'B = (U, 0)' (in B) if and only if A' = XR* for some solution R* to the linear system U'XR = U'.
Suppose that A' = XR* for some solution R* to U'XR = U'. Then, U'A' = U', and (since clearly W'X = 0) W'A' = 0. Thus, A' is a solution to (U, W)'B = (U, 0)'.

Conversely, suppose that A' is a solution to (U, W)'B = (U, 0)', or equivalently that

U'A' = U' and W'A' = 0. Then, according to Lemma 11.4.1, C(A') ⊂ N(W'), or equivalently C(A') ⊂ C(X), and consequently A' = XR* for some matrix R*.

And, U'XR* = U' A' = U', so that R* is a solution to U'XR = U'.


EXERCISE 21. Let U1, ..., Uk represent independent subspaces of R^{n×1} such that U1 + ... + Uk = R^{n×1} (where the independence of subspaces is as defined in Exercise 6). Further, letting si = dim(Ui) (and supposing that si > 0), take Ui to be any n × si matrix such that C(Ui) = Ui (i = 1, ..., k). And, define B = (U1, ..., Uk)⁻¹, partition B as B = (B1', B2', ..., Bk')' (where, for i = 1, ..., k, Bi has si rows), and let H = B'B or (more generally) let H represent any matrix of the form

H = B1'A1B1 + B2'A2B2 + ... + Bk'AkBk,    (E.3)

where A1, A2, ..., Ak are symmetric positive definite matrices.

(a) Using the result of Part (g)-(1) of Exercise 6 (or otherwise), verify that the partitioned matrix (U1, ..., Uk) is nonsingular (i.e., is square and of rank n).
(b) Show that H is positive definite.
(c) Show that (for j ≠ i = 1, ..., k) Ui and Uj are orthogonal with respect to H.
(d) Using the result of Part (a)-(2) of Exercise 3 (or otherwise), show that, for i = 1, ..., k, (1) U1 + ... + U_{i−1} + U_{i+1} + ... + Uk equals the orthogonal complement Ui⊥ of Ui (where the orthogonality in the orthogonal complement is with respect to the bilinear form x'Hy) and (2) the projection of any n × 1 vector y on Ui along U1 + ... + U_{i−1} + U_{i+1} + ... + Uk equals the orthogonal projection of y on Ui with respect to H.
(e) Show that if, for j ≠ i = 1, ..., k, Ui and Uj are orthogonal with respect to some symmetric positive definite matrix H*, then H* is expressible in the form (E.3).

Solution. (a) Clearly, dim(UI + ... + Uk) = dim(Rn x I) = n. Thus, making use
of Part (g)-( 1) of Exercise 6, we find that

SI + ... + Sk = dim(UI) + ... + dim(Uk) = dim(UI + ... + Uk) = n.


And, making use of result (1.6), we find that

rank(UI, ... , Uk) = dim[C(UI, ... , Uk)] = dim[C(U[) + ... + C(Uk))


= dim(UI + ... + Uk) = n.

(b) Clearly, H = B' diag(AI, ... , Ak)B. Thus, since (according to Lemma
14.8.3) diag(AI, ... , Ak) is positive definite, it follows from Corollary 14.2.10
that H is positive definite.
(c) In light of Lemma 14.12.1, it suffices to show that (for j ≠ i) Ui'HUj = 0.
By definition,

(B1U1  B1U2  ...  B1Uk)
(B2U1  B2U2  ...  B2Uk)
( .     .    ...    .  ) = B(U1, ..., Uk) = In,
(BkU1  BkU2  ...  BkUk)

implying in particular that (for j ≠ i) BjUi = 0 and (for r ≠ j) BrUj = 0. Thus, for j ≠ i,

Ui'HUj = (BjUi)'AjBjUj + ∑_{r≠j} Ui'Br'ArBrUj = 0 + 0 = 0.

(d) (1) According to Part (c), Ui is orthogonal to UI, ... ,Ui-I, Ui+I, ... , Uk.
Thus, making repeated (k - 2 times) use of Part (a)-(2) of Exercise 3, we find that
Uj is orthogonal to UI + ... +Ui-I +UHI + ... +Uk. We conclude (on the basis
of Lemma 17.7.2) that UI + ... +Ui-I +UHI + ... +Uk = U/".
(2) That the projection of yon Uj along Ul + ... + Ui-l + Ui+l + ... + Uk
equals the orthogonal projection of y on Ui (with respect to H) is [in light of Part
(1)] evident from Theorem 17.6.6.
(e) Suppose that, for j t= i = 1, ... , k, Uj and Uj are orthogonal with respect to
some symmetric positive definite matrix H*. Then, according to Corollary 14.3.13,
there exists an n x n nonsingular matrix P such that H* = P'P. Further,

P~ PI" ~ P(U, ..... U'{~:) ~ L,B, + ... + L,B,.


where (for i = 1, ... , k) Li = PUi. And, making use of Lemma 14.12.1, we find
that, for j t= i = 1, ... , k,

L;Lj = U;P'PUj = U;H*Uj = O.


Thus,

H*= (LIBI + ... + LkBd(LIBI + ... + LkBk)


= B;L;LIBI + B~L~L2B2 + ... + BicLicLkBk
= B;AIBI + B~A2B2 + ... + BicAkBb
where (for i = 1, ... , k) Ai = L; Li [which is a symmetric positive definite matrix,
as is evident from Corollary 14.2.14 upon observing that rank(L i ) = rank(PUi) =
rank(Ui) = siJ.
18
Sums (and Differences) of Matrices

EXERCISE 1. Let R represent an n x n matrix, S an n x m matrix, T an m x m


matrix, and U an m x n matrix. Derive (for the special case where R and T are
nonsingular), the formula

IR + STU I = IRI IT + TUR-lSTI/ITI.

Do so by making two applications of the formula

I~ ~I = I~ ~I = ITIIW - VT-1UI.

(in which V is an n x m matrix and W an n x n matrix and in which T is assumed


to be nonsingular) to the partitioned matrix

( R    −ST )
( TU     T ),

one application with W set equal to R, and the other with T set equal to R.
Solution. Suppose that Rand T are nonsingular. Then, making use of formula (*)
(or equivalently the formula of Theorem 13.3.8), we find that

| R   −ST |
| TU    T | = |T| |R − (−ST)T⁻¹TU| = |T| |R + STU|

and also that

| R   −ST |
| TU    T | = |R| |T − (TU)R⁻¹(−ST)| = |R| |T + TUR⁻¹ST|.


Thus,

|T| |R + STU| = |R| |T + TUR⁻¹ST|,

or equivalently

|R + STU| = |R| |T + TUR⁻¹ST| / |T|.
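As an informal numerical illustration of this determinantal identity (an editorial addition, not part of the original exercise), the following Python/NumPy sketch draws random matrices of the stated dimensions; with probability one the random R and T are nonsingular:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 5, 3
    R = rng.standard_normal((n, n))   # n x n, almost surely nonsingular
    S = rng.standard_normal((n, m))
    T = rng.standard_normal((m, m))   # m x m, almost surely nonsingular
    U = rng.standard_normal((m, n))

    lhs = np.linalg.det(R + S @ T @ U)
    rhs = (np.linalg.det(R)
           * np.linalg.det(T + T @ U @ np.linalg.inv(R) @ S @ T)
           / np.linalg.det(T))
    print(np.isclose(lhs, rhs))       # expected: True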

EXERCISE 2. Let R represent an n x n matrix, S an n x m matrix, T an m x m


matrix, and U an m x n matrix. Show that if R is nonsingular, then

|R + STU| = |R| |Iₘ + UR⁻¹ST| = |R| |Iₘ + TUR⁻¹S|.

Do so by using the formula

|R + STU| = |R| |T| |T⁻¹ + UR⁻¹S|

(in which R and T are assumed to be nonsingular), or alternatively the formula
|Iₙ + SU| = |Iₘ + US| or the formula |R + STU| = |R| |T + TUR⁻¹ST| / |T|.
Solution. Note that
R + STU = R + (ST)ImU, (S.1)

R + STU = R + SIm(TU). (S.2)


Now, suppose that R is nonsingular. By applying formula (*) (or equivalently the
formula of Theorem 18.1.1) to the right side of equality (S.1) [i.e., by applying
formula (*) with ST and Iₘ in place of S and T, respectively], we find that

|R + STU| = |R| |Iₘ| |Iₘ + UR⁻¹ST| = |R| |Iₘ + UR⁻¹ST|.

Similarly, by applying formula (*) to the right side of equality (S.2) [i.e., by
applying formula (*) with Iₘ and TU in place of T and U, respectively], we find
that

|R + STU| = |R| |Iₘ| |Iₘ + TUR⁻¹S| = |R| |Iₘ + TUR⁻¹S|.

EXERCISE 3. Let A represent an n x n symmetric nonnegative definite matrix.


Show that ifI - A is nonnegative definite and if IAI = 1, then A = I.
Solution. Suppose that I - A is nonnegative definite and that IAI = 1. Then, as a
consequence of Corollary 14.3.12, A is positive definite. And,

III = 1 = IAI.

Thus, it follows from Corollary 18.1.7 (specifically from the special case of Corol-
lary 18.1.7 where C = I) that 1= A.

EXERCISE 4. Show that, for any n x n symmetric nonnegative definite matrix


B and for any n x n symmetric matrix C such that C - B is nonnegative definite,

|C| ≥ |C − B|,

with equality holding if and only if C is singular or B = O.


Solution. Let A = C - B. Then, C - A = B. So, by definition, A is a (symmetric)
nonnegative definite matrix, and C - A is nonnegative definite. Thus, it follows
from Corollary 18.1.8 that
ICi 2: IC - BI,
with equality holding if and only if C is singular or C =C - B, or equivalently if
and only if C is singular or B = O.

EXERCISE 5. Let A represent a symmetric nonnegative definite matrix that has


been partitioned as

A = ( T   U )
    ( U'  W ),

where T is of dimensions m x m and W of dimensions n x n (and where U is of
dimensions m x n). And, define Q = W − U'T⁻U (which is the Schur complement
of T).
(a) Using the result that the symmetry and nonnegative definiteness of A imply
the nonnegative definiteness of Q and the result of Exercise 14.33 (or otherwise),
show that

|W| ≥ |U'T⁻U|,

with equality holding if and only if W is singular or Q = 0.


(b) Suppose that n = m and that T is nonsingular. Show that

|W| |T| ≥ |U|²,

with equality holding if and only if W is singular or rank(A) = m.


(c) Suppose that n = m and that A is positive definite. Show that

IWI ITI > IUI2.

Solution. (a) According to the result of Exercise 14.33, U'T-U is symmetric and
nonnegative definite. Further, W is symmetric. And, in light of the result that the
symmetry and nonnegative definiteness of A imply the nonnegative definiteness of
Q = W - U'T-U [a result that is implicit in Parts (1) and (2) of Theorem 14.8.4],
it follows from Corollary 18.1.8 that

|W| ≥ |U'T⁻U|,

with equality holding if and only if W is singular or W = U'T⁻U, or equivalently


if and only if W is singular or Q = O.
(b) Since (in light of Corollary 14.2.12) |T| > 0 and since (U being square when
n = m) |U'T⁻¹U| = |U'| |T⁻¹| |U| = |U|²/|T|, we have that

|W| |T| ≥ |U|² ⇔ |W| ≥ |U'T⁻¹U|

and
|W| |T| = |U|² ⇔ |W| = |U'T⁻¹U|.
Moreover, in light of Theorem 8.5.10,

rank (A) =m ¢> rank(Q) =0 ¢> Q = O.


Thus, it follows from Part (a) that

|W| |T| ≥ |U|²,

with equality holding if and only if W is singular or rank(A) = m.


(c) We have (in light of Lemma 14.2.8 and Corollary 14.2.12) that rank(A) =
2m > m and that W (and T) are nonsingular. Thus, it follows from Part (b) that
|W| |T| > |U|².

EXERCISE 6. Show that, for any n x p matrix X and any symmetric positive
definite matrix W,
|X'X|² ≤ |X'WX| |X'W⁻¹X|.      (E.1)

[Hint. Begin by showing that the matrices X'X(X'WX)⁻X'X and X'W⁻¹X −
X'X(X'WX)⁻X'X are symmetric and nonnegative definite.]
Solution. Let A = X'X(X'WX)-X'X and C = X'W-1X. Then, making use of
Part (6') of Theorem 14.12.11, we find that

A = X'W⁻¹WP_{X,W}W⁻¹X = X'W⁻¹P'_{X,W}WP_{X,W}W⁻¹X
  = (P_{X,W}W⁻¹X)'W(P_{X,W}W⁻¹X),

so that (in light of Theorem 14.2.9) A is symmetric and nonnegative definite.


Further, C is symmetric, and, making use of Part (9') of Theorem 14.12.11, we
find that

C − A = X'W⁻¹W(I − P_{X,W})W⁻¹X
      = X'W⁻¹(I − P_{X,W})'W(I − P_{X,W})W⁻¹X
      = [(I − P_{X,W})W⁻¹X]'W[(I − P_{X,W})W⁻¹X],

so that C - A is nonnegative definite. Thus, it follows from Corollary 18.1.8 that

|X'W⁻¹X| ≥ |X'X(X'WX)⁻X'X|.      (S.3)

If rank(X) = p, then (in light of Theorem 14.2.9 and Lemma 14.9.1) |X'WX|
> 0 and (in light of Theorems 13.3.4 and 13.3.7)

|X'X(X'WX)⁻¹X'X| = |X'X| |(X'WX)⁻¹| |X'X| = |X'X|² / |X'WX|,

in which case inequality (S.3) is equivalent to inequality (E.1). Alternatively,


if rank(X) < p, then [since (according to Corollary 14.11.3) rank(X'WX) =
rank(X) and rank(X'X) = rank(X)] both sides of inequality (E.1) equal 0 and
hence inequality (E.1) holds as an equality.
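Inequality (E.1), as reconstructed above, can be spot-checked numerically. The following Python/NumPy sketch (an editorial illustration, not part of the original solution) draws a random X of full column rank and a random symmetric positive definite W:

    import numpy as np

    rng = np.random.default_rng(1)
    n, p = 6, 3
    X = rng.standard_normal((n, p))
    A = rng.standard_normal((n, n))
    W = A @ A.T + n * np.eye(n)       # symmetric positive definite

    lhs = np.linalg.det(X.T @ X) ** 2
    rhs = (np.linalg.det(X.T @ W @ X)
           * np.linalg.det(X.T @ np.linalg.inv(W) @ X))
    print(lhs <= rhs + 1e-8)          # expected: True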

EXERCISE 7. (a) Show that, for any n x n skew-symmetric matrix C,

with equality holding if and only if C = O.


(b) Generalize the result of Part (a) by showing that, for any n x n symmetric
positive definite matrix A and any n x n skew-symmetric matrix B,

IA + BI 2: IAI,

with equality holding if and only if B = O.


Solution. (a) Clearly,

|I + C| = |(I + C)'| = |I + C'| = |I − C|,

so that

|I + C|² = |I + C| |I − C| = |(I + C)(I − C)| = |I − C²| = |I + C'C|.


Moreover, since C'C is symmetric and nonnegative definite, it follows from Theorem
18.1.6 that
|I + C'C| ≥ |I|,
with equality holding if and only if C'C = 0 or equivalently if and only if C = 0.
Since |I| = 1, we conclude that

|I + C|² ≥ 1,

with equality holding if and only if C = 0.


To complete the proof, it suffices to show that |I + C| > 0. According to Lemma
14.6.4, C is nonnegative definite. Thus, we have (in light of Lemma 14.2.4) that
I + C is positive definite and hence (in light of Corollary 14.9.4) that |I + C| > 0.
(b) According to Corollary 14.3.13, there exists a nonsingular matrix P such
that A = P'P. Then,
A + B = P' (I + C)P,
where C = (p-l )'BP- 1. Moreover, since (according to Lemma 14.6.2) C is skew-
symmetric, we have [as a consequence of Part (a)] that

II+q 2: 1,

with equality holding if and only if C = 0 or equivalently if and only if B = 0. The
proof is complete upon observing that, since |A + B| = |P|² |I + C| and |A| = |P|²
(and since |P| ≠ 0), |A + B| ≥ |A| ⇔ |I + C| ≥ 1, and |A + B| = |A| ⇔
|I + C| = 1.
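As a small numerical check of Part (b) (an editorial illustration, not part of the original exercise), the following Python/NumPy sketch builds a random symmetric positive definite A and a random skew-symmetric B:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 5
    M = rng.standard_normal((n, n))
    A = M @ M.T + np.eye(n)           # symmetric positive definite
    C = rng.standard_normal((n, n))
    B = C - C.T                       # skew-symmetric

    print(np.linalg.det(A + B) >= np.linalg.det(A))   # expected: True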

EXERCISE 8. (a) Let R represent an n x n nonsingular matrix, and let B represent


an n x n matrix of rank one. Show that R + B is nonsingular if and only if
tr(R⁻¹B) ≠ −1, in which case

(R + B)⁻¹ = R⁻¹ − [1 + tr(R⁻¹B)]⁻¹R⁻¹BR⁻¹.
(b) To what does the result of Part (a) simplify in the special case where R = In?

Solution. (a) It follows from Theorem 4.4.8 that there exist n-dimensional column
vectors sand u such that B = su'. Then, as a consequence of Corollary 18.2.10,
we find that R + B is nonsingular if and only if u' R -I S =1= -1. Moreover, upon
applying result (5.2.6) (with b = u and a = R-1s), we obtain

u'R-1s = tr(R-1su') = tr(R-1B).


Thus, R + B is nonsingular if and only if tr(R⁻¹B) ≠ −1. And, if tr(R⁻¹B) ≠ −1,
then we have, as a further consequence of Corollary 18.2.10, that

(R + B)⁻¹ = R⁻¹ − (1 + u'R⁻¹s)⁻¹R⁻¹su'R⁻¹
          = R⁻¹ − [1 + tr(R⁻¹B)]⁻¹R⁻¹BR⁻¹.

(b) In the special case where R = Iₙ, the result of Part (a) can be restated as
follows: Iₙ + B is nonsingular if and only if tr(B) ≠ −1, in which case

(Iₙ + B)⁻¹ = Iₙ − [1 + tr(B)]⁻¹B.
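An informal numerical check of the rank-one update formula of Part (a) (added here for illustration; the random s and u below simply manufacture a rank-one B) can be carried out in Python/NumPy:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 5
    R = rng.standard_normal((n, n))          # almost surely nonsingular
    s = rng.standard_normal((n, 1))
    u = rng.standard_normal((n, 1))
    B = s @ u.T                              # rank-one matrix

    Rinv = np.linalg.inv(R)
    t = np.trace(Rinv @ B)                   # tr(R^{-1}B), almost surely != -1
    lhs = np.linalg.inv(R + B)
    rhs = Rinv - Rinv @ B @ Rinv / (1 + t)
    print(np.allclose(lhs, rhs))             # expected: True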

EXERCISE 9. Let R represent an n x n matrix, S an n x m matrix, T an m x m


matrix, and U an m x n matrix. Suppose that R is nonsingular. Show (a) that
R + STU is nonsingular if and only if Iₘ + UR⁻¹ST is nonsingular, in which case

(R + STU)⁻¹ = R⁻¹ − R⁻¹ST(Iₘ + UR⁻¹ST)⁻¹UR⁻¹,

and (b) that R + STU is nonsingular if and only if Iₘ + TUR⁻¹S is nonsingular,
in which case

(R + STU)⁻¹ = R⁻¹ − R⁻¹S(Iₘ + TUR⁻¹S)⁻¹TUR⁻¹.

Do so by using the result (applicable when T, as well as R, is nonsingular) that
R + STU is nonsingular if and only if T⁻¹ + UR⁻¹S is nonsingular, or equivalently
if and only if T + TUR⁻¹ST is nonsingular, in which case

(R + STU)⁻¹ = R⁻¹ − R⁻¹S(T⁻¹ + UR⁻¹S)⁻¹UR⁻¹
            = R⁻¹ − R⁻¹ST(T + TUR⁻¹ST)⁻¹TUR⁻¹.

[Hint. Reexpress R + STU as R + STU =R+ (ST)lmU and as R + STU =


R+Slm TU.]
Solution. (a) Reexpress R + STU as

R+STU = R+ (ST)lmU.
Then, applying the cited result (or equivalently Theorem 18.2.8) with ST and Iₘ
in place of S and T, respectively, we find that R + STU is nonsingular if and only
if Iₘ + UR⁻¹ST is nonsingular, in which case

(R + STU)⁻¹ = R⁻¹ − R⁻¹ST(Iₘ + UR⁻¹ST)⁻¹UR⁻¹.

(b) Reexpress R + STU as

R + STU = R + Slm(TU).

Then, applying the cited result (or equivalently Theorem 18.2.8) with Iₘ and TU
in place of T and U, respectively, we find that R + STU is nonsingular if and only
if Iₘ + TUR⁻¹S is nonsingular, in which case

(R + STU)⁻¹ = R⁻¹ − R⁻¹S(Iₘ + TUR⁻¹S)⁻¹TUR⁻¹.
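The Woodbury-type inverse formula of Part (a), as reconstructed above, is easy to verify numerically. A Python/NumPy sketch (an editorial illustration, not part of the original solution) follows:

    import numpy as np

    rng = np.random.default_rng(4)
    n, m = 6, 3
    R = rng.standard_normal((n, n))
    S = rng.standard_normal((n, m))
    T = rng.standard_normal((m, m))
    U = rng.standard_normal((m, n))

    Rinv = np.linalg.inv(R)
    inner = np.eye(m) + U @ Rinv @ S @ T     # I_m + U R^{-1} S T
    lhs = np.linalg.inv(R + S @ T @ U)
    rhs = Rinv - Rinv @ S @ T @ np.linalg.inv(inner) @ U @ Rinv
    print(np.allclose(lhs, rhs))             # expected: True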

EXERCISE 10. Let R represent an n x q matrix, S an n x m matrix, T an m x p


matrix, and U a p x q matrix. Extend the results of Exercise 9 by showing that if
R(STU) c R(R) and C(STU) C C(R), then the matrix

R- - R-ST(l p + UR-ST)-UR-

and the matrix


R- - R-S(lm + TUR-S)-TUR-
are both generalized inverses of the matrix R + STU.
Solution. Observe that R + STU can be reexpressed as

R + STU = R + (ST)lpU

and also as
R + STU = R + Slm(TU).
Suppose now that R(STU) C R(R) andC(STU) C C(R). Then, upon applying
Theorem 18.2.14 with ST and Iₚ in place of S and T, respectively, we find that

R⁻ − R⁻ST(Iₚ + UR⁻ST)⁻UR⁻

is a generalized inverse of the matrix R + STU. And, upon applying Theorem
18.2.14 with Iₘ and TU in place of T and U, respectively, we find that

R⁻ − R⁻S(Iₘ + TUR⁻S)⁻TUR⁻

is also a generalized inverse of R + STU.

EXERCISE 11. Let R represent an n x q matrix, S an n x m matrix, T an m x p


matrix, and U a p x q matrix.

(a) Take G to be a generalized inverse of the partitioned matrix

( R    −ST )
( TU     T ),

and partition G as G = (G₁₁  G₁₂; G₂₁  G₂₂) (where G₁₁ is of dimensions q x n). Show
that G₁₁ is a generalized inverse of the matrix R + STU. Do so by using the result
that, for any partitioned matrix A = (A₁₁  A₁₂; A₂₁  A₂₂) such that C(A₂₁) ⊂ C(A₂₂)
and R(A₁₂) ⊂ R(A₂₂) and for any generalized inverse (C₁₁  C₁₂; C₂₁  C₂₂) of A (where
C₁₁ is of the same dimensions as A₁₁'), C₁₁ is a generalized inverse of the matrix
A₁₁ − A₁₂A₂₂⁻A₂₁.
(b)LetER = I-RR-,FR = I-R-R,X = ERST, Y = TUFR,Ey = I-YY-,
Fx = 1- X-X, Q = T+TUR-ST, Z = EyQFx, andQ* = FxZ-Ey. Use the
result of Part (a) of Exercise 10.10 to show that the matrix

R⁻ − R⁻STQ*TUR⁻ − R⁻ST(I − Q*Q)X⁻E_R
   − F_R Y⁻(I − QQ*)TUR⁻ + F_R Y⁻(I − QQ*)QX⁻E_R      (E.2)

is a generalized inverse of the matrix R + STU.


(c) Show that if R(TU) ⊂ R(R) and C(ST) ⊂ C(R), then the formula

R⁻ − R⁻ST(T + TUR⁻ST)⁻TUR⁻

for a generalized inverse of R + STU can be obtained as a special case of formula
(E.2).
Solution. (a) It follows from the cited result (or equivalently from the second part
of Theorem 9.6.5) that Gu is a generalized inverse of the matrix

R - (-ST)T-TU = R + STT-TU = R + STU.

(b) Let G represent the generalized inverse of the matrix (R  −ST; TU  T) obtained
by applying formula (10.E.1). Partition G as G = (G₁₁  G₁₂; G₂₁  G₂₂) (where G₁₁ is of
dimensions q x n), and assume that [in applying formula (10.E.1)] the generalized
inverse of −X is set equal to −X⁻ [in which case F_X = I − (−X⁻)(−X)]. Then,
G₁₁ equals the matrix (E.2), and we conclude on the basis of Part (a) (of the current
exercise) that the matrix (E.2) is a generalized inverse of the matrix R + STU.
(c) Suppose that R(TU) ⊂ R(R) and C(ST) ⊂ C(R). Then, it follows from
Lemma 9.3.5 that X = 0 and Y = 0 (so that F_X = I and E_Y = I and consequently
Q* is an arbitrary generalized inverse of Q). Thus, formula (*) [or equivalently
formula (2.27)] can be obtained as a special case of formula (E.2) by setting X⁻ = 0
and Y⁻ = 0.

EXERCISE 12. Let AI, A2, ... represent a sequence of m x n matrices, and let
A represent another m x n matrix.
(a) Using the result of Exercise 6.1 (i.e., the triangle inequality), show that if
||Aₖ − A|| → 0, then ||Aₖ|| → ||A||.
(b) Show that if Aₖ → A, then ||Aₖ|| → ||A|| (where the norms are the usual
norms).
Solution. (a) Making use of the triangle inequality, we find that

||Aₖ|| = ||(Aₖ − A) + A|| ≤ ||Aₖ − A|| + ||A||

and that

||A|| = ||(A − Aₖ) + Aₖ|| ≤ ||Aₖ − A|| + ||Aₖ||.

Thus,

||Aₖ|| − ||A|| ≤ ||Aₖ − A||

and

||A|| − ||Aₖ|| ≤ ||Aₖ − A||,

implying that
| ||Aₖ|| − ||A|| | ≤ ||Aₖ − A||.
Suppose now that ||Aₖ − A|| → 0. Then, corresponding to each positive scalar
ε, there exists a positive integer p such that, for k > p, ||Aₖ − A|| < ε and hence
such that, for k > p, | ||Aₖ|| − ||A|| | < ε. We conclude that ||Aₖ|| → ||A||.
(b) In light of Lemma 18.2.20, Part (b) follows from Part (a).

EXERCISE 13. Let A represent an n x n matrix. Using the results of Exercise


6.1 and of Part (b) of Exercise 12, show that if ||A|| < 1, then (for k = 0, 1, 2, ...)

||(I − A)⁻¹ − (I + A + A² + ... + A^k)|| ≤ ||A||^{k+1}/(1 − ||A||)

(where the norms are the usual norms). (Note. If ||A|| < 1, then I − A is nonsin-
gular.)
Solution. Suppose that ||A|| < 1, and (for p = 0, 1, 2, ...) let Sₚ = Σ_{m=0}^p A^m
(where A⁰ = I). Then, as a consequence of Theorems 18.2.16 and 18.2.19, we
have that (I − A)⁻¹ = lim_{p→∞} Sₚ, implying that

(I − A)⁻¹ − Sₖ = (lim_{p→∞} Sₚ) − Sₖ = lim_{p→∞} (Sₚ − Sₖ)
              = lim_{p→∞} Σ_{m=k+1}^p A^m,
188 18. Sums (and Differences) of Matrices

and it follows from the result of Part (b) of Exercise 12 that

||(I − A)⁻¹ − Sₖ|| = lim_{p→∞} || Σ_{m=k+1}^p A^m ||.      (S.4)

Moreover, making repeated use of the result of Exercise 6.1 (i.e., of the triangle
inequality) and of Lemma 18.2.21, we find that (for p ≥ k + 1)

|| Σ_{m=k+1}^p A^m || ≤ Σ_{m=k+1}^p ||A^m|| ≤ Σ_{m=k+1}^p ||A||^m
                     = ||A||^{k+1} Σ_{m=0}^{p−k−1} ||A||^m.      (S.5)

It follows from a basic result on geometric series [which is Example 34.8(c) in
Bartle's (1976) book] that Σ_{m=0}^∞ ||A||^m = 1/(1 − ||A||). Thus, combining result
(S.5) with result (S.4), we find that

||(I − A)⁻¹ − Sₖ|| ≤ ||A||^{k+1} Σ_{m=0}^∞ ||A||^m = ||A||^{k+1}/(1 − ||A||).
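The error bound just derived can be illustrated numerically. The following Python/NumPy sketch (an editorial addition; the scaling factor 0.4 is simply a convenient way to force ||A|| < 1 in the usual, i.e., Frobenius, norm) checks the bound for a truncated Neumann series:

    import numpy as np

    rng = np.random.default_rng(5)
    n, k = 5, 4
    A = rng.standard_normal((n, n))
    A *= 0.4 / np.linalg.norm(A)             # scale so that ||A|| = 0.4 < 1

    S_k = sum(np.linalg.matrix_power(A, m) for m in range(k + 1))
    err = np.linalg.norm(np.linalg.inv(np.eye(n) - A) - S_k)
    bound = np.linalg.norm(A) ** (k + 1) / (1 - np.linalg.norm(A))
    print(err <= bound)                      # expected: True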

EXERCISE 14. Let A and B represent n x n matrices. Suppose that B is nonsin-


gular, and define F = B- 1A. Using the result of Exercise 13, show that if IIFII < 1,
then (for k = 0, 1,2, ... )
II(B - A)-I - (B- 1 + FB- I + F 2B- I + ... + FkB-I)1I
s liB-III IIFllk+I/(1 - IIFID
(where the norms are the usual norms). (Note. If liFil < I, then B - A is nonsin-
gular.)

Solution. Suppose that IIFII < 1. Then, since B - A = B(I - F) (and since B - A
is nonsingular), I - F is nonsingular, and
(B - A)-I = (I - F)-IB- I.

Thus, making use of Lemma 18.2.21 and the result of Exercise 13, we find that

II(B - A)-I - (B- 1 + FB- I + F 2B- I + ... + rB-I)II


= 11[(1 - F)-1 - (I + F + F2 + ... + r)]B- 111
s II(I-F)-I- (I+F+F2 + ... +Fk)II liB-III
S IIB- I IIIIFll k+ 1/(1 - IIFII)·

EXERCISE 15. Let A represent an n x n symmetric nonnegative definite matrix,


and let B represent an n x n matrix. Show that if B − A is nonnegative definite (in
which case B is also nonnegative definite), then R(A) ⊂ R(B) and C(A) ⊂ C(B).

Solution. Define C = (I − B⁻B)'(B − A)(I − B⁻B). Since A is symmetric and
nonnegative definite, there exists a matrix R such that A = R'R. Clearly,

C = (I − B⁻B)'B(I − B⁻B) − (I − B⁻B)'A(I − B⁻B)
  = 0 − [R(I − B⁻B)]'R(I − B⁻B) = −[R(I − B⁻B)]'R(I − B⁻B).      (S.6)
Suppose now that B - A is nonnegative definite. Then, according to Theorem


14.2.9, C is nonnegative definite. Moreover, it is clear from expression (S.6) that
C is nonpositive definite and symmetric. Consequently, it follows from Lemma
14.2.2 that C = 0 or equivalently that [R(I - B-B],R(I - B-B) = 0, implying
(in light of Corollary 5.3.2) that R(I - B-B) = 0 and hence (since A = R'R) that
A(I - B-B) = O. We conclude (in light of Lemma 9.3.5) that R(A) C R(B).
Further, since B - A is nonnegative definite, (B - A)' = B' - A' is also
nonnegative definite. Thus, by employing an argument analogous to that employed
in establishing that R(A) C R(B), it can be shown that R(A') C R(B') or
equivalently (in light of Corollary 4.2.5) that C(A) C C(B).
An alternative solution to Exercise 15 can be obtained by making use of Corol-
lary 12.5.6. Suppose that B - A is nonnegative definite. And, let x represent an
arbitrary vector in C.l(B). Then,

o ::: x' (B - A)x = -x' Ax ::: 0,

implying that x'Ax = oand hence (in light of Corollary 14.3.1l)thatA'x = Ax =


o or equivalently that x E C.l(A). Thus, C.l(B) c C.l(A), and it follows from
Corollary 12.5.6 that C(A) C C(B). That R(A) C R(B) can be established via an
analogous argument.

EXERCISE 16. Let A represent an n x n symmetric idempotent matrix, and let B


represent an n x n symmetric nonnegative definite matrix. Show that if I - A - B
is nonnegative definite, then BA = AB = O. (Hint. Show that A' (I - A - B)A =
- A'BA, and then consider the implications of this equality.)
Solution. Clearly,

A'(I - A - B)A = A'(A - A2 - BA) = A'(A - A - BA) = -A'BA. (S.7)

Suppose now that I - A - B is nonnegative definite. Then, as a consequence


of Theorem 14.2.9, A'(I - A - B)A is nonnegative definite, in which case it
follows from result (S.7) that A'BA is nonpositive definite. Moreover, as a further
consequence of Theorem 14.2.9, A'BA is nonnegative definite. Thus, in light of
Lemma 14.2.2, we have that
A'BA =0. (S.8)
And, since B is symmetric as well as nonnegative definite, we conclude (on the
basis of Corollary 14.3.11) that BA = 0 and also [upon observing that AB =
A'B' = (BA)'] that AB = O.

EXERCISE 17. Let Ai, ... , Ak represent n x n symmetric matrices, and de-
fine A = Ai + ... + Ak. Suppose that A is idempotent. Suppose further that
190 18. Sums (and Differences) of Matrices

A₁, ..., A_{k−1} are idempotent and that Aₖ is nonnegative definite. Using the result
of Exercise 16 (or otherwise), show that AᵢAⱼ = 0 (for j ≠ i = 1, ..., k), that
Aₖ is idempotent, and that rank(Aₖ) = rank(A) − Σ_{i=1}^{k−1} rank(Aᵢ).

Solution. Let A₀ = I − A. Then, Σ_{i=0}^k Aᵢ = I. Further, A₀ (like A₁, ..., A_{k−1})
is symmetric and idempotent, and (in light of Lemma 14.2.17) A₀, A₁, ..., A_{k−1}
(like Aₖ) are nonnegative definite.
Thus, for i = 1, ..., k − 1 and j = i + 1, ..., k, Aᵢ is idempotent, Aⱼ is
nonnegative definite, and (since I − Aᵢ − Aⱼ = Σ_{m=0, m≠i,j}^k Aₘ) I − Aᵢ − Aⱼ
is nonnegative definite. And, it follows from the result of Exercise 16 that (for
i = 1, ..., k − 1 and j = i + 1, ..., k) AᵢAⱼ = 0 and AⱼAᵢ = 0 or equiv-
alently that, for j ≠ i = 1, ..., k, AᵢAⱼ = 0. Moreover, since A₁, ..., Aₖ
are symmetric, we conclude from Theorem 18.4.1 that Aₖ (like A₁, ..., A_{k−1})
is idempotent and that Σ_{i=1}^k rank(Aᵢ) = rank(A) or, equivalently, rank(Aₖ) =
rank(A) − Σ_{i=1}^{k−1} rank(Aᵢ).

EXERCISE 18. Let A I, ... , Ak represent n x n symmetric matrices, and define


A = A₁ + ... + Aₖ. Suppose that A is idempotent. Show that if A₁, ..., Aₖ are
nonnegative definite and if tr(A) ≤ Σ_{i=1}^k tr(Aᵢ²), then AᵢAⱼ = 0 (for j ≠ i =
1, ..., k) and A₁, ..., Aₖ are idempotent. Hint. Show that Σ_{i, j≠i} tr(AᵢAⱼ) ≤ 0
and then make use of the result that, for any two symmetric nonnegative definite
matrices B and C (of the same order), tr(BC) ≥ 0, with equality holding if and
only if BC = 0.
Solution. Clearly,

A = A² = (Σᵢ Aᵢ)² = Σᵢ Aᵢ² + Σ_{i, j≠i} AᵢAⱼ,

so that

tr(A) = tr(Σᵢ Aᵢ² + Σ_{i, j≠i} AᵢAⱼ) = Σᵢ tr(Aᵢ²) + Σ_{i, j≠i} tr(AᵢAⱼ)

and hence

Σ_{i, j≠i} tr(AᵢAⱼ) = tr(A) − Σᵢ tr(Aᵢ²).      (S.9)
i,di

Suppose now that A₁, ..., Aₖ are nonnegative definite and also that tr(A) ≤
Σᵢ tr(Aᵢ²). Then, it follows from result (S.9) that

Σ_{i, j≠i} tr(AᵢAⱼ) ≤ 0.

And, since (according to Corollary 14.7.7, which is the result cited in the hint)
tr(AᵢAⱼ) ≥ 0 (for all i and j ≠ i), we have that tr(AᵢAⱼ) = 0 (for all i and
j ≠ i). We conclude (on the basis of Corollary 14.7.7) that AᵢAⱼ = 0 (for all i
and j ≠ i). And, in light of Theorem 18.4.1 (and the symmetry of A₁, ..., Aₖ),
we further conclude that A₁, ..., Aₖ are idempotent.

EXERCISE 19. Let AI, ... , Ak represent n x n symmetric matrices such that
A₁ + ... + Aₖ = I. Show that if rank(A₁) + ... + rank(Aₖ) = n, then, for
any (strictly) positive scalars c₁, ..., cₖ, the matrix c₁A₁ + ... + cₖAₖ is positive
definite.
Solution. Suppose that rank(A₁) + ... + rank(Aₖ) = n. Then, it follows from
Theorem 18.4.5 that A₁, ..., Aₖ are idempotent. Thus,

c₁A₁ + ... + cₖAₖ = C'C, where C = (√c₁A₁', ..., √cₖAₖ')' is the matrix
obtained by stacking √c₁A₁, ..., √cₖAₖ one atop another,
implying (in light of Corollary 14.2.14) that c₁A₁ + ... + cₖAₖ is nonnegative
definite and (in light of Corollaries 7.4.5 and 4.5.6) that

rank(c₁A₁ + ... + cₖAₖ) = rank(A₁'A₁ + ... + Aₖ'Aₖ)
                        = rank(A₁ + ... + Aₖ)
                        = rank(Iₙ)
                        = n.

We conclude (on the basis of Corollary 14.3.12) that c₁A₁ + ... + cₖAₖ is positive
definite.

EXERCISE 20. Let AI, ... , Ak represent n x n symmetric idempotent matrices


such that AᵢAⱼ = 0 for j ≠ i = 1, ..., k. Show that, for any (strictly) positive
scalar c₀ and any nonnegative scalars c₁, ..., cₖ, the matrix c₀I + Σ_{i=1}^k cᵢAᵢ is
positive definite (and hence nonsingular), and

(c₀I + Σ_{i=1}^k cᵢAᵢ)⁻¹ = d₀I + Σ_{i=1}^k dᵢAᵢ,

where d₀ = 1/c₀ and (for i = 1, ..., k) dᵢ = −cᵢ/[c₀(c₀ + cᵢ)].
Solution. Clearly, c₀I is positive definite. Moreover, as a consequence of Lemma
14.2.17, A₁, ..., Aₖ are nonnegative definite, and hence c₁A₁, ..., cₖAₖ are
nonnegative definite. Thus, it follows from Corollary 14.2.5 that c₀I + Σ_{i=1}^k cᵢAᵢ
is positive definite (and hence, in light of Lemma 14.2.8, nonsingular).
That (c₀I + Σ_{i=1}^k cᵢAᵢ)⁻¹ = d₀I + Σ_{i=1}^k dᵢAᵢ is clear upon observing that

(c₀I + Σᵢ cᵢAᵢ)(d₀I + Σᵢ dᵢAᵢ)
   = c₀d₀I + c₀ Σᵢ dᵢAᵢ + d₀ Σᵢ cᵢAᵢ + Σᵢ cᵢdᵢAᵢ² + Σ_{i, j≠i} cᵢdⱼAᵢAⱼ
   = I − Σᵢ [cᵢ/(c₀ + cᵢ)]Aᵢ + Σᵢ (cᵢ/c₀)Aᵢ − Σᵢ {cᵢ²/[c₀(c₀ + cᵢ)]}Aᵢ + 0
   = I − Σᵢ {[c₀cᵢ − cᵢ(c₀ + cᵢ) + cᵢ²]/[c₀(c₀ + cᵢ)]}Aᵢ
   = I.
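The inverse formula of this exercise can be checked numerically by building mutually orthogonal symmetric idempotent matrices from an orthonormal basis. The following Python/NumPy sketch (an editorial illustration; the particular values of c₀, c₁, c₂ are arbitrary) does so for k = 2:

    import numpy as np

    rng = np.random.default_rng(6)
    n = 6
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthonormal columns
    A1 = Q[:, :2] @ Q[:, :2].T      # symmetric idempotent, rank 2
    A2 = Q[:, 2:5] @ Q[:, 2:5].T    # symmetric idempotent, rank 3; A1 @ A2 = 0

    c0, c1, c2 = 2.0, 3.0, 0.5
    d0 = 1 / c0
    d1 = -c1 / (c0 * (c0 + c1))
    d2 = -c2 / (c0 * (c0 + c2))
    M = c0 * np.eye(n) + c1 * A1 + c2 * A2
    Minv = d0 * np.eye(n) + d1 * A1 + d2 * A2
    print(np.allclose(M @ Minv, np.eye(n)))             # expected: True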

EXERCISE 21. Let AI, ... , Ak represent n x n symmetric idempotent matrices


such that (for j 1= i = 1, ... , k) AiAj = 0, and let A represent an n x n
symmetric idempotent matrix such that (for i = 1, ... , k) C(Ai) c C(A). Show
that ifrank(Al)+ ... + rank(Ak) = rank(A), then Al + ... + Ak = A.
Solution. Suppose that rank(At} + ... + rank(Ak) = rank(A). Then, in light of
Corollary 10.2.2, we have that

tr(AI + ... + Ak) = tr(At} + ... + tr(Ak)


= rank(Al) + ... + rank(Ak) = rank(A) = tr(A).
Moreover, [since C(Aᵢ) ⊂ C(A)] there exists a matrix Lᵢ such that Aᵢ = ALᵢ, so
that AAᵢ = A²Lᵢ = ALᵢ = Aᵢ and AᵢA = Aᵢ'A' = (AAᵢ)' = Aᵢ' = Aᵢ (i =
1, ..., k), implying that

(A − Σᵢ Aᵢ)'(A − Σᵢ Aᵢ) = (A − Σᵢ Aᵢ)(A − Σᵢ Aᵢ)
   = A² − Σᵢ AᵢA − Σᵢ AAᵢ + Σᵢ Aᵢ² + Σ_{i, j≠i} AᵢAⱼ
   = A − Σᵢ Aᵢ − Σᵢ Aᵢ + Σᵢ Aᵢ + 0
   = A − Σᵢ Aᵢ.

Thus,

tr[(A − Σᵢ Aᵢ)'(A − Σᵢ Aᵢ)] = tr(A − Σᵢ Aᵢ) = tr(A) − tr(Σᵢ Aᵢ) = 0.

We conclude (on the basis of Lemma 5.3.1) that A − Σᵢ Aᵢ = 0 or equivalently
that A₁ + ... + Aₖ = A.

EXERCISE 22. Let A represent an m x n matrix and B an n x m matrix. If B


is a generalized inverse of A, then rank (I - BA) = n - rank(A). Show that the
converse is also true; that is, show that if rank(1 - BA) = n - rank(A), then B is
a generalized inverse of A.
Solution. Suppose that rank(I − BA) = n − rank(A). Then, since rank(BA) ≤
rank(A), we have that

n − rank(BA) ≥ n − rank(A) = rank(I − BA).      (S.10)

Moreover, making use of Corollary 4.5.9, we find that

rank(I − BA) + rank(BA) ≥ rank[(I − BA) + BA] = rank(Iₙ) = n

and hence that
rank(I − BA) ≥ n − rank(BA).      (S.11)

Together, results (S.10) and (S.11) imply that

rank(I − BA) = n − rank(BA)

or equivalently that
rank(BA) + rank(1 - BA) = n.
Thus, it follows from Lemma 18.4.2 that BA is idempotent. Further,

n - rank(A) = rank(1 - BA) =n - rank(BA),

implying that rank(BA) = rank(A). We conclude (on the basis of Theorem 10.2.7)
that B is a generalized inverse of A.

EXERCISE 23. Let A represent the (n x n) projection matrix for a subspace U
of R^{n×1} along a subspace V of R^{n×1} (where U ⊕ V = R^{n×1}), and let B represent
the (n x n) projection matrix for a subspace W of R^{n×1} along a subspace X of
R^{n×1} (where W ⊕ X = R^{n×1}).
(a) Show that A + B is the projection matrix for some subspace C of R^{n×1} along
some subspace M of R^{n×1} (where C ⊕ M = R^{n×1}) if and only if BA = AB = 0,
in which case C = U ⊕ W and M = V ∩ X.
(b) Show that A − B is the projection matrix for some subspace C of R^{n×1} along
some subspace M of R^{n×1} (where C ⊕ M = R^{n×1}) if and only if BA = AB = B,
in which case C = U ∩ X and M = V ⊕ W. [Hint. Observe (in light of the result
that a matrix is a projection matrix for one subspace along another if and only if it
is idempotent and the result that a matrix, say K, is idempotent if and only if I − K
is idempotent) that A − B is the projection matrix for some subspace C along some
subspace M if and only if I − (A − B) = (I − A) + B is the projection matrix
for some subspace C* along some subspace M*, and then make use of Part (a)
and the result that the projection matrix for a subspace M* along a subspace C*
(where M* ⊕ C* = R^{n×1}) equals I − H, where H is the projection matrix for C*
along M*.]


along M*.]
Solution. (a) According to Theorem 17.6.14, A and B are idempotent, U =
C(A), W = C(B), V = N(A), and X = N(B).
Suppose now that A + B is the projection matrix for some subspace C along
some subspace M. Then, as a consequence of Theorem 17 .6.13 (or 17.6.14), A + B
is idempotent. And, it follows from Lemma 18.4.3 that BA = AB = O.
Conversely, suppose that BA = AB = O. Then, as a consequence of Lemma
18.4.3, A + B is idempotent. And, it follows from Theorem 17.6.13 that A + B is the
projection matrix for some subspace C along some subspace M and from Theorem
17.6.12 thatC = C(A+B) and (in light of Theorem 11.7.1) that M = N(A+B).
Moreover, since A, B, and A + B are all idempotent, it follows from Theorem
18.4.1 that
rank(A + B) = rank(A) + rank(B). (S.12)

Since (according to Lemmas 4.5.8 and 4.5.7)

rank(A + B) ≤ rank(A, B) ≤ rank(A) + rank(B),

we have [as a consequence of result (S.12)] that

rank(A + B) = rank(A, B) = rank(A) + rank(B),

implying [in light of result (4.5.5)] that C(A + B) = C(A, B) and (in light of
Theorem 17.2.4) that C(A) and C(B) are essentially disjoint. Thus, in light of
result (17.1.4), it follows that

C(A + B) = C(A) ⊕ C(B),

or equivalently that C = U ⊕ W.


It remains to show that M = V n X or equivalently that N (A + B) = N (A) n
N(B). Let x represent an arbitrary vector in N(A + B). Then, Ax + Bx = 0, and
consequently (since A2 = A, B2 = B, and BA = AB = 0)

Ax = A2x = A2x + ABx = A(Ax + Bx) = 0,


Bx = B2x = B2x + BAx = B(Ax + Bx) = o.
Thus, x E N(A)nN(B). WeconcludethatN(A+B) c N(A)nN(B) and hence
[since clearly N(A) nN(B) c N(A + B)] thatN(A + B) = N(A) nN(B).
(b) Since (according to Lemma 10.1.2) 1 - (A - B) is idempotent if and only if
A - B is idempotent, it follows from Theorem 17.6.13 that A - B is the projection
matrix for some subspace C along some subspace M if and only if 1 - (A -
B) = (I - A) + B is the projection matrix for some subspace C* along some
subspace M* - Lemma 10.1.2 and Theorem 17.6.13 are the results mentioned
parenthetically in the hint. And, since (according to Theorem 17.6.10) 1 - A is the
projection matrix for V along U, it follows from Part (a) that (I − A) + B is the
projection matrix for some subspace C* along some subspace M* if and only if

B(I − A) = (I − A)B = 0,

or equivalently if and only if

BA = AB = B,

in which case C* = V ⊕ W and M* = U ∩ X. The proof is complete upon
observing (in light of Theorem 17.6.10, which is the result whose use is prescribed
in the hint) that if (I − A) + B is the projection matrix for V ⊕ W along U ∩ X,
then A − B = I − [(I − A) + B] is the projection matrix for U ∩ X along V ⊕ W.
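As a small numerical illustration of Part (a) (added here; the projections below are orthogonal projections onto mutually orthogonal subspaces, which is one simple way to arrange BA = AB = 0), consider the following Python/NumPy sketch:

    import numpy as np

    rng = np.random.default_rng(7)
    n = 5
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    A = Q[:, :2] @ Q[:, :2].T     # projection matrix for a 2-dimensional subspace
    B = Q[:, 2:4] @ Q[:, 2:4].T   # projection matrix for a subspace orthogonal to it

    print(np.allclose(B @ A, 0), np.allclose(A @ B, 0))   # BA = AB = 0
    print(np.allclose((A + B) @ (A + B), A + B))          # A + B is idempotent
    print(np.linalg.matrix_rank(A + B) == 4)              # ranks add: 2 + 2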

EXERCISE 24. (a) Let B represent an n x n symmetric matrix, and let W represent
an n x n symmetric nonnegative definite matrix. Show that

WBWBW = WBW ⇔ (BW)³ = (BW)²
            ⇔ tr[(BW)²] = tr[(BW)³] = tr[(BW)⁴].

(b) Let AI, ... , Ak represent n x n matrices, let V represent an n x n symmetric


nonnegative definite matrix, and define A = Al + ... Ak. If V Ai V Ai V = V Ai V
for all i and if V Ai VA j V = 0 for all i and j i= i, then V A V AV = V A V and
rank(VAI V) + ... + rank(VAk V) = rank(V AV). Conversely, if VA VAV =
V A V, then each of the following three conditions implies the other two: (1)
VAj VA} V = 0 (for j i= i = 1, ... , k) and rank(VAi VA i V) = rank (V Ai V) (for
i = 1, ... , k); (2) VAiVAiV = VAiV (for i = 1, ... , k); (3) rank(VAIV) +
... + rank(VAkV) = rank(VAV). Indicate how, in the special case where AI,
... , Ak are symmetric, the conditions VA V A V = V A V and V Ai V Ai V = V Ai V
can be reexpressed by applying the results of Part (a) (of the current exercise).
Solution. (a) Let S represent any matrix such that W = S'S - the existence of
such a matrix follows from Corollary 14.3.8.
If WBWBW = WBW, then clearly BWBWBW = BWBW, or equivalently
(BW)3 = (BW)2. Conversely, suppose that (BW)3 = (BW)2. Then,

(SB)'SBWBW = (SB)'SBW,

implying (in light of Corollary 5.3.3) that SBWBW = SBW, so that

WBWBW = S'SBWBW = S'SBW = WBW.

It remains to show that

(BW)3 = (BW)2 <=> tr[(BW)2] = tr[(BW)3] = tr[(BW)4].

Suppose that (BW)3 = (BW)2. Then, clearly,

(BW)4 = BW(BW)3 = BW(BW)2 = (BW)3.



Thus, tr[(BW)2] = tr[(BW)3] = tr[(BW)4].


Conversely, suppose that tr[(BW)2] = tr[(BW)3] = tr[(BW)4]. Then, making
use of Lemma 5.2.1, we find that

tr[(SBS' - SBWBS')'(SBS' - SBWBS')]


= tr(SBWBS' - 2SBWBWBS' + SBWBWBWBS')
= tr(BWBS'S) - 2tr(BWBWBS'S) + tr(BWBWBWBS'S)
= tr[(BW)2] - 2tr[(BW)3] + tr[(BW)4]
=0.

Thus, it follows from Lemma 5.3.1 that SBS' - SBWBS' = 0, or equivalently that
SBWBS' = SBS', so that

(BW)3 = BS' (SBWBS')S = BS' (SBS')S = (BW)2.

(b) Suppose that A₁, ..., Aₖ are symmetric (in which case A is also symmetric).
Then, applying the results of Part (a) (with A in place of B and V in place of W),
we find that

VAVAV = VAV ⇔ (AV)³ = (AV)² ⇔ tr[(AV)²] = tr[(AV)³] = tr[(AV)⁴].

Similarly, applying the results of Part (a) (with Aᵢ in place of B and V in place of
W), we find that

VAᵢVAᵢV = VAᵢV ⇔ (AᵢV)³ = (AᵢV)²
                ⇔ tr[(AᵢV)²] = tr[(AᵢV)³] = tr[(AᵢV)⁴].

EXERCISE 25. Let R represent an n x q matrix, S an n x m matrix, T an m x p


matrix, and U a p x q matrix.
(a) Show that

rank(R + STU) = rank ( R    −ST )  − rank(T).      (E.3)
                     ( TU     T )

(b)LetER = I-RR- ,FR = I-R-R,X = ERST, Y = TUFR,Ey = I-YY-,


Fx = I - X-X, Q = T + TUR-ST, and Z = EyQFx. Use the result of Part (b)
of Exercise lO.lO to show that

rankeR + STU) = rankeR) + rank(X) + rank(Y) + rank(Z) - rank(T).

Solution. (a) Observing that

( R    −ST )( I    0 )   ( R + STU   −ST )
( TU     T )( −U   I ) = (     0       T )

and making use of Lemma 8.5.2 and Corollary 9.6.2, we find that

rank ( R    −ST )  = rank ( R + STU   −ST )  = rank(R + STU) + rank(T)
     ( TU     T )         (     0       T )

and hence that

rank(R + STU) = rank ( R    −ST )  − rank(T).
                     ( TU     T )

Or, alternatively, equality (E.3) can be validated by making use of result (9.6.1):
observing that C(TU) ⊂ C(T) and R(−ST) ⊂ R(T), we find that

rank ( R    −ST )  = rank(T) + rank[R − (−ST)T⁻TU]
     ( TU     T )
                   = rank(T) + rank(R + STU)

and hence that

rank(R + STU) = rank ( R    −ST )  − rank(T).
                     ( TU     T )

(b) Upon observing that rank(X) = rank(−X), that −X⁻ is a generalized inverse
of −X, and that F_X = I − (−X⁻)(−X), it follows from the result of Part (b) of
Exercise 10.10 that

rank ( R    −ST )  = rank(R) + rank(X) + rank(Y) + rank(Z).
     ( TU     T )

We conclude, on the basis of Part (a) (of the current exercise), that

rank(R + STU) = rank(R) + rank(X) + rank(Y) + rank(Z) − rank(T).

EXERCISE 26. Show that, for any m x n matrices A and B,

rank(A + B) ≥ |rank(A) − rank(B)|.

Solution. Making use of results (4.5.7) and (4.4.3), we find that

rank(A) = rank[(A + B) − B] ≤ rank(A + B) + rank(−B) = rank(A + B) + rank(B)

and hence that

rank(A + B) ≥ rank(A) − rank(B).      (S.13)

Similarly, we find that

rank(B) = rank[(A + B) − A] ≤ rank(A + B) + rank(−A) = rank(A + B) + rank(A)

and hence that

rank(A + B) ≥ rank(B) − rank(A) = −[rank(A) − rank(B)].      (S.14)

Together, results (S.13) and (S.14) imply that

rank(A + B) ≥ |rank(A) − rank(B)|.

EXERCISE 27. Show that, for any n x n symmetric nonnegative definite matrices
A and B,

C(A + B) = C(A, B),   R(A + B) = R( A ),
                                   ( B )

rank(A + B) = rank(A, B) = rank( A ).
                               ( B )

Solution. According to Corollary 14.3.8, there exist matrices R and S such that
A = R'R and B = S'S. And, upon observing that

A + B = ( R )'( R )
        ( S ) ( S )

and recalling Corollaries 7.4.5 and 4.5.6, we find that

C(A, B) = C(R'R, S'S) = C(R', S') = C[ ( R )' ] = C(A + B)
                                       ( S )

[which implies that rank(A, B) = rank(A + B)] and similarly that

R( A ) = R( R'R ) = R( R ) = R(A + B)
 ( B )    ( S'S )    ( S )

[which implies that rank[(A', B')'] = rank(A + B)].


EXERCISE 28. Let A and B represent m x n matrices.
(a) Show that (1) C(A) ⊂ C(A + B) if and only if rank(A, B) = rank(A + B)
and (2) R(A) ⊂ R(A + B) if and only if rank[(A', B')'] = rank(A + B).

(b) Show that (1) ifR(A) and R(B) are essentially disjoint, then C(A) c C(A +
B) and (2) if C(A) and C(B) are essentially disjoint, then R(A) c R(A + B).
Solution. (a)(l) Suppose that rank (A , B) = rank(A+B). Then, since (according to
Lemma 4.5.8) C(A + B) c C(A, B), it follows from Theorem 4.4.6 that C(A, B) =
C(A + B). Then, since C(A) C C(A, B), we have that C(A) C C(A + B).
Conversely, suppose that C(A) C C(A + B). Then, according to Lemma 4.2.2,
there exists a matrix F such that A = (A + B)F. Further, B = (A + B) - A =

(A + B)(I - F). Thus (A, B) = (A + B)(F, I - F), implying that C(A, B) c


C(A + B). Since (according to Lemma 4.5.8) C(A + B) c C(A, B), we conclude
that C(A, B) = C(A + B) and hence that rank(A, B) = rank(A + B).
(2) The proof of Part (2) is analogous to that of Part (1).
(b) Let c = dim[C(A) n C(B)], d = dim[R(A) n R(B)], and

(1) If R(A) and R(B) are essentially disjoint (or equivalently if d = 0), then
(as a consequence of Theorem 18.5.6) rank(H) = 0 = d, implying [in light of
result (5.14)] that
rank (A + B) = rank(A, B),
and hence [in light of part (a)-(l)] that C(A) c C(A + B).
(2) Similarly, if C(A) and C(B) are essentially disjoint (or equivalently if c = 0),
then (as a consequence of Theorem 18.5.6) rank(H) = 0 = c, implying [in light
of result (5.18)] that
rank(A + B) = rank[(A', B')']

and hence [in light of Part (a)-(2)] that R(A) c R(A + B).

EXERCISE 29. Let A and B represent m x n matrices. Show that each of the
following five conditions is necessary and sufficient for rank additivity [i.e., for
rank(A + B) = rank(A) + rank(B)]:

(a) rank(A, B) = rank[(A', B')'] = rank(A) + rank(B);

(b) rank(A) = rank[A(I − B⁻B)] = rank[(I − BB⁻)A];
(c) rank(B) = rank[B(I − A⁻A)] = rank[(I − AA⁻)B];
(d) rank(A) = rank[A(I − B⁻B)] and rank(B) = rank[(I − AA⁻)B];
(e) rank(A) = rank[(I − BB⁻)A] and rank(B) = rank[B(I − A⁻A)].

Solution. Let r = dim[C(A) n C(B)] and s = dim[R(A) n R(B)]. In light of


Theorem 18.5.7, it suffices to show that the condition r = s = 0 is equivalent to
each of Conditions (a)-(e).
That r = s = 0 implies Condition (a) and conversely is an immediate con-
sequence of results (5.15) and (5.19). That r = s = 0 is equivalent to each
of Conditions (b) - (e) becomes clear upon observing that (as a consequence of
Corollary 17.2.10)

r = 0 ⇔ rank(A) = rank[(I − BB⁻)A] ⇔ rank(B) = rank[(I − AA⁻)B],

and

s = 0 ⇔ rank(A) = rank[A(I − B⁻B)] ⇔ rank(B) = rank[B(I − A⁻A)].

EXERCISE 30. Let A and B represent m x n matrices. And, let

(a) Show that


rank(A − B) = rank(A) − rank(B) + [rank(A, B) − rank(A)]
            + [rank[(A', B')'] − rank(A)] + rank(D).

Do so by applying (with -B in place of B) the formula

rank(A + B) = rank(A, B) + rank[(A', B')'] − rank(A) − rank(B) + rank(K),  (*)

where

(b) Show that A and B are rank subtractive [in the sense that rank(A − B) =
rank(A) − rank(B)] if and only if rank(A, B) = rank[(A', B')'] = rank(A) and D = 0.

(c) Show that if rank(A, B) = rank[(A', B')'] = rank(A), then (1) (A⁻, 0) and
[(A⁻)', 0]' are generalized inverses of (A', B')' and (A, B), respectively, and (2) for
[(A', B')']⁻ = (A⁻, 0) and (A, B)⁻ = [(A⁻)', 0]',

D = ( 0       0      )
    ( 0   BA⁻B − B ).


(d) Show that each of the following three conditions is necessary and sufficient
for rank subtractivity [i.e., for rank (A - B) = rank (A) - rank(B)]:

(1) rank(A, B) = rank(~) = rank(A) and BA -B = B;

(2) C(B) c C(A), R(B) c R(A), and BA -B = B;


(3) AA-B = BA-A = BA-B = B.

(e) Using the result of Exercise 29 (or otherwise), show that rank (A - B) =
rank(A) - rank(B) if and only if rank (A - B) = rank[A(I - B-B)] = rank[(I-
BB-)A].
Solution. (a) Clearly,

(A, −B) = (A, B) ( Iₙ    0  )
                 ( 0    −Iₙ ).

Thus, it follows from Parts (1) and (2) of Lemma 9.2.4 that [(A', B')']⁻ (Iₘ  0; 0  −Iₘ)⁻¹
is a generalized inverse of (A', −B')' and that (Iₙ  0; 0  −Iₙ)⁻¹(A, B)⁻ is a generalized
inverse of (A, −B). Further,

(~m _~m)H(~n -~J


~ [1- (-!)(!r (~ -~n (~ -~)
x (! -:)Gi _~)[I- Gi _~r(A'DnA,-B)]

~ [1- (-!)(!r (!m -~n


x (! _:)[1- Gi _~J(A' DnA, -D)J
Now, applying result (*) [which is equivalent to result (5.7) or, when combined
with result (17.4.13) or (17.4.12), to result (5.8) or (5.9)] and recalling Corollary
4.5.6, we find that
rank(A − B) = rank[A + (−B)]

   = rank(A, −B) + rank[(A', −B')'] − rank(A) − rank(−B)
        + rank[(Iₘ  0; 0  −Iₘ)H(Iₙ  0; 0  −Iₙ)]

   = rank(A, B) + rank[(A', B')'] − rank(A) − rank(B) + rank(H)
   = rank(A) − rank(B) + [rank(A, B) − rank(A)]
        + [rank[(A', B')'] − rank(A)] + rank(H).
(b) It follows from Part (a) that rank(A - B) = rank(A) - rank(B) if and only
if

[rank(A, B) - rank(A)] + [rank(!) - rank(A)] + rank(H) = O. (S.15)

Since all three terms of the left side of equality (S.15) are nonnegative, we conclude
that rank(A - B) = rank(A) - rank(B) if and only if rank(A, B) - rank(A) =
0, rank(!) - rank(A) = 0, and rank (H) = 0 or, equivalently, if and only if
rank(A, B) = rank(!) = rank(A) and H = O.

(c) Suppose that rank (A , B) = rank(~) = rank(A). Then, according to Corol-


lary 4.5.2, C(B) c C(A) and R(B) c R(A), and it follows from Lemma 9.3.5
that AA -B = B and BA - A = B.
Thus, (1)

and (2) upon setting (~r and (A, B) - equal to (A - , 0) and (: -), respectively,
we obtain

(d) (1) Suppose that rank(A, B) = rank (~) = rank (A) and that BA-B = B.

Then, it follows from Part (c) that (A - ,0) and (: -) are generalized inverses

of (~) and (A, B), respectively, and that, for (~r = (A -,0) and (A, B)- =

( : -), H = O. Thus, as a consequence of Part (b), we have that rank(A - B) =


rank (A) - rank(B).
Conversely, suppose that rank(A - B) = rank(A) - rank(B). Then, according
to Part (b), rank(A, B) = rank(~) = rank(A) and H = 0 [for any choice

of (~r and (A, B)-]. And, observing [in light of Part (c)] that (A -,0) and

(:-) are generalized inverses of (~) and (A, B), respectively, and that, for

(~r = (A-, 0) and (A,B)- = (:-), H= (~ BA-~_B),wefindthat


BA -B - B = 0 or equivalently that BA -B = B.

(2) Since (according to Corollary 4.5.2) C(B) c C(A) {:} rank(A, B) =


rank(A) and R(B) C R(A) {:} rank ( !) = rank(A), Condition (2) is equivalent
to Condition (1) and hence is necessary and sufficient for rank subtractivity.
(3) Since (according to Lemma 9.3.5) AA -B = B {:} C(B) C C(A) and
BA - A = B {:} R(B) c R(A), Condition (3) is equivalent to Condition (2) and
hence is necessary and sufficient for rank subtractivity.
(e) Clearly, rank(A - B) = rank(A) - rank(B) if and only ifrank[(A - B) + B]
= rank(A - B) + rank(B), that is, if and only if A - Band B are rank additive.
Moreover, it follows from the result of Exercise 29 [specifically, Condition (b)]
that A - B and B are rank additive if and only if

rank (A - B) = rank[(A - B)(I - B-B)] = rank[(I - BB-)(A - B)].

Since (A - B)(I - B-B) = A(I - B-B) and (I - BB-)(A - B) = (I - BB-)A,


we conclude that rank(A - B) = rank (A) - rank(B) if and only if rank (A - B) =
rank[A(I - B-B)] = rank[(I - BB-)A].

EXERCISE 31. Let A₁, ..., Aₖ represent m x n matrices. Adopting the termi-
nology of Exercise 17.6, use Part (a) of that exercise to show that if R(A₁), ...,
R(Aₖ) are independent and C(A₁), ..., C(Aₖ) are independent, then rank(A₁ +
... + Aₖ) = rank(A₁) + ... + rank(Aₖ).
Solution. Suppose that R(A₁), ..., R(Aₖ) are independent and also that C(A₁),
..., C(Aₖ) are independent. Then, as a consequence of Part (a) of Exercise 17.6,
we have that, for i = 2, ..., k, R(Aᵢ) and R(A₁) + ... + R(Aᵢ₋₁) are essentially
disjoint and C(Aᵢ) and C(A₁) + ... + C(Aᵢ₋₁) are essentially disjoint or equiva-
lently [in light of results (17.1.7) and (17.1.6)] that, for i = 2, ..., k, R(Aᵢ) and
R[(A₁', ..., Aᵢ₋₁')'] are essentially disjoint and C(Aᵢ) and C(A₁, ..., Aᵢ₋₁) are essentially
disjoint. Moreover, it follows from results (4.5.9) and (4.5.8) that (for i = 2, ..., k)

R(A₁ + ... + Aᵢ₋₁) ⊂ R[(A₁', ..., Aᵢ₋₁')'] and C(A₁ + ... + Aᵢ₋₁) ⊂ C(A₁, ..., Aᵢ₋₁).

Thus, for i = 2, ..., k, R(Aᵢ) and R(A₁ + ... + Aᵢ₋₁) are essentially disjoint
and C(Aᵢ) and C(A₁ + ... + Aᵢ₋₁) are essentially disjoint, implying (in light of
Theorem 18.5.7) that (for i = 2, ..., k)

rank(A₁ + ... + Aᵢ₋₁ + Aᵢ) = rank(A₁ + ... + Aᵢ₋₁) + rank(Aᵢ).

We conclude that

rank(A₁ + ... + Aₖ) = rank(A₁) + ... + rank(Aₖ).

EXERCISE 32. Let T represent an m x p matrix, U an m x q matrix, V an


n x p matrix, and W an n x q matrix, and define Q = W - VT-U. Further, let
ET = I - TT-, FT = I - T-T, X = ETU, and Y = VFT.
(a) Show that

rank ( T   U )  = rank(T) + rank ( 0   X ).      (E.4)
     ( V   W )                   ( Y   Q )

(b) Show that

rank(~ ~) = rank(T) + rank ( -V~-U !).


[Hint. Observe that (since the rank of a matrix is not affected by a pennutation of
rows or columns) rank ( ~ ~) = rank(~ ~), and make use of Part (a).]
(c) Show that

rank(~ ~) = rank(T) + rank (X) + rank(Y) + rank(EyVT-UFx),


where Ey = 1- YY- and Fx = I - X-X. Hint. Use Part (b) in combination
with the result that, for any r x s matrix A, r x v matrix B, and u x s matrix C,

rank ( ~ :) = rank(B) + rank(C) + rank[(1 - BB-)A(I - c-q]. (*)

(d) Show that

rank (~ ~) = rank(T) + rank(Q) + rank(A) + rank(B)


+ rank[(1 - AA -)XQ-Y(I - B-B)],

where A = X(I-Q-Q) andB = (I-QQ-)Y. [Hint. Use Part (c) in combination


with Part (a).]
Solution. (a) Let K = Q − YT⁻U. Then, clearly,

( I      0 )( T   U )( I   −T⁻U )   ( T   U )( I   −T⁻U )   ( T   X )
( −VT⁻   I )( V   W )( 0     I  ) = ( Y   Q )( 0     I  ) = ( Y   K ),

so that (in light of Lemma 8.5.2)

rank ( T   U )  = rank ( T   X ).      (S.16)
     ( V   W )         ( Y   K )

Further,

(~ ~) = (! ~) + (~ ~).
And, since n(!) = neT) and n(~) = n(Y) and since (according to Corol-

lary 17.2.8) neT) and n(Y) are essentially disjoint, n(!) and n (~) are es-

sentially disjoint, and hence it follows from Corollary 17.2.16 that n(! ~)

and n(~ ~) are essentially disjoint. Similarly, c(! ~) and c(~ ~) are
essentially disjoint. Thus, making use of Theorem 18.5.7, we find that

rank(~ ~) = rank(! ~) + rank ( ~ ~)


=rank(T)+rank(~ ~). (S.17)

Now, observe that

( 0   X )( I   T⁻U )   ( 0   X )
( Y   K )( 0     I ) = ( Y   Q )

and hence that

rank ( 0   X )  = rank ( 0   X ).      (S.18)
     ( Y   K )         ( Y   Q )

Finally, combining results (S.16)-(S.18), we obtain

rank ( T   U )  = rank(T) + rank ( 0   X )  = rank(T) + rank ( 0   X ).
     ( V   W )                   ( Y   K )                   ( Y   Q )


(b) Since (as indicated by Lemma 8.5.1) the rank of a matrix is not affected by
a permutation of rows or columns,

(S.19)

Moreover, applying result (E.4) (in the special case where W = 0) and again
making use of Lemma 8.5.1, we find that

rank(~ ~) = rank(T) + rank ( ~ _ V~-U)


= rank(T) + rank ( - V~-U !). (S.20)

And, upon combining result (S.20) with result (S.l9), we obtain

0
rank ( U V)
T = rank(T) + rank (-VT-U
X Y)

(c) Applying result (* ) [or equivalently result (17.2.15), which is part of Theorem
17.2.17] with -VT-U, Y, and X in place of A, B, and C, respectively (or T, U,
and V, respectively), we find that

rank (
-VT-U
X !) = rank(Y) + rank(X)
+rank[(1 - YY-)(-VT-U)(I - X-X)]
= rank(Y) + rank(X) + rank(EyVT-UFx). (S.21)

And, upon combining result (S.21) with Part (b), we obtain

rank(~ ~) = rank(T) + rank(Y) + rank (X) + rank(EyVT-UFx).

(d) Applying Part (c) (with Q, Y, and X in place of T, U, and V, respectively,


and hence with A and B in place of Y and X, respectively), we find that

rank(~ ~) = rank(Q) + rank(B) + rank(A)


+ rank[(1 - AA -)XQ-Y(I - B-B)]. (S.22)

And, upon combining result (S.22) with Part (a), we obtain

rank(~ ~) = rank(T) + rank(Q) + rank(A) + rank(B)


+ rank[(I − AA⁻)XQ⁻Y(I − B⁻B)].

EXERCISE 33. Let R represent an n x q matrix, S an n x m matrix, T an


m x p matrix, and U a p x q matrix, and define Q = T + TUR-ST. Further, let
ER = 1- RR-, FR = 1- R-R, X = ERST, Y = TUFR, A = X(I - Q-Q),
B = (I - QQ-)Y. Use the result of Part (d) of Exercise 32 in combination with
the result of Part (a) of Exercise 25 to show that

rankeR + STU) = rankeR) + rank(Q)


- rank(T) + rank(A) + rank(B)
+ rank[(1 - AA -)XQ-Y(I - B-B)].

Solution. Upon observing that rank(A) = rank( -A), that -A-is a generalized
inverse of -A, and that
rank[(1 - AA -)XQ-Y(I - B-B)]
= rank[ -(I - AA -)XQ-Y(I - B-B)]
= rank{[1 - (-A)(-A-)](-X)Q-Y(I - B-B»),

it follows from the result of Part (d) of Exercise 32 that

rank ( R    −ST )  = rank(R) + rank(Q) + rank(A) + rank(B)
     ( TU     T )
                   + rank[(I − AA⁻)XQ⁻Y(I − B⁻B)].

We conclude, on the basis of Part (a) of Exercise 25, that

rankeR + STU) = rankeR) + rank(Q) - rank(T)


+ rank(A) + rank(B)
+ rank[(I - AA -)XQ-Y(I - B-B)].
19
Minimization of a Second-Degree
Polynomial (in n Variables) Subject to
Linear Constraints

EXERCISE 1. Let a represent an n x 1 vector of (unconstrained) variables, and


define f(a) = a'Va − 2b'a, where V is an n x n matrix and b an n x 1 vector.
Show that if V is not nonnegative definite or if b ∉ C(V), then f(a) is unbounded
from below, that is, corresponding to any scalar c, there exists a vector a* such
that f(a*) < c.
Solution. Let c represent an arbitrary scalar.
Suppose that V is not nonnegative definite. Then, there exists an n x 1 vector x
such that x'Vx < O. Moreover, for any scalar k,

f(kx) = k²(x'Vx) − 2k(b'x) = k[k(x'Vx) − 2(b'x)].

Thus, lim_{k→∞} f(kx) = −∞, and hence there exists a scalar k* such that
f(k*x) < c, so that, for a* = k*x, f(a*) < c.
Or, suppose that b ∉ C(V). According to Theorem 12.5.11, C(V) contains a
(unique) vector b₁ and C⊥(V) [or equivalently N(V')] contains a (unique) vector
b₂ such that b = b₁ + b₂. Clearly, b₂ ≠ 0 [since otherwise we would arrive at a
contradiction of the supposition that b ∉ C(V)]. Moreover, for any scalar k,

f(kb₂) = k²(b₂'Vb₂) − 2k(b'b₂) = −2k(b₂'b₂).

Thus, lim_{k→∞} f(kb₂) = −∞, and hence there exists a scalar k* such that
f(k*b₂) < c, so that, for a* = k*b₂, f(a*) < c.

EXERCISE 2. Let V represent an n x n symmetric matrix and X an n x p matrix.


Show that, for any p x p matrix U such that C(X) ⊂ C(V + XUX'),

(1) (V + XUX')(V + XUX')⁻V = V;


(2) V(V + XUX')-(V + XUX') = V.

Solution. (1) According to Lemma 19.3.4, C(V, X) = C(V +XUX'), implying (in
light of Lemma 4.5.1) that C(V) c C(V + XUX') and hence (in light of Lemma
9.3.5) that (V + XUX')(V + XUX')-V = V.
(2) According to Lemma 19.3.4, C(V, X) = C(V + XU'X'), implying that
C(V) c C(V + XU'X') and hence (in light of Lemma 4.2.5) that

R(V) ⊂ R[(V + XU'X')'] = R(V + XUX').

Thus, it follows from Lemma 9.3.5 that

V(V + XUX')⁻(V + XUX') = V.
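These two identities are easy to check numerically for a singular nonnegative definite V. The following Python/NumPy sketch (an editorial illustration; the choice U = I is simply one convenient way to guarantee C(X) ⊂ C(V + XUX'), since then C(V + XX') = C(V, X)) uses the Moore-Penrose inverse as one particular generalized inverse:

    import numpy as np

    rng = np.random.default_rng(8)
    n, p = 6, 2
    F = rng.standard_normal((n, 3))
    V = F @ F.T                        # symmetric nonnegative definite, rank 3
    X = rng.standard_normal((n, p))
    U = np.eye(p)                      # then C(X) is contained in C(V + XUX')

    M = V + X @ U @ X.T
    G = np.linalg.pinv(M)              # one generalized inverse of V + XUX'
    print(np.allclose(M @ G @ V, V))   # (V + XUX')(V + XUX')^- V = V
    print(np.allclose(V @ G @ M, V))   # V (V + XUX')^- (V + XUX') = V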

EXERCISE 3. Let V represent an n x n symmetric nonnegative definite matrix,


X an n x p matrix, B an n x s matrix such that C(B) c C(V, X), and D a p x s
matrix such that C(D) c C(X'). Further, let U represent any p x p matrix such
that C(X) c C(V + XUX'), and let W represent an arbitrary generalized inverse
of V + XUX'. Devise a short proof of the result that A* and R* are respectively
the first (n x s) and second (p x s) parts of a solution to the (consistent) linear
system

( V    X )( A )   ( B )
( X'   0 )( R ) = ( D )

(in an n x s matrix A and a p x s matrix R) if and only if

R*=T*+UD

and
A* = WB - WXT* + [I - W(V + XUX')]L
for some solution T * to the (consistent) linear system

X'WXT = X'WB - D

(in a p x s matrix T) and for some n x s matrix L. Do so by taking advantage


of the result that if the coefficient matrix, right side, and matrix of unknowns in a
linear system HY = S (in Y) are partitioned (conformally) as

H = ( H₁₁   H₁₂ ),   S = ( S₁ ),   and   Y = ( Y₁ ),
    ( H₂₁   H₂₂ )        ( S₂ )              ( Y₂ )

and if

C(H₁₂) ⊂ C(H₁₁), C(S₁) ⊂ C(H₁₁), and R(H₂₁) ⊂ R(H₁₁),



then the matrix Y* = (Y₁*', Y₂*')' is a solution to the linear system HY = S if and only
if Y₂* is a solution to the linear system

(H₂₂ − H₂₁H₁₁⁻H₁₂)Y₂ = S₂ − H₂₁H₁₁⁻S₁

and Y₁* and Y₂* are a solution to the linear system

H₁₁Y₁ + H₁₂Y₂ = S₁.
Solution. According to Lemma 19.3.2, A* and R* are the first and second parts
of a solution to linear system (*) [or equivalently linear system (3.14)] if and only
if A* and R* - UD are the first and second parts of a solution to the linear system

(S.l)

(in A and T).


Now, observing (in light of Lemma 19.3.4) that R(X') c R(V + XUX') and
C(B) c C(V + XUX'), it follows from the cited result [or equivalently from Part
(1) of Theorem 11.11.1] that A* and T * are the first and second parts of a solution
to linear system (S.l) if and only if T * is a solution to the linear system

(0 - X'WX)T = D - X'WB (S.2)

(in T) and
(V + XUX')A* + XT * = B. (S.3)
Note that linear system (S.2) is equivalent to linear system (**) [which is iden-
tical to linear system (3.17)]. Note also that condition (S.3) is equivalent to the
condition
(V + XUX')A* = B - XT*. (S.4)
And, since (in light of Lemma 19.3.4) C(B - XT*) c C(V + XUX'), it follows
from Theorem 11.2.4 that condition (S.4) is equivalent to the condition that

A* = WB - WXT* + [I - W(V + XUX')]L

for some matrix L.


Thus, A* and R* − UD are the first and second parts of a solution to linear
system (S.1) if and only if R* − UD = T* (or equivalently R* = T* + UD) and

A* = WB - WXT* + [I - W(V + XUX')]L

for some solution T* to linear system (**) [or equivalently linear system (3.17)]
and for some matrix L.

EXERCISE 4. Let V represent an n x n symmetric matrix and X an n x p matrix.


Further, let U = X'TT'X, where T is any matrix whose columns span the null
212 19. Minimization of a Second-Degree Polynomial

space of V. Show that C(X) c C(V + XUX') and that C(V) and C(XUX') are
essentially disjoint and R(V) and R(XUX') are essentially disjoint (even in the
absence of any assumption that V is nonnegative definite).
Solution. Making use of Corollaries 7.4.5, 4.5.6, and 17.2.14, we find that

C(V, X) = C(V, XX') = C(V, XX'T).

Further,
C(XX'T) = C[(XX'T)(XX'T)'J = C(XUX').
Thus, again making use of Corollary 4.5.6, we have that

rank(V, X) = rank(V, XX'T) = rank(V, XUX').

And, in light of Corollary 17.2.14, C(V) and C(XUX') are essentially disjoint.
Moreover, since XUX' is clearly symmetric, it follows from Lemma 17.2.1 that
R(V) and R(XUX') are also essentially disjoint.
Finally, in light of Theorem 18.5.6, it follows from result (5.13) or (5.14) that
rank (V + XUX') = rank(V, XUX'), so that rank (V + XUX') = rank(V, X) or
equivalently (in light of Lemma 19.3.4) C(X) c C(V + XUX').
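A numerical illustration of this exercise (an editorial addition; here T is obtained from the SVD of V, and the two rank comparisons stand in for the inclusion C(X) ⊂ C(V + XUX') and the essential disjointness of C(V) and C(XUX')) can be written as follows:

    import numpy as np

    rng = np.random.default_rng(9)
    n, p, r = 6, 2, 3
    F = rng.standard_normal((n, r))
    V = F @ F.T                               # symmetric nonnegative definite, rank r
    X = rng.standard_normal((n, p))

    # columns of T span the null space of V (computed from the SVD of V)
    U_svd, s, _ = np.linalg.svd(V)
    T = U_svd[:, s < 1e-10]
    U = X.T @ T @ T.T @ X                     # U = X'TT'X

    M = V + X @ U @ X.T
    print(np.linalg.matrix_rank(np.hstack([M, X])) == np.linalg.matrix_rank(M))
    print(np.linalg.matrix_rank(M)
          == np.linalg.matrix_rank(V) + np.linalg.matrix_rank(X @ U @ X.T))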

EXERCISE 5. Let V represent an n x n symmetric nonnegative definite matrix


and X an n x p matrix. Further, let Z represent any matrix whose columns span
N(X') or, equivalently, Cl.(X). And, adopt the same terminology as in Exercise
17.20.
(a) Using the result of Part (a) of Exercise 17.20 (or otherwise) show that an
n x n matrix H is a projection matrix for C(X) along C(VZ) if and only if H' is
the first (n x n) part of a solution to the consistent linear system

(E.l)

(in an n x n matrix A and a p x n matrix R).


(b) Letting U represent any p x p matrix such that C(X) C C(V + XUX') and
letting W represent an arbitrary generalized inverse of V + XUX', show that an
n x n matrix H is a projection matrix for C(X) along C(VZ) if and only if

H = px.w + K[I - (V + XUX')WJ

for some n x n matrix K.


Solution. (a) In light of the result of Part (a) of Exercise 17.20, it suffices to show
that HX = X and HVZ = 0 (or equivalently that X'H' = X' and Z'VH' = 0) if
and only if H' is the first part of a solution to linear system (E. 1).
Now, suppose that X'H' = X' and Z'VH' = 0. Then, in light of Corollary
12.1.2, it follows from Corollary 12.5.5 that C(VH') ⊂ C(X) and hence (in light
of Lemma 4.2.2) that VH' = XT for some matrix T. Thus,

VH' + X(−T) = 0 and X'H' = X',

so that H' is the first part of a solution to linear system (E.1).


Conversely, suppose that H' is the first part of a solution to linear system (E.1)
and hence that

VH' + XR* = 0 and X'H' = X'

for some p x n matrix R*. Then, X'H' = X'. And, VH' = X(−R*), implying
that C(VH') ⊂ C(X) and hence (in light of Corollaries 12.5.5 and 12.1.2) that
Z'VH' = 0.
(b) In light of Part (a), it suffices to show that H' is the first part of a solution to
linear system (E.l) if and only if

H = px.w + K[I - (V + XUX')W]

for some matrix K.


According to Lemma 19.3.4, C(X) c C(V + XU'X'). Moreover, since V +
XU'X' = (V + XUX') , and since X'W'X = (X'WX)', W' is a generalized inverse
of V + XU'X', and [(X'WX) -]' is a generalized inverse of X'W'X. Thus, it follows
from the results of Section 19.3c that H' is the first part of a solution to linear system
(E.l) if and only if

H' = W'X[(X'WX)⁻]'X' + [I − W'(V + XU'X')]K'

for some (n x n) matrix K, or equivalently if and only if

H = Px,w + K[I - (V + XUX')W]

for some matrix K.

EXERCISE 6. Let V represent an n x n symmetric nonnegative definite matrix, W


an n x n matrix, and X an n x p matrix. Show that, for the matrix WX(X'WX)-X'
to be the first (n x n) part of some solution to the (consistent) linear system

(E.2)

(in an n x n matrix A and a p x n matrix R), it is necessary and sufficient that


C(VWX) c C(X) and rank(X'WX) = rank(X).
Solution. Clearly, WX(X'WX)-X' is the first part of a solution to linear sys-
tem (E.2) if and only if VWX(X'WX) -X' + XR = 0 for some matrix R and
X'WX(X'WX)-X' = X', or equivalently if and only if

C[VWX(X'WX)-X'] c C(X) (S.5)



and
X'WX(X'WX)-X' = X'. (S.6)
And, by following the same line of reasoning as in the latter part of the proof
of Theorem 19.5.1, we find that conditions (S.5) and (S.6) are equivalent to the
conditions that C(VWX) c C(X) and rank(X'WX) = rank(X).

EXERCISE 7. Let a represent an n x 1 vector of variables, and impose on a


the constraint X' a = d, where X is an n x p matrix and d is a p x 1 vector
such that d ∈ C(X'). Define f(a) = a'Va − 2b'a, where V is an n x n symmetric
nonnegative definite matrix and b is an n x 1 vector such that b ∈ C(V, X). Further,
define g(a) = a'(V + W)a − 2(b + e)'a, where W is any n x n matrix such that
C(W) ⊂ C(X) and R(W) ⊂ R(X') and where e is any n x 1 vector in C(X).
Show that the constrained (by X'a = d) minimization of g(a) is equivalent to
the constrained minimization of f(a) [in the sense that g(a) and f(a) attain their
minimum values at the same points].
Solution. Clearly, e = Xr for some p x 1 vector r. Further, in light of Lemma
9.3.5, we have that W = XX⁻W and W = W(X')⁻X' and hence that

W = XX⁻W(X')⁻X' = XUX',

where U = X⁻W(X')⁻. Thus,

g(a) = f(a) + a'Wa − 2e'a
     = f(a) + (X'a)'UX'a − 2r'X'a,

so that, for a such that X'a = d,

g(a) = f(a) + d'Ud − 2r'd.

We conclude that, for a such that X'a = d, g(a) differs from f(a) only by an
additive constant and hence that g(a) and f(a) attain their minimum values (under
the constraint X'a = d) at the same points.
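The conclusion that g − f is constant on the constraint set can be illustrated numerically. In the following Python/NumPy sketch (an editorial addition; the random constructions of V, b, W, and e are just one convenient way to satisfy the hypotheses of the exercise), several points satisfying X'a = d are generated and g(a) − f(a) is evaluated at each:

    import numpy as np

    rng = np.random.default_rng(10)
    n, p = 5, 2
    X = rng.standard_normal((n, p))
    d = rng.standard_normal(p)
    F = rng.standard_normal((n, 3))
    V = F @ F.T                                          # symmetric nonnegative definite
    b = V @ rng.standard_normal(n) + X @ rng.standard_normal(p)   # b in C(V, X)
    W = X @ rng.standard_normal((p, p)) @ X.T            # C(W) in C(X), R(W) in R(X')
    e = X @ rng.standard_normal(p)                       # e in C(X)

    f = lambda a: a @ V @ a - 2 * b @ a
    g = lambda a: a @ (V + W) @ a - 2 * (b + e) @ a

    # points satisfying X'a = d: a particular solution plus a null-space term
    a0 = np.linalg.pinv(X.T) @ d
    N = np.linalg.svd(X.T)[2][p:].T                      # columns span the null space of X'
    pts = [a0 + N @ rng.standard_normal(n - p) for _ in range(4)]
    diffs = [g(a) - f(a) for a in pts]
    print(np.allclose(diffs, diffs[0]))                  # expected: True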

EXERCISE 8. Let V represent an n x n symmetric nonnegative definite matrix,


W an n x n matrix, X an n x p matrix, f an n x 1 vector, and d a p x I vector.
Further, let b represent an n x 1 vector such that b E C(V, X). Show that, for
the vector Wei - Px.w)f + WX(X'WX)-d to be a solution, for every dE C(X'),
to the problem of minimizing the second-degree polynomial a'Va - 2b'a (in a)
subject to X' a = d, it is necessary and sufficient that

VWf - b E C(X), (E.3)

C(VWX) c C(X), (E.4)

and
rank(X'WX) = rank(X). (E.5)

Solution. It follows from Theorem 19.2.1 that a'Va - 2b'a has a minimum at
W(I - P_X,W)f + WX(X'WX)-d under the constraint X'a = d [where d ∈ C(X')]
if and only if VW(I - P_X,W)f + VWX(X'WX)-d + Xr = b for some vector r and
X'W(I - P_X,W)f + X'WX(X'WX)-d = d, or equivalently if and only if
VW(I - P_X,W)f - b + VWX(X'WX)-d ∈ C(X) and X'W(I - P_X,W)f + X'WX(X'WX)-d =
d. Thus, for W(I - P_X,W)f + WX(X'WX)-d to be a solution, for every d ∈ C(X'),
to the problem of minimizing a'Va - 2b'a subject to X'a = d, it is necessary and
sufficient that, for every n x 1 vector u, VW(I - P_X,W)f - b + VWX(X'WX)-X'u ∈
C(X) and X'W(I - P_X,W)f + X'WX(X'WX)-X'u = X'u, a requirement equivalent
to a requirement that

    VW(I - P_X,W)f - b ∈ C(X),                             (S.7)

    C[VWX(X'WX)-X'] ⊂ C(X),                                (S.8)

and
    X'WX(X'WX)-X' = X',                                    (S.9)

as we now show.
Suppose that conditions (S.7)-(S.9) are satisfied. Then, observing that

    X'WP_X,W = X'WX(X'WX)-X'W,

we find that, for every u,

    VW(I - P_X,W)f - b + VWX(X'WX)-X'u ∈ C(X)

and

    X'W(I - P_X,W)f + X'WX(X'WX)-X'u = (X'W - X'W)f + X'u = X'u.

Conversely, suppose that, for every u,

    VW(I - P_X,W)f - b + VWX(X'WX)-X'u ∈ C(X)              (S.10)

and
    X'W(I - P_X,W)f + X'WX(X'WX)-X'u = X'u.                (S.11)

Then, since conditions (S.10) and (S.11) are satisfied in particular for u = 0, we
have that
    VW(I - P_X,W)f - b ∈ C(X)
and
    X'W(I - P_X,W)f = 0.
Further, for every u, VWX(X'WX)-X'u ∈ C(X) and X'WX(X'WX)-X'u =
X'u, implying that
    C[VWX(X'WX)-X'] ⊂ C(X)
and
    X'WX(X'WX)-X' = X'.

Now, when condition (S.8) is satisfied, condition (E.3) is equivalent to con-
dition (S.7), as is evident from Lemma 4.1.2 upon observing that VWP_X,W =
VWX(X'WX)-X'W and hence that VWP_X,Wf ∈ C[VWX(X'WX)-X']. More-
over, by employing the same line of reasoning as in the latter part of the proof of
Theorem 19.5.1, we find that conditions (E.4) and (E.5) are equivalent to condi-
tions (S.8) and (S.9). Thus, conditions (E.3)-(E.5) are equivalent to conditions
(S.7)-(S.9). And, we conclude that, for W(I - P_X,W)f + WX(X'WX)-d to be a
solution, for every d ∈ C(X'), to the problem of minimizing a'Va - 2b'a subject
to X'a = d, it is necessary and sufficient that conditions (E.3)-(E.5) be satisfied.

EXERCISE 9. Let V and W represent n x n matrices, and let X represent an n x p
matrix. Show that if V and W are nonsingular, then the condition C(VWX) ⊂ C(X)
is equivalent to the condition C(V⁻¹X) ⊂ C(WX) and is also equivalent to the
condition C(W⁻¹V⁻¹X) ⊂ C(X).

Solution. Assume that V and W are nonsingular. Then, in light of Corollary
8.3.3, rank(VWX) = rank(X), rank(V⁻¹X) = rank(X) = rank(WX), and
rank(W⁻¹V⁻¹X) = rank(X). Thus, as a consequence of Theorem 4.4.6,

    C(VWX) ⊂ C(X) ⇔ C(VWX) = C(X),
    C(V⁻¹X) ⊂ C(WX) ⇔ C(V⁻¹X) = C(WX), and
    C(W⁻¹V⁻¹X) ⊂ C(X) ⇔ C(W⁻¹V⁻¹X) = C(X).

Now, if C(VWX) ⊂ C(X), then C(X) = C(VWX), so that X = VWXQ for
some matrix Q, in which case V⁻¹X = WXQ and W⁻¹V⁻¹X = XQ, implying
that C(V⁻¹X) ⊂ C(WX) and C(W⁻¹V⁻¹X) ⊂ C(X). Conversely, if C(V⁻¹X) ⊂
C(WX), then C(WX) = C(V⁻¹X), so that WX = V⁻¹XQ for some matrix Q,
in which case VWX = XQ, implying that C(VWX) ⊂ C(X). And, similarly, if
C(W⁻¹V⁻¹X) ⊂ C(X), then C(X) = C(W⁻¹V⁻¹X), so that X = W⁻¹V⁻¹XQ
for some matrix Q, in which case VWX = XQ, implying that C(VWX) ⊂ C(X).
We conclude that C(VWX) ⊂ C(X) ⇔ C(V⁻¹X) ⊂ C(WX) and that C(VWX) ⊂
C(X) ⇔ C(W⁻¹V⁻¹X) ⊂ C(X).
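
An informal numerical illustration of this equivalence (not part of the original
exercise) can be run in NumPy. The block-diagonal construction below, the helper
col_contained, and the random seed are illustrative choices: V and W are (almost
surely) nonsingular, and C(X) is invariant under them by construction.

    import numpy as np

    rng = np.random.default_rng(0)

    def col_contained(A, B):
        # C(A) is contained in C(B) iff appending A to B leaves the rank unchanged
        return np.linalg.matrix_rank(np.hstack([B, A])) == np.linalg.matrix_rank(B)

    n, p = 6, 2
    X = np.vstack([np.eye(p), np.zeros((n - p, p))])   # C(X) = span of first p unit vectors
    V = np.zeros((n, n)); W = np.zeros((n, n))
    V[:p, :p] = rng.standard_normal((p, p)) + 3*np.eye(p)
    V[p:, p:] = rng.standard_normal((n - p, n - p)) + 3*np.eye(n - p)
    W[:p, :p] = rng.standard_normal((p, p)) + 3*np.eye(p)
    W[p:, p:] = rng.standard_normal((n - p, n - p)) + 3*np.eye(n - p)

    Vi, Wi = np.linalg.inv(V), np.linalg.inv(W)
    print(col_contained(V @ W @ X, X))       # C(VWX) in C(X): True by construction
    print(col_contained(Vi @ X, W @ X))      # C(V^(-1)X) in C(WX): True, as asserted
    print(col_contained(Wi @ Vi @ X, X))     # C(W^(-1)V^(-1)X) in C(X): True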

EXERCISE 10. Let V represent an n x n symmetric positive definite matrix,
W an n x n matrix, X an n x p matrix, and d a p x 1 vector. Show that, for the
vector WX(X'WX)-d to be a solution, for every d ∈ C(X'), to the problem of
minimizing the quadratic form a'Va (in a) subject to X'a = d, it is necessary and
sufficient that V⁻¹P_X,W' be symmetric and rank(X'WX) = rank(X). Show that it
is also necessary and sufficient that [I - (P_X,W')']V⁻¹P_X,W' = 0 and rank(X'WX) =
rank(X).

Solution. Since (V⁻¹P_X,W')' = (P_X,W')'V⁻¹, V⁻¹P_X,W' is symmetric if and only if
V⁻¹P_X,W' = (P_X,W')'V⁻¹. Moreover, if V⁻¹P_X,W' = (P_X,W')'V⁻¹, then

    P_X,W'V = V(V⁻¹P_X,W')V = V[(P_X,W')'V⁻¹]V = V(P_X,W')'.

And, conversely, if P_X,W'V = V(P_X,W')', then

    V⁻¹P_X,W' = V⁻¹(P_X,W'V)V⁻¹ = V⁻¹[V(P_X,W')']V⁻¹ = (P_X,W')'V⁻¹.

Thus, V⁻¹P_X,W' is symmetric if and only if P_X,W'V = V(P_X,W')', and the neces-
sity and sufficiency of V⁻¹P_X,W' being symmetric and rank(X'WX) = rank(X)
follows from Theorem 19.5.4.
To complete the proof, it suffices to show that [I - (P_X,W')']V⁻¹P_X,W' = 0 and
rank(X'WX) = rank(X), or equivalently that V⁻¹P_X,W' = (P_X,W')'V⁻¹P_X,W' and
rank(X'WX) = rank(X), if and only if V⁻¹P_X,W' = (P_X,W')'V⁻¹ and rank(X'WX)
= rank(X).
Suppose that V⁻¹P_X,W' = (P_X,W')'V⁻¹ and rank(X'WX) = rank(X). Then, since
X'W'X = (X'WX)', rank(X'W'X) = rank(X), and it follows from Part (3) of
Lemma 19.5.5 that (P_X,W')² = P_X,W'. Thus,

    V⁻¹P_X,W' = (V⁻¹P_X,W')P_X,W' = (P_X,W')'V⁻¹P_X,W'.

Conversely, if V⁻¹P_X,W' = (P_X,W')'V⁻¹P_X,W', then (since clearly the matrix
(P_X,W')'V⁻¹P_X,W' is symmetric)

    V⁻¹P_X,W' = [(P_X,W')'V⁻¹P_X,W']' = (V⁻¹P_X,W')' = (P_X,W')'V⁻¹.

We conclude that V⁻¹P_X,W' = (P_X,W')'V⁻¹P_X,W' and rank(X'WX) = rank(X) if
and only if V⁻¹P_X,W' = (P_X,W')'V⁻¹ and rank(X'WX) = rank(X).

EXERCISE 11. Let V represent an n x n symmetric nonnegative definite matrix,


X an n x p matrix, and d a p x 1 vector. Show that each of the following six
conditions is necessary and sufficient for the vector X(X'X)-d to be a solution,
for every d E C(X'), to the problem of minimizing the quadratic form a'Va (in a)
subject to X' a = d:
(a) C(VX) c C(X) (or, equivalently, VX = XQ for some matrix Q);
(b) PxV(I - Px) = 0 (or, equivalently, PxV = PxVPx);
(c) PxV = VPx (or, equivalently, PxV is symmetric);
(d) C(VPx) c C(Px );
(e) C(VPx) = C(V) n C(Px );
(f) C(VX) = C(V) n C(X).
Solution. (a), (b), and (c) Upon applying Theorems 19.5.1 and 19.5.4 (with W = I)
and recalling (from Corollary 7.4.5) that rank(X'X) = rank(X) and (from Theorem
12.3.4) that P x is symmetric, we find that each of Conditions (a) - (c) is necessary
and sufficient for X(X'X)-d to be a solution to the problem of minimizing a'Va
subject to X' a = d.

(d) According to Theorem 12.3.4, C(P x ) = C(X). And, in light of Corollary


4.2.4, C(VPx ) = C(VX). Thus, Condition (d) is equivalent to Condition (a).
(e) Let y represent an arbitrary vector in C(V) n C(Px ). Then, y = Va for some
vector a and y = Pxb for some vector b, implying (since, according to Theorem
12.3.4, P x is idempotent) that

y = PxPxb = Pxy = PxVa E C(PxV).

Thus,
    C(V) ∩ C(Px) ⊂ C(PxV).                                 (S.12)

Now, suppose that X(X'X)-d is a solution, for every d ∈ C(X'), to the problem
of minimizing a'Va subject to X'a = d. Then, Condition (c) is satisfied (i.e.,
PxV = VPx), implying [since, clearly, C(PxV) ⊂ C(Px) and C(VPx) ⊂ C(V)]
that
    C(VPx) ⊂ C(V) ∩ C(Px)
and also [in light of result (S.12)] that

C(V) n C(Px ) c C(VPx ).

Thus, C(VPx ) = C(V) n C(Px ) [i.e., Condition (e) is satisfied].


Conversely, suppose that C(VPx) = C(V) ∩ C(Px). Then, obviously, C(VPx) ⊂
C(Px) [i.e., Condition (d) is satisfied], implying that X(X'X)-d is a solution, for
every d ∈ C(X'), to the problem of minimizing a'Va subject to X'a = d.
(f) Since [as noted in the proof of the necessity and sufficiency of Condition (d)]
C(Px ) = C(X) and C(VPx) = C(VX), Condition (f) is equivalent to Condition
(e).
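
The following NumPy sketch (an illustrative check, not part of the original solution)
constructs a symmetric nonnegative definite V that commutes with Px and verifies
Conditions (a), (c), and (d) numerically; the helper col_contained and the particular
construction of V are assumptions made for the example.

    import numpy as np

    rng = np.random.default_rng(1)
    n, p = 7, 3
    X = rng.standard_normal((n, p))
    Px = X @ np.linalg.pinv(X)                       # orthogonal projector onto C(X)

    # Combine a piece living in C(X) with a piece living in its orthogonal
    # complement, so that Px V = V Px by construction.
    A = rng.standard_normal((n, n)); B = rng.standard_normal((n, n))
    V = Px @ A @ A.T @ Px + (np.eye(n) - Px) @ B @ B.T @ (np.eye(n) - Px)

    def col_contained(M, N):
        return np.linalg.matrix_rank(np.hstack([N, M])) == np.linalg.matrix_rank(N)

    print(np.allclose(Px @ V, V @ Px))               # Condition (c)
    print(col_contained(V @ X, X))                   # Condition (a)
    print(col_contained(V @ Px, Px))                 # Condition (d)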

EXERCISE 12. Let V represent an n x n symmetric nonnegative definite matrix,
W an n x n matrix, X an n x p matrix, and d a p x 1 vector. Further, let K represent
any n x q matrix such that C(K) = C(I - P_X,W'). Show that if rank(X'WX) =
rank(X), then each of the following two conditions is necessary and sufficient for
the vector WX(X'WX)-d to be a solution, for every d ∈ C(X'), to the problem of
minimizing the quadratic form a'Va (in a) subject to X'a = d:
(a) V = XR1X' + (I - P_X,W')R2(I - P_X,W')'
for some p x p matrix R1 and some n x n matrix R2;
(b) V = XS1X' + KS2K'
for some p x p matrix S1 and some q x q matrix S2.
And, show that if rank(X'WX) = rank(X) and W is nonsingular, then another
necessary and sufficient condition is:
(c) V = tW⁻¹ + XT1X' + KT2K'
for some scalar t, some p x p matrix T1, and some q x q matrix T2.
[Hint. To establish the necessity of Condition (a), begin by observing that V = CC'
for some matrix C and by expressing C as C = P_X,W'C + (I - P_X,W')C.]

Solution. Assume that rank(X'WX) = rank(X). Then, since X'W'X = (X'WX)',
rank(X'W'X) = rank(X). Thus, applying Parts (1) and (3) of Lemma 19.5.5 (with
W' in place of W), we find that P_X,W'X = X and (P_X,W')² = P_X,W'. And, applying
Part (2) of Lemma 19.5.5, we find that X'W'P_X,W' = X'W' and hence that

    (P_X,W')'WX = [X'W'P_X,W']' = (X'W')' = WX.

To establish the necessity and sufficiency of Condition (a), it suffices (in light
of Theorem 19.5.1) to show that Condition (a) is equivalent to the condition that
C(VWX) ⊂ C(X). If Condition (a) is satisfied, then

    VWX = XR1X'WX + (I - P_X,W')R2[WX - (P_X,W')'WX]
        = XR1X'WX + (I - P_X,W')R2(WX - WX) = XR1X'WX,

and consequently C(VWX) ⊂ C(X).
Conversely, suppose that C(VWX) ⊂ C(X). Then, VWX = XQ for some
matrix Q, so that

    (I - P_X,W')VWX = (I - P_X,W')XQ = 0.                  (S.13)

Now, observe (in light of Corollary 14.3.8) that there exists a matrix C such that
V = CC'. Thus,

    V = [P_X,W'C + (I - P_X,W')C][P_X,W'C + (I - P_X,W')C]'
      = P_X,W'CC'(P_X,W')' + P_X,W'CC'(I - P_X,W')'
        + (I - P_X,W')CC'(P_X,W')' + (I - P_X,W')CC'(I - P_X,W')'.

Moreover,

    (I - P_X,W')VWX = 0·CC'WX + 0·CC'·0
                      + (I - P_X,W')CC'WX + (I - P_X,W')CC'·0
                    = (I - P_X,W')CC'WX.                   (S.14)

Together, results (S.13) and (S.14) imply that

    (I - P_X,W')CC'WX = 0,

so that

    (I - P_X,W')CC'(P_X,W')' = (I - P_X,W')CC'WX[(X'W'X)-]'X' = 0

and

    P_X,W'CC'(I - P_X,W')' = [(I - P_X,W')CC'(P_X,W')']' = 0.

We conclude that

    V = P_X,W'CC'(P_X,W')' + (I - P_X,W')CC'(I - P_X,W')'
      = XR1X' + (I - P_X,W')R2(I - P_X,W')',

where R1 = (X'W'X)-X'W'CC'WX[(X'W'X)-]' and R2 = CC'.
Thus, Condition (a) is equivalent to the condition that C(VWX) ⊂ C(X).
To establish the necessity and sufficiency of Condition (b), it suffices to show
that Conditions (a) and (b) are equivalent. According to Lemma 4.2.2, there exist
matrices A and B such that I - P_X,W' = KA and K = (I - P_X,W')B. Thus, if
Condition (a) is satisfied, then

    V = XR1X' + KAR2(KA)' = XS1X' + KS2K',

where S1 = R1 and S2 = AR2A'. Conversely, if Condition (b) is satisfied, then

    V = XS1X' + (I - P_X,W')BS2B'(I - P_X,W')' = XR1X' + (I - P_X,W')R2(I - P_X,W')',

where R1 = S1 and R2 = BS2B'. Thus, Conditions (a) and (b) are equivalent.
Assume now that W is nonsingular [and continue to assume that rank(X'WX)
= rank(X)]. And [for purposes of establishing the necessity of Condition (c)]
suppose that WX(X'WX)-d is a solution, for every d ∈ C(X'), to the problem of
minimizing a'Va subject to X'a = d. Then, Condition (b) is satisfied, in which
case
    V = tW⁻¹ + XT1X' + KT2K',
where t = 0, T1 = S1, and T2 = S2.
Conversely, suppose that Condition (c) is satisfied. Then, recalling that K =
(I - P_X,W')B for some matrix B, we find that

    VWX = tW⁻¹WX + XT1X'WX + KT2K'WX
        = tX + XT1X'WX + KT2B'(I - P_X,W')'WX
        = tX + XT1X'WX + KT2B'·0
        = X(tI + T1X'WX).

Thus, C(VWX) ⊂ C(X), and it follows from Theorem 19.5.1 that Condition (c)
is sufficient (as well as necessary) for WX(X'WX)-d to be a solution, for every
d ∈ C(X'), to the problem of minimizing a'Va subject to X'a = d.
20
The Moore-Penrose Inverse

EXERCISE 1. Show that, for any m x n matrix B of full column rank and for
any n x p matrix C of full row rank,

(BC)+ = C+B+.

Solution. As a consequence of result (1.2), we have that

    (BC)+ = C'(CC')⁻¹(B'B)⁻¹B'.

And, in light of results (2.1) and (2.2), it follows that

    (BC)+ = C+B+.
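
A quick numerical check of this identity (illustrative only, not part of the original
solution): with probability one a random B has full column rank and a random C
full row rank, and NumPy's pinv then reproduces the factorization.

    import numpy as np

    rng = np.random.default_rng(2)
    m, n, p = 6, 4, 5
    B = rng.standard_normal((m, n))                  # full column rank (almost surely)
    C = rng.standard_normal((n, p))                  # full row rank (almost surely)

    lhs = np.linalg.pinv(B @ C)
    rhs = np.linalg.pinv(C) @ np.linalg.pinv(B)
    print(np.allclose(lhs, rhs))                     # True under the rank assumptions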

EXERCISE 2. Show that, for any m x n matrix A, A + = A' if and only if A' A
is idempotent.
Solution. Suppose that A'A is idempotent or equivalently that

    A'A = A'AA'A.                                          (S.1)

Then, premultiplying both sides of equality (S.1) by A+(A+)' and postmultiplying


both sides by A +, we find that

A+(A+)'A'AA+ = A+(A+)'A'AA'AA+.

Moreover,
A+(A+)'A'AA+ = A+(AA+)'AA+ = A+AA+AA+ = A+AA+ = A+,

and
    A+(A+)'A'AA'AA+ = A+(AA+)'AA'(AA+)'
                    = A+AA+A(AA+A)'
                    = A+AA' = (A+A)'A' = (AA+A)' = A'.
Thus, A+ = A'.
Conversely, suppose that A+ = A'. Then, clearly, A' A = A+ A, implying (in
light of Lemma 10.2.5) that A' A is idempotent.
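
The following NumPy sketch illustrates both directions informally; the specific
examples (a matrix with orthonormal columns, for which A'A = I is idempotent, and
a generic matrix, for which it is not) are assumptions made for the illustration.

    import numpy as np

    rng = np.random.default_rng(3)

    # Orthonormal columns: A'A = I is idempotent, and indeed A+ = A'.
    Q, _ = np.linalg.qr(rng.standard_normal((5, 3)))
    print(np.allclose(np.linalg.pinv(Q), Q.T))                   # True

    # Generic matrix: A'A is not idempotent, and A+ differs from A'.
    A = rng.standard_normal((5, 3))
    print(np.allclose(A.T @ A @ A.T @ A, A.T @ A))               # False
    print(np.allclose(np.linalg.pinv(A), A.T))                   # False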

EXERCISE 3. Let T represent an m x p matrix, U an m x q matrix, V an n x p


matrix, and W an n x q matrix, and define Q = W - VT-U. If C(U) c C(T) and
R(V) C R(T), then

is a generalized inverse of the partitioned matrix (~ ~). and

a generalized inverse of (~ ~). Show that if the generalized inverses T- and


Q- (ofT and Q, respectively) are both reflexive [and ifC(U) C C(T) and R(V) C
R(T)], then generalized inverses (*) and (**) are also reflexive.
Solution. Suppose that C(U) C C(T) and R(V) C R(T). [That these conditions
are sufficient to insure that partitioned matrices (*) and (**) are generalized in-
verses of (~ ~) and (~ ~), respectively, is the content of Theorem 9.6.1.]
Then, TT-U = U and VT-T = V (as is evident from Lemma 9.3.5). Further,

    (T- + T-UQ-VT-   -T-UQ-)(T  U)(T- + T-UQ-VT-   -T-UQ-)
    (     -Q-VT-        Q-  )(V  W)(     -Q-VT-        Q-  )

        = (T- + T-UQ-VT-   -T-UQ-)(     TT-        0  )
          (     -Q-VT-        Q-  )((I - QQ-)VT-   QQ- )

        = (T-TT- + T-UQ-VT-TT- - T-UQ-(I - QQ-)VT-    -T-UQ-QQ-)
          (       -Q-VT-TT- + Q-(I - QQ-)VT-            Q-QQ-  ).     (S.2)

Now, if the generalized inverses T- and Q- are both reflexive (i.e., if T-TT-
= T- and Q-QQ- = Q-), then partitioned matrix (S.2) simplifies to partitioned

matrix (*). We conclude that if the generalized inverses T- and Q- are both reflex-
ive [and if C(U) ⊂ C(T) and R(V) ⊂ R(T)], then the generalized inverse (*)
of (~ ~) [or equivalently the generalized inverse given by expression (9.6.2)]
is reflexive. And, it can be shown in similar fashion that if the generalized inverses
T- and Q- are both reflexive [and if C(U) ⊂ C(T) and R(V) ⊂ R(T)], then
the generalized inverse (**) of (~ ~) [or equivalently the generalized inverse
given by expression (9.6.3)] is reflexive.

EXERCISE 4. Determine which of Penrose Conditions (1) - (4) [also known as


Moore-Penrose Conditions (1) - (4)] are necessarily satisfied by a left inverse of
an m x n matrix A (when a left inverse exists). Which of the Penrose conditions
are necessarily satisfied by a right inverse of an m x n matrix A (when a right
inverse exists)?
Solution. Suppose that A has a left inverse L. Then, by definition, LA = In. And,
as previously indicated (in Section 9.2d), ALA = AI = A. Thus, L necessarily
satisfies Penrose Condition (1). Further, LAL = IL = L and (LA)' = I' = I =
LA, so that L also necessarily satisfies Penrose Conditions (2) and (4).
However, there exist matrices that have left inverses that do not satisfy Penrose
Condition (3). Suppose, for example, that A = (In; 0), that is, In stacked above an
(m - n) x n null matrix (where m > n). And, take L = (In, K), where K is an
arbitrary n x (m - n) matrix. Then, LA = In, and AL = (In K; 0 0), a matrix that
is not symmetric unless K = 0, so that L is a left inverse of A that (unless K = 0)
does not satisfy Penrose Condition (3).
Similarly, if A has a right inverse R, then R necessarily satisfies Penrose Con-
ditions (1), (2), and (3). However, there exist matrices that have right inverses that
do not satisfy Penrose Condition (4).
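
The example in the solution can be checked numerically. The sketch below (an
illustration only) uses the same A and L, with m, n, and K chosen arbitrarily.

    import numpy as np

    rng = np.random.default_rng(4)
    m, n = 5, 3
    A = np.vstack([np.eye(n), np.zeros((m - n, n))])   # A = (In; 0)
    K = rng.standard_normal((n, m - n))
    L = np.hstack([np.eye(n), K])                      # a left inverse: LA = In

    print(np.allclose(A @ L @ A, A))                   # Penrose condition (1)
    print(np.allclose(L @ A @ L, L))                   # Penrose condition (2)
    print(np.allclose((A @ L).T, A @ L))               # condition (3): False unless K = 0
    print(np.allclose((L @ A).T, L @ A))               # Penrose condition (4)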

EXERCISE 5. Let A represent an m x n matrix and G an n x m matrix.


(a) Show that G is the Moore-Penrose inverse of A if and only if G is a minimum
norm generalized inverse of A and A is a minimum norm generalized inverse of
G.
(b) Show that G is the Moore-Penrose inverse of A if and only if GAA' = A'
and AGG' = G'.
(c) Show that G is the Moore-Penrose inverse of A if and only if GA = PA' and
AG = PG'.
Solution. (a) By definition, G is a minimum norm generalized inverse of A if and
only if AGA = A and (GA)' = GA [which are Penrose Conditions (1) and (4)],
and A is a minimum norm generalized inverse of G if and only if GAG = G and
(AG)' = AG [which, in the relevant context, are Penrose Conditions (2) and (3)].
Thus, G is the Moore-Penrose inverse of A if and only if G is a minimum norm

generalized inverse of A and A is a minimum norm generalized inverse of G.


(b) Part (b) follows from Part (a) upon observing (in light of Theorem 20.3.7)
that G is a minimum norm generalized inverse of A if and only if GAA' = A' and
that A is a minimum norm generalized inverse of G if and only if AGG' = G'.
(c) Part (c) follows from Part (a) upon observing (in light of Corollary 20.3.8)
that G is a minimum norm generalized inverse of A if and only if GA = PA' and
that A is a minimum norm generalized inverse of G if and only if AG = PG'.

EXERCISE 6. (a) Show that, for any m x n matrices A and B such that A'B = 0
and BA' = 0, (A + B)+ = A+ + B+.
(b) Let A1, A2, ... , Ak represent m x n matrices such that, for j > i =
1, ... , k - 1, Ai'Aj = 0 and AjAi' = 0. Generalize the result of Part (a) by
showing that (A1 + A2 + ... + Ak)+ = A1+ + A2+ + ... + Ak+.
Solution. (a) Let X represent any n x m matrix such that (A + B)'(A + B)X =
(A + B)' and Y any m x n matrix such that (A + B)(A + B)'Y = A + B. Then,
since B'A = (A'B)' = 0 and AB' = (BA')' = 0, we have that

    (A'A + B'B)X = A' + B' and (AA' + BB')Y = A + B.

Moreover, as a consequence of Corollary 12.1.2, we have that C(A) ⊥ C(B) and
[since (A')'B' = (BA')' = 0] that C(A') ⊥ C(B'), implying (in light of Lemma
17.1.9) that

    C(A) ∩ C(B) = {0} and C(A') ∩ C(B') = {0}.

Thus, upon observing that C(A') = C(A'A), C(B') = C(B'B), C(A) = C(AA'),
and C(B) = C(BB'), it follows from Theorem 18.2.7 that

    A'AX = A', B'BX = B', AA'Y = A, and BB'Y = B.

Now, making use of Theorem 20.4.4, we find that

    (A + B)+ = Y'(A + B)X = Y'AX + Y'BX = A+ + B+.


(b) The proof is by mathematical induction. The result of Part (b) is valid for
k = 2, as is evident from Part (a).
Suppose now that the result of Part (b) is valid for k = k* - 1. And, let
A1, ... , A_{k*-1}, A_{k*} represent m x n matrices such that, for j > i = 1, ... , k* - 1,
Ai'Aj = 0 and AjAi' = 0. Then, observing that

    (A1 + ... + A_{k*-1})'A_{k*} = 0

and that

    A_{k*}(A1 + ... + A_{k*-1})' = 0,

and using the result of Part (a), we find that

    (A1 + ... + A_{k*-1} + A_{k*})+ = [(A1 + ... + A_{k*-1}) + A_{k*}]+
                                    = (A1 + ... + A_{k*-1})+ + A_{k*}+
                                    = A1+ + ... + A_{k*-1}+ + A_{k*}+,
which establishes the validity of the result of Part (b) for k = k* and completes
the induction argument.

EXERCISE 7. Show that, for any m x n matrix A, (A+A)+ = A+A and
(AA+)+ = AA+.

Solution. According to Corollary 20.5.2, A+A and AA+ are symmetric and
idempotent. Thus, it follows from Lemma 20.2.1 that (A+A)+ = A+A and
(AA+)+ = AA+.
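
A short numerical check with NumPy (illustrative only; the random test matrix is
an arbitrary choice):

    import numpy as np

    rng = np.random.default_rng(5)
    A = rng.standard_normal((4, 6))
    Ap = np.linalg.pinv(A)

    print(np.allclose(np.linalg.pinv(Ap @ A), Ap @ A))   # (A+A)+ = A+A
    print(np.allclose(np.linalg.pinv(A @ Ap), A @ Ap))   # (AA+)+ = AA+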

EXERCISE 8. Show that, for any n x n symmetric matrix A, AA+ = A+A.

Solution. That AA+ = A+A is an immediate consequence of Part (2) of Theorem
20.5.1.
Or, alternatively, this equality can be verified by making use of Part (2) of
Theorem 20.5.3 (and of the very definition of the Moore-Penrose inverse). We find
that

    AA+ = (AA+)' = (A+)'A' = (A+)'A = A+A.
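
The following NumPy check (not part of the original solution) illustrates the equality
for a singular symmetric matrix and shows that it generally fails without symmetry;
the rank-3 constructions are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(6)
    B = rng.standard_normal((5, 3))
    A = B @ B.T                                        # symmetric, rank at most 3
    Ap = np.linalg.pinv(A)
    print(np.allclose(A @ Ap, Ap @ A))                 # True for symmetric A

    C = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 5))   # nonsymmetric, singular
    Cp = np.linalg.pinv(C)
    print(np.allclose(C @ Cp, Cp @ C))                 # generally False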

EXERCISE 9. Let V represent an n x n symmetric nonnegative definite matrix,


X an n x p matrix, and d a p x 1 vector. Using the results of Exercises 8 and
19.11 (or otherwise), show that, for the vector X(X'X)-d to be a solution, for every
d E C(X'), to the problem of minimizing the quadratic form a'Va (in a) subject to
X'a = d, it is necessary and sufficient that C(V+X) c C(X).
Solution. In light of the results of Exercise 19.11, it suffices to show that C(VX) ⊂
C(X) ⇔ C(V+X) ⊂ C(X).
Suppose that C(VX) c C(X). Then, VX = XQ for some matrix Q. And, using
the result of Exercise 8, we find that

VX = VV+VX = VV+XQ = V+VXQ = V+XQ2

and hence that


C(VX) c C(V+X). (S.3)
Moreover, since (according to Theorem 20.5.3) V+ is symmetric and nonneg-
ative definite, we have (in light of Lemma 14.11.2 and the result of Exercise 8)
that

    rank(VX) ≥ rank(V+VX) = rank(VV+X) ≥ rank(X'V+VV+X)
             = rank(X'V+X) = rank(V+X),

implying [since, in light of result (S.3), rank(VX) ≤ rank(V+X)] that rank(VX)
= rank(V+X). Thus, it follows from Theorem 4.4.6 that C(VX) = C(V+X). We
conclude that C(V+X) ⊂ C(X).
Conversely, suppose that C(V+X) c C(X). Then, V+X = XR for some matrix
R. And, using the result of Exercise 8, we find that
V+X = V+VV+X = V+VXR = VV+XR = VXR2

and hence that


C(V+X) c C(VX). (S.4)

Moreover, in light of Lemma 14.11.2 and the result of Exercise 8, we have that
    rank(V+X) ≥ rank(VV+X) = rank(V+VX) ≥ rank(X'VV+VX)
              = rank(X'VX) = rank(VX),

implying [since, in light of result (S.4), rank(V+X) ≤ rank(VX)] that rank(V+X)


= rank(VX). Thus, it follows from Theorem 4.4.6 that C(V+X) = C(VX). We
conclude that C(VX) c C(X).

EXERCISE 10. Let A represent an n x n matrix. Show that if A is symmetric and


positive semidefinite, then A+ is symmetric and positive semidefinite and that if
A is symmetric and positive definite, then A + is symmetric and positive definite.
Do so by taking advantage of the result that if A is symmetric and nonnegative
definite (and nonnull), then A+ = T+(T+)' for any matrix T of full row rank (and
with n columns) such that A = T'T.
Solution. Suppose that A is symmetric and nonnegative definite. Further, assume
that A is nonnull- if A = 0, then A is positive semidefinite, and A + = 0, so that
A + is also positive semidefinite (and symmetric). Then, it follows from the result
cited in the exercise [which is taken from Theorem 20.4.5] that
A+ = T+(T+)'

for any matrix T of full row rank (and with n columns) such that A = T'T.
Thus, A + is symmetric and (in light of Corollary 14.2.14) nonnegative definite.
And, since (T+), has n columns and since [in light of Part (1) of Theorem 20.5.1]
rank (T+)' = rank T+ = rank T, it follows from Corollary 14.2.14 that A + is
positive semidefinite if rank (T) < n or equivalently if A is positive semidefinite
and that A+ is positive definite if rank(T) = n or equivalently if A is positive
definite.

EXERCISE 11. Let C represent an m x n matrix. Show that, for any m x m


idempotent matrix A, (AC)+ A' = (AC)+ and that, for any n x n idempotent
matrix B, B'(CB)+ = (CB)+.

Solution. According to Corollary 20.5.5,

    (AC)+ = [(AC)'AC]+(AC)' = [(AC)'AC]+C'A',

and
    (CB)+ = (CB)'[CB(CB)']+ = B'C'[CB(CB)']+.
Thus,

    (AC)+A' = [(AC)'AC]+C'A'A' = [(AC)'AC]+C'(AA)'
            = [(AC)'AC]+C'A' = (AC)+,

and

    B'(CB)+ = B'B'C'[CB(CB)']+ = (BB)'C'[CB(CB)']+
            = B'C'[CB(CB)']+ = (CB)+.

EXERCISE 12. Let a represent an n x 1 vector of variables, and impose on a


the constraint X'a = d, where X is an n x p matrix and d a p x 1 vector such
that d ∈ C(X'). And, define f(a) = a'Va - 2b'a, where V is an n x n symmetric
nonnegative definite matrix and b is an n x 1 vector such that b ∈ C(V, X). Further,
let R represent any matrix such that V = R'R, let a0 represent any n x 1 vector
such that X'a0 = d, and take s to be any n x 1 vector such that b = Vs + Xt
for some p x 1 vector t. Show that f(a) attains its minimum value (under the
constraint X'a = d) at a point a* if and only if

for some n x 1 vector w. Do so by, for instance, using the results of Exercise 11
in combination with the result that, for any n x k matrix Z whose columns span
N(X'), f(a) attains its minimum value (subject to the constraint X'a = d) at a
point a* if and only if

    a* = a0 + Z(Z'VZ)-Z'(b - Va0) + Z[I - (Z'VZ)-Z'VZ]w

for some k x 1 vector w.


Solution. Take Z = I - Px. Then, according to Lemma 12.5.2, C(Z) = N(X').
And, it follows from the cited result on constrained minimization (which is taken
from Section 19.6) that f(a) attains its minimum value (under the constraint X'a =
d) at a point a* if and only if

    a* = a0 + Z(Z'VZ)+Z'(b - Va0) + [I - Z(Z'VZ)+Z'V]Zw

for some n x 1 vector w.


Moreover, according to Part (9) of Theorem 12.3.4, Z is symmetric and idem-
potent. Thus, making use of Corollary 20.5.5 and of the results of Exercise 11, we
find that

And, since [in light of Part (1) of Theorem 12.3.4] Z'X = ZX = 0,

    Z(Z'VZ)+Z'(b - Va0) = Z(Z'VZ)+Z'(Vs + Xt - Va0)
                        = Z(Z'VZ)+Z'V(s - a0) = (RZ)+R(s - a0).

We conclude that f(a) attains its minimum value (under the constraint X'a = d)
at a point a* if and only if

for some n x 1 vector w.

EXERCISE 13. Let A represent an n x n symmetric nonnegative definite matrix,


and let B represent an n x n matrix. Suppose that B - A is symmetric and non-
negative definite (in which case B is symmetric and nonnegative definite). Show
that A + - B+ is nonnegative definite if and only if rank(A) = rank (B) . Do so by,
for instance, using the results of Exercises 1 and 18.15, the result that W- I - V-I
is nonnegative definite for any m x m symmetric positive definite matrices W and
V such that V - W is nonnegative definite, and the result that the Moore-Penrose
inverse H+ of a k x k symmetric nonnegative definite matrix H equals T+(T+)',
where T is any matrix of full row rank (and with k columns) such that H = T'T.
Solution. Let r = rank(B). And, assume that r > 0 - if r = 0, then B = 0
and (in light of Lemma 4.2.2) A = 0, in which case A + - B+ = 0 - 0 = 0 and
rank(A) = 0 = rank(B). Then, according to Theorem 14.3.7, there exists an r x n
matrix P such that B = P'P. Similarly, according to Corollary 14.3.8, there exists
a matrix Q such that A = Q'Q. And, according to the result of Exercise 18.15,

R(A) c R(B), (S.5)

implying [since R(Q) = R(A) and R(P) = R(B)] that

R(Q) c R(P)

and hence that there exists a matrix K (having r columns) such that

Q=KP.

Thus,
B - A = P'P - Q'Q = P'(I - K'K)P.
Moreover, according to Lemma 8.1.1, P has a right inverse R, so that

    I - K'K = (PR)'(I - K'K)PR = R'(B - A)R.

And, as a consequence, I - K'K is nonnegative definite.


Now, suppose that rank(A) = rank(B) (= r). Then,

    r = rank(Q) = rank(KP) ≤ rank(K),



implying (since clearly rank(K) ≤ r) that rank(K) = r and hence (in light of
Corollary 14.2.14) that K'K is positive definite. Thus, it follows from one of
the cited results (a result encompassed in Theorem 18.3.4) that (K'K)⁻¹ - I is
nonnegative definite. Moreover, upon observing that A = P'(K'K)P, it follows
from the result of Exercise 1 that

and from another of the cited results (a result covered by Theorem 20.4.5) that

so that
    A+ - B+ = P+[(K'K)⁻¹ - I](P+)'.
And, in light of Theorem 14.2.9, we conclude that A + - B + is nonnegative definite.
Conversely, suppose that A+ - B+ is nonnegative definite. Then, it follows from
the result of Exercise 18.15 that R(B+) ⊂ R(A+), implying that rank(B+) ≤
rank(A+) and hence [in light of Part (1) of Theorem 20.5.1] that rank(B) ≤
rank(A). Since [in light of result (S.5)] rank(A) ≤ rank(B), we conclude that
rank(A) = rank(B).
21
Eigenvalues and Eigenvectors

EXERCISE 1. Show that an n x n skew-symmetric matrix A has no nonzero
eigenvalues.

Solution. Let λ represent any eigenvalue of A and let x represent an eigenvector
that corresponds to λ. Then, -A'x = Ax = λx, implying that

    -A'Ax = -A'(λx) = λ(-A'x) = λ(λx) = λ²x

and hence that -x'A'Ax = λ²x'x. Thus, observing that x ≠ 0 and that A'A is
nonnegative definite, we find that

    0 ≤ λ² = -x'A'Ax/x'x ≤ 0,

leading to the conclusion that λ² = 0 or equivalently that λ = 0.
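
A small NumPy illustration (not part of the original solution): for a randomly
generated skew-symmetric matrix, the computed eigenvalues are purely imaginary,
so the only real number that could be an eigenvalue is 0.

    import numpy as np

    rng = np.random.default_rng(7)
    M = rng.standard_normal((4, 4))
    A = M - M.T                              # skew-symmetric: A' = -A

    eig = np.linalg.eigvals(A)
    print(np.allclose(eig.real, 0))          # real parts vanish (up to rounding)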


EXERCISE 2. Let A represent an n x n matrix, B a k x k matrix, and X an n x k
matrix such that AX = XB.
(a) Show that C(X) is an invariant subspace (of R^(n x 1)) relative to A.
(b) Show that if X is of full column rank, then every eigenvalue of B is an
eigenvalue of A.
Solution. (a) Corresponding to any (n x 1) vector u in C(X), there exists a k x 1
vector r such that u = Xr, so that

Au = AXr = XBr E C(X).

Thus, C(X) is an invariant subspace relative to A.



(b) Let λ represent an eigenvalue of B, and let y represent an eigenvector of B
corresponding to λ. By definition, By = λy, so that

    A(Xy) = XBy = X(λy) = λ(Xy).

Now, suppose that X is of full column rank. Then (since y ≠ 0) Xy ≠ 0, leading
us to conclude that λ is an eigenvalue of A (and that Xy is an eigenvector of A
corresponding to λ).

EXERCISE 3. Let p(λ) represent the characteristic polynomial of an n x n matrix
A, and let c0, c1, c2, ... , cn represent the respective coefficients of the characteristic
polynomial, so that

    p(λ) = c0λ^0 + c1λ + c2λ² + ... + cnλ^n = Σ_{s=0}^{n} csλ^s

(for λ ∈ R). Further, let P represent the n x n matrix obtained from p(λ) by
formally replacing the scalar λ with the n x n matrix A (and by setting A^0 = In).
That is, let

    P = c0I + c1A + c2A² + ... + cnA^n = Σ_{s=0}^{n} csA^s.

Show that P = 0 (a result that is known as the Cayley-Hamilton theorem) by
carrying out the following four steps.
(a) Letting B(λ) = A - λI and letting H(λ) represent the adjoint matrix of B(λ),
show that (for λ ∈ R)

    H(λ) = K0 + λK1 + λ²K2 + ... + λ^(n-1)K_{n-1},

where K0, K1, K2, ... , K_{n-1} are n x n matrices (that do not vary with λ).
(b) Letting T0 = AK0, Tn = -K_{n-1}, and (for s = 1, ... , n - 1) Ts =
AKs - K_{s-1}, show that (for λ ∈ R)

    T0 + λT1 + λ²T2 + ... + λ^nTn = p(λ)In.

[Hint. It follows from a fundamental result on adjoint matrices that (for λ ∈ R)
B(λ)H(λ) = |B(λ)|In = p(λ)In.]
(c) Show that, for s = 0, 1, ... , n, Ts = csI.
(d) Show that P = T0 + AT1 + A²T2 + ... + A^nTn = 0.

Solution. (a) Let h_ij(λ) represent the ijth element of H(λ). Then, h_ij(λ) is the
cofactor of the jith element of B(λ). And, it is apparent from the definition of a
cofactor and from the definition of a determinant [given by formula (13.1.2)] that
h_ij(λ) is a polynomial (in λ) of degree n - 1 or n - 2. Thus,

    h_ij(λ) = k_ij^(0) + λk_ij^(1) + λ²k_ij^(2) + ... + λ^(n-1)k_ij^(n-1)

for some scalars k_ij^(0), k_ij^(1), k_ij^(2), ... , k_ij^(n-1) (that do not vary with λ). And, it follows
that
    H(λ) = K0 + λK1 + λ²K2 + ... + λ^(n-1)K_{n-1},
where (for s = 0, 1, 2, ... , n - 1) Ks is the n x n matrix whose ijth element is
k_ij^(s).
(b) In light of Part (a), we have that (for λ ∈ R)

    B(λ)H(λ) = (A - λI)(K0 + λK1 + λ²K2 + ... + λ^(n-1)K_{n-1})
             = T0 + λT1 + λ²T2 + ... + λ^nTn.

And, making use of the hint, we find that (for λ ∈ R)

    T0 + λT1 + λ²T2 + ... + λ^nTn = p(λ)In.

(c) For s = 0, 1, ... , n, let t_ij^(s) represent the ijth element of Ts. Then, it follows
from Part (b) that (for λ ∈ R)

    t_ij^(0) + λt_ij^(1) + λ²t_ij^(2) + ... + λ^n t_ij^(n) = p(λ), if j = i,
                                                            0,    if j ≠ i.

Consequently,
    t_ij^(s) = cs, if j = i,
               0,  if j ≠ i,
and hence Ts = csI.
(d) Making use of Part (c), we find that

    T0 + AT1 + A²T2 + ... + A^nTn
        = c0I + A(c1I) + A²(c2I) + ... + A^n(cnI) = P.

Moreover,

    T0 + AT1 + A²T2 + ... + A^nTn
        = (A - A)K0 + (A - A)AK1 + (A - A)A²K2
          + ... + (A - A)A^(n-1)K_{n-1}
        = 0.

EXERCISE 4. Let c0, c1, ... , c_{n-1}, cn represent the respective coefficients of
the characteristic polynomial p(λ) of an n x n matrix A [so that p(λ) = c0 +
c1λ + ... + c_{n-1}λ^(n-1) + cnλ^n (for λ ∈ R)]. Using the result of Exercise 3 (the
Cayley-Hamilton theorem), show that if A is nonsingular, then c0 ≠ 0, and

    A⁻¹ = -(1/c0)(c1I + c2A + ... + cnA^(n-1)).


Solution. According to result (1.8), c0 = |A|, and, according to the result of
Exercise 3,
    c0I + c1A + c2A² + ... + cnA^n = 0.                    (S.1)
Now, suppose that A is nonsingular. Then, it follows from Theorem 13.3.7 that
c0 ≠ 0. Moreover, upon premultiplying both sides of equality (S.1) by A⁻¹, we
find that

    c0A⁻¹ + c1I + c2A + ... + cnA^(n-1) = 0

and hence that

    A⁻¹ = -(1/c0)(c1I + c2A + ... + cnA^(n-1)).

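The inverse formula can be checked numerically. In the sketch below (an illustration
only), the coefficients c_0, ..., c_n of p(λ) = |A - λI| are recovered from NumPy's
np.poly, which returns the coefficients of the monic polynomial |λI - A|; the sign
adjustment and the random test matrix are choices made for the example.

    import numpy as np

    rng = np.random.default_rng(8)
    n = 4
    A = rng.standard_normal((n, n)) + n*np.eye(n)     # almost surely nonsingular

    q = np.poly(A)                                    # q[k] multiplies lambda**(n-k) in |lambda*I - A|
    c = [(-1)**n * q[n - s] for s in range(n + 1)]    # coefficients of p(lambda) = |A - lambda*I|

    inv = -(1/c[0]) * sum(c[s] * np.linalg.matrix_power(A, s - 1) for s in range(1, n + 1))
    print(np.allclose(inv, np.linalg.inv(A)))         # True
    print(np.isclose(c[0], np.linalg.det(A)))         # c_0 = |A|
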
EXERCISE 5. Show that if an n x n matrix B is similar to an n x n matrix A,


then (1) Bk is similar to Ak (k = 2, 3, ... ) and (2) B' is similar to A'.

Solution. Suppose that B is similar to A. Then, there exists an n x n nonsingular
matrix C such that B = C⁻¹AC.
(1) Clearly, it suffices to show that (for k = 1, 2, 3, ...) B^k = C⁻¹A^kC. Let
us proceed by mathematical induction. Obviously, B^1 = C⁻¹A^1C. Now, suppose
that B^(k-1) = C⁻¹A^(k-1)C (where k ≥ 2). Then,

    B^k = BB^(k-1) = C⁻¹ACC⁻¹A^(k-1)C = C⁻¹A^kC.

(2) We find that

    B' = (C⁻¹AC)' = C'A'(C⁻¹)' = [(C')⁻¹]⁻¹A'(C')⁻¹.

Thus, B' is similar to A'.

EXERCISE 6. Show that if an n x n matrix B is similar to an (n x n) idempotent


matrix, then B is idempotent.
Solution. Let A represent an n x n idempotent matrix, and suppose that B is similar
to A. Then, there exists an n x n nonsingular matrix C such that B = C⁻¹AC.
And, it follows that

    B² = C⁻¹ACC⁻¹AC = C⁻¹A²C = C⁻¹AC = B.

EXERCISE 7. Let A = I2, and let B = (1 1; 0 1), the 2 x 2 matrix with first row
(1, 1) and second row (0, 1). Show that B has the same rank, determinant, trace,
and characteristic polynomial as A, but that, nevertheless, B is not similar to A.

Solution. Clearly, |B| = 1 = |A|, rank(B) = 2 = rank(A), tr(B) = 2 = tr(A),
and the characteristic polynomial of both B and A is p(λ) = (λ - 1)².
Now, suppose that CB = AC for some 2 x 2 matrix C = {cij}. Then, since

    CB = (c11  c11 + c12)   and   AC = C = (c11  c12),
         (c21  c21 + c22)                  (c21  c22)

c11 + c12 = c12 and c21 + c22 = c22, implying that c11 = 0 and c21 = 0 and hence
that C is singular.
Thus, there exists no 2 x 2 nonsingular matrix C such that CB = AC. And, we
conclude that B is not similar to A.
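
A brief NumPy check (illustrative only) of the shared invariants, together with the
observation that any similarity transform of the identity is again the identity, so B
can never arise as C⁻¹AC:

    import numpy as np

    A = np.eye(2)
    B = np.array([[1.0, 1.0],
                  [0.0, 1.0]])

    print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B))   # same rank
    print(np.isclose(np.linalg.det(A), np.linalg.det(B)))         # same determinant
    print(np.isclose(np.trace(A), np.trace(B)))                   # same trace
    print(np.allclose(np.poly(A), np.poly(B)))                    # same characteristic polynomial

    rng = np.random.default_rng(9)
    C = rng.standard_normal((2, 2)) + 2*np.eye(2)                 # an arbitrary nonsingular C
    print(np.allclose(np.linalg.inv(C) @ A @ C, A))               # always the identity, never B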

EXERCISE 8. Expand on the result of Exercise 7 by showing (for an arbitrary


positive integer n) that for an n x n matrix B to be similar to an n x n matrix A it
is not sufficient for B to have the same rank, determinant, trace, and characteristic
polynomial as A.

Solution. Suppose that A = In, and suppose that B is a triangular matrix, all of
whose diagonal elements equal 1. Then, in light of Lemma 13.1.1 and Corollary
8.5.6, B has the same determinant, rank, trace, and characteristic polynomial as A.
However, for B to be similar to A, it is necessary (and sufficient) that there exist an
n x n nonsingular matrix C such that B = C- I AC or equivalently (since A = In)
that B = In. Thus, it is only in the special case where B = In that B is similar to
A.

EXERCISE 9. Let A represent an n x n matrix, B a k x k matrix, and X an n x k


matrix such that AX = XB. Show that if X is of full column rank, then there exists
an orthogonal matrix Q such that Q'AQ = (T11 T12; 0 T22), where T11 is a k x k
matrix that is similar to B.

Solution. Suppose that rank(X) = k. Then, according to Theorem 6.4.3, there


exists an n x k matrix U whose columns are orthonormal (with respect to the usual
inner product) vectors that form a basis for C(X). And, X = UC for some k x k
matrix C. Moreover, Cis nonsingular [since rank(X) = k].
Now, observe that AUC = AX = XB = UCB and hence that AU = AUCC⁻¹
= UCBC⁻¹. Then, in light of Theorem 21.3.2, there exists an orthogonal matrix Q
such that Q'AQ = (T11 T12; 0 T22), where T11 = CBC⁻¹. Moreover, T11 is similar
to B.

EXERCISE 10. Show that if 0 is an eigenvalue of an n x n matrix A, then its


algebraic multiplicity is greater than or equal to n - rank(A).

Solution. Suppose that 0 is an eigenvalue of A. Then, according to Theorem 21.3.4,


its algebraic multiplicity is greater than or equal to its geometric multiplicity, and,
according to Lemma 11.3.1, its geometric multiplicity equals n - rank(A). Thus,

the algebraic multiplicity of the eigenvalue 0 is greater than or equal to n - rank(A).

EXERCISE 11. Let A represent an n x n matrix. Show that if a scalar λ is an
eigenvalue of A of algebraic multiplicity γ, then rank(A - λI) ≥ n - γ.

Solution. Suppose that λ is an eigenvalue of A of algebraic multiplicity γ. Then,
making use of Theorem 21.3.4 and result (1.1) {and recalling that, by definition,
the geometric multiplicity of λ equals dim[N(A - λI)]}, we find that

    γ ≥ dim[N(A - λI)] = n - rank(A - λI)

and hence that

    rank(A - λI) ≥ n - γ.

EXERCISE 12. Let γ1 represent the algebraic multiplicity and ν1 the geometric
multiplicity of 0 when 0 is regarded as an eigenvalue of an n x n (singular) matrix A.
And let γ2 represent the algebraic multiplicity and ν2 the geometric multiplicity
of 0 when 0 is regarded as an eigenvalue of A². Show that if ν1 = γ1, then
ν2 = γ2 = ν1.

Solution. For any n x 1 vector x such that Ax = 0, we find that A²x = AAx =
A0 = 0. Thus,
    ν1 ≤ ν2.                                               (S.2)
There exists an n x ν1 matrix U whose columns form an orthonormal (with
respect to the usual inner product) basis for N(A). Then, AU = 0 = U·0. And, it
follows from Theorem 21.3.2 that there exists an n x (n - ν1) matrix V such that
the n x n matrix (U, V) is orthogonal and, taking V to be any such matrix, that

    (U, V)'A(U, V) = (0  U'AV)
                     (0  V'AV)

[so that A is similar to (0 U'AV; 0 V'AV)]. Moreover, it follows from Theorem 21.3.1


that YI equals the algebraic multiplicity of 0 when 0 is regarded as an eigenvalue
of (~ ~; !~) and hence (in light of Lemma 21.2.1) that YI equals VI plus the
algebraic multiplicity of 0 when 0 is regarded as an eigenvalue of V' A V.
Now, suppose that VI = YI. Then, the algebraic multiplicity of 0 when 0 is
regarded as an eigenvalue of V' A V equals 0; that is, 0 is not an eigenvalue of
V' A V. Thus, it follows from Lemma 11.3.1 that V' A V is nonsingular.
Further,

    (U, V)'A²(U, V) = (U, V)'A(U, V)(U, V)'A(U, V)

                    = (0  U'AV)²  =  (0  U'AVV'AV)
                      (0  V'AV)      (0  (V'AV)² ),

so that A² is similar to (0 U'AVV'AV; 0 (V'AV)²). And, since (V'AV)² is nonsingular, it
follows from Lemma 21.2.1 that γ2 = ν1. Recalling inequality (S.2), we conclude,
on the basis of Theorem 21.3.4, that ν1 = γ2 ≥ ν2 ≥ ν1 and hence that ν2 = γ2 =
ν1.

EXERCISE 13. Let XI and X2 represent eigenvectors of an n x n matrix A, and


let CI and C2 represent nonzero scalars. Under what circumstances is the vector
X = CIXI + C2X2 an eigenvector of A?

Solution. Let Al and A2 represent the two (not-necessarily-distinct) eigenvalues to


which XI and X2 correspond. Then, by definition, AXI = AIXI and AX2 = A2X2,
and consequently

Thus, if A2 = AI, then X is an eigenvector of A [unless X2 = -(q/C2)XI, in


which case X = 0]. Alternatively, if A2 =1= AI, then, according to Theorem 21.4.1,
XI and X2 are linearly independent, implying that X and X2 are linearly independent
(as can be easily verified) and hence that there does not exist any scalar C such
that AIX + (A2 - AI)C2X2 = cx. We conclude that if A2 =1= Al then X is not an
eigenvector of A.

EXERCISE 14. Let A represent an n x n matrix, and suppose that there exists
an n x n nonsingular matrix Q such that Q⁻¹AQ = D for some diagonal matrix
D = {di}. Further, for i = 1, ... , n, let ri' represent the ith row of Q⁻¹. Show (a)
that A' is diagonalized by (Q⁻¹)', (b) that the diagonal elements of D are the (not
necessarily distinct) eigenvalues of A', and (c) that r1, ... , rn are eigenvectors of
A' (with ri corresponding to the eigenvalue di).

Solution. The validity of Part (a) is evident upon observing that

    D = D' = (Q⁻¹AQ)' = Q'A'(Q⁻¹)' = [(Q')⁻¹]⁻¹A'(Q⁻¹)'
           = [(Q⁻¹)']⁻¹A'(Q⁻¹)'.

And, observing also that ri is the ith column of (Q⁻¹)', the validity of Parts (b)
and (c) follows from Parts (7) and (8) of Theorem 21.5.1.

EXERCISE 15. Show that if an n x n nonsingular matrix A is diagonalized by


an n x n nonsingular matrix Q, then A⁻¹ is also diagonalized by Q.

Solution. Suppose that A is diagonalized by Q. Then, it follows from Theorem
21.5.1 that the columns of Q are eigenvectors of A and hence (in light of Lemma
21.1.3) of A⁻¹, leading us to conclude (on the basis of Theorem 21.5.2) that Q
diagonalizes A⁻¹ as well as A. [Another way to see that Q diagonalizes A⁻¹ is
to observe that Q⁻¹A⁻¹Q = (Q⁻¹AQ)⁻¹ and that (since Q⁻¹AQ is a diagonal
matrix) (Q⁻¹AQ)⁻¹ is a diagonal matrix.]

EXERCISE 16. Let A represent an n x n matrix whose spectrum comprises
k eigenvalues λ1, ... , λk with algebraic multiplicities γ1, ... , γk, respectively,
that sum to n. Show that A is diagonalizable if and only if, for i = 1, ... , k,
rank(A - λiI) = n - γi.

Solution. Let ν1, ... , νk represent the geometric multiplicities of λ1, ... , λk, re-
spectively. Then, according to result (1.1), νi = n - rank(A - λiI) (i = 1, ... ,
k). Thus,

    rank(A - λiI) = n - γi ⇔ γi = n - rank(A - λiI) ⇔ νi = γi.

Moreover, it follows from Corollary 21.3.7 that νi = γi for i = 1, ... , k if and
only if Σ_{i=1}^{k} νi = Σ_{i=1}^{k} γi or equivalently (since Σ_{i=1}^{k} γi = n) if and only if
Σ_{i=1}^{k} νi = n. We conclude, on the basis of Corollary 21.5.4, that A is diagonaliz-
able if and only if, for i = 1, ... , k, rank(A - λiI) = n - γi.

EXERCISE 17. Let A represent an n x n symmetric matrix with not-necessarily-


distinct eigenvalues dl, ... , d n that have been ordered so that dl .s: d2 .s: ... .s: d n .
And, let Q represent an n x n orthogonal matrix such that Q' AQ = diag(dl, ... ,
d n ) - the existence of which is guaranteed. Further, for m = 2, ... , n - 1, define
Sm = {x E nnx I : x i= 0, Q~x = O} and Tm = {x E nnx
I : x i= 0, P'mx = O},
where Qm = (ql"'" qm-l) and Pm = (qm+l"'" qn)' Show that, for m =
2, ... , n - 1,
x'Ax x'Ax
dm = min - - = max - - .
XESm XiX xETm XiX

Solution. Let x represent an arbitrary n x 1 vector, and let Y = Q' x. Partition Q


and Y as Q = (Qm, Rm) and Y = G~) (where YI has m - 1 elements). Then,

Yl = Q~x, Y2 = R~x, and

Moreover, since the columns of Rm are linearly independent and the columns of
Qm are linearly independent, Y2 = 0 ¢} R mY2 = 0 (or equivalently Y2 i= 0 ¢}
R mY2 i= 0) and Yl = 0 {:} QmYI = O. Thus,

Q~ x = 0 {:} Y1 = 0 ¢} x = Rm Y2 .
It follows that x E Sm if and only if x = R mY2 for some (n - m + 1) x 1 nonnull
vector Y2'
It is now clear (since R~Rm = I) that

. x'Ax . (R mY2)'AR mY2 . y;(R~ARm)y2


mm--=mm =mm I .
XESmXiX ydO (RmY2)'RmY2 Y2,<oO Y2Y2

Moreover, R~ARm = diag(dm, d m +l, ... , d n ), so that R~ARm is a symmetric


matrix whose smallest eigenvalue is dm . Thus. as a consequence of Theorem

21.5.6, we have that

And, we conclude that


x/Ax
min-- = dm .
xeSm x'x
That maxxeTm x' AX/X'X = d m follows from a similar argument.

EXERCISE 18. Let A represent an n x n symmetric matrix, and adopt the


following notation: dl, ... , dn are the not-necessarily-distinct eigenvalues of A,
ql' ... , qn are orthonormal eigenvectors corresponding to d I, ... , dn , respectively,
Q = (ql"'" Qn), {AI, ... , Ad is the spectrum of A; and, for j = 1, ... , k,
Sj = {i : di = Aj}, Ej = LieSjqiq;, andQj = (qil, .. ·,qivj),where
iI, ... , iVj denote the elements of Sj .
(a) Show that the matrices EI, ... , Ek , which appear in the spectral decompo-
sition A = L;=I AjEj, have the following properties:
(1) EI + ... + Ek = I;
(2) EI, ... , Ek are nonnull, symmetric, and idempotent;
(3) for t =f:. j = 1, ... , k, ErE j = 0; and
(4) rank(EI) + ... + rank(Ek) = n.
(b) Take FI, ... , Fr to be n x n nonnull idempotent matrices such that FI +
... + F r= I. And, suppose that, for some distinct scalars TI, •.• , Tr ,

Show that r = k and that there exists a permutation tl, ... , tr of the first r positive
integers such that (for j = 1, ... , r) Tj = Atj and F j = E rj .

Solution. (a) (1)


k n
EI + ... + Ek = L L qiq; = L qiq; = QQ' = I.
j=1 ieSj i=1

(2) Observe that (for j = 1, ... , k) Ej = QjQj and QjQj = I. Then, clearly,
Ej is symmetric. And, Qj is nonnull, implying (in light of Corollary 5.3.2) that
E j is nonnull. Further

so that E j is idempotent.
(3) and (4) In light of Theorem 18.4.1, Parts (3) and (4) follow from Parts (1)
and (2).

(b) Observe (in light of Theorem 18.4.1) that, for t f. j = 1, ... , r, FtFj = O.
Then, for j = 1, ... , r, we find that

implying that T:j is an eigenvalue of A and that any nonnull column of Fj is an


eigenvector of A corresponding to T:j. Consequently, there exists some subset
T = {tl, ... , t r } of the first k positive integers such that (for j = 1, ... , r)
T:j = Atj (so that r :::: k). Further,

C(Et)·) = C(Qt.Q;)
))
= C(Qt·)
)
= N(A - At)· I) = N(A - T:jl),

so that Fj = EtjLj for some n x n matrix L j .


We have that
A = Atl Etl Ll + ... + At,Et,Lr ,

implying [in light of Part (a) and equality (5.5)] that (for j = 1, ... , r)

so that Atj = 0 or F j = Etj' Moreover, for j fj. T, we find that

AjEj = AjE] = EjA = 0,

implying (since E j f. 0) that Aj = 0 (and hence that k :::: r + 1).


To complete the proof, it suffices to show that r = k and that (for j = 1, ... , r)
F j = Etj' Let us consider separately the following two cases: (1) Atj f. 0 for
j = 1, ... , r; and (2) Ats = 0 for some integer s (1 :::: s :::: r).
In Case (1), it follows from result (S.3) that (for j = 1, ... , r) Fj = Etj'
Moreover, r = k, since otherwise there would exist a positive integer s fj. T such
that As = 0, and we would have [in light of Part (a)] that
r r
Es =1- LEtj =1- LFj =1-1=0,
j=l j=l

which [since, according to Part (a), Es f. 0] would lead to a contradiction.


In Case (2), it is clear that r = k and also [in light of result (S.3) and Part (a)]
that F j = Etj for j f. sand

Fs =1- LFj =1- LEtj = Ets '


jf.s jf.s

EXERCISE 19. Let A represent an n x n symmetric matrix, and let d1, ... , dn
represent the (not-necessarily-distinct) eigenvalues of A. Show that lim (k→∞) A^k
= 0 if and only if, for i = 1, ... , n, |di| < 1.

Solution. Let D = diag(d1, ... , dn). Then, according to Corollary 21.5.9, there
exists an n x n orthogonal matrix Q such that Q'AQ = D. And, it follows from
result (5.6) that A^k = QD^kQ'. Moreover, since D^k = diag(d1^k, ... , dn^k),

    |di| < 1 for i = 1, ... , n  ⇔  lim (k→∞) di^k = 0 for i = 1, ... , n
                                 ⇔  lim (k→∞) D^k = 0.

Now, suppose that, for i = 1, ... , n, |di| < 1. Then,

    lim (k→∞) A^k = Q[lim (k→∞) D^k]Q' = Q0Q' = 0.

Conversely, suppose that lim (k→∞) A^k = 0. Then, observing that D^k = Q'A^kQ,
we find that

    lim (k→∞) D^k = lim (k→∞) Q'A^kQ = Q'[lim (k→∞) A^k]Q = Q'0Q = 0.

And, it follows that, for i = 1, ... , n, |di| < 1.
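
An informal NumPy illustration (not part of the original solution): a symmetric
matrix rescaled so that every eigenvalue lies strictly inside (-1, 1) has powers
converging to 0, while shifting it so that every eigenvalue exceeds 1 destroys the
convergence. The rescaling and the shift are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(10)
    n = 4
    S = rng.standard_normal((n, n)); S = S + S.T
    A = S / (np.abs(np.linalg.eigvalsh(S)).max() + 1.0)   # all |d_i| < 1 after rescaling

    print(np.max(np.abs(np.linalg.eigvalsh(A))) < 1)               # True
    print(np.allclose(np.linalg.matrix_power(A, 200), 0))          # A^k -> 0

    B = A + 2.0*np.eye(n)                                 # every eigenvalue now exceeds 1
    print(np.allclose(np.linalg.matrix_power(B, 200), 0))          # False: powers diverge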

EXERCISE 20. (a) Show that if 0 is an eigenvalue of an n x n not-necessarily-


symmetric matrix A, then it is also an eigenvalue of A + and that the geometric
multiplicity of 0 is the same when it is regarded as an eigenvalue of A+ as when
it is regarded as an eigenValue of A.
(b) Show (via an example) that the reciprocals of the nonzero eigenvalues of a
square nonsymmetric matrix A are not necessarily eigenvalues of A + .
Solution. (a) Suppose that 0 is an eigenvalue of A. Then, according to Lemma
21.1.1, rank(A) < n. And, since (according to Theorem 20.5.1) rank(A+) =
rank(A), it follows that rank(A+) < n, implying (in light of Lemma 21.1.1) that
0 is also an eigenvalue of A+. Further, making use of Lemma 11.3.1, we find that

    dim[N(A+)] = n - rank(A+) = n - rank(A) = dim[N(A)],

so that the geometric multiplicity of 0 is the same when it is regarded as an eigen-
value of A+ as when it is regarded as an eigenvalue of A.
(b) Consider the n x n matrix A = (1n, 0) (where n ≥ 2), whose first column is the
vector 1n of ones and whose remaining columns are null. We find that A'A =
diag(n, 0, 0, ... , 0) and hence that (A'A)+ = diag(1/n, 0, 0, ... , 0), implying
(in light of Corollary 20.5.5) that

    A+ = (A'A)+A' = diag(1/n, 0, 0, ... , 0)A'.

Since A and A+ are triangular, it is easy to see that the distinct eigenvalues of A
are 0 and 1, while those of A+ are 0 and 1/n. Obviously, 1/n is not (for n ≥ 2)
the reciprocal of 1.

EXERCISE 21. Show that, for any positive integer n that is divisible by 2, there
exists an n x n orthogonal matrix that has no eigenvalues. [Hint. Find a 2 x 2
orthogonal matrix Q that has no eigenvalues, and then consider the block-diagonal
matrix diag(Q, Q, ... , Q).]

Solution. Let Q = (0 1; -1 0), the 2 x 2 matrix with rows (0, 1) and (-1, 0). Clearly,
Q is orthogonal, and (as shown in Section 21.1) it has no eigenvalues. Now,
consider the n x n block-diagonal matrix diag(Q, Q, ... , Q) (having n/2 diagonal
blocks). This matrix is orthogonal (as is easily verified), and, as a consequence of
Part (2) of Lemma 21.2.1, it has no eigenvalues.

EXERCISE 22. Let Q represent an n x n orthogonal matrix, and let p(λ) represent
the characteristic polynomial of Q. Show that (for λ ≠ 0)

    p(λ) = ±λ^n p(1/λ).

Solution. Let λ represent an arbitrary nonzero scalar. Then,

    Q - λI = Q - λQQ' = -λQ[Q' - (1/λ)I] = -λQ[Q - (1/λ)I]'.

Thus, making use of Theorem 13.3.4, Lemma 13.2.1, and Corollaries 13.2.4 and
13.3.6, we find that

    p(λ) = |Q - λI| = |-λQ||[Q - (1/λ)I]'|
         = (-λ)^n |Q||Q - (1/λ)I|
         = (-1)^n λ^n (±1) p(1/λ) = ±λ^n p(1/λ).
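
The identity, including the sign (which the derivation shows to be (-1)^n|Q|), can be
checked numerically; the random orthogonal Q below, obtained from a QR
factorization, and the test value λ = 0.7 are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(11)
    n = 5
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # a random orthogonal matrix

    p = lambda t: np.linalg.det(Q - t*np.eye(n))       # p(t) = |Q - tI|
    lam = 0.7
    lhs = p(lam)
    rhs = lam**n * p(1/lam)

    print(np.isclose(abs(lhs), abs(rhs)))                        # |p(lam)| = |lam^n p(1/lam)|
    print(np.isclose(lhs, (-1)**n * np.linalg.det(Q) * rhs))     # sign is (-1)^n |Q|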

EXERCISE 23. Let A represent an n x n matrix, and suppose that the scalar 1 is
an eigenvalue of A of geometric multiplicity ν. Show that ν ≤ rank(A) and that if
ν = rank(A), then A is idempotent.

Solution. That ν ≤ rank(A) is an immediate consequence of Corollary 21.3.8.
Now, suppose that ν = rank(A). Then, it follows from Corollary 21.3.8 that A
has no nonzero eigenvalues other than 1, and it follows from Corollary 21.5.4 that
A is diagonalizable. Thus, as a consequence of Theorem 21.8.3, we have that A is
idempotent.

EXERCISE 24. Let A represent an n x n nonsingular matrix. And, let λ repre-
sent an eigenvalue of A, and x represent an eigenvector of A corresponding to λ.
Show that |A|/λ is an eigenvalue of adj(A) and that x is an eigenvector of adj(A)
corresponding to |A|/λ.

Solution. Note (in light of Lemma 21.1.1 and Theorem 13.3.7) that λ ≠ 0 and
|A| ≠ 0.
According to Lemma 21.1.3, 1/λ is an eigenvalue of A⁻¹, and x is an eigenvector
of A⁻¹ corresponding to 1/λ. And, since (according to Corollary 13.5.4) adj(A) =
|A|A⁻¹, it follows from the results of Section 21.10 that |A|/λ is an eigenvalue of
adj(A) and that x is an eigenvector of adj(A) corresponding to |A|/λ.

EXERCISE 25. Let A represent an n x n matrix, and let peA) represent the char-
acteristic polynomial of A. And, let 1..1,0' . , Ak represent the distinct eigenvalues
of A, and YI, .. Yk represent their respective algebraic multiplicities, so that (for
0 ,

alIA)

peA) = (-I)nq(A) nk

j=1
(A - Aj)Yj

for some polynomial q(A) (of degree n - L~=I Yj) that has no real roots. Further,
define B = A - AIUV', where U = (UI, ... , UYI) is an n x YI matrix whose
columns UI, .. u YI are (not necessarily linearly independent) eigenvectors of A
0 ,

corresponding to Al and where V = (VI, ... , v YI ) is an n x YI matrix such that


V'U is diagonal.
(a) Show that the characteristic polynomial, say rCA), of B is such that (for all
A)
YI k
r(A) = (-I)n q (A)n [A-(l-v;ui)Adn(A-Aj)Yj . (Eol)
i=1 j=2

[Hint. Since the left and right sides of equality (E.I) are polynomials (in A), it
suffices to show that they are equal for all A other than AI, ... ,Ak.)
(b) Show that in the special case where U'V = cI for some nonzero scalar c, the
distinct eigenvalues of B are either 1..2, ... , As-I, As, As+I .... , Ak with algebraic
multiplicities)/2, 00', Ys-l, Ys + YI, Ys+I, ... , Yk. respectively, or (1 - C)AI, 1..2,
o ••Ak with algebraic multiplicities YI, )/2, ... , Yk. respectively [depending on
,

whether or not (1 - C)AI = As for some s (2 ~ s ~ k»).


(c) Show that, in the special case where YI = 1, (1) UI is an eigenvector of B
corresponding to the eigenvalue (1 - v~ udAl and (2) for any eigenvector x of B
corresponding to an eigenvalue A [other than (1 - v; uI)Ad, the vector

is an eigenvector of A corresponding to A.
Solution. (a) Let A represent any scalar other than AI, .. 0 • Ak. And, observe that

(A - AI)U = AU - AU = (AI - A)U,

so that (A - AI)-IU = -(A - Ad-IV. Then, making use of Corollary 18.1.2, we


find that

rCA) = IA - AI - Al UV'I
= IA - AIIII - AI(A - AI)-IUV'I

= p(A)IIn + AI(A - Ad-IUV'I


= p(A)IIY1 + Al (A - AI)-IV'UI

= peA) nYI

i=1
[1 + Al (A - Ad-Iv;uiJ

= p(A)(A - Ad- Y1 nYI

i=I
(A - Al + AIV;Ui)

= (-I)nq(A) n
YI

i=1
[A - (1 - v;ui)AIl nk

j=2
(A - Aj)Yj.

(b) Part (b) follows from Part (a) upon observing that, in this special case,

nYI

i=1
[A - (1 - v;ui)AIl = [A - (1 - c)AIl Y1 •

(c) (1) In this special case,

BUI = AUI - AIUI~UI = AIUI - AI(~udul = (1 - ~udAlul.

(2) Suppose that YI = 1, and let d = Al (AI - A)-Iv~ x. Since (by definition)
Bx = AX, we have that

and hence that


A[x - (v~x)uIl = Ax.
Thus,

A(x - dUI) = A[x - (V~X)UI + (V~X)UI - dull = AX - (d - v'lx)AIUI.

Moreover, Aid - Ad = (AI - A)d = AIV~X, implying that (d - V~X)AI = Ad.


And, it follows that
A(x - dud = A(X - dud.
Since X - dUI t= 0 (as is evident upon observing that X and UI are eigenvectors of
B that correspond to different eigenvalues and hence, in light of Theorem 21.4.1,
that X and U I are linearly independent), we conclude that X - du I is an eigenvector
of A corresponding to A.

EXERCISE 26. Let A represent an m x n matrix of rank r. And, take P to be


any m x m orthogonal matrix and DI to be any r x r nonsingular diagonal matrix
such that
P' AA'P = (~~ ~).

Further, partition P as P = (PI, P2), where PI has r columns, and let Q =


t
(Ql, Q2), where Ql = A'PID l and where Q2 is any n x (n - r) matrix such
=
that Q; Q2 O. Show that

Solution. Upon applying Theorem 21.12.1 with A' in place of A (and with nand
m in place of m and n, respectively) and writing P = (PI, P2) for Q = (Ql, Q2)
and Q = (Ql, Q2) for P = (PI, P2), we find that

Q'A'P = (~I ~).


Thus,
P' AQ = (Q' A'P)' = (~l o 0)' = (Dl
0
0)

Or, alternatively, the equality P' AQ = (~l ~) can be established via an


argument paralleling the proof of Theorem 21.12.1.

EXERCISE 27. Let A represent an m x n matrix. And, let P represent an m x m


orthogonal matrix, Q an n x n orthogonal matrix, and Dl = {Si} an r x r diagonal
matrix with (strictly) positive diagonal elements such that P' AQ = (~I ~}
denote by PI' ... 'Pr the first through rth columns of P and by ql' ... , qr the
first through rth columns of Q; and define aI, ... ,ak to be the distinct values
represented among Sl, ... , Sr. Then, the singular value decomposition of A is
A = L'=l ajUj, where (for j = 1, ... , k) Uj = LieLj Piq; with Lj = {i :
Si = aj}. Show that the matrices UI, ... , Uk. which appear in the singular value
decomposition, are such that UjUjUj = Uj (for j = 1, ... , k) and U~Uj = 0
and U,Uj = 0 (for t 1= j = 1, ... , k).
Solution. By definition,

for v =i
for v 1= i.
Thus,

UjUj = (L Pvq~)' L Piq; =L L qvP~Piq; = L qiq;,


veLj ieLj veLj ieLj ieLj

so that

UjUjUj =L Pvq~ L qiq; = L L Pvq~qiq; =L Piq; = Uj.


veLj ieLj veLj ieLj ieLj

And, for t 1= j,

U;Uj = (L Pt'q~)' L Piq; = LL qt'P~,Piq; = 0,


VEL I iELj VEL I iELj

and
UtU) = L Pvq~( L Piq;)' = LL Pvq~qiP; = O.
VEL I iELj VEL I iELj

EXERCISE 28. Let A represent an m x n matrix. And, as in Exercise 27, take


P to be an m x m orthogonal matrix, Q an n x n orthogonal matrix, and DI an
r x r nonsingular diagonal matrix such that p' AQ = (~I ~). Further, partition
P and Q as P = (PI, P2) and Q = (QI, Q2), where each of the matrices PI and
QI has r columns. Show that C(A) = CcPd and that N(A) = C(Q2)'
Solution. Clearly,

A= p(~1 ~)Q' = PIDIQ;,

implying that C(A) c CcPI). Moreover, as a consequence of result (12.12),


CcPd c C(A). Thus, C(A) = CcP]).
We have that

so that Q; Q2 = O. Thus,

Moreover, making use of result (12.9), we find that

rank(Q2) = n - r = n - rank(A).

And, we conclude, on the basis of Lemma 1l.4.1, that N(A) = C(Q2)'

EXERCISE 29. Let AI, ... , Ak represent n x n not-necessarily-symmetric ma-


trices, each of which is diagonalizable. Show that if A I, ... , Ak commute in pairs,
then AI, ... , Ak are simultaneously diagonalizable.
Solution. The proof is a modified version of the mathematical induction argument
employed in Section 13 (in showing that symmetric matrices that commute in pairs
are simultaneously diagonalizable).
That one diagonalizable matrix Al is "simultaneously" diagonalizable is ob-
vious. Suppose now that any k - 1 diagonalizable matrices (of the same order)
that commute in pairs can be simultaneously diagonalized (where k ~ 2). And,

let AI, ... ,Ak represent k diagonalizable matrices of arbitrary order n that com-
mute in pairs. Then, to complete the induction argument, it suffices to show that
AI, ... , Ak can be simultaneously diagonalized.
Let AI, ... , Ar represent the distinct eigenvalues of Ak. and let VI, ... , Vr rep-
resent their respective geometric multiplicities. Take Qj to be an n x Vj matrix
whose columns are linearly independent eigenvectors of Ak corresponding to the
eigenvalue Aj or, equivalently, whose columns form a basis for the V j -dimensional
linear spaceN(Ak - AjI) (j = 1, ... , r). And, define Q = (QI,"" Qr)'
Since Ak is diagonalizable, it follows from Corollary 21.5.4 that L:J=I Vj = n,
so that (in light of Theorem 21.4.2) Q has n linearly independent columns and
hence is nonsingular. Further,

(S.4)

(as is evident from Theorems 21.5.2 and 21.5.1).


As in the case where A I, ... , Ak are symmetric, there exists a vj x vj matrix
Bij such that
AiQj = QjBij (S.5)
(i = 1, ... , k - 1; j = 1, ... , r), and we find that (for i = 1, ... , k - 1)

Q-1AiQ = diag(Bil,"" Bir). (S.6)

Now, partition Q-I as

where (for j = 1, ... , r) T j has Vj rows. And, note that


TjQm={ I vj , form=!=I, ... ,r,
0, form =1=1 = l, ... ,r.

Then, using result (S.5), we find that, for j = 1, ... , r,

(i = 1, ... , k - 1) and that

BsjBij = TjAsQjBij = TjAsAiQj


= TjAiAsQj = TjAiQjBSj = BijBsj (S.7)

(s > i = 1, '" ,k - 1).


Since (by definition) Ai is diagonalizable, there exists a nonsingular matrix
Li and a diagonal matrix Fi such that Li 1 AiLi = Fi , or equivalently such that
AiLi =
LiFi ,and hence such that

(S.8)

(i = I, ... ,k-I).Comparing(fori = I, ... ,k-l andj = I, ... ,r)thejth


group of rows of the left and right sides of equality (S.8) and using equality (S.6),
we find that

and hence that


Bij(TjL;) = (TjL;)F;.
Thus, any nonnull column of the vj x n matrix T j L; is an eigenvector of Bij .
Moreover, since L; is nonsingular and since the rows of Q-I are linearly indepen-
dent,
rank(TjL;) = rank(Tj) = Vj,
so that (according to Theorem 4.4.10) TjL; contains Vj linearly independent
columns and it follows from Corollary 21.5.3 that Bij is diagonalizable.
We have (for j = 1, ... , r) that each of the matrices Blj, ... , Bk- I,j is diago-
nalizable, and it follows from result (S.7) that Blj, ... , Bk-l,j commute in pairs.
Thus, by supposition, Blj, ... , Bk-I,j can be simultaneously diagonalized; that
is, there exists a vj x vj nonsingular matrix S j and vj x vj diagonal matrices
Dlj, ... , Dk-l,j such that (for i = 1, ... , k - 1)

(S.9)

Define S = diag(S1, ..., Sr) and P = QS. Then, clearly, S is nonsingular, and hence P is also nonsingular. Further, using results (S.6), (S.9), and (S.4), we find that, for i = 1, ..., k - 1,

P^{-1}AiP = S^{-1}Q^{-1}AiQS = diag(Di1, ..., Dir)

and that

P^{-1}AkP = S^{-1}Q^{-1}AkQS = diag(λ1 I_{ν1}, ..., λr I_{νr}),

so that all k of the matrices A1, ..., Ak are simultaneously diagonalized by the nonsingular matrix P.
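As a quick numerical illustration (not part of the original solution), the following NumPy sketch builds two commuting diagonalizable matrices that share a nonsingular matrix of eigenvectors and checks that a single nonsingular Q diagonalizes both; the particular matrices, eigenvalues, and random seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Two commuting diagonalizable matrices with a common (nonsingular,
# not necessarily orthogonal) matrix of eigenvectors.
P = rng.standard_normal((n, n))
A1 = P @ np.diag([1., 2., 2., 5.]) @ np.linalg.inv(P)
A2 = P @ np.diag([3., 1., 4., 7.]) @ np.linalg.inv(P)
assert np.allclose(A1 @ A2, A2 @ A1)          # they commute in pairs

# Since A2 has distinct eigenvalues, any matrix Q of its eigenvectors
# plays the role of Q in the solution and diagonalizes A1 as well.
_, Q = np.linalg.eig(A2)
for A in (A1, A2):
    D = np.linalg.inv(Q) @ A @ Q
    assert np.allclose(D, np.diag(np.diag(D)), atol=1e-8)
```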

EXERCISE 30. Let V represent an n x n symmetric nonnegative definite matrix, X an n x p matrix of rank r, and d a p x 1 vector. Using the results of Exercise 19.11 (or otherwise), show that each of the following three conditions is necessary and sufficient for the vector X(X'X)-d to be a solution, for every d ∈ C(X'), to the problem of minimizing the quadratic form a'Va (in a) subject to X'a = d:
(a) there exists an orthogonal matrix that simultaneously diagonalizes V and PX;
(b) there exists a subset of r orthonormal eigenvectors of V that is a basis for C(X);
(c) there exists a subset of r eigenvectors of V that is a basis for C(X).

Solution. (a) Recall from Part (3) of Theorem 12.3.4 that PX is symmetric. Then, as a consequence of Corollary 21.13.2, there exists an orthogonal matrix that simultaneously diagonalizes V and PX if and only if PX V = V PX. And, it follows from the results of Exercise 19.11 that the existence of an orthogonal matrix that simultaneously diagonalizes V and PX is a necessary and sufficient condition for X(X'X)-d to be a solution [for every d ∈ C(X')] to the problem of minimizing a'Va subject to X'a = d.
(b) and (c). Suppose that there exist r (possibly orthonormal) eigenvectors u1, ..., ur of V that form a basis for C(X), and let U = (u1, ..., ur). Then, VU = UD for some (diagonal) matrix D. Moreover, since clearly C(U) = C(X), X = UT and U = XS for some matrices T and S. Thus,

VX = VUT = UDT = XSDT = XQ

for Q = SDT. And, it follows from the results of Exercise 19.11 that X(X'X)-d is a solution [for every d ∈ C(X')] to the problem of minimizing a'Va subject to X'a = d.
Conversely, suppose that X(X'X)-d is a solution [for every d ∈ C(X')] to the problem of minimizing a'Va subject to X'a = d. Then, it follows from Part (a) that there exists an n x n orthogonal matrix Q that simultaneously diagonalizes PX and V. That is, there exists an n x n orthogonal matrix Q such that Q'PXQ = diag(d1, ..., dn) and Q'VQ = diag(f1, ..., fn) for some scalars d1, ..., dn and f1, ..., fn.
Further, it follows from Theorem 21.5.1 that the (not necessarily distinct) eigenvalues of PX are d1, ..., dn and the (not necessarily distinct) eigenvalues of V are f1, ..., fn and that the first, ..., nth columns of Q are eigenvectors of PX corresponding to d1, ..., dn, respectively, and are also eigenvectors of V corresponding to f1, ..., fn, respectively. And, since [according to Part (8) of Theorem 12.3.4] rank(PX) = r, n - r of the eigenvalues of PX are (in light of Lemma 21.1.1) equal to 0.
Now, let Q1 represent the n x r matrix obtained from Q by deleting those columns that are eigenvectors of PX corresponding to 0. Then, in light of Theorem 21.4.3 and Part (7) of Theorem 12.3.4, C(Q1) = C(PX) = C(X). And, since the columns of Q1 are (orthonormal) eigenvectors of V, there exist r orthonormal eigenvectors of V that form a basis for C(X).
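A numerical sketch (not part of the original solution) of conditions (a) and (b): it constructs a symmetric nonnegative definite V whose leading eigenvectors span C(X) and checks that PX and V commute, whereas a generic V typically fails the test. The dimensions, eigenvalues, and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 2
X = rng.standard_normal((n, p))
PX = X @ np.linalg.pinv(X)                       # projection matrix onto C(X)

# Build V so that r = rank(X) of its eigenvectors span C(X): complete an
# orthonormal basis of C(X) to an orthonormal basis of R^n and assign
# arbitrary nonnegative eigenvalues.
Q, _ = np.linalg.qr(np.hstack([X, rng.standard_normal((n, n - p))]))
V = Q @ np.diag([3., 1., 4., 0.5, 2.]) @ Q.T

print(np.allclose(PX @ V, V @ PX))               # True: condition (a) holds

# A generic symmetric nonnegative definite V usually violates condition (a).
B = rng.standard_normal((n, n))
V_bad = B @ B.T
print(np.allclose(PX @ V_bad, V_bad @ PX))       # typically False
```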

EXERCISE 31. Let A represent an n x n symmetric matrix, and let B represent an n x n symmetric positive definite matrix. And, let λmax and λmin represent, respectively, the largest and smallest roots of |A - λB|. Show that

λmin ≤ x'Ax / x'Bx ≤ λmax

for every nonnull vector x in R^n.
Solution. Let S represent any n x n nonsingular matrix such that B = S'S, let R = (S^{-1})' (so that B^{-1} = R'R), and let C = RAR'. Then, in light of result (14.7), λmax and λmin are, respectively, the largest and smallest eigenvalues of C.

And, it follows from Theorem 21.5.6 that

λmin ≤ y'Cy / y'y ≤ λmax

for every nonnull vector y in R^n.
Now, let x represent an arbitrary nonnull vector in R^n, and let y = Sx. Then, y is nonnull, and

y'Cy / y'y = x'S'CSx / x'S'Sx = x'Ax / x'Bx.

Thus,

λmin ≤ x'Ax / x'Bx ≤ λmax.
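The bound is easy to check numerically. The sketch below (not part of the original solution) mirrors the construction in the solution, B = S'S with S taken from a Cholesky factor, and verifies that the generalized Rayleigh quotient stays inside [λmin, λmax]; the dimensions and random seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)); A = (A + A.T) / 2             # symmetric
M = rng.standard_normal((n, n)); B = M @ M.T + n * np.eye(n)   # symmetric p.d.

# B = S'S with S = L' from the Cholesky factor L; R = (S^{-1})', C = RAR';
# the eigenvalues of C are the roots of |A - lam*B|.
L = np.linalg.cholesky(B)          # B = L L'
S = L.T                            # so B = S'S
R = np.linalg.inv(S).T
C = R @ A @ R.T
lam = np.linalg.eigvalsh(C)
lam_min, lam_max = lam.min(), lam.max()

for _ in range(200):
    x = rng.standard_normal(n)
    q = (x @ A @ x) / (x @ B @ x)
    assert lam_min - 1e-10 <= q <= lam_max + 1e-10
```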

EXERCISE 32. Let A represent an n x n symmetric matrix, and let B represent an n x n symmetric positive definite matrix. Show that A - B is nonnegative definite if and only if all n (not necessarily distinct) roots of |A - λB| are greater than or equal to 1 and is positive definite if and only if all n roots are (strictly) greater than 1.

Solution. Let d1, ..., dn represent the n (not necessarily distinct) roots of |A - λB|. And, let S represent any n x n nonsingular matrix such that B = S'S, let R = (S^{-1})' (so that B^{-1} = R'R), and let C = RAR'. Then, in light of result (14.7), the (not necessarily distinct) eigenvalues of C are d1, ..., dn, and it follows from Corollary 21.5.9 that there exists an n x n orthogonal matrix P such that P'CP = D, where D = diag(d1, ..., dn).
Now, take Q = S'P. Then, according to results (14.2) and (14.1), A = QDQ' and B = QQ'. Thus,

A - B = Q(D - In)Q' = Q diag(d1 - 1, ..., dn - 1)Q'.

And, it follows from Corollary 14.2.15 that A - B is nonnegative definite if and only if, for i = 1, ..., n, di - 1 ≥ 0 and is positive definite if and only if, for i = 1, ..., n, di - 1 > 0. Or, equivalently, A - B is nonnegative definite if and only if, for i = 1, ..., n, di ≥ 1 and is positive definite if and only if, for i = 1, ..., n, di > 1.
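A small numerical check of this equivalence (not part of the original solution): A is chosen so that A - B is positive definite, and the roots of |A - λB| all exceed 1. The construction of the roots reuses the transformation from the solution; the matrices and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
M = rng.standard_normal((n, n)); B = M @ M.T + np.eye(n)   # symmetric p.d.
A = B + 0.1 * np.eye(n)                                    # so A - B is p.d.

# Roots of |A - lam*B| via C = RAR' with R = (S^{-1})', B = S'S.
L = np.linalg.cholesky(B)
R = np.linalg.inv(L.T).T
d = np.linalg.eigvalsh(R @ A @ R.T)

nnd = np.all(np.linalg.eigvalsh(A - B) >= -1e-12)
print(nnd, np.all(d >= 1 - 1e-12))   # both True, consistent with the exercise
```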
22
Linear Transformations

EXERCISE 1. Let U represent a subspace of a linear space V, and let S represent a linear transformation from U into a linear space W. Show that there exists a linear transformation T from V into W such that S is the restriction of T to U.
Solution. Let {X1, ..., Xr} represent a basis for U. Then, it follows from Theorem 4.3.12 that there exist matrices Xr+1, ..., Xr+k such that {X1, ..., Xr, Xr+1, ..., Xr+k} is a basis for V.
Now, for i = 1, ..., r, define Yi = S(Xi); and, for i = r + 1, ..., r + k, take Yi to be any matrix in W. And, letting X represent an arbitrary matrix in V, take T to be the transformation from V into W defined by

T(X) = c1Y1 + ... + crYr + cr+1Yr+1 + ... + cr+kYr+k,

where c1, ..., cr, cr+1, ..., cr+k are the (unique) scalars that satisfy X = c1X1 + ... + crXr + cr+1Xr+1 + ... + cr+kXr+k - since Y1, ..., Yr, Yr+1, ..., Yr+k are in the linear space W, T(X) is in W. Clearly, if X ∈ U, then cr+1 = ... = cr+k = 0, and hence

T(X) = c1Y1 + ... + crYr = c1S(X1) + ... + crS(Xr) = S(c1X1 + ... + crXr) = S(X).

Moreover, it follows from Lemma 22.1.8 that T is linear. Thus, T is a linear transformation from V into W such that S is the restriction of T to U.

EXERCISE 2. Let T represent a 1-1 linear transformation from a linear space V into a linear space W. And, write U·Y for the inner product of arbitrary matrices U and Y in W. Further, define X * Z = T(X)·T(Z) for all matrices X and Z in V. Show that the "*-operation" satisfies the four properties required of an inner product for V.

Solution. Observe (in light of Lemma 22.1.3) that N(T) = {0} and hence that T(X) = 0 if and only if X = 0. Then, letting X, Z, and Y represent arbitrary matrices in V and letting k represent an arbitrary scalar, we find that
(1) X * Z = T(X)·T(Z) = T(Z)·T(X) = Z * X;
(2) X * X = T(X)·T(X) > 0, if T(X) ≠ 0 or, equivalently, if X ≠ 0,
            = 0, if T(X) = 0 or, equivalently, if X = 0;
(3) (kX) * Z = T(kX)·T(Z) = [kT(X)]·T(Z) = k[T(X)·T(Z)] = k(X * Z); and
(4) (X + Z) * Y = T(X + Z)·T(Y) = [T(X) + T(Z)]·T(Y)
    = [T(X)·T(Y)] + [T(Z)·T(Y)] = (X * Y) + (Z * Y).

EXERCISE 3. Let T represent a linear transformation from a linear space V into a linear space W, and let U represent any subspace of V such that U and N(T) are essentially disjoint. Further, let {X1, ..., Xr} represent a linearly independent set of r matrices in U.
(a) Show that T(X1), ..., T(Xr) are linearly independent.
(b) Show that if r = dim(U) (or equivalently if X1, ..., Xr form a basis for U) and if U ⊕ N(T) = V, then T(X1), ..., T(Xr) form a basis for T(V).

Solution. (a) Let c1, ..., cr represent any scalars such that ∑_{i=1}^r ci T(Xi) = 0. Then, T(∑_{i=1}^r ciXi) = ∑_{i=1}^r ciT(Xi) = 0, implying that ∑_{i=1}^r ciXi is in N(T) and hence (since clearly ∑_{i=1}^r ciXi ∈ U) that ∑_{i=1}^r ciXi is in U ∩ N(T). Thus, ∑_{i=1}^r ciXi = 0, and (since X1, ..., Xr are linearly independent) it follows that c1 = ... = cr = 0. And, we conclude that T(X1), ..., T(Xr) are linearly independent.
(b) Suppose that r = dim(U) and that U ⊕ N(T) = V. Then, making use of Theorem 22.1.1 and of Corollary 17.1.6, we find that

dim[T(V)] = dim(V) - dim[N(T)] = r.

And, in light of Theorem 4.3.9 and the result of Part (a), it follows that T(X1), ..., T(Xr) form a basis for T(V).
An alternative proof [of the result of Part (b)] can be obtained [in light of the result of Part (a)] by showing that the set {T(X1), ..., T(Xr)} spans T(V). Continue to suppose that r = dim(U) and that U ⊕ N(T) = V. And, let Z1, ..., Zs represent any matrices that form a basis for N(T). Further, observe, in light of Theorem 17.1.5, that the r + s matrices X1, ..., Xr, Z1, ..., Zs form a basis for V.
Now, let Y represent an arbitrary matrix in T(V). Then, Y = T(X) for some matrix X in V and X = ∑_{i=1}^r ciXi + ∑_{j=1}^s kjZj, so that

Y = T(X) = ∑_{i=1}^r ciT(Xi) + ∑_{j=1}^s kjT(Zj) = ∑_{i=1}^r ciT(Xi).

Thus, {T(X1), ..., T(Xr)} spans T(V).

EXERCISE 4. Let T and S represent linear transformations from a linear space


V into a linear space W, and let k represent an arbitrary scalar.
(a) Verify that the transformation kT is linear.
(b) Verify that the transformation T + S is linear.
Solution. (a) For any matrices X and Z in V and for any scalar c,

(kT)(X + Z) = kT(X + Z) = T[k(X + Z)]


= T(kX + kZ)
= T(kX) + T(kZ)
= kT(X) + kT(Z) = (kT)(X) + (kT)(Z),
and
(kT)(cX) = kT(cX) = k[cT(X)] = c[kT(X)] = c(kT)(X).

(b) For any matrices X and Z in V and for any scalar c,

(T + S)(X + Z) = T(X + Z) + S(X + Z)
= T(X) + T(Z) + S(X) + S(Z)
= T(X) + S(X) + T(Z) + S(Z)
= (T + S)(X) + (T + S)(Z),

and

(T + S)(cX) = T(cX) + S(cX) = cT(X) + cS(X)
= c[T(X) + S(X)] = c(T + S)(X).

EXERCISE 5. Let S represent a linear transformation from a linear space U into


a linear space V, and let T represent a linear transformation from V into a linear
space W. Show that the transformation T S is linear.
Solution. For any matrices X and Z in U and for any scalar c,

(T S)(X + Z) = T[S(X + Z)] = T[S(X) + S(Z)]


= T[S(X)] + T[S(Z)] = (T S)(X) + (T S)(Z),
and
(T S)(cX) = T[S(cX)] = T[cS(X)] = cT[S(X)] = c(T S)(X).

EXERCISE 6. Let T represent a linear transformation from a linear space V into a linear space W, and let R represent a linear transformation from a linear space U into W. Show that if T(V) ⊂ R(U), then there exists a linear transformation S from V into U such that T = RS.
Solution. Suppose that T(V) ⊂ R(U). And, let {X1, ..., Xr} represent a basis for V. Then, for i = 1, ..., r, T(Xi) ∈ R(U), and consequently T(Xi) = R(Yi) for some matrix Yi in U.
Now, let X represent an arbitrary matrix in V, and let c1, ..., cr represent the (unique) scalars that satisfy the equality X = ∑_{i=1}^r ciXi. And, take S to be the transformation from V into U defined by S(X) = ∑_{i=1}^r ciYi. Then,

T(X) = T(∑_{i=1}^r ciXi) = ∑_{i=1}^r ciT(Xi)
     = ∑_{i=1}^r ciR(Yi)
     = R(∑_{i=1}^r ciYi) = R[S(X)] = (RS)(X).

Thus, T = RS. Moreover, it follows from Lemma 22.1.8 that S is linear.


EXERCISE 7. Let T represent a transformation from a linear space V into a linear space W, and let S and R represent transformations from W into V. And, suppose that RT = I (where the identity transformation I is from V onto V) and that TS = I (where the identity transformation I is from W onto W).
(a) Show that T is invertible.
(b) Show that R = S = T^{-1}.
Solution. (a) For any matrices X and Z in V such that T(X) = T(Z),

X = I(X) = (RT)(X) = R[T(X)] = R[T(Z)] = (RT)(Z) = I(Z) = Z.

Thus, T is 1-1. Further, for any matrix Y in W,

Y = I(Y) = (TS)(Y) = T(X),

where X = S(Y). And, it follows that T is onto. Since T is both 1-1 and onto, we conclude that T is invertible.
(b) Using results (3.3) and (3.1), we find that

R = RI = R(TT^{-1}) = (RT)T^{-1} = IT^{-1} = T^{-1}

and

S = IS = (T^{-1}T)S = T^{-1}(TS) = T^{-1}I = T^{-1}.

EXERCISE 8. Let T represent an invertible transformation from a linear space V into a linear space W, let S represent an invertible transformation from a linear space U into V, and let k represent an arbitrary nonzero scalar. Using the results of Exercise 7 (or otherwise), show that
(a) kT is invertible and (kT)^{-1} = (1/k)T^{-1} and that
(b) TS is invertible and (TS)^{-1} = S^{-1}T^{-1}.

Solution. (a) In light of the results of Exercise 7, it suffices to show that

((1/k)T^{-1})(kT) = I and (kT)((1/k)T^{-1}) = I.

Using results (2.12), (3.1), (3.3), (2.2), and (2.1), we find that

((1/k)T^{-1})(kT) = (1/k)(T^{-1}(kT))
= (1/k)(k(T^{-1}T)) = (1/k)(kI) = [(1/k)k]I = 1I = I

and similarly that

(kT)((1/k)T^{-1}) = (1/k)((kT)T^{-1})
= (1/k)(k(TT^{-1})) = (1/k)(kI) = [(1/k)k]I = 1I = I.

(b) In light of the results of Exercise 7, it suffices to show that

(S^{-1}T^{-1})(TS) = I and (TS)(S^{-1}T^{-1}) = I.

Using results (2.9), (3.1), (3.3), and (2.13), we find that

(S^{-1}T^{-1})(TS) = ((S^{-1}T^{-1})T)S
= (S^{-1}(T^{-1}T))S = (S^{-1}I)S = S^{-1}S = I

and similarly that

(TS)(S^{-1}T^{-1}) = ((TS)S^{-1})T^{-1}
= (T(SS^{-1}))T^{-1} = (TI)T^{-1} = TT^{-1} = I.

EXERCISE 9. Let T represent a linear transformation from an n-dimensional linear space V into an m-dimensional linear space W. And, write U·Y for the inner product of arbitrary matrices U and Y in W. Further, let B represent a set of matrices V1, ..., Vn (in V) that form a basis for V, and let C represent a set of matrices W1, ..., Wm (in W) that form an orthonormal basis for W. Show that the matrix representation of T with respect to B and C is the m x n matrix whose ijth element is T(Vj)·Wi.
Solution. As a consequence of Theorem 6.4.4, we have that (for j = 1, ..., n)

T(Vj) = ∑_{i=1}^m [T(Vj)·Wi]Wi.

And, upon comparing this expression for T(Vj) with expression (4.3), we find that the matrix representation of T with respect to B and C is the m x n matrix whose ijth element is T(Vj)·Wi.

EXERCISE 10. Let T represent the linear transformation from R^{m×n} into R^{n×m} defined by T(X) = X'. And, let C represent the natural basis for R^{m×n}, comprising the mn matrices U11, U21, ..., Um1, ..., U1n, U2n, ..., Umn, where (for i = 1, ..., m and j = 1, ..., n) Uij is the m x n matrix whose ijth element equals 1 and whose remaining mn - 1 elements equal 0; and similarly let D represent the natural basis for R^{n×m}. Show that the matrix representation for T with respect to the bases C and D is the vec-permutation matrix Kmn.
Solution. Making use of results (4.11) and (16.3.1), we find that, for any m x n matrix X,

(L_D^{-1} T L_C)(vec X) = vec[T(X)] = vec(X') = Kmn vec X.

And, in light of result (4.7), it follows that the matrix representation of T (with respect to C and D) equals Kmn.
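A small sketch (not from the book) that constructs Kmn explicitly and checks the defining identity vec(X') = Kmn vec(X); the helper name vec_permutation is an illustrative choice.

```python
import numpy as np

def vec_permutation(m, n):
    """Build the mn x mn vec-permutation (commutation) matrix K_mn,
    defined by K_mn vec(X) = vec(X') for every m x n matrix X."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # vec stacks columns: entry (i, j) of X sits at position j*m + i
            # in vec(X) and at position i*n + j in vec(X').
            K[i * n + j, j * m + i] = 1.0
    return K

m, n = 3, 4
X = np.arange(m * n, dtype=float).reshape(m, n)
K = vec_permutation(m, n)
assert np.allclose(K @ X.reshape(-1, order="F"), X.T.reshape(-1, order="F"))
```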

EXERCISE 11. Let W represent the linear space of all p x p symmetric matrices, and let T represent a linear transformation from R^{m×n} into W. Further, let B represent the natural basis for R^{m×n}, comprising the mn matrices U11, U21, ..., Um1, ..., U1n, U2n, ..., Umn, where (for i = 1, ..., m and j = 1, ..., n) Uij is the m x n matrix whose ijth element equals 1 and whose remaining mn - 1 elements equal 0. And, let C represent the usual basis for W.
(a) Show that, for any m x n matrix X,

(L_C^{-1} T L_B)(vec X) = vech[T(X)].

(b) Show that the matrix representation of T (with respect to B and C) equals the p(p + 1)/2 x mn matrix

[vech T(U11), ..., vech T(Um1), ..., vech T(U1n), ..., vech T(Umn)].

(c) Suppose that p = m = n and that (for every n x n matrix X)

T(X) = (1/2)(X + X').

Show that the matrix representation of T (with respect to B and C) equals

(Gn'Gn)^{-1}Gn'

(where Gn is the duplication matrix).
Solution. (a) Making use of result (3.5), we find that, for any mn x 1 vector x,

(L_C^{-1} T L_B)(x) = L_C^{-1}[(T L_B)(x)] = vech[(T L_B)(x)] = vech{T[L_B(x)]}.

And, in light of result (3.4), it follows that, for any m x n matrix X,

(L_C^{-1} T L_B)(vec X) = vech{T[L_B(vec X)]} = vech(T{L_B[L_B^{-1}(X)]}) = vech[T(X)].

(b) For any m x n matrix X = {xij}, we find [using Part (a)] that

(L_C^{-1} T L_B)(vec X) = vech[T(∑_{i,j} xij Uij)]
= vech[∑_{i,j} xij T(Uij)]
= ∑_{i,j} xij vech[T(Uij)]
= [vech T(U11), ..., vech T(Um1), ..., vech T(U1n), ..., vech T(Umn)] vec(X).

And, in light of result (4.7), it follows that the matrix representation of T (with respect to B and C) equals the p(p + 1)/2 x mn matrix

[vech T(U11), ..., vech T(Um1), ..., vech T(U1n), ..., vech T(Umn)].

(c) Using Part (a) and results (16.4.6), (16.3.1), and (16.4.15), we find that, for any n x n matrix X,

(L_C^{-1} T L_B)(vec X) = vech[(1/2)(X + X')]
= (1/2)vech(X + X')
= (1/2)(Gn'Gn)^{-1}Gn' vec(X + X')
= (1/2)(Gn'Gn)^{-1}Gn'[vec(X) + Knn vec(X)]
= (1/2)[(Gn'Gn)^{-1}Gn' + (Gn'Gn)^{-1}Gn'Knn](vec X)
= (1/2)[(Gn'Gn)^{-1}Gn' + (Gn'Gn)^{-1}Gn'](vec X)
= (Gn'Gn)^{-1}Gn'(vec X).

And, in light of result (4.7), it follows that the matrix representation of T (with respect to B and C) equals (Gn'Gn)^{-1}Gn'.
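The identity in Part (c) can be checked numerically. The sketch below (not part of the original solution) builds the duplication matrix Gn from its defining property and verifies that (Gn'Gn)^{-1}Gn' applied to vec X returns vech of the symmetric part of X; the helper names and the choice n = 3 are illustrative.

```python
import numpy as np

def duplication_matrix(n):
    """Build G_n (n^2 x n(n+1)/2), defined by G_n vech(A) = vec(A)
    for every n x n symmetric matrix A."""
    G = np.zeros((n * n, n * (n + 1) // 2))
    col = 0
    for j in range(n):
        for i in range(j, n):          # vech stacks the lower triangle by columns
            G[j * n + i, col] = 1.0    # position of A(i, j) in vec(A)
            G[i * n + j, col] = 1.0    # position of A(j, i) in vec(A)
            col += 1
    return G

def vech(A):
    n = A.shape[0]
    return np.concatenate([A[j:, j] for j in range(n)])

n = 3
G = duplication_matrix(n)
H = np.linalg.inv(G.T @ G) @ G.T       # the matrix representation from Part (c)
X = np.arange(n * n, dtype=float).reshape(n, n)
assert np.allclose(H @ X.reshape(-1, order="F"), vech((X + X.T) / 2))
```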

EXERCISE 12. Let V represent an n-dimensional linear space. Further, let B = {V1, V2, ..., Vn} represent a basis for V, and let A represent an n x n nonsingular matrix. And, for j = 1, ..., n, let

Wj = f1jV1 + f2jV2 + ... + fnjVn,

where (for i = 1, ..., n) fij is the ijth element of A^{-1}. Show that the set C comprising the matrices W1, W2, ..., Wn is a basis for V and that A is the matrix representation of the identity transformation I (from V onto V) with respect to B and C.

Solution. Lemma 3.2.4 implies that the set C is linearly independent and hence that C is a basis for V. Then, clearly, A^{-1} is the matrix representation of the identity transformation I (from V onto V) with respect to C and B. And, it follows from Corollary 22.4.3 that (A^{-1})^{-1} is the matrix representation of I^{-1} with respect to B and C and hence [since (A^{-1})^{-1} = A and I^{-1} = I] that A is the matrix representation of I with respect to B and C.

EXERCISE 13. Let T represent the linear transformation from R^{4×1} into R^{3×1} defined by

where x = (x1, x2, x3, x4)'. Further, let B represent the natural basis for R^{4×1} (comprising the columns of I4), and let E represent the basis (for R^{4×1}) comprising the four vectors (1, -1, 0, -1)', (0, 0, 1, 1)', (0, 0, 0, 1)', and (1, 1, 0, 0)'. And, let C represent the natural basis for R^{3×1} (comprising the columns of I3), and F represent the basis (for R^{3×1}) comprising the three vectors (1, 0, 1)', (1, 1, 0)', and (-1, 0, 0)'.
(a) Find the matrix representation of T with respect to B and C.
(b) Find (1) the matrix representation of the identity transformation from R^{4×1} onto R^{4×1} with respect to E and B and (2) the matrix representation of the identity transformation from R^{3×1} onto R^{3×1} with respect to C and F.
(c) Find the matrix representation of T with respect to E and F via each of two approaches: (1) a direct approach, using the equality

WA = [T(V1), ..., T(Vn)],    (*)

where A is the matrix of a linear transformation T from an n-dimensional linear space V into R^{m×1}, where {V1, ..., Vn} is the basis for V, and where W is an m x m matrix whose columns form the basis for R^{m×1}; and (2) an indirect approach, using the results of Parts (a) and (b) in combination with the result that if A is the matrix representation of a linear transformation T from a linear space V into a linear space W with respect to bases B and C (for V and W, respectively), then the matrix representation of T with respect to alternative bases E and F is S^{-1}AR, where R is the matrix representation of the identity transformation from V onto V with respect to E and B and S is the matrix representation of the identity transformation from W onto W with respect to F and C.
(d) Find rank T and dim[N(T)]. Do so by, for instance, using the result that the rank of a linear transformation T from an n-dimensional linear space V into a linear space W equals the rank of its matrix representation (with respect to any bases B and C) and the result that dim[N(T)] = n - rank(T).

Solution. (a) Let A represent the matrix representation of T with respect to B and C, and denote the first, ..., 4th columns of I4 by e1, ..., e4, respectively. Then, in light of the discussion in Part 1 of Section 4b, A = [T(e1), T(e2), T(e3), T(e4)].
(b) (1) In light of the discussion in Part 3 of Section 4b, the matrix representation of the identity transformation from R^{4×1} onto R^{4×1} with respect to E and B is the 4 x 4 matrix whose first, ..., 4th columns are the vectors that form E, that is, the 4 x 4 matrix

( 1  0  0  1)
(-1  0  0  1)
( 0  1  0  0)
(-1  1  1  0).

(2) Let S represent the 3 x 3 nonsingular matrix whose inverse S^{-1} is the matrix representation of the identity transformation from R^{3×1} onto R^{3×1} with respect to C and F. Then, in light of Part 3 of Section 4b,

(1  1  -1)
(0  1   0) S^{-1} = I3,
(1  0   0)

so that

         ( 0  0  1)
S^{-1} = ( 0  1  0).
         (-1  1  1)

(c) Let H represent the matrix representation of T with respect to E and F.
(1) Equality (*) [or equivalently equality (4.8)] gives

(1  1  -1)
(0  1   0) H = [T((1,-1,0,-1)'), T((0,0,1,1)'), T((0,0,0,1)'), T((1,1,0,0)')],
(1  0   0)

so that H is obtained by premultiplying the right side of this equality by the matrix S^{-1} found in Part (b).
(2) The matrix S [from Part (b)] is (in light of Corollary 22.4.3) the matrix representation of the identity transformation from R^{3×1} onto R^{3×1} with respect to F and C. Thus, in light of the results of Parts (a) and (b), it follows from the result cited (which is Theorem 22.4.4) that

H = S^{-1}AR,

where A is the matrix representation found in Part (a) and R is the 4 x 4 matrix found in Part (b)(1).
(d) It is clear from Part (c) that the rank of the matrix representation of T with
respect to E and F equals 2. Thus, it follows from the first result cited (which is
Theorem 22.5.2) that rank T = 2 and from the second result cited (which is part
of Corollary 22.5.3) that dim [N(T)] = 4 - 2 = 2.
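The mechanics of Parts (a)-(d) can be reproduced numerically. In the sketch below (not part of the original solution), the matrix A is an illustrative stand-in for the representation of T with respect to the natural bases (the exercise's own defining expression for T is not reproduced above); E and F are the bases given in the exercise. Both the direct route via equality (*) and the indirect route H = S^{-1}AR give the same representation, and its rank equals the rank of A.

```python
import numpy as np

# Illustrative stand-in for the representation of T w.r.t. natural bases B, C.
A = np.array([[1., 0., 2., 0.],
              [0., 1., 0., 3.],
              [1., 1., 2., 3.]])

E = np.array([[1., 0., 0., 1.],
              [-1., 0., 0., 1.],
              [0., 1., 0., 0.],
              [-1., 1., 1., 0.]])          # columns: the basis E
F = np.array([[1., 1., -1.],
              [0., 1., 0.],
              [1., 0., 0.]])               # columns: the basis F

R = E                                      # identity map, E -> B coordinates
S_inv = np.linalg.inv(F)                   # identity map, C -> F coordinates

H_indirect = S_inv @ A @ R                 # Theorem 22.4.4: H = S^{-1} A R
H_direct = np.linalg.solve(F, A @ E)       # equality (*): F H = [T(E columns)]
assert np.allclose(H_direct, H_indirect)
print(np.linalg.matrix_rank(H_indirect) == np.linalg.matrix_rank(A))
```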

EXERCISE 14. Let T represent a linear transformation of rank k (where k > 0) from an n-dimensional linear space V into an m-dimensional linear space W. Show that there exists a basis E for V and a basis F for W such that the matrix representation of T with respect to E and F is of the form (Ik  0; 0  0).
Solution. Let B represent any basis for V and C any basis for W. And, let A represent the matrix representation of T with respect to B and C. Then, as a consequence of Theorem 22.5.2, rank A = k, and it follows from Theorem 4.4.9 that there exists an n x n nonsingular matrix R and an m x m nonsingular matrix S such that A = S (Ik  0; 0  0) R^{-1} or equivalently such that (Ik  0; 0  0) = S^{-1}AR. And, based on Theorem 22.4.7, we conclude that (Ik  0; 0  0) is the matrix representation of T with respect to some bases E and F.
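One concrete way to produce such nonsingular S and R for a given matrix A is via the singular value decomposition; this construction is an illustrative choice and is not the route taken in the solution above.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, k = 4, 5, 2
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))  # rank-k matrix

# From the SVD A = U diag(s) V', take R = V and S = U with its first k
# columns rescaled by the singular values; then S^{-1} A R = (Ik 0; 0 0).
U, s, Vt = np.linalg.svd(A)
D = np.ones(m); D[:k] = s[:k]
S = U @ np.diag(D)                       # m x m, nonsingular
R = Vt.T                                 # n x n, orthogonal (hence nonsingular)

canon = np.linalg.inv(S) @ A @ R
target = np.zeros((m, n)); target[:k, :k] = np.eye(k)
assert np.allclose(canon, target, atol=1e-10)
```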

EXERCISE 15. Let T represent a linear transformation from an n-dimensional linear space V into an m-dimensional linear space W, and let A represent the matrix representation of T with respect to bases B and C (for V and W, respectively). Use the result that an n x 1 vector x is in N(A) if and only if the corresponding matrix LB(x) is in N(T) [or equivalently that a matrix X (in V) is in N(T) if and only if the corresponding vector L_B^{-1}(X) is in N(A)] to devise a "direct" proof that dim[N(T)] = dim[N(A)] (as opposed to deriving this equality as a corollary of the result that rank T = rank A).
Solution. The result cited [which is Part (2) of Theorem 22.5.1] implies that LB[N(A)] = N(T) (as can be easily verified). Thus, there exists a 1-1 linear transformation from N(A) onto N(T), namely, the linear transformation R defined [for x ∈ N(A)] by R(x) = LB(x). And, it follows that N(A) and N(T) are isomorphic. Based on Theorem 22.3.1, we conclude that dim[N(T)] = dim[N(A)].

EXERCISE 16. Let T represent a linear transformation from an n-dimensional linear space V into V.
(a) Let U represent an r-dimensional subspace of V, and suppose that U is invariant relative to T. Show that there exists a basis B for V such that the matrix representation of T with respect to B and B is of the (upper block-triangular) form (E  F; 0  H) (where E is of dimensions r x r).
(b) Let U and W represent subspaces of V such that U ⊕ W = V (i.e., essentially disjoint subspaces of V whose sum is V). Suppose that both U and W are invariant relative to T. Show that there exists a basis B for V such that the matrix representation of T with respect to B and B is of the (block-diagonal) form diag(E, H) [where the dimensions of E equal dim(W)].
Solution. (a) Let X1, ..., Xr represent any r matrices that form a basis for U. And, take B to be any basis for V comprising X1, ..., Xr and n - r additional matrices Xr+1, ..., Xn - the existence of such a basis is guaranteed by Theorem 4.3.12. Further, let A = {aij} represent the matrix representation of T with respect to B and B. Then, for j = 1, ..., r,

a1jX1 + ... + arjXr + ar+1,jXr+1 + ... + anjXn = T(Xj) ∈ U,

implying (since any matrix in U can be expressed as a linear combination of X1, ..., Xr) that ar+1,j = ... = anj = 0. Thus, aij = 0 for i = r + 1, ..., n and j = 1, ..., r.
(b) Let r = dim(U) [in which case dim(W) = n - r]. Further, let X1, ..., Xr represent any r matrices that form a basis for U and Xr+1, ..., Xn any n - r matrices that form a basis for W. And, take B to be the basis for V comprising X1, ..., Xr, Xr+1, ..., Xn - that X1, ..., Xr, Xr+1, ..., Xn form a basis for V is evident from Theorem 17.1.5.
Now, let A = {aij} represent the matrix representation of T with respect to B and B. Then, for j = 1, ..., r, r + 1, ..., n,

a1jX1 + ... + arjXr + ar+1,jXr+1 + ... + anjXn = T(Xj).

And, for j = 1, ..., r, T(Xj) ∈ U, implying (since any matrix in U can be expressed as a linear combination of X1, ..., Xr) that ar+1,j = ... = anj = 0. Similarly, for j = r + 1, ..., n, T(Xj) ∈ W, implying (since any matrix in W can be expressed as a linear combination of Xr+1, ..., Xn) that a1j = ... = arj = 0. Thus, aij = 0 for i = r + 1, ..., n and j = 1, ..., r; and also aij = 0 for i = 1, ..., r and j = r + 1, ..., n.

EXERCISE 17. Let V, W, and U represent linear spaces.
(a) Show that the dual transformation of the identity transformation I from V onto V is I.
(b) Show that the dual transformation of the zero transformation 01 from V into W is the zero transformation 02 from W into V.
(c) Let S represent the dual transformation of a linear transformation T from V into W. Show that T is the dual transformation of S.
(d) Let k represent a scalar, and let S represent the dual transformation of a linear transformation T from V into W. Show that kS is the dual transformation of kT.
(e) Let T1 and T2 represent linear transformations from V into W, and let S1 and S2 represent the dual transformations of T1 and T2, respectively. Show that S1 + S2 is the dual transformation of T1 + T2.
(f) Let P represent the dual transformation of a linear transformation S from U into V, and let Q represent the dual transformation of a linear transformation T from V into W. Show that PQ is the dual transformation of TS.
Solution. Write X·Z for the inner product of arbitrary matrices X and Z in V, U ∗ Y for the inner product of arbitrary matrices U and Y in W, and A ∗ B for the inner product of arbitrary matrices A and B in U.
(a) For every matrix X in V and every matrix Y in V,

X·I(Y) = X·Y = I(X)·Y.

Thus, I is the dual transformation of I.
(b) For every matrix X in V and every matrix Y in W,

X·02(Y) = X·0 = 0 = 0 ∗ Y = 01(X) ∗ Y.

Thus, 02 is the dual transformation of 01.
(c) For every matrix Y in W and every matrix X in V,

Y ∗ T(X) = T(X) ∗ Y = X·S(Y) = S(Y)·X.

Thus, T is the dual transformation of S.
(d) For every matrix X in V and every matrix Y in W,

X·(kS)(Y) = X·[kS(Y)]
= k[X·S(Y)] = k[T(X) ∗ Y] = [kT(X)] ∗ Y = (kT)(X) ∗ Y.

Thus, kS is the dual transformation of kT.
(e) For every matrix X in V and every matrix Y in W,

X·(S1 + S2)(Y) = X·[S1(Y) + S2(Y)]
= X·S1(Y) + X·S2(Y)
= T1(X) ∗ Y + T2(X) ∗ Y
= [T1(X) + T2(X)] ∗ Y = (T1 + T2)(X) ∗ Y.

Thus, S1 + S2 is the dual transformation of T1 + T2.
(f) For every matrix X in U and every matrix Y in W,

X ∗ (PQ)(Y) = X ∗ P[Q(Y)] = S(X)·Q(Y) = T[S(X)] ∗ Y = (TS)(X) ∗ Y.

Thus, PQ is the dual transformation of TS.

EXERCISE 18. Let S represent the dual transformation of a linear transformation T from an n-dimensional linear space V into an m-dimensional linear space W. And, let A = {aij} represent the matrix representation of T with respect to orthonormal bases C and D, and B = {bij} represent the matrix representation of S with respect to D and C. Using the result of Exercise 9 (or otherwise), show that B = A'.
Solution. Write X·Z for the inner product of arbitrary matrices X and Z in V, and U ∗ Y for the inner product of arbitrary matrices U and Y in W. And, let X1, ..., Xn represent the matrices that form the orthonormal basis C and Y1, ..., Ym the matrices that form the orthonormal basis D. Then, for i = 1, ..., m and j = 1, ..., n, it follows from the result of Exercise 9 that

aij = T(Xj) ∗ Yi and bji = S(Yi)·Xj,

implying that

bji = Xj·S(Yi) = T(Xj) ∗ Yi = aij

and hence that the jith element of B equals the jith element of A'. Thus, B = A'.

EXERCISE 19. Let A represent an m x n matrix, let V represent an n x n symmetric positive definite matrix, and let W represent an m x m symmetric positive definite matrix. And, let S represent the dual transformation of the linear transformation T from R^{n×1} into R^{m×1} defined by T(x) = Ax (where x is an arbitrary n x 1 vector). Taking the inner product of arbitrary vectors x and z in R^{n×1} to be x'Vz and the inner product of arbitrary vectors u and y in R^{m×1} to be u'Wy, obtain a formula for S(y) that generalizes the formula S(y) = A'y (y ∈ R^{m×1}) obtained in the special case of the usual inner products (i.e., in the special case where V = In and W = Im).
Solution. For every vector x in R^{n×1} and every vector y in R^{m×1},

x'VS(y) = (Ax)'Wy = x'A'Wy.

And, in light of the uniqueness of S,

S(y) = V^{-1}A'Wy.
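A quick numerical verification of the defining property of the dual under these weighted inner products (not part of the original solution); the dimensions and random seed are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 3, 4
A = rng.standard_normal((m, n))
MV = rng.standard_normal((n, n)); V = MV @ MV.T + np.eye(n)   # symmetric p.d.
MW = rng.standard_normal((m, m)); W = MW @ MW.T + np.eye(m)   # symmetric p.d.

def S(y):
    # Dual of T(x) = Ax under the inner products x'Vz and u'Wy.
    return np.linalg.solve(V, A.T @ W @ y)

# Defining property of the dual: <x, S(y)>_V = <T(x), y>_W for all x, y.
for _ in range(100):
    x, y = rng.standard_normal(n), rng.standard_normal(m)
    assert np.isclose(x @ V @ S(y), (A @ x) @ W @ y)
```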

EXERCISE 20. Let S represent the dual transformation of a linear transformation T from a linear space V into a linear space W.
(a) Show that [S(W)]⊥ = N(T) (i.e., that the orthogonal complement of the range space of S equals the null space of T).
(b) Using the result of Part (c) of Exercise 17 (or otherwise), show that [N(S)]⊥ = T(V) (i.e., that the orthogonal complement of the null space of S equals the range space of T).
(c) Show that rank S = rank T.
Solution. (a) Write X·Z for the inner product of arbitrary matrices X and Z in V and U ∗ Y for the inner product of arbitrary matrices U and Y in W.
Let X represent an arbitrary matrix in V. Suppose that X ∈ N(T). Then, for every matrix Y in W,

X·S(Y) = T(X) ∗ Y = 0 ∗ Y = 0.

Thus, X ∈ [S(W)]⊥.
Conversely, suppose that X ∈ [S(W)]⊥. Then, since T(X) ∈ W,

T(X) ∗ T(X) = X·S[T(X)] = 0,

implying that T(X) = 0 and hence that X ∈ N(T).
Thus, [S(W)]⊥ = N(T).
(b) Since [according to Part (c) of Exercise 17] T is the dual transformation of S, it follows from Part (a) that [T(V)]⊥ = N(S). And, making use of Theorem 12.5.4, we find that

[N(S)]⊥ = {[T(V)]⊥}⊥ = T(V).

(c) Making use of Corollary 22.5.3 and Theorem 12.5.12 [together with the result of Part (a)], we find that

rank T = dim(V) - dim[N(T)]
= dim(V) - dim{[S(W)]⊥}
= dim(V) - {dim(V) - dim[S(W)]}
= dim[S(W)]
= rank S.
Index

adjoint matrix, 72, 75, 232 of a product, 27


determinant of, see under determi- of a sum, 198
nant orthogonal complement of, see un-
differentiation of, see under differ- der orthogonal complement
entiation sum of, 161
eigenvalues of, see under eigenvalue(s) union of, 161
eigenvectors of, see under eigen-
vector(s) decomposition
of a product, 77 Cholesky, 90, 91
algebraic multiplicity, 236 LDU, 87-91,98, 144
of zero, 235, 236 of a nonnegative definite matrix, 86
of a symmetric matrix, 30, 83
basis, 15
QR, 24, 90, 91
orthonormal, 22, 24, 31
singular value, 245, 246
bilinear form, 79
spectral, 239, 241
Binet-Cauchy formula, 77
U'DU, 87
Cayley-Hamilton theorem, 232, 234 determinant, 71, 87
characteristic polynomial, 232, 233, 243 differentiation of, see under differ-
of an orthogonal matrix, 242 entiation
cofactor matrix, 72 effect of elementary row or column
cofactor(s),74 operations on, 70
expansion by, see under determi- expansion by cofactors, 71. 74
nant of a partitioned matrix, 71, 76
column space(s), 13, 15,31,55, 167 of a positive definite matrix, 100
essential disjointness of, see under of a product, see Binet-Cauchy for-
essential disjointness mula
intersection of, 161 of an adjoint matrix, 72

of an inverse matrix, 71 essential disjointness (of subspaces), 252


of R + STU, 179, 180 as applied to row and column spaces,
of Vandermonde matrix, 77 168,198
diagonalization, 238
of a transposed matrix. 237 function (continuously differentiable), 113
of an inverse matrix, 237
simultaneous, 246 generalized eigenvalue problem, 249, 250
differentiation generalized inverse, 35, 36, 91, 193,200
chain rule for, 122-124 alternative characterizations for, 35,
of a determinant, 124 37
of a Kronecker product, 158 existence of, 35
of a log of a determinant, 125-128, minimum norm, 223
131,132 nonnegative definite, 87
of a power of a determinant, 124 of a block-diagonal matrix, 39,45
of a power of a function, 115 of a partitioned matrix, 39,41,42,
of a product of a scalar and a vector, 45,46,52,97,169,170,222
116 of a product, 51, 106
of a projection matrix, 135. 136 of a scalar mUltiple, 38
of a trace of a power, 119, 120 of a Schur complement, 46
of a trace of a product, 116, 117, of a submatrix, 45, 46
120 of A'A, 36
of a vec of a Kronecker product, ofR+STU, 185, 186
158 reflexive, 222
of a vec of a matrix power, 157 geometric multiplicity
of an adjoint matrix, 129 of one, 242
of an idempotent matrix, 115 of zero, 236
of an inverse matrix, 130-132 Gram matrix, 81
with respect to a matrix or sym- Gram-Schmidt orthogonalization, 22-24,
metric matrix, 122, 125, 126, 66
128, 130-132 Gramian,81
distance, 22
duplication matrix, 154-156 Hadamard product, 95
left inverse of, 154-156 Hessian matrix, 114

eigenvalue(s), 231, 243 index of inertia, 83


algebraic multiplicity of, see alge- inner product, 102, 105, 147,251
braic multiplicity quasi, 105
geometric multiplicity of, see geo- inverse, 29, 30, 74, 234
metric multiplicity determinant of, see under determi-
not necessarily distinct, 238, 240 nant
of a skew-symmetric matrix, 231 diagonalization of, see under diag-
of an adjoint matrix, 242 onalization
of an idempotent matrix, 242 differentiation of, see under differ-
of an orthogonal matrix, 242 entiation
of Moore-Penrose inverse, 241 of a 2 x 2 matrix, 73
eigenvector(s),243 of a block-triangular matrix, 31
linear combination of, 237 of a positive definite matrix, 82, 182
of an adjoint matrix, 242 of a sum or difference, 184, 187,
elimination matrix, 154 188, 191

ofR+STU, 184 linear transformation(s), 252


dual, 261, 263
Kronecker product, 140, 156 identity, 258, 261
differentiation of, see under differ- matrix representation of, 255, 256,
entiation 258,260,261,263
generalized inverse of, 141 null space of, 260, 264
involving a diagonal matrix, 156 one to one, 251
involving a partitioned matrix, 143 product of, 253, 254, 262
involving a sum (or sums), 139 range space of, 264
involving a triangular matrix, 144, rank of, 264
156 restriction of, 251
involving a vector, 140, 151, 154 scalar multiple of, 253, 262
LDU decomposition of, 144 sum of, 253, 262
nonnegative definite, 142 zero, 261
norm of, 143
of idempotent matrices, 140 matrix (or matrices)
of orthogonal matrices, 140 commutativity of, 3
of projection matrices, 141 congruence of, 83
positive definite, 142 diagonal,7
projection matrix for column space diagonally dominant, 98
of, see under projection ma- difference between, 3
trix idempotent, 49, 50, 82, 115, 140,
vec of, see under vec 146,189-192,195,226
invertible, 29
left inverse, 29 involutory, 29, 50
of duplication matrix, see under du- negative definite, 80, 83, 142
plication matrix negative semidefinite, 80
linear dependence, II, 12 nonnegative definite, 84, 90, 93, 102,
linear independence, II, 12, 145 142,180,181,188-190,195,
linear space(s), 13 198,250
basis for, see basis 2 x 2, 102
isomorphic, 260 partitioned, 96, 97
subspace(s) of, see under subspace(s) sum of, 93
linear system(s) nonpositive definite, 80, 142
absorption in, 211 nonsingular, 30, 98, 135
augmented, 60 nonsymmetric, 13
consistent, 58 of the form V + XUX', 209, 212
Cramer's rule for, 75 orthogonal, 30, 31, 49, 140, 146,
equivalence of, 58, 157,210 221
homogeneous, 55 permutation, 149
inconsistent, 58 positive definite, 80, 83, 84, 98,101,
invariance to choice of solution of, 105,135,142,191,250
60 2 x 2, 101
linear combination of solutions of, product of, 94
55,56 positive semidefinite, 80, 90
nonhomogeneous, 56 nonsingular, 80
of the form X'XB = X', 65 partitioned, 96
solution set of, 56, 58 power of, 4
solution to, 57, 75 product of, 1-3,27

scalar multiple of, I determinant of, see under determi-


similarity of, see similarity nant
singular, 72 generalized inverse of, see under
skew-symmetric, 92, 93,183 generalized inverse
submatrix of, see submatrix in a Kronecker product, see under
sum of, 1-3 Kronecker product
symmetric, 3, 7,20,36, 72 nonnegative definite, see under ma-
transpose of, 4, 9 trix (or matrices): nonnegative
triangular, 31 definite
lower,4,144 positive semidefinite, see under ma-
upper, 4, 7,8,13,79,144 trix (or matrices): positive semidef-
minimization (of a 2nd-degree polyno- inite
mial),209 product of, 9
subject to linear constraints, 214, rank of, see under rank
216-218,225,248 row space of, see under row space(s)
transformation from constrained to Schur complement in, see Schur com-
unconstrained,214,227 plement
Moore-Penrose conditions, 223 transpose of, 9
Moore-Penrose inverse, 221, 223, 225 positive or negative pair (of matrix ele-
eigenvalues of, see under eigenvalue(s) ments),69
of a nonnegative definite matrix, 228 projection, 64
of a product, 221, 225 along a subspace, 173, 177
of a sum, 224
of a symmetric nonnegative defi- on an orthogonal complement, see
nite matrix, 226 under orthogonal complement
projection matrix, 66, 108-110,213
differentiation of, see under differ-
neighborhood, 113
entiation
norm, 21
for column space of a Kronecker
limit of, 187
product, 141
quasi,105
for one subspace along another, 173-
usual, 143
175,193,212
normal equations, 65
in a Kronecker product, see under
null space, 112
Kronecker product

orthogonal complement, 177 quadratic form (matrix of), 79


dimension of, 112
of a column space, 112 rank, 15, 17,82
of a sum, 162 additivity of, 191, 192, 195, 199,
of an intersection, 162 203
projection on, 67, 112 full column, 30
orthogonality full row, 30
of2 subspaces, 63,162,177 of a difference, 200
of a matrix and a subspace, 63, 64, of a partitioned matrix, 17,51,53,
162 97,167,168,171,204
of a vector and a subspace, 107 of a product, 16, 27, 172
ofa sum, 197, 198
partitioned matrix (or matrices) of a triangular matrix, 31
block-triangular, 8 ofR + STU, 196,206

subtractivity of, 200 triangle inequality, 21, 22, 187


right inverse, 29, 174
row space(s), 13 vee, 114, 155, 157
of a partitioned matrix, 17 differentiation of, see under differ-
of a product, 27 entiation
ofa sum, 198 of a Kronecker product, 153
of an idempotent matrix, 146
Schur complement, 42, 46, 98 of an identity matrix, 145
generalized inverse of, see under of an orthogonal matrix, 146
generalized inverse vee-permutation matrix, 151-153,256
Schwarz inequality, 21 determinant of, 149
set recursive formula for, 149
interior point of, 114 vech, 154, 155, 157
open, 113, 135
span of, 14, 15, 163
similarity, 234, 235
to an idempotent matrix, 234
submatrix
generalized inverse of, see under
generalized inverse
principal, 7, 98
transpose of, 7
subspace(s), 14
direct sum of, 261
essential disjointness of, see essen-
tial disjointness
independence of, 164, 173, 177,203
intersection of, 163
invariant, 231, 261
orthogonality of, see orthogonality
projection along, see under projec-
tion
projection matrix for, see under pro-
jection matrix
sum of, 161-164, 170
union of, 161
sweep operation, 43

trace
differentiation of, see under differ-
entiation
of a product, 19, 94, 146
of a sum, 96
transformation(s)
inverse of, 254, 255
invertible, 254, 255
linear, see linear transformation(s)
product of, 255
scalar multiple of, 255