Академический Документы
Профессиональный Документы
Культура Документы
Number 9
l.
D. Montgomery and L. Zippin
TOPOLOGICAL TRANSFORMATION GROUPS
2.
Fritz John
PLANE WAVES AND SPHERICAL MEANS
Applied to Partial Differential Equations
3.
E. Artin
GEOMETRIC ALGEBRA
4.
R. D. Richtmyer
DIFFERENCE METHODS FOR INITIAL-VALUE PROBLEMS
5.
Serge Lang
INTRODUCTION TO ALGEBRAIC GEOMETRY
6.
Herbert Busemann
CONVEX SURFACES
7.
Serge Lang
ABELIAN VARIETIES
8.
S. M. Ulam
A COLLECTION OF MATHEMATICAL PROBLEMS
9.
I. M. Gel’fand
LECTURES 0N LINEAR ALGEBRA
The second edition differs from the first in two ways. Some of the
material was substantially revised and new material was added. The
major additions include two appendices at the end of the book dealing
with computational methods in linear algebra and the theory of pertur-
bations, a section on extremal properties of eigenvalues, and a section
on polynomial matrices (§§ 17 and 21). As for major revisions, the
chapter dealing with the Jordan canonical form of a linear transforma-
tion was entirely rewritten and Chapter IV was reworked. Minor
changes and additions were also made. The new text was written in colla-
boration with Z. Ia. Shapiro.
I wish to thank A. G. Kurosh for making available his lecture notes
on tensor algebra. I am grateful to S. V. Fomin for a number of valuable
comments. Finally, my thanks go to M. L. Tzeitlin for assistance in
the preparation of the manuscript and for a. number of suggestions.
vii
TABLE OF CONTENTS
Page
Preface to the second edition.
of the matrices Ham“ and |]b,.k|| we take the matrix Haik + bikll.
As the product of the number A and the matrix Hamil we take the
matrix lilaikll. It is easy to see that the above set R is now a
vector space.
It is natural to call the elements of a vector space vectors. The
fact that this term was used in Example 1 should not confuse the
reader. The geometric considerations associated with this word
will help us clarify and even predict a number of results.
If the numbers 1, pl, - - - involved in the definition of a vector
space are real, then the space is referred to as a real vector space. If
the numbers A, ,u, - - - are taken from the field of complex numbers,
then the space is referred to as a complex vector space.
More generally it may be assumed that Z, ,a, - - .’ are elements of an
arbitrary field K. Then R is called a vector space over the field K. Many
concepts and theorems dealt with in the sequel and, in particular, the
contents of this section apply to vector spaces over arbitrary fields. How-
ever, in chapter I we shall ordinarily assume that R is a real vector space.
X=§1e1+§292+"'+§nen
and
§1_§’1=£2—£’2=".=£n_£’n=0:
i.e.,
51:31; 52:52, "3 En=£,n'
then the numbers £1, £2, - - -, E" are called the coordinates of the vector
x relative to the basis e1, e2, - - ~, en.
Theorem 1 states that given a basis e1, e2, - - -, en of a vector
space R every vector x e R has a unique set of coordinates.
If the coordinates of x relative to the basis e1, e2, - - -, en are
51, £2, - - -, En and the coordinates of y relative to the same basis
are 171,172, - - - 17”, i.e., if
X=§1e1+52e2+"'+5nen
y = ’71e1 + 77292 + ' ' ' + men,
then
and then compute the coordinates 171, 172, - - -, 17,, of the vector
x = (.51, £2, - - -, 5") relative to the basis e1, e2, - - -, en. By definition
’71 = 51:
’71 ‘l‘ 772 = ‘52»
171+772+".+77n='§n'
Consequently,
Then
X=(1,£2,' 2:571)
E=(1,,-0 ,0)+52(0,1,---,0)+ -+.§n(0,0,---,1)
=£e1+£2eg+'--+Enen.
It follows that m the space R of n tuples (£1, £2, - - -, 5“) the numbers
n—DIMENSIONAL SPACES 9
the coordinates 171, ’72: - - -, 17,, of a vector x = (51. £2, - - -, E") are linear
combinations of the numbers 51, 5,, - - -, E”.
the vector
X, = 519,1 + Ezelz + ' ' ' + Enelm
i.e., a linear combination of the vectors e’i with the same coeffi-
cients as in (5).
This correspondence is one-to-one. Indeed, every vector x e R
has a unique representation of the form (5). This means that the
5,. are uniquely determined by the vector x. But then x’ is likewise
uniquely determined by x. By the same token every x’ eR’
determines one and only one vector x e R.
n—DIMENSIONAL SPACES 11
If we ignore null spaces, then the simplest vector spaces are one—
dimensional vector spaces. A basis of such a space is a single
vector e1 :fi 0. Thus a one-dimensional vector space consists of
all vectors ocel, where a is an arbitrary scalar.
Consider the set of vectors of the form x = x0 + ocel, where x0
and e1 75 0 are fixed vectors and ac ranges over all scalars. It is
natural to call this set of vectors ~— by analogy with three-
dimensional space — a line in the vector space R.
n~DIMENSIONAL SPACES 13
(a1, a2, - ~ ~, an are fixed numbers not all of which are zero) form a subspace
of dimension n — l.
2. Show that if two subspaces R1 and R2 of a vector space R have only
the null vector in common then the sum of their dimensions does not exceed
the dimension of R.
3. Show that the dimension of the subspace generated by the vectors
e, f, g, - - - is equal to the maximal number of linearly independent vectors
among the vectors e, f, g, - - -.
where the bi,c are the elements of the inverse of the matrix 421’.
Thus, the coordinates of a vector are transformed by means of a
matrix .93 which is the inverse of the transpose of the matrix .2! in (6)
which determines the change of basis.
§ 2. Euclidean space
(X, Y) = 12151472'
and the result is the Euclidean space of Example 2.
EXERCISE. Show that the matrix
0 1
1 0
cannot be used to define an inner product (the corresponding quadratic
form is not positive definite), and that the matrix
(1 i)
can be used to define an inner product satisfying the axioms 1 through 4.
n—DIMENSIONAL SPACES 17
(P. Q) = f P(t)Q(t)dt-
2. Length of a vector. Angle between two vectors. We shall now
make use of the concept of an inner product to define the length
of a vector and the angle between two vectors.
DEFINITION 2. By the length of a vector x in Euclidean space we
mean the number
(4) x/ (X, X)-
We shall denote the length of a vector x by the symbol [X].
It is quite natural to require that the definitions of length of a
vector, of the angle between two vectors and of the inner product
of two vectors imply the usual relation which connects these
quantities. In other words, it is natural to require that the inner
product of two vectors be equal to the product of the lengths of
these vectors times the cosine of the angle between them. This
dictates the following definition of the concept of angle between
two vectors.
cosgo = (x, y)
IXI IYI
If (p is to be always computable from this relation we must show
that
V
_1<(£'_¥_
_ 1
II/\
J
xl [Y
or, equivalently, that
(x, y)2
ll/\
’
IXI2 [YI2
which, in turn, is the same as
(X, Y) = 2 £17k-
i=1
It follows that
7| 11
(2) “in: = “M
and
15
(3) 2 when a 0
i,k=1
for any choice of the 5‘. Hence (6) implies that
If the numbers a” satisfy conditions (2) and (3), then the following inequality
holds:
1| 2 n 71
EXERCISE. Show that if the numbers a” satisfy conditions (2) and (3),
a“:2 g awakk. (Hint: Assign suitable values to the numbers El, 5,, - - -, 5,,
and 171, 17,, - - -, 17,, in the inequality just derived.)
(7) IX + YI é l + [YI-
n—DIMENSIONAL SPACES 21
Proof:
IX+ylz= (X+y,X+y) = (x,X) + 2(X,y) + (353').
Since 2(x, y) g 2|x| |y|, it follows that
IX+yl2 = (X+y, x+y) g (x, X)+2IX| [YI+(y, y) = (IXI+Iy|)2,
i.e., [x + y| g [XI + lyl, which is the desired conclusion.
EXERCISE. Interpret inequality (7) in each of the concrete Euclidean
spaces considered in the beginning of this section.
1 if r’ = k
(1) (6i: ek) = {0 if ‘L ¢ k.
(4) lie—1 = _(fk’ e1)/(e1: e1): AIc—2 = —(fk: e2)/(e2: e2): ' ' ':
11 = — (fk) ek—1)/(ek—1’ ek—l)‘
So far we have not made use of the linear independence of the
vectors f1, f2, - - -, f”, but we shall make use of this fact presently
to prove that e,c 7’: 0. The vector ek is a linear combination of the
vectors e1, e2, - - -, ek_1, fk. But e,c_1 can be written as a linear
combination of the vector f,c_1 and the vectors e1, e2, - - -, e,,_2.
Similar statements hold for ek_2, ek_3, - - -, e1. It follows that
0: (t+ot-l,1)=f11(t+oc)dt=2oc,
(x, Y) = (51e1 + 5232 + ' ° ° + Enen-v ’7191 + 77292 + ' ' ' + me”)-
Since
(e e)—{1 if i=k
5’ " _ 0 if i¢ k,
n—DIMENSIONAL SPACES 25
it follows that
(XY) = 2 “4155.471“
12,Ic=1
where a“, = aM and £1, £3, - - -, 5,, and 17,, 17,, - - ~, 17,, are the coordinates of
x and y respectively.
2. Show that if in some basis f1, f,, - - -, f,I
(X: e1) = 51(91, el) + 52(e2 e1) + ° ' ' + §n(en’ e1) = 51
and, similarly,
c; = :1Q(t)P,-(t) dt.
2. Consider the system of functions
(8) 1, cos t, sin t, cos 2t, sin 2t, - . -, cos ht, sin rtt,
on the interval (0, 27;). A linear combination
P(t) = (ac/2) + a1 cost + blsint + a2 cos 2t + - - - + b” sin nt
of these functions is called a trigonometric polynomial of degree rt. The
totality of trigonometric polynomials of degree rt form a (2r; + 1) -dimen-
sional space R1. We define an inner product in R1 by the usual integral
2n
(P. Q) = o Puma) a
It is easy to see that the system (8) is an orthogonal basis. Indeed
2
Jo" cos kt cos t: dt = 0 if k ya I,
2n
J'o sin kt cos a dt = o,
2
fonsin kt sin lt dt = 0, if k 7&1.
Since
211 _ 2n 2n
f sm2 kt dt = f cos” kt dt = n and [ ldt = 2n,
0 o o
it follows that the functions
(8’) 1/1/27”: (l/Vn) cos t, (IA/n) sin t, - - -, (l/Vn) cos at, (IA/n) sin nt
are an orthonormal basis for R1.
If — in > If — fol-
Indeed, as a difference of two vectors in R1, the vector f0 — f1
belongs to R1 and is therefore orthogonal to h = f — f0. By the
theorem of Pythagoras
|f — fol2 + Ifo — f1]2 = If — fo + fo — f1]2 = If — fllz,
so that
If — fll > If — fol-
We shall now show how one can actually compute the orthogo—
nal projection f0 of f on the subspace R1 (i.e., how to drop a
perpendicular from f on R1). Let e1, e2, - - ~, em be a basis of R1.
As a vector in R1, f0 must be of the form
(11) 61(81, etc) + 02(82, etc) + ' ' ' + cm(em’ eh)
= (f, ek) (k = 1, 2, - - -,m),
We first consider the frequent case when the vectors e1, e2, - - -,
em are orthonormal. In this case the problem can be solved with
ease. Indeed, in such a basis the system (11) goes over into the
system
y=c1x1+...+cmxmr
n-DIMENSIONAL SPACES 29
(e1: e1)01 + (92’ e1)02 + ' ' ' + (em! e1)Cm = (f: e1)»
(15) (e1) e2)cl + (32’ e2)02 + ' ' ' + (em: e2)cm = (f: e2),
Solution: e1 = (2, 3, 4), f = (3, 4, 5). In this case the normal system
consists of the single equation
xnc = 3/71!
(x,y) = 131
c = — —— xkyk .
(XX) "
n—DIMENSIONAL SPACES 31
1 cost sint
eo=——-: el=—: ez= ;
1/2” V7: V7;
cos nt sin nt
e 2n _ 1 = —;
Vfl e 2'» = v”
1 Zn 1 2v
(:0 = _f fmdt; c”_1 = —— f(t) cos kt dt;
0 x/fl o
1 2n .
c” = W] f(t) sm kt dt.
0
1 2n
bk = — f(t) sin kt dt.
7t 0
The numbers ak and b k defined above are called the Fourier coefficients of
the function f(t).
in R the vector
x, =(51: £21 . I .’ En)
in R’.
We now show that this correspondence is an isomorphism.
The one-to-one nature of this correspondence is obvious.
Conditions 1 and 2 are also immediately seen to hold. It remains
to prove that our correspondence satisfies condition 3 of the defini—
tion of isomorphism, i.e., that the inner products of corresponding
pairs of vectors have the same value. Clearly,
Thus
(X', Y') = (x, Y):
i.e., the inner products of corresponding pairs of vectors have
indeed the same value.
This completes the proof of our theorem.
EXERCISE. Prove this theorem by a method analogous to that used in
para. 4, § 1.
Further, let
EXERCISE. Show that if five) and g(y) are linear functions, then their
product [(x) - g(y) is a bilinear function.
FM:
A(X; Y) = A(ei§ ek)§i77k»
i 1
=2 aik‘Eink-
i, h=1
and compute the matrix .2! of the bilinear form A (x; y). Making use of (4)
we find that:
“11:1-1+2- 1-1+3 1- 1:6,
a12=a21=11+2-1-1+3-1- (—1)=0,
an=1----—1+211+3( 1-)(—1) .
aia=aai=l'1+2'1'_( )+3'1'(‘)=‘-4:
“23=a32=1'1+2'1'_( )+3' (_1)'(_1)=2,
“33:1'1+2(—1)'“(1')+3(—1)'(_1)=6,
i.e.,
6 0 —
.2! = 0 6 2 .
—4 2 6
It follows that if the coordinates of x and y relative to the basis e1, eg, e3
are denoted by 5’1, 5”,, 5’3, and 17’1, 1;", 17’s, respectively, then
A (X; Y) = 65/177’1 "— 45/1713 + 65,277I2 ‘1‘ 25,2173 — 45,377,; + 25,377’2 + 65/377,3-
n-DIMENSIONAL SPACES 39
0n: = Z ambula-
a=1
Using this definition twice one can show that if .0) = .2199, then
it
due: 2 “iabaflcfik-
¢.fi=1
40 LECTURES ON LINEAR ALGEBRA
n
(7*) beta = 2 c/m'atkckq-
i,k=1
5. Quadratic forms
DEFINITION 4. Let A(x; y) be a symmetric bilinear form. The
function A (x; x) obtained from A (x; y) by putting y = x is called
a quadratic form.
A (x; y) is referred to as the bilinear form polar to the quadratic
form A(x; x).
The requirement of Definition 4 that A (x; y) be a symmetric
form is justified by the following result which would be invalid if
this requirement were dropped.
THEOREM 1. The polar form A (x; y) is uniquely determined by its
quadratic form.
Proof: The definition of a bilinear form implies that
A(X + y; X + y) = A(X; X) + A(X; y) + A(y; X) + A(y; y)-
Hence in View of the symmetry of A (x; y) (i.e., in View of the
equality A (X; y) = A (y; X)),
A(X; Y) = %[A(X + y; X + y) — A(X; X) — A(y; y)]-
Since the right side of the above equation involves only values of
the quadratic form A(X; x), it follows that A(X; y) is indeed
uniquely determined by A(x; x).
To show the essential nature of the symmetry requirement in
the above result we need only observe that if A (x; y) is any (not
necessarily symmetric) bilinear form, then A (x; y) as well as the
symmetric bilinear form
A1(X; y) = l[1‘1(X;y) + A(y; X)]
n—DIMENSIONAL SPACES 41
A(X;Y) = 2 aikginkr
i,7c=1
where a“c = a“. It follows that relative to a given basis every
quadratic form A(x; x) can be expressed as follows:
it
A(x;x)=.§1a¢k§i£k, “m = “7a-
1, =
(7*) b“ = Z c’piaikckq.
i,k=1
Using matrix notation we can state that
(7) 33’ = g’selg.
Thus, ifsz/ is the matrix of a bilinear form A (x; y) relative to the
basis e1, e2, - - -, en and 33 its matrix relative to the basis f1 , f2, - - -, f,"
then fl = Whig, where (g is the matrix of transition from e1,
e2, - ~ -, en to f1, f2, - - -, f" and ig’ is the transpose of g.
5. Quadratic forms
DEFINITION 4. Let A(X; y) be a symmetric bilinear form. The
function A (x; x) obtained from A (x; y) by putting y = X is called
a quadratic form.
A (x; y) is referred to as the bilinear form polar to the quadratic
form A(X; x).
The requirement of Definition 4 that A (x; y) be a symmetric
form is justified by the following result which would be invalid if
this requirement were dropped.
THEOREM 1. The polar form A (x; y) is uniquely determined by its
quadratic form.
Proof: The definition of a bilinear form implies that
A(x + y; x + Y) = A(X; X) + A(X;y) + A(y; X) + A(y;y).
Hence in View of the symmetry of A (x; y) (i.e., in View of the
equality A(X; y) = A(y; x)),
A(X; y) = %[A(x + y; x + y) — A(X; X) — A(y; y)l-
Since the right side of the above equation involves only values of
the quadratic form A (x; x), it follows that A (x; y) is indeed
uniquely determined by A(X; x).
To show the essential nature of the symmetry requirement in
the above result we need only observe that if A (x; y) is any (not
necessarily symmetric) bilinear form, then A (x; y) as well as the
symmetric bilinear form
A(X} Y) = 2 aik‘fink:
1:, Ic=1
where a“c = a“. It follows that relative to a given basis every
quadratic form A(X; x) can be expressed as follows:
7;
(2) A(X,X)=-I’:Z_ldik7]i1]k,
where the dots stand for a sum of terms in the variables 172, - - - 17”.
If we put
After a finite number of steps of the type just described our ex-
pression will finally take the form
A(X; X) = 11512 + 12522 + ' ' ' + lmémz’
where m g n.
We leave it as an exercise for the reader to write out the basis
transformation corresponding to each of the coordinate transfor-
mations utilized in the process of reduction of A (x; x) (cf. para. 6,
§ 1) and to see that each change leads from basis to basis, i.e., to
rt linearly independent vectors.
Ifm<ri,weputhm+1=~--=h,,=0. Wemaynowsumup
our conclusions as follows:
THEOREM 1. Let A (x; x) be a quadratic form in an rt~dimehsional
space R. Then there exists a basis e1, e2, - ' -, en of R relative to
which A(x; x) has the form
A(X, X) = 11512 + A24:22 + ' ' ' + 17157}:
where 51, 52, - - -, 6,, are the coordinates of x relative to e1, e2, - - -, en.
We shall now give an example illustrating the above method of reducing
a quadratic form to a sum of squares. Thus let A (x; x) be a quadratic form
in three-dimensional space which is defined, relative to some basis f1, f,, f,,
by the equation
A (X; X) = 2mm + 4171173 — m” — 8m“-
If
’71 = 77,2:
772 = 77,1:
773 = 77,3:
then
A (X; x) = “77,12 + 277’1W’2 + 477,277,: — 877,32-
Again, if
771* = — 77,1 + 77,2
772* = 77,2:
773* = ”,3,
then
A(x5 x) = _m*s + 77:“ + 4712*773* _ 8773“-
Finally, if
51 = 771*:
£2 = 172* + 2773*.
E: = 773*:
n-DIMENSIONAL SPACES 45
etc., we can express £1, £2, - - -, 5,, in terms of 171, 171, - - -, 17,, in the
form
51 = 011771 + 012772 + ' ' ' + 011977.
‘52 = 021771 + 022772 + ' ' ' + 521.771;
...........................
£2 = 771 + 2173;
53 = 773-
. “11 “12 . .
A1=“115é0: A2: 3’50; "3
a 21 a 22
(1)
“11 “12 “11»
A” = “21 “22 “2n 75 0
e1 = “11f1’
(3) e2 = “21f1 + “2J2:
“k1A(f1c5 f1) ‘l‘ 0‘s (fie; f2) + ' ' ' ‘l' 0cIckA (fro; fie) = 1-
The determinant of this system is equal to
A (ek; 9k) = A (ek; “kifi ‘l‘ “mt-2 + ' ' ' + 0%]c
= “MA (91c; f1) + “k2Alek; f2) + ' ‘ ' ‘l' “IckA(eki flc)’
which in View of (4) and (5) is the same as
Aye—1
o‘-1ck:=—:
Ak
Thus
zit—1
bkk = A (ek; ek) =
M
To sum up:
THEOREM 1. Let A (x; x) be a quadratic form defined relative to
some basis f1, f2, - - -, fn by the equation
be all different from zero. Then there exists a basis e1, e2, - - -, e”
relative to which A (x; x) is expressed as a sum of squares,
A A A-
A(X:X)=A—:£12+A—:§22+-~+ 515,3.
Here 51, £2, ' ~ -, En are the coordinates ofx in the basis e1, e2, - - -, en.
This method of reducing a quadratic form to a sum of squares is
known as the method of Jacobi.
REMARK: The fact that in the proof of the above theorem we
were led to a definite basis el, e2, - - -, e,| in which the quadratic
form is expressed as a sum of squares does not mean that this basis
is unique. In fact, if one were to start out with another basis
£1,152, - - -,fn (or if one were simply to permute the vectors f1,
f2, - - -, f") one would be led to another basis e1, e2, - ~ -, e".
Also, it should be pointed out that the vectors e1, e2, - ~ -, en need
not have the form (3).
EXAMPLE. Consider the quadratic form
The determinants A1, A 2, A aare 2, —l, — £1, i.e., none of them vanishes.
Thus our theorem may be applied to the quadratic form at hand. Let
A (e; n) = 1.
i.e., 20111 = 1, or an = % and
e1 = #1 = 0%, 0, 0).
Next an and a“ are determined from the equations
whence
and
— 3 __.1_2 _1_ — _3_ _H L
e3‘fif1 17f2+17f3‘(17’ 17'17)'
Relative to the basis e1, e,, ea our quadratic form becomes
1 A A
A(x; x) = 41—14-12 + [:52 + [:52 = £112 — 8622 + 11—74-32.
Here :1, L}, {'3 are the coordinates of the vector x in the basis e1, e,, e,.
n—DIMENSIONAL SPACES 51
It is clear that if A,._1 and A,- have the same sign then the coefficient
of 552 is positive and that if A.,._1 and A. have opposite signs, then
this coefficient is negative. Hence,
THEOREM 2. The number of negative coefficients which appear in
the canonical form (8) of a quadratic form is equal to the number of
changes of sign in the sequence
1, A1, A2, - - -, A".
A(x;x> = z w: 0 i=1
is equivalent to
£1=EZ="'=§"=0_
In other words,
If A1 > 0, A2 > 0, - - -, An > 0, then the quadratic form A(x; x)
is positive definite.
52 LECTURES ON LINEAR ALGEBRA
Alfllfl + (“21.2 + ' ' ' + Iukfk; M1f1 + sz + ' ' ' + Mafia) = 0-
In view of the fact that ,alfl + ,uzfz + - - - + :ukfk 7+_ 0, the latter
equality is incompatible with the assumed positive definite nature
of our form.
The fact that Ah 7E 0 (k = 1, - - -, n) combined with Theorem 1
permits us to conclude that it is possible to express A (x; x) in the
form
A(x; X) = 11512 + 12522 + . . . + 1251}, 110 = Ala—1
Ah '
Since for a positive definite quadratic form all ilk > 0, it follows
that all Ak > 0 (we recall that A0 = 1).
We have thus proved
THEOREM 3. Let A (x; y) be a symmetric bilinear form and
f1, f2, - - -, f" , a basis of the n-dirnensional space R. For the quadratic
form A (x; x) to be positive definite it is necessary and sufficient that
A1>0,A2>0,---,An>0.
This theorem is known as the Sylvester criterion for a quadratic
form to be positive definite.
rt-DIMENSIONAL SPACES 53
131 x2 x3
1) = 3/1 3/2 3/3 -
3'1 3: zs
<9) if..y.‘...'.'f..y.*. ’
wl 1”: wk
where the as, are coordinates of x in some orthogonal basis, the y; are the
coordinates of y in that basis, etc.
(It is clear that the space R need not be k-dimensional. R may, indeed,
be even infinite-dimensional since our considerations involve only the
subspace generated by the k vectors x, y, - - -, w.)
By analogy with the three—dimensional case, the determinant (9) is
referred to as the volume of the k-dimensional parallelepiped determined by
the vectors x, y, - - -, w.
3. In the space of functions (Example 4, § 2) the Gramm determinant
takes the form
(”mom
VG
fabmtww [abutment
A: fabfg(t)f1(t)dt Lbfln(t)dt fab/annoy;
b b
L comm f run/malt
and the theorem just proved implies that:
The Gramm determinant of a system of functions is always g 0. For a
system of functions to be linearly dependent it is necessary and sufficient that
their Gramm determinant vanish.
Hark“;
A1 = “11: A2—
_ all a 12 ’ .’
“21 “22
are different from zero. Then, as was shown in para. 2, § 6, all 1.,
in formula (1) are different from zero and the number of positive
coefficients obtained after reduction of A (X; x) to a sum of squares
by the method described in that section is equal to the number of
changes of sign in the sequence 1, 411,412, - - -, A”.
Now, suppose some other basis e’1,e’2, - - -, e’n were chosen.
Then a certain matrix Ha’mH would take the place of llaikll and
certain determinants
All! A’z’ . . .’ Aln
would replace the determinants A1, A2, - - -, A”. There arises the
question of the connection (if any) between the number of changes
of sign in the squences 1, A’1,A’2, - - -,A',, and 1, A1, A2, - - -, A”.
The following theorem, known as the law of inertia of quadratic
forms, answers the question just raised.
Theorem 1 states that the number of positive 1,. in (1) and the
number of negative 1,. in (1) are invariants of the quadratic form.
Since the total number of the h; is n, it follows that the number of
coefficients 1,. which vanish is also an invariant of the form.
We first prove the following lemma:
LEMMA. Let R’ and R” be two subspaces of an n—climensional
space R of dimension k and l, respectively, and let k + l > n. Then
there exists a vector X 7i 0 contained in R’ n R”.
Proof: Let e1, e2, - - -, ek be a basis of R’ and f1, f2, - - -, fl,
basis of R”. The vectors e1, e2, - - -, ek, f1, f2, - - -, f, are linearly
dependent (k + l > n). This means that there exist numbers
2.1, 2.2, - - -,}.,c,,u1, ,uz, - - ', ,ul not all zero such that
liei+lze2+"'+lkek+flif1+lu2f2+"'+sz=0:
i.e.,
1191 + A292 + ' ' ' + lkek= —1u'1f1 *sz — ' ' ° —Hzfi-
Let us put
(2) A (X; X) = 512 + 522 ‘l‘ ' ' ' + 592 — 5294.1 — §2r+2 _ ' ° ' _ 5222“-
(Here 51,52, - - -, E” are the coordinates of the vector x, i.e.,
x = 5191 ‘l‘ £2e2 + ' ° ' ‘l' Epe9+£m+1ep+1+n' +§p+qep+q + ' ' '
+§nen.) Let f1, f2, - - -, fn be another basis relative to which the
quadratic form becomes
(3) A(X: x) = 1712 + 1722 + ' ' ' + ’72:»! — 712w+1 — ° ' ' — 1729'”,-
(Here 771, 772’ - - -, 17” are the coordinates of x relative to the basis
f1, f2, - - -, fn.) We must show that j) = 15’ and q = q’. Assume
that this is false and that It: > p’, say.
Let R’ be the subspace spanned by the vectors e1, e2, - - -, e9.
58 LECTURES 0N LINEAR ALGEBRA
(4) A<x;x>=£12+522+~-+§:>0
(since not all the E, vanish) and, on the other hand,
We shall now try to get a better insight into the space R0.
If f1, f2, - - -, fn is a basis of R, then for a vector
7711f"
(6) y = ’rllfl -|— 7721.2 + . . . +
to belong to the null space of A (x; y) it suffices that
_
(7) A(fi;Y)=0 fori=1,2’...’n
Replacing y in (7) by (6) we obtain the following system of
equations:
A (f1; "71f1 + 7721.2 + ' ' ' + finfn) = 0,
Alfz; 77111 ‘l‘ ’72f2 + ' ’ ' + mfn) = O
(x, Y) = 2 aikéifik,
1-, Ic=1
62 LECTURES ON LINEAR ALGEBRA
where a“c are given complex numbers satisfying the following two
conditions:
(fa). gm) = f: mm a.
By the length of a vector x in a unitary space we shall mean the
number V (x, x). Axiom 4 implies that the length of a vector is
non-negative and is equal to zero only if the vector is the zero
vector.
Two vectors x and y are said to be orthogonal if (x, y) = 0.
Since the inner product of two vectors is, in general, not a real
number, we do not introduce the concept of angle between two
vectors.
3. Orthogonal basis. Isomorphisrn of unitary spaces. By an
orthogonal basis in an n-dimensional unitary space we mean a set
of n pairwise orthogonal non-zero vectors e1, e2, - ~ -, e,,. As in § 3
we prove that the vectors e1, e2, - - -, e,, are linearly independent,
i.e., that they form a basis.
The existence of an orthogonal basis in an n-dimensional unitary
space is demonstrated by means of a procedure analogous to the
orthogonalization procedure described in § 3.
If e1, e2, - - -, e" is an orthonormal basis and
X=§1e1+§2ez+"°+§nem y=7liei+772e2+"'+’7nen
are two vectors, then
(X, Y) =(Ele1 + $282 + . ' . + inert) 97181 + 17282 + . ' ' + 771.91.)
X=£1e1+§2e2+ -'--+£,.e,..
rt-DIMENSIONAL SPACES 63
then
(x: ei) = (51e1 + $292 + ' ' ' + Enem ei) = 51(91: ei)
+ 52(32: e71) + ' ' ' + Side": er):
so that
where 5,. are the coordinates of the vector x relative to the basis
e1, e2, - - -, e1, and al- are constants, a,- = f(e,-), and that every
linear function of the second kind can be written in the form
A(X; y) = Z “mfifik
i,k=1
then
A(x; Y) = 1405191 + 5292 + ' ' ' Enen; 77191 + 772% + ' ' ' + me")
" Note that A(x; 1y) = IA (x; y), so that, in particular, A(x;11y)
= —iA(X; y)-
66 LECTURES ON LINEAR ALGEBRA
a“, = a“, then the same must be true for the matrix of this form
relative to any other basis. Indeed, a“ = a)“. relative to some basis
implies that A (x; y) is a Hermitian bilinear form; but then aik=ziu
relative to any other basis.
If a bilinear form is Hermitian, then the associated quadratic
form is also called Hermitian. The following result holds:
For a bilinear form A(x; y) to be Hermitian it is necessary
and sufficient that A(x; x) be real for every vector x.
Proof: Let the form A(x; y) be Hermitian; i.e., let A(x;y)
= A(y; x). Then A(x; x) = A (x; x), so that the number
A(x; x) is real. Conversely, if A (x; x) is real for al x, then, in
particular, A (x + y; x + y), A (x + iy; x + iy),A (x — y; x—y),
A (x — iy; x — iy) are all real and it is easy to see from formulas
(1) and (2) that A(x; y) = A(Y;X).
COROLLARY. A quadratic form is Hermitian if and only ifit is real
valued.
The proof is a direct consequence of the fact just proved that for
a bilinear form to be Hermitian it is necessary and sufficient that
A (x; x) be real for all x.
One example of a Hermitian quadratic form is the form
fl = (KHz/g.
n—DIMENSIONAL SPACES 67
X=§1ei+§2ez+'°'+§nen
is an arbitrary vector, then
where a“, = A (ei; ek), are all different from zero. Then just as in
§ 6, we can write down formulas for finding a basis relative to
which the quadratic form is represented by a sum of squares.
These formulas are identical with (3) and (6) of § 6. Relative to
such a basis the quadratic form is given by
A A A
A
(x,- x) = _0 A2 rel 2 + . . . + A""—1 lei 2,
A1 I511 2 + _l
where A0 = 1. This implies, among others, that the determinants
A1, A2, - - -, A” are real. To see this we recall that if a Hermitian
quadratic form is reduced to the canonical form (3), then the
coefficients are equal to A (ed; e1.) and are thus real.
EXERCISE. Prove directly that if the quadratic form A (x; x) is Hermitian,
then the determinants A0, A1, - - -, A" are real.
relative to two bases, then the number of positive, negative and zero
coefficients is the same in both cases.
The proof of this theorem is the same as the proof of the corre-
sponding theorem in § 7.
The concept of rank of a quadratic form introduced in § 7 for real
spaces can be extended without change to complex spaces.
CHAPTER II
Linear Transformations
772' = 2 aikék'
k=l
This mapping is another instance of a linear transformation.
4. Consider the n-dimensional vector space of polynomials of
degree g n — 1.
If we put
AP(t) = P'(t),
where P’ (t) is the derivative of P(t), then A is a linear transforma-
tion. Indeed
1- [P1(t) + Pz(t)]' = P10) + P20),
2. [AP(t)]’ = lP’(t).
5. Consider the space of continuous functions f(t) defined on
the interval [0, 1]. If we put
0 O 1
AP(t) = P’(t).
We choose the following basis in R:
t2 tn—l
Then
752 ’
Ae1=1’=0, Ae2=t’=1=e1, Ae3=(§)=t=e,,
A tn—l I tn—B
..) en: (—(n_ 1)!) = ——-—(n—2)! =en_1.
.............
000--°0
Let A be a linear transformation, e1, e2, - - -, e,I a basis in R and
”am“ the matrix which represents A relative to this basis. Let
(4) x=£1e1+§2e2+”'+sneny
or, briefly,
Further
We see that the element cm of the matrix g is the sum of the pro-
ducts of the elements of the ith row of the matrix .2! and the
corresponding elements of the kth column of the matrix g. The
matrix (g with entries defined by (8) is called the product of the
matrices .2! and Q in this order. Thus, if the (linear) transforma-
tion A is represented by the matrix Ham” and the (linear) trans-
formation B by the matrix “m I, then their product is represented
by the matrix ||cik|| which is the product of the matrices Ham“
and ”bile“
DEFINITION 3. By the sum of two linear transformations A and B
we mean the transformation C defined by the equation Cx = Ax + Bx
for all x.
If C is the sum of A and B we write C = A + B. It is easy to
see that C is linear.
Let C be the sum of the transformations A and B. If Ham” and
”ball represent A and B respectively (relative 'to some basis
e1, e2, - ~ -, en) and Howl] represents the sum C of A and B (relative
to the same basis), then, on the one .hand,
Aek = 2 awe), Bek = 2 bike), Cek = 2 Gwen
i 13 i
so that
cik = “m + bar
The matrix Ham + 5m” is called the sum of the matrices Hat-kl] and
||bikl l. Thus the matrix of the sum of two linear transformations is the
sum of the matrices associated with the summaha’s.
Addition and multiplication of linear transformations have
some of the properties usually associated with these operations.
Thus
1.A+B=B+m
2. (A+B)+C=A+(B+C);
3. A(BC)=(AB)C;
4 {(A+B)C=AC+BC,
' C(A+B)=CA+CB.
We could easily prove these equalities directly but this is unnec-
essary. We recall that we have established the existence of a
one-to—one correspondence between linear transformations and
matrices which preserves sums and products. Since properties
1 through 4 are proved for matrices in a course in algebra, the iso-
morphism between matrices and linear transformations just
mentioned allows us to claim the validity of 1 through 4 for linear
transformations.
We now define the product of a number I. and a linear transfor-
mation A. Thus by AA we mean the transformation which associ-
ates with every vector x the vector MAX). It is clear that if A is
represented by the matrix Ham”, then 2A is represented by the
matrix ”ham“.
If P(t) = aot’" + alt”‘—1 + - - - + am is an arbitrary polynomial
and A is a transformation, we define the symbol P(A) by the
equation
P(A) = aoAm + alAm-I + . ~ - + amE.
EXAMPLE. Consider the space R of functions defined and
infinitely differentiable on an interval (a, 6). Let D be the linear
mapping defined on R by the equation
Df(t) =f’(t)-
78 LECTURES ON LINEAR ALGEBRA
it follows that
'P(2.1) 0 0
P(M) = 0 P(Az) 0
_0 O O 0 0
ACek = Z bikCei.
i=1
Premultiplying both sides of this equation by C“1 (which exists in
view of the linear independence of the ft.) we get
’fl
C—lACe,c = 2 biker
i=1
It follows that the matrix ”ball represents C—lAC relative to the
basis e1, e2, - - -, en. However, relative to a given basis matrix
(C—lAC) = matrix (C‘l) - matrix (A) -matrix (C)! so that
(11) a = awe.
To sum up: Formula (11) gives the connection between the matrix
.4? of a transformation A relative to a basis f1, f2, - - -, f” and the
matrix .2! which represents A relative to the basis e1, e2, - - -, en.
The matrix % in (11) is the matrix of transition from the basis
e1, e2, - - -, en to the basis f1, f2, - - -, f" (formula (10)).
EXERCISE. Show that if 11 .72 12, then the coordinate axes are the only
invariant one-dimensional subspaces.
AP(t) = P’(t).
The set of polynomials of degree g k g n — 1 is an invariant
subspace.
“k1 ' ' ' akin “n+1 ' ' ' “kn
0 0 ak+1k+1 “Min
“ik+1="'=am=0 (léiék),
then the subspace generated by ek+1, eh”, - ' ~, en would also be
invariant under A.
2. Eigenvectors and eigenvalues. In the sequel one-dimensional
invariant subspaces will play a special role.
Let R1 be a one—dimensional subspace generated by some vector
x 9E 0. Then R1 consists of all vectors of the form «X. It is clear
that for R1 to be invariant it is necessary and sufficient that the
vector Ax be in R1, i.e., that
Ax = Ax.
X=5191+§2ez+"'+5nen
be any vector in R. Then the coordinates 171, 172, - - 317,, of the
vector Ax are given by
1 The proof holds for a vector space over any algebraically closed field
since it makes use only of the fact that equation (2) has a solution.
84 LECTURES ON LINEAR ALGEBRA
or
(“11 “ ”£1 + “1252 + ' ' ' + “17‘5" = 0,
“2151 + (“22 — A)52 + ' ' ' + “21161: = 0:
(1)
“75151 + M252 + I I ' + (amt _ 10511 = 0
then
Ax”) = 10 xw)’
LINEAR TRANSFORMATIONS 85
basis by a diagonal matrix, then the vectors of this basis are eigen-
values of A.
NOTE: There is one important casein which a linear transforma-
tion is certain to have n linearly independent eigenvectors. We
lead up to this case by observing that
If e1, e2, - - -, e,c are eigenvectors of a transformation A and the
corresponding eigenvalues 11, 112, - - -, 1,, are distinct, then e1, e2, - - -,
ek are linearly independent.
For k = 1 this assertion is obviously true. We assume its
validity for k — l vectors and prove it for the case of k vectors.
If our assertion were false in the case of k vectors, then there
would exist k numbers «1, a2, - - -, 09,, with «1 ¢ 0, say, such that
(3) otle1 + ocze2 + - - - + akek = 0.
0 0 0 1 lo
2. Find the characteristic polynomial of the matrix
“1 a2 “a a'n—l a”
1 0 0 0 0
O l 0 0 0
0 0 0 1 0
Solution: (—1)"(A" — div-1 — asp-3 — - - - — an).
+ ...........................
+ (“17:51 + a2n£2 + I ° ' + ann£n)fin'
A (x; y) = (Ax, Y)
and
A(X; y) = (Bx, y)-
Then
(Ax, y) —=- (Bx, y),
i.e.,
(Ax — Bx, y) = 0
“*tk = dki'
For a non-orthogonal basis the connection between the two
matrices is more complicated.
2. Transition from A to its adjoint (the operation *)
DEFINITION 1. Let A be a linear transformation on a complex
Euclidean space. The transformation A* defined by
Denote A* by C. Then
(Ax, y) = (x, Cy),
whence
(y, Ax) = (Cy, x).
1
= — — * — = A2)
21- (A A)
or,
11(X, x) = 1(X, x).
Since (x, x) 7s 0, it follows that A = X, which proves that Z is real.
LEMMA 2. Let A be a self-adjoint transformation on an n—dimen-
sional Euclidean vector space R and let e be an eigenvector of A.
The totality R1 of vectors x orthogonal to e form an (n — 1)-dimen-
sional subspace invariant under A.
Proof: The totality R1 of vectors x orthogonal to e form an
(n —— 1)-dimensional subspace of R.
We show that R1 is invariant under A. Let x 6 R1. This means
that (x, e) .= 0. We have to show that Ax 6 R1, that is, (Ax, e)
= 0. Indeed,
(Ax, e) = (x, A*e) = (x, Ae) = (x, 2e) = 1(x, e) = 0.
THEOREM 1. Let A be a self-adjoint transformation on an n-
dimensional Euclidean space. Then there exist n pairwise orthogonal
eigenvectors of A. The corresponding eigenvalues of A are all real.
Proof: According to Theorem 1, § 10, there exists at least one
eigenvector e1 of A. By Lemma 2, the totality of vectors orthogo-
nal to e1 form an (n — 1)-dimensional invariant subspace R1.
We now consider our transformation A on R1 only. In R1 there
exists a vector e2 which is an eigenvector of A (cf. note to Theorem
1, § 10). The totality of vectors of R1 orthogonal to e2 form an
(n — 2)-dimensional invariant subspace R2. In R2 there exists an
eigenvector e3 of A, etc.
In this manner we obtain n pairwise orthogonal eigenvectors
e1, e2, - - -, e". By Lemma 1, the corresponding eigenvalues are
real. This proves Theorem 1.
Since the product of an eigenvector by any non-zero number is
again an eigenvector, we can select the vectors ei so that each of
them is of length one.
THEOREM 2. Let A be a linear transformation on an n-dimensional
Euclidean space R. For A to be self-adjoint it is necessary and
sufficient that there exists an orthogonal basis relative to which the
matrix of A is diagonal and real.
Necessity: Let A be self—adjoint. Select in R a basis consisting of
LINEAR TRANSFORMATIONS 99
11 o 0
(1) 0 22 0
0 o 1,,
where the 1,, are real.
Sufficiency: Assume now that the matrix of the transformation
A has relative to an orthogonal basis the form (1). The matrix of
the adjoint transformation A* relative to an orthonormal basis is
obtained by replacing all entries in the transpose of the matrix of
A by their conjugates (cf. § 11). In our case this operation has no
effect on the matrix in question. Hence the transformations A and
A* have the same matrix, i.e., A = A*. This concludes the proof
of Theorem 2.
We note the following property of the eigenvectors of a self-
adjoint transformation: the eigenvectors corresponding to different
eigenvalues are orthogonal.
Indeed, let
Then
(A31: e2) = (91: A* ea) = (e1, A92),
that is
11(e1, ea) = 12(31’ ea):
01'
(e1: 92) = 0-
100 LECTURES ON LINEAR ALGEBRA
(v3 V?)
to the 28th power. Hint: Bring the matrix to its diagonal form, raise it-to
the proper power, and then revert to the original basis.
A (X; X) = Z Ziléila
where the ii are real, and the 4-} are the coordinates of the vector
x.6
Proof: Let A(x; y) be a Hermitian bilinear form, i.e.,
X=§1ei+§2ez+"‘+§new y=7liei+7lzez+"’+”7nen-
Since
(e e)_:1 for i=k
" "_ 0 for igék,
we get
A(X; Y) E (AX, Y)
= (51Ae1 + §2Ae2 + ' ' ' + EnAen’ 77191 + 77292 + ' ° ° + finen)
= (115191 + 125292 + ' ' ' + Angnem 77191 + 77292 + ' ° ' + mien)
= A14:17-11 + 1252772 + ' ' ' + lflgnfi’n'
In particular,
A (X; X) = Z lil‘silZ:
where the it are real, and the E,- are the coordinates of the vector
x.‘3
Proof: Let A(x; y) be a Hermitian bilinear form, i.e.,
X=§1e1+5292+”'+§uem y=771e1+77292+"'+’7nen-
Since
(e e)—{1 for i=k
" k _ 0 for igék,
we get
A(X; Y) E (AX, Y)
= (glAel + §2Ae2 + ' ' ' + EnAen: 77191 + 77292 + ' ' ' + When)
= (Alglel + 12§2e2 + ' ' ' + Augnen’ 771e1 + 77a + ' ' ' + mien)
= 1151771 + 1252772 + ' ' ' + AnE’nfin'
In particular,
Consequently,
(4) Det (a! 433) = (Al—z)<zz—z>---<An—z).
Under a change of basis the matrices of the Hermitian quadratic
forms A and B go over into the matrices .21] = g* M‘g and
$1 = %*fl%. Hence, if e1, e2, - - -, en is an arbitrary basis, then
with respect to this basis
Det(.ia!1 — 3331) = Det %* - Det (.52! — W) - Det ‘6,
i.e., Det (M1 — lfll) differs from (4) by a multiplicative constant.
It follows that the numbers Ill, 12, - - -, in are the roots of the equation
“11 —' ”’11 “12 — 3-512 ' ' ' “1n — M711;
“21 —' 1521 “22 — A(’22 ' ' ' “2n — Abs" =
where ||a,-,,|| and “bu,” are the matrices of the quadratic forms
A(x; x) and B(x; x) in some basis e1, e2, - - -, en.
NOTE: The following example illustrates that the requirement that one of
the two forms be positive definite is essential. The two quadratic forms
1 0]
e — [o -1
and the matrix of the second form is
_01
39—10.
Consider the matrix .21 — 1.93, where A is a real parameter. Its determinant
is equal to — (2.2 + l) and has no real roots. Therefore, in accordance with
the preceding discussion, the two forms cannot be reduced simultaneously
to a sum of squares.
The numbers 11, 112, - - -, 1,, are in absolute value equal to one.
Proof: Let U be a unitary transformation. We claim that the n
pairwise orthogonal eigenvectors constructed in the preceding
theorem constitute the desired basis. Indeed,
Uel = 2181,
Ue2 = Azez,
...........
LINEAR TRANSFORMATIONS 107
A=HU=UH,
0 0 1,,
where 11,12, - - -, A” are the eigenvalues of B. By Lemma 2 all
A,- g 0. Put
V11 0 - - - 0
H _ 0 V1.2 0
0 0 V1,,
Applying Lemma 2 again we conclude that H is positive definite.
114 LECTURES ON LINEAR ALGEBRA
b. lo=a+i13,fi;é0. Let
£1 +1.771’E2 + “72’ I I I) En +1177!
and
“11’71 + “12712 + ' ° ' + “m7!" = “’71 + I351:
(2), “21711 + “22712 + ' ' ' + ‘12a = 0“72 + 1352:
x = §1e1 + 5292 + ' ' ' + Enen’ y = ’71e1 + 772% + ' ' ' +7lnen'
Furthermore, let C,- be the coordinates of the vector 2 = Ax, i.e.,
LINEAR TRANSFORMATIONS 117
Ct = 2 “flask!
Ic=1
where Hat-k” is the matrix of A relative to the basis e1, e2, - - -, en.
It follows that
Similarly,
’fl
“tic = aki'
To sum up, for a linear transformation to be self-adjoint it is
necessary and sufficient that its matrix relative to an orthonormal basis
be symmetric.
Relative to an arbitrary basis every symmetric bilinear form
A(x; y) is represented by
n
Ax = ocx — fly,
Ay = fix + any.
But then
(Ax, Y) = «(x Y) — My, Y)
(x, Ay) = fi(x, X) + «(X Y)-
Subtracting the first equation from the second we get [note that
(AX, Y) = (X, AY)]
0 = 2fl[(X. X) + (y, y)].
Since (x, x) + (y, y) ale 0, it follows that it = 0. Contradiction.
LEMMA 2. Let A be a self-adjoint transformation and e1 an
eigenvector of A. Then the totality R’ of vectors orthogonal to e1
forms an (n — 1)-dimensional invariant subspace.
Proof: It is clear that the totality R’ of vectors x, xeR,
orthogonal to e1 forms an (n — 1)-dimensional subspace. We
show that R’ is invariant under A.
Thus, let x eR’, i.e., (X, el) = 0. Then
(Ax, el) = (x, Ael) = (x, Zel) = 1(x, el) = 0,
i.e., Ax e R’.
THEOREM 2. There exists an orthonormal basis relative to which
the matrix of a self—adjoint transformation A is diagonal.
Proof: By Lemma 1, the transformation A has at least one
eigenvector e1.
Denote by R’ the subspace consisting of vectors orthogonal to e1.
Since R’ is invariant under A, it contains (again, by Lemma 1)
LINEAR TRANSFORMATIONS 119
o 22 0
o 0 11,.
3. Reduction of a quadratic form to a sum of squares relative to an
orthogonal basis (reduction to principal axes). Let A(x; y) be a
symmetric bilinear form on an n-dimensional Euclidean space.
We showed earlier that to each symmetric bilinear form A (x; y)
there corresponds a linear self-adjoint transformation A such that
A (x; y) = (Ax, y). According to Theorem 2 of this section there
exists an orthonormal basis e1, e2, - - -, en consisting of the
eigenvectors of the transformation A (i.e., of vectors such that
Aei = lei). With respect to such a basis
A(X; y) = (Ax, y)
=(A(§191 + 5292 ‘l‘ ‘ ' ' ‘l‘ guenlflhei + ’7292 ‘l' ' ' ' ‘l‘ men)
= (115191 + 125262 + ' ' ' + Ansnen’nlel ‘l' 71292 + ' ' ' +77nen)
= A151’71 ‘l‘ A2’52’72 "l' ' ' ' ‘l‘ lnénm-
Putting y = x we obtain the following
THEOREM 3. Let A (x; x) be a quadratic form on an n-dimensional
Euclidean space. Then there exists an orthonormal basis relative to
which the quadratic form can be represented as
A(X; x) = 2 LE}.
Here the 1,. are the eigenvalues of the transformation A or, equiv-
alently, the roots of the characteristic equation of the matrix
Hatkll'
For n = 3 the above theorem is a. theorem of solid analytic geometry.
Indeed, in this case the equation
A(x;x) =1
is the equation of a central conic of order two. The orthonormal basis
120 LECTURES ON LINEAR ALGEBRA
mw=3ww
By Theorem 3 of this section there exists an orthonormal basis
e1, e2, - - -, en relative to which the form A (x; x) is expressed as a
sum of squares, i.e.,
(8) Z 53-
(X; X) = B(X; X) = i=1
Thus, relative to the basis e1, e2, - - -, en each quadratic form
can be expressed as a sum of squares.
5. Orthogonal transformations
DEFINITION. A linear transformation A defined on a real n—dimen—
sional Euclidean space is said to be orthogonal if it preserves inner
products, i.e., if
(9) (Ax, Ay) = (x, y)
for all x, y e R.
Putting x = y in (9) we get
(10) [Ax]2 = IXIZ,
that is, an orthogonal transformation is length preserving.
EXERCISE. Prove that condition (10) is sufficient for a transformation
to be orthogonal.
LINEAR TRANSFORMATIONS 121
Since
(x, y)
=m
and since neither the numerator nor the denominator in the
expression above is changed under an orthogonal transformation,
it follows that an orthogonal transformation preserves the angle
between two vectors.
Let e1, e2, - - -, en be an orthonormal basis. Since an orthogonal
transformation A preserves the angles between vectors and the
length of vectors, it follows that the vectors Ae1,Ae2, - ' -, Aen
likewise form an orthonormal basis, i.e.,
1 for i = k
(11) (Aet: Aek) — {0 for 1’ ¢ [3
Now let Haw” be the matrix of A relative to the basis e1, e2, - ~ -,
en. Since the columns of this matrix are the coordinates of the
vectors Aei, conditions (11) can be rewritten as follows:
" 1 for i = k
(12) 2 “Ma” = {0 for igék.
a=1
EXERCISE. Show that conditions (11) and, consequently, conditions (12)
are sufficient for a transformation to be orthogonal.
Conditions (12) can be written in matrix form. Indeed,
it
03> [Z ’2]
be the matrix of A relative to that basis.
We first study the case when A is a proper orthogonal trans-
formation, i.e., we assume that «6 —- fly = 1.
The orthogonality condition implies that the product of the
matrix (13) by its transpose is equal to the unit matrix, i.e., that
l: H :1-
LINEAR TRANSFORMATIONS 123
t: 2H: t]-
It follows from (14) and (15) that in this case the matrix of the
transformation is
a _
fl 0! ’
where a2 + [32 = 1. Putting at = cos (p, [3 = sin (7) we find that
the matrix of a proper orthogonal transformation on a two dimensional
space relative to an orthogonal basis is of the form
cos go — sin go
[sin ()0 cos go]
(a rotation of the plane by an angle (p).
Assume now that A is an improper orthogonal transformation,
that is, that 0:6 — fly = — 1. In this case the characteristic
equation of the matrix (13) is 12 — (ac + 6)). — 1 = O and, thus,
has real roots. This means that the transformation A has an
eigenvector e, Ae = he. Since A is orthogonal it follows that
Ae = ie. Furthermore, an orthogonal transformation preserves
the angles between vectors and their length. Therefore any vector
e1 orthogonal to e is transformed by A into a vector orthogonal to
Ae = ie, i.e., Ae1 = iel. Hence the matrix of A relative to the
basis e, e1 has the form
i1 0
0 i1 '
—1
cos (pl — sin (p1
sin (p1 cos (p1
—1
cos (p1 — sin (p1
sin (p1 cos (p1
1
cos (p — sin (p
sin ([7 cos (p
1
— 1—
1
Making use of Theorem 5 one can easily show that every orthogonal
transformation can be written as the product of a number of simple rota-
tions and simple reflections. The proof is left to the reader.
and
(Ael, el) = 21, where (e1, el) = 1.
Inequality (1) can be rewritten as follows
(2) (Ax, x) ; 21(x, x), where (x, x) = 1.
This inequality holds for vectors of unit length. Note that if we
multiply x by some number at, then both sides of the inequality
become multiplied by «2. Since any vector can be obtained from a
vector of unit length by multiplying it by some number or, it
follows that inequality (2) holds for vectors of arbitrary length.
We now rewrite (2) in the form
(Ax — 111x, x) g 0 for all x.
In particular, for X = e1, we have
(Ae1 — llel, e) = 0.
This means that the transformation B = A — 21E satisfies the
conditions of Lemma 1. Hence
(A — 11E)e1 = 0, i.e., Ael = Alel.
We have shown that e1 is an eigenvector of the transformation
A corresponding to the eigenvalue 21. This proves the theorem.
To find the next eigenvalue of A we consider all vectors of R
orthogonal to the eigenvector e1. As was shown in para. 2, § 16
(Lemma 2), these vectors form an (n — l)-dimensional subspace
R1 invariant under A. The required second eigenvalue 12 of A is
the minimum of (Ax, x) on the unit sphere in R1. The corre-
sponding eigenvector e2 is the point in R1 at which the minimum
is assumed.
Obviously, 12 g 11 since the minimum of a function considered
on the whole space cannot exceed the minimum of the function in a
subspace.
We obtain the next eigenvector by solving the same problem in
LINEAR TRANSFORMATIONS 129
Alélzgn'éln
(Ax: X) = i~1£12 + A24522 + ' ' ' + 11¢n ; 11(512 + £52 + ' ' ' + 51:”) =
= 11(x, x).
Similarly,
(Ax, x) g 1,,(x, x).
It follows that
111(x, x) g (Ax, x) $1,,(x, x).
Now let Rk be a subspace of dimension n — k + 1. In § 7 (Lemma of
para. 1) we showed that if the sum of the dimensions of two subspaces of an
n-dimensional space is greater than n, then there exists a vector different
from zero belonging to both subspaces. Since the sum of the dimensions of
R,, and S is (n — k + l) + k it follows that there exists a vector x0
common to both R,c and S. We can assume that xo has unit length, that'is,
130 LECTURES ON LINEAR ALGEBRA
But then the minimum of (Ax, x) for x on the unit sphere in Rh must be
equal to or less than h.
To sum up: If Rk is an (n — k + l)-dimensional subspace and x varies
over all vectors in R,c for which (x, x) = 1, then
min (Ax, x) g M.
In this formula the minimum is taken over all x e R,“ (x, x) = l, and
the maximum over all subspaces Rk of dimension n — k + 1.
As a consequence of our theorem we have:
Let A be a self-adjoint linear transformation and B a postive definite linear
transformation. Let A g A, g - - - g 1,, be the eigenvalues of A and let
#1 g ,u2 g - - - g ,u" be the eigenvalues of A + B. Then 1,, g Mk-
Indeed
(Ax, x) g ((A + B)x, x),
for all x. Hence for any (n — k + l)—dimensional subspace Rh we have
It follows that the maximum of the expression on the left side taken over
all subspaces Rk does not exceed the maximum of the right side. Since, by
formula (3), the maximum of the left side is equal to 1,, and the maximum
of the right side is equal to the: we have 1,, g M.
A(cleJ + 02e2 + ' ' ' + opera) = M6191 + czez ’l‘ ' ' ' + ope”).
Substituting the appropriate expressions of formula (2) on the left
side we obtain
Aep
+
H
(D
(D
a
e
I
...
l
0 0 0 -- 21 1
0 0 0 0 l
CANONICAL FORM OF LINEAR TRANSFORMATION 135
‘zllou-o T
Olll-HO
000 111
1210 0
0/121 0
(4) 000 12
_ 000---/1k_
Here all the elements outside of the boxes are zero.
Although a matrix in the canonical form described above seems more
complicated than a diagonal matrix, say, one can nevertheless perform
algebraic operations on it with relative ease. We show, for instance, how to
compute a polynomial in the matrix (4). The matrix (4) has the form
”1
Ma
Mk
where the #1 are square boxes and all other elements are zero. Then
#12 M1".
M22 .212“
”2: ' I ...’ Mm: ’
Mk3 “km
that is, in order to raise the matrix .2! to some power all one has to do is
raise each one of the boxes to that power. Now let P(t) = a0 + alt + - - - +
+ amt“ be any polynomial. It is easy to see that
136 LECTURES ON LINEAR ALGEBRA
P0311)
PMs)
PM) =
'Pm,»
We now show how to compute P(.x&’1), say. First we write the matrix all
in the form
.211 = 11 g + J,
where 6' is the unit matrix of order p and where the matrix I has the form
010 0 0
O 0 1 0 0
"6— 0 0 0 01'
_0 O 0 0 0
We note that the matrices J”, .1“, - - -, 19-1 are of the form 2
O 0 1 0 O 0 0 0 01
0 0 0 1 0 0 0 O 0 0
J2: ................. , J94: ................
0 0 0 0 0 O 0 0 0 0
0 O 0 0 0 0 0 0 0 0
and
jr=jp+1=...=0.
where n is the degree of P(t). Substituting for t the matrix .211 we get
(M1 — 116')” ,,
P(dl) = PUulé” + (-5211 — 3-16,)P,('11)+ —'—2'—P (11)
d — 6’"
+ . . . +$ Pong“)
n.
But .211 — 116’ = J. Hence
A*e = he.
We claim that the (n — 1)-dimensional subspace R’ consisting of
3 The main idea for the proof of this theorem is due to I. G. Petrovsky.
See I. G. Petrovsky, Lectures on the Theory 0/ Ordinary Differential Equa-
tions, chapter 6.
138 LECTURES ON LINEAR ALGEBRA
Ah1 = Zkhl,
Ah2 = h1 + lkhz,
forms a basis in R.
Applying the transformation A to e we get
4 We assume here that R is Euclidean, i.e., that an inner product is
defined on R. However, by changing the proof slightly we can show that
the Lemma holds for any vector space R.
CANONICAL FORM OF LINEAR TRANSFORMATION 139
Ae=oc1e1+---+aper+fi1f1+~-+/3qfq—|—---+61h1
+---+6,h,+1e.5
We can assume that r = 0. Indeed, if relative to some basis A is
in canonical form then relative to the same basis A — IE is also in
canonical form and conversely. Hence if 1 7S 0 we can consider
the transformation A — 1E instead of A.
This justifies our putting
(1) Ae=°‘1e1+"'+°‘mep+13]f1+""l‘flqfq
+"'+51h1+"'+5shs-
We have
A9, = Ae — A0519] + ' ° ' + hep) — A(M1f1 + ' ' ' + l‘qfa]
“‘ ° ' ° "' A(0)1h1 + ' + wshs);
or, making use of (1)
Ael=°‘1ei+"'+°‘pew+l31f1+"'+/3afa+"'+51h1
(3) +"'+6shs_A(X1el+°"°+Zpeaz) —A(M1f1+"'
+Iuafa) _ 'OI—A(w1h1+ "'+wshs)'
The coefficients X1: - - -, 75,; ,ul, - - -, ,uq; ---; ml, -- -, a), can
be chosen arbitrarily. We will choose them so that the right side
of (3) has as few terms as possible.
We know that to each set of basis vectors in the n-dimensional
space R’ relative to which A is in canonical form there corresponds
5 The linear transformation A has in the (n + 1)—dimensional space R
the eigenvalues 1.1, 12, - - -, 1,, and 1. Indeed, the matrix of A relative to the
basis e1, e2, - - -, e,; f1, f2, - - -, fa; -- -; h1,h2,---, h,, e is triangular with
the numbers 1.1, 1,, - - -, A,“ 1: on the principal diagonal.
Since the eigenvalues of a triangular matrix are equal to the entries on
the diagonal (of. for instance, § 10, para. 4) it follows that 11, 12, - - -, }.k, and
r are the eigenvalues of A considered on the (n + 1)-dimensiona1 space R.
Thus, as a result of the transition from the n-dimensional invariant sub-
space R’ to the (n + 1)-dimensional space R the number of eigenvalues is
increased by one, namely, by the eigenvalue 1.
140 LECTURES ON LINEAR AL'GEBRA
(1.181 + ' ' ' + “pep _ A(x191 + ' ' ' + liter)
= “1‘31 + ‘ ' ' + “pep — 9611191
— 12(e1 + A1‘32) — ' ° ' — Zp(ep—1 ‘l' A1%)
= (“1 — 1111 — Z2)e1 + (‘12 — X211 ‘“ X3)ez
+ ' ' ' + (“p—1 _ Xp—lj'l _ Idem—1 + (“17 _ lmll)ep'
Ael=°c1el+”'+°‘pev+fllf1+'.'+fiqfa+71g1
(4) + ' ' ' + yrgr _ A(Xle1 + ' ' ' + Znep)
_ A(Au'1f1 + ' ' ' + Mafa) _ Aollgl + ' ° ° + 90%)-
“1‘31 + 05292: + ' ' ' + “pep — X291 — 1392 _ ' ' ' _ hem—1-
By putting 12 = a1, 13 = a2, - - -, Zm = «1,4 we annihilate all
vectors except a, ep. Proceeding in the same manner with the
sets f , ' - -, fa and g1, - - -, g, we obtain a vector e’ such that
Ae’=0
I
e n+1 — e ,
I I
e p = Ae 9+1 = “pep + [3q + 71%,,
I I
e p—r+1 = Ae p—r+2 = “pep—1+1 + fife—7+1 + 7431»
I
e az—r = Ae 11—1'+1 = “pep—r + nfq—r’
I I
e1=Ae2=ocpe1.
We now replace the basis vectors e’, e1 , e2, - - -, e7, by the vectors
I I I
e1,e’2, ' ' "ewep+1
and leave the other basis vectors unchanged. Relative to the new
basis the transformation A is in canonical form. Note that the
order of the first box has been increased by one. This completes
the proof of the theorem.
While constructing the canonical form of A we had to distinguish
two cases:
1. The case when the additional eigenvalue T (we assumed
1: = 0) did not coincide with any of the eigenvalues 11, - - -, 1,0.
In this case a separate box of order 1 was added.
2. The case when r coincided with one of the eigenvalues
11, - - -, 2k. Then it was necessary, in general, to increase the order
of one of the boxes by one. If on, = ,Bq = y, = 0, then just as in
the first case, we added a new box.
in 1 0
[0 A, I].
0 0 1.,
Answer: D3(A).= (A — in)”, D,(}.) = D10.) = l.
Here A3) are the minors of $1 of order m1 and AS; the minors of .932
of order m2. 7 Indeed, if one singles out those of the first n1 rows
which enter into the minor in question and expands it by these
rows (using the theorem of Laplace), the result is zero or is of the
form Afi’Afif’.
We shall rfow find the polynomials D km) for an arbitrary matrix
.2! which is in Jordan canonical form. We assume that .2! has 1)
boxes corresponding to the eigenvalue 11, 9 boxes corresponding
to the eigenvalue 12, etc. We denote the orders of the boxes
corresponding to the eigenvalue 11 by n1, n2, - - -, n, (n1 ; n2
2 . . . g 7,”).
_Let er. denote the ith box in g = .2! — M. Then g1, say, is of
the form
111—; 1 0 o
0 11—2 1 0
n: .........................
0 0 0 1
o 0 0 21—2
We first compute Dn(h), i.e., the determinant of 3?. This determi-
nant is the product of the determinants of the 331., i.e.,
7 Of course, a non-zero kth order minor of Q may have the form A (1’,
i.e., it may be entirely made up of elements of $1. In this case we shall
write it formally as A,c = Ako‘z’, where A0”) = 1.
CANONICAL FORM or LINEAR TRANSFORMATION 147
The expressions for the Dk(λ) show that in place of the Dk(λ) it is
more convenient to consider their ratios
  Ek(λ) = Dk(λ) / Dk−1(λ).
The Ek(λ) are called elementary divisors. Thus if the Jordan
canonical form of a matrix 𝒜 contains p boxes of order n1, n2, ⋯,
np (n1 ≥ n2 ≥ ⋯ ≥ np) corresponding to the eigenvalue λ1, q boxes
of order m1, m2, ⋯, mq (m1 ≥ m2 ≥ ⋯ ≥ mq) corresponding
to the eigenvalue λ2, etc., then the elementary divisors Ek(λ) are
  En(λ) = (λ − λ1)^{n1} (λ − λ2)^{m1} ⋯,
  En−1(λ) = (λ − λ1)^{n2} (λ − λ2)^{m2} ⋯,
  En−2(λ) = (λ − λ1)^{n3} (λ − λ2)^{m3} ⋯,
  ⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯
  λci = Σ (k = 1 to n) aik ck.
The matrix of this system of equations is A − λE, with A = ‖aik‖ the matrix of
coefficients in the system (1). Thus the study of the system of differential
equations (1) is closely linked to polynomial matrices of degree one, namely,
those of the form A − λE.
Similarly, the study of higher order systems of differential equations leads
to polynomial matrices of degree higher than one. Thus the study of the
system
  Σ (k = 1 to n) aik d²yk/dt² + Σ (k = 1 to n) bik dyk/dt + Σ (k = 1 to n) cik yk = 0
is synonymous with the study of the polynomial matrix Aλ² + Bλ + C,
where A = ‖aik‖, B = ‖bik‖, C = ‖cik‖.
We now consider the problem of the canonical form of polyno-
mial matrices with respect to so-called elementary transformations.
The term “elementary” applies to the following classes of trans-
formations.
1. Permutation of two rows or columns.
2. Addition to some row of another row multiplied by some
polynomial φ(λ) and, similarly, addition to some column of another
column multiplied by some polynomial.
3. Multiplication of some row or column by a non-zero constant.
DEFINITION 1. Two polynomial matrices are called equivalent if it
is possible to obtain one from the other by a finite number of ele-
mentary transformations.
The inverse of an elementary transformation is again an elemen-
tary transformation. This is easily seen for each of the three types
of transformations listed above.
      E1(λ)  0      ⋯  0
(3)   0      E2(λ)  ⋯  0
      ⋯⋯⋯⋯⋯⋯⋯⋯⋯
      0      0      ⋯  En(λ)
Here the diagonal elements Ek(λ) are monic polynomials and E1(λ)
divides E2(λ), E2(λ) divides E3(λ), etc. This form of a polynomial
matrix is called its canonical diagonal form.
It may, of course, happen that
  Er+1(λ) = Er+2(λ) = ⋯ = En(λ) = 0.

EXERCISE. Reduce the matrix
  λ − λ1   0
  0        λ − λ2      (λ1 ≠ λ2)
to canonical diagonal form.
Answer:
  1   0
  0   (λ − λ1)(λ − λ2)
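The reduction can be confirmed with a computer algebra system: over the principal ideal domain Q[λ] the Smith normal form is exactly the canonical diagonal form. A minimal sketch (sympy assumed; the concrete eigenvalues 2 and 5 are a hypothetical instance with λ1 ≠ λ2):

```python
from sympy import Matrix, symbols, QQ
from sympy.matrices.normalforms import smith_normal_form

lam = symbols('lambda')

# concrete instance of the exercise: lambda1 = 2, lambda2 = 5 (distinct)
M = Matrix([[lam - 2, 0],
            [0,       lam - 5]])

# Smith normal form over Q[lambda] is the canonical diagonal form
print(smith_normal_form(M, domain=QQ[lam]))
# expected: Matrix([[1, 0], [0, (lambda - 2)*(lambda - 5)]]), possibly expanded
```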
2. In this paragraph we prove that the canonical diagonal
form of a given matrix is uniquely determined. To this end we
shall construct a system of polynomials connected with the given
polynomial matrix which are invariant under elementary trans-
formations and which determine the canonical diagonal form
completely.
Let there be given an arbitrary polynomial matrix. Let Dk(λ)
denote the greatest common divisor of all kth order minors of the
given matrix. As before, it is convenient to put D0(λ) = 1. Since
Dk(λ) is determined to within a multiplicative constant, we take
its leading coefficient to be one. In particular, if the greatest
common divisor of the kth order minors is a constant, we take
Dk(λ) = 1.
We shall prove that the polynomials Dk(λ) are invariant under
elementary transformations, i.e., that equivalent matrices have
the same polynomials Dk(λ).
In the case of elementary transformations of type 1 which
permute rows or columns this is obvious, since such transformations
either do not affect a particular kth order minor at all, or change
its sign or replace it with another kth order minor. In all these
cases the greatest common divisor of all kth order minors remains
unchanged. Likewise, elementary transformations of type 3 do
not change Dk(λ) since under such transformations the minors are
at most multiplied by a non-zero constant. Now consider elementary
transformations of type 2. Specifically, consider addition of the
jth column multiplied by φ(λ) to the ith column. If some particular
kth order minor contains neither of these columns or if it contains
both of them it is not affected by the transformation in question.
If it contains the ith column but not the jth column we can write
it as a combination of minors each of which appears in the original
matrix. Thus in this case, too, the greatest common divisor of the
kth order minors remains unchanged.
If all kth order minors and, consequently, all minors of order
higher than k are zero, then we put Dk(λ) = Dk+1(λ) = ⋯
= Dn(λ) = 0. We observe that equality of the Dk(λ) for all
equivalent matrices implies that equivalent matrices have the
same rank.
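The invariance can be made concrete numerically. The sketch below (sympy assumed; the matrix is a hypothetical example) computes the monic gcd of all kth order minors before and after a type 2 transformation:

```python
from itertools import combinations
from functools import reduce
from sympy import Matrix, symbols, gcd, expand, simplify

lam = symbols('lambda')

def Dk(M, k):
    """Monic gcd of all kth order minors of the polynomial matrix M."""
    n = M.rows
    minors = [expand(M[list(r), list(c)].det())
              for r in combinations(range(n), k)
              for c in combinations(range(n), k)]
    nonzero = [m for m in minors if m != 0]
    if not nonzero:
        return 0
    g = reduce(gcd, nonzero)
    return expand(g / g.as_poly(lam).LC())

A = Matrix([[lam, 1],
            [lam**2, lam + 1]])

# type 2 transformation: add lambda**3 times column 2 to column 1
B = A.copy()
B[:, 0] = B[:, 0] + lam**3 * B[:, 1]

for k in (1, 2):
    assert simplify(Dk(A, k) - Dk(B, k)) == 0   # the D_k are unchanged
```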
We compute the polynomials Dk(λ) for a matrix in canonical
form
      E1(λ)  0      ⋯  0
(5)   0      E2(λ)  ⋯  0
      ⋯⋯⋯⋯⋯⋯⋯⋯⋯
      0      0      ⋯  En(λ)
We observe that in the case of a diagonal matrix the only non-
zero minors are the principal minors, that is, minors made up of
like numbered rows and columns. These minors are of the form
  Ei1(λ) Ei2(λ) ⋯ Eik(λ).
Since E2(λ) is divisible by E1(λ), E3(λ) is divisible by E2(λ), etc.,
it follows that the greatest common divisor D1(λ) of all minors of
order one is E1(λ). Since all the polynomials Ek(λ) are divisible
by E1(λ) and all polynomials other than E1(λ) are divisible by
E2(λ), the product Ei(λ)Ej(λ) (i < j) is always divisible by the
minor E1(λ)E2(λ). Hence D2(λ) = E1(λ)E2(λ). Since all Ek(λ)
other than E1(λ) and E2(λ) are divisible by E3(λ), the product
Ei(λ)Ej(λ)Ek(λ) (i < j < k) is divisible by the minor
E1(λ)E2(λ)E3(λ), and so D3(λ) = E1(λ)E2(λ)E3(λ).
(6)  Dk(λ) = E1(λ)E2(λ) ⋯ Ek(λ)   (k = 1, 2, ⋯, n).
It follows that
  Ek(λ) = Dk(λ) / Dk−1(λ).
Here, if beginning with some value of r we have Dr+1(λ) = ⋯ = Dn(λ) = 0,
we must put Er+1(λ) = ⋯ = En(λ) = 0.
The polynomials Ek(λ) are called elementary divisors. In § 20 we
defined the elementary divisors of matrices of the form A − λE.
THEOREM 2. The canonical diagonal form of a polynomial matrix
A(λ) is uniquely determined by this matrix. If Dk(λ) ≢ 0 (k = 1, 2,
⋯, r) is the greatest common divisor of all kth order minors of A(λ)
and Dr+1(λ) = ⋯ = Dn(λ) = 0, then the elements of the canonical
diagonal form (5) are defined by the formulas
  Ek(λ) = Dk(λ) / Dk−1(λ)   (k = 1, 2, ⋯, r),
  Er+1(λ) = Er+2(λ) = ⋯ = En(λ) = 0.
  1  0  0  ⋯  0
  0  α  0  ⋯  0
  0  0  1  ⋯  0
  ⋯⋯⋯⋯⋯⋯
  0  0  0  ⋯  1
obtained from the unit matrix by multiplying its second column
(or, what amounts to the same thing, its second row) by α.
Finally, to add to the first column of A(λ) the second column
multiplied by φ(λ) we must multiply A(λ) on the right by the
matrix
       1     0  0  ⋯  0
(10)   φ(λ)  1  0  ⋯  0
       0     0  1  ⋯  0
       ⋯⋯⋯⋯⋯⋯⋯
       0     0  0  ⋯  1
obtained from the unit matrix by just such a process. Likewise,
to add to the first row of A(λ) the second row multiplied by φ(λ)
we must multiply A(λ) on the left by the matrix
       1  φ(λ)  0  ⋯  0
(11)   0  1     0  ⋯  0
       0  0     1  ⋯  0
       ⋯⋯⋯⋯⋯⋯⋯
       0  0     0  ⋯  1
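The claim is easy to test. A minimal sketch (sympy assumed; the 3 × 3 polynomial matrix is a hypothetical example):

```python
from sympy import Matrix, symbols, eye, simplify

lam = symbols('lambda')
phi = lam**2 + 1                       # an arbitrary polynomial phi(lambda)

A = Matrix([[lam, 1, 0],
            [2, lam - 1, lam],
            [0, 3, lam + 2]])

T = eye(3)
T[1, 0] = phi        # the matrix (10): unit matrix with phi in row 2, column 1

# multiplication on the right adds phi * (second column) to the first column
right = A * T
assert simplify(right[:, 0] - (A[:, 0] + phi * A[:, 1])) == Matrix.zeros(3, 1)

# its transpose, the matrix (11), acts on rows from the left
left = T.T * A
assert simplify(left[0, :] - (A[0, :] + phi * A[1, :])) == Matrix.zeros(1, 3)
```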
Let
  P(λ) = P0λⁿ + P1λⁿ⁻¹ + ⋯ + Pn,
where the Pk are constant matrices.
It is easy to see that the polynomial matrix
  P(λ) + (A − λE)P0λⁿ⁻¹
is of degree not higher than n − 1.
If
  P(λ) + (A − λE)P0λⁿ⁻¹ = P′0λⁿ⁻¹ + P′1λⁿ⁻² + ⋯ + P′_{n−1},
then the polynomial matrix
  P(λ) + (A − λE)P0λⁿ⁻¹ + (A − λE)P′0λⁿ⁻²
is of degree not higher than n − 2. Continuing this process we
obtain a polynomial matrix
  P(λ) + (A − λE)(P0λⁿ⁻¹ + P′0λⁿ⁻² + ⋯)
of degree not higher than zero, i.e., independent of λ. If R denotes
the constant matrix just obtained, then
  P(λ) = (A − λE)(−P0λⁿ⁻¹ − P′0λⁿ⁻² − ⋯) + R,
or, putting S(λ) = −P0λⁿ⁻¹ − P′0λⁿ⁻² − ⋯,
  P(λ) = (A − λE)S(λ) + R.
This proves our lemma.
A similar proof holds for the possibility of division on the right;
i.e., there exist matrices S1(λ) and R1 such that
  P(λ) = S1(λ)(A − λE) + R1.
We note that in our case, just as in the ordinary theorem of Bezout, we
can claim that
  R = R1 = P(A).
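The successive degree reductions of the lemma translate directly into an algorithm. A minimal sketch (sympy assumed; A and P(λ) are hypothetical examples) carries out the process and checks the identity P(λ) = (A − λE)S(λ) + R:

```python
from sympy import Matrix, Poly, symbols, eye, expand, zeros

lam = symbols('lambda')

A = Matrix([[2, 1], [0, 3]])
P = Matrix([[lam**2 + 1, lam],
            [2*lam, lam**2 - lam]])
n = A.rows

def deg(M):
    return max((Poly(e, lam).degree() for e in M if e != 0), default=0)

S, R = zeros(n, n), P
while deg(R) >= 1:
    d = deg(R)
    # Q = matrix coefficient of lambda**d in R; adding (A - lambda*E) Q lambda**(d-1)
    # cancels the leading term, exactly as in the lemma
    Q = R.applyfunc(lambda e: Poly(e, lam).coeff_monomial(lam**d))
    R = (R + (A - lam*eye(n)) * Q * lam**(d - 1)).applyfunc(expand)
    S = S - Q * lam**(d - 1)

assert ((A - lam*eye(n)) * S + R - P).applyfunc(expand) == zeros(n, n)
print(R)   # constant remainder; for this left division R = A**2*P0 + A*P1 + P2,
           # i.e. "P(A)" with the powers of A written on the left
```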
THEOREM 4. The polynomial matrices A − λE and B − λE are
equivalent if and only if the matrices A and B are similar.
Proof: The sufficiency part of the proof was given in the
beginning of this paragraph. It remains to prove necessity. This
means that we must show that the equivalence of A − λE and
B − λE implies the similarity of A and B. By Theorem 3 there
exist invertible polynomial matrices P(λ) and Q(λ) such that
(12)  P(λ)(A − λE)Q(λ) = B − λE.

(16)  K(λ) = (B − λE)P1(λ)(A − λE)Q(λ) + P(λ)(A − λE)Q1(λ)(B − λE)
           − (B − λE)P1(λ)(A − λE)Q1(λ)(B − λE).

But in view of (12),
  (A − λE)Q(λ) = P⁻¹(λ)(B − λE),
  P(λ)(A − λE) = (B − λE)Q⁻¹(λ).
Using these relations we can rewrite K(λ) in the following manner:
  K(λ) = (B − λE)[P1(λ)P⁻¹(λ) + Q⁻¹(λ)Q1(λ) − P1(λ)(A − λE)Q1(λ)](B − λE).
We now show that K(λ) = 0. Since P(λ) and Q(λ) are invertible,
the expression in square brackets is a polynomial in λ. We shall
prove this polynomial to be zero. Assume that this polynomial is
not zero and is of degree m. Then it is easy to see that K(λ) is of
degree m + 2 and, since m ≥ 0, K(λ) is at least of degree two. But
(15) implies that K(λ) is at most of degree one. Hence the expres-
sion in the square brackets, and with it K(λ), is zero.
We have thus found that
(17)  B − λE = P0(A − λE)Q0,
where P0 and Q0 are constant matrices; i.e., we may indeed replace
P(λ) and Q(λ) in (12) with constant matrices.
Equating coefficients of λ in (17) we see that
  P0Q0 = E,
which shows that the matrices P0 and Q0 are non-singular and that
  P0 = Q0⁻¹.
Equating the free terms we find that
  B = P0AQ0 = Q0⁻¹AQ0,
i.e., that A and B are similar. This completes the proof of our
theorem.
Since equivalence of the matrices A − λE and B − λE is
synonymous with identity of their elementary divisors it follows
from the theorem just proved that two matrices A and B are similar
if and only if the matrices A − λE and B − λE have the same
elementary divisors. We now show that every matrix A is similar
to a matrix in Jordan canonical form.
To this end we consider the matrix A − λE and find its ele-
mentary divisors. Using these we construct, as in § 20, a matrix B
in Jordan canonical form. B − λE has the same elementary
divisors as A − λE; but then B is similar to A.
As was indicated on page 160 (footnote) this paragraph gives
another proof of the fact that every matrix is similar to a matrix
in Jordan canonical form. Of course, the contents of this paragraph
can be deduced directly from §§ 19 and 20.
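Together with Theorem 2 this gives a mechanical similarity test: compare the (monic) diagonals of the canonical forms of A − λE and B − λE. A minimal sketch (sympy assumed; the matrices are hypothetical examples):

```python
from sympy import Matrix, Poly, symbols, eye, expand, QQ
from sympy.matrices.normalforms import smith_normal_form

lam = symbols('lambda')

def canonical_diagonal(A):
    """Monic diagonal of the canonical form of A - lambda*E."""
    n = A.rows
    S = smith_normal_form(A - lam*eye(n), domain=QQ[lam])
    monic = lambda e: expand(e / Poly(e, lam).LC()) if e != 0 else 0
    return [monic(S[i, i]) for i in range(n)]

def similar(A, B):
    return canonical_diagonal(A) == canonical_diagonal(B)

A = Matrix([[1, 1], [0, 1]])   # a Jordan box of order 2 with eigenvalue 1
B = Matrix([[1, 5], [0, 1]])   # similar to A
C = eye(2)                     # same characteristic polynomial (lambda - 1)**2,
                               # but different elementary divisors
print(similar(A, B), similar(A, C))   # expected: True False
```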
CHAPTER IV
Introduction to Tensors
If
  x = ξ¹e1 + ξ²e2 + ⋯ + ξⁿen
is a vector in R and f is a linear function on R, then (cf. § 4) we
can write
  (f, x) = a1ξ¹ + a2ξ² + ⋯ + anξⁿ.
If f¹, f², ⋯, fⁿ is the basis dual to e1, e2, ⋯, en, i.e., (f^k, e_i) = δ_i^k, then
  (f^k, x) = (f^k, ξ^i e_i) = ξ^i (f^k, e_i) = ξ^i δ_i^k = ξ^k.
Hence, the coordinates ξ^k of a vector x in the basis e1, e2, ⋯, en
can be computed from the formulas
  ξ^k = (f^k, x).
1 This is seen by comparing the matrices in (6) and (6′). We say that the
matrix ‖b_i^k‖ in (6′) is the transpose of the transition matrix in (6) because
the summation indices in (6) and (6′) are different.
Now let y be the vector with coordinates a1, a2, ⋯, an. Since the
basis e1, e2, ⋯, en is orthonormal, ⋯

(10)  e_i = g_{iα} f^α,
where
  g_{ik} = (e_i, e_k).
Solving equation (10) for f^i we obtain the required result
(11)  f^i = g^{iα} e_α,
where the matrix ‖g^{ik}‖ is the inverse of the matrix ‖g_{ik}‖, i.e.,
  g_{iα} g^{αk} = δ_i^k.
EXERCISE. Show that
  g^{ik} = (f^i, f^k).
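Formula (11) is easily verified numerically. A minimal sketch (numpy assumed; the basis of R³ is a hypothetical example):

```python
import numpy as np

# a non-orthogonal basis of R^3, one vector per row: e_i = E[i]
E = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])

g = E @ E.T                  # g_ik = (e_i, e_k)
g_inv = np.linalg.inv(g)     # g^{ik}

F = g_inv @ E                # f^i = g^{i alpha} e_alpha, formula (11)

assert np.allclose(F @ E.T, np.eye(3))   # (f^i, e_k) = delta_i^k
assert np.allclose(F @ F.T, g_inv)       # g^{ik} = (f^i, f^k), as in the exercise
```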
§ 23. Tensors
1. Multilinear functions. In the first chapter we studied linear
and bilinear functions on an n-dimensional vector space. A natural
2 If R is an n-dimensional vector space, then R̄ is also n-dimensional, and
so R and R̄ are isomorphic. If we were to identify R and R̄ we would have
to write (y, x) (y, x ∈ R) in place of (f, x). But this would have the effect
of introducing an inner product in R.
  l(λx, y, ⋯; f, g, ⋯) = λl(x, y, ⋯; f, g, ⋯).
Again,
  l(x, y, ⋯; f′ + f″, g, ⋯) = l(x, y, ⋯; f′, g, ⋯) + l(x, y, ⋯; f″, g, ⋯);
  l(x, y, ⋯; μf, g, ⋯) = μl(x, y, ⋯; f, g, ⋯).
A multilinear function of p vectors in R (contravariant vectors)
and q vectors in R̄ (covariant vectors) is called a multilinear
function of type (p, q).
The simplest multilinear functions are those of type (1, 0) and
(0, 1).
A multilinear function of type (1, 0) is a linear function of one
vector in R, i.e., a vector in R̄ (a covariant vector).
Similarly, as was shown in para. 3, § 22, a multilinear function
of type (0, 1) defines a vector in R (a contravariant vector).
There are three types of multilinear functions of two vectors
(bilinear functions):
  (α) bilinear functions on R (considered in § 4),
  (β) bilinear functions on R̄,
  (γ) functions of one vector in R and one in R̄.
There is a close connection between functions of type (γ) and
linear transformations. Indeed, let
  y = Ax
be a linear transformation on R. Then
  (f, Ax)
is a bilinear function of type (γ), which depends linearly on the
vectors x ∈ R and f ∈ R̄.
As in § 11 of chapter II one can prove the converse, i.e., that one
can associate with every bilinear function of type (γ) a linear
transformation on R.
2. Expressions for multilinear functions in a given coordinate
system. Coordinate transformations. We now express a multilinear
function in terms of the coordinates of its arguments. For simplicity
we consider the case of a multilinear function l(x, y; f), x, y ∈ R,
f ∈ R̄ (a function of type (2, 1)).
Let e1, e2, ⋯, en be a basis in R and f¹, f², ⋯, fⁿ its dual in R̄.
Let
  x = ξ^i e_i,   y = η^j e_j,   f = ζ_k f^k.
Then
(1)  l(x, y; f) = l(ξ^i e_i, η^j e_j; ζ_k f^k) = a_{ij}^k ξ^i η^j ζ_k,
where, in the general case,
(2)  a_{ij⋯}^{rs⋯} = l(e_i, e_j, ⋯; f^r, f^s, ⋯).
We now show how the system of numbers which determines a
multilinear function changes as a result of a change of basis.
Thus let e1, e2, ⋯, en be a basis in R and f¹, f², ⋯, fⁿ its dual
basis in R̄. Let e′1, e′2, ⋯, e′n be a new basis in R and f′¹, f′², ⋯, f′ⁿ
be its dual in R̄. If
(3)  e′_α = c_α^β e_β,
then
(4)  f′^β = b_α^β f^α,
where the matrix ‖b_α^β‖ is the transpose of the inverse of ‖c_α^β‖.
For a fixed α the numbers c_α^β in (3) are the coordinates of the
vector e′_α relative to the basis e1, e2, ⋯, en. Similarly, for a
fixed β the numbers b_α^β in (4) are the coordinates of f′^β relative to
the basis f¹, f², ⋯, fⁿ.
We shall now compute the numbers a′_{ij⋯}^{rs⋯} which define our
multilinear function relative to the bases e′1, e′2, ⋯, e′n and
f′¹, f′², ⋯, f′ⁿ. We know that
  a′_{ij⋯}^{rs⋯} = l(e′_i, e′_j, ⋯; f′^r, f′^s, ⋯).
Hence to find a′_{ij⋯}^{rs⋯} we must put in (1), in place of ξ^i, η^j, ⋯; ζ_r, ζ_s, ⋯,
the coordinates of the vectors e′_i, e′_j, ⋯; f′^r, f′^s, ⋯, i.e., the
numbers c_i^α, c_j^β, ⋯; b_γ^r, b_δ^s, ⋯. In this way we find that
  a′_{ij⋯}^{rs⋯} = c_i^α c_j^β ⋯ b_γ^r b_δ^s ⋯ a_{αβ⋯}^{γδ⋯}.
Let A be a linear transformation on R and ‖a_i^k‖ its matrix
relative to the basis e1, e2, ⋯, en, i.e.,
  Ae_i = a_i^k e_k.
Define a change of basis by the equations
  e′_i = c_i^α e_α.
Then
  e_i = b_i^α e′_α,   where   b_i^α c_α^k = δ_i^k.
It follows that
  Ae′_i = A(c_i^α e_α) = c_i^α Ae_α = c_i^α a_α^β e_β = c_i^α a_α^β b_β^k e′_k = a′_i^k e′_k.
This means that the matrix ‖a′_i^k‖ of A relative to the e′_i basis
takes the form
  a′_i^k = c_i^α a_α^β b_β^k,
which proves that the matrix of a linear transformation is indeed a
tensor of rank two, once covariant and once contravariant.
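In coordinates the transformation law is a mechanical contraction of indices. A minimal sketch (numpy assumed; all data hypothetical) applies it to the matrix of a transformation and to a tensor of type (2, 1):

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)

C = rng.standard_normal((n, n))   # c_i^alpha (rows: new index, columns: old)
B = np.linalg.inv(C)              # b_beta^k; as a plain 2-D array this is C^{-1}

a = rng.standard_normal((n, n))   # a_alpha^beta, the matrix of a transformation

# a'_i^k = c_i^alpha a_alpha^beta b_beta^k, i.e. a' = C a C^{-1}
a_new = np.einsum('ia,ab,bk->ik', C, a, B)
assert np.allclose(a_new, C @ a @ B)

# a tensor of type (2, 1): a_{ij}^r stored as t[i, j, r];
# each covariant index takes a c, the contravariant index takes a b
t = rng.standard_normal((n, n, n))
t_new = np.einsum('ia,jb,gr,abg->ijr', C, C, B, t)
```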
In particular, the matrix of the identity transformation E
relative to any basis is the unit matrix, i.e., the system of numbers
  δ_i^k = 1 if i = k,   δ_i^k = 0 if i ≠ k.
Thus δ_i^k is the simplest tensor of rank two, once covariant and once
contravariant.
Since
  a′_{ij⋯}^{rs⋯} = l′(e_i, e_j, ⋯; f^r, f^s, ⋯)
and
  a″_{kl⋯}^{tu⋯} = l″(e_k, e_l, ⋯; f^t, f^u, ⋯),
it follows that
  a_{ij⋯kl⋯}^{rs⋯tu⋯} = a′_{ij⋯}^{rs⋯} a″_{kl⋯}^{tu⋯}.
(7)  l(e1, y, ⋯; f¹, g, ⋯) + l(e2, y, ⋯; f², g, ⋯) + ⋯ + l(en, y, ⋯; fⁿ, g, ⋯)
       = l(e_α, y, ⋯; f^α, g, ⋯).
To show that this sum does not depend on the choice of basis we must show
that, for any bilinear function A of type (γ),
  A(e_α; f^α) = A(e′_α; f′^α).
We recall that if
  e′_i = c_i^α e_α,
then
  f^k = c_α^k f′^α.
Therefore
  A(e_α; f^α) = A(b_α^β e′_β; c_γ^α f′^γ) = b_α^β c_γ^α A(e′_β; f′^γ)
             = δ_γ^β A(e′_β; f′^γ) = A(e′_β; f′^β),
and the sum (7) is indeed independent of the basis. Since
  l′(e_r, ⋯; f^s, ⋯) = l(e_α, e_r, ⋯; f^α, f^s, ⋯),
it follows that
(10)  a′_{r⋯}^{s⋯} = a_{αr⋯}^{αs⋯}.
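The basis independence of contraction can likewise be tested numerically. A minimal sketch (numpy assumed; data hypothetical), contracting the first lower index against the upper index of a tensor of type (2, 1):

```python
import numpy as np

n = 3
rng = np.random.default_rng(1)

C = rng.standard_normal((n, n))     # c_i^alpha
B = np.linalg.inv(C)                # b_beta^k

t = rng.standard_normal((n, n, n))  # a_{ij}^r stored as t[i, j, r]
t_new = np.einsum('ia,jb,gr,abg->ijr', C, C, B, t)   # components in the new basis

# contraction: a'_j = a_{alpha j}^{alpha}
contr_old = np.einsum('aja->j', t)
contr_new = np.einsum('aja->j', t_new)

# the result is a tensor with one lower index: it transforms with a single c
assert np.allclose(contr_new, np.einsum('ja,a->j', C, contr_old))
```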
  l(x, y, ⋯, z) = a_{i1 i2 ⋯ in} ξ^{i1} η^{i2} ⋯ ζ^{in} = a ·
    | ξ¹ ξ² ⋯ ξⁿ |
    | η¹ η² ⋯ ηⁿ |
    | ⋯⋯⋯⋯⋯ |
    | ζ¹ ζ² ⋯ ζⁿ |,
where a = a_{12⋯n}.
This proves the fact that apart from a multiplicative constant the
only skew symmetric multilinear function of n vectors in an n-
dimensional vector space is the determinant of the coordinates of
these vectors.
The operation of symmetrization. Given a tensor one can always
construct another tensor symmetric with respect to a preassigned
group of indices. This operation is called symmetrization and
consists in the following.
Let the given tensor be a_{i1 i2 ⋯}, say. To symmetrize it with
respect to the first k indices is to construct the tensor
  a_{(i1 i2 ⋯ ik) i_{k+1} ⋯} = (1/k!) Σ a_{j1 j2 ⋯ jk i_{k+1} ⋯},
where the sum is taken over all permutations j1, j2, ⋯, jk of the
indices i1, i2, ⋯, ik. For example,
  a_{(i1 i2)} = ½ (a_{i1 i2} + a_{i2 i1}).
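Symmetrization is a short loop over permutations. A minimal sketch (numpy and itertools assumed) for a hypothetical tensor of rank three, symmetrized over its first two indices:

```python
import numpy as np
from itertools import permutations

def symmetrize(t, k):
    """Average t over all permutations of its first k indices."""
    perms = list(permutations(range(k)))
    rest = tuple(range(k, t.ndim))
    return sum(np.transpose(t, p + rest) for p in perms) / len(perms)

t = np.random.default_rng(2).standard_normal((3, 3, 3))
s = symmetrize(t, 2)                               # s_{(i1 i2) i3}

assert np.allclose(s, s.swapaxes(0, 1))            # symmetric in i1, i2
assert np.allclose(s, (t + t.swapaxes(0, 1)) / 2)  # matches the k = 2 formula
```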
The operation of alternation is similar, except that the terms of the
sum alternate in sign:
  a_{[i1 i2 ⋯ ik] i_{k+1} ⋯} = (1/k!) Σ ± a_{j1 j2 ⋯ jk i_{k+1} ⋯},
where the sum is taken over all permutations j1, j2, ⋯, jk of the
indices i1, i2, ⋯, ik and the sign depends on the even or odd
nature of the permutation involved. For instance,
  a_{[i1 i2]} = ½ (a_{i1 i2} − a_{i2 i1}).
The operation of alternation is indicated by the square bracket
symbol [ ]. The brackets contain the indices involved in the
operation of alternation.
Given k vectors ξ^i, η^i, ⋯, ζ^i we can construct their tensor
product a^{i1 i2 ⋯ ik} = ξ^{i1} η^{i2} ⋯ ζ^{ik} and then alternate it to get a^{[i1 i2 ⋯ ik]}.
It is easy to see that the components of this tensor are, apart from
the factor 1/k!, the kth order minors of the following matrix:
  ξ¹ ξ² ⋯ ξⁿ
  η¹ η² ⋯ ηⁿ
  ⋯⋯⋯⋯⋯
  ζ¹ ζ² ⋯ ζⁿ
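A companion sketch for alternation (numpy assumed; the two vectors are hypothetical) checks that alternating the tensor product of two vectors in R³ yields, up to the factor 1/2!, the second order minors of the matrix formed from them:

```python
import numpy as np
from itertools import permutations

def sign(p):
    """Sign of a permutation given as a tuple."""
    s, q = 1, list(p)
    for i in range(len(q)):
        while q[i] != i:
            j = q[i]
            q[i], q[j] = q[j], q[i]
            s = -s
    return s

def alternate(t, k):
    """Signed average of t over all permutations of its first k indices."""
    perms = list(permutations(range(k)))
    rest = tuple(range(k, t.ndim))
    return sum(sign(p) * np.transpose(t, p + rest) for p in perms) / len(perms)

xi = np.array([1.0, 2.0, 3.0])
eta = np.array([0.0, 1.0, 5.0])

a = np.einsum('i,j->ij', xi, eta)    # tensor product a^{i1 i2} = xi^{i1} eta^{i2}
alt = alternate(a, 2)                # a^{[i1 i2]}

M = np.vstack([xi, eta])
for i in range(3):
    for j in range(3):
        # 2 * alt[i, j] is the 2x2 minor on columns i, j of M
        assert np.isclose(2 * alt[i, j], np.linalg.det(M[:, [i, j]]))
```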