
Applied Linear Algebra

Instructor's Solutions Manual


by Peter J. Olver and Chehrzad Shakiban

Table of Contents

Chapter                                                      Page
 1.  Linear Algebraic Systems ...............................   1
 2.  Vector Spaces and Bases ................................  46
 3.  Inner Products and Norms ...............................  78
 4.  Minimization and Least Squares Approximation ........... 114
 5.  Orthogonality .......................................... 131
 6.  Equilibrium ............................................ 174
 7.  Linearity .............................................. 193
 8.  Eigenvalues ............................................ 226
 9.  Linear Dynamical Systems ............................... 262
10.  Iteration of Linear Systems ............................ 306
11.  Boundary Value Problems in One Dimension ............... 346

Solutions Chapter 1

1.1.1.
(a) Reduce the system to $x - y = 7$, $3y = -4$; then use Back Substitution to solve for $x = \frac{17}{3}$, $y = -\frac{4}{3}$.
(b) Reduce the system to $6u + v = 5$, $-\frac{5}{2} v = \frac{5}{2}$; then use Back Substitution to solve for $u = 1$, $v = -1$.
(c) Reduce the system to $p + q - r = 0$, $-3q + 5r = 3$, $-r = 6$; then solve for $p = 5$, $q = -11$, $r = -6$.
(d) Reduce the system to $2u - v + 2w = 2$, $-\frac{3}{2} v + 4w = 2$, $-w = 0$; then solve for $u = \frac{1}{3}$, $v = -\frac{4}{3}$, $w = 0$.
(e) Reduce the system to $5x_1 + 3x_2 - x_3 = 9$, $\frac{1}{5} x_2 - \frac{2}{5} x_3 = -\frac{2}{5}$, $2x_3 = -2$; then solve for $x_1 = 4$, $x_2 = -4$, $x_3 = -1$.
(f) Reduce the system to $x + z - 2w = -3$, $-y + 3w = 1$, $-4z - 16w = -4$, $6w = 6$; then solve for $x = 2$, $y = 2$, $z = -3$, $w = 1$.
(g) Reduce the system to $3x_1 + x_2 = 1$, $\frac{8}{3} x_2 + x_3 = \frac{2}{3}$, $\frac{21}{8} x_3 + x_4 = \frac{3}{4}$, $\frac{55}{21} x_4 = \frac{5}{7}$; then solve for $x_1 = \frac{3}{11}$, $x_2 = \frac{2}{11}$, $x_3 = \frac{2}{11}$, $x_4 = \frac{3}{11}$.
1.1.2. Plugging in the given values of x, y and z gives $a + 2b - c = 3$, $a - 2 - c = 1$, $1 + 2b + c = 2$. Solving this system yields $a = 4$, $b = 0$, and $c = 1$.
1.1.3.
(a) With Forward Substitution, we just start with the top equation and work down. Thus $2x = -6$ so $x = -3$. Plugging this into the second equation gives $12 + 3y = 3$, and so $y = -3$. Plugging the values of x and y in the third equation yields $-3 + 4(-3) - z = 7$, and so $z = -22$.
(b) We will get a diagonal system with the same solution.
(c) Start with the last equation and, assuming the coefficient of the last variable is $\neq 0$, use the operation to eliminate the last variable in all the preceding equations. Then, again assuming the coefficient of the next-to-last variable is non-zero, eliminate it from all but the last two equations, and so on.
(d) For the systems in Exercise 1.1.1, the method works in all cases except (c) and (f). Solving the reduced system by Forward Substitution reproduces the same solution (as it must):
    (a) The system reduces to $\frac{3}{2} x = \frac{17}{2}$, $x + 2y = 3$.
    (b) The reduced system is $\frac{15}{2} u = \frac{15}{2}$, $3u - 2v = 5$.
    (c) The method doesn't work since r doesn't appear in the last equation.
    (d) Reduce the system to $\frac{3}{2} u = \frac{1}{2}$, $\frac{7}{2} u - v = \frac{5}{2}$, $3u - 2w = 1$.
    (e) Reduce the system to $\frac{2}{3} x_1 = \frac{8}{3}$, $4x_1 + 3x_2 = 4$, $x_1 + x_2 + x_3 = -1$.
    (f) Doesn't work since, after the first reduction, z doesn't occur in the next to last equation.
    (g) Reduce the system to $\frac{55}{21} x_1 = \frac{5}{7}$, $x_1 + \frac{21}{8} x_2 = \frac{3}{4}$, $x_2 + \frac{8}{3} x_3 = \frac{2}{3}$, $x_3 + 3x_4 = 1$.
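The reduction and Back Substitution in 1.1.1 (g) can be checked mechanically. The sketch below is not from the manual: it assumes the original system is the tridiagonal one with 3 on the diagonal, 1 on the two off-diagonals, and right-hand side all ones (which is consistent with the reduced equations quoted in (g)), and the function name `solve_regular` is hypothetical. Exact rational arithmetic is used so that the pivots can be compared with the fractions in the answer.

```python
from fractions import Fraction

def solve_regular(A, b):
    """Gaussian elimination without row interchanges, then Back Substitution.

    Assumes every pivot is nonzero (a regular system). Returns (pivots, x).
    """
    n = len(A)
    # Exact-arithmetic copy of the augmented matrix [A | b].
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for j in range(n):
        for i in range(j + 1, n):
            factor = M[i][j] / M[j][j]
            # Subtract factor times row j from row i.
            M[i] = [mi - factor * mj for mi, mj in zip(M[i], M[j])]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    pivots = [M[i][i] for i in range(n)]
    return pivots, x

# Assumed form of the system in 1.1.1 (g).
A = [[3, 1, 0, 0], [1, 3, 1, 0], [0, 1, 3, 1], [0, 0, 1, 3]]
b = [1, 1, 1, 1]
pivots, x = solve_regular(A, b)
print(pivots)  # the diagonal coefficients of the reduced system
print(x)       # the solution vector
```

Under that assumption the pivots come out as 3, 8/3, 21/8, 55/21, matching the coefficients of the reduced system.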

1.2.1. (a) 3 4,

(b) 7,

(c) 6,

(d) ( 2 0 1 2 ),
1

0
C
(e) B
@ 2 A.
6

2 3
5 6C
1.2.2. (a)
A, (b)
7
8
9
0 1
1
C
(e) B
@ 2 A, (f ) ( 1 ).
3
B
@4

1
1

2
4

3
, (c)
5

B
@4

2
5
8

3
6
9

4
7C
A, (d) ( 1
3

4 ),

1.2.3. x = 31 , y = 43 , z = 31 , w = 23 .

1.2.4.
(a)
(b)
(c)
(d)
(e)

(f )

(g)

1
1
6
3

1
x
7
A=
, x=
, b=
;
2
y
3
!
!
!
5
u
1
;
, b=
, x=
A=
5
v
2
0
1
0 1
0 1
1
1 1
p
0
B C
B C
A=B
3C
@ 2 1
A, x = @ q A , b = @ 3 A ;
1 1
0
r
6 1
0
1
0 1
0
2
1 2
u
3
B C
B
C
A=B
3 3C
@ 1
A, x = @ v A, b = @ 2 A;
4 3 1
0
w1
71
0
0
0
9
5 3 1
x1
B
C
C
B
C
A=B
@ 3 2 1 A, x = @ x2 A, b = @ 5 A;
x3 0 1 1 0
1 1
2
0
1
1
1
0
1 2
x
3
B
C
B C
B
C
2 1
2 1 C
B yC
B 3C
B
C, x = B C , b = B
C;
A=B
@ 0 6 4
@ zA
@ 2A
2A
1
3
2 1 1
w
1
0
0
1
0 1
3 1 0 0
x1
1
B
B
C
B C
1 3 1 0C
C
Bx C
B1C
B
C, x = B 2 C, b = B C.
A=B
@0 1 3 1A
@ x3 A
@1A
x4
0 0 1 3
1

1.2.5.
(a) x y = 1, 2 x + 3 y = 3. The solution is x = 65 , y = 51 .
(b) u + w = 1, u + v = 1, v + w = 2. The solution is u = 2, v = 1, w = 1.
(c) 3 x1 x3 = 1, 2 x1 x2 = 0, x1 + x2 3 x3 = 1.
The solution is x1 = 51 , x2 = 52 , x3 = 25 .
(d) x + y z w = 0, x + z + 2 w = 4, x y + z = 1, 2 y z + w = 5.
The solution is x = 2, y = 1, z = 0, w = 3.

1.2.6.

1
B
B0
(a) I = B
@0
0
(b) I + O =

0
1
0
0
I,

0 0
0 0 0
B
0 0C
C
B0 0 0
C,
O=B
@0 0 0
1 0A
0 1
0 0 0
I O = O I = O. No, it does

(f )

B
@3

11
12
8

6
4

0
, (d) undefined, (e) undefined,
2
1
0
1
9
9 2
14
B
12 C
6 17 C
A, (g) undefined, (h) @ 8
A, (i) undefined.
8
12 3
28

1.2.7. (a) undefined, (b) undefined, (c)

3
1

0
0C
C
C.
0A
0
not.

1.2.8. Only the third pair commute.


1.2.9. 1, 6, 11, 16.
0

1
1.2.10. (a) B
@0
0

0
0
0

0
0C
A, (b)
1

1.2.11. (a) True, (b) true.


1.2.12. (a) Let A =

x
z

2
B
B0
B
@0
0

0
2
0
0

0
0
3
0

0
0C
C
C.
0A
3

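Here $A = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$ and $D = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$. Then $A D = \begin{pmatrix} a x & b y \\ a z & b w \end{pmatrix}$, while $D A = \begin{pmatrix} a x & a y \\ b z & b w \end{pmatrix}$, so if $a \neq b$ these are equal if and only if $y = z = 0$. (b) Every $2 \times 2$ matrix commutes with $\begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} = a I$.
(c) Only $3 \times 3$ diagonal matrices. (d) Any matrix of the form $A = \begin{pmatrix} x & 0 & 0 \\ 0 & y & z \\ 0 & u & v \end{pmatrix}$. (e) Let $D = \operatorname{diag}(d_1, \ldots, d_n)$. The $(i, j)$ entry of $A D$ is $a_{ij} d_j$. The $(i, j)$ entry of $D A$ is $d_i a_{ij}$. If $d_i \neq d_j$, this requires $a_{ij} = 0$, and hence, if all the $d_i$'s are different, then A is diagonal.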
1.2.13. We need A of size $m \times n$ and B of size $n \times m$ for both products to be defined. Further, $A B$ has size $m \times m$ while $B A$ has size $n \times n$, so the sizes agree if and only if $m = n$.
1.2.14. $B = \begin{pmatrix} x & y \\ 0 & x \end{pmatrix}$, where x, y are arbitrary.

1.2.15. (a) $(A + B)^2 = (A + B)(A + B) = A A + A B + B A + B B = A^2 + 2 A B + B^2$, since $A B = B A$. (b) An example: $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$.
1.2.16. If $A B$ is defined and A is an $m \times n$ matrix, then B is an $n \times p$ matrix and $A B$ is an $m \times p$ matrix; on the other hand, if $B A$ is defined we must have $p = m$, and $B A$ is an $n \times n$ matrix. Now, since $A B = B A$, we must have $p = m = n$.
1.2.17. $A\, O_{n \times p} = O_{m \times p}$, $O_{l \times m}\, A = O_{l \times n}$.
1.2.18. The $(i, j)$ entry of the matrix equation $c A = O$ is $c\, a_{ij} = 0$. If any $a_{ij} \neq 0$ then $c = 0$, so the only possible way that $c \neq 0$ is if all $a_{ij} = 0$ and hence $A = O$.
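1.2.19. False: for example, $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.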

1.2.20. False unless they commute: A B = B A.


1.2.21. Let v be the column vector with 1 in its j th position and all other entries 0. Then A v
is the same as the j th column of A. Thus, the hypothesis implies all columns of A are 0
and hence A = O.
1.2.22. (a) A must be a square matrix. (b) By associativity, $A A^2 = A A A = A^2 A = A^3$.
(c) The naïve answer is $n - 1$. A more sophisticated answer is to note that you can compute $A^2 = A A$, $A^4 = A^2 A^2$, $A^8 = A^4 A^4$, and, by induction, $A^{2^r}$ with only r matrix multiplications. More generally, if the binary expansion of n has $r + 1$ digits, with s nonzero digits, then we need $r + s - 1$ multiplications. For example, $A^{13} = A^8 A^4 A$ since 13 is 1101 in binary, for a total of 5 multiplications: 3 to compute $A^2$, $A^4$ and $A^8$, and 2 more to multiply them together to obtain $A^{13}$.
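The multiplication count in 1.2.22 (c) can be illustrated with a short square-and-multiply routine. This is a minimal sketch, not part of the manual: the function names are hypothetical, and the test matrix is chosen only because its powers are easy to recognise.

```python
def multiplications_needed(n):
    """r + s - 1, where n has r + 1 binary digits and s of them are nonzero."""
    bits = bin(n)[2:]
    r = len(bits) - 1      # squarings needed to reach A^(2^r)
    s = bits.count("1")    # number of nonzero binary digits
    return r + s - 1

def mat_mul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def power(A, n):
    """Return (A**n, number of matrix multiplications used), for n >= 1."""
    count = 0
    result = None          # product of the selected powers A^(2^k) so far
    square = A             # current A^(2^k)
    while n > 0:
        if n & 1:
            if result is None:
                result = square              # first factor costs nothing
            else:
                result = mat_mul(result, square)
                count += 1
        n >>= 1
        if n > 0:
            square = mat_mul(square, square)  # next repeated square
            count += 1
    return result, count

A = [[1, 1], [0, 1]]
A13, used = power(A, 13)
print(A13, used, multiplications_needed(13))
```

For n = 13 the routine performs three squarings and two further products, in agreement with the count of 5 given above.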

1.2.23. $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$.

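1.2.24. (a) If the i-th row of A has all zero entries, then the $(i, j)$ entry of $A B$ is $a_{i1} b_{1j} + \cdots + a_{in} b_{nj} = 0\, b_{1j} + \cdots + 0\, b_{nj} = 0$, which holds for all j, so the i-th row of $A B$ will have all 0's.
(b) If $A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$, then $B A = \begin{pmatrix} 1 & 1 \\ 3 & 3 \end{pmatrix}$.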
1
3

1.2.25. The same solution X =


1.2.26. (a)

4
1

5
, (b)
2

5
2

1
2

in both cases.

1
. They are not the same.
1
!

1
0

1.2.27. (a) X = O. (b) Yes, for instance, A =

2
,B=
1

3
2

2
,X=
1

1
1

0
.
1

1.2.28. $A = (1/c)\, I$ when $c \neq 0$. If $c = 0$ there is no solution.


1.2.29.
(a) The ith entry of A z is 1 ai1 +1 ai2 + +1 ain = ai1 + +ain , which is the ith row sum.
1
1n
(b) Each row of W has n 1 entries equal to n and one entry equal to n and so its row
1
1n
sums are (n 1) n + n = 0. Therefore, by part (a), W z = 0. Consequently, the
row 0
sums
0 the
1 of B = A W are
1
1
B
C
(c) z = B
@ 1 A, and so A z = @ 2
4
1
0
10
2
1

1 2 1 C B
3
3
B
1
2
B
CB
B
2
1
3

@
A@
3
3
1
1
4 5 1
3
3

entries of
= AW
= A 0 = 0, and the result follows.
0 z1
0z 1
1B
2
1
2 1
B C
B C
1
3C
A @ 1 A = @ 6 A, while B = A W =
0
1
5 1
1
0
1
0 1
1
1
4
5
3 3
0
3C
3
B
C
1C
C, and so B z = B
C=B
0C
@
A.
0
1
1
@
A
3A
0
2
4 5
1

1.2.30. Assume A has size $m \times n$, B has size $n \times p$ and C has size $p \times q$. The $(k, j)$ entry of $B C$ is $\sum_{l=1}^{p} b_{kl} c_{lj}$, so the $(i, j)$ entry of $A (B C)$ is
$$\sum_{k=1}^{n} a_{ik} \Bigl( \sum_{l=1}^{p} b_{kl} c_{lj} \Bigr) = \sum_{k=1}^{n} \sum_{l=1}^{p} a_{ik} b_{kl} c_{lj}.$$
On the other hand, the $(i, l)$ entry of $A B$ is $\sum_{k=1}^{n} a_{ik} b_{kl}$, so the $(i, j)$ entry of $(A B) C$ is
$$\sum_{l=1}^{p} \Bigl( \sum_{k=1}^{n} a_{ik} b_{kl} \Bigr) c_{lj} = \sum_{k=1}^{n} \sum_{l=1}^{p} a_{ik} b_{kl} c_{lj}.$$
The two results agree, and so $A (B C) = (A B) C$. Remark: A more sophisticated, simpler proof can be found in Exercise 7.1.44.


1.2.31.
(a) We need A B and B A to have the same size, and so this follows from Exercise 1.2.13.
(b) $A B - B A = O$ if and only if $A B = B A$.
0
1
!
!
0 1 1
0 0
1 2
B
, (iii) @ 1 0 1 C
, (ii)
(c) (i)
A;
0 0
6 1
1 1 0
(d) (i) $[\, c A + d B, C \,] = (c A + d B) C - C (c A + d B) = c (A C - C A) + d (B C - C B) = c\, [A, C] + d\, [B, C]$,
$[\, A, c B + d C \,] = A (c B + d C) - (c B + d C) A = c (A B - B A) + d (A C - C A) = c\, [A, B] + d\, [A, C]$.
(ii) $[A, B] = A B - B A = -(B A - A B) = -[B, A]$.
(iii) $\bigl[\, [A, B], C \,\bigr] = (A B - B A) C - C (A B - B A) = A B C - B A C - C A B + C B A$,
$\bigl[\, [C, A], B \,\bigr] = (C A - A C) B - B (C A - A C) = C A B - A C B - B C A + B A C$,
$\bigl[\, [B, C], A \,\bigr] = (B C - C B) A - A (B C - C B) = B C A - C B A - A B C + A C B$.
Summing the three expressions produces O.

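1.2.32. (a) (i) 4, (ii) 0. (b) $\operatorname{tr}(A + B) = \sum_{i=1}^{n} (a_{ii} + b_{ii}) = \sum_{i=1}^{n} a_{ii} + \sum_{i=1}^{n} b_{ii} = \operatorname{tr} A + \operatorname{tr} B$.
(c) The diagonal entries of $A B$ are $\sum_{j=1}^{n} a_{ij} b_{ji}$, so $\operatorname{tr}(A B) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} b_{ji}$; the diagonal entries of $B A$ are $\sum_{i=1}^{n} b_{ji} a_{ij}$, so $\operatorname{tr}(B A) = \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ji} a_{ij}$. These double summations are clearly equal. (d) $\operatorname{tr} C = \operatorname{tr}(A B - B A) = \operatorname{tr} A B - \operatorname{tr} B A = 0$ by parts (b) and (c).
(e) Yes, by the same proof.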
1.2.33. If $b = A x$, then $b_i = a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n$ for each i. On the other hand, $c_j = (a_{1j}, a_{2j}, \ldots, a_{nj})^T$, and so the i-th entry of the right hand side of (1.13) is $x_1 a_{i1} + x_2 a_{i2} + \cdots + x_n a_{in}$, which agrees with the expression for $b_i$.
1.2.34.
(a) This follows by direct computation.
(b) (i)
!
!
!
!
!
!
!
2 1
1 2
2
1
2
4
1 0
1
4
=
( 1 2 ) +
(1 0) =
+
=
.
3 2
1
0
3
2
3 6
2 0
5 6
0
1
!
!
!
!
2
5
(ii)
0
2
1
1 2 0 B
C
( 1 1 )
( 3 0 ) +
(2 5) +
0A =
@ 3
2
1
3
3 1 2
1 1
=
(iii)
3 1
B
2
@ 1
1
1
0

10

2
6

5
15

6
3

0
0

0
2

0
2

8
1

5
.
17
0

1
2
3 0
3
1
1
B
C
B
C
B
C
B
C
1C
A @ 3 1 4 A = @ 1 A( 2 3 0 ) + @ 2 A( 3 1 4 ) + @ 1 A( 0 4 1 )
5
0
4 1
1
1
5
0
1
0
1
0
1
0
1
6
9 0
3
1 4
0
4
1
3
14 3
C
B
B
B
=B
8C
4
1C
1
9C
@ 2 3 0 A + @ 6 2
A + @0
A = @4
A.
2
3 0
3 1
4
0 20 5
5 18 1
(c) If we set $B = x$, where x is an $n \times 1$ matrix, then we obtain (1.14).
(d) The $(i, j)$ entry of $A B$ is $\sum_{k=1}^{n} a_{ik} b_{kj}$. On the other hand, the $(i, j)$ entry of $c_k r_k$ equals the product of the i-th entry of $c_k$, namely $a_{ik}$, with the j-th entry of $r_k$, namely $b_{kj}$. Summing these entries, $a_{ik} b_{kj}$, over k yields the usual matrix product formula.
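The column-times-row expansion argued in 1.2.34 (d) can be checked numerically. The following sketch is illustrative only: the two matrices are hypothetical examples (not those of the exercise), and the helper names are made up for this check.

```python
def matmul(A, B):
    """Usual row-times-column matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def outer(col, row):
    """Product of an m x 1 column with a 1 x p row: an m x p matrix."""
    return [[c * r for r in row] for c in col]

def matmul_by_outer_products(A, B):
    """Sum over k of (k-th column of A) times (k-th row of B)."""
    m, n, p = len(A), len(B), len(B[0])
    total = [[0] * p for _ in range(m)]
    for k in range(n):
        col_k = [A[i][k] for i in range(m)]   # c_k
        row_k = B[k]                          # r_k
        term = outer(col_k, row_k)
        total = [[t + u for t, u in zip(t_row, u_row)]
                 for t_row, u_row in zip(total, term)]
    return total

# Hypothetical example matrices, chosen only for illustration.
A = [[1, 2, 0], [3, -1, 4]]
B = [[2, 1], [0, 5], [-3, 2]]
print(matmul(A, B))
print(matmul_by_outer_products(A, B))
```

Both calls print the same matrix, since each entry of the sum of outer products is exactly the sum of $a_{ik} b_{kj}$ over k.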
1.2.35.

1
2 8
, q(A) =
(a) p(A) = A 3A + 2 I , q(A) = 2A + I . (b) p(A) =
0
4
6
3
2
5
3
2
(c) $p(A)\, q(A) = (A^3 - 3A + 2 I)(2A^2 + I) = 2A^5 - 5A^3 + 4A^2 - 3A + 2 I$, while
$p(x)\, q(x) = 2x^5 - 5x^3 + 4x^2 - 3x + 2$.
(d) True, since powers of A mutually commute.
For the particular matrix from (b),
!
2
8
p(A) q(A) = q(A) p(A) =
.
4 6
3

0
.
1

1.2.36.

2
0

(a) Check that S = A by direct computation. Another example: S =

0
. Or, more
2

generally, 2 times any of the matrices in part (c).


(b) S 2 is only defined if S is square.!
!
1
0
a
b
(c) Any of the matrices
,
, where a is arbitrary and b c = 1 a2 .
0 1
c a
!
0 1
.
(d) Yes: for example
1
0
1
0
1 1 1
B 3 0
1C
C
B
C
B
C. (c) Since matrix addition is
1
1
3
1.2.37. (a) M has size (i+j)(k+l). (b) M = B
C
B
@ 2 2
0A
1 1 1
done entry-wise, adding the entries of each block is the same as adding the blocks. (d) X
has size k m, Y has size k n, Z has size l m, and W has size l n. Then A X + B Z
will have size i m. Its (p, q) entry is obtained by multiplying the pth row of M times the
q th column of P , which is ap1 x1q + + api xiq + bp1 z1q + + bpl zlq and equals the
sum of the (p, q) entries of A X and B Z. A similar argument works
for the remaining
three
!
!
0
0 1
blocks. (e) For example, if X = (1), Y = ( 2 0 ), Z =
, W =
, then
1
1
0
0
1
0
1 1
0
1
B 4
1 2
0
7
0C
B
C
B
C
B
C
C. The individual block products are
P = @ 0 0 1 A, and so M P = B
4
5
1
B
C
@ 2 4 2 A
1 1
0
0
1 1

0
4

B
C
@ 2 A

1
3

(1) +

1
0

B
C
B
@ 2 A (1) + @ 2

1.3.1.
(a)

1
1

1
2

7
9

4
2

!
!

!
3
C 0
0A
,
1
1

2R1 +R2

0
,
1

1
0

7
5

1
0

1
1
B
C
2 C
A = @ 2 A ( 2
1
1

B
@ 4

1
7

1
3

=
0

(2
1

0) +

1
0
0

1
1
1

0) + B
@2
1

0
1
1

3
0
0C
A
1
1

1
,
0
!

1
.
0

4
. Back Substitution yields x2 = 2, x1 = 10.
10

5 1 3 R1 +R2 3 5 1
26 . Back Substitution yields w = 2, z = 3.

(b)

0 13
1 8
3

1
1
03
1
0
0

1 2
1 0 4R1 +R3 1 2
1 0 23 R2 +R3 1 2
1 0
B
B
(c) B
2 8 8 C
2 8 8 C
2 8 8 C
A @ 0
A @ 0
@ 0
A.

9
9
4
5
9
0 3 13
0
0
1 3
Back
Substitution yields

0
1 z = 3, y
0= 16, x = 29.
1
0
1
1
4 2 1 2R1 +R2 1
4 2 1 3R1 +R3 1
4 2 1
B

C
B

C
B

(d) @ 2
0 3 7 A @ 0
8 7 5 A @ 0
8 7 5 C
A
3 2
2 1
3 2
2 1
0 14
8 4

1
0
7
1
1 4
2
R
+R
2
3
4
B
7 5 C
A. Back Substitution yields r = 3, q = 2, p = 1.
@0 8
51
0 0 17
4
4
3
2

1
0 2
0 1
1 0 2
0 1

C
B
0
1
0 1 2 C
0 1 2 C
B0 1
C

C reduces to B

C.
(e)
@0 0
0 3
2
0 0 A
2 3 6 A
4
0
0
7 5
0 0
0 5 15
3
Solution: x4 = 3, x3 = 2 , x2 = 1, x1 = 4.

0
0
1
1
1
3 1
1 2
1 3 1
1 2
B
B
1 1
3 1 0 C
2
0 2 C
B 0 2
C
C
B

C reduces to B

C.
(f ) B
@ 0
@ 0 0 2
1 1
4 7 A
4
8A

4 1
1
0
0 0
0 24
5
48
Solution: w = 2, z = 0, y = 1, x = 1.
B
B
B
@

1.3.2.
(a) 3 x + 2 y = 2, 4 x 3 y = 1; solution: x = 4, y = 5,
(b) x + 2 y = 3, x + 2 y + z = 6, 2 x 3 z = 1; solution: x = 1, y = 2, z = 1,
(c) 3 x y + 2 z = 3, 2 y 5 z = 1, 6 x 2 y + z = 3;
solution: x = 32 , y = 3, z = 1,
(d) 2 x y = 0, x + 2 y z = 1, y + 2 z w = 1, z + 2 w = 0;
solution: x = 1, y = 2, z = 2, w = 1.
,
y
= 34 ; (b) u = 1, v = 1; (c) u = 23 , v = 31 , w = 16 ; (d) x1 =
1.3.3. (a) x = 17
3
11
10
2
2
19
5
1
4
2
3 , x2 = 3 , x3 = 3 ; (e) p = 3 , q = 6 , r = 2 ; (f ) a = 3 , b = 0, c = 3 , d = 3 ;
(g) x = 31 , y = 67 , z = 38 , w = 92 .
1.3.4. Solving $6 = a + b + c$, $4 = 4a + 2b + c$, $0 = 9a + 3b + c$, yields $a = -1$, $b = 1$, $c = 6$, so $y = -x^2 + x + 6$.
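The three equations in 1.3.4 say that the parabola $y = a x^2 + b x + c$ passes through the points with $x = 1, 2, 3$ and $y = 6, 4, 0$. A minimal sketch of the computation, using exact fractions and a hypothetical helper `solve` (Gaussian elimination without row interchanges, which suffices here because no zero pivot arises):

```python
from fractions import Fraction

def solve(A, b):
    """Gaussian elimination (no row swaps) followed by Back Substitution."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    for j in range(n):
        for i in range(j + 1, n):
            factor = M[i][j] / M[j][j]
            M[i] = [mi - factor * mj for mi, mj in zip(M[i], M[j])]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Data points read off from the three equations: (x, y) pairs.
points = [(1, 6), (2, 4), (3, 0)]
A = [[x * x, x, 1] for x, _ in points]   # one row (x^2, x, 1) per point
rhs = [y for _, y in points]
a, b, c = solve(A, rhs)
print(a, b, c)  # coefficients of y = a x^2 + b x + c
```

The result is a = -1, b = 1, c = 6, and substituting each x back into $-x^2 + x + 6$ reproduces the three y values.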
1.3.5.

2
(a) Regular:
1
(b) Not regular.
0

1
4

2
0

7
2

.
0

2
1
3 2
1
B
10
(c) Regular:
4 3 C
83 C
A @ 0
A.
3
3
2
5
0
0
4
0
1
0
1
1 2
3
1 2
3
B
C
B
(d) Not regular: @ 2
4 1 A @ 0
0
5C
A.
3 1
2
0
5 7
(e) Regular:
0
1
0
1
0
1 3 3 0
1
3 3 0
1
B
C
B
B
3 4 2 C
B 1 0 1 2 C
B0
C
B0
B
C B
C B
@ 3 3 6 1 A
@ 0 6
@0
3 1A
2 3 3 5
0 3
3 5
0
B
@ 1

1.3.6.

i
1 i

(a)
0

i
(b) B
@ 0
1
(c)

1 i
i

1+ i
1
0
2i
2i

1
3 i

i
0

1+ i
1 2i

3
3
0
0

1
1 2i

3
4
5
1

0
1
B
2C
C
B0
C B
@0
5A
7
0

use Back Substitution to obtain the solution y = 1, x = 1 2 i .

1 i
2i
i
0
1 i
B
1 + i
2 C
1+ i
A @ 0 2 i
1 2i
i
0 0 2 i
1
solution:
z=
i , y = 2 23 i , x = 1 + i .

1 i
2
i
2
i

0
2 i 23 12 i
1 + i 1
solution: y = 41 + 43 i , x = 21 .
7

2i
2
1 2i

C
A.

3
3
0
0

3
4
5
0

0
2C
C
C.
5A
6

(d)

0
B
@

1+ i i
2 + 2 i 0
i 2 + 2 i 0

B
C
0 A @ 0
1 2 + 3 i 0 C
2
i
A;

0
0 6 + 6 i 6
i 3 11 i 6
solution: z = 12 21 i , y = 25 + 21 i , x = 52 + 2 i .

1+ i
1 i
3 3i

1.3.7. (a) 2 x = 3, y = 4, 3 z = 1, u = 6, 8 v = 24. (b) x = 23 , y = 4, z = 31 ,


u = 6, v = 3. (c) You only have to divide by each coefficient to find the solution.
1.3.8. 0 is the (unique) solution since A 0 = 0.
1.3.9.
Back Substitution
start
    set $x_n = c_n / u_{nn}$
    for $i = n - 1$ to $1$ with increment $-1$
        set $x_i = \dfrac{1}{u_{ii}} \Bigl( c_i - \sum_{j=i+1}^{n} u_{ij}\, x_j \Bigr)$
    next i
end
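The Back Substitution pseudocode of 1.3.9 translates almost line by line into Python. The version below is a sketch under the same assumption as the pseudocode (an upper triangular U with nonzero diagonal entries); the function name and the small example system are hypothetical, and indices are 0-based rather than 1-based.

```python
from fractions import Fraction

def back_substitution(U, c):
    """Solve U x = c for upper triangular U with nonzero diagonal entries."""
    n = len(U)
    x = [Fraction(0)] * n
    # Work from the last equation up to the first.
    for i in range(n - 1, -1, -1):
        if U[i][i] == 0:
            raise ZeroDivisionError("zero diagonal entry: U is singular")
        # Sum of u_ij * x_j over the already computed unknowns j > i.
        s = sum(Fraction(U[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(c[i]) - s) / Fraction(U[i][i])
    return x

# Hypothetical example, not taken from the exercises.
U = [[2, 1, -1], [0, 3, 2], [0, 0, 4]]
c = [3, 7, 8]
x = back_substitution(U, c)
print(x)
```

For this example the last equation gives $x_3 = 2$, then $x_2 = 1$, then $x_1 = 2$.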

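1.3.10. Since
$$\begin{pmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ 0 & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} & a_{11} b_{12} + a_{12} b_{22} \\ 0 & a_{22} b_{22} \end{pmatrix},$$
$$\begin{pmatrix} b_{11} & b_{12} \\ 0 & b_{22} \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} & a_{22} b_{12} + a_{12} b_{11} \\ 0 & a_{22} b_{22} \end{pmatrix},$$
the matrices commute if and only if
$a_{11} b_{12} + a_{12} b_{22} = a_{22} b_{12} + a_{12} b_{11}$,
or
$(a_{11} - a_{22})\, b_{12} = a_{12}\, (b_{11} - b_{22})$.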
1.3.11. Clearly, any diagonal matrix is both lower and upper triangular. Conversely, A being lower triangular requires that $a_{ij} = 0$ for $i < j$; A upper triangular requires that $a_{ij} = 0$ for $i > j$. If A is both lower and upper triangular, $a_{ij} = 0$ for all $i \neq j$, which implies A is a diagonal matrix.
1.3.12.
(a) Set lij =
0

0
(b) L = B
@ 1
2

1.3.13.

aij ,
0,
0
0
0

i > j,
,
i j,

0
0C
A,
0

uij =
0

3
D=B
@0
0
0

aij ,
0,

0
4
0

i < j,
i j,

0
0C
A,
5

dij =
0

0
U =B
@0
0

1
0
0

aij ,
0,

i = j,
i 6= j.

1
2C
A.
0

0 0 1
C
3
(a) By direct computation, A2 = B
@ 0 0 0 A, and so A = O.
0 0 0
(b) Let A have size $n \times n$. By assumption, $a_{ij} = 0$ whenever $i > j - 1$. By induction, one proves that the $(i, j)$ entries of $A^k$ are all zero whenever $i > j - k$. Indeed, to compute the $(i, j)$ entry of $A^{k+1} = A A^k$ you multiply the i-th row of A, whose first i entries are 0, by the j-th column of $A^k$, in which only the first $j - k$ entries can be non-zero and all the rest are zero, according to the induction hypothesis; therefore, if $i > j - k - 1$, every term in the sum producing this entry is 0, and the induction is complete. In particular, for $k = n$, every entry of $A^k$ is zero, and so $A^n = O$.
(c) The matrix $A = \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}$ has $A^2 = O$.

1.3.14.
(a) Add 2 times the second row to the first row of a $2 \times n$ matrix.
(b) Add 7 times the first row to the second row of a $2 \times n$ matrix.
(c) Add 5 times the third row to the second row of a $3 \times n$ matrix.
(d) Add $\frac{1}{2}$ times the first row to the third row of a $3 \times n$ matrix.
(e) Add 3 times the fourth row to the second row of a $4 \times n$ matrix.

1
B
B0
1.3.15. (a) B
@0
0

0
1
0
0

1.3.16. L3 L2 L1 =

0
0
1
1
0

B
@2

0
0C
C
C, (b)
0A
1
0
1

21

1
B
B0
B
@0
0

0
1
0
0

0
0
1
0

0
0C
A 6= L1 L2 L3 .
1
1

0
0C
C
C, (c)
1 A
1

1
B
B0
B
@0
0

0
1
0
0

0
0
1
0

3
0C
C
C, (d)
0A
1

1
B
B0
B
@0
0

0
1
0
2

0
0
1
0

0
0C
C
C.
0A
1

1 0 0
1 0 0
B
1 0C
1 0C
1.3.17. E3 E2 E1 = B
A, E1 E2 E3 = @ 2
A. The second is easier to predict
@ 2
1
1
1 2 1
2 2 1
since its entries are the same as the corresponding entries of the Ei .
1.3.18.
(a) Suppose that E adds $c \neq 0$ times row i to row $j \neq i$, while $\widetilde{E}$ adds $d \neq 0$ times row k to row $l \neq k$. If $r_1, \ldots, r_n$ are the rows, then the effect of $\widetilde{E} E$ is to replace
    (i) $r_j$ by $r_l + c\, r_i + d\, r_k$ for $j = l$;
    (ii) $r_j$ by $r_j + c\, r_i$ and $r_l$ by $r_l + (c d)\, r_i + d\, r_j$ for $j = k$;
    (iii) $r_j$ by $r_j + c\, r_i$ and $r_l$ by $r_l + d\, r_k$ otherwise.
On the other hand, the effect of $E \widetilde{E}$ is to replace
    (i) $r_j$ by $r_l + c\, r_i + d\, r_k$ for $j = l$;
    (ii) $r_j$ by $r_j + c\, r_i + (c d)\, r_k$ and $r_l$ by $r_l + d\, r_k$ for $i = l$;
    (iii) $r_j$ by $r_j + c\, r_i$ and $r_l$ by $r_l + d\, r_k$ otherwise.
Comparing results, we see that $E \widetilde{E} = \widetilde{E} E$ whenever $i \neq l$ and $j \neq k$.
(b) $E_1 E_2 = E_2 E_1$, $E_1 E_3 \neq E_3 E_1$, and $E_3 E_2 = E_2 E_3$.
(c) See the answer to part (a).
1.3.19. (a) Upper triangular; (b) both special upper and special lower triangular; (c) lower
triangular; (d) special lower triangular; (e) none of the above.
1.3.20. (a) $a_{ij} = 0$ for all $i \neq j$; (b) $a_{ij} = 0$ for all $i > j$; (c) $a_{ij} = 0$ for all $i > j$ and $a_{ii} = 1$ for all i; (d) $a_{ij} = 0$ for all $i < j$; (e) $a_{ij} = 0$ for all $i < j$ and $a_{ii} = 1$ for all i.
1.3.21.
(a) Consider the product $L M$ of two lower triangular $n \times n$ matrices. The last $n - i$ entries in the i-th row of L are zero, while the first $j - 1$ entries in the j-th column of M are zero. So if $i < j$ each summand in the product of the i-th row times the j-th column is zero, and so all entries above the diagonal in $L M$ are zero.
(b) The i-th diagonal entry of $L M$ is the product of the i-th diagonal entry of L times the i-th diagonal entry of M.
(c) Special matrices have all 1's on the diagonal, and so, by part (b), does their product.
1.3.22. (a) L =
0

1
(c) L = B
@ 1
1
0
1
(e) L = B
@ 2
1
0
1
B
0
B
(g) L = B
B 1
@
0

1
B
0
B
U =B
@0
0

1 0
1 3
1 0
1
3
, U=
,
(b) L =
, U=
,
1 1
0 3
3 1
0 8
1
0
1
0
1
0
2 0
1 0 0
0 0
1 1 1
1
B
B
1 0C
1 0C
0C
(d) L = B
A, U = @ 0 3
A, U = @ 0 2
A,
@2
0 1
0 0
3
0 13 1
0 0
1
0
1
0
1
0
1
0 0
1 0 0
0 0
1 0
B
B
C
B
1 0C
1 0C
A, U = @ 0 3 0 A, (f ) L = @ 2
A, U = @ 0 3
1 1
0
0 2
3 31 1
0 0
0
1
1
0
1 0 1
0
0 0 0
1
0
0
B 0 2 1
B
1 C
1 0 0C
B
C
C
1
1
0
B
C, U = B
C
3
(h) L = B
B
C
1
7 C,
@ 2
1
1
2 1 0A
@0 0
2
2A
1
3
1
2
3 1

0 0
0 10
2

1
3
0
0

2
1
4
0

3
3C
C
C,
1A
1

(i) L =

0
B
B
B
B
B
@

0
1

1
2
3
2
1
2

37
1
7

0
0
1

5
22

0
2
C
B
C
B
0C
B0
C, U = B
C
B0
0A
@
0
1

7
2

23
22
7
0

1
2
5
7
35
22

0
0

21

C
A,
7
6 1

1
4C
A,
13
3
1

0
0C
C
C,
0A
1
1

C
C
C
C.
C
A

1.3.23. (a) Add 3 times first row to second row. (b) Add 2 times first row to third row.
(c) Add 4 times second row to third row.
1.3.24.
(a) $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 3 & 4 & 1 & 0 \\ 5 & 6 & 7 & 1 \end{pmatrix}$
(b) (1) Add 2 times first row to second row. (2) Add 3 times first row to third row.
(3) Add 5 times first row to fourth row. (4) Add 4 times second row to third row.
(5) Add 6 times second row to fourth row. (6) Add 7 times third row to fourth row.
(c) Use the order given in part (b).
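One way to read parts (b) and (c) of 1.3.24 is that the six row operations, written as elementary matrices and multiplied left to right in the listed order, reproduce the matrix in part (a). The sketch below checks that reading; it takes the multiples exactly as printed above (any lost signs would change the entries accordingly), and the helper names are hypothetical.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def elementary(n, target, source, c):
    """Elementary matrix adding c times row `source` to row `target` (1-based)."""
    E = identity(n)
    E[target - 1][source - 1] = c
    return E

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# (target row, source row, multiple), in the order listed in part (b).
ops = [(2, 1, 2), (3, 1, 3), (4, 1, 5), (3, 2, 4), (4, 2, 6), (4, 3, 7)]

L = identity(4)
for target, source, c in ops:
    L = matmul(L, elementary(4, target, source, c))   # E1 E2 ... E6

L_reversed = identity(4)
for target, source, c in reversed(ops):
    L_reversed = matmul(L_reversed, elementary(4, target, source, c))  # E6 ... E1

print(L)
print(L_reversed)
```

In the listed order each multiple lands in its own position below the diagonal with no interaction, whereas the reversed product mixes the multiples (for instance, its (4, 2) entry becomes 6 + 7 · 4 = 34), which illustrates why the order matters.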

1.3.25. See equation (4.51) for the general case.


0
0

1
B
Bt
B 1
B 2
Bt
@ 1
t31

1
Bt
@ 1
t21
1
t2
t22
t32

1
t3
t23
t33

1
t1

1
t1

1
t2

1
t2
t22

1
1
B
t3 C
A = @ t1
t21
t23

1
1
C
B
C
B
t4 C B t1
C=B 2
Bt
t24 C
A
@ 1
3
t31
t4
0

0
1

1
0

0
1
t1 + t 2

1
t2 t1
10

0
1
B
0C
A@0
0
1

0
1
t1 + t 2
t21 + t1 t2 + t22
1

B
B0
B
B
B0
@

1
t2 t1
0
0

1
t2 t1
0

0
0
1
t1 + t 2 + t 3

1
t3 t 1
(t3 t1 ) (t3 t2 )
0
10

1
C
t3 t 1
A,
(t3 t1 ) (t3 t2 )
1

0
C
0C
C
C
0C
A
1

1
C
C
t4 t 1
C
C.
C
(t4 t1 ) (t4 t2 )
A
(t4 t1 ) (t4 t2 ) (t4 t3 )

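1.3.26. False. For instance, $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ is regular. Only if the zero appears in the (1, 1) position does it automatically preclude regularity of the matrix.

1.3.27. $(n - 1) + (n - 2) + \cdots + 1 = \dfrac{n (n - 1)}{2}$.

1.3.28. We solve the equation
$$\begin{pmatrix} 1 & 0 \\ l & 1 \end{pmatrix} \begin{pmatrix} u_1 & u_2 \\ 0 & u_3 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
for $u_1, u_2, u_3, l$, where $a \neq 0$ since $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is regular. This matrix equation has a unique solution: $u_1 = a$, $u_2 = b$, $u_3 = d - \dfrac{b c}{a}$, $l = \dfrac{c}{a}$.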

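1.3.29. The matrix factorization $A = L U$ is
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ a & 1 \end{pmatrix} \begin{pmatrix} x & y \\ 0 & z \end{pmatrix} = \begin{pmatrix} x & y \\ a x & a y + z \end{pmatrix}.$$
This implies $x = 0$ and $a x = 1$, which is impossible.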

1.3.30.
(a) Let $u_{11}, \ldots, u_{nn}$ be the pivots of A, i.e., the diagonal entries of U. Let D be the diagonal matrix whose diagonal entries are $d_{ii} = \operatorname{sign} u_{ii}$. Then $B = A D$ is the matrix obtained by multiplying each column of A by the sign of its pivot. Moreover, $B = L U D = L \widetilde{U}$, where $\widetilde{U} = U D$, is the L U factorization of B. Each column of $\widetilde{U}$ is obtained by multiplying the corresponding column of U by the sign of its pivot. In particular, the diagonal entries of $\widetilde{U}$, which are the pivots of B, are $u_{ii} \operatorname{sign} u_{ii} = | u_{ii} | > 0$.
(b) Using the same notation as in part (a), we note that $C = D A$ is the matrix obtained by multiplying each row of A by the sign of its pivot. Moreover, $C = D L U$. However, $D L$ is not special lower triangular, since its diagonal entries are the pivot signs. But $\widehat{L} = D L D$ is special lower triangular, and so, since $D D = I$, $C = D L D\, D U = \widehat{L} \widehat{U}$, where $\widehat{U} = D U$, is the L U factorization of C. Each row of $\widehat{U}$ is obtained by multiplying the corresponding row of U by the sign of its pivot. In particular, the diagonal entries of $\widehat{U}$, which are the pivots of C, are $u_{ii} \operatorname{sign} u_{ii} = | u_{ii} | > 0$.
(c)
10
1
0
1
0
2 2
1
2 2 1
1 0 0
CB
B
C
B 1
3C
@ 1 0 1 A = @ 2 1 0 A@ 0 1
2 A,
2 6 1
0 0 4
4 2 3
10
0
1
0
1
2 2 1
2 2 1
1 0 0
CB
B
C
B 1
3C
@ 1 0 1 A = @ 2 1 0 A@ 0 1 2 A,
2 6 1
4 2 3
0 0
4
1
10
0
1
0
2 2 1
2 2 1
1
0 0
CB
B
C
B 1
3C
1 0 A@ 0
1
0
1A = @ 2
@ 1
2 A.
0
0
4
2 6 1
4 2 3

1.3.31. (a) x =

1
2
3

, (b) x =

1
@ 4 A,
1
4

(c) x =
0

B C
@ 1 A,

1
37
12
2
B
B
0
B C
B 17
B
C
B1C
12
C, (h) x = B
(f ) x = @ 1 A, (g) x = B
B
@1A
B 1
1
@ 4
0
2
0

11

C
C
C
C,
C
C
A

(d) x =

(i) x =

0
B
B
B
@

0
B
B
B
B
B
B
@

4
7
2
7
5
17

3
35
6
35
1
7
8
35

C
C
C
C.
C
C
A

C
C
C,
A

(e) x =

B 1 C
@
A,
5
2

1.3.32.
1
3

0
1

1
(b) L = B
@ 1
1

0
1
0

(a) L =
0

(c)

(d)

(e)

(f )

, U=
1

1
0

3
;
11

0
1
B
0C
A, U = @ 0
1
0
1

1
2
0

x1 = @
1

1
0C
A;
3

1
5
11 A,
2
11
0

x2 =
1

1
B
B
C
B
x 1 = @ 0 A, x 2 = B
@
0
0

1
9
11 A;
3
1 11

1
, x3 = @
1

1
6
3
2
5
3

C
C
C;
A

9 2 1
0
1
2
B
C
B C
B
1C
B0 1
C;
2
9 C
,
U
=
x
=
,
x
=
L = 23
@
@
A
A;
0C
@
A
1
2
3
3A
5
1
2
1
3
00 0 3
9
3 1
0
1
1
1
0
0
2.0
.3
.4
B
L=B
1
0C
.355
4.94 C
@ .15
A, U = @ 0
A;
.2 1.2394
1
0
0
.2028
0
0
0
1
1
1
.6944
1.1111
9.3056
B
B
B
C
C
x1 = @ 1.3889 A, x2 = @ 82.2222 A, x3 = @ 68.6111 C
A
.0694
6.1111
4.9306
0
0 51
0
1
1
0
1
1 0 1
0
1
0
0 0
4
14
B
C
B
B0 2
B 5
B 0
B1 C
3 1 C
B
C
1
0 0C
B 14
C
B 4C
C;
B
C
B
C, U = B
B
,
x
=
L=B
x
=
3
7
7
B
C
2
1
1
0
1
1
B
C
B
@ 1
A
0
0

@
2
2
2A
@ 14
@ 4A
1
0 2 1 1
1
1
0 0
0
4
4
2
0
1
1
0
1
0 0 0
1 2
0
2
B
B
4
1 0 0C
9 1 9 C
C
C
B0
B
C, U = B
C;
L=B
17
1
@ 8
A
A
@0
1
0
0
0
9
9
4 0
1 1 0 1 0 1 0
0
0
1
0
1
1
1
10
B C
B C
B
C
0C
B1C
B 8C
B C, x 2 = B C, x 3 = B
C.
x1 = B
@4A
@3A
@ 41 A
0
2
4
B
B
@

0
1

C
C
C
C;
C
A

1.4.1. The nonsingular matrices are (a), (c), (d), (h).


1.4.2. (a) Regular and nonsingular, (b) singular, (c) nonsingular, (d) regular and nonsingular.
1.4.3. (a) x1 = 53 , x2 = 10
(b) x1 = 0, x2 = 1, x3 = 2;
3 , x3 = 5;
9
(c) x1 = 6, x2 = 2, x3 = 2;
(d) x = 13
2 , y = 2 , z = 1, w = 3;
(e) x1 = 11, x2 = 10
3 , x3 = 5, x4 = 7.
1.4.4. Solve the equations $-1 = 2b + c$, $3 = -2a + 4b + c$, $-3 = 2a - b + c$, for $a = -4$, $b = -2$, $c = 3$, giving the plane $z = -4x - 2y + 3$.
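The system in 1.4.4 is a small case where a row interchange is unavoidable: the first equation has no a term, so the (1, 1) entry is zero. The sketch below assumes the three data points are (0, 2, -1), (-2, 4, 3) and (2, -1, -3), read off from the three equations of the form $z = a x + b y + c$; the function name is hypothetical.

```python
from fractions import Fraction

def solve_nonsingular(A, b):
    """Gaussian elimination with row interchanges, then Back Substitution.

    Returns (x, number_of_row_interchanges).
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    swaps = 0
    for j in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column j.
        k = next((r for r in range(j, n) if M[r][j] != 0), None)
        if k is None:
            raise ValueError("matrix is singular")
        if k != j:
            M[j], M[k] = M[k], M[j]
            swaps += 1
        for i in range(j + 1, n):
            factor = M[i][j] / M[j][j]
            M[i] = [mi - factor * mj for mi, mj in zip(M[i], M[j])]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x, swaps

# Assumed data points (x, y, z) on the plane z = a x + b y + c.
points = [(0, 2, -1), (-2, 4, 3), (2, -1, -3)]
A = [[x, y, 1] for x, y, _ in points]
rhs = [z for _, _, z in points]
(a, b, c), swaps = solve_nonsingular(A, rhs)
print(a, b, c, swaps)
```

With these points the solver performs one row interchange and returns a = -4, b = -2, c = 3.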
1.4.5.
(a) Suppose A is nonsingular. If $a \neq 0$ and $c \neq 0$, then we subtract $c/a$ times the first row from the second, producing the (2, 2) pivot entry $(a d - b c)/a \neq 0$. If $c = 0$, then the pivot entry is d and so $a d - b c = a d \neq 0$. If $a = 0$, then $c \neq 0$ as otherwise the first column would not contain a pivot. Interchanging the two rows gives the pivots c and b, and so $a d - b c = -b c \neq 0$.
(b) Regularity requires $a \neq 0$. Proceeding as in part (a), we conclude that $a d - b c \neq 0$ also.
1.4.6. True. All regular matrices are nonsingular.

1.4.7. Since A is nonsingular, we can reduce it to the upper triangular form with nonzero diagonal entries (by applying the operations #1 and #2). The rest of the argument is the same as in Exercise 1.3.8.
1.4.8. By applying the operations #1 and #2 to the system $A x = b$ we obtain an equivalent upper triangular system $U x = c$. Since A is nonsingular, $u_{ii} \neq 0$ for all i, so by Back Substitution each solution component, namely
$$x_n = \frac{c_n}{u_{nn}} \qquad \text{and} \qquad x_i = \frac{1}{u_{ii}} \Bigl( c_i - \sum_{k=i+1}^{n} u_{ik}\, x_k \Bigr),$$
for $i = n - 1, n - 2, \ldots, 1$, is uniquely defined.
0

0 0 0
0 0 0
B
0 0 1C
C
B0 1 0
C, (b) P2 = B
1.4.9. (a) P1 =
@0 0 1
0 1 0A
0 1 0 0
1 0 0
(c) No, they do not commute. (d) P1 P2 arranges
P2 P1 arranges them in the order 2, 4, 3, 1.
B
B0
B
@0

1.4.10. (a)

B
@0

1
0
0

0
0
B
B0
C
1 A, (b) B
@1
0
0

0
0
0
1

0
1
0
0

1
0C
C
C, (c)
0A
0

1
0C
C
C,
0A
0
the rows in the order 4, 1, 3, 2, while

0
B
B1
B
@0
0

1
0
0
0

0
0
0
1

0
0C
C
C, (d)
1A
0

B1
B
B
B0
B
@0

0
0
0
1
0

0
0
1
0
0

1
0
0
0
0

0
0C
C
C.
0C
C
0A
1

1.4.11. The (i, j) entry of the following Multiplication Table indicates the product P i Pj , where
0

1
P1 = B
@0
0
0
0
P4 = B
@1
0

0
1
0
1
0
0

0
0C
A,
1
1
0
0C
A,
1

0
P2 = B
@0
1
0
0
P5 = B
@0
1

1
0
0
0
1
0

0
1C
A,
0
1
1
0C
A,
0

0
P3 = B
@1
0
0
1
P6 = B
@0
0

0
0
1
0
0
1

The commutative pairs are P1 Pi = Pi P1 , i = 1, . . . , 6, and P2 P3 = P3 P2 .

1.4.12. (a)

B
B0
B
@0

        P1    P2    P3    P4    P5    P6

  P1    P1    P2    P3    P4    P5    P6
  P2    P2    P3    P1    P6    P4    P5
  P3    P3    P1    P2    P5    P6    P4
  P4    P4    P5    P6    P1    P2    P3
  P5    P5    P6    P4    P3    P1    P2
  P6    P6    P4    P5    P2    P3    P1

0
1
0
0

0
0
1
0

0
0C
C
C,
0A
1

B
B0
B
@0

1
0
0
0

0
0
1
0

0
1C
C
C,
0A
0

B
B1
B
@0

13

0
0
0
1

0
0
1
0

1
0C
C
C,
0A
0

B
B1
B
@0

1
0
0
0

0
0
1
0

1
0C
A,
0
1
0
1C
A.
0

0
0C
C
C,
0A
1

B
B0
B
@0

0
1
0
0

0
0
1
0

1
0C
C
C,
0A
0

B
B0
B
@0

0
0
B
1
B
B
@0
0
0

0
0
0
1
0
0
1
0

0
0
1
0
0
0
0
1

0
1C
C
C;
0A
01
1
0C
C
C,
0A
0

(b)
0

0
B
0
B
B
@1
0

B
B1
B
@0

1
0
0
0

0
0
0
0
1

1 0
0 0
0 0
0 11
0
1C
C
C;
0A
0

0
0C
C
C,
1A
0

0
0
1
0
0
0
1 0
B
0 0
B
(c) B
@0 1
0 0
B
B0
B
@0

0
0
0
1
0
1
0
0

0
1C
C
C,
0A
01
0
0C
C
C,
0A
1

0
1
0
0
0
0
0 0
B
0
0
B
B
@0 1
1 0
B
B0
B
@1

0
0
0
1
0
1
0
0

1
0C
C
C,
0A
01
1
0C
C
C.
0A
0

B
B0
B
@0

0
1
0
0

0
0
0
1

0
0C
C
C,
1A
0

1.4.13. (a) True, since interchanging the same pair of rows twice brings you back to where you started. (b) False; an example is the non-elementary permutation matrix $\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$.
(c) False; for example $P = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$ is not a permutation matrix. For a complete list of such matrices, see Exercise 1.2.36.
1.4.14. (a) Only when all the entries of v are different; (b) only when all the rows of A are
different.
1.4.15. (a) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$. (b) True. (c) False: $A P$ permutes the columns of A according to the inverse (or transpose) permutation matrix $P^{-1} = P^T$.
1.4.16.
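(a) If P has a 1 in position $(\pi(j), j)$, then it moves row j of A to row $\pi(j)$ of $P A$, which is enough to establish the correspondence.
(b) (i) $\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, (ii) $\begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$, (iii) $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}$, (iv) $\begin{pmatrix} 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{pmatrix}$.
Cases (i) and (ii) are elementary matrices.
(c) (i) $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$, (ii) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}$, (iii) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix}$, (iv) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 5 & 3 & 1 & 4 \end{pmatrix}$.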
1.4.17. The first row of an $n \times n$ permutation matrix can have the 1 in any of the n positions, so there are n possibilities for the first row. Once the first row is set, the second row can have its 1 anywhere except in the column under the 1 in the first row, and so there are $n - 1$ possibilities. The 1 in the third row can be in any of the $n - 2$ positions not under either of the previous two 1's. And so on, leading to a total of $n (n - 1)(n - 2) \cdots 2 \cdot 1 = n!$ possible permutation matrices.
1.4.18. Let $r_i$, $r_j$ denote the rows of the matrix in question. After the first elementary row operation, the rows are $r_i$ and $r_j + r_i$. After the second, they are $r_i - (r_j + r_i) = -r_j$ and $r_j + r_i$. After the third operation, we are left with $-r_j$ and $r_j + r_i + (-r_j) = r_i$.
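The three steps in 1.4.18 can be traced on a concrete pair of rows. This is only an illustrative sketch with hypothetical row values; `add_multiple` is a made-up helper that performs a single elementary row operation of the first type.

```python
def add_multiple(rows, target, source, c):
    """Return a copy of rows with c * rows[source] added to rows[target]."""
    new = [list(r) for r in rows]
    new[target] = [t + c * s for t, s in zip(rows[target], rows[source])]
    return new

# Hypothetical rows: index 0 plays the role of r_i, index 1 of r_j.
rows = [[1, 2, 3], [4, 5, 6]]

step1 = add_multiple(rows, 1, 0, 1)     # rows: r_i,  r_j + r_i
step2 = add_multiple(step1, 0, 1, -1)   # rows: -r_j, r_j + r_i
step3 = add_multiple(step2, 1, 0, 1)    # rows: -r_j, r_i

print(step1)
print(step2)
print(step3)
```

The final pair is the original two rows interchanged, except that the row now in the first position carries a minus sign.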

1.4.19. (a)

0
1

1
0

0
2

1
1

1
0

0
1

2
0

1
,
1

14

x=

1
5
@ 2 A;
0

0
(b) B
@0
1
(c)

(d)

B
B0
B
@0

0
0
B
1
B
(e) B
@0
0
0
0
B0
B
B0
(f ) B
B
@1
0
0

1.4.20.

0
B
@1
0
0
0
1
0
0
0
1
0
0
1
0
0
0

1
(a) B
@0
0

(b)

B
B0
B
@1

(c)

1
0
0

B
B0
B
@0

0
0
1
0
1
0
0
1
0
0
0
1
0
0
0
0
0
0
1
0
1
0
0
0
0
0
1

10

0
0
B
1C
A@ 1
0
0
10

1
0 4
B
2 3C
A = @0
0
1 7

0
1
0

10

0
1
B
0C
A@ 0
1
0
10

1
0 1 3
1 0 0
1
B
B
CB
0C
3C
A@ 0 2
A = @ 0 1 0 A@ 0
010 1 0 2
00 2 1
0
1
0
1 2 1 0
1 0 0
B
B
0C
6 2 1 C
0
CB 3
C
B1 1
CB
C=B
0 A@ 1 1 7 2 A @ 3 0 1
1 10 1 1 2 11 0 1 3 21
5
0
0 1 0 0
1
0 0
B
B
0C
3 1 0C
1 0
CB 2
C
B0
CB
C=B
0 A@ 1 4 1 2 A @ 2 5 1
1
7 1 2 3
7 29 3
10
1
0
0 0
0 0 2 3 4
1 0
C
B
C
0 0 CB 0 1 7 2 3 C B
0
1
B
B
C
B
CB 1 4
C = B0 0
1 0C
1
1
1
CB
C
B
0 0 A@ 0 0 1 0 2 A @ 0 0
0 1
0 0 1 7 3
0 0
10

1
5
4C
3C
C;
=
4A
1
41
0
B
B
B
@

0 2
1
C
1 3 C
x=B
@ 1 A;
A,
0109
0
1
0
1
1 2 1 0
22
0
B
B
C
2C
0C
C
B 13 C
CB 0 1 6
C, x = B
CB
C;
@ 5 A
0 A@ 0 0 5 1 A
4
110 0 0 0 51
22
0
1
0
1 4 1 2
1
B
B
C
0C
0 0C
CB 0 1
C
B 1 C
CB
C, x = B
C;
@ 1A
0 A@ 0 0 3 4 A
1
0 0 0 1
3
10
1
0
1
1 4
1 1 1
1
0 0 0
C
B
C
B
C
0 0 0 CB 0 1 7 2 3 C
B 0C
CB
C
B
B0 0
C.
1 0 2C
,x=B
0C
1 0 0C
CB
C
B
C
@ 1 A
0 3 0A
2 1 0 A@ 0 0
0 0
0 0 1
0
1 37 1
10

2 3
1 7C
A,
0 4

4 4
2
1 0 0
2
CB
C
B
3
CB 0 2 1 C;
1C
A=B
1
0

A@
@
4
2A
5
2
0
0
34 0 1
2
5
7
3
solution: x1 = 4 , x2 = 4 , x3 = 2 .
10
1
0
10
1
1 0
0
1 1
1
1 0 0 0
1 1
1 3
B
B
B
0 0C
1
1
0C
0 0C
1
1
0C
CB 0
C
B0 1
CB 0
C
CB
C=B
CB
C;
A
@
A
@
A
@
0 0
1 1
1 3
0 1 1 0
0
0 2
1A
3
0 1
1
2 1
1
1 3 25 1
0
0
0
2
solution:
x
=
4,
y
=
0,
z
=
1,
w
=
1.
10
1
10
1
0
1 1
2
1
0 0
1 1
2
1
1 0
0 0
CB
CB
C
B
3 3
0C
0 1 CB 1
1 3
0C B 1 1
0 0 CB 0
C
CB
C;
CB
C=B
0
2 4 A
1 0 A@ 1 1
1 3 A @ 1 0
1 0 A@ 0
1
0
0
0
1
0 0
1
2 1
1
1 0 2 1
19
5
solution: x = 3 , y = 3 , z = 3, w = 2.
0
4
B
1C
A@ 3
0
3

4
3
1

1.4.21.
(a) They are all of the form P A = L U , where P is a permutation matrix. In the first case,
we interchange rows 1 and 2, in the second case, we interchange rows 1 and 3, in the
third case, we interchange rows 1 and 3 first and then interchange rows 2 and 3.
(b) Same solution x = 1, y = 1, z = 2 in all cases. Each is done by a sequence of elementary row operations, which do not change the solution.
1.4.22. There are four
0 in
0
B
@1
0
0
0
B
@0
1
0
0
B
@0
1

all: 1 0
1 0
0
B
0 0C
A@1
0 1
1
10
1 0
0
B
0 1C
A@1
0 0
1
10
0 1
0
B
1 0C
A@1
0 0
1

1
0
1
1
0
1
1
0
1

2
1
B
1 C
A = @0
3
1
1
0
2
1
B
1 C
A = @1
3
0
1
0
2
1
B
1 C
A = @1
3
0

0
1
1
0
1
1

10

0
1 0 1
B
0C
2C
A@0 1
A,
1
0 0
2
10
1
0
1 0 1
B
0C
4C
A@0 1
A,
1
0 0 2
10
1
0 0
1
1
3
B
C
1 0C
A @ 0 1 4 A ,
1 1
0
0 2

10

10

0 1
0 1
2
1
0
B
B
C
0 0C
1
A @ 1 0 1 A = @ 0
0 1 0
1 1
3
1 1
The other two permutation matrices are not regular.
B
@1

1.4.23. The maximum is 6 since there


0
1
B
@ 1
1
0
10
1 0 0
1
B
CB
@0 0 1A@ 1
0 1 0
1
0
10
0 1 0
1
B
CB
@1 0 0A@ 1
0 0 1
1
0

B
@0

1
0
B
@0
1
0
0
B
@1
0
0

1
0
0
0
1
0
0
0
1

10

0
1
B
1C
A@ 1
0
1
10
1
1
B
0C
A@ 1
0
1
10
1
1
B
0C
A@ 1
0
1

are
0
1
1
0
1
1
0
1
1
0
1
1
0
1
1
0
1
1

0
1
B
0C
A@0
1
0

1
1
0

3
2C
A.
2

6 different 3 3 permutation matrices. For example,


1
0
10
1
0
1 0 0
1 0 0
B
CB
C
0C
A = @ 1 1 0A@0 1 0A,
1
1 1 1
0 0 1
1
0
10
1
0
1 0 0
1 0
0
B
CB
0C
1C
A = @ 1 1 0 A @ 0 1
A,
1
1 1 1
0 0 1
1
0
10
1
0
1
0 0
1
1 0
B
B
C
0C
1 0C
A=@ 1
A @ 0 1 0 A ,
1
1 2 1
0
0 1
1

0
1
B
0C
A = @ 1
1
1
0
1
1
0
B
0C
A = @ 1
1
1
1
0
0
1
B
0C
A = @ 1
1
1

10

1 1 0
0
B
1C
0C
A@0 2
A,
1
1
0 0 2
2 1
1
10
1 1 1
0 0
CB
1 0A@ 0 2 1C
A,
1
0 0 12
2 1
10
1
0 0
1 1
1
CB
1 0A@ 0 1
1C
A.
2 1
0 0 1
0
1

1.4.24. False. Changing the permutation matrix typically changes the pivots.
1.4.25.
Permuted L U factorization

start
  set P = I, L = I, U = A
  for j = 1 to n
    if u_kj = 0 for all k ≥ j, stop; print "A is singular"
    if u_jj = 0 but u_kj ≠ 0 for some k > j then
      interchange rows j and k of U
      interchange rows j and k of P
      for m = 1 to j − 1 interchange l_jm and l_km next m
    for i = j + 1 to n
      set l_ij = u_ij / u_jj
      add −l_ij times row j to row i of U
    next i
  next j
end
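
The pseudocode above can be sketched in Python as follows. This is only an illustrative transcription, not the authors' program: the function names `permuted_lu` and `matmul` are our own, exact `Fraction` arithmetic is an assumption made to avoid rounding, and the first available nonzero pivot is used rather than any partial-pivoting strategy.

```python
from fractions import Fraction


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def permuted_lu(a):
    """Return (P, L, U) with P A = L U; raise ValueError if A is singular."""
    n = len(a)
    u = [[Fraction(x) for x in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    p = [[int(i == j) for j in range(n)] for i in range(n)]
    for j in range(n):
        # First row k >= j whose entry in column j is nonzero, if any.
        k = next((r for r in range(j, n) if u[r][j] != 0), None)
        if k is None:
            raise ValueError("A is singular")
        if k != j:
            u[j], u[k] = u[k], u[j]
            p[j], p[k] = p[k], p[j]
            # Only the already computed multipliers (columns before j) are swapped.
            for m in range(j):
                l[j][m], l[k][m] = l[k][m], l[j][m]
        for i in range(j + 1, n):
            l[i][j] = u[i][j] / u[j][j]
            # Add -l_ij times row j to row i of U.
            u[i] = [x - l[i][j] * y for x, y in zip(u[i], u[j])]
    return p, l, u
```

For instance, `permuted_lu([[0, 2, 1], [1, 1, 0], [2, 0, 1]])` needs one row interchange, and the returned factors satisfy `matmul(P, A) == matmul(L, U)`.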


1.5.1.

2
3
1 3
1 0
1 3
2
3
,
=
=
(a)
1 1
1
2
0 1
1
2
1 1
0
10
1
0
1
0
10
1
2 1 1
3 1 1
1 0 0
3 1 1
2 1 1
CB
B
C
B
B
C
(b) B
2
1C
2
1C
@ 3 2 1 A @ 4
A = @ 0 1 0 A = @ 4
A @ 3 2 1 A,
2 1 2
1
0
1
0 0 1
1
0
1
2 1 2
0
1
10
0
10
0
1
1
1
1
1
1
1 3
2 B 1
1
0
0
1 3
B
C
CB
CB
B
C
4
1
3
4
1
3
B
C
C
(c) B
2
2
1
=
0
1
0
=
2 2
@
A@
@
A
@
7 7 7 A
7 7 7 A@
6
5
8
6
5
8
2 1
3
0
0
1
2
1
17
7 17 0 7
0
07
10
1
7
5 16
6
5 16
6
1
2
0
1 0 0
B
B
C
CB
C
1
3C
1.5.2. X = B
@ 3 8 3 A; X A = @ 3 8 3 A @ 0
A = @ 0 1 0 A.
1
3
1
1
3
1
1 1 8
0 0 1
1.5.3. (a)
0

1
(d) B
@0
0
1.5.4.

B
@a

0
1

1
,
0

0
1
0
0
1
0

1 0
(b)
,
5 1
0
1
1
0
0
B
1
C
B0
3 A, (e) B
@ 0 6
1
0
0
10

1
0
B
0C
A @ a
b
1

0
1
0

(c)
0
0
1
0

0
0C
C
C,
0A
1

2
,
1

0
B
B0
(f ) B
@0
1

0
1
0
0

0
0
1
0

1
0C
C
C.
0A
0

10

1
1 0 0
0
CB
B
0C
A = @ a 1 0 A @ a
b
b 10 1
1
1
0 0
=B
1 0C
@ a
A.
ac b c 1

1
0
B
0C
A = @0
0
1
M 1

1
0

0
1
0
0

0
1
0

0
0C
A;
1

1.5.5. The ith row of the matrix multiplied by the ith column of the inverse should equal 1.
This is not possible if all the entries of the ith row are zero; see Exercise 1.2.24.
1.5.6. (a) A1 =
(b) C =

2
3

1
2
!

1
, B 1 = @
1

2
3
1
30

1
1
3 A.
1
3

0
1
, C 1 = B 1 A1 = @
0
1
!

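1.5.7. (a) $R_\theta^{-1} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$.
(b) $\begin{pmatrix} a \\ b \end{pmatrix} = R_\theta^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta + y\sin\theta \\ -x\sin\theta + y\cos\theta \end{pmatrix}$.
(c) $\det(R_\theta - a\,I) = \det \begin{pmatrix} \cos\theta - a & -\sin\theta \\ \sin\theta & \cos\theta - a \end{pmatrix} = (\cos\theta - a)^2 + (\sin\theta)^2 > 0$
provided $\sin\theta \ne 0$, which is valid when $0 < \theta < \pi$.

1.5.8.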

1
1
3 A.
23

1
(a) Setting P1 = B
@0
0
0
0
P4 = B
@1
0
1
we find P1 = P1 ,

0 0
0
1 0C
P2 = B
A,
@0
0 1
1
1
0
1 0
0
0 0C
P5 = B
A,
@0
0 1
1
1
1
P2 = P 3 , P3 = P 2 ,

(b) P1 , P4 , P5 , P6 are their own inverses.



1 0
0
0 1C
P3 = B
A,
@1
0 0
0
1
0
0 1
1
1 0C
P6 = B
A,
@0
0 0
0
1
1
P4 = P 4 , P5 = P 5 ,

0 1
0 0C
A,
1 0
1
0 0
0 1C
A,
1 0
P61 = P6 .

2
1 C
A.
3

B
B1
B
@0

(c) Yes: P =

0
B
B0
1.5.9. (a) B
@0
1

0
0
1
0

0
1
0
0

1
0
0
0

0
0
0
1
1

0
0C
C
C interchanges two pairs of rows.
1A
0

1
0C
C
C, (b)
0A
0

0
B
B1
B
@0
0

0
0
1
0

0
0
0
1

1
0C
C
C, (c)
0A
0

1
B
B0
B
@0
0

0
0
0
1

0
1
0
0

0
0C
C
C, (d)
1A
0

B0
B
B
B0
B
@0

0
0
1
0
0

0
0
0
0
1

0
1
0
0
0

0
0C
C
C.
0C
C
1A
0

1.5.10.
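(a) If $i$ and $j = \pi(i)$ are the entries in the $i$th column of the $2 \times n$ matrix corresponding to
the permutation, then the entries in the $j$th column of the $2 \times n$ matrix corresponding to
the inverse permutation are $j$ and $i = \pi^{-1}(j)$. Equivalently, permute the columns so that the
second row is in order $1, 2, \dots, n$ and then switch the two rows.
(b) The permutations correspond to
(i) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}$, (ii) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix}$, (iii) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 4 & 2 \end{pmatrix}$, (iv) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 4 & 2 & 5 & 3 \end{pmatrix}$.
The inverse permutations correspond to
(i) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}$, (ii) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$, (iii) $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 2 & 3 \end{pmatrix}$, (iv) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 3 & 5 & 2 & 4 \end{pmatrix}$.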
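
The column-swapping rule in part (a) can be sketched in a few lines of Python. The function name `inverse_permutation` and the convention of passing only the bottom row of the two-row array are our own choices for illustration.

```python
def inverse_permutation(bottom_row):
    """Given [pi(1), ..., pi(n)], return [pi^{-1}(1), ..., pi^{-1}(n)]."""
    inverse = [0] * len(bottom_row)
    for i, j in enumerate(bottom_row, start=1):
        # The column (i, j) of pi becomes the column (j, i) of the inverse.
        inverse[j - 1] = i
    return inverse
```

Applying it to the four bottom rows listed in part (b) returns the bottom rows of the corresponding inverse permutations.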
1.5.11. If $a = 0$ the first row is all zeros, and so $A$ is singular. Otherwise, we make $d \to 0$ by
an elementary row operation. If $e = 0$ then the resulting matrix has a row of all zeros.
Otherwise, we make $h \to 0$ by another elementary row operation, and the result is a matrix
with a row of all zeros.
1.5.12. This is true if and only if $A^2 = I$, and so, according to Exercise 1.2.36, $A$ is either of
the form $\pm \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ or $\begin{pmatrix} a & b \\ c & -a \end{pmatrix}$, where $a$ is arbitrary and $b\,c = 1 - a^2$.

1.5.13. $(3\,I - A)A = 3A - A^2 = I$, so $3\,I - A$ is the inverse of $A$.

1.5.14. $\left( \frac{1}{c}\,A^{-1} \right)(c\,A) = \frac{1}{c}\,c\,A^{-1}A = I$.

1.5.15. Indeed, $(A^n)^{-1} = (A^{-1})^n$.


1.5.16. If all the diagonal entries are nonzero, then $D^{-1}D = I$. On the other hand, if one of
the diagonal entries is zero, then all the entries in that row are zero, and so $D$ is not invertible.
1.5.17. Since $U^{-1}$ is also upper triangular, the only nonzero summand in the product of the ith
row of $U$ and the ith column of $U^{-1}$ is the product of their diagonal entries, which must
equal 1 since $U\,U^{-1} = I$.
1.5.18. (a) $A = I^{-1}A\,I$. (b) If $B = S^{-1}AS$, then $A = S\,B\,S^{-1} = T^{-1}B\,T$, where $T = S^{-1}$.
(c) If $B = S^{-1}A\,S$ and $C = T^{-1}B\,T$, then $C = T^{-1}(S^{-1}AS)T = (S\,T)^{-1}A(S\,T)$.
1.5.19. (a) Suppose $D^{-1} = \begin{pmatrix} X & Y \\ Z & W \end{pmatrix}$. Then, in view of Exercise 1.2.37, the equation $D\,D^{-1} = I = \begin{pmatrix} I & O \\ O & I \end{pmatrix}$
requires $A\,X = I$, $A\,Y = O$, $B\,Z = O$, $B\,W = I$. Thus, $X = A^{-1}$, $W = B^{-1}$
and, since they are invertible, $Y = A^{-1}O = O$, $Z = B^{-1}O = O$.

(b)
1.5.20.

0
B
B
B
@

1
3
2
3

2
3
13

0C
C
,
0C
A

1
3

1
1
0
0

B
B 2
B
@ 0

0
0
5
2

0
0C
C
C.
3A
1

!
1 1
1
1 0 B
1 0
(a) B A =
1C
.
@0
A=
1 1 1
0 1
1
1
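(b) $A\,X = I$ does not have a solution. Indeed, the first column of this matrix equation is
the linear system $\begin{pmatrix} 1 & -1 \\ 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$, which has no solutions since $x - y = 1$, $y = 0$,
and $x + y = 0$ are incompatible.
(c) Yes: for instance, $B = \begin{pmatrix} 2 & 3 & -1 \\ -1 & -1 & 1 \end{pmatrix}$. More generally, $B\,A = I$ if and only if $B = \begin{pmatrix} 1 - z & 1 - 2z & z \\ -w & 1 - 2w & w \end{pmatrix}$, where $z, w$ are arbitrary.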
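
A quick numerical check of the two-parameter family of left inverses in part (c). The matrix `A` below is read off from the linear system in part (b) and should be treated as an assumption about the exercise statement; the helper names are ours.

```python
# A is inferred from part (b): x - y = 1, y = 0, x + y = 0 (an assumption).
A = [[1, -1], [0, 1], [1, 1]]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def left_inverse(z, w):
    """Member of the family B(z, w) claimed to satisfy B A = I."""
    return [[1 - z, 1 - 2 * z, z], [-w, 1 - 2 * w, w]]
```

Taking `z = -1`, `w = 1` gives the specific matrix quoted first in part (c); every other choice of `z` and `w` also yields a left inverse.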
0

2 y 1 2 v
1.5.21. The general solution to A X = I is X =
y
v C
A, where y, v are arbitrary.
1
1
Any of these matrices serves as a right inverse. On the other hand, the linear system
Y A = I is incompatible and there is no solution.
B
@

1.5.22.
(a) No. The only solutions are complex, with $a = \left( -\tfrac12 \pm \tfrac{\sqrt3}{2}\, i \right) b$, where $b \ne 0$ is any
nonzero complex number.
(b) Yes. A simple example is $A = \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. The general solution to the
$2 \times 2$ matrix equation has the form $A = B\,M$, where $M = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$ is any matrix with
$\operatorname{tr} M = x + w = -1$, and $\det M = x\,w - y\,z = 1$. To see this, if we set $A = B\,M$,
then $(I + M)^{-1} = I + M^{-1}$, which is equivalent to $I + M + M^{-1} = O$. Writing this
out using the formula (1.38) for the inverse, we find that if $\det M = x\,w - y\,z = 1$ then
$\operatorname{tr} M = x + w = -1$, while if $\det M \ne 1$, then $y = z = 0$ and $x + x^{-1} + 1 = 0 = w + w^{-1} + 1$,
in which case, as in part (a), there are no real solutions.
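
The trace and determinant condition in part (b) can be spot-checked as follows. This is a minimal sketch with hypothetical helper names; the test matrices are our own illustrative choices, not taken from the exercise.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula, or None if singular."""
    (x, y), (z, w) = m
    det = x * w - y * z
    if det == 0:
        return None
    d = Fraction(det)
    return [[w / d, -y / d], [-z / d, x / d]]


def add_identity(m):
    """Return I + M for a 2 x 2 matrix M."""
    return [[m[0][0] + 1, m[0][1]], [m[1][0], m[1][1] + 1]]


def sum_rule_holds(m):
    """True when (I + M)^{-1} = I + M^{-1}."""
    m_inv = inverse_2x2(m)
    lhs = inverse_2x2(add_identity(m))
    if m_inv is None or lhs is None:
        return False
    return lhs == add_identity(m_inv)
```

Matrices with trace $-1$ and determinant $1$ pass the check, while, for example, the identity matrix does not.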

1
B
B0
1.5.23. E = B
@0
0
1.5.24. (a)
0

3
(e) B
@9
1

0
@

1
1
2
7
1

0
1
0
0

0
0
7
0
1

2
3 A,
1
3
1

2
6 C
A,
1

0
0C
C
C,
0A
1
(b)

(f )

E 1
0
@

0
B
B
B
@

1
B
B0
=B
@0
0

1
8
3
8

5
8
1
2
7
8

0
1
0
0

0
0

1
7

3
8 A,
18
1
8
1
2
38

0
0C
C
C.
0A
1

(c)
5
8
12
1
8


C
C
C,
A

0
@

3
5
4
5
0

1
4
5 A,
3
5

5
B 2
(g) B
2
@
2

(d) no inverse,
3
2

1
1

1
1
2C
1 C
A,

(h)

B
B1
B
@0

1.5.25.

1
3
1
3

(a)
(b)

(c)

4
3

(d) not
0
1
(e) B
@3
0

(f )

(g)

B
@3

B
@0

2
6
5
2
!

0
1
!
0
1

!0

1
0
1
0

0
3

1
3 C
C
C,
3 A
1
!

0
8

0 @ 35 0 A
1
0 1
possible,
10
0 0
1
B
1 0C
A@ 0
0 1
2

0
1
0

0
0
1

0
0
(h)
1
0 0
0
1
B
B0
B
@0
0
0
1 0
B
0 1
B
(i) B
@0 0
00 0
1
B
B0
B
@0
0
B
B0
B
@0

1
2
0
0

1
0

1
0

(i)

51

8
2
3
1

B
B 13
B
@ 21

2
=
1
!
1 3
=
0 1
0
5
3

!0
@1

10

1
3
1
3

12
3
5
1

3
1C
C
C.
1 A
0

2
,
3
!
3
,
1

43 A @
=
1

10

3
5
4
5

45
3
5

A,

10

0 0
1
0 0
1
0 0
1 0
0
B
B
CB
1 0C
1 0C
0C
A@0
A @ 0 1 0 A @ 0 1
A
0
1
0
1
1
0
0
1
0
0
1
0
10
1
0
1
1 0 2
1 0
0
1
0 2
B
B
C
B
0C
0C
@0 1
A @ 0 1 6 A = @ 3 1
A,
0
0
1
0
0
1
2
1
3
10
10
10
10
1
0
1 0 0
1 0 0
1
0 0
1 0 0
CB
CB
CB
CB
0 A @ 0 1 0 A @ 0 1 0 A @ 0 1 0 A @ 0 1 0 C
A
1 0 2 0 11 0 0 3 11 0 0
0 1
1 00 0 8 1
1 2 3
1 2 0
1 0 0
1 0 3
C
B
C
CB
CB
B
@ 0 1 0 A @ 0 1 4 A @ 0 1 0 A = @ 3 5 5 A,
0 0 1 1 00 0 1 1 00 0 1 1 0 2 1 2 1
10
0
1 0 0
2 0 0
1
0 0
1 0
0
B
CB
CB
CB
1C
0C
A @ 0 1 0 A @ 0 1 0 A @ 0 1 0 A @ 0 1
A
0 0 2 0 11 0 0 0 1 1 0
0
0 1 1 00 0 1 1
1 0 1
1 0
0
2
1 2
1 21 0
B
CB
CB
B
C
2 3C
@ 0 1 0 A @ 0 1 1 A @ 0
A,
1 0A = @4
0 0 1
0 0
1
0 1 1
0
0
1
10
10
1
10
10
1 0 0 0
0 0
1 0
0 0
2 0 0 0
1
0 0 0
1
B1
CB
B
CB
C
1 0C
0 0C
C B 2 1 0 0 CB 0 1
CB 0 1 0 0 C B 0 2 0 0 C
CB
CB
CB
CB
C
0 0 A @ 0 0 1 0 A@ 0 0
1 0 A@ 0 0 1 0 A @ 0
0 1 0A
0 1
0 0 0 1
0 0 2 1
0 0 0 1
0
0 0 1
1
0
1
10
10
10
2 1
0
1
1 0 0 0
1 0 0 0
0 0 12
1 21 0 0
C
B
CB
CB
CB
1
3C
B0 0
C
B 0 1 0 3 CB 0 1 0 0 CB 0
1 0 0C
1 0 0C
C=B
C,
CB
CB
CB
0 1 A
0 1 0 A@ 0 0 1 0 A @ 0 0 1 3 A @ 0 0 1 0 A @ 1 0
0 0 2 5
1
0 0 110 0 0 0 110 0 0 0 110 0 0 0 10
1
0 0
1 0 0 0
1 0 0 0
1 0 0 0
1
0 0 0
CB
CB
CB
CB
0 0 C B 2 1 0 0 CB 0 1 0 0 CB 0 1 0 0 CB 0
1 0 0C
C
CB
CB
CB
CB
C
0 1 A @ 0 0 1 0 A@ 0 0 1 0 A@ 0 2 1 0 A@ 0
0 1 0A
1 0
0100 0 1
3 100 0 1
0 100 0 1
0 1
0 1
10
1
0
0 0
1 0 0
0
1 0 0 1
1 0 0
0
1 0 0
0
CB
CB
CB
CB
1
0 0 CB 0 1 0
0 CB 0 1 0 0 CB 0 1 0 2 CB 0 1 0
0C
C
CB
CB
CB
CB
C
0 1 0 A@ 0 0 1
0 A@ 0 0 1 0 A@ 0 0 1
0 A@ 0 0 1 5 A
0
0 1
0 0 0 1 100 0 0 1
0 00 0
1
0 10 0
1
0
10
1
1 0 1 0
1 0 0 0
1 2 0 0
1 2 1 1
B
CB
CB
B
C
1 0 0C
B 0 1 0 0 CB 0 1 1 0 CB 0
C
B 2 3 3 0 C
B
CB
CB
C=B
C.
@ 0 0 1 0 A@ 0 0 1 0 A@ 0
0 1 0 A @ 3 7 2 4 A
0 0 0 1
0 0 0 1
0
0 0 1
0
2 1 1


1.5.26. Applying Gaussian Elimination:


0

E1 = @
E2 =

E3 =

E4 = @
and hence A =

1.5.27. (a)

0
@

0
A,
1

1
3

1
0

0
A
3
2

2
@ 3
0

1
2 A,
2i

i
2
1
2

i
(c) B
@1 i
1

0
i
1

A,

0
@

1
1 C
A,
i

3
@ 2

C
A,
2
3

12 A
,
1

10

1
0
A@
1
0

1
1
3

1 i
1

0
2
3

10

3 A,

1
0

E 4 E3 E2 E1 A = I =

1
1+ i

(b)

12

E 3 E2 E1 A =

3
B 2
@

E 2 E1 A =

1
1
3 A,

E11 E21 E31 E41


1

E1 A =

0
1

10

3
A@ 2

0 A@ 1
1
0

3 A.

3+ i
(d) B
@ 4 + 4 i
1 + 2 i

1 i
2 i
1 i

i
2+ i
1

C
A.

1.5.28. No. If they have the same solution, then they both reduce to $(\, I \mid x \,)$ under elementary
row operations. Thus, by applying the appropriate elementary row operations to reduce the
augmented matrix of the first system to $(\, I \mid x \,)$, and then applying the inverse elementary row operations, we arrive at the augmented matrix for the second system. Thus, the first
system can be changed into the second by the combined sequence of elementary row operations, proving equivalence. (See also Exercise 2.5.44 for the general case.)
1.5.29.
(a) If $\tilde A = E_N E_{N-1} \cdots E_2 E_1 A$ where $E_1, \dots, E_N$ represent the row operations applied to
$A$, then $\tilde C = \tilde A\,B = E_N E_{N-1} \cdots E_2 E_1 A\,B = E_N E_{N-1} \cdots E_2 E_1 C$, which represents
the same sequence of row operations applied to $C$.
(b)
0
10
1
0
1
0
10
1
1
2 1
1 2
8 3
1 0 0
8 3
B
B
C
B
CB
C
(E A) B = B
2C
0C
@ 2 3
A@ 3
A = @ 9 2 A = @ 0 1 0 A @ 9 2 A = E (A B).
2 3 2
1
1
9
2
2 0 1
7 4

1.5.30. (a)

1
@2
1
4
0

(c)

B
B
@1

4
(e) B
@ 2
3

1
1
2A
14

25
1
1
2

3
1
1

1
2

10

3
2 CB
B
0C
A@
1
2
10

=@

1
1
2 A;
3
4
0

(b)
1

3 C B 14 C
B
2C
5C
A=@
A;
2
2
1

3
4
1
B
C
B
C
0C
A @ 5 A = @ 1 A;
1
7
3

0
@

5
17
1
17
0

1
2
17 A
3
17

9
(d) B
@ 6
1
(f )

B
B0
B
@2


0
0
1
1

2
12

15
10
2
1
1
1
1

2
;
2

10

8
3
2
B
C
B C
5 C
A@ 1 A = @ 3 A;
1
5
0
10

1
4
3
B
C
B
C
1 C
CB 11 C
B 1C
CB
C=B
C;
0 A@ 7 A @ 4 A
1
6
2

(g)

1.5.31. (a)

B
B
B
B
B
@

25
4
21

1
1
3A
2
3
1

1
2
3
1

1
3

(a)

2
1

0 1
1 0
0
2
1
(c) B
4
@2
0 2

(b)

1
(d) B
@0
0

(e)

(f )

B
B0
B
@0

(g)

B
@1

B
B1
B
@1

0
1
0
0

3
0
0
0
1

1.5.33.
(a)

0
@

0
0
1

3
1
1

3
7 A,
5
7

1.6.1. (a) ( 1

3
2

1
3

CB

1
7
5 A,
(c) @
15
0
1
21
C
(g) B
@ 0 A,

0
1

1
0

0
7

10

0
1
B
1C
A@1
0
2

1
0

1
2
B1
1C
A=@2
1
2
2

(b)

5 ),

5
1
B
2 C
A = @2
3
1

2
;
1

0
2 0
B
1
0C
A@ 0 2
0 0
1
0
1
1
0 0
2
B
1
1 0
5C
B
C
C=B
1 A @ 1 1 1
3 34 1
6
1
0
0
2 3
1
C
2
0
1C B
B2
C=B
2 2 1 A @ 0
1
1
2
1
!

8
, (c)
3

(b)

1
1

0
B
B
B
@

0
,
2

1
6
2
3
2
3

(h)

C
C
C,
A

B
C
B 10 C
B
C,
@ 8 A

0
B
B
B
@

(i)

1 27
0
1
10
1
0
1 21
1
CB
0 A@ 0 1 1 C
A;
1
0 0
1
10
10
0
1
0
0
1
B
B
0C
0C
A@ 0 3
A@ 0
0
1
0
0 7
10

1
1
0

23

28
7 C
C
C.
12 A
3

7C
3 A;

0
1
1
C
B
0C
A@ 0
1 0 A;
1
0
0 1
1
10
10
1 1 1
2
0
1
0
0 0
CB
CB
1 0 1 C
0 CB 0 3
0 0 CB 0
C
C;
CB
CB
0 1
0A
0 A@ 0
0 2 0 A@ 0
0
0 0
1
0
0
0 4
1
10
10
1 0 2
0 0 0
1
0
0
0
B
B
1 0 0C
0
0C
CB 0 2
CB 0 1 2
CB
CB
21 1 0 A@ 0
0 1
0 A@ 0 0 1
1 0 1
0
0
0 5
0 0 0
0

1
C
(d) B
@ 2 A, (e)
0

(c)

0
1
0

10

0
1
1

(d) singular matrix,

1
1
1

3
2C B
C
1C
B
C
3C B
2C
C=B
C.
B
C
3C
@ 1A
A
3
2
2

B
32 C
CB
CB
B
3 C
A@
1
2

1
2

10

0 4
1 0
7 0
=
7 2
0 1
0 4
0
1
10
1
0 0
2
2 0
B
B
1 0C
1 C
A = @1
A@ 0 3
0 23 1
1
0 0

1
1
4
1
2 1
1
1
10
0
1
B
0C
CB 2
CB
1 A@ 1
0
0

1
1
(b) @ 41 A,
0 4 1
1
8C
B
C
B 1 C,
(f ) B
2A
@
5
8

1
C
(e) B
@ 4 A,
1

1.5.32.

1
2

2
,
1


12
B
C
@ 3 A,
7

1
(d) B
@ 2
1

(f )

2
0C
A,
2

7 1
3
C
B
B 2 C
C,
B
C
B
@ 5 A
5
3
0

(g)

0
B
B
B
@

3
27 C
C
.
11 C
2 A
1
1

0
1C
C
C.
0A
2

(f )

3
1
1.6.2. AT = B
@
1

1
2C
A,
1

(e)

B
@

1
2C
A,
3
0

1
2

3
4

5
,
6

2
2

(A B) = B A =

B
@

(g)
1
2

BT =

0
,
6

2
0

1
2
1

0
3
2

1
1C
A.
5

3
,
4
T

(B A) = A B

0
B
@

1
5
3

6
2
2

5
11 C
A.
7

1.6.3. If $A$ has size $m \times n$ and $B$ has size $n \times p$, then $(A\,B)^T$ has size $p \times m$. Further, $A^T$ has
size $n \times m$ and $B^T$ has size $p \times n$, and so unless $m = p$ the product $A^T B^T$ is not defined.
If $m = p$, then $A^T B^T$ has size $n \times n$, and so to equal $(A\,B)^T$, we must have $m = n = p$,
so the matrices are square. Finally, taking the transpose of both sides, $A\,B = (A^T B^T)^T =
(B^T)^T (A^T)^T = B\,A$, and so they must commute.
1.6.4. The $(i, j)$ entry of $C = (A\,B)^T$ is the $(j, i)$ entry of $A\,B$, so
$c_{ij} = \sum_{k=1}^{n} a_{jk}\, b_{ki} = \sum_{k=1}^{n} \tilde b_{ik}\, \tilde a_{kj}$,
where $\tilde a_{ij} = a_{ji}$ and $\tilde b_{ij} = b_{ji}$ are the entries of $A^T$ and $B^T$ respectively. Thus, $c_{ij}$ equals
the $(i, j)$ entry of the product $B^T A^T$.

1.6.5. $(A\,B\,C)^T = C^T B^T A^T$.
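1.6.6. False. For example, $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ does not commute with its transpose.

1.6.7. If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then $A^T A = A\,A^T$ if and only if $b^2 = c^2$ and $(a - d)(b - c) = 0$.
So either $b = c$, or $c = -b \ne 0$ and $a = d$. Thus all normal $2 \times 2$ matrices are of the form
$\begin{pmatrix} a & b \\ b & d \end{pmatrix}$ or $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$.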
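1.6.8.
(a) $(A\,B)^{-T} = ((A\,B)^T)^{-1} = (B^T A^T)^{-1} = (A^T)^{-1}(B^T)^{-1} = A^{-T} B^{-T}$.
(b) $A\,B = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$, so $(A\,B)^{-T} = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}$, while $A^{-T} = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}$, $B^{-T} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}$,
so $A^{-T} B^{-T} = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}$.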

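1.6.9. If $A$ is invertible, then so is $A^T$ by Lemma 1.32; then by Lemma 1.21 $A\,A^T$ and $A^T A$ are
invertible.
1.6.10. No; for example, $\begin{pmatrix} 1 \\ 2 \end{pmatrix} (\, 3 \;\; 4 \,) = \begin{pmatrix} 3 & 4 \\ 6 & 8 \end{pmatrix}$ while $\begin{pmatrix} 3 \\ 4 \end{pmatrix} (\, 1 \;\; 2 \,) = \begin{pmatrix} 3 & 6 \\ 4 & 8 \end{pmatrix}$.

1.6.11. No. In general, $B^T A$ is the transpose of $A^T B$.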


1.6.12.
(a) The ith entry of $A\,e_j$ is the product of the ith row of $A$ with $e_j$. Since all the entries in
$e_j$ are zero except the jth entry, the product will be equal to $a_{ij}$, i.e., the $(i, j)$ entry of $A$.
(b) By part (a), $\hat e_i^T A\,e_j$ is the product of the row matrix $\hat e_i^T$ and the jth column of $A$. Since
all the entries in $\hat e_i^T$ are zero except the ith entry, multiplication by the jth column of $A$
will produce $a_{ij}$.

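1.6.13.
(a) Using Exercise 1.6.12, $a_{ij} = e_i^T A\,e_j = e_i^T B\,e_j = b_{ij}$ for all $i, j$.
(b) Two examples: $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$; $A = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.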

1.6.14.
(a) If $p_{ij} = 1$, then $P\,A$ maps the jth row of $A$ to its ith row. Then $Q = P^T$ has $q_{ji} = 1$,
and so it does the reverse, mapping the ith row of $A$ to its jth row. Since this holds for
all such entries, the result follows.
(b) No. Any rotation matrix $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ also has this property. See Section 5.3.
1.6.15.
(a) Note that $(A\,P^T)^T = P\,A^T$, which permutes the rows of $A^T$, which are the columns of
$A$, according to the permutation $P$.
(b) The effect of multiplying $P\,A\,P^T$ is equivalent to simultaneously permuting rows and
columns of $A$ according to the permutation $P$. Associativity of matrix multiplication
implies that it doesn't matter whether the rows or the columns are permuted first.
1.6.16.
(a) Note that $w^T v$ is a scalar, and so
$A\,A^{-1} = (I - v\,w^T)(I - c\,v\,w^T) = I - (1 + c)\,v\,w^T + c\,v\,(w^T v)\,w^T$
$= I - (1 + c - c\,w^T v)\,v\,w^T = I$
provided $c = 1/(v^T w - 1)$, which works whenever $w^T v \ne 1$.


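(b) $A = I - v\,w^T = \begin{pmatrix} 2 & -2 \\ 3 & -5 \end{pmatrix}$ and $c = \dfrac{1}{v^T w - 1} = \tfrac14$, so $A^{-1} = I - \tfrac14\, v\,w^T = \begin{pmatrix} \tfrac54 & -\tfrac12 \\ \tfrac34 & -\tfrac12 \end{pmatrix}$.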
(c) If $v^T w = 1$ then $A$ is singular, since $A\,v = 0$ and $v \ne 0$, and so the homogeneous
system does not have a unique solution.
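
The inverse formula of part (a) can be tried out numerically. The sketch below uses exact fractions; the helper names and the sample vectors in the usage note are illustrative assumptions rather than data taken from the exercise.

```python
from fractions import Fraction


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def rank_one_update(v, w, c=1):
    """Return the matrix I - c v w^T as a list of rows."""
    n = len(v)
    return [[Fraction(int(i == j)) - c * v[i] * w[j] for j in range(n)] for i in range(n)]


def inverse_of_rank_one_update(v, w):
    """Inverse of I - v w^T using c = 1 / (v^T w - 1)."""
    s = sum(vi * wi for vi, wi in zip(v, w))  # the scalar v^T w
    if s == 1:
        raise ValueError("I - v w^T is singular when v^T w = 1")
    c = Fraction(1) / (s - 1)
    return rank_one_update(v, w, c)
```

With the hypothetical choice `v = [1, 3]`, `w = [-1, 2]` one gets $v^T w = 5$ and $c = \tfrac14$, and multiplying `rank_one_update(v, w)` by `inverse_of_rank_one_update(v, w)` returns the identity.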

1.6.17. (a) a = 1;
1.6.18. 0

1
(a) B
@0
0
0
1
B
B0
(b) B
@0
0
0
1
B
B0
B
@0
0

0
1
0
0
1
0
0
0
0
0
1

(b) a = 1, b = 2, c = 3;

1 0

0
0 1
B
0C
A, @ 1 0
1
0 0
1 0
0 0
0
B
0 0C
C B1
C, B
1 0A @0
0 11 00
0 0
1
B
0 1C
C B0
C, B
1 0A @0
0 0
0

1 0

1
4

(c) a = 2, b = 1, c = 5.

1 0

0
0 0 1
1 0 0
B
C B
C
0C
A, @ 0 1 0 A , @ 0 0 1 A.
1
1 10 0 0
0 1 10 0
1 0 0
0 0 1 0
0
B
C B
0 0 0C
C B0 1 0 0C B0
C, B
C, B
0 1 0A @1 0 0 0A @0
0 0 11 00 0 0 11 01
0 0 0
0 1 0 0
0
B
C B
1 0 0C
C B1 0 0 0C B0
C, B
C, B
0 0 1A @0 0 0 1A @1
0 1 0
0 0 1 0
0

1.6.19. True, since $(A^2)^T = (A\,A)^T = A^T A^T = A\,A = A^2$.

0
1
0
0
0
0
0
1

0
0
1
0
1
0
0
0

1 0

1
0C
C
C,
0A
01
0
1C
C
C,
0A
0

1
B
B0
B
@0
0
0
0
B
B0
B
@0
1

1.6.20. True. Invert both sides of the equation $A^T = A$, and use Lemma 1.32.

0
0
1
0
0
0
1
0

0
1
0
0
0
1
0
0

0
0C
C
C,
0A
11
1
0C
C
C.
0A
0

0
1

1.6.21. False. For example

1
0

2
1

1
3

1
2

3
.
1

1.6.22.
(a) If $D$ is a diagonal matrix, then for all $i \ne j$ we have $a_{ij} = a_{ji} = 0$, so $D$ is symmetric.
(b) If $L$ is lower triangular then $a_{ij} = 0$ for $i < j$; if it is symmetric then $a_{ji} = 0$ for $i < j$,
so $L$ is diagonal. If $L$ is diagonal, then $a_{ij} = 0$ for $i < j$, so $L$ is lower triangular and it
is symmetric.
1.6.23.
(a) Since $A$ is symmetric we have $(A^n)^T = (A\,A \cdots A)^T = A^T A^T \cdots A^T = A\,A \cdots A = A^n$.
(b) $(2A^2 - 3A + I)^T = 2(A^2)^T - 3A^T + I = 2A^2 - 3A + I$.
(c) If $p(A) = c_n A^n + \cdots + c_1 A + c_0 I$, then $p(A)^T = (c_n A^n + \cdots + c_1 A + c_0 I)^T = c_n (A^T)^n + \cdots +
c_1 A^T + c_0 I = p(A^T)$. In particular, if $A = A^T$, then $p(A)^T = p(A^T) = p(A)$.
1.6.24. If $A$ has size $m \times n$, then $A^T$ has size $n \times m$ and so both products are defined. Also,
$K^T = (A^T A)^T = A^T (A^T)^T = A^T A = K$ and $L^T = (A\,A^T)^T = (A^T)^T A^T = A\,A^T = L$.
1.6.25.
(a)

1
1

(b)

2
3

(c)

1
4

B
@ 1
0

1
B
B 1
(d) B
@ 0
3
1.6.26.

3
1

1
3
2

0
1

1
0

23

0
3

0
1

1
1
B
2C
A = @ 1
0
1

1
2
2
0

M2 =

1
1

0
2
1
0

1
1
2

0
1

1
0

M4 =

0
3
2
0

0
1

1
2

!0

B1
B
B2
B
B 0
@

1.6.27. The matrix is not regular,


More0explicitly,1if
0
1 0 0
p
B
C
L = @a 1 0A, D = B
@0
b c 1
0

1
1

2
0

3
1
B
0C
C
B 1
C=B
0A @ 0
1
3

1
0

@1

0
1
2
3

7
2
10

0
1
B
0C
A@ 0
1
0

0
1
2
3

0
0
1

6
5
1

1
2 A,

23 ,
1
10
1
0
0
B
2
0C
A@ 0
0
0 32

1
0

10

0
1
B
0C
CB 0
CB
0 A@ 0
1
0

3
4

0
0
5
0

0
1

B1
2

M3 = B
@

10

2
0
CB
C
B
0 CB 0
CB
B
0C
A@ 0
1
0

0
0
1

0
1
0
0

0
3
2

0
0

1
1
0

2
3

10

1
0
B
0C
CB 0
CB
0 A@ 0
0
49
5
10

2
0
CB
B0
0C
A@
1
0

10

1
0
CB
C
B
0 CB 0
CB
B
0C
A@ 0

0
0
4
3

5
4

1C
2 A,

1
2

1
0
0

3
2

0
0
2
3

1
0

1
1
0
0

0
2
1
0

10
0 B1
CB
0C
AB
@0
4
0
3
1

0
C
0C
C

3
3C
C

.
6C
A
5
1

1
2

1
0

0C

2C
C,
3A

C.
3C
4A

since after the first set of row operations the (2, 2) entry is 0.
1

0
0C
A,
r

ap
bp
then L D L =
a p+q
abp + cq C
A.
2
2
bp abp + cq b p + c q + r
Equating this to $A$, the (1, 1) entry requires $p = 1$, and so the (1, 2) entry requires $a = 2$,
but the (2, 2) entry then implies $q = 0$, which is not an allowed diagonal entry for $D$.
Even if we ignore this, the (1, 3) entry would set $b = 1$, but then the (2, 3) entry says
$a\,b\,p + c\,q = 2 \ne 1$, which is a contradiction.
0
q
0

B
@ ap

1.6.28. Write $A = L\,D\,V$, then $A^T = V^T D\,L^T = \tilde L\,\tilde U$, where $\tilde L = V^T$ and $\tilde U = D\,L^T$. Thus, $A^T$
is regular since the diagonal entries of $\tilde U$, which are the pivots of $A^T$, are the same as those
of $D$ and $U$, which are the pivots of $A$.

1.6.29. (a) The diagonal entries satisfy $j_{ii} = -j_{ii}$ and so must be 0. (b) $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. (c) No,
because the (1, 1) entry is always 0. (d) Invert both sides of the equation $J^T = -J$ and
use Lemma 1.32. (e) $(J^T)^T = J = -J^T$, $(J \pm K)^T = J^T \pm K^T = -J \mp K = -(J \pm K)$.
$J\,K$ is not, in general, skew-symmetric; for instance $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$.
(f) Since it is a scalar, $v^T J\,v = (v^T J\,v)^T = v^T J^T (v^T)^T = -v^T J\,v$ equals its own
negative, and so is zero.

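1.6.30.
(a) Let $S = \tfrac12 (A + A^T)$, $J = \tfrac12 (A - A^T)$. Then $S^T = S$, $J^T = -J$, and $A = S + J$.
(b) $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & \tfrac52 \\ \tfrac52 & 4 \end{pmatrix} + \begin{pmatrix} 0 & -\tfrac12 \\ \tfrac12 & 0 \end{pmatrix}$;
$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 5 \\ 3 & 5 & 7 \\ 5 & 7 & 9 \end{pmatrix} + \begin{pmatrix} 0 & -1 & -2 \\ 1 & 0 & -1 \\ 2 & 1 & 0 \end{pmatrix}$.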
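
The splitting of part (a) into symmetric and skew-symmetric parts is easy to reproduce. A minimal sketch, assuming integer entries and using a function name of our own choosing:

```python
from fractions import Fraction


def symmetric_skew_parts(a):
    """Return (S, J) with S = (A + A^T)/2 and J = (A - A^T)/2."""
    n = len(a)
    s = [[Fraction(a[i][j] + a[j][i], 2) for j in range(n)] for i in range(n)]
    j_part = [[Fraction(a[i][j] - a[j][i], 2) for j in range(n)] for i in range(n)]
    return s, j_part
```

Calling it on the two matrices of part (b) returns the symmetric and skew-symmetric summands of each decomposition.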

1.7.1.
19
(a) The solution is x = 10
7 , y = 7 . Gaussian Elimination and Back Substitution requires 2 multiplications and
GaussJordan also uses 2 multiplications and 3
1
0 3 additions;
additions; finding A1 =

1
7
3
7

2
7A
1
7

by the GaussJordan
method1requires 2 0
additions
0
1
!

1
2
10

4
7
7
7
A
and 4 multiplications, while computing the solution x = @
= @ 19 A
7
37 17
7
takes another 4 multiplications and 2 additions.
(b) The solution is x = 4, y = 5, z = 1. Gaussian Elimination and Back Substitution requires 17 multiplications and 110additions; GaussJordan
uses 20 multiplications
1
0 1 1
C
and 11 additions; computing A1 = B
@ 2 8 5 A takes 27 multiplications and 12
3
2 5 3
additions, while multiplying A1 b = x takes another 9 multiplications and 6 additions.
(c) The solution is x = 2, y = 1, z = 25 . Gaussian Elimination and Back Substitution
requires 6 multiplications and 5 additions;
GaussJordan
is the same: 6 multiplications
0
1
1
3
3
2
2C
B 2
1
1C
B 1
C takes 11 multiplications and 3
and 5 additions; computing A1 = B
2
2
2A
@
2
1
5 0 5
1
additions, while multiplying A b = x takes another 8 multiplications and 5 additions.

1.7.2.
(a) For a general matrix $A$, each entry of $A^2$ requires $n$ multiplications and $n - 1$ additions,
for a total of $n^3$ multiplications and $n^3 - n^2$ additions, and so, when compared with
the efficient version of the Gauss-Jordan algorithm, takes exactly the same amount of
computation.
(b) $A^3 = A^2 A$ requires a total of $2n^3$ multiplications and $2n^3 - 2n^2$ additions, and so is
about twice as slow.
(c) You can compute $A^4$ as $A^2 A^2$, and so only 2 matrix multiplications are required. In
general, if $2^r \le k < 2^{r+1}$ has $j$ ones in its binary representation, then you need $r$ multiplications to compute $A^2, A^4, A^8, \dots, A^{2^r}$ followed by $j - 1$ multiplications to form $A^k$ as
a product of these particular powers, for a total of $r + j - 1$ matrix multiplications, and
hence a total of $(r + j - 1)\,n^3$ multiplications and $(r + j - 1)\,n^2 (n - 1)$ additions. See
Exercise 1.7.8 and [11] for more sophisticated ways to speed up the computation.
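
The count $r + j - 1$ in part (c) can be confirmed by instrumenting a repeated-squaring routine. The following is only a sketch; the names `matmul`, `products_needed` and `power` are our own.

```python
def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def products_needed(k):
    """Predicted number of matrix products, r + j - 1, for computing A^k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    r = k.bit_length() - 1   # 2^r <= k < 2^(r+1)
    j = bin(k).count("1")    # number of ones in the binary representation
    return r + j - 1


def power(a, k):
    """Return (A^k, number of matrix products actually performed)."""
    count = 0
    result = None
    square = a
    while k:
        if k & 1:
            if result is None:
                result = square      # first factor: no product needed
            else:
                result = matmul(result, square)
                count += 1
        k >>= 1
        if k:
            square = matmul(square, square)
            count += 1
    return result, count
```

For $k = 4$ both functions report 2 products, and for $k = 13$ (binary 1101, so $r = 3$, $j = 3$) they report 5.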
1.7.3. Back Substitution requires about one half the number of arithmetic operations as multiplying a matrix times a vector, and so is twice as fast.
1.7.4. We begin by proving (1.61). We must show that $1 + 2 + 3 + \cdots + (n - 1) = n(n - 1)/2$
for $n = 2, 3, \dots$. For $n = 2$ both sides equal 1. Assume that (1.61) is true for $n = k$. Then
$1 + 2 + 3 + \cdots + (k - 1) + k = k(k - 1)/2 + k = k(k + 1)/2$, so (1.61) is true for $n = k + 1$. Now
the first equation in (1.62) follows if we note that $1 + 2 + 3 + \cdots + (n - 1) + n = n(n + 1)/2$.
Next we prove the first equation in (1.60), namely $2 + 6 + 12 + \cdots + (n - 1)n = \tfrac13 n^3 - \tfrac13 n$
for $n = 2, 3, \dots$. For $n = 2$ both sides equal 2. Assume that the formula is true for $n = k$.
Then $2 + 6 + 12 + \cdots + (k - 1)k + k(k + 1) = \tfrac13 k^3 - \tfrac13 k + k^2 + k = \tfrac13 (k + 1)^3 - \tfrac13 (k + 1)$, so the
formula is true for $n = k + 1$, which completes the induction step. The proof of the second
equation is similar, or, alternatively, one can use the first equation and (1.61) to show that
$\sum_{j=1}^{n} (n - j)^2 = \sum_{j=1}^{n} (n - j)(n - j + 1) - \sum_{j=1}^{n} (n - j) = \dfrac{n^3 - n}{3} - \dfrac{n^2 - n}{2} = \dfrac{2n^3 - 3n^2 + n}{6}$.
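
As a sanity check on the three summation formulas used above, one can compare both sides for a range of $n$. The helper below is a sketch with a name of our own choosing; it works in integers by clearing the denominators 3 and 6.

```python
def check_sum_formulas(n):
    """Compare both sides of the three identities for one value of n >= 2."""
    first = sum(range(1, n)) == n * (n - 1) // 2
    # 2 + 6 + 12 + ... + (n - 1) n = (n^3 - n) / 3, with the denominator cleared.
    second = 3 * sum((m - 1) * m for m in range(2, n + 1)) == n**3 - n
    # sum of (n - j)^2 for j = 1..n equals (2 n^3 - 3 n^2 + n) / 6.
    third = 6 * sum((n - j) ** 2 for j in range(1, n + 1)) == 2 * n**3 - 3 * n**2 + n
    return first and second and third
```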

1.7.5. We may assume that the matrix is regular, so $P = I$, since row interchanges have no
effect on the number of arithmetic operations.
(a) First, according to (1.60), it takes $\tfrac13 n^3 - \tfrac13 n$ multiplications and $\tfrac13 n^3 - \tfrac12 n^2 + \tfrac16 n$
additions to factor $A = L\,U$. To solve $L\,c_j = e_j$ by Forward Substitution, the first $j - 1$
entries of $c_j$ are automatically 0, the jth entry is 1, and then, for $k = j + 1, \dots, n$, we need
$k - j - 1$ multiplications and the same number of additions to compute the kth entry, for
a total of $\tfrac12 (n - j)(n - j - 1)$ multiplications and additions to find $c_j$. Similarly, to solve
$U\,x_j = c_j$ for the jth column of $A^{-1}$ requires $\tfrac12 n^2 + \tfrac12 n$ multiplications and, since the
first $j - 1$ entries of $c_j$ are 0, also $\tfrac12 n^2 - \tfrac12 n - j + 1$ additions. The grand total is $n^3$
multiplications and $n\,(n - 1)^2$ additions.
(b) Starting with the large augmented matrix $M = (\, A \mid I \,)$, it takes $\tfrac12 n^2 (n - 1)$ multiplications and $\tfrac12 n\,(n - 1)^2$ additions to reduce it to triangular form $(\, U \mid C \,)$ with $U$ upper
triangular and $C$ lower triangular, then $n^2$ multiplications to obtain the special upper
triangular form $(\, V \mid B \,)$, and then $\tfrac12 n^2 (n - 1)$ multiplications and, since $B$ is lower
triangular, $\tfrac12 n\,(n - 1)^2$ additions to produce the final matrix $(\, I \mid A^{-1} \,)$. The grand
total is $n^3$ multiplications and $n\,(n - 1)^2$ additions. Thus, both methods take the same
amount of work.

1.7.6. Combining (1.60-61), we see that it takes $\tfrac13 n^3 + \tfrac12 n^2 - \tfrac56 n$ multiplications and $\tfrac13 n^3 - \tfrac13 n$
additions to reduce the augmented matrix to upper triangular form $(\, U \mid c \,)$. Dividing the
jth row by its pivot requires $n - j + 1$ multiplications, for a total of $\tfrac12 n^2 + \tfrac12 n$ multiplications
to produce the special upper triangular form $(\, V \mid e \,)$. To produce the solved form $(\, I \mid d \,)$
requires an additional $\tfrac12 n^2 - \tfrac12 n$ multiplications and the same number of additions for a
grand total of $\tfrac13 n^3 + \tfrac32 n^2 - \tfrac56 n$ multiplications and $\tfrac13 n^3 + \tfrac12 n^2 - \tfrac56 n$ additions needed to
solve the system.
1.7.7. Less efficient, by, roughly, a factor of $\tfrac32$. It takes $\tfrac12 n^3 + n^2 - \tfrac12 n$ multiplications and
$\tfrac12 n^3 - \tfrac12 n$ additions.

1.7.8.
(a) $D_1 + D_3 - D_4 - D_6 = (A_1 + A_4)(B_1 + B_4) + (A_2 - A_4)(B_3 + B_4)$
$\qquad - (A_1 + A_2)\,B_4 - A_4 (B_1 - B_3) = A_1 B_1 + A_2 B_3 = C_1$,
$D_4 + D_7 = (A_1 + A_2)\,B_4 + A_1 (B_2 - B_4) = A_1 B_2 + A_2 B_4 = C_2$,
$D_5 - D_6 = (A_3 + A_4)\,B_1 - A_4 (B_1 - B_3) = A_3 B_1 + A_4 B_3 = C_3$,
$D_1 - D_2 - D_5 + D_7 = (A_1 + A_4)(B_1 + B_4) - (A_1 - A_3)(B_1 + B_2)$
$\qquad - (A_3 + A_4)\,B_1 + A_1 (B_2 - B_4) = A_3 B_2 + A_4 B_4 = C_4$.
(b) To compute $D_1, \dots, D_7$ requires 7 multiplications and 10 additions; then to compute
$C_1, C_2, C_3, C_4$ requires an additional 8 additions for a total of 7 multiplications and 18
additions. The traditional method for computing the product of two $2 \times 2$ matrices requires 8 multiplications and 4 additions.
(c) The method requires 7 multiplications and 18 additions of $n \times n$ matrices, for a total of
$7n^3$ multiplications and $7n^2 (n - 1) + 18n^2 \approx 7n^3$ additions, versus $8n^3$ multiplications and $8n^2 (n - 1) \approx
8n^3$ additions for the direct method, so there is a savings by a factor of $\tfrac87$.
(d) Let $\mu_r$ denote the number of multiplications and $\alpha_r$ the number of additions to compute
the product of $2^r \times 2^r$ matrices using Strassen's Algorithm. Then, $\mu_r = 7\,\mu_{r-1}$, while
$\alpha_r = 7\,\alpha_{r-1} + 18 \cdot 2^{2r-2}$, where the first factor comes from multiplying the blocks,
and the second from adding them. Since $\mu_1 = 1$, $\alpha_1 = 0$. Clearly, $\mu_r = 7^r$, while an
induction proves the formula for $\alpha_r = 6\,(7^{r-1} - 4^{r-1})$, namely
$\alpha_{r+1} = 7\,\alpha_r + 18 \cdot 4^{r-1} = 6\,(7^r - 7 \cdot 4^{r-1}) + 18 \cdot 4^{r-1} = 6\,(7^r - 4^r)$.
Combining the operations, Strassen's Algorithm is faster by a factor of
$\dfrac{2n^3}{\mu_r + \alpha_r} = \dfrac{2^{3r+1}}{13 \cdot 7^{r-1} - 6 \cdot 4^{r-1}}$,
which, for $r = 10$, equals 4.1059, for $r = 25$, equals 30.3378, and, for $r = 100$, equals
678,234, which is a remarkable savings, but bear in mind that the matrices have size
around $10^{30}$, which is astronomical!
(e) One way is to use block matrix multiplication, in the trivial form $\begin{pmatrix} A & O \\ O & I \end{pmatrix} \begin{pmatrix} B & O \\ O & I \end{pmatrix} = \begin{pmatrix} C & O \\ O & I \end{pmatrix}$
where $C = A\,B$. Thus, choosing $I$ to be an identity matrix of the appropriate
size, the overall size of the block matrices can be arranged to be a power of 2, and then
the reduction algorithm can proceed on the larger matrices. Another approach, trickier
to program, is to break the matrix up into blocks of nearly equal size since the Strassen
formulas do not, in fact, require the blocks to have the same size and even apply to rectangular matrices whose rectangular blocks are of compatible sizes.
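
The seven products $D_1, \dots, D_7$ and the four combinations from part (a) can be written out directly. The sketch below treats the blocks as scalars, which is the simplest case; the function names are our own, and the definitions of the $D_i$ are read off from the identities in part (a).

```python
def strassen_2x2(a, b):
    """Multiply two 2 x 2 matrices using 7 multiplications and 18 additions."""
    (a1, a2), (a3, a4) = a
    (b1, b2), (b3, b4) = b
    d1 = (a1 + a4) * (b1 + b4)
    d2 = (a1 - a3) * (b1 + b2)
    d3 = (a2 - a4) * (b3 + b4)
    d4 = (a1 + a2) * b4
    d5 = (a3 + a4) * b1
    d6 = a4 * (b1 - b3)
    d7 = a1 * (b2 - b4)
    c1 = d1 + d3 - d4 - d6
    c2 = d4 + d7
    c3 = d5 - d6
    c4 = d1 - d2 - d5 + d7
    return [[c1, c2], [c3, c4]]


def direct_2x2(a, b):
    """The traditional product, with 8 multiplications and 4 additions."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
```

Both routines agree on any pair of $2 \times 2$ matrices; a recursive version would apply the same formulas with matrix blocks in place of the scalar entries.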

1.7.9.

1
(a) B
@ 1
0
0
1
B
B 1
(b) B
@ 0
0

2
1
2
1
2
1
0

0
1
0
B
1C
1
A = @ 1
3
0 2
0
1
0 0
1
C
1 0C B
B 1
C=B
4 1A @ 0
1 6
0

10

0
1 2 0
B
C
0C
A @ 0 1 1 A,
1
0 0 5
10
0
0 0
1
B
1
0 0C
CB 0
CB
1
1 0 A@ 0
0 1 1
0

2
C
x=B
@ 3 A;
0
1
1 0 0
1 1 0C
C
C,
0 5 1A
0 0 7

x=

B C
B0C
B C;
@1A

1
B
1
B
(c) B
@ 0
0
1.7.10. 0
(a)

B
@ 1

0
2
B
1
B
B
@ 0
0
0

2
B 1
B
B
B 0
B
@ 0
0
(b)

4
B
2C
C
B
C
x=B
2
B C.
@ 5A

1
0 0
2 1
0
0
3
B1
C
CB 0
1
0
1
1 C
=
@ 2
A,
A
A@
2
2
4
2
00 3 1
0
0 103
1
1
2 1
0
0
1
0
0 0
0
0
3
B
B1
1
0 0C
0C
CB 0
C
B 2
1
0C
C
2 1
CB
C ,
C=B
2
4
CB 0
C
2 1 A B
1
0
1
0

0
A@
A
@
3
3
3
5
1
2
00 0 4 1
0
0 100
4
1
1
0
0
0 0
2 1
0
0
0
0
3
B 1
CB

0
1
0
0
0
1
B
C
B
C
2
1
0
0C B 2
CB
4
B
B
0 23
0
1
0 0C
C=B
CB 0
2 1
0C
3
B
C
CB
3
B
1
2 1 A B
1 0C
0
0 4
0
0
@
A@ 0
4
0 1
2
0
0
0 5 1
0
0
0

0
1
1
0

41

1
0
B
0C
CB 0
CB
0 A@ 0
1
0

0
0
1

, ( 2, 3, 3, 2 )T ,

5
9
2 , 4, 2 , 4,

0
0
4
0

0
0C
C
C,
1 A
5
4

35

10

2
1
0
0

1
2
1
1
2
1
0

3 T
2

0
1
B
0C
1
C
B
C=B
1 A @ 0
1
0

10

0
0
4
1

1
2
1
0
0

3
2 , 2,

2
3
1
0

5 T
2

0
0
1
5
4

0
0C
C
C
0C
C;
C
1 C
A
6
5

(c) The subdiagonal entries in $L$ are $l_{i+1,i} = -\,i/(i + 1) \to -1$, while the diagonal entries in
$U$ are $u_{ii} = (i + 1)/i \to 1$.
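
The limiting behaviour in part (c) can be observed by running the elimination recursion for a constant tridiagonal matrix. This sketch assumes the matrix of the exercise has 2 on the diagonal and $-1$ on the sub- and superdiagonals, as suggested by the factorizations in part (a); the function name and default arguments are ours.

```python
from fractions import Fraction


def tridiagonal_lu(n, diag=2, off=-1):
    """Pivots u_ii and multipliers l_{i+1,i} of an n x n constant tridiagonal matrix."""
    pivots = [Fraction(diag)]
    multipliers = []
    for _ in range(n - 1):
        m = Fraction(off) / pivots[-1]          # l_{i+1,i} = a_{i+1,i} / u_ii
        multipliers.append(m)
        pivots.append(Fraction(diag) - m * off)  # u_{i+1,i+1} after elimination
    return pivots, multipliers
```

For `n = 6` the pivots come out as 2, 3/2, 4/3, 5/4, 6/5, 7/6 and the multipliers as −1/2, −2/3, −3/4, −4/5, −5/6, matching the closed forms above.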
1.7.11. 0

2
(a) B
@ 1
0
0
2
B
B 1
B
@ 0
0
0

2
B1
B
B
B0
B
@0
0

1
2
1
1
2
1
0
1
2
1
0
0

0
1
2
1
0

1 1 2 T
,
,
3 3 3

10

1
0 0
2 1
0
0
5
B 1
B
1 0C
1C
1C
A = @2
A@ 0 2
A ,
2
12
2
00 5 1
0 0 1
50
1
1
0
0 0
2 1
0 0
CB 0 5
B1
C
1
0
0
CB
1 0C B
2
2
CB
C=B
B
0 25
0
1 0C
2 1A B
A@ 0
@
5
1 2
0
0 12 1
0100
0
1
1
0
0
0
0
2
0 0
B 1
CB

0
1
0
0
0
B
C
C
B
0 0C B 2
CB
2
C
B
CB 0
0

1
0
0
CB
1 0C
=B
5
C
B
CB
5
B
2 1A B
1 0C
0
0 12
@
A@ 0
12
1 2
0
0
0 29 1
0

8 13 11 20 T
,
,
,
29 29 29 29

3 2 1 2 7 T
,
,
,
,
10 5 25 10

(b)
,
,
(c) The subdiagonal entries in $L$ approach $1 - \sqrt2 = -.414214$, and the diagonal entries in
$U$ approach $1 + \sqrt2 = 2.414214$.
1.7.12. Both false.
10
1 1 0 0
1
CB
B
1
1
1
0
1
B
CB
B
CB
@ 0 1 1 1 A@ 0
0 0 1 1
0
0

1.7.13.

B
B1
@

1
4
1

For
1
1
1
0
1

example,
1
0
0 0
2
C
2
1 0C B
B
C=B
1 1A @1
0
1 1
0

1C B 1
B1
1C
A=@4
1
4
4

0
1
1
5

2
3
2
1
10

4
0
CB
B0
0C
A@
1
0

1
2
3
2

15
4


0
0C
C
C ,
1C
A

12
5

0
1

29
12

0
0
0

12
5

0
1

5
2

0
0

0
0
1

29
12

0
0C
C
C
0C
C;
C
1C
A

70
29

.
2 = .414214, and the diagonal entries in
0

0
1C
C
C,
2A
2
1

0
1

B
B1
B
@0

3C
C
4A
18
5

1
1
1
0

0
1
1
1

0 1
1
B
0C
C
B 0
C =B
@ 1
1A
1
1

0
0
1
1

1
1
0
0

1
1 C
C
C.
0A
1

4
B
B
B1
B
B0
@
1

4
B
B1
B
B
B0
B
B
@0
1

1
4
1
0
0

For
4
B
B1
B
B
B0
B
B
B0
B
B
@0
1

the
1
4
1
0
0
0

1
4
1
0
0
1
4
1
0

0
1C B 1
C
B1
1
0C B 4
C=B
4
C
B
1A @ 0
15
1
1
4
15
4
0
1
1
0
0 1
B1
C
B
1
0 0C
B4
C
B
C
4
B
=B0
1 0C
15
C
B
C
4 1A B
0
@ 0
1
1
1 4
4 15

0
1
4
1

10

4
1
0
CB
15
C
B
0 CB 0 4
CB
B
0
0C
A@ 0
2
0
0
7 1
10
0
0 0
4
CB
C
B
0
0 0 CB 0
CB
B
1
0 0C
CB 0
CB
15
B
1 0C
A@ 0
56
1
5
0
56
19 1
0
0
1

6 6 version we have
0
1
1
0
0 0 0 1
B1
C
B
1
B4
1 0 0 0C
C
B
C
4
B
0
4 1 0 0C
15
C=B
B
C
B
C
1 4 1 0C B 0
0
B
C
0 1 4 1A B
0
@ 0
1
1
0 0 1 4
4 15

0
0
1
15
56

0
1
56

0
0
0
1
56
209
1
209

0
1

0
1
15
4

0
0
0
0
0
0
0
1
7
26

C
C
C
C
16 C
15 A
24
7

41

56
15

0
1

56
15

0
0
10

0
4
CB
B
C
0 CB 0
CB
B
0C
CB 0
CB
B
0C
CB 0
CB
B
0C
A@ 0
1
0

0
0
1
209
56

0
1
15
4

0
0
0
0

1
14
1
15
55
56
66
19

0
1
56
15

0
0
0

1
C
C
C
C
C
C
C
C
A

0
0
1
209
56

0
0

0
0
0
1
780
209

1
14

1
15
1
56
210
209
45
13

1
C
C
C
C
C
C
C
C
C
C
C
A

The pattern is that the only nonzero entries of $L$ lie on its diagonal, its subdiagonal, or its last
row, while the only nonzero entries of $U$ lie on its diagonal, its superdiagonal,
or its last column.

1.7.14.
(a) Assuming regularity, the only row operations required to reduce $A$ to upper triangular
form $U$ are, for each $j = 1, \dots, n - 1$, to add multiples of the jth row to the $(j + 1)$st and
the nth rows. Thus, the only nonzero entries below the diagonal in $L$ are at positions
$(j + 1, j)$ and $(n, j)$. Moreover, these row operations only affect zero entries in the last
column, leading to the final form of $U$.
1
0 final form1
0
1 1 1
1 0 0
1 1 1
CB
C
B
B
1 2 C
2 1 A = @ 1 1 0 A@ 0
(b) @ 1
A,
0
0
2
1 2 0
1
1 1
3
10
1
1
0
1
0
0
0 0
1 1
0
0 1
1 1
0
0 1
B
B
B 1
1
0
0 0C
1 1
0 1 C
B 1
CB 0
C
2 1
0
0C
C
B
B
CB
C
C
B
B
B
C
1
0 0 CB 0
0
2 1 1 C
C = B 0 1
B 0 1
3
1
0
,
C
B
7
3C
B
C
@ 0
0 12
0
0
1 0C

0 1
4 1 A B
@ 0
A@ 0
A
2
2
13
1
0
0 1
5
1 1 12 37 1
0
0
0
0
7
0
1 1
0
0
0 1 1
B 1
2 1
0
0
0C
B
C
B
3 1
0
0C
B 0 1
C
B
C=
B 0
C
0
1
4
1
0
B
C
@ 0
0
0 1
5 1 A
1
0
0
0 1
6
0
0
0
0 1 1
1
0
0
0
0 0 10 1 1
B
B 1
1 1
0
0 1 C
1
0
0
0 0C
CB 0
C
B
B
B 0 1
0
2 1
0 1 C
1
0
0 0C
CB 0
C
B
CB
B
7
1C
1
CB 0
C.
B 0
1
0
0
0
0
1

2
2
2C
CB
B
B
C
C
B
@ 0
0
0 72
0
0
0 33
1 0 A@ 0
87 A
7
8
1 1 12 17 33
1
0
0
0
0
0 104
33
The $4 \times 4$ case is a singular matrix.


1.7.15.
(a) If the matrix A is tridiagonal, then the only nonzero elements in the $i$th row are $a_{i,i-1}, a_{ii}, a_{i,i+1}$.
So $a_{ij} = 0$ whenever $|i - j| > 1$.
(b) For example,
$$\begin{pmatrix} 2&1&1&0&0&0 \\ 1&2&1&1&0&0 \\ 1&1&2&1&1&0 \\ 0&1&1&2&1&1 \\ 0&0&1&1&2&1 \\ 0&0&0&1&1&2 \end{pmatrix}$$
has band width 2;
$$\begin{pmatrix} 2&1&1&1&0&0 \\ 1&2&1&1&1&0 \\ 1&1&2&1&1&1 \\ 1&1&1&2&1&1 \\ 0&1&1&1&2&1 \\ 0&0&1&1&1&2 \end{pmatrix}$$
has band width 3.
(c) U is the matrix that results from applying row operations of type #1 to A, so all zero entries
of A lying above the band produce corresponding zero entries in U. On the other hand, if A is of band
width k, then for each column of A we need to perform no more than k row replacements to obtain zeros below the diagonal. Thus L, which records these row replacements, will have at most k nonzero entries below the diagonal in each column.
(d)
$$\begin{pmatrix} 2&1&1&0&0&0 \\ 1&2&1&1&0&0 \\ 1&1&2&1&1&0 \\ 0&1&1&2&1&1 \\ 0&0&1&1&2&1 \\ 0&0&0&1&1&2 \end{pmatrix}
= \begin{pmatrix} 1&0&0&0&0&0 \\ \frac12&1&0&0&0&0 \\ \frac12&\frac13&1&0&0&0 \\ 0&\frac23&\frac12&1&0&0 \\ 0&0&\frac34&\frac12&1&0 \\ 0&0&0&1&\frac12&1 \end{pmatrix}
\begin{pmatrix} 2&1&1&0&0&0 \\ 0&\frac32&\frac12&1&0&0 \\ 0&0&\frac43&\frac23&1&0 \\ 0&0&0&1&\frac12&1 \\ 0&0&0&0&1&\frac12 \\ 0&0&0&0&0&\frac34 \end{pmatrix},$$
$$\begin{pmatrix} 2&1&1&1&0&0 \\ 1&2&1&1&1&0 \\ 1&1&2&1&1&1 \\ 1&1&1&2&1&1 \\ 0&1&1&1&2&1 \\ 0&0&1&1&1&2 \end{pmatrix}
= \begin{pmatrix} 1&0&0&0&0&0 \\ \frac12&1&0&0&0&0 \\ \frac12&\frac13&1&0&0&0 \\ \frac12&\frac13&\frac14&1&0&0 \\ 0&\frac23&\frac12&\frac25&1&0 \\ 0&0&\frac34&\frac35&\frac14&1 \end{pmatrix}
\begin{pmatrix} 2&1&1&1&0&0 \\ 0&\frac32&\frac12&\frac12&1&0 \\ 0&0&\frac43&\frac13&\frac23&1 \\ 0&0&0&\frac54&\frac12&\frac34 \\ 0&0&0&0&\frac45&\frac15 \\ 0&0&0&0&0&\frac34 \end{pmatrix}.$$

(e) $\left( \frac13, \frac13, 0, 0, \frac13, \frac13 \right)^T$, $\left( \frac23, \frac13, -\frac13, -\frac13, \frac13, \frac23 \right)^T$.
(f) For A we still need to compute k multipliers at each stage and update at most $2k^2$ entries, so we have less than $(n-1)(k + 2k^2)$ multiplications and $(n-1)\,2k^2$ additions. For
the right-hand side we have to update at most k entries at each stage, so we have less
than $(n-1)k$ multiplications and $(n-1)k$ additions. So in total we can get by with less than
$(n-1)(2k + 2k^2)$ multiplications and $(n-1)(k + 2k^2)$ additions.
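(g) The inverse of a banded matrix is not necessarily banded. For example, the inverse of
$\begin{pmatrix} 2&1&0 \\ 1&2&1 \\ 0&1&2 \end{pmatrix}$ is $\begin{pmatrix} \frac34 & -\frac12 & \frac14 \\ -\frac12 & 1 & -\frac12 \\ \frac14 & -\frac12 & \frac34 \end{pmatrix}$.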
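The band structure claimed in parts (c), (d) and (g) can be checked mechanically. The following is a minimal sketch in Python using exact rational arithmetic; the helper names `lu_no_pivot`, `band_width` and `matmul` are illustrative choices and do not come from the text, and the routine assumes the matrix is regular (no row interchanges needed).

```python
from fractions import Fraction

def lu_no_pivot(a):
    """Regular Gaussian elimination A = L U; assumes every pivot is nonzero."""
    n = len(a)
    u = [[Fraction(x) for x in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j + 1, n):
            l[i][j] = u[i][j] / u[j][j]        # multiplier, stored in L
            for k in range(j, n):
                u[i][k] -= l[i][j] * u[j][k]   # row_i <- row_i - l_ij * row_j
    return l, u

def band_width(m):
    """Largest |i - j| over the nonzero entries of m."""
    return max((abs(i - j) for i, row in enumerate(m)
                for j, x in enumerate(row) if x != 0), default=0)

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# The band width 2 matrix of part (b).
A = [[2, 1, 1, 0, 0, 0],
     [1, 2, 1, 1, 0, 0],
     [1, 1, 2, 1, 1, 0],
     [0, 1, 1, 2, 1, 1],
     [0, 0, 1, 1, 2, 1],
     [0, 0, 0, 1, 1, 2]]
L, U = lu_no_pivot(A)
print(band_width(A), band_width(L), band_width(U))

# The tridiagonal matrix of part (g) and its (full) inverse.
T = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
T_inv = [[Fraction(3, 4), Fraction(-1, 2), Fraction(1, 4)],
         [Fraction(-1, 2), Fraction(1), Fraction(-1, 2)],
         [Fraction(1, 4), Fraction(-1, 2), Fraction(3, 4)]]
print(matmul(T, T_inv), band_width(T), band_width(T_inv))
```

Working with `Fraction` rather than floating point keeps the zero pattern exact, so the band width test is not disturbed by round-off.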

1.7.16. (a) ( 8, 4 )T , (b) ( 10, 4.1 )T , (c) ( 8.1, 4.1 )T . (d) Partial pivoting reduces
the effect of round off errors and results in a significantly more accurate answer.
1
1
1.7.17. (a) x = 11
7 1.57143, y = 7 .142857, z = 7 .142857,
(b) x = 3.357, y = .5, z = .1429, (c) x = 1.572, y = .1429, z = .1429.

1.7.18. (a) x = 2, y = 2, z = 3, (b) x = 7.3, y = 3.3, z = 2.9, (c) x = 1.9, y = 2.,


z = 2.9, (d) partial pivoting works markedly better, especially for the value of x.

1.7.19. (a) x = 220., y = 26, z = .91;


(b) x = 190., y = 24, z = .84;
(c) x = 210,
y = 26, z = 1.
(d) The exact solution is x = 213.658, y = 25.6537, z = .858586.
Full pivoting is the most accurate. Interestingly, partial pivoting fares a little worse than
regular elimination.

1.7.20. (a)

1.7.21. (a)

(c)

6
B 5
B 13
B
5
@
95

C
C
C
A

1.2
C
=B
@ 2.6 A,
1.8

1
@ 13 A =
8
13
1
0
2
121
C
B
B
38 C
C
B
121
C=
B
B
59 C
C
B
@ 242 A
56
121

(b)

.0769
,
.6154
1

.0165
C
B
B .3141 C
C,
B
@ .2438 A
.4628

1
B 4
B 5
B
B 4
B 1
B
@ 8
1
4
0

C
C
C
C,
C
C
A

4
B 5
B
8
(b) B
@ 15
19
15
0

0
B C
B1C
C,
(c) B
@1A
0
1
C
C
C
A

(d)

32
B 35
B 19
B
B 35
B 12
B
@ 35
76
35

1
C
C
C
C
C
C
A

.8000
C
=B
@ .5333 A,
1.2667
1

.732
C
(d) B
@ .002 A.
.508

1.7.22. The results are the same.


1.7.23.
Gaussian Elimination With Full Pivoting
start
for i = 1 to n
    set ρ(i) = σ(i) = i
next i
set σ(n + 1) = n + 1
for j = 1 to n
    if m_{ρ(i),σ(k)} = 0 for all i, k ≥ j, stop; print "A is singular"
    choose i ≥ j and k ≥ j such that | m_{ρ(i),σ(k)} | is maximal
    interchange ρ(i) ←→ ρ(j)
    interchange σ(k) ←→ σ(j)
    for i = j + 1 to n
        set z = m_{ρ(i),σ(j)} / m_{ρ(j),σ(j)}
        set m_{ρ(i),σ(j)} = 0
        for k = j + 1 to n + 1
            set m_{ρ(i),σ(k)} = m_{ρ(i),σ(k)} − z m_{ρ(j),σ(k)}
        next k
    next i
next j
end
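The pseudocode above can be sketched in Python as follows. This is only one possible rendering under stated assumptions: the function name `solve_full_pivoting` is hypothetical, the row and column pointers are kept in the lists `rho` and `sigma` (0-based, unlike the 1-based pseudocode), and a back substitution step has been added so that the routine returns a solution.

```python
def solve_full_pivoting(a, b):
    """Solve A x = b by Gaussian elimination with full pivoting (a sketch)."""
    n = len(a)
    # Augmented matrix M = (A | b), stored by rows.
    m = [[float(v) for v in row] + [float(bi)] for row, bi in zip(a, b)]
    rho = list(range(n))     # row pointers
    sigma = list(range(n))   # column pointers (the right-hand side never moves)
    for j in range(n):
        # Choose i, k >= j maximizing |m[rho(i)][sigma(k)]|.
        i_max, k_max = max(((i, k) for i in range(j, n) for k in range(j, n)),
                           key=lambda ik: abs(m[rho[ik[0]]][sigma[ik[1]]]))
        if m[rho[i_max]][sigma[k_max]] == 0.0:
            raise ValueError("A is singular")
        rho[j], rho[i_max] = rho[i_max], rho[j]
        sigma[j], sigma[k_max] = sigma[k_max], sigma[j]
        pivot_row = m[rho[j]]
        for i in range(j + 1, n):
            row = m[rho[i]]
            z = row[sigma[j]] / pivot_row[sigma[j]]
            row[sigma[j]] = 0.0
            for k in range(j + 1, n):
                row[sigma[k]] -= z * pivot_row[sigma[k]]
            row[n] -= z * pivot_row[n]          # update the right-hand side
    # Back substitution; column sigma(j) carries the unknown x[sigma(j)].
    x = [0.0] * n
    for j in reversed(range(n)):
        row = m[rho[j]]
        s = row[n] - sum(row[sigma[k]] * x[sigma[k]] for k in range(j + 1, n))
        x[sigma[j]] = s / row[sigma[j]]
    return x

print(solve_full_pivoting([[0, 1, 1], [1, 0, 1], [1, 1, 0]], [2, 2, 2]))
```

Using pointer lists instead of physically swapping rows and columns mirrors the pseudocode; the test for an exactly zero pivot is adequate for a sketch, whereas a production code would compare against a tolerance.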


.9143
B
C
B .5429 C
C.
=B
@ .3429 A
2.1714

1.7.24. We let $x \in \mathbb{R}^n$ be generated using a random number generator, compute $b = H_n x$ and
then solve $H_n y = b$ for y. The error is $e = x - y$ and we use $e^\star = \max | e_i |$ as a measure of
the overall error. Using Matlab, running Gaussian Elimination with pivoting:

    n      10           20         50          100
    e*     .00097711    35.5111    318.3845    1771.1

Using Mathematica, running regular Gaussian Elimination:

    n      10           20         50          100
    e*     .000309257   19.8964    160.325     404.625

In Mathematica, using the built-in LinearSolve function, which is more accurate since
it uses a more sophisticated solution method when confronted with an ill-posed linear system:

    n      10           20         50          100
    e*     .00035996    .620536    .65328      .516865

(Of course, the errors vary a bit each time the program is run due to the randomness of the
choice of x.)
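1.7.25.
(a)
$$H_3^{-1} = \begin{pmatrix} 9 & -36 & 30 \\ -36 & 192 & -180 \\ 30 & -180 & 180 \end{pmatrix},$$
$$H_4^{-1} = \begin{pmatrix} 16 & -120 & 240 & -140 \\ -120 & 1200 & -2700 & 1680 \\ 240 & -2700 & 6480 & -4200 \\ -140 & 1680 & -4200 & 2800 \end{pmatrix},$$
$$H_5^{-1} = \begin{pmatrix} 25 & -300 & 1050 & -1400 & 630 \\ -300 & 4800 & -18900 & 26880 & -12600 \\ 1050 & -18900 & 79380 & -117600 & 56700 \\ -1400 & 26880 & -117600 & 179200 & -88200 \\ 630 & -12600 & 56700 & -88200 & 44100 \end{pmatrix}.$$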
(b) The same results are obtained when using floating point arithmetic in either Mathematica or Matlab.
(c) The product $\widetilde{K}_{10} H_{10}$, where $\widetilde{K}_{10}$ is the computed inverse, is fairly close to the $10 \times 10$
identity matrix; the largest error is .0000801892 in Mathematica or .000036472 in
Matlab. As for $\widetilde{K}_{20} H_{20}$, it is nowhere close to the identity matrix: in Mathematica
the diagonal entries range from 1.34937 to 3.03755, while the largest (in absolute
value) off-diagonal entry is 4.3505; in Matlab the diagonal entries range from .4918
to 3.9942, while the largest (in absolute value) off-diagonal entry is 5.1994.
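The exact inverses in part (a) can be reproduced without any round-off by working over the rationals. The sketch below is not the Mathematica or Matlab computation referred to in the text; it is a small Gauss-Jordan routine on Python `Fraction` objects, with the names `hilbert` and `inverse` chosen here for illustration.

```python
from fractions import Fraction

def hilbert(n):
    """The n x n Hilbert matrix with entries 1/(i + j - 1), i, j = 1..n."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

def inverse(a):
    """Gauss-Jordan inverse in exact arithmetic; assumes a is nonsingular."""
    n = len(a)
    # Augment with the identity matrix: (A | I).
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for j in range(n):
        # Find a row with a nonzero entry in column j and move it into place.
        p = next(i for i in range(j, n) if m[i][j] != 0)
        m[j], m[p] = m[p], m[j]
        pivot = m[j][j]
        m[j] = [x / pivot for x in m[j]]
        for i in range(n):
            if i != j and m[i][j] != 0:
                f = m[i][j]
                m[i] = [x - f * y for x, y in zip(m[i], m[j])]
    return [row[n:] for row in m]

for row in inverse(hilbert(3)):
    print([int(x) for x in row])
```

Because every entry stays an exact fraction, the integer entries of the inverse come out exactly, which is a convenient way to cross-check a floating point computation of the kind discussed in parts (b) and (c).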

1.8.1.
(a)
(b)
(c)
(d)
(e)
(f )
(g)

Unique solution: ( 21 , 34 )T ;
infinitely many solutions: (1 2 z, 1 + z, z)T , where z is arbitrary;
no solutions;
unique solution: (1, 2, 1)T ;
infinitely many solutions: (5 2 z, 1, z, 0)T , where z is arbitrary;
infinitely many solutions: (1, 0, 1, w)T , where w is arbitrary;
unique solution: (2, 1, 3, 1)T .

1.8.2. (a) Incompatible; (b) incompatible; (c) (1, 0)T ; (d) (1 + 3 x2 2 x3 , x2 , x3 )T , where x2
T
T
and x3 are arbitrary; (e) ( 15
2 , 23, 10) ; (f ) (5 3 x4 , 19 4 x4 , 6 2 x4 , x4 ) , where
x4 is arbitrary; (g) incompatible.
1.8.3. The planes intersect at (1, 0, 0).
1.8.4. (i) $a \ne b$ and $b \ne 0$;

(ii) $b = 0$, $a \ne 2$;

1.8.5. (a) b = 2, c 6= 1 or b =
1.8.6.
(a)

1
2,

1 + i 12 (1 + i )y, y, i

(iii) $a = b \ne 0$, or $a = 2$ and $b = 0$.

c 6= 2; (b) b 6= 2, 21 ;

(c) b = 2, c = 1, or b =

, where y is arbitrary;;

(b) ( 4 i z + 3 + i , i z + 2 i , z ) , where z is arbitrary;


(c) ( 3 + 2 i , 1 + 2 i , 3 i )T ;
(d) ( z (3 + 4 i )w, z (1 + i )w, z, w )T , where z and w are arbitrary.
1.8.7. (a) 2, (b) 1, (c) 2, (d) 3, (e) 1, (f ) 1, (g) 2, (h) 2, (i) 3.
1.8.8.
(a)
(b)
(c)
(d)
(e)
(f )

1
1

1
2

1 0
1
1
=
,
1 1
0 3
!
!
!
2 1 3
1 0
2
1
3
,
=
0 0 0
1 1
2 1 3
1
0
10
1
0
1 1 1
1 0 0
1 1 1
B
C
B
CB
0 1C
@ 1 1 2 A = @ 1 1 0 A @ 0
A,
1
1
0
1
1
1
0
0
0
10
0
0
10
1
2 1
1 0 0
2 1
0
1 0 0
CB
B
CB
B1
3
1 1 C
@0 0 1A@1
A = @ 2 1 0 A@ 0
2
1 0 1
0
0
0 1
1 0 0 2 1 10
1 1
0
3
3
1 0 0
B
C
B C
B
0 1 0C
@ 0A = @
A @ 0 A,
2
2
0
3 0 1
( 0 1 2 5 ) = ( 1 )( 0 1 2 5 ),
0

0
B
B1
(g) B
@0
0
0
1
B2
B
B1
(h) B
B
@4
0
0
0
(i) B
@0
1

1.8.9. (a) x = 1, y = 0, z = 0. (b) x + y = 1, y + z = 0, x − z = 1.
1.8.10. (a)

10

1
0 0
0 3
1 0 0
B
C
B
0
1 0
0 0 0C
B
CB 4 1 C
C=B 1
CB
3

2A @ 4
0 1 0 A@ 1
4 1
1
1 5
0 0 1
4 74 0
1
0
10
1
2
1
1 0 0 0 0
1
C
B
B0
1 1
0C B2 1 0 0 0C
CB
B
CB
C = B 1 1 1 0 0 CB 0
2 3 1 C
C
B
CB
1
3
2 A @ 4 1 0 1 0 A@ 0
3 1
50 2
0 1 01 0 01
0
1 0
0 0
0 3
1
1 0
B
C
B
0 1C
A @ 1 2 3 1 2 A = @ 2 1
0 0
2 4 2 1 2
0 0

1
0

0
1

0
, (b)
0

1
B
@0
0

0
1
0

1.8.9. (c) x + y = 1, y + z = 0, x − z = 0.
1

0
0C
A, (c)
0

1
B
@0
0


0
1 C
A,
1

10

0
1 2
C
B
0C
CB 0 7 C
C,
B
0C
A@ 0 0 A
0 0
1
1
1
2
1
3 5 2 C
C
C,
0
0
0C
C
0
0
0A
0
0
0
10
0
1 2 3
1
B
0C
4 1
A@0 0
1
0 0
0
3

0
1C
A, (d)
0

1
B
@0
0

0
1
0

0
0C
A.
1

2
2C
A.
1

1
2,

c = 2.

1.8.11.
(a) $x^2 + y^2 = 1$, $x^2 - y^2 = 2$;
(b) $y = x^2$, $x - y + 2 = 0$; solutions: x = 2, y = 4 and x = −1, y = 1;
(c) $y = x^3$, $x - y = 0$; solutions: x = y = 0, x = y = −1, x = y = 1;
(d) $y = \sin x$, $y = 0$; solutions: $x = k\pi$, y = 0, for k any integer.
1.8.12. That variable does not appear anywhere in the system, and is automatically free (although it doesn't enter into any of the formulas, and so is, in a sense, irrelevant).
1.8.13. True. For example, take a matrix in row echelon form with r pivots, e.g., the matrix A
with aii = 1 for i = 1, . . . , r, and all other entries equal to 0.
1.8.14. Both false. The zero matrix has no pivots, and hence has rank 0.
1.8.15.
(a) Each row of $A = v\,w^T$ is a scalar multiple, namely $v_i w^T$, of the row vector $w^T$. If necessary,
we use a row interchange to ensure that the first row is nonzero. We then subtract the
appropriate scalar multiple of the first row from all the others. This makes all rows below the first zero, and so the resulting matrix is in row echelon form, has a single nonzero row, and hence a
single pivot, proving that A has rank 1.
!
!
8
4
1 2
2
6 2
B
C
(b) (i)
, (ii) @ 0
0 A, (iii)
.
3 6
3 9
3
4 2
(c) The row echelon form of A must have a single nonzero row, say $w^T$. Reversing the elementary row operations that led to the row echelon form, at each step we either interchange rows or add multiples of one row to another. Every row of every matrix obtained
in such a fashion must be some scalar multiple of $w^T$, and hence the original matrix is
$A = v\,w^T$, where the entries $v_i$ of the vector v are the indicated scalar multiples.
1.8.16. 1.
1.8.17. 2.
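1.8.18. Example: $A = \begin{pmatrix} 1&0 \\ 0&0 \end{pmatrix}$, $B = \begin{pmatrix} 0&1 \\ 0&0 \end{pmatrix}$, so $A B = \begin{pmatrix} 0&1 \\ 0&0 \end{pmatrix}$ has rank 1, but $B A = \begin{pmatrix} 0&0 \\ 0&0 \end{pmatrix}$ has rank 0.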

1.8.19.

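(a) Under elementary row operations, the reduced form of C will be $( U \ \ Z )$ where U is the
row echelon form of A. Thus, C has at least r pivots, namely the pivots in A. Examples:
$\operatorname{rank} \begin{pmatrix} 1&2&1 \\ 2&4&2 \end{pmatrix} = 1 = \operatorname{rank} \begin{pmatrix} 1&2 \\ 2&4 \end{pmatrix}$, while $\operatorname{rank} \begin{pmatrix} 1&2&1 \\ 2&4&3 \end{pmatrix} = 2 > 1 = \operatorname{rank} \begin{pmatrix} 1&2 \\ 2&4 \end{pmatrix}$.
(b) Applying elementary row operations, we can reduce E to $\begin{pmatrix} U \\ W \end{pmatrix}$ where U is the row echelon form of A. If we can then use elementary row operations of type #1 to eliminate
all entries of W, then the row echelon form of E has the same number of pivots as A
and so rank E = rank A. Otherwise, at least one new pivot appears in the rows below U,
and rank E > rank A. Examples:
$\operatorname{rank} \begin{pmatrix} 1&2 \\ 2&4 \\ 3&6 \end{pmatrix} = 1 = \operatorname{rank} \begin{pmatrix} 1&2 \\ 2&4 \end{pmatrix}$, while $\operatorname{rank} \begin{pmatrix} 1&2 \\ 2&4 \\ 3&5 \end{pmatrix} = 2 > 1 = \operatorname{rank} \begin{pmatrix} 1&2 \\ 2&4 \end{pmatrix}$.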
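The rank examples in Exercises 1.8.18 and 1.8.19 can be confirmed by counting pivots. The following Python sketch does this with exact fractions; the helper names `rank` and `matmul` are illustrative and are not taken from the text.

```python
from fractions import Fraction

def rank(a):
    """Number of pivots found by Gaussian elimination with row interchanges."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0                                   # index of the next pivot row
    for c in range(cols):
        # Look for a nonzero entry in column c at or below row r.
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue                        # no pivot in this column
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, rows):
            f = m[i][c] / m[r][c]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = [[1, 0], [0, 0]]
B = [[0, 1], [0, 0]]
print(rank(matmul(A, B)), rank(matmul(B, A)))
print(rank([[1, 2, 1], [2, 4, 2]]), rank([[1, 2, 1], [2, 4, 3]]))
print(rank([[1, 2], [2, 4], [3, 6]]), rank([[1, 2], [2, 4], [3, 5]]))
```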

1.8.20. By Proposition 1.39, A can be reduced to row echelon form U by a sequence of elementary row operations. Therefore, as in the proof of the L U decomposition, $A = E_1 E_2 \cdots E_N U$,
where $E_1^{-1}, \ldots, E_N^{-1}$ are the elementary matrices representing the row operations. If A is
singular, then U = Z must have at least one all zero row.

1.8.21. After row operations, the augmented matrix becomes $N = ( U \mid c )$ where the r = rank A
nonzero rows of U contain the pivots of A. If the system is compatible, then the last $m - r$
entries of c are all zero, and hence N is itself a row echelon matrix with r nonzero rows and
hence rank M = rank N = r. If the system is incompatible, then one or more of the last
$m - r$ entries of c are nonzero, and hence, by one more set of row operations, N is placed
in row echelon form with a final pivot in row r + 1 of the last column. In this case, then,
rank M = rank N = r + 1.

1.8.22. (a) x = z, y = z, where z is arbitrary; (b) x = 32 z, y = 97 z, where z is arbitrary;


(c) x = y = z = 0; (d) x = 13 z 32 w, y = 56 z 16 w, where z and w are arbitrary;
(e) x = 13 z, y = 5 z, w = 0, where z is arbitrary; (f ) x = 32 w, y = 12 w, z = 21 w, where
w is arbitrary.

(c)

T
1
, where y is arbitrary; (b) 65 z, 85 z, z , where z is arbitrary;
3 y, y

T
3
2
6
11
, where z and w are arbitrary; (d) ( z, 2 z, z )T ,
5 z + 5 w, 5 z 5 w, z, w
T
T

1.8.23. (a)

where

z is arbitrary; (e) ( 4 z, 2 z, z ) , where z is arbitrary; (f ) ( 0, 0, 0 ) ; (g) ( 3 z, 3 z, z, 0 )T ,


where z is arbitrary; (h) ( y 3 w, y, w, w )T , where y and w are arbitrary.
1.8.24. If U has only nonzero entries on the diagonal, it must be nonsingular, and so the only
solution is x = 0. On the other hand, if there is a diagonal zero entry, then U cannot have
n pivots, and so must be singular, and the system will admit nontrivial solutions.
1.8.25. For the homogeneous case $x_1 = x_3$, $x_2 = 0$, where $x_3$ is arbitrary. For the inhomogeneous case $x_1 = x_3 + \frac14 (a + b)$, $x_2 = \frac12 (a - b)$, where $x_3$ is arbitrary. The solution to the
homogeneous version is a line going through the origin, while the inhomogeneous solution
is a parallel line going through the point $\left( \frac14 (a + b), 0, \frac12 (a - b) \right)^T$. The dependence on the
free variable $x_3$ is the same as in the homogeneous case.

1.8.26. For the homogeneous case x1 = 16 x3 16 x4 , x2 = 23 x3 + 34 x4 , where x3 and x4


are arbitrary. For the inhomogeneous case x1 = 16 x3 61 x4 + 13 a + 61 b, x2 = 23 x3 +
1
1
4
3 x4 + 3 a + 6 b, where x3 and x4 are arbitrary. The dependence on the free variable x3 is
the same as in the homogeneous case.
1.8.27. (a) k = 2 or k = 2;

(b) k = 0 or k =

1
2;

(c) k = 1.

1.9.1.
(a) Regular matrix, reduces to upper triangular form U =
0


0
1
0

1
, so determinant is 2;
1

3
2 C
(b) Singular matrix, row echelon form U =
A, so determinant is 0;
0 0
1
1 2
3
B
2C
(c) Regular matrix, reduces to upper triangular form U = @ 0 1
A, so determinant is 3;
0 00 3
1
2 1
3
B
(d) Nonsingular matrix, reduces to upper triangular form U = @ 0 1 1 C
A after one row
interchange, so determinant is 6;
0 0
3
B
@

1
0
0

2
0

(e) Upper triangular matrix, so the determinant is a product of0diagonal


1 2
B
0
2
B
(f ) Nonsingular matrix, reduces to upper triangular form U = B
@0
0
0
0
one row interchange, so determinant is 40;
0
1 2
B0
3
B
B0
(g) Nonsingular matrix, reduces to upper triangular form U = B
0
B
@0
0
0
0
after one row interchange, so determinant is 60.
1.9.2. det A = 2, det B = 11 and det A B =
!

det B
@

5
1
2

4
5
10

entries: 180;
1
1
4
1 7 C
C
C after
2 8 A
0 10
1
3
4
0
0

4
1
12
5
0

5
2C
C
C
24 C
C
10 A
1

4
1C
A = 22.
0

1.9.3. (a) $A = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix}$; (b) By formula (1.82),
$1 = \det I = \det(A^2) = \det(A A) = \det A \, \det A = (\det A)^2$, so $\det A = \pm 1$.

1.9.4. $\det A^2 = (\det A)^2 = \det A$, and hence $\det A = 0$ or 1.

1.9.5.
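(a) True. By Theorem 1.52, A is nonsingular, so, by Theorem 1.18, $A^{-1}$ exists.
(b) False. For $A = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix}$, we have $2 \det A = -2$ and $\det 2A = -4$. In general,
$\det(2A) = 2^n \det A$.
(c) False. For $A = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, we have $\det(A + B) = \det \begin{pmatrix} 2 & 4 \\ -1 & -2 \end{pmatrix} = 0 \ne -1 = \det A + \det B$.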
(d) True. $\det A^{-T} = \det (A^{-1})^T = \det A^{-1} = 1/\det A$, where the second equality follows
from Proposition 1.56, and the third equality follows from Proposition 1.55.
(e) True. $\det(A B^{-1}) = \det A \, \det B^{-1} = \det A / \det B$, where the first equality follows from
formula (1.82) and the second equality follows from Proposition 1.55.
(f) False. If $A = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, then $\det (A + B)(A - B) = \det \begin{pmatrix} 0 & -4 \\ 0 & 2 \end{pmatrix} = 0 \ne \det(A^2 - B^2) = \det \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1$. However, if $A B = B A$, then $\det (A + B)(A - B) = \det(A^2 - A B + B A - B^2) = \det(A^2 - B^2)$.
(g) True. Proposition 1.42 says rank A = n if and only if A is nonsingular, while Theorem 1.52 implies that this is equivalent to $\det A \ne 0$.
(h) True. Since $\det A = 1 \ne 0$, Theorem 1.52 implies that A is nonsingular, and so $B = A^{-1} O = O$.
1.9.6. Never: its determinant is always zero.
1.9.7. By (1.82, 83) and commutativity of numeric multiplication,
$$\det B = \det(S^{-1} A S) = \det S^{-1} \, \det A \, \det S = \frac{1}{\det S} \det A \, \det S = \det A.$$
1.9.8. Multiplying one row of A by c multiplies its determinant by c. To obtain c A, we must
multiply all n rows by c, and hence the determinant is multiplied by c a total of n times.
1.9.9. By Proposition 1.56, $\det L^T = \det L$. If L is a lower triangular matrix, then $L^T$ is an
upper triangular matrix. By Theorem 1.50, $\det L^T$ is the product of its diagonal entries
which are the same as the diagonal entries of L.
1.9.10. (a) See Exercise 1.9.8. (b) If n is odd, $\det(-A) = -\det A$. On the other hand,
if $A^T = -A$, then $\det A = \det A^T = -\det A$, and hence $\det A = 0$. (c) $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.
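1.9.11. We have
$$\det \begin{pmatrix} a & b \\ c + k a & d + k b \end{pmatrix} = a d + a k b - b c - b k a = a d - b c = \det \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
$$\det \begin{pmatrix} c & d \\ a & b \end{pmatrix} = c b - a d = -(a d - b c) = -\det \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
$$\det \begin{pmatrix} k a & k b \\ c & d \end{pmatrix} = k a d - k b c = k (a d - b c) = k \det \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
$$\det \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} = a d - b \cdot 0 = a d.$$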

1.9.12.
(a) The product formula holds if A is an elementary matrix; this is a consequence of the
determinant axioms coupled with the fact that elementary matrices are obtained by applying the corresponding row operation to the identity matrix, with det I = 1.
(b) By induction, if $A = E_1 E_2 \cdots E_N$ is a product of elementary matrices, then (1.82) also
holds. Proposition 1.25 then implies that the product formula is valid whenever A is
nonsingular.
(c) The first result is in Exercise 1.2.24(a), and so the formula follows by applying Lemma 1.51
to Z and Z B.
(d) According to Exercise 1.8.20, every singular matrix can be written as $A = E_1 E_2 \cdots E_N Z$,
where the $E_i$ are elementary matrices, while Z, its row echelon form, is a matrix with a
row of zeros. But then Z B = W also has a row of zeros, and so $A B = E_1 E_2 \cdots E_N W$
is also singular. Thus, both sides of (1.82) are zero in this case.
1.9.13. Indeed, by (1.82), $\det A \, \det A^{-1} = \det(A A^{-1}) = \det I = 1$.
1.9.14. Exercise 1.6.28 implies that, if A is regular, so is $A^T$, and they both have the same pivots. Since the determinant of a regular matrix is the product of the pivots, this implies
$\det A = \det A^T$. If A is nonsingular, then we use the permuted L U decomposition to write
$A = P^T L U$ where $P^T = P^{-1}$ by Exercise 1.6.14. Thus, $\det A = \det P^T \det U = \pm \det U$,
while $\det A^T = \det(U^T L^T P) = \det U \, \det P = \pm \det U$ where $\det P^{-1} = \det P = \pm 1$.
Finally, if A is singular, then the same computation holds, with U denoting the row echelon
form of A, and so $\det A = \det U = 0 = \det A^T$.
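1.9.15.
$$\det \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix} =$$
$$\begin{aligned}
& a_{11} a_{22} a_{33} a_{44} - a_{11} a_{22} a_{34} a_{43} - a_{11} a_{23} a_{32} a_{44} + a_{11} a_{23} a_{34} a_{42} - a_{11} a_{24} a_{33} a_{42} \\
& + a_{11} a_{24} a_{32} a_{43} - a_{12} a_{21} a_{33} a_{44} + a_{12} a_{21} a_{34} a_{43} + a_{12} a_{23} a_{31} a_{44} - a_{12} a_{23} a_{34} a_{41} \\
& + a_{12} a_{24} a_{33} a_{41} - a_{12} a_{24} a_{31} a_{43} + a_{13} a_{21} a_{32} a_{44} - a_{13} a_{21} a_{34} a_{42} - a_{13} a_{22} a_{31} a_{44} \\
& + a_{13} a_{22} a_{34} a_{41} - a_{13} a_{24} a_{32} a_{41} + a_{13} a_{24} a_{31} a_{42} - a_{14} a_{21} a_{32} a_{43} + a_{14} a_{21} a_{33} a_{42} \\
& + a_{14} a_{22} a_{31} a_{43} - a_{14} a_{22} a_{33} a_{41} + a_{14} a_{23} a_{32} a_{41} - a_{14} a_{23} a_{31} a_{42} .
\end{aligned}$$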
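The 24 signed terms above are exactly what the permutation formula for the determinant produces. As a hedged illustration (the function names `sign` and `det_leibniz` are chosen here and are not from the text), the following Python sketch sums over all permutations, taking the sign from the parity of the number of inversions.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation, from the parity of its number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_leibniz(a):
    """Determinant as the signed sum of a[0][p0] * a[1][p1] * ... over permutations p."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

# The term a11 a22 a34 a43 corresponds to the permutation (1, 2, 4, 3), an odd one.
print(sign((0, 1, 3, 2)))
print(det_leibniz([[1, 2], [3, 4]]))
```

The sum has n! terms, so this is only practical for small matrices; Gaussian elimination remains the method of choice for computation.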


1.9.16.
(i) Suppose B is obtained from A by adding c times row k to row l, so
$b_{ij} = a_{lj} + c\, a_{kj}$ for $i = l$, and $b_{ij} = a_{ij}$ for $i \ne l$.
Thus, each summand in the determinantal formula for
det B splits into two terms, and we find that det B = det A + c det C, where C is the
matrix obtained from A by replacing row l by row k. But rows k and l of C are identical, and so, by axiom (ii), if we interchange the two rows, det C = −det C = 0. Thus,
det B = det A.
(ii) Let B be obtained from A by interchanging rows k and l. Then each summand in the
formula for det B equals minus the corresponding summand in the formula for det A,
since the permutation has changed sign, and so det B = −det A.
(iii) Let B be obtained from A by multiplying row k by c. Then each summand in the formula for det B contains one entry from row k, and so equals c times the corresponding
term in det A, hence det B = c det A.
(iv) The only term in det U that does not contain at least one zero entry lying below the
diagonal is for the identity permutation $\pi(i) = i$, and so det U is the product of its diagonal entries.
1.9.17. If U is nonsingular, then, by Gauss-Jordan elimination, it can be reduced to the identity matrix by elementary row operations of types #1 and #3. Each operation of type #1
doesn't change the determinant, while operations of type #3 multiply the determinant by
the diagonal entry. Thus, $\det U = u_{11} u_{22} \cdots u_{nn} \det I$. On the other hand, U is singular if
and only if one or more of its diagonal entries are zero, and so $\det U = 0 = u_{11} u_{22} \cdots u_{nn}$.
1.9.18. The determinant of an elementary matrix of type #2 is −1, whereas all elementary matrices of type #1 have determinant +1, and hence so does any product thereof.
1.9.19.
(a) Since A is regular, $a \ne 0$ and $a d - b c \ne 0$. Subtracting c/a times the first row from the
second row reduces A to the upper triangular matrix $\begin{pmatrix} a & b \\ 0 & d - b\,(c/a) \end{pmatrix}$, and its pivots
are a and $d - \dfrac{b c}{a} = \dfrac{a d - b c}{a} = \dfrac{\det A}{a}$.
(b) As in part (a) we reduce A to an upper triangular form. First, we subtract c/a times
the first row from the second row, and g/a times the first row from the third row, resulting
in the matrix
$$\begin{pmatrix} a & b & e \\ 0 & \dfrac{a d - b c}{a} & \dfrac{a f - c e}{a} \\ 0 & \dfrac{a h - b g}{a} & \dfrac{a j - e g}{a} \end{pmatrix}.$$
Performing the final row operation reduces the matrix to the upper triangular form
$$U = \begin{pmatrix} a & b & e \\ 0 & \dfrac{a d - b c}{a} & \dfrac{a f - c e}{a} \\ 0 & 0 & P \end{pmatrix},$$
whose pivots are a, $\dfrac{a d - b c}{a}$, and
$$P = \frac{a j - e g}{a} - \frac{(a f - c e)(a h - b g)}{a\,(a d - b c)} = \frac{a d j + b f g + e c h - a f h - b c j - e d g}{a d - b c} = \frac{\det A}{a d - b c}.$$
(c) If A is a regular $n \times n$ matrix, then its first pivot is $a_{11}$, and its $k$th pivot, for $k = 2, \ldots, n$, is $\det A_k / \det A_{k-1}$, where $A_k$ is the $k \times k$ upper left submatrix of A with entries $a_{ij}$ for $i, j = 1, \ldots, k$. A formal proof is done by induction.
1.9.20. (a-c) Applying an elementary column operation to a matrix A is the same as applying the elementary row operation to its transpose $A^T$ and then taking the transpose of the
result. Moreover, Proposition 1.56 implies that taking the transpose does not affect the determinant, and so any elementary column operation has exactly the same effect as the corresponding elementary row operation.
(d) Apply the transposed version of the elementary row operations required to reduce $A^T$
to upper triangular form. Thus, if the (1, 1) entry is zero, use a column interchange to
place a nonzero pivot in the upper left position. Then apply elementary column operations
of type #1 to make all entries to the right of the pivot zero. Next, make sure a nonzero
pivot is in the (2, 2) position by a column interchange if necessary, and then apply elementary column operations of type #1 to make all entries to the right of the pivot zero. Continuing in this fashion, if the matrix is nonsingular, the result is a lower triangular matrix.
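(e) We first interchange the first and second columns, and then use elementary column operations of type #1 to reduce the matrix to lower triangular form:
$$\det \begin{pmatrix} 0 & 1 & 2 \\ -1 & 3 & 5 \\ 2 & -3 & 1 \end{pmatrix} = -\det \begin{pmatrix} 1 & 0 & 2 \\ 3 & -1 & 5 \\ -3 & 2 & 1 \end{pmatrix} = -\det \begin{pmatrix} 1 & 0 & 0 \\ 3 & -1 & -1 \\ -3 & 2 & 7 \end{pmatrix} = -\det \begin{pmatrix} 1 & 0 & 0 \\ 3 & -1 & 0 \\ -3 & 2 & 5 \end{pmatrix} = 5.$$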
1.9.21. Using the L U factorizations established in Exercise 1.3.25:
(a) $\det \begin{pmatrix} 1 & 1 \\ t_1 & t_2 \end{pmatrix} = t_2 - t_1$,
(b) $\det \begin{pmatrix} 1 & 1 & 1 \\ t_1 & t_2 & t_3 \\ t_1^2 & t_2^2 & t_3^2 \end{pmatrix} = (t_2 - t_1)(t_3 - t_1)(t_3 - t_2)$,
(c) $\det \begin{pmatrix} 1 & 1 & 1 & 1 \\ t_1 & t_2 & t_3 & t_4 \\ t_1^2 & t_2^2 & t_3^2 & t_4^2 \\ t_1^3 & t_2^3 & t_3^3 & t_4^3 \end{pmatrix} = (t_2 - t_1)(t_3 - t_1)(t_3 - t_2)(t_4 - t_1)(t_4 - t_2)(t_4 - t_3)$.
The general formula is found in Exercise 4.4.29.
1.9.22.
(a) By direct substitution:
$$a x + b y = a\,\frac{p d - b q}{a d - b c} + b\,\frac{a q - p c}{a d - b c} = p, \qquad c x + d y = c\,\frac{p d - b q}{a d - b c} + d\,\frac{a q - p c}{a d - b c} = q.$$
!
!
1
1
13 3
1 13
(b) (i) x =
det
det
= 2.6, y =
= 5.2;
0
2
4
0
10
10
!
!
1
5
7
1
4 2
1
4
det
det
= , y=
= .
(ii) x =
2
6
3 2
12
3
12
6
(c) Proof by direct substitution, expanding all the determinants.
0
1
1
3
4
0
1
3
0
1
1
1
7
(d) (i) x = det B
y = det B
,
1C
1C
@2 2
@ 4 2
A= ,
A=
9
9
9
9
0
1
1
1
0
1
0
1
0
1
1 4 3
1
2 1
1
1
8
B
C
B
2C
z = det @ 4 2 2 A = ; (ii) x = det @ 2 3
A = 0,
9
9
2
1
1
0
3
1
1
1
1
0
0
3 1 1
3
2 1
1
1
C
B
B
2 A = 4, z = det @ 1 3 2 C
y = det @ 1 2
A = 7.
2
2
2 3
1
2 1 3
(e) Assuming A is nonsingular, the solution to $A x = b$ is $x_i = \det A_i / \det A$, where $A_i$
is obtained by replacing the $i$th column of A by the right hand side b. See [ 60 ] for a
complete justification.
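The general rule in part (e) is short enough to sketch directly. The Python code below is an illustration only: the names `det` and `cramer` are chosen here, the determinant is computed by cofactor expansion (reasonable only for small matrices), and the sample systems are made up for the demonstration rather than taken from the exercise.

```python
from fractions import Fraction

def det(a):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cramer(a, b):
    """Solve A x = b via x_i = det(A_i) / det(A); requires det(A) != 0."""
    d = det(a)
    if d == 0:
        raise ValueError("Cramer's rule needs a nonsingular matrix")
    n = len(a)
    solution = []
    for i in range(n):
        # A_i: replace the i-th column of A by the right hand side b.
        a_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution

# Hypothetical 2 x 2 system: x + 3 y = 13, 4 x + 2 y = 0.
print(cramer([[1, 3], [4, 2]], [13, 0]))
```

Each unknown costs one extra determinant, so for larger systems Gaussian elimination is far cheaper; the sketch is meant only to make the formula concrete.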


1.9.23.
(a) We can individually reduce A and B to upper triangular forms $U_1$ and $U_2$ with the
determinants equal to the products of their respective diagonal entries. Applying the
analogous elementary row operations to D will reduce it to the upper triangular form
$\begin{pmatrix} U_1 & O \\ O & U_2 \end{pmatrix}$, and its determinant is equal to the product of its diagonal entries, which
are the diagonal entries of both $U_1$ and $U_2$, so $\det D = \det U_1 \det U_2 = \det A \, \det B$.
(b) The same argument as in part (a) proves the result. The row operations applied to A
are also applied to C, but this doesn't affect the final upper triangular form.
0

3
(c) (i) det B
@0
0
(ii)

B
3
B
det B
@ 0

(iii)

B
3
B
det B
@ 0

(iv )

B
2
B
det B
@2

2
4
3
2
1
0
0

2
5 C
A = det(3) det
7
2
0
1
2

2
1
3
0

0
4
1
0

1
5
4
2

0
0
4
9

4
3

5
5 C
C
C = det
3A
2

1
3

0
4
1
1 C
C
B
C = det @ 3
8A
0
3
1

0
0C
C
C = det
2 A
5

5
7

5
2

2
1

41

1
2

det

3
2

= 7 (8) = 56,

2
1
3

1
5

= 3 43 = 129,

0
4C
A det(3) = (5) (3) = 15,
1
!

det

4
9

2
5

= 27 (2) = 54.

Solutions Chapter 2

2.1.1. Commutativity of Addition:
(x + i y) + (u + i v) = (x + u) + i (y + v) = (u + i v) + (x + i y).
Associativity of Addition:
(x + i y) + [ (u + i v) + (p + i q) ] = (x + i y) + [ (u + p) + i (v + q) ]
= (x + u + p) + i (y + v + q)
= [ (x + u) + i (y + v) ] + (p + i q) = [ (x + i y) + (u + i v) ] + (p + i q).
Additive Identity: 0 = 0 + i 0 and
(x + i y) + 0 = x + i y = 0 + (x + i y).
Additive Inverse: −(x + i y) = (−x) + i (−y) and
(x + i y) + [ (−x) + i (−y) ] = 0 = [ (−x) + i (−y) ] + (x + i y).
Distributivity:
(c + d) (x + i y) = (c + d) x + i (c + d) y = (c x + d x) + i (c y + d y) = c (x + i y) + d (x + i y),
c [ (x + i y) + (u + i v) ] = c [ (x + u) + i (y + v) ] = (c x + c u) + i (c y + c v) = c (x + i y) + c (u + i v).
Associativity of Scalar Multiplication:
c [ d (x + i y) ] = c [ (d x) + i (d y) ] = (c d x) + i (c d y) = (c d) (x + i y).
Unit for Scalar Multiplication: 1 (x + i y) = (1 x) + i (1 y) = x + i y.
Note: Identifying the complex number x + i y with the vector $( x, y )^T \in \mathbb{R}^2$ respects the operations of vector addition and scalar multiplication, and so we are in effect reproving that $\mathbb{R}^2$ is a
vector space.
2.1.2. Commutativity of Addition:
$(x_1, y_1) + (x_2, y_2) = (x_1 x_2, y_1 y_2) = (x_2, y_2) + (x_1, y_1)$.
Associativity of Addition:
$(x_1, y_1) + [ (x_2, y_2) + (x_3, y_3) ] = (x_1 x_2 x_3, y_1 y_2 y_3) = [ (x_1, y_1) + (x_2, y_2) ] + (x_3, y_3)$.
Additive Identity: 0 = (1, 1), and
(x, y) + (1, 1) = (x, y) = (1, 1) + (x, y).
Additive Inverse:
$-(x, y) = \left( \frac1x, \frac1y \right)$ and $(x, y) + [ -(x, y) ] = (1, 1) = [ -(x, y) ] + (x, y)$.
Distributivity:
$(c + d)(x, y) = (x^{c+d}, y^{c+d}) = (x^c x^d, y^c y^d) = (x^c, y^c) + (x^d, y^d) = c\,(x, y) + d\,(x, y)$,
$c\,[ (x_1, y_1) + (x_2, y_2) ] = ((x_1 x_2)^c, (y_1 y_2)^c) = (x_1^c x_2^c, y_1^c y_2^c) = (x_1^c, y_1^c) + (x_2^c, y_2^c) = c\,(x_1, y_1) + c\,(x_2, y_2)$.
Associativity of Scalar Multiplication:
$c\,(d\,(x, y)) = c\,(x^d, y^d) = (x^{c d}, y^{c d}) = (c d)(x, y)$.
Unit for Scalar Multiplication: 1 (x, y) = (x, y).

Note: We can uniquely identify a point $(x, y) \in Q$ with the vector $( \log x, \log y )^T \in \mathbb{R}^2$. Then
the indicated operations agree with standard vector addition and scalar multiplication in $\mathbb{R}^2$,
and so Q is just a disguised version of $\mathbb{R}^2$.
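The identification in the note can be tried out numerically. In the Python sketch below the names `add`, `scale` and `to_r2` are hypothetical, and the sample points are arbitrary positive pairs; the checks simply confirm that taking logarithms turns the operations on Q into ordinary vector addition and scalar multiplication, up to floating point round-off.

```python
import math

def add(p, q):
    """'Vector addition' on Q: componentwise multiplication."""
    return (p[0] * q[0], p[1] * q[1])

def scale(c, p):
    """'Scalar multiplication' on Q: componentwise power."""
    return (p[0] ** c, p[1] ** c)

def to_r2(p):
    """Identify (x, y) in Q with (log x, log y) in R^2."""
    return (math.log(p[0]), math.log(p[1]))

p, q, c = (2.0, 3.0), (5.0, 0.5), -1.5

lhs = to_r2(add(p, q))
rhs = tuple(s + t for s, t in zip(to_r2(p), to_r2(q)))
print(lhs, rhs)                      # the two pairs agree up to round-off

lhs = to_r2(scale(c, p))
rhs = tuple(c * s for s in to_r2(p))
print(lhs, rhs)
```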
2.1.3. We denote a typical function in F(S) by f (x) for $x \in S$.
Commutativity of Addition:
(f + g)(x) = f (x) + g(x) = g(x) + f (x) = (g + f )(x).
Associativity of Addition:
[f + (g + h)](x) = f (x) + (g + h)(x) = f (x) + g(x) + h(x) = (f + g)(x) + h(x) = [(f + g) + h](x).
Additive Identity: 0(x) = 0 for all x, and (f + 0)(x) = f (x) = (0 + f )(x).
Additive Inverse: (−f )(x) = −f (x) and
[f + (−f )](x) = f (x) + (−f )(x) = 0 = (−f )(x) + f (x) = [(−f ) + f ](x).
Distributivity:
[(c + d) f ](x) = (c + d) f (x) = c f (x) + d f (x) = (c f )(x) + (d f )(x),
[c (f + g)](x) = c f (x) + c g(x) = (c f )(x) + (c g)(x).
Associativity of Scalar Multiplication:
[c (d f )](x) = c d f (x) = [(c d) f ](x).
Unit for Scalar Multiplication: (1 f )(x) = f (x).
2.1.4. (a) ( 1, 1, 1, 1 )T , ( 1, 1, 1, 1 )T , ( 1, 1, 1, 1 )T , ( 1, 1, 1, 1 )T . (b) Obviously not.

2.1.5. One example is $f(x) \equiv 0$ and $g(x) = x^3 - x$.

2.1.6. (a) f (x) = 4 x + 3; (b) f (x) = 2 x2 x + 1.


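2.1.7.
(a) $\begin{pmatrix} x - y \\ x y \end{pmatrix}$, $\begin{pmatrix} e^x \\ \cos y \end{pmatrix}$, and $\begin{pmatrix} 1 \\ 3 \end{pmatrix}$, which is a constant function.
(b) Their sum is $\begin{pmatrix} x - y + e^x + 1 \\ x y + \cos y + 3 \end{pmatrix}$. Multiplied by −5 is $\begin{pmatrix} -5 x + 5 y - 5 e^x - 5 \\ -5 x y - 5 \cos y - 15 \end{pmatrix}$.
(c) The zero element is the constant function $0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.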

2.1.8. This is the same as the space of functions $F(\mathbb{R}^2, \mathbb{R}^2)$. Explicitly, writing $v = ( v_1(x, y), v_2(x, y) )^T$, and similarly for u and w:
Commutativity of Addition:
$v + w = ( v_1(x, y) + w_1(x, y), \ v_2(x, y) + w_2(x, y) )^T = w + v$.
Associativity of Addition:
$u + [ v + w ] = ( u_1(x, y) + v_1(x, y) + w_1(x, y), \ u_2(x, y) + v_2(x, y) + w_2(x, y) )^T = [ u + v ] + w$.
Additive Identity: $0 = (0, 0)^T$ for all x, y, and
$v + 0 = v = 0 + v$.
Additive Inverse: $-v = ( -v_1(x, y), -v_2(x, y) )^T$, and
$v + (-v) = 0 = (-v) + v$.
Distributivity:
$(c + d)\, v = ( (c + d)\, v_1(x, y), \ (c + d)\, v_2(x, y) )^T = c\, v + d\, v$,
$c\, [ v + w ] = ( c\, v_1(x, y) + c\, w_1(x, y), \ c\, v_2(x, y) + c\, w_2(x, y) )^T = c\, v + c\, w$.
Associativity of Scalar Multiplication:
$c\, [ d\, v ] = ( c d\, v_1(x, y), \ c d\, v_2(x, y) )^T = (c d)\, v$.
Unit for Scalar Multiplication:
$1\, v = v$.

2.1.9. We identify each sample value with the matrix entry $m_{ij} = f(i h, j k)$. In this way, every
sampled function corresponds to a uniquely determined $m \times n$ matrix and conversely. Addition of sample functions, $(f + g)(i h, j k) = f(i h, j k) + g(i h, j k)$, corresponds to matrix
addition, $m_{ij} + n_{ij}$, while scalar multiplication of sample functions, $c\, f(i h, j k)$, corresponds
to scalar multiplication of matrices, $c\, m_{ij}$.
2.1.10. $a + b = (a_1 + b_1, a_2 + b_2, a_3 + b_3, \ldots)$, $c\, a = (c\, a_1, c\, a_2, c\, a_3, \ldots)$. Explicit verification of
the vector space properties is straightforward. An alternative, smarter strategy is to identify $\mathbb{R}^\infty$ as the space of functions $f : \mathbb{N} \to \mathbb{R}$ where $\mathbb{N} = \{ 1, 2, 3, \ldots \}$ is the set of natural
numbers and we identify the function f with its sample vector $f = (f(1), f(2), \ldots)$.

2.1.11. (i) $v + (-1)\, v = 1\, v + (-1)\, v = ( 1 + (-1) )\, v = 0\, v = 0$.
(j) Let $z = c\, 0$. Then $z + z = c\,(0 + 0) = c\, 0 = z$, and so, as in the proof of (h), z = 0.
(k) Suppose $c \ne 0$. Then $v = 1\, v = \left( \frac1c \cdot c \right) v = \frac1c\, (c\, v) = \frac1c\, 0 = 0$.
2.1.12. If 0 and $\tilde 0$ both satisfy axiom (c), then $0 = \tilde 0 + 0 = 0 + \tilde 0 = \tilde 0$.

2.1.13. Commutativity of Addition:
$(v, w) + (\hat v, \hat w) = (v + \hat v, w + \hat w) = (\hat v, \hat w) + (v, w)$.
Associativity of Addition:
$(v, w) + [ (\hat v, \hat w) + (\tilde v, \tilde w) ] = (v + \hat v + \tilde v, \ w + \hat w + \tilde w) = [ (v, w) + (\hat v, \hat w) ] + (\tilde v, \tilde w)$.
Additive Identity: the zero element is (0, 0), and
(v, w) + (0, 0) = (v, w) = (0, 0) + (v, w).
Additive Inverse: −(v, w) = (−v, −w) and
(v, w) + (−v, −w) = (0, 0) = (−v, −w) + (v, w).
Distributivity:
(c + d) (v, w) = ((c + d) v, (c + d) w) = c (v, w) + d (v, w),
$c\, [ (v, w) + (\hat v, \hat w) ] = (c\, v + c\, \hat v, \ c\, w + c\, \hat w) = c\, (v, w) + c\, (\hat v, \hat w)$.
Associativity of Scalar Multiplication:
c (d (v, w)) = (c d v, c d w) = (c d) (v, w).
Unit for Scalar Multiplication: 1 (v, w) = (1 v, 1 w) = (v, w).

2.1.14. Here $V = C^0$ while $W = \mathbb{R}$, and so the indicated pairs belong to the Cartesian product vector space $C^0 \times \mathbb{R}$. The zero element is the pair 0 = (0, 0) where the first 0 denotes
the identically zero function, while the second 0 denotes the real number zero. The laws of
vector addition and scalar multiplication are
(f (x), a) + (g(x), b) = (f (x) + g(x), a + b),
c (f (x), a) = (c f (x), c a).

2.2.1.
(a) If $v = (x, y, z)^T$ satisfies $x - y + 4z = 0$ and $\tilde v = (\tilde x, \tilde y, \tilde z)^T$ also satisfies $\tilde x - \tilde y + 4 \tilde z = 0$, so does $v + \tilde v = (x + \tilde x, y + \tilde y, z + \tilde z)^T$ since $(x + \tilde x) - (y + \tilde y) + 4 (z + \tilde z) = (x - y + 4z) + (\tilde x - \tilde y + 4 \tilde z) = 0$, as does $c\,v = (c x, c y, c z)^T$ since $(c x) - (c y) + 4 (c z) = c (x - y + 4z) = 0$.
(b) For instance, the zero vector $\mathbf{0} = (0, 0, 0)^T$ does not satisfy the equation.
2.2.2. (b,c,d,g,i) are subspaces; the rest are not. Case (j) consists of the 3 coordinate axes and
the line x = y = z.
2.2.3. (a) Subspace: [plot omitted]
(b) Not a subspace: [plot omitted]
(c) Subspace: [plot omitted]
(d) Not a subspace: [plot omitted]
(e) Not a subspace: [plot omitted]
(f) Even though the cylinders are not subspaces, their intersection is the z axis, which is a subspace: [plot omitted]
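2.2.4. Any vector of the form
$$a \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + b \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix} + c \begin{pmatrix} 0 \\ -1 \\ 3 \end{pmatrix} = \begin{pmatrix} a + 2b \\ 2a - c \\ a + b + 3c \end{pmatrix} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$$
will belong to $W$. The coefficient matrix $\begin{pmatrix} 1 & 2 & 0 \\ 2 & 0 & -1 \\ 1 & 1 & 3 \end{pmatrix}$ is nonsingular, and so for any $x = (x, y, z)^T \in \mathbb{R}^3$ we can arrange suitable values of $a, b, c$ by solving the linear system. Thus, every vector in $\mathbb{R}^3$ belongs to $W$ and so $W = \mathbb{R}^3$.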
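The nonsingularity claim in 2.2.4 can be checked by hand or by a short computation. The following is only a sketch: it assumes the spanning vectors are (1, 2, 1), (2, 0, 1), (0, -1, 3), which is what the combination (a + 2b, 2a - c, a + b + 3c) implies, and the helper names `det3` and `solve3` as well as the right-hand side `target` are made up for the illustration.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve m x = rhs by Cramer's rule (adequate for a 3x3 illustration)."""
    d = det3(m)
    solution = []
    for k in range(3):
        # Replace column k of m by the right-hand side.
        mk = [[rhs[r] if col == k else m[r][col] for col in range(3)]
              for r in range(3)]
        solution.append(Fraction(det3(mk), d))
    return solution

# Columns are the assumed spanning vectors (1,2,1), (2,0,1), (0,-1,3).
A = [[1, 2, 0],
     [2, 0, -1],
     [1, 1, 3]]

target = [1, 1, 1]          # arbitrary illustrative vector (x, y, z)
a, b, c = solve3(A, target)
```

A nonzero determinant means `solve3` succeeds for every choice of `target`, which is the statement that W is all of R^3.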
2.2.5. False, with two exceptions: $[0, 0] = \{0\}$ and $(-\infty, \infty) = \mathbb{R}$.
2.2.6.
(a) Yes. For instance, the set $S = \{(x, 0)\} \cup \{(0, y)\}$ consisting of the coordinate axes has the required property, but is not a subspace. More generally, any (finite) collection of 2 or more lines going through the origin satisfies the property, but is not a subspace.
(b) For example, $S = \{(x, y) \mid x, y \geq 0\}$, the positive quadrant.
2.2.7. (a,c,d) are subspaces; (b,e) are not.
2.2.8. Since x = 0 must belong to the subspace, this implies b = A 0 = 0. For a homogeneous
system, if x, y are solutions, so A x = 0 = A y, so are x + y since A(x + y) = A x + A y = 0,
as is c x since A(c x) = c A x = 0.
2.2.9. $L$ and $M$ are strictly lower triangular if $l_{ij} = 0 = m_{ij}$ whenever $i \leq j$. Then $N = L + M$ is strictly lower triangular since $n_{ij} = l_{ij} + m_{ij} = 0$ whenever $i \leq j$, as is $K = c L$ since $k_{ij} = c\, l_{ij} = 0$ whenever $i \leq j$.
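2.2.10. Note $\operatorname{tr}(A + B) = \sum_{i=1}^{n} (a_{ii} + b_{ii}) = \operatorname{tr} A + \operatorname{tr} B$ and $\operatorname{tr}(c A) = \sum_{i=1}^{n} c\, a_{ii} = c \sum_{i=1}^{n} a_{ii} = c \operatorname{tr} A$. Thus, if $\operatorname{tr} A = \operatorname{tr} B = 0$, then $\operatorname{tr}(A + B) = 0 = \operatorname{tr}(c A)$, proving closure.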

2.2.11.
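(a) No. The zero matrix is not an element.
(b) No if $n \geq 2$. For example, $A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ satisfy $\det A = 0 = \det B$, but $\det(A + B) = \det \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1$, so $A + B$ does not belong to the set.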
2.2.12. (d,f,g,h) are subspaces; the rest are not.
2.2.13. (a) Vector space; (b) not a vector space: $(0, 0)$ does not belong; (c) vector space; (d) vector space; (e) not a vector space: if $f$ is non-negative, then $-1 f = -f$ is not (unless $f \equiv 0$); (f) vector space; (g) vector space; (h) vector space.
2.2.14. If $f(1) = 0 = g(1)$, then $(f + g)(1) = 0$ and $(c f)(1) = 0$, so both $f + g$ and $c f$ belong to the subspace. The zero function does not satisfy $f(0) = 1$. For a subspace, $a$ can be anything, while $b = 0$.
2.2.15. All cases except (e,g) are subspaces. In (g), $|x|$ is not in $C^1$.
2.2.16. (a) Subspace; (b) subspace; (c) Not a subspace: the zero function does not satisfy
the condition; (d) Not a subspace: if f (0) = 0, f (1) = 1, and g(0) = 1, g(1) = 0, then f
and g are in the set, but f + g is not; (e) subspace; (f ) Not a subspace: the zero function
does not satisfy the condition; (g) subspace; (h) subspace; (i) Not a subspace: the zero
function does not satisfy the condition.
2.2.17. If $u'' = x u$, $v'' = x v$ are solutions, and $c, d$ constants, then $(c u + d v)'' = c u'' + d v'' = c x u + d x v = x (c u + d v)$, and hence $c u + d v$ is also a solution.
2.2.18. For instance, the zero function $u(x) \equiv 0$ is not a solution.
2.2.19.
(a) It is a subspace of the space of all functions $f\colon [a, b] \to \mathbb{R}^2$, which is a particular instance of Example 2.7. Note that $f(t) = (f_1(t), f_2(t))^T$ is continuously differentiable if and only if its component functions $f_1(t)$ and $f_2(t)$ are. Thus, if $f(t) = (f_1(t), f_2(t))^T$ and $g(t) = (g_1(t), g_2(t))^T$ are continuously differentiable, so are $(f + g)(t) = (f_1(t) + g_1(t), f_2(t) + g_2(t))^T$ and $(c f)(t) = (c f_1(t), c f_2(t))^T$.
(b) Yes: if $f(0) = 0 = g(0)$, then $(c f + d g)(0) = 0$ for any $c, d \in \mathbb{R}$.
2.2.20. (c v + d w) = c v + d w = 0 whenever v = w = 0 and c, d, R.
2.2.21. Yes. The sum of two convergent sequences is convergent, as is any constant multiple of
a convergent sequence.
2.2.22.
(a) If $v, w \in W \cap Z$, then $v, w \in W$, so $c v + d w \in W$ because $W$ is a subspace, and $v, w \in Z$, so $c v + d w \in Z$ because $Z$ is a subspace, hence $c v + d w \in W \cap Z$.
(b) If $w + z, \tilde w + \tilde z \in W + Z$ then $c (w + z) + d (\tilde w + \tilde z) = (c w + d \tilde w) + (c z + d \tilde z) \in W + Z$, since it is the sum of an element of $W$ and an element of $Z$.
(c) Given any $w \in W$ and $z \in Z$, then $w, z \in W \cup Z$. Thus, if $W \cup Z$ is a subspace, the sum $w + z \in W \cup Z$. Thus, either $w + z = \tilde w \in W$ or $w + z = \tilde z \in Z$. In the first case $z = \tilde w - w \in W$, while in the second $w = \tilde z - z \in Z$. We conclude that for any $w \in W$ and $z \in Z$, either $w \in Z$ or $z \in W$. Suppose $W \not\subset Z$. Then we can find $w \in W \setminus Z$, and so for any $z \in Z$, we must have $z \in W$, which proves $Z \subset W$.
2.2.23. If $v, w \in \bigcap W_i$, then $v, w \in W_i$ for each $i$ and so $c v + d w \in W_i$ for any $c, d \in \mathbb{R}$ because $W_i$ is a subspace. Since this holds for all $i$, we conclude that $c v + d w \in \bigcap W_i$.

2.2.24.
(a) They clearly only intersect at the origin. Moreover, every $v = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ y \end{pmatrix}$ can be written as a sum of vectors on the two axes.
(b) Since the only common solution to $x = y$ and $x = 3y$ is $x = y = 0$, the lines only intersect at the origin. Moreover, every $v = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a \\ a \end{pmatrix} + \begin{pmatrix} 3b \\ b \end{pmatrix}$, where $a = -\tfrac12 x + \tfrac32 y$, $b = \tfrac12 x - \tfrac12 y$, can be written as a sum of vectors on each line.
(c) A vector $v = (a, 2a, 3a)^T$ in the line belongs to the plane if and only if $a + 2 (2a) + 3 (3a) = 14 a = 0$, so $a = 0$ and the only common element is $v = \mathbf{0}$. Moreover, every
$$v = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \frac{1}{14} \begin{pmatrix} x + 2y + 3z \\ 2 (x + 2y + 3z) \\ 3 (x + 2y + 3z) \end{pmatrix} + \frac{1}{14} \begin{pmatrix} 13x - 2y - 3z \\ -2x + 10y - 6z \\ -3x - 6y + 5z \end{pmatrix}$$
can be written as a sum of a vector in the line and a vector in the plane.
(d) If $w + z = \tilde w + \tilde z$, then $w - \tilde w = \tilde z - z$. The left hand side belongs to $W$, while the right hand side belongs to $Z$, and so, by the first assumption, they must both be equal to $\mathbf{0}$. Therefore, $w = \tilde w$, $z = \tilde z$.
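The splitting in 2.2.24(c) can be sanity-checked numerically. The sketch below assumes the line is spanned by (1, 2, 3) and the plane is x + 2y + 3z = 0, as the computation a + 2(2a) + 3(3a) = 14a suggests; the function name `decompose` and the sample vector are illustrative choices only.

```python
from fractions import Fraction

def decompose(x, y, z):
    """Split (x, y, z) into a part on the line through (1, 2, 3)
    and a part in the plane x + 2y + 3z = 0."""
    s = x + 2 * y + 3 * z
    line_part = tuple(Fraction(k * s, 14) for k in (1, 2, 3))
    plane_part = (Fraction(13 * x - 2 * y - 3 * z, 14),
                  Fraction(-2 * x + 10 * y - 6 * z, 14),
                  Fraction(-3 * x - 6 * y + 5 * z, 14))
    return line_part, plane_part

line_part, plane_part = decompose(1, -4, 7)   # arbitrary integer sample

# The two parts add back up to the original vector.
recombined = tuple(p + q for p, q in zip(line_part, plane_part))
# The second part satisfies the plane equation.
plane_residual = plane_part[0] + 2 * plane_part[1] + 3 * plane_part[2]
```

Exact rational arithmetic is used so that the two checks come out as exact equalities rather than floating-point approximations.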

2.2.25.
(a) $(v, w) \in V_0 \cap W_0$ if and only if $(v, w) = (v, 0)$ and $(v, w) = (0, w)$, which means $v = 0$, $w = 0$, and hence $(v, w) = (0, 0)$ is the only element of the intersection. Moreover, we can write any element $(v, w) = (v, 0) + (0, w)$.
(b) $(v, w) \in D \cap A$ if and only if $v = w$ and $v = -w$, hence $(v, w) = (0, 0)$. Moreover, we can write $(v, w) = (\tfrac12 v + \tfrac12 w, \tfrac12 v + \tfrac12 w) + (\tfrac12 v - \tfrac12 w, -\tfrac12 v + \tfrac12 w)$ as the sum of an element of $D$ and an element of $A$.
2.2.26.
(a) If $f(-x) = f(x)$, $\tilde f(-x) = \tilde f(x)$, then $(c f + d \tilde f)(-x) = c f(-x) + d \tilde f(-x) = c f(x) + d \tilde f(x) = (c f + d \tilde f)(x)$ for any $c, d \in \mathbb{R}$, and hence it is a subspace.
(b) If $g(-x) = -g(x)$, $\tilde g(-x) = -\tilde g(x)$, then $(c g + d \tilde g)(-x) = c g(-x) + d \tilde g(-x) = -c g(x) - d \tilde g(x) = -(c g + d \tilde g)(x)$, proving it is a subspace. If $f(x)$ is both even and odd, then $f(x) = f(-x) = -f(x)$ and so $f(x) \equiv 0$ for all $x$. Moreover, we can write any function $h(x) = f(x) + g(x)$ as a sum of an even function $f(x) = \tfrac12 \bigl[ h(x) + h(-x) \bigr]$ and an odd function $g(x) = \tfrac12 \bigl[ h(x) - h(-x) \bigr]$.
(c) This follows from part (b), and the uniqueness follows from Exercise 2.2.24(d).

2.2.27. If $A = A^T$ and $A = -A^T$ is both symmetric and skew-symmetric, then $A = O$. Given any square matrix, write $A = S + J$ where $S = \tfrac12 (A + A^T)$ is symmetric and $J = \tfrac12 (A - A^T)$ is skew-symmetric. This verifies the two conditions for complementary subspaces. Uniqueness of the decomposition $A = S + J$ follows from Exercise 2.2.24(d).
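As an illustration of the decomposition in 2.2.27, the sketch below splits a square matrix into its symmetric part (A + A^T)/2 and skew-symmetric part (A - A^T)/2. The function names and the sample matrix are assumptions made for the example, not part of the exercise.

```python
from fractions import Fraction

def transpose(a):
    return [list(row) for row in zip(*a)]

def sym_skew_parts(a):
    """Return (S, J) with S = (A + A^T)/2 symmetric and J = (A - A^T)/2 skew-symmetric."""
    at = transpose(a)
    n = len(a)
    sym = [[Fraction(a[i][k] + at[i][k], 2) for k in range(n)] for i in range(n)]
    skew = [[Fraction(a[i][k] - at[i][k], 2) for k in range(n)] for i in range(n)]
    return sym, skew

A = [[1, 2, 0],
     [4, 3, -1],
     [6, 5, 2]]             # arbitrary illustrative matrix
S, J = sym_skew_parts(A)

# Entrywise sum of the two parts, which should reproduce A.
S_plus_J = [[S[i][k] + J[i][k] for k in range(3)] for i in range(3)]
```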

2.2.28.
(a) By induction, we can show that
$$f^{(n)}(x) = P_n\Bigl(\frac{1}{x}\Bigr) e^{-1/x} = Q_n(x)\, \frac{e^{-1/x}}{x^n},$$
where $P_n(y)$ and $Q_n(x) = x^n P_n(1/x)$ are certain polynomials of degree $n$. Thus,
$$\lim_{x \to 0^+} f^{(n)}(x) = \lim_{x \to 0^+} Q_n(x)\, \frac{e^{-1/x}}{x^n} = Q_n(0) \lim_{y \to \infty} y^n e^{-y} = 0,$$
because the exponential $e^{-y}$ goes to zero faster than any power of $y$ goes to $\infty$.
(b) The Taylor series at $a = 0$ is $0 + 0\,x + 0\,x^2 + \cdots \equiv 0$, which converges to the zero function, not to $e^{-1/x}$.
2.2.29.
(a) The Taylor series is the geometric series $\dfrac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots$.
(b) The ratio test can be used to prove that the series converges precisely when $|x| < 1$.
(c) Convergence of the Taylor series to $f(x)$ for $x$ near 0 suffices to prove analyticity of the function at $x = 0$.

2.2.30.
(a) If $v + a, w + a \in A$, then $(v + a) + (w + a) = (v + w + a) + a \in A$ requires $v + w + a = u \in V$, and hence $a = u - v - w \in A$.
(b) (i), (ii), (iii): [plots omitted]
(c) Every subspace $V \subset \mathbb{R}^2$ is either a point (the origin), or a line through the origin, or all of $\mathbb{R}^2$. Thus, the corresponding affine subspaces are the point $\{a\}$; a line through $a$, or all of $\mathbb{R}^2$ since in this case $a \in V = \mathbb{R}^2$.
(d) Every vector in the plane can be written as $(x, y, z)^T = (\tilde x, \tilde y, \tilde z)^T + (1, 0, 0)^T$ where $(\tilde x, \tilde y, \tilde z)^T$ is an arbitrary vector in the subspace defined by $\tilde x - 2 \tilde y + 3 \tilde z = 0$.
(e) Every such polynomial can be written as $p(x) = q(x) + 1$ where $q(x)$ is any element of the subspace of polynomials that satisfy $q(1) = 0$.

2.3.1. B
@

B
B
2.3.2. B
@

2.3.3.

1
2
5
B
B
C
C
2C
A = 2@ 1 A @ 4 A.
3
2
1
0

3
1
2
2
B
C
B
C
B
C
7C
3
6
C
B
C
B
C
B 4C
C = 3B
C + 2B
C+B
C.
@ 2 A
@ 3A
@ 6A
6A
1
0
4
7
0

1
1
0
C
B C
B C
(a) Yes, since B
@ 2 A = @ 1 A 3@ 1 A;
3 1
00 1
10
1
0 1
0
1
1
0
1
C
C
7 B
4 B C
3 B C
(b) Yes, since B
@ 2 A = 10 @ 2 A + 10 @ 2 A 10 @ 3 A;
2
0 0 14 0
1
0
1
1
0
1
3
1
0
2
B
C
B C
B
C
B
C
0C
B2C
B 1 C
B 0C
B
C = c1 B C + c2 B
C + c3 B
C does not have a
(c) No, since the vector equation B
@ 1 A
@0A
@ 3A
@ 1A
2
1
0
1
solution.

2.3.4. Cases (b), (c), (e) span $\mathbb{R}^2$.


2.3.5.


(a) The line ( 3 t, 0, t )T :



(b) The plane z = 35 x

6
5

y:


(c) The plane z = x y:


2.3.6. They are the same. Indeed, since $v_1 = u_1 + 2 u_2$, $v_2 = u_1 + u_2$, every vector $v \in V$ can be written as a linear combination $v = c_1 v_1 + c_2 v_2 = (c_1 + c_2) u_1 + (2 c_1 + c_2) u_2$ and hence belongs to $U$. Conversely, since $u_1 = -v_1 + 2 v_2$, $u_2 = v_1 - v_2$, every vector $u \in U$ can be written as a linear combination $u = c_1 u_1 + c_2 u_2 = (-c_1 + c_2) v_1 + (2 c_1 - c_2) v_2$, and hence belongs to $V$.
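2.3.7. (a) Every symmetric matrix has the form
$$\begin{pmatrix} a & b \\ b & c \end{pmatrix} = a \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + c \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} + b \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
(b) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$.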

2.3.8.
(a) They span $P^{(2)}$ since $a x^2 + b x + c = \tfrac12 (a - 2b + c)(x^2 + 1) + \tfrac12 (a - c)(x^2 - 1) + b (x^2 + x + 1)$.
(b) They span $P^{(3)}$ since $a x^3 + b x^2 + c x + d = a (x^3 - 1) + b (x^2 + 1) + c (x - 1) + (a - b + c + d)\,1$.
(c) They do not span $P^{(3)}$ since $a x^3 + b x^2 + c x + d = c_1 x^3 + c_2 (x^2 + 1) + c_3 (x^2 - x) + c_4 (x + 1)$ cannot be solved when $b + c - d \neq 0$.
2.3.9. (a) Yes. (b) No. (c) No. (d) Yes: $\cos^2 x = 1 - \sin^2 x$. (e) No. (f) No.

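2.3.10. (a) $\sin 3x = \cos\bigl(3x - \tfrac12 \pi\bigr)$; (b) $\cos x - \sin x = \sqrt{2}\, \cos\bigl(x + \tfrac14 \pi\bigr)$; (c) $3 \cos 2x + 4 \sin 2x = 5 \cos\bigl(2x - \tan^{-1} \tfrac43\bigr)$; (d) $\cos x \sin x = \tfrac12 \sin 2x = \tfrac12 \cos\bigl(2x - \tfrac12 \pi\bigr)$.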
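The phase-shift identities of 2.3.10 are easy to spot-check numerically. The following sketch compares each left-hand side with a single shifted cosine on a grid of sample points; the identities are written out in the code as they are understood here, and the helper `max_gap`, the grid, and the dictionary keys are all illustrative assumptions.

```python
import math

def max_gap(lhs, rhs, samples=200):
    """Largest absolute difference between lhs and rhs on a grid over [-pi, pi]."""
    xs = [-math.pi + 2 * math.pi * k / samples for k in range(samples + 1)]
    return max(abs(lhs(x) - rhs(x)) for x in xs)

identities = {
    "a": (lambda x: math.sin(3 * x),
          lambda x: math.cos(3 * x - math.pi / 2)),
    "b": (lambda x: math.cos(x) - math.sin(x),
          lambda x: math.sqrt(2) * math.cos(x + math.pi / 4)),
    "c": (lambda x: 3 * math.cos(2 * x) + 4 * math.sin(2 * x),
          lambda x: 5 * math.cos(2 * x - math.atan2(4, 3))),
    "d": (lambda x: math.cos(x) * math.sin(x),
          lambda x: 0.5 * math.cos(2 * x - math.pi / 2)),
}

gaps = {name: max_gap(lhs, rhs) for name, (lhs, rhs) in identities.items()}
```

Agreement on a grid is of course not a proof; it only guards against a sign or phase slip when the identities are transcribed.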

2.3.11. (a) If $u_1$ and $u_2$ are solutions, so is $u = c_1 u_1 + c_2 u_2$ since $u'' - 4 u' + 3 u = c_1 (u_1'' - 4 u_1' + 3 u_1) + c_2 (u_2'' - 4 u_2' + 3 u_2) = 0$. (b) span $\{ e^x, e^{3x} \}$; (c) 2.
2.3.12. Each is a solution, and the general solution u(x) = c1 + c2 cos x + c3 sin x is a linear
combination of the three independent solutions.

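2.3.13. (a) $e^{2x}$; (b) $\cos 2x$, $\sin 2x$; (c) $e^{3x}$, $1$; (d) $e^{-x}$, $e^{-3x}$; (e) $e^{-x/2} \cos \tfrac{\sqrt3}{2} x$, $e^{-x/2} \sin \tfrac{\sqrt3}{2} x$; (f) $e^{5x}$, $1$, $x$; (g) $e^{x/\sqrt2} \cos \tfrac{x}{\sqrt2}$, $e^{x/\sqrt2} \sin \tfrac{x}{\sqrt2}$, $e^{-x/\sqrt2} \cos \tfrac{x}{\sqrt2}$, $e^{-x/\sqrt2} \sin \tfrac{x}{\sqrt2}$.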
2.3.14. (a) If $u_1$ and $u_2$ are solutions, so is $u = c_1 u_1 + c_2 u_2$ since $u'' + 4 u = c_1 (u_1'' + 4 u_1) + c_2 (u_2'' + 4 u_2) = 0$, $u(0) = c_1 u_1(0) + c_2 u_2(0) = 0$, $u(\pi) = c_1 u_1(\pi) + c_2 u_2(\pi) = 0$.
(b) span $\{ \sin 2x \}$
!

1 2x
2
= 2 f1 (x) + f2 (x) f3 (x); (b) not in the span; (c)
2.3.15. (a)
1
x
1
!
2x
f2 (x) f3 (x); (d) not in the span; (e)
= 2 f1 (x) f3 (x).
0
2.3.16. True, since $\mathbf{0} = 0\, v_1 + \cdots + 0\, v_n$.
2.3.17. False. For example, if z =

B C
@ 1 A,

B C
@ 0 A,

B C
@ 1 A,

B C
@ 0 A,

v=
w=
0
0
1
0
1
c1 + c 3
C
the equation w = c1 u + c2 v + c3 z = B
@ c2 + c3 A has no solution.
0
0

u=

= f1 (x)

then z = u + v, but

2.3.18. By the assumption, any $v \in V$ can be written as a linear combination
$v = c_1 v_1 + \cdots + c_m v_m = c_1 v_1 + \cdots + c_m v_m + 0\, v_{m+1} + \cdots + 0\, v_n$
of the combined collection.
2.3.19.
(a) If $v = \sum_{j=1}^{m} c_j v_j$ and $v_j = \sum_{i=1}^{n} a_{ij} w_i$, then $v = \sum_{i=1}^{n} b_i w_i$ where $b_i = \sum_{j=1}^{m} a_{ij} c_j$, or, in vector language, $b = A c$.
(b) Every $v \in V$ can be written as a linear combination of $v_1, \ldots, v_n$, and hence, by part (a), a linear combination of $w_1, \ldots, w_m$, which shows that $w_1, \ldots, w_m$ also span $V$.


2.3.20.
(a) If $v = \sum_{i=1}^{m} a_i v_i$, $w = \sum_{i=1}^{n} b_i v_i$ are two finite linear combinations, so is $c v + d w = \sum_{i=1}^{\max\{m, n\}} (c\, a_i + d\, b_i)\, v_i$, where we set $a_i = 0$ if $i > m$ and $b_i = 0$ if $i > n$.
(b) The space $P^{(\infty)}$ of all polynomials, since every polynomial is a finite linear combination of monomials and vice versa.

2.3.21. (a) Linearly independent; (b) linearly dependent; (c) linearly dependent;
(d) linearly independent; (e) linearly dependent; (f ) linearly dependent;
(g) linearly dependent; (h) linearly independent; (i) linearly independent.
2.3.22. (a) The only solution to the homogeneous linear system
1
0
1
0
0 1
2
2
1
C
B
C
B
B C
0C
B 2 C
B 3C
C=0
C + c3 B
B C + c2 B
is
c1 = c2 = c3 = 0.
c1 B
@ 1A
@ 1 A
@2A
1
1
1
(b) All but the second lie in the span. (c) a c + d = 0.
2.3.23.
(a) The only solution to the homogeneous linear system
0 1
0
1
0
1
0
1
1
1
1
1
B C
B
C
B
C
B
C
1C
B 1C
B 1 C
B 1 C
B C + c2 B
C + c3 B
C + c4 B
C=0
A c = c1 B
@1A
@ 1 A
@ 0A
@ 0A
0
0
1
1
0
1
1
1
1
1
B
1
1 1 1 C
C
B
C is c = 0.
with nonsingular coefficient matrix A = B
@ 1 1
0
1A
0
0
1 1
(b) Since A is nonsingular, the inhomogeneous linear system
1
0
1
0
1
0
0 1
1
1
1
1
C
B
C
B
C
B
B C
1C
B 1 C
B 1 C
B 1C
C
C + c4 B
C + c3 B
B C + c2 B
v = A c = c1 B
@ 0A
@ 0A
@ 1 A
@1A
1
1
0
0
has a solution c = 0
A11v for 0
any1v R04 . 1
1
1
1
B C
B C
B
C
B0C
3B1C
1B 1C
B C = 8B C + 8B
C+
(c)
@1A
@ 1 A
@0A
1
0
0

1
B
C
3 B 1 C
C
4B
@ 0A
1

1
B
C
1 B 1 C
C
4B
@ 0A
1

2.3.24. (a) Linearly dependent; (b) linearly dependent; (c) linearly independent; (d) linearly
dependent; (e) linearly dependent; (f ) linearly independent.
2.3.25.
0
0 False: 1
0
1 0 0
B
C
B
@0 1 0A@1
0
0 0 1

1
0
0

0
0
B
0C
A@0
1
1

0
1
0

1
1
B
0C
A@0
0
0

0
0
1

2.3.26. False: the zero vector always belongs to the span.


2.3.27. Yes, when it is the zero vector.

0
0
B
1C
A+@0
1
0

1
0
0

0
0
B
1C
A+@1
0
0

0
0
1

1
0C
A = O.
0

2.3.28. Because $x, y$ are linearly independent, $0 = c_1 u + c_2 v = (a c_1 + c\, c_2) x + (b c_1 + d c_2) y$ if and only if $a c_1 + c\, c_2 = 0$, $b c_1 + d c_2 = 0$. The latter linear system has a nonzero solution $(c_1, c_2) \neq 0$, and so $u, v$ are linearly dependent, if and only if the determinant of the coefficient matrix is zero: $\det \begin{pmatrix} a & c \\ b & d \end{pmatrix} = a d - b c = 0$, proving the result. The full collection $x, y, u, v$ is linearly dependent since, for example, $a x + b y - u + 0\, v = 0$ is a nontrivial linear combination.
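The determinant criterion of 2.3.28 can be turned into a small procedure. In this sketch, u = a x + b y and v = c x + d y are represented only by their coefficients with respect to the independent pair x, y; the function name `null_combination` and the sample coefficients are hypothetical.

```python
def null_combination(a, b, c, d):
    """For u = a x + b y and v = c x + d y (x, y independent), return a nonzero
    pair (c1, c2) with c1 u + c2 v = 0, or None when a d - b c != 0."""
    if a * d - b * c != 0:
        return None                      # u, v are linearly independent
    # When a d - b c = 0, both (c, -a) and (d, -b) solve the 2x2 system.
    for candidate in ((c, -a), (d, -b)):
        if candidate != (0, 0):
            return candidate
    return (1, 0)                        # a = b = c = d = 0, so u = v = 0

dependent_pair = null_combination(1, 2, 2, 4)    # v = 2u
independent_pair = null_combination(1, 0, 0, 1)  # u = x, v = y

# Coefficients of x and y in c1 u + c2 v for the dependent example.
c1, c2 = dependent_pair
residual = (c1 * 1 + c2 * 2, c1 * 2 + c2 * 4)
```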
2.3.29. The statement is false. For example, any set containing the zero element that does not
span V is linearly dependent.
2.3.30. (b) If the only solution to $A c = 0$ is the trivial one $c = 0$, then the only linear combination which adds up to zero is the trivial one with $c_1 = \cdots = c_k = 0$, proving linear independence. (c) The vector $b$ lies in the span if and only if $b = c_1 v_1 + \cdots + c_k v_k = A c$ for some $c$, which implies that the linear system $A c = b$ has a solution.
2.3.31.
(a) Since $v_1, \ldots, v_n$ are linearly independent,
$0 = c_1 v_1 + \cdots + c_k v_k = c_1 v_1 + \cdots + c_k v_k + 0\, v_{k+1} + \cdots + 0\, v_n$
if and only if $c_1 = \cdots = c_k = 0$.
(b) This is false. For example, $v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $v_2 = \begin{pmatrix} 2 \\ 2 \end{pmatrix}$, are linearly dependent, but the subset consisting of just $v_1$ is linearly independent.
2.3.32.
(a) They are linearly dependent since $(x^2 - 3) + 2 (2 - x) - (x - 1)^2 \equiv 0$.
(b) They do not span $P^{(2)}$.
2.3.33. (a) Linearly dependent; (b) linearly independent; (c) linearly dependent; (d) linearly
independent; (e) linearly dependent; (f ) linearly dependent; (g) linearly independent;
(h) linearly independent; (i) linearly independent.
2.3.34. When $x > 0$, we have $f(x) - g(x) \equiv 0$, proving linear dependence. On the other hand, if $c_1 f(x) + c_2 g(x) \equiv 0$ for all $x$, then at, say $x = 1$, we have $c_1 + c_2 = 0$ while at $x = -1$, we must have $-c_1 + c_2 = 0$, and so $c_1 = c_2 = 0$, proving linear independence.
2.3.35.
(a) $0 = \sum_{i=1}^{k} c_i\, p_i(x) = \sum_{j=0}^{n} \sum_{i=1}^{k} c_i\, a_{ij}\, x^j$ if and only if $\sum_{i=1}^{k} c_i\, a_{ij} = 0$, $j = 0, \ldots, n$, or, in matrix notation, $A^T c = 0$. Thus, the polynomials are linearly independent if and only if the linear system $A^T c = 0$ has only the trivial solution $c = 0$ if and only if its $(n + 1) \times k$ coefficient matrix has rank $A^T = \operatorname{rank} A = k$.
(b) $q(x) = \sum_{j=0}^{n} b_j x^j = \sum_{i=1}^{k} c_i\, p_i(x)$ if and only if $A^T c = b$.


1

1
0 0 1
0
B 4 2 0 1
0C
B
C
C
B 0 4 0 0
C has rank 4 and so they are linearly dependent.
(c) A = B
1
B
C
@ 1
0 1 0
0A
1
2 0 4 1
(d) q(x) is not in the span.
2.3.36. Suppose the linear combination $p(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n \equiv 0$ for all $x$. Thus, every real $x$ is a root of $p(x)$, but the Fundamental Theorem of Algebra says this is only possible if $p(x)$ is the zero polynomial with coefficients $c_0 = c_1 = \cdots = c_n = 0$.

2.3.37.
(a) If $c_1 f_1(x) + \cdots + c_n f_n(x) \equiv 0$, then $c_1 f_1(x_i) + \cdots + c_n f_n(x_i) = 0$ at all sample points, and so $c_1 f_1 + \cdots + c_n f_n = 0$ for the sample vectors. Thus, linear dependence of the functions implies linear dependence of their sample vectors.
(b) Sampling $f_1(x) = 1$ and $f_2(x) = x^2$ at $-1, 1$ produces the linearly dependent sample vectors $f_1 = f_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
(c) Sampling at 0, 41 ,
0 1
1
B C
B1C
B C
B C
B 1 C,
B C
B C
B C
@1A

1
3
20, 4 ,1,
1
B C
2 C
B
B 2 C
C
B
B 0 C,
B C
C
B
2C
B
@ 2 A

leads to the linearly independent sample vectors


1
0
0
1
0
1
1
0
0
B
B
B
B
B
B
B
B
@

2
2

C
C
C
C
1 C
,
C
C
2C
2 A

B
C
B 0C
B
C
B
C
B 1 C,
B
C
B
C
B
C
@ 0A

B
C
B 1C
B
C
B
C
B 0 C.
B
C
B
C
B
C
@ 1 A

2.3.38.
(a) Suppose $c_1 f_1(t) + \cdots + c_n f_n(t) \equiv 0$ for all $t$. Then $c_1 f_1(t_0) + \cdots + c_n f_n(t_0) = 0$, and hence, by linear independence of the sample vectors, $c_1 = \cdots = c_n = 0$, which proves linear independence of the functions.
(b) $c_1 f_1(t) + c_2 f_2(t) = \begin{pmatrix} 2 c_2 t + (c_1 - c_2) \\ 2 c_2 t^2 + (c_1 - c_2)\, t \end{pmatrix} \equiv 0$ if and only if $c_2 = 0$, $c_1 - c_2 = 0$, and so $c_1 = c_2 = 0$, proving linear independence. However, at any $t_0$, the vectors $f_2(t_0) = (2 t_0 - 1) f_1(t_0)$ are scalar multiples of each other, and hence linearly dependent.
2.3.39.
(a) Suppose $c_1 f(x) + c_2 g(x) \equiv 0$ for all $x$ for some $c = (c_1, c_2)^T \neq 0$. Differentiating, we find $c_1 f'(x) + c_2 g'(x) \equiv 0$ also, and hence $\begin{pmatrix} f(x) & g(x) \\ f'(x) & g'(x) \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = 0$ for all $x$. The homogeneous system has a nonzero solution if and only if the coefficient matrix is singular, which requires its determinant $W[f(x), g(x)] = 0$.
(b) This is the contrapositive of part (a), since if $f, g$ were not linearly independent, then their Wronskian would vanish everywhere.
(c) Suppose $c_1 f(x) + c_2 g(x) = c_1 x^3 + c_2 |x|^3 \equiv 0$. Then, at $x = 1$, $c_1 + c_2 = 0$, whereas at $x = -1$, $-c_1 + c_2 = 0$. Therefore, $c_1 = c_2 = 0$, proving linear independence. On the other hand, $W[x^3, |x|^3] = x^3 (3 x^2 \operatorname{sign} x) - (3 x^2)\, |x|^3 \equiv 0$.
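Part (c) of 2.3.39 shows that a vanishing Wronskian does not force linear dependence. The sketch below evaluates W[x^3, |x|^3] at a few integer points and, separately, the determinant of the sample system at x = 1 and x = -1. The derivative of |x|^3 is written as 3 x |x|, which equals 3 x^2 sign x; all function names are illustrative.

```python
def f(x):
    return x ** 3

def g(x):
    return abs(x) ** 3

def df(x):
    return 3 * x ** 2

def dg(x):
    # d/dx |x|^3 = 3 x^2 sign(x) = 3 x |x|
    return 3 * x * abs(x)

def wronskian(x):
    return f(x) * dg(x) - df(x) * g(x)

wronskian_values = [wronskian(x) for x in range(-5, 6)]

# Sampling c1 f + c2 g = 0 at x = 1 and x = -1 gives a 2x2 system in (c1, c2);
# a nonzero determinant forces c1 = c2 = 0.
sample_det = f(1) * g(-1) - f(-1) * g(1)
```

Integer sample points keep the arithmetic exact, so the Wronskian values are exactly zero rather than small floating-point numbers.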

2.4.1. Only (a) and (c) are bases.


2.4.2. Only (b) is a basis.
0

1 0

0
1
C B C
2.4.3. (a) B
@ 0 A , @ 1 A;
2
0

(b)

1 0 1
3
1
B4C B4C
B C , B C;
@ 1A @ 0A

(c)

0
B
B
B
@

1 0

1 0

2
1
1
B
C B C
1C
C B 0C B0C
C, B
C , B C.
0A @ 1A @0A
0
0
1

2.4.4.
(a) They0do not span R 3 because
the linear system A c = b with coefficient matrix
1
1
3
2
4
C
A=B
@ 0 1 1 1 A does not have a solution for all b since rank A = 2.
2
1 1
3
(b) 4 vectors in R 3 are automatically linearly dependent.
(c) No, because if $v_1, v_2, v_3, v_4$ don't span $\mathbb{R}^3$, no subset of them will span it either.
(d) 2, because v1 and v2 are linearly independent and span the subspace, and hence form a
basis.
2.4.5.
(a) They0span R 3 because the1linear system A c = b with coefficient matrix
1
2
0
1
3C
A=B
@ 1 2 2
A has a solution for all b since rank A = 3.
2
5
1 1
(b) 4 vectors in R 3 are automatically linearly dependent.
(c) Yes, because v1 , v2 , v3 also span R 3 and so form a basis.
(d) 3 because they span all of R 3 .
2.4.6.

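(a) Solving the defining equation, the general vector in the plane is $x = \begin{pmatrix} 2y + 4z \\ y \\ z \end{pmatrix}$ where $y, z$ are arbitrary. We can write
$$x = y \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} + z \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix} = (y + 2z) \begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix} + (y + z) \begin{pmatrix} 0 \\ 2 \\ -1 \end{pmatrix}$$
and hence both pairs of vectors span the plane. Both pairs are linearly independent since they are not parallel, and hence both form a basis.
(b) $\begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix} = (-1) \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 2 \\ -1 \end{pmatrix} = 2 \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix} - \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}$;
(c) Any two linearly independent solutions, e.g., $\begin{pmatrix} 6 \\ 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 10 \\ 1 \\ 2 \end{pmatrix}$, will form a basis.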
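The three candidate bases in 2.4.6 can be checked mechanically. This sketch assumes the plane is x - 2y - 4z = 0, which is what the general solution (2y + 4z, y, z) corresponds to, and it uses the cross product as a test for two vectors in R^3 being non-parallel; the function names are made up for the illustration.

```python
def in_plane(v):
    """Assumed defining equation of the plane: x - 2y - 4z = 0."""
    x, y, z = v
    return x - 2 * y - 4 * z == 0

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def is_basis_of_plane(u, v):
    # Two non-parallel vectors lying in a 2-dimensional plane form a basis of it.
    return in_plane(u) and in_plane(v) and cross(u, v) != (0, 0, 0)

pairs = [((2, 1, 0), (4, 0, 1)),
         ((2, -1, 1), (0, 2, -1)),
         ((6, 1, 1), (10, 1, 2))]

checks = [is_basis_of_plane(u, v) for u, v in pairs]
```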

2.4.7. (a) (i) Left handed basis; (ii) right handed basis; (iii) not a basis; (iv) right handed basis. (b) Switching two columns or multiplying a column by $-1$ changes the sign of the determinant. (c) If $\det A = 0$, its columns are linearly dependent and hence can't form a basis.

2.4.8.
T
T
, 13 , 32 , 0, 1 ; dim = 2.
(a) 23 , 56 , 1, 0
(b) The condition $p(1) = 0$ says $a + b + c = 0$, so $p(x) = (-b - c)\, x^2 + b x + c = b (-x^2 + x) + c (-x^2 + 1)$. Therefore $-x^2 + x$, $-x^2 + 1$ is a basis, and so dim = 2.
(c) $e^x$, $\cos 2x$, $\sin 2x$ is a basis, so dim = 3.
0

3
C
2.4.9. (a) B
@ 1 A, dim = 1; (b)
1

1 0

2
0
B C B
C
@ 0 A, @ 1 A, dim = 2; (c)
1
3

1 0

1 0

B
C B C B
C
B 0 C B 1 C B 2 C
B
C, B C , B
C,
@ 1 A @ 1 A @ 1 A

dim = 3.

2.4.10. (a) We have $a + b t + c t^2 = c_1 (1 + t^2) + c_2 (t + t^2) + c_3 (1 + 2t + t^2)$ provided $a = c_1 + c_3$, $b = c_2 + 2 c_3$, $c = c_1 + c_2 + c_3$. The coefficient matrix of this linear system, $\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 1 & 1 \end{pmatrix}$, is nonsingular, and hence there is a solution for any $a, b, c$, proving that they span the space of quadratic polynomials. Also, they are linearly independent since the linear combination is zero if and only if $c_1, c_2, c_3$ satisfy the corresponding homogeneous linear system $c_1 + c_3 = 0$, $c_2 + 2 c_3 = 0$, $c_1 + c_2 + c_3 = 0$, and hence $c_1 = c_2 = c_3 = 0$. (Or, you can use the fact that $\dim P^{(2)} = 3$ and the spanning property to conclude that they form a basis.)
(b) $1 + 4t + 7t^2 = 2 (1 + t^2) + 6 (t + t^2) - (1 + 2t + t^2)$
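The coordinates in 2.4.10 can be computed in closed form. Eliminating c1 and c2 from a = c1 + c3, b = c2 + 2 c3, c = c1 + c2 + c3 gives c = a + b - 2 c3, and the sketch below simply back-substitutes from there. The function name `coordinates` is hypothetical.

```python
from fractions import Fraction

def coordinates(a, b, c):
    """Coordinates (c1, c2, c3) of a + b t + c t^2 in the basis
    1 + t^2, t + t^2, 1 + 2t + t^2."""
    c3 = Fraction(a + b - c, 2)      # from c = (a - c3) + (b - 2 c3) + c3
    c1 = a - c3
    c2 = b - 2 * c3
    return c1, c2, c3

part_b = coordinates(1, 4, 7)        # the polynomial 1 + 4t + 7t^2

# Rebuild (a, b, c) from the coordinates as a consistency check.
c1, c2, c3 = part_b
rebuilt = (c1 + c3, c2 + 2 * c3, c1 + c2 + c3)
```

The result for `part_b` should agree with the combination stated in part (b), namely coefficients 2, 6 and -1.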

2.4.11. (a) $a + b t + c t^2 + d t^3 = c_1 + c_2 (1 - t) + c_3 (1 - t)^2 + c_4 (1 - t)^3$ provided $a = c_1 + c_2 + c_3 + c_4$, $b = -c_2 - 2 c_3 - 3 c_4$, $c = c_3 + 3 c_4$, $d = -c_4$. The coefficient matrix $\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & -1 & -2 & -3 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & -1 \end{pmatrix}$ is nonsingular, and hence they span $P^{(3)}$. Also, they are linearly independent since the linear combination is zero if and only if $c_1, c_2, c_3, c_4$ satisfy the corresponding homogeneous linear system, whose only solution is $c_1 = c_2 = c_3 = c_4 = 0$. (Or, you can use the fact that $\dim P^{(3)} = 4$ and the spanning property to conclude that they form a basis.) (b) $1 + t^3 = 2 - 3 (1 - t) + 3 (1 - t)^2 - (1 - t)^3$.
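The expansion in 2.4.11(b) can be confirmed by multiplying out the powers of (1 - t). In this sketch polynomials are stored as coefficient lists with the constant term first; the helper names `poly_mul` and `expand_in_t` are assumptions made for the illustration.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def expand_in_t(coeffs):
    """Expand sum_k coeffs[k] * (1 - t)^k into ordinary powers of t."""
    total = [0] * len(coeffs)
    power = [1]                          # (1 - t)^0
    for c in coeffs:
        for i, a in enumerate(power):
            total[i] += c * a
        power = poly_mul(power, [1, -1]) # next power of (1 - t)
    return total

# Coordinates 2, -3, 3, -1 with respect to 1, (1 - t), (1 - t)^2, (1 - t)^3.
result = expand_in_t([2, -3, 3, -1])
```

The list `result` holds the coefficients of 1, t, t^2, t^3, so it should describe the polynomial 1 + t^3.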
2.4.12. (a) They are linearly dependent because $2 p_1 - p_2 + p_3 \equiv 0$. (b) The dimension is 2, since $p_1, p_2$ are linearly independent and span the subspace, and hence form a basis.
0

1 0

1 0

1 0

1
1 C B 1 C
1
B C B C B
2 C B
2C
C B
B C B
1C B
C B 0C B 2 C
C, B
C are linearly independent and
B C , B 2 C, B
(a) The sample vectors B
C B 1 C B
B1C B
0 C
A @
A
@ A @ 0
A @
2
2
0
1
2
2
hence form a basis for R 4 the space of sample functions.
0 1
0
1
0
1
0 1
1
1
0
1

B
B 2 C
B C
B1C
C
2C
B C
C
C
1B
2 2B
2 + 2B
B1C
B 2 C
B 2 C
C
B
B
B 4 C= B C
C
C.
(b) Sampling x produces B
B1C
C
C
2B
8 B
8 B
0 C
@2A
@1A
@ 0 A
@
A
3
2
2
1
2
4
2
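2.4.14.
(a) $E_{11} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $E_{12} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $E_{21} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, $E_{22} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ is a basis since we can uniquely write any $\begin{pmatrix} a & b \\ c & d \end{pmatrix} = a E_{11} + b E_{12} + c E_{21} + d E_{22}$.
(b) Similarly, the matrices $E_{ij}$ with a 1 in position $(i, j)$ and all other entries 0, for $i = 1, \ldots, m$, $j = 1, \ldots, n$, form a basis for $M_{m \times n}$, which therefore has dimension $m n$.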
2.4.13.

2.4.15. k 6= 1, 2.

2.4.16. A basis is given by the matrices Eii , i = 1, . . . , n which have a 1 in the ith diagonal
position and all other entries 0.

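2.4.17.
(a) $E_{11} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, $E_{12} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $E_{22} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$; dimension = 3.
(b) A basis is given by the matrices $E_{ij}$ with a 1 in position $(i, j)$ and all other entries 0 for $1 \leq i \leq j \leq n$, so the dimension is $\tfrac12 n (n + 1)$.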

2.4.18. (a) Symmetric: dim = 3; skew-symmetric: dim = 1; (b) symmetric: dim = 6; skew-symmetric: dim = 3; (c) symmetric: dim = $\tfrac12 n (n + 1)$; skew-symmetric: dim = $\tfrac12 n (n - 1)$.
2.4.19.
(a) If a row (column) of A adds up to a and the corresponding row (column) of B adds up
to b, then the corresponding row (column) of C = A + B adds up to c = a + b. Thus,
if all row and column sums of A and B are the same, the same is true for C. Similarly,
the row (column) sums of c A are c times the row (column) sums of A, and hence all the
same if A is a semi-magic square.


b c
e fC
(b) A matrix A =
A is a semi-magic square if and only if
g h j
a + b + c = d + e + f = g + h + j = a + d + e = b + e + h = c + f + j.
The
general
solution
to
this system is
1
1
0
1
0
1
0
0
0
0 1
0 0 1
1 1 1
1 0 1
1 1 0
C
C
B
B
B
B
1C
1 0C
A = eB
A + g@ 1 0 0A + h@1 0 0A + j @1 0
A + f @ 1 0
@ 1
0 0
0 1 0
1 0 0
0 0
0
0
0 0
0
1
0
1
0
1
1 0 0
0 1 0
0 0 1
C
B
C
B
C
= (e g) B
@ 0 1 0 A + (g + j e) @ 1 0 0 A + g @ 0 1 0 A +
0 0 1
0 0 1
1 0 0
0
0
1
1
1 0 0
0 0 1
B
C
C
+fB
@ 0 0 1 A + (h f ) @ 1 0 0 A ,
0 1 0
0 1 0
which is a linear combination of permutation matrices.
(c) The dimension is 5, with any 5 of the 6 permutation matrices forming a basis.
(d) Yes,
by the same
reasoning as 1
in part
(a). Its dimension
is 3, with basis
0
1 0
0
1
2 2 1
2 1 2
1 2
2
B
B
B
C
4C
1 1C
@ 2 1
A, @1
A , @ 4 1 2 A.
3 0
0
0
01 3 0
0
0 10
30
1
2 2 1
2 1 2
1 2
2
B
C
B
C
B
(e) A = c1 @ 2 1
4 A + c2 @ 1
1 1 A + c3 @ 4 1 2 C
A for any c1 , c2 , c3 .
3 0
0
0
3 0
0 0
3
B
@d

0
0C
A
1

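2.4.20. For instance, take $v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $v_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, $v_3 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Then $\begin{pmatrix} 2 \\ 1 \end{pmatrix} = 2 v_1 + v_2 = v_1 + v_3$. In fact, there are infinitely many different ways of writing this vector as a linear combination of $v_1, v_2, v_3$.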
2.4.21.
(a) By Theorem 2.31, we only need prove linear independence. If $0 = c_1 A v_1 + \cdots + c_n A v_n = A (c_1 v_1 + \cdots + c_n v_n)$, then, since $A$ is nonsingular, $c_1 v_1 + \cdots + c_n v_n = 0$, and hence $c_1 = \cdots = c_n = 0$.
(b) $A e_i$ is the $i$th column of $A$, and so a basis consists of the column vectors of the matrix.
2.4.22. Since $V \neq \{0\}$, at least one $v_i \neq 0$. Let $v_{i_1} \neq 0$ be the first nonzero vector in the list $v_1, \ldots, v_n$. Then, for each $k = i_1 + 1, \ldots, n - 1$, suppose we have selected linearly independent vectors $v_{i_1}, \ldots, v_{i_j}$ from among $v_1, \ldots, v_k$. If $v_{i_1}, \ldots, v_{i_j}, v_{k+1}$ form a linearly independent set, we set $v_{i_{j+1}} = v_{k+1}$; otherwise, $v_{k+1}$ is a linear combination of $v_{i_1}, \ldots, v_{i_j}$, and is not needed in the basis. The resulting collection $v_{i_1}, \ldots, v_{i_m}$ forms a basis for $V$ since they are linearly independent by design, and span $V$ since each $v_i$ either appears in the basis, or is a linear combination of the basis elements that were selected before it. We have $\dim V = n$ if and only if $v_1, \ldots, v_n$ are linearly independent and so form a basis for $V$.
2.4.23. This is a special case of Exercise 2.3.31(a).
2.4.24.
(a) $m \leq n$ as otherwise $v_1, \ldots, v_m$ would be linearly dependent. If $m = n$ then $v_1, \ldots, v_n$ are linearly independent and hence, by Theorem 2.31, span all of $\mathbb{R}^n$. Since every vector in their span also belongs to $V$, we must have $V = \mathbb{R}^n$.
(b) Starting with the basis $v_1, \ldots, v_m$ of $V$ with $m < n$, we choose any $v_{m+1} \in \mathbb{R}^n \setminus V$. Since $v_{m+1}$ does not lie in the span of $v_1, \ldots, v_m$, the vectors $v_1, \ldots, v_{m+1}$ are linearly independent and span an $m + 1$ dimensional subspace of $\mathbb{R}^n$. Unless $m + 1 = n$ we can then choose another vector $v_{m+2}$ not in the span of $v_1, \ldots, v_{m+1}$, and so $v_1, \ldots, v_{m+2}$ are also linearly independent. We continue on in this fashion until we arrive at $n$ linearly independent vectors $v_1, \ldots, v_n$ which necessarily form a basis of $\mathbb{R}^n$.
(c) (i)

1, 1,

1
2

, ( 1, 0, 0 )T , ( 0, 1, 0 )T ;

(ii) ( 1, 0, 1 )T , ( 0, 1, 2 )T , ( 1, 0, 0 )T .

2.4.25.
(a) If $\dim V = \infty$, then the inequality is trivial. Also, if $\dim W = \infty$, then one can find infinitely many linearly independent elements in $W$, but these are also linearly independent as elements of $V$ and so $\dim V = \infty$ also. Otherwise, let $w_1, \ldots, w_n$ form a basis for $W$. Since they are linearly independent, Theorem 2.31 implies $n \leq \dim V$.
(b) Since $w_1, \ldots, w_n$ are linearly independent, if $n = \dim V$, then by Theorem 2.31, they form a basis for $V$. Thus every $v \in V$ can be written as a linear combination of $w_1, \ldots, w_n$, and hence, since $W$ is a subspace, $v \in W$ too. Therefore, $W = V$.
(c) Example: $V = C^0[a, b]$ and $W = P^{(\infty)}$.
2.4.26. (a) Every $v \in V$ can be uniquely decomposed as $v = w + z$ where $w \in W$, $z \in Z$. Write $w = c_1 w_1 + \cdots + c_j w_j$ and $z = d_1 z_1 + \cdots + d_k z_k$. Then $v = c_1 w_1 + \cdots + c_j w_j + d_1 z_1 + \cdots + d_k z_k$, proving that $w_1, \ldots, w_j, z_1, \ldots, z_k$ span $V$. Moreover, by uniqueness, $v = 0$ if and only if $w = 0$ and $z = 0$, and so the only linear combination that sums up to $0 \in V$ is the trivial one $c_1 = \cdots = c_j = d_1 = \cdots = d_k = 0$, which proves linear independence of the full collection. (b) This follows immediately from part (a): $\dim V = j + k = \dim W + \dim Z$.
2.4.27. Suppose the functions are linearly independent. This means that for every $0 \neq c = (c_1, c_2, \ldots, c_n)^T \in \mathbb{R}^n$, there is a point $x_c \in \mathbb{R}$ such that $\sum_{i=1}^{n} c_i f_i(x_c) \neq 0$. The assumption says that $\{0\} \neq V_{x_1, \ldots, x_m}$ for all choices of sample points. Recursively define the following sample points. Choose $x_1$ so that $f_1(x_1) \neq 0$. (This is possible since if $f_1(x) \equiv 0$, then the functions are linearly dependent.) Thus $V_{x_1} \subsetneq \mathbb{R}^n$ since $e_1 \notin V_{x_1}$. Then, for each $m = 1, 2, \ldots$, given $x_1, \ldots, x_m$, choose $0 \neq c_0 \in V_{x_1, \ldots, x_m}$, and set $x_{m+1} = x_{c_0}$. Then $c_0 \notin V_{x_1, \ldots, x_{m+1}} \subsetneq V_{x_1, \ldots, x_m}$ and hence, by induction, $\dim V_{x_1, \ldots, x_m} \leq n - m$. In particular, $\dim V_{x_1, \ldots, x_n} = 0$, so $V_{x_1, \ldots, x_n} = \{0\}$, which contradicts our assumption and proves the result. Note that the proof implies we only need check linear dependence at all possible collections of $n$ sample points to conclude that the functions are linearly dependent.

2.5.1.
(a) Range: all b =

b1
b2

such that

3
4 b1

+ b2 = 0; kernel spanned by

1!
2 .

11 0
1
1
2
b1
C B
C
(b) Range: all b =
such that 2 b1 + b2 = 0; kernel spanned by B
@ 1 A , @ 0 A.
b2
0
1 1
0
0
1
5
b
B4 C
B 1C
B 7C
C.
(c) Range: all b = @ b2 A such that 2 b1 + b2 + b3 = 0; kernel spanned by B
@8 A
b3
1
T
(d) Range: all b = ( b1 , b2 , b3 , b4 ) such that 2 b1 b2 + b3 = 2 b1 + 3 b2 + b4 = 0;
0 1 0
1
1
1
B C B
1C B 0C
C
B C, B
C.
kernel spanned by B
@1A @ 0A
0
1
0


1 0

1
5
B 2C B2C
C
B
B
2.5.2. (a) @ 0 A, @ 1 C
A: plane;
1
0
0

1
1
B4C
B3C
B C:
@8A

0
C
(e) B
@ 0 A: point;
0

1
C
(d) B
@ 2 A: line;
1

2.5.3.

(a) Kernel spanned by

B C
B1C
B C;
@0A

1
1
B3C
B 5 C:
@3A

(f )

1
C
2.5.4. (a) b = B
@ 2 A;
1

line.

range spanned by

1 0

2
3
C B
C
(c) B
@ 0 A, @ 1 A: plane;
1
0

0
(b) compatibility: a + 41 b + c = 0.
1
2

line;

(b)

1 0

1 0

B C B C B
@ 2 A, @ 0 A, @

0
2C
A;
3

1+t
C
(b) x = B
@ 2 + t A where t is arbitrary.
3+t

2.5.5. In each case, the solution is x = x? + z, where x? is the particular solution and z belongs
to the kernel:
1
0
0 1
0 1
0
0
1
1
2

1
1
3
1
B 7C
B C
B
C
C
C
1C
(a) x? = B
(b) x? = B
@ 0 A, z = y @ 1 A + z @ 0 A;
@ 1 A, z = z B
@ 7 A;
0
0
1
0
1
(c) x? =

(f ) x? =

7
B9

C
B 2 C
B
C,
@ 9 A
10
9
0 11 1
2
B 1 C
B
C
B 2 C,
B
C
@ 0 A

2
C
z = zB
@ 2 A;
1
0

13
2
B
B 3
2
z = rB
B
@ 1
0

(d) x? =

23
C
B 1
C
B
C + sB 2
C
B
A
@ 0
1

0
B
B
@

51
6C
1C
A,
2
3

C
C
C;
C
A

z = 0; (e) x? =
0

2
1
;
, z=v
1
0
0

3
6
4
B C
B C
B
C
B2C
B2C
B 1 C
?
C, z = z B C + w B
C.
(g) x = B
@0A
@1A
@ 0A
0
0
1

2.5.6. The $i$th entry of $A\,( 1, 1, \ldots, 1 )^T$ is $a_{i1} + \cdots + a_{in}$, which is $n$ times the average of the entries in the $i$th row. Thus, $A\,( 1, 1, \ldots, 1 )^T = 0$ if and only if each row of $A$ has average 0.
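As a quick numerical illustration of this observation, the following sketch multiplies a matrix by the all-ones vector and compares the result with the row averages. The matrix `A` is a made-up example, not one taken from the exercise.

```python
def times_ones(A):
    """Return A (1, ..., 1)^T, which is just the vector of row sums of A."""
    return [sum(row) for row in A]

# Hypothetical 3 x 3 matrix whose rows each sum (and hence average) to zero.
A = [[1, -3, 2],
     [4, 0, -4],
     [-2, 5, -3]]

n = len(A[0])
row_sums = times_ones(A)
row_averages = [s / n for s in row_sums]

print(row_sums)       # each entry equals n times the corresponding row average
print(row_averages)
```

Here the all-ones vector lies in the kernel precisely because every row average vanishes.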

2.5.7. The kernel has dimension $n - 1$, with basis $-r^{k-1} e_1 + e_k = ( -r^{k-1}, 0, \ldots, 0, 1, 0, \ldots, 0 )^T$ for $k = 2, \ldots, n$. The range has dimension 1, with basis $( 1, r^n, r^{2n}, \ldots, r^{(n-1)n} )^T$.

2.5.8. (a) If $w = P w$, then $w \in \operatorname{rng} P$. On the other hand, if $w \in \operatorname{rng} P$, then $w = P v$ for some $v$. But then $P w = P^2 v = P v = w$. (b) Given $v$, set $w = P v$. Then $v = w + z$ where $z = v - w \in \ker P$ since $P z = P v - P w = P v - P^2 v = P v - P v = 0$. Moreover, if $w \in \ker P \cap \operatorname{rng} P$, then $0 = P w = w$, and so $\ker P \cap \operatorname{rng} P = \{0\}$, proving complementarity.
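The decomposition in part (b) can be tried out on a small example. This is only a sketch: the matrix `P` and the vector `v` below are hypothetical choices, picked so that $P^2 = P$ holds.

```python
def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def matmul(M, N):
    """Matrix-matrix product for lists of lists."""
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]

P = [[1, 1],
     [0, 0]]                          # hypothetical example satisfying P^2 = P
v = [2, 3]

w = matvec(P, v)                      # component of v in rng P
z = [a - b for a, b in zip(v, w)]     # component of v in ker P

print(matmul(P, P) == P)              # idempotency check
print(w, matvec(P, w))                # P fixes w
print(z, matvec(P, z))                # P annihilates z
```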
2.5.9. False. For example, if $A = \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}$ then $( 1, 1 )^T$ is in both $\ker A$ and $\operatorname{rng} A$.

2.5.10. Let $r_1, \ldots, r_{m+k}$ be the rows of $C$, so $r_1, \ldots, r_m$ are the rows of $A$. For $v \in \ker C$, the $i$th entry of $C v = 0$ is $r_i v = 0$, but then this implies $A v = 0$ and so $v \in \ker A$. As an example, $A = ( 1 \ \ 0 )$ has kernel spanned by $( 0, 1 )^T$, while $C = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ has $\ker C = \{0\}$.


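2.5.11. If $b = A x \in \operatorname{rng} A$, then $b = C z$ where $z = \begin{pmatrix} x \\ 0 \end{pmatrix}$, and so $b \in \operatorname{rng} C$. As an example, $A = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ has $\operatorname{rng} A = \{0\}$, while the range of $C = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ is the $x$ axis.

2.5.12. x?1 =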

2
3
2

, x?2 =
0

1
2
1

; x = x?1 + 4 x?2 =

6
7
2

1
C
2.5.13. x? = 2 x?1 + x?2 = B
@ 3 A.
3
0
1
2.5.14.
1
B
?
?
(a) By direct matrix multiplication: A x1 = A x2 = @ 3 C
A.
5

4
1
C
B
C
(b) The general solution is x = x?1 + t (x?2 x?1 ) = (1 t) x?1 + t x?2 = B
@ 1 A + t@ 2 A.
2
0
2.5.15. 5 meters.
2.5.16. The mass will move 6 units in the horizontal direction and 6 units in the vertical direction.
2.5.17. $x = c_1 x_1^\star + c_2 x_2^\star$ where $c_1 = 1 - c_2$.

2.5.18. False: in general, $(A + B) x^\star = (A + B) x_1^\star + (A + B) x_2^\star = c + d + B x_1^\star + A x_2^\star$, and the third and fourth terms don't necessarily add up to 0.

2.5.19. $\operatorname{rng} A = \mathbb{R}^n$, and so $A$ must be a nonsingular matrix.


2.5.20.
(a) If $A x_i = e_i$, then $x_i = A^{-1} e_i$ which, by (2.13), is the $i$th column of the matrix $A^{-1}$.
0
0
0
1
1
1
1
1
12
2
2
B
B
B
C
C
C
B
B
C
C
(b) The solutions to A xi = ei in this case are x1 = B
2 C
@
A, x 2 = @ 1 A, x 3 = @ 1 A,
1
21
1
2
0
1
1
1
12
2
2C
B
1
B
which are the columns of A = @ 2 1 1 C
A.
1
1
1
2
2
2
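The principle behind Exercise 2.5.20, that solving $A x_i = e_i$ for each standard basis vector produces the columns of $A^{-1}$, can be sketched as follows. The matrix `A` is a hypothetical 2 x 2 example (not the one from the exercise), and `solve` is an illustrative Gauss-Jordan helper using exact fractions.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b for a square nonsingular A by Gauss-Jordan elimination."""
    n = len(A)
    # Augmented matrix ( A | b ) with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if M[i][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [x / M[c][c] for x in M[c]]          # scale pivot row to 1
        for i in range(n):
            if i != c:                               # clear the rest of column c
                M[i] = [a - M[i][c] * p for a, p in zip(M[i], M[c])]
    return [row[n] for row in M]

A = [[2, 1],
     [1, 1]]                                         # hypothetical example
n = len(A)
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Column i of the inverse is the solution of A x = e_i.
columns = [solve(A, e) for e in identity]
inverse = [list(row) for row in zip(*columns)]
print(inverse)
```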

2.5.21.

2
3
1
1
.
; cokernel:
; kernel:
; corange:
1
1
3
2
0 1 0
1
0
1 0
1
0
1
0
1
0
8
1
0
2
1
C B
C
B
C B
C
B
C
B
C
(b) range: B
@ 1 A, @ 1 A; corange: @ 2 A, @ 0 A; kernel: @ 1 A; cokernel: @ 2 A.
2
6
1
8
0
1
1
0
1 0
1
0 1 0
0
1
0 1 0 1
3
1
0
1
3
1
1
C
B
C B
C
B C B
B
C
B 3 C B 2 C
B 1 C B 1 C
B C B C
C; cokernel: @ 1 A.
C, B
C; kernel: B
C, B
(c) range: @ 1 A, @ 0 A; corange: B
@ 1A @ 0A
@ 2 A @ 3 A
1
3
2
1
0
2
1
(a) range:


1 0

1 0

1 0

1 0

1
3
1
1
0
0
B 0 C B 3 C B 2 C
B 3 C B 3 C B 0 C
B C B
C B
C
B
C B
C B C
C B
C B
C
B
C B
C B C
B 2 C, B 3 C, B 0 C; corange: B 2 C, B 6 C, B 0 C;
(d) range: B
B C B
C B
C
B
C B
C B C
@ 3 A @ 3 A @ 3 A
@ 2A @ 0A @0A
1
0
3
1
2
4
1
1 0
0
1
0 1 0
2
2
2
4
B 1 C B 1 C
B2C B 0C
C
C B
B
C
B C B
C
C B
B
C
C B
B 1 C, B 0 C; cokernel: B 1 C, B 0 C.
kernel: B
C
C B
B
C
B C B
@ 0 A @ 1 A
@0A @ 1A
1
0
0
0
0
1 0 1 0
1
1
0
3
C B C B
C
2.5.22. B
@ 2 A, @ 1 A, @ 1 A, which are its first, third and fourth columns;
3
2 0
01
0
1
0
1
0
1
0 1
0
1
2
1
5
1
0
3
C
B
C
B
C
B
C
B C
B
C
Second column: B
@ 4 A = 2@ 2 A; fifth column: @ 4 A = 2@ 2 A + @ 1 A @ 1 A.
6
3
8
3
2
0
0
1 0 1
0
1 0 1
0
1
0
1
1
0
1
0
3
1
C B C
B
C B C
B
C
B
C
2.5.23. range: B
@ 2 A, @ 4 A; corange: @ 3 A, @ 0 A; second column: @ 6 A = 3@ 2 A;
3
1 0
01 04 1 0
9
3
0 1
0
1
0
1
1
0
2
1
0
3
1
B
B C B
C
B
C
C
C
1B C
second and third rows: B
@ 6 A = 2@ 3 A + @ 0 A, @ 9 A = 3@ 3 A + 4 @ 0 A.
4
4
0
4
1
0

2.5.24.
(i) rank = 1; dim rng!
A = dim corng A = 1, !dim ker A = dim coker A = 1;
2
2
; compatibility conditions: 2 b1 + b2 = 0;
; cokernel basis:
kernel basis:
1
1
!
!
!
1
1
2
example: b =
, with solution x =
+z
.
2
0
1
(ii) rank = 1; dim rng A = dim corng A = 1, dim ker A = 2, dim coker A = 1; kernel basis:
011 021
B3C B3C
@ 1 A, @ 0 A;

2
; compatibility conditions: 2b1 + b2 = 0;
1
011
0 1
021
0
1
!
1
3C
3
B C
B3C
, with solution x = @ 0 A + y @ 1 A + z B
example: b =
@ 0 A.
6
0
0
1
(iii) rank = 2; dim rng A = dim corng
A
=
2,
dim
ker
A
=
0,
dim
coker
A = 1;
1
0
20
B 13 C
20
3
3 C
B
C; compatibility conditions: 13 b1 + 13 b2 + b3 = 0;
kernel: {0}; cokernel basis: B
@ 13 A
cokernel basis:

1
0 1
1
1
1
C
B C
example: b = B
@ 2 A, with solution x = @ 0 A.
2
0
(iv ) rank = 2; dim
rng
A
=
dim
corng
A
=
2,
dim
ker A = dim coker A = 1;
0
1
0
1
2
2
C
B
C
kernel basis: B
@ 1 A; cokernel basis: @ 1 A; compatibility conditions:
1
1
0 1
0 1
0
1
2
1
2
C
B C
B
C
2 b1 + b2 + b3 = 0; example: b = B
@ 1 A, with solution x = @ 0 A + z @ 1 A.
3
0
1
(v ) rank = 2; dim rng A = dim corng A = 2, dim ker A = 1, dim coker A = 2; kernel


B
C
@ 1 A;

basis:

1 0 1
C B 4
B
B 1 C B 1
B 4 C B4
C, B
B
C B
B
@ 1 A @ 0
0

cokernel basis:

94

C
C
C
C;
C
A

compatibility: 94 b1 +

1
4 b2

+ b3 = 0,

1
1
0
1
0 1
2
1
1
B C
B
C
B C
B6C
1
1
C, with solution x = @ 0 A + z @ 1 A.
4 b1 4 b2 + b4 = 0; example: b = B
@3A
1
0
1
(vi) rank = 3; dim rng A = dim corng A = 3, dim ker A = dim coker A = 1; kernel basis:
0 13
B 4
B 13
B 8
B
B 7
@2

1
B
C
B 1 C
C; compatibility conditions: b1 b2 + b3 + b4 = 0;
cokernel basis: B
@ 1A
0 13 1
0 1
1
0 1
1
1
1
B 4 C
B C
B 13 C
B C
B C
B
B0C
C
B3C
C, with solution x = B C + w B 8 C.
example: b = B
B 7C
B0C
@1A

@ 2A
@ A
3
0
1
(vii) rank = 4; dim rng A = dim corng A = 4, dim ker A = 1, dim coker A = 0; kernel basis:
0
1
2
B 1C
B
C
B
C
B 0 C; cokernel is {0}; no conditions;
B
C
@ 0A
0 1
0
1
0
1
1
2
0
2
B0C
B 1C
B
C
B C
B
C
B 1C
B C
B
C
C, with x = B 0 C + y B 0 C.
example: b = B
B C
B
C
@ 3A
@0A
@ 0A
3
0
0
C
C
C
C;
C
A

1 0

1
2
1
C B C
C
2.5.25. (a) dim = 2; basis: B
(b) dim = 1; basis: B
@ 2 A, @ 2 A;
@ 1 A;
1
0
1 1 0
0 1 0 1 0 1
0
1 0
1
1
1
2
1
0
1
B C B C B C
B
C B
C B
C
0C B0C B2C
B 0 C B 1 C B 3 C
B C, B C , B C;
B
C, B
C, B
C;
(c) dim = 3; basis: B
(d)
dim
=
3;
basis:
@1A @0A @1A
@ 3 A @ 2 A @ 8 A
0
1
0
2
3
7
1
1 0
1 0
0
1
2
1
B 1 C B 1 C B 3 C
C
C B
C B
B
C
C B
C B
B 1 C, B 2 C, B 1 C.
(e) dim = 3; basis: B
C
C B
C B
B
@ 1A @ 2A @ 2A
1
1
1
0

1 0

1 0

1 0

1
3
0
0
B C B
C B C B
C
B1C B 0C B2C B 4C
C, B
C, B C , B
C; the dimension is 3.
2.5.26. Its the span of B
@ 0 A @ 1 A @ 3 A @ 1 A
0
0
1
1
0

1 0

0
2
C
B C B
B 0 C B 1 C
C;
C, B
2.5.27. (a) B
@1A @ 0A
1
0

1 0

0
1
C
B C B
B 1 C B 1 C
C;
C, B
(b) B
@1A @ 0A
1
0


1
B
C
B 3C
C.
(c) B
@ 0A
1

2.5.28. First method:

1 0

B C B
C
B0C B 3C
B C, B
C;
@ 2 A @ 4 A

second method:

1 0 1
5 0
1
1
2
1
0
B
C
B C
B
C
3C
B0C
B 3C
B
C = 2B C + B
C;
same, while B
@ 4 A
@2A
@ 8 A
5
1
3
0

1 0

B C B
C
B0C B 3C
B C, B
C.
@ 2 A @ 8 A

The first vectors are the

1 0 1
3 0
1
1
0
1
2
B
C
B C
B
C
B 3C
B0C
B 3C
B
C = 2B C + B
C.
@ 8 A
@2A
@ 4 A
3
1
5
0

2.5.29. Both sets are linearly independent and hence span a three-dimensional subspace of R 4 .
Moreover, w1 = v1 + v3 , w2 = v1 + v2 + 2 v3 , w3 = v1 + v2 + v3 all lie in the span of
v1 , v2 , v3 and hence, by Theorem 2.31(d) also form a basis for the subspace.
2.5.30.
(a) If $A = A^T$, then $\ker A = \{ A x = 0 \} = \{ A^T x = 0 \} = \operatorname{coker} A$, and $\operatorname{rng} A = \{ A x \} = \{ A^T x \} = \operatorname{corng} A$.
(b) ker A = coker A has basis ( 2, 1, 1 )^T ; rng A = corng A has basis ( 1, 2, 0 )^T , ( 2, 6, 2 )^T .
(c) No. For instance, if $A$ is any nonsingular matrix, then $\ker A = \operatorname{coker} A = \{0\}$ and $\operatorname{rng} A = \operatorname{corng} A = \mathbb{R}^3$.
2.5.31.
(a) Yes. This is our method of constructing the basis for the range, and the proof is outlined in the text.
(b) No. For example, if $A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$, then $U = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$ and the first three rows of $U$ form a basis for the three-dimensional $\operatorname{corng} U = \operatorname{corng} A$, but the first three rows of $A$ only span a two-dimensional subspace.
(c) Yes, since $\ker U = \ker A$.
(d) No, since $\operatorname{coker} U \neq \operatorname{coker} A$ in general. For the example in part (b), $\operatorname{coker} A$ has basis $( -1, 1, 0, 0 )^T$ while $\operatorname{coker} U$ has basis $( 0, 0, 0, 1 )^T$.

2.5.32. (a) Example: $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$. (b) No, since then the first $r$ rows of $U$ are linear combinations of the first $r$ rows of $A$. Hence these rows span $\operatorname{corng} A$, which, by Theorem 2.31c, implies that they form a basis for the corange.

2.5.33. Examples: any symmetric matrix; any permutation matrix


0
0
the identity. Yet another example is the complex matrix B
@1
0

since 1
the row echelon form is
0 1
i iC
A.
i i

2.5.34. The rows $r_1, \ldots, r_m$ of $A$ span the corange. Reordering the rows, in particular interchanging two, will not change the span. Also, multiplying any of the rows by nonzero scalars, $\tilde r_i = a_i r_i$ for $a_i \neq 0$, will also span the same space, since
$$v = \sum_{i=1}^{m} c_i r_i = \sum_{i=1}^{m} \frac{c_i}{a_i}\, \tilde r_i .$$

2.5.35. We know $\operatorname{rng} A \subset \mathbb{R}^m$ is a subspace of dimension $r = \operatorname{rank} A$. In particular, $\operatorname{rng} A = \mathbb{R}^m$ if and only if it has dimension $m = \operatorname{rank} A$.
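2.5.36. This is false. If $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ then $\operatorname{rng} A$ is spanned by $( 1, 1 )^T$ whereas the range of its row echelon form $U = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ is spanned by $( 1, 0 )^T$.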

2.5.37.
(a) Method 1: choose the nonzero rows in the row echelon form of $A$. Method 2: choose the columns of $A^T$ that correspond to pivot columns of its row echelon form.
1
3
2
1
0
0
C B
C B
C
B C B
C B C
(b) Method 1: B
@ 2 A, @ 1 A, @ 4 A. Method 2: @ 2 A, @ 7 A, @ 0 A. Not the same.
4
5
2
4
7
2
2.5.38. If $v \in \ker A$ then $A v = 0$ and so $B A v = B\, 0 = 0$, so $v \in \ker(B A)$. The first statement follows from setting $B = A$.

2.5.39. If $v \in \operatorname{rng} A B$ then $v = A B x$ for some vector $x$. But then $v = A y$ where $y = B x$, and so $v \in \operatorname{rng} A$. The first statement follows from setting $B = A$.
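2.5.40. First note that $B A$ and $A C$ also have size $m \times n$. To show $\operatorname{rank} A = \operatorname{rank} B A$, we prove that $\ker A = \ker B A$, and so $\operatorname{rank} A = n - \dim \ker A = n - \dim \ker B A = \operatorname{rank} B A$. Indeed, if $v \in \ker A$, then $A v = 0$ and hence $B A v = 0$ so $v \in \ker B A$. Conversely, if $v \in \ker B A$ then $B A v = 0$. Since $B$ is nonsingular, this implies $A v = 0$ and hence $v \in \ker A$, proving the first result. To show $\operatorname{rank} A = \operatorname{rank} A C$, we prove that $\operatorname{rng} A = \operatorname{rng} A C$, and so $\operatorname{rank} A = \dim \operatorname{rng} A = \dim \operatorname{rng} A C = \operatorname{rank} A C$. Indeed, if $b \in \operatorname{rng} A C$, then $b = A C x$ for some $x$ and so $b = A y$ where $y = C x$, and so $b \in \operatorname{rng} A$. Conversely, if $b \in \operatorname{rng} A$ then $b = A y$ for some $y$ and so $b = A C x$ where $x = C^{-1} y$, so $b \in \operatorname{rng} A C$, proving the second result. The final equality is a consequence of the first two: $\operatorname{rank} A = \operatorname{rank} B A = \operatorname{rank} (B A) C$.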
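The rank identities of Exercise 2.5.40 can be checked on a small case. The sketch below assumes hypothetical matrices `A`, `B`, `C` (with `B`, `C` nonsingular) and an illustrative `rank` helper based on exact Gaussian elimination; a singular multiplier is included to show that nonsingularity is really needed.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def matmul(M, N):
    cols = list(zip(*N))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in M]

A = [[1, 2, 3], [2, 4, 6]]                  # hypothetical 2 x 3 matrix of rank 1
B = [[1, 1], [0, 1]]                        # nonsingular 2 x 2
C = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]       # nonsingular 3 x 3
B_singular = [[2, -1], [0, 0]]              # singular: the rank may drop

ranks = (rank(A), rank(matmul(B, A)), rank(matmul(A, C)),
         rank(matmul(matmul(B, A), C)))
print(ranks)
print(rank(matmul(B_singular, A)))
```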
2.5.41. (a) Since they are spanned by the columns, the range of $( A \ B )$ contains the range of $A$. But since $A$ is nonsingular, $\operatorname{rng} A = \mathbb{R}^n$, and so $\operatorname{rng} ( A \ B ) = \mathbb{R}^n$ also, which proves $\operatorname{rank} ( A \ B ) = n$. (b) Same argument, using the fact that the corange is spanned by the rows.
2.5.42. True if the matrices have the same size, but false in general.
2.5.43. Since we know $\dim \operatorname{rng} A = r$, it suffices to prove that $w_1, \ldots, w_r$ are linearly independent. Given
$$0 = c_1 w_1 + \cdots + c_r w_r = c_1 A v_1 + \cdots + c_r A v_r = A(c_1 v_1 + \cdots + c_r v_r),$$
we deduce that $c_1 v_1 + \cdots + c_r v_r \in \ker A$, and hence can be written as a linear combination of the kernel basis vectors:
$$c_1 v_1 + \cdots + c_r v_r = c_{r+1} v_{r+1} + \cdots + c_n v_n .$$
But $v_1, \ldots, v_n$ are linearly independent, and so $c_1 = \cdots = c_r = c_{r+1} = \cdots = c_n = 0$, which proves linear independence of $w_1, \ldots, w_r$.

2.5.44.
(a) Since they have the same kernel, their ranks are the same. Choose a basis $v_1, \ldots, v_n$ of $\mathbb{R}^n$ such that $v_{r+1}, \ldots, v_n$ form a basis for $\ker A = \ker B$. Then $w_1 = A v_1, \ldots, w_r = A v_r$ form a basis for $\operatorname{rng} A$, while $y_1 = B v_1, \ldots, y_r = B v_r$ form a basis for $\operatorname{rng} B$. Let $M$ be any nonsingular $m \times m$ matrix such that $M w_j = y_j$, $j = 1, \ldots, r$, which exists since both sets of vectors are linearly independent. We claim $M A = B$. Indeed, $M A v_j = B v_j$, $j = 1, \ldots, r$, by design, while $M A v_j = 0 = B v_j$, $j = r + 1, \ldots, n$, since these vectors lie in the kernel. Thus, the matrices agree on a basis of $\mathbb{R}^n$, which is enough to conclude that $M A = B$.
(b) If the systems have the same solutions $x^\star + z$ where $z \in \ker A = \ker B$, then $B x = M A x = M b = c$. Since $M$ can be written as a product of elementary matrices, we conclude that one can get from the augmented matrix $( A \mid b )$ to the augmented matrix $( B \mid c )$ by applying the elementary row operations that make up $M$.

2.5.45. (a) First, $W \subset \operatorname{rng} A$ since every $w \in W$ can be written as $w = A v$ for some $v \in V \subset \mathbb{R}^n$, and so $w \in \operatorname{rng} A$. Second, if $w_1 = A v_1$ and $w_2 = A v_2$ are elements of $W$, then so is $c\, w_1 + d\, w_2 = A (c\, v_1 + d\, v_2)$ for any scalars $c, d$ because $c\, v_1 + d\, v_2 \in V$, proving that $W$ is a subspace. (b) First, using Exercise 2.4.25, $\dim W \leq r = \dim \operatorname{rng} A$ since it is a subspace of the range. Suppose $v_1, \ldots, v_k$ form a basis for $V$, so $\dim V = k$. Let $w = A v \in W$. We can write $v = c_1 v_1 + \cdots + c_k v_k$, and so, by linearity, $w = c_1 A v_1 + \cdots + c_k A v_k$. Therefore, the $k$ vectors $w_1 = A v_1, \ldots, w_k = A v_k$ span $W$, and therefore, by Proposition 2.33, $\dim W \leq k$.
2.5.46.
(a) To have a left inverse requires an $n \times m$ matrix $B$ such that $B A = I$. Suppose $\dim \operatorname{rng} A = \operatorname{rank} A < n$. Then, according to Exercise 2.5.45, the subspace $W = \{ B v \mid v \in \operatorname{rng} A \}$ has $\dim W \leq \dim \operatorname{rng} A < n$. On the other hand, $w \in W$ if and only if $w = B v$ where $v \in \operatorname{rng} A$, and so $v = A x$ for some $x \in \mathbb{R}^n$. But then $w = B v = B A x = x$, and therefore $W = \mathbb{R}^n$ since every vector $x \in \mathbb{R}^n$ lies in it; thus, $\dim W = n$, contradicting the preceding result. We conclude that having a left inverse implies $\operatorname{rank} A = n$. (The rank can't be larger than $n$.)
(b) To have a right inverse requires an $m \times n$ matrix $C$ such that $A C = I$. Suppose $\dim \operatorname{rng} A = \operatorname{rank} A < m$ and hence $\operatorname{rng} A \subsetneq \mathbb{R}^m$. Choose $y \in \mathbb{R}^m \setminus \operatorname{rng} A$. Then $y = A C y = A x$, where $x = C y$. Therefore, $y \in \operatorname{rng} A$, which is a contradiction. We conclude that having a right inverse implies $\operatorname{rank} A = m$.
(c) By parts (a) and (b), having both inverses requires $m = \operatorname{rank} A = n$ and $A$ must be square and nonsingular.

2.6.1. (a)-(e) [Digraph drawings not reproduced; part (e) is drawn in two equivalent ways.]

2.6.2. (a)
(b) ( 1, 1, 1, 1, 1, 1, 1 )T is a basis for the kernel. The cokernel is trivial, containing only the
zero vector, and so has no basis. (c) Zero.


2.6.3. (a)

0
B
B
B
@

1
0
0
0

0
1
1
0
0

(d)

1
1
0
1

B1
B
B
B0
B
B0
B
B
B0
B
@0

0
0C
C
C;
1 A
1

1
0
1
1
0
0
0

0
1
0
0
1
1
0

1
B 1
B
B 1
(b) B
B
@ 0
0

1
0
0
1
0

0
0
1
0
1

0
1C
C
C;
0C
C
1 A
1

1
B 1
B
B
0
B
(c) B
B 0
B
@ 0
0

0
0
0
1
0
0
0
0C
C
B 1
0
0
C
B
1
0C
B
C
0
1
1
B
(e) B
0
1C
C;
B 0 1
0
B
1
0C
C
@ 0
C
0
1
0
1A
0
0
0
1 1
0
1
1 1
0
0
0
0
B 1
0
1
0
0
0C
B
C
B
C
1
0
1
0
0
B 0
C
B
C
B 1
C
0
0
1
0
0
C.
(f ) B
B 0
C
0
1
0
0
1
B
C
B
C
B 0
0 1
0
1
0C
B
C
@ 0
0
0 1
0
1A
0
0
0
0 1
1
0

1 0

1
0
0
0
0
1

0
1
1
1
0
0

1
0
0
0
1
0

0
1
0
0
0
1

01
0C
C
0C
C
C;
1C
C
1 A
0

1
0
0
B 1 C B 1 C
B
C
B
C B
C
1 C
B
C B
C
B
C; (b) 2 circuits: B 0 C, B 1 C; (c) 2 circuits:
2.6.4. (a) 1 circuit: B
B
C B
C
@ 1 A
@ 1A @ 0A
1
0
1
0

1 0

1 0

1
1
0
B 1 C B 1 C B 0 C
B
C B
C B
C
B
C B
C B
C
B 1 C B 0 C B 1 C
B
C B
C B
C
B
C
B
C
B
(d) 3 circuits: B 0 C, B 1 C, B 1 C
C; (e) 2 circuits:
B
C B
C B
C
B 1C B 0C B 0C
B
C B
C B
C
@ 0A @ 1A @ 0A
0
0
1
1 0 1
0 1 0
0
1
1
B0C B 1C B0C
C B C
B C B
C B C
B C B
B 1 C B 1 C B 0 C
C B C
B C B
B1C B 0C B0C
C B C
C, B
(f ) 3 circuits: B
B 0 C B 1 C, B 1 C.
C B C
B C B
C B C
B C B
B0C B 0C B1C
C B C
B C B
@0A @ 1A @0A
1
0
0
0

1
B1
B
B1
2.6.5. (a) B
B
@0
0

1
0
0
1
1

0
1
0
1
0

01 011
B0C B1C
B C B C
B C B C
B1C B0C
B C, B C;
B1C B0C
B C B C
@1A @0A
0
1

0
B
B
B
B
B
B
B
@

0
0
1
0
1
1

1 1 0 0 1
C
B
1C
C B 0C
C B
1 C B 1 C
C
C;
C, B
B 1C
0C
C
C B
@
A
0A
1
1
0

0
0C
C
C; (b) rank = 3; (c) dim rng A = dim corng A = 3,
1 C
C
0A
1
0
1 0
1
0 1
1
1
1
B 1 C B 0 C
B C
B
C B
C
B1C
B
C B
C
C;
B 0 C, B 1 C;
dim ker A = 1, dim coker A = 2; (d) kernel: B
cokernel:
B
C B
C
@1A
@ 1A @ 0A
1
0
1

01
0C
C
0C
C
C;
1C
C
0A
1

1
B1C
B C
C
B 1 C;
(e) b1 b2 + b4 = 0, b1 b3 + b5 = 0; (f ) example: b = B
B C
@0A
0

2.6.6.
(a)

B1
B
B
B1
B
B0
B
B
B0
B
B0
B
B0
B
B
B0
B
B0
B
B
B0
B
@0

1
0
0
1
1
0
0
0
0
0
0
0
1

0
1
0
0
0
1
1
0
0
0
0
0

1
B 1C
C
B
C
B
B 0C
C
B
B 1 C
C
B
C
B
B 0C
C
B
B 1C
C
Cokernel basis: v1 = B
B 0 C, v2 =
C
B
C
B
B 0C
C
B
B 0C
C
B
C
B
B 0C
C
B
@ 0A
0

0
0
1
0
0
0
0
1
1
0
0
0

0
0
0
1
0
1
0
0
0
1
0
0

1
B 0C
C
B
C
B
B 1C
C
B
B 0C
C
B
C
B
B 1 C
C
B
B 0C
C
B
B 0 C, v 3 =
C
B
C
B
B 1C
C
B
B 0C
C
B
C
B
B 0C
C
B
@ 0A
0

0
0
0
0
1
0
0
1
0
0
1
0

0
0
0
0
0
0
1
0
1
0
0
1

1+t
C
B
B t C
C.
x=B
@ t A
t
1

0
0C
C
C
0C
C
0C
C
0C
C
C
0C
C
0C
C
C
0C
C
0C
C
1 C
C
C
1 A
1

0
B 1 C
C
B
C
B
B 1C
C
B
B 0C
C
B
C
B
B 0C
C
B
B 0C
C
B
B 1 C, v4 =
C
B
C
B
B 0C
C
B
B 1C
C
B
C
B
B 0C
C
B
@ 0A
0

0
B 0C
C
B
C
B
B 0C
C
B
B 1 C
C
B
C
B
B 1C
C
B
B 0C
C
B
B 0 C, v5 =
C
B
C
B
B 0C
C
B
B 0C
C
B
C
B
B 1 C
C
B
@ 1A
0

0
B 0C
C
B
C
B
B 0C
C
B
B 0C
C
B
C
B
B 0C
C
B
B 1 C
C
B
B 1 C.
C
B
C
B
B 0C
C
B
B 0C
C
B
C
B
B 1 C
C
B
@ 0A
1

These vectors represent the circuits around 5 of the cube's faces.


0

0
B 0C
B
C
C
B
B 0C
C
B
B 0C
C
B
C
B
B 0C
B
C
B 0C
C=v v +v v +v ,
B
(b) Examples: B
C
1
2
3
4
5
B 0C
C
B
B 1 C
C
B
B 1C
C
B
C
B
B 0C
C
B
@ 1 A
1

2.6.7.

(a) Tetrahedron:

1
B1
B
B
B1
B
B0
B
@0
0

1
0
0
1
1
0

0
1
0
1
0
1

01
0C
C
1 C
C
C
0C
C
1 A
1


0
B 1C
B
C
C
B
B 1 C
C
B
B 1 C
C
B
C
B
B 1C
B
C
B 1C
C
B
B 0 C = v1 v2 ,
B
C
C
B
B 1 C
C
B
B 0C
C
B
C
B
B 0C
C
B
@ 0A
0

0
B 1 C
B
C
C
B
B 1C
C
B
B 1C
C
B
C
B
B 1 C
B
C
B 0C
C
B
B 1 C = v3 v4 .
B
C
C
B
B 0C
C
B
B 1C
C
B
C
B
B 1C
C
B
@ 1 A
0

number of circuits = dim coker A = 3, number of faces = 4;


(b) Octahedron:

1
B1
B
B
B1
B
B1
B
B
B0
B
B0
B
B0
B
B
B0
B
B0
B
B
B0
B
@0
0

1
0
0
0
1
1
1
0
0
0
0
0

0
1
0
0
1
0
0
1
1
0
0
0

0
0
1
0
0
0
0
1
0
1
1
0

0
0
0
1
0
1
0
0
0
1
0
1

0
0C
C
C
0C
C
0C
C
0C
C
C
0C
C
1 C
C
C
0C
C
1 C
C
0C
C
C
1 A
1

number of circuits = dim coker A = 7, number of faces = 8.


(c) Dodecahedron:
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
B 1
0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 C
B
C
B 0
1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 C
B
C
B
C
B 0
C
1
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0
0
B
C
B 0
C
0
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
B
C
B
0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 C
B 0
C
B
C
B 0
C
0
0
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
B
C
B 0
C
0
0
1
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
B
C
B
B 1
C
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0C
B
C
B 0
C
0
0
0
1
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
B
C
B 0
0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 C
B
C
B
C
B 0
C
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
0
0
0
0
B
C
B 0
C
0
0
0
0
0
1
0
0
0
0
1
0
0
0
0
0
0
0
0
B
C
B
0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 C
B 0
C
B
C
B 0
C
0
0
0
0
0
0
1
0
0
0
0
1
0
0
0
0
0
0
0
B
C
B 0
0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 C
B
C
B
C
B 0
C
0
0
0
0
0
0
0
1
0
0
0
0
1
0
0
0
0
0
0
B
C
B 0
C
0
0
0
0
0
0
0
1
0
0
0
0
0
1
0
0
0
0
0
B
C
B
0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 C
B 0
C
B
C
B 0
C
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
B
C
B 0
0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 C
B
C
B
C
B 0
C
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
1
0
0
0
B
C
B 0
C
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
1
0
0
B
C
B 0
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 C
B
C
B
C
B 0
C
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
1
B
C
B 0
C
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
B
C
B
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 C
B 0
C
B
C
B 0
C
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
0
B
C
@ 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 A
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
0

number of circuits = dim coker A = 11, number of faces = 12.


(d) Icosahedron:
0

1
B1
B
B1
B
B
B1
B
B1
B
B
B0
B
B0
B
B0
B
B
B0
B
B0
B
B0
B
B
B0
B
B0
B
B
B0
B
B0
B
B0
B
B
B0
B
B0
B
B
B0
B
B0
B
B0
B
B
B0
B
B0
B
B0
B
B
B0
B
B0
B
B
B0
B
B0
B
@0
0

1
0
0
0
0
1
1
1
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
0

0
1
0
0
0
1
0
0
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

0
0
1
0
0
0
0
0
1
0
0
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

0
0
0
1
0
0
0
0
0
0
0
1
0
0
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0

0
0
0
0
1
0
0
0
0
0
0
0
0
0
1
0
0
1
1
1
0
0
0
0
0
0
0
0
0
0

0
0
0
0
0
0
1
0
0
1
0
0
0
0
0
0
0
0
0
0
1
1
0
0
0
0
0
0
1
0

0
0
0
0
0
0
0
0
0
0
1
0
1
0
0
0
0
0
0
0
1
0
1
1
0
0
0
0
0
0

0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
1
0
0
0
0
0
0
1
0
1
1
0
0
0
0

0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
1
0
0
0
0
0
1
0
1
1
0
0

0
0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
1
0
1
1

01
0C
C
0C
C
C
0C
C
0C
C
0C
C
C
0C
C
0C
C
C
0C
C
0C
C
0C
C
C
0C
C
0C
C
0C
C
C
0C
C
0C
C
C
0C
C
0C
C
0C
C
C
0C
C
0C
C
C
1 C
C
0C
C
1 C
C
C
0C
C
1 C
C
0C
C
C
1 C
C
0A
1

number of circuits = dim coker A = 19, number of faces = 20.


2.6.8.

1
(a) (i) B
@ 0
0
0

1
B 0
B
B 0
(iii) B
B
@ 0
0

1
1
1

0
1
0
1
1
0
1
1

0
1
1
0
0

0
0C
A,
1
0
0
1
0
0

1
B
B 0
(ii) B
@ 0
0
1
0
0
0
0C
C
C,
0
0C
C
1
0A
0 1

1
1
0
1

0 0
0
1 0
0C
C
C,
1 1
0A
0 0 1
0
1
1
0
B 0 1
1
B
B 0
(iv ) B
0
1
B
@ 0
1
0
0
0
1

0
0
1
0
0

0
0
0
1
0

0
0C
C
C.
0C
C
0A
1

1
0
0
0

1
1
1
1

(b)

0
B
B
B
@

1
0
0
0

1
1
0
0

0
1
1
0

0
0
1
1

0
0C
C
C,
0A
1

0
B
B
B
@

1
0
0
0

1
1
0
1

0
1
1
0

0
0
1
0

0
0C
C
C,
0A
1

0
B
B
B
@

0
1
0
0

0
0
1
0

0
0C
C
C.
0A
1

(c) Let $m$ denote the number of edges. Since the graph is connected, its incidence matrix $A$ has rank $n - 1$. There are no circuits if and only if $\operatorname{coker} A = \{0\}$, which implies $0 = \dim \operatorname{coker} A = m - (n - 1)$, and so $m = n - 1$.
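The edge count in part (c) can be illustrated numerically. This is a sketch under stated assumptions: the tree below is a made-up example on 4 vertices, the incidence matrix uses the convention of $+1$ at the starting vertex and $-1$ at the ending vertex of each edge, and `rank` is an illustrative helper.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def incidence_matrix(n, edges):
    """One row per edge: +1 at the start vertex, -1 at the end vertex."""
    rows = []
    for start, end in edges:
        row = [0] * n
        row[start], row[end] = 1, -1
        rows.append(row)
    return rows

def num_circuits(n, edges):
    """dim coker A = (number of edges) - rank A."""
    return len(edges) - rank(incidence_matrix(n, edges))

tree = [(0, 1), (1, 2), (1, 3)]        # hypothetical tree: n = 4, m = n - 1 = 3
with_cycle = tree + [(2, 3)]           # one extra edge closes a circuit

print(num_circuits(4, tree))
print(num_circuits(4, with_cycle))
```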
2.6.9.

(a)

1
B
@1
0

(b)

(c)

1
2

1
0
1

n(n 1);

1
B1
B
B
B1
B
B0
B
@0
0

0
1 C
A,
1

(d)

1
2

1
0
0
1
1
0

0
1
0
1
0
1

01
0C
C
1 C
C
C,
0C
C
1 A
1

1
0
0
0
1
0
0
0

0
1
0
0
0
1
0
0

0
0
1
0
0
0
1
0

1
B1
B
B
B1
B
B1
B
B
B0
B
B0
B
B0
B
B
B0
B
@0
0

(n 1)(n 2).

1
0
0
0
1
1
1
0
0
0

0
1
0
0
1
0
0
1
1
0

0
0
1
0
0
1
0
1
0
1

0
0C
C
C
0C
C
1 C
C
0C
C
C.
0C
C
1 C
C
0C
C
C
1 A
1

0
0
0
0
0
0
1
1
1

1
0
0
1
0
0
1
0
0

0
1
0
0
1
0
0
1
0

01
0C
C
1 C
C
C
0C
C
0C
C.
1 C
C
C
0C
C
0A
1

2.6.10.

(a)

1
B1
B
B
1
B
(b) B
B0
B
@0
0
(c) m n;

0
0
0
1
1
1

1
0
0
1
0
0

0
1
0
0
1
0

01
0C
C
1 C
C
C,
0C
C
0A
1

(d) $(m - 1)(n - 1)$.

1
B1
B
B
B1
B
B1
B
B0
B
B
B0
B
@0
0

0
0
0
0
1
1
1
1


0
0C
C
0C
C
C
1 C
C,
0C
C
C
0C
C
0A
1

1
B1
B
B1
B
B
B0
B
B0
B
B
B0
B
B0
B
@0
0

0
0
0
1
1
1
0
0
0

2.6.11.

1
B 1
B
B
B 0
B
(a) A = B
B 0
B
B 0
B
@ 0
0

1
0
1
0
0
0
0

0 0
0 0 0
0
0 1
0 0 0
0C
C
C
0 1
0 0 0
0C
C
1 0
0 1 0
0C
C.
0 0 1 0 1
0C
C
C
0 0 1 0 0
1A
0 0
0 0 1 1
0 1
0 1
0 1
1
0
0
B1C
B0C
B0C
B C
B C
B C
B C
B C
B C
B0C
B1C
B0C
B C
B C
B C
B1C
B0C
B C
C, v = B C, v = B 0 C form a basis for ker A.
(b) The vectors v1 = B
B0C
B0C
B1C
2
3
B C
B C
B C
B C
B C
B C
B0C
B1C
B0C
B C
B C
B C
@0A
@0A
@1A
0
0
1
(c) The entries of each $v_i$ are indexed by the vertices. Thus the nonzero entries in $v_1$ correspond to the vertices 1, 2, 4 in the first connected component, $v_2$ to the vertices 3, 6 in the second connected component, and $v_3$ to the vertices 5, 7, 8 in the third connected component.
(d) Let the digraph have $k$ connected components. A basis for $\ker A$ consists of the vectors $v_1, \ldots, v_k$ where $v_i$ has entries equal to 1 if the vertex lies in the $i$th connected component of the graph and 0 if it doesn't. To prove this, suppose $A v = 0$. If edge #$\ell$ connects vertex $a$ to vertex $b$, then the $\ell$th component of the linear system is $v_a - v_b = 0$. Thus, $v_a = v_b$ whenever the vertices are connected by an edge. If two vertices are in the same connected component, then they can be connected by a path, and the values $v_a = v_b = \cdots$ at each vertex on the path must be equal. Thus, the values of $v_a$ on all vertices in the connected component are equal, and hence $v = c_1 v_1 + \cdots + c_k v_k$ can be written as a linear combination of the basis vectors, with $c_i$ being the common value of the entries $v_a$ corresponding to vertices in the $i$th connected component. Thus, $v_1, \ldots, v_k$ span the kernel. Moreover, since the coefficients $c_i$ coincide with certain entries $v_a$ of $v$, the only linear combination giving the zero vector is when all $c_i$ are zero, proving their linear independence.
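The construction in part (d) can be sketched in code: find the connected components, form one indicator vector per component, and confirm that each is annihilated by the incidence matrix. The edge list is an assumption, chosen only to be consistent with the components {1, 2, 4}, {3, 6}, {5, 7, 8} named in part (c) (written 0-indexed below); the helper names are illustrative.

```python
def components(n, edges):
    """Connected components by repeatedly propagating the smallest label."""
    label = list(range(n))
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            low = min(label[a], label[b])
            if label[a] != low or label[b] != low:
                label[a] = label[b] = low
                changed = True
    groups = {}
    for vertex, lab in enumerate(label):
        groups.setdefault(lab, []).append(vertex)
    return list(groups.values())

def incidence_matrix(n, edges):
    """One row per edge: +1 at the start vertex, -1 at the end vertex."""
    rows = []
    for start, end in edges:
        row = [0] * n
        row[start], row[end] = 1, -1
        rows.append(row)
    return rows

n = 8
# Hypothetical edges, 0-indexed, giving components {0,1,3}, {2,5}, {4,6,7}.
edges = [(0, 1), (0, 3), (1, 3), (2, 5), (4, 6), (4, 7), (6, 7)]

A = incidence_matrix(n, edges)
comps = components(n, edges)
kernel_basis = [[1 if v in comp else 0 for v in range(n)] for comp in comps]

for vec in kernel_basis:
    # Each row of A computes v_a - v_b, which vanishes inside a component.
    print(vec, [sum(a * x for a, x in zip(row, vec)) for row in A])
```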

2.6.12. If the incidence matrix has rank $r$, then # circuits $= \dim \operatorname{coker} A = n - r = \dim \ker A \geq 1$, since $\ker A$ always contains the vector $( 1, 1, \ldots, 1 )^T$.
2.6.13. Changing the direction of an edge is the same as multiplying the corresponding row of the incidence matrix by $-1$. The dimension of the cokernel, being the number of independent circuits, does not change. Each entry of a cokernel vector that corresponds to an edge that has been reversed is multiplied by $-1$. This can be realized by left multiplying the incidence matrix by a diagonal matrix whose diagonal entries are $-1$ if the corresponding edge has been reversed, and $+1$ if it is unchanged.
2.6.14.
(a) Note that $P$ permutes the rows of $A$, and so corresponds to a relabeling of the edges of the digraph, while $Q$ permutes its columns, and so corresponds to a relabeling of the vertices.
(b) (i), (ii), (v) represent equivalent digraphs; none of the others are equivalent.
(c) $v = (v_1, \ldots, v_m) \in \operatorname{coker} A$ if and only if $\hat v = P v = (v_{\pi(1)}, \ldots, v_{\pi(m)}) \in \operatorname{coker} B$. Indeed, $\hat v^T B = (P v)^T P A Q = v^T A Q = 0$ since, according to Exercise 1.6.14, $P^T = P^{-1}$ is the inverse of the permutation matrix $P$.


2.6.15. False. For example, any two inequivalent trees, cf. Exercise 2.6.8, with the same number of nodes have incidence matrices of the same size, with trivial cokernels: coker A =
coker B = {0}. As another example, the incidence matrices
0
1
0
1
1 1
0
0
0
1 1
0
0
0
B 0
C
B
1 1
0
0C
1 1
0
0C
B
B 0
C
B
C
B
C
C
B 1
C
A=B
1
0
1
0
0
and
B
=
0
1
0
0
B
C
B
C
@ 1
@ 1
0
0 1
0A
0
0 1
0A
1
0
0
0 1
0
1
0
0 1
both have cokernel basis ( 1, 1, 1, 0, 0 )T , but do not represent equivalent digraphs.

2.6.16.
(a) If the first $k$ vertices belong to one component and the last $n - k$ to the other, then there is no edge between the two sets of vertices and so the entries $a_{ij} = 0$ whenever $i = 1, \ldots, k$, $j = k + 1, \ldots, n$, or when $i = k + 1, \ldots, n$, $j = 1, \ldots, k$, which proves that $A$ has the indicated block form.
(b) The graph consists of two disconnected triangles. If we use 1, 2, 3 to label the vertices in one triangle and 4, 5, 6 for those in the second, the resulting incidence matrix has the indicated block form
$$\begin{pmatrix} 1 & -1 & 0 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & -1 & 0 & 1 \end{pmatrix},$$
with each block a $3 \times 3$ submatrix.

Solutions Chapter 3

3.1.1. Bilinearity:
$$\begin{aligned}
\langle c\,u + d\,v, w \rangle &= (c\,u_1 + d\,v_1) w_1 - (c\,u_1 + d\,v_1) w_2 - (c\,u_2 + d\,v_2) w_1 + b\,(c\,u_2 + d\,v_2) w_2 \\
&= c\,(u_1 w_1 - u_1 w_2 - u_2 w_1 + b\,u_2 w_2) + d\,(v_1 w_1 - v_1 w_2 - v_2 w_1 + b\,v_2 w_2) \\
&= c\,\langle u, w \rangle + d\,\langle v, w \rangle, \\
\langle u, c\,v + d\,w \rangle &= u_1 (c\,v_1 + d\,w_1) - u_1 (c\,v_2 + d\,w_2) - u_2 (c\,v_1 + d\,w_1) + b\,u_2 (c\,v_2 + d\,w_2) \\
&= c\,(u_1 v_1 - u_1 v_2 - u_2 v_1 + b\,u_2 v_2) + d\,(u_1 w_1 - u_1 w_2 - u_2 w_1 + b\,u_2 w_2) \\
&= c\,\langle u, v \rangle + d\,\langle u, w \rangle.
\end{aligned}$$
Symmetry:
$$\langle v, w \rangle = v_1 w_1 - v_1 w_2 - v_2 w_1 + b\,v_2 w_2 = w_1 v_1 - w_1 v_2 - w_2 v_1 + b\,w_2 v_2 = \langle w, v \rangle.$$
To prove positive definiteness, note
$$\langle v, v \rangle = v_1^2 - 2\,v_1 v_2 + b\,v_2^2 = (v_1 - v_2)^2 + (b - 1)\,v_2^2 > 0 \quad \text{for all} \quad v = ( v_1, v_2 )^T \neq 0$$
if and only if $b > 1$. (If $b = 1$, the formula is only positive semi-definite, since $v = ( 1, 1 )^T$ gives $\langle v, v \rangle = 0$, for instance.)
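The completed-square identity used for positive definiteness can be spot-checked numerically. The sketch below evaluates both sides on a small grid of integer vectors; the function names and the sample values of `b` are illustrative assumptions.

```python
def inner(v, w, b):
    """<v, w> = v1 w1 - v1 w2 - v2 w1 + b v2 w2, as in Exercise 3.1.1."""
    return v[0] * w[0] - v[0] * w[1] - v[1] * w[0] + b * v[1] * w[1]

def completed_square(v, b):
    """(v1 - v2)^2 + (b - 1) v2^2, the rewritten form of <v, v>."""
    return (v[0] - v[1]) ** 2 + (b - 1) * v[1] ** 2

grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]

# The two expressions agree for every sample vector and several values of b.
agree = all(inner(v, v, b) == completed_square(v, b)
            for v in grid for b in (1, 2, 5))

# For b = 2 every nonzero sample vector has positive squared norm.
positive = all(inner(v, v, 2) > 0 for v in grid if v != (0, 0))

print(agree, positive)
print(inner((1, 1), (1, 1), 1))   # the degenerate direction when b = 1
```

A finite grid check is of course no proof; it only confirms that the algebra above is consistent on the sampled vectors.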
3.1.2. (a), (f) and (g) define inner products; the others don't.
3.1.3. It is not positive definite, since if v = ( 1, 1 )T , say, h v , v i = 0.
3.1.4.
(a) Bilinearity:
$$\begin{aligned}
\langle c\,u + d\,v, w \rangle &= (c\,u_1 + d\,v_1) w_1 + 2\,(c\,u_2 + d\,v_2) w_2 + 3\,(c\,u_3 + d\,v_3) w_3 \\
&= c\,(u_1 w_1 + 2\,u_2 w_2 + 3\,u_3 w_3) + d\,(v_1 w_1 + 2\,v_2 w_2 + 3\,v_3 w_3) \\
&= c\,\langle u, w \rangle + d\,\langle v, w \rangle, \\
\langle u, c\,v + d\,w \rangle &= u_1 (c\,v_1 + d\,w_1) + 2\,u_2 (c\,v_2 + d\,w_2) + 3\,u_3 (c\,v_3 + d\,w_3) \\
&= c\,(u_1 v_1 + 2\,u_2 v_2 + 3\,u_3 v_3) + d\,(u_1 w_1 + 2\,u_2 w_2 + 3\,u_3 w_3) \\
&= c\,\langle u, v \rangle + d\,\langle u, w \rangle.
\end{aligned}$$
Symmetry:
$$\langle v, w \rangle = v_1 w_1 + 2\,v_2 w_2 + 3\,v_3 w_3 = w_1 v_1 + 2\,w_2 v_2 + 3\,w_3 v_3 = \langle w, v \rangle.$$
Positivity:
$$\langle v, v \rangle = v_1^2 + 2\,v_2^2 + 3\,v_3^2 > 0 \quad \text{for all} \quad v = ( v_1, v_2, v_3 )^T \neq 0$$
because it is a sum of non-negative terms, at least one of which is strictly positive.



(b) Bilinearity:
$$\begin{aligned}
\langle c\,u + d\,v, w \rangle &= 4\,(c\,u_1 + d\,v_1) w_1 + 2\,(c\,u_1 + d\,v_1) w_2 + 2\,(c\,u_2 + d\,v_2) w_1 + 4\,(c\,u_2 + d\,v_2) w_2 + (c\,u_3 + d\,v_3) w_3 \\
&= c\,(4\,u_1 w_1 + 2\,u_1 w_2 + 2\,u_2 w_1 + 4\,u_2 w_2 + u_3 w_3) + d\,(4\,v_1 w_1 + 2\,v_1 w_2 + 2\,v_2 w_1 + 4\,v_2 w_2 + v_3 w_3) \\
&= c\,\langle u, w \rangle + d\,\langle v, w \rangle, \\
\langle u, c\,v + d\,w \rangle &= 4\,u_1 (c\,v_1 + d\,w_1) + 2\,u_1 (c\,v_2 + d\,w_2) + 2\,u_2 (c\,v_1 + d\,w_1) + 4\,u_2 (c\,v_2 + d\,w_2) + u_3 (c\,v_3 + d\,w_3) \\
&= c\,(4\,u_1 v_1 + 2\,u_1 v_2 + 2\,u_2 v_1 + 4\,u_2 v_2 + u_3 v_3) + d\,(4\,u_1 w_1 + 2\,u_1 w_2 + 2\,u_2 w_1 + 4\,u_2 w_2 + u_3 w_3) \\
&= c\,\langle u, v \rangle + d\,\langle u, w \rangle.
\end{aligned}$$
Symmetry:
$$\langle v, w \rangle = 4\,v_1 w_1 + 2\,v_1 w_2 + 2\,v_2 w_1 + 4\,v_2 w_2 + v_3 w_3 = 4\,w_1 v_1 + 2\,w_1 v_2 + 2\,w_2 v_1 + 4\,w_2 v_2 + w_3 v_3 = \langle w, v \rangle.$$
Positivity:
$$\langle v, v \rangle = 4\,v_1^2 + 4\,v_1 v_2 + 4\,v_2^2 + v_3^2 = (2\,v_1 + v_2)^2 + 3\,v_2^2 + v_3^2 > 0 \quad \text{for all} \quad v = ( v_1, v_2, v_3 )^T \neq 0,$$
because it is a sum of non-negative terms, at least one of which is strictly positive.
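(c) Bilinearity:
$$\begin{aligned}
\langle c\,u + d\,v, w \rangle &= 2\,(c\,u_1 + d\,v_1) w_1 - 2\,(c\,u_1 + d\,v_1) w_2 - 2\,(c\,u_2 + d\,v_2) w_1 + 3\,(c\,u_2 + d\,v_2) w_2 \\
&\qquad - (c\,u_2 + d\,v_2) w_3 - (c\,u_3 + d\,v_3) w_2 + 2\,(c\,u_3 + d\,v_3) w_3 \\
&= c\,(2\,u_1 w_1 - 2\,u_1 w_2 - 2\,u_2 w_1 + 3\,u_2 w_2 - u_2 w_3 - u_3 w_2 + 2\,u_3 w_3) \\
&\qquad + d\,(2\,v_1 w_1 - 2\,v_1 w_2 - 2\,v_2 w_1 + 3\,v_2 w_2 - v_2 w_3 - v_3 w_2 + 2\,v_3 w_3) \\
&= c\,\langle u, w \rangle + d\,\langle v, w \rangle, \\
\langle u, c\,v + d\,w \rangle &= 2\,u_1 (c\,v_1 + d\,w_1) - 2\,u_1 (c\,v_2 + d\,w_2) - 2\,u_2 (c\,v_1 + d\,w_1) + 3\,u_2 (c\,v_2 + d\,w_2) \\
&\qquad - u_2 (c\,v_3 + d\,w_3) - u_3 (c\,v_2 + d\,w_2) + 2\,u_3 (c\,v_3 + d\,w_3) \\
&= c\,(2\,u_1 v_1 - 2\,u_1 v_2 - 2\,u_2 v_1 + 3\,u_2 v_2 - u_2 v_3 - u_3 v_2 + 2\,u_3 v_3) \\
&\qquad + d\,(2\,u_1 w_1 - 2\,u_1 w_2 - 2\,u_2 w_1 + 3\,u_2 w_2 - u_2 w_3 - u_3 w_2 + 2\,u_3 w_3) \\
&= c\,\langle u, v \rangle + d\,\langle u, w \rangle.
\end{aligned}$$
Symmetry:
$$\begin{aligned}
\langle v, w \rangle &= 2\,v_1 w_1 - 2\,v_1 w_2 - 2\,v_2 w_1 + 3\,v_2 w_2 - v_2 w_3 - v_3 w_2 + 2\,v_3 w_3 \\
&= 2\,w_1 v_1 - 2\,w_1 v_2 - 2\,w_2 v_1 + 3\,w_2 v_2 - w_2 v_3 - w_3 v_2 + 2\,w_3 v_3 = \langle w, v \rangle.
\end{aligned}$$
Positivity:
$$\langle v, v \rangle = 2\,v_1^2 - 4\,v_1 v_2 + 3\,v_2^2 - 2\,v_2 v_3 + 2\,v_3^2 = 2\,(v_1 - v_2)^2 + (v_2 - v_3)^2 + v_3^2 > 0$$
for all $v = ( v_1, v_2, v_3 )^T \neq 0$, because it is a sum of non-negative terms, at least one of which is strictly positive.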
3.1.5. (a) $( \cos t, \sin t )^T$, $\left( \dfrac{\cos t}{\sqrt 2}, \dfrac{\sin t}{\sqrt 5} \right)^T$, $\left( \cos t + \dfrac{\sin t}{\sqrt 3}, \dfrac{\sin t}{\sqrt 3} \right)^T$. [Plots of the three unit circles not reproduced.]

(b) Note: By elementary analytical geometry, any quadratic equation of the form $a\,x^2 + b\,x\,y + c\,y^2 = 1$ defines an ellipse provided $a > 0$ and $b^2 - 4\,a\,c < 0$.
Case (b): The equation $2\,v_1^2 + 5\,v_2^2 = 1$ defines an ellipse with semi-axes $\frac{1}{\sqrt 2}$, $\frac{1}{\sqrt 5}$.
Case (c): The equation $v_1^2 - 2\,v_1 v_2 + 4\,v_2^2 = 1$ also defines an ellipse by the preceding remark.
3.1.6.
(a) The vector $v = ( x, y )^T$ can be viewed as the hypotenuse of a right triangle with side lengths $x, y$, and so by Pythagoras, $\| v \|^2 = x^2 + y^2$.
(b) First, the projection $p = ( x, y, 0 )^T$ of $v = ( x, y, z )^T$ onto the $xy$ plane has length $\| p \| = \sqrt{x^2 + y^2}$ by Pythagoras, as in part (a). Second, the right triangle formed by $0$, $p$ and $v$ has side lengths $\| p \|$ and $z$, and, again by Pythagoras,
$$\| v \|^2 = \| p \|^2 + z^2 = x^2 + y^2 + z^2 .$$
3.1.7. $\| c\,v \| = \sqrt{\langle c\,v, c\,v \rangle} = \sqrt{c^2\,\langle v, v \rangle} = | c |\,\| v \|$.

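3.1.8. By bilinearity and symmetry,
$$\begin{aligned}
\langle a\,v + b\,w, c\,v + d\,w \rangle &= a\,\langle v, c\,v + d\,w \rangle + b\,\langle w, c\,v + d\,w \rangle \\
&= a\,c\,\langle v, v \rangle + a\,d\,\langle v, w \rangle + b\,c\,\langle w, v \rangle + b\,d\,\langle w, w \rangle \\
&= a\,c\,\| v \|^2 + (a\,d + b\,c)\,\langle v, w \rangle + b\,d\,\| w \|^2 .
\end{aligned}$$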

3.1.9. If we know the first bilinearity property and symmetry, then the second follows:
$$\langle u, c\,v + d\,w \rangle = \langle c\,v + d\,w, u \rangle = c\,\langle v, u \rangle + d\,\langle w, u \rangle = c\,\langle u, v \rangle + d\,\langle u, w \rangle.$$
3.1.10.
(a) Choosing $v = x$, we have $0 = \langle x, x \rangle = \| x \|^2$, and hence $x = 0$.
(b) Rewrite the condition as $0 = \langle x, v \rangle - \langle y, v \rangle = \langle x - y, v \rangle$ for all $v \in V$. Now use part (a) to conclude that $x - y = 0$ and so $x = y$.
(c) If $v$ is any element of $V$, then we can write $v = c_1 v_1 + \cdots + c_n v_n$ as a linear combination of the basis elements, and so, by bilinearity, $\langle x, v \rangle = c_1 \langle x, v_1 \rangle + \cdots + c_n \langle x, v_n \rangle = 0$. Since this holds for all $v \in V$, the result in part (a) implies $x = 0$.
3.1.11.
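(a)
$$\begin{aligned}
\| u + v \|^2 - \| u - v \|^2 &= \langle u + v, u + v \rangle - \langle u - v, u - v \rangle \\
&= \bigl( \langle u, u \rangle + 2\,\langle u, v \rangle + \langle v, v \rangle \bigr) - \bigl( \langle u, u \rangle - 2\,\langle u, v \rangle + \langle v, v \rangle \bigr) = 4\,\langle u, v \rangle.
\end{aligned}$$
(b)
$$\begin{aligned}
\langle v, w \rangle &= \tfrac14 \bigl[ (v_1 + w_1)^2 - 3\,(v_1 + w_1)(v_2 + w_2) + 5\,(v_2 + w_2)^2 \bigr] \\
&\qquad - \tfrac14 \bigl[ (v_1 - w_1)^2 - 3\,(v_1 - w_1)(v_2 - w_2) + 5\,(v_2 - w_2)^2 \bigr] \\
&= v_1 w_1 - \tfrac32\,v_1 w_2 - \tfrac32\,v_2 w_1 + 5\,v_2 w_2 .
\end{aligned}$$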
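The polarization computation in part (b) can be verified on sample vectors. The following sketch assumes the squared norm $\| v \|^2 = v_1^2 - 3\,v_1 v_2 + 5\,v_2^2$ read off from the formula above and compares the polarization formula with the resulting closed form; the function names and test vectors are illustrative.

```python
from fractions import Fraction

def norm_sq(v):
    """Squared norm v1^2 - 3 v1 v2 + 5 v2^2 used in part (b)."""
    return v[0] ** 2 - 3 * v[0] * v[1] + 5 * v[1] ** 2

def inner_by_polarization(v, w):
    """<v, w> = ( ||v + w||^2 - ||v - w||^2 ) / 4."""
    plus = (v[0] + w[0], v[1] + w[1])
    minus = (v[0] - w[0], v[1] - w[1])
    return Fraction(norm_sq(plus) - norm_sq(minus), 4)

def inner_closed_form(v, w):
    """v1 w1 - (3/2) v1 w2 - (3/2) v2 w1 + 5 v2 w2."""
    half3 = Fraction(3, 2)
    return v[0] * w[0] - half3 * v[0] * w[1] - half3 * v[1] * w[0] + 5 * v[1] * w[1]

v, w = (1, 2), (3, -1)
print(inner_by_polarization(v, w), inner_closed_form(v, w))
```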

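3.1.12.
(a) $\| x + y \|^2 + \| x - y \|^2 = \bigl( \| x \|^2 + 2\,\langle x, y \rangle + \| y \|^2 \bigr) + \bigl( \| x \|^2 - 2\,\langle x, y \rangle + \| y \|^2 \bigr) = 2\,\| x \|^2 + 2\,\| y \|^2$.

(b) The sum of the squared lengths of the diagonals in a parallelogram equals the sum of the squared lengths of all four sides.
[Figure: parallelogram with sides of length $\| x \|$, $\| y \|$ and diagonals of length $\| x + y \|$, $\| x - y \|$.]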

3.1.13. By Exercise 3.1.12, $\| v \|^2 = \tfrac12 \bigl( \| u + v \|^2 + \| u - v \|^2 \bigr) - \| u \|^2 = 17$, so $\| v \| = \sqrt{17}$. The answer is the same in all norms coming from inner products.

3.1.14. Using (3.2), $v \cdot (A w) = v^T A w = (A^T v)^T w = (A^T v) \cdot w$.

3.1.15. First, if $A$ is symmetric, then
$(A v) \cdot w = (A v)^T w = v^T A^T w = v^T A w = v \cdot (A w)$.
To prove the converse, note that $A e_j$ gives the $j$-th column of $A$, and so
$a_{ij} = e_i \cdot (A e_j) = (A e_i) \cdot e_j = a_{ji}$ for all $i, j$. Hence $A = A^T$.
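The converse argument tests the identity only on the standard basis vectors. A small Python sketch of that test follows; the helper names and the two sample matrices are assumptions made for illustration.

```python
def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def self_adjoint_on_basis(A):
    """Check (A e_i) . e_j == e_i . (A e_j) for all standard basis vectors."""
    n = len(A)
    basis = [[1 if k == i else 0 for k in range(n)] for i in range(n)]
    return all(
        dot(matvec(A, basis[i]), basis[j]) == dot(basis[i], matvec(A, basis[j]))
        for i in range(n)
        for j in range(n)
    )


# Hypothetical examples: one symmetric matrix, one that is not.
symmetric = [[2, -1, 0], [-1, 3, 4], [0, 4, 5]]
nonsymmetric = [[1, 2], [0, 1]]
print(self_adjoint_on_basis(symmetric), self_adjoint_on_basis(nonsymmetric))
```

Since $(A e_i) \cdot e_j = a_{ji}$ and $e_i \cdot (A e_j) = a_{ij}$, the check succeeds exactly when the matrix equals its transpose.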

3.1.16. The inner product axioms continue to hold when restricted to vectors in W since they
hold for all vectors in V , including those in W .
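3.1.17. Bilinearity:
$\langle\langle\langle c u + d v, w \rangle\rangle\rangle = \langle c u + d v, w \rangle + \langle\langle c u + d v, w \rangle\rangle$
$= c \langle u, w \rangle + d \langle v, w \rangle + c \langle\langle u, w \rangle\rangle + d \langle\langle v, w \rangle\rangle = c \langle\langle\langle u, w \rangle\rangle\rangle + d \langle\langle\langle v, w \rangle\rangle\rangle$,
$\langle\langle\langle u, c v + d w \rangle\rangle\rangle = \langle u, c v + d w \rangle + \langle\langle u, c v + d w \rangle\rangle$
$= c \langle u, v \rangle + d \langle u, w \rangle + c \langle\langle u, v \rangle\rangle + d \langle\langle u, w \rangle\rangle = c \langle\langle\langle u, v \rangle\rangle\rangle + d \langle\langle\langle u, w \rangle\rangle\rangle$.
Symmetry:
$\langle\langle\langle v, w \rangle\rangle\rangle = \langle v, w \rangle + \langle\langle v, w \rangle\rangle = \langle w, v \rangle + \langle\langle w, v \rangle\rangle = \langle\langle\langle w, v \rangle\rangle\rangle$.
Positivity: $\langle\langle\langle v, v \rangle\rangle\rangle = \langle v, v \rangle + \langle\langle v, v \rangle\rangle > 0$ for all $v \ne 0$ since both terms are positive.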
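3.1.18. Bilinearity:
$\langle\langle\langle c (v, w) + d (\tilde v, \tilde w), (\hat v, \hat w) \rangle\rangle\rangle = \langle\langle\langle (c v + d \tilde v, c w + d \tilde w), (\hat v, \hat w) \rangle\rangle\rangle$
$= \langle c v + d \tilde v, \hat v \rangle + \langle\langle c w + d \tilde w, \hat w \rangle\rangle$
$= c \langle v, \hat v \rangle + d \langle \tilde v, \hat v \rangle + c \langle\langle w, \hat w \rangle\rangle + d \langle\langle \tilde w, \hat w \rangle\rangle$
$= c \langle\langle\langle (v, w), (\hat v, \hat w) \rangle\rangle\rangle + d \langle\langle\langle (\tilde v, \tilde w), (\hat v, \hat w) \rangle\rangle\rangle$,
$\langle\langle\langle (v, w), c (\tilde v, \tilde w) + d (\hat v, \hat w) \rangle\rangle\rangle = \langle\langle\langle (v, w), (c \tilde v + d \hat v, c \tilde w + d \hat w) \rangle\rangle\rangle$
$= \langle v, c \tilde v + d \hat v \rangle + \langle\langle w, c \tilde w + d \hat w \rangle\rangle$
$= c \langle v, \tilde v \rangle + d \langle v, \hat v \rangle + c \langle\langle w, \tilde w \rangle\rangle + d \langle\langle w, \hat w \rangle\rangle$
$= c \langle\langle\langle (v, w), (\tilde v, \tilde w) \rangle\rangle\rangle + d \langle\langle\langle (v, w), (\hat v, \hat w) \rangle\rangle\rangle$.
Symmetry:
$\langle\langle\langle (v, w), (\tilde v, \tilde w) \rangle\rangle\rangle = \langle v, \tilde v \rangle + \langle\langle w, \tilde w \rangle\rangle = \langle \tilde v, v \rangle + \langle\langle \tilde w, w \rangle\rangle = \langle\langle\langle (\tilde v, \tilde w), (v, w) \rangle\rangle\rangle$.
Positivity:
$\langle\langle\langle (v, w), (v, w) \rangle\rangle\rangle = \langle v, v \rangle + \langle\langle w, w \rangle\rangle > 0$
for all $(v, w) \ne (0, 0)$, since both terms are non-negative and at least one is positive because either $v \ne 0$ or $w \ne 0$.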

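3.1.19.
(a) $\langle 1, x \rangle = \frac12$, $\| 1 \| = 1$, $\| x \| = \frac{1}{\sqrt3}$;
(b) $\langle \cos 2 \pi x, \sin 2 \pi x \rangle = 0$, $\| \cos 2 \pi x \| = \frac{1}{\sqrt2}$, $\| \sin 2 \pi x \| = \frac{1}{\sqrt2}$;
(c) $\langle x, e^x \rangle = 1$, $\| x \| = \frac{1}{\sqrt3}$, $\| e^x \| = \sqrt{\frac12 (e^2 - 1)}$;
(d) $\bigl\langle (x + 1)^2, \frac{1}{x + 1} \bigr\rangle = \frac32$, $\| (x + 1)^2 \| = \sqrt{\frac{31}{5}}$, $\bigl\| \frac{1}{x + 1} \bigr\| = \frac{1}{\sqrt2}$.

3.1.20. (a) $\langle f, g \rangle = \frac34$, $\| f \| = \frac{1}{\sqrt3}$, $\| g \| = \sqrt{\frac{28}{15}}$; (b) $\langle f, g \rangle = 0$, $\| f \| = \sqrt{\frac23}$, $\| g \| = \sqrt{\frac{56}{15}}$;
(c) $\langle f, g \rangle = \frac{8}{15}$, $\| f \| = \frac12$, $\| g \| = \sqrt{\frac76}$.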
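Values such as those in Exercise 3.1.19 can be sanity-checked by numerical quadrature. The sketch below assumes the $L^2$ inner product on $[0, 1]$ and uses a composite Simpson rule; the function names are hypothetical and the approach is a numerical check, not the method of the solution.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def l2_inner(f, g, a=0.0, b=1.0):
    """L^2 inner product <f, g> = integral of f(x) g(x) over [a, b]."""
    return simpson(lambda x: f(x) * g(x), a, b)


def l2_norm(f, a=0.0, b=1.0):
    return math.sqrt(l2_inner(f, f, a, b))


def identity(x):
    return x


print(l2_inner(identity, math.exp))  # compare with <x, e^x> = 1
print(l2_norm(math.exp))             # compare with sqrt((e^2 - 1) / 2)
print(l2_norm(identity))             # compare with 1 / sqrt(3)
```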

3.1.21. All but (b) are inner products. (b) is not because it fails positivity: for instance,
$\int_{-1}^{1} (1 - x)^2 x \, dx = - \frac43$.

3.1.22. If $f(x)$ is any nonzero function that satisfies $f(x) = 0$ for all $0 \le x \le 1$ then $\langle f, f \rangle = 0$. An example is the function
$f(x) = \begin{cases} x, & -1 \le x \le 0, \\ 0, & 0 \le x \le 1. \end{cases}$
However, if the function $f \in C^0 [ 0, 1 ]$ is only considered on $[ 0, 1 ]$, its values outside the interval are irrelevant, and so the positivity is unaffected. The formula does define an inner product on the subspace of polynomial functions because $\langle p, p \rangle = 0$ if and only if $p(x) = 0$ for all $0 \le x \le 1$, which implies $p(x) \equiv 0$ for all $x$ since only the zero polynomial can vanish on an interval.
3.1.23.
(a) No. Positivity doesn't hold, since if $f(0) = f(1) = 0$ then $\langle f, f \rangle = 0$ even if $f(x) \ne 0$ for any $0 < x < 1$;
(b) Yes. Bilinearity and symmetry are readily established. As for positivity,
$\langle f, f \rangle = f(0)^2 + f(1)^2 + \int_0^1 f(x)^2 \, dx \ge 0$ is a sum of three non-negative quantities, and is equal to $0$ if and only if all three terms vanish, so $f(0) = f(1) = 0$ and $\int_0^1 f(x)^2 \, dx = 0$ which, by continuity, implies $f(x) \equiv 0$ for all $0 \le x \le 1$.

3.1.24. No. For example, on $[ -1, 1 ]$, $\| 1 \| = \sqrt2$, but $\| 1 \|^2 = 2 \ne \| 1^2 \| = \sqrt2$.


3.1.25. Bilinearity:
$\langle c f + d g, h \rangle = \int_a^b \bigl[ \{ c f(x) + d g(x) \} h(x) + \{ c f(x) + d g(x) \}' h'(x) \bigr] dx$
$= \int_a^b \bigl[ c f(x) h(x) + d g(x) h(x) + c f'(x) h'(x) + d g'(x) h'(x) \bigr] dx$
$= c \int_a^b \bigl[ f(x) h(x) + f'(x) h'(x) \bigr] dx + d \int_a^b \bigl[ g(x) h(x) + g'(x) h'(x) \bigr] dx$
$= c \langle f, h \rangle + d \langle g, h \rangle$.
$\langle f, c g + d h \rangle = \int_a^b \bigl[ f(x) \{ c g(x) + d h(x) \} + f'(x) \{ c g(x) + d h(x) \}' \bigr] dx$
$= \int_a^b \bigl[ c f(x) g(x) + d f(x) h(x) + c f'(x) g'(x) + d f'(x) h'(x) \bigr] dx$
$= c \int_a^b \bigl[ f(x) g(x) + f'(x) g'(x) \bigr] dx + d \int_a^b \bigl[ f(x) h(x) + f'(x) h'(x) \bigr] dx$
$= c \langle f, g \rangle + d \langle f, h \rangle$.
Symmetry:
$\langle f, g \rangle = \int_a^b \bigl[ f(x) g(x) + f'(x) g'(x) \bigr] dx = \int_a^b \bigl[ g(x) f(x) + g'(x) f'(x) \bigr] dx = \langle g, f \rangle$.
Positivity:
$\langle f, f \rangle = \int_a^b \bigl[ f(x)^2 + f'(x)^2 \bigr] dx > 0$ for all $f \not\equiv 0$,
since the integrand is non-negative, and, by continuity, the integral is zero if and only if both $f(x) \equiv 0$ and $f'(x) \equiv 0$ for all $a \le x \le b$.
3.1.26.
(a) No, because if f (x) is any constant function, then h f , f i = 0, and so positive definiteness does not hold.
(b) Yes. To prove the first bilinearity condition:
$\langle c f + d g, h \rangle = \int_{-1}^{1} \bigl[ c f'(x) + d g'(x) \bigr] h'(x) \, dx = c \int_{-1}^{1} f'(x) h'(x) \, dx + d \int_{-1}^{1} g'(x) h'(x) \, dx = c \langle f, h \rangle + d \langle g, h \rangle$.
The second has a similar proof, or follows from symmetry, cf. Exercise 3.1.9. To prove symmetry:
$\langle f, g \rangle = \int_{-1}^{1} f'(x) g'(x) \, dx = \int_{-1}^{1} g'(x) f'(x) \, dx = \langle g, f \rangle$.
As for positivity, $\langle f, f \rangle = \int_{-1}^{1} f'(x)^2 \, dx \ge 0$. Moreover, since $f'$ is continuous, $\langle f, f \rangle = 0$ if and only if $f'(x) \equiv 0$ for all $x$, and so $f(x) \equiv c$ is constant. But the only constant function in $W$ is the zero function, and so $\langle f, f \rangle > 0$ for all $0 \ne f \in W$.

3.1.27. Suppose $h(x_0) = k > 0$ for some $a < x_0 < b$. Then, by continuity, $h(x) \ge \frac12 k$ for $a < x_0 - \delta < x < x_0 + \delta < b$ for some $\delta > 0$. But then, since $h(x) \ge 0$ everywhere,
$\int_a^b h(x) \, dx \ge \int_{x_0 - \delta}^{x_0 + \delta} h(x) \, dx \ge k \delta > 0$,
which is a contradiction. A similar contradiction can be shown when $h(x_0) = k < 0$ for some $a < x_0 < b$. Thus $h(x) = 0$ for all $a < x < b$, which, by continuity, also implies $h(a) = h(b) = 0$. The function in (3.14) gives a discontinuous counterexample.
3.1.28.
(a) To prove the first bilinearity condition:
$\langle c f + d g, h \rangle = \int_a^b \bigl[ c f(x) + d g(x) \bigr] h(x) w(x) \, dx = c \int_a^b f(x) h(x) w(x) \, dx + d \int_a^b g(x) h(x) w(x) \, dx = c \langle f, h \rangle + d \langle g, h \rangle$.
The second has a similar proof, or follows from symmetry, cf. Exercise 3.1.9. To prove symmetry:
$\langle f, g \rangle = \int_a^b f(x) g(x) w(x) \, dx = \int_a^b g(x) f(x) w(x) \, dx = \langle g, f \rangle$.
As for positivity, $\langle f, f \rangle = \int_a^b f(x)^2 w(x) \, dx \ge 0$. Moreover, since $w(x) > 0$ and the integrand is continuous, Exercise 3.1.27 implies that $\langle f, f \rangle = 0$ if and only if $f(x)^2 w(x) \equiv 0$ for all $x$, and so $f(x) \equiv 0$.
(b) If $w(x_0) < 0$, then, by continuity, $w(x) < 0$ for $x_0 - \delta \le x \le x_0 + \delta$ for some $\delta > 0$. Now choose $f(x) \not\equiv 0$ so that $f(x) = 0$ whenever $| x - x_0 | > \delta$. Then
$\langle f, f \rangle = \int_a^b f(x)^2 w(x) \, dx = \int_{x_0 - \delta}^{x_0 + \delta} f(x)^2 w(x) \, dx < 0$,
violating positivity.
(c) Bilinearity and symmetry continue to hold. The positivity argument says that $\langle f, f \rangle = 0$ implies that $f(x) = 0$ whenever $w(x) > 0$. By continuity, $f(x) \equiv 0$, provided $w(x) \not\equiv 0$ on any open subinterval $a \le c < x < d \le b$, and so under this assumption it remains an inner product. However, if $w(x) \equiv 0$ on a subinterval, then positivity is violated.
3.1.29.
(a) If $f(x_0, y_0) = k > 0$ then, by continuity, $f(x, y) \ge \frac12 k$ for $(x, y) \in D = \{ \| x - x_0 \| \le \delta \}$ for some $\delta > 0$. But then
$\iint_\Omega f(x, y) \, dx \, dy \ge \iint_D f(x, y) \, dx \, dy \ge \frac12 \pi k \delta^2 > 0$.
(b) Bilinearity:
$\langle c f + d g, h \rangle = \iint_\Omega \{ c f(x, y) + d g(x, y) \} h(x, y) \, dx \, dy$
$= c \iint_\Omega f(x, y) h(x, y) \, dx \, dy + d \iint_\Omega g(x, y) h(x, y) \, dx \, dy = c \langle f, h \rangle + d \langle g, h \rangle$.
The second bilinearity condition follows from the first and symmetry:
$\langle f, g \rangle = \iint_\Omega f(x, y) g(x, y) \, dx \, dy = \iint_\Omega g(x, y) f(x, y) \, dx \, dy = \langle g, f \rangle$.
Positivity: using part (a),
$\langle f, f \rangle = \iint_\Omega [ f(x, y) ]^2 \, dx \, dy > 0$ for all $f \not\equiv 0$.
The formula for the norm is $\| f \| = \sqrt{ \iint_\Omega [ f(x, y) ]^2 \, dx \, dy }$.

3.1.30. (a) $\langle f, g \rangle = \frac23$, $\| f \| = 1$, $\| g \| = \sqrt{\frac{28}{45}}$; (b) $\langle f, g \rangle = \frac12 \pi$, $\| f \| = \sqrt\pi$, $\| g \| = \sqrt{\frac{\pi}{3}}$.

3.1.31.
(a) To prove the first bilinearity condition:
$\langle\langle c f + d g, h \rangle\rangle = \int_0^1 \bigl[ (c f_1(x) + d g_1(x)) h_1(x) + (c f_2(x) + d g_2(x)) h_2(x) \bigr] dx$
$= c \int_0^1 \bigl[ f_1(x) h_1(x) + f_2(x) h_2(x) \bigr] dx + d \int_0^1 \bigl[ g_1(x) h_1(x) + g_2(x) h_2(x) \bigr] dx$
$= c \langle\langle f, h \rangle\rangle + d \langle\langle g, h \rangle\rangle$.
The second has a similar proof, or follows from symmetry, cf. Exercise 3.1.9. To prove symmetry:
$\langle\langle f, g \rangle\rangle = \int_0^1 \bigl[ f_1(x) g_1(x) + f_2(x) g_2(x) \bigr] dx = \int_0^1 \bigl[ g_1(x) f_1(x) + g_2(x) f_2(x) \bigr] dx = \langle\langle g, f \rangle\rangle$.
As for positivity, $\langle\langle f, f \rangle\rangle = \int_0^1 \bigl[ f_1(x)^2 + f_2(x)^2 \bigr] dx \ge 0$, since the integrand is a non-negative function. Moreover, since $f_1(x)$ and $f_2(x)$ are continuous, so is $f_1(x)^2 + f_2(x)^2$, and hence $\langle\langle f, f \rangle\rangle = 0$ if and only if $f_1(x)^2 + f_2(x)^2 = 0$ for all $x$, and so $f(x) = ( f_1(x), f_2(x) )^T \equiv 0$.
(b) First bilinearity:
$\langle\langle c f + d g, h \rangle\rangle = \int_0^1 \langle c f(x) + d g(x), h(x) \rangle \, dx = c \int_0^1 \langle f(x), h(x) \rangle \, dx + d \int_0^1 \langle g(x), h(x) \rangle \, dx = c \langle\langle f, h \rangle\rangle + d \langle\langle g, h \rangle\rangle$.
Symmetry:
$\langle\langle f, g \rangle\rangle = \int_0^1 \langle f(x), g(x) \rangle \, dx = \int_0^1 \langle g(x), f(x) \rangle \, dx = \langle\langle g, f \rangle\rangle$.
Positivity: $\langle\langle f, f \rangle\rangle = \int_0^1 \| f(x) \|^2 \, dx \ge 0$ since the integrand is non-negative. Moreover, $\langle\langle f, f \rangle\rangle = 0$ if and only if $\| f(x) \|^2 = 0$ for all $x$, and so, in view of the continuity of $\| f(x) \|^2$, we conclude that $f(x) \equiv 0$.
(c) This follows because $\langle v, w \rangle = v_1 w_1 - v_1 w_2 - v_2 w_1 + 3 v_2 w_2 = v^T K w$ for $K = \begin{pmatrix} 1 & -1 \\ -1 & 3 \end{pmatrix} > 0$ defines an inner product on $R^2$.

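3.2.1.
(a) $| v_1 \cdot v_2 | = 3 \le 5 = \sqrt5 \, \sqrt5 = \| v_1 \| \, \| v_2 \|$; angle: $\cos^{-1} \frac35 \approx .9273$;
(b) $| v_1 \cdot v_2 | = 1 \le 2 = \sqrt2 \, \sqrt2 = \| v_1 \| \, \| v_2 \|$; angle: $\frac23 \pi \approx 2.0944$;
(c) $| v_1 \cdot v_2 | = 0 \le 2 \sqrt6 = \sqrt2 \, \sqrt{12} = \| v_1 \| \, \| v_2 \|$; angle: $\frac12 \pi \approx 1.5708$;
(d) $| v_1 \cdot v_2 | = 3 \le 3 \sqrt2 = \sqrt3 \, \sqrt6 = \| v_1 \| \, \| v_2 \|$; angle: $\frac34 \pi \approx 2.3562$;
(e) $| v_1 \cdot v_2 | = 4 \le 2 \sqrt{15} = \sqrt{10} \, \sqrt6 = \| v_1 \| \, \| v_2 \|$; angle: $\cos^{-1} \bigl( - \frac{2}{\sqrt{15}} \bigr) \approx 2.1134$.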
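The pattern in each part (compute the dot product, compare it with the product of norms, then take the inverse cosine) can be sketched in Python. The vectors below are hypothetical: they were picked only because they reproduce the numbers of part (a), and are not claimed to be the vectors of the exercise.

```python
import math


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    return math.sqrt(dot(v, v))


def angle(v, w):
    """Angle between nonzero vectors, from cos(theta) = v.w / (|v| |w|)."""
    return math.acos(dot(v, w) / (norm(v) * norm(w)))


# Assumed sample vectors with v1 . v2 = 3 and |v1| = |v2| = sqrt(5).
v1, v2 = (1.0, 2.0), (-1.0, 2.0)
cauchy_schwarz_holds = abs(dot(v1, v2)) <= norm(v1) * norm(v2)
print(cauchy_schwarz_holds, round(angle(v1, v2), 4))
```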

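3.2.2. (a) $\frac13 \pi$; (b) $0$, $\frac13 \pi$, $\frac12 \pi$, $\frac23 \pi$, or $\pi$, depending upon whether $-1$ appears 0, 1, 2, 3 or 4 times in the second vector.

3.2.3. The side lengths are all equal to
$\| (1, 1, 0) - (0, 0, 0) \| = \| (1, 1, 0) - (1, 0, 1) \| = \| (1, 1, 0) - (0, 1, 1) \| = \cdots = \sqrt2$.
The edge angle is $\frac13 \pi = 60^\circ$. The center angle is $\cos \theta = - \frac13$, so $\theta = 1.9106 = 109.4712^\circ$.

3.2.4.
(a) $| v \cdot w | = 5 \le 7.0711 = \sqrt5 \, \sqrt{10} = \| v \| \, \| w \|$.
(b) $| \langle v, w \rangle | = 11 \le 13.0767 = 3 \sqrt{19} = \| v \| \, \| w \|$.
(c) $| \langle v, w \rangle | = 22 \le 23.6432 = \sqrt{13} \, \sqrt{43} = \| v \| \, \| w \|$.

3.2.5.
(a) $| v \cdot w | = 6 \le 6.4807 = \sqrt{14} \, \sqrt3 = \| v \| \, \| w \|$.
(b) $| \langle v, w \rangle | = 11 \le 11.7473 = \sqrt{23} \, \sqrt6 = \| v \| \, \| w \|$.
(c) $| \langle v, w \rangle | = 19 \le 19.4936 = \sqrt{38} \, \sqrt{10} = \| v \| \, \| w \|$.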
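3.2.6. Set $v = ( a, b )^T$, $w = ( \cos \theta, \sin \theta )^T$, so that Cauchy-Schwarz gives
$| v \cdot w | = | a \cos \theta + b \sin \theta | \le \sqrt{a^2 + b^2} = \| v \| \, \| w \|$.

3.2.7. Set $v = ( a_1, \ldots, a_n )^T$, $w = ( 1, 1, \ldots, 1 )^T$, so that Cauchy-Schwarz gives
$| v \cdot w | = | a_1 + a_2 + \cdots + a_n | \le \sqrt{n} \, \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2} = \| v \| \, \| w \|$.
Equality holds if and only if $v = a w$, i.e., $a_1 = a_2 = \cdots = a_n$.

3.2.8. Using (3.20), $\| v - w \|^2 = \langle v - w, v - w \rangle = \| v \|^2 - 2 \langle v, w \rangle + \| w \|^2 = \| v \|^2 + \| w \|^2 - 2 \| v \| \, \| w \| \cos \theta$.

3.2.9. Since $a \le | a |$ for any real number $a$, we have $\langle v, w \rangle \le | \langle v, w \rangle | \le \| v \| \, \| w \|$.

3.2.10. Expanding $\| v + w \|^2 - \| v - w \|^2 = 4 \langle v, w \rangle = 4 \| v \| \, \| w \| \cos \theta$.
[Figure: parallelogram with sides of length $\| v \|$, $\| w \|$ and diagonals of length $\| v + w \|$, $\| v - w \|$.]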

3.2.11.
(a) It is not an inner product. Bilinearity holds, but symmetry and positivity do not.
(b) Assuming $v, w \ne 0$, we compute
$\sin^2 \theta = 1 - \cos^2 \theta = \dfrac{\| v \|^2 \| w \|^2 - \langle v, w \rangle^2}{\| v \|^2 \| w \|^2} = \dfrac{(v \times w)^2}{\| v \|^2 \| w \|^2}$.
The result follows by taking square roots of both sides, where the sign is fixed by the orientation of the angle.
(c) By (b), $v \times w = 0$ if and only if $\sin \theta = 0$ or $v = 0$ or $w = 0$, which implies that they are parallel vectors. Alternative proof: $v \times w = \det \begin{pmatrix} v_1 & w_1 \\ v_2 & w_2 \end{pmatrix} = 0$ if and only if the columns $v, w$ of the matrix are linearly dependent, and hence parallel vectors.
(d) The parallelogram has side length $\| v \|$ and height $\| w \| \, | \sin \theta |$, so its area is $\| v \| \, \| w \| \, | \sin \theta | = | v \times w |$.
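3.2.12.
(a) $| \langle f, g \rangle | = 1 \le 1.03191 = \sqrt{\frac13} \, \sqrt{\frac12 e^2 - \frac12} = \| f \| \, \| g \|$;
(b) $| \langle f, g \rangle | = 2/e = .7358 \le 1.555 = \sqrt{\frac23} \, \sqrt{\frac12 (e^2 - e^{-2})} = \| f \| \, \| g \|$;
(c) $| \langle f, g \rangle | = \frac12 = .5 \le .5253 = \sqrt{2 - 5 e^{-1}} \, \sqrt{e - 1} = \| f \| \, \| g \|$.

3.2.13. (a) $\frac12 \pi$; (b) $\cos^{-1} \frac{2 \sqrt2}{\pi} = .450301$; (c) $\frac12 \pi$.

3.2.14. (a) $| \langle f, g \rangle | = \frac23 \le \sqrt{\frac{28}{45}} = \| f \| \, \| g \|$; (b) $| \langle f, g \rangle | = \frac12 \pi \le \frac{\pi}{\sqrt3} = \| f \| \, \| g \|$.

3.2.15. (a) $a = - \frac43$; (b) no.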

3.2.16. All scalar multiples of $\bigl( \frac12, - \frac74, 1 \bigr)^T$.

3.2.17. 3.2.15: $a = 0$. 3.2.16: all scalar multiples of $\bigl( \frac16, - \frac{21}{24}, 1 \bigr)^T$.

3.2.18. All vectors in the subspace spanned by $( 1, -2, 1, 0 )^T$, $( 2, -3, 0, 1 )^T$, so $v = a \, ( 1, -2, 1, 0 )^T + b \, ( 2, -3, 0, 1 )^T$.

3.2.19. $( -3, 0, 0, 1 )^T$, $\bigl( 0, - \frac32, 0, 1 \bigr)^T$, $( 0, 0, 3, 1 )^T$.

3.2.20. For example, $u = ( 1, 0, 0 )^T$, $v = ( 0, 1, 0 )^T$, $w = ( 0, 1, 1 )^T$ are linearly independent, whereas $u = ( 1, 0, 0 )^T$, $v = w = ( 0, 1, 0 )^T$ are linearly dependent.

3.2.21. (a) All solutions to $a + b = 1$; (b) all solutions to $a + 3 b = 2$.

3.2.22. Only the zero vector satisfies $\langle 0, 0 \rangle = \| 0 \|^2 = 0$.

3.2.23. Choose $v = w$; then $0 = \langle w, w \rangle = \| w \|^2$, and hence $w = 0$.

3.2.24. $\langle v + w, v - w \rangle = \| v \|^2 - \| w \|^2 = 0$ provided $\| v \| = \| w \|$. They can't both be unit vectors: $\| v + w \|^2 = \| v \|^2 + 2 \langle v, w \rangle + \| w \|^2 = 2 + 2 \langle v, w \rangle = 1$ if and only if $\langle v, w \rangle = - \frac12$, while $\| v - w \|^2 = 2 - 2 \langle v, w \rangle = 1$ if and only if $\langle v, w \rangle = \frac12$, and so $\theta = 60^\circ$.

3.2.25. If $\langle v, x \rangle = 0 = \langle v, y \rangle$, then $\langle v, c x + d y \rangle = c \langle v, x \rangle + d \langle v, y \rangle = 0$ for all $c, d \in R$, proving closure.
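3.2.26. (a) $\langle p_1, p_2 \rangle = \int_0^1 \bigl( x - \frac12 \bigr) dx = 0$, $\langle p_1, p_3 \rangle = \int_0^1 \bigl( x^2 - x + \frac16 \bigr) dx = 0$,
$\langle p_2, p_3 \rangle = \int_0^1 \bigl( x - \frac12 \bigr) \bigl( x^2 - x + \frac16 \bigr) dx = 0$.
(b) For $n \ne m$,
$\langle \sin n \pi x, \sin m \pi x \rangle = \int_0^1 \sin n \pi x \, \sin m \pi x \, dx = \int_0^1 \frac12 \bigl[ \cos (n - m) \pi x - \cos (n + m) \pi x \bigr] dx = 0$.

3.2.27. Any nonzero constant multiple of $x^2 - \frac13$.

3.2.28. $p(x) = a \bigl[ (e - 1) x - 1 \bigr] + b \bigl[ x^2 - (e - 2) x \bigr]$ for any $a, b \in R$.

3.2.29. $1$ is orthogonal to $x$, $\cos x$, $\sin x$; $x$ is orthogonal to $1$, $\cos x$; $\cos x$ is orthogonal to $1$, $x$, $\sin x$; $\sin x$ is orthogonal to $1$, $\cos x$; $e^x$ is not orthogonal to any of the others.

3.2.30. Example: $1$ and $x - \frac23$.

3.2.31. (a) $\theta = \cos^{-1} \frac{5}{\sqrt{84}} \approx 0.99376$ radians; (b) $v \cdot w = 5 < 9.165 \approx \sqrt{84} = \| v \| \, \| w \|$,
$\| v + w \| = \sqrt{30} \approx 5.477 < 6.191 \approx \sqrt{14} + \sqrt6 = \| v \| + \| w \|$; (c) $\bigl( - \frac73 t, - \frac13 t, t \bigr)^T$.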


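3.2.32.
(a) $\| v_1 + v_2 \| = 4 \le 2 \sqrt5 = \| v_1 \| + \| v_2 \|$;
(b) $\| v_1 + v_2 \| = \sqrt2 \le 2 \sqrt2 = \| v_1 \| + \| v_2 \|$;
(c) $\| v_1 + v_2 \| = \sqrt{14} \le \sqrt2 + \sqrt{12} = \| v_1 \| + \| v_2 \|$;
(d) $\| v_1 + v_2 \| = \sqrt3 \le \sqrt3 + \sqrt6 = \| v_1 \| + \| v_2 \|$;
(e) $\| v_1 + v_2 \| = \sqrt8 \le \sqrt{10} + \sqrt6 = \| v_1 \| + \| v_2 \|$.

3.2.33.
(a) $\| v_1 + v_2 \| = \sqrt5 \le \sqrt5 + \sqrt{10} = \| v_1 \| + \| v_2 \|$;
(b) $\| v_1 + v_2 \| = \sqrt6 \le 3 + \sqrt{19} = \| v_1 \| + \| v_2 \|$;
(c) $\| v_1 + v_2 \| = \sqrt{12} \le \sqrt{13} + \sqrt{43} = \| v_1 \| + \| v_2 \|$.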
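3.2.34.
(a) $\| f + g \| = \sqrt{\frac{11}{6} + \frac12 e^2} \approx 2.35114 \le 2.36467 \approx \sqrt{\frac13} + \sqrt{\frac12 e^2 - \frac12} = \| f \| + \| g \|$;
(b) $\| f + g \| = \sqrt{\frac23 + \frac12 e^2 + 4 e^{-1} - \frac12 e^{-2}} \approx 2.40105 \le 2.72093 \approx \sqrt{\frac23} + \sqrt{\frac12 (e^2 - e^{-2})} = \| f \| + \| g \|$;
(c) $\| f + g \| = \sqrt{2 + e - 5 e^{-1}} \approx 1.69673 \le 1.71159 \approx \sqrt{2 - 5 e^{-1}} + \sqrt{e - 1} = \| f \| + \| g \|$.

3.2.35.
(a) $\| f + g \| = \sqrt{\frac{133}{45}} \approx 1.71917 \le 1.78881 \approx 1 + \sqrt{\frac{28}{45}} = \| f \| + \| g \|$;
(b) $\| f + g \| = \sqrt{\frac73 \pi} \approx 2.70747 \le 2.79578 \approx \sqrt\pi + \sqrt{\frac13 \pi} = \| f \| + \| g \|$.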

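3.2.36.
(a) $\langle 1, x \rangle = 0$, so $\theta = \frac12 \pi$. Yes, they are orthogonal.
(b) Note $\| 1 \| = \sqrt2$, $\| x \| = \sqrt{\frac23}$, so $\langle 1, x \rangle = 0 < \sqrt{\frac43} = \| 1 \| \, \| x \|$, and
$\| 1 + x \| = \sqrt{\frac83} \approx 1.63299 < 2.23071 \approx \sqrt2 + \sqrt{\frac23} = \| 1 \| + \| x \|$.
(c) $\langle 1, p \rangle = 2 a + \frac23 c = 0$ and $\langle x, p \rangle = \frac23 b = 0$ if and only if $p(x) = c \bigl( x^2 - \frac13 \bigr)$.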

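3.2.37.
(a) $\Bigl| \int_0^1 f(x) g(x) e^x \, dx \Bigr| \le \sqrt{\int_0^1 f(x)^2 e^x \, dx} \; \sqrt{\int_0^1 g(x)^2 e^x \, dx}$,
$\sqrt{\int_0^1 \bigl[ f(x) + g(x) \bigr]^2 e^x \, dx} \le \sqrt{\int_0^1 f(x)^2 e^x \, dx} + \sqrt{\int_0^1 g(x)^2 e^x \, dx}$;
(b) $\langle f, g \rangle = \frac12 (e^2 - 1) = 3.1945 \le 3.3063 = \sqrt{e - 1} \, \sqrt{\frac13 (e^3 - 1)} = \| f \| \, \| g \|$,
$\| f + g \| = \sqrt{\frac13 e^3 + e^2 + e - \frac73} = 3.8038 \le 3.8331 = \sqrt{e - 1} + \sqrt{\frac13 (e^3 - 1)} = \| f \| + \| g \|$;
(c) $\cos \theta = \dfrac{\sqrt3}{2} \, \dfrac{e^2 - 1}{\sqrt{(e - 1)(e^3 - 1)}} = .9662$, so $\theta = .2607$.

3.2.38.
(a) $\Bigl| \int_0^1 \bigl[ f(x) g(x) + f'(x) g'(x) \bigr] dx \Bigr| \le \sqrt{\int_0^1 \bigl[ f(x)^2 + f'(x)^2 \bigr] dx} \; \sqrt{\int_0^1 \bigl[ g(x)^2 + g'(x)^2 \bigr] dx}$;
$\sqrt{\int_0^1 \bigl[ [ f(x) + g(x) ]^2 + [ f'(x) + g'(x) ]^2 \bigr] dx} \le \sqrt{\int_0^1 \bigl[ f(x)^2 + f'(x)^2 \bigr] dx} + \sqrt{\int_0^1 \bigl[ g(x)^2 + g'(x)^2 \bigr] dx}$.
(b) $\langle f, g \rangle = e - 1 \approx 1.7183 \le 2.5277 \approx 1 \cdot \sqrt{e^2 - 1} = \| f \| \, \| g \|$;
$\| f + g \| = \sqrt{e^2 + 2 e - 2} \approx 3.2903 \le 3.5277 \approx 1 + \sqrt{e^2 - 1} = \| f \| + \| g \|$.
(c) $\cos \theta = \sqrt{\dfrac{e - 1}{e + 1}} \approx .6798$, so $\theta \approx .8233$.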

3.2.39. Using the triangle inequality, $\| v \| = \| (v - w) + w \| \le \| v - w \| + \| w \|$. Therefore, $\| v - w \| \ge \| v \| - \| w \|$. Switching $v$ and $w$ proves $\| v - w \| \ge \| w \| - \| v \|$, and the result is the combination of both inequalities. In the figure, the inequality states that the length of the side $v - w$ of the triangle opposite the origin is at least as large as the difference between the other two lengths.
[Figure: triangle with sides $v$, $w$ and $v - w$.]

3.2.40. True. By the triangle inequality,
$\| w \| = \| ( - v ) + (w + v) \| \le \| - v \| + \| v + w \| = \| v \| + \| v + w \|$.
3.2.41.
(a) This follows immediately by identifying $R^\infty$ with the space of all functions $f \colon N \to R$ where $N = \{ 1, 2, 3, \ldots \}$ are the natural numbers. Or, one can tediously verify all the vector space axioms.
(b) If $x, y \in \ell^2$, then, by the triangle inequality on $R^n$,
$\sum_{k=1}^n (x_k + y_k)^2 \le \Bigl( \sqrt{\sum_{k=1}^n x_k^2} + \sqrt{\sum_{k=1}^n y_k^2} \Bigr)^2 \le \Bigl( \sqrt{\sum_{k=1}^\infty x_k^2} + \sqrt{\sum_{k=1}^\infty y_k^2} \Bigr)^2 < \infty$,
and hence, in the limit as $n \to \infty$, the series of non-negative terms is also bounded: $\sum_{k=1}^\infty (x_k + y_k)^2 < \infty$, proving that $x + y \in \ell^2$.
(c) $(1, 0, 0, 0, \ldots )$, $(1, \frac12, \frac14, \frac18, \ldots )$ and $(1, \frac12, \frac13, \frac14, \ldots )$ are in $\ell^2$, while $(1, 1, 1, 1, \ldots )$ and $(1, \frac{1}{\sqrt2}, \frac{1}{\sqrt3}, \frac{1}{\sqrt4}, \ldots )$ are not.
(d) True. Convergence of $\sum_{k=1}^\infty x_k^2$ requires $x_k^2 \to 0$ as $k \to \infty$ and hence $x_k \to 0$.
(e) False; see the last example in part (c).
(f) $\sum_{k=1}^\infty x_k^2 = \sum_{k=1}^\infty \alpha^{2k}$ is a geometric series which converges if and only if $| \alpha | < 1$.
(g) Using the integral test, $\sum_{k=1}^\infty x_k^2 = \sum_{k=1}^\infty k^{2 \alpha}$ converges if and only if $2 \alpha < -1$, so $\alpha < - \frac12$.
(h) First, we need to prove that it is well-defined, which we do by proving that the series is absolutely convergent. If $x, y \in \ell^2$, then, by the Cauchy-Schwarz inequality on $R^n$ for the vectors $( | x_1 |, \ldots, | x_n | )^T$, $( | y_1 |, \ldots, | y_n | )^T$,
$\sum_{k=1}^n | x_k y_k | \le \sqrt{\sum_{k=1}^n x_k^2} \, \sqrt{\sum_{k=1}^n y_k^2} \le \sqrt{\sum_{k=1}^\infty x_k^2} \, \sqrt{\sum_{k=1}^\infty y_k^2} < \infty$,
and hence, letting $n \to \infty$, we conclude that $\sum_{k=1}^\infty | x_k y_k | < \infty$. Bilinearity, symmetry and positivity are now straightforward to verify.
(i) $\sum_{k=1}^\infty | x_k y_k | \le \sqrt{\sum_{k=1}^\infty x_k^2} \, \sqrt{\sum_{k=1}^\infty y_k^2}$, $\quad \sqrt{\sum_{k=1}^\infty (x_k + y_k)^2} \le \sqrt{\sum_{k=1}^\infty x_k^2} + \sqrt{\sum_{k=1}^\infty y_k^2}$.

3.3.1. $\| v + w \|_1 = 2 \le 2 = 1 + 1 = \| v \|_1 + \| w \|_1$;
$\| v + w \|_2 = \sqrt2 \le 2 = 1 + 1 = \| v \|_2 + \| w \|_2$;
$\| v + w \|_3 = \sqrt[3]{2} \le 2 = 1 + 1 = \| v \|_3 + \| w \|_3$;
$\| v + w \|_\infty = 1 \le 2 = 1 + 1 = \| v \|_\infty + \| w \|_\infty$.

3.3.2.
(a) $\| v + w \|_1 = 6 \le 6 = 3 + 3 = \| v \|_1 + \| w \|_1$;
$\| v + w \|_2 = 3 \sqrt2 \le 2 \sqrt5 = \sqrt5 + \sqrt5 = \| v \|_2 + \| w \|_2$;
$\| v + w \|_3 = \sqrt[3]{54} \le 2 \sqrt[3]{9} = \sqrt[3]{9} + \sqrt[3]{9} = \| v \|_3 + \| w \|_3$;
$\| v + w \|_\infty = 3 \le 4 = 2 + 2 = \| v \|_\infty + \| w \|_\infty$.
(b) $\| v + w \|_1 = 2 \le 4 = 2 + 2 = \| v \|_1 + \| w \|_1$;
$\| v + w \|_2 = \sqrt2 \le 2 \sqrt2 = \sqrt2 + \sqrt2 = \| v \|_2 + \| w \|_2$;
$\| v + w \|_3 = \sqrt[3]{2} \le 2 \sqrt[3]{2} = \sqrt[3]{2} + \sqrt[3]{2} = \| v \|_3 + \| w \|_3$;
$\| v + w \|_\infty = 1 \le 2 = 1 + 1 = \| v \|_\infty + \| w \|_\infty$.
(c) $\| v + w \|_1 = 10 \le 10 = 4 + 6 = \| v \|_1 + \| w \|_1$;
$\| v + w \|_2 = \sqrt{34} \le \sqrt6 + \sqrt{14} = \| v \|_2 + \| w \|_2$;
$\| v + w \|_3 = \sqrt[3]{118} \le \sqrt[3]{10} + \sqrt[3]{36} = \| v \|_3 + \| w \|_3$;
$\| v + w \|_\infty = 4 \le 5 = 2 + 3 = \| v \|_\infty + \| w \|_\infty$.

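3.3.3.
(a) $\| u - v \|_1 = 5$, $\| u - w \|_1 = 6$, $\| v - w \|_1 = 7$, so $u, v$ are closest.
(b) $\| u - v \|_2 = \sqrt{13}$, $\| u - w \|_2 = \sqrt{12}$, $\| v - w \|_2 = \sqrt{21}$, so $u, w$ are closest.
(c) $\| u - v \|_\infty = 3$, $\| u - w \|_\infty = 2$, $\| v - w \|_\infty = 4$, so $u, w$ are closest.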
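3.3.4. (a) $\| f \|_\infty = \frac23$, $\| g \|_\infty = \frac14$; (b) $\| f + g \|_\infty = \frac23 \le \frac23 + \frac14 = \| f \|_\infty + \| g \|_\infty$.

3.3.5. (a) $\| f \|_1 = \frac{5}{18}$, $\| g \|_1 = \frac16$; (b) $\| f + g \|_1 = \frac{4}{9 \sqrt3} \le \frac{5}{18} + \frac16 = \| f \|_1 + \| g \|_1$.

3.3.6. (a) $\| f - g \|_1 = \frac12 = .5$, $\| f - h \|_1 = 1 - \frac{2}{\pi} = .36338$, $\| g - h \|_1 = \frac12 - \frac{1}{\pi} = .18169$, so $g, h$ are closest. (b) $\| f - g \|_2 = \sqrt{\frac13} = .57735$, $\| f - h \|_2 = \sqrt{\frac32 - \frac{4}{\pi}} = .47619$, $\| g - h \|_2 = \sqrt{\frac56 - \frac{2}{\pi}} = .44352$, so $g, h$ are closest. (c) $\| f - g \|_\infty = 1$, $\| f - h \|_\infty = 1$, $\| g - h \|_\infty = 1$, so they are equidistant.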

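3.3.7.
(a) $\| f + g \|_1 = \frac34 = .75 \le 1.3125 \approx 1 + \frac{5}{16} = \| f \|_1 + \| g \|_1$;
(b) $\| f + g \|_2 = \sqrt{\frac{31}{48}} \approx .8036 \le 1.3819 \approx 1 + \sqrt{\frac{7}{48}} = \| f \|_2 + \| g \|_2$;
(c) $\| f + g \|_3 = \frac{\sqrt[3]{39}}{4} \approx .8478 \le 1.4310 \approx 1 + \frac{\sqrt[3]{41}}{8} = \| f \|_3 + \| g \|_3$;
(d) $\| f + g \|_\infty = \frac54 = 1.25 \le 1.75 = 1 + \frac34 = \| f \|_\infty + \| g \|_\infty$.

3.3.8.
(a) $\| f + g \|_1 = e - e^{-1} \approx 2.3504 \le 2.3504 \approx (e - 1) + (1 - e^{-1}) = \| f \|_1 + \| g \|_1$;
(b) $\| f + g \|_2 = \sqrt{\frac12 e^2 + 2 - \frac12 e^{-2}} \approx 2.3721 \le 2.4448 \approx \sqrt{\frac12 e^2 - \frac12} + \sqrt{\frac12 - \frac12 e^{-2}} = \| f \|_2 + \| g \|_2$;
(c) $\| f + g \|_3 = \sqrt[3]{\frac13 e^3 + 3 e - 3 e^{-1} - \frac13 e^{-3}} \approx 2.3945 \le 2.5346 \approx \sqrt[3]{\frac13 e^3 - \frac13} + \sqrt[3]{\frac13 - \frac13 e^{-3}} = \| f \|_3 + \| g \|_3$;
(d) $\| f + g \|_\infty = e + e^{-1} \approx 3.08616 \le 3.71828 \approx e + 1 = \| f \|_\infty + \| g \|_\infty$.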

3.3.9. Positivity: since both summands are non-negative, $\| x \| \ge 0$. Moreover, $\| x \| = 0$ if and only if $x = 0 = x - y$, and so $x = ( x, y )^T = 0$.
Homogeneity: $\| c x \| = | c x | + 2 | c x - c y | = | c | \bigl( | x | + 2 | x - y | \bigr) = | c | \, \| x \|$.
Triangle inequality, with $v = ( v, w )^T$: $\| x + v \| = | x + v | + 2 | x + v - y - w | \le \bigl( | x | + | v | \bigr) + 2 \bigl( | x - y | + | v - w | \bigr) = \| x \| + \| v \|$.

3.3.10.
(a) Comes from weighted inner product $\langle v, w \rangle = 2 v_1 w_1 + 3 v_2 w_2$.
(b) Comes from inner product $\langle v, w \rangle = 2 v_1 w_1 - \frac12 v_1 w_2 - \frac12 v_2 w_1 + 2 v_2 w_2$; positivity follows because $\langle v, v \rangle = 2 \bigl( v_1 - \frac14 v_2 \bigr)^2 + \frac{15}{8} v_2^2$.
(c) Clearly positive; $\| c v \| = 2 | c v_1 | + | c v_2 | = | c | \bigl( 2 | v_1 | + | v_2 | \bigr) = | c | \, \| v \|$;
$\| v + w \| = 2 | v_1 + w_1 | + | v_2 + w_2 | \le 2 | v_1 | + | v_2 | + 2 | w_1 | + | w_2 | = \| v \| + \| w \|$.
(d) Clearly positive; $\| c v \| = \max \{ 2 | c v_1 |, | c v_2 | \} = | c | \max \{ 2 | v_1 |, | v_2 | \} = | c | \, \| v \|$;
$\| v + w \| = \max \{ 2 | v_1 + w_1 |, | v_2 + w_2 | \} \le \max \{ 2 | v_1 | + 2 | w_1 |, | v_2 | + | w_2 | \} \le \max \{ 2 | v_1 |, | v_2 | \} + \max \{ 2 | w_1 |, | w_2 | \} = \| v \| + \| w \|$.
(e) Clearly non-negative and equals zero if and only if $v_1 - v_2 = 0 = v_1 + v_2$, so $v = 0$;
$\| c v \| = \max \{ | c v_1 - c v_2 |, | c v_1 + c v_2 | \} = | c | \max \{ | v_1 - v_2 |, | v_1 + v_2 | \} = | c | \, \| v \|$;
$\| v + w \| = \max \{ | v_1 + w_1 - v_2 - w_2 |, | v_1 + w_1 + v_2 + w_2 | \} \le \max \{ | v_1 - v_2 | + | w_1 - w_2 |, | v_1 + v_2 | + | w_1 + w_2 | \} \le \max \{ | v_1 - v_2 |, | v_1 + v_2 | \} + \max \{ | w_1 - w_2 |, | w_1 + w_2 | \} = \| v \| + \| w \|$.
(f) Clearly non-negative and equals zero if and only if $v_1 - v_2 = 0 = v_1 + v_2$, so $v = 0$;
$\| c v \| = | c v_1 - c v_2 | + | c v_1 + c v_2 | = | c | \bigl( | v_1 - v_2 | + | v_1 + v_2 | \bigr) = | c | \, \| v \|$;
$\| v + w \| = | v_1 + w_1 - v_2 - w_2 | + | v_1 + w_1 + v_2 + w_2 | \le | v_1 - v_2 | + | v_1 + v_2 | + | w_1 - w_2 | + | w_1 + w_2 | = \| v \| + \| w \|$.

3.3.11. Parts (a), (c) and (e) define norms. (b) doesn't since, for instance, $\| ( 1, 1, 0 )^T \| = 0$. (d) doesn't since, for instance, $\| ( 1, 1, 1 )^T \| = 0$.

3.3.12. Clearly if $v = 0$, then $w = 0$ since only the zero vector has norm 0. If $v \ne 0$, then $w = c v$, and $\| w \| = | c | \, \| v \| = \| v \|$ if and only if $| c | = 1$.

3.3.13. True for an inner product norm, but false in general. For example, $\| e_1 + e_2 \|_1 = 2 = \| e_1 \|_1 + \| e_2 \|_1$.

3.3.14. If $x = ( 1, 0 )^T$, $y = ( 0, 1 )^T$, say, then $\| x + y \|_\infty^2 + \| x - y \|_\infty^2 = 1 + 1 = 2 \ne 4 = 2 \bigl( \| x \|_\infty^2 + \| y \|_\infty^2 \bigr)$, which contradicts the identity in Exercise 3.1.12.

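3.3.15. No. Neither result satisfies the bilinearity property. For example, if $v = ( 1, 0 )^T$, $w = ( 1, 1 )^T$, then
$\langle 2 v, w \rangle = \frac14 \bigl( \| 2 v + w \|_1^2 - \| 2 v - w \|_1^2 \bigr) = 3 \ne 2 \langle v, w \rangle = \frac12 \bigl( \| v + w \|_1^2 - \| v - w \|_1^2 \bigr) = 4$,
$\langle 2 v, w \rangle = \frac14 \bigl( \| 2 v + w \|_\infty^2 - \| 2 v - w \|_\infty^2 \bigr) = 2 \ne 2 \langle v, w \rangle = \frac12 \bigl( \| v + w \|_\infty^2 - \| v - w \|_\infty^2 \bigr) = \frac32$.

3.3.16. Let $m = \| v \|_\infty = \max \{ | v_1 |, \ldots, | v_n | \}$. Then $\| v \|_p = m \Bigl( \sum_{i=1}^n \bigl( \frac{| v_i |}{m} \bigr)^p \Bigr)^{1/p}$. Now if $| v_i | < m$ then $\bigl( \frac{| v_i |}{m} \bigr)^p \to 0$ as $p \to \infty$. Therefore, $\| v \|_p \sim m \, k^{1/p} \to m$ as $p \to \infty$, where $1 \le k$ is the number of entries in $v$ with $| v_i | = m$.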
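The limit in Exercise 3.3.16 is easy to observe numerically. The sketch below uses the same factored form $m \, ( \sum ( | v_i | / m )^p )^{1/p}$, which also avoids overflow for large $p$; the function name and the sample vector are assumptions chosen for illustration.

```python
def p_norm(v, p):
    """p-norm computed as m * (sum (|v_i| / m)^p)^(1/p), with m = max |v_i|."""
    m = max(abs(x) for x in v)
    if m == 0:
        return 0.0
    return m * sum((abs(x) / m) ** p for x in v) ** (1.0 / p)


# Assumed sample vector: m = 4 is attained by k = 2 entries.
v = [3.0, -4.0, 4.0, 1.0]
for p in (1, 2, 10, 100, 1000):
    print(p, p_norm(v, p))
```

For this vector the printed values should decrease toward $m = 4$, tracking $4 \cdot 2^{1/p}$ once $p$ is large.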

3.3.17.
(a) $\| f + g \|_1 = \int_a^b | f(x) + g(x) | \, dx \le \int_a^b \bigl[ | f(x) | + | g(x) | \bigr] dx = \int_a^b | f(x) | \, dx + \int_a^b | g(x) | \, dx = \| f \|_1 + \| g \|_1$.
(b) $\| f + g \|_\infty = \max | f(x) + g(x) | \le \max \bigl( | f(x) | + | g(x) | \bigr) \le \max | f(x) | + \max | g(x) | = \| f \|_\infty + \| g \|_\infty$.

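3.3.18.
(a) Positivity follows since the integrand is non-negative; further,
$\| c f \|_{1,w} = \int_a^b | c f(x) | \, w(x) \, dx = | c | \int_a^b | f(x) | \, w(x) \, dx = | c | \, \| f \|_{1,w}$;
$\| f + g \|_{1,w} = \int_a^b | f(x) + g(x) | \, w(x) \, dx \le \int_a^b | f(x) | \, w(x) \, dx + \int_a^b | g(x) | \, w(x) \, dx = \| f \|_{1,w} + \| g \|_{1,w}$.
(b) Positivity is immediate; further,
$\| c f \|_{\infty,w} = \max_{a \le x \le b} \bigl\{ | c f(x) | \, w(x) \bigr\} = | c | \max_{a \le x \le b} \bigl\{ | f(x) | \, w(x) \bigr\} = | c | \, \| f \|_{\infty,w}$;
$\| f + g \|_{\infty,w} = \max_{a \le x \le b} \bigl\{ | f(x) + g(x) | \, w(x) \bigr\} \le \max_{a \le x \le b} \bigl\{ | f(x) | \, w(x) \bigr\} + \max_{a \le x \le b} \bigl\{ | g(x) | \, w(x) \bigr\} = \| f \|_{\infty,w} + \| g \|_{\infty,w}$.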

3.3.19.
(a) Clearly positive; $\| c v \| = \max \{ \| c v \|_1, \| c v \|_2 \} = | c | \max \{ \| v \|_1, \| v \|_2 \} = | c | \, \| v \|$;
$\| v + w \| = \max \{ \| v + w \|_1, \| v + w \|_2 \} \le \max \{ \| v \|_1 + \| w \|_1, \| v \|_2 + \| w \|_2 \} \le \max \{ \| v \|_1, \| v \|_2 \} + \max \{ \| w \|_1, \| w \|_2 \} = \| v \| + \| w \|$.
(b) No. The triangle inequality is not necessarily valid. For example, in $R^2$ set $\| v \|_1 = | x | + | y |$, $\| v \|_2 = \frac32 \max \{ | x |, | y | \}$. Then if $v = ( 1, .4 )^T$, $w = ( 1, .6 )^T$, then $\| v \| = \min \{ \| v \|_1, \| v \|_2 \} = 1.4$, $\| w \| = \min \{ \| w \|_1, \| w \|_2 \} = 1.5$, but $\| v + w \| = \min \{ \| v + w \|_1, \| v + w \|_2 \} = 3 > 2.9 = \| v \| + \| w \|$.
(c) Yes.
(d) No. The triangle inequality is not necessarily valid. For example, if $v = ( 1, 1 )^T$, $w = ( 1, 0 )^T$, so $v + w = ( 2, 1 )^T$, then $\sqrt{\| v + w \|_1 \, \| v + w \|_\infty} = \sqrt6 > \sqrt2 + 1 = \sqrt{\| v \|_1 \, \| v \|_\infty} + \sqrt{\| w \|_1 \, \| w \|_\infty}$.

3.3.20. (a) $\bigl( \frac{1}{\sqrt{14}}, \frac{2}{\sqrt{14}}, \frac{3}{\sqrt{14}} \bigr)^T$; (b) $\bigl( \frac13, \frac23, 1 \bigr)^T$; (c) $\bigl( \frac16, \frac13, \frac12 \bigr)^T$; (d) $\bigl( \frac13, \frac23, 1 \bigr)^T$; (e) $\bigl( \frac16, \frac13, \frac12 \bigr)^T$.
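The counterexample in Exercise 3.3.19(b) can be replayed directly. This Python sketch uses the two norms and the two vectors named in that part; the function names are hypothetical.

```python
def norm_1(v):
    """The 1-norm |x| + |y|."""
    return sum(abs(x) for x in v)


def scaled_max(v):
    """The second norm of the example: (3/2) max{|x|, |y|}."""
    return 1.5 * max(abs(x) for x in v)


def candidate(v):
    """Minimum of the two norms, which fails the triangle inequality."""
    return min(norm_1(v), scaled_max(v))


v, w = (1.0, 0.4), (1.0, 0.6)
s = (v[0] + w[0], v[1] + w[1])
print(candidate(v), candidate(w), candidate(s))
print(candidate(s) > candidate(v) + candidate(w))
```

The last line reports whether the triangle inequality is violated for this pair.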

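3.3.21.
(a) $\| v \|_2^2 = \cos^2 \theta \cos^2 \phi + \cos^2 \theta \sin^2 \phi + \sin^2 \theta = \cos^2 \theta + \sin^2 \theta = 1$;
(b) $\| v \|_2^2 = \frac12 ( \cos^2 \theta + \sin^2 \theta + \cos^2 \phi + \sin^2 \phi ) = 1$;
(c) $\| v \|_2^2 = \cos^2 \theta \cos^2 \phi \cos^2 \psi + \cos^2 \theta \cos^2 \phi \sin^2 \psi + \cos^2 \theta \sin^2 \phi + \sin^2 \theta = \cos^2 \theta \cos^2 \phi + \cos^2 \theta \sin^2 \phi + \sin^2 \theta = \cos^2 \theta + \sin^2 \theta = 1$.

3.3.22. 2 vectors, namely $u = v / \| v \|$ and $- u = - v / \| v \|$.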
3.3.23. (a), (b), (c): [figures omitted]

3.3.24. (a), (b), (c), (d), (e), (f): [figures omitted]

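3.3.25.
(a) Unit octahedron. (b) Unit cube. (c) Ellipsoid with semi-axes $\frac{1}{\sqrt2}, 1, \frac{1}{\sqrt3}$. (d) [figures omitted]
In the last case, the corners of the top face of the parallelopiped are at $v_1 = \bigl( \frac12, \frac12, \frac32 \bigr)^T$, $v_2 = \bigl( \frac32, \frac12, \frac12 \bigr)^T$, $v_3 = \bigl( \frac12, \frac32, \frac12 \bigr)^T$, $v_4 = \bigl( \frac12, \frac12, \frac12 \bigr)^T$, while the corners of the bottom (hidden) face are $- v_1, - v_2, - v_3, - v_4$.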

3.3.26. Define $||| x ||| = \| x \| / \| v \|$ for any $x \in V$.

3.3.27. True. Having the same unit sphere means that $\| u \|_1 = 1$ whenever $\| u \|_2 = 1$. If $v \ne 0$ is any other nonzero vector space element, then $u = v / \| v \|_1$ satisfies $1 = \| u \|_1 = \| u \|_2$, and so $\| v \|_2 = \| \, \| v \|_1 u \, \|_2 = \| v \|_1 \| u \|_2 = \| v \|_1$. Finally $\| 0 \|_1 = 0 = \| 0 \|_2$, and so the norms agree on all vectors in $V$.
3.3.28. (a) $\frac{18}{5} x - \frac65$; (b) $3 x - 1$; (c) $\frac32 x - \frac12$; (d) $\frac{9}{10} x - \frac{3}{10}$; (e) $\frac{3}{2 \sqrt2} x - \frac{1}{2 \sqrt2}$; (f) $\frac34 x - \frac14$.

3.3.29. (a), (b), (c), (f), (i). In cases (g), (h), the norm of $f$ is not finite.

3.3.30. If $\| x \|, \| y \| \le 1$ and $0 \le t \le 1$, then, by the triangle inequality, $\| t x + (1 - t) y \| \le t \| x \| + (1 - t) \| y \| \le 1$. The unit sphere is not convex since, for instance, $\frac12 x + \frac12 ( - x ) = 0 \notin S_1$ when $x, - x \in S_1$.

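3.3.31.
(a) $\| v \|_2 = \sqrt2$, $\| v \|_\infty = 1$, and $\frac{1}{\sqrt2} \sqrt2 \le 1 \le \sqrt2$;
(b) $\| v \|_2 = \sqrt{14}$, $\| v \|_\infty = 3$, and $\frac{1}{\sqrt3} \sqrt{14} \le 3 \le \sqrt{14}$;
(c) $\| v \|_2 = 2$, $\| v \|_\infty = 1$, and $\frac12 \cdot 2 \le 1 \le 2$;
(d) $\| v \|_2 = 2 \sqrt2$, $\| v \|_\infty = 2$, and $\frac{1}{\sqrt5} \, 2 \sqrt2 \le 2 \le 2 \sqrt2$.

3.3.32. (a) $v = ( a, 0 )^T$ or $( 0, a )^T$; (b) $v = ( a, 0 )^T$ or $( 0, a )^T$; (c) $v = ( a, 0 )^T$ or $( 0, a )^T$; (d) $v = ( a, a )^T$ or $( a, - a )^T$.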
3.3.33. Let $0 < \varepsilon \ll 1$ be small. First, if the entries satisfy $| v_j | < \varepsilon$ for all $j$, then $\| v \|_\infty = \max \{ | v_j | \} < \varepsilon$; conversely, if $\| v \|_\infty < \varepsilon$, then $| v_j | \le \| v \|_\infty < \varepsilon$ for any $j$. Thus, the entries of $v$ are small if and only if its $\infty$ norm is small. Furthermore, by the equivalence of norms, any other norm satisfies $c \| v \| \le \| v \|_\infty \le C \| v \|$ where $C, c > 0$ are fixed. Thus, if $\| v \| < \varepsilon$ is small, then its entries, $| v_j | \le \| v \|_\infty \le C \varepsilon$ are also proportionately small, while if the entries are all bounded by $| v_j | < \varepsilon$, then $\| v \| \le \frac{1}{c} \| v \|_\infty \le \frac{1}{c} \varepsilon$ is also proportionately small.

3.3.34. If $| v_i | = \| v \|_\infty$ is the maximal entry, so $| v_j | \le | v_i |$ for all $j$, then
$\| v \|_\infty^2 = v_i^2 \le \| v \|_2^2 = v_1^2 + \cdots + v_n^2 \le n v_i^2 = n \| v \|_\infty^2$.

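3.3.35.
(i) $\| v \|_1^2 = \Bigl( \sum_{i=1}^n | v_i | \Bigr)^2 = \sum_{i=1}^n | v_i |^2 + 2 \sum_{i<j} | v_i | \, | v_j | \ge \sum_{i=1}^n | v_i |^2 = \| v \|_2^2$.
On the other hand, since $2 x y \le x^2 + y^2$,
$\| v \|_1^2 = \sum_{i=1}^n | v_i |^2 + 2 \sum_{i<j} | v_i | \, | v_j | \le n \sum_{i=1}^n | v_i |^2 = n \| v \|_2^2$.
(ii) (a) $\| v \|_2 = \sqrt2$, $\| v \|_1 = 2$, and $\sqrt2 \le 2 \le \sqrt2 \, \sqrt2$;
(b) $\| v \|_2 = \sqrt{14}$, $\| v \|_1 = 6$, and $\sqrt{14} \le 6 \le \sqrt3 \, \sqrt{14}$;
(c) $\| v \|_2 = 2$, $\| v \|_1 = 4$, and $2 \le 4 \le 2 \cdot 2$;
(d) $\| v \|_2 = 2 \sqrt2$, $\| v \|_1 = 6$, and $2 \sqrt2 \le 6 \le \sqrt5 \cdot 2 \sqrt2$.
(iii) (a) $v = c \, e_j$ for some $j = 1, \ldots, n$; (b) $| v_1 | = | v_2 | = \cdots = | v_n |$.

3.3.36.
(i) $\| v \|_\infty \le \| v \|_1 \le n \| v \|_\infty$.
(ii) (a) $\| v \|_\infty = 1$, $\| v \|_1 = 2$, and $1 \le 2 \le 2 \cdot 1$;
(b) $\| v \|_\infty = 3$, $\| v \|_1 = 6$, and $3 \le 6 \le 3 \cdot 3$;
(c) $\| v \|_\infty = 1$, $\| v \|_1 = 4$, and $1 \le 4 \le 4 \cdot 1$;
(d) $\| v \|_\infty = 2$, $\| v \|_1 = 6$, and $2 \le 6 \le 5 \cdot 2$.
(iii) $\| v \|_\infty = \| v \|_1$ if and only if $v = c \, e_j$ for some $j = 1, \ldots, n$; $\| v \|_1 = n \| v \|_\infty$ if and only if $| v_1 | = | v_2 | = \cdots = | v_n |$.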
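The inequalities of Exercises 3.3.34 to 3.3.36 can be checked together on any concrete vector. In the Python sketch below the helper names are hypothetical, and the sample vector is an assumption chosen only because it reproduces the norms quoted in part (b) (namely $6$, $\sqrt{14}$ and $3$).

```python
import math


def norms(v):
    """Return the 1-norm, 2-norm and infinity-norm of v."""
    n1 = sum(abs(x) for x in v)
    n2 = math.sqrt(sum(x * x for x in v))
    ninf = max(abs(x) for x in v)
    return n1, n2, ninf


def comparison_chain_holds(v, tol=1e-12):
    """Check inf <= 2 <= 1 <= sqrt(n) * 2-norm and 1-norm <= n * inf-norm."""
    n = len(v)
    n1, n2, ninf = norms(v)
    return (ninf <= n2 + tol and n2 <= n1 + tol
            and n1 <= math.sqrt(n) * n2 + tol and n1 <= n * ninf + tol)


sample = (1.0, -2.0, 3.0)  # assumed vector, not taken from the exercise
print(norms(sample), comparison_chain_holds(sample))
```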
3.3.37. In each case, we minimize and maximize $\| ( \cos \theta, \sin \theta )^T \|$ for $0 \le \theta \le 2 \pi$:
(a) $c^\star = \sqrt2$, $C^\star = \sqrt3$; (b) $c^\star = 1$, $C^\star = \sqrt2$.

3.3.38. First, $| v_i | \le \| v \|_\infty$. Furthermore, Theorem 3.17 implies $\| v \|_\infty \le C \| v \|$, which proves the result.

3.3.39. Equality implies that $\| u \|_2 = c^\star$ for all vectors $u$ with $\| u \|_1 = 1$. But then if $v \ne 0$ is any other vector, setting $u = v / \| v \|_1$, we find $\| v \|_2 = \| v \|_1 \| u \|_2 = c^\star \| v \|_1$, and hence the norms are merely constant multiples of each other.

3.3.40. If $C = \| f \|_\infty$, then $| f(x) | \le C$ for all $a \le x \le b$. Therefore,
$\| f \|_2^2 = \int_a^b f(x)^2 \, dx \le \int_a^b C^2 \, dx = (b - a) C^2 = (b - a) \| f \|_\infty^2$.

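3.3.41.
(a) The maximum (absolute) value of $f_n(x)$ is $1 = \| f_n \|_\infty$. On the other hand,
$\| f_n \|_2 = \sqrt{\int_{-\infty}^{\infty} | f_n(x) |^2 \, dx} = \sqrt{\int_{-n}^{n} dx} = \sqrt{2 n} \to \infty$.
(b) Suppose there exists a constant $C$ such that $\| f \|_2 \le C \| f \|_\infty$ for all functions. Then, in particular, $\sqrt{2 n} = \| f_n \|_2 \le C \| f_n \|_\infty = C$ for all $n$, which is impossible.
(c) First, $\| f_n \|_2 = \sqrt{\int_{-\infty}^{\infty} | f_n(x) |^2 \, dx} = \sqrt{\int_{-1/n}^{1/n} \frac{n}{2} \, dx} = 1$. On the other hand, the maximum (absolute) value of $f_n(x)$ is $\| f_n \|_\infty = \sqrt{n/2} \to \infty$. Arguing as in part (b), we conclude that there is no constant $C$ such that $\| f \|_\infty \le C \| f \|_2$.
(d) (i) $f_n(x) = \begin{cases} \frac{n}{2}, & - \frac1n \le x \le \frac1n, \\ 0, & \text{otherwise}, \end{cases}$ has $\| f_n \|_1 = 1$, $\| f_n \|_\infty = \frac{n}{2} \to \infty$;
(ii) $f_n(x) = \begin{cases} \frac{1}{\sqrt{2 n}}, & - n \le x \le n, \\ 0, & \text{otherwise}, \end{cases}$ has $\| f_n \|_2 = 1$, $\| f_n \|_1 = \sqrt{2 n} \to \infty$;
(iii) $f_n(x) = \begin{cases} \frac{n}{2}, & - \frac1n \le x \le \frac1n, \\ 0, & \text{otherwise}, \end{cases}$ has $\| f_n \|_1 = 1$, $\| f_n \|_2 = \sqrt{\frac{n}{2}} \to \infty$.

3.3.42.
(a) We can't use the functions in Exercise 3.3.41 directly since they are not continuous. Instead, consider the continuous functions
$f_n(x) = \begin{cases} \sqrt{n} \, (1 - n | x |), & - \frac1n \le x \le \frac1n, \\ 0, & \text{otherwise}. \end{cases}$
Then $\| f_n \|_\infty = \sqrt{n}$, while $\| f_n \|_2 = \sqrt{\int_{-1/n}^{1/n} n (1 - n | x |)^2 \, dx} = \sqrt{\frac23}$. Thus, there is no constant $C$ such that $\| f \|_\infty \le C \| f \|_2$, as otherwise $\sqrt{n} = \| f_n \|_\infty \le C \| f_n \|_2 = \sqrt{\frac23} \, C$ for all $n$, which is impossible.
(b) Yes: since, by the definition of the $L^\infty$ norm, $| f(x) | \le \| f \|_\infty$ for all $-1 \le x \le 1$,
$\| f \|_2 = \sqrt{\int_{-1}^{1} | f(x) |^2 \, dx} \le \sqrt{\int_{-1}^{1} \| f \|_\infty^2 \, dx} = \sqrt2 \, \| f \|_\infty$.
(c) They are not equivalent. The functions
$f_n(x) = \begin{cases} n (1 - n | x |), & - \frac1n \le x \le \frac1n, \\ 0, & \text{otherwise}, \end{cases}$
are continuous and satisfy $\| f_n \|_\infty = n$, while $\| f_n \|_1 = 1$. Thus, there is no constant $C$ such that $\| f \|_\infty \le C \| f \|_1$ for all $f \in C^0 [ -1, 1 ]$.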
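The behaviour of the tent functions in part (c) can be observed numerically. The following Python sketch approximates the $L^1$ norm with a midpoint rule and the $L^\infty$ norm by sampling a uniform grid on $[ -1, 1 ]$; the function names and grid size are assumptions, and the quadrature is an illustration rather than part of the argument.

```python
def tent(n):
    """f_n(x) = n (1 - n |x|) for |x| <= 1/n, and 0 otherwise."""
    def f(x):
        return n * (1 - n * abs(x)) if abs(x) <= 1.0 / n else 0.0
    return f


def l1_norm(f, a=-1.0, b=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of |f| over [a, b]."""
    h = (b - a) / steps
    return sum(abs(f(a + (i + 0.5) * h)) for i in range(steps)) * h


def sup_norm(f, a=-1.0, b=1.0, steps=20000):
    """Maximum of |f| over a uniform grid that includes x = 0."""
    h = (b - a) / steps
    return max(abs(f(a + i * h)) for i in range(steps + 1))


for n in (1, 10, 50):
    f = tent(n)
    print(n, l1_norm(f), sup_norm(f))
```

The first printed column of norms should stay near 1 while the second grows like $n$, which is why no bound $\| f \|_\infty \le C \| f \|_1$ can hold.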

3.3.43. First, since $\langle v, w \rangle$ is easily shown to be bilinear and symmetric, the only issue is positivity: Is $0 < \langle v, v \rangle = \alpha \| v \|_1^2 + \beta \| v \|_2^2$ for all $0 \ne v \in V$? Let $\mu = \min \| v \|_2 / \| v \|_1$ over all $0 \ne v \in V$. Then $\langle v, v \rangle = \alpha \| v \|_1^2 + \beta \| v \|_2^2 \ge ( \alpha + \beta \mu^2 ) \| v \|_1^2 > 0$ provided $\alpha + \beta \mu^2 > 0$. Conversely, if $\alpha + \beta \mu^2 \le 0$ and $0 \ne v$ achieves the minimum value, so $\| v \|_2 = \mu \| v \|_1$, then $\langle v, v \rangle \le 0$. (If there is no $v$ that actually achieves the minimum value, then one can also allow $\alpha + \beta \mu^2 = 0$.)

In infinite-dimensional situations, one should replace the minimum by the infimum, since the minimum value may not be achieved.

3.4.1. (a) Positive definite: $\langle v, w \rangle = v_1 w_1 + 2 v_2 w_2$; (b) not positive definite; (c) not positive definite; (d) not positive definite; (e) positive definite: $\langle v, w \rangle = v_1 w_1 - v_1 w_2 - v_2 w_1 + 3 v_2 w_2$; (f) not positive definite.

3.4.2. For instance, $q(1, 0) = 1$, while $q(2, -1) = -1$.

3.4.3.
(a) The associated quadratic form $q(x) = x^T D x = c_1 x_1^2 + c_2 x_2^2 + \cdots + c_n x_n^2$ is a sum of squares. If all $c_i > 0$, then $q(x) > 0$ for $x \ne 0$, since $q(x)$ is a sum of non-negative terms, at least one of which is strictly positive. If all $c_i \ge 0$, then, by the same reasoning, $D$ is positive semi-definite. If all the $c_i < 0$ are negative, then $D$ is negative definite. If $D$ has both positive and negative diagonal entries, then it is indefinite.
(b) $\langle v, w \rangle = v^T D w = c_1 v_1 w_1 + c_2 v_2 w_2 + \cdots + c_n v_n w_n$, which is the weighted inner product (3.10).

3.4.4. (a) $k_{ii} = e_i^T K e_i > 0$. (b) For example, $K = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$ is not positive definite or even semi-definite. (c) For example, $K = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$.
3.4.5.
$| 4 x_1 y_1 - 2 x_1 y_2 - 2 x_2 y_1 + 3 x_2 y_2 | \le \sqrt{4 x_1^2 - 4 x_1 x_2 + 3 x_2^2} \; \sqrt{4 y_1^2 - 4 y_1 y_2 + 3 y_2^2}$,
$\sqrt{4 (x_1 + y_1)^2 - 4 (x_1 + y_1)(x_2 + y_2) + 3 (x_2 + y_2)^2} \le \sqrt{4 x_1^2 - 4 x_1 x_2 + 3 x_2^2} + \sqrt{4 y_1^2 - 4 y_1 y_2 + 3 y_2^2}$.

3.4.6. First, $(c\,K)^T = c\,K^T = c\,K$ is symmetric. Second, $x^T (c\,K)\,x = c\,x^T K\,x > 0$ for any $x \ne 0$, since $c > 0$ and $K > 0$.

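3.4.7. (a) $x^T (K + L)\,x = x^T K\,x + x^T L\,x > 0$ for all $x \ne 0$, since both summands are strictly positive. (b) For example, $K = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix}$, $L = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}$, with $K + L = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} > 0$.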
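3.4.8. $\begin{pmatrix} 3 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 4 \end{pmatrix} = \begin{pmatrix} 4 & 7 \\ 2 & 5 \end{pmatrix}$ is not even symmetric. Even the associated quadratic form $(\,x\ \ y\,) \begin{pmatrix} 4 & 7 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 4 x^2 + 9 x y + 5 y^2$ is not positive definite.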
3.4.9. Example: $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.

3.4.10.
(a) Since $K^{-1}$ is also symmetric, $x^T K^{-1} x = x^T K^{-1} K\,K^{-1} x = (K^{-1} x)^T K\,(K^{-1} x) = y^T K\,y$, where $y = K^{-1} x$.
(b) If $K > 0$, then $y^T K\,y > 0$ for all $y = K^{-1} x \ne 0$, and hence $x^T K^{-1} x > 0$ for all $x \ne 0$.
3.4.11. It suffices to note that $K > 0$ if and only if $\cos\theta = \dfrac{v \cdot K v}{\|v\|\,\|K v\|} = \dfrac{v^T K\,v}{\|v\|\,\|K v\|} > 0$ for all $v \ne 0$, which holds if and only if $|\theta| < \frac12 \pi$.
3.4.12. If $q(x) = x^T K\,x$ with $K^T = K$, then
$q(x + y) - q(x) - q(y) = (x + y)^T K (x + y) - x^T K\,x - y^T K\,y = 2\,x^T K\,y = 2\,\langle x, y\rangle$.

3.4.13. (a) No, by continuity. Or, equivalently, $q(c\,x_+) = c^2\,q(x_+) > 0$ for any $c \ne 0$, so $q$ is positive at any nonzero scalar multiple of $x_+$. (b) In view of the preceding calculation, this holds if and only if $q(x)$ is either positive or negative definite and $x_0 = 0$.


3.4.14.
(a) The quadratic form for $K = -N$ is $x^T K\,x = -\,x^T N\,x > 0$ for all $x \ne 0$.
(b) $a < 0$ and $\det N = a c - b^2 > 0$.
!
1
1
is negative definite. The others are not.
(c) The matrix
1 2
1
3.4.15. x K x = ( 1 1 )
1
T

= 0, but K x =

1
2

2
3

1
1

1
1

6= 0.

3.4.16. If $q(x) > 0$ and $q(y) < 0$, then the scalar function $f(t) = q(t\,x + (1 - t)\,y)$ satisfies $f(0) = q(y) < 0$ and $f(1) = q(x) > 0$, so, by continuity, there is a point $0 < t_\star < 1$ such that $f(t_\star) = 0$ and hence, setting $z = t_\star x + (1 - t_\star)\,y$ gives $q(z) = 0$. Moreover, $z \ne 0$, as otherwise $x = c\,y$, $c = 1 - 1/t_\star$, would be parallel vectors, but then $q(x) = c^2 q(y)$ would have the same sign.
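3.4.17. (a) False. For example, the nonsingular matrix $K = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ has null directions, e.g., $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$. (b) True; see Exercise 3.4.16.

3.4.18.
(a) $x^2 - y^2 = (x - y)(x + y) = 0$: [Figure: plot of the two lines $x - y = 0$ and $x + y = 0$.]
(b) $x^2 + 4 x y + 3 y^2 = (x + y)(x + 3 y) = 0$: [Figure: plot of the two lines $x + y = 0$ and $x + 3 y = 0$.]
(c) $x^2 - y^2 - z^2 = 0$: [Figure: plot of the cone.]
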

3.4.19.
(a) First, $k_{ii} = e_i^T K\,e_i = e_i^T L\,e_i = l_{ii}$, so their diagonal entries are equal. Further,
$k_{ii} + 2\,k_{ij} + k_{jj} = (e_i + e_j)^T K\,(e_i + e_j) = (e_i + e_j)^T L\,(e_i + e_j) = l_{ii} + 2\,l_{ij} + l_{jj}$,
and hence $k_{ij} = k_{ji} = l_{ij} = l_{ji}$, and so $K = L$.
(b) Example: If $K = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}$ and $L = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ then $x^T K\,x = x^T L\,x = 2\,x_1 x_2$.
3.4.20. Since $q(x)$ is a scalar, $q(x) = x^T A\,x = (x^T A\,x)^T = x^T A^T x$, and hence
$q(x) = \frac12 (x^T A\,x + x^T A^T x) = x^T K\,x$, where $K = \frac12 (A + A^T)$ is symmetric.
3.4.21. (a) $\ell(c\,x) = a \cdot c\,x = c\,(a \cdot x) = c\,\ell(x)$; (b) $q(c\,x) = (c\,x)^T K (c\,x) = c^2\,x^T K\,x = c^2 q(x)$; (c) Example: $q(x) = \|x\|^2$ where $\|x\|$ is any norm that does not come from an inner product.

3.4.22. (i)

10
6

6
; positive definite. (ii)
4
0

0
B
@

5
4
3

4
13
1

3
1 C
A; positive semi-definite; null
2

!
5
6 8
B
C
vectors: all scalar multiples of @ 1 A. (iii)
; positive definite.
8 13
7
0
1
0
1
2 1 1
9 6 3
C
B
C
(iv ) B
@ 1 2 1 A; positive definite. (v ) @ 6 6 0 A; positive semi-definite; null vectors:
1 1 2
3 0 3
0
1
0
1
!
1
30
0 6
2 1
C
all scalar multiples of B
; positive definite. (vii) B
3C
@ 1 A. (vi)
@ 0 30
A;
1
3
1
6
3
15
1
0
2 2 1
0
B
2
5
2
2C
C
C; positive definite.
B
positive definite. (viii) B
@ 1
2
2
3A
0
2
3 13
!

3 1 2
21 12 9
B
C
B
9 3C
3.4.23. (iii)
@ 1 4 3 A, (v ) @ 12
A. Positive definiteness doesnt
2 3 5
9
3 6
change, since it only depends upon the linear independence of the vectors.
9
12

3.4.24. (vi)

4
3

12
, (iv )
21

1
7
4

(vii)

0
B
B
@

10
2
1

1
e1
1 2
3.4.25. K =
e1
2 (e 1)
1 3
1 2
(e

1)
2
3 (e 1)
early independent functions.
B
B
B
@

1 1/e
1
3.4.26. K =
1
e1
1 2
e1
2 (e 1)
(still) linearly independent.
B
B
@

3.4.27. K =

3.4.28. K =

2
0

B
B
B
B2
B
@3

0
B
B
B
B
B
@

0
2
3

2
5

2
3
2
3
2
5
2
5

2
3
2
3
2
5

2
3

0
2
5

0
2
3
2
5
2
5
2
7

C
2C
5C
C
0C
A
2
7
21
5C
2C
5C
C
2C
7A
2
9

145
12
10
3
1
2
1
3
1
4

10 C
C,
3 A
41
6
1

(viii)

5
4

B
B 2
B
B
@ 1

2
9
2

1
2

2
1

4
3

0
C
1C
C.
C
1A
5

(e2 1) C
C
is positive definite since 1, ex , e2 x are lin(e3 1) C
A
(e4 1)
1

e1
C
x 2x
1 2
are
(e
1) C
A is also positive definite since 1, e , e
2
1 3
3 (e 1)

is positive definite since 1, x, x2 , x3 are linearly independent.

is positive definite since 1, x, x2 , x3 are (still) linearly independent.
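
The Gram matrices in these exercises can be recomputed exactly. The sketch below assumes the unweighted $L^2$ inner product $\langle f, g\rangle = \int_{-1}^{1} f(x)\,g(x)\,dx$ on the monomials $1, x, x^2, x^3$ (an assumption inferred from the surviving entries $2, 0, \frac23, \frac25, \frac27$ of Exercise 3.4.27), and the helper names `monomial_gram` and `pivots` are illustrative, not from the text. Positive pivots confirm positive definiteness.

```python
from fractions import Fraction


def monomial_gram(n, a=-1, b=1):
    """Gram matrix K[i][j] = integral of x^(i+j) over [a, b] for 1, x, ..., x^(n-1)."""
    def integral(p):
        # Exact value of the integral of x^p from a to b.
        return (Fraction(b) ** (p + 1) - Fraction(a) ** (p + 1)) / (p + 1)
    return [[integral(i + j) for j in range(n)] for i in range(n)]


def pivots(K):
    """Pivots of Gaussian elimination without row interchanges (regular matrices only)."""
    A = [[Fraction(v) for v in row] for row in K]
    n = len(A)
    found = []
    for k in range(n):
        p = A[k][k]
        if p == 0:
            raise ValueError("zero pivot: matrix is not regular")
        found.append(p)
        for i in range(k + 1, n):
            m = A[i][k] / p
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
    return found


K = monomial_gram(4)
for row in K:
    print([str(v) for v in row])
print("pivots:", [str(p) for p in pivots(K)])
```

All four pivots come out positive, in agreement with the linear independence argument.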

3.4.29. Let $\langle x, y\rangle = x^T K\,y$ be the corresponding inner product. Then $k_{ij} = \langle e_i, e_j\rangle$, and hence $K$ is the Gram matrix associated with the standard basis vectors $e_1, \ldots, e_n$.

3.4.30.
(a) is a special case of (b) since positive definite matrices are symmetric.
(b) By Theorem 3.28, if $S$ is any symmetric matrix, then $S^T S = S^2$ is always positive semi-definite, and positive definite if and only if $\ker S = \{0\}$, i.e., $S$ is nonsingular. In particular, if $S = K > 0$, then $\ker K = \{0\}$ and so $K^2 > 0$.
3.4.31.
(a) coker K = ker K since K is symmetric, and so part (a) follows from Proposition 3.36.
(b) By Exercise 2.5.39, $\operatorname{rng} K \subset \operatorname{rng} A^T = \operatorname{corng} A$. Moreover, by part (a) and Theorem 2.49, both have the same dimension, and hence they must be equal.
3.4.32. $0 = z^T K\,z = z^T A^T C\,A\,z = y^T C\,y$, where $y = A\,z$. Since $C > 0$, this implies $y = 0$, and hence $z \in \ker A = \ker K$.
3.4.33.
(a) L = (AT )T is the Gram matrix corresponding to the columns of AT , i.e., the rows of A.
(b) From Exercise 3.4.31 and Theorem 2.49, rank K = rank A = rank AT = rank L.
(c) This is true if and only if both ker A and coker A are {0}, which, by Theorem 2.49, requires that A be square and nonsingular.
3.4.34. A Gram matrix is positive definite if and only if the vector space elements used to construct it are linearly independent. Linear independence doesn't depend upon the inner product being used, and so if the Gram matrix for one inner product is positive definite, so is the Gram matrix for any other inner product on the vector space.
3.4.35.
(a) As in Exercise 3.4.7, the sum of positive definite matrices is positive definite.
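(b) Example: $A_1 = (\,1\ \ 0\,)$, $A_2 = (\,0\ \ 1\,)$, $C_1 = C_2 = I$, $K = I$.
(c) In block form, set $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$ and $C = \begin{pmatrix} C_1 & O \\ O & C_2 \end{pmatrix}$. Then $A^T C\,A = A_1^T C_1 A_1 + A_2^T C_2 A_2 = K$.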

3.5.1. Only (a), (e) are positive definite.


3.5.2.

1
2

(a)

5
1
0
3
(c) B
@ 1
3

(b)

(d)

(e)

B
@

2
1
1

B
@

2
1
2

2
3

1
2

0
1

1
0
1 2
=
; not positive definite.
0 1
0 1
!
!
!
!
1
5
0
1
1 0
1

5 ; positive definite.
=
0 14
3
51 1
0
1
5
1
10
10
1
0
1
3
0
0
1
0
0
1

1
1 3
3
C
B
14
B
B 1
1 0C
0C
5 1C
A@ 0
AB
A = @3
1 37 C
3
A; positive definite.
@0
1 5
1 37 1
0
0 87
0
00 1
1
0
1
10
1
1
1
1
0 0
2
0
0
1

1 1
2
2C
B
3
B 1
B
1 0C
0C
2
1C
A = @2
A@ 0 2
AB
1 31 C
@0
A;
1
1
4
1 2

1
0
0

0
0
1
2
3
3
not positive
definite.
1
0
10
10
1
1 2
1
0 0
2 0 0
1 21 1
B 1
B
CB
C
1
1 3 C
1 0C
A=@ 2
A@ 0 2 0 A @ 0
1 4 A; positive definite.
3 11
1 4 1
0 0 1
0 0
1


1 1 0
1
0
0
B
2 0 1C
1
0
C
B1
C=B
(f )
0 1 1 A @ 1 1
1
0 1 1 2
0
1 2
not positive definite.
0
0
1
1
0 0
3 2 1 0
B2
B
C
1 0
2 3 0 1C B
3
B
C=B
(g) B
1
2
B
@1 0 3 2A
@ 3 5 1
3
0 1 2 4
0
5 1
positive definite.
0
1
0
1
2
1 2
0
B 1
C
B
1
3 3
2C B
2
C=B
B
(h) B
@ 2 3
4 1 A B
1
@
0
2 1
7
0
positive definite.
B
B1
B
@1

10

0
1
B
0C
CB 0
CB
0 A@ 0
1
0
1

0 03
0C
CB
0
CB
B
0C
A@ 0
0
1
0
1

0
0

10

0
1
B
0C
CB 0
CB
0 A@ 0
5
0

1
1
0
0

10

2
3

0
1
B
0C
CB
0
CB
0 AB
@0
1
0

12
5

10

0
2
B0
0C
CB
CB
B
0C
A@ 0
1
0

0
0

5
2

2
5

0
0

1
0
0

1
1
1
0

0
1C
C
C;
2 A
5

1
3
2
5

10

0
1
B
0C
CB 0
CB
B
0C
A@ 0
9
0
2

C
3C
5C
C;
1A

1
0

1
2

1
0
0

3.5.4. K =

0
B
B
B
@

1
2
1
2

1
c1
0

1
45
1
0

0
4
5
3
2

C
C
C;
C
A

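3.5.3.
(a) Gaussian Elimination leads to $U = \begin{pmatrix} 1 & 1 & 0 \\ 0 & c - 1 & 1 \\ 0 & 0 & \frac{c-2}{c-1} \end{pmatrix}$; the pivots $1,\ c - 1,\ \frac{c-2}{c-1}$ are all positive if and only if $c > 2$.
(b) $\begin{pmatrix} 1 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & \frac12 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & \frac12 \end{pmatrix} \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & \frac12 \\ 0 & 0 & 1 \end{pmatrix}$;
(c) $q(x, y, z) = (x + y)^2 + 2\,(y + \frac12 z)^2 + \frac12 z^2$.
(d) The coefficients (pivots) $1, 2, \frac12$ are positive, so the quadratic form is positive definite.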
0
0

3
2

4
5

0
0
1
0

5
3

0
0
1

45

0
1
0
0

12 C
C
; yes, it is positive definite.
0C
A
3

1
2

2
0

3.5.5.
(a) $(x + 4 y)^2 - 15\,y^2$; not positive definite.
(b) $(x - 2 y)^2 + 3\,y^2$; positive definite.
(c) $(x - y)^2 - 2\,y^2$; not positive definite.
(d) $(x + 3 y)^2 - 9\,y^2$; not positive definite.
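
Completing the square for a two-variable form can be done mechanically. The following is a minimal sketch, assuming the form is written as $q(x, y) = a x^2 + 2 b x y + c y^2$ with $a \ne 0$; the function names are hypothetical. The sample inputs are obtained by expanding answers (a) and (b) above, e.g. $(x + 4y)^2 - 15 y^2 = x^2 + 8 x y + y^2$.

```python
from fractions import Fraction


def complete_square(a, b, c):
    """Write a x^2 + 2 b x y + c y^2 as a (x + s y)^2 + d y^2; returns (a, s, d)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a == 0:
        raise ValueError("a must be nonzero to complete the square in x")
    s = b / a            # shift inside the first square
    d = c - b * b / a    # second pivot, (a c - b^2) / a
    return a, s, d


def is_positive_definite_2x2(a, b, c):
    """Positive definite exactly when both coefficients a and d are positive."""
    lead, _, d = complete_square(a, b, c)
    return lead > 0 and d > 0


# x^2 + 8xy + y^2 has b = 4; x^2 - 4xy + 7y^2 has b = -2.
print(complete_square(1, 4, 1), is_positive_definite_2x2(1, 4, 1))
print(complete_square(1, -2, 7), is_positive_definite_2x2(1, -2, 7))
```

The two coefficients returned are the pivots of the coefficient matrix, which is why their signs decide definiteness.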


3.5.6. (a) (x + 2 z)2 + 3 y 2 + z 2 ,

(c) 2 x1 +
3.5.7.

1
(a) ( x y z )T B
@0
2
0

0
2
4

3
(b) ( x y z )T B
@ 4
0

1
2

1
(c) ( x y z )T B
@ 1
2

1
4

(b)

x2

10

1
2

x3

x+
2

3
2

yz

15
8

(y + 2 z)2 + 4 z 2 ,

3
4

2
5

x3

x2

2
x
B C
4C
A@ y A; not positive definite;
12 1z0 1
x
4 21
CB C
2 0 A@ y A; not positive definite;
0 1 10z 1
x
1 2
B C
2 3 C
A@ y A; positive definite;
z
3
6

6
5

x23 .

(d) ( x1 x2

B
B
x3 ) T B
@

(e) ( x1 x2 x3

3
2
27
0

1
B 2
B
x4 ) T B
@ 1
0

9
2

10

1
CB x 1 C
9C
C;
CB
2 A@ x 2 A

72

2
1

x3

2
5
0
1

1
0
6
12

not positive definite;

10

0
x1
C
B
1 C
CB x C
1 CB 2 C; positive definite.
2 A@ x 3 A
x4
4

3.5.8. When $a^2 < 4$ and $a^2 + b^2 + c^2 - a\,b\,c < 4$.

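3.5.9. True. Indeed, if $a \ne 0$, then $q(x, y) = a \left( x + \dfrac{b}{a}\,y \right)^2 + \dfrac{c\,a - b^2}{a}\,y^2$;
if $c \ne 0$, then $q(x, y) = c \left( y + \dfrac{b}{c}\,x \right)^2 + \dfrac{c\,a - b^2}{c}\,x^2$;
while if $a = c = 0$, then $q(x, y) = 2\,b\,x\,y = \frac12\,b\,(x + y)^2 - \frac12\,b\,(x - y)^2$.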

3.5.10. (a) According to Theorem 1.52, $\det K$ is equal to the product of its pivots, which are all positive by Theorem 3.37. (b) $\operatorname{tr} K = \sum_{i=1}^{n} k_{ii} > 0$ since, according to Exercise 3.4.4, every diagonal entry of $K$ is positive. (c) For $K = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$, if $\operatorname{tr} K = a + c > 0$ and $a \le 0$, then $c > 0$, but then $\det K = a\,c - b^2 \le 0$, which contradicts the assumptions. Thus, both $a > 0$ and $a\,c - b^2 > 0$, which, by (3.62), implies $K > 0$. (d) Example: $K = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$.
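3.5.11. Writing $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in \mathbb{R}^{2n}$ where $x_1, x_2 \in \mathbb{R}^n$, we have $x^T K\,x = x_1^T K_1 x_1 + x_2^T K_2 x_2 > 0$ for all $x \ne 0$ by positive definiteness of $K_1, K_2$. The converse is also true, because $x_1^T K_1 x_1 = x^T K\,x > 0$ when $x = \begin{pmatrix} x_1 \\ 0 \end{pmatrix}$ with $x_1 \ne 0$.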
3.5.12.
(a) If $x \ne 0$ then $u = x/\|x\|$ is a unit vector, and so $q(x) = x^T K\,x = \|x\|^2\,u^T K\,u > 0$.
(b) Using the Euclidean norm, let $m = \min\{\,u^T S\,u \mid \|u\| = 1\,\} > -\infty$, which is finite since $q(u)$ is continuous and the unit sphere in $\mathbb{R}^n$ is closed and bounded. Then $u^T K\,u = u^T S\,u + c\,\|u\|^2 \ge m + c > 0$ for $\|u\| = 1$ provided $c > -m$.
3.5.13. Write $S = (S + c\,I) + (-c\,I) = K + N$, where $N = -c\,I$ is negative definite for any $c > 0$, while $K = S + c\,I$ is positive definite provided $c \gg 0$ is sufficiently large.
3.5.14.
(a) The $i$th row of $D\,L^T$ is $d_i\,l_i^T$, where $l_i$ denotes the $i$th column of $L$. Hence writing $K = L\,(D\,L^T)$ and using formula (1.14) results in (3.69).
!
!
!
!
!

3 0
4 1
1
0 0
4 1
1
(b)
=4
+
,
(
0
1
)
=
1

+
1
4
1
1
41
0 43
1
4 1
4
0

1
B
@2
1

2
6
1

1
1
B C
1C
A = @ 2 A( 1
4
1
0
1 2
=B
@2 4
1 2

2 1 ) + 2B
@
1

0
1
B
2C
A + @0
0
1

21

0
2
1

0
1C
A 0

12

1
0

0
0
B
1 C
A + @0
1
0
2


0
0
0

5 0C
+ B
@ 0 A( 0 0 1 )
2 1
1

0
0C
A.

5
2

3.5.15. According to Exercise 1.9.19, the pivots of a regular matrix are the ratios of these successive subdeterminants. For the $3 \times 3$ case, the pivots are $a$, $\dfrac{a\,d - b^2}{a}$, and $\dfrac{\det K}{a\,d - b^2}$. Therefore, in general the pivots are positive if and only if the subdeterminants are.
3.5.16. If a negative diagonal entry appears, it is either a pivot, or a diagonal entry of the remaining lower right symmetric $(m - i) \times (m - i)$ submatrix, which, by Exercise 3.4.4, must all be positive in order that the matrix be positive definite.
3.5.17. Use the fact that $K = -N$ is positive definite. A $2 \times 2$ symmetric matrix $N = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ is negative definite if and only if $a < 0$ and $\det N = a\,c - b^2 > 0$. Similarly, a $3 \times 3$ matrix $N = \begin{pmatrix} a & b & c \\ b & d & e \\ c & e & f \end{pmatrix} < 0$ if and only if $a < 0$, $a\,d - b^2 > 0$, $\det N < 0$. In general, $N$ is negative definite if and only if its upper left entry is negative and the sequence of square upper left $i \times i$ subdeterminants, $i = 1, \ldots, n$, have alternating signs: $-, +, -, +, \ldots$.
3.5.18. False: if $N$ has size $n \times n$ then $\operatorname{tr} N < 0$ but $\det N > 0$ if $n$ is even and $< 0$ if $n$ is odd.
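
The alternating-sign criterion of Exercise 3.5.17 is easy to check by machine. This is a minimal sketch in exact rational arithmetic; the function names and the sample tridiagonal matrix are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def det(M):
    """Determinant by exact Gaussian elimination, with row swaps when needed."""
    A = [[Fraction(v) for v in row] for row in M]
    n = len(A)
    result = Fraction(1)
    for k in range(n):
        pivot_row = next((i for i in range(k, n) if A[i][k] != 0), None)
        if pivot_row is None:
            return Fraction(0)          # whole column is zero: singular
        if pivot_row != k:
            A[k], A[pivot_row] = A[pivot_row], A[k]
            result = -result            # a row swap flips the sign
        result *= A[k][k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
    return result


def leading_minors(M):
    """Determinants of the upper left i x i submatrices, i = 1, ..., n."""
    return [det([row[:i] for row in M[:i]]) for i in range(1, len(M) + 1)]


def is_negative_definite(N):
    """Symmetric N is negative definite iff the i-th leading minor has sign (-1)^i."""
    return all((-1) ** i * d > 0 for i, d in enumerate(leading_minors(N), start=1))


N = [[-2, 1, 0], [1, -2, 1], [0, 1, -2]]
print(leading_minors(N), is_negative_definite(N))
```

For the sample matrix the minors alternate in sign starting with a negative one, so it is negative definite.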

3.5.19.
3
2

(a)

2
2

=@
!

1
10
3 2
3 C
AB
@
A,
2
2
0
3
3
!
!

2 0
4 12
=
6 3
12
45
0
1
0
1
1
1
1
0
B
(c) B
2 2 C
1
A = @1
@1
1 2 14
1 3
0
0
1
2 0
2 1 1
B 1
B
C
B
3
(d) @ 1 2 1 A = B
2
@ 2
1 1 2
1
1
(b)

(e)

B
B1
B
@0

3.5.20. (a)

(e)

B
B1
B
@1

1
2
1
0

0
1
2
1

4
2
1
2
1
1

B
0
B
B
0C
C
B
C=B
1A B
B
@
2

2
4
1
1
2
1

2
0

2
1
2

0
0

2 6
,
0
3
10
0
1 1
B
0C
A@ 0 1
2
0 0
10
2
0

CB
B
0C
CB
AB
@
2
3

0
0

3
2
2
3

2 0
3
1
0
2 0

B
1
B
C
1C B
B
C=B
1A B
B
@
2

1
2
1
2
1

3
2
1
6
1

1
2
3
2

1
2
1
6
0 2
3
10

0 B
CB
0C
CB
B

CB
CB
0C
CB
B
A@
5
2
!

2
3
3
2

1
3 C
A,
2

2
0

,
3
0
0

2
3
1

2 3

3.5.21.
(a) z12 + z22 , where z1 = 4 x1 , z2 = 5 x2 ;

CB

CB
0C
CB

AB
5 @
2

C
C
C,
C
A

1
2
3
2

1
2
3
2

1
2
1
6
2
3

2
0
0

C
C
0C
C
C
C.
3C
2 C
A
5
2

2
3
2
3

10

0 B
CB
0C
CB
B


1
2
1
6
1

2 3
5
2

C
C
C
C
C.
C
C
C
A


(b) z12 + z22 , where z1 = x1 x2 , z2 = 3 x2 ; r

(c) z12 + z22 , where z1 = 5 x1 2 x2 , z2 = 11


5 x2 ;
5
r

2
2
2
1
1
(d) z1 + z2 + z3 , where z1 = 3 x1 x2 x3 , z2 = 53 x2
3

(e) z12 + z22 + z32 , where z1 = x1 +

1
2

x2 , z 2 =

3
2 x2

1
3

x3 , z 3 =

1 x ,
3
r 15
2
3 x3 ;

z3 =

28
5

x3 ;

(f ) z12 + z22 + z32 , where z1 = 2 x1 21 x2 x3 , z2 = 12 x2 2 x3 , z3 = x3 ;


(g) z12 + z22 + z32 + z42 ,
r
r
r
r
r

8
55
where z1 = 3 x1 + 1 x2 , z2 = 83 x2 + 38 x3 , z3 = 21
x
+
x
,
z
=
4
8 3
21 4
21 x3 .
3

3.6.1. The equation is $e^{\pi i} + 1 = 0$, since $e^{\pi i} = \cos\pi + i \sin\pi = -1$.

3.6.2. $e^{k \pi i} = \cos k\pi + i \sin k\pi = (-1)^k = \begin{cases} 1, & k \text{ even}, \\ -1, & k \text{ odd}. \end{cases}$

3.6.3. Not necessarily. Since $1 = e^{2 k \pi i}$ for any integer $k$, we could equally well compute
$1^z = e^{2 k \pi i z} = e^{-2 k \pi y + i\,(2 k \pi x)} = e^{-2 k \pi y} \left( \cos 2 k \pi x + i \sin 2 k \pi x \right)$.
If $z = n$ is an integer, this always reduces to $1^n = 1$, no matter what $k$ is. If $z = m/n$ is a rational number (in lowest terms) then $1^{m/n}$ has $n$ different possible values. In all other cases, $1^z$ has an infinite number of possible values.
3.6.4. $e^{2 a \pi i} = \cos 2 a \pi + i \sin 2 a \pi = 1$ if and only if $a$ is an integer. The problem is that, as in Exercise 3.6.3, the quantity $1^a$ is not necessarily equal to 1 when $a$ is not an integer.

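3.6.5. (a) $i = e^{\pi i/2}$; (b) $\sqrt{i} = e^{\pi i/4} = \dfrac{1}{\sqrt2} + \dfrac{i}{\sqrt2}$ and $e^{5 \pi i/4} = -\dfrac{1}{\sqrt2} - \dfrac{i}{\sqrt2}$.
(c) $\sqrt[3]{i} = e^{\pi i/6},\ e^{5 \pi i/6},\ e^{3 \pi i/2} = -i$; $\sqrt[4]{i} = e^{\pi i/8},\ e^{5 \pi i/8},\ e^{9 \pi i/8},\ e^{13 \pi i/8}$.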
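
The lists of roots above can be checked numerically. A minimal sketch using Python's standard `cmath` module follows; `nth_roots` is a hypothetical helper name. It takes the polar form $w = r\,e^{i\theta}$ and returns $r^{1/n} e^{i(\theta + 2\pi k)/n}$ for $k = 0, \ldots, n-1$.

```python
import cmath


def nth_roots(w, n):
    """All n complex n-th roots of a nonzero complex number w."""
    r, theta = cmath.polar(w)
    return [
        cmath.rect(r ** (1.0 / n), (theta + 2 * cmath.pi * k) / n)
        for k in range(n)
    ]


for n in (2, 3, 4):
    roots = nth_roots(1j, n)
    # The phase of each root, as a multiple of pi, can be compared with the exponents above.
    print(n, [round(cmath.phase(z) / cmath.pi, 4) for z in roots])
```

Phases are reported in the range $(-\pi, \pi]$, so for instance $e^{3\pi i/2}$ shows up with phase $-\pi/2$.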
3.6.6. Along the line through z at the reciprocal radius 1/r = 1/| z |.
3.6.7.
(a) 1/z moves in a clockwise direction around a circle of radius 1/r;
(b) $\bar z$ moves in a clockwise direction around a circle of radius $r$;
1
(c) Suppose the circle has radius r and is centered at a. If r < | a |, then z moves in a
a
| a |2
centered at
; if
counterclockwise direction around a circle of radius
| a |2 r 2
| a |2 r 2
| a |2
1
r > | a |, then z moves in a clockwise direction around a circle of radius 2
cenr | a |2
a
1
; if r = | a |, then z moves along a straight line. On the other hand, z
tered at
2
2
|a| r
moves in a clockwise direction around a circle of radius r centered at a.
3.6.8. Set $z = x + i\,y$. We find $|\operatorname{Re} z| = |x| = \sqrt{x^2} \le \sqrt{x^2 + y^2} = |z|$, which proves the first inequality. Similarly, $|\operatorname{Im} z| = |y| = \sqrt{y^2} \le \sqrt{x^2 + y^2} = |z|$.

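3.6.9. Write $z = r\,e^{i\theta}$ so $\theta = \operatorname{ph} z$. Then $\operatorname{Re}(e^{i\varphi} z) = \operatorname{Re}(r\,e^{i(\varphi + \theta)}) = r \cos(\varphi + \theta) \le r = |z|$, with equality if and only if $\varphi + \theta$ is an integer multiple of $2\pi$.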
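3.6.10. Set $z = r\,e^{i\theta}$, $w = s\,e^{i\varphi}$, then $z\,w = r\,s\,e^{i(\theta + \varphi)}$ has modulus $|z\,w| = r\,s = |z|\,|w|$ and phase $\operatorname{ph}(z\,w) = \theta + \varphi = \operatorname{ph} z + \operatorname{ph} w$. Further, $\bar z = r\,e^{-i\theta}$ has modulus $|\bar z| = r = |z|$ and phase $\operatorname{ph} \bar z = -\theta = -\operatorname{ph} z$.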
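3.6.11. If $z = r\,e^{i\theta}$, $w = s\,e^{i\varphi} \ne 0$, then $z/w = (r/s)\,e^{i(\theta - \varphi)}$ has phase $\operatorname{ph}(z/w) = \theta - \varphi = \operatorname{ph} z - \operatorname{ph} w$, while $z\,\bar w = r\,s\,e^{i(\theta - \varphi)}$ also has phase $\operatorname{ph}(z\,\bar w) = \theta - \varphi = \operatorname{ph} z - \operatorname{ph} w$.
3.6.12. Since $\tan(t + \pi) = \tan t$, the inverse $\tan^{-1} t$ is only defined up to multiples of $\pi$, whereas $\operatorname{ph} z$ is uniquely defined up to multiples of $2\pi$.
3.6.13. Set $z = x + i\,y$, $w = u + i\,v$, then $z\,\bar w = (x + i\,y)(u - i\,v) = (x\,u + y\,v) + i\,(y\,u - x\,v)$ has real part $\operatorname{Re}(z\,\bar w) = x\,u + y\,v$, which is the dot product between $(x, y)^T$ and $(u, v)^T$.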
3.6.14.
(a) By Exercise 3.6.13, for $z = x + i\,y$, $w = u + i\,v$, the quantity $\operatorname{Re}(z\,\bar w) = x\,u + y\,v$ is equal to the dot product between the vectors $(x, y)^T$, $(u, v)^T$, and hence equals 0 if and only if they are orthogonal.
(b) $z\,\overline{i\,z} = -\,i\,z\,\bar z = -\,i\,|z|^2$ is purely imaginary, with zero real part, and so orthogonality follows from part (a). Alternatively, note that $z = x + i\,y$ corresponds to $(x, y)^T$ while $i\,z = -\,y + i\,x$ corresponds to the orthogonal vector $(-y, x)^T$.
3.6.15.

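$e^{(x + i y) + (u + i v)} = e^{(x + u) + i (y + v)} = e^{x + u} \left[ \cos(y + v) + i \sin(y + v) \right]$
$\qquad = e^{x + u} \left[ (\cos y \cos v - \sin y \sin v) + i\,(\cos y \sin v + \sin y \cos v) \right]$
$\qquad = \left[ e^x (\cos y + i \sin y) \right] \left[ e^u (\cos v + i \sin v) \right] = e^z e^w$.

Use induction: $e^{(m+1) z} = e^{m z + z} = e^{m z} e^z = (e^z)^m e^z = (e^z)^{m+1}$.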


3.6.16.
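(a) $e^{2 i \theta} = \cos 2\theta + i \sin 2\theta$ while $(e^{i\theta})^2 = (\cos\theta + i \sin\theta)^2 = (\cos^2\theta - \sin^2\theta) + 2\,i \cos\theta \sin\theta$, and hence $\cos 2\theta = \cos^2\theta - \sin^2\theta$, $\sin 2\theta = 2 \cos\theta \sin\theta$.
(b) $\cos 3\theta = \cos^3\theta - 3 \cos\theta \sin^2\theta$, $\sin 3\theta = 3 \cos^2\theta \sin\theta - \sin^3\theta$.
(c) $\cos m\theta = \displaystyle\sum_{0 \le j = 2k \le m} (-1)^k \binom{m}{j} \cos^{m-j}\theta\, \sin^j\theta$, $\quad \sin m\theta = \displaystyle\sum_{0 < j = 2k+1 \le m} (-1)^k \binom{m}{j} \cos^{m-j}\theta\, \sin^j\theta$.

3.6.17. $\cos\dfrac{\theta - \varphi}{2}\,\cos\dfrac{\theta + \varphi}{2} = \dfrac14 \left[ e^{i(\theta - \varphi)/2} + e^{-i(\theta - \varphi)/2} \right] \left[ e^{i(\theta + \varphi)/2} + e^{-i(\theta + \varphi)/2} \right]$
$\qquad = \frac14 e^{i\theta} + \frac14 e^{-i\theta} + \frac14 e^{i\varphi} + \frac14 e^{-i\varphi} = \frac12 \cos\theta + \frac12 \cos\varphi$.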

3.6.18. $e^z = e^x \cos y + i\,e^x \sin y = r \cos\theta + i\,r \sin\theta$ implies $r = |e^z| = e^x$ and $\theta = \operatorname{ph} e^z = y$.


3.6.19. $\cos(x + i\,y) = \cos x \cosh y - i \sin x \sinh y$, $\ \sin(x + i\,y) = \sin x \cosh y + i \cos x \sinh y$, where $\cosh y = \dfrac{e^y + e^{-y}}{2}$, $\sinh y = \dfrac{e^y - e^{-y}}{2}$. In particular, when $y = 0$, $\cosh y = 1$ and $\sinh y = 0$, and so these reduce to the usual real trigonometric functions. If $x = 0$ we obtain $\cos i\,y = \cosh y$, $\sin i\,y = i \sinh y$.
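
These two identities can be spot-checked against the library implementations of the complex cosine and sine. The sketch below uses only the standard `math` and `cmath` modules; the names `cos_split` and `sin_split` are illustrative.

```python
import cmath
import math


def cos_split(x, y):
    """cos(x + i y) assembled from real functions: cos x cosh y - i sin x sinh y."""
    return complex(math.cos(x) * math.cosh(y), -math.sin(x) * math.sinh(y))


def sin_split(x, y):
    """sin(x + i y) assembled from real functions: sin x cosh y + i cos x sinh y."""
    return complex(math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y))


x, y = 0.7, -1.3
z = complex(x, y)
print(abs(cmath.cos(z) - cos_split(x, y)))
print(abs(cmath.sin(z) - sin_split(x, y)))
```

Both differences should be at the level of floating-point rounding error.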
3.6.20.
(a) $\cosh(x + i\,y) = \cosh x \cos y + i \sinh x \sin y$, $\ \sinh(x + i\,y) = \sinh x \cos y + i \cosh x \sin y$;
(b) Using Exercise 3.6.19,
$\cos i\,z = \dfrac{e^{-z} + e^{z}}{2} = \cosh z, \qquad \sin i\,z = \dfrac{e^{-z} - e^{z}}{2\,i} = i\,\dfrac{e^{z} - e^{-z}}{2} = i \sinh z$.

3.6.21.
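(a) If $j + k = n$, then $(\cos\theta)^j (\sin\theta)^k = \dfrac{1}{2^n\,i^k}\,(e^{i\theta} + e^{-i\theta})^j\,(e^{i\theta} - e^{-i\theta})^k$. When multiplied out, each term has $0 \le l \le n$ factors of $e^{i\theta}$ and $n - l$ factors of $e^{-i\theta}$, which equals $e^{i(2 l - n)\theta}$ with $-n \le 2 l - n \le n$, and hence the product is a linear combination of the indicated exponentials.
(b) This follows from part (a), writing each $e^{i k \theta} = \cos k\theta + i \sin k\theta$ and $e^{-i k \theta} = \cos k\theta - i \sin k\theta$ for $k \ge 0$.
(c) (i) $\cos^2\theta = \frac14 e^{2 i\theta} + \frac12 + \frac14 e^{-2 i\theta} = \frac12 + \frac12 \cos 2\theta$,
(ii) $\cos\theta \sin\theta = -\frac14\,i\,e^{2 i\theta} + \frac14\,i\,e^{-2 i\theta} = \frac12 \sin 2\theta$,
(iii) $\cos^3\theta = \frac18 e^{3 i\theta} + \frac38 e^{i\theta} + \frac38 e^{-i\theta} + \frac18 e^{-3 i\theta} = \frac34 \cos\theta + \frac14 \cos 3\theta$,
(iv) $\sin^4\theta = \frac1{16} e^{4 i\theta} - \frac14 e^{2 i\theta} + \frac38 - \frac14 e^{-2 i\theta} + \frac1{16} e^{-4 i\theta} = \frac38 - \frac12 \cos 2\theta + \frac18 \cos 4\theta$,
(v) $\cos^2\theta \sin^2\theta = -\frac1{16} e^{4 i\theta} + \frac18 - \frac1{16} e^{-4 i\theta} = \frac18 - \frac18 \cos 4\theta$.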
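
The expansions in part (c) amount to multiplying out Laurent polynomials in $e^{i\theta}$. A minimal sketch follows, storing each expression as a dictionary from the exponent $m$ of $e^{i m \theta}$ to its coefficient; the names `multiply`, `COS`, `SIN` and `expand` are hypothetical. All coefficients here are dyadic rationals, so the floating-point arithmetic is exact for small powers.

```python
def multiply(p, q):
    """Product of two trigonometric polynomials stored as {exponent: coefficient}."""
    out = {}
    for m, a in p.items():
        for n, b in q.items():
            out[m + n] = out.get(m + n, 0) + a * b
    return out


COS = {1: 0.5, -1: 0.5}        # cos t = (e^{it} + e^{-it}) / 2
SIN = {1: -0.5j, -1: 0.5j}     # sin t = (e^{it} - e^{-it}) / (2i)


def expand(j, k):
    """Coefficients c[m] with cos^j(t) sin^k(t) = sum over m of c[m] e^{i m t}."""
    result = {0: 1}
    for _ in range(j):
        result = multiply(result, COS)
    for _ in range(k):
        result = multiply(result, SIN)
    # Drop the terms that cancelled exactly.
    return {m: c for m, c in result.items() if c != 0}


print(expand(2, 0))   # coefficients for cos^2
print(expand(1, 1))   # coefficients for cos * sin
print(expand(2, 2))   # coefficients for cos^2 * sin^2
```

Pairing the coefficients of $e^{i m\theta}$ and $e^{-i m\theta}$ then gives the cosine and sine forms listed above.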
3.6.22. xa+ i b = xa e i b log x = xa cos(b log x) + i xa sin(b log x).

3.6.23. First, using the power series for $e^x$, we have the complex power series $e^{i x} = \displaystyle\sum_{n=0}^{\infty} \frac{(i\,x)^n}{n!}$.
Since $i^n = \begin{cases} 1, & n = 4k, \\ i, & n = 4k + 1, \\ -1, & n = 4k + 2, \\ -i, & n = 4k + 3, \end{cases}$ we can rewrite the preceding series as

$e^{i x} = \displaystyle\sum_{n=0}^{\infty} \frac{(i\,x)^n}{n!} = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k+1)!} = \cos x + i \sin x$.

3.6.24.

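(a) Writing $\lambda = \mu + i\,\nu$,
$\dfrac{d}{dx}\,e^{\lambda x} = \dfrac{d}{dx} \left( e^{\mu x} \cos\nu x + i\,e^{\mu x} \sin\nu x \right) = (\mu\,e^{\mu x} \cos\nu x - \nu\,e^{\mu x} \sin\nu x) + i\,(\mu\,e^{\mu x} \sin\nu x + \nu\,e^{\mu x} \cos\nu x)$
$\qquad = (\mu + i\,\nu) \left( e^{\mu x} \cos\nu x + i\,e^{\mu x} \sin\nu x \right) = \lambda\,e^{\lambda x}$.

(b) This follows from the Fundamental Theorem of Calculus. Alternatively, one can calculate the integrals of the real and imaginary parts directly:
$\displaystyle\int_a^b e^{\mu x} \cos\nu x\,dx = \frac{1}{\mu^2 + \nu^2} \left( \mu\,e^{\mu b} \cos\nu b + \nu\,e^{\mu b} \sin\nu b - \mu\,e^{\mu a} \cos\nu a - \nu\,e^{\mu a} \sin\nu a \right)$,
$\displaystyle\int_a^b e^{\mu x} \sin\nu x\,dx = \frac{1}{\mu^2 + \nu^2} \left( \mu\,e^{\mu b} \sin\nu b - \nu\,e^{\mu b} \cos\nu b - \mu\,e^{\mu a} \sin\nu a + \nu\,e^{\mu a} \cos\nu a \right)$.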

3.6.25. (a) 21 x + 14 sin 2 x, (b) 12 x 14 sin 2 x, (c) 14 cos 2 x,


1
1
(e) 38 x + 14 sin 2 x + 32
sin 4 x, (f ) 38 x 14 sin 2 x + 32
sin 4 x,
1
1
1
(h) 14 cos x + 20
cos 5 x 36
cos 9 x 60
cos 15 x.

3.6.26.

Re z 2 :

Im z 2 :

Both have saddle points at the origin,



1
(d) 41 cos 2 x 16
cos 8 x,
1
1
(g) 8 x 32 sin 4 x,

Re

1
:
z

Im

1
:
z

Both have singularities (poles) at the origin.

3.6.27.

ph z:

| z |:

3.6.28. (a) Linearly independent; (b) linearly dependent; (c) linearly independent; (d) linearly
dependent; (e) linearly independent; (f ) linearly independent; (g) linearly dependent.

3.6.29. (a) Linearly independent; (b) yes, they are a basis; (c) $\|v_1\| = \sqrt2$, $\|v_2\| = \sqrt6$, $\|v_3\| = \sqrt5$, (d) $v_1 \cdot v_2 = 1 + i$, $v_2 \cdot v_1 = 1 - i$, $v_1 \cdot v_3 = 0$, $v_2 \cdot v_3 = 0$, so $v_1, v_3$ and $v_2, v_3$ are orthogonal, but not $v_1, v_2$. (e) No, since $v_1$ and $v_2$ are not orthogonal.
3.6.30.
(a) Dimension = 1; basis: ( 1, i , 1 i )T .
(b) Dimension = 2; basis: ( i 1, 0, 1 )T , ( i , 1, 0 )T .
(c) Dimension = 2; basis: ( 1, i + 2 )T , ( i , 1 + 3 i )T .
(d) Dimension = 1; basis:

14
5

8
5

13
5
T

i,

4
5

i,1

(e) Dimension = 2; basis: ( 1 + i , 1, 0 ) , ( i , 0, 1 ) .

3.6.31. False it is not closed under scalar multiplication. For instance, i


not in the subspace since i z = i z.
3.6.32.

(b) Range:

(c) Range:

2i
; cokernel:
1
0
1 0
!
!
2
0
1 + i
2
C B
; corange: B
1
+
i
,
1
+
i
,
@
A @
3
4
1 2i
3 3i
cokernel: {0}.
i
; corange:
1

(a) Range:

1 0

B
C B
@ 1 + 2 i A, @

i
; kernel:
2

2 i
3
1+ i

C
A;

corange:


0
B
@

i
1
2 i

1 0
C B
A, @

i
1

z
z

iz
iz

.
0

1 + 25 i
C
B
;
kernel:
A
@ 3i
1

0
0 C
A; kernel:
2

0
B
@

i
1
0

C
A;

C
A;

is

cokernel:

1 32
B
B 1
@2 +

i
i

C
C.
A

3.6.33. If $c\,v + d\,w = 0$ where $c = a + i\,b$, $d = e + i\,f$, then, taking real and imaginary parts, $(a + e)\,x + (-b + f)\,y = 0 = (b + f)\,x + (a - e)\,y$. If $c, d$ are not both zero, so $v, w$ are linearly dependent, then $a \pm e$, $b \pm f$ cannot all be zero, and so $x, y$ are linearly dependent. Conversely, if $a\,x + b\,y = 0$ with $a, b$ not both zero, then $(a - i\,b)\,v + (a + i\,b)\,w = 2\,(a\,x + b\,y) = 0$ and hence $v, w$ are linearly dependent.
3.6.34. This can be proved directly, or by noting that it can be identified with the vector space $\mathbb{C}^{m n}$. The dimension is $m\,n$, with a basis provided by the $m \times n$ matrices with a single entry of 1 and all other entries equal to 0.
3.6.35. Only (b) is a subspace.
3.6.36. False.
3.6.37. (a) Belongs: sin x = 21 i e i x + 12 i e i x ; (b) belongs: cos x 2 i sin x = ( 12 + i )e i x +
( 12 i )e i x ; (c) doesnt belong; (d) belongs: sin2 21 x = 12 12 e i x 21 e i x ;
(e) doesnt belong.
3.6.38.
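(a) Sesquilinearity:
$\langle c\,u + d\,v, w\rangle = (c\,u_1 + d\,v_1)\,\bar w_1 + 2\,(c\,u_2 + d\,v_2)\,\bar w_2 = c\,(u_1 \bar w_1 + 2\,u_2 \bar w_2) + d\,(v_1 \bar w_1 + 2\,v_2 \bar w_2) = c\,\langle u, w\rangle + d\,\langle v, w\rangle$,
$\langle u, c\,v + d\,w\rangle = u_1\,\overline{(c\,v_1 + d\,w_1)} + 2\,u_2\,\overline{(c\,v_2 + d\,w_2)} = \bar c\,(u_1 \bar v_1 + 2\,u_2 \bar v_2) + \bar d\,(u_1 \bar w_1 + 2\,u_2 \bar w_2) = \bar c\,\langle u, v\rangle + \bar d\,\langle u, w\rangle$.
Conjugate Symmetry:
$\langle v, w\rangle = v_1 \bar w_1 + 2\,v_2 \bar w_2 = \overline{w_1 \bar v_1 + 2\,w_2 \bar v_2} = \overline{\langle w, v\rangle}$.
Positive definite:
$\langle v, v\rangle = |v_1|^2 + 2\,|v_2|^2 > 0$ whenever $v = (v_1, v_2)^T \ne 0$.
(b) Sesquilinearity:
$\langle c\,u + d\,v, w\rangle = (c\,u_1 + d\,v_1)\,\bar w_1 + i\,(c\,u_1 + d\,v_1)\,\bar w_2 - i\,(c\,u_2 + d\,v_2)\,\bar w_1 + 2\,(c\,u_2 + d\,v_2)\,\bar w_2$
$\quad = c\,(u_1 \bar w_1 + i\,u_1 \bar w_2 - i\,u_2 \bar w_1 + 2\,u_2 \bar w_2) + d\,(v_1 \bar w_1 + i\,v_1 \bar w_2 - i\,v_2 \bar w_1 + 2\,v_2 \bar w_2) = c\,\langle u, w\rangle + d\,\langle v, w\rangle$,
$\langle u, c\,v + d\,w\rangle = u_1\,\overline{(c\,v_1 + d\,w_1)} + i\,u_1\,\overline{(c\,v_2 + d\,w_2)} - i\,u_2\,\overline{(c\,v_1 + d\,w_1)} + 2\,u_2\,\overline{(c\,v_2 + d\,w_2)}$
$\quad = \bar c\,(u_1 \bar v_1 + i\,u_1 \bar v_2 - i\,u_2 \bar v_1 + 2\,u_2 \bar v_2) + \bar d\,(u_1 \bar w_1 + i\,u_1 \bar w_2 - i\,u_2 \bar w_1 + 2\,u_2 \bar w_2) = \bar c\,\langle u, v\rangle + \bar d\,\langle u, w\rangle$.
Conjugate Symmetry:
$\langle v, w\rangle = v_1 \bar w_1 + i\,v_1 \bar w_2 - i\,v_2 \bar w_1 + 2\,v_2 \bar w_2 = \overline{w_1 \bar v_1 + i\,w_1 \bar v_2 - i\,w_2 \bar v_1 + 2\,w_2 \bar v_2} = \overline{\langle w, v\rangle}$.
Positive definite: Let $v = (v_1, v_2)^T = (x_1 + i\,y_1, x_2 + i\,y_2)^T$:
$\langle v, v\rangle = |v_1|^2 + i\,v_1 \bar v_2 - i\,v_2 \bar v_1 + 2\,|v_2|^2 = x_1^2 + y_1^2 + 2\,x_1 y_2 - 2\,x_2 y_1 + 2\,x_2^2 + 2\,y_2^2$
$\quad = (x_1 + y_2)^2 + (y_1 - x_2)^2 + x_2^2 + y_2^2 > 0$ provided $v \ne 0$.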

3.6.39. Only (d), (e) define Hermitian inner products.


3.6.40. $(A\,v) \cdot w = (A\,v)^T\,\bar w = v^T A^T\,\bar w = v^T A\,\bar w = v^T\,\overline{A\,w} = v \cdot (A\,w)$.


3.6.41.
(a) $\|z\|^2 = \displaystyle\sum_{j=1}^{n} |z_j|^2 = \sum_{j=1}^{n} \left( |x_j|^2 + |y_j|^2 \right) = \sum_{j=1}^{n} |x_j|^2 + \sum_{j=1}^{n} |y_j|^2 = \|x\|^2 + \|y\|^2$.
(b) No; for instance, the formula is not valid for the inner product in Exercise 3.6.38(b).
3.6.42.
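(a) $\|z + w\|^2 = \langle z + w, z + w\rangle = \|z\|^2 + \langle z, w\rangle + \langle w, z\rangle + \|w\|^2$
$\qquad = \|z\|^2 + \langle z, w\rangle + \overline{\langle z, w\rangle} + \|w\|^2 = \|z\|^2 + 2 \operatorname{Re} \langle z, w\rangle + \|w\|^2$.
(b) Using (a),
$\|z + w\|^2 - \|z - w\|^2 + i\,\|z + i\,w\|^2 - i\,\|z - i\,w\|^2 = 4 \operatorname{Re} \langle z, w\rangle + 4\,i \operatorname{Re}(-\,i\,\langle z, w\rangle)$
$\qquad = 4 \operatorname{Re} \langle z, w\rangle + 4\,i \operatorname{Im} \langle z, w\rangle = 4\,\langle z, w\rangle$.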
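
The polarization identity in part (b) can be verified numerically. The sketch below assumes the Hermitian dot product $\langle z, w\rangle = \sum_j z_j \bar w_j$ on $\mathbb{C}^n$; the function names and the sample vectors are illustrative.

```python
def inner(z, w):
    """Hermitian dot product on C^n: the sum of z_j times the conjugate of w_j."""
    return sum(a * complex(b).conjugate() for a, b in zip(z, w))


def norm_sq(z):
    """Squared norm; <z, z> is real, so only the real part is kept."""
    return inner(z, z).real


def combine(z, w, c):
    """The vector z + c w."""
    return [a + c * b for a, b in zip(z, w)]


def polarization(z, w):
    """Recover <z, w> from four squared norms, as in part (b)."""
    return (
        norm_sq(combine(z, w, 1)) - norm_sq(combine(z, w, -1))
        + 1j * norm_sq(combine(z, w, 1j)) - 1j * norm_sq(combine(z, w, -1j))
    ) / 4


z = [1 + 2j, 3 - 1j]
w = [2 - 1j, 1j]
print(inner(z, w), polarization(z, w))
```

Both printed values should agree up to rounding.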
3.6.43.

(a) The angle $\theta$ between $v, w$ is defined by $\cos\theta = \dfrac{|\langle v, w\rangle|}{\|v\|\,\|w\|}$. Note the modulus on the inner product term, which is needed in order to keep the angle real.
(b) $\|v\| = \sqrt{11}$, $\|w\| = 2\sqrt2$, $\langle v, w\rangle = 2\,i$, so $\theta = \cos^{-1} \dfrac{1}{\sqrt{22}} \approx 1.3559$ radians.

3.6.44. $\|v\| = \|c\,v\|$ if and only if $|c| = 1$, and so $c = e^{i\theta}$ for some $0 \le \theta < 2\pi$.


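3.6.45. Assume $w \ne 0$. Then, by Exercise 3.6.42(a), for $t \in \mathbb{C}$,
$0 \le \|v + t\,w\|^2 = \|v\|^2 + 2 \operatorname{Re}\left( \bar t\,\langle v, w\rangle \right) + |t|^2\,\|w\|^2$.
With $t = -\dfrac{\langle v, w\rangle}{\|w\|^2}$, we find
$0 \le \|v\|^2 - 2\,\dfrac{|\langle v, w\rangle|^2}{\|w\|^2} + \dfrac{|\langle v, w\rangle|^2}{\|w\|^2} = \|v\|^2 - \dfrac{|\langle v, w\rangle|^2}{\|w\|^2}$,
which implies $|\langle v, w\rangle|^2 \le \|v\|^2\,\|w\|^2$, proving Cauchy-Schwarz.

To establish the triangle inequality,
$\|v + w\|^2 = \langle v + w, v + w\rangle = \|v\|^2 + 2 \operatorname{Re} \langle v, w\rangle + \|w\|^2 \le \|v\|^2 + 2\,\|v\|\,\|w\| + \|w\|^2 = \left( \|v\| + \|w\| \right)^2$,
since, according to Exercise 3.6.8 and Cauchy-Schwarz, $\operatorname{Re} \langle v, w\rangle \le |\langle v, w\rangle| \le \|v\|\,\|w\|$.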

3.6.46.
(a) A norm on the complex vector space $V$ assigns a real number $\|v\|$ to each vector $v \in V$, subject to the following axioms, for all $v, w \in V$, and $c \in \mathbb{C}$:
(i) Positivity: $\|v\| \ge 0$, with $\|v\| = 0$ if and only if $v = 0$.
(ii) Homogeneity: $\|c\,v\| = |c|\,\|v\|$.
(iii) Triangle inequality: $\|v + w\| \le \|v\| + \|w\|$.
(b) $\|v\|_1 = |v_1| + \cdots + |v_n|$; $\ \|v\|_2 = \sqrt{|v_1|^2 + \cdots + |v_n|^2}$; $\ \|v\|_\infty = \max\{\,|v_1|, \ldots, |v_n|\,\}$.
3.6.47. (e) Infinitely many, namely $u = e^{i\theta}\,v/\|v\|$ for any $0 \le \theta < 2\pi$.

3.6.48.
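(a) $(A^\dagger)^\dagger = \overline{\left( \overline{A^T} \right)^T} = (A^T)^T = A$,
(b) $(z\,A + w\,B)^\dagger = \overline{(z\,A + w\,B)^T} = \overline{(z\,A^T + w\,B^T)} = \bar z\,\overline{A^T} + \bar w\,\overline{B^T} = \bar z\,A^\dagger + \bar w\,B^\dagger$,
(c) $(A\,B)^\dagger = \overline{(A\,B)^T} = \overline{B^T A^T} = \overline{B^T}\;\overline{A^T} = B^\dagger A^\dagger$.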


3.6.49.
(a) The entries of $H$ satisfy $h_{ji} = \overline{h_{ij}}$; in particular, $h_{ii} = \overline{h_{ii}}$, and so $h_{ii}$ is real.
(b) $(H\,z) \cdot w = (H\,z)^T\,\bar w = z^T H^T\,\bar w = z^T\,\overline{H\,w} = z \cdot (H\,w)$.
(c) Let $z = \displaystyle\sum_{i=1}^{n} z_i\,e_i$, $w = \displaystyle\sum_{i=1}^{n} w_i\,e_i$ be vectors in $\mathbb{C}^n$. Then, by sesquilinearity, $\langle z, w\rangle = \displaystyle\sum_{i,j=1}^{n} h_{ij}\,z_i\,\bar w_j = z^T H\,\bar w$, where $H$ has entries $h_{ij} = \langle e_i, e_j\rangle = \overline{\langle e_j, e_i\rangle} = \overline{h_{ji}}$, proving that it is a Hermitian matrix. Positive definiteness requires $\|z\|^2 = z^T H\,\bar z > 0$ for all $z \ne 0$.
(d) First check that the matrix is Hermitian: $h_{ij} = \overline{h_{ji}}$. Then apply Regular Gaussian Elimination, checking that all pivots are real and positive.
3.6.50.
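(a) The $(i, j)$ entry of the Gram matrix $K$ is $k_{ij} = \langle v_i, v_j\rangle = \overline{\langle v_j, v_i\rangle} = \overline{k_{ji}}$, and so $K = K^\dagger$ is Hermitian.
(b) $x^T K\,\bar x = \displaystyle\sum_{i,j=1}^{n} k_{ij}\,x_i\,\bar x_j = \sum_{i,j=1}^{n} x_i\,\bar x_j\,\langle v_i, v_j\rangle = \|v\|^2 \ge 0$ where $v = \displaystyle\sum_{i=1}^{n} x_i\,v_i$.
(c) Equality holds if and only if $v = 0$. If $v_1, \ldots, v_n$ are linearly independent, then $v = \displaystyle\sum_{i=1}^{n} x_i\,v_i = 0$ requires $x = 0$, proving positive definiteness.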

3.6.51.
(a) (i) h 1 , e i x i = 2 i , k 1 k = 1, k e i x k = 1;

k 1 + e i x k = 2 2 = k 1 k + k e i x k.
(ii) | h 1 , e i x i | = 2 1 = k 1 k k e i x k,
(b) (i) h x + i , x i i = 23 + i , k x + i k = k x i k = 2 ;
3

13
3

4
3

= k x + i k k x i k,
(ii) | h x + i , x i i | =
k (x + i ) + (x i ) k = k 2 x k = 2 4 = k x + i k + k x i k.

(c)

3
3
2
1
(i) h i x , (1 2 i )x + 3 i i = + 4 i , k i x k = 1 , k (1 2 i )x + 3 i k
5

q
2
2
5
14
(ii) | h i x , (1 2 i )x + 3 i i | = 4 15 = k i x k k (1 2 i )x + 3 i k,
q
q
14
1 +
k i x2 + (1 2 i )x + 3 i k = 88

2.4221

2.6075

15
3
5
2
2

1
2

14
3

= k i x k + k (1 2 i )x + 3 i k.

3.6.52. $w(x) > 0$ must be real and positive. Less restrictively, one needs only require that $w(x) \ge 0$ as long as $w(x) \not\equiv 0$ on any open subinterval $a \le c < x < d \le b$; see Exercise 3.1.28 for details.


Solutions Chapter 4

4.1.1. We need to minimize $(3 x - 1)^2 + (2 x + 1)^2 = 13\,x^2 - 2\,x + 2$. The minimum value of $\frac{25}{13}$ occurs when $x = \frac{1}{13}$.
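
The computation in Exercise 4.1.1 is a one-variable least squares problem. The following sketch, with a hypothetical helper `minimize_sum_of_squares`, minimizes $\sum_i (a_i x - b_i)^2$ exactly: setting the derivative to zero gives $x_\star = \sum a_i b_i / \sum a_i^2$. The term $(2x + 1)^2$ is entered as $a = 2$, $b = -1$.

```python
from fractions import Fraction


def minimize_sum_of_squares(a, b):
    """Minimize sum_i (a_i x - b_i)^2 over real x; returns (x_star, minimum value)."""
    saa = sum(Fraction(ai) ** 2 for ai in a)
    if saa == 0:
        raise ValueError("all coefficients a_i are zero")
    sab = sum(Fraction(ai) * bi for ai, bi in zip(a, b))
    x_star = sab / saa                      # root of the derivative
    value = sum((ai * x_star - bi) ** 2 for ai, bi in zip(a, b))
    return x_star, value


x_star, value = minimize_sum_of_squares([3, 2], [1, -1])
print(x_star, value)
```

The exact fractions returned can be compared directly with the answer above.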

4.1.2. Note that f (x, y) 0; the minimum value f (x? , y? ) = 0 is achieved when
x? = 57 , y? = 74 .
4.1.3. (a) ( 1, 0 )T , (b) ( 0, 2 )T , (c)

1 1
2, 2

, (d)

23 ,

3
2

, (e) ( 1, 2 )T .

4.1.4. Note: To minimize the distance between the point $(a, b)^T$ and the line $y = m\,x + c$:
(i) in the $\infty$ norm we must minimize the scalar function $f(x) = \max\{\,|x - a|,\ |m\,x + c - b|\,\}$, while (ii) in the 1 norm we must minimize the scalar function $f(x) = |x - a| + |m\,x + c - b|$.
(i) (a) all points on the line segment ( x, 0 )T for 3 x 1; (b) all points on the line
segment ( 0, y )T for 1 y 3; (c)

1 1
2, 2

; (d)

32 ,

3
2

; (e) ( 1, 2 )T .

(ii) (a) ( 1, 0 )T ; (b) ( 0, 2 )T ; (c) all points on the line segment ( t, t )T for 1 t 2;
(d) all points on the line segment ( t, t )T for 2 t 1; (e) ( 1, 2 )T .
4.1.5.
(a) Uniqueness is assured in the Euclidean norm. (See the following exercise.)
(b) Not unique. For instance, in the $\infty$ norm, every point on the $x$-axis of the form $(x, 0)$ for $-1 \le x \le 1$ is at a minimum distance 1 from the point $(0, 1)^T$.
(c) Not unique. For instance, in the 1 norm, every point on the line x = y of the form (x, x)
for 1 x 1 is at a minimum distance 1 from the point ( 1, 1 )T .
4.1.6.
(a) The closest point $v$ is found by dropping a perpendicular from the point $b$ to the line. [Figure omitted.]

(b) Any other point $w$ on the line lies at a larger distance since $\|w - b\|$ is the hypotenuse of the right triangle with corners $b, v, w$ and hence is longer than the side length $\|v - b\|$.
(c) Using the properties of the cross product, the distance is $\|b\|\,|\sin\theta| = |a \times b|/\|a\|$, where $\theta$ is the angle between the line through $a$ and $b$. To prove the other formula, we note that
$\|a\|^2\,\|b\|^2 - (a \cdot b)^2 = (a_1^2 + a_2^2)(b_1^2 + b_2^2) - (a_1 b_1 + a_2 b_2)^2 = (a_1 b_2 - a_2 b_1)^2 = (a \times b)^2$.
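
The distance formula of part (c) and the perpendicular construction of part (a) can be compared numerically. This is a small sketch for a line through the origin in the direction $a$ in the plane; the function names and the sample data are illustrative.

```python
import math


def distance_to_line(a, b):
    """Distance from the point b to the line through the origin spanned by a: |a x b| / ||a||."""
    cross = a[0] * b[1] - a[1] * b[0]       # planar cross product a1 b2 - a2 b1
    return abs(cross) / math.hypot(a[0], a[1])


def closest_point(a, b):
    """Foot of the perpendicular dropped from b onto the line spanned by a."""
    t = (a[0] * b[0] + a[1] * b[1]) / (a[0] ** 2 + a[1] ** 2)
    return (t * a[0], t * a[1])


a, b = (3.0, 4.0), (5.0, 0.0)
v = closest_point(a, b)
print(distance_to_line(a, b), math.hypot(b[0] - v[0], b[1] - v[1]))
```

The two printed numbers should coincide, since both measure the length of the perpendicular.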
4.1.7. This holds because the two triangles in the figure are congruent. According to Exercise 4.1.6(c), when $\|a\| = \|b\| = 1$, the distance is $|\sin\theta|$ where $\theta$ is the angle between $a, b$, as illustrated:

4.1.8.
(a) Note that $\mathbf{a} = (-b, a)^T$ lies in the line. By Exercise 4.1.6(a), the distance is $\dfrac{|\mathbf{a} \times \mathbf{x}_0|}{\|\mathbf{a}\|} = \dfrac{|a\,x_0 + b\,y_0|}{\sqrt{a^2 + b^2}}$. (b) A similar geometric construction yields the distance $\dfrac{|a\,x_0 + b\,y_0 + c|}{\sqrt{a^2 + b^2}}$.

4.1.9. (a) The distance is given by $\dfrac{|a\,x_0 + b\,y_0 + c\,z_0 + d|}{\sqrt{a^2 + b^2 + c^2}}$. (b) $\dfrac{1}{\sqrt{14}}$.

4.1.10.
(a) Let $v_\star$ be the minimizer. Since $x^2$ is a monotone, strictly increasing function for $x \ge 0$, we have $0 \le x < y$ if and only if $x^2 < y^2$. Thus, for any other $v$, we have $x = \|v_\star - b\| < y = \|v - b\|$ if and only if $x^2 = \|v_\star - b\|^2 < y^2 = \|v - b\|^2$, proving that $v_\star$ must minimize both quantities.
(b) $F(x)$ must be strictly increasing: $F(x) < F(y)$ whenever $x < y$.

4.1.11.
(a) Assume $V \ne \{0\}$, as otherwise the minimum and maximum distance is $\|b\|$. Given any $0 \ne v \in V$, by the triangle inequality, $\|b - t\,v\| \ge |t|\,\|v\| - \|b\| \to \infty$ as $t \to \infty$, and hence there is no maximum distance.
(b) Maximize distance from a point to a closed, bounded (compact) subset of $\mathbb{R}^n$, e.g., the unit sphere $\{\,\|v\| = 1\,\}$. For example, the maximal distance between the point $(1, 1)^T$ and the unit circle $x^2 + y^2 = 1$ is $1 + \sqrt2$, with $x_\star = \left( -\frac{1}{\sqrt2}, -\frac{1}{\sqrt2} \right)^T$ being the farthest point on the circle.

4.2.1. x =

1
2,y

1
2 , z0=

2
1 1
coefficient matrix B
@1 3
0 1

with
f (x, y, z) = 23 . This is the global minimum because the
1
0
1C
A is positive definite.
1

1
4.2.2. At the point x? = 11
, y? =

5
22 .

4.2.3. (a) Minimizer: x = 23 , y = 16 ; minimum value: 34 . (b) Minimizer: x = 92 , y = 29 ;


1
minimum value: 32
9 . (c) No minimum. (d) Minimizer: x = 2 , y = 1, z = 1; minimum value: 45 . (e) No minimum. (f ) No minimum. (g) Minimizer: x = 57 , y = 54 ,
z = 51 , w = 25 ; minimum value: 58 .
1
b

b
4

1
b

0
1

0
1 b
4.2.4. (a) | b | < 2, (b) A =
=
; (c) If | b | < 2, the
4 b2
0 1
b
1
1
, achieved at x? =
, y? =
. When b 2, the
minimum value is
4 b2
4 b2
4 b2
minimum is .

1
0

4.2.5.
(a) $p(x) = 4\,x^2 - 24\,x\,y + 45\,y^2 + x - 4\,y + 3$; minimizer: $x_\star = \left( \frac{1}{24}, \frac{1}{18} \right)^T \approx (.0417, .0556)^T$; minimum value: $p(x_\star) = \frac{419}{144} \approx 2.9097$.
(b) $p(x) = 3\,x^2 + 4\,x\,y + y^2 - 8\,x - 2\,y$; no minimizer since $K$ is not positive definite.
(c) $p(x) = 3\,x^2 - 2\,x\,y + 2\,x\,z + 2\,y^2 - 2\,y\,z + 3\,z^2 - 2\,x + 4\,z - 3$; minimizer: $x_\star = \left( \frac{7}{12}, -\frac{1}{6}, -\frac{11}{12} \right)^T \approx (.5833, -.1667, -.9167)^T$; minimum value: $p(x_\star) = -\frac{65}{12} \approx -5.4167$.
(d) $p(x) = x^2 + 2\,x\,y + 2\,x\,z + 2\,y^2 - 2\,y\,z + z^2 + 6\,x + 2\,y - 4\,z + 1$; no minimizer since $K$ is not positive definite.
(e) $p(x) = x^2 + 2\,x\,y + 2\,y^2 + 2\,y\,z + 3\,z^2 + 2\,z\,w + 4\,w^2 + 2\,x - 4\,y + 6\,z - 8\,w$; minimizer: $x_\star = (-8, 7, -4, 2)^T$; minimum: $p(x_\star) = -42$.
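
The answers to Exercise 4.2.5 can be reproduced by solving a linear system. The sketch below assumes the quadratic function is written as $p(x) = x^T K x - 2\,x^T f + c$ with $K$ symmetric (a convention inferred from the worked answers, so treat it as an assumption); then the minimizer solves $K x_\star = f$ and the minimum value is $c - f^T x_\star$. The helper names are hypothetical, and a nonpositive pivot is reported as failure of positive definiteness.

```python
from fractions import Fraction


def solve_positive_definite(K, f):
    """Solve K x = f exactly by Gauss-Jordan elimination without row interchanges."""
    n = len(K)
    A = [[Fraction(v) for v in row] + [Fraction(fi)] for row, fi in zip(K, f)]
    for k in range(n):
        p = A[k][k]
        if p <= 0:
            raise ValueError("nonpositive pivot: K is not positive definite")
        A[k] = [v / p for v in A[k]]
        for i in range(n):
            if i != k:
                m = A[i][k]
                A[i] = [vi - m * vk for vi, vk in zip(A[i], A[k])]
    return [row[n] for row in A]


def minimize_quadratic(K, f, c=0):
    """Minimizer and minimum value of p(x) = x^T K x - 2 x^T f + c."""
    x = solve_positive_definite(K, f)
    return x, Fraction(c) - sum(Fraction(fi) * xi for fi, xi in zip(f, x))


# Part (a): the linear terms x - 4y give -2 f = (1, -4), so f = (-1/2, 2).
print(minimize_quadratic([[4, -12], [-12, 45]], [Fraction(-1, 2), 2], 3))
# Part (e): the linear terms 2x - 4y + 6z - 8w give f = (-1, 2, -3, 4).
print(minimize_quadratic([[1, 1, 0, 0], [1, 2, 1, 0], [0, 1, 3, 1], [0, 0, 1, 4]], [-1, 2, -3, 4]))
```

Calling the same function on part (b), with $K = \begin{pmatrix} 3 & 2 \\ 2 & 1 \end{pmatrix}$, raises the error, matching the "no minimizer" answer.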

4.2.6. n = 2: minimizer x? = 61 , 16

; minimum value 61 .

3
5
5
, 14
, 28
n = 3: minimizer x? = 28

; minimum value 72 .

2
5
5
2
n = 4: minimizer x? = 11
, 22
, 22
, 11
T

9
; minimum value 22
.

3
; maximum value: p(x? ) =
4.2.7. (a) maximizer: x? = 10
11 , 11
mum, since the coefficient matrix is not negative definite.

16
11 .

(b) There is no maxi-

4.2.8. False. Even in the scalar case, $p_1(x) = x^2$ has minimum at $x_1^\star = 0$, while $p_2(x) = x^2 - 2x$ has minimum at $x_2^\star = 1$, but the minimum of $p_1(x) + p_2(x) = 2x^2 - 2x$ is at $x^\star = \tfrac12 \neq x_1^\star + x_2^\star$.
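The counterexample can be checked with exact arithmetic. The following is only a minimal sketch: it assumes each scalar quadratic is written as p(x) = k x^2 - 2 f x + c, and the helper name `minimizer` is hypothetical, not something from the text.

```python
from fractions import Fraction

def minimizer(k, f):
    """Minimizer x* = f / k of p(x) = k x^2 - 2 f x + c; requires k > 0."""
    if k <= 0:
        raise ValueError("no minimum unless k > 0")
    return Fraction(f, k)

x1 = minimizer(1, 0)   # p1(x) = x^2
x2 = minimizer(1, 1)   # p2(x) = x^2 - 2 x
xs = minimizer(2, 1)   # p1(x) + p2(x) = 2 x^2 - 2 x
print(x1, x2, xs)      # the third value differs from x1 + x2
```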
4.2.9. Let $x^\star = K^{-1} f$ be the minimizer. When $c = 0$, according to the third expression in (4.12), $p(x^\star) = -(x^\star)^T K x^\star \leq 0$ because K is positive definite. The minimum value is 0 if and only if $x^\star = 0$, which occurs if and only if $f = 0$.
4.2.10. First, using Exercise 3.4.20, we rewrite $q(x) = x^T K x$ where $K = K^T$ is symmetric. If K is positive definite or positive semi-definite, then the minimum value is 0, attained when $x^\star = 0$ (and other points in the semi-definite cases). Otherwise, there is at least one vector v for which $q(v) = v^T K v = a < 0$. Then $q(t\,v) = t^2 a$ can be made arbitrarily large negative for $t \gg 0$; in this case, there is no minimum value.
4.2.11. If and only if f = 0 and the function is constant, in which case every x is a minimizer.
4.2.12. p(x) has a maximum if and only if $-p(x)$ has a minimum. Thus, we require either K is negative definite, or negative semi-definite with $f \in \operatorname{rng} K$. The maximizer $x^\star$ is obtained by solving $K x^\star = f$, and the maximum value $p(x^\star)$ is given as before by any of the expressions in (4.12).
4.2.13. The complex numbers do not have an ordering, i.e., an inequality $z < w$ doesn't make any sense. Thus, there is no minimum of a set of complex numbers. (One can, of course,
minimize the modulus of a set of complex numbers, but this places us back in the situation
of minimizing a real-valued function.)

4.3.1. Closest point: $\bigl(\tfrac67, \tfrac{38}{35}, \tfrac{36}{35}\bigr)^T \approx (.85714, 1.08571, 1.02857)^T$; distance: $\tfrac{1}{\sqrt{35}} \approx .16903$.

4.3.2.
(a) Closest point: $(.8343, 1.0497, 1.0221)^T$; distance: .2575.
(b) Closest point: $(.8571, 1.0714, 1.0714)^T$; distance: .2673.

4.3.3. $\bigl(\tfrac79, \tfrac89, \tfrac{11}{9}\bigr)^T \approx (.7778, .8889, 1.2222)^T$.

4.3.4.
(a) Closest point:
(b) Closest point:
(c) Closest point:
(d) Closest point:

11
7 7 7 7 T
,
,
,
;
distance:
4 4 4 4
4 .
q
T

5
3 3
; distance:
2, 2, 2 , 2
2.
T
( 3, 1, 2, 0 ) ; distance: 1.

3 1
3 T
5
; distance: 72 .
4,4, 4,4

4.3.5. Since the vectors are linearly dependent, one must first reduce to a basis consisting of the
first two. The closest point is
4.3.6.
(i) 4.3.4: (a) Closest point:

4.3.5:
(ii) 4.3.4:

distance:

4.3.8. (a)

1 1 1 1
2, 2, 2, 2

and the distance is 3.

7
3 3 3 3 T
; distance:
2, 2, 2, 2
4.
q

5
5 5 4 4 T
(b) Closest point: 3 , 3 , 3 , 3
; distance:
3.
T
(c) Closest point: ( 3, 1, 2, 0 ) ; distance: 1.
T

; distance: 7 .
(d) Closest point: 32 , 16 , 13 , 13
6

T
3 13
4
1
Closest point: 11
, 22 , 11
, 22
( .2727, .5909, .3636, .0455 )T ;
q
155
distance:
22 2.6543.

25 25 25 25 T
(a) Closest point: 14
, 14 , 14 , 14
( 1.7857, 1.7857, 1.7857, 1.7857 )T ;
q
215
distance:
14 3.9188.

66 66 59 59 T
(b) Closest point: 35 , 35 , 35 , 35
( 1.8857, 1.8857, 1.6857, 1.6857 )T ;
q
534
distance:
35 3.9060.
T

28 11 16
(c) Closest point:
( 3.1111, 1.2222, 1.7778, 0 )T ;
9q, 9 , 9 , 0
32
distance:
9 1.8856.
T

(d) Closest point: 23 , 1, 0, 21


; distance: 42.

6 3 3 3
5, 5, 2, 2
8
3;

(b)

26
107
292 159
, 259
, 259
, 259
q 259
143
8 259 5.9444.

4.3.5: Closest point:

4.3.7. v =

( .1004, .4131, 1.1274, .6139 )T ;

= ( 1.2, .6, 1.5, 1.5 )T .

7 .
6

4.3.9.
(a) $P^2 = A(A^TA)^{-1}A^T A(A^TA)^{-1}A^T = A(A^TA)^{-1}A^T = P$.
0
1
1
1
1
0
!
1
1
6
3 6
B

1
0
B
1
2
1
2
2 A, (ii)
, (iii) B
(b) (i) @
1
3
3 3
@
0 1
12
1
2
16 13
6

C
C
C,
A

(iv )

0
B
B
B
@

5
6
1
6
1
3

16
5
6
13

31
31
1
3

C
C
C.
A

(c) $P^T = \bigl(A(A^TA)^{-1}A^T\bigr)^T = A\bigl((A^TA)^T\bigr)^{-1}A^T = A(A^TA)^{-1}A^T = P$.
(d) Given $v = P\,b \in \operatorname{rng} P$, then $v = A\,x$ where $x = (A^TA)^{-1}A^T b$ and so $v \in \operatorname{rng} A$. Conversely, if $v = A\,x \in \operatorname{rng} A$, then we can write $v = P\,b \in \operatorname{rng} P$ where b solves the linear system $A^T b = (A^TA)\,x \in R^n$, which has a solution since $\operatorname{rank} A^T = n$ and so $\operatorname{rng} A^T = R^n$.
(e) According to the formulas in the section, the closest point is $w = A\,x$ where $x = K^{-1}A^T b = (A^TA)^{-1}A^T b$ and so $w = A(A^TA)^{-1}A^T b = P\,b$.
(f ) If A is nonsingular, then $(A^TA)^{-1} = A^{-1}A^{-T}$, hence $P = A(A^TA)^{-1}A^T = A\,A^{-1}A^{-T}A^T = I$. In this case, the columns of A span $R^n$ and the closest point to any $b \in R^n$ is $b = P\,b$ itself.
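Parts (a) and (c) can be spot-checked on a small example. This is only a sketch under stated assumptions: the 3 x 2 matrix A below is a hypothetical choice of rank 2, and the helpers `matmul`, `transpose`, `inverse2` are illustrative names written for this check.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(M):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(1), F(0)], [F(1), F(1)], [F(0), F(2)]]   # hypothetical matrix with independent columns
At = transpose(A)
P = matmul(matmul(A, inverse2(matmul(At, A))), At)   # P = A (A^T A)^{-1} A^T

print(matmul(P, P) == P, transpose(P) == P)   # idempotent and symmetric
```

Exact fractions are used so that the equalities P^2 = P and P^T = P can be tested without round-off.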
4.3.10.
(a) The quadratic function to be minimized is
$$p(x) = \sum_{i=1}^n \|x - a_i\|^2 = n\,\|x\|^2 - 2\,x \cdot \Bigl(\sum_{i=1}^n a_i\Bigr) + \sum_{i=1}^n \|a_i\|^2,$$
which has the form (4.22) with $K = n\,I$, $f = \sum_{i=1}^n a_i$, $c = \sum_{i=1}^n \|a_i\|^2$. Therefore, the minimizing point is $x = K^{-1} f = \frac1n \sum_{i=1}^n a_i$, which is the center of mass of the points.
(b) (i) $\bigl(\tfrac12, 4\bigr)$, (ii) $\bigl(\tfrac13, \tfrac13\bigr)$, (iii) $\bigl(\tfrac14, \tfrac34\bigr)$.
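The claim in part (a) can be illustrated numerically. The sketch below uses hypothetical sample points (not the data of the exercise) and compares the sum of squared distances at the center of mass with its value at nearby points.

```python
def center_of_mass(points):
    """Componentwise average of the points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]

def sum_sq_dist(x, points):
    """p(x) = sum over i of || x - a_i ||^2 in the Euclidean norm."""
    return sum(sum((xi - ai) ** 2 for xi, ai in zip(x, a)) for a in points)

pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]   # hypothetical sample points
xbar = center_of_mass(pts)
best = sum_sq_dist(xbar, pts)
print(xbar, best)
```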

4.3.11. In general, for the norm based on the positive definite matrix C, the quadratic function to be minimized is
$$p(x) = \sum_{i=1}^n \|x - a_i\|^2 = n\,\|x\|^2 - 2\,n\,\langle x, b\rangle + n\,c = n\,\bigl(x^T C x - 2\,x^T C b + c\bigr),$$
where $b = \frac1n \sum_{i=1}^n a_i$ is the center of mass and $c = \frac1n \sum_{i=1}^n \|a_i\|^2$. Therefore, the minimizing point is still the center of mass: $x = C^{-1} C\,b = b$. Thus, all answers are the same as in Exercise 4.3.10.
4.3.12. The Cauchy–Schwarz inequality guarantees this.


4.3.13.

$$\|v - b\|^2 = \|A x - b\|^2 = (A x - b)^T C\,(A x - b) = (x^T A^T - b^T)\,C\,(A x - b)$$
$$= x^T A^T C A\,x - x^T A^T C\,b - b^T C A\,x + b^T C\,b = x^T A^T C A\,x - 2\,x^T A^T C\,b + b^T C\,b$$
$$= x^T K x - 2\,x^T f + c,$$
where the scalar quantity $x^T A^T C\,b = (x^T A^T C\,b)^T = b^T C A\,x$ since $C^T = C$.

4.3.14. (a) x =

1
15 , y

(d) x
4.3.15. (a)

1
2,

(b)

41
45 ;
= 13 , y

8
@ 5 A
28
65

1
8
(b) x = 25
, y = 21
;

= 2, z = 34 ;
!

(c) u = 32 , v = 35 , w = 1;

(e) x1 = 13 , x2 = 2, x3 = 31 , x4 = 43 .

1.6
, (c)
.4308

0
B
B
B
@

2
5
1
5

C
C
C,
A

(d)

1
227
@ 941 A
304
941

.0414
.2412
C
, (e) B
@ .0680 A.
.3231
.0990

4.3.16. The solution is $(1, 0, 3)^T$. If A is nonsingular, then the least squares solution to the system $A\,x = b$ is given by $x^\star = K^{-1} f = (A^TA)^{-1}A^T b = A^{-1}A^{-T}A^T b = A^{-1} b$, which coincides with the ordinary solution.

4.3.17. The solution is $x^\star = (1, 2, 3)^T$. The least squares error is 0 because $b \in \operatorname{rng} A$ and so $x^\star$ is an exact solution.
4.3.18.
(a) This follows from Exercise 3.4.31 since $f \in \operatorname{corng} A = \operatorname{rng} K$.
(b) This follows from Theorem 4.4.
(c) Because if $z \in \ker A = \ker K$, then $x + z$ is also a solution to the normal equations, and has the same minimum value for the least squares error.

4.4.1. (a) $y = \tfrac{12}{7} + \tfrac{12}{7}\,t = 1.7143\,(1 + t)$; (b) $y = 1.9 - 1.1\,t$; (c) $y = 1.4 + 1.9\,t$.


4.4.2. (a) $y = 30.6504 + 2.9675\,t$; (b) [graph]; (c) profit: $179,024; (d) profit: $327,398.

4.4.3. (a) $y = 3.9227\,t - 7717.7$; (b) $147,359 and $166,973.

4.4.4. (a) $y = 2.2774\,t - 4375$; (b) 179.75 and 191.14; (c) $y = e^{.0183\,t - 31.3571}$, with estimates 189.79 and 207.98. (d) The linear model has a smaller least squares error between its predictions and the data, 6.4422 versus 10.4470 for the exponential model, and also a smaller maximal error, namely 4.3679 versus 6.1943.
4.4.5. Assuming a linear increase in temperature, the least squares fit is y = 71.6 + .405 t, which
equals 165 at t = 230.62 minutes, so you need to wait another 170.62 minutes, just under
three hours.
4.4.6.
(a) The least squares exponential is $y = e^{4.6051 - .1903\,t}$ and, at $t = 10$, $y = 14.9059$.
(b) Solving $e^{4.6051 - .1903\,t} = .01$, we find $t = 48.3897 \approx 49$ days.

4.4.7. (a) The least squares exponential is $y = e^{2.2773 - .0265\,t}$. The half-life is $\log 2/.0265 = 26.1376$ days. (b) Solving $e^{2.2773 - .0265\,t} = .01$, we find $t = 259.5292 \approx 260$ days.
4.4.8.
(a) The least squares exponential is $y = e^{.0132\,t - 20.6443}$, giving the population values (in millions) y(2000) = 296, y(2010) = 337, y(2050) = 571.
(b) The revised least squares exponential is $y = e^{.0127\,t - 19.6763}$, giving a smaller predicted value of y(2050) = 536.

4.4.9.
(a) The sample matrix for the functions 1, x, y is
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 1 \\ 1 & 2 & 2 \\ 1 & 3 & 2 \\ 1 & 3 & 4 \end{pmatrix}, \quad\text{while}\quad z = \begin{pmatrix} 3 \\ 6 \\ 11 \\ -2 \\ 0 \\ 3 \end{pmatrix}$$
is the data vector. The least squares solution to the normal equations $A^T A\,x = A^T z$ for $x = (a, b, c)^T$ gives the plane $z = 6.9667 - .8\,x - .9333\,y$.
(b) Every plane going through $(2, 2, 0)^T$ has an equation $z = a\,(x - 2) + b\,(y - 2)$, i.e., a linear combination of the functions $x - 2$, $y - 2$. The sample matrix is
$$A = \begin{pmatrix} -1 & -1 \\ -1 & 0 \\ 0 & -1 \\ 0 & 0 \\ 1 & 0 \\ 1 & 2 \end{pmatrix},$$
and the least squares solution gives the plane $z = -\tfrac45\,(x - 2) - \tfrac{14}{15}\,(y - 2) = 3.4667 - .8\,x - .9333\,y$.

4.4.10. For two data points $t_1 = a$, $t_2 = b$, we have
$$\overline{t^2} = \tfrac12\,(a^2 + b^2), \quad\text{while}\quad (\bar t\,)^2 = \tfrac14\,(a^2 + 2\,a\,b + b^2),$$
which are equal if and only if $a = b$. (For more data points, there is a single quadratic condition for equality.) Similarly, if $y_1 = p$, $y_2 = q$, then
$$\overline{t\,y} = \tfrac12\,(a\,p + b\,q), \quad\text{while}\quad \bar t\;\bar y = \tfrac14\,(a + b)(p + q).$$
Fixing $a \neq b$, these are equal if and only if $p = q$.
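The plane of Exercise 4.4.9(a) can be recomputed from the normal equations. The sketch below is not the authors' computation: the data points and the signs of the z values are read off from the solution as an assumption, and `solve` is a hypothetical Gauss-Jordan helper using exact fractions.

```python
from fractions import Fraction as F

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for i in range(n):
        p = next(r for r in range(i, n) if aug[r][i] != 0)   # first usable pivot row
        aug[i], aug[p] = aug[p], aug[i]
        aug[i] = [v / aug[i][i] for v in aug[i]]
        for r in range(n):
            if r != i:
                aug[r] = [vr - aug[r][i] * vi for vr, vi in zip(aug[r], aug[i])]
    return [row[-1] for row in aug]

xy = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 2), (3, 4)]   # assumed sample points (x, y)
z = [3, 6, 11, -2, 0, 3]                                # assumed data values
A = [[1, x, y] for x, y in xy]                          # sample matrix for 1, x, y
AtA = [[sum(r[i] * r[j] for r in A) for j in range(3)] for i in range(3)]
Atz = [sum(r[i] * zi for r, zi in zip(A, z)) for i in range(3)]
a, b, c = solve(AtA, Atz)
print(float(a), float(b), float(c))   # coefficients of z = a + b x + c y
```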


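4.4.11.
$$\frac1m \sum_{i=1}^m (t_i - \bar t\,)^2 = \frac1m \sum_{i=1}^m t_i^2 - \frac{2\,\bar t}{m} \sum_{i=1}^m t_i + \frac{(\bar t\,)^2}{m} \sum_{i=1}^m 1 = \overline{t^2} - 2\,(\bar t\,)^2 + (\bar t\,)^2 = \overline{t^2} - (\bar t\,)^2.$$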
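The identity of Exercise 4.4.11 can be confirmed on sample data. The values below are hypothetical and exact fractions are used so the two sides can be compared for equality.

```python
from fractions import Fraction as F

t = [F(1), F(2), F(4), F(7)]   # hypothetical sample times
m = len(t)
tbar = sum(t) / m                                   # mean of t
lhs = sum((ti - tbar) ** 2 for ti in t) / m         # mean of (t_i - tbar)^2
rhs = sum(ti ** 2 for ti in t) / m - tbar ** 2      # mean of t^2 minus (mean of t)^2
print(lhs, rhs)
```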

4.4.12.
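(a) $p(t) = -\tfrac15\,(t - 2) + (t + 3) = \tfrac{17}{5} + \tfrac45\,t$,
(b) $p(t) = \tfrac13\,(t - 1)(t - 3) - \tfrac14\,t\,(t - 3) + \tfrac1{24}\,t\,(t - 1) = 1 - \tfrac58\,t + \tfrac18\,t^2$,
(c) $p(t) = \tfrac12\,t\,(t - 1) - 2\,(t - 1)(t + 1) - \tfrac12\,(t + 1)\,t = 2 - t - 2\,t^2$,
(d) $p(t) = \tfrac12\,t\,(t - 2)(t - 3) - 2\,t\,(t - 1)(t - 3) + \tfrac32\,t\,(t - 1)(t - 2) = t^2$,
(e) $p(t) = -\tfrac1{24}\,(t + 1)\,t\,(t - 1)(t - 2) + \tfrac13\,(t + 2)\,t\,(t - 1)(t - 2) + \tfrac12\,(t + 2)(t + 1)(t - 1)(t - 2) - \tfrac16\,(t + 2)(t + 1)\,t\,(t - 2) + \tfrac18\,(t + 2)(t + 1)\,t\,(t - 1) = 2 + \tfrac53\,t - \tfrac{13}{4}\,t^2 - \tfrac16\,t^3 + \tfrac34\,t^4$.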
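The Lagrange form used in these answers can be evaluated directly. The sketch below is illustrative only: the function name `lagrange` is hypothetical, and the node/value pairs are those implied by parts (c) and (d), taken here as an assumption.

```python
from fractions import Fraction as F

def lagrange(nodes, values, t):
    """Evaluate the interpolating polynomial at t using the Lagrange formula."""
    total = F(0)
    for i, (ti, yi) in enumerate(zip(nodes, values)):
        term = F(yi)
        for j, tj in enumerate(nodes):
            if j != i:
                term *= F(t - tj, ti - tj)   # factor (t - t_j) / (t_i - t_j)
        total += term
    return total

# Part (c): values 1, 2, -1 at t = -1, 0, 1; compare with 2 - t - 2 t^2 at t = 3.
print(lagrange([-1, 0, 1], [1, 2, -1], 3), 2 - 3 - 2 * 3 ** 2)
```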
4.4.13.
(a) $y = 2\,t - 7$ [graph]
(b) $y = t^2 + 3\,t + 6$ [graph]
(c) $y = 2\,t - 1$ [graph]
(d) $y = t^3 + 2\,t^2 - 1$ [graph]
(e) $y = t^4 - t^3 + 2\,t - 3$ [graph]

4.4.14.
(a) y = 34 + 4 t.
(b) y = 2 + t2 . The error is zero because the parabola interpolates the points exactly.
4.4.15.
(a) $y = 2.0384 + .6055\,t$ [graph]
(b) $y = 2.14127 + .547248\,t + .005374\,t^2$ [graph]
(c) $y = 2.63492 + .916799\,t - .796131\,t^2 + .277116\,t^3 - .034102\,t^4 + .001397\,t^5$ [graph]

(d) The linear and quadratic models are practically identical, with almost the same least
squares errors: .729045 and .721432, respectively. The fifth order interpolating polynomial, of course, has 0 least squares error since it goes exactly through the data points.
On the other hand, it has to twist so much to do this that it is highly unlikely to be
the correct theoretical model. Thus, one strongly suspects that this experimental data
comes from a linear model.
4.4.16. The quadratic least squares polynomial is $y = 4480.5 + 6.05\,t - 1.825\,t^2$, and $y = 1500$ at 42.1038 seconds.
4.4.17. The quadratic least squares polynomial is $y = 175.5357 + 56.3625\,t - .7241\,t^2$, and $y = 0$ at 80.8361 seconds.
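The times quoted in Exercises 4.4.16 and 4.4.17 follow from the quadratic formula applied to the fitted polynomials. This is a small sketch; `later_root` is a hypothetical helper, and because the printed coefficients are rounded the results only agree with the quoted times to about three decimal places.

```python
import math

def later_root(a, b, c, target):
    """Larger real root t of a + b t + c t^2 = target (assumes c != 0 and real roots)."""
    disc = b * b - 4 * c * (a - target)
    roots = [(-b + s * math.sqrt(disc)) / (2 * c) for s in (1, -1)]
    return max(roots)

t16 = later_root(4480.5, 6.05, -1.825, 1500.0)     # Exercise 4.4.16
t17 = later_root(175.5357, 56.3625, -0.7241, 0.0)  # Exercise 4.4.17
print(round(t16, 4), round(t17, 4))
```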
4.4.18.
(a) $p_2(t) = 1 + t + \tfrac12\,t^2$, $p_4(t) = 1 + t + \tfrac12\,t^2 + \tfrac16\,t^3 + \tfrac1{24}\,t^4$;

(b) The maximal error for p2 (t) over the interval [ 0, 1 ] is .218282, while for p4 (t) it is .0099485.
The Taylor polynomials do a much better job near t = 0, but become significantly worse
at larger values of t; the least squares approximants are better over the entire interval.
4.4.19. Note: In this solution t is measured in degrees! (Alternatively, one can set up and solve the problem in radians.) The error is the $L^\infty$ norm of the difference $\sin t - p(t)$ on the interval $0 \leq t \leq 60$.
(a) $p(t) = .0146352\,t + .0243439$; maximum error $\approx .0373$.
(b) $p(t) = -.000534346 + .0191133\,t - .0000773989\,t^2$; maximum error $\approx .00996$.
(c) $p(t) = \tfrac{\pi}{180}\,t$; maximum error $\approx .181$.
(d) $p(t) = .0175934\,t - 9.1214 \times 10^{-6}\,t^2 - 7.25655 \times 10^{-7}\,t^3$; maximum error $\approx .000649$.
(e) $p(t) = \tfrac{\pi}{180}\,t - \tfrac16 \bigl(\tfrac{\pi}{180}\,t\bigr)^3$; maximum error $\approx .0102$.
(f ) The Taylor polynomials do a much better job at the beginning of the interval, but the least squares approximants are better over the entire range.
4.4.20. (a) For equally spaced data points, the least squares line is $y = .1617 + .9263\,t$ with a maximal error of .1617 on the interval $0 \leq t \leq 1$. (b) The least squares quadratic polynomial is $y = .0444 + 1.8057\,t - .8794\,t^2$ with a slightly better maximal error of .1002. Interestingly, the line is closer to $\sqrt{t}$ over a larger fraction of the interval than the quadratic polynomial, and only does significantly worse near $t = 0$.
Alternative solution: (a) The data points $0, \tfrac1{25}, \tfrac1{16}, \tfrac19, \tfrac14, 1$ have exact square roots $0, \tfrac15, \tfrac14, \tfrac13, \tfrac12, 1$. For these, we obtain the least squares line $y = .1685 + .8691\,t$, with a maximal error of .1685. (b) The least squares quadratic polynomial is $y = .0773 + 2.1518\,t - 1.2308\,t^2$ with, strangely, a worse maximal error of .1509, although it does do better over a larger fraction of the interval.
4.4.21. $p(t) = .9409\,t + .4566\,t^2 - .7732\,t^3 + .9330\,t^4$. The graphs are very close over the interval $0 \leq t \leq 1$; the maximum error is .005144 at $t = .91916$. The functions rapidly diverge above 1, with $\tan t \to \infty$ as $t \to \tfrac12 \pi$, whereas $p(\tfrac12 \pi) = 5.2882$. The first graph is on the interval $[0, 1]$ and the second on $[0, \tfrac12 \pi]$:
[two graphs]

4.4.22. The exact value is $\log_{10} e \approx .434294$.
(a) $p_2(t) = -.4259 + .48835\,t - .06245\,t^2$ and $p_2(e) = .440126$;
(b) $p_3(t) = -.4997 + .62365\,t - .13625\,t^2 + .0123\,t^3$ and $p_3(e) = .43585$.
4.4.23.
(a) q(t) = + t + t2 where
y t t (t t1 ) + y1 t2 t0 (t0 t2 ) + y2 t0 t1 (t1 t0 )
= 0 1 2 2
,
(t1 t0 )(t2 t1 )(t0 t2 )

y (t t1 ) + y1 (t0 t2 ) + y2 (t1 t0 )
y0 (t22 t21 ) + y1 (t20 t22 ) + y2 (t21 t20 )
, = 0 2
.
(t1 t0 )(t2 t1 )(t0 t2 )
(t1 t0 )(t2 t1 )(t0 t2 )

(b) The minimum is at t? =


, and m0 s1 m1 s0 = 12 (t2 t0 ), s1 s0 = (t2 t0 ).
2
y02 (t2 t1 )4 + y12 (t0 t2 )4 + y22 (t1 t0 )4
2
h
i.
=
(c) q(t? ) =
4
4 (t1 t0 )(t2 t1 )(t0 t2 ) y0 (t2 t1 ) + y1 (t0 t2 ) + y2 (t1 t0 )


4.4.24. When a < 2, the approximations are very good. At a = 2, a small amount of oscillation
is noticed at the two ends of the intervals. When a > 2, the approximations are worthless
for | x | > 2. The graphs are for n + 1 = 21 iteration points, with a = 1.5, 2, 2.5, 3:

[four graphs, one for each value of a]


Note: Choosing a large number of sample points, say n = 50, leads to an ill-conditioned
matrix, and even the small values of a exhibit poor approximation properties near the ends
of the intervals due to round-off errors when solving the linear system.
4.4.25. The conclusions are similar to those in Exercise 4.4.24, but here the critical value of a is
around 2.4. The graphs are for n + 1 = 21 iteration points, with a = 2, 2.5, 3, 4:
[four graphs, one for each value of a]


4.4.26. $x \in \ker A$ if and only if p(t) vanishes at all the sample points: $p(t_i) = 0$, $i = 1, \ldots, m$.
4.4.27.
(a) For example, the interpolating polynomial for the data (0, 0), (1, 1), (2, 2) is the straight line $y = t$.
(b) The Lagrange interpolating polynomials are zero at n of the sample points. But the only polynomial of degree $< n$ that vanishes at n points is the zero polynomial, which does not interpolate the final nonzero data value.
4.4.28.
(a) If $p(x_k) = a_0 + a_1 x_k + a_2 x_k^2 + \cdots + a_n x_k^n = 0$ for $k = 1, \ldots, n + 1$, then $V^T a = 0$ where V is the $(n + 1) \times (n + 1)$ Vandermonde matrix with entries $v_{ij} = x_j^{\,i-1}$ for $i, j = 1, \ldots, n + 1$. According to Lemma 4.12, if the sample points are distinct, then V, and hence $V^T$, is a nonsingular matrix, and hence the only solution to the homogeneous linear system is $a = 0$, which implies $p(x) \equiv 0$.
(b) This is a special case of Exercise 2.3.37.
(c) This follows from part (b); linear independence of $1, x, x^2, \ldots, x^n$ means that $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \equiv 0$ if and only if $a_0 = \cdots = a_n = 0$.
4.4.29. This follows immediately from (4.51), since the determinant of a regular matrix is the product of the pivots, i.e., the diagonal entries of U. Every factor $t_i - t_j$ appears once among the pivot entries.
4.4.30. Note that $k_{ij} = 1 + x_i x_j + (x_i x_j)^2 + \cdots + (x_i x_j)^{n-1}$ is the dot product of the $i$th and $j$th columns of the $n \times n$ Vandermonde matrix $V = V(x_1, \ldots, x_n)$, and so $K = V^T V$ is a Gram matrix. Moreover, V is nonsingular when the $x_i$'s are distinct, which proves positive definiteness.
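The argument of Exercise 4.4.30 can be checked on a small case. The following sketch makes several assumptions: the sample points are hypothetical, V is stored with column j equal to (1, x_j, ..., x_j^(n-1)), and positive definiteness is tested through positivity of the pivots obtained without row interchanges.

```python
from fractions import Fraction as F

def pivots(M):
    """Pivots of Gaussian elimination without row interchanges (assumes M is regular)."""
    M = [[F(v) for v in row] for row in M]
    out = []
    for i in range(len(M)):
        out.append(M[i][i])
        for r in range(i + 1, len(M)):
            factor = M[r][i] / M[i][i]
            M[r] = [vr - factor * vi for vr, vi in zip(M[r], M[i])]
    return out

xs = [F(-1), F(0), F(2)]                      # hypothetical distinct sample points
n = len(xs)
V = [[x ** i for x in xs] for i in range(n)]  # V[i][j] = x_j^i
K = [[sum(V[k][i] * V[k][j] for k in range(n)) for j in range(n)] for i in range(n)]  # K = V^T V
print(K, pivots(K))
```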


4.4.31.

(a) $f'(x) \approx \dfrac{f(x + h) - f(x - h)}{2\,h}$;

(b) $f''(x) \approx \dfrac{f(x + h) - 2\,f(x) + f(x - h)}{h^2}$;

(c) $f'(x) \approx \dfrac{-f(x + 2h) + 4\,f(x + h) - 3\,f(x)}{2\,h}$;

(d) $f'(x) \approx \dfrac{-f(x + 2h) + 8\,f(x + h) - 8\,f(x - h) + f(x - 2h)}{12\,h}$,

$f''(x) \approx \dfrac{-f(x + 2h) + 16\,f(x + h) - 30\,f(x) + 16\,f(x - h) - f(x - 2h)}{12\,h^2}$,

$f'''(x) \approx \dfrac{f(x + 2h) - 2\,f(x + h) + 2\,f(x - h) - f(x - 2h)}{2\,h^3}$,

$f^{(iv)}(x) \approx \dfrac{f(x + 2h) - 4\,f(x + h) + 6\,f(x) - 4\,f(x - h) + f(x - 2h)}{h^4}$.

(e) For f (x) = ex at x = 0, using single precision arithmetic, we obtain the approximations:
For h = .1:

f 0 (x) 1.00166750019844,

f 00 (x) 1.00083361116072,
f 0 (x) .99640457071210,

f 0 (x) .99999666269610,

f 00 (x) .99999888789639,

f 000 (x) 1.00250250140590,


For h = .01:

f (iv) (x) 1.00166791722567.


f 0 (x) 1.00001666675000,

f 00 (x) 1.00000833336050,
f 0 (x) .99996641549580,

f 0 (x) .99999999966665,

f 00 (x) .99999999988923,

f 000 (x) 1.00002500040157,


For h = .001:

f (iv) (x) 1.00001665913362.


f 0 (x) 1.00000016666670,

f 00 (x) 1.00000008336730,
f 0 (x) .99999966641660,

f 0 (x) .99999999999997,

f 00 (x) 1.00000000002574,

f 000 (x) 1.00000021522190,

f (iv) (x) .99969229415631.


For h = .0001:

f 0 (x) 1.00000000166730,

f 00 (x) 1.00000001191154,
f 0 (x) 0.99999999666713,

f 0 (x) 0.99999999999977,

f 00 (x) 0.99999999171286,

f 000 (x) 0.99998010138450,

f (iv) (x) 3.43719068489236.

When f (x) = tan x at x = 0, using single precision arithmetic,


For h = .1:

f 0 (x) 1.00334672085451,

f 00 (x) 3.505153914964787 1016 ,


f 0 (x) .99314326416565,

f 0 (x) .99994556862489,

f 00 (x) 4.199852359823657 1015 ,


f 000 (x) 2.04069133777138,

f (iv) (x) 1.144995345400589 1012 .

For h = .01:

f 0 (x) 1.00003333466672,

f 00 (x) 2.470246229790973 1015 ,


f 0 (x) .99993331466332,

f 0 (x) .99999999466559,

f 00 (x) 2.023971870815236 1014 ,


f 000 (x) 2.00040006801198,

f (iv) (x) 7.917000388601991 1010 .

For h = .001:

f 0 (x) 1.00000033333347,

f 00 (x) 3.497202527569243 1015 ,


f 0 (x) .99999933333147,

f 0 (x) .99999999999947,

f 00 (x) 5.042978979811304 1015 ,


f 000 (x) 2.00000400014065,

f (iv) (x) 1.010435775508413 107 .

For h = .0001:

f 0 (x) 1.00000000333333,

f 00 (x) 4.271860643001446 1013 ,


f 0 (x) .99999999333333,

f 0 (x) 1.00000000000000,

f 00 (x) 9.625811347808891 1013 ,


f 000 (x) 2.00000003874818,

f (iv) (x) 8.362027156624034 104 .



In most cases, the accuracy improves as the step size gets smaller, but not always. In
particular, at the smallest step size, the approximation to the fourth derivative gets
worse, indicating the increasing role played by round-off error.
(f ) No: if the step size is too small, round-off error caused by dividing two very small quantities ruins the approximation.
(g) If n < k, the k th derivative of the degree n interpolating polynomial is identically 0.
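Some of the difference formulas from parts (a) through (d) can be tried out as follows. This is only a sketch in ordinary double precision; the function names are hypothetical, and the test case f(x) = e^x at x = 0 with h = .1 is the one tabulated in part (e).

```python
import math

def d1_central(f, x, h):
    """Centered first derivative, formula (a)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2_central(f, x, h):
    """Centered second derivative, formula (b)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

def d1_one_sided(f, x, h):
    """One-sided first derivative, formula (c)."""
    return (-f(x + 2 * h) + 4 * f(x + h) - 3 * f(x)) / (2 * h)

def d1_five_point(f, x, h):
    """Five-point first derivative, first formula in (d)."""
    return (-f(x + 2 * h) + 8 * f(x + h) - 8 * f(x - h) + f(x - 2 * h)) / (12 * h)

h = 0.1
for rule in (d1_central, d2_central, d1_one_sided, d1_five_point):
    print(rule.__name__, rule(math.exp, 0.0, h))   # all should be close to 1
```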
4.4.32.

(a) Trapezoid Rule: $\displaystyle\int_a^b f(x)\,dx \approx \tfrac12\,(b - a)\,\bigl[\,f(x_0) + f(x_1)\,\bigr]$.

(b) Simpson's Rule: $\displaystyle\int_a^b f(x)\,dx \approx \tfrac16\,(b - a)\,\bigl[\,f(x_0) + 4\,f(x_1) + f(x_2)\,\bigr]$.

(c) Simpson's $\tfrac38$ Rule: $\displaystyle\int_a^b f(x)\,dx \approx \tfrac18\,(b - a)\,\bigl[\,f(x_0) + 3\,f(x_1) + 3\,f(x_2) + f(x_3)\,\bigr]$.

(d) Midpoint Rule: $\displaystyle\int_a^b f(x)\,dx \approx (b - a)\,f(x_0)$.

(e) Open Rule: $\displaystyle\int_a^b f(x)\,dx \approx \tfrac12\,(b - a)\,\bigl[\,f(x_0) + f(x_1)\,\bigr]$.

(f ) (i) Exact: 1.71828; Trapezoid Rule: 1.85914; Simpson's Rule: 1.71886; Simpson's $\tfrac38$ Rule: 1.71854; Midpoint Rule: 1.64872; Open Rule: 1.67167.
(ii) Exact: 2.0000; Trapezoid Rule: 0.; Simpson's Rule: 2.0944; Simpson's $\tfrac38$ Rule: 2.04052; Midpoint Rule: 3.14159; Open Rule: 2.7207.
(iii) Exact: 1.0000; Trapezoid Rule: .859141; Simpson's Rule: .996735; Simpson's $\tfrac38$ Rule: .93804; Midpoint Rule: 1.06553; Open Rule: .964339.
(iv ) Exact: 1.11145; Trapezoid Rule: 1.20711; Simpson's Rule: 1.10948; Simpson's $\tfrac38$ Rule: 1.11061; Midpoint Rule: 1.06066; Open Rule: 1.07845.
Note: For more details on numerical differentiation and integration, you are encouraged to consult a basic numerical analysis text, e.g., [ 10 ].
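The five rules can be coded in a few lines. The sketch below rests on assumptions that are not stated in the solution itself: the closed rules use equally spaced nodes including the endpoints, the midpoint rule uses the midpoint, and the open rule uses the two interior points that divide the interval into thirds. With f(x) = e^x on [0, 1] it reproduces the values listed in (f)(i).

```python
import math

def trapezoid(f, a, b):
    return (b - a) * (f(a) + f(b)) / 2

def simpson(f, a, b):
    m = (a + b) / 2
    return (b - a) * (f(a) + 4 * f(m) + f(b)) / 6

def simpson38(f, a, b):
    h = (b - a) / 3
    return (b - a) * (f(a) + 3 * f(a + h) + 3 * f(a + 2 * h) + f(b)) / 8

def midpoint(f, a, b):
    return (b - a) * f((a + b) / 2)

def open_rule(f, a, b):
    h = (b - a) / 3
    return (b - a) * (f(a + h) + f(a + 2 * h)) / 2

for rule in (trapezoid, simpson, simpson38, midpoint, open_rule):
    print(rule.__name__, round(rule(math.exp, 0.0, 1.0), 5))
```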

4.4.33. The sample matrix is $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -1 & 0 \end{pmatrix}$; the least squares solution to $A\,x = y = \begin{pmatrix} 1 \\ .5 \\ .25 \end{pmatrix}$ gives $g(t) = \tfrac38 \cos \pi t + \tfrac12 \sin \pi t$.

4.4.34. $g(t) = .9827 \cosh t - 1.0923 \sinh t$.


4.4.35. (a) $g(t) = .538642\,e^{t} - .004497\,e^{2t}$, (b) .735894.
(c) The maximal error is .745159 which occurs at $t = 3.66351$.
(d) Now the least squares approximant is $0.58165\,e^{t} - .0051466\,e^{2t} - .431624$; the least squares error has decreased to .486091, although the maximal error over the interval $[0, 4]$ has increased to 1.00743, which occurs at $t = 3.63383$!
4.4.36.
(a) 5 points: $g(t) = -4.4530 \cos t + 3.4146 \sin t = 5.6115 \cos(t - 2.4874)$;
9 points: $g(t) = -4.2284 \cos t + 3.6560 \sin t = 5.5898 \cos(t - 2.4287)$.
(b) 5 points: $g(t) = -4.9348 \cos t + 5.5780 \sin t + 4.3267 \cos 2t + 1.0220 \sin 2t = 7.4475 \cos(t - 2.2952) + 4.4458 \cos(2t - .2320)$;
9 points: $g(t) = -4.8834 \cos t + 5.2873 \sin t + 3.6962 \cos 2t + 1.0039 \sin 2t = 7.1974 \cos(t - 2.3165) + 3.8301 \cos(2t - .2652)$.
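The amplitude and phase forms above come from the identity a cos t + b sin t = r cos(t - delta) with r = sqrt(a^2 + b^2) and delta = atan2(b, a). A minimal sketch of that conversion follows; the helper name is hypothetical and the minus signs on the cosine coefficients are taken as an assumption consistent with the stated phases.

```python
import math

def amplitude_phase(a, b):
    """Return (r, delta) such that a cos t + b sin t = r cos(t - delta)."""
    return math.hypot(a, b), math.atan2(b, a)

r1, d1 = amplitude_phase(-4.4530, 3.4146)   # first harmonic, part (a), 5 points
r2, d2 = amplitude_phase(4.3267, 1.0220)    # second harmonic, part (b), 5 points
print(round(r1, 4), round(d1, 4), round(r2, 4), round(d2, 4))
```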

4.4.37.
(a) n = 1, k = 4: $p(t) = .4172 + .4540 \cos t$; maximal error: .1722; [graph]
(b) n = 2, k = 8: $p(t) = .4014 + .3917 \cos t + .1288 \cos 2t$; maximal error: .0781; [graph]
(c) n = 2, k = 16: $p(t) = .4017 + .389329 \cos t + .1278 \cos 2t$; maximal error: .0812; [graph]
(d) n = 3, k = 16: $p(t) = .4017 + .3893 \cos t + .1278 \cos 2t + .0537 \cos 3t$; maximal error: .0275; [graph]


(e) Because then, due to periodicity of the trigonometric functions, the columns of the sample matrix would be linearly dependent.
4.4.38. Since $S_k(x_j) = \begin{cases} 1, & j = k, \\ 0, & \text{otherwise}, \end{cases}$ the same as the Lagrange polynomials, the coefficients are $c_j = f(x_j)$. For each function and step size, we plot the sinc interpolant S(x) and a comparison with the graph of the function.

$f(x) = x^2$, h = .25, max error: .19078: [graphs]
$f(x) = x^2$, h = .1, max error: .160495: [graphs]
$f(x) = x^2$, h = .025, max error: .14591: [graphs]
$f(x) = \tfrac12 - \bigl|\,x - \tfrac12\,\bigr|$, h = .25, max error: .05066: [graphs]
$f(x) = \tfrac12 - \bigl|\,x - \tfrac12\,\bigr|$, h = .1, max error: .01877: [graphs]
$f(x) = \tfrac12 - \bigl|\,x - \tfrac12\,\bigr|$, h = .025, max error: .004754: [graphs]

In the first case, the error is reasonably small except near the right end of the interval. In
the second case, the approximation is much better; the larger errors occur near the corner
of the function.

4.4.39. (a) $y = .9231 + 3.7692\,t$, (b) The same interpolating parabola $y = 2 + t^2$.


Note: When interpolating, the error is zero irrespective of the weights.
4.4.40. (a) .29 + .36 t,

(b) 1.7565 + 1.6957 t,

(c) 1.2308 + 1.9444 t,

(d) 2.32 + .4143 t.

4.4.41. The weights are .3015, .1562, .0891, .2887, .2774, .1715. The weighted least squares plane has the equation $z = 4.8680 - 1.6462\,x + .2858\,y$.

4.4.42. When x = (AT C A)1 AT C y is the least squares solution,


Error = k y A x k =

4.4.43. (a)

3
7

9
14

t;

(b)

9
28

9
7

9 2
14 t ;

yT C y yT C A (AT C A)1 AT C y .

(c)

24
91

180
91

216 2
91 t

15 3
13 t .

0.5
-1

-0.5

0.5

-0.5

4.4.44.

1
8

27
20

3 2
2t ;

the maximal error if

-1
-1.5

2
5

at t = 1

-2
-2.5
-3

4.4.45. p1 (t) = .11477 + .66444 t,

p2 (t) = .024325 + 1.19575 t .33824 t2 .

4.4.46. 1.00005 + 0.99845 t + .51058 t2 + 0.13966 t3 + .069481 t4 .


4.4.47. (a) 1.875 x2 .875 x, (b) 1.9420 x2 1.0474 x + .0494, (c) 1.7857 x2 1.0714 x + .1071.
(d) The interpolating polynomial is the easiest to compute; it exactly coincides with the
function at the interpolation points; the maximal error over the interval [ 0, 1 ] is .1728 at
t = .8115. The least squares polynomial has a smaller maximal error of .1266 at t = .8018.
The L2 approximant does a better job on average across the interval, but its maximal error
of .1786 at t = 1 is comparable to the quadratic interpolant.
4.4.48. g(x) = 2 sin x.
4.4.49. Form the $n \times n$ Gram matrix with entries $k_{ij} = \langle g_i, g_j \rangle = \displaystyle\int_a^b g_i(x)\,g_j(x)\,w(x)\,dx$ and vector f with entries $f_i = \langle f, g_i \rangle = \displaystyle\int_a^b f(x)\,g_i(x)\,w(x)\,dx$. The solution to the linear system $K\,c = f$ then gives the required coefficients $c = (c_1, c_2, \ldots, c_n)^T$.
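As an illustration of this recipe, the sketch below treats a deliberately simple case chosen as an assumption: interval [0, 1], weight w(x) = 1, basis functions g1 = 1 and g2 = x, and target f(x) = x^2. Polynomials are stored as coefficient lists so the integrals are exact.

```python
from fractions import Fraction as F

def inner(p, q):
    """<p, q> = integral over [0, 1] of p(x) q(x) dx for coefficient lists (lowest degree first)."""
    return sum(F(pi * qj, i + j + 1) for i, pi in enumerate(p) for j, qj in enumerate(q))

g = [[1], [0, 1]]   # basis functions 1 and x
f = [0, 0, 1]       # f(x) = x^2

K = [[inner(gi, gj) for gj in g] for gi in g]   # Gram matrix
rhs = [inner(f, gi) for gi in g]                # entries <f, g_i>

# Solve the 2 x 2 system K c = rhs by Cramer's rule.
det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
c = [(rhs[0] * K[1][1] - K[0][1] * rhs[1]) / det,
     (K[0][0] * rhs[1] - K[1][0] * rhs[0]) / det]
print(c)   # coefficients of the best approximation c[0] + c[1] x
```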


4.4.50.
2
3
25 2
5
(i) 28
15
14 t + 14 t .10714 1.07143 t + 1.78571 t ; maximal error: 28 = .178571 at t = 1;
2
50 2
2
(ii) 72 25
14 t + 21 t .28571 1.78571 t + 2.38095 t ; maximal error: 7 = .285714 at t = 0;
(iii) .0809 .90361 t + 1.61216 t2 ; maximal error: .210524 at t = 1. Case (i) is the best.
4.4.51.

a dx
1

1 2

(a)
=
=
tan
a
x
= .
4
2

1 + a x
a
a
x =

(b) The maximum value of fa (x) occurs at x = 0, where fa (0) = a = k fa k


.
s

k fa k22

, which
a
is small when a is large. But fa (x) has a large maximum value and so is very far from
zero near x = 0. Note that fa (x) 0 for all x 6= 0 as a , but fa (0) .

(c) The least squares error between fa (x) and the zero function is k fa k2 =

4.4.52. (a) z = x + y

1
3

(b) z =

9
10 (x y),

4.4.53. p(x, y) 0.


(c) z = 4/ 2 a constant function.

Solutions Chapter 5

5.1.1. (a) Orthogonal basis; (b) orthonormal basis; (c) not a basis; (d) basis; (e) orthogonal
basis; (f ) orthonormal basis.
5.1.2. (a) Basis; (b) orthonormal basis; (c) not a basis.
5.1.3. (a) Basis; (b) basis; (c) not a basis; (d) orthogonal basis; (e) orthonormal basis; (f ) basis.
5.1.4. h e1 , e2 i = h e1 , e3 i = h e2 , e3 i = 0. {e1 ,
5.1.5. (a) a = 1. (b) a =
5.1.6. a = 2 b > 0.
5.1.7. (a) a =
5.1.8.

1
2

3
2

x 1 y1 +

2
3

, (c) a =

1
2

3
2

e2 ,

1
3

e3 } is an orthonormal basis.

b > 0; (b) no possible values because they cannot be negative!


1
8

x 2 y2 .

1
0
0
B C
B C
C
5.1.9. False. Consider the basis v1 = B
@ 1 A, v2 = @ 1 A, v3 = @ 0 A. Under the weighted inner
0
0
1
product, h v1 , v2 i = b > 0, since the coefficients of a, b, c appearing in the inner product
must be strictly positive.
5.1.10.
(a) By direct computation: $u \cdot v = u \cdot w = 0$.
(b) First, if $w = c\,v$, then we compute $v \times w = 0$. Conversely, suppose $v \neq 0$ (otherwise the result is trivial) and, in particular, $v_1 \neq 0$. Then $v \times w = 0$ implies $w_i = c\,v_i$, $i = 1, 2, 3$, where $c = w_1/v_1$. The other cases when $v_2 \neq 0$ and $v_3 \neq 0$ are handled in a similar fashion.
(c) If v, w are orthogonal and nonzero, then by Proposition 5.4, they are linearly independent, and so, by part (b), $u = v \times w$ is nonzero, and, by part (a), orthogonal to both. Thus, the three vectors are nonzero, mutually orthogonal, and so form an orthogonal basis of $R^3$.
(d) Yes. In general, $\|v \times w\| = \|v\|\,\|w\|\,|\sin \theta|$ where $\theta$ is the angle between them, and so when v, w are orthogonal unit vectors, $\|v \times w\| = \|v\|\,\|w\| = 1$. This can also be shown by direct computation of $\|v \times w\|$ using orthogonality.
5.1.11. See Example 5.20.

5.1.12. We repeatedly use the identity sin2 + cos2 = 1 to simplify

h u1 , u2 i = cos sin sin2 + ( cos cos sin cos sin )(cos cos cos sin sin )
+(cos sin + cos cos sin )(cos cos cos sin sin ) = 0.
By similar computations, h u1 , u3 i = h u2 , u3 i = 0, h u1 , u1 i = h u2 , u2 i = h u3 , u3 i = 1.

5.1.13.
(a) The (i, j) entry of $A^T K A$ is $v_i^T K v_j = \langle v_i, v_j \rangle$. Thus, $A^T K A = I$ if and only if $\langle v_i, v_j \rangle = \begin{cases} 1, & i = j, \\ 0, & i \neq j, \end{cases}$ and so the vectors form an orthonormal basis.
(b) According to part (a), orthonormality requires $A^T K A = I$, and so $K = A^{-T} A^{-1} = (A\,A^T)^{-1}$ is the Gram matrix for $A^{-1}$, and $K > 0$ since $A^{-1}$ is nonsingular. This also proves the uniqueness of the inner product.
(c) $A = \begin{pmatrix} 1 & 2 \\ 1 & 3 \end{pmatrix}$, $K = \begin{pmatrix} 10 & -7 \\ -7 & 5 \end{pmatrix}$, with inner product
$\langle v, w \rangle = v^T K w = 10\,v_1 w_1 - 7\,v_1 w_2 - 7\,v_2 w_1 + 5\,v_2 w_2$;
(d) $A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 2 \\ 1 & 2 & 3 \end{pmatrix}$, $K = \begin{pmatrix} 3 & -2 & 0 \\ -2 & 6 & -3 \\ 0 & -3 & 2 \end{pmatrix}$, with inner product
$\langle v, w \rangle = v^T K w = 3\,v_1 w_1 - 2\,v_1 w_2 - 2\,v_2 w_1 + 6\,v_2 w_2 - 3\,v_2 w_3 - 3\,v_3 w_2 + 2\,v_3 w_3$.
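The formula K = (A A^T)^{-1} from part (b) can be checked on the 2 x 2 case of part (c). The sketch assumes the basis vectors are the columns of A, namely (1, 1) and (2, 3); the helper names are hypothetical.

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse2(M):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[F(1), F(2)], [F(1), F(3)]]                 # columns are the basis vectors
K = inverse2(matmul(A, transpose(A)))            # K = (A A^T)^{-1}
gram = matmul(matmul(transpose(A), K), A)        # A^T K A, the matrix of inner products
print(K, gram)
```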
5.1.14. One way to solve this is by direct computation. A more sophisticated approach is to
apply the Cholesky factorization (3.70) to the inner product matrix: K = M M T . Then,
bT w
b where v
b = M v, w
b = M w. Therefore, v , v form an orthonorh v , w i = vT K w = v
1 2
T
b = Mv , v
b = M v , form an
mal basis relative to h v , w i = v K w if and only if v
1
1
2
2
orthonormal basis for the dot product, and hence of the form determined in Exercise 5.1.11.
Using this we find:
!
!
!
cos
sin
1 0
, so v1 = 1 sin , v2 = 1 cos , for any 0 < 2 .
(a) M =
2
0
2

(b) M =

1
0

1
, so v1 =
1

cos + sin
, v2 =
sin

sin + cos
, for any 0 < 2 .
cos

5.1.15. $\|v + w\|^2 = \langle v + w, v + w \rangle = \langle v, v \rangle + 2\,\langle v, w \rangle + \langle w, w \rangle = \|v\|^2 + \|w\|^2$ if and only if $\langle v, w \rangle = 0$. The vector $v + w$ is the hypotenuse of the right triangle with sides v, w.
5.1.16. $\langle v_1 + v_2, v_1 - v_2 \rangle = \|v_1\|^2 - \|v_2\|^2 = 0$ by assumption. Moreover, since $v_1, v_2$ are linearly independent, neither $v_1 - v_2$ nor $v_1 + v_2$ is zero, and hence Theorem 5.5 implies that they form an orthogonal basis for the two-dimensional vector space V.
5.1.17. By orthogonality, the Gram matrix is a $k \times k$ diagonal matrix whose diagonal entries are $\|v_1\|^2, \ldots, \|v_k\|^2$. Since these are all nonzero, the Gram matrix is nonsingular. An alternative proof combines Propositions 3.30 and 5.4.
5.1.18.
(a) Bilinearity: for a, b constant,
$$\langle a\,p + b\,\tilde p, q \rangle = \int_0^1 t\,\bigl(a\,p(t) + b\,\tilde p(t)\bigr)\,q(t)\,dt = a \int_0^1 t\,p(t)\,q(t)\,dt + b \int_0^1 t\,\tilde p(t)\,q(t)\,dt = a\,\langle p, q \rangle + b\,\langle \tilde p, q \rangle.$$
The second bilinearity condition $\langle p, a\,q + b\,\tilde q \rangle = a\,\langle p, q \rangle + b\,\langle p, \tilde q \rangle$ follows similarly, or is a consequence of symmetry, as in Exercise 3.1.9.
Symmetry: $\langle q, p \rangle = \displaystyle\int_0^1 t\,q(t)\,p(t)\,dt = \int_0^1 t\,p(t)\,q(t)\,dt = \langle p, q \rangle$.
Positivity: $\langle p, p \rangle = \displaystyle\int_0^1 t\,p(t)^2\,dt \geq 0$, since $t \geq 0$ and $p(t)^2 \geq 0$ for all $0 \leq t \leq 1$. Moreover, since p(t) is continuous, so is $t\,p(t)^2$. Therefore, the integral can equal 0 if and only if $t\,p(t)^2 \equiv 0$ for all $0 \leq t \leq 1$, and hence $p(t) \equiv 0$.
(b) $p(t) = c\,\bigl(1 - \tfrac32\,t\bigr)$ for any c.
(c) $p_1(t) = \sqrt2$, $p_2(t) = 4 - 6\,t$;
(d) $p_1(t) = \sqrt2$, $p_2(t) = 4 - 6\,t$, $p_3(t) = \sqrt6\,(3 - 12\,t + 10\,t^2)$.


5.1.19. Since $\displaystyle\int_{-\pi}^{\pi} \sin x \cos x\,dx = 0$, the functions $\cos x$ and $\sin x$ are orthogonal under the $L^2$ inner product on $[-\pi, \pi]$. Moreover, they span the solution space of the differential equation, and hence, by Theorem 5.5, form an orthogonal basis.
5.1.20. They form a basis, but not an orthogonal basis since $\langle e^{x/2}, e^{-x/2} \rangle = \displaystyle\int_0^1 e^{x/2}\,e^{-x/2}\,dx = 1$. An orthogonal basis is $e^{x/2}$, $e^{-x/2} - \dfrac{e^{x/2}}{e - 1}$.

5.1.21.
(a) We compute h v1 , v2 i = h v1 , v3 i = h v2 , v3 i = 0 and k v1 k = k v2 k = k v3 k = 1.
T
37
7
11
37
(b) h v , v1 i = 75 , h v , v2 i = 11
13 , and h v , v3 i = 65 , and so ( 1, 1, 1 ) = 5 v1 + 13 v2 65 v3 .
(c)

7
5

11
13

37
+ 65

= 3 = k v k2 .

5.1.22.
(a) By direct computation: $v_1 \cdot v_2 = 0$, $v_1 \cdot v_3 = 0$, $v_2 \cdot v_3 = 0$.
v2 v
v3 v
v v
6
3
1
1
=
= ,
= .
(b) v = 2v1 21 v2 + 12 v3 , since 1 2 = = 2,
k v1 k
3
k v 2 k2
6
2 k v 3 k2
2
!2
!2
!2
!2
!2
!2
v1 v
v2 v
v3 v
6
3
1
(c)
+
+
=
+
+
= 14 = k v k2 .
k v1 k
k v2 k
k v3 k
3
6
2
(d) The orthonormal basis is u1 =

(e) v = 2

0 1

B 3
B 1
B
B 3
@
1
3

1
3
3 u1 u2 + u3 and
6
2

C
C
C,
C
A

u2 =

2
3 +

B
B
B
B
@

5.1.23. (a) Because h v1 , v2 i = v1T K v2 = 0. (b) v =


(c)

v1 v
k v1 k

(d) u1 =

!2

1 , 1
3
3

v2 v
k v2 k
T

!2

, u2 =

7
3

2 ,

1
15
15

(e) v = h v , u1 i u1 + h v , u2 i u2 =

7
3

5
15

u1

5.1.24. Consider the non-orthogonal basis v1 =


but k v k2 = 1 6= 12 + (1)2 .

t dt = 0,

h P 0 , P2 i =

B
B
B
B
@

!2

2
1
2

C
C
C.
C
A

= 14 = k v k2 .

h v , v1 i
h v , v2 i
v1 +
v =
2
k v1 k
k v 2 k2 2

7
3

v1

1
3

v2 .

= 18 = k v k2 .

u2 ; k v k2 = 18 =
!

1
, v2 =
1

Z 1
t2
1


u3 =

15
3
!

h 1 , p2 i
h 1 , p3 i
h 1 , p1 i
= 1,
= 0,
= 0,,
k p1 k2
k p2 k2
k p3 k2
h x , p1 i
h x , p3 i
1 h x , p2 i
= ,
= 1,
= 0,,
2
2
k p1 k
2
k p2 k
k p3 k2
h x 2 , p1 i
h x 2 , p3 i
1 h x 2 , p2 i
=
,
=
1,
= 1,,
k p1 k2
3
k p2 k2
k p3 k2
1

5.1.25.

5.1.26.
Z
(a) h P0 , P1 i =

1
1
6C
C
1 C,
6C
A
2

6
!2

7
3

0
. We have v =
1

1
0

15
3

= v1 v2 ,

so 1 = p1 (x) + 0 p2 (x) + 0 p3 (x) = p1 (x).


so x =

1
2 p1 (x) + p2 (x).

so x2 =

1
3

1
3 p1 (x) + p2 (x) + p3 (x).

dt = 0,

h P 0 , P3 i =
h P 1 , P3 i =
(b)
(c)

1 ,
2
3

3
2

t,

h t , P0 i
=
k P 0 k2

Z 1
Z 1

t3 35 t dt = 0, h P1 , P2 i =
t t2 31 dt = 0,
1
1
Z 1
Z 1

3
3
t2 31
t t 5 t dt = 0, h P2 , P3 i =
t3 53 t dt = 0.
1
1
r

r
1
3
5 3 2
7 5 3
2 2t 2 ,
2 2t 2t ,
3
h t , P1 i
h t 3 , P3 i
3 h t 3 , P2 i
0,
=
,
=
0,
= 1, so t3 = 35 P1 (t) + P3 (t).
k P 1 k2
5
k P 2 k2
k P 3 k2

5.1.27.
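(a) $\langle P_0, P_1 \rangle = \int_0^1 \bigl(t - \tfrac23\bigr)\, t \, dt = 0$, $\quad \langle P_0, P_2 \rangle = \int_0^1 \bigl(t^2 - \tfrac65 t + \tfrac3{10}\bigr)\, t \, dt = 0$,
$\langle P_1, P_2 \rangle = \int_0^1 \bigl(t - \tfrac23\bigr)\bigl(t^2 - \tfrac65 t + \tfrac3{10}\bigr)\, t \, dt = 0$.
(b) $\sqrt2$, $\; 6\,t - 4$, $\; \sqrt6\,\bigl(3 - 12\,t + 10\,t^2\bigr)$.
(c) $\dfrac{\langle t^2, P_0 \rangle}{\| P_0 \|^2} = \dfrac12$, $\; \dfrac{\langle t^2, P_1 \rangle}{\| P_1 \|^2} = \dfrac65$, $\; \dfrac{\langle t^2, P_2 \rangle}{\| P_2 \|^2} = 1$, so $t^2 = \tfrac12\, P_0(t) + \tfrac65\, P_1(t) + P_2(t)$.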

5.1.28. (a) $\cos^2 x = \tfrac12 + \tfrac12 \cos 2x$, (b) $\cos x \sin x = \tfrac12 \sin 2x$, (c) $\sin^3 x = \tfrac34 \sin x - \tfrac14 \sin 3x$,
(d) $\cos^2 x \sin^3 x = \tfrac18 \sin x + \tfrac1{16} \sin 3x - \tfrac1{16} \sin 5x$, (e) $\cos^4 x = \tfrac38 + \tfrac12 \cos 2x + \tfrac18 \cos 4x$.
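5.1.29. $\dfrac{1}{\sqrt{2\pi}}, \ \dfrac{\cos x}{\sqrt\pi}, \ \dfrac{\sin x}{\sqrt\pi}, \ \ldots, \ \dfrac{\cos nx}{\sqrt\pi}, \ \dfrac{\sin nx}{\sqrt\pi}$.

5.1.30. $\langle e^{ikx}, e^{ilx} \rangle = \dfrac{1}{2\pi} \int_{-\pi}^{\pi} e^{ikx}\, \overline{e^{ilx}}\, dx = \dfrac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(k-l)x}\, dx = \begin{cases} 1, & k = l, \\ 0, & k \ne l. \end{cases}$

5.1.31.
$\int_{-\pi}^{\pi} \cos kx \cos lx \, dx = \int_{-\pi}^{\pi} \tfrac12 \bigl[\cos(k-l)x + \cos(k+l)x\bigr] dx = \begin{cases} 0, & k \ne l, \\ 2\pi, & k = l = 0, \\ \pi, & k = l \ne 0, \end{cases}$

$\int_{-\pi}^{\pi} \sin kx \sin lx \, dx = \int_{-\pi}^{\pi} \tfrac12 \bigl[\cos(k-l)x - \cos(k+l)x\bigr] dx = \begin{cases} 0, & k \ne l, \\ \pi, & k = l \ne 0, \end{cases}$

$\int_{-\pi}^{\pi} \cos kx \sin lx \, dx = \int_{-\pi}^{\pi} \tfrac12 \bigl[\sin(k+l)x - \sin(k-l)x\bigr] dx = 0.$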

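5.1.32. Given $v = a_1 v_1 + \cdots + a_n v_n$, we have

$\langle v, v_i \rangle = \langle a_1 v_1 + \cdots + a_n v_n, v_i \rangle = a_i \| v_i \|^2$,

since, by orthogonality, $\langle v_j, v_i \rangle = 0$ for all $j \ne i$. This proves (5.7). Then, to prove (5.8),
$\| v \|^2 = \langle a_1 v_1 + \cdots + a_n v_n, a_1 v_1 + \cdots + a_n v_n \rangle = \sum_{i,j=1}^n a_i a_j \langle v_i, v_j \rangle = \sum_{i=1}^n a_i^2 \| v_i \|^2 = \sum_{i=1}^n \left(\dfrac{\langle v, v_i \rangle}{\| v_i \|^2}\right)^2 \| v_i \|^2 = \sum_{i=1}^n \left(\dfrac{\langle v, v_i \rangle}{\| v_i \|}\right)^2.$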

5.2.1.
(a)
(b)

1
2
1
2

a2i

k vi k =

( 1, 0, 1 )T , ( 0, 1, 0 )T ,
( 1, 1, 0 )T ,

1
6

1
2

n
X

i=1

h v , vi i
k v i k2

( 1, 0, 1 )T ;

( 1, 1, 2 )T ,

1
3

!2

k vi k =

( 1, 1, 1 )T ;

n
X

i=1

n
X

i,j = 1

ai aj h v i , v j i

h v , vi i
k vi k

!2

(c)
5.2.2.
(a)
(b)

1
14

( 1, 2, 3 )T ,

1 , 0, 1 , 0
2
2
1 , 0, 0, 1
2
2

1
3

,
,

( 1, 1, 1 )T ,

1 , 0, 1
2
2

2 1
2 T
3 , 3 , 0, 3

0,

( 5, 4, 1 )T .

1
42
T

1 1
1 1
2, 2,2, 2

, ( 0, 0, 1, 0 ) ,

12 , 12 , 12 ,

1
2

1
4
1

,
, 0,
3 2 3 2
3 2

5.2.3. The first two Gram-Schmidt vectors are legitimate, v1 = ( 1, 1, 0, 1 )T , v2 = ( 1, 0, 1, 1 )T , but then v3 = 0, and the algorithm breaks down. The reason is that the given vectors are linearly dependent, and do not, in fact, form a basis.
5.2.4.

1 T
2
, ( 1, 0, 0 )T .
0, ,
5
5

T
(b) Starting with the basis 12 , 1, 0
, ( 1, 0, 1 )T , the GramSchmidt process produces
(a)

T
T
4
5
2
1 , 2 , 0
,

,
,
.
5
5
3 5 3 5 3 5
T
T
Starting with the basis ( 1, 1, 0 ) , ( 3, 0, 1 ) , the GramSchmidt

T
T

3 , 3 , 2
.
,
orthonormal basis 1 , 1 , 0
2
2
22
22
22

the orthonormal basis

(c)

process produces the

5.2.5. ( 1, 1, 1, 1, 1 )T , ( 1, 0, 1, 1, 1 )T , ( 1, 0, 1, 1, 1 )T .
5.2.6.
(a)

1
3

( 1, 1, 1, 0 )T ,

1
15

( 1, 2, 1, 3 )T ,

1
15

( 3, 1, 2, 1 )T .

(b) Solving the homogeneous system we obtain the kernel basis ( 1, 2, 1, 0 ) T , ( 1, 1, 0, 1 )T .


The Gram-Schmidt process gives the orthonormal basis 1 ( 1, 2, 1, 0 )T , 1 ( 1, 0, 1, 2 )T .
6

(c) Applying Gram-Schmidt to the corange basis ( 2, 1, 0, 1 )T , 0, 21 , 1,


orthonormal basis

1
6

( 2, 1, 0, 1 )T ,

1
6

( 0, 1, 2, 1 )T .

1
2

, gives the

(d) Applying Gram-Schmidt to the range basis ( 1, 2, 0, 2 )T , ( 2, 1, 1, 5 )T , gives the or1


( 8, 7, 3, 11 )T .
thonormal basis 31 ( 1, 2, 0, 2 )T ,
9 3

(e) Applying Gram-Schmidt to the cokernel basis


orthonormal basis

1
14

( 2, 1, 3, 0 )T ,

1
9 42

T
2
1
3 , 3 , 1, 0
T

, ( 4, 3, 0, 1 )T , gives the

( 34, 31, 33, 14 ) .

(f ) Applying Gram-Schmidt to the basis ( 1, 1, 0, 0 )T , ( 1, 0, 1, 0 )T , ( 1, 0, 0, 1 )T , gives


1
the orthonormal basis 1 ( 1, 1, 0, 0 )T , 1 ( 1, 1, 2, 0 )T ,
( 1, 1, 1, 3 )T .
2

5.2.7.

2 3
!

1
1
1
1
1
1
(a) Range:
; kernel:
; corange:
;
3
1
1
100
1
0 1 2
0
1 2
1
1 B1C
1 B 2C
1
C
(b) Range: B
@ 1 A, @ 1 A; kernel: @ 1 A;
2
6
0 1 6 20 1
10
0
1
1
2
1
1
1 B C
1 B
C
C
corange: B
@ 0 A,
@ 5 A; cokernel: @ 1 A.
5
30 1
3
2
1


1
cokernel:
10

3
.
1

1
1
1
C
(c) Range: B
@ 1 A,
3 1
42

1
1
B C
@ 4 A,
14
5

3
B
C
@ 2 A;
1

kernel:

1
2

B
C
B 1 C
B
C;
@ 1A

1
0 1
0
1
1
1
0
1
B
C
C
1 B
1 B0C
B1C 1 B 1C
C
B
B C, B C,
C; the cokernel is {0}, so there is no basis.
corange: B
2 @1A
2 @0A 2 @ 1A
01
1 1
1
0
0
0
1
1
1
4
B
C
C
1 B
1
1
0
1
B
C
B
C
C
B
C, B
C; kernel:
(d) Range: B
@ 1 A;
3 @ 1 A
3 @ 0A
21
2
1
1
0 1
0
1
0 1
0
1
1
0
1
1
B
C
B
1
1 B
1 B 1 C 1 B 1 C
C
C
C
C.
corange: B
@ 2 A,
@ 2 A; cokernel: B C, B
@1A
@ 1A
6 1
14
3
3
3
0
1
0

5.2.8.
(i) (a)

(b)
(c)
(ii) (a)
(b)
(c)

T
1
1
1 ( 0, 1, 0 )T ,
( 1, 0, 3 )T ;
2 ( 1, 0, 1 ) ,
2
2 3
1 ( 1, 1, 0 )T , 1 ( 2, 3, 5 )T , 1 ( 2, 3, 6 )T ;
5
55
66
1

( 1, 2, 3 )T , 1 ( 4, 3, 8 )T , 1 ( 5, 6, 3 )T .
2 5
130
2 39

1
1
1
1 T
1 T
1 T
,
,

;
,
0,
,
1,
,
0,
2
2
2
2
2
2
T

T
T
1 , 1 , 0
, 12 , 0, 12
, 0, 1 , 1
;
2
2
2
2
T
T
T
1

( 1, 2, 3 ) , 1 ( 4, 5, 0 ) , 1 ( 2, 1, 0 ) .
2 3
42
14

5.2.9. Applying the GramSchmidt process to the standard basis vectors e1 , e2 gives
1
1 0
1
0
1 0
0 1 0 1 1
0

1
1
1
1
0
C
B
B
10
(a) @ 3 A, @ 1 A; (b) @ 2 A, @ 2 2 3 A; (c) @ 2 A, @ 2 C
A.

0
0
0
5
3
5

5.2.10. Applying the GramSchmidt


process
to the standard basis vectors e 1 , e2 , e3 gives
0
1
(a)

1
1
2
B C
B C,
@0A
0

B2 2
B
B 1
B
2
@

C
C
C,
C
A

B2 6
B
B 1
B
B 6
@
2
3

C
C
C
C;
C
A

(b)

C
1 B C
1 B
1 B
@ 0 A,
@ 3 A,
@
3
33
4 22

2
5C
A.
11

5.2.11. (a) 2, namely $\pm 1$; (b) infinitely many; (c) no.


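5.2.12. The key is to make sure the inner products are in the correct order, as otherwise complex conjugates appear on the scalars. By induction, assume that we already know $\langle v_i, v_j \rangle = 0$ for $i \ne j \le k - 1$. Then, given $v_k = w_k - \sum_{j=1}^{k-1} \dfrac{\langle w_k, v_j \rangle}{\| v_j \|^2}\, v_j$, for any $i < k$,

$\langle v_k, v_i \rangle = \Bigl\langle w_k - \sum_{j=1}^{k-1} \dfrac{\langle w_k, v_j \rangle}{\| v_j \|^2}\, v_j \,,\, v_i \Bigr\rangle = \langle w_k, v_i \rangle - \sum_{j=1}^{k-1} \dfrac{\langle w_k, v_j \rangle}{\| v_j \|^2}\, \langle v_j, v_i \rangle = \langle w_k, v_i \rangle - \dfrac{\langle w_k, v_i \rangle}{\| v_i \|^2}\, \langle v_i, v_i \rangle = 0,$

completing the induction step.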
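The role of the ordering in Exercise 5.2.12 can be checked numerically. The following is a minimal sketch, assuming the Hermitian inner product $\langle u, v \rangle = \sum u_i \overline{v_i}$ (linear in the first slot); the function names and the sample basis are illustrative and not taken from the text.

```python
def herm(u, v):
    """Hermitian inner product <u, v> = sum u_i * conj(v_i), linear in the first slot."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def gram_schmidt(ws, swapped=False):
    """Orthogonalize ws via v_k = w_k - sum_j <w_k, v_j>/||v_j||^2 v_j.

    With swapped=True the coefficient uses <v_j, w_k> instead, which puts a
    complex conjugate on the scalar and is expected to break orthogonality.
    """
    vs = []
    for w in ws:
        v = [complex(a) for a in w]
        for u in vs:
            num = herm(u, w) if swapped else herm(w, u)
            c = num / herm(u, u).real
            v = [a - c * b for a, b in zip(v, u)]
        vs.append(v)
    return vs

def max_offdiag(vs):
    """Largest |<v_i, v_j>| over i != j; zero for an orthogonal set."""
    n = len(vs)
    return max(abs(herm(vs[i], vs[j])) for i in range(n) for j in range(n) if i != j)

# Hypothetical sample basis of C^3 (chosen only for illustration).
basis = [[1 + 1j, 1, 0], [1j, 1, 1 - 1j], [0, 1j, 2]]
good = max_offdiag(gram_schmidt(basis))
bad = max_offdiag(gram_schmidt(basis, swapped=True))
print(good, bad)
```

With the correct order the off-diagonal inner products vanish up to rounding, while the swapped order leaves a visibly nonzero value.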

!T

1+ i 1 i
5.2.13. (a)
,
2
2
1+ i 1 i 2 i
,
,
(b)
3
3
3
5.2.14.

(b)
(c)

1
3

i,

1
1 1
2,2, 2

!T

3 i 1 2 i 1 + 3 i
,
,
5
5
5

!T

T
1 + 2 i 3 i 3 i

;
, ,
2 6
2 6 2 6
1
( 6 + 2 i , 5 5 i , 9 )T ;

T
1 i 1
, ,0
,
3
3
( 1 2 i , 2, 0 )T ,

21

!T

(a)

3 i 1 + 3i T
,
;
2 5 2 5
2 + 9 i 9 7 i 1 + 3 i
,
,
15
15
15

3 19

21 ,

1
2

i , 12 ,

1
2

21 i ,

1
2

i , 0,

1
2

1
2

5.2.15. False. Any example that starts with a non-orthogonal basis will confirm this.
5.2.16. According to Exercise 2.4.24, we can find a basis of $\mathbb{R}^n$ of the form $u_1, \ldots, u_m, v_{m+1}, \ldots, v_n$. When we apply the Gram-Schmidt process to this basis in the indicated order, it will not alter the orthonormal vectors $u_1, \ldots, u_m$, and so the result is the desired orthonormal basis. Note also that none of the orthonormal basis vectors $u_{m+1}, \ldots, u_n$ belongs to $V$, as otherwise it would be in the span of $u_1, \ldots, u_m$, and so the collection would not be linearly independent.

0.
C
5.2.17. (a) B
@ .7071 A,
.7071
0

.5164
B .2582 C
B
C
C
B .7746 C,
(c) B
B
C
@ .2582 A
0.

.8165
B
C
@ .4082 A,
.4082

.2189
B .5200 C
B
C
B
C
B .4926 C,
B
C
@ .5200 A
.4105

5.2.18. Same solutions.

.57735

.2582
.5164 C
C
C,
(b)
.2582 A
0.
.7746
0
1 0
1 0
1
0.
.6325
.1291
B .7071 C B .3162 C B .3873 C
B
C B
C B
C
C B
C B
C
B 0. C, B .6325 C, B .5164 C,
(d) B
B
C B
C B
C
@ .7071 A @ .3162 A @ .3873 A
0.
0.
.6455

.57735
B
C
@ .57735 A;
.57735
1

.2529
B .5454 C
B
C
B
C
B .2380 C;
B
C
@ .3372 A
.6843

B
C
B .57735 C
B
C,
@ .57735 A

B
B
B
@

.7746

B
C
B .2582 C
B
C;
@ .5164 A

.2582
1
.57735
B
0. C
B
C
B
C
B .57735 C.
B
C
@
0. A
.57735
0

5.2.19. See previous solutions.


5.2.20. Clearly, each $u_j = w_j^{(j)} / \| w_j^{(j)} \|$ is a unit vector. We show by induction on $k$ and then on $j$ that, for each $2 \le j \le k$, the vector $w_k^{(j)}$ is orthogonal to $u_1, \ldots, u_{j-1}$, which will imply that $u_k = w_k^{(k)} / \| w_k^{(k)} \|$ is also orthogonal to $u_1, \ldots, u_{k-1}$; this will establish the result. Indeed, by the formulas,

$\langle w_k^{(2)}, u_1 \rangle = \langle w_k, u_1 \rangle - \langle w_k, u_1 \rangle \langle u_1, u_1 \rangle = 0.$

Further, for $i < j < k$,

$\langle w_k^{(j+1)}, u_i \rangle = \langle w_k^{(j)}, u_i \rangle - \langle w_k^{(j)}, u_j \rangle \langle u_j, u_i \rangle = 0,$

since, by the induction hypothesis, both $\langle w_k^{(j)}, u_i \rangle = 0$ and $\langle u_j, u_i \rangle = \dfrac{\langle w_j^{(j)}, u_i \rangle}{\| w_j^{(j)} \|} = 0$. Finally,

$\langle w_k^{(j+1)}, u_j \rangle = \langle w_k^{(j)}, u_j \rangle - \langle w_k^{(j)}, u_j \rangle \langle u_j, u_j \rangle = 0,$

since $u_j$ is a unit vector. This completes the induction step, and the result follows.
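The recursion used in this argument, $w_k^{(j+1)} = w_k^{(j)} - \langle w_k^{(j)}, u_j \rangle\, u_j$ with $u_j = w_j^{(j)} / \| w_j^{(j)} \|$, can be sketched as follows. This is an illustration under the assumption of the ordinary dot product on $\mathbb{R}^n$; the function name and test vectors are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def stable_gram_schmidt(ws):
    """Orthonormalize ws using w_k^(j+1) = w_k^(j) - <w_k^(j), u_j> u_j."""
    w = [[float(a) for a in x] for x in ws]   # w[k] holds the current w_k^(j)
    us = []
    for j in range(len(w)):
        norm = math.sqrt(dot(w[j], w[j]))
        u = [a / norm for a in w[j]]          # u_j = w_j^(j) / ||w_j^(j)||
        us.append(u)
        for k in range(j + 1, len(w)):        # strip the u_j component from later vectors
            c = dot(w[k], u)
            w[k] = [a - c * b for a, b in zip(w[k], u)]
    return us

# Hypothetical linearly independent vectors, for illustration only.
us = stable_gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
gram = [[dot(a, b) for b in us] for a in us]   # should be close to the identity matrix
print(gram)
```

Each later vector is updated as soon as a new $u_j$ is available, which is exactly the step the induction above tracks.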
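5.2.21. Since $u_1, \ldots, u_n$ form an orthonormal basis, if $j < i$,

$\langle w_k^{(j+1)}, u_i \rangle = \langle w_k^{(j)}, u_i \rangle - \langle w_k^{(j)}, u_j \rangle \langle u_j, u_i \rangle = \langle w_k^{(j)}, u_i \rangle,$

and hence, by induction, $r_{ik} = \langle w_k, u_i \rangle = \langle w_k^{(i)}, u_i \rangle$. Furthermore,

$\| w_k^{(j+1)} \|^2 = \| w_k^{(j)} \|^2 - \langle w_k^{(j)}, u_j \rangle^2 = \| w_k^{(j)} \|^2 - r_{jk}^2,$

and so, by (5.5), $\| w_i^{(i)} \|^2 = \| w_i \|^2 - r_{1i}^2 - \cdots - r_{i-1,i}^2 = r_{ii}^2$.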

5.3.1. (a) Neither; (b) proper orthogonal; (c) orthogonal; (d) proper orthogonal; (e) neither;
(f ) proper orthogonal; (g) orthogonal.
5.3.2.
(a) By direct computation
RT R = 1
I , QT Q = I ; 0
1
0
cos 0 sin
cos sin 0
B
C
(b) Both R Q = B
0
0
1C
@
A, and Q R = @ sin 0 cos A
0
1
0
sin cos 0
satisfy (R Q)T (R Q) = I = (Q R)T (Q R);
(c) Q is proper orthogonal, while R, R Q and Q R are all improper.
5.3.3.
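(a) True: Using the formula (5.31) for an improper $2 \times 2$ orthogonal matrix,
$\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$
(b) False: For example, $\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & -1 \end{pmatrix}^2 \ne I$ for $\theta \ne 0, \pi$.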
5.3.4.
(a) By direct computation using sin2 + cos2 = 1, we find QT Q = I and det Q = +1.
(b)
0
1
cos cos cos sin sin cos sin cos sin cos
sin sin
C
Q1 = QT = B
@ sin cos + cos cos sin sin sin + cos cos cos sin cos A.
sin sin
sin cos
cos
5.3.5.
(a) By a long direct computation, we find $Q^T Q = (y_1^2 + y_2^2 + y_3^2 + y_4^2)^2\, I$ and
$\det Q = (y_1^2 + y_2^2 + y_3^2 + y_4^2)^3 = 1$.
(b) Q1

y12 + y22 y32 y42


B
= QT = B
@ 2 (y2 y3 + y1 y4 )
2 (y2 y4 y1 y3 )

2 (y2 y3 y1 y4 )
y12 y22 + y32 y42
2 (y3 y4 + y1 y2 )

2 (y2 y4 + y1 y3 )
C
2 (y3 y4 y1 y2 ) C
A;
2
2
2
2
y1 y2 y3 + y4

(c) These follow by direct computation using standard trigonometric identities, e.g., the
(1, 1) entry is
y12 + y22 y32 y42

+
cos2 + cos2
sin2 sin2
sin2 sin2
cos2
= cos2
2
2
2
2
2
2
2
2
2
2
+ + cos( ) sin
= cos( + ) cos
2
2
!
!
2
2
2
2
= cos cos cos
+ sin
sin sin cos
sin
2
2
2
2
= cos cos cos sin sin .
5.3.6. Since the rows of $Q$ are orthonormal (see Exercise 5.3.8), so are the rows of $R$, and hence $R$ is also an orthogonal matrix. Moreover, interchanging two rows changes the sign of the determinant, and so if $\det Q = +1$, then $\det R = -1$.
5.3.7. In general, det(Q1 Q2 ) = det Q1 det Q2 . If both determinants are +1, so is their product.
Improper times proper is improper, while improper times improper is proper.
5.3.8.
(a) Use (5.30) to show $(Q^T)^{-1} = Q = (Q^T)^T$.
(b) The rows of $Q$ are the columns of $Q^T$, and hence, since $Q^T$ is an orthogonal matrix, the rows of $Q$ must form an orthonormal basis.
5.3.9. $(Q^{-1})^T = (Q^T)^T = Q = (Q^{-1})^{-1}$, proving orthogonality.
5.3.10.
(a) False: they must form an orthonormal basis.
(b) True, since then $Q^T$ has orthonormal basis columns, and so is orthogonal. Exercise 5.3.8 then implies that $Q = (Q^T)^T$ is also orthogonal.
(c) False. For example, $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ is symmetric and orthogonal.
5.3.11. All diagonal matrices whose diagonal entries are $\pm 1$.

5.3.12. Let $U = ( u_1 \ u_2 \ \ldots \ u_n )$, where the last $n - j$ entries of the $j$th column $u_j$ are zero. Since $\| u_1 \| = 1$, $u_1 = ( \pm 1, 0, \ldots, 0 )^T$. Next, $0 = u_1 \cdot u_j = \pm u_{1,j}$ for $j \ne 1$, and so all non-diagonal entries in the first row of $U$ are zero; in particular, since $\| u_2 \| = 1$, $u_2 = ( 0, \pm 1, 0, \ldots, 0 )^T$. Then, $0 = u_2 \cdot u_j = \pm u_{2,j}$, $j \ne 2$, and so all non-diagonal entries in the second row of $U$ are zero; in particular, since $\| u_3 \| = 1$, $u_3 = ( 0, 0, \pm 1, 0, \ldots, 0 )^T$. The process continues in this manner, eventually proving that $U$ is a diagonal matrix whose diagonal entries are $\pm 1$.

5.3.13. (a) Note that $P^2 = I$ and $P = P^T$, proving orthogonality. Moreover, $\det P = -1$ since $P$ can be obtained from the identity matrix $I$ by interchanging two rows. (b) Only the matrices corresponding to multiplying a row by $-1$.
5.3.14. False. This is true only for row interchanges or multiplication of a row by $-1$.

5.3.15.
(a) The columns of $P$ are the standard basis vectors $e_1, \ldots, e_n$, rewritten in a different order, which doesn't affect their orthonormality.
(b) Exactly half are proper, so there are $\tfrac12\, n!$ proper permutation matrices.
5.3.16.
(a) $\| Q x \|^2 = (Q x)^T Q x = x^T Q^T Q\, x = x^T I\, x = x^T x = \| x \|^2$.
(b) According to Exercise 3.4.19, since both $Q^T Q$ and $I$ are symmetric matrices, the equation in part (a) holds for all $x$ if and only if $Q^T Q = I$.
5.3.17.
(a) $Q^T Q = ( I - 2\, u u^T )^T ( I - 2\, u u^T ) = I - 4\, u u^T + 4\, u u^T u u^T = I$, since $\| u \|^2 = u^T u = 1$ by assumption.
0
1
0
1
0
1
!
1
0 0
0 0 1
7
24

1 0
B
C
B
25
25 A,
, (ii) @
(b) (i)
(iii) @ 0 1 0 A, (iv ) @ 0 1
0C
A.
24
7
0 1
25 25
0
0
1
1
0
0
0 1
0 1
0 1
0 1
!
!
1
0
1
0
4
0
B C
B C
B C
C
, (iii) v = c@ 0 A + d@ 0 A, (iv ) v = c@ 0 A + dB
, (ii) v = c
(c) (i) v = c
@ 1 A.
3
1
0
1
1
0

In general, Q v = v if and only if v is orthogonal to u.
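The properties of the reflection $Q = I - 2\, u u^T$ used in Exercise 5.3.17 can be checked on a small case. The sketch below is illustrative only: the unit vector $u = (0.6, 0.8)^T$ and the helper names are assumptions made for the example.

```python
def householder(u):
    """Return Q = I - 2 u u^T for a unit vector u (list of floats)."""
    n = len(u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] for j in range(n)]
            for i in range(n)]

def matvec(Q, x):
    return [sum(q * a for q, a in zip(row, x)) for row in Q]

u = [0.6, 0.8]                     # illustrative unit vector: 0.36 + 0.64 = 1
Q = householder(u)

# Q^T Q should be the identity, confirming orthogonality.
QtQ = [[sum(Q[k][i] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

fixed = matvec(Q, [4.0, -3.0])     # (4, -3) is orthogonal to u, so Q should leave it unchanged
flipped = matvec(Q, u)             # u itself should be sent to -u
print(Q, fixed, flipped)
```

Vectors orthogonal to $u$ are fixed while $u$ is reversed, which is the reflection behaviour stated at the end of the solution.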


5.3.18. $Q^T = ( I + A )^T ( I - A )^{-T} = ( I + A^T )( I - A^T )^{-1} = ( I - A )( I + A )^{-1} = Q^{-1}$. To prove that $I - A$ is invertible, suppose $( I - A ) v = 0$, so $A v = v$. Multiplying by $v^T$ and using Exercise 1.6.29(f) gives $0 = v^T A v = \| v \|^2$, proving $v = 0$ and hence $\ker ( I - A ) = \{ 0 \}$.
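As a concrete instance of Exercise 5.3.18, the following sketch builds $Q = ( I - A )^{-1} ( I + A )$ for a $2 \times 2$ skew-symmetric matrix $A$ with exact fractions and checks $Q^T Q = I$. The choice $A = \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}$ with $a = \tfrac12$ and all helper names are assumptions for illustration.

```python
from fractions import Fraction as F

def mul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inv2(M):
    """Inverse of a nonsingular 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def cayley(a):
    """Q = (I - A)^{-1} (I + A) for the skew-symmetric A = [[0, a], [-a, 0]]."""
    I_plus_A = [[F(1), a], [-a, F(1)]]
    I_minus_A = [[F(1), -a], [a, F(1)]]
    return mul(inv2(I_minus_A), I_plus_A)

Q = cayley(F(1, 2))
Qt = [[Q[0][0], Q[1][0]], [Q[0][1], Q[1][1]]]
QtQ = mul(Qt, Q)          # should be exactly the identity
print(Q, QtQ)
```

Here $\det ( I - A ) = 1 + a^2 > 0$, in line with the invertibility argument in the solution.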
5.3.19.
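(a) If $S = ( v_1 \ v_2 \ \ldots \ v_n )$ has mutually orthogonal columns, then $S^T S = \mathrm{diag}\,( \| v_1 \|^2, \ldots, \| v_n \|^2 )$, and so $S^{-1} = D\, S^T$, where $D = \mathrm{diag}\,( 1/\| v_1 \|^2, \ldots, 1/\| v_n \|^2 )$.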
1
01
0
1
1
11
0
0
1
1 1
4
4
4C
1
1
1
0 1 B 4
1
1
1
1 B 4 0 0 0C
1
1
1C
1
B
B
B1
B
0 0C
1
1 1
0C
1 1 1 C
C
B1
B
C
CB 0 4
4 4 4 C
C=B
C.
B
C =B4
CB
(b) B
1
B1
@ 1 1
@ 1 1
0
1A
0
0 AB
0 21 0 C
0
0C
A
A
@ 2 2
@ 0
1 1
0 1
0
0
1 1
1
1
1
0 0 0 2

0
0
2

5.3.20. Set $A = ( v_1 \ v_2 \ \ldots \ v_n )$, $B = ( w_1 \ w_2 \ \ldots \ w_n )$. The dot products are the same if and only if the two Gram matrices are the same: $A^T A = B^T B$. Therefore, $Q = B A^{-1} = B^{-T} A^T$ satisfies $Q^T = A^{-T} B^T = Q^{-1}$, and hence $Q$ is an orthogonal matrix. The resulting matrix equation $B = Q A$ is the same as the vector equations $w_i = Q v_i$ for $i = 1, \ldots, n$.
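5.3.21. (a) The $(i, j)$ entry of $Q^T Q$ is the product of the $i$th row of $Q^T$ times the $j$th column of $Q$, namely $u_i^T u_j = u_i \cdot u_j = \begin{cases} 1, & i = j, \\ 0, & i \ne j, \end{cases}$ and hence $Q^T Q = I$.
(b) No. For instance, if $Q = u = \begin{pmatrix} \frac{1}{\sqrt2} \\ \frac{1}{\sqrt2} \end{pmatrix}$, then $Q\, Q^T = \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix}$ is not a $2 \times 2$ identity matrix.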

5.3.22.
(a) Assuming the columns are nonzero, Proposition 5.4 implies they are linearly independent. But there can be at most $m$ linearly independent vectors in $\mathbb{R}^m$, so $n \le m$.
(b) The $(i, j)$ entry of $A^T A$ is the dot product $v_i^T v_j = v_i \cdot v_j$ of the $i$th and $j$th columns of $A$, and so by orthogonality this is zero if $i \ne j$. The $i$th diagonal entry is the squared Euclidean norm of the $i$th column, $\| v_i \|^2$.
0
1
1
!
2
0
2
1
1
2
0
C
T
2 2 C
, but A AT = B
(c) Not necessarily. If A = B
@0
A.
@ 1 1 A, then A A =
0 6
2 2
4
0
2

5.3.23. If $S = ( v_1 \ v_2 \ \ldots \ v_n )$, then the $(i, j)$ entry of $S^T K S$ is $v_i^T K v_j = \langle v_i, v_j \rangle$, so $S^T K S = I$ if and only if $\langle v_i, v_j \rangle = 0$ for $i \ne j$, while $\langle v_i, v_i \rangle = \| v_i \|^2 = 1$.

5.3.24.
(a) Given any $A \in G$, we have $A^{-1} \in G$, and hence the product $A\, A^{-1} = I \in G$ also.
(b) (i) If $A, B$ are nonsingular, so are $A B$ and $A^{-1}$, with $(A B)^{-1} = B^{-1} A^{-1}$, $(A^{-1})^{-1} = A$.
(ii) The product of two upper triangular matrices is upper triangular, as is the inverse of any nonsingular upper triangular matrix.
(iii) If $\det A = 1 = \det B$, then $A, B$ are nonsingular; $\det (A B) = \det A \, \det B = 1$ and $\det (A^{-1}) = 1 / \det A = 1$.
(iv) If $P, Q$ are orthogonal matrices, so $P^{-1} = P^T$, $Q^{-1} = Q^T$, then $(P Q)^{-1} = Q^{-1} P^{-1} = Q^T P^T = (P Q)^T$, and $(Q^{-1})^{-1} = Q = (Q^T)^T = (Q^{-1})^T$, so both $P Q$ and $Q^{-1}$ are orthogonal matrices.
(v) According to part (iv), the product and inverse of orthogonal matrices are also orthogonal. Moreover, by part (iii), the product and inverse of matrices with determinant 1 also have determinant 1. Therefore, the product and inverse of proper orthogonal matrices are proper orthogonal.
(vi) The inverse of a permutation matrix is a permutation matrix, as is the product of two permutation matrices.
(vii) If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $B = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$ have integer entries with $\det A = a d - b c = 1$, $\det B = x w - y z = 1$, then the product $A B = \begin{pmatrix} a x + b z & a y + b w \\ c x + d z & c y + d w \end{pmatrix}$ also has integer entries and determinant $\det (A B) = \det A \, \det B = 1$. Moreover, $A^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$ also has integer entries and determinant $\det (A^{-1}) = 1 / \det A = 1$.
(c) Because the inverse of a matrix with integer entries does not necessarily have integer entries; for instance, $\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \frac12 & \frac12 \\ -\frac12 & \frac12 \end{pmatrix}$.
(d) No, because the product of two positive definite matrices is not necessarily symmetric,
let alone positive definite.
5.3.25.
(a) The defining equation $U^\dagger U = I$ implies $U^{-1} = U^\dagger$.
(b) (i)

1
B
1

2
U
=U =@
i
2
01
B2
B1
1

B2
(iii) U
=U =B
B1
@2
1
2

(c) (i) No, (ii) yes, (iii) yes.

2C
A,
1
2

1
2
2i
12
i
2

(ii) U 1 = U =

1
2
21
1
2
21

1
2
i
2
12
2i

C
C
C
C.
C
A

0 1

B 3
B 1
B
B 3
@
1
3

1
3
1

2 3
1
+
2 3

i
2
i
2

1
3
1

+
2 3
1

2 3

i
2
i
2

C
C
C,
C
A

(d) Let $u_1, \ldots, u_n$ denote the columns of $U$. The $i$th row of $U^\dagger$ is the complex conjugate of the $i$th column of $U$, and so the $(i, j)$ entry of $U^\dagger U$ is $\overline{u_i}^{\,T} u_j = u_j \cdot u_i$, i.e., the Hermitian dot product of the column vectors. Thus, $U^\dagger U = I$ if and only if $u_1, \ldots, u_n$ form an orthonormal basis of $\mathbb{C}^n$.
(e) Note first that $(U V)^\dagger = V^\dagger U^\dagger$. If $U^{-1} = U^\dagger$, $V^{-1} = V^\dagger$, then $(U V)^{-1} = V^{-1} U^{-1} = V^\dagger U^\dagger = (U V)^\dagger$, so $U V$ is also unitary. Also $(U^{-1})^{-1} = U = (U^\dagger)^\dagger = (U^{-1})^\dagger$, and so $U^{-1}$ is also unitary.

5.3.26.

0
B
B
@

2
2
1

5.3.27.
(a)

1
2

3
1

(b)

4
3

3
2

0
4
1
!

=
0

=@

3
1C B
B
B0
2C
A=B
@
3
0
0

1
B 5
@ 2

5
4
5
3
5

3
5
45

3
1
2
3
2

2 2
0
1 0

5C B
A @
1
0
5
1 0
1
18
5 A;
A @5
1
0
5

10
CB
CB
CB
CB
A@

5C
A;
7
5


2
3
2
3
1
3

2
1
2

3 2
1

3 2
232

C
C
C.
C
A

(c)

0
B
B
@

2
0
1

1
1
1

B
B
@

0
1
1

1
1
1

0
0
1

0
4
0

1
1
2
1

(d)

(e)

(f )

B
B
@

1
2
1
0

B
B
B1
B
B1
@

(ii) (a)

1
1
2

5.3.29.

4
B
@1
0
0
4 1
B
B1 4
B
@0 1
0 0
0
4 1 0
B1 4 1
B
B
B0 1 4
B
@0 0 1
0 0 0

2
5

1
0

2C B 0
B
1C
0
A=@
1
1
0
1

1C B
B
B
B
0C
C
C=B
B
B
1C
A
B
@
1

2
3

1
2
1
2
1
2
1
2

0 1C B1
B
1 0C
A @0
0 0
0
1
0

1
2

2 3
1

2 3

3
2
1

2 3

0
1

0 1 C
4
1C
A;
0 q 21 0
2
3
1

0
1

C
C
C
C
C
C
C
C
A

2
B
B
B0
B
B
B
B0
B
@

3
q5
6
5

q 5C
C
2 C;
7 15
C
q A
2
2 3
1

2 2 C
2 C;
A
2

2
0
0

5
2

3
2

3
2
1
2
1

2 3
1
6

C
C
C
C
C.
C
C
C
A

1 0
1
0
1
!
1
1
1
2

57 A
x
B
2
2C B
2C
@
=
;
=@
(b)
A @
A;
1
y
1
5
1
0
5
2
02
1 2
1
2
1
0
1
1
0
0 1
1

3
1
1 C B
3
0
2
2
3
2C
x

C
B
B
C B
C

C
2 2 C B
C
B1
; (b) B
@yA = B
2 2 2 C
2C
0
C @0
@ 1 A;
A=B3
3
A
@
A
z
1
2
1
2
0 0
3
1

3 2 3 2

0
1
0
1
0
1
1
1
1
1
0 1
2 1
1
1

2
6
3C B
2
2
0C B
C
x
q
B 2C
B
C B
C
B
C
C
3
B 1
1
1 C B 0
1 C
(b) B
@ y A = B 1 C.
1C
C B
A = B 2
2 6 C;
6
3
@ 2A
q
@
A @
A
z
1
1
2
0 23 1
0
0
2
3
3

1
4
1
0
1
4
1
0
0
1
4
1

0
.9701
B
1C
A = @ .2425
4
0
1
0
0
.9701
B
0C
C
B .2425
C=B
1A @ 0
4
0
1
0
0
.9701
B .2425
0C
C
B
B
C=B 0
0C
C
B
1A @ 0
4
0

1
0
0 1

1 0
1
5
6 C
q 30
C B
B
C
5
1
C B 0
B
6
@
q6 C
q
2
2A
0
15
3
1
0

0 C
2 2
B
1 C
C B 0
1
C
@
2
A
1
0
0
2
1 0
1

0
2C B
B
1

B
C
1A = B
2
@
3
1

B
B
B
B
B
@

1
5

B
B 1
@

1
0
1

B
B1
@

(iii) (a)

1 C
3C
A=
1

5.3.28.
(i) (a)

.2339
.9354
.2650
.2339
.9354
.2650
0
.2339
.9354
.2650
0
0

10

.0643
4.1231 1.9403
.2425
B
.2571 C
0
3.773 1.9956 C
A@
A,
.9642
0
0
3.5998
10
.0619 .0172
4.1231 1.9403
.2425
B
.2477
.0688 C
0
3.773 1.9956
CB
CB
.9291 .2581 A@
0
0
3.7361
.2677
.9635
0
0
0
1
.0619 .0166
.0046
.2477
.0663 .0184 C
C
C
.9291 .2486
.0691 C
C
.2677
.9283 .2581 A
0
.2679
.9634
0
1
4.1231 1.9403
.2425
0
0
B
0
3.773 1.9956
.2650
0 C
B
C
B
C
B
C.
0
0
3.7361
1.9997
.2677
B
C
@
0
0
0
3.7324 2.0000 A
0
0
0
0
3.5956


0
.2650 C
C
C,
1.9997 A
3.596

5.3.30. :

5.3.27 (a)

1.2361
,
2.0000

b =
v
1

.4472
.8944

Q=

(b)

(c)

(d)

(e)

(f )

b =
v
1

1
,
3
0

H1 =

.8
.6

H1 =
!

.8944
,
.4472
!

.6
,
.8

Q=

.8944
,
.4472

2.2361
0

R=
.8
.6

.4472
.8944

.6
,
.8

.4472
;
3.1305
R=

1.3416
2.556 C
A;
1.633

1.4142
0
.7071 .7071
B
C
b =
1
.7071
.5
.5 C
v
,
H
=
@
A
A,
1
1
1
.7071
.5
.5
0
0
1
1
0
1
0
0
B
C
B
b = @ 1.7071 A,
v
H2 = @ 0 .7071 .7071 C
A
2
.7071
0
.7071
.7071
1
0
1
0
1.4142 1.4142 2.8284
0
1
0
B
C
B
0
1
2 C
Q = @ .7071 0 .7071 A, R = @
A;
0
0
1.4142
.7071 0
.7071
B
@

1
0 0 1
0
C
b =
b =B
v
0C
H1 = B
0C
v
A,
@ 0 1
A,
@ 0 A,
1
2
1
1 00
0
0
0
1
1
0 0 1
1 0 1
Q=B
0C
R=B
1C
@ 0 1
A,
@0 4
A;
1 0
0
0 0 2
B
@

1
B
1C
B
C
b =B
C,
v
1
@ 1A
1
0
1
0
B
.4142 C
C
b =B
B
C,
v
2
@
0 A
1
0
1
0
B
0 C
C
b =B
B
C,
v
3
@ .3660 A
.7071
0
.5
0
B
.7071
B .5
Q=B
@ .5
0
.5 .7071

3.6
;
.2

.2361
.8944 0 .4472
C
B
b =
v
0
,
H
=
0
1
0 C
A
@
A,
1
1
1
.4472 0 .8944
0
1
0
1
0
1
0
0
B
C
B
b = @ .0954 A,
v
H2 = @ 0 .9129
.4082 C
A,
2
.4472
0
.4082
.9129
0
1
0
.8944 .1826 .4082
2.2361 1.3416
Q=B
0
.9129 .4082 C
R=B
0
1.0954
@
A,
@
.4472 .3651 .8165
0
0
B
@

5
0

H2 =

B
@0

0
1
0

0
0C
A,
1

.5
.5
.5
.5
B
.5 .5 .5 C
B .5
C
C,
H1 = B
@ .5 .5
.5 .5 A
.5 .5 .5
.5
0
1
1
0
0
0
B
0
.7071 0 .7071 C
C
B
C,
H2 = B
@0
0
1
0 A
0 .7071 0 .7071
0
1
1 0
0
0
B
0 1
0
0 C
C
B
C,
H3 = B
@ 0 0 .5774
.8165 A
0 0 1.8165 .5774
0
.2887
.8165
2
2
C
B
.2887 .4082 C
B 0 1.4142
C,
R=B
@0
.866
0 A
0
.2887 .4082
0
0


2.5
0
.866
0

1.5
.7071 C
C
C;
.2887 A
.4082

5.3.29:

3 3 case:

.9701
Q=B
@ .2425
0

.2339
.9354
.2650

.9701
B
B .2425
Q=B
@ 0
0

.1231
1 C
C
b =
C,
v
1
0 A
0
1
0
0
B
7.411 C
C
b =B
C,
B
v
2
@
1 A
0
1
0
0
B
0 C
C
b =B
C,
B
v
3
@ .1363 A
1
.2339
.9354
.2650
0

5 5 case:

b
v
1

b
v
2

b
v
3

b
v
4

.0619
.2477
.9291
.2677

4.1231
0
R=B
@
0

H1 =

H2 =

0
0

0
1
B
B0
H3 = B
@0
0
0

H1

H2

H3

H4

.2425 0 0
.9701 0 0 C
C
C,
0
1 0A
0
0
0 1
1
0
0
0
.9642 .2650 0 C
C
C,
.2650 .9642 0 A
0
0
1
1
0
0
0
1
0
0 C
C
C,
0 .9635
.2677 A
0 .2677 .9635
0

4.1231
B
0
B
R=B
@
0
0
0

.2425
1.9956 C
A;
3.5998

.9701

B
B0
B
@0

1.9403
3.773
0

B
B .2425
B
@ 0

.0172
.0688 C
C
C,
.2581 A
.9635

.1231
B
1 C
B
C
B
C
C,
0
=B
B
C
@
0 A
0
0
1
0
B 7.411 C
B
C
C
B
C,
=B
1
B
C
@
0 A
0
0
1
0
B
0 C
B
C
C
B .1363 C,
=B
B
C
@
1 A
0
0
1
0
B
C
0
B
C
B
C
C,
=B
0
B
C
@ 7.3284 A
1

.9701
.2425 0
C
H1 = B
@ .2425 .9701 0 A,
0
0
1
0
1
1
0
0
C
H2 = B
@ 0 .9642 .2650 A
0
.2650 .9642

.0643
.2571 C
A,
.9642

B
B
B
@

.1231
b =B
1 C
v
@
A,
1
0
0
1
0
C
b =B
v
@ 7.411 A,
2
1

4 4 case:

1.9403
3.773
0
0

.2425
1.9956
3.7361
0
1

.9701
.2425 0 0 0
B .2425 .9701 0 0 0 C
B
C
C
B 0
C,
0
1
0
0
=B
B
C
@ 0
0
0 1 0A
0
0
0 0 1
0
1
1
0
0
0 0
B 0 .9642 .2650 0 0 C
B
C
C
B0
C,
.2650
.9642
0
0
=B
B
C
@0
0
0
1 0A
0
0
0
0 1
0
1
1 0
0
0
0
B0 1
0
0
0C
B
C
C
B 0 0 .9635
C,
=B
.2677
0
B
C
@ 0 0 .2677 .9635 0 A
0 0
0
0
1
0
1
1 0 0
0
0
B0 1 0
0
0 C
B
C
B
C,
=B
0 0 1
0
0 C
B
C
@ 0 0 0 .9634 .2679 A
0 0 0
.2679 .9634


0
.2650 C
C
C;
1.9997 A
3.596

.9701 .2339
.0619 .0166
.0046
B .2425
.9354 .2477
.0663 .0184 C
B
C
C
B 0
C,
Q=B
.2650
.9291
.2486
.0691
B
C
@ 0
0
.2677
.9283 .2581 A
0
0
0
.2679
.9634
1
0
4.1231 1.9403
.2425
0
0
B
0
3.773 1.9956
.2650
0 C
C
B
C
C.
B
0
0
3.7361
1.9997
.2677
R=B
C
B
@
0
0
0
3.7324
2. A
0
0
0
0
3.5956
5.3.31.
(a) $Q R$ factorization requires $n^3 + n^2$ multiplication/divisions, $n$ square roots, and $n^3 - \tfrac12 n^2 - \tfrac12 n$ addition/subtractions.
(b) Multiplication of $Q^T b$ requires an additional $n^2$ multiplication/divisions and $n^2 - n$ addition/subtractions. Solving $R\, x = Q^T b$ by Back Substitution requires $\tfrac12 n^2 + \tfrac12 n$ multiplication/divisions and $\tfrac12 n^2 - \tfrac12 n$ addition/subtractions.
(c) The $Q R$ method requires approximately 3 times as much computational effort as Gaussian Elimination.
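The factor of 3 in part (c) can be illustrated by tabulating the multiplication/division counts. The sketch below adds up the counts quoted in parts (a) and (b); the Gaussian Elimination count $\tfrac13 n^3 + n^2 - \tfrac13 n$ is an assumption (the standard count for elimination plus back substitution) and is not stated in this solution.

```python
def qr_multiplications(n):
    """Multiplication/divisions for QR, forming Q^T b, and Back Substitution."""
    factor = n**3 + n**2              # part (a)
    apply_qt = n**2                   # part (b), forming Q^T b
    back_sub = (n * n + n) // 2       # part (b), solving R x = Q^T b
    return factor + apply_qt + back_sub

def gauss_multiplications(n):
    """Assumed count n^3/3 + n^2 - n/3 for Gaussian Elimination with Back Substitution."""
    return (n**3 + 3 * n**2 - n) // 3   # n^3 - n is always divisible by 3

n = 1000
ratio = qr_multiplications(n) / gauss_multiplications(n)
print(ratio)
```

For large $n$ the ratio is dominated by $n^3 / (\tfrac13 n^3)$, so it approaches 3.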
5.3.32. If $Q R = \widetilde Q \widetilde R$, then $Q^{-1} \widetilde Q = R\, \widetilde R^{-1}$. The left hand side is orthogonal, while the right hand side is upper triangular. Thus, by Exercise 5.3.12, both sides must be diagonal with $\pm 1$ on the diagonal. Positivity of the diagonal entries of $R$ and $\widetilde R$ implies positivity of those of $R\, \widetilde R^{-1}$, and hence $Q^{-1} \widetilde Q = R\, \widetilde R^{-1} = I$, which implies $Q = \widetilde Q$ and $R = \widetilde R$.

5.3.33.
(a) If $\mathrm{rank}\, A = n$, then the columns $w_1, \ldots, w_n$ of $A$ are linearly independent, and so form a basis for its range. Applying the Gram-Schmidt process converts the column basis $w_1, \ldots, w_n$ to an orthonormal basis $u_1, \ldots, u_n$ of $\mathrm{rng}\, A$.
(b) In this case, for the same reason as in (5.23), we can write
$w_1 = r_{11} u_1$,
$w_2 = r_{12} u_1 + r_{22} u_2$,
$w_3 = r_{13} u_1 + r_{23} u_2 + r_{33} u_3$,
$\quad \vdots$
$w_n = r_{1n} u_1 + r_{2n} u_2 + \cdots + r_{nn} u_n$.
The result is equivalent to the factorization $A = Q R$ where $Q = ( u_1, \ldots, u_n )$, and $R = ( r_{ij} )$ is nonsingular since its diagonal entries are non-zero: $r_{ii} \ne 0$.

(c)
0

1
B
(i) B
@2
0

1 C B
B 5
2
C
=
3A B
B
@ 5
2
0
(iii)

0
B
B
B
B
B
@

32

1
1
1
1

C
C
1C
3C
A
2
3

5
0
1

!
5
;
3
0

1C B
B
2 C
C
B
C=B
B
2C
A
@
1

1
2
1
2
1
2
1
2

(ii)
12
12

0
B
B
@

C
C
C
C
1C
2A
1
2


3
0
4
2
0

1C B
B
B
1C
A=B
@
2
!

3
;
1

3
5

0
4
5

5 5
1
5
6

5 5

1
C
C
C
C
A

5
0

1
;
5

(iv )

0
B
B
B
B
B
@

0
2
1
2

1
1
0
1

1 C B
B
B
B
3C
C
C=B
B
B
2 C
A
B
@
2

0
2
3
1
3
2
3

1
3
1
3

0
1
3

7 2
11

212
1321 2

212

1
C
C
C
C
C
C
C
C
A

B3
B
B0
@
0

3
0

38
0

7 2
3

C
C
C.
A

(d) The columns of A are linearly dependent, and so the algorithm breaks down, as in Exercise 5.2.3.
5.3.34.
(a) If $A = ( w_1 \ w_2 \ \ldots \ w_n )$, then $U = ( u_1 \ u_2 \ \ldots \ u_n )$ has orthonormal columns and hence is a unitary matrix. The Gram-Schmidt process takes the same form:
$w_1 = r_{11} u_1$,
$w_2 = r_{12} u_1 + r_{22} u_2$,
$w_3 = r_{13} u_1 + r_{23} u_2 + r_{33} u_3$,
$\quad \vdots$
$w_n = r_{1n} u_1 + r_{2n} u_2 + \cdots + r_{nn} u_n$,
which is equivalent to the factorization $A = U R$.
!
i
1
3i

C B 2
i
1
2
2
2
=B
(b) (i)
@
A
@
1
1
i

1 2i

0
(ii)

(iii)

0
B
B
@

1+ i
1 i
i
1
0

1
i
1

i
B
(iv ) B
1

i
@
1

C
A;

2
2
1
0 2
!
i
1
i
1
+

2 1 2i
2 i
2
2
2
2
A
@
;
= 1
0
1
i
2i 12 + 2i
2
0 i
1 0
1
1

1
i
0 1
2
3
6
0C B
C B 2
2
C
B 1
C

C
B
i
1 C B
B
=
;
1C
B 2
C
0
3 q0C
A
3
A
q6 A @
@
3
1
2
i

0
0
0
i
2
3 0 3
1
i
i
23
1
i
2
6
1
i C B
3 2 C
B
C
i
1
i
2 C
B1
=
0
1+ i C

+
B
C
A
2
2
2
6
3
A
@
1
1
2
i
1
2 + 3i
1
2

2 + 3
3 2
!

B2
B
B0
@

1 2 i
3

1 + i
1 3i

C
C
C.
A

2 2
0
0
3
(c) Each diagonal entry of R can be multiplied by any complex number of modulus 1. Thus,
requiring them all to be real and positive will imply uniqueness of the U R factorization.
The proof of uniqueness is modeled on the real version in Exercise 5.3.32.

5.3.35.
Householder's Method
start
  set R = A
  for j = 1 to n - 1
    for i = 1 to j - 1  set w_i = 0  next i
    for i = j to n  set w_i = r_ij  next i
    set v = w - ||w|| e_j
    if v ≠ 0
      set u_j = v / ||v||,  r_jj = ||w||
      for i = j + 1 to n  set r_ij = 0  next i
      for k = j + 1 to n
        for i = j to n
          set r_ik = r_ik - 2 u_i sum_{l=1}^{n} u_l r_lk
        next i
      next k
    else
      set u_j = 0
    endif
  next j
end
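The pseudocode above can be turned into a short runnable sketch. This is one possible reading, not a definitive implementation: it assumes a square matrix stored as a list of rows, treats "v ≠ 0" as a small numerical threshold, and computes the sum $\sum_l u_l r_{lk}$ once per column before that column is modified. The function name and the test matrix are hypothetical.

```python
import math

def householder_r(A):
    """Return (R, us): the upper triangular R and the unit vectors u_j of the reflections."""
    n = len(A)
    R = [[float(x) for x in row] for row in A]
    us = []
    for j in range(n - 1):
        # w is column j of R with its first j entries set to zero
        w = [0.0] * j + [R[i][j] for i in range(j, n)]
        norm_w = math.sqrt(sum(x * x for x in w))
        v = w[:]
        v[j] -= norm_w                                  # v = w - ||w|| e_j
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_v > 1e-14:                              # numerical stand-in for "v != 0"
            u = [x / norm_v for x in v]
            R[j][j] = norm_w
            for i in range(j + 1, n):
                R[i][j] = 0.0
            for k in range(j + 1, n):
                s = sum(u[l] * R[l][k] for l in range(n))   # evaluated before column k changes
                for i in range(j, n):
                    R[i][k] -= 2.0 * u[i] * s
        else:
            u = [0.0] * n
        us.append(u)
    return R, us

A = [[1, 1, 2], [1, 0, -2], [-1, 2, 3]]                 # illustrative matrix
R, us = householder_r(A)
# Since A = Q R with Q orthogonal, A^T A and R^T R must agree.
AtA = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
RtR = [[sum(R[k][i] * R[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
print(R)
```

Comparing $A^T A$ with $R^T R$ avoids forming $Q$ explicitly; $Q$ could be recovered as the product of the reflections $I - 2\, u_j u_j^T$.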

5.4.1.
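(a) $t^3 = q_3(t) + \tfrac35\, q_1(t)$, where
$1 = \dfrac{\langle t^3, q_3 \rangle}{\| q_3 \|^2} = \dfrac{175}{8} \int_{-1}^{1} t^3 \bigl( t^3 - \tfrac35 t \bigr)\, dt$, $\quad 0 = \dfrac{\langle t^3, q_2 \rangle}{\| q_2 \|^2} = \dfrac{45}{8} \int_{-1}^{1} t^3 \bigl( t^2 - \tfrac13 \bigr)\, dt$,
$\dfrac35 = \dfrac{\langle t^3, q_1 \rangle}{\| q_1 \|^2} = \dfrac32 \int_{-1}^{1} t^3\, t \, dt$, $\quad 0 = \dfrac{\langle t^3, q_0 \rangle}{\| q_0 \|^2} = \dfrac12 \int_{-1}^{1} t^3 \, dt$;

(b) $t^4 + t^2 = q_4(t) + \tfrac{13}{7}\, q_2(t) + \tfrac{8}{15}\, q_0(t)$, where
$1 = \dfrac{\langle t^4 + t^2, q_4 \rangle}{\| q_4 \|^2} = \dfrac{11025}{128} \int_{-1}^{1} ( t^4 + t^2 ) \bigl( t^4 - \tfrac67 t^2 + \tfrac{3}{35} \bigr)\, dt$,
$0 = \dfrac{\langle t^4 + t^2, q_3 \rangle}{\| q_3 \|^2} = \dfrac{175}{8} \int_{-1}^{1} ( t^4 + t^2 ) \bigl( t^3 - \tfrac35 t \bigr)\, dt$,
$\dfrac{13}{7} = \dfrac{\langle t^4 + t^2, q_2 \rangle}{\| q_2 \|^2} = \dfrac{45}{8} \int_{-1}^{1} ( t^4 + t^2 ) \bigl( t^2 - \tfrac13 \bigr)\, dt$,
$0 = \dfrac{\langle t^4 + t^2, q_1 \rangle}{\| q_1 \|^2} = \dfrac32 \int_{-1}^{1} ( t^4 + t^2 )\, t \, dt$,
$\dfrac{8}{15} = \dfrac{\langle t^4 + t^2, q_0 \rangle}{\| q_0 \|^2} = \dfrac12 \int_{-1}^{1} ( t^4 + t^2 )\, dt$;

(c) $7 t^4 + 2 t^3 - t = 7\, q_4(t) + 2\, q_3(t) + 6\, q_2(t) + \tfrac15\, q_1(t) + \tfrac75\, q_0(t)$, where
$7 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_4 \rangle}{\| q_4 \|^2} = \dfrac{11025}{128} \int_{-1}^{1} ( 7 t^4 + 2 t^3 - t ) \bigl( t^4 - \tfrac67 t^2 + \tfrac{3}{35} \bigr)\, dt$,
$2 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_3 \rangle}{\| q_3 \|^2} = \dfrac{175}{8} \int_{-1}^{1} ( 7 t^4 + 2 t^3 - t ) \bigl( t^3 - \tfrac35 t \bigr)\, dt$,
$6 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_2 \rangle}{\| q_2 \|^2} = \dfrac{45}{8} \int_{-1}^{1} ( 7 t^4 + 2 t^3 - t ) \bigl( t^2 - \tfrac13 \bigr)\, dt$,
$\dfrac15 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_1 \rangle}{\| q_1 \|^2} = \dfrac32 \int_{-1}^{1} ( 7 t^4 + 2 t^3 - t )\, t \, dt$,
$\dfrac75 = \dfrac{\langle 7 t^4 + 2 t^3 - t, q_0 \rangle}{\| q_0 \|^2} = \dfrac12 \int_{-1}^{1} ( 7 t^4 + 2 t^3 - t )\, dt$.

5.4.2. (a) $q_5(t) = t^5 - \tfrac{10}{9}\, t^3 + \tfrac{5}{21}\, t = \dfrac{5!}{10!}\, \dfrac{d^5}{dt^5} ( t^2 - 1 )^5$, (b) $t^5 = q_5(t) + \tfrac{10}{9}\, q_3(t) + \tfrac37\, q_1(t)$,
(c) $q_6(t) = t^6 - \tfrac{15}{11}\, t^4 + \tfrac{5}{11}\, t^2 - \tfrac{5}{231} = \dfrac{6!}{12!}\, \dfrac{d^6}{dt^6} ( t^2 - 1 )^6$, $\quad t^6 = q_6(t) + \tfrac{15}{11}\, q_4(t) + \tfrac57\, q_2(t) + \tfrac17\, q_0(t)$.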

5.4.3. (a) We characterized $q_n(t)$ as the unique monic polynomial of degree $n$ that is orthogonal to $q_0(t), \ldots, q_{n-1}(t)$. Since these Legendre polynomials form a basis of $P^{(n-1)}$, this implies that $q_n(t)$ is orthogonal to all polynomials of degree $\le n - 1$; in particular $\langle q_n, t^j \rangle = 0$ for $j = 0, \ldots, n - 1$. Conversely, if the latter condition holds, then $q_n(t)$ is orthogonal to every polynomial of degree $\le n - 1$, and, in particular, $\langle q_n, q_j \rangle = 0$, $j = 0, \ldots, n - 1$.
(b) Set $q_5(t) = t^5 + c_4 t^4 + c_3 t^3 + c_2 t^2 + c_1 t + c_0$. Then we require
$0 = \langle q_5, 1 \rangle = 2\, c_0 + \tfrac23\, c_2 + \tfrac25\, c_4$,
$0 = \langle q_5, t \rangle = \tfrac23\, c_1 + \tfrac25\, c_3 + \tfrac27$,
$0 = \langle q_5, t^2 \rangle = \tfrac23\, c_0 + \tfrac25\, c_2 + \tfrac27\, c_4$,
$0 = \langle q_5, t^3 \rangle = \tfrac25\, c_1 + \tfrac27\, c_3 + \tfrac29$,
$0 = \langle q_5, t^4 \rangle = \tfrac25\, c_0 + \tfrac27\, c_2 + \tfrac29\, c_4$.
The unique solution to this linear system is $c_0 = 0$, $c_1 = \tfrac{5}{21}$, $c_2 = 0$, $c_3 = -\tfrac{10}{9}$, $c_4 = 0$, and so $q_5(t) = t^5 - \tfrac{10}{9}\, t^3 + \tfrac{5}{21}\, t$ is the monic Legendre polynomial of degree 5.
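The linear system in Exercise 5.4.3(b) can be solved exactly with rational arithmetic. The following sketch assumes the $L^2$ inner product on $[-1, 1]$, so that $\langle t^i, t^j \rangle = 2/(i + j + 1)$ when $i + j$ is even and 0 otherwise; the helper names are illustrative.

```python
from fractions import Fraction as F

def moment(p):
    """Integral of t**p over [-1, 1]."""
    return F(2, p + 1) if p % 2 == 0 else F(0)

def solve(M, b):
    """Gauss-Jordan elimination with exact fractions (assumes M is nonsingular)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

n = 5
# Row j encodes 0 = <q_5, t^j> = sum_i c_i <t^i, t^j> + <t^5, t^j>.
M = [[moment(i + j) for i in range(n)] for j in range(n)]
b = [-moment(n + j) for j in range(n)]
c = solve(M, b)          # coefficients c_0, ..., c_4
print(c)
```

The even-indexed equations are homogeneous and force $c_0 = c_2 = c_4 = 0$, so only the two equations in $c_1, c_3$ carry information.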

5.4.4. Since even and odd powers of $t$ are orthogonal with respect to the $L^2$ inner product on $[-1, 1]$, when the Gram-Schmidt process is run, only even powers of $t$ will contribute to the even order polynomials, whereas only odd powers of $t$ will contribute to the odd order cases. Alternatively, one can prove this directly from the Rodrigues formula, noting that $( t^2 - 1 )^k$ is even, and the derivative of an even (odd) function is odd (even).
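5.4.5. Use Exercise 5.4.4 and the fact that $\langle f, g \rangle = \int_{-1}^{1} f(t)\, g(t)\, dt = 0$ if $f$ is odd and $g$ is even, since their product is odd.

5.4.6. $q_k(t) = \dfrac{k!}{(2k)!}\, \dfrac{d^k}{dt^k} ( t^2 - 1 )^k$, $\qquad \| q_k \| = \dfrac{2^k (k!)^2}{(2k)!} \sqrt{\dfrac{2}{2k + 1}}$.

5.4.7. $Q_k(t) = \dfrac{q_k(t)}{\| q_k(t) \|} = \dfrac{(2k)!}{2^k (k!)^2} \sqrt{\dfrac{2k + 1}{2}}\; q_k(t) = \dfrac{1}{2^k\, k!} \sqrt{\dfrac{2k + 1}{2}}\; \dfrac{d^k}{dt^k} ( t^2 - 1 )^k$.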

5.4.8. Write $P_k(t) = \dfrac{1}{2^k\, k!}\, \dfrac{d^k}{dt^k} \bigl[ ( t - 1 )^k ( t + 1 )^k \bigr]$. Differentiating using Leibniz' Rule, we conclude that the only term that does not contain a factor of $t - 1$ is when all $k$ derivatives are applied to $( t - 1 )^k$. Thus, $P_k(t) = \dfrac{1}{2^k} ( t + 1 )^k + ( t - 1 )\, S_k(t)$ for some polynomial $S_k(t)$, and so $P_k(1) = 1$.

5.4.9.
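(a) Integrating by parts $k$ times, and noting that the boundary terms are zero by (5.50):
$\| R_{k,k} \|^2 = \int_{-1}^{1} \dfrac{d^k}{dt^k} ( t^2 - 1 )^k \, \dfrac{d^k}{dt^k} ( t^2 - 1 )^k \, dt = ( -1 )^k \int_{-1}^{1} ( t^2 - 1 )^k \, \dfrac{d^{2k}}{dt^{2k}} ( t^2 - 1 )^k \, dt = ( -1 )^k (2k)! \int_{-1}^{1} ( t^2 - 1 )^k \, dt.$
(b) Since $t = \cos\theta$ satisfies $dt = -\sin\theta \, d\theta$, and takes $\theta = 0$ to $t = 1$ and $\theta = \pi$ to $t = -1$, we find
$\int_{-1}^{1} ( t^2 - 1 )^k \, dt = ( -1 )^k \int_0^\pi \sin^{2k+1}\theta \, d\theta = ( -1 )^k\, \dfrac{2k}{2k + 1} \int_0^\pi \sin^{2k-1}\theta \, d\theta$
$= ( -1 )^k\, \dfrac{2k}{2k + 1}\, \dfrac{2k - 2}{2k - 1} \int_0^\pi \sin^{2k-3}\theta \, d\theta = \cdots$
$= ( -1 )^k\, \dfrac{(2k)(2k - 2) \cdots 4 \cdot 2}{(2k + 1)(2k - 1) \cdots 5 \cdot 3} \int_0^\pi \sin\theta \, d\theta = ( -1 )^k\, \dfrac{2^{2k+1} (k!)^2}{(2k + 1)!},$
where we integrated by parts $k$ times after making the trigonometric change of variables. Combining parts (a,b),
$\| R_{k,k} \|^2 = \dfrac{(2k)! \; 2^{2k+1} (k!)^2}{(2k + 1)!} = \dfrac{2^{2k+1} (k!)^2}{2k + 1}.$
Thus, by the Rodrigues formula,
$\| P_k \| = \dfrac{1}{2^k\, k!}\, \| R_{k,k} \| = \dfrac{1}{2^k\, k!} \cdot \dfrac{2^{k + 1/2}\, k!}{\sqrt{2k + 1}} = \sqrt{\dfrac{2}{2k + 1}}.$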

5.4.10.
(a) The roots of $P_2(t)$ are $\pm \dfrac{1}{\sqrt3}$; the roots of $P_3(t)$ are $0, \pm \sqrt{\dfrac35}$; the roots of $P_4(t)$ are $\pm \sqrt{\dfrac{15 \pm 2 \sqrt{30}}{35}}$.
(b) We use induction on $R_{j+1,k} = \dfrac{d}{dt} R_{j,k}$. Differentiation reduces the order of a root by 1. Moreover, Rolle's Theorem says that at least one root of $f'(t)$ lies strictly between any two roots of $f(t)$. Thus, starting with $R_{0,k}(t) = ( 1 - t )^k ( 1 + t )^k$, which has roots of order $k$ at $\pm 1$, we deduce that, for each $j < k$, $R_{j,k}(t)$ has $j$ roots lying between $-1$ and $1$ along with roots of order $k - j$ at $\pm 1$. Since the degree of $R_{j,k}(t)$ is $2k - j$, and the roots at $\pm 1$ have orders $k - j$, the $j$ other roots between $-1$ and $1$ must all be simple. Setting $k = j$, we conclude that $P_k(t) = R_{k,k}(t) = \dfrac{d}{dt} R_{k-1,k}(t)$ has $k$ simple roots strictly between $-1$ and $1$.

5.4.11.
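(a) $P_0(t) = 1$, $P_1(t) = t - \tfrac32$, $P_2(t) = t^2 - 3\, t + \tfrac{13}{6}$, $P_3(t) = t^3 - \tfrac92\, t^2 + \tfrac{33}{5}\, t - \tfrac{63}{20}$;
(b) $P_0(t) = 1$, $P_1(t) = t - \tfrac23$, $P_2(t) = t^2 - \tfrac65\, t + \tfrac{3}{10}$, $P_3(t) = t^3 - \tfrac{12}{7}\, t^2 + \tfrac67\, t - \tfrac{4}{35}$;
(c) $P_0(t) = 1$, $P_1(t) = t$, $P_2(t) = t^2 - \tfrac35$, $P_3(t) = t^3 - \tfrac57\, t$;
(d) $P_0(t) = 1$, $P_1(t) = t$, $P_2(t) = t^2 - 2$, $P_3(t) = t^3 - 12\, t$.

5.4.12. $1$, $\; t - \tfrac34$, $\; t^2 - \tfrac43\, t + \tfrac25$, $\; t^3 - \tfrac{15}{8}\, t^2 + \tfrac{15}{14}\, t - \tfrac{5}{28}$, $\; t^4 - \tfrac{12}{5}\, t^3 + 2\, t^2 - \tfrac23\, t + \tfrac{1}{14}$.

5.4.13. These are the rescaled Legendre polynomials:
$1$, $\; \tfrac12\, t$, $\; \tfrac38\, t^2 - \tfrac12$, $\; \tfrac{5}{16}\, t^3 - \tfrac34\, t$, $\; \tfrac{35}{128}\, t^4 - \tfrac{15}{16}\, t^2 + \tfrac38$, $\; \tfrac{63}{256}\, t^5 - \tfrac{35}{32}\, t^3 + \tfrac{15}{16}\, t$.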

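5.4.14. Setting $\langle f, g \rangle = \int_0^1 f(t)\, g(t)\, dt$, $\| f \|^2 = \int_0^1 f(t)^2\, dt$,

$q_0(t) = 1 = \widetilde P_0(t)$,

$q_1(t) = t - \frac{\langle t, q_0 \rangle}{\| q_0 \|^2}\, q_0(t) = t - \frac12 = \frac12\, \widetilde P_1(t)$,

$q_2(t) = t^2 - \frac{\langle t^2, q_0 \rangle}{\| q_0 \|^2}\, q_0(t) - \frac{\langle t^2, q_1 \rangle}{\| q_1 \|^2}\, q_1(t) = t^2 - \frac{1/3}{1} - \frac{1/12}{1/12}\left(t - \frac12\right) = t^2 - t + \frac16 = \frac16\, \widetilde P_2(t)$,

$q_3(t) = t^3 - \frac{1/4}{1} - \frac{3/40}{1/12}\left(t - \frac12\right) - \frac{1/120}{1/180}\left(t^2 - t + \frac16\right) = t^3 - \frac32 t^2 + \frac35 t - \frac{1}{20} = \frac{1}{20}\, \widetilde P_3(t)$,

$q_4(t) = t^4 - \frac{1/5}{1} - \frac{1/15}{1/12}\left(t - \frac12\right) - \frac{1/105}{1/180}\left(t^2 - t + \frac16\right) - \frac{1/1400}{1/2800}\left(t^3 - \frac32 t^2 + \frac35 t - \frac{1}{20}\right) = t^4 - 2\,t^3 + \frac97 t^2 - \frac27 t + \frac{1}{70} = \frac{1}{70}\, \widetilde P_4(t)$.

5.4.15. $p_0(t) = 1$, $p_1(t) = t - \frac12$, $p_2(t) = t^2 - t + \frac16$, $p_3(t) = t^3 - \frac32 t^2 + \frac{33}{65} t - \frac{1}{260}$.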
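
The Gram-Schmidt computation of 5.4.14 can be replayed exactly. The sketch below is illustrative only: it assumes the $L^2$ inner product on $[0, 1]$ stated in 5.4.14, stores polynomials as coefficient lists (lowest degree first), and uses hypothetical function names.

```python
from fractions import Fraction

def inner(p, q):
    """L2 inner product on [0, 1]; the integral of t^(i+j) is 1 / (i + j + 1)."""
    return sum(Fraction(a) * b / (i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def monic_orthogonal(max_degree):
    """Gram-Schmidt applied to the monomials 1, t, t^2, ..., t^max_degree."""
    basis = []
    for n in range(max_degree + 1):
        monomial = [Fraction(0)] * n + [Fraction(1)]   # coefficients of t^n
        q = list(monomial)
        for prev in basis:
            # subtract the component of t^n along each earlier q_k
            c = inner(monomial, prev) / inner(prev, prev)
            for i, a in enumerate(prev):
                q[i] -= c * a
        basis.append(q)
    return basis

for q in monic_orthogonal(4):
    print([str(c) for c in q])
```

The printed coefficient lists can be compared with $q_0, \ldots, q_4$ above.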

5.4.16. The formula for the norm follows from combining equations (5.48) and (5.59).
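5.4.17. $L_4(t) = t^4 - 16\,t^3 + 72\,t^2 - 96\,t + 24$, $\| L_4 \| = 24$,
$L_5(t) = t^5 - 25\,t^4 + 200\,t^3 - 600\,t^2 + 600\,t - 120$, $\| L_5 \| = 120$.

5.4.18. This is done by induction on $k$. For $k = 0$, we have
$$\int_0^{\infty} e^{-t}\, dt = -\,e^{-t}\, \Big|_{t=0}^{\infty} = 1.$$
Integration by parts implies
$$\int_0^{\infty} t^k e^{-t}\, dt = -\,t^k e^{-t}\, \Big|_{t=0}^{\infty} + \int_0^{\infty} k\, t^{k-1} e^{-t}\, dt = k \int_0^{\infty} t^{k-1} e^{-t}\, dt = k\,(k-1)! = k!\,.$$

5.4.19. $p_0(t) = 1$, $p_1(t) = t$, $p_2(t) = t^2 - \frac12$, $p_3(t) = t^3 - \frac32 t$, $p_4(t) = t^4 - 3\,t^2 + \frac34$.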

5.4.20.
(a) To prove orthogonality, use the change of variables $t = \cos\theta$ in the inner product integral, noting that $dt = -\sin\theta\, d\theta$, and so $d\theta = -\,\dfrac{dt}{\sqrt{1 - t^2}}$:
$$\langle T_m, T_n \rangle = \int_{-1}^{1} \frac{\cos(m \arccos t)\, \cos(n \arccos t)}{\sqrt{1 - t^2}}\, dt = \int_0^{\pi} \cos m\theta\, \cos n\theta\, d\theta = \begin{cases} \pi, & m = n = 0, \\ \frac{\pi}{2}, & m = n > 0, \\ 0, & m \ne n. \end{cases}$$
(b) $\| T_0 \| = \sqrt{\pi}$, $\| T_n \| = \sqrt{\frac{\pi}{2}}$, for $n > 0$.
(c) $T_0(t) = 1$,
$T_1(t) = t$,
$T_2(t) = 2\,t^2 - 1$,
$T_3(t) = 4\,t^3 - 3\,t$,
$T_4(t) = 8\,t^4 - 8\,t^2 + 1$,
$T_5(t) = 16\,t^5 - 20\,t^3 + 5\,t$,
$T_6(t) = 32\,t^6 - 48\,t^4 + 18\,t^2 - 1$.
[Graphs of the Chebyshev polynomials omitted.]
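
The polynomials listed in 5.4.20(c) can be regenerated and checked against the defining property $T_n(\cos\theta) = \cos n\theta$. This is a minimal sketch; it relies on the standard three-term recurrence $T_{k+1}(t) = 2\,t\,T_k(t) - T_{k-1}(t)$, which is not stated in the solution above and is brought in here as an assumption, and the function names are hypothetical.

```python
import math

def chebyshev_coeffs(n):
    """Coefficients of T_n, lowest degree first, via T_{k+1} = 2 t T_k - T_{k-1}."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in cur]   # multiply T_k by 2 t
        for i, c in enumerate(prev):
            nxt[i] -= c                    # subtract T_{k-1}
        prev, cur = cur, nxt
    return cur

def evaluate(coeffs, t):
    """Evaluate a polynomial given by its coefficient list at the point t."""
    return sum(c * t ** i for i, c in enumerate(coeffs))

theta = 0.7
for n in range(7):
    coeffs = chebyshev_coeffs(n)
    print(n, coeffs, evaluate(coeffs, math.cos(theta)) - math.cos(n * theta))
```

The final printed column is the discrepancy between $T_n(\cos\theta)$ and $\cos n\theta$ at one sample angle, which should be at the level of rounding error.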

5.4.21. The Gram-Schmidt process will lead to the monic Chebyshev polynomials $q_n(t)$, obtained by dividing each $T_n(t)$ by its leading coefficient: $q_0(t) = 1$, $q_1(t) = t$, $q_2(t) = t^2 - \frac12$, $q_3(t) = t^3 - \frac34 t$, $q_4(t) = t^4 - t^2 + \frac18$, etc. This follows from the characterization of each $q_n(t)$ as the unique monic polynomial of degree $n$ that is orthogonal to all polynomials of degree $< n$ under the weighted inner product, cf. (5.43); any other degree $n$ polynomial with the same property must be a scalar multiple of $q_n(t)$, or, equivalently, of $T_n(t)$.
5.4.22. A basis for the solution set is given by $e^x$ and $e^{2x}$. The Gram-Schmidt process yields the orthogonal basis $e^{x}$ and $e^{2x} - \dfrac{2\,(e^3 - 1)}{3\,(e^2 - 1)}\, e^x$.

5.4.23. $\cos x$, $\sin x$, $e^x$ form a basis for the solution space. Applying the Gram-Schmidt process, we find the orthogonal basis $\cos x$, $\sin x$, $e^x + \dfrac{\sinh \pi}{\pi} \cos x - \dfrac{\sinh \pi}{\pi} \sin x$.

5.4.24. Starting with a system of linearly independent functions $f_j^{(1)}(t) = f_j(t)$, $j = 1, \ldots, n$, in an inner product space, we recursively compute the orthonormal system $u_1(t), \ldots, u_n(t)$ by setting
$$u_j = \frac{f_j^{(j)}}{\| f_j^{(j)} \|}, \qquad f_k^{(j+1)} = f_k^{(j)} - \langle f_k^{(j)}, u_j \rangle\, u_j, \qquad \text{for } j = 1, \ldots, n, \quad k = j + 1, \ldots, n.$$
The algorithm leads to the same orthogonal polynomials.
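
The recursion in 5.4.24 can be sketched in code as follows. For simplicity this illustration works with vectors in $\mathbb{R}^m$ under the dot product rather than with functions; that substitution, the function names, and the sample vectors are all assumptions made for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def stable_gram_schmidt(vectors):
    """Orthonormalize by updating all remaining vectors after each u_j is found."""
    work = [[float(x) for x in v] for v in vectors]    # work[k] plays the role of f_k^(j)
    basis = []
    for j in range(len(work)):
        norm = math.sqrt(dot(work[j], work[j]))
        u = [x / norm for x in work[j]]                # u_j = f_j^(j) / ||f_j^(j)||
        basis.append(u)
        for k in range(j + 1, len(work)):
            c = dot(work[k], u)                        # <f_k^(j), u_j>
            work[k] = [x - c * y for x, y in zip(work[k], u)]
    return basis

U = stable_gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
for u in U:
    print([round(x, 6) for x in u])
```

Each remaining vector is corrected as soon as a new $u_j$ is available, which is the only difference from the classical ordering of the same subtractions.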
5.4.25.
(a) First, when $s = 0$, $t = -1$, while when $s = 1$, $t = 1$. Since $\frac{dt}{ds} > 0$ for $s > 0$, the function is monotone increasing, with inverse $s = +\sqrt{\frac12 (t + 1)}$.
(b) If $p(t)$ is any polynomial, so is $q(s) = p(2\,s^2 - 1)$. The formulas are $q_0(s) = 1$, $q_1(s) = 2\,s^2 - 1$, $q_2(s) = 4\,s^4 - 4\,s^2 + \frac23$, $q_3(s) = 8\,s^6 - 12\,s^4 + \frac{24}{5} s^2 - \frac25$.
(c) No. For example, $\langle q_0, q_1 \rangle = \int_0^1 (2\,s^2 - 1)\, ds = -\frac13$. They are orthogonal with respect to the weighted inner product
$$\langle F, G \rangle = \int_0^1 F(s)\, G(s)\, s\, ds = \frac14 \int_{-1}^{1} f(t)\, g(t)\, dt \quad \text{provided} \quad F(s) = f(2\,s^2 - 1), \ G(s) = g(2\,s^2 - 1).$$

5.4.26.
(a) By the change of variables formula for integrals, since $ds = -\,e^{-t}\, dt$, then
$$\langle f, g \rangle = \int_0^{\infty} f(t)\, g(t)\, e^{-t}\, dt = \int_0^1 F(s)\, G(s)\, ds \quad \text{when} \quad f(t) = F(e^{-t}), \ g(t) = G(e^{-t}).$$
The change of variables does not map polynomials to polynomials.
(b) The resulting exponential functions $E_k(t) = \widetilde P_k(e^{-t})$ are orthogonal with respect to the $L^2$ inner product on $[\,0, \infty)$.
(c) The resulting logarithmic polynomials $Q_k(s) = q_k(-\log s)$ are orthogonal with respect to the $L^2$ inner product on $[\,0, 1\,]$. Note that $\langle Q_j, Q_k \rangle = \int_0^1 q_j(-\log s)\, q_k(-\log s)\, ds$ is finite since the logarithmic singularity at $s = 0$ is integrable.

5.5.1. (a) v2 , v4 , (b) v3 , (c) v2 , (d) v2 , v3 , (e) v1 , (f ) v1 , v3 , v4 .


5.5.2.
(a)

1
B3
B
B
@

C
1C
C,
3A
1
3
0

(b)
1

4
B 7
B 2
B
@ 7
6
7
0

1
C
C
C
A

B
C
@ .2857 A,
0

5 3C 2B 2C B
B
5.5.3. B
@ 2 A @ 2 A = B
@
7 1
3 2
5.5.4. Orthogonal basis:

0
B
B
@

.5714

.8571

17
21
58
21
43
21
1

1
C
C
C
A

(c)

B
B
B
@

7
9
11
9
1
9
1

1
C
C
C
A

B
C
@ 1.2222 A,

(d)

1007
B 4225

C
B 301 C
C
B
@ 4225 A
60
169

0
B
@

11
B 21
B 10
B 21
B
2
B
@ 7
10
21
0

53

2
3
7
3

1
C
C
C
A

0
B
@

.5333
2.9333 C
A.
1.3333

5.5.7. ( 1.3, .5, .2, .1 )T .

5.5.8.
(a) The entries of $c = A^T v$ are $c_i = u_i^T v = u_i \cdot v$, and hence, by (1.11), $w = P\,v = A\,c = c_1 u_1 + \cdots + c_k u_k$, reproducing the projection formula (5.63).
0
1
0
1
4
4
2
1
1
0
1
1
1
9 9
9C
B
2 0 2C
B
2C
4
B 4
C,
(b) (i) @ 21 12 A, (ii) B
(iii) B
0C
@ 0 1
A,
9
9 9 A
@
1
1
2
2
1
2
2
2 0 2
9 9
9


.23833
.07123 C
A.
.35503

021
C
C
C
B
B
B3C
C
C
B 6C
B
B7C
C
C
B 5C
B
B3C
C,
C,
C,
B
B
B C.
5.5.5. (a)
(b)
(c)
(d)
C
C
B 3C
B
B C
A
@ 5A
@ 1 A
@ 0A
5
5
6
3
3
5
1
0
1
0
1
0
1
0
1
10
15
1

.5263
.88235
5
19
17
C
B
C
B
C
B
C
C
B
C
C
B
C
B 5 C @ .2632 A,
B 19 C @ 1.11765 A,
B 1 C,
(b) B
(c) B
5.5.6. (i) (a) B
@ 19 A
@ 17 A
@ 5A
.7895
.05882
1
15
1
19
0
017
15
1
0
1
0 1
0
1
191
5
.0788
0
.2632
B 2425 C
B 19 C
B
C
B
C
B
C
B
C
B
463 C @ .1909 A.
5 C @ .1316 C
(d) B
(ii) (a) @ 0 A, (b) B
A,
@ 2425 A
@ 38 A
.1237
0
.3947
12
15
97
38
0
0
1
1
0
1
0
1
13
553
.5909
.0387
B 22 C
B 14282 C
B
C
B 1602 C
B
C
9 C
C @ .4091 A,
B
C @ .2243 A.
B
(d)
(c) B
@ 22 A
@ 7141 A
.0455
.2487
48
1
22
193
0

.1111

3
1 C
B 15
B 2C
4
2B
C
C=B
B
B 44
orthogonal projection: B
2
@ 2A @
A
@ 15
3
5
1
52
43
1

.7778

.8095
C
B
@ 2.7619 A.
2.0476

1 C B 32 C
B
2C
2C
A;
A, @
5
1
2
0

(iv )

0
B
B
B
B
B
@

1
9
2
9
2
9

92

2
9

8
9
29
T

8
9

0
29
29

C
C
C
C,
C
A

(v )

03
B4
B1
B4
B
B1
@4
1
4

1
4
3
4
14
14

1
4
14
3
4
14

1
4
14
14
3
4

C
C
C
C.
C
A

1
0 92
9
(c) $P^T = (A\,A^T)^T = A\,A^T = P$.
(d) The entries of $A^T A$ are the inner products $u_i \cdot u_j$, and hence, by orthonormality, $A^T A = I$. Thus, $P^2 = (A\,A^T)(A\,A^T) = A\,I\,A^T = A\,A^T = P$. Geometrically, $w = P\,v$ is the orthogonal projection of $v$ onto the subspace $W$, i.e., the closest point. In particular, if $w \in W$ already, then $P\,w = w$. Thus, $P^2 v = P\,w = w = P\,v$ for all $v \in \mathbb{R}^n$, and hence $P^2 = P$.
(e) Note that $P$ is the Gram matrix for $A^T$, and so, by Proposition 3.36, $\operatorname{rank} P = \operatorname{rank} A^T = \operatorname{rank} A$.
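
The properties of $P = A\,A^T$ established in parts (c) and (d) can be confirmed on a small example. The sketch below is illustrative only: the two orthonormal columns $\frac13(1, 2, 2)^T$ and $\frac13(2, 1, -2)^T$ are a hypothetical choice, not taken from the exercise, and the helper names are assumptions.

```python
from fractions import Fraction

def transpose(X):
    return [list(row) for row in zip(*X)]

def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

third = Fraction(1, 3)
# columns of A: u1 = (1, 2, 2)/3 and u2 = (2, 1, -2)/3, an orthonormal pair
A = [[1 * third, 2 * third],
     [2 * third, 1 * third],
     [2 * third, -2 * third]]

P = matmul(A, transpose(A))

print(matmul(transpose(A), A))      # A^T A, expected to be the 2 x 2 identity
print(P == transpose(P))            # symmetry, part (c)
print(matmul(P, P) == P)            # idempotence, part (d)
```

Exact fractions are used so that the equalities $P^T = P$ and $P^2 = P$ can be tested without rounding tolerance.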

5.5.9. (a) If $x, y$ are orthogonal to $V$, then $\langle x, v \rangle = 0 = \langle y, v \rangle$ for every $v \in V$. Thus, $\langle c\,x + d\,y, v \rangle = c\,\langle x, v \rangle + d\,\langle y, v \rangle = 0$ and hence $c\,x + d\,y$ is orthogonal to every $v \in V$.
(b)

5.5.10.

23 , 34 , 1, 0

12 , 34 , 0, 1

T
1
1
.
,

,
2
2
2

5.5.11. (a)
5.5.12.

17 , 0

4 2 25 17
7 , 7 , 14 , 14

(b)

9
4
14 , 31

(c)

1 2
30 , 7

47 ,

5.5.13. orthogonal basis: ( 1, 0, 2, 1 )T , ( 1, 1, 0, 1 )T ,

1
1
2 , 1, 0, 2

closest point = orthogonal projection =


5.5.14. orthogonal basis: ( 1, 0, 2, 1 )T ,

5
1
3
4 , 1, 2 , 4

closest point = orthogonal

32 , 2, 32 ,

4
3

15
21 3
9 T
,

,
,

;
22
22 11
22

8
8 16 T
.
projection = 7 , 2, 7 , 7

5.5.15.
(a) $p_1(t) = 14 + \frac72 t$, $p_2(t) = p_3(t) = 14 + \frac72 t + \frac{1}{14}(t^2 - 2)$;
(b) $p_1(t) = .285714 + 1.01429\,t$, $p_2(t) = .285714 + 1.01429\,t - .0190476\,(t^2 - 4)$,
$p_3(t) = .285714 + 1.01429\,t - .0190476\,(t^2 - 4) - .008333\,(t^3 - 7\,t)$;
(c) $p_1(t) = 100 + \frac{80}{7} t$, $p_2(t) = p_3(t) = 100 + \frac{80}{7} t - \frac{20}{21}(t^2 - 4)$.

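5.5.16.
(a) The key point is that, since the sample points are symmetric, $\overline{t^k} = 0$ whenever $k$ is odd. Thus,
$$\langle q_0, q_1 \rangle = \frac1n \sum_{i=1}^n t_i = \overline{t} = 0, \qquad \langle q_0, q_2 \rangle = \frac1n \sum_{i=1}^n \bigl(t_i^2 - \overline{t^2}\bigr) = \overline{t^2} - \overline{t^2} = 0,$$
$$\langle q_0, q_3 \rangle = \frac1n \sum_{i=1}^n \left(t_i^3 - \frac{\overline{t^4}}{\overline{t^2}}\, t_i\right) = \overline{t^3} - \frac{\overline{t^4}}{\overline{t^2}}\, \overline{t} = 0, \qquad \langle q_1, q_2 \rangle = \frac1n \sum_{i=1}^n t_i \bigl(t_i^2 - \overline{t^2}\bigr) = \overline{t^3} - \overline{t}\; \overline{t^2} = 0,$$
$$\langle q_1, q_3 \rangle = \frac1n \sum_{i=1}^n t_i \left(t_i^3 - \frac{\overline{t^4}}{\overline{t^2}}\, t_i\right) = \overline{t^4} - \frac{\overline{t^4}}{\overline{t^2}}\, \overline{t^2} = 0,$$
$$\langle q_2, q_3 \rangle = \frac1n \sum_{i=1}^n \bigl(t_i^2 - \overline{t^2}\bigr) \left(t_i^3 - \frac{\overline{t^4}}{\overline{t^2}}\, t_i\right) = \overline{t^5} - \overline{t^2}\; \overline{t^3} - \frac{\overline{t^4}}{\overline{t^2}}\, \overline{t^3} + \overline{t^4}\; \overline{t} = 0.$$
(b) $q_4(t) = t^4 - \dfrac{\overline{t^6} - \overline{t^2}\; \overline{t^4}}{\overline{t^4} - \bigl(\overline{t^2}\bigr)^2}\, \bigl(t^2 - \overline{t^2}\bigr) - \overline{t^4}$, $\qquad \| q_4 \|^2 = \overline{t^8} - \bigl(\overline{t^4}\bigr)^2 - \dfrac{\bigl(\overline{t^6} - \overline{t^2}\; \overline{t^4}\bigr)^2}{\overline{t^4} - \bigl(\overline{t^2}\bigr)^2}$.
(c) $p_4(t) = .3429 + .7357\,t + .07381\,(t^2 - 4) - .008333\,(t^3 - 7\,t) + .007197 \left(t^4 - \frac{67}{7} t^2 + \frac{72}{7}\right)$.

5.5.17.
(a) $p_4(t) = 14 + \frac72 t + \frac{1}{14}(t^2 - 2) - \frac{5}{12} \left(t^4 - \frac{31}{7} t^2 + \frac{72}{35}\right)$.
(b) $p_4(t) = .2857 + 1.0143\,t - .019048\,(t^2 - 4) - .008333\,(t^3 - 7\,t) + .011742 \left(t^4 - \frac{67}{7} t^2 + \frac{72}{7}\right)$.
(c) $p_4(t) = 100 + \frac{80}{7} t - \frac{20}{21}(t^2 - 4) - \frac{5}{66} \left(t^4 - \frac{67}{7} t^2 + \frac{72}{7}\right)$.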

5.5.18. Because, according to (5.65), the k th Gram-Schmidt vector belongs to the subspace
spanned by the first k of the original basis vectors.
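5.5.19.
(a) Since $t_i = t_0 + i\,h$, we have $\overline{t} = t_0 + \frac12 n\,h$ and so $s_i = \left(i - \frac12 n\right) h$. In particular, $s_{n-i} = \left(\frac12 n - i\right) h = -\,s_i$, proving symmetry of the points.
(b) Since $p(t_i) = q(t_i - \overline{t}\,) = q(s_i)$, the least squares errors coincide:
$$\sum_{i=1}^n \bigl[\, p(t_i) - y_i \,\bigr]^2 = \sum_{i=1}^n \bigl[\, q(s_i) - y_i \,\bigr]^2,$$
and hence $q(s)$ minimizes the former if and only if $p(t) = q(t - \overline{t}\,)$ minimizes the latter.
(c) $p_1(t) = -2 + \frac{92}{35} \left(t - \frac72\right) = -\frac{56}{5} + \frac{92}{35} t = -11.2 + 2.6286\,t$,
$p_2(t) = -2 + \frac{92}{35} \left(t - \frac72\right) + \frac{9}{56} \left[ \left(t - \frac72\right)^2 - \frac{35}{12} \right] = -\frac{97}{10} + \frac{421}{280} t + \frac{9}{56} t^2 = -9.7 + 1.503\,t + .1607\,t^2$.

5.5.20.
$q_0(t) = 1$, $\quad q_1(t) = t - \overline{t}$, $\quad q_2(t) = t^2 - \dfrac{\overline{t^3} - \overline{t}\; \overline{t^2}}{\overline{t^2} - \overline{t}^{\,2}}\, \bigl(t - \overline{t}\,\bigr) - \overline{t^2}$,

$\mathbf{q}_0 = \mathbf{t}_0$, $\quad \mathbf{q}_1 = \mathbf{t}_1 - \overline{t}\; \mathbf{t}_0$, $\quad \mathbf{q}_2 = \mathbf{t}_2 - \dfrac{\overline{t^3} - \overline{t}\; \overline{t^2}}{\overline{t^2} - \overline{t}^{\,2}}\, \bigl(\mathbf{t}_1 - \overline{t}\; \mathbf{t}_0\bigr) - \overline{t^2}\; \mathbf{t}_0$,

$\| q_0 \|^2 = 1$, $\quad \| q_1 \|^2 = \overline{t^2} - \overline{t}^{\,2}$, $\quad \| q_2 \|^2 = \overline{t^4} - \bigl(\overline{t^2}\bigr)^2 - \dfrac{\bigl(\overline{t^3} - \overline{t}\; \overline{t^2}\bigr)^2}{\overline{t^2} - \overline{t}^{\,2}}$.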

5.5.21. For simplicity, we assume $\ker A = \{0\}$. According to Exercise 5.3.33, orthogonalizing the basis vectors for $\operatorname{rng} A$ is the same as factorizing $A = Q\,R$ where the columns of $Q$ are the orthonormal basis vectors, while $R$ is a nonsingular upper triangular matrix. The formula for the coefficients $c = (c_1, c_2, \ldots, c_n)^T$ of $v = b$ in (5.63) is equivalent to the matrix formula $c = Q^T b$. But this is not the least squares solution
$$x^\star = (A^T A)^{-1} A^T b = (R^T Q^T Q\,R)^{-1} R^T Q^T b = (R^T R)^{-1} R^T Q^T b = R^{-1} Q^T b = R^{-1} c.$$
Thus, to obtain the least squares solution, John needs to multiply his result by $R^{-1}$.

5.5.22. Note that $Q^T Q = I$, while $R$ is a nonsingular square matrix. Therefore, the least squares solution is
$$x^\star = (A^T A)^{-1} A^T b = (R^T Q^T Q\,R)^{-1} R^T Q^T b = (R^T R)^{-1} R^T Q^T b = R^{-1} Q^T b.$$
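
The formula $x^\star = R^{-1} Q^T b$ from 5.5.22 can be tried on a small system. The following is a minimal sketch under stated assumptions: the $3 \times 2$ matrix and right hand side are hypothetical data, the factorization uses plain Gram-Schmidt on the columns, and $R\,x = c$ is solved by back substitution instead of forming $R^{-1}$.

```python
import math

def qr_gram_schmidt(A):
    """Return (Q, R) with A = Q R; A is a list of rows with independent columns."""
    m, n = len(A), len(A[0])
    cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = list(cols[j])
        for i, q in enumerate(q_cols):
            R[i][j] = sum(a * b for a, b in zip(q, cols[j]))   # r_ij = q_i . a_j
            v = [a - R[i][j] * b for a, b in zip(v, q)]
        R[j][j] = math.sqrt(sum(a * a for a in v))
        q_cols.append([a / R[j][j] for a in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(m)]
    return Q, R

def least_squares_qr(A, b):
    Q, R = qr_gram_schmidt(A)
    n = len(R)
    c = [sum(Q[i][j] * b[i] for i in range(len(b))) for j in range(n)]   # c = Q^T b
    x = [0.0] * n
    for i in reversed(range(n)):                                        # solve R x = c
        s = c[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / R[i][i]
    return x

A = [[1, 0], [1, 1], [1, 2]]
b = [1, 2, 4]
print(least_squares_qr(A, b))
```

For this data the normal equations are $3x + 3y = 7$, $3x + 5y = 10$, so the result can be compared with $x = \frac56$, $y = \frac32$.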
5.5.23. The solutions
are, of course,
0
1 the same:
!
!
.30151
.79455
3.31662 .90453
.06667
B
C
(a) Q = @ .90453 .06356 A, R =
, x=
;
0
2.86039
.91111
.30151
.60386
0
1
.8 .43644
!
!
B
.65465 C
5
0
.04000
B .4
C
C,
(b) Q = B
R=
, x=
;
@ .2 .43644 A
0 4.58258
.38095
.4
.43644
0
0
0
1
1
1
.53452
.61721
.57735
3.74166
.26726 1.87083
.66667
B
B
C
C
0
1.38873 3.24037 C
(c) Q = B
@ .80178 .15430 .57735 A, R = @
A, x = @ 1.66667 A;
.26726 .77152
.57735
0
0
1.73205
1.00000
0
1
0
1
.18257
.36515 .12910
5.47723 2.19089
0
B
C
B .36515 .18257 .90370 C
B
C
C,
R
=
0
1.09545
3.65148
(d) Q = B
@
A,
@
0
.91287 .12910 A
0
0
2.58199
.91287
0
.38730
x = ( .33333, 2.00000, .75000 )T ;
1
0
0
1
.57735
.51640 .15811 .20412
1.73205
.57735
.57735 .57735
C
B 0
.77460
.15811
.20412
B
C
B
0
1.29099 .25820 1.03280 C
B
C
C, R = B
C,
B .57735 .25820
.47434
.61237 C
(e) Q = B
C
B
@
A
0
0
1.26491
.31623
@ 0
0
.79057 .61237 A
0
0
0
1.22474
.57735 .25820 .31623 .40825
x = ( .33333, 2.00000, .33333, 1.33333 )T .
5.5.24. The second method is more efficient! Suppose the system is $A\,x = b$ where $A$ is an $m \times n$ matrix. Constructing the normal equations requires $m\,n^2$ multiplications and $(m - 1)\,n^2 \approx m\,n^2$ additions to compute $A^T A$ and an additional $n\,m$ multiplications and $n\,(m - 1)$ additions to compute $A^T b$. To solve the normal equations $A^T A\,x = A^T b$ by Gaussian Elimination requires $\frac13 n^3 + n^2 - \frac13 n \approx \frac13 n^3$ multiplications and $\frac13 n^3 + \frac12 n^2 - \frac56 n \approx \frac13 n^3$ additions. On the other hand, to compute the $A = Q\,R$ decomposition by Gram-Schmidt requires $(m + 1)\,n^2 \approx m\,n^2$ multiplications and $\frac12 (2\,m + 1)\,n\,(n - 1) \approx m\,n^2$ additions. To compute $c = Q^T b$ requires $m\,n$ multiplications and $m\,(n - 1)$ additions, while solving $R\,x = c$ by Back Substitution requires $\frac12 n^2 + \frac12 n$ multiplications and $\frac12 n^2 - \frac12 n$ additions. Thus, the first step requires about the same amount of work as forming the normal equations, and the second two steps are considerably more efficient than Gaussian Elimination.
5.5.25.
(a) If $A = Q$ has orthonormal columns $u_1, \ldots, u_n$, then
$$\| Q\,x^\star - b \|^2 = \| b \|^2 - \| Q^T b \|^2 = \sum_{i=1}^m b_i^2 - \sum_{i=1}^n (u_i \cdot b)^2.$$
(b) If the columns $v_1, \ldots, v_n$ of $A$ are orthogonal, then $A^T A$ is a diagonal matrix with the square norms $\| v_i \|^2$ along its diagonal, and so
$$\| A\,x^\star - b \|^2 = \| b \|^2 - b^T A\,(A^T A)^{-1} A^T b = \sum_{i=1}^m b_i^2 - \sum_{i=1}^n \frac{(v_i \cdot b)^2}{\| v_i \|^2}.$$

5.5.26.
(a) The orthogonal projection is $w = A\,x^\star$ where $x^\star = (A^T A)^{-1} A^T b$ is the least squares solution to $A\,x = b$, and so $w = A\,(A^T A)^{-1} A^T b = P\,b$.
(b) If the columns of $A$ are orthonormal, then $A^T A = I$, and so $P = A\,A^T$.
(c) Since $Q$ has orthonormal columns, $Q^T Q = I$ while $R$ is invertible, so
$$P = A\,(A^T A)^{-1} A^T = Q\,R\,(R^T Q^T Q\,R)^{-1} R^T Q^T = Q\,R\,(R^T R)^{-1} R^T Q^T = Q\,Q^T.$$
Note: in the rectangular case, the rows of $Q$ are not necessarily orthonormal vectors, and so $Q\,Q^T$ is not necessarily the identity matrix.
5.5.27.

.25
B
B .25
(a) P = B
@ .35
.05

(b) P =

(c) P =

0
B
B
B
B
B
@

1
3
1
3

0
1
3

.28

B
B .4
B
@ .2

.04

(d) P =

0
B
B
B
B
B
@

7
15
25
4
15
2
15

.25
.25
.35
.05

13

7
9
2
9
1
9

.4
.6
.2
.2

25
7
10
1
5
1
10

.35
.35
.49
.07
0
29
1
9
29

.2
.2
.4
.4

4
15
1
5
13
15
1
15

.05
.05 C
C
C,
.07 A
.01

1
3
1
9
29
7
9

C
C
C
C,
C
A

Pv=

.04
.2 C
C
C,
.4 A
.72
2
15
1
10
1
15
29
30

.25
B
C
B .25 C
C;
Pv=B
@ .35 A
.05
0
B
B
B
B
B
@

Pv=

1
3
1
3

C
C
C
C;
0C
A
1
3

.28

B
B .4
B
@ .2

.04

C
C
C
C,
C
A

Pv=

5.5.28. Both are the same quadratic polynomial:

1
5

7
15
52
4
15
2
15

4
7

B
B
B
B
B
@

C
C
C;
A

C
C
C
C.
C
A

12 +

3 2
2t

3
= 35
+

6 2
7t .

5.5.29.
Quadratic: $\frac15 + \frac25 (2\,t - 1) + \frac27 (6\,t^2 - 6\,t + 1) = \frac{3}{35} - \frac{32}{35} t + \frac{12}{7} t^2 = .08571 - .91429\,t + 1.71429\,t^2$;
Cubic: $\frac15 + \frac25 (2\,t - 1) + \frac27 (6\,t^2 - 6\,t + 1) + \frac{1}{10} (20\,t^3 - 30\,t^2 + 12\,t - 1) = -\frac{1}{70} + \frac27 t - \frac97 t^2 + 2\,t^3 = -.01429 + .2857\,t - 1.2857\,t^2 + 2\,t^3$.

5.5.30. $1.718282 + .845155\,(2\,t - 1) + .139864\,(6\,t^2 - 6\,t + 1) + .013931\,(20\,t^3 - 30\,t^2 + 12\,t - 1) = .99906 + 1.0183\,t + .421246\,t^2 + .278625\,t^3$.

5.5.31. Linear: $\frac14 + \frac{9}{20} (2\,t - 1) = -\frac15 + \frac{9}{10} t$; minimum value: $\frac{9}{700} = .01286$.
Quadratic: $\frac14 + \frac{9}{20} (2\,t - 1) + \frac14 (6\,t^2 - 6\,t + 1) = \frac{1}{20} - \frac35 t + \frac32 t^2$; minimum value: $\frac{1}{2800} = .0003571$.
Cubic: $\frac14 + \frac{9}{20} (2\,t - 1) + \frac14 (6\,t^2 - 6\,t + 1) + \frac{1}{20} (20\,t^3 - 30\,t^2 + 12\,t - 1) = t^3$; minimum value: $0$.
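
The coefficients and minimum values in 5.5.31 can be reproduced exactly. This sketch assumes, as the cubic case indicates, that the function being approximated is $t^3$ on $[0, 1]$ and that the basis consists of the shifted Legendre polynomials $1$, $2t - 1$, $6t^2 - 6t + 1$, $20t^3 - 30t^2 + 12t - 1$ appearing in the expansions above; the function names are hypothetical.

```python
from fractions import Fraction

# shifted Legendre polynomials on [0, 1], coefficients listed lowest degree first
SHIFTED_LEGENDRE = [
    [1],
    [-1, 2],
    [1, -6, 6],
    [-1, 12, -30, 20],
]

def inner(p, q):
    """L2 inner product on [0, 1] of two polynomials given as coefficient lists."""
    return sum(Fraction(a * b, i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def legendre_coefficients(f, degree):
    """Coefficients c_k = <f, P_k> / ||P_k||^2 for k = 0, ..., degree."""
    return [inner(f, P) / inner(P, P) for P in SHIFTED_LEGENDRE[: degree + 1]]

def squared_error(f, degree):
    """Least squares error ||f||^2 - sum_k c_k^2 ||P_k||^2."""
    cs = legendre_coefficients(f, degree)
    return inner(f, f) - sum(c * c * inner(P, P)
                             for c, P in zip(cs, SHIFTED_LEGENDRE))

cube = [0, 0, 0, 1]   # the polynomial t^3
for degree in (1, 2, 3):
    print(degree, legendre_coefficients(cube, degree), squared_error(cube, degree))
```

Because the basis is orthogonal, raising the degree only appends a coefficient and subtracts one more term from the error, which is why the linear and quadratic coefficients reappear unchanged in the cubic case.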

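5.5.32. (a) They are both the same quadratic polynomial:
$$\frac{2}{\pi} + \frac{10\,\pi^2 - 120}{\pi^3} \left(1 - \frac{6}{\pi}\, t + \frac{6}{\pi^2}\, t^2\right) = -\frac{120}{\pi^3} + \frac{12}{\pi} + \frac{720 - 60\,\pi^2}{\pi^4}\, t + \frac{-720 + 60\,\pi^2}{\pi^5}\, t^2 = -.050465 + 1.312236\,t - .417698\,t^2.$$
(b) [Plot omitted.] The maximum error is $.0504655$ at the ends $t = 0, \pi$.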

5.5.33. $.459698 + .427919\,(2\,t - 1) - .0392436\,(6\,t^2 - 6\,t + 1) - .00721219\,(20\,t^3 - 30\,t^2 + 12\,t - 1) = -.000252739 + 1.00475\,t - .0190961\,t^2 - .144244\,t^3$.
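5.5.34.
$p(t) = 1.175201 + 1.103638\,t + .357814 \left(\frac32 t^2 - \frac12\right) + .070456 \left(\frac52 t^3 - \frac32 t\right) + .009965 \left(\frac{35}{8} t^4 - \frac{15}{4} t^2 + \frac38\right)$
$\qquad + \; .00109959 \left(\frac{63}{8} t^5 - \frac{35}{4} t^3 + \frac{15}{8} t\right) + .00009945 \left(\frac{231}{16} t^6 - \frac{315}{16} t^4 + \frac{105}{16} t^2 - \frac{5}{16}\right)$
$\qquad = 1. + 1.00002\,t + .500005\,t^2 + .166518\,t^3 + .0416394\,t^4 + .00865924\,t^5 + .00143587\,t^6$.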


5.5.35.
(a) 23
(b)

2
35 2
4
= 15
3t+ 5
2 15 t + 4 t ; it gives the
2
t2 dt among all quadratic polynomials
k p(t) 1t k2 =
p(t) 1t
0

2
2
3
3
35
4
126
10
2
5
t3 158 t + 15t
2 3 t 4 + 4 t 3t+ 5 5
14 28
3
= 12 42 t + 56 t2 126
5 t .
10
3

3
4

35
4
Z 1

t2

smallest value to
p(t).

30
25

(c)

20
15
10
5
0.2

0.4

0.6

0.8

(d) Both do a reasonable job approximating from $t = .2$ to $1$, but can't keep close near the singularity at $0$, owing to the small value of the weight function $w(t) = t^2$ there. The cubic does a marginally better job near the singularity.
5.5.36. Quadratic: $.6215 + .3434\,(t - 1) - .07705\,(t^2 - 4\,t + 2) = .1240 + .6516\,t - .07705\,t^2$;
Cubic: $.6215 + .3434\,(t - 1) - .07705\,(t^2 - 4\,t + 2) + .01238\,(t^3 - 9\,t^2 + 18\,t - 6) = .0497 + .8744\,t - .1884\,t^2 + .01238\,t^3$.
The accuracy is reasonable up until $t = 2$ for the quadratic and $t = 4$ for the cubic polynomial.
[Plot omitted.]

5.6.1. (a) W has basis

011 0 11

B3C B 3C
@ 1 A, @
0 A,

dim W = 2; (b) W has basis

1
B2
B 5
B
@ 4


C
C
C,
A

dim W

= 1; (c) W

dim W = 1; (e) W
0

3
C
5.6.2. (a) B
@ 4 A;
5

(b)

1
C
B
B 3C
C;
5.6.3. (a) B
@ 2A
1
5.6.4. (a) w =
0
B

B
z=B
@

2
3
1
3
1
3

(b)

011 031
B2C B2C
@ 1 A, @ 0 A;

1
3
@ 10 A,
1
10

C
C
C;
A

z=

(d) w =

5.6.5. (a) Span of


0

011
2
B C
B1C
B 4 C,
B C
@ 1A

(c)

0
B
B
B
@

2
B 7
B 3
B
@ 7
17

C
C
C,
A

2C
7C
C,
1A

1
C
(d) B
@ 1 A,
0

B4C
B7C
B C;
@ 0A

1
B C
@ 0 A.
1

1
B C
B0C
C,
(d) B
@1A
0

0
B
B
B
@

4
9
C B
C

(c) Span of B
= 2.
@ 1 A, @ 0 A; dim W
0
1
(e) W = {0}, and dim W = 0.

51

(d) Span of

B C
@ 1 A,

11
2
B
C
B7 C
B 4 C.
B
C
@ 0A

0
1
6
B 5 C
B 3
C
B
C
8 C
B1
C, z = B 42 C;
(c) w = B
(b) w =
25 A
@ 25 A
@ 3
6
31

13
25 0
25 0
1
1
4
7
0 1
5
B 11 C
B 11 C
1 C
B7C
B 1 C
B
C
C
B 11 C
B
B 3 C;
C, z = B 11 C.
B
z=B
(e)
w
=
1 C
1 C
B
B
@7A
@ 11 A
@ 11 A
1
13
2
7
11
11
1
0
12
B 8 C
C

B 15 C; dim W
= 1.
dim W = 2.
(b) Span of B
8 A
@

1
7
@ 10 A;
21
10

1
C
(c) B
@ 1 A;
1

1
C
B
B 1 C
C;
B
@ 0A
1

1
021 0
1
C
B3C B
@ 1 A, @ 0 A;

1 0

2
3
B
C

1C
has basis
= 2; (d) W has basis
A, @ 0 A, dim W
0
1
= {0}, dim W = 0.
B
@

C
C
C,
A

B3C
@ 2 A;

dim W = 1.

5.6.6. For the weighted inner product, the orthogonal complement W is the set of all vectors
v = ( x, y, z, w )T that satisfy the linear system
h v , w1 i = x + 2 y + 4 w = 0,

h v , w2 i = x + 2 y + 3 z 8 w = 0.

A non-orthogonal basis for W is z1 = ( 2, 1, 0, 0 )T , z2 = ( 4, 0, 4, 1 )T . Applying


GramSchmidt, the corresponding orthogonal basis is y1 = ( 2, 1, 0, 0 )T ,
y2 =

34 , 43 , 4, 1

. We decompose v = w + z, where w =

30
13
4
1
43 , 43 , 43 , 43

13 13 4
1
43 , 43 , 43 , 43

W,

z=
W . Here, since w1 , w2 are no longer orthogonal, its easier
to compute z first and then subtract w = v z.
5.6.7. (a) $\langle p, q \rangle = \int_{-1}^{1} p(x)\, q(x)\, dx = 0$ for all $q(x) = a + b\,x + c\,x^2$, or, equivalently,
$$\int_{-1}^{1} p(x)\, dx = \int_{-1}^{1} x\, p(x)\, dx = \int_{-1}^{1} x^2\, p(x)\, dx = 0.$$
Writing $p(x) = a + b\,x + c\,x^2 + d\,x^3 + e\,x^4$, the orthogonality conditions require $2\,a + \frac23 c + \frac25 e = 0$, $\frac23 b + \frac25 d = 0$, $\frac23 a + \frac25 c + \frac27 e = 0$.
(b) Basis: $t^3 - \frac35 t$, $t^4 - \frac67 t^2 + \frac{3}{35}$; $\dim W^\perp = 2$; (c) the preceding basis is orthogonal.

5.6.8. If $u, v \in W^\perp$, so $\langle u, w \rangle = \langle v, w \rangle = 0$ for all $w \in W$, then $\langle c\,u + d\,v, w \rangle = c\,\langle u, w \rangle + d\,\langle v, w \rangle = 0$ also, and so $c\,u + d\,v \in W^\perp$.

5.6.9. (a) If $w \in W \cap W^\perp$ then $w \in W^\perp$ must be orthogonal to every vector in $W$ and so $w \in W$ is orthogonal to itself, which implies $w = 0$. (b) If $w \in W$ then $w$ is orthogonal to every $z \in W^\perp$ and so $w \in (W^\perp)^\perp$.
5.6.10. (a) The only element orthogonal to all $v \in V$ is $0$, and hence $V^\perp$ contains only the zero vector. (b) Every $v \in V$ is orthogonal to $0$, and so belongs to $\{0\}^\perp$.
5.6.11. If $z \in W_2^\perp$ then $\langle z, w \rangle = 0$ for every $w \in W_2$. In particular, every $w \in W_1 \subset W_2$, and hence $z$ is orthogonal to every vector $w \in W_1$. Thus, $z \in W_1^\perp$, proving $W_2^\perp \subset W_1^\perp$.
5.6.12.
(a) We are given that $\dim W + \dim Z = n$ and $W \cap Z = \{0\}$. Now, $\dim W^\perp = n - \dim W$, $\dim Z^\perp = n - \dim Z$ and hence $\dim W^\perp + \dim Z^\perp = n$. Furthermore, if $v \in W^\perp \cap Z^\perp$ then $v$ is orthogonal to all vectors in both $W$ and $Z$, and hence also orthogonal to any vector of the form $w + z$ for $w \in W$ and $z \in Z$. But since $W, Z$ are complementary, every vector in $\mathbb{R}^n$ can be written as $w + z$ and hence $v$ is orthogonal to all vectors in $\mathbb{R}^n$, which implies $v = 0$.
(b) [Figure omitted; surviving labels: $Z$, $W$.]

5.6.13. Suppose $v \in (W^\perp)^\perp$. Then we write $v = w + z$ where $w \in W$, $z \in W^\perp$. By assumption, for every $y \in W^\perp$, we must have $0 = \langle v, y \rangle = \langle w, y \rangle + \langle z, y \rangle = \langle z, y \rangle$. In particular, when $y = z$, this implies $\| z \|^2 = 0$ and hence $z = 0$, which proves $v = w \in W$.
5.6.14. Every $w \in W$ can be written as $w = \sum_{i=1}^k a_i w_i$; every $z \in Z$ can be written as $z = \sum_{j=1}^l b_j z_j$. Then, using bilinearity, $\langle w, z \rangle = \sum_{i=1}^k \sum_{j=1}^l a_i b_j \langle w_i, z_j \rangle = 0$, and hence $W$ and $Z$ are orthogonal subspaces.

5.6.15.
(a) We are given that $\langle w_i, w_j \rangle = 0$ for all $i \ne j$ between $1$ and $m$ and between $m + 1$ and $n$. It is also $0$ if $1 \le i \le m$ and $m + 1 \le j \le n$ since every vector in $W$ is orthogonal to every vector in $W^\perp$. Thus, the vectors $w_1, \ldots, w_n$ are non-zero and mutually orthogonal, and so form an orthogonal basis.
(b) This is clear: $w \in W$ since it is a linear combination of the basis vectors $w_1, \ldots, w_m$; similarly $z \in W^\perp$ since it is a linear combination of the basis vectors $w_{m+1}, \ldots, w_n$.
5.6.16.
(a) Let $V = \{ \alpha + \beta\,x \} = P^{(1)}$ be the two-dimensional subspace of linear polynomials. Every $u(x) \in C^0(a, b)$ can be written as $u(x) = \alpha + \beta\,x + w(x)$ where $\beta = \dfrac{u(b) - u(a)}{b - a}$, $\alpha = u(a) - \beta\,a$, while $w(a) = w(b) = 0$, so $w \in W$. Moreover, a linear polynomial $\alpha + \beta\,x$ vanishes at $a$ and $b$ if and only if it is identically zero, and so $V$ satisfies the conditions for it to be a complementary subspace to $W$.
(b) The only continuous function which is orthogonal to all functions in $W$ is the zero function. Indeed, suppose $\langle v, w \rangle = \int_a^b v(x)\, w(x)\, dx = 0$ for all $w \in W$, and $v(c) > 0$ for some $a < c < b$. Then, by continuity, $v(x) > 0$ for $| x - c | < \delta$ for some $\delta > 0$. Choose $w(x) \in W$ so that $w(x) \ge 0$, $w(c) > 0$, but $w(x) \equiv 0$ for $| x - c | \ge \delta$. Then $v(x)\, w(x) \ge 0$, with $v(c)\, w(c) > 0$, and so $\int_a^b v(x)\, w(x)\, dx > 0$, which is a contradiction. The same proof works for $v(c) < 0$; only the inequalities are reversed. Therefore $v(x) = 0$ for all $a < x < b$, and, by continuity, $v(x) \equiv 0$ for all $x$. Thus, $W^\perp = \{0\}$, and there is no orthogonal complementary subspace.

5.6.17. Note: To show orthogonality of two subspaces, it suffices to check orthogonality of their
respective basis vectors.
!
!
!
!
1
2
1
2
(a) (i) Range:
; cokernel:
; corange:
; kernel:
;
2
1
2
1
!
!
!
!
1
2
1
2
(ii)

= 0; (iii)

= 0.
2
1
2
1
1
0 1 0 1
0
1
!
!
5
0
5
0
B C B C
B 5C
,
; kernel: {0};
(b) (i) Range: @ 1 A, @ 2 A; cokernel: @ 1 A; corange:
0
2
0
2
1
1
1
0 1 0
0 1 0
1
1
!
!
0
5
5
0
B C B 5C
C B 5C
0=
0 = 0.
(ii) B
@ 1 A @ 1 A = @ 2 A @ 1 A = 0; (iii)
0
2
2
0
11 0 1
1
0
0
1
0
1 0 1
0
1
0
1
3
1
0
3
B
C B C
B
C
B
C B C
B
(c) (i) Range: @ 1 A, @ 0 A; cokernel: @ 2 A; corange: @ 0 A, @ 1 A; kernel: @ 2 C
A;
2
3
1
3
2
1
0
1 0
1
0 1 0
1
0
1 0
1
0 1 0
1
0
3
1
3
1
3
0
3
B
C B
C
B C B
C
B
C B
C
B C B
(ii) @ 1 A @ 2 A = @ 0 A @ 2 A = 0; (iii) @ 0 A @ 2 A = @ 1 A @ 2 C
A = 0.
2
1
3
1
3 0 1 01 1
2
1
0
1 0 1
0
1
1
0
1
2
1
B C B C
B
C B C
B
C
B2C B3C
C , B C;
(d) (i) Range: @ 1 A, @ 1 A; cokernel: @ 1 A; corange: B
@0A @3A
0
3
1
1
2
1
0
1 0
1
0 1 0
1
1 0
0
2 B 31 C
1
2
1
1
B
1 C
C
B C B
C
C B
B
C B
23 C
C;
B
C, B
(ii)
kernel: B
@ 1 A @ 1 A = @ 1 A @ 1 A = 0;
C
B
@ 1A @
0A
1
3
1
0
0
1
0
1
0 1 0
1
0 1
0 1 0
1
0 1 0 11
1
2
1
1
0
2
0
3
3C
C
B C B
C
B C B
B C B
C
B C B
2
2C
B
C
1
2
2
3
1
B C B
C
B C B C
B C B
C
B3C B

B
CB
C=B CB 3 C=B CB
C=B CB 3 C
(iii) B
C = 0.
@0A @ 1A
@0A @
@
A
@
A
@
A
3
1
3
@ 0A
0A
1
0
1
2
0
2
1
1
1
0 1 0
3
0
0 1 0 1
0
1
B1C B 1C
3
1
3
B C B
C
B C B C
C B
C
B
C
B 4 C, B 1 C; kernel:
(e) (i) Range: @ 1 A, @ 1 A; cokernel: @ 1 A; corange: B
B C B
C
@
A
@
5
2
2
2
1 A
7
1


1 0

1 0

1 0

1
1
2
3
1
1
0 1 0
1
0 1 0
B 1 C B 1 C B 1 C
B 1 C B 1 C
3
1
3
3
B
C B
C B
C
B C B
C
B
C B
C B
C
B C B
C
C
B C B
B C B
C
B 1 C, B 0 C, B 0 C; (ii) @ 1 A @ 1 A = @ 1 A @ 1 A = 0; (iii) B 4 C B 1 C
B
C B
C B
C
B C B
C
@ 0A @ 1A @ 0A
@2A @ 0A
2
2
2
5
0
0
1
7
0
1
1 0
0
1
1 0
0
1
1 0
0
1
0 1 0
1
0 1 0
2
0
1
0
1
0
2
3
1
3
B 1 C B 1 C
B 1C B 1C
B 1 C B 1 C
B 1 C B 1 C
B1C B 1C
C
C B
B
C
C B
B
C
C B
B
C
B C B
C
B C B
C
C B
B
C
C B
B
C
C B
B
C
B C B
C
C B
B 4 C B 0 C = B 4 C B 0 C = B 1 C B 1 C = B 1 C B 0 C = B 1 C B 0 C = 0.
=B
C
C B
B
C
C B
B
C
C B
B
C
B C B
C
B C B
@ 1 A @ 0 A
@ 1 A @ 1 A
@ 1 A @ 0 A
@2A @ 0A
@2A @ 1A
1
1
00
1 0 01 0 1 1
1
7
0 0
7
1 0
1
1 0
1
1
3
1
1
1
0
B
B
B
C
C B C
B
C B
C
2 C
C B 1C
B 2 C B 1 C
B 3C B 7C
B
C, B
C; cokernel: B
C, B C; corange: B
C, B
C;
(f ) (i) Range: B
@ 3 A @ 5 A
@ 1A @0A
@ 0A @ 2A
1
4
0
1
2
1
0
1 0
1
kernel:
0

6
B 7C B
B2 C B
B 7 C, B
B
C B
@ 1A @

0
1 0

11
7
1
7

0
11

C
C
C;
C
A

3
1
3
1
B
C
B
C B C
1C
2
1
C B
C
B
C B1C
C B
C = B
C B C = 0;
(ii)

=
@ 5A @0A
5A @ 1A
1
0
1 0 1 1
4 0 01
4 0 11
0
0
0
1
1
1
0
1 0 61
11
6
11
1
0
0
1
7C
7 C
7C
7 C
B
B
B
B
B
B
C B 1 C
C B 2C
C B
B
C B 2C
B
3C B C B 3C B C B 7C B C B 7C B 1 C
CB 7 C=B
CB 7 C=B
CB 7 C
B
CB 7 C=B
(iii) B
C = 0.
@ 0A @
1A @ 0A @ 0 A @ 2A @ 1A @ 2A @ 0 A
2
1
1
2
0
1
0
1
1
1 0
1 0
0
1
1 0
0
20
1
11
2
1
B 4 C B 1 C B 9 C
B 2 C B 5 C
C
C B
C B
B
C
C B
B
C
C B
C B
B
C
C B
C; corange:
C, B 0 C , B
B 3 C, B 2 C; cokernel: B
0
1
(g) (i) Range: B
C
C B
C B
B
C
C B
B
@
@ 1 A @ 3 A
0A @ 1A @ 0A
1
0
0
5
2
0
1 0
1
0
1 0
1
1
0 1 0
0
1 0 1
1
11
1
1
1
2
0
1
B 2 C B 4 C
B 2 C B 1 C
B
C B
C
B
C B
C
C
B C B
B
C B C
B
C B
C
B
C B
C
B1C B 0C
B 2C B0C
C = B 3 C B 0 C =
C; (ii) B 3 C B
B
C, B C; kernel: B C, B
1
B
C B
C
B
C B
C
@0A @ 0A
@ 2A @1A
@ 1A @
@ 1A @ 1A
0A
1
0
0
1
2
0
2
0
1
1 0
0
1
1 0
0
1
1 0
0
1
1 0
0
20
2
1
2
11
2
20
1
B 5 C B 9 C
B 5 C B 1 C
B 5 C B 4 C
B 2 C B 9 C
C
C B
B
C
C B
B
C
C B
B
C
C B
B
C
C B
B
C
C B
B
C
C B
B
C
C B
B
C = 0;
C = B 2C B 0C = B 2C B
C = B 2C B
B 3 C B
0
1
0
C
C B
B
C
C B
B
C
C B
B
C
C B
B
@ 1A @
0 A @ 3 A @ 0 A @ 3 A @ 1 A @ 3 A @ 0 A
1
0 1 051 0 10 0 150
5
1
2 0
1 0 1
0
1 0
1
1
2
1
1
0
2
0
1
B
B C
B
C B
C
B C B C
B C B
C
2C
C B1C
B 2C B 0C
B0C B1C
B0C B 0C
B
CB C=B
CB
C=B CB C=B CB
C = 0.
(iii) B
@ 2A @0A
@ 2A @ 0A
@1A @0A
@1A @ 0A
1
0
1
1
0
0
0
1
B
C
B 2 C
B
C
@ 3 A

B
C
B 2 C
B
C
@ 1A

B
C
B 2 C
B
C
@ 3 A

5.6.18.
(a) The compatibility condition is

B C
B1C
B C
@0A

2
3 b1

B
B
B
@

+ b2 = 0 and so the cokernel basis is

T
2
.
3,1
T

(b) The compatibility condition is 3 b1 + b2 = 0 and so the cokernel basis is ( 3, 1 ) .


(c) There are no compatibility conditions, and so the cokernel is {0}.
(d) The compatibility conditions are 2 b1 b2 + b3 = 2 b1 2 b2 + b4 = 0 and so the cokernel
basis is ( 2, 1, 1, 0 )T , ( 2, 2, 0, 1 )T .

5.6.19. (a)

(c)

B
C
@ 2 A
0

1 0

B
C B C
B 2 C B 0 C
B
C, B C ;
@ 2A @1A

1 0

10
1
C B
C
(b) B
@ 21 A, @ 12 A;
12
21

1 B 1 C
10 B 10 C
=
@ 21 A +
@ 12 A,
99 12
99 21

B
@

2
2 B 1 C
20 B 10 C
4C
@ 21 A
@ 12 A,
A=
99 12
99 21
2

1
7 B 10 C 29 B 1 C
C
B
@ 5A =
@ 21 A +
@ 12 A.
99 12
99 21
7

2
4 B 1 C
7 B 10 C
C
B
@ 21 A +
@ 12 A,
@ 3 A =
33 12
33 21
0

5.6.20.
(a) Cokernel basis: ( 1, 1, 1 )T ; compatibility condition: 2 a b + c = 0;
(b) cokernel basis: ( 1, 1, 1 )T ; compatibility condition: a + b + c = 0;
(c) cokernel basis: ( 3, 1, 1, 0 )T , ( 2, 5, 0, 1 )T ;
compatibility conditions: 3 b1 + b2 + b3 = 2 b1 5 b2 + b4 = 0;
(d) cokernel basis: ( 1, 1, 1, 0 )T , ( 2, 1, 0, 1 )T ;
compatibility conditions: a b + c = 2 a b + d = 0.
5.6.21.
(a) z =

(b) z =

(c) z =

(d) z =

1
1
2
B
C
B
0C
@
A,
1
2
0
1
1
B 3C
B 1C
B
C,
@ 3A
13
0 14 1
B 17 C
B 1 C
B 17 C
B
C,
4 C
B
@ 17 A
5
17
0 11
C
B 2
B 1C
B 3C
C
B
B 1 C,
B 6C
C
B
B 1C
@3 A

1
1
1
0
0
1
1
2
2
C
C
B
B C
3B
C
C
C
B
B
w=B
@ 0 A = @ 2 A + @ 3 A;
2
1
1
2
2
1
0
0
1
0
1
2
1
1
B 3C
B
C
1B C
C
B
C
C
B1 C = B
w=B
@ 1 A @ 0 A;
@ 3A
3
1
1
2
3
0
0 1
1
0 3 1
1C
2
B
B C
B 17 C
B
B
C
B 1 C
1 B 1 C
4 B1C
C
C
B
B C;
B 17 C =
C+
w=B
B
B3C
B 4 C
C
51
51
0
@
@ A
A
@ 17 A
5
3
3
17
0 11
0
1
3
B 2
C
B
C
B1 C
B 2C
B 3C
B
C
B
C
1
B
C
1C
C.
w=B
1
B 6C= B
B
C
6B
B
C
C
B 1C
@ 2 A
@ 3A
0

5.6.22.
(a) (i) Fredholm requires that the cokernel basis
T

T
1
2,1

be orthogonal to the right hand

side ( 6, 3 ) ; (ii) the general solution is x = 3 + 2 y with y free; (iii) the minimum
norm solution is x = 53 , y = 65 .
(b) (i) Fredholm requires that the cokernel basis ( 27, 13, 5 )T be orthogonal to the right
hand side ( 1, 1, 8 )T ; (ii) there is a unique solution: x = 2, y = 1; (iii) by uniqueness, the minimum norm solution is the same: x = 2, y = 1.
(c) (i) Fredholm requires that the cokernel basis ( 1, 3 )T be orthogonal to the right hand
side ( 12, 4 )T (ii) the general solution is x = 2 + 21 y 32 z with y, z free; (iii) the
minimum norm solution is x = 47 , y = 72 , z = 67 .

(d) (i) Fredholm requires that the cokernel basis ( 11, 3, 7 )T be orthogonal to the right
hand side ( 3, 11, 0 )T (ii) the general solution is x = 3 + z, y = 2 2 z with z free;
1
7
(iii) the minimum norm solution is x = 11
6 , y = 3 , z = 6 .
(e) (i) Fredholm requires that the cokernel basis ( 10, 9, 7, 0 )T , ( 6, 4, 0, 7 )T be orthogonal to the right hand side ( 8, 5, 5, 4 )T ; (ii) the general solution is x1 = 1 t, x2 =
4
5
3+2 t, x3 = t with t free; (iii) the minimum norm solution is x1 = 11
6 , x2 = 3 , x3 = 6 .
T
(f ) (i) Fredholm requires that the cokernel basis ( 13, 5, 1 ) be orthogonal to the right
hand side ( 5, 13, 0 )T ; (ii) the general solution is x = 1 + y + w, z = 2 2 w with y, w
9
9
8
7
free; (iii) the minimum norm solution is x = 11
, y = 11
, z = 11
, w = 11
.
0

1 0

1 B 13 C
B
1C
1 C
C B
B
C, B 3 C
5.6.23. (a) B
C;
@ 0A B
@
1A
1
2

(b)

1 0

B C B
C
B1C B 1C
B C, B
C;
@ 0 A @ 1 A

(c) yes because of the orthogonality of the corange and kernel; see Exercise 5.6.15.

5.6.24. If $A$ is symmetric, $\ker A = \ker A^T = \operatorname{coker} A$, and so this is an immediate consequence of Theorem 5.55.
5.6.25. Since $\operatorname{rng} A = \operatorname{span}\{ v_1, \ldots, v_n \} = V$, the vector $w$ is orthogonal to $V$ if and only if $w \in V^\perp = (\operatorname{rng} A)^\perp = \operatorname{coker} A$.

5.6.26. Since $\ker A$ and $\operatorname{corng} A$ are complementary subspaces of $\mathbb{R}^n$, we can write any $x \in \mathbb{R}^n$ as a combination $x = v + z$ with $v = c_1 v_1 + \cdots + c_r v_r \in \operatorname{corng} A$ and $z \in \ker A$. Then $y = A\,x = A\,v = c_1 A\,v_1 + \cdots + c_r A\,v_r$ and hence every $y \in \operatorname{rng} A$ can be written as a linear combination of $A\,v_1, \ldots, A\,v_r$, which proves that they span $\operatorname{rng} A$. To prove linear independence, suppose $0 = c_1 A\,v_1 + \cdots + c_r A\,v_r = A\,(c_1 v_1 + \cdots + c_r v_r)$. This implies that $c_1 v_1 + \cdots + c_r v_r \in \ker A$. However, since they are orthogonal complements, the only vector in both $\operatorname{corng} A$ and $\ker A$ is the zero vector, and so $c_1 v_1 + \cdots + c_r v_r = 0$, which, since $v_1, \ldots, v_r$ are a basis, implies $c_1 = \cdots = c_r = 0$.
5.6.27. False. The resulting basis is almost never orthogonal.
5.6.28. False. See Example 5.60 for a counterexample.
5.6.29. If $f \notin \operatorname{rng} K$, then there exists $x \in \ker K = \operatorname{coker} K$ such that $x^T f = x \cdot f = b \ne 0$. But then $p(s\,x) = -2\,s\,x^T f + c = -2\,b\,s + c$ can be made arbitrarily large negative by choosing $s = t\,b$ with $t \gg 0$. Thus, $p(x)$ has no minimum value.
5.6.30. The result is not true if one defines the cokernel and corange in terms of the transposed matrix $A^T$. Rather, one needs to replace $A^T$ by its Hermitian transpose $A^\dagger = \overline{A^T}$, cf. Exercise 5.3.25, and define $\operatorname{corng} A = \operatorname{rng} A^\dagger$, $\operatorname{coker} A = \ker A^\dagger$. (These are the complex conjugates of the spaces defined using the transpose.) With these modifications, both Theorem 5.54 and the Fredholm Alternative Theorem 5.55 are true for complex matrices.

5.7.1.
(a) (i) $c_0 = 0$, $c_1 = -\frac12 i$, $c_2 = c_{-2} = 0$, $c_3 = c_{-1} = \frac12 i$, (ii) $\frac12 i\, e^{-ix} - \frac12 i\, e^{ix} = \sin x$;
1
(b) (i) c0 = 21 , c1 = 92 , c2 = 0, c3 = c3 = 18
, c4 = c2 = 0, c5 = c1 = 29 ,
1
1
1
(ii) 18
e 3 i x + 92 e i x + 12 + 29 e i x = 12 + 49 cos x + 18
cos 3 x 18
i sin 3 x;
(c) (i) c0 =

1
3,

c1 =

3 3 i
12

, c2 =

1 3 i
12


, c3 = c3 = 0, c4 = c2 =

1+ 3 i
12

1+ 3 i 2 i x
e
+ 3+123 i e i x + 13 + 3123 i e i x + 1123 i e2 i x =
12
1
1

6 cos 2 x + 2 3 sin 2 x;
2 3

(i) c0 = 18 , c1 = 18 + 1+4 2 i , c2 = 81 , c3 = 81 14 2 i , c4 = c4 =

81 , c5 = c3 = 81 + 14 2 i , c6 = c2 = 81 , c7 = c1 = 18 1+4 2 i ,

(ii) 81 e 4 i x + 18 + 14 2 i e 3 i x 18 e 2 i x + 18 1+4 2 i e i x 18 +

81 + 1+4 2 i e i x 18 e2 i x + 81 14 2 i e3 i x = 18 14 cos x 2+1


sin x
2

21
1
1
i
sin 3 x 18 cos 4 x 1
4 cos 2 x 4 cos 3 x
2
8 sin 4 x.

c5 = c1 = 3+123 i , (ii)
1
1
1

sin x +
3 + 2 cos x +

(d)

5.7.2.
(a) (i) $f_0 = f_3 = 2$, $f_1 = -1$, $f_2 = -1$. (ii) $e^{-ix} + e^{ix} = 2 \cos x$;
(b) (i) $f_0 = f_5 = 1$, $f_1 = 1 - \sqrt5$, $f_2 = 1 + \sqrt5$, $f_3 = 1 + \sqrt5$, $f_4 = 1 - \sqrt5$;
(ii) $e^{-2ix} - e^{-ix} + 1 - e^{ix} + e^{2ix} = 1 - 2 \cos x + 2 \cos 2x$;
(c) (i) $f_0 = f_5 = 6$, $f_1 = 2 + 2\,e^{2\pi i/5} + 2\,e^{-4\pi i/5} = 1 + .7265\,i$,
$f_2 = 2 + 2\,e^{2\pi i/5} + 2\,e^{4\pi i/5} = 1 + 3.0777\,i$,
$f_3 = 2 + 2\,e^{-2\pi i/5} + 2\,e^{-4\pi i/5} = 1 - 3.0777\,i$,
$f_4 = 2 + 2\,e^{-2\pi i/5} + 2\,e^{4\pi i/5} = 1 - .7265\,i$;
(ii) $2\,e^{-2ix} + 2 + 2\,e^{ix} = 2 + 2 \cos x + 2\,i \sin x + 2 \cos 2x - 2\,i \sin 2x$;
(d) (i) $f_0 = f_1 = f_2 = f_4 = f_5 = 0$, $f_3 = 6$; (ii) $1 - e^{ix} + e^{2ix} - e^{3ix} + e^{4ix} - e^{5ix} = 1 - \cos x + \cos 2x - \cos 3x + \cos 4x - \cos 5x + i\,(-\sin x + \sin 2x - \sin 3x + \sin 4x - \sin 5x)$.
5.7.3.

The interpolants are accurate along most of the interval, but there is a noticeable problem
near the endpoints $x = 0, 2\pi$. (In Fourier theory, [16, 47], this is known as the Gibbs phenomenon.)
5.7.4.
(a)-(f) [Plots omitted.]

5.7.5.
(a) [Figure omitted: the powers $\zeta_6^0 = 1, \zeta_6, \zeta_6^2, \zeta_6^3 = -1, \zeta_6^4, \zeta_6^5$ plotted in the complex plane.]
(b) $\zeta_6 = \frac12 + \frac{\sqrt3}{2}\, i$;
(c) $\zeta_6^2 = -\frac12 + \frac{\sqrt3}{2}\, i$, $\zeta_6^3 = -1$, $\zeta_6^4 = -\frac12 - \frac{\sqrt3}{2}\, i$, $\zeta_6^5 = \frac12 - \frac{\sqrt3}{2}\, i$, so $1 + \zeta_6 + \zeta_6^2 + \zeta_6^3 + \zeta_6^4 + \zeta_6^5 = 0$.
(d) We are adding three pairs of unit vectors pointing in opposite directions, so the sum
cancels out. Equivalently, 16 times the sum is the center (of mass) of the regular hexagon,
which is at the origin.
5.7.6.
(a) The roots all have modulus $|\zeta^k| = 1$ and phase $\operatorname{ph} \zeta^k = 2\pi k/n$. The angle between
successive roots is $2\pi/n$. The sides meet at an angle of $\pi - 2\pi/n$.
(b) Every root has modulus $\sqrt[n]{|z|}$ and the phases are $\frac1n (\operatorname{ph} z + 2\pi k)$, so the angle between
successive roots is $2\pi/n$, and the sides continue to meet at an angle of $\pi - 2\pi/n$.
The $n$-gon has radius $\sqrt[n]{|z|}$. Its first vertex makes an angle of $\frac1n \operatorname{ph} z$ with the
horizontal.

In the figure, the roots $\sqrt[6]{z}$ are at the vertices of a
regular hexagon, whose first vertex makes an angle with
the horizontal that is $\frac16$ the angle made by the point $z$.

5.7.7.
(a) (i) $i, -i$; (ii) $e^{2\pi k i/5}$ for $k = 1, 2, 3$ or $4$; (iii) $e^{2\pi k i/9}$ for $k = 1, 2, 4, 5, 7$ or $8$;
(b) $e^{2\pi k i/n}$ whenever $k$ and $n$ have no common factors, i.e., $k$ is relatively prime to $n$.
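The criterion in part (b) is easy to tabulate. This short sketch lists the exponents $k$ with $\gcd(k, n) = 1$, so that $e^{2\pi k i/n}$ is a primitive $n$-th root of unity; the helper name `primitive_root_exponents` is made up for illustration.

```python
from math import gcd

def primitive_root_exponents(n):
    """Exponents k in 1..n-1 for which exp(2*pi*i*k/n) is a primitive n-th root of unity."""
    return [k for k in range(1, n) if gcd(k, n) == 1]

for n in (4, 5, 9):
    print(n, primitive_root_exponents(n))
```

For $n = 4$ the exponents $k = 1, 3$ correspond to the roots $i$ and $-i$ of part (a)(i).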

5.7.8. (a) Yes, the discrete Fourier coefficients are real for all $n$. (b) A function $f(x)$ has real
discrete Fourier coefficients if and only if $f(x_k) = f(2\pi - x_k)$ on the sample points $x_0, \ldots, x_{n-1}$.
In particular, this holds when $f(x) = f(2\pi - x)$.
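5.7.9. (a) In view of (2.13), formula (5.91) is equivalent to matrix multiplication $f = F_n c$,
where
$$F_n = \begin{pmatrix} \omega_0 & \omega_1 & \ldots & \omega_{n-1} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & \ldots & 1 \\ 1 & \zeta_n & \zeta_n^2 & \ldots & \zeta_n^{n-1} \\ 1 & \zeta_n^2 & \zeta_n^4 & \ldots & \zeta_n^{2(n-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \zeta_n^{n-1} & \zeta_n^{2(n-1)} & \ldots & \zeta_n^{(n-1)^2} \end{pmatrix}$$
is the $n \times n$ matrix whose columns are the sampled exponential vectors (5.90). In particular,
$$F_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad F_3 = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -\frac12 + \frac{\sqrt3}{2} i & -\frac12 - \frac{\sqrt3}{2} i \\ 1 & -\frac12 - \frac{\sqrt3}{2} i & -\frac12 + \frac{\sqrt3}{2} i \end{pmatrix}, \qquad F_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{pmatrix},$$
$$F_8 = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & \frac{1+i}{\sqrt2} & i & \frac{-1+i}{\sqrt2} & -1 & \frac{-1-i}{\sqrt2} & -i & \frac{1-i}{\sqrt2} \\ 1 & i & -1 & -i & 1 & i & -1 & -i \\ 1 & \frac{-1+i}{\sqrt2} & -i & \frac{1+i}{\sqrt2} & -1 & \frac{1-i}{\sqrt2} & i & \frac{-1-i}{\sqrt2} \\ 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\ 1 & \frac{-1-i}{\sqrt2} & i & \frac{1-i}{\sqrt2} & -1 & \frac{1+i}{\sqrt2} & -i & \frac{-1+i}{\sqrt2} \\ 1 & -i & -1 & i & 1 & -i & -1 & i \\ 1 & \frac{1-i}{\sqrt2} & -i & \frac{-1-i}{\sqrt2} & -1 & \frac{-1+i}{\sqrt2} & i & \frac{1+i}{\sqrt2} \end{pmatrix}.$$
(b) Clearly, if $f = F_n c$, then $c = F_n^{-1} f$. Moreover, formula (5.91) implies that the $(i, j)$
entry of $F_n^{-1}$ is $\frac1n \zeta_n^{-ij} = \frac1n \overline{\zeta_n^{ij}}$, which is $\frac1n$ times the complex conjugate of the $(j, i)$ entry
of $F_n$.
(c) By part (b), $U_n^{-1} = \sqrt n\, F_n^{-1} = \frac{1}{\sqrt n} F_n^\dagger = U_n^\dagger$.

5.7.10.
[Plots omitted: original function, 11 mode compression, 21 mode compression.]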
The average absolute errors are .018565 and .007981; the maximal errors are .08956 and
.04836, so the 21 mode compression is about twice as accurate.
5.7.11.
(a) [Plots omitted: original function, 11 mode compression, 21 mode compression.]
The largest errors are at the endpoints, which represent discontinuities. The average absolute errors are .32562 and .19475; the maximal errors are 2.8716 and 2.6262, so, except
near the ends, the 21 mode compression is slightly less than twice as accurate.
(b) [Plots omitted: original function, 11 mode compression, 21 mode compression.]
The error is much less and more uniform than in cases with discontinuities. The average
absolute errors are .02575 and .002475; the maximal errors are .09462 and .013755, so
the 21 mode compression is roughly 10 times as accurate.
(c) [Plots omitted: original function, 11 mode compression, 21 mode compression.]
The only noticeable error is at the endpoints and the corner, $x = \pi$. The average absolute errors are .012814 and .003612; the maximal errors are .06334 and .02823, so the 21
mode compression is 3 times as accurate.
5.7.12. l = 4, 27, 57.
5.7.13. Very few are needed. In fact, if you take too many modes, you do worse! For example,
if the noise level is .1, the figure [plots omitted] shows the noisy signal and the effect of retaining $2l + 1 = 3, 5, 11, 21$ modes. Only the first
three give reasonable results. When the noise level is .5 the effect is even more pronounced [plots omitted].

5.7.14. For noise varying between $\pm 1$, and $256 = 2^8$ sample points, the errors are

# modes           3        5        7        9        11       13
average error     .8838    .1491    .0414    .0492    .0595    .0625
maximal error     1.5994   .2687    .1575    .1357    .1771    .1752

Thus, the optimal denoising is at $2l + 1 = 7$ or $9$ modes, after which the errors start to get
worse. Sampling on fewer points, say $64 = 2^6$, leads to similar results with slightly worse
performance:

# modes           3        5        7        9        11       13
average error     .8855    .1561    .0792    .0994    .1082    .1088
maximal error     1.5899   .3348    .1755    .3145    .3833    .4014


On the other hand, tripling the size of the error, to vary between $\pm 3$, leads to similar, and
marginally worse performance:

# modes           3        5        7        9        11       13
average error     .8830    .1636    .1144    .3148    .1627    .1708
maximal error     1.6622   .4306    .1755    .3143    .3398    .4280

Note: the numbers will slightly vary each time the random number generator is run.
5.7.15. The compressed function differs significantly from the original signal. The following plots
are the function, that obtained by retaining the first $l = 11$ modes, and then the first $l = 21$
modes: [Plots omitted.]


5.7.16. True for the odd case (5.103), but false for the even case (5.104). Since $\omega_{-k} = \overline{\omega_k}$,
when $f$ is real, $c_{-k} = \overline{c_k}$, and so the terms $c_{-k}\, e^{-ikx} + c_k\, e^{ikx} = 2 \operatorname{Re} \left( c_k\, e^{ikx} \right)$ combine
to form a real function. Moreover, the constant term $c_0$, which equals the average sample
value of $f$, is also real. Thus, all terms in the odd sum pair up into real functions. However, in the even version, the initial term $c_{-m}\, e^{-imx}$ has no match, and remains, in general, complex. See Exercise 5.7.1 for examples.

5.7.17.
(a) $f = \left( 0, \frac12, 1, \frac32 \right)^T$, $c^{(0)} = \left( 0, 1, \frac12, \frac32 \right)^T$, $c^{(1)} = \left( \frac12, -\frac12, 1, -\frac12 \right)^T$,
$c = c^{(2)} = \left( \frac34, -\frac14 + \frac14 i, -\frac14, -\frac14 - \frac14 i \right)^T$;

(b) $f = \left( 0, \frac1{\sqrt2}, 1, \frac1{\sqrt2}, 0, -\frac1{\sqrt2}, -1, -\frac1{\sqrt2} \right)^T$,
$c^{(0)} = \left( 0, 0, 1, -1, \frac1{\sqrt2}, -\frac1{\sqrt2}, \frac1{\sqrt2}, -\frac1{\sqrt2} \right)^T$,
$c^{(1)} = \left( 0, 0, 0, 1, 0, \frac1{\sqrt2}, 0, \frac1{\sqrt2} \right)^T$,
$c^{(2)} = \left( 0, -\frac12 i, 0, \frac12 i, 0, \frac{1-i}{2\sqrt2}, 0, \frac{1+i}{2\sqrt2} \right)^T$,
$c = c^{(3)} = \left( 0, -\frac12 i, 0, 0, 0, 0, 0, \frac12 i \right)^T$;

(c) $f = \left( \pi, \frac34 \pi, \frac12 \pi, \frac14 \pi, 0, \frac14 \pi, \frac12 \pi, \frac34 \pi \right)^T$,
$c^{(0)} = \left( \pi, 0, \frac12 \pi, \frac12 \pi, \frac34 \pi, \frac14 \pi, \frac14 \pi, \frac34 \pi \right)^T$,
$c^{(1)} = \left( \frac12 \pi, \frac12 \pi, \frac12 \pi, 0, \frac12 \pi, \frac14 \pi, \frac12 \pi, -\frac14 \pi \right)^T$,
$c^{(2)} = \left( \frac12 \pi, \frac14 \pi, 0, \frac14 \pi, \frac12 \pi, \frac{1+i}{8} \pi, 0, \frac{1-i}{8} \pi \right)^T$,
$c = c^{(3)} = \left( \frac12 \pi, \frac{\sqrt2+1}{8\sqrt2} \pi, 0, \frac{\sqrt2-1}{8\sqrt2} \pi, 0, \frac{\sqrt2-1}{8\sqrt2} \pi, 0, \frac{\sqrt2+1}{8\sqrt2} \pi \right)^T$;

(d) $f = ( 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )^T$,
$c^{(0)} = ( 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0 )^T$,
$c^{(1)} = ( .5, .5, .5, .5, .5, .5, 0, 0, .5, .5, .5, .5, .5, .5, 0, 0 )^T$,
$c^{(2)} = ( .5, .25 - .25\,i, 0, .25 + .25\,i, .25, .25, .25, .25, .5, .25 - .25\,i, 0, .25 + .25\,i, .25, .25, .25, .25 )^T$,
$c^{(3)} = ( .375, .2134 - .2134\,i, -.125\,i, .0366 + .0366\,i, .125, .0366 - .0366\,i, .125\,i, .2134 + .2134\,i,$
$.375, .2134 - .2134\,i, -.125\,i, .0366 + .0366\,i, .125, .0366 - .0366\,i, .125\,i, .2134 + .2134\,i )^T$,
$c = c^{(4)} = ( .375, .1644 - .2461\,i, -.0442 - .1067\,i, .0422 + .0084\,i, .0625 - .0625\,i, -.0056 - .0282\,i,$
$.0442 + .0183\,i, .0490 - .0327\,i, 0, .0490 + .0327\,i, .0442 - .0183\,i, -.0056 + .0282\,i,$
$.0625 + .0625\,i, .0422 - .0084\,i, -.0442 + .1067\,i, .1644 + .2461\,i )^T$.
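The intermediate vectors $c^{(0)}, c^{(1)}, \ldots$ can be regenerated with a short script. The following is a sketch only, assuming the scheme that matches the answers above: $c^{(0)}$ is the bit-reversed sample vector, and each stage combines blocks of half the size with the twiddle factors $e^{-2\pi i k/2^j}$ and a factor of $\frac12$. The names `bit_reverse` and `fft_stages` are illustrative, not taken from the text.

```python
import cmath
import math

def bit_reverse(index, bits):
    """Reverse the lowest `bits` binary digits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def fft_stages(samples):
    """Return [c^(0), c^(1), ..., c^(r)] for n = 2^r samples (assumed stage convention)."""
    n = len(samples)
    r = n.bit_length() - 1
    if n < 1 or (1 << r) != n:
        raise ValueError("number of samples must be a power of 2")
    current = [complex(samples[bit_reverse(i, r)]) for i in range(n)]
    stages = [current]
    for j in range(1, r + 1):
        size, half = 1 << j, 1 << (j - 1)
        nxt = [0j] * n
        for start in range(0, n, size):
            for k in range(half):
                twiddle = cmath.exp(-2j * math.pi * k / size)
                a = current[start + k]
                b = twiddle * current[start + half + k]
                nxt[start + k] = (a + b) / 2          # coefficient k of the merged block
                nxt[start + half + k] = (a - b) / 2   # coefficient k + half
        current = nxt
        stages.append(current)
    return stages

# Part (a): f = (0, 1/2, 1, 3/2)
stages = fft_stages([0, 0.5, 1, 1.5])
for level, vec in enumerate(stages):
    print(level, vec)
```

The last stage reproduces $c = \left( \frac34, -\frac14 + \frac14 i, -\frac14, -\frac14 - \frac14 i \right)^T$ up to rounding.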

5.7.18.
(a) $c = ( 1, -1, 1, -1 )^T$, $f^{(0)} = ( 1, 1, -1, -1 )^T$, $f^{(1)} = ( 2, 0, -2, 0 )^T$, $f = f^{(2)} = ( 0, 0, 4, 0 )^T$;

(b) $c = ( 2, 2, 0, -1, 2, -1, 0, -1 )^T$, $f^{(0)} = ( 2, 2, 0, 0, 2, -1, -1, -1 )^T$,
$f^{(1)} = ( 4, 0, 0, 0, 1, 3, -2, 0 )^T$, $f^{(2)} = ( 4, 0, 4, 0, -1, 3, 3, 3 )^T$,
$f = f^{(3)} = \left( 3, \frac{3}{\sqrt2}(1 + i), 4 + 3\,i, \frac{3}{\sqrt2}(-1 + i), 5, -\frac{3}{\sqrt2}(1 + i), 4 - 3\,i, \frac{3}{\sqrt2}(1 - i) \right)^T$.

5.7.19.
(a)
$$M_0 = \begin{pmatrix} 1&0&0&0&0&0&0&0 \\ 0&0&0&0&1&0&0&0 \\ 0&0&1&0&0&0&0&0 \\ 0&0&0&0&0&0&1&0 \\ 0&1&0&0&0&0&0&0 \\ 0&0&0&0&0&1&0&0 \\ 0&0&0&1&0&0&0&0 \\ 0&0&0&0&0&0&0&1 \end{pmatrix}, \qquad M_1 = \begin{pmatrix} 1&1&0&0&0&0&0&0 \\ 1&-1&0&0&0&0&0&0 \\ 0&0&1&1&0&0&0&0 \\ 0&0&1&-1&0&0&0&0 \\ 0&0&0&0&1&1&0&0 \\ 0&0&0&0&1&-1&0&0 \\ 0&0&0&0&0&0&1&1 \\ 0&0&0&0&0&0&1&-1 \end{pmatrix},$$
$$M_2 = \begin{pmatrix} 1&0&1&0&0&0&0&0 \\ 0&1&0&i&0&0&0&0 \\ 1&0&-1&0&0&0&0&0 \\ 0&1&0&-i&0&0&0&0 \\ 0&0&0&0&1&0&1&0 \\ 0&0&0&0&0&1&0&i \\ 0&0&0&0&1&0&-1&0 \\ 0&0&0&0&0&1&0&-i \end{pmatrix}, \qquad M_3 = \begin{pmatrix} 1&0&0&0&1&0&0&0 \\ 0&1&0&0&0&\frac{1+i}{\sqrt2}&0&0 \\ 0&0&1&0&0&0&i&0 \\ 0&0&0&1&0&0&0&\frac{-1+i}{\sqrt2} \\ 1&0&0&0&-1&0&0&0 \\ 0&1&0&0&0&-\frac{1+i}{\sqrt2}&0&0 \\ 0&0&1&0&0&0&-i&0 \\ 0&0&0&1&0&0&0&\frac{1-i}{\sqrt2} \end{pmatrix}.$$

(b) Because, by composition, $f = M_3 M_2 M_1 M_0\, c$. On the other hand, according to Exercise
5.7.9, $f = F_8\, c$, and so $M_3 M_2 M_1 M_0\, c = F_8\, c$. Since this holds for all $c$, the coefficient
matrices must be equal: $F_8 = M_3 M_2 M_1 M_0$.
(c) $c^{(0)} = N_0 f$, $c^{(1)} = N_1 c^{(0)}$, $c^{(2)} = N_2 c^{(1)}$, $c = c^{(3)} = N_3 c^{(2)}$, where $N_0 = M_0$, and
$N_j = \frac12 \overline{M_j}$, $j = 1, 2, 3$, and $F_8^{-1} = \frac18 F_8^\dagger = N_3 N_2 N_1 N_0$.


Solutions Chapter 6

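6.1.1. (a) $K = \begin{pmatrix} 3 & -2 \\ -2 & 3 \end{pmatrix}$; (b) $u = \begin{pmatrix} \frac{18}{5} \\ \frac{17}{5} \end{pmatrix} = \begin{pmatrix} 3.6 \\ 3.4 \end{pmatrix}$; (c) the first mass has moved the
farthest; (d) $e = \left( \frac{18}{5}, -\frac15, -\frac{17}{5} \right)^T = ( 3.6, -.2, -3.4 )^T$, so the first spring has stretched
the most, while the third spring experiences the most compression.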

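6.1.2. (a) $K = \begin{pmatrix} 3 & -1 \\ -1 & 2 \end{pmatrix}$; (b) $u = \begin{pmatrix} \frac{11}{5} \\ \frac{13}{5} \end{pmatrix} = \begin{pmatrix} 2.2 \\ 2.6 \end{pmatrix}$; (c) the second mass has moved the
farthest; (d) $e = \left( \frac{11}{5}, \frac25, -\frac{13}{5} \right)^T = ( 2.2, .4, -2.6 )^T$, so the first spring has stretched the
most, while the third spring experiences even more compression.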

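6.1.3.
6.1.1: (a) $K = \begin{pmatrix} 3 & -2 \\ -2 & 2 \end{pmatrix}$; (b) $u = \begin{pmatrix} 7 \\ \frac{17}{2} \end{pmatrix} = \begin{pmatrix} 7.0 \\ 8.5 \end{pmatrix}$; (c) the second mass has
moved the farthest; (d) $e = \left( 7, \frac32 \right)^T = ( 7.0, 1.5 )^T$, so the first spring has stretched the
most.
6.1.2: (a) $K = \begin{pmatrix} 3 & -1 \\ -1 & 1 \end{pmatrix}$; (b) $u = \begin{pmatrix} \frac72 \\ \frac{13}{2} \end{pmatrix} = \begin{pmatrix} 3.5 \\ 6.5 \end{pmatrix}$; (c) the second mass has moved
the farthest; (d) $e = \left( \frac72, 3 \right)^T = ( 3.5, 3. )^T$, so the first spring has stretched slightly farther.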
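The displacements in Exercises 6.1.1 and 6.1.3 can be reproduced with exact arithmetic. The sketch below rests on two assumptions that are inferred rather than quoted: the spring constants $c = (1, 2, 1)$ are read off from the stiffness matrix of Exercise 6.1.1, and the force vector $f = (4, 3)^T$ is taken from the energy function in Exercise 6.1.14. The helper names `solve` and `chain_stiffness` are hypothetical.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small nonsingular linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]

def chain_stiffness(springs, n_masses):
    """Stiffness matrix of a chain; spring i joins mass i-1 (or the top support) to mass i.

    With n_masses + 1 springs the bottom end is fixed; with n_masses springs it is free.
    """
    K = [[Fraction(0)] * n_masses for _ in range(n_masses)]
    for i, c in enumerate(springs):
        upper, lower = i - 1, i
        if upper >= 0:
            K[upper][upper] += c
        if lower < n_masses:
            K[lower][lower] += c
        if upper >= 0 and lower < n_masses:
            K[upper][lower] -= c
            K[lower][upper] -= c
    return K

f = [4, 3]                                        # assumed force vector
u_fixed = solve(chain_stiffness([1, 2, 1], 2), f)  # both ends fixed (6.1.1)
u_free = solve(chain_stiffness([1, 2], 2), f)      # bottom end free (6.1.3)
print(u_fixed, u_free)
```

Under these assumptions the script returns $\left( \frac{18}{5}, \frac{17}{5} \right)$ for the fixed chain and $\left( 7, \frac{17}{2} \right)$ for the free one, in agreement with the answers above.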

6.1.4.
(a) $u = ( 1, 3, 3, 1 )^T$, $e = ( 1, 2, 0, -2, -1 )^T$. The solution is unique since $K$ is invertible.
(b) Now $u = ( 2, 6, 7.5, 7.5 )^T$, $e = ( 2, 4, 1.5, 0 )^T$. The masses have all moved farther, and
the springs are elongated more; in this case, no springs are compressed.
6.1.5.
(a) Since $e_1 = u_1$, $e_j = u_j - u_{j-1}$ for $2 \leq j \leq n$, while $e_{n+1} = -u_n$, we have
$e_1 + \cdots + e_{n+1} = u_1 + (u_2 - u_1) + (u_3 - u_2) + \cdots + (u_n - u_{n-1}) - u_n = 0$.
Alternatively, note that $z = ( 1, 1, \ldots, 1 )^T \in \operatorname{coker} A$ and hence $z \cdot e = e_1 + \cdots + e_{n+1} = 0$
since $e = A u \in \operatorname{rng} A$.
(b) Now there are only $n$ springs, and so
$e_1 + \cdots + e_n = u_1 + (u_2 - u_1) + (u_3 - u_2) + \cdots + (u_n - u_{n-1}) = u_n$.
Thus, the average elongation $\frac1n (e_1 + \cdots + e_n) = \frac1n u_n$ equals the displacement of the
last mass divided by the number of springs.
6.1.6. Since the stiffness matrix $K$ is symmetric, so is its inverse $K^{-1}$. The basis vector $e_i$
represents a unit force on the $i$th mass only; the resulting displacement is $u = K^{-1} e_i$,
which is the $i$th column of $K^{-1}$. Thus, the $(j, i)$ entry of $K^{-1}$ is the displacement of the $j$th
mass when subject to a unit force on the $i$th mass. Since $K^{-1}$ is a symmetric matrix, this
is equal to its $(i, j)$ entry, which, for the same reason, is the displacement of the $i$th mass
when subject to a unit force on the $j$th mass.

6.1.7. [Plots omitted. The cases shown are: top and bottom support with constant, linear, and quadratic force; top support only with constant, linear, and quadratic force.]


6.1.8.
(a) For maximum displacement of the bottom mass, the springs should be arranged from
weakest at the top to strongest at the bottom, so $c_1 = c = 1$, $c_2 = c' = 2$, $c_3 = c'' = 3$.
(b) In this case, the order giving maximum displacement of the bottom mass is $c_1 = c = 2$,
$c_2 = c' = 3$, $c_3 = c'' = 1$.
6.1.9.
(a) When the bottom end is free, for maximum displacement of the bottom mass, the springs
should be arranged from weakest at the top to strongest at the bottom. In fact, the
$i$th elongation is $e_i = (n - i + 1)/c_i$. The displacement of the bottom mass is the sum
$$u_n = \sum_{i=1}^n e_i = \sum_{i=1}^n \frac{n - i + 1}{c_i}$$
of the elongations of all the springs above it, and achieves
its maximum value if and only if $c_1 \leq c_2 \leq \cdots \leq c_n$.
(b) In this case, the weakest spring should be at the bottom, while the remaining springs
are arranged in order from second weakest at the top to strongest just above the last
mass. A proof that this is best would be interesting...
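For a small chain, the claim in part (a) can be confirmed by brute force. The sketch below assumes a unit force on every mass, which is what the elongation formula $e_i = (n - i + 1)/c_i$ presupposes, and uses the illustrative stiffness values 1, 2, 3; the function name is made up for this example.

```python
from fractions import Fraction
from itertools import permutations

def bottom_displacement(springs):
    """u_n = sum of (n - i + 1) / c_i for a chain with a free bottom end and unit forces."""
    n = len(springs)
    return sum(Fraction(n - i, c) for i, c in enumerate(springs))  # i is 0-based here

stiffnesses = (1, 2, 3)
best = max(permutations(stiffnesses), key=bottom_displacement)
print(best, bottom_displacement(best))
```

Among the six orderings, the maximum is attained by the increasing order, weakest spring at the top.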
6.1.10.
The sub-diagonal entries of $L$ are $l_{i,i-1} = -\frac{i-1}{i}$, while the diagonal entries of $D$ are $d_{ii} = \frac{i+1}{i}$.
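The pattern of the factors can be checked with exact fractions. This sketch assumes, as a guess about the exercise rather than a quotation from it, that the matrix in question is the $n \times n$ tridiagonal matrix with 2 on the diagonal and $-1$ on the sub- and super-diagonals; `tridiagonal_ldlt` is a hypothetical helper name.

```python
from fractions import Fraction

def tridiagonal_ldlt(n):
    """L D L^T factors of the n x n tridiagonal matrix with diagonal 2 and off-diagonals -1.

    Returns (l, d): l[k] is the sub-diagonal entry l_{k+2,k+1}, d[k] is d_{k+1,k+1}.
    """
    d = [Fraction(2)]
    l = []
    for _ in range(1, n):
        l.append(Fraction(-1) / d[-1])        # l_{i,i-1} * d_{i-1} = -1
        d.append(Fraction(2) - 1 / d[-1])     # d_i = 2 - l_{i,i-1}^2 * d_{i-1}
    return l, d

l, d = tridiagonal_ldlt(6)
print(l)
print(d)
```

Under that assumption the computed entries follow $d_{ii} = \frac{i+1}{i}$ and $l_{i,i-1} = -\frac{i-1}{i}$.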
6.1.11.
(a) Since $y = A u$, we have $y \in \operatorname{rng} A = \operatorname{corng} A^T$. Thus, according to Theorem 5.59, $y$ has
minimal Euclidean norm among all solutions to $A^T y = f$.
(b) To find the minimal norm solution to $A^T y = f$, we proceed as in Chapter 5, and append
the conditions that $y$ is orthogonal to $\ker A^T = \operatorname{coker} A$. In the particular case of Example 6.1,
$$A^T = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix},$$
and $\ker A^T$ is spanned by $z = ( 1, 1, 1, 1 )^T$. To find the minimal norm solution, we solve the enlarged system
$$\begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{pmatrix} y = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}$$
obtained by appending the compatibility condition $z \cdot y = 0$, whose solution $y = \left( \frac12, \frac12, -\frac12, -\frac12 \right)^T$
reproduces the stress in the example.


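6.1.12. Regular Gaussian Elimination reduces them to
$$\begin{pmatrix} 2 & -1 & 0 \\ 0 & \frac32 & -1 \\ 0 & 0 & \frac43 \end{pmatrix}, \qquad \begin{pmatrix} 2 & -1 & 0 \\ 0 & \frac32 & -1 \\ 0 & 0 & \frac13 \end{pmatrix},$$
respectively. Since all three pivots are positive, the matrices are positive definite.

6.1.13. Denoting the gravitation force by $g$:
(a)
$$p(u) = \frac12 \begin{pmatrix} u_1 & u_2 & u_3 \end{pmatrix} \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} - \begin{pmatrix} u_1 & u_2 & u_3 \end{pmatrix} \begin{pmatrix} g \\ g \\ g \end{pmatrix}$$
$= u_1^2 - u_1 u_2 + u_2^2 - u_2 u_3 + \frac12 u_3^2 - g\,(u_1 + u_2 + u_3)$.
(b)
$$p(u) = \frac12 \begin{pmatrix} u_1 & u_2 & u_3 & u_4 \end{pmatrix} \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} - \begin{pmatrix} u_1 & u_2 & u_3 & u_4 \end{pmatrix} \begin{pmatrix} g \\ g \\ g \\ g \end{pmatrix}$$
$= u_1^2 - u_1 u_2 + u_2^2 - u_2 u_3 + u_3^2 - u_3 u_4 + u_4^2 - g\,(u_1 + u_2 + u_3 + u_4)$,
(c)
$$p(u) = \frac12 \begin{pmatrix} u_1 & u_2 & u_3 & u_4 \end{pmatrix} \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix} - \begin{pmatrix} u_1 & u_2 & u_3 & u_4 \end{pmatrix} \begin{pmatrix} g \\ g \\ g \\ g \end{pmatrix}$$
$= u_1^2 - u_1 u_2 + u_2^2 - u_2 u_3 + u_3^2 - u_3 u_4 + \frac12 u_4^2 - g\,(u_1 + u_2 + u_3 + u_4)$.

6.1.14.
(a) $p(u) = \frac12 \begin{pmatrix} u_1 & u_2 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ -2 & 3 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} - \begin{pmatrix} u_1 & u_2 \end{pmatrix} \begin{pmatrix} 4 \\ 3 \end{pmatrix} = \frac32 u_1^2 - 2\,u_1 u_2 + \frac32 u_2^2 - 4\,u_1 - 3\,u_2$, so $p(u^\star) = p( 3.6, 3.4 ) = -12.3$.
(b) For instance, $p(1, 0) = -2.5$, $p(0, 1) = -1.5$, $p(3, 3) = -12$.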
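The energy values in Exercise 6.1.14 can be verified directly from $p(u) = \frac12 u^T K u - u^T f$. A minimal sketch with exact fractions follows; the function name `potential_energy` is illustrative only.

```python
from fractions import Fraction

def potential_energy(K, f, u):
    """Evaluate p(u) = (1/2) u^T K u - u^T f."""
    n = len(u)
    quadratic = sum(u[i] * K[i][j] * u[j] for i in range(n) for j in range(n))
    linear = sum(ui * fi for ui, fi in zip(u, f))
    return Fraction(1, 2) * quadratic - linear

K = [[3, -2], [-2, 3]]
f = [4, 3]
p_equilibrium = potential_energy(K, f, [Fraction(18, 5), Fraction(17, 5)])
p_samples = [potential_energy(K, f, u) for u in ([1, 0], [0, 1], [3, 3])]
print(p_equilibrium, p_samples)
```

The equilibrium value $-\frac{123}{10} = -12.3$ lies below each of the sample values $-2.5$, $-1.5$, $-12$, as it should at a minimizer.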

6.1.15.
(a) $p(u) = \frac34 u_1^2 - \frac12 u_1 u_2 + \frac7{12} u_2^2 - \frac23 u_2 u_3 + \frac7{12} u_3^2 - \frac12 u_3 u_4 + \frac34 u_4^2 - u_2 - u_3$,
so $p(u^\star) = p( 1, 3, 3, 1 ) = -3$.
(b) For instance, $p(1, 0, 0, 0) = p(0, 0, 0, 1) = .75$, $p(0, 1, 0, 0) = p(0, 0, 1, 0) = -.4167$.

6.1.16.
(a) Two masses, both ends fixed, $c_1 = 2$, $c_2 = 4$, $c_3 = 2$, $f = ( -1, 3 )^T$;
equilibrium: $u^\star = ( .3, .7 )^T$.
(b) Two masses, top end fixed, $c_1 = 4$, $c_2 = 6$, $f = ( 0, 2 )^T$;
equilibrium: $u^\star = \left( \frac12, \frac56 \right)^T = ( .5, .8333 )^T$.
(c) Three masses, top end fixed, $c_1 = 1$, $c_2 = 3$, $c_3 = 5$, $f = ( 1, 1, 1 )^T$;
equilibrium: $u^\star = \left( \frac12, \frac56 \right)^T = ( .5, .8333 )^T$.
(d) Four masses, both ends fixed, $c_1 = 3$, $c_2 = 1$, $c_3 = 1$, $c_4 = 1$, $c_5 = 3$, $f = ( -1, 0, 2, 0 )^T$;
equilibrium: $u^\star = ( -.0606, .7576, 1.5758, .3939 )^T$.
6.1.17. In both cases, the homogeneous system $A u = 0$ requires $0 = u_1 = u_2 = \cdots = u_n$, and so
$\ker A = \{0\}$, proving linear independence of its columns.
6.1.18. This is an immediate consequence of Exercise 4.2.9.
6.1.19.
(a) When only the top end is supported, the potential energy is lowest when the springs are
arranged from weakest at the top to strongest at the bottom: $c_1 = c = 1$, $c_2 = c' = 2$,
$c_3 = c'' = 3$, with energy $-\frac{17}{3} = -5.66667$ under a unit gravitational force.
(b) When both ends are fixed, the potential energy is minimized when either the springs are
in the order $c_1 = c = 1$, $c_2 = c' = 3$, $c_3 = c'' = 2$ or the reverse order $c_1 = c = 2$,
$c_2 = c' = 3$, $c_3 = c'' = 1$, both of which have energy $-\frac{15}{22} = -.681818$ under a unit
gravitational force.
6.1.20. True. The potential energy function (6.16) uniquely determines the symmetric stiffness
matrix K and the external force vector f . According to (6.12), (6.15), the off-diagonal entries of K determine the individual spring constants c2 , . . . , cn of all but the first and (if
there is one) last springs. But once we know c2 and cn , the remaining one or two constants, c1 and cn+1 , are uniquely prescribed by the (1, 1) and (n, n) entries of K. If cn+1 =
0, then the bottom end is not attached to a support. Thus, the potential energy uniquely
prescribes the entire mass-spring chain.


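6.2.1. (a)-(e) [Figures omitted.]

6.2.2. (a) $A = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & -1 & 0 \\ 0 & 1 & 0 & -1 \end{pmatrix}$; (b) $\begin{pmatrix} 3 & -1 & -1 \\ -1 & 3 & -1 \\ -1 & -1 & 2 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix}$.
(c) $u = \left( \frac{15}{8}, \frac98, \frac32 \right)^T = ( 1.875, 1.125, 1.5 )^T$;
(d) $y = v = A u = \left( \frac34, \frac38, \frac{15}{8}, -\frac38, \frac98 \right)^T = ( .75, .375, 1.875, -.375, 1.125 )^T$.
(e) The bulb will be brightest when connected to wire 3, which has the most current
flowing through.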
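The potentials and currents of Exercise 6.2.2 can be recomputed from the incidence matrix. The sketch below assumes unit conductances (so that $K = A^T A$), grounds node 4 by dropping its column, and takes the right-hand side $( 3, 0, 0 )^T$ from part (b); the `solve` helper is a hypothetical Gauss-Jordan routine written for this example.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a small nonsingular linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]

# Reduced incidence matrix: one row per wire, one column per ungrounded node (1, 2, 3).
A = [
    [1, -1, 0],
    [1, 0, -1],
    [1, 0, 0],
    [0, 1, -1],
    [0, 1, 0],
]
f = [3, 0, 0]

# K = A^T A under the unit-conductance assumption.
K = [[sum(row[i] * row[j] for row in A) for j in range(3)] for i in range(3)]
u = solve(K, f)                                          # node potentials
y = [sum(a * x for a, x in zip(row, u)) for row in A]    # wire currents y = A u
print(K, u, y)
```

The largest entry of `y` in absolute value belongs to wire 3, which matches part (e).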
6.2.3. The reduced incidence matrix is $A^\star = \begin{pmatrix} 1 & -1 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{pmatrix}$, and the equilibrium equations are
$\begin{pmatrix} 3 & -1 \\ -1 & 3 \end{pmatrix} u = \begin{pmatrix} 3 \\ 0 \end{pmatrix}$, with solution $u = \begin{pmatrix} \frac98 \\ \frac38 \end{pmatrix} = \begin{pmatrix} 1.125 \\ .375 \end{pmatrix}$; the resulting currents are
$y = v = A u = \left( \frac34, \frac98, \frac98, \frac38, \frac38 \right)^T = ( .75, 1.125, 1.125, .375, .375 )^T$. Now, wires 2 and 3
both have the most current. The current in wire 1 is unchanged; the current in wire 2 has increased; the
currents in wires 3 and 5 have decreased; the current in wire 4 has reversed direction.

6.2.4. (a) $A = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 \\ 1 & 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -1 \end{pmatrix}$; (b) $u = \left( \frac{34}{35}, \frac{23}{35}, \frac{19}{35}, \frac{16}{35} \right)^T = ( .9714, .6571, .5429, .4571 )^T$;
$y = \left( \frac{11}{35}, \frac4{35}, \frac37, \frac9{35}, \frac15, \frac{19}{35}, \frac{16}{35} \right)^T = ( .3143, .1143, .4286, .2571, .2000, .5429, .4571 )^T$;
(c) wire 6.


6.2.5. (a) Same incidence matrix; (b) u = ( .4714, .3429, .0429, .0429 )T ;
y = ( .8143, .3857, .4286, .2571, .3000, .0429, .0429 )T ; (c) wire 1.
6.2.6. None.

6.2.7. There is no current on the two wires connecting the same poles of the batteries (positive-positive and negative-negative) and 2.5 amps along all the other wires.
6.2.8.
(a) The potentials remain the same, but the currents are all twice as large.
(b) The potentials are u = ( 4.1804, 3.5996, 2.7675, 2.6396, .8490, .9376, 2.0416, 0. ) T ,
while the currents are
y = ( 1.2200, .7064, .5136, .6876, .5324, .6027, .1037, .4472, .0664, .0849, .0852, .1701 ) T .
6.2.9. Resistors 3 and 4 should be on the battery wire and the opposite wire in either order. Resistors 1 and 6 are connected to one end of resistor 3 while resistors 2 and 5 are
connected to its other end; also, resistors 1 and 5 are connected to one end of resistor 4
while resistors 2 and 6 are connected to its other end. Once the wires are labeled, there
are 8 possible configurations. The current through the light bulb is .4523.
6.2.10.
(a) For $n = 2$, the potentials are
$$\begin{pmatrix} \frac1{16} & \frac18 & \frac1{16} \\ \frac18 & \frac38 & \frac18 \\ \frac1{16} & \frac18 & \frac1{16} \end{pmatrix} = \begin{pmatrix} .0625 & .125 & .0625 \\ .125 & .375 & .125 \\ .0625 & .125 & .0625 \end{pmatrix}.$$
The currents along the horizontal wires are
$$\begin{pmatrix} -\frac1{16} & -\frac1{16} & \frac1{16} & \frac1{16} \\ -\frac18 & -\frac14 & \frac14 & \frac18 \\ -\frac1{16} & -\frac1{16} & \frac1{16} & \frac1{16} \end{pmatrix} = \begin{pmatrix} -.0625 & -.0625 & .0625 & .0625 \\ -.125 & -.25 & .25 & .125 \\ -.0625 & -.0625 & .0625 & .0625 \end{pmatrix},$$
where all wires are oriented from left to right, so the currents are all going away from
the center. The currents in the vertical wires are given by the transpose of the matrix.
For $n = 3$, the potentials are
$$\begin{pmatrix} .0288 & .0577 & .0769 & .0577 & .0288 \\ .0577 & .125 & .1923 & .125 & .0577 \\ .0769 & .1923 & .4423 & .1923 & .0769 \\ .0577 & .125 & .1923 & .125 & .0577 \\ .0288 & .0577 & .0769 & .0577 & .0288 \end{pmatrix}.$$
The currents along the horizontal wires are
$$\begin{pmatrix} -.0288 & -.0288 & -.0192 & .0192 & .0288 & .0288 \\ -.0577 & -.0673 & -.0673 & .0673 & .0673 & .0577 \\ -.0769 & -.1153 & -.25 & .25 & .1153 & .0769 \\ -.0577 & -.0673 & -.0673 & .0673 & .0673 & .0577 \\ -.0288 & -.0288 & -.0192 & .0192 & .0288 & .0288 \end{pmatrix},$$
where all wires are oriented from left to right, so the currents are all going away from
the center. The currents in the vertical wires are given by the transpose of the matrix.
For $n = 4$, the potentials are
$$\begin{pmatrix} .0165 & .0331 & .0478 & .0551 & .0478 & .0331 & .0165 \\ .0331 & .0680 & .1029 & .125 & .1029 & .0680 & .0331 \\ .0478 & .1029 & .1710 & .2390 & .1710 & .1029 & .0478 \\ .0551 & .125 & .2390 & .4890 & .2390 & .125 & .0551 \\ .0478 & .1029 & .1710 & .2390 & .1710 & .1029 & .0478 \\ .0331 & .0680 & .1029 & .125 & .1029 & .0680 & .0331 \\ .0165 & .0331 & .0478 & .0551 & .0478 & .0331 & .0165 \end{pmatrix}.$$
The currents along the horizontal wires are
$$\begin{pmatrix} -.0165 & -.0165 & -.0147 & -.0074 & .0074 & .0147 & .0165 & .0165 \\ -.0331 & -.0349 & -.0349 & -.0221 & .0221 & .0349 & .0349 & .0331 \\ -.0478 & -.0551 & -.0680 & -.0680 & .0680 & .0680 & .0551 & .0478 \\ -.0551 & -.0699 & -.1140 & -.25 & .25 & .1140 & .0699 & .0551 \\ -.0478 & -.0551 & -.0680 & -.0680 & .0680 & .0680 & .0551 & .0478 \\ -.0331 & -.0349 & -.0349 & -.0221 & .0221 & .0349 & .0349 & .0331 \\ -.0165 & -.0165 & -.0147 & -.0074 & .0074 & .0147 & .0165 & .0165 \end{pmatrix},$$
where all wires are oriented from left to right, so the currents are all going away from
the center source. The currents in the vertical wires are given by the transpose of the
matrix.
As $n \to \infty$ the potentials approach a limit, which is, in fact, the fundamental solution to the Dirichlet boundary value problem for Laplace's equation on the square,
[47, 59]. The horizontal and vertical currents tend to the gradient of the fundamental
solution. But this, of course, is the result of a more advanced analysis beyond the scope
of this text. Here are graphs of the potentials and horizontal currents for $n = 2, 3, 4, 10$: [Graphs omitted.]

6.2.11. This is an immediate consequence of Theorem 5.59, which states that the minimum
norm solution to $A^T y = f$ is characterized by the condition $y \in \operatorname{corng} A^T = \operatorname{rng} A$. But,
solving the system $A^T A u = f$ results in $y = A u \in \operatorname{rng} A$.

6.2.12.
(a) (i) u = ( 2, 1, 1, 0 )T , y = ( 1, 0, 1 )T ; (ii) u = ( 3, 2, 1, 1, 0 )T , y = ( 1, 1, 0, 1 )T ;
(iii) u = ( 3, 2, 1, 1, 1, 0 )T , y = ( 1, 1, 0, 0, 1 )T ; (iv ) u = ( 3, 2, 2, 1, 1, 0 )T ,
y = ( 1, 0, 1, 0, 1 )T ; (v ) u = ( 3, 2, 2, 1, 1, 1, 0 )T , y = ( 1, 0, 1, 0, 0, 1 )T .
(b) In general, the current only goes through the wires directly connecting the top and bottom nodes. The potential at a node is equal to the number of wires transmitting the
current that are between it and the grounded node.
6.2.13.

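(i) $u = \left( \frac32, \frac12, 0, 0 \right)^T$, $y = \left( 1, \frac12, \frac12 \right)^T$;
(ii) $u = \left( \frac52, \frac32, \frac12, 0, 0 \right)^T$, $y = \left( 1, 1, \frac12, \frac12 \right)^T$;
(iii) $u = \left( \frac73, \frac43, \frac13, 0, 0, 0 \right)^T$, $y = \left( 1, 1, \frac13, \frac13, \frac13 \right)^T$;
(iv) $u = \left( \frac85, \frac35, 0, \frac15, 0, 0 \right)^T$, $y = \left( 1, \frac35, \frac25, \frac15, \frac15 \right)^T$;
(v) $u = \left( \frac{11}{7}, \frac47, 0, \frac17, 0, 0 \right)^T$, $y = \left( 1, \frac47, \frac37, \frac17, \frac17, \frac17 \right)^T$.
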

6.2.14. According to Exercise 2.6.8(b), a tree with $n$ nodes has $n - 1$ edges. Thus, the reduced
incidence matrix $A^\star$ is square, of size $(n - 1) \times (n - 1)$, and is nonsingular since the tree is
connected.
6.2.15.
(a) True, since they satisfy the same systems of equilibrium equations K u = A T C b = f .
(b) False, because the currents with the batteries are, by (6.37), y = C v = C A u + C b,
while for the current sources they are y = C v = C A u.
6.2.16. In general, if $v_1, \ldots, v_m$ are the rows of the (reduced) incidence matrix $A$, then the resistivity matrix is $K = A^T C A = \sum_{i=1}^m c_i\, v_i^T v_i$. (This relies on the fact that $C$ is a diagonal
matrix.) In the situation described in the problem, two rows of the incidence matrix are
the same, $v_1 = v_2 = v$, and so their contribution to the sum will be $c_1 v^T v + c_2 v^T v = (c_1 + c_2)\, v^T v = c\, v^T v$, which is the same contribution as a single wire between the two
vertices with conductance $c = c_1 + c_2$. The combined resistance is
$$R = \frac1c = \frac1{c_1 + c_2} = \frac1{1/R_1 + 1/R_2} = \frac{R_1 R_2}{R_1 + R_2}.$$
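The final formula is the familiar rule for resistors in parallel. As a quick numerical illustration (the values 3 and 6 are arbitrary and the function name is made up), the conductances add and the combined resistance is the reciprocal of their sum:

```python
from fractions import Fraction

def parallel_resistance(r1, r2):
    """Combined resistance of two wires joining the same pair of nodes."""
    conductance = Fraction(1) / r1 + Fraction(1) / r2   # c = c1 + c2
    return 1 / conductance                              # R = 1 / c

r = parallel_resistance(3, 6)
print(r)
```

Here the result agrees with $R_1 R_2 / (R_1 + R_2) = 18/9 = 2$.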

6.2.17.
(a) If $f$ are the current sources at the nodes and $b$ the battery terms, then the nodal voltage potentials satisfy $A^T C A u = f - A^T C b$.
(b) By linearity, the combined potentials (currents) are obtained by adding the potentials
(currents) due to the batteries and those resulting from the current sources.
6.2.18. The resistivity matrix $K^\star$ is symmetric, and so is its inverse. The $(i, j)$ entry of $(K^\star)^{-1}$
is the $i$th entry of $u_j = (K^\star)^{-1} e_j$, which is the potential at the $i$th node due to a unit current source at the $j$th node. By symmetry, this equals the $(j, i)$ entry, which is the potential
at the $j$th node due to a unit current source at the $i$th node.
6.2.19. If the graph has k connected subgraphs, then there are k independent compatibility
conditions on the unreduced equilibrium equations K u = f . The conditions are that the
sum of the current sources at the nodes on every connected subgraph must be equal to
zero.

6.3.1. 8 cm
6.3.2. The bar will be stress-free provided the vertical force is 1.5 times the horizontal force.
6.3.3.
(a) For a unit horizontal force on the two nodes, the displacement vector is
$u = ( 1.5, -.5, 2.5, 2.5 )^T$, so the left node has moved slightly down and three times as
far to the right, while the right node has moved five times as far up and to the right.
Note that the force on the left node is transmitted through the top bar to the right
node, which explains why it moves significantly further. The stresses are
$e = ( .7071, 1, 0, -1.5811 )^T$, so the left and the top bar are elongated, the right bar is
stress-free, and the reinforcing bar is significantly compressed.
(b) For a unit horizontal force on the two nodes, $u = ( .75, -.25, .75, .25 )^T$ so the left node
has moved slightly down and three times as far to the right, while the right node has
moved by the same amount up and to the right. The stresses are e = (.353553, 0.,
.353553, .790569, .79056)T , so the diagonal bars fixed at node 1 are elongated, the
horizontal bar is stress-free, while the bars fixed at node 4 are both compressed. The reinforcing bars experience a little over twice the stress of the other two diagonal bars.
6.3.4. The swing set cannot support a uniform horizontal force, since $f_1 = f_2 = f$,
$g_1 = g_2 = h_1 = h_2 = 0$ does not satisfy the constraint for equilibrium. Thus, the swing
set will collapse. For the reinforced version, under a horizontal force of magnitude $f$, the
displacements of the two free nodes are $( 14.5 f, 0, -3 f )^T$ and $( 14.5 f, 0, 3 f )^T$ respectively,
so the first node has moved down and in the direction of the force, while the second node
has moved up and in the same horizontal direction. The corresponding elongation vector is
$e = ( 1.6583 f, 1.6583 f, 0, -1.6583 f, -1.6583 f, -3 f, 3 f )^T$,
and so the horizontal bar experiences no elongation; the diagonal bars connecting the first
node are stretched by an amount $1.6583 f$, the diagonals connecting the second node are
compressed by the same amount, while the reinforcing vertical bars are, respectively, compressed and stretched by an amount $3 f$.
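
6.3.5. (a) $A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 \\ -\frac1{\sqrt2} & \frac1{\sqrt2} & 0 & 0 \\ 0 & 0 & \frac1{\sqrt2} & \frac1{\sqrt2} \\ 0 & 0 & 0 & 1 \end{pmatrix}$;
(b)
$\frac32 u_1 - \frac12 v_1 - u_2 = 0$,
$-\frac12 u_1 + \frac32 v_1 = 0$,
$-u_1 + \frac32 u_2 + \frac12 v_2 = 0$,
$\frac12 u_2 + \frac32 v_2 = 0$.

(c) Stable, statically indeterminate. (d) Write down $f = K e_1$, so $f_1 = \begin{pmatrix} \frac32 \\ -\frac12 \end{pmatrix}$, $f_2 = \begin{pmatrix} -1 \\ 0 \end{pmatrix}$.
The horizontal bar; it is compressed by $-1$; the upper left to lower right bar is compressed
$-\frac1{\sqrt2}$, while all other bars are stress free.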

6.3.6. Under a uniform horizontal force, the displacements and stresses are: Non-joined version:
$u = ( 3, 1, 3, 1 )^T$, $e = \left( 1, 0, 1, \sqrt2, \sqrt2 \right)^T$; Joined version: $u = ( 5, 1, 5, 1, 2, 0 )^T$,
$e = \left( 1, 0, 1, \sqrt2, \sqrt2, \sqrt2, \sqrt2 \right)^T$. Thus, joining the nodes causes a larger horizontal
displacement of the upper two nodes, but no change in the overall stresses on the bars.
Under a uniform vertical force, the displacements and elongations are:
Non-joined version:
$u = \left( \frac17, \frac57, \frac17, \frac57 \right)^T = ( .1429, .7143, .1429, .7143 )^T$,
$e = \left( \frac57, \frac27, \frac57, \frac{2\sqrt2}{7}, \frac{2\sqrt2}{7} \right)^T = ( .7143, .2857, .7143, .4041, .4041 )^T$;
Joined version:
$u = ( .0909, .8182, .0909, .8182, 0, .3636 )^T$,
$e = ( .8182, .1818, .8182, .2571, .2571, .2571, .2571 )^T$.
Thus, joining the nodes causes a larger vertical displacement, but smaller horizontal displacement of the upper two nodes. The stresses on the vertical bars increase, while the
horizontal bar and the diagonal bars have less stress (in magnitude).


6.3.7.
(a) $A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ -\frac1{\sqrt2} & -\frac1{\sqrt2} & \frac1{\sqrt2} & \frac1{\sqrt2} & 0 & 0 \\ 0 & 0 & -\frac1{\sqrt2} & \frac1{\sqrt2} & \frac1{\sqrt2} & -\frac1{\sqrt2} \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$.
(b) There are two mechanisms: $u_1 = ( 1, 0, 1, 0, 1, 0 )^T$, where all three nodes move by the
same amount to the right, and $u_2 = ( 2, 0, 1, 1, 0, 0 )^T$, where the upper left node moves
to the right while the top node moves up and to the right.
(c) $f_1 + f_2 + f_3 = 0$, i.e., no net horizontal force, and $2 f_1 + f_2 + g_2 = 0$.
(d) You need to add two additional reinforcing bars; any pair, e.g. connecting the fixed
nodes to the top node, will stabilize the structure.
6.3.8.
(a) $A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ -\frac3{\sqrt{10}} & -\frac1{\sqrt{10}} & \frac3{\sqrt{10}} & \frac1{\sqrt{10}} & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & -\frac3{\sqrt{10}} & \frac1{\sqrt{10}} & \frac3{\sqrt{10}} & -\frac1{\sqrt{10}} \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ -.9487 & -.3162 & .9487 & .3162 & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & -.9487 & .3162 & .9487 & -.3162 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$;
(b) One instability: the mechanism of simultaneous horizontal motion of the three nodes.
(c) No net horizontal force: $f_1 + f_2 + f_3 = 0$. For example, if $f_1 = f_2 = f_3 = ( 0, -1 )^T$, then
$e = \left( -\frac32, -\sqrt{\frac52}, \frac32, -\sqrt{\frac52}, -\frac32 \right)^T = ( -1.5, -1.5811, 1.5, -1.5811, -1.5 )^T$, so the compressed diagonal bars have slightly more stress than the compressed vertical bars or the elongated
horizontal bar.
(d) To stabilize, add in one more bar starting at one of the fixed nodes and going to one of
the two movable nodes not already connected to it.
(e) In every case, $e = \left( -\frac32, -\sqrt{\frac52}, \frac32, -\sqrt{\frac52}, -\frac32, 0 \right)^T = ( -1.5, -1.5811, 1.5, -1.5811, -1.5, 0 )^T$, so the
stresses on the previous bars are all the same, while the reinforcing bar experiences no
stress. (See Exercise 6.3.21 for the general principle.)
6.3.9. Two-dimensional house:
0

(a) A =

B
B
B
B
B
B
B
B
B
B
B
B
B
B
B
@

0
0
1
0

1
1
0
0

0
0
0

0
0
0

0
1
0

0
0

0
0

0
0

0
0
0

0
0
0

0
0
0
0

2
5

2
5

1
5
1
5

2
5

0
0

0
0

0
0

0
0
0
0

0
0
1
0

1
0

0
0

0
0
0
0

C
C
C
C
C
C
C
C
C;
C
C
0 C
C
C
1 C
A

(b) 3 mechanisms: (i) simultaneous horizontal motion of the two middle nodes; (ii) simultaneous horizontal motion of the three upper nodes; (iii) the upper left node moves horizontally to the right by 1 unit, the top node moves vertically by 1 unit and to the right by $\frac12$ a unit.

(c) Numbering the five movable nodes in order starting at the middle left, the corresponding forces $\mathbf{f}_1 = (f_1, g_1)^T, \ldots, \mathbf{f}_5 = (f_5, g_5)^T$ must satisfy: $f_1 + f_5 = 0$, $f_2 + f_3 + f_4 = 0$, $f_2 + \frac12 f_3 + g_3 = 0$. When $\mathbf{f}_1 = \mathbf{f}_2 = \mathbf{f}_4 = \mathbf{f}_5 = (0, -1)^T$, $\mathbf{f}_3 = \mathbf{0}$, then $\mathbf{e} = (-2, -1, 0, 0, 0, -1, -2)^T$, so the lower vertical bars are compressed twice as much as the upper vertical bars, while the horizontal and diagonal bars experience no elongation or stress. When $\mathbf{f}_1 = \mathbf{f}_5 = \mathbf{0}$, $\mathbf{f}_2 = -\mathbf{f}_4 = (1, 0)^T$, $\mathbf{f}_3 = (0, -1)^T$, then $\mathbf{e} = \left( -\frac12, -\frac12, -\frac{\sqrt5}{2}, 0, -\frac{\sqrt5}{2}, -\frac12, -\frac12 \right)^T$, so all the vertical bars are compressed by .5, the diagonal bars slightly more than twice as compressed, while the horizontal bar has no stress.

(d) To stabilize the structure, you need to add in at least three more bars.
(e) Suppose we add in an upper horizontal bar and two diagonal bars going from lower left to upper right. For the first set of forces, $\mathbf{e} = (-2, -1, 0, 0, 0, -1, -2, 0, 0, 0)^T$; for the second set of forces, $\mathbf{e} = \left( -\frac12, -\frac12, -\frac{\sqrt5}{2}, 0, -\frac{\sqrt5}{2}, -\frac12, -\frac12, 0, 0, 0 \right)^T$. In both cases, the stresses remain the same, and the reinforcing bars experience no stress.
Three-dimensional house:
(a)
0

A=

B 0
B
B
B 0
B
B
B 0
B
B
B 0
B
B 1
B
B 0
B
B
B 0
B
B 0
B
B 0
B
B
B 0
B
@ 0

0
1
2
1
0
0
0
0
0
0
0
0
0
0

1
1
2
0
0
0
0
0
0
0
0
0
0
0

0
0
0
0
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
0
0
0
0

0
1
2
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

1
2
1
0
0

0
0
0
0
0
0
0
1
0
0
0
0
0

1
2

1
2

1
2

0
0
0
0
0
0
0
0
0

0
0
0
0
0
0
0
0
1

1
2
0
0
0

0
0
0
0
0
0
1
0
0
0
0
0
0

0
0
1

1
2

0
0
0
0
0
0
0
0
0

0
0
0
0
0
0
0
0
0

1
2

0
1
2
0

0
0
0

1
2
1
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

1
2

1
2

0
0
0
0
0
0
0
1
0
0
0
0
0

0
0
0
0
0
0
0
0
0
0
1

1
2

0
0
0
0
0
0
0
0
0
0
0

C
C
C
C
C
C
C
C
C
C
C
C
C
C;
C
C
C
C
C
C
C
C
C
C
1
C
A
2

(b) 5 mechanisms: horizontal motion of (i) the two topmost nodes in the direction of the bar connecting them; (ii) the two right side nodes in the direction of the bar connecting them; (iii) the two left side nodes in the direction of the bar connecting them; (iv) the three front nodes in the direction of the bar connecting the lower pair; (v) the three back nodes in the direction of the bar connecting the lower pair.
(c) Equilibrium requires no net force on each unstable pair of nodes in the direction of the instability.
(d) For example, when the two topmost nodes are subject to a unit downwards vertical force, the vertical bars have elongation/stress $-\frac12$, the diagonal bars have $-\frac{1}{\sqrt2} = -.7071$, while the front and back horizontal bars have $\frac12$. The longer horizontal bars have no stress.

(e) To stabilize, you need to add in at least five more bars, e.g., two diagonal bars across
the front and back walls and a bar from a fixed node to the opposite topmost node.
In all cases, if a minimal number of reinforcing bars are added, the stresses remain
the same on the old bars, while the reinforcing bars experience no stress. See Exercise
6.3.21 for the general result.
6.3.10.
(a) Letting $w_i$ denote the vertical displacement and $h_i$ the vertical component of the force on the $i$th mass, $2 w_1 - w_2 = h_1$, $-w_1 + 2 w_2 - w_3 = h_2$, $-w_2 + w_3 = h_3$. The system is statically determinate and stable.
(b) Same equilibrium equations, but now the horizontal displacements $u_1, u_2, u_3$ are arbitrary, and so the structure is unstable: there are three independent mechanisms corresponding to horizontal motions of each individual mass. To maintain equilibrium, the horizontal force components must vanish: $f_1 = f_2 = f_3 = 0$.
(c) Same equilibrium equations, but now the two horizontal displacements $u_1, u_2, u_3, v_1, v_2, v_3$ are arbitrary, and so the structure is unstable: there are six independent mechanisms corresponding to the two independent horizontal motions of each individual mass. To maintain equilibrium, the horizontal force components must vanish: $f_1 = f_2 = f_3 = g_1 = g_2 = g_3 = 0$.
6.3.11.
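(a) The incidence matrix is
$$A = \begin{pmatrix} 1 & & & & & -1 \\ -1 & 1 & & & & \\ & -1 & 1 & & & \\ & & -1 & 1 & & \\ & & & \ddots & \ddots & \\ & & & & -1 & 1 \end{pmatrix}$$
of size $n \times n$. The stiffness matrix is
$$K = A^T C A = \begin{pmatrix} c_1 + c_2 & -c_2 & & & & & -c_1 \\ -c_2 & c_2 + c_3 & -c_3 & & & & \\ & -c_3 & c_3 + c_4 & -c_4 & & & \\ & & -c_4 & c_4 + c_5 & -c_5 & & \\ & & & \ddots & \ddots & \ddots & \\ & & & & -c_{n-1} & c_{n-1} + c_n & -c_n \\ -c_1 & & & & & -c_n & c_n + c_1 \end{pmatrix},$$
and the equilibrium system is $K \mathbf{u} = \mathbf{f}$.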
(b) Observe that $\ker K = \ker A$ is one-dimensional with basis vector $\mathbf{z} = (1, 1, \ldots, 1)^T$. Thus, the stiffness matrix is singular, and the system is not stable. To maintain equilibrium, the force $\mathbf{f}$ must be orthogonal to $\mathbf{z}$, and so $f_1 + \cdots + f_n = 0$, i.e., the net force on the ring is zero.


(c) For instance, if $c_1 = c_2 = c_3 = c_4 = 1$, and $\mathbf{f} = (1, -1, 0, 0)^T$, then the solution is $\mathbf{u} = \left( \frac14, -\frac12, -\frac14, 0 \right)^T + t\, (1, 1, 1, 1)^T$ for any $t$. Nonuniqueness is telling us that the masses can all be moved by the same amount, i.e., the entire ring is rotated, without affecting the force balance at equilibrium.
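The ring example in part (c) is easy to confirm with exact arithmetic. The sketch below is a minimal illustration, assuming the convention that spring $j$ joins mass $j-1$ to mass $j$ (indices taken mod $n$, so $e_j = u_j - u_{j-1}$); the helper names `ring_stiffness` and `matvec` are made up for this example.

```python
from fractions import Fraction as F

def ring_stiffness(c):
    """Stiffness matrix K = A^T C A for a closed ring of len(c) masses.

    Assumed convention: spring j has stiffness c[j] and elongation
    e_j = u_j - u_{j-1}, with indices wrapping around the ring.
    """
    n = len(c)
    A = [[0] * n for _ in range(n)]
    for j in range(n):
        A[j][j] = 1
        A[j][(j - 1) % n] = -1
    return [[sum(A[k][i] * c[k] * A[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(K, u):
    """Matrix-vector product K u."""
    return [sum(a * b for a, b in zip(row, u)) for row in K]

K = ring_stiffness([1, 1, 1, 1])
f = [1, -1, 0, 0]
u = [F(1, 4), F(-1, 2), F(-1, 4), F(0)]

residual_ok = matvec(K, u) == f                      # particular solution
kernel_ok = matvec(K, [1, 1, 1, 1]) == [0, 0, 0, 0]  # rigid rotation of the ring
```

Both checks succeed, which is consistent with the one-parameter family of solutions stated above.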

6.3.12.

(a)

A=

B
B
B
B
B
B
B
B
B
B
B
B
@

(b) v1 =

1
0
0

0
1
0

0
0
1

v2 =

1
2
1
2

B0C
B C
B C
B0C
B C
B1C
B C
B C
B0C
B C
B0C
B C,
B1C
B C
B C
B0C
B C
B0C
B C
B C
B1C
B C
@0A

1
0
0

0
0
0
1

B1C
B C
B C
B0C
B C
B0C
B C
B C
B1C
B C
B0C
B C,
B0C
B C
B C
B1C
B C
B0C
B C
B C
B0C
B C
@1A

v3 =

0
0
0

0
0
0

1
2

B0C
B C
B C
B1C
B C
B0C
B C
B C
B0C
B C
B1C
B C,
B0C
B C
B C
B0C
B C
B1C
B C
B C
B0C
B C
@0A

0
0
0

0
0
0

1
2

1
2

v4 =

0
0
0

1
2

0
1

B 0C
B
C
B
C
B 0C
B
C
B 0C
B
C
B
C
B 0C
B
C
B 0C
B
C
B 0 C,
B
C
B
C
B 0C
B
C
B 1 C
B
C
B
C
B 0C
B
C
@ 1A

v5 =

v6 =

B 0C
B
C
B
C
B 0C
B
C
B 0C
B
C
B
C
B 1 C
B
C
B 0C
B
C
B 1 C;
B
C
B
C
B 0C
B
C
B 0C
B
C
B
C
B 0C
B
C
@ 0A

(c) v1 , v2 , v3 correspond to translations in, respectively, the x, y, z directions;


(d) v4 , v5 , v6 correspond to rotations around, respectively, the x, y, z coordinate axes;
0

1
B 0
B
B
B 0
B
B 1
B
B
B 0
B
B
B 0
B
(e) K = B
B 0
B
B
B 0
B
B
B 0
B
B
B 0
B
B
@ 0
0

0
1
0
0
0
0
0
1
0
0
0
0

0
0
1
0
0
0
0
0
0
0
0
1

1
0
0
2
1
2
12
12
1
2

0
12
0
1
2

0
0
0

0
0
0

0
0
0

12

12
0

12

1
2

1
2

1
2
1
2

0
0
0
0

0
0
0
1
2

0
12

1
2

1
2
12

0
0
0
0


0
1
0

1
2
21

21
2
21
0
1
2
1
2

0
0
0
0
0
0
0
21
1
2
0
1
2
12

0
0
0

12
0
1
2

0
0
0
1
2

0
12

0
0
0
0
0
0
0
12
1
2

0
1
2
12

0
0C
C
C
1 C
C

1C
2C
C
0C
C
C
12 C
C
C;
0C
C
1C
C
2C
1C
2 C
C
12 C
C
C
1C
2 A

C
C
C
C
C
C
1 C
C;
C
C
1
C
2C
A
1
2

B 0C
B
C
B
C
B 0C
B
C
B 0C
B
C
B
C
B 0C
B
C
B 1 C
B
C
B 0 C,
B
C
B
C
B 0C
B
C
B 0C
B
C
B
C
B 1C
B
C
@ 0A

0
0
0

(f ) For fi = ( fi , gi , hi )T we require f1 + f2 + f3 + f4 = 0, g1 + g2 + g3 + g4 = 0,
h1 + h2 + h3 + h4 = 0, h3 = g4 , h2 = f4 , g2 = f3 , i.e., there is no net horizontal force
and no net moment of force around any axis.
(g) You need to fix three nodes. Fixing two still leaves a rotation motion around the line
connecting them.
(h) Displacement of the top node: u4 = ( 1, 1, 1 )T ; since e = ( 1, 0, 0, 0 )T , only the
vertical bar experiences compression of magnitude 1.
6.3.13.

(a)

Placing the vertices at

1
1
B 3C
B
C
B 0 C,
@
A

A=

0 3

B 2
B 3
B
B 2
B
B 1
B
B 3
B
B
B 0
B
B
B 0
B
@

B
B
B
@

2 3C
C
1
C,
A
2

1
2

0
0

21
1
2

3
2

0
0
0
0

2 3

0
B
B
B
@

2 3C
C
C,
A

21
0

1
2

2
3

C
C
C,
A

we obtain

12

2
3

0
0

1
0

0
0

0
21

2
3

B
B
B
@

0
0

2 3

12

2
3
0

2 3
1

2 3
1

1
2

2
3
2
3

C
C
C
C
C
C
C
C
C;
C
C
C
C
C
C
A

1
1
0
1
0 1
0 1
0
0
1
0
0
2
2
B 2 C
B
C
B0C
B1C
B0C
B
6 C
C
B
C
B 6 C
B C
B C
B C
B
C
B
C
B
C
B C
B C
B C
B
B
C
B 1 C
B0C
B0C
B1C
B 1 C

C
B 0 C
B
C
B C
B C
B C
B
B
B
B1C
B0C
B0C
B 2 2 C
3 C
0 C
C
B
C
B
C
B C
B C
B C
B
C
B
C
B
C
B C
B C
B C
B
C
B 1 C
B
C
B0C
B1C
B0C
B
0
0
C
B
C
B
C
B C
B C
B C
B
C
B
C
B
C
B0C
B0C
B1C
B
0
1
0
C , v = B C, v = B
C;
C, v = B C, v = B C, v = B

(b) v1 = B
C
B 3C
B
C
B1C
B0C
B0C
B
5
6
2
3
4
0 C
B
C
B 2 2 C
B C
B C
B C
B
C
B
C
B
C
B C
B C
B C
B
B 1 C
B
B0C
B1C
B0C
B
0 C
0 C
C
B
C
B
C
B C
B C
B C
B
B 0 C
B
B0C
B0C
B1C
B
0 C
1 C
B
C
B
C
C
B C
B C
B C
B
C
B
C
B
C
B C
B C
B C
B
C
B 0 C
B
C
B1C
B0C
B0C
B
0
0
C
B
C
B
C
B C
B C
B C
B
@ 0 A
@
@0A
@1A
@0A
@
0 A
0 A
0
0
0
1
0
0
0

(c) v1 , v2 , v3 correspond to translations in, respectively, the x, y, z directions;

(d) v4 , v5 , v6 correspond to rotations around the top node;


(e) K =
0

11
6

B
B
B 0
B
B
B
2
B 3
B
B
B 3
B
4
B
B
3
B
B 4
B
B
B 0
B
B
3
B
B 4
B
B
3
B
B 4
B
B
B 0
B
B
B
B 1
B
3
B
B
B 0
B
@
2
3

1
2

2
3

43

3
4

3
4
41

2
3

3
4
14

5
6
1
3
1

3 2

3
2
1
6

2
3

23

3
4
14

0
0

1
12

4 3
1

3 2

3 2
1
6
2
3

4 3
14
1
6

34

0
1

3 2
1
6
23

(f ) For fi = ( fi , gi , hi )T we require

3
4

3
4
14

0
0

1
3
3
2
1
6
1

4 3
41
1
6

3 2
1
6
2
3
1

3 2
1
6
2
3

0
5
6
1

3
1

3 2
1
12
1

4 3
1

3 2

13
0

2
3

23

2
3
1
12
1

4 3
1

3 2
1
12
1

4 3
1

3 2
1
2

4 3
41
1
6
1

4 3
41
1
6

3 2
1
6
32
1

3 2
1
6
32

1
2

C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C.
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
A

2 f1 + 6 g1 h1 2 2 f2 + h2 = 0,

2 g1 + 3 f2 + g2 3 f3 + g3 = 0,

2 f1 6 g1 h1 2 2 f3 + h3 = 0,

f1 + f2 + f3 + f4 = 0,
g1 + g2 + g3 + g4 = 0,
h1 + h2 + h3 + h4 = 0,

i.e., there is no net horizontal force and no net moment of force around any axis.
(g) You need to fix three nodes. Fixing only two nodes still permits a rotational motion
around the line connecting them.

(h) Displacement of the top node: u = 0, 0, 21


experience compression of magnitude 1 .

; all the bars connecting the top node

6.3.14. True, since stability only depends on whether the reduced incidence matrix has trivial
kernel or not, which depends only on the geometry, not on the bar stiffnesses.
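6.3.15. (a) True. Since $K \mathbf{u} = \mathbf{f}$, if $\mathbf{f} \ne \mathbf{0}$ then $\mathbf{u} \ne \mathbf{0}$ also. (b) False if the structure is unstable, since any $\mathbf{u} \in \ker A$ yields a zero elongation vector $\mathbf{y} = A \mathbf{u} = \mathbf{0}$.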
6.3.16.
(a) 3 n.
(b) Example: a triangle each of whose nodes is connected
to the ground by two additional, non-parallel bars.

6.3.17. As in Exercise 6.1.6, this follows from the symmetry of the stiffness matrix $K$, which implies that $K^{-1}$ is a symmetric matrix. Let $\mathbf{f}_i = (0, \ldots, 0, \mathbf{n}, 0, \ldots, 0)^T$ denote the force vector corresponding to a unit force at node $i$ applied in the direction of the unit vector $\mathbf{n}$. The resulting displacement is $\mathbf{u}_i = K^{-1} \mathbf{f}_i$, and we are interested in the displacement of node $j$ in the direction $\mathbf{n}$, which equals the dot product $\mathbf{f}_j \cdot \mathbf{u}_i = \mathbf{f}_j^T K^{-1} \mathbf{f}_i = (K^{-1} \mathbf{f}_j)^T \mathbf{f}_i = \mathbf{f}_i \cdot \mathbf{u}_j$, proving equality.

6.3.18. False in general. If the nodes are collinear, a rotation around the line through the nodes
will define a rigid motion. If the nodes are not collinear, then the statement is true.
6.3.19. Since $\mathbf{y} = \mathbf{e} = A \mathbf{u} \in \operatorname{rng} A = \operatorname{corng} A^T$, which, according to Theorem 5.59, is the condition for the solution of minimal norm to the adjoint equation.
6.3.20.
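(a) We are assuming that $\mathbf{f} \in \operatorname{rng} K = \operatorname{corng} A = \operatorname{rng} A^T$, cf. Exercise 3.4.31. Thus, we can write $\mathbf{f} = A^T \mathbf{h} = A^T C \mathbf{g}$ where $\mathbf{g} = C^{-1} \mathbf{h}$.
(b) The equilibrium equations $K \mathbf{u} = \mathbf{f}$ are $A^T C A \mathbf{u} = A^T C \mathbf{g}$, which are the normal equations (4.57) for the weighted least squares solution to $A \mathbf{u} = \mathbf{g}$.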
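6.3.21. Let $A$ be the reduced incidence matrix of the structure, so the equilibrium equations are $A^T C A \mathbf{u} = \mathbf{f}$, where we are assuming $\mathbf{f} \in \operatorname{rng} K = \operatorname{rng} A^T C A$. We use Exercise 6.3.20 to write $\mathbf{f} = A^T C \mathbf{g}$ and characterize $\mathbf{u}$ as the weighted least squares solution to $A \mathbf{u} = \mathbf{g}$, i.e., the vector that minimizes the weighted norm $\| A \mathbf{u} - \mathbf{g} \|^2$.
Now, the reduced incidence matrix for the reinforced structure is $\widetilde{A} = \begin{pmatrix} A \\ B \end{pmatrix}$, where the rows of $B$ represent the reinforcing bars. The structure will be stable if and only if $\operatorname{corng} \widetilde{A} = \operatorname{corng} A + \operatorname{corng} B = \mathbb{R}^n$, and the number of bars is minimal if and only if $\operatorname{corng} A \cap \operatorname{corng} B = \{\mathbf{0}\}$. Thus, $\operatorname{corng} A$ and $\operatorname{corng} B$ are complementary subspaces of $\mathbb{R}^n$, which implies, as in Exercise 5.6.12, that their orthogonal complements $\ker A$ and $\ker B$ are also complementary subspaces, so $\ker A + \ker B = \mathbb{R}^n$, $\ker A \cap \ker B = \{\mathbf{0}\}$.
The reinforced equilibrium equations for the new displacement $\mathbf{v}$ are $\widetilde{A}^T \widetilde{C} \widetilde{A} \mathbf{v} = A^T C A \mathbf{v} + B^T D B \mathbf{v} = \mathbf{f}$, where $D$ is the diagonal matrix whose entries are the stiffnesses of the reinforcing bars, while $\widetilde{C} = \begin{pmatrix} C & O \\ O & D \end{pmatrix}$. Since $\mathbf{f} = A^T C \mathbf{g} = A^T C \mathbf{g} + B^T D \mathbf{0} = \widetilde{A}^T \widetilde{C} \begin{pmatrix} \mathbf{g} \\ \mathbf{0} \end{pmatrix}$, again using Exercise 6.3.20, the reinforced displacement $\mathbf{v}$ is the least squares solution to the combined system $A \mathbf{v} = \mathbf{g}$, $B \mathbf{v} = \mathbf{0}$, i.e., the vector that minimizes the combined weighted norm $\| A \mathbf{v} - \mathbf{g} \|^2 + \| B \mathbf{v} \|^2$. Now, since we are using the minimal number of bars, we can uniquely decompose $\mathbf{v} = \mathbf{z} + \mathbf{w}$ where $\mathbf{z} \in \ker A$ and $\mathbf{w} \in \ker B$, and we find $\| A \mathbf{v} - \mathbf{g} \|^2 + \| B \mathbf{v} \|^2 = \| A \mathbf{w} - \mathbf{g} \|^2 + \| B \mathbf{z} \|^2$. Clearly this will be minimized if and only if $B \mathbf{z} = \mathbf{0}$ and $\mathbf{w}$ minimizes $\| A \mathbf{w} - \mathbf{g} \|^2$. Since $\ker A \cap \ker B = \{\mathbf{0}\}$, the first condition forces $\mathbf{z} = \mathbf{0}$. Therefore, $\mathbf{v} = \mathbf{w} = \mathbf{u} \in \ker B$, and so the entries of $B \mathbf{u} = \mathbf{0}$ are the elongations of the reinforcing bars.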
6.3.22.
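(a) $A^\star = \begin{pmatrix} \frac{1}{\sqrt2} & \frac{1}{\sqrt2} & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & -\frac{1}{\sqrt2} & \frac{1}{\sqrt2} & \frac{1}{\sqrt2} \end{pmatrix}$; $K^\star \mathbf{u} = \mathbf{f}^\star$ where $K^\star = \begin{pmatrix} \frac32 & \frac12 & -1 & 0 & 0 \\ \frac12 & \frac12 & 0 & 0 & 0 \\ -1 & 0 & \frac32 & -\frac12 & -\frac12 \\ 0 & 0 & -\frac12 & \frac12 & \frac12 \\ 0 & 0 & -\frac12 & \frac12 & \frac12 \end{pmatrix}$.
(b) Unstable: there are two mechanisms prescribed by the kernel basis elements $(1, -1, 1, 1, 0)^T$, which represents the same mechanism as when the end is fixed, and $(1, -1, 1, 0, 1)^T$, in which the roller and the right hand node move horizontally to the right, while the left node moves down and to the right.
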

6.3.23.
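(a) $A^\star = \begin{pmatrix} \frac{1}{\sqrt2} & \frac{1}{\sqrt2} & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & -\frac{1}{\sqrt2} & \frac{1}{\sqrt2} & -\frac{1}{\sqrt2} \end{pmatrix}$; $K^\star \mathbf{u} = \mathbf{f}^\star$ where $K^\star = \begin{pmatrix} \frac32 & \frac12 & -1 & 0 & 0 \\ \frac12 & \frac12 & 0 & 0 & 0 \\ -1 & 0 & \frac32 & -\frac12 & \frac12 \\ 0 & 0 & -\frac12 & \frac12 & -\frac12 \\ 0 & 0 & \frac12 & -\frac12 & \frac12 \end{pmatrix}$.
(b) Unstable: there are two mechanisms prescribed by the kernel basis elements $(1, -1, 1, 1, 0)^T$, which represents the same mechanism as when the end is fixed, and $(-1, 1, -1, 0, 1)^T$, in which the roller moves up, the right hand node moves horizontally to the left, while the left node moves up and to the left.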

6.3.24.

(a) Horizontal roller: $A^\star = \begin{pmatrix} .7071 & .7071 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & -.7071 & .7071 & .7071 \\ -.9487 & .3162 & 0 & 0 & .9487 \end{pmatrix}$;
$K^\star \mathbf{u} = \mathbf{f}^\star$ where $K^\star = \begin{pmatrix} 2.4 & .2 & -1 & 0 & -.9 \\ .2 & .6 & 0 & 0 & .3 \\ -1 & 0 & 1.5 & -.5 & -.5 \\ 0 & 0 & -.5 & .5 & .5 \\ -.9 & .3 & -.5 & .5 & 1.4 \end{pmatrix}$;
unstable: there is one mechanism prescribed by the kernel basis element $(.75, -.75, .75, -.25, 1)^T$, in which the roller moves horizontally to the right, the right hand node moves right and slightly down, while the left node moves right and down.
(b) Vertical roller: $A^\star = \begin{pmatrix} .7071 & .7071 & 0 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & -.7071 & .7071 & -.7071 \\ -.9487 & .3162 & 0 & 0 & -.3162 \end{pmatrix}$;
$K^\star \mathbf{u} = \mathbf{f}^\star$ where $K^\star = \begin{pmatrix} 2.4 & .2 & -1 & 0 & .3 \\ .2 & .6 & 0 & 0 & -.1 \\ -1 & 0 & 1.5 & -.5 & .5 \\ 0 & 0 & -.5 & .5 & -.5 \\ .3 & -.1 & .5 & -.5 & .6 \end{pmatrix}$;
unstable: there is one mechanism prescribed by the kernel basis element $(-.25, .25, -.25, .75, 1)^T$, in which the roller moves up, the right hand node moves up and slightly to the left, while the left node moves slightly left and up.
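The roller computations in Exercise 6.3.24 can be sanity-checked with a few lines of code. The sketch below is an illustration under stated assumptions: unit bar stiffnesses (so $K^\star = (A^\star)^T A^\star$), columns ordered as $(u_1, v_1, u_2, v_2, s)$ with $s$ the roller displacement along its track, and bar directions whose signs are reconstructed from the printed magnitudes. The function names are hypothetical.

```python
import math

r2, r10 = math.sqrt(2.0), math.sqrt(10.0)

def incidence(roller):
    """Assumed reduced incidence matrix for a roller moving along `roller`."""
    rx, ry = roller
    n3 = (1 / r2, -1 / r2)      # assumed direction: node 2 -> roller
    n4 = (3 / r10, -1 / r10)    # assumed direction: node 1 -> roller
    return [
        [1 / r2, 1 / r2, 0, 0, 0],
        [-1, 0, 1, 0, 0],
        [0, 0, -n3[0], -n3[1], n3[0] * rx + n3[1] * ry],
        [-n4[0], -n4[1], 0, 0, n4[0] * rx + n4[1] * ry],
    ]

def gram(A):
    """K = A^T A, rounded to one decimal to match the printed entries."""
    n = len(A[0])
    return [[round(sum(row[i] * row[j] for row in A), 1) for j in range(n)]
            for i in range(n)]

def times(A, u):
    """Elongations A u produced by the displacement vector u."""
    return [sum(a * b for a, b in zip(row, u)) for row in A]

A_h = incidence((1, 0))   # horizontal roller
A_v = incidence((0, 1))   # vertical roller
K_h, K_v = gram(A_h), gram(A_v)
mech_h = times(A_h, [0.75, -0.75, 0.75, -0.25, 1])
mech_v = times(A_v, [-0.25, 0.25, -0.25, 0.75, 1])
```

With these sign choices both mechanism vectors are annihilated by the corresponding incidence matrix, and the rounded Gram matrices agree in magnitude with the printed stiffness matrices.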

6.3.25.
(a) Yes, if the direction of the roller is perpendicular to the vector between the two nodes,
the structure admits an (infinitesimal) rotation around the fixed node.
(b) A total of six rollers is required to eliminate all six independent rigid motions. The
rollers must not be aligned. For instance, if they all point in the same direction, they
do not eliminate a translational mode.


Solutions Chapter 7
7.1.1. Only (a) and (d) are linear.
7.1.2. Only (a),(d) and (f ) are linear.
2
0

0
7.1.3. (a) F (0, 0) =
6=
, (b) F (2 x, 2 y) = 4 F (x, y) 6= 2 F (x, y), (c) F ( x, y) =
0
!
!
0
1
.
6=
F (x, y) 6= F (x, y), (d) F (2 x, 2 y) 6= 2 F (x, y), (e) F (0, 0) =
0
0
0
7.1.4. Since T
0
identity map.
0

0
7.1.5. (a) B
@1
0
(e)

0
B
B
B
@

1
3
2
3
2
3
!

a
, linearity requires a = b = 0, so the only linear translation is the
b

=
1

1
0
0
2
3
13
2
3

x
y

0
0C
A,
1

(b)

2
3
2
3
13

B
B
B0
@

C
C
C,
A

(f )

1
2
3
2

B
@0

0C

0
1
0

23
1
2
1

0
0C
A,
0

C
C,
A

(c)

(g)

1
,
1

0
B
B
B
@

0
1
0

B
@0

0
5
6
1
6
1
3

0
0C
A,
1

1
6
5
6
1
3

31

(d)

B
@1

0
0
1

1
0C
A,
0

C
1C
C.
3A
1
3

1
form a basis, so we can write any v
1
!
!
!
1
1
1
2
R as a linear combination v = c
+d
. Thus, by linearity, L[ v ] = c L
+
1
1
1
!
1
dL
= 2 c + 3 d is uniquely determined by its values on the two basis vectors.
1

7.1.6. L

5
2

1
2

y. Yes, because

2x+
7.1.7. L(x, y) = @ 13
3x

4
3
1
3

yA
.
y

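7.1.8. The linear function exists and is unique if and only if $\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}, \begin{pmatrix} x_2 \\ y_2 \end{pmatrix}$ are linearly independent. In this case, the matrix form is $A = \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix} \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix}^{-1}$. On the other hand, if $c_1 \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} + c_2 \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = \mathbf{0}$, then the linear function exists (but is not uniquely defined) if and only if $c_1 \begin{pmatrix} a_1 \\ b_1 \end{pmatrix} + c_2 \begin{pmatrix} a_2 \\ b_2 \end{pmatrix} = \mathbf{0}$.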

7.1.9. No, because linearity would require


0

20

13

1
1
1
1
0
B
C
B
C
B
C7
6B
C
C
LB
@ 1 A = L 4 @ 1 A @ 1 A 5 = L@ 0 A L@ 1 A = 3 6= 2.
0
1
0
0
1
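7.1.10. $L_{\mathbf{a}}[c \mathbf{v} + d \mathbf{w}] = \mathbf{a} \times (c \mathbf{v} + d \mathbf{w}) = c\, \mathbf{a} \times \mathbf{v} + d\, \mathbf{a} \times \mathbf{w} = c L_{\mathbf{a}}[\mathbf{v}] + d L_{\mathbf{a}}[\mathbf{w}]$; matrix representative: $\begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}$.

7.1.11. No, since $N(-\mathbf{v}) = N(\mathbf{v}) \ne -N(\mathbf{v})$.

7.1.12. False, since $Q(c \mathbf{v}) = c^2 \mathbf{v} \ne c \mathbf{v}$ in general.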
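As a quick illustration of the matrix representative in Exercise 7.1.10, the following sketch compares the cross product $\mathbf{a} \times \mathbf{v}$ with multiplication by the skew-symmetric matrix built from $\mathbf{a} = (a, b, c)^T$. The function names and the sample vectors are chosen only for this example.

```python
def cross(a, v):
    """Cross product a x v of two 3-vectors."""
    return [a[1] * v[2] - a[2] * v[1],
            a[2] * v[0] - a[0] * v[2],
            a[0] * v[1] - a[1] * v[0]]

def cross_matrix(a):
    """Skew-symmetric matrix representing v -> a x v."""
    x, y, z = a
    return [[0, -z, y],
            [z, 0, -x],
            [-y, x, 0]]

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]

a, v = [1, 2, 3], [4, 5, 6]
direct = cross(a, v)
via_matrix = matvec(cross_matrix(a), v)
```

Both routes give the same vector, and the matrix is visibly skew-symmetric, as expected for a cross product map.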

7.1.13. Set $b = L(1)$. Then $L(x) = L(x \cdot 1) = x\, L(1) = x\, b$. The proof of linearity is straightforward; indeed, this is a special case of matrix multiplication.
7.1.14.
(a) $L[c X + d Y] = A (c X + d Y) = c\, A X + d\, A Y = c L[X] + d L[Y]$;
matrix representative: $\begin{pmatrix} a & 0 & b & 0 \\ 0 & a & 0 & b \\ c & 0 & d & 0 \\ 0 & c & 0 & d \end{pmatrix}$.
(b) $R[c X + d Y] = (c X + d Y) B = c\, X B + d\, Y B = c R[X] + d R[Y]$;
matrix representative: $\begin{pmatrix} p & r & 0 & 0 \\ q & s & 0 & 0 \\ 0 & 0 & p & r \\ 0 & 0 & q & s \end{pmatrix}$.
(c) $K[c X + d Y] = A (c X + d Y) B = c\, A X B + d\, A Y B = c K[X] + d K[Y]$;
matrix representative: $\begin{pmatrix} a p & a r & b p & b r \\ a q & a s & b q & b s \\ c p & c r & d p & d r \\ c q & c s & d q & d s \end{pmatrix}$.
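The three matrix representatives in Exercise 7.1.14 all follow one pattern, which can be checked mechanically. The sketch below assumes $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $B = \begin{pmatrix} p & q \\ r & s \end{pmatrix}$, and the basis ordering $E_{11}, E_{12}, E_{21}, E_{22}$ for $2 \times 2$ matrices; the helper names are invented for this illustration.

```python
def matmul(P, Q):
    """Product of two 2x2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def vec(X):
    """Coordinates of X in the basis E11, E12, E21, E22."""
    return [X[0][0], X[0][1], X[1][0], X[1][1]]

def representative(A, B):
    """4x4 matrix of X -> A X B; entry ((i,j),(k,l)) is A[i][k] * B[l][j]."""
    return [[A[i][k] * B[l][j] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * t for m, t in zip(row, x)) for row in M]

I2 = [[1, 0], [0, 1]]
A = [[1, 2], [3, 4]]      # a, b, c, d = 1, 2, 3, 4
B = [[5, 6], [7, 8]]      # p, q, r, s = 5, 6, 7, 8
X = [[1, -1], [2, 0]]

left = representative(A, I2)    # part (a): X -> A X
right = representative(I2, B)   # part (b): X -> X B
both = representative(A, B)     # part (c): X -> A X B
agrees = apply(both, vec(X)) == vec(matmul(matmul(A, X), B))
```

Taking `B` or `A` to be the identity recovers parts (a) and (b) as special cases of part (c).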

7.1.15. (a) Linear; target space $= \mathcal{M}_{n \times n}$. (b) Not linear; target space $= \mathcal{M}_{n \times n}$. (c) Linear; target space $= \mathcal{M}_{n \times n}$. (d) Not linear; target space $= \mathcal{M}_{n \times n}$. (e) Not linear; target space $= \mathbb{R}$. (f) Linear; target space $= \mathbb{R}$. (g) Linear; target space $= \mathbb{R}^n$. (h) Linear; target space $= \mathbb{R}^n$. (i) Linear; target space $= \mathbb{R}$.

7.1.16. (a) If $L$ satisfies (7.1), then $L[c \mathbf{v} + d \mathbf{w}] = L[c \mathbf{v}] + L[d \mathbf{w}] = c L[\mathbf{v}] + d L[\mathbf{w}]$, proving (7.3). Conversely, given (7.3), the first equation in (7.1) is the special case $c = d = 1$, while the second corresponds to $d = 0$. (b) Equations (7.1, 3) prove (7.4) for $k = 1, 2$. By induction, assuming the formula is true for $k$, to prove it for $k + 1$, we compute
$L[c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k + c_{k+1} \mathbf{v}_{k+1}] = L[c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k] + c_{k+1} L[\mathbf{v}_{k+1}]$
$= c_1 L[\mathbf{v}_1] + \cdots + c_k L[\mathbf{v}_k] + c_{k+1} L[\mathbf{v}_{k+1}]$.

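7.1.17. If $\mathbf{v} = c_1 \mathbf{v}_1 + \cdots + c_n \mathbf{v}_n$, then, by linearity,
$L[\mathbf{v}] = L[c_1 \mathbf{v}_1 + \cdots + c_n \mathbf{v}_n] = c_1 L[\mathbf{v}_1] + \cdots + c_n L[\mathbf{v}_n] = c_1 \mathbf{w}_1 + \cdots + c_n \mathbf{w}_n$.
Since $\mathbf{v}_1, \ldots, \mathbf{v}_n$ form a basis, the coefficients $c_1, \ldots, c_n$ are uniquely determined by $\mathbf{v} \in V$ and hence the preceding formula uniquely determines $L[\mathbf{v}]$.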
7.1.18.
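(a) $B(c \mathbf{v} + \tilde{c} \tilde{\mathbf{v}}, \mathbf{w}) = (c v_1 + \tilde{c} \tilde{v}_1) w_1 - 2 (c v_2 + \tilde{c} \tilde{v}_2) w_2 = c (v_1 w_1 - 2 v_2 w_2) + \tilde{c} (\tilde{v}_1 w_1 - 2 \tilde{v}_2 w_2) = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\tilde{\mathbf{v}}, \mathbf{w})$, so $B(\mathbf{v}, \mathbf{w})$ is linear in $\mathbf{v}$ for fixed $\mathbf{w}$. Similarly, $B(\mathbf{v}, c \mathbf{w} + \tilde{c} \tilde{\mathbf{w}}) = v_1 (c w_1 + \tilde{c} \tilde{w}_1) - 2 v_2 (c w_2 + \tilde{c} \tilde{w}_2) = c (v_1 w_1 - 2 v_2 w_2) + \tilde{c} (v_1 \tilde{w}_1 - 2 v_2 \tilde{w}_2) = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\mathbf{v}, \tilde{\mathbf{w}})$, proving linearity in $\mathbf{w}$ for fixed $\mathbf{v}$.
(b) $B(c \mathbf{v} + \tilde{c} \tilde{\mathbf{v}}, \mathbf{w}) = 2 (c v_1 + \tilde{c} \tilde{v}_1) w_2 - 3 (c v_2 + \tilde{c} \tilde{v}_2) w_3 = c (2 v_1 w_2 - 3 v_2 w_3) + \tilde{c} (2 \tilde{v}_1 w_2 - 3 \tilde{v}_2 w_3) = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\tilde{\mathbf{v}}, \mathbf{w})$, so $B(\mathbf{v}, \mathbf{w})$ is linear in $\mathbf{v}$ for fixed $\mathbf{w}$. Similarly, $B(\mathbf{v}, c \mathbf{w} + \tilde{c} \tilde{\mathbf{w}}) = 2 v_1 (c w_2 + \tilde{c} \tilde{w}_2) - 3 v_2 (c w_3 + \tilde{c} \tilde{w}_3) = c (2 v_1 w_2 - 3 v_2 w_3) + \tilde{c} (2 v_1 \tilde{w}_2 - 3 v_2 \tilde{w}_3) = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\mathbf{v}, \tilde{\mathbf{w}})$, proving bilinearity.
(c) $B(c \mathbf{v} + \tilde{c} \tilde{\mathbf{v}}, \mathbf{w}) = \langle c \mathbf{v} + \tilde{c} \tilde{\mathbf{v}}, \mathbf{w} \rangle = c \langle \mathbf{v}, \mathbf{w} \rangle + \tilde{c} \langle \tilde{\mathbf{v}}, \mathbf{w} \rangle = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\tilde{\mathbf{v}}, \mathbf{w})$,
$B(\mathbf{v}, c \mathbf{w} + \tilde{c} \tilde{\mathbf{w}}) = \langle \mathbf{v}, c \mathbf{w} + \tilde{c} \tilde{\mathbf{w}} \rangle = c \langle \mathbf{v}, \mathbf{w} \rangle + \tilde{c} \langle \mathbf{v}, \tilde{\mathbf{w}} \rangle = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\mathbf{v}, \tilde{\mathbf{w}})$.
(d) $B(c \mathbf{v} + \tilde{c} \tilde{\mathbf{v}}, \mathbf{w}) = (c \mathbf{v} + \tilde{c} \tilde{\mathbf{v}})^T A \mathbf{w} = c\, \mathbf{v}^T A \mathbf{w} + \tilde{c}\, \tilde{\mathbf{v}}^T A \mathbf{w} = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\tilde{\mathbf{v}}, \mathbf{w})$,
$B(\mathbf{v}, c \mathbf{w} + \tilde{c} \tilde{\mathbf{w}}) = \mathbf{v}^T A (c \mathbf{w} + \tilde{c} \tilde{\mathbf{w}}) = c\, \mathbf{v}^T A \mathbf{w} + \tilde{c}\, \mathbf{v}^T A \tilde{\mathbf{w}} = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\mathbf{v}, \tilde{\mathbf{w}})$.
(e) Set $a_{ij} = B(\mathbf{e}_i, \mathbf{e}_j)$ for $i = 1, \ldots, m$, $j = 1, \ldots, n$. Then
$B(\mathbf{v}, \mathbf{w}) = B(v_1 \mathbf{e}_1 + \cdots + v_m \mathbf{e}_m, \mathbf{w}) = v_1 B(\mathbf{e}_1, \mathbf{w}) + \cdots + v_m B(\mathbf{e}_m, \mathbf{w})$
$= v_1 B(\mathbf{e}_1, w_1 \mathbf{e}_1 + \cdots + w_n \mathbf{e}_n) + \cdots + v_m B(\mathbf{e}_m, w_1 \mathbf{e}_1 + \cdots + w_n \mathbf{e}_n)$
$= \sum_{i=1}^m \sum_{j=1}^n v_i w_j B(\mathbf{e}_i, \mathbf{e}_j) = \sum_{i=1}^m \sum_{j=1}^n v_i w_j a_{ij} = \mathbf{v}^T A \mathbf{w}$.
(f) Let $B(\mathbf{v}, \mathbf{w}) = ( B_1(\mathbf{v}, \mathbf{w}), \ldots, B_k(\mathbf{v}, \mathbf{w}) )^T$. Then the bilinearity conditions $B(c \mathbf{v} + \tilde{c} \tilde{\mathbf{v}}, \mathbf{w}) = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\tilde{\mathbf{v}}, \mathbf{w})$ and $B(\mathbf{v}, c \mathbf{w} + \tilde{c} \tilde{\mathbf{w}}) = c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\mathbf{v}, \tilde{\mathbf{w}})$ hold if and only if each component $B_j(\mathbf{v}, \mathbf{w})$ is bilinear.
(g) False. $B(c (\mathbf{v}, \mathbf{w}) + \tilde{c} (\tilde{\mathbf{v}}, \tilde{\mathbf{w}})) = B(c \mathbf{v} + \tilde{c} \tilde{\mathbf{v}}, c \mathbf{w} + \tilde{c} \tilde{\mathbf{w}}) = c^2 B(\mathbf{v}, \mathbf{w}) + c \tilde{c} B(\mathbf{v}, \tilde{\mathbf{w}}) + c \tilde{c} B(\tilde{\mathbf{v}}, \mathbf{w}) + \tilde{c}^2 B(\tilde{\mathbf{v}}, \tilde{\mathbf{w}}) \ne c B(\mathbf{v}, \mathbf{w}) + \tilde{c} B(\tilde{\mathbf{v}}, \tilde{\mathbf{w}})$.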
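A small numerical experiment makes the contrast between parts (d) and (g) concrete: a form $B(\mathbf{v}, \mathbf{w}) = \mathbf{v}^T A \mathbf{w}$ is linear in each slot separately, but scaling both slots at once scales the value quadratically. The sketch below uses the form $v_1 w_1 - 2 v_2 w_2$ (assumed to be the one in part (a)) with arbitrarily chosen sample vectors.

```python
def bilinear(A, v, w):
    """Evaluate B(v, w) = v^T A w."""
    return sum(v[i] * A[i][j] * w[j] for i in range(len(v)) for j in range(len(w)))

def combine(c, v, d, w):
    """Linear combination c v + d w of two vectors."""
    return [c * a + d * b for a, b in zip(v, w)]

A = [[1, 0], [0, -2]]            # B(v, w) = v1 w1 - 2 v2 w2
v, vt = [1, 2], [0, -1]          # vt plays the role of v-tilde
w = [3, 4]
c, ct = 2, -3

# Linearity in the first argument.
lhs = bilinear(A, combine(c, v, ct, vt), w)
rhs = c * bilinear(A, v, w) + ct * bilinear(A, vt, w)

# Scaling both arguments by c multiplies B by c**2, not by c.
joint = bilinear(A, [c * x for x in v], [c * x for x in w])
```

Here `lhs` and `rhs` agree, while `joint` equals `c**2 * B(v, w)`, so $B$ is not a linear function on the product space.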
7.1.19. (a) Linear; target space $= \mathbb{R}$. (b) Not linear; target space $= \mathbb{R}$. (c) Linear; target space $= \mathbb{R}$. (d) Linear; target space $= \mathbb{R}$. (e) Linear; target space $= C^1(\mathbb{R})$. (f) Linear; target space $= C^1(\mathbb{R})$. (g) Not linear; target space $= C^1(\mathbb{R})$. (h) Linear; target space $= C^0(\mathbb{R})$. (i) Linear; target space $= C^0(\mathbb{R})$. (j) Linear; target space $= C^0(\mathbb{R})$. (k) Not linear; target space $= \mathbb{R}$. (l) Linear; target space $= \mathbb{R}$. (m) Not linear; target space $= \mathbb{R}$. (n) Linear; target space $= C^2(\mathbb{R})$. (o) Linear; target space $= C^2(\mathbb{R})$. (p) Not linear; target space $= C^1(\mathbb{R})$. (q) Linear; target space $= C^1(\mathbb{R})$. (r) Linear; target space $= \mathbb{R}$. (s) Not linear; target space $= C^2(\mathbb{R})$.
7.1.20. True. For any constants $c, d$,
$A[c f + d g] = \frac{1}{b - a} \int_a^b [c f(x) + d g(x)]\, dx = \frac{c}{b - a} \int_a^b f(x)\, dx + \frac{d}{b - a} \int_a^b g(x)\, dx = c A[f] + d A[g]$.

7.1.21. $M_h[c f(x) + d g(x)] = h(x) (c f(x) + d g(x)) = c\, h(x) f(x) + d\, h(x) g(x) = c M_h[f(x)] + d M_h[g(x)]$. To show the target space is $C^n[a, b]$, you need the result that the product of two $n$ times continuously differentiable functions is $n$ times continuously differentiable.
7.1.22. $I_w[c f + d g] = \int_a^b [c f(x) + d g(x)]\, w(x)\, dx = c \int_a^b f(x) w(x)\, dx + d \int_a^b g(x) w(x)\, dx = c I_w[f] + d I_w[g]$.

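7.1.23. (a) $\partial_x[c f + d g] = \frac{\partial}{\partial x} [c f(\mathbf{x}) + d g(\mathbf{x})] = c \frac{\partial f}{\partial x} + d \frac{\partial g}{\partial x} = c\, \partial_x[f] + d\, \partial_x[g]$. The same proof works for $\partial_y$. (b) Linearity requires $d = 0$.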

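7.1.24. $\Delta[c f + d g] = \frac{\partial^2}{\partial x^2} [c f(x, y) + d g(x, y)] + \frac{\partial^2}{\partial y^2} [c f(x, y) + d g(x, y)] = c \left( \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \right) + d \left( \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2} \right) = c\, \Delta[f] + d\, \Delta[g]$.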

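7.1.25. $G[c f + d g] = \nabla (c f + d g) = \begin{pmatrix} \frac{\partial}{\partial x} [c f(\mathbf{x}) + d g(\mathbf{x})] \\ \frac{\partial}{\partial y} [c f(\mathbf{x}) + d g(\mathbf{x})] \end{pmatrix} = c \begin{pmatrix} \frac{\partial f}{\partial x} \\ \frac{\partial f}{\partial y} \end{pmatrix} + d \begin{pmatrix} \frac{\partial g}{\partial x} \\ \frac{\partial g}{\partial y} \end{pmatrix} = c \nabla f + d \nabla g = c G[f] + d G[g]$.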

7.1.26.
(a) Gradient: $\nabla (c f + d g) = c \nabla f + d \nabla g$; domain is space of continuously differentiable scalar functions; target is space of continuous vector fields.
(b) Curl: $\nabla \times (c \mathbf{f} + d \mathbf{g}) = c \nabla \times \mathbf{f} + d \nabla \times \mathbf{g}$; domain is space of continuously differentiable vector fields; target is space of continuous vector fields.
(c) Divergence: $\nabla \cdot (c \mathbf{f} + d \mathbf{g}) = c \nabla \cdot \mathbf{f} + d \nabla \cdot \mathbf{g}$; domain is space of continuously differentiable vector fields; target is space of continuous scalar functions.

7.1.27.
(a) dimension $= 3$; basis: $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$.
(b) dimension $= 4$; basis: $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$.
(c) dimension $= m n$; basis: $E_{ij}$ with $(i, j)$ entry equal to 1 and all other entries 0, for $i = 1, \ldots, m$, $j = 1, \ldots, n$.
(d) dimension $= 4$; basis given by $L_0, L_1, L_2, L_3$, where $L_i[a_3 x^3 + a_2 x^2 + a_1 x + a_0] = a_i$.
(e) dimension $= 6$; basis given by $L_0, L_1, L_2, M_0, M_1, M_2$, where $L_i[a_2 x^2 + a_1 x + a_0] = \begin{pmatrix} a_i \\ 0 \end{pmatrix}$, $M_i[a_2 x^2 + a_1 x + a_0] = \begin{pmatrix} 0 \\ a_i \end{pmatrix}$.
(f) dimension $= 9$; basis given by $L_0, L_1, L_2, M_0, M_1, M_2, N_0, N_1, N_2$, where, for $i = 0, 1, 2$, $L_i[a_2 x^2 + a_1 x + a_0] = a_i$, $M_i[a_2 x^2 + a_1 x + a_0] = a_i x$, $N_i[a_2 x^2 + a_1 x + a_0] = a_i x^2$.
0
0

7.1.28. True. The dimension is 2, with basis

1
,
0

0
0

0
.
1

7.1.29. False. The zero function is not an element.

7.1.30. (a) a = ( 3, 1, 2 )T , (b) a = 3, 21 ,

2
3

, (c) a =

1 5
5
4,2, 4

7.1.31. (a) $\mathbf{a} = K^{-1} \mathbf{r}^T$ since $L[\mathbf{v}] = \mathbf{r}\, \mathbf{v} = \mathbf{r}\, K^{-1} K \mathbf{v} = \mathbf{a}^T K \mathbf{v}$ and $K^T = K$. (b) (i) a =


!
!
!1
!1
!
!
!
2
2
2 1
3 0
2
2
1
3 ,
(iii) a =
, (ii) a =
=
=
.
1
1
3
0 2
1
1
0
1
7.1.32.
(a) By linearity, $L_i[x_1 \mathbf{v}_1 + \cdots + x_n \mathbf{v}_n] = x_1 L_i[\mathbf{v}_1] + \cdots + x_n L_i[\mathbf{v}_n] = x_i$.
(b) Every real-valued linear function $L \in V^*$ has the form $L[\mathbf{v}] = a_1 x_1 + \cdots + a_n x_n = a_1 L_1[\mathbf{v}] + \cdots + a_n L_n[\mathbf{v}]$ and so $L = a_1 L_1 + \cdots + a_n L_n$, proving that $L_1, \ldots, L_n$ span $V^*$. Moreover, they are linearly independent since $a_1 L_1 + \cdots + a_n L_n = O$ gives the trivial linear function if and only if $a_1 x_1 + \cdots + a_n x_n = 0$ for all $x_1, \ldots, x_n$, which implies $a_1 = \cdots = a_n = 0$.
(c) Let $\mathbf{r}_i$ denote the $i$th row of $A^{-1}$, which we identify as the linear function $L_i[\mathbf{v}] = \mathbf{r}_i \mathbf{v}$. The $(i, j)$ entry of the equation $A^{-1} A = I$ says that $L_i[\mathbf{v}_j] = \mathbf{r}_i \mathbf{v}_j = \begin{cases} 1, & i = j, \\ 0, & i \ne j, \end{cases}$ which is the requirement for being a dual basis.
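Part (c) gives a direct recipe: place the basis vectors in the columns of $A$ and read the dual basis off the rows of $A^{-1}$. The sketch below carries this out exactly for a $2 \times 2$ case; the basis $(1, 1)^T$, $(1, -1)^T$ is an illustrative choice, and `inverse2` is a hypothetical helper written for this example.

```python
from fractions import Fraction as F

def inverse2(A):
    """Exact inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = A
    det = F(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns of A are the (assumed) basis vectors v1 = (1, 1), v2 = (1, -1).
A = [[1, 1], [1, -1]]
basis = [[A[0][j], A[1][j]] for j in range(2)]

rows = inverse2(A)   # row i represents the dual functional L_i[v] = r_i v

# Table of values L_i[v_j]; it should be the identity matrix.
pairing = [[sum(r * x for r, x in zip(rows[i], basis[j])) for j in range(2)]
           for i in range(2)]
```

The resulting `pairing` table is the identity, which is exactly the dual basis condition $L_i[\mathbf{v}_j] = \delta_{ij}$.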


7.1.33.
In all cases, the dual basis consists of the linear functions $L_i[\mathbf{v}] = \mathbf{r}_i \mathbf{v}$. (a) r1 =

1
1
1
1
1 3
2
1 1
1 1
2 , 2 , r2 =
2 , 2 , (b) r1 =
7 , 7 , r2 =
7 , 7 , (c) r1 =
2 , 2 , 2 , r2 =

1 1
1
2,2, 2

, r3 = 12 , 12 , 12 , (d) r1 = ( 8, 1, 3 ) , r2 = ( 10, 1, 4 ) , r3 = ( 7, 1, 3 ),
(e) r1 = ( 0, 1, 1, 1 ) , r2 = ( 1, 1, 2, 2 ) , r3 = ( 2, 2, 2, 3 ) , r4 = ( 1, 1, 1, 1 ).

7.1.34. (a) 9 36 x + 30 x2 ,

(b) 12 84 x + 90 x2 , (c) 1, (d) 38 192 x + 180 x2 .

7.1.35. $9 - 36 x + 30 x^2$, $-36 + 192 x - 180 x^2$, $30 - 180 x + 180 x^2$.
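The three quadratics in 7.1.35 have the coefficients of the inverse of the $3 \times 3$ Hilbert matrix. Assuming the underlying inner product is the $L^2$ inner product $\langle p, q \rangle = \int_0^1 p(x) q(x)\, dx$ (an assumption about the exercise statement, which is not reproduced here), each polynomial pairs to 1 with one of the monomials $1, x, x^2$ and to 0 with the other two. A minimal exact check:

```python
from fractions import Fraction as F

def inner(p, q):
    """L2 inner product on [0, 1] of polynomials given as coefficient lists.

    p[i] is the coefficient of x**i; the integral of x**(i+j) over [0, 1]
    is 1 / (i + j + 1).
    """
    return sum(F(a * b, i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

duals = [[9, -36, 30], [-36, 192, -180], [30, -180, 180]]
monomials = [[1], [0, 1], [0, 0, 1]]

# table[i][j] = <dual_i, x**j>
table = [[inner(q, m) for m in monomials] for q in duals]
```

The table comes out as the identity matrix, confirming the signs restored above.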

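7.1.36. Let $\mathbf{w}_1, \ldots, \mathbf{w}_n$ be any basis of $V$. Write $\mathbf{v} = y_1 \mathbf{w}_1 + \cdots + y_n \mathbf{w}_n$, so, by linearity, $L[\mathbf{v}] = y_1 L[\mathbf{w}_1] + \cdots + y_n L[\mathbf{w}_n] = b_1 y_1 + \cdots + b_n y_n$, where $b_i = L[\mathbf{w}_i]$. On the other hand, if we write $\mathbf{a} = a_1 \mathbf{w}_1 + \cdots + a_n \mathbf{w}_n$, then
$\langle \mathbf{a}, \mathbf{v} \rangle = \sum_{i,j=1}^n a_i y_j \langle \mathbf{w}_i, \mathbf{w}_j \rangle = \sum_{i,j=1}^n k_{ij} a_i y_j = \mathbf{a}^T K \mathbf{y}$,
where $k_{ij} = \langle \mathbf{w}_i, \mathbf{w}_j \rangle$ are the entries of the Gram matrix $K$ based on $\mathbf{w}_1, \ldots, \mathbf{w}_n$, and $\mathbf{a}, \mathbf{y}$ on the right hand side denote the coefficient vectors $(a_1, \ldots, a_n)^T$, $(y_1, \ldots, y_n)^T$. Since $K$ is symmetric, setting $\mathbf{a} = K^{-1} \mathbf{b}$ gives $\mathbf{a}^T K \mathbf{y} = \mathbf{b}^T \mathbf{y}$, and thus $L[\mathbf{v}] = \langle \mathbf{a}, \mathbf{v} \rangle$.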

7.1.37.
(a) $S \circ T = T \circ S$ = clockwise rotation by 60° = counterclockwise rotation by 300°;
(b) $S \circ T = T \circ S$ = reflection in the line $y = x$;
(c) $S \circ T = T \circ S$ = rotation by 180°;
(d) S T

= counterclockwise rotation by cos1 45 = 12 2 tan1 21 radians;

T S = clockwise rotation by cos1 45


(e) S T = T S = O;
(f ) S T maps ( x, y )T to

(g) S T maps ( x, y )
(h) S T maps ( x, y )

1
2

25

(x + y), 0
T

1
2

2 tan1

to
!

x+

1
5

4
5

y, x

T
2
;
5y
!

to

( 0, x )T ;

T S maps

1 0
1 1
; (c)
; (b) M =
3 2
3
2
(d) Each linear transformation is uniquely determined
!
1 0
2
M [ ei ] = N L[ ei ] for i = 1, 2.
(e)
=
3 2
0

7.1.38. (a) L =

radians;

; T S maps ( x, y )T to

to ( y, 0 ) ; T S maps ( x, y )

1
2

1
2

x, 12 x

25 x +

4
5

y, 15 x

2
5

2 1
;
0 1
by !
its action on!a basis of R 2 , and
1
1 1
.
1
3
2
N=

1 0
0
0 1 0
0 1
0
C
B
7.1.39. (a) R = B
0 0C
(b) R S = B
0 1 C
@ 0 0 1 A, S = @ 1
A;
@0
A 6= S R =
0
1
0
0
0
1
1
0
0
0
1
0 0 1
B
C
@ 1 0 0 A; under R S, the basis vectors e1 , e2 , e3 go to e3 , e1 , e2 , respectively.
0 1 0
Under S R, they go to e2 , e3 , e1 .
(c) Do it.
7.1.40. No. the matrix representatives for P, Q and R = Q P are, respectively,
P =

0
B
B
B
@

2
3
1
3
1
3

13
2
3
1
3

1
3
1
3
2
3

C
C
C,
A

Q=

0
B
B
B
@

2
3
1
3
1
3

1
3
2
3
13

1
3
13
2
3

C
C
C,
A


R = QP =

0
B
B
B
@

4
9
1
9
5
9

19
2
9
1
9

5
9
19
4
9

C
C
C,
A

but orthogo-

nal projection onto L =

8 0 19
>
1 >
<
=
B C
t
@0A
>
>
:
;

has matrix representative M =

1
B2
B
@ 0
1
2

0
0
0

1
1
2C
0C
A
1
2

6= R.

7.1.41. (a) $L = E \circ D$ where $D[f(x)] = f'(x)$, $E[g(x)] = g(0)$. No, they do not commute: $D \circ E$ is not even defined since the target of $E$, namely $\mathbb{R}$, is not the domain of $D$, the space of differentiable functions. (b) $e = 0$ is the only condition.
7.1.42. $L \circ M = x D^2 + (1 - x^2) D - 2 x$, $M \circ L = x D^2 + (2 - x^2) D - x$. They do not commute.
7.1.43. (a) According to Lemma 7.11, $M_a \circ D$ is linear, and hence, for the same reason, $L = D \circ (M_a \circ D)$ is also linear. (b) $L = a(x) D^2 + a'(x) D$.
7.1.44. (a) Given $L \colon V \to U$, $M \colon W \to V$ and $N \colon Z \to W$, we have, for $\mathbf{z} \in Z$,
$((L \circ M) \circ N)[\mathbf{z}] = (L \circ M)[N[\mathbf{z}]] = L[M[N[\mathbf{z}]]] = L[(M \circ N)[\mathbf{z}]] = (L \circ (M \circ N))[\mathbf{z}]$
as elements of $U$. (b) Lemma 7.11 says that $M \circ N$ is linear, and hence, for the same reason, $(L \circ M) \circ N$ is linear. (c) When $U = \mathbb{R}^m$, $V = \mathbb{R}^n$, $W = \mathbb{R}^p$, $Z = \mathbb{R}^q$, then $L$ is represented by an $m \times n$ matrix $A$, $M$ is represented by an $n \times p$ matrix $B$, and $N$ is represented by a $p \times q$ matrix $C$. Associativity of composition implies $(A B) C = A (B C)$.

7.1.45. Given $L = a_n D^n + \cdots + a_1 D + a_0$, $M = b_n D^n + \cdots + b_1 D + b_0$, with $a_i, b_i$ constant, the linear combination $c L + d M = (c a_n + d b_n) D^n + \cdots + (c a_1 + d b_1) D + (c a_0 + d b_0)$ is also a constant coefficient linear differential operator, proving that it forms a subspace of the space of all linear operators. A basis is $D^n, D^{n-1}, \ldots, D, 1$ and so its dimension is $n + 1$.
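7.1.46. If $p(x, y) = \sum c_{ij} x^i y^j$ then $p(\partial_x, \partial_y) = \sum c_{ij} \partial_x^i \partial_y^j$ is a linear combination of linear operators, which can be built up as compositions $\partial_x^i \circ \partial_y^j = \partial_x \circ \cdots \circ \partial_x \circ \partial_y \circ \cdots \circ \partial_y$ of the basic first order linear partial differential operators.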
7.1.47.
(a) Both $L \circ M$ and $M \circ L$ are linear by Lemma 7.11, and, since the linear transformations form a vector space, their difference $L \circ M - M \circ L$ is also linear.
(b) $L \circ M = M \circ L$ if and only if $[L, M] = L \circ M - M \circ L = O$.
0
1
!
!
0 2
0
1
3
0 2
C
(c) (i)
, (ii)
, (iii) B
@ 2 0 2 A.
0 1
2
0
0 2
0
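(d) $\big[ [L, M], N \big] = (L \circ M - M \circ L) \circ N - N \circ (L \circ M - M \circ L) = L \circ M \circ N - M \circ L \circ N - N \circ L \circ M + N \circ M \circ L$,
$\big[ [N, L], M \big] = (N \circ L - L \circ N) \circ M - M \circ (N \circ L - L \circ N) = N \circ L \circ M - L \circ N \circ M - M \circ N \circ L + M \circ L \circ N$,
$\big[ [M, N], L \big] = (M \circ N - N \circ M) \circ L - L \circ (M \circ N - N \circ M) = M \circ N \circ L - N \circ M \circ L - L \circ M \circ N + L \circ N \circ M$,
which add up to $O$.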
!
!
h
i
h
i
h
i
3 2
0 0
(e) [ L, M ], N =
, [ N, L ], M =
, [ M, N ], L =
2 3
2 0
!
!
!
3 2
0 0
3 2
whose sum is
+
+
= O.
2 3
2 0
0 3
e M) = [cL + e
e M ] = (c L + e
e M M (c L + e
e
(f ) B(c L + ec L,
c L,
c L)
c L)

3
0

2
3

e = c [ L, M ] + e
e M ] = c B(L, M ) + e
e M ),
e M M L)
c [ L,
c B(L,
= c (L M M L) + ec (L
f ) = B(c M + e
f , L) = c B(M, L) e
f , L) = c B(L, M )+ e
f ).
B(L, c M + ec M
cM
c B(M
c B(L, M
(Or the latter property can be proved directly.)


7.1.48.
(a) $[P, Q][f] = P \circ Q[f] - Q \circ P[f] = P[x f] - Q[f'] = (x f)' - x f' = f$.
(b) According to Exercise 1.2.32, the trace of any matrix commutator is zero: $\operatorname{tr} [P, Q] = 0$. On the other hand, $\operatorname{tr} I = n$, the size of the matrix, not 0.
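The identity in part (a) can be tried out on polynomials, where differentiation and multiplication by $x$ act on coefficient lists. This is only a sketch: polynomials are stored as lists with `p[k]` the coefficient of $x^k$, and the helper names `D`, `X`, `sub`, and `commutator` are made up for the illustration.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def D(p):
    """Derivative: P[f] = f'."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def X(p):
    """Multiplication by x: Q[f] = x f."""
    return trim([0] + list(p))

def sub(p, q):
    """Difference p - q of two polynomials."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def commutator(p):
    """[P, Q][f] = (x f)' - x f'."""
    return sub(D(X(p)), X(D(p)))

f = [1, 2, 3]                 # 1 + 2x + 3x^2
result = commutator(f)
constant = commutator([5])    # a constant polynomial
```

In both cases the commutator returns the input polynomial unchanged, in line with $[P, Q] = I$ on this function space.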
7.1.49.
(a) $\mathcal{D}^{(1)}$ is a subspace of the vector space of all linear operators acting on the space of polynomials, and so, by Proposition 2.9, one only needs to prove closure. If $L = p(x) D + q(x)$ and $M = r(x) D + s(x)$ are operators of the given form, so is $c L + d M = [c\, p(x) + d\, r(x)] D + [c\, q(x) + d\, s(x)]$ for any scalars $c, d \in \mathbb{R}$. It is an infinite dimensional vector space since the operators $x^i D$ and $x^j$ for $i, j = 0, 1, 2, \ldots$ are all linearly independent.
(b) If $L = p(x) D + q(x)$ and $M = r(x) D + s(x)$, then
$L \circ M = p r D^2 + (p r' + q r + p s) D + (p s' + q s)$,
$M \circ L = p r D^2 + (p' r + q r + p s) D + (q' r + q s)$,
hence $[L, M] = (p r' - p' r) D + (p s' - q' r)$.
(c) $[L, M] = L$, $[M, N] = N$, $[N, L] = -2 M$, and so
$\big[ [L, M], N \big] + \big[ [N, L], M \big] + \big[ [M, N], L \big] = [L, N] - 2 [M, M] + [N, L] = O$.

7.1.50. Yes, it is a vector space, but the commutator of two second order differential operators is, in general, a third order differential operator. For example, $[x D^2, D^2] = -2 D^3$.

7.1.51. (a) The inverse is the scaling transformation that halves the length of each vector.
(b) The inverse is counterclockwise rotation by 45°.
(c) The inverse is reflection through the y axis.
(d) No
inverse.
!
1 2
(e) The inverse is the shearing transformation
.
0
1
7.1.52.
!
!
1
0
2 0
2
; inverse:
.
(a) Function:
0 2
0 12 0
0
1
1
1
1
1
1

B 2
2
2C
2C
(b) Function: B
@
A; inverse: @ 1
A.
1

1
1
2

(c) Function:

1
0

(d) Function:

1
2
1
2

(e) Function:

1
0

0
; inverse:
1

1!
2
;
1
2!

1
0

no inverse.

2
; inverse:
1

1
0

2
.
1

7.1.53. Since L has matrix representative


2
1

2
!

0
.
1

3
, and so L1 [ e1 ] =
1

2
1

3
, its inverse has matrix representative
2
!
3
1
and L [ e2 ] =
.
1

1
1

2
7.1.54. Since L has matrix representative B
@ 1
1

1
2
1


1
2C
A, its inverse has matrix representative
2

0
B
B
B
@

2
3
4
3

1
1
1

43

C
5C
C,
3A

and so L1 [ e1 ] =

0
B
B
B
@

2
3
4
3

C
C
C,
A

0
B

L1 [ e2 ] = B
@

4
1C
B 3C
1
5C
B
C.
[ e3 ] = B
1C
A L
3A
@
1
1

7.1.55. If $L \circ M = L \circ N = I_W$, $M \circ L = N \circ L = I_V$, then, by associativity,
$M = M \circ I_W = M \circ (L \circ N) = (M \circ L) \circ N = I_V \circ N = N$.

7.1.56.
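(a) Every vector in $V$ can be uniquely written as a linear combination of the basis elements: $\mathbf{v} = c_1 \mathbf{v}_1 + \cdots + c_n \mathbf{v}_n$. Assuming linearity, we compute
$L[\mathbf{v}] = L[c_1 \mathbf{v}_1 + \cdots + c_n \mathbf{v}_n] = c_1 L[\mathbf{v}_1] + \cdots + c_n L[\mathbf{v}_n] = c_1 \mathbf{w}_1 + \cdots + c_n \mathbf{w}_n$.
Since the coefficients $c_1, \ldots, c_n$ of $\mathbf{v}$ are uniquely determined, this formula serves to uniquely define the function $L \colon V \to W$. We must then check that the resulting function is linear. Given any two vectors $\mathbf{v} = c_1 \mathbf{v}_1 + \cdots + c_n \mathbf{v}_n$, $\mathbf{w} = d_1 \mathbf{v}_1 + \cdots + d_n \mathbf{v}_n$ in $V$, we have
$L[\mathbf{v}] = c_1 \mathbf{w}_1 + \cdots + c_n \mathbf{w}_n$, $L[\mathbf{w}] = d_1 \mathbf{w}_1 + \cdots + d_n \mathbf{w}_n$.
Then, for any $a, b \in \mathbb{R}$,
$L[a \mathbf{v} + b \mathbf{w}] = L[(a c_1 + b d_1) \mathbf{v}_1 + \cdots + (a c_n + b d_n) \mathbf{v}_n]$
$= (a c_1 + b d_1) \mathbf{w}_1 + \cdots + (a c_n + b d_n) \mathbf{w}_n$
$= a (c_1 \mathbf{w}_1 + \cdots + c_n \mathbf{w}_n) + b (d_1 \mathbf{w}_1 + \cdots + d_n \mathbf{w}_n)$
$= a L[\mathbf{v}] + b L[\mathbf{w}]$,
proving linearity of $L$.
(b) The inverse is uniquely defined by the requirement that $L^{-1}[\mathbf{w}_i] = \mathbf{v}_i$, $i = 1, \ldots, n$. Note that $L \circ L^{-1}[\mathbf{w}_i] = L[\mathbf{v}_i] = \mathbf{w}_i$, and hence $L \circ L^{-1} = I_W$ since $\mathbf{w}_1, \ldots, \mathbf{w}_n$ is a basis. Similarly, $L^{-1} \circ L[\mathbf{v}_i] = L^{-1}[\mathbf{w}_i] = \mathbf{v}_i$, and so $L^{-1} \circ L = I_V$.
(c) If $A = ( \mathbf{v}_1\ \mathbf{v}_2\ \ldots\ \mathbf{v}_n )$, $B = ( \mathbf{w}_1\ \mathbf{w}_2\ \ldots\ \mathbf{w}_n )$, then $L$ has matrix representative $B A^{-1}$, while $L^{-1}$ has matrix representative $A B^{-1}$.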
0
1
0
1
!
!
1
3
1
1
3 5
2
5
3 A, L1 = @ 2
2 A;
(d) (i) L =
, L1 =
; (ii) L = @ 3
1
3
1 2
1
3
1
1

2
2
0
1
0
1
1
1
1

0
1
1
2
2
2C
B
C
1
1
1
1C
B
C, L
(iii) L = B
=B
@ 1 0 1 A.

2
2
2A
@
1 1 0
1
1
1
2
2 2

7.1.57. Let $m = \dim V = \dim W$. As guaranteed by Exercise 2.4.24, we can choose bases $v_1, \dots, v_n$ and $w_1, \dots, w_n$ of $\mathbb{R}^n$ such that $v_1, \dots, v_m$ is a basis of $V$ and $w_1, \dots, w_m$ is a basis of $W$. We then define the invertible linear map $L\colon \mathbb{R}^n \to \mathbb{R}^n$ such that $L[v_i] = w_i$, $i = 1, \dots, n$, as in Exercise 7.1.56. Moreover, since $L[v_i] = w_i$, $i = 1, \dots, m$, maps the basis of $V$ to the basis of $W$, it defines an invertible linear function from $V$ to $W$.
7.1.58. Any $m \times n$ matrix of rank $n < m$. The inverse is not unique. For example, if $A = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, then $B = (\,1 \ \ a\,)$, for any scalar $a$, satisfies $B A = I = (1)$.
7.1.59. Use associativity of composition: $N = N \circ I_W = N \circ L \circ M = I_V \circ M = M$.

7.1.60.
(a) $L[a x^2 + b x + c] = a x^2 + (b + 2a)\,x + (c + b)$;
$L^{-1}[a x^2 + b x + c] = a x^2 + (b - 2a)\,x + (c - b + 2a) = e^{-x} \int_{-\infty}^{x} e^{y}\, p(y)\, dy$, where $p(x) = a x^2 + b x + c$.
(b) Any of the functions $J_c[p] = \int_0^x p(y)\, dy + c$, where $c$ is any constant, is a right inverse: $D \circ J_c = I$. There is no left inverse since $\ker D \ne \{0\}$ contains all constant functions.

7.1.61.
(a) It forms a three-dimensional subspace since it is spanned by the linearly independent functions $x^2 e^x$, $x e^x$, $e^x$.
(b) $D[f] = [\,a x^2 + (b + 2a)\,x + (c + b)\,]\,e^x$ is invertible, with inverse
$D^{-1}[f] = [\,a x^2 + (b - 2a)\,x + (c - b + 2a)\,]\,e^x = \int_{-\infty}^{x} f(y)\, dy$.
(c) $D[\,p(x)\, e^x\,] = [\,p'(x) + p(x)\,]\, e^x$, while $D^{-1}[\,p(x)\, e^x\,] = \int_{-\infty}^{x} p(y)\, e^y\, dy$.

7.2.1.
(a)

1
B 2
@ 1

2
0
@

1
2

(d)

(e)

0
@

2C
A.
1
2
!

1
0

(b)
(c)

3
5
4
5

Z x

Z x

f (y) dy.

p(y) e dy.

(i) The line y = x; (ii) the rotated square 0 x + y, x y


(iii) the unit disk.

0
. (i) The x axis; (ii) the square 1 x, y 0; (iii) the unit disk.
1

1
4
5 A.
3
5

(i) The line 4 x + 3 y = 0; (ii) the rotated square with vertices

T
T
T
( 0, 0 )T , 1 , 1
, 0, 2
; (iii) the unit disk.
, 1 , 1
2

0
. (i) The line y = 4 x; (ii) the parallelogram with vertices ( 0, 0 )T , ( 1, 2 )T ,
1
( 1, 3 )T , ( 0, 1 )T ; (iii) the elliptical domain 5 x2 4 x y + y 2 1.
1
3
2A
5
2

1
2
3
2

1
B 2
@ 1

1
0

2C
A
1

3
1

0
B
@

1
1
2C
A.
1

1
2
1
2

(i) The line y = 3 x; (ii) the parallelogram with vertices ( 0, 0 )T , 12 , 32


( 1, 1 )T ,

(f )

1
@5
2
5

2;

3 5
,
1 2 2

2
5 A.
4
5

(iii) the elliptical domain

17
2

x2 9 x y +

(i) The line y = 2 x; (ii) the line segment

(iii) the line segment

7.2.2. Parallelogram with vertices

0
,
0

( x, 2 x )T
!

y 2 1.

( x, 2 x )T

1 x

1
5

ff

3
5

0x

3
2.5

3
:
1

4
,
3

1
,
2

5
2

2
1.5
1
0.5
1

(a) Parallelogram with vertices

0
,
0

1
,
1

4
,
1

3
:
2

0.5
4

-0.5
-1
-1.5
-2

(b) Parallelogram with vertices

0
,
0

2
,
1

3
,
4

1
:
3

0.5


1.5

2.5

0
,
0

(c) Parallelogram with vertices

5
,
7

10
,
8

5
:
1

10

(d) Parallelogram
with vertices
0
1 0 3
1
!
1 2
+ 2 2C
0 B 2
C
B
A, @ 3 2
A,
,@ 1
0
+
+2 2
2
2

!
2 :
2 2

-0.5

0.5

0
,
0

(e) Parallelogram with vertices

3
,
1

2
,
1

0.5

1
:
2

-1

-0.5
-1
-1.5
-2

1
0.75
0.5

(f ) Line segment between

12
1
2

and

0.25

1
:
1

-1 -0.75 -0.5-0.25

0.25 0.5 0.75

-0.25
-0.5
-0.75
-1

(g) Line segment between

1
2

and

3
:
6

-3

-2 -1

-2

7.2.3.
(a) $L^2 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ represents a rotation by $\theta = \pi$;
(b) $L$ is clockwise rotation by 90°, or, equivalently, counterclockwise rotation by 270°.

7.2.4. $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. $L$ represents a reflection through the line $y = x$. Reflecting twice brings you back where you started.
7.2.5. Writing $A = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -2 & 1 \end{pmatrix}$, we see that it is the composition of a reflection in the $x$ axis followed by a shear along the $y$ axis with shear factor 2, which is the same as first doing a shear along the $y$ axis with shear factor $-2$ and then reflecting in the $x$ axis. If we perform $L$ twice, the shears cancel each other out, while reflecting twice brings us back where we started.
7.2.6. Its image is the line that goes through the image points


1
,
2

4
.
1

2
0

7.2.7. Example:
reflection, e.g.,

0
. It is not unique, because you can compose it with any rotation or
3
1
0
0
1
!
1
1

2 2
2 0 B 2
2C
A.
A=@ 3
@ 1

1
0 3
2
2

1
7.2.8. Example: A = B
@0
matrix.
0

0
2
0

12

0
b = A Q where Q is any 3 3 orthogonal
0C
A. More generally, A
4

7.2.9. (a) True. (b) True. (c) False: in general, squares are mapped to parallelograms. (d) False:
in general circles are mapped to ellipses. (e) True.
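7.2.10.
(a) The reflection through the line takes $e_1$ to $(\cos\theta, \sin\theta)^T$ and $e_2$ to $(\sin\theta, -\cos\theta)^T$, and hence has the indicated matrix representative.
(b) $\begin{pmatrix} \cos\varphi & \sin\varphi \\ \sin\varphi & -\cos\varphi \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} = \begin{pmatrix} \cos(\varphi - \theta) & -\sin(\varphi - \theta) \\ \sin(\varphi - \theta) & \cos(\varphi - \theta) \end{pmatrix}$ is rotation by angle $\varphi - \theta$. Composing in the other order gives the opposite rotation, through angle $\theta - \varphi$.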
7.2.11. (a) Let $z$ be a unit vector that is orthogonal to $u$, so $u, z$ form an orthonormal basis of $\mathbb{R}^2$. Then $L[u] = u = R\,u$ since $u^T u = 1$, while $L[z] = -z = R\,z$ since $u \cdot z = u^T z = 0$. Thus, $L[v] = R\,v$ since they agree on a basis of $\mathbb{R}^2$. (b) $R = \dfrac{2\, v\, v^T}{\| v \|^2} - I$.
0
1
1
!
!
7
24
5
12

1
0
0
1
25
25 A;
13
13 A.
(c) (i)
(iii)
; (ii) @
, (iv ) @
7
12
5
0 1
1 0
24

25
25
13
13
7.2.12.
(a)

(b)

(c)

(d)

(e)

1 0
3 0
0 1
0 2
1 13 :
=
0 2
0 1
1 0
3 1
0
1
1
a shear of magnitude 3 along the x axis, followed by a scaling in the y direction by a
factor of 2, followed by a scaling in the x direction by a factor of 3 coupled with a reflection in the
by a!reflection
! in the line y = x.
!
! y axis, followed
1 1
1 0
1 0
1 1
:
=
0 1
0 2
1 1
1 1
a shear of magnitude 1 along the x axis, followed by a scaling in the y direction by a
factor of
1
! 2, followed
! of magnitude
! by a shear
!
! along the y axis.
1
1 0
3 1
1 0
3 0
1 3 :
= 1
1 2
0 35
0 1
0 1
3 1
a shear of magnitude 13 along the x axis, followed by a scaling in the y direction by a
factor of 53 , followed by a scaling of magnitude 3 in the x direction, followed by a shear
1
of
magnitude
along the 1
y0
axis.
0
1 30
10
10
10
10
1
1 1 0
1 0 0
1 0 0
1 0 0
1 0 0
1 0 0
1 1 0
B
C
B
CB
CB
CB
CB
CB
1 0 A @ 0 1 0 A @ 0 1 0 A @ 0 1 1 A @ 0 1 0 C
@1 0 1A = @1 1 0A@0
A:
0 1 1
0 0 1
0 1 1
0 0 2
0 0 1
0 0 1
0 0 1
a shear of magnitude 1 along the x axis that fixes the xz plane, followed a shear of magnitude 1 along the y axis that fixes the xy plane, followed by a reflection in the xz
plane, followed by a scaling in the z direction by a factor of 2, followed a shear of magnitude 1 along the z axis that fixes the xz plane, followed a shear of magnitude 1 along
the y axis 1
that 0
fixes the yz1 plane.
0
10
1
0
10
10
10
1 2 0
1 0 0
1 2 0
1 0 0
1 0 0
1 0 0
1 0 0
B
C
B
CB
C
CB
CB
CB
1 CB
@ 2 4 1 A = @ 0 0 1 A @ 2 1 0 A @ 0 1 0 A @ 0 3 0 A @ 0 1 3 A @ 0 1 0 A:
2 1 1
0 0 1
0 1 0
0 0 1
0 0 1
2 0 1
0 0 1

a shear of magnitude 2 along the x axis that fixes the xz plane, followed a shear of magnitude 13 along the y axis that fixes the xy plane, followed by a scaling in the y direction by a factor of 3 coupled with a reflection in the xz plane, followed by a shear
of magnitude 2 along the z axis that fixes the yz plane, followed shear of magnitude 2
along the y axis that fixes the yz plane, followed by a reflection in the plane y z = 0.
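7.2.13.
(a) With $a = -\tan\tfrac12\theta$ and $b = \sin\theta$,
$\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ b & 1 \end{pmatrix} \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 + a b & 2a + a^2 b \\ b & 1 + a b \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$
since $1 + a b = 1 - \tan\tfrac12\theta\, \sin\theta = 1 - 2 \sin^2 \tfrac12\theta = \cos\theta$ and $2a + a^2 b = -2\tan\tfrac12\theta + \tan^2\tfrac12\theta\, \sin\theta = -2\tan\tfrac12\theta\,(1 - \sin^2\tfrac12\theta) = -2\cos\tfrac12\theta\, \sin\tfrac12\theta = -\sin\theta$.
(b) The factorization is not valid when $\theta$ is an odd multiple of $\pi$, where the matrix $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ represents rotation by 180°.
(c) The first and third factors represent shears along the $x$ axis with shear factor $a$, while the middle factor is a shear along the $y$ axis with shear factor $b$.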
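The three-shear factorization of a rotation in 7.2.13 can be checked numerically. The following is a minimal sketch, assuming the shear factors are $a = -\tan\tfrac12\theta$ and $b = \sin\theta$; the helper names `matmul` and `rotation_from_shears` are hypothetical and introduced only for this illustration.

```python
import math


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rotation_from_shears(theta):
    """Product shear_x * shear_y * shear_x with the assumed shear factors."""
    a = -math.tan(theta / 2)   # assumed shear factor along the x axis
    b = math.sin(theta)        # assumed shear factor along the y axis
    shear_x = [[1.0, a], [0.0, 1.0]]
    shear_y = [[1.0, 0.0], [b, 1.0]]
    return matmul(matmul(shear_x, shear_y), shear_x)


theta = 0.7  # any angle that is not an odd multiple of pi
product = rotation_from_shears(theta)
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
max_error = max(abs(product[i][j] - rotation[i][j])
                for i in range(2) for j in range(2))
```

For angles near an odd multiple of $\pi$ the factor $\tan\tfrac12\theta$ blows up, which is the failure case described in part (b).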

1
0
C
23 C
A
1
2

10

10

10

1
0 0
1 0 0
1 0 0
1 0
0 C
1
B
CB
CB
CB
1
=
0
1
0
0
0
0
1
0
0
1

3 A:
@
A
@
A
@
A
@

2
2
3
0
3
1
0
0
1
0
0
2
0
0
1
0
2

a shear of magnitude 3 along the y axis that fixes the xy plane, following by a scaling
in the z direction by a factor of 2, following by a scaling in the y direction by a factor of 12 ,

following by a shear of magnitude 3 along the z axis that fixes the xz plane.

B
0
7.2.14. B
@

1
0

7.2.15. (a)

0
,
0

7.2.16.

(b)

1
@2
1
2

1
1
2 A,
1
2

(c)

0
@

4
13
6
13

6
13
9
13

A.

(a) If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ has rank 1 then its rows are linearly dependent, and hence either $(a, b) = \lambda\,(c, d)$ or $c = d = 0$. In the first case $A = \begin{pmatrix} \lambda \\ 1 \end{pmatrix} (c, d)$; in the second, $A = \begin{pmatrix} 1 \\ 0 \end{pmatrix} (a, b)$. (See Exercise 1.8.15 for the general case.)
(b) When $u = v$ is a unit vector, or, more generally, when $v = u / \| u \|^2$.

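7.2.17.
$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ is the identity transformation;
$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ is a reflection in the plane $x = y$;
$\begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$ is a reflection in the plane $x = z$;
$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$ is a reflection in the plane $y = z$;
$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$ is rotation by 120° around the line $x = y = z$;
$\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$ is rotation by 240° around the line $x = y = z$.

7.2.18. $X = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$.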

7.2.19. $\det \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = +1$, representing a 180° rotation, while $\det \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} = -1$, and so is a reflection, but through the origin, not a plane, since it doesn't fix any nonzero vectors.

7.2.20. If $v$ is orthogonal to $u$, then $u^T v = 0$, and so $Q\,v = -v$, while since $\| u \|^2 = u^T u = 1$, we have $Q\,u = u$. Thus, $u$ is fixed, while every vector in the plane orthogonal to it is rotated through an angle $\pi$. This suffices to show that $Q$ represents the indicated rotation.
7.2.21.
(a) First, $w = (u \cdot v)\,u = (u^T v)\,u = u\,u^T v$ is the orthogonal projection of $v$ onto the line in the direction of $u$. So the reflected vector is $v - 2w = (I - 2\,u\,u^T)\,v$.
(b) $R^T R = (I - 2\,u\,u^T)^2 = I - 4\,u\,u^T + 4\,u\,u^T u\,u^T = I$ because $u^T u = \| u \|^2 = 1$. It is improper because it reverses orientation.
0
1
1
0
1
0
2
151
24
72
2
24
7
169
31 C
0 25
169
169
3
3
B
C
B
25
C
B
137
96 C
1
2C
2
B
C,
C.
B 24
(iii) B
(ii) B
(c) (i) B
0 1
0C
A,
@
169
169
169 A
3 3
3A
@
@

24
25

7
25

72
169

96
169

119
169

(d) Because the reflected vector is minus the rotated vector.

1
3

2
3

2
3

7.2.22. (a) In the formula for $R_\theta$, the first factor rotates to align $a$ with the $z$ axis, the second rotates around the $z$ axis by angle $\theta$, while the third factor $Q^T = Q^{-1}$ rotates the $z$ axis back to the line through $a$. The combined effect is a rotation through angle $\theta$ around
0
1 0
0
C
T
the axis a. (b) Set Q = B
@ 0 0 1 A and multiply out to produce Q Z Q = Y .
0 1
0

7.2.23.
(a) (i) 3 + i + 2 j , (ii) 3 i + j + k , (iii) 10 + 2 i 2 j 6 k , (iv ) 18.
(b) $q\,\bar q = (a + b\,i + c\,j + d\,k)(a - b\,i - c\,j - d\,k) = a^2 + b^2 + c^2 + d^2 = \| q \|^2$ since all other terms in the product cancel.
(c) This can be easily checked for all basic products, e.g., $(1\,i)\,j = k = 1\,(i\,j)$, $(i\,j)\,k = -1 = i\,(j\,k)$, $(i\,i)\,j = -j = i\,(i\,j)$, etc. The general case follows by using the distributive property (or by a direct computation).
(d) First note that if $a \in \mathbb{R} \subset \mathbb{H}$, then $a\,q = q\,a$ for any $q \in \mathbb{H}$. Thus, for any $a, b \in \mathbb{R}$, we use the distributive property to compute $L_q[a\,r + b\,s] = q\,(a\,r + b\,s) = a\,q\,r + b\,q\,s = a\,L_q[r] + b\,L_q[s]$, and $R_q[a\,r + b\,s] = (a\,r + b\,s)\,q = a\,r\,q + b\,s\,q = a\,R_q[r] + b\,R_q[s]$. The matrix representatives are
$L_q = \begin{pmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \end{pmatrix}$, $R_q = \begin{pmatrix} a & -b & -c & -d \\ b & a & d & -c \\ c & -d & a & b \\ d & c & -b & a \end{pmatrix}$.
(e) By direct computation: $L_q^T L_q = (a^2 + b^2 + c^2 + d^2)\, I = R_q^T R_q$, and so $L_q^T L_q = I = R_q^T R_q$ when $\| q \|^2 = a^2 + b^2 + c^2 + d^2 = 1$.
(f) For $q = (b, c, d)^T$, $r = (x, y, z)^T$, we have $q\,r = (b\,i + c\,j + d\,k)(x\,i + y\,j + z\,k) = -(b x + c y + d z) + (c z - d y)\,i + (d x - b z)\,j + (b y - c x)\,k$, while $q \cdot r = b x + c y + d z$ and $q \times r = (c z - d y,\ d x - b z,\ b y - c x)^T$. The associativity law $(q\,r)\,s = q\,(r\,s)$ implies that $(q \times r) \cdot s = q \cdot (r \times s)$, which defines the vector triple product, and the cross product identity $(q \times r) \times s - (q \cdot r)\,s = q \times (r \times s) - (r \cdot s)\,q$.
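The orthogonality statement in part (e) can be spot-checked for a sample quaternion. This is a small sketch, assuming the basis ordering $1, i, j, k$ and the standard sign pattern for left and right multiplication; the function names `left_matrix`, `right_matrix` and `gram` are hypothetical.

```python
def left_matrix(a, b, c, d):
    """Matrix of r -> q r for q = a + b i + c j + d k (assumed basis 1, i, j, k)."""
    return [[a, -b, -c, -d],
            [b,  a, -d,  c],
            [c,  d,  a, -b],
            [d, -c,  b,  a]]


def right_matrix(a, b, c, d):
    """Matrix of r -> r q for the same quaternion q."""
    return [[a, -b, -c, -d],
            [b,  a,  d, -c],
            [c, -d,  a,  b],
            [d,  c, -b,  a]]


def gram(m):
    """Return the product m^T m."""
    n = len(m)
    return [[sum(m[k][i] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


q = (1, 2, -3, 4)                      # sample quaternion, chosen arbitrarily
norm_sq = sum(x * x for x in q)        # a^2 + b^2 + c^2 + d^2
scaled_identity = [[norm_sq if i == j else 0 for j in range(4)]
                   for i in range(4)]
left_ok = gram(left_matrix(*q)) == scaled_identity
right_ok = gram(right_matrix(*q)) == scaled_identity
```

Dividing $q$ by its norm would turn both matrices into orthogonal matrices, which is the unit-quaternion case in the text.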

7.2.24. (a)
0

1
2

4
, (b)
3

3
7.2.25. (a) B
@ 6
1
7.2.26.
(a)
(b)
(c)
(d)

(e)

1
1
1
!

2
6C
A,
0
!

43

6
, (c)
3
0

1
(b) B
@ 0
0
!

0
2
0

1
2
1

0
0C
A,
1

0
, (d)
5
(c)

0
B
@

1
0
1
5

0
52

0
, (e)
5

3
2

8
.
7

12
5 C
0 A.
15

0
2
0
!

1
0
1
2
1 0
Bases:
,
, and
,
; canonical form:
;
0
1
2
1
0 1
1 0 1
0 1 0
!
!
!
0
4
1
1
2
1 0 0
C B C
B C B
,
; canonical form:
;
bases: @ 0 A , @ 0 A , @ 4 A, and
2
1
0 0 0
3
1
0
0
1 0 1 0
1
0
1
!
!
2
3
4
1 0
0
1
C B C B
C
C
B
, and B
,
bases:
@ 0 A , @ 4 A , @ 5 A; canonical form: @ 0 1 A;
1
0
1
1
8 1 0
0 0 0
0 1 0 1 0
1
0 1 0
1
1
0
1
1
2
1
1 0
C B C B
C
B C B
C B
C
B
bases: B
@ 0 A , @ 1 A , @ 2 A, and @ 1 A , @ 1 A , @ 1 A; canonical form: @ 0 1
0
0
31 0
2
1 1 0 11 0
0 0
0 1 0 1 0
1
0
1 0
1
1
0
3
1
1
0
1
2
B C B C B
B
C
B
C B
C B
C B
C
0C B0C B 1C
C B 0C
B 2C B 1C B 1C B 1C
B C, B C , B
C, B
C, and B
C, B
C, B
C, B
C;
bases: B
@0A @1A @ 0A @ 4A
@ 1 A @ 1 A @ 1 A @ 0 A
0
00
0
1
0
1
0
1
1
1 0 0 0
B
0 1 0 0C
C
B
C.
canonical form: B
@0 0 0 0A
0 0 0 0

0
0C
A;
0

7.2.27. (a) Let $v_1, \dots, v_n$ be any basis for the domain space and choose $w_i = L[v_i]$ for $i = 1, \dots, n$. Invertibility implies that $w_1, \dots, w_n$ are linearly independent, and so form a basis for the target space. (b) Only the identity transformation, since $A = S\,I\,S^{-1} = I$.
0
!
!
!
!
!
!
1
1 C
1
0
2
0
1
0
B 2C B
2
(c) (i)
,
, and
,
. (ii)
,
, and @ 1 A , @
A.
0
1
0
2
0
1

1
(iii)

1
0

0
, and
1

1
2

0
.
1

7.2.28.
(a) Given any $v \in \mathbb{R}^n$, write $v = c_1 v_1 + \cdots + c_n v_n$; then $A\,v = c_1 A\,v_1 + \cdots + c_n A\,v_n = c_1 w_1 + \cdots + c_n w_n$, and hence $A\,v$ is uniquely defined. In particular, the value of $A\,e_i$ uniquely specifies the $i$th column of $A$.
(b) $A = C\,B^{-1}$, where $B = (\,v_1 \ v_2 \ \dots \ v_n\,)$, $C = (\,w_1 \ w_2 \ \dots \ w_n\,)$.
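The formula $A = C\,B^{-1}$ can be illustrated on a small case. The sketch below uses hypothetical data (basis vectors $v_1 = (1,1)^T$, $v_2 = (1,2)^T$ and targets $w_1 = (2,0)^T$, $w_2 = (3,1)^T$, none of which come from the exercise) and exact rational arithmetic.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix; fails if the columns are dependent."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("columns do not form a basis")
    return [[d / det, -b / det], [-c / det, a / det]]


B = [[1, 1], [1, 2]]   # columns are the hypothetical basis v1, v2
C = [[2, 3], [0, 1]]   # columns are the hypothetical images w1, w2
A = matmul(C, inverse_2x2(B))
```

Multiplying back, `matmul(A, B)` reproduces `C`, which says exactly that $A\,v_i = w_i$ for each basis vector.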
7.2.29. (a) Let Q have columns u1 , . . . , un , so Q is an orthogonal matrix. Then the matrix
representative in the orthonormal basis is B = Q1 A Q = QT A Q, and B T = QT AT (QT )T =


1
0

QT A Q = B. (b) Not necessarily. For example, if A =


!
2 0
1
is not symmetric.
S AS =
1 1

0
2

1
1

and S =

1
, then
0

7.2.30. (a) Write $\langle x, y \rangle = x^T K\, y$ where $K > 0$. Using the Cholesky factorization (3.70), write $K = M M^T$ where $M$ is invertible. Let $M^{-T} = (\,v_1 \ v_2 \ \dots \ v_n\,)$ define the basis. Then
$x = \sum_{i=1}^{n} c_i v_i = M^{-T} c$, $\quad y = \sum_{i=1}^{n} d_i v_i = M^{-T} d$,
implies that $\langle x, y \rangle = x^T K\, y = c^T M^{-1} M M^T M^{-T} d = c^T d = c \cdot d$. The basis is not unique since one can right multiply by any orthogonal matrix $Q$ to produce another one: $M^{-T} Q = (\,\tilde v_1 \ \tilde v_2 \ \dots \ \tilde v_n\,)$.
1
0 produce
!
!
!
1
2 , 0 ; (ii) 1 , B
3C
e
e
e v
M T Q = ( v
@ 2 A.
1 2 . . . vn ). (b) (i)
3
0

0
1

7.3.1.
(a) (i) The horizontal line $y = -1$; (ii) the disk $(x - 2)^2 + (y + 1)^2 \le 1$ of radius 1 centered at $(2, -1)^T$; (iii) the square $\{\, 2 \le x \le 3,\ -1 \le y \le 0 \,\}$.
(b) (i) The $x$-axis; (ii) the ellipse $\tfrac19 (x + 1)^2 + \tfrac14 y^2 \le 1$; (iii) the rectangle $\{\, -1 \le x \le 2,\ 0 \le y \le 2 \,\}$.
(c) (i) The horizontal line $y = 2$; (ii) the elliptical domain $x^2 - 4 x y + 5 y^2 + 6 x - 16 y + 12 \le 0$; (iii) the parallelogram with vertices $(1, 2)^T$, $(2, 2)^T$, $(4, 3)^T$, $(3, 3)^T$.
(d) (i) The line $x = 1$; (ii) the disk $(x - 1)^2 + y^2 \le 1$ of radius 1 centered at $(1, 0)^T$; (iii) the square $\{\, 1 \le x \le 2,\ -1 \le y \le 0 \,\}$.
(e) (i) The line $4 x + 3 y + 6 = 0$; (ii) the disk $(x + 3)^2 + (y - 2)^2 \le 1$ of radius 1 centered at $(-3, 2)^T$; (iii) the rotated square with corners $(-3, 2)$, $(-2.4, 1.2)$, $(-1.6, 1.8)$, $(-2.2, 2.6)$.
(f) (i) The line $y = x - 1$; (ii) the line segment from $\left( 1 - \tfrac{1}{\sqrt2},\ -\tfrac{1}{\sqrt2} \right)^T$ to $\left( 1 + \tfrac{1}{\sqrt2},\ \tfrac{1}{\sqrt2} \right)^T$; (iii) the line segment from $(1, 0)^T$ to $(2, 1)^T$.
(g) (i) The line $x + y + 1 = 0$; (ii) the disk $(x - 2)^2 + (y + 3)^2 \le 2$ of radius $\sqrt2$ centered at $(2, -3)^T$; (iii) the rotated square with corners $(2, -3)$, $(3, -4)$, $(4, -3)$, $(3, -2)$.
(h) (i) The line $x + y = 2$; (ii) the line segment from $\left( 1 - \sqrt5,\ 1 + \sqrt5 \right)^T$ to $\left( 1 + \sqrt5,\ 1 - \sqrt5 \right)^T$; (iii) the line segment from $(1, 1)^T$ to $(4, -2)^T$.
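Images of the unit square under an affine map $F[x] = A\,x + b$ can be found by mapping its four corners. The sketch below is an illustration only: the matrix and translation vector are assumed values chosen to be consistent with the parallelogram reported in part (c), and `affine_image` is a hypothetical helper.

```python
def affine_image(A, b, points):
    """Apply x -> A x + b to a list of points in the plane."""
    return [(A[0][0] * x + A[0][1] * y + b[0],
             A[1][0] * x + A[1][1] * y + b[1]) for x, y in points]


unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

# Assumed data: a shear along the x axis followed by a translation.
A = [[1, 2], [0, 1]]
b = (1, 2)

corners = affine_image(A, b, unit_square)
```

Because affine maps send line segments to line segments, the image of the square is the parallelogram spanned by these four image corners.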

7.3.2.

2
1

(a) T3 T4 [ x ] =

1
0

x+

2
,
2

0
1 2
2 1
=
1
0 1
1 0
!
!
3
0
1
,
(b) T4 T3 [ x ] =
x+
1
1 2
with

with

0
1

(c) T3 T6 [ x ] = @

3
2
1
2

1
2
1

3
2 Ax +
1
2

0
1
!

1
0

1
0

1
0

2
,
2


2
1

3
1

1
0

2
1

1
2

1
;
2
!

1
;
0

with
(d) T6 T3 [ x ]
with

3
@2
1
2
0
1
= @ 12
2
0
1
@2
1
2

0
4

(e) T7 T8 [ x ] =

1
3
2A= 1
1
0
2
0
1
3
2 Ax + @
3
2
0
1
1
3
2A=@2
3
1
2
2
!

0
2

!0 1
@2
1
2
1

x+

5
2 A,
3
2
1
1
2A
1
2
!

1
0

2
,
2

0
0
1 1
with
=
4 2
1 1
!
!
1
3
2
(f ) T8 T7 [ x ] =
x+
,
1 3
0
1
1

with

3
3

2
2

1
1
2 A,
1
2

2
1

2
1

1
1

2
2

2
2

1
1

1
0

1
5
@2A
3
2
0

1
1

1
1

2
1

1
@2
1
2

4
3
2
0

1
0

1
1
2A
1
2

1
2

1
1
2
2

1
;
2

1
1

1
1

4
;
3
!

2
1
+
;
1
3
2
3

1
.
1

7.3.3. (a) True. (b) True. (c) False: in general, squares are mapped to parallelograms. (d) False:
in general circles are mapped to ellipses. (e) True.
7.3.4. The triangle with vertices (1, 6), (7, 2), (1, 6).
7.3.5.
(a) if and only if their matrices are mutual inverses: $B = A^{-1}$.
(b) if and only if $c = B\,a + b = 0$, as in (7.34), so $b = -B\,a$.
7.3.6.
(a) $F[x] = A\,x + b$ has an inverse if and only if $A$ is nonsingular.
(b) Yes: $F^{-1}[x] = A^{-1} x - A^{-1} b$.
!
!
!
!
!
!
1
1
0
x
x
x
2
1
1 x
3
+ 3 ,
(c) T1
=
+
, T2
=
1
y
y
y
1
y
0
0 !
2
!
!
!
!
!
!
x
0 1
3
x
1 2
1 x
1 x
+
=
, T4
+
=
T3
y
1
0
y
2
y
0
1
y
T51

x
y

T71

x
y

=
=

.6
.8

1
@2
1
2

.8
.6

12
1
2

1
A

x
y

x
y

+
0

+@

3.4
,
1.2

1
5
2 A,
1
2

0
,
1

T6 has no inverse,
T8 has no inverse.

7.3.7.
(a) First $b = w_0 = F[0]$, while $A\,v_i = F[v_i] - b = w_i - w_0$ for $i = 1, \dots, n$. Therefore, knowing its action on the basis vectors uniquely prescribes the matrix $A$.
(b) $A = (\,w_1 - w_0 \ \ w_2 - w_0 \ \dots \ w_n - w_0\,)$ and $b = w_0$.
(c) $A = C\,B^{-1}$, where $B = (\,v_1 \ v_2 \ \dots \ v_n\,)$, $C = (\,w_1 - w_0 \ \dots \ w_n - w_0\,)$, while $b = w_0$.

7.3.8. It can be regarded as a subspace of the vector space of all functions from $\mathbb{R}^n$ to $\mathbb{R}^n$ and so one only needs to prove closure. If $F[x] = A\,x + b$ and $G[x] = C\,x + d$, then $(F + G)[x] = (A + C)\,x + (b + d)$ and $(c\,F)[x] = (c\,A)\,x + (c\,b)$ are affine for all scalars $c$. The dimension is $n^2 + n$; a basis consists of the $n^2$ linear functions $L_{ij}[x] = E_{ij}\,x$, where $E_{ij}$ is the $n \times n$ matrix with a single 1 in the $(i, j)$ entry and zeros everywhere else, along with the $n$ translations $T_i[x] = x + e_i$, where $e_i$ is the $i$th standard basis vector.


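7.3.9. (a) $\begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ 1 \end{pmatrix} = \begin{pmatrix} A\,x + b \\ 1 \end{pmatrix}$, (b) $\begin{pmatrix} B & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} A & a \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} B A & B\,a + b \\ 0 & 1 \end{pmatrix}$.
(c) The inverse of $F[x] = A\,x + b$ is $F^{-1}[y] = A^{-1}(y - b) = A^{-1} y - A^{-1} b$. The inverse matrix is $\begin{pmatrix} A & b \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} A^{-1} & -A^{-1} b \\ 0 & 1 \end{pmatrix}$.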
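The block-matrix form of an affine map makes composition a plain matrix product. The following sketch builds the $(n+1) \times (n+1)$ matrices and multiplies them; the two sample maps are hypothetical, and `homogeneous` and `matmul` are illustrative helper names.

```python
def homogeneous(A, b):
    """Embed x -> A x + b as the block matrix [[A, b], [0, 1]]."""
    n = len(A)
    return [list(A[i]) + [b[i]] for i in range(n)] + [[0] * n + [1]]


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical maps: F[x] = A x + a (applied first) and G[x] = B x + b.
A, a = [[0, -1], [1, 0]], [1, 2]
B, b = [[2, 0], [0, 3]], [5, -1]

composite = matmul(homogeneous(B, b), homogeneous(A, a))
# Top-left block is B A, last column holds B a + b.
```

Reading off the blocks of `composite` gives the linear part and the translation of $G \circ F$ without any separate bookkeeping.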

7.3.10. (a), (b), (e) are isometries


7.3.11. (a) True. (b) True if $n$ is even; false if $n$ is odd, since $\det(-I_n) = (-1)^n$.
0
1

7.3.12. Write y = F [ x ] = Q(x a) + a where Q =


T

1
0

represents a rotation through

. Thus, the vector y a = Q(x a) is obtained


an angle of 90 , and a = 32 , 12
by rotating the vector x a by 90 , and so the point y is obtained by rotating x by 90
around the point a.
7.3.13. If $Q = I$, then $F[x] = x + a$ is a translation. Otherwise, since we are working in $\mathbb{R}^2$, by Exercise 1.5.7(c), the matrix $Q - I$ is invertible. Setting $c = -(Q - I)^{-1} a$, so that $a = (I - Q)\,c$, we rewrite $y = F[x] = Q\,x + a$ as $y - c = Q\,(x - c)$, so the vector $y - c$ is obtained by rotating $x - c$ according to $Q$. We conclude that $F$ represents a rotation around the point $c$.
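The center of rotation in 7.3.13 is the fixed point of $F[x] = Q\,x + a$, found by solving $(I - Q)\,c = a$. Here is a minimal sketch for the planar case using Cramer's rule and exact arithmetic; the sample rotation and translation are assumed values, and `rotation_center` is a hypothetical helper.

```python
from fractions import Fraction


def rotation_center(Q, a):
    """Solve (I - Q) c = a for a 2x2 matrix Q by Cramer's rule."""
    m = [[1 - Q[0][0], -Q[0][1]],
         [-Q[1][0], 1 - Q[1][1]]]
    det = Fraction(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    if det == 0:
        raise ValueError("I - Q is singular, e.g. Q = I gives a translation")
    return ((m[1][1] * a[0] - m[0][1] * a[1]) / det,
            (m[0][0] * a[1] - m[1][0] * a[0]) / det)


Q = [[0, -1], [1, 0]]   # assumed example: counterclockwise rotation by 90 degrees
a = (2, 0)              # assumed translation vector
c = rotation_center(Q, a)

# The center is a fixed point: Q c + a should return c itself.
image_of_c = (Q[0][0] * c[0] + Q[0][1] * c[1] + a[0],
              Q[1][0] * c[0] + Q[1][1] * c[1] + a[1])
```

The fixed-point check at the end is the same statement as $y - c = Q\,(x - c)$ evaluated at $x = c$.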
7.3.14.
(a) F [ x ] =

(b)

!
1
2C
;
Ax; G[ x ] = x +
1
0
2
1
0
1
0
1
1
1
1

B 2
B 2C
B 2
2C
@ 1
Ax + @ 1 A = @ 1

2
2
2
2

1
B 2
@ 1

2
0

12

13
1
0
0
21 A 5 @ 21 A
C
2
4
@

F G[ x ] =
+
A x
2+1
2+1
1
2
2
2
0
1
1
2
A by 45 ;
is counterclockwise rotation around the point @ 2+1
2
1
12
0
0
13
1
0
0
!
1
1
1
1
1
1

1
C
C
B
B
2
2
2
2
2
2
4
A
5
A
@
@

=@ 1
+
G F[x] = @ 1
Ax +
A x
2+1
2+1

1
0
2
2
2
2
2
2
0
1
1
is counterclockwise rotation around the point @ 2 A by 45 .
2+1
2

1
0
0
0
1"
1
!
!#
1
1
3 3
3
3

1
1
2 A;
2 A x
2 Ax + @
=@ 2
+
F[x] = @ 2
3
3
1
1
1 3
1
1
2
2 "
2
2!
!2
!
!#
!

0
1

G[ x ] =

F G[ x ] = @

1
0

1
2
3
2

2
1

1
3
2 Ax +
12

2
1

0
1

1
0

x+

1
;
3

12
13
1
0
0
1 3
1 3
3
2 A4x @
2 A 5 + @
2 A
1+ 3
1+ 3
12
2
2
2
1
0
1 3
2 A by 120 ;
point @
1+ 3
2

! 0
1
3

= @ 2
3
3

is counterclockwise rotation around the


G F[x] = @

1
1
0
3+ 3
3
2 Ax + @
2 A
9 3
12
2

1
2
3
2

=@

is counterclockwise rotation around the point


(c) F [ x ] =
G[ x ] =

0
1

1
;
1

1
x+
0

1
0

0
1

!"

1
1

1
0

0
1

x+

2
;
2

1
3

is a glide reflection (see Exercise 7.3.16) along the line

3
1

is a glide reflection (see Exercise 7.3.16) along the line

0 1
x+
1
0
y = x 1 by a distance 2.

G F[x] =

1
1

0 1
x+
1
0
y = x + 1 by a distance 2;

F G[ x ] =

!#

13
1
12
0
0
1 3
1 3
3
2 A4x @
2 A 5 + @
2 A
5 3
5 3
12
2
2
1
1 3
2 A by 120 .
@
5 3
2

1
2
3
2
0

7.3.15.
(a) If $F[x] = Q\,x + a$ and $G[x] = R\,x + b$, then $G \circ F[x] = R\,Q\,x + (R\,a + b) = S\,x + c$ is an isometry since $S = R\,Q$, the product of two orthogonal matrices, is also an orthogonal matrix.
(b) If $F[x] = x + a$ and $G[x] = x + b$, then $G \circ F[x] = x + (a + b) = x + c$.
(c) Using Exercise 7.3.13, the rotation $F[x] = Q\,x + a$ has $Q \ne I$, while $G[x] = x + b$ is the translation. Then $G \circ F[x] = Q\,x + (a + b) = Q\,x + c$, and $F \circ G[x] = Q\,x + (a + Q\,b) = Q\,x + \tilde c$ are both rotations.
(d) From part (a), $G \circ F[x] = R\,Q\,x + (R\,a + b) = S\,x + c$ is a translation if and only if $R = Q^{-1}$, i.e., the two rotations are by the opposite angle.
(e) Write $x + c = G \circ F[x]$ where $F[x] = Q\,x$ and $G[x] = Q^{-1} x + c$ for any $Q \ne I$.
7.3.16. (a)
(c)

1
0
0
1

0
1

1
0

x+
!

2
0

=
1
0

x+2
,
y

!!

1
0

0 1
1 0
!
2 =
2

(b)
+

1
3
2C
x+B
@ 3 A=

2
0

y+
B
@

x+

y + 1 + 2 .
x + 1 2

1
3
2C
A,
3
2

7.3.17.
(a) $F[x] = R\,(x - a) + a$ where $R = 2\,u\,u^T - I$ is the elementary reflection matrix corresponding to the line in the direction of $u$ through the origin.
(b) $G[x] = R\,(x - a) + a + d\,u$, where $R = 2\,u\,u^T - I$ is the same reflection matrix.
(c) Let $F[x] = R\,x + b$ with $R$ an improper orthogonal matrix. According to (5.31) and Exercise 7.2.10, $R$ represents a reflection through a line $L = \ker(I - R)$ that goes through the origin. The affine map is a reflection through a line $\ell$ parallel to $L$ and passing through a point $a$ if it takes the form $F[x] = R\,(x - a) + a$, which requires $b \in \operatorname{rng}(I - R) = L^{\perp}$. Otherwise, we decompose $b = c + e$ with $c = (I - R)\,a \in L^{\perp}$ and $0 \ne e \in L = \ker(I - R)$. Then the affine map takes the form $F[x] = R\,(x - a) + a + e$, which is a glide reflection along the line $\ell$ by a distance $\| e \|$.
7.3.18.
(a) Let $A = \{\, x + b \mid x \in W \,\} \subsetneq \mathbb{R}^n$ be the affine subspace. If $a_i = x_i + b \in A$ for all $i$, then $a_i - a_j = x_i - x_j \in W$ all belong to a proper subspace of $\mathbb{R}^n$ and so they cannot span $\mathbb{R}^n$. Conversely, let $W$ be the span of all $a_i - a_j$. Then we can write $a_i = x_i + b$ where $x_i = a_i - a_1 \in W$ and $b = a_1$, and so all $a_i$ belong to the affine subspace $A$.
(b) Let $v_i = a_i - a_0$, $w_i = b_i - b_0$, for $i = 1, \dots, n$. Then, by the assumption, $v_i \cdot v_i = \| v_i \|^2 = \| a_i - a_0 \|^2 = \| b_i - b_0 \|^2 = \| w_i \|^2 = w_i \cdot w_i$ for all $i = 1, \dots, n$, while $\| v_i \|^2 - 2\, v_i \cdot v_j + \| v_j \|^2 = \| v_i - v_j \|^2 = \| a_i - a_j \|^2 = \| b_i - b_j \|^2 = \| w_i - w_j \|^2 = \| w_i \|^2 - 2\, w_i \cdot w_j + \| w_j \|^2$, and hence $v_i \cdot v_j = w_i \cdot w_j$ for all $i \ne j$. Thus, we have verified the hypotheses of Exercise 5.3.20, and so there is an orthogonal matrix $Q$ such that $w_i = Q\,v_i$ for $i = 1, \dots, n$. Therefore, $b_i = w_i + b_0 = Q\,v_i + b_0 = Q\,a_i + (b_0 - Q\,a_0) = F[a_i]$ where $F[x] = Q\,x + c$ with $c = b_0 - Q\,a_0$ is the desired isometry.
7.3.19. First,
$\| v + w \|^2 = \| v \|^2 + 2\,\langle v, w \rangle + \| w \|^2$,
$\| L[v + w] \|^2 = \| L[v] + L[w] \|^2 = \| L[v] \|^2 + 2\,\langle L[v], L[w] \rangle + \| L[w] \|^2$.
If $L$ is an isometry, $\| L[v + w] \| = \| v + w \|$, $\| L[v] \| = \| v \|$, $\| L[w] \| = \| w \|$. Thus, equating the previous two formulas, we conclude that $\langle L[v], L[w] \rangle = \langle v, w \rangle$.

7.3.20. First, if $L$ is an isometry, and $\| u \| = 1$ then $\| L[u] \| = 1$, proving that $L[u] \in S_1$. Conversely, if $L$ preserves the unit sphere, and $0 \ne v \in V$, then $u = v / r \in S_1$ where $r = \| v \|$, so $\| L[v] \| = \| L[r\,u] \| = \| r\,L[u] \| = r\,\| L[u] \| = r = \| v \|$, proving (7.37).
7.3.21.
(a) All affine transformations $F[x] = Q\,x + b$ where $b$ is arbitrary and $Q$ is a symmetry of the unit square, and so a rotation by 0, 90, 180 or 270 degrees, or a reflection in the $x$ axis, the $y$ axis, the line $y = x$ or the line $y = -x$.
(b) Same form, but where $Q$ is one of 48 symmetries of the unit cube, consisting of 24 rotations and 24 reflections. The rotations are the identity, the 9 rotations by 90, 180 or 270 degrees around the coordinate axes, the 6 rotations by 180 degrees around the 6 lines through opposite edges, e.g., $x = \pm y$, $z = 0$, and the 8 rotations by 120 or 240 degrees around the 4 diagonal lines $x = \pm y = \pm z$. The reflections are obtained by multiplying each of the rotations by $-I$, and represent either a reflection through the origin, in the case of $-I$, or a reflection through the plane orthogonal to the axis of the rotation in the other cases.
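The count of 48 linear symmetries can be reproduced by brute force. The sketch below assumes the standard fact that the orthogonal matrices preserving a cube centered at the origin are exactly the signed permutation matrices; `det3` and `cube_symmetries` are hypothetical helper names.

```python
from itertools import permutations, product


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def cube_symmetries():
    """All 3x3 signed permutation matrices (assumed symmetry group of the cube)."""
    mats = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            mats.append([[signs[i] if j == perm[i] else 0 for j in range(3)]
                         for i in range(3)])
    return mats


symmetries = cube_symmetries()
rotations = [m for m in symmetries if det3(m) == 1]
reflections = [m for m in symmetries if det3(m) == -1]
```

Splitting by the sign of the determinant separates the proper rotations from the improper symmetries, matching the 24 and 24 split stated in part (b).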
7.3.22. Same answer as previous exercise. Now the transformation must preserve the unit diamond/octahedron, which has the same (linear) symmetries as the unit square/cube.
7.3.23.

2
2
(a) q(H x) = (x cosh + y sinh ) (x sinh + y cosh )
= (cosh2 sinh2 )(x2 y 2 ) = x2 y 2 = q(x).

(b) (a x + b y + e)2 (c x + d y + f )2 = x2 y 2 if and only if a2 c2 = 1, d2 b2 = 1,


a b = c d, e = f = 0. Thus, a = cosh , c = sinh , d = cosh , c = sinh ,
and sinh( ) = 0, and so = . Thus, the complete collection of linear (and affine)
transformations preserving q(x) is
cosh
sinh

sinh
,
cosh

cosh
sinh
!

sinh
,
cosh

cosh
sinh

sinh
,
cosh
!

cosh
sinh

T
T
y
y
x
, 0
, (b) q =
,
. The maps are nonlinear
1y
1+yx 1+yx
not affine; they are not isometries because distance between points is not preserved.

7.3.24. (a) q =

7.4.1.
(a) L(x) = 3 x; domain R; target R; right hand side 5; inhomogeneous.

sinh
.
cosh

(b) L(x, y, z) = x y z;! domain R 3 ; target R; right hand side 0; homogeneous.


!
u 2v
3
(c) L(u, v, w) =
; domain R 3 ; target R 2 ; right hand side
; inhomogeneous.
vw
1
!
!
3p 2q
0
(d) L(p, q) =
; domain R 2 ; target R 2 ; right hand side
; homogeneous.
p+q
0
(e) L[ u ] = u0 (x) + 3 x u(x); domain C1 (R); target C0 (R); right hand side 0; homogeneous.
1
(f ) L[ u ] = u0 (x); domain C
(R); target C0 (R); right hand side 3 x; inhomogeneous.
!
!
u0 (x) u(x)
0
(g) L[ u ] =
; domain C1 (R); target C0 (R) R; right hand side
;
u(0)
1
inhomogeneous.
!
!
u00 (x) u(x)
ex
2
0
(h) L[ u ] =
;
; domain C (R); target C (R) R; right hand side
0
u(0) 3 u(1)
inhomogeneous.
0
1
0
1
3x
u00 (x) + x2 u(x)
C
B
C
2
0
2
(i) L[ u ] = B
@
A; domain C (R); target C (R)R ; right hand side @ 1 A;
u(0)
0
0
u (0)
inhomogeneous.
!
u0 (x) v(x)
(j) L[ u, v ] =
; domain C1 (R) C1 (R); target C0 (R) C0 (R); right
2 u(x) + v 0 (x)
!
0
hand side
; homogeneous.
0
(k) L[ u, v ] =

0 00
1
u (x) v 00 (x) 2 u(x) + v(x)
B
C
u(0) v(0)
@
A;

u(1) v(1)
1
0
C
right hand side B
@ 0 A; homogeneous.
0
0

domain C2 (R)C2 (R); target C0 (R) R 2 ;

Z x

u(y) dy; domain C0 (R); target C0 (R); right hand side the constant
(l) L[ u ] = u(x) + 3
0
functionZ 1; inhomogeneous.

u(t) es t dt; domain C0 (R); target C0 (R); right hand side 1 + s2 ; inhomo(m) L[ u ] =
0
geneous.
(n) L[ u ] =

Z 1

u(x) dx u

0Z

1
2 ;
Z 1

domain C0 (R); target R; right hand side 0; homogeneous.

y v(y) dy; domain C0 (R) C0 (R); target R; right hand side


u(y) dy
(o) L[ u, v ] =
0
0
0; homogeneous.
u
u
(p) L[ u ] =
+2
; domain C1 (R 2 ); target C0 (R 2 ); right hand side the constant
t
x
function 1; inhomogeneous.
!
u/x v/y
(q) L[ u ] =
; domain C1 (R 2 ) C1 (R 2 ); target C0 (R 2 ); right hand side
u/y + v/x
the constant vector-valued function 0; homogeneous.
2u
2u

; domain C2 (R 2 ); target C0 (R 2 ); right hand side x2 + y 2 1;


(r) L[ u ] =
x2
y 2
inhomogeneous.
7.4.2. $L[u] = u(x) + \int_a^b K(x, y)\, u(y)\, dy$. The domain is $C^0(\mathbb{R})$ and the target is $\mathbb{R}$. To show linearity, for constants $c, d$,
$L[c\,u + d\,v] = [\,c\,u(x) + d\,v(x)\,] + \int_a^b K(x, y)\, [\,c\,u(y) + d\,v(y)\,]\, dy$
$= c \left( u(x) + \int_a^b K(x, y)\, u(y)\, dy \right) + d \left( v(x) + \int_a^b K(x, y)\, v(y)\, dy \right) = c\,L[u] + d\,L[v]$.
7.4.3. $L[u] = u(t) + \int_a^t K(t, s)\, u(s)\, ds$. The domain is $C^0(\mathbb{R})$ and the target is $C^0(\mathbb{R})$. To show linearity, for any constants $c, d$,
$L[c\,u + d\,v] = [\,c\,u(t) + d\,v(t)\,] + \int_a^t K(t, s)\, [\,c\,u(s) + d\,v(s)\,]\, ds$
$= c \left( u(t) + \int_a^t K(t, s)\, u(s)\, ds \right) + d \left( v(t) + \int_a^t K(t, s)\, v(s)\, ds \right) = c\,L[u] + d\,L[v]$.
7.4.4.
(a) Since $a$ is constant, by the Fundamental Theorem of Calculus,
$\dfrac{du}{dt} = \dfrac{d}{dt} \left( a + \int_0^t k(s)\, u(s)\, ds \right) = k(t)\, u(t)$. Moreover, $u(0) = a + \int_0^0 k(s)\, u(s)\, ds = a$.
2
t
(b) (i) u(t) = 2 e t , (ii) u(t) = et 1 , (iii) u(t) = 3 ee 1 .
7.4.5. True, since the equation can be written as $L[x] + b = c$, or $L[x] = c - b$.

7.4.6.
(a)
(b)
(c)
(d)

u(x) = c1 e2 x + c2 e 2 x , dim = 2;
u(x) = c1 e4 x + c2 e2 x , dim = 2;
u(x) = c1 + c2 e3 x + c3 e 3 x , dim = 3;
u(x) = c1 e 3 x + c2 e 2 x + c3 e x + c4 e2 x , dim = 4.

7.4.7.
(a) If $y \in C^2[a, b]$, then $y'' \in C^0[a, b]$ and so $L[y] = y'' + y \in C^0[a, b]$. Further,
$L[c\,y + d\,z] = (c\,y + d\,z)'' + (c\,y + d\,z) = c\,(y'' + y) + d\,(z'' + z) = c\,L[y] + d\,L[z]$.
(b) $\ker L$ is the span of the basic solutions $\cos x$, $\sin x$.
7.4.8. (a) If $y \in C^2[a, b]$, then $y' \in C^1[a, b]$, $y'' \in C^0[a, b]$ and so $L[y] = 3\,y'' - 2\,y' - 5\,y \in C^0[a, b]$. Further, $L[c\,y + d\,z] = 3\,(c\,y + d\,z)'' - 2\,(c\,y + d\,z)' - 5\,(c\,y + d\,z) = c\,(3\,y'' - 2\,y' - 5\,y) + d\,(3\,z'' - 2\,z' - 5\,z) = c\,L[y] + d\,L[z]$. (b) $\ker L$ is the span of the basic solutions $e^{-x}$, $e^{5x/3}$.
7.4.9.
(a) $p(D) = D^3 + 5 D^2 + 3 D - 9$.
(b) $e^x$, $e^{-3x}$, $x\,e^{-3x}$. The general solution is $y(x) = c_1 e^x + c_2 e^{-3x} + c_3\, x\, e^{-3x}$.
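The basis in part (b) corresponds to the characteristic polynomial having a simple root at 1 and a double root at $-3$. The short sketch below checks this by Horner evaluation, assuming the polynomial $r^3 + 5r^2 + 3r - 9$ read off from part (a); `poly_eval` is a hypothetical helper.

```python
def poly_eval(coeffs, r):
    """Evaluate a polynomial by Horner's rule; coeffs run from highest degree down."""
    value = 0
    for c in coeffs:
        value = value * r + c
    return value


p = [1, 5, 3, -9]     # r^3 + 5 r^2 + 3 r - 9, assumed from part (a)
dp = [3, 10, 3]       # derivative 3 r^2 + 10 r + 3

simple_root = poly_eval(p, 1) == 0
double_root = poly_eval(p, -3) == 0 and poly_eval(dp, -3) == 0
```

A root where both the polynomial and its derivative vanish has multiplicity at least two, which is what produces the extra solution with the factor $x$.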
7.4.10. (a) Minimal order 2: $u'' + u' - 6\,u = 0$.
(b) minimal order 2: $u'' + u' = 0$.
(c) minimal order 2: $u'' - 2\,u' + u = 0$.
(d) minimal order 3: $u''' - 6\,u'' + 11\,u' - 6\,u = 0$.
7.4.11. (a) u = c1 x +
(d) u = c1 | x |

c2
, (b) u = c1 x2 + c2
x5

+ c2 | x |

7.4.12. u = c1 x + c2 | x |

|x|

, (c) u = c1 | x |(1+

5)/2

, (e) u = c1 x3 + c2 x 1/3 , (f ) v = c1 +

+ c 3 | x |

+ c2 | x |(1

5)/2

c2
.
x

. There is a three-dimensional
solution space for x > 0;

only those in the two-dimensional subspace spanned by x, | x |


tiable at x = 0.

are continuously differen-

7.4.13.
(i) Using the chain rule, with $x = e^t$,
$\dfrac{dv}{dt} = e^t\, \dfrac{du}{dx} = x\, \dfrac{du}{dx}$, $\quad \dfrac{d^2 v}{dt^2} = e^{2t}\, \dfrac{d^2 u}{dx^2} + e^t\, \dfrac{du}{dx} = x^2\, \dfrac{d^2 u}{dx^2} + x\, \dfrac{du}{dx}$,
and so $v(t)$ solves $a\, \dfrac{d^2 v}{dt^2} + (b - a)\, \dfrac{dv}{dt} + c\,v = 0$.
(ii) In all cases, $u(x) = v(\log x)$ gives the solutions in Exercise 7.4.11.
(a) $v'' + 4\,v' - 5\,v = 0$, with solution $v(t) = c_1 e^{t} + c_2 e^{-5t}$.
(b) $2\,v'' - 3\,v' - 2\,v = 0$, with solution $v(t) = c_1 e^{2t} + c_2 e^{-t/2}$.
(c) $v'' - v' - v = 0$, with solution $v(t) = c_1 e^{\frac12 (1 + \sqrt5)\, t} + c_2 e^{\frac12 (1 - \sqrt5)\, t}$.
(d) $v'' - 3\,v = 0$, with solution $v(t) = c_1 e^{\sqrt3\, t} + c_2 e^{-\sqrt3\, t}$.
(e) $3\,v'' - 8\,v' - 3\,v = 0$, with solution $v(t) = c_1 e^{3t} + c_2 e^{-t/3}$.
(f) $v'' + v' = 0$, with solution $v(t) = c_1 + c_2 e^{-t}$.
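The change of variables turns an Euler equation $a\,x^2 u'' + b\,x\,u' + c\,u = 0$ into the constant coefficient equation with characteristic polynomial $a\,r^2 + (b - a)\,r + c$, whose roots are the exponents in $u = x^r$. Below is a minimal sketch for the case of real roots; the function name `euler_exponents` and the sample coefficients are assumptions made for illustration.

```python
import math


def euler_exponents(a, b, c):
    """Real roots r of a r^2 + (b - a) r + c = 0, so that u = x^r solves
    a x^2 u'' + b x u' + c u = 0 for x > 0."""
    middle = b - a
    disc = middle * middle - 4 * a * c
    if disc < 0:
        raise ValueError("complex exponents are not handled in this sketch")
    root = math.sqrt(disc)
    return ((-middle + root) / (2 * a), (-middle - root) / (2 * a))


# Assumed sample: x^2 u'' + 5 x u' - 5 u = 0, which becomes v'' + 4 v' - 5 v = 0.
exponents = euler_exponents(1, 5, -5)
```

For this sample the exponents are 1 and $-5$, consistent with the solution $v(t) = c_1 e^{t} + c_2 e^{-5t}$ in part (a).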
7.4.14.
(a) $v(t) = c_1 e^{r t} + c_2\, t\, e^{r t}$, so $u(x) = c_1 | x |^r + c_2 | x |^r \log | x |$.
(b) (i) $u(x) = c_1 x + c_2\, x \log | x |$, (ii) $u(x) = c_1 + c_2 \log | x |$.

7.4.15. $v'' - 4\,v = 0$, so $u(x) = c_1\, \dfrac{e^{2x}}{x} + c_2\, \dfrac{e^{-2x}}{x}$. The solutions with $c_1 + c_2 = 0$ are continuously differentiable at $x = 0$, but only the zero solution is twice continuously differentiable.
7.4.16. True if S is a connected interval. If S is disconnected, then D[ u ] = 0 implies u is constant on each connected component. Thus, the dimension of ker D equals the number of
connected components of S.
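7.4.17. For $u = \log(x^2 + y^2)$, we compute $\dfrac{\partial^2 u}{\partial x^2} = \dfrac{2 y^2 - 2 x^2}{(x^2 + y^2)^2} = -\dfrac{\partial^2 u}{\partial y^2}$. Similarly, when $v = \dfrac{x}{x^2 + y^2}$, then $\dfrac{\partial^2 v}{\partial x^2} = \dfrac{2 x^3 - 6 x y^2}{(x^2 + y^2)^3} = -\dfrac{\partial^2 v}{\partial y^2}$. Or simply notice that $v = \dfrac12\, \dfrac{\partial u}{\partial x}$, and so if $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = 0$, then $\dfrac{\partial^2 v}{\partial x^2} + \dfrac{\partial^2 v}{\partial y^2} = \dfrac12\, \dfrac{\partial}{\partial x} \left( \dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} \right) = 0$.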

7.4.18. u = c1 + c2 log r. The solutions form a two-dimensional vector space.

7.4.19. $\log\bigl[\, c\,(x - a)^2 + d\,(y - b)^2 \,\bigr]$. Not a vector space!


7.4.20.
(a) $\Delta[\, e^x \cos y \,] = \dfrac{\partial^2}{\partial x^2}\, e^x \cos y + \dfrac{\partial^2}{\partial y^2}\, e^x \cos y = e^x \cos y - e^x \cos y = 0$.
(b) $p_2(x, y) = 1 + x + \frac12 x^2 - \frac12 y^2$ satisfies $\Delta p_2 = 0$.
(c) Same for $p_3(x, y) = 1 + x + \frac12 x^2 - \frac12 y^2 + \frac16 x^3 - \frac12 x y^2$.
(d) If $u(x, y)$ is harmonic, then any of its Taylor polynomials are also harmonic. To prove this, we write $u(x, y) = p_n(x, y) + r_n(x, y)$, where $p_n(x, y)$ is the Taylor polynomial of degree $n$ and $r_n(x, y)$ is the remainder. Then $\Delta u(x, y) = \Delta p_n(x, y) + \Delta r_n(x, y)$, where $\Delta p_n(x, y)$ is a polynomial of degree $n - 2$, and hence the Taylor polynomial of degree $n - 2$ for $\Delta u$, while $\Delta r_n(x, y)$ is the remainder. If $\Delta u = 0$, then its Taylor polynomial $\Delta p_n = 0$ also, and hence $p_n$ is a harmonic polynomial.
(e) The Taylor polynomial of degree 4 is $p_4(x, y) = -2x - x^2 + y^2 - \frac23 x^3 + 2 x y^2 - \frac12 x^4 + 3 x^2 y^2 - \frac12 y^4$, which is harmonic: $\Delta p_4 = 0$.
7.4.21.
(a) Basis: $1,\ x,\ y,\ z,\ x^2 - y^2,\ x^2 - z^2,\ xy,\ xz,\ yz$; dimension = 9.
(b) Basis: $x^3 - 3xy^2,\ x^3 - 3xz^2,\ y^3 - 3x^2y,\ y^3 - 3yz^2,\ z^3 - 3x^2z,\ z^3 - 3y^2z,\ xyz$; dimension = 7.


7.4.22. u = c1 + c2 /r. The solutions form a two-dimensional vector space.
7.4.23.
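(a) If $x \in \ker M$, then $L \circ M[x] = L[M[x]] = L[0] = 0$ and so $x \in \ker (L \circ M)$.
(b) For example, if $L = O$, but $M \neq O$, then $\ker(L \circ M)$ is the entire domain, which is not equal to $\ker M$.
Other examples: $L = M = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, and $L = M = D$, the derivative function.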

3
7
0
B 2C
B5 C
B C
B
B
C
6 C. (c) x = B 3 C
C
C; kernel element is 0.
7.4.24. (a) Not in the range. (b) x = B
@ 1 A + zB
@4 A
@5 A
0
1
0
0
2 0 1
13
0
1
3
2
2
B
6 B C
C7
B
0C
B 0C7
6 B1C
C
C 7. (e) Not in the range.
B
C + 6 yB C + wB
(d) x = B
@ 3 A 5
4 @0A
@ 2A
1
0
0
0

2 0

13

0
1
2
B C7
C
6 B
C
B
B1C7
6 B 1 C
B 1C
C + w B C 7.
C + 6 zB
(f ) x = B
@0A5
4 @ 1A
@ 0A
1
0
0

7.4.25. (a) x = 1, y = 3, unique; (b) x = 71 + 37 z, y = 47 + 27 z, not unique; (c) no solution;


(d) u = 2, v = 1, w = 0, unique; (e) x = 2 + 4 w, y = 2 w, z = 1 6 w, not unique.
7.4.26.
(a) u(x) =
(b) u(x) =
(c) u(x) =

4x
1
11
16 4 x + c e ,
2 x/5
1 x
cos 45 x + c2 e2 x/5 sin 45
6 e sin x + c1 e
3
x
3
x
1
19 e + c1 + c2 e3 x .
3 xe

x,

7.4.27.
(a) u(x) = 41 ex 41 e4 3 x ,
(b) u(x) = 14 41 cos 2 x,
1 x
(c) u(x) = 49 e2 x 12 ex + 18
e
31 x e x ,
1
11 x
9 x
1
(d) u(x) = 10 cos x + 5 sin x + 10 e
cos 2 x + 10
e
sin 2 x,
1 x
1
3
(e) u(x) = x 1 + 2 e + 2 cos x + 2 sin x.

sin 2 x

7.4.28. (a) Unique solution: u(x) = x


; (b) no solution; (c) unique solution:
sin 2
u(x) = x + (x 1) ex ; (d) infinitely many solutions: u(x) = 21 + c e x sin x; (e) unique
3 e2 5 x 3 e 5 2 x
e + 2
e ; (f ) no solution; (g) unique solution:
solution: u(x) = 2 x + 3 2
e e
e e
5 x2
36

; (h) infinitely many solutions: u(x) = c (x x2 ).


u(x) =
31 x2
31
c
7.4.29. (a) u(x) = 12 x log x + c1 x + 2 , (b) u(x) = 21 log x + 34 + c1 x + c2 x2 ,
x
c
3
(c) u(x) = 1 8 x + c1 x5 + 2 .
x
7.4.30. (a) If $b \in \operatorname{rng}(L \circ M)$, then $b = L \circ M[x]$ for some $x$, and so $b = L[M[x]] = L[y]$, with $y = M[x]$, belongs to $\operatorname{rng} L$. (b) If $M = O$, but $L \neq O$, then $\operatorname{rng}(L \circ M) = \{0\} \neq \operatorname{rng} L$.
7.4.31.
(a) First, $Y \subset \operatorname{rng} L$ since every $y \in Y$ can be written as $y = L[w]$ for some $w \in W \subset U$, and so $y \in \operatorname{rng} L$. If $y_1 = L[w_1]$ and $y_2 = L[w_2]$ are elements of $Y$, then so is $c\,y_1 + d\,y_2 = L[c\,w_1 + d\,w_2]$ for any scalars $c, d$ since $c\,w_1 + d\,w_2 \in W$, proving that $Y$ is a subspace.
(b) Suppose $w_1, \ldots, w_k$ form a basis for $W$, so $\dim W = k$. Let $y = L[w] \in Y$ for $w \in W$. We can write $w = c_1 w_1 + \cdots + c_k w_k$, and so, by linearity, $y = c_1 L[w_1] + \cdots + c_k L[w_k]$. Therefore, the $k$ vectors $y_1 = L[w_1], \ldots, y_k = L[w_k]$ span $Y$, and hence, by Proposition 2.33, $\dim Y \leq k$.
7.4.32. If $z \in \ker L$ then $L[z] = 0 \in \ker L$, which proves invariance. If $y = L[x] \in \operatorname{rng} L$ then $L[y] \in \operatorname{rng} L$, which proves invariance.

7.4.33. (a) $\{0\}$, the $x$ axis, the $y$ axis, $\mathbb{R}^2$; (b) $\{0\}$, the $x$ axis, $\mathbb{R}^2$; (c) If $\theta \neq 0, \pi$, then the only invariant subspaces are $\{0\}$ and $\mathbb{R}^2$. On the other hand, $R_0 = I$, $R_\pi = -I$, and so in these cases every subspace is invariant.

7.4.34.
(a) If $L$ were invertible, then the solution to $L[x] = b$ would be unique, namely $x = L^{-1}[b]$. But according to Theorem 7.38, we can add in any element of the kernel to get another solution, which would violate uniqueness.
(b) If $b \notin \operatorname{rng} L$, then we cannot solve $L[x] = b$, and so the inverse cannot exist.
(c) On a finite-dimensional vector space, every linear function is equivalent to multiplication by a square matrix. If the kernel is trivial, then the matrix is nonsingular, and hence invertible. An example: the integral operator $I[f(x)] = \displaystyle\int_0^x f(y)\,dy$ on $V = C^0$ has trivial kernel, but is not invertible because any function with $g(0) \neq 0$ does not lie in the range of $I$ and hence $\operatorname{rng} I \neq V$.

7.4.35.
(a) u(x) = 21 + 52 cos x + 15 sin x + c e 2 x ,
1
sin x + c1 e3 x + c2 e 3 x ,
(b) u(x) = 91 x 10
1
(c) u(x) = 10
+ 18 ex cos x + c1 ex cos 31 x + c2 ex sin 31 x,
1
1 x
e + 41 e x + c1 ex + c2 e 2 x ,
(d) u(x) = 6 x ex 18
1 3x
(e) u(x) = 19 x + 54
e + c1 + c2 cos 3 x + c3 sin 3 x.
7.4.36. (a) u(x) = 5 x + 5 7 ex1 , (b) u(x) = c1 (x + 1) + c2 ex .

7.4.37. u(x) = 7 cos x 3 sin x.


7.4.38. $u'' + x\,u = 2$, $u(0) = a$, $u(1) = b$, for any $a, b$.
1
7.4.39. (a) u(x) = 91 x + cos 3 x + 27
sin 3 x, (b) u(x) = 12 (x2 3 x + 2) e4 x ,
3
(c) u(x) = 3 cos 2 x+ 10
sin 2 x 15 sin 3 x, (d) u(x) = 1 21 (x+1) ex + 12 (x2 +2 x1) ex1 .

7.4.40. $u(x, y) = \frac14 (x^2 + y^2) + \frac1{12} (x^4 + y^4)$.

7.4.41.
(a) If $u = v\,u_1$, then $u' = v' u_1 + v\,u_1'$, $u'' = v'' u_1 + 2 v' u_1' + v\,u_1''$, and so $0 = u'' + a\,u' + b\,u = u_1 v'' + (2 u_1' + a\,u_1)\, v' + (u_1'' + a\,u_1' + b\,u_1)\, v = u_1 w' + (2 u_1' + a\,u_1)\, w$, where $w = v'$ and the coefficient of $v$ vanishes because $u_1$ is a solution. This is a first order ordinary differential equation for $w$.


x
x2
x2
(b) (i) u(x) = c1 ex + c2 x ex , (ii) u(x)
=
c
(x

1)
+
c
e
,
(iii)
u(x)
=
c
e
+
c
x
e
,
1
2
1
2
Z
(iv ) u(x) = c1 ex

/2

+ c 2 ex

e x dx.

/2

7.4.42. We use linearity to compute
$L[u^\star] = L[c_1 u_1^\star + \cdots + c_k u_k^\star] = c_1 L[u_1^\star] + \cdots + c_k L[u_k^\star] = c_1 f_1 + \cdots + c_k f_k$,
and hence $u^\star$ is a particular solution to the differential equation (7.66). The second part of the theorem then follows from Theorem 7.38.

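7.4.43. An example: Let $A = \begin{pmatrix} 1 & -2 \\ i & -2i \end{pmatrix}$. Then $\ker A$ consists of all vectors $c \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ where $c = a + i\,b$ is any complex number. Then its real part $a \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ and imaginary part $b \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ are also solutions to the homogeneous system.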


7.4.44.
(a) $u(x) = c_1 \cos 2x + c_2 \sin 2x$,
(b) $u(x) = c_1 e^{-3x} \cos x + c_2 e^{-3x} \sin x$,
(c) $u(x) = c_1 e^{x} + c_2 e^{-x/2} \cos \frac{\sqrt3}{2} x + c_3 e^{-x/2} \sin \frac{\sqrt3}{2} x$,
(d) $u(x) = c_1 e^{x/\sqrt2} \cos \frac{1}{\sqrt2} x + c_2 e^{-x/\sqrt2} \cos \frac{1}{\sqrt2} x + c_3 e^{x/\sqrt2} \sin \frac{1}{\sqrt2} x + c_4 e^{-x/\sqrt2} \sin \frac{1}{\sqrt2} x$,
(e) u(x) = c1 cos 2 x+ c2 sin 2 x + c3 cos 3 x + c4 sin 3 x,
(f) u(x) = c1 x cos( 2 log | x |) + c2 x sin( 2 log | x |),
(g) $u(x) = c_1 x^2 + c_2 \cos(2 \log |x|) + c_3 \sin(2 \log |x|)$.

7.4.45.
(a) Minimal order 2: $u'' + 2u' + 10u = 0$;
(b) minimal order 4: $u^{(iv)} + 2u'' + u = 0$;
(c) minimal order 5: $u^{(v)} + 4u^{(iv)} + 14u''' + 20u'' + 25u' = 0$;
(d) minimal order 4: $u^{(iv)} + 5u'' + 4u = 0$;
(e) minimal order 6: $u^{(vi)} + 3u^{(iv)} + 3u'' + u = 0$.

7.4.46.
(a) u(x) = c e i x = c cos x + i c sin
x.

x
( i 1) x
(b) u(x) = c1 e + c2 e
= c1 ex + c2 e x cos x + i e x sin x,
(c)

u(x) = c1 e(1+ i ) x/
h

= c1 ex/

cos

+ c2 e (1+ i ) x/
+ c2 e x/

cos

)
2

+ i c1 ex/

sin

c2 e x/

sin

7.4.47. (a) $x^4 - 6x^2y^2 + y^4$, $4x^3y - 4xy^3$. (b) The polynomial $u(x, y) = a\,x^4 + b\,x^3 y + c\,x^2 y^2 + d\,x y^3 + e\,y^4$ solves $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = (12a + 2c)\,x^2 + (6b + 6d)\,xy + (2c + 12e)\,y^2 = 0$ if and only if $12a + 2c = 6b + 6d = 2c + 12e = 0$. The general solution to this homogeneous linear system is $a = e$, $b = -d$, $c = -6e$, where $d, e$ are the free variables. Thus, $u(x, y) = e\,(x^4 - 6x^2y^2 + y^4) - \frac14 d\,(4x^3y - 4xy^3)$.
7.4.48. (a) $\dfrac{\partial u}{\partial t} = -k^2 e^{-k^2 t + i k x} = \dfrac{\partial^2 u}{\partial x^2}$; (b) $e^{-k^2 t - i k x}$; (c) $e^{-k^2 t} \cos kx$, $e^{-k^2 t} \sin kx$.
(d) Yes. When $k = a + i\,b$ is complex, we obtain the real solutions
$e^{(b^2 - a^2)t - bx} \cos(ax - 2abt)$, $e^{(b^2 - a^2)t - bx} \sin(ax - 2abt)$ from $e^{-k^2 t + i k x}$, along with
$e^{(b^2 - a^2)t + bx} \cos(ax + 2abt)$, $e^{(b^2 - a^2)t + bx} \sin(ax + 2abt)$ from $e^{-k^2 t - i k x}$.
(e) All those in part (a), as well as those in part (b) for $|a| > |b|$ which, when $b = 0$, include those in part (a).

7.4.49. $u(t, x) = e^{k^2 t + k x}$, $u(t, x) = e^{-k^2 t + k x}$, where $k$ is any real or complex number. When $k = a + i\,b$ is complex, we obtain the four independent real solutions
$e^{(a^2 - b^2)t + ax} \cos(bx + 2abt)$, $e^{(a^2 - b^2)t + ax} \sin(bx + 2abt)$,
$e^{(b^2 - a^2)t + ax} \cos(bx - 2abt)$, $e^{(b^2 - a^2)t + ax} \sin(bx - 2abt)$.

7.4.50. (a), (c), (e) are conjugated. Note: Case (e) is all of $\mathbb{C}^3$.

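7.4.51. If $v = \operatorname{Re} u = \dfrac{u + \bar u}{2}$, then $\bar v = \dfrac{\bar u + u}{2} = v$. Similarly, if $w = \operatorname{Im} u = \dfrac{u - \bar u}{2i}$, then $\bar w = \dfrac{\bar u - u}{-2i} = \dfrac{u - \bar u}{2i} = w$.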

7.4.52. Let $z_1, \ldots, z_n$ be a complex basis of $V$. Then $x_j = \operatorname{Re} z_j$, $y_j = \operatorname{Im} z_j$, $j = 1, \ldots, n$, are real vectors that span $V$ since each $z_j = x_j + i\,y_j$ is a linear combination thereof, cf. Exercise 2.3.19. Thus, by Exercise 2.4.22, we can find a basis of $V$ containing $n$ of the vectors $x_1, \ldots, x_n, y_1, \ldots, y_n$. Conversely, if $v_1, \ldots, v_n$ is a real basis, and $v = c_1 v_1 + \cdots + c_n v_n \in V$, then $\bar v = \bar c_1 v_1 + \cdots + \bar c_n v_n \in V$ also, so $V$ is conjugated.

7.4.53. Every linear function from $\mathbb{C}^n$ to $\mathbb{C}^m$ has the form $L[u] = A\,u$ for some $m \times n$ complex matrix $A$. The reality condition implies $A\,\bar u = L[\bar u] = \overline{L[u]} = \overline{A\,u} = \bar A\,\bar u$ for all $u \in \mathbb{C}^n$. Therefore, $A = \bar A$ and so $A$ is a real matrix.
7.4.54. L[ u ] = L[ v ] + i L[ w ] = f , and, since L is real, the real and imaginary parts of this
equation yield L[ v ] = f , L[ w ] = 0.

7.4.55. (a) L[ u ] = L[ u ] = 0. (b) u =


32 + 12 i y + 12 21 i z, y, z
where y, z C
are the free variables, is the general solution to the first system, and so its complex conjugate

u=
32 21 i y + 12 + 21 i z, y, z , where y, z C are free variables, solves the conjugate system. (Note: Since y, z are free, they could be renamed y, z if desired.)
7.4.56. They are linearly independent if and only if $u$ is not a complex scalar multiple of a real solution. Indeed, if $u = (a + i\,b)\,v$ where $v$ is a real solution, then $x = a\,v$, $y = b\,v$ are linearly dependent. Conversely, if $y = 0$, then $u = x$ is already a real solution, while if $x = a\,y$ for a real $a$, then $u = (a + i)\,y$ is a scalar multiple of the real solution $y$.

7.5.1. (a)

1
2

1
,
3

(b)

0
@

7.5.2. Domain (a), target (b):

1
4
3

2
4

32 A
,
3
!

3
;
9

(c)

13
@ 7
5
7

10
7
15
7

A.

domain (a), target (c):

(Footnote to Exercise 7.4.56: $a$ can't be complex, as otherwise $x$ wouldn't be real.)



3
1

5
;
10

domain (b), target (a):


domain (c), target (a):
0

1
7.5.3. (a) B
@1
0

1
0
1

0
1 C
A,
2

1
@2
2
03
6
@7
5
7

(b)
0

21 A
;
11
17 A
;
5

domain (b), target (c):


domain (c), target (b):

2
0

2
3

B1
B
@2

C
32 C
A,

(c)

B0
B
B1
@

1
4
32
11
4

3
@2
1
03
12
@ 7
10
7
1

3
2C
C
.
4 C
A
9
2

52
10
3

A;

37
15
7

A.

1 1 1
1 2
0
0 2 C
0 3 C
domain (a), target (c): B
7.5.4. Domain (a), target (b): B
A;
A;
@2
@1
1
4
5
0
2
6
1
0
1
0
1 1
0
1
1
1
C
B1
C
B
domain (b), target (a): B
domain (b), target (c): B
0 12 C
0 1 C
A;
@2
A;
@ 1
4
5
1
1
2
0
3
3
3
3
03
0
1
1
1
1
1

1
1
3C
4
2
4
B
B
C
C
C
1
1
B
B
; domain (c), target (b): B
.
domain (c), target (a): B
0 2 C
0 6 C
2
2
@
@
A
A
1
1
1
4
4
2
1
6
2
7.5.5. Domain (a), target (a):
domain (a), target (c):
domain (b), target (b):

0
2

1
;
1

2
8

0
8

2
;
4

1
@2

1
1
domain (c), target (a): @
1
0
0

domain (c), target (c):

1
3

16
@ 7
18
7

0
4
3
2
7
4
7

8
7
16
7

domain (a), target (b):


domain (b), target (a):

32 A
;
11
37 A
;
1

domain (b), target (c):


domain (c), target (b):

1
47 A
.
6
7

1
3

0
4

1
@2

2
3

8
03

8
3
4
7
8
7

@1

0
0

3
.
3
12

A.
1
3!

4 .
3 1
97 A
.
3
7

7.5.6. Using the monomial basis $1, x, x^2$, the operator $D$ has matrix representative
$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}$. The inner product is represented by the Hilbert matrix $K = \begin{pmatrix} 1 & \frac12 & \frac13 \\ \frac12 & \frac13 & \frac14 \\ \frac13 & \frac14 & \frac15 \end{pmatrix}$.
Thus, the adjoint is represented by $K^{-1} A^T K = \begin{pmatrix} -6 & 2 & 3 \\ 12 & -24 & -26 \\ 0 & 30 & 30 \end{pmatrix}$, so
$D^*[1] = -6 + 12x$, $D^*[x] = 2 - 24x + 30x^2$, $D^*[x^2] = 3 - 26x + 30x^2$.
One can check that
$\langle p , D[q] \rangle = \displaystyle\int_0^1 p(x)\, q'(x)\, dx = \int_0^1 D^*[p(x)]\, q(x)\, dx = \langle D^*[p] , q \rangle$
for all quadratic polynomials by verifying it on the monomial basis elements.
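The adjoint matrix in Exercise 7.5.6 can be recomputed in exact rational arithmetic. This is a minimal sketch: the helper `matmul` is hypothetical, and the inverse of the $3 \times 3$ Hilbert matrix is entered by hand and then checked rather than computed.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[0, 1, 0], [0, 0, 2], [0, 0, 0]]                        # D on the basis 1, x, x^2
K = [[F(1, i + j + 1) for j in range(3)] for i in range(3)]  # Hilbert matrix
K_inv = [[9, -36, 30], [-36, 192, -180], [30, -180, 180]]    # assumed inverse, verified below
A_T = [list(col) for col in zip(*A)]

identity_check = matmul(K, K_inv)              # should be the identity matrix
adjoint = matmul(matmul(K_inv, A_T), K)        # K^{-1} A^T K
```

The columns of `adjoint` hold the coefficients of $D^*[1]$, $D^*[x]$, $D^*[x^2]$ in the monomial basis.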

7.5.7. Suppose $M, N \colon V \to U$ both satisfy $\langle u , M[v] \rangle = \langle\langle L[u] , v \rangle\rangle = \langle u , N[v] \rangle$ for all $u \in U$, $v \in V$. Then $\langle u , (M - N)[v] \rangle = 0$ for all $u \in U$, and so $(M - N)[v] = 0$ for all $v \in V$, which proves that $M = N$.


7.5.8.
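(a) For all $u \in U$, $v \in V$, we have $\langle u , (L + M)^*[v] \rangle = \langle\langle (L + M)[u] , v \rangle\rangle = \langle\langle L[u] , v \rangle\rangle + \langle\langle M[u] , v \rangle\rangle = \langle u , L^*[v] \rangle + \langle u , M^*[v] \rangle = \langle u , (L^* + M^*)[v] \rangle$. Since this holds for all $u \in U$, $v \in V$, we conclude that $(L + M)^* = L^* + M^*$.
(b) $\langle u , (c\,L)^*[v] \rangle = \langle\langle (c\,L)[u] , v \rangle\rangle = c\,\langle\langle L[u] , v \rangle\rangle = c\,\langle u , L^*[v] \rangle = \langle u , c\,L^*[v] \rangle$.
(c) $\langle\langle (L^*)^*[u] , v \rangle\rangle = \langle u , L^*[v] \rangle = \langle\langle L[u] , v \rangle\rangle$.
(d) $(L^{-1})^* \circ L^* = (L \circ L^{-1})^* = I^* = I$, and $L^* \circ (L^{-1})^* = (L^{-1} \circ L)^* = I^* = I$.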

7.5.9. In all cases, L = L if and only if its matrix !


representative A, with respect!to the stan1
0
0 1
dard basis, is symmetric. (a) A =
= AT , (b) A =
= AT ,
0 1
1 0
(c) A =

3
0

0
3

=A ,

(d) A =

1
@2
1
2

1
1
2A
1
2

= AT .

7.5.10. According to (7.78), the adjoint $A^* = M^{-1} A^T M = A$ if and only if $M A = A^T M = (M^T A)^T = (M A)^T$ since $M = M^T$.
7.5.11. The inner product matrix is $M = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$, so $M A = \begin{pmatrix} 12 & 6 \\ 6 & 12 \end{pmatrix}$ is symmetric, and hence, by Exercise 7.5.10, $A$ is self-adjoint.
0 1 1
C
7.5.12. (a) a12 = 21 a21 , a13 = 13 a31 , 12 a23 = 31 a32 , (b) B
@ 2 1 2 A.
3 3 2
7.5.13.
(a) 2
a a22 =
a +2 a21 a31 , 2 a13 a23 = a21 +2 a31 , a13 +2 a23 a33 = a22 +2 a32 ,
0 12
1 11
0 1
1
(b) B
6 C
@2 3
A.
5 3 16
7.5.14. True. $\langle I[u] , v \rangle = \langle u , v \rangle = \langle u , I[v] \rangle$ for all $u, v \in U$, and so, according to (7.74), $I^* = I$.
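7.5.15. False. For example, $A = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$ is not self-adjoint with respect to the inner product defined by $M = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ since $M A = \begin{pmatrix} 4 & 1 \\ 2 & 2 \end{pmatrix}$ is not symmetric, and so fails the criterion of Exercise 7.5.10.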
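The symmetry criterion of Exercise 7.5.10 is easy to test mechanically. The sketch below assumes small integer matrices stored as lists of rows. The function names are hypothetical, and the matrix `A_good` is an illustrative choice made so that $M A$ is symmetric; it is not taken from the text.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def is_self_adjoint(A, M):
    """A is self-adjoint for <u, v> = u^T M v exactly when M A is symmetric."""
    MA = matmul(M, A)
    n = len(MA)
    return all(MA[i][j] == MA[j][i] for i in range(n) for j in range(n))

# Illustrative (assumed) matrix with M A = [[12, 6], [6, 12]]
A_good = [[6, 3], [2, 4]]
good = is_self_adjoint(A_good, [[2, 0], [0, 3]])

# The counterexample of Exercise 7.5.15: M A = [[4, 1], [2, 2]]
bad = is_self_adjoint([[2, 0], [0, 1]], [[2, 1], [1, 2]])
```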

7.5.16.
(a) $(L + L^*)^* = L^* + (L^*)^* = L^* + L$.
(b) Since $L \circ L^* = (L^*)^* \circ L^*$, this follows from Theorem 7.60. (Or it can be proved directly.)
7.5.17.
(a) Write the condition as $\langle N[u] , u \rangle = 0$ where $N = K - M$ is also self-adjoint. Then, for any $u, v \in U$, we have $0 = \langle N[u + v] , u + v \rangle = \langle N[u] , u \rangle + \langle N[u] , v \rangle + \langle N[v] , u \rangle + \langle N[v] , v \rangle = 2\,\langle u , N[v] \rangle$, where we used the self-adjointness of $N$ to combine $\langle N[u] , v \rangle = \langle u , N[v] \rangle = \langle N[v] , u \rangle$. Since $\langle u , N[v] \rangle = 0$ for all $u, v$, we conclude that $N = K - M = O$.
(b) When we take $U = \mathbb{R}^n$ with the dot product, then $K$ and $M$ are represented by $n \times n$ matrices, $A, B$, respectively, and the condition is $(A\,u) \cdot u = (B\,u) \cdot u$ for all $u \in \mathbb{R}^n$, which implies $A = B$ provided $A, B$ are symmetric matrices. In particular, if $A^T = -A$ is any skew-symmetric matrix, then $(A\,u) \cdot u = 0$ for all $u$.

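7.5.18. (a) $\Longrightarrow$ (b): Suppose $\langle L[u] , L[v] \rangle = \langle u , v \rangle$ for all $u, v \in U$, then
$\| L[u] \| = \sqrt{\langle L[u] , L[u] \rangle} = \sqrt{\langle u , u \rangle} = \| u \|$.
(b) $\Longrightarrow$ (c): Suppose $\| L[u] \| = \| u \|$ for all $u \in U$. Then
$\langle L^* \circ L[u] , u \rangle = \langle L[u] , L[u] \rangle = \langle u , u \rangle$.
Thus, by Exercise 7.5.17, $L^* \circ L = I$. Since $L$ is assumed to be invertible, this proves, cf. Exercise 7.1.59, that $L^* = L^{-1}$.
(c) $\Longrightarrow$ (a): If $L^* \circ L = I$, then
$\langle L[u] , L[v] \rangle = \langle u , L^* \circ L[v] \rangle = \langle u , v \rangle$ for all $u, v \in U$.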
7.5.19.
(a) $\langle M_a[u] , v \rangle = \displaystyle\int_a^b M_a[u(x)]\, v(x)\, dx = \int_a^b a(x)\, u(x)\, v(x)\, dx = \int_a^b u(x)\, M_a[v(x)]\, dx = \langle u , M_a[v] \rangle$, proving self-adjointness.
(b) Yes, by the same computation, $\langle\langle M_a[u] , v \rangle\rangle = \displaystyle\int_a^b a(x)\, u(x)\, v(x)\, w(x)\, dx = \langle\langle u , M_a[v] \rangle\rangle$.

7.5.20.
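(a) If $A^T = -A$, then $(A\,u) \cdot v = (A\,u)^T v = u^T A^T v = -u^T A\,v = -u \cdot A\,v$ for all $u, v \in \mathbb{R}^n$, and so $A^* = -A$.
(b) When $A^T M = -M A$.
(c) $\langle (L - L^*)[u] , v \rangle = \langle L[u] , v \rangle - \langle L^*[u] , v \rangle = \langle u , L^*[v] \rangle - \langle u , L[v] \rangle = \langle u , (L^* - L)[v] \rangle$. Thus, by the definition of adjoint, $(L - L^*)^* = L^* - L = -(L - L^*)$.
(d) Write $L = K + S$, where $K = \frac12 (L + L^*)$ is self-adjoint and $S = \frac12 (L - L^*)$ is skew-adjoint.
7.5.21. Define $L \colon U \to V_1 \times V_2$ by $L[u] = (L_1[u], L_2[u])$. Using the induced inner product $\langle\langle\langle (v_1, v_2) , (w_1, w_2) \rangle\rangle\rangle = \langle\langle v_1 , w_1 \rangle\rangle_1 + \langle\langle v_2 , w_2 \rangle\rangle_2$ on the Cartesian product $V_1 \times V_2$ given in Exercise 3.1.18, we find
$\langle u , L^*[v_1, v_2] \rangle = \langle\langle\langle L[u] , (v_1, v_2) \rangle\rangle\rangle$
$= \langle\langle\langle (L_1[u], L_2[u]) , (v_1, v_2) \rangle\rangle\rangle = \langle\langle L_1[u] , v_1 \rangle\rangle_1 + \langle\langle L_2[u] , v_2 \rangle\rangle_2$
$= \langle u , L_1^*[v_1] \rangle + \langle u , L_2^*[v_2] \rangle = \langle u , L_1^*[v_1] + L_2^*[v_2] \rangle$,
and hence $L^*[v_1, v_2] = L_1^*[v_1] + L_2^*[v_2]$. As a result,
$L^* \circ L[u] = L_1^* \circ L_1[u] + L_2^* \circ L_2[u] = K_1[u] + K_2[u] = K[u]$.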

7.5.22. Minimizer:
7.5.23. Minimizer:
7.5.24. Minimizer:
7.5.25.
(a) Minimizer:
(b) Minimizer:
(c) Minimizer:
7.5.26.
(a) Minimizer:

1
1
5,5

; minimum value: 51 .

3
14 2
13 , 13 , 13
2 1
3, 3

31
; minimum value: 26
.

; minimum value: 2.

1 T
5
; minimum value: 56 .
,
18 18
T

5 4
; minimum value: 5.
3, 3

1 1 T
; minimum value: 21 .
6 , 12

7
2
13 , 13

7
.
; minimum value: 26


(b) Minimizer:
(c) Minimizer:
(d) Minimizer:
7.5.27. (a)

1
3,

(b)

11
39 ,

12
13 ,

19
39 ,
6
11 ,

1 T
;
13

5 T
;
26

4 T
;
39

(c)

minimum value: 11
78 .
minimum value: 43
52 .
minimum value: 17
39 .

3
5.

7.5.28. Suppose $L \colon U \to V$ is a linear map between inner product spaces with $\ker L \neq \{0\}$ and adjoint map $L^* \colon V \to U$. Let $K = L^* \circ L \colon U \to U$ be the associated positive semi-definite operator. If $f \in \operatorname{rng} K$, then any solution to the linear system $K[u^\star] = f$ is a minimizer for the quadratic function $p(u) = \frac12 \| L[u] \|^2 - \langle u , f \rangle$. The minimum is not unique since if $u^\star$ is a minimizer, so is $u = u^\star + z$ for any $z \in \ker L$.


Solutions Chapter 8
8.1.1. (a) $u(t) = 3\,e^{5t}$, (b) $u(t) = 3\,e^{2(t-1)}$, (c) $u(t) = e^{-3(t+1)}$.

8.1.2. $\gamma = \log 2 / 100 \approx .0069$. After 10 years: 93.3033 gram; after 100 years: 50 gram; after 1000 years: .0977 gram.
8.1.3. Solve $e^{-(\log 2)\,t/5730} = .0624$ for $t = -5730 \log .0624 / \log 2 = 22{,}933$ years.

8.1.4. By (8.6), $u(t) = u(0)\, e^{-(\log 2)\,t/t^\star} = u(0) \left( \frac12 \right)^{t/t^\star} = 2^{-n}\, u(0)$ when $t = n\,t^\star$. After every time period of duration $t^\star$, the amount of radioactive material is halved.
8.1.5. The solution is $u(t) = u(0)\, e^{1.3\,t}$. To double, we need $e^{1.3\,t} = 2$, so $t = \log 2 / 1.3 = .5332$. To quadruple takes twice as long, $t = 1.0664$. To reach 2 million needs $t = \log 10^6 / 1.3 = 10.6273$.
8.1.6. The solution is $u(t) = u(0)\, e^{.27\,t}$. For the given initial conditions, $u(t) = 1{,}000{,}000$ when $t = \log(1000000/5000)/.27 = 19.6234$ years.
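The numerical values in Exercises 8.1.2 through 8.1.6 all come from the formula $u(t) = u(0)\,e^{a t}$. A minimal sketch follows; the function names are hypothetical, and the 100 gram initial amount in Exercise 8.1.2 is an assumption inferred from the stated answers.

```python
import math

def decay_amount(u0, half_life, t):
    """Amount left after time t when the quantity halves every half_life."""
    return u0 * math.exp(-math.log(2) * t / half_life)

def time_to_reach(u0, target, rate):
    """Time t at which u0 * exp(rate * t) equals target."""
    return math.log(target / u0) / rate

after_10_years = decay_amount(100, 100, 10)            # Exercise 8.1.2 (assumed 100 g start)
doubling_time = time_to_reach(1, 2, 1.3)               # Exercise 8.1.5
million_time = time_to_reach(5000, 1_000_000, 0.27)    # Exercise 8.1.6
```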
8.1.7.
(a) If $u(t) \equiv u^\star = -\dfrac{b}{a}$, then $\dfrac{du}{dt} = 0 = a\,u + b$, hence it is a solution.
(b) $v = u - u^\star$ satisfies $\dfrac{dv}{dt} = a\,v$, so $v(t) = c\,e^{a t}$, and $u(t) = c\,e^{a t} - \dfrac{b}{a}$.
(c) The equilibrium solution is asymptotically stable if and only if $a < 0$ and is stable if $a = 0$.
8.1.8. (a) u(t) =

1
2

1
2

e2 t , (b) u(t) = 3, (c) u(t) = 2 3 e 3(t2) .

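8.1.9. (a) $\dfrac{du}{dt} = -\dfrac{\log 2}{1000}\, u + 5 \approx -.000693\, u + 5$. (b) Stabilizes at the equilibrium solution $u^\star = 5000 / \log 2 \approx 7213$ tons. (c) The solution is $u(t) = \dfrac{5000}{\log 2} \left[ 1 - \exp\left( -\dfrac{\log 2}{1000}\, t \right) \right]$ which equals 100 when $t = -\dfrac{1000}{\log 2} \log\left( 1 - 100\, \dfrac{\log 2}{5000} \right) \approx 20.14$ years.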

8.1.10.
(a) The first term on the right hand side says that the rate of growth remains proportional to the population, while the second term reflects the fact that hunting decreases the population by a fixed amount. (This assumes hunting is done continually throughout the year, which is not what happens in real life.)
(b) The solution is $u(t) = \left( 5000 - \dfrac{1000}{.27} \right) e^{.27\,t} + \dfrac{1000}{.27}$. Solving $u(t) = 1000000$ gives $t = \dfrac{1}{.27} \log \dfrac{1000000 - 1000/.27}{5000 - 1000/.27} = 24.6094$ years.
(c) To avoid extinction, the equilibrium $u^\star = b/.27$ must be less than the initial population, so $b < 1350$ deer.
8.1.11. (a) $| u_1(t) - u_2(t) | = e^{a t}\, | u_1(0) - u_2(0) | \to \infty$ when $a > 0$, since $u_1(0) = u_2(0)$ if and only if the solutions are the same. (b) $t = \log(1000/.05)/.02 = 495.17$.
8.1.12.
(a) $u(t) = \frac13\, e^{2t/7}$.
(b) One unit: $t = \log\bigl[ 1/(1/3 - .3333) \bigr] / (2/7) = 36.0813$;
1000 units: $t = \log\bigl[ 1000/(1/3 - .3333) \bigr] / (2/7) = 60.2585$.
(c) One unit: $t \approx 30.2328$ solves $\frac13\, e^{2t/7} - .3333\, e^{.2857\,t} = 1$.
1000 units: $t \approx 52.7548$ solves $\frac13\, e^{2t/7} - .3333\, e^{.2857\,t} = 1000$.
Note: The solutions to these nonlinear equations are found by a numerical equation solver, e.g., the bisection method, or Newton's method, [10].
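The nonlinear equations in part (c) of Exercise 8.1.12 can be solved with a few lines of bisection. This is a sketch of one possible solver, not necessarily the one used for the quoted values. The function names and the brackets $[0, 40]$ and $[0, 80]$ are assumptions, chosen so that the function changes sign across each interval.

```python
import math

def bisect(f, lo, hi, tol=1e-10):
    """Bisection method: assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):   # root lies in the upper half
            lo, f_lo = mid, f_mid
        else:                            # root lies in the lower half
            hi = mid
    return 0.5 * (lo + hi)

def gap(t):
    """Difference between the exact solution and the one with rounded data."""
    return math.exp(2 * t / 7) / 3 - 0.3333 * math.exp(0.2857 * t)

t_one = bisect(lambda t: gap(t) - 1, 0.0, 40.0)
t_thousand = bisect(lambda t: gap(t) - 1000, 0.0, 80.0)

# Part (b) has a closed form, since both solutions share the exponent 2t/7.
t_one_b = math.log(1 / (1 / 3 - 0.3333)) / (2 / 7)
```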
8.1.13. According to Exercise 3.6.24, $\dfrac{du}{dt} = c\,a\,e^{a t} = a\,u$, and so $u(t)$ is a valid solution. By Euler's formula (3.84), if $\operatorname{Re} a > 0$, then $u(t) \to \infty$ as $t \to \infty$, and the origin is an unstable equilibrium. If $\operatorname{Re} a = 0$, then $u(t)$ remains bounded as $t \to \infty$, and the origin is a stable equilibrium. If $\operatorname{Re} a < 0$, then $u(t) \to 0$ as $t \to \infty$, and the origin is an asymptotically stable equilibrium.


8.2.1.
(a) Eigenvalues: 3, 1;
(b) Eigenvalues:

1 1
2, 3;

(c) Eigenvalue: 2;

eigenvectors:
eigenvectors:

eigenvector:

(f )

(g)
(h)
(i)

(j)

1
.
1

i
2 ,
eigenvectors:
1
0
1 0
1 0 1
1
1
1
C B
C B C
eigenvectors: B
@ 1 A, @ 0 A, @ 2 A.
1
1
1

(d) Eigenvalues: 1 + i 2, 1 i 2;
(e) Eigenvalues: 4, 3, 1;

1
1
.
,
1
1
!
!
1
4
.
,
1
3

3
2
3
2

1 +
B

!
2 .
1

1 0

3
2
3
2

1
B

2
C

C B
C B
C, B
Eigenvalues: 1, 6, 6; eigenvectors: B
@ 0 A, B
@ 2+
A @ 2
1
1
1
1 0
0 1 0
3 + 2i
3 2i
3
C B
C B
Eigenvalues: 0, 1 + i , 1 i ; eigenvectors: B
@ 1 A , @ 3 i A, @ 3 + i
1
1
01
0
1 0
1
1
C B
C
Eigenvalues: 2, 0; eigenvectors: B
@ 1 A, @ 3 A.
1
1 1
0
2
C
1 is a simple eigenvalue, with eigenvector B
@ 1 A;
1
1
0 1 0
1
32
3
C B
C
2 is a double eigenvalue, with eigenvectors B
@ 0 A, @ 1 A .
1
0 1
0
1 0
1
0
C
B
C B
B 1 C B 0 C
C, B
C;
1 is a double eigenvalue, with eigenvectors B
@ 0 A @ 3 A
0
2


C
A.

C
C
C.
A

7 is also a double eigenvalue, with eigenvectors

(k) Eigenvalues: 1, 2, 3, 4;

1 0

B C B C
B1C B0C
B C , B C.
@0A @1A

0
2
1 0 1 0 1 0 1
0
0
0
1
B C B C B C B C
0
0
1
C B C B C B1C
B C , B C, B C, B C .
eigenvectors: B
@0A @1A @1A @0A
1
1
0
0
0

8.2.2. (a) The eigenvalues are $e^{\pm i\theta} = \cos\theta \pm i \sin\theta$ with eigenvectors $\begin{pmatrix} 1 \\ \mp i \end{pmatrix}$. They are real only for $\theta = 0$ and $\pi$. (b) Because $R - a\,I$ has an inverse if and only if $a$ is not an eigenvalue.

8.2.3. The eigenvalues are $\pm 1$ with eigenvectors $( \sin\theta, \ \pm 1 - \cos\theta )^T$.


8.2.4. (a) O, and (b) I , are trivial examples.
3

8.2.5. (a) The characteristic equation is + + + = 0. (b) For example,

B
@0

1
0
b

0
1C
A.
a

a =1b =
0 then A = O and0all vectors
8.2.6. The eigenvalues are 0 and i a2 + b2 + c2 . If 0
1
0c = 1
a
b
ac
i
C
C B
C
B
are eigenvectors. Otherwise, the eigenvectors are B
@ b A, @ a A
@ b c A.
2 + b2 + c2
2
2
a
c
0
a +b

8.2.7.

1
1
,
.
0
1
!

i (2 5) .
(b) Eigenvalues: 5; eigenvectors:
1!
!
1
3
1
+
i
5
5
(c) Eigenvalues: 3, 2 i ; eigenvectors:
.
,
1
1
0
1
1
C
(d) 2 is a simple eigenvalue with eigenvector B
@ 2 A;
1
0
1 0
1 + i
1+ i
B
C B
i is a double eigenvalue with eigenvectors @
0
A, @ 1
1
0
(a) Eigenvalues: i , 1 + i ;

eigenvectors:

C
A.

8.2.8.
(a) Since $O\,v = 0 = 0\,v$, we conclude that 0 is the only eigenvalue; all nonzero vectors $v \neq 0$ are eigenvectors.
(b) Since $I\,v = v = 1\,v$, we conclude that 1 is the only eigenvalue; all nonzero vectors $v \neq 0$ are eigenvectors.

8.2.9. For $n = 2$, the eigenvalues are 0, 2, and the eigenvectors are $\begin{pmatrix} -1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$. For $n = 3$, the eigenvalues are 0, 0, 3, and the eigenvectors are $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}$, and $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$. In general, the eigenvalues are 0, with multiplicity $n - 1$, and $n$, which is simple. The eigenvectors corresponding to the eigenvalue 0 are all nonzero vectors of the form $( v_1, v_2, \ldots, v_n )^T$ with $v_1 + \cdots + v_n = 0$. The eigenvectors corresponding to the eigenvalue $n$ are all nonzero vectors of the form $( v_1, v_2, \ldots, v_n )^T$ with $v_1 = \cdots = v_n$.


8.2.10.
(a) If $A\,v = \lambda\,v$, then $A(c\,v) = c\,A\,v = c\,\lambda\,v = \lambda\,(c\,v)$ and so $c\,v$ satisfies the eigenvector equation for the eigenvalue $\lambda$. Moreover, since $v \neq 0$, also $c\,v \neq 0$ for $c \neq 0$, and so $c\,v$ is a bona fide eigenvector.
(b) If $A\,v = \lambda\,v$, $A\,w = \lambda\,w$, then $A(c\,v + d\,w) = c\,A\,v + d\,A\,w = c\,\lambda\,v + d\,\lambda\,w = \lambda\,(c\,v + d\,w)$.
(c) Suppose $A\,v = \lambda\,v$, $A\,w = \mu\,w$. Then $v$ and $w$ must be linearly independent as otherwise they would be scalar multiples of each other and hence have the same eigenvalue. Thus, $A(c\,v + d\,w) = c\,A\,v + d\,A\,w = c\,\lambda\,v + d\,\mu\,w = \nu\,(c\,v + d\,w)$ if and only if $c\,\lambda = c\,\nu$ and $d\,\mu = d\,\nu$, which, when $\lambda \neq \mu$, is only possible when either $c = 0$ or $d = 0$.
8.2.11. True. By the same computation as in Exercise 8.2.10(a), $c\,v$ is an eigenvector for the same (real) eigenvalue $\lambda$.
8.2.12. Write $w = x + i\,y$. Then, since $\lambda$ is real, the real and imaginary parts of the eigenvector equation $A\,w = \lambda\,w$ are $A\,x = \lambda\,x$, $A\,y = \lambda\,y$, and hence $x, y$ are real eigenvectors of $A$. Thus $x = a_1 v_1 + \cdots + a_k v_k$, $y = b_1 v_1 + \cdots + b_k v_k$ for $a_1, \ldots, a_k, b_1, \ldots, b_k \in \mathbb{R}$, and hence $w = c_1 v_1 + \cdots + c_k v_k$ where $c_j = a_j + i\,b_j$.

8.2.13.
(a) $A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix}$.
(b) $A^T A = I$ by direct computation, or, equivalently, note that the columns of $A$ are the standard orthonormal basis vectors $e_n, e_1, e_2, \ldots, e_{n-1}$, written in a slightly different order.
(c) Writing $S$ for the shift matrix of part (a), the vector $\omega_k = \bigl( 1, e^{2k\pi i/n}, e^{4k\pi i/n}, \ldots, e^{2(n-1)k\pi i/n} \bigr)^T$ satisfies
$S\,\omega_k = \bigl( e^{2k\pi i/n}, e^{4k\pi i/n}, \ldots, e^{2(n-1)k\pi i/n}, 1 \bigr)^T = e^{2k\pi i/n}\, \omega_k$,
so $\omega_k$ is an eigenvector with corresponding eigenvalue $e^{2k\pi i/n}$ for each $k = 0, \ldots, n - 1$.
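The eigenvector relation in Exercise 8.2.13(c) can be verified numerically for a small $n$. The sketch below assumes the matrix acts as the cyclic shift $(v_1, \ldots, v_n) \mapsto (v_2, \ldots, v_n, v_1)$; the names `shift` and `omega` and the choice $n = 5$ are illustrative.

```python
import cmath

def shift(v):
    """Cyclic shift (v1, ..., vn) -> (v2, ..., vn, v1)."""
    return v[1:] + v[:1]

def omega(k, n):
    """Sampled exponential vector (1, e^{2 k pi i/n}, ..., e^{2 (n-1) k pi i/n})."""
    return [cmath.exp(2j * cmath.pi * k * j / n) for j in range(n)]

n = 5
residuals = []
for k in range(n):
    w = omega(k, n)
    lam = cmath.exp(2j * cmath.pi * k / n)   # claimed eigenvalue
    # Largest entry of shift(w) - lam * w; should vanish up to rounding.
    residuals.append(max(abs(s - lam * x) for s, x in zip(shift(w), w)))
```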

8.2.14. (a) Eigenvalues: $-3, 1, 5$; eigenvectors: $( 2, 3, 1 )^T$, $\left( \frac23, 1, 1 \right)^T$, $( 2, 1, 1 )^T$;
(b) $\operatorname{tr} A = 3 = -3 + 1 + 5$, (c) $\det A = -15 = (-3) \cdot 1 \cdot 5$.

8.2.15.
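(a) $\operatorname{tr} A = 2 = 3 + (-1)$; $\det A = -3 = 3 \cdot (-1)$.
(b) $\operatorname{tr} A = \frac56 = \frac12 + \frac13$; $\det A = \frac16 = \frac12 \cdot \frac13$.
(c) $\operatorname{tr} A = 4 = 2 + 2$; $\det A = 4 = 2 \cdot 2$.
(d) $\operatorname{tr} A = 2 = (1 + i\sqrt2) + (1 - i\sqrt2)$; $\det A = 3 = (1 + i\sqrt2) \cdot (1 - i\sqrt2)$.
(e) $\operatorname{tr} A = 8 = 4 + 3 + 1$; $\det A = 12 = 4 \cdot 3 \cdot 1$.
(f) $\operatorname{tr} A = 1 = 1 + \sqrt6 + (-\sqrt6)$; $\det A = -6 = 1 \cdot \sqrt6 \cdot (-\sqrt6)$.
(g) $\operatorname{tr} A = 2 = 0 + (1 + i) + (1 - i)$; $\det A = 0 = 0 \cdot (1 + i) \cdot (1 - i)$.
(h) $\operatorname{tr} A = 4 = 2 + 2 + 0$; $\det A = 0 = 2 \cdot 2 \cdot 0$.
(i) $\operatorname{tr} A = 3 = (-1) + 2 + 2$; $\det A = -4 = (-1) \cdot 2 \cdot 2$.
(j) $\operatorname{tr} A = 12 = (-1) + (-1) + 7 + 7$; $\det A = 49 = (-1) \cdot (-1) \cdot 7 \cdot 7$.
(k) $\operatorname{tr} A = 10 = 1 + 2 + 3 + 4$; $\det A = 24 = 1 \cdot 2 \cdot 3 \cdot 4$.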
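The pattern in Exercise 8.2.15, that the trace is the sum and the determinant the product of the eigenvalues, can be illustrated for a $2 \times 2$ matrix. In the sketch below the matrix is a hypothetical example with eigenvalues 3 and $-1$ (it is not the matrix of part (a), which is not reproduced here), and `eigenvalues_2x2` is an illustrative helper.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of det(A - lambda I) = lambda^2 - (tr A) lambda + det A."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

A = [[1, 2], [2, 1]]   # hypothetical matrix with eigenvalues 3 and -1
lam1, lam2 = eigenvalues_2x2(A)

# Each eigenvalue should make A - lambda I singular.
singular_gaps = [abs((A[0][0] - lam) * (A[1][1] - lam) - A[0][1] * A[1][0])
                 for lam in (lam1, lam2)]
trace_gap = abs((lam1 + lam2) - (A[0][0] + A[1][1]))
det_gap = abs(lam1 * lam2 - (A[0][0] * A[1][1] - A[0][1] * A[1][0]))
```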


8.2.16.
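(a) $a = a_{11} + a_{22} + a_{33} = \operatorname{tr} A$, $b = a_{11} a_{22} - a_{12} a_{21} + a_{11} a_{33} - a_{13} a_{31} + a_{22} a_{33} - a_{23} a_{32}$,
$c = a_{11} a_{22} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} - a_{13} a_{22} a_{31} = \det A$.

(b) When the factored form of the characteristic polynomial is multiplied out, we obtain
$-(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3) = -\lambda^3 + (\lambda_1 + \lambda_2 + \lambda_3)\,\lambda^2 - (\lambda_1 \lambda_2 + \lambda_1 \lambda_3 + \lambda_2 \lambda_3)\,\lambda + \lambda_1 \lambda_2 \lambda_3$,
giving the eigenvalue formulas for $a, b, c$.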

8.2.17. If $U$ is upper triangular, so is $U - \lambda\,I$, and hence $p(\lambda) = \det(U - \lambda\,I)$ is the product of the diagonal entries, so $p(\lambda) = \prod\, (u_{ii} - \lambda)$. Thus, the roots of the characteristic equation are $u_{11}, \ldots, u_{nn}$, the diagonal entries of $U$.
8.2.18. Since $J_a - \lambda\,I$ is an upper triangular matrix with $a - \lambda$ on the diagonal, its determinant is $\det(J_a - \lambda\,I) = (a - \lambda)^n$ and hence its only eigenvalue is $\lambda = a$, of multiplicity $n$. (Or use Exercise 8.2.17.) Moreover, $(J_a - a\,I)\,v = ( v_2, v_3, \ldots, v_n, 0 )^T = 0$ if and only if $v = c\,e_1$.
8.2.19. Parts (a,b) are special cases of part (c):
If $A\,v = \lambda\,v$ then $B\,v = (c\,A + d\,I)\,v = (c\,\lambda + d)\,v$.
8.2.20. If $A\,v = \lambda\,v$ then $A^2 v = \lambda\,A\,v = \lambda^2 v$, and hence $v$ is also an eigenvector of $A^2$ with eigenvalue $\lambda^2$.
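8.2.21. (a) False. For example, 0 is an eigenvalue of both $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ -1 & 0 \end{pmatrix}$, but the eigenvalues of $A + B = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ are $\pm i$. (b) True. If $A\,v = \lambda\,v$ and $B\,v = \mu\,v$, then $(A + B)\,v = (\lambda + \mu)\,v$, and so $v$ is an eigenvector with eigenvalue $\lambda + \mu$.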
8.2.22. False in general, but true if the eigenvectors coincide: If $A\,v = \lambda\,v$ and $B\,v = \mu\,v$, then $A\,B\,v = (\lambda\,\mu)\,v$, and so $v$ is an eigenvector with eigenvalue $\lambda\,\mu$.
8.2.23. If $A\,B\,v = \lambda\,v$, then $B\,A\,w = \lambda\,w$, where $w = B\,v$. Thus, as long as $w \neq 0$, it is an eigenvector of $B\,A$ with eigenvalue $\lambda$. However, if $w = 0$, then $A\,B\,v = 0$, and so the eigenvalue is $\lambda = 0$, which implies that $A\,B$ is singular. But then so is $B\,A$, which also has 0 as an eigenvalue. Thus every eigenvalue of $A\,B$ is an eigenvalue of $B\,A$. The converse follows by the same reasoning. Note: This does not imply that their null eigenspaces (kernels) have the same dimension; compare Exercise 1.8.18. In anticipation of Section 8.6, even though $A\,B$ and $B\,A$ have the same eigenvalues, they may have different Jordan canonical forms.
8.2.24. (a) Starting with $A\,v = \lambda\,v$, multiply both sides by $A^{-1}$ and divide by $\lambda$ to obtain $A^{-1} v = (1/\lambda)\,v$. Therefore, $v$ is an eigenvector of $A^{-1}$ with eigenvalue $1/\lambda$.
(b) If 0 is an eigenvalue, then $A$ is not invertible.
8.2.25. (a) If all $| \lambda_j | \leq 1$ then so is their product $1 \geq | \lambda_1 \cdots \lambda_n | = | \det A |$, which is a contradiction. (b) False. $A = \begin{pmatrix} 2 & 0 \\ 0 & \frac13 \end{pmatrix}$ has eigenvalues $2, \frac13$ while $\det A = \frac23$.
8.2.26. Recall that $A$ is singular if and only if $\ker A \neq \{0\}$. Any $v \in \ker A$ satisfies $A\,v = 0 = 0\,v$. Thus $\ker A$ is nonzero if and only if $A$ has a null eigenvector.
8.2.27. Let $v, w$ be any two linearly independent vectors. Then $A\,v = \lambda\,v$ and $A\,w = \mu\,w$ for some $\lambda, \mu$. But $v + w$ is an eigenvector if and only if $A(v + w) = \lambda\,v + \mu\,w = \nu\,(v + w)$, which requires $\lambda = \mu = \nu$. Thus, $A\,v = \lambda\,v$ for every $v$, which implies $A = \lambda\,I$.
8.2.28. If $\lambda$ is a simple real eigenvalue, then there are two real unit eigenvectors: $u$ and $-u$. For a complex eigenvalue, if $u$ is a unit complex eigenvector, so is $e^{i\theta} u$, and so there are infinitely many complex unit eigenvectors. (The same holds for a real eigenvalue if we also allow complex eigenvectors.) If $\lambda$ is a multiple real eigenvalue, with eigenspace of dimension greater than 1, then there are infinitely many unit real eigenvectors in the eigenspace.
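8.2.29. All false. Simple $2 \times 2$ examples suffice to disprove them:
Start with $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, which has eigenvalues $i, -i$; (a) $\begin{pmatrix} 0 & 1 \\ -1 & 2 \end{pmatrix}$ has eigenvalue 1;
(b) $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$ has eigenvalues $-1, 1$; (c) $\begin{pmatrix} 0 & 4 \\ -1 & 0 \end{pmatrix}$ has eigenvalues $2i, -2i$.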
8.2.30. False. The eigenvalue equation $A\,v = \lambda\,v$ is not linear in the eigenvalue and eigenvector since $A(v_1 + v_2) \neq (\lambda_1 + \lambda_2)(v_1 + v_2)$ in general.
8.2.31.

24
7
0
25 25 A.
(a) (i) Q =
. (ii) Q = @
7
1
24
25 25
1
0
1
0 1 0
1
0 0
4
3
C
Eigenvalues 1, 1; eigenvectors @ 54 A, @ 53 A. (iii) Q = B
@ 0 1 0 A. Eigenvalue 1

0
0 1
5
5
0 1
0 1 0 1
0
1
0
1
0
0 0 1
C
B C B C
B
C
has eigenvector: B
@ 1 A; eigenvalue 1 has eigenvectors: @ 0 A, @ 0 A. (iv ) Q = @ 0 1 0 A.
0
0
1
1 0 0
0
1
0 1 0 1
1
1
0
C
B C B C
Eigenvalue 1 has eigenvector: B
@ 0 A; eigenvalue 1 has eigenvectors: @ 0 A, @ 1 A.
1
1
0
(b) $u$ is an eigenvector with eigenvalue $-1$. All vectors orthogonal to $u$ are eigenvectors with eigenvalue $+1$.

1
0

0
. Eigenvalues 1, 1; eigenvectors
1

1
,
0

8.2.32.
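(a) $\det(B - \lambda\,I) = \det(S^{-1} A\,S - \lambda\,I) = \det\bigl[ S^{-1} (A - \lambda\,I)\,S \bigr] = \det S^{-1} \det(A - \lambda\,I) \det S = \det(A - \lambda\,I)$.
(b) The eigenvalues are the roots of the common characteristic equation.
(c) Not usually. If $w$ is an eigenvector of $B$, then $v = S\,w$ is an eigenvector of $A$ and conversely.
(d) Both have 2 as a double eigenvalue. Suppose $\begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix} = S^{-1} \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} S$, or, equivalently, $S \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} S$ for some $S = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$. Then, equating entries, we must have $x - y = 2x$, $x + 3y = 2y$, $z - w = 2z$, $z + 3w = 2w$, which implies $y = -x$ and $w = -z$. But then $\det S = x\,w - y\,z = -x\,z + x\,z = 0$, and so $S$ is not invertible.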

8.2.33.
(a)

pA1 () = det(A

"

I ) = det A

1
I A

!#

( )n
=
p
det A A

Or, equivalently, if
pA () = (1)n n + cn1 n1 + + c1 + c0 ,
then, since c0 = det A 6= 0,
pA1 () = (1)

(1)n
=
c0

c
c
+ 1 n1 + + n1
c0
c0
n

n
2
4

!n

+ c1


!n1

1
c0

+ + cn

3
5

( )n
p
=
det A A

(b) (i) A1 =

3
2

12

. Then pA () = 2 5 2, while

pA1 () = 2 +
0
B

B
(ii) A1 = B
@

3
5
6
5
4
5

45
3
5
25

5
2

4
5
58
7
5
3

pA1 () = +

1
2

C
C
C.
A

7
5

2
2

5
2
+1 .
2

Then pA () = 3 + 3 2 7 + 5, while
2

3
5

1
5

3
=
5

1
3
7
3 + 2 +5 .

8.2.34.
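(a) If $A\,v = \lambda\,v$ then $0 = A^k v = \lambda^k v$ and hence $\lambda^k = 0$, so $\lambda = 0$.
(b) $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ has $A^2 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$;
$A = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$ has $A^2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $A^3 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$.
In general, $A$ can be any upper triangular matrix with all zero entries on the diagonal, and all nonzero entries on the super-diagonal.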
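The powers in Exercise 8.2.34(b) can be confirmed by direct multiplication. A minimal sketch, with a hypothetical `matmul` helper on lists of rows:

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Strictly upper triangular 3 x 3 example from part (b)
A = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
A2 = matmul(A, A)     # only the top-right entry survives
A3 = matmul(A2, A)    # the zero matrix, so A is nilpotent of order 3
```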
8.2.35.
(a) $\det(A^T - \lambda\,I) = \det(A - \lambda\,I)^T = \det(A - \lambda\,I)$, and hence $A$ and $A^T$ have the same characteristic polynomial, which implies that they have the same eigenvalues.
(b) No. See the examples.
(c) $\lambda\, v \cdot w = (A\,v)^T w = v^T A^T w = \mu\, v \cdot w$, so if $\mu \neq \lambda$, $v \cdot w = 0$ and the vectors are orthogonal.
!
!
1
1

2
; the
, v2 =
(d) (i) The eigenvalues are 1, 2; the eigenvectors of A are v1 =
1
1
!
!
1
2
, and v1 , w2 are orthogonal, as are
, w2 =
eigenvectors of AT are w1 =
1
1
v2 , w 1 .
0 1
0 1
1
1
C
B C
(ii) The eigenvalues are 1, 1, 2; the eigenvectors of A are v1 = B
@ 1 A , v 2 = @ 2 A, v 3 =
0
1
0 1
0
1
0
1
0
1
0
1
2
3
B1C
B
C
B
C
B
C
T
@ 2 A; the eigenvectors of A are w1 = @ 2 A, w2 = @ 2 A, w3 = @ 1 A. Note
1
1
1
1
that vi is orthogonal to wj whenever i 6= j.
8.2.36.
(a) The characteristic equation of a 3 × 3 matrix is a real cubic polynomial, and hence has at least one real root. (b) $\begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}$ has eigenvalues $\pm i$. (c) No, since the characteristic polynomial is degree 5 and hence has at least one real root.
8.2.37. (a) If $A v = \lambda v$, then $v = A^4 v = \lambda^4 v$, and hence, since $v \neq 0$, all its eigenvalues must satisfy $\lambda^4 = 1$. (b) $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$.

8.2.38. If $P v = \lambda v$ then $P^2 v = \lambda^2 v$. Since $P v = P^2 v$, we find $\lambda v = \lambda^2 v$. Since $v \neq 0$, it
follows that $\lambda^2 = \lambda$, so the only eigenvalues are $\lambda = 0, 1$. All $v \in \operatorname{rng} P$ are eigenvectors
with eigenvalue 1 since if $v = P u$, then $P v = P^2 u = P u = v$, whereas all $w \in \ker P$ are
null eigenvectors.
8.2.39. False. For example, $\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$ has eigenvalues $1, \; -\frac12 \pm \frac{\sqrt3}{2}\, i$.

8.2.40.
(a) According to Exercise 1.2.29, if $z = (1, 1, \ldots, 1)^T$, then A z is the vector of row sums of
A, and hence, by the assumption, A z = z. Thus, z is an eigenvector with eigenvalue 1.
(b) Yes, since the column sums of A are the row sums of AT , and Exercise 8.2.35 says that
A and AT have the same eigenvalues.
8.2.41.
(a) If $Q v = \lambda v$, then $Q^T v = Q^{-1} v = \lambda^{-1} v$ and so $\lambda^{-1}$ is an eigenvalue of $Q^T$. Furthermore, Exercise 8.2.35 says that a matrix and its transpose have the same eigenvalues.
(b) If $Q v = \lambda v$, then, by Exercise 5.3.16, $\| v \| = \| Q v \| = | \lambda |\, \| v \|$, and hence $| \lambda | = 1$.
Note that this proof also applies to complex eigenvalues/eigenvectors, with $\| \cdot \|$ denoting
the Hermitian norm in $C^n$.
(c) Let $\lambda = e^{i\theta}$ be the eigenvalue, with eigenvector $v = x + i y$. Then
$e^{i\theta}\, v^T v = (Q v)^T v = v^T Q^T v = v^T Q^{-1} v = e^{-i\theta}\, v^T v$.
Thus, if $e^{i\theta} \neq e^{-i\theta}$, which happens if and only if it is not real, then
$0 = v^T v = \| x \|^2 - \| y \|^2 + 2\, i\, x \cdot y$,
and so the result follows from taking real and imaginary parts of this equation.
8.2.42.
(a) According to Exercise 8.2.36, a 3 × 3 orthogonal matrix has at least one real eigenvalue,
which by Exercise 8.2.41 must be $\pm 1$. If the other two eigenvalues are complex conjugate, $\mu \pm i \nu$, then the product of the eigenvalues is $\pm(\mu^2 + \nu^2)$. Since this must equal the
determinant of Q, which, by assumption, is positive, we conclude that the real eigenvalue
must be $+1$. Otherwise, all the eigenvalues of Q are real, and they cannot all equal $-1$
as otherwise its determinant would be negative.
(b) True. It must either have three real eigenvalues of $\pm 1$, of which at least one must be $-1$
as otherwise its determinant would be $+1$, or a complex conjugate pair of eigenvalues
$\lambda, \overline{\lambda}$, and its determinant is $-1 = \pm | \lambda |^2$, so its real eigenvalue must be $-1$ and its complex eigenvalues $\mu \pm i \nu$.
8.2.43.
(a) The axis of the rotation is the eigenvector v corresponding to the eigenvalue $+1$. Since
Q v = v, the rotation fixes the axis, and hence must rotate around it. Choose an orthonormal basis $u_1, u_2, u_3$, where $u_1$ is a unit eigenvector in the direction of the axis of
rotation, while $u_2 + i\, u_3$ is a complex eigenvector for the eigenvalue $e^{i\theta}$. In this basis, Q
has matrix form $\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}$, where $\theta$ is the angle of rotation.

(b) The axis is the eigenvector $(2, 5, 1)^T$ for the eigenvalue 1. The complex eigenvalue is $\frac{7}{13} + i\, \frac{2\sqrt{30}}{13}$, and so the angle is $\theta = \cos^{-1} \frac{7}{13} \approx 1.00219$.
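The angle quoted in part (b) can be recovered from the complex eigenvalue $\frac{7}{13} + i\, \frac{2\sqrt{30}}{13}$ directly. A minimal sketch using only the standard library:

```python
import cmath
import math

# Complex eigenvalue quoted in part (b).
lam = complex(7 / 13, 2 * math.sqrt(30) / 13)

modulus = abs(lam)          # an orthogonal matrix has eigenvalues of modulus 1
theta = cmath.phase(lam)    # rotation angle in radians; equals acos(7/13) here

print(modulus, theta)
```

Since the real part is $\cos\theta$ and the imaginary part is $\sin\theta > 0$, the phase of the eigenvalue and $\cos^{-1}(7/13)$ agree.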

8.2.44. In general, besides the trivial invariant subspaces {0} and $R^3$, the axis of rotation and
its orthogonal complement plane are invariant. If the rotation is by 180°, then any line in
the orthogonal complement plane, as well as any plane spanned by such a line and the axis
of rotation, are also invariant. If R = I, then every subspace is invariant.
8.2.45.
(a) $(Q - I)^T (Q - I) = Q^T Q - Q - Q^T + I = 2 I - Q - Q^T = K$ and hence K is a Gram
matrix, which is positive semi-definite by Theorem 3.28.
(b) The Gram matrix is positive definite if and only if $\ker(Q - I) = \{0\}$, which means that
Q does not have an eigenvalue of 1.
8.2.46. If Q = I, then we have a translation. Otherwise, we decompose b = c + d, where
$c \in \operatorname{rng}(Q - I)$ while $d \in \operatorname{coker}(Q - I) = \ker(Q^T - I)$. Thus, $c = (I - Q)\, a$ for some a, while
$Q^T d = d$, and so $d = Q d$, so d belongs to the axis of the rotation represented by Q. Thus,
referring to (7.41), $F(x) = Q(x - a) + a + d$ represents either a rotation around the center
point a, when d = 0, or a screw around the line in the direction of the axis of Q passing
through the point a, when $d \neq 0$.
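8.2.47.
(a) $M_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$: eigenvalues $1, -1$; eigenvectors $(1, 1)^T$, $(-1, 1)^T$;
$M_3 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$: eigenvalues $-\sqrt2, 0, \sqrt2$; eigenvectors $(1, -\sqrt2, 1)^T$, $(-1, 0, 1)^T$, $(1, \sqrt2, 1)^T$.
(b) The j-th entry of the eigenvalue equation $M_n v_k = \lambda_k v_k$ reads
$\sin \frac{(j-1)\, k \pi}{n+1} + \sin \frac{(j+1)\, k \pi}{n+1} = 2 \cos \frac{k \pi}{n+1}\, \sin \frac{j k \pi}{n+1}$,
which follows from the trigonometric identity $\sin\alpha + \sin\beta = 2 \cos \frac{\alpha - \beta}{2}\, \sin \frac{\alpha + \beta}{2}$.
These are all the eigenvalues because an n × n matrix has at most n distinct eigenvalues.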

8.2.48. We have $A = a I + b M_n$, so by Exercises 8.2.19 and 8.2.47 it has the same eigenvectors
as $M_n$, while its corresponding eigenvalues are $a + b \lambda_k = a + 2 b \cos \frac{k \pi}{n+1}$ for k = 1, . . . , n.
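The closed-form eigenpairs of the tridiagonal matrix $M_n$ (ones on the sub- and super-diagonal, zeros elsewhere) and of $a I + b M_n$ can be checked numerically. The following is a small sketch in plain Python; the helper names and the values of n, a, b in the test run are illustrative choices, not part of the exercises.

```python
import math

def tridiagonal_M(n):
    """n x n matrix with 1's on the sub- and super-diagonal and 0 elsewhere."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]

def eigenpair(n, k):
    """Closed-form eigenvalue 2 cos(k pi/(n+1)) and eigenvector (sin(j k pi/(n+1)))_j."""
    lam = 2 * math.cos(k * math.pi / (n + 1))
    v = [math.sin(j * k * math.pi / (n + 1)) for j in range(1, n + 1)]
    return lam, v

def residual(M, lam, v):
    """Largest entry of |M v - lam v|; near zero for a true eigenpair."""
    n = len(v)
    return max(abs(sum(M[i][j] * v[j] for j in range(n)) - lam * v[i])
               for i in range(n))

n, a, b = 6, 3.0, -0.5      # illustrative size and coefficients
M = tridiagonal_M(n)
A = [[a * (i == j) + b * M[i][j] for j in range(n)] for i in range(n)]

worst_M = max(residual(M, *eigenpair(n, k)) for k in range(1, n + 1))
worst_A = max(residual(A, a + b * eigenpair(n, k)[0], eigenpair(n, k)[1])
              for k in range(1, n + 1))
print(worst_M, worst_A)
```

Both residuals should be at the level of floating-point rounding error.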
8.2.49. For k = 1, . . . , n,
$\lambda_k = 2 \cos \frac{2 k \pi}{n}$, $\quad v_k = \Bigl( \cos \frac{2 k \pi}{n},\ \cos \frac{4 k \pi}{n},\ \cos \frac{6 k \pi}{n},\ \ldots,\ \cos \frac{2 (n-1) k \pi}{n},\ 1 \Bigr)^T$.

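8.2.50. Note first that if $A v = \lambda v$, then $D \begin{pmatrix} v \\ 0 \end{pmatrix} = \begin{pmatrix} A v \\ 0 \end{pmatrix} = \lambda \begin{pmatrix} v \\ 0 \end{pmatrix}$, and so $\begin{pmatrix} v \\ 0 \end{pmatrix}$ is an eigenvector for D with eigenvalue $\lambda$. Similarly, each eigenvalue $\mu$ and eigenvector w of B gives
an eigenvector $\begin{pmatrix} 0 \\ w \end{pmatrix}$ of D. Finally, to check that D has no other eigenvalue, we compute
$D \begin{pmatrix} v \\ w \end{pmatrix} = \begin{pmatrix} A v \\ B w \end{pmatrix} = \lambda \begin{pmatrix} v \\ w \end{pmatrix}$, and hence, if $v \neq 0$, then $\lambda$ is an eigenvalue of A,
while if $w \neq 0$, then it must also be an eigenvalue for B.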
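8.2.51. (a) Follows by direct computation:
$p_A(A) = A^2 - (a + d) A + (a d - b c) I = \begin{pmatrix} a^2 + b c & a b + b d \\ a c + c d & b c + d^2 \end{pmatrix} - (a + d) \begin{pmatrix} a & b \\ c & d \end{pmatrix} + (a d - b c) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.
(b) By part (a), $O = A^{-1} p_A(A) = A - (\operatorname{tr} A)\, I + (\det A)\, A^{-1}$, and the formula follows upon
solving for $A^{-1}$. (c) tr A = 4, det A = 7 and one checks $A^2 - 4 A + 7 I = O$.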
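The 2 x 2 Cayley-Hamilton identity and the resulting inverse formula $A^{-1} = \bigl( (\operatorname{tr} A)\, I - A \bigr) / \det A$ can be illustrated as follows. This is a sketch only: the matrix below is a hypothetical example chosen here so that tr A = 4 and det A = 7, and is not necessarily the matrix of part (c).

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Hypothetical matrix with trace 4 and determinant 7 (an assumption).
A = [[F(1), F(-2)], [F(2), F(3)]]
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
I = [[F(1), F(0)], [F(0), F(1)]]

# Cayley-Hamilton: A^2 - (tr A) A + (det A) I should be the zero matrix.
A2 = matmul(A, A)
residual = [[A2[i][j] - tr * A[i][j] + det * I[i][j] for j in range(2)]
            for i in range(2)]

# Inverse obtained by solving the identity for A^{-1}.
A_inv = [[(tr * I[i][j] - A[i][j]) / det for j in range(2)] for i in range(2)]

print(residual)           # zero matrix
print(matmul(A, A_inv))   # identity matrix
```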
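8.2.52.
(a) $B v = (A - v\, b^T) v = A v - (b \cdot v) v = (\lambda - \beta) v$, where $\beta = b \cdot v$.
(b) $B (w + c v) = (A - v\, b^T)(w + c v) = \mu w + \bigl( c (\lambda - \beta) - b \cdot w \bigr) v = \mu (w + c v)$ provided
$c = b \cdot w / (\lambda - \beta - \mu)$.
(c) Set $B = A - \lambda_1 v_1 b^T$ where $v_1$ is the first eigenvector of A and b is any vector such that
$b \cdot v_1 = 1$. For example, we can set $b = v_1 / \| v_1 \|^2$. (Wielandt deflation, [ 10 ], chooses
$b = r_j / (\lambda_1 v_{1,j})$ where $v_{1,j}$ is any nonzero entry of $v_1$ and $r_j$ is the corresponding row
of A.)
(d) (i) The eigenvalues of A are 6, 2 and the eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -3 \\ 1 \end{pmatrix}$. The deflated matrix $B = A - \frac{\lambda_1 v_1 v_1^T}{\| v_1 \|^2} = \begin{pmatrix} 0 & 0 \\ -2 & 2 \end{pmatrix}$ has eigenvalues 0, 2 and eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
(ii) The eigenvalues of A are 4, 3, 1 and the eigenvectors $\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$. The deflated matrix $B = A - \frac{\lambda_1 v_1 v_1^T}{\| v_1 \|^2} = \begin{pmatrix} \frac53 & \frac13 & -\frac43 \\ \frac13 & \frac23 & \frac13 \\ -\frac43 & \frac13 & \frac53 \end{pmatrix}$ has eigenvalues 0, 3, 1 and
eigenvectors $\begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$.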
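The deflation step $B = A - \lambda_1 v_1 b^T$ with $b = v_1 / \| v_1 \|^2$ can be tried out on a small example. The sketch below assumes a 2 x 2 matrix with eigenvalues 6 and 2 and dominant eigenvector $(1, 1)^T$; the function names are illustrative only.

```python
from fractions import Fraction as F

def deflate(A, lam1, v1):
    """Return B = A - lam1 * v1 b^T with b = v1 / ||v1||^2, so that b . v1 = 1."""
    n = len(A)
    norm_sq = sum(x * x for x in v1)
    b = [x / norm_sq for x in v1]
    return [[A[i][j] - lam1 * v1[i] * b[j] for j in range(n)] for i in range(n)]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[F(3), F(3)], [F(1), F(5)]]   # assumed example: eigenvalues 6 and 2
v1 = [F(1), F(1)]                  # eigenvector for lambda_1 = 6
B = deflate(A, F(6), v1)

print(B)                        # deflated matrix
print(apply(B, v1))             # zero vector: v1 now belongs to eigenvalue 0
print(apply(B, [F(0), F(1)]))   # 2 * (0, 1): the eigenvalue 2 is retained
```

The eigenvalue 6 is replaced by 0 while the other eigenvalue is unchanged, which is the behaviour parts (a) to (c) predict.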

8.3.1. (a) Complete; dim = 1 with basis ( 1, 1 )T . (b) Not complete; dim = 1 with basis
( 1, 0 )T . (c) Complete; dim = 1 with basis ( 0, 1, 0 )T . (d) Not an eigenvalue. (e) Complete;
dim = 2 with basis ( 1, 0, 0 )T , ( 0, 1, 1 )T . (f ) Complete; dim = 1 with basis ( i , 0, 1 )T .
(g) Not an eigenvalue. (h) Not complete; dim = 1 with basis ( 1, 0, 0, 0, 0 )T .
8.3.2.

(b) Eigenvalues:
(c) Eigenvalues:
(d) Eigenvalues:
(e) Eigenvalue 3
(f ) Eigenvalue 2
(g) Eigenvalue 3

2
; not complete.
1
!
!
1
2
; complete.
,
2, 2; eigenvectors:
1
1
!
1 i
1 2 i ; eigenvectors:
; complete.
2
!
!
i
i
0, 2 i ; eigenvectors:
,
; complete.
1
1
0 1 0 1
1
1
C B C
has eigenspace basis B
@ 1 A, @ 0 A; not complete.
0 1 0
1 1
0
0
1
1
0
2
C B C
B
C
has eigenspace basis B
@ 0 A, @ 1 A; eigenvalue 2 has @ 1 A; complete.
1
0
1
1
0
0 1
1
0
C
B
C
has eigenspace basis B
@ 1 A; eigenvalue 2 has @ 1 A; not complete.
1
1

(a) Eigenvalue: 2; eigenvector:


(h) Eigenvalue 0 has

B C
B0C
B C;
@0A

eigenvalue 1 has

B C
B0C
B C;
@1A

eigenvalue 1 has

11
1 0
1
2
B C B
0 C B 1 C
C
B C, B
C; eigenvalue 2 has
(i) Eigenvalue 0 has eigenspace basis B
@1A @ 0A
0
1
0

8.3.3.

1
,
1

(a) Eigenvalues: 2, 4; the eigenvectors

1
1

sion is 0.

(c) Eigenvalue: 1; there is only one eigenvector v1 =


space of R 2 .

(d)

(e)

(f )

(g)

(h)

1 0

B C B C
B3C B1C
B C, B C;
@1A @0A

complete.

0 1 1
1
B
C
B 1C
B
C; not complete.
@ 5 A
1
0

form a basis for R 2 .

i
,
1

(b) Eigenvalues: 1 3 i , 1 + 3 i ; the eigenvectors

1
0

i
1
!

, are not real, so the dimen-

spanning a one-dimensional sub-

1
1
C
B C
The eigenvalue 1 has eigenvector B
@ 0 A, while the eigenvalue 1 has eigenvectors @ 1 A,
2
0
0
1
0
B
C
3
@ 1 A. The eigenvectors form a basis for R .
0
0 1
0 1
0
1
C
B C
The eigenvalue 1 has eigenvector @ 0 A, while the eigenvalue 1 has eigenvector B
@ 0 A.
1
0
The eigenvectors span a two-dimensional subspace0of R 31. 0 1
0 1
0
0
8
C B C
B C
The eigenvalues are 2, 0, 2. The eigenvectors are B
@ 1 A, @ 1 A, and @ 5 A, forming a
1
1
7
3
basis for R .
0
1 0 1
0 1
i
i
0
C B C
B C
The eigenvalues are i , i , 1. The eigenvectors are B
@ 0 A, @ 0 A and @ 1 A. The real
1
1
0
eigenvectors span only a one-dimensional subspace of R 3 .
0 1 0 1 0
1
0
4
1
B C B C B
1C B3C B i C
C
B C, B C , B
C,
The eigenvalues are 1, 1, i 1, i + 1. The eigenvectors are B
@0A @2A @i A
0
1
0
6
1
1
B
C
Bi C
4
B
C. The real eigenvectors span a two-dimensional subspace of R .
@ i A
1

8.3.4. Cases (a,b,d,f,g,h) have eigenvector bases of C n .


8.3.5. Examples: (a) $\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$, (b) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$.

8.3.6.
(a) True. The standard basis vectors are eigenvectors.
(b) False. The Jordan matrix $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ is incomplete since $e_1$ is the only eigenvector.


8.3.7. According to Exercise 8.2.19, every eigenvector of A is an eigenvector of c A + d I with
eigenvalue $c \lambda + d$, and hence if A has a basis of eigenvectors, so does c A + d I.
8.3.8. (a) Every eigenvector of A is an eigenvector of $A^2$ with eigenvalue $\lambda^2$, and hence if A
has a basis of eigenvectors, so does $A^2$. (b) $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ with $A^2 = O$.
8.3.9. Suppose $A v = \lambda v$. Write $v = \sum_{i=1}^n c_i v_i$. Then $A v = \sum_{i=1}^n c_i \lambda_i v_i$ and hence, by linear
independence, $\lambda_i c_i = \lambda c_i$. Thus, either $\lambda = \lambda_i$ or $c_i = 0$.
8.3.10. (a) If $A v = \lambda v$, then, by induction, $A^n v = \lambda^n v$, and hence v is an eigenvector with
eigenvalue $\lambda^n$. (b) Conversely, if A is complete and $A^n$ has eigenvalue $\mu$, then at least one
of its complex nth roots $\lambda = \sqrt[n]{\mu}$ is an eigenvalue of A. Indeed, the eigenvector basis of A
is an eigenvector basis of $A^n$, and hence, using Exercise 8.3.9, every eigenvalue of $A^n$ is the
nth power of an eigenvalue of A.
8.3.11. As in Exercise 8.2.32, if v is an eigenvector of A then $S^{-1} v$ is an eigenvector of B. Moreover, if $v_1, \ldots, v_n$ form a basis, so do $S^{-1} v_1, \ldots, S^{-1} v_n$; see Exercise 2.4.21 for details.
8.3.12. According to Exercise 8.2.17, its only eigenvalue is $\lambda$, the common value of its diagonal
entries, and so all eigenvectors belong to $\ker(U - \lambda I)$. Thus U is complete if and only if
$\dim \ker(U - \lambda I) = n$, which happens if and only if $U - \lambda I = O$.
8.3.13. Let $V = \ker(A - \lambda I)$. If $v \in V$, then $A v \in V$ since $(A - \lambda I) A v = A (A - \lambda I) v = 0$.
8.3.14.
(a) Let $v = x + i y$, $w = \overline{v} = x - i y$ be the corresponding eigenvectors, so $x = \frac12 v + \frac12 w$,
$y = -\frac12 i\, v + \frac12 i\, w$. Thus, if c x + d y = 0 where c, d are real scalars, then
$\bigl( \frac12 c - \frac12 i\, d \bigr) v + \bigl( \frac12 c + \frac12 i\, d \bigr) w = 0$.
Since v, w are eigenvectors corresponding to distinct eigenvalues $\mu + i \nu \neq \mu - i \nu$,
Lemma 8.13 implies they are linearly independent, and hence
$\frac12 c - \frac12 i\, d = \frac12 c + \frac12 i\, d = 0$, which implies c = d = 0.
(b) Same reasoning: $x_j = \frac12 v_j + \frac12 w_j$, $y_j = -\frac12 i\, v_j + \frac12 i\, w_j$, where $v_j$, $w_j = \overline{v_j}$ are the
complex eigenvectors corresponding to the eigenvalues $\mu_j + i \nu_j$, $\mu_j - i \nu_j$. Thus, if
$0 = c_1 x_1 + d_1 y_1 + \cdots + c_k x_k + d_k y_k$
$= \bigl( \frac12 c_1 - \frac12 i\, d_1 \bigr) v_1 + \bigl( \frac12 c_1 + \frac12 i\, d_1 \bigr) w_1 + \cdots + \bigl( \frac12 c_k - \frac12 i\, d_k \bigr) v_k + \bigl( \frac12 c_k + \frac12 i\, d_k \bigr) w_k$,
then, again by Lemma 8.13, the complex eigenvectors $v_1, \ldots, v_k, w_1, \ldots, w_k$ are linearly
independent, and so $\frac12 c_1 - \frac12 i\, d_1 = \frac12 c_1 + \frac12 i\, d_1 = \cdots = \frac12 c_k - \frac12 i\, d_k = \frac12 c_k + \frac12 i\, d_k = 0$.
This implies $c_1 = \cdots = c_k = d_1 = \cdots = d_k = 0$, proving linear independence of
$x_1, \ldots, x_k, y_1, \ldots, y_k$.

8.3.15. In all cases, $A = S \Lambda S^{-1}$.


!

(a) S =

3
1

3
, =
2

0
0

(b) S =

2
1

1
, =
1

3
0

(c) S =

35 +
1

1
5

35
1

0
.
3
!

0
.
1
1
5

, =

1 + i
0

0
1 i

(d) S =

0
(e) S = B
@1
0

1
10
2
C
1 C, = B
0
@
2 A
0
1

1
1
0

B
B
@0

21
10
7

1
0
B
6C
A, = @ 0
3
0

0
1
0

0
0C
A.
3
1

0
7
0
1

0
0C
A.
1

2 + 3i
1 53 i 15 + 53 i 1
C
B
B 5
(f ) S = @
1
1
0 A, = @ 0
0
1
1
1
0
1
0
1
1
0 1
2 0
0
B
(g) S = B
0C
0C
@ 0 1
A, = @ 0 2
A.
0
1
1
0 0 3
(h) S =

12
0
B
B 1
(i) S = B
@ 0
1
0

1
B
B 1
(j) S = B
B
@ 0
0
1
1

8.3.16.

1
0

=
!

3
2

i
1
2 2i
1+ i
1

1
3
0
0

23 i
21 + 2 i
1 i
1

1
B
B0
=B
@0
0

C
C
C,
C
A

10
0 AB

@
1 5
0
2
1
!0
0 @ 2i 12 A
;
1
i
i
2
2

10
1+ 5
1 5
2
A@
2

1+ 5
@
2

0
0 C
A.
2

2
0 0 0
3 1 0
B
0
1
0 0C
2 0 1C
C
B
C
C.
C, = B
@ 0
0 1 0A
6 0 0A
0
0 0 2
0 0 0
1
0
1
1 0 1
1
0 0 0
C
B
C
0 1 0C
B 0 1 0 0 C
C, = B
C.
@ 0
1 0 1A
0 1 0A
0 1 0
0
0 0 1

B
B 3
B
@ 0

0
2 3i
0

0
1
0
0

0
0
i
0

1
5
1

0
0C
C
C.
0A
i

1 5

2 5 C
A.
1+ 5
2 5

0 1
i i
i
a rotation does not stretch any real
8.3.17.
=
1
0
1
1
0
vectors, but somehow corresponds to two complex stretches.
8.3.18. 0

0
(a) B
@1
0

(b)

0
B
B
@

1
0
0
5
13

0
12
13

0
i
B
0C
A=@1
1
0
0
1
0

1
12
13 C
0C
A
5
13

0
B

=B
@

10

i
1
0

0
i
B
0C
A@ 0
1
0

i
0
1

i
0
1

0CB
B
B
1C
AB
@
0

0
i
0

5+12 i
13

0
0

10

0 B2
i
0C
AB
@ 2
1
0
0
512 i
13

1
2
1
2

0
C
0C
A,

0 1 12

8.3.19.
(a) Yes: distinct real eigenvalues 3,
2.
(b) No: complex eigenvalues 1 i 6.
(c) No: complex eigenvalues 1, 21 25 i .
(d) No: incomplete eigenvalue 1 (and complete eigenvalue 2).
(e) Yes: distinct real eigenvalues 1, 2, 4.
(f ) Yes: complete real eigenvalues 1, 1.


0C
2
CB
CB
B i
0C
2
A@
0
1

0
0
1

1
2
1
2

C
C
C.
A

1
8.3.20. In all cases, A
.
! = S S
!
1 1
1+ i
0
(a) S =
, =
.
1
1
0
1 + i
!
!
2 + i 0
1 i
0
(b) S =
, =
.
1
1
0
2+ i
!
!
1
1
4
0
.
(c) S = 1 2 2 i , =
0 1
1
1
1
0
0
1
0
0
1
1
B
1 C
3
+
i
,

=
(d) S = B
0
1

i
A
@
@ 1 1 i
5
5
0
1
1
0
0

0
0
1 i

C
A.

8.3.21. Use the formula $A = S \Lambda S^{-1}$. For parts (e,f ) you can choose any other eigenvalues and
eigenvectors you want to fill in S and $\Lambda$.
1
!
!
6
6 2
7
4
3 0
, (b) B
,
(a)
0C
(c)
@ 2 2
A,
8 5
0 3
6
6
4
0
1
1
0
1
!
0 0
4
3
4
2 0C
B
C
B
1

3 ,
(e) example: @ 0 1
(d)
0 A, (f ) example: @ 2
3 0 A.
6 3
0 0 2
2 2 0
!
!
!
1 0
11
6
4 6
.
, (c)
, (b)
8.3.22. (a)
0 2
18 10
3
5
8.3.23. Let $S_1$ be the eigenvector matrix for A and $S_2$ the eigenvector matrix for B. Thus, by
the hypothesis $S_1^{-1} A S_1 = \Lambda = S_2^{-1} B S_2$ and hence $B = S_2 S_1^{-1} A S_1 S_2^{-1} = S^{-1} A S$ where
$S = S_1 S_2^{-1}$.
8.3.24. The hypothesis says $B = P A P^T = P A P^{-1}$ where P is the corresponding permutation
matrix, and we are using Exercise 1.6.14 to identify $P^T = P^{-1}$. Thus A and B are similar
matrices, and so, according to Exercise 8.2.32, have the same eigenvalues. If v is an eigenvector for A, then the vector w = P v obtained by permuting its entries is an eigenvector
for B with the same eigenvalue, since $B w = P A P^T P v = P A v = \lambda P v = \lambda w$.
8.3.25. True. Let $\lambda_j = a_{jj}$ denote the jth diagonal entry of A, which is the same as the jth
eigenvalue. We will prove that the corresponding eigenvector is a linear combination of
$e_1, \ldots, e_j$, which is equivalent to the eigenvector matrix S being upper triangular. We
use induction on the size n. Since A is upper triangular, it leaves the subspace V spanned
by $e_1, \ldots, e_{n-1}$ invariant, and hence its restriction to the subspace is represented by an
(n − 1) × (n − 1) upper triangular matrix. Thus, by induction and completeness, A possesses
n − 1 eigenvectors of the required form. The remaining eigenvector $v_n$ cannot belong to V
(otherwise the eigenvectors would be linearly dependent) and hence must involve $e_n$.
8.3.26. The diagonal entries are all eigenvalues, and so are obtained from each other by permutation. If all eigenvalues are distinct, then there are n! different diagonal forms; otherwise, if it has distinct eigenvalues of multiplicities $j_1, \ldots, j_k$, there are $\frac{n!}{j_1! \cdots j_k!}$ distinct
diagonal forms.
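8.3.27. Let $A = S \Lambda S^{-1}$. Then $A^2 = I$ if and only if $\Lambda^2 = I$, and so all its eigenvalues are $\pm 1$.
Examples: $A = \begin{pmatrix} 3 & -2 \\ 4 & -3 \end{pmatrix}$ with eigenvalues $1, -1$ and eigenvectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$; or, even
simpler, $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.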


8.3.28.
(a) If $A = S \Lambda S^{-1}$ and $B = S D S^{-1}$ where $\Lambda$, D are diagonal, then $A B = S \Lambda D S^{-1} =
S D \Lambda S^{-1} = B A$, since diagonal matrices commute.
(b) According to Exercise 1.2.12(e), the only matrices that commute with an n × n diagonal
matrix with distinct entries are other diagonal matrices. Thus, if A B = B A, and $A =
S \Lambda S^{-1}$ where all entries of $\Lambda$ are distinct, then $D = S^{-1} B S$ commutes with $\Lambda$ and
hence is a diagonal matrix.
(c) No, the matrix $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ commutes with the identity matrix, but is not diagonalizable.
See also Exercise 1.2.14.

8.4.1.

1
1
2
1
,
.
(a) Eigenvalues: 5, 10; eigenvectors:
5 1!
5 2
!
1
1
1
1
(b) Eigenvalues: 7, 3; eigenvectors:
,
.
1
2
2 1

7 + 13 7 13
(c) Eigenvalues:
,
;
0
1
0

2
2
3 13
2
2
eigenvectors: q
@ 2 A, q
@
1
26 6 13
26 + 6 13
0 4 1 0
1 0
4 1

(d) Eigenvalues: 6, 1, 4;

(e) Eigenvalues: 12, 9, 2;

eigenvectors:

eigenvectors:

B5 2C B
B 3 C B
B C, B
B5 2C @
A
@
1
2
1 0
0
1
6
C B
B
C B
B
B 1 C, B
C B
B
6
A @
@
2

5
4
5

3+ 13
A.
2

5 2C
3 C
C.

5 2C
A
1
2
1 0
1
1
C B 2C
C B
C
C, B 1 C.
C B
C
A @ 2A

C B
C B
C, B
A B
@

1
3
1
3
1
3

8.4.2.

(a) Eigenvalues $\frac52 \pm \frac12 \sqrt{17}$; positive definite. (b) Eigenvalues $-3, 7$; not positive definite.
(c) Eigenvalues 0, 1, 3; positive semi-definite. (d) Eigenvalues $6, \; 3 \pm \sqrt3$; positive definite.


8.4.3. Use the fact that $K = -N$ is positive definite and so has all positive eigenvalues. The
eigenvalues of $N = -K$ are $-\lambda_j$ where $\lambda_j$ are the eigenvalues of K. Alternatively, mimic
the proof in the book for the positive definite case.
8.4.4. If all eigenvalues are distinct, there are $2^n$ different bases, governed by the choice of sign
in each of the unit eigenvectors $u_k$. If the eigenvalues are repeated, there are infinitely
many, since any orthonormal basis of each eigenspace will contribute to an orthonormal
eigenvector basis of the matrix.
8.4.5.
(a) The characteristic equation $p(\lambda) = \lambda^2 - (a + d) \lambda + (a d - b c) = 0$ has real roots if and
only if its discriminant is non-negative: $0 \leq (a + d)^2 - 4 (a d - b c) = (a - d)^2 + 4 b c$, which
is the necessary and sufficient condition for real eigenvalues.
(b) If A is symmetric, then b = c and so the discriminant is $(a - d)^2 + 4 b^2 \geq 0$.
(c) Example: $\begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$.


8.4.6.
(a) If $A v = \lambda v$ and $v \neq 0$ is real, then
$\lambda \| v \|^2 = (A v) \cdot v = (A v)^T v = v^T A^T v = -v^T A v = -v \cdot (A v) = -\lambda \| v \|^2$,
and hence $\lambda = 0$.
(b) Using the Hermitian dot product,
$\lambda \| v \|^2 = (A v) \cdot v = v^T A^T \overline{v} = -v^T A \overline{v} = -v \cdot (A v) = -\overline{\lambda}\, \| v \|^2$,
and hence $\lambda = -\overline{\lambda}$, so $\lambda$ is purely imaginary.
(c) Since det A = 0, cf. Exercise 1.9.10, at least one of the eigenvalues of A must be 0.
(d) The characteristic polynomial of $A = \begin{pmatrix} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0 \end{pmatrix}$ is $-\lambda^3 - \lambda (a^2 + b^2 + c^2)$ and
hence the eigenvalues are $0, \; \pm i \sqrt{a^2 + b^2 + c^2}$, and so are
all zero if and only if A = O.
(e) The eigenvalues are: (i) 2 i , (ii) 0, 5 i , (iii) 0, 3 i , (iv ) 2 i , 3 i .
8.4.7.
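(a) Let $A v = \lambda v$. Using the Hermitian dot product,
$\lambda \| v \|^2 = (A v) \cdot v = v^T A^T \overline{v} = v^T \overline{A}\, \overline{v} = v \cdot (A v) = \overline{\lambda}\, \| v \|^2$,
and hence $\lambda = \overline{\lambda}$, which implies that the eigenvalue $\lambda$ is real.
(b) Let $A v = \lambda v$, $A w = \mu w$. Then
$\lambda\, v \cdot w = (A v) \cdot w = v^T A^T \overline{w} = v^T \overline{A w} = v \cdot (A w) = \mu\, v \cdot w$,
since $\mu$ is real. Thus, if $\lambda \neq \mu$ then $v \cdot w = 0$.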
!
!

(c) (i) Eigenvalues 5; eigenvectors: (2 5) i , (2 + 5) i .


1!
1
!
2 i
2 + i
(ii) Eigenvalues 4, 2; eigenvectors:
,
.
1
5
0 1 0
1 0
1
1
1
1
C

B C B C B
(iii) Eigenvalues 0, 2; eigenvectors: @ 0 A, @ i 2 A, @ i 2 A.
1
1
1

8.4.8.
(a) Rewrite (8.31) as $M^{-1} K v = \lambda v$, and so v is an eigenvector for $M^{-1} K$ with eigenvalue
$\lambda$. The eigenvectors are the same.
(b) $M^{-1} K$ is not necessarily symmetric, and so we can't use Theorem 8.20 directly. If v is
a generalized eigenvector, then since K, M are real matrices, $K \overline{v} = \overline{\lambda} M \overline{v}$. Therefore,
$\lambda \| v \|^2 = \lambda v^T M \overline{v} = (\lambda M v)^T \overline{v} = (K v)^T \overline{v} = v^T (K \overline{v}) = \overline{\lambda}\, v^T M \overline{v} = \overline{\lambda}\, \| v \|^2$,
and hence $\lambda$ is real.
(c) If $K v = \lambda M v$, $K w = \mu M w$, with $\lambda$, $\mu$ and v, w real, then
$\lambda \langle v , w \rangle = (\lambda M v)^T w = (K v)^T w = v^T (K w) = \mu\, v^T M w = \mu \langle v , w \rangle$,
and so if $\lambda \neq \mu$ then $\langle v , w \rangle = 0$, proving orthogonality.
(d) If K > 0, then $\lambda \langle v , v \rangle = v^T (\lambda M v) = v^T K v > 0$, and so, since M is positive definite,
$\lambda > 0$.
(e) Part (c) proves that the eigenvectors are orthogonal with respect to the inner product
induced by M, and so the result follows immediately from Theorem 5.5.
8.4.9.
(a) Eigenvalues:

5 1
3, 2;

eigenvectors:

(b) Eigenvalues: 2, 12 ; eigenvectors:

1
3
, 2 .
1
1!
!
1
1

2 .
,
1
1


1
2

(c) Eigenvalues: 7, 1; eigenvectors:

10

1
.
0

1 0

1 0

6
6
2
C B
C B C
(d) Eigenvalues: 12, 9, 2; eigenvectors: B
@ 3 A, @ 3 A, @ 1 A.
41 0 21 0 0 1
0
1
1
1
C B
C B 1C
(e) Eigenvalues: 3, 1, 0; eigenvectors: B
@ 2 A, @ 0 A, @ 2 A.
1
10 1 0 1 1
1
1
C B
C
(f ) 2 is a double eigenvalue with eigenvector basis B
@ 0 A, @ 1 A, while 1 is a simple eigen1
0
0
1
2
C
value with eigenvector B
@ 2 A. For orthogonality you need to select an M orthogonal
1
basis of the two-dimensional eigenspace, say by using GramSchmidt.
8.4.10. If $L[ v ] = \lambda v$, then, using the inner product,
$\lambda \| v \|^2 = \langle L[ v ] , v \rangle = \langle v , L[ v ] \rangle = \overline{\lambda}\, \| v \|^2$,
which proves that the eigenvalue $\lambda$ is real. Similarly, if $L[ w ] = \mu w$, then
$\lambda \langle v , w \rangle = \langle L[ v ] , w \rangle = \langle v , L[ w ] \rangle = \mu \langle v , w \rangle$,
and so if $\lambda \neq \mu$, then $\langle v , w \rangle = 0$.

8.4.11. As shown in the text, since $y_i \in V^\perp$, its image $A y_i \in V^\perp$ also, and hence $A y_i$ is a
linear combination of the basis vectors $y_1, \ldots, y_{n-1}$, proving the first statement. Furthermore, since $y_1, \ldots, y_{n-1}$ form an orthonormal basis, by (8.30), $b_{ij} = y_i \cdot A y_j = (A y_i) \cdot y_j =
b_{ji}$.
8.4.12.
(a)

0
B
B
B
B
B
B
B
B
B
B
B
@

1
0

1
1
0

0
1
1
..
.

0
0
1
..
.
0

0
1

0
..
.
1
0

(b) Using Exercise 8.2.13(c),

..
1
1
0

01
0C
C

C
C
C
C
C.
C
C
0C
C
1A

k = (S I ) k = 1 e2 k i /n k ,

and so k is an eigenvector of with corresponding eigenvalue 1 e2 k i /n .


(c) Since S is an orthogonal matrix, S T = S 1 , and so S T k = e 2 k i /n k . Therefore,
K k = (S T I )(S I )k = (2 I S S T )k

= 2 e2 k i /n e 2 k i /n k =

2 2 cos

k i
n

k ,

k i
.
n
(d) Yes, K > 0 since its eigenvalues are all positive; or note that K = T is a Gram
matrix, with ker = {0}.
k i
(n k) i
(e) Each eigenvalue 2 2 cos
= 2 2 cos
for k 6= 12 n is double, with a twon
n
dimensional eigenspace spanned by k and nk = k . The corresponding real eigenvectors are Re k = 12 k + 21 nk and Im k = 21i k 21i nk . On the other hand,
and hence k is an eigenvector of K with corresponding eigenvalue 2 2 cos


if k = 12 n (which requires that n be even), the eigenvector n/2 = ( 1, 1, 1, 1, . . . )T


is real.
8.4.13.
(a) The shift matrix has $c_1 = 1$, $c_i = 0$ for $i \neq 1$; the difference matrix has $c_0 = -1$, $c_1 = 1$,
and $c_i = 0$ for i > 1; the symmetric product K has $c_0 = 2$, $c_1 = c_{n-1} = -1$, and $c_i = 0$
for 1 < i < n − 1;
(b) The eigenvector equation
$C \omega_k = \bigl( c_0 + c_1 e^{2 k \pi i / n} + c_2 e^{4 k \pi i / n} + \cdots + c_{n-1} e^{2 (n-1) k \pi i / n} \bigr)\, \omega_k$
can either be proved directly, or by noting that
$C = c_0 I + c_1 S + c_2 S^2 + \cdots + c_{n-1} S^{n-1}$,
and using Exercise 8.2.13(c).
(c) This follows since the individual columns of $F_n = ( \omega_0 , \ldots , \omega_{n-1} )$ are the sampled
exponential eigenvectors, and so the columns of the matrix equation $C F_n = F_n \Lambda$ are
the eigenvector equations $C \omega_k = \lambda_k \omega_k$ for k = 0, . . . , n − 1.
1
1
(d) (i) Eigenvalues 3, 1; eigenvectors
,
.
1
1
1 0
0
0 1
1
1 C B
1 B

C B
B C B
3
3
3
3
1
3
1
, B 2 23 i
(ii) Eigenvalues 6, 2 2 i , 2 + 2 i ; eigenvectors @ 1 A, B
2 + 2 i C
A @
@

1
1
1
3
3

+
21 0
2 i
0 1 0
1 20
12
1
1
1
1
B C B
C B
C B
C
B 1 C B i C B 1 C B i C
C, B
C, B
C, B
C.
(iii) Eigenvalues 0, 2 2 i , 0, 2 + 2 i ; eigenvectors B
@ 1 A @ 1 A @ 1 A @ 1 A
1
1i 0 1
i
0 1 0
1 0
1
1
1
1
1
B C B
B
C B
C
1C B i C
C B 1 C B i C
B C, B
C, B
C, B
C.
(iv ) Eigenvalues 0, 2, 4, 2; eigenvectors B
@ 1 A @ 1 A @ 1 A @ 1 A
1
i
1
i

(e) The eigenvalues are (i) 6, 3, 3; (ii) 6, 4, 4, 2; (iii) $6, \frac{7 + \sqrt5}{2}, \frac{7 + \sqrt5}{2}, \frac{7 - \sqrt5}{2}, \frac{7 - \sqrt5}{2}$; (iv ) in
the n × n case, they are $4 + 2 \cos \frac{2 k \pi}{n}$ for k = 0, . . . , n − 1. The eigenvalues are real and
positive because the matrices are positive definite.
(f ) Cases (i,ii) in (d) and all matrices in part (e) are invertible. In general, an n × n circulant matrix is invertible if and only if none of the roots of the polynomial $c_0 + c_1 x + \cdots +
c_{n-1} x^{n-1} = 0$ is an nth root of unity: $x \neq e^{2 k \pi i / n}$.

8.4.14.
3
4

(a)

2
1

(b)

(c)

B
@1

4
3

1
B 5
@ 2

5
0

5
0

0
5

0
!
B
@

1
5
2
5

1
2
5C
A.
1
5

5C
A
1

0
1
2
1 2
1+ 2
!
! 1

B 42 2
C 3+ 2
1
0 B
42
4+2
2
B
B
C
2
@
=@
A
2
4
1
1
0 3 2
1+
42 2
4+2 2
4+2 2
0
1
0 1
0
1

1
1
2
1
1
1
6
2
3
6
6
6
3
0
0
C
B
B
1 0
B
CB
B 2
C
1 C
1
1
C
B
C
B
B

0
0
2 1A = B 6
@ 0 1 0 AB 2
3C
2
A
@
@
1 1
1
1
1
1
1
0 0 0

6
3
3
2
3
3


1
1
C
42 2 C
A.
1
4+2 2
1

C
C
C.
C
A

C
C
C.
A

(d)

1
2
0

B
@ 1

8.4.15.

B
1
B
0C
A=B
B
@
2

2
6
1
6
1
6

0
1

2
1
2

1
3
1
3
1
3

C 4
CB
CB 0
C@
A

0
2
0

0 CB
B
B
0C
AB
@
1

2
6

0
1
3

1
6
1
2
1
3

1
6
1
2
1
3

C
C
C.
C
A

0
1
1
! 2
1
1
0 B 5
5C 5
5C
(a)
=
@ 1
A
A.

0 10
2
2
5
5
05
1
1
!
!
1
1
1
1

5 2
5
0
B
2
2C
2
2C
=B
(b)
@
@
A
A.
2
5
1
1
1
1
0 10
2
2
2
2

1
0
0

0
1 3 13
3+ 13
3 13
!
2

7+
13
C
B 26613
B 266 13
0
2 1
266 13
26+6
13
2
C
B
B
@
A

(c)
=@
A
2
1
5
7 13 @ 3+ 13
2
2

2
266 13
26+6 13
26+6 13
26+6 13
1
1
0
0 4
4
1
0
3
4

0
1
35
1

5 2C 6 0
0
B
B5 2
5 2
5 2
2C
1 0 4
C
B 3
CB
B
C
3 C
4
4
3
CB 0 1
CB

(d) B
@0 1 3A = B
0C

0
C,
C
B
B5 2
A
@
5
5
5
5
2
A
@
A
@
4 3 1
4
3
1
1
1

0 0 4

0
5
2
5
2
2
2
2
1
1
0
0
0
1
1
1
1
2
0
1
1
1
6
3
2
6
6
6
12
0
0
C
C
B
B
6 4
1
CB
C
CB
B
1
1
1 CB 0 9 0 CB 1
1
1 C.
(e) B
6 1 C
@ 4
A=B
C
C
B
B 6
@
A
3
2
3
3
3
A
A
@
@
1 1 11
1
1
2
1
0 0 2

0
0
6
3
2
2
0
0
1
1
57
1
24
3
25 25 A, (b) @ 2
2 A. (c) None, since eigenvectors are not orthog8.4.16. (a) @
43
1
3
24

25
2
2
! 25

2
6

6
7

2
B 5
@ 1

05

2 0
. Note: even though the given eigenvectors are not orthogonal, one can
0 2
construct an orthogonal basis of the eigenspace.
onal. (d)

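8.4.17.
(a) $\frac12 \Bigl( \frac{3}{\sqrt{10}}\, x + \frac{1}{\sqrt{10}}\, y \Bigr)^2 + \frac{11}{2} \Bigl( -\frac{1}{\sqrt{10}}\, x + \frac{3}{\sqrt{10}}\, y \Bigr)^2 = \frac{1}{20} (3 x + y)^2 + \frac{11}{20} (-x + 3 y)^2$,
(b) $7 \Bigl( \frac{1}{\sqrt5}\, x + \frac{2}{\sqrt5}\, y \Bigr)^2 + 2 \Bigl( -\frac{2}{\sqrt5}\, x + \frac{1}{\sqrt5}\, y \Bigr)^2 = \frac75 (x + 2 y)^2 + \frac25 (-2 x + y)^2$,
(c) $-4 \Bigl( \frac{4}{5\sqrt2}\, x + \frac{3}{5\sqrt2}\, y - \frac{1}{\sqrt2}\, z \Bigr)^2 + \Bigl( -\frac35\, x + \frac45\, y \Bigr)^2 + 6 \Bigl( \frac{4}{5\sqrt2}\, x + \frac{3}{5\sqrt2}\, y + \frac{1}{\sqrt2}\, z \Bigr)^2$
$= -\frac{2}{25} (4 x + 3 y - 5 z)^2 + \frac{1}{25} (-3 x + 4 y)^2 + \frac{3}{25} (4 x + 3 y + 5 z)^2$,
(d) $\frac12 \Bigl( \frac{1}{\sqrt3}\, x + \frac{1}{\sqrt3}\, y + \frac{1}{\sqrt3}\, z \Bigr)^2 + \Bigl( -\frac{1}{\sqrt2}\, y + \frac{1}{\sqrt2}\, z \Bigr)^2 + 2 \Bigl( -\frac{2}{\sqrt6}\, x + \frac{1}{\sqrt6}\, y + \frac{1}{\sqrt6}\, z \Bigr)^2$
$= \frac16 (x + y + z)^2 + \frac12 (-y + z)^2 + \frac13 (-2 x + y + z)^2$,
(e) $2 \Bigl( \frac{1}{\sqrt2}\, x + \frac{1}{\sqrt2}\, y \Bigr)^2 + 9 \Bigl( -\frac{1}{\sqrt3}\, x + \frac{1}{\sqrt3}\, y + \frac{1}{\sqrt3}\, z \Bigr)^2 + 12 \Bigl( \frac{1}{\sqrt6}\, x - \frac{1}{\sqrt6}\, y + \frac{2}{\sqrt6}\, z \Bigr)^2$
$= (x + y)^2 + 3 (-x + y + z)^2 + 2 (x - y + 2 z)^2$.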
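A sum-of-squares decomposition such as $(x + y)^2 + 3 (-x + y + z)^2 + 2 (x - y + 2 z)^2$ in part (e) can be checked against the expanded quadratic form by comparing values on a grid of integer points. The sketch below assumes the expanded form is $6 x^2 - 8 x y + 2 x z + 6 y^2 - 2 y z + 11 z^2$, the form with eigenvalues 12, 9, 2 that appears in Exercise 8.4.36(c).

```python
def q(x, y, z):
    """Expanded quadratic form (assumed to be the one decomposed in part (e))."""
    return 6*x*x - 8*x*y + 2*x*z + 6*y*y - 2*y*z + 11*z*z

def q_squares(x, y, z):
    """Sum-of-squares form built from the eigenvalues 2, 9, 12."""
    return (x + y)**2 + 3*(-x + y + z)**2 + 2*(x - y + 2*z)**2

# Two quadratic forms in three variables that agree on this grid are identical.
grid = range(-3, 4)
mismatches = [(x, y, z) for x in grid for y in grid for z in grid
              if q(x, y, z) != q_squares(x, y, z)]
print(len(mismatches))
```

Integer arguments keep the comparison exact, so no tolerance is needed.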

0 1
0
1
0
1
8.4.18.
1
1
1
C
B
C
B
C
(a) 1 = 2 = 3, v1 = B
@ 1 A, v2 = @ 0 A, 3 = 0, v3 = @ 1 A;
0
1
1
(b) $\det A = \lambda_1 \lambda_2 \lambda_3 = 0$.
(c) A is positive semi-definite, but not positive definite since it has a zero eigenvalue.


C
C,
A

0
B
B

(d) u1 = B
B
(e)

(f )

B
@

2
1
1
1

B C
@0A

1
2
1
=

1
6C
3
B
B
C
B
B
C
C
C, u = B 1 C, u = B 1
B
B
C
3
2
6C
3
@
@
A
A
2
1

0
6
3
0 1
1
0

1
1
1
6
3C 3
B 2
1
B 1
CB

1
1 CB 0

1C
A=B
B 2
C@
6
3
@
A
2
2
1
0

0
6
3

1
2
1
2

1
2

u1

1
6

u2 +

1
3

C
C
C;
C
A

0
3
0

0 CB
B
B
0C
AB
@
0

1
2
1
6
1
3

1
2
1
6
1

0
2
6
1
3

C
C
C;
C
A

u3 .

8.4.19. The simplest is A = I . More generally, any matrix of the form A = S T S, where
S = ( u1 u2 . . . un ) and is any real diagonal matrix.
8.4.20. True, assuming that the eigenvector basis is real. If Q is the orthogonal matrix formed
by the eigenvector basis, then $A Q = Q \Lambda$ where $\Lambda$ is the diagonal eigenvalue matrix. Thus,
$A = Q \Lambda Q^{-1} = Q \Lambda Q^T = A^T$ is symmetric. For complex eigenvector bases, the result is
false, even for real matrices. For example, any 2 × 2 rotation matrix $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ has the
eigenvector basis $\begin{pmatrix} i \\ 1 \end{pmatrix}$, $\begin{pmatrix} -i \\ 1 \end{pmatrix}$, which is orthogonal with respect to the Hermitian dot product. See Exercise 8.6.5 for details.
8.4.21. Using the spectral factorization, we have $x^T A x = (Q^T x)^T \Lambda (Q^T x) = \sum_{i=1}^n \lambda_i y_i^2$, where
$y_i = u_i \cdot x = \| x \| \cos\theta_i$ denotes the ith entry of $Q^T x$.

8.4.22. Principal stretches = eigenvalues: 4 + 3, 4 3, 1;


T
T

principal directions = eigenvectors: 1, 1 + 3, 1


, 1, 1 3, 1
, ( 1, 0, 1 )T .

8.4.23. Moments of inertia: 4, 2, 1; principal directions: $(1, 2, 1)^T$, $(-1, 0, 1)^T$, $(1, -1, 1)^T$.


8.4.24.
(a) Let $K = Q \Lambda Q^T$ be its spectral factorization. Then $x^T K x = y^T \Lambda y$ where x = Q y.
The ellipse $y^T \Lambda y = \lambda_1 y_1^2 + \lambda_2 y_2^2 = 1$ has its principal axes aligned with the coordinate
axes and semi-axes $1 / \sqrt{\lambda_i}$, i = 1, 2. The map x = Q y serves to rotate the coordinate
axes to align with the columns of Q, i.e., the eigenvectors, while leaving the semi-axes
unchanged.
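(b) (i) [Figure: plot of the ellipse.] Ellipse with semi-axes $1, \frac12$ and principal axes $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
(ii) [Figure: plot of the ellipse.] Ellipse with semi-axes $\sqrt2, \sqrt{\frac23}$, and principal axes $\begin{pmatrix} -1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
(iii) [Figure: plot of the ellipse.] Ellipse with semi-axes $\frac{1}{\sqrt{2 + \sqrt2}}, \frac{1}{\sqrt{2 - \sqrt2}}$ and principal axes $\begin{pmatrix} 1 + \sqrt2 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 - \sqrt2 \\ 1 \end{pmatrix}$.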

(c) If K is positive semi-definite, but neither positive definite nor zero, the curve $x^T K x = 1$ is a pair of parallel lines; if K is symmetric and indefinite, a hyperbola; if negative (semi-)definite, the empty set. If K is not symmetric, replace K by
$\frac12 (K + K^T)$ as in Exercise 3.4.20, and then apply the preceding classification.
8.4.25. (a) Same method as in Exercise 8.4.24. Its principal axes are the eigenvectors of K,
and the semi-axes are the reciprocals of the square roots of the eigenvalues. (b) Ellipsoid
with principal axes: ( 1, 0, 1 )T , ( 1, 1, 1 )T , ( 1, 2, 1 )T and semi-axes $\frac{1}{\sqrt6}, \frac{1}{\sqrt{12}}, \frac{1}{\sqrt{24}}$.

8.4.26. If $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then the (i, j) entry of $\Lambda M$ is $\lambda_i m_{ij}$, whereas the (i, j) entry of
$M \Lambda$ is $\lambda_j m_{ij}$. These are equal if and only if either $m_{ij} = 0$ or $\lambda_i = \lambda_j$. Thus, $\Lambda M = M \Lambda$
with M having one or more non-zero off-diagonal entries, which includes the case of nonzero skew-symmetric matrices, if and only if $\Lambda$ has one or more repeated diagonal entries.
Next, suppose $A = Q \Lambda Q^T$ is symmetric with diagonal form $\Lambda$. If A J = J A with $J^T =
-J \neq O$, then $\Lambda M = M \Lambda$ where $M = Q^T J Q$ is also nonzero, skew-symmetric, and hence
A has repeated eigenvalues. Conversely, if $\lambda_i = \lambda_j$, choose M such that $m_{ij} = 1 = -m_{ji}$,
and then A commutes with $J = Q M Q^T$.

8.4.27.

(a) Set $B = Q \sqrt{\Lambda}\, Q^T$, where $\sqrt{\Lambda}$ is the diagonal matrix with the square roots of the eigenvalues of A along the diagonal. Uniqueness follows from the fact that the eigenvectors
and eigenvalues are uniquely determined. (Permuting them does not change the final
form of B.)
!

1
1
3+1
31
1 1 2 ;
2 2
q

; (ii)
(b) (i)

1
2
31
3+1
(2 2) 2 + 2 1 2
(iii)

0
B
@

2
0
0

0
5
0

0
C
0 A;
3

(iv )

0
B
B
B
B
@

1 + 1
2
3
1

1+
1
2
3
1 + 2
3

1+

1 1
2
3
1
1

1+
+
2
3
1 2
3

1 +

2
3
2
3
4
3

1 +
1
1+

C
C
C.
C
A

8.4.28. Only the identity matrix is both orthogonal and positive definite. Indeed, if $K = K^T >
0$ is orthogonal, then $K^2 = I$, and so its eigenvalues are all $\pm 1$. Positive definiteness implies that all the eigenvalues must be $+1$, and hence its diagonal form is $\Lambda = I$. But then
$K = Q\, I\, Q^T = I$ also.

8.4.29. If A = Q B, then $K = A^T A = B^T Q^T Q B = B^2$, and hence $B = \sqrt{K}$ is the positive
definite square root of K. Moreover, $Q = A B^{-1}$ then satisfies $Q^T Q = B^{-T} A^T A B^{-1} =
B^{-1} K B^{-1} = I$ since $K = B^2$. Finally, det A = det Q det B, and det B > 0 since B > 0. So
if det A > 0, then det Q = +1 > 0.
8.4.30. (a)

0
2

1
0

2
1

(c)
0

0
(d) B
@1
0
(e)

B
@1

1
0
!

3
0
4

0
1

1
0

1
2
1
2
0

=B
@

2
0

0
,
1

1
.3897
B
0C
A = @ .5127
0
.7650

.0323
.8378
.5450

2
1

(b)

10
1
1
1
1
2 CB 2
2C
A@ 1
A,
1

3
2
2
2
10
1 0
53 54
CB
0 0A@0 5
4
3
0 0
5
5

8
0
B
0C
A = @1
6
0

0
2
1

3
6

2
B 5
@ 1

5C
A
2
5

5
0

0
,
3 5

0
0C
A,
10

10

.9204
1.6674
B
.1877 C
A@ .2604
.3430
.3897

.2604
2.2206
.0323

.3897
.0323 C
A.
.9204

8.4.31.
(i) This follows immediately from the spectral factorization. The rows of $\Lambda Q^T$ are
$\lambda_1 u_1^T, \ldots, \lambda_n u_n^T$, and formula (8.34) follows from the alternative version of matrix multiplication given in Exercise 1.2.34.
1
0
!
2
1
4
25 A
3 4
5
5
5
A
@
@
(ii) (a)
=5 2 4 5
.
2
1
4 3

5
5
5
5

2
1

(b)
0

1
(c) B
@1
0

1
4

1
2
1

3
(d) B
@ 1
1

0
B
B
1C
A = 3B
@
1
1

1
2
0

1
0
322
1 2
3+22

B 42 2
42 2
42 2 C
A + (3 2 )@
1
1 2
1+ 2
42
2
42
2
42
2
1
0
1
1
1
1
0

6C
2
2C
B
1C
C+B
0
0
0C
@
A.
3A
1
1
1
2 10
6
0
02
1
2
1
1

0
0
0
3
3
3C
B
B
1
1C
1
1C
C+B
C + 2B
B
0

13
@
A
2
2
6
6A
@
1
1
1
1
1
0

3
2
2
6
6

= (3 + 2 )B
@
1
3
2
3
1
03

1
6
1
3
1
6

1
B
B
0C
A = 4B
@
2

1
1+ 2
42 2 C
A.
1
42 2

1
3
1
3
1
3

1
3
1
3
1
3

1
3
1
3
1
3

C
C
C.
A

8.4.32. According to Exercise 8.4.7, the eigenvalues of an $n \times n$ Hermitian matrix are all real,
and the eigenvectors corresponding to distinct eigenvalues are orthogonal with respect to the
Hermitian inner product on $\mathbb{C}^n$. Moreover, every Hermitian matrix is complete and has an
orthonormal eigenvector basis of $\mathbb{C}^n$; a proof follows along the same lines as the symmetric
case in Theorem 8.20. Let U be the corresponding matrix whose columns are the orthonormal eigenvector basis. Orthonormality implies that U is a unitary matrix: $U^\dagger U = I$, and
satisfies $H U = U \Lambda$ where $\Lambda$ is the real matrix with the eigenvalues of H on the diagonal.
Therefore, $H = U \Lambda U^\dagger$.
8.4.33.
(a)

(b)

3
2i
6
1 + 2i

2i
6

=B
@

1 2i
2

0
1
!
2
i
5
5C 2 0 B
5
@
A
2i

1
2
0 7
5
5 0
05
1
12
1+2
!
i
i
B
6
30 C 7 0 B

@
A
@
1
5
0 1
6
6


1
2i

5C
A,
1
5
1+2
i
6
i
1+2
30

1
1
6 C
A,
5
6

(c)

B
@ 5 i

5i
1
4 i

B
4
B
4i C
A=B
B
@
8

1
6
i
6
2
6

1
3
i
3
1
3

2
1
2

C 12
CB
CB 0
C@
A

0
6
0

0 CB
B
B
0C
AB
@
0

1
6
i
2
1
3

6
1
2
i
3

1
2
6C
C
0 C
C.
A
1

8.4.34. Maximum: 7; minimum: 3.


8.4.35. Maximum: $\frac{4+\sqrt5}{2}$; minimum: $\frac{4-\sqrt5}{2}$.

8.4.36.
(a) $\frac{5+\sqrt5}{2} = \max\{\, 2x^2 - 2xy + 3y^2 \mid x^2 + y^2 = 1 \,\}$,
$\frac{5-\sqrt5}{2} = \min\{\, 2x^2 - 2xy + 3y^2 \mid x^2 + y^2 = 1 \,\}$;
(b) $5 = \max\{\, 4x^2 + 2xy + 4y^2 \mid x^2 + y^2 = 1 \,\}$,
$3 = \min\{\, 4x^2 + 2xy + 4y^2 \mid x^2 + y^2 = 1 \,\}$;
(c) $12 = \max\{\, 6x^2 - 8xy + 2xz + 6y^2 - 2yz + 11z^2 \mid x^2 + y^2 + z^2 = 1 \,\}$,
$2 = \min\{\, 6x^2 - 8xy + 2xz + 6y^2 - 2yz + 11z^2 \mid x^2 + y^2 + z^2 = 1 \,\}$;
(d) $6 = \max\{\, 4x^2 - 2xy - 4xz + 4y^2 - 2yz + 4z^2 \mid x^2 + y^2 + z^2 = 1 \,\}$,
$3 - \sqrt3 = \min\{\, 4x^2 - 2xy - 4xz + 4y^2 - 2yz + 4z^2 \mid x^2 + y^2 + z^2 = 1 \,\}$.
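The extreme values in part (c) are the largest and smallest eigenvalues of the symmetric matrix of the quadratic form. A minimal sketch of a check, assuming the eigenvectors listed in the code (they are not stated in the solution and are verified by the assertion inside the loop):

```python
import math

# Symmetric matrix of q(x, y, z) = 6x^2 - 8xy + 2xz + 6y^2 - 2yz + 11z^2
K = [[6, -4, 1],
     [-4, 6, -1],
     [1, -1, 11]]

def matvec(M, v):
    """Matrix-vector product for small dense matrices stored as nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def quad_form(M, v):
    """Evaluate v^T M v."""
    return sum(vi * wi for vi, wi in zip(v, matvec(M, v)))

# Assumed eigenpairs (eigenvalue, eigenvector); each is checked exactly below.
eigenpairs = [(12, [1, -1, 2]), (9, [1, -1, -1]), (2, [1, 1, 0])]

for lam, v in eigenpairs:
    assert matvec(K, v) == [lam * vi for vi in v]   # K v = lambda v
    norm = math.sqrt(sum(vi * vi for vi in v))
    unit = [vi / norm for vi in v]
    # On the unit eigenvector the quadratic form equals the eigenvalue.
    print(lam, round(quad_form(K, unit), 10))
```

The middle eigenvalue 9 is the constrained maximum over unit vectors orthogonal to the top eigenvector.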

8.4.37. (c) $9 = \max\{\, 6x^2 - 8xy + 2xz + 6y^2 - 2yz + 11z^2 \mid x^2 + y^2 + z^2 = 1,\ x - y + 2z = 0 \,\}$;
(d) $3 + \sqrt3 = \max\{\, 4x^2 - 2xy - 4xz + 4y^2 - 2yz + 4z^2 \mid x^2 + y^2 + z^2 = 1,\ x - z = 0 \,\}$.
8.4.38. (a) Maximum: 3; minimum: 2; (b) maximum: $\frac52$; minimum: $\frac12$;
(c) maximum: $\frac{8+\sqrt5}{2} = 5.11803$; minimum: $\frac{8-\sqrt5}{2} = 2.88197$;
(d) maximum: $\frac{4+\sqrt{10}}{2} = 3.58114$; minimum: $\frac{4-\sqrt{10}}{2} = .41886$.

8.4.39. Maximum: cos

;
n+1

minimum: cos

.
n+1

8.4.40. Maximum: $r^2 \lambda_1$; minimum: $r^2 \lambda_n$, where $\lambda_1, \lambda_n$ are, respectively, the maximum and
minimum eigenvalues of K.
8.4.41. $\max\{\, x^T K x \mid \|x\| = 1 \,\} = \lambda_1$ is the largest eigenvalue of K. On the other hand, $K^{-1}$
is positive definite, cf. Exercise 3.4.10, and hence $\min\{\, x^T K^{-1} x \mid \|x\| = 1 \,\} = \mu_n$ is its
smallest eigenvalue. But the eigenvalues of $K^{-1}$ are the reciprocals of the eigenvalues of K,
and hence its smallest eigenvalue is $\mu_n = 1/\lambda_1$, and so the product is $\lambda_1 \mu_n = 1$.
8.4.42. According to the discussion preceding the statement of Theorem 8.30,
$\lambda_j = \max\{\, y^T \Lambda\, y \mid \|y\| = 1,\ y \cdot e_1 = \cdots = y \cdot e_{j-1} = 0 \,\}$.
Moreover, using (8.33), setting $x = Q y$ and using the fact that Q is an orthogonal matrix
and so $(Q v) \cdot (Q w) = v \cdot w$ for any $v, w \in \mathbb{R}^n$, we have
$x^T A x = y^T \Lambda\, y$, $\|x\| = \|y\|$, $y \cdot e_i = x \cdot v_i$,
where $v_i = Q e_i$ is the ith eigenvector of A. Therefore, by the preceding formula,
$\lambda_j = \max\{\, x^T A x \mid \|x\| = 1,\ x \cdot v_1 = \cdots = x \cdot v_{j-1} = 0 \,\}$.

8.4.43. Let A be a symmetric matrix with eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and corresponding
orthogonal eigenvectors $v_1, \ldots, v_n$. Then the minimal value of the quadratic form $x^T A x$
over all unit vectors which are orthogonal to the last $n - j$ eigenvectors is the jth eigenvalue:
$\lambda_j = \min\{\, x^T A x \mid \|x\| = 1,\ x \cdot v_{j+1} = \cdots = x \cdot v_n = 0 \,\}$.

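8.4.44. Note that $\frac{v^T K v}{\|v\|^2} = u^T K u$, where $u = \frac{v}{\|v\|}$ is a unit vector. Moreover, if v is orthogonal to an eigenvector $v_i$, so is u. Therefore, by Theorem 8.30,
$\max\left\{ \frac{v^T K v}{\|v\|^2} \;\middle|\; v \ne 0,\ v \cdot v_1 = \cdots = v \cdot v_{j-1} = 0 \right\} = \max\{\, u^T K u \mid \|u\| = 1,\ u \cdot v_1 = \cdots = u \cdot v_{j-1} = 0 \,\} = \lambda_j$.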

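8.4.45.
(a) Let $R = \sqrt{M}$ be the positive definite square root of M, and set $\widetilde{K} = R^{-1} K R^{-1}$. Then
$x^T K x = y^T \widetilde{K} y$, $x^T M x = y^T y = \|y\|^2$, where $y = R x$. Thus,
$\max\{\, x^T K x \mid x^T M x = 1 \,\} = \max\{\, y^T \widetilde{K} y \mid \|y\|^2 = 1 \,\} = \widetilde{\lambda}_1$,
the largest eigenvalue of $\widetilde{K}$. But $\widetilde{K} y = \lambda y$ implies $K x = \lambda M x$, and so the eigenvalues of
$\widetilde{K}$ coincide with the generalized eigenvalues of K.
(b) Write $y = \frac{x}{\sqrt{x^T M x}}$ so that $y^T M y = 1$. Then, by part (a),
$\max\left\{ \frac{x^T K x}{x^T M x} \;\middle|\; x \ne 0 \right\} = \max\{\, y^T K y \mid y^T M y = 1 \,\} = \lambda_1$.
(c) $\lambda_n = \min\{\, x^T K x \mid x^T M x = 1 \,\}$.
(d) $\lambda_j = \max\{\, x^T K x \mid x^T M x = 1,\ x^T M v_1 = \cdots = x^T M v_{j-1} = 0 \,\}$ where $v_1, \ldots, v_n$ are
the generalized eigenvectors.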

8.4.46. (a) Maximum: 43 , minimum: 25 ; (b) maximum: 9+47 2 , minimum: 947


(c) maximum: 2, minimum: 12 ; (d) maximum: 4, minimum: 1.
1
0

8.4.47. No. For example, if A =


only if | b | < 4.
8.5.1. (a) 3
1
0

1
2

5
B 1025
B
=@
2
102 5
!
!

0
1

1
0

(c)

1
3

2
6

(d)

2
0

0
0

(b)

(e)

2
0

has eigenvalues 1, 4, then q(x) > 0 for x 6= 0 if and


5, (b) 1, 1; (c) 5 2; (d) 3, 2; (e) 7, 2; (f ) 3, 1.

8.5.2.
(a)

b
4

0
3
1
1

1+
1
0

0
1
0
1
B

0
1

=@

0
1

1
1

10 C
A
3
10
!

1
0

=B
@

10+2 5
2
10+2
! 5

1
1
0

0
1

0q

C
C@
A

0
1

5
1
5

0
1

1
,
0

1
5 2 5

3 0
0 2
2

3+ 5

1
0

1
1
5C
A
2
5

2
5
!

2+

1
1
C
104 5 C
A,
1
10+4 5

0
,
0
0
!

7 0 B
@
0
2

5
B 1045
0
q
AB
@ 25

3 5

10+4 5
1

4
35
2
10

35

10

1
35
2
10

1
3
35 C
A,
1
10

1
(f ) B
@ 1
0

1
2
1

8.5.3.
(a)

1
0

1
1

0
1 C
A=
1
1+

5
B 10+25
B
@
2
10+2 5

0
B
B
B
B
B
@

1
6
2
3
1
6

2C
C
C
0C
C
A
1

3
0

1
0q
5
3
C
102 5 C@
2
A
2
102 5

0
!
0 B
@

+
5
1
2

1
6
1
2

3
2

1
1
6C
A.

2
3
1

0
2
0

0 AB
102
5
B
@ 15
1

2 5
10+2 5
1

1+

1
2
C
102 5 C
A;
2
10+2 5

(b) The first and last matrices are proper orthogonal, and so represent rotations, while the middle matrix is a stretch along the coordinate directions, in proportion to the singular values. Matrix multiplication corresponds to composition of the corresponding linear transformations.
8.5.4.

(a) The eigenvalues of $K = A^T A$ are $\frac{15}{2} \pm \frac{\sqrt{221}}{2} = 14.933,\ .0667$. The square roots of these
eigenvalues give us the singular values of A, i.e., 3.8643, .2588. The condition number is
3.86433 / .25878 = 14.9330.
(b) The singular values are 1.50528, .030739, and so the condition number is
1.50528 / .030739 = 48.9697.
(c) The singular values are 3.1624, .0007273, and so the condition number is
3.1624 / .0007273 = 4348.17; slightly ill-conditioned.
(d) The singular values are 12.6557, 4.34391, .98226, so the condition number is
12.6557 / .98226 = 12.88418.
(e) The singular values are 239.138, 3.17545, .00131688, so the condition number is
239.138 / .00131688 = 181594; ill-conditioned.
(f ) The singular values are 30.2887, 3.85806, .843107, .01015, so the condition number is
30.2887 / .01015 = 2984.09; slightly ill-conditioned.
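The recipe used in each part (eigenvalues of $K = A^T A$, then square roots, then the ratio of largest to smallest) can be sketched as follows for part (a). The only input is the pair of eigenvalues quoted there; the variable names are illustrative.

```python
import math

# Eigenvalues of K = A^T A quoted in part (a): (15 +/- sqrt(221)) / 2.
lam_max = (15 + math.sqrt(221)) / 2
lam_min = (15 - math.sqrt(221)) / 2

# Singular values are the square roots of the eigenvalues of K.
sigma_max = math.sqrt(lam_max)
sigma_min = math.sqrt(lam_min)

# Condition number: ratio of the largest to the smallest singular value.
cond = sigma_max / sigma_min

print(round(sigma_max, 4), round(sigma_min, 4), round(cond, 3))
```

Since the two eigenvalues multiply to 1 here, the condition number happens to equal the larger eigenvalue of K.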
8.5.5. In all cases, the large condition number results in an inaccurate solution.
(a) The exact solution is x = 1, y = 1; with three digit rounding, the computed solution is
x = 1.56, y = 1.56. The singular values of the coefficient matrix are 1615.22, .274885,
and the condition number is 5876.
(b) The exact solution is x = 1, y = 109, z = 231; with three digit rounding, the computed solution is x = 2.06, y = 75.7, z = 162. The singular values of the coefficient
matrix are 265.6, 1.66, .0023, and the condition number is $1.17 \times 10^5$.
(c) The exact solution is x = 1165.01, y = 333.694, z = 499.292; with three digit rounding, the computed solution is x = 467, y = 134, z = 200. The singular values of the
coefficient matrix are 8.1777, 3.3364, .00088, and the condition number is 9293.
8.5.6.
(a) The $2 \times 2$ Hilbert matrix has singular values 1.2676, .0657 and condition number 19.2815.
The $3 \times 3$ Hilbert matrix has singular values 1.4083, .1223, .0027 and condition number
524.057. The $4 \times 4$ Hilbert matrix has singular values 1.5002, .1691, .006738, .0000967
and condition number 15513.7.
(b) The $5 \times 5$ Hilbert matrix has condition number $4.7661 \times 10^5$; the $6 \times 6$ Hilbert matrix
has condition number $1.48294 \times 10^7$.
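As an illustration of part (a), the $2 \times 2$ case can be reproduced by hand. The sketch below assumes the standard Hilbert matrix entries $1/(i+j-1)$ and uses the fact that a symmetric positive definite matrix has singular values equal to its eigenvalues; the helper names are hypothetical.

```python
import math
from fractions import Fraction

def hilbert(n):
    """n x n Hilbert matrix with exact rational entries 1/(i + j + 1), 0-based indices."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

def sym2x2_eigenvalues(M):
    """Eigenvalues of a symmetric 2 x 2 matrix from its trace and determinant."""
    tr = float(M[0][0] + M[1][1])
    det = float(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

H2 = hilbert(2)
lam_max, lam_min = sym2x2_eigenvalues(H2)
cond_H2 = lam_max / lam_min

print(round(lam_max, 4), round(lam_min, 4), round(cond_H2, 4))
```

Larger Hilbert matrices need a general eigenvalue or singular value routine, which is outside this sketch.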
8.5.7. Let $A = v \in \mathbb{R}^n$ be the matrix (column vector) in question. (a) It has one singular value:
$\|v\|$; (b) $P = \frac{v}{\|v\|}$, $\Sigma = (\|v\|)$, a $1 \times 1$ matrix, $Q = (1)$; (c) $v^+ = \frac{v^T}{\|v\|^2}$.

8.5.8. Let $A = v^T$, where $v \in \mathbb{R}^n$, be the matrix (row vector) in question. (a) It has one singular value: $\|v\|$; (b) $P = (1)$, $\Sigma = (\|v\|)$, $Q = \frac{v}{\|v\|}$; (c) $A^+ = \frac{v}{\|v\|^2}$.

8.5.9. Almost true, with but one exception: the zero matrix.
8.5.10. Since $S^2 = K = A^T A$, the eigenvalues $\lambda$ of K are the squares, $\lambda = \sigma^2$, of the
eigenvalues $\sigma$ of S. Moreover, since $S > 0$, its eigenvalues are all non-negative, so $\sigma = +\sqrt{\lambda}$, and, by
definition, the nonzero $\sigma > 0$ are the singular values of A.
8.5.11. True. If $A = P\, \Sigma\, Q^T$ is the singular value decomposition of A, then the transposed
equation $A^T = Q\, \Sigma\, P^T$ gives the singular value decomposition of $A^T$, and so the diagonal
entries of $\Sigma$ are also the singular values of $A^T$.
8.5.12. Since A is nonsingular, so is $K = A^T A$, and hence all its eigenvalues are nonzero. Thus,
Q, whose columns are the orthonormal eigenvector basis of K, is a square orthogonal matrix, as is P. Therefore, the singular value decomposition of the inverse matrix is $A^{-1} = Q^{-T} \Sigma^{-1} P^{-1} = Q\, \Sigma^{-1} P^T$. The diagonal entries of $\Sigma^{-1}$, which are the singular values of
$A^{-1}$, are the reciprocals of the diagonal entries of $\Sigma$. Finally, the largest singular value of $A^{-1}$ is $1/\sigma_n$ and the smallest is $1/\sigma_1$, so $\kappa(A^{-1}) = \sigma_1/\sigma_n = \kappa(A)$.
8.5.13.
(a) When A is nonsingular, all matrices in its singular value decomposition (8.40) are square.
Thus, we can compute $\det A = \det P\, \det \Sigma\, \det Q^T = \pm 1 \cdot \det \Sigma = \pm\, \sigma_1 \sigma_2 \cdots \sigma_n$, since
the determinant of an orthogonal matrix is $\pm 1$. The result follows upon taking absolute
values of this equation and using the fact that the product of the singular values is non-negative.
(b) No; even simple nondiagonal examples show this is false.
(c) Numbering the singular values in decreasing order, so $\sigma_k \ge \sigma_n$ for all k, we conclude
$10^{-k} > |\det A| = \sigma_1 \sigma_2 \cdots \sigma_n \ge \sigma_n^{\,n}$, and the result follows by taking the nth root.
(d) Not necessarily, since all the singular values could be very small but equal, and in this
case the condition number would be 1.
(e) The diagonal matrix with entries $10^k$ and $10^{-k}$ for $k \gg 0$, or more generally, any $2 \times 2$
matrix with singular values $10^k$ and $10^{-k}$, has condition number $10^{2k}$.
8.5.14. False. For example, the diagonal matrix with entries $2 \cdot 10^k$ and $10^{-k}$ for $k \gg 0$ has
determinant 2 but condition number $2 \cdot 10^{2k}$.
8.5.15. False: the singular values are the absolute values of the nonzero eigenvalues.
8.5.16. False. For example, $U = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$ has singular values $\sqrt{3 \pm \sqrt5}$.

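8.5.17. False, unless A is symmetric or, more generally, normal, meaning that $A^T A = A A^T$.
For example, the singular values of $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ are $\sqrt{\frac32 \pm \frac{\sqrt5}{2}}$, while the singular values of
$A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ are $\sqrt{3 \pm 2\sqrt2}$.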
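The counterexample can be checked numerically. The sketch below computes the singular values of a $2 \times 2$ matrix from the eigenvalues of $A^T A$; the helper names are hypothetical, and the comparison assumes the claim under test is that the singular values of $A^2$ are the squares of those of A.

```python
import math

def matmul(A, B):
    """Product of two small dense matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def singular_values_2x2(A):
    """Singular values of a real 2 x 2 matrix via the eigenvalues of K = A^T A."""
    (a, b), (c, d) = A
    k11 = a * a + c * c
    k12 = a * b + c * d
    k22 = b * b + d * d
    tr = k11 + k22
    det = k11 * k22 - k12 * k12
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return math.sqrt((tr + disc) / 2), math.sqrt(max((tr - disc) / 2, 0.0))

A = [[1, 1], [0, 1]]
A2 = matmul(A, A)

s = singular_values_2x2(A)     # sqrt(3/2 +/- sqrt(5)/2)
s2 = singular_values_2x2(A2)   # sqrt(3 +/- 2 sqrt(2))

print([round(x, 5) for x in s], [round(x, 5) for x in s2])
print("squares of s:", [round(x * x, 5) for x in s])
```

The squares of the singular values of A differ from the singular values of $A^2$, which is the point of the example.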
8.5.18. False. This is only true if S is an orthogonal matrix.
8.5.19.
(a) If $\|x\| = 1$, then $y = A x$ satisfies the equation $y^T B y = 1$, where $B = A^{-T} A^{-1} = P\, \Sigma^{-2} P^T$. Thus, by Exercise 8.4.24, the principal axes of the ellipse are the columns of
P, and the semi-axes are the reciprocals of the square roots of the diagonal entries of
$\Sigma^{-2}$, which are precisely the singular values $\sigma_i$.
(b) If A is symmetric (and nonsingular), P = Q is the orthogonal eigenvector matrix, and
so the columns of P coincide with the eigenvectors of A. Moreover, the singular values
$\sigma_i = |\lambda_i|$ are the absolute values of its eigenvalues.
(c) From elementary geometry, the area of an ellipse equals $\pi$ times the product of its semi-axes. Thus, area $E = \pi\, \sigma_1 \sigma_2 = \pi\, |\det A|$ using Exercise 8.5.13.


r

2
2
3
5
3
+
,
25 ; area: .
(d) (i) principal axes:
,
; semi-axes:
2
2
2
1+ 5
1 5
!
!
1
2
(ii) principal axes:
,
(but any orthogonal basis of R 2 will also do); semi2
1

axes: 5, 5 (its a!circle);! area: 5 .

3
1
(iii) principal axes:
,
; semi-axes: 3 5 , 5 ; area: 15 .
1
3
(e) If A = O, then E = {0} is a point. Otherwise, rank A = 1 and its singular value decomposition is $A = \sigma_1 p_1 q_1^T$ where $A q_1 = \sigma_1 p_1$. Then E is a line segment in the direction
of $p_1$ of length $2 \sigma_1$.
8.5.20. $\left( \frac{10}{11} u + \frac{3}{11} v \right)^2 + \left( \frac{3}{11} u + \frac{2}{11} v \right)^2 = 1$ or $109\, u^2 + 72\, u v + 13\, v^2 = 121$.
Since A is symmetric, the semi-axes are the eigenvalues,
which are the same as the singular values, namely 11, 1,
so the ellipse is very long and thin; the principal axes
are the eigenvectors, $(-1, 3)^T$, $(3, 1)^T$; the area is $11 \pi$.
[Figure: plot of the ellipse in the (u, v) plane.]

8.5.21.
(a) In view of the singular value decomposition $A = P\, \Sigma\, Q^T$, the set E is obtained by
first rotating the unit sphere according to $Q^T$, which doesn't change it, then stretching
it along the coordinate axes into an ellipsoid according to $\Sigma$, and then rotating it according to P, which aligns its principal axes with the columns of P. The equation is
$(65 u + 43 v - 2 w)^2 + (43 u + 65 v + 2 w)^2 + (-2 u + 2 v + 20 w)^2 = 216^2$,
or
$1013\, u^2 + 1862\, u v + 1013\, v^2 - 28\, u w + 28\, v w + 68\, w^2 = 7776$.
(b) The semi-axes are the eigenvalues: 12, 9, 2; the principal axes are the eigenvectors:
$(1, -1, 2)^T$, $(-1, 1, 1)^T$, $(1, 1, 0)^T$.
(c) Since the unit sphere has volume $\frac43 \pi$, the volume of E is $\frac43 \pi \det A = 288 \pi$.
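The passage from the sum-of-squares form of the ellipsoid equation to its expanded form is a routine expansion followed by division by a common factor. A minimal sketch of that arithmetic, using only the three linear forms and the right-hand side $216^2$ (the common factor 6 is found by inspection and is an assumption of this sketch):

```python
import math

# Coefficients (of u, v, w) of the three linear forms in the ellipsoid equation.
forms = [[65, 43, -2],
         [43, 65, 2],
         [-2, 2, 20]]

# Gram matrix G = L^T L, so the sum of squares equals x^T G x for x = (u, v, w).
G = [[sum(row[i] * row[j] for row in forms) for j in range(3)] for i in range(3)]

scale = 6  # common factor removed in the second form of the equation
coeffs = {
    "u^2": G[0][0] // scale,
    "uv": 2 * G[0][1] // scale,
    "v^2": G[1][1] // scale,
    "uw": 2 * G[0][2] // scale,
    "vw": 2 * G[1][2] // scale,
    "w^2": G[2][2] // scale,
}
rhs = 216 ** 2 // scale

# Volume of E: (4/3) * pi * det A with det A = 216.
volume = 4 / 3 * math.pi * 216

print(coeffs, rhs, round(volume / math.pi, 6))
```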

8.5.22.
(a) $\|A u\|^2 = (A u)^T A u = u^T K u$, where $K = A^T A$. According to Theorem 8.28,
$\max\{\, u^T K u \mid \|u\| = 1 \,\}$ is the largest eigenvalue $\lambda_1$ of $K = A^T A$, hence the maximum value of $\|A u\| = \sqrt{u^T K u}$ is $\sqrt{\lambda_1} = \sigma_1$.
(b) This is true if rank A = n by the same reasoning, but false if $\ker A \ne \{0\}$, since then the
minimum is 0, but, according to our definition, singular values are always nonzero.
(c) The kth singular value $\sigma_k$ is obtained by maximizing $\|A u\|$ over all unit vectors which
are orthogonal to the first $k - 1$ singular vectors.
8.5.23. Let $\lambda_1$ be the maximal eigenvalue, and let $u_1$ be a corresponding unit eigenvector. By
Exercise 8.5.22, $\sigma_1 \ge \|A u_1\| = |\lambda_1|$.
8.5.24. By Exercise 8.5.22, the numerator is the largest singular value, while the denominator is
the smallest, and so the ratio is the condition number.
8.5.25. (a)

0
@

1
20
1
20

3
20
3
20

A,

(b)

0
@

1
5
2
5

1
2
5 A,
1
5


(c)

1
@2

0
1

0A
,
0

(d)

B
@0

0
1
0

0
0C
A,
0

(e)
8.5.26.

0
B
B
B
@

1
15
1
15
1
15

2
15

C
2 C
C,
15 A
2
15

(f )

1
@ 140
3
140

1
70
3
70

1
3
140 A,
9
140

(g)

1
B 9
B 5
B
@ 18
1
18

19
2
9
4
9

C
C
C.
A

2
9
1
18
7
18

1
1
3
1 1
1
?
+
20 A,
@ 4 A;
(a) A =
, A+ = @ 20
x
=
A
=
3
1
3 3
2
141
20
20
0
0
1
1
0
1 3
2
1
1
1
0
C
?
+B
3
6 A,
(b) A = B
1C
x
=
A
A+ = @ 6 3
@2
@ 1 A =
A,
7
1
1

11 11 11
11
1
1
0
0
0
1
1

1
2

(c) A =

1
1

1
,
1

A+ =

1
B7
B4
B
@7
2
7

2
7
5
14
1
14

C
C
C,
A

x? = A+

5
2

9
B 7C
B 15 C
B
C.
@ 7 A
11
7

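8.5.27. We repeatedly use the fact that the columns of P, Q are orthonormal, and so
$P^T P = I$, $Q^T Q = I$.
(a) Since $A^+ = Q\, \Sigma^{-1} P^T$ is the singular value decomposition of $A^+$, we have $(A^+)^+ = P\, (\Sigma^{-1})^{-1} Q^T = P\, \Sigma\, Q^T = A$.
(b) $A A^+ A = (P \Sigma Q^T)(Q \Sigma^{-1} P^T)(P \Sigma Q^T) = P\, \Sigma\, \Sigma^{-1} \Sigma\, Q^T = P\, \Sigma\, Q^T = A$.
(c) $A^+ A A^+ = (Q \Sigma^{-1} P^T)(P \Sigma Q^T)(Q \Sigma^{-1} P^T) = Q\, \Sigma^{-1} \Sigma\, \Sigma^{-1} P^T = Q\, \Sigma^{-1} P^T = A^+$.
Or, you can use the fact that $(A^+)^+ = A$.
(d) $(A A^+)^T = (Q \Sigma^{-1} P^T)^T (P \Sigma Q^T)^T = P\, (\Sigma^{-1})^T Q^T Q\, \Sigma^T P^T = P\, (\Sigma^{-1})^T \Sigma^T P^T = P P^T = P\, \Sigma\, \Sigma^{-1} P^T = (P \Sigma Q^T)(Q \Sigma^{-1} P^T) = A A^+$.
(e) This follows from part (d) since $(A^+)^+ = A$.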
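These identities can be spot-checked on the simplest case, the pseudoinverse $v^+ = v^T / \|v\|^2$ of a column vector from Exercise 8.5.7. The sketch below uses exact rational arithmetic; the vector $(3, 4)^T$ and the helper names are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two small dense matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

# Hypothetical example: A is the column vector v = (3, 4)^T, stored as a 2 x 1 matrix.
v = [Fraction(3), Fraction(4)]
A = [[v[0]], [v[1]]]
norm_sq = sum(x * x for x in v)        # ||v||^2 = 25
A_plus = [[x / norm_sq for x in v]]    # 1 x 2 row vector v^T / ||v||^2

checks = {
    "A A+ A = A": matmul(matmul(A, A_plus), A) == A,
    "A+ A A+ = A+": matmul(matmul(A_plus, A), A_plus) == A_plus,
    "(A A+)^T = A A+": transpose(matmul(A, A_plus)) == matmul(A, A_plus),
    "(A+ A)^T = A+ A": transpose(matmul(A_plus, A)) == matmul(A_plus, A),
}
for name, ok in checks.items():
    print(name, ok)
```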
8.5.28. In general, we know that $x^\star = A^+ b$ is the vector of minimum norm that minimizes the
least squares error $\|A x - b\|$. In particular, if $b \in \operatorname{rng} A$, so $b = A x_0$ for some $x_0$, then
the minimum least squares error is $0 = \|A x_0 - b\|$. If $\ker A = \{0\}$, then the solution is
unique, and so $x^\star = x_0$; otherwise, $x^\star \in \operatorname{corng} A$ is the solution to $A x = b$ of minimum
norm.

8.6.1.
(a) U =

0
B
@

(b) U = B
@

(c) U = B
@
(d) U =

0
B
@

1
2
1
2

1
2
1
2

3
13
2
13

1+3
i
14

2
7

1
1
2C
A,
1
2
1
1
2C
A,
1
2
1
2
13 C
A,
3
13

1
2
7 C
A,
13
i
14

2
0

2
;
2

3
0

0
;
1

2
0

3i
0

15
;
1
2 3 i
3 i


0
B
B

(e) U = B
B
@

(f ) U =

35
5
3
2

3 5

1
5

0
2
5

0 1

B 2
B
B 0
B
@
1
2

i
1+

2 2
1
2
1+
i
2 2

2
3
32
1
3
1
2
1 i
2
12

C
C
C,
C
A
1

C
C
C,
C
A

B
B

=B
B
0
B

B
=B
@

0
0

i
0

22

5
9
5

C
C
C;
C
A

1
1
i
2 C

C
.
2 + i 2C
A

8.6.2. If U is real, then $U^\dagger = U^T$ is the same as its transpose, and so (8.45) reduces to $U^T U = I$, which is the condition that U be an orthogonal matrix.
8.6.3. If $U_1^\dagger U_1 = I = U_2^\dagger U_2$, then $(U_1 U_2)^\dagger (U_1 U_2) = U_2^\dagger U_1^\dagger U_1 U_2 = U_2^\dagger U_2 = I$, and so $U_1 U_2$ is
also unitary.
8.6.4. If A is symmetric, its eigenvalues are real, and hence its Schur Decomposition is $A = Q\, \Delta\, Q^T$, where Q is an orthogonal matrix. But $A^T = (Q\, \Delta\, Q^T)^T = Q\, \Delta^T Q^T$, and hence
$\Delta^T = \Delta$ is a symmetric upper triangular matrix, which implies that $\Delta = \Lambda$ is a diagonal
matrix with the eigenvalues of A along its diagonal.
8.6.5.
(a) If A is real, $A^\dagger = A^T$, and so if $A = A^T$ then $A^T A = A^2 = A A^T$.
(b) If A is unitary, then $A^\dagger A = I = A A^\dagger$.
(c) Every real orthogonal matrix is unitary, so this follows from part (b).
(d) When A is upper triangular, the ith diagonal entry of the matrix equation $A^\dagger A = A A^\dagger$
is $|a_{ii}|^2 = \sum_{k=i}^{n} |a_{ik}|^2$, and hence $a_{ik} = 0$ for all $k > i$. Therefore A is a diagonal matrix.
(e) Let $U = ( u_1\ u_2\ \ldots\ u_n )$ be the corresponding unitary matrix, with $U^{-1} = U^\dagger$. Then
$A U = U \Lambda$, where $\Lambda$ is the diagonal eigenvalue matrix, and so $A = U \Lambda U^{-1} = U \Lambda U^\dagger$.
Then $A A^\dagger = U \Lambda U^\dagger U \Lambda^\dagger U^\dagger = U \Lambda \Lambda^\dagger U^\dagger = A^\dagger A$ since $\Lambda \Lambda^\dagger = \Lambda^\dagger \Lambda$ as it is diagonal.
(f) Let $A = U \Delta U^\dagger$ be its Schur Decomposition. Then, as in part (e), $A A^\dagger = U \Delta \Delta^\dagger U^\dagger$,
while $A^\dagger A = U \Delta^\dagger \Delta U^\dagger$. Thus, A is normal if and only if $\Delta$ is; but part (d) says this
happens if and only if $\Delta = \Lambda$ is diagonal, and hence $A = U \Lambda U^\dagger$ satisfies the conditions
of part (e).
(g) If and only if it is symmetric. Indeed, by the argument in part (f), $A = Q \Lambda Q^T$ where Q
is real, orthogonal, which is just the spectral factorization of $A = A^T$.

8.6.6.
(a) One $2 \times 2$ Jordan block; eigenvalue 2; eigenvector $e_1$.
(b) Two $1 \times 1$ Jordan blocks; eigenvalues 3, 6; eigenvectors $e_1, e_2$.
(c) One $1 \times 1$ and one $2 \times 2$ Jordan blocks; eigenvalue 1; eigenvectors $e_1, e_2$.
(d) One $3 \times 3$ Jordan block; eigenvalue 0; eigenvector $e_1$.
(e) One $1 \times 1$, $2 \times 2$ and $1 \times 1$ Jordan blocks; eigenvalues 4, 3, 2; eigenvectors $e_1, e_2, e_4$.

8.6.7.

B
B0
B
@0

0
2
B
0
B
B
@0
0
0

8.6.8.

8.6.9.

2
B
@0
0
0
2
B
@0
0

0
2
0
0
1
2
0
0

0
0
2
0
0
0
2
0

0
2
0
1
2
0

0
0C
A,
5
1
0
0C
A,
5

0
0C
C
C,
0A
2
1
0
0C
C
C,
1A
2
0

2
B
@0
0
0
5
B
@0
0

B
B0
B
@0

0
2
B
B0
B
@0
0
0

0
5
0
0
2
0

1
2
0
0
1
2
0
0
1

0
0C
A,
2
1
0
1C
A,
2

0
0
2
0
0
1
2
0
0

5
B
@0
0
0
2
B
@0
0

(c)

(d)

(e)

(f )

3
0

1
.
3

0
2
0
0
0
2
0
0

0
2
B
B0
B
@0
0
0

0
0C
A,
2
1
0
1C
A,
5

1
0

(b) Eigenvalue: 3. Jordan basis: v1 =

B
B0
B
@0

0
2
0
0
5
0

(a) Eigenvalue: 2. Jordan basis: v1 =

Jordan canonical form:

0
0C
C
C,
0A
2
1
0
0C
C
C,
0A
2

1
2

2
B
@0
0
0
5
B
@0
0

0
5
0
1
5
0
0

, v2 =
!

0
1
2
0
0
1
2
0

1
3

, v2 =
0

0
0C
C
C,
0A
2
1
0
0C
C
C,
1A
2

0
0C
A,
5
1
0
0C
A.
2

B
B0
B
@0

0
2
B
0
B
B
@0
0
0

5
B
@0
0

0
2
0
0
1
2
0
0

0
0
2
0
0
1
2
0

0
2
0

0
0C
C
C,
1A
2
1
0
0C
C
C.
1A
2

0
0C
A,
5

5
B
@0
0

0
1

1
.
2

0
0
1
C
B
B C
C
Eigenvalue: 1. Jordan basis: v1 = B
@ 0 A , v2 = @ 1 A , v3 = @ 1 A.
1
0
0
0
1
1 1 0
C
Jordan canonical form: B
@ 0 1 1 A.
0 0 1 0 1
0 1
0 1
1
0
1
C
B C
B C
Eigenvalue: 3. Jordan basis: v1 = B
@ 0 A , v 2 = @ 1 A , v 3 = @ 0 A.
11
0
0
0
3
1
0
Jordan canonical form: B
1C
@ 0 3
A.
0
0 3 0
1
0
1
0
1
1
0
0
C
B
C
B
C
Eigenvalues: 2, 0. Jordan basis: v1 = B
@ 0 A , v2 = @ 1 A , v3 = @ 1 A.
1
0
1
1
0
2
1 0
C
Jordan canonical form: B
@ 0 2 0 A.
0
0 0
0
1
0
1
0 1
1
2
1
1
B C
B
B C
B 2C
1C
C
B0C
C
B
C, v2 = B C, v3 = B
Eigenvalue: 2. Jordan basis: v1 = B
B 1 C, v 4 =
@ 1A
@0A
@ 2A
0
0
0
1
0
2 0 0 0
B
0 2 1 0C
C
C.
B
Jordan canonical form: B
@0 0 2 1A
0 0 0 2


0
0C
A,
2

2
0

. Jordan canonical form:

1
2

0
5
0

0
B
B
B
@

0
0C
C
C.
0A

21

8.6.10.

1
J,n
=

0 1

B
B 0
B
B
B 0
B
B
B 0
B
B .
B .
@ .

2
1
0
0
..
.
0

3
2
1
0
..
.
0

4
3
2
1
..
.
0

...
...
...
...
..
.
...

()n
C
()n1 C
C
C
()n2 C
C
.
C
()n3 C
C
C
..
C
A
.
1

8.6.11. True. All Jordan chains have length one, and so consist only of eigenvectors.
8.6.12. Not in general. If an eigenvalue has multiplicity 3, then you can tell the size of its Jordan blocks by the number of linearly independent eigenvectors it has: if it has 3 linearly
independent eigenvectors, then there are three $1 \times 1$ Jordan blocks; if it has 2 linearly independent eigenvectors then there are two Jordan blocks, of sizes $1 \times 1$ and $2 \times 2$, while
if it only has one linearly independent eigenvector, then it corresponds to a single $3 \times 3$
Jordan block. But if the multiplicity of the eigenvalue is 4, and there are only 2 linearly independent eigenvectors, then it could have two $2 \times 2$ blocks, or a $1 \times 1$ and a $3 \times 3$ block.
Distinguishing between the two cases is a difficult computational problem.
8.6.13. True. If $z_j = c\, w_j$, then $A z_j = c A w_j = c \lambda w_j + c\, w_{j-1} = \lambda z_j + z_{j-1}$.
8.6.14. False. Indeed, the square of a Jordan matrix is not necessarily a Jordan matrix, e.g.,
$\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}^2 = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix}$.

8.6.15.
(a) Let $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. Then $e_2$ is an eigenvector of $A^2 = O$, but is not an eigenvector of A.
(b) Suppose $A = S J S^{-1}$ where J is the Jordan canonical form of A. Then $A^2 = S J^2 S^{-1}$.
Now, even though $J^2$ is not necessarily a Jordan matrix, cf. Exercise 8.6.14, since J is
upper triangular with the eigenvalues on the diagonal, $J^2$ is also upper triangular and
its diagonal entries, which are its eigenvalues and the eigenvalues of $A^2$, are the squares
of the diagonal entries of J.
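Both small examples are easy to reproduce with integer arithmetic. The sketch below squares the $3 \times 3$ Jordan block with eigenvalue 1 and checks the $2 \times 2$ nilpotent example; the helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two small dense matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def jordan_block(lam, n):
    """n x n Jordan block: lam on the diagonal, 1 on the first superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

# Square of a Jordan block: the (1, 3) entry becomes nonzero,
# so the result is upper triangular but not a Jordan matrix.
J = jordan_block(1, 3)
J_squared = matmul(J, J)
print(J_squared)

# The nilpotent example: A^2 = O, yet A e2 = e1 is not a multiple of e2.
A = [[0, 1], [0, 0]]
A_squared = matmul(A, A)
e2 = [[0], [1]]
A_e2 = matmul(A, e2)
print(A_squared, A_e2)
```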

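8.6.16. Not necessarily. A simple example is $A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, so $A B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$,
whereas $B A = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$.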

8.6.17. First, since $J_{\lambda,n}$ is upper triangular, its eigenvalues are its diagonal entries, and hence
$\lambda$ is the only eigenvalue. Moreover, $v = ( v_1, v_2, \ldots, v_n )^T$ is an eigenvector if and only if
$(J_{\lambda,n} - \lambda I) v = ( v_2, \ldots, v_n, 0 )^T = 0$. This requires $v_2 = \cdots = v_n = 0$, and hence v must
be a scalar multiple of $e_1$.

8.6.18.
(a) Observe that $J_{0,n}^k$ is the matrix with 1's along the kth upper diagonal, i.e., in positions
$(i, k + i)$. In particular, when k = n, all entries are 0, and so $J_{0,n}^n = O$.
(b) Since a Jordan matrix is upper triangular, the diagonal entries of $J^k$ are the kth powers
of diagonal entries of J, and hence $J^m = O$ requires that all its diagonal entries are
zero. Moreover, $J^k$ is a block matrix whose blocks are the kth powers of the original
Jordan blocks, and hence $J^m = O$, where m is the maximal size Jordan block.
(c) If $A = S J S^{-1}$, then $A^k = S J^k S^{-1}$ and hence $A^k = O$ if and only if $J^k = O$.
(d) This follows from parts (b) and (c).
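Part (a) can be illustrated for a small block. The sketch below builds $J_{0,4}$ and checks that its kth power has 1's exactly in positions $(i, i + k)$, so that the fourth power vanishes; the size n = 4 and the helper names are illustrative choices.

```python
def matmul(A, B):
    """Product of two small square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def nilpotent_block(n):
    """The Jordan block J_{0,n}: zeros on the diagonal, 1's on the first superdiagonal."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]

def matpow(M, k):
    """k-th power of a square matrix by repeated multiplication (k >= 0)."""
    n = len(M)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, M)
    return result

def is_kth_superdiagonal(M, k):
    """True if M has 1's exactly in positions (i, i + k) and 0's elsewhere."""
    n = len(M)
    return all(M[i][j] == (1 if j == i + k else 0)
               for i in range(n) for j in range(n))

n = 4
J0 = nilpotent_block(n)
powers = {k: matpow(J0, k) for k in range(1, n + 1)}
for k, P in powers.items():
    print(k, is_kth_superdiagonal(P, k))
```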

8.6.19. (a) Since $J^k$ is upper triangular, Exercise 8.3.12 says it is complete if and only if it is
a diagonal matrix, which is the case if and only if J is diagonal, or $J^k = O$. (b) Write
$A = S J S^{-1}$ in Jordan canonical form. Then $A^k = S J^k S^{-1}$ is complete if and only if $J^k$ is
complete, so either J is diagonal, whence A is complete, or $J^k = O$ and so $A^k = O$.
8.6.20.
(a) If $D = \operatorname{diag}(d_1, \ldots, d_n)$, then $p_D(\lambda) = \prod_{i=1}^{n} (\lambda - d_i)$. Now $D - d_i I$ is a diagonal matrix
with 0 in its ith diagonal position. The entries of the product $p_D(D) = \prod_{i=1}^{n} (D - d_i I)$ of
diagonal matrices are the products of the individual diagonal entries, but each such product has at least one zero, and so the result is a diagonal matrix with all 0 diagonal entries, i.e., the zero matrix: $p_D(D) = O$.
(b) First, according to Exercise 8.2.32, similar matrices have the same characteristic polynomials, and so if $A = S D S^{-1}$ then $p_A(\lambda) = p_D(\lambda)$. On the other hand, if $p(\lambda)$ is any
polynomial, then $p(S D S^{-1}) = S\, p(D)\, S^{-1}$. Therefore, if A is complete, we can diagonalize $A = S D S^{-1}$, and so, by part (a) and the preceding two facts,
$p_A(A) = p_A(S D S^{-1}) = S\, p_A(D)\, S^{-1} = S\, p_D(D)\, S^{-1} = O$.
(c) The characteristic polynomial of the upper triangular Jordan block matrix $J = J_{\mu,n}$
with eigenvalue $\mu$ is $p_J(\lambda) = (\lambda - \mu)^n$. Thus, $p_J(J) = (J - \mu I)^n = J_{0,n}^n = O$ by Exercise
8.6.18.
(d) The determinant of a (Jordan) block matrix is the product of the determinants of the
individual blocks. Moreover, by part (c), substituting J into the product of the characteristic polynomials for its Jordan blocks gives zero in each block, and so the product
matrix vanishes.
(e) Same argument as in part (b), using the fact that a matrix and its Jordan canonical
form have the same characteristic polynomial.
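The conclusion $p_A(A) = O$ can be spot-checked in the $2 \times 2$ case, where $p_A(\lambda) = \lambda^2 - (\operatorname{tr} A)\lambda + \det A$. The sketch below tests one diagonalizable matrix and one Jordan block; both matrices are arbitrary illustrative choices, and the function name is hypothetical.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def char_poly_at_matrix_2x2(A):
    """Evaluate p_A(A) = A^2 - (tr A) A + (det A) I for a 2 x 2 matrix A."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    A2 = matmul(A, A)
    identity = [[1, 0], [0, 1]]
    return [[A2[i][j] - tr * A[i][j] + det * identity[i][j] for j in range(2)]
            for i in range(2)]

complete = [[1, 2], [3, 4]]        # distinct eigenvalues, hence diagonalizable
jordan = [[2, 1], [0, 2]]          # a single 2 x 2 Jordan block, not diagonalizable

print(char_poly_at_matrix_2x2(complete))
print(char_poly_at_matrix_2x2(jordan))
```

The second matrix is covered by parts (c) through (e) rather than part (b), since it cannot be diagonalized.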
8.6.21. The n vectors are divided into non-null Jordan chains, say $w_{1,k}, \ldots, w_{i_k,k}$, satisfying
$B w_{i,k} = \lambda_k w_{i,k} + w_{i-1,k}$ with $\lambda_k \ne 0$ the eigenvalue (and $w_{0,k} = 0$ by convention),
along with the null Jordan chains, say $y_{1,l}, \ldots, y_{i_l,l}, y_{i_l+1,l}$, supplemented by one additional vector, satisfying $B y_{i,l} = y_{i-1,l}$, and, in addition, the null vectors
$z_1, \ldots, z_{n-r-k} \in \ker B \setminus \operatorname{rng} B$. Suppose some linear combination vanishes:
$\sum_k \left[ a_{1,k} w_{1,k} + \cdots + a_{i_k,k} w_{i_k,k} \right] + \sum_l \left[ b_{1,l} y_{1,l} + \cdots + b_{i_l,l} y_{i_l,l} + b_{i_l+1,l} y_{i_l+1,l} \right] + (c_1 z_1 + \cdots + c_{n-r-k} z_{n-r-k}) = 0$.
Multiplying by B and using the Jordan chain equations, we find
$\sum_k \left[ (\lambda_k a_{1,k} + a_{2,k}) w_{1,k} + \cdots + (\lambda_k a_{i_k-1,k} + a_{i_k,k}) w_{i_k-1,k} + \lambda_k a_{i_k,k} w_{i_k,k} \right] + \sum_l \left[ b_{2,l} y_{1,l} + \cdots + b_{i_l+1,l} y_{i_l,l} \right] = 0$.
Since we started with a Jordan basis for W = rng B, by linear independence, their coefficients in the preceding equation must all vanish, which implies that $a_{1,k} = \cdots = a_{i_k,k} = b_{2,l} = \cdots = b_{i_l+1,l} = 0$. Substituting this result back into the original equation, we are left
with
$\sum_l b_{1,l} y_{1,l} + (c_1 z_1 + \cdots + c_{n-r-k} z_{n-r-k}) = 0$,
which implies all $b_{1,l} = c_j = 0$, since the remaining vectors are also linearly independent.

Solutions Chapter 9

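9.1.1.
(i) (a) $u(t) = c_1 \cos 2t + c_2 \sin 2t$. (b) $\frac{du}{dt} = \begin{pmatrix} 0 & 1 \\ -4 & 0 \end{pmatrix} u$. (c) $u(t) = \begin{pmatrix} c_1 \cos 2t + c_2 \sin 2t \\ -2 c_1 \sin 2t + 2 c_2 \cos 2t \end{pmatrix}$.
(d), (e) [Plots not reproduced.]
(ii) (a) $u(t) = c_1 e^{-2t} + c_2 e^{2t}$. (b) $\frac{du}{dt} = \begin{pmatrix} 0 & 1 \\ 4 & 0 \end{pmatrix} u$. (c) $u(t) = \begin{pmatrix} c_1 e^{-2t} + c_2 e^{2t} \\ -2 c_1 e^{-2t} + 2 c_2 e^{2t} \end{pmatrix}$.
(d), (e) [Plots not reproduced.]
(iii) (a) $u(t) = c_1 e^{-t} + c_2 t e^{-t}$. (b) $\frac{du}{dt} = \begin{pmatrix} 0 & 1 \\ -1 & -2 \end{pmatrix} u$. (c) $u(t) = \begin{pmatrix} c_1 e^{-t} + c_2 t e^{-t} \\ (c_2 - c_1) e^{-t} - c_2 t e^{-t} \end{pmatrix}$.
(d), (e) [Plots not reproduced.]
(iv) (a) $u(t) = c_1 e^{-t} + c_2 e^{-3t}$. (b) $\frac{du}{dt} = \begin{pmatrix} 0 & 1 \\ -3 & -4 \end{pmatrix} u$. (c) $u(t) = \begin{pmatrix} c_1 e^{-t} + c_2 e^{-3t} \\ -c_1 e^{-t} - 3 c_2 e^{-3t} \end{pmatrix}$.
(d), (e) [Plots not reproduced.]
(v) (a) $u(t) = c_1 e^{-t} \cos 3t + c_2 e^{-t} \sin 3t$. (b) $\frac{du}{dt} = \begin{pmatrix} 0 & 1 \\ -10 & -2 \end{pmatrix} u$.
(c) $u(t) = \begin{pmatrix} c_1 e^{-t} \cos 3t + c_2 e^{-t} \sin 3t \\ (-c_1 + 3 c_2) e^{-t} \cos 3t + (-3 c_1 - c_2) e^{-t} \sin 3t \end{pmatrix}$.
(d), (e) [Plots not reproduced.]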
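A vector solution of this kind can be sanity-checked by comparing a finite-difference derivative with the right-hand side $A u$. The sketch below does this for case (v), assuming the coefficient matrix with rows $(0, 1)$ and $(-10, -2)$ and the decaying factor $e^{-t}$; the constants $c_1 = 1$, $c_2 = 2$ and the step size are arbitrary choices.

```python
import math

# Coefficient matrix of the first order system in case (v) (assumed signs).
A = [[0, 1], [-10, -2]]

def u(t, c1=1.0, c2=2.0):
    """Vector solution (u, du/dt) for case (v) with arbitrary constants c1, c2."""
    e = math.exp(-t)
    return [e * (c1 * math.cos(3 * t) + c2 * math.sin(3 * t)),
            e * ((-c1 + 3 * c2) * math.cos(3 * t) + (-3 * c1 - c2) * math.sin(3 * t))]

def residual(t, h=1e-6):
    """Largest component of |du/dt - A u| at time t, using a central difference."""
    up, um, u0 = u(t + h), u(t - h), u(t)
    du = [(p - m) / (2 * h) for p, m in zip(up, um)]
    Au = [sum(A[i][j] * u0[j] for j in range(2)) for i in range(2)]
    return max(abs(a - b) for a, b in zip(du, Au))

max_residual = max(residual(t) for t in (0.0, 0.5, 1.0, 2.0))
print(max_residual)
```

A residual at the level of finite-difference error is consistent with the formula; it is a numerical check, not a proof.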

9.1.2.
(a) $\frac{du}{dt} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -12 & -4 & -3 \end{pmatrix} u$.
(b) $u(t) = c_1 e^{-3t} + c_2 \cos 2t + c_3 \sin 2t$,
$u(t) = \begin{pmatrix} c_1 e^{-3t} + c_2 \cos 2t + c_3 \sin 2t \\ -3 c_1 e^{-3t} - 2 c_2 \sin 2t + 2 c_3 \cos 2t \\ 9 c_1 e^{-3t} - 4 c_2 \cos 2t - 4 c_3 \sin 2t \end{pmatrix}$,
(c) dimension = 3.

0
u1 (t)
B
C
B
du

Bc
B u2 (t) C
C. Then
=B
9.1.3. Set u1 = u, u2 = u, u3 = v, u4 = v and u(t) = B
@0
@ u3 (t) A
dt
r
u4 (t)

1
a
0
p

0
d
0
s

0
bC
C
Cu.
1A
q

9.1.4. False; by direct computation, we find that the functions u1 (t), u2 (t) satisfy a quadratic
equation u21 + u1 u2 + u22 + u1 + u2 = c if and only if c1 = c2 = 0.
9.1.5.
(a) Use the chain rule to compute $\frac{dv}{dt} = -\frac{du}{dt}(-t) = -A\, u(-t) = -A v$.
(b) Since $v(t) = u(-t)$ parametrizes the same curve as u(t), but in the reverse direction.
(c) (i) $\frac{dv}{dt} = \begin{pmatrix} 0 & -1 \\ 4 & 0 \end{pmatrix} v$; solution: $v(t) = \begin{pmatrix} c_1 \cos 2t - c_2 \sin 2t \\ 2 c_1 \sin 2t + 2 c_2 \cos 2t \end{pmatrix}$.
(ii) $\frac{dv}{dt} = \begin{pmatrix} 0 & -1 \\ -4 & 0 \end{pmatrix} v$; solution: $v(t) = \begin{pmatrix} c_1 e^{2t} + c_2 e^{-2t} \\ -2 c_1 e^{2t} + 2 c_2 e^{-2t} \end{pmatrix}$.
(iii) $\frac{dv}{dt} = \begin{pmatrix} 0 & -1 \\ 1 & 2 \end{pmatrix} v$; solution: $v(t) = \begin{pmatrix} c_1 e^{t} - c_2 t e^{t} \\ (c_2 - c_1) e^{t} + c_2 t e^{t} \end{pmatrix}$.
(iv) $\frac{dv}{dt} = \begin{pmatrix} 0 & -1 \\ 3 & 4 \end{pmatrix} v$; solution: $v(t) = \begin{pmatrix} c_1 e^{t} + c_2 e^{3t} \\ -c_1 e^{t} - 3 c_2 e^{3t} \end{pmatrix}$.
(v) $\frac{dv}{dt} = \begin{pmatrix} 0 & -1 \\ 10 & 2 \end{pmatrix} v$; solution: $v(t) = \begin{pmatrix} c_1 e^{t} \cos 3t - c_2 e^{t} \sin 3t \\ (-c_1 + 3 c_2) e^{t} \cos 3t + (3 c_1 + c_2) e^{t} \sin 3t \end{pmatrix}$.
(d) Time reversal changes $u_1(t) = u(t)$ into $v_1(t) = u_1(-t) = u(-t)$ and $u_2(t) = \dot u(t)$ into
$v_2(t) = u_2(-t) = \dot u(-t) = -\dot v(t)$. The net effect is to change the sign of the coefficient
of the first derivative term, so $\frac{d^2 u}{dt^2} + a \frac{du}{dt} + b\, u = 0$ becomes $\frac{d^2 v}{dt^2} - a \frac{dv}{dt} + b\, v = 0$.
9.1.6. (a) Use the chain rule to compute $\frac{d}{dt} v(t) = \frac{d}{dt} u(2t) = 2\, \dot u(2t) = 2 A\, u(2t) = 2 A\, v(t)$, and
so the coefficient matrix is multiplied by 2. (b) The solution trajectories are the same, but
the solution moves twice as fast (in the same direction) along them.

9.1.7.
(a) This is proved by induction, the case k = 0 being assumed.
If $\frac{d^{k+1} u}{dt^{k+1}} = \frac{d}{dt}\left( \frac{d^k u}{dt^k} \right) = A\, \frac{d^k u}{dt^k}$, then differentiating the equation with respect to t
yields $\frac{d}{dt}\left( \frac{d^{k+1} u}{dt^{k+1}} \right) = \frac{d}{dt}\left( A\, \frac{d^k u}{dt^k} \right) = A\, \frac{d^{k+1} u}{dt^{k+1}}$, which proves the induction step.
(b) This is also proved by induction, the case k = 1 being assumed. If true for k, then
$\frac{d^{k+1} u}{dt^{k+1}} = \frac{d}{dt}\left( \frac{d^k u}{dt^k} \right) = \frac{d}{dt}(A^k u) = A^k \frac{du}{dt} = A^k A\, u = A^{k+1} u$.

9.1.8. False. If $\dot u = A u$ then the speed along the trajectory at the point u(t) is $\|A u(t)\|$.
So the speed is constant only if $\|A u(t)\|$ is constant. (Later, in Lemma 9.31, this will be
shown to correspond to A being a skew-symmetric matrix.)

9.1.9. In all cases, the t axis is plotted vertically, and the three-dimensional solution curves
$(u(t), \dot u(t), t)^T$ project to the phase plane trajectories $(u(t), \dot u(t))^T$.
(i) The solution curves are helices going around the t axis.
(ii) Hyperbolic curves going away from the t axis in both directions.
(iii) The solution curves converge on the t axis as $t \to \infty$.
(iv) The solution curves converge on the t axis as $t \to \infty$.
(v) The solution curves spiral away from the t axis as $t \to -\infty$.
[Figures not reproduced.]

9.1.10.
(a) Assuming $b \ne 0$, we have
$v = \frac{1}{b}\, \dot u - \frac{a}{b}\, u$, $\dot v = \frac{b c - a d}{b}\, u + \frac{d}{b}\, \dot u$. $(*)$
Differentiating the first equation yields
$\frac{dv}{dt} = \frac{1}{b}\, \ddot u - \frac{a}{b}\, \dot u$.
Equating this to the right hand side of the second equation yields the second
order differential equation
$\ddot u - (a + d)\, \dot u + (a d - b c)\, u = 0$. $(**)$
(b) If u(t) solves $(**)$, then defining v(t) by the first equation in $(*)$ yields a solution to the
first order system. Vice versa, the first component of any solution $( u(t), v(t) )^T$ to the
system gives a solution u(t) to the second order equation.
(c) (i) $\ddot u + u = 0$, hence $u(t) = c_1 \cos t + c_2 \sin t$, $v(t) = -c_1 \sin t + c_2 \cos t$.
(ii) $\ddot u - 2 \dot u + 5 u = 0$, hence
$u(t) = c_1 e^{t} \cos 2t + c_2 e^{t} \sin 2t$, $v(t) = (c_1 + 2 c_2)\, e^{t} \cos 2t + (-2 c_1 + c_2)\, e^{t} \sin 2t$.
(iii) $\ddot u - \dot u - 6 u = 0$, hence $u(t) = c_1 e^{3t} + c_2 e^{-2t}$, $v(t) = c_1 e^{3t} + 6 c_2 e^{-2t}$.
(iv) $\ddot u - 2 u = 0$, hence
$u(t) = c_1 e^{\sqrt2\, t} + c_2 e^{-\sqrt2\, t}$, $v(t) = (\sqrt2 - 1)\, c_1 e^{\sqrt2\, t} - (\sqrt2 + 1)\, c_2 e^{-\sqrt2\, t}$.
(v) $\ddot u = 0$, hence $u(t) = c_1 t + c_2$, $v(t) = c_1$.
(d) For $c \ne 0$ we can solve for $u = \frac{1}{c}\, \dot v - \frac{d}{c}\, v$, $\dot u = \frac{b c - a d}{c}\, v + \frac{a}{c}\, \dot v$, leading to the same
second order equation for v, namely, $\ddot v - (a + d)\, \dot v + (a d - b c)\, v = 0$.
(e) If b = 0 then u solves a first order linear equation; once we solve the equation, we can
then recover v by solving an inhomogeneous first order equation. Although u continues
to solve the same second order equation, it is no longer the most general solution, and
so the one-to-one correspondence between solutions breaks down.
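The reduction in part (a) amounts to reading off the trace and determinant of the coefficient matrix. The sketch below packages that step and checks one exponential solution from case (iv); the coefficients a = b = c = 1, d = -1 are an assumption inferred from the quoted solution, since the system itself is not stated here, and the function names are hypothetical.

```python
import math

def second_order_coefficients(a, b, c, d):
    """Return (p, q) with u'' + p u' + q u = 0 for the first component of
    u' = a u + b v, v' = c u + d v (valid when b != 0)."""
    return -(a + d), a * d - b * c

# Assumed coefficients for case (iv); inferred, not given in the text.
a, b, c, d = 1, 1, 1, -1
p, q = second_order_coefficients(a, b, c, d)   # expected (0, -2): u'' - 2u = 0

r = math.sqrt(2)

def u(t):
    return math.exp(r * t)

def v(t):
    # v = (u' - a u) / b for the exponential solution above
    return (r - a) / b * math.exp(r * t)

def residual(t):
    """Largest defect in the two first order equations at time t."""
    du, dv = r * u(t), r * v(t)
    return max(abs(du - (a * u(t) + b * v(t))),
               abs(dv - (c * u(t) + d * v(t))))

print(p, q, residual(0.3))
```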

9.1.11. u(t) =

7 5t
5e

8 5t
5e ,

5t
v(t) = 14
+
5 e

4 5t
5e .

9.1.12.

(a) u1 (t) = ( 101)c1 e(2+ 10) t ( 10+1)c2 e(2 10) t , u2 (t) = c1 e(2+ 10) t +c2 e(2 10) t ;
(b) x1 (t) = ch1 e 5 t + 3 c2 e5 t , x2 (t) =
3 c1 e 5 t + ch2 e5 t ;
i
i
(c) y1 (t) = e2 t c1 cos t (c1 + c2 ) sin t , y2 (t) = e2 t c2 cos t + (2 c1 + c2 ) sin t ;
(d) y1 (t) = c1 e t c2 et 32 c3 , y2 (t) = c1 e t c2 et , y3 (t) = c1 e t + c2 et + c3 ;
(e) x1 (t) = 3 c1 et + 2 c2 e2 t + 2 c3 e4 t , x2 (t) = c1 et + 21 c2 e2 t , x3 (t) = c1 et + c2 e2 t + c3 e4 t .

9.1.13.

(a) u(t) = 12 e22 t +

1 2+2 t
, 12 e22 t
2e

3t t
3t T

1 2+2 t T
,
2e

e t 3 e , e + 3 e
,

(c) u(t) = et cos 2 t, 1 et sin 2 t


,
2

,
(d) u(t) = e t 2 cos 6 t , e t 1 cos 6 t + 23 sin 6 t , e t 1 cos 6 t
(b) u(t) =

(e) u(t) = ( 4 6 cos t 9 sin t, 2 + 3 cos t + 6 sin t, 1 3 sin t )T ,


(f ) u(t) =

1 2t
2e

1 2+t
, 21 e42 t
2e

1 4+2 t
, 12 e2t
2e


1 2+t 1 42 t
,2e
2e

1 4+2 t T
e
,
2

(g) u(t) =

21 e t +

3
2

cos t

3
2

sin t, 32 e t

5
2

9.1.14. (a) x(t) = e t cos t, y(t) = e t sin t;

9.1.15. x(t) = et/2 cos

3
2

3 sin

3
2

cos t +

3
2

sin t, 2 cos t, cos t + sin t

(b)

t , y(t) = et/2 cos

3
2

time t = 1, the position is ( x(1), y(1) )T = ( 1.10719, .343028 )T .

1
3

sin

3
2

t , and so at

9.1.16. The solution is $x(t) = ( c_1 \cos 2t - c_2 \sin 2t,\ c_1 \sin 2t + c_2 \cos 2t,\ c_3 e^{-t} )^T$. The origin
is a stable, but not asymptotically stable, equilibrium point. Fluid particles starting on the
xy plane move counterclockwise, at constant speed with angular velocity 2, around circles
centered at the origin. Particles starting on the z axis move in to the origin at an exponential rate. All other fluid particles spiral counterclockwise around the z axis, converging
exponentially fast to a circle in the xy plane.
9.1.17. The coefficient matrix has eigenvalues 1 = 5, !2 = 7, and,!since the coefficient ma1
1
trix is symmetric, orthogonal eigenvectors v1 =
, v2 =
. The general solution
1
1
is
!
!
5t 1
7 t 1
u(t) = c1 e
+ c2 e
.
1
1
For the initial conditions
!
!
!
1
1
1
u(0) = c1
+ c2
=
,
1
1
2
we can use orthogonality to find
h u(0) , v1 i
= 23 ,
c1 =
k v 1 k2

c2 =

Therefore, the solution is

u(t) =

3 5t
2e

1
1

1 7t
2e

h u(0) , v2 i
= 12 .
k v 2 k2
1
1

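9.1.18.
(a) Eigenvalues: $\lambda_1 = 0$, $\lambda_2 = 1$, $\lambda_3 = 3$; eigenvectors: $v_1 = (1, 1, 1)^T$, $v_2 = (1, 0, -1)^T$, $v_3 = (1, -2, 1)^T$.
(b) By direct computation, $v_1 \cdot v_2 = v_1 \cdot v_3 = v_2 \cdot v_3 = 0$.
(c) The matrix is positive semi-definite since it has one zero eigenvalue and all the rest are positive.
(d) The general solution is
$$u(t) = c_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + c_2 e^{t} \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + c_3 e^{3t} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}.$$
For the given initial conditions
$$u(0) = c_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + c_2 \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + c_3 \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} = u_0,$$
we can use orthogonality to find
$$c_1 = \frac{\langle u_0 , v_1 \rangle}{\| v_1 \|^2} = \frac{2}{3}, \qquad c_2 = \frac{\langle u_0 , v_2 \rangle}{\| v_2 \|^2} = 1, \qquad c_3 = \frac{\langle u_0 , v_3 \rangle}{\| v_3 \|^2} = -\frac{2}{3}.$$
Therefore, the solution is $u(t) = \left( \frac23 + e^{t} - \frac23 e^{3t},\ \frac23 + \frac43 e^{3t},\ \frac23 - e^{t} - \frac23 e^{3t} \right)^T$.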
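The orthogonality formula used in part (d) of 9.1.18 is easy to check numerically. The following is a minimal sketch, assuming the eigenvectors $v_1 = (1,1,1)^T$, $v_2 = (1,0,-1)^T$, $v_3 = (1,-2,1)^T$ and the initial vector $u_0 = (1,2,-1)^T$ as read from that solution; the helper name `coefficients` is ours and is only illustrative.

```python
import numpy as np

# Eigenvectors and initial data as read from the solution to 9.1.18 (assumed values).
v1 = np.array([1.0, 1.0, 1.0])
v2 = np.array([1.0, 0.0, -1.0])
v3 = np.array([1.0, -2.0, 1.0])
u0 = np.array([1.0, 2.0, -1.0])

def coefficients(u, basis):
    """Expansion coefficients c_i = <u, v_i> / ||v_i||^2 for an orthogonal basis."""
    return np.array([np.dot(u, v) / np.dot(v, v) for v in basis])

c = coefficients(u0, [v1, v2, v3])

# Reassemble u0 from the coefficients to confirm the expansion.
reassembled = c[0] * v1 + c[1] * v2 + c[2] * v3
print(c, reassembled)
```

The formula only works because the basis is orthogonal; for a general basis one would have to solve a linear system for the coefficients instead.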

9.1.19. The general complex solution to the system is


0
1
0 1
0
1
1
1
1
C
C
t B
(1+2 i ) t B C
(12i) t B
u(t) = c1 e @ 1 A + c2 e
@ i A + c3 e
@i A.
1
1
1
Substituting into the initial conditions,
0
1
0
1
c1 = 2,
2C
c1 + c2 + c3
B
B
C
C
c2 = 21 i ,
u(0) = @ c1 + i c2 i c3 A = B
we find
@ 1 A
c1 + +c2 + c3
2
c3 = 12 i .
Thus, we obtain the same solution:
0
1
0 1
0
1
0
1
1
1
1
2 et + et sin 2 t
C
C
B
C
(1+2 i ) t B C
(12i) t B
1
1
u(t) = 2 et B
@ 1A 2 i e
@ i A+ 2 ie
@ i A = @ 2 et + et cos 2 t A.
t
t
1
1
1
2 e + e sin 2 t
9.1.20. Only (d) and (g) are linearly dependent.

9.1.21. Using the chain rule, $\frac{d}{dt}\,\tilde u(t) = \frac{du}{dt}(t - t_0) = A\,u(t - t_0) = A\,\tilde u(t)$, and hence $\tilde u(t)$ solves the differential equation. Moreover, $\tilde u(t_0) = u(0) = b$ has the correct initial conditions. The trajectories are the same curves, but $\tilde u(t)$ is always ahead of $u(t)$ by an amount $t_0$.

9.1.22.
(a) This follows immediately from uniqueness, since they both solve the initial value problem $\dot u = A u$, $u(t_1) = a$, which has a unique solution, and so $u(t) = \tilde u(t)$ for all $t$;
(b) $\tilde u(t) = u(t + t_2 - t_1)$ for all $t$.

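9.1.23. We compute $\frac{du}{dt} = \left( \lambda_1 c_1 e^{\lambda_1 t}, \ldots, \lambda_n c_n e^{\lambda_n t} \right)^T = \Lambda u$, where $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$. Moreover, the solution is a linear combination of the $n$ linearly independent solutions $e^{\lambda_i t} e_i$, $i = 1, \ldots, n$.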

9.1.24. $\frac{dv}{dt} = S\,\frac{du}{dt} = S A u = S A S^{-1} v = B v$.

9.1.25.
(i) This is an immediate
consequence
of the preceding two exercises.
!
!
!
2t !
c
e
c 1 e3 t
1 1
1 1
1
(ii) (a) u(t) =
,
(b) u(t) =
,
t
1 1
1 1
c 2 e t
c 2 e2 0

1
!

(1+ i 2) t
2
i
2 i @ c1 e

A,

(c) u(t) =
1
1
c2 e(1 i 02)t
1
t
0
1
c
e
1
2
1q
1q

C
B
2
2 CB
(1+ i 6) t C
C,
B
(d) u(t) = B
1

i
@1 1 + i
A
c
e
3
3 @ 2
A

1
1
1
(1 i 6)t
c3 e
0

4
(e) u(t) = B
@ 2
1

3 + 2i
2 i
1

3 2i
2 + i
1

10
CB
AB
@

c1
C
c2 e i t C
A,
c 3 e i t


0
B
B 1
(f ) u(t) = B
@ 0
1

1
0
1
0

0
1
0
1

10

2t 1
C
C
C,
C
A

1 B c1 e
c 2 e t
0C
CB
CB
B
1 A @ c 3 et
0
c e2 t
4

1
B
B 1
B
(g) u(t) = B
@ 0
0

3
2

1
3
0
0

i
12 2 i
1+ i
1

32 i
21 + 2 i
1 i
1

10
CB
CB
CB
CB
AB
@

c 1 et
C
c 2 e t C
C
C.
c3 e i t C
A
c 4 e i t

9.1.26.

(a)
0

c 1 e2 t + c 2 t e 2 t
,
c 2 e2 t

c e 3 t + c2 12 + t e 3 t A
,
(c) @ 1
2 c 1 e 3 t + 2 c 2 t e 3 t

c e 3 t + c2 t e 3 t + c3 1 + 21 t2 e 3 t
C
B 1
C,
(e) B
c 2 e 3 t + c 3 t e 3 t
A
@
3t
3t
2 3t
1
c1 e
+ c2 te
+ 2 c3 t e

(g)

c e
B 1
B
B
B
B
@

3t

9.1.27. (a)

+ c2 t e3 t 41 c3 e t 14 c4 (t + 1) e t
C
C
c 3 e t + c 4 t e t
C
C, (h)
3t
t
1
C
c2 e 4 c4 e
A
c 4 e t
du
=
dt

du
(d)
=
dt
0

1
5

0
du
(g)
=B
@ 2
dt
2

12
1

2
0

1
1
1
2

0
0

u,

u,

11
2C
0 A u,

(b)

du
=
dt 0

2
du
(e)
=B
@0
dt
2

1
9

0
3
3
0
1
du
=B
(h)
@ 1
dt
0

c e t + c2 13 + t e t A
(b) @ 1
,
3 c 1 e t + 3 c 2 t e t 1
0
c e t + c 2 et + c 3 t e t
B 1
C
C,
(d) B
c1 e t c3 et
@
A
t
t
t
2 c1 e + c2 e + c3 t e
0
1
c1 e t + c2 t e t + 12 c3 t2 e 3 t
B
C
C,
(f ) B
c2 e t + c3 (t 1) e t
@
A
t
c3 e
0

c1 cos t + c2 sin t + c3 t cos t + c4 t cos t


B
C
B c1 sin t + c2 cos t c3 t sin t + c4 t cos t C
B
C.
@
A
c3 cos t + c4 sin t
c3 sin t + c4 cos t

1
u, (c)
1
1
0
0C
(f )
A u,
0
1
1
0
2
C
1 1 A u.
1
1
2

du
0
=
1
dt
0

0
0

0
du
=B
@ 1
dt
0

u,

1
0
0

0
0C
A u,
0

dui
is a linear combination of u1 , u2 . Or note that the trajecdt
tories described by the solutions cross, violating uniqueness. (b) No, since polynomial solutions a two-dimensional system can be at most first order in t. (c) No, since !a two-dimensional

1 0
system has at most 2 linearly independent solutions. (d) Yes: u =
u. (e) Yes:
0 1
!
dui

2 3
is a linear combination of u1 , u2 . Or note that
u. (f ) No, since neither
u=
3 2
dt
both solutions have the unit circle as0their trajectory,
but traverse it in opposite directions,
1
0 0 0

C
violating uniqueness. (g) Yes: u = B
@ 1 0 0 A u. (h) Yes: u = u. (i) No, since a three1 0 0
dimensional system has at most 3 linearly independent solutions.

9.1.28. (a) No, since neither

9.1.29. Setting $\mathbf{u}(t) = ( u(t),\ \dot u(t),\ \ddot u(t) )^T$, the first order system is
$$\frac{d\mathbf{u}}{dt} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -12 & -4 & -3 \end{pmatrix} \mathbf{u}.$$
The eigenvalues of the coefficient matrix are $-3, \pm 2i$ with eigenvectors $(1, -3, 9)^T$, $(1, \pm 2i, -4)^T$, and the resulting solution is
$$\mathbf{u}(t) = \begin{pmatrix} c_1 e^{-3t} + c_2 \cos 2t + c_3 \sin 2t \\ -3 c_1 e^{-3t} - 2 c_2 \sin 2t + 2 c_3 \cos 2t \\ 9 c_1 e^{-3t} - 4 c_2 \cos 2t - 4 c_3 \sin 2t \end{pmatrix},$$
which is the same as that found in Exercise 9.1.2.
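The eigenvalues and eigenvectors quoted in 9.1.29 can be confirmed with a few lines of NumPy. This is a sketch under the assumption that the coefficient matrix is the companion matrix with last row $(-12, -4, -3)$, as reconstructed from the solution; the function name `eigenvector_residual` is hypothetical.

```python
import numpy as np

# Coefficient matrix of the first order system for (u, u', u'') (assumed from 9.1.29).
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-12.0, -4.0, -3.0]])

eigenvalues = np.linalg.eigvals(A)

def eigenvector_residual(lam, v):
    """Norm of A v - lam v; zero up to rounding when v is an eigenvector for lam."""
    v = np.asarray(v, dtype=complex)
    return np.linalg.norm(A @ v - lam * v)

res_real = eigenvector_residual(-3.0, [1, -3, 9])
res_cplx = eigenvector_residual(2j, [1, 2j, -4])
print(eigenvalues, res_real, res_cplx)
```

Because the matrix is real, the eigenvector for $-2i$ is the complex conjugate of the one for $2i$, so only one of the pair needs checking.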


0

u(t)
C
B
du
B u(t) C
C, the first order system is
9.1.30. Setting u(t) = B
=
@ v(t) A
dt

v(t)
0

0 1
0 0
C
B
B 1 1 1 0 C
C. The coeffiB
@ 0 0
0 1A
1
0
1 1
1
1 0 1 0 1 0
1
1
1
1
C
B C B C B
B
1 C
C B0C B1C B 2C
C. Thus
C, B C , B C, B
B
cient matrix has eigenvalues 1, 0, 1, 2 and eigenvectors B
@ 1 A @ 1 A @ 1 A @ 1 A
0
1
2
1
0
1
c e t + c + c et + c e2 t

2
3
4
B 1
C
B c e t + c et + 2 c e2 t C
B
C
1
3
4
C, whose first and third components give the general
u(t) = B
B c e t + c + c et c e2 t C
@
A
1
2
3
4
c 1 e t + c 3 et 2 c 4 e2 t
solution u(t) = c1 e t + c2 + c3 et + c4 e2 t , v(t) = c1 e t + c2 + c3 et c4 e2 t to the second

order system.

9.1.31. The degree is at most $n - 1$, and this occurs if and only if $A$ has only one Jordan chain in its Jordan basis.
9.1.32.
(a) By direct computation,
duj
= e t
dt
which equals

j
X

i=1

tji
w + e t
(j i) ! i

j1
X

i=1

tji1
w,
(j i 1) ! i
3

j
X
tji
tji
tj1
A uj = e
A wi = e t 4
w1 +
( wi + wi1 ) 5 .
(j 1) !
i = 2 (j i) !
i = 1 (j i) !
(b) At t = 0, we have uj (0) = wj , and the Jordan chain vectors are linearly independent.
j
t X

9.1.33.

(a) The equilibrium solution satisfies $A u^\star = -b$, and so $v(t) = u(t) - u^\star$ satisfies $\dot v = \dot u = A u + b = A(u - u^\star) = A v$, which is the homogeneous system.
u(t) = 2 c1 cos 2 t + 2 c2 sin 2 t 3,
u(t) = 3 c1 e2 t + c2 e 2 t 14 ,
(ii)
(b) (i)
v(t) = c1 sin 2 t + c2 cos 2 t 21 .
v(t) = c1 e2 t + c2 e 2 t + 41 .

9.2.1.
(a)
(b)
(c)
(d)
(e)
(f )
(g)
(h)

Asymptotically stable: the eigenvalues


are 2 i ;

1
11
unstable: the eigenvalues are 2 2 i ;
asymptotically stable eigenvalue 3;
stable: the eigenvalues are 4 i ;
stable: the eigenvalues are 0, 1, with 0 complete;
unstable: the eigenvalues are 1, 1 2 i ;
asymptotically stable: the eigenvalues are 1, 2;
unstable: the eigenvalues are 1, 0, with 0 incomplete.

9.2.2.

i
c1 + 23 c2 cos 6 t + 23 c1 + c2 sin 6 t +
h

i
v(t) = e t c1 cos 6 t + c2 sin 6 t + 12 c3 e 2 t ,
h

i
w(t) = e t c1 cos 6 t + c2 sin 6 t + c3 e 2 t .

u(t) = e t

2t
1
,
2 c3 e

The system is stable because all terms in the solution are exponentially decreasing as $t \to \infty$.
9.2.3.

(a) u = 2 u, v = 2 v, with solution u(t) = c1 e 2 t , v(t) = c2 e 2 t .

(b) u = v, v = u, with solution u(t) = c1 et + c2 e t , v(t) = c1 et + c2 e t .

(c) u = 8 u + 2 v, v = 2 u
2 v, with solution

(5 13)t
e (5+ 13)t +c2 133
, v(t) = c1 e (5+ 13)t +c2 e (5
u(t) = c1 13+3
2
2 e

(d) u = 4 u + v + 2 v, v = u 4 v + w, w = 2u + v 4 w, with solution

(3+ 3) t
3
+
1)
c
e
u(t) = c1 e 6 t + c2 e (3+ 3) t + c3 e (3 3) t , v(t)
=

(
+
2

6t
(3+ 3) t
(3 3) t
(3 3) t
, w(t) = c1 e
+ c2 e
+ c3 e
.
+( 3 1) c3 e

13)t

9.2.4.

(a) u = 2 v, v = 2 u, with solution u(t) = c1 cos 2 t + c2 sin 2 t, v(t) = c1 sin 2 t + c2 cos 2 t;


stable.

(b) u = u, v = v, with solution u(t) = c1 et , v(t) = c2 e t ; unstable.

(c) u = 2 u + 2 v, v = 8 u + 2 v, with solution u(t) = 41 (c1 3 c2 ) cos 2 3 t +

1
4 ( 3 c1 + c2 ) sin 2 3 t, v(t) = c1 cos 2 3 t + c2 sin 2 3 t; stable.
9.2.5. (a) Gradient flow; asymptotically stable. (b) Neither; unstable. (c) Hamiltonian flow;
unstable. (d) Hamiltonian flow; stable. (e) Neither; unstable.
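9.2.6. True. If $K = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$, then we must have $\frac{\partial H}{\partial v} = a u + b v$, $\frac{\partial H}{\partial u} = -b u - c v$. Therefore, by equality of mixed partials, $\frac{\partial^2 H}{\partial u\,\partial v} = a = -c$. But if $K > 0$, both diagonal entries must be positive, $a, c > 0$, which is a contradiction.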
9.2.7.
(a) The characteristic equation is 4 + 2 2 + 1 = 0, and so i are double eigenvalues.
However, each has only one linearly
independent eigenvector, namely ( 1,
i , 0, 0 )T .
1
0
c1 cos t + c2 sin t + c3 t cos t + c4 t cos t
C
B
B c1 sin t + c2 cos t c3 t sin t + c4 t cos t C
C.
(b) The general solution is u(t) = B
A
@
c3 cos t + c4 sin t
c3 sin t + c4 cos t
(c) All solutions with $c_3^2 + c_4^2 \ne 0$ spiral off to $\infty$ as $t \to \pm\infty$, while if $c_3 = c_4 = 0$, but $c_1^2 + c_2^2 \ne 0$, the solution goes periodically around a circle. Since the former solutions can start out arbitrarily close to 0, the zero solution is not stable.
9.2.8. Every solution to a real first order system of period $P$ comes from complex conjugate eigenvalues $\pm 2\pi i/P$. A $3 \times 3$ real matrix has at least one real eigenvalue $\lambda_1$. Therefore, if the system has a solution of period $P$, its eigenvalues are $\lambda_1$ and $\pm 2\pi i/P$. If $\lambda_1 = 0$, every solution has period $P$. Otherwise, the solutions with no component in the direction of the real eigenvector all have period $P$, and are the only periodic solutions, proving the result. The system is stable (but never asymptotically stable) if and only if the real eigenvalue $\lambda_1 \le 0$.
9.2.9. No, since a $4 \times 4$ matrix could have two distinct complex conjugate pairs of purely imaginary eigenvalues, $\pm 2\pi i/P_1$, $\pm 2\pi i/P_2$, and would then have periodic solutions of periods $P_1$ and $P_2$. The general solution in such a case is quasi-periodic; see Section 9.5 for details.
9.2.10. The system is stable since $\pm i$ must be simple eigenvalues. Indeed, any $5 \times 5$ matrix has 5 eigenvalues, counting multiplicities, and the multiplicities of complex conjugate eigenvalues are the same. A $6 \times 6$ matrix can have $\pm i$ as complex conjugate, incomplete double eigenvalues, in addition to the simple real eigenvalues $-1, -2$, and in such a situation the origin would be unstable.
9.2.11. True, since Hn > 0 by Proposition 3.34.
9.2.12. True, because the eigenvalues of the coefficient matrix $K$ are real and non-negative, $\lambda \ge 0$. Moreover, as it is symmetric, all its eigenvalues, including 0, are complete.
9.2.13. (a) $\dot v = B v = -A v$. (b) True, since the eigenvalues of $B = -A$ are minus the eigenvalues of $A$, and so will all have positive real parts. (c) False. For example, a saddle point, with one positive and one negative eigenvalue, is still unstable when going backwards in time. (d) False, unless all the eigenvalues of $A$ and hence $B$ are complete and purely imaginary or zero.

9.2.14. The eigenvalues of $-A^2$ are all of the form $-\lambda^2 \le 0$, where $\lambda$ is an eigenvalue of $A$. Thus, if $A$ is nonsingular, the result is true, while if $A$ is singular, then the equilibrium solutions are stable, since the 0 eigenvalue is complete, but not asymptotically stable.
9.2.15. (a) True, since the sum of the eigenvalues equals the trace, so at least one must be positive or have positive real part in order that the trace be positive. (b) False. $A = \begin{pmatrix} -1 & 0 \\ 0 & -2 \end{pmatrix}$ gives an example of an asymptotically stable system with positive determinant.
9.2.16.
(a) Every $v \in \ker K$ gives an equilibrium solution $u(t) \equiv v$.
(b) Since $K$ is complete, the general solution has the form
$$u(t) = c_1 e^{-\lambda_1 t} v_1 + \cdots + c_r e^{-\lambda_r t} v_r + c_{r+1} v_{r+1} + \cdots + c_n v_n,$$
where $\lambda_1, \ldots, \lambda_r > 0$ are the positive eigenvalues of $K$ with (orthogonal) eigenvectors $v_1, \ldots, v_r$, while $v_{r+1}, \ldots, v_n$ form a basis for the null eigenspace, i.e., $\ker K$. Thus, as $t \to \infty$, $u(t) \to c_{r+1} v_{r+1} + \cdots + c_n v_n \in \ker K$, which is an equilibrium solution.
(c) The origin is asymptotically stable if $K$ is positive definite, and stable if $K$ is positive semi-definite.
(d) Note that $a = u(0) = c_1 v_1 + \cdots + c_r v_r + c_{r+1} v_{r+1} + \cdots + c_n v_n$. Since the eigenvectors are orthogonal, $c_{r+1} v_{r+1} + \cdots + c_n v_n$ is the orthogonal projection of $a$ onto $\ker K$.
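The limiting behavior described in 9.2.16 can be illustrated numerically. The sketch below assumes the gradient flow has the form $\dot u = -K u$ with $K$ symmetric positive semi-definite; the $2 \times 2$ matrix and the initial vector are hypothetical test data, and the function names are ours.

```python
import numpy as np

def gradient_flow(K, a, t):
    """Solution at time t of du/dt = -K u, u(0) = a, for symmetric K."""
    lam, Q = np.linalg.eigh(K)           # K = Q diag(lam) Q^T with orthonormal columns
    return Q @ (np.exp(-lam * t) * (Q.T @ a))

def project_onto_kernel(K, a, tol=1e-12):
    """Orthogonal projection of a onto ker K, for symmetric K."""
    lam, Q = np.linalg.eigh(K)
    N = Q[:, np.abs(lam) < tol]          # orthonormal basis of the null eigenspace
    return N @ (N.T @ a)

K = np.array([[1.0, -1.0], [-1.0, 1.0]])   # hypothetical example: eigenvalues 0 and 2
a = np.array([3.0, 1.0])

u_late = gradient_flow(K, a, 20.0)         # solution long after the transient has decayed
u_star = project_onto_kernel(K, a)         # predicted limit from part (d)
print(u_late, u_star)
```

Here $\ker K$ is spanned by $(1,1)^T$, so both vectors should be close to $(2,2)^T$, the projection of $(3,1)^T$ onto that line.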
9.2.17.
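(a) The tangent to the Hamiltonian trajectory at a point $( u, v )^T$ is $\mathbf{v} = ( \partial H/\partial v,\ -\partial H/\partial u )^T$ while the tangent to the gradient flow trajectory is $\mathbf{w} = ( -\partial H/\partial u,\ -\partial H/\partial v )^T$. Since $\mathbf{v} \cdot \mathbf{w} = 0$, the tangents are orthogonal.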
(b) (i), (ii): [plots of the trajectories not reproduced]

9.2.18. False. Only positive definite Hamiltonian functions lead to stable gradient flows.


9.2.19.
(a) When $q(u) = \frac12 u^T K u$ then
$$\frac{d}{dt}\,q(u) = \tfrac12\,\dot u^T K u + \tfrac12\,u^T K \dot u = u^T K \dot u = -(K u)^T K u = -\| K u \|^2 .$$
(b) Since $K u \ne 0$ when $u \notin \ker K$, $\frac{d}{dt}\,q(u) = -\| K u \|^2 < 0$ and hence $q(u)$ is a strictly decreasing function of $t$ whenever $u(t)$ is not an equilibrium solution. Moreover, $u(t) \to u^\star$ goes to equilibrium exponentially fast, and hence its energy decreases exponentially fast to its equilibrium value: $q(u) \to q(u^\star)$.

9.2.20.
(a) By the multivariable calculus chain rule,
$$\frac{d}{dt}\,H(u(t), v(t)) = \frac{\partial H}{\partial u}\,\frac{du}{dt} + \frac{\partial H}{\partial v}\,\frac{dv}{dt} = \frac{\partial H}{\partial u}\,\frac{\partial H}{\partial v} + \frac{\partial H}{\partial v}\left( -\frac{\partial H}{\partial u} \right) \equiv 0.$$
Therefore $H(u(t), v(t)) \equiv c$ is constant, with its value $c = H(u_0, v_0)$ fixed by the initial conditions $u(t_0) = u_0$, $v(t_0) = v_0$.
(b) The solutions are
$$u(t) = c_1 \cos(2t) - c_1 \sin(2t) + 2 c_2 \sin(2t), \qquad v(t) = c_2 \cos(2t) - c_1 \sin(2t) + c_2 \sin(2t),$$
and leave the Hamiltonian function constant:
$$H(u(t), v(t)) = u(t)^2 - 2\,u(t)\,v(t) + 2\,v(t)^2 = c_1^2 - 2 c_1 c_2 + 2 c_2^2 = c.$$
[Plot of the trajectories not reproduced.]
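The conservation law in 9.2.20(b) can be spot-checked by evaluating the Hamiltonian along the explicit solution. This is a small sketch, assuming the solution formulas and $H(u,v) = u^2 - 2uv + 2v^2$ as stated in that part; the constants $c_1 = 1.5$, $c_2 = -0.5$ are arbitrary test values.

```python
import numpy as np

def solution(c1, c2, t):
    """Explicit solution of the Hamiltonian system from 9.2.20(b)."""
    u = c1 * np.cos(2 * t) - c1 * np.sin(2 * t) + 2 * c2 * np.sin(2 * t)
    v = c2 * np.cos(2 * t) - c1 * np.sin(2 * t) + c2 * np.sin(2 * t)
    return u, v

def H(u, v):
    """Hamiltonian function H(u, v) = u^2 - 2 u v + 2 v^2."""
    return u**2 - 2 * u * v + 2 * v**2

c1, c2 = 1.5, -0.5                      # arbitrary test constants
t = np.linspace(0.0, 10.0, 201)
u, v = solution(c1, c2, t)

values = H(u, v)
spread = values.max() - values.min()    # should vanish up to rounding error
print(values[0], spread)
```

Writing $H = (u - v)^2 + v^2$ makes the conservation transparent: the two squared terms are a rotation of the pair $(c_1 - c_2,\ c_2)$.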

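9.2.21. In both cases, $| f(t) | = t^k e^{\mu t}$. If $\mu > 0$, then $e^{\mu t} \to \infty$, while $t^k \ge 1$ as $t \to \infty$, and so $| f(t) | \ge e^{\mu t} \to \infty$. If $\mu = 0$, then $| f(t) | = 1$ when $k = 0$, while $| f(t) | = t^k \to \infty$ if $k > 0$. If $\mu < 0$, then $| f(t) | = e^{\mu t + k \log t} \to 0$ as $t \to \infty$, since $\mu t + k \log t \to -\infty$.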
9.2.22. An eigensolution $u(t) = e^{\lambda t} v$ with $\lambda = \mu + i \nu$ is bounded in norm by $\| u(t) \| \le e^{\mu t} \| v \|$. Moreover, since exponentials grow faster than polynomials, any solution of the form $u(t) = e^{\lambda t} p(t)$, where $p(t)$ is a vector of polynomials, can be bounded by $C e^{a t}$ for any $a > \mu = \operatorname{Re} \lambda$ and some $C > 0$. Since every solution can be written as a linear combination of such solutions, every term is bounded by a multiple of $e^{a t}$ provided $a > a^\star = \max \operatorname{Re} \lambda$ and so, by the triangle inequality, is their sum. If the maximal eigenvalues are complete, then there are no polynomial terms, and we can use the eigensolution bound, so we can set $a = a^\star$.

9.3.1.
(i) A =

0
9

1
;
0

i
i
,
, 2 = 3 i , v 2 =
3
3
u1 (t) = c1 cos 3 t + c2 sin 3 t, u2 (t) = 3 c1 sin 3 t 3 c2 cos 3 t;

1 = 3 i , v 1 =
center; stable.

(ii) A =

2
1

1 =

1
2

2 =

1
2

+i
i

3
;
1

3
2 ,

3
2 ,

v1 =

3
2

v2 =

3
2

+i
1
i
1

!
3
2
,
!
3
2
,


u1 (t) = e t/2

3
2 c1

u2 (t) = e t/2 c1 cos

3
2

3
2 c2

cos

t + c2 sin

3
2

3
2

t+

3
2 c1

3
2 c2

sin

t ;

stable focus; asymptotically stable

(iii) A =

3
2

2
;
2

1
2
, 2 = 2, v2 =
,
2
1
u1 (t) = c1 e t + 2 c2 e2 t , u2 (t) = 2 c1 e t + c2 e2 t ;

1 = 1, v1 =

saddle point; unstable

9.3.2.

1
1
;
+ c 2 et
1
3
saddle point; unstable.

(i) u(t) = c1 e t

2 cos t sin t
2 sin t + cos t
+ c 2 e t
;
5 cos t
5 sin t
stable focus; asymptotically stable.

(ii) u(t) = c1 e t

1
t
(iii) u(t) = c1 e
+ c2 e t/2
;
1
t + 52
stable improper node; asymptotically stable.
t/2

9.3.3.

1
4
,
(a) For the matrix A =
1 2
tr A = 3 < 0, det A = 2 < 0, = 17 > 0,
so this is an unstable saddle point.


3
2

t ,

2
1
,
1 4
tr A = 6 < 0, det A = 7 > 0, = 8 > 0,
so this is a stable node.

(b) For the matrix A =

5 4
,
1 2
tr A = 7 > 0, det A = 6 > 0, = 25 > 0,
so this is an unstable node.

(c) For the matrix A =

3 2
(d) For the matrix A =
,
3
2
tr A = 1 < 0, det A = 0, = 1 > 0,
so this is a stable line.

9.3.4. (a)

(b)

(c)

(d)

9.3.5. (a) u(t) =

(e)

e 2 t cos t + 7 e 2 t sin t, 3 e 2 t cos t 4 e 2 t sin t

(b)

(c) Asymptotically stable since the coefficient matrix has tr A = 4 < 0, det A = 5 > 0,
= 4 < 0, and hence it is a stable focus; equivalently, both eigenvalues 2 i have negative
real part.

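9.3.6. For (9.31), the complex solution
$$e^{\lambda t} v = e^{(\mu + i \nu) t} (w + i z) = e^{\mu t} \left[ \cos(\nu t)\, w - \sin(\nu t)\, z \right] + i\, e^{\mu t} \left[ \sin(\nu t)\, w + \cos(\nu t)\, z \right]$$
leads to the general real solution
$$\begin{aligned} u(t) &= c_1 e^{\mu t} \left[ \cos(\nu t)\, w - \sin(\nu t)\, z \right] + c_2 e^{\mu t} \left[ \sin(\nu t)\, w + \cos(\nu t)\, z \right] \\ &= e^{\mu t} \left[ c_1 \cos(\nu t) + c_2 \sin(\nu t) \right] w + e^{\mu t} \left[ -c_1 \sin(\nu t) + c_2 \cos(\nu t) \right] z \\ &= r\, e^{\mu t} \left[ \cos(\nu t - \sigma)\, w - \sin(\nu t - \sigma)\, z \right], \end{aligned}$$
where $r = \sqrt{c_1^2 + c_2^2}$ and $\tan \sigma = c_2/c_1$.

To justify (9.32), we differentiate
$$\frac{du}{dt} = \frac{d}{dt} \left[ (c_1 + c_2 t) e^{\lambda t} v + c_2 e^{\lambda t} w \right] = \lambda \left[ (c_1 + c_2 t) e^{\lambda t} v + c_2 e^{\lambda t} w \right] + c_2 e^{\lambda t} v,$$
which is equal to
$$A u = (c_1 + c_2 t) e^{\lambda t} A v + c_2 e^{\lambda t} A w = (c_1 + c_2 t) e^{\lambda t} \lambda v + c_2 e^{\lambda t} (\lambda w + v)$$
by the Jordan chain condition.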
9.3.7. All except for cases IV(a-c), i.e., the stars and the trivial case.

9.4.1.
0

(a)

4 t
@3e
4 t
3e

1 2t
3e
4 2t
3e
!

(c)

cos t
sin t

(f )

e t + 2 t e t
2 t e t

9.4.2.

sin t
,
cos t

(c)

(d)

(d)

0
cos t
sin t

1 t
1 3t
1 4t
B6e + 2e + 3e
B
1 4t
1 t
B
3e 3e
@
1 3t
1 4t
1 t
6e 2e + 3e
0
e 2 t + t e 2 t
B
B
1 + e 2 t
@
2t
2t
0

1e

B
B
B1 t
B3e
B
@
1 t
3e

1
1 2t
3e
A,
4 2t
3e
!

1
0

t
,
1

(b)
(e)

2 t e t
.
t
e 2 t e t

1
(a) B
@ 2 sin t
2 cos t 2

(b)

31 et +
31 et +

te

1 t
@2e
1 t
2e

1 t
2e
1 t
2e

1 t
2e
1 t
2e

e2 t cos t 3 e2 t sin t
5 e2 t sin t

1
1 t
2e
A
1 t
2e

cosh t
sinh t

sinh t
,
cosh t
!

2 e2 t sin t
,
2t
e cos t + 3 e2 t sin t

0
sin t C
A,
cos t
1 t
3e
2 t
3e
1 t
3e

t e 2 t
e 2 t
t e 2 t

1 4t
3e
1 4t
3e
1 4t
3e

12 e3 t + 31 e4 t C
C
1 t
1 4t
C,
3e 3e
A
1 t
1 3t
1 4t
e
+
e
+
e
6
2
3
1 t
6e

t e 2 t C
1 + e 2 t C
A,
2t
1 te

2 t/2
1 t
1 t/2
3
3
1 t
1 t/2 sin
e
+
e
cos
t
e

e
cos
3
3
2
3
3
2 t 3e

3
3
1 t/2
1 t
2 t/2
1 e t/2 sin 3 t
e
cos
t
+
e
+
e
cos
3
2
2
3
3
2 t
3

1 t/2
cos 23 t 1 e t/2 sin 23 t 31 et 13 e t/2 cos 23 t + 1 e t/2 sin
3e
3
3

t/2
1 t
1 t/2
3
1
3
cos 2 t + e
sin 2 t C
3e 3e
3

C
1 t
3
1 t/2
1 e t/2 sin 3 t C
e

e
cos
t

C
3
3
2
2
3
A
3
2 t/2
1 t
e
+
e
cos
t
3
3
2


3
2

t
.

3
2

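9.4.3. 9.4.1 (a) $\det e^{tA} = e^{-t} = e^{t \operatorname{tr} A}$, (b) $\det e^{tA} = 1 = e^{t \operatorname{tr} A}$, (c) $\det e^{tA} = 1 = e^{t \operatorname{tr} A}$, (d) $\det e^{tA} = 1 = e^{t \operatorname{tr} A}$, (e) $\det e^{tA} = e^{4t} = e^{t \operatorname{tr} A}$, (f) $\det e^{tA} = e^{-2t} = e^{t \operatorname{tr} A}$.
9.4.2 (a) $\det e^{tA} = 1 = e^{t \operatorname{tr} A}$, (b) $\det e^{tA} = e^{8t} = e^{t \operatorname{tr} A}$, (c) $\det e^{tA} = e^{-4t} = e^{t \operatorname{tr} A}$, (d) $\det e^{tA} = 1 = e^{t \operatorname{tr} A}$.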
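The identity $\det e^{tA} = e^{t \operatorname{tr} A}$ verified case by case in 9.4.3 can also be tested on an arbitrary matrix. The following is a rough sketch: `expm_series` is our own helper (scaling and squaring around a truncated Taylor series), not a routine from the text, and the test matrix is hypothetical.

```python
import numpy as np

def expm_series(A, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    A = np.asarray(A, dtype=float)
    norm = np.linalg.norm(A, 1)
    # Halve A repeatedly until its 1-norm is at most 1/2, so the series converges quickly.
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2**s
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ B / k              # term now holds B^k / k!
        E = E + term
    for _ in range(s):                   # undo the scaling: e^A = (e^{A / 2^s})^(2^s)
        E = E @ E
    return E

A = np.array([[1.0, 2.0], [3.0, -4.0]])  # hypothetical test matrix with tr A = -3
t = 0.7
lhs = np.linalg.det(expm_series(t * A))
rhs = np.exp(t * np.trace(A))
print(lhs, rhs)
```

In practice a library routine such as `scipy.linalg.expm` would normally be preferred; the hand-written series is used here only to keep the sketch dependent on NumPy alone.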
0
0
1

1
!
7
7
1 3
1 3
e
cos
2

2
e
sin
2
(e
+
e
)
(e

e
)
3 1
2
2

@
@
A
A
,
, (b)
, (c)
9.4.4. (a)
7
7
1 3
1 3
1 e sin 2
4 1
e cos 2
2
2 (e e ) 2 (e + e )
0

e
(d) B
0
@
0

e2
0

0
0

e5

9.4.5.

C
A,

(e)

B
B4
B
@9
2
9
!

(a) u(t) =

cos t
sin t

(b) u(t) =

3 e t 2 e 3 t
2 e t 2 e 3 t

0
B

(c) u(t) = B
@

sin t
cos t

4
5
9 + 9 cos 3
4
1
9 cos 3 3 sin 3
2
2
9 cos 3 + 3 sin 3

1
2

9.4.6. $e^{tO} = I$ for all $t$.

=B
@

2
9

4
1
9 cos 3 + 3 sin 3
4
5
9 + 9 cos 3
2
2
9 cos 3 3 sin 3

3 e t + 3 e 3 t
2 e t + 3 e 3 t

2
9
2
9

cos t + 2 sin t
,
sin t 2 cos t

3 e t 2 cos 3 t 2 sin 3 t
2 e t + 2 cos 3 t + 2 sin 3 t
2 e t 2 cos 3 t
0

4
9

1
1

1
2
2
cos
3

sin
3
9
3
C
C
2
2
C.
cos
3
+
sin
3
9
3
A
1
8
+
cos
3
9
9

6 e t + 5 e 3 t ,
4 e t + 5 e 3 t

3 e t 3 cos 3 t sin 3 t
2 e t + 3 cos 3 t + sin 3 t
2 e t 2 cos 3 t + sin 3 t
1

3 e t 3 cos 3 t sin 3 t C
2 e t + 3 cos 3 t + sin 3 t C
A.
t
2 e 2 cos 3 t + sin 3 t

10

2 sin 3 t
0
CB C
B1C
2 sin 3 t C
A@ A
cos 3 t + sin 3 t
0

9.4.7. There are none, since $e^{tA}$ is always invertible.


9.4.8.
e

tA

cos 2 t
sin 2 t

sin 2 t
, and hence, when t = 1, eA =
cos 2 t

cos 2
sin 2

sin 2
cos 2

1
0

0
.
1

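9.4.9. (a) According to Exercise 8.2.51, $A^2 = -\delta^2 I$ since $\operatorname{tr} A = 0$, $\det A = \delta^2$. Thus, by induction, $A^{2m} = (-1)^m \delta^{2m} I$, $A^{2m+1} = (-1)^m \delta^{2m} A$. Hence
$$e^{tA} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\,A^n = \sum_{m=0}^{\infty} (-1)^m \frac{(\delta t)^{2m}}{(2m)!}\,I + \sum_{m=0}^{\infty} (-1)^m \frac{t^{2m+1} \delta^{2m}}{(2m+1)!}\,A = (\cos \delta t)\, I + \frac{\sin \delta t}{\delta}\,A.$$
Setting $t = 1$ proves the formula. (b) $e^{A} = (\cosh \delta)\, I + \frac{\sinh \delta}{\delta}\,A$, where $\delta = \sqrt{-\det A}$. (c) $e^{A} = I + A$ since $A^2 = O$ by Exercise 8.2.51.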
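The closed-form exponentials for trace-free $2 \times 2$ matrices in 9.4.9 can be compared against a direct summation of the power series. This is only a sketch: we write $\delta$ for $\sqrt{|\det A|}$, the function names are ours, and the two test matrices are hypothetical examples with $\operatorname{tr} A = 0$.

```python
import numpy as np

def exp_series(A, terms=40):
    """Truncated power series sum of A^n / n!, adequate for small matrices of modest norm."""
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms + 1):
        term = term @ A / n              # term now holds A^n / n!
        E = E + term
    return E

def exp_trace_free(A):
    """Closed form for e^A when A is a real 2x2 matrix with tr A = 0 and det A != 0."""
    d = np.linalg.det(A)
    I = np.eye(2)
    if d > 0:                            # trigonometric case, delta = sqrt(det A)
        delta = np.sqrt(d)
        return np.cos(delta) * I + np.sin(delta) / delta * A
    delta = np.sqrt(-d)                  # hyperbolic case, delta = sqrt(-det A)
    return np.cosh(delta) * I + np.sinh(delta) / delta * A

A_trig = np.array([[1.0, -5.0], [1.0, -1.0]])   # tr = 0, det = 4
A_hyp = np.array([[1.0, 2.0], [0.0, -1.0]])     # tr = 0, det = -1

err_trig = np.abs(exp_trace_free(A_trig) - exp_series(A_trig)).max()
err_hyp = np.abs(exp_trace_free(A_hyp) - exp_series(A_hyp)).max()
print(err_trig, err_hyp)
```

The case $\det A = 0$ is deliberately left out of `exp_trace_free`; there the series terminates after two terms and $e^A = I + A$.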

9.4.10. Assuming $A$ is an $n \times n$ matrix, since $e^{tA}$ is a matrix solution, each of its $n$ individual columns must be solutions. Moreover, the columns are linearly independent since $e^{0A} = I$ is nonsingular. Therefore, they form a basis for the $n$-dimensional solution space.
9.4.11. (a) False, unless $A^{-1} = -A$. (b) True, since $A$ and $A^{-1}$ commute.
9.4.12. Fix $s$ and let $U(t) = e^{(t+s)A}$, $V(t) = e^{tA} e^{sA}$. Then, by the chain rule, $\dot U = A\,e^{(t+s)A} = A U$, while, by the matrix Leibniz rule (9.40), $\dot V = A\,e^{tA} e^{sA} = A V$. Moreover, $U(0) = e^{sA} = V(0)$. Thus $U(t)$ and $V(t)$ solve the same initial value problem, hence, by uniqueness, $U(t) = V(t)$ for all $t$, proving the result.

9.4.13. Set $U(t) = A\,e^{tA}$, $V(t) = e^{tA} A$. Then, by the matrix Leibniz formula (9.40), $\dot U = A^2 e^{tA} = A U$, $\dot V = A\,e^{tA} A = A V$, while $U(0) = A = V(0)$. Thus $U(t)$ and $V(t)$ solve the same initial value problem, hence, by uniqueness, $U(t) = V(t)$ for all $t$. Alternatively, one can use the power series formula (9.46): $A\,e^{tA} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\,A^{n+1} = e^{tA} A$.

9.4.14. Set $U(t) = e^{-\lambda t} e^{tA}$. Then, $\dot U = -\lambda\, e^{-\lambda t} e^{tA} + e^{-\lambda t} A\, e^{tA} = (A - \lambda I) U$. Moreover, $U(0) = I$. Therefore, by the definition of matrix exponential, $U(t) = e^{t (A - \lambda I)}$.
9.4.15.
(a) Let $V(t) = (e^{tA})^T$. Then $\frac{dV}{dt} = \left( \frac{d}{dt}\, e^{tA} \right)^T = (e^{tA} A)^T = A^T (e^{tA})^T = A^T V$, and $V(0) = I$. Therefore, by the definition of matrix exponential, $V(t) = e^{t A^T}$.
(b) The columns of $e^{tA}$ form a basis for the solutions to $\dot u = A u$, while its rows are a basis for the solutions to $\dot v = A^T v$. The stability properties of the two systems are the same since the eigenvalues of $A$ are the same as those of $A^T$.

9.4.16. First note that $A^n = S B^n S^{-1}$. Therefore, using (9.46),
$$e^{tA} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\,A^n = \sum_{n=0}^{\infty} \frac{t^n}{n!}\,S B^n S^{-1} = S \left( \sum_{n=0}^{\infty} \frac{t^n}{n!}\,B^n \right) S^{-1} = S\, e^{tB} S^{-1}.$$
An alternative proof relies on the fact that $e^{tA}$ and $S\, e^{tB} S^{-1}$ both satisfy the initial value problem $\dot U = A U = S B S^{-1} U$, $U(0) = I$, and hence, by uniqueness, must be equal.


9.4.17. (a) $\frac{d}{dt} \operatorname{diag}(e^{t d_1}, \ldots, e^{t d_n}) = \operatorname{diag}(d_1 e^{t d_1}, \ldots, d_n e^{t d_n}) = D \operatorname{diag}(e^{t d_1}, \ldots, e^{t d_n})$. Moreover, at $t = 0$, we have $\operatorname{diag}(e^{0 d_1}, \ldots, e^{0 d_n}) = I$. Therefore, $\operatorname{diag}(e^{t d_1}, \ldots, e^{t d_n})$ satisfies the defining properties of $e^{tD}$. (b) See Exercise 9.4.16.

et
0

(c) 9.4.1: (a)

1
1

1
4

(b)

1
1

1
1

et
0

(c)

i
1

i
1

eit
0

e 2 t

1
1

1
1

e i t

i
1

!1

4 t
e 1 e 2 t
= @ 34 t 34 2 t
3e 3e
0
!1
1 t
e + 1 e t
1
= @ 12 t 12 t
1
2e 2e
!1

1
4

i
1

cos t
sin t

13 et +
13 et +
1 t
2e
1 t
2e
!

1
1 2t
3e
A;
4 2t
e
3
1
1 t
e
2
A;
1 t
e
2

sin t
;
cos t

(d) not diagonalizable;


(e)

3
5

15 i
1

3
5

+ 51 i
1
=

e(2+ i )t
0

e(2 i )t

e2 t cos t 3 e2 t sin t
5 e2 t sin t

3
5

15 i
1

3
5

+ 51 i
1

!1

2 e2 t sin t
;
2t
e cos t + 3 e2 t sin t

(f ) not diagonalizable.
9.4.2: (a)

0
B
@

1
0
2

0
i
1

10

0
1
B
iC
A@0
1
0

eit
0

10

0
1
B
0C
A@ 0
e i t
2

0
i
1

0 1
1
=B
iC
A
@ 2 sin t
1
2 cos t 2

0
cos t
sin t

0
sin t C
A;
cos t

1
B
@
(b) 2
1

1
0
1

10

1
et
CB
1 A @ 0
1
0
=

0
e3 t
0

10

1
0
CB
0 A@2
1
e4 t

1 t
1 3t
1 4t
B6e + 2e + 3e
B
1 4t
1 t
B
3e 3e
@
1 t
1 3t
1 4t
6e 2e + 3e

1
0
1
1 t
3e
2 t
3e
1 t
3e

1 1
1 C
A
1

1 4t
3e
1 4t
3e
1 4t
3e

(c) not diagonalizable;

(d)

B1
B
B1
@

12 i

12

+i
1

3
2
3
2

21 + i
21

i
1

3
2
3
2

10

CB
CB
CB 0
A@

3
1
( 2 i 2 )t
e

0
0

B1

B
B1
@

1
which equals the same matrix exponential as before.

12 e3 t + 13 e4 t C
C
1 4t
1 t
C;
3e 3e
A
1 3t
1 4t
1 t
e
+
e
+
e
6
2
3
1 t
6e

1
3
( 2 + i 2 )t
e

12 i 23 12

12 + i 23 12

1
C
C
C
A

+i
i
1

11
3
2 C
3C
C
2 A

9.4.18. Let $M$ have size $p \times q$ and $N$ have size $q \times r$. The derivative of the $(i, j)$ entry of the product matrix $M(t) N(t)$ is
$$\frac{d}{dt} \sum_{k=1}^{q} m_{ik}(t)\, n_{kj}(t) = \sum_{k=1}^{q} \frac{dm_{ik}}{dt}\, n_{kj}(t) + \sum_{k=1}^{q} m_{ik}(t)\, \frac{dn_{kj}}{dt}.$$
The first sum is the $(i, j)$ entry of $\frac{dM}{dt}\, N$ while the second is the $(i, j)$ entry of $M\, \frac{dN}{dt}$.
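The matrix Leibniz rule proved in 9.4.18 lends itself to a quick finite-difference check. The sketch below uses two hypothetical matrix-valued functions of compatible sizes ($2 \times 3$ and $3 \times 1$) with hand-coded derivatives; none of these particular functions come from the text.

```python
import numpy as np

# Hypothetical matrix functions M(t) (2x3) and N(t) (3x1) with their exact derivatives.
def M(t):
    return np.array([[t, t**2, 1.0], [np.sin(t), 2.0, np.exp(t)]])

def dM(t):
    return np.array([[1.0, 2 * t, 0.0], [np.cos(t), 0.0, np.exp(t)]])

def N(t):
    return np.array([[np.cos(t)], [t**3], [1.0 + t]])

def dN(t):
    return np.array([[-np.sin(t)], [3 * t**2], [1.0]])

def central_difference(F, t, h=1e-6):
    """Second order finite-difference approximation of dF/dt."""
    return (F(t + h) - F(t - h)) / (2 * h)

t0 = 0.8
numeric = central_difference(lambda t: M(t) @ N(t), t0)
leibniz = dM(t0) @ N(t0) + M(t0) @ dN(t0)   # (dM/dt) N + M (dN/dt), order preserved
print(numeric.ravel(), leibniz.ravel())
```

The order of the factors matters: swapping them in either term would not even give matrices of compatible size here.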

9.4.19. (a) The exponential series is a sum of real terms. Alternatively, one can choose a real basis for the solution space to construct the real matrix solution $U(t)$ before substituting into formula (9.42). (b) According to Lemma 9.28, $\det e^{A} = e^{\operatorname{tr} A} > 0$ since a real scalar exponential is always positive.
9.4.20. Lemma 9.28 implies $\det e^{tA} = e^{t \operatorname{tr} A} = 1$ for all $t$ if and only if $\operatorname{tr} A = 0$. (Even if $\operatorname{tr} A$ is allowed to be complex, by continuity, the only way this could hold for all $t$ is if $\operatorname{tr} A = 0$.)
9.4.21. Let $u(t) = e^{t\lambda} v$ where $v$ is the corresponding eigenvector of $A$. Then, $\frac{du}{dt} = \lambda\, e^{t\lambda} v = \lambda u = A u$, and hence, by (9.41), $u(t) = e^{tA} u(0) = e^{tA} v$. Therefore, equating the two formulas for $u(t)$, we conclude that $e^{tA} v = e^{t\lambda} v$, which proves that $v$ is an eigenvector of $e^{tA}$ with eigenvalue $e^{t\lambda}$.

9.4.22. The origin is asymptotically stable if and only if all solutions tend to zero as $t \to \infty$. Thus, all columns of $e^{tA}$ tend to 0 as $t \to \infty$, and hence $\lim_{t \to \infty} e^{tA} = O$. Conversely, if $\lim_{t \to \infty} e^{tA} = O$, then any solution has the form $u(t) = e^{tA} c$, and hence $u(t) \to 0$ as $t \to \infty$, proving asymptotic stability.

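9.4.23. According to Exercise 9.4.21, the eigenvalues of $e^{A}$ are $e^{\lambda} = e^{\mu} \cos \nu + i\, e^{\mu} \sin \nu$, where $\lambda = \mu + i \nu$ are the eigenvalues of $A$. For $e^{\lambda}$ to have negative real part, we must have $\cos \nu < 0$, and so $\nu = \operatorname{Im} \lambda$ must lie between $\left( 2k + \frac12 \right) \pi < \nu < \left( 2k + \frac32 \right) \pi$ for some integer $k$.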


9.4.24. Indeed, the columns of $\tilde U(t)$ are linear combinations of the columns of $U(t)$, and hence automatically solutions to the linear system. Alternatively, we can prove this directly using the Leibniz rule (9.40): $\frac{d\tilde U}{dt} = \frac{d}{dt} \left[ U(t)\, C \right] = \frac{dU}{dt}\, C = A U C = A \tilde U$, since $C$ is constant.
9.4.25.
(a) If $U(t) = C\, e^{tB}$, then $\frac{dU}{dt} = C\, e^{tB} B = U B$, and so $U$ satisfies the differential equation. Moreover, $C = U(0)$. Thus, $U(t)$ is the unique solution to the initial value problem $\dot U = U B$, $U(0) = C$, where the initial value $C$ is arbitrary.
(b) By Exercise 9.4.16, $U(t) = C\, e^{tB} = e^{tA} C$ where $A = C B C^{-1}$. Thus, $\dot U = A U$ as claimed. Note that $A = B$ if and only if $A$ commutes with $U(0) = C$.

9.4.26. (a) Let $U(t) = ( u_1(t) \ \ldots \ u_n(t) )$ be the corresponding matrix-valued function with the indicated columns. Then $\frac{du_j}{dt} = \sum_{i=1}^{n} b_{ij} u_i$ for all $j = 1, \ldots, n$, if and only if $\dot U = U B$ where $B$ is the $n \times n$ matrix with entries $b_{ij}$. Therefore, by Exercise 9.4.25, $\dot U = A U$, where $A = C B C^{-1}$ with $C = U(0)$.
(b) According to Exercise 9.1.7, every derivative $\frac{d^k u}{dt^k}$ of a solution to $\dot u = A u$ is also a solution. Since the solution space is an $n$-dimensional vector space, at most $n$ of the derivatives are linearly independent, and so either
$$\frac{d^n u}{dt^n} = c_0 u + c_1 \frac{du}{dt} + \cdots + c_{n-1} \frac{d^{n-1} u}{dt^{n-1}}, \qquad (\ast)$$
for some constants $c_0, \ldots, c_{n-1}$, or, for some $k < n$, we have
$$\frac{d^k u}{dt^k} = a_0 u + a_1 \frac{du}{dt} + \cdots + a_{k-1} \frac{d^{k-1} u}{dt^{k-1}}. \qquad (\ast\ast)$$
In the latter case,
$$\frac{d^n u}{dt^n} = \frac{d^{n-k}}{dt^{n-k}} \left( a_0 u + a_1 \frac{du}{dt} + \cdots + a_{k-1} \frac{d^{k-1} u}{dt^{k-1}} \right) = a_0 \frac{d^{n-k} u}{dt^{n-k}} + a_1 \frac{d^{n-k+1} u}{dt^{n-k+1}} + \cdots + a_{k-1} \frac{d^{n-1} u}{dt^{n-1}},$$
and so $(\ast)$ continues to hold, now with $c_0 = \cdots = c_{n-k-1} = 0$, $c_{n-k} = a_0, \ldots, c_{n-1} = a_{k-1}$.

9.4.27. Write the matrix solution to the initial value problem $\frac{dU}{dt} = A U$, $U(0) = I$, in block form $U(t) = e^{tA} = \begin{pmatrix} V(t) & W(t) \\ Y(t) & Z(t) \end{pmatrix}$. Then the differential equation decouples into $\frac{dV}{dt} = B V$, $\frac{dW}{dt} = B W$, $\frac{dY}{dt} = C Y$, $\frac{dZ}{dt} = C Z$, with initial conditions $V(0) = I$, $W(0) = O$, $Y(0) = O$, $Z(0) = I$. Thus, by uniqueness of solutions to the initial value problems, $V(t) = e^{tB}$, $W(t) = O$, $Y(t) = O$, $Z(t) = e^{tC}$.


9.4.28.
(a) Let $U(t)$ denote the upper triangular matrix on the left. Then
$$\frac{d}{dt} \begin{pmatrix} 1 & t & \frac{t^2}{2} & \frac{t^3}{6} & \cdots & \frac{t^n}{n!} \\ 0 & 1 & t & \frac{t^2}{2} & \cdots & \frac{t^{n-1}}{(n-1)!} \\ 0 & 0 & 1 & t & \cdots & \frac{t^{n-2}}{(n-2)!} \\ \vdots & \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & t \\ 0 & 0 & 0 & \cdots & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & t & \frac{t^2}{2} & \cdots & \frac{t^{n-1}}{(n-1)!} \\ 0 & 0 & 1 & t & \cdots & \frac{t^{n-2}}{(n-2)!} \\ 0 & 0 & 0 & 1 & \cdots & \frac{t^{n-3}}{(n-3)!} \\ \vdots & \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix}$$
$$= \begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & t & \frac{t^2}{2} & \frac{t^3}{6} & \cdots & \frac{t^n}{n!} \\ 0 & 1 & t & \frac{t^2}{2} & \cdots & \frac{t^{n-1}}{(n-1)!} \\ 0 & 0 & 1 & t & \cdots & \frac{t^{n-2}}{(n-2)!} \\ \vdots & \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & t \\ 0 & 0 & 0 & \cdots & 0 & 1 \end{pmatrix}.$$
Thus, $U(t)$ satisfies the initial value problem $\dot U = J_{0,n} U$, $U(0) = I$, that characterizes the matrix exponential, so $U(t) = e^{t J_{0,n}}$.
(b) Since $J_{\lambda,n} = \lambda I + J_{0,n}$, by Exercise 9.4.14, $e^{t J_{\lambda,n}} = e^{t\lambda} e^{t J_{0,n}}$, i.e., you merely multiply all entries in the previous formula by $e^{t\lambda}$.
(c) According to Exercises 9.4.17, 27, if $A = S J S^{-1}$ where $J$ is the Jordan canonical form, then $e^{tA} = S\, e^{tJ} S^{-1}$, and $e^{tJ}$ is a block diagonal matrix given by the exponentials of its individual Jordan blocks, computed in part (b).
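The closed form for the exponential of a Jordan block in 9.4.28 can be compared with a direct power series summation. This is a minimal sketch for an $n \times n$ upper triangular Jordan block; the helper names, the eigenvalue $-0.5$, the size 4, and the time $1.3$ are all illustrative assumptions.

```python
import numpy as np
from math import factorial

def jordan_block(lam, n):
    """n x n Jordan block with eigenvalue lam and ones on the superdiagonal."""
    return lam * np.eye(n) + np.diag(np.ones(n - 1), k=1)

def exp_jordan_block(lam, n, t):
    """Closed form: entry (i, j) is e^{lam t} t^(j-i) / (j-i)! for j >= i, else 0."""
    E = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            E[i, j] = t ** (j - i) / factorial(j - i)
    return np.exp(lam * t) * E

def exp_series(A, terms=40):
    """Truncated power series sum of A^k / k!, adequate for matrices of modest norm."""
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k
        E = E + term
    return E

lam, n, t = -0.5, 4, 1.3                 # illustrative values
closed_form = exp_jordan_block(lam, n, t)
series = exp_series(t * jordan_block(lam, n))
max_error = np.abs(closed_form - series).max()
print(max_error)
```

The polynomial factors $t^{j-i}/(j-i)!$ are exactly the terms that produce the polynomial growth discussed for incomplete eigenvalues in the stability exercises.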
9.4.29. If $J$ is a Jordan matrix, then, by the arguments in Exercise 9.4.28, $e^{tJ}$ is upper triangular with diagonal entries given by $e^{t\lambda}$ where $\lambda$ is the eigenvalue appearing on the diagonal of the corresponding Jordan block of $A$. In particular, the multiplicity of $\lambda$, which is the number of times it appears on the diagonal of $J$, is the same as the multiplicity of $e^{t\lambda}$ for $e^{tJ}$. Moreover, since $e^{tA}$ is similar to $e^{tJ}$, its eigenvalues are the same, and of the same multiplicities.
9.4.30. (a) All matrix exponentials are nonsingular by the remark after (9.44). (b) Both $A = O$ and $A = \begin{pmatrix} 0 & -2\pi \\ 2\pi & 0 \end{pmatrix}$ have the identity matrix as their exponential $e^{A} = I$. (c) If $e^{A} = I$ and $\lambda$ is an eigenvalue of $A$, then $e^{\lambda} = 1$, since 1 is the only eigenvalue of $I$. Therefore, the eigenvalues of $A$ must be integer multiples of $2\pi i$. Since $A$ is real, the eigenvalues must be complex conjugate, and hence either both 0, or $\pm 2 n \pi i$ for some positive integer $n$. In the latter case, the characteristic equation of $A$ is $\lambda^2 + 4 n^2 \pi^2 = 0$, and hence $A$ must have zero trace and determinant $4 n^2 \pi^2$. Thus, $A = \begin{pmatrix} a & b \\ c & -a \end{pmatrix}$ with $a^2 + b c = -4 n^2 \pi^2$. If $A$ has both eigenvalues zero, it must be complete, and hence $A = O$, which is included in the previous formula.

9.4.31. Even though this formula is correct in the scalar case, it is false in general. Would that
life were so simple!

9.4.32.
1 2t
(a) u1 (t) = 13 et 12
e
14 e2 t , u2 (t) = 13 et 13 e 2 t ;
(b) u1 (t) = et1 et + t et , u2 (t) = et1 et + t et ;
(c) u1 (t) = 31 cos 2 t 21 sin 2 t 13 cos t, u2 (t) = cos 2 t + 32 sin 2 t
4t
3
1
13 4 t
29
3
(d) u(t) = 13
16 e + 16 4 t, v(t) = 16 e 16 + 4 t;
1 2
1 3
1 2
1 3
(e) p(t) = 2 t + 3 t , q(t) = 2 t 3 t .

1
3

sin t;

9.4.33.
(a) u1 (t) = 12 cos 2 t + 14 sin 2 t + 12 12 t, u2 (t) = 2 e t 12 cos 2 t 14 sin 2 t 23 +
u3 (t) = 2 e t 41 cos 2 t 34 sin 2 t 74 + 23 t;
(b) u1 (t) = 32 et + 12 e t + t e t , u2 (t) = t e t , u3 (t) = 3 et + 2 e t + 2 t e t .

3
2

t,

9.4.34. Since is not an eigenvalue, A I is nonsingular. Set w = (A I )1 v. Then


u? (t) = e t w is a solution. The general solution is u(t) = e t w + z(t) = e t w + et A b,
where b is any vector.
9.4.35.
(a) $u(t) = \int_0^t e^{(t-s)A}\, b \, ds$.

(b) Yes, since if b = A c, then the integral can be evaluated as


Z t
e(ts) A A c ds
0
tA

u(t) =
where v(t) = e
rium solution.

9.4.36.
(a)

(b)

(c)

(d)

= e(ts) A c

s=0

= et A c c = v(t) u? ,

c solves the homogeneous system v = A v, while u? = c is the equilib-

e2 t 0 scalings in the x direction, which expand when t > 0 and contract when
0 1
t < 0. The trajectories are half-lines parallel to the x axis. Points on the y axis are left
fixed. !
1 0
shear transformations in the y direction. The trajectories are lines parallel
t 1
to the y axis. Points on the y axis are fixed.
!
cos 3 t sin 3 t
rotations around the origin, starting in a clockwise direction for
sin 3 t cos 3 t
t > 0. The trajectories are the circles x2 + y 2 = c. The origin is fixed.
!
cos 2 t sin 2 t
elliptical rotations around the origin. The trajectories are the
2 sin 2 t 2 cos 2 t
ellipses x2 + 41 y 2 = c. The origin is fixed.
!

cosh t sinh t
hyperbolic rotations. Since the determinant is 1, these are area(e)
sinh t cosh t
preserving scalings: for t > 0, expanding by a factor of et in the direction x = y and
contracting by the reciprocal factor e t in the direction x = y; the reverse holds for
t < 0. The trajectories are the semi-hyperbolas x2 y 2 = c and the four rays x = y.
The origin is fixed.

9.4.37. 0
(a)

(b)

(c)

(d)

(e)

e2 t 0 0
B
C
t
2
2t
@ 0
et 0 A scalings by a factor = e in the y direction and = e in the x
0
0 1
direction. The trajectories are the semi-parabolas x = c y 2 , z = d for c, d constant, and
the half-lines x 6= 0, y = 0, z = d and x = 0, y 6= 0, z = d. Points on the z axis are left
fixed.
0
1
1 0 t
B
C
@ 0 1 0 A shear transformations in the x direction, with magnitude proportional
0 0 1
to the z coordinate. The trajectories are lines parallel to the x axis. Points on the xy
plane are fixed.
0
1
cos 2 t 0 sin 2 t
B
C
0
1
0
@
A rotations around the y axis. The trajectories are the circles
sin 2 t 0 cos 2 t
x2 + z 2 = c, y = d. Points on the y axis are fixed.
0
1
cos t
sin t 0
B
0C
@ sin t cos t
A spiral motions around the z axis. The trajectories are the pos0
0
et
itive and negative z axes, circles in the xy plane, and cylindrical spirals (helices) winding around the z axis while going away from the xy pane at an exponentially increasing
rate. The only fixed point is the origin.
0
1
cosh t 0 sinh t
B
1
0 C
@ 0
A hyperbolic rotations in the xz plane, cf. Exercise 9.4.36(e). The
sinh t 0 cosh t
trajectories are the semi-hyperbolas x2 z 2 = c, y = d, and the rays x = z, y = d.
The points on the y axis are fixed.

9.4.38.
(a) et:A =
0
2
1
3 + 3 cos 3 t
B

B 1
1
B
1
B 3 3 cos 3 t 3 sin 3 t

@
13 + 13 cos 3 t 1 sin 3 t
3

1
3

31 cos

3t +

1
3

1 sin
3

3t

+ 32 cos 3 t

13 + 31 cos 3 t + 1 sin 3 t
3

(b) The axis is the null eigenvector: ( 1, 1, 1 ) .

1
3 t + 1 sin 3 t
3

C
C
13 + 13 cos 3 t 1 sin 3 t C
C.
3

A
2
1
3 + 3 cos 3 t
13 + 13 cos

9.4.39.
(a) Given $c, d \in \mathbb{R}$ and $x, y \in \mathbb{R}^3$, we have
$$L_v[\, c\,x + d\,y \,] = v \times (c\,x + d\,y) = c\, v \times x + d\, v \times y = c\, L_v[x] + d\, L_v[y],$$
proving linearity.
(b) If $v = (a, b, c)^T$, then $A_v = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix} = -A_v^T$.
(c) Every $3 \times 3$ skew-symmetric matrix has the form $A_v$ for some $v = (a, b, c)^T$.
(d) $x \in \ker A_v$ if and only if $0 = A_v x = v \times x$, which happens if and only if $x = c\,v$ is
parallel to $v$.
(e) If $v = r\,e_3$, then $A_{r e_3} = \begin{pmatrix} 0 & -r & 0 \\ r & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ and hence $e^{t A_{r e_3}} = \begin{pmatrix} \cos r t & -\sin r t & 0 \\ \sin r t & \cos r t & 0 \\ 0 & 0 & 1 \end{pmatrix}$
represents a rotation by angle $r t$ around the $z$ axis. More generally, given $v$ with $r = \| v \|$,
let $Q$ be any rotation matrix such that $Q v = r\,e_3$. Then $Q(v \times x) = (Q v) \times (Q x)$,
since rotations preserve the orthogonality of the cross product and its right-handedness.
Thus, if $\dot x = v \times x$ and we set $y = Q x$, then $\dot y = r\,e_3 \times y$. We conclude that the solutions
$x(t)$ are obtained by rotating the solutions $y(t)$ and so are given by rotations around the
axis $v$.
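
As a quick numerical illustration of parts (b) and (d), the following sketch builds the matrix $A_v$ and compares it with a directly coded cross product. The helper names `cross`, `cross_matrix` and `mat_vec` and the sample vectors are illustrative choices, not part of the exercise.

```python
def cross(v, x):
    """Cross product v x x of two 3-vectors."""
    return [v[1] * x[2] - v[2] * x[1],
            v[2] * x[0] - v[0] * x[2],
            v[0] * x[1] - v[1] * x[0]]

def cross_matrix(v):
    """Skew-symmetric matrix A_v with A_v x = v x x."""
    a, b, c = v
    return [[0, -c, b],
            [c, 0, -a],
            [-b, a, 0]]

def mat_vec(A, x):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]

v = [1, 2, 3]          # sample vector (a, b, c)
x = [4, -5, 6]         # arbitrary test vector
A = cross_matrix(v)

product_matches = mat_vec(A, x) == cross(v, x)      # A_v x equals v x x
is_skew = all(A[i][j] == -A[j][i] for i in range(3) for j in range(3))
kernel_contains_v = mat_vec(A, v) == [0, 0, 0]      # v lies in ker A_v
print(product_matches, is_skew, kernel_contains_v)
```

Integer entries are used so that the comparisons are exact.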

9.4.40.
(a) The solution is $x(t) = \begin{pmatrix} x_0 \cos t - y_0 \sin t \\ x_0 \sin t + y_0 \cos t \\ z_0 \end{pmatrix}$, which is a rotation by angle $t$ around the $z$
axis. The trajectory of the point $(x_0, y_0, z_0)^T$ is the circle of radius $r_0 = \sqrt{x_0^2 + y_0^2}$ at
height $z_0$ centered on the $z$ axis. The points on the $z$ axis, with $r_0 = 0$, are fixed.
(b) For the inhomogeneous system, the solution is $x(t) = \begin{pmatrix} x_0 \cos t - y_0 \sin t \\ x_0 \sin t + y_0 \cos t \\ z_0 + t \end{pmatrix}$, which is a
screw motion. If $r_0 = 0$, the trajectory is the $z$ axis; otherwise it is a helix of radius $r_0$,
spiraling up the $z$ axis.
(c) The solution to the linear system $\dot x = a \times x$ is $x(t) = R_t\, x_0$, where $R_t$ is a rotation
through angle $t\, \| a \|$ around the axis $a$. The solution to the inhomogeneous system is
the screw motion $x(t) = R_t\, x_0 + t\, a$.

9.4.41.
(a) Since $A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}$, we have $A^2 = \begin{pmatrix} -b^2 - c^2 & a b & a c \\ a b & -a^2 - c^2 & b c \\ a c & b c & -a^2 - b^2 \end{pmatrix}$, while
$A^3 = -(a^2 + b^2 + c^2)\, A = -A$. Therefore, by induction, $A^{2m+1} = (-1)^m A$ and
$A^{2m} = (-1)^{m-1} A^2$ for $m \geq 1$. Thus
$$e^{tA} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, A^n = I + \sum_{m=0}^{\infty} (-1)^m \frac{t^{2m+1}}{(2m+1)!}\, A - \sum_{m=1}^{\infty} (-1)^m \frac{t^{2m}}{(2m)!}\, A^2 .$$
The power series are, respectively, those of $\sin t$ and $\cos t - 1$ (since the constant term
doesn't appear), proving the formula $e^{tA} = I + (\sin t)\, A + (1 - \cos t)\, A^2$.
(b) Since $u \neq 0$, the matrices $I, A, A^2$ are linearly independent and hence, by the Euler-Rodrigues
formula, $e^{tA} = I$ if and only if $\cos t = 1$, $\sin t = 0$, so $t$ must be an integer
multiple of $2\pi$.
(c) If $v = r\, u$ with $r = \| v \|$, then
$$e^{t A_v} = e^{t r A_u} = I + (\sin t r)\, A_u + (1 - \cos t r)\, A_u^2 = I + \frac{\sin t \| v \|}{\| v \|}\, A_v + \frac{1 - \cos t \| v \|}{\| v \|^2}\, A_v^2 ,$$
which equals the identity matrix if and only if $t = 2 k \pi / \| v \|$ for some integer $k$.
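
The closed form $e^{tA} = I + (\sin t)A + (1 - \cos t)A^2$ for a unit vector $u$ can be compared numerically with a truncated power series. The sketch below is only a sanity check under that assumption; the function names, the unit vector $(1/3, 2/3, 2/3)$ and the truncation at 40 terms are illustrative choices.

```python
import math

def cross_matrix(u):
    """Skew-symmetric matrix A_u associated with the 3-vector u."""
    a, b, c = u
    return [[0.0, -c, b], [c, 0.0, -a], [-b, a, 0.0]]

def mat_mul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def exp_series(A, t, terms=40):
    """Partial sum of the series sum_n t^n A^n / n!."""
    result, term = identity(), identity()
    for n in range(1, terms):
        term = [[t * entry / n for entry in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result

def rodrigues(u, t):
    """I + sin(t) A + (1 - cos t) A^2 for a unit vector u."""
    A = cross_matrix(u)
    A2 = mat_mul(A, A)
    I = identity()
    return [[I[i][j] + math.sin(t) * A[i][j] + (1 - math.cos(t)) * A2[i][j]
             for j in range(3)] for i in range(3)]

def max_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(3) for j in range(3))

u = (1 / 3, 2 / 3, 2 / 3)                       # unit vector: 1/9 + 4/9 + 4/9 = 1
series_error = max_diff(exp_series(cross_matrix(u), 1.3), rodrigues(u, 1.3))
period_error = max_diff(rodrigues(u, 2 * math.pi), identity())
print(series_error, period_error)
```

Both printed errors should be at the level of floating-point round-off, consistent with parts (a) and (b).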
9.4.42. None of them commute:
"

"

"

"

"

2
0

0
0

2
0

0
0

0
1

0
0

0
1

0
0

0
3

3
0

0
1

0
0

0
4

0
3

1
0

0
1

,
,

!#

0
1

3
0
1
0

!#

!#

1
0

!#

=
=

!#

0
2

0
8

3
0
1
0
6
0

0
,
0
!

2
,
0
!

0
,
3
!

"

"

"

"

0
,
1
!

"

0
,
6

2
0

0
0

2
0

0
0

0
1

0
0

0
3

0
4

3
0
1
0

0
3

3
0

!#

0
1

1
0

0
4

1
0

!#

!#

0
4

1
0

0
1

1
0

0
2

2
,
0

1
0

!#

!#

6
,
0

0
6

9
0
5
0

0
,
1

0
,
9
!

0
,
5

A2v ,

20

13

0 0 2
1
B
7
C
0C
A 5 = @ 0 0 0 A,
0 0 0
0
0
0
1
0
13
1 0
20
0 0 4
0 0 2
2 0 0
B
C7
C B
6B
0C
0A5 = @ 0 0
A,
4@0 1 0A, @0 0
4 0
0
2 0
0
0 0 0
20
1 0
13
0
1
2 0 0
0 1 0
0 1 0
6B
C B
C7
B
C
4 @ 0 1 0 A , @ 1 0 0 A 5 = @ 1 0 0 A,
0 0 0
0 0 1
0 0 0
20
1 0
13
0
1
2 0 0
0 0 1
0 0 2
6B
C B
C7
B
C
4 @ 0 1 0 A , @ 0 0 0 A 5 = @ 0 0 0 A,
0 0 0
1 0 0
2 0 0
20
1 0
13
0
1
0 0 1
0 0 2
2 0
0
6B
C B
C7
B
0A5 = @0 0
0C
4@0 0 0A, @0 0
A,
0 0 0
2 0
0
0 0 2
20
1 0
13
0
1
0 0 1
0 1 0
0 0 1
6B
C B
C7
B
C
4 @ 0 0 0 A , @ 1 0 0 A 5 = @ 0 0 1 A,
0 0 0
0 0 1
0 0 0
20
1 0
13
0
1
0 0 1
0 0 1
1 0
0
6B
C B
C7
B
0C
4@0 0 0A, @0 0 0A5 = @0 0
A,
0 0 0
1 0 0
0 0 1
1
0
13
1 0
20
0 0 2
0 1 0
0 0 2
C
B
C7
B
6B
0C
A , @ 1 0 0 A 5 = @ 0 0 2 A,
4@0 0
2 2
0
0 0 1
2 0
0
20
1 0
13
0
1
0 0 2
0 0 1
4 0 0
6B
B
C7
B
C
0C
4@0 0
A , @ 0 0 0 A 5 = @ 0 0 0 A,
2 0
0
1 0 0
0 0 4
20
1 0
13
0
1
0 1 0
0 0 1
0
0 1
6B
C B
C7
B
0 1 C
4 @ 1 0 0 A , @ 0 0 0 A 5 = @ 0
A.
0 0 1
1 0 0
1 1
0
2

6B
4@0

0
1
0

0
0C
A,
0

B
@0

0
0
0

9.4.43.
(a) If $U, V$ are upper triangular, so are $U V$ and $V U$, and hence so is $[\, U, V \,] = U V - V U$.
(b) If $A^T = -A$, $B^T = -B$, then
$$[\, A, B \,]^T = (A B - B A)^T = B^T A^T - A^T B^T = B A - A B = -[\, A, B \,].$$
(c) No.

9.4.44. The sum of
$$[\,[\, A, B \,], C \,] = (A B - B A)\, C - C\, (A B - B A) = A B C - B A C - C A B + C B A,$$
$$[\,[\, B, C \,], A \,] = (B C - C B)\, A - A\, (B C - C B) = B C A - C B A - A B C + A C B,$$
$$[\,[\, C, A \,], B \,] = (C A - A C)\, B - B\, (C A - A C) = C A B - A C B - B C A + B A C,$$
is clearly zero.
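
The cancellation in the Jacobi identity, and the skew-symmetry statement of Exercise 9.4.43(b), can be checked on concrete matrices. The following sketch uses arbitrarily chosen integer matrices (not taken from the exercises) so that all arithmetic is exact.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A))] for i in range(len(A))]

def commutator(A, B):
    """[A, B] = A B - B A."""
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

# Arbitrary integer test matrices.
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 0, 1], [1, 1, 0], [0, 5, 2]]
C = [[0, 1, 1], [3, 0, 2], [1, 1, 0]]

jacobi_sum = mat_add(
    mat_add(commutator(commutator(A, B), C), commutator(commutator(B, C), A)),
    commutator(commutator(C, A), B))

# Two skew-symmetric matrices and their commutator.
S = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]
T = [[0, 4, 1], [-4, 0, -5], [-1, 5, 0]]
ST = commutator(S, T)
commutator_is_skew = transpose(ST) == [[-e for e in row] for row in ST]
print(jacobi_sum, commutator_is_skew)
```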

9.4.45. In the matrix system $\dfrac{dU}{dt} = A\, U$, the equations in the last row are $\dfrac{du_{nj}}{dt} = 0$ for
$j = 1, \ldots, n$, and hence the last row of $U(t)$ is constant. In particular, for the exponential matrix solution $U(t) = e^{tA}$ the last row must equal the last row of the identity matrix
$U(0) = I$, which is $e_n^T$.

9.4.46. Write the matrix solution as $U(t) = \begin{pmatrix} V(t) & f(t) \\ g(t) & w(t) \end{pmatrix}$, where $f(t)$ is a column vector, $g(t)$
a row vector, and $w(t)$ is a scalar function. Then the matrix system $\dfrac{dU}{dt} = A\, U$ decouples
into $\dfrac{dV}{dt} = B\, V$, $\dfrac{df}{dt} = B\, f + c\, w$, $\dfrac{dg}{dt} = 0$, $\dfrac{dw}{dt} = 0$, with initial conditions $V(0) = I$,
$f(0) = 0$, $g(0) = 0$, $w(0) = 1$. Thus, $g \equiv 0$ and $w \equiv 1$ are constant, and $V(t) = e^{tB}$. The equation
for $f(t)$ becomes $\dfrac{df}{dt} = B\, f + c$, $f(0) = 0$, and the solution is $f(t) = \displaystyle\int_0^t e^{(t-s) B}\, c \; ds$,
cf. Exercise 9.4.35.

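9.4.47. (a) $\begin{pmatrix} x + t \\ y \end{pmatrix}$: translations in the $x$ direction. (b) $\begin{pmatrix} e^{t} x \\ e^{-2t} y \end{pmatrix}$: scaling in the $x$ and $y$ directions
by respective factors $e^{t}$, $e^{-2t}$. (c) $\begin{pmatrix} (x + 1) \cos t - y \sin t - 1 \\ (x + 1) \sin t + y \cos t \end{pmatrix}$: rotations around
the point $\begin{pmatrix} -1 \\ 0 \end{pmatrix}$. (d) $\begin{pmatrix} e^{t} (x + 1) - 1 \\ e^{-t} (y + 2) - 2 \end{pmatrix}$: scaling in the $x$ and $y$ directions, centered at the
point $\begin{pmatrix} -1 \\ -2 \end{pmatrix}$, by reciprocal factors $e^{t}$, $e^{-t}$.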
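
The flow in part (c) can be checked numerically: the center $(-1, 0)$ should stay fixed, distances to the center should be preserved, and composing the maps for times $s$ and $t$ should give the map for time $s + t$. The sketch below assumes the formula for (c) as written above; the test point and times are arbitrary.

```python
import math

def flow(t, p):
    """Image of the point p = (x, y) at time t under the flow of part (c)."""
    x, y = p
    return ((x + 1) * math.cos(t) - y * math.sin(t) - 1,
            (x + 1) * math.sin(t) + y * math.cos(t))

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

center = (-1.0, 0.0)
p = (2.0, 0.5)          # arbitrary starting point
s, t = 0.7, 1.9         # arbitrary times

center_error = dist(flow(t, center), center)                   # center is fixed
group_error = dist(flow(t, flow(s, p)), flow(s + t, p))        # one-parameter group
radius_error = abs(dist(flow(t, p), center) - dist(p, center)) # circles about center
print(center_error, group_error, radius_error)
```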

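9.5.1. The vibrational frequency is $\omega = \sqrt{21/6} \approx 1.87083$, and so the number of Hertz is
$\omega/(2\pi) \approx .297752$.

9.5.2. We need $\dfrac{\omega}{2\pi} = \dfrac{1}{2\pi \sqrt{m}} = 20$ and so $m = \dfrac{1}{1600\, \pi^2} \approx .0000633257$.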
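
The arithmetic in these two answers is easy to reproduce. The sketch below assumes, as the formulas above suggest, that the frequency in 9.5.2 is $\omega = 1/\sqrt{m}$ (unit spring stiffness); the variable names are illustrative.

```python
import math

# Exercise 9.5.1: frequency in radians per unit time, then in Hertz.
omega = math.sqrt(21 / 6)
hertz = omega / (2 * math.pi)

# Exercise 9.5.2: solve 1 / (2 pi sqrt(m)) = 20 for the mass m.
target_hertz = 20
m = 1 / (2 * math.pi * target_hertz) ** 2      # equals 1 / (1600 pi^2)
check_hertz = 1 / (2 * math.pi * math.sqrt(m)) # should reproduce 20

print(round(omega, 5), round(hertz, 6), m, check_hertz)
```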

9.5.3.
(a) Periodic of period $\pi$. [Graph omitted.]
(b) Periodic of period 2. [Graph omitted.]
(c) Periodic of period 12. [Graph omitted.]
(d) Quasi-periodic. [Graph omitted.]
(e) Periodic of period $120\, \pi$. [Graph omitted.]
(f) Quasi-periodic. [Graph omitted.]
(g) $\sin t \, \sin 3 t = \tfrac12 \cos 2 t - \tfrac12 \cos 4 t$, and so periodic of period $\pi$. [Graph omitted.]

m
, where m is the least common multiple of q and s, while 2k
2k1
is the largest power of 2 appearing in both p and r.

gives two linearly !independent solutions;
9.5.5. (a) 2 , 7 ; (b) 4 each eigenvalue
!

1
2
(c) u(t) = r1 cos( 2 t 1 )
; (d) The solution is periodic
+ r2 cos( 7 t 2 )
2
1
if only one frequency is excited, i.e., r1 = 0 or r2 = 0; all other solutions are quasiperiodic.
9.5.4. The minimal period is

9.5.6. (a) 5, 10; (b) 4 each eigenvalue


gives two linearly independent
solutions;
!
!
3
4
(c) u(t) = r1 cos(5 t 1 )
+ r2 cos(10 t 2 )
; (d) All solutions are periodic;
4
3
when r1 6= 0, the period is 25 , while when r1 = 0 the period is 51 .
9.5.7.

(a) u(t) = r1 cos(t 1 ) + r2 cos( 5 t 2 ), v(t) = r1 cos(t 1 ) r2 cos( 5 t 2 );


(b) u(t) = r1 cos( 10 t 1 ) 2 r2 cos( 15 t 2 ),

v(t) = 2 r1 cos( 10 t 1 ) + r2 cos( 15 t 2 );


T
(c) u(t) = ( r1 cos(t 1 ), r20
cos(2
t 2 ), r3 cos(3 t 01 ) )1
;
1
0
1
1
1
1

C
B
C
B
C
(d) u(t) = r1 cos( 2 t 1 ) B
@ 1 A + r2 cos(3 t 2 ) @ 1 A + r3 cos( 12 t 3 ) @ 1 A.
0
1
2
9.5.8. The system has stiffness matrix $K = ( \, 1 \;\; {-1} \, ) \begin{pmatrix} c_1 & 0 \\ 0 & c_2 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = (c_1 + c_2)$ and so the
dynamical equation is $m\, \ddot u + (c_1 + c_2)\, u = 0$, which is the same as a mass connected to a
single spring with stiffness $c = c_1 + c_2$.

9.5.9. Yes. For example, $c_1 = 16$, $c_2 = 36$, $c_3 = 37$ leads to $K = \begin{pmatrix} 52 & -36 \\ -36 & 73 \end{pmatrix}$ with eigenvalues
$\lambda_1 = 25$, $\lambda_2 = 100$, and hence natural frequencies $\omega_1 = 5$, $\omega_2 = 10$. Since $\omega_2$ is a
rational multiple of $\omega_1$, every solution is periodic with period $\tfrac25 \pi$ or $\tfrac15 \pi$. Further examples
can be constructed by solving the matrix equation $K = \begin{pmatrix} c_1 + c_2 & -c_2 \\ -c_2 & c_2 + c_3 \end{pmatrix} = Q^T \Lambda\, Q$ for
$c_1, c_2, c_3$, where $\Lambda$ is a diagonal matrix with entries $\omega^2$, $r^2 \omega^2$, where $r$ is any rational number
and $Q$ is a suitable orthogonal matrix, making sure that the resulting stiffnesses are all
positive: $c_1, c_2, c_3 > 0$.
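
The eigenvalues quoted for this example follow from the quadratic formula applied to the characteristic polynomial of the $2 \times 2$ stiffness matrix. A minimal sketch, with illustrative helper names `stiffness` and `symmetric_eigenvalues`:

```python
import math

def stiffness(c1, c2, c3):
    """Stiffness matrix of two masses joined by three springs between fixed supports."""
    return [[c1 + c2, -c2], [-c2, c2 + c3]]

def symmetric_eigenvalues(K):
    """Eigenvalues of a 2x2 symmetric matrix from its trace and determinant."""
    trace = K[0][0] + K[1][1]
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    disc = math.sqrt(trace * trace - 4 * det)
    return (trace - disc) / 2, (trace + disc) / 2

K = stiffness(16, 36, 37)
lam1, lam2 = symmetric_eigenvalues(K)
omega1, omega2 = math.sqrt(lam1), math.sqrt(lam2)
print(K, lam1, lam2, omega1, omega2)
```

Here the trace is 125 and the determinant is 2500, so the eigenvalues are $(125 \mp 75)/2$.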
9.5.10. (a) The vibrations slow down. (b) The vibrational frequencies are $\omega_1 = .44504$, $\omega_2 = 1.24698$,
$\omega_3 = 1.80194$, each of which is a bit smaller than the fixed end case, which has
frequencies $\omega_1 = \sqrt{2 - \sqrt2} = .76537$, $\omega_2 = \sqrt2 = 1.41421$, $\omega_3 = \sqrt{2 + \sqrt2} = 1.84776$.
(c) Graphing the motions of the three masses for $0 \leq t \leq 50$:
with bottom support:

without bottom support:

9.5.11.
(a) The vibrational frequencies and eigenvectors are
$\omega_1 = \sqrt{2 - \sqrt2} = .7654$, $\omega_2 = \sqrt2 = 1.4142$, $\omega_3 = \sqrt{2 + \sqrt2} = 1.8478$,
$v_1 = \begin{pmatrix} 1 \\ \sqrt2 \\ 1 \end{pmatrix}$, $v_2 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$, $v_3 = \begin{pmatrix} 1 \\ -\sqrt2 \\ 1 \end{pmatrix}$.
Thus, in the slowest mode, all three masses are moving in the same direction, with the
middle mass moving $\sqrt2$ times farther; in the middle mode, the two outer masses are
moving in opposing directions by equal amounts, while the middle mass remains still; in
the fastest mode, the two outer masses are moving in tandem, while the middle mass is
moving farther in an opposing direction.
(b) The vibrational frequencies and eigenvectors are
$\omega_1 = .4450$, $\omega_2 = 1.2470$, $\omega_3 = 1.8019$,
$v_1 = \begin{pmatrix} .3280 \\ .5910 \\ .7370 \end{pmatrix}$, $v_2 = \begin{pmatrix} .7370 \\ .3280 \\ -.5910 \end{pmatrix}$, $v_3 = \begin{pmatrix} .5910 \\ -.7370 \\ .3280 \end{pmatrix}$.
Thus, in the slowest mode, all three masses are moving in the same direction, each slightly
farther than the one above it; in the middle mode, the top two masses are moving in
the same direction, while the bottom, free mass moves in the opposite direction; in the
fastest mode, the top and bottom masses are moving in the same direction, while the
middle mass is moving in an opposing direction.
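
The quoted frequencies and mode shapes can be checked against the eigenvalue equation $K v = \omega^2 v$. The sketch below assumes unit masses and unit spring stiffnesses, so that $K$ is the standard tridiagonal matrix with the last diagonal entry equal to 1 when the bottom mass is free; these matrices and the sign normalization of the eigenvectors are assumptions consistent with the numbers above.

```python
import math

def residual(K, omega, v):
    """Largest entry of |K v - omega^2 v|."""
    Kv = [sum(K[i][j] * v[j] for j in range(3)) for i in range(3)]
    return max(abs(Kv[i] - omega ** 2 * v[i]) for i in range(3))

r2 = math.sqrt(2)

# (a) Both ends attached to supports.
K_fixed = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
modes_fixed = [(math.sqrt(2 - r2), (1, r2, 1)),
               (r2, (-1, 0, 1)),
               (math.sqrt(2 + r2), (1, -r2, 1))]

# (b) Bottom mass free; values rounded to four digits as quoted.
K_free = [[2, -1, 0], [-1, 2, -1], [0, -1, 1]]
modes_free = [(0.4450, (0.3280, 0.5910, 0.7370)),
              (1.2470, (0.7370, 0.3280, -0.5910)),
              (1.8019, (0.5910, -0.7370, 0.3280))]

fixed_residuals = [residual(K_fixed, w, v) for w, v in modes_fixed]
free_residuals = [residual(K_free, w, v) for w, v in modes_free]
print(fixed_residuals, free_residuals)
```

The residuals in case (b) are limited by the four-digit rounding of the quoted values.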
9.5.12. Let $c$ be the common spring stiffness. The stiffness matrix $K$ is tridiagonal with all
diagonal entries equal to $2c$ and all sub- and super-diagonal entries equal to $-c$. Thus, by
Exercise 8.2.48, the vibrational frequencies are
$$\sqrt{2c \left( 1 - \cos \frac{k\pi}{n+1} \right)} = 2 \sqrt{c}\, \sin \frac{k\pi}{2(n+1)} \qquad \text{for} \quad k = 1, \ldots, n.$$
As $n \to \infty$, the frequencies form a denser and denser set of points on the
graph of $2 \sqrt{c}\, \sin \theta$ for $0 \leq \theta \leq \tfrac12 \pi$.
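
A short numerical check of the frequency formula: for $c = 1$ and $n = 3$ it should reproduce the fixed-end frequencies .76537, 1.41421, 1.84776 quoted in Exercise 9.5.10, and the two expressions for the frequency should agree. The function names are illustrative.

```python
import math

def frequency_half_angle(c, n, k):
    """2 sqrt(c) sin(k pi / (2 (n + 1)))."""
    return 2 * math.sqrt(c) * math.sin(k * math.pi / (2 * (n + 1)))

def frequency_from_eigenvalue(c, n, k):
    """Square root of the eigenvalue 2 c (1 - cos(k pi / (n + 1)))."""
    return math.sqrt(2 * c * (1 - math.cos(k * math.pi / (n + 1))))

three_mass = [frequency_half_angle(1.0, 3, k) for k in (1, 2, 3)]
largest_gap = max(
    abs(frequency_half_angle(2.5, n, k) - frequency_from_eigenvalue(2.5, n, k))
    for n in range(1, 30) for k in range(1, n + 1))
print([round(w, 5) for w in three_mass], largest_gap)
```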
9.5.13. We take "fastest" to mean that the slowest vibrational frequency is as large as possible. Keep in mind that, for a chain between two fixed supports, completely reversing the
order of the springs does not change the frequencies. For the indicated springs connecting
2 masses to fixed supports, the order 2, 1, 3 or its reverse, 3, 1, 2 is the fastest, with frequencies 2.14896, 1.54336. For the order 1, 2, 3, the frequencies are 2.49721, 1.32813, while for
1, 3, 2 the lowest frequency is the slowest, at 2.74616, 1.20773. Note that as the lower frequency slows down, the higher one speeds up. In general, placing the weakest spring in the
middle leads to the fastest overall vibrations.
For a system of $n$ springs with stiffnesses $c_1 > c_2 > \cdots > c_n$, when the bottom mass
is unattached, the fastest vibration, as measured by the minimal vibrational frequency,
occurs when the springs are connected in order $c_1, c_2, \ldots, c_n$ from stiffest to weakest, with the strongest attached to the support. For fixed supports, numerical computations show that the fastest vibrations occur when the springs are attached in the order
$c_n, c_{n-3}, c_{n-5}, \ldots, c_3, c_1, c_2, c_4, \ldots, c_{n-1}$ when $n$ is odd, and $c_n, c_{n-1}, c_{n-4}, c_{n-6}, \ldots, c_4, c_2,$
$c_1, c_3, \ldots, c_{n-5}, c_{n-3}, c_{n-2}$ when $n$ is even. Finding analytic proofs of these observations
appears to be a challenge.
0

3
2
1
2

12

1
0

0
C
0C
C

u1 (t)
B
C
d u

B v1 (t) C
Cu = 0, where u(t) = B
C are the horizontal and
+
9.5.14. (a)
2
3
1
C
@
u2 (t) A
dt
1
0
2
2A
v2 (t)
1
3
0
0
2
2
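vertical displacements of the two free nodes. (b) 4; (c) $\omega_1 = \sqrt{1 - \tfrac12 \sqrt2} = .541196$, $\omega_2 = \sqrt{2 - \tfrac12 \sqrt2} = 1.13705$,
$\omega_3 = \sqrt{1 + \tfrac12 \sqrt2} = 1.30656$, $\omega_4 = \sqrt{2 + \tfrac12 \sqrt2} = 1.64533$; (d) the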


1
1
0
0
1
0
2.4142
1 2
1 + 2
B
B
B
1 C
C
C
B
C
B
1 C
1 C
B
C = B
C =
C, v2 = B
corresponding eigenvectors are v1 = B
@ 1 2 A
@ 2.4142 A
@ 1 2A
1
1
1
1
1
0
1
1
0
0
0
1
0
.4142
2.4142
.4142
1 2
1 + 2
B
C
C
B
B
B
B
1
1 C
B
C
B 1 C
B
C
C
B
C
B
1 C
1 C
C, v 4 = B
C. In the
C = B
C = B
C, v3 = B
B
@ 1+ 2A
@ .4142 A
@ 2.4142 A
@ 1 + 2 A
@ .4142 A
1
1
1
1
1
first mode, the left corner moves down and to the left, while the right corner moves up and
to the left, and then they periodically reverse directions; the horizontal motion is proportionately 2.4 times the vertical. In the second mode, both corners periodically move up and
towards the center line and then down and away; the vertical motion is proportionately 2.4
times the horizontal. In the third mode, the left corner first moves down and to the right,
while the right corner moves up and to the right, periodically reversing their directions; the
vertical motion is proportionately 2.4 times the horizontal. In the fourth mode, both corners periodically move up and away from the center line and then down and towards it; the
horizontal motion is proportionately 2.4 times the vertical.

1
(e) u(t) = cos(1 t) v1 + cos(2 t) v2 + cos(3 t) v3 cos(4 t) v4 , which is a
4 2
quasiperiodic combination of all four normal modes.
2

B
B
B
B
B
@

3
2

9.5.15. The system has periodic solutions whenever $A$ has a complex conjugate pair of purely
imaginary eigenvalues. Thus, a quasi-periodic solution requires two such pairs, $\pm\, i\, \omega_1$ and
$\pm\, i\, \omega_2$, with the ratio $\omega_1/\omega_2$ an irrational number. The smallest dimension where this can
occur is 4.

9.5.16.

(a) $u(t) = a\, t + b + 2\, r \cos(\sqrt5\, t - \delta)$, $v(t) = -2\, a\, t - 2\, b + r \cos(\sqrt5\, t - \delta)$.
The unstable mode consists of the terms with $a$ in them; it will not be excited if the
initial conditions satisfy $\dot u(t_0) - 2\, \dot v(t_0) = 0$.
(b) $u(t) = -3\, a\, t - 3\, b + r \cos(\sqrt{10}\, t - \delta)$, $v(t) = a\, t + b + 3\, r \cos(\sqrt{10}\, t - \delta)$.
The unstable mode consists of the terms with $a$ in them; it will not be excited if the
initial conditions satisfy $-3\, \dot u(t_0) + \dot v(t_0) = 0$.
r0
r
!
(c)

u(t) = 2 a t 2 b
v(t) =

w(t) =

7+ 13
2

1 13
r1 cos
4

t 1

1+ 13
r2 cos
4

7 13
2

r
r
!

3+ 13
3 13
7+ 13
7 13
+
r
cos
t

r
cos
2at 2b +
1
1
2
4
2
4
2
r
r
!
!

7+ 13
7 13
t 1 + r2 cos
t 2 .
a t + b + r1 cos
2
2

t 2
t 2

!
!

,
,

The unstable mode is the term containing a; it will not be excited if the initial condi

tions satisfy 2 u(t0 ) 2 v(t0 ) + w(t0 ) =0.


(d) $u(t) = (a_1 - 2\, a_2)\, t + b_1 - 2\, b_2 + r \cos(\sqrt6\, t - \delta)$,
$v(t) = a_1\, t + b_1 - r \cos(\sqrt6\, t - \delta)$,
$w(t) = a_2\, t + b_2 + 2\, r \cos(\sqrt6\, t - \delta)$.
The unstable modes consist of the terms with $a_1$ and $a_2$ in them; they will not be excited
if the initial conditions satisfy $\dot u(t_0) + \dot v(t_0) = 0$ and $-2\, \dot u(t_0) + \dot w(t_0) = 0$.
9.5.17.
(a) Q =

0
B
B
B
B
@

1
2

1
2

1
2

C
C
1C
C,
A

4
=B
@0
0

0
2
0

0
0C
A.
2

(b) Yes, because K is symmetric and has all positive eigenvalues.


!T

1
.
(c) u(t) = cos 2 t, sin 2 t, cos 2 t
2

(d) The solution $u(t)$ is periodic with period $\sqrt2\, \pi$.
(e) No. Since the frequencies $2, \sqrt2$ are not rational multiples of each other, the general
solution is quasi-periodic.
9.5.18.
(a) Q =

0
B
B
B
B
@

1
3
1
3
1
3

0
1
2

1
6
2
6
1
6

C
C
C,
C
A

B
@0

0
2
0

0
0C
A.
0

(b) No: $K$ is only positive semi-definite.

1
01
1
(t + 1) + 32 cos 3 t
sin 3 t
3
3
B3

C
B2
C
2
1
B

(t
+
1)

cos
3
t
+
3tC
sin
(c) u(t) = B 3
C.
3
3
3
@

A
2
1
1

3 (t + 1) + 3 cos 3 t 3 3 sin 3 t
(d) The solution $u(t)$ is unstable, and becomes unbounded as $| t | \to \infty$.
(e) No: the general solution is also unbounded.
9.5.19. The solution to the initial value problem $m\, \dfrac{d^2 u}{dt^2} + \varepsilon\, u = 0$, $u(t_0) = a$, $\dot u(t_0) = b$, is
$$u_\varepsilon(t) = a \cos \sqrt{\frac{\varepsilon}{m}}\, (t - t_0) + b\, \sqrt{\frac{m}{\varepsilon}}\, \sin \sqrt{\frac{\varepsilon}{m}}\, (t - t_0).$$
In the limit as $\varepsilon \to 0$, using the fact that $\displaystyle\lim_{h \to 0} \frac{\sin c\, h}{h} = c$, we find $u_\varepsilon(t) \to a + b\, (t - t_0)$, which is the solution to the unrestrained
initial value problem $m\, \ddot u = 0$, $u(t_0) = a$, $\dot u(t_0) = b$. Thus, as the spring stiffness goes to
zero, the motion converges to the unrestrained motion. However, since the former solution
is periodic, while the latter moves along a straight line, the convergence is non-uniform on
all of $\mathbb{R}$ and the solutions are close only for a period of time: if you wait long enough they
will diverge.
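
The convergence, and its non-uniformity, can be seen numerically. In the sketch below the stiffness is written `eps`, and the values of $m$, $a$, $b$, $t_0$ are arbitrary illustrative choices.

```python
import math

def u_spring(t, eps, m=2.0, a=1.0, b=3.0, t0=0.0):
    """Solution of m u'' + eps u = 0 with u(t0) = a, u'(t0) = b."""
    w = math.sqrt(eps / m)
    return a * math.cos(w * (t - t0)) + b * math.sin(w * (t - t0)) / w

def u_free(t, a=1.0, b=3.0, t0=0.0):
    """Solution of m u'' = 0 with the same initial data."""
    return a + b * (t - t0)

# At a fixed time the spring solution approaches the free motion as eps -> 0.
errors_at_fixed_time = [abs(u_spring(1.5, eps) - u_free(1.5))
                        for eps in (1e-2, 1e-4, 1e-6, 1e-8)]

# For a fixed small eps the two solutions eventually separate:
# the spring solution is bounded by |a| + |b| sqrt(m / eps).
eps = 1e-8
bound = 1.0 + 3.0 * math.sqrt(2.0 / eps)
late_gap = abs(u_spring(1e6, eps) - u_free(1e6))
print(errors_at_fixed_time, bound, late_gap)
```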

9.5.20.

1
3
(a) Frequencies: 1 =
5 = .61803, 2 = 1, 3 =

+
2
2 5 = 1.618034;
1
1
0
1
0
1
2+ 5
2 5
B
C
B
B
C
C
B 1 C
B
1 C
1 C
C, v 2 = B
C; unstable
C, v3 = B
B
stable eigenvectors: v1 = B
@ 1 A
@ 2 5 A
@ 2 + 5 A
1
1
1
1
0
1
B
1 C
C
C. In the lowest frequency mode, the nodes vibrate up and toB
eigenvector: v4 = B
@ 1A
1
wards each other and then down and away, the horizontal motion being less pronounced
than the vertical; in the next mode, the nodes vibrate in the directions of the diagonal
bars, with one moving towards the support while the other moves away; in the highest
frequency mode, they vibrate up and away from each other and then down and towards,
with the horizontal motion significantly more than the vertical; in the unstable mode
the left node moves down and to the right, while the right hand node moves at the same
rate up and to the right.
(b) Frequencies: 1 = .444569, 2 = .758191, 3 = 1.06792, 4 = 1.757; eigenvectors:
1
0
1
0
1
0
1
0
.823815
.500054
.122385
.237270
C
B
C
B
C
B
B
.117940 C
B .066249 C
B .185046 C
B .973375 C
C
C.
C, v 4 = B
C, v3 = B
C, v 2 = B
B
v1 = B
@ .552745 A
@ .666846 A
@ .028695 A
@ .498965 A
.106830
.520597
.191675
.825123
In the lowest frequency mode, the left node vibrates down and to the right, while the
right hand node moves further up and to the right, then both reversing directions; in
the second mode, the nodes vibrate up and to the right, and then down and to the left,
the left node moving further; in the next mode, the left node vibrates up and to the
right, while the right hand node moves further down and to the right, then both reversing directions; in the highest frequency mode, they move up and towards each other and
then down and away, with the horizontal motion more than the vertical.
r

2
3
20
(c) Frequencies: 1 = 2 = 11
= .4264, 3 = 21

5
=
1.1399,

=
4
11
11
11 =
r

3
1.3484, 5 = 21
11 + 11 5 = 1.5871; stable eigenvectors:
3
2

1
2
0

1
1
01
1
25
+ 25
B 2
C
C
B 3C
B 2
B0C
B
C
C
B
B
0C
0
0
B C
B
C
C
C
B
B
B C
B
C
C
C
B
B
B0C
1
1
1
C, v = B
C, v = B
C;
B C, v 3 = B
B
C
C
C
B
B
1
4
5
1
1
5
5
B0C
B +
C
C
B3 C
B
B C
2 C
2 C
B 2
C
B
B 2
@1A
@
A
A
A
@
@
0
0
0
0
1
1
1
0
31
B 0C
B
C
B
C
B 1 C
C. In the two lowest frequency modes, the individunstable eigenvector: v6 = B
B 3C
B
C
@ 0A
1

01
B1C
B C
B C
0C
B C, v 2 =
v1 = B
B0C
B C
@0A
0
0


ual nodes vibrate horizontally and transverse to the swing; in the next lowest mode, the
nodes vibrate together up and away from each other, and then down and towards each
other; in the next mode, the nodes vibrate oppositely up and down, and towards and
then away from each other; in the highest frequency mode, they also vibrate up
and down in opposing motion, but in the same direction along the swing; in the unstable mode the left node moves down and in the direction of the bar, while the right hand
node moves at the same rate up and in the same horizontal direction.
9.5.21. If the mass-spring molecule is allowed to move in space, then the vibrational modes
and frequencies remain the same, while there are 14 independent solutions corresponding
to the 7 modes of instability: 3 rigid translations, 3 (linearized) rotations, and 1 mechanism, which is the same as in the one-dimensional version. Thus, the general motion of the
molecule in space is to vibrate quasi-periodically at frequencies $\sqrt3$ and 1, while simultaneously translating, rigidly rotating, and bending, all at a constant speed.
9.5.22.

(a) There are 3 linearly independent normal modes of vibration: one of frequency $\sqrt3$, in
which the triangle expands and contracts, and two of frequency $\sqrt{3/2}$, in which one of
the edges expands and contracts while the opposite vertex moves out in the perpendicular direction while the edge is contracting, and in when it expands. (Although there
are three such modes, the third is a linear combination of the other two.) There are
3 unstable null eigenmodes, corresponding to the planar rigid motions of the triangle.
To avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel; thus, if vi is the initial velocity of the ith mode, we require v1 + v2 + v3 = 0 and
v1 + v2 + v3 = 0 where vi denotes the angular component of the velocity vector with
respect to the center of the triangle.

(b) There are 4 normal modes of vibration, all of frequency $\sqrt2$, in which one of the edges
expands and contracts while the two vertices not on the edge stay fixed. There are 4
unstable modes: 3 rigid motions and one mechanism where two opposite corners move
towards each other while the other two move away from each other. To avoid exciting
the instabilities, the initial velocity must be orthogonal to the kernel; thus, if the vertices are at $( \pm 1, \pm 1 )^T$ and $v_i = ( v_i, w_i )^T$ is the initial velocity of the ith mode, we
require $v_1 + v_2 = v_3 + v_4 = w_1 + w_4 = w_2 + w_3 = 0$.

(c) There are 6 normal modes of vibration: one of frequency $\sqrt3$, in which three nonadjacent edges expand and then contract, while the other three edges simultaneously contract
and then expand; two of frequency $\sqrt{5/2}$, in which two opposite vertices move back and
forth in the perpendicular direction to the line joining them (only two of these three
modes are linearly independent); two of frequency $\sqrt{3/2}$, in which two opposite vertices
move back and forth towards each other (again, only two of these three modes are linearly independent); and one of frequency 1, in which the entire hexagon expands and
contracts. There are 6 unstable modes: 3 rigid motions and 3 mechanisms where two
opposite vertices move towards each other while the other four move away. As usual,
to avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel;

thus, if the vertices are at cos 31 k , sin 13 k


ity of the ith mode, we require
v1 + v2 + v3 + v4 + v5 + v6 = 0,
w1 + w2 + w3 + w4 + w5 + w6 = 0,

3 v5 + w5 + 3 v6 + w6 = 0,

, and vi = ( vi , wi )T is the initial veloc

3 v1 + w1 + 2 w2 = 0,

3 v1 + w1 + 2 w6 = 0,

2 w3 + 3 v4 + w4 = 0.

9.5.23. There are 6 linearly independent normal modes of vibration: one of frequency 2, in


which the tetrahedron expands and contracts; four of frequency 2, in which one of the
edges expands and contracts while the opposite vertex stays fixed; and two of frequency

2, in which two opposite edges move towards and away from each other. (There are three
different pairs, but the third mode is a linear combination of the other two.) There are
6 unstable null eigenmodes, corresponding to the three-dimensional rigid motions of the
tetrahedron.
To avoid exciting the instabilities, the initial velocity must be orthogonal to the kernel,
and so, using the result of Exercise 6.3.13, if vi = ( ui , vi , wi )T is the initial velocity of the
ith mode, we require

2 u1 + 6 v1 w1 2 2 u2 + w2 = 0,
u1 + u2 + u3 + u4 = 0,

v1 + v2 + v3 + v4 = 0,
2 v1 + 3 u2 + v2 3 u3 + v3 = 0,

w1 + w2 + w3 + w4 = 0,
2 u1 6 v1 w1 2 2 u3 + w3 = 0.
9.5.24. (a) When $C = I$, then $K = A^T A$ and so the frequencies $\omega_i = \sqrt{\lambda_i}$ are the square roots
of its positive eigenvalues, which, by definition, are the singular values of the reduced incidence matrix. (b) Thus, a structure with one or more very small frequencies $\omega_i \ll 1$, and
hence one or more very slow vibrational modes, is almost unstable in that a small perturbation might create a null eigenvalue corresponding to an instability.
9.5.25. Since corng A is the orthogonal complement to ker A = ker K, the initial velocity is
orthogonal to all modes of instability, and hence by Theorem 9.38, the solution remains
bounded, vibrating around the fixed point prescribed by the initial position.
9.5.26.
(a) u(t) = r1 cos

(d) u(t) =

1
2

t 1

q
1
5
+ r2 cos
3 t 2
2
!
q

2
8
+ r2 cos
t

2
5
3

3
,
1
!
5
,
2

1 t
1
3
1
1
!0
!0
r
r

1+ 3
1 3
3 3
3+ 3
r1 cos
t 1 @ 2 A + r2 cos
t 2 @ 2 A,
2
2
1
1
0 1
0
1
0
1
3
0

C
C
C
2 t 2 B
3 t 3 B
r1 cos ( t 1 ) B
@ 2 A + r3 cos
@ 2 A,
@ 1 A + r2 cos

(b) u(t) = r1 cos


(c) u(t) =

1 !
1
1
2
(e) u(t) = r1 cos
+ r2 cos ( 2 t 2 )
,
3 t 1
1
1
0
1
0
1
0
1
1
2
1

C
C
B
C
3 t 2 B
(f ) u(t) = (a t + b) B
@ 2 A.
@ 1 A + r1 cos ( t 1 ) @ 0 A + r2 cos
1
1
1
1

9.5.27. u1 (t) =
u2 (t) =
9.5.28. u1 (t) =
u2 (t) =

3 3
3+1
3+ 3

t
+
t,
cos
cos
2
2
2 3
r
r

1
1

cos 32 3 t
cos 3+2 3 t.
2 3
2 3

5
17
173
17+3

t+
cos
cos 5+2 17
2
2 17
2 17


1 cos 5 17 t 1 cos 5+ 17 t.
2
2
17
17

31

2 3


t,

9.5.29. The order does make a difference:


Mass order               Frequencies
1, 3, 2  or  2, 3, 1     1.4943, 1.0867, .50281
1, 2, 3  or  3, 2, 1     1.5451, 1.0000, .52843
2, 1, 3  or  3, 1, 2     1.5848,  .9158, .56259

Note that, from top to bottom in the table, the fastest and slowest frequencies speed up,
but the middle frequency slows down.
9.5.30.
(a) We place the oxygen atom at the origin, one hydrogen at $( 1, 0 )^T$ and the other at
$( \cos \theta, \sin \theta )^T = ( -0.2588, 0.9659 )^T$ with $\theta = \tfrac{105}{180}\, \pi = 1.8326$ radians. There are two
independent vibrational modes, whose fundamental frequencies are $\omega_1 = 1.0386$, $\omega_2 =$
1.0229, with corresponding eigenvectors v1 = ( .0555, .0426, .7054, 0., .1826, .6813 )T ,
v2 = ( .0327, .0426, .7061, 0., .1827, .6820 )T . Thus, the (very slightly) higher frequency mode has one hydrogen atom moving towards and the other away from the oxygen, which also slightly vibrates, and then reversing their motion, while in the lower frequency mode, they simultaneously move towards and then away from the oxygen atom.
(b) We place the carbon atom at the origin and the chlorine atoms at
T

1
2 3
3 , 0, 3

2
3 ,

2
1
3,3

!T

2
3 ,

2
1
3,3

!T

( 0, 0, 1 )T ,

which are the vertices of a unit tetrahedron. There are four independent vibrational
modes, whose fundamental frequencies are 1 = 2 = 3 = .139683, 4 = .028571,
with corresponding eigenvectors
1
0
1
0
1
0
1
0
0.
.9586
.5998
.2248
B
B
B .7314 C
B .6510 C
0. C
0. C
C
B
C
B
C
B
C
B
C
B
C
B
C
B
C
B
C
B
C
B
B .1559 C
B .6668 C
0.
0.
C
B
C
B
C
B
C
B
B .4714 C
B .2191 C
B .1245 C
B .0025 C
C
B
C
B
C
B
C
B
B
B
B
B
0. C
0. C
0. C
0. C
C
B
C
B
C
B
C
B
C
B
C
B
C
B
C
B
B .1667 C
B .0775 C
B .0440 C
B .0009 C
C
B
C
B
C
B
C
B
B .2357 C
B .0548 C
B .0318 C
B .1042 C
C
B
C
B
C
B
C
B
C
B
C
B
C
B
C
C.
B
C
B
C
B
B .1805 C,
.4082
,
v
=
.0949
,
v
=
.0551
v
=
v1 = B
4
3
2
C
B
C
B
C
B
C
B
B .1667 C
B .0387 C
B .0225 C
B .0737 C
C
B
C
B
C
B
C
B
B .2357 C
B .0548 C
B .1130 C
B .0246 C
C
B
C
B
C
B
C
B
C
B
C
B
C
B
C
B
B .4082 C
B .0949 C
B .1957 C
B .0427 C
C
B
C
B
C
B
C
B
B .1667 C
B .0387 C
B .0799 C
B .0174 C
C
B
C
B
C
B
C
B
C
B
C
B
C
B
C
B
0.
0.
0.
0.
C
B
C
B
C
B
C
B
C
B
C
B
C
B
C
B
@
A
@
A
@
A
@
0. A
0.
0.
0.
.5000
0.
.0401
.1714
The three high frequency modes are where two of the chlorine atoms remain fixed, while
the other two vibrate in opposite directions into and away from the carbon atom, which
slightly moves in the direction of the incoming atom. (Note: There are six possible
pairs, but only three independent modes.) The low frequency mode is where the four
chlorine atoms simultaneously move into and away from the carbon atom.
(c) There are six independent vibrational modes, whose fundamental frequencies are 1 =
2.17533, 2 = 3 = 2.05542, 4 = 5 = 1.33239, 6 = 1.12603. In all cases, the

bonds periodically lengthen and shorten. In the first mode, adjacent bonds have the opposite behavior; in the next two modes, two diametrically opposite bonds shorten while
the other four bonds lengthen, and then the reverse happens; in the next two modes,
two opposing nodes move in tandem along the line joining them while the other four
twist accordingly; in the lowest frequency mode, the entire molecule expands and contracts. Note: The best way to understand the behavior is to run a movie of the different
motions.
9.5.31.
(a)

0
1
0
2
x
2 B 1C
d B y1 C B
B 0
C+B
B
dt2 @ x2 A @ 1

0
0
0
0

0
1
0
x
2
2 B 1C
d B y1 C B
B 0
B
C+B
dt2 @ x2 A @ 1

0
0
0
0

10

10

0
1 0
x1
B C
C
B
0 0C
B0C
C B y1 C
C = B C.
CB
2 0 A@ x 2 A @ 0 A
y2
0
y2
0
0 0

Same vibrational frequencies: $\omega_1 = 1$, $\omega_2 = \sqrt3$, along with two unstable mechanisms


corresponding to motions of either mass in the transverse direction.
(b)

y2

1
0
1
0

0
x1
0
B
C
B C
0C
C B y1 C
B0C
CB
C = B C.
0 A@ x 2 A @ 0 A
0
y2
0
Same vibrational frequencies: $\omega_1 = \sqrt{\tfrac12 (3 - \sqrt5)}$, $\omega_2 = \sqrt{\tfrac12 (3 + \sqrt5)}$, along with two unstable


mechanisms corresponding to motions of either mass in the transverse direction.
(c) For a mass-spring chain with n masses, the two-dimensional motions are a combination
of the same n one-dimensional vibrational motions in the longitudinal direction, coupled
with n unstable motions of each individual mass in the transverse direction.
9.5.32.
(a)
1 0 0 10 x1 1 0 0 1
C
B0C
B
0 0 0C
B C
C B y1 C
C
C
CB
0 0 0 CB z 1 C B
B0C
C = B C.
CB
C
B0C
B
2 0 0C
B C
CB x 2 C
0 0 0 A @ y2 A @ 0 A
0
0 0 0
z2

Same vibrational frequencies: $\omega_1 = 1$, $\omega_2 = \sqrt3$, along with four unstable mechanisms


corresponding to motions of either mass in the transverse directions.
x1 1 0 2
B 0
By C
B
1C
2 B
C
B
d B z1 C B
B 0
C+B
B
C
B 1
x
dt2 B
B
B 2C
@ 0
@y A
2
0
z2
0

0
0
0
0
0
0

0
0
0
0
0
0

(b)
0

x1 1 0 2 0 0
By C
B 0 0 0
B
1C
2 B
B
C
d B z1 C B
B 0 0 0
B
C+B
C
B 1 0 0
x
dt2 B
B 2C
B
@y A
@ 0 0 0
2
z2
0 0 0
r

2 = 3+2 5 , along with

1
0
0
1
0
0

0
0
0
0
0
0

0 10 x 1 1 0 0 1
B
C
B0C
0C
C B y1 C
B C
CB
C
C
0 CB z 1 C B
B0C
CB
C = B C.
B
C
B0C
0C
CB x 2 C
B C
0 A @ y2 A @ 0 A
0
z2
0

four unstable mechanisms corresponding to


1 = 32 5 ,
motions of either mass in the transverse directions.
(c) For a mass-spring chain with n masses, the three-dimensional motions are a combination
of the same n one-dimensional vibrational motions in the longitudinal direction, coupled
with 2 n unstable motions of each individual mass in the transverse directions.
9.5.33. $K v = \lambda\, M v$ if and only if $M^{-1} K v = \lambda\, v$, and so the eigenvectors and eigenvalues are
the same. The characteristic equations are the same up to a multiple, since, with $P = M^{-1} K$,
$$\det(K - \lambda\, M) = \det \left[ M\, (M^{-1} K - \lambda\, I ) \right] = \det M \; \det(P - \lambda\, I ).$$
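
The determinant identity can be checked exactly with rational arithmetic on a small example. The $2 \times 2$ matrices below are arbitrary symmetric test data (with $M$ positive definite), not taken from the exercise, and the helper names are illustrative.

```python
from fractions import Fraction

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inverse2(A):
    d = det2(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def mat_mul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

K = [[Fraction(3), Fraction(-1)], [Fraction(-1), Fraction(2)]]
M = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(5)]]
P = mat_mul2(inverse2(M), K)            # P = M^{-1} K

def lhs(lam):
    """det(K - lam M)."""
    return det2([[K[i][j] - lam * M[i][j] for j in range(2)] for i in range(2)])

def rhs(lam):
    """det(M) det(P - lam I)."""
    shifted = [[P[i][j] - (lam if i == j else 0) for j in range(2)]
               for i in range(2)]
    return det2(M) * det2(shifted)

samples = [Fraction(n, 3) for n in range(-6, 7)]
identity_holds = all(lhs(lam) == rhs(lam) for lam in samples)
print(identity_holds)
```

Since the two sides agree as polynomials in $\lambda$, they have the same roots, which are the generalized eigenvalues.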


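9.5.34.
(a) First,
$$\frac{d^2 \tilde u}{dt^2} = N\, \frac{d^2 u}{dt^2} = -\, N^{-1} K\, u = -\, N^{-1} K\, N^{-1}\, \tilde u = -\, \tilde K\, \tilde u .$$
Moreover, $\tilde K$ is symmetric since $\tilde K^T = N^{-T} K^T N^{-T} = N^{-1} K N^{-1}$, since both $N$ and $K$
are symmetric. Positive definiteness follows since
$$\tilde x^T \tilde K\, \tilde x = \tilde x^T N^{-1} K N^{-1}\, \tilde x = x^T K\, x > 0 \qquad \text{for all} \quad \tilde x = N x \neq 0.$$
(b) Each eigenvalue $\tilde\lambda = \tilde\omega^2$ and corresponding eigenvector $\tilde v$ of $\tilde K$ produces two solutions
$\tilde u(t) = \left\{ \begin{matrix} \cos \tilde\omega\, t \\ \sin \tilde\omega\, t \end{matrix} \right\} \tilde v$ to the modified system $d^2 \tilde u / dt^2 = -\, \tilde K\, \tilde u$. The corresponding
solutions to the original system are $u(t) = N^{-1}\, \tilde u(t) = \left\{ \begin{matrix} \cos \omega\, t \\ \sin \omega\, t \end{matrix} \right\} v$, where $\omega = \tilde\omega$ and
$v = N^{-1}\, \tilde v$. Finally, we observe that $v$ is the generalized eigenvector for the generalized
eigenvalue $\lambda = \omega^2 = \tilde\lambda$ of the matrix pair $K, M$. Indeed, $\tilde K\, \tilde v = \tilde\lambda\, \tilde v$ implies
$K v = N \tilde K N\, v = N \tilde K\, \tilde v = \tilde\lambda\, N\, \tilde v = \lambda\, M v$.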

9.5.35. Let $v_1,\dots,v_k$ be the eigenvectors corresponding to the non-zero eigenvalues $\lambda_1 = \omega_1^2, \dots, \lambda_k = \omega_k^2$, and $v_{k+1},\dots,v_n$ the null eigenvectors. Then the general solution to the vibrational system $\ddot u + K u = 0$ is
$$u(t) = \sum_{i=1}^{k}\bigl[\,c_i\cos(\omega_i(t-t_0)) + d_i\sin(\omega_i(t-t_0))\,\bigr]\,v_i + \sum_{j=k+1}^{n}\bigl(p_j + q_j(t-t_0)\bigr)\,v_j,$$
which represents a quasi-periodic vibration with frequencies $\omega_1,\dots,\omega_k$ around a linear motion with velocity $\sum_{j=k+1}^{n} q_j v_j$. Substituting into the initial conditions $u(t_0) = a$, $\dot u(t_0) = b$, and using orthogonality, we conclude that
$$c_i = \frac{\langle a, v_i\rangle}{\|v_i\|^2},\qquad d_i = \frac{\langle b, v_i\rangle}{\omega_i\,\|v_i\|^2},\qquad p_j = \frac{\langle a, v_j\rangle}{\|v_j\|^2},\qquad q_j = \frac{\langle b, v_j\rangle}{\|v_j\|^2}.$$
In particular, the unstable modes are not excited if and only if all $q_j = 0$, which requires that the initial velocity $b$ be orthogonal to the null eigenvectors $v_{k+1},\dots,v_n$, which form a basis for the null eigenspace or kernel of $K$. This requires $\dot u(t_0) = b \in (\ker K)^\perp = \operatorname{corng} K = \operatorname{rng} K$, using the fundamental Theorem 2.49 and the symmetry of $K$.

9.5.36.
(a) $u(t) = t\,e^{-3t}$; critically damped.
(b) $u(t) = e^{-t}\bigl(\cos 3t + \frac23 \sin 3t\bigr)$; underdamped.
(c) $u(t) = \frac14 \sin 4(t-1)$; undamped.
(d) $u(t) = \frac{2\sqrt3}{9}\,e^{-3t/2}\sin\frac{3\sqrt3}{2}t$; underdamped.
(e) $u(t) = 4\,e^{-t/2} - 2\,e^{-t}$; overdamped.
(f) $u(t) = e^{-3t}(3\cos t + 7\sin t)$; underdamped.

9.5.37. The solution is $u(t) = \frac14 (v+5)\,e^{-t} - \frac14 (v+1)\,e^{-5t}$, where $v = \dot u(0)$ is the initial velocity. This vanishes when $e^{4t} = \frac{v+1}{v+5}$, which happens when $t = t_\star > 0$ provided $\frac{v+1}{v+5} > 1$, and so the initial velocity must satisfy $v < -5$.

9.5.38.
(a) By Hooke's Law, the spring stiffness is $k = 16/6.4 = 2.5$. The mass is $16/32 = .5$. The equations of motion are $.5\,\ddot u + 2.5\,u = 0$. The natural frequency is $\omega = \sqrt5 = 2.23607$.
(b) The solution to the initial value problem $.5\,\ddot u + \dot u + 2.5\,u = 0$, $u(0) = 2$, $\dot u(0) = 0$, is $u(t) = e^{-t}(2\cos 2t + \sin 2t)$.
(c) The system is underdamped, and the vibrations are less rapid than the undamped system.
9.5.39. The undamped case corresponds to a center; the underdamped case to a stable focus;
the critically damped case to a stable improper node; and the overdamped case to a stable
node.
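The damping classification used in 9.5.36 through 9.5.39 depends only on the sign of $\beta^2 - 4mk$ for the scalar equation $m\,\ddot u + \beta\,\dot u + k\,u = 0$. The following is a minimal sketch, not part of the original solutions; the function names are illustrative, and the residual check uses finite differences, so it is only approximate.

```python
import math

def classify_damping(m, beta, k):
    """Classify m*u'' + beta*u' + k*u = 0 from the sign of beta^2 - 4*m*k."""
    if beta == 0:
        return "undamped"
    disc = beta * beta - 4.0 * m * k
    if disc < 0:
        return "underdamped"
    if disc == 0:  # exact comparison; adequate for the simple values used here
        return "critically damped"
    return "overdamped"

def u_38b(t):
    """Solution stated in 9.5.38(b): e^{-t} (2 cos 2t + sin 2t)."""
    return math.exp(-t) * (2.0 * math.cos(2.0 * t) + math.sin(2.0 * t))

def residual_38b(t, h=1e-4):
    """Central-difference estimate of .5 u'' + u' + 2.5 u at time t."""
    u_minus, u_mid, u_plus = u_38b(t - h), u_38b(t), u_38b(t + h)
    du = (u_plus - u_minus) / (2.0 * h)
    d2u = (u_plus - 2.0 * u_mid + u_minus) / (h * h)
    return 0.5 * d2u + du + 2.5 * u_mid

print(classify_damping(0.5, 1.0, 2.5))
print(abs(residual_38b(1.0)) < 1e-5)
```

The residual should be close to zero at any sample time if the stated solution is correct; the tolerance only reflects the finite-difference error.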
9.5.40.
(a) The general solution has the form $u(t) = c_1 e^{-a t} + c_2 e^{-b t}$ for some $0 < a < b$. If $c_1 = 0$, $c_2 \ne 0$, the solution does not vanish. Otherwise, $u(t) = 0$ if and only if $e^{(b-a)t} = -c_2/c_1$, which, since $e^{(b-a)t}$ is monotonic, happens for at most one time $t = t_\star$.
(b) Yes, since the solution is $u(t) = (c_1 + c_2 t)\,e^{-a t}$ for some $a > 0$, which, for $c_2 \ne 0$, only vanishes when $t = -c_1/c_2$.
9.5.41. The general solution to $m\,\dfrac{d^2u}{dt^2} + \beta\,\dfrac{du}{dt} = 0$ is $u(t) = c_1 + c_2\,e^{-\beta t/m}$. Thus, the mass approaches its equilibrium position $u_\star = c_1$, which can be anywhere, at an exponentially fast rate.

9.6.1.
(a) $\cos 8t - \cos 9t = 2\sin\frac12 t\,\sin\frac{17}{2}t$; fast frequency: $\frac{17}{2}$, beat frequency: $\frac12$.
[Graph omitted.]

(b) $\cos 26t - \cos 24t = -2\sin t\,\sin 25t$; fast frequency: 25, beat frequency: 1.
[Graph omitted.]

(c) $\cos 10t + \cos 9.5t = 2\cos .25t\,\cos 9.75t$; fast frequency: 9.75, beat frequency: .25.
[Graph omitted.]

(d) $\cos 5t - \sin 5.2t = 2\sin\bigl(.1t - \frac14\pi\bigr)\,\sin\bigl(5.1t - \frac14\pi\bigr)$; fast frequency: 5.1, beat frequency: .1.
[Graph omitted.]

9.6.2.
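(a) $u(t) = \frac{1}{27}\cos 3t - \frac{1}{27}\cos 6t$,
(b) $u(t) = \frac{35}{50}\,t\,e^{-3t} - \frac{4}{50}\,e^{-3t} + \frac{4}{50}\cos t + \frac{3}{50}\sin t$,
(c) $u(t) = \frac12\sin 2t + e^{-t/2}\Bigl(\cos\frac{\sqrt{15}}{2}t - \frac{\sqrt{15}}{5}\sin\frac{\sqrt{15}}{2}t\Bigr)$,
(d) $u(t) = \cos 3t - \frac12\,t\cos 3t - \frac16\sin 3t$,
(e) $u(t) = \frac15\cos\frac12 t + \frac35\sin\frac12 t + \frac95\,e^{-t} + e^{-t/2}$,
(f) $u(t) = -\frac{1}{10}\cos t + \frac15\sin t + \frac14\,e^{-t} - \frac{3}{20}\,e^{-t/3}$.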

9.6.3.
(a) $u(t) = \frac13\cos 4t + \frac23\cos 5t + \frac15\sin 5t$;
undamped periodic motion with fast frequency 4.5 and beat frequency .5:
[Graph omitted.]

(b) $u(t) = 3\cos 5t + 4\sin 5t - e^{-2t}\bigl(3\cos 6t + \frac{13}{3}\sin 6t\bigr)$;
the transient is an underdamped motion; the persistent motion is periodic of frequency 5 and amplitude 5:
[Graph omitted.]

(c) $u(t) = -\frac{60}{29}\cos 2t + \frac{5}{29}\sin 2t - \frac{56}{29}\,e^{-5t} + 8\,e^{-t}$;
the transient is an overdamped motion; the persistent motion is periodic:
[Graph omitted.]

(d) $u(t) = \frac{1}{32}\sin 4t - \frac18\,t\cos 4t$; resonant, unbounded motion:
[Graph omitted.]

9.6.4. In general, by (9.102), the maximal allowable amplitude is $\alpha = \sqrt{m^2(\omega^2 - \eta^2)^2 + \beta^2\eta^2} = \sqrt{625\,\eta^4 - 49.9999\,\eta^2 + 1}$, which, in the particular cases is (a) .0975, (b) .002, (c) .1025.

9.6.5. .14142 or .24495.


9.6.6. $5\sqrt{2 - \sqrt3} = 2.58819$.

9.6.7. The solution to $.5\,\ddot u + \dot u + 2.5\,u = 2\cos 2t$, $u(0) = 2$, $\dot u(0) = 0$, is
$$u(t) = \tfrac{4}{17}\cos 2t + \tfrac{16}{17}\sin 2t + e^{-t}\bigl(\tfrac{30}{17}\cos 2t - \tfrac{1}{17}\sin 2t\bigr) = .9701\cos(2t - 1.3258) + 1.7657\,e^{-t}\cos(2t + .0333).$$
The solution consists of a persistent periodic vibration at the forcing frequency of 2, with a phase lag of $\tan^{-1} 4 = 1.32582$ and amplitude $4/\sqrt{17} = .97014$, combined with a transient vibration at the same frequency with exponentially decreasing amplitude.
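The amplitude and phase lag quoted in 9.6.7 can be reproduced from the undetermined-coefficient formulas of 9.6.10. The sketch below assumes the forced equation has the form $m\,\ddot u + \beta\,\dot u + k\,u = \alpha\cos\eta t$; the function name `steady_state` and its argument names are illustrative choices, not notation from the manual.

```python
import math

def steady_state(m, beta, k, alpha, eta):
    """Persistent response A cos(eta t) + B sin(eta t) = a cos(eta t - eps)
    of m u'' + beta u' + k u = alpha cos(eta t)."""
    omega_sq = k / m                      # natural frequency squared
    p = m * (omega_sq - eta ** 2)         # coefficient m (omega^2 - eta^2)
    q = beta * eta                        # coefficient beta * eta
    denom = p * p + q * q
    A = alpha * p / denom
    B = alpha * q / denom
    amplitude = alpha / math.sqrt(denom)
    phase_lag = math.atan2(B, A)
    return A, B, amplitude, phase_lag

A, B, amplitude, phase_lag = steady_state(0.5, 1.0, 2.5, 2.0, 2.0)
print(A, B, amplitude, phase_lag)
```

For the data of 9.6.7 this should return values close to $A = 4/17$, $B = 16/17$, amplitude $4/\sqrt{17}$ and phase lag $\tan^{-1}4$.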


9.6.8.
(a) Yes, the same fast oscillations and beats can be observed graphically. For example, the
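graph of $\cos t - .5\cos 1.1\,t$ on the interval $0 \le t \le 300$ is:
[Graph omitted.]

To prove this observation, we invoke the trigonometric identity
$$\begin{aligned} a\cos\eta t - b\cos\omega t &= (a-b)\cos\Bigl(\frac{\eta+\omega}{2}\,t\Bigr)\cos\Bigl(\frac{\omega-\eta}{2}\,t\Bigr) + (a+b)\sin\Bigl(\frac{\eta+\omega}{2}\,t\Bigr)\sin\Bigl(\frac{\omega-\eta}{2}\,t\Bigr)\\ &= R(t)\cos\Bigl(\frac{\eta+\omega}{2}\,t - \theta(t)\Bigr),\end{aligned}$$
where $R(t), \theta(t)$ are the polar coordinates of the point
$$\Bigl(\,(a-b)\cos\Bigl(\frac{\omega-\eta}{2}\,t\Bigr),\ (a+b)\sin\Bigl(\frac{\omega-\eta}{2}\,t\Bigr)\Bigr)^T = \bigl(\,R(t)\cos\theta(t),\ R(t)\sin\theta(t)\,\bigr)^T,$$
and represent, respectively, the envelope or slowly varying amplitude of the oscillations, i.e., the beats, and a periodically varying phase shift.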
(b) Beats are still observed, but the larger $|a - b|$ is, as prescribed by the initial conditions, the less pronounced the variation in the beat envelope. Also, when $a \ne b$, the fast oscillations are no longer precisely periodic, but exhibit a slowly varying phase shift over the period of the beat envelope.
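9.6.9. (a) $u(t) = \dfrac{\alpha\,(\cos\omega t - \cos\eta t)}{m(\eta^2 - \omega^2)}$, (b) $u(t) = \dfrac{\alpha\,t}{2m\omega}\sin\omega t$.
(c) Use l'Hôpital's rule, differentiating with respect to $\eta$ to compute
$$\lim_{\eta\to\omega}\frac{\alpha\,(\cos\omega t - \cos\eta t)}{m(\eta^2 - \omega^2)} = \lim_{\eta\to\omega}\frac{\alpha\,t\sin\eta t}{2m\eta} = \frac{\alpha\,t}{2m\omega}\sin\omega t.$$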

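9.6.10. Using the method of undetermined coefficients, we set
$$u_\star(t) = A\cos\eta t + B\sin\eta t.$$
Substituting into the differential equation (9.101), and then equating coefficients of $\cos\eta t$, $\sin\eta t$, we find
$$m(\omega^2 - \eta^2)\,A + \beta\eta\,B = \alpha,\qquad -\beta\eta\,A + m(\omega^2 - \eta^2)\,B = 0,$$
where we replaced $k = m\,\omega^2$. Thus,
$$A = \frac{\alpha\,m(\omega^2 - \eta^2)}{m^2(\omega^2 - \eta^2)^2 + \beta^2\eta^2},\qquad B = \frac{\alpha\,\beta\,\eta}{m^2(\omega^2 - \eta^2)^2 + \beta^2\eta^2}.$$
We then put the resulting solution in phase-amplitude form
$$u_\star(t) = a\cos(\eta t - \varepsilon),$$
where, according to (2.7), $A = a\cos\varepsilon$, $B = a\sin\varepsilon$, which implies (9.102–103).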

9.6.11.
(a) Underdamped, (b) overdamped, (c) critically damped, (d) underdamped, (e) underdamped.
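9.6.12. (a) $u(t) = e^{-t/4}\cos\frac14 t + e^{-t/4}\sin\frac14 t$,
(b) $u(t) = \frac32\,e^{-t/3} - \frac12\,e^{-t}$,
(c) $u(t) = e^{-t/3} + \frac13\,t\,e^{-t/3}$,
(d) $u(t) = e^{-t/5}\cos\frac{1}{10}t + 2\,e^{-t/5}\sin\frac{1}{10}t$,
(e) $u(t) = e^{-t/2}\cos\frac{1}{2\sqrt3}\,t + \sqrt3\,e^{-t/2}\sin\frac{1}{2\sqrt3}\,t$.

9.6.13.
$$\begin{aligned} u(t) &= \tfrac{165}{41}\,e^{-t/4}\cos\tfrac14 t - \tfrac{91}{41}\,e^{-t/4}\sin\tfrac14 t - \tfrac{124}{41}\cos 2t + \tfrac{32}{41}\sin 2t\\ &= 4.0244\,e^{-.25t}\cos .25t - 2.2195\,e^{-.25t}\sin .25t - 3.0244\cos 2t + .7805\sin 2t.\end{aligned}$$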

9.6.14. The natural vibrational frequency is $\omega = 1/\sqrt{R\,C}$. If $\eta \ne \omega$ then the circuit experiences a quasi-periodic vibration as a combination of the two frequencies. As $\eta$ gets close to $\omega$, the current amplitude becomes larger and larger, exhibiting beats. When $\eta = \omega$, the circuit is in resonance, and the current amplitude grows without bound.
9.6.15. (a) .02, (b) 2.8126, (c) 26.25.
9.6.16. .03577 or .04382.
9.6.17. R .10051.

9.6.18.
(a) u(t) = cos t
(b) u(t) = sin 3 t
(c) u(t) =
(d) u(t) =
(e) u(t) =

(f ) u(t) =

(g) u(t) =

1
2

!
!
3 !

2
1
14
+
r
cos(2
2
t

)
3
t

)
+
r
cos(
,
1
1
2
2
1
1
2
7
!
!
q
1

5 +r cos(4 5 t )
2 +r cos( 4 + 5 t )
1
2
2
1
2
1

t sin 2 t + 31 cos 2 t
3
4 t sin 2 t

+ r1 cos

17 t 1

3
2

+ r2 cos(2 t 2 )

!
5 ,

2
,
3

1
!
!
2
q
3
1
1
5
1 @ 17 A
+ r2 cos( t 2 )
,
+ r1 cos( 3 t 1 )
cos 2 t
2
1
2
12
17
0
1
1
0
!
!
1
1
q
2
5
8
1
3
6
@
A
A
@

cos t
+ sin 2 t
+ r1 cos( 5 t 1 )
t 2 )
,
+ r2 cos(
3
3
2
13
23
1
0
0
1
0 1
0
1
6
1
1
1
B 11 C

B
C
B C
B
C
B 5 C
B
C
B
B
C
C
B
cos t@ 11 A+r1 cos( 12 t 1 )@ 1 A+r2 cos(3 t 2 )@ 1 A+r3 cos( 2 t 3 )@ 1 C
A,
1
2
0
1
11
0 1
0
1
0
1
0 1
1
0
3
3
B8C

B
C
B
C
B C
C
B
C
C
B C
B 3 C+r1 cos( 3 t 1 )B
cos tB
@ 2 A+r2 cos( 2 t 2 )@ 2 A+r3 cos(t 3 )@ 1 A.
@8A

9.6.19.
(a) The resonant frequencies are $\sqrt{\frac{3-\sqrt3}{2}} = .796225$, $\sqrt{\frac{3+\sqrt3}{2}} = 1.53819$.
(b) For example, a forcing function of the form $\cos\Bigl(\sqrt{\frac{3+\sqrt3}{2}}\;t\Bigr)\,w$, where $w = (w_1, w_2)^T$ is not orthogonal to the eigenvector $(-1-\sqrt3,\ 1)^T$, so $w_2 \ne (1+\sqrt3)\,w_1$, will excite resonance.

9.6.20. When the bottom support is removed, the resonant frequencies are $\frac{\sqrt{5-\sqrt{17}}}{2} = .468213$, $\frac{\sqrt{5+\sqrt{17}}}{2} = 1.51022$. When the top support is removed, the resonant frequencies are $\sqrt{\frac{2-\sqrt2}{2}} = .541196$, $\sqrt{\frac{2+\sqrt2}{2}} = 1.30656$. In both cases the vibrations are slower. The previous forcing function will not excite resonance.

9.6.21. In each case, you need to force the system by $\cos(\omega t)\,a$ where $\omega^2 = \lambda$ is an eigenvalue and $a$ is orthogonal to the corresponding eigenvector. In order not to excite an instability, $a$ needs to also be orthogonal to the kernel of the stiffness matrix spanned by the unstable mode vectors.
(a) Resonant frequencies: $\omega_1 = .5412$, $\omega_2 = 1.1371$, $\omega_3 = 1.3066$, $\omega_4 = 1.6453$;
1
0
1
0
1
0
1
0
.6533
.2706
.2706
.6533
C
B
C
B
C
B
B
.2706 C
B .2706 C
B .6533 C
B .6533 C
C
C;
C, v 4 = B
C, v 3 = B
C, v 2 = B
B
eigenvectors: v1 = B
@ .6533 A
@ .2706 A
@ .2706 A
@ .6533 A
.2706
.6533
.6533
.2706
no unstable modes.
(b) Resonant frequencies: $\omega_1 = .4209$, $\omega_2 = 1$ (double), $\omega_3 = 1.2783$, $\omega_4 = 1.6801$, $\omega_5 = 1.8347$; eigenvectors:
0
0 1
0
0
0
0
.6626 1
0
01
.5000 1
.2470 1
.5000 1
B .1426 C
B1C
B 1 C
B .2887 C
B .3825 C
B .2887 C
B
B C
B
C
C
B
B
C
B
C
C
B
B C
B
C
B
B
C
B
C
C
.6626 C
B0C
B 0C b
C
B .5000 C
B .2470 C
B .5000 C
B
C, v 2 = B C , v 3 = B
C, v 2 = B
C, v4 = B
C, v5 = B
C;
v1 = B
B .1426 C
B1C
B 1C
B .2887 C
B .3825 C
B .2887 C
B
B C
B
C
B
C
B
C
B
C
C
@ .2852 A
@0A
@ 1A
@
@ .7651 A
@
0 A
0 A
0
0
1
.5774
0
.5774
no unstable modes.
(c) Resonant frequencies: $\omega_1 = .3542$, $\omega_2 = .9727$, $\omega_3 = 1.0279$, $\omega_4 = 1.6894$, $\omega_5 = 1.7372$;
eigenvectors:
0
0
0
0
0
.6889 1
.3914 1
.1251 1
.1160 1
.0989 1
B .1158 C
B .2009 C
B .6940 C
B .6780 C
B .0706 C
C
B
C
B
C
B
C
B
C
B
B
C
B
C
B
C
B
C
B
0 C

.7829
0
.2319
0
C
B
C
B
C
B
C
B
C
B
C;
C, v5 = B
C, v 4 = B
C, v 3 = B
C, v 2 = B
v1 = B
B .1549 C
B
B .0744 C
B
B .9851 C
0 C
0 C
C
B
C
B
C
B
C
B
C
B
@ .6889 A
@ .3914 A
@ .1251 A
@ .1160 A
@ .0989 A
.1158
.2009
.6940
.6780
.0706
unstable mode: $z = (1, 0, 1, 0, 1, 0)^T$. To avoid exciting the unstable mode, the initial velocity must be orthogonal to the null eigenvector: $z\cdot\dot u(t_0) = 0$, i.e., there is no net horizontal velocity of the masses.

(d) Resonant frequencies: $\omega_1 = \sqrt{\frac32} = 1.22474$ (double), $\omega_2 = \sqrt3 = 1.73205$;


0

eigenvectors:

v1 =

1
B 2

B
B 3
2
B
B 1
B
B 2
B
3
B
B 2
B
B
@ 1

C
C
C
C
C
C
C,
C
C
C
C
C
A

1
B0C
B C
B C
1C
B C,
unstable modes: z1 = B
B0C
B C
@1A
0

1
23 C
B
C
B
B 1 C
B 2 C
B
3 C
C
B
B 2 C,
C
B
B 1 C
C
B
2
C
B
C
B
@ 0 A
0

b =
v
1

0
B1C
B C
B C
0C
B C,
z2 = B
B1C
B C
@0A
1

z3 =

0
C
B
B 2 C
B C
C
B
B 3C
C;
v2 = B
C
B
B 1 C
B C
C
B
@
3 A
1

B
B
B
B
B
B
B
B
B
B
@

1
3
2 C
1 C
C
2 C
C
0 C
C.
1 C
C
C
0 A

0
To avoid exciting the unstable modes, the initial velocity must be orthogonal to the null eigenvectors: $z_1\cdot\dot u(t_0) = z_2\cdot\dot u(t_0) = z_3\cdot\dot u(t_0) = 0$, i.e., there is no net linear or angular velocity of the three masses.

(e) Resonant frequencies: $\omega_1 = 1$, $\omega_2 = \sqrt3 = 1.73205$;


1
1
1
B C
C
0C
eigenvectors: v1 =
v2 = B
@ 2 A; unstable mode: z = @ 1 A.
A,
1
1
1
To avoid exciting the unstable mode, the initial velocity must be orthogonal to the null eigenvector: $z\cdot\dot u(t_0) = 0$, i.e., there is no net horizontal velocity of the atoms.

(f) Resonant frequencies: $\omega_1 = 1.0386$, $\omega_2 = 1.0229$;
0
0
.0555 1
.0327 1
B .0426 C
B .0426 C
B
C
B
C
B
C
B
C
B .7054 C
B .7061 C
C,
B
C;
eigenvectors: v1 = B
v
=
2
B
B
0 C
0 C
B
C
B
C
@ .1826 A
@ .1827 A
.6813
.6820
0
1
0
0 1
0 1
0 1
0 1
0
0
0
1
B 0 C
C
B
B0C
B1C
B0C
0
C
B
C
B
B C
B C
B C
C
B
C
B
B C
B C
B C
0
0
0
1
B 0 C
C
B
B C
B C
B C
C.
C=B
C, z 2 = B C, z 3 = B C, z 4 = B
unstable modes: z1 = B
B 0 C
C
B
B1C
B1C
B0C
0
C
B
C
B
B C
B C
B C

@ .9659 A
@ sin 105 A
@0A
@0A
@1A
.2588
cos 105
0
1
0
To avoid exciting the unstable modes, the initial velocity must be orthogonal to the null eigenvectors: $z_1\cdot\dot u(t_0) = z_2\cdot\dot u(t_0) = z_3\cdot\dot u(t_0) = z_4\cdot\dot u(t_0) = 0$, i.e., there is no net linear velocity of the three atoms, and neither hydrogen atom has an angular velocity component around the oxygen.

Solutions Chapter 10

10.1.1.
(a) $u^{(1)} = 2$, $u^{(10)} = 1024$, $u^{(20)} = 1048576$; unstable.
(b) $u^{(1)} = .9$, $u^{(10)} = .348678$, $u^{(20)} = .121577$; asymptotically stable.
(c) $u^{(1)} = i$, $u^{(10)} = -1$, $u^{(20)} = 1$; stable.
(d) $u^{(1)} = 1 - 2i$, $u^{(10)} = 237 + 3116\,i$, $u^{(20)} = -9653287 + 1476984\,i$; unstable.
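The iterates listed in 10.1.1 come from repeatedly multiplying by a fixed scalar, and the stability verdict depends only on its modulus. A minimal sketch, assuming the initial value is $u^{(0)} = 1$ (an assumption, since the exercise statement is not reproduced here); the function names are illustrative.

```python
def scalar_iterate(lam, u0, k):
    """Return u^(k) for the scalar iteration u^(j+1) = lam * u^(j)."""
    u = u0
    for _ in range(k):
        u = lam * u
    return u

def stability(lam, tol=1e-12):
    """Classify the zero solution of u^(k+1) = lam * u^(k) by |lam|."""
    r = abs(lam)
    if r < 1 - tol:
        return "asymptotically stable"
    if r <= 1 + tol:
        return "stable"
    return "unstable"

for lam in (2, 0.9, 1j, 1 - 2j):
    print(lam, scalar_iterate(lam, 1, 10), scalar_iterate(lam, 1, 20), stability(lam))
```

Python's built-in complex type handles cases (c) and (d) directly, so no separate treatment of real and imaginary parts is needed.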

10.1.2.
(a) $u^{(k+1)} = 1.0325\,u^{(k)}$, $u^{(0)} = 100$, where $u^{(k)}$ represents the balance after $k$ years.
(b) $u^{(10)} = 1.0325^{10}\times 100 = 137.69$ dollars.
(c) $u^{(k+1)} = (1 + .0325/12)\,u^{(k)} = 1.002708\,u^{(k)}$, $u^{(0)} = 100$, where $u^{(k)}$ represents the balance after $k$ months. $u^{(120)} = (1 + .0325/12)^{120}\times 100 = 138.34$ dollars.
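The two balances in 10.1.2 can be checked by running the iteration directly. This is a small sketch; `balance_after` is an illustrative name, and the rate and principal are the ones stated in the solution.

```python
def balance_after(rate_per_period, principal, periods):
    """Iterate u^(k+1) = (1 + rate_per_period) * u^(k) starting from principal."""
    u = principal
    for _ in range(periods):
        u = (1.0 + rate_per_period) * u
    return u

yearly = balance_after(0.0325, 100.0, 10)          # compounded once a year
monthly = balance_after(0.0325 / 12, 100.0, 120)   # compounded monthly
print(round(yearly, 2), round(monthly, 2))
```

Monthly compounding gives the slightly larger balance, consistent with the effective annual rate discussed in 10.1.3.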
10.1.3. If $r$ is the yearly interest rate, then $u^{(k+1)} = (1 + r/12)\,u^{(k)}$, where $u^{(k)}$ represents the balance after $k$ months. Let $v^{(m)}$ denote the balance after $m$ years, so $v^{(m)} = u^{(12m)}$. Thus $v^{(m)}$ satisfies the iterative system $v^{(m+1)} = (1 + r/12)^{12}\,v^{(m)} = (1 + s)\,v^{(m)}$, where $s = (1 + r/12)^{12} - 1$ is the effective annual interest rate.
10.1.4. The balance after $k$ years coming from compounding $n$ times per year is
$$\Bigl(1 + \frac{r}{n}\Bigr)^{nk} a \;\longrightarrow\; e^{rk}\,a \quad\text{as}\quad n\to\infty,$$
by a standard calculus limit, [2, 58].


10.1.5. Since $u(t) = a\,e^{\alpha t}$ we have $u^{(k+1)} = u((k+1)h) = a\,e^{\alpha(k+1)h} = e^{\alpha h}\bigl(a\,e^{\alpha k h}\bigr) = e^{\alpha h}\,u^{(k)}$, and so $\lambda = e^{\alpha h}$. The stability properties are the same: $|\lambda| < 1$ for asymptotic stability; $|\lambda| \le 1$ for stability, $|\lambda| > 1$ for an unstable system.
10.1.6. The solution $u^{(k)} = \lambda^k u^{(0)}$ is periodic of period $m$ if and only if $\lambda^m = 1$, and hence $\lambda$ is an $m$th root of unity. Thus, $\lambda = e^{2\pi i k/m}$ for some $k = 0, 1, 2, \dots, m-1$. If $k$ and $m$ have a common factor, then the solution is of smaller period, and so the solutions of period exactly $m$ are when $k$ is relatively prime to $m$ and $\lambda$ is a primitive $m$th root of unity, as defined in Exercise 5.7.7.
10.1.7. Let $\lambda = e^{i\theta}$ where $0 \le \theta < 2\pi$. The solution is then $u^{(k)} = a\,\lambda^k = a\,e^{ik\theta}$. If $\theta$ is a rational multiple of $\pi$, the solution is periodic, as in Exercise 10.1.6. When $\theta/\pi$ is irrational, the iterates eventually fill up (i.e., are dense in) the circle of radius $|a|$ in the complex plane.
10.1.8. $|u^{(k)}| = |\lambda|^k\,|a| > |v^{(k)}| = |\mu|^k\,|b|$ provided $k > \dfrac{\log|b| - \log|a|}{\log|\lambda| - \log|\mu|}$, where the inequality relies on the fact that $\log|\lambda| > \log|\mu|$.

10.1.9. The equilibrium solution is $u_\star = c/(1 - \lambda)$. Then $v^{(k)} = u^{(k)} - u_\star$ satisfies the homogeneous system $v^{(k+1)} = \lambda\,v^{(k)}$, and so $v^{(k)} = \lambda^k v^{(0)} = \lambda^k(a - u_\star)$. Thus, the solution to (10.5) is $u^{(k)} = \lambda^k(a - u_\star) + u_\star$. If $|\lambda| < 1$, then the equilibrium is asymptotically stable, with $u^{(k)} \to u_\star$ as $k \to \infty$; if $|\lambda| = 1$, it is stable, and solutions that start near $u_\star$ stay nearby; if $|\lambda| > 1$, it is unstable, and all non-equilibrium solutions become unbounded: $|u^{(k)}| \to \infty$.
10.1.10. Let $u^{(k)}$ represent the balance after $k$ years. Then $u^{(k+1)} = 1.05\,u^{(k)} + 120$, with $u^{(0)} = 0$. The equilibrium solution is $u_\star = -120/.05 = -2400$, and so after $k$ years the balance is $u^{(k)} = (1.05^k - 1)\cdot 2400$. Then
$u^{(10)} = \$1{,}509.35$, $u^{(50)} = \$25{,}121.76$, $u^{(200)} = \$4{,}149{,}979.40$.
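The closed form $u^{(k)} = \lambda^k(a - u_\star) + u_\star$ from 10.1.9 can be compared with direct iteration for the savings data of 10.1.10. A minimal sketch, with illustrative function names, assuming the affine iteration $u^{(k+1)} = \lambda\,u^{(k)} + c$ with $\lambda \ne 1$:

```python
def affine_iterate(lam, c, a, steps):
    """Directly iterate u^(k+1) = lam * u^(k) + c from u^(0) = a."""
    u = a
    for _ in range(steps):
        u = lam * u + c
    return u

def affine_closed_form(lam, c, a, k):
    """Closed form lam^k (a - u_star) + u_star with u_star = c / (1 - lam)."""
    u_star = c / (1.0 - lam)   # equilibrium; requires lam != 1
    return lam ** k * (a - u_star) + u_star

for k in (10, 50, 200):
    print(k, round(affine_iterate(1.05, 120.0, 0.0, k), 2),
          round(affine_closed_form(1.05, 120.0, 0.0, k), 2))
```

The two columns should agree up to rounding error, and the ten-year value should match the balance quoted above.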
10.1.11. If $u^{(k)}$ represents the balance after $k$ months, then $u^{(k+1)} = (1 + .05/12)\,u^{(k)} + 10$, $u^{(0)} = 0$. The balance after $k$ months is $u^{(k)} = (1.0041667^k - 1)\cdot 2400$. Then
$u^{(120)} = \$1{,}552.82$, $u^{(600)} = \$26{,}686.52$, $u^{(2400)} = \$5{,}177{,}417.44$.
10.1.12. The overall yearly growth rate is $1.2 - .05 = 1.15$, and so the deer population satisfies the iterative system $u^{(k+1)} = 1.15\,u^{(k)} - 3600$, $u^{(0)} = 20000$. The equilibrium is $u_\star = 3600/.15 = 24000$. Since $\lambda = 1.15 > 1$, the equilibrium is unstable; if the initial number of deer is less than the equilibrium, the population will decrease to zero, while if it is greater, then the population will increase without limit. Two possible options: ban hunting for 2 years until the deer population reaches equilibrium of 24,000 and then permit hunting at the current rate again. Or, to keep the population at 20,000, allow hunting of only 3,000 deer per year. In both cases, the instability of the equilibrium makes it unlikely that the population will maintain a stable number, so constant monitoring of the deer population is required. (More realistic models incorporate nonlinear terms, and are less prone to such instabilities.)

10.1.13.

3k + (1)k
3k + (1)k
, v (k) =
.
2
2
18
15
18
20
= k + k , v (k) = k + k .
3
2
3
2

( 5 + 2)(3 5)k + ( 5 2)(3 + 5)k


(3 5)k (3 + 5)k
(k)

=
, v
=
.
2 5
2 5
18
1
1
27
= 8 + k k , v (k) = 4 + k1 , w(k) = k .
2
3
3
3
= 1 2k , v (k) = 1 + 2 (1)k 2k+1 , w(k) = 4 (1)k 2k .

(a) u(k) =
(b) u(k)
(c) u(k)
(d) u(k)
(e) u(k)
10.1.14.
(a) u

(k)

(b) u

(k)

k 2
= c1 ( 1 2)
1
0

= c1

1
2

5
@2

(c) u(k)
(d) u(k)

3
2

!
k 2
+ c2 ( 1 + 2)
;
1 0
1

5 i 3

1
A+c
2
2 2

3
2

1
5+ i 3
A
2

cos 13 k + 23 sin 13 k A + a @ 25 sin 13 k 23 cos 31 k A;


= a1
2
cos 31 k0
1
sin 13 k
0 1
0 1
1
2
2
C
kB C
kB C
= c1 B
@ 2 A + c2 (2) @ 3 A + c3 ( 3) @ 3 A;
0
2
3
0 1
0 1
0 1
1
1
0

k
k
B C
B C
B C
1
1
= c1 21
1 A.
@ 1 A + c2 3
@ 2 A + c3
@
6
0
1
2

10.1.15. (a) It suffices to note that the Lucas numbers are the general Fibonacci numbers (10.16) when $a = L^{(0)} = 2$, $b = L^{(1)} = 1$. (b) 2, 1, 3, 4, 7, 11, 18. (c) Because the first two are integers and so, by induction, $L^{(k+2)} = L^{(k+1)} + L^{(k)}$ is an integer whenever $L^{(k+1)}, L^{(k)}$ are integers.
10.1.16. The second summand satisfies
$$\left|\,\frac{1}{\sqrt5}\left(\frac{1-\sqrt5}{2}\right)^{k}\right| < .448\times .62^{k} < .5 \quad\text{for all } k \ge 0.$$
Since the sum is an integer, adding the second summand to the first is the same as rounding the first off to the nearest integer.
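The rounding statement in 10.1.16 is easy to test numerically. The sketch below assumes the standard Fibonacci initial values $u^{(0)} = 0$, $u^{(1)} = 1$ and Binet's formula; the function names are illustrative, and the floating-point check is only reliable for moderate $k$.

```python
import math

def fib(k):
    """k-th Fibonacci number from the recurrence, with u^(0) = 0 and u^(1) = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def fib_rounded(k):
    """Round the first Binet summand ((1 + sqrt 5)/2)^k / sqrt 5 to the nearest integer."""
    sqrt5 = math.sqrt(5.0)
    return round(((1.0 + sqrt5) / 2.0) ** k / sqrt5)

print(all(fib(k) == fib_rounded(k) for k in range(40)))
```

For much larger $k$ the double-precision power loses integer accuracy, so exact integer arithmetic would be needed there.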

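10.1.17. $u^{(-k)} = (-1)^{k+1}\,u^{(k)}$. Indeed, since $\dfrac{1}{\frac{1+\sqrt5}{2}} = \dfrac{-1+\sqrt5}{2}$,
$$\begin{aligned} u^{(-k)} &= \frac{1}{\sqrt5}\left[\left(\frac{1+\sqrt5}{2}\right)^{-k} - \left(\frac{1-\sqrt5}{2}\right)^{-k}\right] = \frac{1}{\sqrt5}\left[\left(\frac{-1+\sqrt5}{2}\right)^{k} - \left(\frac{-1-\sqrt5}{2}\right)^{k}\right]\\ &= \frac{1}{\sqrt5}\left[(-1)^k\left(\frac{1-\sqrt5}{2}\right)^{k} - (-1)^k\left(\frac{1+\sqrt5}{2}\right)^{k}\right] = \frac{(-1)^{k+1}}{\sqrt5}\left[\left(\frac{1+\sqrt5}{2}\right)^{k} - \left(\frac{1-\sqrt5}{2}\right)^{k}\right] = (-1)^{k+1}\,u^{(k)}.\end{aligned}$$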
10.1.18.
(a)

5
2

(b)

4
2

(c)
0

(e)

B
@

1
1

B
B
B
@

1
1

=
=

=
i
1

2
1

2
5
@
15
!

1 CB
B
CB

B
1C
AB
@
1
!

!0
@

2
1

10
1

0
1
0

0 CB
B
0 C
AB
@
(1)k

k
1+ 5
2

0
0
!

25 6k
,
15

i (1 + i )
(1 i )k

1
1

i
2
i
2
1
3
1
6
12

0
A@
(1 i )k 0

10

1
1
5 A,
2
5

!0

1 C B 4k
B
0C
A@ 0
1
0

1
2
i
1

0
1

1k 3
5A 7
1

k+1 (k)
@
u .
5 = (1)
2
0

3k 0
0 2k
!0
i @ (1 + i )k
1
0

0
1C
A =
2

u(k)
v (k)

1k

1
2

1
2
1

3+ 5
2
1 5
2

1k

6
0

5A

2 k B1
1C
A =B
@1
1
1

u(k)
v (k)

i
1

1
0
0

1
2

3 5
2
1+ 5
2

10.1.19. (a)
(c)

!k

6@ 1 +
4

2
1

!k

1
2
1

0
0
1
0

!k

1
1

1
1

1
(d) B
@1
2
0

2
2

1 k

k
1+ 5
2

(b)
1

A,


u(k)
v (k)
0

0C

1
1
2 A,
1
2
1
3
13

1
3
1
6
1
2

CB
CB
CB
@
0C
A

C
C
C,
A

53 5
10
5+3 5
10

5 5
10
5+ 5
10

5+2 5
5
52 5
5

1
=

1
1
0

1
2

1
u(k)
B
B (k) C
(d) @ v A = B
1
@
w(k)
1

1
2
1

3k
,
2k+1
10

2 k
1 C B 3 4 C
B 1 C
C,
0C
AB
@ 3 A
1
0

C
C
C.
A

u(k)
B
B (k) C
B
(e) @ v A = B
@
w(k)

3 5
2
1+ 5
2

3+ 5
2
1 5
2

k

1+ 5
5+3 5
B
10
2
1 CB
CB
k

B
1+ 5
5+ 5
1C
AB
B
10
2
@
1

C
C
C
C.
C
C
A

10.1.20. (a) Since the coefficient matrix $T$ has all integer entries, its product $T u$ with any vector with integer entries also has integer entries;
(b) 1c1 = 2, 0
c2 = 3, 1c2 = 3; 0
0
1
1
0
1
4
26
76
164
304
C
C
C
C
(2)
(3)
(4)
(5)
(c) u(1) = B
=B
=B
=B
76 C
=B
@ 2 A, u
@ 10 A, u
@ 32 A, u
@
A, u
@ 152 A.
2
2
16
44
88

10.1.21. The vectors $\mathbf u^{(k)} = \bigl(u^{(k)}, u^{(k+1)}, \dots, u^{(k+j-1)}\bigr)^T \in \mathbb R^{\,j}$ satisfy $\mathbf u^{(k+1)} = T\,\mathbf u^{(k)}$, where
$$T = \begin{pmatrix} 0&1&0&0&\dots&0\\ 0&0&1&0&\dots&0\\ 0&0&0&1&\dots&0\\ \vdots&\vdots&\vdots&\vdots&\ddots&\vdots\\ 0&0&0&0&\dots&1\\ c_j&c_{j-1}&c_{j-2}&c_{j-3}&\dots&c_1\end{pmatrix}.$$
The initial conditions are $\mathbf u^{(0)} = \mathbf a = (a_0, a_1, \dots, a_{j-1})^T$, and so $u^{(0)} = a_0$, $u^{(1)} = a_1$, …, $u^{(j-1)} = a_{j-1}$.

k1

k1

(2)k , (b) u(k) = 31


+ 14
,
k

k
(5 3 5)(2 + 5) + (5 + 3 5)(2 5)
(c) u(k) =
,
10

k
k
k/2
(k)
1
1
cos 41 k + 2 sin 41 k ,
(d) u
= 2 i (1 + i ) + 2 + i (1 i ) = 2

10.1.22. (a) u(k) =

4
31
3

(e) u(k) = 12

1
2

(1)k + 2k ,

(f ) u(k) = 1 + 1 + ( 1)k 2k/21 .

10.1.23. (a) $u^{(k+3)} = u^{(k+2)} + u^{(k+1)} + u^{(k)}$,
(b) $u^{(4)} = 2$, $u^{(5)} = 4$, $u^{(6)} = 7$, $u^{(7)} = 13$,
(c) $u^{(k)} \approx .183\times 1.839^k + 2\,\mathrm{Re}\bigl[\,(-.0914018 + .340547\,i)\,(-.419643 + .606291\,i)^k\,\bigr]$
$= .183\times 1.839^k - .737353^k\,\bigl(.182804\cos 2.17623\,k + .681093\sin 2.17623\,k\bigr)$.

10.1.24.
(a) $u^{(k)} = u^{(k-1)} + u^{(k-2)} - u^{(k-8)}$.
(b) 0, 1, 1, 2, 3, 5, 8, 13, 21, 33, 53, 84, 134, 213, 339, 539, 857, 1363, 2167, . . . .
(c) $\mathbf u^{(k)} = \bigl(u^{(k)}, u^{(k+1)}, \dots, u^{(k+7)}\bigr)^T$ satisfies $\mathbf u^{(k+1)} = A\,\mathbf u^{(k)}$ where the $8\times 8$ coefficient matrix $A$ has 1's on the superdiagonal, last row $(-1, 0, 0, 0, 0, 0, 1, 1)$ and all other entries 0.
(d) The growth rate is given by largest eigenvalue in magnitude: $\lambda_1 = 1.59$, with $u^{(k)} \sim 1.59^k$. For more details, see [34].
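The sequence in 10.1.24(b) and the growth rate in (d) can be reproduced by iterating the recurrence from part (a). The sketch below takes the first eight listed values as the initial data, which is an assumption about the exercise's starting conditions; `delayed_sequence` is an illustrative name.

```python
def delayed_sequence(n_terms):
    """Terms of u^(k) = u^(k-1) + u^(k-2) - u^(k-8), seeded with the
    first eight values listed in part (b)."""
    u = [0, 1, 1, 2, 3, 5, 8, 13]
    while len(u) < n_terms:
        u.append(u[-1] + u[-2] - u[-8])
    return u[:n_terms]

terms = delayed_sequence(19)
print(terms)

# The ratio of successive terms approaches the dominant eigenvalue.
long_run = delayed_sequence(80)
print(long_run[-1] / long_run[-2])
```

Python integers are exact, so the printed terms can be compared digit for digit with the list in part (b).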
10.1.25.
$$u_i^{(k)} = \sum_{j=1}^{n} c_j\left(2\cos\frac{j\pi}{n+1}\right)^{k}\sin\frac{i\,j\,\pi}{n+1},\qquad i = 1,\dots,n.$$

10.1.26. The key observation is that the coefficient matrix $T$ is symmetric. Then, according to Exercise 8.5.19, the principal axes of the ellipse $E_1 = \{\,T x \mid \|x\| = 1\,\}$ are the orthogonal eigenvectors of $T$. Moreover, $T^k$ is also symmetric and has the same eigenvectors. Hence, all the ellipses $E_k$ have the same principal axes. The semi-axes are the absolute values of the eigenvalues, and hence $E_k$ has semi-axes $(.8)^k$ and $(.4)^k$.


10.1.27.
(a)

E1 : principal axes:
E2 : principal axes:
E3 : principal axes:
E4 : principal axes:

1
,
1
!
1
,
1
!
1
,
1
!
1
,
1

1
,
1
!
1
,
1
!
1
,
1
!
1
,
1

semi-axes: 1, 31 , area:

1
3

semi-axes: 1, 19 , area:

1
9

semi-axes: 1,

1
27 ,

area:

1
27

semi-axes: 1,

1
81 ,

area:

1
81

(b)
E1: principal axes: $(0, 1)^T$, $(1, 0)^T$, semi-axes: 1.2, .4, area: $.48\,\pi = 1.5080$.
E2: circle of radius .48, area: $.2304\,\pi = .7238$.
E3: principal axes: $(0, 1)^T$, $(1, 0)^T$, semi-axes: .576, .192, area: $.1106\,\pi = .3474$.
E4: circle of radius .2304, area: $.0531\,\pi = .1668$.

(c)

E1 : principal axes:
E2 : principal axes:
E3 : principal axes:
E4 : principal axes:

.6407
,
.7678
!
.6765
,
.7365
!
.6941
,
.7199
!
.7018
,
.7124

.7678
,
.6407
!
.7365
,
.6765
!
.7199
,
.6941
!
.7124
,
.7018

semi-axes: 1.0233, .3909, area: .4 = 1.2566.


semi-axes: 1.0394, .1539, area: .16 = .5027.
semi-axes: 1.0477, .0611, area: .064 = .2011.
semi-axes: 1.0515, .0243, area: .0256 = .0804.

10.1.28. (a) This follows from Exercise 8.5.19(a), using the fact that $K = T^n$ is also positive definite. (b) True: they are the eigenvectors of $T$. (c) True: $r_1, s_1$ are the eigenvalues of $T$. (d) True, since the area is $\pi$ times the product of the semi-axes, so $A_1 = \pi\,r_1 s_1$, so $\alpha = r_1 s_1 = |\det T|$. Then $A_n = \pi\,r_n s_n = \pi\,r_1^n s_1^n = \pi\,|\det T|^n = \pi\,\alpha^n$.

10.1.29. (a) This follows from Exercise 8.5.19(a) with $A = T^n$. (b) False; see Exercise 10.1.27(c) for a counterexample. (c) False: the singular values of $T^n$ are not, in general, the $n$th powers of the singular values of $T$. (d) True, since the product of the singular values is the absolute value of the determinant, and so $A_n = \pi\,|\det T|^n$.
10.1.30. $v^{(k)} = c_1(\alpha\lambda_1 + \beta)^k\,v_1 + \cdots + c_n(\alpha\lambda_n + \beta)^k\,v_n$.
10.1.31. If $u^{(k)} = x^{(k)} + i\,y^{(k)}$ is a complex solution, then the iterative equation becomes $x^{(k+1)} + i\,y^{(k+1)} = T x^{(k)} + i\,T y^{(k)}$. Separating the real and imaginary parts of this complex vector equation and using the fact that $T$ is real, we deduce $x^{(k+1)} = T x^{(k)}$, $y^{(k+1)} = T y^{(k)}$. Therefore, $x^{(k)}, y^{(k)}$ are real solutions to the iterative system.
10.1.32. The formula uniquely specifies $u^{(k+1)}$ once $u^{(k)}$ is known. Thus, by induction, once the initial value $u^{(0)}$ is fixed, there is only one possible solution $u^{(k)}$ for $k = 0, 1, 2, \dots$. Existence and uniqueness also hold for $k < 0$ when $T$ is nonsingular, since $u^{(-k-1)} = T^{-1}u^{(-k)}$. If $T$ is singular, the solution will not exist for $k < 0$ if any $u^{(-k)} \notin \operatorname{rng} T$, or, if it exists, is not unique since we can add any element of $\ker T$ to $u^{(-k)}$ without affecting $u^{(-k+1)}, u^{(-k+2)}, \dots$.
10.1.33. According to Theorem 8.20, the eigenvectors of $T$ are real and form an orthogonal basis of $\mathbb R^n$ with respect to the Euclidean norm. The formula for the coefficients $c_j$ thus follows directly from (5.8).
10.1.34. Since matrix multiplication acts column-wise, cf. (1.11), the $j$th column of the matrix equation $T^{k+1} = T\,T^k$ is $c_j^{(k+1)} = T\,c_j^{(k)}$. Moreover, $T^0 = I$ has $j$th column $c_j^{(0)} = e_j$.
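10.1.35. Separating the equation into its real and imaginary parts, we find
$$\begin{pmatrix} x^{(k+1)}\\ y^{(k+1)}\end{pmatrix} = \begin{pmatrix}\mu & -\nu\\ \nu & \mu\end{pmatrix}\begin{pmatrix} x^{(k)}\\ y^{(k)}\end{pmatrix}.$$
The eigenvalues of the coefficient matrix are $\mu \pm i\,\nu$, with eigenvectors $(1, \mp i)^T$, and so the solution is
$$\begin{pmatrix} x^{(k)}\\ y^{(k)}\end{pmatrix} = \frac{x^{(0)} + i\,y^{(0)}}{2}\,(\mu + i\,\nu)^k\begin{pmatrix}1\\ -i\end{pmatrix} + \frac{x^{(0)} - i\,y^{(0)}}{2}\,(\mu - i\,\nu)^k\begin{pmatrix}1\\ i\end{pmatrix}.$$
Therefore $z^{(k)} = x^{(k)} + i\,y^{(k)} = (x^{(0)} + i\,y^{(0)})\,(\mu + i\,\nu)^k = \lambda^k z^{(0)}$.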

10.1.36.
(a) Proof by induction:
$$\begin{aligned} T^{k+1} w_i &= T\left(\lambda^k w_i + k\,\lambda^{k-1} w_{i-1} + \binom{k}{2}\lambda^{k-2} w_{i-2} + \cdots\right)\\ &= \lambda^k\,T w_i + k\,\lambda^{k-1}\,T w_{i-1} + \binom{k}{2}\lambda^{k-2}\,T w_{i-2} + \cdots\\ &= \lambda^k(\lambda w_i + w_{i-1}) + k\,\lambda^{k-1}(\lambda w_{i-1} + w_{i-2}) + \binom{k}{2}\lambda^{k-2}(\lambda w_{i-2} + w_{i-3}) + \cdots\\ &= \lambda^{k+1} w_i + (k+1)\,\lambda^{k} w_{i-1} + \binom{k+1}{2}\lambda^{k-1} w_{i-2} + \cdots.\end{aligned}$$
(b) Each Jordan chain of length $j$ is used to construct $j$ linearly independent solutions by formula (10.23). Thus, for an $n$-dimensional system, the Jordan basis produces the required number of linearly independent (complex) solutions, and the general solution is obtained by taking linear combinations. Real solutions of a real iterative system are obtained by using the real and imaginary parts of the Jordan chain solutions corresponding to the complex conjugate pairs of eigenvalues.
10.1.37.

(a) u(k) = 2k c1 +

(b) u(k) = 3k c1 +

1
2

k c2 , v (k) =
1
3

1
2

c2 , v

(e)
(f )

=3

2 c1 +

2
3

k c2 ;

1
k(k 1) c3 , v (k) = (1)k c2 (k + 1) c3 , w(k) = (1)k c3 ;
2

1
k (k 1) + 1 c3 , v (k) = 3k c2 + 31 k c3 ,
u(k) = 3k c1 + 31 k c2 + 18

1
k (k 1) c3 ;
w(k) = 3k c1 + 13 k c2 + 18
u(0) = c2 , v (0) = c1 + c3 , w(0) = c1 + c2 , while, for k > 0,

u(k) = 2k c2 + 21 k c3 , v (k) = 2k c3 , w(k) = 2k c2 + 21 k c3 ;


u(k) = i k+1 c1 k i k c2 ( i )k+1 c3 k( i )k c4 , w(k) = i k+1 c2 ( i )k+1 c4 ,
v (k) = i k c1 + k i k1 c2 + ( i )k c3 + k( i )k1 c4 ,
z (k) = i k c2 + ( i )k c4 .

(c) u(k) = (1)k c1 k c2 +

(d)

1 k
3 2 c2 ;
(k)
k

k
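10.1.38.
$$J_{\lambda,n}^{\,k} = \begin{pmatrix} \lambda^k & k\,\lambda^{k-1} & \binom{k}{2}\lambda^{k-2} & \binom{k}{3}\lambda^{k-3} & \dots & \binom{k}{n-1}\lambda^{k-n+1}\\ 0 & \lambda^k & k\,\lambda^{k-1} & \binom{k}{2}\lambda^{k-2} & \dots & \binom{k}{n-2}\lambda^{k-n+2}\\ 0 & 0 & \lambda^k & k\,\lambda^{k-1} & \dots & \binom{k}{n-3}\lambda^{k-n+3}\\ 0 & 0 & 0 & \lambda^k & \dots & \binom{k}{n-4}\lambda^{k-n+4}\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & 0 & \dots & \lambda^k \end{pmatrix}.$$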
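The closed form for the powers of a Jordan block in 10.1.38 can be checked against repeated matrix multiplication. This is a sketch using plain nested lists and integer entries so that the comparison is exact; all function names are illustrative. Entries with $j - i > k$ are set to zero, matching the convention $\binom{k}{m} = 0$ for $m > k$.

```python
from math import comb

def jordan_block(lam, n):
    """n x n Jordan block: lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(n)]
            for i in range(n)]

def mat_mul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, k):
    """A^k by repeated multiplication, with A^0 the identity."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, A)
    return result

def jordan_power_formula(lam, n, k):
    """Entry (i, j) is C(k, j - i) * lam^(k - (j - i)) when 0 <= j - i <= k, else 0."""
    return [[comb(k, j - i) * lam ** (k - (j - i)) if 0 <= j - i <= k else 0
             for j in range(n)] for i in range(n)]

print(mat_pow(jordan_block(2, 4), 5) == jordan_power_formula(2, 4, 5))
```

Using integer eigenvalues avoids floating-point round-off, so the two matrices can be compared with `==`.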

10.1.39. (a) Yes, if $T$ is nonsingular. Indeed, in this case, the solution formula $u^{(k)} = T^{\,k-k_0}\,u^{(k_0)}$ is valid even when $k < k_0$. But if $T$ is singular, then one can only assert that $u^{(k)} = \tilde u^{(k)}$ for $k \ge k_0$. (b) If $T$ is nonsingular, then $u^{(k)} = \tilde u^{(k-k_0+k_1)}$ for all $k$; if $T$ is singular, then this only holds when $k \ge \max\{k_0, k_1\}$.

10.1.40.
(a) The system has an equilibrium solution if and only if $(I - T)\,u_\star = b$. In particular, if 1 is not an eigenvalue of $T$, every $b$ leads to an equilibrium solution.
(b) Since $v^{(k+1)} = T\,v^{(k)}$, the general solution is
$$u^{(k)} = u_\star + c_1\lambda_1^k\,v_1 + c_2\lambda_2^k\,v_2 + \cdots + c_n\lambda_n^k\,v_n,$$
where $v_1,\dots,v_n$ are the linearly independent eigenvectors and $\lambda_1,\dots,\lambda_n$ the corresponding eigenvalues of $T$.
0

1
2
3 A 5k

1
3
(3)k @ 3 A;
(c) (i) u
=@
1
1
1
k

!
!
(1 2)
(1 + 2)k
1

2
(k)

(ii) u
=
1
1
2 2
2 2
(k)

(iii) u(k)

1
1
B C
3C
=B
@ 2 A 3@ 2 A +
1
0

!
2 ;
1
0

2
2
kB C
kB C
15
(2)
3

5
(3)
3 A;
@
A
@
2
2
3

(iv ) u(k) =

1
0 1
0 1
0 1
1
B6C

k B 1 C

k B 1 C
k B 0 C
B5C
7
1
1
B1C 7 1
B2C + 7
B 1 C.
B C + 2 2
@ A
@ A
@2A
2
3
3
6
@3A
3
0
1
1
2

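(d) In general, using induction, the solution is
$$u^{(k)} = T^k c + (I + T + T^2 + \cdots + T^{k-1})\,b.$$
If we write $b = b_1 v_1 + \cdots + b_n v_n$, $c = c_1 v_1 + \cdots + c_n v_n$, in terms of the eigenvectors, then
$$u^{(k)} = \sum_{j=1}^{n}\Bigl[\,\lambda_j^k\,c_j + \bigl(1 + \lambda_j + \lambda_j^2 + \cdots + \lambda_j^{k-1}\bigr)\,b_j\,\Bigr]\,v_j.$$
If $\lambda_j \ne 1$, one can use the geometric sum formula $1 + \lambda_j + \lambda_j^2 + \cdots + \lambda_j^{k-1} = \dfrac{1 - \lambda_j^k}{1 - \lambda_j}$, while if $\lambda_j = 1$, then $1 + \lambda_j + \lambda_j^2 + \cdots + \lambda_j^{k-1} = k$. Incidentally, when it exists the equilibrium solution is $u_\star = \displaystyle\sum_{\lambda_j \ne 1}\frac{b_j}{1 - \lambda_j}\,v_j$.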
10.1.41.
(a) The sequence is 3, 7, 0, 7, 7, 4, 1, 5, 6, 1, 7, 8, 5, 3, 8, 1, 9, 0, 9, 9, 8, 7, 5, 2, 7, 9, 6, 5, 1, 6, 7, 3, 0, 3, 3, 6, 9, 5, 4, 9, 3, 2, 5, 7, 2, 9, 1, 0, 1, 1, 2, 3, 5, 8, 3, 1, 4, 5, 9, 4, 3, 7, 0, and repeats when $u^{(60)} = u^{(0)} = 3$, $u^{(61)} = u^{(1)} = 7$.
(b) When $n = 10$, other choices for the initial values $u^{(0)}, u^{(1)}$ lead to sequences that also start repeating at $u^{(60)}$; if $u^{(0)}, u^{(1)}$ occur as successive integers in the preceding sequence, then one obtains a shifted version of it; otherwise one ends up with a disjoint pseudo-random sequence. Other values of $n$ lead to longer or shorter sequences; e.g., $n = 9$ repeats at $u^{(24)}$, while $n = 11$ already repeats at $u^{(10)}$.
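The repetition lengths quoted in 10.1.41 can be found by iterating the pair $(u^{(k)}, u^{(k+1)})$ modulo $n$ until the starting pair recurs. A minimal sketch, assuming the underlying recurrence is $u^{(k+2)} = u^{(k+1)} + u^{(k)} \bmod n$ as the listed digits suggest; `fibonacci_mod_period` is an illustrative name.

```python
def fibonacci_mod_period(n, u0, u1):
    """Smallest k > 0 with (u^(k), u^(k+1)) == (u^(0), u^(1)) for the
    iteration u^(k+2) = (u^(k+1) + u^(k)) mod n."""
    start = (u0 % n, u1 % n)
    a, b = start
    k = 0
    while True:
        a, b = b, (a + b) % n
        k += 1
        if (a, b) == start:
            return k

print(fibonacci_mod_period(10, 3, 7))
print(fibonacci_mod_period(9, 0, 1))
print(fibonacci_mod_period(11, 0, 1))
```

The loop always terminates because the map on pairs is invertible modulo $n$, so every starting pair lies on a cycle.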

10.2.1.

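(a) Eigenvalues: $\frac{5+\sqrt{33}}{2} \approx 5.3723$, $\frac{5-\sqrt{33}}{2} \approx -.3723$; spectral radius: $\frac{5+\sqrt{33}}{2} \approx 5.3723$.
(b) Eigenvalues: $\pm\frac{i}{6\sqrt2} \approx \pm .11785\,i$; spectral radius: $\frac{1}{6\sqrt2} \approx .11785$.
(c) Eigenvalues: 2, 1, 1; spectral radius: 2.
(d) Eigenvalues: 4, $1 \pm 4\,i$; spectral radius: $\sqrt{17} \approx 4.1231$.
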
10.2.2.

(a) Eigenvalues: $2 \pm 3\,i$; spectral radius: $\sqrt{13} \approx 3.6056$; not convergent.
(b) Eigenvalues: .95414, .34586; spectral radius: .95414; convergent.
(c) Eigenvalues: $\frac45$, $\frac35$, 0; spectral radius: $\frac45$; convergent.
(d) Eigenvalues: 1., .547214, .347214; spectral radius: 1; not convergent.
10.2.3.
(a) Unstable: eigenvalues 1, 3;
(b) unstable: eigenvalues $\frac{5+\sqrt{73}}{12} \approx 1.12867$, $\frac{5-\sqrt{73}}{12} \approx -.29533$;
(c) asymptotically stable: eigenvalues $\frac{1 \pm i}{2}$;
(d) stable: eigenvalues 1, i;
(e) unstable: eigenvalues $\frac54$, $\frac14$, $\frac14$;
(f) unstable: eigenvalues 2, 1, 1;
(g) asymptotically stable: eigenvalues $\frac12$, $\frac13$, 0.
10.2.4.
(a) $\lambda_1 = 3$, $\lambda_2 = 1 + 2\,i$, $\lambda_3 = 1 - 2\,i$, $\rho(T) = 3$.
(b) $\lambda_1 = \frac35$, $\lambda_2 = \frac15 + \frac25\,i$, $\lambda_3 = \frac15 - \frac25\,i$, $\rho(\tilde T) = \frac35$.
(c) $u^{(k)} \to 0$ in all cases; generically, the initial data has a non-zero component, $c_1 \ne 0$, in the direction of the dominant eigenvector $(1, 1, 1)^T$, and then $u^{(k)} \approx c_1\bigl(\frac35\bigr)^k\,(1, 1, 1)^T$.


10.2.5.
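(a) $T$ has a double eigenvalue of 1, so $\rho(T) = 1$.
(b) Set $u^{(0)} = \begin{pmatrix} a\\ b\end{pmatrix}$. Then $T^k = \begin{pmatrix} 1 & k\\ 0 & 1\end{pmatrix}$, and so $u^{(k)} = \begin{pmatrix} a + k\,b\\ b\end{pmatrix} \to \infty$ provided $b \ne 0$.
(c) In this example, $\|u^{(k)}\| = \sqrt{b^2k^2 + 2\,a\,b\,k + a^2 + b^2} \approx b\,k \to \infty$ when $b \ne 0$, while $C\,\rho(T)^k = C$ is constant, so eventually $\|u^{(k)}\| > C\,\rho(T)^k$ no matter how large $C$ is.
(d) For any $\sigma > 1$, we have $b\,k \le C\,\sigma^k$ for $k \ge 0$ provided $C \gg 0$ is sufficiently large; more specifically, if $C > b/\log\sigma$.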
10.2.6. A solution $u^{(k)} \to 0$ if and only if the initial vector $u^{(0)} = c_1 v_1 + \cdots + c_j v_j$ is a linear combination of the eigenvectors (or more generally, Jordan chain vectors) corresponding to eigenvalues satisfying $|\lambda_i| < 1$ for $i = 1,\dots,j$.
10.2.7. Since $\rho(cA) = |c|\,\rho(A)$, then $cA$ is convergent if and only if $|c| < 1/\rho(A)$. So, technically, there isn't a largest $c$.
10.2.8.
(a) Let $u_1,\dots,u_n$ be a unit eigenvector basis for $T$, so $\|u_j\| = 1$. Let
$$m_j = \max\bigl\{\,|c_j| \;\bigm|\; \|c_1 u_1 + \cdots + c_n u_n\| \le 1\,\bigr\},$$
which is finite since we are maximizing a continuous function over a closed, bounded set. Let $m_\star = \max\{m_1,\dots,m_n\}$. Now, given $\varepsilon > 0$ small, if
$$\|u^{(0)}\| = \|c_1 u_1 + \cdots + c_n u_n\| < \varepsilon, \quad\text{then}\quad |c_j| < m_\star\,\varepsilon \quad\text{for}\quad j = 1,\dots,n.$$
Therefore, by (10.25), $\|u^{(k)}\| \le |c_1| + \cdots + |c_n| \le n\,m_\star\,\varepsilon$, and hence the solution remains close to 0.
(b) If any eigenvalue of modulus $|\lambda| = 1$ is incomplete, then, according to (10.23), the system has solutions of the form $u^{(k)} = \lambda^k w_i + k\,\lambda^{k-1} w_{i-1} + \cdots$, which are unbounded as $k \to \infty$. Thus, the origin is not stable in this case. On the other hand, if all eigenvalues of modulus 1 are complete, then the system is stable, even if there are incomplete eigenvalues of modulus $< 1$. The proof is a simple adaptation of that in part (a).
10.2.9. Assume $u^{(0)} = c_1 v_1 + \cdots + c_n v_n$ with $c_1 \ne 0$. For $k \gg 0$, $u^{(k)} \approx c_1 \lambda_1^k v_1$ since $|\lambda_1^k| \gg |\lambda_j^k|$ for all $j > 1$. Thus, the entries satisfy $u_i^{(k+1)} \approx \lambda_1 u_i^{(k)}$ and so, if nonzero, are just multiplied by $\lambda_1$. Thus, if $\lambda_1 > 0$ we expect to see the signs of all the entries of $u^{(k)}$ not change once $k$ is sufficiently large, whereas if $\lambda_1 < 0$, again for sufficiently large $k$, the signs alternate at each step of the iteration.
10.2.10. Writing $u^{(0)} = c_1 v_1 + \cdots + c_n v_n$, then for $k \gg 0$,
$$u^{(k)} \approx c_1 \lambda_1^k v_1 + c_2 \lambda_2^k v_2, \qquad (*)$$
which applies even when $v_1, v_2$ are complex eigenvectors of a real matrix. Thus this happens if and only if the iterates eventually belong (modulo a small error) to a two-dimensional subspace $V$, namely that spanned by the eigenvectors $v_1$ and $v_2$. In particular, for $k \gg 0$, the iterates $u^{(k)}$ and $u^{(k+1)}$ form a basis for $V$ (again modulo a small error) since if they were linearly dependent, then there would only be one eigenvalue of largest modulus. Thus, we can write $u^{(k+2)} \approx a\,u^{(k+1)} + b\,u^{(k)}$ for some scalars $a, b$. We claim that the dominant eigenvalues $\lambda_1, \lambda_2$ are the roots of the quadratic equation $\lambda^2 = a\,\lambda + b$, which gives an effective algorithm for determining them. To prove the claim, for $k \gg 0$, by formula $(*)$,
$$u^{(k+2)} \approx c_1 \lambda_1^{k+2} v_1 + c_2 \lambda_2^{k+2} v_2,$$
$$a\,u^{(k+1)} + b\,u^{(k)} \approx c_1 \lambda_1^k (a\,\lambda_1 + b)\, v_1 + c_2 \lambda_2^k (a\,\lambda_2 + b)\, v_2.$$
Thus, by linear independence of the eigenvectors $v_1, v_2$, we conclude that
$$\lambda_1^2 = a\,\lambda_1 + b, \qquad \lambda_2^2 = a\,\lambda_2 + b,$$
which proves the claim. With the eigenvalues in hand, the determination of the eigenvectors is straightforward, either by directly solving the linear eigenvalue system, or by using equation $(*)$ for $k$ and $k + 1$ sufficiently large.
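The algorithm described in this solution can be sketched in a few lines. This is only an illustration under stated assumptions: the test matrix, the starting vector, the number of steps, and the use of a least squares fit to find $a$ and $b$ are all choices made here, not taken from the exercise.

```python
import cmath

def matvec(T, u):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(t * x for t, x in zip(row, u)) for row in T]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def dominant_pair(T, u0, steps=60):
    """Estimate the two dominant eigenvalues from three consecutive iterates."""
    u = list(u0)
    for _ in range(steps):
        u = matvec(T, u)
        scale = max(abs(x) for x in u)
        u = [x / scale for x in u]      # rescale to avoid overflow
    u1 = matvec(T, u)                   # plays the role of u^(k+1)
    u2 = matvec(T, u1)                  # plays the role of u^(k+2)
    # Least squares fit of u2 = a*u1 + b*u via the 2x2 normal equations.
    g11, g12, g22 = dot(u1, u1), dot(u1, u), dot(u, u)
    r1, r2 = dot(u1, u2), dot(u, u2)
    det = g11 * g22 - g12 * g12
    a = (r1 * g22 - r2 * g12) / det
    b = (g11 * r2 - g12 * r1) / det
    # Roots of lambda^2 = a*lambda + b.
    root = cmath.sqrt(a * a + 4 * b)
    return (a + root) / 2, (a - root) / 2

# Illustrative matrix with eigenvalues 1 + 2i, 1 - 2i (dominant pair) and 1.
T = [[1.0, -2.0, 0.0],
     [2.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
lam1, lam2 = dominant_pair(T, [1.0, 1.0, 1.0])
print(lam1, lam2)
```

Rescaling the iterate at each step does not affect the fitted coefficients, since the relation $u^{(k+2)} \approx a\,u^{(k+1)} + b\,u^{(k)}$ is homogeneous in the three vectors.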
10.2.11. If $T$ has eigenvalues $\lambda_j$, then $c\,T + d\,I$ has eigenvalues $c\,\lambda_j + d$. However, it is not necessarily true that the dominant eigenvalue of $c\,T + d\,I$ is $c\,\lambda_1 + d$ when $\lambda_1$ is the dominant eigenvalue of $T$. For instance, if $\lambda_1 = 3$, $\lambda_2 = -2$, so $\rho(T) = 3$, then $\lambda_1 - 2 = 1$, $\lambda_2 - 2 = -4$, so $\rho(T - 2\,I) = 4 \ne \rho(T) - 2$. Thus, you need to know all the eigenvalues to predict $\rho(c\,T + d\,I)$, or, more accurately, the extreme eigenvalues, i.e., those such that all other eigenvalues lie in their convex hull in the complex plane.
10.2.12. By definition, the eigenvalues of $A^T A$ are $\lambda_i = \sigma_i^2$, and so the spectral radius of $A^T A$ is equal to $\rho(A^T A) = \max\{\sigma_1^2, \ldots, \sigma_n^2\}$. Thus $\rho(A^T A) = \lambda_1 < 1$ if and only if $\sigma_1 = \sqrt{\lambda_1} < 1$.

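10.2.13. (a) $\rho(M_n) = 2 \cos \dfrac{\pi}{n+1}$. (b) No, since its spectral radius is slightly less than 2.
(c) The entries of $u^{(k)}$ are $u_i^{(k)} = \displaystyle\sum_{j=1}^{n} c_j \left( 2 \cos \frac{j\,\pi}{n+1} \right)^{k} \sin \frac{i\,j\,\pi}{n+1}$, $i = 1, \ldots, n$, where $c_1, \ldots, c_n$ are arbitrary constants.

10.2.14.
(a) The entries of $u^{(k)}$ are $u_i^{(k)} = \displaystyle\sum_{j=1}^{n} c_j \left( \alpha + 2\,\beta \cos \frac{j\,\pi}{n+1} \right)^{k} \sin \frac{i\,j\,\pi}{n+1}$, $i = 1, \ldots, n$.
(b) The system is asymptotically stable if and only if
$$\rho(T_{\alpha,\beta}) = \max\left\{ \left| \alpha + 2\,\beta \cos \frac{\pi}{n+1} \right|,\; \left| \alpha - 2\,\beta \cos \frac{\pi}{n+1} \right| \right\} < 1.$$
In particular, if $|\alpha \pm 2\,\beta| < 1$ the system is asymptotically stable for any $n$.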
10.2.15. (a) According to Exercise 8.2.25, $T$ has at least one eigenvalue with $|\lambda| > 1$. (b) No. For example, $T = \begin{pmatrix} 2 & 0 \\ 0 & \frac13 \end{pmatrix}$ has $\det T = \frac23$, but $\rho(T) = 2$, and so the iterative system is unstable.
10.2.16.
(a) False: $\rho(c\,A) = |c|\,\rho(A)$.
(b) True, since the eigenvalues of $A$ and $S^{-1} A S$ are the same.
(c) True, since the eigenvalues of $A^2$ are the squares of the eigenvalues of $A$.
(d) False, since $\rho(A) = \max |\lambda|$ whereas $\rho(A^{-1}) = \max 1/|\lambda|$.
(e) False in almost all cases; for instance, if $A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$, then $\rho(A) = \rho(B) = \rho(A + B) = 1 \ne 2 = \rho(A) + \rho(B)$.
(f) False: using the matrices in (e), $A B = O$ and so $\rho(A B) = 0 \ne 1 = \rho(A)\,\rho(B)$.
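10.2.17. (a) True by part (c) of Exercise 10.2.16. (b) False. For example, $A = \begin{pmatrix} \frac12 & 1 \\ 0 & \frac12 \end{pmatrix}$ has $\rho(A) = \frac12$, whereas $A^T A = \begin{pmatrix} \frac14 & \frac12 \\ \frac12 & \frac54 \end{pmatrix}$ has $\rho(A^T A) = \frac34 + \frac12 \sqrt2 = 1.45711$.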

10.2.18. False. The first requires its eigenvalues satisfy $\operatorname{Re} \lambda_j < 0$; the second requires $|\lambda_j| < 1$.
10.2.19. (a) $A^2 = \left( \lim_{k \to \infty} T^k \right)^2 = \lim_{k \to \infty} T^{2k} = A$. (b) The only eigenvalues of $A$ are 1 and 0. Moreover, $A$ must be complete, since if $v_1, v_2$ are the first two vectors in a Jordan chain, then $A v_1 = \lambda v_1$, $A v_2 = \lambda v_2 + v_1$, with $\lambda = 0$ or $1$, but $A^2 v_2 = \lambda^2 v_2 + 2\,\lambda\,v_1 \ne A v_2 = \lambda v_2 + v_1$, so there are no Jordan chains except for the ordinary eigenvectors. Therefore, $A = S \operatorname{diag}(1, \ldots, 1, 0, \ldots, 0)\, S^{-1}$ for some nonsingular matrix $S$. (c) If $\lambda$ is an eigenvalue of $T$, then either $|\lambda| < 1$, or $\lambda = 1$ and is a complete eigenvalue.
10.2.20. If $v$ has integer entries, so does $A^k v$ for any $k$, and so the only way in which $A^k v \to 0$ is if $A^k v = 0$ for some $k$. Now consider the basis vectors $e_1, \ldots, e_n$. Let $k_i$ be such that $A^{k_i} e_i = 0$. Let $k = \max\{k_1, \ldots, k_n\}$, so $A^k e_i = 0$ for all $i = 1, \ldots, n$. Then $A^k I = A^k = O$, and hence $A$ is nilpotent. The simplest example is $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$.
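10.2.21. The equivalent first order system $v^{(k+1)} = C\,v^{(k)}$ for $v^{(k)} = \begin{pmatrix} u^{(k)} \\ u^{(k+1)} \end{pmatrix}$ has coefficient matrix $C = \begin{pmatrix} O & I \\ B & A \end{pmatrix}$. To compute the eigenvalues of $C$ we form $\det(C - \lambda I) = \det \begin{pmatrix} -\lambda I & I \\ B & A - \lambda I \end{pmatrix}$. Now use row operations to subtract appropriate multiples of the first $n$ rows from the last $n$, and then a series of row interchanges to conclude that
$$\det(C - \lambda I) = \det \begin{pmatrix} -\lambda I & I \\ B + \lambda A - \lambda^2 I & O \end{pmatrix} = \pm \det \begin{pmatrix} B + \lambda A - \lambda^2 I & O \\ -\lambda I & I \end{pmatrix} = \pm \det(B + \lambda A - \lambda^2 I),$$
where the sign $(-1)^n$ comes from the $n$ row interchanges. Thus, the quadratic eigenvalues are the same as the ordinary eigenvalues of $C$, and hence stability requires they all satisfy $|\lambda| < 1$.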
10.2.22. Set = / > 1. If p(x) = ck xk + + c1 x + c0 has degree k, then p(n) a nk for all
n 1 where a = max | ci |. To prove a nk C n it suffice to prove that k log n < n log +
log C log a. Now h(n) = n log k log n has a minimum when h0 (n) = log k/n = 0,
so n = k/ log . The minimum value is h(k/ log ) = k(1 log(k/ log )). Thus, choosing
log C > log a + k(log(k/ log ) 1) will ensure the desired inequality.
10.2.23. According to Exercise 10.1.36, there is a polynomial $p(x)$ such that
$$\| u^{(k)} \| \le \sum_i |\lambda_i|^k\, p_i(k) \le p(k)\, \rho(A)^k.$$
Thus, by Exercise 10.2.22, $\| u^{(k)} \| \le C\,\sigma^k$ for any $\sigma > \rho(A)$.


10.2.24.
(a) Rewriting the system as $u^{(n+1)} = M^{-1} u^{(n)}$, stability requires $\rho(M^{-1}) < 1$. The eigenvalues of $M^{-1}$ are the reciprocals of the eigenvalues of $M$, and hence $\rho(M^{-1}) < 1$ if and only if every eigenvalue $\lambda_i$ of $M$ satisfies $1/|\lambda_i| < 1$.
(b) Rewriting the system as $u^{(n+1)} = M^{-1} K\, u^{(n)}$, stability requires $\rho(M^{-1} K) < 1$. Moreover, the eigenvalues of $M^{-1} K$ coincide with the generalized eigenvalues of the pair; see Exercise 8.4.8 for details.


10.2.25. (a) All scalar multiples of

1
; (b)
1

1 0

0
; (c) all scalar multiples of
0
1

1
1
C B C
(d) all linear combinations of B
@ 1 A , @ 0 A.
0
1

B
C
@ 2 A;

10.2.26.
(a) The eigenvalues are 1, 12 , so the fixed points are stable, while all other solutions go to a

.
unique fixed point at rate 12 . When u(0) = ( 1, 0 )T , then u(k) 35 , 35
(b) The eigenvalues are .9, .8, so the origin is a stable fixed point, and every nonzero solution goes to it, most at a rate of .9k . When u(0) = ( 1, 0 )T , then u(k) 0 also.
(c) The eigenvalues are 2, 1, 0, so the fixed points are unstable. Most solutions, specifically those with a nonzero component in the dominant eigenvector direction, become
unbounded. However, when u(0) = ( 1, 0, 0 )T , then u(k) = ( 1, 2, 1 )T for k 1, and
the solution stays at a fixed point.
(d) The eigenvalues are 5 and 1, so the fixed points are unstable. Most solutions, specifically those with a nonzero component in the dominant eigenvector direction, become
unbounded, including that with u(0) = ( 1, 0, 0 )T .
10.2.27. Since $T$ is symmetric, its eigenvectors $v_1, \ldots, v_n$ form an orthogonal basis of $\mathbb{R}^n$. Writing $u^{(0)} = c_1 v_1 + \cdots + c_n v_n$, the coefficients are given by the usual orthogonality formula (5.7): $c_i = u^{(0)} \cdot v_i / \| v_i \|^2$. Moreover, since $\lambda_1 = 1$, while $|\lambda_j| < 1$ for $j \ge 2$,
$$u^{(k)} = c_1 v_1 + c_2 \lambda_2^k v_2 + \cdots + c_n \lambda_n^k v_n \;\longrightarrow\; c_1 v_1 = \frac{u^{(0)} \cdot v_1}{\| v_1 \|^2}\, v_1.$$

10.2.28. False: T has an eigenvalue of 1, but convergence requires that all eigenvalues be less
than 1 in modulus.
10.2.29. True. In this case $T u = u$ for all $u \in \mathbb{R}^n$ and hence $T = I$.
10.2.30.
(a) The iterative system has a period 2 solution if and only if $T$ has an eigenvalue of $-1$. Indeed, the condition $u^{(k+2)} = T^2 u^{(k)} = u^{(k)}$ implies that $u^{(k)} \ne 0$ is an eigenvector of $T^2$ with eigenvalue of 1. Thus, $u^{(k)}$ is an eigenvector of $T$ with eigenvalue $-1$ because if the eigenvalue were 1 then $u^{(k)} = u^{(k+1)}$, and the solution would be a fixed point. For example, $T = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}$ has the period 2 orbit $u^{(k)} = c \begin{pmatrix} (-1)^k \\ 0 \end{pmatrix}$ for any $c$.
(b) The period 2 solution is never unique since any nonzero scalar multiple is also a period 2 solution.
(c) $T$ must have an eigenvalue equal to a primitive $m$th root of unity; the non-primitive roots lead to solutions with smaller periods.
10.2.31.
(a) Let $u^\star = v_1$ be a fixed point of $T$, i.e., an eigenvector with eigenvalue $\lambda_1 = 1$. Assuming $T$ is complete, we form an eigenvector basis $v_1, \ldots, v_n$ with corresponding eigenvalues $\lambda_1 = 1, \lambda_2, \ldots, \lambda_n$. Assume $\lambda_1, \ldots, \lambda_j$ have modulus 1, while the remaining eigenvalues satisfy $|\lambda_i| < 1$ for $i = j + 1, \ldots, n$. If the initial value $u^{(0)} = c_1 v_1 + \cdots + c_n v_n$ is close to $u^\star = v_1$, then $|c_1 - 1|, |c_2|, \ldots, |c_n| < \varepsilon$ for $\varepsilon$ small. Then the corresponding solution $u^{(k)} = c_1 v_1 + c_2 \lambda_2^k v_2 + \cdots + c_n \lambda_n^k v_n$ satisfies
$$\| u^{(k)} - u^\star \| \le |c_1 - 1|\, \| v_1 \| + |c_2|\, |\lambda_2|^k\, \| v_2 \| + \cdots + |c_n|\, |\lambda_n|^k\, \| v_n \| < \varepsilon \bigl( \| v_1 \| + \cdots + \| v_n \| \bigr) = C\,\varepsilon,$$
and hence any solution that starts near $u^\star$ stays near.
(b) If $A$ has an incomplete eigenvalue of modulus $|\lambda| = 1$, then, according to the solution formula (10.23), the iterative system admits unbounded solutions $\tilde u^{(k)} \to \infty$. Thus, for any $\varepsilon > 0$, there is an unbounded solution $u^\star + \varepsilon\, \tilde u^{(k)} \to \infty$ that starts out arbitrarily close to the fixed point $u^\star$. On the other hand, if all eigenvalues of modulus 1 are complete, then the preceding proof works in essentially the same manner. The first $j$ terms are bounded as before, while the remainder go to 0 as $k \to \infty$.

10.3.1. (a) $\frac34$, convergent; (b) 3, inconclusive; (c) $\frac87$, inconclusive; (d) $\frac74$, inconclusive; (e) $\frac87$, inconclusive; (f) .9, convergent; (g) $\frac73$, inconclusive; (h) 1, inconclusive.
10.3.2. (a) .671855, convergent; (b) 2.5704, inconclusive; (c) .9755, convergent; (d) 1.9571, inconclusive; (e) 1.1066, inconclusive; (f) .8124, convergent; (g) 2.03426, inconclusive; (h) .7691, convergent.
10.3.3. (a) $\frac23$, convergent; (b) $\frac12$, convergent; (c) .9755, convergent; (d) 1.0308, divergent; (e) .9437, convergent; (f) .8124, convergent; (g) $\frac23$, convergent; (h) $\frac23$, convergent.
10.3.4. (a) k Ak k = k 2 + k, (b) k Ak k2 = k 2 + 1, (c) (Ak ) = 0. (d) Thus, a convergent matrix can have arbitrarily large norm. (e) Because the norm in the inequality will
depend on k.
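10.3.5. For example, when $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, and (a) $\| A \|_\infty = 2$, $\| A^2 \|_\infty = 3$; (b) $\| A \|_2 = \sqrt{\frac{3 + \sqrt5}{2}} = 1.6180$, $\| A^2 \|_2 = \sqrt{3 + 2\sqrt2} = 2.4142$.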


10.3.6. Since k c A k = | c | k A k < 1.
10.3.7. For example, if A =

0
0

1
,B=
0

1
0

1
, then (A + B) = 2 > 0 + 1 = (A) + (B).
0

0
1

10.3.8. True: this implies $\| A \|_2 = \max \sigma_i < 1$.


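10.3.9. For example, if $A = \begin{pmatrix} \frac12 & 1 \\ 0 & \frac12 \end{pmatrix}$, then $\rho(A) = \frac12$. The singular values of $A$ are $\sigma_1 = \frac{\sqrt{3 + 2\sqrt2}}{2} = 1.2071$ and $\sigma_2 = \frac{\sqrt{3 - 2\sqrt2}}{2} = .2071$.
10.3.10.
(a) False: For instance, if $A = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}$, $S = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$, then $B = S^{-1} A S = \begin{pmatrix} 0 & 2 \\ 0 & 1 \end{pmatrix}$, and $\| B \|_\infty = 2 \ne 1 = \| A \|_\infty$.
(b) False: The same example has $\| B \|_2 = \sqrt5 \ne \sqrt2 = \| A \|_2$.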


(c) True, since A and B have the same eigenvalues.
10.3.11. By definition, $\kappa(A) = \sigma_1 / \sigma_n$. Now $\| A \|_2 = \sigma_1$. On the other hand, by Exercise 8.5.12, the singular values of $A^{-1}$ are the reciprocals $1/\sigma_i$ of the singular values of $A$, and so the largest one is $\| A^{-1} \|_2 = 1/\sigma_n$.


10.3.12. (i) The 1 matrix norm is the maximum absolute column sum:
$$\| A \|_1 = \max \left\{\, \sum_{i=1}^{n} | a_{ij} | \;:\; 1 \le j \le n \,\right\}.$$
(ii) (a) $\frac56$, convergent; (b) $\frac{17}{6}$, inconclusive; (c) $\frac87$, inconclusive; (d) $\frac{11}{4}$, inconclusive; (e) $\frac{12}{7}$, inconclusive; (f) .9, convergent; (g) $\frac73$, inconclusive; (h) $\frac23$, convergent.
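The column sum formula in part (i), together with the row sum formula for the $\infty$ norm, is easy to evaluate directly. The sketch below is illustrative only: the matrix is a made-up example (not one of the matrices from the exercise), and the helper names `one_norm`, `inf_norm` and `classify` are chosen here.

```python
from fractions import Fraction as F

def inf_norm(A):
    """Maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

def one_norm(A):
    """Maximum absolute column sum."""
    rows, cols = len(A), len(A[0])
    return max(sum(abs(A[i][j]) for i in range(rows)) for j in range(cols))

def classify(norm_value):
    """A matrix norm below 1 guarantees convergence; otherwise the test says nothing."""
    return "convergent" if norm_value < 1 else "inconclusive"

# Hypothetical example matrix, entered with exact fractions.
A = [[F(1, 2), F(1, 3)],
     [F(-1, 4), F(1, 6)]]

print(one_norm(A), classify(one_norm(A)))   # column sums are 3/4 and 1/2
print(inf_norm(A), classify(inf_norm(A)))   # row sums are 5/6 and 5/12
```

Exact fractions are used so that the computed norms can be compared directly with hand calculations; floating point entries would work equally well.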

10.3.13. If $a_1, \ldots, a_n$ are the rows of $A$, then the formula (10.40) can be rewritten as $\| A \|_\infty = \max\{\, \| a_i \|_1 \mid i = 1, \ldots, n \,\}$, i.e., the maximal 1 norm of the rows. Thus, by the properties of the 1-norm,
$$\| A + B \|_\infty = \max\{ \| a_i + b_i \|_1 \} \le \max\{ \| a_i \|_1 + \| b_i \|_1 \} \le \max\{ \| a_i \|_1 \} + \max\{ \| b_i \|_1 \} = \| A \|_\infty + \| B \|_\infty,$$
$$\| c\,A \|_\infty = \max\{ \| c\,a_i \|_1 \} = \max\{ |c|\, \| a_i \|_1 \} = |c| \max\{ \| a_i \|_1 \} = |c|\, \| A \|_\infty.$$
Finally, $\| A \|_\infty \ge 0$ since we are maximizing non-negative quantities; moreover, $\| A \|_\infty = 0$ if and only if all its rows have $\| a_i \|_1 = 0$ and hence all $a_i = 0$, which means $A = O$.
10.3.14. $\| A \| = \max\{\sigma_1, \ldots, \sigma_n\}$ is the largest generalized singular value, meaning $\sigma_i = \sqrt{\lambda_i}$ where $\lambda_1, \ldots, \lambda_n$ are the generalized eigenvalues of the positive definite matrix pair $A^T K A$ and $K$, satisfying $A^T K A\, v = \lambda\, K v$ for some $v \ne 0$, or, equivalently, the eigenvalues of $K^{-1} A^T K A$.

10.3.15.
(a) k A k =

7
2.

The unit sphere for this norm is the rectangle with corners

5
1
6, 6

It is mapped to the parallelogram with corners


,
5
7
tive norms 3 and 2 , and so k A k = max{ k A v k | k v k = 1 } = 72 .
8
3.
T
1
.
3

(b) k A k =

1 7
6, 6

12 , 31

, with respec-

The unit sphere for this norm is the diamond with corners

T
1
,
2,0
T

2
1
,
3,3

0,
It is mapped to the parallelogram with corners 12 , 12
,
5
8
with respective norms 2 and 3 , and so k A k = max{ k A v k | k v k = 1 } = 83 .

(c) According to Exercise 10.3.14, k A k is


the square root of the !
largest generalized eigen!
5 4
2 0
T
. Thus,
, A KA =
value of the matrix pair K =
4 14
0 3
r

k A k = 43+12 553 = 2.35436.


(d) According to Exercise 10.3.14, k A k is the
square root of the largest
generalized eigen!
!
2 1
2 1
T
. Thus, k A k = 3.
, A KA =
value of the matrix pair K =
1 14
1
2
2

10.3.16. If we identify an $n \times n$ matrix with a vector in $\mathbb{R}^{n^2}$, then the Frobenius norm is the same as the ordinary Euclidean norm, and so the norm axioms are immediate. To check the multiplicative property, let $r_1^T, \ldots, r_n^T$ denote the rows of $A$ and $c_1, \ldots, c_n$ the columns of $B$, so $\| A \|_F = \sqrt{\sum_{i=1}^n \| r_i \|^2}$, $\| B \|_F = \sqrt{\sum_{j=1}^n \| c_j \|^2}$. Then, setting $C = A B$, we have
$$\| C \|_F = \sqrt{\sum_{i,j=1}^n c_{ij}^2} = \sqrt{\sum_{i,j=1}^n (r_i^T c_j)^2} \le \sqrt{\sum_{i,j=1}^n \| r_i \|^2\, \| c_j \|^2} = \| A \|_F\, \| B \|_F,$$
where we used the Cauchy-Schwarz inequality in the middle.



10.3.17.
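(a) This is a restatement of Proposition 10.28.
(b) $\| \sigma \|_2^2 = \displaystyle\sum_{i=1}^n \sigma_i^2 = \sum_{i=1}^n \lambda_i = \operatorname{tr}(A^T A) = \sum_{i,j=1}^n a_{ij}^2 = \| A \|_F^2$.

10.3.18. If we identify a matrix $A$ with a vector in $\mathbb{R}^{n^2}$, then this agrees with the $\infty$ norm on $\mathbb{R}^{n^2}$ and hence satisfies the norm axioms. For example, when $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, then $A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, and so $\| A^2 \| = 2 > 1 = \| A \|^2$.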
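10.3.19. First, if $x = \begin{pmatrix} a \\ b \end{pmatrix}$, $y = \begin{pmatrix} c \\ d \end{pmatrix}$ are any two linearly independent vectors in $\mathbb{R}^2$, then the curve $(\cos\theta)\, x - (\sin\theta)\, y = \begin{pmatrix} a & -c \\ b & -d \end{pmatrix} \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}$ is the image of the unit circle under the linear transformation defined by the nonsingular matrix $\begin{pmatrix} a & -c \\ b & -d \end{pmatrix}$, and hence defines an ellipse. The same argument shows that the curve $(\cos\theta)\, x - (\sin\theta)\, y$ describes an ellipse in the two-dimensional plane spanned by the vectors $x, y$.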
10.3.20.
(a) This follows from the formula (10.40) since $| a_{ij} | \le s_i \le \| A \|_\infty$, where $s_i$ is the $i$th absolute row sum.
(b) Let $a_{ij,n}$ denote the $(i, j)$ entry of $A_n$. Then, by part (a), $\displaystyle\sum_{n=0}^\infty | a_{ij,n} | \le \sum_{n=0}^\infty \| A_n \|_\infty < \infty$, and hence $\displaystyle\sum_{n=0}^\infty a_{ij,n} = a^\star_{ij}$ is an absolutely convergent series, [ 2, 16 ]. Since each entry converges absolutely, the matrix series also converges.
(c) $e^{tA} = \displaystyle\sum_{n=0}^\infty \frac{t^n}{n!}\, A^n$ and the series of norms
$$\sum_{n=0}^\infty \frac{|t|^n}{n!}\, \| A^n \|_\infty \le \sum_{n=0}^\infty \frac{|t|^n}{n!}\, \| A \|_\infty^n = e^{|t|\, \| A \|_\infty}$$
is bounded by the standard scalar exponential series, which converges for all $t$, [ 2, 58 ]. Thus, the convergence follows from part (b).
10.3.21. (a) Choosing a matrix norm such that $a = \| A \| < 1$, the norm series is bounded by a convergent geometric series:
$$\sum_{n=0}^\infty \| A^n \| \le \sum_{n=0}^\infty \| A \|^n = \sum_{n=0}^\infty a^n = \frac{1}{1 - a}.$$
Therefore, the matrix series converges.
(b) Moreover,
$$(I - A) \sum_{n=0}^\infty A^n = \sum_{n=0}^\infty A^n - \sum_{n=0}^\infty A^{n+1} = I,$$
since all other terms cancel. $I - A$ is invertible if and only if 1 is not an eigenvalue of $A$, and we are assuming all eigenvalues are less than 1 in magnitude.
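The telescoping identity in part (b) can be observed numerically by truncating the series. This is a rough sketch under stated assumptions: the $2 \times 2$ matrix below is a made-up example with $\| A \|_\infty = .5 < 1$, and the helper names are illustrative.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_sum(A, terms):
    """Partial sum I + A + A^2 + ... + A^terms of the matrix geometric series."""
    n = len(A)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = matmul(power, A)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

# Hypothetical matrix with maximal absolute row sum .5, so the series converges.
A = [[0.2, 0.3],
     [0.1, 0.4]]
S = neumann_sum(A, 100)
I_minus_A = [[1 - A[0][0], -A[0][1]],
             [-A[1][0], 1 - A[1][1]]]
print(matmul(I_minus_A, S))  # numerically the identity matrix
```

The truncation error is of the order of $\| A \|^{N+1}$ for $N$ terms, so the number of terms needed grows as the norm approaches 1.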


2
,
1

10.3.22.
(a) Gerschgorin disk: $| z - 1 | \le 2$; eigenvalues: $3, -1$.
(b) Gerschgorin disks: $| z - 1 | \le \frac23$, $\left| z + \frac16 \right| \le \frac12$; eigenvalues: $\frac12, \frac13$.
(c) Gerschgorin disks: $| z - 2 | \le 3$, $| z | \le 1$; eigenvalues: $1 \pm i \sqrt2$.
(d) Gerschgorin disks: $| z - 3 | \le 1$, $| z - 2 | \le 2$; eigenvalues: $4, 3, 1$.
(e) Gerschgorin disks: $| z + 1 | \le 2$, $| z - 2 | \le 3$, $| z + 4 | \le 3$; eigenvalues: $-2.69805 \pm .806289\, i$, $2.3961$.
(f) Gerschgorin disks: $z = \frac12$, $| z | \le \frac13$, $| z | \le \frac{5}{12}$; eigenvalues: $\frac12, \pm \frac{1}{3\sqrt2}$.
(g) Gerschgorin disks: $| z | \le 1$, $| z - 1 | \le 1$; eigenvalues: $0, 1 \pm i$.
(h) Gerschgorin disks: $| z - 3 | \le 2$, $| z - 2 | \le 1$, $| z | \le 1$, $| z - 1 | \le 2$; eigenvalues: $\frac12 \pm \frac{\sqrt5}{2}$, $\frac52 \pm \frac{\sqrt5}{2}$.
[Figures: for each part, a plot of the Gerschgorin disks in the complex plane with the eigenvalues marked.]

10.3.23. False. Almost any non-symmetric matrix, e.g., $\begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}$, provides a counterexample.

10.3.24.
(i) Because $A$ and its transpose $A^T$ have the same eigenvalues, which must therefore belong to both $D_A$ and $D_{A^T}$.
(ii)
(a) Gerschgorin disk: $| z - 1 | \le 2$; eigenvalues: $3, -1$.
(b) Gerschgorin disks: $| z - 1 | \le \frac12$, $\left| z + \frac16 \right| \le \frac23$; eigenvalues: $\frac12, \frac13$.
(c) Gerschgorin disks: $| z - 2 | \le 1$, $| z | \le 3$; eigenvalues: $1 \pm i \sqrt2$.
(d) Gerschgorin disks: $| z - 3 | \le 1$, $| z - 2 | \le 2$; eigenvalues: $4, 3, 1$.
(e) Gerschgorin disks: $| z + 1 | \le 2$, $| z - 2 | \le 4$, $| z + 4 | \le 2$; eigenvalues: $-2.69805 \pm .806289\, i$, $2.3961$.
(f) Gerschgorin disks: $\left| z - \frac12 \right| \le \frac14$, $| z | \le \frac16$, $| z | \le \frac13$; eigenvalues: $\frac12, \pm \frac{1}{3\sqrt2}$.
(g) Gerschgorin disks: $z = 0$, $| z - 1 | \le 2$, $| z - 1 | \le 1$; eigenvalues: $0, 1 \pm i$.
(h) Gerschgorin disks: $| z - 3 | \le 1$, $| z - 2 | \le 2$, $| z | \le 2$, $| z - 1 | \le 1$; eigenvalues: $\frac12 \pm \frac{\sqrt5}{2}$, $\frac52 \pm \frac{\sqrt5}{2}$.
[Figures: for each part, a plot of the column Gerschgorin disks in the complex plane with the eigenvalues marked.]

10.3.25. By elementary geometry, all points $z$ in a closed disk of radius $r$ centered at $z = a$ satisfy $\max\{\, 0, | a | - r \,\} \le | z | \le | a | + r$. Thus, every point in the $i$th Gerschgorin disk satisfies $\max\{\, 0, | a_{ii} | - r_i \,\} \le | z | \le | a_{ii} | + r_i = s_i$. Since every eigenvalue lies in such a disk, they all satisfy $\max\{\, 0, t \,\} \le | \lambda_i | \le s$, and hence $\rho(A) = \max\{\, | \lambda_i | \,\}$ does too.
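The bounds in this solution can be computed directly from the entries of a matrix. The following is a small sketch, assuming that $t$ denotes the minimum of $| a_{ii} | - r_i$ over the rows (the exercise statement is not reproduced here, so this reading is an assumption); the sample matrix and function names are illustrative.

```python
def gerschgorin_disks(A):
    """Return (center, radius) for each row: a_ii and the sum of |a_ij|, j != i."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

def spectral_radius_bounds(A):
    """Bounds max{0, t} <= rho(A) <= s obtained from the Gerschgorin disks."""
    disks = gerschgorin_disks(A)
    s = max(abs(c) + r for c, r in disks)   # largest absolute row sum
    t = min(abs(c) - r for c, r in disks)   # assumed meaning of t
    return max(0, t), s

# Hypothetical symmetric example with eigenvalues 3 and -1, so rho(A) = 3.
A = [[1, 2],
     [2, 1]]
print(gerschgorin_disks(A))        # both rows give center 1, radius 2
print(spectral_radius_bounds(A))   # lower bound 0, upper bound 3
```

For this example the upper bound is attained, while the lower bound is trivial because the disks contain the origin.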
10.3.26. (a) The absolute row sums of $A$ are bounded by $s_i = \displaystyle\sum_{j=1}^n | a_{ij} | < 1$, and so $\rho(A) \le s = \max s_i < 1$ by Exercise 10.3.25. (b) $A = \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix}$ has eigenvalues 0, 1 and hence $\rho(A) = 1$.

10.3.27. Using Exercise 10.3.25, we find $\rho(A) \le s = \max\{\, s_1, \ldots, s_n \,\} \le n\, a^\star$.

10.3.28. For instance, any diagonal matrix whose diagonal entries satisfy 0 < | a ii | < 1.
!

1 2
is a counterexample to (a), while
10.3.29. Both false.
2 5
to (b). However, see Exercise 10.3.30.

1
0

0
1

is a counterexample

10.3.30. The eigenvalues of K are real by Theorem 8.20. The ith Gerschgorin disk is centered
at kii > 0 and by diagonal dominance its radius is less than the distance from its center to
the origin. Therefore, all eigenvalues of K must be positive and hence, by Theorem 8.23,
K > 0.
10.3.31.

(a) For example, $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ has Gerschgorin domain $| z | \le 1$.
(b) No; see the proof of Theorem 10.37.

10.3.32. The $i$th Gerschgorin disk is centered at $a_{ii} < 0$ and, by diagonal dominance, its radius is less than the distance from its center to the origin. Therefore, all eigenvalues of $A$ lie in the left half plane: $\operatorname{Re} \lambda < 0$, which, by Theorem 9.15, implies asymptotic stability of the differential equation.

10.4.1. (a) Not a transition matrix; (b) not a transition matrix; (c) regular transition matrix:

9
8
17 , 17

; (d) regular transition matrix:

matrix; (f ) regular transition matrix:


T

1 1 1
3, 3, 3

1 5
6, 6

; (e) not a regular transition

; (g) regular transition matrix:

( .2415, .4348, .3237 ) ; (h) not a transition matrix; (i) regular transition matrix:

6
4
3
13 , 13 , 13

225
235
290
251
1001 , 1001 , 1001 , 1001

= ( .4615, .3077, .2308 )T ; (j) not a regular transition matrix;


(k) not a transition matrix; (l) regular transition matrix (A4 has all positive entries):
T

= ( .250749, .224775, .234765, .28971 )T ; (m) regular transition

matrix: ( .2509, .2914, .1977, .2600 )T .


10.4.2. (a) 20.5%; (b) 9.76% farmers, 26.83% laborers, 63.41% professionals
10.4.3. 2004: 37,000 city, 23,000 country; 2005: 38,600 city, 21,400 country; 2006: 39,880
city, 20,120 country; 2007: 40,904 city, 19,096 country; 2008: 41,723 city, 18,277 country;
Eventual: 45,000 in the city and 15,000 in the country.
10.4.4. 58.33% of the nights.
10.4.5. When in Atlanta he always goes to Boston; when in Boston he has a 50% probability of going to either Atlanta or Chicago; when in Chicago he has a 50% probability of going to either Atlanta or Boston. The transition matrix is regular because
$$T^4 = \begin{pmatrix} .375 & .3125 & .3125 \\ .25 & .5625 & .5 \\ .375 & .125 & .1875 \end{pmatrix}$$
has all positive entries. On average he visits Atlanta 33.33%, Boston 44.44%, and Chicago 22.22% of the time.
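The regularity check and the long-term averages can be reproduced with a short computation. This is a sketch under stated assumptions: the matrix $T$ below is assembled from the verbal description, with columns indexed by the current city in the order Atlanta, Boston, Chicago, and the helper names are chosen here.

```python
from fractions import Fraction as F

# Column j holds the probabilities of moving from city j (Atlanta, Boston, Chicago).
T = [[F(0), F(1, 2), F(1, 2)],
     [F(1), F(0), F(1, 2)],
     [F(0), F(1, 2), F(0)]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, k):
    """A raised to the positive integer power k by repeated multiplication."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result

def limiting_vector(A, steps=200):
    """Iterate u -> A u in floating point, starting from the first basis vector."""
    n = len(A)
    M = [[float(x) for x in row] for row in A]
    u = [1.0] + [0.0] * (n - 1)
    for _ in range(steps):
        u = [sum(M[i][j] * u[j] for j in range(n)) for i in range(n)]
    return u

T4 = matpow(T, 4)
print(all(x > 0 for row in T4 for x in row))  # True: T is regular
print(limiting_vector(T))                     # close to (1/3, 4/9, 2/9)
```

Since the remaining eigenvalues of this $T$ have modulus $\frac12$, a couple of hundred iterations are far more than enough for the iterates to settle.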


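10.4.6. The transition matrix $T = \begin{pmatrix} 0 & \frac23 & \frac23 \\ 1 & 0 & \frac13 \\ 0 & \frac13 & 0 \end{pmatrix}$ is regular because $T^4 = \begin{pmatrix} \frac{14}{27} & \frac{26}{81} & \frac{26}{81} \\ \frac29 & \frac{49}{81} & \frac{16}{27} \\ \frac{7}{27} & \frac{2}{27} & \frac{7}{81} \end{pmatrix}$ has all positive entries. She visits branch A 40% of the time, branch B 45% and branch C 15%.
10.4.7. 25% red, 50% pink, 25% white.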
10.4.8. If u(0) = ( a, b )T is the initial state vector, then the subsequent state vectors switch
back and forth between ( b, a )T and ( a, b )T . At each step in the process, all of the population in state 1 goes to state 2 and vice versa, so the system never settles down.
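10.4.9. This is not a regular transition matrix, so we need to analyze the iterative process directly. The eigenvalues of $A$ are $\lambda_1 = \lambda_2 = 1$ and $\lambda_3 = \frac12$, with corresponding eigenvectors $v_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$, $v_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$, and $v_3 = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}$. Thus, the solution with $u^{(0)} = \begin{pmatrix} p_0 \\ q_0 \\ r_0 \end{pmatrix}$ is
$$u^{(n)} = \left( p_0 + \tfrac12 q_0 \right) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \left( \tfrac12 q_0 + r_0 \right) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} - \frac{q_0}{2^{n+1}} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \;\longrightarrow\; \begin{pmatrix} p_0 + \frac12 q_0 \\ 0 \\ \frac12 q_0 + r_0 \end{pmatrix}.$$
Therefore, this breeding process eventually results in a population with individuals of genotypes AA and aa only, the proportions of each depending upon the initial population.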
10.4.10. Numbering the vertices from top to bottom and left to right, the transition matrix is
$$T = \begin{pmatrix} 0 & \frac14 & \frac14 & 0 & 0 & 0 \\ \frac12 & 0 & \frac14 & \frac12 & \frac14 & 0 \\ \frac12 & \frac14 & 0 & 0 & \frac14 & \frac12 \\ 0 & \frac14 & 0 & 0 & \frac14 & 0 \\ 0 & \frac14 & \frac14 & \frac12 & 0 & \frac12 \\ 0 & 0 & \frac14 & 0 & \frac14 & 0 \end{pmatrix}.$$
The probability eigenvector is $\left( \frac19, \frac29, \frac29, \frac19, \frac29, \frac19 \right)^T$ and so the bug spends, on average, twice as much time at the edge vertices as at the corner vertices.
10.4.11. Numbering the vertices from top to bottom and left to right, the transition matrix is
$$T = \begin{pmatrix}
0 & \frac14 & \frac14 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\frac12 & 0 & \frac14 & \frac14 & \frac16 & 0 & 0 & 0 & 0 & 0 \\
\frac12 & \frac14 & 0 & 0 & \frac16 & \frac14 & 0 & 0 & 0 & 0 \\
0 & \frac14 & 0 & 0 & \frac16 & 0 & \frac12 & \frac14 & 0 & 0 \\
0 & \frac14 & \frac14 & \frac14 & 0 & \frac14 & 0 & \frac14 & \frac14 & 0 \\
0 & 0 & \frac14 & 0 & \frac16 & 0 & 0 & 0 & \frac14 & \frac12 \\
0 & 0 & 0 & \frac14 & 0 & 0 & 0 & \frac14 & 0 & 0 \\
0 & 0 & 0 & \frac14 & \frac16 & 0 & \frac12 & 0 & \frac14 & 0 \\
0 & 0 & 0 & 0 & \frac16 & \frac14 & 0 & \frac14 & 0 & \frac12 \\
0 & 0 & 0 & 0 & 0 & \frac14 & 0 & 0 & \frac14 & 0
\end{pmatrix}.$$
The probability eigenvector is $\left( \frac1{18}, \frac19, \frac19, \frac19, \frac16, \frac19, \frac1{18}, \frac19, \frac19, \frac1{18} \right)^T$ and so the bug spends, on average, twice as much time on the edge vertices and three times as much time at the center vertex as at the corner vertices.


10.4.12. The transition matrix
$$T = \begin{pmatrix}
0 & \frac13 & 0 & \frac13 & 0 & 0 & 0 & 0 & 0 \\
\frac12 & 0 & \frac12 & 0 & \frac14 & 0 & 0 & 0 & 0 \\
0 & \frac13 & 0 & 0 & 0 & \frac13 & 0 & 0 & 0 \\
\frac12 & 0 & 0 & 0 & \frac14 & 0 & \frac12 & 0 & 0 \\
0 & \frac13 & 0 & \frac13 & 0 & \frac13 & 0 & \frac13 & 0 \\
0 & 0 & \frac12 & 0 & \frac14 & 0 & 0 & 0 & \frac12 \\
0 & 0 & 0 & \frac13 & 0 & 0 & 0 & \frac13 & 0 \\
0 & 0 & 0 & 0 & \frac14 & 0 & \frac12 & 0 & \frac12 \\
0 & 0 & 0 & 0 & 0 & \frac13 & 0 & \frac13 & 0
\end{pmatrix}$$
is not regular. Indeed, if, say, the bug starts out at a corner, then after an odd number of steps it can only be at one of the edge vertices, while after an even number of steps it will be either at a corner vertex or the center vertex. Thus, the iterates $u^{(n)}$ do not converge. If the bug starts at vertex $i$, so $u^{(0)} = e_i$, after a while, the probability vectors $u^{(n)}$ end up switching back and forth between the probability vectors $v = T w = \left( 0, \frac14, 0, \frac14, 0, \frac14, 0, \frac14, 0 \right)^T$, where the bug has an equal probability of being at any edge vertex, and $w = T v = \left( \frac16, 0, \frac16, 0, \frac13, 0, \frac16, 0, \frac16 \right)^T$, where the bug is either at a corner vertex or, twice as likely, at the middle vertex.

10.4.13. The $i$th column of $T^k$ is $u_i^{(k)} = T^k e_i \to v$ by Theorem 10.40.

10.4.14. In view of Exercise 10.4.13, the limit is $\begin{pmatrix} \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \\ \frac13 & \frac13 & \frac13 \end{pmatrix}$.

10.4.15. First, $v$ is a probability vector since the sum of its entries is $\dfrac{q}{p+q} + \dfrac{p}{p+q} = 1$. Moreover, $A v = \begin{pmatrix} \dfrac{(1-p)\,q + q\,p}{p+q} \\[2mm] \dfrac{p\,q + (1-q)\,p}{p+q} \end{pmatrix} = \begin{pmatrix} \dfrac{q}{p+q} \\[2mm] \dfrac{p}{p+q} \end{pmatrix} = v$, proving it is an eigenvector for eigenvalue 1.

10.4.16. All equal probabilities: $z = \left( \frac1n, \ldots, \frac1n \right)^T$.

10.4.17. $z = \left( \frac1n, \ldots, \frac1n \right)^T$.

10.4.18. False: $\begin{pmatrix} .3 & .5 & .2 \\ .3 & .2 & .5 \\ .4 & .3 & .3 \end{pmatrix}$ is a counterexample.

10.4.19. False. For instance, if $T = \begin{pmatrix} \frac12 & \frac13 \\ \frac12 & \frac23 \end{pmatrix}$, then $T^{-1} = \begin{pmatrix} 4 & -2 \\ -3 & 3 \end{pmatrix}$, while $T = \begin{pmatrix} \frac12 & \frac12 \\ \frac12 & \frac12 \end{pmatrix}$ is not even invertible.

10.4.20. False. For instance, $0$ is not a probability vector.

10.4.21. (a) The 1 norm.
10.4.22. True. If $v = ( v_1, v_2, \ldots, v_n )^T$ is a probability eigenvector, then $\displaystyle\sum_{i=1}^n v_i = 1$ and $\displaystyle\sum_{j=1}^n t_{ij} v_j = \lambda\, v_i$ for all $i = 1, \ldots, n$. Summing the latter equations over $i$, we find
$$\lambda = \lambda \sum_{i=1}^n v_i = \sum_{i=1}^n \sum_{j=1}^n t_{ij} v_j = \sum_{j=1}^n v_j = 1,$$
since the column sums of a transition matrix are all equal to 1.


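10.4.23. (a) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$; (b) $\begin{pmatrix} 0 & \frac12 \\ 1 & \frac12 \end{pmatrix}$.

10.4.24. The $i$th entry of $v = T u$ is $v_i = \displaystyle\sum_{j=1}^n t_{ij} u_j$. Since each $t_{ij} \ge 0$ and $u_j \ge 0$, the sum $v_i \ge 0$ also. Moreover, $\displaystyle\sum_{i=1}^n v_i = \sum_{i,j=1}^n t_{ij} u_j = \sum_{j=1}^n u_j = 1$ because all the column sums of $T$ are equal to 1, and $u$ is a probability vector.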

10.4.25.
(a) The columns of T S are obtained by multiplying T by the columns of S. Since S is a
transition matrix, its columns are probability vectors. Exercise 10.4.24 shows that each
column of T S is also a probability vector, and so the product is a transition matrix.
(b) This follows by induction from part (a), where we write T k+1 = T T k .

10.5.1.
(a) The eigenvalues are 12 , 13 , so (T ) = 12 .

(b) The iterates will converge to the fixed point

16 , 1

at rate

1
2.

Asymptotically, they

come in to the fixed point along the direction of the dominant eigenvector ( 3, 2 ) T .
10.5.2.
(a) (T ) = 2; the iterates diverge: k u(k) k at a rate of 2.
(b) (T ) = 34 ; the iterates converge to the fixed point ( 1.6, .8, 7.2 )T at a rate
dominant eigenvector direction ( 1, 2, 6 )T .
(c) (T ) = 12 ; the iterates converge to the fixed point ( 1, .4, 2.6 )T at a rate
dominant eigenvector direction ( 0, 1, 1 )T .
10.5.3. (a,b,e,g) are diagonally dominant.
10.5.4. (a) x = 17 = .142857, y = 72 = .285714; (b) x = 30, y = 48;
(e) x = 1.9172, y = .339703, z = 2.24204;
(g) x = .84507, y = .464789, z = .450704;
10.5.5. (c) Jacobi spectral radius = .547723, so Jacobi converges to the solution
x = 78 = 1.142857, y = 19
7 = 2.71429;
(d ) Jacobi spectral radius = .5, so Jacobi converges to the solution
13
2
x = 10
9 = 1.1111, y = 9 = 1.4444, z = 9 = .2222;
(f ) Jacobi spectral radius = 1.1180, so Jacobi does not converge.
!

.3333
4
.7857
C
, (c) u = B
, (b) u =
10.5.6. (a) u =
@ 1.0000 A,
5
.3571
1.3333
0
1
0
1
0
1
.8750
0.
.7273
B
B
C
.1250 C
C
C
B .7143 C
B
C,
B
C.
(d) u = B
(e) u = B
(f
)
u
=
@ 3.1818 A,
@ .1250 A
@ .1429 A
.6364
.1250
.2857

3
4,

along the

1
2,

along the

10.5.7.
(a) $| c | > 2$.
(b) If $c = 0$, then $D = c\, I = O$, and Jacobi iteration isn't even defined. Otherwise, $T = - D^{-1} (L + U)$ is tridiagonal with diagonal entries all 0 and sub- and super-diagonal entries equal to $1/c$. According to Exercise 8.2.48, the eigenvalues are $\dfrac{2}{c} \cos \dfrac{k\,\pi}{n+1}$ for $k = 1, \ldots, n$, and so the spectral radius is $\rho(T) = \dfrac{2}{| c |} \cos \dfrac{\pi}{n+1}$. Thus, convergence requires $| c | > 2 \cos \dfrac{\pi}{n+1}$; in particular, $| c | \ge 2$ will ensure convergence for any $n$.
(c) For $n = 5$, the solution is $u = ( .8333, .6667, .5000, .3333, .1667 )^T$, with a convergence rate of $\rho(T) = \cos \frac{\pi}{6} = .8660$. It takes 51 iterations to obtain 3 decimal place accuracy, while $\log(.5 \times 10^{-3}) / \log \rho(T) \approx 53$.
For $n = 10$, the solution is $u = ( .9091, .8182, .7273, .6364, .5455, .4545, .3636, .2727, .1818, .0909 )^T$, with a convergence rate of $\cos \frac{\pi}{11} = .9595$. It takes 173 iterations to obtain 3 decimal place accuracy, while $\log(.5 \times 10^{-3}) / \log \rho(T) \approx 184$.
For $n = 20$, $u = ( .9524, .9048, .8571, .8095, .7619, .7143, .6667, .6190, .5714, .5238, .4762, .4286, .3810, .3333, .2857, .2381, .1905, .1429, .0952, .0476 )^T$, with a convergence rate of $\cos \frac{\pi}{21} = .9888$. It takes 637 iterations to obtain 3 decimal place accuracy, while $\log(.5 \times 10^{-3}) / \log \rho(T) \approx 677$.
10.5.8. If $A u = 0$, then $D u = - (L + U)\, u$, and hence $T u = - D^{-1} (L + U)\, u = u$, proving that $u$ is an eigenvector for $T$ with eigenvalue 1. Therefore, $\rho(T) \ge 1$, which implies that $T$ is not a convergent matrix.
10.5.9. If $A$ is nonsingular, then at least one of the terms in the general determinant expansion (1.85) is nonzero. If $a_{1,\pi(1)}\, a_{2,\pi(2)} \cdots a_{n,\pi(n)} \ne 0$ then each $a_{i,\pi(i)} \ne 0$. Applying the permutation $\pi$ to the rows of $A$ will produce a matrix whose diagonal entries are all nonzero.
10.5.10. Assume, for simplicity, that $T$ is complete with a single dominant eigenvalue $\lambda_1$, so that $\rho(T) = | \lambda_1 |$ and $| \lambda_1 | > | \lambda_j |$ for $j > 1$. We expand the initial error $e^{(0)} = c_1 v_1 + \cdots + c_n v_n$ in terms of the eigenvectors. Then $e^{(k)} = T^k e^{(0)} = c_1 \lambda_1^k v_1 + \cdots + c_n \lambda_n^k v_n$, which, for $k \gg 0$, is approximately $e^{(k)} \approx c_1 \lambda_1^k v_1$. Thus, $\| e^{(k+j)} \| \approx \rho(T)^j\, \| e^{(k)} \|$. In particular, if at iteration number $k$ we have $m$ decimal places of accuracy, so $\| e^{(k)} \| \le .5 \times 10^{-m}$, then, approximately, $\| e^{(k+j)} \| \le .5 \times 10^{-m + j \log_{10} \rho(T)} = .5 \times 10^{-m-1}$ provided $j = - 1 / \log_{10} \rho(T)$.
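The estimate $j = -1/\log_{10} \rho(T)$ is easy to tabulate. The sketch below is illustrative: the function names are chosen here, and the sample spectral radii are simply convenient test values.

```python
import math

def iterations_per_digit(rho):
    """Approximate number of iterations needed to gain one decimal place."""
    return -1.0 / math.log10(rho)

def iterations_for_accuracy(rho, digits):
    """Smallest j with rho**j <= .5 * 10**(-digits), assuming an initial error of size 1."""
    return math.ceil(math.log(0.5 * 10.0 ** (-digits)) / math.log(rho))

for rho in (0.5, math.cos(math.pi / 6), 0.9969):
    print(rho, iterations_per_digit(rho), iterations_for_accuracy(rho, 3))
```

A spectral radius of $.9969$ already costs roughly 740 iterations per decimal place, which illustrates how sharply the work grows as $\rho(T)$ approaches 1.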

10.5.11. False for elementary row operations of types 1 & 2, but true for those of type 3.

10.5.12.
(a) x =

(b) x(1)

0 7
B 23
B 6
@ 23
40
23
0

1
C
C
A

.30435
C
=B
@ .26087 A;
1.73913
1

.5
.4375
.390625
.0862772
C
C
C
C
(2)
(3)
(3)
=B
=B
=B
=B
@ .25 A, x
@ .0625 A, x
@ .3125
A, with error e
@ .0516304 A;
1.75
1.8125
1.65625
.0828804


(c) x(k+1) =
0

0
B
B
B
@

1
4
1
4

14
0
14

1
2
41

C
C (k)
Cx
A
0

1
1
B2 C
C
B 1 C;
+B
@ 4A
7
4
1

.274902
.484375
.5
C
C
(3)
(2)
=B
=B
(d) x
= .375 C
A, x
@ .245728 A; the error at the third
@ .316406 A, x
1.74271
1.70801
1.78125
0
1
.029446
C
iteration is e(3) = B
@ .015142 A, which is about 30% of the Jacobi error;
.003576
(1)

(e) x(k+1)

B
@

B0
B0
=B
@
0

(f ) (TJ ) =

41
1
16
3
64

1
2
3
8
1
32

1
C
B2
C (k)
B 3
Cx
+B
A
@8
57
32

3
4 =.433013,
= 3+64 73 = .180375,

C
C
C;
A

(g) (TGS )
so GaussSeidel converges about log GS / log J = 2.046
times as fast.
(h) Approximately log(.5 106 )/ log GS 8.5 iterations.
(i) Under GaussSeidel, x(9)

1.0475
.304347
C
C
(9)
=B
= 106 B
@ .4649 A.
@ .260869 A, with error e
.1456
1.73913

10.5.13. (a) x = 71 = .142857, y = 72 = .285714; (b) x = 30, y = 48;


(e) x = 1.9172, y = .339703, z = 2.24204;
(g) x = .84507, y = .464789, z = .450704;

10.5.14. (a) $\rho_J = .2582$, $\rho_{GS} = .0667$; (b) $\rho_J = .7303$, $\rho_{GS} = .5333$; (c) $\rho_J = .5477$, $\rho_{GS} = .3$; (d) $\rho_J = .5$, $\rho_{GS} = .2887$; (e) $\rho_J = .4541$, $\rho_{GS} = .2887$; (f) $\rho_J = .3108$, $\rho_{GS} = .1667$; (g) $\rho_J = 1.118$, $\rho_{GS} = .7071$. Thus, all systems lead to convergent Gauss-Seidel schemes, with faster convergence than Jacobi (which doesn't even converge in case (g)).
10.5.15.
(a) Solution: $u = ( .7857, .3571 )^T$; spectral radii: $\rho_J = \frac{1}{\sqrt{15}} = .2582$, $\rho_{GS} = \frac{1}{15} = .06667$, so Gauss-Seidel converges exactly twice as fast;
(b) Solution: $u = ( 4, 5 )^T$; spectral radii: $\rho_J = \frac{1}{\sqrt2} = .7071$, $\rho_{GS} = \frac12 = .5$, so Gauss-Seidel converges exactly twice as fast;
(c) Solution: $u = ( .3333, 1.0000, 1.3333 )^T$; spectral radii: $\rho_J = .7291$, $\rho_{GS} = .3104$, so Gauss-Seidel converges $\log \rho_{GS} / \log \rho_J = 3.7019$ times as fast;
(d) Solution: $u = ( .7273, 3.1818, .6364 )^T$; spectral radii: $\rho_J = \frac{2}{\sqrt{15}} = .5164$, $\rho_{GS} = \frac{4}{15} = .2667$, so Gauss-Seidel converges exactly twice as fast;
(e) Solution: $u = ( .8750, .1250, .1250, .1250 )^T$; spectral radii: $\rho_J = .6$, $\rho_{GS} = .1416$, so Gauss-Seidel converges $\log \rho_{GS} / \log \rho_J = 3.8272$ times as fast;
(f) Solution: $u = ( 0., .7143, .1429, .2857 )^T$; spectral radii: $\rho_J = .4714$, $\rho_{GS} = .3105$, so Gauss-Seidel converges $\log \rho_{GS} / \log \rho_J = 1.5552$ times as fast.
10.5.16. (a) | c | > 2; (b) c > 1.61804; (c) same answer; (d) Gauss-Seidel converges exactly twice as fast since $\rho_{GS} = \rho_J^2$ for all values of c.
10.5.17. The solution is x = .083799, y = .21648, z = 1.21508. The Jacobi spectral radius is .8166, and so it converges reasonably rapidly to the solution; indeed, after 50 iterations, $x^{(50)} = .0838107$, $y^{(50)} = .216476$, $z^{(50)} = 1.21514$. On the other hand, the Gauss-Seidel spectral radius is 1.0994, and it slowly diverges; after 50 iterations, $x^{(50)} = 30.5295$, $y^{(50)} = 9.07764$, $z^{(50)} = 90.8959$.
10.5.18. The solution is x = y = z = w = 1. Gauss-Seidel converges, but extremely slowly. Starting with the initial guess $x^{(0)} = y^{(0)} = z^{(0)} = w^{(0)} = 0$, after 2000 iterations, the approximate solution $x^{(2000)} = 1.00281$, $y^{(2000)} = .99831$, $z^{(2000)} = .999286$, $w^{(2000)} = 1.00042$, is correct to 2 decimal places. The spectral radius is .9969 and so it takes, on average, 741 iterations per decimal place.
10.5.19. $\rho(T_J) = 0$, while $\rho(T_{GS}) = 2$. Thus Jacobi converges extremely rapidly, whereas Gauss-Seidel diverges.
10.5.20. Jacobi doesn't converge because its spectral radius is 3.4441. Gauss-Seidel converges, but extremely slowly, since its spectral radius is .999958.
10.5.21. For a general matrix, both Jacobi and Gauss-Seidel require $k\,n(n-1)$ multiplications and $k\,n(n-1)$ additions to perform k iterations, along with $n^2$ divisions to set up the initial matrix T and vector c. They are more efficient than Gaussian Elimination provided the number of steps $k < \frac13 n$ (approximately).
10.5.22. (a) Diagonal dominance requires | z | > 4; (b) The solution is u = (.0115385, .0294314, .0755853, .0536789, .31505, .0541806, .0767559, .032107, .0140468, .0115385)$^T$. It takes 41 Jacobi iterations and 6 Gauss-Seidel iterations to compute the first three decimal places of the solution. (c) Computing the spectral radius, we conclude that the Jacobi scheme converges to the solution whenever | z | > 3.6387, while the Gauss-Seidel scheme converges for z < -3.6386 or z > 2.
10.5.23.
(a) If $\lambda$ is an eigenvalue of $T = I - A$, then $\mu = 1 - \lambda$ is an eigenvalue of A, and hence the eigenvalues of A must satisfy $| 1 - \mu | < 1$, i.e., they all lie within a distance 1 of 1.
(b) The Gerschgorin disks are
$D_1 = \{ | z - .8 | \le .2 \}$, $D_2 = \{ | z - 1.5 | \le .3 \}$, $D_3 = \{ | z - 1 | \le .3 \}$,
and hence all eigenvalues of A are within a distance 1 of 1. Indeed, we can explicitly compute the eigenvalues of A, which are
$\mu_1 = 1.5026$, $\mu_2 = .8987 + .1469\,i$, $\mu_3 = .8987 - .1469\,i$.
Hence, the spectral radius of $T = I - A$ is $\rho(T) = \max \{ | 1 - \mu_j | \} = .5026$. Starting the iterations with $u^{(0)} = 0$, we arrive at the solution $u = ( 1.36437, .73836, 1.65329 )^T$ to 4 decimal places after 13 iterations.

10.5.24.
(a) $u = ( 1.4, .2 )^T$.
(b) The spectral radius is $\rho_J = .40825$ and so it takes about $-1/\log_{10} \rho_J \approx 2.57$ iterations to produce each additional decimal place of accuracy.
(c) The spectral radius is $\rho_{GS} = .16667$ and so it takes about $-1/\log_{10} \rho_{GS} \approx 1.29$ iterations to produce each additional decimal place of accuracy.
(d) $u^{(n+1)} = \begin{pmatrix} 1 - \omega & -\frac12\,\omega \\ -\frac13 (1 - \omega)\,\omega & \frac16\,\omega^2 - \omega + 1 \end{pmatrix} u^{(n)} + \begin{pmatrix} \frac32\,\omega \\ \frac23\,\omega - \frac12\,\omega^2 \end{pmatrix}$.
(e) The SOR spectral radius is minimized when the two eigenvalues of $T_\omega$ coincide, which occurs when $\omega_\star = 1.04555$, at which value $\rho_\star = \omega_\star - 1 = .04555$, so the optimal SOR method is almost 3.5 times as fast as Jacobi, and about 1.7 times as fast as Gauss-Seidel.
(f) For Jacobi, about $-5/\log_{10} \rho_J \approx 13$ iterations; for Gauss-Seidel, about $-5/\log_{10} \rho_{GS} = 7$ iterations; for optimal SOR, about $-5/\log_{10} \rho_{SOR} \approx 4$ iterations.
(g) To obtain 5 decimal place accuracy, Jacobi requires 12 iterations, Gauss-Seidel requires 6 iterations, while optimal SOR requires 5 iterations.
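
The predicted iteration counts above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the relations $\rho_{GS} = \rho_J^2$ and the optimal-parameter formula (10.86) apply to the system at hand (they do not always, as later exercises show); the function names are illustrative and not taken from the text.

```python
import math

def gauss_seidel_radius(rho_j):
    # Assumed relation rho_GS = rho_J^2 (holds, e.g., for tridiagonal systems).
    return rho_j ** 2

def optimal_sor(rho_j):
    # Formula (10.86): omega = 2 / (1 + sqrt(1 - rho_J^2)), with rho_SOR = omega - 1.
    omega = 2.0 / (1.0 + math.sqrt(1.0 - rho_j ** 2))
    return omega, omega - 1.0

def iterations_per_digit(rho):
    # Each iteration multiplies the error by roughly rho, so one decimal
    # digit of accuracy costs about -1 / log10(rho) iterations.
    return -1.0 / math.log10(rho)

rho_j = 1.0 / math.sqrt(6.0)          # .40825, the Jacobi radius quoted above
omega_star, rho_sor = optimal_sor(rho_j)
print(omega_star, rho_sor)
print(iterations_per_digit(rho_j),
      iterations_per_digit(gauss_seidel_radius(rho_j)),
      iterations_per_digit(rho_sor))
```

Multiplying the per-digit figures by 5 gives the estimates for 5 decimal place accuracy.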

10.5.25. (a) x = .5, y = .75, z = .25, w = .5. (b) To obtain 5 decimal place accuracy, Jacobi requires 14 iterations, Gauss-Seidel requires 8 iterations. One can get very good approximations of the spectral radii $\rho_J = .5$, $\rho_{GS} = .25$, by taking ratios of entries of successive iterates, or the ratio of norms of successive error vectors. (c) The optimal SOR scheme has $\omega = 1.0718$, and requires 6 iterations to get 5 decimal place accuracy. The SOR spectral radius is $\rho_{SOR} = .0718$.

10.5.26. (a) $\rho_J = \frac{1 + \sqrt5}{4} = .809017$, $\rho_{GS} = \frac{3 + \sqrt5}{8} = .654508$; (b) no; (c) $\omega_\star = 1.25962$ and $\rho_\star = .25962$; (d) The solution is $x = ( .8, .6, .4, .2 )^T$. Jacobi: predicted 44 iterations; actual 45 iterations. Gauss-Seidel: predicted 22 iterations; actual 22 iterations. Optimal SOR: predicted 7 iterations; actual 9 iterations.

10.5.27. (a) $\rho_J = \frac{1 + \sqrt5}{4} = .809017$, $\rho_{GS} = \frac{3 + \sqrt5}{8} = .654508$; (b) no; (c) $\omega_\star = 1.25962$ and $\rho_\star = 1.51315$, so SOR with that value of $\omega$ doesn't converge! However, by numerically computing the spectral radius of $T_\omega$, the optimal value is found to be $\omega_\star = .874785$ (under-relaxation), with $\rho_\star = .125215$. (d) The solution is $x = ( .413793, .172414, .0689655, .0344828 )^T$. Jacobi: predicted 44 iterations; actual 38 iterations. Gauss-Seidel: predicted 22 iterations; actual 19 iterations. Optimal SOR: predicted 5 iterations; actual 5 iterations.
10.5.28. (a) The Jacobi iteration matrix $T_J = -D^{-1}(L + U)$ is tridiagonal with all 0's on the main diagonal and $\frac12$'s on the sub- and super-diagonals. Thus, using Exercise 8.2.47, $\rho_J = \cos \frac{\pi}{9} < 1$, and so Jacobi converges. (b) $\omega_\star = \dfrac{2}{1 + \sin \frac{\pi}{9}} = 1.49029$. Since $\rho_\star = .490291$, it takes $\dfrac{\log \rho_\star}{\log \rho_J} \approx 11.5$ Jacobi steps per SOR step. (c) The solution is
$u = \left( \frac89, \frac79, \frac23, \frac59, \frac49, \frac13, \frac29, \frac19 \right)^T = ( .888889, .7778, .6667, .5556, .4444, .3333, .2222, .1111 )^T$.
Starting with $u^{(0)} = 0$, it takes 116 Jacobi iterations versus 13 SOR iterations to achieve 3 place accuracy.

10.5.29. The optimal value for SOR is $\omega = 1.80063$, with spectral radius $\rho_{SOR} = .945621$. Starting with $x^{(0)} = y^{(0)} = z^{(0)} = w^{(0)} = 0$, it takes 191 iterations to obtain 2 decimal place accuracy in the solution. Each additional decimal place requires about $-1/\log_{10} \rho_{SOR} \approx 41$ iterations, which is about 18 times as fast as Gauss-Seidel.

10.5.30. The Jacobi and Gauss-Seidel spectral radii are $\rho_J = \frac{\sqrt7}{3} = .881917$, $\rho_{GS} = \frac79 = .777778$, respectively. It takes 99 Jacobi iterations versus 6 Gauss-Seidel iterations to obtain the solution with 5 decimal place accuracy. Using (10.86), the optimal SOR parameter is $\omega_\star = 1.35925$, with spectral radius $\rho_\star = .359246$. However, it takes 16 iterations to obtain the solution with 5 decimal place accuracy, which is significantly slower than Gauss-Seidel, which converges much faster than it should, owing to the particular right hand side of the linear system.
10.5.31.
(a) u = ( .0625, .125, .0625, .125, .375, .125, .0625, .125, .0625 )T .
(b) It takes 11 Jacobi iterations to compute the first two decimal places of the solution, and
17 iterations for 3 place accuracy.
(c) It takes 6 Gauss-Seidel iterations to compute the first two decimal places of the solution, and 9 iterations for 3 place accuracy.
(d) $\rho_J = \frac{1}{\sqrt2}$, and so, by (10.86), the optimal SOR parameter is $\omega_\star = 1.17157$. It takes only 4 iterations for 2 decimal place accuracy, and 6 iterations for 3 places.

10.5.32. Using (10.86), the optimal SOR parameter is
$\omega_\star = \dfrac{2}{1 + \sqrt{1 - \rho_J^2}} = \dfrac{2}{1 + \sin \frac{\pi}{n+1}}$.
For the n = 5 system, $\rho_J = \frac{\sqrt3}{2}$, and $\omega_\star = \frac43$ with $\rho_\star = \frac13$, and the convergence is about 8 times as fast as Jacobi, and 4 times as fast as Gauss-Seidel. For the n = 25 system, $\rho_J = .992709$, and $\omega_\star = 1.78486$ with $\rho_\star = .78486$, and the convergence is about 33 times as fast as Jacobi, and 16.5 times as fast as Gauss-Seidel.

10.5.33. The Jacobi spectral radius is $\rho_J = .909657$. Using (10.86) to fix the SOR parameter $\omega = 1.41307$ actually slows down the convergence since $\rho_{SOR} = .509584$ while $\rho_{GS} = .32373$. Computing the spectral radius directly, the optimal SOR parameter is $\omega_\star = 1.17157$ with $\rho_\star = .290435$. Thus, optimal SOR is about 13 times as fast as Jacobi, but only marginally faster than Gauss-Seidel.
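10.5.34. The two eigenvalues
$\lambda_1 = \frac18 \left( \omega^2 - 8\omega + 8 + \omega \sqrt{\omega^2 - 16\omega + 16}\, \right)$, $\lambda_2 = \frac18 \left( \omega^2 - 8\omega + 8 - \omega \sqrt{\omega^2 - 16\omega + 16}\, \right)$
are real for $0 \le \omega \le 8 - 4\sqrt3$. A graph of the modulus of the eigenvalues over the range $0 \le \omega \le 2$ reveals that, as $\omega$ increases, the smaller eigenvalue is increasing and the larger decreasing until they meet at $\omega = 8 - 4\sqrt3$; after this point, both eigenvalues are complex conjugates of the same modulus.
[Graph: moduli of the two eigenvalues plotted against $\omega$ for $0 \le \omega \le 2$.]
To prove this analytically, we compute
$\dfrac{d\lambda_2}{d\omega} = \dfrac{\omega - 4}{4} - \dfrac{\omega^2 - 12\omega + 8}{4 \sqrt{\omega^2 - 16\omega + 16}} > 0$
for $1 < \omega < 8 - 4\sqrt3$, and so the smaller eigenvalue is increasing. Furthermore,
$\dfrac{d\lambda_1}{d\omega} = \dfrac{\omega - 4}{4} + \dfrac{\omega^2 - 12\omega + 8}{4 \sqrt{\omega^2 - 16\omega + 16}} < 0$
on the same interval, so the larger eigenvalue is decreasing. Once $\omega > 8 - 4\sqrt3$, the eigenvalues are complex conjugates, of equal modulus $| \lambda_1 | = | \lambda_2 | = \omega - 1 > \omega_\star - 1$.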
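10.5.35.
(a) $u^{(k+1)} = u^{(k)} + D^{-1} r^{(k)} = u^{(k)} - D^{-1} A u^{(k)} + D^{-1} b$
$= u^{(k)} - D^{-1} (L + D + U) u^{(k)} + D^{-1} b = -D^{-1} (L + U) u^{(k)} + D^{-1} b$,
which agrees with (10.65).
(b) $u^{(k+1)} = u^{(k)} + (L + D)^{-1} r^{(k)} = u^{(k)} - (L + D)^{-1} A u^{(k)} + (L + D)^{-1} b$
$= u^{(k)} - (L + D)^{-1} (L + D + U) u^{(k)} + (L + D)^{-1} b$
$= -(L + D)^{-1} U u^{(k)} + (L + D)^{-1} b$,
which agrees with (10.71).
(c) $u^{(k+1)} = u^{(k)} + \omega (\omega L + D)^{-1} r^{(k)} = u^{(k)} - \omega (\omega L + D)^{-1} A u^{(k)} + \omega (\omega L + D)^{-1} b$
$= u^{(k)} - \omega (\omega L + D)^{-1} (L + D + U) u^{(k)} + \omega (\omega L + D)^{-1} b$
$= (\omega L + D)^{-1} \left( (1 - \omega) D - \omega U \right) u^{(k)} + \omega (\omega L + D)^{-1} b$,
which agrees with (10.80).
(d) If $u^\star$ is the exact solution, so $A u^\star = b$, then $r^{(k)} = A (u^\star - u^{(k)})$ and so $\| u^{(k)} - u^\star \| \le \| A^{-1} \| \, \| r^{(k)} \|$. Thus, if $\| r^{(k)} \|$ is small, the iterate $u^{(k)}$ is close to the solution $u^\star$ provided $\| A^{-1} \|$ is not too large. For instance, if $A = \begin{pmatrix} 1 & 0 \\ 0 & .0001 \end{pmatrix}$ and $b = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, then $x = \begin{pmatrix} 1 \\ 100 \end{pmatrix}$ has residual $r = b - A x = \begin{pmatrix} 0 \\ -.01 \end{pmatrix}$, even though x is nowhere near the exact solution $x^\star = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.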
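
The residual-correction forms in parts (a)-(c) translate directly into code. The following is a minimal sketch in plain Python, assuming a small dense matrix stored as a list of rows; the function names and the 2 x 2 test system are illustrative choices, not taken from the text. Setting `omega = 1` in `sor_step` gives the Gauss-Seidel update of part (b).

```python
def residual(A, b, u):
    """r = b - A u for a dense matrix stored as a list of rows."""
    return [bi - sum(aij * uj for aij, uj in zip(row, u))
            for row, bi in zip(A, b)]

def jacobi_step(A, b, u):
    """One Jacobi step in residual form: u + D^{-1} r."""
    r = residual(A, b, u)
    return [u[i] + r[i] / A[i][i] for i in range(len(u))]

def sor_step(A, b, u, omega=1.0):
    """One SOR step in residual form: u + omega (omega L + D)^{-1} r."""
    r = residual(A, b, u)
    d = [0.0] * len(u)
    for i in range(len(u)):
        # Forward substitution with the lower triangular matrix omega L + D.
        s = r[i] - omega * sum(A[i][j] * d[j] for j in range(i))
        d[i] = s / A[i][i]
    return [u[i] + omega * d[i] for i in range(len(u))]

# Hypothetical example system (exact solution 1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
u = [0.0, 0.0]
for _ in range(50):
    u = sor_step(A, b, u)   # omega = 1: Gauss-Seidel sweeps
print(u, residual(A, b, u))
```

Printing the residual alongside the iterate illustrates part (d): a small residual certifies accuracy only when $\| A^{-1} \|$ is moderate, as it is for this example.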
10.5.36. Note that the iteration matrix is T = I A, which has eigenvalues 1 j . When
2
2
, the iterations converge. The optimal value is =
, with spectral
0 < <
1
1 + n
n
.
radius (T ) = 1
1 + n

10.5.37.
In each solution,!the last uk is the!actual solution, with residual
rk = f K
u = 0.
! k
!
.78571
.07692
.76923
2
;
, u2 =
,
r1 =
, u1 =
(a) r0 =
.35714
.15385
.38462
1
0
1
0
1
0
1
0
1
1
.5
1
.51814
C
B
C
C
B
C
(b) r0 = B
r1 = B
@ 0 A, u 1 = @ 0 A ,
@ 2 A, u2 = @ .72539 A,
2
1 0
.5
1.94301
1
0
1
1.28497
1.
C
B
C
r2 = B
@ .80311 A, u3 = @ 1.4 A;
.64249
2.2
1
0
1
0
1
0
1
0
.13466
2.36658
.13466
1
C
B
C
C
B
C
r1 = B
(c) r0 = B
@ 4.01995 A, u2 = @ .26933 A,
@ 2 A, u1 = @ .26933 A,
.94264
.81047
.94264
7
0
1
0
1
.72321
.33333
C
B
C
r2 = B
@ .38287 A, u3 = @ 1.00000 A;
.21271
1.33333
1
0
1
0
1
0
1
0
.90654
1.2
.2
1
C
B
C
B
C
B
C
B
2C
B .46729 C
B .8 C
B .4 C
C,
B
C
B
C,
C, u 1 = B
B
,
u
=
r
=
(d) r0 = B
2
1
@ .33645 A
@ .8 A
@
@ 0A
0A
.57009
.4
.2
1


1.45794
4.56612
1.36993
9.50
B
B
C
B
.59813 C
.40985 C
1.25 C
C
B
C
B 1.11307 C
B
C
C, u 3 = B
C,
B
C
B
C;
r2 =
r
=
,
u
=
3
4
@ 2.92409 A
@ 3.59606 A
@ 10.25 A
.26168 A
2.65421 0 1
5.50820
.85621
13.00
0 1
0
1
0
1
4
.8
0.
.875
B C
B C
B
C
B
C
0C
B 0C
B .8 C
B .125 C
B C, u 1 = B C ,
B
C
B
C.
(e) r0 = B
r
=
,
u
=
1
2
@0A
@ 0A
@ .8 A
@ .125 A
0
0
.8
.125
B
B
B
@

10.5.38. Remarkably, after only two iterations, the method finds the exact solution: $u_3 = u^\star = ( .0625, .125, .0625, .125, .375, .125, .0625, .125, .0625 )^T$, and hence the convergence is dramatically faster than the other iterative methods.
10.5.39. (a) n = 5: b = ( 2.28333, 1.45, 1.09286, .884524, .745635 )T ;
n = 10: b = ( 2.92897, 2.01988, 1.60321, 1.3468, 1.16823, 1.0349, .930729, .846695, .777251, .718771 ) T ;
n = 30: b = (3.99499, 3.02725, 2.5585, 2.25546, 2.03488, 1.86345, 1.72456, 1.60873, 1.51004, 1.42457,
1.34957, 1.28306, 1.22353, 1.16986, 1.12116, 1.07672, 1.03596, .998411, .963689, .931466,
.901466, .873454, .847231, .82262, .799472, .777654, .75705, .737556, .719084, .70155) T ;
(b) For regular Gaussian Elimination, using standard arithmetic in Mathematica, the maximal error, i.e., $\infty$ norm of the difference between the computed solution and $u^\star$, is: n = 5: $1.96931 \times 10^{-12}$; n = 10: $5.31355 \times 10^{-4}$; n = 30: 457.413. (c) Pivoting has little effect for n = 5, 10, but for n = 30 the error is reduced to $5.96011 \times 10^{-4}$. (d) Using the conjugate gradient algorithm all the way to completion results in the following errors: n = 5: $3.56512 \times 10^{-3}$; n = 10: $5.99222 \times 10^{-4}$; n = 30: $1.83103 \times 10^{-4}$, and so, at least for moderate values of n, it outperforms Gaussian Elimination with pivoting.
0

2.7377
.3077
.9231
2
C
C
C
B
C
B
r1 = B
10.5.40. r0 = B
@ 2.3846 A, u2 = @ 3.0988 A,
@ 1 A, u1 = @ .4615 A,
.2680
1.7692
.4615 1
1 1
1
0
0
0
1
5.5113
7.2033
C
B
C
B
C
r2 = B
@ 4.6348 A, u3 = @ 9.1775 A, but the solution is u = @ 1 A. The problem is
1
.7262
4.3823
that the coefficient matrix is not positive definite, and so the fact that the solution is orthogonal to the conjugate vectors does not uniquely specify it.
10.5.41. False. For example, consider the homogeneous system K u = 0 where K =
!

.0001
0

1
.01
with solution u = 0. The residual for u =
is r = K u =
with k r k =
0
0
.01, yet not even the leading digit of u agrees with the true solution. In general, if $u^\star$ is the true solution to $K u = f$, then the residual is $r = f - K u = K (u^\star - u)$, and hence $\| u^\star - u \| \le \| K^{-1} \| \, \| r \|$, so the result is valid only when $\| K^{-1} \| \le 1$.

10.5.42. Referring to the pseudocode program in the text, at each step to compute
$r_k = f - K u_k$ requires $n^2$ multiplications and $n^2$ additions;
$\| r_k \|^2$ requires n multiplications and n - 1 additions;
$v_{k+1} = r_k + \dfrac{\| r_k \|^2}{\| r_{k-1} \|^2}\, v_k$ requires n + 1 multiplications and n additions since $\| r_{k-1} \|^2$ was already computed in the previous iteration (but when k = 0 this step requires no work);


0
,
1

$v_{k+1}^T K v_{k+1}$ requires $n^2 + n$ multiplications and $n^2 - 1$ additions;
$u_{k+1} = u_k + \dfrac{\| r_k \|^2}{v_{k+1}^T K v_{k+1}}\, v_{k+1}$ requires n + 1 multiplications and n additions;
for a grand total of $2 (n + 1)^2 \approx 2 n^2$ multiplications and $2 n^2 + 3 n - 2 \approx 2 n^2$ additions.
Thus, if the number of steps $k < \frac16 n$ (approximately), the conjugate gradient method is more efficient than Gaussian Elimination, which requires $\frac13 n^3$ operations of each type.
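
The steps counted above can be sketched as follows. This is an illustrative plain-Python version of the conjugate gradient iteration, assuming a symmetric positive definite matrix stored as a list of rows; the function names, the stopping tolerance, and the test system are assumptions made here rather than the text's pseudocode. It performs two matrix-vector products per step, which is where the $\approx 2 n^2$ count comes from.

```python
def matvec(K, x):
    """Dense matrix-vector product: n^2 multiplications."""
    return [sum(kij * xj for kij, xj in zip(row, x)) for row in K]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def conjugate_gradient(K, f, tol=1e-24):
    """Run at most n conjugate gradient steps starting from u_0 = 0."""
    n = len(f)
    u = [0.0] * n
    v = [0.0] * n
    prev_rr = None
    for _ in range(n):
        r = [fi - ki for fi, ki in zip(f, matvec(K, u))]   # r_k = f - K u_k
        rr = dot(r, r)                                     # |r_k|^2
        if rr <= tol:
            break
        if prev_rr is None:
            v = r[:]                                       # first direction: v_1 = r_0
        else:
            beta = rr / prev_rr
            v = [ri + beta * vi for ri, vi in zip(r, v)]   # v_{k+1} = r_k + beta v_k
        t = rr / dot(v, matvec(K, v))                      # |r_k|^2 / (v^T K v)
        u = [ui + t * vi for ui, vi in zip(u, v)]          # u_{k+1} = u_k + t v_{k+1}
        prev_rr = rr
    return u

# Hypothetical 2 x 2 positive definite system (exact solution 1/11, 7/11).
K = [[4.0, 1.0], [1.0, 3.0]]
f = [1.0, 2.0]
u = conjugate_gradient(K, f)
print(u)
```

In exact arithmetic the loop terminates with the solution after at most n steps; in practice one usually stops much earlier once the residual is small enough.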

10.5.43. $t_k = \dfrac{\| r_k \|^2}{r_k^T K r_k} = \dfrac{u_k^T K^2 u_k - 2 f^T K u_k + \| f \|^2}{u_k^T K^3 u_k - 2 f^T K^2 u_k + f^T K f}$.

10.6.1. In all cases, we use the normalized version (10.101) starting with $u^{(0)} = e_1$; the answers are correct to 4 decimal places.
(a) After 17 iterations, $\lambda = 2.00002$, $u = ( .55470, .83205 )^T$;
(b) after 26 iterations, $\lambda = 3.00003$, $u = ( .70711, .70710 )^T$;
(c) after 38 iterations, $\lambda = 3.99996$, $u = ( .57737, .57735, .57734 )^T$;
(d) after 121 iterations, $\lambda = 3.30282$, $u = ( .35356, .81416, .46059 )^T$;
(e) after 36 iterations, $\lambda = 5.54911$, $u = ( .39488, .71005, .58300 )^T$;
(f) after 9 iterations, $\lambda = 5.23607$, $u = ( .53241, .53241, .65810 )^T$;
(g) after 36 iterations, $\lambda = 3.61800$, $u = ( .37176, .60151, .60150, .37174 )^T$;
(h) after 30 iterations, $\lambda = 5.99997$, $u = ( .50001, .50000, .50000, .50000 )^T$.
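
Results of this kind can be reproduced with a short normalized power iteration. The sketch below is a plain-Python illustration, not the text's program: the function name, the fixed iteration count, and the 2 x 2 test matrix are assumptions, and the eigenvalue is estimated by the Rayleigh quotient $u \cdot A u$ (which also recovers the sign) rather than by $\| A u^{(k)} \|$.

```python
import math

def power_method(A, u0, iterations=100):
    """Normalized power iteration; u0 is assumed to be a unit vector."""
    u = u0[:]
    lam = 0.0
    for _ in range(iterations):
        w = [sum(aij * uj for aij, uj in zip(row, u)) for row in A]   # w = A u
        lam = sum(wi * ui for wi, ui in zip(w, u))                    # Rayleigh quotient, |u| = 1
        norm = math.sqrt(sum(wi * wi for wi in w))                    # zero only if A u = 0
        u = [wi / norm for wi in w]
    return lam, u

# Hypothetical test matrix with eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
lam, u = power_method(A, [1.0, 0.0])
print(lam, u)
```

The error in the eigenvector shrinks by the ratio $| \lambda_2 / \lambda_1 |$ per step, which is why the iteration counts listed above vary so much from matrix to matrix.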
10.6.2.
For n = 10, it takes 159 iterations to obtain $\lambda_1 = 3.9189 = 2 + 2 \cos \frac{1}{11} \pi$ to 4 decimal places;
for n = 20, it takes 510 iterations to obtain $\lambda_1 = 3.9776 = 2 + 2 \cos \frac{1}{21} \pi$ to 4 decimal places;
for n = 50, it takes 2392 iterations to obtain $\lambda_1 = 3.9962 = 2 + 2 \cos \frac{1}{51} \pi$ to 4 decimal places.
10.6.3. In each case, to find the dominant singular value of a matrix A, we apply the power method to $K = A^T A$ and take the square root of its dominant eigenvalue to find the dominant singular value $\sigma_1 = \sqrt{\lambda_1}$ of A.
(a) $K = \begin{pmatrix} 2 & 1 \\ 1 & 13 \end{pmatrix}$; after 11 iterations, $\lambda_1 = 13.0902$ and $\sigma_1 = 3.6180$;
(b) $K = \begin{pmatrix} 8 & 4 & 4 \\ 4 & 10 & 2 \\ 4 & 2 & 2 \end{pmatrix}$; after 15 iterations, $\lambda_1 = 14.4721$ and $\sigma_1 = 3.8042$;
(c) $K = \begin{pmatrix} 5 & 2 & 2 & 1 \\ 2 & 8 & 2 & 4 \\ 2 & 2 & 1 & 1 \\ 1 & 4 & 1 & 2 \end{pmatrix}$; after 16 iterations, $\lambda_1 = 11.6055$ and $\sigma_1 = 3.4067$;
(d) $K = \begin{pmatrix} 14 & 1 & 1 \\ 1 & 6 & 6 \\ 1 & 6 & 6 \end{pmatrix}$; after 39 iterations, $\lambda_1 = 14.7320$ and $\sigma_1 = 3.8382$.

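10.6.4. Since $v^{(k)} \approx c_1 \lambda_1^k v_1$ as $k \to \infty$,
$u^{(k)} = \dfrac{v^{(k)}}{\| v^{(k)} \|} \approx \dfrac{c_1 \lambda_1^k v_1}{| c_1 | \, | \lambda_1 |^k \, \| v_1 \|} = \begin{cases} u_1, & \lambda_1 > 0, \\ (-1)^k u_1, & \lambda_1 < 0, \end{cases}$ where $u_1 = (\operatorname{sign} c_1) \dfrac{v_1}{\| v_1 \|}$
is one of the two real unit eigenvectors. Moreover, $A u^{(k)} \approx \begin{cases} \lambda_1 u_1, & \lambda_1 > 0, \\ (-1)^k \lambda_1 u_1, & \lambda_1 < 0, \end{cases}$ so $\| A u^{(k)} \| \to | \lambda_1 |$. If $\lambda_1 > 0$, the iterates $u^{(k)} \to u_1$ converge to one of the two dominant unit eigenvectors, whereas if $\lambda_1 < 0$, the iterates $u^{(k)} \approx (-1)^k u_1$ switch back and forth between the two real unit eigenvectors.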

10.6.5.
(a) If $A v = \lambda v$ then $A^{-1} v = \dfrac{1}{\lambda}\, v$, and so v is also the eigenvector of $A^{-1}$.
(b) If $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of A, with $| \lambda_1 | > | \lambda_2 | > \cdots > | \lambda_n | > 0$ (recalling that 0 cannot be an eigenvalue if A is nonsingular), then $\dfrac{1}{\lambda_1}, \ldots, \dfrac{1}{\lambda_n}$ are the eigenvalues of $A^{-1}$, and $\dfrac{1}{| \lambda_n |} > \dfrac{1}{| \lambda_{n-1} |} > \cdots > \dfrac{1}{| \lambda_1 |}$ and so $\dfrac{1}{\lambda_n}$ is the dominant eigenvalue of $A^{-1}$. Thus, applying the power method to $A^{-1}$ will produce the reciprocal of the smallest (in absolute value) eigenvalue of A and its corresponding eigenvector.
(c) The rate of convergence of the algorithm is the ratio $| \lambda_n / \lambda_{n-1} |$ of the moduli of the smallest two eigenvalues.
(d) Once we factor $P A = L U$, we can solve the iteration equation $A u^{(k+1)} = u^{(k)}$ by rewriting it in the form $L U u^{(k+1)} = P u^{(k)}$, and then using Forward and Back Substitution to solve for $u^{(k+1)}$. As we know, this is much faster than computing $A^{-1}$.

10.6.6.
(a) After 15 iterations, we obtain $\lambda = .99998$, $u = ( .70711, .70710 )^T$;
(b) after 24 iterations, we obtain $\lambda = 1.99991$, $u = ( .55469, .83206 )^T$;
(c) after 12 iterations, we obtain $\lambda = 1.00001$, $u = ( .40825, .81650, .40825 )^T$;
(d) after 6 iterations, we obtain $\lambda = .30277$, $u = ( .35355, .46060, .81415 )^T$;
(e) after 7 iterations, we obtain $\lambda = .88536$, $u = ( .88751, .29939, .35027 )^T$;
(f) after 7 iterations, we obtain $\lambda = .76393$, $u = ( .32348, .25561, .91106 )^T$;
(g) after 11 iterations, we obtain $\lambda = .38197$, $u = ( .37175, .60150, .60150, .37175 )^T$;
(h) after 16 iterations, we obtain $\lambda = 2.00006$, $u = ( .500015, .50000, .499985, .50000 )^T$.
10.6.7.
(a) According to Exercises 8.2.19, 8.2.24, if A has eigenvalues $\lambda_1, \ldots, \lambda_n$, then $(A - \mu I)^{-1}$ has eigenvalues $\nu_i = \dfrac{1}{\lambda_i - \mu}$. Thus, applying the power method to $(A - \mu I)^{-1}$ will produce its dominant eigenvalue $\nu_\star$, for which $| \lambda_\star - \mu |$ is the smallest. We then recover the eigenvalue $\lambda_\star = \mu + \dfrac{1}{\nu_\star}$ of A which is closest to $\mu$.
(b) The rate of convergence is the ratio $| (\lambda_\star - \mu) / (\lambda_{\star\star} - \mu) |$ of the moduli of the smallest two eigenvalues of the shifted matrix.
(c) $\mu$ is an eigenvalue of A if and only if $A - \mu I$ is a singular matrix, and hence one cannot implement the method. Also, choosing $\mu$ too close to an eigenvalue will result in an ill-conditioned matrix, and so the algorithm may not converge properly.
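
The shifted inverse power method described above can be sketched in a few lines. The version below is a plain-Python illustration under stated assumptions: the helper `solve` (Gaussian elimination with partial pivoting), the function names, and the test matrix are introduced here for the example, and the linear system is re-solved from scratch at every step instead of reusing a single $P A = L U$ factorization as recommended in Exercise 10.6.5(d).

```python
import math

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(M)]   # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                       # back substitution
        s = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x

def shifted_inverse_power(A, mu, u0, iterations=50):
    """Power method applied to (A - mu I)^{-1}; returns (eigenvalue of A, unit eigenvector)."""
    n = len(u0)
    shifted = [[A[i][j] - (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    norm0 = math.sqrt(sum(t * t for t in u0))
    u = [t / norm0 for t in u0]
    nu = 0.0
    for _ in range(iterations):
        w = solve(shifted, u)                            # w = (A - mu I)^{-1} u
        nu = sum(wi * ui for wi, ui in zip(w, u))        # estimate of the dominant nu
        norm = math.sqrt(sum(wi * wi for wi in w))
        u = [wi / norm for wi in w]
    return mu + 1.0 / nu, u

# Hypothetical test matrix with eigenvalues 1 and 3; the shift .8 is closest to 1.
A = [[2.0, 1.0], [1.0, 2.0]]
lam, u = shifted_inverse_power(A, 0.8, [1.0, 0.0])
print(lam, u)
```

Taking `mu = 0` reduces this to the plain inverse power method of Exercise 10.6.5.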
10.6.8.
(a) After 11 iterations, we obtain $\nu_\star = 2.00002$, so $\lambda_\star = 1.0000$, $u = ( .70711, .70710 )^T$;
(b) after 27 iterations, we obtain $\nu_\star = .40003$, so $\lambda_\star = 1.9998$, $u = ( .55468, .83207 )^T$;
(c) after 10 iterations, we obtain $\nu_\star = 2.00000$, so $\lambda_\star = 1.00000$, $u = ( .40825, .81650, .40825 )^T$;
(d) after 7 iterations, we obtain $\nu_\star = 5.07037$, so $\lambda_\star = .30278$, $u = ( .35355, .46060, .81415 )^T$;
(e) after 8 iterations, we obtain $\nu_\star = .72183$, so $\lambda_\star = .88537$, $u = ( .88753, .29937, .35024 )^T$;
(f) after 6 iterations, we obtain $\nu_\star = 3.78885$, so $\lambda_\star = .76393$, $u = ( .28832, .27970, .91577 )^T$;
(g) after 9 iterations, we obtain $\nu_\star = 8.47213$, so $\lambda_\star = .38197$, $u = ( .37175, .60150, .60150, .37175 )^T$;
(h) after 14 iterations, we obtain $\nu_\star = .66665$, so $\lambda_\star = 2.00003$, $u = ( .50001, .50000, .49999, .50000 )^T$.
10.6.9.
(i) First, compute the dominant eigenvalue $\lambda_1$ and eigenvector $v_1$ using the power method. Then set $B = A - \lambda_1 v_1 b^T$, where b is any vector such that $b \cdot v_1 = 1$, e.g., $b = v_1 / \| v_1 \|^2$. According to Exercise 8.2.52, B has eigenvalues $0, \lambda_2, \ldots, \lambda_n$, and corresponding eigenvectors $v_1$ and $w_j = v_j - c_j v_1$, where $c_j = \lambda_1 \, b \cdot v_j / \lambda_j$ for $j \ge 2$. Thus, applying the power method to B will produce the subdominant eigenvalue $\lambda_2$ and the eigenvector $w_2$ of the deflated matrix B, from which $v_2$ can be reconstructed using the preceding formula.
(ii) In all cases, we use the normalized version (10.101) starting with $u^{(0)} = e_1$; the answers are correct to 4 decimal places.
(a) Using the computed values $\lambda_1 = 2.$, $v_1 = ( .55470, .83205 )^T$, the deflated matrix is $B = \begin{pmatrix} 1.61538 & 1.07692 \\ 3.92308 & 2.61538 \end{pmatrix}$; it takes only 3 iterations to produce $\lambda_2 = 1.00000$, $v_2 = ( .38075, .924678 )^T$.
(b) Using $\lambda_1 = 3.$, $v_1 = ( .70711, .70711 )^T$, the deflated matrix is $B = \begin{pmatrix} 3.5 & 3.5 \\ 1.5 & 1.5 \end{pmatrix}$; it takes 3 iterations to produce $\lambda_2 = 2.00000$.
(c) Using $\lambda_1 = 4.$, $v_1 = ( .57737, .57735, .57735 )^T$, the deflated matrix is $B = \begin{pmatrix} 1.66667 & .333333 & 1.33333 \\ .333333 & .666667 & .333333 \\ 1.33333 & .333333 & 1.66667 \end{pmatrix}$; it takes 11 iterations to produce $\lambda_2 = 2.99999$.
(d) Using $\lambda_1 = 3.30278$, $v_1 = ( .35355, .81415, .46060 )^T$, the deflated matrix is $B = \begin{pmatrix} 1.5872 & .95069 & .46215 \\ 2.04931 & .18924 & 1.23854 \\ 2.53785 & 3.76146 & 4.70069 \end{pmatrix}$; it takes 10 iterations to produce $\lambda_2 = 3.00000$.
(e) Using $\lambda_1 = 5.54913$, $v_1 = ( .39488, .71006, .58300 )^T$, the deflated matrix is $B = \begin{pmatrix} 1.86527 & .44409 & .722519 \\ 2.55591 & .797798 & 2.70287 \\ .277481 & 1.70287 & 1.88606 \end{pmatrix}$; it takes 13 iterations to produce $\lambda_2 = 3.66377$.
(f) Using $\lambda_1 = 5.23607$, $v_1 = ( 0.53241, 0.53241, 0.65809 )^T$, the deflated matrix is $B = \begin{pmatrix} .5158 & .5158 & .8346 \\ .4842 & 1.5158 & .8346 \\ .1654 & .1654 & .2677 \end{pmatrix}$; it takes 36 iterations to produce $\lambda_2 = 1.00003$.
(g) Using $\lambda_1 = 3.61803$, $v_1 = ( .37176, .60151, .60150, .37174 )^T$, the deflated matrix is $B = \begin{pmatrix} 1.5 & .19098 & .80902 & .5000 \\ .19098 & .69098 & .30902 & .80902 \\ .80902 & .30902 & .69098 & .19098 \\ .5000 & .80902 & .19098 & 1.5000 \end{pmatrix}$; it takes 18 iterations to produce $\lambda_2 = 2.61801$.
(h) Using $\lambda_1 = 6.$, $v_1 = ( .5, .5, .5, .5 )^T$, the deflated matrix is $B = \begin{pmatrix} 2.5 & .5 & 1.5 & .5 \\ .5 & 2.5 & .5 & 1.5 \\ 1.5 & .5 & 2.5 & .5 \\ .5 & 1.5 & .5 & 2.5 \end{pmatrix}$; it takes 17 iterations to produce $\lambda_2 = 3.99998$.
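
The deflation step in part (i) is a rank one update and can be sketched as follows, with the choice $b = v_1 / \| v_1 \|^2$. The function name and the 2 x 2 test matrix are illustrative assumptions, not data from the exercise.

```python
def deflate(A, lam1, v1):
    """Return B = A - lam1 * v1 b^T with b = v1 / |v1|^2, so that b . v1 = 1."""
    n = len(v1)
    norm_sq = sum(t * t for t in v1)
    b = [t / norm_sq for t in v1]
    return [[A[i][j] - lam1 * v1[i] * b[j] for j in range(n)] for i in range(n)]

# Hypothetical example: A has eigenvalues 3 and 1, with dominant eigenvector (1, 1).
A = [[2.0, 1.0], [1.0, 2.0]]
B = deflate(A, 3.0, [1.0, 1.0])
print(B)
```

Here B has eigenvalues 0 and 1, so applying the power method to B produces the subdominant eigenvalue of A.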

10.6.10. That A is a singular matrix and 0 is an eigenvalue. The corresponding eigenvectors are the nonzero elements of ker A. In fact, assuming $u^{(k)} \ne 0$, the iterates $u^{(0)}, \ldots, u^{(k)}$ form a Jordan chain for the zero eigenvalue. To find other eigenvalues and eigenvectors, you need to try a different initial vector $u^{(0)}$.

10.6.11.
(a) Eigenvalues: 6.7016, .2984; eigenvectors: $( .3310, .9436 )^T$, $( .9436, .3310 )^T$.
(b) Eigenvalues: 5.4142, 2.5858; eigenvectors: $( .3827, .9239 )^T$, $( .9239, .3827 )^T$.
(c) Eigenvalues: 4.7577, 1.9009, 1.6586; eigenvectors: $( .2726, .7519, .6003 )^T$, $( .9454, .0937, .3120 )^T$, $( .1784, .6526, .7364 )^T$.
(d) Eigenvalues: 7.0988, 2.7191, 4.8180; eigenvectors: $( .6205, .6328, .4632 )^T$, $( .5439, .0782, .8355 )^T$, $( .5649, .7704, .2956 )^T$.
(e) Eigenvalues: 4.6180, 3.6180, 2.3820, 1.3820; eigenvectors: $( .3717, .6015, .6015, .3717 )^T$, $( .6015, .3717, .3717, .6015 )^T$, $( .6015, .3717, .3717, .6015 )^T$, $( .3717, .6015, .6015, .3717 )^T$.
(f) Eigenvalues: 8.6091, 6.3083, 4.1793, 1.9033; eigenvectors: $( .3182, .9310, .1008, .1480 )^T$, $( .8294, .2419, .4976, .0773 )^T$, $( .4126, .1093, .6419, .6370 )^T$, $( .2015, .2507, .5746, .7526 )^T$.

10.6.12. The iterates converge to the diagonal matrix $A_n \to \begin{pmatrix} 6 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 3 \end{pmatrix}$. The eigenvalues appear along the diagonal, but not in decreasing order, because, when the eigenvalues are listed in decreasing order, the corresponding eigenvector matrix $S = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 1 & 1 \\ 2 & 1 & 1 \end{pmatrix}$ (or, rather, its transpose) is not regular, and so Theorem 10.57 does not apply.

10.6.13.
(a) Eigenvalues: 2, 1; eigenvectors: $( 2, 3 )^T$, $( 1, 1 )^T$.
(b) Eigenvalues: 1.2087, 5.7913; eigenvectors: $( .9669, .2550 )^T$, $( .6205, .7842 )^T$.
(c) Eigenvalues: 3.5842, 2.2899, 1.7057; eigenvectors: $( .4466, .7076, .5476 )^T$, $( .1953, .8380, .5094 )^T$, $( .7491, .2204, .6247 )^T$.
(d) Eigenvalues: 7.7474, .2995, 3.4479; eigenvectors: $( .4697, .3806, .7966 )^T$, $( .7799, .2433, .5767 )^T$, $( .6487, .7413, .1724 )^T$.
(e) Eigenvalues: 18.3344, 4.2737, 0, 1.6081; eigenvectors: $( .4136, .8289, .2588, .2734 )^T$, $( .4183, .9016, .0957, .0545 )^T$, $( .5774, .5774, .5774, 0 )^T$, $( .2057, .4632, .6168, .6022 )^T$.
10.6.14. Yes. After 10 iterations, one finds
$R_{10} = \begin{pmatrix} 2.0011 & 1.4154 & 4.8983 \\ 0 & .9999 & .0004 \\ 0 & 0 & .9995 \end{pmatrix}$, $S_{10} = \begin{pmatrix} .5773 & .4084 & .7071 \\ .5774 & .4082 & .7071 \\ .5774 & .8165 & .0002 \end{pmatrix}$,
so the diagonal entries of $R_{10}$ give the eigenvalues correct to 3 decimal places, and the columns of $S_{10}$ are similar approximations to the orthonormal eigenvector basis.
10.6.15. It has eigenvalues $\pm 1$, which have the same magnitude. The Q R factorization is trivial, with Q = A and R = I. Thus, R Q = A, and so nothing happens.
10.6.16. This follows directly from Exercise 8.2.23.
10.6.17.
(a) By induction, if $A_k = Q_k R_k = R_k^T Q_k^T = A_k^T$, then, since $Q_k$ is orthogonal,
$A_{k+1}^T = (R_k Q_k)^T = Q_k^T R_k^T = Q_k^T R_k^T Q_k^T Q_k = Q_k^T Q_k R_k Q_k = R_k Q_k = A_{k+1}$,
proving symmetry of $A_{k+1}$. Again, proceeding by induction, if $A_k = Q_k R_k$ is tridiagonal, then its j-th column is a linear combination of the standard basis vectors $e_{j-1}, e_j, e_{j+1}$. By the Gram-Schmidt formulas (5.19), the j-th column of $Q_k$ is a linear combination of the first j columns of $A_k$, and hence is a linear combination of $e_1, \ldots, e_{j+1}$. Thus, all entries below the sub-diagonal of $Q_k$ are zero. Since $R_k$ is upper triangular, this implies all entries of $A_{k+1} = R_k Q_k$ lying below the sub-diagonal are also zero. But we already proved that $A_{k+1}$ is symmetric, and hence this implies all entries lying above the super-diagonal are also 0, which implies $A_{k+1}$ is tridiagonal.
(b) The result does not hold if A is 0
only tridiagonal and1not symmetric. For example, when
1
0
1
1
1
1

1
0
2 0
C
B 2
3
6
2
C
B
1 1 0
C
B

B
1
1 C
C
B 1
2 C
C,, and
B

0
3
C
B
,
R
=
A = B
1
1
1
,
then
Q
=
A
@
B
3
B 2
3C
6 C
A
@
A
@
0
1 1
1
1
2
0
0
0

A1 = R Q =

B
B
B 3
B
B 2
@

1
6
5
3
1

3 2

2
3
1

3 2
1
3

C
C
C
C,
C
A

which is not tridiagonal.


10.6.18.

1
(a) H = B
@0
0
(b)
0

1
B
B0
H1 = B
@0
0
0
1
B
0
B
H2 = B
@0
0

0
.9615
.2747

0
.2747 C
A,
.9615

8.0000
T = H AH = B
@ 7.2801
0

0
0
0
.4082 .8165 .4082 C
C
C,
.8165 .5266
.2367 A
.4082 .2367
.8816
1
0
0
0
1
0
0 C
C
C,
0 .8278 .5610 A
0 .5610
.8278

7.2801
20.0189
3.5660

5.0000
B
B 2.4495
T1 = H 1 A H1 = B
@
0
0
0
5.0000
B
2.4495
B
T = H 2 T1 H2 = B
@
0
0

0
3.5660 C
A.
4.9811
1

2.4495
0
0
3.8333 1.3865
.9397 C
C
C,
1.3865 6.2801 .9566 A
.9397 .9566 6.8865
1
2.4495
0
0
3.8333 1.6750
0 C
C
C.
1.6750
5.5825
.0728 A
0
.0728 7.5842

(c)
H1 =

B
B0
B
@0

0
1
B
B0
H2 = B
@0
0
0

0
0
0
0
.7071 .7071 C
C
C,
.7071 .5000
.5000 A
.7071 .5000
.5000
1
0
0
0
1
0
0 C
C
C,
0 .1691 .9856 A
0
.9856 .1691

T1 = H 1 A H1 =

4.0000

B
B 1.4142
B
@
0

0
4.0000
B
B 1.4142
T = H 2 T1 H2 = B
@
0
0
0

1.4142
0
0
2.5000
.1464 .8536 C
C
C,
.1464 1.0429
.7500 A
.8536
.7500 2.4571
1
1.4142
0
0
2.5000 .8660
0 C
C
C.
.8660 2.1667
.9428 A
0
.9428 1.3333

10.6.19.
(a) Eigenvalues: 24, 6, 3; (b) eigenvalues: 7.6180, 7.5414, 5.3820, 1.4586;
(c) eigenvalues: 4.9354, 3.0000, 1.5374, .5272.
10.6.20. The singular values are the square roots of the non-zero eigenvalues of
0

5
B
B 2
T
K=A A=B
@ 2
1

2
9
0
6

2
0
5
3

Applying the tridiagonalization algorithm, we find


H1 =

0
0
0
.6667 .6667 .3333 C
C
C,
.6667
.7333 .1333 A
.3333
.1333 .9333
1
0
0
0
1
0
0 C
C
C,
0 .9839 .1789 A
0 .1789
.9839

B
B0
B
@0

0
1
B
B0
H2 = B
@0
0
0

A1 =

1
6 C
C
C.
3A
6

5.0000

B
B 3.0000
B
@
0

0
5.0000
B
B 3.0000
A2 = B
@
0
0
0

3.0000
8.2222
4.1556
.7556
3.0000
8.2222
4.2237
0

0
0
4.1556 0.7556 C
C
C,
8.4489 4.8089 A
4.8089 3.3289
1
0
0
4.2237
0 C
C
C.
9.9778 3.6000 A
3.6000
1.8000

Applying the Q R algorithm to the final tridiagonal matrix, the eigenvalues are found to be
14.4131, 7.66204, 2.92482, 0, and so the singular values of the original matrix are
3.79646, 2.76804, 1.71021.
10.6.21.

1
(a) H1 = B
@0
0

0
.4472
.8944

0
.8944 C
A,
.4472

3.0000
A1 = B
@ 2.2361
0

1.3416
2.2000
1.4000

1.7889
1.6000 C
A;
4.2000

(b)

(c)

1
B
B0
H1 = B
@0
0
0
1
B
B0
H2 = B
@0
0

H1 =

B
B0
B
@0

0
1
B
B0
H2 = B
@0
0
0

0
0
.8944 0
0
1
.4472 0
0
0
1
0
0 .7875
0 .6163

0
.4472 C
C
C,
0 A
.8944
1
0
0 C
C
C,
.6163 A
.7875
1

0
0
0
.5345 .2673 .8018 C
C
C,
.2673 .9535
.1396 A
.8018 .1396
.5811
1
0
0
0
1
0
0 C
C
C,
0 .7242 .6897 A
0 .6896
.7242

3.0000
B
B 2.2361
A1 = B
@
0
0
0
3.0000
B
B 2.2361
A2 = B
@
0
0

A1 =

1.0000

B
B 3.7417
B
@
0

0
1.0000
B
B 3.7417
A2 = B
@
0
0
0

2.2361
3.8000
1.7889
1.4000
2.2361
3.8000
2.2716
0
1.0690
1.0714
2.2316
2.1248
1.0690
1.0714
3.0814
0

1.0000
2.2361
2.0000
4.4721
.7875
2.0074
3.2961
.9534

0
.4000 C
C
C,
5.8138 A
1.2000
1
.6163
1.0631 C
C
C;
2.2950 A
6.4961
1

.8138
.4414
1.2091 1.4507 C
C
C,
1.7713
1.9190 A
.0482
3.1573
1
.2850
.8809
1.876
.2168 C
C
C.
3.4127 1.6758 A
.1950
1.5159

10.6.22.
(a) Eigenvalues: 4.51056, 2.74823, 2.25879,
(b) eigenvalues: 7., 5.74606, 4.03877, 1.29271,
(c) eigenvalues: 4.96894, 2.31549, 1.70869, 1.42426.
10.6.23. First, by Lemma 5.28, $H_1 x_1 = y_1$. Furthermore, since the first entry of $u_1$ is zero, $u_1^T e_1 = 0$, and so $H_1 e_1 = ( I - 2 u_1 u_1^T ) e_1 = e_1$. Thus, the first column of $H_1 A$ is
$H_1 (a_{11} e_1 + x_1) = a_{11} e_1 + y_1 = ( a_{11}, r, 0, \ldots, 0 )^T$.
Finally, again since the first entry of $u_1$ is zero, the first column of $H_1$ is $e_1$ and so multiplying $H_1 A$ on the right by $H_1$ doesn't affect its first column. We conclude that the first column of the symmetric matrix $H_1 A H_1$ has the form given in (10.109); symmetry implies that its first row is just the transpose of its first column, which completes the proof.
10.6.24. Since $T = H^{-1} A H$ where $H = H_1 H_2 \cdots H_n$ is the product of the Householder reflections, $A v = \lambda v$ if and only if $T w = \lambda w$ where $w = H^{-1} v$ is the corresponding eigenvector of the tridiagonalized matrix. Thus, to recover the eigenvectors of A we need to multiply $v = H w = H_1 H_2 \cdots H_n w$.
10.6.25.
(a) Starting with a symmetric matrix A = A1 , for each j = 1, . . . , n 1, the tridiagonalization algorithm produces a symmetric matrix Aj+1 from Aj as follows. We first extract
xj , which requires no arithmetic operations, and then determine vj = xj k xj k ej+1 ,
which, since their first j entries are 0, requires n j multiplications and additions and a
square root. To compute Aj+1 , we only need to work on the lower right (n j) (n j)
block of Aj since its first j 1 rows and columns are not affected, while the j th row and
column entries of Aj+1 are predetermined. Setting uj = vj /k vj k2 ,
T
Aj+1 = Hj Aj Hj = ( I 2 uT
j uj )A( I 2 uj uj ) = A

T
2 v j zT
j + v j zj

k v j k2

4 vjT vj zj vjT
k v j k4

b
where zj = A vj . Thus, the updated entries a
ik of Aj+1 are given by
2
2
b
a
where j =
, j = zj vjT = zj vj ,
ik = aik j (vi zk + vk zi ) + j j vi vk ,
k v j k2

313

for i, k = j + 1, . . . , n, where aik are the entries of Aj and Aj+1 . To compute zj requires (n j)2 multiplications and (n j)(n j 1) additions; to compute the different
products vi vk and vi zk requires, respectively, (n j)2 and 21 (n j)(n j + 1) multiplications. Using these, to compute j and j requires 2(n j 1) additions and 1
division; finally, to compute the updated entries on and above the diagonal (the ones below the diagonal following from symmetry) requires 2(n j)(n j + 1) multiplications
and 32 (n j)(n j + 1) additions. The total is 23 n3 12 n2 1 23 n3 multiplications,
1 2
10
5 3
5 3
6 n + 2 n 3 n + 2 6 n additions and n 1 square roots that are required to
tridiagonalize A.
(b) First, to factor a tridiagonal $A = QR$ using the pseudocode program on page 242, we note that at the beginning of the $j$th step, for $j < n$, the last $n-j-1$ entries of the $j$th column and the last $n-j-2$ entries of the $(j+1)$st column are zero, while columns $j+2, \dots, n$ are still in tridiagonal form. Thus, to compute $r_{jj}$ requires $j+1$ multiplications, $j$ additions and a square root; to compute the nonzero $a_{ij}$ requires $j+1$ multiplications. We only need compute $r_{jk}$ for $k = j+1$, which requires $j+1$ multiplications and $j$ additions, and, if $j < n-1$, for $k = j+2$, which requires just 1 multiplication. We update the entries in column $j+1$, which requires $j+1$ multiplications and $j+1$ additions and, when $j < n-1$, column $j+2$, which requires $j+1$ multiplications and 1 addition. The final column only requires $2n$ multiplications, $n-1$ additions and a square root to normalize. The totals are $\frac52 n^2 + \frac92 n - 7 \approx \frac52 n^2$ multiplications, $\frac32 n^2 + \frac32 n - 4 \approx \frac32 n^2$ additions and $n$ square roots.
(c) Much faster: Once the matrix is tridiagonalized, each iteration requires $\frac52 n^2$ versus $n^3$ multiplications and $\frac32 n^2$ versus $n^3$ additions, as found in Exercise 5.3.31. Moreover, by part (a), the initial tridiagonalization only requires the effort of about $1\frac12$ $QR$ steps.
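The totals in part (b) can be checked by adding up the per-step counts directly. The following is a minimal Python sketch, not part of the original solution; the function name is hypothetical, and the loop only tallies the counts as they are stated in the text (steps $j = 1, \dots, n-1$, then the final column), comparing them with $\frac52 n^2 + \frac92 n - 7$ and $\frac32 n^2 + \frac32 n - 4$.

```python
def qr_tridiagonal_counts(n):
    """Tally multiplications and additions for one QR factorization of an
    n x n tridiagonal matrix, using the per-step counts quoted in part (b)."""
    mults = 0
    adds = 0
    for j in range(1, n):              # steps j = 1, ..., n-1
        mults += j + 1                 # r_jj
        adds += j
        mults += j + 1                 # the nonzero a_ij
        mults += j + 1                 # r_{j,j+1}
        adds += j
        mults += j + 1                 # update column j+1
        adds += j + 1
        if j < n - 1:
            mults += 1                 # r_{j,j+2}
            mults += j + 1             # update column j+2
            adds += 1
    mults += 2 * n                     # final column
    adds += n - 1
    return mults, adds


if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        m, a = qr_tridiagonal_counts(n)
        # closed forms, multiplied by 2 to stay in integers
        print(n, 2 * m == 5 * n * n + 9 * n - 14, 2 * a == 3 * n * n + 3 * n - 8)
```

The comparison is done on doubled values so that everything stays in integer arithmetic.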
10.6.26.
Tridiagonalization
start
    set R = A
    for j = 1 to n − 2
        for i = 1 to j set x_i = 0 next i
        for i = j + 1 to n set x_i = r_ij next i
        set v = x + (sign x_{j+1}) ‖x‖ e_{j+1}
        if v ≠ 0, set u_j = v/‖v‖, H_j = I − 2 u_j u_j^T, R = H_j R H_j
        else set u_j = 0, H_j = I endif
    next j
end

The program also works as written for reducing a non-symmetric matrix A to upper Hessenberg form.
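The tridiagonalization pseudocode can be sketched in Python as follows. This is an illustrative translation under stated assumptions, not the authors' program: the function name is hypothetical, matrices are plain nested lists, indices are 0-based, and `sign 0` is taken to be +1 (an assumption, so that the reflection is still well defined when $x_{j+1} = 0$).

```python
import math


def tridiagonalize(A):
    """Reduce a symmetric matrix (list of lists) to tridiagonal form by
    Householder reflections R <- H R H with H = I - 2 u u^T."""
    n = len(A)
    R = [row[:] for row in A]
    for j in range(n - 2):                       # 0-based version of j = 1..n-2
        x = [0.0] * n
        for i in range(j + 1, n):
            x[i] = R[i][j]
        norm_x = math.sqrt(sum(t * t for t in x))
        sign = 1.0 if x[j + 1] >= 0 else -1.0    # assumption: sign(0) = +1
        v = x[:]
        v[j + 1] += sign * norm_x
        norm_v = math.sqrt(sum(t * t for t in v))
        if norm_v == 0.0:
            continue                             # v = 0, so H_j = I
        u = [t / norm_v for t in v]
        # left multiplication: R <- R - 2 u (u^T R)
        utR = [sum(u[i] * R[i][k] for i in range(n)) for k in range(n)]
        R = [[R[i][k] - 2.0 * u[i] * utR[k] for k in range(n)] for i in range(n)]
        # right multiplication: R <- R - 2 (R u) u^T
        Ru = [sum(R[i][k] * u[k] for k in range(n)) for i in range(n)]
        R = [[R[i][k] - 2.0 * Ru[i] * u[k] for k in range(n)] for i in range(n)]
    return R


if __name__ == "__main__":
    A = [[4.0, 1.0, 2.0, 2.0],
         [1.0, 2.0, 0.0, 1.0],
         [2.0, 0.0, 3.0, 2.0],
         [2.0, 1.0, 2.0, 1.0]]
    for row in tridiagonalize(A):
        print(["%8.4f" % t for t in row])
```

Since each $H_j$ is orthogonal and symmetric, the result is similar to $A$, so quantities such as the trace are unchanged; this gives a quick way to check the output.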


Solutions Chapter 11
d
1

d
d2
11.1.1. The greatest displacement is at x = + 2 , with u(x) = 8 + 2 + 2 a , when d < 2, and
at x = 1, with u(x) = d, when d 2. The greatest stress and greatest strain are at x = 0,

with v(x) = w(x) = 2 + d.


11.1.2. u(x) =

8
< 1
4
: 1
4

x2 ,

1
2

x+

3
4

1
2

0 x 21 ,

x2 ,

1
2

v(x) =

x 1,

0.03

8
< 1
4
:

0 x 21 ,

x,

1
2

x 34 ,

x 1.

0.2

0.02
0.1

0.01
0.2

0.4

0.6

0.8

0.2

-0.01

0.4

0.6

0.8

-0.1

-0.02
-0.2
-0.03

11.1.3.

8
<

21 x2 ,

(a) u(x) = b + :
8
<

(b) v(x) = :

1 2
4`

x,

` x + 21 x ,
0 x 21 `,
1
2

x `,

0x
1
2

1
2

`,

` x 1,

for any b.

which is the same for all equilibria.

` x 1,

0.2

0.1

0.4

0.6

0.8

-0.1
0.05

(c) u(x):

0.2

0.4

0.6

v(x):

0.8

-0.2
-0.3

-0.05
-0.4
-0.1
-0.5

11.1.4. $u(x) = \dfrac{\log(x+1)}{\log 2} - x$, $v(x) = u'(x) = \dfrac{1}{\log 2\,(x+1)} - 1$, $w(x) = (1+x)\,v(x) = \dfrac{1}{\log 2} - 1 - x$.

[Graphs of u(x), v(x), w(x) on 0 ≤ x ≤ 1 omitted.]

Maximum displacement where $u'(x) = 0$, so $x = 1/\log 2 - 1 \approx .4427$. The bar will break at the point of maximum strain, which is at $x = 0$.
11.1.5. $u(x) = 2\log(x+1) - x$, $v(x) = u'(x) = \dfrac{1-x}{1+x}$, $w(x) = (1+x)\,v(x) = 1 - x$.

[Graphs of u(x), v(x), w(x) on 0 ≤ x ≤ 1 omitted.]

Maximum displacement at $x = 1$. Maximum strain at $x = 0$.


11.1.6. There is no equilibrium solution.
11.1.7.
0.2

u(x) = x3

3
2

x2 x,

v(x) = 3 x2 3 x 1.

0.8
0.1

0.6
0.4
1

0.5

1.5

0.2

0.5

-0.1

1.5

-0.2
-0.4

The two points $x = 1 - \frac13\sqrt3 \approx .42265$, $1 + \frac13\sqrt3 \approx 1.57735$ have maximal (absolute) displacement, while the maximal stress is at $x = 0$ and $2$, so either end is most likely to break.

11.1.8. u(x) =

8
< 7
8
: 1
4

x2 ,

1
2

3
8

1
4

x2 ,

0 x 1,

8
< 7
8
: 3
8

v(x) =

1 x 2,

x,
1
2

0 x 1,

x,

1 x 2.

0.8

0.3

0.6
0.4

0.2

0.2

0.1

-0.2

0.5

1.5

-0.4
0.5

1.5

-0.6

The greatest stress is at x = where v(0) =


w(2) = 2 v(2) = 45 .

7
8,

and the greatest strain is at x = 2, where

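11.1.9. The bar will stretch farther if the stiffer half is on top. Both boundary value problems have the form
$-\dfrac{d}{dx}\left( c(x)\,\dfrac{du}{dx} \right) = 1$, $0 < x < 2$, $u(0) = 0$, $u'(2) = 0$.
When the stiffer bar is on bottom, $c(x) = \begin{cases} 1, & 0 \le x \le 1,\\ 2, & 1 \le x \le 2, \end{cases}$ and the solution is $u(x) = \begin{cases} \frac32 x - \frac12 x^2, & 0 \le x \le 1,\\ \frac14 + x - \frac14 x^2, & 1 \le x \le 2, \end{cases}$ with $u(2) = \frac54$. When the stiffer bar is on top, $c(x) = \begin{cases} 2, & 0 \le x \le 1,\\ 1, & 1 \le x \le 2, \end{cases}$ and $u(x) = \begin{cases} \frac32 x - \frac14 x^2, & 0 \le x \le 1,\\ -\frac14 + 2x - \frac12 x^2, & 1 \le x \le 2, \end{cases}$ with $u(2) = \frac74$.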

11.1.10. $-\big((1-x)\,u'\big)' = 1$, $u(0) = 0$, $w(1) = \lim_{x \to 1} (1-x)\,u'(x) = 0$.
The solution is $u = x$. Note that we still need the limiting strain at $x = 1$ to be zero, $w(1) = 0$, which requires $|u'(1)| < \infty$. Indeed, the general solution to the differential equation is $u(x) = a + b \log(1-x) + x$; the first boundary condition implies $a = 0$, while $|u'(1)| < \infty$ is required to eliminate the logarithmic term.
11.1.11. The boundary value problem is
$-u'' = f(x)$, $u(0) = u(2\pi)$, $u'(0) = u'(2\pi)$.
Integrating the differential equation, we find
$u(x) = a\,x + b - \int_0^x \int_0^y f(z)\,dz\,dy$.
The first boundary condition implies that $a = \dfrac{1}{2\pi} \int_0^{2\pi} \int_0^y f(z)\,dz\,dy$. The second boundary condition requires
$\langle f, 1 \rangle = \int_0^{2\pi} f(z)\,dz = 0$, $(*)$
which is required for the forcing function to maintain equilibrium. The condition $(*)$ is precisely the Fredholm alternative in this case, since any constant function solves the homogeneous boundary value problem. For example, if $f(x) = \sin x$, then the displacement is $u(x) = b + \sin x$, where $b$ is an arbitrary constant, and the stress is $u'(x) = \cos x$.
11.1.12. Displacement $u$ and position $x$: meters. Strain $v = u'$: no units. Stiffness $c(x)$: N/m = kg/sec$^2$. Stress $w = c\,v$: N/m = kg/sec$^2$. External force $f(x)$: N/m$^2$ = kg/(m sec$^2$).

11.2.1. (a) 1, (b) 0, (c) e, (d) log 2, (e)


11.2.2.
(a) (x) = (x);

Z b
a

Z b

(x) u(x) dx = u(1) for a < 1 < b.

(c) (x) = 3 (x 1) + 3 (x + 1);


1
2

(f ) 0.

(x) u(x) dx = u(0) for a < 0 < b.

(b) (x) = (x 1);

(d) (x) =

1
9,

(x 1);

Z b
a

Z b
a

(x) u(x) dx = 3 u(1) + 3 u(1) for a < 1 < 1 < b.

(x) u(x) dx =

1
2

u(1) for a < 1 < b.

(e) (x) = (x) (x ) (x + );


Z b
a

(f ) (x) =

(x) u(x) dx = u(0) u() u( ) for a < < < b.


1
2

11.2.3.
(a) x (x) =

(x 1) +

1
5

(x 2);

Z b
a

(x) u(x) dx =

1
2

u(1) +

1
5

u(2) for a < 1 < 2 < b.

nx
= 0 for all x, including x = 0. Moreover, the functions
1 + n 2 x2
are all bounded in absolute value by 21 , and so the limit, although non-uniform, is to an
ordinary function.
lim

(b) h u(x) , x (x) i =

Z b
a

u(x) x (x) dx = u(0) 0 = 0 for all continuous functions u(x), and so

x (x) has the same dual effect as the zero function: h u(x) , 0 i = 0 for all u.

11.2.4.
(a) (x) = n lim

(b)

Z b
a

2
4

n
3n

2
2
1+n x
1 + n2 (x 1)2

5;

(x) u(x) dx = u(0) 3 u(1), for any interval with a < 0 < 1 < b.

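11.2.5.
(a) Using the limiting sequence (11.31),
$\delta(2x) = \lim_{n\to\infty} g_n(2x) = \lim_{n\to\infty} \dfrac{n}{\pi\,(1 + n^2 (2x)^2)} = \lim_{n\to\infty} \dfrac{n}{\pi\,(1 + (2n)^2 x^2)} = \lim_{m\to\infty} \dfrac{\frac12 m}{\pi\,(1 + m^2 x^2)} = \tfrac12 \lim_{m\to\infty} g_m(x) = \tfrac12\,\delta(x)$,
where we set $m = 2n$ in the middle step.
(b) Use the change of variables $\hat x = 2x$ in the integral
$\int_{-a}^{a} \delta(2x)\,f(x)\,dx = \tfrac12 \int_{-2a}^{2a} \delta(x)\,f\!\left(\tfrac12 x\right) dx = \tfrac12 f(0) = \int_{-a}^{a} \tfrac12\,\delta(x)\,f(x)\,dx$.
(c) $\delta(a\,x) = \dfrac{1}{a}\,\delta(x)$.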
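The scaling law for the delta function can be illustrated numerically with the approximating sequence $g_n(x) = n/(\pi(1 + n^2 x^2))$ used in part (a). The sketch below is a rough check under stated assumptions: the helper names are hypothetical, the integral is truncated to $[-1, 1]$, and a plain midpoint rule is used, so the result only approaches $\frac12 f(0)$ up to a small truncation and discretization error.

```python
import math


def g(n, x):
    """Approximating sequence g_n(x) = n / (pi (1 + n^2 x^2))."""
    return n / (math.pi * (1.0 + (n * x) ** 2))


def pair_with_scaled_delta(n, f, a=1.0, m=200_000):
    """Midpoint-rule approximation of the integral of g_n(2x) f(x) over [-a, a]."""
    h = 2.0 * a / m
    total = 0.0
    for k in range(m):
        x = -a + (k + 0.5) * h
        total += g(n, 2.0 * x) * f(x)
    return total * h


if __name__ == "__main__":
    for n in (10, 50, 200):
        # the values should approach f(0)/2 = 0.5 for f = cos
        print(n, pair_with_scaled_delta(n, math.cos))
```

The convergence is slow because $g_n$ has heavy tails; that is a property of this particular sequence, not of the identity itself.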
11.2.6.
6

8
>
>
<

(a) f (x) = (x + 1) 9 (x 3) + >


>
:

(b) g (x) = x +

1
2

1
2

2 x,
1,
0,

0 < x < 3,
1 < x < 0,
otherwise.

4
2

-1

-2

-2
-4
-6

8
>
>
>
<

1.5

cos x,

+ > cos x,
>
>
:
0,

12

< x < 0,

0 < x < 21 ,
otherwise.

1
0.5
-3

-1

-2

-0.5
-1
-1.5
4

(c) h (x) = e

8
>
>
<

(x + 1) + >
>
:

8
>
>
>
<

cos x,
2 x,
ex ,

-3

-2

-1

1,

0
(a) f (x) = > 1,
00

>
>
:

cos x,
e x ,

x < ,

-6

-4

-2

0,

< x < 0,

-4

x > 0.

-6

= (x + 1) 2 (x) + (x 1),

otherwise,

f (x) = (x + 1) 2 (x) + (x 81).


>
> 1,
0

>
<

(b) k (x) = 2 (x + 2) 2 (x 2) + > 1,


>
>
:

2 < x < 0,
0 < x < 2,

0,
otherwise,
= 2 (x + 2) 2 (x 2) (x + 2) + 2 (x) (x 2),

k00 (x) = 2 0 (x + 2) 2 0 (x 2) (x + 2) + 2 (x)


8 (x 2).
(
< 2 cos x,
sin x, 1 < x < 1,
(c) s0 (x) =
s00 (x) = :
0,
otherwise,
0,

11.2.8.
(a) f 0 (x) = sign x e | x | ,

-2

1 < x < 0,
0 < x < 1,

-2

11.2.7.
8
>
>
>
<

-4

(d) k (x) = (1 + )(x) + > 2 x,


>
>
:

x > 1,
1 < x < 1,
x < 1.

f 00 (x) = e | x | 2 (x).

1 < x < 1,
otherwise.

8
>
>
<

(b) f (x) = >


>
:

8
<

(c) f (x) = :

1
3,
1,

x < 0,
0 < x < 1, = 1 + 4 (x) 2 (x 1),
x > 1,

2 x + 1,

x > 0,

or

f 00 (x) = 4 (x) 2 (x 1).

x < 1,

2 x 1,

1 < x < 0,
8
< 2,
00
f (x) = 2 (x + 1) + 2 (x) + :
82,
<

(d) f 0 (x) = 4 (x + 2) 4 (x 2) + :

x > 0,

or

x < 1,

1 < x < 0,
1,
| x | > 2,

1, | x | < 2,
= 4 (x + 2) 4 (x 2) + 1 2 (x + 2) + 2 (x 2),

f 00 (x) = 4 0 (x + 2) 4 0 (x 2) 2 (x + 2) + 2 (x 2).
(e) f 0 (x) = sign x cos x,
f 00 (x) = 2 (x) sin | x |.
(f ) f 0 (x) = sign(sin x) cos x,

(g) f 0 (x) = 2
f 00 (x) = 2

k =

k =

f 00 (x) = | sin x | + 2

(x 2 k ) 2

k =

0 (x 2 k ) 2

n =

(x n ).

(x (2 k + 1) ),
0 (x (2 k + 1) ).

k =

11.2.9.
(a) It suffices to note that, when > 0, the product x has the same sign as x, and so
(
1, x > 0,
= (x).
( x) =
0, x < 0,
(
1, x < 0,
= 1 (x).
(b) If < 0, then ( x) =
0, x > 0,
(c) Use the chain rule. If > 0, then (x) = 0 (x) = 0 ( x) = ( x), while if < 0,
then (x) = 0 (x) = 0 ( x) = ( x).
11.2.10.

lim
n

2 2
n
en x =

0,
,

x 6= 0,
x = 0,

11.2.11.
(a) First, by the definition of mn , we have
Z `
0

Second, note that mn =

gen (x) dx =

Z `
g (n) (x) dx
0 y

Z `

lim mn = 1. Therefore, using (11.32),

lim gen (x) = n lim

(b) It suffices to note that n lim

Z `
0

Z
2 2
2
n
1
en x dx =
ey dy = 1.

gn (x y) dx
mn

= 1.

tan1 n(` y) tan1 ( n y)


, and hence

gn (x y)
=0
mn

whenever

x 6= y.

gn (x y) dx = n lim
mn = 1, as shown in part (a).


11.2.12.

1
2

(a)

1
n

n1

(b) First, n lim


g (x) = 0 for any x 6= 0 since gn (x) = 0 whenever n > 1/| x |. Moreover,
n
Z

g (x) dx
n

(x).

(c) fn (x) =

= 1, and hence the sequence satisfies (11.3233), proving nlim


g (x) =
n

Z x

g (y) dy
n

8
>
>
>
<

1
2
>
>
:

=>

x < n1 ,

0,
n x + 21 ,

| x | < n1 ,
x > n.

1,

1
n

n1

8
>
<

1
f (x) = (x) = >
Yes, since n 0 as n , the limiting function is n lim
n
:

x < 0,
x = 0,
x > 0.

1
2

(d) hn (x) = 21 n x + n1 12 n x n1 .
(e) Yes. To compute the limit, we use the dual interpretation of the delta function. Given
any C1 function u(x),
h hn , u i =
=

hn (x) u(x) dx

Z h

1
2

n x +

1
n

1
2

n x

1
n

u(x) dx =

u n1

2
n

1
n

The n limit is u0 (0), since the final expression is minus the standard centered
1
u(x + h) u(x h)
difference formula u0 (x) = lim
where h =
at x = 0. (Alternah0
2h
n
0
tively, you can use lH
opitals rule to prove this.) Thus, nlim
g (x) = 0 (x).
n
11.2.13.

1
2

(a)

n1

1
n

(b) First, n lim


g (x) = 0 for any x 6= 0 since gn (x) = 0 whenever n > 1/| x |. Moreover,
n

2
= 1 since its graph is a triangle of base n and height n. We conclude that
the sequence satisfies (11.3233), proving n lim
g (x) = (x).
n
g (x) dx
n

(c) fn (x) =

8
>
0,
>
>
>
>
>
>
< 1 + nx +
2
>
1
>
>
+ nx
>
>
> 2
>
:

1
2
1
2

2 2

n x ,
n 2 x2 ,

1,

x < n1 ,

1
n

< x < 0,

0<x<
x>

1
n,

1
n.

n1

1
n

8
>
<

1
f (x) = (x) = >
(d) Yes, since n 0 as n , the limiting function is n lim
n
:
8
>
>
>
>
<

0,

|x| >

(e) hn (x) = > n2 ,

1
n,

0
1
2

x < 0,
x = 0,
x > 0.

n1 < x < 0,

>
>
>
:

n2 , 0 < x < n1 .
(f ) Yes, using the same integration by parts argument in (11.54).

11.2.14. By duality, for any f C1 ,


D h

i
E
h
1
1
lim
n

x
+
,
f
=
lim
n
f
n
n
n
n

1
n

where we used lH
opitals rule to evaluate the limit.

11.2.15.
s(x) =

Z x

r(x) =

Z x

8
<

y (t) dt = y (x) = :

11.2.16. h y , u i =

provided u(`) = 0.

1,

y (z) dz = y (x) + y a =

s(x):

Z `

0,

f n1

= 2 f 0 (0) = 2 h 0 , f i,

x > y,
(

x < y,
y a,

x < y,

x a,

x > y,

when

r(x):

1
y (x) u(x) dx = u(y), while h y , u0 i =

Z `
y

y < a.

ay
u0 (x) dx = u(`) u(y) = u(y)

11.2.17. Use induction on the order k. Let 0 < y < `. Integrating by parts,
h y(k+1) , u i =
=

`
Z `
Z `

y(k+1) (x) u(x) dx = y(k) (x) u(x)

(k) (x)) u0 (x) dx


0
0 y
x=0
Z `

(k) (x)) u0 (x) dx = h y(k) , u i = (1)k+1 u(k+1) (y),


0 y

by the induction hypothesis. The boundary terms at x = 0 and x = ` vanish since


y(k) (x) = 0 for any x 6= y.
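11.2.18. $x\,\delta'(x) = -\delta(x)$ because they both yield the same value on a test function:
$\langle x\,\delta', u \rangle = \int_{-\infty}^{\infty} x\,\delta'(x)\,u(x)\,dx = -\big[\,x\,u(x)\,\big]'\Big|_{x=0} = -u(0) = -\int_{-\infty}^{\infty} \delta(x)\,u(x)\,dx = \langle u, -\delta \rangle$.
See Exercise 11.2.20 for the general version.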


11.2.19. The correct formulae are
$(f(x)\,\delta(x))' = f'(x)\,\delta(x) + f(x)\,\delta'(x) = f(0)\,\delta'(x)$.
Indeed, integrating by parts,
$\langle (f\delta)', u \rangle = \int (f(x)\,\delta(x))'\,u(x)\,dx = -\int f(x)\,\delta(x)\,u'(x)\,dx = -f(0)\,u'(0)$,
$\langle f\,\delta', u \rangle = \int f(x)\,\delta'(x)\,u(x)\,dx = -(f(x)\,u(x))'\Big|_{x=0} = -f(0)\,u'(0) - f'(0)\,u(0)$,
$\langle f'\,\delta, u \rangle = \int f'(x)\,\delta(x)\,u(x)\,dx = f'(0)\,u(0)$.
Adding the last two produces the first. On the other hand,
$\langle f(0)\,\delta', u \rangle = \int f(0)\,\delta'(x)\,u(x)\,dx = -(f(0)\,u(x))'\Big|_{x=0} = -f(0)\,u'(0)$
also gives the same result, proving the second formula. (See also the following exercise.)
11.2.20.
(a) For any test function,
hf 0 ,ui =
=

(b)

f (x)

(n)

(x) =

u(x) f (x) 0 (x) dx = (u(x) f (x))0

x=0

u(x) f (0) 0 (x) dx u(x) f 0 (0) (x) = h f (0) 0 f 0 (0) , u i.

n
X

(1)

i=1

11.2.21.
(a) (x) = 2 0 (x) (x),
(b) (x) = 0 (x),

= u0 (0) f (0) u(0) f 0 (0)

0 1
i @nA (i)

(0) (ni) (x).

(x) u(x) dx = 2 u0 (0) u(0);

(x) u(x) dx = u0 (0);

(c) (x) = (x 1) 4 0 (x 2) + 4 (x 2),

(x) u(x) dx = u(1) + 4 u0 (2) + 4 u(2);

(d) (x) = e1 00 (x + 1) 2 e1 0 (x + 1) + e1 (x + 1),


Z
u00 (1) + 2 u0 (1) + u(1)
(x) u(x) dx =
.

e
11.2.22. If $f(x_0) > 0$, then, by continuity, $f(x) > 0$ in some interval $|x - x_0| < \varepsilon$. But then the integral of $f$ over this interval is positive: $\int_{x_0 - \varepsilon}^{x_0 + \varepsilon} f(x)\,dx > 0$, which is a contradiction. An analogous argument shows that $f(x_0) < 0$ is also not possible. We conclude that $f(x) \equiv 0$ for all $x$.
11.2.23. Suppose there is. Let us show that y (z) = 0 for all 0 < z 6= y < `. If y (z) > 0 at
some z 6= y, then, by continuity, y (x) > 0 in a small interval 0 < z < x < z + < `.
We can further assume < | z y | so that y doesnt lie in the interval. Choose u(x) to be
a continuous function so that u(z) > 0 for z
< x < z + but u(x) = 0 for all 0 x `
(
| x z |, | x z | ,
Note that, in
such that | z x | ; for example, u(x) =
0,
otherwise.
particular, u(y) = 0. Then

Z `
0

y (x) u(x) dx =

Z z+
z

y (x) u(x) dx > 0 because we are

integrating a positive continuous function. But this contradicts (11.39) since u(y) = 0. A
similar argument shows that y (z) < 0 also leads to a contradiction. Therefore, y (x) = 0
for all 0 < x 6= y < ` and so, by continuity, y (x) = 0 for all 0 x `. But then

Z `
0

y (x) u(x) dx = 0 for all functions u(x) and so (11.39) doesnt hold if u(y) 6= 0.

11.2.24. By definition of uniform convergence, for every $\varepsilon > 0$ there exists an $n_\star$ such that $|f_n(x) - \sigma(x)| < \varepsilon$ for all $n \ge n_\star$. However, if $\varepsilon < \frac12$, then there is no such $n_\star$ since each $f_n(x)$ is continuous, but would have to satisfy $f_n(x) < \varepsilon < \frac12$ for $x < 0$ and $f_n(x) > 1 - \varepsilon > \frac12$ for $x > 0$, which is impossible for a continuous function, which, by the Intermediate Value Theorem, must assume every value in between $\varepsilon$ and $1 - \varepsilon$, [2].

11.2.25. .5 mm, by linearity and symmetry of the Green's function.


11.2.26. To determine the Green's function, we must solve the boundary value problem
$-c\,u'' = \delta(x - y)$, $u(0) = 0$, $u'(1) = 0$.
The general solution to the differential equation is
$u(x) = -\dfrac{\rho(x - y)}{c} + a\,x + b$, $u'(x) = -\dfrac{\sigma(x - y)}{c} + a$.
The integration constants $a, b$ are fixed by the boundary conditions
$u(0) = b = 0$, $u'(1) = -\dfrac{1}{c} + a = 0$.
Therefore, the Green's function for this problem is
$G(x,y) = \begin{cases} x/c, & x \le y,\\ y/c, & x \ge y. \end{cases}$
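As a quick illustration of how such a Green's function is used, the superposition formula $u(x) = \int_0^1 G(x,y)\,f(y)\,dy$ can be evaluated numerically and compared with closed-form solutions of $-c\,u'' = f$, $u(0) = 0$, $u'(1) = 0$. This is a minimal sketch with hypothetical helper names, assuming $G(x,y) = \min(x,y)/c$ as found above and a simple midpoint rule.

```python
def green(x, y, c=1.0):
    """Green's function G(x, y) = min(x, y) / c for -c u'' = delta(x - y),
    u(0) = 0, u'(1) = 0."""
    return min(x, y) / c


def solve_by_superposition(f, x, c=1.0, m=2000):
    """Midpoint-rule approximation of u(x) = integral_0^1 G(x, y) f(y) dy."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        y = (k + 0.5) * h
        total += green(x, y, c) * f(y)
    return total * h


if __name__ == "__main__":
    # For f = 1 the exact solution is u(x) = (x - x^2/2) / c.
    for x in (0.25, 0.5, 1.0):
        print(x, solve_by_superposition(lambda y: 1.0, x), x - 0.5 * x * x)
```

For constant forcing the midpoint rule is essentially exact here when the kink of $G$ at $y = x$ falls on a cell boundary, which is why the sample points are chosen as multiples of the step size.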
8
< 1
8
: 1
8

11.2.27. (a) G(x, y) =

x(4 y),

x y,

x,

x y,

(c) The free


y(4 x), x y.
y, x y.
boundary value problem is not positive definite, and so there is not a unique solution.

11.2.28.
(a) $G(x,y) = \begin{cases} \left( 1 - \dfrac{\log(1+y)}{\log 2} \right) \log(1+x), & x < y,\\[1ex] \log(1+y) \left( 1 - \dfrac{\log(1+x)}{\log 2} \right), & x > y. \end{cases}$
(b) $u(x) = \int_0^1 G(x,y)\,dy = \int_0^x \left( 1 - \dfrac{\log(1+x)}{\log 2} \right) \log(1+y)\,dy + \int_x^1 \log(1+x) \left( 1 - \dfrac{\log(1+y)}{\log 2} \right) dy = \dfrac{\log(x+1)}{\log 2} - x$.

11.2.29.
(a) $u(x) = \frac{9}{16}\,x - \frac12 x^2 + \frac{3}{16}\,x^3 - \frac14 x^4$, $w(x) = \dfrac{u'(x)}{1 + x^2} = \frac{9}{16} - x$;
(b) $G(x,y) = \begin{cases} \left( 1 - \frac34 y - \frac14 y^3 \right)\left( x + \frac13 x^3 \right), & x < y,\\ \left( 1 - \frac34 x - \frac14 x^3 \right)\left( y + \frac13 y^3 \right), & x > y. \end{cases}$
(c) $u(x) = \int_0^1 G(x,y)\,dy = \int_0^x \left( 1 - \frac34 x - \frac14 x^3 \right)\left( y + \frac13 y^3 \right) dy + \int_x^1 \left( 1 - \frac34 y - \frac14 y^3 \right)\left( x + \frac13 x^3 \right) dy = \frac{9}{16}\,x - \frac12 x^2 + \frac{3}{16}\,x^3 - \frac14 x^4$.
(d) Under an impulse force at $x = y$, the maximal displacement is at the forcing point, namely $g(x) = G(x,x) = x - \frac34 x^2 + \frac13 x^3 - \frac12 x^4 - \frac{1}{12} x^6$. The maximum value of $g(x_\star) = \frac13$ occurs at the solution $x_\star = \sqrt[3]{1 + \sqrt2} - \dfrac{1}{\sqrt[3]{1 + \sqrt2}} = .596072$ to $g'(x) = 1 - \frac32 x + x^2 - 2x^3 - \frac12 x^5 = 0$.

11.2.30.
(a) $G(x,y) = \begin{cases} (1+y)\,x, & x < y,\\ y\,(1+x), & x > y; \end{cases}$
(b) all of them;
(c) $u(x) = \int_0^1 G(x,y)\,f(y)\,dy = \int_0^x y\,(1+x)\,f(y)\,dy + \int_x^1 (1+y)\,x\,f(y)\,dy$,
$u'(x) = x\,(1+x)\,f(x) + \int_0^x y\,f(y)\,dy - (1+x)\,x\,f(x) + \int_x^1 (1+y)\,f(y)\,dy = \int_0^x y\,f(y)\,dy + \int_x^1 (1+y)\,f(y)\,dy$,
$u''(x) = x\,f(x) - (1+x)\,f(x) = -f(x)$.
(d) The boundary value problem is not positive definite (the function $u(x) = x$ solves the homogeneous problem), and so there is no Green's function.
11.2.31.

8
>
>
>
>
<

x(y 1),

1
4n
>
>
>
:

(a) un (x) = >

1
2

x+

1
4

n x2

1
2

y+ 1

x(1 y),

1
2

n xy +

1
4

n y2 ,

0xy

1
n

|x y|

y+

1
u (x)
n , we have n lim
n

2
1
= y y = G(y, y).
4n

(b) Since un (x) = G(x, y) for all | x y |

1
n

1
n

x 1.

= G(x, y) for all x 6= y,

(Or one can appeal to


while n lim
u (y) = n lim
y y+
n

continuity to infer this.) This limit reflects the fact that the external forces converge to
the delta function: n lim
f (x) = (x y).
n
0.2

0.4

0.6

0.8

-0.05

(c)

-0.1
-0.15
-0.2

11.2.32. Use formula (11.64) to compute
$u(x) = \dfrac1c \int_0^x y\,f(y)\,dy + \dfrac1c \int_x^1 x\,f(y)\,dy$,
$u'(x) = \dfrac1c\,x\,f(x) - \dfrac1c\,x\,f(x) + \dfrac1c \int_x^1 f(y)\,dy = \dfrac1c \int_x^1 f(y)\,dy$,
$u''(x) = -\dfrac1c\,f(x)$.
Moreover, $u(0) = \dfrac1c \int_0^0 y\,f(y)\,dy = 0$ and $u'(1) = \dfrac1c \int_1^1 f(y)\,dy = 0$.

11.2.33.
(a) The $j$th column of $G$ is the solution to the linear system $K u = e_j/\Delta x$, corresponding to a force of magnitude $1/\Delta x$ concentrated on the $j$th mass. The total force on the chain is 1, since the force only acts over a distance $\Delta x$, and so the forcing function represents a concentrated unit impulse, i.e., a delta function, at the sample point $y_j = j/n$. Thus, for $n \gg 0$, the solution should approximate the sampled values of the Green's function of the limiting solution.
(b) In fact, in this case, the entries of $G$ are exactly equal to the sample values of the Green's function, and, at least in this very simple case, no limiting procedure is required.

11.2.34. Yes. For a system of $n$ masses connected to both top and bottom supports by $n + 1$ springs, the spring lengths are $\Delta x = \dfrac{1}{n+1}$, and we rescale the incidence matrix $A$ by dividing by $\Delta x$, and set $K = A^T A$. Again, the $j$th column of $G = K^{-1}/\Delta x$ represents the response of the system to a concentrated unit force on the $j$th mass, and so its entries approximate $G(x_i, y_j)$, where $G(x,y)$ is the Green's function (11.59) for $c = 1$. Again, in this specific case, the matrix entries are, in fact, exactly equal to the sampled values of the continuum Green's function.

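11.2.35. Set $I(x, z, w) = \int_z^w F(x,y)\,dy$, so
$\dfrac{\partial I}{\partial x} = \int_z^w \dfrac{\partial F}{\partial x}(x,y)\,dy$, $\dfrac{\partial I}{\partial z} = -F(x,z)$, $\dfrac{\partial I}{\partial w} = F(x,w)$,
by the Fundamental Theorem of Calculus. Thus, by the multivariable chain rule,
$\dfrac{d}{dx} \int_{\alpha(x)}^{\beta(x)} F(x,y)\,dy = \dfrac{d}{dx}\,I(x, \alpha(x), \beta(x)) = \dfrac{\partial I}{\partial x} + \dfrac{\partial I}{\partial z}\,\dfrac{d\alpha}{dx} + \dfrac{\partial I}{\partial w}\,\dfrac{d\beta}{dx}$
$= F(x, \beta(x))\,\dfrac{d\beta}{dx} - F(x, \alpha(x))\,\dfrac{d\alpha}{dx} + \int_{\alpha(x)}^{\beta(x)} \dfrac{\partial F}{\partial x}(x,y)\,dy$.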
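The differentiation rule for an integral with variable limits can be sanity-checked numerically. The sketch below compares a centered finite difference of $I(x) = \int_{\alpha(x)}^{\beta(x)} F(x,y)\,dy$ with the right-hand side of the formula. The choices $F(x,y) = x\,y^2$, $\alpha(x) = x$, $\beta(x) = x^2$ are illustrative assumptions only, as are the helper names; Simpson's rule is exact for this integrand.

```python
def simpson(g, a, b, m=200):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    s = g(a) + g(b)
    for k in range(1, m):
        s += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return s * h / 3.0


def integral_with_moving_limits(F, alpha, beta, x):
    """I(x) = integral of F(x, y) dy from alpha(x) to beta(x)."""
    return simpson(lambda y: F(x, y), alpha(x), beta(x))


def leibniz_rhs(F, Fx, alpha, beta, dalpha, dbeta, x):
    """F(x, beta) beta' - F(x, alpha) alpha' + integral of dF/dx."""
    inner = simpson(lambda y: Fx(x, y), alpha(x), beta(x))
    return F(x, beta(x)) * dbeta(x) - F(x, alpha(x)) * dalpha(x) + inner


if __name__ == "__main__":
    F = lambda x, y: x * y * y          # illustrative integrand
    Fx = lambda x, y: y * y             # its partial derivative in x
    alpha, dalpha = (lambda x: x), (lambda x: 1.0)
    beta, dbeta = (lambda x: x * x), (lambda x: 2.0 * x)
    x, h = 1.5, 1e-4
    fd = (integral_with_moving_limits(F, alpha, beta, x + h)
          - integral_with_moving_limits(F, alpha, beta, x - h)) / (2.0 * h)
    print(fd, leibniz_rhs(F, Fx, alpha, beta, dalpha, dbeta, x))
```

For this example $I(x) = \frac13 (x^7 - x^4)$, so both numbers should be close to $\frac13 (7x^6 - 4x^3)$.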

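11.3.1. (a) Solution: $u_\star(x) = \frac52 x - \frac52 x^2$.
(b) $P[u_\star] = -\frac{25}{24} = -1.04167$,
(i) $P[\,x - x^2\,] = -\frac23 = -.66667$, (ii) $P[\,\frac32 x - \frac32 x^3\,] = -\frac{39}{40} = -.975$,
(iii) $P[\,\frac23 \sin \pi x\,] = -\frac{20}{3\pi} + \frac19 \pi^2 = -1.02544$, (iv) $P[\,x^2 - x^4\,] = -\frac{16}{35} = -.45714$;
all are larger than the minimum $P[u_\star]$.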
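The polynomial values above can be reproduced in exact rational arithmetic. The sketch below assumes the functional is $P[u] = \int_0^1 \big[\frac12 (u')^2 - 5u\big]\,dx$ with $u(0) = u(1) = 0$; this form is inferred from the stated minimizer and values rather than quoted from the exercise, and the helper names are hypothetical. Polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction as Fr


def deriv(p):
    """Derivative of a polynomial given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:]


def mul(p, q):
    """Product of two polynomials."""
    out = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def integral01(p):
    """Exact integral of a polynomial over [0, 1]."""
    return sum(Fr(c) / (k + 1) for k, c in enumerate(p))


def P(u):
    """Assumed functional P[u] = int_0^1 (u'^2 / 2 - 5 u) dx."""
    du = deriv(u)
    return Fr(1, 2) * integral01(mul(du, du)) - 5 * integral01(u)


if __name__ == "__main__":
    trials = {
        "u*": [0, Fr(5, 2), Fr(-5, 2)],
        "x - x^2": [0, 1, -1],
        "3/2 x - 3/2 x^3": [0, Fr(3, 2), 0, Fr(-3, 2)],
        "x^2 - x^4": [0, 0, 1, 0, -1],
    }
    for name, u in trials.items():
        print(name, P(u), float(P(u)))
```

The trigonometric trial function in (iii) is left out because it is not a polynomial; its value follows from $\int_0^1 \cos^2 \pi x\,dx = \frac12$ and $\int_0^1 \sin \pi x\,dx = \frac{2}{\pi}$.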

11.3.2. (i) u? (x) =


(iii) P[ u? ] =

P[ c x c x2 ] =

11.3.3.
(a) (i) u? (x) =
(ii)
(iii)
(iv )
(b) (i)

1
6x
1
90

1
6

1
c 96

5
x4 36
,
3
(u0 )2
2 5
P[ u ] =
+ x u dx, u(1) = u(1) = 0,
2 (x2 + 1)
P[hu? ] = .0282187,
i
h
i
P 51 (1 x2 ) = .018997, P 51 cos 12 x = .0150593.
u? (x) = 21 e 1 + e x1 21 e 2 x ,

(ii) P[ u ] =

1 6
18 x2
Z 1
4
1

Z 1h
0

1
12

1 x 0 2
2 e (u )

1
5

sin 21 x) = .0354279.

Z 2h

i
2
1 2 0 2
x
(u
)

3
x
u
dx, u0 (1) = u(2) = 0,
2
1
37
P[ u? ] = 20
= 1.85,
h
i
2
1
P[ 2 x x ] = 11
6 = 1.83333, P cos 2 (x 1) = 1.84534.
u? (x) = 31 x + 79 94 x2 ,
Z 1 h
i
12 x3 (u0 )2 x2 u dx, u(2) = u(1) = 0.
P[ u ] =
2

(ii) P[ u ] =

(ii)

e x u dx, u(0) = u0 (1) = 0,

(iii) P[hu? ] = .0420967,


i
(iv ) P 25 x 15 x2 ) = .0386508,
(c) (i) u? (x) = 25 x1 21 x2 ,
(iii)
(iv )
(d) (i)

i
1
(u0 )2 x u dx, u(0) =
2
0
2
1
P[ c x c x3 ] = 52 c2 15
c > 90
2
2
1

u(1) = 0,

for c 6= 16 ,
1
= .01042, P[ c sin x ] = 4 c c 4 = .01027.

= .01111, (iv )

1 2
1
6 c 12

Z 1h

x3 , (ii) P[ u ] =


13
(iii) P[hu? ] = 216
= .060185,
i
1
(iv ) P 4 (x + 1)(x + 2) = .0536458,

3
P 20
x(x + 1)(x + 2) = .0457634.

11.3.4.
(a) Boundary value problem: $-u'' = 3$, $u(0) = u(1) = 0$;
solution: $u_\star(x) = -\frac32 x^2 + \frac32 x$.
(b) Boundary value problem: $-\big((x+1)\,u'\big)' = 5$, $u(0) = u(1) = 0$;
solution: $u_\star(x) = \dfrac{5}{\log 2}\,\log(x+1) - 5x$.
(c) Boundary value problem: $-2\,(x\,u')' = -2$, $u(1) = u(3) = 0$;
solution: $u_\star(x) = x - 1 - \dfrac{2}{\log 3}\,\log x$.
(d) Boundary value problem: $-(e^x u')' = 1 + e^x$, $u(0) = u(1) = 0$;
solution: $u_\star(x) = (x - 1)\,e^{-x} - x + 1$.
(e) Boundary value problem: $-2\,\dfrac{d}{dx}\left( \dfrac{1}{1+x^2}\,\dfrac{du}{dx} \right) = -\dfrac{x}{(x^2+1)^2}$, $u(-1) = u(1) = 0$;
solution: $u_\star(x) = -\frac{1}{16}\,x + \frac{1}{16}\,x^3$.

11.3.5.
1
3
log x
(a) Unique minimizer: u? (x) = x2 2 x + +
.
2
2
2 log 2
(b) No minimizer since x is not positive for all < x < .
!
cos 1(1 + sin x)
log
1
(1 + sin 1) cos x
!
(c) Unique minimizer: u? (x) = (x 1)
.
1 + sin 1
2
log
1 sin 1
(d) No minimizer since 1 x2 is not positive for all 2 < x < 2.
(e) Not a unique minimizer due to Neumann boundary conditions.
11.3.6. Yes: any function of the form $u(x) = a + \frac14 x^2 - \frac16 x^3$ satisfies the Neumann boundary value problem $-u'' = x - \frac12$, $u'(0) = u'(1) = 0$. Note that the right hand side satisfies the Fredholm condition $\langle 1, x - \frac12 \rangle = \int_0^1 \left( x - \frac12 \right) dx = 0$; otherwise the boundary value problem would have no solution.

11.3.7. Arguing as in Example 11.3, any minimum must satisfy the Neumann boundary value problem $-(c(x)\,u')' = f(x)$, $u'(0) = 0$, $u'(1) = 0$. The general solution to the differential equation is $u(x) = a\,x + b - \int_0^x \dfrac{1}{c(y)} \left( \int_0^y f(z)\,dz \right) dy$. Since $u'(x) = a - \dfrac{1}{c(x)} \int_0^x f(z)\,dz$, the first boundary condition requires $a = 0$. The second boundary condition requires $u'(1) = -\dfrac{1}{c(1)} \int_0^1 f(x)\,dx = 0$. The mean zero condition is both necessary and sufficient for the boundary value problem to have a (non-unique) solution, and hence the functional to achieve its minimum value. Note that all solutions to the boundary value problem give the same value for the functional, since $P[u + b] = P[u] - b \int_0^1 f(x)\,dx = P[u]$.

11.3.8. $\frac12 \|D[u]\|^2 = \frac12 \|v\|^2 = \int_0^\ell \frac12\,c(x)\,v(x)^2\,dx = \int_0^\ell \frac12\,v(x)\,w(x)\,dx = \frac12 \langle v, w \rangle$.

11.3.9. According to (11.91–92) (or, using a direct integration by parts) $P[u] = \frac12 \langle K[u], u \rangle - \langle u, f \rangle$, so when $K[u_\star] = f$ is the minimizer, $P[u_\star] = -\frac12 \langle u_\star, f \rangle = -\frac12 \langle u_\star, K[u_\star] \rangle < 0$ since $K$ is positive definite and $u_\star$ is not the zero function when $f \not\equiv 0$.


11.3.10. Yes. Same argument
11.3.11. $u(x) = x^2$ satisfies $\int_0^1 u''(x)\,u(x)\,dx = \frac23$. Positivity of $-\int_0^1 u''(x)\,u(x)\,dx$ holds only for functions that satisfy the boundary conditions $u(0) = u(1)$.

11.3.12. No. The boundary terms still vanish and so the integration by parts identity continues to hold, but now $u(x) = a$ constant makes $-\int_0^\ell u''(x)\,u(x)\,dx = 0$. The integral is $\ge 0$ for all functions satisfying the Neumann boundary conditions.
11.3.13. hh I [ u ] , v ii =
I [ v ] =

Z `
0

u(x) v(x) c(x) dx =

Z `
0

u(x) v(x) (x) dx = h u , I [ v ] i, provided

c(x)
v(x) is a multiplication operator.
(x)
Z `

c(x) u(x) v(x) dx =


11.3.14. h K[ u ] , v i =
0
conditions are required.

Z `
0

u(x) c(x) v(x) dx = h u , K[ v ] i. No boundary

11.3.15. Integrating by parts,


Z `
Z `
i
h
i
du
d h
hh D[ u ] , v ii =
v(x) c(x) dx
v(x) c(x) dx = u(`) c(`) v(`) u(0) c(0) v(0)
u(x)
0 dx
0
dx
!
*
+
Z `
h
i
1
d
1 d(v c)
u(x)
v(x) c(x)
(x) dx = u ,
=
= h u , D [ v ] i,
0
(x) dx
dx
provided the boundary terms vanish:
u(`) c(`) v(`) u(0) c(0) v(0) = 0.

Therefore,

i
c(x) dv
c0 (x)
1
d h
c(x) v(x) =

v(x).
(x) dx
(x) dx
(x)
Note that we have the same boundary terms as in (11.87), and so all of our self-adjoint
boundary conditions continue to be valid. The self-adjoint boundary value problem K[ u ] =
D D[ u ] = f is now given by
1 d
du
K[ u ] =
c(x)
= f (x),
(x) dx
dx

D [ v(x) ] =

along with the selected boundary conditions, e.g., Dirichlet conditions u(0) = u(`) = 0.
11.3.16.
(a) Integrating the first term by parts, we find
hh L[ u ] , v ii =

Z 1h
i
u0 (x) v(x) + 2 x u(x) v(x) dx
0
Z 1h
h
i

= u(1) v(1) u(0) v(0) +

= h u , v + 2 x v i = h u , L [ v ] i,
0

u(x) v 0 (x) + 2 x u(x) v(x) dx

owing to the boundary conditions. Therefore, the adjoint operator is L [ v ] = v 0 +2 x v.


(b) The operator K is positive definite because ker L = {0}. Indeed, the solution to L[ u ] = 0
2
is u(x) = c e x and either boundary condition implies c = 0.
(c) K[ u ] = ( D + 2 x)(D + 2 x)u = u00 + (4 x2 2)u = f (x), u(0) = u(1) = 0.
2
2
(d) The general solution to the differential equation ( D + 2 x)v = ex is v(x) = (b x)ex ,

and so the solution to ( D + 2 x)(D + 2 x)u = ex is found by solving


2

(D + 2 x)u = (b x)ex ,

Imposing the boundary conditions, we find


u(x) =

x2

e
4

u(x) = 41 ex + a e x + b e x

so
x2

Z x
2
e2 y
0
Z 1
2 y2
e
dy
0

(e4 1) e x
4

Z x
2
e2 y
0

dy.

dy
.

(e) Because the boundary terms u(1) v(1) u(0) v(0) in the integration by parts argument are not zero. Indeed, we should identify v(x) = u0 (x) + 2 x u(x), and so the correct form of the free boundary conditions v(0) = v(1) = 0 requires u0 (0) = u0 (1) +
2 u(1) = 0. On the other hand, although it is not self-adjoint and so doesnt admit a
minimization principle, the free boundary value problem does have a unique solution:

2
u(x) = 41 e2 1 + e x .

11.3.17.
(a) a(x) u00 + b(x) u0 = (c(x) u0 )0 = c(x) u00 c0 (x) u0 if and only if a = c and
b = c 0 = a0 .
!
Z
0
b a0
b a0
0
0
0
(b) We require b = ( a) = a + a , and so
=
. Hence, = exp
dx

a
a
is found by one integration.
(c) (i) No integrating factor needed: (x2 u0 )0 = x 1;
(ii) no integrating factor needed: (ex u0 )0 = e2 x ;
(iii) integrating factor (x) = e2 x , so (e2 x u0 )0 = e2 x ;
(iv ) integrating factor (x) = x2 , so (x3 u0 )0 = x3 ; !
1
1
u0
(v ) integrating factor (x) =
,
so

=
.
2
cos x
cos x
cos x
11.3.18.
(a) Integrating the first term by parts and using the boundary conditions,
$\langle\langle L[u], v \rangle\rangle = \int_0^1 \big[\,u'\,v_1 + u\,v_2\,\big]\,dx = \int_0^1 u\,\big[ -v_1' + v_2 \big]\,dx = \langle u, L^*[v] \rangle$,
and so the adjoint operator is $L^*[v] = -v_1' + v_2$.
(b) $-u'' + u = x - 1$, $u(0) = u(1) = 0$, with solution $u(x) = x - 1 + \dfrac{e^{2-x} - e^{x}}{e^2 - 1}$.

11.3.19. Use integration by parts:
$\langle L[u], v \rangle = \int_0^{2\pi} i\,u'(x)\,\overline{v(x)}\,dx = -\int_0^{2\pi} i\,u(x)\,\overline{v'(x)}\,dx = \int_0^{2\pi} u(x)\,\overline{i\,v'(x)}\,dx = \langle u, L[v] \rangle$,
where the boundary terms vanish because both $u$ and $v$ are assumed to be $2\pi$ periodic.
11.3.20. Quadratic polynomials do not, in general, satisfy any of the allowable boundary conditions, and so, in this situation, the boundary terms will contribute to the computation of the adjoint.

11.3.21. Solving the corresponding boundary value problem


d
du
x3 1
2 log x

(x
) = x2 , u(1) = 0, u(2) = 1, gives u(x) =

.
dx
dx
9
9 log 2
11.3.22.
(a) (i) $-u'' = -1$, $u(0) = 2$, $u(1) = 3$; (ii) $u_\star(x) = \frac12 x^2 + \frac12 x + 2$.
(b) (i) $-2\,(x\,u')' = 2x$, $u(1) = 1$, $u(e) = 1$; (ii) $u_\star(x) = \frac54 - \frac14 x^2 + \frac14 (e^2 - 1) \log x$.
(c) (i) $-\dfrac{d}{dx}\left( \dfrac{1}{1+x^2}\,\dfrac{du}{dx} \right) = 0$, $u(-1) = 1$, $u(1) = -1$; (ii) $u_\star(x) = -\frac34 x - \frac14 x^3$.
(d) (i) (e x u0 )0 = 1, u(0) = 1, u(1) = 0; (ii) u? (x) = (x 1) e x .
11.3.23.
(a) (i)
(b) (i)

Z h

0
Z 1

1
2
2
4

(u0 )2 (cos x) u dx, u(0) = 1, u() = 2;


3

(u0 )2
x2 u 5 dx, u(1) = 1, u(1) = 1;
2(x2 + 1)

1 6
(ii) u? (x) = 18
x
Z 1h

1
12
i

x4 +

1
4

x3 +

3
4

u(x) = (1 x) + x +

= (1 x) + x +
11.3.25.
(a)

(b)
(c)
(d)

"

0
Z x
0

x
.

5
36 .

1 x 0 2
2 e (u ) + u dx, u(0) = 1, u(1) = 0;
0
Z 1 h
i
1 2 0 2
(d) (i)
x (u ) (x2 x) u dx, u(1) = 1,
2
1
5
(ii) u? (x) = 16 x2 + 21 x + 11
3 x.
h
i
e (x) = u(x) (1 x) + x
11.3.24. The function u
e 00 = f, u
e (0) = u
e (1) = 0, and so is
conditions u
Z 1
e (x, y) =
u
f (y) G(x, y) dy. Therefore,
0
Z 1

(c) (i)

(ii) u? (x) = cos x

(ii) u? (x) = e x (1 x).


u(3) = 2;

satisfies the homogeneous boundary


given by the superposition formula

f (y) G(x, y) dy
(1 x) y f (y) dy +

Z 1
x

x (1 y) f (y) dy.

du
d
(1 + x)
= 1 x, u(0) = 0, u(1) = .01;

dx
dx
solution:0 u? (x) = .25 x2 1.5 x + 1.8178
log(1 + x);
1
!2
Z 1
du
@ 1 (1 + x)
P[ u ] =
(1 x) u A dx;
2
0
dx
P[ u? ] = .0102;
When u(x) = .1 x + (x x2 ), then P[ u ] = .25 2 .085 .00159, with minimum value
.0070535 at = .17. When u(x) = .1 x + sin x, then P[ u ] = 3.7011 2 .3247
.0087, with minimum value .0087 at = .0439.

e (x) + h(x) where h(x) is any


11.3.26. We proceed as in the Dirichlet case, setting u(x) = u
function that satisfies the mixed boundary conditions h(0) = , h0 (`) = , e.g., h(x) =
+ x. The same calculation leads to (11.105), which now is
hh u0 , h0 ii h u , K[ h ] i = c(`) h0 (`) u(`) c(0) h0 (0) = c(`) u(`) c(0) h0 (0) .
The second term, C1 = c(0) h0 (0) , doesnt depend on u. Therefore,
e ] = c(`) u(`) + P[ u ] C + C ,
P[ u
1
0
e (x) minimizes P[ u
e ] if and only if u(x) = u
e (x) + h(x) minimizes the functional
and so u
(11.106).

d
11.3.27. Solving the boundary value problem
dx
1 3
17
conditions gives u(x) = 18
x + 18
43 log x.

11.3.28. For any > 0, the functions u(x) =

8
<
:

du
x
dx

= 12 x2 with the given boundary

0,
1
2 (x 1 + )


0 x 1 ,

1 x 1,

satisfy the

boundary conditions, but J[ u ] = 31 , and hence J[ u ] can be made as close to 0 as desired.


However, if u(x) is continuous, J[ u ] = 0 if and only if u(x) = c is a constant function. But
no constant function satisfies both boundary conditions, and hence J[ u ] has no minimum
value when subject to the boundary conditions.
11.3.29. The extra boundary terms serve to cancel those arising during the integration by parts
computation:
b u] =
P[

Z bh

a
Z bh
a

1 0 2
2 (u )

0
0
+ u00
? u dx u? (b)u(b) + u? (a)u(a)

1 0 2
2 (u )

u0? u0 dx =

Z b

1 0
(u
a 2

u0? )2 dx +

Z b

1 0 2
(u ) dx.
a 2 ?

Again, the minimum occurs when u = u? , but is no longer unique since we can add in any
constant function, which solves the associated homogeneous boundary value problem.

11.4.1.
(a) $u(x) = \frac{1}{24} x^4 - \frac{1}{12} x^3 + \frac{1}{24} x$, $w(x) = u''(x) = \frac12 x^2 - \frac12 x$.
(b) no equilibrium solution.
(c) $u(x) = \frac{1}{24} x^4 - \frac16 x^3 + \frac13 x$, $w(x) = u''(x) = \frac12 x^2 - x$.
(d) $u(x) = \frac{1}{24} x^4 - \frac16 x^3 + \frac14 x^2$, $w(x) = u''(x) = \frac12 x^2 - x + \frac12$.
(e) $u(x) = \frac{1}{24} x^4 - \frac16 x^3 + \frac16 x^2$, $w(x) = u''(x) = \frac12 x^2 - x + \frac13$.
11.4.2.
(a) Maximal displacement: $u\!\left(\frac12\right) = \frac{5}{384} = .01302$; maximal stress: $w\!\left(\frac12\right) = -\frac18 = -.125$;
(b) no solution;
(c) Maximal displacement: $u(1) = \frac{5}{24} = .2083$; maximal stress: $w(1) = -\frac12 = -.5$;
(d) Maximal displacement: $u(1) = \frac18 = .125$; maximal stress: $w(0) = \frac12 = .5$;
(e) Maximal displacement: $u(1) = \frac{1}{24} = .04167$; maximal stress: $w(0) = \frac13 = .3333$.
Thus, case (c) has the largest displacement, while cases (c,d) have the largest stress.
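The displacement and stress values for the beam problems can be recomputed exactly from the quartic solutions $u(x)$ with $u'''' = 1$ and $w = u''$. The following is a small sketch in rational arithmetic; the coefficient lists restate the polynomials for cases (a), (c), (d), (e), and the dictionary and helper names are illustrative assumptions.

```python
from fractions import Fraction as Fr


def evaluate(p, x):
    """Value of the polynomial [c0, c1, c2, ...] at x."""
    return sum(c * x ** k for k, c in enumerate(p))


def deriv(p):
    """Derivative of a polynomial given by its coefficient list."""
    return [k * c for k, c in enumerate(p)][1:]


# Coefficients of u(x), lowest degree first.
beams = {
    "a": [0, Fr(1, 24), 0, Fr(-1, 12), Fr(1, 24)],   # x/24 - x^3/12 + x^4/24
    "c": [0, Fr(1, 3), 0, Fr(-1, 6), Fr(1, 24)],     # x/3 - x^3/6 + x^4/24
    "d": [0, 0, Fr(1, 4), Fr(-1, 6), Fr(1, 24)],     # x^2/4 - x^3/6 + x^4/24
    "e": [0, 0, Fr(1, 6), Fr(-1, 6), Fr(1, 24)],     # x^2/6 - x^3/6 + x^4/24
}

# Points where the maximal displacement and stress are reported.
points = {"a": (Fr(1, 2), Fr(1, 2)), "c": (1, 1), "d": (1, 0), "e": (1, 0)}

if __name__ == "__main__":
    for case, u in beams.items():
        w = deriv(deriv(u))                           # stress w = u''
        xu, xw = points[case]
        print(case, evaluate(u, Fr(xu)), evaluate(w, Fr(xw)))
```

The signs of the stress values come out of $w = u''$ directly; only their magnitudes matter when comparing the cases.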
11.4.3. Except in (b), which has no minimization principle, we minimize
$P[u] = \int_0^1 \left[ \frac12\,u''(x)^2 - u(x) \right] dx$
subject to the boundary conditions
(a) $u(0) = u''(0) = u(1) = u''(1) = 0$.
(c) $u(0) = u''(0) = u'(1) = u'''(1) = 0$.
(d) $u(0) = u'(0) = u''(1) = u'''(1) = 0$.
(e) $u(0) = u'(0) = u'(1) = u'''(1) = 0$.

11.4.4.
(a) (i) $G(x,y) = \begin{cases} \frac13 xy - \frac16 x^3 - \frac12 x y^2 + \frac16 x^3 y + \frac16 x y^3, & x < y,\\ \frac13 xy - \frac12 x^2 y - \frac16 y^3 + \frac16 x^3 y + \frac16 x y^3, & x > y; \end{cases}$
(ii) [graph omitted];
(iii) $u(x) = \int_0^x \left( \frac13 xy - \frac12 x^2 y - \frac16 y^3 + \frac16 x^3 y + \frac16 x y^3 \right) f(y)\,dy + \int_x^1 \left( \frac13 xy - \frac16 x^3 - \frac12 x y^2 + \frac16 x^3 y + \frac16 x y^3 \right) f(y)\,dy$;
(iv) maximal displacement at $x = \frac12$ with $G\!\left(\frac12, \frac12\right) = \frac{1}{48}$.
(b) no Green's function.
(c) (i) $G(x,y) = \begin{cases} xy - \frac16 x^3 - \frac12 x y^2, & x < y,\\ xy - \frac12 x^2 y - \frac16 y^3, & x > y; \end{cases}$
(ii) [graph omitted];
(iii) $u(x) = \int_0^x \left( xy - \frac12 x^2 y - \frac16 y^3 \right) f(y)\,dy + \int_x^1 \left( xy - \frac16 x^3 - \frac12 x y^2 \right) f(y)\,dy$;
(iv) maximal displacement at $x = 1$ with $G(1,1) = \frac13$.
(d) (i) $G(x,y) = \begin{cases} -\frac16 x^3 + \frac12 x^2 y, & x < y,\\ \frac12 x y^2 - \frac16 y^3, & x > y; \end{cases}$
(ii) [graph omitted];
(iii) $u(x) = \int_0^x \left( \frac12 x y^2 - \frac16 y^3 \right) f(y)\,dy + \int_x^1 \left( -\frac16 x^3 + \frac12 x^2 y \right) f(y)\,dy$;
(iv) maximal displacement at $x = 1$ with $G(1,1) = \frac13$.
(e) (i) $G(x,y) = \begin{cases} -\frac16 x^3 + \frac12 x^2 y - \frac14 x^2 y^2, & x < y,\\ \frac12 x y^2 - \frac16 y^3 - \frac14 x^2 y^2, & x > y; \end{cases}$
(ii) [graph omitted];
(iii) $u(x) = \int_0^x \left( \frac12 x y^2 - \frac16 y^3 - \frac14 x^2 y^2 \right) f(y)\,dy + \int_x^1 \left( -\frac16 x^3 + \frac12 x^2 y - \frac14 x^2 y^2 \right) f(y)\,dy$;
(iv) maximal displacement at $x = 1$ with $G(1,1) = \frac{1}{12}$.

11.4.5. The boundary value problem for the bar is
$-u'' = f$, $u(0) = u(\ell) = 0$, with solution $u(x) = \frac12\,f\,x\,(\ell - x)$.
The maximal displacement is at the midpoint, with $u\!\left(\frac12 \ell\right) = \frac18\,f\,\ell^2$. The boundary value problem for the beam is
$u'''' = f$, $u(0) = u'(0) = u(\ell) = u'(\ell) = 0$, with solution $u(x) = \frac{1}{24}\,f\,x^2 (\ell - x)^2$.
The maximal displacement is at the midpoint, with $u\!\left(\frac12 \ell\right) = \frac{1}{384}\,f\,\ell^4$. The beam displacement is greater than the bar displacement if and only if $\ell > 4\sqrt3$.

11.4.6. The strain is $v(x) = \dfrac{x^2 - x}{2\,(1 + x^2)}$, with stress $w(x) = (1 + x^2)\,v(x) = \tfrac12 (x^2 - x)$, which is obtained by solving $w'' = 1$ subject to boundary conditions $w(0) = w(1) = 0$. The problem is statically determinate because the boundary conditions uniquely determine $w(x)$ and $v(x)$ without having to first find the displacement $u(x)$ (which is rather complicated).

11.4.7.
(a) The differential equation is $u'''' = f$. Integrating once, the boundary conditions $u'''(0) = u'''(1) = 0$ imply
$$u'''(x) = \int_0^x f(y)\,dy \qquad \text{with} \qquad \int_0^1 f(y)\,dy = 0.$$
Integrating again, the other boundary conditions $u''(0) = u''(1) = 0$ imply that
$$u''(x) = \int_0^x \left( \int_0^z f(y)\,dy \right) dz \qquad \text{with} \qquad \int_0^1 \left( \int_0^z f(y)\,dy \right) dz = 0.$$
(b) The Fredholm alternative requires $f(x)$ be orthogonal to the functions in $\ker K = \ker L = \ker D^2$, with basis $1$ and $x$. (Note that both functions satisfy the free boundary conditions.) Then
$$\langle f, 1 \rangle = \int_0^1 f(x)\,dx = 0, \qquad \langle f, x \rangle = \int_0^1 x\,f(x)\,dx = 0.$$
Comparing with the two previous constraints, the first are identical, while the second are equivalent using an integration by parts.
(c) $f(x) = x^2 - x + \tfrac16$ satisfies the constraints; the corresponding solution is
$u(x) = \tfrac1{360} x^6 - \tfrac1{120} x^5 + \tfrac1{144} x^4 + c\,x + d$, where $c, d$ are arbitrary constants.
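
The claims in part (c) can be checked with exact rational arithmetic. The sketch below represents polynomials as coefficient lists (lowest degree first) and takes $c = d = 0$; the helper names are illustrative only.

```python
from fractions import Fraction as F

def deriv(p):
    """Derivative of a polynomial given by its coefficient list."""
    return [k * c for k, c in enumerate(p)][1:]

def evaluate(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def integral01(p):
    """Exact integral of the polynomial over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(p))

f = [F(1, 6), F(-1), F(1)]                                        # x^2 - x + 1/6
u = [F(0), F(0), F(0), F(0), F(1, 144), F(-1, 120), F(1, 360)]    # c = d = 0

u2 = deriv(deriv(u))      # u''
u3 = deriv(u2)            # u'''
u4 = deriv(u3)            # u''''

print(u4 == f)                                    # differential equation
print(integral01(f), integral01([F(0)] + f))      # <f,1> and <f,x>
print([evaluate(q, t) for q in (u2, u3) for t in (0, 1)])   # free boundary values
```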
11.4.8.
(a) The differential equation is $u'''' = f$. Integrating once, the boundary condition $u'''(0) = 0$ implies $u'''(x) = \int_0^x f(y)\,dy$. Integrating again, the boundary conditions $u''(0) = u''(1) = 0$ imply that
$$u''(x) = \int_0^x \left( \int_0^z f(y)\,dy \right) dz \qquad \text{with} \qquad \int_0^1 \left( \int_0^z f(y)\,dy \right) dz = 0.$$
Integrating twice more, one can arrange to satisfy the remaining boundary condition $u(1) = 0$, and so in this case there is only one constraint.
(b) The Fredholm alternative requires $f(x)$ be orthogonal to the functions in $\ker K = \ker L = \ker D^2$, with basis $x - 1$, and so
$$\langle f, x - 1 \rangle = \int_0^1 (x - 1)\,f(x)\,dx = 0.$$
This is equivalent to the previous constraint through an integration by parts.
(c) $f(x) = x - \tfrac13$ satisfies the constraint; the corresponding solution is
$u(x) = \tfrac1{120} x^5 - \tfrac1{72} x^4 + c\,(x - 1) + \tfrac1{180}$, where $c$ is an arbitrary constant.
11.4.9. False. Any boundary conditions leading to self-adjointness result in a symmetric Green's function.
11.4.10.
$$u = \frac16 \int_0^x y^2 (1-x)^2 (3x - y - 2xy)\, f(y)\,dy + \frac16 \int_x^1 x^2 (1-y)^2 (3y - x - 2xy)\, f(y)\,dy,$$
$$\frac{du}{dx} = \frac12 \int_0^x y^2 (1-x)(1 - 3x + 2xy)\, f(y)\,dy + \frac12 \int_x^1 x\,(1-y)^2 (2y - x - 2xy)\, f(y)\,dy,$$
$$\frac{d^2u}{dx^2} = \int_0^x y^2 (-2 + 3x + y - 2xy)\, f(y)\,dy + \int_x^1 (1-y)^2 (y - x - 2xy)\, f(y)\,dy,$$
$$\frac{d^3u}{dx^3} = \int_0^x y^2 (3 - 2y)\, f(y)\,dy - \int_x^1 (1-y)^2 (1 + 2y)\, f(y)\,dy,$$
$$\frac{d^4u}{dx^4} = f(x).$$
Moreover, $u(0) = u'(0) = u(1) = u'(1) = 0$.
11.4.11. Same answer in all three parts: Since, in the absence of boundary conditions, $\ker D^2 = \{\, a x + b \,\}$, we must impose boundary conditions in such a way that no non-zero affine function can satisfy them. The complete list is (i) fixed plus any other boundary condition; or (ii) simply supported plus any other except free. All other combinations have a nontrivial kernel, and so have non-unique equilibria, are not positive definite, and do not have a Green's function.
11.4.12.
(a) Simply supported end with right end raised 1 unit; solution $u(x) = x$.
(b) Clamped ends, the left is held horizontally and moved one unit down, the right is held at an angle $\tan^{-1} 2$; solution $u(x) = x^2 - 1$.
(c) Left end is clamped at a $45^\circ$ angle, right end is free with an induced stress; solution $u(x) = x + \tfrac12 x^2$.
(d) Left end is simply supported and raised 2 units up, right end is sliding and tilted at an angle $-\tan^{-1} 2$; solution $u(x) = 2 - 2x$.

11.4.13. (a) $P[u] = \displaystyle\int_0^1 \tfrac12\, u''(x)^2\,dx$, (b) $P[u] = \displaystyle\int_0^1 \tfrac12\, u''(x)^2\,dx$, (c) $P[u] = u'(1) + \displaystyle\int_0^1 \tfrac12\, u''(x)^2\,dx$, (d) $P[u] = \displaystyle\int_0^1 \tfrac12\, u''(x)^2\,dx$.

11.4.14.
(a) $$u(x) = \begin{cases} -1.25\,(x+1)^3 + 4.25\,(x+1) - 2, & -1 \le x \le 0,\\ 1.25\,x^3 - 3.75\,x^2 + .5\,x + 1, & 0 \le x \le 1. \end{cases}$$
(b) $$u(x) = \begin{cases} -x^3 + 2x + 1, & 0 \le x \le 1,\\ 2\,(x-1)^3 - 3\,(x-1)^2 - (x-1) + 2, & 1 \le x \le 2,\\ -(x-2)^3 + 3\,(x-2)^2 - (x-2), & 2 \le x \le 3. \end{cases}$$
(c) $$u(x) = \begin{cases} \tfrac23 (x-1)^3 - \tfrac{11}3 (x-1) + 3, & 1 \le x \le 2,\\ -\tfrac13 (x-2)^3 + 2\,(x-2)^2 - \tfrac53 (x-2), & 2 \le x \le 4. \end{cases}$$
(d) $$u(x) = \begin{cases} 1.53571\,(x+2)^3 - 4.53571\,(x+2) + 5, & -2 \le x \le -1,\\ -3.67857\,(x+1)^3 + 4.60714\,(x+1)^2 + .07143\,(x+1) + 2, & -1 \le x \le 0,\\ 4.17857\,x^3 - 6.42857\,x^2 - 1.75\,x + 3, & 0 \le x \le 1,\\ -2.03571\,(x-1)^3 + 6.10714\,(x-1)^2 - 2.07143\,(x-1) - 1, & 1 \le x \le 2. \end{cases}$$

11.4.15. In general, the formulas for the homogeneous clamped spline coefficients $a_j, b_j, c_j, d_j$, for $j = 0, \dots, n-1$, are
$$a_j = y_j, \qquad j = 0, \dots, n-1,$$
$$b_0 = 0, \qquad b_j = \frac{y_{j+1} - y_j}{h_j} - \frac{(2 c_j + c_{j+1})\,h_j}{3}, \quad j = 1, \dots, n-2, \qquad b_{n-1} = \frac{3\,(y_n - y_{n-1})}{2\,h_{n-1}} - \frac12\, c_{n-1} h_{n-1},$$
$$d_j = \frac{c_{j+1} - c_j}{3\,h_j}, \quad j = 0, \dots, n-2, \qquad d_{n-1} = -\frac{b_{n-1}}{3\,h_{n-1}^2} - \frac{2\,c_{n-1}}{3\,h_{n-1}},$$
where $c = (c_0, c_1, \dots, c_{n-1})^T$ solves $A\,c = z = (z_0, z_1, \dots, z_{n-1})^T$, with
$$A = \begin{pmatrix} 2h_0 & h_0 \\ h_0 & 2(h_0 + h_1) & h_1 \\ & h_1 & 2(h_1 + h_2) & h_2 \\ & & h_2 & 2(h_2 + h_3) & h_3 \\ & & & \ddots & \ddots & \ddots \\ & & & & h_{n-3} & 2(h_{n-3} + h_{n-2}) & h_{n-2} \\ & & & & & h_{n-2} & 2h_{n-2} + \tfrac32 h_{n-1} \end{pmatrix},$$
$$z_0 = 3\,\frac{y_1 - y_0}{h_0}, \qquad z_j = 3 \left( \frac{y_{j+1} - y_j}{h_j} - \frac{y_j - y_{j-1}}{h_{j-1}} \right), \quad j = 1, \dots, n-2,$$
and, matching the first derivatives at $x_{n-1}$ with the formula for $b_{n-1}$,
$$z_{n-1} = 3 \left( \frac{3\,(y_n - y_{n-1})}{2\,h_{n-1}} - \frac{y_{n-1} - y_{n-2}}{h_{n-2}} \right).$$
The particular solutions are:
(a) $$u(x) = \begin{cases} -5.25\,(x+1)^3 + 8.25\,(x+1)^2 - 2, & -1 \le x \le 0,\\ 4.75\,x^3 - 7.5\,x^2 + .75\,x + 1, & 0 \le x \le 1. \end{cases}$$
(b) $$u(x) = \begin{cases} -2.6\,x^3 + 3.6\,x^2 + 1, & 0 \le x \le 1,\\ 2.8\,(x-1)^3 - 4.2\,(x-1)^2 - .6\,(x-1) + 2, & 1 \le x \le 2,\\ -2.6\,(x-2)^3 + 4.2\,(x-2)^2 - .6\,(x-2), & 2 \le x \le 3. \end{cases}$$
(c) $$u(x) = \begin{cases} 3.5\,(x-1)^3 - 6.5\,(x-1)^2 + 3, & 1 \le x \le 2,\\ -1.125\,(x-2)^3 + 4\,(x-2)^2 - 2.5\,(x-2), & 2 \le x \le 4. \end{cases}$$
(d) $$u(x) = \begin{cases} 4.92857\,(x+2)^3 - 7.92857\,(x+2)^2 + 5, & -2 \le x \le -1,\\ -4.78571\,(x+1)^3 + 6.85714\,(x+1)^2 - 1.07143\,(x+1) + 2, & -1 \le x \le 0,\\ 5.21429\,x^3 - 7.5\,x^2 - 1.71429\,x + 3, & 0 \le x \le 1,\\ -5.07143\,(x-1)^3 + 8.14286\,(x-1)^2 - 1.07143\,(x-1) - 1, & 1 \le x \le 2. \end{cases}$$
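
The clamped spline recipe of Exercise 11.4.15 can be sketched in a few lines of exact arithmetic. This is an illustrative implementation under stated assumptions, not the authors' code: it assumes at least two subintervals, zero end slopes, and a last right-hand side entry obtained by matching first derivatives at $x_{n-1}$; the function name `clamped_spline` is hypothetical.

```python
from fractions import Fraction as F

def clamped_spline(xs, ys):
    """Return [(a_j, b_j, c_j, d_j)] for the homogeneous clamped cubic spline."""
    xs, ys = [F(x) for x in xs], [F(y) for y in ys]
    n = len(xs) - 1
    assert n >= 2, "sketch assumes at least two subintervals"
    h = [xs[j + 1] - xs[j] for j in range(n)]
    slope = [(ys[j + 1] - ys[j]) / h[j] for j in range(n)]
    # Tridiagonal system A c = z: sub-, main and super-diagonal.
    lo, di, up, z = [F(0)] * n, [F(0)] * n, [F(0)] * n, [F(0)] * n
    di[0], up[0], z[0] = 2 * h[0], h[0], 3 * slope[0]
    for j in range(1, n - 1):
        lo[j], di[j], up[j] = h[j - 1], 2 * (h[j - 1] + h[j]), h[j]
        z[j] = 3 * (slope[j] - slope[j - 1])
    lo[n - 1], di[n - 1] = h[n - 2], 2 * h[n - 2] + F(3, 2) * h[n - 1]
    z[n - 1] = 3 * (F(3, 2) * slope[n - 1] - slope[n - 2])   # assumed last entry
    # Forward elimination and back substitution (no pivoting needed here).
    for j in range(1, n):
        m = lo[j] / di[j - 1]
        di[j] -= m * up[j - 1]
        z[j] -= m * z[j - 1]
    c = [F(0)] * n
    c[n - 1] = z[n - 1] / di[n - 1]
    for j in range(n - 2, -1, -1):
        c[j] = (z[j] - up[j] * c[j + 1]) / di[j]
    b, d = [F(0)] * n, [F(0)] * n
    for j in range(n - 1):
        b[j] = slope[j] - (2 * c[j] + c[j + 1]) * h[j] / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])
    b[n - 1] = F(3, 2) * slope[n - 1] - c[n - 1] * h[n - 1] / 2
    d[n - 1] = -b[n - 1] / (3 * h[n - 1] ** 2) - 2 * c[n - 1] / (3 * h[n - 1])
    return list(zip(ys[:-1], b, c, d))

# Data read off from particular solution (a): (-1, -2), (0, 1), (1, -1).
for piece in clamped_spline([-1, 0, 1], [-2, 1, -1]):
    print([float(v) for v in piece])
```

For this data the sketch returns the coefficients $(-2, 0, 8.25, -5.25)$ and $(1, .75, -7.5, 4.75)$, matching part (a).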

11.4.16.
(a) $$u(x) = \begin{cases} x^3 - 2x^2 + 1, & 0 \le x \le 1,\\ (x-1)^2 - (x-1), & 1 \le x \le 2,\\ -(x-2)^3 + (x-2)^2 + (x-2), & 2 \le x \le 3. \end{cases}$$
(b) $$u(x) = \begin{cases} -x^3 + 2x + 1, & 0 \le x \le 1,\\ 2\,(x-1)^3 - 3\,(x-1)^2 - (x-1) + 2, & 1 \le x \le 2,\\ -(x-2)^3 + 3\,(x-2)^2 - (x-2), & 2 \le x \le 3. \end{cases}$$
(c) $$u(x) = \begin{cases} \tfrac54 x^3 - \tfrac94 x^2 + 1, & 0 \le x \le 1,\\ -\tfrac34 (x-1)^3 + \tfrac32 (x-1)^2 - \tfrac34 (x-1), & 1 \le x \le 2,\\ \tfrac34 (x-2)^3 - \tfrac34 (x-2)^2, & 2 \le x \le 3,\\ -\tfrac54 (x-3)^3 + \tfrac32 (x-3)^2 + \tfrac34 (x-3), & 3 \le x \le 4. \end{cases}$$
(d) $$u(x) = \begin{cases} -2\,(x+2)^3 + \tfrac34 (x+2)^2 + \tfrac94 (x+2) + 1, & -2 \le x \le -1,\\ \tfrac72 (x+1)^3 - \tfrac{21}4 (x+1)^2 - \tfrac94 (x+1) + 2, & -1 \le x \le 0,\\ -2\,x^3 + \tfrac{21}4 x^2 - \tfrac94 x - 2, & 0 \le x \le 1,\\ \tfrac12 (x-1)^3 - \tfrac34 (x-1)^2 + \tfrac94 (x-1) - 1, & 1 \le x \le 2. \end{cases}$$

11.4.17.
(a) If we measure $x$ in radians,
$$u(x) = \begin{cases} .9959\,x - .1495\,x^3, & 0 \le x \le \tfrac16 \pi,\\ .5 + .8730\,(x - \tfrac16 \pi) - .2348\,(x - \tfrac16 \pi)^2 - .297\,(x - \tfrac16 \pi)^3, & \tfrac16 \pi \le x \le \tfrac14 \pi,\\ .7071 + .6888\,(x - \tfrac14 \pi) - .4686\,(x - \tfrac14 \pi)^2 + .5966\,(x - \tfrac14 \pi)^3, & \tfrac14 \pi \le x \le \tfrac13 \pi. \end{cases}$$
We plot the spline, a comparison with the exact graph, and a graph of the error:
[Plots omitted.]

(b) The maximal error in the spline is .002967 versus .000649 for the interpolating polynomial. The others all have larger error.
11.4.18.
(a)
$$u(x) = \begin{cases} 2.2718\,x - 4.3490\,x^3, & 0 \le x \le \tfrac14,\\ .5 + 1.4564\,(x - \tfrac14) - 3.2618\,(x - \tfrac14)^2 + 3.7164\,(x - \tfrac14)^3, & \tfrac14 \le x \le \tfrac9{16},\\ .75 + .5066\,(x - \tfrac9{16}) + .2224\,(x - \tfrac9{16})^2 - .1694\,(x - \tfrac9{16})^3, & \tfrac9{16} \le x \le 1. \end{cases}$$
We plot the spline, a comparison with the exact graph, and a graph of the error:
[Plots omitted.]

(b) The maximal error in the spline is .1106 versus .1617 for the interpolating polynomial.
(c) The least squares error for the spline is .0396, while the least squares cubic polynomial, $p(x) = .88889\,x^3 - 1.90476\,x^2 + 1.90476\,x + .12698$, has larger maximal error .1270, but smaller least squares error .0112 (as it must!).

11.4.19. [Three plots of the cubic spline interpolants omitted.]

The cubic spline interpolants do not exhibit any of the pathology associated with the interpolating polynomials. Indeed, the maximal absolute errors are, respectively, .4139, .1001,
.003816, and so increasing the number of nodes significantly reduces the overall error.
11.4.20. Sample letters:

11.4.21. Sample letters:

Clearly, interpolating polynomials are completely unsuitable for typography!


11.4.22.
(a)
$$C_0(x) = \begin{cases} 1 - \tfrac{19}{15}\,x + \tfrac4{15}\,x^3, & 0 \le x \le 1,\\ -\tfrac7{15}(x-1) + \tfrac45 (x-1)^2 - \tfrac13 (x-1)^3, & 1 \le x \le 2,\\ \tfrac2{15}(x-2) - \tfrac15 (x-2)^2 + \tfrac1{15}(x-2)^3, & 2 \le x \le 3, \end{cases}$$
$$C_1(x) = \begin{cases} \tfrac85\,x - \tfrac35\,x^3, & 0 \le x \le 1,\\ 1 - \tfrac15 (x-1) - \tfrac95 (x-1)^2 + (x-1)^3, & 1 \le x \le 2,\\ -\tfrac45 (x-2) + \tfrac65 (x-2)^2 - \tfrac25 (x-2)^3, & 2 \le x \le 3, \end{cases}$$
$$C_2(x) = \begin{cases} -\tfrac25\,x + \tfrac25\,x^3, & 0 \le x \le 1,\\ \tfrac45 (x-1) + \tfrac65 (x-1)^2 - (x-1)^3, & 1 \le x \le 2,\\ 1 + \tfrac15 (x-2) - \tfrac95 (x-2)^2 + \tfrac35 (x-2)^3, & 2 \le x \le 3, \end{cases}$$
$$C_3(x) = \begin{cases} \tfrac1{15}\,x - \tfrac1{15}\,x^3, & 0 \le x \le 1,\\ -\tfrac2{15}(x-1) - \tfrac15 (x-1)^2 + \tfrac13 (x-1)^3, & 1 \le x \le 2,\\ \tfrac7{15}(x-2) + \tfrac45 (x-2)^2 - \tfrac4{15}(x-2)^3, & 2 \le x \le 3. \end{cases}$$
[Graphs of $C_0, \dots, C_3$ omitted.]


(b) It suffices to note that any linear combination of natural splines is a natural spline. Moreover, $u(x_j) = y_0\,C_0(x_j) + y_1\,C_1(x_j) + \cdots + y_n\,C_n(x_j) = y_j$, as desired.
(c) The $n + 1$ cardinal splines $C_0(x), \dots, C_n(x)$ form a basis. Part (b) shows that they span the space since we can interpolate any data. Moreover, they are linearly independent since, again by part (b), the only spline that interpolates the zero data, $u(x_j) = 0$ for all $j = 0, \dots, n$, is the trivial one $u(x) \equiv 0$.
(d) The same ideas apply to clamped splines with homogeneous boundary conditions, and to periodic splines. In the periodic case, since $C_0(x) = C_n(x)$, the vector space only has dimension $n$. The formulas for the clamped cardinal and periodic cardinal splines are different, though.
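
The interpolation property in part (b) can be checked directly for the four cardinal splines on the nodes $0, 1, 2, 3$. The sketch below stores each piece by its local coefficients, as reconstructed from part (a); the table `CARDINAL`, the helper names, and the sample data are illustrative assumptions.

```python
from fractions import Fraction as F

# Local coefficients (a, b, c, d) of a + b t + c t^2 + d t^3 with t = x - j on [j, j+1].
CARDINAL = [
    [(1, F(-19, 15), 0, F(4, 15)), (0, F(-7, 15), F(4, 5), F(-1, 3)), (0, F(2, 15), F(-1, 5), F(1, 15))],
    [(0, F(8, 5), 0, F(-3, 5)), (1, F(-1, 5), F(-9, 5), 1), (0, F(-4, 5), F(6, 5), F(-2, 5))],
    [(0, F(-2, 5), 0, F(2, 5)), (0, F(4, 5), F(6, 5), -1), (1, F(1, 5), F(-9, 5), F(3, 5))],
    [(0, F(1, 15), 0, F(-1, 15)), (0, F(-2, 15), F(-1, 5), F(1, 3)), (0, F(7, 15), F(4, 5), F(-4, 15))],
]

def cardinal(i, x):
    """Evaluate C_i(x) for 0 <= x <= 3."""
    x = F(x)
    j = min(int(x), 2)                 # index of the piece containing x
    a, b, c, d = CARDINAL[i][j]
    t = x - j
    return a + b * t + c * t**2 + d * t**3

def interpolant(ys, x):
    """u(x) = y_0 C_0(x) + ... + y_3 C_3(x)."""
    return sum(y * cardinal(i, x) for i, y in enumerate(ys))

print([[int(cardinal(i, j)) for j in range(4)] for i in range(4)])   # identity pattern
print([interpolant([2, 3, 6, 11], j) for j in range(4)])             # reproduces the data
```

Since constant data is interpolated by a constant natural spline, the four cardinal splines should also sum to 1 at every point, which gives a further consistency check.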
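11.4.23.
(a) $$\beta(x) = \begin{cases} (x+2)^3, & -2 \le x \le -1,\\ 4 - 6x^2 - 3x^3, & -1 \le x \le 0,\\ 4 - 6x^2 + 3x^3, & 0 \le x \le 1,\\ (2-x)^3, & 1 \le x \le 2. \end{cases}$$
[Graph omitted.]
(b) By direct computation $\beta'(-2) = 0 = \beta'(2)$;
(c) By the interpolation conditions, the natural boundary conditions and (b), $\beta(-2) = \beta'(-2) = \beta''(-2) = 0 = \beta(2) = \beta'(2) = \beta''(2)$, and so periodicity is assured.
(d) Since $\beta(x)$ is a spline, it is $\mathrm{C}^2$ for $-2 < x < 2$, while the zero function is everywhere $\mathrm{C}^2$. Thus, the only problematic points are at $x = \pm 2$, and, by part (c), $\beta^\star(-2^-) = 0 = \beta^\star(-2^+)$, ${\beta^\star}'(-2^-) = 0 = {\beta^\star}'(-2^+)$, ${\beta^\star}''(-2^-) = 0 = {\beta^\star}''(-2^+)$, $\beta^\star(2^-) = 0 = \beta^\star(2^+)$, ${\beta^\star}'(2^-) = 0 = {\beta^\star}'(2^+)$, ${\beta^\star}''(2^-) = 0 = {\beta^\star}''(2^+)$, proving the continuity of $\beta^\star(x)$ and its first two derivatives at $\pm 2$.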
11.4.24.
(a) According to Exercise 11.4.23, the functions $B_j(x)$ are all periodic cubic splines. Moreover, their sample values
$$B_j(x_k) = \begin{cases} 4, & j = k,\\ 1, & |\,j - k\,| \equiv 1 \bmod n,\\ 0, & \text{otherwise}, \end{cases}$$
form a linearly independent set of vectors because the corresponding circulant tridiagonal $n \times n$ matrix $B$, with entries $b_{jk} = B_j(x_k)$ for $j, k = 0, \dots, n-1$, is diagonally dominant.
(b) [Graphs of $B_0(x), B_1(x), B_2(x), B_3(x), B_4(x)$ omitted.]
(c) Solve the linear system whose coefficient matrix is the matrix $B$ constructed in part (a) and whose right hand side is $y$, for the B-spline coefficients.
(d) $u(x) = \tfrac5{11}\,B_1(x) + \tfrac2{11}\,B_2(x) - \tfrac2{11}\,B_3(x) - \tfrac5{11}\,B_4(x)$. [Graph omitted.]
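
Parts (a), (c) and (d) can be illustrated with a small exact solve. In this sketch the data vector $y = (0, 2, 1, -1, -2)$ is an assumption, chosen only because it is consistent with the coefficients in part (d); the helper names are hypothetical.

```python
from fractions import Fraction as F

def bspline_matrix(n):
    """Circulant tridiagonal matrix with 4 on the diagonal and 1 beside it (n >= 3)."""
    B = [[F(0)] * n for _ in range(n)]
    for j in range(n):
        B[j][j] = F(4)
        B[j][(j + 1) % n] = F(1)
        B[j][(j - 1) % n] = F(1)
    return B

def is_diagonally_dominant(A):
    return all(abs(row[i]) > sum(abs(v) for k, v in enumerate(row) if k != i)
               for i, row in enumerate(A))

def solve(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(b)
    M = [row[:] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

B = bspline_matrix(5)
coefficients = solve(B, [0, 2, 1, -1, -2])   # assumed sample data
print(is_diagonally_dominant(B), [str(c) for c in coefficients])
```

Under this assumed data the coefficients come out as $0, \tfrac5{11}, \tfrac2{11}, -\tfrac2{11}, -\tfrac5{11}$.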

11.5.1. $u(x) = \dfrac{e^{3x/2} - e^{-3x/2}}{e^{3} - e^{-3}} = \dfrac{\sinh \tfrac32 x}{\sinh 3}$; yes, the solution is unique.

11.5.2. True: the solution is $u(x) = 1$.

11.5.3. The Green's function superposition formula is u(x) =


If x 12 , then
u(x) =

Z 1

1/2

G(x, y) dx.

Z 1/2
sinh (1 x) sinh y
sinh x sinh (1 y)
dy +
dy
x
sinh
sinh
Z 1
sinh x sinh (1 y)
1
e x + e/2 e x
dy = 2
,

1/2
sinh

2 (1 + e/2 )

Z 1/2
0

11.5.4.

G(x, y) dx

Z x

while if x 21 , then
u(x) =

Z 1/2

sinh (1 x) sinh y
dy
sinh
Z 1
x

Z x

1/2

sinh (1 x) sinh y
dy
sinh

1
e /2 e x + e e x
sinh x sinh (1 y)
dy = 2
.
sinh

2 (1 + e/2 )

8
>
>
>
<

sinh x cosh (1 y)
,
x < y,
cosh
(a) G(x, y) = >
>
cosh (1 x) sinh y
>
:
,
x > y.
cosh
1
(b) If x 2 , then
Z x
Z 1/2
cosh (1 x) sinh y
sinh x cosh (1 y)
u(x) =
dy +
dy
0
x
cosh
cosh
Z 1
sinh x cosh (1 y)

dy
1/2
cosh
1
(e/2 e /2 + e ) e x + (e e/2 + e /2 ) e x

,
2
2 (e + e )
while if x 21 , then
Z 1/2
Z x
cosh (1 x) sinh y
cosh (1 x) sinh y
u(x) =
dy
dy
0
1/2
cosh
cosh
Z 1
sinh x cosh (1 y)

dy
x
cosh
(e /2 e + e 3 /2 ) e x + (e3 /2 e + e/2 ) e x
1
.
= 2 +

2 (e + e )
=


11.5.5.

8
>
>
>
<

If x 12 , then
u(x) =

G(x, y) = >
>
>
:

Z x
0

while if x 21 , then
u(x) =

x < y,
x > y.

Z 1/2
cosh (1 x) cosh y
cosh x cosh (1 y)
dy +
dy
x
sinh
sinh
Z 1
cosh x cosh (1 y)
1
cosh x
,

dy = 2
2
1/2
sinh

cosh 12

Z 1/2
0

cosh x cosh (1 y)
,
sinh
cosh (1 x) cosh y
,
sinh

Z x
cosh (1 x) cosh y
cosh (1 x) cosh y
dy
dy
1/2
sinh
sinh
Z 1
cosh x cosh (1 y)
1
cosh (1 x)

.
dy = 2 +
x
sinh

2 cosh 12

This Neumann boundary value problem has a unique solution since $\ker K = \ker L = \{0\}$. Therefore, the Neumann boundary value problem is positive definite, and hence has a Green's function.

11.5.6. Since $L[u] = (u', u)^T$, clearly $\ker L = \{0\}$, irrespective of any boundary conditions. Thus, every set of self-adjoint boundary conditions, including the Neumann boundary value problem, leads to a positive definite boundary value problem. The solution $u^\star(x)$ to the homogeneous Neumann boundary value problem minimizes the same functional (11.156) among all $u \in \mathrm{C}^2[a, b]$ satisfying the Neumann boundary conditions $u'(a) = u'(b) = 0$.
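11.5.7.
(a) The solution is unique provided the homogeneous boundary value problem $z'' + \lambda z = 0$, $z(0) = z(1) = 0$, has only the zero solution $z(x) \equiv 0$, which occurs whenever $\lambda \ne n^2 \pi^2$ for $n = 1, 2, 3, \dots$. If $\lambda = n^2 \pi^2$, then $z(x) = c \sin n \pi x$ for any $c \ne 0$ is a non-zero solution to the homogeneous boundary value problem, and so can be added in to any solution to the inhomogeneous system.
(b) For $\lambda = -\omega^2 < 0$,
$$G(x,y) = \begin{cases} -\dfrac{\sinh \omega (1-y)\,\sinh \omega x}{\omega \sinh \omega}, & x < y,\\[8pt] -\dfrac{\sinh \omega (1-x)\,\sinh \omega y}{\omega \sinh \omega}, & x > y; \end{cases}$$
for $\lambda = 0$,
$$G(x,y) = \begin{cases} x\,(y - 1), & x < y,\\ y\,(x - 1), & x > y; \end{cases}$$
for $\lambda = \omega^2 \ne n^2 \pi^2 > 0$,
$$G(x,y) = \begin{cases} -\dfrac{\sin \omega (1-y)\,\sin \omega x}{\omega \sin \omega}, & x < y,\\[8pt] -\dfrac{\sin \omega (1-x)\,\sin \omega y}{\omega \sin \omega}, & x > y. \end{cases}$$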
(c) The Fredholm alternative requires the forcing function to be orthogonal to the solutions to the homogeneous boundary value problem, and so
$$\langle h, \sin n \pi x \rangle = \int_0^1 h(x) \sin n \pi x\,dx = 0.$$
11.5.8. The first term is written as $-\dfrac{d}{dx}\left( p(x)\,\dfrac{du}{dx} \right) = D^* \circ D[u]$, where $D[u] = u'$, and the adjoint is computed with respect to the weighted $L^2$ inner product
$$\langle\langle v, \tilde v \rangle\rangle = \int_a^b p(x)\,v(x)\,\tilde v(x)\,dx.$$
The second term is written as $q(x)\,u = I^* \circ I[u]$, where $I[u] = u$ is the identity operator, and the adjoint is computed with respect to the weighted $L^2$ inner product
$$\langle\langle w, \tilde w \rangle\rangle = \int_a^b q(x)\,w(x)\,\tilde w(x)\,dx.$$
Thus, by Exercise 7.5.21, the sum can be written in self-adjoint form
$$D^* \circ D[u] + I^* \circ I[u] = L^* \circ L[u], \qquad \text{with} \qquad L[u] = \begin{pmatrix} D[u] \\ I[u] \end{pmatrix} = \begin{pmatrix} u' \\ u \end{pmatrix}$$
taking its values in the Cartesian product space, with inner product exactly given by (11.154).
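11.5.9.
(a) $\mu(x)\,\bigl[\,a(x)\,u'' + b(x)\,u' + c(x)\,u\,\bigr] = -\bigl[\,p(x)\,u'\,\bigr]' + q(x)\,u = -p(x)\,u'' - p'(x)\,u' + q(x)\,u$ if and only if $\mu\,a = -p$, $\mu\,b = -p'$, $\mu\,c = q$. Thus, $(\mu\,a)' = \mu\,b$, and so the formula for the integrating factor is
$$\mu(x) = \exp\left( \int \frac{b(x) - a'(x)}{a(x)}\,dx \right) = \frac{1}{a(x)} \exp\left( \int \frac{b(x)}{a(x)}\,dx \right).$$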

(ii) (x) = x4

(iii) (x) = e x

du
+ e2 x u = e3 x ,
dx !
3
1
1 du
+ 4u= 4,
2
x dx
x
x
!
d
x du
x
xe
e u = 0.
yields
dx
dx

d
dx
d
yields
dx

(b) (i) (x) = e2 x yields

(c) (i) Minimize P[ u ] =

Z bh

(ii) Minimize P[ u ] =

Z bh

(iii) Minimize P[ u ] =

1
2

e2 x

e2 x u0 (x)2 +

1
2

u(1) = u(2) = 0;

Zab h

e2 x u(x)2 e3 x u(x) dx subject to

u0 (x)2
3 u(x)2
u(x) i
+
4 dx subject to u(1) = u(2) = 0;
2
4
2x
2x
x
i
x 0
2
1 x
1
u (x) 2 e u(x)2 dx subject to u(1) = u(2) = 0.
2 xe

11.5.10. Since > 0, the


to
the ordinary differential equation is
general solution

u(x) = e x c1 cos x + c2 sin x . The first boundary condition implies c1 = 0, while

the second implies c2 = 0 unless sin 2 = 0, and so the desired values are = 14 n2 2 for
any positive integer n.
11.5.11. The solution is

1
(1 e ) e x + (e 1) e x

.
2
2 (e e )
Moreover, by lH
opitals Rule, or using Taylor expansions in ,
u(x, ) =

e e x + e (x1) e + e x e(1x)
=
2 (e e )
0+
0+
which is the solution to the limiting boundary value problem.
lim u(x, ) = lim

11.5.12. The solution is


u(x, ) = 1

1
2

e1/ 1
1 e 1/
x/
e

e x/ .
1/

1/
1/

1/
e
e
e
e

1
2

x2 = u? (x),

To evaluate the limit, we rewrite


lim u(x, ) = 1 lim

0+

0+

since lim+ e
0

x/

1 e 1/ (x1)/ 1 e 1/ x/
e

e
=
1 e 2/
1 e 2/

1,
0,

0 < x < 1,
x = 0 or x = 1,

= 0 for all x > 0. The convergence is non-uniform since the limiting

function is discontinuous.
11.5.13. To prove the first bilinearity condition:
$$\begin{aligned} \langle\langle c\,v + d\,w, z \rangle\rangle &= \int_0^1 \bigl[\, p(x)\,(c\,v_1(x) + d\,w_1(x))\,z_1(x) + q(x)\,(c\,v_2(x) + d\,w_2(x))\,z_2(x) \,\bigr]\,dx \\ &= c \int_0^1 \bigl[\, p(x)\,v_1(x)\,z_1(x) + q(x)\,v_2(x)\,z_2(x) \,\bigr]\,dx + d \int_0^1 \bigl[\, p(x)\,w_1(x)\,z_1(x) + q(x)\,w_2(x)\,z_2(x) \,\bigr]\,dx \\ &= c\,\langle\langle v, z \rangle\rangle + d\,\langle\langle w, z \rangle\rangle. \end{aligned}$$
The second has a similar proof, or follows from symmetry as in Exercise 3.1.9. To prove symmetry:
$$\langle\langle v, w \rangle\rangle = \int_0^1 \bigl[\, p(x)\,v_1(x)\,w_1(x) + q(x)\,v_2(x)\,w_2(x) \,\bigr]\,dx = \int_0^1 \bigl[\, p(x)\,w_1(x)\,v_1(x) + q(x)\,w_2(x)\,v_2(x) \,\bigr]\,dx = \langle\langle w, v \rangle\rangle.$$
As for positivity,
$$\langle\langle v, v \rangle\rangle = \int_0^1 \bigl[\, p(x)\,v_1(x)^2 + q(x)\,v_2(x)^2 \,\bigr]\,dx \ge 0,$$
since the integrand is a non-negative function. Moreover, since $p(x), q(x), v_1(x), v_2(x)$ are all continuous, so is $p(x)\,v_1(x)^2 + q(x)\,v_2(x)^2$, and hence $\langle\langle v, v \rangle\rangle = 0$ if and only if $p(x)\,v_1(x)^2 + q(x)\,v_2(x)^2 \equiv 0$. Since $p(x) > 0$, $q(x) > 0$, this implies $v(x) = (\,v_1(x), v_2(x)\,)^T \equiv 0$.
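11.5.14.
(a) $\sinh\alpha \cosh\beta + \cosh\alpha \sinh\beta = \tfrac14 (e^{\alpha} - e^{-\alpha})(e^{\beta} + e^{-\beta}) + \tfrac14 (e^{\alpha} + e^{-\alpha})(e^{\beta} - e^{-\beta}) = \tfrac12 (e^{\alpha+\beta} - e^{-\alpha-\beta}) = \sinh(\alpha + \beta)$.
(b) $\cosh\alpha \cosh\beta + \sinh\alpha \sinh\beta = \tfrac14 (e^{\alpha} + e^{-\alpha})(e^{\beta} + e^{-\beta}) + \tfrac14 (e^{\alpha} - e^{-\alpha})(e^{\beta} - e^{-\beta}) = \tfrac12 (e^{\alpha+\beta} + e^{-\alpha-\beta}) = \cosh(\alpha + \beta)$.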

ex 1
x ex . The graphs compare the exact and finite
e2 1
element approximations to the solution for 6, 11 and 21 nodes:

11.6.1. Exact solution: u(x) = 2 e2



The respective maximal overall errors are .1973, .05577, .01476. Thus, halving the nodal
spacing apparently reduces the error by a factor of 4.
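
The "factor of 4" claim can be quantified from the three quoted errors. The sketch below computes the observed convergence order $\log_2(e_h / e_{h/2})$; an order of 2 corresponds to exactly a factor of 4 per halving. The function name is illustrative.

```python
import math

def observed_orders(errors):
    """Observed order log2(e_h / e_{h/2}) for successive halvings of the spacing."""
    return [math.log2(coarse / fine) for coarse, fine in zip(errors, errors[1:])]

errors = [0.1973, 0.05577, 0.01476]          # 6, 11 and 21 nodes, as quoted above
ratios = [coarse / fine for coarse, fine in zip(errors, errors[1:])]
print([round(r, 2) for r in ratios])
print([round(p, 2) for p in observed_orders(errors)])
```

The ratios are somewhat below 4 and increase with refinement, which is consistent with the error approaching second order behavior in the nodal spacing.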
11.6.2.
(a) Solution:
u(x) =

1
4

x 2 (x 1) =

8
< 1
4
: 1
4

x,
x

1
2

(x 1)2 ,

0 x 1,
1 x 2;

0.3
0.25
0.2
0.15

finite element sample values:


0.1
0.05
( 0., .06, .12, .18, .24, .3, .32, .3, .24, .14, 0. );
maximal error at sample points .05; maximal overall error: .05.

0.5

1.5

0.08
0.06

(b) Solution: u(x) =

log x + 1
x;
log 2

0.04

finite element sample

0.02

0.2

0.4

0.6

0.8

values: ( 0., .03746, .06297, .07844, .08535, .08489, .07801, .06549, .04796, .02598, 0. );
maximal error at sample points .00007531; maximal overall error: .001659.
1.5

2.5

-0.05
-0.1

(c) Solution: u(x) =

1
2

x2+

3
2

finite element sample

-0.15
-0.2
-0.25

values: ( 0., .1482, .2264, .2604, .2648, .2485, .2170, .1742, .1225, .0640, 0. );
maximal error at sample points .002175; maximal overall error: .01224.
0.4
0.3

(d) Solution: u(x) =

e2 + 1 2 e1x
x;
e2 1

0.2

finite element

0.1

-1

-0.5

0.5

sample values: ( 0., .2178, .3602, .4407, .4706, .4591, .4136, .3404, .2444, .1298, 0. );
maximal error at sample points .003143; maximal overall error: .01120.
11.6.3.
(a) The finite element sample values are $c = (\,0, .096, .168, .192, .144, 0\,)^T$.
(b) [Plots omitted.]
(c) (i) The maximal error at the mesh points is $2.7756 \times 10^{-17}$, almost 0! (ii) The maximal error on the interval using the piecewise affine interpolant is .0135046.
(d) (i) The maximal error at the mesh points is the same $2.775 \times 10^{-17}$; (ii) the maximal error on the interval using the spline interpolant is $5.5511 \times 10^{-17}$, making it, for all practical purposes, identical with the exact solution.
11.6.4. The only difference is the last basis function, which should be changed to
$$\varphi_{n-1}(x) = \begin{cases} \dfrac{x - x_{n-2}}{x_{n-1} - x_{n-2}}, & x_{n-2} \le x \le x_{n-1},\\[6pt] 1, & x_{n-1} \le x \le b,\\ 0, & x \le x_{n-2}, \end{cases}$$
in order to satisfy the free boundary condition at $x_n = b$. This only affects the bottom right entry
$$m_{n-1,n-1} = \frac{s_{n-2}}{h^2}$$
of the finite element matrix (11.169) and the last entry
$$b_{n-1} = \frac1h \int_{x_{n-2}}^{x_{n-1}} (x - x_{n-2})\,f(x)\,dx + \int_{x_{n-1}}^{x_n} f(x)\,dx \approx h\,f(x_{n-1}) + \tfrac12\,h\,f(x_n)$$
of the vector (11.172). For the particular boundary value problem, the exact solution is $u(x) = 4 \log(x + 1) - x$. We graph the finite element approximation and then a comparison with the solution:
[Plots omitted.]
They are almost identical; indeed, the maximal error on the interval is .003188.
11.6.5.
(a) $u(x) = x + \dfrac{\pi\,e^{x}}{1 - e^{2\pi}} + \dfrac{\pi\,e^{-x}}{1 - e^{-2\pi}}$.
(b) Minimize $P[u] = \displaystyle\int_0^{2\pi} \bigl[\, \tfrac12\,u'(x)^2 + \tfrac12\,u(x)^2 - x\,u(x) \,\bigr]\,dx$ over all $\mathrm{C}^2$ functions $u(x)$ that satisfy the boundary conditions $u(0) = u(2\pi)$, $u'(0) = u'(2\pi)$.
(c) $\dim W_5 = 4$ since any piecewise affine function $\varphi(x)$ that satisfies the two boundary conditions is uniquely determined by its 4 interior sample values $c_1 = \varphi(x_1), \dots, c_4 = \varphi(x_4)$, with $c_0 = \varphi(x_0) = \tfrac12 (c_1 + c_4)$ then determined so that $\varphi(x_0) = \varphi(x_5)$ and $\varphi'(x_0) = \varphi'(x_5)$. Thus, a basis consists of the following four functions with listed interpolation values $(c_0, c_1, c_2, c_3, c_4)$:
$\varphi_1$: $\tfrac12, 1, 0, 0, 0$; $\varphi_2$: $0, 0, 1, 0, 0$; $\varphi_3$: $0, 0, 0, 1, 0$; $\varphi_4$: $\tfrac12, 0, 0, 0, 1$.
[Graphs of the basis functions omitted.]
(d) $n = 5$: maximal error .9435
(e) $n = 10$: maximal error .6219; $n = 20$: maximal error .3792; $n = 40$: maximal error .2144
[Plots omitted.]
Each decrease in the step size by $\tfrac12$ decreases the maximal error by slightly less than $\tfrac12$.

11.6.6.
(c) $\dim W_5 = 5$, since a piecewise affine function $\varphi(x)$ that satisfies the two boundary conditions is uniquely determined by its values $c_j = \varphi(x_j)$, $j = 0, \dots, 4$. A basis consists of the 5 functions interpolating the values
$$\varphi_i(x_j) = \begin{cases} 1, & i = j,\\ 0, & i \ne j, \end{cases} \qquad \text{for } 0 \le i, j < 5,$$
and $\varphi_i(2\pi) = \varphi_i(x_5) = \varphi_i(x_0) = \varphi_i(0)$. The basis functions are graphed below:
[Graphs of $\varphi_0, \dots, \varphi_4$ omitted.]
(d) $n = 5$: maximal error 1.8598
(e) $n = 10$: maximal error .9744; $n = 20$: maximal error .4933; $n = 40$: maximal error .2474
[Plots omitted.]
Each decrease in the step size by $\tfrac12$ also decreases the maximal error by about $\tfrac12$. However, the approximations in Exercise 11.6.5 are slightly more accurate.

11.6.7. $n = 5$: maximum error .1117; $n = 10$: maximum error .02944; $n = 20$: maximum error .00704. [Plots omitted.]

11.6.8.
(a) $$L = \begin{pmatrix} 1 \\ -\tfrac12 & 1 \\ & -\tfrac23 & 1 \\ & & -\tfrac34 & 1 \\ & & & \ddots & \ddots \end{pmatrix}, \qquad D = \frac1h \begin{pmatrix} 2 \\ & \tfrac32 \\ & & \tfrac43 \\ & & & \tfrac54 \\ & & & & \ddots \end{pmatrix};$$
since all pivots are positive, the matrix is positive definite.
(b) By Exercise 8.2.48, the eigenvalues of $M$ are $\lambda_k = 2 - 2 \cos \dfrac{k \pi}{n + 1} > 0$ for $k = 1, \dots, n$. Since $M$ is symmetric and these are all positive, the matrix is positive definite.
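
The pivots and multipliers in part (a) can be reproduced with a short exact factorization. This sketch assumes, as the displayed pivots suggest, that $h\,M$ is the tridiagonal matrix with 2 on the diagonal and $-1$ beside it; the function name is hypothetical.

```python
from fractions import Fraction as F

def ldl_tridiagonal(diag, off):
    """L D L^T factorization of a symmetric tridiagonal matrix.

    diag: main diagonal entries; off: sub-diagonal entries.
    Returns (l, d): the sub-diagonal of the unit lower triangular factor and the pivots.
    """
    n = len(diag)
    d = [F(0)] * n
    l = [F(0)] * (n - 1)
    d[0] = F(diag[0])
    for k in range(1, n):
        l[k - 1] = F(off[k - 1]) / d[k - 1]            # multiplier
        d[k] = F(diag[k]) - l[k - 1] * F(off[k - 1])   # next pivot
    return l, d

n = 6
l, d = ldl_tridiagonal([2] * n, [-1] * (n - 1))
print([str(v) for v in d])   # pivots (k + 1) / k
print([str(v) for v in l])   # multipliers -k / (k + 1)
```

All pivots are positive, which is the positive definiteness criterion used in part (a); the overall factor $1/h$ does not affect the signs.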
11.6.9. (a) No. (b) Yes. The Jacobi iteration matrix $T$, cf. (10.65), is the tridiagonal matrix with 0's on the main diagonal, and $\tfrac12$'s on the sub- and super-diagonals, and hence, according to Exercise 8.2.48, has eigenvalues $\lambda_k = \cos \dfrac{k \pi}{n + 1}$, $k = 1, \dots, n$. Thus, its spectral radius, and hence rate of convergence of the iteration, is $\rho(T) = \cos \dfrac{\pi}{n + 1} < 1$, proving convergence.
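
The eigenvalue formula can be checked numerically without any linear algebra library. The sketch below applies $T$ to the candidate eigenvectors with entries $\sin \frac{j k \pi}{n+1}$, which is the standard form for this kind of tridiagonal matrix; the helper names are illustrative.

```python
import math

def apply_jacobi(v):
    """Multiply by T: zeros on the diagonal, 1/2 on the sub- and super-diagonals."""
    n = len(v)
    return [0.5 * ((v[j - 1] if j > 0 else 0.0) + (v[j + 1] if j < n - 1 else 0.0))
            for j in range(n)]

def eigenpair(n, k):
    """Candidate eigenvalue cos(k pi/(n+1)) with eigenvector sin(j k pi/(n+1))."""
    theta = k * math.pi / (n + 1)
    return math.cos(theta), [math.sin(j * theta) for j in range(1, n + 1)]

n = 8
residuals = []
for k in range(1, n + 1):
    lam, v = eigenpair(n, k)
    Tv = apply_jacobi(v)
    residuals.append(max(abs(a - lam * b) for a, b in zip(Tv, v)))

spectral_radius = max(abs(eigenpair(n, k)[0]) for k in range(1, n + 1))
print(max(residuals), spectral_radius, math.cos(math.pi / (n + 1)))
```

The residuals vanish up to rounding, and the largest eigenvalue in absolute value is $\cos \frac{\pi}{n+1}$, which tends to 1 as $n$ grows, so the iteration slows down on finer meshes.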
11.6.10.
(a) A basis for $W_n$ consists of the $n - 1$ polynomials $\varphi_k(x) = x^k (x - 1) = x^{k+1} - x^k$ for $k = 1, \dots, n - 1$.
(b) The matrix entries are
$$\begin{aligned} m_{ij} = \langle\langle L[\varphi_i], L[\varphi_j] \rangle\rangle = \langle\langle \varphi_i', \varphi_j' \rangle\rangle &= \int_0^1 \bigl[\,(i+1)\,x^i - i\,x^{i-1}\,\bigr]\,\bigl[\,(j+1)\,x^j - j\,x^{j-1}\,\bigr]\,(x + 1)\,dx \\ &= \frac{4 i^2 j + 4 i j^2 + 4 i j + j^2 + i^2 - i - j}{(i + j - 1)(i + j)(i + j + 1)(i + j + 2)}, \qquad 1 \le i, j \le n - 1, \end{aligned}$$
while the right hand side vector has entries
$$b_i = \langle f, \varphi_i \rangle = \int_0^1 \bigl[\,x^{i+1} - x^i\,\bigr]\,dx = -\frac{1}{i^2 + 3 i + 2}, \qquad i = 1, \dots, n - 1.$$
Solving $M c = b$, the computed solutions $v(x) = \displaystyle\sum_{k=1}^{n-1} c_k\,\varphi_k(x)$ are almost identical to the exact solution: for $n = 5$, the maximal error is $2.00 \times 10^{-5}$, for $n = 10$, the maximal error is $1.55 \times 10^{-9}$, and for $n = 20$, the maximal error is $1.87 \times 10^{-10}$. Thus, the polynomial finite element method gives much closer approximations, although solving the linear system is (slightly) harder since the coefficient matrix is not sparse.
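
The closed form for the matrix entries can be verified against exact polynomial integration. The sketch below assumes the weight $x + 1$ that appears in the integrand above; polynomials are coefficient lists (lowest degree first) and the helper names are illustrative.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    r = [F(0)] * (len(p) + len(q) - 1)
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            r[a + b] += pa * qb
    return r

def dphi(k):
    """Coefficients of phi_k'(x) = (k+1) x^k - k x^(k-1), for k >= 1."""
    p = [F(0)] * (k + 1)
    p[k] = F(k + 1)
    p[k - 1] = F(-k)
    return p

def m_by_integration(i, j):
    """Integrate phi_i' phi_j' (x + 1) over [0, 1] exactly."""
    integrand = poly_mul(poly_mul(dphi(i), dphi(j)), [F(1), F(1)])
    return sum(c / (n + 1) for n, c in enumerate(integrand))

def m_closed_form(i, j):
    s = i + j
    num = 4*i*i*j + 4*i*j*j + 4*i*j + j*j + i*i - i - j
    return F(num, (s - 1) * s * (s + 1) * (s + 2))

print(m_by_integration(1, 1), m_closed_form(1, 1))
print(all(m_by_integration(i, j) == m_closed_form(i, j)
          for i in range(1, 8) for j in range(1, 8)))
```

The resulting matrix is full rather than tridiagonal, which is the price paid for the higher accuracy noted above.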
11.6.11.
(a) There is a unique solution provided $\lambda \ne -n^2$ for $n = 1, 2, 3, \dots$, namely
$$u(x) = \begin{cases} \dfrac{x}{\lambda} - \dfrac{\pi \sinh \sqrt{\lambda}\,x}{\lambda \sinh \sqrt{\lambda}\,\pi}, & \lambda > 0,\\[8pt] \tfrac16 \pi^2 x - \tfrac16 x^3, & \lambda = 0,\\[4pt] \dfrac{x}{\lambda} - \dfrac{\pi \sin \sqrt{-\lambda}\,x}{\lambda \sin \sqrt{-\lambda}\,\pi}, & -n^2 \ne \lambda < 0. \end{cases}$$
When $\lambda = -n^2$, the boundary value problem has no solution.
(b) The minimization principle $P[u] = \displaystyle\int_0^{\pi} \bigl[\, \tfrac12\,u'(x)^2 + \tfrac12\,\lambda\,u(x)^2 - x\,u(x) \,\bigr]\,dx$ over all $\mathrm{C}^2$ functions $u(x)$ that satisfy the boundary conditions $u(0) = 0$, $u(\pi) = 0$, is only valid for $\lambda > 0$. The minimizer is unique. Otherwise, the functional has no minimum.
(c) Let $h = \pi/n$. The finite element equations are $M c = b$ where $M$ is the $(n-1) \times (n-1)$ tridiagonal matrix whose diagonal entries are $\dfrac2h + \dfrac23\,\lambda h$ and sub- and super-diagonal entries are $-\dfrac1h + \dfrac16\,\lambda h$.
(d) According to Exercise 8.2.48, the eigenvalues of $M$ are
$$\frac2h + \frac23\,\lambda h + \left( -\frac2h + \frac13\,\lambda h \right) \cos k h, \qquad k = 1, \dots, n - 1.$$
The finite element system has a unique solution if and only if the matrix is not singular. The matrix is singular if and only if 0 is an eigenvalue, and so $\lambda = \dfrac{6}{h^2}\,\dfrac{\cos k h - 1}{\cos k h + 2} \approx -k^2$, provided $k h \approx 0$, and so the eigenvalues of the finite element matrix converge to the eigenvalues of the boundary value problem. Interestingly, the finite element solution converges to the actual solution, provided one exists, even when the boundary value problem is not positive definite.
(e-f) Here are some sample plots comparing the finite element approximant with the actual solution. First, for $\lambda > 0$, the boundary value problem is positive definite, and we expect convergence to the unique solution.
$\lambda = 1$, $n = 5$, maximal error .1062
$\lambda = 1$, $n = 10$, maximal error .0318
$\lambda = 100$, $n = 10$, maximal error .00913
$\lambda = 100$, $n = 30$, maximal error .00237
[Plots omitted.]
Then, for negative $\lambda$ not near an eigenvalue the convergence rate is similar:
$\lambda = -.5$, $n = 5$, maximal error .2958
$\lambda = -.5$, $n = 10$, maximal error .04757
[Plots omitted.]
Then, for $\lambda$ very near an eigenvalue (in this case $-1$), convergence is much slower, but still occurs. Note that the solution is very large:
$\lambda = -.99$, $n = 10$, maximal error 76.0308
$\lambda = -.99$, $n = 50$, maximal error 5.333
$\lambda = -.99$, $n = 50$, maximal error .4124
[Plots omitted.]
On the other hand, when $\lambda$ is an eigenvalue, the finite element solutions don't converge. Note the scales on the two graphs:
$\lambda = -1$, $n = 10$, maximum value 244.3
$\lambda = -1$, $n = 50$, maximum value 6080.4
[Plots omitted.]
The final example shows convergence even for large negative $\lambda$. The convergence is slow because $-50$ is near an eigenvalue of $-49$:
$\lambda = -50$, $n = 10$, maximal error .3804
$\lambda = -50$, $n = 50$, maximal error 1.1292
$\lambda = -50$, $n = 200$, maximal error .0153
[Plots omitted.]

11.6.12.
(a) We define $f(x) = u_i + \dfrac{u_{i+1} - u_i}{x_{i+1} - x_i}\,(x - x_i)$ for $x_i \le x \le x_{i+1}$.
(b) Clearly, since each hat function is piecewise affine, any linear combination is also piecewise affine. Since $\varphi_i(x_j) = \begin{cases} 1, & i = j,\\ 0, & i \ne j, \end{cases}$ we have $u(x_j) = \displaystyle\sum_{i=0}^{n} u_i\,\varphi_i(x_j) = u_j$, and so $u(x)$ has the correct interpolation values.
(c) $$u(x) = 2\,\varphi_0(x) + 3\,\varphi_1(x) + 6\,\varphi_2(x) + 11\,\varphi_3(x) = \begin{cases} x + 2, & 0 \le x \le 1,\\ 3x, & 1 \le x \le 2,\\ 5x - 4, & 2 \le x \le 3. \end{cases}$$
[Graph omitted.]
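
Part (c) can be reproduced with a direct implementation of the hat functions. This is a small sketch with illustrative names; the nodes $0, 1, 2, 3$ and the values $2, 3, 6, 11$ are taken from part (c).

```python
def hat(i, nodes, x):
    """Hat function: 1 at nodes[i], 0 at every other node, affine in between."""
    xi = nodes[i]
    if i > 0 and nodes[i - 1] <= x <= xi:
        return (x - nodes[i - 1]) / (xi - nodes[i - 1])
    if i < len(nodes) - 1 and xi <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - xi)
    return 0.0

def interpolate(nodes, values, x):
    """Piecewise affine interpolant u(x) = sum of u_i * hat_i(x)."""
    return sum(u * hat(i, nodes, x) for i, u in enumerate(values))

nodes = [0.0, 1.0, 2.0, 3.0]
values = [2.0, 3.0, 6.0, 11.0]
print([interpolate(nodes, values, x) for x in (0.5, 1.5, 2.5)])   # x + 2, 3x, 5x - 4
```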

11.6.13.
(a) If $\varphi(x)$ is any function that is continuously differentiable at $x_j$, then $\varphi'(x_j^+) - \varphi'(x_j^-) = 0$. Now, each term in the sum is continuously differentiable at $x = x_j$ except for the term $c_j\,|\,x - x_j\,|$, whose derivative jumps from $-c_j$ to $c_j$ there, and so $f'(x_j^+) - f'(x_j^-) = 2\,c_j$, proving the formula. Further,
$$a = f'(x_0) + \sum_{i=1}^{n} c_i, \qquad b = f(x_0) - a\,x_0 - \sum_{i=1}^{n} c_i\,|\,x_0 - x_i\,|.$$
(b) $\varphi_j(x) = \dfrac{1}{2h}\,|\,x - x_{j-1}\,| - \dfrac{1}{h}\,|\,x - x_j\,| + \dfrac{1}{2h}\,|\,x - x_{j+1}\,|$.
If $u'(x) = 0$ for $x < 0$ or $x > 3$, then
$u(x) = \tfrac{13}2 + \tfrac12\,|\,x\,| + |\,x - 1\,| + |\,x - 2\,| - \tfrac52\,|\,x - 3\,|$.
More generally, for any constants $c, d$, we can set
$u(x) = (3 - c + d)\,x - (3 d + 1) + c\,|\,x\,| + |\,x - 1\,| + |\,x - 2\,| + d\,|\,x - 3\,|$.
