
More Lecture Notes in Algebra 1 (Fall Semester 2013)

November 28, 2013

CHAPTER 1

Linear Systems of Equations


1. Introduction
A linear equation in the variables (or unknowns) $x_1, \dots, x_n$ is a statement of the form
$$(1)\qquad a_1 x_1 + a_2 x_2 + \dots + a_n x_n = b.$$
Here $a_1, \dots, a_n$ and $b$ are constants, which are usually known. By a solution to (1) we mean a set of values of $x_1, x_2, \dots, x_n$ which makes the statement true.
The linear equation (1) is called homogeneous if $b = 0$; otherwise it is non-homogeneous.
A linear system of equations is a set of linear equations. By a solution to such a system we mean a set of values of the variables such that each of the equations in the system is satisfied.
Example. (a) The system
$$\begin{cases} x + 2y = 5 \\ -2x + 3y = 4 \end{cases}$$
has a solution $(x, y) = (1, 2)$.
(b) The system
$$\begin{cases} x + y + z = 1 \\ 2x - y + z = 3 \\ 3x + 2z = 4 \end{cases}$$
has, for example, the solutions $(x, y, z) = (2, 0, -1)$ and $(x, y, z) = (4, 1, -4)$.
(c) The system
$$\begin{cases} x + y = 2 \\ x + y = 3 \end{cases}$$
has no solutions, for if there were a solution $(x, y)$, we would have $2 = x + y = 3$, which is false.

2. The Gauss-Jordan Elimination Method


We shall now introduce a method for solving linear systems of
equations. In this method, we shall successively replace a given system of equations by simpler, equivalent systems. Here, by definition,
two systems are equivalent iff they have precisely the same solutions.
Example 1. Solve the system
$$(*)\qquad \begin{cases} x + 2y = 5 \\ -2x + 3y = 4. \end{cases}$$
Solution. The first equation in the system is equivalent to $x = 5 - 2y$. We can then eliminate $x$ from the second equation by replacing it with $5 - 2y$. The system (*) is thus equivalent to
$$(1)\qquad \begin{cases} x + 2y = 5 \\ -2(5 - 2y) + 3y = 4 \end{cases}$$
$$(2)\qquad \iff \begin{cases} x + 2y = 5 \\ 7y = 14 \end{cases}$$
$$(3)\qquad \iff \begin{cases} x + 2y = 5 \\ y = 2. \end{cases}$$
The second equation in the last system says that $y = 2$. We can then eliminate $y$ from the first equation by substituting this value for $y$, so that (3) is equivalent to
$$(4)\qquad \begin{cases} x + 2 \cdot 2 = 5 \\ y = 2 \end{cases}$$
$$(5)\qquad \iff \begin{cases} x = 1 \\ y = 2. \end{cases}$$
It follows that the system has the unique solution $(x, y) = (1, 2)$.
The elimination we performed to obtain (2) from (*) can be recognized as the following operation: the first equation in (*) is multiplied by 2, and is then added to the second equation. The coefficient of $x$ in the resulting sum of equations is then zero, i.e. the variable $x$ is eliminated there:
$$\begin{cases} x + 2y = 5 & \big|\ \cdot 2,\ \text{add to second eq.} \\ -2x + 3y = 4 \end{cases}
\iff \begin{cases} x + 2y = 5 \\ 0 \cdot x + (2 \cdot 2 + 3)y = 2 \cdot 5 + 4 \end{cases}
\iff \begin{cases} x + 2y = 5 \\ 7y = 14. \end{cases}$$
The elimination of $y$ from the first equation has a similar interpretation: the second equation in (3) is multiplied by $-2$ and added to the first,
$$\begin{cases} x + 2y = 5 \\ y = 2 & \big|\ \cdot (-2),\ \text{add to first eq.} \end{cases}
\iff \begin{cases} x + 0 \cdot y = 5 - 2 \cdot 2 \\ y = 2. \end{cases}$$
The indicated operation is the main ingredient in the Gauss-Jordan
elimination method. We shall execute this operation repeatedly, in a
systematic way.

Example 2. Solve the system
$$\begin{cases} x + 4y - 2z = 8 \\ 2x + 9y + z = 7 \\ 3x - 2y - 4z = 6. \end{cases}$$
Solution. We first use the first equation to eliminate $x$ from the other equations. From the second equation we subtract $2 \cdot$(first eq.) and from the third we subtract $3 \cdot$(first eq.). The result is
$$\begin{cases} x + 4y - 2z = 8 \\ y + 5z = -9 \\ -14y + 2z = -18. \end{cases}$$
To simplify, we divide the third equation by 2:
$$\begin{cases} x + 4y - 2z = 8 \\ y + 5z = -9 \\ -7y + z = -9. \end{cases}$$
We now eliminate $y$ from the third equation by adding $7 \cdot$(second eq.):
$$\begin{cases} x + 4y - 2z = 8 \\ y + 5z = -9 \\ 36z = -72 \end{cases}
\iff \begin{cases} x + 4y - 2z = 8 \\ y + 5z = -9 \\ z = -2. \end{cases}$$
Using the third equation, we eliminate $z$ from the first two:
$$\begin{cases} x + 4y = 4 \\ y = 1 \\ z = -2. \end{cases}$$
Finally, $y$ is eliminated from the first equation using the second one:
$$\begin{cases} x = 0 \\ y = 1 \\ z = -2. \end{cases}$$
The system has the unique solution $(x, y, z) = (0, 1, -2)$.
Here is the main principle. The first equation is used to eliminate the first unknown from the other equations. Then the (new) second equation is used to eliminate the second unknown from the subsequent equations, etc. The last equation will then contain only the last unknown, which has thus been calculated. After that we make a backwards substitution to eliminate the variables "upwards", one at a time, until the system is completely solved.
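The principle above can be sketched in a few lines of Python. This is only an illustration and not part of the original notes: the function name `gauss_jordan` is hypothetical, exact arithmetic with `fractions.Fraction` is a choice made here to avoid rounding, and the input is the system of Example 2 written with explicit minus signs.

```python
from fractions import Fraction

def gauss_jordan(augmented):
    """Reduce an augmented matrix [A | b] to reduced row echelon form.

    Returns a new matrix of Fractions; the input is left unchanged.
    """
    rows = [[Fraction(v) for v in row] for row in augmented]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols - 1):           # the last column holds the right-hand sides
        # Look for a row at or below pivot_row with a non-zero entry in this column.
        candidates = [r for r in range(pivot_row, n_rows) if rows[r][col] != 0]
        if not candidates:
            continue                         # no pivot here; go on to the next column
        r = candidates[0]
        rows[pivot_row], rows[r] = rows[r], rows[pivot_row]
        # Scale the pivot row so that the pivot equals 1.
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [v / pivot for v in rows[pivot_row]]
        # Eliminate this unknown from every other row.
        for other in range(n_rows):
            if other != pivot_row and rows[other][col] != 0:
                factor = rows[other][col]
                rows[other] = [a - factor * b
                               for a, b in zip(rows[other], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows

# Example 2: x + 4y - 2z = 8, 2x + 9y + z = 7, 3x - 2y - 4z = 6.
example_2 = [[1, 4, -2, 8],
             [2, 9, 1, 7],
             [3, -2, -4, 6]]
reduced = gauss_jordan(example_2)
solution = [row[-1] for row in reduced]
print(solution)
```

Unlike the hand computation, the sketch eliminates each unknown from the rows above and below the pivot in one pass, so no separate backwards substitution is needed; the resulting system is the same.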
The simple rule above cannot always be carried out. We illustrate this with a couple of simple examples.
Example 3. In the system

y 2z = 3

x + 2y z = 2

2x + 3y + z = 1
we cannot, as before, use the first equation to eliminate x from the other ones, since it does not contain x. On the other hand, the second equation can be used. To obtain a system of the same form as before, we interchange the first two equations:

x + 2y z = 2

y 2z = 3

2x + 3y + z = 1
After this we can proceed as before. Do this! (The system has the
unique solution (x, y, z) = (2, 1, 2).)
Example 4. Solve the system
$$\begin{cases} 3x + 2y = 4 \\ 6x + 4y = 1. \end{cases}$$
Solution. We eliminate $x$ from the second equation by subtracting $2 \cdot$(first eq.) and get
$$\begin{cases} 3x + 2y = 4 \\ 0 = -7. \end{cases}$$
The second equation is a false statement for all values of the variables $x$ and $y$. This means that the system has no solutions.
Example 5. Solve the system
$$\begin{cases} 3x + 2y = 4 \\ 6x + 4y = 8. \end{cases}$$
Solution. Eliminating $x$ from the second equation, we now get
$$\begin{cases} 3x + 2y = 4 \\ 0 = 0. \end{cases}$$
The second equation is always satisfied, so the system is equivalent to the single equation $3x + 2y = 4$. This means that we can let $y$ have an arbitrary value, say $y = t$, and then $x$ is uniquely determined by $y$. Hence the system has infinitely many solutions
$$\begin{cases} x = \tfrac{4}{3} - \tfrac{2}{3}t \\ y = t, \end{cases} \qquad t \in \mathbb{R}.$$
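Examples 4 and 5 show the two ways in which elimination can end without a unique solution: a row reading 0 = c with c non-zero, or a row reading 0 = 0. The following Python sketch (an illustration added here, with the hypothetical function name `classify`) performs the forward elimination with exact fractions and reports which of the three situations occurs.

```python
from fractions import Fraction

def classify(augmented):
    """Classify a linear system given by its augmented matrix [A | b]."""
    rows = [[Fraction(v) for v in row] for row in augmented]
    n_unknowns = len(rows[0]) - 1
    pivots = 0
    for col in range(n_unknowns):
        # First row at or below the current pivot position with a non-zero entry.
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        # Eliminate the unknown from all rows below the pivot row.
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
        if pivots == len(rows):
            break
    # A row reading 0 = c with c != 0 can never be satisfied.
    for row in rows:
        if all(v == 0 for v in row[:-1]) and row[-1] != 0:
            return "no solutions"
    if pivots == n_unknowns:
        return "unique solution"
    return "infinitely many solutions"

print(classify([[3, 2, 4], [6, 4, 1]]))    # Example 4
print(classify([[3, 2, 4], [6, 4, 8]]))    # Example 5
print(classify([[1, 2, 5], [-2, 3, 4]]))   # Example 1
```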
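Exercises.
1. Solve the following linear systems of equations:
a) $\begin{cases} x + y = 2 \\ -x + y = 4 \end{cases}$
b) $\begin{cases} 2x - 3y = 1 \\ 6x + y = 7 \end{cases}$
c) $\begin{cases} 4x + 3y = 2 \\ 3x - 5y = 6 \end{cases}$
2. Solve the following systems:
a) $\begin{cases} x - y + z = 4 \\ 3x + 5y - z = 0 \\ 2x - y - 3z = 2 \end{cases}$
b) $\begin{cases} 2x - y + 3z = 9 \\ 3x + 2y - 2z = 1 \\ 4x + 5y - 4z = 2 \end{cases}$
c) $\begin{cases} y + z = 2 \\ x - 2y + z = 2 \\ 2x - 5y + 3z = 2 \end{cases}$
d) $\begin{cases} x + y = 8 \\ x + z = 6 \\ y + z = 4 \end{cases}$
3. Solve the following systems:
a) $\begin{cases} 2x - \tfrac{1}{3}y = 2 \\ 6x - y = 4 \end{cases}$
b) $\begin{cases} 4x - 3y = 4 \\ -3x + \tfrac{9}{4}y = -3 \end{cases}$
c) $\begin{cases} 6x + 9y = 5 \\ 9x + 3y = \tfrac{15}{2} \end{cases}$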

3. The Augmented Matrix of a Linear System of Equations


The numerical operations used when solving a linear system involve solely the coefficients of the unknowns and the constants on the right-hand sides. To simplify the notation, one can omit the unknowns and represent the system by a scheme called the augmented matrix of the system. For example, the system
$$(*)\qquad \begin{cases} x + 2y + 3z = 10 \\ x + y - z = 4 \\ 2x - y + z = 5 \end{cases}
\quad\text{is represented by}\quad
\left(\begin{array}{rrr|r} 1 & 2 & 3 & 10 \\ 1 & 1 & -1 & 4 \\ 2 & -1 & 1 & 5 \end{array}\right).$$
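Example 6. We solve the system (*) in two ways: by writing down complete equations as before and, in parallel, by manipulating only the augmented matrix.

Eliminate $x$ from eqs. 2 and 3 (row 2 $-$ row 1, row 3 $- 2 \cdot$ row 1):
$$\begin{cases} x + 2y + 3z = 10 \\ -y - 4z = -6 \\ -5y - 5z = -15 \end{cases}
\qquad
\left(\begin{array}{rrr|r} 1 & 2 & 3 & 10 \\ 0 & -1 & -4 & -6 \\ 0 & -5 & -5 & -15 \end{array}\right)$$
Multiply eq. 2 by $-1$ and eq. 3 by $-\tfrac{1}{5}$ (multiply row 2 by $-1$, row 3 by $-\tfrac{1}{5}$):
$$\begin{cases} x + 2y + 3z = 10 \\ y + 4z = 6 \\ y + z = 3 \end{cases}
\qquad
\left(\begin{array}{rrr|r} 1 & 2 & 3 & 10 \\ 0 & 1 & 4 & 6 \\ 0 & 1 & 1 & 3 \end{array}\right)$$
Eliminate $y$ from eq. 3 (row 3 $-$ row 2):
$$\begin{cases} x + 2y + 3z = 10 \\ y + 4z = 6 \\ -3z = -3 \end{cases}
\qquad
\left(\begin{array}{rrr|r} 1 & 2 & 3 & 10 \\ 0 & 1 & 4 & 6 \\ 0 & 0 & -3 & -3 \end{array}\right)$$
Divide eq. 3 by $-3$ (divide row 3 by $-3$):
$$\begin{cases} x + 2y + 3z = 10 \\ y + 4z = 6 \\ z = 1 \end{cases}
\qquad
\left(\begin{array}{rrr|r} 1 & 2 & 3 & 10 \\ 0 & 1 & 4 & 6 \\ 0 & 0 & 1 & 1 \end{array}\right)$$
Eliminate $z$ from eqs. 1 and 2 (row 1 $- 3 \cdot$ row 3, row 2 $- 4 \cdot$ row 3):
$$\begin{cases} x + 2y = 7 \\ y = 2 \\ z = 1 \end{cases}
\qquad
\left(\begin{array}{rrr|r} 1 & 2 & 0 & 7 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{array}\right)$$
Eliminate $y$ from eq. 1 (row 1 $- 2 \cdot$ row 2):
$$\begin{cases} x = 3 \\ y = 2 \\ z = 1 \end{cases}
\qquad
\left(\begin{array}{rrr|r} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{array}\right)$$
The system thus has the unique solution $(x, y, z) = (3, 2, 1)$.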
The admissible operations on the augmented matrix, i.e. those giving rise to an equivalent augmented matrix, are the following:
(i) A row is multiplied by a constant $\neq 0$.
(ii) A constant multiple of a row is added to another row.
(iii) Two rows are interchanged.
(iv) Two columns can be interchanged if one observes that this means that two unknowns are interchanged. This must be taken into account when interpreting the answer.
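The row operations (i)-(iii) are easy to express as small helper functions. The Python sketch below is an illustration added here, not part of the original notes; the names `scale`, `add_multiple` and `swap` are hypothetical, and rows are numbered from 0 in the code. It replays the steps of Example 6 on the augmented matrix of (*).

```python
from fractions import Fraction

def scale(matrix, i, c):
    """Operation (i): multiply row i by a constant c != 0."""
    if c == 0:
        raise ValueError("the constant must be non-zero")
    matrix[i] = [c * v for v in matrix[i]]

def add_multiple(matrix, target, source, c):
    """Operation (ii): add c times row `source` to row `target`."""
    matrix[target] = [t + c * s for t, s in zip(matrix[target], matrix[source])]

def swap(matrix, i, j):
    """Operation (iii): interchange rows i and j."""
    matrix[i], matrix[j] = matrix[j], matrix[i]

# Augmented matrix of the system (*) from Example 6.
m = [[Fraction(v) for v in row]
     for row in [[1, 2, 3, 10], [1, 1, -1, 4], [2, -1, 1, 5]]]

add_multiple(m, 1, 0, -1)        # row 2 - row 1
add_multiple(m, 2, 0, -2)        # row 3 - 2 * row 1
scale(m, 1, -1)                  # multiply row 2 by -1
scale(m, 2, Fraction(-1, 5))     # multiply row 3 by -1/5
add_multiple(m, 2, 1, -1)        # row 3 - row 2
scale(m, 2, Fraction(-1, 3))     # divide row 3 by -3
add_multiple(m, 0, 2, -3)        # row 1 - 3 * row 3
add_multiple(m, 1, 2, -4)        # row 2 - 4 * row 3
add_multiple(m, 0, 1, -2)        # row 1 - 2 * row 2

print([row[-1] for row in m])    # right-hand sides after the reduction
```

The function `swap` is not needed for this particular system; it is included because an interchange becomes necessary as soon as a pivot position holds a zero.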

The operation (iv) is not really needed; one can always circumvent it by other means. Here is an example of this.
Example 7. Solve the system

x + 2y 3z + w = 2

3x + 6y + 3z w = 2

2x + 4y + 3z + w = 9

2x + 4y 3z w = 13
Solution. The augmented matrix is


1
3

1
0

1
0

2 3 1 2
6 3 1 2

4 3
1
9

4 3 1 13

2 3 1 2
0 12 4 4

0 9 1 13

0 3 3 9

2 3 1 2
0 3 1 1

0 9 1 13

0 1 1 3

row 2 3row 1
row 3 2row 1
row 4 2row 1
divide row 2 by 4
divide row 4 by 3

Here we see that we cannot eliminate the variable in the second column, since every remaining entry in that column is zero. We therefore skip that column and continue with the third one. Interchange rows 2 and 4:

1 2 3 1 2
0 0 1 1 3

row 3 9row 2
0 0 9 1 13

0 0 3 1 1
row 4 3row 2

1 2 3 1 2
0 0 1 1 3

8 40
divide row 3 by 8
0 0 0

0 0 0
2 10
divide row 4 by 2

row 1- row 3
1 2 3 1 2
0 0 1 1 3
row 2 + row 3

1
5
0 0 0

0 0 0
1
5
row 4 is unnecessary - strike it!

row 1 + 3row 2
1 2 3 0 7
0 0 1 0 2

0 0 0 1 5

x + 2y = 1

1 2 0 0 1

0 0 1 0 2


.
z=2
i.e.

0 0 0 1 5

w=5
Here y can have an arbitrary value (say t), and then the solution is
fixed. The solutions are thus
(x, y, z, w) = (1 2t, t, 2, 5),

t R.


We finish this section by discussing the application of the Gauss-Jordan method in a few other, slightly more complicated cases.

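Example 8. Solve the system
$$\begin{cases} x - 2y = 1 \\ 2x - 3y = 4 \\ 4x - 7y = 5. \end{cases}$$
Solution. The augmented matrix is
$$\left(\begin{array}{rr|r} 1 & -2 & 1 \\ 2 & -3 & 4 \\ 4 & -7 & 5 \end{array}\right).$$
The operations row 2 $- 2 \cdot$ row 1 and row 3 $- 4 \cdot$ row 1 give
$$\left(\begin{array}{rr|r} 1 & -2 & 1 \\ 0 & 1 & 2 \\ 0 & 1 & 1 \end{array}\right),$$
and row 3 $-$ row 2 then gives
$$\left(\begin{array}{rr|r} 1 & -2 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & -1 \end{array}\right).$$
The last row represents the equation $0 = -1$, which is not satisfied for any values of $x$ and $y$. The given system thus lacks solutions.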
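Example 9. Solve the system
$$\begin{cases} x - 2y + z = 3 \\ -2x + 4y - 2z = -6. \end{cases}$$
Solution. The operation row 2 $+ 2 \cdot$ row 1 gives
$$\left(\begin{array}{rrr|r} 1 & -2 & 1 & 3 \\ -2 & 4 & -2 & -6 \end{array}\right)
\sim
\left(\begin{array}{rrr|r} 1 & -2 & 1 & 3 \\ 0 & 0 & 0 & 0 \end{array}\right),
\quad\text{i.e.}\quad
\begin{cases} x - 2y + z = 3 \\ 0 = 0. \end{cases}$$
The last equation is always satisfied; it does not impose any condition on the unknowns $x$, $y$, and $z$. The given system of equations is therefore equivalent to the single equation $x - 2y + z = 3$. Here we can prescribe values for $y$ and $z$ arbitrarily; the variable $x$ is uniquely determined by these values. The general solution can be written
$$(x, y, z) = (3 + 2s - t,\ s,\ t), \qquad s, t \in \mathbb{R}.$$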

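Example 10. Solve the following system for all values of $a$:
$$\begin{cases} x + y - az = 3 \\ x - ay - z = 2 \\ x - 3y - z = 2 - a. \end{cases}$$
Solution. The augmented matrix is
$$(1)\qquad \left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 1 & -a & -1 & 2 \\ 1 & -3 & -1 & 2 - a \end{array}\right).$$
The operations row 2 $-$ row 1 and row 3 $-$ row 1 give
$$(2)\qquad \left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 0 & -a - 1 & a - 1 & -1 \\ 0 & -4 & a - 1 & -1 - a \end{array}\right).$$
Divide row 3 by $-4$, then interchange row 2 with row 3:
$$(3)\qquad \left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 0 & 1 & \tfrac{1}{4}(1 - a) & \tfrac{1}{4}(1 + a) \\ 0 & -a - 1 & a - 1 & -1 \end{array}\right).$$
The operation row 3 $+ (a + 1) \cdot$ row 2 gives
$$(4)\qquad \left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 0 & 1 & \tfrac{1}{4}(1 - a) & \tfrac{1}{4}(1 + a) \\ 0 & 0 & \tfrac{1}{4}(a - 1)(3 - a) & \tfrac{1}{4}(a - 1)(a + 3) \end{array}\right).$$
The last row is equivalent to the equation
$$(*)\qquad (a - 1)(3 - a)z = (a - 1)(a + 3).$$
Here we need to divide into cases.

Case 1: $a = 1$. Then the equation (*) reduces to $0 = 0$, which is always satisfied. The given system is then equivalent to the first two rows in (4), i.e., with row 1 $-$ row 2,
$$\left(\begin{array}{ccc|c} 1 & 1 & -1 & 3 \\ 0 & 1 & 0 & \tfrac{1}{2} \end{array}\right)
\sim
\left(\begin{array}{ccc|c} 1 & 0 & -1 & \tfrac{5}{2} \\ 0 & 1 & 0 & \tfrac{1}{2} \end{array}\right),
\quad\text{i.e.}\quad
\begin{cases} x - z = \tfrac{5}{2} \\ y = \tfrac{1}{2}. \end{cases}$$
One can e.g. prescribe $z$ arbitrarily and then get $x$ and $y$ from the value of $z$:
$$(x, y, z) = \left(\tfrac{5}{2} + t,\ \tfrac{1}{2},\ t\right), \qquad t \in \mathbb{R}.$$
Case 2: $a = 3$. In this case, (*) says that $0 \cdot z = 2 \cdot 6$, i.e. $0 = 12$. This is false, so the system lacks solutions for $a = 3$.

Case 3: $a \neq 1$ and $a \neq 3$. In this case, the third row in (4) can be multiplied by $\frac{4}{(a - 1)(3 - a)}$, which leads to
$$\left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 0 & 1 & \tfrac{1}{4}(1 - a) & \tfrac{1}{4}(1 + a) \\ 0 & 0 & 1 & \frac{3 + a}{3 - a} \end{array}\right).$$
The operations row 1 $+ a \cdot$ row 3 and row 2 $+ \tfrac{1}{4}(a - 1) \cdot$ row 3 give
$$\left(\begin{array}{ccc|c} 1 & 1 & 0 & \frac{9 + a^2}{3 - a} \\ 0 & 1 & 0 & \frac{a}{3 - a} \\ 0 & 0 & 1 & \frac{3 + a}{3 - a} \end{array}\right),$$
and finally row 1 $-$ row 2 gives
$$\left(\begin{array}{ccc|c} 1 & 0 & 0 & \frac{a^2 - a + 9}{3 - a} \\ 0 & 1 & 0 & \frac{a}{3 - a} \\ 0 & 0 & 1 & \frac{3 + a}{3 - a} \end{array}\right).$$
In conclusion, we have arrived at the following result concerning the solutions to the system:
(a) For $a = 3$ the system has no solutions.
(b) For $a = 1$ it has the solutions $(x, y, z) = \left(\tfrac{5}{2} + t, \tfrac{1}{2}, t\right)$, $t \in \mathbb{R}$.
(c) For $a \notin \{1, 3\}$ it has the unique solution
$$(x, y, z) = \left(\frac{a^2 - a + 9}{3 - a},\ \frac{a}{3 - a},\ \frac{3 + a}{3 - a}\right).$$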
Exercises.
4. Solve
the following systems:

x 9y 3z = 4

x y+ z=0

a)
b)
3x 2y + z = 2
3x + 5y z = 0

2x + 7y + 4z = 2
6x + 2y + 2x = 5

2x + 3y = 2
x + 2y = 3

c)
d)
x
+
3y

z
=
5
2x 3y = 8

10x y = 2
3x + y + z = 3

x 2y = 6

2x + y 3z = 4

e)
f)
3x
+
y
=
2

4x + 3y z = 2

5x 3y = 14

2x 3y + 4z = 1

x + y 4z = 7
3x + y z = 7
g)
h)

x + y + 2z = 1
x y + 5z = 1

4x 6y + 8z = 3


5.

6.
7.

8.

9.

2x 3y + 4z = 1

2x y + z = 3

3x + y z = 7
j)
i)
4x 2y + 5z = 0

x y + 5z = 1

2x y 2z = 9

4x 6y + 8z = 2

x + 3y + z w = 2

3x + 5y z w = 2
.
k)

5x y 5z + 3w = 0

2x + 3y 3z 2w = 2

2x 3y + z = 1

Solve the system


x 3y + 2z = b when

3x + y 4z = c
a) a = 2, b = 3, c = 0;
b) a = b = c = 0.
Hint: Several linear systems with the same coefficient
matrices but different right hand sides can be solved simultaneously with the same augmented matrix: one just writes the
different right hand sides next to each other.
Determine a and b so that the lines y 3x = 2 and 2y + ax = b
a) intersect at a point, b) are parallel and different, c) coincide.
2x+13
a
b
Determine a and b so that (x1)(x+2)
= x1
+ x+2
.
(This is an example of a decomposition in partial fractions
of a rational function. Such decompositions are frequently
used in the calculation of integrals, for example.)
Determine for all values of a the number of solutions to the
system
(
(
(
x + 2y = 2
x + 2y = 1
x + 2y = 1
a)
b)
c)
.
2x + ay = a
2x + ay = a
2x + a2 y = a
Determine all solutions for the following systems for all values of(the constant a:
(
(
x + 3y = 4
x + 3y = 3
x 3ay = 2
a)
b)
c)
2x + ay = a
2x + ay = a
ax 12y = a + 2

(
(

x + y + 3z = 1

x+ y=3
x 2ay = 3

d)
e)
f)
2x + y + z = 2

2x ay = 2
ax + 3y = 3a 1

3x + 3y + az = 0

x+ y+
z=1
x + y + az = 1

g)
h)
2x + ay
z=1
x + ay + z = a .

ax + 2y + (a + 3)z = 2
ax + y + z = a2


10. Examine, for different values of a and b, the number of solutions to the following linear systems of equations:
(
(
ax + by = 2
ax + 2y = b
.
b)
a)
x+ y =1
3x + 2y = 5
11. Determine a, b, and c so that
3x2 + 6x 16 a
b
c
= +
+
.
3
x 4x
x x+2 x2
12. Determine the constants a, b, c so that the function f (x) =
ax2 + bx + c satisfies f (1) = 3, f (2) = 1 and f (1) = 7.
13. Determine the equation of a third degree polynomial whose
graph passes through the points (0, 1), (1, 1), (2, 1), and (1, 7).
4. Answers to Exercises
1. a) x = 1, y = 3
b) x = 1.1, y = 0.4 c) x = 28
, y = 18
.
29
29
2. (x, y, z) =
a) (2, 1, 2)
b) (1, 2, 3)
c) (6, 2, 0)
d) (5, 3, 1).
3. a) Has no solution. b) x = 1 + 43 y, y R. c) (x, y) = ( 65 , 0).
4. a) (x, y, z) = (1 + 32 t, t, 1 52 t), t R.
b) No solution.
c) (x, y, z) = (4, 2, 7).
d) No solution exists.
, 167 ) f) (x, y, z) = (4t + 5, 5t 6, t), t R
e) (x, y) = ( 10
7
g) (x, y, z) = (3 t, t, 1), t R
h) Has no solution.
i) (x, y, z) = (2, 1, 0)
j) (x, y, z) = (t, 2t5, 2),
,t R
k) (x, y, z, w) = (2, 1, 1, 2).
5. a) There is no solution.
b) x = y = z = t, t R.
6. a) a , 6
b) a = 6, b , 4
c) a = 6, b = 4.
d) The system of equations has respectively: exactly one
solution, no solutions, infinitely many solutions.
7. a = 5, b = 3.
8. a) If a = 4 there are infinitely many, otherwise unique.
b) If a = 4 there is no solution, otherwise unique.
c) If a = 2 there are infinitely many, if a = 2 none, otherwise a unique solution.


a
9. a) For a = 6 no solution. For a , 6, (x, y) = a6
, a8
.
a6
b) For a = 6: (x, y) = (3 3t, t), t R. For a , 6: (x, y) =
(0, 1).
c) For a = 2, no solution. For a = 2: (x, y) = (2 + 6t, t),
t R.


1
.
For a , 2: (x, y) = a+4
,

a+2
3(a+2)


4
d) For a = 2: insoluble. For a , 2: (x, y) = 3a+2
,
.
a+2 a+2


e) (x, y) =

10.

11.
12.
13.

6a2 2a+9
, 2a21+3
2a2 +3


.



, 15 , 3 .
f) a = 9: insoluble. a , 9: (x, y, z) = a15
a9 a9 a9
g) a , 1: insoluble. a = 1: (x, y, z) = (2t, 1 3t, t), t R.
h) a = 1: (x, y, z) = (1tu,
 2 t, u), t, u R. a = 2: insoluble.
1
, a+2
, a+1
a < {1, 2}: (x, y, z) = a +2a+1
.
a+2
a+2
a) If a , 3 unique solution; if a = 3 and b = 5 infinitely many
solutions; if a = 3 and b , 5 no solutions.
b) If a = b = 2 infinitely many solutions; if a = b , 2 none;
if a , b a unique solution.
a = 4, b = 2, c = 1.
a = 3, b = 5, c = 1.
The polynomial x3 + 3x2 2x + 1.

CHAPTER 2

Vectors
1. Basic Definitions

If $P$ and $Q$ are two points in space, we denote by $\overrightarrow{PQ}$ the directed line-segment which starts at $P$ and ends at $Q$. A directed segment is determined by its starting point $P$, its direction, and its magnitude (or length).
Two directed segments $\overrightarrow{PQ}$ and $\overrightarrow{RS}$ are called equivalent (notation: $\overrightarrow{PQ} \sim \overrightarrow{RS}$) if they have the same direction and magnitude. It is easy to verify that this defines an equivalence relation on the set of directed line segments in space.
The vector $u$ which contains $\overrightarrow{PQ}$ is the set of all directed line segments in space that are equivalent to $\overrightarrow{PQ}$ (i.e. it is the equivalence class containing $\overrightarrow{PQ}$). In symbols:
$$(5)\qquad u = \left\{ \overrightarrow{RS} : \overrightarrow{RS} \sim \overrightarrow{PQ} \right\}.$$
In this situation we say that the line segment $\overrightarrow{PQ}$ represents the vector $u$.
The direction and the magnitude of the vector $u$ are defined as the direction and magnitude of some representative $\overrightarrow{PQ}$. We will denote the length (or norm) of $u$ by the symbol $\|u\|$.
In practice, one often briefly writes $u = \overrightarrow{PQ}$ instead of using the bulky (but more precise) notation in (5). We will often, without further mention, use this convention in the following.
The zero vector is the vector $0 = \overrightarrow{PP}$; clearly $\|0\| = 0$. A vector $u$ with $\|u\| = 1$ is called a unit vector.
The sum $u + v$ of two vectors is defined as follows. Take a directed segment $\overrightarrow{PQ}$ representing $u$, then a representative $\overrightarrow{QR}$ of $v$ (starting at the same point $Q$). We define
$$u + v = \overrightarrow{PR}.$$
(This rule is sometimes called the parallelogram law of addition.)

The addition of vectors satisfies the following (easily verified) rules:
(i) $u + v = v + u$ (commutativity)
(ii) $u + (v + w) = (u + v) + w$ (associativity)
(iii) If $u + v = u + w$ then $v = w$ (the cancellation law)
(iv) $u + 0 = u$ (neutral element).
If $u = \overrightarrow{PQ}$ is a vector, we denote by $-u$ the vector with the same magnitude but opposite direction, i.e., $-u = \overrightarrow{QP}$. It is clear that $u + (-u) = 0$ for all $u$. We define the difference between $u$ and $v$ by
$$u - v = u + (-v).$$
Let $s$ be a scalar (scalar is a synonym for real number). We shall define a vector $su$.
(1) If $s > 0$ we define $su$ to be the vector with the same direction as $u$ and magnitude $s\|u\|$. (So $su$ is a rescaled version of $u$.)
(2) If $s = 0$ we define $su = 0$.
(3) If $s < 0$ we define $su$ to be the vector with the direction opposite to $u$ and length $|s|\,\|u\|$.
Observe that if $u \neq 0$, then the vector $\frac{1}{\|u\|}u$ is the unit vector with the same direction as $u$.
The multiplication with scalars satisfies the following rules:
(a) $s(tu) = (st)u$,
(b) $(s + t)u = su + tu$,
(c) $s(u + v) = su + sv$,
(d) $0u = 0$, $1u = u$, $s0 = 0$.
Two non-zero vectors $u$, $v$ are called parallel if one of them can be written as a scalar multiple of the other one. We write $u \parallel v$ to denote that $u$ and $v$ are parallel.

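Example 1. Let $O$, $P$, $Q$ be three points in space and put $u = \overrightarrow{OP}$ and $v = \overrightarrow{OQ}$. Let $M$ be the mid-point of the segment $PQ$. We claim that
$$(1)\qquad \overrightarrow{OM} = \tfrac{1}{2}(u + v).$$
To prove this, observe that
$$\overrightarrow{OP} + \overrightarrow{PQ} = \overrightarrow{OQ},
\quad\text{i.e.,}\quad
\overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP} = v - u.$$
This gives that
$$\overrightarrow{OM} = \overrightarrow{OP} + \tfrac{1}{2}\overrightarrow{PQ} = u + \tfrac{1}{2}(v - u) = \tfrac{1}{2}(u + v).$$
A simple geometric consequence of this is that the diagonals of a parallelogram divide each other into equal parts. Namely, if $S$ is the fourth corner in the parallelogram with sides $u$ and $v$, then $\overrightarrow{OS} = u + v$. It thus follows from (1) that $M$ lies halfway between $O$ and $S$.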
Example 2. Let $O$, $P$, $Q$, $R$ be four points in space and put
$$u = \overrightarrow{OP}; \quad v = \overrightarrow{OQ}; \quad w = \overrightarrow{OR}.$$
By the center of mass of the triangle $PQR$ we mean the point $N$ defined by
$$(2)\qquad \overrightarrow{ON} = \tfrac{1}{3}(u + v + w).$$
Prove that the three medians of the triangle $PQR$ intersect at the point $N$. (A median of a triangle is a line-segment connecting a vertex to the mid-point of the opposite side.)
Solution. Let $M$ be the mid-point of the segment $PQ$. Then (by (1))
$$\overrightarrow{RM} = \overrightarrow{OM} - \overrightarrow{OR} = \tfrac{1}{2}(u + v) - w = \tfrac{1}{2}(u + v - 2w).$$
Moreover,
$$\overrightarrow{RN} = \overrightarrow{ON} - \overrightarrow{OR} = \tfrac{1}{3}(u + v + w) - w = \tfrac{1}{3}(u + v - 2w).$$
We have shown that
$$2\,\overrightarrow{RM} = 3\,\overrightarrow{RN},$$
so $N$ lies on the median $RM$ and divides it according to the ratio $2 : 1$. By symmetry of the expression (2), the point $N$ is also on the other medians.
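The two examples can be checked numerically once coordinates are chosen. In the Python sketch below (an illustration added here, not part of the original notes) the point O is taken as the origin, so that each position vector coincides with the coordinates of its end point; the specific points P, Q, R and the helper names are arbitrary assumptions.

```python
from fractions import Fraction

def add(p, q):
    """Componentwise sum of two coordinate triples."""
    return tuple(a + b for a, b in zip(p, q))

def sub(p, q):
    """Componentwise difference p - q."""
    return tuple(a - b for a, b in zip(p, q))

def scale(s, p):
    """Scalar multiple s * p."""
    return tuple(s * a for a in p)

# Arbitrary sample points, with exact rational coordinates.
P = (Fraction(1), Fraction(0), Fraction(2))
Q = (Fraction(3), Fraction(4), Fraction(0))
R = (Fraction(-1), Fraction(2), Fraction(5))

M = scale(Fraction(1, 2), add(P, Q))            # mid-point of PQ, formula (1)
N = scale(Fraction(1, 3), add(add(P, Q), R))    # center of mass, formula (2)

RM = sub(M, R)
RN = sub(N, R)

# 2 RM = 3 RN, so N lies on the median from R, two thirds of the way to M.
assert scale(2, RM) == scale(3, RN)
print(RM, RN)
```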
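Exercises.
1. Let $A$ and $B$ be two points and $O$ a third point. Let $a = \overrightarrow{OA}$, $b = \overrightarrow{OB}$, and let $M$ be the mid-point of the line-segment $AB$. Express the following vectors in terms of $a$ and $b$:
a) $\overrightarrow{OB} + \overrightarrow{BA}$ b) $\overrightarrow{AB}$ c) $\overrightarrow{OM}$ d) $\overrightarrow{AM}$.
2. Simplify the following sums:
a) $\overrightarrow{OB} + \overrightarrow{BD} + \overrightarrow{DC}$ b) $\overrightarrow{AC} + \overrightarrow{CO} + \overrightarrow{OB}$
c) $\overrightarrow{AB} + \overrightarrow{OA} + \overrightarrow{BD}$ d) $\overrightarrow{BD} - \overrightarrow{BA} + \overrightarrow{DL}$.
3. Prove that a point $N$ is the center of mass of a triangle $PQR$ if and only if
$$\overrightarrow{NP} + \overrightarrow{NQ} + \overrightarrow{NR} = 0.$$
4. Let $u$, $v$, $w$ be non-zero vectors. Prove that parallelism is an equivalence relation, namely:
a) $u \parallel u$ b) $u \parallel v \Rightarrow v \parallel u$ c) If $u \parallel v$ and $v \parallel w$, then $u \parallel w$.
5. Let $ABC$ be a triangle, $A_1$ the mid-point of $BC$, $B_1$ the mid-point of $AC$, and $C_1$ the mid-point of $AB$. Prove that the vectors $\overrightarrow{AA_1}$, $\overrightarrow{BB_1}$, and $\overrightarrow{CC_1}$ can be used to form a triangle.
6. A median in a tetrahedron is the line-segment from a vertex to the center of mass of the opposite side. Let $O$ be an arbitrary point and $PQRS$ a tetrahedron. Prove that the medians of $PQRS$ intersect at a point $T$ which is given by
$$\overrightarrow{OT} = \tfrac{1}{4}\left(\overrightarrow{OP} + \overrightarrow{OQ} + \overrightarrow{OR} + \overrightarrow{OS}\right).$$
The point $T$ is called the center of mass of the tetrahedron.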
2. Bases and coordinates
Let $\ell$ be a line in space. We say that a vector $u$ is parallel to $\ell$, or simply that $u$ belongs to $\ell$, if $u$ can be represented by a directed line segment of $\ell$. (Thus by "$u \in \ell$" we really mean that some representative of $u$ belongs to $\ell$. We hence use the phrase "belongs to" in slightly different meanings. This should not cause confusion, as long as the reader is aware of the distinction.)
Likewise, we say that $u$ belongs to a plane $M$ if $u$ can be represented by a line segment in $M$.
Fix a line $\ell$ and let $e \neq 0$ be a vector in $\ell$. For any other vector $u$ in $\ell$ there is then a number $x$ such that
$$(1)\qquad u = xe.$$
The vector $e$ is called a basis for $\ell$ and $x$ is the coordinate for $u$ with respect to the basis $e$.
Now let $M$ be a plane and let $e_1$, $e_2$ be two non-parallel vectors in $M$. We claim that each vector $u$ in $M$ can be written
$$(2)\qquad u = x_1 e_1 + x_2 e_2$$
where $x_1$ and $x_2$ are real numbers. To show this, we first choose two lines $\ell_1$ and $\ell_2$ in $M$ such that $e_1$ is in $\ell_1$ and $e_2$ is in $\ell_2$. Let $O$ be the point of intersection between $\ell_1$ and $\ell_2$. To help our geometric intuition, we will in the following fix $O$ as our "origin", and place all vectors in $M$ so that they emanate from the point $O$. (So $e_1 = \overrightarrow{OP_1}$ for some point $P_1$, $u = \overrightarrow{OP}$, etc.)
We then decompose $u$ into a sum
$$u = u_1 + u_2$$
where $u_1$ is on $\ell_1$ and $u_2$ is on $\ell_2$. As in (1) we can then write
$$u_1 = x_1 e_1 \quad\text{and}\quad u_2 = x_2 e_2$$
for some real numbers $x_1$ and $x_2$. This proves (2).
Definition 1. Two non-parallel vectors $e_1$, $e_2$ in a plane $M$ are called a basis for $M$. Given a vector $u \in M$, the numbers $x_1$ and $x_2$ in (2) are uniquely determined; they are called the coordinates of $u$ with respect to the basis $e_1$, $e_2$. If the basis is fixed, and no misunderstandings can arise, we can suppress the basis vectors in (2), and simply write
$$u = (x_1, x_2).$$
To describe all vectors in space in a similar way, we need three vectors $e_1$, $e_2$, $e_3$ which are not co-planar (i.e. they are not in one and the same plane). We assert that every vector $u$ then can be written
$$(3)\qquad u = x_1 e_1 + x_2 e_2 + x_3 e_3$$
where $x_1$, $x_2$, $x_3$ are real numbers, which are uniquely determined by $u$. To prove this, we proceed in two steps.
First, let $M$ be a plane containing $e_1$ and $e_2$ and let $\ell$ be a line containing $e_3$. Let $O$ be the point of intersection between $\ell$ and $M$; in the following we place all vectors so that they emanate from $O$.
We decompose $u$ into a sum
$$(4)\qquad u = u' + u'', \qquad u' \in M,\ u'' \in \ell.$$
Applying (2) and (1) we can write $u' = x_1 e_1 + x_2 e_2$ and $u'' = x_3 e_3$ for some (unique) real numbers $x_1$, $x_2$, $x_3$. This proves (3).
Definition 2. Three vectors $e_1$, $e_2$, $e_3$ which are not co-planar are called a basis for three-dimensional space. If $u$ is a three-dimensional vector, then the numbers $x_1$, $x_2$, $x_3$ in (3) are called the coordinates of $u$ with respect to the basis $e_1$, $e_2$, $e_3$. If the basis is understood, we can abbreviate the notation (3) and write
$$u = (x_1, x_2, x_3).$$

Example. Let $OPQR$ be a tetrahedron. The vectors $e_1 = \overrightarrow{OP}$, $e_2 = \overrightarrow{OQ}$, $e_3 = \overrightarrow{OR}$ then form a basis for three-dimensional space. Let $N$ be the center of mass of the triangle $PQR$. By Example 2 in the previous section, we then have
$$\overrightarrow{ON} = (1/3,\ 1/3,\ 1/3)$$
relative to this basis.
When we have a fixed basis for 3-space, we can compute with coordinates instead of vectors. We then have the rules
$$(5)\qquad (x_1, x_2, x_3) + (y_1, y_2, y_3) = (x_1 + y_1,\ x_2 + y_2,\ x_3 + y_3)$$
$$(6)\qquad t(x_1, x_2, x_3) = (tx_1,\ tx_2,\ tx_3).$$
The rule (5) corresponds to the rule for addition of vectors, for if
$$u = x_1 e_1 + x_2 e_2 + x_3 e_3 \quad\text{and}\quad v = y_1 e_1 + y_2 e_2 + y_3 e_3,$$
then
$$u + v = (x_1 e_1 + x_2 e_2 + x_3 e_3) + (y_1 e_1 + y_2 e_2 + y_3 e_3) = (x_1 + y_1)e_1 + (x_2 + y_2)e_2 + (x_3 + y_3)e_3.$$
The proof of (6) is similar.
Projections. In the decomposition (4) of a vector $u$, the vector $u'$ is called the projection of $u$ parallel to the line $\ell$ on the plane $M$, and $u''$ is called the projection of $u$ parallel to $M$ on $\ell$.
If the line $\ell$ is normal to the plane $M$, then the projection $u'$ is called the orthogonal (or "right-angled") projection of $u$ on $M$.
If we decompose both $u$ and $v$ in this way, we find
$$u + v = (u' + v') + (u'' + v''),$$
which means that
$$(u + v)' = u' + v' \quad\text{and}\quad (u + v)'' = u'' + v''.$$

Exercises.
7. Let $OPQR$ be a tetrahedron and introduce a basis for 3-space by $e_1 = \overrightarrow{OP}$, $e_2 = \overrightarrow{OQ}$, $e_3 = \overrightarrow{OR}$. Let $A$ be the mid-point of the segment $OP$ and $B$ the mid-point of the segment $QR$. Also, let $C$ be the mid-point of the segment $AB$. Determine the coordinates of the vectors $\overrightarrow{OA}$, $\overrightarrow{OB}$, and $\overrightarrow{OC}$ relative to the basis $e_1$, $e_2$, $e_3$.


3. Linear Dependence
Let $u_1, u_2, \dots, u_k$ be a collection of vectors. We introduce some basic terminology:
A vector $u$ of the form
$$u = \lambda_1 u_1 + \dots + \lambda_k u_k,$$
where $\lambda_1, \dots, \lambda_k$ are real numbers, is called a linear combination of the vectors $u_1, \dots, u_k$.
The collection $u_1, u_2, \dots, u_k$ is called linearly dependent if there are real numbers $\lambda_1, \dots, \lambda_k$, not all equal to zero, such that
$$\lambda_1 u_1 + \dots + \lambda_k u_k = 0.$$
Otherwise, i.e., if
$$\lambda_1 u_1 + \dots + \lambda_k u_k = 0 \quad\Longrightarrow\quad \lambda_1 = \lambda_2 = \dots = \lambda_k = 0,$$
we say that the collection is linearly independent.
Example 1. That two vectors $u_1$, $u_2$ are linearly dependent means precisely that there are $\lambda_1$, $\lambda_2$, not both zero, such that $\lambda_1 u_1 + \lambda_2 u_2 = 0$. If $\lambda_1 \neq 0$ this implies $u_1 = tu_2$ where $t = -\lambda_2/\lambda_1$, so $u_1$ and $u_2$ are parallel. The same conclusion holds if $\lambda_2 \neq 0$. Conversely, if $u_1$ and $u_2$ are parallel, say if $u_1 = tu_2$, then the relation $1 \cdot u_1 + (-t)u_2 = 0$ shows that $u_1$, $u_2$ are linearly dependent. We have shown that two vectors are linearly dependent if and only if they are parallel.
The example has the following generalization for arbitrary collections of vectors.
Theorem 3. A collection $u_1, \dots, u_k$ is linearly dependent if and only if (at least) one $u_j$ can be written as a linear combination of the other $u_i$'s.
Proof. ($\Rightarrow$) Assume that $u_1, \dots, u_k$ are linearly dependent. Then there are real numbers $\lambda_1, \dots, \lambda_k$, not all zero, such that
$$\lambda_1 u_1 + \lambda_2 u_2 + \dots + \lambda_k u_k = 0.$$
We can w.l.o.g. assume that $\lambda_1 \neq 0$. But then
$$u_1 = -(\lambda_2/\lambda_1)u_2 - \dots - (\lambda_k/\lambda_1)u_k,$$
which shows that $u_1$ is a linear combination of $u_2, \dots, u_k$.
($\Leftarrow$) Suppose that some $u_j$ is a linear combination of the other $u_i$'s. We can w.l.o.g. assume that $j = 1$ and that
$$u_1 = t_2 u_2 + \dots + t_k u_k,$$
for some real numbers $t_2, \dots, t_k$. Then
$$1 \cdot u_1 + (-t_2)u_2 + \dots + (-t_k)u_k = 0.$$
Since the coefficient of $u_1$ is not zero, we infer that the collection is linearly dependent.
Remark 4. If a collection $u_1, \dots, u_k$ is linearly dependent, then any larger collection $u_1, \dots, u_k, u_{k+1}, \dots, u_n$ is also linearly dependent. This holds, since if
$$\lambda_1 u_1 + \dots + \lambda_k u_k = 0$$
where not all $\lambda_j$ are zero, then
$$\lambda_1 u_1 + \dots + \lambda_k u_k + 0u_{k+1} + \dots + 0u_n = 0.$$
Example 2. Now suppose that three vectors $u_1$, $u_2$, $u_3$ are linearly dependent. Then one of them, say $u_3$, can be written as a linear combination of $u_1$ and $u_2$:
$$u_3 = t_1 u_1 + t_2 u_2.$$
Hence if $u_1$ and $u_2$ are in a plane $M$, then $u_3$ is also in $M$. Conversely, if $u_1$, $u_2$, $u_3$ belong to a plane $M$, then they are linearly dependent. To realize this, it suffices to observe that either $u_1$, $u_2$ are linearly dependent, whence $u_1$, $u_2$, $u_3$ is also linearly dependent by the remark above, or $u_1$, $u_2$ is a basis for the plane $M$, whence $u_3 \in M$ implies that $u_3$ is a linear combination of $u_1$ and $u_2$. We have shown that three vectors are linearly dependent if and only if they are co-planar. Equivalently, three vectors are linearly independent if and only if they form a basis for 3-space.
Example 3. Assume that the vectors $u_1$, $u_2$, $u_3$ have coordinates
$$u_1 = (1, 2, 2), \qquad u_2 = (-2, 3, 1), \qquad u_3 = (-1, 3, 2)$$
with respect to some basis for 3-space. We shall investigate whether or not the vectors $u_1$, $u_2$, $u_3$ form a basis. To this end, we consider the vector equation
$$\lambda_1 u_1 + \lambda_2 u_2 + \lambda_3 u_3 = 0,$$
which is equivalent to the linear system
$$\begin{cases} \lambda_1 - 2\lambda_2 - \lambda_3 = 0 \\ 2\lambda_1 + 3\lambda_2 + 3\lambda_3 = 0 \\ 2\lambda_1 + \lambda_2 + 2\lambda_3 = 0. \end{cases}$$
Solving this with the elimination method gives the only solution $\lambda_1 = \lambda_2 = \lambda_3 = 0$. Hence $u_1$, $u_2$, $u_3$ is linearly independent and (by the result of Example 2) is a basis for 3-space.
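The test for linear independence used in Example 3 amounts to checking that elimination on the homogeneous system produces a pivot in every column. The following Python sketch is an illustration added here (the function names are hypothetical), and it reads the coordinates as u1 = (1, 2, 2), u2 = (-2, 3, 1), u3 = (-1, 3, 2).

```python
from fractions import Fraction

def count_pivots(matrix):
    """Number of pivots found by forward elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    pivots = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
        if pivots == len(rows):
            break
    return pivots

def linearly_independent(*vectors):
    """True if only the trivial combination of the vectors gives the zero vector."""
    # Column j of the coefficient matrix holds the coordinates of the j-th vector.
    coefficient_matrix = [list(row) for row in zip(*vectors)]
    return count_pivots(coefficient_matrix) == len(vectors)

u1, u2, u3 = (1, 2, 2), (-2, 3, 1), (-1, 3, 2)
print(linearly_independent(u1, u2, u3))
print(linearly_independent(u1, u2, (-1, 5, 3)))   # third vector equals u1 + u2
```

Since the right-hand sides of a homogeneous system stay zero under every row operation, the sketch leaves them out and works with the coefficient matrix alone.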
A collection of four or more vectors in 3-space is always linearly dependent. To show this, it is enough to prove that four vectors are linearly dependent, for then every larger collection is linearly dependent as well. Thus let $u_1$, $u_2$, $u_3$, $u_4$ be four arbitrary vectors in 3-space. Then either $u_1$, $u_2$, $u_3$ are linearly dependent, and therefore so are $u_1$, $u_2$, $u_3$, $u_4$, or $u_1$, $u_2$, $u_3$ is a basis for 3-space, whence $u_4$ is a linear combination of $u_1$, $u_2$, $u_3$. In either case, $u_1$, $u_2$, $u_3$, $u_4$ is linearly dependent.
Exercises. In the following exercises, vectors are expressed by
their coordinates relative to some fixed basis e1 , e2 , e3 .
8. Prove that the vector v = (2, 7, 1) is in the plane spanned
by the vectors u1 = (2, 1, 3) and u2 = (1, 1, 2). (The unique
plane containing two linearly independent vectors is called
the plane spanned by the vectors in question.) Determine the
coordinates for v with respect to the basis u1 , u2 .
9. Are the vectors (1, 2, 1), (2, 1, 1), (1, 4, 5) co-planar?
10. Prove that the vectors (1, 1, 2), (4, 4, 9), (2, 3, 7) form a basis for
the three-dimensional space. Determine the coordinates of
the vector (5, 4, 3) relative to this basis.
11. For which values of k are the following sets of vectors linearly
independent?
a) (k, k2 , k3 ), (2, 2, 2);
b) (1, 1, 1), (1, k, 2k), (k, 1, k);
c) (1, 1, k), (2, k, 4), (k, 2, 4).
4. Lines and Planes
Fix a point O in three-dimensional space. An arbitrary point P

can then be described by the position vector OP. For a given basis e1 ,
e2 , e3 there are then unique numbers x1 , x2 , x3 such that

OP = x1 e1 + x2 e2 + x3 e3 .
The numbers x1 , x2 , x3 are the coordinates of the point P with respect
to the coordinate system Oe1 e2 e3 . If the coordinate system is clear from
the context, we can simply write
P = (x1 , x2 , x3 ).
Remark 5. In a similar way we can describe points in a plane .
Let O be a point in and e1 , e2 is a basis for . Then for each point P
in we can write

OP = x1 e1 + x2 e2
for unique x1 , x2 R. We say that x1 , x2 are the coordinates of P in
the coordinate system Oe1 e2 for . In short: P = (x1 , x2 ).

26

2. VECTORS

Throughout the rest of this section we fix a coordinate system


Oe1 e2 e3 for three-dimensional space. Each point can then be identified with its coordinates, which we denote by (x, y, z) rather than
(x1 , x2 , x3 ).
Parametric representation. Let ` be a line in space. Suppose that
` passes through a point Q and that a vector u , 0 is parallel to `.
Then a point P belongs to ` if and only if

(1)
QP = tu
for some t R. The vector u is called a direction vector of `.
Denote by (a, b, c) the coordinates of u with respect to the basis e1 ,
e2 , e3 . Thus
u = ae1 + be2 + ce3 .

If Q0 = (x0 , y0 , z0 ) and P = (x, y, z), then the vector QP = OP OQ has
coordinates (x x0 , y y0 , z z0 ). Hence (1) can be cast in the form

x x0 = at

t R.
(2)
y y0 = bt ,

z z0 = ct
The relation (2) is known as the parametric representation of the line `.
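Example 1. Consider the line $\ell$ which passes through the points
$Q = (3, -6, -5)$ and $R = (4, -3, -3)$. A direction vector of $\ell$ is
\[ \overrightarrow{QR} = \overrightarrow{OR} - \overrightarrow{OQ} = (4, -3, -3) - (3, -6, -5) = (1, 3, 2). \]
Since $\ell$ passes through $Q$, we conclude that
\[ \begin{cases} x - 3 = t \\ y + 6 = 3t \\ z + 5 = 2t \end{cases} \qquad t \in \mathbb{R}, \]
is a parametric representation for $\ell$.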
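The computation in Example 1 can be sketched in a few lines of Python. This is only an illustration: the helper names `line_through` and `point_on_line` are hypothetical and not part of the notes, and the points are taken with the signs $Q = (3, -6, -5)$, $R = (4, -3, -3)$.

```python
def line_through(q, r):
    """Return (point, direction) for the line through the distinct points q and r."""
    direction = tuple(rc - qc for qc, rc in zip(q, r))
    if all(c == 0 for c in direction):
        raise ValueError("the two points must be distinct")
    return q, direction


def point_on_line(point, direction, t):
    """Evaluate the parametric representation point + t * direction."""
    return tuple(p + t * d for p, d in zip(point, direction))


Q = (3, -6, -5)
R = (4, -3, -3)
point, u = line_through(Q, R)
print(u)                            # direction vector QR
print(point_on_line(point, u, 1))   # parameter value t = 1 gives back R
```

The parameter value $t = 0$ returns $Q$ and $t = 1$ returns $R$, which is a quick sanity check on any parametric representation built from two points.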
In a similar way, one can represent a plane $M$ in space. Namely, let
$Q$ be a point in $M$ and $u$, $v$ a basis for $M$ (i.e. two non-parallel vectors
which are parallel to $M$). Then a point $P$ in space belongs to $M$ if and
only if
\[ \overrightarrow{QP} = su + tv \tag{3} \]
for some $s, t \in \mathbb{R}$. The numbers $s$, $t$ are the coordinates of $P$ in the
coordinate system $Quv$ of $M$. If we denote
\[ u = ae_1 + be_2 + ce_3, \qquad v = a'e_1 + b'e_2 + c'e_3, \qquad Q = (x_0, y_0, z_0), \]
then (3) can be written
\[ \begin{cases} x - x_0 = as + a't \\ y - y_0 = bs + b't \\ z - z_0 = cs + c't \end{cases} \qquad s, t \in \mathbb{R}. \tag{4} \]
This formula is called a parametric representation of the plane $M$.


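Example 2. Consider the plane $M$ passing through the points $Q =
(1, 2, 0)$, $R_1 = (0, 1, 1)$, and $R_2 = (2, -1, -3)$. Two non-parallel vectors
in $M$ are
\[ u = \overrightarrow{QR_1} = (-1, -1, 1), \qquad v = \overrightarrow{QR_2} = (1, -3, -3). \]
Hence
\[ \begin{cases} x - 1 = -s + t \\ y - 2 = -s - 3t \\ z = s - 3t \end{cases} \qquad s, t \in \mathbb{R} \]
is a parametric representation of $M$.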
Example 3. Let us determine the point of intersection between the
line $\ell$ of Example 1 and the plane $M$ of Example 2. To this end, we
use separate names for the $t$-parameters in the representations of $\ell$ and
$M$ and write
\[ \ell: \begin{cases} x - 3 = t_1 \\ y + 6 = 3t_1 \\ z + 5 = 2t_1 \end{cases}, \qquad M: \begin{cases} x - 1 = -s + t_2 \\ y - 2 = -s - 3t_2 \\ z = s - 3t_2 \end{cases}. \]
At the point of intersection, the values of $x$, $y$, and $z$ must match, i.e.,
\[ \begin{cases} 3 + t_1 = 1 - s + t_2 \\ -6 + 3t_1 = 2 - s - 3t_2 \\ -5 + 2t_1 = s - 3t_2 \end{cases} \iff \begin{cases} s + t_1 - t_2 = -2 \\ s + 3t_1 + 3t_2 = 8 \\ -s + 2t_1 + 3t_2 = 5 \end{cases}. \]
After Gaussian elimination, this gives $s = 2$, $t_1 = -1$, and $t_2 = 3$.
Inserting $t_1 = -1$ into the equation for $\ell$ gives $(x, y, z) = (2, -9, -7)$.
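The elimination step of Example 3 can be checked with exact rational arithmetic. The following is a minimal sketch, assuming the system has a unique solution; the function name `solve_linear_system` is hypothetical, and the unknowns are ordered as $(s, t_1, t_2)$.

```python
from fractions import Fraction


def solve_linear_system(a, b):
    """Solve the square system a*x = b by Gauss-Jordan elimination over the rationals.

    Assumes a unique solution exists (a non-zero pivot is found in every column).
    """
    n = len(a)
    rows = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        rows[col] = [v / rows[col][col] for v in rows[col]]
        # Eliminate the column from every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[n] for row in rows]


# Rows: s + t1 - t2 = -2,  s + 3 t1 + 3 t2 = 8,  -s + 2 t1 + 3 t2 = 5.
coefficients = [[1, 1, -1], [1, 3, 3], [-1, 2, 3]]
right_hand_side = [-2, 8, 5]
s, t1, t2 = solve_linear_system(coefficients, right_hand_side)

# The same point must come out of the line and of the plane representation.
point_from_line = (3 + t1, -6 + 3 * t1, -5 + 2 * t1)
point_from_plane = (1 - s + t2, 2 - s - 3 * t2, s - 3 * t2)
print(s, t1, t2, point_from_line)
```

Computing the point twice, once from each representation, guards against sign errors in the set-up of the system.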
The equation of a plane. We claim that planes in space correspond precisely to equations of the form
\[ Ax + By + Cz = D \tag{5} \]
where not all coefficients $A$, $B$, $C$ are zero. Indeed, suppose that $A \neq 0$
(the cases $B \neq 0$ and $C \neq 0$ are analogous). Then a point $(x, y, z)$
satisfies (5) if and only if
\[ \begin{cases} x - (D/A) = -(B/A)s - (C/A)t \\ y = s \\ z = t \end{cases} \]
for some $s, t \in \mathbb{R}$. Thus (5) describes the plane passing through
the point $(D/A, 0, 0)$, which is parallel to the vectors $(-B/A, 1, 0)$,
$(-C/A, 0, 1)$.
Conversely, one can, by eliminating $s$ and $t$ in the parametric representation (4), prove that any plane can be described by an equation
of the form (5). We omit the details, since we shall find
an easier method to prove this later on. Instead, we turn to some
examples.
Example 4. Consider again the plane
\[ M: \begin{cases} x - 1 = -s + t \\ y - 2 = -s - 3t \\ z = s - 3t \end{cases} \qquad s, t \in \mathbb{R}. \]
A point $(x, y, z)$ is in $M$ if and only if there are $s$ and $t$ satisfying these
equations. To determine when this is the case, we can first eliminate
$s$ from the last two equations, and then $t$ from the last equation:
\[ \begin{cases} x - 1 = -s + t \\ y - 2 = -s - 3t \\ z = s - 3t \end{cases} \iff \begin{cases} x - 1 = -s + t \\ -x + y - 1 = -4t \\ x + z - 1 = -2t \end{cases} \iff \begin{cases} x - 1 = -s + t \\ -x + y - 1 = -4t \\ 3x - y + 2z - 1 = 0 \end{cases}. \]
Since we can always choose $s$ and $t$ so that the first two equations are
satisfied, we see that a point $(x, y, z)$ belongs to $M$ if and only if the
third equation is satisfied, i.e., iff
\[ 3x - y + 2z - 1 = 0. \]
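As a cross-check on Example 4, the coefficients $(A, B, C)$ can also be obtained from the two spanning vectors by the component formula of the cross-product. This is a different technique from the elimination used in the example, and it is used here purely algebraically: the resulting triple annihilates both $u$ and $v$ in any coordinate system. The helper names are hypothetical.

```python
def cross(u, v):
    """Component formula for the cross-product of two coordinate triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def plane_equation(q, u, v):
    """Return ((A, B, C), D) such that the plane through q spanned by u, v is Ax + By + Cz = D."""
    normal = cross(u, v)
    return normal, dot(normal, q)


Q = (1, 2, 0)
u = (-1, -1, 1)
v = (1, -3, -3)
normal, d = plane_equation(Q, u, v)
print(normal, d)   # coefficients of 6x - 2y + 4z = 2, i.e. 3x - y + 2z = 1


def plane_point(s, t):
    """A point of the plane, taken from the parametric representation."""
    return (1 - s + t, 2 - s - 3 * t, s - 3 * t)
```

Dividing the result by 2 gives the equation $3x - y + 2z - 1 = 0$ found by elimination.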
Lines in space can be described as the intersection between two
non-parallel planes. This is illustrated by the following example.
Example 5. The points $(x, y, z)$ which belong to both of the planes
\[ 2x + y - 3z = 5 \quad \text{and} \quad x + 2y - z = 4 \]
are precisely the solutions to the system
\[ \begin{cases} 2x + y - 3z = 5 \\ x + 2y - z = 4 \end{cases} \iff \begin{cases} 2x + y - 3z = 5 \\ 3y + z = 3 \end{cases}. \]
Inserting $z = 3t$ (to obtain integer coefficients in the parametric representation) we get
\[ \begin{cases} x = 2 + 5t \\ y = 1 - t \\ z = 3t \end{cases}. \]
The intersection line thus passes through the point $(2, 1, 0)$ and has
direction vector $(5, -1, 3)$.
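The result of Example 5 is easy to verify numerically: every point of the parametric line must satisfy both plane equations. The sketch below also recomputes the direction vector from the two coefficient triples with the cross-product component formula, which is an alternative to the elimination in the example. All names are illustrative.

```python
def cross(u, v):
    """Component formula for the cross-product of two coordinate triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


n1, rhs1 = (2, 1, -3), 5    # 2x + y - 3z = 5
n2, rhs2 = (1, 2, -1), 4    # x + 2y - z = 4


def line_point(t):
    """Point of the intersection line for parameter value t."""
    return (2 + 5 * t, 1 - t, 3 * t)


direction = cross(n1, n2)
on_both_planes = all(
    dot(n1, line_point(t)) == rhs1 and dot(n2, line_point(t)) == rhs2
    for t in range(-3, 4)
)
print(direction, on_both_planes)
```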
Remark 6. To describe a line in space we need two equations of the
form $Ax + By + Cz = D$. On the other hand, if we only consider points
in a plane $M$, and if $x$, $y$ denote coordinates with respect to some
coordinate system in $M$, then one equation $Ax + By = C$ is sufficient to
describe a line. If for example $A \neq 0$, then $(x, y)$ satisfies the equation
if and only if
\[ \begin{cases} x = (C/A) - (B/A)t \\ y = t \end{cases} \]
for some $t \in \mathbb{R}$. This is a parametric representation of a line in $M$
passing through the point $(C/A, 0)$, with direction vector $(-B/A, 1)$.
Exercises.
12. Give a parametric representation for the line passing through
the points $(1, -1, 4)$ and $(2, 3, 5)$.
13. Consider the lines
\[ \ell_1: \begin{cases} x = 2 + t \\ y = 1 - t \\ z = 2t \end{cases}, \qquad \ell_2: \begin{cases} x = 1 + 2t \\ y = t \\ z = 1 + t \end{cases}. \]
Do they intersect? Are they parallel?
14. Determine whether the following lines intersect:
\[ \begin{cases} x = 1 + 15t \\ y = -4 - 21t \\ z = 5 + 33t \end{cases}, \qquad \begin{cases} x = 6 - 65t \\ y = -11 + 91t \\ z = 16 - 143t \end{cases}. \]


15. Find a parametric representation of the line $\ell$ which passes
through the point $(3, 2, 1)$ and intersects the lines
\[ \begin{cases} x = 10 + 5t \\ y = 5 + t \\ z = 2 + 2t \end{cases} \quad \text{and} \quad \begin{cases} x = 1 + t \\ y = t \\ z = 5 + t \end{cases}. \]
16. Find a parametric representation of the plane which passes
through the points $(2, 3, 0)$, $(1, 5, 2)$, and $(-1, 4, 3)$.
17. Are the four points (1, 1, 0), (0, 4, 1), (1, 0, 1), (1, 3, 2)
co-planar?
18. Find, in the form Ax + By + Cz = D, the equation for the
plane which passes through the point (1, 1, 2) and which
contains the line

\[ \begin{cases} x = 3 - t \\ y = 2 + 2t \\ z = 1 - 3t \end{cases}. \]
19. Prove that a line with direction-vector (a, b, c) is parallel to the
plane Ax + By + Cz = D if and only if
Aa + Bb + Cc = 0.
20. Find a parametric representation for the line which passes
through the point (1, 2, 4) and which is parallel to the planes
\[ 2x + y - z = 3 \quad \text{and} \quad 3x - 3y + z = 0. \]

21. Determine the equation for the plane which contains the line
\[ \begin{cases} 3x + 4y + z = 5 \\ x - y = -6 \end{cases} \]
and which passes through the mid-point of the segment between the points $(1, 1, 2)$ and $(3, 1, 4)$.
5. Answers to Exercises
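1. a) $a$ b) $b - a$ c) $\frac{1}{2}(a + b)$ d) $\frac{1}{2}(b - a)$.
2. a) $\overrightarrow{OC}$ b) $\overrightarrow{AB}$ c) $\overrightarrow{OD}$ d) $\overrightarrow{AL}$.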

7. OA = (1/2, 0, 0), OB = (0, 1/2, 1/2), OC = (1/4, 1/4, 1/4).


8. $v = 3u_1 - 4u_2$.
9. Yes, $(-1, 4, 5) = 3(1, 2, 1) - 2(2, 1, -1)$.
10. $(5, 4, 3) = 23(1, 1, 2) - 4(4, 4, 9) - (2, 3, 7)$.
The coordinates are thus $(23, -4, -1)$.
11. a) $k \notin \{0, 1\}$
b) $k \notin \{1/2, 1\}$
c) $k \notin \{2, -4\}$.


12. $x = 1 + t$, $y = -1 + 4t$, $z = 4 + t$.
13. The lines do not intersect and they are not parallel.
14. The lines coincide.
15. $\ell$: $x = 5 + 2t$, $y = 4 + 2t$, $z = -t$.
16. $M$: $x = 2 - s - 3t$, $y = 3 + 2s + t$, $z = 2s + 3t$.
17. Yes.
18. $M$: $5x - 3y - 14z = 60$.
19. $M$: $x - y - z = 0$.
20. $x = 1 + 2t$, $y = 2 + 5t$, $z = 4 + 9t$.
21. $13x + 36y + 7z = 83$.


CHAPTER 3

Distance and Angle


1. Scalar Product
We start with an important definition.
Definition 7. Let $u$, $v$ be two vectors in space and let $\theta$ be the
angle between them (in the interval $0 \le \theta \le \pi$). Their scalar product
is the number $(u|v)$ defined by
\[ (u|v) = \|u\| \, \|v\| \cos\theta. \]
If $u = 0$ or $v = 0$, the angle is not defined, but in these cases we
define $(u|v) = 0$.
The word "scalar" is a synonym for "number"; the term "scalar
product" is used because the result $(u|v)$ is a real number and not a
vector.
Now consider the case when $v = e$ is a unit vector, i.e., $\|e\| = 1$. If
$\theta$ is the angle between $u$ and $e$, then
\[ (u|e) = \|u\| \cos\theta. \tag{1} \]
We shall find an important geometric interpretation of this formula:
Place $u$ and $e$ so that they emanate from the same point and let $\ell$
be a line through this point, with direction vector $e$. The orthogonal
projection of $u$ on $\ell$ (see Chapter 2, end of Section 2) can then be seen to
be
\[ u'' = (\|u\| \cos\theta)\, e. \tag{2} \]
(This can be verified by an elementary trigonometric argument;
we ask the reader to supply the details.)
Comparing the formulas (1) and (2), it is seen that
\[ u'' = (u|e)\, e, \]
i.e. the number $(u|e)$ is the coordinate of the orthogonal projection $u''$ in
the basis $e$ for $\ell$.
If $v \neq 0$ is not a unit vector, we write
\[ (u|v) = \|v\| \, (u|e), \]
where $e = \frac{1}{\|v\|} v$ is a unit vector. Hence the absolute value $|(u|v)|$
equals the length of $v$ times the length of the orthogonal projection
of $u$ on a line with direction vector $v$. The sign of $(u|v)$ is positive if
the angle between $u$ and $v$ is acute, and negative if it is obtuse.
The scalar product satisfies the following rules.
(I) $(u|v) = (v|u)$
(II) $(u + v|w) = (u|w) + (v|w)$
(III) $(tu|v) = t\,(u|v)$
(IV) $(u|u) \ge 0$, with equality if and only if $u = 0$.
(V) $\|u\| = \sqrt{(u|u)}$
The rule (I) is called symmetry of the scalar product; (II) and (III)
together are called linearity in the first argument; (IV) is called positive
definiteness.
The properties (I), (III), (IV) and (V) are immediate; (II) can be
seen in the following way: If $w = e$ is a unit vector, then, by the
geometrical interpretation of the scalar product, (II)
means precisely that
\[ (u + v)'' = u'' + v''. \]
That this is the case was shown at the end of Chap. 2, Sect. 2.
This shows (II) when $w$ is a unit vector. For general $w \neq 0$, we now get
(II) by applying the unit vector case to the vector $e = \frac{1}{\|w\|} w$ in the formula
$(u|w) = \|w\| \, (u|e)$.
Remark 8. Note that linearity in the second argument follows from
the symmetry and the linearity in the first argument. In particular,
we have $(u|tv) = t(u|v)$.
As an application of the scalar product, we prove a well-known
theorem. We say that two vectors $u$, $v$ are orthogonal to each other if
$(u|v) = 0$. (This means that the angle $\theta = \pi/2$, or that one of $u$ or $v$ is
the zero vector.)
Theorem 9. (Pythagoras' Theorem) If $u$ and $v$ are orthogonal, then
\[ \|u + v\|^2 = \|u\|^2 + \|v\|^2. \]
Proof. If $u$, $v$ are any vectors (orthogonal or not), then by the
computational rules above
\[ \|u + v\|^2 = (u + v|u + v) = (u|u) + (u|v) + (v|u) + (v|v) = \|u\|^2 + 2(u|v) + \|v\|^2. \]
Hence if $(u|v) = 0$, then
\[ \|u + v\|^2 = \|u\|^2 + \|v\|^2. \]

Exercises.
1. Let $\theta$ be the angle between the sides $AB$ and $BC$ in a triangle
$ABC$. Prove the law of cosines:
\[ |AC|^2 = |AB|^2 + |BC|^2 - 2\,|AB|\,|BC| \cos\theta. \]
(Observe that this reduces to Pythagoras' Theorem when $\theta =
\pi/2$.)
2. Denote by $a$ and $b$ the side-lengths in a parallelogram, and by
$c$ and $d$ the lengths of the diagonals. Prove the parallelogram
law:
\[ c^2 + d^2 = 2\left(a^2 + b^2\right). \]
2. Orthonormal Bases
Let $M$ be a plane and $e_1$, $e_2$ a basis for $M$. If
\[ u = x_1 e_1 + x_2 e_2 \quad \text{and} \quad v = y_1 e_1 + y_2 e_2 \]
are two vectors in $M$, then by the rules (I)-(III) for the scalar product,
\[ (u|v) = x_1 y_1 (e_1|e_1) + (x_1 y_2 + x_2 y_1)(e_1|e_2) + x_2 y_2 (e_2|e_2). \]
This expression becomes particularly simple if the basis vectors $e_1$, $e_2$
are unit vectors making a right angle with each other. Namely, then
$(e_1|e_1) = (e_2|e_2) = 1$ and $(e_1|e_2) = 0$, and thus
\[ (u|v) = x_1 y_1 + x_2 y_2. \tag{3} \]
A basis for a plane consisting of two orthogonal unit vectors is called
an orthonormal basis (in short: ON-basis) for the plane.
The corresponding definition in three dimensions is the following.
We say that three vectors $e_1$, $e_2$, $e_3$ are pairwise orthogonal if
\[ (e_j|e_k) = 0 \quad \text{when } j \neq k. \]
Three vectors $e_1$, $e_2$, $e_3$ are said to form an orthonormal basis for three-dimensional space if (i) they have unit length, and (ii) they are pairwise orthogonal. We can summarize the definition of orthonormal
basis by the equation
\[ (e_j|e_k) = \begin{cases} 0 & \text{if } j \neq k \\ 1 & \text{if } j = k \end{cases}. \]

If $e_1$, $e_2$, $e_3$ form an orthonormal basis and
\[ u = x_1 e_1 + x_2 e_2 + x_3 e_3 \quad \text{and} \quad v = y_1 e_1 + y_2 e_2 + y_3 e_3, \]
then
\[ (u|v) = x_1 y_1 + x_2 y_2 + x_3 y_3. \tag{4} \]
When $u = v$ this reduces to
\[ \|u\|^2 = x_1^2 + x_2^2 + x_3^2. \tag{5} \]
This can be regarded as the three-dimensional version of the theorem
of Pythagoras. (Here $u$ is the sum of the three pairwise orthogonal vectors
$x_1 e_1$, $x_2 e_2$, $x_3 e_3$.)
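Example 1. Suppose that $u$ and $v$ have coordinates
\[ u = (4, 1, 1) \quad \text{and} \quad v = (2, 2, -1) \]
with respect to some orthonormal basis. We shall determine the angle $\theta$
between $u$ and $v$. For this purpose we use (4) and (5) to compute
\[ (u|v) = 4 \cdot 2 + 1 \cdot 2 + 1 \cdot (-1) = 9, \]
\[ \|u\| = \sqrt{16 + 1 + 1} = 3\sqrt{2}, \]
\[ \|v\| = \sqrt{4 + 4 + 1} = 3. \]
Since $(u|v) = \|u\| \|v\| \cos\theta$, this gives
\[ \cos\theta = \frac{(u|v)}{\|u\| \|v\|} = \frac{9}{3\sqrt{2} \cdot 3} = \frac{1}{\sqrt{2}}. \]
This means that $\theta = \pi/4$.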
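The recipe of Example 1 (scalar product by formula (4), lengths by formula (5), then the definition of the scalar product) can be sketched as follows. The function names are illustrative, the coordinates are assumed to refer to an orthonormal basis, and floating-point arithmetic is used, so the result is only approximately $\pi/4$.

```python
import math


def dot(u, v):
    """Scalar product in coordinates with respect to an orthonormal basis."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def angle_between(u, v):
    """Angle in [0, pi] between two non-zero vectors."""
    cosine = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to protect acos against rounding errors.
    return math.acos(max(-1.0, min(1.0, cosine)))


u = (4, 1, 1)
v = (2, 2, -1)
theta = angle_between(u, v)
print(dot(u, v), theta, math.pi / 4)
```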


Example 2. The formula (3) can be used to prove many trigonometric
identities. (As we know, the de Moivre formula for multiplication of
complex numbers provides another way to do this.)
As an example, we shall now show the following version of the
addition formula for cosines:
\[ \cos(\alpha - \beta) = \cos\alpha \cos\beta + \sin\alpha \sin\beta. \tag{6} \]
To this end, let $e_1$, $e_2$ be an orthonormal basis for the plane and put
\[ u = (\cos\alpha) e_1 + (\sin\alpha) e_2, \qquad v = (\cos\beta) e_1 + (\sin\beta) e_2. \]
Since $u$ and $v$ then have length 1, while the angle between them is
$\pm(\alpha - \beta)$, the definition of scalar product shows that
\[ (u|v) = \cos(\alpha - \beta). \]
(We have here used the fact that cos is even: $\cos(\alpha - \beta) = \cos(\beta - \alpha)$.)
On the other hand, by (3), we have
\[ (u|v) = \cos\alpha \cos\beta + \sin\alpha \sin\beta. \]
Comparing the two expressions for $(u|v)$, we infer that the formula
(6) holds.
Exercises.
3. A triangle in space has vertices at the points (1, 0, 2), (0, 1, 1),
and (2, 1, 2) according to some orthonormal basis. Compute
all side-lengths and cosines of angles in the triangle.
4. Let $e_1$, $e_2$, $e_3$ be an orthonormal basis in space. Suppose that
the vector $u$ makes the angle $\pi/4$ with the vector $e_1$ and the angle
$\pi/3$ with the vector $e_2$. What are the possible angles between $u$
and $e_3$?
3. Computing Distances and Angles
A coordinate system $Oe_1e_2e_3$ where $e_1$, $e_2$, $e_3$ is an orthonormal
basis is called an orthonormal system or an ON-system. In this section
we fix an ON-system and assume that all points are represented
in that system. The distance between two points $P = (x, y, z)$ and
$Q = (x_0, y_0, z_0)$ is then given by
\[ |PQ| = \sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}. \]
Distance between a point and a line. Consider a line $\ell$ and a
point $P$ which is not on the line. By the distance between $P$ and $\ell$
we mean the shortest possible distance from $P$ to a point $R \in \ell$. A
geometric consideration shows that the closest point $R$ is determined
by the condition that $\overrightarrow{PR}$ be orthogonal to the direction vector of $\ell$.
(Draw a figure!)
Example 1. Let $\ell$ be the line through the points $Q = (-2, 1, 1)$ and
$S = (0, -1, 2)$, and let $P = (1, 2, 1)$. We shall compute the distance
from $P$ to $\ell$.
First note that $\ell$ has direction vector $\overrightarrow{QS} = (2, -2, 1)$, so it has the
parametric representation
\[ \ell: \begin{cases} x = -2 + 2t \\ y = 1 - 2t \\ z = 1 + t \end{cases}. \]
We now seek the point $R \in \ell$ closest to $P$. To this end we put
$R = (-2 + 2t, 1 - 2t, 1 + t)$ and we must determine $t$ so that $\overrightarrow{PR}$ is
orthogonal to the direction vector $(2, -2, 1)$ of $\ell$. That is, we must
have
\[ 0 = \left(\overrightarrow{PR} \,\middle|\, \overrightarrow{QS}\right) = (-3 + 2t) \cdot 2 + (-1 - 2t) \cdot (-2) + t \cdot 1 = 9t - 4. \]
This gives $t = 4/9$ and $\overrightarrow{PR} = \frac{1}{9}(-19, -17, 4)$. The distance from $P$ to $\ell$
is thus
\[ |PR| = \frac{1}{9}\sqrt{19^2 + 17^2 + 4^2} = \frac{\sqrt{74}}{3}. \]
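The orthogonality condition used in Example 1 leads to the closed form $t = (\overrightarrow{QP}|d)/(d|d)$ for the parameter of the closest point, where $d$ is the direction vector. The sketch below applies this with exact fractions; the function name `closest_point_on_line` is hypothetical and an ON-system is assumed.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def closest_point_on_line(p, q, d):
    """Return (t, R) where R = q + t*d is the point of the line closest to p.

    The parameter t is chosen so that the vector from p to R is orthogonal to d.
    """
    qp = tuple(pc - qc for pc, qc in zip(p, q))
    t = Fraction(dot(qp, d), dot(d, d))
    r = tuple(qc + t * dc for qc, dc in zip(q, d))
    return t, r


P = (1, 2, 1)
Q = (-2, 1, 1)
d = (2, -2, 1)
t, R = closest_point_on_line(P, Q, d)
PR = tuple(rc - pc for rc, pc in zip(R, P))
distance_squared = dot(PR, PR)   # the distance itself is the square root of this
print(t, PR, distance_squared)
```

The squared distance comes out as $74/9$, in agreement with $|PR| = \sqrt{74}/3$.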
Distance from a point to a plane. Let $M$ be a plane in space, $Q$
a point in $M$, and $n$ a unit normal vector to $M$. (This means that $n$ has
unit length and points in a direction orthogonal to $M$.)
The distance $d$ between a point $P$ and $M$ is again defined as the
smallest distance from $P$ to some point in $M$. We assert that
\[ d = \left| \left( n \,\middle|\, \overrightarrow{QP} \right) \right|. \tag{7} \]
To prove this, it suffices to note that the orthogonal projection of the
vector $\overrightarrow{QP}$ on $n$ is equal to $(n|\overrightarrow{QP})\, n$. The length of the latter vector
must thus equal the sought distance, which proves (7).
Now suppose that $Q = (x_0, y_0, z_0)$, $P = (x, y, z)$, and $n = (A, B, C)$.
Then
\[ (n|\overrightarrow{QP}) = A(x - x_0) + B(y - y_0) + C(z - z_0). \]
If we put
\[ D = Ax_0 + By_0 + Cz_0, \]
then (7) can be written
\[ d = |Ax + By + Cz - D|. \tag{8} \]
But $P$ belongs to $M$ precisely when $d = 0$, i.e., when
\[ Ax + By + Cz = D. \tag{9} \]
We have shown that an arbitrary plane can be described by a linear
equation of the type (9). If $P = (x, y, z)$ is not in $M$, then its distance to
$M$ is given by (8).
Remark 10. The argument above shows that if we have a plane of
the form (9), then $n = (A, B, C)$ is a normal vector. Above we assumed
that $n$ is a unit vector, i.e., that
\[ A^2 + B^2 + C^2 = 1. \]
If this is not satisfied, one must first normalize the equation (9) by
dividing by $\sqrt{A^2 + B^2 + C^2}$. The distance formula then becomes
\[ d = \frac{|Ax + By + Cz - D|}{\sqrt{A^2 + B^2 + C^2}}. \tag{10} \]
Example 2. We shall determine an equation for the plane $M$ which
passes through the point $(1, -1, 2)$ and has the line
\[ \ell: \begin{cases} x = 1 + 3t \\ y = 3 + 2t \\ z = 2 - t \end{cases} \]
as a normal.
The direction vector of $\ell$, i.e. the vector $(3, 2, -1)$, must then be a normal vector of $M$. Since the point $(1, -1, 2)$ belongs to $M$, the equation
of the plane becomes
\[ 3(x - 1) + 2(y - (-1)) + (-1)(z - 2) = 0, \]
i.e.,
\[ 3x + 2y - z = -1. \]
The distance from the point $P = (5, 6, 7)$ to $M$ is thus (see (10))
\[ \frac{|3 \cdot 5 + 2 \cdot 6 - 1 \cdot 7 + 1|}{\sqrt{3^2 + 2^2 + 1^2}} = \frac{21}{\sqrt{14}}. \]
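Formula (10) translates directly into code. The following sketch (the function name `distance_to_plane` is hypothetical, and an ON-system is assumed) reproduces the numbers of Example 2 for the plane $3x + 2y - z = -1$ and the point $(5, 6, 7)$.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def distance_to_plane(normal, d, p):
    """Distance from the point p to the plane normal . (x, y, z) = d, as in formula (10)."""
    if all(c == 0 for c in normal):
        raise ValueError("the normal vector must be non-zero")
    return abs(dot(normal, p) - d) / math.sqrt(dot(normal, normal))


normal = (3, 2, -1)
distance = distance_to_plane(normal, -1, (5, 6, 7))
print(distance, 21 / math.sqrt(14))
```

Because the formula divides by the length of the normal, the plane equation does not have to be normalized beforehand.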
Distance between two lines. Given two lines which do not intersect, we define their distance to be the smallest possible distance
between a point on the first line and a point on the second. The
following example illustrates a method to compute this kind of distance.
Example 3. Consider the lines $\ell_1$ and $\ell_2$ having parametric representations
\[ \ell_1: \begin{cases} x = -3 + 2t \\ y = t \\ z = 1 - t \end{cases}; \qquad \ell_2: \begin{cases} x = 3 + t \\ y = 4 + 3t \\ z = 2 + 2t \end{cases}. \]
We shall compute the (shortest) distance between $\ell_1$ and $\ell_2$. Direction
vectors for the two lines are $(2, 1, -1)$ and $(1, 3, 2)$. Let $M$ be the plane
parallel to these directions, which passes through the point $(-3, 0, 1)$
on $\ell_1$. Then $\ell_1 \subset M$ and also $M$ is parallel to $\ell_2$. Hence the distance $d$
between $\ell_1$ and $\ell_2$ must equal the distance from an arbitrary point
on $\ell_2$ to $M$. A parametric representation of $M$ is
\[ M: \begin{cases} x = -3 + 2t + s \\ y = t + 3s \\ z = 1 - t + 2s \end{cases}. \]
Elimination of $s$ and $t$ gives the equation
\[ x - y + z = -2 \]
for the plane $M$. The distance from the point $(3, 4, 2)$ on $\ell_2$ to $M$ is,
according to (10),
\[ d = \frac{|3 - 4 + 2 + 2|}{\sqrt{1 + 1 + 1}} = \frac{3}{\sqrt{3}} = \sqrt{3}. \]
The distance between the lines is thus $\sqrt{3}$.
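The method of Example 3 can be condensed into one function if the normal of the auxiliary plane is computed with the cross-product component formula instead of by elimination. This is a sketch under that substitution, assuming an ON-system and non-parallel lines; the function name is hypothetical.

```python
import math


def cross(u, v):
    """Component formula for the cross-product of two coordinate triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def distance_between_lines(p1, d1, p2, d2):
    """Distance between the non-parallel lines p1 + t*d1 and p2 + t*d2."""
    n = cross(d1, d2)   # normal of the plane parallel to both directions
    if n == (0, 0, 0):
        raise ValueError("the lines are parallel; this method does not apply")
    p1p2 = tuple(b - a for a, b in zip(p1, p2))
    return abs(dot(n, p1p2)) / math.sqrt(dot(n, n))


normal = cross((2, 1, -1), (1, 3, 2))
distance = distance_between_lines((-3, 0, 1), (2, 1, -1), (3, 4, 2), (1, 3, 2))
print(normal, distance, math.sqrt(3))
```

The normal $(5, -5, 5)$ is a multiple of $(1, -1, 1)$, the coefficient triple of the plane equation found in the example.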


Angle between two planes. If $M_1$, $M_2$ are two planes, we define
the angle between them to be the angle between the corresponding
normal vectors. (In general there are two possibilities for the angle,
depending on the mutual orientations of the normal vectors: see the
example below.)
Example 4. Suppose that
\[ M_1: x - 2y - 2z = 3, \qquad M_2: x + 4y + z = 5. \]
Corresponding normal vectors are $n_1 = (1, -2, -2)$ and $n_2 = (1, 4, 1)$.
The angle $\theta$ between the planes then satisfies
\[ \cos\theta = \frac{(n_1|n_2)}{\|n_1\| \|n_2\|} = \frac{1 \cdot 1 + (-2) \cdot 4 + (-2) \cdot 1}{\sqrt{1^2 + 2^2 + 2^2}\,\sqrt{1^2 + 4^2 + 1^2}} = -\frac{1}{\sqrt{2}}. \]
This gives $\theta = 3\pi/4$. This is the obtuse angle between the planes.
There is also another possibility, namely if we substitute $-n_1$ for $n_1$
above. This leads to the acute angle $\theta = \pi/4$.

Angle between a line and a plane. To determine the angle between a line $\ell$ and a plane $M$, one first computes the acute angle $\theta$
between $\ell$ and a normal vector to $M$. The angle $\varphi$ between $\ell$ and $M$ is
defined by $\theta + \varphi = \pi/2$.
Example 5. Suppose that
\[ \ell: \begin{cases} x = 2 + t \\ y = 3 + t \\ z = 1 + 4t \end{cases}, \qquad M: 4x - 11y - 5z = 2. \]
Let $\alpha$ denote the angle between the direction vector $(1, 1, 4)$ of $\ell$ and
the normal vector $(4, -11, -5)$ of $M$. Then
\[ \cos\alpha = \frac{1 \cdot 4 + 1 \cdot (-11) + 4 \cdot (-5)}{\sqrt{1^2 + 1^2 + 4^2}\,\sqrt{4^2 + 11^2 + 5^2}} = -\frac{1}{2}. \]
This gives $\alpha = 2\pi/3$, which is obtuse. Hence the acute angle between
$\ell$ and the normal to $M$ is $\theta = \pi - \alpha = \pi/3$. The angle between $\ell$ and
$M$ is thus $\varphi = \pi/2 - \theta = \pi/6$.
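Both angle computations can be sketched with the same two helpers. For a line and a plane, taking the absolute value of the scalar product selects the acute angle with the normal, and the complement to $\pi/2$ is then obtained directly through the arcsine. The function names are illustrative and an ON-system is assumed.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def angle_between(u, v):
    """Angle in [0, pi] between two non-zero vectors (used here for plane normals)."""
    cosine = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cosine)))


def angle_line_plane(direction, normal):
    """Angle in [0, pi/2] between a line and a plane, given direction and normal vectors."""
    sine = abs(dot(direction, normal)) / (norm(direction) * norm(normal))
    return math.asin(min(1.0, sine))


theta_planes = angle_between((1, -2, -2), (1, 4, 1))   # planes of Example 4
phi = angle_line_plane((1, 1, 4), (4, -11, -5))        # line and plane of Example 5
print(theta_planes, phi)
```

Replacing one normal by its negative turns the obtuse angle $3\pi/4$ into the acute angle $\pi/4$.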

Exercises.
5. Compute the distance between the point $(1, 2, 3)$ and the line
\[ \begin{cases} x = 1 - t \\ y = -4 + 2t \\ z = 3 - t \end{cases}. \]
6. The line $\ell$ is the intersection between the planes $x + 2y - 2z = 5$
and $2x - y + z = 0$. Determine the point on $\ell$ which is closest
to the origin.
7. The line $\ell$ passes through the point $(1, 2, 3)$ and is perpendicular to the plane $2x - 3y + z = 3$. Find the distance between
$\ell$ and the point $(4, 5, 6)$.
8. Determine, in the form $Ax + By + Cz = D$, the equation of the
plane which consists of all points which have equal distance
to the points $(1, 2, 0)$ and $(-1, 0, 2)$.
9. Find the distance from the plane $3x - 4y + 12z = 13$ to the
points $(0, 0, 0)$ and $(2, 1, 3)$. Are these points on the same or
on opposite sides of the plane?
10. a) Determine, in the form $Ax + By + Cz = D$, an equation for
the plane $M$ which passes through the points $(-2, -3, 0)$ and
$(-2, -2, 2)$, and is parallel to the line $x = 2 - t$, $y = 1 + t$, $z = 2 - t$.
b) Find the distance between the point $(-3, -1, 0)$ and $M$.
11. Find the point in the plane through the points (1, 3, 1), (1, 1, 0),
(1, 3, 2) which is closest to the point (2, 2, 1).
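12. a) Prove that the lines
\[ \ell_1: \begin{cases} x = 1 + t \\ y = 2 - t \\ z = 3 + 2t \end{cases} \quad \text{and} \quad \ell_2: \begin{cases} x = 3 + t \\ y = 2 + t \\ z = 2 - 3t \end{cases} \]
intersect at a point.
b) Find the distance between the point $(3, 4, 5)$ and the
plane spanned by $\ell_1$ and $\ell_2$.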


13. A ray of light is emitted from the point (3, 2, 1) and reflected
off the plane $x - 2y - 2z = 0$. The reflected ray passes the point
(4, 1, 6). At which point does the ray hit the plane?
14. Determine the distance between the lines
a) (x, y, z) = t(3, 3, 1) and (x, y, z) = (1, 0, 0) + t(1, 1, 1).
b) (x, y, z) = (1, 2, 3) + t(0, 1, 1) and (x, y, z) = (1, 1, 1) +
t(2, 3, 1).
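15. Consider the lines
\[ \ell_1: \begin{cases} x = 12 - t \\ y = 4 - 2t \\ z = 1 + t \end{cases} \quad \text{and} \quad \ell_2: \begin{cases} x = 8 - 3t \\ y = 2 - t \\ z = 3 + t \end{cases}. \]
Determine, in the form $Ax + By + Cz = D$, an equation for the
plane which is parallel to $\ell_1$ and $\ell_2$ and has the same distance
to the two lines.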
16. a) Determine, in the form Ax+By+Cz = D, an equation for the
plane M which passes through the points (2, 1, 3), (1, 2, 2),
and (1, 0, 2).
b) Determine the angle between $M$ and the plane $2x + y -
z = 1$.
17. Determine the angle between the plane $x + 2y - z = 0$ and the
line $(x, y, z) = (3, 5, 1) + t(1, 1, 0)$.
18. A tetrahedron has corners A = (1, 2, 0), B = (1, 3, 1), C(1, 1, 0),
and D(1, 3, 2). Determine the angle between the plane containing the side BCD and the line containing the edge AB.

4. Answers to Exercises
3. The side-lengths are $\sqrt{3}$, $3$, and $\sqrt{2}$. The cosines of the angles are
$-\sqrt{2/3}$, $5/(3\sqrt{3})$, and $2\sqrt{2}/3$.
4. $\pi/3$ or $2\pi/3$.
5. $\sqrt{12}$.
6. $(1, 1, -1)$.
7. $3\sqrt{3}$.
8. $x + y - z = 0$.
9. 1 and 25/13 respectively. The points are on opposite sides of
the plane.
10. a) $-3x - 2y + z = 12$. b) $1/\sqrt{14}$.
11. (1, 1, 1).
12. a) Intersection-point: $(2, 1, 5)$. b) $16/\sqrt{30}$.
13. (2, 1, 0).
14. a) $1/\sqrt{14}$. b) $1/\sqrt{3}$.
15. $x + 2y + 5z = 1$.
16. a) $x + 2y + z = 3$. b) $\pi/3$.
17. $\pi/3$.
18. $\pi/6$.

CHAPTER 4

Second degree curves


1. Ellipse, Hyperbola, Parabola
Circle. Let $F$ be a point in a plane $M$ and consider the set of all
points $P$ in $M$ at a certain distance $a$ from $F$. If $F$ and $P$ have coordinates
$(x_0, y_0)$ and $(x, y)$ respectively, where coordinates are represented in
some ON-system for $M$, then the equation of the circle can be written
\[ (x - x_0)^2 + (y - y_0)^2 = a^2. \]
Of course, the point $F$ is called the center and $a$ is the radius of the circle.
If $F$ is the origin, the equation reduces to
\[ x^2 + y^2 = a^2. \]
We shall now discuss the other basic types of second degree curves:
ellipse, hyperbola, and parabola.
Ellipse. The definition of an ellipse generalizes the definition of
a circle. Let $F_1$ and $F_2$ be two points in a plane $M$ and let $a$ be a positive
constant; we assume that $2a$ is greater than the distance between $F_1$
and $F_2$. The set of points $P$ in $M$ with the property that the sum of the
distances from $P$ to $F_1$ and from $P$ to $F_2$ equals $2a$ is called an ellipse.
If we choose an ON-system in $M$ such that the origin is the midpoint of the segment $F_1F_2$, and the $x$-axis passes through the points
$F_1$ and $F_2$, then $F_1$ and $F_2$ have coordinates $(-c, 0)$ and $(c, 0)$ for some
real $c$. We can assume that $c > 0$. That the sum of distances from
$P = (x, y)$ to $F_1$ and $F_2$ equals $2a$ means that
\[ \sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a. \tag{1} \]
Squaring the equation (1) leads to
\[ (x + c)^2 + (x - c)^2 + 2y^2 + 2\sqrt{\left((x + c)^2 + y^2\right)\left((x - c)^2 + y^2\right)} = 4a^2. \]
Dividing by 2 and rearranging, this becomes
\[ x^2 + y^2 + c^2 - 2a^2 = -\sqrt{\left(x^2 + y^2 + c^2 + 2cx\right)\left(x^2 + y^2 + c^2 - 2cx\right)}. \]
Squaring again, we obtain that
\[ \left(x^2 + y^2 + c^2\right)^2 - 4a^2\left(x^2 + y^2 + c^2\right) + 4a^4 = \left(x^2 + y^2 + c^2\right)^2 - 4c^2x^2, \]
i.e.,
\[ \left(a^2 - c^2\right)x^2 + a^2y^2 = a^2\left(a^2 - c^2\right). \]
If we put $b^2 = a^2 - c^2$ and divide by $a^2b^2$, this becomes
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \tag{2} \]
We have shown that each point $(x, y)$ satisfying the root-equation (1)
also satisfies (2). By tracing back in the calculations, one can also
verify that all solutions to (2) satisfy (1). (Since we have squared
several times, this is not immediate!) We have shown that the ellipse
is completely determined by the equation (2).
Notice that the ellipse (2) intersects the coordinate axes at the
points $(\pm a, 0)$ and $(0, \pm b)$. The segments from the origin to these
points are called the semi-axes of the ellipse. The points $F_1 = (-c, 0)$
and $F_2 = (c, 0)$ are the foci of the ellipse.
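The claim that solutions of (2) also satisfy the root-equation (1) can be spot-checked numerically. The sketch below picks the illustrative values $a = 5$, $c = 4$ (so $b = 3$), samples points $(a\cos\tau, b\sin\tau)$, which satisfy (2), and evaluates the sum of the distances to the foci. This is a numerical check with floating-point numbers, not a proof.

```python
import math


def focal_distance_sum(x, y, c):
    """Sum of the distances from (x, y) to the foci (-c, 0) and (c, 0)."""
    return math.hypot(x + c, y) + math.hypot(x - c, y)


a, c = 5.0, 4.0                   # illustrative values with a greater than c
b = math.sqrt(a * a - c * c)      # b^2 = a^2 - c^2, here b = 3

# Twelve sample points on the ellipse x^2/a^2 + y^2/b^2 = 1.
samples = [(a * math.cos(k * math.pi / 6), b * math.sin(k * math.pi / 6)) for k in range(12)]
sums = [focal_distance_sum(x, y, c) for x, y in samples]
print(sums)   # every entry should be close to 2a
```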

Hyperbola. If we instead consider the set of points $P$ in a plane $M$
such that the difference between the distances to two given points
("foci") $F_1$ and $F_2$ is constant $= 2a$, we get a curve known as a hyperbola.
In a similar way to the case of the ellipse, we can introduce an
ON-system in $M$ such that $F_1 = (-c, 0)$ and $F_2 = (c, 0)$ for a number
$c > a$. Hence the equation of the hyperbola becomes
\[ \sqrt{(x + c)^2 + y^2} - \sqrt{(x - c)^2 + y^2} = \pm 2a. \tag{3} \]
Here the plus-sign shall be chosen if $P = (x, y)$ is closer to $F_2$, and the
minus-sign if $P$ is closer to $F_1$. The hyperbola is not connected; it has
two branches.
Calculations analogous to the case for the ellipse show that the
equation of the hyperbola can be written
\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1, \tag{4} \]
where this time $b^2 = c^2 - a^2$.


Parabola. Let $\ell$ be a line in a plane $M$, and $F$ a point in $M$ which
is not on $\ell$. The set of points $P$ in $M$ whose distance to $\ell$ equals the
distance to $F$ is called a parabola. The point $F$ is the focus and the line
$\ell$ is called the directrix of the parabola.
Choose an ON-system in $M$ such that the $y$-axis is parallel to $\ell$,
the $x$-axis passes through $F$, and the origin has equal distance $a$ to
$F$ and to $\ell$, so that $F = (a, 0)$ and $\ell$ is the line $x = -a$. We shall prove that the equation of the parabola in this
ON-system becomes
\[ y^2 = 4ax. \tag{5} \]
To show this, note that the distance from a point $P = (x, y)$ to $\ell$ is $|x + a|$,
while the distance to $F$ is $\sqrt{(x - a)^2 + y^2}$. Equating the squares of these distances
leads to (5).
Remark 11. The parabola has an interesting optical property with
many practical applications. Each light-ray parallel to the positive
x-axis will, after reflection in the parabola, pass through the same
point F. A set of parallel light-rays are thus focussed to the point F.
Conversely, if we place a source of light at F, this will after reflection
give rise to light-rays which are parallel to the x-axis.
In the case of the ellipse, one has instead that if a light source is
placed at $F_1$, then all light-rays will, after reflection in the ellipse, pass through $F_2$.
Remark 12. The ellipse, the hyperbola, and the parabola are all
cases of so-called conic sections. This name stems from the fact that
all such curves can be obtained as the intersection of a double cone
with a suitable plane.
Exercises.
1. Determine the centers of circles which are tangent to the x-axis
and which pass through the points (0, 1) and (0, 9).
2. Determine the foci of the ellipses
a) $9x^2 + 25y^2 = 225$.
b) $25x^2 + 169y^2 = 4225$.
3. Determine the equation of the ellipse which intersects the
$y$-axis at the points $(0, \pm 2)$ and has foci at the points $(\pm 2, 0)$.
4. Let $(x_0, y_0)$ be a point on the ellipse $x^2/a^2 + y^2/b^2 = 1$.
a) Show that the line $x = x_0 + \alpha t$, $y = y_0 + \beta t$ is tangent to
the ellipse if and only if $\alpha x_0/a^2 + \beta y_0/b^2 = 0$.
b) Show that the point $(x, y)$ is on the tangent of the ellipse
at $(x_0, y_0)$ if and only if $xx_0/a^2 + yy_0/b^2 = 1$.
5. Find the foci of the hyperbolae
a) $16x^2 - 9y^2 = 144$.
b) $3x^2 - 5y^2 = 75$.
6. Find the equation of the hyperbola which intersects the $x$-axis
at the points $(\pm 2, 0)$ and has foci at $(\pm 3, 0)$.
7. Find the equation of the parabola which is symmetric with
respect to the x-axis and which passes through the points
(0, 0) and (27, 18). Also determine the focus.


2. General Second-Degree Equations


A second-degree equation in the variables $x$ and $y$ is an equation
of the form
\[ Ax^2 + Bxy + Cy^2 + Dx + Ey = F. \tag{1} \]
Now suppose that $x$ and $y$ are coordinates with respect to an ON-system $Oe_1e_2$ in the plane. We shall investigate the geometric meaning of the equation (1). In the preceding section, we saw that ellipses,
hyperbolas, and parabolas are all described by second-degree equations. We shall here show that, except for certain "pathological"
cases, these three basic types of curves can be used to describe all
second-degree curves.
If $A = B = C = 0$, then (1) is a first-degree equation
\[ Dx + Ey = F. \]
This is (unless $D = E = 0$) the equation for a line. In the sequel,
we can hence assume that at least one of the coefficients $A$, $B$, $C$ is
non-zero.
The main idea for our analysis of (1) involves changing coordinates to a new ON-system, where the equation has a simpler form.
We start by showing that, by a suitable rotation of the basis vectors,
we can get rid of the coefficient $B$ for $xy$. Thus we introduce new
basis vectors $e_1'$, $e_2'$ by
\[ e_1' = \cos\theta\, e_1 + \sin\theta\, e_2, \qquad e_2' = -\sin\theta\, e_1 + \cos\theta\, e_2. \]
It is easy to see that $e_1'$, $e_2'$ is an orthonormal basis (since $e_1$, $e_2$ is so).
Let $(x', y')$ be the coordinates for a point $P$ relative to the system
$Oe_1'e_2'$. Then
\[ \overrightarrow{OP} = (x'\cos\theta - y'\sin\theta)\, e_1 + (x'\sin\theta + y'\cos\theta)\, e_2. \]
If $(x, y)$ are the coordinates of $P$ in the "old" system $Oe_1e_2$, we hence
have
\[ x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta. \]
Substituting these expressions into (1), we get an equation of the form
\[ A'(x')^2 + B'x'y' + C'(y')^2 + D'x' + E'y' = F, \]
where the coefficient $B'$ for $x'y'$ is given by
\[ B' = -2A\cos\theta\sin\theta + B(\cos^2\theta - \sin^2\theta) + 2C\cos\theta\sin\theta = B\cos 2\theta - (A - C)\sin 2\theta, \tag{2} \]
where we have used the familiar "double angle" formulas for cos and
sin.
If $B = 0$ no rotation is necessary, and we can take $\theta = 0$. If $B \neq 0$
we choose $\theta$ such that
\[ \cot 2\theta = \frac{A - C}{B}. \]
The relation (2) then shows that $B' = 0$. In all cases, our new coordinate system turns our equation into the type
\[ A'(x')^2 + C'(y')^2 + D'x' + E'y' = F, \tag{3} \]
for suitable constants $A'$, $C'$, $D'$, $E'$. Since we have assumed that at
least one of the numbers $A$, $B$, $C$ is not zero, it is easy to see that at
least one of $A'$, $C'$ is not zero.
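The rotation step can be tried out numerically. The sketch below chooses $\theta$ with $\cot 2\theta = (A - C)/B$ by means of `math.atan2` (one of several admissible choices of angle) and evaluates the rotated coefficients $A'$, $B'$, $C'$ obtained by substituting $x = x'\cos\theta - y'\sin\theta$, $y = x'\sin\theta + y'\cos\theta$. The function name is hypothetical, and the sample curve $17x^2 - 16xy + 17y^2 = 225$ is used only as an illustration.

```python
import math


def rotate_quadratic(A, B, C):
    """Return (theta, A', B', C') for a rotation angle that removes the xy-term."""
    # tan(2*theta) = B / (A - C), i.e. cot(2*theta) = (A - C) / B when B is non-zero.
    theta = 0.0 if B == 0 else 0.5 * math.atan2(B, A - C)
    c, s = math.cos(theta), math.sin(theta)
    A_new = A * c * c + B * c * s + C * s * s
    B_new = B * (c * c - s * s) - 2 * (A - C) * c * s   # equals B cos 2θ - (A - C) sin 2θ
    C_new = A * s * s - B * c * s + C * c * c
    return theta, A_new, B_new, C_new


theta, A1, B1, C1 = rotate_quadratic(17, -16, 17)
# With B' = 0 the curve reads A' x'^2 + C' y'^2 = 225, so the semi-axes are sqrt(225 / A'), sqrt(225 / C').
semi_axes = sorted([math.sqrt(225 / A1), math.sqrt(225 / C1)])
print(theta, A1, B1, C1, semi_axes)
```

For this curve the rotated equation is, up to rounding, $25(x')^2 + 9(y')^2 = 225$, an ellipse with semi-axes 3 and 5. Which of $A'$, $C'$ receives which value depends on the chosen branch of the angle.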
Case 1. We first consider the case when both $A'$ and $C'$ are non-zero.
We can then complete squares in (3) to obtain
\[ A'\left(x' + \frac{D'}{2A'}\right)^2 + C'\left(y' + \frac{E'}{2C'}\right)^2 = \frac{(D')^2}{4A'} + \frac{(E')^2}{4C'} + F. \tag{4} \]
We then make a new change of coordinates by
\[ x'' = x' + \frac{D'}{2A'}, \qquad y'' = y' + \frac{E'}{2C'}. \]
This means that the origin in the old system is moved to the point
which has $x'y'$-coordinates $(-D'/2A', -E'/2C')$. If
\[ F' = \frac{(D')^2}{4A'} + \frac{(E')^2}{4C'} + F, \]
then (4) becomes
\[ A'(x'')^2 + C'(y'')^2 = F'. \tag{5} \]
If $A'$ and $C'$ are both positive, then (5) describes an ellipse, a point,
or the empty set, depending on whether $F' > 0$, $F' = 0$, or $F' < 0$. The
case when $A'$ and $C'$ are both negative can be reduced to the positive
case by multiplying both sides of the equation by $-1$. If $A'$ and
$C'$ have opposite signs, we can after multiplication by a suitable
constant assume that $A' > 0$ and $C' = -1$. The equation (5) is then
\[ A'(x'')^2 - (y'')^2 = F'. \]
If $F' \neq 0$ this means a hyperbola. If $F' = 0$ we get
\[ y'' = \pm\sqrt{A'}\, x'', \]
which means a "degenerate" hyperbola, or rather: two intersecting
straight lines.
Case 2. Now suppose that one of the numbers $A'$, $C'$ in (3) is
zero. We can w.l.o.g. assume that $A' = 0$ and $C' \neq 0$. Then (3) says
that
\[ C'(y')^2 + D'x' + E'y' = F. \tag{6} \]
If $D' = 0$, then the equation (6) becomes independent of $x'$. The
equation then means two lines parallel to the $x'$-axis, or one line
parallel to the $x'$-axis, or the empty set, depending on the number of
different real solutions to the second-degree equation $C'(y')^2 + E'y' =
F$. If $D' \neq 0$, the equation (6) can be written
\[ C'\left(y' + \frac{E'}{2C'}\right)^2 + D'\left(x' - \frac{F}{D'} - \frac{(E')^2}{4C'D'}\right) = 0. \]
If we now put
\[ x'' = x' - \frac{F}{D'} - \frac{(E')^2}{4C'D'}, \qquad y'' = y' + \frac{E'}{2C'}, \]
we see that (6) transforms into the equation
\[ C'(y'')^2 + D'x'' = 0. \tag{7} \]
This last equation means a parabola: if $C'$ and $D'$ have equal signs, it
surrounds the negative $x''$-axis, otherwise it surrounds the positive
$x''$-axis.
The above discussion characterizes all possible second-degree
curves. Except for ellipse, hyperbola, and parabola, there are the
following "pathological" cases: two intersecting straight lines, one
or two parallel straight lines, a point, the whole plane (when $A = B =
C = D = E = F = 0$), and the empty set.
Exercises.
8. Prove that each of the following equations describes an ellipse.
Also determine the lengths of the semi-axes.
a) $17x^2 - 16xy + 17y^2 = 225$.
b) $3x^2 + 2xy + 3y^2 = 8$.
c) $9x^2 + y^2 - 18x + 4y + 4 = 0$.
d) $2x^2 + 3y^2 + 12x + 12 = 0$.


3. Answers to exercises
1. $(\pm 3, 5)$.
2. a) $(\pm 4, 0)$. b) $(\pm 12, 0)$.
3. $x^2 + 2y^2 = 8$.
5. a) $(\pm 5, 0)$. b) $(\pm\sqrt{40}, 0)$.
6. $5x^2 - 4y^2 = 20$.
7. $y^2 = 12x$; focus $(3, 0)$.
8. a) 5 and 3. b) 2 and $\sqrt{2}$. c) 3 and 1. d) $\sqrt{3}$ and $\sqrt{2}$.


CHAPTER 5

Cross-product and Volume-product


1. Orientation of vectors
Let $\Pi$ be a plane in space and u, v two non-parallel vectors in $\Pi$.
Consider the smallest rotation which turns u into a vector with the
same direction as v. If we view the rotation from one side of the
plane, the rotation appears clockwise, but from the other side it will
be perceived to be counterclockwise. If w is a vector not in $\Pi$, and
if the rotation is counterclockwise when seen from the side of $\Pi$ in the
direction of w, then the triple u, v, w is said to be positively oriented.
If this is not the case, we say that the triple is negatively oriented. A
positively (resp. negatively) oriented system is sometimes called a
right handed (resp. left handed) system.
Observe that the ordering of the vectors is essential here. For
example, if u, v, w is positively oriented, then u, w, v is negatively
oriented. Namely: place all vectors with their tails at the same point.
Then, seen from the tip of v, the smallest rotation which turns u into
a vector with the same direction as w, will be clockwise. The reader
is asked to supply a picture of the situation.
2. Cross-product
Let u and v be two vectors in space and denote by A(u, v) the area
of the parallelogram spanned by u and v. If u and v are parallel, then
of course A(u, v) = 0; otherwise, a simple geometric consideration
shows that
$$A(u, v) = \|u\|\,\|v\| \sin\theta, \tag{8}$$
where $\theta$ is the angle between u and v (in the interval $[0, \pi]$).
Definition 13. The cross-product (or vector-product) $u \times v$ of u and
v is the unique vector w with the following properties:
(i) w is orthogonal to u and to v,
(ii) $\|w\| = A(u, v)$,
(iii) if u and v are not parallel, then u, v, w is a positively oriented
triple.

Remark 14. Note that if the vectors u and v are parallel, then
A(u, v) = 0 and $u \times v = 0$.
Example 1. Let $e_1, e_2, e_3$ be a positively oriented orthonormal basis.
Then
$$e_1 \times e_2 = -e_2 \times e_1 = e_3,$$
$$e_3 \times e_1 = -e_1 \times e_3 = e_2,$$
$$e_2 \times e_3 = -e_3 \times e_2 = e_1.$$
These formulas are obvious geometrically.
The cross-product can (like the scalar product) be described using
an orthogonal projection. Suppose that $v \neq 0$ and that M is a plane
with normal vector v. Denote by $u'$ the orthogonal projection of u on
M. Then $A(u, v) = A(u', v)$, because $\|u'\|$ is the height from v of the
parallelogram spanned by u and v. Let $w = u \times v$. So w is the vector
orthogonal to u and v, such that $\|w\| = \|u'\|\,\|v\|$ and that u, v, w is a
positively oriented triple. It is now seen that
$$u \times v = \|v\|\, T(u'), \tag{9}$$
where $T(u')$ is the vector $u'$ rotated by the angle $\pi/2$ clockwise in M,
seen from the tip of v. For the orthogonal projection, we know
that $(u_1 + u_2)' = u_1' + u_2'$. If the resulting vectors are rotated by $\pi/2$
clockwise, it follows that $T((u_1 + u_2)') = T(u_1') + T(u_2')$. It now follows
from (9) that
$$(u_1 + u_2) \times v = u_1 \times v + u_2 \times v. \tag{10}$$
We have proved the distributive law for the cross-product. It is also
clear from the definition of the cross-product that it obeys linearity in
the first argument,
$$(tu) \times v = t(u \times v), \tag{11}$$
as well as a new type of rule:
$$u \times v = -v \times u. \tag{12}$$
The rule (12) is known as the anti-commutativity of the cross-product.
A consequence of (12) is that the counterparts of (10) and (11) also
hold for the second argument:
$$u \times (v_1 + v_2) = u \times v_1 + u \times v_2, \qquad u \times (tv) = t(u \times v).$$

If $e_1, e_2, e_3$ is a basis for three-dimensional space and
$$u = x_1 e_1 + x_2 e_2 + x_3 e_3, \qquad v = y_1 e_1 + y_2 e_2 + y_3 e_3, \tag{13}$$
we will thus have
$$u \times v = x_1 y_1\, e_1 \times e_1 + x_1 y_2\, e_1 \times e_2 + \ldots + x_3 y_3\, e_3 \times e_3.$$
Since $e_j \times e_j = 0$ and $e_j \times e_k = -e_k \times e_j$, this simplifies to
$$(x_1 y_2 - x_2 y_1)\, e_1 \times e_2 + (x_1 y_3 - x_3 y_1)\, e_1 \times e_3 + (x_2 y_3 - x_3 y_2)\, e_2 \times e_3.$$
In the important case when the basis $e_1, e_2, e_3$ is orthonormal and
positively oriented, we get (using Example 1) the following result.
Theorem 15. Suppose that $e_1, e_2, e_3$ is a positively oriented orthonormal
basis. Then
$$u \times v = (x_2 y_3 - x_3 y_2) e_1 - (x_1 y_3 - x_3 y_1) e_2 + (x_1 y_2 - x_2 y_1) e_3. \tag{14}$$
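The coordinate formula of Theorem 15 translates directly into code. The following is a minimal Python sketch, assuming that all coordinates refer to a positively oriented orthonormal basis; the function name `cross` is chosen here for illustration only.

```python
def cross(u, v):
    """Cross-product of u and v, computed with the formula of Theorem 15.

    Both arguments are coordinate triples with respect to a positively
    oriented orthonormal basis (the hypothesis of the theorem).
    """
    x1, x2, x3 = u
    y1, y2, y3 = v
    return (x2 * y3 - x3 * y2,
            -(x1 * y3 - x3 * y1),
            x1 * y2 - x2 * y1)


# The basis vectors themselves, written in coordinates.
e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Relations between the basis vectors (Example 1 of this section).
assert cross(e1, e2) == e3
assert cross(e3, e1) == e2
assert cross(e2, e3) == e1
```

The same function can be used to check anti-commutativity on concrete vectors: `cross(u, v)` and `cross(v, u)` always differ by a sign in every coordinate.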

Remark 16. There is a well-known mnemonic
trick to remember the above formula. This uses the concept of "determinants", a concept which we at this stage will use solely as an aid to
memory. By definition, a $2 \times 2$-determinant is given by
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$
A $3 \times 3$-determinant is then defined by
$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}.$$
This rule is called "expansion along the first row": one starts in the
upper left corner with $a_1$ and multiplies by the $2 \times 2$-determinant
obtained by striking the row and column containing $a_1$. Then we
proceed to $a_2$ and do the same, but with the opposite (i.e., minus)
sign in front of it. Then we move to $a_3$ (changing sign again). Using
determinants we can now write
$$u \times v = \begin{vmatrix} e_1 & e_2 & e_3 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}.$$
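The expansion along the first row can be written out as a short program. This is a sketch in Python for numerical entries only; the helper names `det2` and `det3` are introduced here for illustration and do not appear in the notes.

```python
def det2(a, b, c, d):
    """2 x 2 determinant with rows (a, b) and (c, d)."""
    return a * d - b * c


def det3(rows):
    """3 x 3 determinant by expansion along the first row."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    # Each 2 x 2 determinant is obtained by striking the first row and
    # the column of the current entry; the signs alternate +, -, +.
    return (a1 * det2(b2, b3, c2, c3)
            - a2 * det2(b1, b3, c1, c3)
            + a3 * det2(b1, b2, c1, c2))


assert det3([(1, 0, 0), (0, 1, 0), (0, 0, 1)]) == 1
assert det3([(2, 0, 0), (0, 3, 0), (0, 0, 4)]) == 24
```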
Remark 17. We saw above that the cross-product is non-commutative
(it is in fact anti-commutative). The cross-product does not obey the
associative law either. For example, if $e_1, e_2, e_3$ is a positively oriented orthonormal basis,
then by Example 1,
$$e_1 \times (e_1 \times e_2) = e_1 \times e_3 = -e_2 \quad\text{and}\quad (e_1 \times e_1) \times e_2 = 0 \times e_2 = 0.$$
Thus it can happen that $u \times (v \times w) \neq (u \times v) \times w$.


Example 2. Consider three points $P_0$, $P_1$, $P_2$, which in a positively oriented orthonormal system have coordinates $(2, 3, 2)$, $(4, 1, -1)$, and
$(2, 1, 1)$ respectively. We shall compute the area of the triangle
$P_0 P_1 P_2$. This area equals half the area of the parallelogram spanned
by the vectors $\overrightarrow{P_0P_1}$ and $\overrightarrow{P_0P_2}$. The area of that parallelogram is the
length of the cross-product
$$\overrightarrow{P_0P_1} \times \overrightarrow{P_0P_2} = (2, -2, -3) \times (0, -2, -1) = (-4, 2, -4).$$
Thus the triangle $P_0 P_1 P_2$ has area
$$\frac{1}{2}\sqrt{4^2 + 2^2 + 4^2} = 3.$$
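The method of Example 2 can be sketched in Python as follows. The helper names are illustrative. The vertex coordinates in the check are an assumption: they are one choice of signs, namely $(2, 3, 2)$, $(4, 1, -1)$, $(2, 1, 1)$, for which the edge vectors and the area 3 of the example come out consistently.

```python
import math


def cross(u, v):
    """Cross-product in a positively oriented orthonormal system."""
    x1, x2, x3 = u
    y1, y2, y3 = v
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)


def triangle_area(p0, p1, p2):
    """Half the length of the cross-product of the two edge vectors from p0."""
    a = tuple(q - p for p, q in zip(p0, p1))   # vector from p0 to p1
    b = tuple(q - p for p, q in zip(p0, p2))   # vector from p0 to p2
    n = cross(a, b)
    return 0.5 * math.sqrt(sum(c * c for c in n))


# Assumed sign choice for the points of Example 2 (see the text above).
area = triangle_area((2, 3, 2), (4, 1, -1), (2, 1, 1))
assert math.isclose(area, 3.0)
```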
Example 3. If $e_1, e_2$ is an orthonormal basis in a plane $\Pi$, we can
choose a unit normal vector $e_3$ to $\Pi$ such that $e_1, e_2, e_3$ becomes a
positively oriented orthonormal basis for space. Two vectors
$$u = x_1 e_1 + x_2 e_2, \qquad v = y_1 e_1 + y_2 e_2$$
in $\Pi$ will then have coordinates $(x_1, x_2, 0)$ resp. $(y_1, y_2, 0)$ in the basis
$e_1, e_2, e_3$. The area of the parallelogram spanned by u and v therefore
equals the length of the cross-product
$$(x_1, x_2, 0) \times (y_1, y_2, 0) = (0, 0, x_1 y_2 - x_2 y_1),$$
i.e. we have the formula
$$A(u, v) = |x_1 y_2 - x_2 y_1|.$$
Example 4. The cross-product can be used to calculate the distance
between lines. Assume a positively oriented orthonormal system,
and let $\ell_1$ be the line through the points $(1, 1, 1)$ and $(4, 5, 3)$ while $\ell_2$ is
the line passing through the points $(-1, -10, -1)$ and $(8, 2, 2)$. Thus $\ell_1$
has direction vector $(3, 4, 2)$ and $\ell_2$ has direction vector $(3, 4, 1)$. Since
$$(3, 4, 2) \times (3, 4, 1) = (-4, 3, 0)$$
we see that $e = \frac{1}{5}(-4, 3, 0)$ is a unit vector orthogonal to both $\ell_1$ and $\ell_2$.
If P is a point on $\ell_1$ and Q a point on $\ell_2$, we infer that the absolute value
of the scalar product $(\overrightarrow{PQ}\,|\,e)$ must equal the distance between $\ell_1$
and $\ell_2$. Choosing, for example, $P = (1, 1, 1)$ and $Q = (8, 2, 2)$, we find
that
$$(\overrightarrow{PQ}\,|\,e) = -5.$$
The distance between $\ell_1$ and $\ell_2$ is thus 5.
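The procedure of Example 4 can be sketched as a small Python function. All names are illustrative. The points used in the check are an assumption about the signs: $\ell_2$ is taken through $(-1, -10, -1)$ and $(8, 2, 2)$, which is the reading for which the direction vector $(3, 4, 1)$ and the distance 5 stated in the example are consistent.

```python
import math


def sub(p, q):
    """Vector from q to p."""
    return tuple(a - b for a, b in zip(p, q))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    x1, x2, x3 = u
    y1, y2, y3 = v
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)


def line_distance(p1, p2, q1, q2):
    """Distance between the line through p1, p2 and the line through q1, q2.

    Only non-parallel lines are handled: the common normal is the
    cross-product of the two direction vectors.
    """
    n = cross(sub(p2, p1), sub(q2, q1))
    length = math.sqrt(dot(n, n))
    if length == 0:
        raise ValueError("the lines are parallel")
    # |(PQ | e)| with e = n / ||n||, P = p1 and Q = q1.
    return abs(dot(sub(q1, p1), n)) / length


d = line_distance((1, 1, 1), (4, 5, 3), (-1, -10, -1), (8, 2, 2))
assert math.isclose(d, 5.0)
```

Any point of each line may be used for P and Q, since moving P or Q along its line changes $\overrightarrow{PQ}$ by a vector orthogonal to e.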


Exercises.
1. Prove that if $u + v + w = 0$, then
$$2\,u \times v = v \times w + w \times u.$$
2. Find the area of the triangle which in a positively oriented
ON-system has its vertices at the points
a) (1, 2, 3), (3, 4, 1), (2, 0, 2).
b) (5, 1, 1), (2, 3, 2), (3, 2, 3).
c) (1, 0, 0), (0, 1, 0), (0, 0, 1).
3. In a positively oriented ON-system, the points (1, 1, 1) and
(0, 3, 3) are on the line $\ell_1$, and (2, 2, 4) and (4, 4, 4) are on $\ell_2$.
Find the distance between $\ell_1$ and $\ell_2$.
4. Prove that
$$(u \times v) \times w = (u|w)\, v - (v|w)\, u.$$
Hint: To simplify the computations, one can choose an
ON-basis $e_1, e_2, e_3$ such that $u = x_1 e_1$ and $v = y_1 e_1 + y_2 e_2$.
5. Prove that
$$(u \times v) \times w = u \times (v \times w)$$
if and only if u is parallel to w, or u and w are both orthogonal
to v.
Hint: Use the preceding exercise.
3. Volume-product
The cross-product and the scalar product can be combined to form
the volume-product V(u, v, w) of three vectors in space:
$$V(u, v, w) = (u \times v\,|\,w). \tag{15}$$
The motivation for the name is that the absolute value of V(u, v, w)
equals the volume of the parallelepiped spanned by the vectors u, v, w,
if they are placed with their tails at one and the same point. To show this,
note that the vector
$$e = \frac{u \times v}{\|u \times v\|}$$
is a unit normal to the plane spanned by u and v. Denote by P the
parallelepiped spanned by u, v, w. From elementary geometry we
know that the volume of P equals the area of the "base parallelogram"
spanned by u, v, times the height h of P above the plane spanned by
u and v. Then h is the length of the orthogonal projection of w on the
normal of the plane, i.e., $h = |(e|w)|$. Since the base parallelogram has
area $A(u, v) = \|u \times v\|$, we infer that the volume of P equals
$$A(u, v)\,h = \|u \times v\|\,|(e|w)| = |(u \times v\,|\,w)|.$$
The asserted property of the volume product is proved.
When two vectors u and v are parallel we have $u \times v = 0$, so
V(u, v, w) = 0 in this case. More generally, the volume-product is
zero when the parallelepiped is degenerate, i.e., when the vectors u,
v, w are linearly dependent. On the other hand, if u, v, w are linearly
independent, the sign of $(u \times v\,|\,w)$ depends on whether or not the
two vectors $u \times v$ and w lie on the same side of the plane spanned by
u and v. The volume product V(u, v, w) is positive when the triple u,
v, w is positively oriented and negative otherwise. Since the volume
of the parallelepiped is the same regardless of how we choose to
permute the vectors u, v, w, only the sign can change under such a
permutation:
$$V(u, v, w) = V(w, u, v) = V(v, w, u) = -V(u, w, v) = -V(v, u, w) = -V(w, v, u).$$
Combining the computational rules for the cross-product and the scalar product, we find that
$$V(su_1 + tu_2, v, w) = sV(u_1, v, w) + tV(u_2, v, w) \tag{16}$$
for all real numbers s and t. Thus we have linearity in the first argument
for the volume product. We similarly have linearity in the second
and in the third argument. In short: the volume product is tri-linear,
i.e., linear in each of its three arguments.
Now let $e_1, e_2, e_3$ be a basis for space, and take three vectors
$u = (x_1, x_2, x_3)$, $v = (y_1, y_2, y_3)$, $w = (z_1, z_2, z_3)$, where coordinates are
given according to the chosen basis. Then by linearity in the different
arguments,
$$\begin{aligned} V(u, v, w) &= V(x_1 e_1 + x_2 e_2 + x_3 e_3, v, w) \\ &= x_1 V(e_1, v, w) + x_2 V(e_2, v, w) + x_3 V(e_3, v, w) \\ &= \ldots = \sum x_i y_j z_k V(e_i, e_j, e_k). \end{aligned}$$
Here the sum is over all possible choices of $i, j, k \in \{1, 2, 3\}$, but
since $V(e_i, e_j, e_k) = 0$ if two of the indices coincide, only six terms can be
non-zero. Furthermore
$$V(e_1, e_2, e_3) = V(e_3, e_1, e_2) = V(e_2, e_3, e_1) = -V(e_1, e_3, e_2) = -V(e_2, e_1, e_3) = -V(e_3, e_2, e_1).$$

This gives that V(u, v, w) is equal to
$$(x_1 y_2 z_3 + x_3 y_1 z_2 + x_2 y_3 z_1 - x_1 y_3 z_2 - x_2 y_1 z_3 - x_3 y_2 z_1)\,V(e_1, e_2, e_3).$$
The number in front of $V(e_1, e_2, e_3)$ can now be recognized as the
$3 \times 3$-determinant (see Remark 16 in the previous section)
$$\begin{vmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{vmatrix}.$$
We have arrived at the formula
$$V(u, v, w) = \begin{vmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{vmatrix} V(e_1, e_2, e_3). \tag{17}$$
This formula is particularly simple when the basis $e_1, e_2, e_3$ is orthonormal and positively oriented. Then $V(e_1, e_2, e_3) = 1$, so we simply have:
$$V(u, v, w) = \begin{vmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{vmatrix}. \tag{18}$$
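The agreement between the volume-product $(u \times v\,|\,w)$ and the determinant of the coordinates can be checked numerically. The following Python sketch assumes a positively oriented orthonormal basis; the function names and the sample vectors are chosen here for illustration and are not taken from the notes.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    x1, x2, x3 = u
    y1, y2, y3 = v
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)


def volume_product(u, v, w):
    """V(u, v, w) = (u x v | w)."""
    return dot(cross(u, v), w)


def det3(rows):
    """3 x 3 determinant by expansion along the first row."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))


# Illustrative sample vectors (hypothetical data).
u, v, w = (1, 2, 3), (0, 1, 4), (5, 6, 0)

assert volume_product(u, v, w) == det3([u, v, w])
# A cyclic permutation keeps the sign, a transposition changes it.
assert volume_product(w, u, v) == volume_product(u, v, w)
assert volume_product(v, u, w) == -volume_product(u, v, w)
```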
Example 1. Suppose that the vectors u, v, w have coordinates (2, 1, 3),
(0, 3, 2), and (3, 5, 1) respectively, with respect to a positively oriented
orthonormal basis. We get


2 1 3
V(u, v, w) = 0 3 2 = 47.


3 5 1
The volume of the parallelepiped spanned by u, v, w is thus 47.
Three vectors are linearly dependent if and only if they are coplanar, i.e., if the corresponding parallelepiped has volume zero.
According to (17) this is equivalent to the determinant of the
coordinates of the vectors being zero.
Example 2. To decide when the vectors $(-1, a, 2)$, $(1, 7, 1 + a)$, and
$(1, 1, -1)$ are linearly dependent, we form the determinant
$$\begin{vmatrix} -1 & a & 2 \\ 1 & 7 & 1 + a \\ 1 & 1 & -1 \end{vmatrix} = a^2 + 3a - 4 = (a - 1)(a + 4).$$
The vectors are thus linearly dependent if $a = 1$ or $a = -4$.


Exercises.
6. Motivate the identity
$$(u \times v\,|\,w) = (u\,|\,v \times w).$$
7. The volume of a tetrahedron spanned by three vectors u, v,
w, rooted at the same point, equals 1/6 of the volume of the
parallelepiped spanned by u, v, and w. Find the volume of
the tetrahedron with vertices at
a) (2, 1, 0), (3, 5, 2), (4, 1, 2), (6, 1, 5).
b) (2, 2, 3), (2, 1, 3), (1, 4, 2), (0, 5, 1).
8. A tetrahedron with volume 5 has three of its vertices at the
points (2, 1, 1), (3, 0, 1), (2, 1, 3). The fourth vertex is on the
positive y-axis. Determine its y-coordinate.
9. For which values of a and b are the three vectors
$$(a, b, b), \quad (b, a, b), \quad (b, b, a)$$
linearly dependent?
10. For which values of a are the four points
$$(0, 2, 1), \quad (a, 1, 0), \quad (3, 3, a), \quad (3, 3, 1 + a)$$
in the same plane?


4. Quaternions
We have seen that neither the commutative nor the associative
law holds for the cross-product. One can ask whether there is some
other way of defining multiplication between vectors, so that all the
usual computational laws are satisfied. For vectors in a plane, this is
possible, since plane vectors can be identified with complex numbers. In
the definition of multiplication of complex numbers, one starts with
the familiar identity $i^2 = -1$; if the usual laws of calculation are to
hold, the product of complex numbers must be
$$(x_1 + ix_2)(y_1 + iy_2) = (x_1 y_1 - x_2 y_2) + i(x_1 y_2 + x_2 y_1).$$
As we know, this definition does indeed satisfy all the usual computational rules.
Vectors in three-dimensional space can formally be written
$$x_1 + x_2 i + x_3 j$$
and if we still insist that $i^2 = -1$, so that the multiplication when
$x_3 = 0$ corresponds to multiplication of complex numbers, we just
need to establish the rules for multiplication with j. In particular, the
product ij must be a new vector
$$ij = a + bi + cj. \tag{19}$$
If this is multiplied from the left by i, using that $i^2 = -1$, we obtain
$$-j = -b + ia + ijc.$$
If we here substitute the right hand side of (19) for ij, we get after
simplification that
$$0 = (ac - b) + (bc + a)i + (c^2 + 1)j.$$
This does not make sense, for we cannot have $c^2 + 1 = 0$ for a real
number c.
The above argument shows that it is impossible to extend the multiplication of complex numbers to multiplication of triples of numbers in such a way that the usual laws of calculation are preserved.
Annoyed by this type of inconvenience, the Irish mathematician
Hamilton tried instead to define multiplication between 4-tuples
$$x = x_0 + x_1 i + x_2 j + x_3 k. \tag{20}$$
For reasons soon to be made clear, the coefficients are enumerated
from 0 to 3, rather than from 1 to 4. In 1843, Hamilton discovered
that if one abandons the commutative law xy = yx and defines
$$i^2 = j^2 = k^2 = -1, \qquad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j, \tag{21}$$
then multiplication of 4-tuples will satisfy all other rules of calculation. Hamilton coined the term quaternions for the set of 4-tuples
with this multiplication. He also showed how to define division by
a non-zero quaternion.
In 1878, the German mathematician Frobenius proved that if we
want to define multiplication of n-tuples such that division by non-zero elements is always possible, and if we are only prepared to
abandon the commutative law for multiplication, then n must be
either 1, 2, or 4. Apart from the real and complex fields, Hamilton's
quaternions are the only possibility.
Hamilton called the number $x_0$ the scalar part of the quaternion (20)
and $x_1 i + x_2 j + x_3 k$ the vector part. If one multiplies two quaternions
with scalar parts zero and uses the identities in (21), one finds after a
little calculation that
$$(x_1 i + x_2 j + x_3 k)(y_1 i + y_2 j + y_3 k) = -(x_1 y_1 + x_2 y_2 + x_3 y_3) + (x_2 y_3 - x_3 y_2) i + (x_3 y_1 - x_1 y_3) j + (x_1 y_2 - x_2 y_1) k.$$
The scalar part of the right hand side equals the negative of the scalar
product of the vectors in the left hand side, and the vector part of
the right hand side equals the cross-product of the vectors in the left
hand side. The scalar product is older, even though the name was
invented by Hamilton, but the cross-product was discovered in this
way, as a by-product of multiplication of quaternions. Hamilton also
interpreted the quaternions geometrically and defined the scalar- and
cross-products in a basis-independent way, as we have done in this
chapter.
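The multiplication rules for i, j, k can be expanded into an explicit product formula for 4-tuples. The following Python sketch represents a quaternion $x_0 + x_1 i + x_2 j + x_3 k$ as the tuple `(x0, x1, x2, x3)`; this representation and the name `qmul` are choices made here for illustration.

```python
def qmul(p, q):
    """Product pq of two quaternions given as 4-tuples (scalar part first).

    Obtained by expanding the product term by term and using
    i^2 = j^2 = k^2 = -1, ij = -ji = k, jk = -kj = i, ki = -ik = j.
    """
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

assert qmul(i, j) == k
assert qmul(j, i) == (0, 0, 0, -1)      # ji = -k: the product is not commutative
assert qmul(i, i) == (-1, 0, 0, 0)

# Two quaternions with scalar part zero: the scalar part of the product is
# minus the scalar product, the vector part is the cross-product.
assert qmul((0, 1, 2, 3), (0, 4, 5, 6)) == (-32, -3, 6, -3)
```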
Exercises.
11. Prove that the formula (21) implies that
$$(x_0 + x_1 i + x_2 j + x_3 k)(x_0 - x_1 i - x_2 j - x_3 k) = x_0^2 + x_1^2 + x_2^2 + x_3^2,$$
and that one can therefore define division by non-zero quaternions.
5. Answers to Exercises

2. a) $3\sqrt{2}$. b) $\sqrt{26}/2$. c) $\sqrt{3}/2$.
3. 3.
7. a) 4/3. b) 10.
8. y = 8.
9. a = b or a = -2b.
10. a = 1 and a = 9/4.

CHAPTER 6

Matrices
1. Basic properties
Definitions. By a $p \times n$-matrix we mean an array of numbers,
arranged in the form
$$A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \ldots & a_{pn} \end{pmatrix}$$
with p rows and n columns. The numbers $a_{jk}$ are called matrix elements. Notice that $a_{jk}$ is in the j:th row and k:th column. A briefer
notation for the same matrix A is
$$A = (a_{jk})_{j,k=1}^{p,n}.$$
In the special case when p = n we say that A is a square matrix of order
n.
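Operations with matrices. Let $A = (a_{jk})_{j,k=1}^{p,n}$ and $B = (b_{jk})_{j,k=1}^{p,n}$ be
two $p \times n$ matrices. We define A + B to be the $p \times n$ matrix with entries
$a_{jk} + b_{jk}$, i.e.,
$$A + B = (a_{jk} + b_{jk})_{j,k=1}^{p,n}.$$
Example 1.
$$\begin{pmatrix} 2 & 3 & 1 \\ 4 & 2 & 1 \end{pmatrix} + \begin{pmatrix} 3 & -2 & -1 \\ 2 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 5 & 1 & 0 \\ 6 & 3 & 3 \end{pmatrix}.$$
For a scalar t we define tA as the matrix with entries $ta_{jk}$.
Example 2.
$$2 \begin{pmatrix} 2 & 3 & 1 \\ 4 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 4 & 6 & 2 \\ 8 & 4 & 2 \end{pmatrix}.$$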

The definition of the product of two matrices is less obvious. In
order to find a reasonable definition, let us consider a linear relation
$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n = y_1 \\ a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n = y_2 \\ \qquad\vdots \\ a_{p1}x_1 + a_{p2}x_2 + \ldots + a_{pn}x_n = y_p \end{cases} \tag{1}$$
If the numbers $y_1, \ldots, y_p$ are given, then (1) is a linear system for the
unknowns $x_1, \ldots, x_n$. On the other hand, if $x_1, \ldots, x_n$ are given, then
(1) gives us the values of $y_1, \ldots, y_p$. That is, the quantities $y_1, \ldots, y_p$
can via (1) be regarded as functions of $x_1, \ldots, x_n$. In order to define
matrix multiplication, we shall adopt this latter point of view: we
regard (1) as a recipe for a function.
Now suppose that we have another set of variables $z_1, \ldots, z_q$
which depend on $y_1, \ldots, y_p$ in a similar way,
$$\begin{cases} b_{11}y_1 + b_{12}y_2 + \ldots + b_{1p}y_p = z_1 \\ b_{21}y_1 + b_{22}y_2 + \ldots + b_{2p}y_p = z_2 \\ \qquad\vdots \\ b_{q1}y_1 + b_{q2}y_2 + \ldots + b_{qp}y_p = z_q \end{cases} \tag{2}$$
If we here replace $y_1, \ldots, y_p$ by the corresponding left hand sides in
(1), we get a relation of the form
$$\begin{cases} c_{11}x_1 + c_{12}x_2 + \ldots + c_{1n}x_n = z_1 \\ c_{21}x_1 + c_{22}x_2 + \ldots + c_{2n}x_n = z_2 \\ \qquad\vdots \\ c_{q1}x_1 + c_{q2}x_2 + \ldots + c_{qn}x_n = z_q \end{cases} \tag{3}$$
where
$$c_{jk} = b_{j1}a_{1k} + b_{j2}a_{2k} + \ldots + b_{jp}a_{pk}. \tag{4}$$

Now define two matrices by
$$A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \ldots & a_{pn} \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} b_{11} & b_{12} & \ldots & b_{1p} \\ b_{21} & b_{22} & \ldots & b_{2p} \\ \vdots & \vdots & & \vdots \\ b_{q1} & b_{q2} & \ldots & b_{qp} \end{pmatrix}.$$
We call these matrices the coefficient matrices of the linear equation
systems (1) resp. (2). We define the product BA to be the matrix
$C = (c_{jk})_{j,k=1}^{q,n}$ where the $c_{jk}$ are given by (4). Thus the element in
position (j, k) in BA is obtained by pairwise multiplication of the
elements of row j in B with the elements in column k in A, followed
by summation.
Observe that the matrix product BA is defined only if the number
of columns of B equals the number of rows of A.
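Example 1.
$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 & 0 & 5 \\ 3 & 6 & 7 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 2 \cdot 3 & 1 \cdot 0 + 2 \cdot 6 & 1 \cdot 5 + 2 \cdot 7 \\ 3 \cdot 1 + 4 \cdot 3 & 3 \cdot 0 + 4 \cdot 6 & 3 \cdot 5 + 4 \cdot 7 \end{pmatrix} = \begin{pmatrix} 7 & 12 & 19 \\ 15 & 24 & 43 \end{pmatrix}.$$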
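The row-times-column rule for the product BA can be sketched in Python as follows, with matrices stored as lists of rows. The function name `matmul` and the argument order (first B, then A, matching the notation BA) are choices made here for illustration.

```python
def matmul(b, a):
    """Product BA of a q x p matrix b and a p x n matrix a (lists of rows)."""
    p = len(a)
    # The product is defined only if every row of b has p entries,
    # i.e. the number of columns of b equals the number of rows of a.
    if any(len(row) != p for row in b):
        raise ValueError("columns of b must equal rows of a")
    n = len(a[0])
    # Entry (j, k): multiply row j of b pairwise with column k of a and sum.
    return [[sum(b[j][m] * a[m][k] for m in range(p)) for k in range(n)]
            for j in range(len(b))]


B = [[1, 2], [3, 4]]
A = [[1, 0, 5], [3, 6, 7]]
assert matmul(B, A) == [[7, 12, 19], [15, 24, 43]]
```

Calling `matmul(A, B)` with these two matrices raises `ValueError`, since A has three columns while B has only two rows.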
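Example 2.
$$\begin{pmatrix} 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 10 \end{pmatrix}.$$
Example 3.
$$\begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \end{pmatrix} = \begin{pmatrix} 3 & 6 & 9 \\ 2 & 4 & 6 \\ 1 & 2 & 3 \end{pmatrix}.$$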
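Example 4.
$$\begin{pmatrix} 4 & 2 \\ -2 & -1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} = \begin{pmatrix} 8 & 16 \\ -4 & -8 \end{pmatrix},$$
$$\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ -2 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.$$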
The last example shows that the order of the factors is
essential for matrix multiplication. In other words, matrix multiplication is non-commutative: it is possible (and very common) to have
$AB \neq BA$.
In Example 1, B is a $2 \times 2$ matrix and A is a $2 \times 3$ matrix. The
product BA is therefore a $2 \times 3$ matrix. The product AB is not defined
in this case. In order that both AB and BA be defined, it is necessary
and sufficient that A be a $p \times n$ and B an $n \times p$ matrix (with the same n
and p). Then BA is an $n \times n$ matrix and AB is $p \times p$. This is illustrated
by examples 2 and 3.

Definition 18. The square $n \times n$-matrix
$$E = E_n = \begin{pmatrix} 1 & 0 & \ldots & 0 \\ 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 1 \end{pmatrix}$$
is called the identity matrix of order n.
Notice that E is the neutral element for matrix multiplication, i.e.
we have
$$EA = AE = A$$
for all $n \times n$ matrices A.
While matrix multiplication fails to be commutative, it obeys the
other rules of calculation.
Theorem 19. Matrix multiplication obeys the associative law
$$C(BA) = (CB)A \tag{5}$$
and the distributive laws
$$B(A + A') = BA + BA', \qquad (B + B')A = BA + B'A. \tag{6}$$
(We here assume that the dimensions of the matrices are such that the sums
and products make sense.)
Proof. The formulas can easily be verified by direct evaluation.
Nonetheless, we shall give an alternative argument for the associative
law (5).
Consider the relation (1) as a function $F : \mathbb{R}^n \to \mathbb{R}^p$, which to
each n-tuple $(x_1, \ldots, x_n) \in \mathbb{R}^n$ associates a p-tuple $(y_1, \ldots, y_p) \in \mathbb{R}^p$.
Likewise (2) can be regarded as a function G from $\mathbb{R}^p$ to $\mathbb{R}^q$, which
to $(y_1, \ldots, y_p)$ associates $(z_1, \ldots, z_q)$. The matrix product BA will then
correspond to the composite function $G \circ F$, and (5) follows from the
associative law for composition of functions:
$$H \circ (G \circ F) = (H \circ G) \circ F.$$
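The correspondence between the matrix product and composition of functions can be checked on concrete numbers. The following Python sketch uses hypothetical sample matrices; `apply` and `matmul` are illustrative helper names, with matrices stored as lists of rows.

```python
def apply(m, x):
    """The function x -> y defined by the coefficient matrix m."""
    return [sum(c * v for c, v in zip(row, x)) for row in m]


def matmul(b, a):
    """Product BA (row j of b times column k of a, followed by summation)."""
    return [[sum(b[j][m] * a[m][k] for m in range(len(a)))
             for k in range(len(a[0]))]
            for j in range(len(b))]


A = [[1, 0, 5], [3, 6, 7]]   # maps R^3 to R^2
B = [[1, 2], [3, 4]]         # maps R^2 to R^2
C = [[2, 1]]                 # maps R^2 to R^1
x = [1, 2, 3]

# Applying A and then B gives the same result as applying the product BA.
assert apply(B, apply(A, x)) == apply(matmul(B, A), x)

# The associative law C(BA) = (CB)A on these matrices.
assert matmul(C, matmul(B, A)) == matmul(matmul(C, B), A)
```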

We shall in this course only be concerned with matrices whose
entries are real numbers. Nonetheless, we want to mention that
matrices with complex entries can be handled in the same way, as in
the following example.
Example 5. Consider the three matrices
$$I = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad K = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix},$$
where i is the imaginary unit. If these are multiplied by $-i$, one
obtains the famous Pauli matrices; these were used by Paul Dirac in
1928 to formulate an equation for the electron.
It is easy to check that
$$I^2 = J^2 = K^2 = -E,$$
where E is the identity matrix of order 2, and that
$$IJ = -JI = K, \qquad JK = -KJ = I, \qquad KI = -IK = J.$$
If this is compared with the formulas for multiplication of quaternions in the preceding chapter, one realizes that Hamilton's quaternions $x_0 + x_1 i + x_2 j + x_3 k$ can be identified with the set of complex $2 \times 2$
matrices of the form
$$x_0 E + x_1 I + x_2 J + x_3 K = \begin{pmatrix} x_0 + ix_1 & x_2 + ix_3 \\ -x_2 + ix_3 & x_0 - ix_1 \end{pmatrix}.$$
The computational rules for quaternions can then be seen as special
cases of the rules for matrix multiplication.
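The relations between the matrices of Example 5 can be verified mechanically with Python's built-in complex numbers. In this sketch the entries are taken as $I = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$, $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $K = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$, which is the sign choice assumed here because it satisfies the quaternion relations; `matmul` and `neg` are illustrative helper names.

```python
def matmul(b, a):
    """Product of two matrices stored as lists of rows."""
    return [[sum(b[j][m] * a[m][k] for m in range(len(a)))
             for k in range(len(a[0]))]
            for j in range(len(b))]


def neg(m):
    """The matrix -m."""
    return [[-v for v in row] for row in m]


E = [[1, 0], [0, 1]]
I = [[1j, 0], [0, -1j]]     # 1j is Python's notation for the imaginary unit
J = [[0, 1], [-1, 0]]
K = [[0, 1j], [1j, 0]]

# Each of the three matrices squares to -E.
assert matmul(I, I) == neg(E)
assert matmul(J, J) == neg(E)
assert matmul(K, K) == neg(E)

# The products of distinct matrices behave like ij = -ji = k, and so on.
assert matmul(I, J) == K and matmul(J, I) == neg(K)
assert matmul(J, K) == I and matmul(K, J) == neg(I)
assert matmul(K, I) == J and matmul(I, K) == neg(J)
```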
Before we close this section, we define a new matrix operation
called transposition. If A is a $p \times n$ matrix, then the transpose $A^t$ of A is
defined as the $n \times p$ matrix whose rows are the columns of A:
$$\text{If } A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \ldots & a_{pn} \end{pmatrix} \text{ then } A^t = \begin{pmatrix} a_{11} & a_{21} & \ldots & a_{p1} \\ a_{12} & a_{22} & \ldots & a_{p2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \ldots & a_{pn} \end{pmatrix}.$$
Transposition satisfies the following computational rules (proofs are
left as exercises for the reader):
$$(A + B)^t = A^t + B^t, \qquad (AB)^t = B^t A^t.$$
Notice that the last rule says that transposition reverses the order of a
matrix product.
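The rule that transposition reverses the order of a product can be checked on an example. This is a small Python sketch with hypothetical sample matrices; `transpose` and `matmul` are illustrative helper names, with matrices stored as lists of rows.

```python
def transpose(a):
    """Transpose: the rows of the result are the columns of a."""
    return [list(col) for col in zip(*a)]


def matmul(b, a):
    """Product of two matrices stored as lists of rows."""
    return [[sum(b[j][m] * a[m][k] for m in range(len(a)))
             for k in range(len(a[0]))]
            for j in range(len(b))]


P = [[1, 2], [3, 4]]            # a 2 x 2 matrix
Q = [[1, 0, 5], [3, 6, 7]]      # a 2 x 3 matrix

# (PQ)^t = Q^t P^t: note the reversed order on the right hand side.
assert transpose(matmul(P, Q)) == matmul(transpose(Q), transpose(P))
```

The product `matmul(transpose(P), transpose(Q))` in the original order is not even defined here, since the transpose of P has two columns while the transpose of Q has three rows.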
Exercises.
1. Let

1 0 2

A = 0 3 1

2 2 1

0 1 1

B = 2 2 0

1 2 3

2 1

C = 1 1 .

1 2


Compute
a) AB b) BA c) At Bt d) (A + 3B)C e) CCt
2. Let
!
!
1 1
1 2
A=
and
B=
.
1 1
3 4

f) Ct C.

Determine: a) $A^2 - B^2$, b) $(A + B)(A - B)$.
Why are the answers different in a) and b)?
3. Let
$$A = \begin{pmatrix} 1 & 3 \\ 3 & 9 \end{pmatrix}.$$
Find all $2 \times 2$ matrices B such that
$$AB = BA = 0.$$
Here 0 denotes the $2 \times 2$ zero-matrix, i.e. the matrix all of
whose entries equal zero.
4. Denote by $A^k$ the product of a matrix A by itself k times.
a) Prove that if AB = BA, then we have the binomial expansion
$$(A + B)^k = A^k + kA^{k-1}B + \binom{k}{2}A^{k-2}B^2 + \ldots + B^k.$$
b) Compute $(I + A)^{10}$ where I is the identity matrix and
$$A = \begin{pmatrix} 0 & 5 & 3 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{pmatrix}.$$
5. Show that, in order to verify all statements in Example 5, it
suffices to prove that
$$I^2 = J^2 = K^2 = IJK = -E.$$
2. Matrix inverse
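Let
$$A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \ldots & a_{pn} \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_p \end{pmatrix}.$$
The linear equation system (1) can then be written in the matrix form:
$$Ax = y. \tag{7}$$
We shall here discuss (7) in the important case when n = p, i.e., when
A is a square matrix of order n.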

If n = 1, the system (7) reduces to a single equation
$$ax = y.$$
If $a \neq 0$ this equation can be solved by multiplication with $a^{-1} = 1/a$:
$$x = a^{-1}y.$$
There is a counterpart to this procedure also when n > 1.
Definition 20. Let A be an $n \times n$-matrix. We say that A is invertible
if there is an $n \times n$-matrix B such that
$$AB = E \quad\text{and}\quad BA = E.$$
In this case, B is called an inverse to A.
Remark 21. If A is invertible, then the inverse is unique. We can
thus speak of the inverse and write $B = A^{-1}$. To see this, assume that
there are two matrices B and C which are inverse to A. Then BA = E
and AC = E, so
$$B = BE = B(AC) = (BA)C = EC = C.$$
The uniqueness is proved.
Example 1. If
$$A = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}, \qquad B = \begin{pmatrix} -3 & 2 \\ 2 & -1 \end{pmatrix},$$
then by direct calculation, AB = BA = E. Thus A is invertible and
$B = A^{-1}$. For the same reason, B is invertible and $A = B^{-1}$.
Now suppose that A is an invertible matrix. The linear system
$$Ax = y$$
can then be multiplied by $A^{-1}$ from the left, giving
$$x = Ex = (A^{-1}A)x = A^{-1}(Ax) = A^{-1}y.$$
If the system has a solution x we must thus have $x = A^{-1}y$. That this
really is a solution follows from
$$A(A^{-1}y) = (AA^{-1})y = Ey = y.$$
We have proved one direction of the following theorem.
Theorem 22. A square matrix A is invertible if and only if the linear
equation system Ax = y has a unique solution x for all right hand sides y.
If this is the case, the solution is given by $x = A^{-1}y$.

Remark 23. In the proof, we shall use the following property of
matrix multiplication: If C is a square matrix with columns $C_1, C_2, \ldots, C_n$,
then the matrix AC has columns $AC_1, AC_2, \ldots, AC_n$. The (simple) verification of this fact is left as an exercise for the interested reader.
Proof of Theorem 22. It remains to prove that if the system Ax =
y has a unique solution for all possible right hand sides y, then A is
invertible.
Let C and D be two square matrices with columns $C_1, C_2, \ldots, C_n$
resp. $D_1, D_2, \ldots, D_n$. By Remark 23, the matrix identity
$$AC = D \tag{8}$$
is equivalent to the n vector identities
$$AC_k = D_k, \qquad k = 1, \ldots, n.$$
Hence if the system Ax = y has a unique solution for all y, then for every
$n \times n$ matrix D there is precisely one $n \times n$ matrix C satisfying (8). In particular there is a
unique $n \times n$ matrix B such that
$$AB = E. \tag{9}$$
In order to show that A is invertible, we shall show that also BA = E.
But by (9) we have
$$A(BA) = (AB)A = EA = A.$$
The matrix C = BA thus satisfies the equation
$$AC = A.$$
This last equation is also satisfied by C = E. Since (8) has precisely one
solution C for every right hand side D, we must then have BA = E. $\square$
The following example shows how one can calculate inverse matrices in practice.
Example 2. To determine whether the matrix
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$$
is invertible, we try to solve the system Ax = y for an arbitrary right
hand side y:
$$\begin{cases} x_1 + x_2 + x_3 = y_1 \\ x_1 + 2x_2 + 3x_3 = y_2 \\ x_1 + 3x_2 + 2x_3 = y_3 \end{cases} \iff \begin{cases} x_1 + x_2 + x_3 = y_1 \\ x_2 + 2x_3 = -y_1 + y_2 \\ x_2 - x_3 = -y_2 + y_3 \end{cases} \iff \ldots \iff \begin{cases} 3x_1 = 5y_1 - y_2 - y_3 \\ 3x_2 = -y_1 - y_2 + 2y_3 \\ 3x_3 = -y_1 + 2y_2 - y_3 \end{cases}$$
We see that the system has a unique solution x for each right hand
side y, so the matrix A is invertible. The last system also shows that
$$A^{-1} = \frac{1}{3} \begin{pmatrix} 5 & -1 & -1 \\ -1 & -1 & 2 \\ -1 & 2 & -1 \end{pmatrix}.$$
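Solving Ax = y for all right hand sides at once amounts to row-reducing the matrix A with the identity matrix placed next to it. The following Python sketch is one way to organize that computation, using exact fractions from the standard library; the function name `inverse` and the convention of returning `None` for a non-invertible matrix are choices made here, not taken from the notes.

```python
from fractions import Fraction


def inverse(a):
    """Inverse of a square matrix (list of rows) by Gauss-Jordan elimination.

    Returns None when the matrix is not invertible.
    """
    n = len(a)
    # Augmented matrix [A | E], with exact rational arithmetic.
    m = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate the column from all other rows.
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    # The right half of the reduced matrix is the inverse.
    return [row[n:] for row in m]


A = [[1, 1, 1], [1, 2, 3], [1, 3, 2]]
third = Fraction(1, 3)
assert inverse(A) == [[5 * third, -third, -third],
                      [-third, -third, 2 * third],
                      [-third, 2 * third, -third]]
```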
Computational rules for the inverse. If both of the matrices A
and B are invertible, then the product AB is also invertible, and
$$(AB)^{-1} = B^{-1}A^{-1}.$$
The order between factors is thus reversed after inversion. This is
realized by observing that if A and B are invertible then the matrix
$D = B^{-1}A^{-1}$ satisfies
$$D(AB) = B^{-1}(A^{-1}A)B = B^{-1}EB = B^{-1}B = E,$$
and similarly
$$(AB)D = E.$$
Thus AB is invertible with inverse D.
Finally, we leave it to the reader to check that if A is invertible,
then $A^t$ is invertible and
$$(A^t)^{-1} = (A^{-1})^t.$$


Exercises.
6. Determine which matrices are invertible. Also determine the
inverse

matrix whenit exists.


1 0 1
1 1 2
1 2 3

a) 0 1 2 b) 2 1 1 c) 0 1 1 .

1 1 0
1 1 4
0 0 1
7. Let

1 0 a

A = 0 1 1 .

1 1 0
Calculate A1 for those values of a for which A is invertible.
8. Find the inverse matrices of A and of A2 where

1 2 3

A = 2 3 1 .

1 1 1
9. Find a matrix X which solves the matrix equation AXB = C
where

!
!
1 2 3
1 2 1
1 1
0 1 2
A=
, B =
.
, C =
1 2
2 1 2

0 0 1
10. Let A and B be two $n \times n$-matrices such that $E - AB$ is invertible.
Prove that $E - BA$ is invertible and that
$$(E - BA)^{-1} = E + B(E - AB)^{-1}A.$$
3. Answers to Exercises

7
2 5 0
2
2 5

1. a) 7 4 3 b) 0 6 2 c) 5

0
3 4 1
7 12 1

4 14
5 1 4
6

d) 16 5 e) 1 2 1 f)
3

10 29!
4
1 !5
5
12
4
9
2. a)
b)
.
17 10
20 9
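3. $\begin{pmatrix} 9t & -3t \\ -3t & t \end{pmatrix}$, $t \in \mathbb{R}$.
4. b) $(E + A)^{10} = \begin{pmatrix} 1 & 50 & 705 \\ 0 & 1 & 30 \\ 0 & 0 & 1 \end{pmatrix}$.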

2 7
6 12

2 1
!
3
.
6


1 2 1
1 1 1

1 .
c) 12 1 1
6. a) 0 1 2 . b) Not invertible.

0 0
1
1
1 1

1 a a
1
1 a 1 for a , 1.
7. A1 = 1a

1 1 1

2 1 7
10 7 2

2 5 and (A2 )1 = 19 5 8 8 .
8. A1 = 13 1

1 1 1
2 4 13
!
0 3 6
9. X = A1 CB1 =
.
1 3 4
