
MA 106 Linear Algebra

B. V. Limaye

Murali K. Srinivasan

Jugal K. Verma

January 7, 2014
Contents

1 Matrices, Linear Equations and Determinants

1.1 Matrix Operations
1.2 Gauss Elimination

Chapter 1

Matrices, Linear Equations and Determinants

1.1 Matrix Operations

Convention 1.1.1. We shall write F to mean either the real numbers R or the complex numbers
C. Elements of F will be called scalars.

Let m, n be positive integers. An m × n matrix A over F is a collection of mn scalars $a_{ij} \in F$ arranged in a rectangular array of m rows and n columns:
\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}.
\]

The entry in row i and column j is aij . We also write A = (aij ) to denote the entries. When all
the entries are in R we say that A is a real matrix. Similarly, we define complex matrices. For example,
\[
\begin{bmatrix} 1 & -1 & 3/2 \\ 5/2 & 6 & 11.2 \end{bmatrix}
\]
is a 2 × 3 real matrix.
A 1 × n matrix $[a_1 \ a_2 \ \cdots \ a_n]$ is called a row vector and an m × 1 matrix
\[
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\]
is called a column vector. An n × n matrix is called a square matrix.


Matrix Addition


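Let M, N be m × n matrices. Then M + N is an m × n matrix whose (i, j) entry is the sum of the (i, j) entries of M and N. For example,
\[
\begin{bmatrix} 2 & 1 & 0 \\ 1 & 3 & 5 \end{bmatrix}
+ \begin{bmatrix} 1 & 0 & 3 \\ 4 & -3 & 1 \end{bmatrix}
= \begin{bmatrix} 3 & 1 & 3 \\ 5 & 0 & 6 \end{bmatrix}.
\]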
Note that addition is defined only when both matrices have the same size.
Scalar multiplication
Let α ∈ F and let M be an m × n matrix. Then αM is an m × n matrix whose (i, j) entry is α times the (i, j) entry of M. For example,
\[
2 \begin{bmatrix} 0 & 1 \\ 2 & 3 \\ 2 & 1 \end{bmatrix}
= \begin{bmatrix} 0 & 2 \\ 4 & 6 \\ 4 & 2 \end{bmatrix}.
\]

Matrix multiplication


First we define the product of a row vector $a = [a_1 \ \cdots \ a_n]$ and a column vector
\[
b = \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix},
\]
both with n components. Define ab to be the scalar $\sum_{i=1}^{n} a_i b_i$.

The product of two matrices A = (aij ) and B = (bij ), denoted AB, is defined only when the
number of columns of A is equal to the number of rows of B. So let A be a m × n matrix and let
B be a n × p matrix. Let the row vectors of A be A1 , A2 , . . . , Am and let the column vectors of B
be B1 , B2 , . . . , Bp . We write
 
\[
A = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix},
\qquad
B = [B_1 \ B_2 \ \cdots \ B_p].
\]

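Then M = AB is an m × p matrix whose (i, j) entry $m_{ij}$, for 1 ≤ i ≤ m and 1 ≤ j ≤ p, is given by
\[
m_{ij} = A_i B_j = \sum_{k=1}^{n} a_{ik} b_{kj}.
\]
For example,
\[
\begin{bmatrix} 1 & 3 & 1 \\ 2 & 4 & 2 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 1 & 1 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 5 & 4 \\ 8 & 6 \end{bmatrix}.
\]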
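The entrywise definition of the product can be checked numerically. The following is a minimal sketch in Python, assuming matrices are stored as lists of rows; the name `mat_mul` is hypothetical and chosen only for this illustration.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x p matrix B (lists of rows).

    Entry (i, j) is the row-times-column product sum_k A[i][k] * B[k][j].
    """
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


# The 2 x 3 times 3 x 2 example from the text.
A = [[1, 3, 1], [2, 4, 2]]
B = [[2, 0], [1, 1], [0, 1]]
product = mat_mul(A, B)
print(product)  # [[5, 4], [8, 6]]
```

The size check mirrors the requirement that the number of columns of A equal the number of rows of B.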
The usefulness and “meaning” of this definition will emerge as this course progresses. Meanwhile,
let us note other ways of thinking about matrix multiplication. First a definition. By a linear
combination of n × 1 column vectors v1 , v2 , . . . , vr we mean a column vector v of the form v =
α1 v1 + · · · + αr vr , where αi ∈ F for all i are called the coefficients. Similarly we define a linear
combination of row vectors.
Matrix times a column vector
 
Lemma 1.1.2. Let $B = [B_1 \ B_2 \ \cdots \ B_p]$ be an n × p matrix with columns $B_1, \ldots, B_p$. Let
\[
x = \begin{bmatrix} x_1 \\ \vdots \\ x_p \end{bmatrix}
\]
be a column vector with p components. Then

Bx = x1 B1 + x2 B2 + · · · + xp Bp .

Proof. Both sides are n × 1. By definition
\[
Bx = \begin{bmatrix}
\sum_{j=1}^{p} b_{1j} x_j \\
\sum_{j=1}^{p} b_{2j} x_j \\
\vdots \\
\sum_{j=1}^{p} b_{nj} x_j
\end{bmatrix}
= \sum_{j=1}^{p} \begin{bmatrix} b_{1j} x_j \\ b_{2j} x_j \\ \vdots \\ b_{nj} x_j \end{bmatrix}
= \sum_{j=1}^{p} x_j B_j,
\]
as desired. So, Bx can be thought of as a linear combination of the columns of B, with column j having coefficient $x_j$. This way of thinking about Bx is very important.

Example 1.1.3. Let e1 , e2 , . . . , ep denote the standard column vectors with p components, i.e., ei
denotes the p × 1 column vector with 1 in component i and all other components 0. Then Bei = Bi ,
column i of B.
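The two ways of computing Bx (row by row, and as a linear combination of columns) can be compared directly. This is a small sketch with hypothetical helper names `mat_vec` and `column_combination`; the matrix and vector are illustrative values, not taken from the notes.

```python
def mat_vec(B, x):
    """Compute Bx entry by entry: component i is sum_j B[i][j] * x[j]."""
    return [sum(B[i][j] * x[j] for j in range(len(x))) for i in range(len(B))]


def column_combination(B, x):
    """Compute x_1 B_1 + ... + x_p B_p, where B_j is column j of B."""
    n = len(B)
    result = [0] * n
    for j in range(len(x)):
        column_j = [B[i][j] for i in range(n)]
        result = [r + x[j] * c for r, c in zip(result, column_j)]
    return result


B = [[2, 0], [1, 1], [0, 1]]   # a 3 x 2 matrix
x = [3, -1]                    # a column vector with 2 components
print(mat_vec(B, x))             # [6, 2, -1]
print(column_combination(B, x))  # [6, 2, -1]
print(mat_vec(B, [0, 1]))        # [0, 1, 1], the second column of B
```

The last line illustrates Example 1.1.3: multiplying by a standard column vector picks out a column.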

Row vector times a matrix


Let A be an m × n matrix with rows $A_1, \ldots, A_m$. Let $y = [y_1 \ \cdots \ y_m]$ be a row vector with m components. Then (why?)
\[
yA = y_1 A_1 + y_2 A_2 + \cdots + y_m A_m.
\]
So, yA can be thought of as a linear combination of the rows of A, with row i having coefficient yi .
Columns and rows of product
Let A and B be as above. Then (why?)
\[
AB = [AB_1 \ AB_2 \ \cdots \ AB_p]
= \begin{bmatrix} A_1 B \\ A_2 B \\ \vdots \\ A_m B \end{bmatrix}.
\]

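So, the jth column of AB is a linear combination of the columns of A, the coefficients coming from the jth column $B_j$ of B. For example,
\[
\begin{bmatrix} 1 & 3 & 1 \\ 2 & 4 & 2 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 1 & 1 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 5 & 4 \\ 8 & 6 \end{bmatrix}.
\]
The second column of the product can be written as
\[
0 \begin{bmatrix} 1 \\ 2 \end{bmatrix}
+ 1 \begin{bmatrix} 3 \\ 4 \end{bmatrix}
+ 1 \begin{bmatrix} 1 \\ 2 \end{bmatrix}
= \begin{bmatrix} 4 \\ 6 \end{bmatrix}.
\]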

Similarly, ith row Ai B of AB is a linear combination of the rows of B, the coefficients coming from
the ith row Ai of A.
Properties of Matrix Operations

Theorem 1.1.4. The following identities hold for matrix sum and product, whenever the sizes of
the matrices involved are compatible (for the stated operations).
(i) A(B + C) = AB + AC.
(ii) (P + Q)R = P R + QR.
(iii) A(BC) = (AB)C.
(iv) c(AB) = (cA)B = A(cB).

Proof. We prove item (iii), leaving the others as exercises. Let $A = (a_{ij})$ have p columns, $B = (b_{kl})$ have p rows and q columns, and $C = (c_{rs})$ have q rows. Then the entry in row i and column s of A(BC) is
\[
\sum_{m=1}^{p} a_{im} \,\{\text{entry in row } m,\ \text{column } s \text{ of } BC\}
= \sum_{m=1}^{p} a_{im} \Big\{ \sum_{n=1}^{q} b_{mn} c_{ns} \Big\}
= \sum_{n=1}^{q} \Big\{ \sum_{m=1}^{p} a_{im} b_{mn} \Big\} c_{ns},
\]
which is the entry in row i and column s of (AB)C.


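Matrix multiplication is not commutative. For example,
\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},
\]
but
\[
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.
\]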

Definition 1.1.5. A matrix all of whose entries are zero is called the zero matrix. The entries
aii of a square matrix A = (aij ) are called the diagonal entries. If the only nonzero entries of a
square matrix A are the diagonal entries then A is called a diagonal matrix. An n × n diagonal
matrix whose diagonal entries are 1 is called the n × n identity matrix. It is denoted by In . A
square matrix A = (aij ) is called upper triangular if all the entries below the diagonal are zero,
i.e., aij = 0 for i > j. Similarly we define lower triangular matrices.
A square matrix A is called nilpotent if $A^r = 0$ for some r ≥ 1.

Example 1.1.6. Let $A = (a_{ij})$ be an upper triangular n × n matrix with diagonal entries zero. Then A is nilpotent. In fact $A^n = 0$.
Since column j of $A^n$ is $A^n e_j$, it is enough to show that $A^n e_j = 0$ for j = 1, . . . , n. Denote column j of A by $A_j$. We shall show that $A^j e_j = 0$ for each j, which suffices since j ≤ n.

We have $A e_1 = A_1 = 0$. Now
\[
A^2 e_2 = A(A e_2) = A A_2 = A(a_{12} e_1) = a_{12} A e_1 = 0.
\]
Similarly
\[
A^3 e_3 = A^2 (A e_3) = A^2 A_3 = A^2 (a_{13} e_1 + a_{23} e_2) = a_{13} A^2 e_1 + a_{23} A^2 e_2 = 0.
\]
Continuing in this fashion we see that all columns of $A^n$ are zero.
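A quick numerical check of this example, as a sketch: the 3 × 3 matrix below is an illustrative strictly upper triangular matrix (its entries are not from the notes), and the helper names `mat_mul`, `mat_power` and `is_zero` are hypothetical.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_power(A, r):
    """Compute A^r for a square matrix A by repeated multiplication."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(r):
        result = mat_mul(result, A)
    return result


def is_zero(M):
    """True when every entry of M is zero."""
    return all(entry == 0 for row in M for entry in row)


# Upper triangular with zero diagonal, n = 3.
A = [[0, 2, 5],
     [0, 0, 7],
     [0, 0, 0]]

print(mat_power(A, 2))           # [[0, 0, 14], [0, 0, 0], [0, 0, 0]]
print(is_zero(mat_power(A, 3)))  # True
```

Here $A^2 \neq 0$ but $A^3 = 0$, so the exponent n in the example cannot be lowered in general.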

Inverse of a Matrix

Definition 1.1.7. Let A be an n × n matrix. If there is an n × n matrix B such that $AB = I_n = BA$ then we say A is invertible and B is the inverse of A. The inverse of A is denoted by $A^{-1}$.

Remark 1.1.8. (1) The inverse of a matrix is uniquely determined. Indeed, if B and C are inverses of A then
\[
B = BI = B(AC) = (BA)C = IC = C.
\]
(2) If A and B are invertible n × n matrices, then AB is also invertible. Indeed,
\[
(B^{-1} A^{-1})(AB) = B^{-1} (A^{-1} A) B = B^{-1} B = I.
\]
Similarly $(AB)(B^{-1} A^{-1}) = I$. Thus AB is invertible and $(AB)^{-1} = B^{-1} A^{-1}$.


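(3) We will see later (in Chapter 3) that if, for an n × n matrix A, there exists an n × n matrix B such that AB = I or BA = I, then A is invertible. This fact fails for non-square matrices. For example,
\[
[1 \ 2] \begin{bmatrix} 1 \\ 0 \end{bmatrix} = [1] = I_1,
\quad\text{but}\quad
\begin{bmatrix} 1 \\ 0 \end{bmatrix} [1 \ 2]
= \begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix} \neq I_2.
\]
(4) The inverse of a square matrix need not exist. For example, let $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$. If $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is any 2 × 2 matrix, then
\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} \neq I_2
\]
for any a, b, c, d.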

Transpose of a Matrix

Definition 1.1.9. Let $A = (a_{ij})$ be an m × n matrix. Then the transpose of A, denoted by $A^t$, is the n × m matrix $(b_{ij})$ such that $b_{ij} = a_{ji}$ for all i, j.
Thus rows of A become columns of $A^t$ and columns of A become rows of $A^t$. For example, if
\[
A = \begin{bmatrix} 2 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}
\quad\text{then}\quad
A^t = \begin{bmatrix} 2 & 1 \\ 0 & 0 \\ 1 & 1 \end{bmatrix}.
\]

Lemma 1.1.10. (i) For matrices A and B of suitable sizes, $(AB)^t = B^t A^t$.
(ii) For any invertible square matrix A, $(A^{-1})^t = (A^t)^{-1}$.

Proof. For any matrix C, let $C_{ij}$ denote its (i, j)th entry.
(i) Let $A = (a_{ij})$, $B = (b_{ij})$. Then, for all i, j,
\[
((AB)^t)_{ij} = (AB)_{ji}
= \sum_{k} a_{jk} b_{ki}
= \sum_{k} (A^t)_{kj} (B^t)_{ik}
= \sum_{k} (B^t)_{ik} (A^t)_{kj}
= (B^t A^t)_{ij}.
\]
(ii) Since $A A^{-1} = I = A^{-1} A$, we have $(A A^{-1})^t = I = (A^{-1} A)^t$. By (i), $(A^{-1})^t A^t = I = A^t (A^{-1})^t$. Thus $(A^t)^{-1} = (A^{-1})^t$.

Definition 1.1.11. A square matrix A is called symmetric if $A = A^t$. It is called skew-symmetric if $A^t = -A$.

Lemma 1.1.12. (i) If A is an invertible symmetric matrix then so is $A^{-1}$. (ii) Every square matrix A is a sum of a symmetric and a skew-symmetric matrix in a unique way.

Proof. (i) follows from Lemma 1.1.10 (ii): $(A^{-1})^t = (A^t)^{-1} = A^{-1}$.
(ii) Since
\[
A = \frac{1}{2}(A + A^t) + \frac{1}{2}(A - A^t),
\]
every matrix is a sum of a symmetric and a skew-symmetric matrix. To see the uniqueness, suppose that P is a symmetric matrix and Q is a skew-symmetric matrix such that

A = P + Q.

Then $A^t = P^t + Q^t = P - Q$. Adding and subtracting these two equations gives $A + A^t = 2P$ and $A - A^t = 2Q$. Hence $P = \frac{1}{2}(A + A^t)$ and $Q = \frac{1}{2}(A - A^t)$.
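The decomposition in part (ii) is easy to compute. The sketch below uses exact rational arithmetic from Python's standard `fractions` module; the function name `symmetric_skew_parts` and the sample matrix are assumptions made for illustration.

```python
from fractions import Fraction


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*A)]


def symmetric_skew_parts(A):
    """Return (P, Q) with P = (A + A^t)/2 symmetric and Q = (A - A^t)/2 skew-symmetric."""
    At = transpose(A)
    n = len(A)
    P = [[Fraction(A[i][j] + At[i][j], 2) for j in range(n)] for i in range(n)]
    Q = [[Fraction(A[i][j] - At[i][j], 2) for j in range(n)] for i in range(n)]
    return P, Q


A = [[1, 2], [4, 3]]
P, Q = symmetric_skew_parts(A)
# P equals [[1, 3], [3, 3]] and Q equals [[0, -1], [1, 0]] (as Fractions).
total = [[P[i][j] + Q[i][j] for j in range(2)] for i in range(2)]
print(total == A)  # True
```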

1.2 Gauss Elimination

We discuss a widely used method called the Gauss elimination method to solve a system of m linear
equations in n unknowns x1 , . . . , xn :

ai1 x1 + ai2 x2 + · · · + ain xn = bi , i = 1, 2, . . . , m,

where the aij ’s and the bi ’s are known scalars in F. If each bi = 0 then the system above is called
a homogeneous system. Otherwise, we say it is inhomogeneous.
Set $A = (a_{ij})$, $b = (b_1, \ldots, b_m)^t$, and $x = (x_1, \ldots, x_n)^t$. We can write the system above in the matrix form
\[
Ax = b.
\]
The matrix A is called the coefficient matrix. By a solution, we mean any choice of the unknowns
x1 , . . . , xn which satisfies all the equations.

Lemma 1.2.1. Let A be an m × n matrix over F, $b \in F^m$, and E an invertible m × m matrix over F. Set U = EA and c = Eb. Then Ax = b has the same solutions as Ux = c.

Proof. Ax = b implies EAx = Eb. Similarly, EAx = Eb implies $E^{-1}(EAx) = E^{-1}(Eb)$, or Ax = b.
The idea of Gauss elimination is the following:
(i) Find a suitable invertible E so that U is in “row echelon form” or “row canonical form”
(defined below).
(ii) All solutions to U x = c, when U is in row echelon form or row canonical form , can be
written down easily.
We first describe step (ii) and then step (i).

Definition 1.2.2. A m × n matrix M is said to be in row echelon form (ref ) if it satisfies the
following conditions:
(a) By a zero row of M we mean a row with all entries zero. Suppose M has k nonzero rows
and m − k zero rows. Then the last m − k rows of M are the zero rows.
(b) The first nonzero entry in a nonzero row is called a pivot. For i = 1, 2, . . . , k, suppose that
the pivot in row i occurs in column ji . Then we have j1 < j2 < · · · < jk . The columns {j1 , . . . , jk }
are called the set of pivotal columns of M . Columns {1, . . . , n} \ {j1 , . . . , jk } are the nonpivotal
or free columns.

Definition 1.2.3. An m × n matrix M is said to be in row canonical form (rcf ) if it satisfies the following conditions:
(a) By a zero row of M we mean a row with all entries zero. Suppose M has k nonzero rows
and m − k zero rows. Then the last m − k rows of M are the zero rows.
(b) The first nonzero entry in every nonzero row is 1. This entry is called a pivot. For
i = 1, 2, . . . , k, suppose that the pivot in row i occurs in column ji .
(c) We have j1 < j2 < · · · < jk . The columns {j1 , . . . , jk } are called the set of pivotal columns
of M . Columns {1, . . . , n} \ {j1 , . . . , jk } are the nonpivotal or free columns.
(d) The matrix formed by the first k rows and the k pivotal columns of M is the k × k identity
matrix, i.e., the only nonzero entry in a pivotal column is the pivot 1.

Note that a matrix in row canonical form is automatically in row echelon form. Also note that,
in both the definitions above, the number of pivots k is ≤ m, n.

Example 1.2.4. Consider the following 4 × 8 matrix U
\[
U = \begin{bmatrix}
0 & 1 & a_{13} & a_{14} & 0 & a_{16} & 0 & a_{18} \\
0 & 0 & 0 & 0 & 1 & a_{26} & 0 & a_{28} \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & a_{38} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix},
\]
where the $a_{ij}$'s are arbitrary scalars. It may be checked that U is in rcf with pivotal columns 2, 5, 7 and nonpivotal columns 1, 3, 4, 6, 8. Now let R be the matrix
\[
R = \begin{bmatrix}
0 & a & a_{13} & a_{14} & a_{15} & a_{16} & a_{17} & a_{18} \\
0 & 0 & 0 & 0 & b & a_{26} & a_{27} & a_{28} \\
0 & 0 & 0 & 0 & 0 & 0 & c & a_{38} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix},
\]
where a, b, c are nonzero scalars and the $a_{ij}$'s are arbitrary scalars. It may be checked that R is in ref with pivotal columns 2, 5, 7 and nonpivotal columns 1, 3, 4, 6, 8.
Example 1.2.5. Let U be the matrix from the example above. Let c = (c1 , c2 , c3 , c4 )t . We want
to write down all solutions to the system U x = c.
(i) If c4 ≠ 0 then clearly there is no solution.
(ii) Now assume that c4 = 0. Call the variables x2 , x5 , x7 pivotal and the variables x1 , x3 , x4 , x6 , x8
nonpivotal or free.
Give arbitrary values x1 = s, x3 = t, x4 = u, x6 = v, x8 = w to the free variables. These values
can be extended to values of the pivotal variables in one and only one way to get a solution to the
system U x = c:
x7 = c3 − a38 w
x5 = c2 − a26 v − a28 w
x2 = c1 − a13 t − a14 u − a16 v − a18 w
Thus (why?) the set of all solutions to U x = c can be written as
\[
\begin{bmatrix}
s \\
c_1 - a_{13} t - a_{14} u - a_{16} v - a_{18} w \\
t \\
u \\
c_2 - a_{26} v - a_{28} w \\
v \\
c_3 - a_{38} w \\
w
\end{bmatrix},
\]
where s, t, u, v, w are arbitrary scalars.
(iii) The column vector above can be written as
\[
\begin{bmatrix} 0 \\ c_1 \\ 0 \\ 0 \\ c_2 \\ 0 \\ c_3 \\ 0 \end{bmatrix}
+ s \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ t \begin{bmatrix} 0 \\ -a_{13} \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ u \begin{bmatrix} 0 \\ -a_{14} \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ v \begin{bmatrix} 0 \\ -a_{16} \\ 0 \\ 0 \\ -a_{26} \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ w \begin{bmatrix} 0 \\ -a_{18} \\ 0 \\ 0 \\ -a_{28} \\ 0 \\ -a_{38} \\ 1 \end{bmatrix}.
\]
Thus every solution to U x = c is of the form above, for arbitrary scalars s, t, u, v, w. Note that the
first vector in the expression above is the unique solution to U x = c that has all free variables zero
and that the other vectors (without the coefficients) are the unique solutions to U x = 0 that have
one free variable equal to 1 and the other free variables equal to zero.

Example 1.2.6. Let R be the matrix from Example 1.2.4. Let $c = (c_1, c_2, c_3, c_4)^t$. We want to write down all solutions to the system Rx = c.
(i) If c4 ≠ 0 then clearly there is no solution.
(ii) Now assume that c4 = 0. Call the variables x2 , x5 , x7 pivotal and the variables x1 , x3 , x4 , x6 , x8
nonpivotal or free.
Give arbitrary values x1 = s, x3 = t, x4 = u, x6 = v, x8 = w to the free variables. These values
can be extended to values of the pivotal variables in one and only one way to get a solution to the
system Rx = c:

x7 = (c3 − a38 w)/c


x5 = (c2 − a26 v − a27 x7 − a28 w)/b
x2 = (c1 − a13 t − a14 u − a15 x5 − a16 v − a17 x7 − a18 w)/a

The process above is called back substitution. Given arbitrary values for the free variables,
we first solve for the value of the largest pivotal variable, then using this value (and the values of
the free variables) we get the value of the second largest pivotal variable, and so on.
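Back substitution can be sketched in code as follows. This is an illustration under stated assumptions, not part of the notes: the matrix is assumed to be in ref already, columns are indexed from 0 (so column 2 of the text is index 1), unspecified free variables default to zero, and the names `pivot_columns` and `back_substitute` are hypothetical.

```python
from fractions import Fraction


def pivot_columns(R):
    """0-based column index of the first nonzero entry of each nonzero row."""
    pivots = []
    for row in R:
        nonzero = [j for j, entry in enumerate(row) if entry != 0]
        if nonzero:
            pivots.append(nonzero[0])
    return pivots


def back_substitute(R, c, free_values):
    """Solve R x = c for R in row echelon form.

    free_values maps a free column index to its chosen value.
    Returns None when a zero row of R faces a nonzero entry of c.
    """
    n = len(R[0])
    pivots = pivot_columns(R)
    k = len(pivots)
    if any(c[i] != 0 for i in range(k, len(R))):
        return None  # inconsistent system
    x = [Fraction(0)] * n
    for j, value in free_values.items():
        x[j] = Fraction(value)
    # Work upwards from the last pivotal variable to the first.
    for i in reversed(range(k)):
        j = pivots[i]
        known = sum(R[i][l] * x[l] for l in range(j + 1, n))
        x[j] = (Fraction(c[i]) - known) / R[i][j]
    return x


R = [[0, 2, 1, 4],
     [0, 0, 0, 3],
     [0, 0, 0, 0]]
solution = back_substitute(R, [6, 9, 0], {0: 5, 2: 2})
# solution equals [5, -4, 2, 3]
```

Each pivotal variable is determined uniquely once the later variables are known, which is the uniqueness statement used in the examples above.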

We extract the following Lemma from the examples above and its proof is left as an exercise.
Lemma 1.2.7. Let U be a m × n matrix in ref. Then the only solution to the homogeneous system
U x = 0 which is zero in all free variables is the zero solution.

Note that a matrix in rcf is also in ref and the lemma above also applies to such matrices.
Theorem 1.2.8. Let Ax = b, with A an m × n matrix. Let c be a solution of Ax = b and S the
set of all solutions of the associated homogeneous system Ax = 0. Then the set of all solutions to
Ax = b is
c + S = {c + v : v ∈ S}.

Proof. Let Au = b. Then A(u − c) = Au − Ac = b − b = 0. So u − c ∈ S and u = c + (u − c) ∈ c + S.


Conversely, let v ∈ S. Then A(c + v) = Ac + Av = b + 0 = b. Hence c + v is a solution to Ax = b.

The proof of the following important result is almost obvious and is left as an exercise.
Theorem 1.2.9. Let U be a m × n matrix in ref with k pivotal columns P = {j1 < j2 < · · · < jk }
and nonpivotal or free columns F = {1, . . . , n} \ P . Let c = (c1 , . . . , cm )t .
(i) The system U x = c has a solution iff ck+1 = · · · = cm = 0.
(ii) Assume ck+1 = · · · = cm = 0. Given arbitrary scalars yi , i ∈ F , there exist unique scalars
yi , i ∈ P such that y = (y1 , . . . , yn )t satisfies U y = c.
(iii) For i ∈ F, let $s_i$ be the unique solution of Ux = 0 which is zero in all free components except component i, where it is 1. Then every solution of Ux = 0 is of the form
\[
\sum_{i \in F} a_i s_i,
\]
where the $a_i$'s are arbitrary scalars.

(iv) Let p be the unique solution of Ux = c having all free variables zero. Then every solution of Ux = c is of the form
\[
p + \sum_{i \in F} a_i s_i,
\]
where the $a_i$'s are arbitrary scalars.

Example 1.2.10. In our previous two examples P = {2, 5, 7} and F = {1, 3, 4, 6, 8}. To make sure
the notation of the theorem is understood write down p and si , i = 1, 3, 4, 6, 8.

We now discuss the first step in Gauss elimination, namely, how to reduce a matrix to ref or
rcf. We define a set of elementary row operations to be performed on the equations of a system.
These operations transform a system of equations into another system with the same solution set.
Performing an elementary row operation on Ax = b is equivalent to replacing this system by the
system EAx = Eb, where E is an invertible elementary matrix.

Elementary row operations and elementary matrices

Let eij denote the m × n matrix with 1 in the ith row and jth column and zero elsewhere. Any
matrix A = (aij ) of size m × n can be written as

\[
A = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} e_{ij}.
\]

For this reason the $e_{ij}$'s are called the matrix units. Let us see the effect of multiplying the m × m matrix unit $e_{13}$ with an m × n matrix A written in terms of its row vectors $R_1, \ldots, R_m$:
\[
e_{13} A =
\begin{bmatrix}
0 & 0 & 1 & \cdots & 0 \\
0 & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}_{m \times m}
\begin{bmatrix} R_1 \\ R_2 \\ R_3 \\ \vdots \\ R_m \end{bmatrix}_{m \times n}
=
\begin{bmatrix} R_3 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
\]

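In general, if $e_{ij}$ is an m × m matrix unit and A is an m × n matrix with rows $R_1, \ldots, R_m$, then
\[
e_{ij} A = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ R_j \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\leftarrow \text{$i$th row},
\]
that is, $e_{ij} A$ has $R_j$ as its ith row and all other rows zero.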

We now define three kinds of elementary row operations and elementary matrices. Consider the
system Ax = b, where A is m × n, b is m × 1, and x is a n × 1 unknown vector.

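(i) Elementary row operation of type I: For i ≠ j and a scalar a, add a times equation j to equation i in the system Ax = b.
What effect does this operation have on A and b? Consider the matrix
\[
E = \begin{bmatrix}
1 & & & & \\
& 1 & & a & \\
& & \ddots & & \\
& & & 1 & \\
& & & & 1
\end{bmatrix}
\quad\text{or}\quad
\begin{bmatrix}
1 & & & & \\
& 1 & & & \\
& & \ddots & & \\
& a & & 1 & \\
& & & & 1
\end{bmatrix}
= I + a e_{ij}, \quad i \neq j,
\]
where the entry a sits in row i and column j (above the diagonal if i < j, below it if i > j) and blank entries are zero.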
This matrix has 1's on the diagonal and a scalar a as an off-diagonal entry. By the above observation,
\[
(I + a e_{ij})
\begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_m \end{bmatrix}
= \begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_m \end{bmatrix}
+ a \begin{bmatrix} 0 \\ \vdots \\ R_j \\ \vdots \\ 0 \end{bmatrix}
= \begin{bmatrix} R_1 \\ \vdots \\ R_i + a R_j \\ \vdots \\ R_m \end{bmatrix},
\]
where $R_j$ in the middle matrix and $R_i + a R_j$ in the last matrix both occupy the ith row.

It is now clear that performing an elementary row operation of type I on the system Ax = b we get the new system EAx = Eb.
Suppose we perform an elementary row operation of type I as above. Then perform the same elementary row operation of type I but with the scalar a replaced by the scalar −a. It is clear that we get back the original system Ax = b. It follows (why?) that $E^{-1} = I - a e_{ij}$.
(ii) Elementary row operation of type II: For i ≠ j interchange equations i and j in the system Ax = b.
What effect does this operation have on A and b? Consider the matrix
\[
F = \begin{bmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & 0 & \cdots & 1 & & \\
& & \vdots & \ddots & \vdots & & \\
& & 1 & \cdots & 0 & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{bmatrix}
= I + e_{ij} + e_{ji} - e_{ii} - e_{jj},
\]
which is the identity matrix with rows i and j interchanged.

Premultiplication by this matrix has the effect of interchanging the ith and jth rows. Performing this operation twice in succession gives back the original system. Thus $F^2 = I$.

(iii) Elementary row operation of type III: Multiply equation i in the system Ax = b by a nonzero scalar c.
What effect does this operation have on A and b? Consider the matrix
\[
G = \begin{bmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & 1 & & & & \\
& & & c & & & \\
& & & & 1 & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{bmatrix}
= I + (c - 1) e_{ii}, \quad c \neq 0,
\]
where c is the ith diagonal entry.

Premultiplication by G has the effect of multiplying the ith row by c. Doing this operation twice in succession, the first time with the scalar c and the second time with the scalar 1/c, yields the original system back. It follows that $G^{-1} = I + (c^{-1} - 1) e_{ii}$.
The matrices E, F, G above are called elementary matrices of type I,II,III respectively. We
summarize the above discussion in the following result.
Theorem 1.2.11. Performing an elementary row operation (of a certain type) on the system
Ax = b is equivalent to premultiplying A and b by an elementary matrix E (of the same type),
yielding the system EAx = Eb.
Elementary matrices are invertible and the inverse of an elementary matrix is an elementary
matrix of the same type.
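The three types of elementary matrices and their inverses can be checked on a small case. The sketch below is illustrative only: rows are numbered from 1 as in the text, and the constructor names `type_one`, `type_two` and `type_three` are hypothetical.

```python
from fractions import Fraction


def identity(m):
    """The m x m identity matrix."""
    return [[int(i == j) for j in range(m)] for i in range(m)]


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def type_one(m, i, j, a):
    """I + a e_ij (i != j): adds a times row j to row i."""
    if i == j:
        raise ValueError("type I requires i != j")
    E = identity(m)
    E[i - 1][j - 1] = a
    return E


def type_two(m, i, j):
    """I + e_ij + e_ji - e_ii - e_jj: interchanges rows i and j."""
    F = identity(m)
    F[i - 1], F[j - 1] = F[j - 1], F[i - 1]
    return F


def type_three(m, i, c):
    """I + (c - 1) e_ii with c nonzero: multiplies row i by c."""
    if c == 0:
        raise ValueError("type III requires a nonzero scalar")
    G = identity(m)
    G[i - 1][i - 1] = c
    return G


A = [[1, 2], [3, 4], [5, 6]]
print(mat_mul(type_one(3, 2, 1, -3), A))  # [[1, 2], [0, -2], [5, 6]]
print(mat_mul(type_two(3, 1, 3), A))      # [[5, 6], [3, 4], [1, 2]]
```

Multiplying each elementary matrix by the matrix of the same type with parameter −a, the same swap, or 1/c gives the identity, in line with the theorem.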

Since elementary matrices are invertible it follows that performing elementary row operations does not change the solution set of the system. We now show how to reduce a matrix to row canonical form using a sequence of elementary row operations.
Theorem 1.2.12. Every matrix can be reduced to a matrix in rcf by a sequence of elementary row
operations.

Proof. We apply induction on the number of rows. If the matrix A is a row vector, the conclusion is obvious. Now suppose that A is m × n, where m ≥ 2. If A = 0 then we are done. If A is not the zero matrix then there is a nonzero column in A. Find the first nonzero column, say column $j_1$, from the left. Interchange rows, if necessary, to move a nonzero entry of column $j_1$ to the top row. Now multiply the top row by a nonzero scalar to make this entry (in row 1 and column $j_1$) equal to 1. Now add suitable multiples of the first row to the remaining rows so that all entries in column $j_1$, except the entry in row 1, become zero. The resulting matrix looks like
\[
A_1 = \begin{bmatrix}
0 & \cdots & 0 & 1 & * & \cdots & * \\
0 & \cdots & 0 & 0 & * & \cdots & * \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & * & \cdots & *
\end{bmatrix}.
\]
By induction, the submatrix of $A_1$ consisting of rows 2, 3, . . . , m can be reduced to row canonical form. So now the resulting matrix looks like
\[
A_2 = \begin{bmatrix}
0 & 1 & v \\
0 & 0 & D
\end{bmatrix},
\]
where the 0 blocks consist of zeros (the first block column has $j_1 - 1$ columns), v is a row vector with $n - j_1$ components, and D is an $(m-1) \times (n-j_1)$ matrix in rcf. Let the pivotal columns of D, numbered as columns of $A_2$, be $j_2 < j_3 < \cdots < j_k$, where $j_1 < j_2$. By subtracting suitable multiples of rows 2, . . . , k of $A_2$ from row 1 of $A_2$ we can make the entries in columns $j_2, \ldots, j_k$ of row 1 equal to 0. The resulting matrix is in rcf.
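The reduction in the proof can be turned into a short program. The sketch below is an assumption-laden illustration rather than the notes' own algorithm: it works column by column in a loop instead of by induction on rows, clears each pivotal column above and below the pivot in one pass, uses exact `Fraction` arithmetic, indexes columns from 0, and the name `rcf` is hypothetical. The test matrix is the augmented matrix of the system in Example 1.2.14 later in this section.

```python
from fractions import Fraction


def rcf(M):
    """Reduce M to row canonical form; return (reduced matrix, pivotal columns)."""
    A = [[Fraction(entry) for entry in row] for row in M]
    m, n = len(A), len(A[0])
    pivot_row = 0
    pivots = []
    for col in range(n):
        if pivot_row == m:
            break
        # Look for a nonzero entry in this column at or below pivot_row.
        r = next((i for i in range(pivot_row, m) if A[i][col] != 0), None)
        if r is None:
            continue  # free column
        A[pivot_row], A[r] = A[r], A[pivot_row]            # type II
        p = A[pivot_row][col]
        A[pivot_row] = [entry / p for entry in A[pivot_row]]  # type III
        for i in range(m):
            if i != pivot_row and A[i][col] != 0:
                f = A[i][col]
                A[i] = [x - f * y for x, y in zip(A[i], A[pivot_row])]  # type I
        pivots.append(col)
        pivot_row += 1
    return A, pivots


augmented = [[2, 1, 1, 5],
             [4, -6, 0, -2],
             [-2, 7, 2, 9]]
reduced, pivots = rcf(augmented)
# reduced equals [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 2]]; pivots equals [0, 1, 2]
```

The last column of the reduced augmented matrix then reads off the solution of the system.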
Before giving examples of row operations we collect together some results on systems of linear
equations that follow from Gauss elimination.
Theorem 1.2.13. Let Ax = b, with A an m × n matrix.
(i) Suppose m < n. Then there is a nontrivial solution to the homogeneous system Ax = 0.
(ii) The number of solutions to Ax = b is either 0, 1, or ∞.

Proof. (i) Reduce A to rcf U by Gauss elimination. Since m < n there is at least one free variable. It follows that there is a nontrivial solution.
(ii) Reduce Ax = b to EAx = Eb using Gauss elimination, where U = EA is in rcf. Put $c = Eb = (c_1, \ldots, c_m)^t$. Suppose U has k nonzero rows. Three cases arise:
(a) at least one of $c_{k+1}, \ldots, c_m$ is nonzero: in this case there is no solution.
(b) $c_{k+1} = \cdots = c_m = 0$ and k = n: there is a unique solution (why?).
(c) $c_{k+1} = \cdots = c_m = 0$ and k < n: there are infinitely many solutions (why?).
No other cases are possible (why?). That completes the proof.
In the following examples an elementary row operation of type I is indicated by Ri + aRj , of
type II is indicated by Ri ↔ Rj , and of type III is indicated by aRi .
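Example 1.2.14. Consider the system
\[
Ax = \begin{bmatrix} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 5 \\ -2 \\ 9 \end{bmatrix} = b.
\]
Applying the indicated elementary row operations to A and b we get
\[
\left[\begin{array}{rrr|r} 2 & 1 & 1 & 5 \\ 4 & -6 & 0 & -2 \\ -2 & 7 & 2 & 9 \end{array}\right]
\xrightarrow[R_3 + R_1]{R_2 - 2R_1}
\left[\begin{array}{rrr|r} 2 & 1 & 1 & 5 \\ 0 & -8 & -2 & -12 \\ 0 & 8 & 3 & 14 \end{array}\right]
\xrightarrow{R_3 + R_2}
\left[\begin{array}{rrr|r} 2 & 1 & 1 & 5 \\ 0 & -8 & -2 & -12 \\ 0 & 0 & 1 & 2 \end{array}\right]
\]
\[
\xrightarrow[R_2 + 2R_3]{R_1 - R_3}
\left[\begin{array}{rrr|r} 2 & 1 & 0 & 3 \\ 0 & -8 & 0 & -8 \\ 0 & 0 & 1 & 2 \end{array}\right]
\xrightarrow{R_1 + (1/8)R_2}
\left[\begin{array}{rrr|r} 2 & 0 & 0 & 2 \\ 0 & -8 & 0 & -8 \\ 0 & 0 & 1 & 2 \end{array}\right]
\xrightarrow[(-1/8)R_2]{(1/2)R_1}
\left[\begin{array}{rrr|r} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 2 \end{array}\right].
\]
Since there are no free columns the problem has a unique solution given by x1 = x2 = 1 and x3 = 2.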

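Example 1.2.15. Consider the system
\[
Ax = \begin{bmatrix}
1 & 3 & -2 & 0 & 2 & 0 \\
2 & 6 & -5 & -2 & 4 & -3 \\
0 & 0 & 5 & 10 & 0 & 15 \\
2 & 6 & 0 & 8 & 4 & 18
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix}
= \begin{bmatrix} 0 \\ -1 \\ 5 \\ 6 \end{bmatrix} = b.
\]
Applying the indicated elementary row operations to A and b we get
\[
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
2 & 6 & -5 & -2 & 4 & -3 & -1 \\
0 & 0 & 5 & 10 & 0 & 15 & 5 \\
2 & 6 & 0 & 8 & 4 & 18 & 6
\end{array}\right]
\xrightarrow[R_4 - 2R_1]{R_2 - 2R_1}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & -1 & -2 & 0 & -3 & -1 \\
0 & 0 & 5 & 10 & 0 & 15 & 5 \\
0 & 0 & 4 & 8 & 0 & 18 & 6
\end{array}\right]
\]
\[
\xrightarrow{-R_2}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 5 & 10 & 0 & 15 & 5 \\
0 & 0 & 4 & 8 & 0 & 18 & 6
\end{array}\right]
\xrightarrow[R_4 - 4R_2]{R_3 - 5R_2}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 6 & 2
\end{array}\right]
\]
\[
\xrightarrow{R_3 \leftrightarrow R_4}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 6 & 2 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
\xrightarrow{(1/6)R_3}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 1/3 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
\]
\[
\xrightarrow{R_2 - 3R_3}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1/3 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
\xrightarrow{R_1 + 2R_2}
\left[\begin{array}{rrrrrr|r}
1 & 3 & 0 & 4 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1/3 \\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right].
\]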

It may be checked that every solution to Ax = b is of the form
\[
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1/3 \end{bmatrix}
+ s \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
+ r \begin{bmatrix} -4 \\ 0 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ t \begin{bmatrix} -2 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix},
\]
for some scalars s, t, r.
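The claimed solution family can be verified by direct multiplication. In the sketch below the names `p`, `v1`, `v2`, `v3`, `mat_vec` and `solution` are hypothetical labels for the particular solution, the three homogeneous solutions, and small helpers; the data are those of Example 1.2.15.

```python
from fractions import Fraction

A = [[1, 3, -2, 0, 2, 0],
     [2, 6, -5, -2, 4, -3],
     [0, 0, 5, 10, 0, 15],
     [2, 6, 0, 8, 4, 18]]
b = [0, -1, 5, 6]

p = [0, 0, 0, 0, 0, Fraction(1, 3)]  # all free variables zero
v1 = [-3, 1, 0, 0, 0, 0]             # free variable x2 = 1
v2 = [-4, 0, -2, 1, 0, 0]            # free variable x4 = 1
v3 = [-2, 0, 0, 0, 1, 0]             # free variable x5 = 1


def mat_vec(M, x):
    """Compute Mx for a matrix M (list of rows) and a vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def solution(s, r, t):
    """The vector p + s*v1 + r*v2 + t*v3."""
    return [p[i] + s * v1[i] + r * v2[i] + t * v3[i] for i in range(6)]


print(mat_vec(A, p) == b)                   # True
print(mat_vec(A, solution(2, -1, 7)) == b)  # True
```

This checks that every vector of the stated form is a solution; that there are no other solutions follows from Theorem 1.2.9.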

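Example 1.2.16. Consider the system
\[
Ax = \begin{bmatrix}
1 & 3 & -2 & 0 & 2 & 0 \\
2 & 6 & -5 & -2 & 4 & -3 \\
0 & 0 & 5 & 10 & 0 & 15 \\
2 & 6 & 0 & 8 & 4 & 18
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \\ x_6 \end{bmatrix}
= \begin{bmatrix} 0 \\ -1 \\ 6 \\ 6 \end{bmatrix} = b.
\]
Applying the indicated elementary row operations to A and b we get
\[
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
2 & 6 & -5 & -2 & 4 & -3 & -1 \\
0 & 0 & 5 & 10 & 0 & 15 & 6 \\
2 & 6 & 0 & 8 & 4 & 18 & 6
\end{array}\right]
\xrightarrow[R_4 - 2R_1]{R_2 - 2R_1}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & -1 & -2 & 0 & -3 & -1 \\
0 & 0 & 5 & 10 & 0 & 15 & 6 \\
0 & 0 & 4 & 8 & 0 & 18 & 6
\end{array}\right]
\]
\[
\xrightarrow{-R_2}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 5 & 10 & 0 & 15 & 6 \\
0 & 0 & 4 & 8 & 0 & 18 & 6
\end{array}\right]
\xrightarrow[R_4 - 4R_2]{R_3 - 5R_2}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 6 & 2
\end{array}\right]
\]
\[
\xrightarrow{R_3 \leftrightarrow R_4}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 6 & 2 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}\right]
\xrightarrow{(1/6)R_3}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 3 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 1/3 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}\right]
\]
\[
\xrightarrow{R_2 - 3R_3}
\left[\begin{array}{rrrrrr|r}
1 & 3 & -2 & 0 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1/3 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}\right]
\xrightarrow{R_1 + 2R_2}
\left[\begin{array}{rrrrrr|r}
1 & 3 & 0 & 4 & 2 & 0 & 0 \\
0 & 0 & 1 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1/3 \\
0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}\right].
\]
The last row corresponds to the equation 0 = 1. It follows that the system has no solution.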

Calculation of A−1 by Gauss elimination


Lemma 1.2.17. Let A be a square matrix. Then the following are equivalent:

(a) A can be reduced to I by a sequence of elementary row operations.


(b) A is a product of elementary matrices.
(c) A is invertible.
(d) The system Ax = 0 has only the trivial solution x = 0.

Proof. (a) ⇒ (b). Let $E_1, \ldots, E_k$ be elementary matrices so that $E_k \cdots E_1 A = I$. Thus $A = E_1^{-1} \cdots E_k^{-1}$, and the inverse of an elementary matrix is an elementary matrix.

(b) ⇒ (c) Elementary matrices are invertible, and a product of invertible matrices is invertible (Remark 1.1.8 (2)).

(c) ⇒ (d) Suppose A is invertible and Ax = 0. Then $x = A^{-1}(Ax) = 0$.

(d) ⇒ (a) First observe that a square matrix in rcf is either the identity matrix or its bottom row is zero. If A cannot be reduced to I by elementary row operations then U, the rcf of A, has a zero row at the bottom. Hence Ux = 0 has at most n − 1 nontrivial equations in n unknowns, and so it has a nontrivial solution by Theorem 1.2.13 (i). Since Ux = 0 and Ax = 0 have the same solutions, this contradicts (d).
This lemma provides us with an algorithm to calculate the inverse of a matrix if it exists. If A is invertible then there exist elementary matrices $E_1, E_2, \ldots, E_k$ such that $E_k \cdots E_1 A = I$. Multiply by $A^{-1}$ on the right on both sides to get $E_k \cdots E_1 I = A^{-1}$.

Lemma 1.2.18. (Gauss-Jordan Algorithm) Let A be an invertible matrix. To compute $A^{-1}$, apply elementary row operations to A to reduce it to an identity matrix. The same operations, when applied to I, produce $A^{-1}$.

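Example 1.2.19. We find the inverse of the matrix
\[
A = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\]
by forming the 3 × 6 matrix
\[
[A \mid I] = \left[\begin{array}{rrr|rrr}
1 & 0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 \\
1 & 1 & 1 & 0 & 0 & 1
\end{array}\right].
\]
Now perform row operations to reduce the matrix A to I. In this process the identity matrix will reduce to $A^{-1}$.
\[
[A \mid I]
\xrightarrow[R_3 - R_1]{R_2 - R_1}
\left[\begin{array}{rrr|rrr}
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & -1 & 1 & 0 \\
0 & 1 & 1 & -1 & 0 & 1
\end{array}\right]
\xrightarrow{R_3 - R_2}
\left[\begin{array}{rrr|rrr}
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & -1 & 1 & 0 \\
0 & 0 & 1 & 0 & -1 & 1
\end{array}\right].
\]
Hence
\[
A^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}.
\]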
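The Gauss-Jordan procedure can be sketched as a short program. This is an illustration under assumptions not made in the notes: exact `Fraction` arithmetic, a fixed left-to-right pivoting order, and a hypothetical function name `inverse` that returns `None` when no pivot can be found (so the matrix is not invertible).

```python
from fractions import Fraction


def inverse(A):
    """Row-reduce [A | I] to [I | A^{-1}]; return None if A is not invertible."""
    n = len(A)
    # Build the n x 2n augmented matrix [A | I].
    M = [[Fraction(entry) for entry in row] +
         [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        r = next((i for i in range(col, n) if M[i][col] != 0), None)
        if r is None:
            return None  # the rcf of A has a zero row, so A is not invertible
        M[col], M[r] = M[r], M[col]                  # type II
        pivot = M[col][col]
        M[col] = [entry / pivot for entry in M[col]]  # type III
        for i in range(n):
            if i != col and M[i][col] != 0:
                f = M[i][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[col])]  # type I
    return [row[n:] for row in M]


A = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
A_inv = inverse(A)
# A_inv equals [[1, 0, 0], [-1, 1, 0], [0, -1, 1]]
singular = inverse([[1, 0], [0, 0]])
# singular is None, matching Remark 1.1.8 (4)
```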