(2.1)
In terms of the x1–x2 plane, solutions to this equation correspond to points along a line.
Suppose now that we have two linear equations
2x1 + x2 − 5 = 0                                                           (2.2a)
x1 − x2 + 1 = 0.                                                           (2.2b)
We refer to this as a system of two linear equations. If we want to know the solution to this system of equations we can find it graphically by plotting the two lines and finding the point of intersection. In this particular case, there is a unique solution x1 = 4/3, x2 = 7/3. With this geometrical interpretation in mind, however, it is not hard to imagine that a system of two linear equations could have no solutions, one solution, or infinitely many solutions.
Geometrically a system of two linear equations with no solutions would correspond to two
parallel lines. An example is
2x1 + x2 − 5 = 0                                                           (2.3a)
2x1 + x2 − 6 = 0.                                                          (2.3b)
A system of two linear equations that has infinitely many solutions corresponds to two lines that coincide at every point. An example is

2x1 + x2 − 5 = 0                                                           (2.4a)
4x1 + 2x2 − 10 = 0.                                                        (2.4b)
This geometrical view of systems of linear equations can be extended to systems with
three equations and three unknowns. The equation
x1 + x2 + x3 = 5                                                           (2.5)
represents a plane in three dimensions. Two such equations could correspond to two planes intersecting along a line, two planes that are coincident, or two parallel planes which do not intersect at all. Given three
such equations, one could imagine three planes intersecting at a single point, on a line, on a
coincident plane, or not at all.
Our geometric intuition suggests that systems of linear equations can have no solutions, one solution, or an infinite number of solutions. We shall see that this is in fact the general case. We define a general linear system with m equations and n unknowns as the set of equations
3. Replace any equation with the sum of that equation and a multiple of another
equation.
Applying these operations to the system of linear equations will change the form of the
system but not the solution of the system. Be careful to notice that operation number 3
does not allow one to add up two equations and replace a third equation by the result. The
equation that gets replaced by operation number 3 must be one of the equations involved in
the steps. With this in mind, you should think about why operation 3 leaves the solution
unchanged.
Let us illustrate our method with an example. Suppose we have the following linear
system
x1 + 3x2 − x3 = 1                                                          (2.9a)
3x1 + 6x2 + 2x3 = −3                                                       (2.9b)
3x1 + 4x2 − 4x3 = 7.                                                       (2.9c)
The idea is to simplify the system of equations by using the elementary operations to a point where the solution is easily identified. We outline two such procedures below.
The first method is known as Row Echelon Reduction or the Row Echelon Method. The general idea of this approach is to eliminate the variables x1, x2 and x3 in as many places as possible. Our first step is to multiply Eq. (2.9a) by −3 and add the resulting equation to Eq. (2.9b). Replace Eq. (2.9b) by the result. Here we have used the third elementary operation. The next step is to multiply Eq. (2.9a) by −3 and add the resulting equation to Eq. (2.9c). Replace Eq. (2.9c) by the result. This gives
x1 + 3x2 − x3 = 1                                                          (2.10a)
   − 3x2 + 5x3 = −6              [−3 ROW1 + ROW2]                          (2.10b)
   − 5x2 − x3 = 4.               [−3 ROW1 + ROW3]                          (2.10c)
In the brackets above we write the steps used in simplifying the system. By performing these steps, we have eliminated the variable x1 in all but the first equation. Our next task is to try
to eliminate the variable x2 in all but the second equation, using the elementary operations.
To do this, we replace Eq. (2.10a) by the sum of Eqs. (2.10a) and (2.10b). We then replace
Eq. (2.10c) by −5/3 times Eq. (2.10b) plus Eq. (2.10c). These operations result in the
system
x1        + 4x3 = −5             [ROW2 + ROW1]                             (2.11a)
   − 3x2 + 5x3 = −6                                                        (2.11b)
      − (28/3)x3 = 14.           [−5/3 ROW2 + ROW3]                        (2.11c)
We have now eliminated the variable x2 in all but the second equation. Before attempting to eliminate the variable x3 in all but the third equation, we simplify the system by multiplying Eq. (2.11b) by −1/3 and Eq. (2.11c) by −3/28. This gives

x1        + 4x3 = −5                                                       (2.12a)
     x2 − (5/3)x3 = 2            [−1/3 ROW2]                               (2.12b)
              x3 = −3/2.         [−3/28 ROW3]                              (2.12c)
We now eliminate the variable x3 everywhere but in Eq. (2.12c). We replace Eq. (2.12a) by −4 times Eq. (2.12c) plus Eq. (2.12a). We replace Eq. (2.12b) by 5/3 times Eq. (2.12c) plus Eq. (2.12b). The result is the system
x1           = 1                 [−4 ROW3 + ROW1]                          (2.13a)
     x2      = −1/2              [5/3 ROW3 + ROW2]                         (2.13b)
          x3 = −3/2.                                                       (2.13c)
The solution of the original system of equations is therefore just (x1, x2, x3) = (1, −1/2, −3/2), which can be checked by substituting these values back into the original equations. This approach is the basic idea of the Row Echelon Method.
A similar approach called Gaussian Elimination uses the same elementary operations but simplifies the original system of equations given by Eq. (2.9) to the form

x1 + 3x2 − x3 = 1                                                          (2.14a)
   − 3x2 + 5x3 = −6                                                        (2.14b)
      − (28/3)x3 = 14.                                                     (2.14c)
That is, one equation involves just x3, another involves just x2 and x3, and the third equation involves all three variables. The solution is then obtained by first solving for x3 using equation (2.14c), then substituting it into equation (2.14b) and solving for x2, and then substituting the results for both x3 and x2 into equation (2.14a) and solving for x1, a method referred to as back substitution.
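Back substitution is straightforward to carry out programmatically; a minimal sketch for the triangular system (2.14):

```python
from fractions import Fraction

# Triangular system (2.14):
#   x1 + 3 x2 -  x3     =  1
#       -3 x2 + 5 x3    = -6
#           -(28/3) x3  = 14
x3 = Fraction(14) / Fraction(-28, 3)          # from (2.14c)
x2 = (Fraction(-6) - 5 * x3) / Fraction(-3)   # from (2.14b)
x1 = Fraction(1) - 3 * x2 + x3                # from (2.14a)
print(x1, x2, x3)  # 1 -1/2 -3/2
```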
3. Homogeneous Systems
The systems considered in the previous section were nonhomogeneous (the system has at
least one nonzero constant term). That is, if we examine the system given by Eq. (2.6), the
system is nonhomogeneous if any one (or more) of the terms bi is nonzero. A homogeneous
system is one in which all of the constant terms bi are equal to zero. An example of a
homogeneous system is the system
x1 + 3x2 − x3 = 0                                                          (3.1a)
3x1 + 6x2 + 2x3 = 0                                                        (3.1b)
3x1 + 4x2 − 4x3 = 0.                                                       (3.1c)
Every homogeneous system has at least the trivial solution, in which all of the unknowns are zero. On the other hand, a homogeneous system may or may not have nontrivial solutions (a nontrivial solution is one in which the unknowns are not all zero). We will return to this issue later, but for now we demonstrate this by considering specific examples.
In the case of the system (3.1), we can apply elementary operations to simplify it to
x1 + 3x2 − x3 = 0                                                          (3.2a)
   − 3x2 + 5x3 = 0                                                         (3.2b)
          x3 = 0.                                                          (3.2c)
This system is solved only for x1 = x2 = x3 = 0. Therefore, this particular example has only
the trivial solution.
Let us consider the following example.
x1 + 5x2 + x3 = 0                                                          (3.3a)
2x1 + x2 − x3 = 0.                                                         (3.3b)
This is a homogeneous system and therefore has as one solution the trivial solution. Notice
that this system has three unknowns and two equations. We can apply elementary operations
to this system to obtain
x1 + 5x2 + x3 = 0                                                          (3.4a)
   − 9x2 − 3x3 = 0.                                                        (3.4b)
The second equation is solved whenever x2 = −(1/3)x3 and the first is then solved whenever x1 = (2/3)x3. Therefore, if we choose x3 to be any constant, a, a solution of the following form can be found: (x1, x2, x3) = ((2/3)a, −(1/3)a, a). In particular we note that both zero and nonzero values of a are allowed. Therefore, the system (3.3) has both trivial and nontrivial solutions. In fact, since a can be any number, there are an infinite number of nontrivial solutions.
Theorem 3.1. (Rabenstein1, Thm. 2.1) A system of linear homogeneous equations in which the number of variables (unknowns) exceeds the number of equations always has an infinite number of nontrivial solutions.
Example 3.1. (Rabenstein1, section 2.1, problem 5) Find the solution(s) to the following system using elementary operations.

3x1 − x2 = 4                                                               (3.5a)
6x1 − 2x2 = 8                                                              (3.5b)
−9x1 + 3x2 = −12.                                                          (3.5c)
Note that geometrically these equations correspond to three coincident lines. Using elementary operations we obtain

3x1 − x2 = 4                                                               (3.6a)
      0 = 0                      [−2 ROW1 + ROW2]                          (3.6b)
      0 = 0.                     [3 ROW1 + ROW3]                           (3.6c)
The solutions are given by (x1, x2) = ((1/3)(4 + a), a) where a is a constant. Another way to express the solution is just

x1 = 4/3 + (1/3)x2.                                                        (3.7)

There are an infinite number of nontrivial solutions to this system of equations.
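The claim that every member of this one-parameter family solves all three equations can be spot-checked in a few lines:

```python
from fractions import Fraction

# Check that (x1, x2) = ((4 + a)/3, a) satisfies all three equations (3.5)
# for several choices of the free constant a.
for a in [Fraction(0), Fraction(1), Fraction(-5), Fraction(7, 2)]:
    x1, x2 = (4 + a) / 3, a
    assert 3 * x1 - x2 == 4
    assert 6 * x1 - 2 * x2 == 8
    assert -9 * x1 + 3 * x2 == -12
print("all checks passed")
```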
2x1 − 3x2 + 2x3 + 2x4 = 3                                                  (4.1a)
3x1 − 5x2 + x3 + 3x4 = −2                                                  (4.1b)
4x1 − 3x2 + x3 + 2x4 = 2                                                   (4.1c)
We define the coefficient matrix (or just matrix) A for the above system by

A = [ 2  −3  2  2 ]
    [ 3  −5  1  3 ]                                                        (4.2)
    [ 4  −3  1  2 ]
This is a 3 × 4 matrix; it has 3 rows and 4 columns. One can think of the matrix as being made up of column vectors

[ 2 ]   [ −3 ]   [ 2 ]   [ 2 ]
[ 3 ]   [ −5 ]   [ 1 ]   [ 3 ]                                             (4.3)
[ 4 ]   [ −3 ]   [ 1 ]   [ 2 ]
or as being made up of row vectors

[ 2  −3  2  2 ]   [ 3  −5  1  3 ]   [ 4  −3  1  2 ]                        (4.4)
We also define the vectors

~b = [  3 ]        ~x = [ x1 ]
     [ −2 ]             [ x2 ]                                             (4.5)
     [  2 ]             [ x3 ]
                        [ x4 ]

so that the linear system (4.1) can be written compactly as

A~x = ~b.                                                                  (4.6)
In order to do this, we need to understand how to multiply a matrix and a vector. That is, we need to know what A~x means. We have
A~x = [ 2  −3  2  2 ] [ x1 ]   [ 2x1 − 3x2 + 2x3 + 2x4 ]
      [ 3  −5  1  3 ] [ x2 ] = [ 3x1 − 5x2 + x3 + 3x4  ].                  (4.7)
      [ 4  −3  1  2 ] [ x3 ]   [ 4x1 − 3x2 + x3 + 2x4  ]
                      [ x4 ]
Notice that the product can also be written as a linear combination of the columns of the matrix,

[ 2  −3  2  2 ] [ x1 ]      [ 2 ]      [ −3 ]      [ 2 ]      [ 2 ]
[ 3  −5  1  3 ] [ x2 ] = x1 [ 3 ] + x2 [ −5 ] + x3 [ 1 ] + x4 [ 3 ].       (4.8)
[ 4  −3  1  2 ] [ x3 ]      [ 4 ]      [ −3 ]      [ 1 ]      [ 2 ]
                [ x4 ]
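Both views of the matrix–vector product, row-by-row as in Eq. (4.7) and as a combination of columns as in Eq. (4.8), are easy to compare in code; a small sketch (the test vector here is an arbitrary choice of ours):

```python
# Matrix-vector product computed two equivalent ways: row-by-row dot
# products as in Eq. (4.7), and as a linear combination of the columns
# of A as in Eq. (4.8).
A = [[2, -3, 2, 2],
     [3, -5, 1, 3],
     [4, -3, 1, 2]]
x = [1, 2, 0, -1]  # an arbitrary test vector

by_rows = [sum(A[i][j] * x[j] for j in range(4)) for i in range(3)]
by_cols = [0, 0, 0]
for j in range(4):
    for i in range(3):
        by_cols[i] += x[j] * A[i][j]  # x_j times column j of A

print(by_rows, by_cols)  # [-6, -10, -4] [-6, -10, -4]
```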
The system (4.1) can also be represented by the augmented matrix

[ 2  −3  2  2 |  3 ]
[ 3  −5  1  3 | −2 ]                                                       (4.9)
[ 4  −3  1  2 |  2 ]

where the last column in this matrix represents the right-hand side of Eq. (4.1). The same rules of Gaussian Elimination or Row Echelon reduction apply, only the notation is simplified somewhat.
MATRIX–VECTOR MULTIPLICATION: Not all matrices and vectors can be multiplied together. The dimension of each must be appropriate. Examples are given below:

A = [ a11  a12 ]     B = [ b11  b12  b13 ]     C = [ c11  c12  c13 ]
    [ a21  a22 ]         [ b21  b22  b23 ]         [ c21  c22  c23 ]       (4.10)
                         [ b31  b32  b33 ]

~x = [ x1 ]          ~y = [ y1 ]                                           (4.11)
     [ x2 ]               [ y2 ]
     [ x3 ]

Based on the above matrices and vectors, the matrix–vector products

A~y,   B~x,   C~x                                                          (4.12)

are defined, while the products A~x, B~y, and C~y are not. Similarly, two matrices can be multiplied together if their dimensions are appropriate. For example, using the matrices defined in Eq. (4.10), we have that
"
#"
#
a11 a12 c11 c12 c13
AC = a21 a22 c21 c22 c23
"
#
(a11c11 + a12 c21 ) (a11 c12 + a12 c22) (a11 c13 + a12 c23 )
= (a c + a c ) (a c + a c ) (a c + a c )
(4.13)
21 11
22 21
21 12
22 22
21 13
22 23
so that the matrix product AC is a 2 3 matrix. In a similar fashion, one can de ne the
matrix product CB . We note, however, that CA, AB , BA, and BC are not de ned. In
general, matrix{matrix multiplication is not commutative. In fact, as we see in the above
examples, AC can be de ned while CA is not!
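The entry-by-entry rule illustrated in Eq. (4.13) extends to any compatible pair of matrices; a minimal sketch (the numerical matrices here are sample values of ours, not from the notes):

```python
# A general matrix-matrix product: the (i, k) entry of AC is the dot
# product of row i of A with column k of C, exactly as in Eq. (4.13).
def matmul(A, C):
    rows, inner, cols = len(A), len(C), len(C[0])
    assert len(A[0]) == inner, "inner dimensions must agree"
    return [[sum(A[i][j] * C[j][k] for j in range(inner))
             for k in range(cols)] for i in range(rows)]

A = [[1, 2], [3, 4]]            # 2 x 2
C = [[5, 6, 7], [8, 9, 10]]     # 2 x 3
print(matmul(A, C))  # [[21, 24, 27], [47, 54, 61]] -- a 2 x 3 matrix
```

Attempting `matmul(C, A)` would trip the dimension check, mirroring the remark that AC can be defined while CA is not.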
Example: consider the two 2 × 2 matrices

A = [ a11  a12 ]        B = [ b11  b12 ].                                  (4.14)
    [ a21  a22 ]            [ b21  b22 ]

The two matrices are equal to each other, A = B, if a11 = b11, a12 = b12, etc.
We can add and subtract matrices of the same size,

A + B = [ a11 + b11   a12 + b12 ]                                          (4.15)
        [ a21 + b21   a22 + b22 ]

A − B = [ a11 − b11   a12 − b12 ],                                         (4.16)
        [ a21 − b21   a22 − b22 ]

and we can multiply a matrix by a scalar r,

rA = [ r a11   r a12 ].                                                    (4.17)
     [ r a21   r a22 ]
As noted above, matrix multiplication is in general not commutative,

AB ≠ BA,                                                                   (4.18)

but it is associative,

(AB)C = A(BC),                                                             (4.19)

as long as each matrix multiplication is defined.
SPECIAL MATRICES: The identity matrix is a square matrix (that is, n × n) given by (for the 3 × 3 case)

I = [ 1  0  0 ]
    [ 0  1  0 ].                                                           (4.20)
    [ 0  0  1 ]

In general, it has the entry 1 along its diagonal and 0 everywhere else. The identity matrix has the properties that

AI = IA = A                                                                (4.21a)
~bI = I~b = ~b.                                                            (4.21b)
The matrix

A = [ a11  a12  a13 ]
    [ a21  a22  a23 ]                                                      (4.22)
    [ a31  a32  a33 ]

has the transpose

AT = [ a11  a21  a31 ]
     [ a12  a22  a32 ].                                                    (4.23)
     [ a13  a23  a33 ]
That is, it has its elements "flipped" about the diagonal, so taking the transpose changes column one into row one and vice versa, column two into row two and vice versa, etc. The transpose can also be defined for non-square matrices (i.e. m × n matrices where m ≠ n). We also have that (AT)T = A, (A + B)T = AT + BT and (AB)T = BT AT. Note that if A is symmetric then AT = A.
A diagonal matrix has all of its off-diagonal entries equal to zero, e.g.

A = [ a11   0    0  ]
    [  0   a22   0  ].                                                     (4.24)
    [  0    0   a33 ]
INVERSE OF A MATRIX: Another special matrix is the inverse matrix. Suppose we have the linear system

A~x = ~b                                                                   (4.25)

where A is an n × n matrix and ~x and ~b are n-dimensional vectors. If n = 1, so the system is just a single equation, then all we really have is

ax = b.                                                                    (4.26)

In this case, the way we solve for x is to divide both sides by a, or equivalently multiply both sides by 1/a or a−1:

x = a−1 b.                                                                 (4.27)
Can we do something similar for the matrix equation (4.25)? That is, can we find a matrix A−1 such that

~x = A−1~b?                                                                (4.29)

The answer is that sometimes we can and sometimes we cannot. Clearly if A = 0 (in analogy with the scalar case) we should expect that A−1 would not be defined, but it turns out that there are non-zero matrices for which this is true also.
We say that an n × n matrix A is invertible if there exists a matrix B such that BA = I and AB = I. There is at most one such B, called the inverse of A, and it is denoted by A−1 (note that if A is n × n then so is A−1). So if A is invertible, then

A−1(A~x) = A−1~b
(A−1A)~x = A−1~b
      I~x = A−1~b
       ~x = A−1~b                                                          (4.30)

where the bottom line represents the solution to our system of linear equations. That is, ~x can be obtained by a matrix–vector multiplication of A−1 and ~b. For future reference we say that A is invertible iff it is nonsingular, and A is not invertible iff it is singular.
The next question we ask is: if A is invertible, how do we find A−1? To address the above question, suppose we have the 2 × 2 matrix

A = [ 1  2 ].                                                              (4.31)
    [ 3  4 ]
The inverse has the property that AA−1 = I. Think of A−1 as made up of column vectors,

A−1 = [ ~c1  ~c2 ].                                                        (4.32)

Then the condition AA−1 = I reads

[ 1  2 ] [ ~c1  ~c2 ] = [ 1  0 ]                                           (4.33)
[ 3  4 ]                [ 0  1 ]

or, taking one column at a time,

[ 1  2 ] ~c1 = [ 1 ]                                                       (4.34a)
[ 3  4 ]       [ 0 ]

[ 1  2 ] ~c2 = [ 0 ]                                                       (4.34b)
[ 3  4 ]       [ 1 ]
where we would like to solve for ~c1 and ~c2 simultaneously. We can do so by using the following technique. First form the extended matrix

[ 1  2 | 1  0 ].                                                           (4.35)
[ 3  4 | 0  1 ]

Next, use elementary operations on the rows of this matrix to reduce the left two columns to be the identity matrix, if possible. That is, form the matrix

[ 1  0 | i11  i12 ]                                                        (4.36)
[ 0  1 | i21  i22 ]

where i11, i12, i21, i22 are determined as the elementary row operations are performed. This is nothing more than performing the Row Echelon Method to simultaneously find ~c1 and ~c2. Hence, we have

A−1 = [ i11  i12 ].                                                        (4.37)
      [ i21  i22 ]
So for the above example, we have

[ 1  2 | 1  0 ]
[ 3  4 | 0  1 ]

[ 1   2 |  1  0 ]                [−3 ROW1 + ROW2]
[ 0  −2 | −3  1 ]

[ 1   0 | −2  1 ]                [ROW2 + ROW1]
[ 0  −2 | −3  1 ]

[ 1  0 | −2    1   ]             [−1/2 ROW2]                               (4.38)
[ 0  1 | 3/2  −1/2 ]

so that

A−1 = [ −2    1   ].                                                       (4.39)
      [ 3/2  −1/2 ]

As a check, note that

AA−1 = [ 1  2 ] [ −2    1   ] = [ 1  0 ] = I.                              (4.40)
       [ 3  4 ] [ 3/2  −1/2 ]   [ 0  1 ]
This method works, at least in principle, for any invertible n × n matrix. A general result for the inverse of a 2 × 2 matrix is given as follows:

Theorem 4.1. If

A = [ a  b ]
    [ c  d ]

then

A−1 = 1/(ad − bc) [  d  −b ]
                  [ −c   a ]

as long as ad − bc ≠ 0.
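Theorem 4.1 translates directly into code; a minimal sketch, applied to the matrix of Eq. (4.31):

```python
from fractions import Fraction

# Inverse of a 2 x 2 matrix via the formula of Theorem 4.1,
# valid only when the quantity ad - bc is nonzero.
def inverse_2x2(a, b, c, d):
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

Ainv = inverse_2x2(Fraction(1), Fraction(2), Fraction(3), Fraction(4))
print(Ainv)  # [[Fraction(-2, 1), Fraction(1, 1)], [Fraction(3, 2), Fraction(-1, 2)]]
```

This reproduces the matrix found by row reduction in Eq. (4.39).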
Example: Find the inverse of

A = [ 1   0  2 ]
    [ 2  −3  4 ].                                                          (4.41)
    [ 0   2  1 ]

We form the extended matrix

[ 1   0  2 | 1  0  0 ]
[ 2  −3  4 | 0  1  0 ]                                                     (4.42)
[ 0   2  1 | 0  0  1 ]
Row reduction then proceeds as follows:

[ 1   0  2 |  1  0  0 ]
[ 0  −3  0 | −2  1  0 ]          [−2 ROW1 + ROW2]
[ 0   2  1 |  0  0  1 ]

[ 1   0  2 |  1     0    0 ]
[ 0  −3  0 | −2     1    0 ]
[ 0   0  1 | −4/3  2/3   1 ]     [2/3 ROW2 + ROW3]

[ 1  0  0 | 11/3  −4/3  −2 ]     [−2 ROW3 + ROW1]
[ 0  1  0 |  2/3  −1/3   0 ]     [−1/3 ROW2]
[ 0  0  1 | −4/3   2/3   1 ]
so

A−1 = [ 11/3  −4/3  −2 ]
      [  2/3  −1/3   0 ].                                                  (4.43)
      [ −4/3   2/3   1 ]
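The extended-matrix procedure generalizes to any size; a minimal sketch using exact rational arithmetic (it assumes no zero pivot is encountered, so no row swaps are needed):

```python
from fractions import Fraction

# Invert a matrix by row-reducing the extended matrix [A | I], the same
# procedure carried out by hand above. Assumes every pivot is nonzero,
# so no row swaps are performed.
def inverse(A):
    n = len(A)
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n):
        piv = M[i][i]
        M[i] = [entry / piv for entry in M[i]]
        for j in range(n):
            if j != i:
                f = M[j][i]
                M[j] = [a - f * b for a, b in zip(M[j], M[i])]
    return [row[n:] for row in M]

Ainv = inverse([[1, 0, 2], [2, -3, 4], [0, 2, 1]])
assert Ainv == [[Fraction(11, 3), Fraction(-4, 3), Fraction(-2)],
                [Fraction(2, 3), Fraction(-1, 3), Fraction(0)],
                [Fraction(-4, 3), Fraction(2, 3), Fraction(1)]]  # matches (4.43)
```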
5. Determinants
Recall from our first discussion of the inverse of a matrix that for 1 × 1 matrices (i.e. scalars) a−1 exists if a ≠ 0. The above theorem tells us that for general n × n matrices, A−1 exists if detA ≠ 0. The following is hence also true.

Theorem 5.2. The homogeneous system A~x = ~0 has nontrivial solutions if and only if detA = 0. When detA = 0, A is singular, and the matrix A does not have an inverse.

Determinants play key roles in numerous situations that arise in the context of linear algebra and differential equations. As can be seen from the above two theorems, the determinant of a matrix provides information about a system of linear equations before one actually tries to solve the system. We now focus on how to calculate detA.
We will start by giving an operational definition of the determinant of 2 × 2 and 3 × 3 systems before stating a general method for finding detA for n × n systems.
If A is a 2 × 2 matrix

A = [ a  b ]                                                               (5.1)
    [ c  d ]

then

detA = | a  b | = ad − bc.                                                 (5.2)
       | c  d |

If A is a 3 × 3 matrix

A = [ a11  a12  a13 ]
    [ a21  a22  a23 ]                                                      (5.3)
    [ a31  a32  a33 ]
which is essentially a checkerboard pattern starting with a plus sign in the upper left corner. Now, to write down the determinant for this general case, we again start by choosing any row or any column (as we'll see, it is often advantageous to choose a row or column with lots of zeros in it, if such a row or column is present). If we choose the first row, say, the determinant is given by

detA = a11 M11 − a12 M12 + a13 M13 + ... + (−1)^(n+1) a1n M1n              (5.8)

where M11 through M1n are minors (e.g. M11 is the determinant of the submatrix formed by deleting row 1 and column 1 from the original matrix, while M1n is the determinant of the submatrix formed by deleting row 1 and column n of the original matrix). This is a recursive formula in the sense that the Mij are themselves determinants of smaller matrices which can be expressed by formulas similar to equation (5.8). The following definition and theorem state this result.
Definition 5.1. The cofactor of element aij is (−1)^(i+j) Mij, where Mij is the determinant of the submatrix formed by deleting the row and column occupied by the element aij.

Theorem. detA may be computed by choosing any row or column vector of A, multiplying each component by its cofactor and adding these products.
It is worth pointing out that this method in principle allows one to hand calculate the determinant of any n × n matrix, although the number of arithmetic operations grows like n! for general matrices. In practice, determinants of most matrices larger than 3 × 3, depending on the exact nature of the matrix, should probably be calculated using software such as Matlab or Mathematica. Huge matrices, which appear frequently in applications, almost always require computational tools.
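The recursive cofactor expansion of Eq. (5.8) can be written in a few lines; a minimal sketch (for simplicity it always expands along the first row, rather than choosing the row or column with the most zeros):

```python
# Determinant by cofactor expansion along the first row, Eq. (5.8).
# The number of operations grows like n!, so this is only practical
# for small matrices.
def det(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor M_1j: delete row 1 and column j of A.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))                   # -2, i.e. ad - bc
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))  # 24, the product of the diagonal
```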
Example 5.1. Find detA of

A = [ 1  2  −1 ]
    [ 4  1   2 ].                                                          (5.9)
    [ 1  1  −3 ]

Expanding along the first row gives

detA = (1) | 1  2 | − (2) | 4  2 | + (−1) | 4  1 | = 20.                   (5.10)
           | 1 −3 |       | 1 −3 |        | 1  1 |

Since the determinant is nonzero, the matrix A is nonsingular and has an inverse.
Example 5.2. Find detA of

A = [ 1  0  −1 ]
    [ 4  0   2 ].                                                          (5.11)
    [ 1  1  −3 ]

Here we choose the second column since it has two zeros:

detA = −(1) | 1  −1 | = −6.                                                (5.12)
            | 4   2 |

Again since the determinant is nonzero, the matrix A is nonsingular and has an inverse.
Example 5.3. Find detA of

A = [ 1  0  1   0 ]
    [ 0  1  2  −1 ]
    [ 0  4  1   2 ].                                                       (5.13)
    [ 1  1  1  −3 ]

Here we choose the first row since it has two zeros (the first column also would be a good choice).

detA = (1) | 1  2  −1 |   + (1) | 0  1  −1 |
           | 4  1   2 |         | 0  4   2 |
           | 1  1  −3 |         | 1  1  −3 |

     = 20 + (1) | 1  −1 | = 20 + 6 = 26.                                   (5.14)
                | 4   2 |

The first 3 × 3 determinant appearing here we recognize from example 5.1.
If A is diagonal then detA = a11 a22 a33 · · · ann (i.e. the determinant is the product of the diagonal elements).
If A has a row (or column) with all zero entries, then detA = 0.
If two rows (or two columns) of A are interchanged, then the determinant of the resulting matrix is −detA.
If two rows (or two columns) of A are identical, then detA = 0.
As an exercise, write down some 2 × 2, 3 × 3 and 4 × 4 matrices and convince yourself that the above properties are true.
6. Wronskian
Recall that in our discussion of solutions to 2nd order O.D.E.'s, we were concerned with whether two given solutions to an equation were linearly independent. Our working definition was that two solutions y1(x) and y2(x) are linearly independent if the equation
c1 y1(x) + c2 y2(x) = 0                                                    (6.1)

holds only for c1 = c2 = 0. If there exist nonzero c1 and c2 such that this equation holds for all x, then the functions are linearly dependent. Given the above Eq. (6.1) it is also true that we can take its derivative to obtain

c1 y1'(x) + c2 y2'(x) = 0.                                                 (6.2)
If we write Eqs. (6.1) and (6.2) in matrix notation we have

[ y1(x)   y2(x)  ] [ c1 ] = ~0                                             (6.3)
[ y1'(x)  y2'(x) ] [ c2 ]

or A~c = ~0. Note that by Thm. 5.2 this equation has nontrivial solutions ~c if and only if

detA = | y1(x)   y2(x)  | = 0   for all x.                                 (6.4)
       | y1'(x)  y2'(x) |

The determinant of A is, in general, a function of x and is called the Wronskian (this is the 2 × 2 case) of the functions y1(x) and y2(x) at the point x.
The above definitions of linear independence and Wronskian generalize to any number of functions.

Definition 6.1. In general if f1(x), f2(x), f3(x), ..., fm(x) are defined functions with at least m − 1 derivatives on some interval, we define the Wronskian of these functions to be

W = det [ f1(x)         f2(x)         f3(x)        ...  fm(x)        ]
        [ f1'(x)        f2'(x)        f3'(x)       ...  fm'(x)       ]
        [   .             .             .                 .          ]     (6.5)
        [ f1^(m−1)(x)   f2^(m−1)(x)   f3^(m−1)(x)  ...  fm^(m−1)(x)  ]

where the superscripts in parentheses in the last row indicate the (m − 1) derivative of the function.
Theorem 6.1. Consider the functions f1(x), f2(x), ..., fm(x). If these functions are linearly dependent on some interval, then the Wronskian is zero for all values of x on that interval. THEREFORE, if W ≠ 0 for at least one value of x on the interval, then the functions are linearly independent on that interval (see also Theorem 3.3.1 in Boyce and DiPrima4).
CAUTION: If W = 0, it is not necessarily true that the functions are linearly dependent. As an example, consider the two functions g1(x) = x^2 and g2(x) = x|x|, which are defined for all values of x. These have W = 0 but are linearly independent on the interval −1 < x < 1.
This exception is removed under the conditions of the next theorem.
Theorem 6.2. (Boyce and DiPrima4, Thm. 3.3.3) Let y1 and y2 be solutions to y'' + p(t)y' + q(t)y = 0 where p and q are continuous on an open interval I. Then y1 and y2 are linearly independent if and only if W(y1, y2) is never zero on the interval. Alternatively, y1 and y2 are linearly dependent if and only if W(y1, y2) = 0 everywhere on I.
We can conclude from these two theorems that if the Wronskian of two functions is nonzero somewhere on an interval, then those functions are linearly independent on that interval. However, if the Wronskian of two functions is zero everywhere on some interval, you can be assured that those functions are linearly dependent only if they are also solutions to a linear ODE as described in Thm. 6.2.
Example 6.1. Show that the functions f(x) = x^4, g(x) = x^3, and h(x) = x^2 are linearly independent. The Wronskian is

W = det [ x^4     x^3     x^2 ]
        [ 4x^3    3x^2    2x  ] = −2x^6.                                   (6.6)
        [ 12x^2   6x      2   ]

Since we can identify values of x for which W ≠ 0, these functions are linearly independent.
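As a numerical spot-check of Eq. (6.6), one can evaluate the Wronskian determinant at a particular point, say x = 1, where the formula predicts W = −2:

```python
# Evaluate the Wronskian of f = x^4, g = x^3, h = x^2 at x = 1 by
# computing the 3 x 3 determinant directly; Eq. (6.6) predicts
# W(1) = -2 * 1**6 = -2.
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

x = 1
W = det3([[x**4, x**3, x**2],
          [4 * x**3, 3 * x**2, 2 * x],
          [12 * x**2, 6 * x, 2]])
print(W)  # -2
```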
Example 6.2. Consider the equation

c1 f(x) + c2 g(x) + c3 h(x) = 0                                            (6.8)

for all x. Notice that for the above functions, this equation is satisfied for c1 = 1, c2 = −1, and c3 = −1; that is, g(x) + h(x) = f(x). Therefore, these functions are linearly dependent.
Exercise 6.1. (Rabenstein1, section 3.4, problem 1f) Determine whether the functions x and e^x are linearly independent.
7. Vector Spaces
We have looked at linear systems, how to solve them, how to write them using matrices and vectors, and we have identified some properties of the systems by computing determinants. In the following section we shall describe the "space" in which the solutions to these problems reside.
Two of the most familiar "spaces" are R^2 and R^3. One can think of R^2 mathematically as the set of ordered pairs (x, y), or geometrically as the space defined by the x–y plane. Here we can define vectors ~x1 = (x1, y1) and ~x2 = (x2, y2) which "reside" in R^2. Two important points to notice about R^2 are that if ~x1 and ~x2 are elements in this space, then
their sum ~x1 + ~x2 is also a member of this space (Closure Under Addition), and
any scalar multiple r~x1 is also a member of this space (Closure Under Multiplication).
This is saying that if two vectors in R^2 are added together, or if a vector in R^2 is multiplied by a scalar, the result is still a vector in R^2.
The situation is similar if we consider the three-dimensional space R^3. Here we have vectors ~x1 = (x1, y1, z1), ~x2 = (x2, y2, z2), etc. Again, ~x1 + ~x2 is in R^3 and r~x1 is in R^3. The same arguments extend to vectors in R^n, although the familiar geometrical view begins to break down when n > 3.
R^1, R^2, R^3, ..., R^n are examples of vector spaces. As we shall see below in the following formal definition, a vector space is defined not so much by the nature of its elements, but rather by the nature of the operations applied to its elements.
Definition 7.1. Let V be a set of elements that can be added together or multiplied by scalars. V is a vector space (or linear space) if for arbitrary elements ~u, ~v, ~w in V and arbitrary scalars a, b, c the following properties are satisfied:
1. ~u + ~v is in V (closure under addition)
2. a~u is in V (closure under multiplication)
3. ~u + ~v = ~v + ~u (commutative property)
4. (~u + ~v) + ~w = ~u + (~v + ~w) (associative property)
5. There is a zero element ~0 in V such that ~u + ~0 = ~u
6. For every ~u in V there is another element −~u in V such that ~u + (−~u) = ~0
7. c(~u + ~v) = c~u + c~v
8. (a + b)~u = a~u + b~u
9. a(b~u) = (ab)~u
10. 1~u = ~u.
If you go through this list with R^2, R^3 or in general R^n in mind, for example, you should be able to convince yourself that each of the ten properties holds.
The notion of a vector space is not limited to sets whose elements are geometrical vectors (in the usual way of thinking of vectors). The following examples show some other collections of elements which are vector spaces and some collections of elements which are not.

Example 7.1. The set of all polynomials forms a vector space. Examples of elements in the set are

u(x) = 7 − 3x + x^3
w(x) = 5x − x^2 + 2x^3 + x^5.

If we add any two polynomials together we get another polynomial (so point 1 is satisfied). If we multiply any polynomial by a scalar we get a polynomial (so point 2 is satisfied). The commutative and associative properties hold for polynomials (points 3 and 4). There is a zero element, namely 0 (point 5). The negative of a polynomial is a polynomial, and adding the two together gives 0 (point 6). Properties 7 through 10 can also be confirmed for polynomials.
Example 7.2. The set of all real-valued functions that are defined and have a continuous derivative on an interval (a, b) forms a vector space. Here we think of the elements ~u, ~v and ~w as representing the functions f(x), g(x) and h(x), for example. Again, a mental check confirms the above 10 conditions.

Example 7.3. The set of odd integers does not form a vector space. To see this note that 1 and 3 are both odd integers, but their sum 1 + 3 = 4 is not an odd integer. This set is not closed under addition and is therefore not a vector space.
Why do we worry about vector spaces? They can be used to describe the space where solutions to linear systems (algebraic or differential systems) reside. In the case of algebraic systems, if we know that the solution to a problem resides in R^2 (the solutions are vectors in a plane) then we know not to look for solutions in R^3 (which includes vectors that can point out of the above plane). In the case of differential equations, knowing the "size" or dimension of a vector space essentially tells us when we can stop looking for solutions (i.e. when we have found all that exist). Often solving differential equations involves making good guesses about the solution, so it is advantageous to have all the clues we can get.
8. Subspaces
Basically, subspaces are vector spaces, only smaller.
Definition 8.1. A subset U of a vector space V is called a subspace of V if U is itself a vector space under the addition and scalar multiplication defined on V.
While this is a formal definition of a subspace, it turns out that all one really needs to show is closure under addition and closure under multiplication for the elements in U. Note that it is required that the elements already be elements of a known vector space V. If ~u, ~v and ~w are in U, then properties 3, 4, 7–10 in the definition of vector space hold automatically (since ~u, ~v and ~w are also in V). Further, once the properties of closure under addition and multiplication are shown, properties 5 and 6 in the definition of vector space follow. We can state this as a theorem.
Theorem 8.1. If U is a set of elements from a vector space V, then U is a subspace of V if and only if U is closed under addition and multiplication. That is,
if ~u1 and ~u2 are in U, then so is ~u1 + ~u2, and
if ~u is in U, then so is r~u.

Example 8.1. Show that the set of polynomials of degree 2 or less forms a subspace of the vector space of all polynomials. Note that the set of all polynomials is a vector space and the set of polynomials of degree 2 or less are included in that vector space.
Closure under addition: We first identify two general elements in the subcollection as

u(x) = a2 x^2 + a1 x + a0
v(x) = b2 x^2 + b1 x + b0.
Example 8.2. (Rabenstein1, section 3.2, problem 2) Show that the set of all elements of R^2 of the form (1, a), where a is any constant, is not a subspace of R^2.
Closure under addition: We first identify two general elements in the subcollection as

~u1 = (1, a1)
~u2 = (1, a2).

Adding these gives

~u1 + ~u2 = (2, a1 + a2).                                                  (8.1)

While the second component of this vector is a scalar (like a), the first component is not equal to 1, and therefore this subcollection is not closed under addition and therefore is not a subspace of R^2.
Example 8.3. Let F^2 be the vector space of all twice differentiable functions defined on the real line. Show that the set of functions f(x) in F^2 satisfying the differential equation

f''(x) + 2f(x) = 0                                                         (8.2)

is a subspace of F^2.
Closure under addition: We first identify two general elements in the subcollection as f1(x) and f2(x) where

f1''(x) + 2f1(x) = 0                                                       (8.3a)
f2''(x) + 2f2(x) = 0.                                                      (8.3b)

Adding these two equations gives

[f1(x) + f2(x)]'' + 2[f1(x) + f2(x)] = 0.                                  (8.4)

This shows that the new function f1 + f2 satisfies Eq. (8.2) so this subcollection is closed under addition.
Closure under multiplication: We need to determine whether the function rf1 satisfies Eq. (8.2). Note

r[f1''(x) + 2f1(x)] = 0
[rf1(x)]'' + 2[rf1(x)] = 0.

The second equation shows that the function rf1 satisfies Eq. (8.2). Therefore the subcollection is closed under multiplication. We have therefore shown that the subcollection described above is a subspace of F^2.
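The closure argument can also be spot-checked numerically. The sketch below picks two particular solutions of Eq. (8.2) — f1 = sin(√2 x) and f2 = cos(√2 x), our own choice, not from the notes — and verifies with finite differences that their sum also satisfies the equation to good accuracy:

```python
import math

# Spot-check of closure for Example 8.3. The particular solutions used
# here, f1 = sin(sqrt(2) x) and f2 = cos(sqrt(2) x), are our own choice
# of functions satisfying f'' + 2f = 0; they are not from the notes.
S2 = math.sqrt(2.0)
H = 1e-5

def residual(f, x):
    # Central-difference estimate of f''(x) + 2 f(x).
    return (f(x - H) - 2.0 * f(x) + f(x + H)) / H**2 + 2.0 * f(x)

def f1(x):
    return math.sin(S2 * x)

def f2(x):
    return math.cos(S2 * x)

def fsum(x):
    return f1(x) + f2(x)

for x in (0.0, 0.7, 1.9):
    assert abs(residual(f1, x)) < 1e-4
    assert abs(residual(f2, x)) < 1e-4
    assert abs(residual(fsum, x)) < 1e-4
print("f1, f2, and f1 + f2 all satisfy the equation to numerical accuracy")
```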
Definition 9.1. A linear combination of the vectors {~u1, ~u2, ..., ~uk} is defined as

c1 ~u1 + c2 ~u2 + ... + ck ~uk.                                            (9.1)

Given this definition, we can now consider the set of all linear combinations of the elements (that is, the set of all vectors of the form given by equation (9.1)). We refer to this as the span of the vectors {~u1, ~u2, ..., ~uk}.

Definition 9.2. The span of the vectors {~u1, ~u2, ..., ~uk} is the set of all possible linear combinations of {~u1, ~u2, ..., ~uk}.
If we take two such linear combinations l1 = a1~u1 + a2~u2 + ... + ak~uk and l2 = b1~u1 + b2~u2 + ... + bk~uk and add them together, we get another linear combination l1 + l2 = (a1 + b1)~u1 + (a2 + b2)~u2 + ... + (ak + bk)~uk. Also, if we take a given linear combination (for instance the first one) and multiply it by a scalar r, we get another linear combination rl1 = (ra1)~u1 + (ra2)~u2 + ... + (rak)~uk. Therefore, the set of all linear combinations of these elements is closed under addition and multiplication and hence is a subspace: the span of a set of vectors in V is a subspace of V. This subspace U, which is made up of all linear combinations of the elements ~u1, ~u2, ..., ~uk, is said to be spanned by ~u1, ~u2, ..., ~uk. That is, the elements ~u1, ~u2, ..., ~uk span the subspace U.
An equivalent way of thinking about the span of a set of elements is the following. If the set {~u1, ~u2, ..., ~uk} spans U, then any element of U can be written as a linear combination of the elements ~u1, ~u2, ..., ~uk. Note that there could be an infinite number of elements in U (which are all linear combinations of ~u1, ~u2, ..., ~uk). For example, if two functions y1(x) and y2(x) span the set of all solutions to the second order homogeneous linear differential equation y'' + p(x)y' + q(x)y = 0, then any solution to this differential equation can be written as a linear combination of y1(x) and y2(x).
The idea here is that if you know that a solution you are seeking (to a differential equation or a linear algebraic equation) resides in a vector space, and that you have a set of vectors (or elements) that span that vector space (or subspace), then you can stop looking for solutions, since all other solutions are linear combinations of the spanning vectors.
The above description tells how the concept of the span of a set of elements is related to linear combinations of those elements. In the sections that follow, the notion of the span of a set will also be related to notions of dimension, basis and linear dependence. For now, we shall consider several examples based on the above definitions of a linear combination and the span of a set.
Example 9.1. Consider the subspace U of R^3 spanned by the vectors (-2, 1, 1)^T and
(1, -1, 3)^T, and interpret U geometrically. Since these two vectors span the subspace, any
element of U must have the form

    ~u = a (-2, 1, 1)^T + b (1, -1, 3)^T.                         (9.2)

You should be able to confirm for yourself that such elements do, in fact, form a subspace
(i.e. show closure under addition and scalar multiplication). The geometric interpretation of this
subspace is a plane passing through the origin and the points (-2, 1, 1) and (1, -1, 3) (i.e.
their x, y, z coordinates). The parametric representation of the plane is

    x = -2a + b
    y = a - b
    z = a + 3b

where a and b are the "parameters" of the plane (once given, they determine x, y, z). The
equation for the plane can also be represented by

    z + 4x + 7y = 0.
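The plane equation can be checked numerically. The following short sketch (assuming NumPy is available; it is not part of the original notes) computes a normal to the plane as the cross product of the two spanning vectors and verifies that arbitrary linear combinations satisfy 4x + 7y + z = 0:

```python
import numpy as np

# The two vectors that span the subspace U in Eq. (9.2).
u1 = np.array([-2.0, 1.0, 1.0])
u2 = np.array([1.0, -1.0, 3.0])

# A normal to the plane is the cross product of the spanning vectors.
n = np.cross(u1, u2)
assert np.allclose(n, [4.0, 7.0, 1.0])

# Any element a*u1 + b*u2 of U must satisfy 4x + 7y + z = 0.
rng = np.random.default_rng(0)
for _ in range(5):
    a, b = rng.standard_normal(2)
    x, y, z = a * u1 + b * u2
    assert abs(4 * x + 7 * y + z) < 1e-12
```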
Example 9.2. Consider the subspace U of R^3 spanned by the single vector (1, 3, -2)^T. Since
the vector spans the subspace, any element of U must have the form

    ~u = a (1, 3, -2)^T.                                          (9.3)

The geometric interpretation of this subspace is a line passing through the origin and the
point (1, 3, -2) in 3D. The parametric representation of the line is

    x = a
    y = 3a
    z = -2a.

This can also be expressed as

    x/1 = y/3 = z/(-2).
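As a quick numerical check (a NumPy sketch, not part of the original notes), points of the form a(1, 3, -2)^T can be sampled and tested against the symmetric form of the line:

```python
import numpy as np

# Direction vector of the line in Eq. (9.3).
v = np.array([1.0, 3.0, -2.0])

# Sample a few points a*v on the line and check x/1 = y/3 = z/(-2).
for a in (-2.0, 0.5, 3.0):
    x, y, z = a * v
    assert np.isclose(x / 1, y / 3)
    assert np.isclose(y / 3, z / -2)
```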
Two vectors, then, more efficiently
represent the plane with x = 0 than do the vectors in Eq. (10.1). The reason for this greater
efficiency has to do with the notion of linear dependence. This is also closely related to
dimension and basis, which we shall discuss in section 11.
Definition. A set of elements {~v1, ~v2, ..., ~vm} of a vector space is linearly dependent if
there exist constants c1, c2, ..., cm, not all zero, such that

    c1 ~v1 + c2 ~v2 + ... + cm ~vm = ~0.                          (10.4)

The set or the elements themselves are referred to as linearly dependent. If the only way
to satisfy this equation is with c1 = c2 = ... = cm = 0, then the elements (or the set) are
linearly independent.
Example 10.1. If we consider the previous case of the three vectors in Eq. (10.1) and try to
satisfy

    c1 (0, 1, 0)^T + c2 (0, 0, 1)^T + c3 (0, 1, 1)^T = (0, 0, 0)^T,   (10.5)

we find that nonzero choices such as c1 = 1, c2 = 1, c3 = -1 work, so these three vectors are
linearly dependent. If instead we keep only the first two vectors, the only way to satisfy

    c1 (0, 1, 0)^T + c2 (0, 0, 1)^T = (0, 0, 0)^T                 (10.6)

is c1 = c2 = 0.
There is another equivalent way to think about linear dependence. From our definition,
if a given set of vectors is linearly dependent, then

    c1 ~v1 + c2 ~v2 + ... + cm ~vm = ~0                           (10.7)

with at least one coefficient, say ck, nonzero. Dividing by ck and solving for ~vk gives

    ~vk = -(c1/ck) ~v1 - ... - (c(k-1)/ck) ~v(k-1) - (c(k+1)/ck) ~v(k+1) - ... - (cm/ck) ~vm.   (10.8)

Theorem 10.1. (Rabenstein1, Thm. 3.3) The set of elements {~v1, ~v2, ..., ~vm} is linearly
dependent if and only if at least one of the elements in this set is a linear combination of the
others.
The basic idea behind this theorem is that for a linearly dependent set of vectors, one of
the vectors must be redundant.
Now consider n vectors ~v1, ~v2, ..., ~vn in R^m. Each vector has m components; for example,

    ~v3 = (v31, v32, ..., v3m)^T.                                 (10.9)
The linear dependence of these n vectors is determined by finding the values of ci for which the
following equation holds:

    c1 ~v1 + c2 ~v2 + ... + cn ~vn = ~0.                          (10.10)

Writing each vector as a column, this reads

    c1 [~v1] + c2 [~v2] + ... + cn [~vn] = ~0,                    (10.11)

or, collecting the vectors as the columns of an m x n matrix,

    [ ~v1 ~v2 ... ~vn ] ~c = ~0.                                  (10.12)

Theorem 10.2. If n > m then
these elements are linearly dependent. If n = m then these elements are linearly dependent
if and only if

    det [ ~v1 ~v2 ... ~vn ] = 0.                                  (10.13)
Example 10.2. Determine if the following vectors are linearly dependent or linearly independent:

    ~v1 = (0, 1, 1)^T,   ~v2 = (1, 0, 1)^T,   ~v3 = (1, 1, 0)^T.      (10.14)

Placing these vectors as the columns of a matrix, we compute

        | 0  1  1 |
    det | 1  0  1 | = 0(0 - 1) - 1(0 - 1) + 1(1 - 0) = 2.
        | 1  1  0 |

Since this determinant is nonzero, these vectors are linearly independent. Note that if any
additional vector in R^3 were included in this list, those 4 vectors would automatically be
linearly dependent as a consequence of the first part of Theorem 10.2.
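The determinant test of Theorem 10.2 is easy to reproduce numerically. A minimal NumPy check (not part of the original notes) for the vectors of Eq. (10.14):

```python
import numpy as np

# Columns are the three vectors of Eq. (10.14).
V = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])

d = np.linalg.det(V)
# d is approximately 2, nonzero, so the vectors are independent.
assert abs(d - 2.0) < 1e-9
```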
Example 10.3. Determine if the following vectors are linearly dependent or linearly independent:

    ~v1 = (1, 0, -1)^T,   ~v2 = (2, 6, 1)^T,   ~v3 = (-1, -12, -5)^T.

Placing these vectors as the columns of a matrix, we compute

        |  1   2   -1 |
    det |  0   6  -12 | = (1)(-18) - 0 + (-1)(-24 + 6) = 0.       (10.15)
        | -1   1   -5 |

Therefore, since the determinant vanishes, these vectors are linearly dependent. This means
that

    c1 ~v1 + c2 ~v2 + c3 ~v3 = ~0                                 (10.16)

can be satisfied by values of c1, c2, c3 that are not all zero. By setting up the system

    |  1   2   -1 | | c1 |
    |  0   6  -12 | | c2 | = ~0                                   (10.17)
    | -1   1   -5 | | c3 |

and using elementary operations on the rows, one can show that any vector ~c = (c1, c2, c3) of the
form

    ~c = (-3a, 2a, a),                                            (10.18)

where a is a constant, is a solution.
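Both conclusions of this example can be confirmed with a short NumPy sketch (assumed available; not part of the notes' analytical development): the determinant vanishes, and the coefficient family of Eq. (10.18) annihilates the columns.

```python
import numpy as np

# Columns of A are the vectors v1, v2, v3 of Example 10.3.
A = np.array([[ 1.0,  2.0,  -1.0],
              [ 0.0,  6.0, -12.0],
              [-1.0,  1.0,  -5.0]])

# The determinant vanishes, so the columns are linearly dependent.
assert abs(np.linalg.det(A)) < 1e-9

# The family c = (-3a, 2a, a) from Eq. (10.18), checked at a = 1.
c = np.array([-3.0, 2.0, 1.0])
assert np.allclose(A @ c, 0.0)
```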
Definition. Suppose there is a set of n linearly independent elements of a vector
space V. If every set of more than n elements of V is linearly dependent, then V is a vector
space of dimension n. THE DIMENSION OF A VECTOR SPACE IS THE NUMBER OF
ELEMENTS IN THE LARGEST LINEARLY INDEPENDENT SET IN THAT SPACE.

Definition. A set of n linearly independent elements of V is called a basis for V. A BASIS FOR V IS A SET OF VECTORS
THAT SPANS V AND IS ALSO LINEARLY INDEPENDENT.

So one could also think of a basis as a maximal independent set or a minimal spanning
set. In order to describe the elements of a vector space, all one really needs is a basis! Notice
that there is not just one basis for a vector space. For example, the two sets of vectors
{(1, 0), (0, 1)} and {(1, -1), (1, 1)} are both bases for R^2.
Theorem 11.1. If a set of n elements of a vector space V is linearly independent and spans
V, then V has dimension n.

Example 11.1. Use the above theorem to show that R^3 has dimension 3. Consider the set
of three vectors

    { (1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T }.                    (11.1)

These vectors are linearly independent since

        | 1  0  0 |
    det | 0  1  0 | = 1 != 0.                                     (11.2)
        | 0  0  1 |

Furthermore, since

    a (1, 0, 0)^T + b (0, 1, 0)^T + c (0, 0, 1)^T = (a, b, c)^T   (11.3)

represents a general element of R^3, these three vectors span R^3. Therefore, by Theorem
11.1, R^3 has dimension 3. Similar reasoning shows that R^m has dimension m.
Example 11.2. Show that the vector space of all polynomials of degree two or less has
dimension 3. Consider the set of elements

    {1, x, x^2}.                                                  (11.4)
Example 11.3. Can the set {p1(x), p2(x), p3(x)} form a basis for the vector space of all
polynomials of degree three or less? No. One could argue as we did in the previous example that the vector
space of all polynomials of degree three or less has dimension 4. By Theorem 11.2, in order
for a set to span this space, the set must contain at least 4 elements. The proposed set
{p1(x), p2(x), p3(x)} has only 3 elements and therefore cannot form a basis for this space.
Example 11.4. What is the dimension of the vector space spanned by the functions

    g1(x) = sin x,   g2(x) = cos x?                               (11.7)

Elements in this space look like

    c1 sin x + c2 cos x,                                          (11.8)

where c1 and c2 are constants. If these elements are linearly independent (and since they
span this space), then by Theorem 11.1 the space would have dimension 2. So, consider the
Wronskian

    W = det | sin x   cos x |
            | cos x  -sin x | = -sin^2 x - cos^2 x = -1 != 0.     (11.9)

Therefore these two functions are linearly independent. By Theorem 11.1 the space has
dimension 2.
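The Wronskian computation of Eq. (11.9) can be spot-checked numerically. This NumPy sketch (an added check, not part of the notes) evaluates W at several sample points and confirms it is -1 everywhere:

```python
import numpy as np

# Evaluate W(x) = det([[sin x, cos x], [cos x, -sin x]]) at several
# sample points; the text shows W = -1 identically.
for x in np.linspace(0.0, 6.0, 7):
    W = np.linalg.det(np.array([[np.sin(x),  np.cos(x)],
                                [np.cos(x), -np.sin(x)]]))
    assert np.isclose(W, -1.0)
```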
It is also true that the set {1, x, x^2, x^3, ...} forms a basis for the vector space of all
polynomials. This vector space has infinite dimension.
The following describes a method for finding a basis for a subspace spanned by a set of
vectors. Consider the subspace of R^4 spanned by the vectors

    ~v1 = (1, 0, 1, 2)^T,  ~v2 = (1, 2, -1, 0)^T,  ~v3 = (1, 1, 0, 1)^T,  ~v4 = (1, -4, 5, 6)^T.   (11.10)

The first thing to notice is that

    det [ ~v1 ~v2 ~v3 ~v4 ] = 0.                                  (11.11)
Therefore, based on Theorem 10.2, we know that these vectors are linearly dependent. Consequently, these vectors do not form a basis; they are not linearly independent. A method
for finding a basis starts by forming the matrix whose rows are made up of the above vectors,
namely

    | 1   0   1   2 |
    | 1   2  -1   0 |                                             (11.12)
    | 1   1   0   1 |
    | 1  -4   5   6 |

and applying the elementary operations on the rows. So

    | 1   0   1   2 |
    | 0   2  -2  -2 |    [-1 ROW1 + ROW2]
    | 0   1  -1  -1 |    [-1 ROW1 + ROW3]
    | 0  -4   4   4 |    [-1 ROW1 + ROW4]

    | 1   0   1   2 |
    | 0   1  -1  -1 |    [1/2 ROW2]
    | 0   0   0   0 |    [-1 ROW2 + ROW3]
    | 0   0   0   0 |    [4 ROW2 + ROW4]
The above matrix is now as simplified as possible. If we extract the two remaining (nonzero)
row vectors we have

    ~w1 = (1, 0, 1, 2)^T,   ~w2 = (0, 1, -1, -1)^T.               (11.13)

These two vectors are linearly independent since c1 ~w1 + c2 ~w2 = ~0 can only be satisfied by
c1 = c2 = 0. Furthermore, the vectors ~w1 and ~w2 span the same space as did the vectors ~v1,
~v2, ~v3 and ~v4; this result is the consequence of the following theorem.
Theorem. Let A be an m x n matrix and let B be
the matrix formed by performing a finite number of elementary operations on the rows of A.
Then the subspace of R^n that is spanned by the row vectors of A is the same as the subspace
spanned by the row vectors of B.

Therefore, a basis for the space spanned by the vectors ~v1, ~v2, ~v3 and ~v4 is given by the
set {~w1, ~w2}. This subspace therefore has dimension 2.
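The elimination above can be cross-checked numerically. The following NumPy sketch (an added check, not part of the notes) confirms that the row space has rank 2 and that each ~vi is a combination of ~w1 and ~w2, using the coefficients the elimination implies:

```python
import numpy as np

# Rows are the vectors v1..v4 of Eq. (11.10), as in the matrix (11.12).
A = np.array([[1.0,  0.0,  1.0, 2.0],
              [1.0,  2.0, -1.0, 0.0],
              [1.0,  1.0,  0.0, 1.0],
              [1.0, -4.0,  5.0, 6.0]])

# The dimension of the row space is 2, matching the two nonzero rows.
assert np.linalg.matrix_rank(A) == 2

# The basis vectors found by elimination, Eq. (11.13).
w1 = np.array([1.0, 0.0,  1.0,  2.0])
w2 = np.array([0.0, 1.0, -1.0, -1.0])

# Each original row is a combination a*w1 + b*w2.
for row, (a, b) in zip(A, [(1, 0), (1, 2), (1, 1), (1, -4)]):
    assert np.allclose(a * w1 + b * w2, row)
```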
Consider the equation

    A~x = λ~x,                                                    (12.1)

where A is a given square matrix, λ is a scalar, and ~x is an unknown vector. This can be
rewritten as

    A~x = λ~x = λI~x
    A~x - λI~x = ~0
    (A - λI)~x = ~0.                                              (12.2)

The last line represents a homogeneous linear system with a matrix A - λI. This matrix is
modified from the original matrix A by subtracting λ from each of the diagonal elements.
We know from Theorem 5.2 that if det(A - λI) = 0 then this homogeneous problem has
nontrivial solutions -- notice that Eq. (12.1) is satisfied by the trivial solution ~x = ~0.
Now, since we assume that A is a given matrix, the idea is that we wish to choose λ such
that det(A - λI) = 0. Such a value λ is called an eigenvalue of A (also called a characteristic
value). If such a value exists so that Eq. (12.1) has a nontrivial solution ~x, we call such
solutions eigenvectors (or characteristic vectors).
There are a wide variety of situations which arise when studying eigenvalue problems.
Many of these are described in Boyce and DiPrima4. In these notes, we will treat only 2 x 2
matrices that have either real and distinct eigenvalues or complex eigenvalues. In each of
these cases, we can identify two linearly independent eigenvectors. Mainly, the methods will
be demonstrated by way of examples.
Example 12.1. Consider the matrix

    A = |  4  -5 |
        |  2  -3 |                                                (12.3)

and find the eigenvalues and eigenvectors. We begin by finding the values of λ that satisfy

    det | 4-λ   -5  |
        |  2   -3-λ | = 0.                                        (12.4)

We find that

    (4 - λ)(-3 - λ) + 10 = 0
    (λ - 2)(λ + 1) = 0.                                           (12.5)

This equation is sometimes called the characteristic polynomial equation. Therefore, the
eigenvalues are λ1 = 2 and λ2 = -1. Now, for each of these eigenvalues we want to find
the corresponding nontrivial solution, or eigenvector. We first consider the case λ = λ1 = 2.
This gives us the system ((A - λI)~x = ~0)

    | 2  -5 | ~x = ~0.                                            (12.6)
    | 2  -5 |

This system represents two equations, each of which is 2x1 - 5x2 = 0 (generally, the system
may need to be reduced using elementary operations on the rows of the matrix before writing
down the solution). Therefore, we have as a solution

    ~x1 = a1 (5, 2)^T,                                            (12.7)

where a1 is an arbitrary, but nonzero, constant. Next, for λ = λ2 = -1 we obtain the system

    | 5  -5 | ~x = ~0.                                            (12.8)
    | 2  -2 |
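As a numerical cross-check of this example (a NumPy sketch, not part of the notes' hand calculation), `numpy.linalg.eig` should return the eigenvalues 2 and -1, with the eigenvector for λ = 2 proportional to (5, 2)^T:

```python
import numpy as np

# The matrix of Example 12.1.
A = np.array([[4.0, -5.0],
              [2.0, -3.0]])

vals, vecs = np.linalg.eig(A)
assert np.allclose(np.sort(vals.real), [-1.0, 2.0])

# The eigenvector for lambda = 2 should be proportional to (5, 2)^T.
i = int(np.argmax(vals.real))
v = vecs[:, i]
assert np.allclose(v[0] / v[1], 5 / 2)
```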
Example 12.2. Find the eigenvalues and eigenvectors for the following matrix:

    A = |  2   5 |
        | -1  -2 |.                                               (12.10)

Proceeding as in the previous example, the characteristic polynomial equation is
(2 - λ)(-2 - λ) + 5 = λ^2 + 1 = 0, so the eigenvalues are λ1 = i and λ2 = -i. Now consider
λ = λ2 = -i. After row reduction, the system (A - λI)~x = ~0
represents the single equation -x1 + (-2 + i)x2 = 0. Therefore, we have as a solution

    ~x2 = a2 (-2 + i, 1)^T,                                       (12.18)

where a2 is an arbitrary, but nonzero, complex constant. This is the eigenvector ~x2 corresponding to the eigenvalue λ2 = -i. You may notice that the two eigenvalues are complex
conjugates of each other and that the two eigenvectors are complex conjugates of each other
as well. So actually, once we found the first eigenvector, we could have just written down
the other eigenvector without further calculations.
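The complex-conjugate pairing can also be seen numerically. This NumPy sketch (an added check, not part of the notes) verifies that the eigenvalues are ±i and that each computed eigenvector satisfies A~x = λ~x:

```python
import numpy as np

# The matrix of Example 12.2.
A = np.array([[ 2.0,  5.0],
              [-1.0, -2.0]])

vals, vecs = np.linalg.eig(A)

# The eigenvalues are the conjugate pair +i and -i.
assert np.allclose(np.sort(vals.imag), [-1.0, 1.0])
assert np.allclose(vals.real, 0.0)

# Each eigenvector satisfies A x = lambda x.
for lam, x in zip(vals, vecs.T):
    assert np.allclose(A @ x, lam * x)
```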
As was mentioned earlier, there are many different scenarios that can play out in eigenvalue problems. The above examples describe only two such situations. See Boyce and
DiPrima4 for further discussion of eigenvalues and eigenvectors.
References
1. A.L. Rabenstein, Elementary Differential Equations With Linear Algebra, Fourth Edition (Harcourt Brace Jovanovich, Fort Worth, 1992).
2. G. Strang, Linear Algebra and its Applications, Third Edition (Harcourt Brace Jovanovich, San Diego, 1988).
4. W.E. Boyce and R.C. DiPrima, Elementary Differential Equations and Boundary Value Problems, Sixth Edition (Wiley, New York, 1997).