
Applied Linear Algebra Refresher Course

Karianne Bergen
kbergen@stanford.edu

Institute for Computational and Mathematical Engineering,


Stanford University

September 16-19, 2013

K. Bergen (ICME) Applied Linear Algebra 1 / 140


Course Topics

Matrix and vector properties


Matrix algebra
Linear transformations
Solving linear systems
Orthogonality
Eigenvectors and eigenvalues
Matrix decompositions

K. Bergen (ICME) Applied Linear Algebra 2 / 140


Vectors and Matrices

K. Bergen (ICME) Applied Linear Algebra 3 / 140



Scalars, vectors and matrices
Scalar: a single quantity or measurement that is invariant under coordinate rotations (e.g. mass, volume, temperature).
$\alpha, \beta, \gamma \in \mathbb{R}$ (Greek letters)

Vector: an ordered collection of scalars (e.g. displacement, acceleration, force).
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad x_i \in \mathbb{R},\ i = 1, \dots, n$$
$x, y, z \in \mathbb{R}^n$ (lower case) is a vector with $n$ entries.

All vectors are column vectors unless otherwise noted.
Row vectors are denoted by $x^T = [\, x_1 \ x_2 \ \cdots \ x_n \,]$.
For simplicity, we will restrict ourselves to real numbers. The generalization to the complex case can be found in any linear algebra text.
K. Bergen (ICME) Applied Linear Algebra 4 / 140
... Scalars, vectors and matrices

Matrix: a two-dimensional collection of scalars.
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}, \qquad A^T = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_n^T \end{bmatrix}$$
where $a_1, \dots, a_n$ are the columns of $A$.

$A \in \mathbb{R}^{m \times n}$ is a matrix with $m$ rows and $n$ columns.

$A, B, \Lambda, \Sigma \in \mathbb{R}^{m \times n}$ (upper case)

K. Bergen (ICME) Applied Linear Algebra 5 / 140


Operations on vectors
Scalar multiplication:
$$(\alpha x)_i = \alpha x_i, \qquad \alpha x = \alpha \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix}$$

Vector addition/subtraction:
$$(x + y)_i = x_i + y_i, \qquad x + y = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}$$

Inner product (dot product, scalar product):
$$x^T y = \sum_{i=1}^n x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$
Other common notation for the inner product: $\langle x, y \rangle$, $x \cdot y$
K. Bergen (ICME) Applied Linear Algebra 6 / 140
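The operations above map directly onto NumPy; here is a minimal sketch (the array values are illustrative, not from the slides):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, -1.0])
alpha = 5.0

print(alpha * x)     # scalar multiplication: (alpha*x)_i = alpha * x_i
print(x + y)         # elementwise vector addition
print(x @ y)         # inner product x^T y = sum_i x_i * y_i  -> 1.0
print(np.dot(x, y))  # equivalent call for the inner product
```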
Vector norms

How do we measure the length or size of a vector?

A vector norm is a function $f : \mathbb{R}^n \to \mathbb{R}_+$ that satisfies the following properties:

1. Scale invariance: $f(\alpha v) = |\alpha|\, f(v)$, for all $\alpha \in \mathbb{R}$ and $v \in \mathbb{R}^n$
2. Triangle inequality: $f(u + v) \le f(u) + f(v)$, for all $u, v \in \mathbb{R}^n$
3. Positivity: $f(x) = 0 \iff x = 0$

K. Bergen (ICME) Applied Linear Algebra 7 / 140


Commonly used vector norms
1-norm (taxi-cab norm):
$$\|v\|_1 = \sum_{i=1}^n |v_i|$$

2-norm (Euclidean norm):
$$\|v\|_2 = \Big( \sum_{i=1}^n v_i^2 \Big)^{1/2} = \sqrt{v^T v}$$

infinity-norm (max norm):
$$\|v\|_\infty = \max_i |v_i|$$

These are examples of the $p$-norm, for $p = 1, 2,$ and $\infty$:
$$\|v\|_p = \Big( \sum_{i=1}^n |v_i|^p \Big)^{1/p}, \qquad p \ge 1$$

K. Bergen (ICME) Applied Linear Algebra 8 / 140
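As a quick check of these definitions (and a preview of Mini-Quiz 1), the $p$-norms are one call each in NumPy; a minimal sketch:

```python
import numpy as np

v = np.array([3.0, 6.0, 2.0])

print(np.linalg.norm(v, 1))       # 1-norm: sum of |v_i|      -> 11.0
print(np.linalg.norm(v, 2))       # 2-norm: sqrt(sum v_i^2)   -> 7.0
print(np.linalg.norm(v, np.inf))  # infinity-norm: max |v_i|  -> 6.0
```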


Questions?

K. Bergen (ICME) Applied Linear Algebra 9 / 140


Mini-Quiz 1:

Compute the 1-norm, 2-norm, and infinity-norm of the following vector:
$$v = \begin{bmatrix} 3 \\ 6 \\ 2 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 10 / 140


Solution (Q1):

Solutions:
1-norm:
$$\|v\|_1 = \sum_{i=1}^3 |v_i| = 3 + 6 + 2 = 11$$
2-norm:
$$\|v\|_2 = \Big( \sum_{i=1}^3 v_i^2 \Big)^{1/2} = \sqrt{3^2 + 6^2 + 2^2} = \sqrt{9 + 36 + 4} = \sqrt{49} = 7$$
$\infty$-norm:
$$\|v\|_\infty = \max_i |v_i| = \max\{3, 6, 2\} = 6$$

K. Bergen (ICME) Applied Linear Algebra 11 / 140


Cauchy-Schwarz Inequality

The inner product satisfies the Cauchy-Schwarz inequality:
$$|x^T y| \le \|x\|_2 \|y\|_2 \qquad \text{for } x, y \in \mathbb{R}^n,$$
with equality exactly when $y$ is a scalar multiple of $x$.

This inequality arises in many applications, including:
- Geometry: triangle inequality
- Physics: Heisenberg uncertainty principle
- Statistics: Cramér-Rao lower bound

K. Bergen (ICME) Applied Linear Algebra 12 / 140


Inner Product and Orthogonality

Orthogonal vectors: vectors $u$ and $v$ are orthogonal ($u \perp v$) if their inner product is zero:
$$u^T v = 0 \iff u \perp v$$

Orthonormal vectors: vectors $u$ and $v$ are orthonormal if they are orthogonal and both are normalized to length 1:
$$u \perp v \quad \text{and} \quad \|u\| = 1,\ \|v\| = 1$$

The orthogonal complement of a set of vectors $V$, denoted $V^\perp$, is the set of vectors $x$ such that $x^T v = 0$ for all $v \in V$.
K. Bergen (ICME) Applied Linear Algebra 13 / 140


Inner Product and Projections

Projection: the inner product gives us a way to compute the component of a vector $v$ along any direction $u$:
$$\mathrm{proj}_u v = \Big( \frac{u^T v}{\|u\|} \Big) \frac{u}{\|u\|}$$

When $u$ is normalized to unit length, $\|u\| = 1$, this becomes
$$\mathrm{proj}_u v = (u^T v)\, u$$
K. Bergen (ICME) Applied Linear Algebra 14 / 140
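The projection formula is easy to verify numerically; a minimal sketch with illustrative vectors (not from the slides):

```python
import numpy as np

u = np.array([1.0, 1.0, 0.0])
v = np.array([2.0, 0.0, 1.0])

# proj_u v = (u^T v / ||u||^2) u, valid for any nonzero (not just unit) u
proj = (u @ v) / (u @ u) * u
print(proj)                             # [1. 1. 0.]

# the residual v - proj_u v is orthogonal to u
print(np.isclose((v - proj) @ u, 0.0))  # True
```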


Questions?

K. Bergen (ICME) Applied Linear Algebra 15 / 140


Mini-Quiz 2:

Give an algebraic proof of the triangle inequality
$$\|x + y\|_2 \le \|x\|_2 + \|y\|_2$$
using the Cauchy-Schwarz inequality. (Hint: expand $\|x + y\|_2^2$.)

K. Bergen (ICME) Applied Linear Algebra 16 / 140


Solution (Q2):
Give an algebraic proof of the triangle inequality
$$\|x + y\|_2 \le \|x\|_2 + \|y\|_2$$
using the Cauchy-Schwarz inequality. (Hint: expand $\|x + y\|_2^2$.)

Solution:
$$\begin{aligned}
\|x + y\|_2^2 &= (x + y)^T (x + y) \\
&= x^T x + 2\, x^T y + y^T y \\
&= \|x\|_2^2 + \|y\|_2^2 + 2\, x^T y \\
&\le \|x\|_2^2 + \|y\|_2^2 + 2\, |x^T y| \\
&\le \|x\|_2^2 + \|y\|_2^2 + 2\, \|x\|_2 \|y\|_2 \qquad \text{(by Cauchy-Schwarz)} \\
&= \big( \|x\|_2 + \|y\|_2 \big)^2
\end{aligned}$$
Taking square roots of both sides gives the triangle inequality.
K. Bergen (ICME) Applied Linear Algebra 17 / 140


Operations with Matrices

Scalar multiplication:
$$(\alpha A)_{ij} = \alpha A_{ij}, \qquad \alpha \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} \alpha a_{11} & \cdots & \alpha a_{1n} \\ \vdots & \ddots & \vdots \\ \alpha a_{m1} & \cdots & \alpha a_{mn} \end{bmatrix}$$

Matrix addition:
$$(A + B)_{ij} = A_{ij} + B_{ij}, \qquad A + B = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix}, \qquad A, B \in \mathbb{R}^{m \times n}$$
K. Bergen (ICME) Applied Linear Algebra 18 / 140


Matrix-vector multiplication
Matrix-vector multiplication:
$$(Ax)_i = \sum_{j=1}^n A_{ij} x_j, \qquad A \in \mathbb{R}^{m \times n},\ x \in \mathbb{R}^n$$

Matrix-vector product as a linear combination of the matrix columns:
$$Ax = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n$$

Matrix-vector product as a vector of inner products with the rows of $A$:
$$Ax = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix} x = \begin{bmatrix} a_1^T x \\ a_2^T x \\ \vdots \\ a_m^T x \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 19 / 140
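All three views of the matrix-vector product can be checked against each other in NumPy; a minimal sketch with an illustrative $3 \times 2$ matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])              # A in R^{3x2}
x = np.array([10.0, -1.0])

b1 = A @ x                              # built-in: (Ax)_i = sum_j A_ij x_j
b2 = x[0] * A[:, 0] + x[1] * A[:, 1]    # linear combination of the columns
b3 = np.array([A[i, :] @ x for i in range(A.shape[0])])  # row inner products

print(b1, b2, b3)                       # all three agree: [ 8. 26. 44.]
```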
Matrix-matrix multiplication

Matrix product:
$$(AB)_{ij} = \sum_{k=1}^p A_{ik} B_{kj}, \qquad A \in \mathbb{R}^{m \times p},\ B \in \mathbb{R}^{p \times n}$$

Matrix products in terms of inner products:
$$(AB)_{ij} = a_i^T b_j, \qquad A = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix}, \quad B = \begin{bmatrix} b_1 & b_2 & \cdots & b_n \end{bmatrix}$$

Since inner products are only defined for vectors of the same length, the matrix product requires that $a_i, b_j \in \mathbb{R}^p$ for all $i, j$.
K. Bergen (ICME) Applied Linear Algebra 20 / 140


... Matrix-matrix multiplication
Matrix products in terms of matrix-vector products:
$$AB = A \begin{bmatrix} b_1 & b_2 & \cdots & b_n \end{bmatrix} = \begin{bmatrix} Ab_1 & Ab_2 & \cdots & Ab_n \end{bmatrix}$$
$$AB = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix} B = \begin{bmatrix} a_1^T B \\ a_2^T B \\ \vdots \\ a_m^T B \end{bmatrix}$$
Matrix-matrix multiplication can be expressed in terms of the concatenation of matrix-vector products.

For $A \in \mathbb{R}^{k \times l}$ and $B \in \mathbb{R}^{m \times n}$, the product $AB$ is only defined if $l = m$, and the product $BA$ is only defined if $k = n$.

K. Bergen (ICME) Applied Linear Algebra 21 / 140


Properties of Matrix-matrix multiplication
Matrix multiplication is associative:
$$A(BC) = (AB)C$$
Matrix multiplication is distributive:
$$A(B + C) = AB + AC$$
Matrix multiplication is, in general, not commutative:
$$AB \ne BA$$
For example, let $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$; then
$$AB = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \ne \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} = BA.$$

K. Bergen (ICME) Applied Linear Algebra 22 / 140


Vector Outer Product

Using the definition of matrix multiplication, we can define the vector outer product:
$$x y^T = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} = \begin{bmatrix} x_1 y_1 & x_1 y_2 & \cdots & x_1 y_n \\ x_2 y_1 & x_2 y_2 & \cdots & x_2 y_n \\ \vdots & \vdots & \ddots & \vdots \\ x_m y_1 & x_m y_2 & \cdots & x_m y_n \end{bmatrix}$$
$$\big( x y^T \big)_{ij} = x_i y_j, \qquad x \in \mathbb{R}^m,\ y \in \mathbb{R}^n,\ x y^T \in \mathbb{R}^{m \times n}$$
The outer product $x y^T$ of vectors $x$ and $y$ is a matrix, and is defined even if the vectors are not the same length.
K. Bergen (ICME) Applied Linear Algebra 23 / 140
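The outer product is likewise one call in NumPy; a minimal sketch showing that vectors of different lengths are allowed:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # x in R^3
y = np.array([4.0, 5.0])        # y in R^2: a different length is fine

P = np.outer(x, y)              # (x y^T)_{ij} = x_i * y_j
print(P.shape)                  # (3, 2)
print(P[2, 0])                  # x_3 * y_1 = 12.0
```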


Questions?

K. Bergen (ICME) Applied Linear Algebra 24 / 140


Mini-Quiz 3:

Which of the following vector and matrix operations are well-defined?

1. $\alpha x$   2. $\alpha A$   3. $x + y$   4. $x^T y$   5. $x^T z$   6. $z x^T$   7. $Az$   8. $z^T A y$

$$\alpha = 5, \quad x = \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix}, \quad y = \begin{bmatrix} 4 \\ 0 \\ 1 \end{bmatrix}, \quad z = \begin{bmatrix} 3 \\ 0 \end{bmatrix}, \quad A = \begin{bmatrix} 4 & 0 & 2 \\ 1 & 5 & 3 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 25 / 140


Solution (Q3):

Which of the following vector and matrix operations are well-defined?

1. $\alpha x$ Yes   2. $\alpha A$ Yes   3. $x + y$ Yes   4. $x^T y$ Yes   5. $x^T z$ No   6. $z x^T$ Yes   7. $Az$ No   8. $z^T A y$ Yes

$$\alpha = 5, \quad x = \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix}, \quad y = \begin{bmatrix} 4 \\ 0 \\ 1 \end{bmatrix}, \quad z = \begin{bmatrix} 3 \\ 0 \end{bmatrix}, \quad A = \begin{bmatrix} 4 & 0 & 2 \\ 1 & 5 & 3 \end{bmatrix}$$

Here $x, y \in \mathbb{R}^3$, $z \in \mathbb{R}^2$, and $A \in \mathbb{R}^{2 \times 3}$: $x^T z$ fails because $x$ and $z$ have different lengths, and $Az$ fails because $A$ has 3 columns while $z$ has 2 entries.
K. Bergen (ICME) Applied Linear Algebra 26 / 140


Transpose of a Matrix

Transpose: the transpose of a matrix $A \in \mathbb{R}^{m \times n}$ is the matrix $A^T \in \mathbb{R}^{n \times m}$ formed by exchanging the rows and columns of $A$:
$$(A^T)_{ij} = A_{ji}$$

Properties:
- $(A^T)^T = A$
- $(A + B)^T = A^T + B^T$
- $(AB)^T = B^T A^T$
- $(\alpha B)^T = \alpha\, B^T$
K. Bergen (ICME) Applied Linear Algebra 27 / 140


Trace of a Matrix
Trace: the trace of a square matrix $A \in \mathbb{R}^{n \times n}$ is the sum of its diagonal entries:
$$\mathrm{tr}(A) = \sum_{i=1}^n a_{ii}$$

The trace is a linear map and has the following properties:
- $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$
- $\mathrm{tr}(\alpha A) = \alpha\, \mathrm{tr}(A)$, $\alpha \in \mathbb{R}$
- $\mathrm{tr}(A) = \mathrm{tr}(A^T)$
- $\mathrm{tr}(AB) = \mathrm{tr}(BA)$

The trace of a matrix product functions similarly to the inner product of vectors:
$$\mathrm{tr}(A^T B) = \sum_{i,j} A_{ij} B_{ij}, \qquad A, B \in \mathbb{R}^{m \times n}$$
K. Bergen (ICME) Applied Linear Algebra 28 / 140


Matrix norms

Induced norm (operator norm) corresponding to vector norm $\| \cdot \|$:
$$\|A\| = \max_{\|x\| = 1} \|Ax\|$$
- $\|A\|_1$: maximum absolute column sum
- $\|A\|_\infty$: maximum absolute row sum
- $\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)} = \sigma_{\max}(A)$ (spectral norm)

Frobenius norm:
$$\|A\|_F = \sqrt{ \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 } = \sqrt{ \mathrm{trace}(A^T A) }$$

The Frobenius norm is equivalent to $\| \mathrm{vec}(A) \|_2$.

K. Bergen (ICME) Applied Linear Algebra 29 / 140
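These matrix norms can be compared directly in NumPy; a minimal sketch with an illustrative matrix:

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

print(np.linalg.norm(A, 1))       # max absolute column sum -> 6.0
print(np.linalg.norm(A, np.inf))  # max absolute row sum    -> 7.0
print(np.linalg.norm(A, 2))       # spectral norm: largest singular value
print(np.linalg.norm(A, 'fro'))   # Frobenius norm: sqrt(sum |a_ij|^2)
```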


Geometric Interpretation of Matrix Norms

[Figure omitted: diagram from Trefethen and Bau illustrating the induced matrix norm.]
K. Bergen (ICME) Applied Linear Algebra 30 / 140
Properties of Matrix Norms

Matrix norms satisfy the same properties as vector norms:

1. Scale invariance: $\|\alpha A\| = |\alpha| \|A\|$
2. Triangle inequality: $\|A + B\| \le \|A\| + \|B\|$
3. Positivity: $\|A\| \ge 0$, and $\|A\| = 0$ only if $A = 0$

Some matrix norms, including induced norms and the Frobenius norm, also satisfy the submultiplicative property:
$$\|Ax\| \le \|A\| \|x\|, \qquad \text{and} \qquad \|AB\| \le \|A\| \|B\|$$

K. Bergen (ICME) Applied Linear Algebra 31 / 140


A few words on: Vector and Matrix Norms
Vector norms give us a way to measure how big a vector is.
The 2-norm (Euclidean norm) is most common: it has a nice relationship with the inner product, $\|v\|_2^2 = v^T v$, and fits with our usual geometric notions of length and distance.

The choice of norm may depend on the application:
Let $y \in \mathbb{R}^n$ be a set of $n$ measurements, and let $\hat{y} \in \mathbb{R}^n$ be the estimates for $y$ from a model. Let $r = y - \hat{y}$ be the difference between the measured values and the model estimates. We want to optimize our model to make $r$ as small as possible, but how should we measure "small"?
- $\|r\|_\infty$: use this if no error should exceed a prescribed value.
- $\|r\|_2$: use this if large errors are bad, but small errors are ok.
- $\|r\|_1$: use this norm if you want your measurement and estimate to match exactly for as many samples as possible.

Matrix norms are an extension of vector norms to matrices.

K. Bergen (ICME) Applied Linear Algebra 32 / 140
Questions?

K. Bergen (ICME) Applied Linear Algebra 33 / 140


Mini-Quiz 4:
Determine whether each statement given below is True or False:

There exist matrices $A, B \ne 0$ such that $AB = 0$.

The trace is invariant under cyclic permutations:
$$\mathrm{tr}(ABC) = \mathrm{tr}(CAB) = \mathrm{tr}(BCA).$$

The definition of the induced norm allows us to use different vector norms on the numerator and denominator:
$$\|A\|_{\alpha, \beta} = \max_{\|x\|_\beta = 1} \|Ax\|_\alpha.$$

$\|A\|_{\infty, 1}$ is equal to the largest element of $A$ in absolute value.
K. Bergen (ICME) Applied Linear Algebra 34 / 140


Solutions (Q4):

Determine whether each statement given below is True or False:

There exist matrices $A, B \ne 0$ such that $AB = 0$.
TRUE: For example, when $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix}$.

The trace is invariant under cyclic permutations:
$$\mathrm{tr}(ABC) = \mathrm{tr}(CAB) = \mathrm{tr}(BCA).$$
TRUE
Use the property that $\mathrm{tr}(XY) = \mathrm{tr}(YX)$:
To show that $\mathrm{tr}(ABC) = \mathrm{tr}(CAB)$, let $X = AB$ and $Y = C$.
To show that $\mathrm{tr}(ABC) = \mathrm{tr}(BCA)$, let $X = A$ and $Y = BC$.
K. Bergen (ICME) Applied Linear Algebra 35 / 140


Solutions (Q4):

Determine whether each statement given below is True or False:

The definition of the induced norm allows us to use different vector norms on the numerator and denominator:
$$\|A\|_{\alpha, \beta} = \max_{\|x\|_\beta = 1} \|Ax\|_\alpha.$$

$\|A\|_{\infty, 1}$ is equal to the largest element of $A$ in absolute value.

TRUE
$$\|A\|_{\infty, 1} = \max_{\|x\|_1 = 1} \|Ax\|_\infty = \max_{\|x\|_1 = 1} \Big( \max_i |a_i^T x| \Big) = \max_{i,j} |a_{ij}|.$$
K. Bergen (ICME) Applied Linear Algebra 36 / 140


Special Matrices: Diagonal, Identity
Diagonal Matrix: a matrix is diagonal if the only non-zero entries are on the matrix diagonal.
$$D_{ij} = \begin{cases} d_i & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}, \qquad D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix}$$

Identity Matrix: the identity $I_n \in \mathbb{R}^{n \times n}$ is a diagonal matrix with 1s along the diagonal (denoted by $I$ when the size is unambiguous).
$$I_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}, \qquad I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$

$A I_n = I_m A = A$ for rectangular $A \in \mathbb{R}^{m \times n}$
$A I = I A = A$ for square $A \in \mathbb{R}^{n \times n}$
K. Bergen (ICME) Applied Linear Algebra 37 / 140
Special Matrices: Triangular

Triangular Matrix: a matrix is lower/upper triangular if all of the entries above/below the diagonal are zero:
$$L = \begin{bmatrix} l_{11} & & & \\ l_{21} & l_{22} & & \\ \vdots & \vdots & \ddots & \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{bmatrix}, \qquad U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ & u_{22} & \cdots & u_{2n} \\ & & \ddots & \vdots \\ & & & u_{nn} \end{bmatrix}$$

- The sum of two upper triangular matrices is upper triangular.
- The product of two upper triangular matrices is upper triangular.
- The inverse of an invertible upper triangular matrix is upper triangular.
K. Bergen (ICME) Applied Linear Algebra 38 / 140


Special Matrices: Symmetric, Orthogonal

Symmetric Matrix: a square matrix is symmetric if it is equal to its own transpose:
$$A = A^T \in \mathbb{R}^{n \times n}, \qquad A_{ij} = A_{ji} \ \text{for all } i, j$$

Orthogonal (Unitary) Matrix: a square matrix $U \in \mathbb{R}^{n \times n}$ is orthogonal (unitary, in the complex case) if it satisfies
$$U^T U = U U^T = I.$$

The columns of an orthogonal matrix $U$ are orthonormal.
K. Bergen (ICME) Applied Linear Algebra 39 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 40 / 140


Mini-Quiz 5:

Determine whether each statement given below is True or False:

For any $A \in \mathbb{R}^{n \times n}$, $A^T A$ is symmetric.

For any $A \in \mathbb{R}^{n \times n}$, $A - A^T$ is symmetric.

Invariance under unitary multiplication:
For any vector $x \in \mathbb{R}^m$ and unitary $Q \in \mathbb{R}^{m \times m}$, we have
$$\|Qx\|_2 = \|x\|_2$$

Let $D \in \mathbb{R}^{n \times n} = \mathrm{diag}(d_1, \dots, d_n)$ be any diagonal matrix; then $DA = AD$ for any matrix $A \in \mathbb{R}^{n \times n}$.
K. Bergen (ICME) Applied Linear Algebra 41 / 140


Solution (Q5):

Determine whether each statement given below is True or False:

For any $A \in \mathbb{R}^{n \times n}$, $A^T A$ is symmetric.
TRUE: $(A^T A)^T = A^T (A^T)^T = A^T A$

For any $A \in \mathbb{R}^{n \times n}$, $A - A^T$ is symmetric.
FALSE: For example, when $A = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$, $\ A - A^T = \begin{bmatrix} 0 & 2 \\ -2 & 0 \end{bmatrix}$.
K. Bergen (ICME) Applied Linear Algebra 42 / 140


Solution (Q5):
Invariance under unitary multiplication:
For any vector $x \in \mathbb{R}^m$ and unitary $Q \in \mathbb{R}^{m \times m}$, we have $\|Qx\|_2 = \|x\|_2$.
TRUE: $\|Qx\|_2^2 = (Qx)^T (Qx) = x^T Q^T Q x = x^T x = \|x\|_2^2$

Similarly, for the matrix 2-norm and Frobenius norm, we have:
$$\begin{aligned}
\|QA\|_2^2 &= \lambda_{\max}\big( (QA)^T (QA) \big) = \lambda_{\max}\big( A^T Q^T Q A \big) = \lambda_{\max}\big( A^T A \big) = \|A\|_2^2, \\
\|QA\|_F^2 &= \mathrm{tr}\big( (QA)^T (QA) \big) = \mathrm{tr}\big( A^T A \big) = \|A\|_F^2.
\end{aligned}$$
K. Bergen (ICME) Applied Linear Algebra 43 / 140
Solution (Q5):
Determine whether each statement given below is True or False:
Let $D \in \mathbb{R}^{n \times n} = \mathrm{diag}(d_1, \dots, d_n)$ be any diagonal matrix; then $DA = AD$ for any matrix $A \in \mathbb{R}^{n \times n}$.
FALSE
$DA$ rescales the rows of $A$, while $AD$ rescales the columns of $A$:
$$DA = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix} \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_n^T \end{bmatrix} = \begin{bmatrix} d_1 a_1^T \\ d_2 a_2^T \\ \vdots \\ d_n a_n^T \end{bmatrix}$$
$$AD = \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix} \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix} = \begin{bmatrix} d_1 a_1 & \cdots & d_n a_n \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 44 / 140


Vector Spaces

K. Bergen (ICME) Applied Linear Algebra 45 / 140


Linear Transformations

Linear transformation: $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ the following properties hold:
- $T(x + y) = T(x) + T(y)$
- $T(\alpha x) = \alpha\, T(x)$

Theorem: Let $A \in \mathbb{R}^{m \times n}$ be a matrix and define the function $T_A(x) = Ax$. Then $T_A : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation.
- Matrix multiplication is a linear function.

Theorem: Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation; then there exists a matrix $A$ such that $T(x) = Ax$.
- There is a matrix for every linear function.
K. Bergen (ICME) Applied Linear Algebra 46 / 140


Linear Combinations

Let $v_1, v_2, \dots, v_r \in \mathbb{R}^m$ be a set of vectors. Then any vector $v$ which can be written in the form
$$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r, \qquad \alpha_i \in \mathbb{R},\ i = 1, \dots, r$$
is a linear combination of the vectors $v_1, v_2, \dots, v_r$.

Span: the set of all linear combinations of vectors $v_1, \dots, v_r \in \mathbb{R}^m$ is called the span of $\{v_1, \dots, v_r\}$.
- $\mathrm{span}\{v_1, \dots, v_r\}$ is always a subspace of $\mathbb{R}^m$.
- If $S = \mathrm{span}\{v_1, \dots, v_r\}$, then $S$ is spanned by $v_1, v_2, \dots, v_r$.

A set $S$ is called a subspace if it is closed under vector addition and scalar multiplication:
- if $x, y \in S$ and $\alpha \in \mathbb{R}$, then $x + y \in S$ and $\alpha x \in S$.
K. Bergen (ICME) Applied Linear Algebra 47 / 140


Linear Independence

Linear dependence: a set of vectors $v_1, v_2, \dots, v_r$ is linearly dependent if there exists a set of scalars $\alpha_1, \alpha_2, \dots, \alpha_r \in \mathbb{R}$ with at least one $\alpha_i \ne 0$ such that
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r = 0$$
That is, a set of vectors is linearly dependent if one of the vectors in the set can be written as a linear combination of one or more other vectors in the set.

Linear independence: a set of vectors $v_1, v_2, \dots, v_r$ is linearly independent if it is not linearly dependent. That is,
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r = 0 \implies \alpha_1 = \cdots = \alpha_r = 0$$
K. Bergen (ICME) Applied Linear Algebra 48 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 49 / 140


Mini-Quiz 6:

Are the following sets of vectors linearly independent or linearly dependent?
1. $\{a, b, c\}$
2. $\{b, c, d\}$
3. $\{a, b, d\}$
4. $\{a, b, c, d\}$

$$a = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad b = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad c = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad d = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 50 / 140


Solution (Q6):

Are the following sets of vectors linearly independent or linearly dependent?
1. $\{a, b, c\}$: independent
2. $\{b, c, d\}$: dependent, since $b + 2c = d$
3. $\{a, b, d\}$: independent
4. $\{a, b, c, d\}$: dependent, since there are at most 3 linearly independent vectors in $\mathbb{R}^3$

$$a = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad b = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad c = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad d = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 51 / 140


Basis and Dimension

A basis for a subspace $S$ is a linearly independent set of vectors that spans $S$.
- The basis for a subspace $S$ is not unique, but all bases for the subspace contain the same number of vectors.
- For example, consider the case where $S = \mathbb{R}^2$:
$$\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\} \quad \text{and} \quad \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\} \quad \text{are both bases that span } \mathbb{R}^2.$$

The dimension of the subspace $S$, $\dim(S)$, is the number of linearly independent vectors in a basis for $S$ (in the example above, $\dim(S) = 2$).

Theorem (unique representation): if the vectors $v_1, \dots, v_n$ are a basis for a subspace $S$, then every vector in $S$ can be uniquely represented as a linear combination of these basis vectors.
K. Bergen (ICME) Applied Linear Algebra 52 / 140


Range and Nullspace

The range (column space, image) of a matrix $A \in \mathbb{R}^{m \times n}$, denoted by $R(A)$, is the set of all linear combinations of the columns of $A$:
$$R(A) := \{ Ax \mid x \in \mathbb{R}^n \}, \qquad R(A) \subseteq \mathbb{R}^m$$

The nullspace (kernel) of a matrix $A \in \mathbb{R}^{m \times n}$, denoted by $N(A)$, is the set of vectors $z$ such that $Az = 0$:
$$N(A) := \{ z \in \mathbb{R}^n \mid Az = 0 \}, \qquad N(A) \subseteq \mathbb{R}^n$$

The range and nullspace of $A^T$ are called the row space and left nullspace of $A$, respectively.
- These four subspaces are intrinsic to $A$ and do not depend on the choice of basis.
K. Bergen (ICME) Applied Linear Algebra 53 / 140


Rank of a Matrix

The column rank of $A$, denoted $\mathrm{rank}(A)$, is the dimension of $R(A)$.

The row rank of $A$ is the dimension of $R(A^T)$.

Rank of a matrix: the column rank of a matrix is always equal to the row rank, and therefore we refer to this value simply as the rank of $A$.

Matrix $A \in \mathbb{R}^{m \times n}$ is full rank if $\mathrm{rank}(A) = \min\{m, n\}$.
K. Bergen (ICME) Applied Linear Algebra 54 / 140


Fundamental Theorem of Linear Algebra

Fundamental theorem of linear algebra:

1. The nullspace of $A$ is the orthogonal complement of the row space:
$$N(A) = \big( R(A^T) \big)^\perp$$

2. The left nullspace of $A$ is the orthogonal complement of the column space:
$$N(A^T) = \big( R(A) \big)^\perp$$

Corollary (rank-nullity):
$$\dim(R(A)) + \dim(N(A)) = n, \qquad A \in \mathbb{R}^{m \times n}$$
K. Bergen (ICME) Applied Linear Algebra 55 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 56 / 140


Mini-Quiz 7:

Give the range $R(\cdot)$ and nullspace $N(\cdot)$ for the matrices
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{3 \times 2}, \qquad B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \in \mathbb{R}^{2 \times 3}$$
What are their dimensions ($\dim(R(\cdot))$ and $\dim(N(\cdot))$)?
K. Bergen (ICME) Applied Linear Algebra 57 / 140


Solution (Q7):
Give the range $R(\cdot)$ and nullspace $N(\cdot)$ for the matrices
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{3 \times 2}, \qquad B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \in \mathbb{R}^{2 \times 3}$$
What are their dimensions ($\dim(R(\cdot))$ and $\dim(N(\cdot))$)?
Solution:
$$R(A) = \left\{ \alpha \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix} \,\middle|\, \alpha, \beta \in \mathbb{R} \right\}, \qquad N(A) = \{ 0 \}$$
$$R(B) = \left\{ \alpha \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 1 \end{bmatrix} \,\middle|\, \alpha, \beta \in \mathbb{R} \right\} = \mathbb{R}^2, \qquad N(B) = \left\{ \gamma \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \,\middle|\, \gamma \in \mathbb{R} \right\}$$
$$\dim(R(A)) = 2, \quad \dim(N(A)) = 0, \qquad \dim(R(B)) = 2, \quad \dim(N(B)) = 1$$
K. Bergen (ICME) Applied Linear Algebra 58 / 140
Solving Linear Systems

K. Bergen (ICME) Applied Linear Algebra 59 / 140


Nonsingular Matrix
Theorem: a matrix $A \in \mathbb{R}^{m \times n}$ with $m \ge n$ has full rank if and only if it maps no two distinct vectors to the same vector, i.e.
$$Ax = Ay \implies x = y.$$

Singular matrix: a square matrix $A \in \mathbb{R}^{n \times n}$ that is not full rank.

Nonsingular matrix: a square matrix $A \in \mathbb{R}^{n \times n}$ of full rank.

If $A$ is nonsingular, we can uniquely express any vector $y \in \mathbb{R}^n$ as $y = Ax$ for some unique $x \in \mathbb{R}^n$. If we choose the vectors $x_i$ such that $e_i = A x_i$, $i = 1, \dots, n$, we can write
$$AX = A \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} = \begin{bmatrix} Ax_1 & \cdots & Ax_n \end{bmatrix} = \begin{bmatrix} e_1 & \cdots & e_n \end{bmatrix} = I$$
K. Bergen (ICME) Applied Linear Algebra 60 / 140


Matrix Inverse
Matrix inverse: the inverse of a square matrix $A \in \mathbb{R}^{n \times n}$ is the matrix $B \in \mathbb{R}^{n \times n}$ such that
$$AB = BA = I, \qquad \text{where } I \text{ is the identity matrix.}$$
- The inverse of $A$ is denoted by $A^{-1}$.
- Any square, nonsingular matrix $A$ has a unique inverse $A^{-1}$ satisfying $A A^{-1} = A^{-1} A = I$.
- $(AB)^{-1} = B^{-1} A^{-1}$ (assuming both inverses exist).
- $(A^T)^{-1} = (A^{-1})^T$

If $Ax = b$, then $A^{-1} b$ gives the vector of coefficients in the linear combination of the columns of $A$ that yields $b$:
$$b = Ax = x_1 a_1 + \cdots + x_n a_n, \qquad x = A^{-1} b = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 61 / 140


Invertible Matrix Theorem

Let $A \in \mathbb{R}^{n \times n}$ be a square matrix; then the following statements are equivalent:
- $A$ is invertible ($A$ is nonsingular).
- There exists a matrix $A^{-1}$ such that $A A^{-1} = I_n = A^{-1} A$.
- $Ax = b$ has exactly one solution for each $b \in \mathbb{R}^n$.
- $Az = 0$ has only the trivial solution $z = 0$, i.e. $N(A) = \{0\}$.
- The columns of $A$ are linearly independent.
- $A$ is full rank, i.e. $\mathrm{rank}(A) = n$.
- $\det(A) \ne 0$.
- $0$ is not an eigenvalue of $A$.
K. Bergen (ICME) Applied Linear Algebra 62 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 63 / 140


Mini-Quiz 8:

If $A$ and $B \in \mathbb{R}^{n \times n}$ are two invertible matrices, which of the following formulas are necessarily true?

- $A + B$ is invertible and $(A + B)^{-1} = A^{-1} + B^{-1}$.
- $(A + B)^2 = A^2 + 2AB + B^2$.
- $(A B A^{-1})^3 = A B^3 A^{-1}$.
- $A^{-1} B$ is invertible and $(A^{-1} B)^{-1} = B^{-1} A$.
K. Bergen (ICME) Applied Linear Algebra 64 / 140


Solutions (Q8):

If $A$ and $B \in \mathbb{R}^{n \times n}$ are two invertible matrices, which of the following formulas are necessarily true?

$A + B$ is invertible and $(A + B)^{-1} = A^{-1} + B^{-1}$.
FALSE: Let
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \quad A + B = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix};$$
$$A^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}, \quad B^{-1} = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}, \quad (A + B)^{-1} = \begin{bmatrix} \tfrac{2}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix} \ne A^{-1} + B^{-1}.$$

$(A + B)^2 = A^2 + 2AB + B^2$.
FALSE: $(A + B)^2 = A^2 + AB + BA + B^2$, and in general $AB \ne BA$, so the formula need not hold.
K. Bergen (ICME) Applied Linear Algebra 65 / 140


Solutions (Q8):
If $A$ and $B \in \mathbb{R}^{n \times n}$ are two invertible matrices, which of the following formulas are necessarily true?

$(A B A^{-1})^3 = A B^3 A^{-1}$. TRUE:
$$(A B A^{-1})^3 = (A B A^{-1})(A B A^{-1})(A B A^{-1}) = A B (A^{-1} A) B (A^{-1} A) B A^{-1} = A B^3 A^{-1}$$

$A^{-1} B$ is invertible and $(A^{-1} B)^{-1} = B^{-1} A$. TRUE:
$$(A^{-1} B)(B^{-1} A) = A^{-1} (B B^{-1}) A = A^{-1} A = I_n$$
K. Bergen (ICME) Applied Linear Algebra 66 / 140


Systems of Linear Equations

Example: Find values $x_1, x_2 \in \mathbb{R}$ that satisfy
$$2x_1 + x_2 = 7 \qquad \text{and} \qquad x_1 - 3x_2 = 7$$
(for example, if we want to find the intersection point of two lines).

This can be written as a matrix equation:
$$\begin{bmatrix} 2 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 7 \end{bmatrix} \qquad \text{or} \qquad Ax = b,$$
where $A = \begin{bmatrix} 2 & 1 \\ 1 & -3 \end{bmatrix}$, $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, and $b = \begin{bmatrix} 7 \\ 7 \end{bmatrix}$.
K. Bergen (ICME) Applied Linear Algebra 67 / 140


Linear Systems
One of the fundamental problems in linear algebra:
Given $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$, find $x \in \mathbb{R}^n$ such that
$$Ax = b.$$

The set of solutions of a linear system can behave in three ways:
1. The system has one unique solution.
2. The system has infinitely many solutions.
3. The system has no solution.

For a solution $x$ to exist, we must have $b \in R(A)$.
- If $b \in R(A)$ and $\mathrm{rank}(A) = n$, then there is a unique solution (usually square systems: $m = n$).
- If $b \in R(A)$ and $\mathrm{rank}(A) < n$ (i.e. $N(A) \ne \{0\}$), then there are infinitely many solutions (usually underdetermined systems: $m < n$).
- If $b \notin R(A)$, then there is no solution (usually overdetermined systems: $m > n$).
K. Bergen (ICME) Applied Linear Algebra 68 / 140


Solution Set of Linear Systems

[Figure omitted: diagram from http://oak.ucc.nau.edu/jws8/3equations3unknowns.html]
K. Bergen (ICME) Applied Linear Algebra 69 / 140
Gaussian Elimination (row reduction)

How do we solve $Ax = b$, $A \in \mathbb{R}^{n \times n}$?

For some types of matrices, $Ax = b$ is easy to solve:
- Diagonal matrices: the equations are independent, so $x_i = b_i / a_{ii}$ for all $i$.
- Upper/lower triangular matrices: back/forward substitution.

Gaussian Elimination: transform the system $Ax = b$ into an upper triangular system $Ux = y$ using elementary row operations.
The following elementary row operations can be applied to the augmented system $[\, A \mid b \,]$ to introduce zeros below the diagonal without changing the solution set:
1. Multiply a row by a nonzero scalar.
2. Add scalar multiples of one row to another.
3. Permute rows.
K. Bergen (ICME) Applied Linear Algebra 70 / 140
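The elimination-plus-back-substitution recipe is short to write out; a minimal sketch without pivoting (so it assumes no zero pivots are encountered):

```python
import numpy as np

def gaussian_solve(A, b):
    """Solve Ax = b by Gaussian elimination and back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    # forward elimination: introduce zeros below the diagonal
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]      # multiplier for the row operation
            A[i, k:] -= m * A[k, k:]   # row_i <- row_i - m * row_k
            b[i] -= m * b[k]
    # back substitution on the upper triangular system Ux = y
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[2.0, 1.0], [1.0, -3.0]])
b = np.array([7.0, 7.0])
print(gaussian_solve(A, b))   # [ 4. -1.], matching the earlier 2x2 example
```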


Step 1: For $A \in \mathbb{R}^{3 \times 3}$, $b \in \mathbb{R}^3$, form the augmented system
$$[\, A \mid b \,] = \begin{bmatrix} \times & \times & \times & \times \\ \times & \times & \times & \times \\ \times & \times & \times & \times \end{bmatrix}$$

Step 2: Compute the row echelon form using row operations:
$$\xrightarrow{L_1} \begin{bmatrix} \times & \times & \times & \times \\ 0 & + & + & + \\ 0 & + & + & + \end{bmatrix} \xrightarrow{L_2} \begin{bmatrix} \times & \times & \times & \times \\ 0 & + & + & + \\ 0 & 0 & + & + \end{bmatrix} = [\, U \mid y \,], \qquad Ux = y$$

Step 3: Solve the triangular system $Ux = y$ by back substitution:
$$x = U^{-1} y$$
K. Bergen (ICME) Applied Linear Algebra 71 / 140
Questions?

K. Bergen (ICME) Applied Linear Algebra 72 / 140


Mini-Quiz 9:
Determine the lower triangular matrices $L_i$ that correspond to the following elementary row operations:

$L_1$ such that $L_1 A = A_1$:
$$L_1 \begin{bmatrix} 2 & 8 & 4 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix}$$

$L_2$ such that $L_2 A_1 = A_2$:
$$L_2 \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix}$$

$L_3$ such that $L_3 A_2 = A_3$:
$$L_3 \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 0 & -6 & -9 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 73 / 140
Solution (Q9):
$$L_1 A = A_1 : \quad \begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 8 & 4 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix}$$
$$L_2 A_1 = A_2 : \quad \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix}$$
$$L_3 A_2 = A_3 : \quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 0 & -6 & -9 \end{bmatrix}$$

Combining, $L_3 L_2 L_1$ is such that $L_3 L_2 L_1 A = A_3$:
$$\begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ -1 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 8 & 4 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 0 & -6 & -9 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 74 / 140


Gaussian Elimination and LU Decomposition

We can think of the elementary row operations applied to $A$ as linear transformations, which can be represented as a series of lower triangular matrices $L_1, \dots, L_{n-1}$:
$$L_{n-1} \cdots L_1 A = U \qquad \text{or} \qquad L^{-1} A = U, \quad L^{-1} = L_{n-1} \cdots L_1;$$
rearranging yields $A = LU$.

If we can factor $A$ as $A = LU$, then we can solve the system $Ax = b$ by solving two triangular systems:
- Solve $Ly = b$ using forward substitution,
- Solve $Ux = y$ using backward substitution.

Gaussian elimination implicitly computes the LU factorization.
K. Bergen (ICME) Applied Linear Algebra 75 / 140
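In practice one calls a library LU routine rather than re-deriving the factors; a minimal sketch using SciPy (the right-hand side $b$ is illustrative):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0,  8.0,  4.0],
              [2.0,  5.0,  1.0],
              [4.0, 10.0, -1.0]])   # the matrix from Mini-Quiz 9
b = np.array([1.0, 2.0, 3.0])

lu, piv = lu_factor(A)         # computes PA = LU with partial pivoting
x = lu_solve((lu, piv), b)     # two triangular solves: Ly = Pb, then Ux = y
print(np.allclose(A @ x, b))   # True
```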


Partial Pivoting

Pure Gaussian elimination cannot be used to solve general linear systems, because we may encounter a zero on the diagonal during the process, e.g.
$$A = \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix}$$
In practical implementations of the algorithm, the rows of the matrix must be permuted to avoid division by zero.
Partial pivoting swaps rows if an entry below the diagonal of the current column is larger in absolute value than the current diagonal entry (the pivot element).
K. Bergen (ICME) Applied Linear Algebra 76 / 140


Permutations
The reordering, or permutation, of the rows in partial pivoting can be represented as a linear transformation.
Permutation matrix: a square, binary matrix $P \in \mathbb{R}^{n \times n}$ that has exactly one entry 1 in each row and each column.
- Left-multiplication of a matrix $A$ by a permutation matrix reorders the rows of $A$, while right-multiplication reorders the columns of $A$.

For example, let
$$P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}, \qquad A = \begin{bmatrix} 1 & 2 & 3 \\ 10 & 20 & 30 \\ 100 & 200 & 300 \end{bmatrix};$$
then the row and column permutations are, respectively:
$$PA = \begin{bmatrix} 10 & 20 & 30 \\ 100 & 200 & 300 \\ 1 & 2 & 3 \end{bmatrix}, \qquad AP = \begin{bmatrix} 3 & 1 & 2 \\ 30 & 10 & 20 \\ 300 & 100 & 200 \end{bmatrix}.$$
K. Bergen (ICME) Applied Linear Algebra 77 / 140


LU Decomposition

Let A Rnn be a square matrix, then the LU decomposition


factors A into the product of a lower triangular matrix L Rnn
and an upper triangular matrix U Rnn

A = LU.

When partial pivoting is used to permute the rows, we obtain a


decomposition for the form Ln1 Pn1 L1 P1 A = U , which can
be written3 in form: L1 P A = U .

In general, any square matrix A Rnn (singular or nonsingular)


has a factorization

P A = LU, where P is a permutation matrix.

3
see Trefethen and Bau: Chapter 21 for details.
K. Bergen (ICME) Applied Linear Algebra 78 / 140
Cholesky and Positive Definiteness
Positive definite matrix: a matrix $A \in \mathbb{R}^{n \times n}$ is positive definite if
$$z^T A z > 0 \quad \text{for every nonzero } z \in \mathbb{R}^n,$$
often denoted by $A \succ 0$.

Example:
$$A = \begin{bmatrix} 4 & -2 \\ -2 & 10 \end{bmatrix}$$
$$x^T A x = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 4 & -2 \\ -2 & 10 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 4x_1^2 - 4x_1 x_2 + 10x_2^2 = (2x_1 - x_2)^2 + 9x_2^2 > 0 \quad \forall x \ne 0$$

- Positive semi-definite matrix: $z^T A z \ge 0$ for every $z \in \mathbb{R}^n$.
- Negative definite matrix: $z^T A z < 0$ for every nonzero $z \in \mathbb{R}^n$.
K. Bergen (ICME) Applied Linear Algebra 79 / 140


... Cholesky and Positive Definiteness
A symmetric matrix $A$ is positive definite ($A \succ 0$) if and only if there exists a unique lower triangular matrix $L$ with positive diagonal entries such that
$$A = L L^T.$$
This factorization is called the Cholesky decomposition of $A$.

The Cholesky algorithm, a modified version of Gaussian elimination, is used to compute the factor $L$.
- Basic idea: in Gaussian elimination we use row operations to introduce zeros below the diagonal in each column. To maintain symmetry, we can apply the same operations to the columns of the matrix to introduce zeros in the first row:
$$L_1 A = \begin{bmatrix} \times & \times & \times \\ 0 & + & + \\ 0 & + & + \end{bmatrix} = A_1, \qquad A_1 L_1^T = \begin{bmatrix} \times & 0 & 0 \\ 0 & + & + \\ 0 & + & + \end{bmatrix} = A_2 = L_1 A L_1^T$$
K. Bergen (ICME) Applied Linear Algebra 80 / 140
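For solving $Ax = b$ with a symmetric positive definite $A$, the Cholesky factor replaces the LU pair; a minimal sketch reusing the positive definite example above (the right-hand side is illustrative):

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[ 4.0, -2.0],
              [-2.0, 10.0]])                 # positive definite example above
b = np.array([1.0, 2.0])

L = np.linalg.cholesky(A)                    # lower triangular L with A = L L^T
y = solve_triangular(L, b, lower=True)       # forward substitution: L y = b
x = solve_triangular(L.T, y, lower=False)    # back substitution: L^T x = y
print(np.allclose(A @ x, b))                 # True
```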
Questions?

K. Bergen (ICME) Applied Linear Algebra 81 / 140


Mini-Quiz 10:

Quadratic forms: a function q(x) : Rn R if it is a linear


combination of functions of the form xi xj . A quadratic form can
be written as

q(x) = xT Ax, where A Rnn is symmetric.

Let B Rmn . Show that q(x) = kBxk2 is a quadratic form, find


A such that q(x) = xT Ax, and determine the definiteness of A.

Given A, B, find the permutation matrix P such that P BP = A:



1 2 3 1 3 2
A = 4 5 6 , B = 7 9 8
7 8 9 4 6 5

K. Bergen (ICME) Applied Linear Algebra 82 / 140


Solution (Q10):

Let $B \in \mathbb{R}^{m \times n}$. Show that $q(x) = \|Bx\|^2$ is a quadratic form, find $A$ such that $q(x) = x^T A x$, and determine the definiteness of $A$.
Solution:
$$q(x) = (Bx)^T (Bx) = x^T B^T B x = x^T A x, \qquad \text{where } A = B^T B.$$
$A$ is positive semi-definite, since $q(x) = \|Bx\|^2 \ge 0$ for all $x \in \mathbb{R}^n$.

Given $A, B$, find the permutation matrix $P$ such that $P B P = A$:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 3 & 2 \\ 7 & 9 & 8 \\ 4 & 6 & 5 \end{bmatrix}, \quad \text{and} \quad P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 83 / 140


Orthogonalization

K. Bergen (ICME) Applied Linear Algebra 84 / 140


Orthogonal (Unitary) Matrices
Recall that two vectors $x, y \in \mathbb{R}^n$ are orthogonal if $x^T y = 0$, and a set of vectors $\{u_1, \dots, u_n\}$ is orthonormal if
$$u_i^T u_j = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \ne j \end{cases}, \qquad \|u_i\| = 1 \ \forall i$$

Orthogonal (unitary) matrix: a matrix $Q \in \mathbb{R}^{n \times n}$ is orthogonal if its columns $q_1, \dots, q_n \in \mathbb{R}^n$ form an orthonormal set:
$$Q^T Q = \begin{bmatrix} q_1^T \\ \vdots \\ q_n^T \end{bmatrix} \begin{bmatrix} q_1 & \cdots & q_n \end{bmatrix} = I_n$$
- If $Q$ is orthogonal, then $Q^{-1} = Q^T$ (the inverse is easy to compute!).
- Orthogonal transformations preserve lengths and angles, and represent rotations or reflections:
$$\|Qx\|_2 = \sqrt{x^T Q^T Q x} = \sqrt{x^T I x} = \|x\|_2$$
K. Bergen (ICME) Applied Linear Algebra 85 / 140
Projectors

Recall our notation for the projection of vector $v$ onto $u$:
$$\mathrm{proj}_u v = \Big( \frac{u^T v}{\|u\|} \Big) \frac{u}{\|u\|}$$
We can generalize this idea to projections onto subspaces of $\mathbb{R}^n$.

Projector: a square matrix $P \in \mathbb{R}^{n \times n}$ is a projector if $P = P^2$.
- For $v \in \mathbb{R}^n$, $Pv$ represents the projection of $v$ into the range of $P$.
- If $v \in R(P)$ (i.e. $\exists x : v = Px$), then $Pv = P^2 x = Px = v$.
K. Bergen (ICME) Applied Linear Algebra 86 / 140


Orthogonal Projectors

If $x \in \mathbb{R}^m$ and $A \in \mathbb{R}^{m \times n}$ is a matrix whose columns span a subspace of $\mathbb{R}^m$, then $x$ can be decomposed as
$$x = x_\parallel + x_\perp, \qquad \text{where } x_\parallel \in R(A) \text{ and } x_\perp \in R(A)^\perp.$$

$x_\parallel$ is the orthogonal projection of $x$ onto $R(A)$.
- $x_\parallel$ is the vector in $R(A)$ closest to $x$, in the sense that
$$\|x - x_\parallel\| < \|x - v\| \quad \forall v \in R(A),\ v \ne x_\parallel.$$
K. Bergen (ICME) Applied Linear Algebra 87 / 140


... Orthogonal Projectors

An orthogonal projector is a projector for which the range and nullspace are orthogonal complements.
- An orthogonal projector satisfies $P^T = P$.
- If $u \in \mathbb{R}^n$ is a unit vector, then $P_u = u u^T$ is the orthogonal projection onto the line containing $u$.
- If $u_1, \dots, u_k$ are an orthonormal basis for a subspace $V$ and $A = [u_1 \cdots u_k]$, then the projection onto $V$ is
$$P_A = A A^T,$$
or equivalently:
$$\mathrm{proj}_A x = P_A x = (u_1^T x)\, u_1 + \cdots + (u_k^T x)\, u_k = \mathrm{proj}_{u_1} x + \cdots + \mathrm{proj}_{u_k} x$$
K. Bergen (ICME) Applied Linear Algebra 88 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 89 / 140


Mini-Quiz 11:

Show that if $P$ is an orthogonal projector, then $I - 2P$ is orthogonal.
Hint: $P^T = P$ for orthogonal projectors and $P^2 = P$ for all projectors. $Q$ is orthogonal if $Q^T Q = I$.

Find the matrix for the orthogonal projector $P_A$ onto $R(A)$:
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}.$$
Hint: The columns of $A$ are orthogonal, but not orthonormal, because the first column does not have length 1. The projection onto $R(Q)$ is $P_Q = Q Q^T$ if the columns of $Q$ are orthonormal.
K. Bergen (ICME) Applied Linear Algebra 90 / 140


Solution (Q11):
Show that if $P$ is an orthogonal projector, then $I - 2P$ is orthogonal.
Solution:
$$\begin{aligned}
(I - 2P)^T (I - 2P) &= I - 2P^T - 2P + 4 P^T P \\
&= I - 4P + 4P^2 \qquad (P^T = P) \\
&= I - 4P + 4P \qquad\ (P^2 = P) \\
&= I
\end{aligned}$$

Find the orthogonal projector $P_A$ onto $R(A)$, for $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}$:
$$R(A) = R(Q), \qquad Q = \begin{bmatrix} \tfrac{1}{\sqrt{2}} & 0 \\ 0 & 1 \\ \tfrac{1}{\sqrt{2}} & 0 \end{bmatrix}$$
$$P_A = P_Q = Q Q^T = \begin{bmatrix} \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ 0 & 1 & 0 \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 91 / 140
Gram-Schmidt Orthogonalization
Gram-Schmidt algorithm: a method for computing an orthonormal basis $\{q_1, \dots, q_n\}$ for the columns of $A = [a_1, \dots, a_n] \in \mathbb{R}^{m \times n}$.
Each vector is orthogonalized with respect to the previous vectors and then normalized:
$$\begin{aligned}
v_1 &= a_1, & q_1 &= v_1 / \|v_1\| \\
v_2 &= a_2 - \mathrm{proj}_{q_1} a_2, & q_2 &= v_2 / \|v_2\| \\
&\ \ \vdots & &\ \ \vdots \\
v_k &= a_k - \sum_{i=1}^{k-1} \mathrm{proj}_{q_i} a_k, & q_k &= v_k / \|v_k\|
\end{aligned}$$

In matrix form: $A = QR$, where $Q = [\, q_1, \dots, q_n \,]$ has orthonormal columns, and
$$R_{ij} = \begin{cases} q_i^T a_j & \text{if } i \le j \\ 0 & \text{if } i > j \end{cases}$$
K. Bergen (ICME) Applied Linear Algebra 92 / 140
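The algorithm translates almost line-for-line into code; a minimal sketch of classical Gram-Schmidt producing the reduced factorization (it assumes $A$ has full column rank and ignores the numerical-stability issues of the classical variant):

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt: returns Q (orthonormal columns) and R."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].copy()
        for i in range(k):
            R[i, k] = Q[:, i] @ A[:, k]   # R_ij = q_i^T a_j for i <= j
            v -= R[i, k] * Q[:, i]        # subtract proj_{q_i} a_k
        R[k, k] = np.linalg.norm(v)
        Q[:, k] = v / R[k, k]             # normalize v_k
    return Q, R

A = np.random.rand(5, 3)
Q, R = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: columns are orthonormal
print(np.allclose(Q @ R, A))             # True: A = QR
```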


QR Decomposition
In the process of orthogonalizing the columns of a matrix $A$, the Gram-Schmidt algorithm computes the QR decomposition of $A$.

QR decomposition: any matrix $A \in \mathbb{R}^{m \times n}$ can be decomposed as
$$A = QR,$$
where $Q \in \mathbb{R}^{m \times m}$ is an orthogonal matrix and $R \in \mathbb{R}^{m \times n}$ is an upper triangular matrix.

For rectangular $A \in \mathbb{R}^{m \times n}$, $m \ge n$:
$$A = QR = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_1 \\ 0 \end{bmatrix} = Q_1 R_1,$$
where $R_1 \in \mathbb{R}^{n \times n}$ is upper triangular, and $Q_1 \in \mathbb{R}^{m \times n}$ and $Q_2 \in \mathbb{R}^{m \times (m-n)}$ both have orthonormal columns.
K. Bergen (ICME) Applied Linear Algebra 93 / 140


QR Decomposition for Solving Linear Systems

Consider the linear system $Ax = b$, where $A \in \mathbb{R}^{n \times n}$ is a square, nonsingular matrix and $b \in \mathbb{R}^n$.
- The QR decomposition expresses the matrix $A = QR$ as the product of an orthogonal matrix $Q$ ($Q Q^T = I_n$) and an upper triangular matrix $R$:
$$Ax = b \implies QRx = b \implies Rx = Q^T b$$
- $y = Q^T b$ can be easily computed, and $Rx = y$ is a triangular system that can be solved by back substitution.
K. Bergen (ICME) Applied Linear Algebra 94 / 140
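A minimal sketch of this recipe in NumPy, reusing the $2 \times 2$ system from the earlier example:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, -3.0]])
b = np.array([7.0, 7.0])

Q, R = np.linalg.qr(A)       # A = QR
y = Q.T @ b                  # y = Q^T b
x = np.linalg.solve(R, y)    # solve the triangular system R x = y
print(x)                     # [ 4. -1.]
```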


Questions?

K. Bergen (ICME) Applied Linear Algebra 95 / 140


Mini-Quiz 12:

Use Gram-Schmidt to compute an orthonormal basis for the columns of $A = [\, a_1 \ a_2 \ a_3 \,]$.

Hint: All three columns are normalized to length 1. The first and second columns are orthogonal, and the second and third columns are orthogonal. The projection operator for normalized vectors is $\mathrm{proj}_u v = (u^T v)\, u$.
K. Bergen (ICME) Applied Linear Algebra 96 / 140


Solution (Q12):
Use Gram-Schmidt to compute an orthonormal basis for the columns of $A = [\, a_1 \ a_2 \ a_3 \,]$.

Since $a_1$ and $a_2$ are already orthonormal, the first two basis vectors require no work:
$$q_1 = a_1, \qquad q_2 = a_2$$
Since $a_2 \perp a_3$, the projection of $a_3$ onto $q_2$ vanishes, so only one projection must be subtracted:
$$v_3 = a_3 - \mathrm{proj}_{q_1} a_3 - \mathrm{proj}_{q_2} a_3 = a_3 - (a_1^T a_3)\, a_1, \qquad q_3 = \frac{v_3}{\|v_3\|}$$
K. Bergen (ICME) Applied Linear Algebra 97 / 140


Least Squares Problems

K. Bergen (ICME) Applied Linear Algebra 98 / 140


Least Squares

We want to solve $Ax = b$, but what do we do if $b \notin R(A)$ (i.e. there does not exist a vector $x$ such that $Ax = b$)?
- For example, let $A \in \mathbb{R}^{m \times n}$, $m > n$, be a tall-and-skinny (overdetermined) matrix. Then for most $b \in \mathbb{R}^m$, there is no solution $x \in \mathbb{R}^n$ such that $Ax = b$.

Least squares problem: define the residual $r = b - Ax$ and find the vector $x$ that minimizes
$$\|r\|_2^2 = \|b - Ax\|_2^2.$$
K. Bergen (ICME) Applied Linear Algebra 99 / 140


... Least Squares

We can decompose any vector $b \in \mathbb{R}^m$ into components $b = b_1 + b_2$, with $b_1 \in R(A)$ and $b_2 \in N(A^T)$.

Since $b_2$ is in the orthogonal complement of $R(A)$, we obtain the following expression for the residual norm:
$$\|r\|_2^2 = \|b_1 - Ax + b_2\|_2^2 = \|b_1 - Ax\|_2^2 + \|b_2\|_2^2,$$
which is minimized when $Ax = b_1$ and $r = b_2 \in N(A^T)$.
K. Bergen (ICME) Applied Linear Algebra 100 / 140


Normal Equations

The least squares solution $x$ occurs when $r = b_2 \in N(A^T)$, or equivalently $A^T r = A^T (b - Ax) = 0$.

Rearranging this expression gives the normal equations:
$$A^T A x = A^T b$$
- If $A$ has full column rank, then $\tilde{A} = A^T A$ is invertible. Thus $\tilde{A} x = \tilde{b}$ (where $\tilde{b} = A^T b$) has a unique solution $x$.
K. Bergen (ICME) Applied Linear Algebra 101 / 140


Least Squares and Orthogonal Projections
If $x^\star \in \mathbb{R}^n$ is the least squares solution to $Ax = b$, then $A x^\star$ is the orthogonal projection of $b$ onto $R(A)$:
$$\begin{aligned}
x^\star = \arg\min_x \|b - Ax\| &\iff \|b - A x^\star\| \le \|b - Ax\| \ \forall x \in \mathbb{R}^n \\
&\implies A x^\star = P_A b \qquad (P_A \text{ is the projection onto } R(A)) \\
&\implies b - A x^\star \in R(A)^\perp = N(A^T) \\
&\implies A^T (b - A x^\star) = 0 \\
&\implies A^T A x^\star = A^T b
\end{aligned}$$

Combining the solution $x^\star = (A^T A)^{-1} A^T b$ and $A x^\star = P_A b$ gives the matrix for the orthogonal projection onto $R(A)$:
$$P_A = A (A^T A)^{-1} A^T$$
K. Bergen (ICME) Applied Linear Algebra 102 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 103 / 140


Mini-Quiz 13
Find the least squares solution of
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$
by solving $Ax = b_1$ or $A^T A x = A^T b$.
Hints:
$$R(A) = \left\{ \alpha \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \,\middle|\, \alpha, \beta \in \mathbb{R} \right\}, \qquad N(A^T) = \left\{ \gamma \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \,\middle|\, \gamma \in \mathbb{R} \right\},$$
$$b = 1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 1.5 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + 0.5 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 104 / 140


Solution (Q13):
Find the least squares solution of
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}.$$
Solution:
$$b = b_1 + b_2, \qquad b_1 = \begin{bmatrix} 1 \\ 1.5 \\ 1.5 \end{bmatrix}, \quad b_2 = \begin{bmatrix} 0 \\ 0.5 \\ -0.5 \end{bmatrix}, \qquad A^T A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad A^T b = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$$
$$Ax = b_1 \implies \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1.5 \\ 1.5 \end{bmatrix} \implies \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1.5 \end{bmatrix}$$
$$A^T A x = A^T b \implies \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \implies \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1.5 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 105 / 140
Least Squares via QR

The normal equations $A^T A x = A^T b$ can be rewritten using the QR factorization $A = QR$:
$$\begin{aligned}
A^T A x &= A^T b \\
R^T Q^T Q R x &= R^T Q^T b \\
R^T R x &= R^T Q^T b \\
R x &= Q^T b
\end{aligned}$$

Solution using the QR decomposition:
- compute the QR factorization of $A$: $A = QR$
- form $y = Q^T b$
- solve $Rx = y$ by back substitution
K. Bergen (ICME) Applied Linear Algebra 106 / 140
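A minimal sketch comparing the QR route with NumPy's built-in least-squares solver, on the system from Mini-Quiz 13:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])
b = np.array([1.0, 2.0, 1.0])

Q, R = np.linalg.qr(A)                         # reduced QR: Q is 3x2, R is 2x2
x_qr = np.linalg.solve(R, Q.T @ b)

x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)   # library least-squares routine

print(x_qr, x_ls)    # both give [1.  1.5], matching Mini-Quiz 13
```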


Least Squares via Cholesky

$M = A^T A$ is symmetric and positive definite (when $A$ has full column rank), and can be rewritten in terms of the Cholesky factorization $M = L L^T$.

Solution using the Cholesky factorization:
- form $M = A^T A$
- compute the Cholesky factorization of $M$: $M = L L^T$
- form $y = A^T b$
- solve $Lz = y$ by forward substitution
- solve $L^T x = z$ by back substitution
K. Bergen (ICME) Applied Linear Algebra 107 / 140


Matrix Calculus

Vector-by-scalar and scalar-by-vector derivatives:
$$\frac{\partial x}{\partial \alpha} = \begin{bmatrix} \frac{\partial x_1}{\partial \alpha} \\ \frac{\partial x_2}{\partial \alpha} \\ \vdots \\ \frac{\partial x_n}{\partial \alpha} \end{bmatrix}, \qquad \frac{\partial \alpha}{\partial x} = \begin{bmatrix} \frac{\partial \alpha}{\partial x_1} & \frac{\partial \alpha}{\partial x_2} & \cdots & \frac{\partial \alpha}{\partial x_n} \end{bmatrix}, \qquad \alpha \in \mathbb{R},\ x \in \mathbb{R}^n$$

Vector-by-vector derivatives:
$$\frac{\partial y}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_2}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\ \frac{\partial y_1}{\partial x_2} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_2} \\ \vdots & \vdots & & \vdots \\ \frac{\partial y_1}{\partial x_n} & \frac{\partial y_2}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}, \qquad y \in \mathbb{R}^m,\ x \in \mathbb{R}^n$$
K. Bergen (ICME) Applied Linear Algebra 108 / 140


... Matrix Calculus⁴

Vector-by-vector identities:
$$\frac{\partial (x^T x)}{\partial x} = 2x, \qquad \frac{\partial (b^T x)}{\partial x} = b, \qquad \frac{\partial (Ax)}{\partial x} = A^T$$
$$\frac{\partial (x^T A x)}{\partial x} = (A + A^T)\, x = 2Ax \ \text{ if } A = A^T$$

Derivative of a quadratic:
$$\frac{\partial}{\partial x} \|b - Ax\|_2^2 = \frac{\partial}{\partial x} \big( x^T A^T A x - 2 b^T A x + b^T b \big) = 2 \big( A^T A x - A^T b \big)$$
- Setting this expression equal to zero gives the normal equations:
$$\frac{\partial}{\partial x} \|b - Ax\|_2^2 = 0 \implies A^T A x = A^T b$$

⁴ see http://en.wikipedia.org/wiki/Matrix_calculus for more properties.
K. Bergen (ICME) Applied Linear Algebra 109 / 140
Eigenvalues and Eigenvectors

K. Bergen (ICME) Applied Linear Algebra 110 / 140


Eigenvalues and Eigenvectors

For any square matrix $A \in \mathbb{R}^{n \times n}$, there is at least one scalar $\lambda$ (possibly complex) and a corresponding vector $v \ne 0$ such that
$$Av = \lambda v \qquad \text{or equivalently} \qquad (A - \lambda I)\, v = 0.$$

The scalar $\lambda$ is an eigenvalue of $A$, and $v$ is an eigenvector corresponding to the eigenvalue $\lambda$.
K. Bergen (ICME) Applied Linear Algebra 111 / 140


Geometric Interpretation of Eigenvectors⁵

[Figure: Under the transformation matrix $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, the directions of vectors parallel to $v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ (blue) and $v_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ (purple) are preserved.]

⁵ diagram from wikipedia.org/wiki/Eigenvalues_and_eigenvectors
K. Bergen (ICME) Applied Linear Algebra 112 / 140
... Eigenvalues and Eigenvectors

The set of distinct eigenvalues of $A$ is called the spectrum of $A$ and is denoted $\lambda(A)$:
$$\lambda(A) := \{ \lambda \in \mathbb{R} \mid A - \lambda I \text{ is singular} \}$$

The magnitude of the largest eigenvalue in absolute value is called the spectral radius of $A$:
$$\rho(A) = \max_{\lambda_i \in \lambda(A)} |\lambda_i|.$$

The set of all eigenvectors associated with an eigenvalue $\lambda$ forms the eigenspace, $E_\lambda = N(A - \lambda I)$, associated with $\lambda$.
K. Bergen (ICME) Applied Linear Algebra 113 / 140


Similar Matrices

Two matrices $A$ and $B$ are similar if $B = P^{-1} A P$ for some nonsingular matrix $P$; similar matrices share the same eigenvalues.
- Similar matrices represent the same linear transformation under two different bases. $P$ is called a similarity transformation or change of basis matrix.
- $A$ and $B$ do not in general have the same eigenvectors: if $v$ is an eigenvector of $A$, then $P^{-1} v$ is an eigenvector of $B$.

Given a square matrix $A \in \mathbb{R}^{n \times n}$, we wish to reduce it to its simplest form by means of a similarity transformation.
K. Bergen (ICME) Applied Linear Algebra 114 / 140


Diagonalizable Matrices

Diagonalizable matrix: a matrix $A \in \mathbb{R}^{n \times n}$ that is similar to a diagonal matrix, i.e. there exists an invertible matrix $P$ such that
$$P^{-1} A P = D, \qquad \text{where } D \text{ is diagonal.}$$

A matrix is diagonalizable if and only if it has $n$ linearly independent eigenvectors.
- Matrices with $n$ distinct eigenvalues are diagonalizable.

Real symmetric matrices are diagonalizable by unitary matrices.
- Matrices are diagonalizable by unitary matrices if and only if they are normal (a matrix is normal if it satisfies $A^T A = A A^T$).
K. Bergen (ICME) Applied Linear Algebra 115 / 140


Eigendecomposition

Let $A \in \mathbb{R}^{n \times n}$ be a square, diagonalizable matrix; then $A$ can be factorized as
$$A = V \Lambda V^{-1},$$
where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ is a diagonal matrix whose elements are the eigenvalues of $A$, and $V$ is a square matrix whose $i$th column $v_i$ is the eigenvector corresponding to the eigenvalue $\Lambda_{ii} = \lambda_i$.

- Note: this decomposition does not exist for every square matrix $A$; e.g. $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ cannot be diagonalized.
K. Bergen (ICME) Applied Linear Algebra 116 / 140


Spectral Theorem

Any symmetric matrix $A = A^T \in \mathbb{R}^{n \times n}$ has $n$ (not necessarily distinct) real eigenvalues, and $A$ can be decomposed with the symmetric eigenvalue decomposition:
$$A = \sum_{i=1}^n \lambda_i u_i u_i^T = U \Lambda U^T,$$
where $U$ is orthogonal and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ is diagonal.
K. Bergen (ICME) Applied Linear Algebra 117 / 140


Properties of Eigenvalues

The trace of $A \in \mathbb{R}^{n \times n}$ is equal to the sum of its eigenvalues:
$$\mathrm{tr}(A) = \sum_{i=1}^n \lambda_i$$

If $A \in \mathbb{R}^{n \times n}$ is diagonalizable, the rank of $A$ is equal to the number of non-zero eigenvalues of $A$.

If $A \in \mathbb{R}^{n \times n}$ is non-singular with eigenvalue $\lambda_i \ne 0$, then $\tfrac{1}{\lambda_i}$ is an eigenvalue of $A^{-1}$.

If $A \in \mathbb{R}^{n \times n}$ is symmetric positive definite ($A = A^T$ and $z^T A z > 0$ for all nonzero $z \in \mathbb{R}^n$), then all of its eigenvalues are positive.
K. Bergen (ICME) Applied Linear Algebra 118 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 119 / 140


Mini-Quiz 14

Show that if $\lambda$ is an eigenvalue of a nonsingular matrix $A$ with corresponding eigenvector $v$, then $v$ is also an eigenvector of $A^{-1}$ with eigenvalue $\tfrac{1}{\lambda}$.

Hint: we know that $Av = \lambda v$, and we want to show that $A^{-1} v = \tfrac{1}{\lambda} v$.
K. Bergen (ICME) Applied Linear Algebra 120 / 140


Solution (Q14)

Show that if $\lambda$ is an eigenvalue of a nonsingular matrix $A$ with corresponding eigenvector $v$, then $v$ is also an eigenvector of $A^{-1}$ with eigenvalue $\tfrac{1}{\lambda}$.
$$\begin{aligned}
Av &= \lambda v \\
A^{-1}(Av) &= A^{-1}(\lambda v) \\
v &= \lambda\, (A^{-1} v) \\
\tfrac{1}{\lambda}\, v &= A^{-1} v
\end{aligned}$$
(Note that $\lambda \ne 0$, since $A$ is nonsingular.)
K. Bergen (ICME) Applied Linear Algebra 121 / 140


Solving Eigenvalue Problems

Solving the eigenvalue problem $Av = \lambda v$ is a fundamentally more difficult problem than solving a linear system $Ax = b$.

Solving the eigenvalue problem for an $n \times n$ matrix can be reduced to finding the roots of an $n$th degree polynomial. For $n \ge 5$, no closed-form expression exists for the roots of an arbitrary $n$th degree polynomial in terms of its coefficients.

All algorithms that solve eigenvalue problems for matrices of arbitrary size are iterative. This contrasts with linear systems, which have algorithms that are guaranteed to produce the solution in a finite number of steps.
K. Bergen (ICME) Applied Linear Algebra 122 / 140


Solving Small Eigenvalue Problems
Recall that if $Av = \lambda v$, then $(A - \lambda I)\, v = 0$.
- Since $(A - \lambda I)$ has a nonzero vector $v$ in its nullspace, it is singular (non-invertible).
- Therefore finding the eigenvalues of $A$ is equivalent to finding the values $\lambda$ for which $(A - \lambda I)$ is singular.

For a $2 \times 2$ matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \qquad \text{the matrix inverse is} \qquad A^{-1} = \tfrac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$
The quantity $ad - bc$ is called the determinant of $A$, $\det(A)$, and the inverse exists if and only if $\det(A) \ne 0$.

Therefore, for our $2 \times 2$ eigenvalue problem, we want to find the eigenvalues $\lambda$ such that $\det(A - \lambda I) = 0$:
$$(A - \lambda I) = \begin{bmatrix} (a - \lambda) & b \\ c & (d - \lambda) \end{bmatrix}, \qquad \det(A - \lambda I) = (a - \lambda)(d - \lambda) - bc = 0$$
K. Bergen (ICME) Applied Linear Algebra 123 / 140


... Solving Small Eigenvalue Problems
Example:
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad (A - \lambda I) = \begin{bmatrix} (2 - \lambda) & 1 \\ 1 & (2 - \lambda) \end{bmatrix}$$
$$\det(A - \lambda I) = (2 - \lambda)(2 - \lambda) - 1 \cdot 1 = \lambda^2 - 4\lambda + 3$$

This is a 2nd degree polynomial (i.e. a quadratic). We want to find the roots of this polynomial, $\lambda_1$ and $\lambda_2$, which will satisfy $\det(A - \lambda_i I) = 0$:
$$\lambda^2 - 4\lambda + 3 = 0 \implies (\lambda - 3)(\lambda - 1) = 0 \implies \lambda_1 = 3,\ \lambda_2 = 1$$
K. Bergen (ICME) Applied Linear Algebra 124 / 140


Power Method for Finding Eigenvectors
One method for finding the eigenvector $v_1$ corresponding to the largest eigenvalue $\lambda_1$ of $A \in \mathbb{R}^{n \times n}$ is called the power method:
Pick any vector $z \in \mathbb{R}^n$ and compute the sequence
$$\{ Az,\ A^2 z,\ A^3 z,\ A^4 z,\ \dots,\ A^k z \};$$
then the direction of $A^k z$ converges to that of $v_1$ as $k \to \infty$.

For diagonalizable $A$, $z$ can be written as a linear combination of the (linearly independent) eigenvectors of $A$:
$$z = \alpha_1 v_1 + \cdots + \alpha_n v_n$$
Therefore
$$Az = A(\alpha_1 v_1 + \cdots + \alpha_n v_n) = \alpha_1 A v_1 + \cdots + \alpha_n A v_n = \alpha_1 \lambda_1 v_1 + \cdots + \alpha_n \lambda_n v_n,$$
and iterating gives
$$A^k z = \alpha_1 \lambda_1^k v_1 + \cdots + \alpha_n \lambda_n^k v_n.$$
Since $\lambda_1$ is the largest eigenvalue (in magnitude), $\lambda_1^k$ will dominate the sum, so $A^k z$ will converge toward the direction of $v_1$ (provided $\alpha_1 \ne 0$).
K. Bergen (ICME) Applied Linear Algebra 125 / 140
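A minimal sketch of the power method with per-step normalization (normalization keeps the iterate at unit length without changing its direction; convergence assumes $\alpha_1 \ne 0$, which holds for a random start with probability 1):

```python
import numpy as np

def power_method(A, iters=100):
    z = np.random.rand(A.shape[0])     # random starting vector
    for _ in range(iters):
        z = A @ z                      # apply A
        z /= np.linalg.norm(z)         # renormalize to avoid overflow
    lam = z @ A @ z                    # Rayleigh quotient estimate of lambda_1
    return lam, z

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # matrix from the 2x2 example above
lam, v = power_method(A)
print(lam)   # ~3.0, the largest eigenvalue
print(v)     # ~(1, 1)/sqrt(2), the corresponding eigenvector direction
```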
The Singular Value Decomposition

K. Bergen (ICME) Applied Linear Algebra 126 / 140


Singular Value Decomposition

Theorem (Singular Value Decomposition): every matrix $A \in \mathbb{R}^{m \times n}$ has a decomposition, called the singular value decomposition (SVD), of the form
$$A = U \Sigma V^T = \sum_{i=1}^r \sigma_i u_i v_i^T,$$
where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are unitary matrices and
$$\Sigma = \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{m \times n}$$
is a diagonal matrix with $\Sigma_r = \mathrm{diag}(\sigma_1, \dots, \sigma_r)$ and $r = \mathrm{rank}(A) \le \min(m, n)$.

The positive numbers $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ are called the singular values of $A$ and are uniquely determined.
K. Bergen (ICME) Applied Linear Algebra 127 / 140


Geometric Interpretation of the SVD⁶
The image of the unit sphere under a matrix is a hyper-ellipse.

⁶ [Figure omitted: diagram from http://people.sc.fsu.edu/~jburkardt/latex/fsu_2006/svd.png]
K. Bergen (ICME) Applied Linear Algebra 128 / 140
Singular Values

The number of non-zero singular values, $r$, is the rank of the matrix.

The singular values are also related to a class of matrix norms:
Spectral norm (induced by the Euclidean vector norm):
$$\|A\|_2 = \sqrt{ \lambda_{\max}(A^T A) } = \sigma_{\max}(A).$$
Frobenius norm:
$$\|A\|_F = \sqrt{ \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 } = \sqrt{ \mathrm{trace}(A^T A) } = \sqrt{ \sum_{i=1}^r \sigma_i^2 }$$

K. Bergen (ICME) Applied Linear Algebra 129 / 140


Singular Values and Eigenvalues

The singular value decomposition of $A = U \Sigma V^T \in \mathbb{R}^{m \times n}$ is related to the eigenvalue decomposition of the symmetric matrix $A^T A \in \mathbb{R}^{n \times n}$:
$$A^T A = (V \Sigma U^T)(U \Sigma V^T) = V \Sigma^2 V^T, \qquad \text{and similarly} \qquad A A^T = U \Sigma^2 U^T.$$
$$\implies \sigma_i(A) = \sqrt{ \lambda_i(A^T A) } = \sqrt{ \lambda_i(A A^T) }, \qquad i = 1, \dots, r$$

The singular values of $A$ are the square roots of the eigenvalues of both $A^T A$ and $A A^T$.
The right singular vectors, $V$, of $A$ are the eigenvectors of $A^T A$.
The left singular vectors, $U$, of $A$ are the eigenvectors of $A A^T$.
K. Bergen (ICME) Applied Linear Algebra 130 / 140


Singular Vectors
The columns of $U = [u_1, \dots, u_m]$ and $V = [v_1, \dots, v_n]$ are called the left and right singular vectors of $A$, respectively.
The columns of $U$ and $V$ form orthonormal sets of vectors, which can be regarded as orthonormal bases.
The singular vectors provide a convenient way to represent bases for the fundamental subspaces of the matrix $A$.
If $A \in \mathbb{R}^{m \times n}$ is a matrix of rank $r$, then
$$\begin{aligned}
R(A) &= \mathrm{span}\{ u_1, \dots, u_r \} & N(A^T) &= \mathrm{span}\{ u_{r+1}, \dots, u_m \} \\
R(A^T) &= \mathrm{span}\{ v_1, \dots, v_r \} & N(A) &= \mathrm{span}\{ v_{r+1}, \dots, v_n \}
\end{aligned}$$
K. Bergen (ICME) Applied Linear Algebra 131 / 140


Low-rank Approximation via the SVD

From the definition of the SVD, any matrix $A \in \mathbb{R}^{m \times n}$ of rank $r$ can be written as a sum of rank-one matrices:
$$A = U \Sigma V^T = \sum_{i=1}^r \sigma_i u_i v_i^T.$$

Theorem (low-rank approximation): Let $A \in \mathbb{R}^{m \times n}$ and $0 \le k \le r$; then
$$\min_{B :\, \mathrm{rank}(B) = k} \|A - B\|_2 = \sigma_{k+1},$$
where the minimum is attained by $B^\star = A_k = \sum_{i=1}^k \sigma_i u_i v_i^T$.
- Application: image compression
K. Bergen (ICME) Applied Linear Algebra 132 / 140
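A minimal sketch of the truncated-SVD construction of $A_k$, with a numerical check of the theorem (the test matrix is random and illustrative):

```python
import numpy as np

def best_rank_k(A, k):
    """Best rank-k approximation of A in the 2-norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]   # sum_{i<=k} sigma_i u_i v_i^T

A = np.random.rand(8, 6)
A2 = best_rank_k(A, 2)
s = np.linalg.svd(A, compute_uv=False)

print(np.linalg.matrix_rank(A2))                     # 2
print(np.isclose(np.linalg.norm(A - A2, 2), s[2]))   # error equals sigma_3
```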


Image Compression using SVD

[Figures: (a) rank-1 approximation (18,456 bytes, ~0.6%); (b) rank-2 approximation (36,912 bytes, ~1.1%); (c) rank-5 approximation (92,280 bytes, ~2.8%)]

K. Bergen (ICME) Applied Linear Algebra 133 / 140


Image Compression using SVD

[Figures: (a) rank-10 approximation (184,560 bytes, ~5.6%); (b) rank-20 approximation (369,120 bytes, ~11.1%); (c) rank-40 approximation (738,240 bytes, ~22.3%)]

K. Bergen (ICME) Applied Linear Algebra 134 / 140


Image Compression using SVD

[Figure: (a) original image (3,317,760 bytes)]

K. Bergen (ICME) Applied Linear Algebra 135 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 136 / 140


Mini-Quiz 15
The SVD is useful in many linear algebra proofs, since it decomposes a matrix into orthogonal and diagonal factors, which have nice properties.

Let $A = U \Sigma V^T \in \mathbb{R}^{m \times n}$. Show that
$$\|A\|_2 = \sigma_{\max}(A) = \max_{\|x\| = 1,\ \|y\| = 1} y^T A x.$$
Hint: Use the SVD and the unitary invariance of the Euclidean vector norm ($\|Ux\| = \|x\|$).

Polar decomposition:
Show how any square matrix $A = U \Sigma V^T \in \mathbb{R}^{n \times n}$ can be written as
$$A = QS, \qquad Q \text{ orthogonal},\ S \text{ symmetric positive semi-definite.}$$
Hint: $V \Sigma V^T$ is symmetric positive semi-definite.

K. Bergen (ICME) Applied Linear Algebra 137 / 140


Solution (Q15)
Let $A = U \Sigma V^T$. Show that $\|A\|_2 = \sigma_{\max}(A) = \max_{\|x\| = 1,\, \|y\| = 1} y^T A x$.
Solution:
$$\begin{aligned}
\max_{\|x\| = 1,\, \|y\| = 1} y^T A x &= \max_{\|x\| = 1,\, \|y\| = 1} y^T (U \Sigma V^T) x \\
&= \max_{\|V^T x\| = 1,\, \|U^T y\| = 1} (U^T y)^T\, \Sigma\, (V^T x) \\
&= \max_{\|\tilde{x}\| = 1,\, \|\tilde{y}\| = 1} \tilde{y}^T \Sigma \tilde{x} \qquad (\tilde{x} = V^T x,\ \tilde{y} = U^T y) \\
&= \max_i \Sigma_{ii} = \sigma_{\max}(A)
\end{aligned}$$

Show how any square matrix $A = U \Sigma V^T$ can be written as $A = QS$, with $Q$ orthogonal and $S$ symmetric positive semi-definite.
Solution:
$$A = U \Sigma V^T = U I_n \Sigma V^T = U (V^T V) \Sigma V^T = (U V^T)(V \Sigma V^T)$$
$$A = QS, \qquad \text{where } Q = U V^T \text{ and } S = V \Sigma V^T.$$
K. Bergen (ICME) Applied Linear Algebra 138 / 140
The End!

K. Bergen (ICME) Applied Linear Algebra 139 / 140


References and Acknowledgements
1. Numerical Linear Algebra by Lloyd N. Trefethen & David Bau III.
2. Matrix Computations by Gene H. Golub & Charles F. Van Loan.
3. Linear Algebra and its Applications by Gilbert Strang.
4. Linear Algebra with Applications by Otto Bretscher.
5. www.Wikipedia.org and Mathworld.Wolfram.com

Thanks to former ICME Refresher Course instructors Yuekai Sun, Nicole Taheri, and Milinda Lakkam, whose slides served as useful references in creating this presentation.

Thanks to Carlos Sing-Long and Anil Damle for their feedback on this presentation.

K. Bergen (ICME) Applied Linear Algebra 140 / 140
