
Applied Linear Algebra Refresher Course

Karianne Bergen
kbergen@stanford.edu

Institute for Computational and Mathematical Engineering,


Stanford University

September 16-19, 2013

K. Bergen (ICME) Applied Linear Algebra 1 / 140


Course Topics

Matrix and vector properties


Matrix algebra
Linear transformations
Solving linear systems
Orthogonality
Eigenvectors and eigenvalues
Matrix decompositions

K. Bergen (ICME) Applied Linear Algebra 2 / 140


Vectors and Matrices

K. Bergen (ICME) Applied Linear Algebra 3 / 140



Scalars, vectors and matrices
Scalar: a single quantity or measurement that is invariant under coordinate rotations (e.g. mass, volume, temperature).
$\alpha, \beta, \gamma \in \mathbb{R}$ (Greek letters)

Vector: an ordered collection of scalars (e.g. displacement, acceleration, force).
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad x_i \in \mathbb{R},\ i = 1, \dots, n$$
$x, y, z \in \mathbb{R}^n$ (lower case) is a vector with $n$ entries.

All vectors are column vectors unless otherwise noted.
Row vectors are denoted by $x^T = [\, x_1 \ x_2 \ \cdots \ x_n \,]$.
For simplicity, we will restrict ourselves to real numbers. The generalization to the complex case can be found in any linear algebra text.
K. Bergen (ICME) Applied Linear Algebra 4 / 140
... Scalars, vectors and matrices

Matrix: a two-dimensional collection of scalars.
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}, \qquad A^T = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_n^T \end{bmatrix}$$
where $a_1, \dots, a_n$ are the columns of $A$.

$A \in \mathbb{R}^{m \times n}$ is a matrix with $m$ rows and $n$ columns.

$A, B, \Lambda, \Sigma \in \mathbb{R}^{m \times n}$ (upper case)

K. Bergen (ICME) Applied Linear Algebra 5 / 140


Operations on vectors
Scalar multiplication:
$$(\alpha x)_i = \alpha x_i, \qquad \alpha x = \alpha \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{bmatrix}$$

Vector addition/subtraction:
$$(x + y)_i = x_i + y_i, \qquad x + y = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}$$

Inner product (dot product, scalar product):
$$x^T y = \sum_{i=1}^n x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$
Other common notation for the inner product: $\langle x, y \rangle$, $x \cdot y$
K. Bergen (ICME) Applied Linear Algebra 6 / 140
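The operations above map directly onto NumPy; here is a minimal sketch (the array values are illustrative, not from the slides):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, -1.0])
alpha = 5.0

print(alpha * x)     # scalar multiplication: (alpha*x)_i = alpha * x_i
print(x + y)         # elementwise vector addition
print(x @ y)         # inner product x^T y = sum_i x_i * y_i  -> 1.0
print(np.dot(x, y))  # equivalent call for the inner product
```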
Vector norms

How do we measure the length or size of a vector?

A vector norm is a function $f : \mathbb{R}^n \to \mathbb{R}_+$ that satisfies the following properties:

1. Scale invariance: $f(\alpha v) = |\alpha|\, f(v)$, for all $\alpha \in \mathbb{R}$ and $v \in \mathbb{R}^n$
2. Triangle inequality: $f(u + v) \le f(u) + f(v)$, for all $u, v \in \mathbb{R}^n$
3. Positivity: $f(x) = 0 \iff x = 0$

K. Bergen (ICME) Applied Linear Algebra 7 / 140


Commonly used vector norms
1-norm (taxi-cab norm):
$$\|v\|_1 = \sum_{i=1}^n |v_i|$$

2-norm (Euclidean norm):
$$\|v\|_2 = \Big( \sum_{i=1}^n v_i^2 \Big)^{1/2} = \sqrt{v^T v}$$

infinity-norm (max norm):
$$\|v\|_\infty = \max_i |v_i|$$

These are examples of the $p$-norm, for $p = 1, 2,$ and $\infty$:
$$\|v\|_p = \Big( \sum_{i=1}^n |v_i|^p \Big)^{1/p}, \qquad p \ge 1$$

K. Bergen (ICME) Applied Linear Algebra 8 / 140
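As a quick check of these definitions (and a preview of Mini-Quiz 1), the $p$-norms are one call each in NumPy; a minimal sketch:

```python
import numpy as np

v = np.array([3.0, 6.0, 2.0])

print(np.linalg.norm(v, 1))       # 1-norm: sum of |v_i|      -> 11.0
print(np.linalg.norm(v, 2))       # 2-norm: sqrt(sum v_i^2)   -> 7.0
print(np.linalg.norm(v, np.inf))  # infinity-norm: max |v_i|  -> 6.0
```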


Questions?

K. Bergen (ICME) Applied Linear Algebra 9 / 140


Mini-Quiz 1:

Compute the 1-norm, 2-norm, and infinity-norm of the following vector:
$$v = \begin{bmatrix} 3 \\ 6 \\ 2 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 10 / 140


Solution (Q1):

Solutions:
1-norm:
$$\|v\|_1 = \sum_{i=1}^3 |v_i| = 3 + 6 + 2 = 11$$
2-norm:
$$\|v\|_2 = \Big( \sum_{i=1}^3 v_i^2 \Big)^{1/2} = \sqrt{3^2 + 6^2 + 2^2} = \sqrt{9 + 36 + 4} = \sqrt{49} = 7$$
$\infty$-norm:
$$\|v\|_\infty = \max_i |v_i| = \max\{3, 6, 2\} = 6$$

K. Bergen (ICME) Applied Linear Algebra 11 / 140


Cauchy-Schwarz Inequality

The inner product satisfies the Cauchy-Schwarz inequality:
$$|x^T y| \le \|x\|_2 \|y\|_2 \qquad \text{for } x, y \in \mathbb{R}^n,$$
with equality exactly when $y$ is a scalar multiple of $x$.

This inequality arises in many applications, including:
- Geometry: triangle inequality
- Physics: Heisenberg uncertainty principle
- Statistics: Cramér-Rao lower bound

K. Bergen (ICME) Applied Linear Algebra 12 / 140


Inner Product and Orthogonality

Orthogonal vectors: vectors $u$ and $v$ are orthogonal ($u \perp v$) if their inner product is zero:
$$u^T v = 0 \iff u \perp v$$

Orthonormal vectors: vectors $u$ and $v$ are orthonormal if they are orthogonal and both are normalized to length 1:
$$u \perp v \quad \text{and} \quad \|u\| = 1,\ \|v\| = 1$$

The orthogonal complement of a set of vectors $V$, denoted $V^\perp$, is the set of vectors $x$ such that $x^T v = 0$ for all $v \in V$.
K. Bergen (ICME) Applied Linear Algebra 13 / 140


Inner Product and Projections

Projection: the inner product gives us a way to compute the component of a vector $v$ along any direction $u$:
$$\mathrm{proj}_u v = \Big( \frac{u^T v}{\|u\|} \Big) \frac{u}{\|u\|}$$

When $u$ is normalized to unit length, $\|u\| = 1$, this becomes
$$\mathrm{proj}_u v = (u^T v)\, u$$
K. Bergen (ICME) Applied Linear Algebra 14 / 140
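The projection formula is easy to verify numerically; a minimal sketch with illustrative vectors (not from the slides):

```python
import numpy as np

u = np.array([1.0, 1.0, 0.0])
v = np.array([2.0, 0.0, 1.0])

# proj_u v = (u^T v / ||u||^2) u, valid for any nonzero (not just unit) u
proj = (u @ v) / (u @ u) * u
print(proj)                             # [1. 1. 0.]

# the residual v - proj_u v is orthogonal to u
print(np.isclose((v - proj) @ u, 0.0))  # True
```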


Questions?

K. Bergen (ICME) Applied Linear Algebra 15 / 140


Mini-Quiz 2:

Give an algebraic proof of the triangle inequality
$$\|x + y\|_2 \le \|x\|_2 + \|y\|_2$$
using the Cauchy-Schwarz inequality. (Hint: expand $\|x + y\|_2^2$.)

K. Bergen (ICME) Applied Linear Algebra 16 / 140


Solution (Q2):
Give an algebraic proof of the triangle inequality
$$\|x + y\|_2 \le \|x\|_2 + \|y\|_2$$
using the Cauchy-Schwarz inequality. (Hint: expand $\|x + y\|_2^2$.)

Solution:
$$\begin{aligned}
\|x + y\|_2^2 &= (x + y)^T (x + y) \\
&= x^T x + 2\, x^T y + y^T y \\
&= \|x\|_2^2 + \|y\|_2^2 + 2\, x^T y \\
&\le \|x\|_2^2 + \|y\|_2^2 + 2\, |x^T y| \\
&\le \|x\|_2^2 + \|y\|_2^2 + 2\, \|x\|_2 \|y\|_2 \qquad \text{(by Cauchy-Schwarz)} \\
&= \big( \|x\|_2 + \|y\|_2 \big)^2
\end{aligned}$$
Taking square roots of both sides gives the triangle inequality.
K. Bergen (ICME) Applied Linear Algebra 17 / 140


Operations with Matrices

Scalar multiplication:
$$(\alpha A)_{ij} = \alpha A_{ij}, \qquad \alpha \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = \begin{bmatrix} \alpha a_{11} & \cdots & \alpha a_{1n} \\ \vdots & \ddots & \vdots \\ \alpha a_{m1} & \cdots & \alpha a_{mn} \end{bmatrix}$$

Matrix addition:
$$(A + B)_{ij} = A_{ij} + B_{ij}, \qquad A + B = \begin{bmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{bmatrix}, \qquad A, B \in \mathbb{R}^{m \times n}$$
K. Bergen (ICME) Applied Linear Algebra 18 / 140


Matrix-vector multiplication
Matrix-vector multiplication:
$$(Ax)_i = \sum_{j=1}^n A_{ij} x_j, \qquad A \in \mathbb{R}^{m \times n},\ x \in \mathbb{R}^n$$

Matrix-vector product as a linear combination of the matrix columns:
$$Ax = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = x_1 a_1 + x_2 a_2 + \cdots + x_n a_n$$

Matrix-vector product as a vector of inner products with the rows of $A$:
$$Ax = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix} x = \begin{bmatrix} a_1^T x \\ a_2^T x \\ \vdots \\ a_m^T x \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 19 / 140
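All three views of the matrix-vector product can be checked against each other in NumPy; a minimal sketch with an illustrative $3 \times 2$ matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])              # A in R^{3x2}
x = np.array([10.0, -1.0])

b1 = A @ x                              # built-in: (Ax)_i = sum_j A_ij x_j
b2 = x[0] * A[:, 0] + x[1] * A[:, 1]    # linear combination of the columns
b3 = np.array([A[i, :] @ x for i in range(A.shape[0])])  # row inner products

print(b1, b2, b3)                       # all three agree: [ 8. 26. 44.]
```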
Matrix-matrix multiplication

Matrix product:
$$(AB)_{ij} = \sum_{k=1}^p A_{ik} B_{kj}, \qquad A \in \mathbb{R}^{m \times p},\ B \in \mathbb{R}^{p \times n}$$

Matrix products in terms of inner products:
$$(AB)_{ij} = a_i^T b_j, \qquad A = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix}, \quad B = \begin{bmatrix} b_1 & b_2 & \cdots & b_n \end{bmatrix}$$

Since inner products are only defined for vectors of the same length, the matrix product requires that $a_i, b_j \in \mathbb{R}^p$ for all $i, j$.
K. Bergen (ICME) Applied Linear Algebra 20 / 140


... Matrix-matrix multiplication
Matrix products in terms of matrix-vector products:
$$AB = A \begin{bmatrix} b_1 & b_2 & \cdots & b_n \end{bmatrix} = \begin{bmatrix} Ab_1 & Ab_2 & \cdots & Ab_n \end{bmatrix}$$
$$AB = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix} B = \begin{bmatrix} a_1^T B \\ a_2^T B \\ \vdots \\ a_m^T B \end{bmatrix}$$
Matrix-matrix multiplication can be expressed in terms of the concatenation of matrix-vector products.

For $A \in \mathbb{R}^{k \times l}$ and $B \in \mathbb{R}^{m \times n}$, the product $AB$ is only defined if $l = m$, and the product $BA$ is only defined if $k = n$.

K. Bergen (ICME) Applied Linear Algebra 21 / 140


Properties of Matrix-matrix multiplication
Matrix multiplication is associative:
$$A(BC) = (AB)C$$
Matrix multiplication is distributive:
$$A(B + C) = AB + AC$$
Matrix multiplication is, in general, not commutative:
$$AB \ne BA$$
For example, let $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$; then
$$AB = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix} \ne \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} = BA.$$

K. Bergen (ICME) Applied Linear Algebra 22 / 140


Vector Outer Product

Using the definition of matrix multiplication, we can define the vector outer product:
$$x y^T = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix} = \begin{bmatrix} x_1 y_1 & x_1 y_2 & \cdots & x_1 y_n \\ x_2 y_1 & x_2 y_2 & \cdots & x_2 y_n \\ \vdots & \vdots & \ddots & \vdots \\ x_m y_1 & x_m y_2 & \cdots & x_m y_n \end{bmatrix}$$
$$\big( x y^T \big)_{ij} = x_i y_j, \qquad x \in \mathbb{R}^m,\ y \in \mathbb{R}^n,\ x y^T \in \mathbb{R}^{m \times n}$$
The outer product $x y^T$ of vectors $x$ and $y$ is a matrix, and is defined even if the vectors are not the same length.
K. Bergen (ICME) Applied Linear Algebra 23 / 140
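The outer product is likewise one call in NumPy; a minimal sketch showing that vectors of different lengths are allowed:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # x in R^3
y = np.array([4.0, 5.0])        # y in R^2: a different length is fine

P = np.outer(x, y)              # (x y^T)_{ij} = x_i * y_j
print(P.shape)                  # (3, 2)
print(P[2, 0])                  # x_3 * y_1 = 12.0
```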


Questions?

K. Bergen (ICME) Applied Linear Algebra 24 / 140


Mini-Quiz 3:

Which of the following vector and matrix operations are well-defined?

1. $\alpha x$   2. $\alpha A$   3. $x + y$   4. $x^T y$   5. $x^T z$   6. $z x^T$   7. $Az$   8. $z^T A y$

$$\alpha = 5, \quad x = \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix}, \quad y = \begin{bmatrix} 4 \\ 0 \\ 1 \end{bmatrix}, \quad z = \begin{bmatrix} 3 \\ 0 \end{bmatrix}, \quad A = \begin{bmatrix} 4 & 0 & 2 \\ 1 & 5 & 3 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 25 / 140


Solution (Q3):

Which of the following vector and matrix operations are well-defined?

1. $\alpha x$ Yes   2. $\alpha A$ Yes   3. $x + y$ Yes   4. $x^T y$ Yes   5. $x^T z$ No   6. $z x^T$ Yes   7. $Az$ No   8. $z^T A y$ Yes

$$\alpha = 5, \quad x = \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix}, \quad y = \begin{bmatrix} 4 \\ 0 \\ 1 \end{bmatrix}, \quad z = \begin{bmatrix} 3 \\ 0 \end{bmatrix}, \quad A = \begin{bmatrix} 4 & 0 & 2 \\ 1 & 5 & 3 \end{bmatrix}$$

Here $x, y \in \mathbb{R}^3$, $z \in \mathbb{R}^2$, and $A \in \mathbb{R}^{2 \times 3}$: $x^T z$ fails because $x$ and $z$ have different lengths, and $Az$ fails because $A$ has 3 columns while $z$ has 2 entries.
K. Bergen (ICME) Applied Linear Algebra 26 / 140


Transpose of a Matrix

Transpose: the transpose of a matrix $A \in \mathbb{R}^{m \times n}$ is the matrix $A^T \in \mathbb{R}^{n \times m}$ formed by exchanging the rows and columns of $A$:
$$(A^T)_{ij} = A_{ji}$$

Properties:
- $(A^T)^T = A$
- $(A + B)^T = A^T + B^T$
- $(AB)^T = B^T A^T$
- $(\alpha B)^T = \alpha\, B^T$
K. Bergen (ICME) Applied Linear Algebra 27 / 140


Trace of a Matrix
Trace: the trace of a square matrix $A \in \mathbb{R}^{n \times n}$ is the sum of its diagonal entries:
$$\mathrm{tr}(A) = \sum_{i=1}^n a_{ii}$$

The trace is a linear map and has the following properties:
- $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$
- $\mathrm{tr}(\alpha A) = \alpha\, \mathrm{tr}(A)$, $\alpha \in \mathbb{R}$
- $\mathrm{tr}(A) = \mathrm{tr}(A^T)$
- $\mathrm{tr}(AB) = \mathrm{tr}(BA)$

The trace of a matrix product functions similarly to the inner product of vectors:
$$\mathrm{tr}(A^T B) = \sum_{i,j} A_{ij} B_{ij}, \qquad A, B \in \mathbb{R}^{m \times n}$$
K. Bergen (ICME) Applied Linear Algebra 28 / 140


Matrix norms

Induced norm (operator norm) corresponding to vector norm $\| \cdot \|$:
$$\|A\| = \max_{\|x\| = 1} \|Ax\|$$
- $\|A\|_1$: maximum absolute column sum
- $\|A\|_\infty$: maximum absolute row sum
- $\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)} = \sigma_{\max}(A)$ (spectral norm)

Frobenius norm:
$$\|A\|_F = \sqrt{ \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 } = \sqrt{ \mathrm{trace}(A^T A) }$$

The Frobenius norm is equivalent to $\| \mathrm{vec}(A) \|_2$.

K. Bergen (ICME) Applied Linear Algebra 29 / 140
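These matrix norms can be compared directly in NumPy; a minimal sketch with an illustrative matrix:

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

print(np.linalg.norm(A, 1))       # max absolute column sum -> 6.0
print(np.linalg.norm(A, np.inf))  # max absolute row sum    -> 7.0
print(np.linalg.norm(A, 2))       # spectral norm: largest singular value
print(np.linalg.norm(A, 'fro'))   # Frobenius norm: sqrt(sum |a_ij|^2)
```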


Geometric Interpretation of Matrix Norms

[Figure omitted: diagram from Trefethen and Bau illustrating the induced matrix norm.]
K. Bergen (ICME) Applied Linear Algebra 30 / 140
Properties of Matrix Norms

Matrix norms satisfy the same properties as vector norms:

1. Scale invariance: $\|\alpha A\| = |\alpha| \|A\|$
2. Triangle inequality: $\|A + B\| \le \|A\| + \|B\|$
3. Positivity: $\|A\| \ge 0$, and $\|A\| = 0$ only if $A = 0$

Some matrix norms, including induced norms and the Frobenius norm, also satisfy the submultiplicative property:
$$\|Ax\| \le \|A\| \|x\|, \qquad \text{and} \qquad \|AB\| \le \|A\| \|B\|$$

K. Bergen (ICME) Applied Linear Algebra 31 / 140


A few words on: Vector and Matrix Norms
Vector norms give us a way to measure how big a vector is.
The 2-norm (Euclidean norm) is most common: it has a nice relationship with the inner product, $\|v\|_2^2 = v^T v$, and fits with our usual geometric notions of length and distance.

The choice of norm may depend on the application:
Let $y \in \mathbb{R}^n$ be a set of $n$ measurements, and let $\hat{y} \in \mathbb{R}^n$ be the estimates for $y$ from a model. Let $r = y - \hat{y}$ be the difference between the measured values and the model estimates. We want to optimize our model to make $r$ as small as possible, but how should we measure "small"?
- $\|r\|_\infty$: use this if no error should exceed a prescribed value.
- $\|r\|_2$: use this if large errors are bad, but small errors are ok.
- $\|r\|_1$: use this norm if you want your measurement and estimate to match exactly for as many samples as possible.

Matrix norms are an extension of vector norms to matrices.

K. Bergen (ICME) Applied Linear Algebra 32 / 140
Questions?

K. Bergen (ICME) Applied Linear Algebra 33 / 140


Mini-Quiz 4:
Determine whether each statement given below is True or False:

There exist matrices $A, B \ne 0$ such that $AB = 0$.

The trace is invariant under cyclic permutations:
$$\mathrm{tr}(ABC) = \mathrm{tr}(CAB) = \mathrm{tr}(BCA).$$

The definition of the induced norm allows us to use different vector norms on the numerator and denominator:
$$\|A\|_{\alpha, \beta} = \max_{\|x\|_\beta = 1} \|Ax\|_\alpha.$$

$\|A\|_{\infty, 1}$ is equal to the largest element of $A$ in absolute value.
K. Bergen (ICME) Applied Linear Algebra 34 / 140


Solutions (Q4):

Determine whether each statement given below is True or False:

There exist matrices $A, B \ne 0$ such that $AB = 0$.
TRUE: For example, when $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix}$.

The trace is invariant under cyclic permutations:
$$\mathrm{tr}(ABC) = \mathrm{tr}(CAB) = \mathrm{tr}(BCA).$$
TRUE
Use the property that $\mathrm{tr}(XY) = \mathrm{tr}(YX)$:
To show that $\mathrm{tr}(ABC) = \mathrm{tr}(CAB)$, let $X = AB$ and $Y = C$.
To show that $\mathrm{tr}(ABC) = \mathrm{tr}(BCA)$, let $X = A$ and $Y = BC$.
K. Bergen (ICME) Applied Linear Algebra 35 / 140


Solutions (Q4):

Determine whether each statement given below is True or False:

The definition of the induced norm allows us to use different vector norms on the numerator and denominator:
$$\|A\|_{\alpha, \beta} = \max_{\|x\|_\beta = 1} \|Ax\|_\alpha.$$

$\|A\|_{\infty, 1}$ is equal to the largest element of $A$ in absolute value.

TRUE
$$\|A\|_{\infty, 1} = \max_{\|x\|_1 = 1} \|Ax\|_\infty = \max_{\|x\|_1 = 1} \Big( \max_i |a_i^T x| \Big) = \max_{i,j} |a_{ij}|.$$
K. Bergen (ICME) Applied Linear Algebra 36 / 140


Special Matrices: Diagonal, Identity
Diagonal Matrix: a matrix is diagonal if the only non-zero entries are on the matrix diagonal.
$$D_{ij} = \begin{cases} d_i & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}, \qquad D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix}$$

Identity Matrix: the identity $I_n \in \mathbb{R}^{n \times n}$ is a diagonal matrix with 1s along the diagonal (denoted by $I$ when the size is unambiguous).
$$I_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}, \qquad I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$

$A I_n = I_m A = A$ for rectangular $A \in \mathbb{R}^{m \times n}$
$A I = I A = A$ for square $A \in \mathbb{R}^{n \times n}$
K. Bergen (ICME) Applied Linear Algebra 37 / 140
Special Matrices: Triangular

Triangular Matrix: a matrix is lower/upper triangular if all of the entries above/below the diagonal are zero:
$$L = \begin{bmatrix} l_{11} & & & \\ l_{21} & l_{22} & & \\ \vdots & \vdots & \ddots & \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{bmatrix}, \qquad U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ & u_{22} & \cdots & u_{2n} \\ & & \ddots & \vdots \\ & & & u_{nn} \end{bmatrix}$$

- The sum of two upper triangular matrices is upper triangular.
- The product of two upper triangular matrices is upper triangular.
- The inverse of an invertible upper triangular matrix is upper triangular.
K. Bergen (ICME) Applied Linear Algebra 38 / 140


Special Matrices: Symmetric, Orthogonal

Symmetric Matrix: a square matrix is symmetric if it is equal to its own transpose:
$$A = A^T \in \mathbb{R}^{n \times n}, \qquad A_{ij} = A_{ji} \ \text{for all } i, j$$

Orthogonal (Unitary) Matrix: a square matrix $U \in \mathbb{R}^{n \times n}$ is orthogonal (unitary, in the complex case) if it satisfies
$$U^T U = U U^T = I.$$

The columns of an orthogonal matrix $U$ are orthonormal.
K. Bergen (ICME) Applied Linear Algebra 39 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 40 / 140


Mini-Quiz 5:

Determine whether each statement given below is True or False:

For any $A \in \mathbb{R}^{n \times n}$, $A^T A$ is symmetric.

For any $A \in \mathbb{R}^{n \times n}$, $A - A^T$ is symmetric.

Invariance under unitary multiplication:
For any vector $x \in \mathbb{R}^m$ and unitary $Q \in \mathbb{R}^{m \times m}$, we have
$$\|Qx\|_2 = \|x\|_2$$

Let $D \in \mathbb{R}^{n \times n} = \mathrm{diag}(d_1, \dots, d_n)$ be any diagonal matrix; then $DA = AD$ for any matrix $A \in \mathbb{R}^{n \times n}$.
K. Bergen (ICME) Applied Linear Algebra 41 / 140


Solution (Q5):

Determine whether each statement given below is True or False:

For any $A \in \mathbb{R}^{n \times n}$, $A^T A$ is symmetric.
TRUE: $(A^T A)^T = A^T (A^T)^T = A^T A$

For any $A \in \mathbb{R}^{n \times n}$, $A - A^T$ is symmetric.
FALSE: For example, when $A = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$, $\ A - A^T = \begin{bmatrix} 0 & 2 \\ -2 & 0 \end{bmatrix}$.
K. Bergen (ICME) Applied Linear Algebra 42 / 140


Solution (Q5):
Invariance under unitary multiplication:
For any vector $x \in \mathbb{R}^m$ and unitary $Q \in \mathbb{R}^{m \times m}$, we have $\|Qx\|_2 = \|x\|_2$.
TRUE: $\|Qx\|_2^2 = (Qx)^T (Qx) = x^T Q^T Q x = x^T x = \|x\|_2^2$

Similarly, for the matrix 2-norm and Frobenius norm, we have:
$$\begin{aligned}
\|QA\|_2^2 &= \lambda_{\max}\big( (QA)^T (QA) \big) = \lambda_{\max}\big( A^T Q^T Q A \big) = \lambda_{\max}\big( A^T A \big) = \|A\|_2^2, \\
\|QA\|_F^2 &= \mathrm{tr}\big( (QA)^T (QA) \big) = \mathrm{tr}\big( A^T A \big) = \|A\|_F^2.
\end{aligned}$$
K. Bergen (ICME) Applied Linear Algebra 43 / 140
Solution (Q5):
Determine whether each statement given below is True or False:
Let $D \in \mathbb{R}^{n \times n} = \mathrm{diag}(d_1, \dots, d_n)$ be any diagonal matrix; then $DA = AD$ for any matrix $A \in \mathbb{R}^{n \times n}$.
FALSE
$DA$ rescales the rows of $A$, while $AD$ rescales the columns of $A$:
$$DA = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix} \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_n^T \end{bmatrix} = \begin{bmatrix} d_1 a_1^T \\ d_2 a_2^T \\ \vdots \\ d_n a_n^T \end{bmatrix}$$
$$AD = \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix} \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix} = \begin{bmatrix} d_1 a_1 & \cdots & d_n a_n \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 44 / 140


Vector Spaces

K. Bergen (ICME) Applied Linear Algebra 45 / 140


Linear Transformations

Linear transformation: $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation if for all $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$ the following properties hold:
- $T(x + y) = T(x) + T(y)$
- $T(\alpha x) = \alpha\, T(x)$

Theorem: Let $A \in \mathbb{R}^{m \times n}$ be a matrix and define the function $T_A(x) = Ax$. Then $T_A : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation.
- Matrix multiplication is a linear function.

Theorem: Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation; then there exists a matrix $A$ such that $T(x) = Ax$.
- There is a matrix for every linear function.
K. Bergen (ICME) Applied Linear Algebra 46 / 140


Linear Combinations

Let $v_1, v_2, \dots, v_r \in \mathbb{R}^m$ be a set of vectors. Then any vector $v$ which can be written in the form
$$v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r, \qquad \alpha_i \in \mathbb{R},\ i = 1, \dots, r$$
is a linear combination of the vectors $v_1, v_2, \dots, v_r$.

Span: the set of all linear combinations of vectors $v_1, \dots, v_r \in \mathbb{R}^m$ is called the span of $\{v_1, \dots, v_r\}$.
- $\mathrm{span}\{v_1, \dots, v_r\}$ is always a subspace of $\mathbb{R}^m$.
- If $S = \mathrm{span}\{v_1, \dots, v_r\}$, then $S$ is spanned by $v_1, v_2, \dots, v_r$.

A set $S$ is called a subspace if it is closed under vector addition and scalar multiplication:
- if $x, y \in S$ and $\alpha \in \mathbb{R}$, then $x + y \in S$ and $\alpha x \in S$.
K. Bergen (ICME) Applied Linear Algebra 47 / 140


Linear Independence

Linear dependence: a set of vectors $v_1, v_2, \dots, v_r$ is linearly dependent if there exists a set of scalars $\alpha_1, \alpha_2, \dots, \alpha_r \in \mathbb{R}$ with at least one $\alpha_i \ne 0$ such that
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r = 0$$
That is, a set of vectors is linearly dependent if one of the vectors in the set can be written as a linear combination of one or more other vectors in the set.

Linear independence: a set of vectors $v_1, v_2, \dots, v_r$ is linearly independent if it is not linearly dependent. That is,
$$\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_r v_r = 0 \implies \alpha_1 = \cdots = \alpha_r = 0$$
K. Bergen (ICME) Applied Linear Algebra 48 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 49 / 140


Mini-Quiz 6:

Are the following sets of vectors linearly independent or linearly dependent?
1. $\{a, b, c\}$
2. $\{b, c, d\}$
3. $\{a, b, d\}$
4. $\{a, b, c, d\}$

$$a = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad b = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad c = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad d = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 50 / 140


Solution (Q6):

Are the following sets of vectors linearly independent or linearly dependent?
1. $\{a, b, c\}$: independent
2. $\{b, c, d\}$: dependent, since $b + 2c = d$
3. $\{a, b, d\}$: independent
4. $\{a, b, c, d\}$: dependent, since there are at most 3 linearly independent vectors in $\mathbb{R}^3$

$$a = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad b = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \quad c = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad d = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 51 / 140


Basis and Dimension

A basis for a subspace $S$ is a linearly independent set of vectors that spans $S$.
- The basis for a subspace $S$ is not unique, but all bases for the subspace contain the same number of vectors.
- For example, consider the case where $S = \mathbb{R}^2$:
$$\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\} \quad \text{and} \quad \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\} \quad \text{are both bases that span } \mathbb{R}^2.$$

The dimension of the subspace $S$, $\dim(S)$, is the number of linearly independent vectors in a basis for $S$ (in the example above, $\dim(S) = 2$).

Theorem (unique representation): if the vectors $v_1, \dots, v_n$ are a basis for a subspace $S$, then every vector in $S$ can be uniquely represented as a linear combination of these basis vectors.
K. Bergen (ICME) Applied Linear Algebra 52 / 140


Range and Nullspace

The range (column space, image) of a matrix $A \in \mathbb{R}^{m \times n}$, denoted by $R(A)$, is the set of all linear combinations of the columns of $A$:
$$R(A) := \{ Ax \mid x \in \mathbb{R}^n \}, \qquad R(A) \subseteq \mathbb{R}^m$$

The nullspace (kernel) of a matrix $A \in \mathbb{R}^{m \times n}$, denoted by $N(A)$, is the set of vectors $z$ such that $Az = 0$:
$$N(A) := \{ z \in \mathbb{R}^n \mid Az = 0 \}, \qquad N(A) \subseteq \mathbb{R}^n$$

The range and nullspace of $A^T$ are called the row space and left nullspace of $A$, respectively.
- These four subspaces are intrinsic to $A$ and do not depend on the choice of basis.
K. Bergen (ICME) Applied Linear Algebra 53 / 140


Rank of a Matrix

The column rank of $A$, denoted $\mathrm{rank}(A)$, is the dimension of $R(A)$.

The row rank of $A$ is the dimension of $R(A^T)$.

Rank of a matrix: the column rank of a matrix is always equal to the row rank, and therefore we refer to this value simply as the rank of $A$.

Matrix $A \in \mathbb{R}^{m \times n}$ is full rank if $\mathrm{rank}(A) = \min\{m, n\}$.
K. Bergen (ICME) Applied Linear Algebra 54 / 140


Fundamental Theorem of Linear Algebra

Fundamental theorem of linear algebra:

1. The nullspace of $A$ is the orthogonal complement of the row space:
$$N(A) = \big( R(A^T) \big)^\perp$$

2. The left nullspace of $A$ is the orthogonal complement of the column space:
$$N(A^T) = \big( R(A) \big)^\perp$$

Corollary (rank-nullity):
$$\dim(R(A)) + \dim(N(A)) = n, \qquad A \in \mathbb{R}^{m \times n}$$
K. Bergen (ICME) Applied Linear Algebra 55 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 56 / 140


Mini-Quiz 7:

Give the range $R(\cdot)$ and nullspace $N(\cdot)$ for the matrices
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{3 \times 2}, \qquad B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \in \mathbb{R}^{2 \times 3}$$
What are their dimensions ($\dim(R(\cdot))$ and $\dim(N(\cdot))$)?
K. Bergen (ICME) Applied Linear Algebra 57 / 140


Solution (Q7):
Give the range $R(\cdot)$ and nullspace $N(\cdot)$ for the matrices
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{3 \times 2}, \qquad B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \in \mathbb{R}^{2 \times 3}$$
What are their dimensions ($\dim(R(\cdot))$ and $\dim(N(\cdot))$)?
Solution:
$$R(A) = \left\{ \alpha \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix} \,\middle|\, \alpha, \beta \in \mathbb{R} \right\}, \qquad N(A) = \{ 0 \}$$
$$R(B) = \left\{ \alpha \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 1 \end{bmatrix} \,\middle|\, \alpha, \beta \in \mathbb{R} \right\} = \mathbb{R}^2, \qquad N(B) = \left\{ \gamma \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \,\middle|\, \gamma \in \mathbb{R} \right\}$$
$$\dim(R(A)) = 2, \quad \dim(N(A)) = 0, \qquad \dim(R(B)) = 2, \quad \dim(N(B)) = 1$$
K. Bergen (ICME) Applied Linear Algebra 58 / 140
Solving Linear Systems

K. Bergen (ICME) Applied Linear Algebra 59 / 140


Nonsingular Matrix
Theorem: a matrix $A \in \mathbb{R}^{m \times n}$ with $m \ge n$ has full rank if and only if it maps no two distinct vectors to the same vector, i.e.
$$Ax = Ay \implies x = y.$$

Singular matrix: a square matrix $A \in \mathbb{R}^{n \times n}$ that is not full rank.

Nonsingular matrix: a square matrix $A \in \mathbb{R}^{n \times n}$ of full rank.

If $A$ is nonsingular, we can uniquely express any vector $y \in \mathbb{R}^n$ as $y = Ax$ for some unique $x \in \mathbb{R}^n$. If we choose the vectors $x_i$ such that $e_i = A x_i$, $i = 1, \dots, n$, we can write
$$AX = A \begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix} = \begin{bmatrix} Ax_1 & \cdots & Ax_n \end{bmatrix} = \begin{bmatrix} e_1 & \cdots & e_n \end{bmatrix} = I$$
K. Bergen (ICME) Applied Linear Algebra 60 / 140


Matrix Inverse
Matrix inverse: the inverse of a square matrix $A \in \mathbb{R}^{n \times n}$ is the matrix $B \in \mathbb{R}^{n \times n}$ such that
$$AB = BA = I, \qquad \text{where } I \text{ is the identity matrix.}$$
- The inverse of $A$ is denoted by $A^{-1}$.
- Any square, nonsingular matrix $A$ has a unique inverse $A^{-1}$ satisfying $A A^{-1} = A^{-1} A = I$.
- $(AB)^{-1} = B^{-1} A^{-1}$ (assuming both inverses exist).
- $(A^T)^{-1} = (A^{-1})^T$

If $Ax = b$, then $A^{-1} b$ gives the vector of coefficients in the linear combination of the columns of $A$ that yields $b$:
$$b = Ax = x_1 a_1 + \cdots + x_n a_n, \qquad x = A^{-1} b = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 61 / 140


Invertible Matrix Theorem

Let $A \in \mathbb{R}^{n \times n}$ be a square matrix; then the following statements are equivalent:
- $A$ is invertible ($A$ is nonsingular).
- There exists a matrix $A^{-1}$ such that $A A^{-1} = I_n = A^{-1} A$.
- $Ax = b$ has exactly one solution for each $b \in \mathbb{R}^n$.
- $Az = 0$ has only the trivial solution $z = 0$, i.e. $N(A) = \{0\}$.
- The columns of $A$ are linearly independent.
- $A$ is full rank, i.e. $\mathrm{rank}(A) = n$.
- $\det(A) \ne 0$.
- $0$ is not an eigenvalue of $A$.
K. Bergen (ICME) Applied Linear Algebra 62 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 63 / 140


Mini-Quiz 8:

If $A$ and $B \in \mathbb{R}^{n \times n}$ are two invertible matrices, which of the following formulas are necessarily true?

- $A + B$ is invertible and $(A + B)^{-1} = A^{-1} + B^{-1}$.
- $(A + B)^2 = A^2 + 2AB + B^2$.
- $(A B A^{-1})^3 = A B^3 A^{-1}$.
- $A^{-1} B$ is invertible and $(A^{-1} B)^{-1} = B^{-1} A$.
K. Bergen (ICME) Applied Linear Algebra 64 / 140


Solutions (Q8):

If $A$ and $B \in \mathbb{R}^{n \times n}$ are two invertible matrices, which of the following formulas are necessarily true?

$A + B$ is invertible and $(A + B)^{-1} = A^{-1} + B^{-1}$.
FALSE: Let
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \quad A + B = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix};$$
$$A^{-1} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}, \quad B^{-1} = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}, \quad (A + B)^{-1} = \begin{bmatrix} \tfrac{2}{3} & -\tfrac{1}{3} \\ -\tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix} \ne A^{-1} + B^{-1}.$$

$(A + B)^2 = A^2 + 2AB + B^2$.
FALSE: $(A + B)^2 = A^2 + AB + BA + B^2$, and in general $AB \ne BA$, so the formula need not hold.
K. Bergen (ICME) Applied Linear Algebra 65 / 140


Solutions (Q8):
If $A$ and $B \in \mathbb{R}^{n \times n}$ are two invertible matrices, which of the following formulas are necessarily true?

$(A B A^{-1})^3 = A B^3 A^{-1}$. TRUE:
$$(A B A^{-1})^3 = (A B A^{-1})(A B A^{-1})(A B A^{-1}) = A B (A^{-1} A) B (A^{-1} A) B A^{-1} = A B^3 A^{-1}$$

$A^{-1} B$ is invertible and $(A^{-1} B)^{-1} = B^{-1} A$. TRUE:
$$(A^{-1} B)(B^{-1} A) = A^{-1} (B B^{-1}) A = A^{-1} A = I_n$$
K. Bergen (ICME) Applied Linear Algebra 66 / 140


Systems of Linear Equations

Example: Find values $x_1, x_2 \in \mathbb{R}$ that satisfy
$$2x_1 + x_2 = 7 \qquad \text{and} \qquad x_1 - 3x_2 = 7$$
(for example, if we want to find the intersection point of two lines).

This can be written as a matrix equation:
$$\begin{bmatrix} 2 & 1 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 7 \end{bmatrix} \qquad \text{or} \qquad Ax = b,$$
where $A = \begin{bmatrix} 2 & 1 \\ 1 & -3 \end{bmatrix}$, $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, and $b = \begin{bmatrix} 7 \\ 7 \end{bmatrix}$.
K. Bergen (ICME) Applied Linear Algebra 67 / 140


Linear Systems
One of the fundamental problems in linear algebra:
Given $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$, find $x \in \mathbb{R}^n$ such that
$$Ax = b.$$

The set of solutions of a linear system can behave in three ways:
1. The system has one unique solution.
2. The system has infinitely many solutions.
3. The system has no solution.

For a solution $x$ to exist, we must have $b \in R(A)$.
- If $b \in R(A)$ and $\mathrm{rank}(A) = n$, then there is a unique solution (usually square systems: $m = n$).
- If $b \in R(A)$ and $\mathrm{rank}(A) < n$ (i.e. $N(A) \ne \{0\}$), then there are infinitely many solutions (usually underdetermined systems: $m < n$).
- If $b \notin R(A)$, then there is no solution (usually overdetermined systems: $m > n$).
K. Bergen (ICME) Applied Linear Algebra 68 / 140


Solution Set of Linear Systems

[Figure omitted: diagram from http://oak.ucc.nau.edu/jws8/3equations3unknowns.html]
K. Bergen (ICME) Applied Linear Algebra 69 / 140
Gaussian Elimination (row reduction)

How do we solve $Ax = b$, $A \in \mathbb{R}^{n \times n}$?

For some types of matrices, $Ax = b$ is easy to solve:
- Diagonal matrices: the equations are independent, so $x_i = b_i / a_{ii}$ for all $i$.
- Upper/lower triangular matrices: back/forward substitution.

Gaussian Elimination: transform the system $Ax = b$ into an upper triangular system $Ux = y$ using elementary row operations.
The following elementary row operations can be applied to the augmented system $[\, A \mid b \,]$ to introduce zeros below the diagonal without changing the solution set:
1. Multiply a row by a nonzero scalar.
2. Add scalar multiples of one row to another.
3. Permute rows.
K. Bergen (ICME) Applied Linear Algebra 70 / 140
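The elimination-plus-back-substitution recipe is short to write out; a minimal sketch without pivoting (so it assumes no zero pivots are encountered):

```python
import numpy as np

def gaussian_solve(A, b):
    """Solve Ax = b by Gaussian elimination and back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    # forward elimination: introduce zeros below the diagonal
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]      # multiplier for the row operation
            A[i, k:] -= m * A[k, k:]   # row_i <- row_i - m * row_k
            b[i] -= m * b[k]
    # back substitution on the upper triangular system Ux = y
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[2.0, 1.0], [1.0, -3.0]])
b = np.array([7.0, 7.0])
print(gaussian_solve(A, b))   # [ 4. -1.], matching the earlier 2x2 example
```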


Step 1: For $A \in \mathbb{R}^{3 \times 3}$, $b \in \mathbb{R}^3$, form the augmented system
$$[\, A \mid b \,] = \begin{bmatrix} \times & \times & \times & \times \\ \times & \times & \times & \times \\ \times & \times & \times & \times \end{bmatrix}$$

Step 2: Compute the row echelon form using row operations:
$$\xrightarrow{L_1} \begin{bmatrix} \times & \times & \times & \times \\ 0 & + & + & + \\ 0 & + & + & + \end{bmatrix} \xrightarrow{L_2} \begin{bmatrix} \times & \times & \times & \times \\ 0 & + & + & + \\ 0 & 0 & + & + \end{bmatrix} = [\, U \mid y \,], \qquad Ux = y$$

Step 3: Solve the triangular system $Ux = y$ by back substitution:
$$x = U^{-1} y$$
K. Bergen (ICME) Applied Linear Algebra 71 / 140
Questions?

K. Bergen (ICME) Applied Linear Algebra 72 / 140


Mini-Quiz 9:
Determine the lower triangular matrices $L_i$ that correspond to the following elementary row operations:

$L_1$ such that $L_1 A = A_1$:
$$L_1 \begin{bmatrix} 2 & 8 & 4 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix}$$

$L_2$ such that $L_2 A_1 = A_2$:
$$L_2 \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix}$$

$L_3$ such that $L_3 A_2 = A_3$:
$$L_3 \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 0 & -6 & -9 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 73 / 140
Solution (Q9):
$$L_1 A = A_1 : \quad \begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 8 & 4 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix}$$
$$L_2 A_1 = A_2 : \quad \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 4 & 2 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix}$$
$$L_3 A_2 = A_3 : \quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 0 & -6 & -9 \end{bmatrix}$$

Combining, $L_3 L_2 L_1$ is such that $L_3 L_2 L_1 A = A_3$:
$$\begin{bmatrix} \tfrac{1}{2} & 0 & 0 \\ -1 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 8 & 4 \\ 2 & 5 & 1 \\ 4 & 10 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 & 2 \\ 0 & -3 & -3 \\ 0 & -6 & -9 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 74 / 140


Gaussian Elimination and LU Decomposition

We can think of the elementary row operations applied to $A$ as linear transformations, which can be represented as a series of lower triangular matrices $L_1, \dots, L_{n-1}$:
$$L_{n-1} \cdots L_1 A = U \qquad \text{or} \qquad L^{-1} A = U, \quad L^{-1} = L_{n-1} \cdots L_1;$$
rearranging yields $A = LU$.

If we can factor $A$ as $A = LU$, then we can solve the system $Ax = b$ by solving two triangular systems:
- Solve $Ly = b$ using forward substitution,
- Solve $Ux = y$ using backward substitution.

Gaussian elimination implicitly computes the LU factorization.
K. Bergen (ICME) Applied Linear Algebra 75 / 140
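In practice one calls a library LU routine rather than re-deriving the factors; a minimal sketch using SciPy (the right-hand side $b$ is illustrative):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0,  8.0,  4.0],
              [2.0,  5.0,  1.0],
              [4.0, 10.0, -1.0]])   # the matrix from Mini-Quiz 9
b = np.array([1.0, 2.0, 3.0])

lu, piv = lu_factor(A)         # computes PA = LU with partial pivoting
x = lu_solve((lu, piv), b)     # two triangular solves: Ly = Pb, then Ux = y
print(np.allclose(A @ x, b))   # True
```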


Partial Pivoting

Pure Gaussian elimination cannot be used to solve general linear systems, because we may encounter a zero on the diagonal during the process, e.g.
$$A = \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix}$$
In practical implementations of the algorithm, the rows of the matrix must be permuted to avoid division by zero.
Partial pivoting swaps rows if an entry below the diagonal of the current column is larger in absolute value than the current diagonal entry (the pivot element).
K. Bergen (ICME) Applied Linear Algebra 76 / 140


Permutations
The reordering, or permutation, of the rows in partial pivoting can be represented as a linear transformation.
Permutation matrix: a square, binary matrix $P \in \mathbb{R}^{n \times n}$ that has exactly one entry 1 in each row and each column.
- Left-multiplication of a matrix $A$ by a permutation matrix reorders the rows of $A$, while right-multiplication reorders the columns of $A$.

For example, let
$$P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}, \qquad A = \begin{bmatrix} 1 & 2 & 3 \\ 10 & 20 & 30 \\ 100 & 200 & 300 \end{bmatrix};$$
then the row and column permutations are, respectively:
$$PA = \begin{bmatrix} 10 & 20 & 30 \\ 100 & 200 & 300 \\ 1 & 2 & 3 \end{bmatrix}, \qquad AP = \begin{bmatrix} 3 & 1 & 2 \\ 30 & 10 & 20 \\ 300 & 100 & 200 \end{bmatrix}.$$
K. Bergen (ICME) Applied Linear Algebra 77 / 140


LU Decomposition

Let A Rnn be a square matrix, then the LU decomposition


factors A into the product of a lower triangular matrix L Rnn
and an upper triangular matrix U Rnn

A = LU.

When partial pivoting is used to permute the rows, we obtain a


decomposition for the form Ln1 Pn1 L1 P1 A = U , which can
be written3 in form: L1 P A = U .

In general, any square matrix A Rnn (singular or nonsingular)


has a factorization

P A = LU, where P is a permutation matrix.

3
see Trefethen and Bau: Chapter 21 for details.
K. Bergen (ICME) Applied Linear Algebra 78 / 140
Cholesky and Positive Definiteness
Positive definite matrix: a matrix $A \in \mathbb{R}^{n \times n}$ is positive definite if
$$z^T A z > 0 \quad \text{for every nonzero } z \in \mathbb{R}^n,$$
often denoted by $A \succ 0$.

Example:
$$A = \begin{bmatrix} 4 & -2 \\ -2 & 10 \end{bmatrix}$$
$$x^T A x = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 4 & -2 \\ -2 & 10 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 4x_1^2 - 4x_1 x_2 + 10x_2^2 = (2x_1 - x_2)^2 + 9x_2^2 > 0 \quad \forall x \ne 0$$

- Positive semi-definite matrix: $z^T A z \ge 0$ for every $z \in \mathbb{R}^n$.
- Negative definite matrix: $z^T A z < 0$ for every nonzero $z \in \mathbb{R}^n$.
K. Bergen (ICME) Applied Linear Algebra 79 / 140


... Cholesky and Positive Definiteness
A symmetric matrix $A$ is positive definite ($A \succ 0$) if and only if there exists a unique lower triangular matrix $L$ with positive diagonal entries such that
$$A = L L^T.$$
This factorization is called the Cholesky decomposition of $A$.

The Cholesky algorithm, a modified version of Gaussian elimination, is used to compute the factor $L$.
- Basic idea: in Gaussian elimination we use row operations to introduce zeros below the diagonal in each column. To maintain symmetry, we can apply the same operations to the columns of the matrix to introduce zeros in the first row:
$$L_1 A = \begin{bmatrix} \times & \times & \times \\ 0 & + & + \\ 0 & + & + \end{bmatrix} = A_1, \qquad A_1 L_1^T = \begin{bmatrix} \times & 0 & 0 \\ 0 & + & + \\ 0 & + & + \end{bmatrix} = A_2 = L_1 A L_1^T$$
K. Bergen (ICME) Applied Linear Algebra 80 / 140
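For solving $Ax = b$ with a symmetric positive definite $A$, the Cholesky factor replaces the LU pair; a minimal sketch reusing the positive definite example above (the right-hand side is illustrative):

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[ 4.0, -2.0],
              [-2.0, 10.0]])                 # positive definite example above
b = np.array([1.0, 2.0])

L = np.linalg.cholesky(A)                    # lower triangular L with A = L L^T
y = solve_triangular(L, b, lower=True)       # forward substitution: L y = b
x = solve_triangular(L.T, y, lower=False)    # back substitution: L^T x = y
print(np.allclose(A @ x, b))                 # True
```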
Questions?

K. Bergen (ICME) Applied Linear Algebra 81 / 140


Mini-Quiz 10:

Quadratic forms: a function q(x) : Rn R if it is a linear


combination of functions of the form xi xj . A quadratic form can
be written as

q(x) = xT Ax, where A Rnn is symmetric.

Let B Rmn . Show that q(x) = kBxk2 is a quadratic form, find


A such that q(x) = xT Ax, and determine the definiteness of A.

Given A, B, find the permutation matrix P such that P BP = A:



1 2 3 1 3 2
A = 4 5 6 , B = 7 9 8
7 8 9 4 6 5

K. Bergen (ICME) Applied Linear Algebra 82 / 140


Solution (Q10):

Let $B \in \mathbb{R}^{m \times n}$. Show that $q(x) = \|Bx\|^2$ is a quadratic form, find $A$ such that $q(x) = x^T A x$, and determine the definiteness of $A$.
Solution:
$$q(x) = (Bx)^T (Bx) = x^T B^T B x = x^T A x, \qquad \text{where } A = B^T B.$$
$A$ is positive semi-definite, since $q(x) = \|Bx\|^2 \ge 0$ for all $x \in \mathbb{R}^n$.

Given $A, B$, find the permutation matrix $P$ such that $P B P = A$:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 3 & 2 \\ 7 & 9 & 8 \\ 4 & 6 & 5 \end{bmatrix}, \quad \text{and} \quad P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 83 / 140


Orthogonalization

K. Bergen (ICME) Applied Linear Algebra 84 / 140


Orthogonal (Unitary) Matrices
Recall that two vectors $x, y \in \mathbb{R}^n$ are orthogonal if $x^T y = 0$, and a set of vectors $\{u_1, \dots, u_n\}$ is orthonormal if
$$u_i^T u_j = \begin{cases} 1 & \text{for } i = j \\ 0 & \text{for } i \ne j \end{cases}, \qquad \|u_i\| = 1 \ \forall i$$

Orthogonal (unitary) matrix: a matrix $Q \in \mathbb{R}^{n \times n}$ is orthogonal if its columns $q_1, \dots, q_n \in \mathbb{R}^n$ form an orthonormal set:
$$Q^T Q = \begin{bmatrix} q_1^T \\ \vdots \\ q_n^T \end{bmatrix} \begin{bmatrix} q_1 & \cdots & q_n \end{bmatrix} = I_n$$
- If $Q$ is orthogonal, then $Q^{-1} = Q^T$ (the inverse is easy to compute!).
- Orthogonal transformations preserve lengths and angles, and represent rotations or reflections:
$$\|Qx\|_2 = \sqrt{x^T Q^T Q x} = \sqrt{x^T I x} = \|x\|_2$$
K. Bergen (ICME) Applied Linear Algebra 85 / 140
Projectors

Recall our notation for the projection of vector $v$ onto $u$:
$$\mathrm{proj}_u v = \Big( \frac{u^T v}{\|u\|} \Big) \frac{u}{\|u\|}$$
We can generalize this idea to projections onto subspaces of $\mathbb{R}^n$.

Projector: a square matrix $P \in \mathbb{R}^{n \times n}$ is a projector if $P = P^2$.
- For $v \in \mathbb{R}^n$, $Pv$ represents the projection of $v$ into the range of $P$.
- If $v \in R(P)$ (i.e. $\exists x : v = Px$), then $Pv = P^2 x = Px = v$.
K. Bergen (ICME) Applied Linear Algebra 86 / 140


Orthogonal Projectors

If $x \in \mathbb{R}^m$ and $A \in \mathbb{R}^{m \times n}$ is a matrix whose columns span a subspace of $\mathbb{R}^m$, then $x$ can be decomposed as
$$x = x_\parallel + x_\perp, \qquad \text{where } x_\parallel \in R(A) \text{ and } x_\perp \in R(A)^\perp.$$

$x_\parallel$ is the orthogonal projection of $x$ onto $R(A)$.
- $x_\parallel$ is the vector in $R(A)$ closest to $x$, in the sense that
$$\|x - x_\parallel\| < \|x - v\| \quad \forall v \in R(A),\ v \ne x_\parallel.$$
K. Bergen (ICME) Applied Linear Algebra 87 / 140


... Orthogonal Projectors

An orthogonal projector is a projector for which the range and nullspace are orthogonal complements.
- An orthogonal projector satisfies $P^T = P$.
- If $u \in \mathbb{R}^n$ is a unit vector, then $P_u = u u^T$ is the orthogonal projection onto the line containing $u$.
- If $u_1, \dots, u_k$ are an orthonormal basis for a subspace $V$ and $A = [u_1 \cdots u_k]$, then the projection onto $V$ is
$$P_A = A A^T,$$
or equivalently:
$$\mathrm{proj}_A x = P_A x = (u_1^T x)\, u_1 + \cdots + (u_k^T x)\, u_k = \mathrm{proj}_{u_1} x + \cdots + \mathrm{proj}_{u_k} x$$
K. Bergen (ICME) Applied Linear Algebra 88 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 89 / 140


Mini-Quiz 11:

Show that if $P$ is an orthogonal projector, then $I - 2P$ is orthogonal.
Hint: $P^T = P$ for orthogonal projectors and $P^2 = P$ for all projectors. $Q$ is orthogonal if $Q^T Q = I$.

Find the matrix for the orthogonal projector $P_A$ onto $R(A)$:
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}.$$
Hint: The columns of $A$ are orthogonal, but not orthonormal, because the first column does not have length 1. The projection onto $R(Q)$ is $P_Q = Q Q^T$ if the columns of $Q$ are orthonormal.
K. Bergen (ICME) Applied Linear Algebra 90 / 140


Solution (Q11):
Show that if $P$ is an orthogonal projector, then $I - 2P$ is orthogonal.
Solution:
$$\begin{aligned}
(I - 2P)^T (I - 2P) &= I - 2P^T - 2P + 4 P^T P \\
&= I - 4P + 4P^2 \qquad (P^T = P) \\
&= I - 4P + 4P \qquad\ (P^2 = P) \\
&= I
\end{aligned}$$

Find the orthogonal projector $P_A$ onto $R(A)$, for $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}$:
$$R(A) = R(Q), \qquad Q = \begin{bmatrix} \tfrac{1}{\sqrt{2}} & 0 \\ 0 & 1 \\ \tfrac{1}{\sqrt{2}} & 0 \end{bmatrix}$$
$$P_A = P_Q = Q Q^T = \begin{bmatrix} \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ 0 & 1 & 0 \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 91 / 140
Gram-Schmidt Orthogonalization
Gram-Schmidt algorithm: a method for computing an orthonormal basis $\{q_1, \dots, q_n\}$ for the columns of $A = [a_1, \dots, a_n] \in \mathbb{R}^{m \times n}$.
Each vector is orthogonalized with respect to the previous vectors and then normalized:
$$\begin{aligned}
v_1 &= a_1, & q_1 &= v_1 / \|v_1\| \\
v_2 &= a_2 - \mathrm{proj}_{q_1} a_2, & q_2 &= v_2 / \|v_2\| \\
&\ \ \vdots & &\ \ \vdots \\
v_k &= a_k - \sum_{i=1}^{k-1} \mathrm{proj}_{q_i} a_k, & q_k &= v_k / \|v_k\|
\end{aligned}$$

In matrix form: $A = QR$, where $Q = [\, q_1, \dots, q_n \,]$ has orthonormal columns, and
$$R_{ij} = \begin{cases} q_i^T a_j & \text{if } i \le j \\ 0 & \text{if } i > j \end{cases}$$
K. Bergen (ICME) Applied Linear Algebra 92 / 140
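The algorithm translates almost line-for-line into code; a minimal sketch of classical Gram-Schmidt producing the reduced factorization (it assumes $A$ has full column rank and ignores the numerical-stability issues of the classical variant):

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt: returns Q (orthonormal columns) and R."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        v = A[:, k].copy()
        for i in range(k):
            R[i, k] = Q[:, i] @ A[:, k]   # R_ij = q_i^T a_j for i <= j
            v -= R[i, k] * Q[:, i]        # subtract proj_{q_i} a_k
        R[k, k] = np.linalg.norm(v)
        Q[:, k] = v / R[k, k]             # normalize v_k
    return Q, R

A = np.random.rand(5, 3)
Q, R = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: columns are orthonormal
print(np.allclose(Q @ R, A))             # True: A = QR
```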


QR Decomposition
In the process of orthogonalizing the columns of a matrix $A$, the Gram-Schmidt algorithm computes the QR decomposition of $A$.

QR decomposition: any matrix $A \in \mathbb{R}^{m \times n}$ can be decomposed as
$$A = QR,$$
where $Q \in \mathbb{R}^{m \times m}$ is an orthogonal matrix and $R \in \mathbb{R}^{m \times n}$ is an upper triangular matrix.

For rectangular $A \in \mathbb{R}^{m \times n}$, $m \ge n$:
$$A = QR = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_1 \\ 0 \end{bmatrix} = Q_1 R_1,$$
where $R_1 \in \mathbb{R}^{n \times n}$ is upper triangular, and $Q_1 \in \mathbb{R}^{m \times n}$ and $Q_2 \in \mathbb{R}^{m \times (m-n)}$ both have orthonormal columns.
K. Bergen (ICME) Applied Linear Algebra 93 / 140


QR Decomposition for Solving Linear Systems

Consider the linear system $Ax = b$, where $A \in \mathbb{R}^{n \times n}$ is a square, nonsingular matrix and $b \in \mathbb{R}^n$.
- The QR decomposition expresses the matrix $A = QR$ as the product of an orthogonal matrix $Q$ ($Q Q^T = I_n$) and an upper triangular matrix $R$:
$$Ax = b \implies QRx = b \implies Rx = Q^T b$$
- $y = Q^T b$ can be easily computed, and $Rx = y$ is a triangular system that can be solved by back substitution.
K. Bergen (ICME) Applied Linear Algebra 94 / 140
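A minimal sketch of this recipe in NumPy, reusing the $2 \times 2$ system from the earlier example:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, -3.0]])
b = np.array([7.0, 7.0])

Q, R = np.linalg.qr(A)       # A = QR
y = Q.T @ b                  # y = Q^T b
x = np.linalg.solve(R, y)    # solve the triangular system R x = y
print(x)                     # [ 4. -1.]
```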


Questions?

K. Bergen (ICME) Applied Linear Algebra 95 / 140


Mini-Quiz 12:

Use Gram-Schmidt to compute an orthonormal basis for the columns of $A = [\, a_1 \ a_2 \ a_3 \,]$.

Hint: All three columns are normalized to length 1. The first and second columns are orthogonal, and the second and third columns are orthogonal. The projection operator for normalized vectors is $\mathrm{proj}_u v = (u^T v)\, u$.
K. Bergen (ICME) Applied Linear Algebra 96 / 140


Solution (Q12):
Use Gram-Schmidt to compute an orthonormal basis for the columns of $A = [\, a_1 \ a_2 \ a_3 \,]$.

Since $a_1$ and $a_2$ are already orthonormal, the first two basis vectors require no work:
$$q_1 = a_1, \qquad q_2 = a_2$$
Since $a_2 \perp a_3$, the projection of $a_3$ onto $q_2$ vanishes, so only one projection must be subtracted:
$$v_3 = a_3 - \mathrm{proj}_{q_1} a_3 - \mathrm{proj}_{q_2} a_3 = a_3 - (a_1^T a_3)\, a_1, \qquad q_3 = \frac{v_3}{\|v_3\|}$$
K. Bergen (ICME) Applied Linear Algebra 97 / 140


Least Squares Problems

K. Bergen (ICME) Applied Linear Algebra 98 / 140


Least Squares

We want to solve $Ax = b$, but what do we do if $b \notin R(A)$ (i.e. there does not exist a vector $x$ such that $Ax = b$)?
- For example, let $A \in \mathbb{R}^{m \times n}$, $m > n$, be a tall-and-skinny (overdetermined) matrix. Then for most $b \in \mathbb{R}^m$, there is no solution $x \in \mathbb{R}^n$ such that $Ax = b$.

Least squares problem: define the residual $r = b - Ax$ and find the vector $x$ that minimizes
$$\|r\|_2^2 = \|b - Ax\|_2^2.$$
K. Bergen (ICME) Applied Linear Algebra 99 / 140


... Least Squares

We can decompose any vector $b \in \mathbb{R}^m$ into components $b = b_1 + b_2$, with $b_1 \in R(A)$ and $b_2 \in N(A^T)$.

Since $b_2$ is in the orthogonal complement of $R(A)$, we obtain the following expression for the residual norm:
$$\|r\|_2^2 = \|b_1 - Ax + b_2\|_2^2 = \|b_1 - Ax\|_2^2 + \|b_2\|_2^2,$$
which is minimized when $Ax = b_1$ and $r = b_2 \in N(A^T)$.
K. Bergen (ICME) Applied Linear Algebra 100 / 140


Normal Equations

The least squares solution $x$ occurs when $r = b_2 \in N(A^T)$, or equivalently $A^T r = A^T (b - Ax) = 0$.

Rearranging this expression gives the normal equations:
$$A^T A x = A^T b$$
- If $A$ has full column rank, then $\tilde{A} = A^T A$ is invertible. Thus $\tilde{A} x = \tilde{b}$ (where $\tilde{b} = A^T b$) has a unique solution $x$.
K. Bergen (ICME) Applied Linear Algebra 101 / 140


Least Squares and Orthogonal Projections
If $x^\star \in \mathbb{R}^n$ is the least squares solution to $Ax = b$, then $A x^\star$ is the orthogonal projection of $b$ onto $R(A)$:
$$\begin{aligned}
x^\star = \arg\min_x \|b - Ax\| &\iff \|b - A x^\star\| \le \|b - Ax\| \ \forall x \in \mathbb{R}^n \\
&\implies A x^\star = P_A b \qquad (P_A \text{ is the projection onto } R(A)) \\
&\implies b - A x^\star \in R(A)^\perp = N(A^T) \\
&\implies A^T (b - A x^\star) = 0 \\
&\implies A^T A x^\star = A^T b
\end{aligned}$$

Combining the solution $x^\star = (A^T A)^{-1} A^T b$ and $A x^\star = P_A b$ gives the matrix for the orthogonal projection onto $R(A)$:
$$P_A = A (A^T A)^{-1} A^T$$
K. Bergen (ICME) Applied Linear Algebra 102 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 103 / 140


Mini-Quiz 13
Find the least squares solution of
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$
by solving $Ax = b_1$ or $A^T A x = A^T b$.
Hints:
$$R(A) = \left\{ \alpha \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \,\middle|\, \alpha, \beta \in \mathbb{R} \right\}, \qquad N(A^T) = \left\{ \gamma \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \,\middle|\, \gamma \in \mathbb{R} \right\},$$
$$b = 1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 1.5 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + 0.5 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 104 / 140


Solution (Q13):
Find the least squares solution of
$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}.$$
Solution:
$$b = b_1 + b_2, \qquad b_1 = \begin{bmatrix} 1 \\ 1.5 \\ 1.5 \end{bmatrix}, \quad b_2 = \begin{bmatrix} 0 \\ 0.5 \\ -0.5 \end{bmatrix}, \qquad A^T A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad A^T b = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$$
$$Ax = b_1 \implies \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1.5 \\ 1.5 \end{bmatrix} \implies \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1.5 \end{bmatrix}$$
$$A^T A x = A^T b \implies \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \implies \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1.5 \end{bmatrix}$$
K. Bergen (ICME) Applied Linear Algebra 105 / 140
Least Squares via QR

The normal equations $A^T A x = A^T b$ can be rewritten using the QR factorization $A = QR$:
$$\begin{aligned}
A^T A x &= A^T b \\
R^T Q^T Q R x &= R^T Q^T b \\
R^T R x &= R^T Q^T b \\
R x &= Q^T b
\end{aligned}$$

Solution using the QR decomposition:
- compute the QR factorization of $A$: $A = QR$
- form $y = Q^T b$
- solve $Rx = y$ by back substitution
K. Bergen (ICME) Applied Linear Algebra 106 / 140
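A minimal sketch comparing the QR route with NumPy's built-in least-squares solver, on the system from Mini-Quiz 13:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])
b = np.array([1.0, 2.0, 1.0])

Q, R = np.linalg.qr(A)                         # reduced QR: Q is 3x2, R is 2x2
x_qr = np.linalg.solve(R, Q.T @ b)

x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)   # library least-squares routine

print(x_qr, x_ls)    # both give [1.  1.5], matching Mini-Quiz 13
```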


Least Squares via Cholesky

$M = A^T A$ is symmetric and positive definite (when $A$ has full column rank), and can be rewritten in terms of the Cholesky factorization $M = L L^T$.

Solution using the Cholesky factorization:
- form $M = A^T A$
- compute the Cholesky factorization of $M$: $M = L L^T$
- form $y = A^T b$
- solve $Lz = y$ by forward substitution
- solve $L^T x = z$ by back substitution
K. Bergen (ICME) Applied Linear Algebra 107 / 140


Matrix Calculus

Vector-by-scalar and scalar-by-vector derivatives:
$$\frac{\partial x}{\partial \alpha} = \begin{bmatrix} \frac{\partial x_1}{\partial \alpha} \\ \frac{\partial x_2}{\partial \alpha} \\ \vdots \\ \frac{\partial x_n}{\partial \alpha} \end{bmatrix}, \qquad \frac{\partial \alpha}{\partial x} = \begin{bmatrix} \frac{\partial \alpha}{\partial x_1} & \frac{\partial \alpha}{\partial x_2} & \cdots & \frac{\partial \alpha}{\partial x_n} \end{bmatrix}, \qquad \alpha \in \mathbb{R},\ x \in \mathbb{R}^n$$

Vector-by-vector derivatives:
$$\frac{\partial y}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_2}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_1} \\ \frac{\partial y_1}{\partial x_2} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_2} \\ \vdots & \vdots & & \vdots \\ \frac{\partial y_1}{\partial x_n} & \frac{\partial y_2}{\partial x_n} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}, \qquad y \in \mathbb{R}^m,\ x \in \mathbb{R}^n$$
K. Bergen (ICME) Applied Linear Algebra 108 / 140


... Matrix Calculus⁴

Vector-by-vector identities:
$$\frac{\partial (x^T x)}{\partial x} = 2x, \qquad \frac{\partial (b^T x)}{\partial x} = b, \qquad \frac{\partial (Ax)}{\partial x} = A^T$$
$$\frac{\partial (x^T A x)}{\partial x} = (A + A^T)\, x = 2Ax \ \text{ if } A = A^T$$

Derivative of a quadratic:
$$\frac{\partial}{\partial x} \|b - Ax\|_2^2 = \frac{\partial}{\partial x} \big( x^T A^T A x - 2 b^T A x + b^T b \big) = 2 \big( A^T A x - A^T b \big)$$
- Setting this expression equal to zero gives the normal equations:
$$\frac{\partial}{\partial x} \|b - Ax\|_2^2 = 0 \implies A^T A x = A^T b$$

⁴ see http://en.wikipedia.org/wiki/Matrix_calculus for more properties.
K. Bergen (ICME) Applied Linear Algebra 109 / 140
Eigenvalues and Eigenvectors

K. Bergen (ICME) Applied Linear Algebra 110 / 140


Eigenvalues and Eigenvectors

For any square matrix $A \in \mathbb{R}^{n \times n}$, there is at least one scalar $\lambda$ (possibly complex) and a corresponding vector $v \ne 0$ such that
$$Av = \lambda v \qquad \text{or equivalently} \qquad (A - \lambda I)\, v = 0.$$

The scalar $\lambda$ is an eigenvalue of $A$, and $v$ is an eigenvector corresponding to the eigenvalue $\lambda$.
K. Bergen (ICME) Applied Linear Algebra 111 / 140


Geometric Interpretation of Eigenvectors⁵

[Figure: Under the transformation matrix $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, the directions of vectors parallel to $v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ (blue) and $v_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ (purple) are preserved.]

⁵ diagram from wikipedia.org/wiki/Eigenvalues_and_eigenvectors
K. Bergen (ICME) Applied Linear Algebra 112 / 140
... Eigenvalues and Eigenvectors

The set of distinct eigenvalues of $A$ is called the spectrum of $A$ and is denoted $\lambda(A)$:
$$\lambda(A) := \{ \lambda \in \mathbb{R} \mid A - \lambda I \text{ is singular} \}$$

The magnitude of the largest eigenvalue in absolute value is called the spectral radius of $A$:
$$\rho(A) = \max_{\lambda_i \in \lambda(A)} |\lambda_i|.$$

The set of all eigenvectors associated with an eigenvalue $\lambda$ forms the eigenspace, $E_\lambda = N(A - \lambda I)$, associated with $\lambda$.
K. Bergen (ICME) Applied Linear Algebra 113 / 140


Similar Matrices

Two matrices $A$ and $B$ are similar if $B = P^{-1} A P$ for some nonsingular matrix $P$; similar matrices share the same eigenvalues.
- Similar matrices represent the same linear transformation under two different bases. $P$ is called a similarity transformation or change of basis matrix.
- $A$ and $B$ do not in general have the same eigenvectors: if $v$ is an eigenvector of $A$, then $P^{-1} v$ is an eigenvector of $B$.

Given a square matrix $A \in \mathbb{R}^{n \times n}$, we wish to reduce it to its simplest form by means of a similarity transformation.
K. Bergen (ICME) Applied Linear Algebra 114 / 140


Diagonalizable Matrices

Diagonalizable matrix: a matrix $A \in \mathbb{R}^{n \times n}$ that is similar to a diagonal matrix, i.e. there exists an invertible matrix $P$ such that
$$P^{-1} A P = D, \qquad \text{where } D \text{ is diagonal.}$$

A matrix is diagonalizable if and only if it has $n$ linearly independent eigenvectors.
- Matrices with $n$ distinct eigenvalues are diagonalizable.

Real symmetric matrices are diagonalizable by unitary matrices.
- Matrices are diagonalizable by unitary matrices if and only if they are normal (a matrix is normal if it satisfies $A^T A = A A^T$).
K. Bergen (ICME) Applied Linear Algebra 115 / 140


Eigendecomposition

Let $A \in \mathbb{R}^{n \times n}$ be a square, diagonalizable matrix; then $A$ can be factorized as
$$A = V \Lambda V^{-1},$$
where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ is a diagonal matrix whose elements are the eigenvalues of $A$, and $V$ is a square matrix whose $i$th column $v_i$ is the eigenvector corresponding to the eigenvalue $\Lambda_{ii} = \lambda_i$.

- Note: this decomposition does not exist for every square matrix $A$; e.g. $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ cannot be diagonalized.
K. Bergen (ICME) Applied Linear Algebra 116 / 140


Spectral Theorem

Any symmetric matrix $A = A^T \in \mathbb{R}^{n \times n}$ has $n$ (not necessarily distinct) real eigenvalues, and $A$ can be decomposed with the symmetric eigenvalue decomposition:
$$A = \sum_{i=1}^n \lambda_i u_i u_i^T = U \Lambda U^T,$$
where $U$ is orthogonal and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ is diagonal.
K. Bergen (ICME) Applied Linear Algebra 117 / 140


Properties of Eigenvalues

The trace of $A \in \mathbb{R}^{n \times n}$ is equal to the sum of its eigenvalues:
$$\mathrm{tr}(A) = \sum_{i=1}^n \lambda_i$$

If $A \in \mathbb{R}^{n \times n}$ is diagonalizable, the rank of $A$ is equal to the number of non-zero eigenvalues of $A$.

If $A \in \mathbb{R}^{n \times n}$ is non-singular with eigenvalue $\lambda_i \ne 0$, then $\tfrac{1}{\lambda_i}$ is an eigenvalue of $A^{-1}$.

If $A \in \mathbb{R}^{n \times n}$ is symmetric positive definite ($A = A^T$ and $z^T A z > 0$ for all nonzero $z \in \mathbb{R}^n$), then all of its eigenvalues are positive.
K. Bergen (ICME) Applied Linear Algebra 118 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 119 / 140


Mini-Quiz 14

Show that if $\lambda$ is an eigenvalue of a nonsingular matrix $A$ with corresponding eigenvector $v$, then $v$ is also an eigenvector of $A^{-1}$ with eigenvalue $\tfrac{1}{\lambda}$.

Hint: we know that $Av = \lambda v$, and we want to show that $A^{-1} v = \tfrac{1}{\lambda} v$.
K. Bergen (ICME) Applied Linear Algebra 120 / 140


Solution (Q14)

Show that if $\lambda$ is an eigenvalue of a nonsingular matrix $A$ with corresponding eigenvector $v$, then $v$ is also an eigenvector of $A^{-1}$ with eigenvalue $\tfrac{1}{\lambda}$.
$$\begin{aligned}
Av &= \lambda v \\
A^{-1}(Av) &= A^{-1}(\lambda v) \\
v &= \lambda\, (A^{-1} v) \\
\tfrac{1}{\lambda}\, v &= A^{-1} v
\end{aligned}$$
(Note that $\lambda \ne 0$, since $A$ is nonsingular.)
K. Bergen (ICME) Applied Linear Algebra 121 / 140


Solving Eigenvalue Problems

Solving the eigenvalue problem $Av = \lambda v$ is a fundamentally more difficult problem than solving a linear system $Ax = b$.

Solving the eigenvalue problem for an $n \times n$ matrix can be reduced to finding the roots of an $n$th degree polynomial. For $n \ge 5$, no closed-form expression exists for the roots of an arbitrary $n$th degree polynomial in terms of its coefficients.

All algorithms that solve eigenvalue problems for matrices of arbitrary size are iterative. This contrasts with linear systems, which have algorithms that are guaranteed to produce the solution in a finite number of steps.
K. Bergen (ICME) Applied Linear Algebra 122 / 140


Solving Small Eigenvalue Problems
Recall that if $Av = \lambda v$, then $(A - \lambda I)\, v = 0$.
- Since $(A - \lambda I)$ has a nonzero vector $v$ in its nullspace, it is singular (non-invertible).
- Therefore finding the eigenvalues of $A$ is equivalent to finding the values $\lambda$ for which $(A - \lambda I)$ is singular.

For a $2 \times 2$ matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \qquad \text{the matrix inverse is} \qquad A^{-1} = \tfrac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$
The quantity $ad - bc$ is called the determinant of $A$, $\det(A)$, and the inverse exists if and only if $\det(A) \ne 0$.

Therefore, for our $2 \times 2$ eigenvalue problem, we want to find the eigenvalues $\lambda$ such that $\det(A - \lambda I) = 0$:
$$(A - \lambda I) = \begin{bmatrix} (a - \lambda) & b \\ c & (d - \lambda) \end{bmatrix}, \qquad \det(A - \lambda I) = (a - \lambda)(d - \lambda) - bc = 0$$
K. Bergen (ICME) Applied Linear Algebra 123 / 140


... Solving Small Eigenvalue Problems
Example:
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad (A - \lambda I) = \begin{bmatrix} (2 - \lambda) & 1 \\ 1 & (2 - \lambda) \end{bmatrix}$$
$$\det(A - \lambda I) = (2 - \lambda)(2 - \lambda) - 1 \cdot 1 = \lambda^2 - 4\lambda + 3$$

This is a 2nd degree polynomial (i.e. a quadratic). We want to find the roots of this polynomial, $\lambda_1$ and $\lambda_2$, which will satisfy $\det(A - \lambda_i I) = 0$:
$$\lambda^2 - 4\lambda + 3 = 0 \implies (\lambda - 3)(\lambda - 1) = 0 \implies \lambda_1 = 3,\ \lambda_2 = 1$$
K. Bergen (ICME) Applied Linear Algebra 124 / 140


Power Method for Finding Eigenvectors
One method for finding the eigenvector $v_1$ corresponding to the largest eigenvalue $\lambda_1$ of $A \in \mathbb{R}^{n \times n}$ is called the power method:
Pick any vector $z \in \mathbb{R}^n$ and compute the sequence
$$\{ Az,\ A^2 z,\ A^3 z,\ A^4 z,\ \dots,\ A^k z \};$$
then the direction of $A^k z$ converges to that of $v_1$ as $k \to \infty$.

For diagonalizable $A$, $z$ can be written as a linear combination of the (linearly independent) eigenvectors of $A$:
$$z = \alpha_1 v_1 + \cdots + \alpha_n v_n$$
Therefore
$$Az = A(\alpha_1 v_1 + \cdots + \alpha_n v_n) = \alpha_1 A v_1 + \cdots + \alpha_n A v_n = \alpha_1 \lambda_1 v_1 + \cdots + \alpha_n \lambda_n v_n,$$
and iterating gives
$$A^k z = \alpha_1 \lambda_1^k v_1 + \cdots + \alpha_n \lambda_n^k v_n.$$
Since $\lambda_1$ is the largest eigenvalue (in magnitude), $\lambda_1^k$ will dominate the sum, so $A^k z$ will converge toward the direction of $v_1$ (provided $\alpha_1 \ne 0$).
K. Bergen (ICME) Applied Linear Algebra 125 / 140
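A minimal sketch of the power method with per-step normalization (normalization keeps the iterate at unit length without changing its direction; convergence assumes $\alpha_1 \ne 0$, which holds for a random start with probability 1):

```python
import numpy as np

def power_method(A, iters=100):
    z = np.random.rand(A.shape[0])     # random starting vector
    for _ in range(iters):
        z = A @ z                      # apply A
        z /= np.linalg.norm(z)         # renormalize to avoid overflow
    lam = z @ A @ z                    # Rayleigh quotient estimate of lambda_1
    return lam, z

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # matrix from the 2x2 example above
lam, v = power_method(A)
print(lam)   # ~3.0, the largest eigenvalue
print(v)     # ~(1, 1)/sqrt(2), the corresponding eigenvector direction
```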
The Singular Value Decomposition

K. Bergen (ICME) Applied Linear Algebra 126 / 140


Singular Value Decomposition

Theorem (Singular Value Decomposition): every matrix $A \in \mathbb{R}^{m \times n}$ has a decomposition, called the singular value decomposition (SVD), of the form
$$A = U \Sigma V^T = \sum_{i=1}^r \sigma_i u_i v_i^T,$$
where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are unitary matrices and
$$\Sigma = \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{m \times n}$$
is a diagonal matrix with $\Sigma_r = \mathrm{diag}(\sigma_1, \dots, \sigma_r)$ and $r = \mathrm{rank}(A) \le \min(m, n)$.

The positive numbers $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ are called the singular values of $A$ and are uniquely determined.
K. Bergen (ICME) Applied Linear Algebra 127 / 140


Geometric Interpretation of the SVD⁶
The image of the unit sphere under a matrix is a hyper-ellipse.

⁶ [Figure omitted: diagram from http://people.sc.fsu.edu/~jburkardt/latex/fsu_2006/svd.png]
K. Bergen (ICME) Applied Linear Algebra 128 / 140
Singular Values

The number of non-zero singular values, $r$, is the rank of the matrix.

The singular values are also related to a class of matrix norms:
Spectral norm (induced by the Euclidean vector norm):
$$\|A\|_2 = \sqrt{ \lambda_{\max}(A^T A) } = \sigma_{\max}(A).$$
Frobenius norm:
$$\|A\|_F = \sqrt{ \sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2 } = \sqrt{ \mathrm{trace}(A^T A) } = \sqrt{ \sum_{i=1}^r \sigma_i^2 }$$

K. Bergen (ICME) Applied Linear Algebra 129 / 140


Singular Values and Eigenvalues

The singular value decomposition of $A = U \Sigma V^T \in \mathbb{R}^{m \times n}$ is related to the eigenvalue decomposition of the symmetric matrix $A^T A \in \mathbb{R}^{n \times n}$:
$$A^T A = (V \Sigma U^T)(U \Sigma V^T) = V \Sigma^2 V^T, \qquad \text{and similarly} \qquad A A^T = U \Sigma^2 U^T.$$
$$\implies \sigma_i(A) = \sqrt{ \lambda_i(A^T A) } = \sqrt{ \lambda_i(A A^T) }, \qquad i = 1, \dots, r$$

The singular values of $A$ are the square roots of the eigenvalues of both $A^T A$ and $A A^T$.
The right singular vectors, $V$, of $A$ are the eigenvectors of $A^T A$.
The left singular vectors, $U$, of $A$ are the eigenvectors of $A A^T$.
K. Bergen (ICME) Applied Linear Algebra 130 / 140


Singular Vectors
The columns of $U = [u_1, \dots, u_m]$ and $V = [v_1, \dots, v_n]$ are called the left and right singular vectors of $A$, respectively.
The columns of $U$ and $V$ form orthonormal sets of vectors, which can be regarded as orthonormal bases.
The singular vectors provide a convenient way to represent bases for the fundamental subspaces of the matrix $A$.
If $A \in \mathbb{R}^{m \times n}$ is a matrix of rank $r$, then
$$\begin{aligned}
R(A) &= \mathrm{span}\{ u_1, \dots, u_r \} & N(A^T) &= \mathrm{span}\{ u_{r+1}, \dots, u_m \} \\
R(A^T) &= \mathrm{span}\{ v_1, \dots, v_r \} & N(A) &= \mathrm{span}\{ v_{r+1}, \dots, v_n \}
\end{aligned}$$
K. Bergen (ICME) Applied Linear Algebra 131 / 140


Low-rank Approximation via the SVD

From the definition of the SVD, any matrix $A \in \mathbb{R}^{m \times n}$ of rank $r$ can be written as a sum of rank-one matrices:
$$A = U \Sigma V^T = \sum_{i=1}^r \sigma_i u_i v_i^T.$$

Theorem (low-rank approximation): Let $A \in \mathbb{R}^{m \times n}$ and $0 \le k \le r$; then
$$\min_{B :\, \mathrm{rank}(B) = k} \|A - B\|_2 = \sigma_{k+1},$$
where the minimum is attained by $B^\star = A_k = \sum_{i=1}^k \sigma_i u_i v_i^T$.
- Application: image compression
K. Bergen (ICME) Applied Linear Algebra 132 / 140
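A minimal sketch of the truncated-SVD construction of $A_k$, with a numerical check of the theorem (the test matrix is random and illustrative):

```python
import numpy as np

def best_rank_k(A, k):
    """Best rank-k approximation of A in the 2-norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]   # sum_{i<=k} sigma_i u_i v_i^T

A = np.random.rand(8, 6)
A2 = best_rank_k(A, 2)
s = np.linalg.svd(A, compute_uv=False)

print(np.linalg.matrix_rank(A2))                     # 2
print(np.isclose(np.linalg.norm(A - A2, 2), s[2]))   # error equals sigma_3
```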


Image Compression using SVD

[Figures: (a) rank-1 approximation (18,456 bytes, ~0.6%); (b) rank-2 approximation (36,912 bytes, ~1.1%); (c) rank-5 approximation (92,280 bytes, ~2.8%)]

K. Bergen (ICME) Applied Linear Algebra 133 / 140


Image Compression using SVD

[Figures: (a) rank-10 approximation (184,560 bytes, ~5.6%); (b) rank-20 approximation (369,120 bytes, ~11.1%); (c) rank-40 approximation (738,240 bytes, ~22.3%)]

K. Bergen (ICME) Applied Linear Algebra 134 / 140


Image Compression using SVD

[Figure: (a) original image (3,317,760 bytes)]

K. Bergen (ICME) Applied Linear Algebra 135 / 140


Questions?

K. Bergen (ICME) Applied Linear Algebra 136 / 140


Mini-Quiz 15
The SVD is useful in many linear algebra proofs, since it decomposes a matrix into orthogonal and diagonal factors, which have nice properties.

Let $A = U \Sigma V^T \in \mathbb{R}^{m \times n}$. Show that
$$\|A\|_2 = \sigma_{\max}(A) = \max_{\|x\| = 1,\ \|y\| = 1} y^T A x.$$
Hint: Use the SVD and the unitary invariance of the Euclidean vector norm ($\|Ux\| = \|x\|$).

Polar decomposition:
Show how any square matrix $A = U \Sigma V^T \in \mathbb{R}^{n \times n}$ can be written as
$$A = QS, \qquad Q \text{ orthogonal},\ S \text{ symmetric positive semi-definite.}$$
Hint: $V \Sigma V^T$ is symmetric positive semi-definite.

K. Bergen (ICME) Applied Linear Algebra 137 / 140


Solution (Q15)
Let $A = U \Sigma V^T$. Show that $\|A\|_2 = \sigma_{\max}(A) = \max_{\|x\| = 1,\, \|y\| = 1} y^T A x$.
Solution:
$$\begin{aligned}
\max_{\|x\| = 1,\, \|y\| = 1} y^T A x &= \max_{\|x\| = 1,\, \|y\| = 1} y^T (U \Sigma V^T) x \\
&= \max_{\|V^T x\| = 1,\, \|U^T y\| = 1} (U^T y)^T\, \Sigma\, (V^T x) \\
&= \max_{\|\tilde{x}\| = 1,\, \|\tilde{y}\| = 1} \tilde{y}^T \Sigma \tilde{x} \qquad (\tilde{x} = V^T x,\ \tilde{y} = U^T y) \\
&= \max_i \Sigma_{ii} = \sigma_{\max}(A)
\end{aligned}$$

Show how any square matrix $A = U \Sigma V^T$ can be written as $A = QS$, with $Q$ orthogonal and $S$ symmetric positive semi-definite.
Solution:
$$A = U \Sigma V^T = U I_n \Sigma V^T = U (V^T V) \Sigma V^T = (U V^T)(V \Sigma V^T)$$
$$A = QS, \qquad \text{where } Q = U V^T \text{ and } S = V \Sigma V^T.$$
K. Bergen (ICME) Applied Linear Algebra 138 / 140
The End!

K. Bergen (ICME) Applied Linear Algebra 139 / 140


References and Acknowledgements
1. Numerical Linear Algebra by Lloyd N. Trefethen & David Bau III.
2. Matrix Computations by Gene H. Golub & Charles F. Van Loan.
3. Linear Algebra and its Applications by Gilbert Strang.
4. Linear Algebra with Applications by Otto Bretscher.
5. www.Wikipedia.org and Mathworld.Wolfram.com

Thanks to former ICME Refresher Course instructors Yuekai Sun, Nicole Taheri, and Milinda Lakkam, whose slides served as useful references in creating this presentation.

Thanks to Carlos Sing-Long and Anil Damle for their feedback on this presentation.

K. Bergen (ICME) Applied Linear Algebra 140 / 140
