5.1 Length and Dot Product in Rⁿ
5.2 Inner Product Spaces
5.3 Orthonormal Bases: Gram-Schmidt Process
5.4 Mathematical Models and Least Squares Analysis
5.5 Applications of Inner Product Spaces
5.1
5.1 Length and Dot Product in Rⁿ
Length (長度):
The length (or norm) of a vector v = (v1, v2, …, vn) in Rⁿ is given by
  ||v|| = √(v1² + v2² + ⋯ + vn²)
Ex: v = (0, −2, 1, 4, −2) in R⁵:
  ||v|| = √(0² + (−2)² + 1² + 4² + (−2)²) = √25 = 5
Ex: v = (2/√17, −2/√17, 3/√17) in R³:
  ||v|| = √(4/17 + 4/17 + 9/17) = √(17/17) = 1
5.3
A standard unit vector (標準單位向量) in Rⁿ: only one component of the vector is 1 and the others are 0 (thus the length of this vector must be 1)
5.4
Theorem 5.1: Length of a scalar multiple
Let v be a vector in Rⁿ and c be a scalar. Then
  ||cv|| = |c| ||v||
Pf:
v = (v1, v2, …, vn) ⟹ cv = (cv1, cv2, …, cvn)
||cv|| = ||(cv1, cv2, …, cvn)||
  = √((cv1)² + (cv2)² + ⋯ + (cvn)²)
  = √(c²(v1² + v2² + ⋯ + vn²))
  = |c| √(v1² + v2² + ⋯ + vn²)
  = |c| ||v||
5.5
Theorem 5.2: How to find the unit vector in the direction of v
If v is a nonzero vector in Rⁿ, then the vector u = v/||v|| has length 1 and has the same direction as v. This vector u is called the unit vector in the direction of v
Pf:
v is nonzero ⟹ ||v|| ≠ 0 ⟹ 1/||v|| > 0
u = (1/||v||) v is a positive scalar multiple of v (so u has the same direction as v)
||u|| = ||(1/||v||) v|| = (1/||v||) ||v|| = 1 (so u has length 1, by Theorem 5.1: ||cv|| = |c| ||v||)
5.6
Notes:
(1) The vector v/||v|| is called the unit vector in the direction of v
(2) The process of finding the unit vector in the direction of v is called normalizing the vector v
5.7
Ex 2: Finding a unit vector
Find the unit vector in the direction of v = (3, −1, 2), and verify that this vector has length 1
Sol:
v = (3, −1, 2) ⟹ ||v|| = √(3² + (−1)² + 2²) = √14
v/||v|| = (1/√14)(3, −1, 2) = (3/√14, −1/√14, 2/√14)
⟹ √((3/√14)² + (−1/√14)² + (2/√14)²) = √(14/14) = 1
⟹ v/||v|| is a unit vector
5.8
Distance between two vectors:
The distance between two vectors u and v in Rⁿ is
  d(u, v) = ||u − v||
Properties of distance:
(1) d(u, v) ≥ 0
(2) d(u, v) = 0 if and only if u = v
(3) d(u, v) = d(v, u) (commutative property of the distance function)
5.9
Ex 3: Finding the distance between two vectors
The distance between u = (0, 2, 2) and v = (2, 0, 1) is
  d(u, v) = ||u − v|| = ||(0 − 2, 2 − 0, 2 − 1)|| = √((−2)² + 2² + 1²) = 3
5.10
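※ A minimal numpy sketch of the length, normalization, and distance formulas above, reusing the vectors of Ex 2 and Ex 3 (numpy is assumed purely for illustration):

```python
import numpy as np

v = np.array([3.0, -1.0, 2.0])            # vector from Ex 2
u = np.array([0.0, 2.0, 2.0])             # vectors from Ex 3
w = np.array([2.0, 0.0, 1.0])

length = np.linalg.norm(v)                # ||v|| = sqrt(14)
unit = v / length                         # normalizing v
dist = np.linalg.norm(u - w)              # d(u, w) = ||u - w|| = 3

print(length, np.linalg.norm(unit), dist) # 3.7417..., 1.0, 3.0
```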
Dot product (點積) in Rⁿ:
The dot product of u = (u1, u2, …, un) and v = (v1, v2, …, vn) returns a scalar quantity
  u · v = u1v1 + u2v2 + ⋯ + unvn (u · v is a real number)
(The dot product is defined as the sum of component-by-component multiplications)
Theorem 5.3: Properties of the dot product
Let u, v, and w be vectors in Rⁿ, and let c be a scalar
(1) u · v = v · u
(2) u · (v + w) = u · v + u · w
(3) c(u · v) = (cu) · v = u · (cv)
(4) v · v = ||v||²
(5) v · v ≥ 0, and v · v = 0 if and only if v = 0
※ The proofs of the above properties follow simply from the definition of the dot product in Rⁿ
5.12
Euclidean n-space:
– In Section 4.1, Rⁿ was defined to be the set of all ordered n-tuples of real numbers
– When Rⁿ is combined with the standard operations of vector addition, scalar multiplication, vector length, and the dot product, the resulting vector space is called Euclidean n-space (歐幾里德 n維空間)
5.13
Ex 5: Finding dot products
u = (2, −2), v = (5, 8), w = (−4, 3)
(a) u · v  (b) (u · v)w  (c) u · (2v)  (d) ||w||²  (e) u · (v − 2w)
Sol:
(a) u · v = (2)(5) + (−2)(8) = −6
(b) (u · v)w = −6w = −6(−4, 3) = (24, −18)
(c) u · (2v) = 2(u · v) = 2(−6) = −12
(d) ||w||² = w · w = (−4)(−4) + (3)(3) = 25
(e) v − 2w = (5 − (−8), 8 − 6) = (13, 2)
    u · (v − 2w) = (2)(13) + (−2)(2) = 26 − 4 = 22
5.14
Ex 6: Using the properties of the dot product
Given u · u = 39, u · v = −3, and v · v = 79, find (u + 2v) · (3u + v)
Sol:
(u + 2v) · (3u + v) = u · (3u + v) + 2v · (3u + v)
  = u · (3u) + u · v + (2v) · (3u) + (2v) · v
  = 3(u · u) + u · v + 6(v · u) + 2(v · v)
  = 3(u · u) + 7(u · v) + 2(v · v)
  = 3(39) + 7(−3) + 2(79) = 254
5.15
Theorem 5.4: The Cauchy-Schwarz inequality (科西-舒瓦茲不等式)
If u and v are vectors in Rⁿ, then
  |u · v| ≤ ||u|| ||v|| (|u · v| denotes the absolute value of u · v)
(The geometric interpretation of this inequality is shown on the next slide)
The angle (夾角) θ between two nonzero vectors u and v in Rⁿ is defined by
  cos θ = (u · v)/(||u|| ||v||), 0 ≤ θ ≤ π
(In R² this follows from the law of cosines, using ||v − u||² = (u1 − v1)² + (u2 − v2)², ||u||² = u1² + u2², and ||v||² = v1² + v2²)
Same direction: θ = 0, cos θ = 1; 0 < θ < π/2: cos θ > 0; θ = π/2: cos θ = 0; π/2 < θ < π: cos θ < 0; opposite direction: θ = π, cos θ = −1
Note:
The angle between the zero vector and another vector is not defined (since the denominator ||u|| ||v|| cannot be zero)
5.18
Ex 8: Finding the angle between two vectors
u = (−4, 0, 2, −2), v = (2, 0, −1, 1)
Sol:
||u|| = √(u · u) = √((−4)² + 0² + 2² + (−2)²) = √24
||v|| = √(v · v) = √(2² + 0² + (−1)² + 1²) = √6
u · v = (−4)(2) + (0)(0) + (2)(−1) + (−2)(1) = −12
cos θ = (u · v)/(||u|| ||v||) = −12/(√24 √6) = −12/√144 = −1
⟹ θ = π (u and v have opposite directions; in fact u = −2v)
5.20
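※ A small numpy sketch of the angle computation in Ex 8 (np.clip guards against cos θ drifting slightly outside [−1, 1] through rounding):

```python
import numpy as np

u = np.array([-4.0, 0.0, 2.0, -2.0])
v = np.array([2.0, 0.0, -1.0, 1.0])

cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
print(theta)   # pi, since u = -2v points in the opposite direction of v
```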
Ex 10: Finding orthogonal vectors
Determine all vectors in R² that are orthogonal to u = (4, 2)
(Two vectors u and v are orthogonal (正交) if u · v = 0)
Sol:
Let v = (v1, v2); then
  u · v = (4, 2) · (v1, v2) = 4v1 + 2v2 = 0
⟹ v1 = −t/2, v2 = t
⟹ v = (−t/2, t), t ∈ R
5.21
Theorem 5.5: The triangle inequality (三角不等式)
If u and v are vectors in Rⁿ, then ||u + v|| ≤ ||u|| + ||v||
Pf:
||u + v||² = (u + v) · (u + v)
  = u · (u + v) + v · (u + v) = u · u + 2(u · v) + v · v
  = ||u||² + 2(u · v) + ||v||² ≤ ||u||² + 2|u · v| + ||v||² (since c ≤ |c|)
  ≤ ||u||² + 2||u|| ||v|| + ||v||² (Cauchy-Schwarz inequality)
  = (||u|| + ||v||)²
(The geometric representation of the triangle inequality: for any triangle, the sum of the lengths of any two sides is larger than the length of the third side (see the next slide))
Note:
Equality occurs in the triangle inequality if and only if the vectors u and v have the same direction (in this situation, cos θ = 1 and thus u · v = ||u|| ||v|| ≥ 0)
5.22
Theorem 5.6: The Pythagorean theorem (畢氏定理)
If u and v are vectors in Rⁿ, then u and v are orthogonal if and only if
  ||u + v||² = ||u||² + ||v||²
(This follows because u · v = 0 in the proof of Theorem 5.5)
※ The geometric meaning: for any right triangle, the sum of the squares of the lengths of the two legs (兩股) equals the square of the length of the hypotenuse (斜邊)
Dot product and matrix multiplication:
A vector u = (u1, u2, …, un) in Rⁿ can be represented as an n×1 column matrix u = [u1; u2; …; un], and similarly v = [v1; v2; …; vn]
  u · v = uᵀv = [u1 u2 ⋯ un][v1; v2; …; vn] = u1v1 + u2v2 + ⋯ + unvn
(The result of the dot product of u and v is the same as the result of the matrix multiplication of uᵀ and v)
5.24
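※ A small numpy sketch of the equivalence u · v = uᵀv, with u and v stored as n×1 column matrices (the vectors of Ex 5 are reused):

```python
import numpy as np

u = np.array([[2.0], [-2.0]])       # 2x1 column matrices
v = np.array([[5.0], [8.0]])

dot_as_matrix = (u.T @ v).item()    # u^T v is a 1x1 matrix; .item() extracts the scalar
dot_direct = np.dot(u.ravel(), v.ravel())
print(dot_as_matrix, dot_direct)    # both -6.0
```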
Keywords in Section 5.1:
length: 長度
norm: 範數
unit vector: 單位向量
standard unit vector: 標準單位向量
distance: 距離
dot product: 點積
Euclidean n-space: 歐基里德n維空間
Cauchy-Schwarz inequality: 科西-舒瓦茲不等式
angle: 夾角
triangle inequality: 三角不等式
Pythagorean theorem: 畢氏定理
5.25
5.2 Inner Product Spaces
Inner product (內積): represented by angle brackets 〈u, v〉
Let u, v, and w be vectors in a vector space V, and let c be any scalar. An inner product on V is a function that associates a real number 〈u, v〉 with each pair of vectors u and v and satisfies the following axioms (an abstract definition based on the properties of the dot product in Theorem 5.3 on Slide 5.12)
(1) 〈u, v〉 = 〈v, u〉 (commutative property of the inner product)
(2) 〈u, v + w〉 = 〈u, v〉 + 〈u, w〉 (distributive property of the inner product over vector addition)
(3) c〈u, v〉 = 〈cu, v〉 (associative property of scalar multiplication and the inner product)
(4) 〈v, v〉 ≥ 0, and 〈v, v〉 = 0 if and only if v = 0
5.26
Note:
u · v = the dot product (the Euclidean inner product for Rⁿ)
〈u, v〉 = a general inner product for a vector space V
Note:
A vector space V with an inner product is called an inner product space (內積空間)
Vector space: (V, +, ·)
Inner product space: (V, +, ·, 〈 , 〉)
5.27
Ex 1: The Euclidean inner product for Rⁿ
Show that the dot product in Rⁿ satisfies the four axioms of an inner product
Sol:
u = (u1, u2, …, un), v = (v1, v2, …, vn)
  〈u, v〉 = u · v = u1v1 + u2v2 + ⋯ + unvn
By Theorem 5.3, this dot product satisfies the required four axioms. Thus, the dot product is an inner product on Rⁿ
5.28
Ex 2: A different inner product for R²
Show that the following function defines an inner product on R², where u = (u1, u2) and v = (v1, v2):
  〈u, v〉 = u1v1 + 2u2v2
Sol:
(1) 〈u, v〉 = u1v1 + 2u2v2 = v1u1 + 2v2u2 = 〈v, u〉
(2) Let w = (w1, w2); then
  〈u, v + w〉 = u1(v1 + w1) + 2u2(v2 + w2)
    = u1v1 + u1w1 + 2u2v2 + 2u2w2
    = (u1v1 + 2u2v2) + (u1w1 + 2u2w2)
    = 〈u, v〉 + 〈u, w〉
5.29
(3) c〈u, v〉 = c(u1v1 + 2u2v2) = (cu1)v1 + 2(cu2)v2 = 〈cu, v〉
(4) 〈v, v〉 = v1² + 2v2² ≥ 0
  〈v, v〉 = 0 ⟹ v1² + 2v2² = 0 ⟹ v1 = v2 = 0 (i.e., v = 0)
5.30
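※ The four axioms can also be spot-checked numerically on random vectors; a small numpy sketch for the inner product of Ex 2 (a sanity check on sampled vectors, not a proof):

```python
import numpy as np

def weighted_inner(u, v):
    # <u, v> = u1*v1 + 2*u2*v2, the inner product of Ex 2
    return u[0] * v[0] + 2.0 * u[1] * v[1]

rng = np.random.default_rng(0)
u, v, w = rng.standard_normal((3, 2))
c = 1.7

assert np.isclose(weighted_inner(u, v), weighted_inner(v, u))           # Axiom 1
assert np.isclose(weighted_inner(u, v + w),
                  weighted_inner(u, v) + weighted_inner(u, w))          # Axiom 2
assert np.isclose(c * weighted_inner(u, v), weighted_inner(c * u, v))   # Axiom 3
assert weighted_inner(v, v) >= 0                                        # Axiom 4
```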
Ex 3: A function that is not an inner product
Show that the following function is not an inner product on R³:
  〈u, v〉 = u1v1 − 2u2v2 + u3v3
Sol:
Let v = (1, 2, 1). Then 〈v, v〉 = (1)(1) − 2(2)(2) + (1)(1) = −6 < 0, so Axiom 4 is not satisfied
5.31
Theorem 5.7: Properties of inner products
Let u, v, and w be vectors in an inner product space V, and let c be any real number
(1) 〈0, v〉 = 〈v, 0〉 = 0
(2) 〈u + v, w〉 = 〈u, w〉 + 〈v, w〉
(3) 〈u, cv〉 = c〈u, v〉
※ To prove these properties, you can use only the basic properties of vectors and the four axioms in the definition of an inner product (see Slide 5.26)
Pf:
(1) 〈0, v〉 = 〈0u, v〉 = 0〈u, v〉 = 0 (by Axiom 3)
(2) 〈u + v, w〉 = 〈w, u + v〉 = 〈w, u〉 + 〈w, v〉 = 〈u, w〉 + 〈v, w〉 (by Axioms 1 and 2)
(3) 〈u, cv〉 = 〈cv, u〉 = c〈v, u〉 = c〈u, v〉 (by Axioms 1 and 3)
5.32
※ The definitions of norm (or length), distance, angle, orthogonality, and normalization for general inner product spaces closely parallel those based on the dot product in Euclidean n-space
Norm (length) of u: ||u|| = √〈u, u〉
Distance between u and v: d(u, v) = ||u − v||
Angle between two nonzero vectors u and v: cos θ = 〈u, v〉/(||u|| ||v||), 0 ≤ θ ≤ π
Orthogonal: u and v are orthogonal if 〈u, v〉 = 0
Normalizing v: the unit vector in the direction of v is v/||v|| (if v is not a zero vector)
5.34
Ex 6: An inner product in the polynomial space
For p = a0 + a1x + ⋯ + anxⁿ and q = b0 + b1x + ⋯ + bnxⁿ,
  〈p, q〉 = a0b0 + a1b1 + ⋯ + anbn is an inner product
Let p(x) = 1 − 2x² and q(x) = 4 − 2x + x² be polynomials in P2
(a) 〈p, q〉 = ?  (b) ||q|| = ?  (c) d(p, q) = ?
Sol:
(a) 〈p, q〉 = (1)(4) + (0)(−2) + (−2)(1) = 2
(b) ||q|| = √〈q, q〉 = √(4² + (−2)² + 1²) = √21
(c) p − q = −3 + 2x − 3x²
  d(p, q) = ||p − q|| = √〈p − q, p − q〉 = √((−3)² + 2² + (−3)²) = √22
5.35
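※ A small numpy sketch of Ex 6, representing each polynomial in P2 by its coefficient vector, so that 〈p, q〉 becomes an ordinary dot product:

```python
import numpy as np

p = np.array([1.0, 0.0, -2.0])    # p(x) = 1 - 2x^2 as (a0, a1, a2)
q = np.array([4.0, -2.0, 1.0])    # q(x) = 4 - 2x + x^2 as (b0, b1, b2)

inner = p @ q                          # <p, q> = 2
norm_q = np.sqrt(q @ q)                # ||q|| = sqrt(21)
dist = np.sqrt((p - q) @ (p - q))      # d(p, q) = sqrt(22)
print(inner, norm_q, dist)
```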
Properties of norm: (the same as the properties of length for the dot product in Rⁿ on Slide 5.2)
(1) ||u|| ≥ 0
(2) ||u|| = 0 if and only if u = 0
(3) ||cu|| = |c| ||u||
5.36
Theorem 5.8:
Let u and v be vectors in an inner product space V
(1) Cauchy-Schwarz inequality: |〈u, v〉| ≤ ||u|| ||v|| (cf. Theorem 5.4)
(2) Triangle inequality: ||u + v|| ≤ ||u|| + ||v|| (cf. Theorem 5.5)
(3) Pythagorean theorem: u and v are orthogonal if and only if ||u + v||² = ||u||² + ||v||² (cf. Theorem 5.6)
5.37
Orthogonal projections (正交投影): For the dot product in Rⁿ, we define the orthogonal projection of u onto v to be proj_v u = av (a scalar multiple of v), and the coefficient a can be derived as follows
Consider a > 0: ||av|| = a||v||, and geometrically ||av|| = ||u|| cos θ = ||u|| ||v|| cos θ / ||v|| = (u · v)/||v||
⟹ a = (u · v)/||v||² = (u · v)/(v · v)
⟹ proj_v u = av = ((u · v)/(v · v)) v
For inner product spaces:
Let u and v be two vectors in an inner product space V. If v ≠ 0, then the orthogonal projection of u onto v is given by
  proj_v u = (〈u, v〉/〈v, v〉) v
5.38
Ex 10: Finding an orthogonal projection in R³
Use the Euclidean inner product in R³ to find the orthogonal projection of u = (6, 2, 4) onto v = (1, 2, 0)
Sol:
〈u, v〉 = u · v = (6)(1) + (2)(2) + (4)(0) = 10
〈v, v〉 = v · v = 1² + 2² + 0² = 5
⟹ proj_v u = (〈u, v〉/〈v, v〉) v = (10/5)(1, 2, 0) = (2, 4, 0)
5.39
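※ A minimal numpy sketch of the projection formula, reproducing Ex 10 and checking that the residual u − proj_v u is orthogonal to v:

```python
import numpy as np

def proj(u, v):
    # orthogonal projection of u onto a nonzero v: ((u . v)/(v . v)) v
    return (u @ v) / (v @ v) * v

u = np.array([6.0, 2.0, 4.0])
v = np.array([1.0, 2.0, 0.0])
p = proj(u, v)
print(p)               # [2. 4. 0.], as in Ex 10
print((u - p) @ v)     # 0.0: the residual is orthogonal to v
```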
Theorem 5.9: Orthogonal projection and distance
Let u and v be two vectors in an inner product space V, with v ≠ 0. Then
  d(u, proj_v u) < d(u, cv) for all c ≠ 〈u, v〉/〈v, v〉
(i.e., among all scalar multiples cv of v, the orthogonal projection proj_v u is the vector closest to u)
5.41
5.3 Orthonormal Bases: Gram-Schmidt Process
Orthogonal set (正交集合):
A set S of vectors in an inner product space V is called an orthogonal set if every pair of vectors in the set is orthogonal:
  S = {v1, v2, …, vn} ⊆ V with 〈vi, vj〉 = 0 for i ≠ j
Orthonormal set:
An orthogonal set in which every vector is a unit vector; an orthonormal set that is also a basis is called an orthonormal basis
Ex: S = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} (the standard basis) is an orthonormal basis for R³
5.43
Ex 1: A nonstandard orthonormal basis for R³
Show that the following set is an orthonormal basis
  S = {v1, v2, v3} = {(1/√2, 1/√2, 0), (−√2/6, √2/6, 2√2/3), (2/3, −2/3, 1/3)}
Sol:
First, show that the three vectors are mutually orthogonal
  v1 · v2 = −1/6 + 1/6 + 0 = 0
  v1 · v3 = 2/(3√2) − 2/(3√2) + 0 = 0
  v2 · v3 = −√2/9 − √2/9 + 2√2/9 = 0
Second, show that each vector is of length 1
  ||v1|| = √(v1 · v1) = √(1/2 + 1/2 + 0) = 1
  ||v2|| = √(v2 · v2) = √(2/36 + 2/36 + 8/9) = 1
  ||v3|| = √(v3 · v3) = √(4/9 + 4/9 + 1/9) = 1
Thus S is an orthonormal set; since its three (nonzero, mutually orthogonal) vectors are linearly independent, S is an orthonormal basis for R³
Ex: In P2 with the inner product 〈p, q〉 = a0b0 + a1b1 + a2b2, the standard basis {1, x, x²} is orthonormal; in particular,
  ||v1|| = √〈v1, v1〉 = √((1)(1) + (0)(0) + (0)(0)) = 1
  ||v2|| = √〈v2, v2〉 = √((0)(0) + (1)(1) + (0)(0)) = 1
  ||v3|| = √〈v3, v3〉 = √((0)(0) + (0)(0) + (1)(1)) = 1
5.46
Theorem 5.10: Orthogonal sets are linearly independent
If S = {v1, v2, …, vn} is an orthogonal set of nonzero vectors in an inner product space V, then S is linearly independent
Pf:
S is an orthogonal set of nonzero vectors, i.e., 〈vi, vj〉 = 0 for i ≠ j and 〈vi, vi〉 ≠ 0
Consider c1v1 + c2v2 + ⋯ + cnvn = 0 (if the only solution is the trivial one, i.e., all ci = 0, then S is linearly independent)
⟹ 〈c1v1 + c2v2 + ⋯ + cnvn, vi〉 = 〈0, vi〉 = 0 ∀i
  c1〈v1, vi〉 + ⋯ + ci〈vi, vi〉 + ⋯ + cn〈vn, vi〉 = ci〈vi, vi〉 = 0 (because S is an orthogonal set of nonzero vectors)
〈vi, vi〉 ≠ 0 ⟹ ci = 0 ∀i ⟹ S is linearly independent
5.47
Corollary to Theorem 5.10:
If V is an inner product space with dimension n, then any orthogonal set of n nonzero vectors is a basis for V
1. By Theorem 5.10, if S = {v1, v2, …, vn} is an orthogonal set of n nonzero vectors, then S is linearly independent
2. According to Theorem 4.12, if S = {v1, v2, …, vn} is a linearly independent set of n vectors in V (with dimension n), then S is a basis for V
※ The corollary follows directly from these two arguments
5.48
Ex 4: Using orthogonality to test for a basis
Show that the following set is a basis for R⁴
  S = {v1, v2, v3, v4} = {(2, 3, 2, −2), (1, 0, 0, 1), (−1, 0, 2, 1), (−1, 2, −1, 1)}
Sol:
v1, v2, v3, v4 are nonzero vectors, and
  v1 · v2 = 2 + 0 + 0 − 2 = 0    v2 · v3 = −1 + 0 + 0 + 1 = 0
  v1 · v3 = −2 + 0 + 4 − 2 = 0   v2 · v4 = −1 + 0 + 0 + 1 = 0
  v1 · v4 = −2 + 6 − 2 − 2 = 0   v3 · v4 = 1 + 0 − 2 + 1 = 0
⟹ S is orthogonal
⟹ S is a basis for R⁴ (by the corollary to Theorem 5.10)
※ The corollary to Thm. 5.10 shows an advantage of introducing the concept of orthogonal vectors: if S is a set of orthogonal nonzero vectors, it is not necessary to solve linear systems to test whether S is a basis (cf. Ex 1 on Slide 5.44)
5.49
Theorem 5.11: Coordinates relative to an orthonormal basis
If B = {v1, v2, …, vn} is an orthonormal basis for an inner product space V, then the unique coordinate representation of a vector w with respect to B is
  w = 〈w, v1〉v1 + 〈w, v2〉v2 + ⋯ + 〈w, vn〉vn
※ The above theorem tells us that it is easy to derive the coordinate representation of a vector relative to an orthonormal basis, which is another advantage of using orthonormal bases
Pf:
B = {v1, v2, …, vn} is an orthonormal basis for V
w = k1v1 + k2v2 + ⋯ + knvn ∈ V (unique representation from Thm. 4.9)
Since 〈vi, vj〉 = 1 for i = j and 0 for i ≠ j, we have
5.50
〈w, vi〉 = 〈k1v1 + k2v2 + ⋯ + knvn, vi〉
  = k1〈v1, vi〉 + ⋯ + ki〈vi, vi〉 + ⋯ + kn〈vn, vi〉
  = ki for i = 1 to n
⟹ w = 〈w, v1〉v1 + 〈w, v2〉v2 + ⋯ + 〈w, vn〉vn
Note:
If B = {v1, v2, …, vn} is an orthonormal basis for V and w ∈ V, then the corresponding coordinate matrix of w relative to B is
  [w]B = [〈w, v1〉; 〈w, v2〉; …; 〈w, vn〉]
5.51
Ex:
For w = (5, −5, 2), find its coordinates relative to the standard basis for R³
  〈w, v1〉 = w · v1 = (5, −5, 2) · (1, 0, 0) = 5
  〈w, v2〉 = w · v2 = (5, −5, 2) · (0, 1, 0) = −5
  〈w, v3〉 = w · v3 = (5, −5, 2) · (0, 0, 1) = 2
⟹ [w]B = [5; −5; 2]
※ In fact, it is not necessary to use Thm. 5.11 to find the coordinates relative to the standard basis, because the coordinates of a vector relative to the standard basis are the same as the components of that vector
※ The advantage of the orthonormal basis emerges when we find the coordinate matrix of a vector relative to a nonstandard orthonormal basis (see the next slide)
5.52
Ex 5: Representing vectors relative to an orthonormal basis
Find the coordinates of w = (5, −5, 2) relative to the following orthonormal basis for R³:
  B = {v1, v2, v3} = {(3/5, 4/5, 0), (−4/5, 3/5, 0), (0, 0, 1)}
Sol:
  〈w, v1〉 = w · v1 = (5, −5, 2) · (3/5, 4/5, 0) = −1
  〈w, v2〉 = w · v2 = (5, −5, 2) · (−4/5, 3/5, 0) = −7
  〈w, v3〉 = w · v3 = (5, −5, 2) · (0, 0, 1) = 2
⟹ [w]B = [−1; −7; 2]
5.53
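※ A small numpy sketch of Theorem 5.11 applied to Ex 5: stacking the orthonormal basis vectors as the rows of a matrix B makes the coordinate vector simply Bw, and w is recovered as the sum of 〈w, vi〉vi:

```python
import numpy as np

w = np.array([5.0, -5.0, 2.0])
B = np.array([[ 3/5, 4/5, 0.0],   # v1, v2, v3 of Ex 5 as rows
              [-4/5, 3/5, 0.0],
              [ 0.0, 0.0, 1.0]])

coords = B @ w        # i-th entry is <w, v_i>
print(coords)         # [-1. -7.  2.]
print(coords @ B)     # sum_i <w, v_i> v_i reconstructs w = (5, -5, 2)
```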
The geometric intuition of the Gram-Schmidt process for finding an orthonormal basis in R²:
Given a basis {v1, v2} for R², let w1 = v1 and w2 = v2 − proj_w1 v2
⟹ w2 = v2 − proj_w1 v2 is orthogonal to w1 = v1
⟹ {w1/||w1||, w2/||w2||} is an orthonormal basis for R²
5.54
Gram-Schmidt orthonormalization process:
B = {v1, v2, …, vn} is a basis for an inner product space V
Let w1 = v1, S1 = span({w1})
  w2 = v2 − proj_S1 v2 = v2 − (〈v2, w1〉/〈w1, w1〉) w1, S2 = span({w1, w2})
  w3 = v3 − proj_S2 v3 = v3 − (〈v3, w1〉/〈w1, w1〉) w1 − (〈v3, w2〉/〈w2, w2〉) w2
  ⋮
  wn = vn − proj_S(n−1) vn = vn − Σ(i=1 to n−1) (〈vn, wi〉/〈wi, wi〉) wi
(The orthogonal projection onto a subspace is actually the sum of the orthogonal projections onto the vectors of an orthogonal basis for that subspace; this will be proved on Slides 5.67 and 5.68)
⟹ B′ = {w1, w2, …, wn} is an orthogonal basis
⟹ B″ = {w1/||w1||, w2/||w2||, …, wn/||wn||} is an orthonormal basis
5.55
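※ A minimal Python sketch of the process above for the Euclidean inner product (classical Gram-Schmidt), using the basis of Ex 7 on the next slide as input:

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt for the Euclidean inner product.
    vectors: a list of linearly independent 1-D arrays (a basis).
    Returns (orthogonal basis B', orthonormal basis B'')."""
    ws = []
    for v in vectors:
        w = v.astype(float)
        for wi in ws:                        # subtract the projection of v onto each earlier w_i
            w = w - (v @ wi) / (wi @ wi) * wi
        ws.append(w)
    us = [w / np.linalg.norm(w) for w in ws]
    return ws, us

B = [np.array([1.0, 1.0, 0.0]),
     np.array([1.0, 2.0, 0.0]),
     np.array([0.0, 1.0, 2.0])]
ws, us = gram_schmidt(B)
print(ws)  # [(1,1,0), (-0.5,0.5,0), (0,0,2)]
print(us)  # the orthonormal basis B''
```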
Ex 7: Applying the Gram-Schmidt orthonormalization process
Apply the Gram-Schmidt process to the following basis for R³:
  B = {v1, v2, v3} = {(1, 1, 0), (1, 2, 0), (0, 1, 2)}
Sol:
w1 = v1 = (1, 1, 0)
w2 = v2 − ((v2 · w1)/(w1 · w1)) w1 = (1, 2, 0) − (3/2)(1, 1, 0) = (−1/2, 1/2, 0)
w3 = v3 − ((v3 · w1)/(w1 · w1)) w1 − ((v3 · w2)/(w2 · w2)) w2
  = (0, 1, 2) − (1/2)(1, 1, 0) − ((1/2)/(1/2))(−1/2, 1/2, 0) = (0, 0, 2)
5.56
Orthogonal basis:
  B′ = {w1, w2, w3} = {(1, 1, 0), (−1/2, 1/2, 0), (0, 0, 2)}
Orthonormal basis:
  B″ = {w1/||w1||, w2/||w2||, w3/||w3||} = {(1/√2, 1/√2, 0), (−1/√2, 1/√2, 0), (0, 0, 1)}
5.57
Ex 10: Alternative form of the Gram-Schmidt orthonormalization process
Find an orthonormal basis for the solution space of the homogeneous system of linear equations
  x1 + x2 + 7x4 = 0
  2x1 + x2 + 2x3 + 6x4 = 0
Sol:
  [1 1 0 7 | 0; 2 1 2 6 | 0] →(G.-J.E.) [1 0 2 −1 | 0; 0 1 −2 8 | 0]
⟹ [x1; x2; x3; x4] = [−2s + t; 2s − 8t; s; t] = s[−2; 2; 1; 0] + t[1; −8; 0; 1]
5.58
Thus one basis for the solution space is
  B = {v1, v2} = {(−2, 2, 1, 0), (1, −8, 0, 1)}
w1 = v1 and u1 = w1/||w1|| = (1/3)(−2, 2, 1, 0) = (−2/3, 2/3, 1/3, 0)
w2 = v2 − 〈v2, u1〉u1 (due to w2 = v2 − (〈v2, u1〉/〈u1, u1〉)u1 and 〈u1, u1〉 = 1)
  = (1, −8, 0, 1) − [(1, −8, 0, 1) · (−2/3, 2/3, 1/3, 0)](−2/3, 2/3, 1/3, 0)
  = (1, −8, 0, 1) − (−6)(−2/3, 2/3, 1/3, 0) = (−3, −4, 2, 1)
u2 = w2/||w2|| = (1/√30)(−3, −4, 2, 1)
⟹ B″ = {(−2/3, 2/3, 1/3, 0), (−3/√30, −4/√30, 2/√30, 1/√30)}
※ In this alternative form, we always normalize wi to ui before processing vi+1
※ The advantage of this method is that it is easier to calculate the orthogonal projection of vi+1 onto u1, u2, …, ui
5.59
Alternative form of the Gram-Schmidt orthonormalization process:
B = {v1, v2, …, vn} is a basis for an inner product space V
  u1 = w1/||w1||, where w1 = v1
  u2 = w2/||w2||, where w2 = v2 − 〈v2, u1〉u1
  u3 = w3/||w3||, where w3 = v3 − 〈v3, u1〉u1 − 〈v3, u2〉u2
  ⋮
  un = wn/||wn||, where wn = vn − Σ(i=1 to n−1) 〈vn, ui〉ui
5.61
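※ A minimal Python sketch of this alternative form, normalizing each wi to ui before processing vi+1 (the solution-space basis of Ex 10 above is reused as input):

```python
import numpy as np

def gram_schmidt_alt(vectors):
    """Alternative form: each w_i is normalized to u_i immediately,
    so every projection coefficient is just <v, u_i> (since <u_i, u_i> = 1)."""
    us = []
    for v in vectors:
        w = v.astype(float)
        for ui in us:
            w = w - (v @ ui) * ui     # subtract <v, u_i> u_i
        us.append(w / np.linalg.norm(w))
    return us

B = [np.array([-2.0, 2.0, 1.0, 0.0]),
     np.array([ 1.0, -8.0, 0.0, 1.0])]
u1, u2 = gram_schmidt_alt(B)
print(u1)    # (-2/3, 2/3, 1/3, 0)
print(u2)    # (-3, -4, 2, 1)/sqrt(30)
```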
5.4 Mathematical Models and Least Squares Analysis
Orthogonal complement (正交補集) of S:
Let S be a subspace of an inner product space V
(a) A vector v in V is said to be orthogonal to S if v is orthogonal to every vector in S, i.e., 〈v, w〉 = 0 ∀w ∈ S
(b) The set of all vectors in V that are orthogonal to S is called the orthogonal complement of S:
  S⊥ = {v ∈ V | 〈v, w〉 = 0 ∀w ∈ S} (S⊥ is read "S perp")
Notes:
(1) {0}⊥ = V  (2) V⊥ = {0}
(This is because 〈0, v〉 = 0 for any vector v in V)
5.62
Notes:
Given S a subspace of V:
(1) S⊥ is a subspace of V
(2) S ∩ S⊥ = {0}
(3) (S⊥)⊥ = S
Ex: If V = R² and S = the x-axis, then
(1) S⊥ = the y-axis, which is a subspace of R²
(2) S ∩ S⊥ = {(0, 0)}
(3) (S⊥)⊥ = S
※ Any vector in the x-axis (or y-axis) can be represented as (t, 0) (or (0, s)). It is straightforward to see that any vector in the y-axis is orthogonal to every vector in the x-axis since (0, s) · (t, 0) = 0
※ On Slide 4.32, any subset consisting of all points on a line passing through the origin is a subspace of R². Therefore, both the x- and y-axes are subspaces of R²
5.64
Theorem 5.13: Properties of orthogonal subspaces
Let S be a subspace of V (with dim(V) = n). Then the following properties are true
(1) dim(S) + dim(S⊥) = n
(2) V = S ⊕ S⊥ (the direct sum (直和): every v ∈ V can be written uniquely as v = v1 + v2 with v1 ∈ S and v2 ∈ S⊥)
(3) (S⊥)⊥ = S
5.66
Theorem 5.14: Projection onto a subspace
(Here we formally show that the orthogonal projection onto a subspace is the sum of the orthogonal projections onto the vectors of an orthonormal basis for that subspace)
If {u1, u2, …, ut} is an orthonormal basis for the subspace S of V, and v ∈ V, then
  proj_S v = 〈v, u1〉u1 + 〈v, u2〉u2 + ⋯ + 〈v, ut〉ut
Pf:
Let v ∈ V and write v = v1 + v2 with v1 ∈ S and v2 ∈ S⊥ (since V is the direct sum of S and S⊥ according to the results on Slide 5.66, v1 is the orthogonal projection of v onto S and v2 is orthogonal to every vector in S)
Since {u1, …, ut} is an orthonormal basis for S, Theorem 5.11 gives v1 = 〈v1, u1〉u1 + ⋯ + 〈v1, ut〉ut, and 〈v1, ui〉 = 〈v − v2, ui〉 = 〈v, ui〉 − 0 = 〈v, ui〉
⟹ proj_S v = v1 = 〈v, u1〉u1 + 〈v, u2〉u2 + ⋯ + 〈v, ut〉ut
Ex: For W = span{(0, 3, 1), (2, 0, 0)} with orthonormal basis {u1, u2} = {(0, 3/√10, 1/√10), (1, 0, 0)} and v = (1, 1, 3),
  proj_W v = 〈v, u1〉u1 + 〈v, u2〉u2 = (6/√10)(0, 3/√10, 1/√10) + (1)(1, 0, 0) = (1, 9/5, 3/5)
5.69
(Figures: on the left, v, a scalar multiple cu, and proj_u v, with v − proj_u v ⊥ u; on the right, v, a vector u ∈ S, and proj_S v, with v − proj_S v ⊥ S)
※ Theorem 5.9 tells us that among all the scalar multiples of a vector u, i.e., cu, the orthogonal projection of v onto u is the one closest to v (see the left figure)
※ This property is also true for projections onto subspaces. That is, among all the vectors in the subspace S, the vector proj_S v is the vector closest to v (see the right figure and Theorem 5.15 on the next slide)
5.70
Theorem 5.15: Orthogonal projection and distance
Let S be a subspace of an inner product space V, and v ∈ V. Then for all u ∈ S with u ≠ proj_S v,
  ||v − proj_S v|| < ||v − u||
(or ||v − proj_S v|| = min over u ∈ S of ||v − u||: among all the vectors in the subspace S, the vector proj_S v is closest to v)
Pf:
v − u = (v − proj_S v) + (proj_S v − u)
proj_S v − u ∈ S and v − proj_S v ∈ S⊥ ⟹ (v − proj_S v) ⊥ (proj_S v − u)
⟹ 〈v − proj_S v, proj_S v − u〉 = 0
Thus the Pythagorean theorem (Theorem 5.6) can be applied:
  ||v − u||² = ||v − proj_S v||² + ||proj_S v − u||²
Since u ≠ proj_S v, the second term on the right-hand side is positive, and we have ||v − proj_S v|| < ||v − u||
5.71
Theorem 5.16: Fundamental subspaces (基本子空間) of a matrix, i.e., CS(A), CS(Aᵀ), NS(A), and NS(Aᵀ)
If A is an m×n matrix, then
(1) (CS(A))⊥ = NS(Aᵀ) (or expressed as CS(A) = (NS(Aᵀ))⊥)
(2) (CS(Aᵀ))⊥ = NS(A) (proved by setting B = Aᵀ, so that B satisfies the first property)
Pf of (1): Consider any v ∈ CS(A) and any u ∈ NS(Aᵀ); the goal is to prove v · u = 0
u ∈ NS(Aᵀ) ⟹ Aᵀu = [(A^(1))ᵀu; …; (A^(n))ᵀu] = [0; …; 0] = 0, where A^(i) denotes the i-th column of A
v ∈ CS(A) ⟹ v = c1A^(1) + ⋯ + cnA^(n)
⟹ v · u = (c1A^(1) + ⋯ + cnA^(n)) · u = c1(A^(1) · u) + ⋯ + cn(A^(n) · u) = 0 + ⋯ + 0 = 0
Check (for CS(A) = span{(1, 0, 0, 0), (0, 1, 0, 0)} and NS(Aᵀ) = span{(0, 0, 1, 0), (0, 0, 0, 1)} in R⁴):
  CS(A) ⊥ NS(Aᵀ): (a(1, 0, 0, 0) + b(0, 1, 0, 0)) · (c(0, 0, 1, 0) + d(0, 0, 0, 1)) = 0
  CS(A) ⊕ NS(Aᵀ) = R⁴: v = v1 + v2 = (a(1, 0, 0, 0) + b(0, 1, 0, 0)) + (c(0, 0, 1, 0) + d(0, 0, 0, 1))
※ The term least squares comes from the fact that minimizing ||Ax − b|| is equivalent to minimizing ||Ax − b||² = (Ax − b) · (Ax − b), which is a sum of squares
5.78
For A ∈ M_m×n, Thm. 4.19 on Slides 4.97 and 4.98 shows that Ax can be expressed as x1A^(1) + x2A^(2) + ⋯ + xnA^(n), so Ax ∈ CS(A) ≡ W
To find the best solution x̂, Ax̂ should be the vector in W closest to b, i.e., Ax̂ = proj_W b
⟹ b − Ax̂ = b − proj_W b ∈ W⊥ = (CS(A))⊥ = NS(Aᵀ)
⟹ Aᵀ(b − Ax̂) = 0, i.e., AᵀAx̂ = Aᵀb (the normal equations)
⟹ x̂ = (AᵀA)⁻¹Aᵀb and proj_W b = Ax̂ = A(AᵀA)⁻¹Aᵀb
5.81
Ex 7: Solving the normal equations
Find the least squares solution of the following system Ax = b:
  A = [1 1; 1 2; 1 3], x = [c0; c1], b = [0; 1; 3]
5.82
Sol:
AᵀA = [1 1 1; 1 2 3][1 1; 1 2; 1 3] = [3 6; 6 14]
Aᵀb = [1 1 1; 1 2 3][0; 1; 3] = [4; 11]
5.83
Solving the normal equations AᵀAx̂ = Aᵀb, i.e., [3 6; 6 14][ĉ0; ĉ1] = [4; 11], the least squares solution of Ax = b is
  x̂ = [ĉ0; ĉ1] = [−5/3; 3/2]
proj_CS(A) b = Ax̂ = [1 1; 1 2; 1 3][−5/3; 3/2] = [−1/6; 4/3; 17/6]
5.84
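※ A small numpy sketch of Ex 7, solving the normal equations directly and cross-checking with numpy's built-in least squares solver (numpy is assumed purely for illustration):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.0, 1.0, 3.0])

# Normal equations A^T A x = A^T b (valid since A has full column rank)
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)          # [-1.6667  1.5], i.e., (-5/3, 3/2)
print(A @ x_hat)      # projection of b onto CS(A): [-1/6, 4/3, 17/6]

# Cross-check with numpy's built-in least squares solver
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_lstsq)
```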
※ The above problem is equivalent to finding a line y = c0 + c1x that "best fits" the three points (1, 0), (2, 1), and (3, 3) on the xy-plane
※ This analysis is also called least squares regression analysis, in which we find the "best fit" linear relationship between x and y and can then estimate y for different values of x
※ The matrix representation of the three equations corresponding to (1, 0), (2, 1), and (3, 3) is as follows:
  0 = c0 + c1(1), 1 = c0 + c1(2), 3 = c0 + c1(3), i.e., [0; 1; 3] = [1 1; 1 2; 1 3][c0; c1], or Y = XC
※ Since the above system of linear equations is inconsistent, only the "best possible" solution of the system can be sought, i.e., a solution C that minimizes the difference between XC and Y
※ According to the theorem on Slide 5.81, the solution should be Ĉ = (XᵀX)⁻¹XᵀY (with Y ≡ b, X ≡ A, and C ≡ x), which is exactly the formula of the least squares regression in Section 2.5
5.85
※ The results imply that the least squares regression line for (1, 0), (2, 1), and (3, 3) is y = −(5/3) + (3/2)x
※ Since the "best fit" rather than the "exact" relationship is considered, an error term e should be introduced to obtain the exact equations corresponding to (1, 0), (2, 1), and (3, 3):
  0 = ĉ0 + ĉ1(1) + e1, 1 = ĉ0 + ĉ1(2) + e2, 3 = ĉ0 + ĉ1(3) + e3
i.e., [0; 1; 3] = [1 1; 1 2; 1 3][−5/3; 3/2] + [1/6; −1/3; 1/6], or Y = XĈ + E
※ The minimized sum of squared errors is
  EᵀE = [1/6 −1/3 1/6][1/6; −1/3; 1/6] = 1/6
※ Note that the square root of EᵀE is the distance between Y and XĈ, i.e., d(Y, XĈ) = ||Y − XĈ|| = √(1/6)
5.87
By substituting the data points (0, 4.5), (5, 4.8), (10, 5.3), (15, 5.7), (20, 6.1), and (25, 6.5) into the quadratic polynomial y = c0 + c1x + c2x², we can produce the least squares problem Ax = b:
  A = [1 0 0; 1 5 25; 1 10 100; 1 15 225; 1 20 400; 1 25 625], x = [c0; c1; c2], b = [4.5; 4.8; 5.3; 5.7; 6.1; 6.5]
The normal equations are AᵀAx̂ = Aᵀb:
  [6 75 1375; 75 1375 28125; 1375 28125 611875][ĉ0; ĉ1; ĉ2] = [32.9; 447; 8435]
5.88
The solution is x̂ = [ĉ0; ĉ1; ĉ2] ≈ [4.468; 0.0799; 0.0000714], i.e., the least squares quadratic is y ≈ 4.468 + 0.0799x + 0.0000714x²
5.89
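※ A small numpy sketch of this quadratic fit; solving the normal equations should reproduce the solution above:

```python
import numpy as np

x = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0])
y = np.array([4.5, 4.8, 5.3, 5.7, 6.1, 6.5])

A = np.column_stack([np.ones_like(x), x, x**2])   # columns: 1, x, x^2
c_hat = np.linalg.solve(A.T @ A, A.T @ y)         # solve the normal equations
print(c_hat)                                      # approx [4.468, 0.0799, 0.0000714]

sse = np.sum((A @ c_hat - y) ** 2)                # minimized sum of squared errors
print(sse)
```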
In the field of management, least squares regression is commonly used to find the linear relationship between the dependent variable (y) and explanatory variables (xi). For example (homework):
– Regress the return of a portfolio on the return of the market portfolio (i.e., estimate the portfolio's beta) for the prior two years, and verify that the expected return and beta of a portfolio are the weighted averages of the expected returns and betas of its component stocks
– Out-of-sample test: compare the average return and beta of this portfolio and the market index for the next two months and report what you observe
The total for this homework is 10 points: the basic requirement is 7 points, and the bonus is 3 points
5.92
Keywords in Section 5.4:
orthogonal to W: 正交於W
orthogonal complement: 正交補集
direct sum: 直和
projection onto a subspace: 在子空間的投影
fundamental subspaces: 基本子空間
least squares problem: 最小平方問題
normal equations: 一般方程式
5.93
5.5 Applications of Inner Product Spaces
Least squares approximations for a function
– An important application: use a polynomial function to approximate a given function
– (k can be solved via the antiderivative a0x + (1/2)a1x² + (1/3)a2x³ + (1/4)a3x⁴ + (1/5)a4x⁵ evaluated between the integration limits 0 and 1)
5.94
Calculate N(c) by Monte Carlo simulation
– Monte Carlo simulation: generate some random scenarios and estimate the quantity of interest by averaging over these scenarios
– Normsinv(): the inverse of the cumulative distribution function of the standard normal distribution
※ Normsinv(Rand()) can draw random samples from the standard normal distribution
(Figure: the cumulative distribution function N(x) of the standard normal distribution)
The function Rand() draws a random sample from [0, 1]. Then we use the inverse function N⁻¹(Rand()) to derive a simulated value of x from the standard normal distribution, where N(x) is the cumulative distribution function of the standard normal distribution.
5.96
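※ A small Python sketch of this sampling idea: scipy's norm.ppf plays the role of Excel's Normsinv() and numpy's uniform generator plays the role of Rand() (scipy and numpy are assumed here purely for illustration):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
u = rng.random(100_000)      # Rand(): uniform random samples on [0, 1]
z = norm.ppf(u)              # Normsinv(): inverse CDF of the standard normal

# The resulting samples should behave like standard normal draws
print(z.mean(), z.std())     # approximately 0 and 1
```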
Analyze a project with expanding and aborting options by Monte Carlo simulation
– Time 0: initial investment V0