
Chapter 5

Inner Product Spaces

5.1 Length and Dot Product in Rn
5.2 Inner Product Spaces
5.3 Orthonormal Bases: Gram-Schmidt Process
5.4 Mathematical Models and Least Squares Analysis
5.5 Applications of Inner Product Spaces

5.1
5.1 Length and Dot Product in Rn
Length (長度):
The length of a vector v = (v1, v2, …, vn) in Rn is given by
||v|| = √(v1² + v2² + … + vn²)   (||v|| is a real number)

Notes: The length of a vector is also called its norm (範數)

Properties of length (or norm)
(1) ||v|| ≥ 0
(2) ||v|| = 1 ⇒ v is called a unit vector (單位向量)
(3) ||v|| = 0 if and only if v = 0
(4) ||cv|| = |c| ||v|| (proved in Theorem 5.1)
5.2
 Ex 1:
(a) In R5, the length of v = (0, -2, 1, 4, -2) is given by
||v|| = √(0² + (-2)² + 1² + 4² + (-2)²) = √25 = 5

(b) In R3, the length of v = (2/√17, 2/√17, 3/√17) is given by
||v|| = √((2/√17)² + (2/√17)² + (3/√17)²) = √(17/17) = 1

(If the length of v is 1, then v is a unit vector)

5.3
A standard unit vector (標準單位向量) in Rn: only one
component of the vector is 1 and the others are 0 (thus the length
of this vector must be 1)

R2: {e1, e2} = {(1, 0), (0, 1)}
R3: {e1, e2, e3} = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
Rn: {e1, e2, …, en} = {(1, 0, …, 0), (0, 1, …, 0), …, (0, 0, …, 1)}

Notes: Two nonzero vectors are parallel if u = cv
(1) c > 0 ⇒ u and v have the same direction
(2) c < 0 ⇒ u and v have opposite directions

5.4
 Theorem 5.1: Length of a scalar multiple
Let v be a vector in Rn and c be a scalar. Then
||cv|| = |c| ||v||
Pf:
v = (v1, v2, …, vn)
⇒ cv = (cv1, cv2, …, cvn)
||cv|| = ||(cv1, cv2, …, cvn)||
       = √((cv1)² + (cv2)² + … + (cvn)²)
       = √(c²(v1² + v2² + … + vn²))
       = |c| √(v1² + v2² + … + vn²)
       = |c| ||v||
5.5
 Theorem 5.2: How to find the unit vector in the direction of v
If v is a nonzero vector in Rn, then the vector u = v / ||v||
has length 1 and has the same direction as v. This vector u
is called the unit vector in the direction of v
Pf:
v is nonzero ⇒ v ≠ 0 ⇒ ||v|| ≠ 0 ⇒ 1/||v|| > 0
Let u = (1/||v||) v. Since 1/||v|| > 0, u has the same direction as v
||u|| = || (1/||v||) v || = (1/||v||) ||v|| = 1 (so u has length 1), using ||cv|| = |c| ||v||
5.6
 Notes:
(1) The vector v / ||v|| is called the unit vector in the direction of v
(2) The process of finding the unit vector in the direction of v
is called normalizing the vector v

5.7
 Ex 2: Finding a unit vector
Find the unit vector in the direction of v = (3, –1, 2), and verify
that this vector has length 1
Sol:

v = (3, -1, 2)  ⇒  ||v|| = √(3² + (-1)² + 2²) = √14

v / ||v|| = (3, -1, 2) / √(3² + (-1)² + 2²) = (1/√14)(3, -1, 2)
          = (3/√14, -1/√14, 2/√14)

√((3/√14)² + (-1/√14)² + (2/√14)²) = √(14/14) = 1

⇒ v / ||v|| is a unit vector
5.8
 Distance between two vectors:
The distance between two vectors u and v in Rn is
d(u, v) = ||u - v||

Properties of distance
(1) d(u, v) ≥ 0
(2) d(u, v) = 0 if and only if u = v
(3) d(u, v) = d(v, u) (commutative property of the distance function) 5.9
 Ex 3: Finding the distance between two vectors
The distance between u = (0, 2, 2) and v = (2, 0, 1) is

d(u, v) = ||u - v|| = ||(0 - 2, 2 - 0, 2 - 1)||
        = √((-2)² + 2² + 1²) = 3

5.10
 Dot product (點積) in Rn:
The dot product of u = (u1, u2, …, un) and v = (v1, v2, …, vn)
returns a scalar quantity
u · v = u1v1 + u2v2 + … + unvn   (u · v is a real number)
(The dot product is defined as the sum of component-by-component
multiplications)

Ex 4: Finding the dot product of two vectors
The dot product of u = (1, 2, 0, -3) and v = (3, -2, 4, 2) is
u · v = (1)(3) + (2)(-2) + (0)(4) + (-3)(2) = -7

Matrix Operations in Excel
SUMPRODUCT: calculate the dot product of two vectors
5.11
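※ The dot product and the length are easy to check numerically. A minimal
Python sketch, assuming NumPy is available (the vectors are those of Ex 1,
Ex 2, and Ex 4 above):

import numpy as np

u = np.array([1, 2, 0, -3])
v = np.array([3, -2, 4, 2])
# Dot product: the sum of component-by-component products,
# exactly what Excel's SUMPRODUCT computes
print(np.dot(u, v))                 # -7, as in Ex 4

# Length (norm): square root of the sum of squared components
w = np.array([0, -2, 1, 4, -2])
print(np.linalg.norm(w))            # 5.0, as in Ex 1(a)

# Normalizing: dividing by the length gives a unit vector (Theorem 5.2)
x = np.array([3, -1, 2])
print(np.linalg.norm(x / np.linalg.norm(x)))   # 1.0, as in Ex 2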
 Theorem 5.3: Properties of the dot product
If u, v, and w are vectors in Rn and c is a scalar,
then the following properties are true
(1) u · v = v · u (commutative property of the dot product)
(2) u · (v + w) = u · v + u · w (distributive property of the dot product
over vector addition)
(3) c(u · v) = (cu) · v = u · (cv) (associative property of the scalar
multiplication and the dot product)
(4) v · v = ||v||²
(5) v · v ≥ 0, and v · v = 0 if and only if v = 0

※ The proofs of the above properties follow simply from the definition
of the dot product in Rn

5.12
 Euclidean n-space:
– In Section 4.1, Rn was defined to be the set of all ordered
n-tuples of real numbers
– When Rn is combined with the standard operations of
vector addition, scalar multiplication, vector length,
and dot product, the resulting vector space is called
Euclidean n-space (歐幾里德 n維空間)

5.13
 Ex 5: Find dot products

u = (2, -2), v = (5, 8), w = (-4, 3)
(a) u · v  (b) (u · v)w  (c) u · (2v)  (d) ||w||²  (e) u · (v - 2w)
Sol:
(a) u · v = (2)(5) + (-2)(8) = -6
(b) (u · v)w = -6w = -6(-4, 3) = (24, -18)
(c) u · (2v) = 2(u · v) = 2(-6) = -12
(d) ||w||² = w · w = (-4)(-4) + (3)(3) = 25
(e) v - 2w = (5 - (-8), 8 - 6) = (13, 2)
    u · (v - 2w) = (2)(13) + (-2)(2) = 26 - 4 = 22

5.14
 Ex 6: Using the properties of the dot product
Given u · u = 39, u · v = -3, v · v = 79,
find (u + 2v) · (3u + v)

Sol:
(u + 2v) · (3u + v) = u · (3u + v) + 2v · (3u + v)
  = u · (3u) + u · v + (2v) · (3u) + (2v) · v
  = 3(u · u) + u · v + 6(v · u) + 2(v · v)
  = 3(u · u) + 7(u · v) + 2(v · v)
  = 3(39) + 7(-3) + 2(79) = 254

5.15
 Theorem 5.4: The Cauchy-Schwarz inequality (科西-舒瓦茲不等式)
If u and v are vectors in Rn, then
|u · v| ≤ ||u|| ||v||   (|u · v| denotes the absolute value of u · v)
(The geometric interpretation for this inequality is shown on the next slide)

Ex 7: An example of the Cauchy-Schwarz inequality
Verify the Cauchy-Schwarz inequality for u = (1, -1, 3)
and v = (2, 0, -1)
Sol:
u · v = -1, u · u = 11, v · v = 5
⇒ |u · v| = |-1| = 1
||u|| ||v|| = √(u · u) √(v · v) = √11 √5 = √55
⇒ |u · v| ≤ ||u|| ||v||
5.16
 Dot product and the angle between two vectors
To find the angle θ (0 ≤ θ ≤ π) between two nonzero vectors
u = (u1, u2) and v = (v1, v2) in R2, the Law of Cosines can be
applied to the following triangle to obtain
||v - u||² = ||v||² + ||u||² - 2||v|| ||u|| cos θ
(The length of the side opposite (對邊) θ can be expressed in
terms of the lengths of the adjacent sides (鄰邊) and cos θ)

||v - u||² = (u1 - v1)² + (u2 - v2)²
||v||² = v1² + v2²
||u||² = u1² + u2²

⇒ cos θ = (u1v1 + u2v2) / (||v|| ||u||) = (u · v) / (||v|| ||u||)
※ You can employ the fact that |cos θ| ≤ 1 to
prove the Cauchy-Schwarz inequality in R2
5.17
 The angle between two nonzero vectors in Rn:
uv
cos   , 0  
|| u || || v ||
Opposite Same
u v  0 u v  0 u v  0 direction
direction

   
  
     0    0
2 2 2
cos   1 cos   0 cos   0 cos   0 cos   1

 Note:
The angle between the zero vector and another vector is
not defined (since the denominator cannot be zero)
5.18
 Ex 8: Finding the angle between two vectors
u = (-4, 0, 2, -2),  v = (2, 0, -1, 1)
Sol:
||u|| = √(u · u) = √((-4)² + 0² + 2² + (-2)²) = √24
||v|| = √(v · v) = √(2² + 0² + (-1)² + 1²) = √6
u · v = (-4)(2) + (0)(0) + (2)(-1) + (-2)(1) = -12
⇒ cos θ = (u · v) / (||u|| ||v||) = -12 / (√24 √6) = -12/√144 = -1
⇒ θ = π ⇒ u and v have opposite directions
(In fact, u = -2v and according to the
arguments on Slide 5.4, u and v have
opposite directions) 5.19
 Orthogonal (正交) vectors:
Two vectors u and v in Rn are orthogonal (perpendicular) if
uv 0
 Note:
The vector 0 is said to be orthogonal to every vector

5.20
 Ex 10: Finding orthogonal vectors
Determine all vectors in R2 that are orthogonal to u = (4, 2)
Sol:
u = (4, 2). Let v = (v1, v2)
⇒ u · v = (4, 2) · (v1, v2) = 4v1 + 2v2 = 0
⇒ v1 = -t/2, v2 = t
⇒ v = (-t/2, t), t ∈ R

5.21
 Theorem 5.5: The triangle inequality (三角不等式)
If u and v are vectors in Rn, then ||u + v|| ≤ ||u|| + ||v||
Pf:
||u + v||² = (u + v) · (u + v)
  = u · (u + v) + v · (u + v) = u · u + 2(u · v) + v · v
  = ||u||² + 2(u · v) + ||v||² ≤ ||u||² + 2|u · v| + ||v||²   (c ≤ |c|)
  ≤ ||u||² + 2||u|| ||v|| + ||v||²   (Cauchy-Schwarz inequality)
  = (||u|| + ||v||)²
(The geometric representation of the triangle inequality ||u + v|| ≤ ||u|| + ||v||:
for any triangle, the sum of the lengths of any two sides is
larger than the length of the third side (see the next slide))
Note:
Equality occurs in the triangle inequality if and only if
the vectors u and v have the same direction (in this
situation, cos θ = 1 and thus u · v = ||u|| ||v|| ≥ 0) 5.22
 Theorem 5.6: The Pythagorean (畢氏定理) theorem
If u and v are vectors in Rn, then u and v are orthogonal
if and only if
||u + v||² = ||u||² + ||v||²   (This is because u · v = 0 in the
proof for Theorem 5.5)

※ The geometric meaning: for any right triangle, the sum of the squares of the
lengths of the two legs (兩股) equals the square of the length of the hypotenuse (斜邊):
||u + v||² = ||u||² + ||v||²   5.23


 Similarity between dot product and matrix multiplication:

A vector u = (u1, u2, …, un) in Rn can be represented as an n×1 column matrix:
u = [u1, u2, …, un]^T,  v = [v1, v2, …, vn]^T

u · v = u^T v = [u1 u2 … un][v1, v2, …, vn]^T = [u1v1 + u2v2 + … + unvn]

(The result of the dot product of u and v is the same as the result
of the matrix multiplication of u^T and v)

5.24
Keywords in Section 5.1:
 length: 長度
 norm: 範數
 unit vector: 單位向量
 standard unit vector: 標準單位向量
 distance: 距離
 dot product: 點積
 Euclidean n-space: 歐基里德n維空間
 Cauchy-Schwarz inequality: 科西-舒瓦茲不等式
 angle: 夾角
 triangle inequality: 三角不等式
 Pythagorean theorem: 畢氏定理

5.25
5.2 Inner Product Spaces
 Inner product (內積): represented by angle brackets〈u , v〉
Let u, v, and w be vectors in a vector space V, and let c be
any scalar. An inner product on V is a function that associates
a real number 〈u, v〉 with each pair of vectors u and v and
satisfies the following axioms (an abstract definition based on
the properties of the dot product in Theorem 5.3 on Slide 5.12)
(1) 〈u, v〉 = 〈v, u〉 (commutative property of the inner product)
(2) 〈u, v + w〉 = 〈u, v〉 + 〈u, w〉 (distributive property of the inner product
over vector addition)
(3) c〈u, v〉 = 〈cu, v〉 (associative property of the scalar multiplication and the
inner product)
(4) 〈v, v〉 ≥ 0, and 〈v, v〉 = 0 if and only if v = 0

5.26
 Note:
u · v = dot product (Euclidean inner product for Rn)
〈u, v〉 = general inner product for a vector space V

Note:
A vector space V with an inner product is called an inner
product space (內積空間)

Vector space: (V, +, ·)
Inner product space: (V, +, ·, 〈 , 〉)

5.27
 Ex 1: The Euclidean inner product for Rn
Show that the dot product in Rn satisfies the four axioms
of an inner product
Sol:
u = (u1, u2, …, un), v = (v1, v2, …, vn)
〈u, v〉 = u · v = u1v1 + u2v2 + … + unvn

By Theorem 5.3, this dot product satisfies the required four axioms.
Thus, the dot product is an inner product on Rn

5.28
 Ex 2: A different inner product for Rn
Show that the following function defines an inner product
on R2. Given u = (u1, u2) and v = (v1, v2),
〈u, v〉 = u1v1 + 2u2v2

Sol:
(1) 〈u, v〉 = u1v1 + 2u2v2 = v1u1 + 2v2u2 = 〈v, u〉
(2) Let w = (w1, w2). Then
〈u, v + w〉 = u1(v1 + w1) + 2u2(v2 + w2)
  = u1v1 + u1w1 + 2u2v2 + 2u2w2
  = (u1v1 + 2u2v2) + (u1w1 + 2u2w2)
  = 〈u, v〉 + 〈u, w〉

5.29
(3) c〈u, v〉 = c(u1v1 + 2u2v2) = (cu1)v1 + 2(cu2)v2 = 〈cu, v〉
(4) 〈v, v〉 = v1² + 2v2² ≥ 0
    〈v, v〉 = 0 ⇒ v1² + 2v2² = 0 ⇒ v1 = v2 = 0 (v = 0)

Note: Example 2 can be generalized such that
〈u, v〉 = c1u1v1 + c2u2v2 + … + cnunvn, ci > 0
can be an inner product on Rn

5.30
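※ The generalized weighted inner product above is straightforward to compute;
a small sketch assuming NumPy, where the weight vector c (with all ci > 0) is
chosen only for illustration:

import numpy as np

def weighted_inner(u, v, c):
    # <u, v> = c1*u1*v1 + c2*u2*v2 + ... + cn*un*vn, with every ci > 0
    u, v, c = map(np.asarray, (u, v, c))
    return float(np.sum(c * u * v))

# Ex 2 is the special case c = (1, 2) on R2
print(weighted_inner((1, 2), (3, -1), (1, 2)))   # 1*3 + 2*2*(-1) = -1
# Axiom 4: <v, v> >= 0, with equality only for v = 0
print(weighted_inner((3, -1), (3, -1), (1, 2)))  # 9 + 2*1 = 11 > 0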
 Ex 3: A function that is not an inner product
Show that the following function is not an inner product on R3
〈u, v〉 = u1v1 - 2u2v2 + u3v3
Sol:
Let v = (1, 2, 1)

Then 〈v, v〉 = (1)(1) - 2(2)(2) + (1)(1) = -6 < 0

Axiom 4 is not satisfied
Thus this function is not an inner product on R3

5.31
 Theorem 5.7: Properties of inner products
Let u, v, and w be vectors in an inner product space V, and
let c be any real number
(1) 〈0, v〉 = 〈v, 0〉 = 0
(2) 〈u + v, w〉 = 〈u, w〉 + 〈v, w〉
(3) 〈u, cv〉 = c〈u, v〉
※ To prove these properties, you can use only the basic properties of
vectors and the four axioms in the definition of an inner product (see
Slide 5.26)
Pf:
(1) 〈0, v〉 = 〈0u, v〉 = 0〈u, v〉 = 0   (by Axiom 3)
(2) 〈u + v, w〉 = 〈w, u + v〉 = 〈w, u〉 + 〈w, v〉 = 〈u, w〉 + 〈v, w〉   (by Axioms 1, 2, and 1)
(3) 〈u, cv〉 = 〈cv, u〉 = c〈v, u〉 = c〈u, v〉   (by Axioms 1 and 3)
5.32
※ The definitions of norm (or length), distance, angle, orthogonality, and
normalizing for general inner product spaces closely parallel
those based on the dot product in Euclidean n-space

Norm (length) of u:
||u|| = √〈u, u〉

Distance between u and v:
d(u, v) = ||u - v|| = √〈u - v, u - v〉

Angle between two nonzero vectors u and v:
cos θ = 〈u, v〉 / (||u|| ||v||),  0 ≤ θ ≤ π

Orthogonal (u ⊥ v):
u and v are orthogonal if 〈u, v〉 = 0 5.33
 Normalizing vectors
(1) If ||v|| = 1, then v is called a unit vector
(Note that ||v|| is defined as √〈v, v〉)

(2) If v ≠ 0, then v / ||v|| is the unit vector in the
direction of v (normalizing v is defined only when v is not a
zero vector)

5.34
 Ex 6: An inner product in the polynomial space
For p = a0 + a1x + … + anxⁿ and q = b0 + b1x + … + bnxⁿ,
〈p, q〉 = a0b0 + a1b1 + … + anbn is an inner product

Let p(x) = 1 - 2x², q(x) = 4 - 2x + x² be polynomials in P2
(a) 〈p, q〉 = ?  (b) ||q|| = ?  (c) d(p, q) = ?
Sol:
(a) 〈p, q〉 = (1)(4) + (0)(-2) + (-2)(1) = 2
(b) ||q|| = √〈q, q〉 = √(4² + (-2)² + 1²) = √21
(c) p - q = -3 + 2x - 3x²
    d(p, q) = ||p - q|| = √〈p - q, p - q〉
            = √((-3)² + 2² + (-3)²) = √22
5.35
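※ Because this inner product acts on coefficient vectors, the computations in
Ex 6 reduce to ordinary dot products of the coefficients; a sketch assuming NumPy:

import numpy as np

# Coefficient vectors (a0, a1, a2) of p(x) = 1 - 2x^2 and q(x) = 4 - 2x + x^2
p = np.array([1.0, 0.0, -2.0])
q = np.array([4.0, -2.0, 1.0])

print(np.dot(p, q))              # <p, q> = 2
print(np.sqrt(np.dot(q, q)))     # ||q|| = sqrt(21)
print(np.linalg.norm(p - q))     # d(p, q) = sqrt(22)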
 Properties of norm: (the same as the properties for the dot
product in Rn on Slide 5.2)
(1) ||u|| ≥ 0
(2) ||u|| = 0 if and only if u = 0
(3) ||cu|| = |c| ||u||

Properties of distance: (the same as the properties for the dot
product in Rn on Slide 5.9)
(1) d(u, v) ≥ 0
(2) d(u, v) = 0 if and only if u = v
(3) d(u, v) = d(v, u)

5.36
 Theorem 5.8:
Let u and v be vectors in an inner product space V
(1) Cauchy-Schwarz inequality:
|〈u, v〉| ≤ ||u|| ||v||   (cf. Theorem 5.4)
(2) Triangle inequality:
||u + v|| ≤ ||u|| + ||v||   (cf. Theorem 5.5)
(3) Pythagorean theorem:
u and v are orthogonal if and only if
||u + v||² = ||u||² + ||v||²   (cf. Theorem 5.6)

5.37
 Orthogonal projections (正交投影): For the dot product function
in Rn, we define the orthogonal projection of u onto v to be
projv u = av (a scalar multiple of v), and the coefficient a can be
derived as follows
Consider a > 0: ||av|| = a||v|| = ||u|| cos θ = ||u|| ||v|| cos θ / ||v|| = (u · v) / ||v||
⇒ a = (u · v) / ||v||² = (u · v) / (v · v)
⇒ projv u = ((u · v) / (v · v)) v

For inner product spaces:
Let u and v be two vectors in an inner product space V.
If v ≠ 0, then the orthogonal projection of u onto v is
given by
projv u = (〈u, v〉 / 〈v, v〉) v 5.38
 Ex 10: Finding an orthogonal projection in R3
Use the Euclidean inner product in R3 to find the
orthogonal projection of u = (6, 2, 4) onto v = (1, 2, 0)

Sol:
〈u, v〉 = (6)(1) + (2)(2) + (4)(0) = 10
〈v, v〉 = 1² + 2² + 0² = 5
⇒ projv u = (〈u, v〉 / 〈v, v〉) v = ((u · v) / (v · v)) v = (10/5)(1, 2, 0) = (2, 4, 0)

5.39
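※ The formula projv u = (<u, v>/<v, v>) v translates directly into code; a
sketch using the vectors of Ex 10, assuming NumPy:

import numpy as np

def proj_onto_vector(u, v):
    # Orthogonal projection of u onto a nonzero vector v (Euclidean inner product)
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return (np.dot(u, v) / np.dot(v, v)) * v

u = np.array([6.0, 2.0, 4.0])
v = np.array([1.0, 2.0, 0.0])
p = proj_onto_vector(u, v)
print(p)                   # [2. 4. 0.], as in Ex 10
print(np.dot(u - p, v))    # 0.0: the residual u - projv u is orthogonal to v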
 Theorem 5.9: Orthogonal projection and distance
Let u and v be two vectors in an inner product space V,
and if v ≠ 0, then
d(u, projv u) < d(u, cv),  for all c ≠ 〈u, v〉 / 〈v, v〉

(Figure: comparing d(u, projv u) with d(u, cv) for a scalar multiple cv of v)

※ Theorem 5.9 can be inferred straightforwardly from the Pythagorean Theorem,
i.e., in a right triangle, the hypotenuse (斜邊) is longer than both legs (兩股)
5.40
Keywords in Section 5.2:
 inner product: 內積
 inner product space: 內積空間
 norm: 範數
 distance: 距離
 angle: 夾角
 orthogonal: 正交
 unit vector: 單位向量
 normalizing: 單位化
 Cauchy-Schwarz inequality: 科西-舒瓦茲不等式
 triangle inequality: 三角不等式
 Pythagorean theorem: 畢氏定理
 orthogonal projection: 正交投影

5.41
5.3 Orthonormal Bases: Gram-Schmidt Process
 Orthogonal set (正交集合):
A set S of vectors in an inner product space V is called an
orthogonal set if every pair of vectors in the set is orthogonal

S = {v1, v2, …, vn} ⊆ V
〈vi, vj〉 = 0, for i ≠ j

Orthonormal set (單位正交集合):
An orthogonal set in which each vector is a unit vector is
called an orthonormal set
S = {v1, v2, …, vn} ⊆ V
For i = j, 〈vi, vj〉 = 〈vi, vi〉 = ||vi||² = 1
For i ≠ j, 〈vi, vj〉 = 0
5.42
 Note:
– If S is also a basis, then it is called an orthogonal basis (正
交基底) or an orthonormal basis (單位正交基底)
– The standard basis for Rn is orthonormal. For example,
S = {(1,0,0), (0,1,0), (0,0,1)}
is an orthonormal basis for R3

This section identifies some advantages of orthonormal bases,
and develops a procedure for constructing such bases, known
as the Gram-Schmidt orthonormalization process

5.43
 Ex 1: A nonstandard orthonormal basis for R3
Show that the following set is an orthonormal basis
       v1                 v2                    v3
S = {(1/√2, 1/√2, 0), (-√2/6, √2/6, 2√2/3), (2/3, -2/3, 1/3)}
Sol:
First, show that the three vectors are mutually orthogonal
v1 · v2 = -1/6 + 1/6 + 0 = 0
v1 · v3 = 2/(3√2) - 2/(3√2) + 0 = 0
v2 · v3 = -√2/9 - √2/9 + 2√2/9 = 0

5.44
Second, show that each vector is of length 1
||v1|| = √(v1 · v1) = √(1/2 + 1/2 + 0) = 1
||v2|| = √(v2 · v2) = √(2/36 + 2/36 + 8/9) = 1
||v3|| = √(v3 · v3) = √(4/9 + 4/9 + 1/9) = 1
Thus S is an orthonormal set

Because these three vectors are linearly independent (you can
check by solving c1v1 + c2v2 + c3v3 = 0) in R3 (of dimension 3), by
Theorem 4.12 (given a vector space with dimension n, then n
linearly independent vectors can form a basis for this vector space),
these three linearly independent vectors form a basis for R3.
⇒ S is a (nonstandard) orthonormal basis for R3
5.45
 Ex : An orthonormal basis for P2(x)
In P2(x), with the inner product 〈p, q〉 = a0b0 + a1b1 + a2b2,
the standard basis B = {1, x, x²} is orthonormal
Sol:
v1 = 1 + 0x + 0x², v2 = 0 + x + 0x², v3 = 0 + 0x + x²
Then
〈v1, v2〉 = (1)(0) + (0)(1) + (0)(0) = 0
〈v1, v3〉 = (1)(0) + (0)(0) + (0)(1) = 0
〈v2, v3〉 = (0)(0) + (1)(0) + (0)(1) = 0

||v1|| = √〈v1, v1〉 = √((1)(1) + (0)(0) + (0)(0)) = 1
||v2|| = √〈v2, v2〉 = √((0)(0) + (1)(1) + (0)(0)) = 1
||v3|| = √〈v3, v3〉 = √((0)(0) + (0)(0) + (1)(1)) = 1 5.46
 Theorem 5.10: Orthogonal sets are linearly independent
If S = {v1, v2, …, vn} is an orthogonal set of nonzero vectors
in an inner product space V, then S is linearly independent
Pf:
S is an orthogonal set of nonzero vectors,
i.e., 〈vi, vj〉 = 0 for i ≠ j, and 〈vi, vi〉 ≠ 0
Consider c1v1 + c2v2 + … + cnvn = 0   (if this equation has only the trivial
solution, i.e., all ci's are 0, then S is linearly independent)
⇒ 〈c1v1 + c2v2 + … + cnvn, vi〉 = 〈0, vi〉 = 0  ∀i
⇒ c1〈v1, vi〉 + c2〈v2, vi〉 + … + ci〈vi, vi〉 + … + cn〈vn, vi〉
  = ci〈vi, vi〉 = 0   (because S is an orthogonal set of nonzero vectors)
〈vi, vi〉 ≠ 0 ⇒ ci = 0  ∀i ⇒ S is linearly independent
5.47
 Corollary to Theorem 5.10:
If V is an inner product space with dimension n, then any
orthogonal set of n nonzero vectors is a basis for V
1. By Theorem 5.10, if S = {v1, v2, …, vn} is an orthogonal set of n
nonzero vectors, then S is linearly independent
2. According to Theorem 4.12, if S = {v1, v2, …, vn} is a linearly
independent set of n vectors in V (with dimension n), then S is a
basis for V
※ Based on the above two arguments, it is straightforward to
derive the above corollary to Theorem 5.10

5.48
 Ex 4: Using orthogonality to test for a basis
Show that the following set is a basis for R4
       v1              v2           v3            v4
S = {(2, 3, 2, -2), (1, 0, 0, 1), (-1, 0, 2, 1), (-1, 2, -1, 1)}
Sol:
v1, v2, v3, v4: nonzero vectors
v1 · v2 = 2 + 0 + 0 - 2 = 0     v2 · v3 = -1 + 0 + 0 + 1 = 0
v1 · v3 = -2 + 0 + 4 - 2 = 0    v2 · v4 = -1 + 0 + 0 + 1 = 0
v1 · v4 = -2 + 6 - 2 - 2 = 0    v3 · v4 = 1 + 0 - 2 + 1 = 0
⇒ S is orthogonal
⇒ S is a basis for R4 (by the Corollary to Theorem 5.10)
※ The corollary to Thm. 5.10 shows an advantage of introducing the concept of
orthogonal vectors, i.e., it is not necessary to solve linear systems to test
whether S is a basis (e.g., Ex 1 on Slide 5.44) if S is a set of orthogonal vectors 5.49
 Theorem 5.11: Coordinates relative to an orthonormal basis
If B = {v1, v2, …, vn} is an orthonormal basis for an inner
product space V, then the unique coordinate representation of a
vector w with respect to B is
w = 〈w, v1〉v1 + 〈w, v2〉v2 + … + 〈w, vn〉vn
※ The above theorem tells us that it is easy to derive the coordinate
representation of a vector relative to an orthonormal basis, which is
another advantage of using orthonormal bases
Pf:
B = {v1, v2, …, vn} is an orthonormal basis for V
w = k1v1 + k2v2 + … + knvn ∈ V (unique representation from Thm. 4.9)
Since 〈vi, vj〉 = 1 for i = j and 0 for i ≠ j, then
5.50
〈w, vi〉 = 〈(k1v1 + k2v2 + … + knvn), vi〉
        = k1〈v1, vi〉 + … + ki〈vi, vi〉 + … + kn〈vn, vi〉
        = ki, for i = 1 to n
⇒ w = 〈w, v1〉v1 + 〈w, v2〉v2 + … + 〈w, vn〉vn

Note:
If B = {v1, v2, …, vn} is an orthonormal basis for V and w ∈ V,
then the corresponding coordinate matrix of w relative to B is
[w]B = [〈w, v1〉, 〈w, v2〉, …, 〈w, vn〉]^T

5.51
 Ex
For w = (5, -5, 2), find its coordinates relative to the standard
basis for R3
〈w, v1〉 = w · v1 = (5, -5, 2) · (1, 0, 0) = 5
〈w, v2〉 = w · v2 = (5, -5, 2) · (0, 1, 0) = -5
〈w, v3〉 = w · v3 = (5, -5, 2) · (0, 0, 1) = 2
⇒ [w]B = [5, -5, 2]^T
※ In fact, it is not necessary to use Thm. 5.11 to find the coordinates relative
to the standard basis, because we know that the coordinates of a vector
relative to the standard basis are the same as the components of that vector
※ The advantage of the orthonormal basis emerges when we try to find the
coordinate matrix of a vector relative to a nonstandard orthonormal basis
(see the next slide) 5.52
 Ex 5: Representing vectors relative to an orthonormal basis
Find the coordinates of w = (5, -5, 2) relative to the following
orthonormal basis for R3
       v1              v2           v3
B = {(3/5, 4/5, 0), (-4/5, 3/5, 0), (0, 0, 1)}
Sol:
〈w, v1〉 = w · v1 = (5, -5, 2) · (3/5, 4/5, 0) = -1
〈w, v2〉 = w · v2 = (5, -5, 2) · (-4/5, 3/5, 0) = -7
〈w, v3〉 = w · v3 = (5, -5, 2) · (0, 0, 1) = 2
⇒ [w]B = [-1, -7, 2]^T
5.53
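※ With an orthonormal basis, Theorem 5.11 says each coordinate is just an
inner product, so no linear system needs to be solved; a sketch reproducing
Ex 5, assuming NumPy:

import numpy as np

w = np.array([5.0, -5.0, 2.0])
# Rows are the orthonormal basis vectors v1, v2, v3 of Ex 5
B = np.array([[ 3/5, 4/5, 0.0],
              [-4/5, 3/5, 0.0],
              [ 0.0, 0.0, 1.0]])

coords = B @ w        # entry i is <w, vi>
print(coords)         # [-1. -7.  2.], the coordinate matrix [w]_B

# Reconstruct w from its coordinates: w = sum_i <w, vi> vi
print(coords @ B)     # [ 5. -5.  2.]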
 The geometric intuition of the Gram-Schmidt process to find an
orthonormal basis in R2

(Figure: {v1, v2} is a basis for R2; let w1 = v1 and
w2 = v2 - projw1 v2, which is orthogonal to w1 = v1; then
{w1/||w1||, w2/||w2||} is an orthonormal basis for R2)
5.54
 Gram-Schmidt orthonormalization process:
B = {v1, v2, …, vn} is a basis for an inner product space V
Let w1 = v1,                                            S1 = span({w1})
w2 = v2 - projS1 v2 = v2 - (〈v2, w1〉/〈w1, w1〉) w1,       S2 = span({w1, w2})
w3 = v3 - projS2 v3 = v3 - (〈v3, w1〉/〈w1, w1〉) w1 - (〈v3, w2〉/〈w2, w2〉) w2
⋮
wn = vn - projS(n-1) vn = vn - Σ_{i=1}^{n-1} (〈vn, wi〉/〈wi, wi〉) wi,
where S(n-1) = span({w1, …, w(n-1)})
(The orthogonal projection onto a subspace is actually the sum of
orthogonal projections onto the vectors in an orthogonal basis for that
subspace; I will prove it on Slides 5.67 and 5.68)
⇒ B' = {w1, w2, …, wn} is an orthogonal basis
⇒ B'' = {w1/||w1||, w2/||w2||, …, wn/||wn||} is an orthonormal basis
5.55
 Ex 7: Applying the Gram-Schmidt orthonormalization process
Apply the Gram-Schmidt process to the following basis for R3
       v1          v2          v3
B = {(1, 1, 0), (1, 2, 0), (0, 1, 2)}
Sol:
w1 = v1 = (1, 1, 0)
w2 = v2 - ((v2 · w1)/(w1 · w1)) w1 = (1, 2, 0) - (3/2)(1, 1, 0) = (-1/2, 1/2, 0)
w3 = v3 - ((v3 · w1)/(w1 · w1)) w1 - ((v3 · w2)/(w2 · w2)) w2
   = (0, 1, 2) - (1/2)(1, 1, 0) - ((1/2)/(1/2))(-1/2, 1/2, 0) = (0, 0, 2) 5.56
Orthogonal basis
⇒ B' = {w1, w2, w3} = {(1, 1, 0), (-1/2, 1/2, 0), (0, 0, 2)}

Orthonormal basis
⇒ B'' = {w1/||w1||, w2/||w2||, w3/||w3||}
      = {(1/√2, 1/√2, 0), (-1/√2, 1/√2, 0), (0, 0, 1)}

5.57
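※ The steps of Ex 7 can be packaged as a short routine; a sketch for the
Euclidean inner product, assuming NumPy, which normalizes each wi as it goes
(the alternative form described later on Slide 5.60):

import numpy as np

def gram_schmidt(vectors):
    # Returns an orthonormal basis spanning the same space as `vectors`,
    # which are assumed to be linearly independent
    ortho = []
    for v in vectors:
        w = np.asarray(v, dtype=float)
        for u in ortho:                      # subtract the projections onto the
            w = w - np.dot(w, u) * u         # unit vectors found so far
        ortho.append(w / np.linalg.norm(w))  # normalize
    return np.array(ortho)

B = [(1, 1, 0), (1, 2, 0), (0, 1, 2)]
print(gram_schmidt(B))
# rows approx. (0.707, 0.707, 0), (-0.707, 0.707, 0), (0, 0, 1), matching B'' of Ex 7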
 Ex 10: Alternative form of Gram-Schmidt orthonormalization process
Find an orthonormal basis for the solution space of the
homogeneous system of linear equations
 x1 + x2        + 7x4 = 0
2x1 + x2 + 2x3 + 6x4 = 0

Sol:
[1 1 0 7 0]   G.-J. E.   [1 0  2 -1 0]
[2 1 2 6 0]      →       [0 1 -2  8 0]

[x1]   [-2s + t ]      [-2]     [ 1]
[x2] = [ 2s - 8t]  = s [ 2] + t [-8]
[x3]   [    s   ]      [ 1]     [ 0]
[x4]   [    t   ]      [ 0]     [ 1]
5.58
Thus one basis for the solution space is
B = {v1, v2} = {(-2, 2, 1, 0), (1, -8, 0, 1)}
w1 = v1 and u1 = w1/||w1|| = (1/3)(-2, 2, 1, 0) = (-2/3, 2/3, 1/3, 0)
w2 = v2 - (〈v2, u1〉/〈u1, u1〉) u1 = v2 - 〈v2, u1〉u1   (since 〈u1, u1〉 = 1)
   = (1, -8, 0, 1) - [(1, -8, 0, 1) · (-2/3, 2/3, 1/3, 0)] (-2/3, 2/3, 1/3, 0)
   = (1, -8, 0, 1) - (-6)(-2/3, 2/3, 1/3, 0)
   = (-3, -4, 2, 1)
u2 = w2/||w2|| = (1/√30)(-3, -4, 2, 1)
⇒ B'' = {(-2/3, 2/3, 1/3, 0), (-3/√30, -4/√30, 2/√30, 1/√30)}
※ In this alternative form, we always normalize wi to ui before processing wi+1
※ The advantage of this method is that it is easier to calculate the
orthogonal projection of wi+1 onto u1, u2, …, ui
5.59
 Alternative form of the Gram-Schmidt orthonormalization process:
B = {v1, v2, …, vn} is a basis for an inner product space V

u1 = w1/||w1|| = v1/||v1||
u2 = w2/||w2||, where w2 = v2 - 〈v2, u1〉u1
u3 = w3/||w3||, where w3 = v3 - 〈v3, u1〉u1 - 〈v3, u2〉u2
⋮
un = wn/||wn||, where wn = vn - Σ_{i=1}^{n-1} 〈vn, ui〉ui

⇒ {u1, u2, …, un} is an orthonormal basis for V


5.60
Keywords in Section 5.3:
 orthogonal set: 正交集合
 orthonormal set: 單範正交集合
 orthogonal basis: 正交基底
 orthonormal basis: 單位正交基底
linearly independent: 線性獨立
 Gram-Schmidt Process: Gram-Schmidt過程

5.61
5.4 Mathematical Models and Least Squares Analysis
Orthogonal complement (正交補集) of S:
Let S be a subspace of an inner product space V
(a) A vector v in V is said to be orthogonal to S if v is
orthogonal to every vector in S, i.e., 〈v, w〉 = 0, ∀ w ∈ S
(b) The set of all vectors in V that are orthogonal to S is
called the orthogonal complement of S

S⊥ = {v ∈ V | 〈v, w〉 = 0, ∀ w ∈ S}
(S⊥ is read "S perp")

Notes:
(1) {0}⊥ = V   (2) V⊥ = {0}
(This is because 〈0, v〉 = 0 for any vector v in V) 5.62
 Notes:
Given S to be a subspace of V,
(1) S⊥ is a subspace of V
(2) S ∩ S⊥ = {0}
(3) (S⊥)⊥ = S

Ex:
If V = R2 and S = the x-axis,
then (1) S⊥ = the y-axis, which is a subspace of R2
(2) S ∩ S⊥ = {(0, 0)}
(3) (S⊥)⊥ = S
※ Any vector in the x-axis (or y-axis) can be represented as (t, 0) (or
(0, s)). It is straightforward to see that any vector in the y-axis is
orthogonal to every vector in the x-axis since (0, s) · (t, 0) = 0
※ On Slide 4.32, any subset consisting of all points on a line
passing through the origin is a subspace of R2. Therefore, both the
x- and y-axes are subspaces of R2

※ In the above example, it can be observed that any vector in R2 can be
uniquely written as a sum of a vector from S and a vector from S⊥ 5.63
 Direct sum (直和):
Let S1 and S2 be two subspaces of V. If each vector x ∈ V
can be uniquely written as a sum of a vector v1 from S1
and a vector v2 from S2, i.e., x = v1 + v2, then V is the
direct sum of S1 and S2, and we can write
V = S1 ⊕ S2

5.64
 Theorem 5.13: Properties of orthogonal subspaces
Let S be a subspace of V (with dimension n). Then the
following properties are true
(1) dim(S) + dim(S⊥) = n
(2) V = S ⊕ S⊥
(3) (S⊥)⊥ = S

※ You can memorize this theorem by understanding the example
on Slide 5.63 or by considering V as the xyz-space (with
dimension 3), S as the xy-plane (with dimension 2), and S⊥
as the z-axis (with dimension 1)
※ The theorem states that every inner product space V can be
represented as the direct sum of a subspace S and its
corresponding orthogonal complement S⊥
5.65
※ In Section 5.2, we studied the orthogonal projection of
one vector onto another vector. That can be generalized to
the orthogonal projection of a vector onto a subspace S
1. The vector v1 is the projection of v onto the subspace S
and is denoted as v1 = projS v
2. v2 = v - v1 = v - projS v, which implies that the vector v2 =
v - projS v is orthogonal to the subspace S, i.e., v2 ∈ S⊥
※ For example, v = (1, 2, 3), S is the xy-plane with the standard basis {e1,
e2} = {(1, 0, 0), (0, 1, 0)}, and S⊥ is the z-axis with the standard basis {e3}
= {(0, 0, 1)} ⇒ v = (1, 2, 0) + (0, 0, 3) = v1 + v2
※ Geometrically speaking, v1 = (1, 2, 0) = projS v, so v2 = v - v1 = v - projS v
= (0, 0, 3). In addition, v2 ⊥ projS v since (0, 0, 3) · (1, 2, 0) = 0
※ Since any vector v ∈ V can be expressed as v1 + v2 with v1 ∈ S and v2 ∈ S⊥,
we can obtain V = S ⊕ S⊥

5.66
 Theorem 5.14: Projection onto a subspace
(Here we will formally prove that the orthogonal projection onto a subspace is the sum
of orthogonal projections onto the vectors in an orthonormal basis for that subspace)
If {u1, u2, …, ut} is an orthonormal basis for the
subspace S of V, and v ∈ V, then
projS v = 〈v, u1〉u1 + 〈v, u2〉u2 + … + 〈v, ut〉ut
Pf:
Let v ∈ V, v = v1 + v2, v1 ∈ S, v2 ∈ S⊥ (since the vector space V can be
expressed as the direct sum of S and S⊥; according to the results on Slide 5.66,
v1 is the orthogonal projection of v onto S, and v2 is orthogonal to any vector in S)
{u1, …, ut} is an orthonormal basis for S

⇒ we can have v1 = c1u1 + … + ctut and thus v = v1 + v2 = c1u1 + … + ctut + v2

〈v, ui〉 = 〈(c1u1 + … + ctut + v2), ui〉
        = ci〈ui, ui〉 = ci, for i = 1 to t
⇒ projS v = v1 = c1u1 + … + ctut = 〈v, u1〉u1 + 〈v, u2〉u2 + … + 〈v, ut〉ut
5.67
Corollary to Theorem 5.14: Consider an orthogonal basis
If {w1, w2, …, wt} is an orthogonal basis for the
subspace S of V, and v ∈ V, then
projS v = (〈v, w1〉/〈w1, w1〉)w1 + (〈v, w2〉/〈w2, w2〉)w2 + … + (〈v, wt〉/〈wt, wt〉)wt
Pf:
If u1 = w1/||w1||, u2 = w2/||w2||, …, and ut = wt/||wt||,
we can derive projS v = 〈v, u1〉u1 + 〈v, u2〉u2 + … + 〈v, ut〉ut.
Replacing ui with wi/||wi||,
〈v, ui〉ui = 〈v, wi/||wi||〉 (wi/||wi||) = (〈v, wi〉/||wi||²) wi = (〈v, wi〉/〈wi, wi〉) wi
⇒ projS v = (〈v, w1〉/〈w1, w1〉)w1 + (〈v, w2〉/〈w2, w2〉)w2 + … + (〈v, wt〉/〈wt, wt〉)wt
※ So, the orthogonal projection onto a subspace is the sum of orthogonal projections onto the
vectors in an orthogonal (or orthonormal) basis for that subspace
※ This formula is what we use in the Gram-Schmidt orthonormalization process on Slide 5.55 5.68
 Ex 5: Projection onto a subspace (在子空間的投影)
w1 = (0, 3, 1), w2 = (2, 0, 0), v = (1, 1, 3)
Find the projection of v onto the subspace W = span({w1, w2})
Sol:
w1, w2: an orthogonal basis for W
{u1, u2} = {w1/||w1||, w2/||w2||} = {(0, 3/√10, 1/√10), (1, 0, 0)}: an orthonormal
basis for W

projW v = 〈v, u1〉u1 + 〈v, u2〉u2
        = (6/√10)(0, 3/√10, 1/√10) + 1·(1, 0, 0) = (1, 9/5, 3/5)
5.69
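※ A sketch of projW v as the sum of projections onto an orthogonal basis of the
subspace (the Corollary to Theorem 5.14), reproducing Ex 5 above; NumPy assumed:

import numpy as np

def proj_onto_subspace(v, orthogonal_basis):
    # Project v onto span(orthogonal_basis); the basis vectors must be mutually orthogonal
    v = np.asarray(v, dtype=float)
    p = np.zeros_like(v)
    for w in orthogonal_basis:
        w = np.asarray(w, dtype=float)
        p += (np.dot(v, w) / np.dot(w, w)) * w   # projection onto each basis vector
    return p

print(proj_onto_subspace((1, 1, 3), [(0, 3, 1), (2, 0, 0)]))   # [1.  1.8 0.6] = (1, 9/5, 3/5)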
(Figures: on the left, the projection proju v = cu of v onto a vector u;
on the right, the projection projS v of v onto a subspace S, with
||v - proju v|| ≤ ||v - cu|| and ||v - projS v|| ≤ ||v - u|| for u in S)

※ Theorem 5.9 tells us that among all the scalar multiples of a vector
u, i.e., cu, the orthogonal projection of v onto u is the one closest
to v (see the left figure)
※ This property is also true for projections onto subspaces. That is,
among all the vectors in the subspace S, the vector projSv is the
vector closest to v (see the right figure and Theorem 5.15 on the
next slide)
5.70
 Theorem 5.15: Orthogonal projection and distance
Let S be a subspace of an inner product space V, and v ∈ V.
Then for all u ∈ S, u ≠ projS v,
||v - projS v|| < ||v - u||,  i.e., ||v - projS v|| = min_{u ∈ S} ||v - u||
(Among all the vectors in the subspace S, the vector projS v is closest to v)
Pf:
v - u = (v - projS v) + (projS v - u)
projS v - u ∈ S and v - projS v ∈ S⊥ ⇒ (v - projS v) ⊥ (projS v - u)
⇒ 〈v - projS v, projS v - u〉 = 0
Thus the Pythagorean Theorem in Theorem 5.6 can be applied:
||v - u||² = ||v - projS v||² + ||projS v - u||²
Since u ≠ projS v, the second term on the right-hand side is positive,
and we can have ||v - projS v|| < ||v - u||
5.71
 Theorem 5.16: Fundamental subspaces (基本子空間) of a
matrix, including CS(A), CS(A^T), NS(A), and NS(A^T)
If A is an m×n matrix, then
(1) CS(A) ⊥ NS(A^T)   (or expressed as (CS(A))⊥ = NS(A^T))
Pf: Consider any v ∈ CS(A) and any u ∈ NS(A^T); the goal is to prove v · u = 0
u ∈ NS(A^T) ⇒ A^T u = [(A(1))^T u, …, (A(n))^T u]^T = [0, …, 0]^T = 0,
i.e., A(i) · u = 0 for every column A(i) of A
⇒ v · u = (c1A(1) + … + cnA(n)) · u = c1(A(1) · u) + … + cn(A(n) · u) = 0 + … + 0 = 0

(2) CS(A) ⊕ NS(A^T) = Rm   (because CS(A) ⊕ (CS(A))⊥ = Rm in Thm. 5.13)
(3) CS(A^T) ⊥ NS(A)   (or expressed as (CS(A^T))⊥ = NS(A))
    (proved by setting B = A^T: B satisfies the first property, i.e., CS(B) ⊥ NS(B^T))
(4) CS(A^T) ⊕ NS(A) = Rn   (because CS(A^T) ⊕ (CS(A^T))⊥ = Rn in Thm. 5.13)
※ Note that according to Theorem 5.13, we know that every inner product space
V can be represented as the direct sum of a subspace S and its
corresponding S⊥. Since CS(A) (CS(A^T)) is a subspace of Rm (Rn), we can
derive CS(A) ⊕ (CS(A))⊥ = Rm (CS(A^T) ⊕ (CS(A^T))⊥ = Rn)
※ Ex. 6 on the next slide verifies these four properties are true 5.72
 Ex 6: (Fundamental subspaces)
Find the four fundamental subspaces of the matrix
    [1 2 0]
A = [0 0 1]   (in reduced row-echelon form)
    [0 0 0]
    [0 0 0]
Sol:
CS(A) = span({(1, 0, 0, 0), (0, 1, 0, 0)}) is a subspace of R4

CS(A^T) = RS(A) = span({(1, 2, 0), (0, 0, 1)}) is a subspace of R3

NS(A) = span({(-2, 1, 0)}) is a subspace of R3

(The nullspace of A is the solution space of the homogeneous system
Ax = 0, i.e., you need to solve Ax = 0 to derive (-2, 1, 0))
5.73
1 0 0 0 0  1 0 0 0 0 
[ AΤ | 0]   2 0 0 0 0  
G.-J. E.
 0 1 0 0 0 
 0 1 0 0 0  0 0 0 0 0 
s t
NS ( AT )  span  0, 0,1, 0 ,  0, 0, 0,1 is a subspace of R 4

 Check:
CS ( A)  NS ( AΤ ) (a(1, 0, 0, 0) + b(0, 1, 0, 0)) · (c(0, 0, 1, 0) + d(0, 0, 0, 1)) = 0

CS ( A)  NS ( AT )  R 4
v = v1 + v2 = (a(1, 0, 0, 0) + b(0, 1, 0, 0)) + (c(0, 0, 1, 0) + d(0,
0, 0, 1))

CS ( AΤ )  NS ( A) (e(1, 2, 0) + f(0, 0, 1)) · (g(–2, 1, 0)) = (e, 2e, f) · (–2g, g, 0) = 0

CS ( AT )  NS ( A)  R3 w = w1 + w2 = (e(1, 2, 0) + f(0, 0, 1)) + (g(–2, 1, 0)). (Since


(1, 2, 0), (0, 0, 1), (–2, 1, 0) are mutually orthogonal, these
three vectors can be a basis to span R3 according to
Corollary of Theorem 5.10) 5.74
 Ex 3:
Let W = span({w1, w2}) be a subspace of R4, where w1 = (1, 2, 1, 0) and w2 = (0, 0, 0, 1)
(a) Find a basis for W
(b) Find a basis for the orthogonal complement of W
Sol:
    w1 w2
    [1  0]   G.-J. E.   [1 0]
A = [2  0]      →       [0 1]   (reduced row-echelon form)
    [1  0]              [0 0]
    [0  1]              [0 0]

(a) W = CS(A), and since G.-J. E. will not affect the dependency among
columns, we can conclude that (1, 2, 1, 0), (0, 0, 0, 1) are linearly
independent and form a basis for W 5.75
(b) W⊥ = (CS(A))⊥ = NS(A^T)   (the nullspace of A^T is the solution space of the
homogeneous system A^T x = 0)

A^T = [1 2 1 0]   ⇒   [x1]   [-2s - t]      [-2]     [-1]
      [0 0 0 1]       [x2] = [   s   ]  = s [ 1] + t [ 0]
                      [x3]   [   t   ]      [ 0]     [ 1]
                      [x4]   [   0   ]      [ 0]     [ 0]
⇒ {(-2, 1, 0, 0), (-1, 0, 1, 0)} is a basis for W⊥

Check for Theorem 5.13:
(1) dim(W) + dim(W⊥) = dim(R4)
(2) W ⊕ W⊥ = R4
any v = v1 + v2 = (c1(1, 2, 1, 0) + c2(0, 0, 0, 1)) + (c3(-2, 1, 0, 0)
+ c4(-1, 0, 1, 0)). (Since these four vectors are linearly
independent, these four vectors can form a basis for R4, i.e.,
for each v, there is a unique solution set of ci. In other
words, R4 is the direct sum of W and W⊥)
5.76
 Least squares problem (最小平方問題):
Ax = b   (a system of linear equations, where A is m×n, x is n×1, and b is m×1)

(1) When the system is consistent, we can use Gaussian
elimination with back substitution to solve for x

(2) When the system is inconsistent, only the "best possible"
solution of the system can be found, i.e., we find a solution x
for which the difference (or the error) between Ax and b is
smallest

Note: Theorem 4.19 states that the system of linear equations Ax = b is
consistent if and only if b is in the column space of A
5.77
 Least squares solution:
Given a system Ax = b of m linear equations in n unknowns,
the least squares problem is to find a vector x in Rn that
minimizes the distance between Ax and b, i.e., ||Ax - b||,
with respect to the Euclidean inner product in Rm. Such a
vector is called a least squares solution of Ax = b

※ The term least squares comes from the fact that minimizing
||Ax - b|| is equivalent to minimizing ||Ax - b||² = (Ax - b) ·
(Ax - b), which is a sum of squared errors

5.78
A ∈ M_{m×n}, x ∈ Rn ⇒ Ax ∈ CS(A)
(In Thm. 4.19 on Slides 4.97 and 4.98, Ax can be expressed as
x1A(1) + x2A(2) + … + xnA(n))
That is, find x̂1A(1) + x̂2A(2) + … + x̂nA(n) which is closest to b

Define W = CS(A); the problem of finding x̂ such that Ax̂ is closest to b is
equivalent to finding the vector in CS(A) closest to b, that is, projW b

(Figure: b, Ax̂ = projW b in the subspace W, and the residual b - Ax̂ orthogonal to W)

Thus Ax̂ = projW b   (the best solution x̂ should satisfy this equation)

⇒ (b - projW b) = (b - Ax̂) ⊥ W ⇒ (b - Ax̂) ⊥ CS(A)

⇒ b - Ax̂ ∈ (CS(A))⊥ = NS(A^T)   (the nullspace of A^T is the solution space of
the homogeneous system A^T x = 0)
⇒ A^T(b - Ax̂) = 0
⇒ A^T A x̂ = A^T b   (the n×n linear system of normal equations (一般方程式)
associated with Ax = b) 5.79
 Note:
The problem of finding the least squares solution of Ax = b
is equivalent to the problem of finding an exact solution of the
associated normal system A^T A x̂ = A^T b

Theorem associated with the least squares solution:
For any linear system Ax = b, the associated normal system
A^T A x̂ = A^T b
is consistent, and all solutions of the normal system are least
squares solutions of Ax = b. In other words, if W is the column
space of A and x̂ is a least squares solution of Ax = b, then
the orthogonal projection of b onto W is Ax̂, i.e.,
projW b = Ax̂
5.80
 Theorem for the normal equation:
If A is an m×n matrix with linearly independent column
vectors, then for every m×1 vector b, the linear system Ax = b
has a unique least squares solution. This solution is given by
x̂ = (A^T A)^(-1) A^T b
Meanwhile, if W denotes the column space of A, then the
orthogonal projection of b onto W is
projW b = Ax̂ = A(A^T A)^(-1) A^T b

5.81
 Ex 7: Solving the normal equations
Find the least squares solution of the following system Ax = b
[1 1]          [0]
[1 2] [c0]  =  [1]
[1 3] [c1]     [3]

and find the orthogonal projection of b onto the column space of A

5.82
 Sol:
1 1 
1 1 1   3 6 
A A
T
 1 2   
1 2 3 1 3  6 14 
 
0 
1 1 1    4 
A b
T
 1   
1 2 3 3 11
 

the corresponding normal system


AT Axˆ  AT b
3 6  cˆ0   4 
6 14  cˆ   11
  1   

5.83
the least squares solution of Ax = b
x̂ = [ĉ0]  =  [-5/3]
    [ĉ1]     [ 3/2]

the orthogonal projection of b onto the column space of A
                   [1 1]           [-1/6]
projCS(A) b = Ax̂ = [1 2] [-5/3]  =  [ 4/3]
                   [1 3] [ 3/2]     [17/6]

※ If you find an orthogonal basis for CS(A) by performing the Gram-
Schmidt process and then calculate projCS(A) b directly, you will
derive the same result
5.84
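※ The computation in Ex 7 can be verified numerically; a sketch assuming NumPy
(np.linalg.lstsq solves the same least squares problem directly):

import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.0, 1.0, 3.0])

# Solve the normal equations  A^T A x = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print(x_hat)       # [-1.6667  1.5], i.e. (-5/3, 3/2)

# Orthogonal projection of b onto CS(A)
print(A @ x_hat)   # [-0.1667  1.3333  2.8333], i.e. (-1/6, 4/3, 17/6)

# The same least squares solution from a library routine
print(np.linalg.lstsq(A, b, rcond=None)[0])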
※ The above problem is equivalent to finding a line y = c0 + c1x that "best fits"
the three points (1, 0), (2, 1), and (3, 3) on the xy-plane.
※ This analysis is also called the least squares regression analysis, in
which we can find the "best fit" linear relationship between x and y, and
next we can estimate y given different values of x.
※ The matrix representation for the three equations corresponding to (1, 0),
(2, 1), and (3, 3) is as follows
0 = c0 + c1(1)       [0]   [1 1]
1 = c0 + c1(2)   ⇒   [1] = [1 2] [c0]   i.e.,  Y = XC
3 = c0 + c1(3)       [3]   [1 3] [c1]
※ Since the above system of linear equations is inconsistent, only the "best
possible" solution of the system can be found, i.e., we find a solution C that
minimizes the difference between XC and Y
※ According to the theorem on Slide 5.81, the solution of C should be Ĉ =
(X^T X)^(-1) X^T Y (with Y ↔ b, X ↔ A, and C ↔ x), which is exactly the same as the
formula of the least squares regression in Section 2.5
5.85
※ The results imply that the least squares regression line for (1, 0), (2, 1), and (3, 3)
is y = -(5/3) + (3/2)x
※ Since the "best fit" rather than the "exact" relationship is considered, an
error term e should be introduced to obtain the exact equations
corresponding to (1, 0), (2, 1), and (3, 3):

0 = ĉ0 + ĉ1(1) + e1       [0]   [1 1]            [ 1/6]
1 = ĉ0 + ĉ1(2) + e2   ⇒   [1] = [1 2] [-5/3]  +  [-1/3]   i.e.,  Y = XĈ + E
3 = ĉ0 + ĉ1(3) + e3       [3]   [1 3] [ 3/2]     [ 1/6]

※ The minimized sum of squared errors is
                           [ 1/6]
E^T E = [1/6  -1/3  1/6]   [-1/3]  =  1/6
                           [ 1/6]
※ Note that the square root of E^T E is the distance between Y and XĈ, i.e.,
√((0 - (-1/6))² + (1 - (4/3))² + (3 - (17/6))²) = √(1/6)
5.86
 Ex: Prediction of world population
The following table shows the world population (in billions)
for six different years
Year 1980 1985 1990 1995 2000 2005
Population 4.5 4.8 5.3 5.7 6.1 6.5

Let x = 0 represent the year 1980. Find the least squares
regression quadratic polynomial y = c0 + c1x + c2x² for these data
and use the model to estimate the population for the year 2010
※ The least squares regression analysis can be employed to solve not
only the univariate regression in the previous example but also
multivariate regressions, i.e., y = c0 + c1x1 + c2x2+…+ cnxn, where y is
the dependent variable and xi are explanatory variables

5.87
By substituting the data points (0, 4.5), (5, 4.8), (10, 5.3), (15, 5.7),
(20, 6.1), and (25, 6.5) into the quadratic polynomial y = c0 + c1x + c2x²,
we can produce the least squares problem Ax = b:
[1  0    0]          [4.5]
[1  5   25]  [c0]    [4.8]
[1 10  100]  [c1] =  [5.3]
[1 15  225]  [c2]    [5.7]
[1 20  400]          [6.1]
[1 25  625]          [6.5]
The normal equations are A^T A x̂ = A^T b:
[   6     75    1375] [ĉ0]   [ 32.9]
[  75   1375   28125] [ĉ1] = [  447]
[1375  28125  611875] [ĉ2]   [ 8435]
5.88
The solution is
     [ĉ0]   [ 4.47  ]   [4.5 ]
x̂ =  [ĉ1] = [ 0.08  ] ≈ [0.08]
     [ĉ2]   [0.00007]   [0   ]

Evaluating this polynomial at x = 30 gives the estimate of the
world population for the year 2010:
y ≈ 4.5 + 0.08(30) = 6.9 billion

5.89
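※ A sketch reproducing the quadratic fit, assuming NumPy; the columns of A are
1, x, and x² evaluated at the six observation years:

import numpy as np

x = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0])   # years since 1980
y = np.array([4.5, 4.8, 5.3, 5.7, 6.1, 6.5])       # population in billions

A = np.column_stack([np.ones_like(x), x, x**2])    # design matrix for c0 + c1*x + c2*x^2
c = np.linalg.solve(A.T @ A, A.T @ y)              # solve the normal equations
print(c)                                           # approx. [4.47, 0.08, 0.00007]

# Estimate for the year 2010 (x = 30)
print(c[0] + c[1]*30 + c[2]*30**2)                 # approx. 6.9 billion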
 In the field of management, the least squares regression is
commonly used to find the linear relationship between the
dependent variable (y) and explanatory variables (xi), for example,
– The relationship between the growth of the cost of purchasing

material (y) and the growth of the oil price (x)


– If the regression result is y = 0.01 + 3x, it means that if the oil price rises
by 1%, then the firm’s cost on material on average increases by 3%
– The relationship between the excess returns of an individual
stock i (y = ri – rf) and the excess returns of the market index (x
= rM – rf) (single-factor model or CAPM)
– If the regression result is y = 0.05 + 1.2x, it means that if the market
excess return rises by 1%, then the excess return of the individual stock
on average increases by 1.2%
Expected market excess return next year 9% 10% 11%
Expected individual excess return next year 15.8% 17% 18.2%
– The coefficient for x, which is 1.2 here, is called beta 5.90
 Homework 1: Find betas for three firms
Solve the least squares regression problem as follows
ri,t – rf,t = αi + βi(rM,t – rf,t) + et
– ri,t and rM,t are total returns of the asset i and the market index on each
trading day t, which include both capital gains and dividend income
 It is inappropriate to calculate returns according to closing prices, and you
need to find adjusted closing prices to compute the adjusted (total) return
 S&P 500 total return index can be downloaded from the course website
 Adjusted closing prices of S&P 500 components can be downloaded from
the finance page of U.S. Yahoo
 Since ri,t and rM,t are daily returns, convert them to annualized returns
– The risk free rate rf,t is approximated by 1-month Treasury yields
 U.S. Department of the Treasury: http://www.treasury.gov/resource-
center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yield
– Today is assumed to be Sept. 1, 2014 and it is required to employ the
prior two-year historical daily returns to solve αi and βi for each firm 5.91
 Bonus: construct a portfolio rp = w1r1 + w2r2 + w3r3 to replicate
the expected return and beta of the market index in the prior two
years, i.e., solving w1, w2, and w3 in the following system
  w1E(r1) + w2E(r2) + w3E(r3) = E(rM)
  w1β1 + w2β2 + w3β3 = βM = 1
  w1 + w2 + w3 = 1
(Note that the expectations are approximated by the sample averages)
– In-sample test: examine that rp is indeed with identical average

return and beta as the market portfolio for the prior two years, i.e.,
verify the expected return and beta of a portfolio are the weighted
averages of the expected returns and betas of component stocks
– Out-of-sample test: compare the average return and beta of this

portfolio and the market index for the next two months and tell me
what you observe
 Total points for this homework is 10. The basic requirement is 7
points, and the bonus is 3 points 5.92
Keywords in Section 5.4:
 orthogonal to W: 正交於W
 orthogonal complement: 正交補集
 direct sum: 直和
 projection onto a subspace: 在子空間的投影
 fundamental subspaces: 基本子空間
 least squares problem: 最小平方問題
 normal equations: 一般方程式

5.93
5.5 Applications of Inner Product Spaces
 Least Squares Approximations for a function
– An important application: use a polynomial function to
approximate the probability density function (pdf) of the
standard normal distribution, and then you can derive the
approximation for the cumulative distribution function
(cdf) of the standard normal distribution

g(x) = a0 + a1x + a2x² + a3x³ + a4x⁴ ≈ n(x) = (1/√(2π)) e^(-x²/2)

⇒ N(c) = ∫ n(x) dx = ∫ (1/√(2π)) e^(-x²/2) dx   (integrated up to c)
       ≈ ∫ (a0 + a1x + a2x² + a3x³ + a4x⁴) dx   (integrated up to c)
       = a0x + (1/2)a1x² + (1/3)a2x³ + (1/4)a3x⁴ + (1/5)a4x⁵ + k, evaluated at x = c

(the constant k can be solved by requiring
a0x + (1/2)a1x² + (1/3)a2x³ + (1/4)a3x⁴ + (1/5)a4x⁵ + k = 1 at the upper
end of the approximation interval) 5.94
 Calculate N(c) by the Monte Carlo Simulation
– Monte Carlo simulation: generate some random scenarios and
conduct further analysis based on these scenarios
– By definition, N(c) = prob(x ≤ c) for x ~ N(0, 1)
– Draw, for example, 1000 random samples from a standard
normal distribution x ~ N(0, 1)
– Then N(c) can be estimated with #(x ≤ c) / 1000

Draw random samples from N(0, 1) in Excel
– Rand(): draw uniformly distributed random samples from [0, 1]
(the probabilities for x equal to any
values between 0 and 1 are the same)
5.95
– Normsinv(): inverse of the cumulative distribution function of the
standard normal distribution
※ Normsinv(Rand()) can draw random samples from the
standard normal distribution

(Figure: the cumulative distribution function N(x) of the standard normal distribution)

The function Rand() draws a random sample from [0, 1].
Then we use the inverse function N⁻¹(Rand()) to derive a
simulated value of x from the standard normal distribution,
where N(x) is the cumulative distribution function of the
standard normal distribution. 5.96
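※ The same Monte Carlo estimate of N(c) in Python rather than Excel; a sketch
assuming NumPy and SciPy, where the sample size of 100,000 and c = 1.0 are
illustrative choices:

import numpy as np
from scipy.special import ndtri   # inverse of the standard normal cdf, like Normsinv

rng = np.random.default_rng(0)
c = 1.0
n_samples = 100_000

# Inverse-transform sampling, as with Normsinv(Rand()) in Excel:
# draw U ~ Uniform[0, 1], then x = N^{-1}(U) is standard normal
u = rng.uniform(0.0, 1.0, size=n_samples)
x = ndtri(u)

# Estimate N(c) = prob(x <= c) by the fraction of samples at or below c
print(np.mean(x <= c))            # approx. 0.841 for c = 1.0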
 Analyze a project with expanding and aborting options by the
Monte Carlo simulation
– Time 0: initial investment V0
– Time t1: the value of the project is V1 ~ N(V0(1 + k)^t1, σ1), and the
firm can double the investment if V1 ≥ u (i.e., V1 becomes 2V1), abort
the investment if V1 ≤ d (i.e., V1 becomes 0), or otherwise maintain the
same investment amount so that V1 is unchanged, where k is the
expected growth rate of the investment
– Time t2: the value of the project is V2 ~ N(V1(1 + k)^(t2 - t1), σ2),
where V1 is the post-decision value from time t1
※ Through the Monte Carlo simulation, you can generate
thousands of scenarios for the final payoff of the project, and
further analyze the associated mean and s.d. of the payoff of
the project
5.97
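※ A sketch of the two-period simulation just described, assuming NumPy; every
numerical value below (V0, k, σ1, σ2, u, d, t1, t2) is a placeholder assumption
chosen only for illustration:

import numpy as np

rng = np.random.default_rng(0)
n_paths = 10_000

V0, k = 100.0, 0.05          # initial investment and expected growth rate (assumed)
t1, t2 = 1.0, 2.0            # decision date and terminal date in years (assumed)
sigma1, sigma2 = 15.0, 20.0  # standard deviations of V1 and V2 (assumed)
u, d = 120.0, 80.0           # expand if V1 >= u, abort if V1 <= d (assumed)

# Time t1: project value V1 ~ N(V0 (1+k)^t1, sigma1)
V1 = rng.normal(V0 * (1 + k) ** t1, sigma1, size=n_paths)

# Decision rule: double if V1 >= u, abort (value 0) if V1 <= d, otherwise keep V1
V1_adj = np.where(V1 >= u, 2 * V1, np.where(V1 <= d, 0.0, V1))

# Time t2: V2 ~ N(V1_adj (1+k)^(t2-t1), sigma2); aborted paths stay at 0
V2 = np.where(V1_adj > 0,
              rng.normal(V1_adj * (1 + k) ** (t2 - t1), sigma2, size=n_paths),
              0.0)

print(V2.mean(), V2.std())   # simulated mean and s.d. of the final project payoff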
