Guangri Xue
Problem 1 (4.1). Let $f(x) = 10(x_2 - x_1^2)^2 + (1 - x_1)^2$. At $x = (0, -1)$ draw the contour lines of the quadratic model
$$m_k(p) = f_k + \nabla f_k^T p + \tfrac{1}{2} p^T B_k p, \qquad (1)$$
assuming that $B_k$ is the Hessian of $f$. Draw the family of solutions of
$$\min_{p \in \mathbb{R}^n} m_k(p) = f_k + \nabla f_k^T p + \tfrac{1}{2} p^T B_k p \quad \text{s.t.} \quad \|p\| \le \Delta \qquad (2)$$
as $\Delta$ varies.
Solutions: At $x = (0, -1)$, compute
$$f_k = 11, \qquad \nabla f_k = [-2, -20]^T, \qquad B_k = \begin{pmatrix} 42 & 0 \\ 0 & 20 \end{pmatrix}.$$
See Figures 1 and 2 for the contours and the solutions.
Figure 1: Contour of $m_k$ at $x = (0, -1)$ with solutions.
Figure 2: Zoom of the left panel (Figure 1).
Figure 3: Contour of $m_k$ at $x = (0, 0.5)$ with solutions.
Figure 4: Zoom of the left panel (Figure 3).
At $x = (0, 0.5)$,
$$f_k = \tfrac{7}{2}, \qquad \nabla f_k = [-2, 10]^T, \qquad B_k = \begin{pmatrix} -18 & 0 \\ 0 & 20 \end{pmatrix}.$$
See Figures 3 and 4 for the contours and the solutions.
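The model data above can be verified numerically; a minimal sketch (NumPy assumed) evaluating $f$, its gradient, and its Hessian at $x = (0, 0.5)$:

```python
import numpy as np

def f(x):
    # f(x) = 10 (x2 - x1^2)^2 + (1 - x1)^2
    return 10 * (x[1] - x[0]**2)**2 + (1 - x[0])**2

def grad(x):
    # gradient of f
    return np.array([-40 * x[0] * (x[1] - x[0]**2) - 2 * (1 - x[0]),
                     20 * (x[1] - x[0]**2)])

def hess(x):
    # Hessian of f
    return np.array([[-40 * (x[1] - x[0]**2) + 120 * x[0]**2 + 2, -40 * x[0]],
                     [-40 * x[0], 20.0]])

x = np.array([0.0, 0.5])
print(f(x))      # 3.5
print(grad(x))   # [-2. 10.]
print(hess(x))   # [[-18.   0.]  [  0.  20.]]
```

Contours of $m_k$ can then be plotted from these values with any plotting library.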
Problem 2 (4.2). Write a program that implements the dogleg method. Choose $B_k$ to be the exact Hessian. Apply it to minimize Rosenbrock's function. Experiment with the update rule for the trust region by changing the constants in Algorithm 4.1, or by designing your own rules.
Solutions: For Rosenbrock's function $f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2$,
$$\nabla f(x) = \begin{pmatrix} -400x_1(x_2 - x_1^2) - 2(1 - x_1) \\ 200(x_2 - x_1^2) \end{pmatrix}, \qquad \nabla^2 f(x) = \begin{pmatrix} -400x_2 + 1200x_1^2 + 2 & -400x_1 \\ -400x_1 & 200 \end{pmatrix}.$$
For the initial guess $x_0 = (1.2, 1.2)^T$, the Hessian matrix is always positive definite during the iterative process (checked through the code), so the dogleg method should work well. In fact, $x_0$ is close to the exact solution, so the method behaves like Newton's method. In addition, $\|\nabla f(x_k)\|$ decreases monotonically.
With $\eta = 0$ and $\Delta_0 = 0.1$, the final iterations give:

    1.60e-06   3.03e-02   1.28e-04   7
    3.26e-10   5.50e-04   1.60e-06   8
    1.39e-17   8.98e-08   3.26e-10   9

(the last column is the iteration index $k$).
One can see that $\|\nabla f(x_k)\|$ does not decrease monotonically, which verifies the convergence theory numerically when $\eta = 0$. We also test the method with more conservative parameters, $\eta = 0.01$ and $\Delta_0 = 10^{-5}$: it took 31 iterations instead of the 8 before.
Now we take $x_0 = (-1.2, 1)^T$. In this case, the Hessian is not always positive definite. However, the iteration converges in 190 iterations with $\eta = 0.1$ and $\Delta_0 = 10^{-3}$. Of course, we cannot guarantee that $\|\nabla f(x_k)\|$ decreases monotonically.
Problem 3 (4.6). The Cauchy–Schwarz inequality states that for any vectors $u$ and $v$, we have
$$|(u, v)|^2 \le (u, u)(v, v),$$
with equality only when $u$ and $v$ are parallel. When $B$ is positive definite, use this inequality to show that
$$\gamma := \frac{\|g\|^4}{(g, Bg)(g, B^{-1}g)} \le 1,$$
with equality only if $g$ and $Bg$ (and $B^{-1}g$) are parallel.
Proof. First of all, $B$ must be symmetric. If not, the following is a counterexample. Take
$$B = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}. \quad \text{Then} \quad B^{-1} = \begin{pmatrix} \tfrac12 & -\tfrac12 \\ \tfrac12 & \tfrac12 \end{pmatrix}.$$
Let $g = [g_1, g_2]^T$. Then $(g, Bg) = g_1^2 + g_2^2$ and $(g, B^{-1}g) = \tfrac12 (g_1^2 + g_2^2)$. As a result, $\gamma = 2 > 1$.
So assume $B$ is symmetric positive definite. Let $0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ be the eigenvalues, and $x_i$ ($i = 1, 2, \ldots, n$) the corresponding orthonormal eigenvectors. Write $g = \sum_{i=1}^n c_i x_i$. We have
$$(g, Bg) = \sum_{i=1}^n \lambda_i c_i^2, \qquad (g, B^{-1}g) = \sum_{i=1}^n \lambda_i^{-1} c_i^2, \qquad (g, g) = \sum_{i=1}^n c_i^2.$$
By the Cauchy–Schwarz inequality,
$$\|g\|^4 = \left( \sum_{i=1}^n c_i^2 \right)^2 = \left( \sum_{i=1}^n (\lambda_i^{1/2} c_i)(\lambda_i^{-1/2} c_i) \right)^2 \le \left( \sum_{i=1}^n \lambda_i c_i^2 \right) \left( \sum_{i=1}^n \lambda_i^{-1} c_i^2 \right) = (g, Bg)(g, B^{-1}g),$$
which gives $\gamma \le 1$, with equality only when the vectors $(\lambda_i^{1/2} c_i)$ and $(\lambda_i^{-1/2} c_i)$ are proportional, i.e. only if $g$ and $Bg$ (and $B^{-1}g$) are parallel.
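The inequality and its equality case can be spot-checked numerically; a small sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(100):
    A = rng.standard_normal((5, 5))
    B = A @ A.T + 5 * np.eye(5)       # random symmetric positive definite matrix
    g = rng.standard_normal(5)
    gamma = np.linalg.norm(g)**4 / ((g @ B @ g) * (g @ np.linalg.solve(B, g)))
    assert gamma <= 1 + 1e-12         # gamma <= 1 for every g

# equality when g is an eigenvector of B (then g, Bg, B^{-1}g are parallel)
w, V = np.linalg.eigh(B)
g = V[:, 0]
gamma = np.linalg.norm(g)**4 / ((g @ B @ g) * (g @ np.linalg.solve(B, g)))
print(round(gamma, 8))   # 1.0
```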
Problem 4 (4.7). When $B$ is positive definite, the double-dogleg method constructs a path with three line segments from the origin to the full step. The four points that define the path are
the origin;
the unconstrained Cauchy step $p^C = -\frac{(g, g)}{(g, Bg)} g$;
a fraction $\bar\gamma p^B$ of the full step, for some $\bar\gamma \in (\gamma, 1]$, where $\gamma$ is defined in the previous question; and
the full step $p^B = -B^{-1} g$.
Show that $\|p\|$ increases monotonically along this path.
Proof. The path is
$$p(\tau) = \begin{cases} \tau p^C, & 0 \le \tau \le 1, \\ p^C + (\tau - 1)(\bar\gamma p^B - p^C), & 1 \le \tau \le 2, \\ \left( \bar\gamma + (\tau - 2)(1 - \bar\gamma) \right) p^B, & 2 \le \tau \le 3. \end{cases}$$
On the first segment $\|p(\tau)\| = \tau \|p^C\|$ is increasing, and on the third segment $\|p(\tau)\| = (\bar\gamma + (\tau - 2)(1 - \bar\gamma)) \|p^B\|$ is increasing since $\bar\gamma \le 1$. For the middle segment, let
$$h(\alpha) = \tfrac12 \|p^C + \alpha(\bar\gamma p^B - p^C)\|^2 = \tfrac12 \|p^C\|^2 + \alpha (p^C, \bar\gamma p^B - p^C) + \tfrac{\alpha^2}{2} \|\bar\gamma p^B - p^C\|^2.$$
It is enough to show $h'(\alpha) > 0$ for $\alpha \in (0, 1)$. Since $h'(\alpha) = (p^C, \bar\gamma p^B - p^C) + \alpha \|\bar\gamma p^B - p^C\|^2$ is nondecreasing in $\alpha$, it suffices to check
$$h'(0) = \bar\gamma (p^C, p^B) - \|p^C\|^2 = \frac{(g, g)(g, B^{-1}g)}{(g, Bg)} (\bar\gamma - \gamma) > 0,$$
which holds because $\bar\gamma > \gamma$.
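The monotonicity claim can be spot-checked numerically; a sketch assuming NumPy, with $\bar\gamma = (\gamma + 1)/2 \in (\gamma, 1]$ chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = A @ A.T + np.eye(4)                  # symmetric positive definite
g = rng.standard_normal(4)

pC = -(g @ g) / (g @ B @ g) * g          # unconstrained Cauchy step
pB = -np.linalg.solve(B, g)              # full step
gamma = np.linalg.norm(g)**4 / ((g @ B @ g) * (g @ np.linalg.solve(B, g)))
gbar = 0.5 * (gamma + 1)                 # any value in (gamma, 1] works

def p(tau):
    # the three-segment double-dogleg path
    if tau <= 1:
        return tau * pC
    if tau <= 2:
        return pC + (tau - 1) * (gbar * pB - pC)
    return (gbar + (tau - 2) * (1 - gbar)) * pB

norms = [np.linalg.norm(p(t)) for t in np.linspace(0, 3, 301)]
print(all(np.diff(norms) >= -1e-12))     # True: ||p|| increases along the path
```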
Problem 5 (4.9). Derive the solution of the two-dimensional subspace minimization
problem in the case where B is positive definite.
Solutions: The problem is
$$\min_p m(p) = f + g^T p + \tfrac12 p^T B p \quad \text{s.t.} \quad \|p\| \le \Delta, \; p \in \operatorname{span}[g, B^{-1}g].$$
Let $p = \alpha g + \beta B^{-1} g$ and $u = (\alpha, \beta)^T$. The constraint becomes
$$J(u) := \|p\|^2 = \alpha^2 \|g\|^2 + 2\alpha\beta (g, B^{-1}g) + \beta^2 \|B^{-1}g\|^2 = u^T \tilde B u \le \Delta^2, \qquad (3)$$
where
$$\tilde B = \begin{pmatrix} \|g\|^2 & (B^{-1}g, g) \\ (B^{-1}g, g) & \|B^{-1}g\|^2 \end{pmatrix}.$$
$\tilde B$ is the Gram matrix of $g$ and $B^{-1}g$, hence symmetric positive semidefinite, and positive definite whenever $g$ and $B^{-1}g$ are linearly independent. Substituting $p = \alpha g + \beta B^{-1} g$ into the model gives the reduced quadratic
$$m(p) = f + \hat g^T u + \tfrac12 u^T \hat B u =: \hat m(u), \qquad (4)$$
where
$$\hat g = \begin{pmatrix} \|g\|^2 \\ (g, B^{-1}g) \end{pmatrix}, \qquad \hat B = \begin{pmatrix} (g, Bg) & \|g\|^2 \\ \|g\|^2 & (g, B^{-1}g) \end{pmatrix}.$$
Since $u^T \hat B u = p^T B p$, $\hat B$ is also symmetric positive definite (again when $g$ and $B^{-1}g$ are independent). Let $u^*$ be the minimizer of (4) with no constraint:
$$u^* = -\hat B^{-1} \hat g.$$
If $J(u^*) \le \Delta^2$, then $u^*$ is the solution. If $J(u^*) > \Delta^2$, solve the stationarity condition of the constrained problem: for some $\lambda \ge 0$,
$$\nabla \hat m(u) + \lambda \nabla J(u) = 0.$$
This is equivalent to
$$(\hat B + 2\lambda \tilde B) u = -\hat g.$$
$(\hat B + 2\lambda \tilde B)$ is symmetric positive definite, so the solution is
$$u = -(\hat B + 2\lambda \tilde B)^{-1} \hat g,$$
with $\lambda$ chosen so that $J(u) = \Delta^2$; the minimizer is then $p = \alpha g + \beta B^{-1} g$.
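The derivation translates directly into code; a sketch assuming NumPy, with the multiplier found by simple bisection on $J(u) = \Delta^2$ (the name `subspace2d` and the bracketing strategy are illustrative):

```python
import numpy as np

def subspace2d(g, B, delta):
    """Minimize g^T p + 0.5 p^T B p over ||p|| <= delta, p in span[g, B^{-1}g]."""
    S = np.column_stack([g, np.linalg.solve(B, g)])  # basis of the subspace
    Bh = S.T @ B @ S                                 # reduced Hessian \hat{B}
    gh = S.T @ g                                     # reduced gradient \hat{g}
    G = S.T @ S                                      # Gram matrix \tilde{B}
    u = np.linalg.solve(Bh, -gh)                     # unconstrained minimizer u*
    if u @ G @ u <= delta**2:
        return S @ u
    lo, hi = 0.0, 1.0                                # bracket the multiplier
    u = np.linalg.solve(Bh + hi * G, -gh)
    while u @ G @ u > delta**2:
        hi *= 2
        u = np.linalg.solve(Bh + hi * G, -gh)
    for _ in range(200):                             # bisection on J(u) = delta^2
        mid = 0.5 * (lo + hi)
        u = np.linalg.solve(Bh + mid * G, -gh)
        lo, hi = (mid, hi) if u @ G @ u > delta**2 else (lo, mid)
    u = np.linalg.solve(Bh + hi * G, -gh)            # feasible endpoint
    return S @ u

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
B = A @ A.T + np.eye(5)
g = rng.standard_normal(5)
p = subspace2d(g, B, 0.5)
print(np.linalg.norm(p) <= 0.5 + 1e-9)   # True: p is feasible
print(g @ p + 0.5 * p @ B @ p < 0)       # True: better than p = 0
```

Here the factor $2$ in $(\hat B + 2\lambda \tilde B)$ is absorbed into the bisection variable.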
Problem 6 (4.8). Show that the two updates
$$\lambda^{(l+1)} = \lambda^{(l)} - \frac{\phi_2(\lambda^{(l)})}{\phi_2'(\lambda^{(l)})} \qquad \text{and} \qquad \lambda^{(l+1)} = \lambda^{(l)} + \left( \frac{\|p_l\|}{\|q_l\|} \right)^2 \frac{\|p_l\| - \Delta}{\Delta}$$
are equivalent.
Proof: Here $\phi_2(\lambda) = \frac{1}{\Delta} - \frac{1}{\|p(\lambda)\|}$. From the hint, we have
$$\phi_2'(\lambda) = \frac{d}{d\lambda} \left( -\frac{1}{\|p(\lambda)\|} \right) = \frac12 \left( \|p(\lambda)\|^2 \right)^{-3/2} \frac{d}{d\lambda} \|p(\lambda)\|^2, \qquad (5)$$
and
$$\frac{d}{d\lambda} \|p(\lambda)\|^2 = -2 \sum_{j=1}^n \frac{(q_j^T g)^2}{(\lambda_j + \lambda)^3}. \qquad (6)$$
With $q_l = R^{-T} p_l$, where $B + \lambda I = R^T R$,
$$\|q_l\|^2 = p_l^T (B + \lambda I)^{-1} p_l = \sum_{j=1}^n \frac{(q_j^T g)^2}{(\lambda_j + \lambda)^3}. \qquad (7)$$
Note that
$$-\frac{\phi_2(\lambda)}{\phi_2'(\lambda)} = -\frac{\frac{1}{\Delta} - \frac{1}{\|p(\lambda)\|}}{\frac12 \left( \|p(\lambda)\|^2 \right)^{-3/2} \frac{d}{d\lambda} \|p(\lambda)\|^2};$$
by (6) and (7), we have
$$-\frac{\phi_2(\lambda)}{\phi_2'(\lambda)} = \frac{\|p(\lambda)\|^3 \left( \|p(\lambda)\| - \Delta \right)}{\Delta \|p(\lambda)\| \, \|q_l\|^2} = \left( \frac{\|p_l\|}{\|q_l\|} \right)^2 \frac{\|p_l\| - \Delta}{\Delta},$$
so the two updates coincide.
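The equivalence can also be checked numerically; a sketch assuming NumPy, comparing the Newton update on $\phi_2$ (via the eigenexpansion) with the Cholesky-based form:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5))
B = A @ A.T + np.eye(5)                  # symmetric positive definite
g = rng.standard_normal(5)
delta, lam = 0.3, 1.5

# Newton update on phi_2(lam) = 1/delta - 1/||p(lam)||, via the eigenexpansion
w, Q = np.linalg.eigh(B)                 # eigenvalues lambda_j, eigenvectors q_j
c = Q.T @ g                              # coefficients q_j^T g
pnorm2 = np.sum(c**2 / (w + lam)**2)     # ||p(lam)||^2
dpnorm2 = -2 * np.sum(c**2 / (w + lam)**3)        # its derivative, eq. (6)
phi = 1 / delta - 1 / np.sqrt(pnorm2)
dphi = 0.5 * pnorm2**(-1.5) * dpnorm2             # eq. (5)
lam_newton = lam - phi / dphi

# Cholesky form: p = -(B + lam I)^{-1} g, q = R^{-T} p with B + lam I = R^T R
p = np.linalg.solve(B + lam * np.eye(5), -g)
R = np.linalg.cholesky(B + lam * np.eye(5)).T
q = np.linalg.solve(R.T, p)
lam_chol = lam + (np.linalg.norm(p) / np.linalg.norm(q))**2 \
               * (np.linalg.norm(p) - delta) / delta

print(np.isclose(lam_newton, lam_chol))  # True
```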
Problem 7. Consider the problem (4.5) and its approximate solutions: prove that
$$m(p^*) \le m(p^T) \le m(p^D) \le m(p^C),$$
where $p^*$ is the exact solution, $p^T$ the two-dimensional subspace minimizer, $p^D$ the dogleg point, and $p^C$ the Cauchy point.
Proof: The dogleg path is
$$p(\tau) = \begin{cases} -\tau \dfrac{g^T g}{g^T B g} g, & 0 \le \tau \le 1, \\[2mm] -\dfrac{g^T g}{g^T B g} g + (\tau - 1) \left( -B^{-1} g + \dfrac{g^T g}{g^T B g} g \right), & 1 \le \tau \le 2. \end{cases}$$
Along this path $\|p(\tau)\|$ increases and $m(p(\tau))$ decreases monotonically (Lemma 4.2). The Cauchy point $p^C$ lies on the path inside the trust region, while $p^D$ is the path point of largest $\tau$ inside the region, so $m(p^D) \le m(p^C)$; in particular, when the whole path is interior, $p^D = p^B$ and again $m(p^D) \le m(p^C)$. The path lies in $\operatorname{span}[g, B^{-1}g]$, so $p^D$ is feasible for the two-dimensional subspace problem and $m(p^T) \le m(p^D)$. Finally, $p^T$ is feasible for (4.5), so $m(p^*) \le m(p^T)$.
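A numerical spot-check of the chain for the exact, dogleg, and Cauchy points (the subspace point $p^T$ sits between $m(p^*)$ and $m(p^D)$ by the same feasibility argument); a sketch assuming NumPy, with $p^*$ obtained by bisection on the multiplier:

```python
import numpy as np

def m(p, g, B):
    # model value with the constant f omitted
    return g @ p + 0.5 * p @ B @ p

def exact(g, B, delta):
    # p*: full step if interior, else bisection on ||(B + lam I)^{-1} g|| = delta
    n = len(g)
    p = np.linalg.solve(B, -g)
    if np.linalg.norm(p) <= delta:
        return p
    lo, hi = 0.0, 1.0
    while np.linalg.norm(np.linalg.solve(B + hi * np.eye(n), -g)) > delta:
        hi *= 2
    for _ in range(200):
        lam = 0.5 * (lo + hi)
        if np.linalg.norm(np.linalg.solve(B + lam * np.eye(n), -g)) > delta:
            lo = lam
        else:
            hi = lam
    return np.linalg.solve(B + hi * np.eye(n), -g)

def cauchy(g, B, delta):
    # p^C: minimizer of the model along -g inside the ball
    tau = min(1.0, np.linalg.norm(g)**3 / (delta * (g @ B @ g)))
    return -tau * delta / np.linalg.norm(g) * g

def dogleg(g, B, delta):
    # p^D: farthest point of the dogleg path inside the ball
    pB = np.linalg.solve(B, -g)
    if np.linalg.norm(pB) <= delta:
        return pB
    pU = -(g @ g) / (g @ B @ g) * g
    if np.linalg.norm(pU) >= delta:
        return delta * pU / np.linalg.norm(pU)
    d = pB - pU
    a, b, c = d @ d, 2 * pU @ d, pU @ pU - delta**2
    s = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return pU + s * d

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 6))
B = A @ A.T + np.eye(6)
g = rng.standard_normal(6)
delta = 0.2
ms = [m(step(g, B, delta), g, B) for step in (exact, dogleg, cauchy)]
print(ms[0] <= ms[1] + 1e-9 and ms[1] <= ms[2] + 1e-9)   # True
```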
Problem 8. The symmetric matrix $B$ has eigenvalues $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n$ with corresponding orthonormal eigenvectors $q_i$, $i = 1, 2, \ldots, n$. When $q_1^T g = 0$ (the hard case in the textbook), consider
$$p(\tau) = -\sum_{i \ne 1} \frac{q_i^T g}{\lambda_i - \lambda_1} q_i + \tau q_1.$$
Then
$$\|p(\tau)\|^2 = \left\| -\sum_{i \ne 1} \frac{q_i^T g}{\lambda_i - \lambda_1} q_i + \tau q_1 \right\|^2 = \sum_{i \ne 1} \frac{(q_i^T g)^2}{(\lambda_i - \lambda_1)^2} + \tau^2.$$
Since
$$\sum_{i \ne 1} \frac{(q_i^T g)^2}{(\lambda_i - \lambda_1)^2} \le \Delta^2,$$