
Computational Methods of Optimization

First Midterm (22nd Sep, 2019)

Instructions:
• This is a closed book test. Please do not consult any additional material.
• Attempt all questions

• Total time is 90 mins.


• Answer the questions in the spaces provided. Answers outside the spaces provided
will not be graded.
• Rough work can be done in the spaces provided at the end of the booklet

Name:

SRNO: Degree: Dept:

Question: 1 2 3 4 5 6 Total
Points: 10 10 5 10 10 5 50
Score:

In the following, assume that f is a C^1 function defined from R^d → R unless otherwise mentioned. Also
x = [x_1, x_2, ..., x_d]^T ∈ R^d and ‖x‖ = √(x^T x). The set of real symmetric d × d matrices will be denoted by S^d.
[n] will denote the set {1, 2, ..., n}.

1. (10 points) Please indicate True(T) or False(F) in the space given after each question. All questions carry
equal marks
(a) Let a < b where a, b ∈ R, and let h : [a, b] → R be differentiable and satisfy h(a) = h(b). Then h has
a critical point in (a, b). Recall that a critical point is a point x such that h′(x) = 0. T
(b) In the previous question let h(x) = α_1 x² + α_2 x + α_3. The values of α_1, α_2, α_3 are not given, but it
is given that h(a) = h(b) = 0 and that h(x) > 0 for all x ∈ (a, b). The function h is
convex. F
(c) If f is a coercive function bounded from below, then the global minimum must lie at one of the
critical points. T
(d) Consider g : R → R, g(u) = (1/2)u² − (1/3)u³. The function has a global minimum. F
(e) The point u = 0 is a local maximum of g, defined in the previous question. F
(f) If all critical points of f are global minima then the function must be convex. F
(g) Suppose the Hessian of f is positive definite everywhere. Then the set of critical points of f can have
cardinality three. F
 
(h) Let f : R² → R. The Hessian at a critical point x is H(x) = [1 2; 2 1].
(i) The set {(x, t) : ‖x‖ ≤ t} is convex. T
(j) The set {x | Ax = b} is not a convex set. F
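As a quick sanity check on items (d) and (e), one can evaluate g(u) = (1/2)u² − (1/3)u³ numerically. This sketch (not part of the exam) confirms that g is unbounded below, so it has no global minimum, and that u = 0 is in fact a local minimum rather than a maximum:

```python
# Sanity check for items (d) and (e): g(u) = u^2/2 - u^3/3
# has critical points at u = 0 and u = 1.
def g(u):
    return 0.5 * u**2 - u**3 / 3.0

# (d) the cubic term dominates, so g is unbounded below: no global minimum.
assert g(10.0) < g(0.0) and g(100.0) < g(10.0)

# (e) u = 0 is a local MINIMUM: g is positive on both sides near 0.
eps = 1e-3
assert g(eps) > g(0.0) and g(-eps) > g(0.0)
```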
2. Consider minimization of the function f : R³ → R defined as follows:

       f(x) = (1/2) x^T A x − b^T x,

   where

       A = [  1   a   b
             −a   2   c
             −b  −c   z ]

   and b = [1, 2, 1]^T. The values of a, b, c ∈ R are not known.

   (a) (4 points) From this information, is it possible to determine the gradient and Hessian of f at x = [1, 1, 1]^T, assuming that z = 1? If not, what minimal additional information is required to evaluate the gradient and Hessian?

Solution: Yes, it is possible. The off-diagonal entries of A are skew-symmetric, so they cancel in the symmetric part:

    ∇f(x) = (1/2)(A + A^T)x − b,    H(x) = (1/2)(A + A^T) = Diag(1, 2, z),

where Diag(v) is a diagonal matrix with i-th diagonal entry v_i. Hence neither quantity depends on a, b, c, and at x = [1, 1, 1]^T with z = 1,

    ∇f(x) = [1, 2, 1]^T − b = 0.
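The claim that the gradient at [1, 1, 1]^T is independent of a, b, c can be verified numerically. The following sketch (the name `grad_f` and the sample coefficient values are illustrative, not from the exam) builds the symmetric part (1/2)(A + A^T) explicitly:

```python
# With A's skew-symmetric off-diagonal pattern, (1/2)(A + A^T) = Diag(1, 2, z),
# so the gradient at x = [1,1,1] does not depend on a, b, c.
def grad_f(x, a, b_, c, z, b=(1.0, 2.0, 1.0)):
    # A as in the question; b_ avoids clashing with the vector b
    A = [[1.0, a, b_], [-a, 2.0, c], [-b_, -c, z]]
    S = [[0.5 * (A[i][j] + A[j][i]) for j in range(3)] for i in range(3)]
    return [sum(S[i][j] * x[j] for j in range(3)) - b[i] for i in range(3)]

x = [1.0, 1.0, 1.0]
g1 = grad_f(x, a=5.0, b_=-2.0, c=7.0, z=1.0)
g2 = grad_f(x, a=0.0, b_=0.0, c=0.0, z=1.0)
assert g1 == g2 == [0.0, 0.0, 0.0]   # gradient vanishes at [1,1,1] when z = 1
```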

(b) (3 points) Determine whether the global minimum exists for z = 1. If not, give reasons; if yes, compute it.

Solution: For z = 1 the function is convex, as the Hessian Diag(1, 2, 1) is positive definite, and the global minimum is attained at x = H^{-1} b = [1, 1, 1]^T.

(c) (3 points) Repeat the above two questions for z = −1, and also determine whether a global maximum exists.

Solution: For z = −1 the Hessian Diag(1, 2, −1) is indefinite, so f is unbounded above and below; neither a global minimum nor a global maximum is attained.

3. (5 points) If f is a convex function defined over C ⊂ R^d, then for any β ∈ Δ_n, where Δ_n = {γ ∈ R^n | Σ_{i=1}^n γ_i = 1, γ_i ≥ 0, i ∈ [n]},

       f( Σ_{i=1}^n β_i x_i ) ≤ Σ_{i=1}^n β_i f(x_i)

   holds for any x_1, ..., x_n ∈ C. If f is strictly convex, then equality is attained only when x_i = x for all i ∈ [n]. Show that for any x_i > 0, i ∈ [n], x_i ∈ R and γ ∈ Δ_n,

       Π_{i=1}^n x_i^{γ_i} ≤ Σ_{i=1}^n γ_i x_i.

Solution: The function f(z) = −log(z) is strictly convex over z > 0, since its second derivative is 1/z² > 0. Thus, by the inequality above,

    −Σ_{i=1}^n γ_i log x_i ≥ −log( Σ_{i=1}^n γ_i x_i ),

i.e. log( Π_{i=1}^n x_i^{γ_i} ) ≤ log( Σ_{i=1}^n γ_i x_i ). The proof follows by noting that log is an increasing function.
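The weighted AM-GM inequality proved above can be spot-checked numerically. This illustrative sketch draws random positive points and random weights normalized onto the simplex (the sampling ranges are arbitrary choices):

```python
# Numeric spot-check of weighted AM-GM:
# prod(x_i^g_i) <= sum(g_i * x_i) for positive x and weights g in the simplex.
import math
import random

random.seed(0)
for _ in range(100):
    n = random.randint(2, 6)
    x = [random.uniform(0.1, 10.0) for _ in range(n)]
    w = [random.random() for _ in range(n)]
    s = sum(w)
    g = [wi / s for wi in w]          # normalize weights onto the simplex
    geo = math.prod(xi**gi for xi, gi in zip(x, g))
    arith = sum(gi * xi for gi, xi in zip(x, g))
    assert geo <= arith + 1e-12
```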

4. (10 points) Consider designing the tube for holding shuttlecocks. It is essentially a cylinder of radius r
   cm and height h cm. The cost of painting the top and bottom of the cylinder is c Rs/cm² and the cost
   of painting the sides is d Rs/cm². How will you design the cylinder, that is, choose r and h, such that the
   cost of painting it is minimum, while it must hold 6 such shuttlecocks? In other words, solve the following
   problem:

       min_{r,h}  2πr²c + πrhd    subject to  πr²h = V,

   where V is the volume of space needed to hold the shuttlecocks. Use the previous question to solve the
   optimization problem.

Solution: Let γ_1 z_1 = 2πr²c and γ_2 z_2 = πrhd, where γ_1 + γ_2 = 1 and γ_1, γ_2 ≥ 0. By the previous question (weighted AM-GM), γ_1 z_1 + γ_2 z_2 ≥ z_1^{γ_1} z_2^{γ_2}, and we choose the weights so that

    z_1^{γ_1} z_2^{γ_2} = A (πr²h)^a

for some constants A and a. By equating the exponents of r and h one obtains 2γ_1 + γ_2 = 2a and γ_2 = a. Thus γ_1 = a/2 and γ_2 = a. Since they should sum to 1, the value of a = 2/3, i.e. γ_1 = 1/3 and γ_2 = 2/3. Using the previous question we deduce that

    γ_1 z_1 + γ_2 z_2 ≥ A V^{2/3},

and the minimum is attained for a choice of r, h such that

    z_1 = z_2 = A V^{2/3}.                                   (1)

Plugging in the weights we obtain z_1 = 6πr²c and z_2 = (3/2)πrhd, and the optimum cost

    z_1^{γ_1} z_2^{γ_2} = 3((π/2) c d²)^{1/3} (πr²h)^{2/3} = 3((π/2) c d²)^{1/3} V^{2/3}.

Due to (1), the minimum is achieved at r³ = dV/(4πc) and h³ = 16c²V/(πd²).
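The closed-form optimum can be sanity-checked numerically. In this sketch the cost rates c, d and volume V are arbitrary illustrative values, not from the exam:

```python
# Check that r^3 = dV/(4*pi*c), h^3 = 16*c^2*V/(pi*d^2)
# (i) satisfies the volume constraint and (ii) beats nearby feasible designs.
import math

c, d, V = 2.0, 3.0, 500.0                     # illustrative cost rates and volume
cost = lambda r, h: 2 * math.pi * r**2 * c + math.pi * r * h * d

r_opt = (d * V / (4 * math.pi * c)) ** (1 / 3)
h_opt = (16 * c**2 * V / (math.pi * d**2)) ** (1 / 3)

assert abs(math.pi * r_opt**2 * h_opt - V) < 1e-9 * V   # constraint holds

# compare against other feasible (r, h) pairs with pi r^2 h = V
for r in [0.5 * r_opt, 0.9 * r_opt, 1.1 * r_opt, 2 * r_opt]:
    h = V / (math.pi * r**2)
    assert cost(r_opt, h_opt) <= cost(r, h) + 1e-9
```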

5. (10 points) The case of sloppy stepsize: Consider minimizing the function

       f(x) = (1/2) x^T Q x − b^T x

   over x ∈ R^d, with Q ∈ S^d_+ and b ∈ R^d, using the steepest descent iterates

       x^{(k+1)} = x^{(k)} − α_k ∇f(x^{(k)}),

   executed with a sloppy step-size: each step-size α_k can be any element in {α : |α/ᾱ − 1| ≤ δ}. The parameter
   ᾱ is defined as the stepsize obtained through exact line search. How many iterations will it need to find a point x̄ such that

       f(x̄) − f* ≤ ε,

   starting from an arbitrary point x^{(0)}? Assume that you have chosen a stepsize which is farthest from ᾱ.

Solution: Consider the error function E(x) = f(x) − f(x*) = (1/2)(x − x*)^T Q (x − x*). Recall that
E(x) = (1/2) ∇f(x)^T Q^{-1} ∇f(x). We will use g_k to denote ∇f(x^{(k)}). From the Taylor expansion one obtains
that for any x = x^{(k)} − α g_k,

    E(x) = E(x^{(k)}) − α ‖g_k‖² + (1/2) α² g_k^T Q g_k.

Defining ᾱ = ‖g_k‖² / (g_k^T Q g_k) and completing squares leads to

    E(x) = E(x^{(k)}) − (1/2) ᾱ² g_k^T Q g_k + (1/2) g_k^T Q g_k (α − ᾱ)².

The decrease after each iteration is given by

    E(x^{(k)}) − E(x^{(k+1)}) = (1/2) g_k^T Q g_k ᾱ² − (1/2) g_k^T Q g_k (α − ᾱ)²
                              ≥ (1/2) g_k^T Q g_k ᾱ² (1 − δ²)
                              = (1/2) ((‖g_k‖²)² / (g_k^T Q g_k)) (1 − δ²),

so that

    (E(x^{(k)}) − E(x^{(k+1)})) / E(x^{(k)}) ≥ ((‖g_k‖²)² / ((g_k^T Q g_k)(g_k^T Q^{-1} g_k))) (1 − δ²).

Applying the Kantorovich inequality we obtain

    (E(x^{(k)}) − E(x^{(k+1)})) / E(x^{(k)}) ≥ (4 λ_1 λ_d / (λ_1 + λ_d)²) (1 − δ²),

where λ_1 and λ_d are the largest and smallest eigenvalues of Q. Hence E(x^{(k+1)}) / E(x^{(k)}) ≤ r, where
r = 1 − (4 λ_1 λ_d / (λ_1 + λ_d)²)(1 − δ²). From the above,

    E(x^{(k)}) ≤ r^k E(x^{(0)}) ≤ ε

whenever k ≥ log(E(x^{(0)})/ε) / log(1/r).

Check that r = r_0 + (1 − r_0) δ², where r_0 = (λ_1 − λ_d)² / (λ_1 + λ_d)² is the rate obtained by using exact steepest descent.
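The per-iteration contraction E(x^{(k+1)}) ≤ r E(x^{(k)}) can be checked by simulation. This sketch uses a diagonal Q (an illustrative 2-D instance, not from the exam) and always takes the farthest admissible stepsize α = (1 + δ)ᾱ:

```python
# Sloppy steepest descent on a 2-D quadratic with diagonal Q = Diag(lam),
# always taking the farthest admissible stepsize alpha = (1 + delta) * abar,
# and checking the per-iteration bound E(x_{k+1}) <= r * E(x_k).
lam = [1.0, 10.0]                  # eigenvalues of Q (lam_d = 1, lam_1 = 10)
b = [1.0, -2.0]
x_star = [b[i] / lam[i] for i in range(2)]   # minimizer: Q x* = b
delta = 0.3

def E(x):                          # error E(x) = f(x) - f(x*)
    return sum(0.5 * lam[i] * (x[i] - x_star[i])**2 for i in range(2))

r = 1 - (4 * lam[0] * lam[1] / (lam[0] + lam[1])**2) * (1 - delta**2)

x = [5.0, -7.0]
for _ in range(50):
    g = [lam[i] * x[i] - b[i] for i in range(2)]            # gradient Qx - b
    abar = sum(gi * gi for gi in g) / sum(lam[i] * g[i]**2 for i in range(2))
    alpha = (1 + delta) * abar     # farthest admissible sloppy stepsize
    x_new = [x[i] - alpha * g[i] for i in range(2)]
    assert E(x_new) <= r * E(x) + 1e-9                      # contraction holds
    x = x_new
```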

6. (5 points) Consider f as defined in Question 5. Show that the Goldstein condition on the stepsize can be
   written as

       α ∈ {α : |α/ᾱ − 1| ≤ δ}.

   Identify ᾱ and δ. Assume that the iterates are of the form

       x^{(k+1)} = x^{(k)} + α_k u,

   where u is a descent direction at x^{(k)}. Briefly comment on the relative merit/demerit of the Wolfe condition
   over the Goldstein condition in the context of this problem.

Solution: Introduce the following function:

    g(α) = f(x^{(k)} + αu) = g(0) + α g′(0) + (1/2) α² g″(0),

where g′(0) = g_k^T u and g″(0) = u^T Q u; the expansion is exact because f is quadratic. Define
ᾱ = −g′(0)/g″(0), the minimizer of g(α); then

    g(α) = g(0) − (1/2) ᾱ² g″(0) + (1/2) g″(0)(α − ᾱ)².

For any 0 ≤ m ≤ 1/2, the Goldstein condition can be stated as

    g(0) − g″(0)(1 − m) ᾱ α ≤ g(α) ≤ g(0) − g″(0) m α ᾱ,

where we have used the definition of ᾱ (so that α g′(0) = −g″(0) ᾱ α). This simplifies to

    (1 − m) α ᾱ ≥ (1/2) ᾱ² − (1/2)(α − ᾱ)² ≥ m α ᾱ
    (1 − m) α ᾱ ≥ −(1/2) α² + ᾱ α ≥ m α ᾱ
    1 − m ≥ −α/(2ᾱ) + 1 ≥ m
    2 − 2m ≥ −α/ᾱ + 2 ≥ 2m
    δ ≥ 1 − α/ᾱ ≥ −δ,

where we have used 1 − 2m = δ, and the proof follows.
As the optimum stepsize ᾱ is already contained in the set, there is little merit in using the Wolfe condition
over Goldstein in the context of this problem.

Rough Sheet 1

Rough Sheet 2

Rough Sheet 3

Rough Sheet 4
