Instructions:
• This is a closed book test. Please do not consult any additional material.
• Attempt all questions
Name:
Question: 1 2 3 4 5 6 Total
Points: 10 10 5 10 10 5 50
Score:
In the following, assume that $f$ is a $C^1$ function defined from $\mathbb{R}^d \to \mathbb{R}$ unless otherwise mentioned. Also $x = [x_1, x_2, \ldots, x_d]^\top \in \mathbb{R}^d$ and $\|x\| = \sqrt{x^\top x}$. The set of real symmetric $d \times d$ matrices will be denoted by $S^d$. $[n]$ will denote the set $\{1, 2, \ldots, n\}$.
1. (10 points) Please indicate True (T) or False (F) in the space given after each question. All questions carry equal marks.
(a) Let $a < b$ where $a, b \in \mathbb{R}$, and let $h : [a, b] \to \mathbb{R}$ be differentiable and satisfy $h(a) = h(b)$. Then $h$ has a critical point in $(a, b)$. Recall that a critical point is a point $x$ such that $\nabla f(x) = 0$. T
(b) In the previous question let $h(x) = \alpha_1 x^2 + \alpha_2 x + \alpha_3$. The values of $\alpha_1, \alpha_2, \alpha_3$ are not given, but it is given that $h(a) = h(b) = 0$ and that $h(x) > 0$ for all $x \in (a, b)$. The function $h$ is convex. F
(c) If $f$ is a coercive function bounded from below then the global minimum must lie at one of the critical points. T
(d) Consider $g : \mathbb{R} \to \mathbb{R}$, $g(u) = \frac{1}{2} u^2 - \frac{1}{3} u^3$. The function has a global minimum. F
(e) The point $u = 0$ is a local maximum of $g$, defined in the previous question. F
(f) If all critical points of $f$ are global minima then the function must be convex. F
(g) The Hessian of $f$ is positive definite everywhere. The cardinality of the set of critical points of $f$ is three. F
(h) Let $f : \mathbb{R}^2 \to \mathbb{R}$. The Hessian at a critical point $x$ is $H(x) = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$.
(i) The set $\{(x, t) \mid \|x\| \le t\}$ is convex. T
(j) The set $\{x \mid Ax = b\}$ is not a convex set. F
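Items (d), (e) and (h) can be checked numerically. The sketch below is only a sanity check, not part of the original answer key; the sample points and helper names are chosen here for illustration.

```python
def g(u):
    """g(u) = u^2/2 - u^3/3 from item (d)."""
    return 0.5 * u**2 - u**3 / 3.0

def g_second(u):
    """Second derivative g''(u) = 1 - 2u."""
    return 1.0 - 2.0 * u

# g keeps decreasing along u = 10, 100, 1000, 10000, so it has no global minimum.
values = [g(10.0**k) for k in range(1, 5)]
unbounded_below = all(values[i + 1] < values[i] < 0 for i in range(len(values) - 1))

# g'(0) = 0 and g''(0) = 1 > 0, so u = 0 is a strict local minimum, not a maximum.
curvature_at_zero = g_second(0.0)

# Item (h): eigenvalues of [[1, 2], [2, 1]] from its trace and determinant.
trace, det = 1.0 + 1.0, 1.0 * 1.0 - 2.0 * 2.0
disc = (trace**2 - 4.0 * det) ** 0.5
eigs = ((trace - disc) / 2.0, (trace + disc) / 2.0)
```

The two eigenvalues in `eigs` have opposite signs, so the matrix in item (h) is indefinite.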
2. Consider minimization of the function $f : \mathbb{R}^3 \to \mathbb{R}$ defined as follows
$$f(x) = \frac{1}{2} x^\top A x - b^\top x$$
where $A = \begin{bmatrix} 1 & a & b \\ -a & 2 & c \\ -b & -c & z \end{bmatrix}$ and $b = [1, 2, 1]^\top$. The values of $a, b, c \in \mathbb{R}$ are not known.
(a) (4 points) From this information, is it possible to determine the gradient and Hessian of $f$ at $x = [1, 1, 1]^\top$, assuming that $z = 1$? If not, what minimal additional information is required to evaluate the gradient and Hessian?
(b) (3 points) Does the global minimum exist for $z = 1$? If not, give reasons. If yes, compute it.
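Solution: The off-diagonal entries of $A$ are skew-symmetric and cancel in the quadratic form, so $x^\top A x = x_1^2 + 2x_2^2 + z x_3^2$ and the Hessian is $\mathrm{diag}(1, 2, z)$ regardless of $a, b, c$. For $z = 1$ the function is convex as the Hessian is positive definite, and the global minimum is attained at $x = [1, 1, 1]^\top$.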
(c) (3 points) Repeat the above two questions for $z = -1$, if we want to determine the global maximum.
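A small numerical sketch of Question 2, assuming the Hessian of $\frac{1}{2} x^\top A x - b^\top x$ is the symmetric part $(A + A^\top)/2$. The values plugged in for the unknown scalars are arbitrary, and the name `b_s` for the scalar $b$ is introduced here only to avoid clashing with the vector $b$.

```python
def hessian(A):
    """Hessian of 0.5 x^T A x - b^T x: the symmetric part (A + A^T) / 2."""
    n = len(A)
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]

def gradient(A, b_vec, x):
    """Gradient H x - b, with H the symmetric part of A."""
    H = hessian(A)
    n = len(x)
    return [sum(H[i][j] * x[j] for j in range(n)) - b_vec[i] for i in range(n)]

def make_A(a, b_s, c, z):
    """Matrix A of Question 2; b_s stands for the unknown scalar b."""
    return [[1.0, a, b_s], [-a, 2.0, c], [-b_s, -c, z]]

b_vec = [1.0, 2.0, 1.0]
x = [1.0, 1.0, 1.0]

# Two arbitrary choices of (a, b, c) with z = 1 give the same Hessian.
H1 = hessian(make_A(0.3, -1.7, 4.0, 1.0))
H2 = hessian(make_A(-5.0, 2.5, 0.0, 1.0))
grad1 = gradient(make_A(0.3, -1.7, 4.0, 1.0), b_vec, x)

# With z = -1 the last diagonal entry of the Hessian changes sign.
H_neg = hessian(make_A(0.3, -1.7, 4.0, -1.0))
```

Here `H1` and `H2` coincide and `grad1` vanishes at $[1, 1, 1]^\top$, while `H_neg` has diagonal entries of both signs.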
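3. (5 points) If $f$ is a convex function defined over $C \subset \mathbb{R}^d$, then for any $\beta \in \Delta_n$, where $\Delta_n = \{\gamma \in \mathbb{R}^n \mid \sum_{i=1}^n \gamma_i = 1,\ \gamma_i \ge 0,\ i \in [n]\}$,
$$f\Big(\sum_{i=1}^n \beta_i x_i\Big) \le \sum_{i=1}^n \beta_i f(x_i)$$
holds for any $x_1, \ldots, x_n \in C$. If $f$ is strictly convex then equality is attained only when $x_i = x$ for all $i \in [n]$. Show that for any $x_i > 0$, $i \in [n]$, $x_i \in \mathbb{R}$ and $\gamma \in \Delta_n$
$$\prod_{i=1}^n x_i^{\gamma_i} \le \sum_{i=1}^n \gamma_i x_i$$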
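Solution: The function $f(z) = -\log(z)$ is strictly convex over $z > 0$, since its second derivative is $\frac{1}{z^2} > 0$. Thus
$$-\sum_{i=1}^n \gamma_i \log x_i \ge -\log\Big(\sum_{i=1}^n \gamma_i x_i\Big)$$
Multiplying by $-1$ and exponentiating both sides gives $\prod_{i=1}^n x_i^{\gamma_i} \le \sum_{i=1}^n \gamma_i x_i$.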
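The weighted inequality between the geometric and arithmetic means can be spot-checked on random data. This is a sketch only; the sampling ranges, seed, and helper names are assumptions made for illustration.

```python
import math
import random

def weighted_geometric_mean(x, gamma):
    """prod_i x_i^gamma_i, computed through logarithms."""
    return math.exp(sum(g * math.log(v) for g, v in zip(gamma, x)))

def weighted_arithmetic_mean(x, gamma):
    """sum_i gamma_i x_i."""
    return sum(g * v for g, v in zip(gamma, x))

def random_simplex_point(n, rng):
    """Nonnegative weights that sum to one."""
    w = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(w)
    return [v / total for v in w]

rng = random.Random(0)
gaps = []
for _ in range(1000):
    n = rng.randint(2, 6)
    x = [rng.uniform(0.1, 10.0) for _ in range(n)]
    gamma = random_simplex_point(n, rng)
    gaps.append(weighted_arithmetic_mean(x, gamma) - weighted_geometric_mean(x, gamma))

smallest_gap = min(gaps)

# Equality case: all x_i equal.
same = [2.0, 2.0, 2.0]
weights = [0.2, 0.3, 0.5]
equal_case_gap = weighted_arithmetic_mean(same, weights) - weighted_geometric_mean(same, weights)
```

Up to rounding, `smallest_gap` stays nonnegative and `equal_case_gap` is zero, matching the equality condition for strictly convex $f$.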
4. (10 points) Consider designing the tube for holding shuttlecocks. It is essentially a cylinder of radius $r$ cm and height $h$ cm. The cost of painting the top and bottom of the cylinder is $c$ Rs/cm$^2$ and the cost of painting the sides is $d$ Rs/cm$^2$. How will you design the cylinder, that is, choose $r$ and $h$, such that the cost of painting it is minimum while it still holds 6 such shuttlecocks? In other words, solve the following problem
$$\min_{r,h}\ 2\pi r^2 c + \pi r h d \quad \text{subject to} \quad \pi r^2 h = V$$
where $V$ is the volume of space needed to hold the shuttlecocks. Use the previous question to solve the optimization problem.
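Solution: Write the cost as $\frac{1}{3} z_1 + \frac{2}{3} z_2$ and apply the inequality of the previous question with $\gamma = (\frac{1}{3}, \frac{2}{3})$. Plugging all these we obtain $z_1 = 6\pi r^2 c$ and $z_2 = \frac{3}{2}\pi r h d$, and the optimum cost
$$z_1^{1/3} z_2^{2/3} = 3\Big(\frac{\pi}{2} c d^2\Big)^{1/3} (\pi r^2 h)^{2/3} = 3\Big(\frac{\pi}{2} c d^2\Big)^{1/3} V^{2/3}$$
Due to (1) the minimum is achieved when $z_1 = z_2$, that is, at $r^3 = \frac{dV}{4\pi c}$ and $h^3 = \frac{16 c^2 V}{\pi d^2}$.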
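The closed-form design can be checked against the bound and against a few other feasible designs. The values of $c$, $d$ and $V$ below are hypothetical and chosen only for this sketch.

```python
import math

def cost(r, h, c, d):
    """Painting cost 2*pi*r^2*c + pi*r*h*d as stated in the problem."""
    return 2.0 * math.pi * r**2 * c + math.pi * r * h * d

def optimal_design(c, d, V):
    """Radius and height from r^3 = dV/(4 pi c) and h^3 = 16 c^2 V/(pi d^2)."""
    r = (d * V / (4.0 * math.pi * c)) ** (1.0 / 3.0)
    h = (16.0 * c**2 * V / (math.pi * d**2)) ** (1.0 / 3.0)
    return r, h

def lower_bound(c, d, V):
    """The bound 3 (pi c d^2 / 2)^(1/3) V^(2/3) from the weighted AM-GM step."""
    return 3.0 * (math.pi * c * d**2 / 2.0) ** (1.0 / 3.0) * V ** (2.0 / 3.0)

c, d, V = 2.0, 3.0, 500.0  # hypothetical prices (Rs/cm^2) and volume (cm^3)
r_opt, h_opt = optimal_design(c, d, V)
best = cost(r_opt, h_opt, c, d)
volume = math.pi * r_opt**2 * h_opt

# Other feasible designs: pick r, then set h so that the volume constraint holds.
others = [cost(r, V / (math.pi * r**2), c, d)
          for r in (0.5 * r_opt, 0.9 * r_opt, 1.1 * r_opt, 2.0 * r_opt)]
```

The optimal design satisfies the volume constraint, its cost matches `lower_bound(c, d, V)`, and every perturbed feasible design in `others` is more expensive.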
5. (10 points) The case of sloppy stepsize: Consider minimizing the function
$$f(x) = \frac{1}{2} x^\top Q x - b^\top x$$
over $x \in \mathbb{R}^d$ with $Q \in S^d_+$, $b \in \mathbb{R}^d$ using the steepest descent iterates,
Solution: Consider the error function $E(x) = f(x) - f(x^*) = \frac{1}{2}(x - x^*)^\top Q (x - x^*)$. Recall that $E(x) = \frac{1}{2}\nabla f(x)^\top Q^{-1} \nabla f(x)$. We will use $g_k$ to denote $\nabla f(x^{(k)})$. From Taylor expansion one obtains that for any $x = x^{(k)} - \alpha g_k$
$$E(x) = E(x^{(k)}) - \alpha \|g_k\|^2 + \frac{1}{2}\alpha^2 g_k^\top Q g_k$$
Define $\bar\alpha = \frac{\|g_k\|^2}{g_k^\top Q g_k}$; completing squares leads to
$$E(x) = E(x^{(k)}) - \frac{1}{2}\bar\alpha^2 g_k^\top Q g_k + \frac{1}{2} g_k^\top Q g_k (\alpha - \bar\alpha)^2$$
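The decrease after each iteration is given by
$$E(x^{(k)}) - E(x^{(k+1)}) = \frac{1}{2} g_k^\top Q g_k \bar\alpha^2 - \frac{1}{2} g_k^\top Q g_k (\alpha - \bar\alpha)^2 \ge \frac{1}{2} g_k^\top Q g_k \bar\alpha^2 (1 - \delta^2) = \frac{1}{2}(1 - \delta^2)\frac{(\|g_k\|^2)^2}{g_k^\top Q g_k}$$
where the inequality holds whenever $|\alpha - \bar\alpha| \le \delta \bar\alpha$.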
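Hence, by the Kantorovich inequality, $\frac{E(x^{(k+1)})}{E(x^{(k)})} \le r$ where $r = 1 - \frac{4\lambda_1 \lambda_d}{(\lambda_1 + \lambda_d)^2}(1 - \delta^2)$ and $\lambda_1, \lambda_d$ are the extreme eigenvalues of $Q$. From the above, $E(x^{(k)}) \le r^k E(x^{(0)}) \le \epsilon$ whenever $k \ge \frac{1}{\log \frac{1}{r}} \log \frac{E(x^{(0)})}{\epsilon}$.
Check that $r = r_0 + (1 - r_0)\delta^2$, where $r_0 = \frac{(\lambda_1 - \lambda_d)^2}{(\lambda_1 + \lambda_d)^2}$ is the rate obtained by using steepest descent.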
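The rate bound can be observed on a small example. The sketch below assumes the sloppy stepsize is any $\alpha$ with $|\alpha/\bar\alpha - 1| \le \delta$; the diagonal matrix, the starting point, the random perturbation of the stepsize, and all names are illustrative choices, not part of the original problem.

```python
import random

def matvec(Q, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(q * x for q, x in zip(row, v)) for row in Q]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def error(Q, x, x_star):
    """E(x) = 0.5 (x - x*)^T Q (x - x*)."""
    e = [a - s for a, s in zip(x, x_star)]
    return 0.5 * dot(e, matvec(Q, e))

def sloppy_descent(Q, b, x0, x_star, delta, steps, rng):
    """Gradient steps with alpha drawn from [(1 - delta), (1 + delta)] * alpha_bar."""
    x, ratios = list(x0), []
    for _ in range(steps):
        g = [a - c for a, c in zip(matvec(Q, x), b)]
        gQg = dot(g, matvec(Q, g))
        if gQg == 0.0:  # gradient vanished: already at the minimizer
            break
        alpha_bar = dot(g, g) / gQg
        alpha = alpha_bar * (1.0 + delta * rng.uniform(-1.0, 1.0))
        e_old = error(Q, x, x_star)
        x = [a - alpha * c for a, c in zip(x, g)]
        ratios.append(error(Q, x, x_star) / e_old)
    return x, ratios

Q = [[1.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 10.0]]  # eigenvalues 1, 4, 10
b = [1.0, 1.0, 1.0]
x_star = [1.0, 0.25, 0.1]  # solves Q x = b
lam_min, lam_max, delta = 1.0, 10.0, 0.5

r0 = ((lam_max - lam_min) / (lam_max + lam_min)) ** 2
r = 1.0 - 4.0 * lam_min * lam_max / (lam_max + lam_min) ** 2 * (1.0 - delta**2)
x_final, ratios = sloppy_descent(Q, b, [0.0, 0.0, 0.0], x_star, delta, 15, random.Random(1))
```

Each entry of `ratios` is one observed value of $E(x^{(k+1)})/E(x^{(k)})$ and should not exceed `r`; setting `delta = 0.0` recovers the exact line search with rate `r0`.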
6. (5 points) Consider $f$ as defined in Question 5. Show that the Goldstein condition on the stepsize can be written as
$$\alpha \in \Big\{ \alpha \;\Big|\; \Big| \frac{\alpha}{\bar\alpha} - 1 \Big| \le \delta \Big\}$$
Identify $\bar\alpha$ and $\delta$. Assume that the iterates are of the form
$$x^{(k+1)} = x^{(k)} + \alpha_k u$$
where $u$ is a descent direction at $x^{(k)}$. Briefly comment on the relative merit/demerit of the Wolfe condition over the Goldstein condition in the context of this problem.
$$g(\alpha) = g(0) - \frac{1}{2}\bar\alpha^2 g''(0) + \frac{1}{2} g''(0)(\alpha - \bar\alpha)^2$$
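One way to read the identity above, assuming $g(\alpha) = f(x^{(k)} + \alpha u)$ and the Goldstein condition in the common form $g(0) + (1 - c)\alpha g'(0) \le g(\alpha) \le g(0) + c\,\alpha g'(0)$ with a parameter $0 < c < \frac{1}{2}$ (the parameter $c$ is an assumption of this sketch, not given in the question): for a quadratic $g$ the condition reduces to $2c \le \alpha/\bar\alpha \le 2(1 - c)$, which suggests $\bar\alpha = -g'(0)/g''(0)$ and $\delta = 1 - 2c$. The check below compares the two forms on hypothetical numbers.

```python
def goldstein_holds(alpha, g0, dg0, d2g0, c):
    """Goldstein test for the quadratic g(a) = g0 + a*dg0 + 0.5*a^2*d2g0."""
    g_alpha = g0 + alpha * dg0 + 0.5 * alpha**2 * d2g0
    return g0 + (1.0 - c) * alpha * dg0 <= g_alpha <= g0 + c * alpha * dg0

def interval_holds(alpha, dg0, d2g0, c):
    """Equivalent test |alpha / alpha_bar - 1| <= delta with delta = 1 - 2c."""
    alpha_bar = -dg0 / d2g0
    delta = 1.0 - 2.0 * c
    return abs(alpha / alpha_bar - 1.0) <= delta

# Hypothetical values: g(0) = 3, g'(0) = -2 (descent), g''(0) = 4, c = 0.25.
g0, dg0, d2g0, c = 3.0, -2.0, 4.0, 0.25

# Grid of stepsizes that avoids the interval end points 0.25 and 0.75.
alphas = [(k + 0.5) / 40.0 for k in range(40)]
agree = all(goldstein_holds(a, g0, dg0, d2g0, c) == interval_holds(a, dg0, d2g0, c)
            for a in alphas)
```

With these numbers $\bar\alpha = 0.5$ and $\delta = 0.5$, so the accepted stepsizes form the interval $[0.25, 0.75]$.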