
PACIFIC JOURNAL OF MATHEMATICS

Vol. 16, No. 1, 1966


MINIMIZATION OF FUNCTIONS HAVING LIPSCHITZ
CONTINUOUS FIRST PARTIAL DERIVATIVES

LARRY ARMIJO

A general convergence theorem for the gradient method is proved under hypotheses which are given below. It is then shown that the usual steepest descent and modified steepest descent algorithms converge under the same hypotheses. The modified steepest descent algorithm allows for the possibility of variable stepsize.

For a comparison of our results with results previously obtained, the reader is referred to the discussion at the end of this paper.
1. Principal conditions. Let $f$ be a real-valued function defined and continuous everywhere on $E^n$ (real Euclidean $n$-space) and bounded below on $E^n$. For fixed $x_0 \in E^n$ define $S(x_0) = \{x : f(x) \le f(x_0)\}$. The function $f$ satisfies: Condition I if there exists a unique point $x^* \in E^n$ such that $f(x^*) = \inf_{x \in E^n} f(x)$; Condition II at $x_0$ if $f \in C^1$ on $S(x_0)$ and $\nabla f(x) = 0$ for $x \in S(x_0)$ if and only if $x = x^*$; Condition III at $x_0$ if $f \in C^1$ on $S(x_0)$ and $\nabla f$ is Lipschitz continuous on $S(x_0)$, i.e., there exists a Lipschitz constant $K > 0$ such that $|\nabla f(y) - \nabla f(x)| \le K\,|y - x|$ for every pair $x, y \in S(x_0)$; Condition IV at $x_0$ if $f \in C^1$ on $S(x_0)$ and if $r > 0$ implies that $m(r) > 0$, where $m(r) = \inf_{x \in S_r(x_0)} |\nabla f(x)|$, $S_r(x_0) = S_r \cap S(x_0)$, $S_r = \{x : |x - x^*| \ge r\}$, and $x^*$ is any point for which $f(x^*) = \inf_{x \in E^n} f(x)$. (If $S_r(x_0)$ is void, we define $m(r) = +\infty$.)
It follows immediately from the definitions of Conditions I through IV that Condition IV implies Conditions I and II, and that, if $S(x_0)$ is bounded, Condition IV is equivalent to Conditions I and II.
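As a concrete illustration (this example is ours, not the paper's), the quadratic $f(x) = \tfrac{1}{2}|x|^2$ satisfies all four conditions at any $x_0 \ne 0$:

```latex
% Worked example (ours): f(x) = \tfrac{1}{2}|x|^2 on E^n.
\begin{itemize}
\item Condition I:   $x^* = 0$ is the unique point with $f(x^*) = \inf f = 0$.
\item Condition II:  $\nabla f(x) = x$, which vanishes if and only if $x = x^* = 0$.
\item Condition III: $|\nabla f(y) - \nabla f(x)| = |y - x|$, so $K = 1$.
\item Condition IV:  $S(x_0) = \{x : |x| \le |x_0|\}$, so for $0 < r \le |x_0|$,
      $m(r) = \inf\{\,|\nabla f(x)| : r \le |x| \le |x_0|\,\} = r > 0$,
      while for $r > |x_0|$ the set $S_r(x_0)$ is void and $m(r) = +\infty$.
\end{itemize}
```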
2. The convergence theorem. In the convergence theorem and its corollaries, we will assume that $f$ is a real-valued function defined and continuous everywhere on $E^n$, bounded below on $E^n$, and that Conditions III and IV hold at $x_0$.
THEOREM. If $0 < \delta \le 1/(4K)$, then for any $\bar{x} \in S(x_0)$, the set

(1)  $S^*(\bar{x}, \delta) = \{\,x : x = \bar{x} - \alpha\nabla f(\bar{x}),\ \alpha > 0,\ f(\bar{x}) - f(x) \ge \delta\,|\nabla f(\bar{x})|^2\,\}$

is a nonempty subset of $S(x_0)$, and any sequence $\{x_k\}_{k=0}^\infty$ such that $x_{k+1} \in S^*(x_k, \delta)$, $k = 0, 1, 2, \cdots$, converges to the point $x^*$ which minimizes $f$.

Received January 30, 1964. The research for this paper was supported in part by General Dynamics/Astronautics, San Diego, California, Rice University, Houston, Texas, and the Martin Company, Denver, Colorado. The author is currently employed by the National Engineering Science Company, Houston, Texas.
Proof. If $\bar{x} \in S(x_0)$, $x = \bar{x} - \alpha\nabla f(\bar{x})$ and $0 \le \alpha \le 1/K$, Condition III and the mean value theorem imply the inequality $f(\bar{x}) - f(x) \ge (\alpha - \alpha^2 K)\,|\nabla f(\bar{x})|^2$, which in turn implies that $x \in S^*(\bar{x}, \delta)$ for

$\dfrac{1 - \sqrt{1 - 4\delta K}}{2K} \le \alpha \le \dfrac{1 + \sqrt{1 - 4\delta K}}{2K},$

so that $S^*(\bar{x}, \delta)$ is a nonempty subset of $S(x_0)$. If $\{x_k\}_{k=0}^\infty$ is any sequence for which $x_{k+1} \in S^*(x_k, \delta)$, $k = 0, 1, 2, \cdots$, then (1) implies that the sequence $\{f(x_k)\}_{k=0}^\infty$, which is bounded below, is monotone nonincreasing, and hence that $|\nabla f(x_k)| \to 0$ as $k \to \infty$. The remainder of the theorem follows from Condition IV.
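The inequality used at the start of the proof can be spot-checked numerically. The sketch below is ours, not the paper's; the quadratic test function and the constant $K = 4$ are illustrative assumptions.

```python
import numpy as np

# Spot-check (illustrative) of the descent inequality from the proof:
#   f(xb) - f(xb - alpha*grad_f(xb)) >= (alpha - alpha**2 * K) * |grad_f(xb)|**2
# for 0 <= alpha <= 1/K, on a quadratic whose gradient has Lipschitz constant K.

def f(x):
    return 0.5 * x[0] ** 2 + 2.0 * x[1] ** 2

def grad_f(x):
    return np.array([x[0], 4.0 * x[1]])

K = 4.0  # Lipschitz constant of grad_f for this quadratic
rng = np.random.default_rng(0)
for _ in range(1000):
    xb = rng.normal(size=2)
    alpha = rng.uniform(0.0, 1.0 / K)
    g = grad_f(xb)
    lhs = f(xb) - f(xb - alpha * g)
    rhs = (alpha - alpha ** 2 * K) * (g @ g)
    assert lhs >= rhs - 1e-12   # inequality holds at every sampled point
```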
COROLLARY 1. (The Steepest Descent Algorithm.) If

$x_{k+1} = x_k - \dfrac{1}{2K}\,\nabla f(x_k), \qquad k = 0, 1, 2, \cdots,$

then the sequence $\{x_k\}_{k=0}^\infty$ converges to the point $x^*$ which minimizes $f$.
Proof. It follows from the proof of the convergence theorem that the sequence $\{x_k\}_{k=0}^\infty$ defined in the statement of Corollary 1 is such that $x_{k+1} \in S^*(x_k, 1/(4K))$, $k = 0, 1, 2, \cdots$.
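A minimal sketch of Corollary 1 in code (ours, not the paper's; the quadratic, its matrix $A$, and the starting point are illustrative assumptions). For $f(x) = \tfrac{1}{2}x^{T}Ax$ the gradient $Ax$ is Lipschitz with constant $K$ equal to the largest eigenvalue of $A$, and the fixed step $1/(2K)$ drives the iterates toward $x^* = 0$:

```python
import numpy as np

# Fixed-stepsize steepest descent (Corollary 1) on an illustrative quadratic.
A = np.diag([1.0, 4.0])              # symmetric positive definite
K = np.linalg.eigvalsh(A).max()      # Lipschitz constant of the gradient

x = np.array([2.0, -1.0])            # x_0
for _ in range(200):
    x = x - (1.0 / (2.0 * K)) * (A @ x)   # x_{k+1} = x_k - (1/2K) grad f(x_k)

print(np.linalg.norm(x))             # distance to the unique minimizer x* = 0
```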
COROLLARY 2. (The Modified Steepest Descent Algorithm.) If $a$ is an arbitrarily assigned positive number, $a_m = a/2^{m-1}$, $m = 1, 2, \cdots$, and $x_{k+1} = x_k - a_{m_k}\nabla f(x_k)$, where $m_k$ is the smallest positive integer for which

(2)  $f(x_k - a_{m_k}\nabla f(x_k)) - f(x_k) \le -\tfrac{1}{2}\,a_{m_k}|\nabla f(x_k)|^2,$

$k = 0, 1, 2, \cdots$, then the sequence $\{x_k\}_{k=0}^\infty$ converges to the point $x^*$ which minimizes $f$.
Proof. It follows from the proof of the convergence theorem that if $\bar{x} \in S(x_0)$ and $x = \bar{x} - \alpha\nabla f(\bar{x})$, then $f(\bar{x}) - f(x) \ge \tfrac{1}{2}\alpha\,|\nabla f(\bar{x})|^2$ for $0 \le \alpha \le 1/(2K)$. If $a \le 1/(2K)$, then for the sequence $\{x_k\}_{k=0}^\infty$ in the statement of Corollary 2, $m_k = 1$ and $x_{k+1} \in S^*(x_k, \tfrac{1}{2}a)$, $k = 0, 1, 2, \cdots$. If $a > 1/(2K)$, then the integers $m_k$ exist and $a_{m_k} > 1/(4K)$, so that $x_{k+1} \in S^*(x_k, 1/(8K))$, $k = 0, 1, 2, \cdots$.
3. Discussion. The convergence theorem proves convergence under hypotheses which are more restrictive than those imposed by Curry [1] but less restrictive than those imposed by Goldstein [2]. However, both of the algorithms which we have considered would be considerably easier to apply than the algorithm proposed by Curry, since his algorithm requires the minimization of a function of one variable at each step. The method of Goldstein requires the assumption that $f \in C^2$ on $S(x_0)$ and that $S(x_0)$ be bounded. It also requires knowledge of a bound for the norm of the Hessian matrix of $f$ on $S(x_0)$, but yields an estimate for the ultimate rate of convergence of the gradient method. It should be pointed out that the modified steepest descent algorithm of Corollary 2 allows for the possibility of variable stepsize and does not require knowledge of the value of the Lipschitz constant $K$.

The author is indebted to the referee for his comments and suggestions.
REFERENCES

1. H. B. Curry, The method of steepest descent for nonlinear minimization problems, Quart. Appl. Math. 2 (1944), 258-263.
2. A. A. Goldstein, Cauchy's method of minimization, Numer. Math. 4 (1962), 146-150.
