
Optimization Techniques

Multi-variable Unconstrained Optimization:


Marquardt's Method

Dr. Nasir M Mirza


Email: nasirmm@yahoo.com

Optimization Methods
One-Dimensional Unconstrained Optimization
Golden-Section Search
Quadratic Interpolation
Newton's Method

Multi-Dimensional Unconstrained Optimization


Non-gradient or direct methods
Gradient methods

Summary of Newton's Method


One-dimensional optimization:
  At the optimum:    f'(x*) = 0
  Newton's method:   x_{i+1} = x_i - f'(x_i)/f''(x_i) = x_i - [f''(x_i)]^{-1} f'(x_i)

Multi-dimensional optimization:
  At the optimum:    ∇f(x*) = 0
  Newton's method:   x_{i+1} = x_i - H_i^{-1} ∇f(x_i)

where H_i is the Hessian matrix (the matrix of 2nd partial derivatives) of f evaluated at x_i.

Newton's Method
x_{i+1} = x_i - H_i^{-1} ∇f(x_i)

The method converges quadratically near the optimum.
It may diverge if the starting point is not close enough to the optimum.
Evaluating and inverting the Hessian H at every iteration is costly.
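
As a rough illustration (not part of the original slides), a minimal MATLAB sketch of the multi-dimensional Newton iteration could look as follows; gradf and hessf are assumed to be user-supplied handles returning the gradient vector and Hessian matrix of f at a column vector x.

% Minimal sketch of multi-dimensional Newton's method (illustrative only)
function x = newton_opt(gradf, hessf, x, tol, maxit)
    for k = 1:maxit
        g = gradf(x);                  % gradient at the current point
        if norm(g) < tol, break; end   % stop when the gradient is small
        x = x - hessf(x) \ g;          % x_{i+1} = x_i - H_i^{-1} * grad f(x_i)
    end
end

Note that the backslash operator solves H s = ∇f instead of forming H^{-1} explicitly, which is cheaper and numerically safer.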

Marquardt Method
Idea
When the current point is far from the optimum, use the steepest descent (Cauchy's) method.
As the point gets closer and closer to the optimum, gradually switch to Newton's method.
In a given problem it is not known in advance whether the chosen initial point is far from or close to the minimum.
So we need a method that takes advantage of both.

Marquardt Method
The Marquardt method achieves this by modifying the Hessian matrix H in Newton's method in the following way:

x_{i+1} = x_i - H̃_i^{-1} ∇f(x_i),   where   H̃_i = H_i + λ_i I

Initially, set λ_0 to a huge number.
Decrease the value of λ_i at each iteration.
When x_i is close to the optimum point, make λ_i zero (or close to zero).

Marquardt Method
When λ_i is large:

H̃_i = H_i + λ_i I ≈ λ_i I
x_{i+1} = x_i - H̃_i^{-1} ∇f(x_i) ≈ x_i - (1/λ_i) ∇f(x_i)

i.e., the method behaves like the steepest descent (Cauchy's) method: it takes a small step along the negative gradient.

When λ_i is close to zero:

H̃_i = H_i + λ_i I ≈ H_i
x_{i+1} = x_i - H̃_i^{-1} ∇f(x_i) ≈ x_i - H_i^{-1} ∇f(x_i)

i.e., the method becomes Newton's method.
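
To make the λ-switching concrete, here is a minimal MATLAB sketch of a Marquardt-type iteration (an illustrative simplification, not the exact algorithm of the slides); f, gradf, and hessf are assumed user-supplied handles, λ is halved after every successful step and increased again when a step fails to decrease f.

% Minimal sketch of Marquardt's method (illustrative only)
function x = marquardt_opt(f, gradf, hessf, x, lambda, tol, maxit)
    for k = 1:maxit
        g = gradf(x);
        if norm(g) < tol, break; end                % termination test
        Hmod = hessf(x) + lambda * eye(numel(x));   % modified Hessian H + lambda*I
        s = -Hmod \ g;                              % step s = -(H + lambda*I)^{-1} * grad f
        if f(x + s) < f(x)
            x = x + s;                              % accept the step
            lambda = lambda / 2;                    % drift toward Newton's method
        else
            lambda = lambda * 2;                    % fall back toward steepest descent
        end
    end
end

For example, a call such as marquardt_opt(f, gradf, hessf, [0; 0], 100, 1e-3, 100) would roughly mimic the Himmelblau exercise worked out in the next slides.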

EXERCISE 3.4.4: Marquardt Method


Consider the Himmelblau function; minimize it using Marquardt's method:

f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

Step 1: To ensure proper convergence, a large value of M (= 100) is usually chosen.
Also choose an initial point x(0) = (0, 0)^T and termination parameter ε1 = 10^-3.
Set the iteration counter k = 0 and the parameter λ(0) = 100.
Step 2: The derivative at this point is calculated as ∇f(x(0)) = (-14, -22)^T.
Step 3: Since the derivative is not small, we go to step 4.

EXERCISE 3.4.4: Marquardt Method


3D surface plot of the Himmelblau function f(x, y) (figure).

EXERCISE 3.4.4: Marquardt Method


Contour graph:

% MATLAB script to draw the contour plot of the Himmelblau function
[X, Y] = meshgrid(0:0.1:5);
Z = (X.^2 + Y - 11).^2 + (X + Y.^2 - 7).^2;
contour(X, Y, Z, 150);
colormap(jet);

The minimum point appears near (3, 2) on the contour plot.

EXERCISE 3.4.4: Marquardt Method


f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

∂f/∂x = 4x(x^2 + y - 11) + 2(x + y^2 - 7) = 4x^3 + 4xy - 42x + 2y^2 - 14
∂f/∂y = 2(x^2 + y - 11) + 4y(x + y^2 - 7) = 2x^2 - 22 + 4xy + 4y^3 - 26y

∂^2f/∂x^2 = 12x^2 + 4y - 42;   ∂^2f/∂y^2 = 4x + 12y^2 - 26;   ∂^2f/∂x∂y = 4x + 4y

H = [ 12x^2 + 4y - 42    4x + 4y           ]
    [ 4x + 4y            4x + 12y^2 - 26   ]

At (0, 0):  ∇f = [-14  -22]^T   and   H = [ -42     0 ]
                                          [   0   -26 ]
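
The derivatives above can be cross-checked numerically. A small MATLAB snippet (an illustrative check, not part of the original exercise) that reproduces ∇f and H at (0, 0):

% Gradient and Hessian of the Himmelblau function as anonymous handles
gradf = @(x) [4*x(1)^3 + 4*x(1)*x(2) - 42*x(1) + 2*x(2)^2 - 14;
              2*x(1)^2 - 22 + 4*x(1)*x(2) + 4*x(2)^3 - 26*x(2)];
hessf = @(x) [12*x(1)^2 + 4*x(2) - 42,  4*x(1) + 4*x(2);
              4*x(1) + 4*x(2),          4*x(1) + 12*x(2)^2 - 26];
gradf([0; 0])   % expected: [-14; -22]
hessf([0; 0])   % expected: [-42 0; 0 -26]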

EXERCISE 3.4.4: Marquardt Method

f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

At (0, 0):  ∇f = [-14  -22]^T   and   H = [ -42     0 ]
                                          [   0   -26 ]

s(0) = -[H(0) + λ(0) I]^{-1} ∇f(x(0))

     = -( [ -42     0 ]  + 100 [ 1  0 ] )^{-1} [ -14 ]
          [   0   -26 ]        [ 0  1 ]        [ -22 ]

     = -[ 58    0 ]^{-1} [ -14 ]  =  (1/4292) [ 74    0 ] [ 14 ]  =  [ 0.241 ]
        [  0   74 ]      [ -22 ]              [  0   58 ] [ 22 ]     [ 0.297 ]

Thus the new point is x(1) = x(0) + s(0):

x(1) = (0, 0)^T + (0.241, 0.297)^T = (0.241, 0.297)^T
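
A quick numerical check of this first step (illustrative only):

% Verify the first Marquardt step from x(0) = (0, 0) with lambda = 100
H0 = [-42 0; 0 -26];              % Hessian at (0, 0)
g0 = [-14; -22];                  % gradient at (0, 0)
s0 = -(H0 + 100*eye(2)) \ g0      % expected: approximately [0.241; 0.297]
x1 = [0; 0] + s0                  % new point x(1)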

EXERCISE 3.4.4: Marquardt Method


STEP 5: The function value at the new point is f(x(1)) = 157.79, which is smaller than f(x(0)) = 170. Thus we move to the next step.
STEP 6: We now set λ(1) = λ(0)/2 = 100/2 = 50. This gradually shifts the method from Cauchy's (steepest descent) toward Newton's method. We set k = 1.
This completes one iteration of the Marquardt algorithm.

EXERCISE 3.4.4: Marquardt Method


We now continue from x(1) = (0.241, 0.297)^T, where the function value is f(x(1)) = 157.79.
Step 2: The derivative at this point is calculated as (-23.60, -29.21)^T.
Step 3: Since the termination criteria are not met, we go to step 4.
Step 4: At this point the gradient and the Hessian are:

f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2,   at the point (0.241, 0.297):

∂f/∂x = 4x^3 + 4xy - 42x + 2y^2 - 14 = -23.6033
∂f/∂y = 2x^2 - 22 + 4xy + 4y^3 - 26y = -29.2147

H = [ 12x^2 + 4y - 42    4x + 4y           ]     [ -40.115     2.152 ]
    [ 4x + 4y            4x + 12y^2 - 26   ]  =  [   2.152   -23.754 ]

EXERCISE 3.4.4: Marquardt Method

f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2

At (0.241, 0.297):  ∇f = [-23.60  -29.21]^T   and   H = [ -40.115     2.152 ]
                                                        [   2.152   -23.754 ]

s(1) = -[H(1) + λ(1) I]^{-1} ∇f(x(1))

     = -( [ -40.115     2.152 ]  + 50 [ 1  0 ] )^{-1} [ -23.60 ]
          [   2.152   -23.754 ]       [ 0  1 ]        [ -29.21 ]

     = -[ 9.885    2.152 ]^{-1} [ -23.60 ]  =  [ 2.738 ]
        [ 2.152   26.246 ]      [ -29.21 ]     [ 1.749 ]

Thus the new point is x(2) = x(1) + s(1):

x(2) = (0.241, 0.297)^T + (2.738, 1.749)^T = (2.98, 2.045)^T

EXERCISE 3.4.4: Marquardt Method


STEP 5: The function value at the new point is f(x(2)) = 0.033, which is much smaller than f(x(1)) = 157.79. Thus we move to the next step.
STEP 6: We now set λ(2) = λ(1)/2 = 50/2 = 25. This shifts the method further from Cauchy's toward Newton's method. We set k = 2.
This completes one more iteration of the Marquardt algorithm.

EXERCISE 3.4.4: Marquardt Method


This process continues until the termination criterion is satisfied.
One more iteration gives x(3) = (2.994, 2.005)^T with function value f(x(3)) = 0.001.
So we can stop here; the optimum estimate is x(3).
One difficulty of the Marquardt method is that at every iteration the Hessian matrix must be computed and the modified Hessian inverted.

Conjugate Direction Methods


Conjugate direction methods can be regarded as
somewhat in between steepest descent and Newton's
method, having the positive features of both of them.

Motivation: to accelerate the slow convergence of steepest descent while avoiding the expensive evaluation, storage, and inversion of the Hessian.

Conjugate Gradient Approaches


The Fletcher-Reeves Method
It is similar to the conjugate direction method.
Assuming that the objective function is quadratic, conjugate directions can be found using only first-order derivatives.
Idea: Compute the conjugate direction at each point from the gradient as

s_i = -∇f(x_i) + ( ‖∇f(x_i)‖^2 / ‖∇f(x_{i-1})‖^2 ) s_{i-1}

This method converges faster than Powell's method.
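
A minimal MATLAB sketch of the Fletcher-Reeves iteration (illustrative only; f and gradf are assumed user-supplied handles, and the crude halving line search below is a placeholder, since a practical implementation would use a proper exact or inexact line search):

% Minimal sketch of the Fletcher-Reeves conjugate gradient method (illustrative)
function x = fletcher_reeves(f, gradf, x, tol, maxit)
    g = gradf(x);
    s = -g;                                  % first direction: steepest descent
    for k = 1:maxit
        if norm(g) < tol, break; end
        alpha = 1;                           % crude backtracking line search
        while f(x + alpha*s) > f(x) && alpha > 1e-12
            alpha = alpha / 2;
        end
        x = x + alpha*s;
        gnew = gradf(x);
        beta = (gnew'*gnew) / (g'*g);        % Fletcher-Reeves coefficient
        s = -gnew + beta*s;                  % new conjugate direction
        g = gnew;
    end
end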

Example on various functions


Determine whether the stationary point of each of the following quadratic functions is a local maximum, a local minimum, or a saddle point:

(i)   f(x) = x^2 - 2x + 100
(ii)  f(x, y) = 2xy + 1.5y - 1.25x^2 - 2y^2
(iii) f(x, y) = (x - 2)^2 - (y - 3)^2
(iv)  f(x, y, z) = x^2 + y^2 + z^2 - 2xz + xy - 3yz + 10

A point x* is a stationary point iff
f'(x*) = 0 (if f is a function of one variable)
∇f(x*) = 0 (if f is a function of more than one variable)

Example Solution

(i)   f(x) = x^2 - 2x + 100
      f'(x) = 2x - 2 = 0  ⇒  x = 1
      f''(1) = 2 (> 0)  ⇒  x = 1 is a local minimum

(ii)  f(x, y) = 2xy + 1.5y - 1.25x^2 - 2y^2
      ∂f/∂x = 2y - 2.5x,   ∂f/∂y = 2x + 1.5 - 4y
      Setting ∇f = 0, we have
      2y - 2.5x = 0
      2x + 1.5 - 4y = 0
      Solving the system yields x = 0.5 and y = 0.625.

We still have to test whether the point is a local maximum, a local minimum, or a saddle point
(continued on the next slide)

Example Solution (Continue)


(ii)  (continued)

One way to test whether a point is a local maximum, a local minimum, or a saddle point is to use the Hessian matrix.

f(x, y) = 2xy + 1.5y - 1.25x^2 - 2y^2
∂f/∂x = 2y - 2.5x,   ∂f/∂y = 2x + 1.5 - 4y

H = [ ∂^2f/∂x^2    ∂^2f/∂x∂y ]     [ -2.5    2 ]
    [ ∂^2f/∂y∂x    ∂^2f/∂y^2 ]  =  [    2   -4 ]

|H| = (-2.5)(-4) - (2)(2) = 6

Since |H| > 0 and ∂^2f/∂x^2 < 0, the point (0.5, 0.625) is a local maximum.

Example Solution (Continue)

(iii)  f(x, y) = (x - 2)^2 - (y - 3)^2
       ∂f/∂x = 2(x - 2) = 2x - 4,   ∂f/∂y = -2(y - 3) = -2y + 6
       Setting ∇f = 0 gives the stationary point (2, 3).

H = [ ∂^2f/∂x^2    ∂^2f/∂x∂y ]     [ 2    0 ]
    [ ∂^2f/∂y∂x    ∂^2f/∂y^2 ]  =  [ 0   -2 ]

|H| = -4

Since |H| < 0 while ∂^2f/∂x^2 > 0 (i.e., H is indefinite), the stationary point is a saddle point.

Example Solution (Continue)

(iv)  f(x, y, z) = x^2 + y^2 + z^2 - 2xz + xy - 3yz + 10
      ∂f/∂x = 2x - 2z + y,   ∂f/∂y = 2y + x - 3z,   ∂f/∂z = 2z - 2x - 3y

H = [ ∂^2f/∂x^2    ∂^2f/∂x∂y    ∂^2f/∂x∂z ]     [  2    1   -2 ]
    [ ∂^2f/∂y∂x    ∂^2f/∂y^2    ∂^2f/∂y∂z ]  =  [  1    2   -3 ]
    [ ∂^2f/∂z∂x    ∂^2f/∂z∂y    ∂^2f/∂z^2 ]     [ -2   -3    2 ]

We need to test whether H is positive definite, negative definite, or neither in order to tell whether the stationary point is a local maximum, a local minimum, or a saddle point.

(continued on the next slide)

Example Solution (Continue)


(iv)  (continued from the previous slide)

We can verify whether a matrix is positive definite by checking whether the determinants of all its leading (upper-left) sub-matrices are positive.

H = [  2    1   -2 ]
    [  1    2   -3 ]
    [ -2   -3    2 ]

H11 = 2 > 0

H22 = | 2  1 |  = 4 - 1 = 3 > 0
      | 1  2 |

Forward elimination reduces H to upper triangular form:

[ 2    1      -2     ]
[ 0    1.5    -2     ]
[ 0    0    -4/1.5   ]

H33 = |H| = (2)(1.5)(-4/1.5) = -8 < 0

Since H is neither positive definite nor negative definite (i.e., it is indefinite), the stationary point is a saddle point.
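
The same conclusion can be checked numerically: a positive definite matrix has only positive eigenvalues and a negative definite one only negative eigenvalues, so mixed signs indicate an indefinite matrix. A small MATLAB check (illustrative):

% Check the definiteness of H for example (iv) via its eigenvalues
H = [2 1 -2; 1 2 -3; -2 -3 2];
eig(H)   % eigenvalues of mixed sign => H is indefinite => saddle point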

Let us do one more exercise

For each of the following points, determine whether it is a local maximum, a local minimum, a saddle point, or not a stationary point of

f(x, y) = x^3 + y^3 - 3xy

1. (0, 0)
2. (1, 0)
3. (-1, -1)
4. (1, 1)

Exercise solution

f(x, y) = x^3 + y^3 - 3xy

∂f/∂x = 3x^2 - 3y,   ∂f/∂y = 3y^2 - 3x

H = [ 6x   -3 ]
    [ -3   6y ]

At (0, 0):  ∇f = [0  0]^T  and  H = [  0  -3 ]
                                    [ -3   0 ]
Since ∇f = 0 and |H| = -9 < 0 (with h11 = 0), (0, 0) is a saddle point.

At (1, 0):  ∇f = [3  -3]^T ≠ 0.  Thus (1, 0) is not a stationary point.

At (-1, -1):  ∇f = [6  6]^T ≠ 0.  Thus (-1, -1) is not a stationary point.

At (1, 1):  ∇f = [0  0]^T  and  H = [  6  -3 ]
                                    [ -3   6 ]
Since ∇f = 0, h11 = 6 > 0, and |H| = 36 - 9 = 27 > 0, (1, 1) is a local minimum.
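
A compact MATLAB check of the four candidate points above (illustrative only; the tolerance is arbitrary):

% Classify candidate points of f(x,y) = x^3 + y^3 - 3xy via gradient and Hessian
gradf = @(p) [3*p(1)^2 - 3*p(2); 3*p(2)^2 - 3*p(1)];
hessf = @(p) [6*p(1), -3; -3, 6*p(2)];
pts = [0 0; 1 0; -1 -1; 1 1];
for i = 1:size(pts, 1)
    p = pts(i, :)';
    g = gradf(p); H = hessf(p);
    if norm(g) > 1e-10
        fprintf('(%g, %g): not a stationary point\n', p);
    elseif det(H) < 0
        fprintf('(%g, %g): saddle point\n', p);
    elseif H(1,1) > 0
        fprintf('(%g, %g): local minimum\n', p);
    else
        fprintf('(%g, %g): local maximum\n', p);   % when det(H) > 0 and h11 < 0
    end
end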
T

Summary
Gradient: what it is and how to derive it
Hessian matrix: what it is and how to derive it
How to test whether a point is a maximum, a minimum, or a saddle point
Steepest ascent/descent method vs. the conjugate-gradient approach vs. Newton's method
