NLP Unconstrained Multivariable

What you can do for one variable, you can do for many (in principle)
Optimization in Engineering Design
Georgia Institute of Technology Systems Realization Laboratory 78
Method of Steepest Descent

The method of steepest descent (also known as the gradient method) is the simplest example of a gradient based method for minimizing a function of several variables. Its core is the following recursion formula:
$x_{k+1} = x_k - \alpha_k \nabla F(x_k)$
where
$x_k$, $x_{k+1}$ = values of the variables in iterations k and k+1
$F(x)$ = objective function to be minimized (or maximized)
$\nabla F$ = gradient of the objective function, constituting the direction of travel
$\alpha_k$ = the size of the step in the direction of travel
Remember: Direction = $d_k = S^{(k)} = -\nabla F(x^{(k)})$
Refer to Section 3.5 for Algorithm and Stopping Criteria.
Advantage: Simple.
Disadvantage: Seldom converges reliably.
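The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of Section 3.5: the fixed step size `alpha`, the gradient-norm stopping test, and the test function `grad_F` are all assumptions made for the example.

```python
def steepest_descent(grad, x0, alpha=0.1, tol=1e-8, max_iter=10_000):
    """Iterate x_{k+1} = x_k - alpha * grad(x_k) with a fixed step size (assumption)."""
    x = list(x0)
    for k in range(max_iter):
        g = grad(x)
        # Stop when the gradient norm is small (one possible stopping criterion).
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            return x, k
        # Move along the direction of travel d_k = -grad F(x_k).
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x, max_iter


# Hypothetical test function: F(x) = (x1 - 1)^2 + 10 * (x2 + 2)^2
def grad_F(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_min, iterations = steepest_descent(grad_F, [0.0, 0.0], alpha=0.05)
print(x_min, iterations)
```

With a fixed step the method needs well over a hundred iterations even on this small, mildly ill-conditioned quadratic, which illustrates why the slide rates its convergence poorly.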
Newton's Method (multi-variable case)

How to extend Newton's method to the multivariable case?
$x_{k+1} = x_k - \dfrac{y'(x_k)}{y''(x_k)}$
Is this correct? No. Why?
Start again with the Taylor expansion:
$y(x) = y(x_k) + \nabla y(x_k)^T (x - x_k) + 0.5\,(x - x_k)^T H(x_k)\,(x - x_k)$
Remainder is dropped. Significance?
Note that H is the Hessian containing the second-order derivatives. See Sec. 1.4.
$x_{k+1} = x_k - \dfrac{\nabla y(x_k)}{H(x_k)}$
Is this correct? Not yet. Why?
Newton's method for finding an extreme point is
$x_{k+1} = x_k - H^{-1}(x_k)\,\nabla y(x_k)$
Don't confuse $H^{-1}$ with .
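Like the steepest descent method, Newton's method builds its search direction from the negative gradient, but it premultiplies the gradient by the inverse Hessian, so the two directions generally differ.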
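A minimal sketch of the multivariable Newton iteration for a two-variable problem follows. The test function y(x) = exp(x1 + x2) - x1 - x2 + (x1 - x2)^2, the helper `solve_2x2`, and the stopping tolerance are assumptions chosen for illustration. Rather than forming the inverse Hessian explicitly, the sketch solves the linear system H p = grad y and steps by -p, which gives the same update.

```python
import math


def solve_2x2(A, b):
    """Solve the 2x2 linear system A p = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0.0:
        raise ValueError("singular Hessian")
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def newton(grad, hess, x0, tol=1e-10, max_iter=50):
    """Iterate x_{k+1} = x_k - H^{-1}(x_k) grad(x_k) for two variables."""
    x = list(x0)
    for k in range(max_iter):
        g = grad(x)
        if math.hypot(g[0], g[1]) < tol:
            return x, k
        p = solve_2x2(hess(x), g)          # p = H^{-1} grad y
        x = [x[0] - p[0], x[1] - p[1]]
    return x, max_iter


# Hypothetical test function with its minimum at (0, 0).
def grad_y(x):
    e = math.exp(x[0] + x[1])
    v = x[0] - x[1]
    return [e - 1.0 + 2.0 * v, e - 1.0 - 2.0 * v]


def hess_y(x):
    e = math.exp(x[0] + x[1])
    return [[e + 2.0, e - 2.0], [e - 2.0, e + 2.0]]


x_min, iterations = newton(grad_y, hess_y, [1.0, 0.5])
print(x_min, iterations)
```

Started reasonably close to the solution, the iteration converges in a handful of steps, in line with the fast local convergence noted on the next slide.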
Properties of Newton's Method
Good properties (fast convergence) if started near the solution. However, it needs modifications if started far away from the solution. Also, the (inverse) Hessian is expensive to calculate. To overcome this, several modifications are often made. One of them is to add a search parameter in front of the inverse Hessian (similar to the step size in steepest descent). This is often referred to as the modified Newton's method.
Other modifications focus on enhancing the properties of the combination of first- and second-order derivative information.
Quasi-Newton methods build up curvature information by observing the behavior of the objective function and its gradient. This information is used to generate an approximation of the Hessian.
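Written out, the modified Newton update places a step length in front of the inverse Hessian term. For the quasi-Newton idea, the BFGS update is shown as one common way of building the Hessian approximation; the slides do not say which update formula they have in mind, so BFGS and the symbols $B_k$, $s_k$, $q_k$ are assumptions used only for illustration.

```latex
% Modified Newton's method: step length \alpha_k in front of the inverse Hessian
x_{k+1} = x_k - \alpha_k \, H^{-1}(x_k) \, \nabla y(x_k)

% Quasi-Newton: replace H(x_k) by an approximation B_k built from
% observed changes in the variables and in the gradient
s_k = x_{k+1} - x_k, \qquad q_k = \nabla y(x_{k+1}) - \nabla y(x_k)

% BFGS update of the Hessian approximation (one common choice)
B_{k+1} = B_k + \frac{q_k q_k^T}{q_k^T s_k} - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k}
```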
Conjugate Directions Method

Conjugate direction methods can be regarded as somewhere in between steepest descent and Newton's method, having the positive features of both.
Motivation: Desire to accelerate the slow convergence of steepest descent, but avoid the expensive evaluation, storage, and inversion of the Hessian.
Application: Conjugate direction methods are invariably invented and analyzed for the quadratic problem:
Minimize: $\tfrac{1}{2} x^T Q x - b^T x$
Note: Condition for optimality is $\nabla y = Qx - b = 0$, or $Qx = b$ (a linear equation).
Note: Textbook uses A instead of Q.
Basic Principle
Definition: Given a symmetric matrix Q, two vectors $d_1$ and $d_2$ are said to be Q-orthogonal, or conjugate with respect to Q, if $d_1^T Q d_2 = 0$.
Note that orthogonal vectors ($d_1^T d_2 = 0$) are a special case of conjugate vectors (Q = I).
For positive definite Q, a set of nonzero Q-conjugate vectors $d_0, \dots, d_{n-1}$ is linearly independent. So, since the vectors $d_i$ are independent, the solution to the n×n quadratic problem can be rewritten as
$x^* = \alpha_0 d_0 + \dots + \alpha_{n-1} d_{n-1}$
Multiplying by Q and taking the scalar product with $d_i$, you can express $\alpha_i$ in terms of d, Q, and either x* or b:
$\alpha_i = \dfrac{d_i^T Q x^*}{d_i^T Q d_i} = \dfrac{d_i^T b}{d_i^T Q d_i}$
Note that A is used instead of Q in your textbook.
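The expansion of x* in conjugate directions can be checked numerically on a small example. In this sketch the matrix `Q`, the vector `b`, and the way the second direction is obtained (a Gram-Schmidt-style Q-orthogonalization of the unit vectors) are assumptions chosen for illustration. The coefficients use only b, via alpha_i = (d_i^T b) / (d_i^T Q d_i).

```python
def matvec(Q, v):
    """Matrix-vector product Q v for a matrix stored as a list of rows."""
    return [sum(qij * vj for qij, vj in zip(row, v)) for row in Q]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


# Hypothetical symmetric positive definite Q and right-hand side b.
Q = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

# Build two Q-conjugate directions: keep d0, then remove from the second
# unit vector its Q-component along d0.
d0 = [1.0, 0.0]
e1 = [0.0, 1.0]
c = dot(d0, matvec(Q, e1)) / dot(d0, matvec(Q, d0))
d1 = [e - c * d for e, d in zip(e1, d0)]

# Coefficients alpha_i = d_i^T b / (d_i^T Q d_i), then x* = sum alpha_i d_i.
directions = [d0, d1]
alphas = [dot(d, b) / dot(d, matvec(Q, d)) for d in directions]
x_star = [sum(a * d[j] for a, d in zip(alphas, directions)) for j in range(2)]

# The residual Q x* - b should vanish at the optimum.
residual = [qi - bi for qi, bi in zip(matvec(Q, x_star), b)]
print(x_star, residual)
```

Note that no linear system is solved: once the directions are conjugate, each coefficient is found independently of the others.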
Conjugate Gradient Method

The conjugate gradient method is the conjugate direction method that is obtained by selecting the successive direction vectors as a conjugate version of the successive gradients obtained as the method progresses.
Search direction at iteration k:
$d_k = -g_k + \sum_{i=0}^{k-1} \beta_i d_i$
or
$d_{k+1} = -g_{k+1} + \beta_k d_k$
You generate the conjugate directions as you go along.
Three advantages:
1) Unless the solution has already been reached, the gradient is always nonzero and linearly independent of all previous direction vectors.
2) Simple formula to determine the new direction. Only slightly more complicated than steepest descent.
3) Process makes good progress because it is based on gradients.
Pure Conjugate Gradient Method (Quadratic Case)

0 - Starting at any $x_0$, define $d_0 = -g_0 = b - Q x_0$, where $g_k$ is the column vector of gradients of the objective function at point $x_k$.
1 - Using $d_k$, calculate the new point $x_{k+1} = x_k + \alpha_k d_k$, where
$\alpha_k = -\dfrac{g_k^T d_k}{d_k^T Q d_k}$
Note that $\alpha_k$ is calculated in closed form, not found by a line search.
2 - Calculate the new conjugate gradient direction $d_{k+1}$ according to: $d_{k+1} = -g_{k+1} + \beta_k d_k$, where
$\beta_k = \dfrac{g_{k+1}^T Q d_k}{d_k^T Q d_k}$
This is slightly different from your current textbook.
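Steps 0 to 2 translate almost line by line into code. The sketch below is a minimal pure-Python version for the quadratic problem; the matrix `Q`, the vector `b`, the starting point, and the early-exit tolerance are assumptions chosen for illustration.

```python
def matvec(Q, v):
    """Matrix-vector product Q v for a matrix stored as a list of rows."""
    return [sum(qij * vj for qij, vj in zip(row, v)) for row in Q]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(Q, b, x0, tol=1e-12):
    """Minimize 0.5 x^T Q x - b^T x for symmetric positive definite Q."""
    x = list(x0)
    g = [qi - bi for qi, bi in zip(matvec(Q, x), b)]   # g_0 = Q x_0 - b
    d = [-gi for gi in g]                              # step 0: d_0 = -g_0
    for _ in range(len(b)):                            # at most n steps
        if dot(g, g) ** 0.5 < tol:
            break
        Qd = matvec(Q, d)
        dQd = dot(d, Qd)
        alpha = -dot(g, d) / dQd                       # step 1: closed-form step size
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g = [qi - bi for qi, bi in zip(matvec(Q, x), b)]
        beta = dot(g, Qd) / dQd                        # step 2: g_{k+1}^T Q d_k / d_k^T Q d_k
        d = [-gi + beta * di for gi, di in zip(g, d)]
    return x


# Hypothetical 2x2 example; the exact solution of Q x = b is (1/11, 7/11).
Q = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_min = conjugate_gradient(Q, b, [2.0, 1.0])
print(x_min)
```

For an n-variable quadratic the loop runs at most n times, which is why the iteration count is bounded by `len(b)` rather than by a convergence test alone.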
Non-Quadratic Conjugate Gradient Methods

For non-quadratic cases, you have the problem that you do not know Q, and you would have to make an approximation. One approach is to substitute the Hessian $H(x_k)$ for Q.
The problem is that the Hessian has to be evaluated at each point.
Other approaches avoid Q completely by using line searches.

Examples: Fletcher-Reeves and Polak-Ribière methods
Differences from the pure Conjugate Gradient algorithm:
1) $\alpha_k$ is found through a line search.
2) Different formulas are used for calculating $\beta_k$.
Polak-Ribière & Fletcher-Reeves Methods for Minimizing f(x)

0 - Starting at any $x_0$, define $d_0 = -g_0$, where g is the column vector of gradients of the objective function at point x.
1 - Using $d_k$, find the new point $x_{k+1} = x_k + \alpha_k d_k$, where $\alpha_k$ is found using a line search that minimizes $f(x_k + \alpha_k d_k)$.
2 - Calculate the new conjugate gradient direction $d_{k+1}$ according to: $d_{k+1} = -g_{k+1} + \beta_k d_k$, where $\beta_k$ can vary depending on what (update) formula you use.
Fletcher-Reeves: $\beta_k = \dfrac{g_{k+1}^T g_{k+1}}{g_k^T g_k}$
Polak-Ribière: $\beta_k = \dfrac{g_{k+1}^T (g_{k+1} - g_k)}{g_k^T g_k}$
Note: $g_{k+1}$ is the gradient of the objective function at point $x_{k+1}$.
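The two update formulas differ only in the numerator of beta_k, so one routine can cover both. The sketch below is a minimal illustration under several assumptions: the line search is a simple bracket-then-golden-section search, the test function is a hypothetical convex one with its minimum at (0, 0), and a fallback to the steepest descent direction is added whenever the new direction is not a descent direction. None of these details come from the slides.

```python
import math

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0   # about 0.618


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def line_search(f, x, d, iters=80):
    """Approximately minimize phi(a) = f(x + a*d) over a >= 0."""
    phi = lambda a: f([xi + a * di for xi, di in zip(x, d)])
    hi = 1.0
    while phi(2.0 * hi) < phi(hi) and hi < 1e6:   # expand until phi stops decreasing
        hi *= 2.0
    lo, hi = 0.0, 2.0 * hi
    for _ in range(iters):                        # golden-section interval reduction
        a1 = hi - GOLDEN * (hi - lo)
        a2 = lo + GOLDEN * (hi - lo)
        if phi(a1) < phi(a2):
            hi = a2
        else:
            lo = a1
    return 0.5 * (lo + hi)


def nonlinear_cg(f, grad, x0, rule="FR", tol=1e-5, max_iter=200):
    """Nonlinear conjugate gradient with Fletcher-Reeves ("FR") or Polak-Ribiere ("PR")."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                                   # step 0: d_0 = -g_0
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            return x, k
        alpha = line_search(f, x, d)                        # step 1
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x)
        if rule == "FR":
            beta = dot(g_new, g_new) / dot(g, g)
        else:
            diff = [a - b for a, b in zip(g_new, g)]
            beta = dot(g_new, diff) / dot(g, g)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]   # step 2
        if dot(g_new, d) >= 0.0:                            # safeguard (assumption)
            d = [-gn for gn in g_new]
        g = g_new
    return x, max_iter


# Hypothetical convex test function with its minimum at (0, 0).
def f(x):
    return math.exp(x[0] + x[1]) - x[0] - x[1] + (x[0] - x[1]) ** 2


def grad_f(x):
    e = math.exp(x[0] + x[1])
    v = x[0] - x[1]
    return [e - 1.0 + 2.0 * v, e - 1.0 - 2.0 * v]


x_fr, k_fr = nonlinear_cg(f, grad_f, [1.0, 0.5], rule="FR")
x_pr, k_pr = nonlinear_cg(f, grad_f, [1.0, 0.5], rule="PR")
print(x_fr, k_fr)
print(x_pr, k_pr)
```

Neither formula needs Q or the Hessian: only function values (for the line search) and gradients are evaluated.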

Fletcher-Reeves Method for Minimizing f(x)

0 - Starting at any $x_0$, define $d_0 = -g_0$, where g is the column vector of gradients of the objective function at point x.
1 - Using $d_k$, find the new point $x_{k+1} = x_k + \alpha_k d_k$, where $\alpha_k$ is found using a line search that minimizes $f(x_k + \alpha_k d_k)$.
2 - Calculate the new conjugate gradient direction $d_{k+1}$ according to: $d_{k+1} = -g_{k+1} + \beta_k d_k$, where
$\beta_k = \dfrac{g_{k+1}^T g_{k+1}}{g_k^T g_k}$
See also Example 3.9 (page 73) in your textbook

Conjugate Gradient Method Advantages
The simple formulae for updating the direction vector are attractive. The method is slightly more complicated than steepest descent, but converges faster.
See 'em in action!
For animations of all the preceding search techniques, check out: http://www.esm.vt.edu/~zgurdal/COURSES/4084/4084-Docs/Animation.html
