
Performance Surfaces (Chapter 8)

Taylor Series Expansion

$$
F(x) = F(x^*) + \left.\frac{d}{dx}F(x)\right|_{x=x^*}(x - x^*)
+ \frac{1}{2}\left.\frac{d^2}{dx^2}F(x)\right|_{x=x^*}(x - x^*)^2 + \cdots
+ \frac{1}{n!}\left.\frac{d^n}{dx^n}F(x)\right|_{x=x^*}(x - x^*)^n + \cdots
$$
Example
$$F(x) = e^{-x}$$

Taylor series of F(x) about x* = 0:

$$
F(x) = e^{-x} = e^{0} - e^{0}(x - 0) + \frac{1}{2}e^{0}(x - 0)^2 - \frac{1}{6}e^{0}(x - 0)^3 + \cdots
$$

$$
F(x) = 1 - x + \frac{1}{2}x^2 - \frac{1}{6}x^3 + \cdots
$$

Taylor series approximations:


$$F(x) \approx F_0(x) = 1$$

$$F(x) \approx F_1(x) = 1 - x$$

$$F(x) \approx F_2(x) = 1 - x + \frac{1}{2}x^2$$
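As a quick numerical check of these approximations, here is a minimal numpy sketch (my own illustration, not part of the original slides):

```python
import numpy as np

# F(x) = exp(-x) and its Taylor approximations about x* = 0
F  = lambda x: np.exp(-x)
F0 = lambda x: np.ones_like(x)        # zeroth-order approximation
F1 = lambda x: 1 - x                  # first-order approximation
F2 = lambda x: 1 - x + 0.5 * x**2     # second-order approximation

x = np.linspace(-2, 2, 9)
for name, approx in [("F0", F0), ("F1", F1), ("F2", F2)]:
    err = np.max(np.abs(F(x) - approx(x)))
    print(f"max |F - {name}| on [-2, 2]: {err:.3f}")
```

Each higher-order term shrinks the error near x* = 0 and widens the interval over which the approximation is useful.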
Plot of Approximations

[Figure: F(x) = e^{-x} and its Taylor approximations F_0(x), F_1(x), and F_2(x), plotted for -2 <= x <= 2.]

Vector Case

$$F(\mathbf{x}) = F(x_1, x_2, \ldots, x_n)$$

$$
F(\mathbf{x}) = F(\mathbf{x}^*)
+ \left.\frac{\partial}{\partial x_1}F(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}^*}(x_1 - x_1^*)
+ \left.\frac{\partial}{\partial x_2}F(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}^*}(x_2 - x_2^*)
+ \cdots
+ \left.\frac{\partial}{\partial x_n}F(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}^*}(x_n - x_n^*)
$$

$$
+ \frac{1}{2}\left.\frac{\partial^2}{\partial x_1^2}F(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}^*}(x_1 - x_1^*)^2
+ \frac{1}{2}\left.\frac{\partial^2}{\partial x_1\,\partial x_2}F(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}^*}(x_1 - x_1^*)(x_2 - x_2^*) + \cdots
$$

Matrix Form
$$
F(\mathbf{x}) = F(\mathbf{x}^*)
+ \nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}(\mathbf{x} - \mathbf{x}^*)
+ \frac{1}{2}(\mathbf{x} - \mathbf{x}^*)^T\,\nabla^2 F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}(\mathbf{x} - \mathbf{x}^*) + \cdots
$$

Gradient:

$$
\nabla F(\mathbf{x}) =
\begin{bmatrix}
\partial F(\mathbf{x})/\partial x_1 \\
\partial F(\mathbf{x})/\partial x_2 \\
\vdots \\
\partial F(\mathbf{x})/\partial x_n
\end{bmatrix}
$$

Hessian:

$$
\nabla^2 F(\mathbf{x}) =
\begin{bmatrix}
\partial^2 F/\partial x_1^2 & \partial^2 F/\partial x_1\,\partial x_2 & \cdots & \partial^2 F/\partial x_1\,\partial x_n \\
\partial^2 F/\partial x_2\,\partial x_1 & \partial^2 F/\partial x_2^2 & \cdots & \partial^2 F/\partial x_2\,\partial x_n \\
\vdots & \vdots & \ddots & \vdots \\
\partial^2 F/\partial x_n\,\partial x_1 & \partial^2 F/\partial x_n\,\partial x_2 & \cdots & \partial^2 F/\partial x_n^2
\end{bmatrix}
$$
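These derivatives can be sanity-checked numerically. A minimal central-difference sketch (my own illustration; the test function is the example used on the following slides):

```python
import numpy as np

def grad_fd(F, x, h=1e-6):
    """Central-difference estimate of the gradient of F at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (F(x + e) - F(x - e)) / (2 * h)
    return g

F = lambda x: x[0]**2 + 2*x[0]*x[1] + 2*x[1]**2
print(grad_fd(F, np.array([0.5, 0.0])))   # approximately [1. 1.]
```

The Hessian can be estimated the same way by differencing the gradient.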

Directional Derivatives

First derivative (slope) of F(x) along the x_i axis: $\partial F(\mathbf{x})/\partial x_i$ (the ith element of the gradient).

Second derivative (curvature) of F(x) along the x_i axis: $\partial^2 F(\mathbf{x})/\partial x_i^2$ (the (i, i) element of the Hessian).

First derivative (slope) of F(x) along a vector p:

$$\frac{\mathbf{p}^T\,\nabla F(\mathbf{x})}{\|\mathbf{p}\|}$$

Second derivative (curvature) of F(x) along a vector p:

$$\frac{\mathbf{p}^T\,\nabla^2 F(\mathbf{x})\,\mathbf{p}}{\|\mathbf{p}\|^2}$$

Example
$$F(\mathbf{x}) = x_1^2 + 2x_1x_2 + 2x_2^2
\qquad
\mathbf{x}^* = \begin{bmatrix} 0.5 \\ 0 \end{bmatrix}
\qquad
\mathbf{p} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

$$
\nabla F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*} =
\begin{bmatrix} \partial F(\mathbf{x})/\partial x_1 \\ \partial F(\mathbf{x})/\partial x_2 \end{bmatrix}_{\mathbf{x}=\mathbf{x}^*}
= \begin{bmatrix} 2x_1 + 2x_2 \\ 2x_1 + 4x_2 \end{bmatrix}_{\mathbf{x}=\mathbf{x}^*}
= \begin{bmatrix} 1 \\ 1 \end{bmatrix}
$$

$$
\frac{\mathbf{p}^T\,\nabla F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}}{\|\mathbf{p}\|}
= \frac{\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix}}{\left\|\begin{bmatrix} 1 \\ -1 \end{bmatrix}\right\|}
= \frac{0}{\sqrt{2}} = 0
$$
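A short numpy sketch (my own check, not from the slides) reproducing this computation:

```python
import numpy as np

# Gradient of F(x) = x1^2 + 2*x1*x2 + 2*x2^2
grad = lambda x: np.array([2*x[0] + 2*x[1], 2*x[0] + 4*x[1]])

x_star = np.array([0.5, 0.0])
p = np.array([1.0, -1.0])

g = grad(x_star)                   # gradient at x*: [1. 1.]
slope = p @ g / np.linalg.norm(p)  # directional derivative along p
print(g, slope)                    # [1. 1.] 0.0
```

The slope along p = [1, -1]^T is zero: moving in that direction leaves F unchanged to first order.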
Plots

[Figure: 3-D surface plot and contour plot of F(x) over -2 <= x_1, x_2 <= 2, with directions through x* labeled by their directional derivatives (0.0, 0.5, 1.0, 1.3, 1.4).]

Minima
Strong Minimum
The point x* is a strong minimum of F(x) if a scalar δ > 0 exists,
such that F(x*) < F(x* + Δx) for all Δx such that δ > ||Δx|| > 0.

Global Minimum
The point x* is a unique global minimum of F(x) if
F(x*) < F(x* + Δx) for all Δx ≠ 0.

Weak Minimum
The point x* is a weak minimum of F(x) if it is not a strong
minimum, and a scalar δ > 0 exists, such that F(x*) ≤ F(x* + Δx)
for all Δx such that δ > ||Δx|| > 0.
Scalar Example
$$F(x) = 3x^4 - 7x^2 - \frac{1}{2}x + 6$$

[Figure: plot of F(x) for -2 <= x <= 2, with the strong maximum, strong minimum, and global minimum labeled.]
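The critical points can be located numerically. A minimal sketch (my own illustration) using the roots of F'(x):

```python
import numpy as np

# F(x) = 3x^4 - 7x^2 - 0.5x + 6; stationary points solve
# F'(x) = 12x^3 - 14x - 0.5 = 0
F   = lambda x: 3*x**4 - 7*x**2 - 0.5*x + 6
Fpp = lambda x: 36*x**2 - 14          # F''(x) classifies each point

for x in sorted(np.roots([12.0, 0.0, -14.0, -0.5]).real):
    kind = "minimum" if Fpp(x) > 0 else "maximum"
    print(f"x = {x:+.3f}  F(x) = {F(x):.3f}  ({kind})")
```

The right-hand minimum has the lower function value, so it is the global minimum; the left-hand one is only a strong (local) minimum.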
Vector Example
$$F(\mathbf{x}) = (x_2 - x_1)^4 + 8x_1x_2 - x_1 + x_2 + 3
\qquad\qquad
F(\mathbf{x}) = (x_1^2 - 1.5x_1x_2 + 2x_2^2)\,x_1^2$$

[Figure: contour plots (top) and surface plots (bottom) of the two functions over -2 <= x_1, x_2 <= 2.]
First-Order Optimality Condition
$$
F(\mathbf{x}) = F(\mathbf{x}^* + \Delta\mathbf{x})
= F(\mathbf{x}^*)
+ \nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x}
+ \frac{1}{2}\Delta\mathbf{x}^T\,\nabla^2 F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} + \cdots
\qquad \Delta\mathbf{x} = \mathbf{x} - \mathbf{x}^*
$$

For small Δx:

$$F(\mathbf{x}^* + \Delta\mathbf{x}) \approx F(\mathbf{x}^*) + \nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x}$$

If x* is a minimum, this implies:

$$\nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} \geq 0$$

If $\nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} > 0$, then

$$F(\mathbf{x}^* - \Delta\mathbf{x}) \approx F(\mathbf{x}^*) - \nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} < F(\mathbf{x}^*)$$

But this would imply that x* is not a minimum. Therefore

$$\nabla F(\mathbf{x})^T\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} = 0$$

Since this must be true for every Δx:

$$\nabla F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*} = \mathbf{0}$$
Second-Order Condition
If the first-order condition is satisfied (zero gradient), then

$$F(\mathbf{x}^* + \Delta\mathbf{x}) = F(\mathbf{x}^*) + \frac{1}{2}\Delta\mathbf{x}^T\,\nabla^2 F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} + \cdots$$

A strong minimum will exist at x* if

$$\Delta\mathbf{x}^T\,\nabla^2 F(\mathbf{x})\big|_{\mathbf{x}=\mathbf{x}^*}\,\Delta\mathbf{x} > 0 \quad \text{for any } \Delta\mathbf{x} \neq \mathbf{0}.$$

Therefore the Hessian matrix must be positive definite. A matrix A is positive definite if

$$\mathbf{z}^T A\mathbf{z} > 0 \quad \text{for any } \mathbf{z} \neq \mathbf{0}.$$

This is a sufficient condition for optimality.

A necessary condition is that the Hessian matrix be positive semidefinite. A matrix A is positive semidefinite if

$$\mathbf{z}^T A\mathbf{z} \geq 0 \quad \text{for any } \mathbf{z}.$$
Example
$$F(\mathbf{x}) = x_1^2 + 2x_1x_2 + 2x_2^2 + x_1$$

$$
\nabla F(\mathbf{x}) = \begin{bmatrix} 2x_1 + 2x_2 + 1 \\ 2x_1 + 4x_2 \end{bmatrix} = \mathbf{0}
\quad\Rightarrow\quad
\mathbf{x}^* = \begin{bmatrix} -1 \\ 0.5 \end{bmatrix}
$$

$$\nabla^2 F(\mathbf{x}) = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix} \quad \text{(not a function of } \mathbf{x} \text{ in this case)}$$

To test the definiteness, check the eigenvalues of the Hessian. If the eigenvalues are all greater than zero, the Hessian is positive definite.

$$
\left|\nabla^2 F(\mathbf{x}) - \lambda I\right|
= \begin{vmatrix} 2 - \lambda & 2 \\ 2 & 4 - \lambda \end{vmatrix}
= \lambda^2 - 6\lambda + 4
= (\lambda - 0.76)(\lambda - 5.24)
$$

$$\lambda = 0.76,\ 5.24$$

Both eigenvalues are positive, therefore x* is a strong minimum.
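A numpy sketch (my own check, not from the slides) repeating this test:

```python
import numpy as np

A = np.array([[2.0, 2.0],          # Hessian of F
              [2.0, 4.0]])
d = np.array([1.0, 0.0])           # linear term, so grad F(x) = A x + d

x_star = np.linalg.solve(A, -d)    # stationary point: [-1, 0.5]
lam = np.linalg.eigvalsh(A)        # eigenvalues of the symmetric Hessian
print(x_star, lam)                 # [-1.  0.5] and eigenvalues ~ [0.764, 5.236]
print("positive definite:", bool(np.all(lam > 0)))
```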


Quadratic Functions
$$F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T A\mathbf{x} + \mathbf{d}^T\mathbf{x} + c \quad \text{(symmetric } A\text{)}$$

Useful properties of gradients:

$$\nabla(\mathbf{h}^T\mathbf{x}) = \nabla(\mathbf{x}^T\mathbf{h}) = \mathbf{h}$$

$$\nabla(\mathbf{x}^T Q\mathbf{x}) = Q\mathbf{x} + Q^T\mathbf{x} = 2Q\mathbf{x} \quad \text{(for symmetric } Q\text{)}$$

Gradient of the quadratic function:

$$\nabla F(\mathbf{x}) = A\mathbf{x} + \mathbf{d}$$

Hessian of the quadratic function:

$$\nabla^2 F(\mathbf{x}) = A$$
Eigensystem of the Hessian
Consider a quadratic function which has a stationary point at the origin, and whose value there is zero:

$$F(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T A\mathbf{x}$$

Perform a similarity transform on the Hessian matrix, using the eigenvectors as the new basis vectors:

$$B = [\,\mathbf{z}_1\ \mathbf{z}_2\ \cdots\ \mathbf{z}_n\,]$$

Since the Hessian matrix is symmetric, its eigenvectors are orthogonal:

$$B^{-1} = B^T$$

$$
A' = B^T A B =
\begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_n
\end{bmatrix}
= \Lambda
\quad\Rightarrow\quad
A = B\Lambda B^T
$$
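A small numpy sketch (my own illustration, using the elliptical-hollow Hessian from a later slide) verifying the transform:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # a symmetric Hessian
lam, B = np.linalg.eigh(A)          # columns of B: orthonormal eigenvectors

print(np.allclose(B.T, np.linalg.inv(B)))        # B^{-1} = B^T (orthogonality)
print(np.round(B.T @ A @ B, 12))                 # diagonal matrix diag(1, 3)
print(np.allclose(A, B @ np.diag(lam) @ B.T))    # A = B Lambda B^T
```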
Second Directional Derivative
$$\frac{\mathbf{p}^T\,\nabla^2 F(\mathbf{x})\,\mathbf{p}}{\|\mathbf{p}\|^2} = \frac{\mathbf{p}^T A\mathbf{p}}{\|\mathbf{p}\|^2}$$

Represent p with respect to the eigenvectors (the new basis):

$$\mathbf{p} = B\mathbf{c}$$

$$
\frac{\mathbf{p}^T A\mathbf{p}}{\|\mathbf{p}\|^2}
= \frac{\mathbf{c}^T B^T (B\Lambda B^T) B\mathbf{c}}{\mathbf{c}^T B^T B\mathbf{c}}
= \frac{\mathbf{c}^T \Lambda\mathbf{c}}{\mathbf{c}^T\mathbf{c}}
= \frac{\sum_{i=1}^n \lambda_i c_i^2}{\sum_{i=1}^n c_i^2}
$$

$$\lambda_{\min} \leq \frac{\mathbf{p}^T A\mathbf{p}}{\|\mathbf{p}\|^2} \leq \lambda_{\max}$$
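A minimal numerical check of these bounds (my own sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # eigenvalues 1 and 3
rayleigh = lambda p: p @ A @ p / (p @ p)

rng = np.random.default_rng(0)
for _ in range(5):
    # every direction gives a curvature between lambda_min and lambda_max
    print(f"{rayleigh(rng.standard_normal(2)):.3f}")
```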
Eigenvector (Largest Eigenvalue)
$$\mathbf{p} = \mathbf{z}_{\max}
\qquad
\mathbf{c} = B^T\mathbf{p} = B^T\mathbf{z}_{\max} =
\begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
$$

(a single 1 in the position corresponding to λ_max)

$$
\frac{\mathbf{z}_{\max}^T A\,\mathbf{z}_{\max}}{\|\mathbf{z}_{\max}\|^2}
= \frac{\sum_{i=1}^n \lambda_i c_i^2}{\sum_{i=1}^n c_i^2}
= \lambda_{\max}
$$

The eigenvalues represent curvature (second derivatives) along the eigenvectors (the principal axes).

[Figure: contour plot with the eigenvectors z_1 and z_2 drawn as principal axes; curvature is lowest along the eigenvector with the smallest eigenvalue and highest along the eigenvector with the largest eigenvalue.]
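The bounds from the previous slide are attained exactly along the eigenvectors; a small self-contained numpy check (my own illustration, same matrix as before):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
rayleigh = lambda p: p @ A @ p / (p @ p)

lam, B = np.linalg.eigh(A)     # eigenpairs, eigenvalues in ascending order
print(rayleigh(B[:, 0]))       # 1.0 -> lambda_min along its eigenvector
print(rayleigh(B[:, 1]))       # 3.0 -> lambda_max along its eigenvector
```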
Circular Hollow
$$F(\mathbf{x}) = x_1^2 + x_2^2 = \frac{1}{2}\mathbf{x}^T \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} \mathbf{x}$$

$$\nabla^2 F(\mathbf{x}) = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}
\qquad \lambda_1 = 2,\ \mathbf{z}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\qquad \lambda_2 = 2,\ \mathbf{z}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

(Any two independent vectors in the plane would work.)

[Figure: surface and contour plots over -2 <= x_1, x_2 <= 2; the contours are circles.]
Elliptical Hollow
$$F(\mathbf{x}) = x_1^2 + x_1x_2 + x_2^2 = \frac{1}{2}\mathbf{x}^T \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \mathbf{x}$$

$$\nabla^2 F(\mathbf{x}) = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
\qquad \lambda_1 = 1,\ \mathbf{z}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}
\qquad \lambda_2 = 3,\ \mathbf{z}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

[Figure: surface and contour plots over -2 <= x_1, x_2 <= 2; the contours are ellipses, elongated along z_1, the direction of smallest curvature.]

Elongated Saddle
$$F(\mathbf{x}) = -\frac{1}{4}x_1^2 - \frac{3}{2}x_1x_2 - \frac{1}{4}x_2^2
= \frac{1}{2}\mathbf{x}^T \begin{bmatrix} -0.5 & -1.5 \\ -1.5 & -0.5 \end{bmatrix} \mathbf{x}$$

$$\nabla^2 F(\mathbf{x}) = \begin{bmatrix} -0.5 & -1.5 \\ -1.5 & -0.5 \end{bmatrix}
\qquad \lambda_1 = 1,\ \mathbf{z}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}
\qquad \lambda_2 = -2,\ \mathbf{z}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

[Figure: surface and contour plots of the saddle over -2 <= x_1, x_2 <= 2.]

Stationary Valley
$$F(\mathbf{x}) = \frac{1}{2}x_1^2 - x_1x_2 + \frac{1}{2}x_2^2
= \frac{1}{2}\mathbf{x}^T \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \mathbf{x}$$

$$\nabla^2 F(\mathbf{x}) = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}
\qquad \lambda_1 = 2,\ \mathbf{z}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}
\qquad \lambda_2 = 0,\ \mathbf{z}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

[Figure: surface and contour plots over -2 <= x_1, x_2 <= 2; F(x) is zero along the entire line x_1 = x_2 (the zero-curvature direction z_2), a weak minimum.]

Quadratic Function Summary
- If the eigenvalues of the Hessian matrix are all positive, the function will have a single strong minimum.
- If the eigenvalues are all negative, the function will have a single strong maximum.
- If some eigenvalues are positive and others are negative, the function will have a single saddle point.
- If the eigenvalues are all nonnegative but some are zero, the function will have either a weak minimum or no stationary point.
- If the eigenvalues are all nonpositive but some are zero, the function will have either a weak maximum or no stationary point.

Stationary point:

$$\mathbf{x}^* = -A^{-1}\mathbf{d}$$
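Putting the summary and the stationary-point formula together, a hedged numpy sketch (my own illustration, assuming a symmetric A):

```python
import numpy as np

def classify_quadratic(A, d):
    """Classify F(x) = 0.5 x^T A x + d^T x + c by the eigenvalues of A."""
    lam = np.linalg.eigvalsh(A)
    if np.all(lam > 0):
        kind = "strong minimum"
    elif np.all(lam < 0):
        kind = "strong maximum"
    elif np.any(lam > 0) and np.any(lam < 0):
        kind = "saddle point"
    else:
        kind = "weak minimum/maximum or no stationary point"
    # x* = -A^{-1} d exists only when A is nonsingular
    x_star = np.linalg.solve(A, -d) if np.all(np.abs(lam) > 1e-12) else None
    return kind, x_star

print(classify_quadratic(np.array([[2.0, 1.0], [1.0, 2.0]]),
                         np.array([1.0, 0.0])))
# -> ('strong minimum', array([-0.6667, 0.3333])) approximately
```

For a singular Hessian (such as the stationary valley), x* = -A^{-1}d does not exist, matching the weak-minimum cases in the summary.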
