9 Performance Optimization

Basic Optimization Algorithm
$$x_{k+1} = x_k + \alpha_k p_k$$

or

$$\Delta x_k = (x_{k+1} - x_k) = \alpha_k p_k$$

$p_k$ - search direction
$\alpha_k$ - learning rate
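The update rule above can be sketched as a generic iteration. The function names and callback signatures here are illustrative choices, not from the slides:

```python
import numpy as np

def optimize(x0, search_direction, learning_rate, n_steps):
    """Basic optimization iteration: x_{k+1} = x_k + alpha_k * p_k.

    `search_direction` and `learning_rate` are illustrative callbacks
    returning p_k and alpha_k for the current iterate.
    """
    x = np.asarray(x0, dtype=float)
    for k in range(n_steps):
        p_k = search_direction(x, k)        # search direction p_k
        alpha_k = learning_rate(x, p_k, k)  # learning rate alpha_k
        x = x + alpha_k * p_k               # the basic update
    return x

# Minimizing f(x) = x^2 with p_k = -f'(x) and a fixed learning rate:
x_min = optimize(1.0, lambda x, k: -2.0 * x, lambda x, p, k: 0.1, 50)
```

Different choices of $p_k$ and $\alpha_k$ give the algorithms developed on the following slides.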
Steepest Descent

Choose the next step so that the function decreases:

$$F(x_{k+1}) < F(x_k)$$

Expanding to first order, $F(x_{k+1}) = F(x_k + \Delta x_k) \approx F(x_k) + g_k^T \Delta x_k$, where

$$g_k \equiv \nabla F(x) \Big|_{x = x_k}$$

A decrease is guaranteed (for small $\alpha_k$) by stepping against the gradient:

$$x_{k+1} = x_k - \alpha_k g_k$$
Example

$$F(x) = x_1^2 + 2 x_1 x_2 + 2 x_2^2 + x_1, \qquad x_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}, \qquad \alpha = 0.1$$

$$\nabla F(x) = \begin{bmatrix} \dfrac{\partial F(x)}{\partial x_1} \\[4pt] \dfrac{\partial F(x)}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 2 x_1 + 2 x_2 + 1 \\ 2 x_1 + 4 x_2 \end{bmatrix}, \qquad g_0 = \nabla F(x) \Big|_{x = x_0} = \begin{bmatrix} 3 \\ 3 \end{bmatrix}$$

[Contour plot of $F(x)$ over $(x_1, x_2)$ showing the steepest-descent trajectory]
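As a check on this example, a short NumPy script (written for this note, not taken from the slides) runs steepest descent on the quadratic:

```python
import numpy as np

# F(x) = x1^2 + 2*x1*x2 + 2*x2^2 + x1, minimized by steepest descent
def grad_F(x):
    return np.array([2*x[0] + 2*x[1] + 1.0, 2*x[0] + 4*x[1]])

alpha = 0.1                       # fixed learning rate from the example
x = np.array([0.5, 0.5])          # x_0
trajectory = [x.copy()]
for _ in range(200):
    x = x - alpha * grad_F(x)     # x_{k+1} = x_k - alpha * g_k
    trajectory.append(x.copy())
# First step: x_1 = [0.5, 0.5] - 0.1*[3, 3] = [0.2, 0.2];
# the iterates then settle at the minimum of the quadratic, [-1, 0.5].
```

The minimum $[-1,\ 0.5]$ solves $A x + d = 0$ for the Hessian $A$ and linear term $d$ of this quadratic.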
Stable Learning Rates (Quadratic)

$$F(x) = \frac{1}{2} x^T A x + d^T x + c$$

$$\nabla F(x) = A x + d$$

$$x_{k+1} = x_k - \alpha g_k = x_k - \alpha (A x_k + d) \quad \Rightarrow \quad x_{k+1} = [I - \alpha A] x_k - \alpha d$$

Stability is determined by the eigenvalues of this matrix:

$$[I - \alpha A] z_i = z_i - \alpha A z_i = z_i - \alpha \lambda_i z_i = (1 - \alpha \lambda_i) z_i$$

($\lambda_i$ - eigenvalue of $A$.) The eigenvalues of $[I - \alpha A]$ are $(1 - \alpha \lambda_i)$.

Stability requirement:

$$|1 - \alpha \lambda_i| < 1 \quad \Rightarrow \quad \alpha < \frac{2}{\lambda_i} \quad \Rightarrow \quad \alpha < \frac{2}{\lambda_{max}}$$
Example

$$A = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix}, \qquad \lambda_1 = 0.764,\ z_1 = \begin{bmatrix} 0.851 \\ -0.526 \end{bmatrix}, \qquad \lambda_2 = 5.24,\ z_2 = \begin{bmatrix} 0.526 \\ 0.851 \end{bmatrix}$$

$$\alpha < \frac{2}{\lambda_{max}} = \frac{2}{5.24} = 0.38$$

[Contour plots of the trajectories for $\alpha = 0.37$ (stable) and $\alpha = 0.39$ (unstable)]
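The stability bound can be checked numerically. This sketch (NumPy, variable names mine) iterates just below and just above the limit:

```python
import numpy as np

# Quadratic F(x) = 1/2 x^T A x + d^T x with the Hessian from the example;
# the script itself is illustrative, not from the slides.
A = np.array([[2.0, 2.0], [2.0, 4.0]])
d = np.array([1.0, 0.0])
x_star = np.linalg.solve(A, -d)          # minimum: A x + d = 0

lam_max = np.linalg.eigvalsh(A).max()    # largest eigenvalue, about 5.24
alpha_limit = 2.0 / lam_max              # stability limit, about 0.38

def run(alpha, n_steps=100):
    x = np.array([0.5, 0.5])
    for _ in range(n_steps):
        x = x - alpha * (A @ x + d)      # steepest descent step
    return x

x_below = run(0.37)   # just below the limit: converges toward x_star
x_above = run(0.39)   # just above the limit: oscillation grows
```

With $\alpha = 0.39$ the eigenvalue $(1 - \alpha \lambda_{max})$ has magnitude greater than one, so the error component along $z_2$ grows at every step.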
Minimizing Along a Line

Choose $\alpha_k$ to minimize $F(x_k + \alpha_k p_k)$:

$$\frac{d}{d\alpha_k} F(x_k + \alpha_k p_k) = \nabla F(x)^T \Big|_{x = x_k} p_k + \alpha_k \, p_k^T \, \nabla^2 F(x) \Big|_{x = x_k} p_k$$

Setting the derivative to zero:

$$\alpha_k = -\frac{\nabla F(x)^T \Big|_{x = x_k} p_k}{p_k^T \, \nabla^2 F(x) \Big|_{x = x_k} p_k} = -\frac{g_k^T p_k}{p_k^T A_k p_k}$$

where

$$A_k \equiv \nabla^2 F(x) \Big|_{x = x_k}$$
Example

$$F(x) = \frac{1}{2} x^T \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix} x + \begin{bmatrix} 1 & 0 \end{bmatrix} x, \qquad x_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$$

$$\nabla F(x) = \begin{bmatrix} 2 x_1 + 2 x_2 + 1 \\ 2 x_1 + 4 x_2 \end{bmatrix}, \qquad p_0 = -g_0 = -\nabla F(x) \Big|_{x = x_0} = -\begin{bmatrix} 3 \\ 3 \end{bmatrix}$$

$$\alpha_0 = -\frac{g_0^T p_0}{p_0^T A p_0} = \frac{\begin{bmatrix} 3 & 3 \end{bmatrix} \begin{bmatrix} 3 \\ 3 \end{bmatrix}}{\begin{bmatrix} 3 & 3 \end{bmatrix} \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} 3 \\ 3 \end{bmatrix}} = \frac{18}{90} = 0.2, \qquad x_1 = x_0 - \alpha_0 g_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - 0.2 \begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} -0.1 \\ -0.1 \end{bmatrix}$$
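The same arithmetic, as a small NumPy check (the variable names are mine):

```python
import numpy as np

A = np.array([[2.0, 2.0], [2.0, 4.0]])   # Hessian of the quadratic
d = np.array([1.0, 0.0])
x0 = np.array([0.5, 0.5])

g0 = A @ x0 + d                          # gradient at x0: [3, 3]
p0 = -g0                                 # steepest-descent direction
alpha0 = -(g0 @ p0) / (p0 @ A @ p0)      # minimizing step length: 0.2
x1 = x0 + alpha0 * p0                    # lands at [-0.1, -0.1]
```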
Plot

[Contour plot of $F(x)$ with the line-minimization trajectory]

At the minimizing step length the derivative along the search direction is zero, so each new gradient is orthogonal to the previous search direction:

$$\frac{d}{d\alpha_k} F(x_k + \alpha_k p_k) = \frac{d}{d\alpha_k} F(x_{k+1}) = \nabla F(x)^T \Big|_{x = x_{k+1}} \frac{d}{d\alpha_k} [x_k + \alpha_k p_k] = \nabla F(x)^T \Big|_{x = x_{k+1}} p_k = g_{k+1}^T p_k$$
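This orthogonality can be verified on the running example (a NumPy sketch, not from the slides):

```python
import numpy as np

# Check that g_{k+1}^T p_k = 0 after an exact line minimization
A = np.array([[2.0, 2.0], [2.0, 4.0]])
d = np.array([1.0, 0.0])
x = np.array([0.5, 0.5])

g = A @ x + d                            # g_k
p = -g                                   # p_k
alpha = -(g @ p) / (p @ A @ p)           # minimizing step length
x_next = x + alpha * p
g_next = A @ x_next + d                  # g_{k+1}

inner = g_next @ p                       # should vanish
```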
Newton's Method

$$F(x_{k+1}) = F(x_k + \Delta x_k) \approx F(x_k) + g_k^T \Delta x_k + \frac{1}{2} \Delta x_k^T A_k \Delta x_k$$

Setting the gradient of this quadratic approximation to zero:

$$g_k + A_k \Delta x_k = 0$$

$$\Delta x_k = -A_k^{-1} g_k$$

$$x_{k+1} = x_k - A_k^{-1} g_k$$
Example

$$F(x) = x_1^2 + 2 x_1 x_2 + 2 x_2^2 + x_1, \qquad x_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$$

$$\nabla F(x) = \begin{bmatrix} 2 x_1 + 2 x_2 + 1 \\ 2 x_1 + 4 x_2 \end{bmatrix}, \qquad g_0 = \nabla F(x) \Big|_{x = x_0} = \begin{bmatrix} 3 \\ 3 \end{bmatrix}, \qquad A = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix}$$

$$x_1 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix}^{-1} \begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - \begin{bmatrix} 1 & -0.5 \\ -0.5 & 0.5 \end{bmatrix} \begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - \begin{bmatrix} 1.5 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 0.5 \end{bmatrix}$$
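Since $F$ is quadratic, the second-order approximation is exact and a single Newton step lands on the minimum. A quick check (NumPy, written for this note):

```python
import numpy as np

# One Newton step on F(x) = x1^2 + 2*x1*x2 + 2*x2^2 + x1
A = np.array([[2.0, 2.0], [2.0, 4.0]])   # Hessian, constant for a quadratic
x0 = np.array([0.5, 0.5])
g0 = A @ x0 + np.array([1.0, 0.0])       # gradient at x0: [3, 3]

# x_{k+1} = x_k - A^{-1} g_k, solved rather than explicitly inverted
x1 = x0 - np.linalg.solve(A, g0)
# For a quadratic the step is exact, so x1 is the minimum: [-1, 0.5]
```

Using `np.linalg.solve` instead of forming the inverse is the standard numerical practice; it computes the same step.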
Plot

[Contour plot of $F(x)$ with the Newton-step trajectory]
Non-Quadratic Example

$$F(x) = (x_2 - x_1)^4 + 8 x_1 x_2 - x_1 + x_2 + 3$$

[Contour plots of $F(x)$ and of $F_2(x)$]
Different Initial Conditions

[Two rows of contour plots - $F(x)$ and $F_2(x)$ - each showing trajectories from three different initial conditions]
Conjugate Vectors

$$F(x) = \frac{1}{2} x^T A x + d^T x + c$$

A set of vectors $\{p_k\}$ is mutually conjugate with respect to a positive definite Hessian matrix $A$ if

$$p_k^T A p_j = 0, \qquad k \neq j$$

(One set of conjugate vectors is the eigenvectors of $A$: $z_k^T A z_j = \lambda_j z_k^T z_j = 0$ for $k \neq j$, since the eigenvectors of a symmetric matrix are orthogonal.)

$$\nabla F(x) = A x + d, \qquad \nabla^2 F(x) = A$$

The change in the gradient at iteration $k$ is $\Delta g_k = g_{k+1} - g_k$, where

$$\Delta x_k = (x_{k+1} - x_k) = \alpha_k p_k$$

The search directions are built from the gradients, $p_k = -g_k + \beta_k p_{k-1}$, where

$$\beta_k = \frac{\Delta g_{k-1}^T g_k}{\Delta g_{k-1}^T p_{k-1}} \quad \text{or} \quad \beta_k = \frac{g_k^T g_k}{g_{k-1}^T g_{k-1}} \quad \text{or} \quad \beta_k = \frac{\Delta g_{k-1}^T g_k}{g_{k-1}^T g_{k-1}}$$
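The eigenvector claim is easy to verify numerically on the Hessian from the earlier example (NumPy sketch, not from the slides):

```python
import numpy as np

A = np.array([[2.0, 2.0], [2.0, 4.0]])
lams, Z = np.linalg.eigh(A)        # eigenvalues (ascending) and eigenvectors
z1, z2 = Z[:, 0], Z[:, 1]

# Conjugacy: z1^T A z2 = lambda_2 * z1^T z2 = 0, because eigenvectors of a
# symmetric matrix are orthogonal.
conjugacy = z1 @ A @ z2
```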
Conjugate Gradient Algorithm

The first search direction is the negative of the gradient:

$$p_0 = -g_0$$

Example (the same quadratic as before):

$$F(x) = \frac{1}{2} x^T \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix} x + \begin{bmatrix} 1 & 0 \end{bmatrix} x, \qquad x_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}$$

$$\nabla F(x) = \begin{bmatrix} 2 x_1 + 2 x_2 + 1 \\ 2 x_1 + 4 x_2 \end{bmatrix}, \qquad p_0 = -g_0 = -\nabla F(x) \Big|_{x = x_0} = -\begin{bmatrix} 3 \\ 3 \end{bmatrix}$$

$$\alpha_0 = -\frac{g_0^T p_0}{p_0^T A p_0} = \frac{18}{90} = 0.2, \qquad x_1 = x_0 - \alpha_0 g_0 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} - 0.2 \begin{bmatrix} 3 \\ 3 \end{bmatrix} = \begin{bmatrix} -0.1 \\ -0.1 \end{bmatrix}$$
Example

$$g_1 = \nabla F(x) \Big|_{x = x_1} = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} -0.1 \\ -0.1 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0.6 \\ -0.6 \end{bmatrix}$$

$$\beta_1 = \frac{g_1^T g_1}{g_0^T g_0} = \frac{\begin{bmatrix} 0.6 & -0.6 \end{bmatrix} \begin{bmatrix} 0.6 \\ -0.6 \end{bmatrix}}{\begin{bmatrix} 3 & 3 \end{bmatrix} \begin{bmatrix} 3 \\ 3 \end{bmatrix}} = \frac{0.72}{18} = 0.04$$

$$p_1 = -g_1 + \beta_1 p_0 = -\begin{bmatrix} 0.6 \\ -0.6 \end{bmatrix} + 0.04 \begin{bmatrix} -3 \\ -3 \end{bmatrix} = \begin{bmatrix} -0.72 \\ 0.48 \end{bmatrix}$$

$$\alpha_1 = -\frac{g_1^T p_1}{p_1^T A p_1} = \frac{0.72}{0.576} = 1.25$$

$$x_2 = x_1 + \alpha_1 p_1 = \begin{bmatrix} -0.1 \\ -0.1 \end{bmatrix} + 1.25 \begin{bmatrix} -0.72 \\ 0.48 \end{bmatrix} = \begin{bmatrix} -1 \\ 0.5 \end{bmatrix}$$
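The full two-step run can be reproduced with a short script (NumPy, using the Fletcher-Reeves form of $\beta$; the code is written for this note, not taken from the slides):

```python
import numpy as np

# Conjugate gradient on F(x) = 1/2 x^T A x + d^T x
A = np.array([[2.0, 2.0], [2.0, 4.0]])
d = np.array([1.0, 0.0])
x = np.array([0.5, 0.5])

g = A @ x + d
p = -g                                    # p_0 = -g_0
alphas, betas = [], []
for _ in range(2):                        # two variables -> two steps suffice
    alpha = -(g @ p) / (p @ A @ p)        # exact line minimization
    x = x + alpha * p
    g_new = A @ x + d
    beta = (g_new @ g_new) / (g @ g)      # beta_k = g_k^T g_k / g_{k-1}^T g_{k-1}
    p = -g_new + beta * p                 # next conjugate direction
    g = g_new
    alphas.append(alpha)
    betas.append(beta)
# alphas = [0.2, 1.25], betas[0] = 0.04, and x ends at the minimum [-1, 0.5]
```

For an n-dimensional quadratic, conjugate gradient converges in at most n such steps, which is exactly what this 2-variable example shows.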
Plots

[Contour plot of the conjugate gradient trajectory over $(x_1, x_2)$]