
# Introduction

Optimization is viewed as a decision problem that involves finding the best values of the decision variables over all possibilities. The best values give the smallest objective function value for a minimization problem, or the largest objective function value for a maximization problem. In real-world applications, the objective function is often a representation of some physically significant measure such as profit, loss, utility, risk or error.


## Evolution in Biology I

Organisms produce a number of offspring similar to themselves, but with variations due to:

- Mutations (random changes)
- Sexual reproduction (offspring have combinations of features inherited from each parent)

## Evolution in Biology II

Some offspring survive and produce the next generation, and some don't:

- Organisms better adapted to the environment have a higher chance of surviving
- Over time, the generations become more and more adapted, because the fittest organisms survive

## Differential Evolution

- Population-based optimisation algorithm
- Introduced by Storn and Price in 1996
- Many practical problems have objective functions that are non-differentiable, hence gradient-based methods cannot be employed
- Developed to optimise mixed integer-discrete-continuous non-linear programming problems, as opposed to the continuous non-linear programming handled by most algorithms

## Why use Differential Evolution?

All evolutionary algorithms use predefined probability density functions (PDFs) for parameter perturbation. These PDFs have to be adapted throughout the optimization, a process that imposes additional complexity on the optimization procedure. DE is self-adjusting, as it deduces the perturbation information from the distances between the vectors that comprise the population.

## Problem

## The population

A population is a congregation of individuals. Two important quantities are Np, the population size, and D, the dimensionality, i.e. the number of variables in X. E.g. for X = [x1; x2; x3], D = 3; taking a population size of 5 gives Np = 5.

The DE algorithm proceeds through four steps:

1. Initialisation
2. Mutation
3. Cross-over
4. Selection

## Initialization of the population

Each xi is a D-dimensional initialization vector; bL and bU indicate the lower and upper bounds of the parameters xj,i. The random number generator, randj[0,1), returns a uniformly distributed random number from within the range [0,1), i.e., 0 ≤ randj[0,1) < 1. The subscript j indicates that a new random value is generated for each parameter:

xj,i = bL,j + randj[0,1) · (bU,j − bL,j)
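The initialization step above can be sketched in a few lines. This is a minimal illustration, not the authors' code; the function name and seed are assumptions.

```python
import random

def init_population(Np, D, bL, bU, seed=0):
    """Initialize Np vectors of dimension D, each parameter drawn
    uniformly from [bL[j], bU[j]) with a fresh random number per parameter."""
    rng = random.Random(seed)
    return [[bL[j] + rng.random() * (bU[j] - bL[j]) for j in range(D)]
            for _ in range(Np)]

# Population of 5 two-dimensional vectors within [-2.048, 2.048), as in the example.
pop = init_population(Np=5, D=2, bL=[-2.048, -2.048], bU=[2.048, 2.048])
```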

## DE Mutation Operator

DE generates new parameter vectors by adding the weighted difference between two parameter vectors to a third vector. For each target vector xi,G, i = 1, 2, …, NP, a mutant vector is generated according to:

vi,G+1 = xr1,G + F · (xr2,G − xr3,G)

where vi,G+1 is the mutant vector; r1, r2, r3 ∈ {1, 2, …, NP} are random integers, mutually different and different from the index i, which requires NP ≥ 4; and F is a real, constant factor in (0, 1+] that controls the amplification of the differential variation; a typical value is 0.8–0.9.
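A minimal sketch of the DE/rand/1 mutation step (the helper name is an assumption; the assertion reproduces the arithmetic of the worked example later in the slides):

```python
import random

def mutate(pop, i, F, rng):
    """DE/rand/1 mutation: v = x_r1 + F * (x_r2 - x_r3), with r1, r2, r3
    mutually different random indices, all different from i (needs NP >= 4)."""
    r1, r2, r3 = rng.sample([k for k in range(len(pop)) if k != i], 3)
    return [pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
            for j in range(len(pop[i]))]
```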


## Crossover

The classic variant of diversity enhancement is crossover, which mixes parameters of the mutation vector vi,g and the so-called target vector xi,g in order to generate the trial vector ui,g. The most common form of crossover is uniform and is defined as

uj,i,g = vj,i,g if randj[0,1) ≤ Cr or j = jrand, otherwise xj,i,g

The crossover probability, Cr ∈ [0,1], is a user-defined value that controls the fraction of parameter values that are copied from the mutant.
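Uniform (binomial) crossover can be sketched as follows; forcing index j_rand to come from the mutant guarantees the trial differs from the target in at least one parameter. The function name is an assumption.

```python
import random

def crossover(target, mutant, Cr, rng):
    """Uniform (binomial) crossover: each parameter is copied from the
    mutant with probability Cr; position j_rand is always taken from the
    mutant so the trial vector differs from the target somewhere."""
    D = len(target)
    j_rand = rng.randrange(D)
    return [mutant[j] if (rng.random() < Cr or j == j_rand) else target[j]
            for j in range(D)]
```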

(Figure: illustration of the crossover process for D = 4.)

## Selection

DE uses simple one-to-one survivor selection, where the trial vector ui,g competes against the target vector xi,g. The vector with the lower objective function value survives into the next generation g+1.
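One-to-one survivor selection is a single comparison (a sketch for minimization; ties go to the trial, as is common in DE implementations):

```python
def select(f, target, trial):
    """One-to-one survivor selection for minimization: the trial vector
    replaces the target only if its objective value is not worse."""
    return trial if f(trial) <= f(target) else target
```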

## Termination criteria

- a preset number of maximum generations is reached,
- the difference between the best and worst function values in the population is very small,
- the best function value has not improved beyond some tolerance value for a predefined number of generations,
- the distance between solution vectors in the population is small.

## Intrinsic Control Parameters of Differential Evolution

1. population size Np
2. mutation intensity F
3. crossover probability pc

First choice: the originators recommend Np/D = 10 (a population ten times the problem dimension), F = 0.8, and pc = 0.9. However, F = 0.5 and pc = 0.1 are also claimed to be a good first choice.

Different problems usually require different intrinsic control parameters. Increasing the population size is suggested if differential evolution does not converge; however, this only helps to a certain extent. The mutation intensity has to be adjusted a little lower or higher than 0.8 if the population size is increased. It is claimed that convergence is more likely to occur, but generally takes longer, with a larger population and weaker mutation (smaller F).

## DE dynamics

There are two opposing mechanisms that influence DE's population:

1. The population's tendency to expand and explore the terrain on which it is working; this also prevents DE's population from converging prematurely.
2. Selection, which counteracts expansion by removing vectors located in unproductive regions.

## Sample problem

f(x) = (1 − x1)² + 100 · (x2 − x1²)², with xj ∈ [−2.048, 2.048] and f*(x) = 0.0.

This is a two-variable problem, so D = 2. The demonstration covers the first two generations, so g = 2. We take a population size of 5, so Np = 5, with xL = −2.048 and xU = 2.048.
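The sample objective can be written directly (a sketch; the second assertion checks the function value 533.1020893 that the slides report for the first initial vector):

```python
def rosenbrock2(x):
    """Two-variable Rosenbrock function: f(x) = (1 - x1)^2 + 100*(x2 - x1^2)^2."""
    x1, x2 = x
    return (1.0 - x1) ** 2 + 100.0 * (x2 - x1 ** 2) ** 2
```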

## Step 1: Initialization

bL = −2.048, bU = 2.048. First, uniform random numbers Rand() are drawn for each variable:

| Pop | Var 1 | Var 2 |
|-----|--------|--------|
| 1 | 0.6557 | 0.0357 |
| 2 | 0.8491 | 0.9340 |
| 3 | 0.6787 | 0.7577 |
| 4 | 0.7431 | 0.3922 |
| 5 | 0.6555 | 0.1712 |

Then Rand() · (bU − bL) is generated:

| Pop | Var 1 | Var 2 |
|-----|--------|--------|
| 1 | 2.6859 | 0.1463 |
| 2 | 3.4780 | 3.8256 |
| 3 | 2.7801 | 3.1037 |
| 4 | 3.0439 | 1.6066 |
| 5 | 2.6848 | 0.7012 |

The initial vectors are then generated as x = bL + Rand() · (bU − bL):

| Pop | Var 1 | Var 2 |
|-----|--------|---------|
| 1 | 0.6379 | −1.9017 |
| 2 | 1.4300 | 1.7776 |
| 3 | 0.7321 | 1.0557 |
| 4 | 0.9959 | −0.4414 |
| 5 | 0.6368 | −1.3468 |

## Example of Mutation

vi,G+1 = xr1,G + F · (xr2,G − xr3,G), with F = 0.9.

| Pop | Var 1 | Var 2 |
|-----|--------|---------|
| 1 | 0.6379 | −1.9017 |
| 2 | 1.4300 | 1.7776 |
| 3 | 0.7321 | 1.0557 |
| 4 | 0.9959 | −0.4414 |
| 5 | 0.6368 | −1.3468 |

E.g., for the 1st variable of the first population member in the first generation: xr1 = 1.4300, xr2 = 0.6368, xr3 = 0.9959, giving the mutant value v = 1.4300 + 0.9 · (0.6368 − 0.9959) = 1.10681. Selection is not performed at this stage.

## Generation 2

The function values of the previous generation are:

533.1020893, 7.329829, 27.08365508, 205.4110593, 307.1924338

New (mutant) vectors are generated:

| Pop | Var 1 | Var 2 |
|-----|---------|---------|
| 1 | −1.6977 | −0.9729 |
| 2 | 1.2330 | −1.9283 |
| 3 | 1.7566 | 0.9434 |
| 4 | −0.0467 | 0.3216 |
| 5 | −1.0761 | −0.1686 |

## Cross-over

Random numbers generated for the crossover comparison:

0.1239, 0.4904, 0.8530, 0.2085, 0.2703
0.5650, 0.6403, 0.8739, 0.4170, 0.2060

Crossed population u(i,2):

| Pop | Var 1 | Var 2 |
|-----|---------|---------|
| 1 | −1.6977 | −0.9729 |
| 2 | 1.4300 | −1.9283 |
| 3 | 1.7566 | 0.9434 |
| 4 | −0.0456 | −0.4414 |
| 5 | −1.0761 | −0.1686 |

## Selection

f[u(i,2)]: 1493.445845, 1189.330898, 459.4931906, 11.29843767, 180.2946151

These trial values are compared one-to-one against f[x(i,2)], the function values of the current population, and the better vector of each pair survives.

## DE Algorithm for the Rosenbrock function

Minimize the generalized Rosenbrock function:

f(x) = Σ_{j=1}^{D/2} [ 100 · (x_{2j−1}² − x_{2j})² + (1 − x_{2j−1})² ]

with xj ∈ [−2.048, 2.048] and f*(x) = 0.0. If D = 2, the optimum f* = 0 lies at x1 = 1, x2 = 1, and a run is considered converged when f(x) ≤ 10⁻⁶. The Rosenbrock function with D = 2 is a difficult minimization problem because the global optimum lies inside a long, narrow, parabolic-shaped flat valley.

## Experiments and Results

The DE/rand/1/bin scheme is used with NP = 5, F = 0.9, CR = 0.8.

Results:

- Total time for 20 test runs (ms) = 156
- Evaluations (mean) = 108.750 (vs 654)
- Best minimum fitness value = 8.56340e-08
- Best member:
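The full DE/rand/1/bin loop can be assembled from the operators above. This is a minimal sketch, not the original experiment code: the exact timings and evaluation counts above will not be reproduced, a slightly larger population than NP = 5 is used for robustness, and no bound handling is applied after mutation (as in the basic scheme).

```python
import random

def rosenbrock2(x):
    x1, x2 = x
    return (1.0 - x1) ** 2 + 100.0 * (x2 - x1 ** 2) ** 2

def de_rand_1_bin(f, bounds, Np=20, F=0.9, Cr=0.8, max_gen=300, vtr=1e-6, seed=1):
    """Minimal DE/rand/1/bin for minimization. Stops when the best
    objective value reaches vtr, or after max_gen generations."""
    rng = random.Random(seed)
    D = len(bounds)
    pop = [[lo + rng.random() * (hi - lo) for (lo, hi) in bounds]
           for _ in range(Np)]
    fit = [f(x) for x in pop]
    for _ in range(max_gen):
        if min(fit) <= vtr:
            break
        for i in range(Np):
            # DE/rand/1 mutation
            r1, r2, r3 = rng.sample([k for k in range(Np) if k != i], 3)
            # binomial (uniform) crossover with forced index j_rand
            j_rand = rng.randrange(D)
            trial = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
                     if (rng.random() < Cr or j == j_rand) else pop[i][j]
                     for j in range(D)]
            # one-to-one survivor selection
            f_trial = f(trial)
            if f_trial <= fit[i]:
                pop[i], fit[i] = trial, f_trial
    best = min(range(Np), key=fit.__getitem__)
    return pop[best], fit[best]

best_x, best_f = de_rand_1_bin(rosenbrock2, [(-2.048, 2.048)] * 2)
```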

## The Difference Vector Distribution

All difference vectors have an equally probable negative counterpart, so the distribution of difference vectors is symmetric about zero.

## DE Algorithm for the Rosenbrock function

Minimize the Rosenbrock function:

f(x) = 100 · (x2 − x1²)² + (1 − x1)²

with xj ∈ [−5, 5] and f*(x) = 0.0. As before, for D = 2 the global optimum x1 = 1, x2 = 1 lies inside a long, narrow, parabolic-shaped flat valley, and a run is considered converged when f(x) ≤ 10⁻⁶.

The mutant vectors vi,G+1 = xr1,G + F · (xr2,G − xr3,G) are visualized on contour plots of the objective function at generations 1, 3, 5, 10, 20, 27, 40, 50 and 60. Each generation shows two panels on axes spanning [−5, 5]: one contour plot generated using three random vectors, and one generated using two random vectors and a deterministic vector corresponding to fbest. (Plots omitted.)

x1 = 0.9997; x2 = 0.9995

x1 = 0.4367; x2 = 0.7395

## Optimization with Constraints

A general formulation for constrained optimization is: minimize f(x) subject to inequality constraints gm(x) ≤ 0, m = 1, …, M, equality constraints n(x) = 0, and lower and upper bounds on the variables.

## Inequality Constraints

Most often, inequality constraints are implemented as penalty functions. One common way to integrate constraint violations into an optimization task is to multiply each penalty by a weight, wm, and add the result to the objective function, f(x).

Weights help normalize all penalties to the same range. Without normalization, penalty function contributions may differ by many orders of magnitude, leaving violations with small penalties underrepresented until those that generate large penalties become just as small. When there are many constraints, the main drawback of the penalty approach is that the pre-specified weights must be well chosen to keep the population from converging upon either infeasible or non-optimal vectors.
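The weighted-penalty idea can be sketched as follows. This is an illustration, not the text's exact formula: the squared-violation penalty is one common choice, and the example constraint is hypothetical.

```python
def penalized(f, constraints, weights):
    """Build a penalized objective f(x) + sum_m w_m * max(0, g_m(x))^2,
    where each inequality constraint is expressed as g_m(x) <= 0."""
    def f_pen(x):
        return f(x) + sum(w * max(0.0, g(x)) ** 2
                          for w, g in zip(weights, constraints))
    return f_pen

# Hypothetical example: minimize x^2 subject to x >= 1, i.e. g(x) = 1 - x <= 0.
fp = penalized(lambda x: x[0] ** 2, [lambda x: 1.0 - x[0]], [100.0])
```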

## Drawbacks of Penalty Functions

Schemes that sum penalty functions run the risk that one penalty will dominate unless the weights are correctly adjusted. In addition, the population can converge upon an infeasible region if its objective function values are much lower than those in feasible regions. It can even happen that no set of weights will work. Because weight selection tends to be a trial-and-error optimization problem in its own right, simpler direct constraint handling methods have been designed that do not require the user to tune penalty weights.

## Direct Constraint Handling

In simple terms, Lampinen's criterion selects the trial vector ui,g if:

1. ui,g satisfies all constraints and has a lower or equal objective function value than xi,g, or
2. ui,g is feasible and xi,g is not, or
3. ui,g and xi,g are both infeasible, but ui,g does not violate any constraint more than xi,g.
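The three rules above translate directly into a selection routine. This is a sketch under the assumption that each constraint is written as g_m(x) ≤ 0 with violation measured as max(0, g_m(x)); the function name is illustrative.

```python
def lampinen_select(f, constraints, target, trial):
    """Lampinen's criterion for minimization with constraints g_m(x) <= 0."""
    v_trial = [max(0.0, g(trial)) for g in constraints]
    v_target = [max(0.0, g(target)) for g in constraints]
    trial_ok = all(v == 0.0 for v in v_trial)
    target_ok = all(v == 0.0 for v in v_target)
    if trial_ok and target_ok:      # rule 1: both feasible, compare objectives
        return trial if f(trial) <= f(target) else target
    if trial_ok != target_ok:       # rule 2: feasibility wins
        return trial if trial_ok else target
    # rule 3: both infeasible; trial wins if it violates no constraint more
    return trial if all(a <= b for a, b in zip(v_trial, v_target)) else target
```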

## Lampinen's selection criterion

## Equality Constraints

Equality constraints can be written in the form n(x) = 0. One way to handle equality constraints is to use them to eliminate variables from the objective function.

## Eliminating variables

Eliminating an objective function variable with an equality constraint not only ensures that all vectors satisfy the constraint, but also reduces the problem's dimension by 1. In practice, not all equality constraint equations can be solved for a term that also appears in the objective function. When an equality constraint cannot be used to eliminate a variable, it can be recast as a pair of inequality constraints.

## Transformation into Inequality Constraints

Eliminating a variable is the only way to ensure that a solution exactly satisfies an equality constraint. Otherwise, the finite precision of floating-point number formats limits the degree to which an equality constraint can be satisfied. It is more reasonable, therefore, to demand that an equality constraint violation be less than ε, where ε can be made as small as desired: |n(x)| ≤ ε.
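The recasting of an equality constraint as a pair of inequalities can be sketched in one helper (an illustration; the constraint in the example is hypothetical):

```python
def equality_as_inequalities(h, eps):
    """Recast h(x) = 0 as the inequality pair h(x) - eps <= 0 and
    -h(x) - eps <= 0, i.e. require |h(x)| <= eps."""
    return [lambda x: h(x) - eps, lambda x: -h(x) - eps]

# Hypothetical constraint h(x) = x1 - 1 = 0, relaxed with eps = 1e-3.
g_pair = equality_as_inequalities(lambda x: x[0] - 1.0, 1e-3)
```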

## Penalty Approach

Like inequality constraints, equality constraints can be transformed into cost terms that penalize the objective function. Equation 4.29 shows that either the absolute value or the square of n(x) makes a suitable penalty. When all constraints are fulfilled, the penalty term, like n(x), becomes zero.
