Genetic Algorithms

Genetic Algorithms
18-Dec-14
Evolution
Heres a very oversimplified description of how evolution works

in biology
Organisms (animals or plants) produce a number of offspring
which are almost, but not entirely, like themselves
Some of these offspring may survive to produce offspring of their

ownsome wont
Variation may be due to mutation (random changes)

Variation may be due to sexual reproduction (offspring have some
characteristics from each parent)
The better adapted offspring are more likely to survive

Over time, later generations become better and better adapted
Genetic algorithms use this same process to evolve better

programs
2
Genotypes and phenotypes
Genes are the basic instructions for building an

organism
A chromosome is a sequence of genes
Biologists distinguish between an organisms genotype
(the genes and chromosomes) and its phenotype (what
the organism actually is like)
Example: You might have genes to be tall, but never grow to

be tall for other reasons (such as poor diet)
Similarly, genes may describe a possible solution to a

problem, without actually being the solution
3
The basic genetic algorithm
Start with a large population of randomly generated

attempted solutions to a problem
Repeatedly do the following:
Evaluate each of the attempted solutions
Keep a subset of these solutions (the best ones)
Use these solutions to generate a new population
Quit when you have a satisfactory solution (or you run out of time)
A really simple example
Suppose your organisms are 32-bit computer words

You want a string in which all the bits are ones
Heres how you can do it:
Create 100 randomly generated computer words

Count the 1 bits in each word
Exit if any of the words have all 32 bits set to 1
Keep the ten words that have the most 1s (discard the rest)
From each word, generate 9 new words as follows:
Pick a random bit in the word and toggle (change) it
Note that this procedure does not guarantee that the next
generation will have more 1 bits, but its likely
5
A more realistic example, part I
Suppose you have a large number of (x, y) data points
For example, (1.0, 4.1), (3.1, 9.5), (-5.2, 8.6), ...
You would like to fit a polynomial (of up to degree 5) through

these data points
That is, you want a formula y = ax5 + bx4 + cx3 + dx2 +ex + f that gives
you a reasonably good fit to the actual data
Heres the usual way to compute goodness of fit:
Compute the sum of (actual y predicted y)2 for all the data points
The lowest sum represents the best fit
There are some standard curve fitting techniques, but lets

assume you dont know about them
You can use a genetic algorithm to find a pretty good solution
6
A more realistic example, part II
Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f

Your genes are a, b, c, d, e, and f
Your chromosome is the array [a, b, c, d, e, f]
Your evaluation function for one array is:
For every actual data point (x, y), (Im using red to mean actual data)
Compute = ax5 + bx4 + cx3 + dx2 +ex + f

Find the sum of (y )2 over all x
The sum is your measure of badness (larger numbers are worse)
Example: For [0, 0, 0, 2, 3, 5] and the data points (1, 12) and (2, 22):
= 0x5 + 0x4 + 0x3 + 2x2 +3x + 5 is 2 + 3 + 5 = 10 when x is 1

= 0x5 + 0x4 + 0x3 + 2x2 +3x + 5 is 8 + 6 + 5 = 19 when x is 2
(12 10)2 + (22 19)2 = 22 + 32 = 13
If these are the only two data points, the badness of [0, 0, 0, 2, 3, 5] is 13
7
A more realistic example, part III
Your algorithm might be as follows:
Create 100 six-element arrays of random numbers

Repeat 500 times (or any other number):
For each of the 100 arrays, compute its badness (using all data
points)
Keep the ten best arrays (discard the other 90)
From each array you keep, generate nine new arrays as
follows:
Pick a random element of the six
Pick a random floating-point number between 0.0 and 2.0
Multiply the random element of the array by the random
floating-point number
After all 500 trials, pick the best array as your final answer
Asexual vs. sexual reproduction
In the examples so far,
Each organism (or solution) had only one parent

Reproduction was asexual (without sex)
The only way to introduce variation was through mutation
(random changes)
In sexual reproduction,
Each organism (or solution) has two parents

Assuming that each organism has just one chromosome, new
offspring are produced by forming a new chromosome from
parts of the chromosomes of each parent
The really simple example again
Suppose your organisms are 32-bit computer words,

and you want a string in which all the bits are ones
Heres how you can do it:
Create 100 randomly generated computer words

Count the 1 bits in each word
Exit if any of the words have all 32 bits set to 1
Keep the ten words that have the most 1s (discard the rest)
From each word, generate 9 new words as follows:
Choose one of the other words
Take the first half of this word and combine it with the
second half of the other word
10
The example continued
Half from one, half from the other:

0110 1001 0100 1110 1010 1101 1011 0101
1101 0100 0101 1010 1011 0100 1010 0101
0110 1001 0100 1110 1011 0100 1010 0101
Or we might choose genes (bits) randomly:
0110 1001 0100 1110 1010 1101 1011 0101
1101 0100 0101 1010 1011 0100 1010 0101
0100 0101 0100 1010 1010 1100 1011 0101
Or we might consider a gene to be a larger unit:

0110 1001 0100 1110 1010 1101 1011 0101
1101 0100 0101 1010 1011 0100 1010 0101
1101 1001 0101 1010 1010 1101 1010 0101
11
Comparison of simple examples
In the simple example (trying to get all 1s):
The sexual (two-parent, no mutation) approach, if it succeeds,

is likely to succeed much faster
However, with no mutation, it may not succeed at all
Because up to half of the bits change each time, not just one bit
By pure bad luck, maybe none of the first (randomly generated) words
have (say) bit 17 set to 1
Then there is no way a 1 could ever occur in this position
Another problem is lack of genetic diversity
Maybe some of the first generation did have bit 17 set to 1, but
none of them were selected for the second generation
The best technique in general turns out to be sexual

reproduction with a small probability of mutation
12
Curve fitting with sexual reproduction
Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f

Your genes are a, b, c, d, e, and f
Your chromosome is the array [a, b, c, d, e, f]
Whats the best way to combine two chromosomes into
one?
You could choose the first half of one and the second half of
the other: [a, b, c, d, e, f]
You could choose genes randomly: [a, b, c, d, e, f]
You could compute gene averages:
[(a+a)/2, (b+b)/2, (c+c)/2, (d+d)/2, (e+e)/2,(f+f)/2]
I suspect this last may be the best, though I dont know of a good
biological analogy for it
13
Directed evolution
Notice that, in the previous examples, we formed the

child organisms randomly
We did not try to choose the best genes from each parent
This is how natural (biological) evolution works
Biological evolution is not directedthere is no goal
Genetic algorithms use biology as inspiration, not as a set of

rules to be slavishly followed
For trying to get a word of all 1s, there is an obvious

measure of a good gene
But thats mostly because its a silly example

Its much harder to detect a good gene in the curve-fitting
problem, harder still in almost any real use of a genetic
algorithm
14
Probabilistic matching
In previous examples, we chose the N best organisms

as parents for the next generation
A more common approach is to choose parents
randomly, based on their measure of goodness
Thus, an organism that is twice as good as another is likely

to have twice as many offspring
This has a couple of advantages:
The best organisms will contribute the most to the next

generation
Since every organism has some chance of being a parent, there
is somewhat less loss of genetic diversity
15
Genetic programming
A string of bits could represent a program

If you want a program to do something, you might try to
evolve one
As a concrete example, suppose you want a program to
help you choose stocks in the stock market
There is a huge amount of data, going back many years

What data has the most predictive value?
Whats the best way to combine this data?
A genetic program is possible in theory, but it might

take millions of years to evolve into something useful
How can we improve this?

16
Shrinking the search space
There are just too many possible bit patterns!
An incredible improvement would result if we could

somehow restrict the search space to only valid (even if
nonsensical) programs
99.9999% of these dont even represent valid programs
We can do this!
Programs, as you should know by now, can be

represented as trees
Internal nodes are operators: +, *, if, while, ...

Leaves are values: 2.71818, "AAPL", ...
17
Programs as trees
Given a program represented as a tree, we can mutate it

by changing one of its operators (or one of its values),
or by adding or removing nodes
Given two trees, we can form a new tree by taking parts
of its two parents
The next big problem: How do we evaluate program
trees that are (initially) nothing at all like what we
want?
I realize this is all very vagueI just wanted to give
you the general idea
18
Concluding remarks
Genetic algorithms are
Fun! They are enjoyable to program and to work with
This is probably why they are a subject of active research
Mind-bogglingly slowyou dont want to use them if you

have any alternatives
Good for a very few types of problems
Genetic algorithms can sometimes come up with a solution when you

can see no other way of tackling the problem
19
The End
20

Genetic Algorithms

Загружено:

Сведения о документе

Авторское право

Доступные форматы

Поделиться этим документом

Поделиться или встроить документ

Параметры публикации

Этот документ был вам полезен?

Это неприемлемый материал?

Авторское право:

Доступные форматы

Genetic Algorithms

Загружено:

Авторское право:

Доступные форматы

Genetic Algorithms

Heres a very oversimplified description of how evolution works

Some of these offspring may survive to produce offspring of their

Variation may be due to mutation (random changes)

The better adapted offspring are more likely to survive

Genetic algorithms use this same process to evolve better

Genotypes and phenotypes

Genes are the basic instructions for building an

Example: You might have genes to be tall, but never grow to

Similarly, genes may describe a possible solution to a

The basic genetic algorithm

Start with a large population of randomly generated

A really simple example

Suppose your organisms are 32-bit computer words

Create 100 randomly generated computer words

A more realistic example, part I

Suppose you have a large number of (x, y) data points

For example, (1.0, 4.1), (3.1, 9.5), (-5.2, 8.6), ...

You would like to fit a polynomial (of up to degree 5) through

There are some standard curve fitting techniques, but lets

A more realistic example, part II

Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f

Compute = ax5 + bx4 + cx3 + dx2 +ex + f

= 0x5 + 0x4 + 0x3 + 2x2 +3x + 5 is 2 + 3 + 5 = 10 when x is 1

A more realistic example, part III

Your algorithm might be as follows:

Create 100 six-element arrays of random numbers

Asexual vs. sexual reproduction

In the examples so far,

Each organism (or solution) had only one parent

Each organism (or solution) has two parents

The really simple example again

Suppose your organisms are 32-bit computer words,

Create 100 randomly generated computer words

The example continued

Half from one, half from the other:

Or we might consider a gene to be a larger unit:

Comparison of simple examples

In the simple example (trying to get all 1s):

The sexual (two-parent, no mutation) approach, if it succeeds,

However, with no mutation, it may not succeed at all

The best technique in general turns out to be sexual

Curve fitting with sexual reproduction

Your formula is y = ax5 + bx4 + cx3 + dx2 +ex + f

Notice that, in the previous examples, we formed the

Biological evolution is not directedthere is no goal

Genetic algorithms use biology as inspiration, not as a set of

For trying to get a word of all 1s, there is an obvious

But thats mostly because its a silly example

In previous examples, we chose the N best organisms

Thus, an organism that is twice as good as another is likely

This has a couple of advantages:

The best organisms will contribute the most to the next

A string of bits could represent a program

There is a huge amount of data, going back many years

A genetic program is possible in theory, but it might

How can we improve this?

Shrinking the search space

There are just too many possible bit patterns!

An incredible improvement would result if we could

99.9999% of these dont even represent valid programs

Programs, as you should know by now, can be

Internal nodes are operators: +, *, if, while, ...

Given a program represented as a tree, we can mutate it