Genetic Algorithms

School Timetabling using Genetic Search
Caldeira JP, Rosa AC

Laseeb - ISR IST email: acrosa@isr.ist.utl.pt
Abstract
In this paper we discuss the implementation of a
genetic algorithm that is used to produce
timetables for a small school. A problem-specific
chromosome representation and the use of a repair
algorithm after the genetic operators avoid
searching through illegal timetables. We also tested
the use of different fitness functions and present
results obtained with our prototype timetabling
system implemented in C.
1. Introduction
The school-timetabling problem is basically the
assignment of weekly lessons to time periods.
Existing solutions are either difficult to use or
produce inadequate timetables. Our objective is to
develop a program that can easily be used in a
typical school and that allows user interaction and
modification of parameter settings.
In section 2 we discuss the class-teacher-
timetabling problem in more detail. The genetic
algorithm (G.A.), chromosome representation and
the cost and fitness functions are explained in
section 3. Section 4 presents the genetic operators
and repair function. We finish off in section 5 with
computational results and conclusions.
2. The Class-Teacher Timetabling problem
The Class-Teacher Timetabling problem [1]
considers a set of classes and a set of
teachers. Each class is a set of students having a
common curriculum and studying together. For each
pair <class, teacher> the required lessons and their
numbers are defined. Data corresponding to the
periods of unavailability of each teacher must also
be present.
Each day of the week is divided into ten 60-minute
periods (time slots), which results in a total of 50
periods numbered from 0 to 49, as can be seen in Fig.1.
Time Mon. Tues. Wed. Thurs. Fri.
8-9 0 10 20 30 40
9-10 1 11 21 31 41
10-11 2 12 22 32 42
11-12 3 13 23 33 43
12-13 4 14 24 34 44
13-14 5 15 25 35 45
14-15 6 16 26 36 46
15-16 7 17 27 37 47
16-17 8 18 28 38 48
17-18 9 19 29 39 49
Fig.1. Division of week into periods (time slots)
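The numbering in Fig.1 amounts to period = 10 * day + hour index. A minimal sketch of that arithmetic follows; the names `period_to_slot` and `slot_to_period` are illustrative and do not come from the paper.

```python
DAYS = ["Mon", "Tues", "Wed", "Thurs", "Fri"]
PERIODS_PER_DAY = 10
FIRST_HOUR = 8  # the first period of each day runs from 8 to 9


def period_to_slot(period):
    """Return (day name, start hour) for a period number in 0..49."""
    if not 0 <= period < len(DAYS) * PERIODS_PER_DAY:
        raise ValueError("period out of range")
    day_index, hour_index = divmod(period, PERIODS_PER_DAY)
    return DAYS[day_index], FIRST_HOUR + hour_index


def slot_to_period(day_index, start_hour):
    """Inverse mapping: day index 0..4 and start hour 8..17 to a period number."""
    return day_index * PERIODS_PER_DAY + (start_hour - FIRST_HOUR)
```

For example, period 36 falls in the Thursday column and the 14-15 row of Fig.1.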
Each lesson must be assigned to a time period in
such a way that a number of requirements are met.
These requirements can be divided into 2
categories:
- hard constraints
- second-order (or soft) constraints
A timetable is feasible if and only if the following
hard constraints are satisfied:
- Each lesson is scheduled to exactly one period.
- There are no clashes at all: neither a class nor a
  teacher is assigned to more than one lesson in
  the same period.
- Teacher unavailabilities are considered.
In addition to the above conditions, good timetables
satisfy as many of the following soft constraints as
possible:
- Lesson continuity: students and teachers do
  not like timetables with gaps between lessons.
- A lunch break must be scheduled (one hour
  between 12 and 15 must be free).
- The number of lessons per day for a teacher or
  class cannot exceed a specified limit.
- Lessons of the same subject must be distributed
  uniformly over the week.
- As far as possible, classes should have their
  lessons either in the morning or in the afternoon.
- Classes and teachers prefer to have more
  lessons on some days, in order to have a day
  without lessons (free day).
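The hard constraints of this section can be checked mechanically. The sketch below assumes a flat list of (class, teacher, period) triples, which is an illustrative layout and not the chromosome representation of section 3; the function name and arguments are hypothetical.

```python
def is_feasible(lessons, unavailable):
    """Check the hard constraints for a list of scheduled lessons.

    lessons: iterable of (class_id, teacher_id, period) triples, one per
             lesson, so each lesson has exactly one period by construction.
    unavailable: dict mapping teacher_id to a set of periods in which
                 that teacher cannot teach.
    """
    class_busy = set()
    teacher_busy = set()
    for class_id, teacher_id, period in lessons:
        if not 0 <= period < 50:
            return False  # outside the weekly grid of Fig.1
        if (class_id, period) in class_busy:
            return False  # class clash
        if (teacher_id, period) in teacher_busy:
            return False  # teacher clash
        if period in unavailable.get(teacher_id, ()):
            return False  # teacher unavailability violated
        class_busy.add((class_id, period))
        teacher_busy.add((teacher_id, period))
    return True
```

Soft constraints are deliberately left out here, since the paper handles them through the cost function rather than through feasibility.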
3. The Algorithm
3.1. Chromosome Representation
The chromosome is made up of genes which,
depending on their position, contain the
time slot associated with each lesson in the class
and teacher timetables. As seen in Fig.2, the time
slot data is duplicated in the two genes that
represent a certain lesson of the class-teacher pair:
one in the class timetable and the other in the
teacher timetable. The number of genes of each
chromosome is, therefore, twice the total
number of lessons of all classes.
The header stores all information common to all
chromosomes, i.e. the information as to where in
each chromosome you can find the genes that
represent the lessons of a certain subject taught to a
certain class.
This chromosome representation was chosen
because it allows efficient chromosome
initialisation and repair. Observing Fig.2, we can
see that when any corresponding genes of any two
chromosomes are switched (by a crossover or
mutation operator, for example), the resulting
chromosomes automatically satisfy constraints
relative to the required lessons and their numbers
for each <class, teacher> pair. A timetable is only
valid, however, after possible resulting clashes have
been corrected by the repair function.
If a time slot were randomly associated with each
lesson, the size of the search space would be
$dim = n\_of\_classes \cdot 50^{total\_n\_of\_lessons}$.
For a timetable to be feasible, however, all hard
constraints (as defined in the previous section) must
be met. We can greatly reduce the search space if we
work within these restrictions, making it much faster
to reach good solutions. This is very important in a
highly constrained problem like this; otherwise, one
runs the risk of creating a genetic algorithm that
spends most of its time evaluating illegal
individuals [2].
3.2. Initialisation
The G.A. starts by checking a number of conditions
necessary for a feasible timetable to exist. The
initialisation procedure then randomly creates a
population of feasible solutions (solutions that
satisfy all hard constraints).
To determine which periods will be assigned to the
$n_i$ lessons of subject $s_i$ taught to class $c_i$,
the initialisation procedure:
- Calculates the set $F_I$ of the periods common to
  the set of free periods of the class (set $F_C$)
  and the set of free periods of the teacher
  (set $F_T$): $F_I = F_C \cap F_T$.
- Randomly selects $n_i$ periods of $F_I$ and
  removes them from $F_C$ and $F_T$, which are later
  used when these steps are repeated for the other
  subjects and classes.
- If the number of periods in $F_I$ is less than
  $n_i$, the chromosome is reinitialised.
The lesson allocation order is determined by the
scarcity of resources: lessons with a more limited
number of possible allocations are considered first.
This is also the case in the repair algorithm,
discussed later on.
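The initialisation steps above can be sketched as follows. The data layout, the function name `initialise`, the `max_attempts` cap and the exact scarcity measure (common free periods minus lessons required, computed once before allocation) are assumptions made for illustration; the paper does not specify them.

```python
import random


def initialise(requirements, class_free, teacher_free, rng=random, max_attempts=1000):
    """Build one feasible assignment of periods to lessons.

    requirements: list of (class_id, teacher_id, n_lessons).
    class_free / teacher_free: dict id -> set of free periods.
    Returns a dict (class_id, teacher_id) -> sorted list of periods.
    """
    # Assumed scarcity measure: least slack (common periods - lessons) first.
    order = sorted(
        requirements,
        key=lambda r: len(class_free[r[0]] & teacher_free[r[1]]) - r[2],
    )
    for _ in range(max_attempts):
        f_c = {c: set(p) for c, p in class_free.items()}
        f_t = {t: set(p) for t, p in teacher_free.items()}
        timetable = {}
        for class_id, teacher_id, n in order:
            f_i = f_c[class_id] & f_t[teacher_id]  # F_I = F_C intersect F_T
            if len(f_i) < n:
                break  # not enough common periods: reinitialise
            chosen = rng.sample(sorted(f_i), n)
            f_c[class_id] -= set(chosen)
            f_t[teacher_id] -= set(chosen)
            timetable[(class_id, teacher_id)] = sorted(chosen)
        else:
            return timetable
    raise RuntimeError("no feasible timetable found within max_attempts")
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing parameter settings.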
If resources are very scarce, the initialisation and
repair of a chromosome will take much longer
because these algorithms will frequently fail to
return valid timetables and therefore have to be
repeated.

[Figure 2: timetable of Class 1 and timetable of the
teacher of subject M, each drawn on the 50-period grid
of Fig.1, above the chromosome (LAB) and its header.
The header and gene values recoverable from the figure
are listed below, three genes per header entry.]

Section        Header        LAB (gene values)
Class 1        M F Q I       6 32 15 | 7 36 40 | 25 17 38 | 11 37 43
Class 4        M F Q I       34 45 22 | 34 2 21 | 4 31 23 | 43 12 29
Teacher of M   T1 T2 T3 T4   6 32 15 | 5 23 14 | 16 43 12 | 34 45 22
Teacher of I   T1 T2 T3 T4   7 36 40 | 43 32 27 | 5 42 23 | 34 2 21

Fig.2. Chromosome representation, showing how time
slot (period) data is duplicated in genes. For example,
the first genes in the Class 1 and teacher of M
timetables refer to the same lesson, and therefore the
corresponding gene data refer to the same period (6).
Although all members of the initial population are
feasible, they generally suffer from very poor
fitness. This is because the Initialisation routine
does not consider any soft-constraints. Handling of
the soft-constraints is left to the evolutionary
process of the genetic algorithm.
3.3. The Cost Function
The cost function is a very important part of any
G.A. It is responsible for determining the value of
each chromosome, allowing us to distinguish the
good from the bad and in this way guide the G.A. to
better solutions. Our cost function has the form:
$$Cost(Chromosome) = \sum_{classes} Cost(Timetable) + profscale \cdot \sum_{teachers} Cost(Timetable), \quad profscale \in [0, 1]$$
Although the costs of class and teacher timetables
are determined in the same way (by function
Cost(Timetable) ), it is generally more important for
classes to have good timetables, than for teachers.
The weight, profscale, allows us to tell the G.A.
how much more important the class timetables are.
The function Cost(Timetable) is composed of a
weighted sum of the penalty values imposed on the
violation of each of the soft constraints in section 2.
It can be written as:
$$Cost(Timetable) = \sum_{i=1}^{s} \sum_{j=1}^{ns} nlsd^{\,nd_{ij}-1} + \sum_{i=1}^{s} nld^{\,n_i-7} + gap \cdot n_g + MorA \cdot n_a + LH \cdot n_e + (n_o - n_{min}) \cdot OD$$
where $i$ indexes the days and $j$ the subjects.
Some of the penalty factors are linear and some are
exponential, and the user can change the weights of
the different penalties while the program is
running. The default values were chosen by fixing the
weight value $w_i$ of a certain penalty factor and
choosing the others relative to $w_i$ so that the
resulting timetables have the desired characteristics.
The first term of the timetable cost function,
$nlsd^{nd_{ij}-1}$, penalises timetables with more
than one hour of a certain subject per day and
therefore helps to distribute a subject's
lessons throughout the week, instead of
concentrating them on one day.
The default value for nlsd is 20 and for this value
the penalties are:
$nd_{ij}$ (number of lessons of subject $j$ on day $i$)   Penalty
1                                                        1
2                                                        20
3                                                        400
4                                                        8000
Tab.1: Evolution of the factor that penalises timetables
with more than two lessons of the same subject per day.
The default value allows the possibility of 2 lessons
of the same subject per day, but makes 3 lessons
difficult and 4 and more virtually impossible. These
results can only be accomplished through a non-
linear term.
The second term, $nld^{n_i-7}$, penalises a timetable
with more than seven lessons per day. For the
default value of 40 we have:
$n_i$ (number of lessons in day $i$)   Penalty
6                                      0
7                                      1
8                                      40
9                                      1600
Tab.2: Costs resulting from the factor that penalises
timetables with more than seven lessons per day.
These penalties allow seven lessons per day,
penalise 8 and make more than 8 virtually
impossible.
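The two exponential penalties can be reproduced with a few lines of arithmetic. This sketch assumes that a term contributes nothing below its threshold (no lessons of a subject on a day, or fewer than seven lessons in a day), which matches the zero listed for six lessons in Tab.2 but is not stated explicitly in the text; the function names are illustrative.

```python
def subject_day_penalty(nd, nlsd=20):
    """Penalty nlsd**(nd - 1) for nd lessons of one subject on one day."""
    return nlsd ** (nd - 1) if nd >= 1 else 0


def lessons_per_day_penalty(n, nld=40):
    """Penalty nld**(n - 7) for n lessons in one day, assumed zero below 7."""
    return nld ** (n - 7) if n >= 7 else 0


# Rebuild the rows of Tab.1 and Tab.2 from the default weights.
tab1 = [(nd, subject_day_penalty(nd)) for nd in range(1, 5)]
tab2 = [(n, lessons_per_day_penalty(n)) for n in range(6, 10)]
```

Because each extra lesson multiplies the penalty by the weight, a single day with four lessons of one subject (8000) costs far more than any combination of linear terms at their default weights.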
Great care must be taken on which terms are made
exponential. We must bear in mind that if the
exponent is large, an exponential factor will
outweigh all others. It is for this reason that we can
only use exponential factors when the maximum
value of the exponent is small. This was the case in
the first two terms of the timetable cost function.
In the next two terms, however, $n_g$ and $n_a$ may
have large values. Therefore, if we were to use
exponential terms, not only would the other terms of
the timetable cost function become irrelevant, but
the convergence of the G.A. would also be very slow.
The third term, $gap \cdot n_g$, penalises timetables
with spaces between lessons. The default value for gap
is 50 and $n_g$ is the total number of spaces between
lessons in the timetable.
The fourth term, $MorA \cdot n_a$, penalises timetables
with both morning and afternoon classes. The default
value for MorA is 50 and $n_a$ is the number of
lessons in the morning if there are more lessons in
the morning than in the afternoon; otherwise $n_a$ is
the number of lessons in the afternoon:
$$n_a = \begin{cases} n_m, & n_a' < n_m \\ n_a', & \text{otherwise} \end{cases}$$
where $n_m$ is the number of morning classes and
$n_a'$ is the number of afternoon classes.
The term $LH \cdot n_e$ ($n_e$ being the number of days
without a lunch hour) penalises timetables with days
without a lunch hour. The default value for LH is 500.
This large value makes it difficult for
timetables with a lunch hour to evolve into timetables
without one.
Finally, the last term, $(n_o - n_{min}) \cdot OD$, in
which $n_o$ is the number of days with lessons and
$n_{min}$ is the minimum number of days in which the
class could have all its lessons, penalises timetables
with more days occupied with lessons than necessary.
The weight OD should be chosen with the weight
MorA in mind: the larger OD is relative to MorA,
the more days will have lessons in both the morning
and the afternoon in order for some days to be without
lessons. The default value for OD is 175.
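Putting the six terms together, the cost of one timetable and of a whole chromosome can be sketched as below. The function signatures and the `DEFAULT_WEIGHTS` dictionary are illustrative; the counts $n_g$, $n_a$, $n_e$ and $n_{min}$ are taken as inputs rather than derived, and the exponential terms are assumed to be zero below their thresholds.

```python
DEFAULT_WEIGHTS = {"nlsd": 20, "nld": 40, "gap": 50, "MorA": 50, "LH": 500, "OD": 175}


def timetable_cost(nd, n_g, n_a, n_e, n_min, w=DEFAULT_WEIGHTS):
    """Cost of one class or teacher timetable.

    nd[i][j] is the number of lessons of subject j on day i.
    n_g, n_a, n_e and n_min are the counts defined in the text.
    """
    cost = 0
    for day in nd:
        # First term: one exponential penalty per subject taught that day.
        cost += sum(w["nlsd"] ** (count - 1) for count in day if count >= 1)
        # Second term: penalty on the total number of lessons that day.
        n_i = sum(day)
        if n_i >= 7:
            cost += w["nld"] ** (n_i - 7)
    n_o = sum(1 for day in nd if sum(day) > 0)  # days with lessons
    cost += w["gap"] * n_g + w["MorA"] * n_a + w["LH"] * n_e
    cost += (n_o - n_min) * w["OD"]
    return cost


def chromosome_cost(class_costs, teacher_costs, profscale):
    """Class timetable costs plus teacher timetable costs scaled by profscale."""
    if not 0 <= profscale <= 1:
        raise ValueError("profscale must lie in [0, 1]")
    return sum(class_costs) + profscale * sum(teacher_costs)
```

With `profscale` below 1, a penalty in a teacher timetable weighs less than the same penalty in a class timetable, which is how the text expresses that class timetables matter more.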
3.4. The Fitness Function
The fitness function also plays a very important role
in a G.A. because it is responsible for determining the
S.P. (Survival Probability, the probability of a
chromosome being used to spawn the next generation)
of a certain chromosome:
$$S.P._i = \frac{F(val_i)}{\sum_k F(val_k)}$$
in which $F$ is the fitness function and
$val_i = Cost(chromosome_i)$.
Since worse solutions have greater costs (are more
penalised), in order for better solutions to have a
greater S.P. we need
$\partial F(val) / \partial val < 0$. We experimented
with 3 different fitness functions [3]:
$$F_1(val) = k_1 - val$$
$$F_2(val) = k_2 \cdot worstval - val, \quad k_2 \in \,]1, +\infty[$$
$$F_3(val) = e^{\ln(k_3) \cdot \frac{val - bestval}{\overline{val} - bestval}}, \quad k_3 \in \,]0, 1[$$
in which:
val - Cost of a certain chromosome
worstval - Cost of the worst chromosome in the
population
bestval - Cost of the best chromosome in the
population
$\overline{val}$ - Average cost of the population
The major difference between $F_1$ and the other
functions is that the latter adapt themselves from
generation to generation while $F_1$ remains constant.
In $F_3$, the Relative Survival Probability (R.S.P.,
the ratio between the S.P.s of two chromosomes) of the
best chromosome relative to average chromosomes
is kept constant:
$$R.S.P. = \frac{p(BestChromosome)}{p(AverageChromosome)} = \frac{F_3(bestval)}{F_3(\overline{val})} = \frac{1}{k_3}, \quad k_3 \in \,]0, 1[$$
With $k_3 = 0.5$ the survival probability of the best
chromosome is twice that of average chromosomes.
As we can see in Fig.3, $k_3$ varies the slope of
$F_3$, with lower values of $k_3$ leading to a more
elitist fitness function. A fitness function is said
to be more elitist when it is more selective, i.e.
slight differences in chromosome cost lead to
significant differences in S.P.

[Figure 3: S.P. plotted against val (1000 to 7000) for
k_3 = 0.25, 0.5 and 0.75.]
Fig.3. How changes in $k_3$ (of $F_3$) affect the
S.P. of chromosomes with different costs.

On the other hand, $F_2$ adjusts itself in such a way
that the survival probability of the worst chromosome
relative to average chromosomes is:
$$RSP = \frac{p(WorstLAB)}{p(AverageLAB)} = \frac{F_2(worstval)}{F_2(\overline{val})} = \frac{k_2 - 1}{k_2 - k_r}, \quad k_r = \frac{\overline{val}}{worstval}, \quad k_r \in \,]0, 1[, \quad k_2 \in \,]1, +\infty[$$
This indicates that the value of RSP (Relative
Survival Probability) depends on the relative value
of $\overline{val}$ and worstval. This dependence can
be seen in Fig.4, which shows that:
$$k_r \to 1 \Rightarrow RSP \to 1$$
$$k_r \to 0 \Rightarrow RSP \to \frac{k_2 - 1}{k_2} \quad (\to 0 \text{ when } k_2 \to 1)$$
This means that when worstval becomes close to
$\overline{val}$ ($k_r \to 1$), which is the case when
the G.A. is stalled at a local maximum, $k_r$ makes the
fitness function less elitist to allow the G.A. to
find a way around the local maximum.
When $\overline{val}$ is much smaller than worstval
($k_r \to 0$), $k_r$ makes $F_2$ more elitist, making
the G.A. converge faster.
In Fig.5 we can see how $k_2$ varies the elitism of
$F_2$ for a fixed value of $k_r$. The closer $k_2$ is
to 1, the more elitist $F_2$ becomes.
Finally, in Fig.6, we have a comparison of the S.P.s
resulting from all three fitness functions with the
following parameter values: $k_1 = 40000$, $k_2 = 1.1$
and $k_3 = 0.5$. Observing Fig.6 we can easily
attribute $F_1$'s poor performance to its almost
non-existent slope, which results in all chromosomes
having approximately equal copy probability.
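The behaviour of the three fitness functions can be explored numerically. The sketch below assumes the forms $F_1(val) = k_1 - val$, $F_2(val) = k_2 \cdot worstval - val$ and $F_3(val) = \exp(\ln(k_3)(val - bestval)/(\overline{val} - bestval))$, which are inferred from the R.S.P. relations in this section and may differ from the original by a constant factor (which cancels in S.P.); the function names and the example costs are hypothetical.

```python
import math


def f1(val, k1=40000):
    """Constant fitness function: does not adapt to the population."""
    return k1 - val


def f2(val, worstval, k2=1.1):
    """Adapts to the worst cost in the current population (k2 > 1)."""
    return k2 * worstval - val


def f3(val, bestval, meanval, k3=0.5):
    """Adapts to the best and mean costs (0 < k3 < 1, meanval != bestval)."""
    return math.exp(math.log(k3) * (val - bestval) / (meanval - bestval))


def survival_probabilities(fitnesses):
    """S.P. of each chromosome: its fitness divided by the population total."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


costs = [2500, 3000, 4000, 6500]  # hypothetical chromosome costs
best, worst = min(costs), max(costs)
mean = sum(costs) / len(costs)
sp1 = survival_probabilities([f1(c) for c in costs])
sp2 = survival_probabilities([f2(c, worst) for c in costs])
sp3 = survival_probabilities([f3(c, best, mean) for c in costs])
```

With $k_1$ much larger than any cost, `sp1` is nearly uniform, which is the flat slope the text blames for the poor performance of $F_1$, whereas `sp2` and `sp3` separate good and bad chromosomes much more sharply.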
4. Genetic Operations
In this application the canonical G.A. operators
have been redesigned so that they always produce
feasible timetables.
4.1. Reproduction
The reproduction operator consists of copying
chromosomes without changing their
characteristics. The reproduction method used is
roulette-wheel selection (Goldberg [6]), where the
probability that an element has of being copied is
proportional to its fitness.
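Roulette-wheel selection can be sketched in a few lines. This is a generic version of the technique, not the prototype's code; the function name, argument layout and the assumption of non-negative fitness values with a positive sum are illustrative.

```python
import random


def roulette_select(population, fitnesses, n, rng=random):
    """Draw n individuals with probability proportional to fitness."""
    total = sum(fitnesses)
    selected = []
    for _ in range(n):
        spin = rng.random() * total  # where the wheel stops
        running = 0.0
        for individual, fitness in zip(population, fitnesses):
            running += fitness
            if spin < running:
                selected.append(individual)
                break
        else:
            # Guard against floating-point rounding at the end of the wheel.
            selected.append(population[-1])
    return selected
```

The same individual may be drawn several times, so fitter chromosomes tend to contribute multiple copies to the next generation.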
4.2. Crossover
The crossover operator is responsible for
exchanging the genetic material of the
chromosomes that were reproduced. This is done by
randomly forming pairs of chromosomes with all
the elements of the population. The uniform
crossover operator is applied to each pair.
[Figure 4: RSP plotted against k_r (0 to 1), with
curves for k_2 = 1.01, 1.1 and 1.5.]
Fig.4. How changes in $k_r$ affect R.S.P.

[Figure 5: S.P. plotted against val (1000 to 7000) for
k_2 = 1.01, 1.1 and 1.5.]
Fig.5. How $k_2$ varies the elitism of $F_2$.

[Figure 6: S.P. plotted against val (1000 to 7000) for
F_1, F_2 and F_3.]
Fig.6. Comparison between all 3 fitness functions.

In this operator, whether or not two corresponding
chromosome genes are exchanged depends on the
values of a randomly generated mask. If the mask value
is one, the genes are exchanged; otherwise they remain
in the same chromosome. This procedure is easily
understood by observing Fig.7.
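A minimal sketch of the uniform crossover operator follows, using the mask and the two chromosomes (LAB 1 and LAB 2) shown in Fig.7. The function name is illustrative, and the sketch treats a chromosome as a flat list of gene values, leaving aside the duplicated class and teacher copies of each gene.

```python
import random


def uniform_crossover(parent1, parent2, mask=None, rng=random):
    """Exchange corresponding genes wherever the mask value is 1."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same number of genes")
    if mask is None:
        mask = [rng.randint(0, 1) for _ in parent1]  # random 0/1 mask
    child1, child2 = list(parent1), list(parent2)
    for i, bit in enumerate(mask):
        if bit == 1:
            child1[i], child2[i] = child2[i], child1[i]
    return child1, child2


# Values taken from Fig.7.
mask = [0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0]
lab1 = [12, 5, 43, 16, 25, 31, 3, 45, 13, 40, 32]
lab2 = [43, 41, 14, 32, 36, 32, 2, 12, 23, 21, 34]
child1, child2 = uniform_crossover(lab1, lab2, mask)
```

Since genes are only exchanged between corresponding positions, each child still holds the required number of lessons for every class-teacher pair; clashes introduced by the exchange are left to the repair function.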
4.3. Mutation
Each independent gene in every chromosome has a
user defined mutation probability. Mutation consists
of changing a gene value to a random position.
4.4. The Repair Function
For the resulting chromosomes to be valid they have to
be repaired. Repairing a chromosome consists of
changing gene values to the valid values closest to
the original ones. This is done by first finding the
free positions (positions unoccupied by other
subjects or classes) common to the timetables of
both the class and the teacher teaching a certain
subject. The free position closest to the original
gene value is then chosen and removed from the list
of free positions of the class and the teacher.
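The core of the repair step, choosing the closest common free position, can be sketched as below with the values of Fig.8. The function names and the tie-breaking rule (the lower period wins when two positions are equally close) are assumptions; the paper does not say how ties are resolved.

```python
def nearest_free_period(original, free_positions):
    """Return the free position closest to the original gene value."""
    if not free_positions:
        raise ValueError("no free position available: chromosome must be rebuilt")
    # Assumed tie-break: prefer the lower period when distances are equal.
    return min(free_positions, key=lambda p: (abs(p - original), p))


def repair_gene(original, class_free, teacher_free):
    """Pick the closest common free position and mark it as occupied."""
    common = class_free & teacher_free
    chosen = nearest_free_period(original, common)
    class_free.discard(chosen)
    teacher_free.discard(chosen)
    return chosen


# Values taken from Fig.8: the original gene value 20 is repaired to 18.
common_free = [3, 4, 10, 13, 18, 25, 26, 30]
repaired = nearest_free_period(20, common_free)
```

Staying close to the original value keeps the repaired chromosome similar to what crossover and mutation produced, so the repair disturbs the inherited genetic material as little as possible.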
4.5. Ultra-Elitism
Ultra-Elitism was implemented using two different
methods. The first consists simply of sorting the
chromosomes by descending order of value and
then replacing the $n_{BP}$ (number of best parents)
worst chromosomes with the $n_{BP}$ best chromosomes
of the previous generation. The chromosomes are
then re-sorted and the $n_{BP}$ best chromosomes are
stored in order to be incorporated into the next
generation. This ultra-elitism strategy ensures that
the $n_{BP}$ best individuals are always preserved,
resulting in monotonic convergence.
In the second method, we start by evaluating the
fitness of the $n_{chr}$ chromosomes of the population
and also of the $n_{BP}$ best parents of the previous
generation. Then, using roulette-wheel selection,
we create a new population of $n_{chr}$ chromosomes.
These chromosomes are sorted by descending order
of value and the $n_{BP}$ best chromosomes are stored
in order to be incorporated into the next generation.
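The first ultra-elitism method can be sketched as follows. The sketch ranks chromosomes by cost, lowest first, on the assumption that a lower cost means a better chromosome; the function name, the `cost` callback and the handling of an empty list of best parents in the first generation are illustrative choices.

```python
def ultra_elitism_step(population, best_parents, cost, n_bp):
    """Replace the worst chromosomes with the stored best parents.

    Returns the new population (best first) and the n_bp best
    chromosomes to carry over into the next generation.
    """
    ranked = sorted(population, key=cost)  # lowest cost (best) first
    k = min(n_bp, len(best_parents))       # no parents in generation 0
    survivors = ranked[:len(ranked) - k] + list(best_parents[:k])
    survivors.sort(key=cost)
    return survivors, survivors[:n_bp]
```

Because the stored parents always re-enter the population, the best cost in `survivors` can never be worse than the best cost of the previous generation, which is the monotonic convergence described above.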
5. Results and Conclusions
5.1. Evaluation of Results
For the simple example problem of 4 classes and 4
teachers, the G.A. leads to good results after fewer
than 30000 evaluations, which took about 50 seconds
on a Pentium 133 MHz PC. An example of the results
obtained for this problem is shown in Fig.9. These
results were obtained using the first ultra-elitism
method and fitness function $F_3$ on a population of
60 chromosomes, with $n_{BP}$ at 20 and the other
parameters at their default values.
After many experiments, we have come to the
conclusion that the tuning of the many parameters
of this problem is a very delicate matter. First the
problem parameters should be adjusted in order to
obtain a timetable with specific desired
characteristics.
Finally, the convergence parameters have to be
adjusted. To do so we have to keep in mind that
faster convergence rates normally lead to a greater
number of populations being stalled at a local
maximum. In other words, faster convergence
implies a narrower search and a wider search leads
to slower convergence.

Mask    0  1  0  0  0  1  1  0  0  1  0
LAB 1   12 5  43 16 25 31 3  45 13 40 32
LAB 2   43 41 14 32 36 32 2  12 23 21 34
Fig.7. The uniform crossover operator.

Common free positions: 3 4 10 13 18 25 26 30
Original value: 20 (distance 2 to position 18,
distance 5 to position 25)
Resulting value: 18
Fig.8. How gene clashes are eliminated by the
repair function.

Fig.9. School timetable obtained for a problem with
4 classes (top 4 timetables) and 4 teachers (bottom
4 timetables).
The rate of convergence can be seen graphically by
observing how the fitness of the best chromosome
evolves from generation to generation (Fig.10).
The width of the search can be seen by observing the
range of values of the average fitness of the
population (Fig.11 and Fig.12).
The parameters that a user can change are:
Population size: This value should be large because
a bigger population size leads to a better-
represented population and a wider search. The
downside of increasing this value is that the
calculation time increases proportionally.
Number of Best Parents ($n_{BP}$): This parameter and
the mutation probability strongly influence both the
rate of convergence and the width of the search.
High values of $n_{BP}$ lead to high convergence rates
and narrow searches, as can be seen in Fig.11, in
which $n_{BP} = \frac{2}{3} \cdot population\_size$.
On the other hand, low values of $n_{BP}$ lead to lower
convergence rates and wider searches, as shown in
Fig.12, in which
$n_{BP} = \frac{1}{15} \cdot population\_size$.
Empirical results suggest that the ideal value is
somewhere in the interval
$[\frac{1}{3} \cdot population\_size, \frac{2}{3} \cdot population\_size]$.
Mutation Probability: Results suggest that the ideal
value for the mutation probability is
$P(mut) = 1 / total\_number\_of\_lessons$, which leads
to approximately 1 mutation/chromosome. Higher
mutation rates lead to more than 1
mutation/chromosome, which results in bad mutations
destroying the effect of good mutations. Less than 1
mutation/chromosome means that the G.A. will be
stalled longer at a local maximum.

Fig.10. Convergence rate of the G.A., i.e. evolution
of the cost of the best chromosome.
Fig.11. Evolution of average chromosome cost with
$n_{BP} = 40 = \frac{2}{3}$ population size (little
variation, narrow search).
Fig.12. Evolution of average chromosome cost with
$n_{BP} = 4 = \frac{1}{15}$ population size (great
variation, wide search).
5.2. Conclusions:
To compare ultra-elitism methods and fitness
functions, the G.A. program was run 30 times for
each variation; the results are shown in Tab.3.

Ultra-Elitism        1st method        2nd method
Fitness Function     F2       F3       F2       F3
best (note 3)        2517     2492     2566     2572
G.A.P. (note 4)      90       86.7     86.6     73
num Eval (note 5)    27044    24469    46809    30376
Tab.3. Comparison between results obtained from
different fitness functions and ultra-elitism methods.
The table shows that the first ultra-elitism method is
clearly better than the second for both fitness
functions. This is due to the fact that the worst
results are normally very bad, and therefore have a
great influence on the average value, resulting in an
almost equal survival probability for all the
elements of the population. When these elements
are removed, the G.A. becomes more elitist and the
algorithm converges much faster.
As far as the fitness functions are concerned, $F_3$
converges much faster than $F_2$. This can be
concluded by observing the num Eval values in Tab.3.
$F_2$, on the other hand, is better at getting past
local maxima, as can be seen by comparing the G.A.P.
values.
5.3. Future Directions:
We think it would be interesting to try running a
G.A. using as its initial population the n best
chromosomes obtained after running the G.A.
independently m times for more generations.
Dynamic operator selection as suggested in [7] also
seems interesting. The extension to schools with
around 10,000 students and different courses is
under way.

Note 3: Average of the value of the best chromosomes
resulting from 75000 evaluations (empirical value
after which most runs reach the goal).
Note 4: Goal Achievement Percentage, the percentage
of populations that have a best chromosome cost of
less than 2700 after 75000 evaluations (2700 is an
empirical value corresponding to the average cost of
good timetables, i.e. those that satisfy all soft
constraints).
Note 5: Average number of evaluations after which the
goal is achieved (only runs that achieve the goal
before 75000 evaluations contribute).
Acknowledgements
This research was partly supported by:
Project AGHora, Grant 3/3.1/CEG/2684/95 of the
Praxis XXI program.
References
[1]. Bardadym, V. A., Computer Aided School
and University Timetabling: The New Wave.
In Lecture Notes in Computer Science,
Vol.1153, p22-45, Springer
[2]. Davis, L.; Steenstrup, M. (1987): Genetic
Algorithms and Simulated Annealing: An
Overview. In: Davis, L. (ed.), Genetic
Algorithms and Simulated Annealing. Morgan
Kaufmann Publishers Inc., Los Altos, CA: 1-11
[3]. Erben, W. and Keppler, K., A Genetic
Algorithm Solving a Weekly Course-Timetabling
Problem. In Lecture Notes in
Computer Science, Vol.1153, p198-211,
Springer
[4]. Colorni, A., Dorigo, M., Maniezzo, Genetic
Algorithms: A New Approach to the TimeTable
Problem. In Combinatorial Optimisation (ed.
M.Aggul et al.), Lecture Notes in Computer
Science - NATO ASI Series, Vol.F 82,
p235-239, Springer-Verlag.
[5]. Allen Lima, J., Gracias, N., Pereira, H., Rosa, A.C.,
Fitness Function Design for Genetic
Algorithms in Cost Evaluation Based
Problems, Proc. IEEE - Int. Conf. Evolutionary
Computation, ICEC96, pp 207-212, 1996
[6]. Goldberg, D.E., Genetic Algorithms in
Search, Optimization and Machine Learning,
Addison-Wesley
[7]. Rich, D. C., A Smart Genetic Algorithm for
University Timetabling. In Lecture Notes in
Computer Science, Vol.1153, p181-197,
Springer
