HARDWARE IMPLEMENTATION OF GENETIC ALGORITHMS FOR VLSI CAD DESIGN
G. Koonar S. Areibi M. Moussa
gkoonar@uoguelph.ca sareibi@uoguelph.ca mmoussa@uoguelph.ca

School of Engineering
University of Guelph
Guelph, Ontario
CANADA N1G 2W1
ABSTRACT

This paper proposes an architecture for implementing Genetic Algorithms (GAs) used for circuit partitioning in VLSI physical design automation. The architecture employs a combination of pipelining and parallelization to achieve speedups over a software-based GA. The design uses six modules along with three external memories. The proposed design was coded in VHDL and functionally verified by writing a testbench and simulating it using ModelSim. The design was synthesized on a Virtex xcv50e part using Xilinx ISE 4.1. The genetic algorithm processor proposed in this paper achieves more than a 100× improvement in processing speed compared to the software implementation. The proposed architecture is outlined and briefly discussed in this paper, and the current results are presented and analyzed.

Index Terms: Reconfigurable computing, Genetic Algorithms, FPGAs, VHDL.

1. INTRODUCTION

A Genetic Algorithm (GA) is an optimization method based on natural selection [1]. Genetic Algorithms have been applied to many hard optimization problems, including VLSI layout optimization, boolean satisfiability and the Hamiltonian circuit problem. They have been recognized as a robust general-purpose optimization technique. However, applying GAs to increasingly complex problems can overwhelm software implementations, causing unacceptable delays in the optimization process. This is true of any non-trivial application of GAs if the search space is large. It follows that a hardware implementation of a GA would be applicable to problems too complex for software-based GAs.

Because a general-purpose GA engine requires certain parts of its design to be easily changed (e.g. the function to be optimized), a hardware-based Genetic Algorithm (HGA) was not feasible until field-programmable gate arrays (FPGAs) [2] were developed. Reprogrammable FPGAs (those programmed via bit patterns stored in a static RAM) are essential to the development of the HGA system.

Some simple empirical analysis of software-based GAs indicated that a small number of simple operations and the function to be optimized were executed frequently during the run. Neglecting I/O, these operations accounted for 80-90% of the total execution time. If m is the population size (the number of strings manipulated by the GA in one iteration) and g is the number of generations, a typical GA would execute each of its operations mg times. For complex problems, large values of m and g are required, so it is imperative to make the operations as efficient as possible. Work by Spears and De Jong [3] indicates that for NP-complete problems, m=100 and values of g on the order of 10 -10 may be necessary to obtain a good result and avoid premature convergence to a local optimum. Pipelining and parallelization can help provide the desired efficiency, and both are easily done in hardware.

The main goal of the research reported in this paper is to propose an architecture for implementing a GA that employs a combination of pipelining and parallelization to achieve speedups. This research demonstrates the feasibility of solving the circuit-partitioning problem using a hardware-based GA. It also demonstrates the usefulness of a GA processor by comparing the performance of a hardware-based GA with that of a software-based GA.

This work builds upon other research in reconfigurable hardware systems, which improved system performance by mapping some or all software components to hardware using reprogrammable FPGAs. The paper is organized as follows. Section 2 gives an overview of Genetic Algorithms, FPGAs and circuit partitioning, along with previous work on hardware-based GAs. Section 3 describes the basic design of the proposed architecture and, in brief, the functionality of all the modules. Section 4 presents the simulation results, and the paper is concluded in Section 5.

The research of the first author is partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

2. BACKGROUND

2.1. Overview of Circuit Partitioning

Circuit partitioning is the task of dividing a circuit into smaller parts [4]. It is an important aspect of layout for several reasons. Partitioning can be used directly to divide a circuit into portions that are implemented on separate physical components, such as printed circuit boards or chips. The objective is to partition the circuit into parts such that the sizes of the components are within prescribed ranges and the complexity of connections between the components is minimized. As the
size of present-day computer chips grows (i.e., chips containing more than ten million transistors in sub-micron areas), obtaining near-optimal layouts that efficiently "place" and "route" the signals becomes increasingly important. Partitioning is a key approach to reducing the connectivity between areas of the chip so that modules can be placed and routed more efficiently, reducing wire-length and congestion and increasing the speed of the overall design. Among the different objectives that may be satisfied by the desired partitioning are:

1. The minimization of the number of cuts,

2. The minimization of the deviation in the number of elements (inputs, logical gates, outputs and fanout points) assigned to each partition.

In this paper a GA is used to solve the circuit-partitioning problem.

2.2. Overview of Genetic Algorithms

A genetic algorithm (GA) is a natural selection-based optimization technique. There are four major differences between GA-based approaches and conventional problem-solving methods:

1. GAs work with a coding of the parameter set, not the parameters themselves.

2. GAs search for optima from a population of points, not a single point.

3. GAs use payoff (objective function) information, not other auxiliary knowledge such as derivative information used in calculus-based methods.

4. GAs use probabilistic transition rules, not deterministic rules.

These four properties make GAs robust, powerful, and data-independent [1]. A GA is a stochastic technique with simple operations based on the theory of natural selection. The basic operations are selection of population members for the next generation, "mating" these members via crossover of "chromosomes," and performing mutations on the chromosomes to preserve population diversity so as to avoid convergence to local optima. The crossover and mutation operators are shown in Figure 1. Finally, the fitness of each member in the new generation is determined using an evaluation (fitness) function. This fitness influences the selection process for the next generation. The GA operations of selection, crossover and mutation primarily involve random number generation, copying, and partial string exchange. Thus they are powerful tools which are simple to implement. Its basis in natural selection allows a GA to employ a "survival of the fittest" strategy when searching for optima. The use of a population of points helps the GA avoid converging to false peaks (local optima) in the search.

Standard Mutation Operator

Old Chromosome   Random Numbers         New Bit   New Chromosome
1 0 1 0          .801 .102 .266 .373    -         1 0 1 0
1 1 1 0          .120 .096 .005 .840    0         1 1 0 0
0 0 1 0          .760 .473 .894 .001    1         0 0 1 1

Standard Crossover Operator (one-point crossover)

Parent1: 1 0 0 0 1 1    Child1: 1 0 0 0 0 0
Parent2: 0 1 1 1 0 0    Child2: 0 1 1 1 1 1

Fig. 1. Genetic Operators

The nature of GA operators is such that GAs lend themselves well to pipelining and parallelization. For example, selection of population members can be parallelized to the practical limit of the area of the chip(s) on which the selection modules are implemented. Once these modules have made their selections, they can pass the selected members to the modules that perform crossover and mutation, which in turn pass the new members to the fitness modules for evaluation. Thus a coarse-grained pipeline is easily implemented. This capability for parallelization and pipelining helps in mapping a GA to hardware.

2.3. Overview of Field Programmable Gate Arrays

An FPGA (field programmable gate array) is an inexpensive user-programmable device that allows rapid design prototyping [2]. FPGAs offer denser logic and less tedious wiring work than discrete chip designs, and faster turnaround than sea-of-gates, standard cell, or full-custom design fabrication. FPGAs are generally composed of logic blocks which implement the FPGA's logic, I/O cells which connect the logic blocks to the chip pins, and interconnection lines which connect logic blocks together with I/O cells. These components are programmed using static RAM cells, anti-fuses, EPROM transistors or EEPROM transistors.

Although field programmable gate arrays were introduced a decade ago, they have only recently become more popular. This is not only because programmable logic saves development cost and time over increasingly complex ASIC designs, but also because the gate count per FPGA chip has reached numbers that allow for the implementation of more complex applications.

Many present-day applications utilize a processor and other logic on two or more separate chips. However, with the anticipated ability to build chips with over ten million transistors, it has become possible to implement a processor within a sea of programmable logic, all on one chip. Such a design approach allows a great degree of programmability freedom, both in hardware and in software. CAD tools could decide which
parts of a source code program are actually to be executed in software and which other parts are to be implemented in hardware. The hardware may be needed for application interfacing reasons or may simply represent a coprocessor used to improve execution time.

FPGA designs can be created in a number of ways, including graphical schematic component layout (Powerview) and hardware description languages such as ABEL, VHDL, and Verilog. VHDL (VHSIC hardware description language) can be used either for behavioral modeling of circuit designs or for logic synthesis using either behavioral or structural descriptions [5]. Since writing structural circuit descriptions is like trying to describe a circuit using text instead of a schematic editor, the real advantage of VHDL is seen only in its behavioral synthesis potential.

2.4. Previous work in Hardware Based GA

The past several years have witnessed a sharp increase in work with reconfigurable hardware systems. Reconfigurability is essential in a general-purpose GA engine because certain GA modules require changeability (e.g. the function to be optimized by the GA). Thus a hardware-based GA is both feasible and desirable. The use of reconfigurable hardware for the design of GAs was seen in projects such as [6], [7], [8]. In Stephen Scott's behavioural-level implementation of a GA [6], the targeted application was optimization of an input function. In [7], a GA was designed and implemented on a PLD using the Altera hardware description language (AHDL). In [8], a number of GAs were designed and implemented in a text compression chip. In [9], the GA was implemented in hardware on a Splash 2 reconfigurable computer. The problem selected for implementation was the well-known Travelling Salesman Problem (TSP). Splash 2 consisted of an interface board and a collection of processor array boards. Its basic unit of computation was the processor, which consisted of four Xilinx 4010 FPGAs and associated memories. The performance differences between hardware and software versions of a Genetic Algorithm solving the TSP were further analyzed by Paul Graham and Brent Nelson in [10]. The then-new Xilinx XC6216 was also used to accelerate GA performance in [11]. It was used to accelerate the most time-consuming task of the GA, fitness evaluation, by embodying each individual of the evolving population in hardware. The memory bottleneck is inevitable since a GA requires a large memory to store the population. As a result, either high-speed memory is used, making the hardware expensive, or low-cost memory is used, reducing performance. Therefore, in contrast to the Simple GA, the Compact GA was found more suitable for hardware implementation [12].

The current work combines the research done on hardware implementation of GAs to create a system which attempts to achieve significant speedup over a software GA through pipelining and parallelization. It also attempts to minimize the logic resources used within the FPGA.

3. SYSTEM ARCHITECTURE

The proposed architecture (core) for implementing the genetic algorithm in hardware uses a processing pipeline for performing the computationally intensive parts of the algorithm. The current design is specifically optimized towards solving the circuit-partitioning problem. Theoretically, there is no limit on the size of the problem the core can handle, but since the core requires external RAM for storing the netlist information, the size of the RAM is directly proportional to the product of the number of nets and the number of cells (modules) in the design. Very large problems would therefore require a relatively large amount of external RAM.

Fig. 2. Architecture for the Genetic Algorithm Processor.

The block diagram of the GA processor is shown in Figure 2. The Selection module selects the parents with good fitness from the Fitness memory and sends the addresses of the selected parents to the Crossover and Mutation module. The Crossover and Mutation module performs crossover and mutation on the parents. The Fitness module generates fitness values for each of the generated chromosomes after a complete new population has been generated by the Crossover and Mutation module. The Main Controller generates control signals for all the blocks. The design is coded in VHDL and uses the generics shown in Table 1. The external RAM modules used by the design are listed in Table 2.

Before starting the core process, the Control Registers have to be loaded with 'legal' values using the CPU interface. After the control registers are loaded, an active high pulse on the Start control input starts the GA process. After receiving the Start signal, the core accepts the netlist from the top-level inputs. The input netlist is stored in the Netlist memory, from where it is read repeatedly by the core to compute the fitness. After receiving the input netlist (based upon the number of nets stored in the Control Registers), the core generates the initial population randomly and stores it into the Chromosome memory. The Selection module uses Tournament Selection to
select two parents. The memory addresses of these two parents are used by the Crossover and Mutation module to perform the genetic operations. The Crossover and Mutation module stores the two generated children into the Chromosome memory. After the new population is generated, the Fitness module computes the fitness of each element of the new population and stores it into the Fitness memory. After the required number of generations has been executed (based on the Control Register value), the core outputs the final population along with the fitness of each chromosome.

Table 1. Generics used in the design

Generic Name     Description
FMAddrWidth      Fitness memory address width. This gives two times the maximum size supported.
FMDataWidth      Fitness memory data width.
CMDataWidth      Chromosome memory data width. This represents the word size of the chromosome memory.
CMField          Number of bits used to represent the length of the chromosome.
MaxNetNumBits    Number of bits used to represent the maximum number of nets.

Table 2. Core Memories

Memory               Description
Netlist Memory       Stores a binary sequence of the length of the chromosome for each net. Each bit in the sequence denotes whether that net is connected to the corresponding cell in the netlist. This is a single address port synchronous RAM.
Chromosome Memory    Stores population elements for the parent as well as the child population. The address space is divided into two halves. Each half stores either the parent or the child population. This is a dual address port synchronous RAM.
Fitness Memory       Stores the fitness of the parent and child populations. This is also divided into two parts for storing the parent and child populations. This is a single address port synchronous RAM.

In order to solve the circuit-partitioning problem using a GA, the following representation is used. Each chromosome contains a sequence of 1's and 0's, each bit corresponding to a distinct cell in the netlist. A one at a location in the sequence means that the corresponding cell lies in partition number 1. Similarly, a zero implies that the cell is present in partition number 0, as shown in Figure 3. Therefore, the length of the chromosome is the number of cells in the netlist. Since there are practical limitations to the word sizes of physical memories, the chromosome is stored in memory in the form of smaller words, with the words corresponding to one chromosome stored consecutively.

Cell:        M1 M2 M3 M4 M5 M6 M7 M8
Bit:         0  0  1  0  1  1  1  0
Partitions:  BLOCK 0, BLOCK 1

Fig. 3. Representation of Circuit-Partitioning

The GA Processor consists of the following modules.

3.1. Control Registers

The design uses a set of Control Registers, which can be programmed using the CPU interface. The registers used are shown in Table 3.

Table 3. Control Registers in design

Register         Size (Bits)   Description
CMLength         2 × 8         Chromosome length
NetNum           2 × 8         Number of nets
PopSiz           1 × 8         Population size
GenNum           1 × 8         Generation count
CrossoverRate    1 × 8         Crossover rate
MutationRate     1 × 8         Mutation rate

3.2. Selection Module

The Selection Module performs Tournament selection on the initial population by reading four random fitnesses from the Fitness memory upon receiving an active high signal from the Main Controller. It uses an instantiation of an LFSR-based Random Number Generator. It compares two pairs of fitnesses and selects the best from each pair. The addresses of the best two fitnesses are latched and held stable on the output signals until the next time the Selection Module is enabled. These two addresses (with zeros padded in the LSBs) represent the starting addresses of the two parent chromosomes stored in the Chromosome memory. At the end of the selection of two parents, the Selection Module generates a signal indicating the end of selection.

Internally, the Selection Module consists of a random number generator, a comparator for comparing unsigned integers, registers to latch the generated random addresses, and a control state machine. The control state machine generates control/enable signals for the different blocks in the module.

3.3. Crossover and Mutation Module

The Crossover Module performs the crossover and mutation operations on the two parent chromosomes, the starting addresses of which are generated by the Selection Module. The chromosome memory is divided into two parts, namely the low bank and the high bank. At any time, the parent population is stored in one of the banks and the child population generated by the Crossover and Mutation module is stored in the other bank. Upon receiving an active high signal from the Main Controller, one word of the chromosome for each of the parents is read from the Chromosome memory based upon the addresses generated by the Selection Module. After one word of chromosome has been read for each of the parents, the chromosome-word counter is incremented. The Crossover Module generates a random crossover mask for each word of the parents. The Crossover and Mutation rates supplied by the control registers are compared to an internally generated 8-bit random number. If the value of this random number is less than the Crossover and Mutation rates, these operations are performed; otherwise the parents are copied to the children. The results of the crossover and mutation are stored word-by-word into the Chromosome memory. The starting child addresses are obtained from the Main Control State Machine. The whole process is repeated until the chromosome word counter reaches the length of the chromosome denoted by the Control Register CMLength. Finally, the end of the crossover process is signaled to the Main Control State Machine.

Internally, the Crossover and Mutation module consists of a chromosome word counter, which is CMField bits wide, and trivial combinatorial logic to perform the crossover and mutation operations. The module also contains an instantiation of the Random Number Generator. The same random number is used as a mask for Uniform Crossover as well as for determining the crossover and mutation probabilities.

3.4. Fitness Module

Once a complete new population is generated by the Crossover and Mutation module, the Fitness module generates fitness values for each of the generated chromosomes. Upon receiving the signal from the Main Controller, the Fitness module determines, for each net, whether the present chromosome partitioning generates a cut. For each chromosome the fitness accumulator is reset to 0. The chromosome and the net are read word-by-word from the Chromosome and the Netlist memory, respectively. For each word of the chromosome and the net, a simple bit-wise AND operation followed by an OR operation is performed. This determines, based upon the present word of the chromosome, which partition the net lies in. If at any point the net is found to be present in a particular partition, the bit representing the presence of the net in that partition is latched and not overwritten by any subsequent word operations. If at any time both of these bits are '1', a cut has been found. In this case the fitness accumulator is incremented by one. This process is repeated for each word, until the word counter reaches the length of the chromosome. The resulting fitness is modified based upon the number of cells present in each partition. This is done by reading the chromosome word-by-word from the Chromosome memory and counting the number of ones in the chromosome. If at any time during the computations for a net it is found that there is a cut, no further words are read from the memory. This eliminates the time wasted by reading redundant information from the Chromosome and Netlist memories. A chromosome counter keeps track of the number of chromosomes processed. When this counter reaches PopSiz, the FitnessDone signal is asserted, signaling the end of fitness generation to the Main Controller. No further processing is done until the FitnessEnable signal is asserted again by the Main Controller.

Internally, the Fitness module consists of a word counter, a net counter, a population counter, a fitness accumulator, and a state machine, which generates control signals to these blocks. In addition to these, there are register flags for each partition, which indicate whether the net is present in the corresponding partition.

3.5. Main Controller Module

Fig. 4. State machine for Main Controller.

The Main Controller generates control signals for the rest of the blocks of the design. The Main Controller performs the following functions:

1. After receiving the active high pulse on StartGA, the Main Controller starts reading the input netlist using the input handshake signals.

2. After loading the netlist into the Netlist memory, the Main Controller generates random chromosomes and
initializes the Chromosome memory with a random population.

3. Following this, the Main Controller state machine enters a loop in which the three functions of fitness calculation, chromosome selection, and crossover and mutation are carried out in sequence until the generation counter inside the Main Controller reaches the generation count loaded into the GenNum control register. With each generation, the generation count is incremented by one.

4. At the end of the last generation, the Main Controller enables the Fitness module one last time and outputs the final population and final fitness using the top-level output signals.

The state diagram of the Main Controller state machine is shown in Figure 4.

4. RESULTS

The proposed design was coded in VHDL. It was functionally verified by writing a testbench and simulating it using ModelSim, and it was synthesized on a Virtex xcv50e using Xilinx ISE 4.1. The hardware GA was compared with the software GA for different population sizes and different generation counts on different benchmarks, with the default GA parameters shown in Table 5. The benchmarks used for the simulations are given in Table 4.

Table 4. Benchmarks

Name     Number of nets (Nnets)   Number of modules (Nmods)
Pcb1     32                       24
Chip1    294                      300
Chip3    239                      274

Table 5. Default GA parameters

Parameter           Value
Population Size     20
Generation Count    20
Crossover Rate      0.99
Mutation Rate       0.01
Crossover Type      Uniform
Selection Type      Tournament

Tests were run assuming a clock frequency of 50 MHz. The results obtained for different generation counts and population sizes are given in Table 6 and Table 7, respectively. The remaining GA parameters were assigned the default values given in Table 5. From the simulation results, it is clear that the hardware implementation is much faster than the software version. The software results shown in the tables were obtained on a SUN ULTRA10 system with a 440 MHz processor. As seen, the hardware is approximately 50 times faster than the software implementation. This large increase in speed for the hardware implementation is mainly attributed to the fact that, during fitness evaluation, if a cut is determined for a net at any time, the remaining words for that net and the chromosome are not read from the memory. This eliminates the time wasted by reading redundant information from the Chromosome and Netlist memories. The hardware processing speed can be increased further by increasing the Chromosome memory data bus width, because this enables more computations to be performed in parallel.

Table 6. Performance results for Hardware GA and Software GA for different Generation Count

Benchmark                        Generation Count   Software Time (ms)   Hardware Time (ms)
Pcb1 (Nnets=32, Nmods=24)        20                 200                  1.63
                                 60                 600                  4.91
                                 100                900                  7.20
Chip1 (Nnets=294, Nmods=300)     20                 1700                 40.50
                                 60                 4800                 121.25
                                 100                8100                 202.32
Chip3 (Nnets=239, Nmods=274)     20                 1200                 23.23
                                 60                 3400                 69.52
                                 100                5900                 116.23

Table 7. Performance results for Hardware GA and Software GA for different population size

Benchmark                        Population Size    Software Time (ms)   Hardware Time (ms)
Pcb1 (Nnets=32, Nmods=24)        20                 200                  1.63
                                 60                 700                  4.82
                                 100                1100                 7.20
Chip1 (Nnets=294, Nmods=300)     20                 1700                 40.50
                                 60                 4900                 122.25
                                 100                8800                 203.60
Chip3 (Nnets=239, Nmods=274)     20                 1200                 23.23
                                 60                 3800                 69.36
                                 100                5700                 115.32

Table 8. Synthesis Report

Device                   Virtex xcv50e
Slices                   334 out of 768 (43%)
CLBs                     167
Equivalent Gate Count    6044
Max Clock Frequency      123 MHz
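The speedup figures quoted in this section follow from the entries of Table 6 by simple division. The short Python sketch below is not part of the original design flow; it only reproduces that arithmetic. The names TABLE6, speedup and projected_speedup are hypothetical, and the projection to 123 MHz rests on the assumption that hardware run time scales inversely with clock frequency, which is also the assumption behind extrapolating the 50 MHz simulation results to the maximum clock frequency of Table 8.

```python
# Speedup arithmetic for Table 6 (times in ms; hardware simulated at 50 MHz).
# Each entry: (generation count, software time, hardware time).
TABLE6 = {
    "Pcb1":  [(20, 200, 1.63), (60, 600, 4.91), (100, 900, 7.20)],
    "Chip1": [(20, 1700, 40.50), (60, 4800, 121.25), (100, 8100, 202.32)],
    "Chip3": [(20, 1200, 23.23), (60, 3400, 69.52), (100, 5900, 116.23)],
}

SIM_CLOCK_MHZ = 50.0    # clock frequency assumed in the simulations
MAX_CLOCK_MHZ = 123.0   # maximum clock frequency from the synthesis report


def speedup(software_ms, hardware_ms):
    """Ratio of software run time to hardware run time."""
    return software_ms / hardware_ms


def projected_speedup(software_ms, hardware_ms, clock_mhz=MAX_CLOCK_MHZ):
    """Speedup at another clock frequency.

    Assumption (not measured): hardware time is proportional to 1/f.
    """
    return speedup(software_ms, hardware_ms) * clock_mhz / SIM_CLOCK_MHZ


def summarize(table=TABLE6):
    """Return (benchmark, generations, speedup at 50 MHz, projected speedup)."""
    rows = []
    for name, runs in table.items():
        for generations, sw_ms, hw_ms in runs:
            rows.append((name, generations,
                         speedup(sw_ms, hw_ms),
                         projected_speedup(sw_ms, hw_ms)))
    return rows


if __name__ == "__main__":
    for name, gens, s50, s123 in summarize():
        print(f"{name:6s} g={gens:3d}  {s50:6.1f}x at 50 MHz  "
              f"{s123:6.1f}x projected at 123 MHz")
```

Run on the Table 6 data, the ratios at 50 MHz range from roughly 40 (Chip1) to 125 (Pcb1), so the achieved speedup depends noticeably on the benchmark. The same functions can be applied to the Table 7 entries by swapping in that table's rows.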
Synthesis results are shown in Table 8. It is evident from Table 8 that minimal hardware resources are utilized. Since the simulation results shown in Table 6 and Table 7 were obtained by assuming a 50 MHz clock frequency, the improvement in speed can be increased to more than 100 times the software implementation at the maximum tolerable clock frequency of 123 MHz.

5. CONCLUSIONS AND FUTURE WORK

In this paper a new architecture for implementing the genetic algorithm in hardware is proposed. Although the architecture is designed specifically to solve the circuit-partitioning problem, some of the modules in the design can be re-used for other problems as well. These include the Selection Module, the Crossover Module, the LFSR-based random number generator, and most of the Main Controller. The design takes into account the practical limitations on memory data bus width imposed by the memory chips available. In order to enable the use of almost any memory chip with the design, the design uses configurable parameters (generics), through which the memory address and data bus widths can easily be changed at compilation time. The design was synthesized for a maximum clock frequency of 123 MHz on a Virtex xcv50e. At this frequency the design achieves more than 100 times improvement in processing speed over the software implementation.

There are many ways to extend the proposed design by simple modifications to the VHDL code. This design was used to solve the two-way circuit partitioning problem with Tournament Selection and Uniform crossover. Other Genetic Algorithm operators could be implemented, such as multi-point crossover, Partially Mapped crossover and different selection methods. The design can also be enhanced by incorporating a local search engine to create a hybrid memetic GA. The chromosome representation used in this project requires a relatively large amount of external memory to store the population and netlist. Alternate chromosome representations can be explored in order to reduce the memory requirements. Furthermore, a hardware/software co-design can be implemented and compared with the current implementation.

6. REFERENCES

[1] D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley Publishing Company, Reading, Massachusetts, 1989.

[2] S.D. Brown, R.J. Francis, J. Rose and Z.G. Vranesic, Field-Programmable Gate Arrays, Kluwer Academic Publishers, USA, 1992.

[3] K.A. De Jong and W.M. Spears, "Using genetic algorithms to solve NP-complete problems", in J. David Schaffer, editor, Proceedings of the Third International Conference on Genetic Algorithms, pp. 124–132. Morgan Kaufmann Publishers, 1989.

[4] S. Areibi, "A Review of Circuit Partitioning", Technical report, School of Engineering, University of Guelph, June 2000.

[5] K. Skahill, VHDL for Programmable Logic, Addison Wesley, Reading, Massachusetts, 1996.

[6] Stephen Donald Scott, "A hardware based genetic algorithm", Master's thesis, University of Nebraska, August 1994.

[7] Tommi Rintala, "Hardware implementation of GA", September 20 2000.

[8] Loring Wirbel, "Compression chip is first to use genetic algorithms", page 17, December 1984.

[9] Paul Graham and Brent Nelson, "A Hardware Genetic Algorithm for the Travelling Salesman Problem on Splash2", 1995.

[10] Paul Graham and Brent Nelson, "Genetic Algorithms in Software and in Hardware - A Performance Analysis of Workstation and Custom Computing Machine Implementation", in IEEE Symposium on FPGAs for Custom Computing Machines, pp. 216–225, Reconfigurable Logic Laboratory, Brigham Young University, Provo, UT, USA, 1996.

[11] John R. Koza, Forrest h Bennett III, Stephen L Jeffrey L, Martin A and David Andre, "Evolving Computer Programs using Rapidly Reconfigurable Field-Programmable Gate Arrays and Genetic Programming", 1997.

[12] Chatchawit and Prabhas, "A Hardware Implementation of Compact Genetic Algorithm", in Proceedings of the 2001 IEEE Congress on Evolutionary Computation, pp. 624–629, Seoul, Korea, May 2001.
