
International Journal of Advanced Technology & Engineering Research (IJATER)

PARTITIONING ALGORITHMS IN VLSI PHYSICAL DESIGN

A REVIEW
Deepak Batra, Dhruv Malik
Department of ECE, PIET, Samalkha, Panipat, Haryana, India
Email: oasisfermi@yahoo.co.in, er.dhruvmalik@gmail.com

Abstract
Electronic design automation is concerned with the design and production of VLSI systems, and physical design is a key step in creating a VLSI circuit. Physical design problems are combinatorial in nature and of large problem size; due to this complexity, physical design is normally divided into several sub-steps. The circuit is first partitioned into a number of macro cells (up to about 50), which are then placed on the chip (floor-planning). In the routing phase, pins on the boundaries of these modules are connected. The last step of physical design is compaction of the layout, in which it is compressed in all dimensions so that the total area is reduced. This survey paper studies efficient algorithms for partitioning in VLSI design and observes the common traits of the superior contributions.

ISSN No: 2250-3536, Volume 2, Issue 4, July 2012

Introduction

Partitioning is used in various EDA applications, e.g. as a tool for various placement algorithms or for assigning circuit elements to blocks that can be packaged separately. In VLSI design applications, partitioning algorithms are used to achieve various objectives:

1. Circuit Layout: A class of placement algorithms called min-cut partitioning is based on repeated partitioning of a given network so as to minimize the size of the cut set at each stage, where the cut set is the set of nets that connect the two partitions. At each partitioning stage, the chip area is also partitioned, e.g. alternately in the vertical and horizontal directions, and each block of the network is assigned to one region of the chip. This process is repeated until each block consists of only one cell; the resulting assignment of cells on the chip gives the final layout.

2. Circuit Packaging: Semiconductor technology places restrictions on the total number of components that can be placed on a single semiconductor chip. Large circuits are therefore partitioned into smaller sub-circuits that can be fabricated on separate chips. Circuit partitioning algorithms are used to obtain the sub-circuits, with the goal of minimizing the cut set, which determines the number of pins required on each chip. In FPGAs (Field Programmable Gate Arrays), too, this technique is gaining renewed importance: rapid prototyping requires automatic software for partitioning large circuits and mapping them onto several FPGA chips.

3. Circuit Simulation: Partitioning has been used to split a circuit into smaller sub-circuits which can be simulated independently; the results are then combined to study the overall circuit. This speeds up the simulation process severalfold and is used in relaxation-based circuit simulators. It is also used for simulating circuits on multiprocessors. [10]

Partitioning algorithms are of two types: constructive and iterative. Constructive algorithms start from empty initial partitions and grow clusters of well-connected components around one or more seed nodes or components, selected on the basis of user-defined criteria such as the number of fan-in and fan-out lines associated with a node. The quality of the solutions generated by this class of partitioning algorithms is poor and degrades significantly as the problem size increases; their main advantage is that they are very fast and usually scale well with problem size. Certain deterministic algorithms fall in this category. [11]

Iterative improvement algorithms start with an initial partitioning, obtained by some user-defined method or at random, and incrementally refine it through successive iterations, so that a complete solution is available at any stage. These algorithms terminate when no further improvement can be found, and thus often stop at local optima that are closely tied to the initial partitioning. Certain stochastic algorithms in this category may give better results than deterministic algorithms. [5-6]

Problem Formulation

Circuit partitioning divides a given circuit into a collection of smaller sub-circuits so as to minimize the number of connections among the sub-circuits, subject to an area balance constraint.



The circuit partitioning problem becomes more important as VLSI technology reaches sub-micron device dimensions. Traditionally, this problem was important for breaking a complex system up into several custom ASICs. Though it is possible to solve the case of unbounded partition sizes exactly, the case of balanced partition sizes is NP-complete. Kernighan and Lin showed that, in the worst case, it takes exponential time to divide a set of circuit elements into k blocks by enumerating all possible ways in which n circuit components can be divided into k equal blocks of size p = n/k. [8-9]

Figure 1: Circuit Partitioning, a Simple Example

The total number of unique ways of partitioning the graph into k equal blocks is

N(k) = (1/k!) C(n, p) C(n-p, p) C(n-2p, p) ... C(p, p) = n! / [k! (p!)^k]

where the factor 1/k! accounts for the k! permutations of the blocks. By Stirling's approximation, N(k) = O((n/p)^((n-p)/p)). The problem is NP-complete, as shown by Karp.
The partitioning problem can be solved by the following techniques:
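The block-counting formula above can be checked numerically for small instances; a minimal sketch (function names are our own):

```python
from math import comb, factorial

def n_partitions(n: int, k: int) -> int:
    """Number of ways to split n elements into k unlabeled blocks of size
    p = n/k, i.e. N(k) = n! / (k! * (p!)**k)."""
    p = n // k
    assert p * k == n, "n must be divisible by k"
    return factorial(n) // (factorial(k) * factorial(p) ** k)

def n_partitions_binomial(n: int, k: int) -> int:
    """Same count via the product C(n,p) C(n-p,p) ... C(p,p) / k!."""
    p = n // k
    total, remaining = 1, n
    for _ in range(k):
        total *= comb(remaining, p)
        remaining -= p
    return total // factorial(k)

print(n_partitions(6, 2))           # 10 ways to split 6 nodes into two blocks of 3
print(n_partitions_binomial(6, 2))  # 10, the same count via binomials
```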

Figure 2: Circuit Partitioning, a Graphical Representation

1. Clustering
2. Graph partitioning
3. Ratio-cut
4. Stochastic algorithms
5. Neural algorithms

Generic partitioning techniques are based on a graph model of the design. Each node in the graph represents a physical component, such as a gate, flip-flop, register or adder, and each edge represents a physical connection between two components. Multi-terminal nets spanning several components are decomposed into several two-terminal nets; thus we obtain a graph of nodes and two-terminal edges representing the circuit. The main objective of partitioning is to decompose the graph into a set of sub-graphs that satisfy given constraints, such as bounds on the sizes of the sub-graphs, while minimizing an objective function, such as the number of edges connecting the sub-graphs. Figure 1 shows a graph G divided into two sub-graphs G1 and G2, corresponding to chip1 and chip2 respectively. In figure 2, the edges e24 and e36 across the cut-line represent the two nets, ni and nj, connecting chip1 and chip2 in figure 3. [4, 6]
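Under this graph model, the cut size of a two-way partition is simply the number of edges with endpoints in different blocks. A minimal sketch, using a made-up six-node graph (not the actual graph of the figures):

```python
# Hypothetical six-node graph in the spirit of figures 1-3, as an edge list.
edges = [(1, 2), (1, 3), (2, 4), (3, 6), (4, 5), (5, 6)]

def cut_size(edges, block_a):
    """Number of edges with exactly one endpoint in block_a (the cut set)."""
    a = set(block_a)
    return sum((u in a) != (v in a) for u, v in edges)

print(cut_size(edges, {1, 2, 3}))  # edges (2, 4) and (3, 6) cross the cut -> 2
```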


Figure 3: Circuit Partitioning, Physical Representation

Graph Partitioning Algorithms
In 1970, Kernighan and Lin proposed a semi-greedy algorithm known as min-cut partitioning or the K-L algorithm. The K-L algorithm starts with a random partition and tries to minimize the cut cost by making small local changes, interchanging pairs of nodes. The algorithm makes several passes, each consisting of a series of interchanges of pairs of nodes, performed in order of maximum gain in cut cost. By the end of a pass all nodes have been interchanged, and the cut cost is back to its value at the beginning of the pass. The intermediate partitions



are examined, and the one reached by the prefix of pairwise exchanges that yields the smallest cut cost is returned as the outcome of the pass. The time complexity of the algorithm is O(n^2 log n) per pass; however, it is found that only a constant number of passes is required, independent of the graph size. The quality of the final partition often depends heavily on the initial partition. [2]
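A pass of this kind can be sketched in runnable form. The following is a simplified illustration for an unweighted graph, not the authors' implementation; the toy graph and all names are our own, and gains are recomputed naively rather than updated incrementally:

```python
import itertools

def kl_pass(adj, block_a, block_b):
    """One Kernighan-Lin pass on an unweighted graph.
    adj: node -> set of neighbours; block_a, block_b: the two blocks.
    Swaps every pair once, then keeps only the best prefix of swaps."""
    A, B = set(block_a), set(block_b)

    def D(v, own, other):
        # external cost minus internal cost of v
        return sum(u in other for u in adj[v]) - sum(u in own for u in adj[v])

    unlocked_a, unlocked_b = set(A), set(B)
    swaps, gains = [], []
    while unlocked_a and unlocked_b:
        # pair (a, b) of maximum gain D(a) + D(b) - 2*c(a, b)
        a, b = max(itertools.product(unlocked_a, unlocked_b),
                   key=lambda p: D(p[0], A, B) + D(p[1], B, A) - 2 * (p[1] in adj[p[0]]))
        gains.append(D(a, A, B) + D(b, B, A) - 2 * (b in adj[a]))
        swaps.append((a, b))
        A.remove(a); B.add(a); B.remove(b); A.add(b)   # tentative swap
        unlocked_a.discard(a); unlocked_b.discard(b)   # lock both nodes
    best_sum, run, prefix = 0, 0, 0
    for i, g in enumerate(gains, 1):                   # max prefix partial sum
        run += g
        if run > best_sum:
            best_sum, prefix = run, i
    for a, b in reversed(swaps[prefix:]):              # undo swaps past the prefix
        A.remove(b); B.add(b); B.remove(a); A.add(a)
    return A, B

# Toy instance: two triangles joined by the edge (3, 4), poor initial split.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
A, B = kl_pass(adj, {1, 2, 4}, {3, 5, 6})
print(sorted(A), sorted(B))  # the natural clusters: [1, 2, 3] [4, 5, 6]
```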

KL algorithm (one pass):
    Pair-wise exchange of nodes to reduce the cut size; the cut size is allowed to increase temporarily within a pass.
    Compute the gain of every swap.
    Repeat
        perform the feasible swap of maximum gain;
        mark the swapped nodes "locked";
        update the swap gains;
    until no feasible swap remains.
    Find the maximum prefix partial sum in the gain sequence g1, g2, ..., gm and make the corresponding swaps permanent.
    Start another pass if the current pass reduced the cut size (the algorithm usually converges after a few passes).

Figure 4. An Example for KL Pass

Figure 5. Two best solutions found (solutions are area balanced)

In 1982, Fiduccia and Mattheyses proposed a more efficient method of implementing Kernighan and Lin's algorithm, leading to a fast linear-time partitioning algorithm (the F-M algorithm). This algorithm is an iterative heuristic similar to Kernighan and Lin's. It improves upon the time complexity of the K-L algorithm by moving only one cell at a time and by using efficient data structures to search for the best cell to move and to minimize the effort required to update the cells after each move. These data structures eliminate the need for the repeated sorting that takes O(n log n) time in the K-L algorithm. The partitioning is done such that the sizes of the two blocks are in a given ratio to the size of the original circuit, up to a given tolerance, with the areas of the cells taken as the measure of partition size. The user can also specify some cells as fixed in either block. Each iteration is linear in the size of the input, O(P), where P is the total number of pins in the n circuit elements, and only a few iterations are required for convergence. [6]

FM algorithm:
    1. Start with a balanced partition.
    2. Move a cell across the partition if the move does not violate the balance condition.
    3. To choose the next vertex:
        a. find the vertex of maximum gain;
        b. move it if the balance condition is not violated;
        c. lock it.
    4. Identify the critical nets and update the gains of only those cells that are connected by those nets.

Figure 6. Data Structure used in FM Algorithm

Figure 7. Solutions after move 2 and 4 are balanced
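The single-cell moves and gain buckets can be sketched as follows. This is a simplified illustration for an unweighted graph rather than a hypergraph netlist; all names are our own, and the bucket scan is kept naive for clarity:

```python
from collections import defaultdict

def fm_pass(adj, side, max_imbalance=2):
    """One Fiduccia-Mattheyses-style pass on an unweighted graph:
    single-cell moves picked from gain buckets, each moved cell locked,
    and only the best prefix of moves kept.
    adj: node -> set of neighbours; side: node -> 0 or 1."""
    side = dict(side)
    g_of = {v: sum(1 if side[u] != side[v] else -1 for u in adj[v]) for v in adj}
    buckets = defaultdict(set)                     # gain value -> unlocked cells
    for v, g in g_of.items():
        buckets[g].add(v)
    sizes = [sum(1 for v in side if side[v] == s) for s in (0, 1)]
    locked, moves, gains = set(), [], []

    while True:
        # highest-gain unlocked cell whose move keeps the blocks balanced
        cand = next(((g, v) for g in sorted(buckets, reverse=True)
                     for v in sorted(buckets[g])
                     if abs(sizes[side[v]] - sizes[1 - side[v]] - 2) <= max_imbalance),
                    None)
        if cand is None:
            break
        g, v = cand
        buckets[g].remove(v)
        locked.add(v); moves.append(v); gains.append(g)
        sizes[side[v]] -= 1; sizes[1 - side[v]] += 1
        side[v] = 1 - side[v]
        for u in adj[v]:                           # update only the affected cells
            if u not in locked:
                buckets[g_of[u]].discard(u)
                g_of[u] = sum(1 if side[w] != side[u] else -1 for w in adj[u])
                buckets[g_of[u]].add(u)

    best_sum, run, prefix = 0, 0, 0                # best prefix of the move sequence
    for i, g in enumerate(gains, 1):
        run += g
        if run > best_sum:
            best_sum, prefix = run, i
    for v in moves[prefix:]:                       # undo moves past that prefix
        side[v] = 1 - side[v]
    return side

# Toy instance: two triangles joined by edge (3, 4), starting from a poor split.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
print(fm_pass(adj, {1: 0, 2: 0, 4: 0, 3: 1, 5: 1, 6: 1}))  # recovers {1,2,3} vs {4,5,6}
```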

Even though the F-M algorithm is fast, it tends to converge to local minima when ties occur between many cells in the gain lists: ties are broken blindly, at random, without considering the future effects on other nets of selecting a particular cell to move. This technique of choosing the best cell for movement to the other block was refined by Krishnamurthy in 1984 by adding a look-ahead (LA) mechanism. Krishnamurthy's algorithm maintains a multidimensional version of the F-M data structure, which lists the expected gains of cells in future moves. Thus, in the event of ties in the current-level gain in cut size due to a move, Krishnamurthy's LA algorithm chooses a cell which gives better gain in future moves. Since multilevel gain calculation and updating takes a significant amount of time, typically 2-level (LA-2) and 3-level (LA-3) gain lists are used in practice. Although the timing overhead of 20 runs of LA-3 is comparable to that of 100 runs of the F-M algorithm, statistically LA-3 is found to give a 7% improvement in cut size over the F-M algorithm. [3]
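The look-ahead tie-breaking amounts to comparing gain vectors lexicographically; a toy illustration with made-up cell names and gain values:

```python
# Hypothetical (level-1, level-2, level-3) gain vectors for three tied cells:
# all three have the same immediate gain but different look-ahead gains.
gains = {
    "c1": (2, 0, 1),
    "c2": (2, 1, -1),
    "c3": (2, 1, 0),
}

# LA-3 picks the cell whose gain vector is lexicographically largest,
# i.e. a tie at level 1 is broken at level 2, then at level 3.
best = max(gains, key=gains.get)
print(best)  # c3: ties with c2 through level 2, wins at level 3
```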

In 1989, Sanchis generalized Krishnamurthy's algorithm to a multiway circuit partitioning algorithm, so that it can handle the partitioning of a circuit into more than two parts; all the previous approaches, up to Sanchis' algorithm, were originally bipartitioning algorithms. Sanchis' approach takes about O(lkP log(k+p+l)) time for a k-way concurrent partitioning, where P is the total number of connections, p is the maximum number of pins on a cell and l is the number of gain levels. For k = 2 and l = 1 this reduces to linear time, as for the F-M algorithm. [6]

Figure 8. Greedy Nature of KL and FM Algorithms

Ratio-Cut Algorithms

The graph algorithms are successful in bipartitioning and multi-way partitioning, but they do not capture the fact that digital circuits are hierarchical in nature. Hierarchy imposes a certain type of clustering, whereas graph algorithms tend to divide the circuit into strictly balanced partitions, and the resulting cut sizes are not minimal. The ratio-cut algorithms, introduced by Wei and Cheng, identify natural clusters in the circuit and prevent them from being truncated by the cut-set. The algorithm looks for the best ratio-cut rather than the minimal cut size: the ratio-cut of a division of the graph into two blocks is the size of the cut-set between the blocks divided by the product of the cardinalities (sizes) of the blocks. The ratio-cut approach has been applied to a set of benchmark netlists, and the cut size is found to improve by up to 70% over that obtained by the F-M algorithm. [11]

Stochastic Algorithms

Dutt and Deng proposed a probability-based augmentation of graph partitioning algorithms, called the probabilistic gain computation (PROP) approach, which is capable of capturing the global and future implications of moving a node at the current time. The technique associates with each node j a probability p(M(j)), where M(j) is the event that node j will actually be moved to the other block in the current pass. From these probabilities, the probabilistic gains g(u) of all nodes u in the graph are calculated. While calculating g(u) of a node, all nets connected to u contribute to g(u), depending on their individual probabilities of lying on the cut-set. The probability calculations have a reasonable overhead, and the time complexity of PROP is O(rqP), where P is the total number of pins, q is the average number of pins a net connects and r (typically less than 5) is the total number of passes before the algorithm stops. [5]
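The ratio-cut metric from the Ratio-Cut Algorithms section above is straightforward to compute for a given bipartition; a sketch with a made-up graph of two natural clusters:

```python
def ratio_cut(edges, block_a, block_b):
    """Ratio-cut of a bipartition: |cut set| / (|A| * |B|).
    Lower is better; the denominator rewards balanced, natural clusters."""
    a = set(block_a)
    cut = sum((u in a) != (v in a) for u, v in edges)
    return cut / (len(block_a) * len(block_b))

# Made-up graph: two tight clusters {1,2,3} and {4,5,6} joined by one edge.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
print(ratio_cut(edges, {1, 2, 3}, {4, 5, 6}))  # 1/9: the natural clustering
print(ratio_cut(edges, {1, 2}, {3, 4, 5, 6}))  # 2/8: cutting into a cluster costs more
```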

Simulated annealing is a stochastic optimization method. The algorithm starts with a random solution and makes incremental refinements, moving cells from their current locations to new locations in order to generate new solutions. All moves that decrease the cost are accepted; moves that increase the cost are also accepted, according to a probabilistic decision function. The decision is made by drawing a random number between 0 and 1 and comparing it with a probability whose value decreases with the magnitude of the cost increase and according to a pre-assigned cooling schedule, in a manner similar to the Maxwell-Boltzmann molecular distribution. At the beginning of the algorithm, the high temperature setting allows many moves that increase the cost, but as the temperature approaches its final value, only a few cost-increasing moves are accepted. Simulated annealing is thus capable of finding globally optimal solutions; the disadvantage is that it takes much longer than linear-time algorithms such as the F-M algorithm. Greene and Supowit improved the time complexity by devising a rejectionless method, selecting moves by attaching weight factors to them. The technique works well for partitioning a circuit into smaller clusters if the average degree of the nodes is small. The runtime of the rejectionless simulated annealing algorithm is linear. [5]
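The probabilistic acceptance rule described above is the Metropolis criterion; a minimal sketch (the cost delta, temperatures and trial counts are made up for illustration):

```python
import math
import random

def accept(delta_cost, temperature, rng=random):
    """Metropolis-style acceptance used in simulated annealing:
    always accept cost decreases; accept an increase of delta_cost
    with probability exp(-delta_cost / T), which shrinks as T cools."""
    if delta_cost <= 0:
        return True
    return rng.random() < math.exp(-delta_cost / temperature)

random.seed(0)  # reproducible illustration
hot = sum(accept(5, 100.0) for _ in range(1000))   # early, high-temperature phase
cold = sum(accept(5, 0.5) for _ in range(1000))    # near the end of the schedule
print(hot, cold)  # most uphill moves pass when hot, almost none when cold
```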


Results
It is found that 100 runs of the F-M algorithm take about the same amount of time as 40 runs of LA-2 or 20 runs of PROP. The totals of the best cut-sets obtained by these three algorithms, run as above, are 1776 (F-M), 1898 (LA-2) and 1380 (PROP). Thus PROP achieves an improvement of 27.3% over LA-2 and 22.3% over F-M; PROP likewise shows an improvement of 16.6% over LA-3. A major limitation of all the above algorithms is that they are suitable for bipartitioning a graph or circuit network, not for multiway partitioning. [6] In order to perform k-way partitioning with these bipartitioning algorithms, one has to employ recursive bipartitioning if k is exactly a power of 2 or close to one; otherwise, one has to perform k(k-1)/2 separate runs of a bipartitioning algorithm. Neither method yields good results, since both tend to sequentially improve the partitioning between two blocks at a time. To obtain a globally optimal multiway partitioning, one has to apply partitioning over the entire graph or netlist simultaneously. [1]
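The recursive-bipartitioning scheme for k a power of two can be sketched generically; a toy illustration in which the bipartitioner is a trivial stand-in rather than K-L or F-M:

```python
def recursive_bisect(nodes, bisect, k):
    """k-way partition (k a power of two) by recursive bipartitioning:
    bisect the node set, then recurse on each half."""
    if k == 1:
        return [list(nodes)]
    a, b = bisect(nodes)           # plug in any bipartitioner, e.g. K-L or F-M
    return recursive_bisect(a, bisect, k // 2) + recursive_bisect(b, bisect, k // 2)

# Trivial stand-in bisector (a real one would minimize the cut): split in half.
halve = lambda xs: (xs[:len(xs) // 2], xs[len(xs) // 2:])
print(recursive_bisect(list(range(8)), halve, 4))  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```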

References

[1] T. Lengauer, Combinatorial Algorithms for Integrated Circuit Layout, John Wiley & Sons, 1990.
[2] H. Mühlenbein, H.-M. Voigt, "Gene Pool Recombination in Genetic Algorithms", Proc. of the Metaheuristics Int. Conf., I. H. Osman, J. P. Kelly (eds.), Kluwer Academic Publishers, Norwell, 1995.
[3] S. M. Sait, H. Youssef, VLSI Physical Design Automation: Theory and Practice, McGraw-Hill, 1995.
[4] V. Schnecke, O. Vornberger, "Genetic Design of VLSI Layouts", Proc. First IEE/IEEE Int. Conf. on GAs in Engineering Systems: Innovations and Applications, GALESIA '95, IEE Conference Publication No. 414, 1995, pp. 430-435.
[5] N. Sherwani, Algorithms for VLSI Physical Design Automation, Kluwer Academic Publishers, 1993.
[6] P. Mazumder, E. Rudnick, Genetic Algorithms for VLSI Design, Layout and Test Automation, Addison-Wesley Longman Singapore Pte. Ltd., Singapore, 1999.
[7] V. Schnecke, O. Vornberger, "A Genetic Algorithm for VLSI Physical Design Automation", in Proc. Second Int. Conf. on Adaptive Computing in Engineering Design and Control, ACEDC '96, 26-28 Mar 1996, University of Plymouth, U.K., pp. 53-58.
[8] H. Chang, L. Cooke, M. Hunt, Surviving the SOC Revolution, Kluwer Academic Publishers, London, 1999.
[9] W. E. Donath, "Complexity theory and design automation", in Proc. 17th Design Automation Conference, pp. 412-419, 1980.
[10] Z. Yang, S. Areibi, "Global Placement Techniques for VLSI Circuit Design", Technical report, School of Engineering, University of Guelph, Jul 2002.
[11] D. D. Gajski, N. D. Dutt, A. C.-H. Wu, S. Y.-L. Lin, High-Level Synthesis: Introduction to Chip and System Design, Kluwer Academic Publishers, 1994.
[12] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Pearson Education, 2009.

Biographies

Deepak Batra, Assistant Professor, obtained his Bachelor of Engineering degree in Electronics and Communication from Maharshi Dayanand University, Rohtak, in 1999 and his Master of Technology in VLSI-CAD from Manipal University, Manipal, Karnataka, in 2002. He has 4 years of experience in the IT industry and has been a faculty member at different colleges of GGSIPU, MDU and KUK University for the past 7 years. He has taught B.E. and M.E. students and has guided B.E. students in their projects.

Dhruv Malik, Assistant Professor, did his B.E. in ECE from UPTU, Lucknow, and his M.E. in ECE from Maharshi Markandeshwar University, Mullana. He has published a number of papers in international journals. His area of interest is Digital Communications.
