Abstract
As technology scales, interconnect delay has become a key
factor in VLSI design. Buffer insertion is an effective
technique for reducing interconnect delay. This paper
presents an advanced algorithm for finding the optimal
buffer insertion solution. The algorithm further improves on
the $O(mn)$ time algorithm for optimal buffer insertion, where
$m$ is the number of sinks and $n$ is the number of candidate
buffer insertion positions. Assuming that the number of sinks
$m$ is fixed, it is a significant improvement over the
$O(n \log^2 n)$ time algorithm, and the
a buffer.
In this paper, we first propose a new algorithm that finds
the optimal buffer insertion solution faster than the
previous best algorithm. The speedup is achieved by the
natural ordering of the buffer insertion solutions and the
observation that the optimal candidate solution associated
with any buffer type must lie on the concave shell in the
$(C, Q)$ plane. The algorithm can be extended to multi-pin
nets and $b$ buffer types in time $O(b^2 n + bmn)$. In fact, m is
Keywords
Buffer insertion, interconnect, time complexity, delay
1. Introduction
As integrated circuit feature sizes continue to scale down,
interconnect propagation delay becomes an increasingly
serious problem, so delay optimization techniques for
interconnect are increasingly important for achieving timing
closure in high-performance designs. Buffer insertion is a
popular technique for reducing interconnect delay. Many
algorithms have been presented for the buffer insertion
problem; their basic goal is to find an optimal placement of
buffers on a wiring tree so that the timing slack at the
source is maximized. A study [1] by Saxena et al. shows that
intra-block repeaters at the 32 nm node will reach an
alarming 70% of the total block cells. Finding the optimal
buffer insertion solution will take a very long time if the
algorithm is not fast enough. Consequently, more efficient
algorithms are required.
Over the last two decades, the buffer insertion problem has
been extensively studied. In 1990, van Ginneken [2]
presented a dynamic programming algorithm. The algorithm
obtains the optimal buffer insertion solution with time
complexity $O(n^2)$, where $n$ is the number of buffer
$C(B)$.
$D(v, \alpha) =$
$e = (v_{i+1}, v_i)$
$C(v_{i+1}) = C(B)$
The set of $(C, Q)$ pairs, one per solution, is denoted as
Figure 1: Original solutions set ($N(v)$ before pruning)
$$C(v_{i+1}) = C(v_i, \alpha) + c\,l,$$
$$Q(v_{i+1}) = Q(v_i, \alpha) - \frac{r\,c\,l^2}{2} - r\,l\,C(v_i, \alpha).$$
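The wire update can be illustrated with a minimal C++ sketch. It assumes the Elmore delay model, that a candidate solution is stored as a $(C, Q)$ pair, and that `r` and `c` are the unit-length wire resistance and capacitance. The names `Candidate` and `addWire` are hypothetical and do not come from the paper.

```cpp
#include <cassert>

// Hypothetical candidate record: downstream capacitance C and
// required arrival time Q at a vertex.
struct Candidate {
    double C;
    double Q;
};

// Propagate one candidate across a wire of length l.
// r, c: unit-length wire resistance and capacitance (assumed inputs).
Candidate addWire(const Candidate& a, double r, double c, double l) {
    assert(l >= 0.0);  // a wire segment cannot have negative length
    Candidate out;
    out.C = a.C + c * l;                        // wire adds capacitance c*l
    out.Q = a.Q - r * l * (c * l / 2.0 + a.C);  // subtract Elmore wire delay
    return out;
}
```

Because the wire delay grows with the downstream capacitance, candidates with larger $C$ lose more slack on the same wire.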
$$\frac{c_j - c_i}{q_j - q_i} > \frac{c_k - c_j}{q_k - q_j}$$
we have
the following.
Lemma 1: At the vertex $v$ in a 2-pin net, if there are three
solutions $\alpha_i$, $\alpha_j$ and $\alpha_k$, $i < j < k$, which meet the condition
$$\frac{C(v, \alpha_j) - C(v, \alpha_i)}{Q(v, \alpha_j) - Q(v, \alpha_i)} > \frac{C(v, \alpha_k) - C(v, \alpha_j)}{Q(v, \alpha_k) - Q(v, \alpha_j)},$$
then $\alpha_j$ is redundant.
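The condition of Lemma 1 can be applied in a single linear pass over a sorted candidate list, in the style of a Graham scan. The following is a minimal sketch, not the paper's Prune() routine. It assumes the candidates are already sorted so that both $C$ and $Q$ strictly increase, and the names `Candidate`, `isRedundant` and `convexPrune` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical candidate record: capacitance C and required arrival time Q.
struct Candidate {
    double C;
    double Q;
};

// Lemma 1 condition for the middle candidate j, cross-multiplied to avoid
// division. Valid when Q strictly increases from i to j to k.
bool isRedundant(const Candidate& i, const Candidate& j, const Candidate& k) {
    return (j.C - i.C) * (k.Q - j.Q) > (k.C - j.C) * (j.Q - i.Q);
}

// Remove every candidate that Lemma 1 marks as redundant.
// Each candidate is pushed once and popped at most once, so the pass is O(n).
std::vector<Candidate> convexPrune(const std::vector<Candidate>& sorted) {
    std::vector<Candidate> kept;
    for (const Candidate& next : sorted) {
        // Assumed precondition: strictly increasing C and Q.
        assert(kept.empty() ||
               (kept.back().Q < next.Q && kept.back().C < next.C));
        while (kept.size() >= 2 &&
               isRedundant(kept[kept.size() - 2], kept.back(), next)) {
            kept.pop_back();
        }
        kept.push_back(next);
    }
    return kept;
}
```

For example, with $(C, Q)$ pairs $(1, 1)$, $(3, 2)$, $(4, 4)$ the slopes are $2$ and $0.5$, so the middle candidate is dropped.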
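$$Q(v', \alpha_i) = Q(v, \alpha_i) - \frac{r c L^2}{2} - r L\,C(v, \alpha_i) - K(B) - R(B)\,\bigl(C(v, \alpha_i) + cL\bigr) = Q(v, \alpha_i) - R\,C(v, \alpha_i) - D,$$
where $R = rL + R(B)$ and $D = \frac{r c L^2}{2} + K(B) + R(B)\,cL$ are the same for every solution.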
$$\frac{c_j - c_i}{q_j - q_i} > \frac{c_k - c_j}{q_k - q_j},$$
$j$ is redundant.
$$\frac{c_3 - c_2}{q_3 - q_2} > \frac{c_5 - c_3}{q_5 - q_3},$$
3 is redundant and should be pruned.
since
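Therefore
$$Q(v', \alpha_k) - Q(v', \alpha_j) = Q(v, \alpha_k) - Q(v, \alpha_j) - R\,\bigl(C(v, \alpha_k) - C(v, \alpha_j)\bigr),$$
$$Q(v', \alpha_j) - Q(v', \alpha_i) = Q(v, \alpha_j) - Q(v, \alpha_i) - R\,\bigl(C(v, \alpha_j) - C(v, \alpha_i)\bigr).$$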
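Since every solution is shifted by the same $D$ and scaled by the same $R$, the best solution to place behind a buffer is the one that maximizes $Q - R\,C$. The sketch below illustrates this with a plain linear scan, which is an illustrative simplification and not the faster selection the paper proposes. The names `Candidate`, `bestForBuffer` and `insertBuffer`, and the parameter `bufferCin` (standing for $C(B)$), are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate record: capacitance C and required arrival time Q.
struct Candidate {
    double C;
    double Q;
};

// Index of the candidate that maximizes Q - R*C.
// D is identical for all candidates, so it does not affect the choice.
std::size_t bestForBuffer(const std::vector<Candidate>& cands, double R) {
    assert(!cands.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < cands.size(); ++i) {
        if (cands[i].Q - R * cands[i].C >
            cands[best].Q - R * cands[best].C) {
            best = i;
        }
    }
    return best;
}

// New candidate after inserting the buffer: Q' = Q - R*C - D, and the
// capacitance seen upstream becomes the buffer input capacitance.
Candidate insertBuffer(const std::vector<Candidate>& cands,
                       double R, double D, double bufferCin) {
    const Candidate& a = cands[bestForBuffer(cands, R)];
    return Candidate{bufferCin, a.Q - R * a.C - D};
}
```

A larger $R$ favors low-capacitance candidates, while a smaller $R$ favors candidates with high $Q$.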
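When
$$Q(v', \alpha_k) - Q(v', \alpha_j) > 0, \qquad Q(v', \alpha_j) - Q(v', \alpha_i) > 0,$$
we have
$$\frac{C(v, \alpha_k) - C(v, \alpha_j)}{Q(v, \alpha_k) - Q(v, \alpha_j)} < \frac{1}{R}, \qquad \frac{C(v, \alpha_j) - C(v, \alpha_i)}{Q(v, \alpha_j) - Q(v, \alpha_i)} < \frac{1}{R}.$$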
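The step from the sign conditions to the slope bounds can be spelled out as follows. This is a short derivation sketch; it assumes $R > 0$ and that the solutions are indexed so that $Q(v, \alpha_k) > Q(v, \alpha_j)$ for $j < k$, which this passage does not state explicitly.

$$
\begin{aligned}
Q(v', \alpha_k) - Q(v', \alpha_j) > 0
&\iff Q(v, \alpha_k) - Q(v, \alpha_j) - R\,\bigl(C(v, \alpha_k) - C(v, \alpha_j)\bigr) > 0 \\
&\iff R\,\bigl(C(v, \alpha_k) - C(v, \alpha_j)\bigr) < Q(v, \alpha_k) - Q(v, \alpha_j) \\
&\iff \frac{C(v, \alpha_k) - C(v, \alpha_j)}{Q(v, \alpha_k) - Q(v, \alpha_j)} < \frac{1}{R}.
\end{aligned}
$$

The same argument with $(i, j)$ in place of $(j, k)$ gives the second bound.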
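we have
$$\frac{C(v, \alpha_k) - C(v, \alpha_j)}{Q(v, \alpha_k) - Q(v, \alpha_j)} > \frac{1}{R}, \qquad \frac{C(v, \alpha_j) - C(v, \alpha_i)}{Q(v, \alpha_j) - Q(v, \alpha_i)} < \frac{1}{R},$$
therefore
$$\frac{C(v, \alpha_j) - C(v, \alpha_i)}{Q(v, \alpha_j) - Q(v, \alpha_i)} < \frac{C(v, \alpha_k) - C(v, \alpha_j)}{Q(v, \alpha_k) - Q(v, \alpha_j)}$$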
$C(v_k) = C(v_m) + C(v_n)$
}
When the algorithm reaches a vertex $v_i$, it first checks
whether the parent vertex is a merging vertex. If not, the
solutions of vertex $v_i$ are pruned in two stages, denoted
Prune(). If so, the solutions are pruned by the first stage
only, denoted MergePrune(). By predicting the type of the
parent vertex, we save considerable time and space compared
with [13], which maintains a redundant solution list $A(v)$.
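For the merging-vertex case, the following minimal sketch shows how two branch solution lists can be combined. It assumes the standard van Ginneken merge rule, in which capacitances add and the required arrival time is the minimum of the two branches, and it assumes both lists are sorted by increasing $Q$. It does not implement Prune() or MergePrune(); the names `Candidate` and `mergeBranches` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate record: capacitance C and required arrival time Q.
struct Candidate {
    double C;
    double Q;
};

// Merge the candidate lists of two child branches at a merging vertex.
// Both inputs are assumed sorted by increasing Q (and C).
std::vector<Candidate> mergeBranches(const std::vector<Candidate>& left,
                                     const std::vector<Candidate>& right) {
    assert(!left.empty() && !right.empty());
    std::vector<Candidate> out;
    std::size_t i = 0, j = 0;
    while (i < left.size() && j < right.size()) {
        // Capacitances add; the tighter branch decides Q.
        out.push_back(Candidate{left[i].C + right[j].C,
                                std::min(left[i].Q, right[j].Q)});
        // Advance the branch that limits Q: pairing it with later entries
        // of the other branch would only add capacitance.
        if (left[i].Q < right[j].Q) {
            ++i;
        } else if (right[j].Q < left[i].Q) {
            ++j;
        } else {
            ++i;
            ++j;
        }
    }
    return out;
}
```

The two-pointer walk produces at most `left.size() + right.size() - 1` candidates.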
$Q = RAT(v_n)$ and $C = C(v_n)$;
5. Experimental results
All algorithms are implemented in C++ on a Linux server
with an Intel(R) Xeon(R) 2.4 GHz CPU and 12 GB of memory.
The device and interconnect parameters are shown in Table 1,
adapted from [4] and [13], and are based on TSMC 180 nm
technology.
Table 1: Device and interconnect parameters
| Parameter | Value |
|---|---|
| unit length capacitance | 0.118 fF/μm |
| unit length resistance | 0.076 Ω/μm |
| buffer intrinsic delay | 29 ps to 34 ps |
| output resistance | 180 Ω to 500 Ω |
| input capacitance | 0.7 fF to 10 fF |
Table 2 shows the simulation results for 2-pin nets with
different numbers of buffer insertion positions, based on
one buffer type. Table 3 shows the simulation results for
multi-pin nets with different numbers of sinks and buffer
insertion positions. Our new algorithm is compared with both
van Ginneken's algorithm [2] and Li and Shi's algorithm [13].
We have three different buffer libraries, whose sizes are 1
and 2, respectively, and denote them as
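Table 2: Simulation results for 2-pin nets with one buffer type (CPU time in seconds)
| Buffer positions n | Van Ginneken [2] | Li and Shi [13] | New algorithm |
|---|---|---|---|
| 325 | 0.315 | 0.004 | 0.0002 |
| 404 | 0.377 | 0.006 | 0.0003 |
| 725 | 1.008 | 0.007 | 0.0004 |
| 1297 | 3.277 | 0.013 | 0.0006 |
| 1522 | 4.569 | 0.015 | 0.0007 |
| 2044 | 8.343 | 0.020 | 0.0010 |
| 2567 | 13.309 | 0.025 | 0.0011 |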
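Table 3: Simulation results for multi-pin nets (CPU time in seconds)
| Sinks m | Buffer pos. n | Buffer type b | Van Ginneken [2] | Li and Shi [13] | New algorithm |
|---|---|---|---|---|---|
| 10 | 467 | b1 | 0.295 | 0.050 | 0.0002 |
| 10 | 467 | b2 | 0.465 | 0.070 | 0.0003 |
| 50 | 2547 | b1 | 9.574 | 0.018 | 0.0013 |
| 50 | 2547 | b2 | 12.153 | 0.026 | 0.0017 |
| 75 | 3848 | b1 | 22.289 | 0.028 | 0.0023 |
| 75 | 3848 | b2 | 28.170 | 0.041 | 0.0030 |
| 100 | 5147 | b1 | 40.138 | 0.040 | 0.0039 |
| 100 | 5147 | b2 | 50.820 | 0.058 | 0.0046 |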
6. Conclusion
We have proposed an advanced algorithm that speeds up the
search for the optimal buffer solution of a wire tree. The
time complexity of the algorithm is $O(b^2 n + mn)$ for
multi-pin nets. Simulation results show that our algorithm
runs clearly faster than previous works for both 2-pin nets
and multi-pin nets. In addition, as a fundamental algorithm,
our result can be applied within some previous works, such
as [4] and [10].
7. References
[1]
[2]
[3]