
Marvell Semiconductor Ltd.
Last Update: 11/04/2008, By: Hatem Yazbek

Reliable Frequency Prediction @ RTL Level in deep sub-micron processes

Hatem Yazbek
Marvell Semiconductors Inc.
hatem@marvell.com

Abstract
This paper presents a Fast Flow for predicting RTL performance. Many projects, especially
those built on re-use and proliferation, require a frequency speed-up from one generation
to the next, in some cases at the same process node (e.g. TSMC 65nmG). The flow is
centered on Synopsys's DCT (DC Topographical) tool. It automates the generation of
floorplan information and injects it into DCT before any backend Place&Route is done.
RTL Front End (FE) designers, who own the SDC and the RTL, use the flow to explore their
RTL code and get quick feedback from a real synthesis flow that takes the physical side
of the design into account. Fast iterations of this flow speed up performance improvement
and help identify correlated critical timing paths, which in turn improves frequency.

The Fast Flow for Predicting frequency and area (FFP) relies mainly on DCT, run with a
set of parameters tuned for the best correlation between DCT results and the physical
backend design. In this paper we investigate and quantify those parameters and the
resulting timing correlation, and look more closely at the benefits of congestion
removal. Used this way, DCT provides early prediction of frequency and area with good
correlation to the backend stages, which increases productivity and reduces time to
market.

Marvell Technology Page 1 of 20 SNUG Israel 2008



Table of Contents
1.0 Introduction
2.0 DC Topographical (DCT) – Technology and Benefits
    2.1 The Technology
    2.2 The Stated Benefits versus Benefits Required by our Project
3.0 Fast Flow Prediction (FFP) in the project development stages
    3.1 Stage 1 of the FFP flow
    3.2 Stage 2 of the FFP flow
    3.3 Stage 3 of the FFP flow
4.0 Pilot Test Case Description
5.0 Results
6.0 Quality of results comparison
7.0 Timing and critical path correlation
8.0 Runtime comparison
9.0 Routing and Congestion prediction
10.0 FFP Flow Stage 1 Initial Results
11.0 Conclusions
12.0 Acknowledgements
13.0 References

Table of Figures
Figure 1 - Timing correlation ICC vs. DCT 1pins_1macro_cong
Figure 2 - Timing correlation ICC vs. DCT 1pins_1macro
Figure 3 - Timing correlation ICC vs. DCT 1pins_0macro
Figure 4 - Timing correlation ICC vs. DCT 0pins_0macro
Figure 5 - DCT congestion map with 0pins_0macro experiment
Figure 6 - ICC congestion map with 0pins_0macro experiment
Figure 7 - DCT congestion map with 1pins_0macro experiment
Figure 8 - ICC Congestion map with 1pins_0macro experiment
Figure 9 - DCT congestion map with 1pins_1macro experiment
Figure 10 - ICC Congestion map with 1pins_1macro experiment (experiment 3)
Figure 11 - DCT congestion map with 1pins_1macro_cong experiment
Figure 12 - ICC Congestion map with 1pins_1macro_cong experiment (experiment 4)

1.0 Introduction
Marvell's Switching design team is continuously challenged by the ever-increasing demand
for performance and frequency in next-generation network switches and fabrics. Even at
a given process node (e.g. 65nm), designers must increase the performance of macros and
logic units. These challenges are met under demanding schedule constraints, driven by a
competitive landscape and tier-1 customer requirements.

Usually, synthesis of RTL code starts only after the RTL designer has released it.
Synthesis therefore runs at a late stage, and only then do we verify whether the RTL can
meet the target frequency. This process requires an engineer from the backend (BE) group
to run experiments for the frontend (FE) designer: the BE designer runs synthesis,
generates reports, and sends them to the FE designer for analysis, and the FE designer
then fixes the RTL to address the critical paths.

FE/RTL designers have run DC (not DCT), but not on a large scale. Since we moved to
advanced processes (90nm and beyond) and high-speed designs, DC has proven weak at
predicting critical paths with estimated or even zero wireloads, especially in deep
sub-micron processes. In those processes interconnect becomes a dominant part of the
delay and can no longer be neglected or roughly estimated.

The purpose of this work is to provide an early prediction of frequency that is within
10% of the final BE target. We want the FE designer to predict the critical paths and
the frequency at an early stage. DC does not satisfy this need; DCT, on the other hand,
addresses many of the questions raised above. For sub-micron processes, where
interconnect delay must be estimated, critical paths are best predicted by using an
initial floorplan DEF combined with other parameters such as resistance and capacitance
scaling and over-constraining the cycle time.

Note that we do not look for exact correlation of critical paths, but rather for DCT
results that are within 10% of the actual ICC results. The DCT parameters (R and C
scaling, layer usage, and cycle time) can be tuned differently for each block, macro, or
unit. Blocks or macros should be treated differently based on the following:
1. Size of block
2. Aspect ratio of block
3. Congestion and Routing of block
4. Number of rams/sub-macros placed inside the block

2.0 DC Topographical (DCT) – Technology and Benefits


2.1 The Technology
Synopsys introduced DC Topographical around 2006. The technology uses a placement engine
to place all the cells in the design. The placement is ICC-based but is not a full
placement: legalization is not rigorous, and the results are not suitable for final
placement and routing of the design.

Virtual routes are created by a global-route engine. Physical information from the
design floorplan (FP DEF) is taken into account for virtual placement and routing.
Finally, net capacitance and resistance are estimated using data derived directly from
the foundry technology file. The R and C values calculated from the placement and
routing engines give a more accurate estimate than wire load models. More recently, in
the 2007.12 DCT release, Synopsys added a congestion optimization feature to the
compile_ultra command. Running compile_ultra with the –congestion option on
blocks/macros that are congested in routing reduces that congestion before the design
reaches ICC.

2.2 The Stated Benefits versus Benefits Required by our Project


Synopsys's stated goal for Topographical technology is to “deliver accurate correlation
to post layout timing, area and power”. The following claims are the ones we intend to
verify and quantify as a proof of concept:
• DCT is for RTL designers (FE designers) with no physical expertise
• DCT within the FFP flow can be used to predict performance and area at early
stages of the design
• DCT and the Fast Flow Prediction flow therefore create a better starting point for
physical BE design

3.0 Fast Flow Prediction (FFP) in the project development stages

To give FE designers a quick flow for predicting frequency, we developed the flow in
several stages, each matching a phase of the project schedule and development. Figure 1
shows the general idea of the flow: DCT predicts the critical timing paths of the RTL;
after analyzing the timing reports, the FE designer fixes those paths in the RTL and
re-runs the flow, and so on.

[Flow diagram; recoverable box labels: SDC, RTL, Generate FP.def, FP DEF from ICC/Astro,
DCT Synthesis (Scan, DFT Order), Timing Analysis and RTL fix, Verification (Formal,
Timing critical paths), Backend Responsibility, Further Area Reduction, Final
Optimization of flow]

Figure 1 – FFP general flow

During discovery stages of the backend, and early stages of RTL coding, we use stage 1
of the flow, where the top level floorplan is NOT ready yet.

3.1 Stage 1 of the FFP flow

Stage 1 uses DC/DCT to synthesize the RTL code with the following approach:

1. Run DC to estimate the area.
2. Use ready-made TCL utilities that report the area and create an initial DEF without
pin or macro placement. We use DEFAULT values for the aspect ratio of the block
(X2Y=2_to_3) and assume about 60% placement utilization.
3. Run DCT with the initial DEF and the default SDC.
4. The initial SDC uses:
a. default input/output delay (1/3*Cycle time)
b. default input transition (0.15ns)
c. default output load (0.15pF)
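
As a rough illustration of steps 2 and 4, the sketch below derives a die size from a
synthesized cell area and collects the default constraint values. It is not the TCL
utility used in the flow: the function names and the example area are hypothetical, and
it assumes that X2Y=2_to_3 means a width-to-height ratio of 2:3 and that the die area is
simply the cell area divided by the utilization.

```python
import math


def initial_floorplan(cell_area, utilization=0.60, x_to_y=(2, 3)):
    """Return (width, height) of a rectangular die for a given cell area.

    Assumptions (not taken from the paper's scripts):
    - die area = cell_area / utilization
    - x_to_y = (2, 3) means width : height = 2 : 3
    """
    die_area = cell_area / utilization
    ratio = x_to_y[0] / x_to_y[1]           # width / height
    height = math.sqrt(die_area / ratio)
    width = ratio * height
    return width, height


def default_sdc_values(cycle_ns):
    """Default stage-1 constraint values as listed in the text."""
    return {
        "io_delay_ns": cycle_ns / 3.0,      # 1/3 of the cycle time
        "input_transition_ns": 0.15,
        "output_load_pf": 0.15,
    }


# Hypothetical cell area, in the library's area units.
width, height = initial_floorplan(cell_area=600_000.0)
print(f"die size: {width:.1f} x {height:.1f}")
print(default_sdc_values(cycle_ns=4.0))
```

The die dimensions would then be written into the DIE_AREA and row definitions of the
initial DEF; that step depends on the technology and is not shown here.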

[Flow diagram; recoverable box labels: SDC, RTL, DC Synthesis, Calculate Area / Setup SDC
(CreateInitDEF, CreateInitSDC), Area/SDC, FP.def, Physical Libraries, DCT Synthesis
(2 compile_ultra), Timing Analysis and RTL fix]

Figure 2 - The flow used during stage 1 of the FFP flow.

3.2 Stage 2 of the FFP flow

This stage applies once the project BE group has an initial full-chip floorplan with pin
assignment done. We can then run the DCT flow with realistic pin assignments instead of
the default pins assigned by the DCT tool. Correlation experiments against ICC show that
pin placement is extremely important to critical path correlation.

3.3 Stage 3 of the FFP flow

This stage applies once the BE designer has initial macros/rams placed in a floorplan.
At this point the floorplan DEF contains both pin and macro/ram placement, and the
critical paths reported by DCT correlate more closely to the final ICC result.

A future enhancement of stage 1 (or of any stage prior to stage 3) is to place the
macros/rams inside the floorplan automatically. The FE designer groups the rams based on
their original logical hierarchy or logical function; each group is then bounded and
clustered inside the floorplan at the physical boundaries of the block/macro. This will
approximate the real macro/ram placement more closely and will increase the accuracy and
correlation of critical paths to the final ICC stage.
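
The grouping step of this proposed enhancement could look roughly like the sketch below.
It is only an illustration: the instance paths are hypothetical, the function name is
ours, and it assumes that the logical hierarchy can be read from "/"-separated instance
names. The physical clustering of each group near the block boundary is not shown.

```python
from collections import defaultdict


def group_rams_by_hierarchy(ram_paths, depth=1):
    """Group RAM instance paths by their first `depth` hierarchy levels.

    RAMs that sit directly at the top of the block go into a '<top>' group.
    """
    groups = defaultdict(list)
    for path in ram_paths:
        parts = path.split("/")
        key = "/".join(parts[:depth]) if len(parts) > depth else "<top>"
        groups[key].append(path)
    return dict(groups)


# Hypothetical RAM instance names.
rams = ["u_rx/fifo/ram0", "u_rx/fifo/ram1", "u_tx/ram0", "ram_top"]
for group, members in group_rams_by_hierarchy(rams).items():
    print(group, members)
```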

4.0 Pilot Test Case Description

One of the pilot projects for this flow has more than 30 units/macros/blocks, each of
which has to go through its own synthesis to place&route flow. We looked for early
prediction of frequency and also of area/utilization numbers. To generalize the flow and
pick the best DCT tuning parameters, we chose two representative test cases:
1. C_MACRO testcase, which is less critical in timing but has many rams placed (> 60%
of the total area is occupied by rams or memories)
2. B_MACRO testcase, which is critical in timing and mostly congested in routing, and
also has a considerable number of rams/memories

In both cases we wanted to tune the DCT parameters for the best correlation between DCT
results and ICC results. For these experiments we ran ICC through the place_opt stage.
The parameters are:
1. Cycle time scale. We use 0.9, so DCT runs at 0.9*Cycle_time. For a 4.0ns cycle time,
DCT runs at 0.9*4=3.6ns.
2. Resistance scale of estimated wires. We use a 5% scale, so wires have 5% extra
resistance.
3. Capacitance scale of estimated wires. We use a 20% scale, so wires have 20% extra
loading.
4. Layer availability (or porosity) of the power grid, which we take into account (a new
feature of 2007.12-SP1).
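
The arithmetic behind the first three parameters can be written out as a small sketch.
The function and key names are hypothetical; the only inputs taken from the text are the
0.9 cycle scale and the 5% and 20% extra resistance and capacitance, which we interpret
here as multipliers of 1.05 and 1.20.

```python
def dct_run_settings(cycle_ns, cycle_scale=0.9, r_extra=0.05, c_extra=0.20):
    """Derive the over-constrained DCT period and the R/C multipliers."""
    dct_cycle_ns = cycle_ns * cycle_scale
    return {
        "dct_cycle_ns": dct_cycle_ns,
        "dct_freq_mhz": 1000.0 / dct_cycle_ns,   # equivalent clock frequency
        "r_multiplier": 1.0 + r_extra,           # 5% extra wire resistance
        "c_multiplier": 1.0 + c_extra,           # 20% extra wire capacitance
    }


settings = dct_run_settings(cycle_ns=4.0)
print(settings)
```

For the 4.0ns example in the text this gives a 3.6ns DCT period, i.e. DCT is asked to
close timing at roughly 278 MHz for a 250 MHz target.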

5.0 Results

We chose several experiments to study the FFP flow and the correlation between DCT and
ICC. Every experiment used the same SDC and RTL code and followed these steps:
1. Pick a combination of pin and macro/ram placement as the floorplan DEF input
2. Run DCT using R, C, and cycle tuning
3. Run ICC using the final BE floorplan DEF
4. Correlate the same paths (start/end) from ICC to DCT and vice versa

Note that the best R and C scaling factors were picked for a specific test case. Those
values seem to fit well for other macros/blocks with a similar floorplan (size, number
of rams, pins). Scaling the cycle time down by 10% when running DCT gave better
correlation to ICC results. Further research may yield better values for R and C scaling
and for the cycle time. Synopsys recommended running DCT with the same cycle time as in
ICC, but we noticed that ICC meets timing more easily when the cycle time is
over-constrained in DCT. All experiments used a floorplan DEF that included the
following sections: DIE_AREA, Rows (must identify the die area boundaries),
Tracks/Gcells, PINS (experiments 2-4), COMPONENTS (experiments 3-4), Blockage
(non-buffer blockage around rams), and SPECIALNETS (VDD/VSS power grid).

Experiments are described in Table 1 below:

Table 1 - Experiments Description

Experiment Index            Pins Assigned   Macro/Ram placed   C, R Scale   Cycle Scale
Exp1 (0pins_0macro)         0               0                  20%, 5%      0.9
Exp2 (1pins_0macro)         1               0                  20%, 5%      0.9
Exp3 (1pins_1macro)         1               1 (real)           20%, 5%      0.9
Exp4 (1pins_1macro_cong)    1               1 (w –cong)        20%, 5%      0.9

6.0 Quality of results comparison

We compared the DCT and ICC runs. For each of the 4 experiments we gathered the
corresponding QoR data, with DCT and ICC run in exactly the same environment. Table 2
below shows the collected data.
Table 2 - DCT/ICC QOR Results

Experiment ->    0pins_0macro                  1pins_0macro                  1pins_1macro                  1pins_1macro_cong
Category         DCT       ICC       Diff      DCT       ICC       Diff      DCT       ICC       Diff      DCT       ICC       Diff
WNS (F2F)        0         -0.0283   0.63%     -0.01     -0.0567   1.04%     0         -0.0278   0.62%     -0.24     -0.0204   -4.88%
WNS (IN=I2F)     -0.39     -0.0027   -8.61%    -0.39     -0.0001   -8.66%    -0.41     -0.0012   -9.08%    -0.42     -0.0003   -9.33%
WNS (OUT=F2O)    -0.29     0.0003    -6.45%    -0.29     -0.0022   -6.40%    -0.44     -0.0007   -9.76%    -0.5      0.0001    -11.11%
Cell Area        3772402   3695826   2.07%     3779006   3694240   2.29%     3766440   3692364   2.01%     3821470   3696312   3.39%
Cell Count       326305    317212    2.87%     325220    316092    2.89%     324629    314184    3.32%     332365    317774    4.59%

Note that ICC gives its best timing and its best runtime (see Table 3 below) when it
starts from DCT experiment 4 (1pins_1macro_cong). DCT itself, however, shows a timing
violation in experiment 4, where compile_ultra was run with the –congestion option.
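
The paper does not define how the Diff column of Table 2 is computed. As a hedged
reading, the published values are reproduced if the WNS difference (DCT minus ICC) is
normalized by a 4.5 ns period and the area and cell-count differences are taken relative
to the ICC value. The 4.5 ns figure is inferred from the numbers, not stated in the
text, and the function names are ours.

```python
def wns_diff_pct(dct_wns, icc_wns, period_ns=4.5):
    """WNS difference (DCT - ICC) as a percentage of an assumed clock period."""
    return (dct_wns - icc_wns) / period_ns * 100.0


def rel_diff_pct(dct_value, icc_value):
    """Relative difference of a DCT value with respect to the ICC value."""
    return (dct_value - icc_value) / icc_value * 100.0


# Values copied from Table 2, experiments 0pins_0macro and 1pins_1macro_cong.
print(round(wns_diff_pct(0.0, -0.0283), 2))        # F2F, 0pins_0macro
print(round(wns_diff_pct(-0.24, -0.0204), 2))      # F2F, 1pins_1macro_cong
print(round(rel_diff_pct(3772402, 3695826), 2))    # cell area, 0pins_0macro
print(round(rel_diff_pct(326305, 317212), 2))      # cell count, 0pins_0macro
```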

7.0 Timing and critical path correlation

We wrote several scripts to correlate the same paths from ICC back to DCT, and vice
versa. Our purpose is to correlate path by path between ICC and DCT. We collected the
timing report from ICC (showing 500 paths), then wrote perl/awk scripts that create a
TCL command file which reports timing from each startpoint to each endpoint shown in
the ICC report. This TCL command file was run as is from the DCT prompt after loading
the corresponding DDC session.
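
The original perl/awk scripts are not reproduced in this paper. A minimal sketch of the
same idea, written in Python, is shown below. It assumes the ICC report uses the usual
"Startpoint:" and "Endpoint:" lines and emits one report_timing -from/-to command per
path; the function name and the sample report text are hypothetical, and names are
wrapped in braces so that bus brackets are not interpreted by TCL.

```python
import re

START_RE = re.compile(r"^\s*Startpoint:\s*(\S+)")
END_RE = re.compile(r"^\s*Endpoint:\s*(\S+)")


def report_to_tcl(report_text):
    """Turn Startpoint/Endpoint pairs of a timing report into TCL commands."""
    commands = []
    start = None
    for line in report_text.splitlines():
        match = START_RE.match(line)
        if match:
            start = match.group(1)
            continue
        match = END_RE.match(line)
        if match and start is not None:
            commands.append(
                "report_timing -from {%s} -to {%s}" % (start, match.group(1))
            )
            start = None
    return commands


# Hypothetical excerpt of an ICC timing report.
sample = """
  Startpoint: u_a/q_reg[0]
              (rising edge-triggered flip-flop clocked by clk)
  Endpoint: u_b/d_reg[0]
            (rising edge-triggered flip-flop clocked by clk)
"""
print("\n".join(report_to_tcl(sample)))
```

The resulting lines can be written to a file and sourced in the DCT session, after
which the reported delays are compared path by path.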

[Plot: ICC delay [ns] (vertical axis, 2 to 3.5) vs. DCT delay [ns] (horizontal axis,
2 to 4) for 1pins_1macro_cong, with the perfect correlation line]

Figure 1 - Timing correlation ICC vs. DCT 1pins_1macro_cong

[Plot: ICC delay [ns] (vertical axis, 2 to 3.5) vs. DCT delay [ns] (horizontal axis,
2 to 4) for 1pins_1macro, with the perfect correlation line]

Figure 2 - Timing correlation ICC vs. DCT 1pins_1macro

[Plot: ICC delay [ns] (vertical axis, 2 to 3.5) vs. DCT delay [ns] (horizontal axis,
2 to 4) for 1pins_0macro, with the perfect correlation line]

Figure 3 - Timing correlation ICC vs. DCT 1pins_0macro

[Plot: ICC delay [ns] (vertical axis, 2 to 3.5) vs. DCT delay [ns] (horizontal axis,
2 to 4) for 0pins_0macro, with the perfect correlation line]

Figure 4 - Timing correlation ICC vs. DCT 0pins_0macro

Figures 1-4 show clearly that experiment 4 (1pins_1macro_cong) has the best timing
correlation with the ICC timing reports: more of its timing paths lie close to the
perfect correlation line in the graph. In all graphs several paths are located further
to the left. Those are memory paths, for which DCT has shown less delay than ICC.
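
One way to turn these plots into a single number is to count how many correlated paths
have a DCT delay within 10% of the ICC delay, which is the accuracy goal stated in the
introduction. The sketch below is only an illustration: the delay pairs are made up and
are not data from the experiments, and the function names are ours.

```python
def within_tolerance(dct_ns, icc_ns, tol=0.10):
    """True if the DCT delay is within `tol` (relative) of the ICC delay."""
    return abs(dct_ns - icc_ns) <= tol * icc_ns


def correlation_ratio(pairs, tol=0.10):
    """Fraction of (dct_delay, icc_delay) pairs that meet the tolerance."""
    hits = sum(1 for dct_ns, icc_ns in pairs if within_tolerance(dct_ns, icc_ns, tol))
    return hits / len(pairs)


# Hypothetical (DCT delay, ICC delay) pairs in ns; the third mimics a memory
# path where DCT reports much less delay than ICC.
paths = [(3.0, 3.1), (2.5, 2.6), (2.2, 3.0), (3.4, 3.3)]
print(f"{correlation_ratio(paths):.0%} of paths within 10%")
```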

8.0 Runtime comparison

We noticed that as DCT is given more realistic data, its runtime increases while ICC
runtime decreases. Realistic data means running DCT with the real macros/rams placed in
the DEF and with the real pins placed from the top level. As expected, running DCT with
the –congestion option and with layer availability gives the highest DCT runtime.
Table 3 shows the runtimes of DCT and ICC for the experiments we ran. With the
–congestion option (experiment 4) the DCT runtime is the longest, but in return the
correlation with ICC is the best. Running only the 0pins_0macro case gives the fastest
runtime, but the correlation of timing paths is not good.

Table 3 - DCT/ICC Runtime Summary

                          0pins_0macro   1pins_0macro   1pins_1macro   1pins_1macro_cong
DCT_1st_compile           4:10           9:20           9:03           10:30
DCT_2nd_compile           1:20           3:05           4:20           3:45
ICC_place_opt             3:32           3:03           3:18           2:19
ICC_psynopt               1:02           1:00           1:04           0:52
ICC_place_opt -cong       4:19           4:05           5:15           4:57
ICC_psynopt -cong         0:44           0:42           0:51           0:53

Total: DCT+ICC            10:04          16:28          17:45          17:26
Total: DCT+ICC (w/cong)   10:33          17:12          19:29          20:05

Table 3 shows that DCT experiment 4 (1pins_1macro_cong) gives the smallest ICC runtime,
and that there is no need to run ICC with the –cong option in that case. ICC runtime
is up to about 30% lower than with DCT experiments 1-3.
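
The totals and the ICC runtime reduction can be checked from the Table 3 entries with a
few lines of arithmetic. The sketch assumes the entries are in h:mm (the paper does not
state the unit; the ratios do not depend on it) and that "ICC runtime" means place_opt
plus psynopt without the –cong option. Function and key names are ours.

```python
def to_minutes(hmm):
    """Convert an 'h:mm' string to minutes (unit assumed, see lead-in)."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hmm(total_minutes):
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


# Entries copied from Table 3 (runs without the ICC -cong option).
RUNTIMES = {
    "0pins_0macro":      {"dct1": "4:10",  "dct2": "1:20", "place_opt": "3:32", "psynopt": "1:02"},
    "1pins_0macro":      {"dct1": "9:20",  "dct2": "3:05", "place_opt": "3:03", "psynopt": "1:00"},
    "1pins_1macro":      {"dct1": "9:03",  "dct2": "4:20", "place_opt": "3:18", "psynopt": "1:04"},
    "1pins_1macro_cong": {"dct1": "10:30", "dct2": "3:45", "place_opt": "2:19", "psynopt": "0:52"},
}


def total_runtime(experiment):
    """DCT + ICC runtime of one experiment, in minutes."""
    return sum(to_minutes(value) for value in RUNTIMES[experiment].values())


def icc_runtime(experiment):
    """ICC place_opt + psynopt runtime of one experiment, in minutes."""
    run = RUNTIMES[experiment]
    return to_minutes(run["place_opt"]) + to_minutes(run["psynopt"])


def icc_reduction_pct(experiment, baseline):
    """Percent reduction of ICC runtime relative to a baseline experiment."""
    return (1.0 - icc_runtime(experiment) / icc_runtime(baseline)) * 100.0


for name in RUNTIMES:
    print(name, to_hmm(total_runtime(name)),
          f"{icc_reduction_pct('1pins_1macro_cong', name):.1f}%")
```

Under these assumptions the totals match the "Total: DCT+ICC" row, and the ICC runtime
of experiment 4 is roughly 21% to 30% lower than that of experiments 1-3.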

9.0 Routing and Congestion prediction

The B_MACRO case had to be reduced in size in order to exercise the congestion
capability of DCT. We reduced the size in several attempts, creating at least 5 test
cases (TC1-TC5), to reach the smallest area possible. Congestion was measured using a
new feature of DCT (report_congestion), as opposed to the ICC congestion report. The
results emphasize the congestion removal capability of DCT, which lets ICC meet
congestion and timing faster and more predictably. Table 4 shows the congestion
comparison for experiment 3 (1pins_1macro) and experiment 4 (1pins_1macro_cong). Both
experiments have pins and rams/macros assigned in the floorplan DEF; they differ only
in whether DCT runs with or without –congestion. As Table 4 shows, the routing overflow
drops from 3892 to 564 (about 85%) when the –congestion option of compile_ultra is used.

Table 4 - ICC Routing Congestion Comparison

ICC Run                           DCT Run              Routing Overflow
ICC w/ cong (place_opt -cong)     1pins_1macro         3892
ICC w/ cong (place_opt -cong)     1pins_1macro_cong    564

Congestion maps of DCT vs. ICC are shown in Figures 5-12. Notice that in Figures 5 and
7, where the macros/rams were placed randomly by DCT, the macro placement is not
organized and not optimal. In Figure 5 (0pins_0macro experiment), the pins placed by
DCT do not resemble the real pin placement, which causes further miscorrelation of
timing and congestion. For experiments 1-2, depicted in Figures 5 and 7, DCT did not
see much congestion and took less runtime to meet timing. In Figures 9 and 11, where
the macro/ram placement came from the backend floorplan (real placement), DCT took more
runtime and created congestion. Figure 11 clearly has less congestion, since DCT was
run with the new compile_ultra –congestion option.

The most interesting comparison is between the ICC and DCT congestion maps. Figures 6
and 8 show the ICC congestion maps, which exhibit the worst congestion, since the DCT
runs were missing information about pins and real macro/ram placement. Figures 5 and 7,
in contrast, show no congestion in DCT.

Figures 10 and 12 show the ICC congestion maps (running place_opt –cong) that
correspond to experiments 3 and 4. Notice the good correlation of ICC with the DCT maps
in Figures 9 and 11 (DCT experiments 3 and 4). Note also that the ICC run based on
experiment 4 shows less congestion than the ICC run based on experiment 3. Experiment 4
(see Table 1 above) is based on the 1pins_1macro_cong DCT run, which uses compile_ultra
with the –congestion option.

Figure 5 - DCT congestion map with 0pins_0macro experiment

Figure 6 - ICC congestion map with 0pins_0macro experiment

Figure 7 - DCT congestion map with 1pins_0macro experiment

Figure 8 - ICC Congestion map with 1pins_0macro experiment

Figure 9 - DCT congestion map with 1pins_1macro experiment

Figure 10 - ICC Congestion map with 1pins_1macro experiment (experiment 3)

Figure 11 - DCT congestion map with 1pins_1macro_cong experiment

Figure 12 - ICC Congestion map with 1pins_1macro_cong experiment (experiment 4)

10.0 FFP Flow Stage 1 Initial Results

We collected results from several units/macros that FE designers ran. The FE designers
used the FFP flow without pins/rams placed, but with the initial default DEF, as
explained in Figure 2. Table 5 shows the results with respect to the FE target
frequency. For correlation with BE, extra margin is taken by using 90% of the cycle
time.

Table 5 - FFP Flow Stage1 RTL fix results

Block             FE Initial Freq   FE Final Freq   FE Target Freq   # of RTL fix iterations   Task Duration [days]
macro_bm_a        279               377             360              3                         9
macro_xaa         435               610             540              2                         3
macro_xcrc        667               667             625              0                         1
macro_ccnt_top    909               1111            1000             2                         2
macro_tx_q_c      279               389             360              4                         11
macro_he_p_d      302               427             360              2                         10
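
The frequency gain per block and the margin over the FE target follow directly from
Table 5. The sketch below computes both; the frequency unit is not stated in the table
and does not affect the percentages, and the function names are ours.

```python
# (initial, final, target) frequencies copied from Table 5.
BLOCKS = {
    "macro_bm_a":     (279, 377, 360),
    "macro_xaa":      (435, 610, 540),
    "macro_xcrc":     (667, 667, 625),
    "macro_ccnt_top": (909, 1111, 1000),
    "macro_tx_q_c":   (279, 389, 360),
    "macro_he_p_d":   (302, 427, 360),
}


def gain_pct(initial, final):
    """Frequency gain from the initial to the final RTL, in percent."""
    return (final - initial) / initial * 100.0


def margin_pct(final, target):
    """Margin of the final frequency over the FE target, in percent."""
    return (final - target) / target * 100.0


for block, (initial, final, target) in BLOCKS.items():
    print(f"{block:16s} gain {gain_pct(initial, final):5.1f}%  "
          f"margin {margin_pct(final, target):5.1f}%")
```

In every row the final frequency is at or above the FE target; macro_xcrc met its
target without any RTL fix.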

11.0 Conclusions

We conclude that DCT, used as described in this paper, can better predict area and
performance at the early stages of RTL design. We have shown that an optimized set of
parameters (R and C scaling, metal layer availability, and over-constraining of the
period or cycle time) gives better correlation between DCT and the ICC backend tool.

We have also shown that FFP (Fast Flow for Predicting frequency and area) is of great
benefit to our RTL (FE) designers. Our FE designers have been building RTL code and
checking it with the FFP synthesis flow, which gives quick feedback on the frequency and
area of their design. They predict their RTL frequency and perform fast iterations of
FFP synthesis, which lets them fix the RTL to meet their targets. With DCT inside FFP,
the backend (BE) design group finds fewer problems and receives more robust RTL code,
which helps the BE design effort converge.

On the congestion and routing side, we have shown that the latest DCT feature,
–congestion, gives even better correlation to ICC. It does increase DCT runtime, but
ICC runtime decreases in turn, and the correlation of timing and congestion improves
greatly. Synopsys is working on more features that will improve the correlation to the
backend tools downstream in the flow. Those features include real power grid
identification, p_net options (as in ICC), and better placement of macros/rams in the
floorplan. One future option at Marvell is a novel method to place the rams/macros
inside the floorplan, using a special “bounding” method that bounds rams based on their
original logic hierarchy. Those bounded memories will be grouped in families and placed
closer to the boundaries of the block floorplan.

12.0 Acknowledgements

The author would like to thank Sharon Avital and Shaul Ben-Dor of Synopsys, for their
technical support with running, gathering, and interpreting the data for the set of
experiments.

13.0 References

1. Design Compiler Reference Manual, Version Z-2007.12
2. Evaluating the Benefit of DC Topographical to the Entire Design Flow – SNUG
2008, Kearting et. Germia
