
Physical Design Flow

STEP 1:
Load the 4 main inputs:

Netlist [gate-level netlist]
LEF [technology LEF, macro LEF, std cell LEF]
Libs [logical libraries or timing libraries]
SDCs [design constraints]

Technology LEF contains:
Number of metal layers,
Direction of each metal layer (H/V),
Resistance and capacitance per unit square,
Width, spacing and pitch of all metal layers,
Area,
Thickness,
Via information [double cut and single cut],
A subset of DRC rules.

Macro LEF: All macro information, e.g. dimensions and coordinates, and macro pin information.

Std cell LEF: All physical dimensions of the std cells, plus the geometry of their pins (e.g.
inputs a and b and the output).

Libs: NLDMs
All timing information of the standard cells, e.g. nand, and, or, flip-flop:
Cell delay as a function of input transition and output load.
Timing sense, i.e. positive or negative unate.
Setup and hold constraints of the sequential cells: setup and hold times as a function of data
transition and clock transition.
Recovery and removal checks.
Power information: internal and external power as a function of input transition and output
capacitance. Cell leakage information.
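
To make the "delay for a given input transition and output load" idea concrete, here is a minimal sketch of an NLDM-style table lookup using bilinear interpolation. The index values, the delay table and the function name nldm_lookup are made up for illustration; real libraries and tools may use different index points and interpolation details.

```python
from bisect import bisect_right


def nldm_lookup(slews, loads, table, slew, load):
    """Bilinearly interpolate a delay from a 2-D NLDM-style table.

    slews : sorted input-transition index values (ns)
    loads : sorted output-load index values (pF)
    table : table[i][j] is the delay (ns) at slews[i], loads[j]
    """
    def bracket(axis, x):
        # Pick the two neighbouring index points around x. Clamping to the
        # first/last segment means out-of-range values are extrapolated.
        hi = min(max(bisect_right(axis, x), 1), len(axis) - 1)
        lo = hi - 1
        frac = (x - axis[lo]) / (axis[hi] - axis[lo])
        return lo, hi, frac

    i0, i1, fs = bracket(slews, slew)
    j0, j1, fl = bracket(loads, load)

    # Interpolate along the load axis first, then along the slew axis.
    d_lo = table[i0][j0] + fl * (table[i0][j1] - table[i0][j0])
    d_hi = table[i1][j0] + fl * (table[i1][j1] - table[i1][j0])
    return d_lo + fs * (d_hi - d_lo)


# Hypothetical characterization data for one timing arc of a NAND2 cell.
SLEWS = [0.01, 0.05, 0.20]      # input transition, ns
LOADS = [0.001, 0.010, 0.050]   # output load, pF
DELAY = [
    [0.020, 0.045, 0.150],
    [0.030, 0.055, 0.160],
    [0.060, 0.085, 0.190],
]

delay = nldm_lookup(SLEWS, LOADS, DELAY, slew=0.03, load=0.0055)
print(f"interpolated cell delay: {delay:.4f} ns")
```

The same lookup shape applies to the setup and hold tables, with data transition and clock transition as the two indices.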

SDCs: Design constraints like create_clock, set_clock_latency, set_clock_uncertainty,
set_multicycle_path, set_input_delay, set_output_delay, set_max_transition,
set_max_capacitance.

STEP 2:
Performing sanity checks

Sanity checks include:

1) check_timing -verbose [-all] to report all warnings and to check whether all flops are getting
a clock and whether the design has any combinational loops.
2) checkDesign to check the design, like how many
3) checkNetlist
4) report_constraints -all_violators to check the total number of failing paths. If many paths
are failing, taking them into the design and optimizing there will increase utilization, hence
they should be cleaned at the synthesis stage itself.
5) report_analysis_coverage
6) checkPinAssignment

How much -ve slack can you take into the design?

It depends upon the WLMs. If the synthesis team has used best-case RC trees in the WLMs, then it
is better to have positive or non-negative slack, because the best-case RC WLMs assume
that there is no wire resistance in the path and see only pin capacitance, which in reality can
be over-optimistic. If synthesis is done on worst-case RC trees, then a design with -ve slack of
about -30 to -50 ps can be taken in and optimized, because the worst-case tree is the more
pessimistic approach.

STEP 3:
Floor planning: talk about giving a large area to the std cells and placing memories at the
boundaries, and how this can cause IR drop.
Congestion issues and how you solved them.

STEP 4:
Power planning:
The main goal of power planning is to meet the given IR drop limit, expressed as a percentage
such as 2%, 1% or 0.9% of the supply voltage.
Initially we design the power structure to meet half the target, i.e. if the target is 2% we
build the power structure to meet 1%, which is more pessimistic. Optimization adds more cells,
so this margin is what lets the 2% target still be met once routing is done.
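
The "build for half the target" rule can be sketched as a small check. This assumes the IR drop limit is expressed as a percentage of the supply voltage; the function names, the 1.0 V supply and the drop values are illustrative assumptions, not numbers from a real design.

```python
def ir_drop_percent(vdd, worst_drop):
    """Worst-case IR drop as a percentage of the supply voltage."""
    return 100.0 * worst_drop / vdd


def meets_early_budget(vdd, worst_drop, signoff_target_pct, early_fraction=0.5):
    """Check a drop against the tightened early-stage budget.

    early_fraction=0.5 mirrors the rule of building the power structure
    for half of the signoff target.
    """
    early_budget_pct = signoff_target_pct * early_fraction
    return ir_drop_percent(vdd, worst_drop) <= early_budget_pct


VDD = 1.0              # volts (assumed)
SIGNOFF_TARGET = 2.0   # percent of VDD (assumed)

for drop_v in (0.008, 0.015):   # worst-case drops in volts (assumed)
    pct = ir_drop_percent(VDD, drop_v)
    ok = meets_early_budget(VDD, drop_v, SIGNOFF_TARGET)
    print(f"drop {pct:.2f}% of VDD, within early 1% budget: {ok}")
```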

Three types of power: static, leakage and dynamic.

Leakage power is the power drawn when the cells are not switching but are still powered. In
smaller technologies it comes from sub-threshold leakage from drain to source and from gate
oxide leakage due to gate oxide tunneling.
Dynamic power is the power dissipated when the cells are switching.
There are 2 types of dynamic power: internal power and external switching power.
Internal power is dissipated in charging internal capacitance and by the short-circuit current
[crowbar current].

External switching power is the power dissipated in charging and discharging output loads.
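
As a rough illustration of the split above, the sketch below adds up the components for one net. It assumes the commonly used switching-power expression 0.5 x C x V^2 x toggle rate, with the toggle rate written as toggles per clock cycle times clock frequency; the function names and all numeric values are assumptions.

```python
def switching_power(toggles_per_cycle, load_cap_f, vdd, freq_hz):
    """External switching power in watts: 0.5 * C * V^2 * toggle rate."""
    toggle_rate = toggles_per_cycle * freq_hz   # transitions per second
    return 0.5 * load_cap_f * vdd ** 2 * toggle_rate


def total_power(internal_w, switching_w, leakage_w):
    """Dynamic power (internal + switching) plus leakage power."""
    return internal_w + switching_w + leakage_w


# Assumed values: 10 fF load, 1.0 V supply, 1 GHz clock, 0.2 toggles per cycle.
p_switch = switching_power(0.2, 10e-15, 1.0, 1e9)
p_total = total_power(internal_w=0.5e-6, switching_w=p_switch, leakage_w=0.05e-6)
print(f"switching: {p_switch * 1e6:.2f} uW, total: {p_total * 1e6:.2f} uW")
```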

Different areas of the chip function at different frequencies: one area may be slow, another may
be fast. If we design the power structure for dynamic power, then after CTS more buffers will be
added and switching may vary at each stage of optimization, so we would need to create a
different power structure at each stage, which is incorrect.
So we take the average of all switching factors [static power is average switching power], build
a power structure for it, and then fine-tune it if there is IR drop.

STEP 5:
Placement:
Timing optimization: the uncertainty includes skew + jitter + delay due to OCV derates.
Why do we need to give uncertainty?
We need to model the post-CTS effects at the pre-CTS stage itself, because once the clock is
built, post-CTS optimization will not touch flops and clocks. We can only play with cells to
solve timing, or use useful skew. If we add pessimism at pre-CTS to model the post-CTS effects,
the tool will optimize and refine the placement so that it already allows for those later effects.
Why solve only setup at pre-CTS?
At the pre-CTS stage the clock is ideal, not propagated. If you look at the internal circuit of
a flop, the clock-to-q delay plus the propagation delay cannot be less than the hold time [buffer
delay]. Hence there can be no real hold violations at the pre-CTS stage, but we still see some
violations because of the uncertainty values that we have given.
We do not yet know whether the uncertainties we have given are correct. If we clean hold at this
stage with improper uncertainties, the tool will insert a lot of unnecessary buffers, which
increases area, utilization, leakage and congestion.
Once the clock is propagated, after placement, we can solve hold with reduced uncertainties.
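
The effect of uncertainty on an ideal-clock path can be sketched as follows. With an ideal clock the launch and capture latencies are equal and cancel, so only the data path and the uncertainty remain. All delay values and function names here are illustrative assumptions; real flows often use separate setup and hold uncertainty values.

```python
def clock_uncertainty(skew, jitter, ocv_margin):
    """Uncertainty budget: skew + jitter + margin for OCV derates (ns)."""
    return skew + jitter + ocv_margin


def setup_slack(period, clk_to_q, comb_delay, setup_time, uncertainty):
    """Setup slack with an ideal clock (launch and capture latency cancel)."""
    required = period - uncertainty - setup_time
    arrival = clk_to_q + comb_delay
    return required - arrival


def hold_slack(clk_to_q, comb_delay, hold_time, uncertainty):
    """Hold slack with an ideal clock."""
    return (clk_to_q + comb_delay) - (hold_time + uncertainty)


unc = clock_uncertainty(skew=0.10, jitter=0.05, ocv_margin=0.05)   # assumed, ns

# The same flop-to-flop path checked with and without the uncertainty.
hold_no_unc = hold_slack(clk_to_q=0.08, comb_delay=0.02, hold_time=0.03, uncertainty=0.0)
hold_with_unc = hold_slack(clk_to_q=0.08, comb_delay=0.02, hold_time=0.03, uncertainty=unc)
setup_with_unc = setup_slack(period=1.0, clk_to_q=0.08, comb_delay=0.60,
                             setup_time=0.05, uncertainty=unc)

print(f"hold slack, no uncertainty:   {hold_no_unc:+.2f} ns")
print(f"hold slack, with uncertainty: {hold_with_unc:+.2f} ns")
print(f"setup slack, with uncertainty: {setup_with_unc:+.2f} ns")
```

In this made-up case the hold check only fails once the uncertainty is applied, which is the kind of pre-CTS hold violation the text says should not be fixed with buffers yet.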

STEP 6:
Clock Tree Synthesis:
Once setup is clean we can move on to building the clock tree.
Inputs to CTS:

All required libraries and tech rule files imported
SDC
Floorplanning and power planning done
Standard cells placed and optimization done
Congestion analyzed and within limits
Placement database
Clock specification file:
Clock name
Clock period
Max latency
Target skew
Max trans, max cap
Buffers and inverters
NDRs
Macro model.
Build the clock tree with non-leaf nets on layers 6 and 7 and leaf nets on layers 3, 4 and 5.
NDRs are kept to the non-leaf nets, because using NDRs for the leaf nets as well can cause
routing congestion.

Which do you prefer, buffers or inverters, to build a clock tree?

It is better to use both buffers and inverters, but the number of inverters on a path should be
even. The delay of a buffer is higher, but if you use only inverters and there are max trans
violations, you again need buffers to fix them.

How do you say your CTS is good?

CTS is good when:
1) Latency is low.
2) Skew is minimal.
3) There are fewer buffers.
4) There are fewer levels.
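
The first two criteria can be computed directly from per-sink clock arrival times. This is a minimal sketch; the sink names and latency values are hypothetical, and "skew" here means global skew (largest minus smallest sink latency).

```python
def cts_metrics(sink_latencies):
    """Summarize a clock tree from a {sink_name: latency_ns} mapping."""
    values = list(sink_latencies.values())
    return {
        "max_latency": max(values),
        "min_latency": min(values),
        "global_skew": max(values) - min(values),
    }


# Hypothetical insertion delays (ns) reported for three clock sinks.
latencies = {"ff_a/CK": 0.42, "ff_b/CK": 0.45, "ff_c/CK": 0.40}
metrics = cts_metrics(latencies)
print(metrics)
```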

What happens if latency is high?

# The clock takes longer to reach the flops. Even though the skews are balanced, it takes longer,
so the chip will be slow. We need faster chips, so target minimum latency.
# The clock travels through different areas of the chip. Different areas have different OCVs,
which can impact the clock, and delays may vary across areas of the chip.
# Once we apply derates after CTS, if the network latency is high, the delay increases further
and we may see more timing violations.
# Suppose the latency is 2 ns; after derates it is 2 x 1.02 = 2.04. Likewise 4 x 1.02 = 4.08 and
8 x 1.02 = 8.16, so the delay added by the derate factor grows with latency and the arrival
time is delayed.
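
The arithmetic in the last point can be reproduced with a few lines. The 1.02 derate and the 2, 4 and 8 ns latencies come from the text; the function name is an illustrative choice.

```python
def derated_latency(latency_ns, derate=1.02):
    """Clock latency after applying a late derate factor."""
    return latency_ns * derate


extra_delay = {}
for latency in (2.0, 4.0, 8.0):
    derated = derated_latency(latency)
    # The delay added by the derate scales with the latency itself.
    extra_delay[latency] = derated - latency
    print(f"{latency:.0f} ns -> {derated:.2f} ns (+{extra_delay[latency]:.2f} ns)")
```

The added delay doubles each time the latency doubles, which is why a deeper clock tree loses more margin to the same derate.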

STEP 7:
POST CTS OPTIMIZATION:
Apply derates and enable CPPR to remove the common clock path pessimism. Without CPPR the
analysis is overly pessimistic. Optimize for both setup and hold.
1. Talk about derates, OCVs, CPPR.
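
A small numeric sketch of why CPPR matters for a setup check under OCV derates. The launch clock path is derated late and the capture clock path early, so the clock segment shared by both paths gets two different delays even though it is the same physical wire; CPPR gives that difference back. All delays, the 1.05/0.95 derates and the function names are assumptions for illustration.

```python
def cppr_credit(common_delay, late_derate, early_derate):
    """Pessimism on the shared clock segment that CPPR gives back."""
    return common_delay * (late_derate - early_derate)


def setup_slack_ocv(period, common, launch_branch, capture_branch,
                    data_delay, setup_time, late, early, use_cppr):
    """Setup slack with late-derated launch and early-derated capture paths."""
    launch_clock = (common + launch_branch) * late
    capture_clock = (common + capture_branch) * early
    arrival = launch_clock + data_delay * late
    required = period + capture_clock - setup_time
    slack = required - arrival
    if use_cppr:
        slack += cppr_credit(common, late, early)
    return slack


# Assumed path, all delays in ns.
path = dict(period=1.0, common=1.0, launch_branch=0.2, capture_branch=0.2,
            data_delay=0.85, setup_time=0.05, late=1.05, early=0.95)

slack_without = setup_slack_ocv(**path, use_cppr=False)
slack_with = setup_slack_ocv(**path, use_cppr=True)
print(f"slack without CPPR: {slack_without:+.4f} ns")
print(f"slack with CPPR:    {slack_with:+.4f} ns")
```

With these made-up numbers the path fails without CPPR and passes with it, purely because of the pessimism on the common segment.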

STEP 8:
ROUTING:
Route the design.
After routing you will see that the setup violations have increased. That is because up to this
point the routing was trial route and the delays were estimated from it. Once actual routing is
done, we get the actual RCs.

There will be few or no hold violations, because the delays are larger, which is a plus for
hold margins.

Signoff Flow

1) Routing.
2) Extract SPEF, cmd<extractRC>, with captable. Coupling capacitance will be included with
ground cap.
3) Setup/hold optimization.
4) Extract RC with coupling capacitance, cmd<extractRC -coupling true>.
5) Setup and hold optimization with SI, cmd<optDesign -postRoute -si>.
6) Dump [netlist, LEF, DEF] with captable. Give it to StartXtract and it will dump a SPEF with
which STA signoff is done.

USEFUL SKEW

Case 1
To solve the setup time of FF2:
You need to check the hold margin of FF2, then the setup margin of FF3.

[Figure: flop pairs FF1, FF2 and FF2, FF3]

Case 2
To solve the hold time of FF2:
You need to check the setup margin of FF2 and the hold margin of FF0.

[Figure: flop pairs FF1, FF2 and FF0, FF1]
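
One way to read the two cases: the clock shift used to fix a violation cannot exceed the smallest of the neighbouring margins that have to be checked. The sketch below encodes that reading; the function name and all margin values are hypothetical.

```python
def useful_skew_shift(violation, *margins):
    """Return (clock shift to apply, violation left over), in ns.

    violation : size of the setup or hold violation to fix (positive)
    margins   : the neighbouring margins that the shift will eat into
    """
    budget = max(0.0, min(margins))   # cannot borrow more than the tightest margin
    shift = min(violation, budget)
    return shift, violation - shift


# Case 1: setup violation at FF2, limited by the hold margin of FF2
# and the setup margin of FF3 (assumed values).
case1 = useful_skew_shift(0.08, 0.10, 0.05)

# Case 2: hold violation at FF2, limited by the setup margin of FF2
# and the hold margin of FF0 (assumed values).
case2 = useful_skew_shift(0.04, 0.20, 0.06)

print("case 1 (shift, remaining):", case1)
print("case 2 (shift, remaining):", case2)
```

In the first made-up case only part of the violation can be recovered with skew, so the rest has to be fixed in the data path.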