Application Note
Table of Contents
Purpose
Audience
Overview
1. Introduction
Purpose
This document explains the complete workflow for clock tree synthesis (CTS).
Audience
This document is meant for users and designers who perform CTS with the Encounter Digital
Implementation (EDI) System, versions 9.1 and 10.1.
Overview
This document explains the basic concepts of clock tree synthesis and the associated
workflow.
1. Introduction
Clock tree synthesis is performed to meet clock timing constraints, such as clock skew,
latency (insertion delay), and the transition time.
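To make the terms concrete, here is a minimal sketch in Python (not an EDI command) of how skew and insertion delay relate to per-sink clock arrival times. The function name, sink names, and latency values are hypothetical and chosen only for illustration.

```python
def clock_tree_metrics(sink_latencies_ns):
    """Return (max insertion delay, min insertion delay, skew) in ns.

    sink_latencies_ns maps a sink pin name to the clock arrival time
    at that pin, measured from the clock root.
    """
    if not sink_latencies_ns:
        raise ValueError("at least one sink latency is required")
    latest = max(sink_latencies_ns.values())
    earliest = min(sink_latencies_ns.values())
    # Skew is the spread between the latest and earliest sink arrival.
    return latest, earliest, latest - earliest


# Hypothetical sink latencies for one clock tree.
latencies = {"ff_a/CK": 1.20, "ff_b/CK": 1.05, "ff_c/CK": 1.32}
max_latency, min_latency, skew = clock_tree_metrics(latencies)
print(f"insertion delay: {min_latency}..{max_latency} ns, skew: {skew:.2f} ns")
```

A tree that meets its skew target can still miss its latency target, which is why the two are constrained separately.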
General issues caused by improper CTS:
Routing congestion
Sudden rise in standard cell density
Timing closure issues
A good CTS will yield:
Reasonable density change
A well-controlled CTS structure, which in turn yields the best insertion delay, skew, and clock transitions
Early timing closure
Less susceptibility to crosstalk
set_clock_transition
set_clock_latency
set_clock_latency
SrcLatency value in ns
set_clock_uncertainty
create_generated_clock :
If create_clock has multiple ports, the resulting clocks are defined as a clock group (clkGroup).
-update
{AutoCTSRootPin
clkname
Buffer
In addition, there are other useful (design-dependent) constraints, as shown below, which
can be part of the constraints applied to the clock root pin.
To mark the pin as leaf pin:
LeafPin
+ <pinname1>
+ <pinname2>
CTS treats the pins as sinks, stops tracing further, and balances clock skew.
CTS does not insert buffers on nets that lie between the start pin and the end pin, and treats
those nets as having the DontTouchNet attribute.
CTS does not add new ports to the specified logical modules at their given hierarchical level.
During clock tree synthesis, the order of the clocks defined in the clock specification file is
important and synthesis depends on this. The clock that is defined first in the clock
specification file will be built first, irrespective of its clock frequency. This is also true for
clock routing. So, the clock defined first will be routed first, and so on.
The recommendation, therefore, is to list the faster clock first in your specification file so
that it is not stopped by re-convergence and gets the maximum space for routing.
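The ordering recommendation can be sketched in Python as a small pre-processing step that decides the order in which clocks are written to the specification file. The clock names and periods are hypothetical; the sketch assumes you already know each clock's period from your timing constraints.

```python
def order_clocks_fastest_first(clock_periods_ns):
    """Return clock names sorted from fastest (smallest period) to slowest."""
    return sorted(clock_periods_ns, key=lambda name: clock_periods_ns[name])


# Hypothetical clocks and their periods in ns.
clock_periods = {"clk_slow": 20.0, "clk_fast": 2.5, "clk_mid": 10.0}
spec_order = order_clocks_fastest_first(clock_periods)
print(spec_order)
```

Writing the clock definitions in this order means the fastest clock is built and routed first.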
Cell/port delay specification, in which all instances of a cell have the same pin
delay:
MacroModel port cellName/portName maxRiseDelay minRiseDelay maxFallDelay minFallDelay inputCap
e.g. MacroModel port spram288x65/clk 1e-8s .8e-8s 1.1e-8s .7e-8s 22e-12
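As a way to sanity-check such entries before handing a specification file to the tool, the field order above can be read with a short Python sketch. This is an illustrative helper, not part of EDI: the class and function names are hypothetical, and it assumes the only unit suffix is a trailing "s" as in the example.

```python
from typing import NamedTuple


class MacroModelPort(NamedTuple):
    cell: str
    port: str
    max_rise: float
    min_rise: float
    max_fall: float
    min_fall: float
    input_cap: float


def _to_number(token):
    # Assumption: values are plain numbers, optionally followed by "s".
    return float(token[:-1] if token.endswith("s") else token)


def parse_macro_model_port(line):
    """Parse one 'MacroModel port cell/port ...' statement given on a single line."""
    fields = line.split()
    if len(fields) != 8 or fields[:2] != ["MacroModel", "port"]:
        raise ValueError(f"not a MacroModel port statement: {line!r}")
    cell, sep, port = fields[2].partition("/")
    if not sep or not cell or not port:
        raise ValueError(f"expected cellName/portName, got {fields[2]!r}")
    return MacroModelPort(cell, port, *(_to_number(t) for t in fields[3:]))


entry = parse_macro_model_port(
    "MacroModel port spram288x65/clk 1e-8s .8e-8s 1.1e-8s .7e-8s 22e-12"
)
# A basic consistency check: each max delay should not be below its min delay.
delays_consistent = entry.max_rise >= entry.min_rise and entry.max_fall >= entry.min_fall
```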
minRiseDelay
refInstPinName
pin
targetInstPinName
Here, you can use the dynamic macro model to balance the skew between the two flops.
If you specify the clock pin of Flop B as the reference pin and the clock pin of Flop A as the
target pin, the clock pin of Flop A is balanced with the clock pin of Flop B. The
DynamicMacroModel statement minimizes the skew between these two flops to avoid timing
violations on the data path.
This happens because, with the dynamic macro model, the clock pin of Flop A is balanced with the
group of flops and not with the clock pin of Flop B, due to the ThroughPin that has been
defined on Flop B.
COPYRIGHT 2013, CADENCE DESIGN SYSTEMS, INC.
ALL RIGHTS RESERVED.
If you want to use a non-default rule or shielding for a particular clock, define a
RouteTypeName along with its rules in the constraint file. It can then be referenced
through RouteType in that clock's definition.
If there are routability issues and you want to change the properties of a particular clock
net, even one that already had properties set during CTS, use the setAttribute
command with the -net option and the -preferred_extra_space or -non_default_rule
option to attach attributes to the desired nets.
Another way to route the clock nets is with routed guides. When you run the
routeClockNetWithGuide command, CTS builds a brand new routing guide. This
routing guide is based on the Steiner estimation for the clock trees in the design that was
loaded. The flow for using the routed guide is as follows:
restoreDesign
specifyClockTree -clkfile xxx.cts
routeClockNetWithGuide
If you want to route a specific clock with a specific rule, you can do so using attribute
settings. The clock routing features available include specified widths, shielding, and extra
spacing.
Specified Width: Non-default rules can be used to route the clock net using a wider width
wire. First, define the rule in the LEF using the NONDEFAULTRULE syntax. Then use
setAttribute to assign the rule to clock nets:
setAttribute -net @clock -non_default_rule wide_wire_rule1
Shielding: Use the -shield_net attribute to specify the net(s) to use for shielding.
setAttribute -net @clock -shield_net {VDD VSS}
Extra Spacing: Specify the extra spacing using the attribute preferred_extra_space.
The value specified is from 0 to 3 routing tracks. NanoRoute will do its best to achieve the
specified spacing but will reduce the spacing to avoid violations.
An example flow of specifying clock routing attributes and routing the clock nets is below:
setAttribute -net @clock -non_default_rule wide_wire_rule1
setAttribute -net @clock -shield_net {VSS}
selectNet clock
setNanoRouteMode -routeSelectedNetOnly true
routeDesign
to improve skew, the ckECO command does not worsen maximum transition or maximum
capacitance violations significantly.
The ckECO command performs resizing and buffer insertion or dummy buffer insertion to
improve skew. In addition, the ckECO command might move gating cells when the ckECO
command runs refinePlace.
The ckECO command also supports local skew optimization using the localSkew
parameter. Local skew optimization considers the skew between adjacent flip-flops that have
a data path connection (from the Q-pin of one flip-flop to the D-pin of another flip-flop).
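The difference between global and local skew can be illustrated with a short Python sketch. It is a conceptual model only and says nothing about how ckECO computes skew internally; the flop names, latencies, and the list of launch/capture pairs are hypothetical.

```python
def global_skew(latency_ns):
    """Spread between the latest and earliest clock arrival over all sinks."""
    return max(latency_ns.values()) - min(latency_ns.values())


def local_skew(latency_ns, data_paths):
    """Largest arrival difference between flops joined by a Q-to-D data path.

    data_paths is a list of (launch_flop, capture_flop) pairs.
    """
    if not data_paths:
        return 0.0
    return max(abs(latency_ns[launch] - latency_ns[capture])
               for launch, capture in data_paths)


# Hypothetical clock arrival times in ns.
latency = {"ff1": 1.00, "ff2": 1.05, "ff3": 1.40}
# Only ff1 -> ff2 share a data path; ff3 is not connected to either.
paths = [("ff1", "ff2")]

g_skew = global_skew(latency)
l_skew = local_skew(latency, paths)
print(f"global skew {g_skew:.2f} ns, local skew {l_skew:.2f} ns")
```

In this example the large global skew is caused by ff3, which has no data path to the other flops, so the local skew that actually affects timing is much smaller.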
The options to control the behavior of the ckECO are listed below.
-preRoute: Used when there is no license to run NanoRoute, or when the flow is to build
the clock tree, optimize the clock tree, and then call another router to route the clock nets.
-clkRouteOnly: To use immediately after the clock tree is routed.
-postRoute: To use after all signal nets are routed.
If you are using the ckSynthesis command, you can use the option
forceReconvergent. Use this option if the physical partition has muxed
clocks and CTS is expected to build a clock tree for every clock root of the muxed clock. The
option allows CTS to trace through the muxed clocks and generate a balanced
tree starting from all the clock root branches of the muxed clock.
CTS can support crossover clocks, but the subtree after the crossover pin must have the same
conditions defined in both tree specifications. For example, if a subtree is marked with
ExcludedPin, LeafPin, or PreservePin in one tree, it must be marked the same way in the other
tree as well.
11.
Cloning distributes the clock gating components and their gated loads, depending on the
parameters you specify. This can be used to optimize the amount and placement of the Gated
Clock cells based on the placement of the design. Gated clock cells are typically inserted into
the netlist during synthesis, but the netlist may not be placement aware. Optimizing the gated
clocks cells after placement can improve the placement and improve the design performance.
Cloning does not fix design rule violations on the data path, so if cloning a large number of
cells creates a high-fanout net for the enable signal, you need to run IPO to fix
it. Clock cloning can add a clock gating cell to the netlist (ckSynthesis does not add gating
cells). The command used to do this is ckCloneGate.
Decloning merges identical clock gating components that have the same inputs, depending on
the parameters you specify. This step can be run prior to ckCloneGate to give
ckCloneGate a better starting point. To achieve the highest level of decloning, use the
options -ignoreDontTouch and -ignorePreplaced. To check the decloning that
ckDecloneGate will do prior to committing it, run "ckDecloneGate -check file filename"
to output a report on the proposed changes:
The above flow generates the CTS specification files and synthesizes the clock trees in
separate steps. This is most common because sometimes it is required to modify the clock tree
specification files that are automatically generated. If editing is not required, you can use
the following flow:
createClockTreeSpec view1.spec
specifyClockTree view1.spec
ckSynthesis
saveClockNets view1.DontTouchNets
cleanupSpecifyClockTree
createClockTreeSpec view2.spec
specifyClockTree view2.spec + view1.DontTouchNets
ckSynthesis
cleanupSpecifyClockTree
timeDesign
Split the larger clock domains into smaller, more manageable domains and build trees for
each separately. Because many timing paths cross between them, define all the downstream
trees in a clkGroup so that they are balanced.
Remove all through points on the divider registers that are the source points for generated
clocks, which helps reduce the insertion delay by half.
Investigate the clock tree to check the depth of muxing and whether any HVT library (slow
but low leakage) is being used, since both affect the insertion delay. One solution is to
use mixed-VT libraries.
Switch off set_dont_touch and set_dont_use on the clock gating cells to
allow CTS to upsize these cells. For example:
set_dont_use [get_lib_cell <clock gating libcell>] false
Consider setting maxDelay to a large value in the clock tree constraints. This reduces the
effort the tool spends on meeting the delay target, so that it focuses on skew, slew, or
minimal added cells.
Cell padding reserves more space around flip-flops (FFs). This should help with both
clock tree buffer insertion and the addition of decoupling capacitor cells in key areas.
If you run scan reordering, further reordering can be carried out after the clock trees are
inserted to reduce hold violations caused by clock tree insertion. This should help with
post-CTS hold violations, but may have little effect on post-CTS routing congestion.
After the clock trees with lower cell count have been created, the command ckECO can
be used on high buffer-cell count trees, which may result in further improvement in the
skew.
ckECO -clk <clk root name> -postRoute -useSpecFileCellsOnly
During optimization, ckECO performs resizing and may be allowed to use any cell
that matches the footprint of the existing cell, regardless of whether it is in the buffer list.
As a result, it may also swap out a cell that is specified in the spec file. To limit which
cells are used for resizing, specify setDontUse cellName true on
the cells that are not to be used.
[Figure: two scenarios. 1st scenario labels: Black Box, Clock Gate, To FF. 2nd scenario labels: Clock, Clock Gate, To FF.]
There is no issue when CTS traces an INOUT pin as an INPUT. When CTS must
treat the pin as an OUTPUT, as in the scenarios shown above, use the
following setting:
setCTSMode -traceCellInOutPinAsOutPin true