
Cadence Design Systems, Inc.

Clock Tree Synthesis (CTS) Flow

Table of Contents
1. INTRODUCTION
2. GENERATING THE CLOCK TREE SPECIFICATION FILE
3. HOW TO CHOOSE BUFFERS FOR CTS
4. UNDERSTANDING OF SPECIFICATION FILE
5. CREATING MACRO MODELS TO HANDLE HARD MACROS
6. CREATE DYNAMIC MACRO MODELS TO HANDLE CLOCK DIVIDERS
7. SYNTHESIZING THE CLOCK TREE
8. ROUTING THE CLOCK TREE
9. OPTIMIZATION OF CLOCK TREE
10. TRACING AND ANALYSIS OF CLOCK TREE
11. CTS WITH MULTI-MODE MULTI-CORNER (MMMC) FLOW
12. GUIDELINES AND ISSUES
13. DOCUMENTATION

1. Introduction
Clock tree synthesis is performed to meet the clock timing constraints, such as clock
skew, latency (insertion delay) and the transition time.
The purpose of this document is to explain the basic CTS flow and provide references to
solution articles for addressing specific design issues and challenges.

2. Generating the clock tree specification file


The first step is to create the clock tree constraints in a specification file. This file defines
the minimum and maximum delay for the tree, maximum skew and other options to
control how the tree is to be built.
To generate the clock tree specification file automatically from the SDC constraints, use
the following command:
encounter> clockDesign -genSpecOnly <fileName>
The clockDesign command is a super command that can be used to generate the
constraints file, delete existing clock trees, and build clock trees. createClockTreeSpec is
typically used in conjunction with the standalone commands deleteClockTree and
ckSynthesis. Either method will generate the same constraints file.

Mapping of SDCs to CTS constraints


SDC constraint          : CTS constraint
create_clock            : AutoCTSRootPin in the CTS constraints file
set_clock_transition    : SinkLeafTran/BufMaxTran (Default: 400 ps)
set_clock_latency       : MaxDelay (Default: clock period)
                          MinDelay (Default: 0)
set_clock_latency       : SrcLatency value in ns
set_clock_uncertainty   : MaxSkew (Default: 300 ps)
create_generated_clock  : Adds the necessary ThroughPin statement
If a create_clock lists multiple ports, the resulting clocks are defined in a clock group
(clkGroup).
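
The mapping above can be read as a simple lookup from SDC command to CTS keyword. The Python sketch below is only an illustration of that lookup, not part of Encounter: the names SDC_TO_CTS and cts_keywords_for are hypothetical, and the keyword spellings are copied from the table as printed (the sample specification file in section 4 spells the sink transition keyword SinkMaxTran).

```python
# Hypothetical helper: list the CTS spec keywords that the commands found in an
# SDC file would feed, according to the mapping table in this section.
SDC_TO_CTS = {
    "create_clock": ["AutoCTSRootPin"],
    "set_clock_transition": ["SinkLeafTran", "BufMaxTran"],
    "set_clock_latency": ["MaxDelay", "MinDelay", "SrcLatency"],
    "set_clock_uncertainty": ["MaxSkew"],
    "create_generated_clock": ["ThroughPin"],
}


def cts_keywords_for(sdc_text):
    """Return the CTS keywords affected by an SDC snippet, in first-seen order."""
    found = []
    for line in sdc_text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0].startswith("#"):
            continue  # skip blank lines and SDC comments
        for keyword in SDC_TO_CTS.get(tokens[0], []):
            if keyword not in found:
                found.append(keyword)
    return found


if __name__ == "__main__":
    example = (
        "create_clock -name clk -period 10 [get_ports clk]\n"
        "set_clock_uncertainty 0.3 [get_clocks clk]\n"
    )
    print(cts_keywords_for(example))
```

A check like this only shows which constraints are missing from the SDC; the generated specification file still needs to be reviewed by hand.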

Related Solution:
Automated flow to create CTS constraints file from SDCs in Encounter

3. How to choose buffers for CTS


The libraries contain a range of clock net buffers and inverters that are designed to have
nearly matching rise and fall signal behavior. Such behavior helps the generation of
balanced clock circuitry. The cells also have a much finer step in drive strengths
compared to regular buffers and inverters. Additionally, the clock net buffers are
designed such that the input capacitance of each drive strength version is nearly
identical. Cells in a clock circuit can therefore be exchanged to tune the drive
strength without changing the load on the net connected to the cell input and
without affecting the overall clock tree performance.
clockDesign is unable to automatically determine the buffers to use. The user should
specify the buffers and inverters to use by specifying them in the addCTSCellList
command. For example:
addCTSCellList BUFX8 BUFX12 BUFX16 INVX8 INVX12 INVX16

Related Solution:
How to pass the buffer/inverter list into the CTS specification file using variable.
ERROR (SOCCK-2069): cannot find CTS cells. Please use addCTSCellList.

4. Understanding of Specification file


Below is the format of the clock tree specification file:

Clock tree specification file:


#------------------------------------------------------------
# defining the clock shielding
#------------------------------------------------------------
RouteTypeName doublewidth
NonDefaultRule DOUBLEWIDTH_DOUBLESPACE
PreferredExtraSpace 0
TopPreferredLayer 6
BottomPreferredLayer 5
Shielding vss
# shielding will be done from VSS net.
# Non Default Rule “DOUBLEWIDTH_DOUBLESPACE” is used for Shielding.
#------------------------------------------------------------
# Clock Root : clkout # Clock Period: 36.992ns
#------------------------------------------------------------
AutoCTSRootPin clkout
Period 36.992ns
MaxDelay 36.992ns # Define maximum insertion delay
MinDelay 0ns # Define minimum insertion delay
MaxSkew 400ps # Define the maximum skew.
SinkMaxTran 400ps # Define maximum transition at the sink
BufMaxTran 400ps # Define maximum transition at input of clock buffer.
Buffer cnivx12 cnivx16 cnivx2 cnivx4 cnivx6
NoGating NO # Auto detects the clock gating and builds the tree
# through the gating element. If set to rising, tracing stops
# at the first gate.
DetailReport YES
SetIoPinAsSync NO
RouteClkNet YES # Do the clock routing.
PostOpt YES # automatically does the optimization
OptAddBuffer NO
RouteType doublewidth # Specify the routing attributes.
END
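
As a rough illustration of how the per-clock block above is assembled, the following Python sketch renders one from a few parameters. The function render_clock_spec and its arguments are hypothetical and not part of Encounter; the keywords are the ones used in the sample file, and setting MaxDelay to the clock period follows the default listed in section 2.

```python
# Hypothetical generator for the per-clock part of a CTS specification file.
# Only keywords that appear in the sample file above are emitted.
def render_clock_spec(root_pin, period_ns, buffers,
                      max_skew_ps=400, max_tran_ps=400, route_type=None):
    if not buffers:
        # Without a buffer list CTS has no cells to build the tree with.
        raise ValueError("at least one buffer or inverter cell is required")
    lines = [
        f"AutoCTSRootPin {root_pin}",
        f"Period {period_ns}ns",
        f"MaxDelay {period_ns}ns",   # default: clock period
        "MinDelay 0ns",
        f"MaxSkew {max_skew_ps}ps",
        f"SinkMaxTran {max_tran_ps}ps",
        f"BufMaxTran {max_tran_ps}ps",
        "Buffer " + " ".join(buffers),
    ]
    if route_type is not None:
        lines.append(f"RouteType {route_type}")
    lines.append("END")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_clock_spec("clkout", 36.992,
                            ["cnivx12", "cnivx16", "cnivx2"],
                            route_type="doublewidth"))
```

Generating the block from a script keeps the buffer list and limits consistent when several clocks share the same settings.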

In addition, there are other useful, design-dependent constraints, shown below,
which can be part of the constraints applied to the clock root pin.

To mark a pin as a leaf pin:
LeafPin
+ <pinname1>
+ <pinname2>
CTS treats the pins as sinks, stops tracing further, and balances clock skew.
To exclude a pin from clock tree synthesis:
ExcludedPin
+ <pinname1>
+ <pinname2>
CTS excludes the pins from the skew analysis.
To preserve the clock tree netlist below a pin:
PreservePin
+ <pinname1>
+ <pinname2>
CTS preserves the clock structure below the specified pins.
To treat a specific cell pin or port as a non-leaf pin:
GlobalExcludedPin/GlobalExcludedPort
+ u0/CK       # The CK pin on instance u0 (of DFFRX1) is declared as an excluded pin.
+ DFFRX1/CK   # Excludes the CK pin of all DFFRX1 instances from the clock tree.
CTS does not trace through the specified pins or include them in skew analysis.

Related Solution:
How to define Shielding clock nets in clock tree specification file.
Usage of GlobalExcludedPort and GlobalExcludedPin in specification file.
How to preserve a module for not adding buffer during CTS.
How does the order of clocks in specification file effects the quality of result.
How can we preserve/don’t touch a net while doing the synthesis.

5. Creating Macro Models to handle hard macros


A macro model is a block that has been clock tree synthesized so that the delays are
identified. All macro model statements must be specified in the top lines of the clock
tree specification file.
There are two ways to set the macro model pin properties inside Encounter:
1. Cell/port delay specification, in which all instantiations of the cell have the same
pin delay:
MacroModel port cellName/portName maxRiseDelay minRiseDelay maxFallDelay minFallDelay inputCap
e.g. MacroModel port spram288x65/clk 1e-8s .8e-8s 1.1e-8s .7e-8s 22e-12
2. Pin instance delay specification, which can supersede a cell/port delay:
MacroModel pin leafPinName maxRiseDelay minRiseDelay maxFallDelay minFallDelay inputCap
e.g. MacroModel pin mem_core/clk 20ps 18ps 20ps 18ps 28ff
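
Because a MacroModel statement takes four delays and a capacitance in a fixed order, it is easy to drop a unit or swap a field. The Python sketch below formats the pin form of the statement and rejects values without a unit. The helper name macro_model_pin is hypothetical, and the accepted unit suffixes (s, ns, ps for delays; ff, pf for capacitance) are an assumption based on the units that appear in this document, not a list taken from the tool documentation.

```python
import re

# A number such as 20, .8e-8 or 1.1e-8, followed by a unit suffix.
_NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"
_DELAY = re.compile(_NUMBER + r"(?:ps|ns|s)")   # assumed delay units
_CAP = re.compile(_NUMBER + r"(?:ff|pf)")       # assumed capacitance units


def macro_model_pin(leaf_pin, max_rise, min_rise, max_fall, min_fall, input_cap):
    """Build 'MacroModel pin ...' in the field order given in this section."""
    delays = [max_rise, min_rise, max_fall, min_fall]
    for value in delays:
        if not _DELAY.fullmatch(value):
            raise ValueError(f"delay {value!r} needs a value and a unit, e.g. 20ps")
    if not _CAP.fullmatch(input_cap):
        raise ValueError(f"capacitance {input_cap!r} needs a value and a unit, e.g. 28ff")
    return " ".join(["MacroModel", "pin", leaf_pin, *delays, input_cap])


if __name__ == "__main__":
    print(macro_model_pin("mem_core/clk", "20ps", "18ps", "20ps", "18ps", "28ff"))
```

Checking units before the file reaches the tool avoids the failure mode named in the related solutions, where CTS fails on a MacroModel statement that lacks delay or capacitance units.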

Related Solution:
How to apply different delay constraint to leaf pins using macro model.
CTS fail if MacroModel statement lacks delay/cap units.
How can we write different macroModel for different pins of same instance?
Why clock not delayed when running CTS with negative delay macromodel.
Deriving clock latency from .lib instead of macromodel constraint?
Macro model writes out 0ff for capacitance
How to delay and advance the clock on specific clock sync points using macromodel.
How to create the macromodel using the optDesign command.

6. Create dynamic macro models to handle clock dividers


A dynamic macro model is used to minimize the skew between the reference pin and the
target pin during CTS. The reference pin is a clock instance pin along a clock path. The
target pin must be a leaf pin. The DynamicMacroModel statement can be used when the
design contains clock dividers:
DynamicMacroModel ref refInstPinName pin targetInstPinName [offset delayNumber]

For example, a dynamic macro model can balance the skew between two flops, Flop A and
Flop B. When the clock pin of Flop B is specified as the reference pin and the clock pin
of Flop A as the target pin, the clock pin of Flop A is balanced with the clock pin of
Flop B. The DynamicMacroModel statement minimizes the skew between these two flops to
avoid timing violations on the data path.
Without the dynamic macro model, the clock pin of Flop A is balanced with the group of
flops and not with the clock pin of Flop B, because of the ThroughPin that has been
defined on Flop B.

7. Synthesizing the clock tree


To synthesize the clock tree set the desired mode settings using setCTSMode. Then run
clockDesign with the desired options.
encounter> setCTSMode < >
encounter> clockDesign -specFile <CtsConstraints>
The generated clock tree constraints file may not contain all the necessary constraints.
Defining the root pin, for example, may require an understanding of the clock strategy.
The recommendation is therefore not to use the auto-generated constraints file blindly,
but to create your own after understanding the clock strategy.
All clock group statements must be specified before any clock specification. Clocks are
grouped to ensure that the maximum skew between the sinks of the grouped clocks does not
exceed the maximum skew specified in the clock tree specification file.
If the buffers added for different clocks overlap during synthesis, the tool calls
refinePlace to legalize the placement.
If the buffers or inverters have to be supplied by some means other than the
specification file, the createClockTreeSpec command can be used.
To prevent CTS from changing a hierarchical module, insert buffers inside or outside of
the boundary ports of the module and then set PreservePin on those buffers.
The DontTouchNet and DontTouchFromToPin options can be used in the clock tree
specification file to preserve a net during CTS.
When nets are defined as DontTouchNet, the ckSynthesis and ckEco commands do not
insert buffers on those nets. The deleteClockTree command does not delete buffers whose
input or output nets have the DontTouchNet attribute. This is not a physical
parameter, however, so any net specified in this statement can still be routed.
The DontTouchFromToPin statement instructs the ckSynthesis and ckEco commands not to
insert buffers on nets that lie between the specified start instance pin and end
instance pin. Any nets between these pins are considered to have the DontTouchNet
attribute.
Related Solution:
CTS constraint file settings take precedence over setCTSMode settings.
CTS not honoring set_case_analysis constraint defined in SDC file.
How the report_timing be different then the CTS report for the clock nets.
How to instruct CTS not to add new port to specific logical module at given hierarchy.

8. Routing the clock tree


The behavior of the clockDesign command can be controlled using the setCTSMode
command. Clock nets can be routed in CTS using the -routeClkNet option of the setCTSMode
command or by setting RouteClkNet YES in the clock tree specification file.
To use a non-default rule or shielding for a particular clock, define a RouteTypeName
along with its rules in the constraints file, and then reference that name with
RouteType in the definition of that clock.
If routability issues arise and the properties of a particular clock need to change,
even though some properties were already set during CTS, use the
setAttribute command with the -net and -preferred_extra_space/-non_default_rule
options to attach attributes to the desired nets.

Related Solution:
How to reset the extra spacing of a particular clock net & how to pick nets in clk tree?
How to specify guide file with routeClockNetWithGuide?
What kind of abilities does NanoRoute have for clock tree routing?

9. Optimization of clock tree


The optimization of clock tree can be done using the ckECO command to improve the
skew of each clock and clock group, and to resolve minimum phase delay violations.
The ckECO command does not attempt to correct any design rule violations by default. To
fix the DRVs on the clock nets, run ckECO -fixDRVOnly separately. However, in trying to
improve skew, the ckECO command does not significantly worsen maximum transition or
maximum capacitance violations.
The ckECO command performs resizing and buffer insertion or dummy buffer insertion to
improve skew. In addition, the ckECO command might move gating cells when it
runs refinePlace.
The ckECO command also supports local skew optimization (with the -localSkew
parameter). Local skew optimization considers the skew between adjacent flip-flops that
have a data path connection (from the Q-pin of one flip-flop to the D-pin of another flip-flop).
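
The difference between overall skew and local skew can be made concrete with a small calculation. The Python sketch below assumes clock arrival times are already known per flip-flop (for example, taken from a report); the function names, instance names, and numbers are illustrative only and do not reproduce how ckECO computes skew internally.

```python
# Illustrative skew calculations on a table of clock arrival times (in ps).
def global_skew(arrival):
    """Largest difference in clock arrival time over all sinks."""
    return max(arrival.values()) - min(arrival.values())


def local_skew(arrival, data_paths):
    """Largest arrival difference over flop pairs joined by a Q-to-D data path.

    data_paths is a list of (launch_flop, capture_flop) pairs. With no pairs
    there is nothing to compare, so the result is 0.
    """
    return max((abs(arrival[a] - arrival[b]) for a, b in data_paths), default=0)


if __name__ == "__main__":
    arrival_ps = {"ff1": 100, "ff2": 130, "ff3": 180}
    pairs = [("ff1", "ff2")]          # only ff1 -> ff2 share a data path
    print(global_skew(arrival_ps))    # spread over every sink
    print(local_skew(arrival_ps, pairs))
```

In this example ff3 dominates the overall skew but has no data path to the other flops, so it does not contribute to the local skew.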
The following options control the behavior of ckECO:
-preRoute: Use when there is no license to run NanoRoute, or when the flow is to
build the clock tree, optimize the clock tree, and then call another router to route
the clock nets.
-clkRouteOnly: Use immediately after the clock tree is routed.
-postRoute: Use after all signal nets are routed.
Related Solution:
ckECO –localSkew is reporting zero adjacent pairs.
ckECO –preRoute creates shorts with power nets.

10. Tracing and Analysis of clock tree


CTS traces the clocks before synthesis and writes the results to a
*trace file, which can also be used to understand the clock strategy.
Tracing has been seen to fail when two clock roots merge at the same output pin, when
there are reconvergent points within the same clock, or when there are crossover points
from one clock to another. In these cases CTS fails and does not build the clock tree,
so the clock crossover and reconvergence points must be handled before the tree can be built.
The diagram shows the crossover and reconvergence scenarios. The clockDesign
command handles these scenarios automatically.

With the ckSynthesis command, the forceReconvergent option can be used instead.
Use this option if the physical partition has muxed clocks and CTS is
expected to build a clock tree for every clock root of the muxed clock. The option
allows CTS to handle (trace through) the muxed clocks and generate a balanced tree
starting from all the clock root branches of the muxed clock.
Related Solution:
Getting ‘Tracing Clock Fail’ Error for reconvergence pins.
CTS with forceReconvergent fails on PreservePin in clock specification.

Clock Gating – Cloning & De-Cloning


Cloning distributes the clock gating components and their gated loads, depending on the
parameters you specify. It can be used to optimize the number and placement of the
gated clock cells based on the placement of the design. Gated clock cells are typically
inserted into the netlist during synthesis, which may not be placement aware. Optimizing
the gated clock cells after placement can improve the placement and the
design performance. The command used to do this is ckCloneGate.
Decloning merges identical clock gating components that have the same inputs, depending
on the parameters you specify.
This step can be run prior to ckCloneGate to give ckCloneGate a better starting point.
To achieve the highest level of decloning, use the options -ignoreDontTouch and
-ignorePreplaced. To check the decloning that ckDecloneGate will do before committing
it, run "ckDecloneGate -check -file filename" to output a report on the changes it
proposes.

Related Solution:
How to perform clock gate cloning and decloning in SOC.
Is cloning/decloning possible in CTS?
Why does CTS and reportClockTree not correlate with CTE?

11. CTS with Multi-Mode Multi-Corner (MMMC) Flow


To run Clock Tree Synthesis (CTS) in MMMC mode, first create the clock tree
specification file for each operating mode, and then synthesize the clock tree using the
specification files.

Define Mode and Analysis View:
    create_constraint_mode -name functional -sdc functional.sdc
    create_constraint_mode -name test -sdc test.sdc
    create_analysis_view -name func_slow -constraint_mode functional -delay_corner slow
    create_analysis_view -name test_slow -constraint_mode test -delay_corner slow
    create_analysis_view -name func_fast -constraint_mode functional -delay_corner fast
    create_analysis_view -name test_fast -constraint_mode test -delay_corner fast
    set_analysis_view -setup {func_slow test_slow} -hold {func_fast test_fast}
  In general form:
    set_analysis_view -setup {view1 view2} -hold {view3 view4}

Generating Clock Tree Spec File:
    setCTSMode -specMultiMode true
    clockDesign -genSpecOnly fileName
  clockDesign generates a CTS specification file for each mode, named
  "fileName.modeName". For example, with a "functional" mode and a "test" mode,
  clockDesign generates the specification files "fileName.functional" and
  "fileName.test".

Clock Tree Synthesis:
    clockDesign -specViewList {{fileName.modeName view1 view2} {fileName.modeName view3 view4} ...}
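
The -specViewList argument pairs each per-mode specification file with the analysis views for that mode, and the nested braces are easy to get wrong by hand. The Python sketch below builds the command string from a mode-to-views mapping. The helper spec_view_list and the base file name cts.spec are hypothetical; the "fileName.modeName" naming and the argument layout follow the description in this section.

```python
# Hypothetical helper: build the clockDesign -specViewList command line.
def spec_view_list(file_name, mode_views):
    """mode_views maps a constraint mode name to the list of its view names."""
    groups = []
    for mode, views in mode_views.items():
        # One brace group per mode: the per-mode spec file, then its views.
        groups.append("{" + " ".join([f"{file_name}.{mode}", *views]) + "}")
    return "clockDesign -specViewList {" + " ".join(groups) + "}"


if __name__ == "__main__":
    print(spec_view_list("cts.spec", {
        "functional": ["func_slow", "func_fast"],
        "test": ["test_slow", "test_fast"],
    }))
```

The resulting string can be pasted into a run script after the per-mode specification files have been reviewed and edited.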

The above flow generates the CTS specification files and synthesizes the clock trees in
separate steps. This is the most common approach, because the automatically generated
clock tree specification files sometimes need to be modified. If editing is not required,
you can use the flow below.

Define Mode and Analysis View:
    set_analysis_view -setup {view1 view2} -hold {view3 view4}
    setCTSMode -specMultiMode true
    clockDesign

Generating Clock Tree Spec File:
    createClockTreeSpec view1.spec

Clock Tree Synthesis:
    specifyClockTree view1.spec
    ckSynthesis
    saveClockNets view1.DontTouchNets
    cleanupSpecifyClockTree
    createClockTreeSpec view2.spec
    specifyClockTree view2.spec + view1.DontTouchNets
    ckSynthesis
    cleanupSpecifyClockTree
    timeDesign

Related Solution:
How do you run CTS in a Multi-Mode Multi-corner (MMMC) flow?

12. Guidelines and Issues

Guidelines for Avoiding the Hold Violation


- Split the larger clock domains into smaller, more manageable domains and build a
  tree for each separately. Because many timing paths cross between them, define all
  the downstream trees in a clkGroup so that they are balanced.
- Remove the through points on divider registers that are the source points for
  generated clocks; this helps reduce the insertion delay by half.
- Investigate the clock tree for the depth of muxing and for whether any
  HVT library (slow but low leakage) is being used, since both affect the insertion
  delay. One solution is to use mixed-VT libraries.
- Switch off set_dont_touch and set_dont_use on the clock gating cells to allow
  CTS to upsize these cells, e.g.:
  set_dont_use [get_lib_cell <clock gating libcell>] false
- One way of approaching the clock tree constraints is to set maxDelay to a large
  value, which reduces the effort the tool spends on it and lets it focus on
  skew, slew, and a minimal number of added cells.
- Cell padding helps reserve more space around flip-flops. This should help
  with both clock tree buffer insertion and the addition of de-coupling cap cells in
  key areas.
- If scan reordering is used, further reordering can be carried out after the clock
  trees are inserted to reduce hold violations caused by clock tree insertion. This
  should help post-CTS hold, but may have little effect on post-CTS routing
  congestion.
  Set the following before the placement stage:
  setPlaceMode -ignoreScan true
  setScanReorderMode -skipMode skipNone
  After CTS and after setting the clocks to propagated:
  setScanReorderMode -clkAware
  scanReorder
- After the clock trees with lower cell counts have been created, the ckECO command
  can be used on trees with high buffer-cell counts, which may give some further
  improvement in skew:
  ckECO -clk <clk root name> -postRoute -useSpecFileCellsOnly
- During optimization ckECO performs resizing, and it may use any
  cell that matches the footprint of the existing cell, regardless of whether that cell
  is in the buffer list. It may therefore swap out a cell that is specified in the spec
  file. To limit the resizing to the listed cells, specify setDontUse
  cellName true on the cells it should not use.

Related Solution:
Pointers on how to create low skew, low cell count, low insertion delay clock tree.
CTS insert buffers/inverters that are not in buffer list of spec file.

Issue on tracing the Bi-direction ports


In some scenarios clock tree synthesis cannot interpret a
bidirectional port used as a clock root, because it is unpredictable whether CTS treats
it as an input or an output. Two such scenarios are described below.

[Figure, 1st scenario: an INOUT clock root pin connected to a black box and to a clock
gate that leads to an FF. CTS assumes the pin is an input and cannot find any arc.]

[Figure, 2nd scenario: an INOUT clock root pin between two clock gates, leading to an
FF. CTS traces on the right side, but the user wants the pin treated as the output pin
of the gate and traced on the left side.]

There is no issue when CTS traces the INOUT pin as an input. When the pin must be
treated as an output, as in the scenarios above, use the following setting:
setCTSMode -traceCellInOutPinAsOutPin true

13. Documentation
EDI System User Guide
Describes how to install and configure the EDI System software, and provides strategies
for implementing digital integrated circuits.
EDI System Text Command Reference

Describes the Encounter text commands, including syntax and examples.
