Академический Документы
Профессиональный Документы
Культура Документы
Course Mantras
Objectives
Module Objectives:
Motivation
The logic design strongly impacts the achievable clock frequency and
thus IC performance
The synthesis tools help achieve a specific a specific clock target but can not
fix a poorly structured design
How the design and the CAD tools achieve a clock frequency that satisfies
set-up requirements
How the CAD tools try to fix hold requirements
Any why this is much harder when latches are used
References
References
Smith and Franzon,
Ciletti,
Mantra #1
Why?
Moving data between different clock domains requires careful timing design
and synthesis scripting
For your design (at least for each module) use one clock source and only one
edge of that clock
Only use edge-triggered flip-flops
Caveat
Tools and designers are getting better at using latches and multi-phase clocks.
However, this requires some experience to get correct.
In general, all signals start and end in registers every clock period
This is how the clock synchronizes logic events
D-Flip-flops
Combinational Logic
Example (Revision):
In[0]
In[1]
In[2]
Out[0]
Out[1]
Out[2]
0
1
In[3]
clock
Clock
In
Out
A
3
2
3
c
1
6
4
Critical Path
Thus, the clock speed is determined by the slowest feasible path between
registers in the design
The goal of clock tree is for the clock to arrive at every leaf node at the
same time:
Usually designed after synthesis: Matched buffers; matched capacitance
loads
Common design method:
H tree
Clock tree done after design
Balance RC delays
Balance buffer delays
10
t_skew
11
Smaller skews
Local skews < Global skew
Multiple non-overlapping clock phases
Deliberate non-random skew at flip-flop/latch level
Future automatic clock tree synthesis tools might include features like this
These features are starting to appear in clock
generation tools but are not to be assumed for this class
12
Q
Ck
Ck
tSetUp
Set-up time:
Data can not change no later than this
point before the clock edge.
thold
Hold time:
Data can not change during this time after
the clock edge.
D
Q
tclock-Q
Register
Q1
D1
Comb
Logic
clock
Register
D2
tLogic-delay
+/-tskew
Q2
clock
t_clock-Q
Delay on output (Q) changing from positive
clock edge
In general skew is bounded but you do not
know whether it is positive or negative at any
particular point
13
tskew
clock
tclock-Q
Q1
tset-up
Violation
if D2 changes
after this time
tlogic
D2
The amount of time required to turn > into = is referred to as timing slack
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html
14
tclock-Q
tskew
thold
Hold
Violation
tlogic
D2
OK
Problem: Incorrect
Q2
Q or race-through
sometimes have to insert additional logic to prevent hold violations
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html
15
tSetUp
thold
D
Q
tclock-Q
tSetUp
thold
D
Q
Ck
tSetUp
thold
D
tclock-Q
tclock-Q
Logic failure
16
tSetUp
thold
D
Q
17
D-latch
Q follows D while clock is high
(transparent)
Value on D when clock goes low
is stored on Q
Q
Ck
Ck
tSetUp
thold
D
Q
tclock-Q
Register
Q1
D1
Comb
Logic
Register
D2
tLogic-delay
+/-tskew
Q2
18
t1
t2
clock
t1
t2
19
t clock t clock high max t clock Q max t log ic max t set up t skew
clock
tskew
clock
tclock-Q
Q1
tslack
tset-up
tlogic
D2
Notes:
The percentage of time the clock is high is referred to as the duty-cycle
If part of the following clock-high time is used to allow this logic to be slower, then
the logic-block connected to Q2 must be proportionally faster
Using the clock-high time like this is called cycle-stealing
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html
20
21
Example:
t1
t2
t3
clock
t1
t2
t3
T3=tclock
22
tskew
thold
Q1
Hold
Violation
tlogic
D2
Note:
Hold violations are harder to prevent in latch-based designs
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html
23
Revision So far
D input for flip-flop changing after set-up time before clock edge
What is a hold-violation?
D input for flip-flop changing during hold period after clock edge, as a
result of race-through
24
Example:
D-flip-flop
D-flip-flop
Setup:
= 5 + 6 + 1 + 1 = 13 ns
Hold:
Is
I.e. 76 MHz
3+1
2+1
<
25
= RonCload
Thus, delay in CMOS circuits depends largely on W/L of the drive transistor
and the capacitance of the load it is driving.
Different drive cell sizes can be selected by synthesis tool (x1, x2, x3, etc.)
That capacitance consists of:
Input gates of cells being driven, and Capacitance of wiring
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html
26
Metrics : FO-4
Typical timing budgets
Pipelining and Parallelism
Logic style
27
Delay Metric
Estimating FO4:
Exemplar delays:
Inverter = FO-4
1-bit adder = 10 FO4
Flip-flop t_cp-Q = 4 FO4
28
Compare
Critical path
Compare
Compare
29
Pipelining
Replace with:
Compare
Compare
Compare
30
Replace with:
Compare
Compare
Compare
To see how to code these three structures in Verilog, please refer to the end
of the Verilog1 notes.
31
Retiming
Before:
20FO4
5FO4
10FO4
T_cp = 4 + 20 + 5 + 2 + 4 + 2 = 37 FO4
After:
20FO4
5FO4
10FO4
T_cp = 4 + 20 + 2 + 4 + 2 = 32 FO4
32
During synthesis
After synthesis:
33
analysis tool
Back-annotate
more accuracy
Start
Verilog structural
description of
design (post clock
tree synthesis)
Back-annotate
wiring parasitics in
Prim etim e
Post-route SPEF
file from SoC
Encoutner
Finish
34
Sidebar
Not Examinable!
What is asynchronous design?
Can we use deliberate local clock skew in a design?
Are you sure flip-flops are better, cycle stealing sounds
useful?
35
Remember
36
Summary
37
Summary
Does the logic design have a strong impact on achieving a desired clock
speed?
Yes, design structure has a STRONG impact
38