Вы находитесь на странице: 1из 38

ECE 520 Class Notes

2. Timing Design in Digital Systems


Dr. Paul D. Franzon
Outline
1. Timing design in Synchronous (clocked) Logic

Min/Max timing with flip-flops


Latch-based design

3. Timing Issues in CMOS circuits


4. Timing verification Flow
5. Techniques to Improve Performance

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

ECE 520 Class Notes

Course Mantras

One clock, one edge, Flip-flops only


Design BEFORE coding
Behavior implies function
Clearly separate control and datapath

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

ECE 520 Class Notes

Objectives
Module Objectives:

Describe how a clock-synchronized design behaves at a clock-cycle to cycle


level. Know how to draw a timing diagram for a design.
Describe how clocks are distributed. Define clock skew and jitter.
Describe the basic behavior of flip-flops and latches.
Define setup and hold times.
Derive the timing equations for flip-flop based designs.
Describe the basic behavior of latches, including set-up and hold times.
Derive the timing equations for latch based designs.
Understand the application of cycle stealing in latches.
Describe timing closure in the CAD tool flow.
Understand the relationship between design and clock frequency

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

ECE 520 Class Notes

Motivation

Synchronous design is one cornerstone of modern digital design

The logic design strongly impacts the achievable clock frequency and
thus IC performance

Events are synchronized by a master clock using flip-flops and latches

The synthesis tools help achieve a specific a specific clock target but can not
fix a poorly structured design

The designer needs to understand

How the design and the CAD tools achieve a clock frequency that satisfies
set-up requirements
How the CAD tools try to fix hold requirements
Any why this is much harder when latches are used

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

ECE 520 Class Notes

References
References
Smith and Franzon,

Section 11.2 (timing relationships)


Chapter 13 (CAD flow)

Ciletti,

Sections 3.1, 3.2.1 (flip-flops, latches)


Section 11.2.1 (Static timing analsys)

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

ECE 520 Class Notes

Mantra #1

One clock, one edge; Flip-flops only

Why?

Moving data between different clock domains requires careful timing design
and synthesis scripting

If you need multiple clocks in your design

For your design (at least for each module) use one clock source and only one
edge of that clock
Only use edge-triggered flip-flops

Make them related by a powers of 2


E.g 50, 100 and 200 MHz
Consider one clock per module
Consider resynchronizing using flip-flops between clock domains

Caveat

Tools and designers are getting better at using latches and multi-phase clocks.
However, this requires some experience to get correct.

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

ECE 520 Class Notes

General Approach to Timing Design

In general, all signals start and end in registers every clock period
This is how the clock synchronizes logic events
D-Flip-flops

Combinational Logic

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

ECE 520 Class Notes

Clock Level Timing

Example (Revision):
In[0]
In[1]
In[2]

Out[0]

Out[1]

Out[2]

0
1

In[3]

clock
Clock
In
Out

A
3

2
3

c
1

6
4

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

ECE 520 Class Notes

Critical Path

Thus, the clock speed is determined by the slowest feasible path between
registers in the design

Often referred to as the critical path

Critical path is longer


with increased logic depth
(# gates in series)

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

ECE 520 Class Notes

Synchronous Clock Distribution

The goal of clock tree is for the clock to arrive at every leaf node at the
same time:
Usually designed after synthesis: Matched buffers; matched capacitance
loads
Common design method:

H tree
Clock tree done after design
Balance RC delays
Balance buffer delays

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

10

ECE 520 Class Notes

Clock skew and jitter

Clock skew = systematic clock edge variation between sites

Mainly caused by delay variations introduced by manufacturing variations


Random variation

Clock jitter = variation in clock edge timing between clock cycles

Mainly caused by noise


t_jitter

Equations below lump


jitter and skew into a single
skew number.
Skew is larger anyway.

t_skew

Skew + jitter called Uncertainty


in Synopsys.

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

11

ECE 520 Class Notes

Comments on Clock Skew

ASIC design relies on automatic clock tree synthesis

Works to guarantee a global skew target

Custom clock distribution can be used to add the following features to a


clock:

Smaller skews
Local skews < Global skew
Multiple non-overlapping clock phases
Deliberate non-random skew at flip-flop/latch level
Future automatic clock tree synthesis tools might include features like this
These features are starting to appear in clock
generation tools but are not to be assumed for this class

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

12

ECE 520 Class Notes

Flip-Flop based design


D

Edge triggered D-flip-flop


Q becomes D after clock edge

Q
Ck

Ck

tSetUp

Set-up time:
Data can not change no later than this
point before the clock edge.
thold

Hold time:
Data can not change during this time after
the clock edge.

D
Q

tclock-Q

Register
Q1
D1
Comb
Logic

clock

Register
D2

tLogic-delay
+/-tskew

Q2

clock

t_clock-Q
Delay on output (Q) changing from positive
clock edge
In general skew is bounded but you do not
know whether it is positive or negative at any
particular point

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

13

ECE 520 Class Notes

Preventing Set-Up Violations


Set-up violation:
Logic is too slow for the correct logic value to arrive at the inputs to the
register on the right before one set-up time before the clock edge
Constraint to prevent this:

t clock t clock Q max t log ic max t set up t skew


clock

tskew

clock

tclock-Q
Q1

tset-up

Violation
if D2 changes
after this time

tlogic

D2
The amount of time required to turn > into = is referred to as timing slack
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

14

ECE 520 Class Notes

Preventing hold violations


Hold violations occur when race-through is possible
Constraint to prevent hold violations:

t hold t skew t clock Q min t log ic min


clock
clock
Q1

tclock-Q

tskew
thold

Hold
Violation

tlogic
D2

OK

Problem: Incorrect
Q2
Q or race-through
sometimes have to insert additional logic to prevent hold violations
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

15

ECE 520 Class Notes

What can happen with a timing violation?


D changes outside setup and hold tclock-Q is correct
Ck

tSetUp

thold

D
Q

tclock-Q

D changes during setup and hold tclock-Q longer than specified, or Q


does not transition correctly
Ck

tSetUp

thold

D
Q

Ck

tSetUp

thold

D
tclock-Q

Incorrect timing in next logic stage

tclock-Q

Logic failure

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

16

ECE 520 Class Notes

What can happen with a timing violation?


D changes during setup and hold
Ck

tSetUp

thold

Q changes in WRONG clock cycle


- Racethrough

D
Q

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

17

ECE 520 Class Notes

Latch Based Design


D

D-latch
Q follows D while clock is high
(transparent)
Value on D when clock goes low
is stored on Q

Q
Ck

Ck

tSetUp

thold

D
Q

tclock-Q

Register
Q1
D1
Comb
Logic

Register
D2

tLogic-delay
+/-tskew

Q2

Set-up and hold times:


D can not change close to the falling
(latching) clock edge.
t_clock-Q
Delay from clock going high to Q
changing
t_D-Q
Delay from D changing to Q changing
while clock high (assumed = t_ck-Q
here)

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

18

ECE 520 Class Notes

Latch Timing Constraints


Set-up

constraints under nominal design same as for flip-flop

t clock t clock Q max t log ic max t set up t skew


Transparency

of latch can be used to improve flexibility of timing


e.g.: If critical path is in logic block 1:

t1

t2

clock
t1

t2

t1 longer than for FF exploit transparent latch


2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

19

ECE 520 Class Notes

Latch timing constraints WITH cycle stealing


To prevent set-up violations:

t clock t clock high max t clock Q max t log ic max t set up t skew
clock

tskew

clock

tclock-Q
Q1

tslack

tset-up

tlogic

D2

Notes:
The percentage of time the clock is high is referred to as the duty-cycle
If part of the following clock-high time is used to allow this logic to be slower, then
the logic-block connected to Q2 must be proportionally faster
Using the clock-high time like this is called cycle-stealing
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

20

ECE 520 Class Notes

Latch Set Up Violations


Notes:

The percentage of time the clock is high is referred to as the duty-cycle


If part of the following clock-high time is used to allow this logic to be
slower, then the logic-block connected to Q2 must be proportionally
faster

Using the clock-high time like this is called cycle-stealing

Normally cycle stealing is not enabled

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

21

ECE 520 Class Notes

Latches Cycle Stealing

Can use up to a total of t_clock_high within a pipeline structure, to help


in timing closure

Example:

t1

t2

t3

clock
t1

t2

t3

t1,t2 > tclock (cycle stealing)


What can t3 be?
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

T3=tclock
22

ECE 520 Class Notes

Latch timing constraints


To prevent hold violations:

t clock high max t hold t skew t clock Q min t log ic min


clock
clock

tskew
thold

Q1

Hold
Violation

tlogic
D2
Note:
Hold violations are harder to prevent in latch-based designs
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

23

ECE 520 Class Notes

Revision So far

What is a set-up violation?

D input for flip-flop changing after set-up time before clock edge

How is a set-up violation fixed?


Reduce critical path, or increase clock period

What is a hold-violation?
D input for flip-flop changing during hold period after clock edge, as a
result of race-through

How is a hold-violation fixed?


Increase shortest logic path(s)

Why are edge-triggered flip-flops preferred over latches?


Much less suscpetible to hold violations

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

24

ECE 520 Class Notes

Example:

D-flip-flop

min : typ : max


Tclock-Q = 3 : 4 : 5
TNOR
=1:2:3
Tsu_max = 1
Thold_max = 2
Tskew = 1 ns

D-flip-flop

If this is the critical path, what is the fastest clock frequency?


Is there potential for a hold violation?

tclock tclock Q max tlog ic max t set up t skew

Setup:

= 5 + 6 + 1 + 1 = 13 ns
Hold:

Is

I.e. 76 MHz

t hold t skew tclock Q min tlog ic min

3+1

2+1
<

OK. No hold violation

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

25

ECE 520 Class Notes

CMOS Drive Strength


Revision: CMOS transistors operating in the linear region:

I ds ((Vgs Vt )Vds Vds2 / 2


where ( tox )(W / L)
where W is the transistor width, and L is the channel length
i.e. To a first approximation,
I ds Vds / Ron
Ron 1/ (VGS-VT)

= RonCload

Thus, delay in CMOS circuits depends largely on W/L of the drive transistor
and the capacitance of the load it is driving.
Different drive cell sizes can be selected by synthesis tool (x1, x2, x3, etc.)
That capacitance consists of:
Input gates of cells being driven, and Capacitance of wiring
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

26

ECE 520 Class Notes

Estimating and Improving Performance

With a focus on timing:


Topics:

Metrics : FO-4
Typical timing budgets
Pipelining and Parallelism
Logic style

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

27

ECE 520 Class Notes

Delay Metric

Usual Metric for delay:

Estimating FO4:

Fanout of 4 inverter delay: FO4


Typical ~ 360 x Leff (ps)
Worst Case ~ 600 x Leff (ps)
Leff = Effective gate length in um ~ 0.7 x Ldrawn
E.g. In a 0.18 um process, Leff = 0.126 um and FO-4 <= 75 ps

Exemplar delays:
Inverter = FO-4
1-bit adder = 10 FO4
Flip-flop t_cp-Q = 4 FO4

2-input NAND gate = 2 FO4


2-input Multiplexer = 4 FO4
Flip-flop t_su / t_h = 2 FO4

Clock skew = 4 FO4


Clock jitter = 2 FO4

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

28

ECE 520 Class Notes

Examples of Improving Timing Performance

Example 1 : Benefits of Pipelining and Parellism


Example:

Compare

Critical path

Compare

Compare

If t_comparator = 20 FO4, what is the clock period?


(Use values on previous page)
T_cp = t_ck-Q + t_logic + t_su + t_skew + t_jitter
= 4 + 20*3 + 2 + 4 + 2 = 72 FO4

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

29

ECE 520 Class Notes

Pipelining

Replace with:

Compare
Compare

Compare

T_cp = t_ck-Q + t_logic + t_su + t_skew + t_jitter


= 4 + 20 + 2 + 4 + 2 = 32 FO4

What is the delay improvement?

What is the drawback?

2.25x (note: NOT 3x)


Increased area AND power
BUT performance/area increased

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

30

ECE 520 Class Notes

Logic Level Parallelism

Replace with:

Compare
Compare
Compare

Clock Period = 52 FO-4


No increase in area

To see how to code these three structures in Verilog, please refer to the end
of the Verilog1 notes.

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

31

ECE 520 Class Notes

Retiming

Impact of critical paths can often be reduced by retiming or rebalancing a


design:
Example:

Before:
20FO4

5FO4

10FO4

T_cp = 4 + 20 + 5 + 2 + 4 + 2 = 37 FO4

After:

20FO4

5FO4

10FO4

T_cp = 4 + 20 + 2 + 4 + 2 = 32 FO4

Note: Clock level logic sequence has been changed

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

32

ECE 520 Class Notes

Timing in CAD Flow

During synthesis

After synthesis:

Tool calculates path delays under worst-case delay conditions


Determines critical path
Moves logic to a faster path if setup violation predicted
Some tools do a preliminary placement while doing logic synthesis so
that wiring delay can be properly estimated
Perform Static Timing Analysis
Determine no setup violations exist under worst case conditions
Determine no hold violations exist under best case conditions

After place and route:

Perform Timing Analysis


i.e. Run timing verification tools on netlist with actual delays
Back-annotate actual delays to netlist from later tools

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

33

ECE 520 Class Notes

Initial Delay Estimation Flow


Primetime:

analysis tool

Gate-level static timing

Report timing for critical path

Back-annotate

more accuracy

Start

wire parasitics for

Load, link, and


constrain design in
Prim etim e

Verilog structural
description of
design (post clock
tree synthesis)

Back-annotate
wiring parasitics in
Prim etim e

Post-route SPEF
file from SoC
Encoutner

SPEF file from place and route tool

Report tim ing in


Prim etim e

Finish

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

34

ECE 520 Class Notes

Sidebar
Not Examinable!
What is asynchronous design?
Can we use deliberate local clock skew in a design?
Are you sure flip-flops are better, cycle stealing sounds
useful?

Message: The CAD tools capabilities


constrain your design flexibility.
2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

35

ECE 520 Class Notes

Remember

Methodology for purposes of ECE 520

If at all possible one-edge of one clock


If you need multiple clocks, they must have a common root and be
related by factors of 2
E.g. Root clock : Tclock = 5 ns
This is BEST: Tclock only!!
This is OK:
Tclock, Tclock10, Tclock20
Tclock10 = 10 ns (Tclock*2)
Tclock20 = 20 ns (Tclock*4)
ONE CLOCK
This is NOT OK
ONE EDGE
Tclock, Tclock15, Tclock17
Tclock15 = 15 ns (Tclock*3)
Tclock17 = 17 ns (??)

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

36

ECE 520 Class Notes

Summary

What determines the maximum clock frequency?


Total delay in critical path + Clock skew
Determined in part by maximum logic depth

What is a hold violation?


When logic change runs through storage elements in sequence
in one clock edge.
Determined in part by minimum logic depth

Why do we prefer flip-flop designs over using latches?


Less potential for hold violations

What tool is used to check timing at design closure?


Static timing analysis, after logic & RC extraction.

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

37

ECE 520 Class Notes

Summary

What determines clock skew?


Random variations in clock distribution delay

Can the designer set the clock skew to be positive or negative?


No, (usually) clock skew can be either, depending on random result

Does the logic design have a strong impact on achieving a desired clock
speed?
Yes, design structure has a STRONG impact

2012 Dr. Paul D. Franzon, www.ece.ncsu.edu/erl/faculty/paulf.html

38

Вам также может понравиться