Академический Документы
Профессиональный Документы
Культура Документы
I give you some timing models of the gates and maybe wires too
R. Rutenbar 2001
Acknowledgements
^
Current version has also benefited from versions of 18-760 taught jointly
by John Cohn (IBM) and Dave Hathaway (IBM) at the University of
Vermont Dept of EE.
Many thanks to Karem, Tom, John, and especially Dave for all
the inputs on this material
R. Rutenbar 2001
Page 1
Copyright Notice
R. Rutenbar 2001
M
Aug 27
Sep 3
10
17
24
Oct 1
8
15
22
29
Nov 5
12
Thnxgive 19
26
Dec 3
10
T
28
4
11
18
25
2
9
16
23
30
6
13
20
27
4
11
W
29
5
12
19
26
3
10
17
24
31
7
14
21
28
5
12
Th
30
6
13
20
27
4
11
18
25
1
8
15
22
29
6
13
F
31
7
14
21
28
5
12
19
26
2
9
16
23
30
7
14
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Introduction
Advanced Boolean algebra
JAVA Review
Formal verification
2-Level logic synthesis
Multi-level logic synthesis
Technology mapping
Placement
Routing
R. Rutenbar 2001
Page 2
Readings
^
De Micheli
X
R. Rutenbar 2001
Basic question
X
So what do we do instead?
X
R. Rutenbar 2001
Page 3
Combinational
Circuit
(No feedback
loops)
LATCHES
LATCHES
Common
Clock
R. Rutenbar 2001
Clock
Data
R. Rutenbar 2001
Page 4
R. Rutenbar 2001
Data
Often consider rising and
falling times separately
Clock
R. Rutenbar 2001
Page 5
Does data always reach a stable value at all latch inputs in time for the
clock to capture it?
Z
Does data always stay stable at all latch inputs long enough after the
clock to get stored?
Z
R. Rutenbar 2001
Delay Models
^
The delay through a gate -- ANY gate -- is equal to 1 time unit. Period.
=1
=1
R. Rutenbar 2001
Page 6
Delay Models
^
=2
=3
R. Rutenbar 2001
Delay Models
^
Gates with more fanout are slower than gates with less fanout
=2
= 3.2
= 3
=2
Gate output has to electrically drive all
the fanout gates. More fanout means
more load ==> slower.
Gates delay model will usually depend on load of driven wires & gates
Page 7
Delay Models
^
=2
=3
Poor slope, slow rise
in
=3
=3
1
in
=3
=3
!
out
out
R. Rutenbar 2001
Delay Models
^
Delay is from each individual pin to gate output(s); all can be different
=2
=3
5 V = logic 1
=3
=3.2
B
nand(A,B)
A
B
0 V = logic 0
R. Rutenbar 2001
Page 8
Delay Models
^
(output falling)=3.1
(output rising)=3.5
X
Delay Models
^
In most elaborate case, its a real probability distribution that gives you
a real probability of the signal arriving with a given delay...
Messy! Complicated!
=2
=3
= max delay
= min delay
delay
3
R. Rutenbar 2001
Page 9
Topological analysis means we only worry about the delay through the
paths through the graph shown below, not the logical function of the
modules (which we hide here!)
Longest delay is
=8
PI
8+2+8+2 = 20
=8
=2
=2
=1
PI
PO
=1
=1
PI
R. Rutenbar 2001
Topological (again)
=8
PI
=8
=2
=1
PI
PO
=1
Delay = 20
=1
PI
=2
PI
2:1 mux
0
=8
2:1 mux
0
=2
=1
PI
conflict
=2
=1
PO
=1
PI
1
R. Rutenbar 2001
Page 10
It is not possible to apply a set of inputs that will cause a logic signal to
propagate down this supposed longest path from PI to PO
Sensitization
X
=8
PI
2:1 mux
0
=8
2:1 mux
0
=2
PI
1/0
PI
=1
=2
=1
PO
=1
R. Rutenbar 2001
Sensitization
^
Definitions
X
controlling
0
value is_____
output
0
controlling
1
value is_____
output
1
1
1
R. Rutenbar 2001
Page 11
Sensitization
^
Definitions
X
A path is a set of connected gates and wires that starts with some PI and
ends with some PO. Path is defined by 1 input and 1 output per gate
Side inputs on a path are the other inputs to these gates on the path.
Combinational
network
Stuff connected
to the side inputs
Side inputs
PO
PI
R. Rutenbar 2001
Static Sensitization
^
Static sensitization
X
Input vector
Combinational
network
Stuff connected
to the side inputs
1
1
Side
inputs
1
PO
PI
R. Rutenbar 2001
Page 12
Static Sensitization
^
PI
2:1 mux
0
2:1 mux
0
=8
=2
PI
1/0
PI
=1
=2
PO
=1
=1
Statically sensitizable
=8
PI
2:1 mux
0
2:1 mux
0
=8
=2
PI
PI
=1
=2
PO
=1
=1
R. Rutenbar 2001
Sensitization
^
PI
=8
2:1 mux
0
=1
2:1 mux
0
=2
PI
=1
=2
arbitrary
Boolean
function F
=8
PO
=1
PIs
R. Rutenbar 2001
Page 13
...and that also allow the output of that gate to propagate to some
output.
test pattern
input vector
nee
Combinational network
nputs
side i
t
h
g
i
the r
orce
d to f
PO
R. Rutenbar 2001
Dynamic sensitization
X
at time t2 need
a 1 on this AND...
Side
inputs
Combinational
network
PO
Stuff connected
to the side inputs
PI
Side
inputs
PO
PI
at time t0 need
a 1 on this AND...
R. Rutenbar 2001
Page 14
No slopes, etc. Any loading effects are bundled back into the gate
delay number itself.
=3
=3.2
This is usually a pessimistic timing model -- delay numbers too big since
we find false paths first that are usually overly long
R. Rutenbar 2001
PRO
CON
Can be pessimistic (includes false paths)
R. Rutenbar 2001
Page 15
X
X
X
2
2
3
d
R. Rutenbar 2001
Can use delay graph with node for each pin instead of each net
a x
c w
e
o
b y
d z
a
w
e
b
X
Gate and net delays interact - can have delay edge from input to input
a
w
o
b
X
Page 16
Delay Graph
^
Now network has 1 clear entry node, and 1 clear exit node
Even timers that dont explicitly add these nodes do something similar
Z
Loop through all PIs (POs) loop through fanout (fanin) of source
(sink) node
a
b
0
Src
=2
=3
d
a
c
2
0
3
Sink
d
Like HLS scheduling graph
R. Rutenbar 2001
Page 17
Path Enumeration
^
2n
R. Rutenbar 2001
Instead find for each node the worst delay to the node along any path
R. Rutenbar 2001
Page 18
ATs
n
src
sink
other paths
R. Rutenbar 2001
RATs
n
src
sink
other paths
R. Rutenbar 2001
Page 19
Slacks at a node
X
Can increase delay at a node (to minimize power, circuit area) with
positive late mode slack and not degrade overall performance
How To Compute...?
sor
ec es
pred ths
pa
src
(p,n)
(n,s)
suc
ces
s
or p
a
ths
pred(n)
succ(n)
sink
Recursively.
X
R. Rutenbar 2001
Page 20
(p,n)
p
src
(n,s)
suc
ces
s
or p
ath
s
pred(n)
succ(n)
sink
0 if n == src
AT E(n) = min delay to n =
Min
p = pred(n)
0 if n == src
ATL(n) = max delay to n =
Max
p = pred(n)
{ ATL(p) + (p,n) }
R. Rutenbar 2001
src
AT E =10 q
7
1
AT E =?
AT E =5 r
AT E(n) = Min
x{p, q, r}
Big idea
X
...then if we take node n off the end of the path, the shorter partial path
(to node r, here) is the min (max) delay path from source to node r
Page 21
(p,n)
p
src
suc
ces
s
(n,s)
or p
ath
s
s
sink
pred(n)
succ(n)
0 if n == sink
RATE(n) =
Max
s = succ(n)
{RATE(s) - (n,s) }
Note reversal of min and max
for early and late modes; this is
because were subtracting delays
instead of adding them
Min
s = succ(n)
{RATL(s) - (n,s) }
R. Rutenbar 2001
Example
B
3
src
6
11
sink
4
C
15
E
ATE(E) =
4+9
= 13
ATL(E) =
3+11
= 14
RATE(B) = 0-6-5
= -11
Cycle time = 30
RATL(B) = 30-11-15 = 4
SlackE(B) = 3-(-11)
= 14
SlackL(B) = 4-3
=1
R. Rutenbar 2001
Page 22
Computational Strategy
^
6
11
F
4
C
15
E
R. Rutenbar 2001
Topological Sorting
^
R. Rutenbar 2001
Page 23
Topological Sorting
^
6
11
4
C
15
topsort(A)
topsort(D)
topsort(C)
Topological order
topsort(B)
topsort(E)
topsort(F)
R. Rutenbar 2001
A
C
B
E
D
F
stack
Computing ATs
^
src
(p,n)
(n,s)
suc
ces
sor
pat
hs
pred(n)
succ(n)
get_ATs() {
ATE(src) = 0; ATL(src) = 0;
for each n in topsort order {
ATE(n) = ; ATL(n) = - ;
for each p in pred(n) {
ATE(n) = min( ATE(n), ATE(p) + (p,n) );
ATL(n) = max( ATL(n), ATL(p) + (p,n) );
}
}
}
sink
R. Rutenbar 2001
Page 24
Computing RATs
^
RATs same as the ATs would be if you reversed all arrows and start
from sink (now=source) and go to source (which is now the sink)!
sor
eces
pred ths
pa
SINK
suc
(p,n)
(n,s)
ces
sor
pat
hs
s
SRC
pred(n)
succ(n)
get_RATs() {
RATE(sink) = 0; RATL(sink) = cycle_time;
for each n in reverse topsort order {
RATE(n) = - ; RATL(n) = ;
for each s in succ(n) {
RATE(n) = max( RATE(n), RATE(s) - (n,s) );
RATL(n) = min( RATL(n), RATL(s) - (n,s) );
}
}
}
R. Rutenbar 2001
Slack
^
AT=8
RAT=23
Slack=23-8=15
F
4
C
Slack=5-4=1
RAT=5
AT=4
X
Cycle time = 29
11
15
E
Slack=0
RAT=29
AT=29
Slack=0
RAT=14
AT=14
Allow us to report worst paths, even though we didnt trace them all
R. Rutenbar 2001
Page 25
Path Reporting
^
6
11
F
4
C
15
E
R. Rutenbar 2001
OK if all clocks are ideal and arrive at the same time but they dont
Edge-triggered FF
Hold
clock
Page 26
Hold test says late clock must precede early data by some amount
Setup test says late data must precede early clock by some amount
Complication - adjusts
X
Remember that many cycles of activity were folded into one cycle
Ideal clock
Early clock
X
Late clock
R. Rutenbar 2001
Example:
Clock 1
(period 2)
Clock 2
(period 3)
R. Rutenbar 2001
Page 27
Slack Stealing
^
Edge-triggered FF
Transparent latch
clock
clock
Slack Stealing
^
But this means the arrival at the end of one path affects the
arrival at the beginning of another path
X
R. Rutenbar 2001
Page 28
Simplification is just a fixed delay per gate (or per input pin, same thing)
Logical = we worry about false paths, what the gates really do. This is
still pretty hard, and a lot of computational work.
R. Rutenbar 2001
Page 29