Delays
Delays can be modelled in a variety of ways, depending on the overall design approach that has been adopted.
These correspond neatly to the different levels of modelling that have already been introduced, namely gate-level
modelling, dataflow modelling and behavioural modelling.
Rise Delay        0, x, z -> 1
Fall Delay        1, x, z -> 0
Turn-Off Delay    0, 1, x -> z
(High impedance means that the net is not directly being driven by anything and so is floating. Thus it has neither a high
nor a low logic value.) Any or all of these delays can be specified for each gate by use of the delay token `#'. If
only one value is specified, it is used for all three delays. If two are given, they are used for the rise and fall
delays respectively, and the turn-off delay (the time taken for the output to go to a high impedance state) is taken to
be the minimum of these two values. Alternatively, all three values can be explicitly set. The use of delays is illustrated
below for the 2-input multiplexor given in an earlier example.
module multiplexor_2_to_1(out, cnt, a, b);
  /*
   * A 2-1 1-bit multiplexor
   */
  output out;
  input cnt, a, b;
  wire not_cnt, a0_out, a1_out;

  not #2     n0(not_cnt, cnt);        /* Rise=2, Fall=2, Turn-Off=2 */
  and #(2,3) a0(a0_out, a, not_cnt);  /* Rise=2, Fall=3, Turn-Off=2 */
  and #(2,3) a1(a1_out, b, cnt);
  or  #(3,2) o0(out, a0_out, a1_out); /* Rise=3, Fall=2, Turn-Off=2 */
endmodule /* multiplexor_2_to_1 */
(Since none of the gates used above are tri-state devices, the value for the Turn-Off delay should not be
specified, and the internally calculated value for this delay will never be used in such gates.)
The fourth category of transitions is for a change of state to an unknown value (i.e. 0, 1, z -> x), and its delay
value is taken to be the minimum of the above three.
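The multiplexor above never uses a turn-off delay, so a minimal sketch of the three-value form on a tri-state gate may help. The module name tristate_delay_demo, the signal names and the delay values 2, 3 and 4 are assumptions chosen purely for illustration.

```verilog
module tristate_delay_demo;
  reg  data, enable;
  wire out;

  /* Rise=2, Fall=3, Turn-Off=4 : all three delays given explicitly */
  bufif1 #(2,3,4) b0(out, data, enable);

  initial begin
    $monitor($time, " : data = %b enable = %b out = %b", data, enable, out);
    data = 0; enable = 1;
    #10 data = 1;     /* rising transition on out              */
    #10 data = 0;     /* falling transition on out             */
    #10 enable = 0;   /* out is released to high impedance (z) */
    #10 $finish;
  end
endmodule /* tristate_delay_demo */
```

Running this should show each change on out lagging the change that caused it by the delay for that kind of transition.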
Dataflow modelling
As dataflow modelling does not use the concept of gates, but instead has the concept of signals or values, the
approach taken to allow modelling of delays is slightly different. The delays are instead associated with the net
(e.g. a wire) along which the value is transmitted. Since values can be assigned to a net in a number of ways,
there are corresponding methods of specifying the appropriate delays.
Net Declaration Delay
The delay to be attributed to a net can be associated when the net is declared. Thereafter any changes of
the signals being assigned to the net will only be propagated after the specified delay.
e.g. wire #10 out; assign out = in1 & in2;
If either of the values of in1 or in2 should happen to change before the assignment to out has taken
place, then the assignment will not be carried out, as input pulses shorter than the specified delay are
filtered out. This is known as inertial delay.
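The pulse filtering described above can be tried out with a small test module. This is only a sketch: the module name inertial_demo and the pulse widths of 4 and 15 time units are assumptions, picked so that one pulse is shorter than the delay of 10 and one is longer.

```verilog
module inertial_demo;
  reg  in1, in2;
  wire #10 out;               /* net declaration delay, as in the text */

  assign out = in1 & in2;

  initial begin
    $monitor($time, " : in1 = %b in2 = %b out = %b", in1, in2, out);
    in1 = 0; in2 = 1;
    #20 in1 = 1;   /* start of a pulse 4 time units wide (shorter than the delay) */
    #4  in1 = 0;
    #20 in1 = 1;   /* start of a pulse 15 time units wide (longer than the delay) */
    #15 in1 = 0;
    #20 $finish;
  end
endmodule /* inertial_demo */
```

If the simulator filters pulses in the way described, only the second, wider pulse would be expected to appear on out.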
Behavioural modelling
At this level of abstraction, the circuit is modelled by assigning values to variables, some of which correspond to
the inputs and outputs of the module in question. Again, there are a number of different types of delay
associated with this style of programming :
Regular Delay Control
This is the most common delay used - sometimes also referred to as inter-assignment delay control.
e.g. #10 q = x + y;
It simply waits for the appropriate number of timesteps before executing the command.
Intra-Assignment Delay Control
With this kind of delay, the value of x + y is stored at the time that the assignment is executed, but this
value is not assigned to q until after the delay period, regardless of whether or not x or y have changed
during that time.
e.g. q = #10 x + y;
This is similar to the delays used in dataflow modelling.
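The difference between the two forms only shows up when an operand changes during the delay. The sketch below, with the assumed names delay_control_demo, q_regular and q_intra, changes x part way through both delays.

```verilog
module delay_control_demo;
  reg [7:0] x, y, q_regular, q_intra;

  initial begin
    x = 1; y = 2;
    #5 x = 5;                        /* x changes while both delays are pending */
  end

  /* Both statements start at time 1, after x and y have been initialised. */
  initial #1 begin
    #10 q_regular = x + y;           /* x + y is evaluated at time 11 */
  end

  initial #1 q_intra = #10 x + y;    /* x + y is evaluated at time 1  */

  initial #12 $display("q_regular = %d, q_intra = %d", q_regular, q_intra);
endmodule /* delay_control_demo */
```

Here q_regular ends up as 7, because it sees the new value of x, while q_intra ends up as 3, the value of x + y that was stored before x changed.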
The Full-Adder
The design below is for a full-adder, written using gate-level modelling techniques. (Note : The generate and
propagate signals, G and P from the diagram, are not given as outputs here. However, some designs
which attempt to improve on the overall data rate may make use of them, thus requiring them to be
added to the list of module outputs - see the carry skip adder later on.) The code given specifies some of the
delays described above - the first of the two graphs shows the output of an identical circuit but without any
delays, while the second shows the actual output from the code below.
module full_adder(sum_out, carry_out, a, b, carry_in);
  /*
   * A gate-level model of a 1-bit full-adder
   */
  output carry_out, sum_out;
  input carry_in, a, b;
  /*
   * The generate signal is named gen here, because generate is a
   * reserved word in Verilog-2001 and later.
   */
  wire one_high, gen, propagate;

  xor #(3,2) x0(one_high, a, b);
  xor #(3,2) x1(sum_out, one_high, carry_in);
  and #(2,4) a0(gen, a, b);
  and #(2,4) a1(propagate, one_high, carry_in);
  or  #(3)   o0(carry_out, gen, propagate);
endmodule /* full_adder */
Note : The clk signal in the graphs below is not required for the operation of the circuit, and is provided
purely to illustrate the delay in the output signals.
Exercise 1 :
Calculate the best and worst delays for both rising and falling
transitions on the sum output.
Answers
The timing constraints imposed upon each full adder must allow for the worst case of each of these transitions, so
the inputs must stay constant for at least a period of 9 time units.
Exercise 2 :
Answers
The code below uses the full_adder module defined earlier. The graphs show sample sections of the output
signals, which illustrate the differences between a circuit using full_adders with no delays, and one using
full_adders with delays as specified earlier.
Note : The clk signal in the graphs below is not required for the operation of the circuit, and is provided
purely to illustrate the delay in the output signals.
Exercise 3 :
Work out the worst case input vectors (i.e. a, b and carry_in) for
the 4-bit ripple carry adder.
Answers
Knowing the worst case vectors allows tests to be run to confirm the minimum period for which the inputs must
be stationary. This is important as it determines the maximum data rate through that part of the circuit - often a
crucial consideration in many modern designs. Such an analysis may result in an alternative solution, with a higher
data rate, being required.
particular block (e.g. bits 4 to 7) will propagate any carry into that block, and the carry_in is already known,
then that carry can skip around the block, and be passed into the next block (i.e. bits 8 to 11). This gives a
considerable saving in time as the carry signal need now only pass through two gates - the AND and the OR -
rather than the eight it would otherwise have to negotiate in the ripple_carry_4_bit module. For this to
work, however, it is necessary to be able to set the carry_in of each of the blocks to LOW each time any of the
inputs a or b are changed.
Exercise 4 :
Answers
Exercise 5 :
Answers
There are many other improved adder designs that are even faster than this, but they are beyond the scope of
these examples.
then a race condition occurs, and both a and b will end up with one of the values. The value that they are
both left with will depend on which of the assignments was scheduled first.
Non-blocking Assignments
Non-blocking assignments eliminate the possibility of race conditions in situations like this, as at the time
that the assignment operation is executed the expression on the right hand side of the <= operator is
copied to an internal temporary variable, which is then copied to the variable on the left hand side. All of
the `reads' for a particular timestep are carried out before any of the `writes', and so values can be safely
swapped as below :
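A minimal sketch of such a swap is given here. It is not necessarily the original listing: the module name swap_demo, the clock signal and the initial values 1 and 2 are assumptions made for illustration.

```verilog
module swap_demo;
  reg clk;
  reg [7:0] a, b;

  /* Both right hand sides are read before either left hand side is written. */
  always @(posedge clk) a <= b;
  always @(posedge clk) b <= a;

  initial begin
    clk = 0; a = 1; b = 2;
    #5 clk = 1;                              /* a single rising clock edge */
    #1 $display("a = %d, b = %d", a, b);
    $finish;
  end
endmodule /* swap_demo */
```

After the clock edge a holds 2 and b holds 1, whichever of the two always blocks the simulator happens to run first.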
Exercise 6 :
Look at the code below and, for each of the different modules, write
out the time and the values of all of the registers, each time any of
them changes value.
Answers
Note : In the following examples, the event queuing system is assumed to be stack-based, with later
events being pushed onto the end of the stack, but read from the front. However, the
implementation of the queuing system is not specified in the Verilog language specification, so this
need not necessarily be the case. Hence, the order in which events scheduled for the same time
step in separate blocks will occur is non-deterministic (i.e. cannot be predicted) and will
depend on the particular implementation of the queuing system for the specific version of
Verilog that you are running.
(On our system, the stack-based system appears to be used.)
module blocking;
  reg[7:0] a, b, c, d, e;

  initial begin
    $monitor($time, " :\ta = %d\t", a,
             "b = %d\tc = %d\t", b, c,
             "d = %d\te = %d", d, e);
    #50 $finish;
  end

  initial begin
    a = 2;
    b = 5;
    #1 a = c;
    #1 a = d;
    #2 a = 4;
    #2 a = 7;
    b = 6;
    #2 a = d;
    $display("a, b - done");
  end

  initial begin
    c = 1;
    d = c;
    e = a;
    #2 e = d;
    c = 0;
    d = 3;
    #5 c = a;
    d = 1;
    d = 2;
    $display("c, d, e - done");
  end
endmodule /* blocking */

module non_blocking;
  reg[7:0] a, b, c, d, e;

  initial begin
    $monitor($time, " :\ta = %d\t", a,
             "b = %d\tc = %d\t", b, c,
             "d = %d\te = %d", d, e);
    #50 $finish;
  end

  initial begin
    a <= 2;
    b <= 5;
    #1 a <= c;
    #1 a <= d;
    #2 a <= 4;
    #2 a <= 7;
    b <= 6;
    #2 a <= d;
    $display("a, b - done");
  end

  initial begin
    c <= 1;
    d <= c;
    e <= a;
    #2 e <= d;
    c <= 0;
    d <= 3;
    #5 c <= a;
    d <= 1;
    d <= 2;
    $display("c, d, e - done");
  end
endmodule /* non_blocking */
module blocking_intra;
  reg[7:0] a, b, c, d, e;

  initial begin
    $monitor($time, " :\ta = %d\t", a,
             "b = %d\tc = %d\t", b, c,
             "d = %d\te = %d", d, e);
    #50 $finish;
  end

  initial begin
    a = 2;
    b = 5;
    a = #1 c;
    a = #1 d;
    a = #2 4;
    a = #2 7;
    b = 6;
    a = #2 d;
    $display("a, b - done");
  end

  initial begin
    c = 1;
    d = c;
    e = a;
    e = #2 d;
    c = 0;
    d = 3;
    c = #5 a;
    d = 1;
    d = 2;
    $display("c, d, e - done");
  end
endmodule /* blocking_intra */

module non_blocking_intra;
  reg[7:0] a, b, c, d, e;

  initial begin
    $monitor($time, " :\ta = %d\t", a,
             "b = %d\tc = %d\t", b, c,
             "d = %d\te = %d", d, e);
    #50 $finish;
  end

  initial begin
    a <= 2;
    b <= 5;
    a <= #1 c;
    a <= #1 d;
    a <= #2 4;
    a <= #2 7;
    b <= 6;
    a <= #2 d;
    $display("a, b - done");
  end

  initial begin
    c <= 1;
    d <= c;
    e <= a;
    e <= #2 d;
    c <= 0;
    d <= 3;
    c <= #5 a;
    d <= 1;
    d <= 2;
    $display("c, d, e - done");
  end
endmodule /* non_blocking_intra */
Delay Models
When modelling circuit delays, there are a number of options available to the modeller in terms of how to deal
with attributing the delays around the circuit model. The three most commonly used techniques are distributed
delay, lumped delay and pin-to-pin delay.
Distributed Delay
The distributed delay method requires delays to be assigned to every element of the circuit - then the delay
between any two points can be calculated by adding together the delays of the components through which
the signal being monitored passes.
Lumped Delay
This is similar to the distributed delay approach, except that it is only modules (rather than their component
parts) that are assigned delays. Normally, the delay assigned to the module is the longest path through it,
to ensure that the model reflects the worst case performance.
Pin-to-pin Delay
(This technique is sometimes also referred to as the path delay method.) Delays are specified for each
input to output pin pairing, rather than being associated with specific elements. This can be advantageous
as it means that details of the internals of the module need not be known for the analysis to be carried out.
The behavioural modelling techniques mentioned earlier allow for the distributed delay and the lumped delay
methods to be implemented without any further special commands. However, in order to use the pin-to-pin
method, some way to specify the timings to use is required.
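The distributed and lumped approaches described above can be sketched at gate level as follows. The circuit (three AND gates), the module names and the delay values 5, 7 and 4 are assumptions chosen for illustration only.

```verilog
/* Distributed delay : every gate carries its own delay. */
module and_tree_distributed(out, a, b, c, d);
  output out;
  input a, b, c, d;
  wire e, f;

  and #5 a0(e, a, b);
  and #7 a1(f, c, d);
  and #4 a2(out, e, f);
endmodule /* and_tree_distributed */

/*
 * Lumped delay : the longest path through the distributed version
 * (7 + 4 = 11) is attached to the output gate only.
 */
module and_tree_lumped(out, a, b, c, d);
  output out;
  input a, b, c, d;
  wire e, f;

  and     a0(e, a, b);
  and     a1(f, c, d);
  and #11 a2(out, e, f);
endmodule /* and_tree_lumped */
```

The lumped version is simpler to write, but it reports the worst case figure of 11 even for changes on a and b, which take only 5 + 4 = 9 in the distributed version.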
Parallel Connections
specify
(a => out) = 9;
(b => out) = 7;
endspecify
The => notation can only be used when the source and destination
ports, a and out respectively in this case, are of the same (bit) width. Hence a and out could both be single- or multi-bit vectors.
(e.g. reg a, out; or reg [3:0] a, out;)
Full Connections
The *> notation may be used when every bit of the source port is
to be associated with every bit of the destination port. The two
ports need not be the same width. (e.g. reg [3:0] a; reg
[7:0] out;)
specify
(a *> out) = 9;
endspecify
local to this
specify...endspecify block) that may simplify the task of
changing values for a large set of delays. The use of these
statements for all timing specifications is recommended. Should any
of the delay values assigned to a set of connections change, it is
now only necessary to change the value in the specparam
statement, rather than all of the parallel or full connections.
specify
specparam a_high = 2;
specparam a_low = 4;
endspecify
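As a sketch of how such specparam values might then be used, the module below applies a_high and a_low as the rise and fall times of two parallel connections. The module name and_path and its ports are assumptions made for illustration.

```verilog
module and_path(out, a, b);
  output out;
  input a, b;

  and a0(out, a, b);          /* no delay on the gate itself */

  specify
    specparam a_high = 2;     /* rise time */
    specparam a_low  = 4;     /* fall time */

    (a => out) = (a_high, a_low);
    (b => out) = (a_high, a_low);
  endspecify
endmodule /* and_path */
```

Changing either timing now means editing a single specparam statement, rather than every connection that uses it.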
Pin-to-pin timings can also be expressed in terms of rise-, fall- and turn-off times. (See the earlier section on
gate level modelling.) Different delays can be specified for each possible signal transition, but only in certain
combinations, and the order in which they are to be declared must be strictly observed. The allowable
combinations limit the number of values that may be specified in any one statement to be 1, 2, 3, 6 or 12 only.
The permitted combinations are as follows :
Number of parameters   Used for...
1                      All transitions.
2                      Rise and Fall times.
                         Rise : 0 -> 1, 0 -> z, z -> 1
                         Fall : 1 -> 0, 1 -> z, z -> 0
3                      Rise, Fall and Turn-Off times.
                         Rise     : 0 -> 1, z -> 1
                         Fall     : 1 -> 0, z -> 0
                         Turn-Off : 0 -> z, 1 -> z
6                      One time for each of 0 -> 1, 1 -> 0, 0 -> z, z -> 1, 1 -> z and z -> 0, in that order.
12                     The six transitions above, followed by 0 -> x, x -> 1, 1 -> x, x -> 0, x -> z and z -> x.
If the x transitions are not specified, a pessimistic approach is taken to ensure worst case timings. Any transition
from an unknown (x) to a known (0, 1 or z) state will take the maximum of the specified times, while a transition
from a known state to an unknown state will take the minimum of the specified times. (e.g. if 6 values have
been specified, a 0 -> x transition will take the minimum of the delays specified for a 0 -> 1 or a 0 -> z
transition.)
$setup(data_line, clk_line, limit);
$hold(clk_line, data_line, limit);
These (and the other timing-related functions) can only be called from within specify blocks. Such functions are
not restricted to use with sequential circuits - they may be used on any circuit where events can be seen to occur
with respect to some other event. Use of the $setup and $hold tasks is probably best illustrated by the
examples after the next section.
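In the meantime, a compact sketch of the two checks on a D-type flip-flop is given here. The module names, the limits t_setup = 3 and t_hold = 2, and the stimulus times are all assumptions chosen so that each check is provoked once.

```verilog
module dff_checked(q, clk, d);
  output q;
  input clk, d;
  reg q;

  specify
    specparam t_setup = 3;              /* hypothetical limits */
    specparam t_hold  = 2;
    $setup(d, posedge clk, t_setup);    /* d must be stable for 3 units before the edge */
    $hold(posedge clk, d, t_hold);      /* d must be stable for 2 units after the edge  */
  endspecify

  always @(posedge clk)
    q <= d;
endmodule /* dff_checked */

module dff_checked_test;
  reg clk, d;
  wire q;

  dff_checked f0(q, clk, d);

  initial begin
    clk = 0;
    forever #10 clk = ~clk;   /* rising edges at 10, 30, 50, ... */
  end

  initial begin
    d = 0;
    #9  d = 1;   /* time 9  : 1 unit before the edge at 10 */
    #22 d = 0;   /* time 31 : 1 unit after the edge at 30  */
    #20 $finish;
  end
endmodule /* dff_checked_test */
```

A simulator with timing checks enabled would be expected to report a setup violation for the change at time 9 and a hold violation for the change at time 31. The wording of the messages depends on the simulator.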
Timescales
Up until now, all of the timing and delay values have been measured in terms of simulator timesteps, with no
reference to real time. Verilog allows different timescales (mappings from simulator timesteps to real time) to be
assigned to each module. The `timescale directive is used for this :
`timescale reference_time_units / time_precision
where reference_time_units and time_precision are each a value with a unit of measurement. The two need not use
the same unit (e.g. `timescale 10 us / 100 ns ), but the numeric part of each can only be 1, 10
or 100. The reference_time_units is the value attributed to the delay (#) operator, and the time_precision
is the accuracy to which reported times are rounded during simulations.
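The effect of the two values can be seen in a small sketch. The module name timescale_demo and the delay of 1.55 are assumptions made for illustration.

```verilog
`timescale 10 ns / 1 ns
module timescale_demo;
  reg set;

  initial begin
    set = 0;
    /* 1.55 reference units = 15.5 ns, rounded to 16 ns by the 1 ns precision */
    #1.55 set = 1;
    $display("set changed at %g reference time units", $realtime);
    $finish;
  end
endmodule /* timescale_demo */
```

With a time_precision of 10 ns instead, the same delay would be rounded to a multiple of 10 ns.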
`timescale directives can be given before each module to
/****************************************************************************\
*                                                                            *
*  'Basic building block' module definitions                                 *
*                                                                            *
\****************************************************************************/
module toggle(q, qbar, clk, toggle, reset);
/*
* A mixed style model of a T-type (toggle) flip-flop,
* with a reset line and delays on the outputs.
* This first part is behavioural code.
*/
output q, qbar;
input clk, toggle, reset;
reg q;
specify
$setup(d, posedge clk, `setup_time);
$hold(posedge clk, d, `hold_time);
endspecify
always @(posedge clk)
if (set == 1)
#5 q = 1;
else if (enable == 1)
#6 q = d;
endmodule /* effs */
/****************************************************************************\
*                                                                            *
*  Now, the more complex modules for implementing the actual solution        *
*                                                                            *
\****************************************************************************/
module evenSlice(bus, oneOut, zeroOut, clk, init, oneIn, zeroIn);
/*
* A dataflow model of one bit slice of the full moves generator.
* The only differences between this module and the oddSlice one
* are in the initialisation values. (Note the types of the
* flip-flops used.)
*/
inout [3:0] bus;
output oneOut, zeroOut;
input clk, init, oneIn, zeroIn;
wire enable, tq, tqbar;
wire [1:0] toPeg, fromPeg, new;
toggle tog (tq, tqbar, clk, oneIn, init);
effr to0 (toPeg[0], clk, enable, init, new[0]);
effs to1 (toPeg[1], clk, enable, init, new[1]);
effs from0 (fromPeg[0], clk, enable, init, toPeg[0]);
effr from1 (fromPeg[1], clk, enable, init, toPeg[1]);
assign #2 oneOut = oneIn & tq;
assign #2 zeroOut = zeroIn & tqbar;
assign #2 enable = zeroIn & tq;
assign #2 new[1] = ~(toPeg[1] & fromPeg[1]);
assign #2 new[0] = ~(toPeg[0] & fromPeg[0]);
assign bus = (enable == 1) ? {fromPeg, toPeg} : 4'bz;
endmodule /* evenSlice */
module oddSlice(bus, oneOut, zeroOut, clk, init, oneIn, zeroIn);
/*
* See the comments for the evenSlice module.
*/
inout [3:0] bus;
output oneOut, zeroOut;
input clk, init, oneIn, zeroIn;
wire enable, tq, tqbar;
wire [1:0] toPeg, fromPeg, new;
toggle tog (tq, tqbar, clk, oneIn, init);
effs to0 (toPeg[0], clk, enable, init, new[0]);
effs to1 (toPeg[1], clk, enable, init, new[1]);
effs from0 (fromPeg[0], clk, enable, init, toPeg[0]);
effr from1 (fromPeg[1], clk, enable, init, toPeg[1]);
assign #2 oneOut = oneIn & tq;
assign #2 zeroOut = zeroIn & tqbar;
assign #2 enable = zeroIn & tq;
assign #2 new[1] = ~(toPeg[1] & fromPeg[1]);
assign #2 new[0] = ~(toPeg[0] & fromPeg[0]);
assign bus = (enable == 1) ? {fromPeg, toPeg} : 4'bz;
endmodule /* oddSlice */
module start_button(go, clk, press);
/*
* A gate level model of the start button, with the functionality
* as described elsewhere.
*/
output go;
input clk, press;
wire e_out, not_press;
supply1 vdd;
/*
* This block checks that the pulse width on the input line is
* wider than 3, otherwise it is invalid.
*/
specify
specparam min_time = 3;
$width(posedge press, min_time);
endspecify
effs st_0 (e_out, clk, vdd, press, vdd);
not #(1) n_0 (not_press, press);
and #(2,1) a_0 (go, e_out, not_press);
endmodule /* start_button */
module tower(from_peg, to_peg, done, clk, start);
/*
* This is a dataflow model of the actual move generator - to
* be thought of as a stack (or tower) of modules, each of which
* deals with one disk of the puzzle.
*
* It brings together all of the other modules, and presents a
* clean interface to the outside world, taking a 'start' signal
* and returning a 'done' signal, once the sequence has been
* completed.
*/
output [1:0] from_peg, to_peg;
output done;
input clk, start;
wire [4:0] oneOut, zeroOut;
wire [3:0] bus;
wire init;
supply1 vdd;
start_button st_0(init, clk, start);
oddSlice rung0 (bus, oneOut[0], zeroOut[0], clk, ~init, vdd, vdd);
evenSlice rung1 (bus, oneOut[1], zeroOut[1], clk, ~init,
oneOut[0], zeroOut[0]);
oddSlice rung2 (bus, oneOut[2], zeroOut[2], clk, ~init,
oneOut[1], zeroOut[1]);
evenSlice rung3 (bus, oneOut[3], zeroOut[3], clk, ~init,
oneOut[2], zeroOut[2]);
oddSlice rung4 (bus, oneOut[4], zeroOut[4], clk, ~init,
oneOut[3], zeroOut[3]);
assign from_peg = bus[3:2];
assign to_peg = bus[1:0];
assign done = oneOut[4];
endmodule /* tower */
/****************************************************************************\
*                                                                            *
*  The final stimulus module is used to check that the tower module works    *
*  properly                                                                  *
*                                                                            *
\****************************************************************************/
module stimulus;
/*
* This is a behavioural model. It simply instantiates the tower
* module, provides it with inputs and monitors its outputs.
*/
reg clk, button;
wire [1:0] from, to;
wire done;
tower t_0(from, to, done, clk, button);
initial begin
clk = 0;
forever #(`clk_period / 2) clk = ~clk;
end
initial begin
button = 0;
#40 button = 1;
#50 button = 0;
end
always @(posedge clk)
#(`clk_period - 1) $display($time, " From peg %d To peg %d",
from, to);
always @(posedge clk)
if (done == 1) #`clk_period $stop;
endmodule /* stimulus */
The design has been implemented using bit slice techniques to allow for easy extension to different numbers of
bits (which correspond to the number of disks in the problem). Each new disk to be catered for requires the
counter and the disk selector to be extended, and a new move generator with its tri-state outputs to be attached
to the bus. This is a very good methodology to adopt when designing circuits that may be extended. In this case,
another point to be borne in mind is that the move generators require to be initialised to different values
depending on whether the number of disks is even or odd.
The code makes use of many of the delay techniques covered earlier, as well as the setup and hold checks.
Exercise 7 :
Copy the above code to a file, and run it through the Verilog
compiler. Now change the clock period to half of its current value,
and run the code again. What happens?
Answers
Setup and hold violations allow the maximum rate at which data can be clocked through the sequential elements
of a circuit to be determined. However, there may be other constraints on the circuit which affect the overall data
rate.
Exercise 8 :
Change the clock period to 15, and re-run the code. What happens
this time? Why does this occur?
Answers
Exercise 9 :
Determine the maximum clock frequency with which the circuit will
function correctly.
Answers
This example should have given a good idea of the sort of techniques employed in modelling circuits, making use
of delays and timing checks. Obviously, it has not covered all of the concepts presented earlier, but has shown a
typical use of many of them.
Exercise 10 :
Why not?
Answers
Consequently, while there may not be any directly adverse effects of using both positive and negative edges of a
clock, common synchronous design practices tend to shy away from this, preferring to keep the design 'clean' by
using only one edge of the clock signal to latch all values.