Custom single-purpose processors
Outline
Introduction
Combinational logic
Sequential logic
Custom single-purpose processor design
Introduction
• Processor
– Digital circuit that performs computation tasks
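[Figure: CMOS transistor on silicon. An IC package contains the IC; the transistor has source, gate, and drain terminals, with an oxide layer under the gate and a channel between source and drain on the silicon substrate. The transistor shown conducts if gate=1.]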
CMOS transistor implementations
Complementary Metal Oxide Semiconductor
Basic logic gates

Single-input gates:
Gate      Function   x=0  x=1
Driver    F = x      0    1
Inverter  F = x'     1    0

Two-input gates (columns are the input values xy):
Gate  Function      00  01  10  11
AND   F = xy        0   0   0   1
OR    F = x + y     0   1   1   1
XOR   F = x ⊕ y     0   1   1   0
NAND  F = (xy)'     1   1   1   0
NOR   F = (x+y)'    1   0   0   0
XNOR  F = x ⊙ y     1   0   0   1
Combinational logic design
z = ab + b’c + bc’
RT-Level Combinational Components
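Sequential logic
• The basic storage element is the flip-flop.
• The simplest type of flip-flop is the D flip-flop. It stores the value of its D input under control of a clock.
• SR flip-flop: when clock is 0, the previously stored bit is maintained. When clock is 1, the inputs S and R are examined. If S is 1, a 1 is stored. If R is 1, a 0 is stored. If both are 0, there's no change. If both are 1, the behavior is undefined.
• … load.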
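[Figure: state diagram with input a and output x. States advance on a=1 and hold on a=0; x=0 in the states shown.]

• Given this implementation model
  – Sequential logic design quickly reduces to combinational logic design

K-map for I0 (rows a, columns Q1Q0):
a \ Q1Q0   00  01  11  10
0          0   1   1   0
1          1   0   0   1
I0 = Q0a’ + Q0’a

K-map for x (rows a, columns Q1Q0):
a \ Q1Q0   00  01  11  10
0          0   0   1   0
1          0   0   1   0
x = Q1Q0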
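Custom single-purpose processor design
• We can apply the above combinational and sequential logic design techniques to build a custom single-purpose processor.
• The datapath stores and manipulates the data. It contains register units and functional units, among other components.
[Figure: controller and datapath side by side. The controller has a state register with next-state and control logic, external control inputs and outputs, and drives the datapath control inputs. The datapath has registers and functional units, external data inputs and outputs, and returns control outputs to the controller.]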
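“Complex” state machine templates

Assignment statement
  a = b
  next statement
  Template: a state performing a = b, followed by the next statement.

Loop statement
  while (cond) {
    loop-body-statements
  }
  next statement
  Template: a condition state C. On cond, go to the loop-body statements, which end in a join state J that returns to C. On !cond, go to the next statement.

Branch statement
  if (c1)
    c1 stmts
  else if c2
    c2 stmts
  else
    other stmts
  next statement
  Template: a condition state C. On c1, go to the c1 stmts; on !c1*c2, go to the c2 stmts; on !c1*!c2, go to the other stmts. All three paths meet in a join state J, followed by the next statement.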
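Creating the data path
• Create a register for any declared variable.

[Figure: FSMD on the left with states 1:, 2:, 2-J:, 3: x = x_i, 4: y = y_i, 5: (x!=y / !(x!=y)), 6: (x<y / !(x<y)), 7: y = y - x, 8: x = x - y, 6-J:, 5-J:, 9: d_o = x, 1-J:.]

[Figure: datapath on the right.
  – Inputs x_i and y_i feed two n-bit 2x1 multiplexers, selected by x_sel and y_sel.
  – The multiplexers feed registers 0: x and 0: y, loaded by x_ld and y_ld.
  – Functional units: 5: x!=y and 6: x<y comparators, and two subtractors for 8: x-y and 7: y-x.
  – Comparator outputs: x_neq_y and x_lt_y.
  – Register 9: d, loaded by d_ld, drives the output d_o.]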
Creating the controller’s FSM
• Same structure as FSMD
• Replace complex actions/conditions with data path configurations.

Splitting into a controller and data path

Controller states (input go_i; encodings marked "not shown" are missing from this copy of the slide):
State  Encoding     Control outputs       Branches on
1:     0000                               1 / !1
2:     0001                               !go_i / !(!go_i)
2-J:   0010
3:     0011         x_sel = 0, x_ld = 1
4:     0100         y_sel = 0, y_ld = 1
5:     0101                               x_neq_y / !x_neq_y
6:     0110                               x_lt_y / !x_lt_y
7:     (not shown)  y_sel = 1, y_ld = 1
8:     (not shown)  x_sel = 1, x_ld = 1
6-J:   (not shown)
5-J:   1010
9:     1011         d_ld = 1
1-J:   1100

[Figure: datapath with inputs x_i and y_i, n-bit 2x1 multiplexers (x_sel, y_sel), registers 0: x and 0: y (x_ld, y_ld), status outputs x_neq_y and x_lt_y, and output d_o.]
Completing the GCD custom single-purpose processor design
• We finished the data path
• We have a state table for the next-state and control logic
  – All that’s left is combinational logic design
• This is not an optimized design, but it shows the basic steps
[Figure: controller (state register, next-state and control logic) alongside the datapath (registers, functional units).]
Controller state table for the GCD Example
RT-level custom single-purpose processor design
• We often start with a state machine
  – … functionality
• Example
  – Bus bridge that converts 4-bit bus to 8-bit bus
  – FSMD
  – Known as register-transfer (RT) level
  – Exercise: complete the design

Problem Specification
[Figure: Sender → Bridge → Receiver, with rdy_in into the bridge and rdy_out out of it.]
A single-purpose processor that converts two 4-bit inputs, arriving …

Bridge FSMD
  WaitFirst4:       stay while rdy_in=0; on rdy_in=1 go to RecFirst4Start
  RecFirst4Start:   data_lo=data_in; go to RecFirst4End
  RecFirst4End:     stay while rdy_in=1; on rdy_in=0 go to WaitSecond4
  WaitSecond4:      stay while rdy_in=0; on rdy_in=1 go to RecSecond4Start
  RecSecond4Start:  go to RecSecond4End
  RecSecond4End:    stay while rdy_in=1; on rdy_in=0 go to Send8Start
  Send8Start:       data_out=data_hi & data_lo; rdy_out=1; go to Send8End
  Send8End:         rdy_out=0

Inputs
  rdy_in: bit; data_in: bit[4];
Outputs
  rdy_out: bit; data_out: bit[8]
Variables
  data_lo, data_hi: bit[4];
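RT-level custom single-purpose processor design (cont’)

(a) Controller
  WaitFirst4:       stay while rdy_in=0; on rdy_in=1 go to RecFirst4Start
  RecFirst4Start:   data_lo_ld=1
  RecFirst4End:     stay while rdy_in=1; on rdy_in=0 go to WaitSecond4
  WaitSecond4:      stay while rdy_in=0; on rdy_in=1 go to RecSecond4Start
  RecSecond4Start:  data_hi_ld=1
  RecSecond4End:    stay while rdy_in=1; on rdy_in=0 go to Send8Start
  Send8Start:       data_out_ld=1; rdy_out=1
  Send8End:         rdy_out=0

(b) Datapath
  – data_in(4) feeds registers data_hi and data_lo, loaded by data_hi_ld and data_lo_ld
  – data_hi and data_lo together feed register data_out, loaded by data_out_ld
  – clk goes to all registers
  – rdy_in enters and rdy_out leaves through the controller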
Optimizing single-purpose processors
• FSMD
• Data path
• FSM
Optimizing the original program
• Areas of possible improvement
  – number of computations
  – size of variable
  – operations used
Optimizing the original program

original program
   0: int x, y;
   1: while (1) {
   2:   while (!go_i);
   3:   x = x_i;
   4:   y = y_i;
   5:   while (x != y) {
   6:     if (x < y)
   7:       y = y - x;
          else
   8:       x = x - y;
        }
   9:   d_o = x;
      }

optimized program
   0: int x, y, r;
   1: while (1) {
   2:   while (!go_i);
        // x must be the larger number
   3:   if (x_i >= y_i) {
   4:     x=x_i;
   5:     y=y_i;
        }
   6:   else {
   7:     x=y_i;
   8:     y=x_i;
        }
   9:   while (y != 0) {
  10:     r = x % y;
  11:     x = y;
  12:     y = r;
        }
  13:   d_o = x;
      }

Replace the subtraction operation(s) with a modulo operation in order to speed up the program.

original program: GCD(42, 8) takes 9 iterations to complete the loop. x and y values are evaluated as follows: (42, 8), (34, 8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4), (2, 2).

optimized program: GCD(42, 8) takes 3 iterations to complete the loop. x and y values are evaluated as follows: (42, 8), (8, 2), (2, 0).
Optimizing the FSMD
• Areas of possible improvements
  – merge states
    • merge state 2 and state 2J – no loop operation in between them
    • merge state 5 and state 6 – transitions from state 6 can be done in state 5
    • eliminate state 5J and 6J – transitions from each state can be done from state 7 and state 8, respectively
    • eliminate state 1-J – transition from state 1-J can be done directly from state 9

[Figure: FSMD states affected: 2-J:, 3: x = x_i, 5: (x!=y), 6: (x<y / !(x<y)), 7: y = y - x, 8: x = x - y, 6-J:, 5-J:, 9: d_o = x, 1-J:.]
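Optimizing the data path
• A one-to-one mapping of operations to functional units is not necessary
• If the same operation occurs in different states, the states can share a single functional unit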