Control
Finite state machines (PLA, ROM, random logic)
Interconnect
Switches, arbiters, buses
Memory
Caches (SRAMs), TLBs, DRAMs, buffers
        1 0 1 0 1 0      Multiplicand
x           1 0 1 1      Multiplier
        1 0 1 0 1 0
      1 0 1 0 1 0
    0 0 0 0 0 0          Partial products
+ 1 0 1 0 1 0
  1 1 1 0 0 1 1 1 0      Result
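The worked example above can be reproduced with a minimal shift-and-add sketch. The function name `shift_add_multiply` and its parameters are illustrative, not from the slides, and the operands are assumed to be unsigned.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n_bits: int):
    """Unsigned shift-and-add multiply; returns (product, partial_products)."""
    partial_products = []
    for i in range(n_bits):
        bit = (multiplier >> i) & 1
        # Partial product i is the multiplicand gated by multiplier bit i,
        # shifted left by i positions.
        partial_products.append((multiplicand if bit else 0) << i)
    return sum(partial_products), partial_products


product, pps = shift_add_multiply(0b101010, 0b1011, 4)
for pp in pps:
    print(format(pp, "09b"))
print(format(product, "09b"))  # 111001110
```

The four printed rows correspond to the four partial products in the example, and the last row is the result.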
[Figure: N-bit multiplicand and N-bit multiplier; the partial product array can be formed in parallel.]
Making it faster
Use a faster adder
Use higher radix (e.g., base 4) multiplication – O((N/2) × Tadder)
- Use multiplier recoding to simplify multiple formation (Booth)
Form the partial product array in parallel and add it in parallel
Making it smaller (i.e., slower)
Use a serial-parallel multiplier
Use an array multiplier
- Very regular structure with only short wires to nearest-neighbor cells, so the VLSI layout is simple and efficient
- Can be easily and efficiently pipelined
Sp11 CMPEN 411 L20 S.5
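The radix-4 recoding mentioned in the list above can be sketched as follows. This is one common formulation of Booth radix-4 recoding, assuming a two's complement multiplier with an even bit width; the function names are hypothetical and the slides do not specify an implementation.

```python
def booth_radix4_digits(multiplier: int, n_bits: int):
    """Recode an n_bits two's complement multiplier into digits in {-2,-1,0,1,2}."""
    if n_bits % 2:
        raise ValueError("n_bits must be even in this sketch")
    y = multiplier & ((1 << n_bits) - 1)  # raw two's complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, n_bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # Digit value for the overlapping bit triple (b1, b0, prev).
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, n_bits: int) -> int:
    """Sum one partial product per radix-4 digit (N/2 instead of N rows)."""
    digits = booth_radix4_digits(multiplier, n_bits)
    return sum((d * multiplicand) << (2 * k) for k, d in enumerate(digits))


print(booth_radix4_digits(11, 6))  # [-1, -1, 1]
print(booth_multiply(42, 11, 6))   # 462
```

Every multiple needed is 0, ±1, or ±2 times the multiplicand, so each partial product is formed by a shift and an optional negation rather than a true multiplication. Representing the unsigned value 11 takes 6 bits here because the recoding treats the top bit as a sign bit.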
Serial-parallel multiplier structure
[Figure: 4x4 multiplier array. Inputs X3..X0 and Y0..Y3 feed rows of half adders (HA) and full adders (FA); outputs Z7..Z0.]
[Figure: HA/FA array with Critical Path 1 and Critical Path 2 marked.]
[Figure: HA/FA array with rows HA HA HA HA; HA FA FA FA; HA FA FA FA; HA FA FA HA.]
[Figure: multiplier floorplan with inputs X3..X0 and Y0..Y3, carry (C) and sum (S) outputs per cell, and outputs Z7..Z0. Legend: HA multiplier cell, FA multiplier cell, vector merging cell.]
X and Y signals are broadcast through the complete array.
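The separate C and S outputs and the vector merging cells in the figure suggest a carry-save organization: each row passes its carries down to the next row instead of rippling them sideways, and one final adder merges the two remaining vectors. A word-level sketch of that idea follows. It is a functional model under that assumption, not a cell-by-cell copy of the figure, and the name `carry_save_multiply` is illustrative.

```python
def carry_save_multiply(x: int, y: int, n_bits: int) -> int:
    """Unsigned multiply that keeps a (sum, carry) pair until the final merge."""
    sum_vec, carry_vec = 0, 0
    for i in range(n_bits):
        pp = (x << i) if (y >> i) & 1 else 0  # partial product for row i
        # One row of full adders: three inputs per bit position, two outputs.
        new_sum = sum_vec ^ carry_vec ^ pp
        new_carry = ((sum_vec & carry_vec) | (sum_vec & pp) | (carry_vec & pp)) << 1
        sum_vec, carry_vec = new_sum, new_carry
    # Vector merging adder: the only place where carries propagate.
    return sum_vec + carry_vec


print(carry_save_multiply(0b1010, 0b1011, 4))  # 110
```

The invariant is that `sum_vec + carry_vec` always equals the sum of the partial products accumulated so far, so no row has to wait for a carry chain.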
...
so small (low cost)
disadvantage: slow (O(N) for N bits) and lots of glitching (so lots of energy consumption)
[Figure: 1-bit FA with inputs A63, B63, C63 and outputs S63, C64 = Cout.]
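The O(N) delay noted above comes from the carry having to pass through every 1-bit full adder in turn. A minimal bit-level sketch makes the chain explicit; the function names are illustrative, and the 64-bit width is taken from the figure labels.

```python
def full_adder(a: int, b: int, cin: int):
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_carry_add(a: int, b: int, n_bits: int, cin: int = 0):
    """Add two n_bits unsigned values; returns (sum mod 2**n_bits, carry_out)."""
    carry = cin
    result = 0
    for i in range(n_bits):
        # Stage i cannot finish until the carry from stage i-1 is known.
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry


print(ripple_carry_add(42, 11, 64))             # (53, 0)
print(ripple_carry_add((1 << 64) - 1, 1, 64))   # (0, 1)
```

The second call is the worst case: the carry generated at bit 0 ripples through all 64 stages before Cout settles.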
[Figure: panels (a) to (d) showing partial product reduction with FA and HA cells: first stage (HA, HA), second stage (FA, FA, FA, FA), final adder producing z7..z0.]
[Figure: multiplier block diagram. Partial product array (mux + interconnect), then reduction tree (log N), then fast carry propagate adder (CPA) (log N), producing P (product).]
[Figure: (3,2) counters reduce the partial product array to a reduced pp array (to CPA), which produces the double precision product.]
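The reduction step can be sketched at word level: each (3,2) counter layer turns three rows into two, and the loop stops when two rows remain for the CPA. The grouping strategy below (take rows three at a time, pass leftovers through) is one simple choice made for illustration and is not claimed to match the tree in the figure; `csa` and `tree_multiply` are hypothetical names.

```python
def csa(a: int, b: int, c: int):
    """(3,2) counter applied bitwise: three rows in, (sum, carry) rows out."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def tree_multiply(x: int, y: int, n_bits: int):
    """Unsigned multiply via (3,2) reduction; returns (product, tree_levels)."""
    # Form the whole partial product array up front (in hardware, in parallel).
    rows = [(x << i) if (y >> i) & 1 else 0 for i in range(n_bits)]
    levels = 0
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3
        nxt = []
        for k in range(0, full, 3):
            nxt.extend(csa(rows[k], rows[k + 1], rows[k + 2]))
        nxt.extend(rows[full:])  # leftover rows skip this level
        rows = nxt
        levels += 1
    # Carry propagate adder on the two remaining rows.
    return sum(rows), levels


print(tree_multiply(42, 11, 6))    # (462, 3)
print(tree_multiply(200, 123, 8))  # (24600, 4)
```

Each level shrinks the row count by roughly a factor of 3/2, which is where the logarithmic depth of the reduction tree comes from.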
[Figure: multiplier with partial product array, nine (4,2) counters, and a 13-bit CPA. ISSCC 2003.]
Multipliers: Summary