Вы находитесь на странице: 1из 3

Lecture 7

EE 486

Multiplication Add-and-Shift Algorithm


EE 486 lecture 7: Integer Multiplication
Multiplicand Multiplier Partial Products
X Y PP S

110 101 110 000 110 11110

6 x5

M. J. Flynn
slides prepared by Albert Liddicoat and Hossam Fahmy

Result

30

X 0

Shift

ALU

Shift

Computer Architecture & Arithmetic Group

Stanford University

Computer Architecture & Arithmetic Group

Stanford University

Parallel Multiplication
1 Simultaneous generation of partial products 2 Parallel reduction of partial products 3 Carry Propagate Addition (CPA)

Generation of PPs Using ROMs


4b x 4b Multiply Using 256x8-bit ROM Data 8b x 8b Multiply Using 256x8-bit ROMs

Address

Parallel generation of partial products


Using AND gate as 1x1-bit multiplier x X1 X0 X2 Y2 Y1 Y0 X2Y0 X1Y0 X0Y0 X2Y1 X1Y1 X0Y1 X2Y2 X1Y2 X0Y2 R4 R3 R2 R1 R0
3

X7 X6 X5 X4 X3 X2 X1 X0

DOT representation

Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0 X3 X2 X1 X0 * Y3 Y2 Y1 Y0 X7 X6 X5 X4* Y3 Y2 Y1 Y0 X3 X2 X1 X0 * Y7 Y6 Y5 Y4

R5

. . . . .. . . . .. . . ..
x X2 X1 X0 Y2 Y1 Y0
Stanford University

X7 X6 X5 X4 * Y7 Y6 Y5 Y4

Computer Architecture & Arithmetic Group

Computer Architecture & Arithmetic Group

Stanford University

Generation of PPs Using ROMs


8 x 8 bit Multiply using 256x8b ROMs

Booths Algorithm
Observation: We can replace a string of 1s in the multiplier by +1 and -1. Example: ...0111110... ...0111110... +1 -1 ...1000000... -1 . . . 1 0 0 0 0-1 0 . . .
Y
1 1 1 1 1 0
6

4, 8, and 16 bit Multiply using 256x8bit ROMs


16x16 bit 8x8 bit 4x4 bit

b c

b c

n 4 8 16 32 64
5

h 1 3 7 15 31

CSA Delay 0 1 4 6 8
Stanford University

X*(0 1 1 1 1 1)
X X X X X 0

X*(1 0 0 0 0-1)
-X 0 0 0 0 X

Y
-1 0 0 0 0 1

Computer Architecture & Arithmetic Group

Computer Architecture & Arithmetic Group

Stanford University

M.J. Flynn

Lecture 7

EE 486

Booths Algorithm
+ Reduces the number of partial products which in turn reduces the hardware and delay required to sum the partial products. - Adds Delay into the formation of the Partial Products. Works well for serial multiplication that can tolerate variable latency operations by reducing the number of serial additions required for the multiplication. The number of serial additions depends on the data (multiplicand). Worst case 8-bit multiplicand requires 8 additions 01010101 <=> 1 -1 1 -1 1 -1 1 -1 Parallel systems generally are designed for worst case hardware and latency requirements. Standard booth 2 does not significantly reduce the worst case number of partial products.
Computer Architecture & Arithmetic Group 7 Stanford University

Modified Booth 2
Booth 2 modified to produce at most n/2+1 partial products.
Algorithm: (for unsigned numbers)
1) 2) 3) 4) Pad the LSB with one zero. Pad the MSB with 2 zeros if n is even and 1 zero if n is odd. Divide the multiplier into overlapping groups of 3-bits. Determine partial product scale factor from modified booth 2 encoding table. 5) Compute the Multiplicand Multiples 6) Sum Partial Products
Computer Architecture & Arithmetic Group 8 Stanford University

Modified Booth 2
Example: (n=8-bits unsigned)
1) Pad LSB with 1 zero 2) n is even so Pad MSB with 2 zeros
0 0 Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0 0

Modified Booth 2
5) Compute Multiplicand Scale Factor (~4 gate delays)
Direct Multiplier
Xi Yj

4) Determine action from table for each 3-bit group

Scale Yi+1 Yi Yi-1 Factor 0 0 0 0 1 1 1 1 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1


+0 +X +X +2X -2X -X -X -0

Modified Booth 2 Multiplier


Xi Xi-1 0

Mux Control Logic


Scale Factor +0 +X Yj+1 Yj Yj-1 +2X -X -2X Action Mux 0 Mux Xi Mux Xi-1 Mux Xi Mux Xi-1

3) Form 3-bit overlapping groups for n=8 we have 5 groups

PP ij PP ij

Computer Architecture & Arithmetic Group

Stanford University

Computer Architecture & Arithmetic Group

10

Stanford University

Modified Booth 2
6) Sum Partial Products
Sign extend partial products to the full width of the final result
Logic may replace the A9, B9, C9, D9, and E9 sign extention bits. Yi+1 bit determines if the multiple needs to be complemented

Modified Booth 2
Algorithm Extension: (for signed multiplier)
1) Pad the LSB with one zero. 2) If n is even dont pad the MSB ( n/2 PPs) and if n is odd sign extend the MSB by 1 bit ( n+1/2 PPs). 3) Divide the multiplier into overlapping groups of 3-bits. 4) Determine partial product factor from table. 5) Compute the Multiplicand Multiples 6) Sum Partial Products n Even: n Odd: Xn-1 Xn-2 Xn-2 Positive: Negative: 0 1 x x
12

Partial Products
A9 B9 C9

Multiplicand

. . . A9 A9 A8 A7 A6 A5 A4 A3 A2 A1 A0 . . . B8 B7 B6 B5 B4 B3 B2 B1 B0 Y1 . . . C6 C5 C4 C3 C2 C1 C0 Y3 D9 . . . D4 D3 D2 D1 D0 Y5 E9 . . . E2 E1 E0 Y7 S15. . . S 10 S9 S8 S7 S6 S5 S4 S3 S2 S1 S0
Computer Architecture & Arithmetic Group 11

Yi+1 Yi YI-1 (Y1 (Y3 (Y5 (Y7 (0 Y0 Y2 Y4 Y6 0 0) Y1) Y3) Y5) Y7)

x x

0 1

Xn-1 Xn-2 0 x 1 x

Stanford University

Computer Architecture & Arithmetic Group

Stanford University

M.J. Flynn

Lecture 7

EE 486

Modified Booth 2
Algorithm Extension: (for signed multiplicand)
Nothing the algorithm works fine !
The multiplicand may be represented in 2s complement code. The scale factors (0, +X, +2X, -X, and -2X) are handled correctly. Shift left for 2 times weighting. 2s complement the multiplicand for subtraction.

Booth 3
Reduces the partial products to ~n/3 Form overlapping groups of 4 bits.
0 Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0 0

Booth 3 Encoding
Note: 3x is a hard multiple that must be precomputed

Scale Scale Yi+2 Yi+1 Yi Yi-1 Factor Yi+2 Yi+1 Yi Yi-1 Factor 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 1 1 0 0 1 1
14

0 1 0 1 0 1 0 1

+0 +X +X +2X +2X +3X +3X +4X

1 1 1 1 1 1 1 1

0 0 0 0 1 1 1 1

0 0 1 1 0 0 1 1

0 1 0 1 0 1 0 1

-4X -3X -3X -2X -2X -X -X -0

Computer Architecture & Arithmetic Group

13

Stanford University

Computer Architecture & Arithmetic Group

Stanford University

Partially Redundant Booth 3

Partially Redundant Booth 3

Computer Architecture & Arithmetic Group

15

Stanford University

Computer Architecture & Arithmetic Group

16

Stanford University

M.J. Flynn

Вам также может понравиться