1. INTRODUCTION
The multiplier based on the Booth algorithm is a well-known technique for building high-speed, low-cost multipliers. There has been much research on high-speed Booth multipliers, and the main technique is radix-4 Booth encoding [1-6]. Although radix-4 Booth encoding can reduce the input bits and the output bits to half, it also increases the time of compression. To obtain better system performance, this paper improves the circuit of the radix-4 Booth multiplier.
In CPU and DSP processor design, the modified multiplier scheme is widely used. There are several types of multiplier, such as series, parallel, array, and encoding. The series multiplier has the simplest structure, the parallel scheme is faster, the matrix multiplier is more difficult to use for signed operation, and the encoding multiplier is much more efficient for signed operation. Therefore, we modify the encoder and the decoder to reduce the area and increase the overall speed. This paper is organized as follows. Section II discusses the proposed radix-4 Booth multiplier. Section III compares the proposed radix-4 Booth multiplier structure with a standard one. Section IV is the conclusion.
2. RADIX-4 BOOTH MULTIPLIER
In this section, we present a novel scheme using the modified Booth encoder/decoder (MBE) and the remodified full-adder (MFAr). It is improved from Yeh's MBE [1] and the original 4-to-2 compressor to reduce the
0-7803-8660-4/04/$20.00 ©2004 IEEE
The conventional radix-4 encoder produces five outputs. The decoding algorithm is given by the following equation.
$$PP_{ij} = Z\,\bigl(M_1\,\overline{Y_j} + M_2\,\overline{Y_{j-1}} + P_1\,Y_j + P_2\,Y_{j-1}\bigr) \qquad (2)$$
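The radix-4 Booth recoding that the encoder and decoder implement can be illustrated behaviorally. The following is a minimal sketch, not the circuit of this paper: it assumes the standard recoding rule in which digit i is formed from the overlapping bits x[2i+1], x[2i], x[2i-1], and the function names `booth_radix4_digits` and `booth_multiply` are hypothetical. It shows that an N-bit multiplier yields N/2 partial products whose sum equals the product.

```python
def booth_radix4_digits(x: int, n_bits: int) -> list[int]:
    """Recode an n_bits-wide two's-complement value x into radix-4 Booth digits.

    Digit i is -2*x[2i+1] + x[2i] + x[2i-1], with x[-1] = 0,
    so every digit lies in {-2, -1, 0, 1, 2}.
    """
    if n_bits % 2:
        raise ValueError("n_bits must be even")
    # (x >> k) & 1 yields two's-complement bits for negative Python ints too.
    bits = [(x >> k) & 1 for k in range(n_bits)]
    digits = []
    prev = 0  # x[-1] = 0
    for i in range(n_bits // 2):
        lo, hi = bits[2 * i], bits[2 * i + 1]
        digits.append(-2 * hi + lo + prev)
        prev = hi
    return digits


def booth_multiply(x: int, y: int, n_bits: int = 8) -> tuple[int, list[int]]:
    """Return (product, partial_products) for signed n_bits operands."""
    digits = booth_radix4_digits(x, n_bits)
    # Each digit selects 0, +/-y or +/-2y, weighted by 4**i.
    partial_products = [(d * y) << (2 * i) for i, d in enumerate(digits)]
    return sum(partial_products), partial_products


if __name__ == "__main__":
    product, rows = booth_multiply(-77, 45)
    print(len(rows), product == -77 * 45)
```

For 8-bit operands the sketch produces four partial-product rows instead of eight, which is the row reduction that the compressor stage then has to handle.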
2.2. The Efficient Compressor
Some radix-4 Booth encoder/decoders have beneficial properties for area and timing [1]. Here, the new encoding circuit is designed by rebuilding the Booth encoder truth table into the new one shown in Table 1.
Table 1. Truth table of the new MBE scheme
[Table 1 body: the columns include X_{2i+1} and X_{2i}; the remaining headers and the entries are not legible in the source.]
The compressor reduces the multi-row partial products decoded by the Booth decoder to two rows. Figure 4 shows the compressor structure for an 8-bit input. The number of bits differs from column to column, so each column has a different compression ratio. We construct the compressor with three 3-to-2 compressors and three 4-to-2 compressors. In Figure 4(a), the first column does not need to be compressed and so adds no delay, the sixth column, with a 3-bit input, is compressed by a 3-to-2 compressor, and the eighth column is compressed by a 4-to-2 compressor. The output sums are generated simultaneously if the numbers of bits being compressed are the same.
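The idea of reducing several partial-product rows to two rows can be sketched in software. The example below is only a behavioral illustration under stated assumptions: it applies 3-to-2 compression to whole rows (carry-save addition) rather than the per-column mix of 3-to-2 and 4-to-2 compressors of Figure 4, and the names `csa` and `reduce_to_two_rows` are hypothetical.

```python
def csa(a: int, b: int, c: int, mask: int) -> tuple[int, int]:
    """Row-wide 3-to-2 compression: three rows in, a sum row and a carry row out."""
    sum_row = (a ^ b ^ c) & mask
    # The majority of each bit triple is the carry, weighted one position higher.
    carry_row = (((a & b) | (b & c) | (a & c)) << 1) & mask
    return sum_row, carry_row


def reduce_to_two_rows(rows: list[int], width: int) -> tuple[int, int]:
    """Compress any number of rows to two rows whose sum is unchanged mod 2**width."""
    mask = (1 << width) - 1
    work = [r & mask for r in rows]  # negative rows become two's complement
    while len(work) > 2:
        s, c = csa(work[0], work[1], work[2], mask)
        work = work[3:] + [s, c]  # each pass removes one row
    while len(work) < 2:
        work.append(0)
    return work[0], work[1]


if __name__ == "__main__":
    rows = [13, -52, 208, -832]  # arbitrary signed rows, for illustration only
    a, b = reduce_to_two_rows(rows, 16)
    print((a + b) & 0xFFFF == sum(rows) & 0xFFFF)
```

Only the final addition of the two remaining rows needs carry propagation, which is why the compression stage is kept separate from the final adder.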
The novel encoding circuit is shown in Figure 2. It has a 3-bit input and generates a 3-bit output for the decoding circuit. Figure 3 shows the decoding circuit, which receives the signals from the encoding circuit and generates the partial-product results.
Fig. 2. The novel encoding circuit
Fig. 4. Compressors: (a) the column position of compression; (b) the 4-input and 3-input compressors
Figure 5 shows a standard compressor. It is composed of 8 NAND gates and 4 XOR gates, which gives the multiplier a long delay time.
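The arithmetic function of a 4-to-2 compressor can be checked with a small behavioral model. The sketch below assumes the textbook construction from two cascaded full adders; it is neither the gate-level circuit of Figure 5 nor the proposed MFAr, and the names `full_adder`, `compressor_4_2`, and `check_compressor` are hypothetical.

```python
from itertools import product


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """3-to-2 compressor: returns (sum, carry) with a + b + c == sum + 2*carry."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4_2(i1: int, i2: int, i3: int, i4: int, cin: int) -> tuple[int, int, int]:
    """4-to-2 compressor built from two full adders.

    Returns (sum, carry, cout). cout depends only on i1..i3, not on cin,
    so carries do not ripple along a row of compressors.
    """
    s1, cout = full_adder(i1, i2, i3)
    total, carry = full_adder(s1, i4, cin)
    return total, carry, cout


def check_compressor() -> bool:
    """Exhaustively verify i1+i2+i3+i4+cin == sum + 2*(carry + cout)."""
    for i1, i2, i3, i4, cin in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(i1, i2, i3, i4, cin)
        if i1 + i2 + i3 + i4 + cin != s + 2 * (carry + cout):
            return False
    return True


if __name__ == "__main__":
    print(check_compressor())
```

Any faster gate-level realization has to preserve the same input-output relation, so this check can serve as a reference model when comparing compressor designs.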
Let u = I1 ⊕ I3 ⊕ I… and x = c…; we get the carry from u, I…, and Cin.
3. COMPARISON AND ANALYSIS
The proposed multiplier is implemented with Synopsys and the Apollo library. Table 2 gives the comparison results between the new multiplier and four other kinds of radix-4 Booth multiplier. The transistor count of the proposed circuit is lower than that of MBE-III and MBE-IV. From the delay time calculated by Synopsys, we can see that the proposed circuit is faster than MBE-I and MBE-II.
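[Table 2. Comparison of the modified Booth encoders. Columns: type of the modified Booth encoder (MBE-I, ...), transistor count, delay in ns (Apollo), delay in ns (Synopsys). Only the fragments 16, 20, 18, 16 of the cell values are legible in the source.]
Fig. 5. 4-to-2 compressor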
The number of partial-product rows and the number of bits per column both decrease after encoding with the new modified Booth encoder/decoder, and the compression ratio decreases as well. Figure 7 shows the column bits of the radix-4 Booth multipliers and the matrix multipliers. Because the Booth encoder reduces the number of bits, the speed increases. Table 3 gives the comparison results between various compressors. The MFAr performs better than the others in delay time, area, and power. The delay time of the critical path is reduced by about 15%, the area cost by about 20%, and the power consumption by about 17%-24%. From the above, we can see that the MFAr indeed improves the circuit performance of the compressor.
[Table 3. Comparison of the compressors and the MFAr. Rows: delay (ns), area, power (mW). Legible values: delay 1.34; area 21.4, 17.39; power 2.75, 2.29. Improvement: delay 15%, area 20%, power 17%-24%.]
Fig. 6. MFAr (proposed)
Fig. 7. Comparison of the matrix multipliers with the radix-4 Booth multipliers
The area of the radix-4 Booth multiplier is compared with that of the array multiplier by gate count. The gate count of the radix-4 Booth multiplier is about (N²/2 full adders (FAs)) + (/2 decoders), and the gate count of the array multiplier is about (N² FAs) + (N²/2 AND gates). Assume the input has 16 bits. If the novel modified radix-4 Booth multiplier and the array multiplier both use the same compression method, then the array multiplier uses only 108 FAs. Therefore, the area of the matrix multiplier is about 1.5 times that of the radix-4 Booth multiplier. The radix-4 Booth multiplier thus reduces not only the power consumption but also the circuit complexity.
4. CONCLUSION
We have shown in this paper that the new Booth encoder/decoder and the proposed compressor decrease both the circuit area and the delay time of the critical path. Table 2 shows that the proposed design has a smaller area than MBE-III and MBE-IV and is faster than MBE-I and MBE-II. Table 3 shows that the area of the compressor is reduced to 80%, the delay time of the critical path to 85%, and the power to 76%-83%.
6. REFERENCES
[1] W.-C. Yeh, C.-W. Jen, High-speed Booth encoded parallel multiplier design, IEEE Transactions on Computers, vol. 49, no. 7, pp. 692-701, July 2000.
[2] B. Parhami, Computer Arithmetic, Oxford, 2000.
[3] P. Bonatto, V.G. Oklobdzija, Evaluation of Booth's algorithm for implementation in parallel multipliers, Signals, Systems and Computers, Conference Record of the Twenty-Ninth Asilomar, Oct.-Nov. 1995.
[4] V.G. Oklobdzija, Improving multiplier design by using improved column compression tree and optimized final adder in CMOS technology, IEEE Transactions on Very Large Scale Integration Systems, vol. 3, no. 2, pp. 292-30, July 1995.
[5] V.G. Oklobdzija, D. Villeger, S.S. Lin, A method for speed optimized partial product reduction and generation of fast parallel multipliers using an algorithmic approach, IEEE Transactions on Computers, vol. 45, no. 3, pp. 294-306, March 1996.
[6] D. Villeger, V.G. Oklobdzija, Analysis of Booth encoding efficiency in parallel multipliers using compressors for reduction of partial products, Proceedings of the 27th Asilomar Conference on Signals, Systems and Computers, pp. 781-784, 1993.
[7] A.R. Cooper, Parallel architecture modified Booth multiplier, IEE Proceedings, vol. 135, pt. G, no. 3, pp. 125-128, June 1988.
[8] G. Goto, A. Inoue, A 4.1-ns compact 54 × 54-b multiplier utilizing sign-select Booth encoders, IEEE Journal of Solid-State Circuits, vol. 32, no. 11, pp. 416-417, Nov. 1997.
[9] G. Goto et al., A 54 × 54-b regularly structured tree multiplier, IEEE Journal of Solid-State Circuits, vol. 27, no. 9, Sep. 1992.
[10] R. Fried, Minimizing energy dissipation in high-speed multipliers, Intl Symp. Low Power Electronics and Design, pp. 214-219, 1997.
[11] F. Elguibaly, A fast parallel multiplier-accumulator using the modified Booth algorithm, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 47, pp. 902-908, Sep. 2000.
5. ACKNOWLEDGMENT
This work was supported by the National Science Council of Taiwan under grant NSC 92-2220-E-005-004, and by the Meng Yau Chip Center.