A Novel, Low-Power Array Multiplier Architecture

by
Ronak Bajaj, Saransh Chhabra, Sreehari Veeramachaneni, M.B Srinivas
in
9th International Symposium on Communication and Information Technology 2009

(ISCIT 2009)
Songdo - iFEZ ConvensiA, Incheon, Korea
Report No: IIIT/TR/2009/234
Centre for VLSI and Embedded Systems Technology

International Institute of Information Technology
Hyderabad - 500 032, INDIA
September 2009
Ronak Bajaj, Saransh Chhabra, Sreehari Veeramachaneni
International Institute of Information Technology-Hyderabad,
Gachibowli, Hyderabad, 500032, India
E-mail: {ronak,saransh}@students.iiit.ac.in, srihari@research.iiit.ac.in

M B Srinivas
Department of Electronics and Communication Engg.,
Birla Institute of Technology and Science (BITS) Pilani, Hyderabad Campus,
Hyderabad, 500078, India
E-mail: srinivas@bits-hyderabad.ac.in
Abstract: A low-power parallel array multiplier is proposed for both unsigned and two’s complement signed multiplication. The modified Baugh-Wooley multiplier is further modified so that, if the input numbers are not in two’s complement form, the proposed method makes the calculation of the two’s complement of the number redundant, thus reducing delay. Power consumption has also been found to be less than that of the modified Baugh-Wooley multiplier.
I. INTRODUCTION
Multipliers are among the most important arithmetic units in microprocessors and DSPs and are also a major source of power dissipation. Reducing the power dissipation of multipliers is key to satisfying the overall power budget of various digital circuits and systems. Power consumed by multipliers can be lowered at various levels of the design hierarchy, from algorithms to architectures to circuits and devices. Various algorithms and multiplier schemes have been proposed to date, including Hoffman et al. [1], Burton and Noaks [2], De Mori [3], and Guilt [4] for positive numbers, and Baugh and Wooley [5] and Hwang [6] for numbers in two’s complement form. References [7-9] give a good insight into the problem and design optimizations at all the hierarchy levels.

Figure 1(a). Conventional Unsigned Array Multiplier
In this paper, we focus on power reduction for both unsigned and signed multipliers. For the signed multiplier, the modified Baugh-Wooley algorithm (Fig. 1(b) and 2) is extended to obtain a power-efficient multiplier. Inputs are the inverted bits of the two’s complement representation.

II. UNSIGNED PARALLEL ARRAY MULTIPLIER
The basic process of binary array multiplication involves the AND operation of multiplicand and multiplier bits and subsequent addition, as shown in Fig. 1(a) for a 5 ∗ 5 multiplier. In the proposed design, NOR gates are used instead of AND gates in accordance with DeMorgan’s Law:

A.B = (A’ + B’)’ (1)

From (1), it is clear that if NOR gates are used, the inputs have to be complemented.
While it takes 6 transistors to build an AND/OR gate, only 4 transistors are needed for a NOR/NAND gate. Also, an AND gate has an extra delay of 1T compared to a NOR gate.
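As a quick check of (1), the following Python sketch confirms that a NOR gate fed with complemented inputs behaves like an AND gate, and that an unsigned array product built only from such NOR partial products equals the ordinary product. The function names (`nor`, `demorgan_holds`, `nor_array_multiply`) are illustrative assumptions, not part of the paper, and the adder array is modelled simply as a weighted sum.

```python
from itertools import product


def nor(a: int, b: int) -> int:
    """Single-bit NOR gate."""
    return 1 - (a | b)


def demorgan_holds() -> bool:
    """Check A.B == (A' + B')' for every single-bit input pair."""
    return all((a & b) == nor(1 - a, 1 - b)
               for a, b in product((0, 1), repeat=2))


def nor_array_multiply(x: int, y: int, n: int, m: int) -> int:
    """Unsigned n-bit by m-bit product from NOR partial products.

    The inputs are complemented first, as equation (1) requires; the
    half/full adder array is modelled as a plain weighted sum (assumption).
    """
    x_inv = [1 - ((x >> i) & 1) for i in range(n)]  # complemented multiplicand bits
    y_inv = [1 - ((y >> j) & 1) for j in range(m)]  # complemented multiplier bits
    return sum(nor(x_inv[i], y_inv[j]) << (i + j)
               for i in range(n) for j in range(m))


if __name__ == "__main__":
    print(demorgan_holds())
    print(nor_array_multiply(21, 13, 5, 5))
```

The sketch only models logic values; it says nothing about transistor count or delay, which depend on the circuit implementation.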

Figure 1(b). Modified Baugh-Wooley two’s complement signed multiplier [7]
ym-1 . . . y4 y3 y2 y1 y0
xn-1 . . . x3 x2 x1 x0
1 x0ym-2 . . . x0y4 x0y3 x0y2 x0y1 x0y0
x1ym-2 . . . x1y4 x1y3 x1y2 x1y1 x1y0
x2ym-2 . . . x2y4 x2y3 x2y2 x2y1 x2y0
. .
. .
. .
xn-1ym-1 0 xn-2ym-2 . . . xn-2y2 xn-2y1 xn-2y0
(xn-1ym-2)’ (xn-1ym-3)’ . . . (xn-1y2)’ (xn-1y1)’ (xn-1y0)’
1 (xn-2ym-1)’ (xn-3ym-1)’ . . . (x0ym-1)’ 1
Pm+n-1 Pm+n-2 Pm+n-3 Pm+n-4 . . . Pm-1 . . Pn+1 Pn Pn-1 . . . P3 P2 P1 P0

Figure 2. Tabular form for modified Baugh-Wooley two’s complement signed multiplier
Thus, for an m ∗ n multiplier, the proposed method introduces m + n extra inverters along with changing m ∗ n AND gates to m ∗ n NOR gates, effectively saving (m ∗ n − (m + n)) inverters or 2 ∗ (m ∗ n − (m + n)) transistors (Fig. 3).

Figure 3. Proposed unsigned array multiplier

III. PARALLEL TWO’S COMPLEMENT SIGNED MULTIPLIER
For m ∗ n parallel two’s complement signed multiplication, the m-bit multiplier Yv is represented as:

\[ Y_v = -y_{m-1} 2^{m-1} + \sum_{i=0}^{m-2} y_i 2^i \]

and the n-bit multiplicand Xv is represented as:

\[ X_v = -x_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i \]

Each of the multiplier bits is ANDed with every multiplicand bit to produce partial products, which are then summed to form the product Pv,

\[
\begin{aligned}
P_v &= -p_{m+n-1} 2^{m+n-1} + \sum_{i=0}^{m+n-2} p_i 2^i = Y_v X_v \\
    &= \left( -y_{m-1} 2^{m-1} + \sum_{i=0}^{m-2} y_i 2^i \right)
       \left( -x_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i \right) \\
    &= \left( x_{n-1} y_{m-1} 2^{m+n-2} + \sum_{i=0}^{n-2} \sum_{j=0}^{m-2} x_i y_j 2^{i+j} \right)
     - \left( \sum_{i=0}^{m-2} x_{n-1} y_i 2^{n-1+i} + \sum_{i=0}^{n-2} y_{m-1} x_i 2^{m-1+i} \right)
\end{aligned}
\]

The modified Baugh-Wooley multiplier uses AND and NAND gates to generate partial products. The tabular form of the modified Baugh-Wooley multiplier is shown in Fig. 2. Its architecture is shown in Fig. 1(b).
Out of the m ∗ n partial products, the majority (m ∗ n − (m + n − 2)) are generated by AND operations, while only a few (m + n − 2) are the result of NAND operations between multiplier and multiplicand bits. In the proposed multiplier, all the AND gates are replaced with NOR gates.
According to DeMorgan’s Law,

A.B = (A’ + B’)’. (2)

This makes it necessary to use inverted inputs (an addition of (m + n) inverters) and to convert the remaining NAND gates to OR gates, as (A.B)’ = (A’ + B’).
These changes save (m ∗ n − 2 ∗ (m + n − 2) − (m + n)) inverters in the proposed multiplier. (As an example, for a 16x16 multiplier, 164 inverters are saved.)
Thus, in comparison to the modified Baugh-Wooley multiplier (Fig. 1(b)), area is reduced. The probability of getting a 1 at the output is the same for NOR and AND gates (i.e. 1/4), and it is also the same for NAND and OR gates (i.e. 3/4). Thus the switching activity in the proposed multiplier remains the same. Since fewer transistors are used, power dissipation in the proposed multiplier is expected to decrease.
However, due to the extra inverters added to obtain complemented inputs, there is an extra 1T delay.
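The inverter counts and output probabilities quoted above can be reproduced with a short Python sketch. The breakdown inside `signed_inverters_saved` is one reading of the formula (each AND-to-NOR change removes one inverter, each NAND-to-OR change adds one, and m + n input inverters are added); it is an assumption rather than something the paper spells out, and all function names are hypothetical. The probabilities assume independent, equally likely input bits.

```python
from fractions import Fraction
from itertools import product


def unsigned_inverters_saved(m: int, n: int) -> int:
    """m*n - (m + n): saving quoted for the unsigned array multiplier."""
    return m * n - (m + n)


def signed_inverters_saved(m: int, n: int) -> int:
    """m*n - 2*(m + n - 2) - (m + n), split into its assumed contributions."""
    and_terms = m * n - (m + n - 2)  # AND -> NOR: one inverter fewer each (assumed)
    nand_terms = m + n - 2           # NAND -> OR: one inverter more each (assumed)
    input_inverters = m + n          # inverters that produce complemented inputs
    return and_terms - nand_terms - input_inverters


GATES = {
    "AND": lambda a, b: a & b,
    "NOR": lambda a, b: 1 - (a | b),
    "NAND": lambda a, b: 1 - (a & b),
    "OR": lambda a, b: a | b,
}


def probability_of_one(gate: str) -> Fraction:
    """Probability that the gate outputs 1 for uniformly random input bits."""
    outputs = [GATES[gate](a, b) for a, b in product((0, 1), repeat=2)]
    return Fraction(sum(outputs), len(outputs))


if __name__ == "__main__":
    print(signed_inverters_saved(16, 16))
    for name in GATES:
        print(name, probability_of_one(name))
```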
If the inputs are not in two’s complement form, however, further modifications can be made to produce the complemented inputs, as explained below.

Generation of Inverted Bits
If the inputs are not in two’s complement form, then for the modified Baugh-Wooley multiplier they first have to be converted into two’s complement form. For this, the modulus of the input is first sign extended, irrespective of the sign of the input, by appending 0s to the most significant side of the number. Then, if the number is positive, all the bits remain the same. If the number is negative, all the bits are complemented and 1 is added to the resulting number. The process is illustrated in the example below and shown in Fig. 4(a).

Figure 4(a). Conventional method of generating two’s complement form (flow: input |n|; sign extension to m bits; for negative n, m inverters followed by addition of 1 in an m-bit adder; for positive n, no change; the result is the required input)
Example: Consider two integer numbers 5 and -3. We shall use a 16 bit word length for illustration. The three bit binary representation of the modulus of the given integers is
Multiplier, 5 = 101
Multiplicand, 3 = 011
Sign extension is done to 16 bits,
Multiplier, 5 = 0000 0000 0000 0101
Multiplicand, 3 = 0000 0000 0000 0011
The multiplier is positive, so it remains the same, i.e. 0000 0000 0000 0101.
The multiplicand is negative, so the bits are first inverted, and then 1 is added to give the two’s complement form of -3, i.e. 1111 1111 1111 1101.

Figure 4(b). Generation of input of proposed method (flow: input |n|; sign extension to m bits; for negative n, addition of (2^m - 1) in an m-bit adder; for positive n, inversion of bits with m inverters; the result is the required input)
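The conventional conversion of Fig. 4(a) can be sketched in Python as follows. This is a minimal model under the assumption that the input arrives as a signed integer whose modulus fits in m bits; the names `conventional_twos_complement` and `bits` are illustrative and do not come from the paper.

```python
def conventional_twos_complement(value: int, m: int) -> int:
    """Return the m-bit two's complement word for `value` (Fig. 4(a) flow)."""
    mask = (1 << m) - 1
    magnitude = abs(value) & mask      # modulus, zero-extended to m bits
    if value >= 0:
        return magnitude               # positive: bits remain the same
    inverted = ~magnitude & mask       # negative: m inverters
    return (inverted + 1) & mask       # then an m-bit adder adds 1


def bits(word: int, m: int) -> str:
    """Format an m-bit word as a binary string."""
    return format(word, f"0{m}b")


if __name__ == "__main__":
    print(bits(conventional_twos_complement(5, 16), 16))
    print(bits(conventional_twos_complement(-3, 16), 16))
```

For the negative operand the adder sits in series after the inverters, which is the step the proposed method aims to avoid.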
Instead of generating the inverted bits simply by complementing the two’s complement form of the input, a new way, described below, is proposed which makes the calculation of the two’s complement of a negative number redundant.

First, the modulus of the input is sign extended, irrespective of the sign of the input, by appending 0s to the most significant side of the number. If the input number is positive, the bits are complemented. If it is negative, the complement of its two’s complement form is obtained by adding (2^m - 1) to the number. (This assumes sign extension to m bits; the proof is given below.)

The process is shown in Fig. 4(b) and illustrated in the example given below. The tabular form shown in Fig. 2 now becomes the one shown in Fig. 6. The block diagram is shown in Fig. 5.
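The proposed input generation can be sketched in Python as below and compared against the direct definition (the bitwise complement of the m-bit two’s complement word). The function names are hypothetical, and the sketch assumes the carry out of the m-bit adder is discarded, i.e. the addition is modulo 2^m.

```python
def proposed_inverted_input(value: int, m: int) -> int:
    """Complement of the m-bit two's complement form, via the Fig. 4(b) flow."""
    mask = (1 << m) - 1                # mask equals 2^m - 1
    magnitude = abs(value) & mask      # modulus, zero-extended to m bits
    if value >= 0:
        return ~magnitude & mask       # positive: invert the bits
    return (magnitude + mask) & mask   # negative: add (2^m - 1), drop the carry


def reference_inverted_input(value: int, m: int) -> int:
    """Direct definition: complement of the two's complement word."""
    mask = (1 << m) - 1
    return ~(value & mask) & mask


if __name__ == "__main__":
    m = 16
    for v in (5, -3):
        print(v, format(proposed_inverted_input(v, m), f"0{m}b"))
```

Only one of the two paths is exercised for a given operand: a row of inverters for a positive input or a single m-bit addition for a negative one.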
Figure 5. Proposed two’s complement signed array multiplier

Example: Consider two integer numbers 5 and -3. We shall use a 16 bit word length for illustration. The 3 bit binary equivalent of the modulus of the given integers is
Multiplier, 5 = 101
Multiplicand, 3 = 011
Sign extension is done to 16 bits,
Multiplier, 5 = 0000 0000 0000 0101
ym-1 . . . . y4 y3 y2 y1 y0
xn-1 . . . x4 x3 x2 x1 x0
Inversion of bits by process described in Fig. 5
(ym-1)’ . . . (y4)’ (y3)’ (y2)’ (y1)’ (y0)’
(xn-1)’ . . . (x3)’ (x2)’ (x1)’ (x0)’
1 (x0’+ym-2’)’ . . . (x0’+y3’)’ (x0’+y2’)’ (x0’+y1’)’ (x0’+y0’)’
(x1’+ym-2’)’ . . . (x1’+y3’)’ (x1’+y2’)’ (x1’+y1’)’ (x1’+y0’)’
(x2’+ym-2’)’ . . . (x2’+y3’)’ (x2’+y2’)’ (x2’+y1’)’ (x2’+y0’)’
. .
. .
. .
(xn-1’+ym-1’)’ 0 (xn-2’+ym-2’)’ . . . (xn-2’+y1’)’ (xn-2’+y0’)’
xn-1’+ym-2’ xn-1’+ym-3’ . . . xn-1’+y1’ xn-1’+y0’
1 xn-2’+ym-1’ xn-3’+ym-1’ . . x0’+ym-1’ 1
Pm+n-1 Pm+n-2 Pm+n-3 Pm+n-4 . . Pm-1 . . Pn Pn-1 . . . P3 P2 P1 P0

Figure 6. Tabular form for proposed two’s complement signed array multiplier
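The tabular form of Fig. 6 can be checked numerically with the Python sketch below. It forms NOR terms for the AND-type partial products and OR terms for the NAND-type partial products from inverted input bits, then adds three constant 1s. The column positions of those constants (m - 1, n - 1 and m + n - 1) are an assumption derived from the expansion of Pv in Section III, since the flattened figure does not preserve column alignment; the function names are hypothetical, and the adder array is modelled as a plain sum modulo 2^(m+n).

```python
def proposed_signed_multiply(x: int, y: int, n: int, m: int) -> int:
    """(m+n)-bit two's complement product word from NOR/OR partial products.

    x is the n-bit multiplicand and y the m-bit multiplier, both given as
    signed Python integers within their two's complement ranges.
    """
    xi = [1 - ((x >> i) & 1) for i in range(n)]  # inverted multiplicand bits x_i'
    yj = [1 - ((y >> j) & 1) for j in range(m)]  # inverted multiplier bits y_j'

    def nor(a: int, b: int) -> int:
        return 1 - (a | b)

    total = 0
    # NOR terms: (x_i' + y_j')' = x_i.y_j for the non-sign bits
    for i in range(n - 1):
        for j in range(m - 1):
            total += nor(xi[i], yj[j]) << (i + j)
    # NOR term for the product of the two sign bits
    total += nor(xi[n - 1], yj[m - 1]) << (m + n - 2)
    # OR terms: x_{n-1}' + y_j' = (x_{n-1}.y_j)'
    for j in range(m - 1):
        total += (xi[n - 1] | yj[j]) << (n - 1 + j)
    # OR terms: x_i' + y_{m-1}' = (x_i.y_{m-1})'
    for i in range(n - 1):
        total += (xi[i] | yj[m - 1]) << (m - 1 + i)
    # Constant 1s (positions assumed, see the lead-in)
    total += (1 << (m - 1)) + (1 << (n - 1)) + (1 << (m + n - 1))
    return total & ((1 << (m + n)) - 1)


def to_signed(word: int, width: int) -> int:
    """Interpret a width-bit word as a two's complement integer."""
    return word - (1 << width) if (word >> (width - 1)) & 1 else word


if __name__ == "__main__":
    print(to_signed(proposed_signed_multiply(-3, 5, 16, 16), 32))
```

With these constant positions the sketch agrees with ordinary signed multiplication over the full operand range for small widths, which supports the reading of the table but does not replace the circuit-level figure.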
Multiplicand, 3 = 0000 0000 0000 0011
The multiplier is positive, so its bits are inverted to give 1111 1111 1111 1010.
The multiplicand is negative, so (2^16 - 1) is added to 0000 0000 0000 0011 to give the complement of the two’s complement form of -3, i.e. 0000 0000 0000 0010.

Proof for addition of (2^m - 1):
As mentioned earlier, the required input for the proposed multiplier is the complement of the two’s complement form of the input. The steps involved in obtaining the two’s complement form of a number (sign extended to m bits) are:
1) Complement the number.
2) Add 1.
Taking the two’s complement of a number twice gives the same number. Let A be the given m-bit sign-extended number and B its two’s complement representation. B’ is the required number; then
A → A’ → (A’ + 1) = B
B → B’ → (B’ + 1) = A
B’ = A - 1 = A + (2^m - 1)
where the last equality holds modulo 2^m, since the carry out of an m-bit addition is discarded.

Thus, for a negative number, the calculation of its two’s complement form and of its complement are taken care of simultaneously, avoiding the need to calculate its two’s complement form separately. This reduces both delay and power dissipation.

IV. SIMULATION DETAILS AND RESULTS
The proposed multipliers have been analysed through HSpice simulations and compared with the existing multipliers. Simulations are performed for 16x16 bit multipliers at 1.2V and at a frequency of 50 MHz. The results shown in Table I are for the particular inputs 1010101010101010 x 01010010101010101. Similar results can also be obtained for other inputs.

TABLE I. Power and Delay comparison of conventional and proposed multipliers

Unsigned array multiplier
                   Conventional   Proposed     Proposed/Conventional
Power (in Watt)    7.2177E-04     6.6482E-04   0.92
Delay (in ns)      1.015          0.968        0.954

Two’s complement signed array multiplier
                   Modified Baugh-Wooley   Proposed     Proposed/Modified Baugh-Wooley
Power (in Watt)    5.6533E-04              5.5366E-04   0.979
Delay (in ns)      1.1478                  0.98         0.854

V. CONCLUSION
In this paper, a new approach for the design of parallel array multipliers has been suggested. AND gates in the existing designs have been replaced with NOR gates. Where the numbers are not in two’s complement form, they are inverted and given as input. For the simulated inputs, the proposed signed multiplier shows lower power (ratio 0.979) and lower delay (ratio 0.854) than the existing modified Baugh-Wooley multiplier (Table I).

REFERENCES
[1] J. Hoffman, G. Lacaze, and P. Csillag, “Iterative Logical Network for Parallel Multiplication,” Electronics Letters, vol. 4, p. 178, 1968.
[2] P. Burton and D.R. Noaks, “High-Speed Iterative Multiplier,” Electronics Letters, vol. 4, p. 262, 1968.
[3] R. De Mori, “Suggestion for an IC Fast Parallel Multiplier,” Electronics Letters, vol. 5, pp. 50-51, Feb. 1969.
[4] H. Guilt, “Fully Iterative Fast Array for Binary Multiplication,” Electronics Letters, vol. 5, p. 263, 1969.
[5] R. Baugh and B.A. Wooley, “A Two’s Complement Parallel Array Multiplication Algorithm,” IEEE Trans. Computers, vol. 22, no. 12, pp. 1,045-1,059, Dec. 1973.
[6] K. Hwang, “Global and Modular Two’s Complement Array Multipliers,” IEEE Trans. Computers, vol. 28, no. 4, pp. 300-306, Apr. 1979.
[7] W. Wolf, “Modern VLSI Design: System-On-Chip Design,” 3rd Edition, Prentice Hall, Upper Saddle River, N.J., 2002.
[8] M.S. Elrabaa, I.S. Abu-Khater, and M.I. Elmasry, “Advanced Low-Power Digital Circuit Techniques,” Kluwer Academic Publ., 1997.
[9] J.M. Rabaey, A. Chandrakasan, and B. Nikolic, “Digital Integrated Circuits,” 2nd Edition, Prentice Hall, 2002.
