Article · January 2015


International Journal of Applied Engineering Research, ISSN 0973-4562 Vol. 10 No.55 (2015)
© Research India Publications; http://www.ripublication.com/ijaer.htm

Design and Comparative Study of Wallace tree and CLA based Multiplier for DSP Application

Shahnawaz Arif (1), R K Lal (2)
(1) Student, Electronics and Communication Engineering, Birla Institute of Technology, India
(2) Associate professor, Department of ECE, Birla Institute of Technology, India

Abstract
This article presents the design and comparative study of various multipliers for digital signal processing applications. The multiplier is a key element of many high-performance systems such as microprocessors, FIR filters and digital signal processors, so there is a need for multipliers with small area, low power consumption and high speed. These features make a multiplier suitable for compact, high-speed and low-power Field Programmable Gate Array (FPGA) based digital signal processing implementations. The designs are implemented on a Xilinx Virtex-5 XC5VLX30-3-FF324 FPGA device. The comparative study is presented with respect to power, area and delay for 8-bit carry ripple, Wallace tree, Ling and carry look ahead based multipliers.

Keywords: Area, Arithmetic circuits, CLA, delay, DSP, FPGA, power, Xilinx XPower.

1. Introduction
With the growing scale of integration, a large number of signal processing systems are being implemented on VLSI chips. These signal processing systems [1] consume a large amount of energy and also require great computation capacity. The major design concerns are area and performance, but in today's VLSI system design, power consumption [2] is also an important factor. Two main forces drive the design of low-power VLSI systems. The first is the large currents that must be delivered given the processing capacity per chip and the steady growth of operating systems. The second is battery life, which is limited in portable electronic devices. Low-power design gives these portable devices a longer life.
In most signal processing algorithms, the fundamental operation is multiplication. A multiplier consumes significant power, has long latency and occupies a large area. Consequently, there is a need for low-power multipliers in VLSI system design. Generally, the performance of a system is determined by the multiplier performance because the multiplier is the slowest component in the system. Speed and area are therefore major design issues of the multiplier, but they are conflicting constraints: when the speed of the multiplier is improved, the area grows. This paper shows the design and comparative study of various multipliers for DSP application. The comparative study of these circuits is presented with respect to delay, power consumption and resource utilization (area).
Although many authors [5], [6], [7] have done extensive work on high-speed and low-power multipliers at the logic, circuit, physical and technological levels, only a few have considered the design issues based on Field Programmable Gate Arrays [3]. In [8] the authors considered the custom design of adder and multiplier circuits and their realization, but a detailed analysis of these circuits in respect of resource utilization, delay calculation and power measurement was not discussed. Similarly, Beevao and Stukjunger [9] worked on the design of FPGA based arithmetic circuits, but the power analysis was not given. This paper is based on the design and implementation of fixed point arithmetic circuits [4] (adder & multiplier circuits) with respect to delay calculation, power measurement and resource utilization (area).
In the present research work, we have focused on enhancement of FPGA based design, and a comparative study of the carry ripple multiplier, Wallace tree multiplier, Ling multiplier and carry look ahead adder (CLA) based multiplier has been done for DSP application. We have used a Xilinx Virtex-5 XC5VLX30-3-FF324 FPGA device for implementation. This work provides a relatively thorough analysis of FPGA based multiplier design for DSP application.
Multiplication generally has two basic steps: generation of the partial products and their addition. Therefore, before designing the different multiplier circuits, we have designed different adders and compared their power, resource utilization (area) and speed.

2. Design of Different Adder Circuits
A. Carry ripple adder
This circuit has a simple adder structure and is used for comparison purposes. In this structure, full adders are connected in a chain such that the carry-out of each full adder is given to the carry-in of the next full adder. In this way the carry generated in the first full adder ripples to the last full adder, producing an n-bit sum and a carry-out. It is the simplest structure to implement.

B. Carry look ahead adder
A carry look ahead adder increases speed by reducing the delay caused by the carry rippling from the first full adder to the last full adder. It has two special signals,
called carry generate (G_i) and carry propagate (P_i). For each bit pair to be added, these two signals determine whether the pair will produce a carry or transmit an incoming carry. Using these two signals, the carry-out is decided ahead of time. This is a kind of pre-processing, so the adder reduces the time required to calculate the carry bits.

C. Ling adder
The Ling adder [10] is an advanced form of the CLA. It lowers the operation cost by replacing an Ex-OR gate with an OR gate. The propagate signal (P_i) is replaced by T_i, and H_{i+1} is used in place of C_{i+1}, where T_i and G_i are the propagate and generate signals respectively, H_{i+1} is the carry-out signal and the sum is denoted by S_i.

3. Design of Different Multiplier Circuits
A. Multiplier based on carry ripple adder
This multiplier is based on the carry ripple adder. The partial products are produced by AND gates and are then added in pairs using carry ripple adders. An N-bit multiplier creates N partial products. The 8-bit, 16-bit and 32-bit multipliers based on the carry ripple adder use 48, 224 and 960 carry ripple adders respectively for the addition of these partial products.

B. Multiplier based on carry look ahead adder
This multiplier is based on the carry look ahead adder. We have designed it using two-bit, three-bit and four-bit carry look ahead adders according to the combination of bits. The 8-bit multiplier uses one 2-bit, six 3-bit and seven 4-bit carry look ahead adders. The 16-bit multiplier uses one 2-bit, 14 3-bit and 45 4-bit carry look ahead adders for the addition of 16 partial products. For 32-bit, we have used one 2-bit, 30 3-bit and 186 4-bit carry look ahead adders.

C. Multiplier based on Ling adder
This multiplier is based on the Ling adder and is designed using two-bit, three-bit and four-bit Ling adders according to the combination of bits. The 8-bit multiplier uses one 2-bit, six 3-bit and seven 4-bit Ling adders. The 16-bit multiplier uses one 2-bit, 14 3-bit and 45 4-bit Ling adders for the addition of 16 partial products. For 32-bit, we have used one 2-bit, 30 3-bit and 186 4-bit Ling adders.

D. Wallace tree multiplier
In the Wallace tree architecture, all the bits of all the partial products in each column are added together by a set of counters in parallel without propagating any carries. Another set of counters then reduces this new matrix, and so on, until a two-row matrix is generated. The most common counter is the 3:2 counter, which is a full adder. The final two rows are usually added using a carry propagate adder. The advantage of the Wallace tree is speed, because the addition of partial products is now O(log N).

4. Simulation result
The implementation of all the adders and multipliers has been performed on a Virtex-5 XC5VLX30-3-FF324 FPGA device with Xilinx ISE 10.1, and behavioral simulation has been done using ModelSim. The Xilinx XPower tool [3] is used for the calculation of optimized power.

A. Xilinx XPower tools
For programmable devices such as field programmable gate arrays (FPGA), power estimation is a multi-faceted process. It depends heavily on the amount of logic and the configuration used in the design. To generate a precise estimate, it needs precise input values (toggle rates, resource utilization and clock rates). In static power consumption, the device remains configured but there is no switching activity. The static power includes the power in clock managers, I/O DCI terminations, etc. The dynamic power consumption depends on the switching activity and the user logic utilization. Switching elements such as BRAMs, LUTs, FFs and routing segments each have an associated capacitance model. The user gives the specific frequency to the primary input signals and the clock signal. XPower [3] calculates power by adding the power taken by each switching element. The power taken by each element is estimated as:

$P = C \times V^2 \times E \times F \times 1000$   (9)

where F = frequency (Hz), P = power (mW), C = capacitance (F), V = voltage (V) and E = switching activity.

B. Adder result
Table I shows the area and delay, and Table II shows the power, of the different fixed point adders (for 8 bit, 16 bit, 32 bit and 64 bit) implemented on the Virtex-5 field programmable gate array (FPGA). On the basis of delay, the best fixed point adder for 64 bit is the Ling adder, but on the basis of both area and power it is the carry look ahead adder. Figure 1 and Figure 2 show the comparison between the adders on the basis of area and delay respectively. The highest and lowest dynamic and quiescent power are shown in Fig. 3 and Fig. 4 respectively for 64 bit. Here, it is observed that the Ling adder is faster than the carry ripple adder, and that the carry look ahead adder takes less area and less power than the carry ripple adder.

C. Multiplier result
Table III shows the area, estimated power and delay of the multiplier circuits implemented on the Virtex-5 field programmable gate array for 8 bit. The delay of the Wallace tree multiplier is less than that of the carry ripple multiplier, CLA based multiplier and LA based multiplier. The comparison between the multipliers based on lowest and highest area and delay is shown in Fig. 5 and Fig. 6 for 8 bit. We have compared total dynamic and quiescent power as shown in Fig. 7 and Fig. 8 respectively for the 8 bit multiplier.

5. Analysis of result
From Fig. 2 we can say that the Ling adder is the fastest adder whereas the carry ripple adder is the slowest for 64 bit. From


TABLE I. AREA & DELAY OF FIXED POINT ADDERS IMPLEMENTED ON VIRTEX-5, DEVICE-XC5VLX30, FF324

                           CRA       CLA       LA
N=8bit    Area (Slices)    12        12        13
          Delay (ns)       5.521     5.521     6.189
N=16bit   Area (Slices)    25        24        25
          Delay (ns)       8.162     7.640     8.972
N=32bit   Area (Slices)    49        48        85
          Delay (ns)       12.401    11.879    11.992
N=64bit   Area (Slices)    97        96        173
          Delay (ns)       20.879    20.358    18.977

TABLE II. POWER OF FIXED POINT ADDERS IMPLEMENTED ON VIRTEX-5, DEVICE-XC5VLX30, FF324

                                          CRA       CLA       LA
N=8bit    Total Dynamic Power (watt)      0.01862   0.01862   0.01863
          Total Quiescent Power (watt)    0.30143   0.30143   0.30143
N=16bit   Total Dynamic Power (watt)      0.03553   0.03555   0.03556
          Total Quiescent Power (watt)    0.30251   0.30251   0.30251
N=32bit   Total Dynamic Power (watt)      0.06963   0.06965   0.07146
          Total Quiescent Power (watt)    0.30471   0.30471   0.30483
N=64bit   Total Dynamic Power (watt)      0.13747   0.13739   0.14039
          Total Quiescent Power (watt)    0.30916   0.30915   0.30935

Fig. 1: Comparison between area (slices) of adder.
Fig. 2: Comparison between delay (ns) of adder.

Fig. 3: Comparison between total dynamic power (watt) of 64 bit adder.

Fig. 3, we can also say that the minimum power (both dynamic & quiescent) is taken by the carry look ahead adder whereas the maximum power is taken by the Ling adder for 64 bit. From Fig. 1, the carry look ahead adder also takes the fewest slices whereas the Ling adder takes the most slices for 64 bit. It is observed that there is a relationship between the slices and the power of adder circuits: if an adder design takes fewer slices, it consumes less power.
The highest and lowest area (slices) and delay comparisons of the multiplier circuits are shown in Fig. 5 and Fig. 6 respectively. From Fig. 6 we observe that the Wallace tree multiplier has less delay than the carry ripple multiplier, CLA based multiplier and LA based multiplier for 8 bit. From Fig. 5 we also observe that the Wallace tree multiplier uses fewer slices than the other multipliers. The lowest and highest power comparison shown in Fig. 7 and Fig. 8 indicates that the multiplier based on the Ling adder consumes more quiescent and dynamic power than the Wallace tree multiplier. Here, it is


Fig. 4: Comparison between total quiescent power (watt) of 64 bit adder.

TABLE III. AREA & DELAY OF FIXED POINT MULTIPLIERS IMPLEMENTED ON VIRTEX-5, DEVICE-XC5VLX30, FF324

                               Carry ripple   CLA based    LA based     Wallace tree
                               multiplier     multiplier   multiplier   multiplier
Area (Slice)                   107            113          144          99
Delay (ns)                     16.559         16.243       22.043       10.698
Total Dynamic Power (watt)     0.03530        0.03697      0.04008      0.03410
Total Quiescent Power (watt)   0.30250        0.30260      0.30280      0.30242

Fig. 5: Comparison between lowest and highest area (slices) of 8 bit multiplier.
Fig. 6: Comparison between lowest and highest delay of 8 bit multiplier.
Fig. 7: Comparison between lowest and highest dynamic power of 8 bit multiplier.
Fig. 8: Comparison between lowest and highest total quiescent power of 8 bit multiplier.


also noted that there is a proportional relationship between the number of slices and the power of the multiplier.
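The three adder structures described in Section 2 can be illustrated with a small bit-level model. The following Python code is only a behavioral sketch under stated assumptions, not the HDL used in this work: the function names (`ripple_carry_add`, `cla_add`, `ling_add`) and the bit-list helpers are hypothetical, and the Ling recurrence H_{i+1} = G_i + T_{i-1}·H_i with C_{i+1} = T_i·H_{i+1} is the standard textbook form, assumed here because the paper does not give its equations explicitly.

```python
# Behavioral sketch of the Section 2 adders (hypothetical names, not the authors' HDL).

def to_bits(x, n):
    """Little-endian list of the n low bits of x."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(b << i for i, b in enumerate(bits))

def ripple_carry_add(a, b, n, cin=0):
    """Chain of n full adders: each carry-out feeds the next carry-in."""
    s, c = [], cin
    for ai, bi in zip(to_bits(a, n), to_bits(b, n)):
        s.append(ai ^ bi ^ c)
        c = (ai & bi) | (c & (ai ^ bi))
    return from_bits(s), c

def cla_add(a, b, n, cin=0):
    """Carry look ahead: each carry is expanded directly from G, P and cin."""
    A, B = to_bits(a, n), to_bits(b, n)
    G = [x & y for x, y in zip(A, B)]   # carry generate
    P = [x ^ y for x, y in zip(A, B)]   # carry propagate
    carries = [cin]
    for i in range(n):
        # C_{i+1} = G_i | P_i G_{i-1} | ... | P_i ... P_0 cin
        c, prod = 0, 1
        for j in range(i, -1, -1):
            c |= prod & G[j]
            prod &= P[j]
        carries.append(c | (prod & cin))
    s = [P[i] ^ carries[i] for i in range(n)]
    return from_bits(s), carries[n]

def ling_add(a, b, n, cin=0):
    """Ling adder (assumed textbook form): pseudo-carry H replaces C."""
    A, B = to_bits(a, n), to_bits(b, n)
    G = [x & y for x, y in zip(A, B)]
    T = [x | y for x, y in zip(A, B)]   # OR-based propagate (replaces XOR)
    P = [x ^ y for x, y in zip(A, B)]   # XOR still needed for the sum bit
    s = []
    h_prev, t_prev = cin, 1             # boundary values so that C_0 = cin
    for i in range(n):
        c_in = t_prev & h_prev          # C_i = T_{i-1} H_i
        s.append(P[i] ^ c_in)           # S_i = P_i xor C_i
        h_prev = G[i] | c_in            # H_{i+1} = G_i | T_{i-1} H_i
        t_prev = T[i]
    return from_bits(s), t_prev & h_prev
```

All three functions return a `(sum, carry_out)` pair, so for example `cla_add(200, 100, 8)` gives `(44, 1)`, since 300 = 256 + 44. The model only checks functional equivalence; the area, delay and power differences reported in Tables I and II come from the FPGA implementation and are not visible at this level.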

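The Wallace tree reduction of Section 3.D can likewise be sketched in Python. This is a minimal behavioral model under stated assumptions, not the implemented circuit: the names `partial_products`, `full_adder` and `wallace_multiply` are hypothetical, leftover pairs of bits in a column are compressed with a half adder (one common variant), and Python integer addition stands in for the final carry propagate adder.

```python
# Behavioral sketch of a Wallace tree multiplier (hypothetical names, assumptions above).

def partial_products(a, b, n):
    """AND-gate array: cols[k] holds every bit a_i & b_j with i + j == k."""
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return cols

def full_adder(x, y, z):
    """3:2 counter: three bits of one weight in, sum and carry out."""
    return x ^ y ^ z, (x & y) | (y & z) | (x & z)

def wallace_multiply(a, b, n):
    """Return (product, number of reduction stages) for n-bit unsigned a, b."""
    cols = partial_products(a, b, n)
    stages = 0
    # Reduce until every column holds at most two bits (a two-row matrix).
    while max(len(c) for c in cols) > 2:
        nxt = [[] for _ in range(len(cols) + 1)]
        for k, col in enumerate(cols):
            i = 0
            while len(col) - i >= 3:            # 3:2 counters, no carry ripple
                s, c = full_adder(col[i], col[i + 1], col[i + 2])
                nxt[k].append(s)
                nxt[k + 1].append(c)
                i += 3
            if len(col) - i == 2:               # half adder on a leftover pair
                nxt[k].append(col[i] ^ col[i + 1])
                nxt[k + 1].append(col[i] & col[i + 1])
            elif len(col) - i == 1:             # single bit passes through
                nxt[k].append(col[i])
        cols = nxt
        stages += 1
    # Final addition of the two remaining rows (stand-in for a carry propagate adder).
    row0 = sum(col[0] << k for k, col in enumerate(cols) if len(col) > 0)
    row1 = sum(col[1] << k for k, col in enumerate(cols) if len(col) > 1)
    return row0 + row1, stages
```

For example, `wallace_multiply(13, 11, 4)[0]` evaluates to 143. Because each stage shrinks the tallest column by roughly a factor of 3/2, the stage count grows logarithmically with the operand width, which is the source of the O(log N) behaviour noted in Section 3.D.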
6. Conclusion
In this paper, various multiplier circuits have been designed and a comparative study has been done based on area (slices), delay and power. We have used the Xilinx Virtex-5 XC5VLX30-3-FF324 field programmable gate array device for simulation because it offers special features that support arithmetic operations. It is observed that the Wallace tree multiplier is faster than the carry ripple multiplier, CLA based multiplier and LA based multiplier. From this study it is also observed that multiplier circuits with fewer slices consume less quiescent and dynamic power. These fixed point multiplier circuits are applicable in digital signal processing.

References
[1] N. Kehtarnavaz, S. Mahotra, "FPGA implementation made easy for applied digital signal processing courses," IEEE International Conference on Acoustics, Speech and Signal Processing, Czech Republic, pp: 2892-2895, 2011.
[2] Saradindu Panda, A. Banerjee, B. Maji, Dr. A. K. Mukhopadhyay, "Power and delay comparison in between different types of full adder circuits," International Journal of Advance Research in Electrical, Electronics and Instrumentation Engineering, Vol. 1, Issue 3, 2013.
[3] http://www.xilinx.com.
[4] Vijayalakshmi, Seshadri, Ramakrishnan, "Design and implementation of 32 bit unsigned multiplier using CLAA and CSLA," IEEE Transaction Emerging Trends in VLSI, Embedded system, Nano Electronics and Telecommunication System, pp: 1-5, 2013.
[5] S shanthala, S.Y. Kulkarni, "VLSI design and implementation of low power MAC unit with block enabling technique," European Journal of Scientific Research, Vol. 30, No. 4, pp: 620-630, 2009.
[6] K. H. Chen, Y. M. Chen, Y. S. Chu, "A versatile multimedia functional unit design using the spurious power suppression technique," in Proc. IEEE Asian Solid-State Circuits Conf., Hangzhou, pp: 111-114, 2006.
[7] Z. Wang, G. A. Jullien, and W. C. Miller, "A new design technique for column compression multipliers," IEEE Transactions on Computers, Vol. 44, pp: 962-970, 2005.
[8] L. Beuchat, J. M. Muller, "Automatic Generation of Modular Multipliers for FPGA Applications," LIP research report, Tsukuba, Vol. 57, pp: 1600-1613, 2008.
[9] M. Beevao, and P. Stukjunger, "Fixed-Point Arithmetic in FPGA," Acta Polytechnica, Vol. 45, pp: 5-8, 2005.
[10] K. V. Suresh Kumar, S. Shabbir Ali, E. Chitra, "Implementation of 32-bit high-valency ling adder in modified FIR filter using APC-OMS approach," International Journal of Electrical and Electronics Engineering, Vol. 3, Issue-1, 2013.
[11] G. Lakshminarayanan, and B. Venkataramani, "Optimization Techniques for FPGA-Based Wave Pipelined DSP Blocks," IEEE Transaction in Very Large Scale Integration (VLSI) System, Vol. 13, pp: 783-792, 2005.
