
Integrated Circuit Design of 4-bit Booth Multiplier Radix-4 using Microwind Software


Muhammad Rizqi Nauval Afif
School of Electrical Engineering and Informatics
Institut Teknologi Bandung
Bandung, Indonesia
16516386@office.itb.ac.id

Prof. Trio Adiono, S.T., M.T., Ph.D.
School of Electrical Engineering and Informatics
Institut Teknologi Bandung
Bandung, Indonesia
tadiono@gmail.com

Abstract—In the course EL4230 Analysis & Design of Digital IC (integrated circuit), students are asked to create a full-custom IC design of a Booth Multiplier Radix-4 using Microwind software. The Booth Multiplier implements an algorithm invented by Booth for multiplying two inputs (a multiplier and a multiplicand). After the two inputs are multiplied, the result (the product) is accumulated by an accumulator, and the outputs displayed in the simulation are the product and the accumulation. This work shows that the Booth Multiplier is implemented properly and successfully multiplies two numbers in the range 0 to 7.

Keywords—EL4230, full-custom, IC design, Booth Multiplier, Microwind, accumulator

I. INTRODUCTION

In the course EL4230 Analysis & Design of Digital IC (integrated circuit), taught by Prof. Trio Adiono, S.T., M.T., Ph.D., students learn how to design a full-custom IC using Microwind software. For the final project, students are asked to create a full-custom IC design of a Booth Multiplier Radix-4 [1]. The Booth Multiplier implements an algorithm invented by Booth for multiplying two inputs (a multiplier and a multiplicand). After the two inputs are multiplied, the result (the product) is accumulated by an accumulator, and the outputs displayed in the simulation are the product and the accumulation.

II. PROCESSING ELEMENT

Fig. 1. Processing element of complete Booth Multiplier

III. ARCHITECTURE OF BOOTH MULTIPLIER

The architecture of the Booth Multiplier is implemented as follows:

Fig. 2. The architecture of the Booth Multiplier to be implemented

The implemented architecture uses a parallel algorithm. The three most significant bits (MSBs) are Booth-multiplied in parallel with the two least significant bits (LSBs) plus an initial zero value below the least significant bit. The output of the first full adder (left picture) is shifted left by 2 bits before entering the second full adder (right picture). The output of the second full adder is then fed to the input of the accumulator full adder.

IV. FLOORPLANNING

The floorplanning of the IC design is implemented as follows:

Fig. 3. Floorplan of complete Booth Multiplier
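The radix-4 Booth scheme described in Section III can be sketched in software as follows. This is a behavioral model only, not the layout or netlist from the paper: the function names `booth_digit` and `booth_radix4_multiply` are illustrative, and the sketch assumes the 4-bit multiplier is read as a two's-complement value, with an implicit zero appended below the LSB so that the two overlapping 3-bit groups are (x1, x0, 0) and (x3, x2, x1).

```python
def booth_digit(b2, b1, b0):
    """Radix-4 Booth digit in {-2, -1, 0, 1, 2} for the bit triplet (b2, b1, b0)."""
    return -2 * b2 + b1 + b0


def booth_radix4_multiply(x, y, bits=4):
    """Multiply y by x, where x is treated as a `bits`-wide two's-complement multiplier.

    Assumption for this sketch: `bits` is even, so x splits into bits // 2
    overlapping triplets, each contributing one partial product.
    """
    assert bits % 2 == 0
    x_bits = x & ((1 << bits) - 1)   # keep only the low `bits` bits of x
    padded = x_bits << 1             # append the implicit zero below the LSB
    product = 0
    for i in range(bits // 2):
        triplet = (padded >> (2 * i)) & 0b111
        b0 = triplet & 1
        b1 = (triplet >> 1) & 1
        b2 = (triplet >> 2) & 1
        digit = booth_digit(b2, b1, b0)
        # The partial product of group i is weighted by 4**i,
        # i.e. shifted left by 2 bits per group.
        product += digit * y * (4 ** i)
    return product
```

For example, for x = 7 (0111) the two groups decode to the digits -1 and +2, giving -y + 2y * 4 = 7y, which mirrors the 2-bit left shift between the two adder stages.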



V. IMPLEMENTATION

A. 1-bit Register

Fig. 4. Layout of 1-bit register implemented using D flip-flop

The layout in Fig. 4 is implemented in CMOS 0.12 μm technology. The total area is 142 lambda x 64 lambda = 8.52 μm x 3.84 μm = 32.7168 μm². There are no DRC (design rule check) errors. The design uses 3 metal layers.

Fig. 5. The simulation result of 1-bit register

Fig. 5 shows that the 1-bit register has been implemented properly. The setup time is 114 ps, the rising latency is 25 ps (from the falling clock edge to the rising edge of Q00), and the falling latency is 47 ps (from the falling clock edge to the falling edge of Q00).

B. Shift-3 Register 4-bit

Fig. 6. Layout of Shift-3 Register 4-bit implemented using D flip-flop

The layout in Fig. 6 is implemented in CMOS 0.12 μm technology. The total area is 414 lambda x 180 lambda = 24.84 μm x 10.18 μm = 252.8172 μm². There are no DRC (design rule check) errors. The design uses 3 metal layers.

Fig. 7. The simulation result of Shift-3 Register 4-bit

Fig. 7 shows that the Shift-3 Register 4-bit has been implemented properly. The setup time is 118 ps, the rising latency is 24 ps (from the falling clock edge to the rising edge of QY0), and the falling latency is 47 ps (from the falling clock edge to the falling edge of QY0).

C. Booth Decoder

Fig. 8. Layout of Booth Decoder

The layout in Fig. 8 is implemented in CMOS 0.12 μm technology. The total area is 233 lambda x 64 lambda = 13.98 μm x 3.84 μm = 53.6832 μm². There are no DRC (design rule check) errors. The design uses 4 metal layers.

Fig. 9. The simulation result of Booth Decoder

Fig. 9 shows that the Booth Decoder has been implemented properly. The setup time is 269 ps, the rising latency is 27 ps (from A to the rising edge of zero), and the falling latency is 8 ps (from A to the falling edge of zero).
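The area figures quoted for the layouts are obtained by converting lambda units to micrometers. A minimal sketch of that conversion is given here, assuming lambda = 0.06 μm; this value is inferred from the paper's own conversion of 142 lambda to 8.52 μm and is half of the 0.12 μm feature size. The helper name `layout_area_um2` is hypothetical.

```python
LAMBDA_UM = 0.06  # assumed lambda in micrometers (inferred from 142 lambda = 8.52 um)


def layout_area_um2(width_lambda, height_lambda, lambda_um=LAMBDA_UM):
    """Return (width_um, height_um, area_um2) for a layout given in lambda units."""
    width_um = width_lambda * lambda_um
    height_um = height_lambda * lambda_um
    return width_um, height_um, width_um * height_um


# 1-bit register: 142 lambda x 64 lambda
w, h, area = layout_area_um2(142, 64)
print(f"{w:.2f} um x {h:.2f} um = {area:.4f} um^2")
```

The same helper can be used to cross-check the dimensions reported for the other blocks.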
D. 4-bit Multiplexer

Fig. 10. Layout of 4-bit Multiplexer

The layout in Fig. 10 is implemented in CMOS 0.12 μm technology. The total area is 180 lambda x 100 lambda = 10.08 μm x 6 μm = 6.48 μm². There are no DRC (design rule check) errors. The design uses 4 metal layers.

Fig. 11. The simulation result of 4-bit multiplexer

Fig. 11 shows that the 4-bit Multiplexer has been implemented properly. The setup time is roughly 0 ps, the rising latency is 12 ps (from 2y1_0 to the rising edge of mux2_0), and the falling latency is 11 ps (from 2y1_0 to the falling edge of mux2_0).

E. 4-bit Full Adder

Fig. 12. Layout of 4-bit full adder

The layout in Fig. 12 is implemented in CMOS 0.12 μm technology. The total area is 281 lambda x 325 lambda = 16.86 μm x 19.5 μm = 328.77 μm². There are no DRC (design rule check) errors. The design uses 4 metal layers.

Fig. 13. The simulation result of 4-bit full adder

Fig. 13 shows that the 4-bit full adder has been implemented properly. The setup time is 265 ps, the rising latency is 38 ps (from A[3] to the rising edge of OUT[3]), and the falling latency is 140 ps (from A[3] to the falling edge of OUT[3]).
F. 4-bit Register

Fig. 14. Layout of 4-bit Register

The layout in Fig. 14 is implemented in CMOS 0.12 μm technology. The total area is 141 lambda x 181 lambda = 8.46 μm x 10.86 μm = 91.8756 μm². There are no DRC (design rule check) errors. The design uses 3 metal layers.

Fig. 15. The simulation result of 4-bit register

Fig. 15 shows that the 4-bit register has been implemented properly. The setup time is 103 ps, the rising latency is 41 ps (from add[0] to the rising edge of acc[0]), and the falling latency is 47 ps (from add[0] to the falling edge of acc[0]).

G. Final and Complete Design

Fig. 16. Complete layout of Booth Multiplier

The layout in Fig. 16 is implemented in CMOS 0.12 μm technology. The total area is 843 lambda x 936 lambda = 56.16 μm x 75.6 μm = 4,245.96 μm². There are no DRC (design rule check) errors. The design uses 4 metal layers.

Fig. 17. The simulation result of complete booth multiplier

Fig. 17 shows that the Booth Multiplier has been implemented properly. The outputs, both product and accumulation, start to appear on the fourth clock because a shift-3 register precedes the Booth decoder and the multiplexers. First, the inputs, multiplier (X) and multiplicand (Y), are shifted by three clocks to become QX and QY. The product output (OUT) is simply the product of the 3-clock-shifted multiplier (QX) and multiplicand (QY), while the accumulation output (OUTacc) is the sum of the present OUT and the OUTacc of one clock earlier, i.e. OUTacc[t-1].

Because the outputs are at most 4 bits wide, overflow occurs if the accumulation result exceeds 15, and the displayed accumulation output is the intended value reduced by a multiple of 16 (the number of values representable in 4 bits).
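The timing and overflow behavior described above can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the paper's circuit: the function name `simulate_mac` is hypothetical, the shift-3 register is modeled as a 3-entry queue, and both the product and the accumulator are assumed to keep only their low 4 bits, so they wrap modulo 16 when the carry is dropped.

```python
from collections import deque


def simulate_mac(xs, ys, delay=3, width=4):
    """Return a list of (OUT, OUTacc) pairs, one per clock.

    xs, ys: per-clock multiplier and multiplicand inputs.
    delay:  depth of the input shift register (3 for a shift-3 register).
    width:  output width in bits; results wrap modulo 2**width (assumption).
    """
    mask = (1 << width) - 1
    # The shift register starts out holding zeros.
    pipe = deque([(0, 0)] * delay, maxlen=delay)
    acc = 0
    outputs = []
    for x, y in zip(xs, ys):
        qx, qy = pipe[0]          # inputs that entered `delay` clocks ago
        pipe.append((x, y))       # shift in the new inputs, dropping the oldest
        out = (qx * qy) & mask    # product output, truncated to `width` bits
        acc = (acc + out) & mask  # OUTacc[t] = OUT[t] + OUTacc[t-1], wrapped
        outputs.append((out, acc))
    return outputs


xs = [1, 2, 3, 3, 0, 0, 0, 0]
ys = [2, 3, 3, 3, 0, 0, 0, 0]
for clock, (out, acc) in enumerate(simulate_mac(xs, ys), start=1):
    print(clock, out, acc)
```

With these inputs the first three clocks show zeros and the first product appears on the fourth clock. On the sixth clock the intended sum 2 + 6 + 9 = 17 is displayed as 1, because the 4-bit accumulator wraps.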
The setup time is 221 ps, the rising latency is 368 ps (from the falling clock edge to the rising edge of OUTacc[0]), and the falling latency is 294 ps (from the falling clock edge to the falling edge of OUTacc[0]).

VI. CONCLUSION

The Booth Multiplier is properly implemented and successfully multiplies numbers from 0 to 7. This range limit is due to the 4-bit capacity.

ACKNOWLEDGMENT

I thank Prof. Trio Adiono, S.T., M.T., Ph.D. for his tireless effort to develop IC research and industry in Indonesia. He readily shares his knowledge and expertise in IC design with undergraduate students. I also thank Adiwena Putra, who helped me understand the concept of the Booth Multiplier.

REFERENCES

[1] T. Adiono, Perancangan Rangkaian Terintegrasi Digital. Bandung: Penerbit ITB, 2017.
