
DIGITAL SYSTEM DESIGN (PERANCANGAN SISTEM DIGITAL)

Lecturer:
Andi Hasad
http://andihasad.wordpress.com

ELECTRICAL ENGINEERING (TEKNIK ELEKTRO)
UNIVERSITAS ISLAM 45
BEKASI

Topic 4

Arithmetic Circuits

Peter Cheung
Department of Electrical & Electronic Engineering
Imperial College London

URL: www.ee.imperial.ac.uk/pcheung/
E-mail: p.cheung@imperial.ac.uk
PYKC 21-Jan-08

E3.05 Digital System Design

Topic 4 Slide 1

About this Topic

Comparison of adder architectures on FPGAs
Multiple operands addition
Basic multipliers
Booth recoding multipliers
Fixed point vs Floating Point
Floating point Unit architectures
Example: FIR and IIR filter implementations
References
Computer Arithmetic, B. Parhami, OUP
Computer Arithmetic Algorithms, I. Koren, AK Peters

Different adder architectures

Revision of last year's Digital Electronics II course
(http://www.ee.ic.ac.uk/hp/staff/dmb/courses/dig2/5_Adder.pdf)
Common adder architectures are:

Ripple carry adder
Carry lookahead adder
Carry skip (or carry select) adder
Carry save adder
Parallel prefix adder (Brent & Kung's)

Basic Ripple Carry Adder

Using full-adders in building bit-serial and ripple-carry adders.

[Figure (a): Bit-serial adder. The operand bits x_i and y_i are shifted into a single full adder (FA); a carry flip-flop, driven by the clock, holds c_i and captures c_{i+1}; the sum bit s_i is shifted out.]
[Figure (b): Ripple-carry adder. A chain of full adders takes x_0..x_31 and y_0..y_31 with c_in = c_0, passes the carries c_1..c_31 from stage to stage, and produces s_0..s_31 with c_out = c_32 = s_32.]

Source: Parhami
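The ripple-carry structure described above can be modelled bit by bit. The following is a minimal behavioural sketch in Python, not a hardware description; the function names `full_adder` and `ripple_carry_add` are illustrative and do not come from the slides.

```python
def full_adder(x, y, c_in):
    """One-bit full adder: returns (sum bit, carry-out)."""
    s = x ^ y ^ c_in
    c_out = (x & y) | (c_in & (x ^ y))
    return s, c_out


def ripple_carry_add(x, y, k, c_in=0):
    """Add two k-bit unsigned integers by chaining k full adders.

    Returns (k-bit sum, carry-out c_k).
    """
    carry = c_in
    total = 0
    for i in range(k):                  # bit 0 (LSB) first; the carry ripples upward
        xi = (x >> i) & 1
        yi = (y >> i) & 1
        si, carry = full_adder(xi, yi, carry)
        total |= si << i
    return total, carry
```

For example, `ripple_carry_add(0b1011, 0b0110, 4)` adds 11 and 6 in 4 bits: the sum 17 does not fit, so the low four bits are 1 and the carry-out is 1.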

Critical Path Through a Ripple-Carry Adder

$T_{\text{ripple-add}} = T_{\text{FA}}(x, y \to c_{\text{out}}) + (k - 2)\, T_{\text{FA}}(c_{\text{in}} \to c_{\text{out}}) + T_{\text{FA}}(c_{\text{in}} \to s)$

[Figure: Critical path in a k-bit ripple-carry adder, from x_0, y_0 through the carries c_1, ..., c_{k-1} to s_{k-1}.]

Source: Parhami

Adder Conditions and Exceptions

[Figure: Two's-complement adder with provisions for detecting conditions and exceptions. The outputs c_out, Overflow, Negative and Zero are derived from c_k, c_{k-1} and the sum bits s_{k-1}, ..., s_0.]

$\text{overflow}_{\text{2's-compl}} = x_{k-1}\, y_{k-1}\, \bar{s}_{k-1} \lor \bar{x}_{k-1}\, \bar{y}_{k-1}\, s_{k-1}$
$\text{overflow}_{\text{2's-compl}} = c_k \oplus c_{k-1} = c_k\, \bar{c}_{k-1} \lor \bar{c}_k\, c_{k-1}$

Source: Parhami
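The two overflow expressions above are equivalent, which can be checked exhaustively for a small word width. The sketch below is a Python model under the assumption that operands are passed as k-bit patterns; the names `add_with_flags` and `overflow_from_signs` are hypothetical helpers, not part of the source.

```python
def add_with_flags(x, y, k, c_in=0):
    """Add two k-bit two's-complement bit patterns and report condition flags."""
    mask = (1 << k) - 1
    x &= mask
    y &= mask
    low = (x & (mask >> 1)) + (y & (mask >> 1)) + c_in   # bits 0..k-2 only
    c_km1 = (low >> (k - 1)) & 1        # carry into the sign position, c_{k-1}
    full = x + y + c_in
    c_k = (full >> k) & 1               # carry out of the sign position, c_k
    s = full & mask
    return {
        "sum": s,
        "c_out": c_k,
        "overflow": c_k ^ c_km1,        # overflow = c_k XOR c_{k-1}
        "negative": (s >> (k - 1)) & 1,
        "zero": int(s == 0),
    }


def overflow_from_signs(x, y, s, k):
    """Overflow from the sign bits: x y s' OR x' y' s (all at position k-1)."""
    xs = (x >> (k - 1)) & 1
    ys = (y >> (k - 1)) & 1
    ss = (s >> (k - 1)) & 1
    return (xs & ys & (1 - ss)) | ((1 - xs) & (1 - ys) & ss)
```

In 4 bits, `add_with_flags(0b0111, 0b0001, 4)` adds 7 and 1; the result pattern 1000 is negative, so the overflow flag is set.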

Saturating Adders

Saturating (saturation) arithmetic:
When a result's magnitude is too large, do not wrap around; rather, provide the most positive or the most negative value that is representable in the number format.

Example: In 8-bit 2's-complement format, we have:
$120 + 26 \to -110$ (wraparound); $120 +_{\text{sat}} 26 \to 127$ (saturating)

Saturating arithmetic is desirable in many DSP applications.

Designing saturating adders:
Unsigned (quite easy)
Signed (only slightly harder)

[Figure: The Overflow output of the adder controls a 0/1 multiplexer that selects either the adder output or the saturation value.]

Source: Parhami

Full Carry Lookahead

[Figure: The inputs x_3 y_3, x_2 y_2, x_1 y_1, x_0 y_0 and c_in feed separate logic blocks that produce s_3, s_2, s_1, s_0 directly.]

Theoretically, it is possible to derive each sum digit directly from the inputs that affect it.

Carry-lookahead adder design is simply a way of reducing the complexity of this ideal, but impractical, arrangement by hardware sharing among the various lookahead circuits.

Source: Parhami
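The difference between wraparound and saturation can be illustrated with a short Python sketch. The function names and the default width of 8 bits are assumptions made for this example only.

```python
def wrap_add(x, y, bits=8):
    """Two's-complement addition that wraps around on overflow."""
    mask = (1 << bits) - 1
    s = (x + y) & mask
    # Reinterpret the bit pattern as a signed value.
    return s - (1 << bits) if s >> (bits - 1) else s


def sat_add_signed(x, y, bits=8):
    """Signed addition that clamps to the most negative / most positive value."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, x + y))


def sat_add_unsigned(x, y, bits=8):
    """Unsigned addition that clamps to the largest representable value."""
    return min(x + y, (1 << bits) - 1)
```

With these definitions, `wrap_add(120, 26)` yields -110 while `sat_add_signed(120, 26)` yields 127. The unsigned case needs only one comparison (against the carry-out), which is why it is the easier design.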

Carry-Lookahead Adder Design

Unrolling the Carry Recurrence

Recall the generate g, propagate p signals:

Signal   Radix r                               Binary
g_i      is 1 iff $x_i + y_i \ge r$            $x_i\, y_i$
p_i      is 1 iff $x_i + y_i = r - 1$          $x_i \oplus y_i$
s_i      $(x_i + y_i + c_i) \bmod r$           $x_i \oplus y_i \oplus c_i$

The carry recurrence can be unrolled to obtain each carry signal directly from inputs, rather than through propagation:

$c_i = g_{i-1} + c_{i-1}\, p_{i-1}$
$\quad = g_{i-1} + (g_{i-2} + c_{i-2}\, p_{i-2})\, p_{i-1}$
$\quad = g_{i-1} + g_{i-2}\, p_{i-1} + c_{i-2}\, p_{i-2}\, p_{i-1}$
$\quad = g_{i-1} + g_{i-2}\, p_{i-1} + g_{i-3}\, p_{i-2}\, p_{i-1} + c_{i-3}\, p_{i-3}\, p_{i-2}\, p_{i-1}$
$\quad = g_{i-1} + g_{i-2}\, p_{i-1} + g_{i-3}\, p_{i-2}\, p_{i-1} + g_{i-4}\, p_{i-3}\, p_{i-2}\, p_{i-1} + c_{i-4}\, p_{i-4}\, p_{i-3}\, p_{i-2}\, p_{i-1}$
$\quad = \ldots$

Source: Parhami

4-bit lookahead carry generator

Block generate and propagate signals:

$g_{[i,i+3]} = g_{i+3} + g_{i+2}\, p_{i+3} + g_{i+1}\, p_{i+2}\, p_{i+3} + g_i\, p_{i+1}\, p_{i+2}\, p_{i+3}$
$p_{[i,i+3]} = p_i\, p_{i+1}\, p_{i+2}\, p_{i+3}$

[Figure: Schematic diagram of a 4-bit lookahead carry generator. Inputs g_i, p_i, ..., g_{i+3}, p_{i+3} and c_i; outputs c_{i+1}, c_{i+2}, c_{i+3}, g_{[i,i+3]} and p_{[i,i+3]}.]

Source: Parhami
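The unrolled recurrence can be evaluated directly from the g and p signals, without passing a carry from one stage to the next. The following Python sketch models that computation for binary operands; `lookahead_carries` and `cla_add` are illustrative names, and the nested loop stands in for the wide AND-OR gates of real hardware.

```python
def lookahead_carries(x, y, k, c_in=0):
    """Return [c_0, ..., c_k], each computed from the unrolled carry recurrence."""
    g = [((x >> i) & (y >> i)) & 1 for i in range(k)]   # generate:  x_i AND y_i
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(k)]   # propagate: x_i XOR y_i
    carries = [c_in]
    for i in range(1, k + 1):
        # c_i = g_{i-1} + g_{i-2} p_{i-1} + ... + c_0 p_0 p_1 ... p_{i-1}
        c = 0
        prod = 1                        # running product p_{i-1} p_{i-2} ...
        for j in range(i - 1, -1, -1):
            c |= g[j] & prod
            prod &= p[j]
        c |= c_in & prod
        carries.append(c)
    return carries


def cla_add(x, y, k, c_in=0):
    """k-bit addition using lookahead carries; returns (sum, carry-out)."""
    c = lookahead_carries(x, y, k, c_in)
    s = 0
    for i in range(k):
        s |= ((((x >> i) ^ (y >> i)) & 1) ^ c[i]) << i   # s_i = p_i XOR c_i
    return s, c[k]
```

Each carry depends only on the inputs, so in hardware all carries can be produced in parallel; the cost is that the terms grow wider with i, which is what motivates the block-level sharing discussed in the slides.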

A Building Block for Carry-Lookahead Addition

[Figure: Four-bit lookahead carry generator. Inputs g_i, p_i, ..., g_{i+3}, p_{i+3} and c_i; outputs c_{i+1}, c_{i+2}, c_{i+3} and the block signals g_{[i,i+3]}, p_{[i,i+3]}.]

[Figure: Four-bit adder. The bit signals g_0..g_3, p_0..p_3 and c_0 feed the block signal generation and intermediate carries logic, which produces c_1, c_2, c_3 and c_4.]

Source: Parhami

Combining Block g and p Signals

Block generate and propagate signals can be combined in the same way as bit g and p signals to form g and p signals for wider blocks.

[Figure: Combining of g and p signals of four (contiguous or overlapping) blocks of arbitrary widths, [i_0, j_0] through [i_3, j_3], into the g and p signals for the overall block [i_0, j_3]. A 4-bit lookahead carry generator with carry-in c_{i_0} also produces c_{j_0+1}, c_{j_1+1} and c_{j_2+1}.]

Source: Parhami
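The statement that block signals combine like bit signals can be made concrete with a small combining operator. This is a Python sketch for contiguous blocks only; the names `combine` and `block_gp` are assumptions for illustration.

```python
def combine(high, low):
    """Merge (g, p) of a higher-order block with (g, p) of the adjacent lower block."""
    g_hi, p_hi = high
    g_lo, p_lo = low
    # The wide block generates if the high block generates, or if the low block
    # generates and the high block propagates; it propagates only if both do.
    return (g_hi | (g_lo & p_hi), p_hi & p_lo)


def block_gp(x, y, lo, hi):
    """(g, p) for bit positions lo..hi inclusive, built by repeated combining."""
    def bit_gp(i):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        return (xi & yi, xi ^ yi)

    gp = bit_gp(lo)
    for i in range(lo + 1, hi + 1):
        gp = combine(bit_gp(i), gp)
    return gp
```

Because the operator is associative, two 4-bit block results can be merged into the 8-bit block result: `combine(block_gp(x, y, 4, 7), block_gp(x, y, 0, 3))` equals `block_gp(x, y, 0, 7)`.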

Carry-Select Adders

[Figure: Carry-select adder for k-bit numbers built from three k/2-bit adders. One adder computes the low k/2 bits from c_in; two adders compute the high half (k/2 + 1 bits each, including the carry) for carry-in 0 and carry-in 1; a multiplexer controlled by c_{k/2} selects c_out and the high k/2 bits.]

$C_{\text{select-add}}(k) = 3\, C_{\text{add}}(k/2) + k/2 + 1$
$T_{\text{select-add}}(k) = T_{\text{add}}(k/2) + 1$

Source: Parhami

Multilevel Carry-Select Adders

[Figure: Two-level carry-select adder built of k/4-bit adders. Bit ranges 0 to k/4 - 1, k/4 to k/2 - 1, k/2 to 3k/4 - 1 and 3k/4 to k - 1; multiplexers controlled by c_{k/4} and c_{k/2} select the middle k/4 bits and then c_out with the high k/2 bits; the low k/4 bits come directly from the adder fed by c_in.]

Source: Parhami

Comparison between adders on modern FPGAs

Sacristan, Rodella & Diaz, "Comparison of addition structures synthesis over commercial FPGAs", International Conf. on Design & Test, 2006, Page(s): 413 - 417
Compares the ripple carry adder (RCA), carry lookahead adder (CLA), carry select adder (CSLA) and Brent & Kung parallel prefix adder (PA-BK) with the option of not specifying any structure and letting the synthesis tool decide.
Uses Altera Stratix II and Xilinx Virtex-4 (not the latest, but pretty recent).
Result summary:
Mostly as expected: faster means larger.
Surprisingly, the synthesis tool does best: both fast and small!
Moral: at low level, it is difficult to beat modern synthesis tools.
Results are shown in the next four slides.

Results for Stratix II Area

Source: Sacristan

Results for Stratix II Delay

Source: Sacristan

Results for Virtex-4 Area

Source: Sacristan

Results for Virtex-4 Delay

Source: Sacristan

Multipliers and DSP Blocks

Remember that both Altera and Xilinx FPGAs have embedded multipliers with
accumulators etc.
This part of the lecture will look at some of the common multiplier hardware
(i.e. what such embedded multiplier circuits might look like).
We will also consider application of FPGA embedded multiplier for FIR Filter
implementations.
Topics to cover are:

Basic multipliers
Booth recoded multipliers
Array multipliers
FIR Filter Compiler


Multiplication of two 4-bit unsigned numbers

Notation:
a   Multiplicand      $a_{k-1} a_{k-2} \ldots a_1 a_0$
x   Multiplier        $x_{k-1} x_{k-2} \ldots x_1 x_0$
p   Product (a × x)   $p_{2k-1} p_{2k-2} \ldots p_3 p_2 p_1 p_0$

Initially, we assume unsigned operands.

Partial products bit-matrix:
a              Multiplicand
x              Multiplier
$x_0\, a\, 2^0$
$x_1\, a\, 2^1$
$x_2\, a\, 2^2$
$x_3\, a\, 2^3$
p              Product

Source: Parhami

An example

Basic Sequential Multipliers
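The partial-product view and its sequential realisation can both be sketched in a few lines of Python. The first function sums the rows of the bit-matrix directly; the second follows the usual right-shift formulation, in which the multiplicand is added to the upper half of a double-width register that is then shifted right by one bit. Both function names are illustrative, and the right-shift variant is an assumption about what the untitled "Basic Sequential Multipliers" figure shows.

```python
def shift_add_multiply(a, x, k):
    """Multiply two k-bit unsigned integers by summing partial products x_j * a * 2^j."""
    p = 0
    for j in range(k):
        xj = (x >> j) & 1
        p += (xj * a) << j
    return p


def sequential_multiply(a, x, k):
    """Right-shift algorithm: add x_j * a to the upper half, then shift right one bit."""
    p = 0                               # double-width partial product
    for j in range(k):
        if (x >> j) & 1:
            p += a << k                 # align a with the upper half of the register
        p >>= 1                         # one-bit right shift (no bits are lost)
    return p
```

After k cycles both functions hold the 2k-bit product; the right-shift form needs only a k-bit adder because the addition always happens at the same position.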

Performing Add and Shift in One Clock Cycle

[Figure: The multiplier bit x_j goes to the mux control, and the mux passes either 0 or the multiplicand a to the adder. The adder's carry-out and k-bit sum are loaded, already shifted, into the double-width register that holds the partial product p^(j) together with the unused part of the multiplier x.]

Combining the loading and shifting of the double-width register holding the partial product and the partially used multiplier.

Source: Parhami

Example of a detailed 4x4 unsigned sequential multiplier

2's complement signed multiplication

4x4 sequential signed multiplier circuit

Recoded Multiplier: Booth Algorithm (1)

Proof of Booth Algorithm

[Figure annotations: "2's complement representation of x"; "Booth Algorithm does this".]

Sequential Booth Multiplier

[Figure labels: +/-, BA.]

Multi-bit sequential multiplier

Modified Booth Algorithm (2 bits at a time)

Modified Booth Recoding (2 bits at a time)

Modified Booth Multiplier Circuit
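The slide bodies for the Booth algorithm were not recovered, so the following Python sketch uses the standard textbook definitions rather than the lecturer's own figures: radix-2 Booth recoding replaces each multiplier bit by the digit $x_{i-1} - x_i$ (with $x_{-1} = 0$), and the modified (radix-4) form handles two bits at a time with digits $x_{2j-1} + x_{2j} - 2x_{2j+1}$. All function names are hypothetical.

```python
def booth_digits(x, k):
    """Radix-2 Booth recoding of a k-bit two's-complement pattern (LSB digit first)."""
    digits, prev = [], 0                # prev plays the role of x_{-1} = 0
    for i in range(k):
        xi = (x >> i) & 1
        digits.append(prev - xi)        # each digit is -1, 0 or +1
        prev = xi
    return digits


def modified_booth_digits(x, k):
    """Radix-4 recoding, two bits at a time; k must be even. Digits lie in -2..2."""
    digits, prev = [], 0
    for j in range(0, k, 2):
        x0 = (x >> j) & 1
        x1 = (x >> (j + 1)) & 1
        digits.append(prev + x0 - 2 * x1)
        prev = x1
    return digits


def booth_multiply(a, x, k, radix4=False):
    """Multiply signed integer a by the k-bit two's-complement pattern x."""
    digits = modified_booth_digits(x, k) if radix4 else booth_digits(x, k)
    shift = 2 if radix4 else 1
    return sum((d * a) << (shift * i) for i, d in enumerate(digits))
```

For the pattern 0110 (decimal 6), `booth_digits` gives the digits 0, -1, 0, +1 from least to most significant, that is 8 - 2, and `modified_booth_digits` gives -2, +2, that is 8 - 2 again. The radix-4 form halves the number of partial products.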

Array Multiplier

Array Multiplier: obvious, but slow version

Array Multiplier using carry-save adders

Embedded Multipliers in Altera Cyclone II (1)

Embedded Multipliers in Altera Cyclone II (2)

Embedded Multipliers in Altera Cyclone II (3)

Application of Multipliers: Typical DSP System

Altera and Xilinx provide FIR filter compiler support.
These examples are taken from Altera's FIR Compiler User's Guide.
MegaCore functions are pre-designed cores (large modules).
LPM functions are parameterised building blocks (e.g. adder, multiplier).

Basic FIR Filter

Exploiting Symmetric Coefficients (7-tap)

Parallel Implementation of FIR Filter

Serial Implementation of FIR Filter

Multibit Serial Implementation of FIR Filter
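The idea behind exploiting symmetric coefficients is that mirrored taps share a coefficient, so their samples can be added first and multiplied once. The sketch below is a plain Python model of that idea, not the structure generated by Altera's FIR Compiler; the function names and the zero initial state are assumptions.

```python
def fir_direct(samples, coeffs):
    """y[n] = sum_i h[i] * x[n-i], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for i, h in enumerate(coeffs):
            if n - i >= 0:
                acc += h * samples[n - i]
        out.append(acc)
    return out


def fir_symmetric(samples, coeffs):
    """Same filter for symmetric coefficients (h[i] == h[N-1-i]), with pre-addition."""
    n_taps = len(coeffs)
    if any(coeffs[i] != coeffs[n_taps - 1 - i] for i in range(n_taps)):
        raise ValueError("coefficients are not symmetric")

    def x(n):
        return samples[n] if n >= 0 else 0

    out = []
    for n in range(len(samples)):
        acc = 0
        for i in range(n_taps // 2):
            # One multiplication serves the tap pair (i, N-1-i).
            acc += coeffs[i] * (x(n - i) + x(n - (n_taps - 1 - i)))
        if n_taps % 2:                  # odd length: the centre tap stands alone
            mid = n_taps // 2
            acc += coeffs[mid] * x(n - mid)
        out.append(acc)
    return out
```

For a 7-tap symmetric filter this uses 4 multiplications per output sample instead of 7, at the cost of 3 extra pre-adders.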

FIR Filter Compiler Design Space

Floating-Point Numbers

No finite number system can represent all real numbers.
Various systems can be used for a subset of real numbers:

Fixed-point      $\pm w.f$             Low precision and/or range
Rational         $\pm p/q$             Difficult arithmetic
Floating-point   $\pm s \times b^e$    Most common scheme
Logarithmic      $\pm \log_b x$        Limiting case of floating-point

Fixed-point numbers
x = (0000 0000 . 0000 1001)two    Small number
y = (1001 0000 . 0000 0000)two    Large number

Floating-point numbers
$x = \pm s \times b^e$ or $\pm\,\text{significand} \times \text{base}^{\text{exponent}}$

Note that a floating-point number comes with two signs:
Number sign, usually represented by a separate bit
Exponent sign, usually embedded in the biased exponent

Source: Parhami
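The two fixed-point examples above can be evaluated exactly to show how little precision the small number keeps. This is a minimal Python sketch; `fixed_point_value` is a hypothetical helper, and the bit strings are treated as unsigned with 8 fractional bits.

```python
from fractions import Fraction


def fixed_point_value(bits, frac_bits):
    """Exact value of an unsigned fixed-point bit string with frac_bits fractional bits."""
    digits = bits.replace(" ", "").replace(".", "")
    return Fraction(int(digits, 2), 1 << frac_bits)


x = fixed_point_value("0000 0000 . 0000 1001", 8)   # 9/256
y = fixed_point_value("1001 0000 . 0000 0000", 8)   # 144
```

Here x equals 9/256 (about 0.035) and carries only four significant bits, while y equals 144. A floating-point format would spend the same number of significand bits on both.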

Floating-Point Number Format and Distribution

[Figure: Typical floating-point number format, with three fields.]
Sign: 0: +, 1: -
Exponent: signed integer, often represented as unsigned value by adding a bias. Range with h bits: $[-\text{bias},\ 2^h - 1 - \text{bias}]$
Significand: represented as a fixed-point number. Usually normalized by shifting, so that the MSB becomes nonzero. In radix 2, the fixed leading 1 can be removed to save one bit; this bit is known as "hidden 1".

[Figure: Subranges and special values in floating-point number representations. Negative numbers FLP- run from max- (sparser) to min- (denser); positive numbers FLP+ run from min+ (denser) to max+ (sparser); underflow regions lie on either side of 0 and overflow regions lie beyond max- and max+. Marked points: overflow example, underflow example, midway example, typical example.]

Source: Parhami

The ANSI/IEEE Floating-Point Representation

IEEE 754 Standard (now being revised to yield IEEE 754R)

Short (32-bit) format
Sign: 1 bit
Exponent: 8 bits, bias = 127, -126 to 127
Significand: 23 bits for fractional part (plus hidden 1 in integer part)

Long (64-bit) format
Sign: 1 bit
Exponent: 11 bits, bias = 1023, -1022 to 1023
Significand: 52 bits for fractional part (plus hidden 1 in integer part)

Source: Parhami
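The field layout of the short (32-bit) format can be inspected with Python's standard `struct` module. The sketch below packs a value as a big-endian single-precision number and splits the resulting word; `decode_float32` and `classify` are illustrative names.

```python
import struct


def decode_float32(value):
    """Round value to IEEE 754 single precision and return (sign, biased exponent, fraction)."""
    (word,) = struct.unpack(">I", struct.pack(">f", value))
    sign = word >> 31                   # 1 bit
    biased_exp = (word >> 23) & 0xFF    # 8 bits, bias = 127
    fraction = word & 0x7FFFFF          # 23 bits (hidden 1 not stored)
    return sign, biased_exp, fraction


def classify(biased_exp, fraction):
    """Name the class of number encoded by the exponent and fraction fields."""
    if biased_exp == 0:
        return "zero" if fraction == 0 else "denormal"
    if biased_exp == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "ordinary"
```

For instance, `decode_float32(1.0)` returns `(0, 127, 0)`: the true exponent 0 is stored as 127, and the significand 1.0 is stored as an all-zero fraction because the leading 1 is hidden.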

Overview of IEEE 754 Standard Formats

Some features of the ANSI/IEEE standard floating-point number representation formats.

Feature               Single / Short                            Double / Long
Word width (bits)     32                                        64
Significand bits      23 + 1 hidden                             52 + 1 hidden
Significand range     $[1,\ 2 - 2^{-23}]$                       $[1,\ 2 - 2^{-52}]$
Exponent bits         8                                         11
Exponent bias         127                                       1023
Zero ($\pm 0$)        e + bias = 0, f = 0                       e + bias = 0, f = 0
Denormal              e + bias = 0, $f \ne 0$;                  e + bias = 0, $f \ne 0$;
                      represents $\pm 0.f \times 2^{-126}$      represents $\pm 0.f \times 2^{-1022}$
Infinity ($\pm\infty$)  e + bias = 255, f = 0                   e + bias = 2047, f = 0
Not-a-number (NaN)    e + bias = 255, $f \ne 0$                 e + bias = 2047, $f \ne 0$
Ordinary number       e + bias $\in [1, 254]$;                  e + bias $\in [1, 2046]$;
                      e $\in [-126, 127]$;                      e $\in [-1022, 1023]$;
                      represents $1.f \times 2^e$               represents $1.f \times 2^e$
min                   $2^{-126} \approx 1.2 \times 10^{-38}$    $2^{-1022} \approx 2.2 \times 10^{-308}$
max                   $\approx 2^{128} \approx 3.4 \times 10^{38}$   $\approx 2^{1024} \approx 1.8 \times 10^{308}$

Source: Parhami

Exponent Encoding

Exponent encoding in 8 bits for the single/short (32-bit) ANSI/IEEE format

Decimal code      0     1    ...   126   127   128   ...   254   255
Hex code          00    01   ...   7E    7F    80    ...   FE    FF
Exponent value          -126 ...   -1    0     +1    ...   +127

Code 0: f = 0: representation of $\pm 0$; $f \ne 0$: representation of denormals, $\pm 0.f \times 2^{-126}$
Codes 1 to 254: ordinary numbers, $1.f \times 2^e$
Code 255: f = 0: representation of $\pm\infty$; $f \ne 0$: representation of NaNs

Exponent encoding in 11 bits for the double/long (64-bit) format is similar.

Source: Parhami

Floating-Point Adders/Subtractors

Floating-point add/subtract:

$(\pm s_1 \times b^{e_1}) + (\pm s_2 \times b^{e_2}) = (\pm s_1 \times b^{e_1}) + (\pm s_2 / b^{e_1 - e_2}) \times b^{e_1}$
$\quad = (\pm s_1 \pm s_2 / b^{e_1 - e_2}) \times b^{e_1} = \pm s \times b^e$

Assume $e_1 \ge e_2$; alignment shift (preshift) is needed if $e_1 > e_2$.

Example:
Numbers to be added:
$x = 2^5 \times 1.00101101$
$y = 2^1 \times 1.11101101$    (operand with smaller exponent, to be preshifted)
Operands after alignment shift:
$x = 2^5 \times 1.00101101$
$y = 2^5 \times 0.000111101101$
Result of addition:
$s = 2^5 \times 1.010010111101$    (the last four bits are extra bits to be rounded off)
$s = 2^5 \times 1.01001100$        (rounded sum)

FP Adder/Sub

[Figure: Block diagram of a floating-point adder/subtractor. Operands are unpacked into signs, exponents and significands. The exponents go to a subtractor and mux; the significands pass through selective complement and possible swap, align significands, add (with c_in and c_out), normalize, round and selective complement, add, and normalize again. Control & sign logic takes the Add/Sub input and the signs. Sign, exponent and significand are then packed into the Sum/Difference.]

Unpack:
Isolate the sign, exponent, significand
Reinstate the hidden 1
Convert operands to internal format
Identify special operands, exceptions

Other key parts of the adder:
Significand aligner (preshifter)
Result normalizer (postshifter), including leading 0s detector/predictor
Rounding unit
Sign logic

Normalize:
Like signs: possible 1-position normalizing right shift
Different signs: possible left shift by many positions
Overflow/underflow during addition or normalization

Converting internal to external representation, if required, must be done at the rounding stage.

Pack:
Combine sign, exponent, significand
Hide (remove) the leading 1
Identify special outcomes, exceptions

Source: Parhami
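The align, add, normalize and round steps of the worked example can be reproduced with exact rational arithmetic. The sketch below handles only positive operands with significands in [1, 2) and rounds to nearest with ties to even; those restrictions, and the name `fp_add`, are assumptions of this example rather than features of the adder in the slides.

```python
from fractions import Fraction


def fp_add(e1, s1, e2, s2, frac_bits):
    """Add s1*2^e1 + s2*2^e2 (both positive) and round to frac_bits fractional bits."""
    if e1 < e2:                                  # swap so that e1 >= e2
        e1, s1, e2, s2 = e2, s2, e1, s1
    # Alignment: preshift the operand with the smaller exponent to the right.
    s = Fraction(s1) + Fraction(s2) / 2 ** (e1 - e2)
    e = e1
    if s >= 2:                                   # like signs: at most one right shift
        s /= 2
        e += 1
    scale = 1 << frac_bits
    s = Fraction(round(s * scale), scale)        # round to nearest, ties to even
    if s >= 2:                                   # rounding can overflow the significand
        s /= 2
        e += 1
    return e, s


x_sig = Fraction(0b100101101, 256)               # 1.00101101 in binary
y_sig = Fraction(0b111101101, 256)               # 1.11101101 in binary
e, s = fp_add(5, x_sig, 1, y_sig, 8)
```

The result is e = 5 and s = 332/256, whose binary form 1.01001100 matches the rounded sum in the example.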

Pre- and Postshifting

[Figure: One bit-slice of a single-stage pre-shifter: a 32-to-1 mux with inputs x_i, x_{i+1}, x_{i+2}, ..., x_{i+30}, x_{i+31}, a 5-bit shift amount, an enable input, and output y_i.]

[Figure: Four-stage combinational shifter for preshifting an operand by 0 to 15 bits, controlled by a 4-bit shift amount (LSB to MSB); inputs x_i to x_{i+8} and outputs y_i to y_{i+8} are shown.]

Source: Parhami

Leading Zeros / Ones Detection or Prediction

Leading zeros prediction, with adder inputs $(0x_0.x_{-1}x_{-2}\ldots)_{\text{2's-compl}}$ and $(0y_0.y_{-1}y_{-2}\ldots)_{\text{2's-compl}}$

Ways in which leading 0s/1s are generated:
p p ... p p g a a ... a a g ...
p p ... p p g a a ... a a p ...
p p ... p p a g g ... g g a ...
p p ... p p a g g ... g g p ...

Prediction might be done in two stages:
Coarse estimate, used for coarse shift
Fine tuning of estimate, used for fine shift
In this way, prediction can be partially overlapped with shifting.

[Figure: Leading zeros/ones counting: Significand Adder, Count Leading 0s/1s, Adjust Exponent, Shift amount, Post-Shifter.]
[Figure: Leading zeros/ones prediction: Significand Adder, Predict Leading 0s/1s, Adjust Exponent, Shift amount, Post-Shifter.]

Source: Parhami

Floating-Point Multipliers

Floating-point multiply:

$(\pm s_1 \times b^{e_1}) \times (\pm s_2 \times b^{e_2}) = (\pm s_1 \times s_2) \times b^{e_1 + e_2}$

$s_1 \times s_2 \in [1, 4)$: may need postshifting
Overflow or underflow can occur during multiplication or normalization

Speed considerations:
Many multipliers produce the lower half of the product (rounding info) early.
Need for normalizing right-shift is known at or near the end.
Hence, rounding can be integrated in the generation of the upper half, by producing two versions of these bits.

[Figure: Block diagram of a floating-point multiplier. Floating-point operands are unpacked; the signs are XORed, the exponents are added, and the significands are multiplied; then normalize and adjust exponent, round, normalize and adjust exponent again, and pack to give the product.]

Source: Parhami

Further references for Floating Point on FPGAs

"An analysis of the double-precision floating-point FFT on FPGAs", Hemmert, K.S.; Underwood, K.D.; 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, 18-20 April 2005, Page(s): 171 - 180
"Architectural Modifications to Improve Floating-Point Unit Efficiency in FPGAs", Beauchamp, M.J.; Hauck, S.; Underwood, K.D.; Hemmert, K.S.; International Conference on Field Programmable Logic and Applications, 28-30 Aug. 2006, Page(s): 1 - 6
"Double precision floating-point arithmetic on FPGAs", Paschalakis, S.; Lee, P.; IEEE International Conference on Field-Programmable Technology (FPT), 15-17 Dec. 2003, Page(s): 352 - 358
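The multiplier data path (XOR the signs, add the exponents, multiply the significands, then postshift and round) can be modelled in the same exact-arithmetic style. This Python sketch assumes significands in [1, 2), ignores overflow, underflow and special values, and uses the hypothetical name `fp_multiply`.

```python
from fractions import Fraction


def fp_multiply(sign1, e1, s1, sign2, e2, s2, frac_bits):
    """Multiply (+/- s1 * 2^e1) by (+/- s2 * 2^e2); returns (sign, exponent, significand)."""
    sign = sign1 ^ sign2                         # XOR of the sign bits
    e = e1 + e2                                  # add exponents
    s = Fraction(s1) * Fraction(s2)              # product lies in [1, 4)
    if s >= 2:                                   # postshift: one-position right shift
        s /= 2
        e += 1
    scale = 1 << frac_bits
    s = Fraction(round(s * scale), scale)        # round to nearest, ties to even
    if s >= 2:                                   # rounding can overflow the significand
        s /= 2
        e += 1
    return sign, e, s
```

For example, multiplying $+1.5 \times 2^1$ by $-1.5 \times 2^2$ gives a significand product of 2.25, which is shifted right once to yield $-1.125 \times 2^4$.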

References
Davis, Justin. 2006. High-Speed Digital System Design, Morgan & Claypool Publishers, USA
Johnson, Howard and Graham, Martin. High-Speed Digital Design: A Handbook of Black Magic, Prentice Hall, New Jersey, USA
Hasad, Andi. 2011. Materi Kuliah Perancangan Sistem Digital, Teknik Elektro, UNISMA, Bekasi
Wakerly, John F. 2005. Digital Design: Principles & Practices, 4th Edition, Prentice Hall, USA
Wolf, Wayne. 2004. FPGA-Based System Design, Prentice-Hall Publishers, Inc.
