
Lecture 10 -- Adders

Lecture 10 ECE 425


Outline
• Adders
– Basic cell structures
– Optimized cells
– Carry chains



Adders
• Important subsystem in digital designs, so we care about
performance
• Good example of ways to start thinking about building
systems larger than 1-2 gates

• Basic problem: Bit N of the result of an add depends on
bit N of the inputs and, through the carry, on bits
0 -- (N-1) of the inputs
– Could build separate circuits that generate each output
bit based on the less-significant input bits
• This would be very fast
• Also, very big
– Instead, use bit adders that compute the value of each
bit, connect them together to form N-bit adders.



Half-Adders
• For a one-bit add, the sum is just the XOR of the inputs,
and the carry out is just the AND

• Cells that compute this function are called half-adders
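
The half-adder function described above can be sketched in a few lines. This is only an illustrative software model; the name half_adder and the tuple return order are assumptions, not part of the lecture.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two one-bit inputs."""
    return a ^ b, a & b  # sum is XOR, carry out is AND


# Exhaustive check against ordinary integer addition.
for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        assert 2 * c + s == a + b
```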



Full Adders
• To get a cell that we can compose into a multi-bit adder,
we need to have a carry in input as well.
– These are called full adders



Full Adders
• From the truth table, we get:
– Sum = A XOR B XOR Cin
– Cout = AB + A Cin + B Cin

• Which can be rewritten as:
– Cout = AB + Cin (A + B)

• Note that the carry-out can be generated when A and B
are both true, or propagated if A or B is true and Cin is true
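
A minimal sketch of a full adder, written in terms of the generate and propagate view of the carry-out noted above. The function and variable names are illustrative assumptions; the loop checks all eight input combinations against integer addition and against the majority form of the carry.

```python
from itertools import product


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, cout) for one-bit inputs a, b and carry-in cin."""
    g = a & b              # generate: carry out regardless of cin
    p = a | b              # propagate: carry out only if cin is set
    s = a ^ b ^ cin
    cout = g | (p & cin)
    return s, cout


for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    assert 2 * cout + s == a + b + cin
    # Same carry expressed as "at least two of the three inputs are 1".
    assert cout == (a & b) | (a & cin) | (b & cin)
```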



Multi-Bit Adders



Parallel Adders


Gate-Level Implementation
• Much more worried about the delay from A, B, Cin --> Cout
than from A, B, Cin --> Sum, since the carry path is
repeated in every bit of the adder
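
To see why the Cin --> Cout path dominates, here is a small behavioral sketch of an N-bit ripple-carry adder built from full-adder cells. The helper names (to_bits, from_bits, ripple_carry_add) and the LSB-first bit-list convention are assumptions made for this example. Note that the carry variable passes through every cell in turn, while each sum bit is produced once and never reused.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, cout)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a | b))
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (least-significant bit first)."""
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        # The carry out of this cell is the carry in of the next one.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(x, n):
    """Unpack integer x into n bits, least-significant first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Pack an LSB-first bit list back into an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


s, cout = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s), cout)  # 11 + 6 = 17: 4-bit sum 1 with carry out 1
```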



Complex-Gate Implementation



Improving the Ripple-Carry Adder



Better Full Adder (Symmetrical or Mirror Adder)



Laying out a Full-Adder Cell
• Start by sizing the transistors to get good performance
• Then, consider how the cell will be used
– In a standard-cell environment, the structure we talked
about with a single row of n- and p-diffusion works
well, and is likely to fit in with other cells
– In a datapath, fitting the layout to the organization of
wires is more important
• Run inputs horizontally across the cell to create a
bit-slice layout
• Rotate transistors so that transistor width doesn’t
affect height



Minimum-Height Layout



Full-Adder Bit-Slice Layout


Bit-Slice Full Adder


Eight-Bit Full Adder


Serial Adder
• Ripple-carry adder has latency proportional to number of
bits -- why not just re-use the same full adder for each bit?
– In this case, want to equalize delay of sum and carry
bits
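
A rough behavioral sketch of the serial adder idea: one full adder is reused for every bit, with the carry held in a one-bit register between clock cycles. The function name serial_add and its interface are assumptions for illustration; real hardware would shift the operands through registers rather than index into integers.

```python
def serial_add(a: int, b: int, n_bits: int) -> tuple[int, int]:
    """Add two n_bits-wide numbers one bit per clock using one full adder.

    Returns (sum modulo 2**n_bits, final carry).
    """
    carry = 0    # state of the carry flip-flop, cleared before the add
    result = 0
    for clk in range(n_bits):
        a_i = (a >> clk) & 1
        b_i = (b >> clk) & 1
        s = a_i ^ b_i ^ carry
        # The new carry is stored and used on the next clock.
        carry = (a_i & b_i) | (carry & (a_i | b_i))
        result |= s << clk
    return result, carry


print(serial_add(200, 100, 8))  # 300 does not fit in 8 bits: (44, 1)
```

Because the sum and the carry are both needed by the end of the same clock cycle, neither path can be traded off against the other, which is why the slide asks for equal sum and carry delays.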



Carry-save Adders
• How do we add more than two numbers?
– Sequentially, adding m numbers takes m-1 incremental
sums
• Instead, we add three numbers at a time:
x + y + z = c + s, where c is the carry and s is
the sum


Carry-save Adders
• There is one CSA unit per bit
• Every CSA has a c output and an s output
• Done this way, m numbers can be added in about log m
levels of carry-save addition rather than m-1 sequential sums
• One final carry-propagate adder is still needed to combine
the c and s outputs
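
The identity x + y + z = c + s can be modeled directly on integers. The sketch below is an assumption-laden software model (the names carry_save_add and multi_operand_add are made up here, and operands are taken to be non-negative); it reduces a list of operands three at a time and finishes with a single ordinary addition, which stands in for the final carry-propagate adder.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compression: return (c, s) with x + y + z == c + s."""
    s = x ^ y ^ z                             # per-bit sum, no carry propagation
    c = ((x & y) | (x & z) | (y & z)) << 1    # per-bit carry, one position up
    return c, s


def multi_operand_add(operands: list[int]) -> int:
    """Reduce operands with CSA levels, then do one ordinary add."""
    terms = list(operands)
    while len(terms) > 2:
        next_terms = []
        # Every group of three in a level is independent of the others,
        # so in hardware a whole level operates in parallel.
        for i in range(0, len(terms) - 2, 3):
            next_terms.extend(carry_save_add(*terms[i:i + 3]))
        # Pass along the one or two operands that did not fill a group.
        next_terms.extend(terms[len(terms) - len(terms) % 3:])
        terms = next_terms
    return sum(terms)  # stands in for the final carry-propagate adder


c, s = carry_save_add(5, 6, 7)
assert c + s == 18
assert multi_operand_add([3, 5, 7, 9, 11]) == 35
```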


Carry-save Adders: Wallace Tree


Carry-save Adders: Pipeline
• SIN, CIN and A are added in the left column
• The result is added in the right column
• Registers are placed in between
• A CPA (carry-propagate adder) adds the results
• Latency = 2 clocks


Better Carry Chains
• Speed of the ripple-carry adder is limited by the carry
chain, which has delay linear in the number of bits
– No matter what we do with full-adder design, and there
are lots of designs out there, can’t get around this.

• Basic idea: Parallelize the computation of carry bits for
different bits of the computation, rather than grinding them
out serially
– If we had all the carry bits at the start of the operation,
could generate all the output bits in parallel in one step



Parallel Carry Computation
• Remember a few slides back, when we defined the
propagate and generate signals?
– P = A + B
– G = AB

• Redefine P as A XOR B, since the generate signal covers
the case where A and B are both high
• In that case,
– Cout = G + P Cin
– Sum = P XOR Cin
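
A quick exhaustive check of the claim that P can be redefined as A XOR B without changing the carry-out. The helper carry_out and its p_is_xor flag are illustrative names only.

```python
from itertools import product


def carry_out(a: int, b: int, cin: int, p_is_xor: bool) -> int:
    """Carry-out computed as G + P*Cin with either definition of P."""
    g = a & b
    p = (a ^ b) if p_is_xor else (a | b)
    return g | (p & cin)


# Collect every input combination where the two definitions disagree.
mismatches = [
    (a, b, cin)
    for a, b, cin in product((0, 1), repeat=3)
    if carry_out(a, b, cin, True) != carry_out(a, b, cin, False)
]
print(mismatches)  # prints []
```

The two definitions of P differ only when A and B are both 1, and in that case G is already 1, so the carry-out is the same either way.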



Parallelizing Carry Computation
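• Using this, we can express the carry bits of successive
bits of the add as
– C0 = G0 + P0 Cin
– C1 = G1 + P1 G0 + P1 P0 Cin
– C2 = G2 + P2 G1 + P2 P1 G0 + P2 P1 P0 Cin
– C3 = G3 + P3 G2 + P3 P2 G1 + P3 P2 P1 G0 + P3 P2 P1 P0 Cin
where Ci is the carry out of bit i and Cin is the carry in to bit 0

• In theory, we could use this to compute the carry in to
each bit of the add in parallel
– This would make adder delay approximately
logarithmic in the number of bits, since four-input or so
gates are most efficient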
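
A behavioral sketch of the lookahead idea: each carry is computed as a flat sum of products of the P and G signals and the block carry-in, without using any other carry. The names lookahead_carries and cla_add, the 4-bit default width, and the convention that carry_out[i] is the carry out of bit i are assumptions for this example. The inner loop only enumerates the product terms; in hardware all of them would be evaluated at once.

```python
def lookahead_carries(p, g, cin=0):
    """carry_out[i] = G[i] + P[i]G[i-1] + ... + P[i]...P[0]*cin."""
    carry_out = []
    for i in range(len(p)):
        c = 0
        prefix = 1                  # running product P[i] & ... & P[j+1]
        for j in range(i, -1, -1):
            c |= prefix & g[j]      # term P[i]...P[j+1] G[j]
            prefix &= p[j]
        c |= prefix & cin           # term P[i]...P[0] cin
        carry_out.append(c)
    return carry_out


def cla_add(a: int, b: int, n: int = 4) -> tuple[int, int]:
    """Add two n-bit numbers; returns (n-bit sum, carry out)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # P = A XOR B
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]   # G = AB
    c = lookahead_carries(p, g)
    carry_in = [0] + c[:-1]         # carry into bit i is carry out of bit i-1
    total = sum((p[i] ^ carry_in[i]) << i for i in range(n))
    return total, c[-1]


print(cla_add(11, 6))  # 11 + 6 = 17: prints (1, 1)
```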



Carry-Lookahead Adder


Carry Lookahead
• In practice, it’s not efficient to compute more than 4-8
carry bits in parallel

• Carry chain is the critical portion of the design

• Design logic for Co of each stage so that as much work as
possible is done before Ci arrives
• Can optimize this somewhat by making stages of different
length so that the amount of work done before Ci arrives
matches the time to generate Ci for the stage



Adder Design
• Adder design is really carry-chain design -- once you have
the carry bits, generating the sum is trivial

• Different adder designs are really different carry-chain
designs.

• The design on the last slide is called a carry lookahead
adder



Carry Lookahead Adder
• Domino Logic Implementation



Manchester Carry Adder
• Note that the C0--C2 circuits are subsets of the C3 circuit
• We can exploit this to reduce the amount of circuitry
required

• Since we’re tapping points in the n-network, we get the
inverse of C0--C2


Manchester Carry Adder



Optimizing the Manchester Adder



Putting it Together -- Adder Floorplan



• Note that we have two copies of the carry chain -- one to
generate the carries for each bit, one to generate the
carry out
– Reduces load, optimizes carry out generation



Carry-Select Adder
• If we’re willing to throw hardware at the problem, we can
speed things up by duplicating adders.
• Carry-lookahead and Manchester adders divide an N-bit
add into M pieces of N/M bits each
– Carry input to each piece is either zero or one.
• We can further reduce the critical path by building two
adders for each piece, feeding different carry inputs to
each, and later selecting the one with the right carry input
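
A behavioral sketch of carry-select: every block is added twice, once assuming a carry-in of 0 and once assuming 1, and the real carry only drives a selection. The names block_add and carry_select_add, the 16-bit width, and the 4-bit block size are assumptions for illustration; n_bits is assumed to be a multiple of block.

```python
def block_add(a: int, b: int, cin: int, width: int) -> tuple[int, int]:
    """Stand-in for any width-bit adder: returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a: int, b: int, n_bits: int = 16, block: int = 4):
    """Add two n_bits-wide numbers block by block; returns (sum, carry)."""
    mask = (1 << block) - 1
    result, carry = 0, 0
    for lo in range(0, n_bits, block):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        # Both candidates depend only on the operands, so in hardware
        # they are ready before the incoming carry arrives.
        sum0, cout0 = block_add(a_blk, b_blk, 0, block)
        sum1, cout1 = block_add(a_blk, b_blk, 1, block)
        # The incoming carry only controls a multiplexer.
        s, carry = (sum1, cout1) if carry else (sum0, cout0)
        result |= s << lo
    return result, carry


print(carry_select_add(0xFFFF, 1))  # wraps around: prints (0, 1)
```

With this arrangement the critical path through each block after the first is one multiplexer delay rather than a full block add, at the cost of roughly twice the adder hardware.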



Carry-Select Adder



Logarithmic Lookahead Adder
• General best-case delay for any operation that depends
on N bits is going to be proportional to log_k(N), where k is
the optimal fan-in of a gate in the implementation technology.
• Rearrange the carry computation so that the final carry is
computed in a binary tree



Some Notation



Parallel Prefix
• The dot operator is associative, which means we can
transform a linear sequence of dot operations into a
binary tree
• This is an example of a general technique called parallel
prefix
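
A sketch of a parallel-prefix carry computation. It assumes the usual definition of the dot operator on (generate, propagate) pairs, (G, P) . (G', P') = (G + P G', P P'), which may differ in detail from the notation slide. The particular tree shape used here, where every bit combines with the group a fixed distance below it at each level, is one common arrangement chosen for brevity and is not necessarily the one drawn in the lecture. The names dot and prefix_add are illustrative.

```python
def dot(hi, lo):
    """Combine (G, P) of a more-significant group with a less-significant one."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def prefix_add(a: int, b: int, n: int = 8) -> tuple[int, int]:
    """Add two n-bit numbers with carry-in 0; returns (n-bit sum, carry out)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    gp = list(zip(g, p))        # gp[i] starts out covering bit i only
    dist = 1
    while dist < n:             # about log2(n) levels
        # After this level gp[i] covers bits max(0, i - 2*dist + 1) .. i.
        gp = [dot(gp[i], gp[i - dist]) if i >= dist else gp[i]
              for i in range(n)]
        dist *= 2
    # With carry-in 0, the group generate over bits 0..i is the carry out of bit i.
    carry_in = [0] + [gp[i][0] for i in range(n - 1)]
    total = sum((p[i] ^ carry_in[i]) << i for i in range(n))
    return total, gp[n - 1][0]


print(prefix_add(200, 100))  # 300 in 8 bits: prints (44, 1)
```

Associativity is what makes the regrouping legal: the result for bits 0..i is the same whether the dot operations are applied one bit at a time or on groups that double in size at each level.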



Logarithmic Lookahead Adder



8-Bit Logarithmic Lookahead Adder



Logarithmic Lookahead Adder



Wrapping Up
• Sections 10.1, 10.2 in your book

