
CHAPTER 1
1.0) INTRODUCTION
Any digital filter can be realized from adders, delay elements and multipliers, and such filters
can in turn be extended to realize larger systems. To increase the speed of a system we therefore
need to focus on increasing the speed of addition and multiplication.
In the conventional binary number system, when two numbers are added the carry may propagate all
the way from the least significant digit to the most significant digit. The addition time
therefore depends on the word length (it is linear in the word length for a ripple carry adder).
In this chapter we design various adders based on different addition techniques in Active HDL
software and study their delay and complexity characteristics. Our focus is on adders in which
the addition time is independent of the word length, such as the RBSD and QSD adders. Carry-free
addition is achieved by exploiting the redundancy of RBSD and QSD numbers. The redundancy allows
multiple representations of any integer quantity. There are two steps involved in the carry-free
addition. The first step generates an intermediate carry and sum from the addend and augend. The
second step combines the intermediate sum of the current digit with the carry of the lower
significant digit.
By designing adders in which the carry does not propagate we can considerably increase the speed
of addition. In a system involving a large number of adders and multipliers, the response time
can therefore be considerably improved.


1.1.1) Introduction
A full adder is a logical circuit that performs an addition operation on three binary digits. The
full adder produces a sum and carry value, which are both binary digits. It can be combined with
other full adders (see below) or work on its own. 
A full adder adds binary numbers and accounts for values carried in as well as out. A one-bit full
adder adds three one-bit numbers, often written as A, B, and Cin; A and B are the operands, and
Cin is a bit carried in (in theory from a past addition).

The delay through a digital circuit is measured in gate-delays, as this allows the delay of a design
to be calculated for different devices. AND and OR gates have a nominal delay of 1 gate-delay,
and XOR gates have a delay of 2, because they are really made up of a combination of ANDs
and ORs.
A full adder block has the following worst-case propagation delays:
- From A or B to Cout: 4 gate-delays (XOR → AND → OR)
- From A or B to S: 4 gate-delays (XOR → XOR)
- From Cin to Cout: 2 gate-delays (AND → OR)
- From Cin to S: 2 gate-delays (XOR)
The worst-case propagation delay of a 1-bit full adder is therefore 4 gate-delays.
These figures assume that both the true and the complemented forms of the inputs are available.
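
The behaviour of the full adder can be cross-checked with a short software model. The following is a minimal Python sketch, not part of the VHDL design; the function name `full_adder` is chosen here for illustration. It uses the same sum and carry expressions as a standard full adder and checks them against ordinary integer addition for all eight input combinations.

```python
def full_adder(a, b, cin):
    """Return (sum, cout) for three one-bit inputs (hypothetical helper)."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout


# Exhaustive check: the two output bits must encode a + b + cin.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(a, b, cin)
            assert 2 * cout + s == a + b + cin
```
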


1.1.2) VHDL code of full adder

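library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- Entity name taken from the component instantiated in the ripple carry adder (section 1.2.2).
entity fullader is
port(
a, b, cin : in bit;
sum, cout : out bit
);
end fullader;

--}} End of automatically maintained section

architecture fullader of fullader is
begin
sum <= a xor b xor cin after 4 ns;
cout <= (a and b) or (b and cin) or (a and cin) after 4 ns;
end fullader;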


1.2.1) Introduction
It is possible to create a logical circuit using multiple full adders to add N-bit numbers. Each full
adder inputs a Cin, which is the Cout of the previous adder. This kind of adder is a ripple carry
adder, since each carry bit "ripples" to the next full adder. 

Because the carry-out of one stage is the carry-in of the next, the worst-case propagation delay is:
- 4 gate-delays to generate the first carry signal ($A_0/B_0 \to C_1$).
- 2 gate-delays per intermediate stage ($C_i \to C_{i+1}$).
- 2 gate-delays at the last stage to produce both the sum and carry-out outputs ($C_{n-1} \to C_n$ and $S_{n-1}$).
So for an n-bit adder, the total propagation delay, $t_p$, is:
$t_p = 4 + 2(n - 2) + 2 = 2n + 2$ (1.1)
This is linear in n, and a 32-bit addition would take 66 gate-delays to complete.
This is rather slow, and restricts the word length in our device somewhat. We would like to find
ways to speed it up.
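
The arithmetic behind equation (1.1) can be tabulated for a few word lengths. This is a small Python sketch; the function name `ripple_carry_delay` is introduced here only for illustration, and the stage delays are the ones listed above.

```python
def ripple_carry_delay(n):
    """Worst-case delay, in gate-delays, of an n-bit ripple carry adder (n >= 2)."""
    first_stage = 4            # A0/B0 -> C1
    middle = 2 * (n - 2)       # Ci -> Ci+1 for each intermediate stage
    last_stage = 2             # C(n-1) -> Cn and S(n-1)
    return first_stage + middle + last_stage


for n in (4, 8, 16, 32):
    print(n, ripple_carry_delay(n))   # 4 10, 8 18, 16 34, 32 66
```

Doubling the word length roughly doubles the delay, which is the behaviour the later adders in this chapter try to avoid.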


1.2.2) VHDL code of ripple carry adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity ripplecarry is
port( a, b: in bit_vector(3 downto 0); ci: in bit;
s: out bit_vector(3 downto 0); co: out bit
);
end ripplecarry;

--}} End of automatically maintained section

architecture ripplecarry of ripplecarry is
component fullader
port (a, b, cin: in bit;
cout, sum: out bit);
end component;
signal c: bit_vector(3 downto 1);
begin
fa0: fullader port map (a(0), b(0), ci, c(1), s(0));
fa1: fullader port map (a(1), b(1), c(1), c(2), s(1));
fa2: fullader port map (a(2), b(2), c(2), c(3), s(2));
fa3: fullader port map (a(3), b(3), c(3), co, s(3));

end ripplecarry;

Figure 1.3: VHDL simulation of ripple carry adder

1.3.1) Introduction
The generate function, $G_i$, indicates whether stage i produces a carry-out even when no
carry-in is present. This occurs if both addends contain a 1 in that bit:
$G_i = A_i \cdot B_i$ (1.2)
The propagate function, $P_i$, indicates whether a carry-in to the stage is passed on to the
carry-out of the stage. This occurs if either of the addends has a 1 in that bit:
$P_i = A_i + B_i$ (1.3)
Note that both these values can be calculated from the inputs in a constant time of a single gate
delay. Now, the carry-out from a stage occurs if that stage generates a carry ($G_i = 1$) or there
is a carry-in and the stage propagates the carry ($P_i C_i = 1$):
$C_{i+1} = A_i B_i + A_i C_i + B_i C_i$ (1.4)
$C_{i+1} = A_i B_i + (A_i + B_i) C_i$ (1.5)
$C_{i+1} = G_i + P_i C_i$ (1.6)
Substituting the same expression for $C_i$, then for $C_{i-1}$, and so on, gives:
$C_{i+1} = G_i + P_i (G_{i-1} + P_{i-1} C_{i-1})$ (1.7)
$C_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} (G_{i-2} + P_{i-2} C_{i-2})$ (1.8)
...
$C_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + P_i P_{i-1} P_{i-2} G_{i-3} + \cdots + P_i P_{i-1} \cdots P_1 G_0 + P_i P_{i-1} \cdots P_1 P_0 C_0$ (1.9)

Note that this does not require the carry-out signals from the previous stages, so we don't have to
wait for changes to ripple through the circuit. In fact, a given stage's carry signal can be
computed once the propagate and generate signals are ready with only two more gate delays (one
AND and one OR). Thus the carry-out for a given stage can be calculated in constant time, and
therefore so can the sum.
Operation                                       Required data                Gate delays
Produce stage generate and propagate signals    Addends (a and b)            1
Produce stage carry-out signals, C1 to Cn       P and G signals, and C0      2
Produce sum result, S                           Carry signals and addends    3
Total                                                                        6

A basic carry-lookahead adder is very fast but has the disadvantage that it takes a very large
amount of logic hardware to implement. In fact, the amount of hardware needed is approximately
quadratic with n, and begins to get very complicated for n greater than 4.
Due to this, most CLAs are constructed out of "blocks" comprising 4-bit CLAs, which are in turn
cascaded to produce a larger CLA.
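
To make the look-ahead equations concrete, the following Python sketch models a single 4-bit block. It is a behavioral illustration only, with names (`cla4`, `abit`, `bbit`) chosen here; each carry is written directly in terms of the generate and propagate signals and the block carry-in, as in equation (1.9), and the result is checked against integer addition.

```python
def cla4(a, b, c0):
    """Add two 4-bit integers with look-ahead carries; return (sum, carry_out)."""
    abit = [(a >> i) & 1 for i in range(4)]
    bbit = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(abit, bbit)]   # generate, equation (1.2)
    p = [x | y for x, y in zip(abit, bbit)]   # propagate, equation (1.3)

    # Every carry depends only on G, P and c0, never on another carry.
    c = [c0, 0, 0, 0, 0]
    c[1] = g[0] | (p[0] & c0)
    c[2] = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c[3] = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
            | (p[2] & p[1] & p[0] & c0))
    c[4] = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
            | (p[3] & p[2] & p[1] & g[0])
            | (p[3] & p[2] & p[1] & p[0] & c0))

    s = 0
    for i in range(4):
        s |= (abit[i] ^ bbit[i] ^ c[i]) << i
    return s, c[4]


# Exhaustive check against ordinary addition.
for a in range(16):
    for b in range(16):
        for c0 in (0, 1):
            s, cout = cla4(a, b, c0)
            assert s + 16 * cout == a + b + c0
```

The number of product terms grows with each carry, which is the hardware cost mentioned above.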

library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- The entity name is missing from the source; cla4 is a placeholder.
entity cla4 is
port (x,y : in bit_vector (3 downto 0); cin : in bit;
s : out bit_vector (3 downto 0); cout,gout,pout : out bit);
end cla4;

architecture cla4 of cla4 is
component gpfulladder
port (a,b,cin : in bit;
g,p,so : out bit);
end component;
component clalogic
port (g,p : in bit_vector (3 downto 0); ci : in bit;
c : out bit_vector (3 downto 1) ; co,go,po : out bit);
end component;
signal g,p : bit_vector (3 downto 0);
signal c : bit_vector (3 downto 1);

begin
-- clalogic port order is (g, p, ci, c, co, go, po), so gout is mapped before pout
carrylogic : clalogic port map (g,p,cin,c,cout,gout,pout);
gpfa0 : gpfulladder port map (x(0),y(0),cin,g(0),p(0),s(0));
gpfa1 : gpfulladder port map (x(1),y(1),c(1),g(1),p(1),s(1));
gpfa2 : gpfulladder port map (x(2),y(2),c(2),g(2),p(2),s(2));
gpfa3 : gpfulladder port map (x(3),y(3),c(3),g(3),p(3),s(3));
end cla4;

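library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- Entity name taken from the gpfulladder instances in the carry look-ahead adder.
entity gpfulladder is
port (a,b,cin : in bit;
g,p,so : out bit);
end gpfulladder;

--}} End of automatically maintained section

architecture gpfulladder of gpfulladder is
signal p_int : bit;
begin
g <= a and b;
p_int <= a xor b;
p <= p_int;
so <= p_int xor cin;
end gpfulladder;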

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity clalogic is
port (g,p : in bit_vector (3 downto 0); ci : in bit;
c : out bit_vector (3 downto 1) ; co,go,po : out bit);

end clalogic;

--}} End of automatically maintained section

architecture clalogic of clalogic is
signal go_int,po_int : bit;
begin
c(1) <= g(0) or (p(0) and ci);
c(2) <= g(1) or (p(1) and g(0)) or (p(1) and p(0) and ci);
c(3) <= g(2) or (p(2) and g(1)) or (p(2) and p(1) and g(0)) or (p(2) and p(1) and p(0)
and ci);
po_int <= p(3) and p(2) and p(1) and p(0);
go_int <= g(3) or (p(3) and g(2)) or (p(3) and p(2) and g(1)) or (p(3) and p(2) and p(1)
and g(0));
co <= go_int or (po_int and ci);
po <= po_int;
go <= go_int;
-- enter your statements here --
end clalogic;

1.4) REDUNDANT BINARY SIGNED ADDER 
1.4.1) Introduction
In such a system, a carry-free addition can be performed, where the term "carry-free" in this
context means that the carry propagation is limited to a single digit position. In other words,
the carry propagation length is fixed irrespective of the word length. The addition consists of
two steps. In the first step, an intermediate sum $s_i$ and a carry $c_i$ are generated, based on
the operand digits $x_i$ and $y_i$ at each digit position i. This is done in parallel for all
digit positions. In the second step, the summation $z_i = s_i + c_{i-1}$ is carried out to produce
the final sum digit $z_i$. The important point is that it is always possible to select the
intermediate sum $s_i$ and carry $c_{i-1}$ such that the summation in the second step does not
generate a carry. Hence, the second step can also be executed in parallel for all the digit
positions, yielding a fixed addition time, independent of the word length.
The figure shows an example of an 8-bit redundant binary addition. In the figure, X and Y are
n-digit redundant binary integers. I-Sum and I-Cin are the intermediate sum and carry-in. The
final sum (F-Sum) is obtained by adding I-Sum and I-Cin. Note that no carry is generated in the
addition of I-Sum and I-Cin, which satisfies the carry-free condition, and that the LSB of I-Cin
is set to logic zero.

(1.10)

Figure 1.7: Signed adder cell 

If the delay of a NAND or NOR gate is taken as $t_o$, then the delay of the circuit becomes
$T_{delay} = t_o + 2t_o + 2t_o + t_o + t_o = 7t_o$ (1.11)

1.4.2) Rules For Redundant Binary Addition
Columns: augend and addend digits (x_i, y_i); digits at the next lower order position
(x_{i-1}, y_{i-1}); intermediate carry (c_i); intermediate sum (s_i).

Type  (x_i, y_i)             (x_{i-1}, y_{i-1})       c_i   s_i
1     (1, 1)                 ------------------         1     0
2     (1, 0) or (0, 1)       Both are non-negative      1    -1
                             Otherwise                  0     1
3     (0, 0)                 ------------------         0     0
4     (1, -1) or (-1, 1)     ------------------         0     0
5     (0, -1) or (-1, 0)     Both are non-negative      0    -1
                             Otherwise                 -1     1
6     (-1, -1)               ------------------        -1     0
Figure 1.8: Rules table for intermediate carry and intermediate sum
The addition of two signed-digit numbers takes place in two steps. In the first step, the
intermediate carry and intermediate sum are obtained from the above table; in the second step,
the intermediate sum and intermediate carry are added to obtain the final sum. The table is
designed such that the addition of the intermediate sum digit and the intermediate carry digit
never produces a carry.
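
The two steps can be sketched in software. The following Python model is an illustration only (the names `rbsd_add` and `rbsd_value` are chosen here, and digit lists are stored least significant digit first). It applies the six rule types of the table and, as an assumption of this sketch, treats the missing lower-order digits at position 0 as zero.

```python
from itertools import product


def rbsd_add(x, y):
    """Add two equal-length RBSD digit lists (digits -1, 0, 1; LSD first)."""
    n = len(x)
    carry = [0] * (n + 1)   # carry[i + 1] is the intermediate carry out of position i
    inter = [0] * (n + 1)   # intermediate sum digits
    for i in range(n):
        # Position 0 has no lower digits; they are treated as 0 (non-negative).
        lower_ok = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        t = x[i] + y[i]
        if t == 2:                                   # type 1
            c, s = 1, 0
        elif t == 1:                                 # type 2
            c, s = (1, -1) if lower_ok else (0, 1)
        elif t == 0:                                 # types 3 and 4
            c, s = 0, 0
        elif t == -1:                                # type 5
            c, s = (0, -1) if lower_ok else (-1, 1)
        else:                                        # type 6
            c, s = -1, 0
        carry[i + 1], inter[i] = c, s
    # Step 2: digit-wise sum, which never leaves the digit set {-1, 0, 1}.
    return [inter[i] + carry[i] for i in range(n + 1)]


def rbsd_value(digits):
    """Integer value of an RBSD digit list (LSD first)."""
    return sum(d * 2 ** i for i, d in enumerate(digits))


# Exhaustive check for all 3-digit operands.
for x in product((-1, 0, 1), repeat=3):
    for y in product((-1, 0, 1), repeat=3):
        z = rbsd_add(list(x), list(y))
        assert all(d in (-1, 0, 1) for d in z)
        assert rbsd_value(z) == rbsd_value(x) + rbsd_value(y)
```

Each output digit depends only on the operand digits at positions i, i-1 and i-2, so the work per digit does not grow with the word length.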

Figure 1.9: Steps of RBSD addition

1.4.3) VHDL code of RBSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;
port (a,b,c,d,e,f,g,h : in bit;
c2,c1,s2,s1 : out bit);
--}} End of automatically maintained section
begin
-----------------------------------------------------------------------------------------------
c2 <= (e and f and ((a and (not d)) or ((not b) and c))) or (a and f and ((b and c) or ((not d)
and g)))
or (g and (((not b) and c and f)
or ( a and (not d) and (not f)))) or (c and (((not b)and (not f) and g) or ( a and b
and (not h))))
or (a and b and c and (not f));
-----------------------------------------------------------------------------------------------
c1 <= (e and f and ((a and (not d)) or ((not b) and c))) or (a and f and ((b and c) or ((not d)
and g)))
or (g and (((not b) and c and f) or ( a and (not d) and (not f)))) or (c and (((not b)and (not
f) and g) or ( a and b and (not h))))

or (b and((a and c and (not f)) or ( (not a) and (not c) and d and f)))
or ((not a) and b and(not c) and (not h) and (d or ( not e )))
or ((not a) and b and(not c) and (not f) and (d or (not g)))
or ((not b) and (not c ) and d and (((not e) and (not h)) or ((not f) and (not g))))
or ((not e) and f and (not g ) and (((not a) and b and (not d)) or ((not b) and (not c) and
d)));
-------------------------------------------------------------------------------------------------
s2 <= (b and (not d) and (((not e) and (not h)) or ((not f) and (not g))))
or ((not b) and d and (((not e) and (not h)) or ((not f) and ( not g))))
or ((not e) and f and (not g) and (b xor d));
-------------------------------------------------------------------------------------------------
s1 <= (f and (b xor d)) or (b and (not d) and ((not h) or (not f)))
or ((not b) and d and ((not h) or (not f)))


Figure 1.10: Waveform of RBSD adder cell

Figure 1.11: VHDL simulation of RBSD adder cell

1.5) HYBRID SIGNED DIGIT ADDER 
1.5.1) Introduction
Here, instead of insisting that every digit be a signed digit, we let some of the digits be signed
and leave the others unsigned. For example, every alternate or every third or fourth digit can be
signed; all the remaining ones are unsigned. We refer to this representation as a Hybrid Signed-
Digit (HSD) representation. In the following, we show that such a representation can limit the
maximum length of carry propagation chains to any desired value. In particular, we prove that
the maximum length of a carry propagation chain equals (d + 1), where d is the longest distance
between neighboring signed digits.

Unsigned digit position Signed digit position
Figure 1.12: signed and unsigned adder cell 
In HSD for d=1 (the distance between signed digit positions) the delay is;

(1.12)
Here, the two delays of 1.5 units in parenthesis are due to the two complex gates in the lower
order signed digit cell. The last 1.5 units of delay (shown within the square brackets) is
associated with the XNOR gate at the higher order signed digit where the carry propagation
terminates. The terms in between are proportional to d since the carry ripples through all the
unsigned digit positions.

Figure 1.13: Critical path delay vs. distance between signed digits 


Figure 1.14: Transistor count vs. Distance between signed digits 

Figure 1.15: Transistor count *Delay vs. distance between signed digits 

1.5.2) VHDL code of unsigned position adder cell

library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity unsigned_new is
port(
ai_1 : in STD_LOGIC;
bi_1 : in STD_LOGIC;
vi_2 : in STD_LOGIC;
wi_2 : in STD_LOGIC;
vi_1 : out STD_LOGIC;
wi_1 : out STD_LOGIC;
ei_1 : out STD_LOGIC
);
end unsigned_new;
architecture unsigned_new of unsigned_new is
component nor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);

end component;
component and2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component not1
port(
a : in STD_LOGIC;
y : out STD_LOGIC );
end component;
component xnor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component xor2

port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component or2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
signal s1,s2,s3,s4,s5,s6,s7: STD_LOGIC;
begin
N1: not1 port map (wi_2,s1);
N2: xnor2 port map (ai_1,bi_1,s2);
N3: and2 port map (vi_2,s1,s3);
N4: or2 port map (s1,vi_2,s4);
N5: and2 port map (s2,s4,s5);
N6: or2 port map (s3,s5,vi_1);

N7: nor2 port map (ai_1,bi_1,wi_1);
N8: xor2 port map (vi_2,wi_2,s6);
N9: xor2 port map (ai_1,bi_1,s7);
N10: xor2 port map (s7,s6,ei_1);
end unsigned_new;

Figure 1.16: Waveform of HSD unsigned position adder cell

Figure 1.17: VHDL simulation of HSD unsigned position adder cell

1.5.3) VHDL code of signed position adder cell
library IEEE;
use IEEE.STD_LOGIC_1164.all;

port(
xis_c : in STD_LOGIC;
yis_c : in STD_LOGIC;
xia : in STD_LOGIC;
yia : in STD_LOGIC;
vi_1_c : in STD_LOGIC;
wi_1 : in STD_LOGIC;
vi : out STD_LOGIC;
wi : out STD_LOGIC;
zia : out STD_LOGIC;
zis_c : out STD_LOGIC
);

--}} End of automatically maintained section


component nor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component and2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component xnor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);

end component;
component nand2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component xor2
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
y : out STD_LOGIC
);
end component;
component nor3
port(
a : in STD_LOGIC;
b : in STD_LOGIC;
c : in STD_LOGIC;
y : out STD_LOGIC

);
end component;
signal s1,s2,s3,s4,s5,s6: STD_LOGIC;
begin
N1: nand2 port map (xis_c,yis_c,wi);
N2: nor2 port map (xis_c,yis_c,s1);
N3: nor2 port map (xia,yia,s2);
N4: xor2 port map (xia,yia,s3);
N5: and2 port map (s3,wi_1,s5);
N6: xor2 port map (wi_1,s3,s6);
N7: nand2 port map (s6,vi_1_c,zis_c);
N8: xnor2 port map (vi_1_c,s6,zia);
N9: nor3 port map (s1,s2,s5,vi);

-- enter your statements here --


Figure 1.18: Waveform of HSD signed position adder cell

Figure 1.19: VHDL simulation of HSD signed position adder cell

1.6) QUATERNARY SIGNED DIGIT NUMBERS
1.6.1) Introduction
QSD numbers are represented using 3-bit 2's complement notation. Each number can be
represented by:
$D = \sum_{i=0}^{n-1} x_i 4^i$ (1.13)
where $x_i$ can be any value from the set {-3, -2, -1, 0, 1, 2, 3} for producing an appropriate
decimal representation. For digital implementation, numbers with a large number of digits, such
as 64, 128 or more, can be added with constant delay. High-speed and area-efficient adders and
multipliers can be implemented using this technique.

We can achieve carry-free addition by exploiting the redundancy of QSD numbers and the QSD
addition. The redundancy allows multiple representations of any integer quantity. There are two
steps involved in the carry-free addition. The first step generates an intermediate carry and sum
from the addend and augend. The second step combines the intermediate sum of the current digit
with the carry of the lower significant digit.
To prevent the carry from rippling further, we define two rules.
1) The first rule states that the magnitude of the intermediate sum must be less than or equal
to 2, i.e. the intermediate sum lies between -2 and 2.
2) The second rule states that the magnitude of the carry must be less than or equal to 1, i.e.
the carry lies between -1 and 1.
Consequently, the magnitude of the second-step output cannot be greater than 3, which can be
represented by a single-digit QSD number; hence no further carry is required. In step 1, all
possible input pairs of the addend and augend are considered. The output ranges from -6 to 6 as
shown in Figure 1.20.

Figure 1.20: QSD representation
Both inputs and outputs can be encoded as 3-bit 2's complement binary numbers. The mapping
between the inputs (addend and augend) and the outputs (intermediate carry and sum) is shown in
binary format. Since the intermediate carry always lies between -1 and 1, it requires only a
2-bit binary representation. Finally, five 6-variable Boolean expressions can be extracted.
The intermediate carry and sum circuit is shown in Figure 1.21.

Figure 1.21: The intermediate carry and sum generator

Figure 1.22: The second step QSD adder
In step 2, the intermediate carry from the lower significant digit is added to the sum of the
current digit to produce the final result. The addition in this step produces no carry because the
current digit can always absorb the carry-in from the lower digit.

By using N cells in parallel we can build an N-digit adder. The delay of this N-digit adder is
constant and equal to the delay of a single-digit adder.

Table1.1 : The mapping between the inputs and outputs of the Intermediate carry and sum


1.6.3) VHDL code of QSD adder
library IEEE;
use IEEE.STD_LOGIC_1164.all;

port (a0,a1,a2,b0,b1,b2 : in bit;
c0,c1,s0,s1,s2 : out bit);

begin
c1 <= (a2 and b2 and(not b1)) or (a2 and (not a1) and b2) or
(a2 and b2 and (not b0))or (a2 and (not a0) and b2) or (b2 and (not a1)
and (not a0) and (not b1)) or (a2 and (not a1) and (not b1) and (not b0));

c0 <= (a2 and b2 and (not b1)) or (a2 and (not a1) and b2) or
(a2 and b2 and (not b0)) or (a2 and (not a0) and b2) or
((not a1) and (not a0) and b2 and (not b1))
or ((not a2)and a1 and (not b2) and b1) or ((not a2) and a0 and (not b2) and b1)
or((not a2) and (not b2) and b1 and b0) or ((not a2) and a1 and (not b2) and b0)
or (a2 and (not a1) and (not b1) and (not b0)) or ((not a2) and a1 and a0 and (not b2));

s2 <= ((not a1) and b2 and b0) or (a2 and (not a0) and (not b1))
or ((not a1) and a0 and b2 and (not b1)) or ((not a1) and (not a0) and b2 and b1)
or ((not a1) and a0 and b1 and (not b0)) or ((not a1) and (not a0) and b1 and b0)
or (a2 and (not a1 ) and (not b1) and b0) or (a1 and (not a0) and (not b1) and b0)
or (a2 and a1 and (not b1) and (not b0)) or (a1 and a0 and (not b1 ) and (not b0))
or (a2 and a1 and a0 and b2 and b1 and b0);

s1 <= ((not a1) and b1 and (not b0)) or ((not a1) and (not a0) and b1 )
or (a1 and (not a0) and (not b1)) or (a1 and (not b1) and (not b0))
or ( a1 and a0 and b1 and b0) or ((not a1) and a0 and (not b1) and b0);

s0 <= (a0 and (not b0)) or ((not a0) and b1 and b0) or ((not a2) and (not a0) and b0)
or ((not a0 ) and (not b2) and b0);


Figure 1.24: Waveform of QSD adder cell

Figure 1.25: VHDL simulation of QSD adder cell

On simulation in Xilinx we get the delay of 13.931ns for QSD adder.

1.6.4) Single Digit QSD Multiplier

There are generally two methods for a multiplication operation: parallel and iterative. QSD
multiplication can be implemented in both ways, requiring a QSD partial product generator and a
QSD adder as basic components. A partial product $M_i$ is the result of multiplying an n-digit
input, $A_{n-1} \ldots A_0$, by a single-digit input $B_i$, where i = 0, ..., n-1.
The primitive component of the partial product generator is a single-digit multiplication unit.
The single-digit multiplication produces M as a result and C as a carry to be combined with M of
the next digit. The range of the output is from -9 to 9, which can be represented with M and C in
QSD form. The values of M and C should lie between -2 and 2.
The mapping between the inputs A (multiplicand) and B (multiplier) and the outputs M and C is
shown in Table 1.2.

INPUT                      OUTPUT
QSD        Binary          Decimal   QSD        Binary
 A    B    A     B         Product    C    M    C     M
 3    3    011   011        9         2    1    010   001
-3   -3    101   101        9         2    1    010   001
 3    2    011   010        6         1    2    001   010
 2    3    010   011        6         1    2    001   010
-3   -2    101   110        6         1    2    001   010
-2   -3    110   101        6         1    2    001   010
 2    2    010   010        4         1    0    001   000
-2   -2    110   110        4         1    0    001   000
 3    1    011   001        3         1   -1    001   111
-3   -1    101   111        3         1   -1    001   111
 1    3    001   011        3         1   -1    001   111
-1   -3    111   101        3         1   -1    001   111
 2    1    010   001        2         0    2    000   010
-2   -1    110   111        2         0    2    000   010
 1    2    001   010        2         0    2    000   010
-1   -2    111   110        2         0    2    000   010
 1    1    001   001        1         0    1    000   001
-1   -1    111   111        1         0    1    000   001
 3    0    011   000        0         0    0    000   000
 2    0    010   000        0         0    0    000   000
 1    0    001   000        0         0    0    000   000
 0    1    000   001        0         0    0    000   000
 0    2    000   010        0         0    0    000   000
 0    3    000   011        0         0    0    000   000
 0    0    000   000        0         0    0    000   000
-3    0    101   000        0         0    0    000   000
-2    0    110   000        0         0    0    000   000
-1    0    111   000        0         0    0    000   000
 0   -1    000   111        0         0    0    000   000
 0   -2    000   110        0         0    0    000   000
 0   -3    000   101        0         0    0    000   000
 1   -1    001   111       -1         0   -1    000   111
-1    1    111   001       -1         0   -1    000   111
 2   -1    010   111       -2         0   -2    000   110
-1    2    111   010       -2         0   -2    000   110
 1   -2    001   110       -2         0   -2    000   110
-2    1    110   001       -2         0   -2    000   110
 3   -1    011   111       -3        -1    1    111   001
-1    3    111   011       -3        -1    1    111   001
-3    1    101   001       -3        -1    1    111   001
 1   -3    001   101       -3        -1    1    111   001
 2   -2    010   110       -4        -1    0    111   000
-2    2    110   010       -4        -1    0    111   000
 3   -2    011   110       -6        -1   -2    111   110
-2    3    110   011       -6        -1   -2    111   110
-3    2    101   010       -6        -1   -2    111   110
 2   -3    010   101       -6        -1   -2    111   110
 3   -3    011   101       -9        -2   -1    110   111
-3    3    101   011       -9        -2   -1    110   111

Table 1.2: The mapping between multiplicand and multiplier

1.6.5) VHDL code for single digit multiplier
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity QSD_SINGLE_DIGIT_MULT is
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);
end QSD_SINGLE_DIGIT_MULT;
--}} End of automatically maintained section
architecture QSD_SINGLE_DIGIT_MULT of QSD_SINGLE_DIGIT_MULT is
begin
c2<= (a2 and(not b2)and b0 and((not b1)nand a1)) or (a2 and(not b2)and b1 and(a1 nand
a0)) or ((not a2)and a0 and b2 and ((not a1)nand b1)) or ((not a2) and a1 and b2 and(b1 nand
b0));
c1<= c2 or ( a2 and(not a1)and b2 and(not b1)) or (a1 and a0 and (not b2)and b1 and b0);
c0<= (a1 and(a0 nor b2)and b1) or (a1 and b1 and(a2 nor b0)) or (a2 and b2 and(a1 xor b1)) or
( a2 and b2 and(a0 nor b0)) or (a2 and b1 and(a1 nor b0)) or (a1 and b2 and(a0 nor b1)) or ((a1
nor b1)and a2 and(not b2)and b0) or ((a1 nor b1)and(not a2)and a0 and b2) or (a2 and a1 and(not
b2)and b1 and b0) or ((not a2)and a1 and a0 and b2 and b1) or ((a2 nor b2)and a0 and b0 and(a1
xor b1));
m2<= (a2 and b1 and(a1 nor b2)) or (a1 and b2 and(a2 nor b1)) or (a1 and a0 and b2 and(not
b1)) or(a2 and(not a1)and b1 and b0) or (a0 and b2 and(a2 nor b0)) or (a0 and b0 and(a1 xor b1))
or (a2 and b0 and(a0 nor b2)) or (a2 and a0 and b1 and(b2 nor b0)) or (a1 and b2 and b0 and(a2
nor a0));
m1<= (a0 and b1 and(a1 nand b0)) or (a1 and b0 and(b1 nand a0));
m0<= a0 and b0;

end QSD_SINGLE_DIGIT_MULT;

Figure 1.26: Single digit QSD multiplier

On simulation of QSD single digit multiplier in Xilinx we get the delay of 11.348ns.
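
The C and M columns of Table 1.2 follow a simple arithmetic pattern: the product equals 4C + M, with the magnitude of M kept at 2 or less. The Python sketch below reproduces that pattern; it is an illustration of the mapping, not a description of the gate-level design, and the name `qsd_digit_multiply` is chosen here.

```python
def qsd_digit_multiply(a, b):
    """Return (C, M) with a * b == 4 * C + M for QSD digits a, b in -3..3."""
    p = a * b
    mag = abs(p)
    m = (mag + 1) % 4 - 1      # remainder folded into the range -1..2
    c = (mag - m) // 4
    if p < 0:                  # negative products mirror the positive ones
        c, m = -c, -m
    return c, m


# Check every digit pair against the defining relation and the digit ranges.
for a in range(-3, 4):
    for b in range(-3, 4):
        c, m = qsd_digit_multiply(a, b)
        assert 4 * c + m == a * b
        assert abs(c) <= 2 and abs(m) <= 2
```

For example, 3 × 3 = 9 is returned as C = 2, M = 1, and 3 × (-1) = -3 as C = -1, M = 1, matching the corresponding rows of the table.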


1.7) COMPARATIVE RESULT OF DIFFERENT ADDERS

Figure 1.27: Delay vs. Number of bits for addition for different adding schemes

Figure 1.28: complexity vs. number of bits for addition of different adding schemes

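[Bar chart for Figure 1.28. Vertical axis: complexity, 0 to 600. Adders: ripple carry, carry look-ahead, redundant binary, hybrid signed digit, quaternary signed digit. Series: 2 bit, 4 bit, 8 bit. Bar labels as extracted, in groups of five: 10, 13, 106, 14, 130; 20, 30, 212, 28, 260; 40, 500, 424, 56, 520.]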

CHAPTER 2
2.1) INTRODUCTION
An adaptive filter is a filter that self-adjusts its transfer function according to an optimization
algorithm driven by an error signal. Because of the complexity of the optimization algorithms,
most adaptive filters are digital filters. By way of contrast, a non-adaptive filter has a static
transfer function. Adaptive filters are required for some applications because some parameters of
the desired processing operation (for instance, the locations of reflective surfaces in a reverberant
space) are not known in advance. The adaptive filter uses feedback in the form of an error signal
to refine its transfer function to match the changing parameters.
Generally speaking, the adaptive process involves the use of a cost function, which is a criterion
for optimum performance of the filter, to feed an algorithm, which determines how to modify
filter transfer function to minimize the cost on the next iteration.
As the power of digital signal processors has increased, adaptive filters have become much more
common and are now routinely used in devices such as mobile phones and other communication
devices, camcorders and digital cameras, and medical monitoring equipment.
The block diagram, shown in the following figure, serves as a foundation for particular adaptive
filter realizations, such as Least Mean Squares (LMS) and Recursive Least Squares (RLS). The
idea behind the block diagram is that a variable filter extracts an estimate of the desired signal.

To start the discussion of the block diagram we take the following assumptions:
* The input signal is the sum of a desired signal d(n) and interfering noise v(n)
$x(n) = d(n) + v(n)$ (2.1)
* The variable filter has a Finite Impulse Response (FIR) structure. For such structures the
impulse response is equal to the filter coefficients. The coefficients for a filter of order p are
defined as
$\mathbf{w}_n = [w_n(0), w_n(1), \ldots, w_n(p)]^T$ (2.2)
* The error signal or cost function is the difference between the desired and the estimated
signal
$e(n) = d(n) - \hat{d}(n)$ (2.3)
The variable filter estimates the desired signal by convolving the input signal with the impulse
response. In vector notation this is expressed as
$\hat{d}(n) = \mathbf{w}_n * \mathbf{x}(n)$ (2.4)
where
$\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-p)]^T$ (2.5)
is an input signal vector. Moreover, the variable filter updates the filter coefficients at every
time instant
$\mathbf{w}_{n+1} = \mathbf{w}_n + \Delta\mathbf{w}_n$ (2.6)
where $\Delta\mathbf{w}_n$ is a correction factor for the filter coefficients. The adaptive
algorithm generates this correction factor based on the input and error signals. LMS and RLS
define two different coefficient update algorithms.

2.2) LEAST MEAN SQUARE ADAPTIVE FILTER 
2.2.1) Introduction
Adaptive algorithms are a mainstay of Digital Signal Processing (DSP). They are used in a
variety of applications including acoustic echo cancellation, radar guidance systems, and
wireless channel estimation, among many others.
An adaptive algorithm is used to estimate a time-varying signal. There are many adaptive
algorithms such as Recursive Least Square (RLS) and Kalman filters, but the most commonly
used is the Least Mean Square (LMS) algorithm. It is a simple but powerful algorithm that can
be implemented to take advantage of Lattice FPGA architectures. Developed by Widrow and
Hoff, the algorithm uses gradient descent to estimate a time-varying signal. The gradient
descent method finds a minimum, if it exists, by taking steps in the direction of the negative
gradient. It does so by adjusting the filter coefficients to minimize the error.
The LMS reference design consists of two main functional blocks: a FIR filter and the LMS
algorithm. The FIR filter is implemented serially using a multiplier and an adder with feedback.
The FIR result is normalized to minimize saturation. The LMS algorithm iteratively updates the
coefficients and feeds them to the FIR filter. The FIR filter then uses the updated coefficients
along with the input reference signal x(n) to generate the output y(n). The output y(n) is then
subtracted from the desired signal d(n) to generate an error, which is used by the LMS algorithm
to compute the next set of coefficients.
Figure 2.2 is a block diagram of system identification using adaptive filtering. The objective is
to change (adapt) the coefficients of an FIR filter, W, to match as closely as possible the
response of an unknown system, H. The unknown system and the adapting filter process the same
input signal x[n] and have outputs d[n] (also referred to as the desired signal) and y[n].

Figure 2.2: Least Mean Square adaptive filter

The adaptive filter, W, is adapted using the least mean-square algorithm, which is the most
widely used adaptive filtering algorithm. First the error signal, e[n], is computed as
e[n]=d[n]y[n], which measures the difference between the output of the adaptive filter and the
output of the unknown system. On the basis of this measure, the adaptive filter will change its
coefficients in an attempt to reduce the error. The coefficient update relation is a function of the
error signal squared and is given by
| | | |
| |
2
n 1 n
n
( )
h i h i
2 h i
e
+
| |
c
= +
|
|
c
\ .
(2.7)
The term inside the parentheses represents the gradient of the squared error with respect to the
i-th coefficient. The gradient is a vector pointing in the direction of the change in filter coefficients
that will cause the greatest increase in the error signal. Because the goal is to minimize the error,
however, Equation (2.7) updates the filter coefficients in the direction opposite the gradient; that is
why the gradient term is negated. The constant μ is a step size, which controls the amount of
gradient information used to update each coefficient. After repeatedly adjusting each coefficient
in the direction opposite to the gradient of the error, the adaptive filter should converge; that is,
the difference between the unknown and adaptive systems should get smaller and smaller. To
express the gradient descent coefficient update equation in a more usable manner, we can rewrite
the derivative of the squared-error term as
$$\frac{\partial |e|^2}{\partial h[i]} = 2\left(\frac{\partial e}{\partial h[i]}\right) e \qquad (2.8)$$
Or,
$$\frac{\partial |e|^2}{\partial h[i]} = 2\left(\frac{\partial (d - y)}{\partial h[i]}\right) e \qquad (2.9)$$
$$\frac{\partial |e|^2}{\partial h[i]} = 2\left(\frac{\partial \left(d - \sum_{k=0}^{N-1} h[k]\, x[n-k]\right)}{\partial h[i]}\right) e \qquad (2.10)$$

$$\frac{\partial |e|^2}{\partial h[i]} = 2\left(-x[n-i]\right) e \qquad (2.11)$$
which in turn gives us the final LMS coefficient update,
$$h_{n+1}[i] = h_n[i] + \mu\, e\, x[n-i] \qquad (2.12)$$
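
The update in Equation (2.12) can be tried out with a short Python sketch of the system identification setup. The unknown system H, the test signal and the function name lms_identify are assumptions made only for this illustration; they are not taken from the VHDL design in this report.

```python
import random


def lms_identify(x, d, num_taps, mu):
    """Adapt FIR coefficients h so that the filter output tracks d (Eq. 2.12)."""
    h = [0.0] * num_taps
    for n in range(len(x)):
        # Most recent input samples x[n], x[n-1], ... (zero before the start).
        taps = [x[n - i] if n - i >= 0 else 0.0 for i in range(num_taps)]
        y = sum(h[i] * taps[i] for i in range(num_taps))  # adaptive filter output y[n]
        e = d[n] - y                                      # error e[n] = d[n] - y[n]
        for i in range(num_taps):
            h[i] += mu * e * taps[i]                      # h_{n+1}[i] = h_n[i] + mu*e*x[n-i]
    return h


# Hypothetical unknown system and input signal, chosen only for this example.
H = [0.5, -0.3]
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(H[i] * (x[n - i] if n - i >= 0 else 0.0) for i in range(len(H)))
     for n in range(len(x))]

h = lms_identify(x, d, num_taps=2, mu=0.1)
print([round(c, 4) for c in h])
```

Because the desired signal here is generated without noise by a filter of the same length, the adapted coefficients should settle on the values in H.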

The step size μ directly affects how quickly the adaptive filter will converge toward the
unknown system. If μ is very small, then the coefficients change only a small amount at each
update, and the filter converges slowly. With a larger step size, more gradient information is
included in each update, and the filter converges more quickly; however, when the step size is
too large, the coefficients may change too quickly and the filter will diverge. (It is possible in
some cases to determine analytically the largest value of μ ensuring convergence.)
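
The effect of the step size can be seen in a small experiment. The following Python sketch is only an illustration under assumed conditions: a hypothetical noise-free 2-tap system, a uniform random input, and three arbitrary step sizes.

```python
import random


def lms_coefficient_error(mu, num_samples=200, seed=1):
    """Run LMS on a noise-free 2-tap problem; return the largest coefficient error."""
    target = [0.5, -0.3]              # hypothetical unknown system
    h = [0.0, 0.0]
    rng = random.Random(seed)
    prev = 0.0
    for _ in range(num_samples):
        cur = rng.uniform(-1.0, 1.0)
        taps = [cur, prev]            # x[n], x[n-1]
        d = target[0] * taps[0] + target[1] * taps[1]
        y = h[0] * taps[0] + h[1] * taps[1]
        e = d - y
        h = [h[i] + mu * e * taps[i] for i in range(2)]
        prev = cur
    return max(abs(h[i] - target[i]) for i in range(2))


for mu in (0.01, 0.1, 10.0):
    print(mu, lms_coefficient_error(mu))
```

With these assumptions, the smallest step size leaves a visible residual error after 200 samples, the middle one gets much closer, and the largest one makes the coefficient error grow instead of shrink.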

2.2.3) CONVERGENCE AND STABILITY 
Assume that the true filter H(n) = H is constant, and that the input signal x(n) is wide-sense
stationary. Then E{W(n)} converges to H as n → ∞ if and only if
$$0 < \mu < \frac{2}{\lambda_{\max}} \qquad (2.13)$$

where λ_max is the greatest eigenvalue of the autocorrelation matrix R. If this condition is not
fulfilled, the algorithm becomes unstable and W(n) diverges.
Maximum convergence speed is achieved when
$$\mu = \frac{2}{\lambda_{\max} + \lambda_{\min}} \qquad (2.14)$$

where λ_min is the smallest eigenvalue of the autocorrelation matrix. Given that μ is less than or
equal to this optimum, the convergence speed is determined by λ_min, with a larger value yielding
faster convergence. This means that faster convergence can be achieved when λ_max is close to
λ_min; that is, the maximum achievable convergence speed depends on the eigenvalue spread of
the autocorrelation matrix.
A white noise signal has autocorrelation matrix R = σ²I, where σ² is the variance of the signal. In
this case all eigenvalues are equal, and the eigenvalue spread is the minimum over all possible
matrices. The common interpretation of this result is therefore that the LMS converges quickly
for white input signals, and slowly for colored input signals, such as processes with low-pass or
high-pass characteristics.
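
For a 2-tap filter the eigenvalues can be written down directly, which gives a compact way to compare white and colored inputs. The sketch below assumes a real, stationary input with autocorrelation r0 at lag 0 and r1 at lag 1 (with |r1| < r0), so that R = [[r0, r1], [r1, r0]] has eigenvalues r0 + r1 and r0 - r1. The function name and the example values are hypothetical.

```python
def step_size_limits(r0, r1):
    """Eigenvalue-based LMS step-size limits for a 2-tap filter.

    R = [[r0, r1], [r1, r0]] has eigenvalues r0 + |r1| and r0 - |r1|.
    """
    lam_max = r0 + abs(r1)
    lam_min = r0 - abs(r1)
    return {
        "lambda_max": lam_max,
        "lambda_min": lam_min,
        "spread": lam_max / lam_min,               # eigenvalue spread
        "mu_upper": 2.0 / lam_max,                 # bound of Eq. (2.13)
        "mu_fastest": 2.0 / (lam_max + lam_min),   # Eq. (2.14)
    }


white = step_size_limits(r0=1.0, r1=0.0)      # unit-variance white input
colored = step_size_limits(r0=1.0, r1=0.5)    # correlated (low-pass-like) input
print(white["spread"], colored["spread"])
```

The white input has an eigenvalue spread of 1, while the correlated input in this example has a spread of 3 and a smaller λ_min, which is the situation described above as slower convergence.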
It is important to note that the above upper bound on μ only enforces stability in the mean, but the
coefficients of W(n) can still grow infinitely large, i.e. divergence of the coefficients is still
possible. A more practical bound is
$$0 < \mu < \frac{2}{\mathrm{tr}[R]} \qquad (2.15)$$

where tr[R] denotes the trace of the autocorrelation matrix. This bound guarantees that the
coefficients of W(n) do not diverge (in practice, the value of μ should not be chosen close to this
upper bound, since it is somewhat optimistic due to approximations and assumptions made in the
derivation of the bound).
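
The trace bound is convenient because, for a stationary input, every diagonal entry of R equals the input power r(0) = E[x²], so tr[R] is simply the number of taps times the input power. A minimal Python sketch, assuming the power is estimated from a block of samples (the function name and sample values are hypothetical):

```python
def trace_step_size_bound(samples, num_taps):
    """Upper bound 2 / tr[R] of Eq. (2.15), with tr[R] estimated from data."""
    power = sum(s * s for s in samples) / len(samples)  # estimate of r(0) = E[x^2]
    trace_r = num_taps * power                          # R has r(0) on all N diagonal entries
    return 2.0 / trace_r


bound = trace_step_size_bound([1.0, -1.0, 1.0, -1.0], num_taps=4)
print(bound)  # 0.5
```

As noted above, a practical step size would be chosen well below this value.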

CHAPTER 3
IMPLEMENTATION OF LMS ADAPTIVE FILTER 
3.1) INTRODUCTION
In LMS the weight vector is updated from sample to sample as follows:
$$h_{k+1} = h_k - \mu \nabla_k \qquad (3.1)$$
where h_k and ∇_k are the weight vector and the true gradient vector, respectively, at the k-th
sampling instant, and μ controls the stability and the rate of convergence.
The LMS algorithm for updating the weights from sample to sample is
$$h_{k+1} = h_k + 2\mu e_k x_k \qquad (3.2)$$
where
$$e_k = y_k - h_k^T x_k \qquad (3.3)$$

3.2) IMPLEMENTATION OF LMS ALGORITHM 

1) Initially, set each weight h_k(i), for i = 0, 1, 2, ..., N-1, to an arbitrary fixed value such
as 0.
For each subsequent sampling instant, k = 1, 2, ..., carry out steps (2) to (4) below.
2) Compute the filter output
$$n_k = \sum_{i=0}^{N-1} h_k(i)\, x_{k-i} \qquad (3.4)$$
3) Compute the error estimate
$$e_k = y_k - n_k \qquad (3.5)$$
4) Update the next filter weights
$$h_{k+1}(i) = h_k(i) + 2\mu e_k x_{k-i} \qquad (3.6)$$

The LMS algorithm requires approximately 2N+1 multiplications and 2N+1 additions for each
new set of input and output samples.
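
The multiplication count can be checked with a minimal Python sketch of steps (2) to (4). It assumes that the constant 2μ is precomputed, so forming 2μe_k costs one multiplication; the function name lms_step and the sample values are hypothetical.

```python
def lms_step(h, x_hist, y_k, mu):
    """One pass through steps (2)-(4); returns (new weights, error, multiplications).

    x_hist[i] holds x_{k-i}, i.e. the newest input sample first.
    """
    n_taps = len(h)
    mults = 0

    n_k = 0.0
    for i in range(n_taps):              # step 2: filter output, N multiplications
        n_k += h[i] * x_hist[i]
        mults += 1

    e_k = y_k - n_k                      # step 3: error estimate

    factor = (2.0 * mu) * e_k            # 2*mu treated as a constant: 1 multiplication
    mults += 1

    new_h = []
    for i in range(n_taps):              # step 4: weight update, N multiplications
        new_h.append(h[i] + factor * x_hist[i])
        mults += 1
    return new_h, e_k, mults


h = [0.0, 0.0, 0.0]                      # step 1: start from zero weights, N = 3
new_h, e_k, mults = lms_step(h, x_hist=[1.0, 0.5, -0.5], y_k=1.0, mu=0.25)
print(new_h, e_k, mults)
```

For N = 3 the sketch counts 7 multiplications, i.e. 2N+1, in line with the figure quoted above.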

3.3) FLOWCHART FOR THE LMS ADAPTIVE FILTER 

Flowchart blocks, in order of execution:
1) Initialize h_k(i) and x_{k-i}
2) Input x_k and y_k
3) Filter x_k: n_k = Σ w_k(i) x_{k-i}
4) Compute error: e_k = y_k - n_k
5) Compute factor: 2μe_k
6) Update coefficient: w_{k+1}(i) = w_k(i) + 2μ e_k x_{k-i}

3.4) IMPLEMENTATION OF DIFFERENT ORDERS LMS
3.4.1) Introduction 
The LMS algorithm is a linear adaptive filtering algorithm, which, in general, consists of two
basic processes:
1) A filtering process, which involves (a) computing the output of a linear filter in response
to an input signal and (b) generating an estimation error by comparing this output with a
desired response.
2) An adaptive process, which involves the automatic adjustment of the parameters of the
filter in accordance with the estimation error.
The combination of these two processes working together constitutes a feedback loop. First, we
have a transversal filter, around which the LMS algorithm is built; this component is responsible
for performing the filtering process. Second, we have a mechanism for performing the adaptive
control process on the tap weights of the transversal filter, hence called the adaptive
weight-control mechanism.

Figure 3.1: LMS filter

3.4.2) 1st order LMS adaptive filter
3.4.2.1) Introduction

Figure 3.2: 1st order LMS adaptive filter

d_out is the output of the transversal filter
y_n is the desired signal
e(n) is the estimation error, given as
$$e(n) = y(n) - d_{out}(n) \qquad (3.7)$$
$$w(n+1) = w(n) + 2\mu\, e(n)\, x_{in}(n) \qquad (3.8)$$
w(n+1) is the updated weight and w(n) is the previous weight

Components required for designing the 1st order LMS adaptive filter:
Number of delay elements required = 1
Number of multipliers in transversal filter = 2
Number of multipliers in adaptive weight control mechanism = 3
Number of adders in transversal filter = 1
Number of adders in adaptive weight control mechanism = 3
Here the total number of multipliers is 5 and the total number of adders is 4. The delay of the QSD
adder is 13.931 ns and the delay of the QSD multiplier is 11.348 ns, so the total delay of the 1st
order LMS adaptive filter is 5 × 11.348 + 4 × 13.931 = 112.464 ns.
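
The delay arithmetic can be written as a short Python sketch. It assumes, as the totals quoted in this report do, that every multiplier and adder contributes its full delay in series; the function and constant names are hypothetical.

```python
QSD_ADDER_DELAY_NS = 13.931        # from the Xilinx synthesis report for the QSD adder
QSD_MULTIPLIER_DELAY_NS = 11.348   # from the Xilinx synthesis report for the QSD multiplier


def total_delay_ns(num_multipliers, num_adders):
    """Sum of all block delays, assuming every block lies on one serial path."""
    return (num_multipliers * QSD_MULTIPLIER_DELAY_NS
            + num_adders * QSD_ADDER_DELAY_NS)


first_order = total_delay_ns(num_multipliers=5, num_adders=4)
print(round(first_order, 3))
```

This is a worst-case style estimate; blocks that operate in parallel would shorten the actual critical path.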

3.4.2.2) VHDL implementation of 1st order LMS adaptive filter
Here we are using μ = 0.5.

3.4.2.2.1) VHDL code for 1st order LMS adaptive filter

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity first_order_filter is
port( x2,x1,x0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
w02,w01,w00: in std_logic ;
w12,w11,w10: in std_logic;
d5,d4,d3,d2,d1,d0:inout std_logic);
end first_order_filter;

--}} End of automatically maintained section

architecture first_order_filter of first_order_filter is
component delay_unit
port(
a ,b,c: in STD_LOGIC;

d ,e,f: out STD_LOGIC

);
end component ;
port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;

port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
end component ;

component QSD_SINGLE_DIGIT_MULT
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);

end component ;
port (d5,d4,d3,d2,d1,d0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
x12,x11,x10,x22,x21,x20: in std_logic ;
w12,w11,w10,w22,w21,w20: in std_logic ;
wo12,wo11,wo10,wo22,wo21,wo20: out std_logic );
end component ;

signal xd12,xd11,xd10:std_logic ;

signal xd22,xd21,xd20 :std_logic ;
signal nk02,nk01,nk00: std_logic ;
signal nk12,nk11,nk10: std_logic ;
signal nki02,nki01,nki00: std_logic ;
signal nki12,nki11,nki10: std_logic ;
signal do4,do3: std_logic ;

signal ws02,ws01,ws00,ws12,ws11,ws10:std_logic ;
begin

delay1: delay_unit port map (x2,x1,x0,xd12,xd11,xd10);

mul1: QSD_SINGLE_DIGIT_MULT port map
(x2,x1,x0,w02,ws02,ws01,ws00,nki01,nki00,nk02,nk01,nk00);
mul2: QSD_SINGLE_DIGIT_MULT port map
(xd12,xd11,xd10,ws12,ws11,ws10,nki12,nki11,nki10,nk12,nk11,nk10);

nki02,nki01,nki00,nk02,nk01,nk00,nki12,nki11,nki10,nk12,nk11,nk10,d5,d4,d3,d2,d1,d0);

(d5,d4,d3,d2,d1,d0,y5,y4,y3,y2,y1,y0,q2,q1,q0,x2,x1,x0,xd12,xd11,xd10,
w02,w01,w00,w12,w11,w10,ws02,ws01,ws00,ws12,ws11,ws10);

-- enter your statements here --

end first_order_filter;

Figure 3.3: VHDL simulation of 1st order LMS adaptive filter
3.4.2.2.2) VHDL code for one digit QSD adder

library IEEE;
use IEEE.STD_LOGIC_1164.all;

port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);

--}} End of automatically maintained section

begin
c1 <= (a2 and b2 and(not b1)) or (a2 and (not a1) and b2) or
(a2 and b2 and (not b0))or (a2 and (not a0) and b2) or (b2 and (not a1)
and (not a0) and (not b1)) or (a2 and (not a1) and (not b1) and (not b0)) after 2 ns;

c0 <= (a2 and b2 and (not b1)) or (a2 and (not a1) and b2) or
(a2 and b2 and (not b0)) or (a2 and (not a0) and b2) or
((not a1) and (not a0) and b2 and (not b1))
or ((not a2)and a1 and (not b2) and b1) or ((not a2) and a0 and (not b2) and b1)
or((not a2) and (not b2) and b1 and b0) or ((not a2) and a1 and (not b2) and b0)
or (a2 and (not a1) and (not b1) and (not b0)) or ((not a2) and a1 and a0 and (not b2))
after 2 ns;

s2 <= ((not a1) and b2 and b0) or (a2 and (not a0) and (not b1))
or ((not a1) and a0 and b2 and (not b1)) or ((not a1) and (not a0) and b2 and b1)
or ((not a1) and a0 and b1 and (not b0)) or ((not a1) and (not a0) and b1 and b0)
or (a2 and (not a1 ) and (not b1) and b0) or (a1 and (not a0) and (not b1) and b0)
or (a2 and a1 and (not b1) and (not b0)) or (a1 and a0 and (not b1 ) and (not b0))
or (a2 and a1 and a0 and b2 and b1 and b0) after 2 ns;

s1 <= ((not a1) and b1 and (not b0)) or ((not a1) and (not a0) and b1 )
or (a1 and (not a0) and (not b1)) or (a1 and (not b1) and (not b0))
or ( a1 and a0 and b1 and b0) or ((not a1) and a0 and (not b1) and b0) after 2 ns;

s0 <= (a0 and (not b0)) or ((not a0) and b1 and b0) or ((not a2) and (not a0) and b0)
or ((not a0 ) and (not b2) and b0) after 2 ns;

-- enter your statements here --

Figure 3.4: 1 digit QSD adder

3.4.2.2.3) VHDL code for two digit QSD adder

library IEEE;
use IEEE.STD_LOGIC_1164.all;

port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
--}} End of automatically maintained section
signal ci5,ci4,ci3:std_logic ;
signal s5,s4,s3,s2,s1:std_logic ;
signal si5,si4:std_logic ;
port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;
begin
ci5<='0' ;
Figure 3.5: 2 digit QSD adder
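
The two-step carry-free addition that these adders rely on can be modelled at the digit level in Python. The sketch below is a behavioural illustration under stated assumptions (digits in the range -3 to 3, least significant digit first, the usual rule of an intermediate carry in {-1, 0, 1} and an intermediate sum in {-2, ..., 2}); it is not a transcription of the VHDL equations, and the function names are hypothetical.

```python
def qsd_add(a, b):
    """Add two QSD numbers given as equal-length digit lists, least significant first."""
    sums, carries = [], []
    # Step 1: every digit position independently produces an intermediate sum and carry.
    for da, db in zip(a, b):
        t = da + db                  # lies in -6..6
        if t >= 3:
            c, s = 1, t - 4
        elif t <= -3:
            c, s = -1, t + 4
        else:
            c, s = 0, t
        sums.append(s)               # intermediate sum in -2..2
        carries.append(c)            # intermediate carry in -1..1
    # Step 2: add the carry of the next lower digit; the result stays within -3..3,
    # so no carry has to ripple any further.
    result = []
    for i in range(len(sums)):
        carry_in = carries[i - 1] if i > 0 else 0
        result.append(sums[i] + carry_in)
    result.append(carries[-1])       # most significant output digit
    return result


def qsd_value(digits):
    """Integer value of a QSD digit list (radix 4, least significant digit first)."""
    return sum(d * 4 ** i for i, d in enumerate(digits))


z = qsd_add([3, 3], [3, 3])          # 15 + 15
print(z, qsd_value(z))               # [2, 3, 1] 30
```

Since step 2 only looks at the neighbouring lower digit, the addition time in this model does not grow with the number of digits.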

3.4.2.2.3) VHDL code for single digit multiplier

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity QSD_SINGLE_DIGIT_MULT is
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);

end QSD_SINGLE_DIGIT_MULT;
--}} End of automatically maintained section
architecture QSD_SINGLE_DIGIT_MULT of QSD_SINGLE_DIGIT_MULT is
begin
c2<= (a2 and(not b2)and b0 and((not b1)nand a1)) or (a2 and(not b2)and b1 and(a1 nand
a0)) or ((not a2)and a0 and b2 and ((not a1)nand b1)) or ((not a2) and a1 and b2 and(b1 nand
b0));
c1<= c2 or ( a2 and(not a1)and b2 and(not b1)) or (a1 and a0 and (not b2)and b1 and b0);
c0<= (a1 and(a0 nor b2)and b1) or (a1 and b1 and(a2 nor b0)) or (a2 and b2 and(a1 xor b1)) or
( a2 and b2 and(a0 nor b0)) or (a2 and b1 and(a1 nor b0)) or (a1 and b2 and(a0 nor b1)) or ((a1
nor b1)and a2 and(not b2)and b0) or ((a1 nor b1)and(not a2)and a0 and b2) or (a2 and a1 and(not
b2)and b1 and b0) or ((not a2)and a1 and a0 and b2 and b1) or ((a2 nor b2)and a0 and b0 and(a1
xor b1));
m2<= (a2 and b1 and(a1 nor b2)) or (a1 and b2 and(a2 nor b1)) or (a1 and a0 and b2 and(not
b1)) or(a2 and(not a1)and b1 and b0) or (a0 and b2 and(a2 nor b0)) or (a0 and b0 and(a1 xor b1))
or (a2 and b0 and(a0 nor b2)) or (a2 and a0 and b1 and(b2 nor b0)) or (a1 and b2 and b0 and(a2
nor a0));
m1<= (a0 and b1 and(a1 nand b0)) or (a1 and b0 and(b1 nand a0));
m0<= a0 and b0;

end QSD_SINGLE_DIGIT_MULT;

Figure 3.6: Single digit QSD multiplier

Running the QSD single-digit multiplier through Xilinx gives a delay of 11.348 ns.

3.4.2.2.4) VHDL code for complement generator of two digit QSD number

library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity complement_genrator is
port(a5,a4,a3,a2,a1,a0: in std_logic;
b5,b4,b3,b2,b1,b0: inout std_logic);
end complement_genrator;
--}} End of automatically maintained section
architecture complement_genrator of complement_genrator is
signal f5,f4,f3,f2,f1,f0: std_logic;
signal n2,n1: std_logic ;
signal n0: std_logic ;
port (a0,a1,a2,b0,b1,b2 : in std_logic;
c0,c1,s0,s1,s2 : out std_logic);
end component;
begin
n2<='0';
n1<='0';
n0<='1';
f5<='0';
-- all six input bits are listed so that a change on any of them re-evaluates the outputs
process(a5,a4,a3,a2,a1,a0)
begin
-- lower digit (a2,a1,a0): map 0,1,2,3 to 3,2,1,0
if a2='0' and a1='0' and a0='0' then
b2<='0' ; b1<='1' ; b0<='1';
end if;
if a2='0' and a1='0' and a0='1' then
b2<='0' ; b1<='1' ; b0<='0';
end if ;
if a2='0' and a1='1' and a0='0' then
b2<='0' ; b1<='0' ; b0<='1';
end if ;
if a2='0' and a1='1' and a0='1' then
b2<='0' ; b1<='0' ; b0<='0';
end if ;

-- upper digit (a5,a4,a3): same mapping
if a5='0' and a4='0' and a3='0' then
b5<='0' ; b4<='1' ; b3<='1';
end if ;
if a5='0' and a4='0' and a3='1' then
b5<='0' ; b4<='1' ; b3<='0';
end if ;
if a5='0' and a4='1' and a3='0' then
b5<='0' ; b4<='0' ; b3<='1';
end if ;
if a5='0' and a4='1' and a3='1' then
b5<='0' ; b4<='0' ; b3<='0';
end if ;

end process;

end complement_genrator;

Figure 3.7: Two digit QSD number complement generator

3.4.2.2.5) VHDL code for delay unit

library IEEE;
use IEEE.STD_LOGIC_1164.all;
entity delay_unit is
port(
a ,b,c: in STD_LOGIC;

d ,e,f: out STD_LOGIC

);
end delay_unit;

--}} End of automatically maintained section

architecture delay_unit of delay_unit is
begin
d<=a after 100 ns;
e<=b after 100 ns;
f<=c after 100 ns;

-- enter your statements here --

end delay_unit;

Figure 3.8: Delay unit
3.4.2.2.6) VHDL code for adaptive weight control mechanism

library IEEE;
use IEEE.STD_LOGIC_1164.all;

port (d5,d4,d3,d2,d1,d0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
x12,x11,x10,x22,x21,x20: in std_logic ;
w12,w11,w10,w22,w21,w20: in std_logic ;
wo12,wo11,wo10,wo22,wo21,wo20: out std_logic );

--}} End of automatically maintained section
component complement_genrator
port(a5,a4,a3,a2,a1,a0: in std_logic;
b5,b4,b3,b2,b1,b0: inout std_logic);
end component ;
port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
end component ;

component QSD_SINGLE_DIGIT_MULT
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);
end component ;

port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;

signal dc5,dc4,dc3,dc2,dc1,dc0:std_logic ;
signal e5,e4,e3,e2,e1,e0:std_logic ;
signal f5,f4,f3,f2,f1,f0:std_logic ;
signal g15,g14,g13,g12,g11,g10:std_logic ;
signal g25,g24,g23,g22,g21,g20:std_logic ;

signal wo14,wo13,wo24,wo23:std_logic ;
begin
complement: complement_genrator port map (
d5,d4,d3,d2,d1,d0,dc5,dc4,dc3,dc2,dc1,dc0);
dc5,dc4,dc3,dc2,dc1,dc0,y5,y4,y3,y2,y1,y0,e5,e4,e3,e2,e1,e0);
mul1: QSD_SINGLE_DIGIT_MULT port map (e2,e1,e0,q2,q1,q0,f5,f4,f3,f2,f1,f0);
mul21:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x12,x11,x10,g15,g14,g13,g12,g11,g10);
mul22:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x22,x21,x20,g25,g24,g23,g22,g21,g20) ;

Figure 3.9: Adaptive weight control mechanism of 1st order LMS adaptive filter

3.4.3) 2nd order LMS adaptive filter
3.4.3.1) Introduction

Figure 3.10: 2nd order LMS adaptive filter

d_out is the output of the transversal filter
y_n is the desired output
e(n) is the estimation error, given as
$$e(n) = y(n) - d_{out}(n) \qquad (3.9)$$
$$w(n+1) = w(n) + 2\mu\, e(n)\, x_{in}(n) \qquad (3.10)$$
w(n+1) is the updated weight and w(n) is the previous weight
Components required for designing the 2nd order LMS adaptive filter:
Number of delay elements required = 2
Number of multipliers in transversal filter = 3
Number of multipliers in adaptive weight control mechanism = 4
Number of adders in transversal filter = 2
Number of adders in adaptive weight control mechanism = 4

Here the total number of multipliers is 7 and the total number of adders is 6. The delay of the QSD
adder is 13.931 ns and the delay of the QSD multiplier is 11.348 ns, so the total delay of the 2nd
order LMS adaptive filter is 7 × 11.348 + 6 × 13.931 = 163.022 ns.

3.4.3.2) VHDL implementation of 2nd order LMS adaptive filter

3.4.3.2.1) VHDL code for 2nd order LMS adaptive filter
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity second_order_filter is
port( x2,x1,x0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
w02,w01,w00: in std_logic ;
w12,w11,w10: in std_logic;
w22,w21,w20: in std_logic;
d5,d4,d3,d2,d1,d0:inout std_logic);
end second_order_filter;

--}} End of automatically maintained section
architecture second_order_filter of second_order_filter is
component delay_unit
port(
a ,b,c: in STD_LOGIC;

d ,e,f: out STD_LOGIC

);
end component ;

port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;

port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
end component ;

component QSD_SINGLE_DIGIT_MULT
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);
end component ;
port (d5,d4,d3,d2,d1,d0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
x12,x11,x10,x22,x21,x20,x32,x31,x30: in std_logic ;
w12,w11,w10,w22,w21,w20,w32,w31,w30: in std_logic ;
wo12,wo11,wo10,wo22,wo21,wo20,wo32,wo31,wo30: out std_logic );
end component ;
signal xd12,xd11,xd10:std_logic ;
signal xd22,xd21,xd20 :std_logic ;
signal nk02,nk01,nk00: std_logic ;
signal nk12,nk11,nk10: std_logic ;
signal nki22,nki21,nki20,nk22,nk21,nk20:std_logic ;
signal nki02,nki01,nki00: std_logic ;
signal nki12,nki11,nki10: std_logic ;
signal do4,do3: std_logic ;
signal di5,di4,di3,di2,di1,di0:std_logic ;
signal ws02,ws01,ws00,ws12,ws11,ws10,ws22,ws21,ws20:std_logic ;
begin
delay1: delay_unit port map (x2,x1,x0,xd12,xd11,xd10);
delay2: delay_unit port map (xd12,xd11,xd10,xd22,xd21,xd20);
mul1: QSD_SINGLE_DIGIT_MULT port map
(x2,x1,x0,w02,ws02,ws01,ws00,nki01,nki00,nk02,nk01,nk00);
mul2: QSD_SINGLE_DIGIT_MULT port map
(xd12,xd11,xd10,ws12,ws11,ws10,nki12,nki11,nki10,nk12,nk11,nk10);
mul3: QSD_SINGLE_DIGIT_MULT port map
(xd22,xd21,xd20,ws22,ws21,ws20,nki22,nki21,nki20,nk22,nk21,nk20);
nki02,nki01,nki00,nk02,nk01,nk00,nki12,nki11,nki10,nk12,nk11,nk10,di5,di4,di3,di2,di1,di0);
(di5,di4,di3,di2,di1,di0,nki22,nki21,nki20,nk22,nk21,nk20,d5,d4,d3,d2,d1,d0);
(d5,d4,d3,d2,d1,d0,y5,y4,y3,y2,y1,y0,q2,q1,q0,x2,x1,x0,xd12,xd11,xd10,xd22,xd21,xd20,
w02,w01,w00,w12,w11,w10,w22,w21,w20,ws02,ws01,ws00,ws12,ws11,ws10,ws22,ws21,ws20);
-- enter your statements here --
end second_order_filter;
Figure 3.11: VHDL simulation of 2nd order LMS adaptive filter
3.4.3.2.2) VHDL code for adaptive weight control mechanism of 2nd order LMS filter

library IEEE;
use IEEE.STD_LOGIC_1164.all;

port (d5,d4,d3,d2,d1,d0:in std_logic ;
y5,y4,y3,y2,y1,y0:in std_logic ;
q2,q1,q0:in std_logic ;
x12,x11,x10,x22,x21,x20,x32,x31,x30: in std_logic ;
w12,w11,w10,w22,w21,w20,w32,w31,w30: in std_logic ;
wo12,wo11,wo10,wo22,wo21,wo20,wo32,wo31,wo30: out std_logic );

--}} End of automatically maintained section

component complement_genrator
port(a5,a4,a3,a2,a1,a0: in std_logic;
b5,b4,b3,b2,b1,b0: inout std_logic);
end component ;
port (x5,x4,x3,x2,x1,x0,y5,y4,y3,y2,y1,y0:in std_logic ;
z5,z4,z3,z2,z1,z0:out std_logic );
end component ;
component QSD_SINGLE_DIGIT_MULT
port(
a2 : in STD_LOGIC;
a1 : in STD_LOGIC;
a0 : in STD_LOGIC;
b2 : in STD_LOGIC;
b1 : in STD_LOGIC;
b0 : in STD_LOGIC;
c2 : inout STD_LOGIC;
c1 : inout STD_LOGIC;
c0 : inout STD_LOGIC;
m2 : inout STD_LOGIC;
m1 : inout STD_LOGIC;
m0 : inout STD_LOGIC
);
end component ;

port (b2,b1,b0,a2,a1,a0 : in std_logic;
c1,c0,s2,s1,s0 : out std_logic);
end component ;

signal dc5,dc4,dc3,dc2,dc1,dc0:std_logic ;
signal e5,e4,e3,e2,e1,e0:std_logic ;
signal f5,f4,f3,f2,f1,f0:std_logic ;
signal g15,g14,g13,g12,g11,g10:std_logic ;
signal g25,g24,g23,g22,g21,g20:std_logic ;
signal g35,g34,g33,g32,g31,g30:std_logic ;
signal wo14,wo13,wo34,wo33,wo24,wo23:std_logic ;
begin
complement: complement_genrator port map (
d5,d4,d3,d2,d1,d0,dc5,dc4,dc3,dc2,dc1,dc0);
dc5,dc4,dc3,dc2,dc1,dc0,y5,y4,y3,y2,y1,y0,e5,e4,e3,e2,e1,e0);
mul1: QSD_SINGLE_DIGIT_MULT port map (e2,e1,e0,q2,q1,q0,f5,f4,f3,f2,f1,f0);
mul21:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x12,x11,x10,g15,g14,g13,g12,g11,g10);
mul22:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x22,x21,x20,g25,g24,g23,g22,g21,g20) ;
mul23:QSD_SINGLE_DIGIT_MULT port map
(f2,f1,f0,x32,x31,x30,g35,g34,g33,g32,g31,g30);
-- enter your statements here --

Figure 3.12: Adaptive weight control mechanism of 2nd order LMS adaptive filter

CHAPTER 4

CONCLUSION

We have implemented 1st and 2nd order LMS adaptive filters, each consisting of a transversal
filter and an adaptive weight control mechanism. To implement these adaptive filters we have used
the non-conventional quaternary signed digit (QSD) number system, for which we have designed
and implemented addition and multiplication blocks. Using these blocks we have implemented our
adaptive filter. We have shown above that in the QSD number system the addition takes place in
parallel, so the delay is constant and does not depend on the number of bits to be added; the delay
of the QSD adder is 13.931 ns and the delay of the QSD multiplier is 11.348 ns.
The LMS algorithm requires approximately 2N+1 multiplications and 2N+1 additions for each
new set of input and output samples, where N is the order of the filter. So the delay depends upon
the number of multiplications and additions.
Here we have implemented the adaptive filter using QSD adders and multipliers. The total delay
of the 1st order LMS adaptive filter is 112.464 ns and the total delay of the 2nd order LMS adaptive
filter is 163.022 ns. So the delay is much less in comparison to the implementation of adaptive filter

APPENDIX
1. Xilinx report for QSD adder

Release 9.2i - xst J.36
--> Parameter TMPDIR set to ./xst/projnav.tmp
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s

--> Parameter xsthdpdir set to ./xst
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s

1) Synthesis Options Summary
2) HDL Compilation
3) Design Hierarchy Analysis
4) HDL Analysis
5) HDL Synthesis
5.1) HDL Synthesis Report
7) Low Level Synthesis
8) Partition Report
9) Final Report
9.1) Device utilization summary
9.2) Partition Resource Summary
9.3) TIMING REPORT
=====================================================================
====
* Synthesis Options Summary *
=====================================================================
====
---- Source Parameters
Input Format : mixed
Ignore Synthesis Constraint File : NO

---- Target Parameters
Output Format : NGC
Target Device : xc2s15-6-cs144

---- Source Options
Automatic FSM Extraction : YES
FSM Encoding Algorithm : Auto
Safe Implementation : No
FSM Style : lut
RAM Extraction : Yes
RAM Style : Auto
ROM Extraction : Yes
Mux Style : Auto
Decoder Extraction : YES
Priority Encoder Extraction : YES
Shift Register Extraction : YES
Logical Shifter Extraction : YES
XOR Collapsing : YES
ROM Style : Auto
Mux Extraction : YES
Resource Sharing : YES
Asynchronous To Synchronous : NO
Multiplier Style : lut
Automatic Register Balancing : No

---- Target Options
Global Maximum Fanout : 100
Add Generic Clock Buffer(BUFG) : 4
Register Duplication : YES
Slice Packing : YES
Optimize Instantiated Primitives : NO
Convert Tristates To Logic : Yes
Use Clock Enable : Yes
Use Synchronous Set : Yes
Use Synchronous Reset : Yes
Pack IO Registers into IOBs : auto
Equivalent register Removal : YES

---- General Options
Optimization Goal : Speed
Optimization Effort : 1
Keep Hierarchy : NO
RTL Output : Yes
Global Optimization : AllClockNets
Write Timing Constraints : NO
Cross Clock Analysis : NO
Hierarchy Separator : /
Bus Delimiter : <>
Case Specifier : maintain
Slice Utilization Ratio : 100
BRAM Utilization Ratio : 100
Verilog 2001 : YES
Auto BRAM Packing : NO
Slice Utilization Ratio Delta : 5

=====================================================================
====
=====================================================================
====
* HDL Compilation *
=====================================================================
====
Compiling vhdl file "C:/Xilinx92i/lma/adaptive.vhd" in Library work.

=====================================================================
====
* Design Hierarchy Analysis *
=====================================================================
====

=====================================================================
====
* HDL Analysis *
=====================================================================
====

=====================================================================
====
* HDL Synthesis *
=====================================================================
====

Performing bidirectional port resolution...

=====================================================================
====
HDL Synthesis Report

Found no macro
=====================================================================
====

=====================================================================
====
=====================================================================
====

=====================================================================
====

Found no macro
=====================================================================
====

=====================================================================
====
* Low Level Synthesis *
=====================================================================
====

Mapping all equations...
Building and optimizing final netlist ...
Found area constraint ratio of 100 (+ 5) on block qsdadder, actual ratio is 3.

Final Macro Processing ...

=====================================================================
====
Final Register Report
Found no macro
=====================================================================
====

=====================================================================
====
* Partition Report *
=====================================================================
====

Partition Implementation Status
-------------------------------

No Partitions were found in this design.

-------------------------------

=====================================================================
====
* Final Report *
=====================================================================
====
Final Results
RTL Top Level Output File Name : qsdadder.ngr
Top Level Output File Name : qsdadder
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO

Design Statistics
# IOs : 11

Cell Usage :
# BELS : 17
# LUT2 : 1
# LUT3 : 2
# LUT4 : 10
# MUXF5 : 3
# MUXF6 : 1
# IO Buffers : 11
# IBUF : 6
# OBUF : 5
=====================================================================
====

Device utilization summary:
---------------------------

Selected Device : 2s15cs144-6

Number of Slices: 7 out of 192 3%
Number of 4 input LUTs: 13 out of 384 3%
Number of IOs: 11
Number of bonded IOBs: 11 out of 86 12%

---------------------------
Partition Resource Summary:
---------------------------

No Partitions were found in this design.

---------------------------

=====================================================================
====
TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
------------------
No clock signals found in this design

Asynchronous Control Signals Information:
----------------------------------------
No asynchronous control signals found in this design

Timing Summary:
---------------

Minimum period: No path found
Minimum input arrival time before clock: No path found
Maximum output required time after clock: No path found
Maximum combinational path delay: 13.931ns

Timing Detail:
--------------
All values displayed in nanoseconds (ns)

=====================================================================
====
Timing constraint: Default path analysis
Total number of paths / destination ports: 59 / 5
-------------------------------------------------------------------------
Delay: 13.931ns (Levels of Logic = 6)
Data Path: a0 to c0
                            Gate     Net
Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
----------------------------------------  ------------
IBUF:I->O             10   0.776   1.980  a0_IBUF (a0_IBUF)
LUT4:I0->O             1   0.549   1.035  c138 (c1_map15)
LUT3:I2->O             1   0.549   1.035  c149 (c1_map17)
LUT4:I0->O             2   0.549   1.206  c157 (c1_OBUF)
LUT4:I3->O             1   0.549   1.035  c0 (c0_OBUF)
OBUF:I->O                  4.668          c0_OBUF (c0)
----------------------------------------
Total                     13.931ns (7.640ns logic, 6.291ns route)
                                   (54.8% logic, 45.2% route)

=====================================================================
====
CPU : 3.12 / 3.31 s | Elapsed : 3.00 / 3.00 s

-->

Total memory usage is 161748 kilobytes
Number of errors : 0 ( 0 filtered)
Number of warnings : 0 ( 0 filtered)
Number of infos : 0 ( 0 filtered)
2. Xilinx report for QSD multiplier
Release 9.2i - xst J.36
--> Parameter TMPDIR set to ./xst/projnav.tmp
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s

--> Parameter xsthdpdir set to ./xst
CPU : 0.00 / 0.16 s | Elapsed : 0.00 / 0.00 s

1) Synthesis Options Summary
2) HDL Compilation
3) Design Hierarchy Analysis
4) HDL Analysis
5) HDL Synthesis
5.1) HDL Synthesis Report
7) Low Level Synthesis
8) Partition Report
9) Final Report
9.1) Device utilization summary
9.2) Partition Resource Summary
9.3) TIMING REPORT

=====================================================================
====
* Synthesis Options Summary *
=====================================================================
====
---- Source Parameters
Input File Name : "QSD_SINGLE_DIGIT_MULT.prj"
Input Format : mixed
Ignore Synthesis Constraint File : NO

---- Target Parameters
Output File Name : "QSD_SINGLE_DIGIT_MULT"
Output Format : NGC
Target Device : xc2s15-6-cs144

---- Source Options
Top Module Name : QSD_SINGLE_DIGIT_MULT
Automatic FSM Extraction : YES
FSM Encoding Algorithm : Auto
Safe Implementation : No
FSM Style : lut
RAM Extraction : Yes
RAM Style : Auto
ROM Extraction : Yes
Mux Style : Auto
Decoder Extraction : YES
Priority Encoder Extraction : YES
Shift Register Extraction : YES
Logical Shifter Extraction : YES
XOR Collapsing : YES
ROM Style : Auto
Mux Extraction : YES
Resource Sharing : YES
Asynchronous To Synchronous : NO
Multiplier Style : lut
Automatic Register Balancing : No

---- Target Options
Global Maximum Fanout : 100
Add Generic Clock Buffer(BUFG) : 4
Register Duplication : YES
Slice Packing : YES
Optimize Instantiated Primitives : NO
Convert Tristates To Logic : Yes
Use Clock Enable : Yes
Use Synchronous Set : Yes
Use Synchronous Reset : Yes
Pack IO Registers into IOBs : auto
Equivalent register Removal : YES

---- General Options
Optimization Goal : Speed
Optimization Effort : 1
Library Search Order : QSD_SINGLE_DIGIT_MULT.lso
Keep Hierarchy : NO
RTL Output : Yes
Global Optimization : AllClockNets
Write Timing Constraints : NO
Cross Clock Analysis : NO
Hierarchy Separator : /
Bus Delimiter : <>
Case Specifier : maintain
Slice Utilization Ratio : 100
BRAM Utilization Ratio : 100
Verilog 2001 : YES
Auto BRAM Packing : NO
Slice Utilization Ratio Delta : 5

=====================================================================
====

=====================================================================
====
* HDL Compilation *
=====================================================================
====
Compiling vhdl file
"C:/Xilinx92i/QSD_SINGLE_DIGIT_MULT/QSD_SINGLE_DIGIT_MULT.vhd" in Library
work.
Entity <QSD_SINGLE_DIGIT_MULT> compiled.
Entity <QSD_SINGLE_DIGIT_MULT> (Architecture <QSD_SINGLE_DIGIT_MULT>)
compiled.

=====================================================================
====
* Design Hierarchy Analysis *
=====================================================================
====
Analyzing hierarchy for entity <QSD_SINGLE_DIGIT_MULT> in library <work> (architecture
<QSD_SINGLE_DIGIT_MULT>).

=========================================================================
* HDL Analysis *
=========================================================================
Analyzing Entity <QSD_SINGLE_DIGIT_MULT> in library <work> (Architecture <QSD_SINGLE_DIGIT_MULT>).
Entity <QSD_SINGLE_DIGIT_MULT> analyzed. Unit <QSD_SINGLE_DIGIT_MULT> generated.

=========================================================================
* HDL Synthesis *
=========================================================================

Performing bidirectional port resolution...

Synthesizing Unit <QSD_SINGLE_DIGIT_MULT>.
Related source file is "C:/Xilinx92i/QSD_SINGLE_DIGIT_MULT/QSD_SINGLE_DIGIT_MULT.vhd".
Found 1-bit xor2 for signal <m2$xor0000> created at line 55.
Unit <QSD_SINGLE_DIGIT_MULT> synthesized.

=========================================================================
HDL Synthesis Report

Macro Statistics
# Xors : 1
1-bit xor2 : 1

=========================================================================

=========================================================================
=========================================================================

=========================================================================

Macro Statistics
# Xors : 1
1-bit xor2 : 1

=========================================================================

=========================================================================
* Low Level Synthesis *
=========================================================================

Optimizing unit <QSD_SINGLE_DIGIT_MULT> ...

Mapping all equations...
Building and optimizing final netlist ...
Found area constraint ratio of 100 (+ 5) on block QSD_SINGLE_DIGIT_MULT, actual ratio is 4.

Final Macro Processing ...

=========================================================================
Final Register Report

Found no macro
=========================================================================

=========================================================================
* Partition Report *
=========================================================================

Partition Implementation Status
-------------------------------

No Partitions were found in this design.

-------------------------------

=========================================================================
* Final Report *
=========================================================================
Final Results
RTL Top Level Output File Name : QSD_SINGLE_DIGIT_MULT.ngr
Top Level Output File Name : QSD_SINGLE_DIGIT_MULT
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO

Design Statistics
# IOs : 12

Cell Usage :
# BELS : 23
# LUT2 : 1
# LUT3 : 1
# LUT4 : 14
# MUXF5 : 5
# MUXF6 : 2
# IO Buffers : 12
# IBUF : 6
# OBUF : 6
=========================================================================

Device utilization summary:
---------------------------

Selected Device : 2s15cs144-6

Number of Slices: 8 out of 192 4%
Number of 4 input LUTs: 16 out of 384 4%
Number of IOs: 12
Number of bonded IOBs: 12 out of 86 13%

---------------------------
Partition Resource Summary:
---------------------------

No Partitions were found in this design.

---------------------------

=========================================================================
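
The resource figures above can be cross-checked by hand. The following is a minimal sketch that re-adds the counts printed under "Cell Usage" and recomputes the percentages in the "Device utilization summary". All counts and device capacities are copied from the report. The helper name `utilization_percent` and the use of truncation (rather than rounding) are assumptions made for illustration; truncation is chosen only because it reproduces the 13% figure printed for the bonded IOBs.

```python
# Counts copied from the "Cell Usage" section of the report.
cell_usage = {"LUT2": 1, "LUT3": 1, "LUT4": 14, "MUXF5": 5, "MUXF6": 2}
io_buffers = {"IBUF": 6, "OBUF": 6}

# Used / available resources as printed for the 2s15cs144-6 device.
used = {"slices": 8, "lut4": 16, "bonded_iobs": 12}
available = {"slices": 192, "lut4": 384, "bonded_iobs": 86}


def utilization_percent(used_count, total):
    """Whole-number percentage, truncated (assumed to match the report)."""
    return (100 * used_count) // total


# BELS is the sum of all basic logic elements listed under it.
bels = sum(cell_usage.values())
# Every LUT2/LUT3/LUT4 occupies one 4-input LUT on the device.
luts = sum(n for name, n in cell_usage.items() if name.startswith("LUT"))
ios = sum(io_buffers.values())

print("BELS:", bels)
print("4-input LUTs:", luts)
print("IO buffers:", ios)
for name in used:
    pct = utilization_percent(used[name], available[name])
    print(f"{name}: {used[name]} out of {available[name]} = {pct}%")
```

Under these assumptions the script gives 23 BELS, 16 four-input LUTs and 12 IO buffers, with utilizations of 4%, 4% and 13%, in agreement with the report.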

TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
------------------
No clock signals found in this design

Asynchronous Control Signals Information:
----------------------------------------
No asynchronous control signals found in this design

Timing Summary:
---------------

Minimum period: No path found
Minimum input arrival time before clock: No path found
Maximum output required time after clock: No path found
Maximum combinational path delay: 11.348ns

Timing Detail:
--------------
All values displayed in nanoseconds (ns)

=========================================================================
Timing constraint: Default path analysis
Total number of paths / destination ports: 71 / 6
-------------------------------------------------------------------------
Delay: 11.348ns (Levels of Logic = 5)

Data Path: a1 to c1
                                Gate     Net
    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
    ----------------------------------------  ------------
     IBUF:I->O            13   0.776   2.250  a1_IBUF (a1_IBUF)
     LUT4:I1->O            2   0.549   1.206  c2_SW0 (N21)
     LUT3:I0->O            1   0.549   0.000  c1_F (N41)
     MUXF5:I0->O           1   0.315   1.035  c1 (c1_OBUF)
     OBUF:I->O                 4.668          c1_OBUF (c1)
    ----------------------------------------
    Total                     11.348ns (6.857ns logic, 4.491ns route)
                                       (60.4% logic, 39.6% route)

=========================================================================
CPU : 3.03 / 3.21 s | Elapsed : 3.00 / 3.00 s

-->

Total memory usage is 162324 kilobytes

Number of errors : 0 ( 0 filtered)
Number of warnings : 0 ( 0 filtered)
Number of infos : 0 ( 0 filtered)
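
The 11.348 ns combinational delay reported for the path from a1 to c1 is the sum of the gate and net delays of the five cells on that path. The short sketch below re-adds those figures. The delay values are copied from the "Timing Detail" table of the report; the variable names are illustrative only, and the missing net delay after the OBUF is treated as zero, which is an assumption.

```python
# (cell, gate delay in ns, net delay in ns), copied from the
# "Data Path: a1 to c1" table of the synthesis report.
path = [
    ("IBUF", 0.776, 2.250),
    ("LUT4", 0.549, 1.206),
    ("LUT3", 0.549, 0.000),
    ("MUXF5", 0.315, 1.035),
    ("OBUF", 4.668, 0.000),  # no net delay is listed after the output buffer
]

logic_ns = sum(gate for _, gate, _ in path)   # delay spent inside cells
route_ns = sum(net for _, _, net in path)     # delay spent on interconnect
total_ns = logic_ns + route_ns

logic_pct = 100 * logic_ns / total_ns
route_pct = 100 * route_ns / total_ns

print(f"levels of logic: {len(path)}")
print(f"total: {total_ns:.3f} ns "
      f"({logic_ns:.3f} ns logic, {route_ns:.3f} ns route)")
print(f"share: {logic_pct:.1f}% logic, {route_pct:.1f}% route")
```

The sums come to 6.857 ns of logic delay and 4.491 ns of routing delay, that is 60.4% and 39.6% of the 11.348 ns total, matching the report. The OBUF alone accounts for 4.668 ns, so a large part of this figure is pad delay rather than multiplier logic, and, as the report itself notes, these numbers are only a synthesis estimate made before place and route.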

