



Table of Contents

1.1 Fast Fourier Transform
1.2 Low Power
1.3 High-energy
1.4 High Performance
1.5 Suitability of VLSI
1.6 Generation of OFDM
1.7 Complete model of OFDM
1.7.1 Modulation of Data
1.7.2 Guard Duration
1.7.3 Channel and Receiver
1.7.4 Concept of Orthogonality
1.7.5 Principle of Space Diversity
1.8 FFT Algorithm Radix 2
1.9 FFT Algorithm Radix 4
1.10 FFT Algorithm Split Radix
1.11 Pipelined FFT Algorithm
1.12 Outline of the Thesis





The Fourier transform is an indispensable tool in signal processing, particularly for applications in Orthogonal Frequency Division Multiplexing (OFDM) systems. The Discrete Fourier Transform (DFT) decomposes a sequence of values into its frequency components. The Fast Fourier Transform (FFT) is an efficient technique for computing the DFT. The FFT algorithm was devised by Cooley and Tukey to reduce the complexity of the computation, both in time and in number of operations. FFT hardware can be implemented in two classes of architecture: memory architecture and pipeline architecture.

The memory architecture comprises a single processing element and several memory units. Its merits are low power and low cost compared with other styles; its demerits are higher latency and lower throughput. The pipeline architecture overcomes these demerits at the expense of an acceptable amount of extra hardware. The types of pipeline architecture include Single Delay Feedback (SDF), Single Delay Commutator (SDC) and Multiple Delay Commutator (MDC). The pipeline architecture has a regular structure that is easy to describe in a hardware description language. In recent years, communication systems have needed to transmit high-quality voice and video signals efficiently. Hence there is a strong demand for a fast and reliable communication technology, and OFDM is one reliable option for meeting this demand.


1.1 Fast Fourier Transform

FFT algorithms can be roughly grouped into fixed-radix, mixed-radix and split-radix algorithms. The two basic categories are decimation-in-frequency (DIF) and decimation-in-time (DIT). Both successively decompose the transform of an N-point sequence into transforms of shorter subsequences, and there is no major difference between them in computational complexity. Generally, DIT deals with the input and output in bit-reversed order and normal order respectively, while DIF deals with the input and output in normal order and bit-reversed order

respectively. Only Decimation-in-frequency (DIF) algorithm will be taken into



1.2 Low Power

The DFT is used routinely in signal and image processing applications. DFTs are generally computed with the Fast Fourier Transform, which covers a family of algorithms. The FFT finds application in areas such as communication, signal processing, image processing and biomedical instrumentation. As semiconductor technologies continue to advance, performance increases with them. Unfortunately, the power consumed by processors in these technologies also continues to increase. As a result, important applications of the FFT are now restricted by the available power budget, a limitation made more pressing by the growing number of portable applications. Until the last decade, low-power electronics received attention only in specific applications such as small battery-powered devices. Two major developments then changed low-power electronics with respect to performance and size. The first was the evolution of CMOS, which produced components combining high performance with low energy dissipation. The second was the huge increase in demand for portable devices such as laptops and mobile phones. Low-power design is therefore needed for two reasons: low power consumption because of limited energy budgets, and low power dissipation because of cooling costs.

These portable circuits are built in CMOS because of its high performance, efficiency and density. Other technologies, such as BiCMOS and gallium arsenide, offer better performance, but their voltage or current requirements make them less suitable for low-power applications.


1.3 High-energy

Power is an important design objective because the power consumed by a processor can be reduced substantially by decreasing the overhead of the design. Energy measures the cost per unit of work completed, and hence it is also a useful design objective.


1.4 High Performance

High-performance data processing is an important criterion for all applications of DSP processors. Both regular processors and signal processors need to produce their results with low latency. Hence, this work emphasizes high performance, at some compromise in latency.


1.5 Suitability of VLSI

The signal and image processing literature contains a large number of FFT algorithms. These algorithms are generally evaluated by their number of multiplications and additions. This work points out that, in addition to the operation count, factors such as complexity and regularity also matter for the successful implementation of an FFT processor.


Simple designs take less effort to implement and contain fewer errors than regular designs. Hence simple designs reduce the time to market, and the design time saved can be spent on simulating the key design parameters. Regular designs are generally constructed from a small number of basic building blocks, and they enjoy most of the benefits of simple designs.

Orthogonal Frequency Division Multiplexing (OFDM) is a modulation technique in which a large number of subcarriers carry the data. The data is split into several channels, one for each subcarrier. Every subcarrier is modulated with a conventional modulation scheme at a lower symbol rate with respect to the same


Orthogonal Frequency Division Multiplexing is a popular scheme for digital communication applications such as networking and audio and video broadcasting.


Orthogonality among the subcarriers of the baseband signal is maintained by carefully controlling the relationship between them. Hence an OFDM signal is generated by first choosing the required spectrum, based on the input data and the modulation method. Each carrier is first assigned some data to transmit. The amplitude and phase of the carrier are then computed according to the modulation method. The inverse FFT (IFFT) converts this frequency spectrum into the corresponding time-domain signal. The IFFT performs this transformation efficiently and keeps the carriers orthogonal.
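
The generation step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `idft` and `dft` and the eight example carrier values are hypothetical, and the transforms are written as direct summations for clarity, whereas a real transmitter would use an IFFT to get the same result with far fewer operations.

```python
import cmath

def idft(spectrum):
    """Inverse DFT by direct summation; an IFFT computes the same values faster."""
    n_points = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * n / n_points)
            for k in range(n_points)) / n_points
        for n in range(n_points)
    ]

def dft(samples):
    """Forward DFT by direct summation, used here as the receiver-side check."""
    n_points = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]

# One unit-amplitude symbol per carrier, with phases of 0, 90, 180 or 270 degrees
# (example data, not taken from the thesis).
carriers = [1, 1j, -1, -1j, 1, -1, 1j, -1j]

time_signal = idft(carriers)   # frequency spectrum -> time-domain OFDM symbol
recovered = dft(time_signal)   # the forward transform returns each carrier value

worst_error = max(abs(a - b) for a, b in zip(carriers, recovered))
print(worst_error)  # only floating-point rounding error is expected here
```

Because the carriers are orthogonal over the symbol, each bin is recovered independently of the others, which is why the round trip returns the original carrier values.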


Fig. 1.1 OFDM Transceiver

The Fast Fourier Transform (FFT) converts a time-domain signal into its frequency-domain representation by expressing it as a sum of orthogonal sinusoidal components. The amplitude and phase of these components represent the spectrum of the time-domain signal. The inverse FFT (IFFT) performs the reverse process, converting the spectrum back into a time-domain signal. Each data point in the spectrum is called a bin.

To generate the OFDM carriers, the amplitude and phase of each bin are set and the IFFT is applied. Because each bin corresponds to one of a set of orthogonal sinusoids, the reverse process keeps the carriers orthogonal. Figure 1.1 shows the basic schematic of an OFDM transceiver. To generate the radio-frequency signal, the baseband signal is mixed up to the required transmission frequency.


1.7 Complete model of OFDM

The block diagram of a complete OFDM system is given in Figure 1.2, and the model is described below.



Fig. 1.2 Block Diagram of a complete OFDM

Orthogonality allows the modulator and demodulator to be implemented efficiently, using the inverse FFT on the sender side and the FFT on the receiver side. Although the merits of OFDM were already known in earlier decades, it is an emerging scheme in wideband communications today because the FFT can now be computed efficiently.

1.7.1 Modulation of Data

First, the serial input data is converted into parallel form. The parallel data is then differentially encoded with respect to the previous symbols and mapped into a phase shift keying (PSK) format. Because differential encoding needs an initial phase reference, an extra symbol is added at the start of the transmission. The data on each symbol is mapped to a phase angle according to the modulation method used; for four-phase PSK, the phase angles are 0, 90, 180 and 270 degrees. PSK is chosen for transmission because it produces a constant-amplitude signal, which reduces the fluctuations caused by fading. Once the required spectrum has been formed, the corresponding time waveform is obtained using the IFFT. The guard interval is then added to the start of every symbol.
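
As an illustration of the differential encoding and four-phase PSK mapping described above, the following sketch encodes the data of a single carrier. It is a minimal example under stated assumptions: the function names `dqpsk_encode` and `dqpsk_decode` are hypothetical, each data value is taken to be an integer from 0 to 3, and a value v is assumed to advance the carrier phase by v × 90 degrees.

```python
import cmath

QUARTER_TURN = cmath.pi / 2  # 90 degrees: phases used are 0, 90, 180, 270

def dqpsk_encode(values, reference_index=0):
    """Differentially encode values 0..3 as phase steps of 90 degrees.

    The returned list starts with the extra reference symbol, so it is one
    symbol longer than the input.
    """
    index = reference_index
    symbols = [cmath.exp(1j * QUARTER_TURN * index)]
    for value in values:
        index = (index + value) % 4          # phase change carries the data
        symbols.append(cmath.exp(1j * QUARTER_TURN * index))
    return symbols

def dqpsk_decode(symbols):
    """Recover the values from the phase difference of consecutive symbols."""
    values = []
    for previous, current in zip(symbols, symbols[1:]):
        step = cmath.phase(current * previous.conjugate())
        values.append(round(step / QUARTER_TURN) % 4)
    return values

data = [0, 1, 2, 3, 3, 2, 1, 0]
sent = dqpsk_encode(data)
print(dqpsk_decode(sent) == data)
print(all(abs(abs(s) - 1.0) < 1e-12 for s in sent))  # constant amplitude
```

Since only phase differences carry data, a constant phase rotation applied to every symbol by the channel does not change the decoded values, which is the usual motivation for differential encoding.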

1.7.2 Guard Duration

The guard period normally consists of two halves: a zero-amplitude transmission and a cyclic extension of the symbol. This split allows the symbol timing to be recovered by envelope detection. The guard interval is inserted between symbols to eliminate inter-symbol interference. Once the guard period has been added, the symbols are converted back into serial form, which gives the baseband signal for transmission.
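
The two-part guard period can be sketched as follows. This is an assumption-laden illustration, not the thesis implementation: the helper names `add_guard` and `remove_guard` are hypothetical, the guard is placed in front of the symbol, and the cyclic extension is assumed to be a copy of the last samples of the symbol.

```python
def add_guard(symbol, guard_length):
    """Prefix a symbol with a guard: zero-amplitude half, then cyclic extension."""
    zero_part = guard_length // 2
    cyclic_part = guard_length - zero_part
    if cyclic_part > len(symbol):
        raise ValueError("cyclic extension cannot be longer than the symbol")
    # Copy of the symbol tail (empty when cyclic_part is 0).
    tail = list(symbol[len(symbol) - cyclic_part:])
    return [0j] * zero_part + tail + list(symbol)

def remove_guard(guarded_symbol, guard_length):
    """Receiver side: drop the guard samples and keep the useful symbol."""
    return list(guarded_symbol[guard_length:])

symbol = [1, 2, 3, 4, 5, 6, 7, 8]
guarded = add_guard(symbol, guard_length=4)
print(guarded)                       # two zeros, then 7, 8, then the symbol
print(remove_guard(guarded, 4) == symbol)
```

The run of zeros produces a dip in the signal envelope at every symbol boundary, which is what an envelope detector can lock onto for timing recovery.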

1.7.3 Channel and Receiver

The transmitted signal then passes through a channel model, which applies impairments such as noise and attenuation. The signal-to-noise ratio is set by adding a known level of white noise to the transmitted signal. Multipath delay spread is simulated with a finite impulse response (FIR) filter: the filter length represents the delay spread, and the coefficient amplitudes represent the magnitudes of the reflected signals.

The receiver performs the reverse of the transmitter operations. The guard period is removed, and the FFT of each symbol is taken to obtain the spectrum of the baseband signal. The phase angle of each carrier is then evaluated and demodulated into the corresponding data word. The original data is recovered by combining those data words.
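
A channel model of this kind might be sketched as below. The names `multipath_channel` and `add_white_noise`, the tap values and the 10 dB example are illustrative assumptions; the noise is assumed to be complex Gaussian with equal power in the real and imaginary parts.

```python
import random

def multipath_channel(signal, taps):
    """FIR filter: taps[d] is the amplitude of the path delayed by d samples."""
    output = [0j] * (len(signal) + len(taps) - 1)
    for n, sample in enumerate(signal):
        for delay, gain in enumerate(taps):
            output[n + delay] += gain * sample
    return output

def add_white_noise(signal, snr_db, rng):
    """Add complex white Gaussian noise so that the mean SNR equals snr_db."""
    signal_power = sum(abs(s) ** 2 for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = (noise_power / 2) ** 0.5   # standard deviation per real/imag part
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in signal]

# Direct path plus one reflection at half amplitude, delayed by two samples.
echoed = multipath_channel([1, 0, 0, 0], taps=[1, 0, 0.5])
print(echoed)

rng = random.Random(0)             # fixed seed so the run is repeatable
clean = [1 + 0j] * 20000
noisy = add_white_noise(clean, snr_db=10, rng=rng)
measured = sum(abs(a - b) ** 2 for a, b in zip(noisy, clean)) / len(clean)
print(measured)                    # should be close to 0.1 for a 10 dB SNR
```

The filter length (three taps here) sets the longest echo delay the model can represent, and each coefficient sets the strength of one reflection.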


1.7.4 Concept of Orthogonality

The sub-carrier frequencies are chosen to be orthogonal to each other, which eliminates crosstalk between the channels and removes the need for guard bands between carriers. This reduces the complexity at both the transmitting and receiving ends, since a separate filter for every channel is not needed. Orthogonality requires a sub-carrier spacing of $F = n / T_K$ hertz, where $T_K$ is the symbol duration in seconds and $n$ is a positive integer, typically equal to 1. With $M$ carriers, the passband bandwidth is therefore $BW \approx M \cdot F$ (Hz). Orthogonality also gives high spectral efficiency, with a symbol rate close to the Nyquist rate for the equivalent baseband signal.
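
The spacing condition can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, the carriers are sampled at 64 points per symbol, and the 1 ms symbol duration and 64 carriers in the bandwidth example are made-up values.

```python
import cmath

def carrier_inner_product(m, k, samples_per_symbol):
    """Normalized inner product, over one symbol, of carriers at m/T_K and k/T_K."""
    total = sum(
        cmath.exp(2j * cmath.pi * m * n / samples_per_symbol)
        * cmath.exp(-2j * cmath.pi * k * n / samples_per_symbol)
        for n in range(samples_per_symbol)
    )
    return total / samples_per_symbol

def passband_bandwidth_hz(num_carriers, symbol_duration_s, n=1):
    """BW = M * F with sub-carrier spacing F = n / T_K."""
    spacing_hz = n / symbol_duration_s
    return num_carriers * spacing_hz

print(abs(carrier_inner_product(3, 3, 64)))   # same carrier: magnitude near 1
print(abs(carrier_inner_product(3, 5, 64)))   # different carriers: near 0
print(passband_bandwidth_hz(num_carriers=64, symbol_duration_s=1e-3))
```

Carriers separated by an integer multiple of 1/T_K complete a whole number of extra cycles within the symbol, so their product averages to zero over the symbol duration.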

OFDM requires the receiver and the transmitter to be synchronized in frequency to avoid interference between the carriers. Frequency offsets are generally caused by mismatch between the transmitter and receiver oscillators, or by Doppler shift due to movement. The situation worsens when Doppler shift is combined with multipath, because reflections then appear at various frequency offsets. This restricts the use of OFDM in high-speed vehicles. The traditional methods of suppressing inter-carrier interference are not well suited to this case, and they may increase the complexity of the receiver. Orthogonality allows the modulator and demodulator to be implemented efficiently, using the inverse FFT on the sender side and the FFT on the receiver side.

1.7.5 Principle of Space Diversity

Receivers can benefit from receiving signals from several transmitters simultaneously, which improves the broadcast coverage area. This is the basis of single-frequency networks (SFN), in which several transmitters send the same signal simultaneously on the same channel frequency. An SFN uses the spectrum more efficiently than a multi-frequency network (MFN). The merits of an SFN over an MFN include higher gain and a larger coverage area, because the signal strength at the receiver increases. Some OFDM systems employ a longer guard interval so that the transmitters can be spaced farther apart, forming a larger SFN.

In an SFN, the maximum distance between transmitters is determined by the distance the signal travels during the guard interval. For example, a guard period of 100 microseconds allows the transmitters to be spaced up to 30 km apart. OFDM systems can also be combined with MIMO channels and antenna arrays, as in the IEEE 802.11n standard.
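
The 30 km figure follows from multiplying the guard interval by the propagation speed. A minimal sketch, assuming free-space propagation at the speed of light (the function name is hypothetical):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # free-space propagation speed

def max_transmitter_spacing_km(guard_interval_s):
    """Distance a signal travels during the guard interval, in kilometres."""
    return SPEED_OF_LIGHT_M_PER_S * guard_interval_s / 1000.0

spacing_km = max_transmitter_spacing_km(100e-6)
print(round(spacing_km, 1))  # about 30 km for a 100 microsecond guard period
```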


1.8 FFT Algorithm Radix 2

The FFT algorithm devised by Cooley and Tukey has two variants: the decimation-in-time (DIT) algorithm and the decimation-in-frequency (DIF) algorithm. The DFT of an N-point sequence can be computed by a divide-and-conquer approach. The input sequence x(n) of length N is split into its even-indexed and odd-indexed samples, and the corresponding subsequences f1(n) and f2(n) are given by

$$
\begin{aligned}
f_1(n) &= x(2n) \\
f_2(n) &= x(2n+1), \qquad n = 0, 1, \ldots, \frac{N}{2} - 1
\end{aligned}
\tag{1.1}
$$

The subsequences f1(n) and f2(n) are obtained by decimating the input sequence by a factor of 2, which is why this form of the FFT is called the decimation-in-time algorithm. The DFT of the N-point sequence is then given by

$$
\begin{aligned}
X(k) &= \sum_{m=0}^{(N/2)-1} f_1(m)\, W_{N/2}^{km} + W_N^{k} \sum_{m=0}^{(N/2)-1} f_2(m)\, W_{N/2}^{km} \\
&= F_1(k) + W_N^{k} F_2(k), \qquad k = 0, 1, \ldots, N-1
\end{aligned}
\tag{1.2}
$$

where F1(k) and F2(k) are the N/2-point DFTs of the subsequences f1(n) and f2(n) respectively. This decomposition is applied recursively until the DFT is expressed as a set of 2-point DFTs. The butterfly element is shown in Figure 1.3, and the computation of an 8-point DFT is shown in Figure 1.4.

Fig. 1.3 Computation of Butterfly Using Radix -2

Fig. 1.4 Computation of 8-point DFT
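
The recursive even/odd decomposition can be sketched in software as follows. This is a minimal illustration rather than a hardware description: the function names are hypothetical, the twiddle factor is assumed to be W_N = exp(-2πi/N), and a direct O(N²) DFT is included only as a reference for checking the result.

```python
import cmath

def dft_direct(x):
    """Reference N-point DFT by direct summation, O(N^2)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def fft_radix2_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    f1 = fft_radix2_dit(x[0::2])   # F1(k): DFT of even-indexed samples x(2n)
    f2 = fft_radix2_dit(x[1::2])   # F2(k): DFT of odd-indexed samples x(2n+1)
    half = n // 2
    out = [0j] * n
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * f2[k]   # W_N^k * F2(k)
        out[k] = f1[k] + t           # butterfly upper output X(k)
        out[k + half] = f1[k] - t    # lower output X(k + N/2): twiddle changes sign
    return out

samples = [complex(i % 3, -i) for i in range(8)]
fast = fft_radix2_dit(samples)
slow = dft_direct(samples)
print(max(abs(a - b) for a, b in zip(fast, slow)))  # rounding-level difference
```

Each pass of the loop is one radix-2 butterfly: one complex multiplication by the twiddle factor followed by one addition and one subtraction.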



1.9 FFT Algorithm Radix 4

The Radix-4 algorithm recursively decomposes an N-point DFT into four smaller DFTs of length N/4, whose outputs are reused to compute several other outputs, which reduces the cost of computation. It groups every fourth output sample into shorter DFTs to reduce the operation count, and it needs fewer multipliers than Radix-2. The input data is split into four subsequences x(4n + i), where n = 0, 1, ..., N/4 - 1 and i = 0, 1, 2, 3.

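$$
\begin{aligned}
X(p, q) &= \sum_{l=0}^{3} \left[ W_N^{lq}\, F(l, q) \right] W_4^{lp} \\
F(l, q) &= \sum_{m=0}^{(N/4)-1} x(l, m)\, W_{N/4}^{mq} \\
p &= 0, 1, 2, 3; \quad l = 0, 1, 2, 3; \quad q = 0, 1, 2, \ldots, \frac{N}{4} - 1 \\
x(l, m) &= x(4m + l) \\
X(p, q) &= X\!\left(\frac{N}{4}\, p + q\right)
\end{aligned}
\tag{1.3}
$$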

The four N/4-point DFTs F(l, q) obtained from equation (1.3) are combined to give the N-point DFT. The butterfly element is shown in Figure 1.5 (a), and its compact form for Radix-4 is shown in Figure 1.5 (b). This butterfly element involves 12 complex additions and 3 complex multiplications.


Fig. 1.5 (a) Butterfly element for Radix- 4

Fig. 1.5 (b) Compact form of Butterfly for Radix- 4
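
A software sketch of the Radix-4 decomposition in equation (1.3) is given below. It is an illustration under stated assumptions: the function names are hypothetical, W_N is taken as exp(-2πi/N) so that W_4 = -j, and the length must be a power of four.

```python
import cmath

def dft_direct(x):
    """Reference N-point DFT by direct summation, O(N^2)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def fft_radix4_dit(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of four."""
    n = len(x)
    if n == 1:
        return list(x)
    if n < 1 or n % 4:
        raise ValueError("length must be a power of four")
    quarter = n // 4
    # F(l, q): N/4-point DFTs of the subsequences x(l, m) = x(4m + l)
    sub = [fft_radix4_dit(x[l::4]) for l in range(4)]
    out = [0j] * n
    for q in range(quarter):
        # Twiddled inputs W_N^(lq) * F(l, q); the l = 0 factor is 1, so only
        # three of these are genuine complex multiplications.
        t = [cmath.exp(-2j * cmath.pi * l * q / n) * sub[l][q] for l in range(4)]
        # 4-point DFT with W_4 = -j gives X(p, q) = X((N/4) p + q), p = 0..3
        out[q] = t[0] + t[1] + t[2] + t[3]
        out[q + quarter] = t[0] - 1j * t[1] - t[2] + 1j * t[3]
        out[q + 2 * quarter] = t[0] - t[1] + t[2] - t[3]
        out[q + 3 * quarter] = t[0] + 1j * t[1] - t[2] - 1j * t[3]
    return out

samples = [complex(i % 5, 2 - i) for i in range(16)]
fast = fft_radix4_dit(samples)
slow = dft_direct(samples)
print(max(abs(a - b) for a, b in zip(fast, slow)))  # rounding-level difference
```

Multiplications by ±j inside the 4-point DFT only swap and negate real and imaginary parts, so in hardware they cost no multiplier; the code writes them as ordinary products for readability.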


1.10 FFT Algorithm Split Radix

The split-radix FFT is another algorithm for computing the Discrete Fourier Transform. It is a variant of the Cooley-Tukey algorithm that mixes Radix-2 and Radix-4 decompositions: an N-point DFT is decomposed into one DFT of length N/2 and two DFTs of length N/4. It has one of the lowest arithmetic operation counts needed to compute the DFT. The operation count is one of the factors that determine the time needed to compute a DFT on a computer. The split-radix algorithm is applicable only when N is a multiple of 4. The DFT is given by equation (1.4).

$$
X_k = \sum_{n=0}^{N-1} x_n\, w_N^{nk}
\tag{1.4}
$$

where k ranges from 0 to N - 1 and $w_N = e^{-2\pi i / N}$. The split-radix algorithm rewrites equation (1.4) as three summations: one over the even indices, and two that together cover the odd indices, as given by equation (1.5).

$$
X_k = \sum_{n_2=0}^{N/2-1} x_{2n_2}\, w_{N/2}^{n_2 k}
+ w_N^{k} \sum_{n_4=0}^{N/4-1} x_{4n_4+1}\, w_{N/4}^{n_4 k}
+ w_N^{3k} \sum_{n_4=0}^{N/4-1} x_{4n_4+3}\, w_{N/4}^{n_4 k}
\tag{1.5}
$$

In equation (1.5), the first summation is the Radix-2 component (even indices) and the remaining two summations are the Radix-4 component (odd indices). These smaller DFTs are computed recursively and then combined. The butterfly element for split radix is shown in Figure 1.6.

Fig. 1.6 Butterfly element for Split-radix
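
Equation (1.5) can be turned into a short recursive sketch. The code below is an illustration only: the function names are hypothetical, the length is assumed to be a power of two so that the recursion can continue down to lengths 1 and 2, and a direct DFT is included as a reference.

```python
import cmath

def dft_direct(x):
    """Reference N-point DFT by direct summation, O(N^2)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def fft_split_radix(x):
    """Recursive split-radix FFT; len(x) must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    if n == 2:
        return [x[0] + x[1], x[0] - x[1]]
    even = fft_split_radix(x[0::2])   # length N/2: samples x_{2n}
    odd1 = fft_split_radix(x[1::4])   # length N/4: samples x_{4n+1}
    odd3 = fft_split_radix(x[3::4])   # length N/4: samples x_{4n+3}
    quarter = n // 4
    out = [0j] * n
    for k in range(quarter):
        a = cmath.exp(-2j * cmath.pi * k / n) * odd1[k]       # w_N^k term
        b = cmath.exp(-2j * cmath.pi * 3 * k / n) * odd3[k]   # w_N^(3k) term
        total, diff = a + b, a - b
        out[k] = even[k] + total
        out[k + 2 * quarter] = even[k] - total
        out[k + quarter] = even[k + quarter] - 1j * diff
        out[k + 3 * quarter] = even[k + quarter] + 1j * diff
    return out

samples = [complex(i % 4, i - 3) for i in range(16)]
fast = fft_split_radix(samples)
slow = dft_direct(samples)
print(max(abs(a - b) for a, b in zip(fast, slow)))  # rounding-level difference
```

Each loop pass produces four outputs from two twiddle multiplications, which is where the saving over a pure Radix-2 decomposition comes from.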



1.11 Pipelined FFT Algorithm

Table 1.1 compares the pipeline architectures used to implement FFT processors, together with their hardware requirements. These architectures differ in their merits and in their resource requirements. The multiple delay commutator architectures based on the Radix-2 FFT (R2MDC) and the Radix-4 FFT (R4MDC) are frequently used for pipeline implementations, but they do not use their hardware resources efficiently. Single delay feedback implementations are more efficient than multiple delay commutators because their feedback mechanism reduces the memory requirement. Among the approaches in the table, the SDF architecture using the Radix-2² FFT (R2²SDF) needs the least number of multipliers and the least storage, whereas the R4SDC architecture needs the least number of both adders and multipliers. Hence the single delay commutator using Radix-4 is generally used in the design of FFT processors for OFDM applications.

Table 1.1 Comparisons of architectures of Pipeline FFT

S No   Approach   Multipliers       Adders      Storage     Control

1      R2MDC      2(log4 N - 1)     4 log4 N    3N/2 - 2    Simple
2      R2SDF      2(log4 N - 1)     4 log4 N    N - 1       Simple
3      R4MDC      3(log4 N - 1)     8 log4 N    5N/2 - 4    Simple
4      R4SDC      log4 N - 1        3 log4 N    2N - 2      Complex
5      R4SDF      log4 N - 1        8 log4 N    N - 1       Medium
6      R2²SDF     log4 N - 1        4 log4 N    N - 1       Simple
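
To make the comparison concrete, the expressions in Table 1.1 can be evaluated for a given transform size. The sketch below rests on two assumptions that should be checked against the original table: the multiplier entries are read as 2(log4 N - 1) and 3(log4 N - 1), and N is a power of four (N = 256 is used as an example because log4 N is then an integer). The function names are hypothetical.

```python
def log4_stages(n):
    """Return log4(n) for n a power of four (n >= 4), else raise ValueError."""
    if n < 4:
        raise ValueError("N must be a power of four, at least 4")
    stages = 0
    while n > 1:
        if n % 4:
            raise ValueError("N must be a power of four")
        n //= 4
        stages += 1
    return stages

def pipeline_resources(n):
    """Evaluate the Table 1.1 expressions: (multipliers, adders, storage)."""
    s = log4_stages(n)
    return {
        "R2MDC": (2 * (s - 1), 4 * s, 3 * n // 2 - 2),
        "R2SDF": (2 * (s - 1), 4 * s, n - 1),
        "R4MDC": (3 * (s - 1), 8 * s, 5 * n // 2 - 4),
        "R4SDC": (s - 1, 3 * s, 2 * n - 2),
        "R4SDF": (s - 1, 8 * s, n - 1),
        "R22SDF": (s - 1, 4 * s, n - 1),
    }

for name, (multipliers, adders, storage) in pipeline_resources(256).items():
    print(f"{name:7s} multipliers={multipliers:2d} adders={adders:2d} storage={storage}")
```

For N = 256 this reproduces the pattern described in the text: R2²SDF ties for the fewest multipliers with the smallest storage, while R4SDC combines the fewest multipliers with the fewest adders at the cost of more storage.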



1.12 Outline of the Thesis

The thesis is organized into eight chapters, including this one. The contents of the remaining chapters are summarized as follows.

The second chapter reviews previous investigations in the literature on the implementation of power- and area-efficient pipeline FFT processors for OFDM applications.

The third chapter briefly discusses the problem statement and the proposed solution, the scope and objectives of the research, the contributions of the thesis, and the design and implementation plan.

The fourth chapter discusses the implementation of a low-power, area-efficient 128-point pipeline FFT processor using the Radix-2 algorithm on a single delay feedback (R2SDF) architecture.

The fifth chapter presents a low-power, area-efficient 128-point pipeline FFT processor using the mixed Radix-4/2 algorithm and a single delay feedback (R42SDF) architecture.

The sixth chapter presents a low-power, area-efficient 128-point pipeline FFT processor using a cache-memory architecture (CMA) and the mixed Radix-4/2 multiple delay commutator (R42MDC) algorithm. It also compares the Multiple Delay Commutator (MDC), Single Delay Feedback (SDF) and Single Delay Commutator (SDC) pipeline architectures, all using the mixed Radix-4/2 algorithm, for the efficient design of a 128-point FFT processor with respect to memory size, area and power.

The seventh chapter discusses the application of the 128-point pipeline FFT processor to Multiple-Input Multiple-Output Orthogonal Frequency Division Multiplexing (MIMO-OFDM) systems, employing the mixed-radix algorithm together with delay feedback and data scheduling approaches.

Finally, the eighth chapter contains a summary of the work, the contributions made in this thesis and directions for future work.