
VLSI IMPLEMENTATION OF OFDM

Orthogonal Frequency Division Multiplexing (OFDM) is a modulation
format used in many of the latest wireless and telecommunications
standards.
OFDM has been adopted in the Wi-Fi arena for standards such as
802.11a, 802.11n and 802.11ac. It has also been chosen for the
cellular telecommunications standards LTE and LTE-A, and it has been
adopted by other standards such as WiMAX.
OFDM has also been adopted for a number of broadcast standards, from
DAB Digital Radio to the Digital Video Broadcast (DVB) standards.
Although OFDM is more complicated than earlier signal formats, it
provides distinct advantages for data transmission, especially where high
data rates are needed along with relatively wide bandwidths.

WHAT IS OFDM?
An OFDM signal consists of a number of closely spaced modulated carriers.
When modulation of any form (voice, data, etc.) is applied to a carrier,
sidebands spread out on either side.
A receiver must be able to receive the whole signal in order to demodulate
the data successfully. As a result, when conventional signals are
transmitted close to one another, they must be spaced so that the receiver
can separate them using a filter, and there must be a guard band between
them.
This is not the case with OFDM. Although the sidebands from each carrier
overlap, they can still be received without the interference that might be
expected because they are orthogonal to one another. This is achieved by
making the carrier spacing equal to the reciprocal of the symbol period.

[Figure: Traditional view of receiving signals carrying modulation]

To see how OFDM works, it is necessary to look at the receiver. This acts
as a bank of demodulators, translating each carrier down to DC. The
resulting signal is integrated over the symbol period to regenerate the
data from that carrier. The same demodulator also sees the other
carriers. Because the carrier spacing is equal to the reciprocal of the
symbol period, each of the other carriers has a whole number of cycles in
the symbol period, so its contribution sums to zero. In other words, there
is no interference contribution.
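
The integrate-over-one-symbol argument can be checked numerically. The sketch below is only an illustration: the 64 samples per symbol and the carrier indices 5 and 6 are assumed values, and the carriers are ideal complex exponentials with no noise or channel.

```python
import cmath

N = 64  # samples per symbol period (assumed value for illustration)

def correlate(k, m, n_samples=N):
    """Translate carrier m down with the local oscillator for carrier k,
    then integrate (sum) over one symbol period."""
    total = 0j
    for n in range(n_samples):
        carrier_m = cmath.exp(2j * cmath.pi * m * n / n_samples)
        local_osc_k = cmath.exp(-2j * cmath.pi * k * n / n_samples)
        total += carrier_m * local_osc_k
    return total / n_samples

same = abs(correlate(5, 5))   # wanted carrier: integrates to full amplitude
other = abs(correlate(5, 6))  # neighbouring carrier: whole cycles sum to zero
print(same, other)
```

The neighbouring carrier contributes nothing only because its frequency offset is an exact multiple of the reciprocal of the symbol period; any frequency error breaks this cancellation.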

One requirement of the OFDM transmitting and receiving systems is that
they must be linear. Any non-linearity will cause interference between the
carriers as a result of inter-modulation distortion. This will introduce
unwanted signals that would cause interference and impair the
orthogonality of the transmission.

OFDM Spectrum

DATA ON OFDM
The data to be transmitted on an OFDM signal is spread across the carriers of the
signal, each carrier taking part of the payload. This reduces the data rate carried by
each carrier. The lower data rate makes interference from reflections much less
critical, because a guard time (guard interval) can be added to each symbol. The
guard interval ensures that the data is sampled only when the signal is stable and
no new delayed signals arrive that would alter the timing and phase of the signal.

The distribution of the data across a large number of carriers in the OFDM signal has
some further advantages. Error-coding techniques, which add further data to the
transmitted signal, enable many or all of the corrupted data to be reconstructed
within the receiver.

BASIC PRINCIPLE OF OFDM


[Figure: Product modulators, one per sub-carrier, with separate local
oscillators generating each individual sub-carrier]

OFDM SYSTEM
CORRELATION RECEIVER

OFDM ADVANTAGES
OFDM can easily adapt to severe channel conditions without the need for
complex channel equalisation algorithms.
It is robust against narrow-band co-channel interference. As only some
of the carriers are affected, not all data is lost, and error coding can
recover the rest.
Intersymbol interference (ISI) is less of a problem with OFDM because
each carrier carries a low data rate.
It provides high spectral efficiency.
It is relatively insensitive to timing errors.
It allows single frequency networks to be used. This is particularly
important for broadcasters, where it gives a significant improvement in
spectral usage.

OFDM DISADVANTAGES
High peak-to-average power ratio (PAPR). This places high demands on
amplifier linearity.

Phase noise degrades the performance of an OFDM system.

Very sensitive to time and frequency synchronization errors.

OFDM TRANSCEIVER AND IMPLEMENTATION


[Figure: Block diagram of the OFDM transceiver]

Transmitter: Data input -> Scrambler -> Coding -> Interleaving ->
QPSK mapping -> Serial to parallel -> IFFT -> Parallel to serial ->
Add cyclic extension and windowing -> Output of transmitter

Receiver: Input to receiver -> Synchronization -> Remove cyclic
extension -> Serial to parallel -> FFT -> Parallel to serial ->
Channel estimate and equalizer -> QPSK demapping -> De-interleaving ->
Decoding -> Descrambler -> Data received

SCRAMBLER(RANDOMIZER)

In the proposed design, a standard 7-bit scrambler is used to randomize
the incoming bits.
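
A 7-bit scrambler of this kind can be sketched as a linear feedback shift register. The slides do not name the generator polynomial, so the sketch below assumes the 802.11a polynomial x^7 + x^4 + 1 and an all-ones seed. The function name and state layout are hypothetical.

```python
def scramble(bits, seed=0b1111111):
    """XOR each input bit with the output of a 7-bit LFSR.

    Assumed polynomial: x^7 + x^4 + 1 (as in 802.11a). Bit 6 of `state`
    holds x7 and bit 3 holds x4.
    """
    state = seed
    out = []
    for b in bits:
        feedback = ((state >> 6) ^ (state >> 3)) & 1  # x7 XOR x4
        state = ((state << 1) | feedback) & 0x7F      # shift, keep 7 bits
        out.append(b ^ feedback)
    return out

data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
scrambled = scramble(data)
recovered = scramble(scrambled)  # same seed: scrambling twice restores the data
print(scrambled, recovered == data)
```

Because the scrambling sequence depends only on the seed, the descrambler is the same circuit started from the same state.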

INTERLEAVER

Two memory elements (usually RAMs) are used. The incoming block of bits is stored
in the first RAM in sequential order. This data is then read out of the first RAM in a
permuted order (the read addresses are generated by an algorithm), so that the
re-arranged bits are stored in the second RAM and then read out.

The three building blocks of the interleaver are:


Block Memory
Controller
Address ROM
The job of the controller is to guide the incoming block of data to the
correct memory blocks, to switch the RAMs between reading and writing
modes, and to switch between the two RAMs for 16 alternate bits in
writing mode. This is done by using counters.

The address ROM is basically a 64x6 ROM that stores read addresses for
the RAMs.
Counter C is a 3-bit counter that controls switching between either RAM
1A and RAM 2A or RAM 1B and RAM 2B, depending upon which RAMs
are in write mode. Counter1 and Counter2 are 5-bit counters; after
every 8th count, control switches to either Counter1 or Counter2, as
directed by Counter C.
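
The sequential-write, ROM-addressed-read idea can be modelled in a few lines. This is a minimal sketch, not the design's actual address table: it assumes a block interleaver with 4 rows and 16 columns (written row by row, read column by column), and it does not model the double-buffered RAM pairs or the counters.

```python
ROWS, COLS = 4, 16  # assumed block geometry; 4 * 16 = 64 entries

def build_address_rom(rows, cols):
    """Read addresses for a block written row-wise and read column-wise."""
    return [(k % rows) * cols + (k // rows) for k in range(rows * cols)]

def interleave(block, rom):
    ram = list(block)                   # sequential write into the RAM
    return [ram[addr] for addr in rom]  # read in the order given by the ROM

rom = build_address_rom(ROWS, COLS)
interleaved = interleave(list(range(64)), rom)
print(interleaved[:5])  # [0, 16, 32, 48, 1]
```

Every address is below 64, so each ROM word fits in 6 bits, which matches the 64x6 ROM size quoted above.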

CONSTELLATION MAPPER
[Figure: Signal constellation of QPSK]

[Table: Mapping of bits to constellation points]

In QPSK two bits make up one symbol.

A ROM is used to store the constellation points. Each constellation point is
represented by 48 bits in binary. Of these 48 bits, the most significant 24 bits
represent the real part and the least significant 24 bits represent the
imaginary part.
In both the real and imaginary parts, the most significant 8 bits are the integer
part and the least significant 16 bits are the fractional part. 2's complement
notation is used to represent negative numbers.

The size of the ROM is 4x48. The two incoming input bits act as the address for
the ROM. Each value in the ROM is the constellation point corresponding to the
data bits that address it.
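
The 48-bit word layout can be illustrated as follows. The word format (24-bit real part, 24-bit imaginary part, 8 integer and 16 fractional bits, 2's complement) is taken from the text. The constellation values and the bit-to-point mapping are assumptions: points at (+/-1/sqrt(2), +/-1/sqrt(2)), with the first bit selecting the sign of the real part and the second bit the sign of the imaginary part.

```python
import math

FRAC_BITS = 16
PART_BITS = 24  # 8 integer bits + 16 fractional bits

def to_fixed(x):
    """Encode x as a 24-bit 2's complement number with 16 fractional bits."""
    return round(x * (1 << FRAC_BITS)) & ((1 << PART_BITS) - 1)

def from_fixed(word):
    if word & (1 << (PART_BITS - 1)):  # sign bit set: negative value
        word -= 1 << PART_BITS
    return word / (1 << FRAC_BITS)

def pack(re, im):
    """Real part in the upper 24 bits, imaginary part in the lower 24 bits."""
    return (to_fixed(re) << PART_BITS) | to_fixed(im)

def unpack(word):
    return from_fixed(word >> PART_BITS), from_fixed(word & ((1 << PART_BITS) - 1))

A = 1 / math.sqrt(2)  # assumed amplitude
# Hypothetical mapping: address = (first bit << 1) | second bit, 0 -> -A, 1 -> +A
ROM = [pack(-A, -A), pack(-A, +A), pack(+A, -A), pack(+A, +A)]

def map_bits(b0, b1):
    return ROM[(b0 << 1) | b1]

print(unpack(map_bits(1, 0)))
```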

SERIAL TO PARALLEL MODULE

The data arrives serially on the input port SERIN. The parallel data is output
on the DOUT port. Output port DRDY is asserted (1) when the start bit, the 8-bit
data and the parity bit have been received. Output port PERRn is asserted (0) when
the parity bit received differs from the parity generated inside the serial-to-parallel
circuit. When a parity error is detected, the serial-to-parallel circuit must be reset
before normal operation can resume.
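
A behavioural model of one received frame might look like the sketch below. The port names DOUT, DRDY and PERRn come from the text. The start bit value of 1, the leftmost-bit-first order and the use of even parity are assumptions made for illustration.

```python
def receive_frame(serial_bits):
    """Model one frame on SERIN: start bit, 8 data bits, parity bit.

    Returns (dout, drdy, perr_n). Assumes a start bit of 1, leftmost data
    bit first, and even parity; PERRn is active low.
    """
    if len(serial_bits) != 10 or serial_bits[0] != 1:
        raise ValueError("expected a 10-bit frame beginning with a start bit of 1")
    data_bits = serial_bits[1:9]
    dout = 0
    for bit in data_bits:
        dout = (dout << 1) | bit            # shift each bit into the register
    expected_parity = sum(data_bits) % 2    # parity generated locally
    perr_n = 1 if serial_bits[9] == expected_parity else 0
    drdy = 1                                # whole frame has been received
    return dout, drdy, perr_n

good = receive_frame([1, 1, 0, 1, 0, 0, 1, 0, 1, 0])
bad = receive_frame([1, 1, 0, 1, 0, 0, 1, 0, 1, 1])
print(good, bad)
```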


IFFT DESIGN
64-point Radix-2^2 fixed-point DIT FFT

Since the proposed design has 64 sub-carriers, the input to the FFT is
64 complex numbers, and hence a 64-point FFT is required.
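
For reference, a floating-point 64-point inverse transform can be sketched with a plain recursive radix-2 decimation-in-time algorithm. This is not the radix-2^2 fixed-point architecture named above. It only shows the input/output behaviour that such a block has to reproduce, and the single active sub-carrier is an assumed test input.

```python
import cmath

def _butterflies(x, sign):
    """Recursive radix-2 decimation-in-time transform without scaling."""
    n = len(x)
    if n == 1:
        return list(x)
    even = _butterflies(x[0::2], sign)
    odd = _butterflies(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(x):
    """Inverse FFT: positive exponent, divided by the number of points."""
    return [v / len(x) for v in _butterflies(x, +1)]

N = 64
freq = [0j] * N
freq[1] = 1 + 0j          # a single active sub-carrier (assumed test input)
time = ifft(freq)
print(time[0], time[16])  # one full cycle of a complex exponential over 64 samples
```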

PARALLEL TO SERIAL MODULE

A parallel-to-serial converter is a special application of a shift register. The
data is loaded into the shift register in parallel and then shifted out bit by bit,
framed by a start bit and a stop bit.

The data to be transmitted is first loaded in parallel. Transmission then begins
with a start bit of value 1, followed by the 8-bit data, leftmost bit first.
The converter holds the output low when the transmission is completed.

CYCLIC PREFIX ADDER

[Figure annotation: Causes intercarrier interference (ICI)]

If the multipath delay is less than the cyclic prefix, there is no
intersymbol or intercarrier interference.

RECEIVER DESIGN AND IMPLEMENTATION


The receiver follows the exact reverse of the procedure followed in the
transmitter. It receives the complex (modulated) output points, performs
demodulation and recovers the original bits sent to the transmitter.

CYCLIC PREFIX REMOVER


The cyclic prefix was added at the transmitting end in order to avoid
inter-symbol interference, so it must be removed during reception before
any further processing of the received signal. This is done by simply
skipping the first eight samples of the received OFDM symbol. In
hardware this is implemented in the control unit. The control unit only
enables the next block (FFT) once the first eight samples of the received
OFDM symbol have been skipped.
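
Adding and removing the prefix can be sketched as below. The prefix length of eight follows the text. The 64-sample symbol matches the 64-point IFFT, and the helper names are hypothetical. The sketch assumes perfect symbol timing.

```python
CP_LEN = 8  # prefix length, from the text

def add_cyclic_prefix(symbol, cp_len=CP_LEN):
    """Transmitter: copy the last cp_len samples to the front of the symbol."""
    return symbol[-cp_len:] + symbol

def remove_cyclic_prefix(received, cp_len=CP_LEN):
    """Receiver: skip the first cp_len samples before enabling the FFT."""
    return received[cp_len:]

symbol = list(range(64))          # stand-in for 64 IFFT output samples
tx = add_cyclic_prefix(symbol)
rx = remove_cyclic_prefix(tx)
print(len(tx), rx == symbol)      # 72 True
```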

FAST FOURIER TRANSFORM


The FFT is implemented in hardware with the same algorithm as the IFFT. The only
differences are that the divider is removed and the real and imaginary parts at the
input are swapped, i.e. real becomes imaginary and imaginary becomes real.

The same goes for the output, i.e. the real and imaginary parts at the output are
swapped as well.
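
The swap trick can be verified numerically. The sketch below uses a direct O(N^2) transform rather than a fast algorithm, purely to keep the check short. `inverse_kernel` stands for the IFFT datapath with its divider removed, and all function names are hypothetical.

```python
import cmath

def swap(z):
    """Exchange the real and imaginary parts of a complex number."""
    return complex(z.imag, z.real)

def inverse_kernel(x):
    """Inverse transform (positive exponent) with the divide-by-N removed."""
    n = len(x)
    return [sum(x[m] * cmath.exp(2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def fft_via_ifft(x):
    """Forward transform: swap inputs, run the inverse kernel, swap outputs."""
    return [swap(v) for v in inverse_kernel([swap(v) for v in x])]

def dft_reference(x):
    """Textbook forward DFT (negative exponent) for comparison."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

x = [complex(n, 1 - 2 * n) for n in range(8)]
error = max(abs(a - b) for a, b in zip(fft_via_ifft(x), dft_reference(x)))
print(error)
```

The trick works because swapping real and imaginary parts is the same as conjugating and multiplying by j, and conjugation turns the inverse kernel into the forward one.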

CONSTELLATION DE-MAPPER

The incoming constellation points are mapped back onto the data bits as shown
in the table. This can be implemented by direct coding.
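
For QPSK, direct coding can be as simple as testing the sign of each component. The bit assignment below is an assumption (first bit from the sign of the real part, second bit from the sign of the imaginary part, with 1 meaning positive), since the mapping table itself is not reproduced here.

```python
def demap_qpsk(point):
    """Hard-decision QPSK de-mapper: one bit per component, from its sign.

    Assumed mapping: bit = 1 for a non-negative component, 0 otherwise.
    """
    first_bit = 1 if point.real >= 0 else 0
    second_bit = 1 if point.imag >= 0 else 0
    return first_bit, second_bit

# Noisy received points still fall in the correct quadrant.
received = [complex(0.6, -0.8), complex(-0.75, 0.7), complex(-0.5, -0.9)]
bits = [demap_qpsk(p) for p in received]
print(bits)  # [(1, 0), (0, 1), (0, 0)]
```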

DE-INTERLEAVER
De-interleaving performs the inverse task. It re-arranges the interleaved bits into their
original order. De-interleaving is done the same way as interleaving, the difference being
that the number of rows and the number of columns for de-interleaving are
interchanged. Hence the only difference in the hardware architectures of interleaver and
de-interleaver is the contents of the address ROM, which actually provides the read
addresses to the RAM that stores the data to be de-interleaved.
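
The row/column interchange can be checked with a round trip. The 4 x 16 geometry is an assumed example. The same address generator serves both directions, and only its arguments are exchanged.

```python
ROWS, COLS = 4, 16  # assumed interleaver geometry (64-bit block)

def address_rom(rows, cols):
    """Read addresses for a block written row-wise and read column-wise."""
    return [(k % rows) * cols + (k // rows) for k in range(rows * cols)]

def read_through_rom(block, rom):
    """Store the block sequentially, then read it in ROM-address order."""
    return [block[addr] for addr in rom]

interleaver_rom = address_rom(ROWS, COLS)
deinterleaver_rom = address_rom(COLS, ROWS)  # rows and columns interchanged

data = list(range(64))
interleaved = read_through_rom(data, interleaver_rom)
restored = read_through_rom(interleaved, deinterleaver_rom)
print(restored == data)  # True
```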

DESCRAMBLER
The above setup simply descrambles the scrambled data.

VITERBI DECODER
The Viterbi decoder decodes convolutional codes. Altera's Viterbi IP core is a
parameterized IP core that is synthesizable and allows for parallel as well as
hybrid implementation of the Viterbi decoder.

BMU
The branch metric computation unit (BMU) calculates the Hamming distances
between the incoming pair of code bits and each of the four possible code pairs.
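
For hard decisions the branch metric is just a bit-difference count. A minimal sketch, with a hypothetical function name, assuming two code bits per trellis step:

```python
def branch_metrics(received_pair):
    """Hamming distance from the received bit pair to each possible code pair."""
    r0, r1 = received_pair
    return {(c0, c1): (r0 ^ c0) + (r1 ^ c1)
            for c0 in (0, 1) for c1 in (0, 1)}

metrics = branch_metrics((1, 0))
print(metrics)  # {(0, 0): 1, (0, 1): 2, (1, 0): 0, (1, 1): 1}
```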

ACS
The add, compare and select (ACS) unit updates the path metric for all 64 states
and selects the predecessor. For each of the 64 states, it adds the current path
metric and the branch metric for both predecessor states and selects the lower of
the two sums as the new path metric. The predecessor information is passed on to
the SMU.
The width of the path metric register and of the ACS adders and subtractor
depends on whether a soft-decision or a hard-decision Viterbi decoder is used. It
also depends on the maximum metric accumulated by the metric registers before
normalization is done.
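
One ACS operation and a simple normalization step can be sketched as below. Subtracting the minimum metric is one common normalization scheme and is an assumption here, as are the function names. The text does not say which scheme the design uses.

```python
def acs(path_metric_a, branch_metric_a, path_metric_b, branch_metric_b):
    """Add, compare, select for one state with two predecessors (a and b).

    Returns (new_path_metric, decision), where decision is 0 if predecessor a
    survives and 1 if predecessor b survives.
    """
    candidate_a = path_metric_a + branch_metric_a  # add
    candidate_b = path_metric_b + branch_metric_b
    if candidate_a <= candidate_b:                 # compare
        return candidate_a, 0                      # select
    return candidate_b, 1

def normalize(path_metrics):
    """Keep metrics bounded by subtracting the smallest one (assumed scheme)."""
    floor = min(path_metrics)
    return [pm - floor for pm in path_metrics]

print(acs(3, 1, 2, 0))        # (2, 1): predecessor b gives the lower sum
print(normalize([7, 5, 9]))   # [2, 0, 4]
```

Normalization preserves the differences between metrics, which is all the compare step needs, so the register width only has to cover the largest spread between states.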

VLSI IMPLEMENTATION

Lower gate count compared to DSP+RAM+ROM, hence lower cost.


Low power consumption

DESIGN METHODOLOGY
Early in the development cycle, different communication and signal processing
algorithms are evaluated for their performance under different conditions like
noise, multipath channel and radio non-linearity. Since most of these algorithms
are coded in "C" or tools like MATLAB, it is important to have a verification
mechanism which ensures that the hardware implementation (RTL) matches the "C"
implementation of the algorithm.
The flow is shown in the figure.

SPECIFICATIONS OF THE OFDM TRANSCEIVER


Data rates to be supported
Range and multipath tolerance

Indoor/Outdoor applications
Multi-mode: 802.11a only or 802.11a+HiperLAN/2

DESIGN TRADE-OFF
Area - The smaller the die size, the lower the chip cost
Power - Low power is crucial for battery-operated mobile devices
Ease of implementation - Easy to debug and maintain
Customizability - Should be customizable to future standards with variations
in OFDM parameters

ALGORITHM SURVEY & SIMULATION


Simulation at the algorithmic level determines the performance of the algorithms
under various non-linearities and imperfections. The algorithms are tweaked and
fine-tuned to get the required performance. The following
algorithms/parameters are verified:

Channel estimation and compensation for different channel models (Rayleigh,
Rician, JTC, Two ray) for different delay spreads
Correlated performance for different delay spreads and different SNR
Frequency estimation algorithm for different SNR and frequency offsets

Compensation for Phase noise and error in Frequency offset estimation


System tolerance for I/Q phase and amplitude imbalance
FFT simulation to determine the optimum fixed-point widths
Wave shaping filter to get the desired spectrum mask
Determine clipping levels for efficient PA use
Effect of ADC/DAC width on the EVM and optimum ADC/DAC width

FIXED POINT SIMULATION


One of the decisions to be taken early in the design cycle is the format or
representation of data. A floating-point implementation results in higher hardware
costs and additional circuits for normalizing numbers. Floating-point
representation is useful when dealing with data of widely different ranges. That
is not the case here, as the Baseband circuits have a fair idea of the range of values
they will work on, so a fixed-point representation is more efficient. Further, in
fixed point a choice can be made between sign-magnitude and 2's complement
representation.

The width of representation need not be constant throughout the Baseband and it
depends on the accuracy needed at different points in transmit or receive path. A
small change in the number of bits in the representation could result in a significant
change in the size of arithmetic circuits especially multipliers.
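
The width trade-off can be made concrete with a small sketch. The quantizer rounds to a given number of fractional bits. The cost model assumes a simple array multiplier, whose partial-product count grows with the square of the operand width. Real synthesized multipliers differ, so the numbers are indicative only, and the function names are hypothetical.

```python
def quantize(value, frac_bits):
    """Round value to the nearest multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale

def partial_products(width):
    """Partial-product bits in a width x width array multiplier (assumed model)."""
    return width * width

for frac_bits in (4, 8, 12):
    approx = quantize(0.7071, frac_bits)
    print(frac_bits, approx, abs(approx - 0.7071))

# Going from 12-bit to 16-bit operands nearly doubles the multiplier array.
print(partial_products(12), partial_products(16))  # 144 256
```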

SIMULATION SETUP
The algorithms could be simulated in a variety of tools/languages
like SPW, MATLAB, C or a mix of these.
SPW has an exhaustive floating-point and fixed-point library. SPW
also provides a feature to plug in RTL modules and do a co-simulation
of the SPW system and Verilog. This helps in verifying the
RTL implementation of algorithms against the SPW/C
implementation.

HARDWARE DESIGN
Baseband interfaces with two external modules: MAC and Radio.
INTERFACE TO MAC
Baseband should support the following for MAC
Should support transfer of data at different rates
Transmit and receive control
Register programming for power and frequency control

Following options are available for MAC interface:


Serial data interface: Clock provided along with data. Clock speed changes for
different data rates.
Varying data width, single speed clock: The number of data lines varies according
to the data rate. The clock remains the same for all rates.
Single clock, parallel data with ready indication: Clock speed and data width are
the same for all data rates. A ready signal is used to indicate valid data.

INTERFACE TO RADIO
Two kinds of radio interfaces are described below

I/Q interface
On the transmit side, the complex Baseband signal is sent to the radio unit that
first does a Quadrature modulation followed by up-conversion at 5 GHz. On the
receive side, following the down-conversion to IF, Quadrature demodulation is
done and complex I/Q signal is sent to Baseband. Shown below is the interface.

IF interface
The Baseband does the Quadrature modulation and demodulation digitally.

CLOCKING STRATEGY
802.11a supports different data rates from 6 Mbps to 54 Mbps. The clock scheme
chosen for the Baseband should be able to support all rates and also result in low power
consumption. We know from our basic ASIC design guidelines that most circuits should
run at the lowest possible clock rate.
Two options are shown below:

The above scheme requires different clock sources or a very high clock rate from which
all these clocks could be generated.
The modules must work at the highest frequency of 54 MHz.

Shown in the figure is a simpler clocking scheme with only one clock speed for all
data rates
Varying duty cycles for different data rates is provided by the data enable signal
All the circuits in the transmit and receive chain work on parallel data (4 bits)
Overhead is the Data enable logic in all the modules

Optimize Usage Of Hardware Resources By Reusing Different Blocks
Hardware resources can be reused considering the fact that the 802.11a system is a
half-duplex system. The following blocks are re-used:
FFT/IFFT
Interleaver/De-interleaver
Scrambler/Descrambler
Intermediate data buffers

Since Adders and Multipliers are costly resources, special attention should be given to
reuse them. An example shown below where an Adder/Multiplier pool is created and
different blocks are connected to this.

Optimize the widely used circuits


Identify the blocks that are used at several places (several instances of the same unit)
and optimize them. Optimization can be done for power and area. Some of the
circuits that can be optimized are:
Multipliers
They are the most widely used circuits. Synthesis tools usually provide highly
optimized circuits for multipliers and adders.
In case optimized multipliers are not available, multipliers could be designed using
different techniques.
ACS unit
There are 64 instantiations of ACS unit in the Viterbi decoder. Optimization of ACS
unit results in significant savings.
Custom cell design (using foundry information) for adders and comparators could
be considered.

THANK YOU