
IMPLEMENTATION OF OPTIMIZED 128-POINT PIPELINE FFT PROCESSOR USING MIXED RADIX 4-2 FOR OFDM APPLICATIONS
K. UMAPATHY,
Research Scholar, Department of ECE, Jawaharlal Nehru Technological University, Anantapur, India,
umapathykannan@gmail.com
DR. D. RAJAVEERAPPA,
Professor, Department of ECE, Loyola Institute of Technology, Chennai, India,
draja_2001@rediffmail.com
Abstract
This paper proposes a 128-point FFT processor for Orthogonal Frequency Division Multiplexing (OFDM)
systems that processes real-time, high-speed data. The processor is based on a cached-memory architecture
(CMA) and uses the Mixed Radix 4-2 algorithm in the multi-path delay commutator (MDC) style. It was designed
and implemented with this technique to reduce chip size and power. With this algorithm the chip size is
2.8 × 2.8 mm² in 0.35 µm technology. The power consumption in our optimum case is 72 mW at an operating
speed of 127-133 MHz, which is less than half that of the latest reported 128-point FFT design in 0.18 µm
technology. The MDC, SDF, and SDC pipeline architectures are compared, using the same algorithm for the
design of the 128-point FFT processor, with respect to memory size, area, and power.
Keywords: OFDM, CMA, Mixed Radix 4-2, FFT, R42MDC
1. Introduction
The fast Fourier transform (FFT) is one of the most frequently used digital signal processing (DSP)
algorithms in Orthogonal Frequency Division Multiplexing (OFDM) applications. The FFT architectures used in
OFDM systems fall into three categories: the parallel architecture, the pipeline architecture, and the memory
architecture. The parallel and pipeline architectures employ more butterfly processing units to achieve high
performance but consume more area than the memory architecture. The shared-memory architecture employs only
one butterfly processing unit and therefore has the advantage of area efficiency. The block diagram of the FFT
processor is shown in Figure 1. This paper focuses on the memory architecture, for its area efficiency and
hardware simplicity, in order to construct a small OFDM system. We propose a 128-point FFT processor in CMOS
technology that consumes low power, occupies a small chip area, and increases processing speed, based on the
cached-memory architecture (CMA) and the R42MDC (Mixed Radix 4/2 MDC) style.
2. Mixed Radix 4/2 Algorithm
The computation of the FFT is represented by Eq. (1). There are two types of mixed-radix FFT algorithms. The
first type arises naturally when a radix-$q$ algorithm, where $q = 2^m > 2$, is applied to an input series
consisting of $N = 2^k \times q^s$ equally spaced points, where $1 \le k < m$. In this situation, $k$ steps of
the radix-2 algorithm are applied either at the beginning or at the end of the transformation. The second type
refers to algorithms specialized for a composite $N = N_0 \times N_1 \times N_2 \times \dots \times N_k$.
Different algorithms may be employed depending on whether the factors satisfy certain restrictions. Only the
$2 \times 4^m$ case of the first type of mixed-radix algorithm is considered here, using the MDC style; for the
128-point transform, $128 = 2 \times 4^3$. The mixed-radix 4/2 algorithm calculates four butterfly outputs.
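The stage decomposition described above can be illustrated with a small software reference model. This is a minimal sketch, not the authors' hardware design: the function names (`naive_dft`, `choose_radix`, `stage_radices`, `fft_mixed_radix_42`) are hypothetical, the decimation-in-time recursion is a textbook formulation, and placing the single radix-2 step first is an assumption (the text allows it at either end).

```python
import cmath


def naive_dft(x):
    """Direct O(N^2) DFT, used only as a reference for checking."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def choose_radix(n):
    """Radix 2 when log2(n) is odd, radix 4 otherwise (n is a power of two)."""
    log2n = n.bit_length() - 1
    return 2 if log2n % 2 else 4


def stage_radices(n):
    """List the radix used at each stage, e.g. one radix-2 and three radix-4 for 128."""
    radices = []
    while n > 1:
        r = choose_radix(n)
        radices.append(r)
        n //= r
    return radices


def fft_mixed_radix_42(x):
    """Recursive decimation-in-time FFT using only radix-4 and radix-2 steps."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    r = choose_radix(n)
    m = n // r
    # Transform the r interleaved subsequences x[q], x[q + r], x[q + 2r], ...
    subs = [fft_mixed_radix_42(x[q::r]) for q in range(r)]
    out = [0j] * n
    for k in range(m):
        # Apply the twiddle factors W_n^(q*k) to the sub-transform outputs.
        t = [subs[q][k] * cmath.exp(-2j * cmath.pi * q * k / n) for q in range(r)]
        # r-point butterfly: combine the twiddled values into r outputs.
        for p in range(r):
            out[k + m * p] = sum(t[q] * cmath.exp(-2j * cmath.pi * q * p / r)
                                 for q in range(r))
    return out


if __name__ == "__main__":
    data = [complex(i % 7, -(i % 3)) for i in range(128)]
    error = max(abs(a - b) for a, b in zip(naive_dft(data), fft_mixed_radix_42(data)))
    print("stage radices for N = 128:", stage_radices(128))
    print("max deviation from the direct DFT:", error)
```

A model of this kind is mainly useful as a golden reference when checking the outputs of an HDL simulation.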
(1)
K. Umapathy et al. / International Journal of Engineering Science and Technology (IJEST)
ISSN : 0975-5462 Vol. 4 No.12 December 2012 4745

Fig 1. Block diagram of FFT processor
Using the R42MDC algorithm, the input data are divided into two parallel data streams that enter the butterfly
processing element after the proper delay. As with the CMA, the Mixed Radix 4/2 FFT algorithm is used as an
example to introduce the R42MDC-based FFT processor architecture. The MDC structure and the signal flow graph
for the Mixed Radix 4/2 algorithm are shown in Figure 2 and Figure 3, respectively.
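The splitting of the input into two delayed streams can be pictured with a short behavioural model. This is a sketch of a generic radix-2 MDC input commutator, under the assumption that the first half of the block is held in an N/2-deep delay line so that samples N/2 apart reach the butterfly together; it is not taken from the authors' R42MDC datapath, and the name `mdc_first_stage_pairs` is hypothetical.

```python
from collections import deque


def mdc_first_stage_pairs(samples):
    """Pair x[n] with x[n + N/2], as a first-stage MDC commutator would.

    One sample arrives per cycle. The first N/2 samples are pushed into a
    FIFO that models the delay line; each later sample is paired with the
    oldest delayed sample, forming one butterfly input pair per cycle.
    """
    n = len(samples)
    if n % 2:
        raise ValueError("block length must be even")
    half = n // 2
    delay_line = deque()
    pairs = []
    for cycle, sample in enumerate(samples):
        if cycle < half:
            delay_line.append(sample)          # upper path: delayed by N/2 cycles
        else:
            pairs.append((delay_line.popleft(), sample))  # both paths aligned
    return pairs


if __name__ == "__main__":
    # Sample indices stand in for data values to make the pairing visible.
    print(mdc_first_stage_pairs(list(range(8))))
```

Using sample indices as the data shows directly which two inputs meet at the butterfly in each cycle.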

Fig 2. Radix 4-2 MDC Structure
3. Cached-Memory Architecture
The cached-memory architecture is similar to the single-memory architecture except that a small cache memory
resides between the processor and main memory, as shown in Figure 4. Spiffee employs the cached-memory
architecture because a hierarchical memory system is required to realize the benefits of the cached-FFT
algorithm. The performance of the memory system can be improved by adding a second cache set. In this
configuration, the processor operates out of one cache set while the other set is being flushed and then loaded
from memory. If, in the R42MDC style, the flush time plus the load time is less than the time required to
process the data in the cache, the processor need not wait for the cache between groups. A second cache set
therefore increases processor utilization and overall performance at the expense of some additional area and
complexity.
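The timing condition for the second cache set can be expressed as a simple steady-state model. This is an illustrative sketch only: the function names and the cycle counts are hypothetical and do not come from the paper, and the model assumes every group takes the same number of cycles.

```python
def stall_cycles_per_group(process_cycles, flush_cycles, load_cycles, cache_sets=2):
    """Cycles the processor waits for the cache after each group.

    With two cache sets, the flush and load of one set overlap with the
    processing of the other, so only the excess transfer time is a stall.
    With one set, the processor waits for the whole flush and load.
    """
    transfer = flush_cycles + load_cycles
    if cache_sets >= 2:
        return max(0, transfer - process_cycles)
    return transfer


def utilization(process_cycles, flush_cycles, load_cycles, cache_sets=2):
    """Fraction of time the processor spends computing in steady state."""
    stall = stall_cycles_per_group(process_cycles, flush_cycles, load_cycles, cache_sets)
    return process_cycles / (process_cycles + stall)


if __name__ == "__main__":
    # Hypothetical cycle counts chosen only to illustrate the condition.
    for sets in (1, 2):
        print(sets, "cache set(s):", utilization(100, 30, 40, cache_sets=sets))
```

In this model the second cache set removes the stall entirely whenever the flush time plus the load time does not exceed the processing time, which is the condition stated above.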

Fig 3. Signal Flow Graph of 128-point FFT using Mixed Radix 4-2 algorithm.

Fig 4. The Proposed Cached-Memory Architecture
Table 1. Area and Power Consumption of 128-point FFT using MDC Style.

128-Point FFT
Parameter                 Conventional Design   Proposed Design (Reduction %)   Frequency (MHz)
Memory size (words)       128                   91 (26.7)                       127-133
Area (gate count)         51,000                41,990.5 (23.5)                 127-133
Power consumption (mW)    137                   72                              127-133
Table 2. Comparison of MDC Style with Other Architectures: SDF and SDC

Architecture   Memory size in words (Reduction/Increase %)   Power (mW)   Area (mm²) (Operating frequency)
MDC            91 (26.7)                                     72           7.84 (127-133 MHz)
SDF            267 (60.5)                                    137          22.50 (154 MHz)
SDC            245 (50)                                      90.6         21.25 (133 MHz)
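The power and area figures in Table 2 can be put on a common footing with a few lines of arithmetic. The sketch below only restates numbers from the table and the 2.8 × 2.8 mm die size given in the abstract; the helper names `percent_lower` and `die_area_mm2` are hypothetical, and the percentages it prints are derived here, not reported by the authors.

```python
# Power (mW) and area (mm^2) copied from Table 2.
TABLE_2 = {
    "MDC": {"power_mw": 72.0, "area_mm2": 7.84},
    "SDF": {"power_mw": 137.0, "area_mm2": 22.50},
    "SDC": {"power_mw": 90.6, "area_mm2": 21.25},
}


def percent_lower(reference, value):
    """How much lower `value` is than `reference`, in percent of `reference`."""
    return 100.0 * (reference - value) / reference


def die_area_mm2(width_mm, height_mm):
    """Area of a rectangular die."""
    return width_mm * height_mm


if __name__ == "__main__":
    # The 2.8 mm x 2.8 mm die corresponds to the MDC area entry in Table 2.
    print(f"die area: {die_area_mm2(2.8, 2.8):.2f} mm^2")
    mdc = TABLE_2["MDC"]
    for name in ("SDF", "SDC"):
        other = TABLE_2[name]
        power = percent_lower(other["power_mw"], mdc["power_mw"])
        area = percent_lower(other["area_mm2"], mdc["area_mm2"])
        print(f"MDC vs {name}: power {power:.1f}% lower, area {area:.1f}% lower")
```

Note that the three designs are listed at different operating frequencies, so these ratios compare the reported operating points rather than equal-throughput designs.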
4. Design and Simulation
ModelSim together with the C programming language was used for algorithmic-level simulation and verification
because of the high execution speed. In total, about ten simulations at various levels of abstraction were
written. Next, the details of the architecture were worked out using the Verilog hardware description language
and the ModelSim simulator. Approximately twenty modules in total were written for the processor and its
sub-blocks. Table 1 shows the area and power consumption of the proposed FFT processor in comparison with the
conventional FFT design. A comparison of the MDC style with the other pipeline architectures, Single Delay
Feedback (SDF) and Single Delay Commutator (SDC), with respect to area and power is shown in Table 2. Figure 5
shows the comparison graph for these pipeline architectures. Figure 6 and Figure 7 show the simulation results
for the power and the chip size of the proposed 128-point FFT processor, respectively.

Fig 5. Comparison of Power, Area & Memory size for Pipeline Architectures MDC (Blue), SDF (Brown) & SDC (Green).

Fig 6. Simulation Results for Power - 128 point FFT Processor.

Fig 7. Chip Size of 128-point FFT Processor.
5. Conclusion
The 128-point FFT processor was designed using the cached-memory architecture with the Mixed Radix 4-2
algorithm in the MDC style (R42MDC). The design exploits a hierarchical memory structure and achieves increased
performance with (i) reduced power dissipation, (ii) small area, and (iii) minimum operating clock frequencies.
The MDC style was compared with other pipeline architectures and chosen for the design. The power consumption
in our optimum case is 72 mW at an operating frequency of 127-133 MHz, which is less than half that of the
latest reported 128-point FFT design in 0.18 µm technology. This implementation can be used in low-power
applications for OFDM system data transfer and wireless communication systems. In this study, an FFT processor
based on the proposed algorithm was implemented using Verilog HDL and ModelSim for circuit design and
simulation.