ISBN: 978-93-
New Reconfigurable Architectures for Implementing FIR Filters with Low Complexity

S. Poojitha, Dept. of ECE, SVPCE, Visakhapatnam, India (jithamahi@gmail.com)
Prof. M. Murali, Dept. of ECE, SVPCE, Visakhapatnam, India (muralitejas@gmail.com)

Abstract: Reconfigurability and low complexity are the two key requirements of finite impulse response (FIR) filters in communication systems. In this paper, two new reconfigurable FIR filter architectures are proposed, namely the constant shifts method and the programmable shifts method. The proposed FIR architecture is capable of operating for different coefficient word lengths without overhead in the hardware circuitry. We show that dynamically reconfigurable filters can be implemented efficiently using common subexpression elimination algorithms. Design examples show that the proposed architectures offer good area and power reductions and speed improvement compared to the best existing reconfigurable FIR filter implementations in the literature.

Index Terms: Channelizer, common subexpression elimination, FIR filter, high-level synthesis, reconfigurability.

I. Introduction

Recent advances in mobile computing and communication applications demand low power and high speed VLSI digital signal processing (DSP) systems. One of the most important operations in DSP is finite impulse response (FIR) filtering. FIR digital filters find extensive applications in mobile communication systems, such as channelization, channel equalization, matched filtering, and pulse shaping, due to their absolute stability and linear phase properties. The filters employed in mobile systems must be realized to consume less power and operate at high speed. Recently, with the advent of software defined radio (SDR) technology, FIR filter research has focused on reconfigurable realizations. The fundamental idea of an SDR is to replace most of the analog signal processing in the transceivers with digital signal processing in order to provide the advantage of flexibility through reconfiguration. This will enable different air interfaces to be implemented on a single generic hardware platform to support multistandard wireless communications [1]. Reconfigurability of the receiver to work with different wireless communication standards is another key requirement in an SDR. The most computationally intensive part of an SDR receiver is the channelizer, since it operates at the highest sampling rate [6]. It extracts multiple narrowband channels from a wideband signal using a bank of FIR filters, called channel filters. However, due to the stringent adjacent channel attenuation specifications of wireless communication standards, higher order filters are required for channelization, and consequently the complexity and power consumption of the receiver will be high. As the ultimate aim of the future multistandard wireless communication receiver is to realize its functionalities in mobile handsets, where its full utilization is possible, low power and low area implementation of FIR channel filters is essential. The complexity of FIR filters is dominated by the complexity of the coefficient multipliers. Moreover, we note that there is sufficient scope for more work on complexity reduction in reconfigurable filters, especially for wireless communication applications where higher order filters are often required to meet the stringent adjacent channel attenuation specifications.

In this paper, we propose two architectures that integrate reconfigurability and low complexity to realize FIR filters. The proposed FIR filter architectures are called the constant shifts method (CSM) and the programmable shifts method (PSM). We have presented the preliminary design of these architectures in a recent conference paper [7]. The proposed architectures consider the coefficients as constants (as they are stored in look-up tables, LUTs) and the input signal as the variable. The coefficient multiplication in such a case is known as multiple constant multiplication (MCM), i.e., multiplication of one variable (the input signal) with multiple constants (the filter coefficients). The MCM is then optimized for eliminating redundancy using our recently proposed binary common subexpression elimination (BCSE) algorithm [2] to minimize complexity. The proposed CSM focuses on the filter [...]

Fig. 1. Transposed direct form of an FIR filter.
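To make the MCM structure concrete, the transposed direct form named in the Fig. 1 caption can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware described in this paper: the function names `fir_transposed` and `fir_direct` are hypothetical, and ordinary multiplication stands in for the shift-and-add processing elements.

```python
def fir_transposed(h, x):
    """Filter the sequence x with taps h using the transposed direct form."""
    n_taps = len(h)
    # state[k] is the delay register feeding the adder of tap k; the last
    # entry stays at zero because the final tap has no register behind it.
    state = [0] * n_taps
    y = []
    for sample in x:
        # Multiple constant multiplication: one input times every coefficient.
        products = [c * sample for c in h]
        # The output is the first product plus the first delay register.
        y.append(products[0] + state[0])
        # Each register takes the next product plus the next (old) register.
        for k in range(n_taps - 1):
            state[k] = products[k + 1] + state[k + 1]
    return y


def fir_direct(h, x):
    """Reference result computed directly from the convolution sum."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


if __name__ == "__main__":
    taps, signal = [1, 2, 3], [1, 0, 0, 1]
    print(fir_transposed(taps, signal))
    print(fir_direct(taps, signal))
```

In this form every input sample is multiplied by all coefficients at the same time, which is why the coefficient multipliers can be optimized jointly as one MCM block.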
III. Proposed Filter Architectures

In this section, the architecture of the proposed FIR filter is presented. Our architecture is based on the transposed direct form FIR filter structure, as shown in Fig. 1. The dotted portion in Fig. 1 represents the multiplier block (MB). In Fig. 1, PEi represents the processing element corresponding to the ith coefficient. PEi performs the coefficient multiplication operation with the help of a shift and add unit. The architecture of the PE is different for the proposed CSM and PSM. In the CSM, the filter coefficients are partitioned into fixed groups and hence the PE architecture involves constant shifters. In the PSM, by contrast, the PE consists of programmable shifters (PS). The FIR filter architecture can be realized in a serial way, in which the same PE is used to generate all partial products by convolving the coefficients with the input signal (h * x[n]); this realization is used when power consumption and area are of prime concern. The basic architecture of the PE (dotted portion) is shown in Fig. 2. The functions of the different blocks of the PE are explained below.

Fig. 2. Architecture of proposed method.

1) Shift and Add Unit: It is well known that one of the efficient ways to reduce the complexity of the multiplication operation is to realize it using shift and add operations. In contrast to the conventional shift and add units used in previously proposed reconfigurable filter architectures, we use a shift and add unit based on binary common subexpressions (BCSs) in our proposed CSM and PSM architectures. The architecture of the shift and add unit is shown in Fig. 3. The shift and add unit is used to realize all the 3-bit BCSs of the input signal, ranging from [0 0 0] to [1 1 1]. In Fig. 3, x>>k represents the input x shifted right by k units. All the 3-bit BCSs [0 1 1], [1 0 1], [1 1 0], and [1 1 1] of a 3-bit number are generated using only three adders. Since the shifts needed to obtain the BCSs are known beforehand, PS are not required. All eight BCSs (including [0 0 0]) are then fed to the multiplexer unit. In both the architectures (CSM and PSM) proposed in this paper, we use the same shift and add unit.
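The following Python sketch models one way the eight 3-bit BCS outputs could be produced with three adders. It rests on assumptions that are not stated in the text: the bit pattern [b0 b1 b2] is taken to weight x, x>>1 and x>>2 respectively, and [0 1 1] is obtained as a hardwired right shift of [1 1 0] so that no fourth adder is needed. The function name `shift_add_unit` is hypothetical, and exact fractions are used so that shifts lose no precision.

```python
from fractions import Fraction


def shift_add_unit(x):
    """Return the eight 3-bit BCS outputs of x, keyed by bit pattern."""
    x = Fraction(x)
    x1 = x / 2        # x >> 1, a hardwired shift
    x2 = x / 4        # x >> 2, a hardwired shift
    s110 = x + x1     # adder 1
    s101 = x + x2     # adder 2
    s111 = s110 + x2  # adder 3
    s011 = s110 / 2   # assumed: [0 1 1] = [1 1 0] >> 1, so no adder is used
    return {
        (0, 0, 0): Fraction(0),
        (0, 0, 1): x2,
        (0, 1, 0): x1,
        (0, 1, 1): s011,
        (1, 0, 0): x,
        (1, 0, 1): s101,
        (1, 1, 0): s110,
        (1, 1, 1): s111,
    }


if __name__ == "__main__":
    for pattern, value in shift_add_unit(8).items():
        print(pattern, value)
```

Every shift in this sketch is fixed, which matches the statement that programmable shifters are not required in the shift and add unit.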
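Fig. 3. Architecture of shift and add unit.

2) Multiplexer Unit: The multiplexer units are used to select the appropriate output from the shift and add unit; the select signals are derived from the filter coefficients stored in the LUT.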
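The role of the final shifter unit and the final adder unit can be illustrated with the output expression

$$y = 2^{-4}(x + 2^{-2}x) + 2^{-15}(x + 2^{-1}x) \qquad (2)$$

The terms $x + 2^{-2}x$ and $x + 2^{-1}x$ are obtained from the shift and add unit with the help of the multiplexer unit; the final shifter unit will then perform the shift operations $2^{-4}$ and $2^{-15}$ in (2). The PSM and CSM architectures also differ in the nature of these shifts: they are constant in the CSM and programmable in the PSM. The final adder unit will compute the sum of all the intermediate additions, $2^{-4}(x + 2^{-2}x)$ and $2^{-15}(x + 2^{-1}x)$, as in (2). Thus, the same hardware architecture can provide the necessary reconfigurability [...].

A. Architecture of CSM

In the CSM architecture, the coefficients are stored directly in the LUT. These coefficients are partitioned into groups of three bits and are used as select signals for the multiplexers. The number of multiplexer units required is $\lceil n/3 \rceil$, where $n$ is the word length of the filter coefficients. The CSM can be explained with the help of the 8-bit coefficient $h = 0.11111111$. This coefficient is the worst-case 8-bit coefficient since all the bits are nonzero, and hence it needs the maximum number of additions and shifts. In this case, $n = 8$, and therefore the number of multiplexers required is 3. The output $y = h \cdot x$ is expressed as

$$y = 2^{-1}x + 2^{-2}x + 2^{-3}x + 2^{-4}x + 2^{-5}x + 2^{-6}x + 2^{-7}x + 2^{-8}x \qquad (3)$$

By partitioning (3) into groups of three bits from the most significant bit (MSB), we obtain

$$y = 2^{-1}(x + 2^{-1}x + 2^{-2}x + 2^{-3}x + 2^{-4}x + 2^{-5}x + 2^{-6}x + 2^{-7}x) \qquad (4)$$

$$y = 2^{-1}\big(x + 2^{-1}x + 2^{-2}x + 2^{-3}(x + 2^{-1}x + 2^{-2}x) + 2^{-6}(x + 2^{-1}x)\big) \qquad (5)$$

Note that the terms $x + 2^{-1}x + 2^{-2}x$ and $x + 2^{-1}x$ are BCSs available from the shift and add unit. The final shifter unit performs the constant shifts appearing in (5), namely $2^{-1}$, $2^{-3}$, and $2^{-6}$. Since these shifts are always constant irrespective of the coefficients, programmable shifters are not required and these shifts can be hardwired. The final adder unit will compute the sum of all the intermediate sums to obtain $h \cdot x[n]$.

The architecture of the PE for CSM is shown in Fig. 4. The coefficient word length is considered as 16 bits. The coefficients are stored in sign-magnitude form, with the MSB reserved for the sign bit, [...] an 18-bit value in LUTs. Each row in the LUT corresponds to one coefficient. Note that only half the number of coefficients needs to be stored, as FIR filter coefficients are symmetric. The coefficient values are partitioned into groups of three bits and are used as select signals to the multiplexers: the first group forms the select signal to Mux1, and so on. Since there are 16 bits, Mux6 receives only one bit of the filter coefficient and hence it is a 2:1 multiplexer.

Fig. 4. Architecture of PE for CSM.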
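The fixed grouping behind (3)-(5) can be checked numerically with a small Python model. This is a sketch under stated assumptions rather than the authors' implementation: the coefficient is an unsigned fraction 0.b1 b2 ... bn given as a list of bits (MSB first), the sign bit is ignored, each 3-bit group stands for one multiplexer output, and `csm_multiply` is a hypothetical name. The grouped sum is written in the flat form of (5); a nested arrangement of the same shifts gives the same value.

```python
from fractions import Fraction


def csm_multiply(bits, x, group=3):
    """Multiply x by the fraction 0.b1 b2 ... bn using fixed 3-bit groups.

    Returns the product and the number of multiplexers (groups) used.
    """
    x = Fraction(x)
    # Split the coefficient bits into fixed groups, MSB first; the last
    # group may hold fewer than three bits.
    groups = [bits[i:i + group] for i in range(0, len(bits), group)]
    # Each group acts as a multiplexer select that picks one BCS output
    # b0*x + b1*(x >> 1) + b2*(x >> 2) of the shift and add unit.
    mux_out = [sum(b * x / 2**k for k, b in enumerate(g)) for g in groups]
    # Constant shifts: group j is weighted by 2^(-3j), and a final shift
    # by 2^-1 aligns the MSB group with the weight 2^-1.
    acc = sum(r / 2**(group * j) for j, r in enumerate(mux_out))
    return acc / 2, len(groups)


if __name__ == "__main__":
    # Worst-case 8-bit coefficient h = 0.11111111 applied to x = 256.
    y, n_mux = csm_multiply([1] * 8, 256)
    print(y, n_mux)
```

Because the group weights depend only on the group position and never on the coefficient bits, all the shifts in this model are constants, which is the property the CSM relies on for hardwiring.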
In Fig. 4, the shifts are obtained as follows. Let $r_1$ to $r_6$ denote the outputs of Mux1 to Mux6, respectively. Then

$$y = 2^{-1}r_1 + 2^{-4}r_2 + 2^{-7}r_3 + 2^{-10}r_4 + 2^{-13}r_5 + 2^{-16}r_6 \qquad (6)$$

The shifts are obtained by partitioning the 16-bit coefficient into groups of 3 bits. By partitioning (6),

$$y = 2^{-1}\left[(r_1 + 2^{-3}r_2) + 2^{-6}\left[(r_3 + 2^{-3}r_4) + 2^{-6}(r_5 + 2^{-3}r_6)\right]\right] \qquad (7)$$

Substituting $(r_1 + 2^{-3}r_2)$, $(r_3 + 2^{-3}r_4)$, and $(r_5 + 2^{-3}r_6)$ by $r_7$, $r_8$, and $r_9$, respectively, we get

$$y = 2^{-1}\left[r_7 + 2^{-6}(r_8 + 2^{-6}r_9)\right] \qquad (8)$$

By substituting $(r_8 + 2^{-6}r_9)$ by $r_{10}$,

$$y = 2^{-1}(r_7 + 2^{-6}r_{10}) \qquad (9)$$

By substituting $(r_7 + 2^{-6}r_{10})$ by $r_{11}$,

$$y = 2^{-1}r_{11} \qquad (10)$$

The expressions (6)-(10) are represented in Fig. 4. The main advantage of the CSM architecture is that all the shifts are constants irrespective of the coefficients and hence can be hardwired, resulting in high speed operation of the filter.

B. Architecture of PSM

The PSM is based on the BCSE algorithm presented in our previous work [2]. The PSM architecture presented in this section incorporates reconfigurability into BCSE. The PSM has a pre-analysis part in which the filter coefficients are analyzed using the BCSE algorithm in [2]. Thus, the redundant computations (additions) are eliminated using the BCSs, and the resulting coefficients are stored in the LUT in a coded format. The coding format is explained in the latter part of this section. The shift and add unit is identical for both PSM and CSM. The number of multiplexer units required can be obtained from the filter coefficients after the application of BCSE [2]. The number of multiplexers is selected after considering the number of non-zero operands (BCSs and unpaired bits) in each of the coefficients after the application of the BCSE algorithm. The number of multiplexers corresponds to the number of non-zero operands for the worst-case coefficient (the worst-case coefficient being defined as the coefficient that has the maximum number of non-zero operands).

The architecture of the PE for PSM is shown in Fig. 5. The coefficient word length is fixed as 16 bits. Based on our statistical analysis, we have fixed the number of multiplexers as 5 (same as the number of non-zero operands). The LUT consists of two rows of 18 bits for each coefficient, of the form SDDDDXXDDDDXXMMMML and DDDDXXDDDDXXDDDDXX, where S represents the sign bit, DDDD represents the shift values from $2^{0}$ to $2^{-15}$, and XX represents the input x or the BCSs obtained from the shift and add unit. In the coded format, XX = 01 represents $x$, 10 represents $x + 2^{-1}x$, 11 represents $x + 2^{-2}x$, and 00 represents $x + 2^{-1}x + 2^{-2}x$. Thus, the two rows can store up to five operands, which is the worst-case number of operands for a 16-bit coefficient. In most practical coefficients, the number of operands is less than the worst-case number of operands, 5. In that case MMMML can be used to avoid unnecessary additions. The values MMMM will be given as the select signal to Mux6 and L to Mux8. The five bits MMMML indicate the presence of the five operands: a 1 in a position indicates that the corresponding operand is present. Thus, the case in which all operands are present is indicated by MMMML = 11111. This means that Mux6 will select the output of adder A4 and Mux8 will select the output of adder A2. If only the first operand is present, MMMML = 10000. This means that Mux8 and Mux6 will select the outputs of programmable shifters (the shr blocks in Fig. 5) rather than adder outputs. As a result, none of the adders will be loaded, saving a significant amount of dynamic power.

Fig. 5. Architecture of PE for PSM.

The coding can be explained as given below. Consider the positive coefficient

$$h = [1010011001010011] \qquad (11)$$
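As a rough illustration of the two-row coded format SDDDDXXDDDDXXMMMML / DDDDXXDDDDXXDDDDXX described in this section, the Python sketch below decodes a pair of LUT rows back into a coefficient value. Several details are assumptions made only for this example and are not confirmed by the text: DDDD is read as a binary shift amount k meaning $2^{-k}$, S = 1 is taken to mean a negative coefficient, and the five flags MMMML are matched to the five operand fields from left to right. The function name `decode_psm` and the example rows are hypothetical; they do not encode the coefficient in (11).

```python
from fractions import Fraction

# XX codes for the shift-and-add outputs, as multiples of x (from the text).
BCS = {
    "01": Fraction(1),     # x
    "10": Fraction(3, 2),  # x + 2^-1 x
    "11": Fraction(5, 4),  # x + 2^-2 x
    "00": Fraction(7, 4),  # x + 2^-1 x + 2^-2 x
}


def decode_psm(row1, row2):
    """Return the coefficient value encoded by two 18-bit LUT rows."""
    assert len(row1) == 18 and len(row2) == 18
    sign = -1 if row1[0] == "1" else 1  # assumed: S = 1 means negative
    flags = row1[13:18]                 # MMMML, one presence flag per operand
    # Operand fields (DDDD followed by XX): two in row 1, three in row 2.
    fields = [row1[1:7], row1[7:13], row2[0:6], row2[6:12], row2[12:18]]
    value = Fraction(0)
    for flag, field in zip(flags, fields):
        if flag == "1":                 # absent operands add nothing
            shift = int(field[:4], 2)   # assumed: DDDD = k encodes 2^-k
            value += BCS[field[4:]] / 2**shift
    return sign * value


if __name__ == "__main__":
    # Hypothetical rows: operands 2^-1 (x + 2^-1 x) and 2^-6 x, flags 11000.
    row1 = "0" + "0001" + "10" + "0110" + "01" + "11000"
    row2 = "0" * 18
    print(decode_psm(row1, row2))
```

Operands whose flag is 0 are skipped entirely in this model, which mirrors the way MMMML is described as keeping unused adders unloaded.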
In the PSM architecture, the number of multiplexers is fixed based on the number of BCSs present in a given coefficient set (worst-case coefficient of the set). Thus, even if the word length changes, it hardly affects the architecture of the PSM. In [11], it was pointed out that for many filter taps, the highest coefficient precision is not required. Valuable hardware resources will be wasted if all taps are implemented with the highest precision. The proposed PSM can be implemented for dynamically varying coefficient precision as it is word length independent. One of the limitations of the PSM architecture is that it requires pre-analysis of the filter coefficients, and hence on-the-fly reconfigurability is not always feasible. But this restriction does not impose constraints on popular reconfigurable filter applications like wireless communications. This is because in such applications, we have a distinct filter for each communication standard, and the coefficients of the filter are fixed for a specific standard. In other words, when the communication system is operating on a particular wireless standard, the filter coefficients do not change, i.e., the filter is not required to be an adaptive filter. When the system changes its mode of operation to a different wireless communication standard (as in the case of a multistandard transceiver), the coefficient set corresponding to the specification of the new standard is loaded (replacing the current filter coefficients). Note that the coefficients of the new standard are known beforehand (pre-stored); therefore, the pre-analysis can be done offline and the problem with reconfigurability can be solved.

V. Experimental Results

In this section, the synthesis and design results of the proposed CSM and PSM architectures are presented and compared.

A. Synthesis Results

We have used Xilinx ISE 8.1i for synthesis purposes. The synthesis has been done on Xilinx's Virtex-II 2v3000ff1152-4 FPGA. Table I shows the synthesis results of the CSM and PSM for a 20-tap FIR filter that has a coefficient word length of 16 bits.

TABLE I
Synthesis Results for an FIR Filter with 20 Taps and Coefficient Word Length of 16 Bits

                            Proposed PSM    Proposed CSM
Gate count                  22 581          22 956
Sampling frequency (MHz)    24              26
Data arrival time (ns)      33.64           26.824

We have implemented filters with different passband edge ($\omega_p$) and stopband edge ($\omega_s$) specifications
given by: 1) $\omega_p = 0.1\pi$, $\omega_s = 0.12\pi$; 2) $\omega_p = 0.15\pi$, $\omega_s = 0.2\pi$; 3) $\omega_p = 0.2\pi$, $\omega_s = 0.22\pi$; and 4) $\omega_p = 0.2\pi$, $\omega_s = 0.3\pi$, respectively. Even though the proposed architectures are reconfigurable, the usage of adders and shifters is dependent on the filter coefficient values. Some of the adders may not be used by the multiplexers. As a result, they are unloaded and do not consume any dynamic power. Hence, the power and speed of the synthesis results are dependent on the filter coefficient values, and we have therefore considered an average of the synthesis results in all the tables in this paper. From the comparison it is evident that the CSM requires 375 gates more than the PSM (22 956 versus 22 581 in Table I), whereas the PSM requires 6.82 ns more for the data to arrive at the output compared to the CSM. Thus, the CSM results in higher speed whereas the PSM results in lower area. The lower speed of the PSM is due to the presence of programmable shifters, and its smaller area is due to the elimination of redundant additions by the BCSE algorithm. We have also analyzed the effect of the MB for different filter coefficient word lengths of 8, 12, and 16 bits for the PSM architecture. The results are shown in Table II. It can be noted that as the precision of the coefficient is increased, the area consumption is increased and the speed of operation is reduced. Thus, by choosing the appropriate filter coefficient word length, it is possible to obtain reduced area and power as well as increased speed for the PSM architecture.

TABLE II

                            8 bit     12 bit    16 bit
Gate count                  2878      3532      3771
Sampling frequency (MHz)    35        20        24
Data arrival time (ns)      7.96      8.84      9.92

VII. Conclusion

We have proposed two new approaches, namely CSM and PSM, for implementing reconfigurable higher order filters with low complexity. The proposed CSM and PSM methods make use of architectures with a fixed number of multiplexers, and the reduction in complexity is achieved by applying the BCSE algorithm. The CSM architecture results in high speed filters, and the PSM architecture results in low area and thus low power filter implementations. The PSM also provides the flexibility of changing the filter coefficient word lengths dynamically. The proposed reconfigurable architectures can be easily modified to employ any CSE (MCM) method. Thus, our method is a general approach for low complexity reconfigurable channel filters.

References

[1] T. Hentschel and G. Fettweis, "Software radio receivers," in CDMA Techniques for Third Generation Mobile Systems. Dordrecht, The Netherlands: Kluwer Academic, 1999, pp. 257-283.
[2] R. Mahesh and A. P. Vinod, "A new common subexpression elimination algorithm for realizing low complexity higher order digital filters," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 27, no. 2, pp. 217-219, Feb. 2008.
[3] A. P. Vinod and E. M.-K. Lai, "On the implementation of efficient channel filters for wideband receivers by optimizing common subexpression elimination methods," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 24, no. 2, pp. 295-304, Feb. 2005.
[4] A. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, and R. W. Brodersen, "Optimizing power using transformations," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 14, no. 1, pp. 12-31, Jan. 1995.
[5] K. H. Chen and T. D. Chiueh, "A low-power digit-based reconfigurable FIR filter," IEEE Trans. Circuits Syst. II, vol. 53, no. 8, pp. 617-621, Aug. 2006.
[6] J. Mitola, "Object-oriented approaches to wireless system engineering," in Software Radio Architecture. New York: Wiley, 2000.
[7] R. Mahesh and A. P. Vinod, "Reconfigurable low complexity FIR filters for software radio receivers," in Proc. 17th IEEE Int. Symp. Personal Indoor Mobile Radio Commun. (PIMRC), Helsinki, Finland, Sep. 2006, pp. 1-5.
[8] M. Thenmozhi and N. Kirthika, "Analysis of efficient architectures for FIR filters using common subexpression elimination algorithm," 2007, pp. 1-8.
[9] N. Moreano, E. Borin, C. de Souza, and G. Araujo, "Efficient data-path merging for partially reconfigurable architectures," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 24, no. 7, pp. 969-980, Jul. 2005.
[10] M. Potkonjak, M. B. Srivastava, and A. P. Chandrakasan, "Multiple constant multiplications: Efficient and versatile framework and algorithms for exploring common subexpression elimination," IEEE Trans. Comput.-Aided Design, vol. 15, no. 2, pp. 151-165, Feb. 1996.