A Fpga Ieee-754-2008 Decimal64 Floating-Point Adder-Subtractor

A FPGA IEEE-754-2008 DECIMAL64 FLOATING-POINT ADDER/SUBTRACTOR
Carlos Minchola (1), Martín Vazquez (2) and Gustavo Sutter (1)
(1) School of Engineering, Universidad Autónoma de Madrid, Madrid, Spain
(2) INTIA Institute, Universidad Nacional del Centro de la Prov. de Bs. As., Argentina
e-mail: carlos.minchola@uam.es; mvazquez@exa.unicen.edu.ar; gustavo.sutter@uam.es
ABSTRACT
This paper describes the FPGA implementation of a decimal floating-point (DFP) adder/subtractor. The design adds and subtracts 64-bit operands that use the IEEE 754-2008 decimal encoding of DFP numbers and is based on a fully pipelined circuit. It presents novel hardware for the pre-signal generation stage and an enhanced version of a previously published leading-zero stage. The design operates at 200 MHz on a Virtex-5 with a latency of 8 cycles. It supports operations on the decimal64 format and is easily extendable to the decimal128 format. To our knowledge, this is the first FPGA hardware design for adding and subtracting IEEE 754-2008 numbers using the decimal64 encoding.

1. INTRODUCTION
Decimal arithmetic plays a key role in many commercial and financial applications, which process decimal values and perform decimal rounding. However, current software implementations are prohibitively slow [1], prompting hardware manufacturers such as IBM to add decimal floating-point (DFP) arithmetic support to their microprocessors [2]. Furthermore, the IEEE has developed the IEEE 754-2008 standard for floating-point arithmetic [3], which adds decimal representations to the IEEE 754-1985 standard.

Several works focus on fixed-point addition; for example, Busaba et al. [4] and Haller et al. [5] propose combined decimal and binary adders using pre-sum and pre-selection logic. Only a few recent papers address decimal floating-point addition [6-8]. The adders proposed by Cohen et al. [8] and Bohlender et al. [6] have long latencies and produce one result digit per cycle. Thompson et al. [7] present the first IEEE 754 decimal floating-point adder. The first publications on decimal arithmetic for FPGAs have appeared only recently [9], and several hardware designs were synthesized for platforms other than FPGAs [10]. For this reason, we believe this work is one of the first FPGA implementations of addition/subtraction using the decimal64 encoding.

A recent work [9] presents a DFP adder for FPGAs that uses the Binary Integer Decimal (BID) encoding; its results are used in our comparisons. Because the BID adder [9] occupies less area than our proposed design, it may be argued that the BID format is well suited for hardware implementation, provided the latency-area tradeoff is taken into account.

The outline of the paper is as follows. Section 2 gives background information on decimal floating-point. Section 3 describes the challenge of adding decimal64-encoded numbers and presents the technique and theory for addition and subtraction. Section 4 presents synthesis results for the proposed adder and comparisons with the adder from [9]. Finally, Section 5 presents our conclusions.

In the rest of this paper, AX and BX are significands and EAX, EBX and EX are exponents, where X is a digit that identifies the unit producing the signal. The symbol $(N)_Z^T$ refers to the T-th bit of the Z-th digit of a number N, where the least significant bit and the least significant digit have index 0. For example, $(A1)_2^5$ is the fifth bit of the second BCD digit in A1.

2. DECIMAL FLOATING POINT IN IEEE 754-2008
The IEEE 754-2008 standard [3] specifies formats for both binary floating-point (BFP) and decimal floating-point (DFP) numbers. The primary difference between the two formats, besides the radix, is the normalization of the significand (also called coefficient or mantissa). BFP significands are normalized with the radix point to the right of the most significant bit (MSB), while DFP significands are not required to be normalized and are represented as integers. The IEEE 754-2008 standard specifies DFP formats of 32, 64, and 128 bits. A DFP number contains a sign bit, an integer significand with a precision of p digits, and a biased exponent E.
The value of a finite DFP number is:

$$D = (-1)^{s} \times C \times 10^{q}, \qquad q = E - \mathrm{bias} \qquad (1)$$

where s is the sign bit, C is the non-negative integer significand, and q is the exponent. The exponent q is obtained from the biased non-negative integer exponent E.
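As a quick numerical illustration of Eq. (1), the following Python sketch evaluates a finite DFP number exactly. The function name `dfp_value` is hypothetical, and the bias of 398 is the decimal64 value quoted later in this section; this is only a software model, not part of the hardware design.

```python
from decimal import Decimal

BIAS = 398  # decimal64 exponent bias, as quoted for the interchange format


def dfp_value(s: int, c: int, e_biased: int) -> Decimal:
    """Exact value (-1)^s * C * 10^q with q = E - bias (Eq. 1)."""
    q = e_biased - BIAS
    digits = tuple(int(d) for d in str(c))
    # Decimal accepts a (sign, digits, exponent) triple and builds the value exactly.
    return Decimal((s, digits, q))
```

Because significands are not normalized, several (C, E) pairs can denote the same value: `dfp_value(0, 1230, 397)` and `dfp_value(0, 123, 398)` compare equal.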
978-1-4244-8848-3/11/$26.00 2011 IEEE
The significand is encoded in densely packed decimal (DPD) [3], and the exponent q must lie in the range [emin, emax]. Representations for infinity and not-a-number (NaN) are also provided. Floating-point numbers in the decimal interchange formats are encoded in k bits in the following three fields (Fig. 1):
The swapping unit swaps the operands (A1, B1) if EA1 < EB1 and generates the BCD coefficients A2 (the one with the higher exponent, max(EA1, EB1)) and B2 (the one with the lower exponent, min(EA1, EB1)). In parallel, this unit generates the exponent difference Ed = |EA1 - EB1|, the exponent E2 = max(EA1, EB1), the SWAP flag, which is set if the operands are swapped, and the right shift amount (RSA), which indicates how many digits B2 must be shifted right so that both coefficients (A2, B2) have the same exponent.
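The behaviour of the swapping unit can be sketched in software as follows. This is a minimal model under stated assumptions: significands are plain Python integers, the function name `swap_unit` is hypothetical, and the limit of 18 digits on RSA is the p_max value given in this paper.

```python
P_MAX = 18  # maximum right shift in digits (16 digits plus guard and round)


def swap_unit(a1: int, ea1: int, b1: int, eb1: int):
    """Return (A2, B2, E2, Ed, RSA, SWAP) for the normalized operands."""
    swap = 1 if ea1 < eb1 else 0
    # A2 is the operand with the larger exponent, B2 the other one.
    a2, b2 = (b1, a1) if swap else (a1, b1)
    ed = abs(ea1 - eb1)      # exponent difference
    e2 = max(ea1, eb1)       # exponent of the aligned result
    rsa = min(ed, P_MAX)     # right shift amount, limited to p_max
    return a2, b2, e2, ed, rsa, swap
```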
Fig. 1. Decimal interchange floating-point format.
a) A 1-bit sign s.
b) A (w + 5)-bit combination field G encoding the classification and, if the encoded datum is a finite number, the exponent q and four significand bits (1 or 3 of which are implied). The biased exponent E is a (w + 2)-bit quantity q + bias, where the value of the first two bits of the biased exponent taken together is either 0, 1, or 2.
c) A t-bit trailing significand field T that contains J × 10 bits and holds the bulk of the significand, where J is the number of declets. When this field is combined with the leading significand bits from the combination field, the format encodes a total of p = 3 × J + 1 decimal digits.

The values of k, p, t, w + 5, and bias for the decimal64 interchange format are 64, 16, 50, 13, and 398, respectively. That is, a number has p = 16 decimal digits of precision in the significand, an unbiased exponent range of [-383, 384], and a bias of 398.

3. DECIMAL FLOATING-POINT ADDER/SUBTRACTOR IMPLEMENTATION
A general overview of the proposed adder/subtractor is given below. For the best performance, the design has eight pipeline stages, as shown in Fig. 2. Arrows show the direction of data flow, the dashed blocks indicate the main stages of the design, and the dotted lines indicate the pipeline registers. This architecture was proposed for the IEEE 754-2008 decimal64 format and can be extended to the decimal128 format. The decimal64 addition/subtraction is carried out as follows.

The decoder unit takes the two 64-bit IEEE 754-2008 operands (OP1, OP2) and generates the sign bits (SA, SB), the 16-digit BCD significands (A0, B0), the 10-bit biased exponents (EA, EB), the effective operation (EOP) and flags for the special values NaN and infinity.
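The field layout above can be illustrated with a short Python sketch that splits a decimal64 word into its sign, combination field and trailing significand, and recovers the biased exponent and the leading significand digit. The function name `unpack_decimal64` and the returned tuple layout are hypothetical; the bit rules follow the IEEE 754-2008 decimal encoding, and the conversion of the five DPD declets in T to BCD is deliberately left out.

```python
def unpack_decimal64(word: int):
    """Split a 64-bit decimal64 word into (sign, kind, E, leading_digit, T)."""
    sign = (word >> 63) & 1
    g = (word >> 50) & 0x1FFF        # 13-bit combination field (w + 5 = 13)
    t = word & ((1 << 50) - 1)       # 50-bit trailing significand (5 declets)
    top5 = g >> 8                    # classification bits of G
    cont = g & 0xFF                  # 8 remaining exponent bits
    if top5 == 0b11111:
        return sign, "nan", None, None, t
    if top5 == 0b11110:
        return sign, "inf", None, None, t
    if (top5 >> 3) == 0b11:
        # Leading digit is 8 or 9: exponent MSBs follow the "11" prefix.
        exp_msbs = (top5 >> 1) & 0b11
        lead = 8 + (top5 & 1)
    else:
        # Leading digit is 0..7: exponent MSBs come first.
        exp_msbs = top5 >> 3
        lead = top5 & 0b111
    e_biased = (exp_msbs << 8) | cont  # 10-bit biased exponent
    return sign, "finite", e_biased, lead, t
```

For instance, the word 0x2238000000000001 encodes the value 1 (sign 0, biased exponent 398, leading digit 0, T = 1).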
The signal EOP defines the effective operation (EOP = 0 for effective addition and EOP = 1 for effective subtraction) and is calculated as:

$$EOP = SA \oplus SB \oplus OP \qquad (2)$$

where OP selects the requested operation. As soon as the decoded significands become available, the leading zero detection (LZD) unit takes them and computes the temporary exponents (EA1, EB1) and the normalized coefficients (A1, B1).
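A behavioural sketch of the LZD normalization is shown below. It assumes integer significands of p = 16 digits and uses the exponent update EA1 = EA - LZDA stated in Sub-section 3.2; the name `lzd_normalize` and the handling of a zero significand (returned unchanged) are assumptions made only for this example.

```python
P = 16  # decimal64 precision in digits


def lzd_normalize(a0: int, ea: int):
    """Return (A1, EA1, LZDA): left-justified significand and adjusted exponent."""
    if a0 == 0:
        # Assumption: a zero significand is passed through unchanged.
        return 0, ea, 0
    lzd = P - len(str(a0))           # number of leading zero digits
    return a0 * 10 ** lzd, ea - lzd, lzd
```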
Fig. 2. High-level decimal floating-point adder diagram.
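The RSA is computed as follows:

$$RSA = \begin{cases} \mathit{Ed}, & \text{if } \mathit{Ed} \le \mathit{p\_max} \qquad (3) \\ \mathit{p\_max}, & \text{otherwise} \qquad (4) \end{cases}$$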
The value of p_max is 18 digits; RSA is limited to this value because B2 contains 16 digits plus two digits that are used to compute the guard and round digits. Next, the shifting unit receives the RSA and the significand B2 and generates a shifted B2 (B3) and a 2-bit signal called the predicted sticky bit (PSB), which predicts two initial sticky bits. PSB and B3 are used as inputs to the decimal addition, control signal generation and post-correction units.
These outputs, together with the significand A2 and EOP, are the inputs of the control signal generation unit, which generates the signals needed to perform an addition or subtraction. These signals are described in Sub-section 3.4 and consist of a prior guard digit (GD1), a prior round digit (RD1), an extra digit (ED), a signal that indicates whether A2 > B3 (AGTB) and a carry-in (CIN).

The BCD significands (A2, B3) and CIN are the inputs of the decimal addition unit, which generates the partial sum magnitude $|S1| = |A2 + (-1)^{EOP} B3|$ and a carry-out (COUT). The 16-digit decimal addition unit takes A2, B3, EOP and CIN and computes S1 as follows: S1 = A2 + B3 if EOP = 0; S1 = A2 + cmp9(B3) if EOP = 1 and A2 >= B3; and S1 = cmp9(A2 + cmp9(B3)) if EOP = 1 and A2 < B3, where cmp9 denotes the 9's complement.

The post-correction unit takes PSB, the exponent E2, GD1, RD1, ED, the partial sum S1 and COUT, and corrects these signals only in the following two cases: 1) COUT = 1 and EOP = 0, and 2) $(S1)_{15} = 0$, GD1 > 0 and EOP = 1. The analysis is given in Sub-section 3.6. This unit generates the final sticky bit (FSB), the corrected guard digit (GD2) and round digit (RD2), the final partial sum (S2) and the corrected exponent (E3).

Next, the rounding unit takes the outputs of the previous unit, rounds S2 to produce the result significand S3, and adjusts the exponent E3 to obtain the final exponent E4. The overflow, underflow and sign bit signals are generated simultaneously. The final sign bit is computed as:
$$FS = (SA \wedge \overline{EOP}) \vee (EOP \wedge (AGTB \oplus SA \oplus SWAP)) \qquad (5)$$
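The effective operation, the 9's-complement magnitude computation and the sign selection can be checked together with a small software model. The sketch below rests on several assumptions that should be read as such: operands are Python integers that are already aligned to the same exponent, the carry-in of 1 in the A2 >= B3 subtraction case stands in for CIN, AGTB uses the polarity given in Fig. 4 (0 if A2 > B3, 1 otherwise), and the AND/OR/XOR operators in the sign expression are a reconstruction. All function names are hypothetical.

```python
N = 16            # significand width in digits
MOD = 10 ** N


def cmp9(x: int) -> int:
    """9's complement of an N-digit decimal magnitude."""
    return MOD - 1 - x


def effective_op(sa: int, sb: int, op: int) -> int:
    """EOP = SA xor SB xor OP (0: effective addition, 1: effective subtraction)."""
    return sa ^ sb ^ op


def magnitude(a2: int, b3: int, eop: int):
    """Return (|S1|, COUT, AGTB) for aligned N-digit magnitudes."""
    agtb = 0 if a2 > b3 else 1           # polarity as stated in Fig. 4
    if eop == 0:
        total = a2 + b3
        return total % MOD, total // MOD, agtb
    if a2 >= b3:
        # Assumption: a carry-in of 1 turns the 9's complement into a 10's
        # complement; the resulting carry out of the top digit is discarded.
        return (a2 + cmp9(b3) + 1) % MOD, 0, agtb
    return cmp9(a2 + cmp9(b3)), 0, agtb


def final_sign(sa: int, eop: int, agtb: int, swap: int) -> int:
    """FS = (SA and not EOP) or (EOP and (AGTB xor SA xor SWAP))."""
    return (sa & (1 - eop)) | (eop & (agtb ^ sa ^ swap))
```

With SA = 0, SB = 0 and OP = 1 (a subtraction of two positive operands), A2 = 3 and B3 = 5 give a magnitude of 2 and a final sign of 1, that is, the result -2.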
A decimal64 number (64-bit interchange format) is decomposed into a 1-bit sign, a 10-bit biased exponent (E), and 16 digits represented in BCD, i.e. 64 bits. The decoder is therefore a combinational circuit that unpacks 64 bits into 75 bits (1 + 10 + 64). The decoder also detects and signals the special cases (infinity and NaNs). The coder stage transforms the 75-bit representation of the result into the 64-bit interchange format. This combinational circuit takes into account the special cases flagged by the special case detection circuitry. The overflow and underflow conditions are also detected and signaled in this stage.

3.2. Alignment operation: leading zero detection and swapping
The alignment operation is made up of the leading zero detection (LZD) and swapping stages. This unit reads the unpacked signals from the previous unit and applies an LZD process to both operands to count and remove the leading zeros (LZDA, LZDB), yielding normalized significands (A1, B1). The exponents are adjusted accordingly: EA1 = EA - LZDA and EB1 = EB - LZDB. The two significands are swapped if EB1 > EA1, and a new exponent E2 is determined; simultaneously a SWAP flag identifies the swap (SWAP = 1 if the operands are swapped, SWAP = 0 otherwise). The LZD and swapping processes produce the significands A2 and B2, RSA and E2, which were explained at the beginning of Section 3. The implementation of the LZD and swapping units is shown in Fig. 3. The LZD design is based on a low-level combinational circuit that shortens the period of the data path; the proposed circuit is lighter than the one proposed in [11]. The SUB-ABS process calculates |EA1 - EB1| using a binary absolute-value unit, which reduces the delay [12].

3.3. Shifting right
B2 is shifted right by the RSA amount in order to align both significands (A2, B2). The result is stored in a 19-digit register (B3) in which the last three digits represent the predicted guard and round digits (GD1, RD1) and the extra digit (ED), respectively. The ED digit is used when a subtraction occurs. This module also generates a 2-bit predicted sticky bit signal PSB(1:0), which predicts two sticky bits formed as follows: SB1 = PSB(1) or PSB(0), and SB2 = PSB(0); the choice between them is made by the conditions described in the post-correction stage. PSB is produced by concatenating the least significant digit (LSD) of B3, $(B3)_0$, and the OR of the digits of B2 that are not retained in B3.
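The right shift into the 19-digit B3 register can be modelled as below. This is a sketch under assumptions: B2 is a 16-digit Python integer placed in the 16 most significant positions of B3, PSB(0) is taken as the OR of the digits shifted out of B3, and PSB(1) is taken as 1 when the digit $(B3)_0$ is nonzero. The last point is one reading of the concatenation described above and may differ from the actual circuit. The name `shift_right` is hypothetical.

```python
def shift_right(b2: int, rsa: int):
    """Return (B3, GD1, RD1, ED, (PSB1, PSB0)) for a 16-digit B2."""
    if not 0 <= rsa <= 18:
        raise ValueError("RSA is limited to p_max = 18 digits")
    ext = b2 * 1000                  # B2 in the top 16 digits of a 19-digit register
    b3 = ext // 10 ** rsa            # digits kept after the shift
    lost = ext % 10 ** rsa           # digits shifted out of the register
    gd1 = (b3 // 100) % 10           # (B3)_2: predicted guard digit
    rd1 = (b3 // 10) % 10            # (B3)_1: predicted round digit
    ed = b3 % 10                     # (B3)_0: extra digit
    psb1 = 1 if ed != 0 else 0       # assumption: OR of the bits of (B3)_0
    psb0 = 1 if lost != 0 else 0     # OR of the digits that fell out of B3
    return b3, gd1, rd1, ed, (psb1, psb0)
```

For B2 = 1234567890123456 and RSA = 5, the 16 digits aligned with A2 are 0000012345678901, followed by GD1 = 2, RD1 = 3 and ED = 4; the digits 5 and 6 are shifted out and set PSB(0).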
Finally, FS, E4 and S3 are processed in the encoder unit to generate the IEEE 754-2008 result. This stage also handles special cases such as not-a-number (NaN) or infinity.
Fig. 3. Alignment and swapping unit.
3.1. IEEE 754-2008 Decoder/Coder
A DFP number represented in the IEEE 754-2008 interchange format (Fig. 1) is unpacked into three fields: sign, exponent and significand.
3.4. Control signal generation
This stage takes as inputs PSB, EOP, and the significands B3 and A2. The adder/subtractor operation is controlled by the extra signals generated in this unit: an initial carry (CIN), the prior guard and round digits (GD1, RD1), an additional digit (ED) and a signal that indicates whether A2 >= B3 (AGTB). In parallel, the A2 operand and the 16 most significant digits (MSD) of B3 are registered in the adder unit; this process is described in the next sub-section. The goal is to compute the guard digit (GD1), the round digit (RD1) and the extra digit for subtraction (ED). The final alignment is shown in Fig. 4: both operands are placed starting from the MSD, so the first is equal to A2 and the second is equal to $(B3)_{18}$ down to $(B3)_1$, while the initial value of the extra digit (IED) is set to $(B3)_0$. This stage only predicts its output signals; the corrected, definitive values are generated in the post-correction unit. The implementation of this stage is fully combinational.

3.5. Decimal Addition/Subtraction design
Since both operands, A2 and $(B3)_{18}$ down to $(B3)_3$, are now available, a 16-digit decimal adder/subtractor is used. First, B3 must be corrected: if a subtraction is selected (EOP = 1), then $(B3_{corrected})_i = 9 - (B3)_i$; otherwise $(B3_{corrected})_i = (B3)_i$. Taking the two operands (A2, B3corrected), EOP and AGTB as inputs, the goal of the implementation is to calculate the corresponding partial sum, S1 = A2 + (B3corrected) over digits 18 down to 3. The proposed algorithm is based on 10's complement BCD numbers and on carry-chain techniques. Carry-chain addition consists of computing all carry-propagate and carry-generate conditions beforehand in order to reduce the overall execution time. The high-performance adder proposed in [13] has been used.

3.6. Post-Correction
The temporary result generated by the adder unit requires a post-correction unit to convert the uncorrected result. As indicated at the beginning of Section 3, this stage takes as inputs PSB, the exponent E2, GD1, RD1, ED, the partial sum S1 and COUT. The outputs (corrected inputs) are the final sticky bit (FSB), the final guard digit (GD2), the final round digit (RD2), the corrected partial sum (S2), and the updated exponent (E3). The conditions for performing this correction are defined below:

1.- If EOP = 0 and COUT = 1. This condition applies to an effective addition that produces a carry-out, so a correction is applied to S1. In order to fit 16 digits, S2 is formed by taking COUT as the MSD followed by the 15 most significant digits of S1. The final guard digit (GD2) is the LSD of S1, the final round digit (RD2) is updated to GD1, the final exponent E3 is corrected to E2 + 1, and the final sticky bit (FSB) is the bit generated by the OR operation PSB(0) or PSB(1) or RD1.

2.- If EOP = 1 and MSD of S1 = 0 and GD1 > 0. This condition applies to an effective subtraction in which the MSD of S1 is 0 and the input guard digit is greater than zero. The result S2 is made up of the 15 least significant digits of S1 followed by GD1 as the LSD of S2. The final guard digit GD2 is equal to RD1, the final round digit RD2 is equal to the extra digit (ED), the final exponent (E3) is updated to E2 - 1, and the new sticky bit (FSB) is set to PSB(0).

3.- If EOP = 1 and MSD of S1 = 0 and GD1 = 0. This condition applies to an effective subtraction in which the MSD of S1 is 0 and the input guard digit is equal to 0. The result S2 is equal to S1, the final guard digit GD2 is equal to GD1, the final round digit RD2 is equal to RD1, the final exponent (E3) is set to E2, and the new sticky bit (FSB) is the bit generated by the OR operation PSB(0) or PSB(1).
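The three post-correction cases can be written down as a small behavioural model. The sketch below follows the digit moves described in Sub-section 3.6 literally, with S1 held as a 16-digit Python integer and PSB passed as the pair (PSB(1), PSB(0)). Treating every input combination not listed in the three cases as the pass-through of case 3 is an assumption, and the name `post_correct` is hypothetical.

```python
def post_correct(s1, cout, eop, e2, gd1, rd1, ed, psb):
    """Return (S2, GD2, RD2, E3, FSB) from the adder outputs and predicted digits."""
    psb1, psb0 = psb
    msd = s1 // 10 ** 15                     # most significant digit of S1
    if eop == 0 and cout == 1:
        # Case 1: carry-out on addition, shift right by one digit.
        s2 = cout * 10 ** 15 + s1 // 10
        gd2 = s1 % 10
        rd2 = gd1
        e3 = e2 + 1
        fsb = 1 if (psb0 or psb1 or rd1 != 0) else 0
    elif eop == 1 and msd == 0 and gd1 > 0:
        # Case 2: leading zero after subtraction, shift left by one digit.
        s2 = (s1 % 10 ** 15) * 10 + gd1
        gd2 = rd1
        rd2 = ed
        e3 = e2 - 1
        fsb = psb0
    else:
        # Case 3, also used here (as an assumption) when no correction applies.
        s2, gd2, rd2, e3 = s1, gd1, rd1, e2
        fsb = psb0 | psb1
    return s2, gd2, rd2, e3, fsb
```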
Fig. 4. Example of alignment for the A2 and B3 registers. For addition (EOP = 0) or subtraction (EOP = 1), the 16 digits $(A2)_{15}$ to $(A2)_0$ are aligned, from MSD to LSD, with $(B3)_{18}$ to $(B3)_3$; the two digits $(B3)_2$ and $(B3)_1$ supply GD1 (prior guard digit) and RD1 (prior round digit), and IED (initial extra digit) = $(B3)_0$. The A2 + B3 component also receives CIN, AGTB and EOP, where AGTB = 0 if A > B and 1 otherwise.
3.7. Rounding
This unit takes the outputs of the previous unit as inputs; it is a simple stage because the signals it receives have already been verified. When no overflow or underflow is present (this stage detects the overflow and underflow cases of the decimal adder), rounding must be performed whenever the significant digits of S2 do not fit within the p digits (p = 16) of the result significand. The rounding unit examines the sticky bit (FSB), the guard digit (GD2) and the round digit (RD2) in accordance with the chosen rounding mode. The rounding module decides whether to increment according to the rounding strategy, and a conditional plus-one decimal adder adds one to S2 or not depending on the signal add_one. The rounding decision add_one is made by the following algorithm:
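Algorithm 2. Round ties to even rounding strategy

    GD = GD2                        -- guard digit
    SB = FSB or (RD2 /= 0)          -- sticky: anything nonzero below the guard digit
    if GD < 5 then
        add_one := 0;
    elsif GD > 5 then
        add_one := 1;
    else                            -- GD = 5
        if SB = 0 then
            add_one := (S2)0 mod 2; -- exact tie: increment only if the LSD of S2 is odd
        else
            add_one := 1;
        end if;
    end if;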
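The rounding decision and the conditional increment can be modelled in a few lines of Python. This sketch assumes round-half-even semantics in which an exact tie (guard digit 5 with nothing nonzero below it) increments only when the LSD of S2 is odd, and it folds RD2 into the sticky information. It also covers the case described in this section where incrementing an all-nines significand needs an exponent adjustment. The name `round_ties_even` is hypothetical.

```python
def round_ties_even(s2: int, gd2: int, rd2: int, fsb: int, e3: int, p: int = 16):
    """Return (S3, E4) after rounding the p-digit S2 to nearest, ties to even."""
    sticky = 1 if (rd2 != 0 or fsb) else 0   # anything nonzero below the guard digit
    if gd2 < 5:
        add_one = 0
    elif gd2 > 5:
        add_one = 1
    elif sticky:
        add_one = 1                          # above the halfway point
    else:
        add_one = s2 % 2                     # exact tie: make the LSD even
    s3 = s2 + add_one
    e4 = e3
    if s3 == 10 ** p:
        # 99...9 + 1 needs p + 1 digits: keep 10...0 in p digits and bump the exponent.
        s3 = 10 ** (p - 1)
        e4 = e3 + 1
    return s3, e4
```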
One additional operation may be needed at this point. If the sum S2 is 99...9 (all nines) and the add_one signal is asserted, the resulting decimal significand is 100...0, which needs one extra digit. In that situation the conditional adder returns 100...0 with p digits and raises an additional signal that adds one to the intermediate exponent E3, producing the final exponent E4.

4. FPGA IMPLEMENTATION OF DFP ADDITION/SUBTRACTION
All circuits were described in VHDL. Some parts of the adder/subtractor use low-level component instantiation. The circuits have been implemented on a Xilinx Virtex-5, device XC5VLX220T, speed grade -2, using timing constraints [14]. XST [15] and the Xilinx ISE 12.1 tool [16] have been used for synthesis and implementation, respectively. Table 1 gives an area breakdown with the approximate contribution of each component. The design uses 863 slices (2390 LUTs, 935 FFs) and can operate at 200 MHz with a latency of 8 cycles.

Table 1. Area breakdown of DFP Adder/Subtractor

| Component | Cycles | LUTs | % |
|---|---|---|---|
| Decoder | 1 | 128 | 5.6% |
| Leading Zero Detector | 1 | 18 | 0.8% |
| Swapping | 1 | 16 | 0.7% |
| Pre-signal generation | 1 | 97 | 4.2% |
| Carry-chain Adder | 1 | 375 | 16.4% |
| Post-Correction | 1 | 91 | 4.0% |
| Rounding (overflow, underflow) | 1 | 110 | 4.8% |
| Further combinational circuits | 0 | 1416 | 59.9% |
| Special Cases | 0 | 4 | 0.2% |
| Further combinational circuits | 0 | 48 | 1.0% |
| Coder | 1 | 83 | 3.6% |
| Entire Design | 8 | 2390 | 100% |

Row group label in the original table: BCD Adder.

4.1. Results verification
To validate the designs, large numbers of random vectors were applied to an automated testbench that tests the behavioral model using ModelSim. For the floating-point adder/subtractor, over 40,000 test cases were used. All rounding modes and exceptions are supported in this version. Additionally, the core was implemented as a MicroBlaze coprocessor, and again 40,000 random test cases were used to validate the results.

4.2. Results Comparisons
Since no previous results were found for decimal64 addition/subtraction on FPGAs, as a first comparison we match our approach against a BID adder/subtractor [9] and the double-precision binary floating-point adder/subtractor [17] provided by the Xilinx Core Generator, both implemented on Virtex-5. Reference [9] describes the FPGA implementation of a decimal 64-bit adder using the Binary Integer Decimal (BID) encoding; in BID the 16 decimal digits are represented as a 54-bit number. The results are shown in Table 2. The area is expressed in LUTs, FFs, DSP blocks and Block RAMs (BR), and the timing information includes the maximum operating frequency (MHz), the number of cycles (#Cy), the latency (T), and the throughput in mega-operations per second. The results in Table 2 show that our approach outperforms the BID approach of [9] in terms of latency and throughput. The proposed decimal adder/subtractor even has a performance similar to that of the binary floating-point adder, but doubling the area.

Table 2. Implementation result comparison for BID adder, a Binary64 and the proposed circuit in Virtex-5

| Adder / Subtractor | LUT | FF | DSP | BR | MHz | #Cy | T (ns) | Mop/sec |
|---|---|---|---|---|---|---|---|---|
| BID [9] | 2171 | 1392 | 12 | 3 1 | 163.9 | 13/18 | 79.3/109.8 | 12.6/9.1 |
| Binary64 [17] | 734 | 960 | - | - | 333 | 14 | 42.0 | 333 |
| Proposed | 2390 | 935 | - | - | 200 | 8 | 40.0 | 200 |
The performance of our 16-digit decimal adder/subtractor relies on the carry-chain techniques described in [13], which are the key to its speed. The presented results include round-to-nearest, ties-to-even rounding, special case detection and signaling, and overflow/underflow detection.
As a second comparison, Table 3 presents several software and hardware results for decimal adder/subtractors. The reports [18] and [19] are based on Intel's software DFP library, while Cowlishaw reports a worst-case performance of 848 cycles [21] for the decNumber software [20]. The hardware implementations include the IBM Power6 and z10 processors, which have hardware acceleration for decimal floating point [22][23], and the Binary Integer Decimal (BID) implementations in 65 nm and on Virtex-5 [9, 10]. The proposed FPGA implementation is more than one order of magnitude faster than the software implementations. The hardware acceleration present in the z10 and Power6 is faster than a single Virtex-5 core. However, processors have a single DFP unit, while a single Virtex-5 FPGA (LX330) can fit nearly 60 adder/subtractor cores.
Table 3. Latency of the DFP adder/subtractor implemented on different platforms.

| Type | Technology | Clk (GHz) | Cycles | Delay (ns) | Mops/sec |
|---|---|---|---|---|---|
| SW | Itanium2 [18] | 1.4 | 219 | 156.4 | 6.4 |
| SW | Xeon5100 [19] | 3.0 | 133 | 44.3 | 22.6 |
| SW | Xeon [18] | 3.2 | 249 | 77.8 | 12.9 |
| SW | Pentium M [21] | 1.5 | 848 | 565.3 | 1.8 |
| HW | Power6 [22] | 5.0 | 17 | 3.4 | 294.1 |
| HW | Z10 [23] | 4.4 | 12 | 2.7 | 366.7 |
| HW | BID 65nm [10] | 1.3 | 3-13 | 10.0 | 100.0 |
| HW | BID Virtex 5 [9] | 0.16 | 13-18 | 109.8 | 9.1 |
| HW | Proposed Virtex5 | 0.2 | 8 | 40 | 200.0 |
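The delay and throughput columns of Table 3 follow from the clock frequency and the cycle count. The short Python sketch below reproduces a few entries; the helper names are hypothetical, and the distinction it draws is an interpretation of the table: a fully pipelined unit such as the proposed design can deliver one result per cycle, whereas rows such as the software libraries are reported as one operation per latency period.

```python
def delay_ns(cycles: int, clk_mhz: float) -> float:
    """Latency of one operation in nanoseconds."""
    return cycles * 1000.0 / clk_mhz


def mops_unpipelined(cycles: int, clk_mhz: float) -> float:
    """Throughput when a new operation starts only after the previous one ends."""
    return clk_mhz / cycles


def mops_pipelined(clk_mhz: float) -> float:
    """Throughput of a fully pipelined unit: one result per clock cycle."""
    return clk_mhz
```

For example, 219 cycles at 1.4 GHz give about 156.4 ns and 6.4 Mops/sec, while 8 cycles at 200 MHz give 40 ns of latency and, being pipelined, 200 Mops/sec.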
5. SUMMARY
To the authors' knowledge, this is the first FPGA hardware implementation of a DFP adder/subtractor that uses the densely packed decimal encoding and is compliant with the IEEE 754-2008 standard. The design is fully pipelined with a latency of 8 cycles and can operate at 200 MHz on a Virtex-5 device. The proposed architecture pre-normalizes the BCD significands, which improves the operating frequency. The BCD significand addition/subtraction is based on a previously published carry-chain structure [13]. The design is at least one order of magnitude faster than software libraries running on a general-purpose processor. Compared with the hardware accelerators present in modern IBM processors, its throughput is of the same order of magnitude. Nevertheless, processors have a single DFP unit, while tens of DFP cores can be arranged in a single FPGA device.

6. REFERENCES
[1] M. F. Cowlishaw, "Decimal floating-point: algorism for computers," in Proc. 16th IEEE Symp. Computer Arithmetic, 2003, pp. 104-111.
[2] E. M. Schwarz, J. S. Kapernick, and M. F. Cowlishaw, "Decimal floating-point support on the IBM System z10 processor," IBM Journal of Research and Development, 2009.
[3] "IEEE Standard for Floating-Point Arithmetic," IEEE Std 754-2008, pp. 1-58, 2008.
[4] F. Y. Busaba, C. A. Krygowski, W. H. Li, E. M. Schwarz, and S. R. Carlough, "The IBM z900 decimal arithmetic unit," in Proc. Conf. Signals, Systems and Computers Record of the Thirty-Fifth Asilomar Conf., vol. 2, 2001, pp. 1335-1339.
[5] W. Haller, K. Ulrich, L. Thomas, and H. Wetter, "Combined binary/decimal adder unit," International Business Machines Corporation (Armonk, NY), 1999.
[6] G. Bohlender and T. Teufel, "BAP-SC: A Decimal Floating-Point Processor for Optimal Arithmetic," in Computerarithmetic: Scientific Computation and Programming Languages, E. Kaucher, U. Kulisch, and C. Ullrich, Eds. B.G. Teubner Verlag, 1987, pp. 31-58.
[7] J. Thompson, N. Karra, and M. J. Schulte, "A 64-bit decimal floating-point adder," in Proc. IEEE Computer Society Annual Symp. VLSI, 2004, pp. 297-298.
[8] M. S. Cohen, T. E. Hull, and V. C. Hamacher, "CADAC: A Controlled-Precision Decimal Arithmetic Unit," no. 4, pp. 370-377, 1983.
[9] A. Farmahini-Farahani, C. Tsen, and K. Compton, "FPGA implementation of a 64-Bit BID-based decimal floating-point adder/subtractor," in Proc. Int. Conf. Field-Programmable Technology FPT 2009, 2009, pp. 518-521.
[10] C. Tsen, S. Gonzalez-Navarro, and M. Schulte, "Hardware design of a Binary Integer Decimal-based floating-point adder," in Computer Design, 2007. ICCD 2007. 25th International Conference on, 2007, pp. 288-295.
[11] C. Minchola and G. Sutter, "A FPGA IEEE-754-2008 Decimal64 Floating-Point Multiplier," in Proc. Int. Conf. Reconfigurable Computing and FPGAs ReConFig 09, 2009, pp. 59-64.
[12] L.-K. Wang and M. J. Schulte, "Decimal Floating-Point Adder and Multifunction Unit with Injection-Based Rounding," in Proc. 18th IEEE Symp. Computer Arithmetic ARITH 07, 2007, pp. 56-68.
[13] M. Vazquez, G. Sutter, G. Bioul, and J. P. Deschamps, "Decimal Adders/Subtractors in FPGA: Efficient 6-input LUT Implementations," in Proc. Int. Conf. Reconfigurable Computing and FPGAs ReConFig 09, 2009, pp. 42-47.
[14] Xilinx Inc., XST User Guide 12.1, v12.1 ed., Xilinx Inc., June 2009. [Online]. Available: http://www.xilinx.com
[15] Xilinx Inc., Xilinx ISE Design Suite 12.1 Software Manuals, v12.1 ed., Xilinx Inc., June 2009. [Online]. Available: http://www.xilinx.com
[16] Xilinx Inc., Virtex-5 Libraries Guide for VHDL design, v12.1 ed., Xilinx Inc., June 2009. [Online]. Available: http://www.xilinx.com
[17] Xilinx Inc., DS335: Floating-Point Operator v5.0, June 2009.
[18] M. Cornea, J. Harrison, C. Anderson, P. Tang, E. Schneider, and E. Gvozdev, "A software implementation of the IEEE 754R decimal floating-point arithmetic using the binary encoding format," Computers, IEEE Transactions on, pp. 148-162, 2009.
[19] M. Cornea, C. Anderson, J. Harrison, P. T. P. Tang, E. Schneider, and C. Tsen, "A software implementation of the IEEE 754R decimal floating-point arithmetic using the binary encoding format," in Computer Arithmetic, 2007. ARITH 07. 18th IEEE Symposium on, 2007, pp. 29-37.
[20] The decNumber C library, v3.68 ed., IBM UK Laboratories, January 2010. [Online]. Available: http://speleotrove.com/decimal/decnumber.pdf
[21] M. Cowlishaw. (2009) Decimal library performance. [Online]. Available: http://speleotrove.com/decimal/decperf.pdf
[22] L. Eisen, J. W. Ward, H.-W. Tast, N. Mading, J. Leenstra, S. M. Mueller, C. Jacobi, J. Preiss, E. M. Schwarz, and S. R. Carlough, "IBM POWER6 accelerators: VMX and DFU," IBM Journal of Research and Development, pp. 1-21, 2007.
[23] C. F. Webb, "IBM z10: The Next-Generation Mainframe Microprocessor," vol. 28, no. 2, pp. 19-29, 2008.
