
Ch. 2 Floating Point Numbers Representation

Comp Sci 251 -- Floating point

Floating point numbers

Binary representation of fractional numbers
IEEE 754 standard


Binary → Decimal conversion

23.47 = 2×10^1 + 3×10^0 + 4×10^-1 + 7×10^-2
        (the "." is the decimal point)
10.01two = 1×2^1 + 0×2^0 + 0×2^-1 + 1×2^-2
        (the "." is the binary point)
         = 1×2 + 0×1 + 0×(1/2) + 1×(1/4)
         = 2 + 0.25 = 2.25
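The positional expansion above can be sketched in a few lines of Python. The function name `binary_to_decimal` is made up for this illustration, and exact fractions are used so that no rounding hides the arithmetic.

```python
from fractions import Fraction


def binary_to_decimal(text: str) -> Fraction:
    """Evaluate a binary numeral such as '10.01' as an exact fraction."""
    whole, _, frac = text.partition(".")
    value = Fraction(0)
    # Digits left of the binary point carry weights 2^0, 2^1, 2^2, ...
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * Fraction(2) ** power
    # Digits right of the binary point carry weights 2^-1, 2^-2, ...
    for position, digit in enumerate(frac, start=1):
        value += int(digit) * Fraction(1, 2 ** position)
    return value


print(float(binary_to_decimal("10.01")))  # 2.25
```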


Decimal → Binary conversion

Write number as sum of powers of 2

0.8125 = 0.5 + 0.25 + 0.0625
       = 2^-1 + 2^-2 + 2^-4
       = 0.1101two

Algorithm: Repeatedly multiply fraction by two until
fraction becomes zero. The integer part of each product is the next bit.
0.8125 × 2 = 1.625   → bit 1
0.625  × 2 = 1.25    → bit 1
0.25   × 2 = 0.5     → bit 0
0.5    × 2 = 1.0     → bit 1
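A minimal sketch of the repeated-doubling algorithm follows. The name `fraction_to_binary` and the `max_bits` cut-off are assumptions added for this example; the cut-off is only there so the loop also stops for fractions whose binary expansion never ends.

```python
from fractions import Fraction


def fraction_to_binary(value, max_bits=32):
    """Convert a fraction in [0, 1) to a binary string by repeated doubling."""
    frac = Fraction(value)
    if not 0 <= frac < 1:
        raise ValueError("expected a value in [0, 1)")
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        if frac >= 1:
            bits.append("1")  # integer part is 1: emit a 1 and drop it
            frac -= 1
        else:
            bits.append("0")  # integer part is 0
    return "0." + ("".join(bits) or "0")


print(fraction_to_binary("0.8125"))  # 0.1101
```

Passing the value as a string (or a `Fraction`) keeps the input exact; passing a Python float would convert whatever binary approximation the float already holds.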


Beware

Finite decimal digits ≠ finite binary digits

Example:
0.1ten → 0.2 → 0.4 → 0.8 → 1.6 → 1.2 → 0.4 → 0.8
→ 1.6 → 1.2 → 0.4 → ...
0.1ten = 0.00011001100110011...two
       = 0.0(0011)two, with the group 0011 repeating forever (infinite repeating binary)
The more bits are kept, the closer the binary representation gets to 0.1ten
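To make the last point concrete, the sketch below truncates 0.1ten to a growing number of binary digits and prints the remaining error. The helper `truncated_binary` is a hypothetical name for this illustration; exact fractions are used so the error itself is not affected by rounding.

```python
from fractions import Fraction


def truncated_binary(value: Fraction, bits: int) -> Fraction:
    """Keep only the first `bits` binary digits of a fraction in [0, 1)."""
    scaled = value * 2 ** bits
    return Fraction(scaled.numerator // scaled.denominator, 2 ** bits)


tenth = Fraction(1, 10)
for bits in (5, 9, 13, 17, 21):
    approx = truncated_binary(tenth, bits)
    # The error shrinks as bits are added but never reaches zero.
    print(bits, float(approx), float(tenth - approx))
```

With 5 bits the approximation is 0.00011two = 3/32 = 0.09375; each further group of 0011 moves it closer to 0.1ten.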


Scientific notation

Decimal:
-123,000,000,000,000 → -1.23 × 10^14
0.000 000 000 000 000 123 → +1.23 × 10^-16

Binary:
110 1100 0000 0000 → 1.1011 × 2^14
-0.0000 0000 0000 0001 1011 → -1.1011 × 2^-16
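Binary normalization can be checked with Python's standard `math.frexp`, which returns a mantissa in [0.5, 1) and an exponent. The wrapper `normalize` below is a hypothetical helper that rescales that result to the 1.xxx × 2^e form used on this slide.

```python
import math


def normalize(x):
    """Return (sign, significand, exponent) with 1 <= significand < 2
    and x == sign * significand * 2**exponent."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("only finite non-zero values can be normalized")
    mantissa, exponent = math.frexp(abs(x))  # mantissa is in [0.5, 1)
    sign = -1 if x < 0 else 1
    # Shift one binary place so the leading 1 sits left of the binary point.
    return sign, mantissa * 2, exponent - 1


# 110 1100 0000 0000two and -0.0000 0000 0000 0001 1011two from the slide
print(normalize(0b110_1100_0000_0000))
print(normalize(-0b1_1011 / 2 ** 20))
```

Both calls give the significand 1.6875, which is 1.1011two, with exponents 14 and -16.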


Floating point representation

Three pieces:
    sign
    exponent
    significand

Format:
    | sign | exponent | significand |

Fixed-size representation (32-bit, 64-bit)
    1 sign bit
    more exponent bits → greater range
    more significand bits → greater accuracy


IEEE 754 floating point standards


Single precision (32-bit) format

| S (1 bit) | E (8 bits) | F (23 bits) |

Normalized rule: number represented is
(-1)^S × 1.F × 2^(E-127), E ≠ 00...0 or 11...1

Example: +101101.101 → +1.01101101 × 2^5
0 1000 0100 0110 1101 0000 0000 0000 000
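The normalized rule can be applied by hand to a 32-bit pattern. In the sketch below, `decode_single` is a hypothetical helper that splits the pattern into S, E and F and evaluates the rule; the example pattern is the one from this slide written in hexadecimal (0x42368000).

```python
def decode_single(bits: int) -> float:
    """Evaluate (-1)^S * 1.F * 2^(E-127) for a normalized 32-bit pattern."""
    sign = bits >> 31               # 1 sign bit
    exponent = (bits >> 23) & 0xFF  # 8 exponent bits, unsigned
    fraction = bits & 0x7FFFFF      # 23 fraction bits
    if exponent in (0, 255):
        raise ValueError("E = 0 and E = 255 are reserved for special cases")
    significand = 1 + fraction / 2 ** 23  # restore the leading 1
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)


# 0 1000 0100 0110 1101 0000 0000 0000 000
print(decode_single(0x42368000))  # 45.625
```

45.625 is 101101.101two, the value the example started from.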


Features of IEEE 754 format


Sign:
    1 → negative, 0 → non-negative

Significand:
    Normalized number: always a 1 left of binary point
    (except when E is 0 or 255)
    Do not waste a bit on this 1 → "hidden 1"

Exponent:
    Not two's-complement representation
    Unsigned interpretation minus bias

Example: 0.75

0.75ten = 0.11two = 1.1 × 2^-1
1.1 = 1.F → F = 1 (padded with zeros to 23 bits)
E - 127 = -1 → E = 127 - 1 = 126 = 01111110two
S = 0

00111111010000000000000000000000 = 0x3F400000
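The hand encoding can be cross-checked with Python's standard `struct` module, which packs a value into IEEE 754 single precision. The helper name `single_fields` is an assumption for this sketch.

```python
import struct


def single_fields(x: float):
    """Return (S, E, F, raw 32-bit pattern) of x stored in single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF, bits


s, e, f, bits = single_fields(0.75)
print(s, e, f"{f:023b}")
print(f"0x{bits:08X}")  # 0x3F400000
```

The printed fields are S = 0 and E = 126, and F is a single 1 followed by zeros, matching the derivation above.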

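Example 0.1ten - Check float.a

0.1ten = 0.0(0011)two, with the group 0011 repeating
       = 1.1(0011)two × 2^-4 = 1.F × 2^(E-127)
F = 1(0011) = 10011001100110011...
-4 = E - 127
E = 127 - 4 = 123 = 01111011two

0 01111011 10011001100110011001100 (F cut off after 23 bits)
Rounding to nearest instead of cutting off sets the last fraction bit to 1, which gives 0x3DCCCCCD.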
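Because F repeats forever, the stored single-precision value cannot equal 0.1ten exactly. The sketch below uses the standard `struct` module to show the stored pattern and compares it with the exact fraction 1/10. The helper names are hypothetical, and round-to-nearest behaviour of the packing step is assumed.

```python
import struct
from fractions import Fraction


def to_single_bits(x: float) -> int:
    """Pack x as IEEE 754 single precision and return the 32-bit pattern."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits


def from_single_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value


bits = to_single_bits(0.1)
stored = Fraction(from_single_bits(bits))  # exact value of the stored bits
print(f"{bits:032b}", f"0x{bits:08X}")
print(stored > Fraction(1, 10))  # True: the stored value is slightly above 0.1
```

The pattern one step below, 0x3DCCCCCC, is the truncated encoding and lies slightly under 0.1ten.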


IEEE Double precision standard

| S (1 bit) | E (11 bits) | F (52 bits) |

E not 00...0 (decimal 0) or 11...1 (decimal 2047)

Normalized rule: number represented is
(-1)^S × 1.F × 2^(E-1023)
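Python's own `float` is an IEEE 754 double on the usual platforms (an assumption of this sketch), so the double-precision rule can be tested by taking a float apart and putting it back together. `double_fields` and `from_double_fields` are hypothetical helper names.

```python
import struct


def double_fields(x: float):
    """Return (S, E, F) of a Python float viewed as a 64-bit pattern."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


def from_double_fields(sign: int, exponent: int, fraction: int) -> float:
    """Evaluate (-1)^S * 1.F * 2^(E-1023) for a normalized double."""
    if exponent in (0, 2047):
        raise ValueError("E = 0 and E = 2047 are reserved for special cases")
    return (-1) ** sign * (1 + fraction / 2 ** 52) * 2.0 ** (exponent - 1023)


print(double_fields(0.75))                        # S = 0, E = 1022
print(from_double_fields(*double_fields(0.75)))   # 0.75
```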


Special-case numbers

Problem:
    hidden 1 prevents representation of 0

Solution:
    make exceptions to the rule

Bit patterns reserved for unusual numbers:
    E = 00...0
    E = 11...1

Special-case numbers
Zeroes:
    S   E        F
    0   00...0   00...0     +0
    1   00...0   00...0     -0

Infinities:
    S   E        F
    0   11...1   00...0     +∞
    1   11...1   00...0     -∞
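The four reserved patterns can be inspected in single precision with the standard `struct` module. This is only a sketch; `single_from_bits` is a hypothetical helper name.

```python
import math
import struct


def single_from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value


pos_zero = single_from_bits(0x00000000)  # S=0, E=00...0, F=00...0
neg_zero = single_from_bits(0x80000000)  # S=1, E=00...0, F=00...0
pos_inf = single_from_bits(0x7F800000)   # S=0, E=11...1, F=00...0
neg_inf = single_from_bits(0xFF800000)   # S=1, E=11...1, F=00...0

print(pos_zero, neg_zero, pos_inf, neg_inf)
print(pos_zero == neg_zero)  # True: the two zeroes compare equal
```

The sign of -0 is still stored, which `math.copysign(1, neg_zero)` reveals by returning -1.0.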

Denormalized numbers

No hidden 1
Allows numbers very close to 0
E = 00...0 → Different interpretation applies
Denormalization rule: number represented is
(-1)^S × 0.F × 2^-126 (single-precision)
(-1)^S × 0.F × 2^-1022 (double-precision)

Note: zeroes follow this rule

Not a Number (NaN): E = 11...1; F ≠ 00...0
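A short sketch of the single-precision denormalization rule, checked against the standard `struct` module. `decode_denormal` and `single_from_bits` are hypothetical helper names, and 0x7FC00000 is just one of many bit patterns with E = 11...1 and F ≠ 00...0.

```python
import math
import struct


def single_from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value


def decode_denormal(bits: int) -> float:
    """Evaluate (-1)^S * 0.F * 2^-126 for a pattern with E = 00...0."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0:
        raise ValueError("the denormalized rule applies only when E = 0")
    return (-1) ** sign * (fraction / 2 ** 23) * 2.0 ** -126  # no hidden 1


smallest = decode_denormal(0x00000001)          # only the last F bit is set
print(smallest == 2.0 ** -149)                  # True
print(smallest == single_from_bits(0x00000001)) # True
print(math.isnan(single_from_bits(0x7FC00000))) # True
```

With F = 00...0 the same rule yields 0, which is why the zeroes need no separate formula.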



IEEE 754 summary


E = 00...0, F = 00...0      → 0
E = 00...0, F ≠ 00...0      → denormalized
00...0 < E < 11...1         → normalized
E = 11...1:
    F = 00...0              → infinities
    F ≠ 00...0              → NaN
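The summary maps directly onto a small classifier for single-precision bit patterns. `classify_single` is a hypothetical name, and the returned labels simply repeat the wording of this slide.

```python
def classify_single(bits: int) -> str:
    """Name the IEEE 754 category of a 32-bit single-precision pattern."""
    exponent = (bits >> 23) & 0xFF  # 8-bit E field
    fraction = bits & 0x7FFFFF      # 23-bit F field
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"


for pattern in (0x00000000, 0x00000001, 0x3F400000, 0x7F800000, 0x7FC00000):
    print(f"0x{pattern:08X}", classify_single(pattern))
```

The sign bit plays no part in the classification, so 0x80000000 (-0) and 0xFF800000 (-∞) fall into the same categories as their positive counterparts.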
