
1/21/2009

Floating Point Representation

COMP370
Introduction to Computer Architecture

Binary Fractions
Each position is twice the value of the position to the right.

Position   2^3   2^2   2^1   2^0   .   2^-1   2^-2   2^-3
Value      8     4     2     1     .   1/2    1/4    1/8
Bit        1     1     1     0     .   0      1      0

Adding the powers of 2 gives
8 + 4 + 2 + 0.25 = 14.25

What is 111.11 in decimal?
1. 7.75
2. 31
3. 7.375
4. 15.25

What is 8.5 in binary?
1. 11111111.11111
2. 1000.01
3. 0.100011
4. 1000.10
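
The positional weights on the Binary Fractions slide can be sketched in a few lines of Python. This is only an illustration: the helper name `binary_to_decimal` is an assumption, not part of the course material, and it expects a well-formed string of 0s and 1s with at most one radix point.

```python
def binary_to_decimal(bits: str) -> float:
    """Convert a binary string such as '1110.010' to its decimal value."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Integer part: weights 2^0, 2^1, ... moving left from the radix point.
    for power, bit in enumerate(reversed(whole)):
        value += int(bit) * 2 ** power
    # Fraction part: weights 2^-1, 2^-2, ... moving right from the radix point.
    for power, bit in enumerate(frac, start=1):
        value += int(bit) * 2 ** -power
    return value


print(binary_to_decimal("1110.010"))  # 14.25
```

The call reproduces the slide's sum 8 + 4 + 2 + 0.25.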


Range of Values
Unsigned integers: 0 to 2^n - 1
For byte, from 0 to 255
For int, from 0 to 4.2 x 10^9

2's complement: -(2^(n-1)) to +(2^(n-1) - 1)
For byte, from -128 to 127
For short, from -32768 to 32767
For int, from -2147483648 to 2147483647
For long, from -9.2 x 10^18 to 9.2 x 10^18
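
As a quick check of the two range formulas, the following sketch evaluates them for common bit widths. The function names are illustrative assumptions, and the width-to-type mapping (8, 16, 32, 64 bits) follows the Java-style sizes the slide implies.

```python
def unsigned_range(n_bits: int) -> tuple:
    """Smallest and largest unsigned value that fits in n_bits."""
    return 0, 2 ** n_bits - 1


def twos_complement_range(n_bits: int) -> tuple:
    """Smallest and largest 2's complement value that fits in n_bits."""
    return -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1


for name, bits in [("byte", 8), ("short", 16), ("int", 32), ("long", 64)]:
    print(name, unsigned_range(bits), twos_complement_range(bits))
```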

Scientific Notation
241,506,800 = 0.2415068 x 10^9
Mantissa: 0.2415068
Exponent: 9

Shifting Exponents
241,506,800 can be
2.415068 x 10^8
24.15068 x 10^7
241.5068 x 10^6
2415.068 x 10^5
24150.68 x 10^4
241506.8 x 10^3
etc.

Binary Scientific Notation
A binary number, such as 10110011, can be expressed as:
1.0110011 x 2^7
Note the exponent is a power of two, not ten.
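
A small sketch of how the binary exponent for 10110011 could be found programmatically. It relies on Python's `int.bit_length()`; the variable names are illustrative only.

```python
n = int("10110011", 2)            # the example binary number, 179 in decimal
exponent = n.bit_length() - 1     # position of the leading one bit
mantissa = n / 2 ** exponent      # value with the radix point after the leading one

print(exponent)   # 7
print(mantissa)   # 1.3984375, which is 1.0110011 in binary
```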


Shifting Binary Exponents
A binary number can be expressed in scientific notation in several ways, like decimal numbers.

Binary form       Decimal equivalent
0.110010 x 2^5    0.78125 x 32 = 25
1.10010 x 2^4     1.5625 x 16 = 25
11.0010 x 2^3     3.125 x 8 = 25
110.010 x 2^2     6.25 x 4 = 25
1100.10 x 2^1     12.5 x 2 = 25
11001.0 x 2^0     25 x 1 = 25
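
The equivalence of these forms can be checked with `math.ldexp(m, e)`, which computes m x 2^e. The list below simply restates the decimal mantissas and exponents from the slide.

```python
import math

# (mantissa in decimal, binary exponent) pairs from the slide
forms = [(0.78125, 5), (1.5625, 4), (3.125, 3), (6.25, 2), (12.5, 1), (25.0, 0)]

values = [math.ldexp(m, e) for m, e in forms]
print(values)  # every entry is 25.0
```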

Standard Format
Most computers (including Intel Pentiums) follow the IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard 754-1985.
Before the standard, different computers used different formats for floating-point numbers.
The standard defines the format, accuracy, and action taken when errors are detected.

110.010 is equivalent to
1. 11001.0 x 2^-2
2. 0.110010 x 2^3
3. 6.25
4. All of the above
5. None of the above

Floating-point Sizes
ANSI/IEEE Standard 754-1985
Single precision (32 bits)
Double precision (64 bits)
Extended precision (80 bits)


Single Precision Floating-point Numbers
float variables in C++ or Java
A little more than 7 decimal digits of accuracy

Single Precision Float Range
From -3.4 x 10^38 to 3.4 x 10^38
Positive numbers can be as small as 1.18 x 10^-38 before going to zero.
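
The single-precision limits can be reproduced from the format itself. This sketch assumes the IEEE 754 single layout (23 stored mantissa bits, largest normal exponent 127, smallest normal exponent -126); those three numbers come from the standard, not from this slide.

```python
# Largest finite value: mantissa 1.111...1 (23 ones after the point), exponent 127
largest = (2 - 2 ** -23) * 2.0 ** 127

# Smallest positive normalized value: mantissa 1.0, exponent -126
smallest_normal = 2.0 ** -126

print(f"{largest:.3e}")          # 3.403e+38
print(f"{smallest_normal:.3e}")  # 1.175e-38
```

The second value rounds to the 1.18 x 10^-38 quoted on the slide.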

Signed Magnitude
For positive numbers, the sign bit is zero.
For negative numbers, the sign bit is one and everything else is the same.
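
To illustrate signed magnitude, this sketch packs a value and its negation as single-precision numbers with Python's `struct` module and compares the bit patterns. The helper `float_bits` is a hypothetical name, and 4.5 is just a sample value.

```python
import struct


def float_bits(x: float) -> int:
    """Return the 32-bit single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


pos = float_bits(4.5)
neg = float_bits(-4.5)
print(f"{pos:032b}")
print(f"{neg:032b}")
print(f"{pos ^ neg:032b}")  # XOR shows that only the leftmost (sign) bit differs
```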

Double Precision Floating Point
double variables in C++ or Java
Approximately 16 decimal digits of accuracy
From -1.798 x 10^308 to 1.798 x 10^308
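
Python's own `float` is a double on essentially every platform, so `sys.float_info` offers a convenient way to inspect these limits. This is a sketch under that assumption; the printed values depend on the platform's floating-point format.

```python
import math
import sys

info = sys.float_info
print(info.max)       # largest finite double
print(info.min)       # smallest positive normalized double
print(info.mant_dig)  # mantissa bits, counting the hidden leading one
print(info.mant_dig * math.log10(2))  # roughly how many decimal digits that is
```

On an IEEE 754 platform the last line comes out just under 16, which matches the "approximately 16 decimal digits" on the slide.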

Extended Precision Floating Point
Almost 20 decimal digits of accuracy
3.37 x 10^-4932 to 1.18 x 10^4932
Not directly supported in C++ or Java
Often used internally for calculations which are then rounded to desired precision


Exponent Bias
The exponent represents the power of 2.
The single precision exponent is biased by adding 127 to the actual exponent.
This avoids an extra sign bit for the exponent.
The exponent range for normalized numbers is -126 to +127; the field values 0 and 255 are reserved for special values.

Exponent value   Decimal exponent   Binary exponent
2^0              127                01111111
2^5              132                10000100
2^-5             122                01111010
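
The biasing rule is a single addition, as this sketch shows. `BIAS` and `exponent_field` are illustrative names; the three exponents are the ones in the table.

```python
BIAS = 127  # single-precision exponent bias


def exponent_field(actual_exponent: int) -> str:
    """Return the 8-bit biased exponent field as a bit string."""
    return format(actual_exponent + BIAS, "08b")


for e in (0, 5, -5):
    print(e, e + BIAS, exponent_field(e))
```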

Normalization
Floating-point numbers are adjusted so the mantissa or fractional part has a single 1 bit before the radix point.

Decimal   Binary            Normalized
5.75      101.11 x 2^0      1.0111 x 2^2
0.125     0.001 x 2^0       1.0 x 2^-3
32.0      100000.0 x 2^0    1.0 x 2^5
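
Normalization can be sketched with `math.frexp`, which splits a number into a mantissa in [0.5, 1) and a power of two. Shifting by one place gives the 1.xxx form used on the slide. The helper name `normalize` is an assumption, and the sketch only handles positive inputs.

```python
import math


def normalize(x: float) -> tuple:
    """Return (m, e) with x == m * 2**e and 1 <= m < 2, for x > 0."""
    m, e = math.frexp(x)   # frexp gives 0.5 <= m < 1
    return m * 2, e - 1    # shift one place so that 1 <= m < 2


for x in (5.75, 0.125, 32.0):
    print(x, normalize(x))
```

For 5.75 the mantissa comes back as 1.4375, which is 1.0111 in binary.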

Saving a Bit
The fractional part or mantissa is always adjusted so the leftmost bit is a one.
Since this bit is always a one, it is not actually stored in the floating-point number.
The mantissa is stored without the leading one bit, although the one bit is assumed in calculating the value of the number.

Creating a Floating Point Number
1. Write the number in binary with a fractional part as necessary.
2. Adjust the exponent so the radix point is to the right of the first one bit.
3. The mantissa is the binary number without the leading one bit.
4. The exponent field is created by adding 127 to the binary exponent.
5. The sign is the same as the number's sign.
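
The five steps can be sketched as a small function. This is a simplified illustration, not a full IEEE 754 encoder: `encode_single` is a hypothetical helper, it assumes a nonzero finite input in the normalized range, and it truncates extra mantissa bits instead of rounding them.

```python
import math


def encode_single(x: float) -> str:
    """Build the pattern 'S EEEEEEEE MMM...M' for a nonzero normalized value."""
    sign = "1" if x < 0 else "0"          # step 5: sign bit follows the number's sign
    m, e = math.frexp(abs(x))             # abs(x) == m * 2**e with 0.5 <= m < 1
    m, e = m * 2, e - 1                   # steps 1-2: radix point right of the first one bit
    fraction = int((m - 1) * 2 ** 23)     # step 3: drop the leading one, keep 23 bits
    exponent = e + 127                    # step 4: add the bias
    return f"{sign} {exponent:08b} {fraction:023b}"


print(encode_single(5.75))
print(encode_single(-0.125))
```

For 5.75, which normalizes to 1.0111 x 2^2, the exponent field is 129 and the stored mantissa begins 0111.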


Decimal to Floating Point Example
Convert 4.5 to single-precision floating point
Decimal 4.5 is 100.1 in binary
Adjust radix to get 1.001 x 2^2
The exponent field is 127 + 2 = 129 = 10000001
The floating-point number in binary is

S   Exponent   Mantissa
0   10000001   00100000000000000000000

Convert 15.375 to Floating Point
Decimal 15.375 is 1111.011 in binary
Adjust the exponent to 1.111011 x 2^3
Exponent field is 3 + 127 = 130 (decimal) = 10000010 (binary)

S   Exponent   Mantissa
0   10000010   11101100000000000000000
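
Both worked examples can be cross-checked against the bit pattern the machine actually produces. The sketch below uses Python's `struct` module to pack each value as a big-endian single-precision float; `fields` is an illustrative helper name.

```python
import struct


def fields(x: float) -> str:
    """Split the single-precision pattern of x into sign, exponent, mantissa."""
    bits = format(struct.unpack(">I", struct.pack(">f", x))[0], "032b")
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"


print(fields(4.5))     # 0 10000001 00100000000000000000000
print(fields(15.375))  # 0 10000010 11101100000000000000000
```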

Floating Point to Decimal Example

S   Exponent   Mantissa
1   10000011   01001000000000000000000

What is the decimal value of this number?
Exponent field 10000011 = 131, so the exponent is 131 - 127 = 4
Mantissa is 1.01001000000000000000000
1.01001 x 2^4 = 10100.1
10100.1 is 20.5 in decimal
The sign bit is 1, so the value is -20.5
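
The decoding steps can be sketched in the reverse direction as well. `decode_single` is a hypothetical helper that handles only normalized numbers (no zero, infinity, or NaN). Note that the sign bit in this example is 1, so the decoded value is negative.

```python
def decode_single(sign: str, exponent: str, mantissa: str) -> float:
    """Decode a normalized single-precision value from its three bit fields."""
    e = int(exponent, 2) - 127                       # remove the bias
    m = 1 + int(mantissa, 2) / 2 ** len(mantissa)    # restore the hidden leading one
    value = m * 2 ** e
    return -value if sign == "1" else value


print(decode_single("1", "10000011", "01001000000000000000000"))  # -20.5
```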


What is the decimal value of

S   Exponent   Mantissa
0   10000001   10100000000000000000000

1. 4.5
2. 3.25
3. 6.5
4. 13.0

Special Value Representation

Value   Sign     Exponent   Mantissa
Zero    0        0          0
+INF    0        11111111   0
-INF    1        11111111   0
NaN     0 or 1   11111111   Not zero

Special Floating Point Values
Zero is represented as all zero bits.
Not a Number (NaN) is a special value that indicates a floating-point error, such as taking the square root of a negative number.
Infinity (INF), both positive and negative.
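
A sketch of how the exponent and mantissa fields identify these special values. The function name `classify` and its return strings are illustrative assumptions; an all-ones exponent field marks infinity or NaN, as in IEEE 754.

```python
def classify(exponent: str, mantissa: str) -> str:
    """Classify a single-precision value from its exponent and mantissa fields."""
    mantissa_is_zero = int(mantissa, 2) == 0
    if exponent == "1" * 8:                      # exponent field is all ones
        return "infinity" if mantissa_is_zero else "NaN"
    if int(exponent, 2) == 0 and mantissa_is_zero:
        return "zero"
    return "finite nonzero"


print(classify("11111111", "0" * 23))
print(classify("11111111", "1" + "0" * 22))
print(classify("00000000", "0" * 23))
```

The sign bit is not needed for the classification; it only selects between +INF and -INF.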

Overflow and Underflow
When you calculate a number that is too big to fit into the floating-point format, the result is infinity.
Calculating a number that is too small (a positive number smaller than 1.18 x 10^-38 for single precision) produces zero.
Dividing a nonzero number by zero produces infinity with the proper sign.
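
Overflow and underflow are easy to observe in Python, with two caveats: Python floats are double precision, so the limits are near 10^308 and 10^-308 instead of 10^38 and 10^-38, and Python raises `ZeroDivisionError` on division by zero instead of returning infinity, so that case is not shown. The variable names are illustrative.

```python
import math

overflowed = 1e200 * 1e200      # true result 1e400 is too big for a double
underflowed = 1e-200 * 1e-200   # true result 1e-400 is too small for a double

print(overflowed)   # inf
print(underflowed)  # 0.0
```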


Calculating with Infinity
(+INF) + (+7) = (+INF)
(+INF) x (-2) = (-INF)
(+INF) x 0 = NaN, a meaningless result
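
These rules can be tried directly with `math.inf`. This is a minimal sketch using Python's double-precision floats; the same results hold for single precision.

```python
import math

inf = math.inf

print(inf + 7)    # inf
print(inf * -2)   # -inf
print(inf * 0)    # nan
```

NaN never compares equal to anything, including itself, so `math.isnan` is the reliable way to test for it.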
