The Gaussian Probability Distribution Function
Introduction
The Gaussian probability distribution is perhaps the most used distribution in all of science.
Sometimes it is called the "bell-shaped curve" or normal distribution.
Unlike the binomial and Poisson distributions, the Gaussian is a continuous distribution:
$$p(y) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}$$
μ = mean of distribution (also at the same place as mode and median)
σ² = variance of distribution
y is a continuous variable (−∞ ≤ y ≤ ∞)
The probability (P) of y being in the range [a, b] is given by an integral:
$$P(a \le y \le b) = \int_a^b p(y)\,dy = \frac{1}{\sigma\sqrt{2\pi}} \int_a^b e^{-\frac{(y-\mu)^2}{2\sigma^2}}\,dy$$
[Portrait: Karl Friedrich Gauss, 1777-1855]
The integral for arbitrary a and b cannot be evaluated analytically.
The value of the integral has to be looked up in a table (e.g. Appendixes A and B of Taylor).
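As an alternative to a printed table, the integral can be evaluated numerically through the error function. The sketch below is only an illustration: the function names `gaussian_cdf` and `gaussian_interval_prob` are hypothetical and not part of the lecture, and it relies on the standard identity that the Gaussian cumulative distribution equals ½[1 + erf((y − μ)/(σ√2))].

```python
import math


def gaussian_cdf(y, mu=0.0, sigma=1.0):
    """P(Y <= y) for a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def gaussian_interval_prob(a, b, mu=0.0, sigma=1.0):
    """P(a <= y <= b), computed as a difference of two cumulative values."""
    return gaussian_cdf(b, mu, sigma) - gaussian_cdf(a, mu, sigma)


# Area within one standard deviation of the mean (mu = 0, sigma = 1).
p_one_sigma = gaussian_interval_prob(-1.0, 1.0)
print(f"P(-1 <= y <= 1) = {p_one_sigma:.4f}")
```

The same two functions cover any a, b, μ and σ, which is what a table lookup plus rescaling would otherwise provide.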
[Figure: plot of the Gaussian pdf p(x) versus x, with $p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$]
R. Kass/S07, P416 Lec 3
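The integrals with limits [−∞, ∞] can be evaluated, see Barlow p. 37.
The total area under the curve is normalized to one by the σ√(2π) factor:
$$\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(y-\mu)^2}{2\sigma^2}}\,dy = 1$$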
We often talk about a measurement being a certain number of standard deviations (σ) away from the mean (μ) of the Gaussian.
We can associate a probability for a measurement to be more than nσ away from the mean just by calculating the area outside of the region μ ± nσ.
n       Prob. of exceeding ±nσ
0.67    0.5
1       0.32
2       0.05
3       0.003
4       0.00006

It is very unlikely (< 0.3%) that a measurement taken at random from a Gaussian pdf will be more than ±3σ from the true mean of the distribution.
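The probabilities of exceeding ±nσ can be checked with the complementary error function, since the two-sided tail area of any Gaussian is erfc(n/√2). This is a minimal sketch; the helper name `prob_exceeding` is hypothetical.

```python
import math


def prob_exceeding(n):
    """Two-sided tail area P(|y - mu| > n*sigma) for a Gaussian."""
    return math.erfc(n / math.sqrt(2.0))


# Same n values as the table of probabilities of exceeding n sigma.
tail_table = {n: prob_exceeding(n) for n in (0.67, 1, 2, 3, 4)}
for n, p in tail_table.items():
    print(f"{n:>5}  {p:.5f}")
```

Using `erfc` directly, rather than 1 − erf, avoids loss of precision in the far tails where the probability is tiny.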
[Figure: two plots of the Gaussian pdf P(y) with μ = 0 and σ = 1. In one, the shaded area within ±2σ holds 95% of the total; in the other, the shaded area outside ±2σ holds only 5%.]
Relationship between Gaussian and Binomial & Poisson distributions
The Gaussian distribution can be derived from the binomial (or Poisson) assuming:
p is finite & N is very large
we have a continuous variable rather than a discrete variable
Consider tossing a coin 10,000 times.
p(head) = 0.5 and N = 10,000
For a binomial distribution:
mean number of heads = μ = Np = 5000
standard deviation = σ = [Np(1 − p)]^(1/2) = 50
The probability to be within ±1σ for this binomial distribution is:
$$P = \sum_{m=5000-50}^{5000+50} \frac{10^4!}{(10^4-m)!\,m!}\; 0.5^{m}\; 0.5^{10^4-m} = 0.69$$
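For a Gaussian distribution:
$$P(\mu-\sigma \le y \le \mu+\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{\mu-\sigma}^{\mu+\sigma} e^{-\frac{(y-\mu)^2}{2\sigma^2}}\,dy \approx 0.68$$
See Taylor 10.4.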
Both distributions give about the same probability!
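The coin-toss comparison can be reproduced with exact integer arithmetic. The sketch below assumes Python 3.8 or later for `math.comb`, and uses the fact that for p = 0.5 every term of the binomial sum is C(N, m)/2^N; the variable names are illustrative only.

```python
import math

N = 10_000
p = 0.5
mu = N * p                               # mean number of heads, 5000
sigma = math.sqrt(N * p * (1 - p))       # standard deviation, 50
lo, hi = int(mu - sigma), int(mu + sigma)

# Exact binomial sum over m = 4950 ... 5050. Python integers are
# arbitrary precision, so the huge coefficients do not overflow.
binom_prob = sum(math.comb(N, m) for m in range(lo, hi + 1)) / 2**N

# Gaussian area within one standard deviation of the mean.
gauss_prob = math.erf(1.0 / math.sqrt(2.0))

print(f"binomial: {binom_prob:.3f}   Gaussian: {gauss_prob:.3f}")
```

The binomial value comes out slightly larger than the Gaussian one because the discrete sum includes both end points, which effectively widens the interval by half a unit on each side.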
Compare the ±1σ area of Poisson and Gaussian:

Mean    Poisson   Gaussian   % diff
10      0.74      0.6827     7.8
25      0.73      0.6827     6.9
100     0.707     0.6827     3.5
250     0.689     0.6827     0.87
5000    0.6847    0.6827     0.29

Poisson:
$$P(\mu \pm 1\sigma) = \sum_{m=\mu-\sigma}^{\mu+\sigma} \frac{e^{-\mu}\mu^{m}}{m!}$$
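The Poisson ±1σ areas can be evaluated by summing the Poisson probabilities directly. In this sketch each term is computed in log space with `math.lgamma`, so that large means such as 5000 do not underflow. The function name is hypothetical, and the treatment of the summation limits when σ = √μ is not an integer (rounded inward here) is an assumption, so results for those means may differ from tabulated values in the last digits.

```python
import math


def poisson_one_sigma_area(mu):
    """Sum of Poisson(m; mu) for integer m with mu - sigma <= m <= mu + sigma."""
    sigma = math.sqrt(mu)
    m_lo = math.ceil(mu - sigma)    # assumption: round limits inward
    m_hi = math.floor(mu + sigma)
    total = 0.0
    for m in range(m_lo, m_hi + 1):
        # log of e^{-mu} mu^m / m!, using lgamma(m + 1) = log(m!)
        log_term = -mu + m * math.log(mu) - math.lgamma(m + 1)
        total += math.exp(log_term)
    return total


gauss_area = math.erf(1.0 / math.sqrt(2.0))
for mean in (10, 25, 100, 250, 5000):
    area = poisson_one_sigma_area(mean)
    print(f"mean={mean:>5}  Poisson={area:.4f}  Gaussian={gauss_area:.4f}")
```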
Why is the Gaussian pdf so applicable? The Central Limit Theorem
A crude statement of the Central Limit Theorem:
Things that are the result of the addition of lots of small effects tend to become Gaussian.
A more exact statement:
Let Y1, Y2, ... Yn be an infinite sequence of independent random variables, each with the same probability distribution.
Suppose that the mean (μ) and variance (σ²) of this distribution are both finite.
For any numbers a and b:
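$$\lim_{n\to\infty} P\left(a < \frac{Y_1 + Y_2 + \dots + Y_n - n\mu}{\sigma\sqrt{n}} < b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}y^2}\,dy$$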
The C.L.T. tells us that under a wide range of circumstances the probability distribution that describes the sum of random variables tends towards a Gaussian distribution as the number of terms in the sum → ∞.
How close to ∞ does n have to be?
Alternatively we can write the CLT in a different form:
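$$\lim_{n\to\infty} P\left(a < \frac{\bar{Y}-\mu}{\sigma/\sqrt{n}} < b\right) = \lim_{n\to\infty} P\left(a < \frac{\bar{Y}-\mu}{\sigma_m} < b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}y^2}\,dy$$
Here Ȳ = (Y1 + Y2 + ... + Yn)/n is the average of the n variables.
σ_m is sometimes called "the error in the mean" (more on that later):
$$\sigma_m = \frac{\sigma}{\sqrt{n}}$$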
For the CLT to be valid:
The μ and σ of the pdf must be finite.
No one term in the sum should dominate the sum.
A random variable is not the same as a random number.
"A random variable is any rule that associates a number with each outcome in S" (Devore, in Probability and Statistics for Engineering and the Sciences).
Here S is the set of possible outcomes.
Recall that if y is described by a Gaussian pdf with mean (μ) of zero and σ = 1, then the probability that a < y < b is given by:
$$P(a < y < b) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}y^2}\,dy$$
The CLT is true even if the Y's are from different pdfs, as long as the means and variances are defined for each pdf!
See the Appendix of Barlow for a proof of the Central Limit Theorem.
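The form of the CLT written in terms of the average can be illustrated by simulation: averages of n independent draws should scatter around μ with a spread of σ/√n (the error in the mean), and roughly 68% of them should fall within one such spread of μ. The sketch below is an illustration under assumed choices that are not in the lecture: a uniform [0, 1) parent distribution, n = 25, 20,000 trials, and a fixed seed.

```python
import math
import random
import statistics

rng = random.Random(416)          # fixed seed so the run is reproducible
n = 25                            # number of draws averaged per trial (assumed)
trials = 20_000                   # number of independent averages (assumed)

mu = 0.5                          # mean of a uniform [0, 1) variable
sigma = math.sqrt(1.0 / 12.0)     # its standard deviation

# Each entry is the average of n uniform random numbers.
means = [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]

observed_spread = statistics.pstdev(means)
predicted_spread = sigma / math.sqrt(n)   # error in the mean, sigma / sqrt(n)

# Fraction of averages within one predicted spread of the true mean.
frac_within = sum(abs(m - mu) <= predicted_spread for m in means) / trials

print(f"observed spread  = {observed_spread:.4f}")
print(f"predicted spread = {predicted_spread:.4f}")
print(f"fraction within one spread = {frac_within:.3f}")
```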
Example: Generate a Gaussian distribution using uniform random numbers.
The random number generator gives numbers distributed uniformly in the interval [0, 1]:
μ = 1/2 and σ² = 1/12
Procedure:
a) Take 12 numbers (r1, r2, ..., r12) from your computer's random number generator (ran(iseed))
b) Add them together
c) Subtract 6
Get a number that looks as if it is from a Gaussian pdf!
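$$P\left(a < \frac{Y_1 + Y_2 + \dots + Y_n - n\mu}{\sigma\sqrt{n}} < b\right) = P\left(a < \frac{\sum_{i=1}^{12} r_i - 12\cdot\frac{1}{2}}{\sqrt{\frac{1}{12}}\,\sqrt{12}} < b\right)$$
With n = 12 the denominator σ√n equals 1, so the sum minus 6 already has zero mean and unit standard deviation:
$$P\left(-6 < \sum_{i=1}^{12} r_i - 6 < 6\right) \approx \frac{1}{\sqrt{2\pi}} \int_{-6}^{6} e^{-\frac{1}{2}y^2}\,dy$$
12 is close to ∞!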
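The three-step procedure (take 12 uniform numbers, add them, subtract 6) can be sketched in a few lines. Python's `random.random()` stands in here for the lecture's `ran(iseed)`; the function name, seed, and sample size are assumptions made for illustration.

```python
import random
import statistics


def gaussian_from_uniforms(rng):
    """Sum 12 uniform [0, 1) numbers and subtract 6."""
    return sum(rng.random() for _ in range(12)) - 6.0


rng = random.Random(3)            # fixed seed so the run is reproducible
samples = [gaussian_from_uniforms(rng) for _ in range(50_000)]

sample_mean = statistics.fmean(samples)
sample_std = statistics.pstdev(samples)
# Fraction of generated numbers within one unit of zero.
frac_within_1 = sum(abs(s) <= 1.0 for s in samples) / len(samples)

print(f"mean = {sample_mean:.3f}, std = {sample_std:.3f}")
print(f"fraction within +-1 = {frac_within_1:.3f}")
```

The generated numbers can never fall outside [−6, 6], so this shortcut reproduces the bulk of a unit Gaussian well but not its extreme tails.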
Example:
A watch makes an error of at most ±1/2 minute per day.
After one year, what's the probability that the watch is accurate to within ±25 minutes?
Assume that the daily errors are uniform in [−1/2, 1/2].
For each day, the average error is zero and the standard deviation is 1/√12 minutes.
The error over the course of a year is just the addition of the daily errors.
Since the daily errors come from a uniform distribution with a well-defined mean and variance, the Central Limit Theorem is applicable:
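$$\lim_{n\to\infty} P\left(a < \frac{Y_1 + Y_2 + \dots + Y_n - n\mu}{\sigma\sqrt{n}} < b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}y^2}\,dy$$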
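The upper limit corresponds to +25 minutes:
$$b = \frac{Y_1 + Y_2 + \dots + Y_n - n\mu}{\sigma\sqrt{n}} = \frac{25 - 365\times 0}{\sqrt{\frac{1}{12}}\,\sqrt{365}} = 4.5$$
The lower limit corresponds to −25 minutes:
$$a = \frac{Y_1 + Y_2 + \dots + Y_n - n\mu}{\sigma\sqrt{n}} = \frac{-25 - 365\times 0}{\sqrt{\frac{1}{12}}\,\sqrt{365}} = -4.5$$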
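The probability to be within ±25 minutes is:
$$P = \frac{1}{\sqrt{2\pi}} \int_{-4.5}^{4.5} e^{-\frac{1}{2}y^2}\,dy = 0.999993$$
This integral is 1 to about 7 parts in 10^6!
The probability to be off by more than 25 minutes is just 1 − P ≈ 7 × 10^−6.
There is less than a 7 in a million chance that the watch will be off by more than 25 minutes in a year!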
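The watch estimate can be checked numerically. The sketch below computes the standardized limit from the stated daily error model and then the two-sided Gaussian tail area with `math.erfc`; the variable names are illustrative. It evaluates the tail both for the unrounded limit and for the limit rounded to 4.5, since the rounding changes the last digit of such a small probability.

```python
import math

n_days = 365
mu_daily = 0.0                        # mean daily error (minutes)
sigma_daily = math.sqrt(1.0 / 12.0)   # std of a uniform [-1/2, 1/2] error
target = 25.0                         # allowed total error (minutes)

# Standardized upper limit; the lower limit is its negative by symmetry.
b = (target - n_days * mu_daily) / (sigma_daily * math.sqrt(n_days))
a = -b

# Two-sided tail: probability of being off by more than 25 minutes.
p_outside = math.erfc(b / math.sqrt(2.0))
p_within = 1.0 - p_outside

# Same tail with the limit rounded to 4.5.
p_outside_rounded = math.erfc(4.5 / math.sqrt(2.0))

print(f"b = {b:.3f}")
print(f"P(off by more than 25 min) = {p_outside:.2e}")
print(f"with b rounded to 4.5      = {p_outside_rounded:.2e}")
```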
Example: The daily income of a "card shark" has a uniform distribution in the interval [−$40, $50].
What is the probability that s/he wins more than $500 in 60 days?
Let's use the CLT to estimate this probability:
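$$\lim_{n\to\infty} P\left(a < \frac{Y_1 + Y_2 + \dots + Y_n - n\mu}{\sigma\sqrt{n}} < b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}y^2}\,dy$$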
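The probability distribution of daily income is uniform, p(y) = 1.
p(y) needs to be normalized in computing the average daily winning (μ) and its standard deviation (σ).
$$\mu = \frac{\int_{-40}^{50} y\,p(y)\,dy}{\int_{-40}^{50} p(y)\,dy} = \frac{\frac{1}{2}\left[50^2 - (-40)^2\right]}{50 - (-40)} = 5$$
$$\sigma^2 = \frac{\int_{-40}^{50} y^2\,p(y)\,dy}{\int_{-40}^{50} p(y)\,dy} - \mu^2 = \frac{\frac{1}{3}\left[50^3 - (-40)^3\right]}{50 - (-40)} - 25 = 675$$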
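The lower limit of the winnings is $500:
$$a = \frac{Y_1 + Y_2 + \dots + Y_n - n\mu}{\sigma\sqrt{n}} = \frac{500 - 60\times 5}{\sqrt{675}\,\sqrt{60}} = \frac{200}{201} \approx 1$$
The upper limit is the maximum that the shark could win ($50/day for 60 days):
$$b = \frac{Y_1 + Y_2 + \dots + Y_n - n\mu}{\sigma\sqrt{n}} = \frac{3000 - 60\times 5}{\sqrt{675}\,\sqrt{60}} = \frac{2700}{201} = 13.4$$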
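$$P = \frac{1}{\sqrt{2\pi}} \int_{1}^{13.4} e^{-\frac{1}{2}y^2}\,dy \approx \frac{1}{\sqrt{2\pi}} \int_{1}^{\infty} e^{-\frac{1}{2}y^2}\,dy = 0.16$$
There is a 16% chance to win more than $500 in 60 days.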
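The card shark numbers can be reproduced step by step: the mean and variance of a uniform distribution on [−40, 50], the standardized limits a and b, and the Gaussian area between them. This is a sketch with illustrative variable names; `phi` is a hypothetical helper for the unit Gaussian cumulative distribution.

```python
import math

lo, hi = -40.0, 50.0      # daily income range in dollars
n_days = 60
threshold = 500.0         # winnings to exceed

# Mean and variance of a uniform distribution on [lo, hi].
mu = (hi**2 - lo**2) / (2.0 * (hi - lo))
var = (hi**3 - lo**3) / (3.0 * (hi - lo)) - mu**2
sigma = math.sqrt(var)


def phi(z):
    """Cumulative distribution of the unit Gaussian."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


scale = sigma * math.sqrt(n_days)
a = (threshold - n_days * mu) / scale        # lower limit, about 1
b = (n_days * hi - n_days * mu) / scale      # upper limit, about 13.4
p_win = phi(b) - phi(a)

print(f"mu = {mu:.1f}, sigma^2 = {var:.1f}")
print(f"a = {a:.2f}, b = {b:.1f}, P = {p_win:.2f}")
```

Because b is so far out in the tail, replacing it with infinity changes the result by a negligible amount, which is why the upper limit can be dropped in the estimate.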