Distribution
Introduction
The binomial distribution arises in many real-life situations. When a coin is tossed, the number of heads in a given number of tosses follows a binomial distribution. A distribution is said to be binomial if it satisfies the five conditions given below (adapted from Wackerly, Mendenhall and Scheaffer, 2008).
1. There is a fixed number, n, of identical trials.
2. For each trial, there are only two possible outcomes (success/failure).
3. The probability of success, p, remains the same for each trial.
4. The trials are independent of each other.
5. The random variable Y = the number of successes observed for the n
trials.
If all the above conditions are satisfied, then the binomial p.m.f. is given
by
f(y; n, p) = \binom{n}{y} p^y (1-p)^{n-y}
The above distribution can arise both for an infinite population and for a finite population sampled with replacement. If a finite population is sampled without replacement, then two of the five conditions given above (3 and 4) are not satisfied, and one needs to take care of that. One can argue that when the finite population size N is very large in comparison to n, sampling will have very little effect on p, and the resulting distribution can be treated as binomial. But even with a very large N the independence criterion cannot strictly be met. Hence, in the case of a finite population sampled without replacement, the resulting distribution is known as the hypergeometric distribution, and its p.m.f. is given by
f(y; N, n, r) = \frac{\binom{r}{y} \binom{N-r}{n-y}}{\binom{N}{n}}, \quad y \in [\max(0, n-(N-r)), \min(r, n)]
where N is the finite population size and r is the total number of successes in N. The hypergeometric distribution will be discussed in the next chapter. In this chapter we limit ourselves to an infinite population or a finite population sampled with replacement.
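The effect of a large N can be checked numerically. The sketch below compares the hypergeometric p.m.f. against the binomial p.m.f. with p = r/N; the population values N = 10000, r = 2000 and the sample size n = 10 are arbitrary illustrative choices:

```python
from math import comb

def binom_pmf(y, n, p):
    # Binomial p.m.f.: C(n, y) p^y (1-p)^(n-y)
    return comb(n, y) * p**y * (1 - p)**(n - y)

def hypergeom_pmf(y, N, n, r):
    # Hypergeometric p.m.f.: C(r, y) C(N-r, n-y) / C(N, n)
    return comb(r, y) * comb(N - r, n - y) / comb(N, n)

N, n, r = 10_000, 10, 2_000   # illustrative values; p = r/N = 0.2
p = r / N
for y in range(n + 1):
    # With N much larger than n the two columns agree to a few decimals
    print(y, round(binom_pmf(y, n, p), 5), round(hypergeom_pmf(y, N, n, r), 5))
```

With these values the two columns differ only in the fourth decimal place, which is the point of the argument above.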
Binomial Distribution
The probability mass function of the binomial distribution is given by

f(y; n, p) = \binom{n}{y} p^y (1-p)^{n-y}
Graph 1 below shows the binomial distribution's probability mass function for a few values of n and p.
Graph: 1
Graph: 2
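The shapes shown in Graphs 1 and 2 can be reproduced numerically. The sketch below tabulates the p.m.f. for a few (n, p) pairs; the specific pairs are illustrative assumptions, since the parameter values used in the graphs are not stated:

```python
from math import comb

def binom_pmf(y, n, p):
    # Binomial p.m.f.: C(n, y) p^y (1-p)^(n-y)
    return comb(n, y) * p**y * (1 - p)**(n - y)

for n, p in [(10, 0.5), (10, 0.2), (20, 0.5)]:   # illustrative choices
    pmf = [binom_pmf(y, n, p) for y in range(n + 1)]
    peak = pmf.index(max(pmf))
    # p = 0.5 gives a symmetric p.m.f.; p = 0.2 gives a right-skewed one
    print(f"n={n}, p={p}: peak at y={peak}, P={pmf[peak]:.4f}")
```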
Substituting a = p and b = 1 - p into the binomial theorem

(a+b)^m = \sum_{y=0}^{m} \frac{m!}{y!(m-y)!} a^y b^{m-y}

gives

\sum_{x=0}^{m} \frac{m!}{x!(m-x)!} p^x (1-p)^{m-x} = (p + (1-p))^m = 1

And from above we see that

E(X) = \sum_{x=0}^{n} x \binom{n}{x} p^x (1-p)^{n-x} = np \sum_{x=0}^{m} \frac{m!}{x!(m-x)!} p^x (1-p)^{m-x} = np

where m = n - 1 and the index of summation has been shifted by one (x - 1 relabelled as x).
E(Y(Y-1)) = \sum_{y=0}^{n} y(y-1) \binom{n}{y} p^y (1-p)^{n-y}

E(Y(Y-1)) = \sum_{y=0}^{n} y(y-1) \frac{n!}{y!(n-y)!} p^y (1-p)^{n-y}

Since the y = 0 and y = 1 terms vanish, this implies

E(Y(Y-1)) = \sum_{y=2}^{n} \frac{n!}{(y-2)!(n-y)!} p^y (1-p)^{n-y}

E(Y(Y-1)) = n(n-1) p^2 \sum_{y=2}^{n} \frac{(n-2)!}{(y-2)!(n-y)!} p^{y-2} (1-p)^{n-y}

Let x = y - 2 and m = n - 2. Substituting y = x + 2 and n = m + 2 into the last sum (and using the fact that the limits x = 0 and x = m correspond to y = 2 and y = n, respectively),

E(Y(Y-1)) = n(n-1) p^2 \sum_{x=0}^{m} \frac{m!}{x!(m-x)!} p^x (1-p)^{m-x}

Using the binomial theorem as above, the sum equals 1, so

E(Y(Y-1)) = n(n-1) p^2

Hence the variance is

\sigma^2 = E(Y(Y-1)) + E(Y) - (E(Y))^2 = n(n-1) p^2 + np - n^2 p^2 = np(1-p)

Graph: 3
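The factorial-moment identity used in this derivation is easy to check numerically; a minimal sketch with arbitrary values n = 12, p = 0.3:

```python
from math import comb

def binom_pmf(y, n, p):
    # Binomial p.m.f.: C(n, y) p^y (1-p)^(n-y)
    return comb(n, y) * p**y * (1 - p)**(n - y)

n, p = 12, 0.3   # arbitrary illustrative values
# E[Y(Y-1)] computed directly from the p.m.f. matches n(n-1)p^2
fact_moment = sum(y * (y - 1) * binom_pmf(y, n, p) for y in range(n + 1))
print(round(fact_moment, 6), round(n * (n - 1) * p**2, 6))   # 11.88 11.88
# Var(Y) = E[Y(Y-1)] + E[Y] - (E[Y])^2 equals np(1-p)
var = fact_moment + n * p - (n * p)**2
print(round(var, 6), round(n * p * (1 - p), 6))              # 2.52 2.52
```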
Setting the derivative of the log-likelihood with respect to p equal to zero,

\frac{y}{p} - \frac{n-y}{1-p} = 0

\hat{p} = \frac{y}{n}, \quad p \in (0,1)
The same estimate would be obtained from n Bernoulli trials. This is not surprising, since the binomial distribution is the result of n independent Bernoulli trials.
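A quick simulation illustrates both points: summing n independent Bernoulli trials produces a binomial observation, and y/n recovers p. The seed and the values n = 10000, p = 0.3 are arbitrary choices for the sketch:

```python
import random

random.seed(42)
n, p = 10_000, 0.3
# Each comparison random.random() < p is one Bernoulli trial; their sum
# is a single draw from the Binomial(n, p) distribution.
y = sum(random.random() < p for _ in range(n))
p_hat = y / n          # the maximum-likelihood estimate derived above
print(p_hat)           # close to the true p = 0.3
```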
Poisson Approximation to the Binomial
Normal Approximation of Binomial
The normal approximation to the binomial distribution (based on the De Moivre-Laplace theorem) is

Pr\left(a < \frac{x - np}{(npq)^{1/2}} < b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-u^2/2} \, du = F(b) - F(a)

where u \sim N(0,1) and F represents the standard normal cumulative distribution function.
This is a relatively crude approximation, but it can be useful when n is
large. Numerical comparisons have been published in a number of
textbooks (e.g., Hald, 1952). A marked improvement is obtained by the
use of a continuity correction. The following normal approximation is used
widely on account of its simplicity:
Pr(X \le x) = F\left(\frac{x + 0.5 - np}{(npq)^{1/2}}\right)
Its accuracy for various values of n and p was assessed by Raff (1956)
and by Peizer and Pratt (1968), who used the absolute and the relative
error, respectively. Various rules of thumb for its use have been
recommended in various standard textbooks. Two such rules of thumb are
1. use when np(1 - p) > 9, and
2. use when np > 9 for 0 < p \le 0.5 \le q.
The corresponding approximation for an individual binomial probability is

Pr(X = x) = \frac{1}{\sqrt{2\pi}} \int_{(x - 0.5 - np)/(npq)^{1/2}}^{(x + 0.5 - np)/(npq)^{1/2}} e^{-u^2/2} \, du
Example: A student guesses at random on a test of 20 multiple-choice questions, each with five options, so that the probability of a correct answer is p = 1/5. Find the p.m.f., mean and variance of the number of correct answers X, and the probability that the student answers at least 12 questions correctly.
Ans.
a. The p.m.f. of X is given by

P(X = x) = \binom{20}{x} p^x (1-p)^{20-x} = \binom{20}{x} \left(\frac{1}{5}\right)^x \left(\frac{4}{5}\right)^{20-x}

b. The mean of X is given by

np = 20 \cdot \frac{1}{5} = 4

c. The variance of X is given by

np(1-p) = 20 \cdot \frac{1}{5} \cdot \frac{4}{5} = \frac{16}{5}

d. The probability that the student answers at least 12 questions correctly is

P(X \ge 12) = P(X = 12) + P(X = 13) + \dots + P(X = 20) = 0.000102
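The answers above can be verified with a few lines of Python:

```python
from math import comb

def binom_pmf(x, n, p):
    # Binomial p.m.f.: C(n, x) p^x (1-p)^(n-x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 1 / 5
print(n * p)                    # mean: 4.0
print(n * p * (1 - p))          # variance: 3.2 (= 16/5)
tail = sum(binom_pmf(x, n, p) for x in range(12, n + 1))
print(round(tail, 6))           # 0.000102
```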
Example: Now suppose the test has 50 questions and the probability of answering a question correctly is p = 2/5.
Ans.
a. The p.m.f. of X is given by

P(X = x) = \binom{50}{x} p^x (1-p)^{50-x} = \binom{50}{x} \left(\frac{2}{5}\right)^x \left(\frac{3}{5}\right)^{50-x}

b. The mean of X is given by

np = 50 \cdot \frac{2}{5} = 20

c. The variance of X is given by

np(1-p) = 50 \cdot \frac{2}{5} \cdot \frac{3}{5} = 12

d. The probability that the student answers at most 19 questions correctly is

P(X \le 19) = P(X = 0) + P(X = 1) + \dots + P(X = 19) = 0.446476379

e. Using the normal approximation Z \sim N(20, 12) with the continuity correction,

P(Z \le 19) = F\left(\frac{19.5 - 20}{\sqrt{12}}\right) = 0.442616957

which is the area to the left of 19.5 under the normal curve, as shown in Graph: 5.

Graph: 5
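The exact binomial tail and the continuity-corrected normal approximation can be checked against each other using only the standard library:

```python
from math import comb, erf, sqrt

def binom_pmf(x, n, p):
    # Binomial p.m.f.: C(n, x) p^x (1-p)^(n-x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

def norm_cdf(z):
    # Standard normal c.d.f. expressed through the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 50, 2 / 5
mu, var = n * p, n * p * (1 - p)                    # 20 and 12
exact = sum(binom_pmf(x, n, p) for x in range(20))  # P(X <= 19)
approx = norm_cdf((19.5 - mu) / sqrt(var))          # continuity correction
print(round(exact, 6))    # 0.446476
print(round(approx, 6))   # 0.442617
```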
[ ]
M ( t )= e tx x1 (1p) xr p r
r1
x=r
Multiplying numerator and denominator both by e tr
x=
e tr
M ( t )= e tx x1 (1p) xr p r tr
r1
e
x=r
r tr
Now we can take p e out of the summation as they are independent of
x , this gives us
[ ]
x=
[ ]
[ ]
M ( t )= p e
tr
x=
[ ]
(1 p)xr
et (xr ) x1
r 1
x=r
1 p
e
t r
M ( t )=(pe )
x=
x=r
[ x1
r1 ]
Now substitute
1 p
e
x=
k =xr
so that
x=k + r
M ( t )=(pe t )r k +r 1
r1
x=r
Using the property that the sum of the negative binomial probabilities
x=
r
(1w) = k + r1 w k
r1
x=r
Using the above summation, we can write the moments generating
function as
1 p
e
1
M ( t )=(pe t )r
This can be simplified as
1 p
e
1
( pe t )r
M ( t )=
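The closed form can be sanity-checked against a truncated version of the defining sum; r = 3, p = 0.4 and t = 0.1 are arbitrary values satisfying (1-p)e^t < 1 so that the series converges:

```python
from math import comb, exp

def nb_mgf_series(t, r, p, terms=500):
    # Truncated defining sum: sum_{x=r}^inf e^{tx} C(x-1, r-1) (1-p)^(x-r) p^r
    return sum(exp(t * x) * comb(x - 1, r - 1) * (1 - p) ** (x - r) * p ** r
               for x in range(r, r + terms))

def nb_mgf_closed(t, r, p):
    # Closed form derived above: (p e^t)^r / (1 - (1-p) e^t)^r
    return (p * exp(t)) ** r / (1 - (1 - p) * exp(t)) ** r

r, p, t = 3, 0.4, 0.1
print(round(nb_mgf_series(t, r, p), 10))
print(round(nb_mgf_closed(t, r, p), 10))   # the two values agree
```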
Mean of the Negative Binomial Distribution
Differentiating the moment generating function

M(t) = (p e^t)^r \left[1 - (1-p) e^t\right]^{-r}

with respect to t gives

\frac{dM(t)}{dt} = r (p e^t)^r \left[1 - (1-p) e^t\right]^{-r} + r (1-p) e^t (p e^t)^r \left[1 - (1-p) e^t\right]^{-(r+1)}

Setting t = 0,

\frac{dM(t)}{dt}\bigg|_{t=0} = r p^r \left[1 - (1-p)\right]^{-r} + r (1-p) p^r \left[1 - (1-p)\right]^{-(r+1)}

= r p^r p^{-r} + r (1-p) p^r p^{-(r+1)}

= r + \frac{r(1-p)}{p} = \frac{r(p + 1 - p)}{p}

\frac{dM(t)}{dt}\bigg|_{t=0} = M'(0) = \frac{r}{p}
Variance of the Negative Binomial Distribution
The variance of the negative binomial distribution can be obtained in a similar fashion using the formula \sigma^2 = M''(0) - (M'(0))^2.

Differentiating M'(t) once more and setting t = 0, after simplification,

M''(0) = \frac{r(r + 1 - p)}{p^2}

Together with

M'(0) = \frac{r}{p}

this gives

\sigma^2 = M''(0) - (M'(0))^2 = \frac{r(r + 1 - p)}{p^2} - \frac{r^2}{p^2}

Solving this gives

\sigma^2 = \frac{r(1-p)}{p^2}
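Both the mean r/p and the variance r(1-p)/p^2 can be checked directly from the p.m.f.; a minimal sketch with illustrative values r = 3, p = 0.4:

```python
from math import comb

def nb_pmf(x, r, p):
    # P(X = x) = C(x-1, r-1) p^r (1-p)^(x-r), for x = r, r+1, ...
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

r, p = 3, 0.4                      # illustrative values
xs = range(r, r + 500)             # truncation; the tail beyond is negligible
mean = sum(x * nb_pmf(x, r, p) for x in xs)
second = sum(x * x * nb_pmf(x, r, p) for x in xs)
var = second - mean ** 2
print(round(mean, 6), round(r / p, 6))                  # 7.5 7.5
print(round(var, 6), round(r * (1 - p) / p ** 2, 6))    # 11.25 11.25
```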
Mode of the Negative Binomial Distribution
When \frac{r-1}{p} is not an integer, the distribution has a single mode at

t = 1 + \left\lfloor \frac{r-1}{p} \right\rfloor

When \frac{r-1}{p} is an integer, the distribution has two modes, t - 1 and t, where t = 1 + \frac{r-1}{p}.
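The usual mode rule, t = 1 + floor((r-1)/p), can be checked by comparing neighbouring p.m.f. values; the r and p below are arbitrary test values:

```python
from math import comb, floor

def nb_pmf(x, r, p):
    # P(X = x) = C(x-1, r-1) p^r (1-p)^(x-r)
    return comb(x - 1, r - 1) * p ** r * (1 - p) ** (x - r)

# Case 1: (r-1)/p = 4 is an integer -> two equal modes at t-1 = 4 and t = 5
r, p = 3, 0.5
t = 1 + floor((r - 1) / p)
print(nb_pmf(t - 1, r, p), nb_pmf(t, r, p))   # 0.1875 0.1875

# Case 2: (r-1)/p = 13.33... is not an integer -> single mode at t = 14
r, p = 5, 0.3
t = 1 + floor((r - 1) / p)
print(t, nb_pmf(t, r, p) > nb_pmf(t - 1, r, p),
      nb_pmf(t, r, p) > nb_pmf(t + 1, r, p))   # 14 True True
```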
Ans.

\mu = E(X) = \frac{r}{p} = \frac{3}{0.20} = 15

\sigma^2 = Var(X) = \frac{r(1-p)}{p^2} = \frac{3 \times 0.80}{0.20 \times 0.20} = 60
Example: 4 A standard, fair die is thrown until 3 aces occur. Let X denote
the number of throws. Find each of the following:
a. The probability density function of X
b. The mean of X
c. The variance X
d. The probability that at least 20 throws will be needed.
Ans.
a. The probability density function of X is

f(X = x; r, p) = \binom{x-1}{r-1} p^r (1-p)^{x-r}

f(X = x; r, p) = \binom{x-1}{3-1} \left(\frac{1}{6}\right)^3 \left(\frac{5}{6}\right)^{x-3}, \quad x = 3, 4, 5, \dots

b. \mu = E(X) = \frac{r}{p} = \frac{3}{1/6} = 18

c. \sigma^2 = Var(X) = \frac{r(1-p)}{p^2} = \frac{3 \times 5/6}{(1/6)(1/6)} = 90

d. P(X \ge 20) = 0.3643
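Part d follows from the complement: at least 20 throws are needed exactly when the first 19 throws contain at most 2 aces. A short check:

```python
from math import comb

def binom_pmf(k, n, p):
    # Binomial p.m.f.: C(n, k) p^k (1-p)^(n-k)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# {X >= 20} = {at most 2 aces in the first 19 throws}
p_at_least_20 = sum(binom_pmf(k, 19, 1 / 6) for k in range(3))
print(round(p_at_least_20, 4))   # 0.3643
```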
Many real-life variables are found to exhibit negative binomial dispersion. Negative binomial regression analysis is used for modelling over-dispersed count data. The following two examples show situations where the negative binomial distribution arises.
Example A. School administrators study the attendance behavior of high
school juniors at two schools. Predictors of the number of days of absence
include the type of program in which the student is enrolled and a
standardized test in math.
Example B. A health researcher is studying the number of hospital visits in the past 12 months by senior citizens in a community, based on the characteristics of the individuals and the types of health plans under which each one is covered.
Normal Approximation of Negative Binomial Distribution:
For large r, the negative binomial distribution can be approximated by a normal distribution with the same mean and variance:

\mu = \frac{r}{p}, \qquad \sigma^2 = \frac{r(1-p)}{p^2}
Multinomial Distribution
A multinomial distribution is an extension of the binomial distribution. In the case of the binomial distribution only two outcomes are possible: for a coin it is either head or tail, and for a die it is either 1 or not 1 (or 5 and not 5). In the case of the multinomial distribution the number of outcomes k is more than two, with outcome i occurring with probability p_i. The idea of probability requires that

\sum_{i=1}^{k} p_i = 1
The p.m.f. is

f(x_1, x_2, \dots, x_k; n, p_1, p_2, \dots, p_k) = \frac{n!}{x_1! \, x_2! \, x_3! \cdots x_k!} p_1^{x_1} p_2^{x_2} p_3^{x_3} \cdots p_k^{x_k}

where

\sum_{i=1}^{k} p_i = 1 \quad \text{and} \quad \sum_{i=1}^{k} x_i = n

The correlation between the counts X_i and X_j is

\rho(X_i, X_j) = -\sqrt{\frac{p_i p_j}{(1 - p_i)(1 - p_j)}}
Example
Suppose that we throw 10 standard, fair dice. Find the probability
of each of the following events:
a. Scores 1 and 6 occur once each and the other scores occur
twice each.
Ans.
a.

f(x_1, x_2, \dots, x_k; n, p_1, p_2, \dots, p_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}

p = \frac{10!}{1! \, 1! \, 2! \, 2! \, 2! \, 2!} \left(\frac{1}{6}\right)^1 \left(\frac{1}{6}\right)^1 \left(\frac{1}{6}\right)^2 \left(\frac{1}{6}\right)^2 \left(\frac{1}{6}\right)^2 \left(\frac{1}{6}\right)^2 = 0.00375

b.

p = \frac{10!}{6! \, 4!} \left(\frac{1}{2}\right)^6 \left(\frac{1}{2}\right)^4 = 0.205
Ans.

f(x_1, x_2, \dots, x_k; n, p_1, p_2, \dots, p_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}

p = \frac{12!}{7! \, 2! \, 3!} (0.40)^7 (0.35)^2 (0.25)^3 = 0.0248
f(x_1, x_2, \dots, x_k; n, p_1, p_2, \dots, p_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}

p = \frac{10!}{4! \, 1! \, 5!} (0.40)^4 (0.10)^1 (0.50)^5 = 0.1008
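All of the multinomial computations in the examples above follow the same pattern and can be checked with one helper function:

```python
from math import factorial, prod

def multinomial_pmf(xs, ps):
    # n!/(x1! x2! ... xk!) * p1^x1 * ... * pk^xk
    n = sum(xs)
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)   # exact: the multinomial coefficient is an integer
    return coef * prod(p ** x for p, x in zip(ps, xs))

# Ten dice: scores 1 and 6 once each, the other four scores twice each
print(round(multinomial_pmf([1, 2, 2, 2, 2, 1], [1 / 6] * 6), 5))    # 0.00375
# Twelve trials with cell probabilities 0.40, 0.35, 0.25 and counts 7, 2, 3
print(round(multinomial_pmf([7, 2, 3], [0.40, 0.35, 0.25]), 4))      # 0.0248
# Ten trials with cell probabilities 0.40, 0.10, 0.50 and counts 4, 1, 5
print(round(multinomial_pmf([4, 1, 5], [0.40, 0.10, 0.50]), 4))      # 0.1008
```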
For observed counts (10, 6, 4) in n = 20 trials with cell probabilities p_1, p_2 and 1 - p_1 - p_2, the likelihood is

L = \frac{20!}{10! \, 6! \, 4!} p_1^{10} p_2^{6} (1 - p_1 - p_2)^4

Taking logarithms,

\log L = \log\left(\frac{20!}{10! \, 6! \, 4!}\right) + \log(p_1^{10}) + \log(p_2^{6}) + \log((1 - p_1 - p_2)^4)

Differentiating with respect to p_1 and p_2 and setting the derivatives equal to zero,

\frac{\partial \log L}{\partial p_1} = \frac{10}{p_1} - \frac{4}{1 - p_1 - p_2} = 0

\frac{\partial \log L}{\partial p_2} = \frac{6}{p_2} - \frac{4}{1 - p_1 - p_2} = 0

Solving these two equations gives \hat{p}_1 = \frac{10}{20} = \frac{1}{2} and \hat{p}_2 = \frac{6}{20} = \frac{3}{10}, so that 1 - \hat{p}_1 - \hat{p}_2 = \frac{1}{5}.
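A numeric check confirms that the closed-form estimates (the observed proportions) do maximize this log-likelihood; the plus/minus 0.01 perturbation grid is an arbitrary choice:

```python
from math import log

counts = (10, 6, 4)
n = sum(counts)

def loglik(p1, p2):
    # Log-likelihood kernel: 10 log p1 + 6 log p2 + 4 log(1 - p1 - p2)
    return 10 * log(p1) + 6 * log(p2) + 4 * log(1 - p1 - p2)

p1_hat, p2_hat = counts[0] / n, counts[1] / n   # 1/2 and 3/10, as derived
best = loglik(p1_hat, p2_hat)
# The closed-form solution beats every small perturbation on the grid
for d1 in (-0.01, 0.0, 0.01):
    for d2 in (-0.01, 0.0, 0.01):
        assert loglik(p1_hat + d1, p2_hat + d2) <= best
print(p1_hat, p2_hat, round(1 - p1_hat - p2_hat, 10))   # 0.5 0.3 0.2
```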
Summary:
We have discussed the binomial, negative binomial and multinomial distributions in this chapter. We end the chapter with one real-life example that helps you distinguish the binomial and negative binomial distributions. Suppose you have a pack of cards and you draw one card. You put this card back in the pack, shuffle the pack and draw one card again. You repeat the process. If the question is how many draws are required to draw two hearts, the resulting distribution is negative binomial. But if you are asked to find the probability of getting two hearts in five draws, then your distribution is binomial. The multinomial distribution can be understood as an extension of the binomial distribution with more than two outcomes.
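To make the distinction concrete: drawing with replacement, the chance of a heart on any draw is p = 1/4, so "draws needed for the second heart" is negative binomial with r = 2, while "hearts in five draws" is binomial with n = 5. A sketch (evaluating the negative binomial at exactly six draws is an illustrative choice):

```python
from math import comb

p = 1 / 4   # chance of drawing a heart, sampling with replacement

# Negative binomial: P(the 2nd heart appears on exactly the x-th draw)
def second_heart_on_draw(x):
    return comb(x - 1, 1) * p ** 2 * (1 - p) ** (x - 2)

# Binomial: P(exactly 2 hearts in 5 draws)
two_hearts_in_five = comb(5, 2) * p ** 2 * (1 - p) ** 3

print(round(second_heart_on_draw(6), 4))   # 0.0989
print(round(two_hearts_in_five, 4))        # 0.2637
```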