
Chapter 3:

Special Distributions

Yang Zhenlin

zlyang@smu.edu.sg
http://www.mysmu.edu/faculty/zlyang/

Chapter Contents

Chapter 3

Bernoulli Random Variable


Binomial Distribution
Negative Binomial Distribution
Geometric Distribution
Poisson Distribution
Uniform Distribution
Exponential Distribution
Gamma Distribution
Normal Distribution
STAT151, Term I, 09/10; STAT306, Term II, 12-13

Zhenlin Yang, SMU

Introduction

Chapter 3

Many probability distributions have been invented to model phenomena arising in natural and social settings. Each of them is suited to a particular situation, has some unique properties including a special pmf or pdf form, and hence merits a separate study. For example,
A Binomial random variable counts the number of successes among n independent experimental trials, each with two possible outcomes: success or failure.
A Poisson random variable counts the number of events occurring in a fixed time period.
An Exponential random variable can be used to model the lifetime of an electronic component.
A Normal random variable can be used to describe the height or weight distributions of human beings.

Chapter 3

Bernoulli Trial

Consider an experiment with only two possible outcomes:
s = "Success" and f = "Failure".
A performance of such an experiment is called a Bernoulli trial.
The sample space of a Bernoulli trial is S = {s, f}.
Defining a variable X in such a way that
X(s) = 1 and X(f) = 0,
X is a r.v. taking only two possible values, 0 and 1. This r.v. is called a Bernoulli random variable.
Denoting θ = P(X = 1), θ is called the probability of success.
The pmf of a Bernoulli r.v., called the Bernoulli Distribution, is seen to be

p(x) = θ^x (1 − θ)^(1−x),   x = 0, 1
The expectation:  E[X] = 1·P(X = 1) + 0·P(X = 0) = θ.

Chapter 3

Bernoulli Trial
The variance: As E[X²] = 1²·P(X = 1) + 0²·P(X = 0) = θ,
Var(X) = E[X²] − (E[X])² = θ − θ² = θ(1 − θ)
The MGF:
M(t) = E[e^(tX)] = e^(t·1)·P(X = 1) + e^(t·0)·P(X = 0)
= θe^t + (1 − θ)
From the MGF: M′(0) = θ = E[X]; and M″(0) = θ = E[X²].
Example 3.1. If in a throw of a fair die the event of obtaining 4 or 6 is called a success, and the event of obtaining 1, 2, 3, or 5 is called a failure, then
X = 1 if 4 or 6 is obtained, and X = 0 otherwise,
is a Bernoulli r.v. with parameter θ = 1/3.
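As a quick numerical check (not part of the slides), the Bernoulli mean and variance for Example 3.1 can be verified by summing over the two support points:

```python
# Sketch: verify E[X] = theta and Var(X) = theta(1 - theta) for a
# Bernoulli r.v. with theta = 1/3, by direct enumeration over x = 0, 1.
theta = 1 / 3
pmf = {0: 1 - theta, 1: theta}          # p(x) = theta^x (1 - theta)^(1 - x)

mean = sum(x * p for x, p in pmf.items())
var = sum(x**2 * p for x, p in pmf.items()) - mean**2

assert abs(mean - theta) < 1e-12               # E[X] = theta
assert abs(var - theta * (1 - theta)) < 1e-12  # Var(X) = theta(1 - theta)
```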

Binomial Distribution

Chapter 3

If n Bernoulli trials with probability of success θ are performed independently, then X, the number of successes in these n trials, is called a Binomial r.v. with parameters n and θ, denoted by X ~ Bin(n, θ). The pmf of the binomial r.v., called the Binomial Distribution, has the expression

p(x) = nCx θ^x (1 − θ)^(n−x),   x = 0, 1, 2, …, n

The argument for deriving the binomial pmf is as follows:
in n Bernoulli trials, the probability of each particular sequence of x successes and n−x failures is θ^x (1 − θ)^(n−x).
There are in total nCx ways of obtaining x successes and n−x failures. Thus the desired probability of having (any) x successes and n−x failures is nCx θ^x (1 − θ)^(n−x).

Binomial Distribution

Chapter 3

Plots of Binomial Probability Mass Function
[Figure omitted: binomial pmf for n = 20, θ = 0.8.]


Binomial Distribution

Chapter 3

Expectation: E[X] = nθ
Proof: Apply the Binomial Expansion formula,

(a + b)^n = Σ_{i=0}^{n} nCi a^(n−i) b^i


Binomial Distribution

Chapter 3

Variance: Var(X) = nθ(1 − θ)

Proof: Apply the Binomial Expansion formula to find E[X(X − 1)].


Binomial Distribution

Chapter 3

MGF: M(t) = [θe^t + (1 − θ)]^n

Proof:

Exercise: Using the MGF, show E[X] = nθ and Var(X) = nθ(1 − θ).



Binomial Distribution

Chapter 3

Example 3.2. Suppose that jury members decide independently and that each makes the correct decision with probability θ. If the decision of the majority is final, what is the probability that a three-person jury makes a correct decision?
Solution: Let X be the number of persons who decide correctly among a three-person jury. Then X ~ Bin(3, θ). Hence the probability that a three-person jury decides correctly is
P(X ≥ 2) = P(X = 2) + P(X = 3)
= 3C2 θ²(1 − θ) + 3C3 θ³(1 − θ)⁰
= 3θ² − 2θ³.
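A small sketch (not from the slides) confirming that the Bin(3, θ) tail probability matches the closed form 3θ² − 2θ³; the value θ = 0.7 is an arbitrary illustration:

```python
from math import comb

# Sketch for Example 3.2: P(majority of 3 jurors correct) computed as a
# Bin(3, theta) tail probability, vs. the closed form 3*theta^2 - 2*theta^3.
def binom_pmf(n, x, theta):
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

theta = 0.7                       # illustrative value; any theta in (0, 1) works
p_majority = binom_pmf(3, 2, theta) + binom_pmf(3, 3, theta)

assert abs(p_majority - (3 * theta**2 - 2 * theta**3)) < 1e-12
```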


Chapter 3

Geometric Distribution

Consider a sequence of independent Bernoulli trials with probability of success θ. If X represents the number of trials until the first success, then X is a geometric r.v. Its distribution is called the geometric distribution, denoted by X ~ GEO(θ), and is easily seen to be

p(x) = θ(1 − θ)^(x−1),   x = 1, 2, 3, …

The Mean and Variance:  E(X) = 1/θ,  and  Var(X) = (1 − θ)/θ².

Proof: Recall the geometric series and its 1st and 2nd derivatives,
1 + a + a² + a³ + … + aⁿ + … = (1 − a)⁻¹,  for |a| < 1
1 + 2a + 3a² + … + n a^(n−1) + … = (1 − a)⁻²,
2 + 3·2a + 4·3a² + … + n(n − 1)a^(n−2) + … = 2(1 − a)⁻³.

Chapter 3

Geometric Distribution
The MGF:  M(t) = θe^t / (1 − (1 − θ)e^t)

Proof: Apply the geometric series.

Exercise. Use MGF to derive the mean and variance of a


geometric random variable.


Chapter 3

Geometric Distribution

Example 3.3. A father asks his sons to cut the backyard lawn. Since he does not specify which of his three sons is to do the job, each boy tosses a coin to determine the odd person, who must then cut the lawn. If all three get heads or all three get tails, they continue tossing until they reach a decision. Let θ be the probability of heads; then 1 − θ is the probability of tails.
(a) Find the probability that they reach a decision in less than n rounds of tosses.
(b) If θ = 1/2, what is the minimum number of rounds required to reach a decision with probability 0.95?

Solution:
(a) The probability of reaching a decision on any round of coin tossing:
1 − α = P{exactly two heads} + P{exactly two tails}
= 3C2 θ²(1 − θ) + 3C2 θ(1 − θ)² = 3θ(1 − θ)
Let X be the number of rounds of tosses until they reach a decision; then X is a geometric r.v. with parameter 1 − α. Thus,
P(X < n) = 1 − P(X ≥ n) = 1 − [1 − 3θ(1 − θ)]^(n−1).

Chapter 3

Geometric Distribution
(b) We want to find the minimum n so that P(X ≤ n) ≥ 0.95.
CDF of the Geometric Distribution:
F(x) = P{X ≤ x} = 1 − P{X > x}
= 1 − P{X ≥ x + 1}
= 1 − [θ(1 − θ)^x + θ(1 − θ)^(x+1) + θ(1 − θ)^(x+2) + …]
= 1 − θ(1 − θ)^x [1 + (1 − θ) + (1 − θ)² + …]
= 1 − (1 − θ)^x
Here, with success probability 3θ(1 − θ) per round,
P(X ≤ n) = 1 − [1 − 3θ(1 − θ)]^n
= 1 − (1 − 3/4)^n ≥ 0.95,
which gives n ≥ 2.16.
So n = 3.
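The arithmetic in part (b) can be sketched as follows (an illustrative check, not part of the slides):

```python
from math import log, ceil

# Sketch for Example 3.3(b): smallest n with P(X <= n) >= 0.95, where X is
# geometric with success probability p = 3*theta*(1 - theta), theta = 1/2.
theta = 0.5
p = 3 * theta * (1 - theta)            # = 3/4, chance of a decision per round

# P(X <= n) = 1 - (1 - p)^n >= 0.95  <=>  n >= log(0.05) / log(1 - p)
n = ceil(log(0.05) / log(1 - p))

assert n == 3
assert 1 - (1 - p)**n >= 0.95
assert 1 - (1 - p)**(n - 1) < 0.95     # n - 1 = 2 rounds is not enough
```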


Chapter 3

Negative Binomial Distribution

If Y represents the number of Bernoulli trials until r successes, then Y is called a negative binomial r.v. with parameters r and θ, denoted by Y ~ NB(r, θ).
To obtain the pmf of Y, note that the event {Y = y} is equivalent to {r − 1 successes in the first y − 1 trials and a success in the last trial}; thus,

p(y) = P(Y = y) = (y−1)C(r−1) θ^r (1 − θ)^(y−r),   y = r, r + 1, …

which is called the negative binomial distribution.

E(Y) = r/θ,  Var(Y) = r(1 − θ)/θ²,  and  M(t) = [θe^t / (1 − (1 − θ)e^t)]^r

Note: The mean and variance of a negative binomial random variable are r times those of a geometric r.v., and its MGF is that of a geometric r.v. raised to the power r. Why?

Negative Binomial Distribution

Chapter 3

Example 3.4. Team A plays team B in a seven-game world series. That is, the series is over when either team wins four games. For each game, P(A wins) = 0.6, and the games are assumed independent.
(a) What is the probability that the series ends in exactly six games?
(b) If the series ends in six games, what is the probability that A wins the series?
Solution:
(a) Let X be the number of games until A wins four games, and Y be the number of games until B wins four games.
Then X ~ NB(4, 0.6) and Y ~ NB(4, 0.4).
P(series ends at exactly 6 games) = P(X = 6) + P(Y = 6)
= 5C3 (0.6)⁴(0.4)² + 5C3 (0.4)⁴(0.6)²
= 0.2074 + 0.0922 = 0.2996.
(b) Let A be the event that team A wins and D be the event that the series ends in six games. Then the desired probability is
P(A | D) = P(A ∩ D) / P(D) = P(X = 6) / [P(X = 6) + P(Y = 6)] = 0.2074 / 0.2996 = 0.6923.
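A sketch (not from the slides) reproducing the negative binomial calculations above:

```python
from math import comb

# Sketch for Example 3.4: NB(4, theta) probabilities that a seven-game
# series ends in exactly six games, and that A wins given that it does.
def nb_pmf(y, r, theta):
    # P(Y = y): (r - 1) successes in the first (y - 1) trials, then a success
    return comb(y - 1, r - 1) * theta**r * (1 - theta)**(y - r)

p_a = nb_pmf(6, 4, 0.6)               # A wins in exactly 6 games
p_b = nb_pmf(6, 4, 0.4)               # B wins in exactly 6 games
p_six = p_a + p_b                     # series ends in exactly 6 games

assert abs(p_a - 0.2074) < 5e-4
assert abs(p_six - 0.2996) < 5e-4
assert abs(p_a / p_six - 0.6923) < 5e-4
```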

Poisson Distribution

Chapter 3

In many applications, the number of events occurring in a fixed "time interval" is of interest. For example,
The number of customers entering a bank on a given day.
The number of typographical errors in a book.
The number of traffic accidents in a given year.
The number of α-particles discharged in a fixed period of time from some radioactive material.
The number of defects on a piece of wire.

These quantities often satisfy the following conditions:
The numbers of events occurring in nonoverlapping intervals are independent.
The probability of exactly one event occurring in a sufficiently short interval of length h is approximately λh.
The probability of two or more events occurring in a sufficiently short interval is essentially zero.

Poisson Distribution

Chapter 3

A suitable distribution for describing the above phenomena is the Poisson distribution. If the three conditions are satisfied, it can indeed be shown that the number of events occurring in a fixed or unit interval, X, follows a Poisson distribution with pmf:

p(x) = e^(−λ) λ^x / x!,   x = 0, 1, 2, …

As a pmf sums to one, we must have

Σ_{x=0}^∞ e^(−λ) λ^x / x! = 1,  or  Σ_{x=0}^∞ λ^x / x! = e^λ   (a useful formula!)

The mean:  E[X] = λ: the rate of Poisson events per unit.
The variance:  Var(X) = λ
The MGF:  M(t) = e^(λ(e^t − 1))

Chapter 3

Poisson Distribution
Plots of Poisson Probability Mass Function
[Figures omitted: Poisson pmf for mean = 2, 5, and 10.]

Note:
i) The mean and variance of the Poisson are both λ.
ii) The parameter λ is usually called the rate of occurrence of Poisson events.

Chapter 3

Poisson Distribution
Derivation of the mean, variance and MGF:

E[X] = Σ_{x=0}^∞ x · e^(−λ) λ^x / x!
= Σ_{x=1}^∞ e^(−λ) λ^x / (x − 1)!
= λ Σ_{x=1}^∞ e^(−λ) λ^(x−1) / (x − 1)!
= λ Σ_{y=0}^∞ e^(−λ) λ^y / y!   (y = x − 1)
= λ

E[X(X − 1)] = Σ_{x=0}^∞ x(x − 1) e^(−λ) λ^x / x!
= Σ_{x=2}^∞ e^(−λ) λ^x / (x − 2)!
= λ² Σ_{y=0}^∞ e^(−λ) λ^y / y!   (y = x − 2)
= λ²

Hence E[X²] = λ² + λ, and Var(X) = E[X²] − (E[X])² = λ.

M(t) = E[e^(tX)] = Σ_{x=0}^∞ e^(tx) e^(−λ) λ^x / x!
= e^(−λ) Σ_{x=0}^∞ (λe^t)^x / x!
= e^(−λ) e^(λe^t) = e^(λ(e^t − 1))

Use M(t) to show these?

Poisson Distribution

Chapter 3

Example 3.5. Every day, the average number of wrong-number calls received by a certain mail-order house is one. What is the probability that this house receives
a) two wrong calls tomorrow,
b) at least one wrong call tomorrow?
Solution: Assuming that the house receives a lot of calls, X, the number of wrong calls received tomorrow, is approximately a Poisson r.v. with λ = E(X) = 1. Thus,
a) P(X = 2) = e^(−1) 1² / 2! ≈ 0.18
b) P(X ≥ 1) = 1 − P(X = 0) = 1 − 1/e = 0.63.

If X, the number of Poisson events occurring in [0, 1], is Poi(λ), then Y, the number of Poisson events in [0, 2], is Poi(2λ), and so on!
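The two probabilities above can be sketched numerically (an illustrative check, not part of the slides):

```python
from math import exp, factorial

# Sketch for Example 3.5: Poisson(1) probabilities for the number of
# wrong calls received tomorrow.
def pois_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

lam = 1.0
p_two = pois_pmf(2, lam)               # part (a): P(X = 2)
p_at_least_one = 1 - pois_pmf(0, lam)  # part (b): P(X >= 1)

assert abs(p_two - 0.18) < 5e-3
assert abs(p_at_least_one - 0.63) < 5e-3
```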

Uniform Distribution

Chapter 3

Uniform Random Variable
A continuous r.v. X is said to be uniformly distributed over an interval (a, b), denoted by X ~ UN(a, b), if its pdf is given by

f(x) = 1/(b − a),  if a ≤ x ≤ b;  0, otherwise.

F(x) = 0,  x < a;
= (x − a)/(b − a),  a ≤ x < b;
= 1,  x ≥ b.

E[X] = (a + b)/2,   Var(X) = (b − a)²/12

M(t) = (e^(tb) − e^(ta)) / (t(b − a))   (M(0) = ?)

Chapter 3

Uniform Distribution

Example 3.6. Starting at 5:00 A.M., every half hour there is a flight from San
Francisco airport to Los Angeles International airport. Suppose that none of
these planes is completely sold out and that they always have rooms for
passengers. A person who wants to fly to L.A. arrives at the airport at a
random time between 8:45 A.M. and 9:45 A.M. Find the probability that she
waits a) at most 10 minutes; b) at least 15 minutes.
Solution: Let the passenger arrive at the airport X minutes past 8:45. Then,
X ~ UN(0, 60).
a) P(wait at most 10 minutes)
= P(arrive between 8:50 and 9:00, or between 9:20 and 9:30)
= P(5 < X < 15) + P(35 < X < 45)
= ∫₅¹⁵ (1/60) dx + ∫₃₅⁴⁵ (1/60) dx = 1/3
b) P(wait at least 15 minutes)
= P(arrive between 9:00 and 9:15, or between 9:30 and 9:45)
= P(15 < X < 30) + P(45 < X < 60)
= ∫₁₅³⁰ (1/60) dx + ∫₄₅⁶⁰ (1/60) dx = 1/2
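The interval arithmetic in Example 3.6 can be sketched as follows (an illustrative check, not part of the slides):

```python
# Sketch for Example 3.6: X ~ UN(0, 60) is the arrival time in minutes past
# 8:45; flights depart at X = 15 (9:00) and X = 45 (9:30), so the waiting
# time is short or long depending on which subinterval X falls in.
def p_uniform(lo, hi, a=0.0, b=60.0):
    # P(lo < X < hi) for X ~ UN(a, b): length of the overlap over (b - a)
    return (min(hi, b) - max(lo, a)) / (b - a)

# wait <= 10 min: arrive in (5, 15) or (35, 45)
p_a = p_uniform(5, 15) + p_uniform(35, 45)
# wait >= 15 min: arrive in (15, 30) or (45, 60)
p_b = p_uniform(15, 30) + p_uniform(45, 60)

assert abs(p_a - 1 / 3) < 1e-12
assert abs(p_b - 1 / 2) < 1e-12
```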

Chapter 3

Exponential Distribution

A continuous r.v. is said to follow an exponential distribution with parameter θ, denoted by X ~ Exp(θ), if its pdf is given by

f(x) = (1/θ)e^(−x/θ),  x ≥ 0;  0, otherwise.

It is easy to see that the CDF of the exponential distribution is

F(x) = 0,  if x < 0;
= 1 − e^(−x/θ),  if x ≥ 0.

The mean and variance:  E(X) = θ and Var(X) = θ².

The exponential distribution has a famous property, called the memoryless property: if the lifetime of an electronic component follows the exponential distribution, then the probability that a used component lasts an additional t time units is the same as the probability of a new component lasting t time units.

Exponential Distribution

Chapter 3

Example 3.7. Suppose that a certain solid-state component has a lifetime or failure time (in hours) X ~ EXP(100).
a) Find the probability that the component will last at least 100 hours given that it has been used for 50 hours.
b) Suppose that 10 such components are randomly selected. What is the probability that more than two of them last at least 50 hours?
c) Now, 100 components are randomly selected. Find the probability that at least 2 of them last more than 460 hours.
Solution: a) By the memoryless property,
P(X > 100 | X > 50) = P(X > 50) = exp(−50/100) = 0.6065.
b) Let Y be the number of components among the 10 selected that last more than 50 hours. Then Y ~ Bin(10, 0.6065). The desired probability is
P(Y > 2) = 1 − P(Y ≤ 2) = 0.9890
c) Let W be the number of components among the 100 selected that last more than 460 hours. Then W ~ Bin(100, θ), with θ = P(X > 460) = exp(−460/100) ≈ 0.01.
The desired probability = P(W ≥ 2) = 0.2642.
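All three parts can be sketched numerically (an illustrative check, not part of the slides; the slides round P(X > 460) to 0.01, so part (c) agrees to about two decimal places):

```python
from math import exp, comb

# Sketch for Example 3.7: lifetimes X ~ EXP(theta) with mean theta = 100.
theta = 100.0

def surv(t):
    # P(X > t) = e^(-t/theta) for the exponential distribution
    return exp(-t / theta)

# (a) memoryless: P(X > 100 | X > 50) = P(X > 50)
assert abs(surv(50) - 0.6065) < 5e-4

# (b) Y ~ Bin(10, P(X > 50)); want P(Y > 2) = 1 - P(Y <= 2)
p = surv(50)
p_y_le_2 = sum(comb(10, k) * p**k * (1 - p)**(10 - k) for k in range(3))
assert abs((1 - p_y_le_2) - 0.9890) < 5e-4

# (c) W ~ Bin(100, P(X > 460)); want P(W >= 2)
q = surv(460)                          # ~ 0.01, as rounded on the slide
p_w_ge_2 = 1 - (1 - q)**100 - 100 * q * (1 - q)**99
assert abs(p_w_ge_2 - 0.2642) < 5e-3
```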

Chapter 3

Exponential Distribution
Exponential and Poisson:
The time between any two Poisson events is an exponential random variable with θ = 1/λ.

Let T be the time until the first Poisson event, and let the rate of occurrence of Poisson events be λ per time unit. Then the CDF of T is

F(t) = P(T ≤ t)
= 1 − P(T > t)
= 1 − P(no Poisson events in [0, t])
= 1 − e^(−λt),
which is the exponential CDF; thus T is exponential with θ = 1/λ.

Chapter 3

Gamma Distribution

A continuous r.v. X is said to follow a gamma distribution with parameters α > 0 and θ > 0, denoted by X ~ Gamma(α, θ), if its pdf is given by

f(x) = [1/(Γ(α) θ^α)] x^(α−1) e^(−x/θ),  x > 0;  0, otherwise,

where Γ(α) is the gamma function, defined as

Γ(α) = ∫₀^∞ y^(α−1) e^(−y) dy,   α > 0,

having the property Γ(α) = (α − 1)Γ(α − 1), and Γ(n) = (n − 1)!

As the pdf integrates to 1, we have

∫₀^∞ [1/(Γ(α) θ^α)] x^(α−1) e^(−x/θ) dx = 1,  i.e.,  ∫₀^∞ x^(α−1) e^(−x/θ) dx = Γ(α) θ^α

Gamma Distribution

Chapter 3

Expectation: E[X] = αθ
Proof: Apply the gamma pdf formula.

Gamma Distribution

Chapter 3

Variance: Var(X) = αθ²
Proof: Apply the gamma pdf formula.

Gamma Distribution
The MGF:

Chapter 3

M(t) = 1 / (1 − θt)^α,   t < 1/θ

Proof: Apply the gamma pdf formula.

Exercise: Derive the mean and variance using M(t).
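As a rough numerical check (not part of the slides), the gamma mean αθ and variance αθ² can be confirmed by crude midpoint-rule integration of the pdf; the values α = 3, θ = 2 are arbitrary illustrations:

```python
from math import gamma, exp

# Sketch: check E[X] = alpha*theta and Var(X) = alpha*theta^2 for
# X ~ Gamma(alpha, theta) by numerical integration of the pdf.
alpha, theta = 3.0, 2.0

def pdf(x):
    return x**(alpha - 1) * exp(-x / theta) / (gamma(alpha) * theta**alpha)

# Midpoint rule on [0, 100]; the tail beyond 100 is negligible here.
dx, upper = 1e-3, 100.0
xs = [(k + 0.5) * dx for k in range(int(upper / dx))]
m1 = sum(x * pdf(x) * dx for x in xs)          # E[X]
m2 = sum(x * x * pdf(x) * dx for x in xs)      # E[X^2]

assert abs(m1 - alpha * theta) < 1e-3              # mean = 6
assert abs(m2 - m1**2 - alpha * theta**2) < 1e-2   # variance = 12
```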



Gamma Distribution

Chapter 3

More about Gamma Distribution:
When α = 1, the gamma becomes the exponential.
When θ = 2 and α = r/2, where r is a positive integer, the gamma distribution becomes a chi-squared distribution with r degrees of freedom.
When α = k, a positive integer, the gamma random variable can be interpreted as the waiting time until the kth Poisson event.
(Relate this to the relationship between the exponential and the Poisson.)


Normal Distribution

Chapter 3

A continuous r.v. X is called normal with mean μ and standard deviation σ, denoted by X ~ N(μ, σ²), if its pdf has the form

f(x) = [1/(σ√(2π))] exp(−(x − μ)²/(2σ²)),   −∞ < x < ∞

where −∞ < μ < ∞ and σ > 0. The f(x) is bell-shaped, unimodal, and symmetric around μ.
[Figure omitted: normal pdf curves.]

Normal Distribution

Chapter 3

If X ~ N(μ, σ²), then a + bX ~ N(a + bμ, b²σ²).
It follows that Z = (X − μ)/σ ~ N(0, 1), called the Standard Normal Random Variable.
The pdf and CDF of Z are usually denoted, respectively, by

φ(z) = (1/√(2π)) exp(−z²/2),   Φ(z) = ∫_{−∞}^z (1/√(2π)) exp(−t²/2) dt

There is no closed-form expression for Φ(z), but the percentage values are tabulated, or can be easily calculated using computer software.

Chapter 3

Normal Distribution
Example 3.8. Let X ~ N(3, 9). Calculate
a) P(2 < X < 5),
b) P(X > 0),
c) P(|X − 3| > 6).
Solution:
a) P(2 < X < 5) = P((2 − 3)/3 < (X − 3)/3 < (5 − 3)/3)
= P(−1/3 < Z < 2/3)
= Φ(2/3) − Φ(−1/3) = 0.3779
b) P(X > 0) = P((X − 3)/3 > (0 − 3)/3)
= P(Z > −1)
= 1 − Φ(−1) = Φ(1) = 0.8413
c) P(|X − 3| > 6) = P(|X − 3|/3 > 2) = P(|Z| > 2)
= 2Φ(−2) = 0.0456

Normal Distribution

Chapter 3

The Normal Approximation to the Binomial
When n, the number of Bernoulli trials, is large, it is difficult to calculate the binomial probabilities, especially in the old days. The normal approximation to the binomial probabilities was therefore introduced. The basic idea is that as n gets larger, the plot of the binomial pmf gets closer to the plot of a normal pdf.

Normal Distribution

Chapter 3

As a rule of thumb: if n is such that nθ(1 − θ) ≥ 10, then the distribution of a Bin(n, θ) r.v. can be approximated by the distribution of a N[nθ, nθ(1 − θ)] r.v.
Example 3.9. Flip a fair coin 40 times. Let X be the number of times that it lands heads. Find the probability that X ≤ 21. Use the normal approximation and then compare it to the exact solution.
Solution: Since X ~ Bin(40, 0.5) and 40·0.5·0.5 = 10, the distribution of X can be approximated by N(20, 10):

P(X ≤ 21) ≈ P( (X − 20)/√(40·0.5·0.5) ≤ (21 − 20)/√(40·0.5·0.5) )
= P(Z ≤ 0.3162) = 0.6241.

The exact answer is 0.6821. The approximation is not so good. Why?



Normal Distribution

Chapter 3

The reason is that we are using a continuous distribution to approximate a discrete one, and a continuity correction is lacking.
Continuity Correction Factor. Let X ~ Bin(n, p) with np(1 − p) ≥ 10, and let Y ~ N[np, np(1 − p)]. Then we have
P(X ≤ i) ≈ P(Y ≤ i + 1/2),
P(X ≥ i) ≈ P(Y ≥ i − 1/2),
P(X = i) ≈ P(i − 1/2 ≤ Y ≤ i + 1/2).
How, in general, can a probability based on a discrete r.v. be approximated by a probability based on a continuous one? Let X be a discrete r.v. with pmf p(x), taking integer values. We have

P(i ≤ X ≤ j) = Σ_{k=i}^{j} p(k) = sum of the areas of all rectangles from i to j,

where the kth rectangle has base 1, height p(k) and midpoint k.

Normal Distribution

Chapter 3

[Figure omitted: rectangles of heights p(i), …, p(j) with midpoints i, i+1, …, j−1, j, overlaid with the approximating pdf f(x).]

Now suppose the continuous pdf f(x) is a good approximation to p(x). Clearly,
P(i ≤ X ≤ j) ≈ the area under f(x) from i − 1/2 to j + 1/2 = ∫_{i−1/2}^{j+1/2} f(x) dx.

Normal Distribution
Similarly,
P(X = k) ≈ ∫_{k−1/2}^{k+1/2} f(x) dx,
P(X ≥ i) ≈ ∫_{i−1/2}^{∞} f(x) dx,
P(X ≤ j) ≈ ∫_{−∞}^{j+1/2} f(x) dx.

Example 3.9 (Cont'd). Now revisit Example 3.9 to approximate P(X ≤ 21) by applying the continuity correction factor.
P(X ≤ 21) ≈ P(Y ≤ 21 + 0.5)
= P( (Y − 20)/√(40·0.5·0.5) ≤ (21.5 − 20)/√(40·0.5·0.5) )
= P(Z ≤ 0.4743) = 0.6824,
compared to the exact answer 0.6821!
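The whole comparison of Example 3.9 can be sketched with the standard normal CDF written via the error function, Φ(z) = (1 + erf(z/√2))/2 (an illustrative check, not part of the slides):

```python
from math import erf, sqrt, comb

# Sketch for Example 3.9: P(X <= 21) for X ~ Bin(40, 0.5), exactly and via
# the normal approximation, with and without the continuity correction.
def Phi(z):
    # standard normal CDF expressed through the error function
    return 0.5 * (1 + erf(z / sqrt(2)))

n, th = 40, 0.5
mu, sd = n * th, sqrt(n * th * (1 - th))   # N(20, 10): mu = 20, sd = sqrt(10)

exact = sum(comb(n, k) * th**k * (1 - th)**(n - k) for k in range(22))
plain = Phi((21 - mu) / sd)                # no continuity correction
corrected = Phi((21.5 - mu) / sd)          # with continuity correction

assert abs(exact - 0.6821) < 5e-4
assert abs(plain - 0.6241) < 5e-4
assert abs(corrected - 0.6824) < 5e-4
```

The corrected value lands within 0.0003 of the exact answer, while the uncorrected one is off by about 0.058, which is the point of the continuity correction.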
