
Civil Engineering

Statistics
(BFC 34303)
Chapter 4:
Special Probability
Distributions
Prepared by:
Assoc. Prof. Dr. Munzilah Md. Rohani
Nursitihazlin Ahmad Termida
1.0 Introduction
• 2 types of discrete distribution:
(1) Binomial
(2) Poisson

1.1 The Binomial Distribution


• Many statistical problems deal with the situations referred to as repeated
trials.
• For example, we may want to know the probability that 1 of 5 rivets will
rupture in a tensile test, the probability that 45 of 300 drivers stopped at a
roadblock will be wearing seatbelts, the probability that 9 of 10 DVD
players will run at least 1,000 hours, or the probability that 66 of 200
television viewers (interviewed by a rating service) will recall what products
were advertised on a given program.
• Borrowing from the language of games of chance, we might say that in
each of these examples, we are interested in the probability of getting
𝑥 successes in 𝑛 trials, or, in other words, 𝑥 successes and 𝑛 − 𝑥
failures in 𝑛 attempts.

Definition 1 (page 58)


A Bernoulli trial can result in a success with probability 𝑝 or a failure
with probability 𝑞 = 1 − 𝑝. If 𝑋 is the number of successes in 𝑛 independent
Bernoulli trials, the probability distribution of 𝑋 is called a binomial
distribution (or binomial probability distribution), written as:

𝑋 ~ 𝐵(𝑛, 𝑝)
An important question in binomial probability problems is how to determine
whether the binomial distribution applies.
The Binomial Distribution
Bernoulli Random Variables
• Imagine a simple trial with only two possible outcomes
• Success (S)
• Failure (F)

• Examples
• Toss of a coin (heads or tails)
• Sex of a newborn (male or female)
• Survival of an organism in a region (live or die)

[Portrait: Jacob Bernoulli (1654-1705)]
The Binomial Distribution
Overview
• Suppose that the probability of success is p

• What is the probability of failure?


• q=1–p

• Examples
• Toss of a coin (S = head): p = 0.5 ⇒ q = 0.5
• Roll of a die (S = 1): p = 0.1667 ⇒ q = 0.8333
• Fertility of a chicken egg (S = fertile): p = 0.8 ⇒ q = 0.2
• Imagine that a trial is repeated n times

• Examples
• A coin is tossed 5 times
• A die is rolled 25 times
• 50 chicken eggs are examined

• Assume p remains constant from trial to trial and that the trials
are statistically independent of each other
• What is the probability of obtaining x successes in n trials?

• Example
• What is the probability of obtaining 2 heads from a coin
that was tossed 5 times?

$P(\text{HHTTT}) = (1/2)^{5} = 1/32$


• But there are more possibilities:

HHTTT HTHTT HTTHT HTTTH
THHTT THTHT THTTH
TTHHT TTHTH
TTTHH

P(2 heads) = 10 × 1/32 = 10/32
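
The count of 10 arrangements can be checked by brute force. The following is a minimal sketch that lists every sequence of five tosses and keeps those with exactly two heads; the variable names are illustrative only.

```python
from itertools import product

# All 2**5 equally likely outcomes of five tosses of a fair coin.
outcomes = ["".join(t) for t in product("HT", repeat=5)]

# Keep only the sequences that contain exactly two heads.
two_heads = [seq for seq in outcomes if seq.count("H") == 2]

print(len(outcomes))                    # number of possible sequences
print(len(two_heads))                   # number of sequences with exactly 2 heads
print(len(two_heads) / len(outcomes))   # P(2 heads)
```

Each sequence has probability 1/32, so the probability is simply the number of favourable sequences divided by 32.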


• In general, if trials result in a series of successes and failures,

FFSFFFFSFSFSSFFFFFSF…

then the probability of 𝑥 successes in that order is

$$P(x) = q \cdot q \cdot p \cdot q \cdots = p^{x}\, q^{n-x}$$
Theory 2 (page 59)
Binomial formula:

$$P(X = x) = \frac{n!}{x!\,(n-x)!}\, p^{x}\, q^{n-x} = {}^{n}C_{x}\, p^{x}\, q^{n-x}$$
Where
𝑝 = probability of success in a single trial
𝑞 = 1 − 𝑝, probability of failure in a single trial
𝑛 = number of trials
𝑥 = number of successes in 𝑛 trials, so 𝑥 can be any whole number
between 0 and 𝑛 (inclusive), with 𝑋~𝐵(𝑛, 𝑝)
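
As a rough illustration, the binomial formula can be written as a short Python function. This is a sketch only; the function name `binomial_pmf` is an assumption made for this example, and the check uses the coin example from the overview (2 heads in 5 tosses).

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p), computed as nCx * p^x * q^(n - x)."""
    q = 1.0 - p
    return comb(n, x) * p**x * q**(n - x)

# Coin example: 2 heads in 5 tosses of a fair coin.
print(binomial_pmf(2, 5, 0.5))   # 10/32 = 0.3125

# The probabilities over x = 0, 1, ..., n add up to 1.
print(sum(binomial_pmf(x, 5, 0.5) for x in range(6)))
```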

See Example 1 (page 59)


Theory 3 (page 60)
The usual inequalities that we will use in Binomial are:

Name            Symbol
Greater than    >
Less than       <
At least        ≥
At most         ≤
Not more than   ≤
Not less than   ≥
See example 3 (page 61)
Theory 4 (page 62)
Sometimes it is more convenient to use the binomial probability table,
especially when the number of trials is more than ten (𝒏 > 𝟏𝟎).

(see Table 1 in the appendix)


See Example 4 (page 62)
Theory 5 (page 63)
Some ideas for finding probabilities from the table, which gives 𝑃(𝑋 ≥ 𝑘):
(a) 𝑃(𝑋 ≥ 𝑘) = value read from the table
(b) 𝑃(𝑋 < 𝑘) = 1 − 𝑃(𝑋 ≥ 𝑘)
(c) 𝑃(𝑋 ≤ 𝑘) = 1 − 𝑃(𝑋 ≥ 𝑘 + 1)
(d) 𝑃(𝑋 > 𝑘) = 𝑃(𝑋 ≥ 𝑘 + 1)
(e) 𝑃(𝑋 = 𝑘) = 𝑃(𝑋 ≥ 𝑘) − 𝑃(𝑋 ≥ 𝑘 + 1)
(f) 𝑃(𝑘 ≤ 𝑋 ≤ 𝑙) = 𝑃(𝑋 ≥ 𝑘) − 𝑃(𝑋 ≥ 𝑙 + 1)
(g) 𝑃(𝑘 < 𝑋 < 𝑙) = 𝑃(𝑋 ≥ 𝑘 + 1) − 𝑃(𝑋 ≥ 𝑙)
(h) 𝑃(𝑘 ≤ 𝑋 < 𝑙) = 𝑃(𝑋 ≥ 𝑘) − 𝑃(𝑋 ≥ 𝑙)
(i) 𝑃(𝑘 < 𝑋 ≤ 𝑙) = 𝑃(𝑋 ≥ 𝑘 + 1) − 𝑃(𝑋 ≥ 𝑙 + 1)

See Example 5 (page 63) for how to apply Theory 5.
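
The table identities in Theory 5 can be tried out numerically. In the sketch below, the function `upper_tail` stands in for the table lookup of 𝑃(𝑋 ≥ 𝑘); the function name and the values of n, p, k and l are hypothetical, chosen only for illustration.

```python
from math import comb

def upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ B(n, p): the kind of value a table would list."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x)
               for x in range(max(k, 0), n + 1))

n, p, k, l = 12, 0.4, 3, 7   # hypothetical values for illustration

p_less    = 1 - upper_tail(k, n, p)                        # (b) P(X < k)
p_at_most = 1 - upper_tail(k + 1, n, p)                    # (c) P(X <= k)
p_more    = upper_tail(k + 1, n, p)                        # (d) P(X > k)
p_equal   = upper_tail(k, n, p) - upper_tail(k + 1, n, p)  # (e) P(X = k)
p_between = upper_tail(k, n, p) - upper_tail(l + 1, n, p)  # (f) P(k <= X <= l)

print(p_less, p_at_most, p_more, p_equal, p_between)
```

Every case is built from upper-tail values only, which is why a single table of 𝑃(𝑋 ≥ 𝑘) is enough.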


Theory 6 (Page 67)
The mean and variance of the binomial distribution 𝑋~𝐵(𝑛, 𝑝) with 𝑛
trials and probability of success, 𝑝 are:

Mean, 𝜇 = 𝑛𝑝
Variance, 𝜎² = 𝑛𝑝𝑞
***Note: We know that 𝑞 = (1 − 𝑝)

See example 8 (page 67)
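
To see that 𝜇 = 𝑛𝑝 and 𝜎² = 𝑛𝑝𝑞 agree with the definition of mean and variance, the sums can be evaluated directly. This sketch assumes the chicken-egg values from the overview (𝑛 = 50, 𝑝 = 0.8); the variable names are illustrative.

```python
from math import comb

n, p = 50, 0.8   # 50 eggs, each fertile with probability 0.8
q = 1 - p

# Full probability distribution P(X = x) for x = 0, 1, ..., n.
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# Mean and variance computed from their definitions.
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(mean, n * p)            # both close to 40
print(variance, n * p * q)    # both close to 8
```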


• Try exercise 2.2 of page 69
(Q3) Let 𝑋 be a binomial random variable with 𝑛 = 7 and 𝑝 = 0.3.
Find the value of:

(a) 𝑃(𝑥 = 4)
(b) 𝑃(𝑥 ≤ 1)
(c) 𝑃(𝑥 > 1)
(d) Mean
(e) Variance

Solution???
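
One way to check your own working for Q3 is a short calculation like the sketch below. It only applies the binomial formula and Theory 6 with 𝑛 = 7 and 𝑝 = 0.3; the helper name `pmf` is an assumption made for this example.

```python
from math import comb

n, p = 7, 0.3
q = 1 - p

def pmf(x: int) -> float:
    """P(X = x) for X ~ B(7, 0.3)."""
    return comb(n, x) * p**x * q**(n - x)

p_a = pmf(4)                 # (a) P(X = 4)
p_b = pmf(0) + pmf(1)        # (b) P(X <= 1)
p_c = 1 - p_b                # (c) P(X > 1)
mean = n * p                 # (d) mean = np
variance = n * p * q         # (e) variance = npq

print(round(p_a, 4), round(p_b, 4), round(p_c, 4), mean, variance)
```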
1.2 The Poisson Distribution

• The Poisson probability distribution provides a good model for data
that represent the number of occurrences of a specified event in a
given unit of time or space.
• Examples of experiments for which the random variable 𝑥 can be
modeled by the Poisson random variable:
• The number of calls received by a switchboard during a given period of time.
• The number of bacteria per small volume of fluid.
• The number of customer arrivals at a checkout during a given minute.
• The number of machine breakdowns during a given day.
• The number of traffic accidents at a given intersection during a given time
period.
A little bit of the history….

• The distribution was derived by the French mathematician Siméon Poisson in 1837,
and the first application was the description of the number of deaths by horse
kicking in the Prussian army.
• In all the examples mentioned previously, 𝑋 represents the number of
events that occur in a given interval of time or space, where such events
can be expected to occur at a known average (mean) rate.

Definition 2 (Page 74)


The probability distribution of 𝑋 is called a Poisson distribution and is
written as:

𝑋 ~ 𝑃𝑜(𝜇)
Theory 7 (Page 74)
• Properties of Poisson distribution are:

• The experiment consists of counting the number of times, 𝑋, that
a particular event occurs during a given unit of time, or in a given
area or volume (or weight, distance, or any other unit of
measurement).
• The probability that an event occurs in a given unit of time, area,
or volume is the same for all the units.
• The number of events that occur in one unit is independent of the
number that occur in any other unit.
• The Poisson random variable is the number of events (successes) counted.
Theory 8 (Page 75)
• Poisson distribution formula:

$$P(X = x) = \frac{e^{-\mu}\,\mu^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$

where 𝜇 = mean or average number of events
𝑥 = number of events (successes)
with 𝑋 ~ 𝑃𝑜(𝜇)

Or, in the alternative notation,

$$p(x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}$$

where 𝜆 (or 𝜇) is the shape parameter, which indicates the average number of
events in the given time interval.
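
The Poisson formula translates directly into code. The sketch below is illustrative only: the function name `poisson_pmf` and the mean of 3.0 are assumptions made for this example.

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ Po(mu), computed as e^(-mu) * mu^x / x!."""
    return exp(-mu) * mu**x / factorial(x)

mu = 3.0   # hypothetical mean, chosen only for illustration
probs = [poisson_pmf(x, mu) for x in range(50)]

print([round(v, 4) for v in probs[:4]])   # P(X = 0), ..., P(X = 3)
print(sum(probs))                         # very close to 1
```

Although 𝑥 has no upper limit, the probabilities shrink quickly, so summing the first few dozen terms already gives a total very close to 1.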
• For each value of 𝑋, the individual probabilities for the Poisson
random variable can be calculated from the formula, just as for a binomial random variable.

• Alternatively, we can use the Cumulative Poisson Table (see Table 2 in the appendix),
where values are looked up in the same way as for a binomial
random variable.

• Important! Please use consistent units in the calculation of
probabilities, means and variances that involve Poisson random
variables (see page 75).

• See example 11 (page 75)


Theory 9 (Page 76)

Sometimes the Poisson probability table is used to find probabilities,
whether the number of events is small or large. We use the same
ideas as for the binomial distribution.

See example 12 (page 76).


Theory 10 (page 77)

The formulas for the mean, variance and standard deviation of the Poisson random
variable 𝑋 ~ 𝑃𝑜(𝜇) are:

Mean, 𝐸(𝑋) = 𝜇
Variance, 𝜎² = 𝜇
Standard deviation, 𝜎 = √𝜇

Note: If the variance of count data is much greater than the mean of the
same data, the Poisson distribution is not a good model of the distribution
for the random variable.

See example 13 (page 77)
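
The note about variance and mean suggests a quick check on count data before fitting a Poisson model. The sketch below uses invented breakdown counts (hypothetical data, not from the module) and compares the sample variance with the sample mean.

```python
from statistics import mean, variance

# Hypothetical daily machine-breakdown counts, invented for illustration.
counts = [2, 0, 3, 1, 4, 2, 1, 0, 2, 3, 1, 2]

m = mean(counts)
v = variance(counts)   # sample variance (divides by n - 1)
ratio = v / m          # a ratio near 1 is consistent with a Poisson model

print(m, round(v, 4), round(ratio, 3))
```

A ratio much greater than 1 would indicate that the Poisson distribution is not a good model for the data.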


$$p(x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}$$
Example 1:
Example 2:
Arrivals at a bus-stop follow a Poisson distribution with an average of 4.5
every quarter of an hour.
Assume a maximum of 20 arrivals in a quarter of an hour and calculate the
probability of fewer than 3 arrivals in a quarter of an hour.

Solution:
The probabilities of 0 up to 2 arrivals can be calculated directly from the formula

$$p(x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!} \quad \text{with } \lambda = 4.5$$

$$p(0) = \frac{e^{-4.5}\, 4.5^{0}}{0!}$$

So p(0) = 0.01111
Similarly p(1) = 0.04999 and p(2) = 0.11248

So the probability of fewer than 3 arrivals is:
0.01111 + 0.04999 + 0.11248 = 0.17358
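
The bus-stop calculation can be reproduced in a few lines. This is a minimal sketch using only the Poisson formula with λ = 4.5; the variable names are illustrative.

```python
from math import exp, factorial

lam = 4.5   # average arrivals per quarter of an hour

# p(0), p(1), p(2) from the Poisson formula.
p = [exp(-lam) * lam**x / factorial(x) for x in range(3)]
fewer_than_3 = sum(p)

print([round(v, 5) for v in p])   # 0.01111, 0.04999, 0.11248
print(round(fewer_than_3, 5))     # 0.17358
```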
Example 3:
If we use the Poisson Table to find the answer:
Given 𝜇 or λ = 2 per week, therefore for a fortnight, λ = 2(2) = 4 per fortnight.
What is 𝑃(𝑋 < 3)?

Note that the Poisson Table (see Table 2 in the appendix) shows the value of 𝑃(𝑋 ≥ 𝑥).

So in this case,
𝑃(𝑋 < 3) = 1 − 𝑃(𝑋 ≥ 3)
= 1 − 0.7619
= 0.2381

(Same answer as using the Poisson formula previously!)
Poisson Approximation to the Binomial Distribution

Or n ≥ 30
Example 4:

Binomial
distribution
approach
Calculation using the Poisson distribution approach via the Poisson Table:
Given 𝑛 = 300 (so 𝑛 > 50) and 𝑝 = 1/50 (so 𝑝 < 0.1), therefore 𝜇 or λ = 𝑛𝑝 = 300 × 1/50 = 6.
So the Poisson distribution is 𝑋 ~ 𝑃𝑜(6).

Find 𝑃(𝑋 ≥ 8).
Since the Poisson Table (Table 2 in the appendix) gives the values of 𝑃(𝑋 ≥ 𝑥),
use the value from the table directly as the answer.

Thus, 𝑃(𝑋 ≥ 8) = 0.256

This is similar to the answer from the binomial distribution approach
previously, with a difference of +0.001!
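
Without a table, both approaches for Example 4 can be computed side by side. The sketch below assumes only the values 𝑛 = 300 and 𝑝 = 1/50 given above; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p = 300, 1 / 50
lam = n * p   # Poisson mean, np = 6

# P(X >= 8) = 1 - P(X <= 7) under each model.
binom_tail = 1 - sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(8))
pois_tail = 1 - sum(exp(-lam) * lam**x / factorial(x) for x in range(8))

print(round(pois_tail, 3))               # about 0.256, matching the table value
print(round(binom_tail, 3))              # exact binomial value
print(round(pois_tail - binom_tail, 4))  # size of the approximation error
```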
Approximation to the Binomial distribution

The Poisson distribution is an approximation to 𝐵(𝑛, 𝑝) when 𝑛 is large and 𝑝 is small
(e.g. if 𝑛𝑝 < 7, say).

In that case, if 𝑋 ∼ 𝐵(𝑛, 𝑝) then

$$P(X = k) \approx \frac{e^{-\lambda}\,\lambda^{k}}{k!}, \qquad \text{where } \lambda = np$$

i.e. 𝑋 is approximately Poisson, with mean 𝜆 = 𝑛𝑝.


Example 5:

The probability of a certain part failing within ten years is $10^{-6}$. Five million of the parts
have been sold so far. What is the probability that three or more will fail within ten years?

Answer:

Let 𝑋 = number failing in ten years, out of 5,000,000; $X \sim B(5000000, 10^{-6})$

Evaluating the Binomial probabilities is rather awkward; better to use the Poisson
approximation.

𝑋 has approximately a Poisson distribution with $\lambda = np = 5000000 \times 10^{-6} = 5$.

$$P(\text{three or more fail}) = P(X \ge 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2)$$

$$= 1 - \frac{e^{-5}\,5^{0}}{0!} - \frac{e^{-5}\,5^{1}}{1!} - \frac{e^{-5}\,5^{2}}{2!}$$

$$= 1 - e^{-5}(1 + 5 + 12.5) = 0.875$$

For such small 𝑝 and large 𝑛 the Poisson approximation is very accurate
(exact result is also 0.875 to three significant figures).
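
The claim that the approximation is very accurate here can be checked with exact integer binomial coefficients, which Python handles even for 𝑛 = 5,000,000. This is a sketch under the values stated in Example 5; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p = 5_000_000, 1e-6
lam = n * p   # Poisson mean, np = 5

# P(X >= 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2) under each model.
poisson_tail = 1 - sum(exp(-lam) * lam**k / factorial(k) for k in range(3))
exact_tail = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))

print(round(poisson_tail, 3))   # 0.875
print(round(exact_tail, 3))     # binomial value, to three decimal places
```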
1.3 NORMAL DISTRIBUTION
**The beauty of the normal curve:

No matter what 𝜇 and 𝜎 are, the area between 𝜇 − 𝜎 and
𝜇 + 𝜎 is about 68%; the area between 𝜇 − 2𝜎 and 𝜇 + 2𝜎 is
about 95%; and the area between 𝜇 − 3𝜎 and 𝜇 + 3𝜎 is
about 99.7%. Almost all values fall within 3 standard
deviations.
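
The three areas quoted above can be computed from the error function, since for any normal distribution the area within 𝑘 standard deviations of the mean equals erf(𝑘/√2). The sketch below is illustrative; the function name `within_k_sd` is an assumption made for this example.

```python
from math import erf, sqrt

def within_k_sd(k: float) -> float:
    """Area under any normal curve between mu - k*sigma and mu + k*sigma."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))   # 0.6827, 0.9545, 0.9973
```

The result does not depend on 𝜇 or 𝜎, which is exactly the point of the rule.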
68-95-99.7 Rule

[Figure: normal curve with 68% of the data within 1 standard deviation of the mean, 95% of the data within 2, and 99.7% of the data within 3]

68-95-99.7 Rule
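in Math terms…

$$\int_{\mu-\sigma}^{\mu+\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\, dx = 0.68$$

$$\int_{\mu-2\sigma}^{\mu+2\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\, dx = 0.95$$

$$\int_{\mu-3\sigma}^{\mu+3\sigma} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\, dx = 0.997$$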
How good is the rule for real data?

Check some example data:


The mean weight of the women = 127.8 lb
The standard deviation (SD) = 15.5 lb
68% of 120 = 0.68 × 120 ≈ 82 runners
In fact, 79 runners fall within 1 SD (15.5 lb) of the mean.

[Histogram: percent of runners (0 to 25) against weight in pounds (80 to 160), with the interval from 112.3 to 143.3 lb marked around the mean of 127.8 lb]
95% of 120 = 0.95 × 120 = 114 runners
In fact, 115 runners fall within 2 SDs of the mean.

[Histogram: percent of runners (0 to 25) against weight in pounds (80 to 160), with the interval from 96.8 to 158.8 lb marked around the mean of 127.8 lb]
99.7% of 120 = 0.997 × 120 ≈ 119.6 runners
In fact, all 120 runners fall within 3 SDs of the mean.

[Histogram: percent of runners (0 to 25) against weight in pounds (80 to 160), with the interval from 81.3 to 174.3 lb marked around the mean of 127.8 lb]
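
The comparison between the rule and the runner data can be tabulated in one pass. The sketch below uses only the figures quoted above (mean 127.8 lb, SD 15.5 lb, 120 runners, and the observed counts 79, 115 and 120); the variable names are illustrative.

```python
mean_lb, sd_lb, n_runners = 127.8, 15.5, 120

rule = {1: 0.68, 2: 0.95, 3: 0.997}    # 68-95-99.7 rule
observed = {1: 79, 2: 115, 3: 120}     # counts reported for the runner data

rows = []
for k in (1, 2, 3):
    low = mean_lb - k * sd_lb          # lower end of the k-SD interval
    high = mean_lb + k * sd_lb         # upper end of the k-SD interval
    expected = rule[k] * n_runners     # count predicted by the rule
    rows.append((k, low, high, expected, observed[k]))
    print(f"{k} SD: {low:.1f} to {high:.1f} lb, "
          f"expected {expected:.1f}, observed {observed[k]}")
```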
The Normal Distribution:
as mathematical function (pdf)

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

This is a bell-shaped curve with different centers and spreads depending on 𝜇 and 𝜎.

Note constants:
𝜋 = 3.14159
𝑒 = 2.71828
The Normal PDF

It’s a probability function, so no matter what the values of 𝜇
and 𝜎, it must integrate to 1!

$$\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\, dx = 1$$
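
The total area of 1 can be confirmed numerically. The sketch below integrates the normal pdf with a simple trapezoidal rule over 𝜇 ± 10𝜎, which captures essentially all of the area; the function names and the values 𝜇 = 250, 𝜎 = 4 are assumptions chosen for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density f(x) with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def integrate(f, a: float, b: float, steps: int = 20000) -> float:
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h

mu, sigma = 250.0, 4.0   # illustrative values; any mu and sigma > 0 work
area = integrate(lambda x: normal_pdf(x, mu, sigma),
                 mu - 10 * sigma, mu + 10 * sigma)
print(area)   # very close to 1
```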
Example 1:
The masses of the well-known
brand of breakfast cereal are
normally distributed with mean
of 250g and standard deviation of
4g. Find the probability of a
packet containing more than
254.4g.

Solution:
Let 𝑋 be the random variable of the ‘weight of cereal packet’.

$$z = \frac{x - \mu}{\sigma} = \frac{254.4 - 250}{4} = 1.1$$
Previously, we calculated that 𝑧 = 1.1.

Since the table shows the values of the probability 𝑃(𝑋 ≥ 𝑥), take the value
directly from the Normal Distribution Table (see Table 3 in the appendix).

𝑃(𝑋 ≥ 254.4) = 𝑃(𝑧 ≥ 1.1)
= 0.1357
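
The table value can be reproduced with the complementary error function, since for a standard normal variable 𝑃(𝑍 ≥ 𝑧) = ½ erfc(𝑧/√2). This is a minimal sketch for the cereal example; the function name `upper_tail` is an assumption made for this example.

```python
from math import erfc, sqrt

def upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal Z (the quantity listed in the table)."""
    return 0.5 * erfc(z / sqrt(2))

mu, sigma, x = 250.0, 4.0, 254.4
z = (x - mu) / sigma   # standardise the value

print(round(z, 2), round(upper_tail(z), 4))   # 1.1 0.1357
```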
Example 2:
A carton of orange juice has
a volume which is normally
distributed with a mean of
120ml and a standard
deviation of 1.8ml. Find
the probability that the
volume is more than 118ml.

Solution:

Let X be the random variable of ‘the volume of orange juice in ml’.

$$z = \frac{x - \mu}{\sigma} = \frac{118 - 120}{1.8} = -1.111$$
Previously, we calculated that 𝑧 = −1.11 (to two decimal places).

Since the table shows the values of the probability 𝑃(𝑋 ≥ 𝑥) and 𝑧 is
negative here, we have to adjust it a bit, using the symmetry of the normal curve.

𝑃(𝑋 ≥ 118)
= 𝑃(𝑧 ≥ −1.11)
= 1 − 𝑃(𝑧 ≥ 1.11)
= 1 − 0.1335
= 0.8665
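
The symmetry step 𝑃(𝑧 ≥ −1.11) = 1 − 𝑃(𝑧 ≥ 1.11) can be checked against a direct calculation. The sketch below rounds 𝑧 to two decimal places, as a table lookup would; the function name `upper_tail` is an assumption made for this example.

```python
from math import erfc, sqrt

def upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))

z = round((118 - 120) / 1.8, 2)     # -1.11, rounded as for a table lookup

via_symmetry = 1 - upper_tail(-z)   # 1 - P(Z >= 1.11), the table route
direct = upper_tail(z)              # P(Z >= -1.11) computed directly

print(round(via_symmetry, 4), round(direct, 4))   # both 0.8665
```

Using the unrounded 𝑧 = −1.111… would change the answer slightly in the fourth decimal place, which is the usual small cost of rounding 𝑧 for a table.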
EXAMPLE 3

Binomial approach
Normal approach
End of chapter 4