
ENGG2450 Probability and Statistics for Engineers

1 Introduction
2 Organization and description of data
3 Probability
4 Probability distributions
5 Probability densities
6 Sampling distributions
7 Inferences concerning a mean
8 Comparing two treatments
9 Inferences concerning variances
A Random processes
4 Probability distributions
4.1 Random variables
4.2 The binomial distribution
4.3 The hypergeometric distribution
4.4 The mean and the variance of a probability distribution
4.5 Chebyshev's Theorem
4.6 The Poisson approximation to the binomial distribution
4.8 The geometric and negative binomial distribution
4.1 Random variables

In an experiment, there is an outcome. The sample space is the set of all possible outcomes of the experiment. An event A is said to have occurred in an experiment if the outcome is an element of A.

Suppose the outcome of an experiment is a number. We can define a quantity called a random variable X to represent the random nature of that number. Each outcome is then a sample of the random variable.

e.g.1 Throwing a die: outcome $x_i$, sample space S = {1, 2, 3, 4, 5, 6}

In this experiment of throwing a die, the outcome is a number, written $x_i$, which is a sample of the random variable X.


The result of throwing a die is random and discrete. It can be represented by a discrete random variable, say X.

We obtain a sample of the random variable, denoted $x_i$, by throwing a die, where i indicates that the sample is obtained in the i-th experiment. The index i is discrete.
e.g.2 Measuring the noise level of a phone line: outcome x(t), sample space (-10 V, 10 V)

The noise level of a phone line is random and continuous. It can be represented by a continuous random variable, say X.

We obtain a sample of the random variable, denoted x(t), by measuring the noise level of the phone line at a certain time t. The time t is continuous.
4.1 Random variables

A random variable is defined by the following:

1. An experiment that can be repeated.
   The set that contains all possible outcomes of the experiment is called the sample space. A subset of the sample space is called an event.
   The outcome of an experiment is a number, which is a sample of the random variable.
2. A function that associates a probability with each event.

         Experiment (obtain a sample of the random variable)      Outcome   Sample space
e.g.1    Throw a die at trial i and record the outcome x_i.       x_i       {1, 2, 3, 4, 5, 6}
e.g.2    Measure the noise voltage of a phone line at time t      x(t)      (-10 V, 10 V)
         and record the outcome x(t).

Note that in e.g.2 the outcome x(t) is continuous, so there are an infinite number of possible outcomes. What is the probability of each of these outcomes?
The probability of any single possible outcome is 0, but the probability of an interval, such as $P(1 \le X \le 3)$, may not be 0.
A random variable is any function that assigns a numerical value to
each possible outcome.

The probability distribution of a discrete random variable X is a list of the possible values of X together with their probabilities f(x) = P[X = x].
The probability distribution always satisfies the conditions
$f(x) \ge 0$ and $\sum_{\text{all } x} f(x) = 1$.

e.g. For the die in e.g.1, f(1) = f(2) = f(3) = f(4) = f(5) = f(6) = 1/6, where f(1) = P[X = 1].

The cumulative distribution function, or distribution function, of a discrete random variable X is $F(x) = P[X \le x]$.

e.g. $F(1) = P[X \le 1] = 1/6$, $F(2) = P[X \le 2] = 2/6$, $F(3) = P[X \le 3] = 3/6$, $F(6) = P[X \le 6] = 1$.
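The die example can be checked numerically. The following is a minimal sketch, assuming a fair die as in e.g.1; the names `f` and `F` simply mirror the notation above and are not taken from any library.

```python
from fractions import Fraction

# Probability distribution of the die: f(x) = P[X = x] = 1/6 for x = 1..6.
f = {x: Fraction(1, 6) for x in range(1, 7)}

def F(x):
    """Cumulative distribution function F(x) = P[X <= x]."""
    return sum(p for value, p in f.items() if value <= x)

# The two conditions: f(x) >= 0 for every x, and the probabilities sum to 1.
print(all(p >= 0 for p in f.values()), sum(f.values()))
print(F(1), F(2), F(3), F(6))
```

Exact fractions are used so that the values compare directly with 1/6, 2/6, 3/6 and 1.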


4.2 The Binomial Distribution

Bernoulli trials
1. There are only two possible outcomes, called success and failure.
2. The probability of success, denoted by p, is the same for each trial. (Note: the probability of failure is 1 - p.)
3. The outcomes from different trials are independent.
In the problems to be studied in chapter 4, we further assume
4. A fixed number n of Bernoulli trials are conducted.

e.g. Consider the trial of throwing a die, with success defined as having an outcome of 6.
The probability of success is p = 1/6.

Suppose the die is thrown 4 times. Can these 4 trials be treated as Bernoulli trials? Yes.

What are the possible results of these trials? They are

ssss sssf ssfs ssff sfss sfsf sffs sfff
fsss fssf fsfs fsff ffss ffsf fffs ffff

There are 16 possible results in these Bernoulli trials with n = 4.

We use a discrete random variable X to represent the number of successes in this experiment. The sample space is {0, 1, ..., n}. Let x be a sample of X.

The probability of a result with x successes (which implies n - x failures) is $p^x (1-p)^{n-x}$.

The number of results with x successes is $C_{n,x}$:

n = 4
x    C_{n,x}
0    1
1    4
2    6
3    4
4    1

n = 5
x    C_{n,x}
0    1
1    5
2    10
3    10
4    5
5    1

The probability of all results with x successes is
$f(x) = P[X = x] = C_{n,x}\, p^x q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$, where q = 1 - p.

The probability distribution of X is the binomial distribution
$b(x; n, p) = C_{n,x}\, p^x (1-p)^{n-x}$, x = 0, 1, 2, ..., n
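The counts $C_{n,x}$ can be reproduced by listing every result of n = 4 trials and grouping the results by their number of successes. This is only an illustrative sketch; the variable names are chosen here for the example.

```python
from itertools import product
from math import comb

n = 4

# Every result of n Bernoulli trials, written as a string of 's' and 'f'.
results = ["".join(r) for r in product("sf", repeat=n)]

# Number of results that contain exactly x successes, for x = 0..n.
counts = {x: sum(1 for r in results if r.count("s") == x) for x in range(n + 1)}

print(len(results))
print(counts)
print([comb(n, x) for x in range(n + 1)])  # C_{n,x} from the factorial formula
```

Changing `n` to 5 gives the second table in the same way.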


e.g. It has been claimed that in 60% of all solar-heat installations the
utility bill is reduced by at least one-third. Accordingly, what are the
probabilities that the utility bill will be reduced by at least one-third in
(a) four of five installations; (b) at least four of five installations?

sln. Consider the trial of building a solar-heat installation.
Success is defined as having the bill reduced by at least 1/3.
Probability of success p = 0.6.
The outcomes from different trials are independent, and so they are Bernoulli trials. The probability of x successes in n Bernoulli trials is
$b(x; n, p) = C_{n,x}\, p^x q^{n-x}$, x = 0, 1, 2, ..., n, with q = 1 - p.

(a) Four of five installations are successful, i.e. n = 5, x = 4:
$b(4; 5, 0.6) = C_{5,4}\,(0.6)^4 (1-0.6)^{5-4} = 0.259$

(b) At least four of five installations are successful, i.e. n = 5, x = 4 or 5:
$b(5; 5, 0.6) = C_{5,5}\,(0.6)^5 (1-0.6)^{0} = 0.078$
The probability is b(4; 5, 0.6) + b(5; 5, 0.6) = 0.259 + 0.078 = 0.337
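The two answers can be recomputed directly from the binomial formula. A minimal sketch, where the helper name `b` is chosen to match the notation b(x; n, p):

```python
from math import comb

def b(x, n, p):
    """Binomial probability b(x; n, p) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_four = b(4, 5, 0.6)                          # part (a)
p_at_least_four = b(4, 5, 0.6) + b(5, 5, 0.6)  # part (b)

print(round(p_four, 3), round(p_at_least_four, 3))
```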
4.2 The Binomial Distribution

Binomial distribution
$b(x; n, p) = C_{n,x}\, p^x (1-p)^{n-x}$, x = 0, 1, 2, ..., n
If n is large, the calculation of binomial probabilities becomes tedious.

A binomial distribution table gives the values of the cumulative probabilities
$B(x; n, p) = \sum_{k=0}^{x} b(k; n, p)$, x = 0, 1, 2, ..., n
for n = 2 to 20 and p = 0.01, 0.05, 0.10, 0.15, ..., 0.90, 0.95.

Cumulative probabilities B(x; n, p) rather than probabilities b(x; n, p) are tabulated because the values of B(x; n, p) are the ones needed more often in statistical applications.

Probabilities b(x; n, p) can be determined from cumulative probabilities:
b(x; n, p) = B(x; n, p) - B(x-1; n, p), where B(-1; n, p) = 0.
e.g. A manufacturer of external hard drives claims that only 10% of his drives require repairs within the warranty period of 12 months. If it turns out that 5 of 20 of his drives require repairs within the first year, does this tend to support or refute the claim?

sln. Let p = probability of success (repairs needed within 12 months) = 0.1.

The probability that 5 or more of 20 of the drives will require repairs is

b(5; 20, p) + b(6; 20, p) + ... + b(20; 20, p)
= 1 - { b(0; 20, p) + b(1; 20, p) + ... + b(4; 20, p) }
= 1 - B(4; 20, 0.10)
= 1 - 0.9568
= 0.0432

Since this probability is very small, it would seem reasonable to reject the hard drive manufacturer's claim.
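Instead of reading B(4; 20, 0.10) from a table, the cumulative probability can be summed directly. A small sketch under the same definitions; `b` and `B` are illustrative helper names:

```python
from math import comb

def b(x, n, p):
    """Binomial probability b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def B(x, n, p):
    """Cumulative probability B(x; n, p) = b(0) + ... + b(x); B(-1; n, p) = 0."""
    return sum(b(k, n, p) for k in range(x + 1))

tail = 1 - B(4, 20, 0.10)  # probability that 5 or more of 20 drives need repairs

print(round(B(4, 20, 0.10), 4), round(tail, 4))
```

The same helpers also confirm the relation b(x; n, p) = B(x; n, p) - B(x-1; n, p), since `B(-1, n, p)` sums over an empty range and returns 0.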
Examples of Binomial Distributions

[Figure: plots of b(x; n, p) against x.]

For any n, the binomial distribution with p = 0.5 is symmetric about x = n/2.
When p < 0.5, X is more likely to be small.
When p > 0.5, X is more likely to be large.
4.3 The Hypergeometric Distribution

Consider a lot consisting of N units, of which a are defective (e.g. N = 20, a = 3).
We are interested in the number of defectives in a sample of n units taken from this lot (e.g. n = 10).
Let the sample be drawn in such a way that at each successive drawing, whatever units are left in the lot have the same chance of being selected.
Suppose we do sampling without replacement.

The probability that the 1st unit drawn is defective is a/N.
The probability that the 2nd unit drawn is defective is (a-1)/(N-1) or a/(N-1), depending on whether the 1st unit drawn is defective or not defective.

The trials are not independent and so are not Bernoulli trials. The probabilities of drawing defective units cannot be determined using the binomial distribution.
If we instead do sampling with replacement, the trials are independent and so are Bernoulli trials.
Consider a lot consisting of N units, of which a are defective (e.g. N = 20, a = 3).
We are interested in the number of defectives in a sample of n units from this lot (e.g. n = 10, x = 2).
We do sampling without replacement.

Suppose there are x defective units in this sample of n units.

These x defective units can be chosen in $C_{a,x}$ ways.
The n - x non-defective units can be chosen in $C_{N-a,\,n-x}$ ways.
So there are $C_{a,x} \times C_{N-a,\,n-x}$ ways to choose a sample with x defective units.

There are $C_{N,n}$ ways to choose n units from the N units.

Hypergeometric distribution
The probability of a result with x successes (defectives) in n trials is
$h(x; n, a, N) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}$, x = 0, 1, ..., n.
e.g. A company that sells discount accessories for cell phones often ships many defective units. Suppose it has 20 chargers on hand but 5 are defective. If the company decides to randomly select 10 of these items, what is the probability that 2 of the 10 are defective?

sln. N = 20, a = 5, n = 10, x = 2, sampling without replacement.

The probability that 2 of the 10 will be defective is given by the hypergeometric distribution
$h(x; n, a, N) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}$, x = 0, 1, ..., n:

$h(2; 10, 5, 20) = \frac{\binom{5}{2}\binom{15}{8}}{\binom{20}{10}} = \frac{10 \times 6{,}435}{184{,}756} = 0.348$
e.g. A company that sells discount accessories for cell phones often ships many defective units. Suppose it has 100 chargers on hand but 25 are defective. The company decides to randomly select 10 of these items.
What is the probability that 2 of the 10 are defective, using (a) the hypergeometric distribution and (b) the binomial distribution as an approximation?

sln. The probability that 2 of the 10 will be defective:

(a) Hypergeometric distribution: N = 100, a = 25, n = 10, x = 2.
$h(2; 10, 25, 100) = \frac{C_{25,2}\, C_{75,8}}{C_{100,10}} = 0.292$

(b) Binomial distribution as an approximation: p = 25/100 = 0.25, n = 10, x = 2.
$b(2; 10, 0.25) = C_{10,2}\,(0.25)^2 (1-0.25)^{10-2} = 0.282$

It can be shown that h(x; n, a, N) approaches b(x; n, p), with p = a/N, when $N \to \infty$.
A good rule of thumb: the binomial distribution is a good approximation to the hypergeometric distribution when $N \ge 10n$.
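Both charger examples, and the quality of the binomial approximation, can be checked with a few lines of code. This is a sketch only; `h` and `b` are helper names chosen to match the notation h(x; n, a, N) and b(x; n, p).

```python
from math import comb

def h(x, n, a, N):
    """Hypergeometric probability: x defectives in a sample of n, lot of N with a defective."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

def b(x, n, p):
    """Binomial probability b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(round(h(2, 10, 5, 20), 3))    # lot of 20 chargers, 5 defective
print(round(h(2, 10, 25, 100), 3))  # lot of 100 chargers, 25 defective
print(round(b(2, 10, 0.25), 3))     # binomial approximation with p = a/N
```

In the second example N = 100 and n = 10, so the rule of thumb $N \ge 10n$ is just met, and the two values differ by about 0.01.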
Examples of Hypergeometric Distributions

[Figure: h(x; n, a, N) and b(x; n, p) plotted against x.]
4.4 The Mean and Variance of a Probability Distribution

Mean of a discrete probability distribution f(x)
$\mu = \sum_{\text{all } x} x\, f(x)$

A Bernoulli trial is an experiment that has only 2 outcomes, success and failure, with p(success) = p and p(failure) = 1 - p. Outcomes from different trials are independent.

Consider a coin tossed 3 times. Find the mean number of heads (successes).
The number of heads in 3 tosses is a discrete random variable, say X.

There are 8 possible results in these 3 Bernoulli trials:
sss ssf sfs sff fss fsf ffs fff

n = 3
x    C_{n,x}
0    1
1    3
2    3
3    1

1. x is a sample of X.
2. The sample space of X is {0, 1, 2, 3}.
3. f(x) is the probability distribution of X and is binomial.
4. The mean of f(x) is also called the mean of X or the expected value of X.
e.g. Find the mean number of heads (successes) in 3 tosses.
sln. Let X be the number of heads, which is a discrete random variable with a binomial distribution.
The probability of success is p = 0.5, and n = 3.

The mean number of heads is
$\mu = \sum_{\text{all } x} x\, f(x) = \sum_{x=0}^{3} x\, b(x; n, p)$, where $b(x; n, p) = C_{n,x}\, p^x (1-p)^{n-x}$

$= 0 \times b(0; 3, p) + 1 \times b(1; 3, p) + 2 \times b(2; 3, p) + 3 \times b(3; 3, p)$

$= 0 + 1 \times 3 \times 0.5^1 \times (1-0.5)^2 + 2 \times 3 \times 0.5^2 \times (1-0.5)^1 + 3 \times 1 \times 0.5^3 \times (1-0.5)^0$

$= 0 + 0.375 + 0.75 + 0.375 = 1.5$

Alternatively, $\sum_{x=0}^{3} x\, b(x; n, p) = np = 3 \times 0.5 = 1.5$.
Mean of binomial distribution
$\mu = np$

Pf. With $f(x) = b(x; n, p) = C_{n,x}\, p^x (1-p)^{n-x}$,

$\mu = \sum_{\text{all } x} x\, f(x) = \sum_{\text{all } x} x\, C_{n,x}\, p^x (1-p)^{n-x}$

$= \sum_{x=1}^{n} x\, \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}$

$= \sum_{x=1}^{n} \frac{n!}{(x-1)!\,(n-x)!}\, p^x (1-p)^{n-x}$

$= np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,(n-x)!}\, p^{x-1} (1-p)^{n-x}$

Let y = x - 1 and m = n - 1. Then

$\mu = np \sum_{y=0}^{m} \frac{m!}{y!\,(m-y)!}\, p^{y} (1-p)^{m-y} = np \sum_{y=0}^{m} b(y; m, p) = np$
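The result $\mu = np$ can be spot-checked by evaluating the defining sum $\sum x\, b(x; n, p)$ directly. A minimal sketch; `binomial_mean` is a hypothetical helper name, and the second pair (n = 20, p = 0.1) is just an extra test case.

```python
from math import comb

def b(x, n, p):
    """Binomial probability b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_mean(n, p):
    """Mean computed from the definition: sum of x * b(x; n, p) over x = 0..n."""
    return sum(x * b(x, n, p) for x in range(n + 1))

mean_heads = binomial_mean(3, 0.5)  # the 3-toss coin example
print(mean_heads, 3 * 0.5)
print(binomial_mean(20, 0.1), 20 * 0.1)
```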
Mean of hypergeometric distribution
$\mu = n \cdot \frac{a}{N}$

Pf. Start from $\mu = \sum_{\text{all } x} x\, f(x)$ with the hypergeometric distribution
$f(x) = h(x; n, a, N) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}$, x = 0, 1, ..., n.
You will prove it in Assignment 2.

e.g. A company that sells discount accessories for cell phones often ships many defective units. Suppose it has 20 chargers on hand but 5 are defective. The company decides to randomly select 10 of these items. What is the mean of the probability distribution of the number of defectives?

sln. N = 20, a = 5, n = 10.

$\mu = n \cdot \frac{a}{N} = 10 \times \frac{5}{20} = 2.5$
Variance of probability distribution f(x)
$\sigma^2 = \sum_{\text{all } x} (x - \mu)^2 f(x)$
$\sigma$ is called the standard deviation, which measures the expected deviation of samples from the mean.

[Figure: binomial distributions $b(x; n, p) = C_{n,x}\, p^x (1-p)^{n-x}$, x = 0, 1, ..., n, probability plotted against x, for $\sigma^2$ = 5/4, 20/4 and 50/4.]

Variance of binomial distribution
$\sigma^2 = n p (1 - p)$

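[Figure: hypergeometric distribution $h(x; n, a, N) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}$, x = 0, 1, ..., n, plotted against x.]

Variance of hypergeometric distribution
$\sigma^2 = n \cdot \frac{a}{N}\left(1 - \frac{a}{N}\right)\frac{N-n}{N-1}$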
k-th moment about the origin
$\mu'_k = \sum_{\text{all } x} x^k f(x)$
The 1st moment about the origin is the mean of f(x): $\mu'_1 = \mu = \sum_{\text{all } x} x\, f(x)$.

k-th moment about the mean
$\mu_k = \sum_{\text{all } x} (x - \mu)^k f(x)$
The 2nd moment about the mean is the variance of f(x): $\mu_2 = \sigma^2 = \sum_{\text{all } x} (x - \mu)^2 f(x)$.

Computing formula for the variance
$\sigma^2 = \mu'_2 - \mu^2$
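The computing formula can be checked against the definition of the variance, for instance on the charger lot with N = 20, a = 5, n = 10. The sketch below uses exact fractions; `moments` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction
from math import comb

def h(x, n, a, N):
    """Hypergeometric probability h(x; n, a, N), kept exact with Fraction."""
    return Fraction(comb(a, x) * comb(N - a, n - x), comb(N, n))

def moments(f, xs):
    """Return (mean, variance from the definition, variance from mu'_2 - mu^2)."""
    mu = sum(x * f(x) for x in xs)
    var_def = sum((x - mu) ** 2 * f(x) for x in xs)
    var_short = sum(x ** 2 * f(x) for x in xs) - mu ** 2
    return mu, var_def, var_short

N, a, n = 20, 5, 10
mu, var_def, var_short = moments(lambda x: h(x, n, a, N), range(n + 1))

# Closed-form hypergeometric variance: n (a/N) (1 - a/N) (N - n)/(N - 1).
var_formula = n * Fraction(a, N) * (1 - Fraction(a, N)) * Fraction(N - n, N - 1)

print(mu, var_def, var_short, var_formula)
```

All three variance values agree, and the mean matches n a / N = 2.5.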
4.5 Chebyshev's Theorem

Theorem 1: If a probability distribution has mean $\mu$ and standard deviation $\sigma$, the probability of getting a value which deviates from $\mu$ by at least $k\sigma$ is at most $1/k^2$, i.e.
$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$.

e.g. The number of customers who visit a car dealer's showroom is a random variable with $\mu = 18$ and $\sigma = 2.5$. With what probability can we assert that there will be more than 8 but fewer than 28 customers?

sln. $|x - \mu| \ge k\sigma$ means $(x - \mu) \ge k\sigma$ or $(x - \mu) \le -k\sigma$, i.e. {8 or less} or {28 or more} customers. Here $k\sigma = 10$, so k = 4.

$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2} = \frac{1}{16}$

Probability of having more than 8 and fewer than 28 customers:
$P(|X - \mu| < k\sigma) \ge 1 - \frac{1}{16} = \frac{15}{16}$
e.g. Show that for 40,000 flips of a balanced coin, the probability is at least 0.99 that the proportion of heads will fall between 0.475 and 0.525.

sln. Let the discrete random variable X be the number of heads in 40,000 flips. These are Bernoulli trials, and so X has a binomial distribution with $\mu = np$ and $\sigma^2 = np(1-p)$.

$\mu = np = 40{,}000 \times \frac{1}{2} = 20{,}000$
$\sigma = \sqrt{np(1-p)} = \sqrt{40{,}000 \times \frac{1}{2} \times \frac{1}{2}} = 100$

By Chebyshev's Theorem, $P(|X - \mu| < k\sigma) \ge 1 - \frac{1}{k^2} = 0.99$, so k = 10.

It means X has at least a 0.99 chance of lying within $(\mu - k\sigma,\ \mu + k\sigma)$,
or (20000 - 1000, 20000 + 1000),
or a proportion within (19,000/40,000, 21,000/40,000) = (0.475, 0.525).
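The arithmetic in both Chebyshev examples is easy to reproduce. The sketch below only evaluates the bound $1 - 1/k^2$; the function name `chebyshev_lower_bound` is an assumption made for this illustration.

```python
from math import sqrt

def chebyshev_lower_bound(k):
    """Lower bound on P(|X - mu| < k * sigma) given by Chebyshev's Theorem."""
    return 1 - 1 / k**2

# Showroom example: mu = 18, sigma = 2.5, interval (8, 28), so k * sigma = 10.
k_showroom = (28 - 18) / 2.5
bound_showroom = chebyshev_lower_bound(k_showroom)

# Coin example: n = 40,000 flips of a balanced coin, k = 10.
n, p, k = 40_000, 0.5, 10
mu = n * p
sigma = sqrt(n * p * (1 - p))
bound_coin = chebyshev_lower_bound(k)
proportions = ((mu - k * sigma) / n, (mu + k * sigma) / n)

print(k_showroom, bound_showroom)
print(mu, sigma, bound_coin, proportions)
```

Chebyshev's bound holds for any distribution with the given mean and standard deviation, so it is usually conservative; the actual probability for the coin example is much closer to 1.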
Slide 28

w4 wkcham, 21/01/2016
4.6 The Poisson approximation to the binomial distribution

Poisson distribution
$f(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$, x = 0, 1, 2, ..., $\lambda > 0$

Binomial distribution
$b(x; n, p) = C_{n,x}\, p^x (1-p)^{n-x}$

The Poisson distribution often serves as a distribution model for counts which do not have a natural bound.

For example, an individual keeping track of the amount of mail they receive each day may notice that they receive an average of 4 letters per day.

If receiving any particular piece of mail doesn't affect the arrival times of future pieces of mail, then a reasonable assumption is that the number of pieces of mail received per day obeys a Poisson distribution.

[Figure: $f(x; \lambda)$ plotted against x.]
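When n is large and p is small, binomial probabilities are often approximated by means of the Poisson distribution with $\lambda = np$.

Prove $\lim_{n \to \infty} b(x; n, p) = \lim_{n \to \infty} C_{n,x}\, p^x (1-p)^{n-x} = f(x; \lambda)$, where $f(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$, x = 0, 1, ..., $\lambda > 0$.

Pf: Substituting $p = \lambda/n$,

$\lim_{n \to \infty} b(x; n, p) = \lim_{n \to \infty} \frac{n!}{x!\,(n-x)!} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x}$

$= \lim_{n \to \infty} \frac{n(n-1)\cdots(n-x+1)}{x!\; n^x}\, \lambda^x \left(1 - \frac{\lambda}{n}\right)^{n-x}$

$= \lim_{n \to \infty} \frac{1 \cdot \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{x-1}{n}\right)}{x!}\, \lambda^x \left(1 - \frac{\lambda}{n}\right)^{n-x}$

$= \lim_{n \to \infty} \frac{\lambda^x}{x!} \left(1 - \frac{\lambda}{n}\right)^{n-x}$,

since $\lim_{n \to \infty} \left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{x-1}{n}\right) = 1$.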
Pf: (continued) In the limit $n \to \infty$, $p \to 0$ with $\lambda = np$,

$\lim_{n \to \infty} b(x; n, p) = \lim_{n \to \infty} \frac{\lambda^x}{x!} \left(1 - \frac{\lambda}{n}\right)^{n-x} = \frac{\lambda^x}{x!} \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-x}$

$= \frac{\lambda^x e^{-\lambda}}{x!}$, x = 0, 1, ..., $\lambda > 0$,

since $\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{n} = e^{-\lambda}$ and $\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{-x} = 1$,

which is the Poisson distribution.

A rule of thumb is to use this approximation if $n \ge 20$ and $p \le 0.05$.

If $n \ge 100$, the approximation is excellent as long as $np \le 10$.
e.g. A heavy machinery manufacturer has 3840 large generators in the field that are under warranty. If the probability is 1/1200 that any one will fail during the given year, find the probabilities that 0, 1, 2, 3, ... of the generators will fail during the given year.

sln: As $n \ge 100$ and $np \le 10$, the Poisson approximation is valid.
The probability is $f(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}$, x = 0, 1, 2, ...

where $\lambda = np = 3840 \times 1/1200 = 3.2$
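The generator probabilities can be evaluated from the Poisson formula and compared with the exact binomial values. A minimal sketch; `poisson` and `b` are helper names chosen here, and the printed values are rounded to four decimals.

```python
from math import comb, exp, factorial

def poisson(x, lam):
    """Poisson probability f(x; lambda) = lambda^x e^(-lambda) / x!."""
    return lam**x * exp(-lam) / factorial(x)

def b(x, n, p):
    """Exact binomial probability b(x; n, p), for comparison."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 3840, 1 / 1200
lam = n * p  # 3.2

for x in range(4):
    print(x, round(poisson(x, lam), 4), round(b(x, n, p), 4))
```

With n this large and p this small, the Poisson and binomial columns agree to about three decimal places.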

Mean of Poisson distribution
$\mu = \lambda$ (mean of binomial distribution: $\mu = np$)

Variance of Poisson distribution
$\sigma^2 = \lambda$ (variance of binomial distribution: $\sigma^2 = np(1-p)$)
4.8 The geometric and negative binomial distribution

Geometric distribution
$g(x; p) = p(1-p)^{x-1}$ for x = 1, 2, 3, 4, ...

e.g. Cars manufactured by a company need to be checked to see whether their nitrogen oxide emission meets the government standard. Suppose we are interested in the number of cars people have to inspect until they find one whose emission does not meet the standard (success).

If the first success is to come on the x-th trial, then the probability is $p(1-p)^{x-1}$, where p is the probability of success.

Mean of geometric distribution
$\mu = \frac{1}{p}$
4.8 The geometric and negative binomial distribution

The number of Bernoulli trials needed to find the first success follows the geometric distribution
$g(x; p) = p(1-p)^{x-1}$.

The number of Bernoulli trials needed to find r successes follows the negative binomial distribution
$f(x) = \binom{x-1}{r-1}\, p^r (1-p)^{x-r}$ for x = r, r+1, ...

For comparison, the probability of y successes in n Bernoulli trials is
$b(y; n, p) = C_{n,y}\, p^y (1-p)^{n-y}$, y = 0, 1, 2, ..., n.
e.g. 3 Bernoulli trials have 8 possible results: sss ssf sfs sff fss fsf ffs fff

The r-th success occurs at the x-th Bernoulli trial, so there are r - 1 successes in the previous x - 1 trials.

The probability is $f(x) = b(r-1; x-1, p)\; p = C_{x-1,\,r-1}\, p^r (1-p)^{x-r}$.
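The relations between the geometric, negative binomial and binomial distributions can be checked numerically. In the sketch below the value p = 0.2 is a made-up illustration, not a figure from the car inspection example, and the function names are chosen here.

```python
from math import comb

def b(y, n, p):
    """Binomial probability b(y; n, p)."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

def g(x, p):
    """Geometric probability: first success on trial x."""
    return p * (1 - p)**(x - 1)

def neg_binomial(x, r, p):
    """Negative binomial probability: r-th success on trial x (x >= r)."""
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

p = 0.2  # hypothetical probability of success, for illustration only

# Mean of the geometric distribution, summed far enough that the tail is negligible.
geometric_mean = sum(x * g(x, p) for x in range(1, 501))
print(geometric_mean, 1 / p)

# With r = 1 the negative binomial reduces to the geometric distribution.
print(neg_binomial(4, 1, p), g(4, p))

# f(x) = b(r-1; x-1, p) * p, here with r = 3 and x = 7.
print(neg_binomial(7, 3, p), b(2, 6, p) * p)
```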
Dos and Don'ts

Dos
Describe the chance behavior of a discrete random variable X by its probability distribution function f(x) = P(X = x) for each value of X in the sample space.

Summarize a probability distribution f(x) or the random variable X by its mean $\mu$ and variance $\sigma^2$.

The binomial, hypergeometric and Poisson distributions have their own sets of underlying assumptions. Choose the right distribution for your problem.

Don'ts
Never apply the binomial distribution to counts without first checking that the conditions hold for Bernoulli trials.