
## UNIT 4: PROBABILITY DISTRIBUTIONS

At the end of this topic you should be able to:

i. Define a discrete random variable and its associated probability mass function and cumulative distribution function.

ii. Calculate the expected value and variance for any discrete random variable.

iii. Calculate the expected value and variance for the sum or difference of two independent random variables.

iv. Define and explain the properties of a binomial distribution.

v. Calculate binomial probabilities and expected values.

vi. Use the binomial model to solve problems.

vii. Define a continuous probability distribution.

4.1 DISCRETE RANDOM VARIABLE

Recall: A random variable is a variable whose values are determined by the outcomes of a
random experiment.

EXAMPLES: The score when a die is thrown.

The number of persons in a room.

The number of lotto winners in last night's game.

The number of throws of a die before getting a six.

4.2 PROBABILITY DISTRIBUTION FUNCTION

A table or formula which lists all the possible values that a discrete random variable can take
together with their associated probabilities is called a probability distribution function (pdf) or
the probability mass function.

EXAMPLE: A coin is tossed three times. Denoting Heads = H and Tails = T, the possible
outcomes and their probabilities are as follows:

P(HHH) = (0.5)(0.5)(0.5) = 0.125 P(TTT) = (0.5)(0.5)(0.5) = 0.125
P(HHT) = (0.5)(0.5)(0.5) = 0.125 P(THT) = (0.5)(0.5)(0.5) = 0.125
P(HTH) = (0.5)(0.5)(0.5) = 0.125 P(TTH) = (0.5)(0.5)(0.5) = 0.125
P(HTT) = (0.5)(0.5)(0.5) = 0.125 P(THH) = (0.5)(0.5)(0.5) = 0.125
A random variable is discrete if it can assume only a countable number of possible
values, or if it takes exact values.
If a random variable is discrete, the sum of all the possible probabilities is 1:

$\sum P(X = x) = 1$

A probability distribution function assigns a probability to each simple event in the sample
space.
Note that:
(1) $0 \le P(X = x) \le 1$
(2) $\sum P(X = x) = 1$
Let the random variable X be the number of heads obtained from the three tosses of the coin. Then
X could take values of 0, 1, 2, or 3 and the probability mass function showing the values of X and
their probabilities would be given by:

X = x 0 1 2 3
P(X = x) 0.125 0.375 0.375 0.125

Note: Each probability lies between 0 and 1.
The sum of the probabilities is 1.

The cumulative distribution function is similar to the cumulative frequencies introduced
earlier. The cumulative distribution function accumulates the probabilities associated with each
possible outcome.

X = x       0       1      2       3
P(X ≤ x)    0.125   0.5    0.875   1
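The two tables for the coin example can be reproduced by listing the eight equally likely outcomes and counting heads. The following is a minimal sketch in Python; the variable names are chosen here for illustration and are not part of the course notes.

```python
from itertools import product
from collections import Counter

# Enumerate all 2**3 equally likely outcomes of three fair coin tosses.
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability mass function: P(X = x) for x = 0, 1, 2, 3.
pmf = {x: counts[x] / len(outcomes) for x in range(4)}

# Cumulative distribution function: P(X <= x) as a running total of the pmf.
cdf = {}
running_total = 0.0
for x in range(4):
    running_total += pmf[x]
    cdf[x] = running_total

print(pmf)
print(cdf)
```

Each cumulative value is the previous cumulative value plus the next probability, which is why the last entry is always 1.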

Probabilities as relative frequencies

Suppose we are interested in the number of television sets sold daily at a TV store.

The results of sales over the last 50 days were observed and shown below

Daily Sales (x) frequency
0 2
1 8
2 17
3 13
4 10
Total = 50

The frequencies may now be used to estimate the probability of occurrence of each value of x.
Since the sum of the probabilities is 1, we estimate the probability by dividing each frequency by
the total number of days (50), i.e. the relative frequency.

We then obtain the estimated probability distribution as shown below:

Daily Sales (x)    Probability of sales
0                  2/50 = 0.04
1                  8/50 = 0.16
2                  17/50 = 0.34
3                  13/50 = 0.26
4                  10/50 = 0.20
                   Sum = 1.00

Using these estimated probabilities, we can state that on 4% of the days no TV sets are sold and
on 34% of the days 2 TV sets are sold. We can also state that the probability of selling more
than 1 TV set on any day is the sum of the probabilities of the values of x, which are greater than
1.

i.e. $p(2) + p(3) + p(4) = 0.34 + 0.26 + 0.20 = 0.80$

This means that 80% of the days will have sales of more than 1 TV set.
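The relative-frequency calculation can be sketched in a few lines of Python. The frequencies are the ones observed in the TV store example; the variable names are illustrative only.

```python
# Observed daily TV sales over 50 days: {number sold: number of days}.
frequencies = {0: 2, 1: 8, 2: 17, 3: 13, 4: 10}
total_days = sum(frequencies.values())

# Relative frequency = frequency / total, used as the estimated probability.
estimated_p = {x: f / total_days for x, f in frequencies.items()}

# P(X > 1) is the sum of the estimated probabilities for x = 2, 3 and 4.
p_more_than_one = sum(p for x, p in estimated_p.items() if x > 1)

print(estimated_p)
print(round(p_more_than_one, 2))
```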

4.3 EXPECTATION AND VARIANCE OF X

Recall that the mean of a population of n values was earlier defined as $\mu = \frac{\sum x}{n} = \sum x \cdot \frac{1}{n}$. For
infinitely large populations, however, we must replace the 1/n with the probability with
which x occurs. The mean is then given by $\mu = \sum x \cdot P(x)$, where the sum is taken over all the
x values. This value is also referred to as the expected value of X, written as E(X).

For the random variable described above, X is the number of heads when a coin is tossed three
times, E(X) = 0.(0.125) + 1(0.375) + 2(0.375) + 3(0.125) = 1.5.

Given a discrete random variable X with values $x_i$ that occur with probabilities $p(x_i)$, the
expected value of X is $E(X) = \mu = \sum x \cdot P(x)$.
The variance of X is given by

$\sigma^2 = \mathrm{Var}(X) = E[(X - \mu)^2] = E(X^2) - \mu^2$

where $E(X^2) = \sum x^2 P(X = x)$.
Using our coin tossing experiment, we get

$E(X^2) = 0(0.125) + 1(0.375) + 4(0.375) + 9(0.125) = 3$

Therefore the variance of X is $\mathrm{Var}(X) = \sigma^2 = 3 - 1.5^2 = 0.75$.

Remember: the standard deviation is $\sigma = \sqrt{\mathrm{Var}(X)}$.
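As a check on the coin-tossing figures, the definitions of E(X) and Var(X) translate directly into short functions. This is a sketch only; `expected_value` and `variance` are helper names introduced here, not functions from any library.

```python
from math import sqrt

def expected_value(pmf):
    """Return E(X) = sum of x * P(X = x) for a pmf given as {x: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Return Var(X) = E(X^2) - [E(X)]^2."""
    mean = expected_value(pmf)
    mean_of_squares = sum(x**2 * p for x, p in pmf.items())
    return mean_of_squares - mean**2

# Number of heads in three tosses of a fair coin.
heads_pmf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

mu = expected_value(heads_pmf)
var = variance(heads_pmf)
sigma = sqrt(var)
print(mu, var, round(sigma, 4))
```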

EXAMPLE: The distribution of the number of doughnuts, D, bought by customers at a
particular shop is given in the following table:

D           1      2     3    4     5     6      >6
P(D = d)    1/12   1/6   a    1/4   1/6   1/12   0

i. Find the value of a.
Since the sum of the probabilities is 1, i.e. $\sum P(D = d) = 1$, we get $a = 1 - 9/12 = 3/12 = 1/4$.

ii. By symmetry, or otherwise, state the expected value of D.
Always look for symmetry in the table. The probabilities are symmetric on either side of 3.5,
so the expected value is half-way between 3 and 4, i.e. E(D) = 3.5.

Otherwise: $E(D) = \sum d \cdot P(D = d) = 1(1/12) + 2(1/6) + 3(1/4) + 4(1/4) + 5(1/6) + 6(1/12) = 3.5$

iii. Evaluate the variance of D.
To get the variance we need $E(D^2)$:

$E(D^2) = \sum d^2 P(D = d) = 1(1/12) + 4(1/6) + 9(1/4) + 16(1/4) + 25(1/6) + 36(1/12) = 85/6$

$\mathrm{Var}(D) = E(D^2) - [E(D)]^2 = 85/6 - (3.5)^2 = 1.9$ (2 s.f.)
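The doughnut example can be checked with exact fractions. The sketch below uses the probabilities that appear in the working for E(D) and assumes, for illustration, that the unknown a is the entry for D = 3.

```python
from fractions import Fraction as F

# Known probabilities; P(D > 6) = 0 contributes nothing to any sum.
# Assumption for this sketch: the unknown a belongs to D = 3.
known = {1: F(1, 12), 2: F(1, 6), 4: F(1, 4), 5: F(1, 6), 6: F(1, 12)}

# The probabilities must sum to 1, so a is whatever is left over.
a = 1 - sum(known.values())
pmf = {**known, 3: a}

mean = sum(d * p for d, p in pmf.items())
mean_sq = sum(d**2 * p for d, p in pmf.items())
var = mean_sq - mean**2

print(a, mean, mean_sq, var, round(float(var), 1))
```

Working in fractions avoids rounding until the last step, where the variance is rounded to 2 significant figures.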

people.richland.edu/james/lecture/m170/ch06-prb.html

stattrek.com/probability-distributions/probability-distribution.aspx

4.4 Laws of expected values

If X and Y are random variables and a is any constant, then the following identities hold:

1. E[a] = a

2. E[aX] = aE[X]

3. E[X + Y] = E[X] + E[Y]

4. E[X − Y] = E[X] − E[Y]

5. E[X.Y] = E[X].E[Y] if X and Y are independent

Laws of variance

1. Var[a] = 0

2. Var[aX] = $a^2$ Var[X]

3. Var[X + Y] = Var[X] + Var[Y] if X and Y are independent

4. Var[X − Y] = Var[X] + Var[Y] if X and Y are independent
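These laws can be verified numerically on a small case. The sketch below uses two independent fair dice, a hypothetical example not taken from the notes, and builds the distribution of X + Y, X − Y and X·Y by multiplying the individual probabilities, which is valid because the dice are independent.

```python
from itertools import product
from fractions import Fraction as F

def mean(pmf):
    """E(X) for a pmf given as {value: probability}."""
    return sum(v * p for v, p in pmf.items())

def var(pmf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    return sum(v**2 * p for v, p in pmf.items()) - mean(pmf) ** 2

def combine(pmf_x, pmf_y, op):
    """pmf of op(X, Y) for independent X and Y (joint probability = product)."""
    result = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        value = op(x, y)
        result[value] = result.get(value, 0) + px * py
    return result

die = {face: F(1, 6) for face in range(1, 7)}  # hypothetical fair die

total = combine(die, die, lambda x, y: x + y)
diff = combine(die, die, lambda x, y: x - y)
prod = combine(die, die, lambda x, y: x * y)

print(mean(total), mean(diff), mean(prod))
print(var(die), var(total), var(diff))
```

Note that the variance of the difference equals the variance of the sum: subtracting an independent variable still adds its variability.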

Example: The discrete random variable X has probability distribution function given by

$f(x) = \begin{cases} kx & x = 1, 2, 3, 4, 5, 6 \\ 0 & \text{otherwise} \end{cases}$

a) Find the value of k
b) Find the value of
i. E[X] ii. E[2X + 3] iii. Var[X] iv. Var[3X + 2]

Since X is a discrete variable, the distribution can be written as a table:

x 1 2 3 4 5 6
P(X = x) k 2k 3k 4k 5k 6k

Since X is a random variable, $\sum P(X = x) = 1$:

k + 2k + 3k + 4k + 5k + 6k = 1
21k= 1
k = 1/21

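$E[X] = \sum x P(X = x)$
$= 1\left(\frac{1}{21}\right) + 2\left(\frac{2}{21}\right) + 3\left(\frac{3}{21}\right) + 4\left(\frac{4}{21}\right) + 5\left(\frac{5}{21}\right) + 6\left(\frac{6}{21}\right)$
$= \frac{91}{21} = 4\frac{1}{3}$

$E[2X + 3] = 2E[X] + 3 = 2\left(4\frac{1}{3}\right) + 3 = 11\frac{2}{3}$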

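$E[X^2] = \sum x^2 P(X = x)$
$= 1\left(\frac{1}{21}\right) + 4\left(\frac{2}{21}\right) + 9\left(\frac{3}{21}\right) + 16\left(\frac{4}{21}\right) + 25\left(\frac{5}{21}\right) + 36\left(\frac{6}{21}\right)$
$= 21$

$\mathrm{Var}[X] = E[X^2] - (E[X])^2 = 21 - \left(\frac{13}{3}\right)^2 = \frac{20}{9} \approx 2.22$

$\mathrm{Var}[3X + 2] = 9\,\mathrm{Var}[X] = 20$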
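The whole example can be reproduced with exact fractions, which makes it easy to see where 91/21 and 20/9 come from. This is an illustrative sketch; the variable names are not from the notes.

```python
from fractions import Fraction as F

values = range(1, 7)

# f(x) = k*x must sum to 1, so k = 1 / (1 + 2 + ... + 6).
k = F(1, sum(values))
pmf = {x: k * x for x in values}

e_x = sum(x * p for x, p in pmf.items())
e_x2 = sum(x**2 * p for x, p in pmf.items())
var_x = e_x2 - e_x**2

# Laws used: E[aX + b] = aE[X] + b and Var[aX + b] = a^2 Var[X].
e_2x_plus_3 = 2 * e_x + 3
var_3x_plus_2 = 3**2 * var_x

print(k, e_x, e_2x_plus_3, e_x2, var_x, var_3x_plus_2)
```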

www.cs.ecu.edu/hochberg/spring2005/ExpectedValue.


4.5 THE BINOMIAL DISTRIBUTION

Many experiments share the common trait that their outcomes can be classified into one of two
events. For example, a salesman would either make a sale, or he would not make a sale; a bill is
either paid or due; an item from a production line might be good or defective. Outcomes such as
these are often labelled as success or failure.

Each repetition or trial of an experiment involving two outcomes is called a Bernoulli trial. In
probability theory we are interested in a series of independent, repeated Bernoulli trials. The
fact that the trials must be independent means that the results of one trial do not influence the
results of any other trial. By repeated, we mean that the conditions under which each trial is held
should be exactly the same. This implies that the probabilities of the two possible outcomes
cannot change from trial to trial.

In a Binomial situation, we are interested in the number of successes in n such trials.

A binomial experiment possesses the following properties:

1. The experiment consists of a fixed number of n trials.

2. The results of each trial can be classified into one of two categories:
success or failure.

3. The probability, p, of a success remains constant for each trial.

4. Each trial of the experiment is independent of all other trials.
A binomial random variable X is the number of successful outcomes in n trials of a
binomial experiment and we write X ~ Bin(n, p) where n and p are the parameters of the
distribution.

NOTATION: ${}^{n}C_{x}$ is read as "n combination x".
This is the number of selections of x items out of n different items.

${}^{n}C_{x} = \frac{n!}{x!\,(n - x)!}$

! = factorial: n! is read as "n factorial" or "factorial n".

$n! = n(n - 1)(n - 2)(n - 3) \times \dots \times 3 \times 2 \times 1$

$6! = 6 \times 5 \times 4 \times 3 \times 2 \times 1 = 720$
Now check your calculator for the ! key.

Use the ! key to calculate (1) 4! (2) 10! (3) 14!

Now check your calculator for the ${}^{n}C_{r}$ key.

Use it to calculate ${}^{12}C_{5}$ (the number of selections of 5 items from 12 different items) = 792

If your ${}^{n}C_{r}$ key is a second function then you will input 12 shift ${}^{n}C_{r}$ 5 = . . .

If your ${}^{n}C_{r}$ key is a key function, then you will input 12 ${}^{n}C_{r}$ 5 = . . .

Now, to do some:
6 successes from 10 trials = ${}^{10}C_{6}$ = 210
12 successes from 20 trials = ${}^{20}C_{12}$ = 125 970
9 successes in 15 trials = ${}^{15}C_{9}$ = 5 005

Some useful results:
${}^{n}C_{0} = 1$, e.g. ${}^{10}C_{0} = 1$

${}^{n}C_{n} = 1$, e.g. ${}^{10}C_{10} = 1$

${}^{n}C_{1} = n$, e.g. ${}^{10}C_{1} = 10$

${}^{n}C_{n-r} = {}^{n}C_{r}$, e.g. ${}^{10}C_{3} = {}^{10}C_{7} = 120$
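If a calculator is not to hand, the same values can be checked in Python (version 3.8 or later), whose standard `math` module provides `factorial` and `comb`. The helper `n_c_x` below is a name introduced here to mirror the factorial formula; it is a sketch, not part of the notes.

```python
from math import comb, factorial

def n_c_x(n, x):
    """Number of selections of x items from n, using n! / (x! (n - x)!)."""
    return factorial(n) // (factorial(x) * factorial(n - x))

# Factorials from the calculator exercise.
print(factorial(4), factorial(10), factorial(14))

# Combinations quoted in the text, computed two ways.
for n, x in [(12, 5), (10, 6), (20, 12), (15, 9), (10, 3), (10, 7)]:
    print(n, x, n_c_x(n, x), comb(n, x))
```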

If X is Bin(n, p) then

$P(X = x) = {}^{n}C_{x}\, p^{x} (1 - p)^{n - x}$, where $0 \le p \le 1$ and $x = 0, 1, 2, 3, \dots, n$

$E(X) = np$ and $\mathrm{Var}(X) = np(1 - p)$
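The binomial formula can be written as a short function and used to confirm that the probabilities sum to 1 and that the mean and variance agree with np and np(1 − p). This is a sketch; `binomial_pmf` is a helper name introduced here, and n = 5, p = 0.5 are hypothetical values chosen only for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 0.5  # hypothetical parameters for illustration

# Probabilities for x = 0, 1, ..., n (note that x starts at 0).
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum(x**2 * px for x, px in enumerate(pmf)) - mean**2

print(total, mean, n * p)
print(variance, n * p * (1 - p))
```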

EXAMPLE: The quality control department of a manufacturer has determined that 5%
of the catalytic converters produced by the company will be defective until
an expensive overhaul is undertaken 2 weeks from now. A sample of 10
converters is randomly taken from the production line.

1. What is the probability that exactly 2 will be defective?

2. What is the probability that at least one will be defective?

3. What is the probability that less than 4 will be defective?

4. A shipment of 60 converters was made to a company. How many of the
converters in the shipment are expected to be defective?

First we should make sure that this is a binomial experiment. The experiment consists of a fixed
number of trials, n = 10; each trial has two outcomes i.e. a converter will either be defective or it
will be good; each trial has the same probability of success i.e. the probability of a converter
being defective is 0.05; the trials are independent i.e. one converter being defective will not
influence another converter being defective.

Define the random variable:
Let X be the number of defective converters in the sample.
State the distribution:
Then X ~ Bin(10, 0.05)
Write the probability function:
$P(X = x) = {}^{10}C_{x} (0.05)^{x} (0.95)^{10 - x}$

1. $P(X = 2) = {}^{10}C_{2} (0.05)^{2} (0.95)^{8} = 0.0746$

2. What is the probability that at least 1 will be defective?
The probability of at least one defective is $P(X \ge 1) = 1 - P(X = 0)$
$P(X = 0) = {}^{10}C_{0} (0.05)^{0} (0.95)^{10} = 0.5987$
$P(X \ge 1) = 1 - 0.5987 = 0.4013$

3. The probability that less than 4 will be defective is P(X < 4)
$P(X < 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$
$= {}^{10}C_{0} (0.05)^{0} (0.95)^{10} + {}^{10}C_{1} (0.05)(0.95)^{9} + {}^{10}C_{2} (0.05)^{2} (0.95)^{8} + {}^{10}C_{3} (0.05)^{3} (0.95)^{7}$
$= 0.5987 + 0.3151 + 0.0746 + 0.0105 = 0.9989$

4. The number of converters in the shipment of 60 expected to be defective is given as
E[X] = np = 60 × 0.05 = 3.
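The four answers in the converter example can be checked as follows. This is a sketch under the same model, X ~ Bin(10, 0.05); `binomial_pmf` is a helper name introduced here. Summing the unrounded probabilities gives about 0.99897 for part 3, which differs slightly in the fourth decimal place from the 0.9989 obtained by adding the rounded terms.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.05  # sample of 10 converters, 5% defective

p_exactly_2 = binomial_pmf(2, n, p)
p_at_least_1 = 1 - binomial_pmf(0, n, p)
p_less_than_4 = sum(binomial_pmf(x, n, p) for x in range(4))

# Part 4 uses the shipment size of 60, not the sample size of 10.
expected_in_shipment = 60 * p

print(round(p_exactly_2, 4), round(p_at_least_1, 4))
print(p_less_than_4, expected_in_shipment)
```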

EXAMPLE: In a telephone poll 22% of the respondents read the daily horoscope. Assuming
that this same proportion applies to the population of a certain district,
(a) calculate the probability that in a random sample of 15 people from the district,
(i) exactly 4 people read the daily horoscope;
(ii) less than 3 people read the daily horoscope.

(b) If 500 people live in the district, how many people are expected to read the daily
horoscope?

SOLUTION:
Let X be the number of people who read the daily horoscope, then
X ~ Bin(15, 0.22)

$P(X = x) = {}^{15}C_{x} (0.22)^{x} (0.78)^{15 - x}$

(i) $P(X = 4) = {}^{15}C_{4} (0.22)^{4} (0.78)^{11} = 0.2079$

(ii) $P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)$
$= {}^{15}C_{0} (0.22)^{0} (0.78)^{15} + {}^{15}C_{1} (0.22)(0.78)^{14} + {}^{15}C_{2} (0.22)^{2} (0.78)^{13}$
$= 0.024067 + 0.10182 + 0.20103$
$= 0.326917$
$= 0.3269$ (4 dec. pl.)

(b) The number of people who are expected to read the daily horoscope is
500 × 0.22 = 110.

EXAMPLE: The probability that a blood donor is Group B is 0.09. Calculate the least number
of donors needed to ensure that the probability of obtaining at least one Group B
donor is greater than 0.95.

Recall:

(1) The probability of at least one is 1 minus the probability of none:
$P(X \ge 1) = 1 - P(X = 0)$
(2) If $a^{x} = b$, then $x \log a = \log b$, so $x = \frac{\log b}{\log a}$

SOLUTION: Let X be the number of Group B donors in the sample. X ~ Bin(n, 0.09). We
require the probability of at least one to satisfy $P(X \ge 1) > 0.95$.

$P(X = 0) = (0.91)^{n}$

So $1 - (0.91)^{n} > 0.95$
$(0.91)^{n} < 0.05$

Using logarithms we get $n \log(0.91) < \log(0.05)$.
Since log(0.91) is negative, dividing by it reverses the inequality, thus $n > \log(0.05)/\log(0.91)$
$n > 31.76$

Since n is a counting number, the least number bigger than 31.76 is 32.
So the least number of donors needed is 32.
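The answer can be confirmed both by direct search and by the logarithm route. The sketch below is illustrative; the variable names are not from the notes.

```python
from math import ceil, log

p_group_b = 0.09
target = 0.95

# Direct search: smallest n with 1 - (0.91)^n > 0.95.
n = 1
while 1 - (1 - p_group_b) ** n <= target:
    n += 1

# Logarithm route: n > log(0.05) / log(0.91).
# Both logs are negative, so the ratio is positive.
bound = log(1 - target) / log(1 - p_group_b)

# The bound is not a whole number here, so rounding up gives the least n.
n_from_logs = ceil(bound)

print(n, round(bound, 2), n_from_logs)
```

The direct search is a useful cross-check because it does not depend on remembering to reverse the inequality.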

stattrek.com/probability-distributions/binomial.aspx


4.6 CONTINUOUS PROBABILITY DISTRIBUTIONS
Recall: A continuous random variable is a variable that can assume an infinite number of values
within a given range of values.

Continuous random variables typically involve measurement attributes such as length, weight,
time and temperature. One major distinction between the discrete and the continuous random
variable relates to the numerical events of interest. We can list all the possible values of a
discrete random variable, and it is meaningful to consider the probability that a particular value
will be assumed. On the other hand, all the values of a continuous random variable cannot be
listed (there is an infinite number of values), and so we are restricted to finding probabilities of
some interval, rather than a particular value. As a result the probability that a continuous
random variable will assume any particular value is zero.

When dealing with continuous data, we attempt to find a function f(x), called the probability
density function, whose graph approximates the relative frequency curve for the population.
This probability density must satisfy two conditions:

f(x) is non-negative, i.e. $f(x) \ge 0$

the total area under the curve representing f(x) = 1

Examples of continuous random variables are:

the time taken, in minutes, to perform some task

the height, in centimetres, of a particular species of plant

the distance, in kilometres, travelled by a salesman

the mass, in grams, of parcels received at the post-office.

Recall that in Unit 2 we discussed continuous random variables.

The probability distribution of a continuous random variable X is denoted by f(x) and
probabilities are given by areas under the curve for values of x between a and b.

Area under f(x) between a and b = $P(a < X < b)$

NOTE: $P(X = x) = 0$

stattrek.com/.../dictionary.aspx?...continuous_probability_distributio.

For students who are familiar with calculus, the probability, expected value and variance
of a continuous random variable X with probability density function f(x) are given below:

$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$

If X is a continuous random variable then $\int_{\text{all } x} f(x)\,dx = 1$. The expected value of a continuous
random variable X is given by $\mu = E(X) = \int_{\text{all } x} x f(x)\,dx$. The variance of X is given by

$\sigma^2 = \mathrm{Var}(X) = E(X^2) - \mu^2 = \int_{\text{all } x} x^2 f(x)\,dx - \mu^2$

The above expressions are similar to those for the discrete random variable, with the
summation replaced by the integration.
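For readers who want to see these integrals at work without doing the calculus by hand, they can be approximated numerically. The sketch below assumes a hypothetical density f(x) = 2x on the interval from 0 to 1, which is not from the notes, and uses a simple midpoint rule; `integrate` is a helper written here, not a library function.

```python
def integrate(g, a, b, steps=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(g(a + (i + 0.5) * width) for i in range(steps)) * width

def f(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

total_area = integrate(f, 0, 1)          # should be close to 1
p_interval = integrate(f, 0.25, 0.5)     # P(0.25 <= X <= 0.5)
mean = integrate(lambda x: x * f(x), 0, 1)
variance = integrate(lambda x: x**2 * f(x), 0, 1) - mean**2

print(total_area, p_interval, mean, variance)
```

The structure matches the discrete case: each sum over x weighted by P(X = x) becomes an integral over x weighted by f(x).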

Using integration is beyond the scope of this course, but in UNIT 5 we shall discuss the
NORMAL DISTRIBUTION, which is a continuous distribution. We met this distribution in
Unit 2, but it will be further discussed in UNIT 5.

ECON 1005
SEMESTER 2: 2012-2013
ASSIGNMENT 4

1. Determine, with reason, whether or not each of the following is a probability distribution
function:
a) f(x) = 0.25 for x = 9, 10, 11, 12
b) f(x) =
2
3 x
for x = 1, 2, 3, 4
c)
x P(X = x)
0 0.15
1 0.25
2 0.35
3 0.45

2. A mutual fund saleswoman has arranged to call on three households tomorrow. Based on
past experience, she feels that there is a 20% chance of closing on a sale on each call and
that the outcome of each call is independent of the others. Let X represent the number of
sales she closes tomorrow.

a) Construct the probability distribution of X.
b) What is the probability that more than one sale will be closed tomorrow?

3. A drawer contains 8 brown socks and 4 blue socks. A sock is taken from the drawer, its
colour is noted and it is replaced in the drawer. This procedure is repeated twice more. X
is the random variable, the number of brown socks noted.

a) Construct a probability distribution of X.
b) What is the probability that more than 2 brown socks will be taken?

4. The probability distribution of a random variable X is shown in the following table:

X 1 2 3 4 5
P(X = x) 0.1 0.3 0.3 k 0.1

a) Find the value of k.
b) Calculate E[X] and Var [X]
c) Find E[4X + 3] and Var[4X + 3]


5. The probability distribution for the number of peas in a pod of a certain plant is given in
the table below:

Number of peas, X 4 5 6 7 8 9
Probability, p 0.13 k 0.38 0.16 0.09 0.03

a) Calculate (i) the value of k
(ii) the mean number of peas in a pod (2 d.p.)
(iii) the variance of the number of peas per pod (2 d.p)

b) Three such pods were examined. Find
(i) the mean number of peas in the 3 pods.
(ii) the variance of the number of peas in the 3 pods.

6. Patients who have a particular surgery experience pain the first day after the surgery. The
severity of the pain is measured on a subjective scale of 1 to 5. Let the random variable X
denote the pain score as determined by the patient. From past reports, the probability
distribution for x is determined to be:

x 1 2 3 4 5
P(X = x) 0.10 0.15 0.25 0.35 0.15

a) find the mean of x, the pain score.
b) find the standard deviation of x.
c) Calculate the probability that a patient will have a pain score of 3 or more.

7. In an election poll, 35% of a constituency said they will vote for a particular candidate.
From a sample of 15 randomly selected persons, calculate the probability that
a) only 5 persons will vote for the candidate.
b) at most 3 persons will vote for the candidate.
c) between 8 and 10 persons will support the candidate.

8. For each of the experiments described below, state, giving a reason whether a binomial
distribution is appropriate:
a) A bag contains black, white and red bottle covers, (all identical except for colour).
A cover is randomly taken, its colour noted, and it is replaced in the bag. This is
repeated 5 times.

b) The white covers are removed from the bag, leaving only red and black covers. A
cover is randomly taken, its colour noted and it is replaced in the bag. This is
repeated 5 times.

c) The same experiment as in b, except that the covers are not replaced after each
selection.

d) The same as in b, except that the experiment is stopped as soon as a red cover is
taken from the bag.

9. A darts player has estimated that any time he throws a dart, his probability of hitting the
bulls-eye is 1/3, independent of any other throw.

a) If he throws the dart 25 times, how many times is he expected to hit the bulls-eye?

b) What is the probability that in 10 throws of the dart, he will hit the bulls-eye
(i) exactly 4 times?
(ii) at least twice?

c) Find the least number of throws that he should make if the probability of hitting the
bulls-eye at least once is greater than 0.95?

10. Records show that 30% of all patients admitted to a hospital fail to pay their bills and
eventually the bills are forgiven.

a) Of the next 4 patients who are admitted to the hospital, what is the probability that
(i) all will eventually have to be forgiven?
(ii) one will have to be forgiven?
(iii) none will have to be forgiven?

b) In a random sample of 10 patients who were admitted to hospital and then failed
to pay their bills, calculate the probability that

(i) 4 patients had their bills forgiven.
(ii) at least 2 patients had their bills forgiven
(iii) at most 7 patients had their bills forgiven.

c) If 2000 patients were admitted to the hospital over a period of 1 year, what is
(i) the expected number of bills that will be forgiven?
(ii) the standard deviation of the number of bills that will be forgiven?

Dr. M. E. Folkes Griffith
Mr Carl Chapman
2013 - 02 -
