Until now, we have studied individual random variables and their distributions. However, we are often concerned with more than one random variable at a time. Rather than examining the variables separately, we examine them simultaneously. This allows us to study how the variables behave together, and any relationship between them.
Example 1. The table below summarizes the number of cats X and the number of dogs Y owned by 250 households in a particular neighbourhood.
X/Y      0     1     2    Total
 0      89    38    28     155
 1      31    23    11      65
 2      12     7     6      25
 3       3     2     0       5
Total  135    70    45     250
If we randomly select one household in the neighbourhood, what is the probability they own two cats and
one dog?
$$P(X = 2, Y = 1) = \frac{7}{250} = 0.028$$
We can find all probabilities in a similar manner. We recreate the table above, this time with probabilities
instead of counts:
X/Y       0       1       2     pX (x)
 0      0.356   0.152   0.112    0.62
 1      0.124   0.092   0.044    0.26
 2      0.048   0.028   0.024    0.10
 3      0.012   0.008   0.000    0.02
pY (y)  0.540   0.280   0.180      1
Definition 1. Let X and Y be two discrete random variables defined on the sample space Ω (i.e., from the same experiment). The joint probability mass function (joint p.m.f.) of X and Y , denoted by pX,Y (x, y), is defined as
$$p_{X,Y}(x, y) = P(X = x, Y = y)$$
• If either $x \notin R_X$ or $y \notin R_Y$, then $p_{X,Y}(x,y) = P(\emptyset) = 0$.
• The range of X and Y (i.e., the set of all pairs of values x and y such that pX,Y (x, y) > 0) is denoted
as RXY .
Example 2. A box contains one white ball, two red balls and two black balls. We randomly select two
balls from the box without replacement. Let X be the number of white balls selected and let Y be the
number of red balls selected. Then the joint p.m.f. of X and Y is given by
$$p_{X,Y}(x,y) = \begin{cases} \dfrac{\binom{1}{x}\binom{2}{y}\binom{2}{2-x-y}}{\binom{5}{2}} & \text{if } x \in \{0, 1\},\ y \in \{0, 1, 2\}, \\[1ex] 0 & \text{otherwise.} \end{cases}$$
In tabular form:

X/Y       0      1      2    pX (x)
 0      1/10   4/10   1/10    6/10
 1      2/10   2/10     0     4/10
pY (y)  3/10   6/10   1/10      1
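Because the sample space here is small, these probabilities can be verified by brute-force enumeration. Below is a minimal Python sketch (the labels W, R1, R2, B1, B2 are an illustrative naming of the five balls, not notation from the example) that tallies all C(5, 2) = 10 equally likely draws:

```python
from itertools import combinations
from collections import Counter

# One white, two red and two black balls; labels make the five balls distinct.
balls = ["W", "R1", "R2", "B1", "B2"]

pairs = list(combinations(balls, 2))   # all C(5, 2) = 10 equally likely draws
counts = Counter()
for pair in pairs:
    x = sum(b == "W" for b in pair)           # number of white balls drawn
    y = sum(b.startswith("R") for b in pair)  # number of red balls drawn
    counts[(x, y)] += 1

for (x, y), c in sorted(counts.items()):
    print(f"p(X={x}, Y={y}) = {c}/{len(pairs)}")
# Prints 1/10, 4/10, 1/10, 2/10, 2/10 for (0,0), (0,1), (0,2), (1,0), (1,1),
# matching the table above.
```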
Proposition 1. (Basic Properties of a Bivariate p.m.f.) The joint p.m.f. pX,Y (x, y) of two discrete random variables X and Y satisfies:
(i) $p_{X,Y}(x,y) \ge 0$ for all $x, y \in \mathbb{R}$;
(ii) $\displaystyle\sum_{(x,y) \in R_{XY}} p_{X,Y}(x,y) = 1$.
Example 3. The joint p.m.f. of two discrete random variables X and Y is given by
$$p_{X,Y}(x,y) = \begin{cases} \dfrac{x+y}{k} & \text{if } x \in \{1, 2, 3, 4, 5\},\ y \in \{1, 2, 3\}, \\[1ex] 0 & \text{otherwise.} \end{cases}$$
Find the value of the constant k.
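By property (ii) of Proposition 1, the probabilities must sum to one, which determines k:
$$\sum_{x=1}^{5}\sum_{y=1}^{3}\frac{x+y}{k} = \frac{1}{k}\left(3\sum_{x=1}^{5}x + 5\sum_{y=1}^{3}y\right) = \frac{3(15) + 5(6)}{k} = \frac{75}{k} = 1, \quad\text{so } k = 75.$$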
Returning to Example 1, suppose we want to find the p.m.f.’s of X and Y separately. We see, for example,
P(X = 0) = P(X = 0, Y = 0) + P(X = 0, Y = 1) + P(X = 0, Y = 2) = 0.356 + 0.152 + 0.112 = 0.62
Similarly, by adding across the other rows in the table, we see that
P (X = 1) = 0.26
P (X = 2) = 0.10
P (X = 3) = 0.02
P(Y = 0) = P(X = 0, Y = 0) + P(X = 1, Y = 0) + P(X = 2, Y = 0) + P(X = 3, Y = 0) = 0.54
Similarly, by adding down the other columns in the table, we see that
P (Y = 1) = 0.28
P (Y = 2) = 0.18
So we see that we can find the individual distributions of X and Y (called the marginal distributions) by
summing over the appropriate row or column of the table (i.e., over all values of the other variable).
Definition 2. Let X and Y be two discrete random variables with joint probability mass function pX,Y (x, y). Then the marginal probability mass functions of X and Y are given by
$$p_X(x) = \sum_{y \in R_Y} p_{X,Y}(x, y) \ \text{ for } x \in R_X, \qquad\text{and}\qquad p_Y(y) = \sum_{x \in R_X} p_{X,Y}(x, y) \ \text{ for } y \in R_Y,$$
respectively. That is, the marginal distribution of X is found by summing pX,Y (x, y) over y, and the marginal distribution of Y is found by summing pX,Y (x, y) over x.
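To make the row and column sums concrete, here is a minimal Python sketch that recovers both marginal p.m.f.'s from the joint table of Example 1 (the entries are the counts divided by 250):

```python
# Joint p.m.f. of (cats X, dogs Y) from Example 1: the counts divided by 250.
joint = {(0, 0): 0.356, (0, 1): 0.152, (0, 2): 0.112,
         (1, 0): 0.124, (1, 1): 0.092, (1, 2): 0.044,
         (2, 0): 0.048, (2, 1): 0.028, (2, 2): 0.024,
         (3, 0): 0.012, (3, 1): 0.008, (3, 2): 0.000}

# Marginal of X: sum each row (over y). Marginal of Y: sum each column (over x).
p_X = {x: sum(p for (a, _), p in joint.items() if a == x) for x in range(4)}
p_Y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(3)}

print(p_X)  # ≈ {0: 0.62, 1: 0.26, 2: 0.10, 3: 0.02}
print(p_Y)  # ≈ {0: 0.54, 1: 0.28, 2: 0.18}
```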
Example 5. The joint p.m.f. of two discrete random variables X and Y is given by
$$p_{X,Y}(x,y) = \begin{cases} p^2 (1-p)^{x+y-2} & \text{if } x, y \in \mathbb{N},\\[1ex] 0 & \text{otherwise,}\end{cases}$$
where 0 < p < 1. Find the marginal p.m.f.'s of X and Y .
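Summing the joint p.m.f. over y gives the marginal p.m.f. of X (the sum is a geometric series, since 0 < p < 1):
$$p_X(x) = \sum_{y=1}^{\infty} p^2 (1-p)^{x+y-2} = p^2 (1-p)^{x-1} \sum_{y=1}^{\infty} (1-p)^{y-1} = p^2 (1-p)^{x-1} \cdot \frac{1}{p} = p(1-p)^{x-1}, \quad x \in \mathbb{N},$$
so X ∼ G(p), and by symmetry Y ∼ G(p) as well.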
Just like in the univariate case, to find the probability of any event, we add the probabilities of all outcomes that make up the event.
Proposition 2. (Fundamental Probability Formula: Bivariate Discrete Case) Let X and Y be two
discrete random variables defined on the sample space Ω with joint p.m.f. pX,Y (x, y). Then for any
event $A \subseteq R_{XY}$:
$$P((X, Y) \in A) = \sum_{(x,y) \in A} p_{X,Y}(x, y)$$
All of our results and definitions for the bivariate case can be extended to the case of any number of random variables. Let X1 , X2 , . . . , Xm be m discrete random variables defined on the same sample space Ω. Then the multivariate joint probability mass function of these random variables is the real-valued function
$$p_{X_1, X_2, \dots, X_m}(x_1, x_2, \dots, x_m) = P(X_1 = x_1, X_2 = x_2, \dots, X_m = x_m).$$
Example 9. The joint p.m.f. of three discrete random variables X, Y and Z is given by
$$p_{X,Y,Z}(x,y,z) = \begin{cases} \dfrac{x + 2y + 3z}{k} & \text{if } x \in \{0, 1, 2\},\ y, z \in \{0, 1\},\\[1ex] 0 & \text{otherwise.} \end{cases}$$
Summing over all twelve points of the range, as in Example 3, shows that k = 42.
Example 10. You randomly select ten cards from a standard deck of 52 cards without replacement. Let
XH , XD , XC and XS be the number of hearts, diamonds, clubs and spades selected, respectively. Then the joint p.m.f. of XH , XD , XC and XS is
$$p(x_H, x_D, x_C, x_S) = \frac{\binom{13}{x_H}\binom{13}{x_D}\binom{13}{x_C}\binom{13}{x_S}}{\binom{52}{10}}$$
for non-negative integers with $x_H + x_D + x_C + x_S = 10$, and 0 otherwise.
Just as in the bivariate case, marginal distributions can be obtained by summing the joint p.m.f. over all values of the variables that are not of interest. For example, for three random variables X, Y and Z,
$$p_X(x) = \sum_{y}\sum_{z} p_{X,Y,Z}(x,y,z), \qquad p_{X,Y}(x,y) = \sum_{z} p_{X,Y,Z}(x,y,z),$$
etc.
Example 11. We will find the marginal and bivariate p.m.f.'s in Example 9. To find the marginal p.m.f. of X, we sum the joint p.m.f. over all values of Y and Z:
$$p_X(x) = \sum_{y=0}^{1}\sum_{z=0}^{1} \frac{x + 2y + 3z}{42} = \frac{4x + 10}{42} = \begin{cases} \dfrac{2x + 5}{21} & \text{if } x \in \{0, 1, 2\},\\[1ex] 0 & \text{otherwise,} \end{cases}$$
and
$$p_Z(z) = \begin{cases} \dfrac{3z + 2}{7} & \text{if } z \in \{0, 1\},\\[1ex] 0 & \text{otherwise.} \end{cases}$$
(The marginal p.m.f. of Y is found in the same way.)
To find the bivariate p.m.f. of Y and Z, we sum the joint p.m.f. over all values of X:
$$\begin{aligned} p_{Y,Z}(y,z) = \sum_{x=0}^{2} p_{X,Y,Z}(x,y,z) &= \frac{0 + 2y + 3z}{42} + \frac{1 + 2y + 3z}{42} + \frac{2 + 2y + 3z}{42}\\ &= \frac{6y + 9z + 3}{42} = \frac{2y + 3z + 1}{14}, \qquad y, z \in \{0, 1\}, \end{aligned}$$
and, summing over all values of Y instead,
$$p_{X,Z}(x,z) = \begin{cases} \dfrac{x + 3z + 1}{21} & \text{if } x \in \{0, 1, 2\},\ z \in \{0, 1\},\\[1ex] 0 & \text{otherwise.} \end{cases}$$
Proposition 4. (Fundamental Probability Formula: Multivariate Discrete Case) Let X1 , X2 , . . . , Xm be m discrete random variables defined on the sample space Ω with joint probability mass function pX1 ,X2 ,...,Xm (x1 , x2 , . . . , xm ). Then for any subset A of RX1 X2 ···Xm :
$$P((X_1, X_2, \dots, X_m) \in A) = \sum\cdots\sum_{(x_1, x_2, \dots, x_m) \in A} p_{X_1, X_2, \dots, X_m}(x_1, x_2, \dots, x_m)$$
That is, similar to the univariate and bivariate cases, to find the probability of any event, we add the probabilities of all outcomes that make up the event.
Example 12. In Example 9, what is the probability that X is greater than Y and Z combined?
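The event of interest is $A = \{(x,y,z) \in R_{XYZ} : x > y + z\}$, which contains the outcomes (1, 0, 0), (2, 0, 0), (2, 1, 0) and (2, 0, 1). By the Fundamental Probability Formula:
$$P(X > Y + Z) = \frac{1}{42} + \frac{2}{42} + \frac{4}{42} + \frac{5}{42} = \frac{12}{42} = \frac{2}{7} \approx 0.286.$$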
Example 13. A fair six-sided die has three faces that are painted blue, two faces that are red and one
face that is green. If you roll the die ten times, what is the probability you get blue 3 times, red 5 times and green 2 times?
Definition 5. Suppose that n independent trials are to be conducted, where each trial can result in
one of m different outcomes. For each trial, an outcome is of type i with probability pi , i = 1, 2, . . . , m.
Let Xi denote the number of trials resulting in outcome i. Then the joint p.m.f. of X1 , X2 , . . . , Xm is
$$p_{X_1, X_2, \dots, X_m}(x_1, x_2, \dots, x_m) = \begin{cases} \dbinom{n}{x_1, x_2, \dots, x_m}\, p_1^{x_1} p_2^{x_2} \cdots p_m^{x_m} & \text{if } x_i \ge 0 \text{ and } \displaystyle\sum_{i=1}^{m} x_i = n,\\[1ex] 0 & \text{otherwise.} \end{cases}$$
We write (X1 , X2 , . . . , Xm ) ∼ M(n, p1 , p2 , . . . , pm ) and say the X’s have a multinomial distribution
with parameters n, p1 , p2 , . . . , pm .
Example 14. In a particular election, 40% of voters voted for the Liberal Party, 30% voted for the
Conservative Party, 20% voted for the NDP and 10% voted for the Green party. In a random sample of 12
voters, what is the probability we get 5 Liberal voters, 2 Conservative voters, 4 NDP voters and 1 Green
voter?
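A quick numerical evaluation of the multinomial p.m.f. for this example (a minimal sketch using only Python's standard library; the helper multinomial_pmf is a hypothetical name, not a library function):

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """n!/(x1! ... xm!) * p1^x1 * ... * pm^xm"""
    n = sum(counts)
    coef = factorial(n) // prod(factorial(x) for x in counts)
    return coef * prod(p ** x for p, x in zip(probs, counts))

# 12 voters: 5 Liberal, 2 Conservative, 4 NDP, 1 Green.
print(multinomial_pmf([5, 2, 4, 1], [0.4, 0.3, 0.2, 0.1]))  # ≈ 0.0123
```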
Note. (i) When m = 2, the multinomial distribution reduces to the binomial distribution. (ii) Each count individually follows a binomial distribution: Xi ∼ B(n, pi ). (iii) Combining outcome types again gives a multinomial (or binomial) distribution, where a combined type has probability equal to the sum of the probabilities of the types being combined. (iv) etc.
Example 15. The Liberal Party is considered to be at the centre of the political spectrum, the Conser-
vatives are a right-wing party and the NDP and Greens are left-wing parties. If you randomly select ten
voters, what is the probability that exactly four of them voted for left-wing parties in the election?
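Here we only care whether or not a voter chose a left-wing party, so we can combine categories: a voter is left-wing with probability 0.2 + 0.1 = 0.3, and the number of left-wing voters among the ten is B(10, 0.3). Therefore
$$P(X = 4) = \binom{10}{4}(0.3)^4 (0.7)^6 \approx 0.2001.$$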
In the previous section, we saw how to use the definition of the intersection of events to define a joint probability mass function. In this section, we learn how to use the concept of conditional probability to define a conditional probability mass function. For two discrete random variables X and Y , provided P(X = x) > 0, the definition of conditional probability gives
$$P(\{Y = y\} \mid \{X = x\}) = \frac{P(\{X = x\} \cap \{Y = y\})}{P(\{X = x\})} = \frac{P(X = x, Y = y)}{P(X = x)} = \frac{p_{X,Y}(x,y)}{p_X(x)}$$
Definition 6. Let X and Y be two discrete random variables defined on the sample space Ω. Then the conditional probability mass function of Y given X = x is
$$p_{Y|X}(y|x) = P(Y = y \mid X = x) = \frac{p_{X,Y}(x,y)}{p_X(x)},$$
for all x such that pX (x) > 0.
Example 16. Return to Example 1. For each number of cats x, let us find the conditional p.m.f. of the
number of dogs Y .
Let us obtain the conditional p.m.f. of Y given X = 1. Using the above definition,
$$p_{Y|X}(y|1) = \frac{p_{X,Y}(1, y)}{p_X(1)} = \frac{p_{X,Y}(1, y)}{0.26}, \quad y \in \{0, 1, 2\}.$$
For example,
$$P(Y = 2 \mid X = 1) = \frac{0.044}{0.26} \approx 0.1692.$$
This tells us that, of all households in the neighbourhood with one cat, 16.92% of them own two dogs.
We can similarly calculate P(Y = 0|X = 1) and P(Y = 1|X = 1) to get the conditional p.m.f. of Y given X = 1:

y             0        1        2
pY|X (y|1)  0.4769   0.3538   0.1692
Note that all probabilities in the table sum to 1 (except for round-off error), as this is a legitimate p.m.f.
Similarly, we obtain the conditional p.m.f. of Y for the other possible values of X. Each conditional p.m.f. of Y given X = x is displayed in a row of the following table:

X/Y      0        1        2      Total
 0     0.5742   0.2452   0.1806     1
 1     0.4769   0.3538   0.1692     1
 2     0.4800   0.2800   0.2400     1
 3     0.6000   0.4000   0.0000     1
We can also obtain the conditional p.m.f. of X for each possible value of Y . Each conditional p.m.f. of X given Y = y is displayed in a column of the following table, alongside the marginal p.m.f. of X:

X/Y      0        1        2      pX (x)
 0     0.6593   0.5429   0.6222    0.62
 1     0.2296   0.3286   0.2444    0.26
 2     0.0889   0.1000   0.1333    0.10
 3     0.0222   0.0286   0.0000    0.02
Total    1        1        1         1
Example 17. Return to Example 5. Recall that we determined that the p.m.f.'s of both X and Y were geometric with parameter p. Now we find the conditional p.m.f. of Y given X = x:
$$p_{Y|X}(y|x) = \frac{p_{X,Y}(x,y)}{p_X(x)} = \frac{p^2 (1-p)^{x+y-2}}{p(1-p)^{x-1}} = p(1-p)^{y-1}, \quad y \in \mathbb{N}.$$
Note that the conditional p.m.f.’s of Y given X = x are identical to each other and to the marginal p.m.f. of
Y . As such, knowing the value of the random variable X gives us no information about the probability
distribution of Y .
Proposition 5. (Conditional Probability Mass Functions as Probability Mass Functions) Let X and
Y be random variables defined on the sample space Ω and let x be a real number such that pX (x) > 0.
Then the conditional probability mass function of Y given X = x is a probability mass function.
Note. This says that conditional p.m.f.’s possess all the same properties and follow the same rules
as regular p.m.f.’s. For example, all conditional probabilities are non-negative, the conditional
probabilities P(Y = y|X = x) must sum to one over all values of Y , and we can also use the FPF to compute the conditional probability of any event.
Example 18. In Example 1, what is the probability that a household that owns one dog owns an odd
number of cats?
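Using the conditional p.m.f.'s of X given Y = 1 (the middle column of the table above):
$$P(X \in \{1, 3\} \mid Y = 1) = p_{X|Y}(1|1) + p_{X|Y}(3|1) = \frac{23}{70} + \frac{2}{70} = \frac{25}{70} \approx 0.3571.$$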
If we know the joint p.m.f. of X and Y , and we know the marginal distribution of X, we can find the conditional p.m.f. of Y given X = x:
$$p_{Y|X}(y|x) = \frac{p_{X,Y}(x,y)}{p_X(x)}, \quad \text{for all } x \text{ such that } p_X(x) > 0 \text{ and } y \in R_Y.$$
In some situations, we may know the marginal distribution of X and the conditional distribution of Y given X = x. In that case, we can rearrange the above definition to recover the joint p.m.f.
Proposition 6. (Multiplication Rule for Bivariate PMFs) Let X and Y be two discrete random variables defined on the sample space Ω with joint p.m.f. pX,Y (x, y). Then:
$$p_{X,Y}(x,y) = p_X(x)\, p_{Y|X}(y|x) = p_Y(y)\, p_{X|Y}(x|y).$$
Example 19. We will conduct repeated Bernoulli trials, each with success probability p, until we observe
the second success. Let X be the number of trials required to obtain the first success and let Y be the
total number of trials required to obtain the second success. Find the joint p.m.f. of X and Y , and use it to find the marginal p.m.f. of Y .
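Here X ∼ G(p), and given X = x, the number of additional trials Y − x needed for the second success is again geometric, so pY|X (y|x) = p(1 − p)^(y−x−1) for y > x. By the multiplication rule:
$$p_{X,Y}(x,y) = p_X(x)\, p_{Y|X}(y|x) = p(1-p)^{x-1} \cdot p(1-p)^{y-x-1} = p^2 (1-p)^{y-2}, \quad 1 \le x < y.$$
Summing over x then gives the marginal p.m.f. of Y :
$$p_Y(y) = \sum_{x=1}^{y-1} p^2 (1-p)^{y-2} = (y-1)\, p^2 (1-p)^{y-2}, \quad y \ge 2,$$
which is the p.m.f. of the N B(2, p) distribution.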
Example 20. You repeatedly flip a coin for which P (H) = p. Let X be the number of flips required to
get heads for the first time. If it takes x flips to get the first heads, we will then flip the coin an additional
x times and count Y , the number of heads in those additional tosses. Find the joint p.m.f. of X and Y .
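By the multiplication rule, with X ∼ G(p) and Y | X = x ∼ B(x, p):
$$p_{X,Y}(x,y) = p(1-p)^{x-1}\, \binom{x}{y} p^y (1-p)^{x-y}, \quad x \in \mathbb{N},\ y \in \{0, 1, \dots, x\},$$
and 0 otherwise.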
Conditional p.m.f.'s can be defined in the same way for more than two random variables. For example, for three discrete random variables X, Y and Z:
$$P(\{X = x\} \cap \{Y = y\} \mid \{Z = z\}) = \frac{P(\{X = x\} \cap \{Y = y\} \cap \{Z = z\})}{P(\{Z = z\})} = \frac{P(X = x, Y = y, Z = z)}{P(Z = z)} = \frac{p_{X,Y,Z}(x,y,z)}{p_Z(z)}$$
Similarly,
$$P(\{X = x\} \mid \{Y = y\} \cap \{Z = z\}) = \frac{P(\{X = x\} \cap \{Y = y\} \cap \{Z = z\})}{P(\{Y = y\} \cap \{Z = z\})} = \frac{P(X = x, Y = y, Z = z)}{P(Y = y, Z = z)} = \frac{p_{X,Y,Z}(x,y,z)}{p_{Y,Z}(y,z)}$$
Recall that two events A and B are independent if
$$P(B \mid A) = P(B).$$
For two discrete random variables X and Y , the events {X = x} and {Y = y} are independent if and only if
$$P(Y = y \mid X = x) = P(Y = y),$$
or equivalently,
$$P(X = x, Y = y) = P(X = x)\, P(Y = y).$$
Definition 7. Let X and Y be two discrete random variables defined on the sample space Ω. Then X and Y are independent random variables if
$$P(X \in A, Y \in B) = P(X \in A)\, P(Y \in B)$$
for any subsets $A, B \subseteq \mathbb{R}$.
In terms of the joint and marginal p.m.f.'s, independence can be characterized as follows.
Proposition 7. (Joint Probability Mass Function of Two Independent Random Variables) Let X and Y be two discrete random variables defined on the sample space Ω. Then X and Y are independent if and only if
$$p_{X,Y}(x,y) = p_X(x)\, p_Y(y) \quad \text{for all } x, y \in \mathbb{R}.$$
If we display the joint p.m.f. in tabular form, then X and Y are independent if and only if the probability
in each cell (pX,Y (x, y)) is equal to the product of the row total (pX (x)) and the column total (pY (y)).
Example 22. The joint p.m.f. of two discrete random variables X and Y is displayed in the table below:
X/Y 1 2 3 pX (x)
Since pX,Y (x, y) = pX (x)pY (y) for all x ∈ RX and y ∈ RY , X and Y are independent.
The conditional p.m.f.’s of Y for each value of X are displayed in the rows in the following table:
X/Y 1 2 3 Total
We see that when X and Y are independent, all rows in this table are equivalent. That is, the conditional
p.m.f.’s of Y given X are the same as one another, and are all the same as the marginal p.m.f. of Y .
The conditional p.m.f.’s of X for each value of Y are displayed in the columns in the following table:
X/Y 1 2 3 pX (x)
Total 1 1 1 1
We see that when X and Y are independent, all columns in this table are equivalent. That is, the conditional
p.m.f.’s of X given Y are the same as one another, and are all the same as the marginal p.m.f. of X.
We had two equivalent definitions of independence of two events. The first (the multiplication rule) led to our previous definition of independent discrete random variables. The second (the conditional probability definition) leads to the following equivalent characterization:
Proposition 8. (Independence and Conditional p.m.f.’s) Two discrete random variables X and Y
with joint probability mass function pX,Y (x, y) are independent if and only if
pY|X (y|x) = pY (y) for all (x, y) ∈ RXY , or, equivalently, pX|Y (x|y) = pX (x) for all (x, y) ∈ RXY .
Returning to Example 5: we found the marginal distributions, X ∼ G(p) and Y ∼ G(p), and in Example 17, we found that
$$p_{Y|X}(y|x) = p(1-p)^{y-1} = p_Y(y) \quad \text{for all } (x, y) \in R_{XY}.$$
It can also be shown that pX|Y (x|y) = pX (x) (which must be true if the above is true). Therefore, X and Y are independent. We can also show independence of X and Y using our first definition:
$$p_{X,Y}(x,y) = p^2 (1-p)^{x+y-2} = \left[p(1-p)^{x-1}\right]\left[p(1-p)^{y-1}\right] = p_X(x)\, p_Y(y) \quad \text{for all } (x, y) \in R_{XY}.$$
Proposition 9. (Joint Probability Mass Function of Several Independent Random Variables) Let X1 , X2 , . . . , Xm be m discrete random variables defined on the sample space Ω. Then X1 , . . . , Xm are independent if and only if
$$p_{X_1, X_2, \dots, X_m}(x_1, x_2, \dots, x_m) = p_{X_1}(x_1)\, p_{X_2}(x_2) \cdots p_{X_m}(x_m) \quad \text{for all } x_1, \dots, x_m.$$
Proposition 10. (Probability Mass Function of a Function of Two Random Variables) Let X and Y
be two discrete random variables defined on the same sample space Ω and let g be a real-valued function
of two variables defined on the range of (X, Y ). Then the p.m.f. of the random variable Z = g(X, Y )
is
$$p_Z(z) = \begin{cases} \displaystyle\sum_{(x,y) \in R_{XY}:\, g(x,y) = z} p_{X,Y}(x,y) & \text{if } z \in R_Z,\\[1ex] 0 & \text{otherwise.} \end{cases}$$
Note. In other words, if Z is a function of X and Y , then P(Z = z) is equal to the sum of the joint probabilities pX,Y (x, y) over all pairs (x, y) for which g(x, y) = z.
Example 27. A market has both an express checkout line and a regular checkout line. Let X and Y
denote the number of customers in line at the express checkout and the regular checkout, respectively, at
a given time. Suppose the joint p.m.f. of X and Y is given by the following table:
X/Y 0 1 2 3
Let Z = |X − Y | be the absolute difference between the number of customers in the express lane and the regular lane. We want to find the p.m.f. of Z.
First we note the range of Z is RZ = {0, 1, 2, 3, 4}. Now, for each value of Z, we determine which pairs (X, Y ) produce that value when we take the absolute difference. We can make a table to help us (the rows for z = 0, 3 and 4 are constructed in the same way):

z   pairs (x, y) with |x − y| = z                     probabilities                              pZ (z)
1   (0,1), (1,0), (1,2), (2,1), (2,3), (3,2), (4,3)   0.02, 0.02, 0.08, 0.10, 0.07, 0.12, 0.03   0.44
2   (0,2), (1,3), (2,0), (3,1), (4,2)                 0.03, 0.05, 0.02, 0.09, 0.02               0.21
So we can write the p.m.f. of Z = |X − Y | using the first and last column of the above table:
$$p_Z(z) = \begin{cases} 0.29 & \text{if } z = 0,\\ 0.44 & \text{if } z = 1,\\ 0.21 & \text{if } z = 2,\\ 0.05 & \text{if } z = 3,\\ 0.01 & \text{if } z = 4,\\ 0 & \text{otherwise.} \end{cases}$$
Note. The p.m.f. of a function of more than two variables can be found analogously.
We now examine the special case where our function of interest is the sum of two discrete random variables.
Example 28. In Example 27, let Z = X + Y be the total number of customers in the two lines. We want
to find the p.m.f. of Z. The range of Z is RZ = {0, 1, 2, 3, 4, 5, 6, 7}. The probability there are three total customers is
$$P(Z = 3) = P(X + Y = 3) = p_{X,Y}(0,3) + p_{X,Y}(1,2) + p_{X,Y}(2,1) + p_{X,Y}(3,0) = 0.02 + 0.08 + 0.10 + 0.01 = 0.21.$$
We find pZ (z) for all other values of z ∈ RZ similarly. The p.m.f. of Z is shown below:
z 0 1 2 3 4 5 6 7
We used the law of partitions (partitioning the event Z = z over all values of X) to find the p.m.f. of Z = X + Y . By fixing the value of one random variable, we can change the expression X + Y = z from being a function of two variables to being a function of just one. We summarize the technique as follows:
Proposition 11. (p.m.f. of the Sum of Two Discrete Random Variables) Let X and Y be two discrete
random variables defined on the sample space Ω. Then the probability mass function of Z = X + Y
satisfies
$$p_Z(z) = \sum_{x \in R_X} p_{X,Y}(x, z - x), \quad z \in R_Z.$$
Proposition 12. (p.m.f. of the Sum of Two Independent Discrete Random Variables) Let X and Y be independent discrete random variables defined on the sample space Ω. Then the p.m.f. of the random variable Z = X + Y is given by the convolution formula
$$p_Z(z) = \sum_{x \in R_X} p_X(x)\, p_Y(z - x), \quad z \in R_Z.$$
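For instance, the convolution formula recovers the familiar triangular p.m.f. of the sum of two fair six-sided dice; the Python sketch below (the fair dice are an illustrative assumption, not one of the examples above) computes it exactly with fractions:

```python
from fractions import Fraction

# p.m.f. of one fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

# p.m.f. of Z = X + Y for independent X, Y via the convolution formula:
# p_Z(z) = sum over x of p_X(x) * p_Y(z - x).
p_Z = {z: sum(die[x] * die.get(z - x, Fraction(0)) for x in die)
       for z in range(2, 13)}

print(p_Z[7])             # 1/6: the most likely total
print(sum(p_Z.values()))  # 1: a valid p.m.f.
```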
Proposition 13. (p.m.f.'s of Sums of Some Common Discrete Random Variables) Let X1 , . . . , Xm be m independent random variables. Then:
(i) if Xi ∼ B(ni , p), then $\displaystyle\sum_{i=1}^{m} X_i \sim B\!\left(\sum_{i=1}^{m} n_i,\ p\right)$;
(ii) if Xi ∼ P(λi ), then $\displaystyle\sum_{i=1}^{m} X_i \sim P\!\left(\sum_{i=1}^{m} \lambda_i\right)$;
(iii) if Xi ∼ N B(ri , p), then $\displaystyle\sum_{i=1}^{m} X_i \sim N B\!\left(\sum_{i=1}^{m} r_i,\ p\right)$.
We will prove Proposition 13 (i) for the case of m = 2. We have two independent random variables X ∼ B(n1 , p) and Y ∼ B(n2 , p). Let Z = X + Y . Then:
$$\begin{aligned} P(Z = z) = P(X + Y = z) &= \sum_{x=0}^{z} p_X(x)\, p_Y(z - x) && \text{by the Convolution Formula}\\ &= \sum_{x=0}^{z} \binom{n_1}{x} p^x (1-p)^{n_1 - x} \binom{n_2}{z - x} p^{z-x} (1-p)^{n_2 - (z - x)}\\ &= p^z (1-p)^{n_1 + n_2 - z} \sum_{x=0}^{z} \binom{n_1}{x}\binom{n_2}{z - x}\\ &= \binom{n_1 + n_2}{z} p^z (1-p)^{n_1 + n_2 - z}, \end{aligned}$$
so that Z ∼ B(n1 + n2 , p).
Note. In the last line of the proof, we used Vandermonde's Identity, which states that
$$\sum_{j=0}^{k} \binom{m}{j}\binom{n}{k - j} = \binom{m + n}{k}.$$
This identity states that any combination of k objects from a group of (m + n) objects must have some 0 ≤ j ≤ k objects from the group of m objects, and the remaining (k − j) objects from the group of n objects.
We will not formally prove this identity. However, you can verify it with an example. For example, suppose m = 5, n = 4 and k = 3. Vandermonde's Identity states (and you can verify) that
$$\binom{9}{3} = \binom{5}{0}\binom{4}{3} + \binom{5}{1}\binom{4}{2} + \binom{5}{2}\binom{4}{1} + \binom{5}{3}\binom{4}{0}.$$
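This check is easy to automate; the short Python sketch below (plain Python, using the standard-library function math.comb) confirms the m = 5, n = 4, k = 3 case and can be re-run with any other values:

```python
from math import comb

# Check Vandermonde's Identity for m = 5, n = 4, k = 3.
m, n, k = 5, 4, 3
lhs = sum(comb(m, j) * comb(n, k - j) for j in range(k + 1))
print(lhs, comb(m + n, k))  # 84 84
```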
We will now prove Proposition 13 (ii) for the case of m = 2. We have two independent random variables X ∼ P(λ1 ) and Y ∼ P(λ2 ). Let Z = X + Y . We find the p.m.f. of Z as follows:
$$\begin{aligned} P(Z = z) &= \sum_{x=0}^{z} p_X(x)\, p_Y(z - x) && \text{by the Convolution Formula}\\ &= \sum_{x=0}^{z} \frac{\lambda_1^x e^{-\lambda_1}}{x!} \cdot \frac{\lambda_2^{z-x} e^{-\lambda_2}}{(z - x)!}\\ &= e^{-(\lambda_1 + \lambda_2)} \sum_{x=0}^{z} \frac{\lambda_1^x \lambda_2^{z-x}}{x!\,(z - x)!}\\ &= e^{-(\lambda_1 + \lambda_2)}\, \frac{1}{z!} \sum_{x=0}^{z} \frac{z!}{x!\,(z - x)!}\, \lambda_1^x \lambda_2^{z-x}\\ &= e^{-(\lambda_1 + \lambda_2)}\, \frac{1}{z!} \sum_{x=0}^{z} \binom{z}{x} \lambda_1^x \lambda_2^{z-x}\\ &= \frac{(\lambda_1 + \lambda_2)^z\, e^{-(\lambda_1 + \lambda_2)}}{z!} && \text{by the Binomial Theorem, } (x + a)^n = \sum_{k=0}^{n}\binom{n}{k} x^k a^{n-k}, \end{aligned}$$
so that Z ∼ P(λ1 + λ2 ).
Example 29. You have two unbalanced coins – one gold and one silver. When flipped, each of them
lands on heads 60% of the time. If you flip the gold coin 10 times and the silver coin 5 times, what is the probability distribution of the total number of heads you observe?
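Since both coins share the same success probability, Proposition 13 (i) gives that the total number of heads is B(10 + 5, 0.6) = B(15, 0.6). A small Python sketch verifying this numerically via the convolution formula:

```python
from math import comb

def binom_pmf(n, p, x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

p = 0.6
# p.m.f. of Z = X + Y, with X ~ B(10, 0.6) and Y ~ B(5, 0.6), by convolution.
for z in range(16):
    conv = sum(binom_pmf(10, p, x) * binom_pmf(5, p, z - x)
               for x in range(max(0, z - 5), min(10, z) + 1))
    assert abs(conv - binom_pmf(15, p, z)) < 1e-12  # matches B(15, 0.6)
print("convolution agrees with B(15, 0.6)")
```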
Example 30. Suppose the number of goals X1 scored by the Winnipeg Jets follows a Poisson distribution
with a rate of 3.2 per game. The number of goals X2 scored by the Arizona Coyotes follows a Poisson
distribution with a rate of 2.6 per game. Assuming the variables are independent, what is the probability
that a total of seven goals will be scored in the next Jets vs. Coyotes game?
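By Proposition 13 (ii), the total number of goals Z = X1 + X2 ∼ P(3.2 + 2.6) = P(5.8), so
$$P(Z = 7) = \frac{5.8^7\, e^{-5.8}}{7!} \approx 0.1326.$$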
Example 31. You and two friends each roll a fair six-sided die until each of you has observed a 6 for the
second time. What is the probability that the total number of rolls is 30?
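Each person's number of rolls is N B(2, 1/6), so by Proposition 13 (iii) the total number of rolls T is N B(6, 1/6). Therefore
$$P(T = 30) = \binom{29}{5}\left(\frac{1}{6}\right)^{6}\left(\frac{5}{6}\right)^{24} \approx 0.032.$$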