Until now, we have studied individual random variables and their distributions. However, we are often concerned with more than one random variable at a time. Rather than examining the variables separately, we examine them simultaneously. This allows us to study how the variables behave together, and any relationship between them.
Example 1. The table below summarizes the number of cats X and the number of dogs Y owned by 250 households in a particular neighbourhood.
X/Y      0     1     2    Total
 0      89    38    28     155
 1      31    23    11      65
 2      12     7     6      25
 3       3     2     0       5
Total  135    70    45     250
If we randomly select one household in the neighbourhood, what is the probability they own two cats and
one dog?
$$P(X = 2, Y = 1) = \frac{7}{250} = 0.028$$
We can find all probabilities in a similar manner. We recreate the table above, this time with probabilities
instead of counts:
X/Y       0       1       2     pX (x)
 0      0.356   0.152   0.112    0.62
 1      0.124   0.092   0.044    0.26
 2      0.048   0.028   0.024    0.10
 3      0.012   0.008   0.000    0.02
pY (y)  0.540   0.280   0.180      1
Definition 1. Let X and Y be two discrete random variables defined on the sample space Ω (i.e., from the same experiment). The joint probability mass function (joint p.m.f.) of X and Y , denoted by pX,Y (x, y), is defined as
$$p_{X,Y}(x, y) = P(X = x, Y = y)$$
• If either $x \notin R_X$ or $y \notin R_Y$, then $p_{X,Y}(x,y) = P(\emptyset) = 0$.
• The range of X and Y (i.e., the set of all pairs of values x and y such that pX,Y (x, y) > 0) is denoted
as RXY .
Example 2. A box contains one white ball, two red balls and two black balls. We randomly select two
balls from the box without replacement. Let X be the number of white balls selected and let Y be the
number of red balls selected. Then the joint p.m.f. of X and Y is given by
$$p_{X,Y}(x,y) = \begin{cases} \dfrac{\binom{1}{x}\binom{2}{y}\binom{2}{2-x-y}}{\binom{5}{2}} & \text{if } x \in \{0, 1\},\ y \in \{0, 1, 2\}, \\[1ex] 0 & \text{otherwise.} \end{cases}$$
In tabular form:

X/Y       0      1      2    pX (x)
 0      1/10   4/10   1/10    6/10
 1      2/10   2/10     0     4/10
pY (y)  3/10   6/10   1/10      1
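Because the sample space here is small, these probabilities can be verified by brute-force enumeration. Below is a minimal Python sketch (the labels W, R1, R2, B1, B2 are an illustrative naming of the five balls, not notation from the example) that tallies all C(5, 2) = 10 equally likely draws:

```python
from itertools import combinations
from collections import Counter

# One white, two red and two black balls; labels make the five balls distinct.
balls = ["W", "R1", "R2", "B1", "B2"]

pairs = list(combinations(balls, 2))   # all C(5, 2) = 10 equally likely draws
counts = Counter()
for pair in pairs:
    x = sum(b == "W" for b in pair)           # number of white balls drawn
    y = sum(b.startswith("R") for b in pair)  # number of red balls drawn
    counts[(x, y)] += 1

for (x, y), c in sorted(counts.items()):
    print(f"p(X={x}, Y={y}) = {c}/{len(pairs)}")
# Prints 1/10, 4/10, 1/10, 2/10, 2/10 for (0,0), (0,1), (0,2), (1,0), (1,1),
# matching the table above.
```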
Proposition 1. (Basic Properties of a Bivariate p.m.f.) The joint p.m.f. pX,Y (x, y) of two discrete random variables X and Y satisfies:
(i) $p_{X,Y}(x,y) \ge 0$ for all $x, y \in \mathbb{R}$;
(ii) $\displaystyle\sum_{(x,y) \in R_{XY}} p_{X,Y}(x,y) = 1$.
Example 3. The joint p.m.f. of two discrete random variables X and Y is given by
$$p_{X,Y}(x,y) = \begin{cases} \dfrac{x+y}{k} & \text{if } x \in \{1, 2, 3, 4, 5\},\ y \in \{1, 2, 3\}, \\[1ex] 0 & \text{otherwise.} \end{cases}$$
Find the value of the constant k.
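By property (ii) of Proposition 1, the probabilities must sum to one, which determines k:
$$\sum_{x=1}^{5}\sum_{y=1}^{3}\frac{x+y}{k} = \frac{1}{k}\left(3\sum_{x=1}^{5}x + 5\sum_{y=1}^{3}y\right) = \frac{3(15) + 5(6)}{k} = \frac{75}{k} = 1, \quad\text{so } k = 75.$$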
Returning to Example 1, suppose we want to find the p.m.f.’s of X and Y separately. We see, for example,
P(X = 0) = P(X = 0, Y = 0) + P(X = 0, Y = 1) + P(X = 0, Y = 2) = 0.356 + 0.152 + 0.112 = 0.62
Similarly, by adding across the other rows in the table, we see that
P (X = 1) = 0.26
P (X = 2) = 0.10
P (X = 3) = 0.02
P(Y = 0) = P(X = 0, Y = 0) + P(X = 1, Y = 0) + P(X = 2, Y = 0) + P(X = 3, Y = 0) = 0.54
Similarly, by adding down the other columns in the table, we see that
P (Y = 1) = 0.28
P (Y = 2) = 0.18
So we see that we can find the individual distributions of X and Y (called the marginal distributions) by
summing over the appropriate row or column of the table (i.e., over all values of the other variable).
Definition 2. Let X and Y be two discrete random variables with joint probability mass function pX,Y (x, y). Then the marginal probability mass functions of X and Y are given by
$$p_X(x) = \sum_{y \in R_Y} p_{X,Y}(x, y) \ \text{ for } x \in R_X, \qquad\text{and}\qquad p_Y(y) = \sum_{x \in R_X} p_{X,Y}(x, y) \ \text{ for } y \in R_Y,$$
respectively. That is, the marginal distribution of X is found by summing pX,Y (x, y) over y, and the marginal distribution of Y is found by summing pX,Y (x, y) over x.
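To make the row and column sums concrete, here is a minimal Python sketch that recovers both marginal p.m.f.'s from the joint table of Example 1 (the entries are the counts divided by 250):

```python
# Joint p.m.f. of (cats X, dogs Y) from Example 1: the counts divided by 250.
joint = {(0, 0): 0.356, (0, 1): 0.152, (0, 2): 0.112,
         (1, 0): 0.124, (1, 1): 0.092, (1, 2): 0.044,
         (2, 0): 0.048, (2, 1): 0.028, (2, 2): 0.024,
         (3, 0): 0.012, (3, 1): 0.008, (3, 2): 0.000}

# Marginal of X: sum each row (over y). Marginal of Y: sum each column (over x).
p_X = {x: sum(p for (a, _), p in joint.items() if a == x) for x in range(4)}
p_Y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(3)}

print(p_X)  # ≈ {0: 0.62, 1: 0.26, 2: 0.10, 3: 0.02}
print(p_Y)  # ≈ {0: 0.54, 1: 0.28, 2: 0.18}
```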
Example 5. The joint p.m.f. of two discrete random variables X and Y is given by
$$p_{X,Y}(x,y) = \begin{cases} p^2 (1-p)^{x+y-2} & \text{if } x, y \in \mathbb{N},\\[1ex] 0 & \text{otherwise,}\end{cases}$$
where 0 < p < 1. Find the marginal p.m.f.'s of X and Y .
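Summing the joint p.m.f. over y gives the marginal p.m.f. of X (the sum is a geometric series, since 0 < p < 1):
$$p_X(x) = \sum_{y=1}^{\infty} p^2 (1-p)^{x+y-2} = p^2 (1-p)^{x-1} \sum_{y=1}^{\infty} (1-p)^{y-1} = p^2 (1-p)^{x-1} \cdot \frac{1}{p} = p(1-p)^{x-1}, \quad x \in \mathbb{N},$$
so X ∼ G(p), and by symmetry Y ∼ G(p) as well.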
Just like in the univariate case, to find the probability of any event, we add the probabilities of all outcomes that make up the event.
Proposition 2. (Fundamental Probability Formula: Bivariate Discrete Case) Let X and Y be two
discrete random variables defined on the sample space Ω with joint p.m.f. pX,Y (x, y). Then for any
event $A \subseteq R_{XY}$:
$$P((X, Y) \in A) = \sum_{(x,y) \in A} p_{X,Y}(x, y)$$
All of our results and definitions for the bivariate case can be extended to the case of any number of random variables. Let X1 , X2 , . . . , Xm be m discrete random variables defined on the same sample space Ω. Then the multivariate joint probability mass function of these random variables is the real-valued function
$$p_{X_1, X_2, \dots, X_m}(x_1, x_2, \dots, x_m) = P(X_1 = x_1, X_2 = x_2, \dots, X_m = x_m).$$
Example 9. The joint p.m.f. of three discrete random variables X, Y and Z is given by
$$p_{X,Y,Z}(x,y,z) = \begin{cases} \dfrac{x + 2y + 3z}{k} & \text{if } x \in \{0, 1, 2\},\ y, z \in \{0, 1\},\\[1ex] 0 & \text{otherwise.} \end{cases}$$
Summing over all twelve points of the range, as in Example 3, shows that k = 42.
Example 10. You randomly select ten cards from a standard deck of 52 cards without replacement. Let
XH , XD , XC and XS be the number of hearts, diamonds, clubs and spades selected, respectively. Then the joint p.m.f. of XH , XD , XC and XS is
$$p(x_H, x_D, x_C, x_S) = \frac{\binom{13}{x_H}\binom{13}{x_D}\binom{13}{x_C}\binom{13}{x_S}}{\binom{52}{10}}$$
for non-negative integers with $x_H + x_D + x_C + x_S = 10$, and 0 otherwise.
Just as in the bivariate case, marginal distributions can be obtained by summing the joint p.m.f. over all values of the variables that are not of interest. For example, for three random variables X, Y and Z,
$$p_X(x) = \sum_{y}\sum_{z} p_{X,Y,Z}(x,y,z), \qquad p_{X,Y}(x,y) = \sum_{z} p_{X,Y,Z}(x,y,z),$$
etc.
Example 11. We will find the marginal and bivariate p.m.f.'s in Example 9. To find the marginal p.m.f. of X, we sum the joint p.m.f. over all values of Y and Z:
$$p_X(x) = \sum_{y=0}^{1}\sum_{z=0}^{1} \frac{x + 2y + 3z}{42} = \frac{4x + 10}{42} = \begin{cases} \dfrac{2x + 5}{21} & \text{if } x \in \{0, 1, 2\},\\[1ex] 0 & \text{otherwise,} \end{cases}$$
and
$$p_Z(z) = \begin{cases} \dfrac{3z + 2}{7} & \text{if } z \in \{0, 1\},\\[1ex] 0 & \text{otherwise.} \end{cases}$$
(The marginal p.m.f. of Y is found in the same way.)
To find the bivariate p.m.f. of Y and Z, we sum the joint p.m.f. over all values of X:
$$\begin{aligned} p_{Y,Z}(y,z) = \sum_{x=0}^{2} p_{X,Y,Z}(x,y,z) &= \frac{0 + 2y + 3z}{42} + \frac{1 + 2y + 3z}{42} + \frac{2 + 2y + 3z}{42}\\ &= \frac{6y + 9z + 3}{42} = \frac{2y + 3z + 1}{14}, \qquad y, z \in \{0, 1\}, \end{aligned}$$
and, summing over all values of Y instead,
$$p_{X,Z}(x,z) = \begin{cases} \dfrac{x + 3z + 1}{21} & \text{if } x \in \{0, 1, 2\},\ z \in \{0, 1\},\\[1ex] 0 & \text{otherwise.} \end{cases}$$
Proposition 4. (Fundamental Probability Formula: Multivariate Discrete Case) Let X1 , X2 , . . . , Xm be m discrete random variables defined on the sample space Ω with joint probability mass function pX1 ,X2 ,...,Xm (x1 , x2 , . . . , xm ). Then for any subset A of RX1 X2 ···Xm :
$$P((X_1, X_2, \dots, X_m) \in A) = \sum\cdots\sum_{(x_1, x_2, \dots, x_m) \in A} p_{X_1, X_2, \dots, X_m}(x_1, x_2, \dots, x_m)$$
That is, similar to the univariate and bivariate cases, to find the probability of any event, we add the probabilities of all outcomes that make up the event.
Example 12. In Example 9, what is the probability that X is greater than Y and Z combined?
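The event of interest is $A = \{(x,y,z) \in R_{XYZ} : x > y + z\}$, which contains the outcomes (1, 0, 0), (2, 0, 0), (2, 1, 0) and (2, 0, 1). By the Fundamental Probability Formula:
$$P(X > Y + Z) = \frac{1}{42} + \frac{2}{42} + \frac{4}{42} + \frac{5}{42} = \frac{12}{42} = \frac{2}{7} \approx 0.286.$$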
Example 13. A fair six-sided die has three faces that are painted blue, two faces that are red and one
face that is green. If you roll the die ten times, what is the probability you get blue 3 times, red 5 times and green 2 times?
Definition 5. Suppose that n independent trials are to be conducted, where each trial can result in
one of m different outcomes. For each trial, an outcome is of type i with probability pi , i = 1, 2, . . . , m.
Let Xi denote the number of trials resulting in outcome i. Then the joint p.m.f. of X1 , X2 , . . . , Xm is
$$p_{X_1, X_2, \dots, X_m}(x_1, x_2, \dots, x_m) = \begin{cases} \dbinom{n}{x_1, x_2, \dots, x_m}\, p_1^{x_1} p_2^{x_2} \cdots p_m^{x_m} & \text{if } x_i \ge 0 \text{ and } \displaystyle\sum_{i=1}^{m} x_i = n,\\[1ex] 0 & \text{otherwise.} \end{cases}$$
We write (X1 , X2 , . . . , Xm ) ∼ M(n, p1 , p2 , . . . , pm ) and say the X’s have a multinomial distribution
with parameters n, p1 , p2 , . . . , pm .
Example 14. In a particular election, 40% of voters voted for the Liberal Party, 30% voted for the
Conservative Party, 20% voted for the NDP and 10% voted for the Green party. In a random sample of 12
voters, what is the probability we get 5 Liberal voters, 2 Conservative voters, 4 NDP voters and 1 Green
voter?
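A quick numerical evaluation of the multinomial p.m.f. for this example (a minimal sketch using only Python's standard library; the helper multinomial_pmf is a hypothetical name, not a library function):

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """n!/(x1! ... xm!) * p1^x1 * ... * pm^xm"""
    n = sum(counts)
    coef = factorial(n) // prod(factorial(x) for x in counts)
    return coef * prod(p ** x for p, x in zip(probs, counts))

# 12 voters: 5 Liberal, 2 Conservative, 4 NDP, 1 Green.
print(multinomial_pmf([5, 2, 4, 1], [0.4, 0.3, 0.2, 0.1]))  # ≈ 0.0123
```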
Note. (i) When m = 2, the multinomial distribution reduces to the binomial distribution. (ii) Each count individually follows a binomial distribution: Xi ∼ B(n, pi ). (iii) Combining outcome types again gives a multinomial (or binomial) distribution, where a combined type has probability equal to the sum of the probabilities of the types being combined. (iv) etc.
Example 15. The Liberal Party is considered to be at the centre of the political spectrum, the Conser-
vatives are a right-wing party and the NDP and Greens are left-wing parties. If you randomly select ten
voters, what is the probability that exactly four of them voted for left-wing parties in the election?
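Here we only care whether or not a voter chose a left-wing party, so we can combine categories: a voter is left-wing with probability 0.2 + 0.1 = 0.3, and the number of left-wing voters among the ten is B(10, 0.3). Therefore
$$P(X = 4) = \binom{10}{4}(0.3)^4 (0.7)^6 \approx 0.2001.$$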
In the previous section, we saw how to use the definition of the intersection of events to define a joint probability mass function. In this section, we learn how to use the concept of conditional probability to define a conditional probability mass function. For two discrete random variables X and Y , provided P(X = x) > 0, the definition of conditional probability gives
$$P(\{Y = y\} \mid \{X = x\}) = \frac{P(\{X = x\} \cap \{Y = y\})}{P(\{X = x\})} = \frac{P(X = x, Y = y)}{P(X = x)} = \frac{p_{X,Y}(x,y)}{p_X(x)}$$
Definition 6. Let X and Y be two discrete random variables defined on the sample space Ω. Then the conditional probability mass function of Y given X = x is
$$p_{Y|X}(y|x) = P(Y = y \mid X = x) = \frac{p_{X,Y}(x,y)}{p_X(x)},$$
for all x such that pX (x) > 0.
Example 16. Return to Example 1. For each number of cats x, let us find the conditional p.m.f. of the
number of dogs Y .
Let us obtain the conditional p.m.f. of Y given X = 1. Using the above definition,
$$p_{Y|X}(y|1) = \frac{p_{X,Y}(1, y)}{p_X(1)} = \frac{p_{X,Y}(1, y)}{0.26}, \quad y \in \{0, 1, 2\}.$$
For example,
$$P(Y = 2 \mid X = 1) = \frac{0.044}{0.26} \approx 0.1692.$$
This tells us that, of all households in the neighbourhood with one cat, 16.92% of them own two dogs.
We can similarly calculate P(Y = 0|X = 1) and P(Y = 1|X = 1) to get the conditional p.m.f. of Y given X = 1:

y             0        1        2
pY|X (y|1)  0.4769   0.3538   0.1692
Note that all probabilities in the table sum to 1 (except for round-off error), as this is a legitimate p.m.f.
Similarly, we obtain the conditional p.m.f. of Y for the other possible values of X. Each conditional p.m.f. of Y given X = x is displayed in a row of the following table:

X/Y      0        1        2      Total
 0     0.5742   0.2452   0.1806     1
 1     0.4769   0.3538   0.1692     1
 2     0.4800   0.2800   0.2400     1
 3     0.6000   0.4000   0.0000     1
We can also obtain the conditional p.m.f. of X for each possible value of Y . Each conditional p.m.f. of X given Y = y is displayed in a column of the following table, alongside the marginal p.m.f. of X:

X/Y      0        1        2      pX (x)
 0     0.6593   0.5429   0.6222    0.62
 1     0.2296   0.3286   0.2444    0.26
 2     0.0889   0.1000   0.1333    0.10
 3     0.0222   0.0286   0.0000    0.02
Total    1        1        1         1
Example 17. Return to Example 5. Recall that we determined that the p.m.f.'s of both X and Y were geometric with parameter p. Now we find the conditional p.m.f. of Y given X = x:
$$p_{Y|X}(y|x) = \frac{p_{X,Y}(x,y)}{p_X(x)} = \frac{p^2 (1-p)^{x+y-2}}{p(1-p)^{x-1}} = p(1-p)^{y-1}, \quad y \in \mathbb{N}.$$
Note that the conditional p.m.f.’s of Y given X = x are identical to each other and to the marginal p.m.f. of
Y . As such, knowing the value of the random variable X gives us no information about the probability
distribution of Y .
Proposition 5. (Conditional Probability Mass Functions as Probability Mass Functions) Let X and
Y be random variables defined on the sample space Ω and let x be a real number such that pX (x) > 0.
Then the conditional probability mass function of Y given X = x is a probability mass function.
Note. This says that conditional p.m.f.’s possess all the same properties and follow the same rules
as regular p.m.f.’s. For example, all conditional probabilities are non-negative, the conditional
probabilities P(Y = y|X = x) must sum to one over all values of Y , and we can also use the FPF to compute the conditional probability of any event.
Example 18. In Example 1, what is the probability that a household that owns one dog owns an odd
number of cats?
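Using the conditional p.m.f.'s of X given Y = 1 (the middle column of the table above):
$$P(X \in \{1, 3\} \mid Y = 1) = p_{X|Y}(1|1) + p_{X|Y}(3|1) = \frac{23}{70} + \frac{2}{70} = \frac{25}{70} \approx 0.3571.$$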
If we know the joint p.m.f. of X and Y , and we know the marginal distribution of X, we can find the conditional p.m.f. of Y given X = x:
$$p_{Y|X}(y|x) = \frac{p_{X,Y}(x,y)}{p_X(x)}, \quad \text{for all } x \text{ such that } p_X(x) > 0 \text{ and } y \in R_Y.$$
In some situations, we may know the marginal distribution of X and the conditional distribution of Y given X = x. In that case, we can rearrange the above definition to recover the joint p.m.f.
Proposition 6. (Multiplication Rule for Bivariate PMFs) Let X and Y be two discrete random variables defined on the sample space Ω with joint p.m.f. pX,Y (x, y). Then:
$$p_{X,Y}(x,y) = p_X(x)\, p_{Y|X}(y|x) = p_Y(y)\, p_{X|Y}(x|y).$$
Example 19. We will conduct repeated Bernoulli trials, each with success probability p, until we observe
the second success. Let X be the number of trials required to obtain the first success and let Y be the
total number of trials required to obtain the second success. Find the joint p.m.f. of X and Y , and use it to find the marginal p.m.f. of Y .
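Here X ∼ G(p), and given X = x, the number of additional trials Y − x needed for the second success is again geometric, so pY|X (y|x) = p(1 − p)^(y−x−1) for y > x. By the multiplication rule:
$$p_{X,Y}(x,y) = p_X(x)\, p_{Y|X}(y|x) = p(1-p)^{x-1} \cdot p(1-p)^{y-x-1} = p^2 (1-p)^{y-2}, \quad 1 \le x < y.$$
Summing over x then gives the marginal p.m.f. of Y :
$$p_Y(y) = \sum_{x=1}^{y-1} p^2 (1-p)^{y-2} = (y-1)\, p^2 (1-p)^{y-2}, \quad y \ge 2,$$
which is the p.m.f. of the N B(2, p) distribution.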
Example 20. You repeatedly flip a coin for which P (H) = p. Let X be the number of flips required to
get heads for the first time. If it takes x flips to get the first heads, we will then flip the coin an additional
x times and count Y , the number of heads in those additional tosses. Find the joint p.m.f. of X and Y .
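By the multiplication rule, with X ∼ G(p) and Y | X = x ∼ B(x, p):
$$p_{X,Y}(x,y) = p(1-p)^{x-1}\, \binom{x}{y} p^y (1-p)^{x-y}, \quad x \in \mathbb{N},\ y \in \{0, 1, \dots, x\},$$
and 0 otherwise.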
Conditional p.m.f.'s can be defined in the same way for more than two random variables. For example, for three discrete random variables X, Y and Z:
$$P(\{X = x\} \cap \{Y = y\} \mid \{Z = z\}) = \frac{P(\{X = x\} \cap \{Y = y\} \cap \{Z = z\})}{P(\{Z = z\})} = \frac{P(X = x, Y = y, Z = z)}{P(Z = z)} = \frac{p_{X,Y,Z}(x,y,z)}{p_Z(z)}$$
Similarly,
$$P(\{X = x\} \mid \{Y = y\} \cap \{Z = z\}) = \frac{P(\{X = x\} \cap \{Y = y\} \cap \{Z = z\})}{P(\{Y = y\} \cap \{Z = z\})} = \frac{P(X = x, Y = y, Z = z)}{P(Y = y, Z = z)} = \frac{p_{X,Y,Z}(x,y,z)}{p_{Y,Z}(y,z)}$$
Recall that two events A and B are independent if
$$P(B \mid A) = P(B).$$
For two discrete random variables X and Y , the events {X = x} and {Y = y} are independent if and only if
$$P(Y = y \mid X = x) = P(Y = y),$$
or equivalently,
$$P(X = x, Y = y) = P(X = x)\, P(Y = y).$$
Definition 7. Let X and Y be two discrete random variables defined on the sample space Ω. Then X and Y are independent random variables if
$$P(X \in A, Y \in B) = P(X \in A)\, P(Y \in B)$$
for any subsets $A, B \subseteq \mathbb{R}$.
In terms of the joint and marginal p.m.f.'s, independence can be characterized as follows.
Proposition 7. (Joint Probability Mass Function of Two Independent Random Variables) Let X and Y be two discrete random variables defined on the sample space Ω. Then X and Y are independent if and only if
$$p_{X,Y}(x,y) = p_X(x)\, p_Y(y) \quad \text{for all } x, y \in \mathbb{R}.$$
If we display the joint p.m.f. in tabular form, then X and Y are independent if and only if the probability
in each cell (pX,Y (x, y)) is equal to the product of the row total (pX (x)) and the column total (pY (y)).
Example 22. The joint p.m.f. of two discrete random variables X and Y is displayed in the table below:
X/Y 1 2 3 pX (x)
Since pX,Y (x, y) = pX (x)pY (y) for all x ∈ RX and y ∈ RY , X and Y are independent.
The conditional p.m.f.’s of Y for each value of X are displayed in the rows in the following table:
X/Y 1 2 3 Total
We see that when X and Y are independent, all rows in this table are equivalent. That is, the conditional
p.m.f.’s of Y given X are the same as one another, and are all the same as the marginal p.m.f. of Y .
The conditional p.m.f.’s of X for each value of Y are displayed in the columns in the following table:
X/Y 1 2 3 pX (x)
Total 1 1 1 1
We see that when X and Y are independent, all columns in this table are equivalent. That is, the conditional
p.m.f.’s of X given Y are the same as one another, and are all the same as the marginal p.m.f. of X.
We had two equivalent definitions of independence of two events. The first (the multiplication rule) led to our previous definition of independent discrete random variables. The second (the conditional probability definition) leads to the following equivalent characterization:
Proposition 8. (Independence and Conditional p.m.f.’s) Two discrete random variables X and Y
with joint probability mass function pX,Y (x, y) are independent if and only if
pY|X (y|x) = pY (y) for all (x, y) ∈ RXY , or, equivalently, pX|Y (x|y) = pX (x) for all (x, y) ∈ RXY .
Returning to Example 5: we found the marginal distributions, X ∼ G(p) and Y ∼ G(p), and in Example 17, we found that
$$p_{Y|X}(y|x) = p(1-p)^{y-1} = p_Y(y) \quad \text{for all } (x, y) \in R_{XY}.$$
It can also be shown that pX|Y (x|y) = pX (x) (which must be true if the above is true). Therefore, X and Y are independent. We can also show independence of X and Y using our first definition:
$$p_{X,Y}(x,y) = p^2 (1-p)^{x+y-2} = \left[p(1-p)^{x-1}\right]\left[p(1-p)^{y-1}\right] = p_X(x)\, p_Y(y) \quad \text{for all } (x, y) \in R_{XY}.$$
Proposition 9. (Joint Probability Mass Function of Several Independent Random Variables) Let X1 , X2 , . . . , Xm be m discrete random variables defined on the sample space Ω. Then X1 , . . . , Xm are independent if and only if
$$p_{X_1, X_2, \dots, X_m}(x_1, x_2, \dots, x_m) = p_{X_1}(x_1)\, p_{X_2}(x_2) \cdots p_{X_m}(x_m) \quad \text{for all } x_1, \dots, x_m.$$
Proposition 10. (Probability Mass Function of a Function of Two Random Variables) Let X and Y
be two discrete random variables defined on the same sample space Ω and let g be a real-valued function
of two variables defined on the range of (X, Y ). Then the p.m.f. of the random variable Z = g(X, Y )
is
$$p_Z(z) = \begin{cases} \displaystyle\sum_{(x,y) \in R_{XY}:\, g(x,y) = z} p_{X,Y}(x,y) & \text{if } z \in R_Z,\\[1ex] 0 & \text{otherwise.} \end{cases}$$
Note. In other words, if Z is a function of X and Y , then P(Z = z) is equal to the sum of the joint probabilities pX,Y (x, y) over all pairs (x, y) for which g(x, y) = z.
Example 27. A market has both an express checkout line and a regular checkout line. Let X and Y
denote the number of customers in line at the express checkout and the regular checkout, respectively, at
a given time. Suppose the joint p.m.f. of X and Y is given by the following table:
X/Y 0 1 2 3
Let Z = |X − Y | be the absolute difference between the number of customers in the express lane and the regular lane. We want to find the p.m.f. of Z.
First we note the range of Z is RZ = {0, 1, 2, 3, 4}. Now, for each value of Z, we determine which pairs (X, Y ) produce that value when we take the absolute difference. We can make a table to help us (the rows for z = 0, 3 and 4 are constructed in the same way):

z   pairs (x, y) with |x − y| = z                     probabilities                              pZ (z)
1   (0,1), (1,0), (1,2), (2,1), (2,3), (3,2), (4,3)   0.02, 0.02, 0.08, 0.10, 0.07, 0.12, 0.03   0.44
2   (0,2), (1,3), (2,0), (3,1), (4,2)                 0.03, 0.05, 0.02, 0.09, 0.02               0.21
So we can write the p.m.f. of Z = |X − Y | using the first and last column of the above table:
$$p_Z(z) = \begin{cases} 0.29 & \text{if } z = 0,\\ 0.44 & \text{if } z = 1,\\ 0.21 & \text{if } z = 2,\\ 0.05 & \text{if } z = 3,\\ 0.01 & \text{if } z = 4,\\ 0 & \text{otherwise.} \end{cases}$$
Note. The p.m.f. of a function of more than two variables can be found analogously.
We now examine the special case where our function of interest is the sum of two discrete random variables.
Example 28. In Example 27, let Z = X + Y be the total number of customers in the two lines. We want
to find the p.m.f. of Z. The range of Z is RZ = {0, 1, 2, 3, 4, 5, 6, 7}. The probability there are three total customers is
$$P(Z = 3) = P(X + Y = 3) = p_{X,Y}(0,3) + p_{X,Y}(1,2) + p_{X,Y}(2,1) + p_{X,Y}(3,0) = 0.02 + 0.08 + 0.10 + 0.01 = 0.21.$$
We find pZ (z) for all other values of z ∈ RZ similarly. The p.m.f. of Z is shown below:
z 0 1 2 3 4 5 6 7
We used the law of partitions (partitioning the event Z = z over all values of X) to find the p.m.f. of Z = X + Y . By fixing the value of one random variable, we can change the expression X + Y = z from being a function of two variables to being a function of just one. We summarize the technique as follows:
Proposition 11. (p.m.f. of the Sum of Two Discrete Random Variables) Let X and Y be two discrete
random variables defined on the sample space Ω. Then the probability mass function of Z = X + Y
satisfies
$$p_Z(z) = \sum_{x \in R_X} p_{X,Y}(x, z - x), \quad z \in R_Z.$$
Proposition 12. (p.m.f. of the Sum of Two Independent Discrete Random Variables) Let X and Y be independent discrete random variables defined on the sample space Ω. Then the p.m.f. of the random variable Z = X + Y is given by the convolution formula
$$p_Z(z) = \sum_{x \in R_X} p_X(x)\, p_Y(z - x), \quad z \in R_Z.$$
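For instance, the convolution formula recovers the familiar triangular p.m.f. of the sum of two fair six-sided dice; the Python sketch below (the fair dice are an illustrative assumption, not one of the examples above) computes it exactly with fractions:

```python
from fractions import Fraction

# p.m.f. of one fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

# p.m.f. of Z = X + Y for independent X, Y via the convolution formula:
# p_Z(z) = sum over x of p_X(x) * p_Y(z - x).
p_Z = {z: sum(die[x] * die.get(z - x, Fraction(0)) for x in die)
       for z in range(2, 13)}

print(p_Z[7])             # 1/6: the most likely total
print(sum(p_Z.values()))  # 1: a valid p.m.f.
```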
Proposition 13. (p.m.f.'s of Sums of Some Common Discrete Random Variables) Let X1 , . . . , Xm be m independent random variables. Then:
(i) if Xi ∼ B(ni , p), then $\displaystyle\sum_{i=1}^{m} X_i \sim B\!\left(\sum_{i=1}^{m} n_i,\ p\right)$;
(ii) if Xi ∼ P(λi ), then $\displaystyle\sum_{i=1}^{m} X_i \sim P\!\left(\sum_{i=1}^{m} \lambda_i\right)$;
(iii) if Xi ∼ N B(ri , p), then $\displaystyle\sum_{i=1}^{m} X_i \sim N B\!\left(\sum_{i=1}^{m} r_i,\ p\right)$.
We will prove Proposition 13 (i) for the case of m = 2. We have two independent random variables X ∼ B(n1 , p) and Y ∼ B(n2 , p). Let Z = X + Y . Then:
$$\begin{aligned} P(Z = z) = P(X + Y = z) &= \sum_{x=0}^{z} p_X(x)\, p_Y(z - x) && \text{by the Convolution Formula}\\ &= \sum_{x=0}^{z} \binom{n_1}{x} p^x (1-p)^{n_1 - x} \binom{n_2}{z - x} p^{z-x} (1-p)^{n_2 - (z - x)}\\ &= p^z (1-p)^{n_1 + n_2 - z} \sum_{x=0}^{z} \binom{n_1}{x}\binom{n_2}{z - x}\\ &= \binom{n_1 + n_2}{z} p^z (1-p)^{n_1 + n_2 - z}, \end{aligned}$$
so that Z ∼ B(n1 + n2 , p).
Note. In the last line of the proof, we used Vandermonde's Identity, which states that
$$\sum_{j=0}^{k} \binom{m}{j}\binom{n}{k - j} = \binom{m + n}{k}.$$
This identity states that any combination of k objects from a group of (m + n) objects must have some 0 ≤ j ≤ k objects from the group of m objects, and the remaining (k − j) objects from the group of n objects.
We will not formally prove this identity. However, you can verify it with an example. For example, suppose m = 5, n = 4 and k = 3. Vandermonde's Identity states (and you can verify) that
$$\binom{9}{3} = \binom{5}{0}\binom{4}{3} + \binom{5}{1}\binom{4}{2} + \binom{5}{2}\binom{4}{1} + \binom{5}{3}\binom{4}{0}.$$
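This check is easy to automate; the short Python sketch below (plain Python, using the standard-library function math.comb) confirms the m = 5, n = 4, k = 3 case and can be re-run with any other values:

```python
from math import comb

# Check Vandermonde's Identity for m = 5, n = 4, k = 3.
m, n, k = 5, 4, 3
lhs = sum(comb(m, j) * comb(n, k - j) for j in range(k + 1))
print(lhs, comb(m + n, k))  # 84 84
```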
We will now prove Proposition 13 (ii) for the case of m = 2. We have two independent random variables X ∼ P(λ1 ) and Y ∼ P(λ2 ). Let Z = X + Y . We find the p.m.f. of Z as follows:
$$\begin{aligned} P(Z = z) &= \sum_{x=0}^{z} p_X(x)\, p_Y(z - x) && \text{by the Convolution Formula}\\ &= \sum_{x=0}^{z} \frac{\lambda_1^x e^{-\lambda_1}}{x!} \cdot \frac{\lambda_2^{z-x} e^{-\lambda_2}}{(z - x)!}\\ &= e^{-(\lambda_1 + \lambda_2)} \sum_{x=0}^{z} \frac{\lambda_1^x \lambda_2^{z-x}}{x!\,(z - x)!}\\ &= e^{-(\lambda_1 + \lambda_2)}\, \frac{1}{z!} \sum_{x=0}^{z} \frac{z!}{x!\,(z - x)!}\, \lambda_1^x \lambda_2^{z-x}\\ &= e^{-(\lambda_1 + \lambda_2)}\, \frac{1}{z!} \sum_{x=0}^{z} \binom{z}{x} \lambda_1^x \lambda_2^{z-x}\\ &= \frac{(\lambda_1 + \lambda_2)^z\, e^{-(\lambda_1 + \lambda_2)}}{z!} && \text{by the Binomial Theorem, } (x + a)^n = \sum_{k=0}^{n}\binom{n}{k} x^k a^{n-k}, \end{aligned}$$
so that Z ∼ P(λ1 + λ2 ).
Example 29. You have two unbalanced coins – one gold and one silver. When flipped, each of them
lands on heads 60% of the time. If you flip the gold coin 10 times and the silver coin 5 times, what is the probability distribution of the total number of heads you observe?
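Since both coins share the same success probability, Proposition 13 (i) gives that the total number of heads is B(10 + 5, 0.6) = B(15, 0.6). A small Python sketch verifying this numerically via the convolution formula:

```python
from math import comb

def binom_pmf(n, p, x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

p = 0.6
# p.m.f. of Z = X + Y, with X ~ B(10, 0.6) and Y ~ B(5, 0.6), by convolution.
for z in range(16):
    conv = sum(binom_pmf(10, p, x) * binom_pmf(5, p, z - x)
               for x in range(max(0, z - 5), min(10, z) + 1))
    assert abs(conv - binom_pmf(15, p, z)) < 1e-12  # matches B(15, 0.6)
print("convolution agrees with B(15, 0.6)")
```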
Example 30. Suppose the number of goals X1 scored by the Winnipeg Jets follows a Poisson distribution
with a rate of 3.2 per game. The number of goals X2 scored by the Arizona Coyotes follows a Poisson
distribution with a rate of 2.6 per game. Assuming the variables are independent, what is the probability
that a total of seven goals will be scored in the next Jets vs. Coyotes game?
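By Proposition 13 (ii), the total number of goals Z = X1 + X2 ∼ P(3.2 + 2.6) = P(5.8), so
$$P(Z = 7) = \frac{5.8^7\, e^{-5.8}}{7!} \approx 0.1326.$$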
Example 31. You and two friends each roll a fair six-sided die until each of you has observed a 6 for the
second time. What is the probability that the total number of rolls is 30?
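Each person's number of rolls is N B(2, 1/6), so by Proposition 13 (iii) the total number of rolls T is N B(6, 1/6). Therefore
$$P(T = 30) = \binom{29}{5}\left(\frac{1}{6}\right)^{6}\left(\frac{5}{6}\right)^{24} \approx 0.032.$$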