
Math 501 Assignment 5: due Monday 11/4 at 11:59 p.m., submitted via Canvas in pdf format.

Scoring: This assignment is worth 165 points (40 + 20 + 10 + 30 + 45 + 10 + 10 = 165).

1 Academic integrity and collaboration policy for this assignment

You may work together on the reading assignment, which includes dozens of
solved problems, examples, proofs, detailed discussions, theorems, etc., and you
are encouraged to do so, but you must complete the written assignment entirely
on your own. Learn the material well with the reading assignment; prove you
have learned the material well with the written assignment.
You may consult our TAs or me regarding questions about written exercises
if you have worked hard on them and are very stuck, but you may not consult
anyone else until after all parties have submitted the assignment.
You are to figure the exercises out entirely on your own.
If you find solutions to one or more of the written exercises in Weiss’s book,
in Hsu’s book, or in documents I myself have previously posted to Canvas for
our course this semester, you may make use of them, but the write-up should
be your own, and you must cite the source.
If you find solutions to one or more of the written exercises in any source
other than Weiss’s book, Hsu’s book, or documents I myself have previously
posted to Canvas for our course this semester, you may not make use of them
at all, and doing so would be an academic integrity violation.

2 Reading assignment
Class notes: Read the class notes up to and including Chapter 38 (Chapters 1 through 30 were previously assigned); at the time this is being
posted, not all of that material has been posted, but it will be posted
within two days.
Weiss textbook: Chapter 5, and Sections 8.1, 8.2, 8.3, 8.4, and 8.5 (Chap-
ters 1, 2, 3, and 4 were previously assigned.)

Schaum’s book (Hwei Hsu): Skip the “notes” (use our notes and Weiss’s
textbook). However, carefully work through exercises 2.1 through 2.22
(all except 2.18). Do not turn these in, but make sure you understand the
details of the calculations and how to solve these exercises.

3 Written Assignment (to be turned in)
3.1 Exercise 1 (40 points – 5 points per part) (Bernoulli trials)
3.1.1 The exercises
a. Textbook exercise 5.94a.

b. Textbook exercise 5.94b.

c. Textbook exercise 5.94c.

d. Textbook exercise 5.96.

e. Textbook exercise 5.42d.

f. Textbook exercise 5.43b. (you’ll have to figure out part a. first)

g. Textbook exercise 5.117b.

h. Textbook exercise 5.117c.

3.2 Exercise 2 (20 points – 5 points per part) (discrete random variables)
3.2.1 The exercises
An experiment consists of flipping a coin four times. We can represent this
experiment with a classical probability model $(\Omega, \mathcal{F}, P)$, where
$$\Omega = \{TTTT,\ TTTH,\ TTHT,\ TTHH,\ THTT,\ THTH,\ THHT,\ THHH,\ HTTT,\ HTTH,\ HTHT,\ HTHH,\ HHTT,\ HHTH,\ HHHT,\ HHHH\}.$$

Define a function
$$X : \Omega \to \mathbb{R}$$
as follows: for each $\omega \in \Omega$, let $X(\omega)$ represent the number of tails (i.e., of $T$s)
in $\omega$ minus the number of heads (i.e., of $H$s) in $\omega$. For instance,

$$X(HHHT) = 1 - 3 = -2.$$

a) Complete the following table:

$\omega$    $X(\omega)$
TTTT ?
TTTH ?
TTHT ?
TTHH ?
THTT ?
THTH ?
THHT ?
THHH ?
HTTT ?
HTTH ?
HTHT ?
HTHH ?
HHTT ?
HHTH ?
HHHT ?
HHHH ?

b) Prove that X is a simple random variable.

c) Find $\{X \geq 1\}$.

d) Find $p_X$, the pmf of $X$. Make sure you explicitly define it for all real numbers
(as in the class notes; Weiss and Hsu do not do this consistently, but we will).

3.3 Exercise 3 (10 points – 5 points per part) (hypergeometric random variables and random sampling without replacement)
3.3.1 The exercises
In each part of this exercise, identify and define an appropriate random variable
$X$ in words. Explain why it is hypergeometric. Find its parameters (with
explanation). Then use it to calculate the indicated probabilities.
a) Textbook exercise 5.68a.

b) Textbook exercise 5.68b.

3.4 Exercise 4 (30 points – 5 points per part) (negative binomial and binomial random variables)
3.4.1 Background
First, suppose that we have Bernoulli trials with success probability $0.7$.
Let $X$ be the number of successes in 20 trials, so that $X \sim B(20, 0.7)$, and
let $Y$ be the number of the trial on which the 12th success occurs, so that
$Y \sim NB(12, 0.7)$.
The event $\{X < 12\}$ occurs precisely when there are fewer than 12 successes
in the 20 trials.
The event $\{Y > 20\}$ occurs precisely when the 12th success occurs after the
20th trial, which occurs precisely when the number of successes in the first 20
trials is less than 12.
Thus, $\{X < 12\} = \{Y > 20\}$: they are exactly the same event. Since these
are the same event, their probabilities must be the same. Thus,

$$P(\{X < 12\}) = P(\{Y > 20\}).$$

The complementary events $\{X \geq 12\}$ and $\{Y \leq 20\}$ must also be the same
as one another and must therefore also have the same probability:

$$P(\{X \geq 12\}) = P(\{Y \leq 20\}).$$

More generally, we have the following theorem:

Theorem 1 Suppose that $X \sim B(n, p)$, where $n$ is a positive integer and $p \in (0, 1)$. Suppose that $Y \sim NB(r, p)$, where $r$ is a positive integer, and $p$ is as
above. Then $\{X < r\} = \{Y > n\}$, so $P(\{X < r\}) = P(\{Y > n\})$, and also
$\{X \geq r\} = \{Y \leq n\}$, so $P(\{X \geq r\}) = P(\{Y \leq n\})$.

Consider the baseball world series example from the class notes. We solved
that example in two ways: using a negative binomial random variable, and using
a binomial random variable.
Let $R$ be the random variable from the solution using the negative binomial
random variable; i.e., $R \sim NB(4, 2/3)$.
Let $S$ be the random variable from the solution using the binomial random
variable; i.e., $S \sim B(7, 2/3)$.
Let $r = 4$, $n = 7$. The theorem stated above implies that $\{S \geq 4\} = \{R \leq 7\}$.
This is the event that Team 1 wins the series. The theorem above shows that
we can find that probability by calculating either $P(\{S \geq 4\})$ or $P(\{R \leq 7\})$.
This is precisely what we did in class (with different variable letters).
You can use the theorem above to go (easily) from one solution to a problem
of this sort using either a binomial or negative binomial random variable to
another solution to the problem using the other type of random variable. For
example, suppose you figured out that the event that Team 1 wins is $\{S \geq 4\}$.
The theorem above would tell you also that this event is $\{R \leq 7\}$, and so you
would now have two ways to calculate the probability that Team 1 wins.
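
If you would like to sanity-check the theorem numerically, here is a minimal sketch (assuming Python with scipy; this is optional and not part of the assignment). Note that scipy's nbinom counts the number of failures before the $r$th success rather than the trial number of the $r$th success, so the event $\{R \leq 7\}$ must be translated accordingly:

```python
# Numerical check of Theorem 1 for the world series example: P({S >= 4})
# from S ~ B(7, 2/3) should equal P({R <= 7}) from R ~ NB(4, 2/3).
from scipy import stats

p, r, n = 2 / 3, 4, 7

# P({S >= 4}) = 1 - P({S <= 3}) for S ~ B(7, 2/3).
binom_prob = 1 - stats.binom.cdf(r - 1, n, p)

# scipy's nbinom counts failures before the r-th success, so the event
# "4th success on or before trial 7" is "at most n - r = 3 failures first".
nbinom_prob = stats.nbinom.cdf(n - r, r, p)

print(binom_prob, nbinom_prob)  # both are 1808/2187, about 0.8267
```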

3.4.2 The exercises


a. Every year, many of the world’s best female tennis players compete at Wimbledon, in England. A match consists of 3 sets, the winner being the first player
to win 2 sets.
Players A and B are playing in a women’s match. For any given set, player
A has a 0.60 probability of winning, and player B has a 0.40 probability of
winning.
Find the probability that player A wins the match by making use of negative
binomial random variables.
Then use the theorem above to convert the probability that player A wins
the match into a corresponding probability involving binomial random variables,
and calculate that probability by using the Fundamental Probability Formula.
Of course, your answers should be the same.

b. Every year, many of the world’s best male tennis players compete at Wim-
bledon, in England. A match consists of 5 sets, the winner being the first player
to win 3 sets.
Players A and B are playing in a men’s match. For any given set, player
A has a 0.60 probability of winning, and player B has a 0.40 probability of
winning.
Find the probability that player A wins the match by making use of binomial
random variables.
Then use the theorem above to convert the probability that player A wins
the match into a corresponding probability involving negative binomial random
variables, and calculate that probability by using the Fundamental Probability
Formula. Of course, your answers should be the same.

c. Read Exercise 5.94 in the textbook. We can consider the baseball player’s
at-bats to be Bernoulli trials, in which a “success” means getting a hit and
in which a “failure” means failing to get a hit. We are given that the success
probability is $p = 0.260$.
Let $X$ be the number of the at-bat on which the player gets the second hit.
Find the probability that the player gets his second hit after his seventh at-bat.
Give your answer accurate to 4 decimal places (a numerical approximation
is fine). Use negative binomial random variables.

d. Read Exercise 5.94 in the textbook. We can consider the baseball player’s
at-bats to be Bernoulli trials, in which a “success” means getting a hit and
in which a “failure” means failing to get a hit. We are given that the success
probability is $p = 0.260$.
Let $X$ be the number of the at-bat on which the player gets the second hit.
Find the probability that the player gets his second hit after his seventh at-bat.
Give your answer accurate to 4 decimal places (a numerical approximation
is fine). Use binomial random variables.

e. Solve Exercise 5.129 part a from the textbook.

f. Solve Exercise 5.129 part b from the textbook. Note that this provides
another example of how to use a probabilistic argument to prove an otherwise
extremely difficult identity.

3.5 Exercise 5 (45 points – 5 points per part) (random
variable approximations)
3.5.1 Background
A very useful approximation The following theorem is from our class notes:

Theorem 2 Suppose $m$ is a non-negative integer. Then
$$\lim_{M \to \infty} \frac{m!}{M^m} \binom{M}{m} = 1.$$

Consequently, we can use the approximation
$$\binom{M}{m} \approx \frac{M^m}{m!}$$
provided $M$ is sufficiently large compared to $m$ (equivalently, provided $m/M$ is
close enough to zero), with the idea that the error disappears in the limit because
of the theorem above.
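
A quick numerical illustration of Theorem 2 (a plain-Python sketch, assuming version 3.8+ for math.comb; the choice $m = 3$ is arbitrary):

```python
# The ratio m! * C(M, m) / M^m should approach 1 as M grows, for fixed m.
from math import comb, factorial

m = 3
for M in (10, 100, 1000, 10000):
    print(M, factorial(m) * comb(M, m) / M**m)
# Prints ratios 0.72, 0.9702, 0.997002, 0.99970002: the error shrinks as
# m/M approaches zero, exactly as the theorem predicts.
```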

Binomial approximation to the hypergeometric Suppose $X \sim H(N, n, p)$,
where $N$, $n$, and $p$ satisfy the usual conditions. Hold $n$ and $p$ fixed. Then, as
$N \to \infty$, we also have $Np \to \infty$ and $N(1-p) \to \infty$ (since $p \in (0, 1)$), and so,
as $N \to \infty$, we have
$$\frac{x}{Np} \to 0, \qquad \frac{n-x}{N(1-p)} \to 0, \qquad \frac{n}{N} \to 0.$$
We may therefore use the approximation above on each of the combinations
in $p_X$. For each $x = 0, 1, 2, \ldots, n$,
$$\begin{aligned}
\lim_{N \to \infty} p_X(x)
&= \lim_{N \to \infty} \frac{\binom{Np}{x}\binom{N(1-p)}{n-x}}{\binom{N}{n}}
= \lim_{N \to \infty} \frac{\frac{(Np)^x}{x!} \cdot \frac{(N(1-p))^{n-x}}{(n-x)!}}{\frac{N^n}{n!}} \\
&= \lim_{N \to \infty} \binom{n}{x} p^x (1-p)^{n-x} \quad \text{(after canceling terms)} \\
&= \binom{n}{x} p^x (1-p)^{n-x},
\end{aligned}$$
and for each $x \notin \{0, 1, 2, \ldots, n\}$ we have
$$\lim_{N \to \infty} p_X(x) = \lim_{N \to \infty} 0 = 0.$$
We have shown that, for all real $x$, the limit of $p_X(x)$ equals the pmf of a
$B(n, p)$ random variable. Thus, $H(N, n, p)$ random variables become $B(n, p)$
random variables in the limit as $N \to \infty$, holding $n$ and $p$ fixed.
In practice, $N$ is fixed, hence not going to infinity, so we will get an approximation, not an exact result. Provided $n/N \leq 0.05$, a condition called “the five
percent condition,” this approximation (of a hypergeometric random variable by
a binomial one) will be acceptable to many people, provided proper procedures
(see below) are followed.
We will use the five percent condition as our cutoff condition for determining when it is reasonable to approximate probability calculations involving a
hypergeometric random variable with similar calculations involving a binomial
one with the same $n$ and $p$, regardless of what the random variables represent.
Often, this approximation is used when dealing with random sampling without
replacement. The number of successes in random sampling without replacement
is hypergeometric, while the number of successes in random sampling with replacement is binomial, so the binomial approximation to the hypergeometric
effectively lets us approximate random sampling without replacement by random sampling with replacement in the situation where the sample size is no
more than five percent of the population size.
Here is the procedure to follow. Suppose $W \sim H(N, n, p)$, and suppose that
we wish to calculate a probability, say $P(\{W \in E\})$ for some Borel subset $E$ of
$\mathbb{R}$. Provided $n/N \leq 0.05$, we can introduce a new random variable $X \sim B(n, p)$
(for the same values of $n$ and $p$), and we can calculate $P(\{X \in E\})$ instead, as
an approximation to $P(\{W \in E\})$. It should always be made clear what the
exact random variable is (in this case, $W$), it should always be made clear that
this approximation is being used, and it should always be made clear why this
approximation is reasonable.
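
Here is a hedged sketch of this procedure in Python (scipy assumed; the population and sample numbers are invented for illustration, not taken from the textbook). Note that scipy parameterizes the hypergeometric by (population size, number of successes in the population, sample size):

```python
from scipy import stats

N, n, p = 1000, 40, 0.3                  # n/N = 0.04 <= 0.05: condition holds
W = stats.hypergeom(N, int(N * p), n)    # W ~ H(N, n, p): 300 successes in 1000
X = stats.binom(n, p)                    # X ~ B(n, p), the approximating RV

exact = W.cdf(10)      # P({W <= 10}), exact hypergeometric value
approx = X.cdf(10)     # P({X <= 10}), binomial approximation
print(exact, approx)   # the two values agree to roughly two decimal places
```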

Another useful approximation Recall from calculus (this result can be
proven using logarithms and L’Hopital’s Rule) that
$$\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x \quad \text{for each real } x.$$
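
A short numerical reminder of this limit (plain Python; $x = -2$ is an arbitrary test value):

```python
from math import exp

x = -2.0
for n in (10, 100, 1000, 100000):
    print(n, (1 + x / n) ** n)   # approaches e^x from below here
print("e^x =", exp(x))           # e^{-2} is about 0.135335
```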

Poisson random variables A discrete random variable $X$ is called a Poisson
random variable with parameter $\lambda \in (0, \infty)$, and we write $X \sim P(\lambda)$,
provided its pmf $p_X$ satisfies
$$p_X(x) = e^{-\lambda} \frac{\lambda^x}{x!}$$
whenever $x$ is a non-negative integer, and $p_X(x) = 0$ otherwise.

Poisson approximation to the binomial Suppose $\lambda > 0$ is fixed, suppose
$n$ is a positive integer, and let $p = \lambda/n$ (which guarantees that $p \in (0, 1)$).
For each $x \in \{0, 1, 2, \ldots, n\}$, we have $0 \leq x \leq n$, and thus $x/n \to 0$ as
$n \to \infty$. This will allow us to use the approximation $\binom{n}{x} \approx \frac{n^x}{x!}$:
$$\begin{aligned}
\lim_{n \to \infty} p_X(x)
&= \lim_{n \to \infty} \binom{n}{x} p^x (1-p)^{n-x} \\
&= \lim_{n \to \infty} \frac{n^x}{x!} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x} \\
&= \lim_{n \to \infty} \frac{\lambda^x}{x!} \left(1 - \frac{\lambda}{n}\right)^{n-x} \quad \text{(after canceling the } n^x \text{ terms)} \\
&= \lim_{n \to \infty} \frac{\lambda^x}{x!} \left(1 - \frac{\lambda}{n}\right)^n \left(1 - \frac{\lambda}{n}\right)^{-x} \\
&= \frac{\lambda^x}{x!} \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^n \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{-x} \\
&= \frac{\lambda^x}{x!} \cdot e^{-\lambda} \cdot 1 = e^{-\lambda} \frac{\lambda^x}{x!}.
\end{aligned}$$
For each $x \notin \{0, 1, 2, \ldots, n\}$, we have
$$\lim_{n \to \infty} p_X(x) = \lim_{n \to \infty} 0 = 0.$$

We have shown that, for all real $x$, the limit of $p_X(x)$ equals the pmf of a
$P(\lambda)$ random variable, with $\lambda = np$. Thus, $B(n, p)$ random variables become
$P(\lambda) = P(np)$ random variables in the limit as $n \to \infty$, holding $\lambda = np$ fixed.
In practice, $n$ is fixed, hence not going to infinity, so we will get an approximation, not an exact result. In the case of a Poisson approximation to a
binomial random variable, we have good information about the size of the error:

Theorem 3 (error bounds for Poisson approximation to binomial random variables) Let $n$ be a positive integer, and suppose $p \in (0, 1)$. Suppose
that $X \sim B(n, p)$, and that $Y \sim P(\lambda)$, where $\lambda = np$. Suppose that $E$ is any
Borel subset of $\mathbb{R}$. Then
$$|P(\{X \in E\}) - P(\{Y \in E\})| \leq np^2.$$

Thus, the error in approximating any probability involving a $B(n, p)$ random
variable with a $P(\lambda)$ random variable, with $\lambda = np$, is at most $np^2$.
When $p$ is close enough to $0$, so close to $0$ that $np^2$ is small enough for our
purposes, a Poisson approximation will be adequate.
We want $np^2$ to be a small number absolutely, and we also want it to be small
relative to the size of any probabilities of interest, so that both the absolute and
relative errors will be small. For example, if we are estimating a probability
that is roughly $0.06$, then using a Poisson approximation to the binomial
with $n = 100$ and $p = 0.02$ (which gives $np^2 = 0.04$) would not be a good idea.
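
A sketch of checking Theorem 3 empirically (scipy assumed; the event $E = \{0, 1, 2, 3\}$ is an arbitrary choice):

```python
from scipy import stats

n, p = 100, 0.02
X = stats.binom(n, p)
Y = stats.poisson(n * p)         # lambda = np = 2

err = abs(X.cdf(3) - Y.cdf(3))   # error for the event E = {X <= 3}
print(err, "<=", n * p**2)       # observed error is well below the bound 0.04
```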

When $p$ is close to $1$, $np^2$ will be too large to be useful (it will easily exceed
$1$!), but perhaps $n(1-p)^2$ will be small. If $n(1-p)^2$ is small enough, then
we can proceed as follows: introduce a new random variable $X^*$ which counts
the number of failures in $n$ trials (whereas $X$ counts the number of successes
in $n$ trials), i.e., $X^* = n - X$. Then $X^* \sim B(n, 1-p)$, since the failure
probability is $1-p$. We can then approximate $X^*$ using a Poisson random
variable provided $n(1-p)^2$ is small enough for our purposes. We can then
translate results involving $X^*$ into results involving $X$ at the end.

Example 4 Suppose that we have Bernoulli trials with success probability $0.99$,
and suppose that we wish to calculate the probability that there will be at least 10
successes in 12 trials. Let $X$ denote the number of successes in those 12 trials.
Then $X \sim B(12, 0.99)$. Clearly, $np^2$ is too large for a Poisson approximation.
However, we can let $X^*$ denote the number of failures in those 12 trials. Of
course, $X^* = 12 - X$, so that having at least 10 successes is equivalent to having
at most 2 failures, so that
$$\{X \geq 10\} = \{X^* \leq 2\}.$$
We have $X^* \sim B(12, 0.01)$. Here, $n(p^*)^2$ is quite small, so we can use a Poisson
random variable $Y \sim P(0.12)$ to approximate probabilities involving $X^*$. In
particular, we have
$$P(\{X \geq 10\}) = P(\{X^* \leq 2\}) \approx P(\{Y \leq 2\}),$$
which we can easily calculate.
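
A sketch of Example 4's calculation (scipy assumed), comparing the Poisson approximation with the exact binomial answer:

```python
from scipy import stats

n, p = 12, 0.99
# X* = number of failures ~ B(12, 0.01), and {X >= 10} = {X* <= 2}.
exact = stats.binom.cdf(2, n, 1 - p)        # P({X* <= 2}), exact
approx = stats.poisson.cdf(2, n * (1 - p))  # P({Y <= 2}) with Y ~ P(0.12)
print(exact, approx)
# Both are about 0.9997; Theorem 3 guarantees the gap is at most
# n(1-p)^2 = 0.0012.
```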

It is also possible to combine approximations. For instance, we might have an
$H(N, n, p)$ random variable, with $n/N \leq 0.05$. We can approximate that with
a $B(n, p)$ random variable. Provided $np^2$ is small enough (or $n(1-p)^2$
is small enough), we can use a Poisson approximation to the $B(n, p)$ random
variable, as explained above.
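
A sketch of chaining the two approximations (scipy assumed; all numbers invented so that both cutoff conditions hold):

```python
from scipy import stats

N, n, p = 10000, 400, 0.005   # n/N = 0.04 and n*p^2 = 0.01: both conditions hold
hyper   = stats.hypergeom(N, int(N * p), n).cdf(1)  # P({W <= 1}), exact
binom   = stats.binom(n, p).cdf(1)                  # binomial approximation
poisson = stats.poisson(n * p).cdf(1)               # Poisson approximation, lambda = 2
print(hyper, binom, poisson)  # all three values are close to one another
```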

3.5.2 The exercises


a. Solve textbook exercise 5.64d using an appropriate random variable approx-
imation. Be sure to explain why that approximation is reasonable.

b. Textbook exercise 5.64e

c. Textbook exercise 5.64g

d. Textbook exercise 5.68c

e. Textbook exercise 5.78c.

f. Textbook exercise 5.79c.

g. Textbook exercise 5.80b.

h. Textbook exercise 5.84.

i. Suppose $X \sim B(10, 0.99)$. Use an appropriate procedure involving Poisson
random variables to estimate $P(\{4.2 \leq X \leq 7.4\})$. Also estimate this using the
binomial random variable directly. Are the answers close?

3.6 Exercise 6 (10 points – 5 points per part) (functions of discrete RVs)
3.6.1 Background
Some general function theory (functions, images of sets, inverse images of sets, and invertibility) This is very general information about functions in pretty much any field of mathematics. Throughout this section, suppose
that $g : S \to T$, by which we mean that $g$ is a function which takes an element,
say $x$, of $S$ and maps it into an element, $g(x)$, in the set $T$.
Suppose $A \subseteq S$, and suppose $B \subseteq T$.
The image of the set $A$ under the function $g$, which is denoted $g(A)$,
is defined as follows:
$$g(A) = \{g(x) : x \in A\}.$$
It is a subset of $T$. An important special case is when $A = S$:
$$g(S) = \{g(x) : x \in S\} = \operatorname{range}(g).$$
The inverse image of the set $B$ under the function $g$, which is denoted
$g^{-1}(B)$, or $\{g \in B\}$, is defined for subsets of $T$ (not for elements of $T$) as
follows:
$$g^{-1}(B) = \{g \in B\} = \{x \in S : g(x) \in B\}.$$
It is a subset of $S$. Note that $g^{-1}(B)$ is defined for every function $g : S \to T$,
and for every subset $B \subseteq T$.
In the special case when $B$ is a singleton set (i.e., $B = \{y\}$ for some $y \in T$),
we have
$$g^{-1}(\{y\}) = \{x \in S : g(x) \in \{y\}\} = \{x \in S : g(x) = y\},$$
which we also abbreviate as $\{g = y\}$.
If $y \notin \operatorname{range}(g)$, then $g^{-1}(\{y\}) = \emptyset$, since there are no values of $x \in S$ for
which $g(x) = y$.
If $y \in \operatorname{range}(g)$, then $g^{-1}(\{y\}) \neq \emptyset$, since there is at least one value of $x \in S$
(possibly many values of $x \in S$) for which $g(x) = y$.
In general, $g^{-1}$ as defined above may not be a function from $\operatorname{range}(g)$ into
$S$, because there might be many values of $x \in S$ for which $g(x) = y$, and a
function has to have just one output associated with each input, not many.
We say that $g$ is invertible provided $g^{-1}$ as defined above is a function
from $\operatorname{range}(g)$ into $S$. I.e., $g$ is invertible provided $g^{-1}(\{y\})$ is a singleton set
for each $y \in \operatorname{range}(g)$. When $g$ is invertible, and only in this case, we call $g^{-1}$
the inverse of $g$; we can write $g^{-1} : \operatorname{range}(g) \to S$, and also we typically write
$g^{-1}(y)$ instead of $g^{-1}(\{y\})$.
Thus, $g^{-1}$ as originally defined (whether $g$ is invertible or not) generalizes
the notion of the inverse of a function.
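
For finite sets, images and inverse images are easy to compute directly; here is a small illustration (plain Python; the helper names are mine, not from the notes):

```python
def image(g, A):
    """g(A) = {g(x) : x in A}."""
    return {g(x) for x in A}

def inverse_image(g, S, B):
    """g^{-1}(B) = {x in S : g(x) in B}; defined whether or not g is invertible."""
    return {x for x in S if g(x) in B}

S = {-2, -1, 0, 1, 2}
g = lambda x: x ** 2                 # not invertible on S
print(image(g, S))                   # {0, 1, 4} = range(g)
print(inverse_image(g, S, {4}))      # {-2, 2}: two points, so g is not invertible
print(inverse_image(g, S, {3}))      # set(): 3 is not in range(g)
```
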
We have encountered this concept and some of this notation already. Whenever $X$ is a random variable on a probability space $(\Omega, \mathcal{F}, P)$, we have
$$X : \Omega \to \mathbb{R}.$$
For each subset $B$ of $\mathbb{R}$, we have
$$\{X \in B\} = \{\omega \in \Omega : X(\omega) \in B\}.$$
This is precisely $X^{-1}(B)$. In fact, many advanced probability textbooks prefer
the notation $X^{-1}(B)$ to the notation $\{X \in B\}$. Whenever $B$ is a Borel subset
of $\mathbb{R}$, $X^{-1}(B)$ is an element of $\mathcal{F}$, and so its probability is defined.
Thus, this is extremely general set-theoretic notation for functions. It applies
in virtually any field of mathematics, not just probability.

Functions of a discrete random variable Suppose that $X$ is a discrete
RV on $(\Omega, \mathcal{F}, P)$, and that $Y = g(X)$, where $g$ is a continuous function defined
on the range of $X$ (recall that $g(X)(\omega) = g(X(\omega))$, so we need $\operatorname{range}(X) \subseteq \operatorname{domain}(g)$ for this composition to be defined). Then $Y$ is itself a discrete
random variable. In fact, this result holds even if $g$ is merely a “piecewise
continuous” function (which allows for a finite number of jump discontinuities).
Its pmf is found as follows:
$$\{Y = y\} = \{g(X) = y\} = \left\{X \in g^{-1}(\{y\})\right\},$$
and thus
$$p_Y(y) = P(\{Y = y\}) = P(\{g(X) = y\}) = P\left(\left\{X \in g^{-1}(\{y\})\right\}\right).$$
Thus, for each real number $y$ we need to find the event $\left\{X \in g^{-1}(\{y\})\right\}$ and
then its probability. That value is then $p_Y(y)$.
When $y \notin \operatorname{range}(Y)$, we have $\{Y = y\} = \emptyset$ (equivalently, $\left\{X \in g^{-1}(\{y\})\right\} = \emptyset$), and therefore $p_Y(y) = P(\emptyset) = 0$.
As a result, we almost always find $p_Y(y)$ following these steps:
Step 1: When possible, find $\operatorname{range}(Y)$.
Step 2: Suppose $y \notin \operatorname{range}(Y)$. Then $p_Y(y) = 0$.
Step 3: Suppose $y \in \operatorname{range}(Y)$. Find $\{Y = y\}$ and then find its probability.
That value is $p_Y(y)$.

Here are two examples in which we will demonstrate this three-step method.

Example 5 Suppose $X$ is a discrete random variable on $(\Omega, \mathcal{F}, P)$, and let
$Y = e^X$ (i.e., $Y = g(X)$, where $g$ is the function defined by $g(x) = e^x$ for each
$x$). Show that $Y$ is discrete, and find its pmf.
Solution: $g$ is continuous for all real $x$ and is defined for all real $x$, hence in
particular at each point of $\operatorname{range}(X)$. $X$ is discrete. As noted above, these facts
together imply that $g(X)$ (i.e., $Y$) is discrete.
If $y \notin \operatorname{range}(Y)$, then $p_Y(y) = 0$.
Suppose $y \in \operatorname{range}(Y)$. Then $y > 0$, since $e$ raised to any real number is
positive. Thus, $e^{X(\omega)} = y$ if and only if $X(\omega) = \ln y$, and so we have
$$\begin{aligned}
\{Y = y\} &= \{g(X) = y\} \\
&= \{\omega \in \Omega : g(X(\omega)) = y\} \\
&= \left\{\omega \in \Omega : e^{X(\omega)} = y\right\} \\
&= \{\omega \in \Omega : X(\omega) = \ln y\} \\
&= \{X = \ln y\},
\end{aligned}$$
and so
$$p_Y(y) = P(\{Y = y\}) = P(\{X = \ln y\}) = p_X(\ln y).$$
Thus, we have shown that
$$p_Y(y) = \begin{cases} p_X(\ln y), & \text{if } y \in \operatorname{range}(Y) \\ 0, & \text{otherwise.} \end{cases}$$
Note that in this example we were able to make use of the fact that $g$ is invertible (for $y > 0$) when we went from $e^{X(\omega)} = y$ to $X(\omega) = \ln y$. The procedure
works even when $g$ is not invertible, as the following example demonstrates.
Example 6 Let $X$ be a discrete random variable which takes values $-2$, $-1$, $0$,
and $1$ with probabilities $1/2$, $1/4$, $1/8$, and $1/8$, respectively. Let $Y = X^4$. Prove
that $Y$ is discrete, and find the pmf of $Y$.
Solution: Here, $g(x) = x^4$. $g$ is continuous for all real $x$ and is defined for all
real $x$, hence in particular at each point of $\operatorname{range}(X)$. $X$ is discrete. As noted
above, these facts together imply that $g(X)$ (i.e., $Y$) is discrete.
In this case, the function $g$ is not invertible, but we can still follow the same
steps as above. The main difference is that we will find $p_Y(y)$ for some values
of $y$ separately, rather than finding a single formula that works for all $y$ values.
$$\operatorname{range}(Y) = g(\{-2, -1, 0, 1\}) = \{0, 1, 16\}.$$
If $y \notin \operatorname{range}(Y)$, then $p_Y(y) = 0$.
Suppose $y \in \operatorname{range}(Y)$. Then $y = 0$, $1$, or $16$.
Case 1: $y = 0$. We have
$$\{Y = 0\} = \left\{X^4 = 0\right\}
= \left\{\omega \in \Omega : X(\omega)^4 = 0\right\}
= \{\omega \in \Omega : X(\omega) = 0\}
= \{X = 0\}.$$
Thus,
$$p_Y(0) = P(\{Y = 0\}) = P(\{X = 0\}) = p_X(0) = \frac{1}{8}.$$
Case 2: $y = 1$. We have
$$\{Y = 1\} = \left\{X^4 = 1\right\}
= \left\{\omega \in \Omega : X(\omega)^4 = 1\right\}
= \{\omega \in \Omega : X(\omega) = -1 \text{ or } X(\omega) = 1\}
= \{X = -1\} \cup \{X = 1\}.$$
Thus, since $\{X = -1\}$ and $\{X = 1\}$ are mutually exclusive, we have
$$p_Y(1) = P(\{Y = 1\}) = P(\{X = -1\} \cup \{X = 1\}) = p_X(-1) + p_X(1) = \frac{1}{4} + \frac{1}{8} = \frac{3}{8}.$$
Case 3: $y = 16$. We have
$$\{Y = 16\} = \left\{X^4 = 16\right\}
= \left\{\omega \in \Omega : X(\omega)^4 = 16\right\}
= \{\omega \in \Omega : X(\omega) = -2 \text{ or } X(\omega) = 2\}
= \{X = -2\} \cup \{X = 2\}.$$
Thus, since $\{X = -2\}$ and $\{X = 2\}$ are mutually exclusive, we have
$$p_Y(16) = P(\{Y = 16\}) = P(\{X = -2\} \cup \{X = 2\}) = p_X(-2) + p_X(2) = \frac{1}{2} + 0 = \frac{1}{2}.$$
We see that $P(\{Y \in \{0, 1, 16\}\}) = 1/8 + 3/8 + 1/2 = 1$, which confirms in
a different way that $Y$ is discrete (in fact, $Y$ is simple).
Finally, we give a formula for $p_Y(y)$ combining all of the cases:
$$p_Y(y) = \begin{cases} 1/8, & \text{if } y = 0 \\ 3/8, & \text{if } y = 1 \\ 1/2, & \text{if } y = 16 \\ 0, & \text{if } y \notin \{0, 1, 16\}. \end{cases}$$
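
The three-step method is also easy to mechanize for a finite pmf; here is a sketch (plain Python; representing a pmf as a dict is my own convention, not from the notes) that reproduces the pmf of Example 6:

```python
from collections import defaultdict

p_X = {-2: 1/2, -1: 1/4, 0: 1/8, 1: 1/8}   # pmf of X from Example 6
g = lambda x: x ** 4

p_Y = defaultdict(float)
for x, prob in p_X.items():
    p_Y[g(x)] += prob   # p_Y(y) is the sum of p_X(x) over all x with g(x) = y

print(dict(p_Y))        # {16: 0.5, 1: 0.375, 0: 0.125}, i.e., 1/2, 3/8, 1/8
```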

3.6.2 The exercises
a. Suppose that $X$ is a discrete random variable which takes values $-2$, $0$, $1$, $2$,
and $4$ with probabilities $1/10$, $1/8$, $1/5$, $1/4$, and $13/40$, respectively. Let $Y =
X^3 - 4X + 1$. Show that $Y$ is discrete, and find the pmf of $Y$, proceeding as I
did above.

b. Look at (but do not solve yet) Exercise 5.137 in the book. You will end up
solving that problem in multiple steps. The answer in the back of the book is
in simplified form, and your answer, even if correct, may end up looking quite
a bit different algebraically (although you can use the answer in the back of
the book to check your answer by comparing your values for $p_Y(y)$ for a few
specific values of $y$). I will give extensive hints in the form of a solution outline.
This problem is much easier to solve by cases and will serve as an excellent
demonstration of this powerful technique. This is the same method as above,
but with Step 3 subdivided into several separate cases.
Let $Y = |X - 3|$, where $X \sim P(3)$ is the random variable from Exercise
5.137 on p. 249. Thus, $Y = g(X)$ where $g(x) = |x - 3|$.
Find the pmf of $Y$ by proceeding as follows:

Find $\operatorname{range}(Y)$. (This is Step 1 above.)

Find $p_Y(y)$ for each $y \notin \operatorname{range}(Y)$. (This is Step 2 above. The following
steps are all part of Step 3 above.)

Find $p_Y(0)$.

Find $p_Y(1)$, $p_Y(2)$, and $p_Y(3)$.

Find $p_Y(y)$ for each $y \in \{4, 5, 6, \ldots\}$. (Hint: there is a good reason
these can be grouped together but the other values must be considered
separately.)

Give a formula for $p_Y(y)$ for all real $y$ by considering the separate cases
above (which together account for all real numbers $y$).

Remark 7 Notice how much harder this would have been if you had tried to do
this all in one case. It is often easier to solve problems like this by considering
separate cases.

3.7 Exercise 7 (10 points – 5 points per part) (a very elementary introduction to maximum likelihood estimation)
3.7.1 Background
A main goal of maximum likelihood estimation is to estimate parameters of
random variables. For example, suppose we believe that X is a binomial ran-
dom variable with n = 10; but suppose we do not know what p is. Maximum

14
likelihood estimation gives us a way to estimate p: Roughly speaking, it does so
by …nding the value(s) of p which make the data we observed most likely.
Maximum likelihood estimation involves techniques and results from inferen-
tial statistics (since in general we have various data points, not just one), and so
the full method is far beyond the scope of our course, but I have designed these
exercises to make use of a single data point, so as to avoid the need for results
of inferential statistics. In this basic setting, …rst-semester calculus will su¢ ce.
These exercises are intended as elementary introductions to this powerful and
widely-used technique.
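
To make the idea concrete before you start, here is a minimal numeric sketch (plain Python; the setup $X \sim B(10, p)$ with observed value 7 is an invented illustration, not one of the exercises below): evaluate the likelihood on a fine grid of $p$ values and see where it is largest. The exercises ask you to do the corresponding maximization exactly, with calculus.

```python
from math import comb

n, x = 10, 7
g = lambda p: comb(n, x) * p**x * (1 - p) ** (n - x)   # likelihood g(p) = P({X = x})

grid = [i / 1000 for i in range(1, 1000)]   # p values 0.001, 0.002, ..., 0.999
p_hat = max(grid, key=g)
print(p_hat)   # 0.7, the grid maximizer; setting g'(p) = 0 confirms it exactly
```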

3.7.2 The exercises


a) Suppose $X \sim B(30, p)$, with $p \in (0, 1)$ unknown. Suppose we carry out an
experiment (just one trial) and find that the value of $X$ is $18$.
Given that $X = 18$ has actually occurred (in the only actual experiment
we have conducted), it may seem reasonable to attempt to use this information
to deduce what $p$ might be. (We could try estimating $p$ after running several
trials of the experiment, but then the mathematical argument would require
techniques of inferential statistics, as noted above, so we’ll stick to this simple
example.)
Calculate $P(\{X = 18\})$. It will be a polynomial function of $p$ (of degree
$30$). Let’s call it $g(p)$. Do not expand it. Keep it in the form you get from
substituting into the pmf.
Graph $g(p)$ using a suitable window to estimate approximately ($\pm 0.04$)
which value of $p$ maximizes $P(\{X = 18\})$.
Then calculate $g'(p)$, set it equal to $0$, and solve for $p$. If you get multiple values, use the graph to determine which one is the one that maximizes
$P(\{X = 18\})$. This value of $p$ is the maximum-likelihood estimator for $p$.

b) Suppose $X$ has a Poisson distribution with unknown parameter $\lambda$; i.e., $X \sim
P(\lambda)$ for some $\lambda > 0$. Suppose we do an actual experiment and measure that
the value of $X$ in our experiment (which we did just once; again, if we do it
multiple times we’ll have better results, but we would need methods of inferential
statistics, and those are beyond the scope of this course) is $5$. Given that $X = 5$
has actually occurred (in the only actual experiment we have conducted), it may
seem reasonable to attempt to use this information to deduce what $\lambda$ might be.
Proceeding as in part a) (except, of course, in this case the function $g(\lambda)$ is
not going to be a polynomial of degree $30$), find the value of $\lambda$ which maximizes
$P(\{X = 5\})$. That is the maximum-likelihood estimator.

