
Moment Generating Function and the Central Limit Theorem
Motivation
Moment generating function:
- Systematic calculation of all moments
- Recovery of the pdf from the moments

Central Limit Theorem:
- Gaussianity of appropriate sums of random variables
- Characterization of noise
- Important approximations
Moments of Sums
If $W = X_1 + \cdots + X_n$, then

$$E[W] = E[X_1] + \cdots + E[X_n]$$

$$\mathrm{Var}(W) = \sum_{i=1}^{n} \mathrm{Var}(X_i) + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \mathrm{Cov}(X_i, X_j)$$

When the $X_i$ are uncorrelated,

$$\mathrm{Var}(W) = \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)$$
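A simulation sketch of this identity (illustrative only; the shared-component construction below is my choice for inducing correlation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw n correlated variables: X_i = S + E_i shares a common component S,
# so Cov(X_i, X_j) = Var(S) for i != j.
n, trials = 4, 200_000
S = rng.normal(0, 1, size=trials)            # common component
E = rng.normal(0, 1, size=(trials, n))       # independent components
X = S[:, None] + E                           # X_i = S + E_i
W = X.sum(axis=1)

# Formula: Var(W) = sum_i Var(X_i) + 2 * sum_{i<j} Cov(X_i, X_j)
var_terms = sum(np.var(X[:, i]) for i in range(n))
cov_terms = 2 * sum(np.cov(X[:, i], X[:, j])[0, 1]
                    for i in range(n) for j in range(i + 1, n))
print(np.var(W), var_terms + cov_terms)      # the two numbers should agree
```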
Sum of Two RV
Take two random variables X and Y; we wish to find the pdf of the sum. Define:

$$W = X + Y, \qquad V = Y$$

Then use the Jacobian method to find $f_{W,V}(w, v)$. It is easy to see that $|J| = 1$, therefore:

$$f_{W,V}(w, v) = f_{X,Y}(x, y) = f_{X,Y}(w - y, y)$$

Integrate to find the marginal:

$$f_W(w) = \int_{-\infty}^{\infty} f_{X,Y}(w - y, y) \, dy$$

This is an integration of $f_{X,Y}$ along a diagonal line.
Example
Random variables X, Y are distributed as

$$f_{X,Y}(x, y) = \begin{cases} 2 & 0 \le x, y \le 1,\ x + y \le 1 \\ 0 & \text{else} \end{cases}$$

Find the pdf of the sum.

Using the previous result, for $0 \le w \le 1$:

$$f_W(w) = \int_{-\infty}^{\infty} f_{X,Y}(w - y, y) \, dy = \int_0^w 2 \, dy = 2w$$
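As a quick numerical check of this result (a sketch I added; scipy and the sample points are my choices), the marginalization integral can be evaluated directly:

```python
from scipy.integrate import quad

def f_xy(x, y):
    # joint pdf: equals 2 on the triangle 0 <= x, y with x + y <= 1
    return 2.0 if (x >= 0 and y >= 0 and x + y <= 1) else 0.0

for w in [0.2, 0.5, 0.9]:
    fw, _ = quad(lambda y: f_xy(w - y, y), 0, 1, points=[w])
    print(w, fw, 2 * w)   # numerical integral vs the closed form 2w
```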
Sum of Independent RV
If X, Y are independent and W = X + Y, then:

$$f_W(w) = \int_{-\infty}^{\infty} f_{X,Y}(w - y, y) \, dy = \int_{-\infty}^{\infty} f_X(w - y) f_Y(y) \, dy = (f_X * f_Y)(w)$$

The pdf of a sum of independent r.v.s is given by the convolution of the individual pdfs.

In the discrete case:

$$P_W(w) = \sum_{k=-\infty}^{\infty} P_X(k) \, P_Y(w - k)$$

NOTE: Naturally, the pmf values may be zero for some of these k.
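For a concrete instance (my example, not from the slides), the pmf of the sum of two fair dice is exactly this discrete convolution:

```python
import numpy as np

# pmf of a fair die on {1,...,6}; index k of the array holds P(X = k)
die = np.array([0, 1, 1, 1, 1, 1, 1]) / 6.0

# P_W(w) = sum_k P_X(k) P_Y(w - k) is a discrete convolution
pmf_sum = np.convolve(die, die)
for w, p in enumerate(pmf_sum):
    if p > 0:
        print(w, round(p, 4))   # w = 2..12 with the familiar triangular shape
```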
Example
Random variable X is Gaussian (0, 1) and random variable Y is Gaussian with mean 2 and variance 1. Find the pdf of X + Y.
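Since the preceding result applies to independent variables, assume X and Y are independent; the sum should then be Gaussian with mean 2 and variance 2. A numerical-convolution sketch (grid spacing and ranges are my choices) to confirm:

```python
import numpy as np
from scipy.stats import norm

dx = 0.01
x = np.arange(-10, 10, dx)

# convolve the two pdfs on a grid; the dx factor approximates the integral
fw = np.convolve(norm.pdf(x, 0, 1), norm.pdf(x, 2, 1)) * dx
w = -20 + np.arange(fw.size) * dx            # support of the convolution grid

# compare with the Gaussian pdf of mean 2, variance 2 at a few points
for wi in [0.0, 2.0, 4.0]:
    i = int(np.argmin(np.abs(w - wi)))
    print(wi, fw[i], norm.pdf(wi, 2, np.sqrt(2)))
```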
Example
Random variables X, Y are distributed according to:

$$f_{X,Y}(x, y) = \begin{cases} 2 e^{-x} e^{-2y} & x, y > 0 \\ 0 & \text{else} \end{cases}$$

Find the pdf of W = X + Y.
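One way to work this exercise symbolically (a sketch, assuming the factorization $f_X(x) = e^{-x}$, $f_Y(y) = 2e^{-2y}$ into independent exponential marginals that the joint pdf implies):

```python
import sympy as sp

w, y = sp.symbols('w y', positive=True)

# with f_X(x) = e^{-x} and f_Y(y) = 2 e^{-2y}, both arguments are positive
# only for 0 <= y <= w, so the convolution integral runs over that range
f_w = sp.integrate(sp.exp(-(w - y)) * 2 * sp.exp(-2 * y), (y, 0, w))
print(sp.simplify(f_w))   # expect 2*(exp(-w) - exp(-2*w))
```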
Moment Generating Function
Definition: For random variable X, the moment generating function is:

$$\phi_X(s) = E\left[e^{sX}\right]$$

For a continuous r.v., this is (almost) the Laplace transform of the pdf:

$$\phi_X(s) = \int_{-\infty}^{\infty} e^{sx} f_X(x) \, dx$$

For discrete distributions:

$$\phi_X(s) = \sum_i e^{s x_i} P_X(x_i)$$
Example
Find the MGF of the Bernoulli distribution:

$$P_X(x) = \begin{cases} p & x = 1 \\ 1 - p & x = 0 \end{cases}$$

$$\phi_X(s) = (1 - p) e^{0} + p e^{s} = 1 - p + p e^{s}$$
Example
Find the MGF of the exponential r.v.:

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & \text{else} \end{cases}$$

$$\phi_X(s) = \int_0^{\infty} e^{sx} \, \lambda e^{-\lambda x} \, dx = \lambda \int_0^{\infty} e^{-x(\lambda - s)} \, dx = \frac{\lambda}{\lambda - s} \left[-e^{-x(\lambda - s)}\right]_0^{\infty} = \frac{\lambda}{\lambda - s} \quad (s < \lambda)$$
Moments via MGF
$$\frac{d\phi_X(s)}{ds} = \frac{d}{ds} \int e^{sx} f_X(x) \, dx = \int x e^{sx} f_X(x) \, dx$$

Setting s = 0, we get:

$$\left.\frac{d\phi_X(s)}{ds}\right|_{s=0} = E[X]$$

If we take n derivatives, we get n powers of x, therefore:

Theorem: A random variable X with MGF $\phi_X(s)$ has moments

$$E[X^n] = \left.\frac{d^n \phi_X(s)}{ds^n}\right|_{s=0}$$
Why Moments from MGF?
Each moment requires an integral. If we need multiple moments, we can take one integral to get the MGF, and then calculate all moments with derivatives (easier than integrals).

Example: Find the first four moments of an exponential random variable with parameter λ.
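This workflow can be mirrored symbolically (a sympy sketch I added; it assumes s < λ for the integral to converge): one integral for the MGF, then four derivatives.

```python
import sympy as sp

s, x, lam = sp.symbols('s x lam', positive=True)

# one integral gives the MGF of an exponential(lam) variable (valid for s < lam)
mgf = sp.integrate(sp.exp(s * x) * lam * sp.exp(-lam * x),
                   (x, 0, sp.oo), conds='none')
print(sp.simplify(mgf))               # lam/(lam - s)

# then each moment is just a derivative at s = 0: E[X^n] = n!/lam^n
for n in range(1, 5):
    print(n, sp.simplify(sp.diff(mgf, s, n).subs(s, 0)))
```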
MGF of Independent Sums
Theorem: If X, Y are independent,

$$\phi_{X+Y}(s) = \phi_X(s) \, \phi_Y(s)$$

Proof:

$$\phi_{X+Y}(s) = E\left[e^{s(X+Y)}\right] = E\left[e^{sX} e^{sY}\right] = E\left[e^{sX}\right] E\left[e^{sY}\right] = \phi_X(s) \, \phi_Y(s)$$
Application of MGF Properties
Using the MGF properties, it is easy to show that (a symbolic check of the first item appears below):

- The sum of n i.i.d. Bernoulli-p variables is a binomial (n, p).
- The sum of n i.i.d. geometric-p variables is a Pascal (n, p).
- The sum of n independent Poisson-$\lambda_i$ random variables is a Poisson ($\sum \lambda_i$).
- The sum of n independent Gaussians is a Gaussian (but this is also true for non-independent Gaussians).
- The sum of n i.i.d. exponential variables is an Erlang.
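As an illustrative check of the first claim, here is a small sympy sketch (my own, not from the slides): raising the Bernoulli MGF to the n-th power reproduces the binomial(n, p) MGF built term by term from the pmf.

```python
import sympy as sp

s, p = sp.symbols('s p')
k = sp.symbols('k', integer=True, nonnegative=True)
n = 5   # a small concrete n keeps the symbolic check fast

bernoulli_mgf = 1 - p + p * sp.exp(s)

# binomial(n, p) MGF assembled directly from its pmf
binomial_mgf = sp.Sum(sp.binomial(n, k) * p**k * (1 - p)**(n - k) * sp.exp(s * k),
                      (k, 0, n)).doit()

# the difference expands to zero, confirming phi_{X1+...+Xn} = phi_X^n
print(sp.expand(bernoulli_mgf**n - binomial_mgf))
```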
Characteristic Function (optional)
The characteristic function of variable X is defined as:

$$\Phi_X(\omega) = \int_{-\infty}^{\infty} e^{j\omega x} f_X(x) \, dx$$

The characteristic function is related to the Fourier transform of the pdf, just like the MGF is related to the Laplace transform.

$$E[X^n] = \frac{1}{j^n} \left.\frac{d^n}{d\omega^n} \Phi_X(\omega)\right|_{\omega=0}$$

No two pdfs can share the same characteristic function. All pdfs have characteristic functions, but not all pdfs have MGFs. Therefore the characteristic function is a powerful tool.
Sum of I.I.D. Variables
We wish to investigate sums of a large group of i.i.d. variables whose distribution we may not know. Is that possible?

Consider the i.i.d. sum $W = X_1 + \cdots + X_n$ as $n \to \infty$.

$\mu_W = n \mu_X$ and $\mathrm{Var}(W) = n \, \mathrm{Var}(X)$. Both go to infinity. Oops!

To make the analysis more meaningful, we subtract the means and divide by $\sqrt{n}$ to keep the variance from exploding.
Central Limit Theorem
Theorem: Consider an i.i.d. sequence $X_1, X_2, \ldots$ with mean $\mu_X$ and variance $\sigma_X^2$. Then the random variables

$$Z_n = \frac{1}{\sigma_X \sqrt{n}} \sum_{i=1}^{n} (X_i - \mu_X)$$

have the property:

$$\lim_{n \to \infty} F_{Z_n}(z) = \Phi(z)$$

This means, if $W_n = X_1 + \cdots + X_n$, for large n:

$$F_{W_n}(w) \approx \Phi\left(\frac{w - n \mu_X}{\sigma_X \sqrt{n}}\right)$$
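A simulation sketch of the theorem (the uniform distribution and the sample sizes are my choices):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, trials = 30, 100_000

# standardized sums of i.i.d. uniforms on [0, 1]: mu = 1/2, sigma^2 = 1/12
X = rng.uniform(0, 1, size=(trials, n))
Z = (X.sum(axis=1) - n * 0.5) / np.sqrt(n / 12)

# the empirical CDF of Z_n should be close to Phi at every point
for z in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    print(z, np.mean(Z <= z), norm.cdf(z))
```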
Practical Application of CLT
Problem: Find probabilities involving a sum of i.i.d. variables $W = X_1 + \cdots + X_n$.

Solution: Find $\mu_X$ and $\sigma_X^2$, then use the CLT.

Interesting Note: The CLT also applies to discrete variables. Even though a sum of discrete variables is always discrete, the CDF approaches a Gaussian. This is enough to make approximations involving probabilities.
Example 1
Consider $X_i$ to be uniformly distributed over $[-1, 1]$. What is the probability that $W = X_1 + \cdots + X_{16}$ takes values in the interval $[-1, 1]$?

$$\mu_X = 0, \qquad \mathrm{Var}(X) = \frac{1}{3}$$

(Here $\sigma_W = \sqrt{16 \cdot 1/3} = 4/\sqrt{3}$.)

$$P(-1 < W \le 1) = F_W(1) - F_W(-1) \approx \Phi\left(\frac{1 - 0}{4/\sqrt{3}}\right) - \Phi\left(\frac{-1 - 0}{4/\sqrt{3}}\right) = 2\,\Phi\left(\frac{\sqrt{3}}{4}\right) - 1 = 0.3328$$
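A Monte Carlo cross-check (my sketch): the simulation estimates the exact probability, which should land near the CLT value above.

```python
import numpy as np

rng = np.random.default_rng(0)

# one million draws of W = X_1 + ... + X_16, X_i uniform on [-1, 1]
W = rng.uniform(-1, 1, size=(1_000_000, 16)).sum(axis=1)
print(np.mean((-1 < W) & (W <= 1)))   # expect a value near 0.33
```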
Example 2
We flip a fair coin a thousand times. What are the chances that we will have more than 510 heads?

Denote by X the flip of a coin and the number of heads by

$$A = X_1 + X_2 + \cdots + X_{1000}$$

We know $\mu_X = 0.5$ and $\mathrm{Var}(X) = 0.25$.

$$P(A > 510) = 1 - F_A(510) \approx 1 - \Phi\left(\frac{510 - 1000 \cdot 0.5}{0.5 \sqrt{1000}}\right) = 1 - \Phi(0.63) = 0.2643$$

Question: What would be the probability of exactly 510 heads? Does this lead to problems?
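For comparison, the exact binomial tail is one line with scipy (a sketch I added):

```python
from scipy.stats import binom, norm

exact = binom.sf(510, 1000, 0.5)     # P(A > 510), exact binomial tail
clt = 1 - norm.cdf((510 - 500) / (0.5 * 1000 ** 0.5))   # plain CLT estimate
print(exact, clt)                    # the CLT value is close to the slide's 0.2643
```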
Example 3
DeMoivre-Laplace Formula: For a binomial (n, p) variable K,

$$P(k_1 \le K \le k_2) \approx \Phi\left(\frac{k_2 - np + 0.5}{\sqrt{np(1-p)}}\right) - \Phi\left(\frac{k_1 - np - 0.5}{\sqrt{np(1-p)}}\right)$$

IDEA: on whichever side is included, we use a margin of 0.5. This avoids problems with approximating discrete variables.

Example: Calculate P(K = 8) for a binomial (20, 0.4).

Using the previous formula, the answer is zero!

Using the DeMoivre-Laplace formula,

$$P(8 \le K \le 8) \approx \Phi\left(\frac{0.5}{\sqrt{4.8}}\right) - \Phi\left(\frac{-0.5}{\sqrt{4.8}}\right) = 0.1803$$

The exact answer using the binomial formula is 0.1797.
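Both numbers on this slide can be reproduced directly (an illustrative scipy sketch):

```python
from scipy.stats import binom, norm

n, p, k = 20, 0.4, 8
sigma = (n * p * (1 - p)) ** 0.5     # sqrt(4.8)

# continuity-corrected CLT approximation vs the exact pmf value
approx = norm.cdf((k - n * p + 0.5) / sigma) - norm.cdf((k - n * p - 0.5) / sigma)
print(approx, binom.pmf(k, n, p))    # roughly 0.180 vs 0.1797
```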
Laplace and De Moivre
Abraham de Moivre (1667-1754): De Moivre formula $(\cos x + i \sin x)^n = \cos nx + i \sin nx$, Gaussian probabilities.

Pierre-Simon Laplace (1749-1827): Laplace transform, scalar potentials, Laplace equation (PDE), celestial mechanics.

De Moivre's formula predates the Euler formula $e^{ix} = \cos x + i \sin x$. Laplace almost predicted the existence of black holes!
Advanced Topics
Random Sums of Independent Variables
We draw a random integer N according to some distribution, then form:

$$W = X_1 + \cdots + X_N$$

using i.i.d. variables $X_i$ (drawn independently of N).

The MGF of the sum is:

$$\phi_W(s) = \phi_N\left(\ln \phi_X(s)\right)$$

For proof, see your textbook. This MGF can be used to calculate probabilities involving random sums.
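An illustrative Monte Carlo check of this identity (the choices of N ~ Poisson(3) and X_i ~ Bernoulli(0.4) are mine): both sides of $\phi_W(s) = \phi_N(\ln \phi_X(s))$ should agree.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, p, s = 3.0, 0.4, 0.7
trials = 500_000

# W = X_1 + ... + X_N with N ~ Poisson(lam) and X_i ~ Bernoulli(p);
# a binomial(N, p) draw is exactly the sum of N Bernoulli(p) variables
N = rng.poisson(lam, size=trials)
W = rng.binomial(N, p)

phi_X = 1 - p + p * np.exp(s)                     # Bernoulli MGF at s
phi_N = lambda t: np.exp(lam * (np.exp(t) - 1))   # Poisson MGF

# Monte Carlo estimate of E[e^{sW}] vs phi_N(ln phi_X(s))
print(np.mean(np.exp(s * W)), phi_N(np.log(phi_X)))
```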
Chernoff Bound

This is used to bound the tail of a probability distribution.

Theorem: For a random variable X,

$$P(X \ge c) \le \min_{s \ge 0} e^{-sc} \, \phi_X(s)$$

IDEA: the probability is hard to calculate, but the bound is easier.

Proof: Using the unit step $u(\cdot)$ and the bound $u(x - c) \le e^{s(x-c)}$ for $s \ge 0$:

$$P(X \ge c) = \int_c^{\infty} f_X(x) \, dx = \int_{-\infty}^{\infty} u(x - c) f_X(x) \, dx \le \int_{-\infty}^{\infty} e^{s(x-c)} f_X(x) \, dx = e^{-sc} \int_{-\infty}^{\infty} e^{sx} f_X(x) \, dx = e^{-sc} \, \phi_X(s)$$
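A numeric sketch for an exponential(1) variable, where the exact tail is $e^{-c}$ and $\phi_X(s) = 1/(1-s)$ for $s < 1$ (the grid minimization below stands in for the calculus):

```python
import numpy as np

# exponential(1): exact tail P(X >= c) = e^{-c}, MGF phi_X(s) = 1/(1 - s), s < 1
s_grid = np.linspace(0.0, 0.999, 10_000)

def chernoff_bound(c):
    # minimize e^{-sc} * phi_X(s) over the grid of admissible s values
    return np.min(np.exp(-s_grid * c) / (1.0 - s_grid))

for c in [2, 5, 10]:
    print(c, chernoff_bound(c), np.exp(-c))   # bound always sits above the exact tail
```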