Welcome to Calculus. I'm Professor Ghrist. We're about to begin Lecture 44 on expectation and variance.
Do you think that it's a coincidence that
we use the term probability density
function?
By no means.
In this lesson, we'll explore the analogy
between masses and probabilities.
And we'll consider some standard concepts from statistics: expectation, variance, and standard deviation.
Well, let's say that you're given a probability density function, ρ(x), over the reals.
What might you be able to say about a
point chosen at random with respect to
that density?
You might want to know: what outcome is most likely? What do I expect to see on average? On the other hand, since obtaining any particular value is very, very unlikely, you might want some way to quantify your uncertainty about that expected value.
These ways of characterizing a PDF are
the focus of this lecture.
We begin with two standard definitions
from probability, the expectation and the
variance.
The expectation is also called the mean, the expected value, or the first moment. It is defined as E, the integral over the domain of x dP. In the one-dimensional setting, this is the integral of x times ρ(x) dx.
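In symbols, with D denoting the domain:

```latex
E = \int_D x \, dP = \int x \, \rho(x) \, dx .
```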
The expectation is telling you something
about the most likely value, or the
average value.
Now, it may occur at the peak of the
probability density function, or it may
not.
It depends on how that density is distributed over the domain.
The variance is sometimes called the
second central moment.
It is denoted V and defined as the integral over the domain of (x − E)² dP. We can interpret x − E as a distance from the mean. And so in the one-dimensional case, we can write this as the integral of (x − E)² times ρ(x) dx.
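Likewise, in symbols:

```latex
V = \int_D (x - E)^2 \, dP = \int (x - E)^2 \, \rho(x) \, dx .
```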
What happens when we expand out that multiplication and split this into three integrals? We get some very nice simplifications.
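Explicitly, the expansion yields:

```latex
V = \int_D (x - E)^2 \, dP
  = \int_D x^2 \, dP \;-\; 2E \int_D x \, dP \;+\; E^2 \int_D dP .
```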


Notice that the middle integral and the
last integral are things we've seen
before.
The middle integral, that of x dP, is the expectation. The final integral, the integral of dP over the domain, is of course 1. Thus, we can simplify the formula for the variance to the integral over the domain of x² dP minus the expectation squared: V = ∫ x² dP − E².
That is an important formula and you're
going to want to remember it.
Maybe you've seen something like it
before.
It will help us understand just how to visualize or interpret the variance.
By now, you've probably noticed that there's a relationship between what we've done in probability and what we've done with mass. If we consider a probability density function ρ over a domain, let's say an interval from a to b, you've noticed that we've used the same symbol, ρ, for the PDF and for a mass density in a more physical context.
Indeed, the mass element ρ(x) dx is precisely what we use as the probability element. In that case, what is the mass? Well, it's the integral of the mass element, and that corresponds to the probability P.
Now, let's consider the expectation that we've just defined, E. What is that? That's the integral of x dP. Now, of course, by the definition of a PDF, this is really the same thing as the integral of x dP over the integral of 1 dP, since the latter integral equals 1. And so what we're really doing is computing the average x value with respect to the probability measure. That is something we have seen before in the mass context: that is the centroid, x̄. So you can think of the expectation as a probabilistic centroid.
Well, if that's the case, then how do we interpret the variance, that is, the integral of (x − E)² dP? Well, you no doubt recognize that: it is the moment of inertia, or rather, the moment of inertia about the centroid.
Now, that's a very nice physical interpretation. That means we can think of the variance as something like a measure of resistance to rotating the domain about its mean. If that's the case, then what is the probabilistic interpretation of the radius of gyration, that distance at which one can focus all of the mass and have the same inertia?
Well, that is a well-known object in probability called the standard deviation. Denoted by σ, it is defined to be the square root of the variance: σ = √V. It is a measure of how the mass, if you will, is distributed about the centroid; a measure of the spread of the density about the expectation.
Now, with this in hand, let's compute some of these values in the case of a simple PDF.
Let's look at a uniform density over the
interval from a to b.
What's the expectation?
Well, it's the integral of x dP, that is, the integral of x times ρ(x) dx as x goes from a to b. With ρ(x) being simply 1 over the length b − a, we get a very simple integral.
When we evaluate it and do a little bit of algebra, you won't be surprised to find that the expectation is (1/2)(a + b).
The average value is right in the middle
of the domain.
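Written out, with ρ(x) = 1/(b − a), the computation is:

```latex
E = \int_a^b \frac{x \, dx}{b - a}
  = \frac{b^2 - a^2}{2(b - a)}
  = \frac{a + b}{2} .
```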
What's the variance in this case?
Well, we have to integrate (x − E)² times ρ(x) dx. Using the formula that we derived a little bit ago, we can take advantage of the fact that we already know the expectation, and so we're reduced to integrating x² ρ(x) dx and subtracting E².
That much is simple enough.
But you'll find, when you go to do this, that there's a little bit of algebra involved in order to simplify. I claim that with a little bit of work, one gets a variance of (1/12)(b − a)².
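For anyone who wants to check that algebra, the computation runs:

```latex
V = \int_a^b \frac{x^2 \, dx}{b - a} - E^2
  = \frac{b^3 - a^3}{3(b - a)} - \left(\frac{a + b}{2}\right)^2
  = \frac{a^2 + ab + b^2}{3} - \frac{a^2 + 2ab + b^2}{4}
  = \frac{(b - a)^2}{12} .
```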
Now, how do I know that I haven't made a
mistake here?
According to our physical interpretation, this variance should be related to the moment of inertia of a region about a centroidal axis.
Recall that for a rectangle, we have a moment of inertia of (1/12)ML², where M is the mass and L is the length. In the probabilistic context, the mass is 1 and the length is b − a.
So I'm confident that we got this one
right.
The standard deviation is the square root
of the variance.
That is, the square root of (1/12)(b − a)²; in other words, (b − a)/(2√3).
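As a numerical sanity check, here is a minimal sketch in Python; the endpoints a = 2 and b = 5 are arbitrary values chosen for illustration:

```python
import numpy as np

a, b = 2.0, 5.0                        # arbitrary interval endpoints
n = 1_000_000
dx = (b - a) / n
x = a + (np.arange(n) + 0.5) * dx      # midpoints of n subintervals
rho = 1.0 / (b - a)                    # uniform density is constant on [a, b]

E = np.sum(x * rho) * dx               # expectation: integral of x rho(x) dx
V = np.sum(x**2 * rho) * dx - E**2     # variance: integral of x^2 dP minus E^2
sigma = np.sqrt(V)                     # standard deviation

print(E, (a + b) / 2)                      # both ~3.5
print(V, (b - a)**2 / 12)                  # both ~0.75
print(sigma, (b - a) / (2 * np.sqrt(3)))   # both ~0.866
```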
How about a different density function: an exponential density over the domain from 0 to infinity? Well, the expected value is the integral of x dP. In this case, with an exponential density, we have to integrate x times αe^(−αx) dx as x goes from 0 to infinity.
This is going to take a little bit of
work, but not too much.
Using integration by parts, we get −x e^(−αx), evaluated from 0 to infinity, plus the integral from 0 to infinity of e^(−αx) dx. The former term vanishes upon substitution, and we're left with an easy enough integral: we obtain −(1/α) e^(−αx), evaluated from 0 to infinity. Evaluating, we obtain 1/α.
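Written out in full:

```latex
E = \int_0^\infty x \, \alpha e^{-\alpha x} \, dx
  = \Bigl[ -x \, e^{-\alpha x} \Bigr]_0^\infty
    + \int_0^\infty e^{-\alpha x} \, dx
  = 0 + \Bigl[ -\tfrac{1}{\alpha} e^{-\alpha x} \Bigr]_0^\infty
  = \frac{1}{\alpha} .
```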
Now that's a very, very simple
expectation.
You're probably going to want to remember
this, since exponential density functions
are fairly common.
The variance in this case is going to be
a bit more work.
We're going to have to integrate x² dP and then subtract off the expectation squared. This integral is, again, going to involve integration by parts, but we're going to have to do it twice, since there is an x² in front of the exponential term. I hope you'll trust me when I claim that this variance reduces to 1/α². This means that the standard deviation is also equal to 1/α.
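If you'd rather not just trust me, here is a minimal symbolic check, sketched in Python with sympy:

```python
import sympy as sp

x, alpha = sp.symbols('x alpha', positive=True)
rho = alpha * sp.exp(-alpha * x)        # exponential density on [0, oo)

E = sp.integrate(x * rho, (x, 0, sp.oo))             # expectation
V = sp.integrate(x**2 * rho, (x, 0, sp.oo)) - E**2   # variance
sigma = sp.sqrt(V)                                   # standard deviation

print(E, V, sigma)    # 1/alpha, 1/alpha**2, 1/alpha
```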
Let's put this to use in a simple
example.
Let's say that a light bulb is advertised
as having an average lifespan of 2000
hours.
What are the odds that your light bulb
fails within 400 hours of use?
You might think, well that's very

unlikely since they, they say, that the


average lifespan is 2000 hours.
But let's see.
If we assume a probability density function that is exponential, of the form αe^(−αt), where t is time, then what could we say about α? Well, knowing that the expected value is 2000 hours, and knowing that that expectation is 1/α for an exponential PDF, what can we say? Well, we know that α is 1/2000.
And to compute the probability of failure within 400 hours, we want the probability of landing in the interval from 0 to 400. That is simply the integral of αe^(−αt) dt as t goes from 0 to 400. Now, that integral is easy enough: it yields −e^(−αt), evaluated from 0 to 400. We obtain 1 − e^(−1/5), because −α times 400 is −1/5.
Now, that evaluates to about 0.18, meaning that there's nearly a 1 in 5 chance of failure within the first 400 hours.
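That arithmetic is easy to check; a short sketch in Python:

```python
import math

alpha = 1 / 2000                       # rate: expectation is 1/alpha = 2000 hours
p_fail = 1 - math.exp(-alpha * 400)    # P(0 <= t <= 400) = 1 - e^(-1/5)
print(p_fail)                          # ~0.1813: nearly a 1 in 5 chance
```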
Recall from our previous lesson the ubiquity of Gaussian, or normal, densities, and their form, ρ(x) = (1/√(2π)) e^(−x²/2), which looks bad but is not so bad. Note that this is an even function in x, so that the expectation is 0. One of the reasons for the coefficient out in front is so that the variance and the standard deviation will both be 1.
Now, this formula for a Gaussian is not the most general possible. There is a general Gaussian of the form (1/(σ√(2π))) e^(−(x−μ)²/(2σ²)). This general Gaussian allows you to tune the mean and the standard deviation: the expectation is precisely μ, and the variance is precisely σ².
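In display form:

```latex
\rho(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / 2\sigma^2},
\qquad E = \mu, \quad V = \sigma^2 .
```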
One of the reasons for using the general Gaussian is that many random variables are normal: they have a Gaussian density. If you look at the sizes of objects, such as people or snowflakes, they tend to be distributed according to a Gaussian. This is often used in statistics.
Now, Gaussians can be hard to work with, given their form, but one can keep in mind a few properties. If you have a Gaussian with expectation μ and standard deviation σ, then you're likely to be within one standard deviation of the mean: about 68% of the mass of that density function lies within distance σ. If you go out to two standard deviations, then the amount of mass goes up to about 95%. And almost all of the mass, more than 99%, lies within a distance of 3 standard deviations of the expectation.
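These percentages can be checked numerically: for a Gaussian, the mass within k standard deviations of the mean equals erf(k/√2). A minimal check in Python:

```python
import math

for k in (1, 2, 3):
    # P(|x - mu| <= k*sigma) for a Gaussian is erf(k / sqrt(2))
    mass = math.erf(k / math.sqrt(2))
    print(k, round(mass, 4))    # 1: ~0.6827, 2: ~0.9545, 3: ~0.9973
```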
Keeping these figures in mind might help you if you ever take a statistics class. Note, however, that there's much more to statistics and probability than simply remembering a couple of numbers and the general shape of a Gaussian.
It's a large and beautiful subject that
you'll be able to handle with your
knowledge of calculus.
And so we come to the end of Chapter
Four.
It's worth taking a moment to reflect on all the applications that we've covered, from simple area under a curve all the way up through probability. In our next, and last, chapter, we're going to tie the entire course together by reconsidering what calculus means in the context of discrete functions.
