Distributions
Outline – Jointly Distributed Random Variables
Outline – Expected values, covariance and correlation
Outline – Statistics and their distributions
Outline – Distribution of the Sample Mean
Outline – The Distribution of Linear Combinations
Joint Distributions
1. So far we have studied probability models for a single discrete or
continuous random variable.
2. In many practical cases it is appropriate to take more than one
measurement of a random observation. For example:
   1. Height and weight of a medical subject.
   2. Grades on quiz 1, quiz 2, and quiz 3 of a math student.
   3. The case-by-case results of 12 independent measurements of air
   quality at 524 W 59 Street.
3. How are these variables related? That is: what is their joint
probability distribution?
4. The air-quality situation (repeated independent measurements of the
same quantity) is very important and is the foundation of much of
inferential statistics.
Joint Probability Mass Function
Joint Probability Density Function
Example
An insurance company sells both homeowner's policies and
auto policies. Let X be the deductible on the auto policy and
Y the deductible on the homeowner's policy.
P(x,y)   |  y = 0   y = 100   y = 200  |  Px
x = 100  |  0.20    0.10      0.20     |  0.50
x = 250  |  0.05    0.15      0.30     |  0.50
Py       |  0.25    0.25      0.50     |
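The marginal distributions in the table can be reproduced in R. The following is a minimal sketch that enters the joint pmf as a matrix (the object names p, px, py and indep are illustrative, not from the original slides) and also checks whether the joint pmf factors into the product of its marginals.

```r
# Joint pmf from the table: rows are x (auto), columns are y (homeowner's)
p <- matrix(c(0.20, 0.10, 0.20,
              0.05, 0.15, 0.30),
            nrow = 2, byrow = TRUE,
            dimnames = list(x = c("100", "250"), y = c("0", "100", "200")))

px <- rowSums(p)   # marginal pmf of X (the Px column)
py <- colSums(p)   # marginal pmf of Y (the Py row)
sum(p)             # a valid joint pmf sums to 1

# X and Y are independent only if P(x,y) = Px * Py in every cell
indep <- isTRUE(all.equal(p, outer(px, py), check.attributes = FALSE))
indep
```

Comparing p with outer(px, py) cell by cell is one simple way to test independence for a small discrete table.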
and 0 otherwise.
Suppose λ1 = 1/1000 and λ2 = 1/1200. Then the probability that both
lifetimes are at least 1500 hours equals:
P(1500 ≤ X1, 1500 ≤ X2) = P(1500 ≤ X1) P(1500 ≤ X2)
                        = e^(-1500/1000) e^(-1500/1200)
                        = (.2231)(.2865) = .0639
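The arithmetic can be checked in R. This sketch assumes, as the factorization in the text implies, that the two lifetimes are independent exponential random variables with rates λ1 and λ2; the variable names are illustrative.

```r
lambda1 <- 1/1000
lambda2 <- 1/1200
t0 <- 1500

# P(X >= t) for an exponential lifetime is exp(-lambda * t)
p1 <- pexp(t0, rate = lambda1, lower.tail = FALSE)
p2 <- pexp(t0, rate = lambda2, lower.tail = FALSE)

# under independence the joint probability is the product
p_both <- p1 * p2
round(c(p1, p2, p_both), 4)
```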
We simulated the sample twice. Note that the two means are different but near 65,
and the two standard deviations are different but near 3.
Simulation Normal Distribution - 2
1. If we perform the previous simulation many times then how will the generated
means and standard deviations behave? That is, what are their distributions?
2. Will the means cluster around the population mean of 65?
3. What if we look only at the means? Will the standard deviation of the means
generated equal the original population standard deviation or will they be
different?
4. We examine these questions by writing an R program that performs the
simulation many thousands of times and looks at the results.
Steps in a simulation experiment
Simulation Normal Distribution - 3
Simulation Normal Distribution - 4
The following code will simulate the selection of a random sample
of size n = 20 from a Normal(65,3) distribution k=50000 times,
determining the mean of the sample and storing the results in a
vector. The code plots a histogram of the 50000 means and then
calculates the mean of the means and the standard deviation of the
means. Is this standard deviation close to 3?
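# Simulate k samples of size n from Normal(65, 3) and study the sample means
k <- 50000
n <- 20
mns <- numeric(k)                               # storage for the k sample means
for (i in 1:k) mns[i] <- mean(rnorm(n, 65, 3))  # mean of one simulated sample
hist(mns)                                       # histogram of the 50000 means
mean(mns)                                       # mean of the means
sd(mns)                                         # standard deviation of the means
3/sqrt(n)                                       # compare with sigma/sqrt(n)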
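The same experiment can also keep the sample standard deviations, which the earlier questions ask about. The following is a minimal sketch using replicate() instead of an explicit loop; the smaller k = 5000 and the seed are assumptions chosen only to keep the run short and reproducible.

```r
set.seed(1)          # arbitrary seed, only for reproducibility
k <- 5000            # fewer repetitions than the 50000 used above
n <- 20

# each column of res holds the mean and sd of one simulated sample
res <- replicate(k, {
  x <- rnorm(n, 65, 3)
  c(mean = mean(x), sd = sd(x))
})

mean(res["mean", ])  # centre of the sample means
sd(res["mean", ])    # spread of the sample means
3 / sqrt(n)          # theoretical spread of the sample means
mean(res["sd", ])    # typical sample standard deviation
```

The spread of the means should track 3/sqrt(n) rather than 3, while the individual sample standard deviations stay near 3.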
Simulation Normal Distribution - 5
We run the code next.
Let X1, X2, . . . , X25 be a random sample of size 25, where each Xi
is the number of cycles on a different randomly selected
specimen.
Now let’s look at the means and standard deviations of the samples.
First the code:
Central Limit Theorem Example Uniform[-1,1] Distribution
Here is the resulting data frame. Notice that the standard deviation
of the sample means is close to (sqrt(3)/3)/sqrt(n). Note that sqrt(3)/3
is the standard deviation of the original uniform distribution, and that
the sample mean is always near 0, the true uniform mean.
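A simulation of this kind might look like the sketch below. It is not the original code: the sample sizes in ns, the number of repetitions k, the seed, and the column names are all assumptions made for illustration.

```r
set.seed(2)                  # arbitrary seed, only for reproducibility
k  <- 10000                  # simulated samples per sample size (assumed)
ns <- c(5, 10, 30, 100)      # sample sizes to compare (assumed)
sigma <- sqrt(3) / 3         # sd of the Uniform[-1, 1] distribution

out <- data.frame(n = ns, mean_of_means = NA_real_, sd_of_means = NA_real_)
for (j in seq_along(ns)) {
  # k sample means, each from a sample of size ns[j]
  mns <- replicate(k, mean(runif(ns[j], -1, 1)))
  out$mean_of_means[j] <- mean(mns)
  out$sd_of_means[j]   <- sd(mns)
}
out$theoretical_sd <- sigma / sqrt(out$n)   # sigma / sqrt(n)
out
```

Each row compares the simulated standard deviation of the sample means with (sqrt(3)/3)/sqrt(n) for one sample size.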
The Case of the Normal Distribution
Figure 5.14
Example 25
The time that it takes a randomly selected rat of a certain subspecies
to find its way through a maze is a normally distributed rv with µ =
1.5 min and σ = .35 min. Suppose five rats are selected.
Let X1, . . . , X5 denote their times in the maze. Assuming the Xi’s to
be a random sample from this normal distribution, what is the
probability that the total time To = X1 + . . . + X5 for the five is
between 6 and 8 min?
Thus, To has a normal distribution with
mean = nµ = 5(1.5) = 7.5
and
variance = nσ² = 5(.1225) = .6125, so standard deviation = √.6125 = .783
Using R, P( 6 ≤ To ≤ 8) =
Note Rounding
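The probability in Example 25 can be computed with pnorm(). This sketch evaluates it twice, once with the unrounded standard deviation and once with the rounded value .783, to show how little the rounding matters; the variable names are illustrative.

```r
mu_T <- 5 * 1.5              # mean of the total time To
sd_T <- sqrt(5 * 0.35^2)     # unrounded standard deviation of To

# P(6 <= To <= 8) with the unrounded standard deviation
p_exact <- pnorm(8, mu_T, sd_T) - pnorm(6, mu_T, sd_T)

# the same probability using the rounded value .783
p_rounded <- pnorm(8, mu_T, 0.783) - pnorm(6, mu_T, 0.783)

c(p_exact, p_rounded)
```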
Linear Combinations and their means
Variances of linear combinations
The difference between random variables
The difference between random variables-Example
• Let X1 and X2 represent the number of gallons of
gasoline sold at JJ Gasoline on Monday and Tuesday
respectively. Suppose X1 and X2 are independent
normally distributed RV’s with respective means of 1500
and 1200 gallons and respective standard deviations of
100 and 80 gallons. Let TS = X1+X2 be the total sold and
let TD = X1 – X2 be the difference sold between Monday
and Tuesday.
• By the previous results:
– TS is normal with mean 1500+1200 = 2700 and standard
deviation sqrt(100^2+80^2) = sqrt(16400) = 128.0625.
– TD is normal with mean 1500-1200 = 300 and standard
deviation sqrt(100^2+80^2) = sqrt(16400) = 128.0625.
• Let’s simulate this in R to see if this makes sense.
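One possible simulation is sketched here. It is not the original code: the number of simulated days k and the seed are assumptions, and the object names are illustrative.

```r
set.seed(3)                  # arbitrary seed, only for reproducibility
k <- 100000                  # number of simulated Monday/Tuesday pairs (assumed)

x1 <- rnorm(k, 1500, 100)    # Monday sales
x2 <- rnorm(k, 1200, 80)     # Tuesday sales, independent of Monday

TS <- x1 + x2                # total sold
TD <- x1 - x2                # difference sold

c(mean(TS), sd(TS))          # compare with 2700 and 128.0625
c(mean(TD), sd(TD))          # compare with 300 and 128.0625
sqrt(100^2 + 80^2)           # theoretical standard deviation of both
```

Calling hist(TS) and hist(TD) afterwards gives a visual check that both are bell shaped.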
The difference between random variables-Example-p2
The difference between random variables-Example-p3
This is not the CLT at work, because no sample means are involved.
It is the fact that sums and differences of independent normal random
variables are themselves normal.
Example 29
A gas station sells three grades of gasoline: regular, extra, and super.
These are priced at $3.00, $3.20, and $3.40 per gallon, respectively.
Let X1, X2, and X3 denote the amounts of these grades purchased
(gallons) on a particular day.
Suppose the Xi’s are independent with E(X1) = 1000, E(X2) = 500,
E(X3) = 300, σ1 = 100, σ2 = 80, and σ3 = 50.
The revenue from sales is Y = 3.0X1 + 3.2X2 + 3.4X3, and
E(Y) = 3.0(1000) + 3.2(500) + 3.4(300) = $5620
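The mean of Y, and its variance under the stated independence, follow from the rules for linear combinations. A short sketch in R, with illustrative variable names:

```r
a     <- c(3.0, 3.2, 3.4)    # prices per gallon
mu    <- c(1000, 500, 300)   # E(X1), E(X2), E(X3)
sigma <- c(100, 80, 50)      # standard deviations of X1, X2, X3

EY  <- sum(a * mu)           # E(Y) = sum of a_i * mu_i
VY  <- sum(a^2 * sigma^2)    # V(Y) = sum of a_i^2 * sigma_i^2 (independence)
SDY <- sqrt(VY)              # standard deviation of Y

c(EY, VY, SDY)
```

The variance formula uses the independence of the Xi’s; with correlated sales the covariance terms would have to be added.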
Or without normalizing: