
Topic 9

Sampling Distributions

Sampling Distributions Topic 9 1


Sampling Distributions

The Sampling Distribution of a Sample Mean



The Sampling Distribution of the Sample
Mean
• Population Distribution vs. Sampling
Distribution
• The Mean and Standard Deviation of the
Sample Mean
• Sampling Distribution of a Sample Mean
• Central Limit Theorem



Statistical Inference
• Data generated from random sampling or
randomized comparative experiments
▪ Do a statistical analysis on the sample data
▪ Using the laws of probability to answer the question
“What would happen if we did this very many times?”
• Statistical Inference:
▪ Using the sample to infer something about the
population based on the above
▪ Reasoning rests on asking “How often would this
method give a correct answer if I used it very many
times?”



Sampling Terminology
• Parameter
– a number that describes the population
– in practice, its value is usually unknown
– For example:
• μ, population mean
• σ, population standard deviation
• p, population proportion
• Statistic
– a known value calculated from a sample
– a statistic is often used to estimate a parameter
– For example:
• x̄, sample mean
• s, sample standard deviation
• p̂, sample proportion
Sampling Terminology (continued)

A Statistic Estimates A Parameter

x̄, sample mean, estimates μ, population mean
s, sample standard deviation, estimates σ, population standard deviation
p̂, sample proportion, estimates p, population proportion

Statistics come from samples.

Parameters come from the population.


Population and Sample
The process of statistical inference involves using information from a
sample to draw conclusions about a wider population.
Different random samples yield different statistics. We need to be able to
describe the sampling distribution of possible statistic values in order to
perform statistical inference.
We can think of a statistic as a random variable because it takes numerical
values that describe the outcomes of the random sampling process.

[Diagram: Population → collect data from a representative sample → Sample → make an inference about the population.]


Sampling Terminology (continued)
• Variability
– different samples from the same population may yield
different values of the sample statistic
• Sampling Distribution
– tells what values a statistic takes and how often it takes
those values in repeated sampling
• Sampling Distribution of a Statistic
– The distribution of values taken by the statistic in all
possible samples of the same size from the same
population.



Parameter vs. Statistic
• The mean of a population is denoted by µ; this is a parameter.
• The mean of a sample is denoted by x̄; this is a statistic. x̄ is used to estimate µ.
• The true proportion of a population with a certain trait is denoted by p; this is a parameter.
• The proportion of a sample with a certain trait is denoted by p̂ (“p-hat”); this is a statistic. p̂ is used to estimate p.
The Law of Large Numbers
Law of Large Numbers – as the sample size increases, the sample mean tends to get closer to the population mean. That is, the difference between the sample mean and the population mean tends to become smaller (i.e., approaches zero).
In symbols: x̄ gets closer to µ as n increases.
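The Law of Large Numbers can be seen in a small simulation. The sketch below is only an illustration: the population is a hypothetical right-skewed list of values on 0 to 11 (an assumption, not the data behind these slides), and `sample_means_by_size` is a helper name introduced here.

```python
import random
import statistics

def sample_means_by_size(population, sizes, seed=0):
    """Return {n: mean of one random sample of size n, drawn with replacement}."""
    rng = random.Random(seed)
    return {n: statistics.fmean(rng.choices(population, k=n)) for n in sizes}

# Hypothetical right-skewed population (assumption for illustration only).
population = ([0] * 5 + [1] * 25 + [2] * 25 + [3] * 18 + [4] * 12
              + [5] * 7 + [6] * 4 + [8] * 2 + [11] * 2)
mu = statistics.fmean(population)  # population mean

means = sample_means_by_size(population, sizes=[10, 100, 1_000, 100_000])
for n, xbar in means.items():
    # The gap |xbar - mu| tends to shrink as n grows.
    print(f"n = {n:>6}: xbar = {xbar:.3f}, |xbar - mu| = {abs(xbar - mu):.3f}")
```

Any single small sample can still land far from µ; the tendency toward µ only becomes reliable as n gets large.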



Example: Sampling from a Distribution

Let’s illustrate the concept of the Law of Large Numbers for the sample mean with an example.
We will be taking samples from this population distribution:
[Figure: Population Distribution From Which We Will Take Samples. x-axis: Values of X (0 to 11); y-axis: Population Proportion; Population Mean = 2.56]
This distribution is skewed to the right and looks nothing like a normal curve.
Example: Sampling from a Distribution (cont.)
Here are the distributions of the measurements in four different random samples from our population on the previous slide. Each sample is of size 25.
[Figure: four panels, Sample Distribution for Sample 1, Sample 2, Sample 3, and Sample 4. x-axis: Values of X (0 to 11); y-axis: Number of Data Points]
Each of these distributions looks somewhat like the “parent” population distribution (i.e., the distribution being sampled from), but none look exactly like it.
They all look different from each other as well.
Example: Sampling from a Distribution (cont.)
• Now here is the distribution of the measurements in a sample of
size 1000.
• This distribution looks much more like the parent population.

[Figure: Sample Distribution for Sample of Size 1000. x-axis: Values of X (0 to 11); y-axis: Number of Data Points. This is the distribution within a single sample.]

Conclusion: The distribution of measurements in a large sample looks like the distribution in the parent population. (Think about the Law of Large Numbers.)
Sampling Variability
Different random Samples yield different Statistics. This basic
fact is called sampling variability: the value of a statistic varies
in repeated random sampling.
To make sense of sampling variability, we ask, “What would
happen if we took many samples?”

[Diagram: many different samples drawn from the same population]
Distribution of the Sample Mean
● Suppose we take a series of different random samples.
▪ Sample 1 – we compute sample mean x̄₁
▪ Sample 2 – we compute sample mean x̄₂
▪ Sample 3 – we compute sample mean x̄₃
▪ etc.
● Each time we sample, we may get a different result. The sample mean x̄ is a random variable! Therefore, the sample mean has a mean, a standard deviation, and a probability distribution.
▪ Since the sample mean is determined by chance, there is variability in our point estimates.
▪ This variability leads to uncertainty as to whether our estimates are correct.
▪ So, we need some way to indicate the reliability of statements made about a population based on sample data.


Distribution of the Sample Mean
• Example of a Sampling Distribution of the Sample Mean
From one population we draw 15 samples:
Sample 1, n₁ = 10, x̄₁ = 23.3
Sample 2, n₂ = 10, x̄₂ = 22.7
Sample 3, n₃ = 10, x̄₃ = 23.8
...
Sample 15, n₁₅ = 10, x̄₁₅ = 23.2
• Since we do not know the value of x̄ in advance when we take a sample of 10 from our population, x̄ is a random variable. x̄ has a distribution with a mean, μ_x̄, and a standard deviation, σ_x̄. This distribution is called the Sampling Distribution of the Sample Mean.
• We can estimate μ_x̄ by taking the mean of the 15 values of x̄ from our 15 different samples above: (x̄₁ + x̄₂ + … + x̄₁₅)/15


Distribution of the Sample Mean
• Show example in EXCEL



Distribution of the Sample Mean (cont.)
Terminology: The probability distribution of a statistic is called a sampling distribution.

● Use the sampling distribution to make probability statements about the values the sample mean takes on and the likelihood that our estimates of the population mean based on the sample mean are accurate.
● The sampling distribution of the sample mean depends
upon:
▪ Sample size n.
▪ Mean μ of the population.
▪ Standard deviation σ of the population.
▪ Shape of the population distribution.



Distribution of the Sample Mean (cont.)

Description of the Sampling Distribution:


If you repeatedly take simple random samples of size n from a population and calculate the sample mean for each sample, the distribution of these sample means is the sampling distribution of the sample mean.

Important Property:
The sampling distribution of the sample mean does not
necessarily look like the population distribution.
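The description above can be sketched as a simulation: draw many samples of a fixed size, record each sample mean, and summarize the collection of means. The population below is a hypothetical right-skewed list (an assumption, not the slide data), and `simulate_sample_means` is a helper name introduced for this sketch.

```python
import math
import random
import statistics

def simulate_sample_means(population, n, reps, rng):
    """Draw `reps` samples of size n (with replacement); return the list of sample means."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]

# Hypothetical right-skewed population (assumption for illustration only).
population = ([0] * 5 + [1] * 25 + [2] * 25 + [3] * 18 + [4] * 12
              + [5] * 7 + [6] * 4 + [8] * 2 + [11] * 2)
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

rng = random.Random(42)
results = {}
for n in (10, 40, 160):
    means = simulate_sample_means(population, n, reps=5_000, rng=rng)
    # Store (mean of the sample means, standard deviation of the sample means).
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n = {n:>3}: mean of xbar = {results[n][0]:.3f} (mu = {mu:.3f}), "
          f"sd of xbar = {results[n][1]:.3f} (sigma/sqrt(n) = {sigma / math.sqrt(n):.3f})")
```

In a run like this, the mean of the sample means should sit near µ for every n, while their spread shrinks roughly like σ/√n.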



Example: Sampling Distribution

Let’s illustrate the concept of the sampling distribution of the sample mean with an example.
We will be taking samples from this population distribution:
[Figure: Population Distribution From Which We Will Take Samples. x-axis: Values of X (0 to 11); y-axis: Population Proportion; Population Mean = 2.56]
This distribution is skewed to the right and looks nothing like a normal curve.
Example: Sampling Distribution (cont.)
Now suppose that we took a sample of size 10 and computed the
sample mean as a sample statistic and did this many times. The
sampling distribution of this random variable looks like this.

[Figure: Probability Dist'n of Sample Mean for Samples of Size 10, μ = 2.56. x-axis: Values of the Sample Mean; y-axis: Probability Density. This is the distribution across many samples.]
This looks somewhat like a normal curve and not at all like the parent population. Skewness is still evident.
Values of the sample mean range from approximately 0 to 6. (Values in original population are 0 to 11.)
Example: Sampling Distribution (cont.)
[Figure: two panels, Probability Dist'n of Sample Mean for Samples of Size 40 and for Samples of Size 160, each with μ = 2.56. x-axis: Values of the Sample Mean; y-axis: Probability Density]
Sample size 40: This looks more like a normal curve than for a sample of size 10. Values of the sample mean range from approximately 1.5 to 4.
Sample size 160: This looks almost exactly like a normal curve. Values of the sample mean range from approximately 1.75 to 3.25. (A much tighter distribution: variation decreases as sample size gets larger.)
Behavior of Sampling Distribution
What can we take from our sampling distribution example
above?
1) The distribution of measurements in a sample looks like the
distribution in the parent population, NOT necessarily like a
Normal curve.
2) The sampling distribution of the sample mean looks more and more like a normal curve as our sample size increases, even though the parent population is definitely NOT normal.
3) As the sample size increases, the sample mean gets closer to the
population mean, i.e., the difference between the sample mean and
the population mean tends to become smaller (i.e., approaches zero).
(Law of Large Numbers!)
4) The spread in the histograms for the sampling distribution of the
sample mean is getting smaller for larger sample sizes.
(Law of Large Numbers!)
Mean and Standard Deviation of the Sampling
Distribution of the Sample Means
Mean of a sampling distribution of a sample mean
There is no tendency for a sample mean to fall systematically above or below μ, even if the distribution of the raw data is skewed. Thus, the mean of the sampling distribution is an unbiased estimate of the population mean μ.

Standard deviation of a sampling distribution of a sample mean
The standard deviation of the sampling distribution measures how much the sample statistic varies from sample to sample. It is smaller than the standard deviation of the population by a factor of √n.

• Averages are less variable than individual observations.


Mean and Standard Deviation of the Sampling
Distribution of the Sample Means
• Suppose numerous samples of size n are taken from a population with mean μ and standard deviation σ.
• If the mean, x̄, is computed for each of these samples of size n, then x̄ is a random variable (its distribution is the sampling distribution of the sample mean).
• The mean of this sampling distribution, μ_x̄, equals μ (where μ is the population mean), and its standard deviation, σ_x̄, called the standard error, is σ_x̄ = σ/√n (where σ is the population standard deviation).
Summary of Properties of Sampling Distribution
The sampling distribution of the sample mean has several
important properties:
● If a simple random sample of size n is drawn from any large population,
then the sampling distribution of the sample mean has:
▪ Mean: μ_x̄ = μ
(The mean of the sampling distribution of the sample mean equals the population mean.)
▪ Standard deviation, called the standard error of the mean: σ_x̄ = σ/√n
(As the sample size increases, the standard error of the sample mean gets smaller.)
● In addition, if the population is normally distributed, then, the sampling
distribution is normally distributed.
Terminology: The Standard Deviation of a statistic is called its Standard Error.
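As a small illustration of the standard error formula σ/√n, the sketch below tabulates it for a few sample sizes. The value σ = 10 and the function name `standard_error` are assumptions chosen for this example only.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)

sigma = 10.0  # hypothetical population standard deviation (assumption)
for n in (1, 4, 25, 100):
    # Quadrupling n halves the standard error.
    print(f"n = {n:>3}: standard error = {standard_error(sigma, n):.2f}")
```

Because n sits under a square root, cutting the standard error in half requires four times as many observations.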



Mean and Standard Deviation of the Sampling
Distribution of the Sample Means
• Since the mean of the random variable x̄ is μ (that is, μ_x̄ = μ), we say that x̄ is an unbiased estimator of μ.
• Individual observations have standard deviation σ, but sample means from samples of size n have standard deviation σ/√n (called the standard error, that is, σ_x̄ = σ/√n).
• Averages are less variable than individual observations.
Notation for Sampling Distribution of Sample
Means

If individual observations have the N(µ, σ) distribution, then the sampling distribution of the sample mean, x̄, of n independent observations has the N(µ, σ/√n) distribution.

“If measurements in the population follow a Normal distribution, then so does the sample mean.”


Central Limit Theorem
● If our population has a normal distribution, then the
sampling distribution of the sample mean is normal.

● However … what if the population does not have a normal distribution? What can we do?

● Wouldn’t it be very nice if the sampling distribution for the sample mean were normal, even when the population distribution is not? This is almost true …


Central Limit Theorem
Most population distributions are not Normal. What is the shape of the
sampling distribution of sample means when the population distribution
isn’t Normal?

It is a remarkable fact that as the sample size increases, the distribution of sample means changes its shape: it looks less like that of the population and more like a Normal distribution!

When the sample is large enough, the distribution of sample means is very close to Normal, no matter what shape the population distribution has, as long as the population has a finite standard deviation.

The Central Limit Theorem states:
Regardless of the shape of the population distribution, the sampling distribution of the sample mean becomes approximately normal as the sample size n increases.
Central Limit Theorem (cont.)

Draw an SRS of size n from any population with mean μ and finite standard deviation σ. The central limit theorem (CLT) says that when n is large, the sampling distribution of the sample mean x̄ is approximately Normal:
x̄ is approximately N(μ, σ/√n)


Central Limit Theorem (cont.)

Summary:
● If the random variable X (i.e., the population) is normally
distributed, then the sampling distribution of the sample mean is
normally distributed for any sample size.
● For all other random variables X (i.e., other populations), the
sampling distribution of the sample mean is approximately
normally distributed if n is 30 or higher. (The convention in our
class for n large enough)



Old Faithful Example
The most famous geyser in the world, Old Faithful in Yellowstone Nat’l
Park, has a mean time between eruptions of 85 minutes and a
standard deviation of 21.25 minutes. The distribution of the time
interval between eruptions is not normal.
a) What is the probability that a randomly selected time interval will be less
than 75 minutes?
b) What is the probability that a random sample of 20 time intervals will have
a mean less than 75 minutes?
c) What is the probability that a random sample of 30 time intervals will have
a mean less than 75 minutes?
d) What is the probability that a random sample of 30 time intervals will have
a mean greater than 100 minutes?
e) What is the probability that a random sample of 30 time intervals will have
a mean between 75 and 90 minutes?



Old Faithful Example
The most famous geyser in the world, Old Faithful in Yellowstone Nat’l
Park, has a mean time between eruptions of 85 minutes and a
standard deviation of 21.25 minutes. The distribution of the time
interval between eruptions is not normal.
a) What is the probability that a randomly selected time interval will be less than 75 minutes?
Answer: Cannot be done with the information given (the population is not normal and its shape is unknown).
b) What is the probability that a random sample of 20 time intervals will have a mean less than 75 minutes?
Answer: Cannot be done with the information given (n = 20 is below the n ≥ 30 convention, so the normal approximation is not used).
c) What is the probability that a random sample of 30 time intervals will have a mean less than 75 minutes?
Answer: 0.0050
d) What is the probability that a random sample of 30 time intervals will have a mean greater than 100 minutes?
Answer: 0.0001
e) What is the probability that a random sample of 30 time intervals will have a mean between 75 and 90 minutes?
Answer: 0.8963
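Parts c) to e) can be checked numerically. The sketch below assumes, following the class convention, that n = 30 is large enough to treat x̄ as approximately N(μ, σ/√n); it uses `statistics.NormalDist` from the Python standard library, and the variable names are introduced here for illustration.

```python
import math
from statistics import NormalDist

mu, sigma, n = 85, 21.25, 30          # values stated in the example
standard_error = sigma / math.sqrt(n)  # sigma / sqrt(n)

# Approximate sampling distribution of the sample mean (CLT, n = 30).
xbar_dist = NormalDist(mu, standard_error)

p_c = xbar_dist.cdf(75)                      # P(xbar < 75)
p_d = 1 - xbar_dist.cdf(100)                 # P(xbar > 100)
p_e = xbar_dist.cdf(90) - xbar_dist.cdf(75)  # P(75 < xbar < 90)

print(f"c) {p_c:.4f}   d) {p_d:.4f}   e) {p_e:.4f}")
```

The results should agree with the slide's answers up to rounding; small differences can arise because table lookups round z to two decimal places.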
Summary: Distribution of the Sample Mean
The sample mean is a random variable with a probability
distribution called the sampling distribution.
– The mean of the sampling distribution is equal to the mean of the population: μ_x̄ = μ.
– The standard deviation of the sampling distribution (i.e., the standard error, σ_x̄) is σ_x̄ = σ/√n, where σ = population standard deviation and n = sample size.
– Shape of the sampling distribution:
• If the sample size n is sufficiently large (30 or more is a
good rule of thumb), then this distribution is
approximately normal (regardless of the shape of the
population distribution).
• If the population is normally distributed, then the
sampling distribution is normal for any sample size.
A Few More Facts
Any linear combination of independent
Normal random variables is also Normally
distributed.
More generally, the central limit theorem states that the distribution of a sum or average of many small random quantities is close to Normal.
Finally, the central limit theorem also applies to discrete random variables.


The Sampling Distribution of the Sample
Mean
What Was Covered
• Population Distribution vs. Sampling
Distribution
• The Mean and Standard Deviation of the
Distribution of the Sample Mean
• Sampling Distribution of a Sample Mean
• Central Limit Theorem



Topic 9 cont.

Sampling Distributions for Sample Proportion


Sampling Distributions for Counts and
Proportions
▪ Binomial Distributions for Sample Counts
▪ Binomial Distributions in Statistical Sampling
▪ Finding Binomial Probabilities
▪ Binomial Mean and Standard Deviation
▪ Sample Proportions
▪ Normal Approximation for Counts and
Proportions



Binomial Probability Distribution
• Binomial Probability Distribution
▪ A special discrete probability distribution
▪ Describes probabilities for experiments that have
two mutually exclusive outcomes (the experiment
has only two outcomes)
▪ One outcome is called a success
▪ The word “success” does not mean that this is a “good” outcome or that we want this to be the outcome. It is the outcome that we are looking for.
▪ For example, if we are looking at a cobra strike on an animal, a success is that the animal dies.
▪ The other outcome is called a failure
▪ In the above example, the failure would be that the animal does not die.
Definition of a Binomial Experiment
Definition: A binomial experiment is an experiment
with the following characteristics:
1. The experiment is performed a fixed number of
times, n; each time is called a trial. So there are a
fixed number of trials.
2. The n trials are independent. (Outcome of one trial
will not affect the outcome of any other trials.)
3. Each trial has only two possible outcomes, usually
called success and failure.
4. The probability of success is the same for each trial
of the experiment.
Notation for a Binomial Experiment
Notation used for binomial experiments:
• The number of trials is represented by n.
• The probability of a success (in a single trial) is represented by p.
• The total number of successes in n trials is represented by the random variable X. (X is called a binomial random variable.)
• Because there cannot be a negative number of successes, and because there cannot be more than n successes (out of n attempts), every value x of X satisfies: 0 ≤ x ≤ n
• The probability distribution of a binomial random variable is called a binomial distribution.
Binomial Setting
Example 1
In a shipment of 100 televisions, how many are defective? The probability of a television being defective is 0.08.
n = 100
p = 0.08
X = number of defective televisions in the shipment of 100
X is a binomial random variable since the trials (televisions) are independent, there is a fixed number of televisions, n = 100, and the probability of success (being defective) is the same for each television (p = 0.08).
X ~ Binomial(100, 0.08)
Binomial Setting
Example 2
A new procedure for treating breast cancer is used on 25 patients. How many patients are cured? The procedure has a probability of 0.76 of curing a patient.
n = 25
p = 0.76
X = number of patients cured out of the 25
X is a binomial random variable since the trials (patients with breast cancer) are independent, there is a fixed number of patients, n = 25, and the probability of success (being cured) is the same for each patient (p = 0.76).
X ~ Binomial(25, 0.76)


Summary for the Binomial Distribution
• Binomial Experiment
▪ Fixed number of trials, n
▪ Only two outcomes for each trial, success or failure
▪ The n trials are independent
▪ The probability of a success, p, is the same for each trial
• Let X = the count of successes in a binomial setting. The distribution of X is the binomial distribution with parameters n and p: X ~ Binomial(n, p)
▪ n is the number of trials/observations
▪ p is the probability of a success on any one observation (p must be the same for each trial)
▪ The random variable X takes on whole-number values between 0 and n
Binomial Probabilities
• Find the probability that a binomial random variable X takes any particular value. X is the number of successes out of n trials/observations.
– P(x successes out of n observations) = P(X = x)
– There are three ways to calculate binomial
probabilities:
• Using the Binomial Formula
• Using Binomial Tables
• Using a statistical software package
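The slides name the Binomial Formula without writing it out. The standard formula is P(X = x) = C(n, x) · p^x · (1 − p)^(n − x), and the sketch below is one way to evaluate it with the Python standard library. The function name `binomial_pmf` is introduced here, and the worked values reuse n = 25, p = 0.76 from Example 2.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p): C(n, x) * p**x * (1 - p)**(n - x)."""
    if not 0 <= x <= n:
        return 0.0  # X only takes whole-number values from 0 to n
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 25, 0.76  # Example 2: 25 patients, cure probability 0.76
for x in (17, 19, 21):
    print(f"P(X = {x}) = {binomial_pmf(x, n, p):.4f}")

# The probabilities over all possible values of X add up to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
```

For very large n, the powers p**x can underflow in floating point, so statistical software typically works with logarithms instead of this direct form.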



Mean and Standard Deviation
• If X has the binomial distribution with n observations and probability p of success on each observation, then the mean and standard deviation of X are
μ = np
σ = √(np(1 − p))


Case Study
Inspecting Switches for lot of 10 Switches
The number X of bad switches has approximately the binomial distribution with n = 10 and p = 0.1. Find the mean and standard deviation of this distribution.
µ = np = (10)(0.1) = 1
• The probability of each being bad is one tenth, so we expect (on average) to get 1 bad one out of the 10 sampled.
σ = √(np(1 − p)) = √((10)(0.1)(1 − 0.1)) = √0.9 = 0.9487
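The arithmetic in this case study can be reproduced directly from μ = np and σ = √(np(1 − p)). This is a minimal sketch; the function name `binomial_mean_sd` is introduced here for illustration.

```python
import math

def binomial_mean_sd(n, p):
    """Return (mean, standard deviation) of X ~ Binomial(n, p)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd

mean, sd = binomial_mean_sd(10, 0.1)  # switch inspection: n = 10, p = 0.1
print(f"mean = {mean:.1f}, standard deviation = {sd:.4f}")
```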


Case Study
Inspecting Switches for lot of 10 Switches
The CDF from MINITAB
Cumulative Distribution Function

Binomial with n = 10 and p = 0.1

x P( X <= x )
0 0.34868
1 0.73610
2 0.92981
3 0.98720
4 0.99837
5 0.99985
6 0.99999
7 1.00000
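A cumulative table like the MINITAB output above can be approximated by summing binomial probabilities from 0 up to x. The sketch below uses only the Python standard library; `binomial_cdf` is a name introduced here, not a MINITAB function.

```python
import math

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), summing P(X = k) for k = 0..x."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

n, p = 10, 0.1  # switch inspection case study
table = {x: binomial_cdf(x, n, p) for x in range(8)}

print("x   P( X <= x )")
for x, prob in table.items():
    print(f"{x}   {prob:.5f}")
```

From such a table, a single probability follows by subtraction, for example P(X = 2) = P(X ≤ 2) − P(X ≤ 1).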



Case Study
Inspecting Switches

[Figure: Probability Histogram for n = 10, p = 0.1]


Normal Approximation for Binomial
Distributions
As n gets larger, something interesting happens to the shape of a binomial distribution.

Normal Approximation for Binomial Distributions
Suppose that X has the binomial distribution with n trials and success probability p. When n is large, the distribution of X is approximately Normal with mean and standard deviation
μ_X = np, σ_X = √(np(1 − p))
As a rule of thumb, we will use the Normal approximation when n is so large that np ≥ 10 and n(1 – p) ≥ 10.
Example
Sample surveys show that fewer people enjoy shopping than in the past. A survey asked a
nationwide random sample of 2500 adults if they agreed or disagreed that “I like buying
new clothes, but shopping is often frustrating and time-consuming.” Suppose that exactly
60% of all adult U.S. residents would say “Agree” if asked the same question. Let X = the
number in the sample who agree. Estimate the probability that 1520 or more of the
sample agree.
1. Verify that X is approximately a binomial random variable.
Success = agree, Failure = don’t agree
n = 2500 trials of the chance process.
The probability of selecting an adult who agrees is p = 0.60.
Each trial is independent.

2. Check the conditions for using a Normal approximation.
Since np = 2500(0.60) = 1500 and n(1 – p) = 2500(0.40) = 1000 are both at least 10, we may use the Normal approximation.

3. Calculate P(X ≥ 1520) using a Normal approximation.
μ = np = 2500(0.60) = 1500
σ = √(np(1 − p)) = √(2500(0.60)(0.40)) = 24.49
z = (1520 − 1500)/24.49 = 0.82
P(X ≥ 1520) ≈ P(Z ≥ 0.82) = 1 − 0.7939 = 0.2061
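The three steps of this example can be sketched in code. The version below follows the slide in using 1520 directly; it also shows, as an optional variant not used in the slide's calculation, what a continuity correction (cutoff 1519.5) would give. Variable names are introduced here for illustration.

```python
import math
from statistics import NormalDist

n, p = 2500, 0.60
cutoff = 1520

# Step 2: rule-of-thumb conditions for the Normal approximation.
conditions_ok = n * p >= 10 and n * (1 - p) >= 10

# Step 3: Normal approximation with mu = np and sigma = sqrt(np(1 - p)).
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx = 1 - NormalDist(mu, sigma).cdf(cutoff)

# Optional variant (assumption, not part of the slide's arithmetic):
# continuity correction, treating X >= 1520 as X >= 1519.5.
approx_cc = 1 - NormalDist(mu, sigma).cdf(cutoff - 0.5)

print(f"conditions ok: {conditions_ok}")
print(f"P(X >= {cutoff}) ~ {approx:.4f} (no correction), {approx_cc:.4f} (with correction)")
```

The uncorrected value comes out near 0.207 rather than exactly 0.2061 because the slide rounds z to 0.82 before using the table.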
Sampling Distribution of a Sample Proportion
There is an important connection between the sample proportion p̂ and the number of "successes" X in the sample.

p̂ = (count of successes in sample)/(size of sample) = X/n


Proportions
• The proportion of a population that has some
outcome (“success”) is p.
• The proportion of successes in a sample is measured by the sample proportion:
p̂ = (number of successes in the sample)/(total number of observations in the sample) = x/n
“p-hat” (sample proportion) is the point estimate for p (population proportion).
So it is no more than the binomial variable divided by the sample size.
Point Estimate, p̂
Example
• A polling company polls 1200 people to see if they support a certain issue. 408 of the 1200 supported the issue. What is the sample proportion?
• Let X = number of people who supported the issue (the number of successes in the sample)
• Let n = number in our sample
• Then p̂ = x/n = 408/1200 = 0.34
Distribution of the Sample Proportion
Because the value of the sample proportion varies from sample to sample, it is a random variable.
So, we have the same questions for the sample proportion as we had for the sample mean:
• What is the mean of the sample proportion?
• What is the standard deviation of the sample proportion?
• What is the sampling distribution of the sample proportion?
• Can we apply the Central Limit Theorem to approximate these with normal distributions?

• The answer is yes …


Distribution of the Sample Proportion
Because the value of the sample proportion varies from sample to sample, it is a random variable.
So, we have the same questions for the sample proportion as we had for the sample mean:
• What is the mean of the sample proportion? p
• What is the standard deviation of the sample proportion?
standard error = √(p(1 − p)/n)
(again, nothing more than the binomial standard deviation divided by n: √(np(1 − p))/n = √(p(1 − p)/n))
• What is the sampling distribution of the sample proportion if np(1 − p) ≥ 10, or if the numbers of successes and failures are both 15 or more (the conditions that allow us to approximate the binomial using the normal distribution)?
p̂ ~ N(mean: μ_p̂ = p, standard error: σ_p̂ = √(p(1 − p)/n))
Inference about a Proportion
Simple Conditions

Shape of the sampling distribution: If the sample size n is sufficiently large (with n ≤ 0.05N, where N is the population size) and the population proportion p isn’t close to either 0 or 1 (i.e., np(1−p) ≥ 10), then the sampling distribution is approximately normal. In this book it is simplified to: the numbers of successes and failures in the sample are both ≥ 15.


Aerobics Example
Assume that 80% of the people taking aerobics classes
are female and a simple random sample of n= 100
students is taken.

a) Describe the sampling distribution of the sample proportion of female students.
b) What is the probability that at most 75% of the students in the sample are female?
c) What is the probability that at least 90 of the students in the sample are female? Would that be unusual?


Aerobics Example
P(success) = 0.8, P(failure) = 1 – 0.8 = 0.2, n = 100
a) number of successes = 0.8*100 = 80 ≥ 15
number of failures = 0.2*100 = 20 ≥ 15
p̂ ~ N(0.8, √(0.8*0.2/100) = 0.04)
b) P(p̂ ≤ 0.75) = P(Z ≤ (0.75 − 0.80)/√(0.8*0.2/100)) = P(Z ≤ −1.25) = 0.1056
c) P(p̂ ≥ 0.90) = P(Z ≥ 2.5) = 0.0062
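The aerobics calculations can be checked with a short sketch, assuming the normal approximation p̂ ~ N(p, √(p(1 − p)/n)) justified in part a). It uses `statistics.NormalDist` from the Python standard library; the variable names are introduced here.

```python
import math
from statistics import NormalDist

p, n = 0.80, 100

# a) Conditions: expected successes and failures are both at least 15.
conditions_ok = n * p >= 15 and n * (1 - p) >= 15

# Standard error of the sample proportion: sqrt(p(1 - p)/n).
standard_error = math.sqrt(p * (1 - p) / n)
phat_dist = NormalDist(p, standard_error)

p_b = phat_dist.cdf(0.75)      # b) P(p-hat <= 0.75)
p_c = 1 - phat_dist.cdf(0.90)  # c) P(p-hat >= 0.90), i.e. at least 90 of 100

print(f"standard error = {standard_error:.2f}")
print(f"b) {p_b:.4f}   c) {p_c:.4f}")
```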



Sampling Distributions for Counts and
Proportions
▪ Binomial Distributions for Sample Counts
▪ Binomial Distributions in Statistical Sampling
▪ Finding Binomial Probabilities
▪ Binomial Mean and Standard Deviation
▪ Sample Proportions
▪ Normal Approximation for Counts and
Proportions
