
# Central Limit Theorem

When the population from which we are selecting a random sample does not have a normal distribution, the central limit theorem is helpful in identifying the shape of the sampling distribution of $\bar{x}$.

CENTRAL LIMIT THEOREM
In selecting random samples of size $n$ from a population, the sampling distribution of the sample mean $\bar{x}$ can be approximated by a normal distribution as the sample size becomes large.
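
The theorem can be checked informally by simulation. The sketch below is an illustration only: the exponential population, the seed, and the number of replications are assumptions chosen for the demonstration and do not come from the slides.

```python
import random
import statistics

def simulate_sample_means(n, replications=5000, seed=1):
    """Draw `replications` samples of size n from a skewed (exponential)
    population and return the list of sample means."""
    rng = random.Random(seed)
    # Exponential population with rate 1: mean = 1, standard deviation = 1.
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(replications)
    ]

means = simulate_sample_means(n=30)
center = statistics.fmean(means)   # expected to be close to the population mean, 1
spread = statistics.stdev(means)   # expected to be close to 1 / sqrt(30)
# Share of sample means within one standard deviation of the center;
# for a normal distribution this share is roughly 0.68.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)
print(round(center, 3), round(spread, 3), round(within_one_sd, 3))
```

Even though the population is strongly skewed, the sample means cluster around the population mean in a roughly bell-shaped pattern.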

[Figure: Sampling distribution of $\bar{x}$ for MAT scores]

$E(\bar{x}) = 1090$

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{80}{\sqrt{30}} = 14.6$

## Example: St. Andrews College

What is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean MAT score that is within +/-10 of the actual population mean $\mu$?

In other words, what is the probability that $\bar{x}$ will be between 1080 and 1100?

Step 1: Calculate the z-value at the upper endpoint of the interval.
z = (1100 - 1090)/14.6 = .68

Step 2: Find the area under the curve to the left of the upper endpoint.
P(z < .68) = .7517

Cumulative Probabilities for the Standard Normal Distribution (excerpt)

| z | .00 | .01 | .02 | .03 | .04 | .05 | .06 | .07 | .08 | .09 |
|---|---|---|---|---|---|---|---|---|---|---|
| .5 | .6915 | .6950 | .6985 | .7019 | .7054 | .7088 | .7123 | .7157 | .7190 | .7224 |
| .6 | .7257 | .7291 | .7324 | .7357 | .7389 | .7422 | .7454 | .7486 | .7517 | .7549 |
| .7 | .7580 | .7611 | .7642 | .7673 | .7704 | .7734 | .7764 | .7794 | .7823 | .7852 |
| .8 | .7881 | .7910 | .7939 | .7967 | .7995 | .8023 | .8051 | .8078 | .8106 | .8133 |
| .9 | .8159 | .8186 | .8212 | .8238 | .8264 | .8289 | .8315 | .8340 | .8365 | .8389 |
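
The table entries can be reproduced with Python's standard library. This is a minimal sketch; the helper name `z_table_row` is hypothetical and only illustrates how a row of the cumulative table could be generated.

```python
from statistics import NormalDist

def z_table_row(z_tenths):
    """Return P(Z < z) rounded to 4 places for z = z_tenths + .00, .01, ..., .09."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        round(std_normal.cdf(z_tenths + hundredths / 100), 4)
        for hundredths in range(10)
    ]

# Print the five rows shown in the excerpt.
for tenths in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(tenths, z_table_row(tenths))
```

The entry for z = .68 sits in the `.6` row under the `.08` column, which is index 8 of the returned list.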

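[Figure: Sampling distribution of $\bar{x}$ for MAT scores, centered at 1090 with $\sigma_{\bar{x}} = 14.6$. The area under the curve to the left of $\bar{x} = 1100$ is .7517.]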

Step 3: Calculate the z-value at the lower endpoint of the interval.
z = (1080 - 1090)/14.6 = -.68

Step 4: Find the area under the curve to the left of the lower endpoint.
P(z < -.68) = .2483

[Figure: Sampling distribution of $\bar{x}$ for MAT scores, centered at 1090 with $\sigma_{\bar{x}} = 14.6$. The area under the curve to the left of $\bar{x} = 1080$ is .2483.]
Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.
P(-.68 < z < .68) = P(z < .68) - P(z < -.68)
= .7517 - .2483
= .5034

The probability that the sample mean MAT score will be between 1080 and 1100 is:
P(1080 < $\bar{x}$ < 1100) = .5034
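
The five steps can be sketched in code as follows. The function name and the `round_z` flag are assumptions made for illustration; rounding z to two decimals mimics a table lookup, while the unrounded version shows what software reports without that rounding.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_between(lower, upper, mu, sigma, n, round_z=False):
    """P(lower < x_bar < upper) using the normal approximation."""
    std_error = sigma / sqrt(n)          # standard error of the mean
    z_lower = (lower - mu) / std_error   # z-value at the lower endpoint
    z_upper = (upper - mu) / std_error   # z-value at the upper endpoint
    if round_z:
        # Mimic a two-decimal lookup in a printed z table.
        z_lower, z_upper = round(z_lower, 2), round(z_upper, 2)
    std_normal = NormalDist()
    # Area to the left of the upper endpoint minus area left of the lower one.
    return std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

table_style = prob_mean_between(1080, 1100, mu=1090, sigma=80, n=30, round_z=True)
unrounded = prob_mean_between(1080, 1100, mu=1090, sigma=80, n=30)
print(round(table_style, 4), round(unrounded, 4))
```

The table-style result agrees with .5034 to about three decimal places; the unrounded result is slightly larger because z is not truncated to .68.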

[Figure: Sampling distribution of $\bar{x}$ for MAT scores, centered at 1090 with $\sigma_{\bar{x}} = 14.6$. The area under the curve between $\bar{x} = 1080$ and $\bar{x} = 1100$ is .5034.]

## Relationship Between the Sample Size and the Sampling Distribution of $\bar{x}$

Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.

$E(\bar{x}) = \mu$ regardless of the sample size. In our example, $E(\bar{x})$ remains at 1090.

Whenever the sample size is increased, the standard error of the mean $\sigma_{\bar{x}}$ is decreased. With the increase in the sample size to n = 100, the standard error of the mean is decreased from 14.6 to:

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{80}{\sqrt{100}} = 8.0$
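
A small sketch of how the standard error shrinks as n grows. The sample size n = 400 is a hypothetical extra value added for illustration; only n = 30 and n = 100 appear in the example.

```python
from math import sqrt

def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

# sigma = 80 is the population standard deviation used in the example.
for n in (30, 100, 400):
    print(n, round(standard_error_of_mean(80, n), 1))
```

Because n enters through a square root, quadrupling the sample size only halves the standard error.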

[Figure: Sampling distributions of $\bar{x}$ for the two sample sizes, both centered at $E(\bar{x}) = 1090$. With n = 100, $\sigma_{\bar{x}} = 8$; with n = 30, $\sigma_{\bar{x}} = 14.6$.]

Recall that when n = 30, P(1080 < $\bar{x}$ < 1100) = .5034.

We follow the same steps to solve for P(1080 < $\bar{x}$ < 1100) when n = 100 as we showed earlier when n = 30.

Now, with n = 100, P(1080 < $\bar{x}$ < 1100) = .7888.

Because the sampling distribution with n = 100 has a smaller standard error, the values of $\bar{x}$ have less variability and tend to be closer to the population mean than the values of $\bar{x}$ with n = 30.
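
The comparison between the two sample sizes can be sketched as below. The helper name is hypothetical, and the calculation does not round z, so the n = 30 value differs slightly from the table-based .5034.

```python
from math import sqrt
from statistics import NormalDist

def prob_within_margin(sigma, n, margin):
    """P(|x_bar - mu| < margin) under the normal approximation."""
    z = margin / (sigma / sqrt(n))        # margin expressed in standard errors
    return 2 * NormalDist().cdf(z) - 1    # symmetric interval around the mean

p30 = prob_within_margin(sigma=80, n=30, margin=10)
p100 = prob_within_margin(sigma=80, n=100, margin=10)
print(round(p30, 4), round(p100, 4))
```

With n = 100 the margin of 10 points equals 1.25 standard errors instead of about 0.68, which is why the probability rises.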

[Figure: Sampling distribution of $\bar{x}$ for MAT scores with n = 100, centered at 1090 with $\sigma_{\bar{x}} = 8$. The area under the curve between $\bar{x} = 1080$ and $\bar{x} = 1100$ is .7888.]

## Sampling Distribution of $\bar{p}$

1. The population has proportion p = ?
2. A simple random sample of n elements is selected from the population.
3. The sample data provide a value for the sample proportion $\bar{p}$.
4. The value of $\bar{p}$ is used to make inferences about the value of p.

The sampling distribution of $\bar{p}$ is the probability distribution of all possible values of the sample proportion $\bar{p}$.

Expected Value of $\bar{p}$

$E(\bar{p}) = p$

where:
p = the population proportion

Standard Deviation of $\bar{p}$

Finite Population

$\sigma_{\bar{p}} = \sqrt{\dfrac{N-n}{N-1}} \sqrt{\dfrac{p(1-p)}{n}}$

Infinite Population

$\sigma_{\bar{p}} = \sqrt{\dfrac{p(1-p)}{n}}$

$\sigma_{\bar{p}}$ is referred to as the standard error of the proportion.
$\sqrt{(N-n)/(N-1)}$ is the finite population correction factor.
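
Both formulas can be written as one function. In this sketch the population size N = 900 used for the finite case is a hypothetical value chosen for illustration; p = .72 and n = 30 are the values from the St. Andrews example.

```python
from math import sqrt

def standard_error_of_proportion(p, n, population_size=None):
    """Standard error of the sample proportion.

    If population_size (N) is given, the finite population correction
    factor sqrt((N - n) / (N - 1)) is applied.
    """
    std_error = sqrt(p * (1 - p) / n)
    if population_size is not None:
        std_error *= sqrt((population_size - n) / (population_size - 1))
    return std_error

infinite_case = standard_error_of_proportion(0.72, 30)
finite_case = standard_error_of_proportion(0.72, 30, population_size=900)  # N is hypothetical
print(round(infinite_case, 3), round(finite_case, 3))
```

The correction factor is always below 1 when n > 1, so the finite-population standard error is never larger than the infinite-population one.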

The sampling distribution of $\bar{p}$ can be approximated by a normal distribution whenever the sample size is large enough to satisfy the two conditions:

np > 5 and n(1 - p) > 5

. . . because when these conditions are satisfied, the probability distribution of x in the sample proportion, $\bar{p} = x/n$, can be approximated by a normal distribution (and because n is a constant).
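
A minimal sketch of the sample-size check, assuming the rule that both np and n(1 - p) must exceed 5, as applied in the St. Andrews example. The function name and the second test case (n = 10) are hypothetical.

```python
def normal_approximation_ok(n, p, threshold=5):
    """Return True when both np and n(1 - p) exceed the threshold."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes > threshold and expected_failures > threshold

print(normal_approximation_ok(30, 0.72))  # np = 21.6 and n(1 - p) = 8.4
print(normal_approximation_ok(10, 0.72))  # n(1 - p) = 2.8 is too small
```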

## Example: St. Andrews College

Recall that 72% of the prospective students applying to St. Andrews College desire on-campus housing.

What is the probability that a simple random sample of 30 applicants will provide an estimate of the population proportion of applicants desiring on-campus housing that is within plus or minus .05 of the actual population proportion?

For our example, with n = 30 and p = .72, the normal distribution is an acceptable approximation because:

np = 30(.72) = 21.6 > 5

and

n(1 - p) = 30(.28) = 8.4 > 5

[Figure: Sampling distribution of $\bar{p}$]

$E(\bar{p}) = .72$

$\sigma_{\bar{p}} = \sqrt{\dfrac{.72(1 - .72)}{30}} = .082$

Step 1: Calculate the z-value at the upper endpoint of the interval.
z = (.77 - .72)/.082 = .61

Step 2: Find the area under the curve to the left of the upper endpoint.
P(z < .61) = .7291

[Figure: Sampling distribution of $\bar{p}$, centered at .72 with $\sigma_{\bar{p}} = .082$. The area under the curve to the left of $\bar{p} = .77$ is .7291.]

Step 3: Calculate the z-value at the lower endpoint of the interval.
z = (.67 - .72)/.082 = -.61

Step 4: Find the area under the curve to the left of the lower endpoint.
P(z < -.61) = .2709

[Figure: Sampling distribution of $\bar{p}$, centered at .72 with $\sigma_{\bar{p}} = .082$. The area under the curve to the left of $\bar{p} = .67$ is .2709.]

Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval.
P(-.61 < z < .61) = P(z < .61) - P(z < -.61)
= .7291 - .2709
= .4582

The probability that the sample proportion of applicants wanting on-campus housing will be within +/-.05 of the actual population proportion is:
P(.67 < $\bar{p}$ < .77) = .4582
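
The same probability can be obtained without an explicit z conversion by giving the normal distribution its own mean and standard error. This is a sketch under the normal approximation; the variable names are hypothetical, and because nothing is rounded along the way the result matches .4582 only to about three decimals.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.72, 30
std_error = sqrt(p * (1 - p) / n)   # standard error of the proportion

# Approximate sampling distribution of the sample proportion.
sampling_dist = NormalDist(mu=p, sigma=std_error)

# Area between the lower (.67) and upper (.77) endpoints.
prob_within = sampling_dist.cdf(0.77) - sampling_dist.cdf(0.67)
print(round(std_error, 3), round(prob_within, 4))
```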

[Figure: Sampling distribution of $\bar{p}$, centered at .72 with $\sigma_{\bar{p}} = .082$. The area under the curve between $\bar{p} = .67$ and $\bar{p} = .77$ is .4582.]

## Properties of Point Estimators

Before using a sample statistic as a point estimator, statisticians check to see whether the sample statistic has the following properties associated with good point estimators:

- Unbiased
- Efficiency
- Consistency

Unbiased
If the expected value of the sample statistic is equal to the population parameter being estimated, the sample statistic is said to be an unbiased estimator of the population parameter.

Efficiency
Given the choice of two unbiased estimators of the same population parameter, we would prefer to use the point estimator with the smaller standard deviation, since it tends to provide estimates closer to the population parameter.
The point estimator with the smaller standard deviation is said to have greater relative efficiency than the other.

Consistency
A point estimator is consistent if the values of the point estimator tend to become closer to the population parameter as the sample size becomes larger.
In other words, a large sample size tends to provide a better point estimate than a small sample size.
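
Relative efficiency can be illustrated by simulation. The sketch below compares the sample mean and the sample median as estimators of the center of a normal population; the population (mean 100, standard deviation 15), the sample size, and the seed are all hypothetical choices made for the demonstration.

```python
import random
import statistics

def estimator_summary(estimator, n=25, replications=2000, seed=7):
    """Apply `estimator` to many simulated samples and return the
    average estimate and the standard deviation of the estimates."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(100, 15) for _ in range(n)])
        for _ in range(replications)
    ]
    return statistics.fmean(estimates), statistics.stdev(estimates)

# The same seed is used for both calls, so both estimators see the same samples.
mean_center, mean_sd = estimator_summary(statistics.fmean)
median_center, median_sd = estimator_summary(statistics.median)
print(round(mean_center, 2), round(mean_sd, 2))
print(round(median_center, 2), round(median_sd, 2))
```

Both estimators center near 100, consistent with unbiasedness for this symmetric population, but the sample mean shows the smaller standard deviation and is therefore the more efficient of the two here.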

- Stratified Random Sampling
- Cluster Sampling
- Systematic Sampling
- Convenience Sampling
- Judgment Sampling

## Stratified Random Sampling

The population is first divided into groups of elements called strata.
Each element in the population belongs to one and only one stratum.
Best results are obtained when the elements within each stratum are as much alike as possible (i.e. a homogeneous group).

A simple random sample is taken from each stratum.
Formulas are available for combining the stratum sample results into one population parameter estimate.
If strata are homogeneous, this method is as precise as simple random sampling but with a smaller total sample size.
Example: The basis for forming the strata might be department, location, age, industry type, and so on.
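
One common way to combine stratum results is to weight each stratum's sample mean by the stratum's share of the population. The slides do not state which formula they have in mind, so the sketch below is an assumption; the strata, their values, and the per-stratum sample sizes are made-up illustration data.

```python
import random
import statistics

def stratified_mean_estimate(strata, sample_sizes, seed=0):
    """Estimate the population mean as sum((N_h / N) * sample mean of stratum h).

    strata:       dict mapping stratum name -> list of values (the whole stratum)
    sample_sizes: dict mapping stratum name -> number of elements to sample
    """
    rng = random.Random(seed)
    total = sum(len(values) for values in strata.values())
    estimate = 0.0
    for name, values in strata.items():
        sample = rng.sample(values, sample_sizes[name])  # simple random sample
        estimate += len(values) / total * statistics.fmean(sample)
    return estimate

# Hypothetical strata formed by department; values within each are similar.
strata = {
    "sales": [50, 51, 52, 49, 50, 51],
    "engineering": [80, 81, 79, 80, 82, 78, 80, 81],
    "support": [30, 31, 29, 30],
}
sample_sizes = {"sales": 3, "engineering": 4, "support": 2}

estimate = stratified_mean_estimate(strata, sample_sizes)
true_mean = statistics.fmean(v for values in strata.values() for v in values)
print(round(estimate, 2), round(true_mean, 2))
```

Because each stratum is homogeneous, a sample of only 9 of the 18 elements gives an estimate close to the true mean.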

## Cluster Sampling

The population is first divided into separate groups of elements called clusters.
Ideally, each cluster is a representative small-scale version of the population (i.e. heterogeneous group).
A simple random sample of the clusters is then taken.
All elements within each sampled (chosen) cluster form the sample.
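
The selection procedure can be sketched as follows. The cluster names and contents are hypothetical illustration data, and the function name is an assumption.

```python
import random

def cluster_sample(clusters, num_clusters, seed=0):
    """Pick `num_clusters` clusters at random and keep every element in them.

    clusters: dict mapping cluster name -> list of elements
    Returns (chosen cluster names, combined sample).
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # random sample of clusters
    sample = [element for name in chosen for element in clusters[name]]
    return chosen, sample

# Hypothetical city blocks, each listing its households.
blocks = {
    "block_A": ["a1", "a2", "a3"],
    "block_B": ["b1", "b2"],
    "block_C": ["c1", "c2", "c3", "c4"],
    "block_D": ["d1", "d2", "d3"],
}
chosen, sample = cluster_sample(blocks, num_clusters=2)
print(chosen, sample)
```

Note that randomness enters only at the cluster level; once a cluster is chosen, all of its elements are included.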

Example: A primary application is area sampling, where clusters are city blocks or other well-defined areas.
The close proximity of elements can be cost effective (i.e. many sample observations can be obtained in a short time).
This method generally requires a larger total sample size than simple or stratified random sampling.

## Systematic Sampling

If a sample size of n is desired from a population containing N elements, we might sample one element for every N/n elements in the population.
We randomly select one of the first N/n elements from the population list.
We then select every (N/n)th element that follows in the population list.
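
A minimal sketch of the procedure, taking the sampling interval to be N/n (population size divided by sample size) and assuming n divides N evenly. The population of 5,000 numbered elements and the sample size of 50 are hypothetical.

```python
import random

def systematic_sample(population, n, seed=0):
    """Select n elements by taking every (N/n)th element after a random start."""
    rng = random.Random(seed)
    step = len(population) // n    # sampling interval N/n (assumes n divides N)
    start = rng.randrange(step)    # random position among the first N/n elements
    return population[start::step][:n]

population = list(range(1, 5001))  # hypothetical list of N = 5000 elements
sample = systematic_sample(population, n=50)
print(sample[:5])
```

Only the starting point is random; every later selection is determined by the interval, which is what makes the sample easy to identify from a list.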

This method has the properties of a simple random sample, especially if the list of the population elements is a random ordering.
The sample usually will be easier to identify than it would be if simple random sampling were used.
Example: Selecting every 100th listing in a telephone book after the first randomly selected listing.

## Convenience Sampling

It is a nonprobability sampling technique. Items are included in the sample without known probabilities of being selected.
The sample is identified primarily by convenience.
Example: A professor conducting research might use student volunteers to constitute a sample.

Sample selection and data collection are relatively easy.
It is impossible to determine how representative of the population the sample is.

## Judgment Sampling

The person most knowledgeable on the subject of the study selects elements of the population that he or she feels are most representative of the population.
It is a nonprobability sampling technique.
Example: A reporter might sample three or four senators, judging them as reflecting the general opinion of the senate.

It is a relatively easy way of selecting a sample.
The quality of the sample results depends on the judgment of the person selecting the sample.

## Recommendation

It is recommended that probability sampling methods (simple random, stratified, cluster, or systematic) be used.
For these methods, formulas are available for evaluating the goodness of the sample results in terms of the closeness of the results to the population parameters being estimated.
An evaluation of the goodness cannot be