
Lesson 2: Sampling and Sampling Distributions

LEARNING GOAL
Define the concept of sampling error
Determine the mean and standard deviation for the sampling
distribution of the sample mean, $\bar{x}$
Determine the mean and standard deviation for the sampling
distribution of the sample proportion, $\hat{p}$
Describe the Central Limit Theorem and its importance
Apply sampling distributions for both $\bar{x}$ and $\hat{p}$
Describe sampling distributions of sample variances
Copyright 2009 Pearson Education, Inc.

Tools of Business Statistics


Descriptive statistics
Collecting, presenting, and describing data

Inferential statistics
Drawing conclusions and/or making decisions
concerning a population based only on
sample data

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.


Populations and Samples


A Population is the set of all items or individuals
of interest

Examples:

All likely voters in the next election


All parts produced today
All sales receipts for November

A Sample is a subset of the population

Examples:

1000 voters selected at random for interview


A few parts selected for destructive testing
Random receipts selected for audit


Simple Random Samples


Every object in the population has an equal chance of
being selected
Objects are selected independently
Samples can be obtained from a table of random
numbers or computer random number generators

A simple random sample is the ideal against which
other sample methods are compared

Inferential Statistics
Making statements about a population by
examining sample results
Sample statistics (known)
Population parameters (unknown, but can be estimated from sample evidence)
[Figure: inference drawn from the sample to the population]

Sampling Error
Sample statistics are used to estimate population parameters
Example: $\bar{x}$ is an estimate of the population mean, $\mu$
Problems:
Different samples provide different estimates of the population parameter
Sample results have potential variability, thus sampling error exists
Recall: With a random sample the goal is to gather a
representative group from the population
Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall


Calculating Sampling Error


Sampling Error:
The difference between a value (a statistic)
computed from a sample and the corresponding
value (a parameter) computed from a population
Example: (for the mean)

Sampling Error = $\bar{x} - \mu$
where:
$\bar{x}$ = sample mean
$\mu$ = population mean

Always present just because you sample!

Review
Population mean:
$\mu = \frac{\sum x_i}{N}$
Population mean does NOT vary
See earlier

Sample mean:
$\bar{x} = \frac{\sum x_i}{n}$
Sample mean can vary when different samples are collected from the population

where:
$\mu$ = Population mean
$\bar{x}$ = sample mean
$x_i$ = Values in the population or sample
N = Population size
n = sample size

Example
If the population mean is $\mu = 98.6$ degrees
and a sample of n = 5 temperatures yields a
sample mean of $\bar{x} = 99.2$ degrees, then the
sampling error is

$\bar{x} - \mu = 99.2 - 98.6 = 0.6$ degrees
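The calculation above can be sketched in a few lines of Python. The five temperature readings are hypothetical values chosen only so that their mean is 99.2 degrees; the function name `sampling_error` is likewise an illustrative choice, not something from the slides.

```python
from statistics import mean


def sampling_error(sample, population_mean):
    """Return the sample mean minus the population mean."""
    return mean(sample) - population_mean


# Hypothetical sample of n = 5 temperatures whose mean is 99.2 degrees.
temperatures = [98.9, 99.5, 99.0, 99.4, 99.2]

error = sampling_error(temperatures, population_mean=98.6)
print(round(error, 1))
```

A different sample of five readings would generally give a different error, which is the point of the slides that follow.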

Sampling Errors
Different samples will yield different sampling
errors
The sampling error may be positive or negative
($\bar{x}$ may be greater than or less than $\mu$)
The size of the error depends on the sample
selected
i.e., a larger sample does not necessarily produce a smaller
error if it is not a representative sample

Sampling Distribution

A sampling distribution is the probability distribution
of the possible values of a statistic
for samples of a given size
selected from a population


Chapter Outline
Sampling Distributions
Sampling Distribution of Sample Mean
Sampling Distribution of Sample Proportion
Sampling Distribution of Sample Variance

The Distribution of Sample Means


The distribution of sample means is the
distribution that results when we find the
means of all possible samples of a given size.
The larger the sample size, the more closely
this distribution approximates a normal
distribution.


The Distribution of Sample Means


In all cases, the mean of the distribution of
sample means equals the population mean.
If only one sample is available, its sample
mean, $\bar{x}$, is the best estimate for the
population mean, $\mu$.

Developing a
Sampling Distribution
Assume there is a population
Population size N=4

Random variable, X,
is age of individuals
Values of X:
18, 20, 22, 24 (years)


Developing a
Sampling Distribution
(continued)

Summary Measures for the Population Distribution:

$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$

$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$

[Figure: uniform distribution, P(x) = .25 for each of x = 18, 20, 22, 24]

Developing a
Sampling Distribution
(continued)

Now consider all possible samples of size n = 2

16 possible samples (sampling with replacement)

16 Sample Means
1st Obs    2nd Observation
           18   20   22   24
18         18   19   20   21
20         19   20   21   22
22         20   21   22   23
24         21   22   23   24

Developing a
Sampling Distribution
(continued)

Sampling Distribution of All Sample Means

16 Sample Means
1st Obs    2nd Observation
           18   20   22   24
18         18   19   20   21
20         19   20   21   22
22         20   21   22   23
24         21   22   23   24

Sample Means Distribution (no longer uniform)
$\bar{X}$       18     19     20     21     22     23     24
$P(\bar{X})$    1/16   2/16   3/16   4/16   3/16   2/16   1/16

Developing a
Sampling Distribution
(continued)

Summary Measures of this Sampling Distribution:

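$E(\bar{X}) = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 21 + \cdots + 24}{16} = 21 = \mu$

$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu)^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$

(here N = 16 is the number of possible samples)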
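The summary measures above can be checked by brute force. The following is a minimal sketch that enumerates all 16 ordered samples of size 2 (with replacement) from the population 18, 20, 22, 24 and compares the standard deviation of the sample means with $\sigma / \sqrt{n}$. The variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [18, 20, 22, 24]
n = 2

# All ordered samples of size n drawn with replacement (4**2 = 16 samples).
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(population)            # population mean
sigma = pstdev(population)       # population standard deviation (divisor N)

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print(len(samples), mu, round(sigma, 3))
print(mean_of_means, round(sd_of_means, 2), round(sigma / sqrt(n), 2))
```

The mean of the sample means matches the population mean, and their standard deviation matches $\sigma / \sqrt{n}$, which anticipates the standard error formula introduced a few slides later.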

Comparing the Population with its Sampling Distribution

Population: N = 4, $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution: n = 2, $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

[Figure: P(X) over X = 18, 20, 22, 24 beside $P(\bar{X})$ over $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24]

Expected Value of Sample Mean


Let X1, X2, . . . Xn represent a random sample from a
population
The sample mean value of these observations is
defined as

$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

Standard Error of the Mean


Different samples of the same size from the same
population will yield different sample means
A measure of the variability in the mean from sample to
sample is given by the Standard Error of the Mean:

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
Note that the standard error of the mean decreases as
the sample size increases

If the Population is Normal


If a population is normal with mean $\mu$ and
standard deviation $\sigma$, the sampling distribution
of $\bar{X}$ is also normally distributed with
$\mu_{\bar{X}} = \mu$
and
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

Z-value for Sampling Distribution of the Mean

Z-value for the sampling distribution of $\bar{X}$:

$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
where:
$\bar{X}$ = sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
n = sample size

Finite Population Correction


Apply the Finite Population Correction if:
a population member cannot be included more
than once in a sample (sampling is without
replacement), and
the sample is large relative to the population
(n is greater than about 5% of N)
Then

$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$

or

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$

Finite Population Correction


If the sample size n is not small compared to the
population size N , then use

$Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$
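As a rough check of the finite population correction, the sketch below reuses the small population 18, 20, 22, 24 from the earlier slides and enumerates every sample of size 2 drawn without replacement. The helper name `standard_error` and its optional `N` argument are illustrative assumptions, not part of the slides.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev


def standard_error(sigma, n, N=None):
    """Standard error of the mean.

    If a population size N is supplied, the finite population
    correction sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se


population = [18, 20, 22, 24]
sigma = pstdev(population)

# Every sample of size 2 drawn without replacement (6 unordered samples).
means = [mean(s) for s in combinations(population, 2)]

print(round(pstdev(means), 3))                      # spread of the sample means
print(round(standard_error(sigma, 2, N=4), 3))      # corrected formula
print(round(standard_error(sigma, 2), 3))           # uncorrected formula
```

Here n is 50% of N, far above the 5% guideline, so the uncorrected formula noticeably overstates the spread of the sample means.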

Sampling Distribution Properties

$\mu_{\bar{x}} = \mu$ (i.e., $\bar{x}$ is unbiased)

[Figure: normal population distribution above a normal sampling distribution (has the same mean)]

Sampling Distribution Properties


(continued)

For sampling with replacement:
As n increases, $\sigma_{\bar{x}}$ decreases

[Figure: sampling distributions of $\bar{x}$ for a larger sample size (narrower curve) and a smaller sample size (wider curve)]

Suppose we roll one die 1,000 times and record the outcome
of each roll, which can be the number 1, 2, 3, 4, 5, or 6.
Figure 5.23 shows a histogram of
outcomes. All six outcomes have
roughly the same relative
frequency, because the die is
equally likely to land in each of the
six possible ways. That is, the
histogram shows a (nearly) uniform
distribution (see Section 4.2).
It turns out that the distribution in
Figure 5.23 has a mean of 3.41 and
a standard deviation of 1.73.

Figure 5.23 Frequency and relative frequency distribution
of outcomes from rolling one die 1,000 times.

Now suppose we roll two dice 1,000 times and record the mean
of the two numbers that appear on each roll. To find the mean
for a single roll, we add the two numbers and divide by 2.
Figure 5.25a shows a typical result.
The most common values in this
distribution are the central values
3.0, 3.5, and 4.0. These values are
common because they can occur in
several ways.
The mean and standard deviation
for this distribution are 3.43 and
1.21, respectively.
Figure 5.25a Frequency and relative
frequency distribution of sample means
from rolling two dice 1,000 times.

Suppose we roll five dice 1,000 times and record the mean of the
five numbers on each roll. A histogram for this experiment is
shown in Figure 5.25b.
Once again we see that the central values around 3.5 occur most
frequently, but the spread of the distribution is narrower than in the
two previous cases.
The mean and standard deviation are 3.46 and 0.74, respectively.

Figure 5.25b Frequency and relative frequency distribution
of sample means from rolling five dice 1,000 times.

If we further increase the number of dice to ten on each
of 1,000 rolls, we find the histogram in Figure 5.25c,
which is even narrower.
In this case, the mean is 3.49 and standard deviation is 0.56.

Figure 5.25c Frequency and relative frequency distribution of
sample means from rolling ten dice 1,000 times.

Table 5.2 shows that as the sample size increases, the mean of the
distribution of means approaches the value 3.5 and the standard
deviation becomes smaller (making the distribution narrower).

More important, the distribution looks more and more like a
normal distribution as the sample size increases.
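The dice experiment is easy to imitate in software. The following is a minimal simulation sketch, not the procedure used to produce the figures: the function name, the seed, and the use of Python's `random` module are all assumptions, so the printed numbers will differ slightly from those quoted in the slides.

```python
import random
from statistics import mean, pstdev


def simulate_dice_means(num_dice, num_rolls=1000, seed=0):
    """Roll `num_dice` fair dice `num_rolls` times; return the mean of each roll."""
    rng = random.Random(seed)
    return [
        mean(rng.randint(1, 6) for _ in range(num_dice))
        for _ in range(num_rolls)
    ]


results = {}
for num_dice in (1, 2, 5, 10):
    means = simulate_dice_means(num_dice)
    results[num_dice] = (mean(means), pstdev(means))
    print(num_dice, round(results[num_dice][0], 2), round(results[num_dice][1], 2))
```

For a fair die the theoretical mean is 3.5 and the theoretical standard deviation is $\sqrt{35/12} \approx 1.71$, so the standard deviation of the mean of n dice should be close to $1.71 / \sqrt{n}$, in line with the shrinking spread described above.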

If the Population is not Normal


We can apply the Central Limit Theorem:
Even if the population is not normal,
sample means from the population will be
approximately normal as long as the sample size is
large enough.
Properties of the sampling distribution:
$\mu_{\bar{x}} = \mu$
and
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Central Limit Theorem


As the sample size gets large enough,
the sampling distribution of $\bar{x}$ becomes almost normal
regardless of the shape of the population

If the Population is not Normal


(continued)

Sampling distribution properties:
Central Tendency: $\mu_{\bar{x}} = \mu$
Variation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

[Figure: population distribution above the sampling distribution of $\bar{x}$ (becomes normal as n increases); the curve for a larger sample size is narrower than the curve for a smaller sample size]

How Large is Large Enough?


For most distributions, n > 25 will give a
sampling distribution that is nearly normal
For normal population distributions, the
sampling distribution of the mean is always
normally distributed


Example
Suppose a population has mean $\mu = 8$ and
standard deviation $\sigma = 3$. Suppose a random
sample of size n = 36 is selected.
What is the probability that the sample mean is
between 7.75 and 8.25?

Example
(continued)

Solution:
Even if the population is not normally distributed,
the central limit theorem can be used (n > 25)
so the sampling distribution of $\bar{x}$ is
approximately normal
with mean $\mu_{\bar{x}} = 8$
and standard deviation
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$

Example
(continued)

Solution (continued):

$P(7.75 < \bar{X} < 8.25) = P\left(\frac{7.75 - 8}{3 / \sqrt{36}} < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < \frac{8.25 - 8}{3 / \sqrt{36}}\right) = P(-0.5 < Z < 0.5) = 0.3830$

[Figure: population distribution (shape unknown), a sample drawn from it, the sampling distribution of $\bar{X}$ centered at 8, and the standard normal distribution after standardizing, with area .1915 + .1915 between z = -0.5 and z = 0.5]
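The table lookup can be cross-checked with the standard library. This is a small sketch using `statistics.NormalDist` (available in Python 3.8 and later); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Sampling distribution of the mean: normal with mean mu and
# standard deviation sigma / sqrt(n), by the Central Limit Theorem.
sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

probability = sampling_dist.cdf(8.25) - sampling_dist.cdf(7.75)
print(round(probability, 4))
```

The result agrees with the table-based value of 0.3830 to about three decimal places; the small difference comes from rounding in the printed table.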

EXAMPLE Sampling Farms


Texas has roughly 225,000 farms, more than any other state in the
United States. The actual mean farm size is $\mu = 582$ acres and the
standard deviation is $\sigma = 150$ acres. For random samples of n = 100
farms, find the mean and standard deviation of the distribution of
sample means. What is the probability of selecting a random sample
of 100 farms with a mean greater than 600 acres?
Solution: Because the distribution of sample means is a
normal distribution, its mean should be the same as the mean of
the entire population, which is 582 acres.
The standard deviation of the sampling distribution is
$\sigma / \sqrt{n} = 150 / \sqrt{100} = 15$.

EXAMPLE Sampling Farms


Solution: (cont.)
A sample mean of $\bar{x} = 600$ acres therefore has a standard score of

$z = \frac{\text{sample mean} - \text{pop. mean}}{\text{standard deviation}} = \frac{600 - 582}{15} = 1.2$

According to Table 5.1, this standard score is in the 88th
percentile, so the probability of selecting a sample with a mean
less than 600 acres is about 0.88.
Thus, the probability of selecting a sample with a mean greater
than 600 acres is about 0.12.
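The same answer can be reproduced without a percentile table. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 582, 150, 100

standard_error = sigma / sqrt(n)       # 150 / 10 = 15 acres
z = (600 - mu) / standard_error        # standard score of a 600-acre sample mean

p_less = NormalDist().cdf(z)           # P(sample mean < 600)
p_greater = 1 - p_less                 # P(sample mean > 600)
print(round(z, 2), round(p_less, 2), round(p_greater, 2))
```

The unrounded upper-tail probability is slightly below 0.12, which is consistent with the "about 0.12" obtained from the percentile table.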

Acceptance Intervals
Goal: determine a range within which sample means are
likely to occur, given a population mean and variance
By the Central Limit Theorem, we know that the distribution of $\bar{X}$
is approximately normal if n is large enough, with mean $\mu$ and
standard deviation $\sigma_{\bar{X}}$
Let $z_{\alpha/2}$ be the z-value that leaves area $\alpha/2$ in the upper tail of the
normal distribution (i.e., the interval $-z_{\alpha/2}$ to $z_{\alpha/2}$ encloses
probability $1 - \alpha$)
Then

$\mu \pm z_{\alpha/2} \, \sigma_{\bar{X}}$

is the interval that includes $\bar{X}$ with probability $1 - \alpha$
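An acceptance interval can be computed directly from this definition. The sketch below is illustrative: the function name is an assumption, and the numbers reuse the earlier example ($\mu = 8$, $\sigma = 3$, n = 36) with $\alpha = 0.05$, a choice made here only for demonstration.

```python
from math import sqrt
from statistics import NormalDist


def acceptance_interval(mu, sigma, n, alpha):
    """Return (low, high) bounds that contain the sample mean
    with probability 1 - alpha, assuming approximate normality."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z-value with area alpha/2 above it
    half_width = z * sigma / sqrt(n)
    return mu - half_width, mu + half_width


low, high = acceptance_interval(mu=8, sigma=3, n=36, alpha=0.05)
print(round(low, 2), round(high, 2))
```

With these inputs, $z_{\alpha/2}$ is about 1.96 and the standard error is 0.5, so the interval extends roughly 0.98 on either side of the mean.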

The Distribution of Sample Proportions


The distribution of sample proportions is the
distribution that results when we find the
proportions ($\hat{p}$) in all possible samples of a given
size.
The larger the sample size, the more closely this
distribution approximates a normal distribution.
In all cases, the mean of the distribution of
sample proportions equals the population
proportion.
If only one sample is available, its sample
proportion, $\hat{p}$, is the best estimate for the
population proportion, p.

Population Proportions, P
P = the proportion of the population having
some characteristic
Sample proportion ($\hat{P}$) provides an estimate
of P:

$\hat{P} = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$

$0 \le \hat{P} \le 1$

$\hat{P}$ has a binomial distribution, but can be approximated
by a normal distribution when $nP(1 - P) > 9$

Sampling Distribution of $\hat{P}$
Normal approximation:
[Figure: sampling distribution of $\hat{P}$, with $P(\hat{P})$ on the vertical axis and $\hat{P}$ from 0 to 1 on the horizontal axis]

Properties:

$E(\hat{P}) = P$

and

$\sigma^2_{\hat{P}} = \mathrm{Var}\left(\frac{X}{n}\right) = \frac{P(1 - P)}{n}$

(where P = population proportion)

Z-Value for Proportions


Standardize $\hat{P}$ to a Z value with the formula:

$Z = \frac{\hat{P} - P}{\sigma_{\hat{P}}} = \frac{\hat{P} - P}{\sqrt{\frac{P(1 - P)}{n}}}$

Example
If the true proportion of voters who support
Proposition A is P = .4, what is the probability
that a sample of size 200 yields a sample
proportion between .40 and .45?
i.e.: if P = .4 and n = 200, what is
$P(.40 \le \hat{P} \le .45)$?

Example
(continued)

if P = .4 and n = 200, what is
$P(.40 \le \hat{P} \le .45)$?

Find $\sigma_{\hat{P}}$:
$\sigma_{\hat{P}} = \sqrt{\frac{P(1 - P)}{n}} = \sqrt{\frac{.4(1 - .4)}{200}} = .03464$

Convert to standard normal:
$P(.40 \le \hat{P} \le .45) = P\left(\frac{.40 - .40}{.03464} \le Z \le \frac{.45 - .40}{.03464}\right) = P(0 \le Z \le 1.44)$

Example
(continued)

if P = .4 and n = 200, what is
$P(.40 \le \hat{P} \le .45)$?

Use standard normal table:
$P(0 \le Z \le 1.44) = .4251$

[Figure: sampling distribution with the area between .40 and .45 shaded, standardized to the standard normal distribution with area .4251 between 0 and 1.44]
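The proportion example can be verified numerically as well. This sketch uses `statistics.NormalDist` and the normal approximation described above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

P, n = 0.4, 200

# Rule of thumb from the slides: the normal approximation is
# reasonable when n * P * (1 - P) > 9.
approximation_ok = n * P * (1 - P) > 9

sigma_p = sqrt(P * (1 - P) / n)        # standard deviation of the sample proportion
z_low = (0.40 - P) / sigma_p
z_high = (0.45 - P) / sigma_p

standard_normal = NormalDist()
probability = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)
print(approximation_ok, round(sigma_p, 5), round(z_high, 2), round(probability, 4))
```

Because the software does not round z to 1.44 before looking up the area, its answer differs from the table value of .4251 in the fourth decimal place.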

Sample Variance
Let x1, x2, . . . , xn be a random sample from a
population. The sample variance is
$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

the square root of the sample variance is called
the sample standard deviation
the sample variance is different for different
random samples from the same population

Sampling Distribution of
Sample Variances

The sampling distribution of $s^2$ has mean $\sigma^2$

$E(s^2) = \sigma^2$

If the population distribution is normal, then

$\mathrm{Var}(s^2) = \frac{2\sigma^4}{n - 1}$

If the population distribution is normal, then

$\frac{(n - 1)s^2}{\sigma^2}$

has a $\chi^2$ distribution with n - 1 degrees of freedom
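The first property, $E(s^2) = \sigma^2$, can be illustrated by enumeration. The sketch below reuses the small population 18, 20, 22, 24 from the earlier slides and averages $s^2$ over all 16 samples of size 2 drawn with replacement. That population is not normal, so this checks only the unbiasedness result; the variance formula and the chi-square result above require a normal population.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [18, 20, 22, 24]

# Population variance (divisor N).
sigma_squared = pvariance(population)

# Sample variance (divisor n - 1) for every ordered sample of size 2
# drawn with replacement.
sample_variances = [variance(s) for s in product(population, repeat=2)]

expected_s2 = mean(sample_variances)
print(sigma_squared, expected_s2)
```

Individual samples give values of $s^2$ ranging from 0 to 18, but their average equals the population variance.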

The Chi-square Distribution


The chi-square distribution is a family of distributions,
depending on degrees of freedom:
d.f. = n - 1

[Figure: chi-square density curves for d.f. = 1, d.f. = 5, and d.f. = 15, each plotted over the range 0 to 28]

Text Table 7 contains chi-square probabilities

Degrees of Freedom (df)


Idea: Number of observations that are free to vary
after sample mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0
Let X1 = 7
Let X2 = 8
What is X3?

If the mean of these three values is 8.0,
then X3 must be 9
(i.e., X3 is not free to vary)

Here, n = 3, so degrees of freedom = n - 1 = 3 - 1 = 2
(2 values can be any numbers, but the third is not free to vary
for a given mean)

Chi-square Example
A commercial freezer must hold a selected
temperature with little variation. Specifications call
for a standard deviation of no more than 4 degrees
(a variance of 16 degrees$^2$).

A sample of 14 freezers is to be tested
What is the upper limit (K) for the
sample variance such that the
probability of exceeding this limit,
given that the population standard
deviation is 4, is less than 0.05?

Finding the Chi-square Value


$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$

is chi-square distributed with (n - 1) = 13 degrees of freedom

Use the chi-square distribution with area 0.05 in the upper tail:

$\chi^2_{13} = 22.36$ ($\alpha = .05$ and 14 - 1 = 13 d.f.)

[Figure: chi-square density with upper-tail probability $\alpha = .05$ to the right of $\chi^2_{13} = 22.36$]

Chi-square Example
(continued)

$\chi^2_{13} = 22.36$ ($\alpha = .05$ and 14 - 1 = 13 d.f.)

So:

$P(s^2 > K) = P\left(\frac{(n - 1)s^2}{16} > \chi^2_{13}\right) = 0.05$

or

$\frac{(n - 1)K}{16} = 22.36$ (where n = 14)

so

$K = \frac{(22.36)(16)}{(14 - 1)} = 27.52$

If $s^2$ from the sample of size n = 14 is greater than 27.52, there is
strong evidence to suggest the population variance exceeds 16.
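The limit K can be sanity-checked by simulation using only the standard library. In the sketch below, the critical value 22.36 is taken from the example rather than computed, and the population mean of 0, the seed, and the number of trials are arbitrary assumptions (the mean does not affect the sample variance).

```python
import random

CHI2_13_UPPER_05 = 22.36      # tabulated critical value quoted in the example
sigma, n = 4, 14

# Upper limit for the sample variance, from (n - 1) * K / sigma^2 = 22.36.
K = CHI2_13_UPPER_05 * sigma ** 2 / (n - 1)

rng = random.Random(0)        # arbitrary seed, for repeatability
trials = 20000
exceed = 0
for _ in range(trials):
    # Hypothetical freezer temperatures: normal with standard deviation 4.
    sample = [rng.gauss(0, sigma) for _ in range(n)]
    sample_mean = sum(sample) / n
    s2 = sum((x - sample_mean) ** 2 for x in sample) / (n - 1)
    if s2 > K:
        exceed += 1

fraction_exceeding = exceed / trials
print(round(K, 2), round(fraction_exceeding, 3))
```

When the population standard deviation really is 4, about 5% of the simulated sample variances should exceed K, which is the probability the limit was designed to give.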
