
# Tests of Hypothesis

Hypothesis Tests
• Proportion
• Two independent samples
• Paired t-test
• Variances
• χ² test

## Hypothesis Tests for Proportions

• Involves categorical values
• Two possible outcomes
– “Success” (possesses a certain characteristic)
– “Failure” (does not possess that characteristic)
• The fraction or proportion of the population in the “success” category is denoted by p

## Proportions (continued)

• The sample proportion in the success category is denoted by $p_s$:

$$p_s = \frac{X}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$$

• When both $np$ and $n(1-p)$ are at least 5, $p_s$ can be approximated by a normal distribution with mean and standard deviation

$$\mu_{p_s} = p, \qquad \sigma_{p_s} = \sqrt{\frac{p(1-p)}{n}}$$

## Test Statistic for Proportions

• The sampling distribution of $p_s$ is approximately normal, so the test statistic is a Z value:

$$Z = \frac{p_s - p}{\sqrt{\dfrac{p(1-p)}{n}}}$$

• This applies when $np \ge 5$ and $n(1-p) \ge 5$.
• The case $np < 5$ or $n(1-p) < 5$ is not discussed in this chapter.

## Z Test for Proportion in Terms of Number of Successes

• An equivalent form to the last slide, but in terms of the number of successes, X:

$$Z = \frac{X - np}{\sqrt{np(1-p)}}$$

• This applies when $X \ge 5$ and $n - X \ge 5$.
• The case $X < 5$ or $n - X < 5$ is not discussed in this chapter.

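
The equivalence of the two forms is not spelled out on the slides. A short derivation, using only $p_s = X/n$ and multiplying numerator and denominator by $n$, might look like this:

```latex
Z = \frac{p_s - p}{\sqrt{\dfrac{p(1-p)}{n}}}
  = \frac{n\left(\dfrac{X}{n} - p\right)}{n\sqrt{\dfrac{p(1-p)}{n}}}
  = \frac{X - np}{\sqrt{n^2 \cdot \dfrac{p(1-p)}{n}}}
  = \frac{X - np}{\sqrt{np(1-p)}}
```

Both forms therefore give the same Z value for the same sample.
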
## Example: Z Test for Proportion

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 was surveyed, with 25 responses. Test at the α = .05 significance level.

Check:
n p = (500)(.08) = 40 ≥ 5
n(1-p) = (500)(.92) = 460 ≥ 5

## Z Test for Proportion: Solution

H0: p = .08
H1: p ≠ .08
α = .05
n = 500, ps = .05

Critical Values: ± 1.96 (area .025 in each tail; reject H0 if Z < -1.96 or Z > 1.96)

Test Statistic:

$$Z = \frac{p_s - p}{\sqrt{\dfrac{p(1-p)}{n}}} = \frac{.05 - .08}{\sqrt{\dfrac{.08(1 - .08)}{500}}} = -2.47$$

Decision: Reject H0 at α = .05, since Z = -2.47 < -1.96.

Conclusion: There is sufficient evidence to reject the company’s claim of 8% response rate.

## p-Value Solution (continued)

Calculate the p-value and compare it to α. (For a two-sided test, the p-value counts the area in both tails.)

$$\text{p-value} = 2 \cdot P(Z \ge 2.47) = 2 \cdot [1 - P(Z \le 2.47)] = 2 \cdot 0.0068 = 0.0136$$

If we pick α = 0.05, we reject H0 since p-value = .0136 < α = .05.

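
As a cross-check of the arithmetic, the Z statistic and two-sided p-value can be sketched in Python with the standard library. The function name `proportion_z_test` and its parameters are illustrative, not from the slides. Because the code does not round Z to 2.47 before looking up the tail area, its p-value comes out slightly below the table-based .0136.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Two-sided Z test for one proportion against the hypothesized value p0."""
    # The normal approximation needs n*p0 >= 5 and n*(1 - p0) >= 5.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation not appropriate for these inputs")
    p_s = successes / n                   # sample proportion
    std_error = sqrt(p0 * (1 - p0) / n)   # standard deviation of p_s under H0
    z = (p_s - p0) / std_error
    # Two-sided p-value: area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Mailing example: 25 responses out of 500, claimed response rate 8%.
z, p_value = proportion_z_test(successes=25, n=500, p0=0.08)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Do not reject H0")
```

Comparing the p-value with α leads to the same decision as comparing Z with the critical values ± 1.96.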
## Two Sample Tests

| Test | Example |
|---|---|
| Population means, independent samples | Group 1 vs. independent Group 2 |
| Means, related samples | Same group before vs. after treatment |
| Population proportions | Proportion 1 vs. Proportion 2 |
| Population variances | Variance 1 vs. Variance 2 |

## Difference Between Two Means

Population means, independent samples (σ1 and σ2 known, or σ1 and σ2 unknown)

Goal: Test hypotheses or form a confidence interval for the difference between two population means, μ1 – μ2

The point estimate for the difference is

$$\bar{X}_1 - \bar{X}_2$$

## Hypothesis Tests for Two Population Means

Two Population Means, Independent Samples

| Lower tail test | Upper tail test | Two-tailed test |
|---|---|---|
| H0: μ1 ≥ μ2 | H0: μ1 ≤ μ2 | H0: μ1 = μ2 |
| H1: μ1 < μ2 | H1: μ1 > μ2 | H1: μ1 ≠ μ2 |
| i.e., H0: μ1 – μ2 ≥ 0 | i.e., H0: μ1 – μ2 ≤ 0 | i.e., H0: μ1 – μ2 = 0 |
| H1: μ1 – μ2 < 0 | H1: μ1 – μ2 > 0 | H1: μ1 – μ2 ≠ 0 |

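## Hypothesis statement for μ1 – μ2

Two Population Means, Independent Samples

| | Lower tail test | Upper tail test | Two-tailed test |
|---|---|---|---|
| H0 | μ1 – μ2 ≥ 0 | μ1 – μ2 ≤ 0 | μ1 – μ2 = 0 |
| H1 | μ1 – μ2 < 0 | μ1 – μ2 > 0 | μ1 – μ2 ≠ 0 |
| Rejection area | α in the lower tail | α in the upper tail | α/2 in each tail |
| Critical value(s) | $-z_\alpha$ | $z_\alpha$ | $-z_{\alpha/2}$ and $z_{\alpha/2}$ |
| Decision rule | Reject H0 if $Z < -z_\alpha$ | Reject H0 if $Z > z_\alpha$ | Reject H0 if $Z < -z_{\alpha/2}$ or $Z > z_{\alpha/2}$ |
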
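
The three decision rules can be sketched as a single Python helper. This is an illustration under assumptions: the function name `reject_h0` and the tail labels `"lower"`, `"upper"`, and `"two"` are made up for the example, and critical values come from the standard normal distribution in Python's `statistics` module instead of a printed Z table.

```python
from statistics import NormalDist


def reject_h0(z, alpha, tail):
    """Return True if the Z statistic falls in the rejection region.

    tail is "lower", "upper", or "two" (illustrative labels, not from the slides).
    """
    std_normal = NormalDist()
    if tail == "lower":
        return z < std_normal.inv_cdf(alpha)          # Z < -z_alpha
    if tail == "upper":
        return z > std_normal.inv_cdf(1 - alpha)      # Z > z_alpha
    if tail == "two":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)    # z_{alpha/2}
        return z < -z_crit or z > z_crit
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


# A statistic of 1.7 is significant for an upper tail test at alpha = .05,
# but not for a two-tailed test at the same alpha.
print(reject_h0(1.7, 0.05, "upper"), reject_h0(1.7, 0.05, "two"))
```

The same statistic can lead to different decisions depending on the tail, because a two-tailed test splits α between both tails.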
## Independent Samples

Population means, independent samples (σ1 and σ2 known, or σ1 and σ2 unknown)

• Different data sources
– Unrelated
– Independent
• Sample selected from one population has no effect on the sample selected from the other population
• Use the difference between 2 sample means
• Use Z test or pooled variance t test

## Difference Between Two Means

Population means, independent samples:

• σ1 and σ2 known: use a Z test statistic
• σ1 and σ2 unknown: use S to estimate the unknown σ, and use a t test statistic with the pooled standard deviation

## σ1 and σ2 Known

Population means, independent samples

Assumptions:
• Samples are randomly and independently drawn
• Population distributions are normal or both sample sizes are ≥ 30
• Population standard deviations are known

## σ1 and σ2 Known (continued)

When σ1 and σ2 are known and both populations are normal or both sample sizes are at least 30, the test statistic is a Z-value, and the standard error of $\bar{X}_1 - \bar{X}_2$ is

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

## σ1 and σ2 Known (continued)

The test statistic for μ1 – μ2 is:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

## Confidence Interval, σ1 and σ2 Known

The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm Z \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

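
The slides give no numeric example for the known-σ case, so the sketch below uses hypothetical inputs chosen only to exercise the formulas. The function names `two_sample_z` and `two_sample_z_interval` are likewise illustrative. Both assume the population standard deviations are known and the normality or sample-size condition holds.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """Z statistic for mu1 - mu2 when both population sigmas are known."""
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / std_error


def two_sample_z_interval(mean1, mean2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 when both population sigmas are known."""
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # Critical Z value leaving (1 - confidence)/2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = mean1 - mean2
    return diff - z_crit * std_error, diff + z_crit * std_error


# Hypothetical inputs (not from the slides).
z = two_sample_z(50.0, 48.0, sigma1=4.0, sigma2=3.0, n1=36, n2=49)
low, high = two_sample_z_interval(50.0, 48.0, sigma1=4.0, sigma2=3.0, n1=36, n2=49)
print(f"Z = {z:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

The interval is centered on the point estimate $\bar{X}_1 - \bar{X}_2$, and its half-width is the critical Z value times the standard error.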
## σ1 and σ2 Unknown

Population means, independent samples

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed
• Population variances are unknown but assumed equal

## σ1 and σ2 Unknown (continued)

Forming interval estimates:
• The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ:

$$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}}$$

• The test statistic is a t value with (n1 + n2 – 2) degrees of freedom

## σ1 and σ2 Unknown (continued)

The test statistic for μ1 – μ2 is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where t has (n1 + n2 – 2) d.f., and

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

## Confidence Interval, σ1 and σ2 Unknown

The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{n_1 + n_2 - 2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$

## Example

You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:

| | NYSE | NASDAQ |
|---|---|---|
| Number | 21 | 25 |
| Sample mean | 3.27 | 2.53 |
| Sample std dev | 1.30 | 1.16 |

Assuming equal variances, is there a difference in average yield (α = 0.05)?

## Solution

H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
α = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154 (area .025 in each tail)

Test Statistic:

$$t = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

Decision: Reject H0 at α = 0.05, since t = 2.040 > 2.0154.

Conclusion: There is evidence of a difference in means.

## Calculating the Test Statistic

The test statistic is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$

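
The NYSE/NASDAQ calculation can be reproduced from the summary statistics with a few lines of Python. This is a sketch: the function name `pooled_t_test` is illustrative, and the critical value 2.0154 is copied from the solution slide, not computed, because the standard library has no t distribution.

```python
from math import sqrt


def pooled_t_test(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic from summary statistics (equal variances assumed)."""
    df = (n1 - 1) + (n2 - 1)
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # S_p^2
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - hypothesized_diff) / std_error
    return t, df, pooled_var, std_error


# NYSE (sample 1) vs. NASDAQ (sample 2) summary statistics from the example.
t, df, pooled_var, std_error = pooled_t_test(3.27, 1.30, 21, 2.53, 1.16, 25)

T_CRIT = 2.0154  # two-tailed critical value for 44 d.f. at alpha = 0.05, as quoted on the slide
reject = abs(t) > T_CRIT

# Matching 95% confidence interval for mu1 - mu2.
diff = 3.27 - 2.53
ci = (diff - T_CRIT * std_error, diff + T_CRIT * std_error)

print(f"S_p^2 = {pooled_var:.4f}, t = {t:.3f}, df = {df}, reject H0: {reject}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The confidence interval excludes 0, which agrees with rejecting H0 in the two-tailed test at the same α.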
## Related Samples

Tests means of 2 related populations
– Paired or matched samples
– Repeated measures (before/after)
– Use difference between paired values:

D = X1 - X2

• Eliminates variation among subjects
• Assumptions:
– Both populations are normally distributed
– Or, if not normal, use large samples

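## Hypothesis Statement for Mean Difference, σD Unknown

Paired Samples

| | Lower tail test | Upper tail test | Two-tailed test |
|---|---|---|---|
| H0 | μD ≥ 0 | μD ≤ 0 | μD = 0 |
| H1 | μD < 0 | μD > 0 | μD ≠ 0 |
| Rejection area | α in the lower tail | α in the upper tail | α/2 in each tail |
| Critical value(s) | $-t_\alpha$ | $t_\alpha$ | $-t_{\alpha/2}$ and $t_{\alpha/2}$ |
| Decision rule | Reject H0 if $t < -t_\alpha$ | Reject H0 if $t > t_\alpha$ | Reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$ |

Where t has n - 1 d.f.
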
## Mean Difference, σD Known

The ith paired difference is $D_i$, where

$$D_i = X_{1i} - X_{2i}$$

The point estimate for the population mean paired difference is $\bar{D}$:

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

Suppose the population standard deviation of the difference scores, σD, is known. Here n is the number of pairs in the paired sample.

## Mean Difference, σD Known (continued)

The test statistic for the mean difference is a Z value:

$$Z = \frac{\bar{D} - \mu_D}{\dfrac{\sigma_D}{\sqrt{n}}}$$

Where
μD = hypothesized mean difference
σD = population standard dev. of differences
n = the sample size (number of pairs)

## Confidence Interval, σD Known

The confidence interval for μD is

$$\bar{D} \pm Z \frac{\sigma_D}{\sqrt{n}}$$

Where n = the sample size (number of pairs in the paired sample)

## Mean Difference, σD Unknown

If σD is unknown, we can estimate the unknown population standard deviation with a sample standard deviation. The sample standard deviation is

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

The test statistic for D is now a t statistic, with n-1 d.f.:

$$t = \frac{\bar{D} - \mu_D}{\dfrac{S_D}{\sqrt{n}}}$$

## Confidence Interval, σD Unknown

The confidence interval for μD is

$$\bar{D} \pm t_{n-1} \frac{S_D}{\sqrt{n}}$$

where

$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$

## Paired Samples Example

• Assume you send your salespeople to a “customer service” training workshop. Is the training effective? You collect the following data:

Number of Complaints:

| Salesperson | Before (1) | After (2) | Difference, Di = (2) - (1) |
|---|---|---|---|
| C.B. | 6 | 4 | -2 |
| T.F. | 20 | 6 | -14 |
| M.H. | 3 | 2 | -1 |
| R.K. | 0 | 0 | 0 |
| M.O. | 4 | 0 | -4 |
| Total | | | -21 |

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$

$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$

## Paired Samples: Solution

• Has the training made a difference in the number of complaints (at the 0.01 level)?

H0: μD = 0
H1: μD ≠ 0
α = .01, $\bar{D}$ = -4.2
d.f. = n - 1 = 4
Critical Value = ± 4.604 (area α/2 in each tail)

Test Statistic:

$$t = \frac{\bar{D} - \mu_D}{\dfrac{S_D}{\sqrt{n}}} = \frac{-4.2 - 0}{\dfrac{5.67}{\sqrt{5}}} = -1.66$$

Decision: Do not reject H0 (t stat is not in the reject region)

Conclusion: There is not a significant change in the number of complaints.

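
The paired calculation can be checked directly from the before/after counts. The sketch below uses only the Python standard library. The critical value 4.604 is taken from the slide, not computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Number of complaints per salesperson (C.B., T.F., M.H., R.K., M.O.).
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Paired differences, defined as After (2) minus Before (1).
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
d_bar = mean(diffs)                  # point estimate of the mean difference
s_d = stdev(diffs)                   # sample standard deviation (n - 1 denominator)
t = (d_bar - 0) / (s_d / sqrt(n))    # hypothesized mean difference is 0

T_CRIT = 4.604  # two-tailed critical value for 4 d.f. at alpha = .01, as quoted on the slide
reject = abs(t) > T_CRIT

print(f"D-bar = {d_bar:.1f}, S_D = {s_d:.2f}, t = {t:.2f}, reject H0: {reject}")
```

Although the mean drop is 4.2 complaints, one large difference (T.F., -14) inflates S_D, and with only 4 degrees of freedom the statistic stays well inside ± 4.604.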
## Hypothesis Tests for Variances

Tests for Two Population Variances (F test statistic)

| Two tailed test | Lower tail test | Upper tail test |
|---|---|---|
| H0: σ1² = σ2² | H0: σ1² ≥ σ2² | H0: σ1² ≤ σ2² |
| H1: σ1² ≠ σ2² | H1: σ1² < σ2² | H1: σ1² > σ2² |

## Hypothesis Tests for Variances (continued)

Tests for Two Population Variances

The F test statistic is:

$$F = \frac{S_1^2}{S_2^2}$$

$S_1^2$ = Variance of Sample 1
n1 - 1 = numerator degrees of freedom
$S_2^2$ = Variance of Sample 2
n2 - 1 = denominator degrees of freedom

## The F Distribution

• The F critical value is found from the F table
• There are two appropriate degrees of freedom: numerator and denominator

$$F = \frac{S_1^2}{S_2^2} \qquad \text{where } df_1 = n_1 - 1;\ df_2 = n_2 - 1$$

• In the F table,
– numerator degrees of freedom determine the column
– denominator degrees of freedom determine the row

## Finding the Rejection Region

| Test | Hypotheses | Rejection region |
|---|---|---|
| Lower tail | H0: σ1² ≥ σ2², H1: σ1² < σ2² | Reject H0 if F < FL (area α in the lower tail) |
| Upper tail | H0: σ1² ≤ σ2², H1: σ1² > σ2² | Reject H0 if F > FU (area α in the upper tail) |
| Two-tailed | H0: σ1² = σ2², H1: σ1² ≠ σ2² | Reject H0 if F > FU or F < FL (area α/2 in each tail) |

In every case the test statistic is $F = S_1^2 / S_2^2$.

## Finding the Rejection Region (continued)

H0: σ1² = σ2²
H1: σ1² ≠ σ2²
(area α/2 in each tail; reject H0 if F < FL or F > FU)

To find the critical F values:
1. Find FU from the F table for n1 – 1 numerator and n2 – 1 denominator degrees of freedom
2. Find FL using the formula:

$$F_L = \frac{1}{F_{U*}}$$

where FU* is from the F table with n2 – 1 numerator and n1 – 1 denominator degrees of freedom (i.e., switch the d.f. from FU)

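
The slides state the reciprocal rule for the lower critical value without justification. A brief sketch of why it holds, relying on the standard fact that the reciprocal of an F variable is again F-distributed with the degrees of freedom switched:

```latex
F \sim F_{n_1 - 1,\; n_2 - 1}
\quad\Longrightarrow\quad
\frac{1}{F} \sim F_{n_2 - 1,\; n_1 - 1}

P(F < F_L) = \frac{\alpha}{2}
\;\Longleftrightarrow\;
P\!\left(\frac{1}{F} > \frac{1}{F_L}\right) = \frac{\alpha}{2}
\;\Longrightarrow\;
\frac{1}{F_L} = F_{U*}
\;\Longrightarrow\;
F_L = \frac{1}{F_{U*}}
```

Here $F_{U*}$ is the upper-tail critical value, with area $\alpha/2$ to its right, of the distribution with $n_2 - 1$ numerator and $n_1 - 1$ denominator degrees of freedom.
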
## F Test: An Example

You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:

| | NYSE | NASDAQ |
|---|---|---|
| Number | 21 | 25 |
| Mean | 3.27 | 2.53 |
| Std dev | 1.30 | 1.16 |

Is there a difference in the variances between the NYSE & NASDAQ at the α = 0.1 level?

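## F Test: Example Solution

• Form the hypothesis test:
H0: σ1² – σ2² = 0 (there is no difference between variances)
H1: σ1² – σ2² ≠ 0 (there is a difference between variances)

• The test statistic is:

$$F = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

• Find the F critical values for α/2 = .05 in each tail, with n = 20 numerator and d = 24 denominator degrees of freedom:

$$F_U = F_{\alpha/2,\, n,\, d} = F_{.05,\, 20,\, 24} = 2.03$$

$$F_L = F_{(1-\alpha/2),\, n,\, d} = \frac{1}{F_{\alpha/2,\, d,\, n}} = \frac{1}{F_{.05,\, 24,\, 20}} = \frac{1}{2.08} = .48$$

• Reject H0 if F < FL = 0.48 or F > FU = 2.03. The test statistic F = 1.256 lies between the two critical values.
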
## F Test: Example Solution (continued)

H0: σ1² = σ2²
H1: σ1² ≠ σ2²

• The test statistic is:

$$F = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

• Critical values (α/2 = .05 in each tail): FL = 0.48, FU = 2.03
• F = 1.256 is not in the rejection region, so we do not reject H0
• Conclusion: There is not sufficient evidence of a difference in variances at α = 0.1

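
The two-tailed F test decision can be sketched in Python as follows. The function name `f_test_two_tailed` and its parameter names are illustrative. The critical values 2.03 and 2.08 are the F-table entries quoted in the example, passed in as inputs because the standard library has no F distribution.

```python
def f_test_two_tailed(s1, s2, f_upper, f_upper_switched):
    """Two-tailed F test for equal variances from sample standard deviations.

    f_upper: table value with (n1 - 1, n2 - 1) degrees of freedom.
    f_upper_switched: table value with the degrees of freedom switched.
    """
    f_stat = s1**2 / s2**2
    f_lower = 1 / f_upper_switched   # lower critical value via the reciprocal rule
    reject = f_stat < f_lower or f_stat > f_upper
    return f_stat, f_lower, reject


# NYSE (s1 = 1.30, n1 = 21) vs. NASDAQ (s2 = 1.16, n2 = 25) at alpha = 0.1.
f_stat, f_lower, reject = f_test_two_tailed(
    1.30, 1.16, f_upper=2.03, f_upper_switched=2.08
)
print(f"F = {f_stat:.3f}, FL = {f_lower:.2f}, FU = 2.03, reject H0: {reject}")
```

Passing the table values in explicitly keeps the sketch free of third-party dependencies; a statistics library with an F distribution could supply them instead.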
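• Form the hypothesis test, now with the NASDAQ variance in the numerator:
H0: σ2² – σ1² = 0 (there is no difference between variances)
H1: σ2² – σ1² ≠ 0 (there is a difference between variances)

• The test statistic is:

$$F = \frac{S_2^2}{S_1^2} = \frac{1.16^2}{1.30^2} = 0.796$$

• Find the F critical values for α/2 = .05 in each tail, with n = 24 numerator and d = 20 denominator degrees of freedom:

$$F_U = F_{\alpha/2,\, n,\, d} = F_{.05,\, 24,\, 20} = 2.08$$

$$F_L = F_{(1-\alpha/2),\, n,\, d} = \frac{1}{F_{\alpha/2,\, d,\, n}} = \frac{1}{F_{.05,\, 20,\, 24}} = \frac{1}{2.03} = .493$$

• Reject H0 if F < FL = 0.493 or F > FU = 2.08. The test statistic F = 0.796 lies between the two critical values.
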
## F Test: Example Solution (continued)

H0: σ2² = σ1²
H1: σ2² ≠ σ1²

• The test statistic is:

$$F = \frac{S_2^2}{S_1^2} = \frac{1.16^2}{1.30^2} = 0.796$$

• Critical values (α/2 = .05 in each tail): FL = 0.493, FU = 2.08
• F = 0.796 is not in the rejection region, so we do not reject H0
• Conclusion: There is not sufficient evidence of a difference in variances at α = 0.1