
Tests of Hypothesis

Hypothesis Tests
• Proportion
• Two independent samples
• Paired t-test
• Variances
• χ² test
Hypothesis Tests for Proportions

• Involves categorical values
• Two possible outcomes
– “Success” (possesses a certain characteristic)
– “Failure” (does not possess that characteristic)
• The fraction or proportion of the population in the “success” category is denoted by p
Proportions
(continued)
• The sample proportion in the success category is denoted by ps:

$$p_s = \frac{X}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$$

• When both np and n(1-p) are at least 5, ps can be approximated by a normal distribution with mean and standard deviation

$$\mu_{p_s} = p, \qquad \sigma_{p_s} = \sqrt{\frac{p(1-p)}{n}}$$
Test Statistic for Proportions

• The sampling distribution of ps is approximately normal, so the test statistic is a Z value:

$$Z = \frac{p_s - p}{\sqrt{\dfrac{p(1-p)}{n}}}$$

• Hypothesis tests for p:
– np ≥ 5 and n(1-p) ≥ 5: use the Z statistic
– np < 5 or n(1-p) < 5: not discussed in this chapter
Z Test for Proportion
in Terms of Number of Successes

• An equivalent form to the previous slide, but in terms of the number of successes, X:

$$Z = \frac{X - np}{\sqrt{np(1-p)}}$$

• Hypothesis tests for X:
– X ≥ 5 and n-X ≥ 5: use the Z statistic
– X < 5 or n-X < 5: not discussed in this chapter
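The equivalence of the two Z forms follows from substituting ps = X/n and multiplying numerator and denominator by n. A short derivation, using only the symbols already defined on these slides:

```latex
\begin{aligned}
Z &= \frac{p_s - p}{\sqrt{\dfrac{p(1-p)}{n}}}
   = \frac{\dfrac{X}{n} - p}{\sqrt{\dfrac{p(1-p)}{n}}} \\
  &= \frac{X - np}{n\sqrt{\dfrac{p(1-p)}{n}}}
   = \frac{X - np}{\sqrt{np(1-p)}}
\end{aligned}
```

The last step uses $n\sqrt{p(1-p)/n} = \sqrt{n^2\,p(1-p)/n} = \sqrt{np(1-p)}$.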
Example: Z Test for Proportion

A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the α = .05 significance level.

Check:
np = (500)(.08) = 40
n(1-p) = (500)(.92) = 460
Z Test for Proportion: Solution
H0: p = .08
H1: p ≠ .08
α = .05
n = 500, ps = .05

Critical Values: ± 1.96 (area .025 in each tail; reject H0 if Z < -1.96 or Z > 1.96)

Test Statistic:

$$Z = \frac{p_s - p}{\sqrt{\dfrac{p(1-p)}{n}}} = \frac{.05 - .08}{\sqrt{\dfrac{.08(1-.08)}{500}}} = -2.47$$

Decision:
Reject H0 at α = .05, since Z = -2.47 < -1.96

Conclusion:
There is sufficient evidence to reject the company’s claim of 8% response rate.
p-Value Solution
(continued)
Calculate the p-value and compare to α
(For a two-sided test the p-value is always two-sided)

p-value = 2 * P(Z ≥ 2.47)
= 2 * [1 - P(Z ≤ 2.47)]
= 2 * 0.0068 = 0.0136

With α = .05, we reject H0 since p-value = .0136 < α = .05
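The calculation for the mailing example can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function names `normal_cdf` and `one_proportion_z_test` are illustrative and not part of the slides.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with math.erfc to avoid extra dependencies."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def one_proportion_z_test(successes: int, n: int, p0: float):
    """Return (z, two-sided p-value) for H0: p = p0 via the normal approximation."""
    # The slides require np >= 5 and n(1-p) >= 5 before using the Z statistic.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation needs np >= 5 and n(1-p) >= 5")
    p_s = successes / n                        # sample proportion
    std_error = math.sqrt(p0 * (1 - p0) / n)   # sigma of p_s under H0
    z = (p_s - p0) / std_error
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value


# Mailing example from the slides: 25 responses out of 500, claimed rate 8%.
z, p_value = one_proportion_z_test(successes=25, n=500, p0=0.08)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The z value rounds to -2.47. The p-value comes out close to, but not exactly, the .0136 on the slide, because the slide rounds Z to two decimals before looking up the tail area.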
Two Sample Tests

• Population means, independent samples. Example: Group 1 vs. independent Group 2
• Means, related samples. Example: same group before vs. after treatment
• Population proportions. Example: Proportion 1 vs. Proportion 2
• Population variances. Example: Variance 1 vs. Variance 2
Difference Between Two Means

Population means, independent samples:
– σ1 and σ2 known
– σ1 and σ2 unknown

Goal: Test hypotheses or form a confidence interval for the difference between two population means, μ1 – μ2

The point estimate for the difference is X̄1 – X̄2
Hypothesis Tests for
Two Population Means
Two Population Means, Independent Samples

Lower tail test:
H0: μ1 ≥ μ2, i.e., H0: μ1 – μ2 ≥ 0
H1: μ1 < μ2, i.e., H1: μ1 – μ2 < 0

Upper tail test:
H0: μ1 ≤ μ2, i.e., H0: μ1 – μ2 ≤ 0
H1: μ1 > μ2, i.e., H1: μ1 – μ2 > 0

Two-tailed test:
H0: μ1 = μ2, i.e., H0: μ1 – μ2 = 0
H1: μ1 ≠ μ2, i.e., H1: μ1 – μ2 ≠ 0
Hypothesis statement for μ1 – μ2
Two Population Means, Independent Samples

Lower tail test:
H0: μ1 – μ2 ≥ 0
H1: μ1 – μ2 < 0
Area α in the lower tail. Reject H0 if Z < -Zα

Upper tail test:
H0: μ1 – μ2 ≤ 0
H1: μ1 – μ2 > 0
Area α in the upper tail. Reject H0 if Z > Zα

Two-tailed test:
H0: μ1 – μ2 = 0
H1: μ1 – μ2 ≠ 0
Area α/2 in each tail. Reject H0 if Z < -Zα/2 or Z > Zα/2
Independent Samples

• Different data sources
– Unrelated
– Independent
• Sample selected from one population has no effect on the sample selected from the other population
• Use the difference between 2 sample means
• Use Z test or pooled variance t test
Difference Between Two Means

Population means, independent samples:
– σ1 and σ2 known: use a Z test statistic
– σ1 and σ2 unknown: use S to estimate unknown σ, use a t test statistic and pooled standard deviation
σ1 and σ2 Known

Assumptions:
• Samples are randomly and independently drawn
• Population distributions are normal or both sample sizes are ≥ 30
• Population standard deviations are known
σ1 and σ2 Known
(continued)

When σ1 and σ2 are known and both populations are normal or both sample sizes are at least 30, the test statistic is a Z-value, and the standard error of X̄1 – X̄2 is

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
σ1 and σ2 Known
(continued)

The test statistic for μ1 – μ2 is:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Confidence Interval,
σ1 and σ2 Known
The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm Z \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
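The Z statistic and confidence interval for known σ1 and σ2 could be sketched as follows. The slides give no worked example for this case, so the function name `two_sample_z` and every number passed to it are hypothetical, chosen only to exercise the formulas. The critical value is taken from `statistics.NormalDist` (Python 3.8+).

```python
import math
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2,
                 hypothesized_diff=0.0, confidence=0.95):
    """Z statistic and confidence interval for mu1 - mu2 with known sigmas."""
    # Standard error of the difference between the two sample means.
    std_error = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = mean1 - mean2
    z = (diff - hypothesized_diff) / std_error
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    interval = (diff - z_crit * std_error, diff + z_crit * std_error)
    return z, interval


# Hypothetical inputs (not from the slides).
z, ci = two_sample_z(mean1=50.0, mean2=48.0, sigma1=4.0, sigma2=3.0, n1=40, n2=36)
print(f"Z = {z:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

For a two-tailed test at α = .05, H0: μ1 – μ2 = 0 would be rejected when the interval excludes 0, which matches the rule |Z| > 1.96.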
σ1 and σ2 Unknown

Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed
• Population variances are unknown but assumed equal
σ1 and σ2 Unknown
(continued)

Forming interval estimates:
• The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ:

$$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}}$$

• The test statistic is a t value with (n1 + n2 – 2) degrees of freedom
σ1 and σ2 Unknown
(continued)

The test statistic for μ1 – μ2 is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where t has (n1 + n2 – 2) d.f., and

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$
Confidence Interval,
σ1 and σ2 Unknown
The confidence interval for μ1 – μ2 is:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{n_1 + n_2 - 2}\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$
Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE & NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming equal variances, is there a difference in average yield (α = 0.05)?
Solution
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
α = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154 (area .025 in each tail)

Test Statistic:

$$t = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

Decision:
Reject H0 at α = 0.05, since t = 2.040 > 2.0154

Conclusion:
There is evidence of a difference in means.
Calculating the Test Statistic
The test statistic is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$

where

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$
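The pooled-variance arithmetic for the NYSE/NASDAQ example can be reproduced in Python. This is a sketch only: the function name `pooled_t` is illustrative, and the critical value 2.0154 is copied from the slide rather than computed, since the standard library has no t distribution.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic for two independent samples."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - hypothesized_diff) / std_error
    return t, df, pooled_var, std_error


# NYSE (sample 1) vs. NASDAQ (sample 2), values from the example.
t, df, pooled_var, std_error = pooled_t(3.27, 1.30, 21, 2.53, 1.16, 25)

T_CRIT = 2.0154  # two-tailed critical value for alpha = 0.05, df = 44 (from the slide)
reject_h0 = abs(t) > T_CRIT

# Confidence interval for mu1 - mu2 built from the same pieces.
diff = 3.27 - 2.53
ci = (diff - T_CRIT * std_error, diff + T_CRIT * std_error)

print(f"Sp^2 = {pooled_var:.4f}, t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
print(f"95% CI for mu1 - mu2: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The interval lies just above 0, which agrees with the decision to reject H0 at α = 0.05.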
Related Samples
Tests Means of 2 Related Populations
– Paired or matched samples
– Repeated measures (before/after)
– Use difference between paired values:

D = X1 - X2

• Eliminates variation among subjects
• Assumptions:
– Both populations are normally distributed
– Or, if not normal, use large samples
Hypothesis Statement for
Mean Difference, σD Unknown
Paired Samples

Lower tail test:
H0: μD ≥ 0
H1: μD < 0
Area α in the lower tail. Reject H0 if t < -tα

Upper tail test:
H0: μD ≤ 0
H1: μD > 0
Area α in the upper tail. Reject H0 if t > tα

Two-tailed test:
H0: μD = 0
H1: μD ≠ 0
Area α/2 in each tail. Reject H0 if t < -tα/2 or t > tα/2

Where t has n - 1 d.f.
Mean Difference, σD Known
The ith paired difference is Di, where

Di = X1i - X2i

The point estimate for the population mean paired difference is D̄:

$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$

Suppose the population standard deviation of the difference scores, σD, is known.

n is the number of pairs in the paired sample
Mean Difference, σD Known
(continued)
The test statistic for the mean difference is a Z value:

$$Z = \frac{\bar{D} - \mu_D}{\dfrac{\sigma_D}{\sqrt{n}}}$$
Where
μD = hypothesized mean difference
σD = population standard dev. of differences
n = the sample size (number of pairs)
Confidence Interval, σD Known
The confidence interval for μD is

$$\bar{D} \pm Z\frac{\sigma_D}{\sqrt{n}}$$

where n = the sample size (number of pairs in the paired sample)
Mean Difference, σD Unknown
If σD is unknown, we can estimate the unknown population standard deviation with a sample standard deviation.

The sample standard deviation is

$$S_D = \sqrt{\frac{\sum_{i=1}^{n}(D_i - \bar{D})^2}{n - 1}}$$

The test statistic for D is now a t statistic, with n-1 d.f.:

$$t = \frac{\bar{D} - \mu_D}{\dfrac{S_D}{\sqrt{n}}}$$
Confidence Interval, σD Unknown

The confidence interval for μD is

$$\bar{D} \pm t_{n-1}\frac{S_D}{\sqrt{n}}$$

where

$$S_D = \sqrt{\frac{\sum_{i=1}^{n}(D_i - \bar{D})^2}{n - 1}}$$
Paired Samples Example
• Assume you send your salespeople to a “customer service” training workshop. Is the training effective? You collect the following data:

Number of Complaints:

Salesperson    Before (1)    After (2)    Difference, Di = (2) - (1)
C.B.           6             4            -2
T.F.           20            6            -14
M.H.           3             2            -1
R.K.           0             0            0
M.O.           4             0            -4
Total                                     -21

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$

$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$
Paired Samples: Solution
• Has the training made a difference in the number of complaints (at the 0.01 level)?

H0: μD = 0
H1: μD ≠ 0
α = .01, D̄ = -4.2
d.f. = n - 1 = 4
Critical Value = ± 4.604 (area α/2 in each tail)

Test Statistic:

$$t = \frac{\bar{D} - \mu_D}{S_D/\sqrt{n}} = \frac{-4.2 - 0}{5.67/\sqrt{5}} = -1.66$$

Decision: Do not reject H0 (t stat is not in the reject region, since -4.604 < -1.66 < 4.604)

Conclusion: There is not a significant change in the number of complaints.
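The paired-sample numbers for the complaints example can be recomputed directly from the before/after data. A minimal sketch with the standard library; the variable names are illustrative, and the critical value 4.604 is taken from the slide rather than computed.

```python
import math
from statistics import mean, stdev

# Complaints per salesperson (C.B., T.F., M.H., R.K., M.O.) from the example.
before = [6, 20, 3, 0, 4]
after = [4, 6, 2, 0, 0]

# Difference is After (2) minus Before (1), as in the table.
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

d_bar = mean(differences)   # point estimate of the mean difference
s_d = stdev(differences)    # sample standard deviation (n - 1 in the denominator)
t = (d_bar - 0) / (s_d / math.sqrt(n))

T_CRIT = 4.604  # two-tailed critical value for alpha = .01, d.f. = 4 (from the slide)
reject_h0 = abs(t) > T_CRIT

print(f"D-bar = {d_bar:.1f}, S_D = {s_d:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
```

`statistics.stdev` divides by n - 1, which matches the S_D formula used for paired samples here.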
Hypothesis Tests for Variances

Tests for Two Population Variances (F test statistic)

Two tailed test:
H0: σ1² = σ2²
H1: σ1² ≠ σ2²

Lower tail test:
H0: σ1² ≥ σ2²
H1: σ1² < σ2²

Upper tail test:
H0: σ1² ≤ σ2²
H1: σ1² > σ2²
Hypothesis Tests for Variances
(continued)

The F test statistic is:

$$F = \frac{S_1^2}{S_2^2}$$

S1² = Variance of Sample 1
n1 - 1 = numerator degrees of freedom
S2² = Variance of Sample 2
n2 - 1 = denominator degrees of freedom
The F Distribution
• The F critical value is found from the F table
• There are two appropriate degrees of freedom: numerator and denominator

$$F = \frac{S_1^2}{S_2^2} \quad \text{where } df_1 = n_1 - 1;\ df_2 = n_2 - 1$$

• In the F table,
– numerator degrees of freedom determine the column
– denominator degrees of freedom determine the row
Finding the Rejection Region
Lower tail test:
H0: σ1² ≥ σ2²
H1: σ1² < σ2²
Area α in the lower tail. Reject H0 if F < FL

Upper tail test:
H0: σ1² ≤ σ2²
H1: σ1² > σ2²
Area α in the upper tail. Reject H0 if F > FU

Two-tailed test:
H0: σ1² = σ2²
H1: σ1² ≠ σ2²
Area α/2 in each tail. The rejection region for a two-tailed test is:

$$F = \frac{S_1^2}{S_2^2} > F_U \quad \text{or} \quad F = \frac{S_1^2}{S_2^2} < F_L$$
Finding the Rejection Region
(continued)
H0: σ1² = σ2²
H1: σ1² ≠ σ2²
Area α/2 in each tail: reject H0 if F < FL or F > FU

To find the critical F values:
1. Find FU from the F table for n1 – 1 numerator and n2 – 1 denominator degrees of freedom
2. Find FL using the formula:

$$F_L = \frac{1}{F_{U*}}$$

where FU* is from the F table with n2 – 1 numerator and n1 – 1 denominator degrees of freedom (i.e., switch the d.f. from FU)
F Test: An Example
You are a financial analyst for a brokerage firm. You want to compare dividend yields between stocks listed on the NYSE & NASDAQ. You collect the following data:

            NYSE    NASDAQ
Number      21      25
Mean        3.27    2.53
Std dev     1.30    1.16

Is there a difference in the variances between the NYSE & NASDAQ at the α = 0.1 level?
• Form the hypothesis test:
H0: σ1² – σ2² = 0 (there is no difference between variances)
H1: σ1² – σ2² ≠ 0 (there is a difference between variances)

• The test statistic is:

$$F = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

• Critical values, with n = 21 - 1 = 20 numerator and d = 25 - 1 = 24 denominator degrees of freedom and α/2 = .05 in each tail:

FU = F(α/2, n, d) = F(.05, 20, 24) = 2.03
FL = F(1 - α/2, n, d) = 1/F(α/2, d, n) = 1/F(.05, 24, 20) = 1/2.08 = .48

• Reject H0 if F < FL = 0.48 or F > FU = 2.03; the test statistic F = 1.256 falls between them
F Test: Example Solution
(continued)

H0: σ1² = σ2²
H1: σ1² ≠ σ2²

• The test statistic is:

$$F = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} = 1.256$$

• With α/2 = .05 in each tail, FL = 0.48 and FU = 2.03
• F = 1.256 is not in the rejection region, so we do not reject H0
• Conclusion: There is not sufficient evidence of a difference in variances at α = 0.1
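The F-test decision for the NYSE/NASDAQ variances can be sketched in a few lines. The two table values (2.03 and 2.08) are copied from the example rather than computed, because the Python standard library has no F distribution; the variable names are illustrative.

```python
# Sample standard deviations and sizes from the example.
s1, n1 = 1.30, 21   # NYSE
s2, n2 = 1.16, 25   # NASDAQ

f_stat = s1 ** 2 / s2 ** 2          # F = S1^2 / S2^2
df_num, df_den = n1 - 1, n2 - 1     # 20 and 24

# Table values quoted in the example for alpha/2 = .05.
F_05_20_24 = 2.03   # upper critical value FU (20 numerator, 24 denominator d.f.)
F_05_24_20 = 2.08   # FU* with the degrees of freedom switched

f_upper = F_05_20_24
f_lower = 1 / F_05_24_20            # FL = 1 / FU*

reject_h0 = f_stat < f_lower or f_stat > f_upper
print(f"F = {f_stat:.3f}, FL = {f_lower:.2f}, FU = {f_upper:.2f}, reject H0: {reject_h0}")
```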
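• Form the hypothesis test (same data, with the NASDAQ variance in the numerator):
H0: σ2² – σ1² = 0 (there is no difference between variances)
H1: σ2² – σ1² ≠ 0 (there is a difference between variances)

• The test statistic is:

$$F = \frac{S_2^2}{S_1^2} = \frac{1.16^2}{1.30^2} = 0.796$$

• Critical values, with n = 25 - 1 = 24 numerator and d = 21 - 1 = 20 denominator degrees of freedom and α/2 = .05 in each tail:

FU = F(α/2, n, d) = F(.05, 24, 20) = 2.08
FL = F(1 - α/2, n, d) = 1/F(α/2, d, n) = 1/F(.05, 20, 24) = 1/2.03 = .493

• Reject H0 if F < FL = 0.493 or F > FU = 2.08; the test statistic F = 0.796 falls between them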
F Test: Example Solution
(continued)
H0: σ2² = σ1²
H1: σ2² ≠ σ1²

• The test statistic is:

$$F = \frac{S_2^2}{S_1^2} = \frac{1.16^2}{1.30^2} = 0.796$$

• With α/2 = .05 in each tail, FL = 0.493 and FU = 2.08
• F = 0.796 is not in the rejection region, so we do not reject H0
• Conclusion: There is not sufficient evidence of a difference in variances at α = 0.1
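Both orderings of the F test lead to the same decision because swapping the samples inverts the statistic and swaps the roles of the two table values. A small check under that reading, again using the table values 2.03 and 2.08 quoted in the example (variable names are illustrative):

```python
s_nyse, s_nasdaq = 1.30, 1.16

# Table values quoted in the example for alpha/2 = .05.
F_05_20_24 = 2.03
F_05_24_20 = 2.08

# Ordering 1: NYSE variance in the numerator (20 and 24 d.f.).
f_12 = s_nyse ** 2 / s_nasdaq ** 2
reject_12 = f_12 > F_05_20_24 or f_12 < 1 / F_05_24_20

# Ordering 2: NASDAQ variance in the numerator (24 and 20 d.f.).
f_21 = s_nasdaq ** 2 / s_nyse ** 2
reject_21 = f_21 > F_05_24_20 or f_21 < 1 / F_05_20_24

print(f"F(1/2) = {f_12:.3f}, F(2/1) = {f_21:.3f}, product = {f_12 * f_21:.3f}")
print(f"reject H0 (ordering 1): {reject_12}, reject H0 (ordering 2): {reject_21}")
```

The two statistics are reciprocals, and each lower critical value is the reciprocal of the other ordering's upper critical value, so the two decisions necessarily agree.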