
Chapter 10

Statistical Inference
for Two Samples
Learning Objectives
• Comparative experiments involving two samples
• Test hypotheses on the difference in means of
two normal distributions
• Test hypotheses on the ratio of the variances or
standard deviations of two normal distributions
• Test hypotheses on the difference in two
population proportions
• Compute power, type II error probability, and
make sample size decisions for two-sample tests
• Explain and use the relationship between
confidence intervals and hypothesis tests
Assumptions
• Interested in statistical inference on the
difference in means of two normal distributions
• Populations represented by X1 and X2
• Expected value: $E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$
Assumptions
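• Quantity
$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
• Has a N(0, 1) distribution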
• Used to form tests of hypotheses and
confidence intervals on μ1-μ2
Hypothesis Tests for a Difference
in Means, Variances Known
• Difference in means μ1-μ2 is equal to a
specified value ∆0
– H0: μ1-μ2 =∆0
– H1: μ1-μ2 ≠ ∆0
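• Test statistic
$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$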
Hypothesis Tests for a Difference
in Means, Variances Known
• Alternative Hypothesis
• H1: μ1-μ2 ≠ ∆0
– Rejection Criterion
• z0> zα/2 or z0<-zα/2
• H1: μ1-μ2 >∆0
– Rejection Criterion
• z0> zα
• H1: μ1-μ2<∆0
– Rejection Criterion
• z0< -zα
Choice of Sample Size
• Use of OC Curves
– Use OC curves in Appendix Charts VIa, VIb,
VIc, and VId
– Abscissa scale of the OC curves
Choice of Sample Size
• Two-sided Sample Size
– Sample size n=n1=n2 required to detect a
true difference in means of ∆ with power at
least 1-β
$$n \simeq \frac{(z_{\alpha/2} + z_\beta)^2 (\sigma_1^2 + \sigma_2^2)}{(\Delta - \Delta_0)^2}$$
– Where ∆ is the true difference in means of
interest
• One-sided Sample Size
Type II Error
• Follows the single-sample case
• Two-sided alternative
C.I. on a Difference in Means,
Variances Known, and Choice of
Sample Size
• Confidence Interval
– 100(1-α)% C.I. on the difference in two
means μ1-μ2
Choice of Sample Size
• Choice of Sample Size
– Error in estimating μ1-μ2 by $\bar{x}_1 - \bar{x}_2$ is less than
E at 100(1-α)% confidence
Example
• Two machines are used for filling plastic bottles with a net
volume of 16.0 ounces
• The fill volume can be assumed normal, with standard
deviation σ1=0.020 and σ2=0.025 ounces
• A member of the quality engineering staff suspects that
both machines fill to the same mean net volume, whether
or not this volume is 16.0 ounces. A random sample of 10
bottles is taken from the output of each machine as
follows
Questions
1. Do you think the engineer is correct? Use α=0.05
2. What is the P-value for this test?
3. What is the power of the test in part (1) for a true
difference in means of 0.04?
4. Find a 95% confidence interval on the difference in
means. Provide a practical interpretation of this interval.
5. Assuming equal sample sizes, what sample size should
be used to assure that β=0.05 if the true difference in
means is 0.04? Assume that α=0.05
Solution-Part 1
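1. Parameter of interest is the difference in fill volume, $\mu_1 - \mu_2$, with $\Delta_0 = 0$
2. $H_0: \mu_1 - \mu_2 = 0$ or $\mu_1 = \mu_2$
3. $H_1: \mu_1 - \mu_2 \neq 0$ or $\mu_1 \neq \mu_2$
4. $\alpha = 0.05$
5. The test statistic is
$$z_0 = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
6. Reject $H_0$ if $z_0 < -z_{\alpha/2} = -1.96$ or $z_0 > z_{\alpha/2} = 1.96$
7. $\bar{x}_1 = 16.015$, $\bar{x}_2 = 16.005$, $\Delta_0 = 0$, $\sigma_1 = 0.02$, $\sigma_2 = 0.025$, $n_1 = 10$, and $n_2 = 10$
$$z_0 = \frac{(16.015 - 16.005) - 0}{\sqrt{\dfrac{(0.02)^2}{10} + \dfrac{(0.025)^2}{10}}} = 0.99$$
8. Since $-1.96 < 0.99 < 1.96$, do not reject the null hypothesis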
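The eight steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `two_sample_z_test` and its `alternative` argument are hypothetical, and the critical values come from the standard library's `statistics.NormalDist` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2,
                      delta0=0.0, alpha=0.05, alternative="two-sided"):
    """Return (z0, reject) for H0: mu1 - mu2 = delta0, variances known."""
    # Standard error of xbar1 - xbar2 when sigma1 and sigma2 are known
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z0 = (xbar1 - xbar2 - delta0) / se
    std_normal = NormalDist()
    if alternative == "two-sided":
        reject = abs(z0) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":
        reject = z0 > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":
        reject = z0 < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z0, reject


# Bottle-filling example: summary statistics from step 7
z0, reject = two_sample_z_test(16.015, 16.005, 0.020, 0.025, 10, 10)
print(f"z0 = {z0:.2f}, reject H0: {reject}")
```

The `alternative` argument mirrors the three rejection criteria listed earlier for the known-variance test, so the same helper covers the one-sided cases.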
Solution-Part 2 and 3
2. P-value $= 2(1 - \Phi(0.99)) = 2(1 - 0.8389) = 0.3222$
3. Type II error probability
$$\beta = \Phi\left(z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right) - \Phi\left(-z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right)$$
$$= \Phi\left(1.96 - \frac{0.08}{\sqrt{\dfrac{(0.02)^2}{10} + \dfrac{(0.025)^2}{10}}}\right) - \Phi\left(-1.96 - \frac{0.08}{\sqrt{\dfrac{(0.02)^2}{10} + \dfrac{(0.025)^2}{10}}}\right)$$
$$= \Phi(1.96 - 7.9) - \Phi(-1.96 - 7.9) = \Phi(-5.94) - \Phi(-9.86) = 0 - 0 = 0$$
Hence, the power $= 1 - 0 = 1$
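The P-value and power calculations can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the unrounded z0 is used, so the P-value differs slightly from the 0.3222 obtained with z0 rounded to 0.99. The slide's calculation uses a true difference of 0.08, while the question statement gives 0.04, so both values are evaluated.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist().cdf                       # standard normal CDF
Z_CRIT = NormalDist().inv_cdf(0.975)         # z_{alpha/2} for alpha = 0.05
SE = sqrt(0.020**2 / 10 + 0.025**2 / 10)     # std. error of xbar1 - xbar2


def p_value_two_sided(z0):
    """Two-sided P-value for an observed z statistic."""
    return 2 * (1 - PHI(abs(z0)))


def beta_two_sided(delta, delta0=0.0):
    """Type II error probability when the true difference is delta."""
    shift = (delta - delta0) / SE
    return PHI(Z_CRIT - shift) - PHI(-Z_CRIT - shift)


z0 = (16.015 - 16.005) / SE
print(f"P-value = {p_value_two_sided(z0):.3f}")
for delta in (0.08, 0.04):
    print(f"delta = {delta}: power = {1 - beta_two_sided(delta):.4f}")
```

With a true difference of 0.08 the power is essentially 1, as in the hand calculation. With 0.04 it is lower, at roughly 0.98.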
Solution-Part 4
4. Confidence interval
$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
$$(16.015 - 16.005) - 1.96\sqrt{\frac{(0.02)^2}{10} + \frac{(0.025)^2}{10}} \le \mu_1 - \mu_2 \le (16.015 - 16.005) + 1.96\sqrt{\frac{(0.02)^2}{10} + \frac{(0.025)^2}{10}}$$
$$-0.0098 \le \mu_1 - \mu_2 \le 0.0298$$
With 95% confidence, we believe the true difference in the
mean fill volumes is between -0.0098 and
0.0298. Since 0 is contained in this interval, we can
conclude there is no significant difference between
the means.
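The interval above can be reproduced with a short helper. This is a sketch only: `z_confidence_interval` is a hypothetical name, and the z quantile is computed by `statistics.NormalDist` instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, conf=0.95):
    """Two-sided CI on mu1 - mu2 when both variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width


lower, upper = z_confidence_interval(16.015, 16.005, 0.020, 0.025, 10, 10)
print(f"{lower:.4f} <= mu1 - mu2 <= {upper:.4f}")
```

Because the lower limit is negative and the upper limit is positive, zero lies inside the interval, which agrees with the decision not to reject H0 in part 1.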
Solution-Part 5
5. Assume the sample sizes are to be equal, use $\alpha = 0.05$, $\beta = 0.05$, and $\Delta = 0.08$
$$n \simeq \frac{(z_{\alpha/2} + z_\beta)^2 (\sigma_1^2 + \sigma_2^2)}{\Delta^2} = \frac{(1.96 + 1.645)^2 \left((0.02)^2 + (0.025)^2\right)}{(0.08)^2} = 2.08$$
Hence, n = 3, use n1 = n2 = 3
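The sample-size formula is easy to evaluate for other inputs. The following sketch assumes the two-sided formula with equal sample sizes and rounds up to the next integer. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_sided(delta, sigma1, sigma2, alpha=0.05, beta=0.05, delta0=0.0):
    """Approximate n = n1 = n2 for a two-sided test with power 1 - beta."""
    z_alpha2 = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(1 - beta)
    n = (z_alpha2 + z_beta) ** 2 * (sigma1**2 + sigma2**2) / (delta - delta0) ** 2
    return n, ceil(n)          # raw value and the integer sample size to use


n_raw, n_use = sample_size_two_sided(0.08, 0.020, 0.025)
print(f"delta = 0.08: n = {n_raw:.2f}, use n1 = n2 = {n_use}")

# The question statement quotes a true difference of 0.04 instead
n_raw_04, n_use_04 = sample_size_two_sided(0.04, 0.020, 0.025)
print(f"delta = 0.04: n = {n_raw_04:.2f}, use n1 = n2 = {n_use_04}")
```

Halving the difference to be detected roughly quadruples the required sample size, since n scales with 1/∆².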
Hypotheses Tests for a Difference
in Means, Variances Unknown
• Tests of hypotheses on the difference in
means μ1-μ2 of two normal distributions

• If n1 and n2 both exceed 40, use the central limit theorem (CLT)
and the normal-based procedures
• Otherwise base our hypothesis tests and
C.I. on the t distribution
• Two cases for the variances
Case I: σ1² = σ2² = σ²: Pooled Test
• Two normal populations with unknown means and
unknown but equal variances
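• Expected value
$$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$$
• Form an estimator of σ²
• Pooled estimator of σ², denoted by Sp²
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$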

• Test statistic
Hypotheses Tests
• Test hypothesis
– H0: μ1-μ2 =∆0
– H1: μ1-μ2 ≠ ∆0
• Test statistic
$$T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
• Where Sp is the pooled estimator of σ


Critical Regions
• Alternative Hypothesis
– H1: μ1-μ2 ≠ ∆0
– Rejection Criterion
• t0>tα/2, n1+n2-2 or
• t0<-tα/2, n1+n2-2
– H1: μ1-μ2 >∆0
– Rejection Criterion
• t0>tα, n1+n2-2
– H1: μ1-μ2 <∆0
– Rejection Criterion
• t0<-tα, n1+n2-2
Case II: σ1² ≠ σ2²
• Not able to assume that the unknown
variances σ1², σ2² are equal
• Test statistic

• With v degrees of freedom


• Critical regions
– Identical to Case I
– Degrees of freedom will be replaced by v
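The formulas for this case did not survive on the slide, so the sketch below assumes the usual unequal-variance (Welch-Satterthwaite) form: the statistic divides by sqrt(s1²/n1 + s2²/n2), and v is the approximate degrees of freedom. The function name and the summary statistics are hypothetical and chosen only for illustration.

```python
from math import sqrt


def welch_statistic_and_df(xbar1, s1_sq, n1, xbar2, s2_sq, n2, delta0=0.0):
    """Unequal-variance t statistic and its approximate degrees of freedom."""
    a = s1_sq / n1                       # estimated variance of xbar1
    b = s2_sq / n2                       # estimated variance of xbar2
    t0 = (xbar1 - xbar2 - delta0) / sqrt(a + b)
    v = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t0, v


# Illustrative (made-up) summary statistics
t0, v = welch_statistic_and_df(20.0, 4.0, 10, 18.0, 9.0, 12)
print(f"t0 = {t0:.3f}, v = {v:.2f}")
```

When v is not an integer it is commonly rounded down before looking up the critical value in a t table.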
Confidence Interval on the
Difference in Means
• Case σ1² = σ2²
– 100(1-α)% CI on the difference in means μ1-μ2

• Case σ1² ≠ σ2²
– 100(1-α)% CI on the difference in means μ1-μ2
Example
• The diameter of steel rods manufactured on two
different extrusion machines is being investigated
• Two random samples of sizes n1=15 and n2=17
are selected, and the sample means and sample
variances are $\bar{x}_1 = 8.73$, s1²=0.35, $\bar{x}_2 = 8.68$, and
s2²=0.40, respectively
• Assume equal variances and that the data
are drawn from a normal distribution
– Is there evidence to support the claim that the two
machines produce rods with different mean diameters?
Use α=0.05 in arriving at this conclusion
– Find the P-value for the t-statistic you calculated in part
(1)
– Construct a 95% confidence interval for the difference
in mean rod diameter. Interpret this interval
Solution
1. Parameter of interest, $\mu_1 - \mu_2$
2. $H_0: \mu_1 - \mu_2 = 0$ or $\mu_1 = \mu_2$
3. $H_1: \mu_1 - \mu_2 \neq 0$ or $\mu_1 \neq \mu_2$
4. $\alpha = 0.05$
5. Test statistic is
$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
6. Reject the null hypothesis if $t_0 < -t_{\alpha/2,\,n_1+n_2-2}$ where $-t_{0.025,30} = -2.042$,
or $t_0 > t_{\alpha/2,\,n_1+n_2-2}$ where $t_{0.025,30} = 2.042$
7. $\bar{x}_1 = 8.73$, $\bar{x}_2 = 8.68$, $\Delta_0 = 0$, $s_1^2 = 0.35$, $s_2^2 = 0.40$, $n_1 = 15$,
and $n_2 = 17$
Solution
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{14(0.35) + 16(0.40)}{30}} = 0.614$$
$$t_0 = \frac{(8.73 - 8.68) - 0}{0.614\sqrt{\dfrac{1}{15} + \dfrac{1}{17}}} = 0.230$$
8. Since $-2.042 < 0.230 < 2.042$, do not reject the null hypothesis
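The pooled computation can be verified with a short script. This is a sketch under the equal-variance assumption stated in the example: `pooled_t_statistic` is a hypothetical helper, and the critical value 2.042 is the table value quoted in step 6.

```python
from math import sqrt


def pooled_t_statistic(xbar1, s1_sq, n1, xbar2, s2_sq, n2, delta0=0.0):
    """Pooled standard deviation, t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    sp = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df)
    t0 = (xbar1 - xbar2 - delta0) / (sp * sqrt(1 / n1 + 1 / n2))
    return sp, t0, df


sp, t0, df = pooled_t_statistic(8.73, 0.35, 15, 8.68, 0.40, 17)
T_CRIT = 2.042  # t_{0.025,30}, table value quoted in the solution
print(f"sp = {sp:.3f}, t0 = {t0:.3f}, df = {df}, reject H0: {abs(t0) > T_CRIT}")
```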
Solution-Cont.
• P-value = 2P(t > 0.230) > 2(0.40), so P-value > 0.80
• 95% confidence interval: t0.025,30 = 2.042
$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
$$(8.73 - 8.68) - 2.042(0.614)\sqrt{\frac{1}{15} + \frac{1}{17}} \le \mu_1 - \mu_2 \le (8.73 - 8.68) + 2.042(0.614)\sqrt{\frac{1}{15} + \frac{1}{17}}$$
$$-0.394 \le \mu_1 - \mu_2 \le 0.494$$
• Since zero is contained in this interval, the data give no
evidence at the 0.05 level that machine 1 and machine 2
produce rods with different mean diameters
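The interval limits can be recomputed directly from the pooled estimate. This is a minimal sketch with a hypothetical helper name. It takes s_p = 0.614 and the table value t0.025,30 = 2.042 as inputs, which gives a half-width near 0.444.

```python
from math import sqrt


def pooled_t_interval(xbar1, xbar2, sp, n1, n2, t_crit):
    """Two-sided CI on mu1 - mu2 using a pooled standard deviation."""
    half_width = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width


lower, upper = pooled_t_interval(8.73, 8.68, sp=0.614, n1=15, n2=17, t_crit=2.042)
print(f"{lower:.3f} <= mu1 - mu2 <= {upper:.3f}")
```

Zero falls well inside the computed limits, so the interval leads to the same conclusion as the t test.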
Paired t Test
• Special case of the two-sample t-tests
• When the observations are collected in
pairs
• Each pair of observations is taken under
homogeneous conditions
• Conditions may change from one pair to
another
• Testing
– H0: μD=∆0
– H1: μD ≠ ∆0
Paired t Test
• Test statistic
$$T_0 = \frac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}$$
– D (bar) is the sample average of the n differences
and S_D is their sample standard deviation
• Rejection Region
– t0>tα/2, n-1 or t0<-tα/2, n-1
• 100(1-α)% C.I. on the difference in means
Example
• Ten individuals have participated in a diet-modification
program to stimulate weight loss
• Their weight both before and after participation in the
program is shown in the following list
– Is there evidence to support the claim that this particular diet-
modification program is effective in producing a mean weight
reduction? Use α=0.05.
Subject Before After
1 195 187
2 213 195
3 247 221
4 201 190
5 187 175
6 210 197
7 215 199
8 246 221
9 294 278
10 310 285
Solution
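1. Parameter of interest is the difference in mean weight, $\mu_d$,
where $d_i$ = Weight Before − Weight After.
2. $H_0: \mu_d = 0$
3. $H_1: \mu_d > 0$
4. $\alpha = 0.05$
5. Test statistic is
$$t_0 = \frac{\bar{d}}{s_d/\sqrt{n}}$$
6. Reject the null hypothesis if $t_0 > t_{0.05,9}$ where $t_{0.05,9} = 1.833$
7. $\bar{d} = 17$, $s_d = 6.41$, $n = 10$
$$t_0 = \frac{17}{6.41/\sqrt{10}} = 8.387$$
8. Since 8.387 > 1.833, reject the null hypothesis and conclude that the
program is effective in producing a mean weight reduction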
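The paired calculation can be reproduced from the raw data in the table. The sketch below uses only the Python standard library. The variable names are illustrative, and 1.833 is the table value t0.05,9 from step 6. Because the standard deviation is not rounded to 6.41 first, the statistic comes out at about 8.38 rather than 8.387.

```python
from math import sqrt
from statistics import mean, stdev

before = [195, 213, 247, 201, 187, 210, 215, 246, 294, 310]
after = [187, 195, 221, 190, 175, 197, 199, 221, 278, 285]

# Work with the within-subject differences d_i = before - after
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)                 # sample standard deviation (n - 1 divisor)
t0 = d_bar / (s_d / sqrt(n))

T_CRIT = 1.833                     # t_{0.05,9}, one-sided table value
print(f"d_bar = {d_bar}, s_d = {s_d:.2f}, t0 = {t0:.2f}, reject H0: {t0 > T_CRIT}")
```

Pairing removes the large subject-to-subject variation in weight, which is why the statistic is so large even though n is only 10.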


Inferences on the Variances of
Two Normal Populations
• Both populations are normal and independent

• Test the hypotheses


– H0: σ1² = σ2²
– H1: σ1² ≠ σ2²
• Requires a new probability distribution, the F distribution
The F Distribution
• Define the random variable F as the ratio of two independent
chi-square random variables, each divided by its number
of degrees of freedom
• F = (W/u)/(Y/v)
• Follows the F distribution with u degrees of freedom in the
numerator and v degrees of freedom in the denominator.
• Usually abbreviated as Fu,v
The F Distribution
• Shape of pdf with two dof

• Table V provides the percentage points of the F distribution

• Note that f1-α,u,v =1/fα,v, u
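As a worked instance of this reciprocal relation, a lower-tail point for u = 9 and v = 15 can be obtained from an upper-tail table entry with the degrees of freedom swapped. The value 3.77 for f0.025,15,9 is taken from a standard F table and is an assumption here, since the slides do not quote it.

```latex
f_{0.975,\,9,\,15} = \frac{1}{f_{0.025,\,15,\,9}} = \frac{1}{3.77} \approx 0.265
```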


Hypothesis Tests on the
Ratio of Two Variances
• Suppose H0: σ1² = σ2²
• S1² and S2² are sample variances
• Test statistic
• F0 = S1² / S2²
• Suppose H1: σ1² ≠ σ2²
• Rejection Criterion
• f0>fα/2,n1-1,n2-1 or f0<f1-α/2,n1-1,n2-1
Example
• Two chemical companies can supply a raw material.
• The concentration of a particular element in this
material is important.
• The mean concentration for both suppliers is the
same, but we suspect that the variability in
concentration may differ between the two companies
• The standard deviation of concentration in a random
sample of n1=10 batches produced by company 1 is
s1=4.7 grams per liter, while for company 2, a
random sample of n2=16 batches yields s2=5.8
grams per liter.
• Is there sufficient evidence to conclude that the two
population variances differ? Use α=0.05.
Solution
1. Parameters of interest are the variances of concentration, $\sigma_1^2$ and $\sigma_2^2$
2. $H_0: \sigma_1^2 = \sigma_2^2$
3. $H_1: \sigma_1^2 \neq \sigma_2^2$
4. $\alpha = 0.05$
5. Test statistic is $f_0 = s_1^2 / s_2^2$
6. Reject the null hypothesis if $f_0 < f_{0.975,9,15}$ where $f_{0.975,9,15} = 0.265$, or
$f_0 > f_{0.025,9,15}$ where $f_{0.025,9,15} = 3.12$
7. $n_1 = 10$, $n_2 = 16$, $s_1 = 4.7$, and $s_2 = 5.8$
$$f_0 = \frac{(4.7)^2}{(5.8)^2} = 0.657$$
8. Since 0.265 < 0.657 < 3.12, do not reject the null hypothesis
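The variance-ratio test reduces to one division and two comparisons. In this sketch the function name is hypothetical and the two critical values are the table values quoted in step 6, passed in as arguments rather than computed.

```python
def variance_ratio_test(s1, s2, f_lower, f_upper):
    """Two-sided F test on sigma1^2 = sigma2^2 from sample standard deviations."""
    f0 = s1**2 / s2**2
    reject = f0 < f_lower or f0 > f_upper
    return f0, reject


# f_{0.975,9,15} = 0.265 and f_{0.025,9,15} = 3.12 are the quoted table values
f0, reject = variance_ratio_test(4.7, 5.8, f_lower=0.265, f_upper=3.12)
print(f"f0 = {f0:.3f}, reject H0: {reject}")
```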
Hypothesis Tests on Two
Population Proportions
• Suppose two binomial parameters of
interest, p1and p2
• Large-Sample Test
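• Test statistic
$$Z_0 = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \hat{P} = \frac{X_1 + X_2}{n_1 + n_2}$$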

• Critical regions
β-Error

• If the H1 is two sided, the β-error

• Where
Confidence Interval on the
Difference in Proportions
• Two sided 100(1-α)% C.I. on the difference
in the true proportions p1-p2
Example
• Two different types of injection-molding machines
are used to form plastic parts. A part is
considered defective if it has excessive
shrinkage or is discolored
• Two random samples, each of size 300, are
selected, and 15 defective parts are found in the
sample from machine 1 while 8 defective parts
are found in the sample from machine 2
• Is it reasonable to conclude that both machines
produce the same fraction of defective parts,
using α=0.05?
Solution
1. Parameters of interest are the proportions of defective parts, $p_1$
and $p_2$
2. $H_0: p_1 = p_2$
3. $H_1: p_1 \neq p_2$
4. $\alpha = 0.05$
5. Test statistic is
$$z_0 = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
6. Reject the null hypothesis if $z_0 < -z_{0.025}$ where $-z_{0.025} = -1.96$, or
$z_0 > z_{0.025}$ where $z_{0.025} = 1.96$
7. $n_1 = 300$, $n_2 = 300$, $x_1 = 15$, $x_2 = 8$, $\hat{p}_1 = 0.05$, $\hat{p}_2 = 0.0267$
$$\hat{p} = \frac{15 + 8}{300 + 300} = 0.0383, \qquad z_0 = \frac{0.05 - 0.0267}{\sqrt{0.0383(1 - 0.0383)\left(\dfrac{1}{300} + \dfrac{1}{300}\right)}} = 1.49$$
Solution-Cont
• Since -1.96 < 1.49 < 1.96, do not reject the null
hypothesis
• P-value = 2(1 - P(Z < 1.49)) = 0.13622
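The large-sample test on two proportions can be checked the same way. This sketch assumes the pooled-proportion statistic from step 5. The function name is hypothetical, and the P-value is computed from the unrounded z0, so it differs from 0.13622 in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z test of H0: p1 = p2; returns (z0, p_value)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)       # estimate of the common p under H0
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z0 = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
    return z0, p_value


z0, p_value = two_proportion_z_test(15, 300, 8, 300)
print(f"z0 = {z0:.2f}, P-value = {p_value:.3f}")
```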
