
Coverage

Confidence Intervals for 2 samples

Basic Assumptions

$X_1, X_2, \ldots, X_m$ is a random sample of size $m$ from a population with mean $\mu_1$ and variance $\sigma_1^2$
$Y_1, Y_2, \ldots, Y_n$ is a random sample of size $n$ from a population with mean $\mu_2$ and variance $\sigma_2^2$
The X and Y samples are independent

Null Hypothesis in the 2-Sample Case

We learned earlier that the null hypothesis is always a statement of no difference, so
$$H_0: \mu_1 = \mu_2$$

But we want to get the null hypothesis in terms of a parameter equal to a constant, so transform the above statement to:
$$H_0: \mu_1 - \mu_2 = 0$$

1 Sample Revisited

Recall that the acceptance region for the general case of the 1-sample test of a location parameter is:

We use the same idea to create a confidence interval for the 2-sample location problem

Two Samples: Estimating Differences Between Means

To obtain a point estimate of $\mu_1 - \mu_2$, we select two independent random samples, one from each population, of sizes $m$ and $n$ respectively, and compute the difference between the sample means, $\bar{X} - \bar{Y}$
Here we expect $E(\bar{X} - \bar{Y}) = \mu_1 - \mu_2$

Two Sample Means, Variances Known

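If two samples come from independent distributions with means $\mu_1$ and $\mu_2$, variances $\sigma_1^2$ and $\sigma_2^2$, and

1. the distributions are normal or
2. the sample sizes $m$ and $n$ are large,

and the variances of both distributions are known, a $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is:
$$(\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$$

where $z_{\alpha/2}$ is the z-value leaving an area of $\alpha/2$ to the right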
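A minimal sketch of the known-variance interval described above, using only the Python standard library. The function name `two_sample_z_ci` and the data are made up for illustration; they are not from Session5_data.xls.

```python
from math import sqrt
from statistics import NormalDist, mean

def two_sample_z_ci(x, y, var_x, var_y, conf=0.95):
    """CI for mu_1 - mu_2 when both population variances are known."""
    m, n = len(x), len(y)
    alpha = 1 - conf
    # z-value leaving an area of alpha/2 to the right
    z = NormalDist().inv_cdf(1 - alpha / 2)
    diff = mean(x) - mean(y)
    half_width = z * sqrt(var_x / m + var_y / n)
    return diff - half_width, diff + half_width

# Hypothetical data, for illustration only
x = [10.0, 12.0, 11.0, 13.0]
y = [9.0, 8.0, 10.0, 9.0]
lo, hi = two_sample_z_ci(x, y, var_x=4.0, var_y=4.0)
print(f"95% CI for mu_1 - mu_2: ({lo:.3f}, {hi:.3f})")
```

Because the variances are passed in as known constants, the sample data enter only through the two sample means.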

Example 1

Data were collected on the length of recovery time for patients randomly treated with one of two medications (see Session5_data.xls). Find a 95% CI for $\mu_1 - \mu_2$, where subscripts 1 and 2 indicate medications 1 and 2 respectively. Assume each distribution is normal with $\sigma_1^2 = 4$ and $\sigma_2^2 =$ [value lost in the source], respectively.

Two Sample Means, Variances Unknown and Equal

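If the variances are unknown but equal ($\sigma_1^2 = \sigma_2^2 = \sigma^2$), we get an estimate of the common variance $\sigma^2$ by taking the weighted average of the two sample variances:
$$s_p^2 = \frac{\nu_1 s_1^2 + \nu_2 s_2^2}{\nu_1 + \nu_2} = \frac{(m-1)s_1^2 + (n-1)s_2^2}{m+n-2}$$

where $\nu_1 = m-1$ degrees of freedom and $\nu_2 = n-1$ degrees of freedom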
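The degrees-of-freedom weighting of the two sample variances can be sketched as follows. The helper name `pooled_variance` and the data are hypothetical.

```python
from statistics import variance

def pooled_variance(x, y):
    """Weighted average of the two sample variances, weights nu1 and nu2."""
    nu1, nu2 = len(x) - 1, len(y) - 1
    # statistics.variance is the sample variance (divides by n - 1)
    return (nu1 * variance(x) + nu2 * variance(y)) / (nu1 + nu2)

# Hypothetical data, for illustration only
x = [10.0, 12.0, 11.0, 13.0]        # m = 4, sample variance 5/3
y = [9.0, 8.0, 10.0, 9.0, 9.0]      # n = 5, sample variance 0.5
sp2 = pooled_variance(x, y)
print(sp2)
```

The larger sample gets the larger weight, so the pooled estimate sits closer to the variance of the sample that carries more information.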

Two Sample Means, Variances Unknown and Equal

If two samples come from independent normal distributions with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$, and the variances of both distributions are unknown but equal, a $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is:
$$(\bar{x} - \bar{y}) \pm t_{\alpha/2,\nu}\, s_p\sqrt{\frac{1}{m} + \frac{1}{n}}$$

where $t_{\alpha/2,\nu}$ is the t-value with $\nu = m+n-2$ degrees of freedom, leaving an area of $\alpha/2$ to the right, and $s_p$ is the pooled estimate of the population standard deviation
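A sketch of the pooled-variance interval, assuming the critical value is looked up in a t table and passed in as `t_crit`. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_ci(x, y, t_crit):
    """CI for mu_1 - mu_2 with unknown but equal variances.

    t_crit is the t-value with m + n - 2 degrees of freedom that
    leaves an area of alpha/2 to the right (supplied by the caller).
    """
    m, n = len(x), len(y)
    sp2 = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)
    half_width = t_crit * sqrt(sp2) * sqrt(1 / m + 1 / n)
    diff = mean(x) - mean(y)
    return diff - half_width, diff + half_width

# Hypothetical data, for illustration only
x = [10.0, 12.0, 11.0, 13.0]
y = [9.0, 8.0, 10.0, 9.0, 9.0]
# nu = 4 + 5 - 2 = 7; a standard t table gives t_{0.025, 7} = 2.365
lo, hi = pooled_t_ci(x, y, t_crit=2.365)
print(f"95% CI for mu_1 - mu_2: ({lo:.3f}, {hi:.3f})")
```

If SciPy is available, `scipy.stats.t.ppf(1 - alpha / 2, df)` can supply the critical value instead of a table.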

Example 2

Return to Example 1. Find a 95% CI for $\mu_1 - \mu_2$, assuming normal distributions with equal and unknown population variances


Two Sample Means, Variances Unknown and Not Equal

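If two samples of sizes $m$ and $n$ come from independent normal distributions, and the variances of both distributions are unknown and not equal, an approximate $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is:
$$(\bar{x} - \bar{y}) \pm t_{\alpha/2,\nu}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

where $t_{\alpha/2,\nu}$ is the t-value with
$$\nu = \frac{\left(s_1^2/m + s_2^2/n\right)^2}{\dfrac{(s_1^2/m)^2}{m-1} + \dfrac{(s_2^2/n)^2}{n-1}}$$
degrees of freedom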
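A sketch of the unequal-variance case, assuming the usual Satterthwaite approximation for the degrees of freedom. The names `welch_df` and `welch_ci` and the data are hypothetical. Rounding the degrees of freedom down for the table lookup is a common conservative convention and is an assumption here.

```python
from math import sqrt
from statistics import mean, variance

def welch_df(x, y):
    """Approximate degrees of freedom when the variances are not equal."""
    m, n = len(x), len(y)
    a, b = variance(x) / m, variance(y) / n
    return (a + b) ** 2 / (a ** 2 / (m - 1) + b ** 2 / (n - 1))

def welch_ci(x, y, t_crit):
    """CI for mu_1 - mu_2 without pooling; t_crit comes from a t table."""
    m, n = len(x), len(y)
    half_width = t_crit * sqrt(variance(x) / m + variance(y) / n)
    diff = mean(x) - mean(y)
    return diff - half_width, diff + half_width

# Hypothetical data, for illustration only
x = [10.0, 12.0, 11.0, 13.0]
y = [9.0, 8.0, 10.0, 9.0, 9.0]
nu = welch_df(x, y)          # about 4.4, rounded down to 4 for the table
# a standard t table gives t_{0.025, 4} = 2.776
lo, hi = welch_ci(x, y, t_crit=2.776)
print(f"nu = {nu:.2f}, 95% CI: ({lo:.3f}, {hi:.3f})")
```

Note that the degrees of freedom depend on the data, so the critical value cannot be fixed before the sample variances are computed.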


Example 3

Redo Example 1, but assume that the variances are unknown and not equal


Paired Observations

Situation that arises when samples are not independent
The difference here is that the conditions from each population are not assigned randomly to each experimental unit (for example, two measurement methods applied to the same part)
Rather, each homogeneous experimental unit receives both population conditions, so each experimental unit has a pair of observations, one from each population
If the data are paired, then the differences $D_1, D_2, \ldots, D_n$ are RVs from a population of differences that follows a normal distribution with mean $\mu_D = \mu_1 - \mu_2$ and variance $\sigma_D^2$.

Advantage and Disadvantage of Paired Data

Advantage: Since the experimental unit does not change across populations, the between-population error is reduced, so $\sigma_D$ is reduced
Disadvantage: By pairing the data, we have reduced this to a one-sample problem. Assuming equal sample sizes $n$, the degrees of freedom drop from $2n-2$ (independent samples, pooled variance) to $n-1$.
Outcome: Pairing is counterproductive unless there is a substantial reduction in $\sigma_D$

$(1-\alpha)100\%$ CI for Paired Data

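If $d_1, d_2, \ldots, d_n$ represent normally distributed differences of $n$ random pairs of measurements, with sample mean $\bar{d}$ and sample standard deviation $s_d$, a $(1-\alpha)100\%$ CI for $\mu_D = \mu_1 - \mu_2$ is
$$\bar{d} \pm t_{\alpha/2,\nu}\,\frac{s_d}{\sqrt{n}}$$

where $t_{\alpha/2,\nu}$ is the t-value with $\nu = n-1$ degrees of freedom, leaving an area of $\alpha/2$ to the right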
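A sketch of the paired interval: take the within-pair differences, then treat them as a single sample. The function name `paired_t_ci` and the data are hypothetical (not the TCDD data), and the critical value is assumed to come from a t table.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_ci(x, y, t_crit):
    """CI for mu_D = mu_1 - mu_2 from n paired observations.

    t_crit is the t-value with n - 1 degrees of freedom that leaves
    an area of alpha/2 to the right (supplied by the caller).
    """
    if len(x) != len(y):
        raise ValueError("paired data need equal-length samples")
    d = [xi - yi for xi, yi in zip(x, y)]   # within-pair differences
    n = len(d)
    half_width = t_crit * stdev(d) / sqrt(n)
    return mean(d) - half_width, mean(d) + half_width

# Hypothetical paired measurements, for illustration only
x = [10.0, 12.0, 11.0, 13.0, 14.0]
y = [9.0, 10.0, 10.0, 11.0, 13.0]
# nu = 5 - 1 = 4; a standard t table gives t_{0.025, 4} = 2.776
lo, hi = paired_t_ci(x, y, t_crit=2.776)
print(f"95% CI for mu_D: ({lo:.3f}, {hi:.3f})")
```

Only the spread of the differences enters the interval width, which is why pairing pays off when units vary a lot but respond consistently within a pair.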


Example 4

A published study reported the levels of the dioxin TCDD in 20 Massachusetts Vietnam veterans who were possibly exposed to Agent Orange. The TCDD levels in plasma and fat tissue are recorded in the file Session5_data.xls. Find a 95% CI for $\mu_1 - \mu_2$, where $\mu_1$ and $\mu_2$ represent the mean amount of TCDD in the plasma and fat tissue respectively. Assume the distributions are normal.
