
BIOSTATISTICS BBTC 3282
Comparing Between Two Means
Comparing Two Samples

 Until this point, all the inferential statistics we
have considered involve using one sample
as the basis for drawing conclusions about
one population.
 Although these single-sample techniques
are used occasionally in real research, most
research studies aim to compare two (or
more) sets of data in order to make
inferences about the differences between
two (or more) populations.
 What do we do when our research question
concerns a mean difference between two
sets of data?
 Researchers wish to know if the data they have
collected provide sufficient evidence to indicate a
difference in mean serum uric acid levels between
normal individuals and individuals with Down's
syndrome. The data consist of serum uric acid
readings on 12 individuals with Down's syndrome
and 15 normal individuals. The sample means are
$\bar{X}_1 = 4.5$ mg/100 ml and $\bar{X}_2 = 3.4$ mg/100 ml.
The population variances $\sigma^2$ are 1 for the
Down's syndrome population and 1.5 for the normal
population.
Independent vs Dependent
 Definition
 Two samples drawn from
two populations are
independent if the
selection of one sample
from one population does
not affect the selection of
the second sample from
the second population.
Otherwise, the samples
are dependent.
 Two samples are independent
when the sample selected
from one population is not
related to the sample selected
from the second population
 Two samples are dependent
when each member of one
sample corresponds to a
member of the other sample
 Dependent samples are also
called paired samples or
matched samples
Example – Independent

 Suppose we want to estimate the difference between the
mean salaries of all male and all female executives. To do
so, we draw two samples, one from the population of male
executives and another from the population of female
executives. These two samples are independent because
they are drawn from two different populations, and the
samples have no effect on each other.
Example – Dependent

 Suppose we want to estimate the difference between the
mean weights of all participants before and after a weight
loss program. To accomplish this, suppose we take a sample
of 40 participants and measure their weights before and
after the completion of this program. Note that these two
samples include the same 40 participants. This is an example
of two dependent samples. Such samples are also called
paired or matched samples.
Two kinds of studies
 There are two general research
strategies that can be used to obtain
the two sets of data to be compared:
1. The two sets of data could come
from two independent populations
(e.g. women and men, or students
from section A and from section B)
– between subjects
2. The two sets of data could come
from related populations (e.g.
“before treatment” and “after
treatment”) – within subjects
Two Independent Mean Samples
Three different contexts
 When sampling is from normally
distributed populations with known
population variances → z-test
 When sampling is from normally
distributed populations with unknown
population variances → t-test
 When sampling is from populations
that are not normally distributed →
nonparametric test
Hypothesis Testing

1. H0 : µ1 - µ2 = 0 , HA : µ1 - µ2 ≠ 0
2. H0 : µ1 - µ2 ≥ 0 , HA : µ1 - µ2 < 0
3. H0 : µ1 - µ2 ≤ 0 , HA : µ1 - µ2 > 0
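The three hypothesis pairs above correspond to a two-sided, a left-tailed and a right-tailed test. A minimal Python sketch of the matching rejection rules is shown here; the function name `reject_h0` and the labels `"two-sided"`, `"less"` and `"greater"` are illustrative assumptions, and the critical value is taken as the positive magnitude read from a z or t table.

```python
def reject_h0(test_value, critical_value, alternative):
    """Return True if H0 is rejected.

    critical_value: positive table value (assumed convention).
    alternative:    "two-sided" (HA: mu1 - mu2 != 0),
                    "less"      (HA: mu1 - mu2 < 0),
                    "greater"   (HA: mu1 - mu2 > 0).
    """
    if alternative == "two-sided":
        return abs(test_value) > critical_value
    if alternative == "less":
        return test_value < -critical_value
    if alternative == "greater":
        return test_value > critical_value
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


# Hypothetical usage: a two-sided test with test value 2.5 and table value 1.96.
decision = reject_h0(2.5, 1.96, "two-sided")
```

The same rule applies whichever statistic (z or t) is used; only the table that supplies the critical value changes.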
Sampling from Normally Distributed Populations:
Population Variances Known
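For reference, a sketch of the standard test statistic for this case. The symbols are assumptions chosen to match the hypotheses above: $\sigma_1^2$ and $\sigma_2^2$ are the known population variances, $n_1$ and $n_2$ the sample sizes, and $(\mu_1 - \mu_2)_0$ the hypothesized difference (0 in all three hypothesis pairs).

```latex
z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}
         {\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}
```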
Exercise 1

 Researchers wish to know if the data they have collected
provide sufficient evidence to indicate a difference in mean
serum uric acid levels between normal individuals and
individuals with Down's syndrome. The data consist of serum
uric acid readings on 12 individuals with Down's syndrome
and 15 normal individuals. The means are $\bar{X}_1 = 4.5$ mg/100 ml
and $\bar{X}_2 = 3.4$ mg/100 ml. Assume a population variance of 1 for
the Down's syndrome population and 1.5 for the normal population.
5-minute exercise!

PLEASE USE THE FIVE STEPS!
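One way to check a hand calculation for Exercise 1 is the short Python sketch below. It assumes that sample 1 is the Down's syndrome group (n = 12, variance 1) and sample 2 the normal group (n = 15, variance 1.5), and it assumes α = 0.05, which the exercise does not state. The function name `two_sample_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, var1, var2, n1, n2, delta0=0.0):
    """z statistic for two independent samples with known population variances."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return ((mean1 - mean2) - delta0) / standard_error


# Exercise 1 values (group assignment is an assumption, see above).
z = two_sample_z(4.5, 3.4, 1.0, 1.5, 12, 15)

alpha = 0.05  # assumed significance level, not given in the exercise
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
reject = abs(z) > z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

The five steps still have to be written out by hand; the sketch only confirms the arithmetic of the test value and the decision.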


Sampling from Normally Distributed Populations:
Population Variances Unknown
 When the population variances are unknown,
there are TWO possibilities:
 The two population variances may be equal
 The two population variances may be unequal
Population Variances Equal

 When the population variances are unknown, but ASSUMED to be
equal, it is appropriate to pool the sample variances by means of
the following formula:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

 Pooling combines the TWO sample variances into a single
estimate of the common population variance.
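Formula to obtain the test value

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$

where $s_p^2$ is the pooled variance.
t-critical value: df = n1 + n2 − 2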
Exercise 2

 A consumer agency wanted to estimate the difference
in the mean amounts of caffeine in two brands of
coffee. The agency took a sample of 15 jars of Brand A
coffee that showed the mean amount of caffeine in
these jars to be 80 milligrams per jar with a standard
deviation of 5 milligrams. Another sample of 12 jars of
Brand B coffee gave a mean amount of caffeine equal
to 77 milligrams per jar with a standard deviation of 6
milligrams.
Solution?

One-tailed or two-tailed?
$s_p^2$?
t-critical?
t-value?
Reject or not reject H0?
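The arithmetic for the pooled variance and the t-value in Exercise 2 can be checked with a sketch like the following. It assumes Brand A is sample 1 and Brand B is sample 2 and a hypothesized difference of 0; the function names are illustrative. The critical value is left to the t table, since the exercise does not state α.

```python
from math import sqrt


def pooled_variance(s1, n1, s2, n2):
    """Pooled estimate of the common variance from two sample standard deviations."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def pooled_t(mean1, s1, n1, mean2, s2, n2, delta0=0.0):
    """t statistic for two independent samples, equal variances assumed."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    standard_error = sqrt(sp2 / n1 + sp2 / n2)
    return ((mean1 - mean2) - delta0) / standard_error


# Exercise 2 values: Brand A (15 jars, mean 80, sd 5), Brand B (12 jars, mean 77, sd 6).
sp2 = pooled_variance(5, 15, 6, 12)
t_value = pooled_t(80, 5, 15, 77, 6, 12)
df = 15 + 12 - 2  # degrees of freedom for the t table lookup

print(f"pooled variance = {sp2:.2f}, t = {t_value:.3f}, df = {df}")
```

The decision then follows by comparing `t_value` with the t-table value for `df` degrees of freedom at the chosen α.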
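Population Variance Unequal

 The critical value $t'$ for an $\alpha$ level of significance and a two-sided test is approximately

$$t' = \frac{w_1 t_1 + w_2 t_2}{w_1 + w_2}, \qquad w_1 = \frac{s_1^2}{n_1}, \qquad w_2 = \frac{s_2^2}{n_2}$$

where
$t_1 = t_{(1-\alpha/2)}$ for df: $n_1 - 1$
$t_2 = t_{(1-\alpha/2)}$ for df: $n_2 - 1$
 The critical value $t'$ for an $\alpha$ level of significance and a one-sided test uses the same weighted formula with
$t_1 = t_{(1-\alpha)}$ for df: $n_1 - 1$
$t_2 = t_{(1-\alpha)}$ for df: $n_2 - 1$
The test value:

$$t' = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$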
Example

 Researchers want to compare the aortic stiffness index
between subjects with hypertension and healthy control
subjects. In the 15 patients with hypertension, the mean
aortic stiffness index was 19.16 with a standard deviation of
5.29. In the 30 control subjects, the mean aortic stiffness
index was 9.53 with a standard deviation of 2.69. Assume the
populations are normally distributed with different variances.
Use an alpha of 0.05.
1. State the hypotheses
H0 : µ1 - µ2 = 0 , HA : µ1 - µ2 ≠ 0

2. Identify the critical value
$w_1 = \frac{5.29^2}{15} = 1.8656$, $t_1 = t_{(1-\alpha/2)}$ for df: 14, from the t table: 2.1448
$w_2 = \frac{2.69^2}{30} = 0.2412$, $t_2 = t_{(1-\alpha/2)}$ for df: 29, from the t table: 2.0452

$$t' = \frac{(1.8656)(2.1448) + (0.2412)(2.0452)}{1.8656 + 0.2412} = 2.133$$

3. Calculate the test value

$$t' = \frac{(19.16 - 9.53) - (0)}{\sqrt{\dfrac{5.29^2}{15} + \dfrac{2.69^2}{30}}} = 6.63$$
4. Decide whether to reject or not to reject H0
 Since 6.63 > 2.133, we reject H0

5. Conclusion
 There is enough evidence to support the claim that the mean
aortic stiffness index differs between healthy subjects and
hypertensive subjects. Hence H0 is rejected.
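The numbers in this example can be reproduced with a short Python sketch. The two table values (2.1448 for df 14 and 2.0452 for df 29) are taken from the example itself rather than computed, and the function names are illustrative assumptions.

```python
from math import sqrt


def weighted_critical_value(s1, n1, t1, s2, n2, t2):
    """Approximate critical value for unequal variances: weights are s^2 / n."""
    w1 = s1**2 / n1
    w2 = s2**2 / n2
    return (w1 * t1 + w2 * t2) / (w1 + w2)


def unequal_variance_t(mean1, s1, n1, mean2, s2, n2, delta0=0.0):
    """Test value for two independent samples without pooling the variances."""
    return ((mean1 - mean2) - delta0) / sqrt(s1**2 / n1 + s2**2 / n2)


# Hypertension group: n = 15, mean 19.16, sd 5.29; controls: n = 30, mean 9.53, sd 2.69.
t_crit = weighted_critical_value(5.29, 15, 2.1448, 2.69, 30, 2.0452)
t_value = unequal_variance_t(19.16, 5.29, 15, 9.53, 2.69, 30)
reject = abs(t_value) > t_crit  # two-sided decision rule

print(f"critical value = {t_crit:.3f}, test value = {t_value:.2f}, reject H0: {reject}")
```

Rounded to the same precision as the slide, the sketch gives 2.133 for the critical value and 6.63 for the test value.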
Paired Sample Test
 Characteristics:
 Subjects are often tested in a before-after situation (across time,
with some intervention occurring, such as a diet), or subjects are
paired, such as with twins, or with subjects as alike as possible.
 Test:
 The paired t-test is actually a test that the mean difference between
the two observations is 0. So, if D represents the difference between
observations, the hypotheses are:

H0: $\mu_D = 0$ (the mean difference between the two observations is 0)
HA: $\mu_D \neq 0$ (the mean difference is not 0)
Hypothesis Testing

1. H0 : µD = 0 , HA : µD ≠ 0
2. H0 : µD ≥ 0 , HA : µD < 0
3. H0 : µD ≤ 0 , HA : µD > 0
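Two Dependent Sample Means

 To test the null hypothesis, we'll again compute a t
statistic and look it up in the t table:

$$t = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}, \qquad df = n - 1$$

 Denote the differences with the symbol $D$, the
mean of the population differences with $\mu_D$, and
the sample standard deviation of the differences
with $S_D$. $\bar{D}$ is the sample mean of the differences
and $n$ is the number of pairs.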
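A minimal Python sketch of the paired t statistic follows. The `before` and `after` values are hypothetical numbers invented purely for illustration (they are not data from this course), and the function name `paired_t` and the convention D = before − after are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after, mu_d0=0.0):
    """Paired t statistic and degrees of freedom, with D = before - after."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = mean(diffs)   # sample mean of the differences
    s_d = stdev(diffs)    # sample standard deviation of the differences (n - 1 divisor)
    t = (d_bar - mu_d0) / (s_d / sqrt(n))
    return t, n - 1


# Hypothetical illustration only: four subjects measured twice.
before = [10, 12, 9, 11]
after = [12, 13, 9, 14]
t_value, df = paired_t(before, after)

print(f"t = {t_value:.3f}, df = {df}")
```

With D defined as before minus after, an increase over time produces negative differences, so a claim that values are greater afterwards becomes a left-tailed test.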
Example

 A sample of nine local banks shows their deposits (in billions of
dollars) 3 years ago and their deposits (in billions of dollars) today.
At α = 0.05, can it be concluded that the average deposit for
the banks is greater today than it was 3 years ago?
Solution

 Step 1: State the hypotheses and identify the claim.
 H0: μD = 0 and H1: μD < 0 (claim), taking D as the deposits
3 years ago minus the deposits today, so that greater deposits
today give negative differences.
 Step 2: Find the critical value.
 The degrees of freedom are n – 1 = 9 – 1 = 8.
 The critical value for a left-tailed test with
α = 0.05 is t = –1.860.
