and
. It can be large because of a large difference between the two
sample means or because the sample size is large.
Denominator: measures the within-group variation.
To assess whether several populations have the same mean, we
compare the variation among the means of the groups with the
variation within groups.
That's why this method is called analysis of variance.
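The comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the data are hypothetical, the helper name `variation_ratio` is made up for this sketch, and the groups are assumed to have equal sizes so that the ratio has a simple form.

```python
from statistics import mean, variance


def variation_ratio(groups):
    """Ratio of among-group to within-group variation (equal group sizes)."""
    n = len(groups[0])
    assert all(len(g) == n for g in groups), "sketch assumes equal group sizes"
    group_means = [mean(g) for g in groups]
    # Variation among the group means, scaled by the common sample size.
    among = n * variance(group_means)
    # Average of the sample variances within each group.
    within = mean(variance(g) for g in groups)
    return among / within


# Hypothetical data: both data sets have group means 10, 12 and 14,
# but the second has much more variation within each group.
tight = [[9, 10, 11], [11, 12, 13], [13, 14, 15]]
spread = [[4, 10, 16], [6, 12, 18], [8, 14, 20]]

print("tight groups: ", variation_ratio(tight))
print("spread groups:", variation_ratio(spread))
```

The group means are identical in the two data sets, yet the ratio is far larger for the tightly clustered groups. The same differences among means are more convincing when the within-group variation is small.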
Example:
ANOVA Model
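Let's first look at the statistical model for a random sample of
observations from a single Normal population with mean $\mu$ and
standard deviation $\sigma$. Each observation can be written as
$x_j = \mu + \epsilon_j$, where the deviations $\epsilon_j$ are
independent $N(0, \sigma)$.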
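The model for one-way ANOVA is very similar. We take a
random sample from each of $k$ different populations. The sample
size is $n_i$ for the $i$th population. Let $x_{ij}$ represent the
$j$th observation from the $i$th population.
One-way ANOVA model:
$x_{ij} = \mu_i + \epsilon_{ij}$, for $i = 1, \dots, k$ and
$j = 1, \dots, n_i$, where the $\epsilon_{ij}$ are independent
$N(0, \sigma)$.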
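To make the model concrete, here is a minimal simulation sketch. It assumes the usual form of the one-way model, in which observation $j$ from population $i$ equals the population mean $\mu_i$ plus a Normal deviation with mean 0 and a common standard deviation $\sigma$. The function name, the parameter values and the seed are hypothetical choices for illustration only.

```python
import random


def simulate_one_way(mus, sigma, sizes, seed=0):
    """Draw one sample per population: x_ij = mu_i + eps_ij, eps_ij ~ N(0, sigma)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        [mu + rng.gauss(0, sigma) for _ in range(n)]
        for mu, n in zip(mus, sizes)
    ]


# Hypothetical setting: k = 3 populations with different means,
# a common standard deviation, and unequal sample sizes.
samples = simulate_one_way(mus=[10.0, 12.0, 15.0], sigma=2.0, sizes=[5, 8, 6])

for i, group in enumerate(samples, start=1):
    print(f"population {i}: n = {len(group)}")
```

Note that the populations may have different means but share a single standard deviation, which is what the equal variance assumption requires.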
For our example:
Estimates of Population Parameters
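To estimate $\mu_i$, we use the sample mean of the $i$th group:
$\bar{x}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}$
The residuals $e_{ij} = x_{ij} - \bar{x}_i$ reflect the variation
about the sample means.
To estimate the common variance $\sigma^2$, we pool the sample
variances $s_1^2, \dots, s_k^2$ from $k$ independent SRS's of sizes
$n_1, \dots, n_k$:
$s_p^2 = \frac{(n_1 - 1)s_1^2 + \dots + (n_k - 1)s_k^2}{(n_1 - 1) + \dots + (n_k - 1)}$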
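The estimates can be computed directly from the data. The sketch below assumes the standard estimators: each group's sample mean for its population mean, residuals taken about the group means, and the pooled standard deviation for the common $\sigma$. The data and the function name `estimate_parameters` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def estimate_parameters(groups):
    """Return group means, residuals, and the pooled standard deviation."""
    means = [mean(g) for g in groups]
    # Residual: observation minus the sample mean of its own group.
    residuals = [[x - m for x in g] for g, m in zip(groups, means)]
    # Pooled variance: sample variances weighted by their degrees of freedom.
    numerator = sum((len(g) - 1) * variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)
    return means, residuals, sqrt(numerator / denominator)


# Hypothetical samples of sizes 3, 2 and 4.
groups = [[4.0, 6.0, 8.0], [7.0, 9.0], [10.0, 12.0, 14.0, 16.0]]
means, residuals, s_pooled = estimate_parameters(groups)

print("group means:", means)
print("pooled standard deviation:", round(s_pooled, 3))
```

Within each group the residuals sum to zero, which is why each group contributes only its sample size minus one degrees of freedom to the pooled estimate.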
In our example:
Degrees of Freedom
For our example:
F-Test
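To test $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ against
$H_a$: not all of the $\mu_i$ are equal, calculate the F statistic
$F = \frac{MSG}{MSE}$
When $H_0$ is true, the F statistic has the $F(k-1, N-k)$
distribution, where $N$ is the total number of observations.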
Note: the F-test is always one-sided because any differences
among the group means tend to make F large.
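For reference, the quantities behind the F statistic can be written out as follows. This is the standard one-way ANOVA decomposition; the notation ($\bar{x}$ for the grand mean, $N$ for the total number of observations, $s_i$ for the standard deviation of group $i$) is assumed here rather than taken from the notes.

```latex
\begin{align*}
SSG &= \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2, & DFG &= k - 1, & MSG &= \frac{SSG}{DFG} \\
SSE &= \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2
     = \sum_{i=1}^{k} (n_i - 1) s_i^2, & DFE &= N - k, & MSE &= \frac{SSE}{DFE} \\
SST &= SSG + SSE, & DFT &= N - 1 = DFG + DFE, & F &= \frac{MSG}{MSE}
\end{align*}
```

Here MSE is the pooled variance $s_p^2$, so the denominator of F is the within-group variation and the numerator is the variation among the group means.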
For ANOVA we also define the coefficient of determination as
$R^2 = \frac{SSG}{SST}$
For our example:
This means that 2.37% of the variation in SCI scores is explained
by membership in the groups. The other 97.63% of the variation is
due to worker-to-worker variation within each of the three groups.
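As a small illustration of how this percentage is obtained, the sketch below computes the coefficient of determination for hypothetical data, assuming the usual definition: the sum of squares among groups divided by the total sum of squares. It does not use the SCI data from the example, and the function name `r_squared` is made up for this sketch.

```python
from statistics import mean


def r_squared(groups):
    """Share of the total variation explained by group membership (SSG / SST)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means about the grand mean.
    ssg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of every observation about the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    return ssg / sst


# Hypothetical data with three groups of three observations each.
groups = [[9, 10, 11], [11, 12, 13], [13, 14, 15]]
print(f"R^2 = {r_squared(groups):.3f}")
```

A value near 0, as in the SCI example, says that almost all of the variation is within groups. A value near 1 says that group membership accounts for most of it.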
Assumptions and Conditions:
1. Independence Assumptions:
2. Equal Variance Assumption:
3. Normal Population Assumption:
Example:
Two-way ANOVA
In one-way ANOVA, we classify populations according to one
categorical variable, or factor. But when we are interested in the
effects of two factors, we use a two-way design which offers great
advantages over a one-way design.
Let's consider a few examples.
Example:
Design #1: A magazine publisher wants to compare three different
magazine layouts. To do this, she plans to randomly assign the
three design layouts equally among 60 supermarkets. The number
of magazines sold during a one-week period is the outcome
variable.
Now suppose a second experiment is planned for the following
week to compare four different covers for the magazine. A similar
experimental design will be used, with the four covers randomly
assigned among the same 60 supermarkets.
Here is a picture of the design of the first experiment with the
sample sizes:
Layout   n
1        20
2        20
3        20
Total    60
And this represents the second experiment:
Cover    n
1        15
2        15
3        15
4        15
Total    60
Total time: two weeks.
Let's now combine the two experiments into one.
Design #2: Suppose we use a two-way approach for the magazine
design problem. There are two factors, layout and cover. Since
layout has three levels and cover has four levels, this is a 3x4
design. This gives a total of 12 possible combinations of layout and
cover. With a total of 60 stores, we could assign each combination
of layout and cover to 5 stores. The number of magazines sold
during a one-week period is the outcome variable.
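Here is a picture of the two-way design with the sample sizes:
Layout   Cover 1   Cover 2   Cover 3   Cover 4   Total
1        5         5         5         5         20
2        5         5         5         5         20
3        5         5         5         5         20
Total    15        15        15        15        60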
Each combination of the factors in a two-way design corresponds
to a cell. The 3x4 ANOVA for the magazine experiment has 12
cells, each corresponding to a particular combination of layout and
cover.
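The cells of the 3x4 design and a random assignment of stores to them can be sketched as follows. The store numbering 1 to 60, the seed, and the variable names are hypothetical. The notes specify only that each of the 12 combinations is assigned to 5 of the 60 stores.

```python
import random
from itertools import product

layouts = [1, 2, 3]
covers = [1, 2, 3, 4]

# Each (layout, cover) pair is one cell of the two-way design.
cells = list(product(layouts, covers))

# Shuffle the 60 stores, then deal them out 5 per cell.
stores = list(range(1, 61))
rng = random.Random(2024)  # fixed seed so the sketch is reproducible
rng.shuffle(stores)

per_cell = len(stores) // len(cells)
assignment = {
    cell: sorted(stores[i * per_cell:(i + 1) * per_cell])
    for i, cell in enumerate(cells)
}

print("number of cells:", len(cells))
for cell, assigned in assignment.items():
    print(f"layout {cell[0]}, cover {cell[1]}: stores {assigned}")
```

Every store sees exactly one layout and one cover, so each layout is tested in 20 stores and each cover in 15 stores, the same sample sizes as in the two separate one-way experiments.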
This design gives us the same amount of information as the two
one-way designs. The difference is that we spend less time. So by
combining the two factors, we have increased our efficiency by
reducing the amount of data to be collected.
Example: Malaria is a serious health problem causing an estimated
2.7 million deaths per year, mostly in Africa. Some research
suggests that vitamin A can reduce episodes of malaria in young
children. Red palm oil is a good source of vitamin A and is readily
available in Nigeria, a country where malaria accounts for about
30% of the deaths of young children. Can an increase in the
consumption of red palm oil reduce the occurrence and severity of
malaria in this region?
Target group: children who are 2-5 years of age
Supplement: either placebo, a low dose of red palm oil, or a high
dose of red palm oil.
Gender: boys and girls
Two-way ANOVA design: The factors are red palm oil with three
levels and gender with two levels. There are 3x2=6 cells in our
study. Suppose we recruit 75 boys and 75 girls. We will then
randomly assign 25 of each gender to each of the red palm oil
levels. The outcome variable will be the amount of an acute-phase
protein in the blood that measures the severity of infection.
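Here is a table that summarizes the design:
Red palm oil   Girls   Boys   Total
Placebo        25      25     50
Low dose       25      25     50
High dose      25      25     50
Total          75      75     150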
Example: Osteoporosis is a disease primarily of the elderly. People
with osteoporosis have low bone mass and an increased risk of
bone fractures. Over 10 million people in the US, 1.4 million
Canadians, and many millions throughout the world have this
disease. Adequate calcium in the diet is necessary for strong bones,
but vitamin D is also needed for the body to efficiently use
calcium. High doses of calcium in the diet will not prevent
osteoporosis unless there is adequate vitamin D. Exposure of the
skin to the ultraviolet rays in sunlight enables our body to make
vitamin D. However, elderly people often avoid sunlight, and in
northern areas such as Canada, there is not sufficient ultraviolet
light to make vitamin D, particularly in the winter months.
We want to see if calcium supplements will increase bone mass in
an elderly Canadian population.
We will use a 2x2 design for our study. The two factors are
calcium and vitamin D. The levels of each factor will be zero
(placebo) and an amount that is expected to be adequate,
800 mg/day for calcium and 300 IU/day for vitamin D. Women
between the ages 70 and 80 will be recruited as subjects. Bone
mineral density (BMD) will be measured at the beginning of the
study, and supplements will be taken for one year. The change in
BMD over the one-year period is the outcome variable. We expect
a dropout rate of 20% and we would like to have about 20 subjects
providing data in each group at the end of the study. We will
therefore recruit 100 subjects and randomly assign 25 to each
treatment combination.
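The recruitment arithmetic above can be checked with a short sketch. The function name is hypothetical, and integer arithmetic is used so that the result is rounded up exactly.

```python
def recruits_per_group(target_completers, dropout_percent):
    """Smallest number to recruit so the expected completers meet the target."""
    retained_percent = 100 - dropout_percent
    # Ceiling division with integers: ceil(target * 100 / retained_percent).
    return -(-target_completers * 100 // retained_percent)


groups = 2 * 2  # calcium (2 levels) x vitamin D (2 levels)
per_group = recruits_per_group(target_completers=20, dropout_percent=20)
total = groups * per_group

print(f"recruit {per_group} per treatment combination, {total} in total")
```

With a 20% dropout rate, 25 recruits per cell leave an expected 20 subjects providing data, which matches the 100 subjects in the plan.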
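Here is a table that summarizes the design with the sample sizes at
baseline:
Calcium      Vitamin D: placebo   Vitamin D: 300 IU/day   Total
Placebo      25                   25                      50
800 mg/day   25                   25                      50
Total        50                   50                      100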
Note: The effectiveness of the calcium supplement on BMD
depends on having adequate vitamin D. We call this an
interaction. In contrast, the average values for the calcium effect
and the vitamin D effect are represented as main effects.
Advantages of Two-way ANOVA:
1. It is more efficient to study two factors simultaneously rather
than separately.
2. We can reduce the residual variation in a model by including a
second factor that we think influences the response.
3. We can investigate interaction between factors.
Two-Way ANOVA Model
The fixed component reflects known observable influences on the
response variable and may be fitted.
The random component reflects all the unknown influences.
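One common way to write such a model is shown below. The notation is an assumption, not taken from these notes: $x_{ijk}$ is the $k$th observation in the cell for level $i$ of factor A and level $j$ of factor B, $\alpha_i$ and $\beta_j$ are the main effects, and $(\alpha\beta)_{ij}$ is the interaction.

```latex
\begin{align*}
x_{ijk} &= \underbrace{\mu_{ij}}_{\text{fixed component}}
         + \underbrace{\epsilon_{ijk}}_{\text{random component}},
  \qquad \epsilon_{ijk} \sim N(0, \sigma) \text{ independently} \\
\mu_{ij} &= \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij}
\end{align*}
```

In this form the cell mean $\mu_{ij}$ carries the known influences (the two factors and their interaction), and $\epsilon_{ijk}$ collects everything else.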
The standard hypotheses are:
$H_0$: no interaction exists
It says that the increase (decrease) in the mean when changing
from one level of A (B) to another level of A (B) is the same
regardless of the level of B (A) present.
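The statement above can be illustrated with a table of cell means: with no interaction, the change in the mean across levels of B is the same in every row of A, so every "difference of differences" is zero. The sketch below uses hypothetical cell means and a made-up helper name. It describes the pattern in the means only and is not the F test for interaction.

```python
def interaction_contrasts(cell_means):
    """Differences of differences relative to the first level of A and of B.

    cell_means[i][j] is the mean for level i of A and level j of B.
    All entries are zero exactly when the effects are additive.
    """
    base = cell_means[0]
    return [
        [(row[j] - row[0]) - (base[j] - base[0]) for j in range(1, len(row))]
        for row in cell_means[1:]
    ]


# Hypothetical 2x2 tables of cell means.
additive = [[10.0, 12.0], [13.0, 15.0]]     # B adds 2 at both levels of A
interacting = [[10.0, 10.0], [10.0, 15.0]]  # B matters only at the second level of A

print("additive:   ", interaction_contrasts(additive))
print("interacting:", interaction_contrasts(interacting))
```

The second table mimics the calcium and vitamin D situation described earlier, where one factor has an effect only when the other is present at an adequate level.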