Introduction
A common objective in research is to investigate the effect of each of a number of variables,
or factors, on some response variable. In earlier times, factors were studied one at a time,
with separate experiments devoted to each one. But R.A. Fisher pointed out that important
advantages are gained by combining the study of several factors in the same experiment. In a
factorial experiment, the treatment structure consists of all possible combinations of all
levels of all factors under investigation. Factorial experimentation is highly efficient because
each experimental unit provides information about all the factors in the experiment. Factorial
experiments also provide a systematic method of investigating the relationships among the
effects of different factors (i.e. interactions).
Terminology
The different classes of treatments in an experiment are called factors (e.g. Fertilization,
Medication, etc.). The different categories within each factor are called levels (e.g. 0, 20, and
40 lbs N/acre; 0, 1, and 2 doses of an experimental drug, etc.). We will denote different
factors by upper case letters (A, B, C, etc.) and different levels by lower case letters with
subscripts (a1, a2, etc.). The mean of experimental units receiving the treatment combination
aibj will be denoted "Mean(aibj)".
We will refer to a factorial experiment with two factors and two levels for each factor as a
2x2 factorial experiment. An experiment with 3 levels of Factor A, 4 levels of Factor B, and
2 levels of Factor C will be referred to as a 3x4x2 factorial experiment. Etc.
An example of a CRD involving two factors: nitrogen level (N0 and N1) and phosphorus
level (P0 and P1), applied to a crop. The response variable is yield (lbs/acre). The data:

                           Factor A = N level
                      a1 = N0        a2 = N1        Mean    a2 - a1
Factor B   b1 = P0      40.9           47.8         44.4    6.9 (se A,b1)
= P level  b2 = P1      42.4           50.2         46.3    7.8 (se A,b2)
           Mean         41.6           49.0         45.3    7.4 (me A)
           b2 - b1   1.5 (se B,a1)  2.4 (se B,a2)           1.9 (me B)
The differences a2 - a1 (at each level of B) and b2 - b1 (at each level of A) are called the
simple effects of A and B, respectively, denoted (se A) and (se B). The averages of the
simple effects are the main effects of A and B, respectively, denoted (me A) and (me B).
One way of using this data is to consider the effect of N on yield at each P level separately.
This information could be useful to a grower who is constrained to use one or the other P
level. This is called analyzing the simple effects (se) of N. The simple effects of applying
nitrogen are to increase yield by 6.9 lb/acre for P0 and 7.8 lb/acre for P1.
It is possible that the effect of N on yield is the same whether or not P is applied. In this case,
the two simple effects estimate the same quantity and differ only due to experimental error.
One is then justified in averaging the two simple effects to obtain a mean yield response of
7.4 lb/acre. This is called the main effect (me) of N on yield. If the effect of P is independent
of N level, then one could do the same thing for this factor and obtain a main effect of P on
yield response of 1.9 lb/acre.
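The simple- and main-effect arithmetic above can be reproduced in a few lines of R. This is a sketch: the cell means come straight from the data table, and the table's marginal values (7.4 and 1.9) are rounded versions of the exact averages computed here.

```r
# Cell means from the N x P example (lb/acre)
yield <- matrix(c(40.9, 47.8,    # b1 = P0: a1 = N0, a2 = N1
                  42.4, 50.2),   # b2 = P1
                nrow = 2, byrow = TRUE,
                dimnames = list(P = c("P0", "P1"), N = c("N0", "N1")))

# Simple effects of N (a2 - a1 at each level of P) and their average,
# the main effect of N
se_N <- yield[, "N1"] - yield[, "N0"]   # 6.9 at P0, 7.8 at P1
me_N <- mean(se_N)                      # 7.35 (reported as 7.4)

# Simple and main effects of P
se_P <- yield["P1", ] - yield["P0", ]   # 1.5 at N0, 2.4 at N1
me_P <- mean(se_P)                      # 1.95 (reported as 1.9)
```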
Interaction
If the simple effects of Factor A are the same across all levels of Factor B, the two factors are
said to be independent. In such cases, it is appropriate to analyze the main effects of each
factor. It may, however, be the case that the effects are not independent. For example, one
might expect the application of P to permit a higher expression of the yield potential of the N
application. In that case, the effect of N in the presence of P would be much larger than the
effect of N in the absence of P. When the effect of one factor depends on the level of another
factor, the two factors are said to exhibit an interaction.
An interaction is a measure of the difference in the effect of one factor at the different
levels of another factor. Interaction is a common and fundamental scientific idea.
One of the primary objectives of factorial experiments, other than efficiency, is to study the
interactions among factors. The sum of squares of an interaction measures the departure of
the group means from the values expected on the basis of purely additive effects. In common
biological terminology, a large positive deviation of this sort is called synergism. When
drugs act synergistically, the result of the interaction of the two drugs may be above and
beyond the simple addition of the separate effects of each drug. When the levels of two
factors inhibit each other's effects, we call it interference. Both synergism and
interference increase the interaction SS.
These differences between the simple effects of two factors, also known as first-order
interactions or two-way interactions, can be visualized in the following interaction plots:
[Figure: interaction plots of Y against the levels of Factor A (a1, a2), with one line per level
of Factor B (b1, b2). Early panels mark the simple effects (se B,a1 and se A,b1); the final
panels illustrate e. Synergism and f. Interference.]
Reasons for carrying out factorial experiments
3. To offer recommendations that must apply over a wide range of conditions: One can
introduce "subsidiary factors" (e.g. soil type) into an experiment to ensure that any
recommended results apply across the necessary range of circumstances.
There are also reasons to limit the number of factors in an experiment:
1. The total number of possible treatment combinations increases rapidly as the number
of factors increases. For example, to investigate 7 factors (3 levels each) in a
factorial experiment requires, at minimum, 3^7 = 2187 experimental units.
2. Higher-order interactions (three-way, four-way, etc.) are very difficult to interpret, so a
large number of factors complicates the interpretation of results.
Looking at a data table, it is easy to get confused between nested and factorial experiments.
Consider a factorial experiment in which leaf discs are grown in 10 different tissue culture
media (all possible combinations of 5 different types of sugars and 2 different pH levels). In
what way does this differ from a nested design in which each sugar solution is prepared
twice, so there are two batches of sugar for each treatment? The following tables represent
both designs, using asterisks to represent measurements of the response variable (leaf
growth).
The data tables look very similar, so what's the difference here? The factorial analysis
implies that the two pH classes are common across the entire study (i.e. pH level 1 is a
specific pH level that is the same across all sugar treatments). By analogy, if you were to
analyze the nested experiment as a two-way factorial ANOVA, it would imply that Batches
are common across the entire study. But this is not so. Batch 1 for Treatment 1 has no closer
relation to Batch 1 for Treatment 2 than it does to Batch 2 for Treatment 2. "Batch" is an ID,
and Batches 1 and 2 are simply arbitrary designations for two randomly prepared sugar
solutions for each treatment.
Now, if all batches labeled 1 were prepared by the same technician on the same day, while all
batches labeled 2 were made by someone else on another day, then “1” and “2” would
represent meaningfully common classes across the study. In this case, the experiment could
properly be analyzed using a two–way ANOVA with Technicians/Days as blocks (RCBD).
While both require two-way ANOVAs, RCBDs differ from true factorial
experiments in their objective. In this example, we are not interested in the effect of the
batches or in the interaction between batches and sugar types. Our main interest is to control
for this additional source of variation so that we can better detect the differences among
treatments; toward this end, we assume there to be no interactions.
When presented with an experimental description and its accompanying dataset, the critical
question to be asked to differentiate factors from experimental units or subsamples is this:
Do the classes in question have a consistent meaning across the experiment, or are they
simply ID's? Notice that ID (or dummy) classes can be swapped without affecting the
analysis (switching the names of "Batch 1" and "Batch 2" within any given Sugar Type has
no consequences) whereas factor classes cannot (switching "pH1" and "pH2" within any given
Sugar Type will completely muddle the analysis).
The linear model for a two-factor factorial experiment is:

Yijk = µ + αi + βj + (αβ)ij + εijk

Here αi represents the main effect of factor A (i = 1,...,a), βj represents the main effect of
factor B (j = 1,...,b), (αβ)ij represents the interaction of factor A level i with factor B level j,
and εijk is the error associated with replication k of the factor combination ij (k = 1,...,r). In
dot notation:

Yijk = Ȳ... + (Ȳi.. − Ȳ...) + (Ȳ.j. − Ȳ...) + (Ȳij. − Ȳi.. − Ȳ.j. + Ȳ...) + (Yijk − Ȳij.)
               main effect    main effect    interaction effect          experimental
               factor A       factor B       (A*B)                       error
The null hypotheses for a two-factor experiment are αi = 0 (for all i), βj = 0 (for all j), and
(αβ)ij = 0 (for all i, j). The F
statistics for each of these hypotheses may be interpreted independently due to the
orthogonality of their respective sums of squares.
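The dot-notation decomposition above is an algebraic identity, which can be checked numerically. The sketch below builds an arbitrary simulated data array and confirms that the grand mean plus the four bracketed deviations reconstruct every observation exactly:

```r
# Simulated 2 x 3 factorial with r = 4 replications (values arbitrary)
set.seed(1)
a <- 2; b <- 3; r <- 4
Y <- array(rnorm(a * b * r, mean = 10), dim = c(a, b, r))

Ybar <- mean(Y)                  # grand mean
Yi   <- apply(Y, 1, mean)        # A-level means
Yj   <- apply(Y, 2, mean)        # B-level means
Yij  <- apply(Y, c(1, 2), mean)  # cell means

# Grand mean + A effect + B effect + interaction effect + residual
recon <- array(NA, dim = dim(Y))
for (i in 1:a) for (j in 1:b) for (k in 1:r)
  recon[i, j, k] <- Ybar + (Yi[i] - Ybar) + (Yj[j] - Ybar) +
    (Yij[i, j] - Yi[i] - Yj[j] + Ybar) + (Y[i, j, k] - Yij[i, j])

all.equal(recon, Y)   # TRUE: the decomposition is exact
```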
The ANOVA
In the ANOVA for two-way factorial experiments, the Treatment SS is partitioned into three
orthogonal components: a SS for each factor and an interaction SS. This partitioning is valid
even when the overall F test among treatments is not significant. Indeed, there are situations
where one factor, say B, has no effect on the response variable and hence contributes no more
to the SST than one would expect by chance alone. In such a circumstance, a significant
response to A might well be lost in an overall test of significance. In a factorial experiment,
the overall SST is rightly understood to be an intermediate computational quantity rather than
an end product (i.e. a numerator for an F test).
In a two factor experiment (a x b), there are a total of ab treatment combinations and
therefore (ab – 1) treatment degrees of freedom. The main effect of factor A has (a – 1) df
and the main effect of factor B has (b – 1) df. The interaction (AxB) has (a – 1)(b – 1) df.
With r replications per treatment combination, there are a total of (rab) experimental units in
the study and, therefore, (rab – 1) total degrees of freedom.
Source     df               SS     MS     F
Factor A   a - 1            SSA    MSA    MSA/MSE
Factor B   b - 1            SSB    MSB    MSB/MSE
A x B      (a - 1)(b - 1)   SSAB   MSAB   MSAB/MSE
Error      ab(r - 1)        SSE    MSE
Total      rab - 1          TSS
The interaction SS is the variation due to the departures of group means from the values
expected on the basis of additive combinations of the two factors' main effects. The
significance of the interaction F test determines what kind of subsequent analysis is
appropriate:
Relationship between factorial experiments and experimental design
Experimental designs are characterized by the method of randomization: how were the
treatments assigned to the experimental units? In contrast, factorial experiments are
characterized by a certain treatment structure, with no requirements on how the treatments
are randomly assigned to experimental units. A factorial treatment structure may occur
within any experimental design.
Suppose Factor A has 4 levels (1, 2, 3, 4) and Factor B has 2 levels (1, 2); there are then eight
different treatment combinations: (11, 12, 13, 14, 21, 22, 23, 24).
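These combinations can be enumerated with expand.grid(). A minimal sketch, assuming (as the list above suggests) that the first digit of each code is the B level and the second digit is the A level:

```r
# All combinations of 4 A-levels and 2 B-levels
trt <- expand.grid(A = 1:4, B = 1:2)
nrow(trt)   # 8 treatment combinations

# Two-digit codes as used in the layouts below: B level, then A level
codes <- paste0(trt$B, trt$A)
codes   # "11" "12" "13" "14" "21" "22" "23" "24"
```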
24 23 13 23 24 14 13 23 11 24 12 14 22 13 12 21 21 11 22 12 11 22 21 14
13 12 21 23 11 24 14 22 12 11 24 23 13 22 21 14 24 14 22 21 11 13 23 12
8 x 8 Latin Square
24 11 22 12 13 14 23 21
21 23 13 14 22 12 11 24
12 14 24 11 23 21 22 13
13 22 21 24 11 23 14 12
23 12 11 13 21 22 24 14
14 24 23 22 12 13 21 11
11 21 12 23 14 24 13 22
22 13 14 21 24 11 12 23
Example of a 2 x 3 factorial experiment within an RCBD with no significant
interactions (ST&D 391)
Data: The number of quack-grass shoots per square foot after spraying with maleic
hydrazide. Treatments are maleic hydrazide application rates (R: 0, 4, and 8 lbs/acre) and
delay in cultivation after spraying (D: 3 and 10 days).
The R Code
#The ANOVA
quack_mod<-lm(Number ~ D + R + D:R + Block, quack_dat)   # <- WHAT'S MISSING??
anova(quack_mod)
Note: If there were only 1 replication per D-R combination (i.e. only 1
block) you could not include the D:R interaction in the model. There would
not be enough error df.
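What is missing in the call above is almost certainly factor conversion: D, R, and Block are coded as numbers, and lm() treats numeric columns as continuous covariates rather than classification variables. A sketch with mock data standing in for the real quack_dat (the Number values here are invented):

```r
# Mock structure: 4 blocks x 2 delays (D) x 3 rates (R)
set.seed(42)
quack_dat <- expand.grid(Block = 1:4, D = c(3, 10), R = c(0, 4, 8))
quack_dat$Number <- rnorm(nrow(quack_dat), mean = 12, sd = 2)

# Without this step, D, R, and Block would each get 1 df as
# continuous covariates instead of 1, 2, and 3 df as factors
quack_dat$D     <- as.factor(quack_dat$D)
quack_dat$R     <- as.factor(quack_dat$R)
quack_dat$Block <- as.factor(quack_dat$Block)

quack_mod <- lm(Number ~ D + R + D:R + Block, quack_dat)
anova(quack_mod)   # df: D = 1, R = 2, Block = 3, D:R = 2, Residuals = 15
```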
The output
Analysis of Variance Table
Essentially parallel lines in an interaction plot, as those observed in this case, indicate the
absence of an interaction.
[Figure "Interactions of R and D": mean Number (y-axis, roughly 10-16) against delay D
(3 and 10 days) on the x-axis, with one line per rate R (0, 4, and 8 lbs/acre).]
The lines of this plot are essentially parallel because the difference between D levels is
roughly the same at all R levels. This non-interaction can be seen from the perspective of
either factor:
[Figure "Interactions of D and R": mean Number (y-axis, roughly 10-16) against rate R
(0, 4, and 8 lbs/acre) on the x-axis, with one line per delay D (3 and 10 days).]
Here, the lines are essentially parallel because the difference between R levels is
approximately the same at all levels of D.
If there is no significant interaction, you are justified in analyzing the effect of R without
regard for the level of D, because the effect of R does not depend on the level of D, and vice
versa. Detailed comparisons of the mean effects can be performed using contrasts or an
appropriate multiple comparison test. Representative analyses:
D, means
alpha: 0.05 ; Df Error: 15
Critical Value of Studentized Range: 3.014325
Honestly Significant Difference: 1.409971
R, means
alpha: 0.05 ; Df Error: 15
Critical Value of Studentized Range: 3.673378
Honestly Significant Difference: 2.104414
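The critical values and HSDs reported above can be reproduced with base R's qtukey(). This sketch assumes the MSE of 2.63 on 15 error df from the ANOVA, with each D mean averaging 12 observations and each R mean averaging 8; the small discrepancy against the printed HSDs comes from rounding the MSE.

```r
mse <- 2.63; df_err <- 15

# Factor D: 2 means, 12 observations per mean
q_D   <- qtukey(0.95, nmeans = 2, df = df_err)  # ~3.014
hsd_D <- q_D * sqrt(mse / 12)                   # ~1.41

# Factor R: 3 means, 8 observations per mean
q_R   <- qtukey(0.95, nmeans = 3, df = df_err)  # ~3.673
hsd_R <- q_R * sqrt(mse / 8)                    # ~2.11
```

Two means whose difference exceeds the HSD are declared significantly different at alpha = 0.05.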
Contrasts (trend analysis of R)
Df Sum Sq Mean Sq F value Pr(>F)
D 1 1.50 1.50 0.571 0.461
R 2 153.66 76.83 29.263 6.64e-06 ***
R: Linear 1 152.52 152.52 58.092 1.56e-06 ***
R: Quadratic 1 1.14 1.14 0.435 0.520
Block 3 0.58 0.19 0.074 0.973
D:R 2 0.49 0.25 0.093 0.911
D:R: Linear 1 0.12 0.12 0.047 0.832
D:R: Quadratic 1 0.37 0.37 0.140 0.714
Residuals 15 39.38 2.63
Look again at the contrast output above. The trend analysis using orthogonal contrasts
partitioned not only the SS for the factor R but also the SS of the interaction D*R. In this
way, R makes partitioning the Interaction SS very easy. This can be done another way as
well, by "opening up" the factorial treatment structure, as described below:
To manually partition the D:R interaction (2 df), you first need to create a variable, say
"TRT," whose values are the full set of factorial combinations of D and R levels (six values
in this example, one for each D-R combination).
Now we are back in familiar territory. We have "opened up" the factorial treatment structure,
redefining it as a simple one-way classification. Now we can simply analyze TRT and use
contrasts to partition the interaction, as you've seen before.
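A sketch of this "opening up" in R, using interaction() to build the combined treatment factor (the D and R vectors here are mock stand-ins for the columns of the real dataset):

```r
# One observation per D-R combination, for illustration
D <- factor(rep(c(3, 10), each = 3))
R <- factor(rep(c(0, 4, 8), times = 2))

# Collapse the two factors into one 6-level treatment factor
TRT <- interaction(D, R, sep = "-")
nlevels(TRT)   # 6
levels(TRT)    # "3-0" "10-0" "3-4" "10-4" "3-8" "10-8"
```

Contrasts on TRT (5 treatment df) can then reproduce the main-effect and interaction partitions shown earlier.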
Modifying the original data table:
The output:
Here we have successfully partitioned the Treatment SS into its five single-df components,
two of which are interaction components. Compare this output to that of the previous
factorial analysis:
Df Sum Sq Mean Sq F value Pr(>F)
D 1 1.50 1.50 0.571 0.461
R 2 153.66 76.83 29.263 6.64e-06 ***
R: Linear 1 152.52 152.52 58.092 1.56e-06 ***
R: Quadratic 1 1.14 1.14 0.435 0.520
Block 3 0.58 0.19 0.074 0.973
D:R 2 0.49 0.25 0.093 0.911
D:R: Linear 1 0.12 0.12 0.047 0.832
D:R: Quadratic 1 0.37 0.37 0.140 0.714
Residuals 15 39.38 2.63
By "opening up" the factorial treatment structure, we have successfully partitioned the SS of
R into its two single-df components and the SS of the D:R interaction into its two single-df
components. In this case, no significant interaction components were found "hiding" inside
the overall non-significant interaction.
9.7.4.3 Another example of partitioning the Interaction SS
A factorial experiment was carried out to determine the effects of vernalization genes Vrn1
and Vrn2 on flowering time (days to flowering) in wheat. 102 plants from a segregating
population (parents A and B) were characterized with molecular markers and the number of
alleles of parent A indicated for each of the two genes (BB = 0, AB = 1, AA = 2). In the R
code below, the auxiliary variable "Type" represents each combination of Vrn1 and Vrn2
classes. In the data listing, each group of four numbers is one plant: Type, Vrn1, Vrn2, and
days to flowering.
1 0 0 89 1 0 0 97 1 0 0 101 1 0 0 100
1 0 0 98 2 0 1 133 2 0 1 144 2 0 1 148
2 0 1 148 2 0 1 138 2 0 1 130 2 0 1 133
2 0 1 128 2 0 1 130 2 0 1 137 2 0 1 141
2 0 1 134 2 0 1 133 2 0 1 138 2 0 1 131
2 0 1 148 3 0 2 163 3 0 2 153 3 0 2 161
3 0 2 153 3 0 2 156 3 0 2 148 4 1 0 109
4 1 0 83 4 1 0 87 4 1 0 103 4 1 0 110
4 1 0 81 4 1 0 99 4 1 0 98 4 1 0 83
4 1 0 78 4 1 0 92 4 1 0 92 4 1 0 91
4 1 0 85 4 1 0 83 4 1 0 66 5 1 1 122
5 1 1 121 5 1 1 121 5 1 1 122 5 1 1 125
5 1 1 118 5 1 1 123 5 1 1 124 5 1 1 125
5 1 1 108 5 1 1 112 5 1 1 126 5 1 1 118
5 1 1 98 5 1 1 116 5 1 1 106 5 1 1 117
5 1 1 110 5 1 1 113 5 1 1 129 5 1 1 116
6 1 2 140 6 1 2 125 6 1 2 178 6 1 2 136
6 1 2 132 6 1 2 133 6 1 2 135 6 1 2 134
6 1 2 125 6 1 2 125 6 1 2 128 6 1 2 121
6 1 2 128 6 1 2 135 7 2 0 91 7 2 0 103
7 2 0 81 7 2 0 99 7 2 0 88 7 2 0 99
7 2 0 73 8 2 1 137 8 2 1 118 8 2 1 120
8 2 1 153 8 2 1 86 8 2 1 114 8 2 1 126
8 2 1 120 8 2 1 120 8 2 1 118 8 2 1 119
8 2 1 106 8 2 1 112 8 2 1 111 8 2 1 117
9 2 2 124 9 2 2 124
The unbalanced nature of this dataset causes some problems, which we will get to in Session
II. For now, let's run the analysis as we've learned until now and take a look at the
Vrn1:Vrn2 interaction:
Results of analysis as a 3x3 factorial experiment
Is there any significance lurking inside here? We can partition this interaction in various
ways. One way is to keep both factors (Vrn1 and Vrn2) in the model and create contrasts for
each:
#The same contrasts are used for factors Vrn1 and Vrn2
# Contrast ‘Linear’ -1 0 1
# Contrast ‘Quadratic’ 1 -2 1
contrastmatrix<-cbind(c(-1,0,1),c(1,-2,1))
contrastmatrix
contrasts(vrn_dat$Vrn1)<-contrastmatrix
contrasts(vrn_dat$Vrn2)<-contrastmatrix
vrn_dat$Vrn1
vrn_dat$Vrn2
The output:
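With these contrasts assigned, the partitioned table can be produced with summary() on an aov fit, using the split argument to label each single-df piece. A sketch with mock balanced data (the real data live in vrn_dat; the response column name DTF is a stand-in for days to flowering):

```r
# Mock balanced version of the Vrn data (DTF values invented)
set.seed(7)
vrn_dat <- expand.grid(rep = 1:5, Vrn1 = factor(0:2), Vrn2 = factor(0:2))
vrn_dat$DTF <- rnorm(nrow(vrn_dat), mean = 120, sd = 10)

contrastmatrix <- cbind(c(-1, 0, 1), c(1, -2, 1))
contrasts(vrn_dat$Vrn1) <- contrastmatrix
contrasts(vrn_dat$Vrn2) <- contrastmatrix

vrn_mod <- aov(DTF ~ Vrn1 * Vrn2, vrn_dat)
# split labels the 1-df contrast components of each term
summary(vrn_mod, split = list(
  Vrn1 = list(Linear = 1, Quadratic = 2),
  Vrn2 = list(Linear = 1, Quadratic = 2)))
```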
The other way is to collapse the factorial structure into the single classification variable
"Type" and use contrasts to dissect the sums of squares of the Type factor:
#Type 1 2 3 4 5 6 7 8 9
#Vrn1-Vrn2 00 01 02 10 11 12 20 21 22
# Contrast ‘Vrn1 Linear’ -1 -1 -1 0 0 0 1 1 1
# Contrast ‘Vrn1 Quad’ 1 1 1 -2 -2 -2 1 1 1
# Contrast ‘Vrn2 Linear’ -1 0 1 -1 0 1 -1 0 1
# Contrast ‘Vrn2 Quad’ 1 -2 1 1 -2 1 1 -2 1
# Contrast ‘1Lin*2Lin’ 1 0 -1 0 0 0 -1 0 1
# Contrast ‘1Lin*2Quad’ -1 2 -1 0 0 0 1 -2 1
# Contrast ‘1Quad*2Lin’ -1 0 1 2 0 -2 -1 0 1
# Contrast ‘1Quad*2Quad' 1 -2 1 -2 4 -2 1 -2 1
contrastmatrix2<-cbind(c(-1,-1,-1, 0, 0, 0, 1, 1, 1),
c(1, 1, 1,-2,-2,-2, 1, 1, 1),
c(-1, 0, 1,-1, 0, 1,-1, 0, 1),
c(1,-2, 1, 1,-2, 1, 1,-2, 1),
c(1, 0,-1, 0, 0, 0,-1, 0, 1),
c(-1, 2,-1, 0, 0, 0, 1,-2, 1),
c(-1, 0, 1, 2, 0,-2,-1, 0, 1),
c(1,-2, 1,-2, 4,-2, 1,-2, 1))
contrasts(vrn_dat$Type)<-contrastmatrix2
Note that even though the interaction in the 3x3 factorial is declared not significant, the
linear by linear component of that interaction is significant. What does that mean? A
look at the contrast coefficients for this interaction reveals the null hypothesis of this
interaction component (Note: Here, "Type i" indicates the mean of treatment level i):
For the '1Lin*2Lin' contrast, H0: Mean(Type 1) − Mean(Type 7) − Mean(Type 3) + Mean(Type 9) = 0.

Plugging in:
(No Vrn1 or Vrn2) – (Full Vrn1 but no Vrn2) – (Full Vrn2 but no Vrn1) + (Full Vrn1 and Vrn2) = 0
(No Vrn1 or Vrn2) – (Full Vrn1 but no Vrn2) = (Full Vrn2 but no Vrn1) - (Full Vrn1 and Vrn2)
i.e. The effect of Vrn1 in the absence of Vrn2 = The effect of Vrn1 in the presence of Vrn2
OR
(No Vrn1 or Vrn2) – (Full Vrn2 but no Vrn1) = (Full Vrn1 but no Vrn2) - (Full Vrn1 and Vrn2)
i.e. The effect of Vrn2 in the absence of Vrn1 = The effect of Vrn2 in the presence of Vrn1
9.7.5 Two-way factorial within a CRD, one replication per cell
Analogous to the RCBD case with one replication per block-treatment combination, when
there is only one observation per factor-factor combination, there is no source of variation to
estimate the true experimental error. Due to a lack of degrees of freedom, the interaction
effect cannot be partitioned from the error. As in the RCBD case, Tukey’s 1 df test for
nonadditivity can be used to probe the significance of that interaction, though only in an
approximate way. In the end, because the interaction cannot be removed from the error, all
tests for significance assume additivity of the factors (i.e. no interaction). In the following
code, only the first block from the maleic hydrazide example is used as an example of a
CRD:
The data:
D R Number
3 0 15.7
10 0 18
3 4 9.8
10 4 13.6
3 8 7.9
10 8 8.8
#The ANOVA
quack_mod<-lm(Number ~ D + R, quack_dat)
anova(quack_mod)
The output:
Note that the error (2 df) is estimated by the interaction. If the interaction is non-significant,
SSE and SS(A*B) estimate the same quantity and the conclusions are valid.
So the situation is very similar to an RCBD with one rep per cell. The difference is one of
intention. The primary objective of a factorial experiment is to investigate potential
interactions among factors of interest. For that reason, it makes very little sense to conduct a
factorial experiment with only one replication per factor-factor combination. In an RCBD,
blocks are merely an error control strategy and the assumption is that their effect on the
response variable is independent of the treatment; it is this assumption that justifies a design
with a single rep per cell.
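Tukey's 1-df test for nonadditivity mentioned above can be sketched in base R: fit the additive model, then test the square of its fitted values as an extra regressor. This is one common way to run the test, not the only one; the data are the six values from the table above, with D and R as factors.

```r
# The single-block CRD data (one observation per D-R combination)
quack1 <- data.frame(
  D = factor(rep(c(3, 10), times = 3)),
  R = factor(rep(c(0, 4, 8), each = 2)),
  Number = c(15.7, 18, 9.8, 13.6, 7.9, 8.8))

add_mod   <- lm(Number ~ D + R, quack1)   # additive model
quack1$p2 <- fitted(add_mod)^2            # squared predictions

# A significant p2 term is evidence of (multiplicative) nonadditivity
tukey_mod <- lm(Number ~ D + R + p2, quack1)
anova(tukey_mod)   # p2 carries the 1 df for nonadditivity
```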
9.7.6. 2x2 CRD with a significant interaction
The interpretation of factorial experiments is often complicated when the interactions are
large. This is especially true if the effects change direction, as they do in this example. In
this CRD with five replications per treatment combination, Factor A is time of bleeding of a
lamb (AM vs. PM) and Factor B is treatment with the synthetic estrogen diethylstilbestrol (no
DES vs. DES). The response variable is level of plasma phospholipid in the blood. A data
summary:
                     Factor A
B           AM        PM       Means     Effects
No DES     13.28     36.53     24.905     0.660
DES        19.36     27.81     23.585    -0.660
Means      16.32     32.17     24.245
Effects    -7.925     7.925
#The ANOVA
phos_mod<-lm(Phos ~ Time + DES + Time:DES, phos_dat)
anova(phos_mod)
The output:
The interaction is significant, which means that the simple effects are heterogeneous. It makes
no sense to talk about the main (or general) effect of DES on phospholipid levels unless you
specify a time of day. Conversely, it makes no sense to talk about the mean effect of time of
day on phospholipid levels unless you specify a DES treatment level. The non-parallel nature
of the lines in the interaction plot on the next page illustrates this interaction.
If an interaction is present in a fixed-effects model,
the next step is to analyze the simple effects.
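One way to run these simple-effect ANOVAs is to subset the data and fit a one-way model within each subset. A sketch with mock data of the same structure (the column names Time, DES, and Phos are inferred from the earlier model call; the Phos values are invented):

```r
# Mock data: 5 reps per Time x DES cell
set.seed(3)
phos_dat <- expand.grid(rep = 1:5, Time = c("AM", "PM"), DES = c("no", "yes"))
phos_dat$Phos <- rnorm(nrow(phos_dat), mean = 24, sd = 5)

# Effect of DES at each time of day
anova(lm(Phos ~ DES,  data = subset(phos_dat, Time == "AM")))
anova(lm(Phos ~ DES,  data = subset(phos_dat, Time == "PM")))

# Effect of time of day at each DES level
anova(lm(Phos ~ Time, data = subset(phos_dat, DES == "no")))
anova(lm(Phos ~ Time, data = subset(phos_dat, DES == "yes")))
```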
What the above code accomplishes is four separate ANOVAs. The first half breaks the data
into two separate experiments, one testing the effect of DES in the morning and one testing
the effect of DES in the evening. The second half breaks the data into two separate
experiments, one testing the effect of time in the absence of DES and one testing the effect of
time in the presence of DES. A summary of these results:
Response: Phos
                Df    Sum Sq   Mean Sq   F value   Pr(>F)
DES (in AM)      1    92.477    92.477    7.7607   0.02371 *
DES (in PM)      1   190.183   190.183    5.3461   0.04952 *
Time (DES)       1   178.591   178.591   24.802    0.001079 **
Time (no DES)    1  1352.10   1352.10    33.559    0.0004084 ***
The significant effect of DES is now seen. It was hidden in the overall ANOVA (p = 0.5532)
because the simple effects of DES are in opposite directions, depending on the time of day.
These opposite effects cancelled one another out in the analysis of main effects. Note: For
each of these four ANOVAs, you must test all assumptions.
There is no reason to restrict a factorial experiment to a consideration of only two factors.
Three or more factors may be analyzed simultaneously, each at different levels. However, as
the number of factors increases, even without replication within subgroups, the required
number of experimental units becomes very large. It is frequently prohibitive in terms of
resources to carry out such experiments. For example, a 4x4x4 factorial requires 64
experimental units to represent each combination of factors. Moreover, if only 64 EU's are
used, there will not be sufficient replication to estimate the true experimental error. In this
case, the three-way interaction (A:B:C) would have to be used as an estimate of experimental
error; and this relies on the assumption that no significant three-way interaction effect is present.
There are also logistical difficulties with such large experiments. It may not be possible to
run all the tests on the same day or to hold all materials in a single controlled environmental
chamber. If this is the case, treatment effects may become confounded with one another or with
irrelevant sources of variation, either of which may ultimately reduce the power of the tests.
The third problem that accompanies a factorial ANOVA with several main effects is the large
number of possible interactions. A two-way ANOVA has only one interaction, A:B. A
three-way factorial has three first-order (or two-way) interactions (A:B, A:C, and B:C) and
one second-order (or three-way) interaction (A:B:C).
A four-way factorial has six first-order interactions, four second-order interactions, and one
third-order (or four-way) interaction (A:B:C:D). The number of possible interactions
increases rapidly as the number of factors increases. The testing of their significance, and
more importantly, their interpretation becomes complex.
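These counts follow from binomial coefficients: k factors give choose(k, 2) two-way interactions, choose(k, 3) three-way interactions, and so on. A quick check in R:

```r
# Interactions of each order for 3 and 4 factors
choose(3, 2:3)   # 3 1   : three two-way, one three-way
choose(4, 2:4)   # 6 4 1 : six two-way, four three-way, one four-way

# Total number of interactions for k factors: 2^k - k - 1
k <- 4
sum(choose(k, 2:k))   # 11
```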
9.8.1. Example of a three-way factorial ANOVA
C.J. Monlezun (1979). Two-dimensional plots for interpreting interactions in the three-factor
analysis of variance model. The American Statistician 33: 63-69.
The following group means from a hypothetical 3x5x2 experiment are used to illustrate an
example with no three-way interactions. A graphical technique to show the non-significant
three-way interactions is discussed.
The lines of mean values for fixed C1 (below, left) and C2 (right) levels are not parallel.
This indicates the presence of a two-way interaction between factors A and B at both levels
of factor C. The first order interaction (A:B) now has two values: (A:B at C1) and (A:B at
C2). The interaction term (A:B) is the average of these values.
[Figure: two panels plotting mean Response against B-levels (1-5), one line per level of A
(A1, A2, A3); left panel at C1, right panel at C2. The lines within each panel are not parallel.]
If, however, the differences between different levels of A are determined for all
combinations of B and C, the plot of these differences reveals no interaction between B and
C. The lack of B:C interaction when the response variable is the differences between A
levels indicates that no significant A:B:C interaction is present (i.e. (αβγ)ijk = 0). A graphical
check of whether (αβγ)ijk = 0 is satisfied in the general situation would require (a-1) different
graphs, like those below:
[Figure: plots of the differences between A levels against B-levels (1-5), one line per level of
C (C1, C2); the C1 and C2 lines are essentially parallel.]
Phrasing these results in words, we can say that the factors A and B interact in the same way
at all levels of factor C.
The interpretation of a three-factor interaction is that the effect of factor A depends on the
precise combination of factors B and C. Take, for example, A to be nitrogen level and B to
be plow depth. In a simple two-factor experiment, a significant A:B interaction indicates that
the crop has a different response to N depending on plow depth. Now introduce a third factor
C, which could be soil type. A nonzero three-way interaction (A:B:C) means that the effect
of plow depth on the crop's response to nitrogen varies, depending on the soil type.