
Nested (hierarchical) ANOVA

what it is
how you do it
variance components
power and optimal allocation of replication

Nested ANOVA
• designs with subsamples nested within replicates
• if the nesting is not acknowledged, these designs are pseudoreplicated
• nesting is usually spatial, but can be temporal
• variation is partitioned among hierarchical levels

What is a nested factor?
• not all levels of one factor are present in every level of another factor
• some levels occur only within particular levels of another factor, and not in the others
• nested factors are usually random factors
• nested fixed factors require justification
Nested B(A)
Factor A:  A  A  A  B  B  B  C  C  C
Factor B:  1  2  3  4  5  6  7  8  9

Not nested: A x B
Factor A:  A  A  A  B  B  B  C  C  C
Factor B:  1  2  3  1  2  3  1  2  3
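The difference between the two layouts can be checked mechanically. The sketch below is only an illustration: the function name `is_nested` is made up for this example, and it assumes that each level of factor B carries a unique label (as in the nested diagram, where the B levels run from 1 to 9).

```python
from collections import defaultdict

def is_nested(pairs):
    """Return True if every level of B occurs within exactly one level of A."""
    parents = defaultdict(set)
    for a_level, b_level in pairs:
        parents[b_level].add(a_level)
    return all(len(a_levels) == 1 for a_levels in parents.values())

# Layouts copied from the two diagrams: (level of A, level of B) for each column.
nested_layout = list(zip("AAABBBCCC", [1, 2, 3, 4, 5, 6, 7, 8, 9]))
crossed_layout = list(zip("AAABBBCCC", [1, 2, 3, 1, 2, 3, 1, 2, 3]))

print(is_nested(nested_layout))   # True
print(is_nested(crossed_layout))  # False
```

If nested units are labelled 1, 2, 3 within every level of A, this check would report them as crossed, so the labels need to be made unique first.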

Examples of Nesting
• creeks (tributaries) are unique to each river
• multiple samples of a single tissue type within a rat
• subsamples in time (if sampled w/o replacement) can only be sampled at one time and not another
• replicates are always nested within treatments, but we don't consider this nesting when we construct the ANOVA model

Two-factor nested ANOVA design
• factors A & B
• factor A with p groups or levels
• factor B with q groups or levels within each level of A
• nested design:
  - different (randomly chosen) levels of Factor B in each level of Factor A
  - often one or more levels of subsampling

Andrew & Underwood (1993)

Example: sea urchin grazing on reefs

effect of sea urchin density on the % cover of filamentous algae

• Factor A - fixed
  - sea urchin density
  - four levels: 100% of original (control), 66%, 33%, 0%
• Factor B - random
  - randomly chosen patches
  - 4 within each treatment
• n = 5 quadrats / patch

layout: sea urchin grazing on reefs

Density:  100%       66%        etc.
Patch:    1 2 3 4    5 6 7 8    etc.
Reps:     n = 5 in each of 16 cells

p = 4 densities, q = 4 patches

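Linear model
$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{ijk}$
where
• $\mu$: overall mean
• $\alpha_i$: effect of factor A ($\mu_i - \mu$)
• $\beta_{j(i)}$: effect of factor B within each level of A ($\mu_{ij} - \mu_i$)
• $\varepsilon_{ijk}$: unexplained variation (error term), i.e. variation within each cell

(% cover algae)$_{ijk}$ = $\mu$ + (sea urchin density)$_i$ + (patch within sea urchin density)$_{j(i)}$ + $\varepsilon_{ijk}$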
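One way to get a feel for the linear model is to generate data from it. The sketch below assumes normally distributed patch effects and errors; the function name `simulate_nested` and all numeric values (mean, treatment effects, standard deviations) are hypothetical and chosen only to mimic the 4 densities x 4 patches x 5 quadrats layout.

```python
import random

def simulate_nested(alpha, sigma_b, sigma_e, q, n, mu=0.0, seed=1):
    """Simulate y_ijk = mu + alpha_i + beta_j(i) + e_ijk for a balanced nested design."""
    rng = random.Random(seed)
    data = []
    for a_i in alpha:                       # fixed effect of each level of A
        level = []
        for _ in range(q):                  # q random levels of B within this level of A
            b_ji = rng.gauss(0.0, sigma_b)  # random effect of this level of B
            cell = [mu + a_i + b_ji + rng.gauss(0.0, sigma_e) for _ in range(n)]
            level.append(cell)
        data.append(level)
    return data

# Hypothetical parameter values, not taken from the study.
data = simulate_nested(alpha=[-10.0, 0.0, 5.0, 5.0],
                       sigma_b=17.0, sigma_e=17.0, q=4, n=5, mu=40.0)
print(len(data), len(data[0]), len(data[0][0]))  # 4 4 5
```

Each patch gets a single random draw that is shared by all of its quadrats, which is what makes quadrats within a patch non-independent.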

Effects
• Main effect:
  - effect of factor A
  - i.e., variation among factor A group means
• Nested (random) effect:
  - effect of factor B within each level of factor A
  - variation among means of factor B within each level of A

Null hypotheses
H0: Factor A: No difference in mean amount of filamentous algae between the four sea urchin density treatments
H0: Factor B: No difference in the mean amount of filamentous algae between all possible patches in any of the treatments

Factor A:
• H0: no difference among means of factor A (= no difference among means of urchin density treatments)
  - $\mu_1 = \mu_2 = \dots = \mu_i = \mu$
is equivalent to
• H0: no main effect of factor A (no effect of urchin density):
  - $\alpha_1 = \alpha_2 = \dots = \alpha_i = 0$
  - $\alpha_i = (\mu_i - \mu) = 0$

Null hypotheses
Factor B(A)
• H0: no difference among means of factor B within any level of factor A (no difference among patches in mean filamentous algae cover within any urchin density treatment)
  - $\mu_{11} = \mu_{12} = \dots = \mu_{1j}$
  - $\mu_{21} = \mu_{22} = \dots = \mu_{2j}$
  - etc.
• H0: no variance among levels of nested random factor B within any level of factor A (no variance among patches within each density treatment):
  - $\sigma_\beta^2 = 0$

Partitioning total variation

SSTotal = SSA + SSB(A) + SSResidual

• SSA: variation among A means
• SSB(A): variation among B means within each level of A
• SSResidual: variation among replicates within each cell (each B(A))
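For a balanced design, the three components are usually written as follows. The notation is an assumption that follows the linear model: $\bar{y}$ is the grand mean, $\bar{y}_i$ the mean of level i of A, and $\bar{y}_{j(i)}$ the mean of the cell for level j of B within level i of A.

```latex
SS_{A} = nq \sum_{i=1}^{p} (\bar{y}_{i} - \bar{y})^2

SS_{B(A)} = n \sum_{i=1}^{p} \sum_{j=1}^{q} (\bar{y}_{j(i)} - \bar{y}_{i})^2

SS_{Residual} = \sum_{i=1}^{p} \sum_{j=1}^{q} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{j(i)})^2
```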

Nested ANOVA table

Source        SS          df        MS
Factor A      SSA         p-1       SSA/(p-1)
Factor B(A)   SSB(A)      p(q-1)    SSB(A)/(p(q-1))
Residual      SSResidual  pq(n-1)   SSResidual/(pq(n-1))
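The table can be computed directly from a balanced data set. This is a minimal sketch in plain Python, not the output of any particular statistics package; the function name `nested_anova` and the toy data (p = 2, q = 2, n = 2) are made up for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def nested_anova(data):
    """data[i][j] is the list of n replicates for level j of B within level i of A."""
    p, q, n = len(data), len(data[0]), len(data[0][0])
    cell_means = [[mean(cell) for cell in level] for level in data]
    a_means = [mean(row) for row in cell_means]  # valid because the design is balanced
    grand = mean(a_means)

    ss_a = n * q * sum((m - grand) ** 2 for m in a_means)
    ss_b = n * sum((cell_means[i][j] - a_means[i]) ** 2
                   for i in range(p) for j in range(q))
    ss_res = sum((y - cell_means[i][j]) ** 2
                 for i in range(p) for j in range(q) for y in data[i][j])

    table = {}
    for name, ss, df in [("A", ss_a, p - 1),
                         ("B(A)", ss_b, p * (q - 1)),
                         ("Residual", ss_res, p * q * (n - 1))]:
        table[name] = {"SS": ss, "df": df, "MS": ss / df}
    return table

# Hypothetical toy data: 2 levels of A, 2 levels of B within each, 2 replicates per cell.
toy = [[[10, 12], [14, 16]],
       [[20, 22], [28, 30]]]
table = nested_anova(toy)
for source, row in table.items():
    print(source, row)
```

For the toy data the three sums of squares are 288, 80 and 8, which add up to the total sum of squares of 376.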

Expected Mean Squares

A fixed, B random:
• MSA
• MSB(A)
• MSResidual
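For reference, the standard textbook expectations for a balanced design with A fixed and B random are sketched below. The symbols are assumptions that follow the linear model: $\sigma_\varepsilon^2$ is the residual variance and $\sigma_\beta^2$ the variance among levels of B within A.

```latex
E(MS_{A}) = \sigma_\varepsilon^2 + n\sigma_\beta^2 + \frac{nq \sum_{i=1}^{p} \alpha_i^2}{p-1}

E(MS_{B(A)}) = \sigma_\varepsilon^2 + n\sigma_\beta^2

E(MS_{Residual}) = \sigma_\varepsilon^2
```

When all $\alpha_i = 0$, MSA and MSB(A) have the same expectation, which is why MSB(A) is the denominator for the test of factor A.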

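Testing null hypotheses
• if no main effect of factor A:
  - H0: $\mu_1 = \mu_2 = \dots = \mu_i = \mu$ ($\alpha_i = 0$) is true
  - MSA and MSB(A) estimate the same parameters
  - F-ratio: MSA / MSB(A) ≈ 1
• if no effect of nested random factor B(A):
  - H0: $\sigma_\beta^2 = 0$ is true
  - MSB(A) and MSResidual estimate the same parameters
  - F-ratio: MSB(A) / MSResidual ≈ 1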

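4 possible outcomes
• H0 true (no variation among A; Ai's = 0) and patches don't differ (Bj's = 0)
• H0 true (no variation among A; Ai's = 0) and patches do differ (Bj's ≠ 0)
• H0 false (variation among A; Ai's ≠ 0) and patches don't differ (Bj's = 0)
• H0 false (variation among A; Ai's ≠ 0) and patches do differ (Bj's ≠ 0)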

Treatment effects in nested designs
• it doesn't matter whether the nested factor varies or not; you can still look for differences among treatments
  - e.g., compared to the 1-way design, the nested design un-confounds subsamples from true replicates
• nested designs separate confounded additive factors

Results: Andrew & Underwood 1993

Source             df    MS     F      P        var. comp.
Density             3    4810   2.72   0.09
Patches(Density)   12    1770   5.93   <0.001   294 (49.6%)
Residual           64     299                   299 (50.4%)
Total              79

no effect of urchin density on percentage cover of filamentous algae
filamentous algal cover varies significantly from patch to patch
about 50% of the variance in percentage cover of algae is explained by differences between patches
remaining 50% is explained by differences at the scale of quadrats within patches
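The variance components can be recovered from the mean squares. The sketch below assumes the usual balanced-design estimator, in which the patch component is (MS for patches - MS residual) / n; the variable names are made up for this example.

```python
ms_density = 4810.0   # MS for Density, from the results table
ms_patch = 1770.0     # MS for Patches(Density)
ms_residual = 299.0   # MS for Residual
n = 5                 # quadrats per patch

# Variance components for the two random sources of variation.
var_patch = (ms_patch - ms_residual) / n
var_quadrat = ms_residual
total = var_patch + var_quadrat

pct_patch = 100 * var_patch / total
pct_quadrat = 100 * var_quadrat / total

# F-ratios: Density is tested over Patches(Density), patches over the residual.
f_density = ms_density / ms_patch
f_patch = ms_patch / ms_residual

print(round(var_patch, 1), round(pct_patch, 1), round(pct_quadrat, 1))  # 294.2 49.6 50.4
```

The F-ratios computed this way agree with the reported 2.72 and 5.93 up to rounding of the published mean squares.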

Additional Tests
• Main effect:
  - planned contrasts & trend analyses as part of design
  - unplanned multiple comparisons if main F-ratio test significant
• Nested effect:
  - usually random factor
  - usually of little interest in further tests
  - often can provide information on the characteristic spatial signal of a population

Another worked example
• what is the effect of schools on standardized tests? (i.e., do scores differ among schools?)
• is the effect of school driven in part by differences in teachers?

Data: three schools, two teachers at each school, two scores per teacher
True data matrix: accounts for teachers not being the same at each school
Data format for statistics

ANOVA output
Analysis of Variance
Source              Sum-of-Squares   df   Mean-Square   F-ratio    P
SCHOOL$                  156.50000    2      78.25000   11.17857   0.00947
TEACHER(SCHOOL$)         567.50000    3     189.16667   27.02381   0.00070
Error                     42.00000    6       7.00000

What does this mean???

Big effect of teacher!

What about effect of school?
• SYSTAT and other stats software generally will not automatically construct the F-ratio correctly
• F-ratio is: MSschool / MSteacher(school)

accounting for teacher effect


Before:
Analysis of Variance
Source              Sum-of-Squares   df   Mean-Square   F-ratio    P
SCHOOL$                  156.50000    2      78.25000   11.17857   0.00947
TEACHER(SCHOOL$)         567.50000    3     189.16667   27.02381   0.00070
Error                     42.00000    6       7.00000

After:
Test of Hypothesis
Source        SS          df   MS          F         P
Hypothesis    156.50000    2    78.25000   0.41366   0.69397
Error         567.50000    3   189.16667

No effect of school!
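The two versions of the school test can be reproduced by hand. In this sketch the degrees of freedom (2, 3 and 6) are assumptions derived from SS / MS in the output, and the P-value uses a closed form that holds only when the numerator df is 2, which happens to be the case for SCHOOL$. A general F distribution routine would be needed for other tests.

```python
ms_school, df_school = 78.25, 2
ms_teacher, df_teacher = 189.16667, 3
ms_error, df_error = 7.0, 6

def p_value_df1_is_2(f, df2):
    """Upper-tail P for an F-ratio with numerator df = 2 (closed form valid only for that case)."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

# Default software test: school over the residual error (wrong denominator here).
f_wrong = ms_school / ms_error
p_wrong = p_value_df1_is_2(f_wrong, df_error)

# Nested test: school over teacher(school).
f_right = ms_school / ms_teacher
p_right = p_value_df1_is_2(f_right, df_teacher)

print(round(f_wrong, 5), round(p_wrong, 5))
print(round(f_right, 5), round(p_right, 5))
```

The only change between the two tests is the denominator mean square and its df, yet the conclusion about schools reverses.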

Power
• more replication always gives you more power
• but in nested ANOVA, there is replication at various levels
• where does your power come from?
• if you have nested factors within your treatments, you need to replicate the nested factor, not the subsamples
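A quick way to see this is through the variance of a treatment mean, which for a balanced design is the B(A) variance component divided by q plus the residual variance divided by qn. The sketch below plugs in the rounded variance components from the urchin example (294 and 299); the function name and the alternative designs are hypothetical.

```python
def var_of_treatment_mean(var_b, var_e, q, n):
    """Variance of one treatment mean with q nested units and n subsamples per unit."""
    return var_b / q + var_e / (q * n)

var_patch, var_quadrat = 294.0, 299.0  # variance components from the urchin example

baseline = var_of_treatment_mean(var_patch, var_quadrat, q=4, n=5)
more_quadrats = var_of_treatment_mean(var_patch, var_quadrat, q=4, n=10)
more_patches = var_of_treatment_mean(var_patch, var_quadrat, q=8, n=5)

print(baseline, more_quadrats, more_patches)
```

Doubling the quadrats per patch lowers the variance only from about 88 to about 81, while doubling the patches halves it to about 44, because the patch component is divided by q alone.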

Spatially nested designs
• used to provide information on the characteristic spatial signal of populations
• other techniques (geostatistical models) also can do this, but nested models are very efficient
• variance component models (part of nested) can provide the percent of variation that is associated with particular spatial scales

What spatial scale is most of the variance associated with?

[Figure: transects nested within sites, sites within locations, locations within Regions 1-3]

Sources of variation:
• Regions
• Locations(Regions)
• Sites(Locations(Regions))
• Transects(Sites(Locations(Regions)))

At what scale is most of the variance?

Source                       df   MS     F      Var. comp (%)
Region                            6658   10.4   247 (42)
Location(Region)                   638   2.43    71 (12)
Site(Location(Region))       18    263   1.40    88 (14)
Transect = Residual, Error   54    187          187 (32)

Optimal Allocation of Replication at different levels
• can calculate at what level it is best to spend your time or money on replication
• must know
  - variance at each level (var. components)
  - cost / effort to obtain replication at each hierarchical level
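For a two-level design, a commonly used textbook result gives the optimal number of subsamples per nested unit as the square root of (cost per nested unit x residual variance) / (cost per subsample x variance among nested units). The slides do not state this formula, so treat the sketch below as an assumption; the costs are hypothetical and the variance components are the rounded values from the urchin example.

```python
import math

def optimal_subsamples(var_b, var_e, cost_b, cost_e):
    """Optimal number of subsamples per nested unit for a two-level nested design."""
    return math.sqrt((cost_b * var_e) / (cost_e * var_b))

# Hypothetical costs: setting up a patch costs 20 units, each quadrat costs 5 units.
n_opt = optimal_subsamples(var_b=294.0, var_e=299.0, cost_b=20.0, cost_e=5.0)
print(round(n_opt, 2))
```

With these made-up costs the result is about 2 quadrats per patch, so the remaining budget would be better spent on more patches than on the 5 quadrats per patch used in the example.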
