ANOVA

An Introduction to Experimental Design and Analysis of Variance
Analysis of Variance and the Completely Randomized Design
Multiple Comparison Procedures

An Introduction to Experimental Design and Analysis of Variance

Statistical studies can be classified as being either experimental or observational.
In an experimental study, one or more factors are controlled so that data can be obtained about how the factors influence the variables of interest.
In an observational study, no attempt is made to control the factors.
Cause-and-effect relationships are easier to establish in experimental studies than in observational studies.
Analysis of variance (ANOVA) can be used to analyze the data obtained from experimental or observational studies.

A factor is a variable that the experimenter has selected for investigation.
A treatment is a level of a factor.
Experimental units are the objects of interest in the experiment.
A completely randomized design is an experimental design in which the treatments are randomly assigned to the experimental units.
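
As an illustration of random assignment, here is a minimal sketch in Python. The unit names, the three treatment labels, and the choice to balance group sizes are assumptions made for the example; the definition of a completely randomized design only requires that the assignment be random.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Randomly assign treatments to experimental units (hypothetical helper).

    Units are shuffled, then dealt out to the treatments in round-robin
    order, so group sizes differ by at most one.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)]
            for i, unit in enumerate(shuffled)}

# Illustrative setup: 9 experimental units, 3 treatments (levels of one factor).
units = [f"unit_{i}" for i in range(1, 10)]
treatments = ["A", "B", "C"]
assignment = assign_treatments(units, treatments, seed=0)

for unit in units:
    print(unit, "->", assignment[unit])
```

Fixing the seed only makes the example repeatable; in a real experiment the assignment would come from a fresh random draw.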

Analysis of Variance: A Conceptual Overview

Analysis of variance (ANOVA) can be used to test for the equality of three or more population means.
Data obtained from observational or experimental studies can be used for the analysis.
We want to use the sample results to test the following hypotheses:

H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Ha: Not all population means are equal

If H0 is rejected, we cannot conclude that all population means are different.
Rejecting H0 means that at least two population means have different values.

Assumptions for Analysis of Variance
For each population, the response (dependent) variable is normally distributed.
The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
The observations must be independent.

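Sampling Distribution of $\bar{x}$ Given H0 Is True

Sample means are close together because there is only one sampling distribution when H0 is true.

[Figure: a single sampling distribution of $\bar{x}$ with variance $\sigma_{\bar{x}}^2 = \sigma^2 / n$; the sample means $\bar{x}_1$, $\bar{x}_2$, and $\bar{x}_3$ all fall close together under it.]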

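Sampling Distribution of $\bar{x}$ Given H0 Is False

Sample means come from different sampling distributions and are not as close together when H0 is false.

[Figure: three separate sampling distributions of $\bar{x}$, one per population mean; the sample means $\bar{x}_3$, $\bar{x}_1$, and $\bar{x}_2$ each come from a different distribution.]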

Analysis of Variance and the Completely Randomized Design

Between-Treatments Estimate of Population Variance
Within-Treatments Estimate of Population Variance
Comparing the Variance Estimates: The F Test
ANOVA Table

Between-Treatments Estimate of Population Variance $\sigma^2$

The estimate of $\sigma^2$ based on the variation of the sample means is called the mean square due to treatments and is denoted by MSTR.

$$\text{MSTR} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$

The numerator is called the sum of squares due to treatments (SSTR).
The denominator is the degrees of freedom associated with SSTR.
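
The MSTR calculation can be sketched in a few lines of Python. The three samples below are made-up illustrative data, not taken from the slides, and the function name `mstr` is hypothetical. The overall sample mean is computed as the mean of all observations pooled together.

```python
# Made-up illustrative data: k = 3 treatments, 3 observations each.
samples = {"A": [9, 11, 13], "B": [16, 18, 20], "C": [11, 13, 15]}

def mean(values):
    return sum(values) / len(values)

def mstr(samples):
    """Return (SSTR, MSTR) for a dict mapping treatment -> observations."""
    k = len(samples)
    all_obs = [x for obs in samples.values() for x in obs]
    grand_mean = mean(all_obs)  # overall sample mean
    # SSTR: squared distance of each sample mean from the overall mean,
    # weighted by that sample's size n_j.
    sstr = sum(len(obs) * (mean(obs) - grand_mean) ** 2
               for obs in samples.values())
    return sstr, sstr / (k - 1)

sstr, mstr_value = mstr(samples)
print(sstr, mstr_value)  # 78.0 39.0
```

Here the sample means are 11, 18, and 13, the overall mean is 14, so SSTR = 3(9) + 3(16) + 3(1) = 78 and MSTR = 78 / 2 = 39.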

Within-Treatments Estimate of Population Variance $\sigma^2$

The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE.

$$\text{MSE} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$

The numerator is called the sum of squares due to error (SSE).
The denominator is the degrees of freedom associated with SSE.
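
A matching sketch for MSE, again on made-up illustrative data and with a hypothetical function name. It uses `statistics.variance` from the Python standard library, which returns the sample variance (divisor n - 1).

```python
from statistics import variance

# Made-up illustrative data: k = 3 treatments, 3 observations each.
samples = {"A": [9, 11, 13], "B": [16, 18, 20], "C": [11, 13, 15]}

def mse(samples):
    """Return (SSE, MSE) for a dict mapping treatment -> observations."""
    k = len(samples)
    n_total = sum(len(obs) for obs in samples.values())  # n_T
    # SSE: each sample variance s_j^2 weighted by its degrees of freedom n_j - 1.
    sse = sum((len(obs) - 1) * variance(obs) for obs in samples.values())
    return sse, sse / (n_total - k)

sse, mse_value = mse(samples)
print(sse, mse_value)  # SSE = 24, MSE = 4.0
```

Each sample variance in this data set is 4, so SSE = 2(4) + 2(4) + 2(4) = 24 and MSE = 24 / (9 - 3) = 4.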

Comparing the Variance Estimates: The F Test

If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with MSTR d.f. equal to $k - 1$ and MSE d.f. equal to $n_T - k$.
If the means of the k populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates $\sigma^2$.
Hence, we will reject H0 if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate F distribution.
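
To make the ratio concrete, the sketch below computes MSTR/MSE and its two degrees of freedom for a small made-up data set. The function name `f_statistic` is hypothetical. SSE is computed here as the sum of squared deviations of each observation from its own sample mean, which equals the sum of $(n_j - 1) s_j^2$ over the samples.

```python
# Made-up illustrative data: k = 3 treatments, 3 observations each.
samples = {"A": [9, 11, 13], "B": [16, 18, 20], "C": [11, 13, 15]}

def f_statistic(samples):
    """Return (F, numerator d.f., denominator d.f.) for one-way ANOVA."""
    groups = list(samples.values())
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-treatments and within-treatments sums of squares.
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return mstr / mse, k - 1, n_total - k

f_value, num_df, denom_df = f_statistic(samples)
print(f_value, num_df, denom_df)  # 9.75 2 6
```

With MSTR = 39 and MSE = 4, the ratio is 9.75, to be compared against an F distribution with 2 and 6 degrees of freedom.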

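Sampling Distribution of MSTR/MSE

[Figure: F distribution of MSTR/MSE. Values of MSTR/MSE below the critical value fall in the "Do Not Reject H0" region; values above the critical value fall in the upper-tail "Reject H0" region.]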

ANOVA Table for a Completely Randomized Design

Source of     Sum of     Degrees of    Mean
Variation     Squares    Freedom       Square                  F           p-Value
Treatments    SSTR       k - 1         MSTR = SSTR/(k - 1)     MSTR/MSE
Error         SSE        nT - k        MSE = SSE/(nT - k)
Total         SST        nT - 1

SST is partitioned into SSTR and SSE.
SST's degrees of freedom (d.f.) are partitioned into SSTR's d.f. and SSE's d.f.

SST divided by its degrees of freedom $n_T - 1$ is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.
With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:

$$\text{SST} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = \text{SSTR} + \text{SSE}$$
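
The partition SST = SSTR + SSE can be checked numerically. The sketch below uses made-up illustrative data and computes each sum of squares directly from its definition, then compares SST divided by its degrees of freedom with the sample variance of the pooled data.

```python
from statistics import variance

# Made-up illustrative data: k = 3 treatments, 3 observations each.
samples = {"A": [9, 11, 13], "B": [16, 18, 20], "C": [11, 13, 15]}

groups = list(samples.values())
all_obs = [x for g in groups for x in g]
n_total = len(all_obs)
grand_mean = sum(all_obs) / n_total

# Total, between-treatments, and within-treatments sums of squares.
sst = sum((x - grand_mean) ** 2 for x in all_obs)
sstr = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

# SST / (n_T - 1) is the sample variance of the pooled observations.
overall_variance = sst / (n_total - 1)

print(sst, sstr, sse)                        # 102.0 78.0 24.0
print(overall_variance, variance(all_obs))   # both 12.75
```

The degrees of freedom partition the same way: 8 = 2 + 6.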

ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.
Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the F value used to test the hypothesis of equal population means.

Test for the Equality of k Population Means

Hypotheses
H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Ha: Not all population means are equal

Test Statistic
F = MSTR/MSE

Rejection Rule
p-value Approach: Reject H0 if p-value < $\alpha$
Critical Value Approach: Reject H0 if F > $F_\alpha$
where the value of $F_\alpha$ is based on an F distribution with $k - 1$ numerator d.f. and $n_T - k$ denominator d.f.
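
Both rejection rules can be sketched for a hypothetical result of F = 9.75 with 2 and 6 degrees of freedom and a significance level of 0.05, all of which are assumptions for illustration. The critical value 5.14 is the usual F-table entry for those degrees of freedom. The p-value helper relies on a closed form that holds only when the numerator d.f. equals 2 (that is, k = 3); for other cases a statistics library routine such as `scipy.stats.f.sf` would normally be used instead.

```python
def f_upper_tail_two_numerator_df(f_value, denom_df):
    """P(F > f_value) for an F distribution with 2 numerator d.f.

    Closed form valid ONLY for numerator d.f. = 2:
        P(F > f) = (d2 / (d2 + 2 f)) ** (d2 / 2)
    """
    return (denom_df / (denom_df + 2.0 * f_value)) ** (denom_df / 2.0)

# Hypothetical test result and significance level.
alpha = 0.05
f_value, num_df, denom_df = 9.75, 2, 6
f_critical = 5.14  # F_0.05 with (2, 6) d.f., read from a standard F table

p_value = f_upper_tail_two_numerator_df(f_value, denom_df)

reject_by_p_value = p_value < alpha
reject_by_critical_value = f_value > f_critical

print(round(p_value, 4))                            # 0.013
print(reject_by_p_value, reject_by_critical_value)  # True True
```

The two approaches always agree, because the critical value is by definition the point whose upper-tail area equals the significance level.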

Multiple Comparison Procedures

Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means.
Fisher's least significant difference (LSD) procedure can be used to determine where the differences occur.

Fisher's LSD Procedure

Hypotheses
H0: $\mu_i = \mu_j$
Ha: $\mu_i \neq \mu_j$

Test Statistic
$$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$

Rejection Rule
p-value Approach: Reject H0 if p-value < $\alpha$
Critical Value Approach: Reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
where the value of $t_{\alpha/2}$ is based on a t distribution with $n_T - k$ degrees of freedom.
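
A minimal sketch of the test statistic and the critical value approach follows. The sample means, sample sizes, and MSE are hypothetical inputs, and the function name is an assumption. The critical value 2.447 is the usual t-table entry for an upper-tail area of 0.025 with 6 degrees of freedom.

```python
import math

def lsd_t_statistic(mean_i, mean_j, n_i, n_j, mse):
    """t statistic for comparing two treatment means in Fisher's LSD procedure."""
    return (mean_i - mean_j) / math.sqrt(mse * (1.0 / n_i + 1.0 / n_j))

# Hypothetical inputs: MSE from an ANOVA with n_T - k = 6 error d.f.
mse = 4.0
t_critical = 2.447  # t_0.025 with 6 d.f., read from a standard t table

t_ab = lsd_t_statistic(mean_i=11.0, mean_j=18.0, n_i=3, n_j=3, mse=mse)
reject_ab = t_ab < -t_critical or t_ab > t_critical

print(round(t_ab, 3), reject_ab)  # -4.287 True
```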

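Fisher's LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$

Hypotheses
H0: $\mu_i = \mu_j$
Ha: $\mu_i \neq \mu_j$

Test Statistic
$\bar{x}_i - \bar{x}_j$

Rejection Rule
Reject H0 if $|\bar{x}_i - \bar{x}_j| > \text{LSD}$
where
$$\text{LSD} = t_{\alpha/2} \sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$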
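
The LSD form of the rule is convenient when every pair of treatments has to be compared. The sketch below applies it to hypothetical sample means, sample sizes, and MSE; the critical value 2.447 is the usual t-table entry for an upper-tail area of 0.025 with 6 degrees of freedom, and the function name `lsd` is an assumption.

```python
import math
from itertools import combinations

# Hypothetical summary statistics from a one-way ANOVA with k = 3, n_T = 9.
means = {"A": 11.0, "B": 18.0, "C": 13.0}
sizes = {"A": 3, "B": 3, "C": 3}
mse = 4.0
t_critical = 2.447  # t_0.025 with n_T - k = 6 d.f., from a standard t table

def lsd(n_i, n_j, mse, t_critical):
    """Least significant difference for one pair of treatments."""
    return t_critical * math.sqrt(mse * (1.0 / n_i + 1.0 / n_j))

results = {}
for i, j in combinations(means, 2):
    threshold = lsd(sizes[i], sizes[j], mse, t_critical)
    # Reject H0: mu_i = mu_j when the absolute difference exceeds LSD.
    results[(i, j)] = abs(means[i] - means[j]) > threshold
    print(i, j, abs(means[i] - means[j]), round(threshold, 3), results[(i, j)])
```

With equal sample sizes the LSD is the same for every pair (about 3.996 here), so the differences of 7 and 5 are significant and the difference of 2 is not.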