
Slides Prepared by

JOHN S. LOUCKS
St. Edwards University

2002 South-Western/Thomson Learning

Chapter 13
Analysis of Variance and Experimental Design

An Introduction to Analysis of Variance
Analysis of Variance: Testing for the Equality of k Population Means
Multiple Comparison Procedures
An Introduction to Experimental Design
Completely Randomized Designs
Randomized Block Design

An Introduction to Analysis of Variance

Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means using data obtained from observational or experimental studies.
We want to use the sample results to test the following hypotheses.

$H_0$: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_a$: Not all population means are equal

If $H_0$ is rejected, we cannot conclude that all population means are different.
Rejecting $H_0$ means that at least two population means have different values.

Assumptions for Analysis of Variance

For each population, the response variable is normally distributed.
The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
The observations must be independent.

Analysis of Variance: Testing for the Equality of k Population Means

Between-Samples Estimate of Population Variance
Within-Samples Estimate of Population Variance
Comparing the Variance Estimates: The F Test
The ANOVA Table

Between-Samples Estimate
of Population Variance

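A between-samples estimate of $\sigma^2$ is called the mean square between (MSB).

$$\text{MSB} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$

Here $n_j$ and $\bar{x}_j$ are the size and mean of sample $j$, and $\bar{\bar{x}}$ is the overall sample mean.
The numerator of MSB is called the sum of squares between (SSB).
The denominator of MSB represents the degrees of freedom associated with SSB.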

Within-Samples Estimate
of Population Variance

The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square within (MSW).

$$\text{MSW} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$

The numerator of MSW is called the sum of squares within (SSW).
The denominator of MSW represents the degrees of freedom associated with SSW.
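The two estimates can be computed directly from raw samples. The following is a minimal Python sketch, not part of the original slides; the function name `mean_squares` is hypothetical, and the data are the hours-worked figures from the Reed Manufacturing example in this chapter.

```python
from statistics import mean, variance

def mean_squares(samples):
    """Return (MSB, MSW) for a list of samples, one list per population."""
    k = len(samples)                          # number of populations
    n_total = sum(len(s) for s in samples)    # n_T, total number of observations
    grand_mean = sum(sum(s) for s in samples) / n_total
    # SSB: squared distance of each sample mean from the overall mean, weighted by n_j
    ssb = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # SSW: pooled within-sample variation, (n_j - 1) * s_j^2 summed over samples
    ssw = sum((len(s) - 1) * variance(s) for s in samples)
    return ssb / (k - 1), ssw / (n_total - k)

# Hours worked per week, one list per plant (Reed Manufacturing example).
plants = [
    [48, 54, 57, 54, 62],
    [73, 63, 66, 64, 74],
    [51, 63, 61, 54, 56],
]
msb, msw = mean_squares(plants)
print(round(msb, 3), round(msw, 3))
```

The overall mean is computed from all observations, so the sketch also covers unequal sample sizes, where it is no longer the simple average of the sample means.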

Comparing the Variance Estimates: The F Test

If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSB/MSW is an F distribution with MSB d.f. equal to $k - 1$ and MSW d.f. equal to $n_T - k$.
If the means of the k populations are not equal, the value of MSB/MSW will be inflated because MSB overestimates $\sigma^2$.
Hence, we will reject $H_0$ if the resulting value of MSB/MSW appears to be too large to have been selected at random from the appropriate F distribution.

Test for the Equality of k Population Means

Hypotheses
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
$H_a$: Not all population means are equal

Test Statistic
$F = \text{MSB}/\text{MSW}$

Rejection Rule
Reject $H_0$ if $F > F_\alpha$,
where the value of $F_\alpha$ is based on an F distribution with $k - 1$ numerator degrees of freedom and $n_T - k$ denominator degrees of freedom.
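The test can also be carried out from per-sample summary statistics alone. The sketch below is illustrative and not from the slides: the function name `anova_f_test` is hypothetical, and the critical value is assumed to be looked up in an F table and passed in by the caller rather than computed. The numbers are the summary statistics of the Reed Manufacturing example in this chapter.

```python
def anova_f_test(sizes, means, variances, f_critical):
    """One-way ANOVA F test from per-sample sizes, means, and variances.

    f_critical is the table value F_alpha for (k - 1, n_T - k) degrees of
    freedom; it is supplied by the caller, not computed here.
    """
    k = len(sizes)
    n_total = sum(sizes)
    # Overall mean as the size-weighted average of the sample means
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    msb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means)) / (k - 1)
    msw = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    f_stat = msb / msw
    return f_stat, f_stat > f_critical

# Three samples of size 5; 3.89 is F_.05 with 2 and 12 degrees of freedom.
f_stat, reject = anova_f_test([5, 5, 5], [55, 68, 57], [26.0, 26.5, 24.5], 3.89)
print(round(f_stat, 2), reject)
```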

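Sampling Distribution of MSTR/MSE

The figure below shows the rejection region associated with a level of significance equal to $\alpha$, where $F_\alpha$ denotes the critical value. (MSTR and MSE are the experimental-design names for MSB and MSW.)

[Figure: sampling distribution of MSTR/MSE. Values below the critical value $F_\alpha$ fall in the "Do Not Reject $H_0$" region; values above it fall in the "Reject $H_0$" region.]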

The ANOVA Table


Source of      Sum of     Degrees of    Mean
Variation      Squares    Freedom       Squares    F
Treatment      SSTR       k - 1         MSTR       MSTR/MSE
Error          SSE        nT - k        MSE
Total          SST        nT - 1

SST divided by its degrees of freedom $n_T - 1$ is simply the overall sample variance that would be obtained if we treated the entire set of $n_T$ observations as one data set.

$$\text{SST} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = \text{SSTR} + \text{SSE}$$
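The partition of SST and its link to the overall sample variance can be checked numerically. This is a small Python sketch rather than anything from the slides; the variable names are illustrative, and the data are taken from the Reed Manufacturing example in this chapter.

```python
from statistics import mean, variance

samples = [
    [48, 54, 57, 54, 62],
    [73, 63, 66, 64, 74],
    [51, 63, 61, 54, 56],
]
all_obs = [x for s in samples for x in s]   # the n_T observations as one data set
grand_mean = mean(all_obs)

# Total, treatment, and error sums of squares
sst = sum((x - grand_mean) ** 2 for x in all_obs)
sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
sse = sum((x - mean(s)) ** 2 for s in samples for x in s)

print(sst, sstr + sse)                              # both sides of SST = SSTR + SSE
print(sst / (len(all_obs) - 1), variance(all_obs))  # SST/(n_T - 1) vs. sample variance
```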

Example: Reed Manufacturing


Analysis of Variance
J. R. Reed would like to know if the mean number of hours worked per week is the same for the department managers at her three manufacturing plants (Buffalo, Pittsburgh, and Detroit).
A simple random sample of 5 managers from each of the three plants was taken, and the number of hours worked by each manager for the previous week is shown on the next slide.

Example: Reed Manufacturing

Analysis of Variance

                   Plant 1    Plant 2       Plant 3
Observation        Buffalo    Pittsburgh    Detroit
1                  48         73            51
2                  54         63            63
3                  57         66            61
4                  54         64            54
5                  62         74            56

Sample Mean        55         68            57
Sample Variance    26.0       26.5          24.5

Example: Reed Manufacturing

Analysis of Variance
Hypotheses
$H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_a$: Not all the means are equal
where:
$\mu_1$ = mean number of hours worked per week by the managers at Plant 1
$\mu_2$ = mean number of hours worked per week by the managers at Plant 2
$\mu_3$ = mean number of hours worked per week by the managers at Plant 3

Example: Reed Manufacturing

Analysis of Variance
Mean Square Between
Since the sample sizes are all equal,
$\bar{\bar{x}} = (55 + 68 + 57)/3 = 60$
$\text{SSB} = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
$\text{MSB} = 490/(3 - 1) = 245$
Mean Square Within
$\text{SSW} = 4(26.0) + 4(26.5) + 4(24.5) = 308$
$\text{MSW} = 308/(15 - 3) = 25.667$

Example: Reed Manufacturing

Analysis of Variance
F Test
If $H_0$ is true, the ratio MSB/MSW should be near 1 since both MSB and MSW are estimating $\sigma^2$. If $H_a$ is true, the ratio should be significantly larger than 1 since MSB tends to overestimate $\sigma^2$.
Rejection Rule
Assuming $\alpha = .05$, $F_{.05} = 3.89$ (2 d.f. numerator, 12 d.f. denominator). Reject $H_0$ if $F > 3.89$.

Example: Reed Manufacturing

Analysis of Variance
Test Statistic
$F = \text{MSB}/\text{MSW} = 245/25.667 = 9.55$
Conclusion
$F = 9.55 > F_{.05} = 3.89$, so we reject $H_0$. The mean number of hours worked per week by department managers is not the same at each plant.

Example: Reed Manufacturing

Analysis of Variance
ANOVA Table

Source of      Sum of     Degrees of    Mean
Variation      Squares    Freedom       Square     F
Treatments     490        2             245        9.55
Error          308        12            25.667
Total          798        14

Multiple Comparison Procedures


Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means.
Fisher's least significant difference (LSD) procedure can be used to determine where the differences occur.

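Fisher's LSD Procedure

Hypotheses
$H_0$: $\mu_i = \mu_j$
$H_a$: $\mu_i \neq \mu_j$

Test Statistic
$$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{\text{MSW}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$

Rejection Rule
Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$,
where the value of $t_{\alpha/2}$ is based on a t distribution with $n_T - k$ degrees of freedom.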
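As an illustration of the t form of the test, the following Python sketch evaluates the statistic for one pair of samples. It is not part of the slides: the function name `lsd_t_statistic` is hypothetical, and the inputs (sample means 55 and 68, five observations each, MSW = 25.667, critical value 2.179 for 12 degrees of freedom) are the Reed Manufacturing figures used in this chapter.

```python
from math import sqrt

def lsd_t_statistic(mean_i, mean_j, n_i, n_j, msw):
    """t statistic for Fisher's LSD comparison of populations i and j."""
    return (mean_i - mean_j) / sqrt(msw * (1 / n_i + 1 / n_j))

t = lsd_t_statistic(55, 68, 5, 5, 25.667)
t_critical = 2.179                      # t_.025 with n_T - k = 12 d.f., from a t table
reject = t < -t_critical or t > t_critical
print(round(t, 2), reject)
```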

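Fisher's LSD Procedure
Based on the Test Statistic $\bar{x}_i - \bar{x}_j$

Hypotheses
$H_0$: $\mu_i = \mu_j$
$H_a$: $\mu_i \neq \mu_j$

Test Statistic
$\bar{x}_i - \bar{x}_j$

Rejection Rule
Reject $H_0$ if $|\bar{x}_i - \bar{x}_j| > \text{LSD}$,
where
$$\text{LSD} = t_{\alpha/2} \sqrt{\text{MSW}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$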
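The two forms of Fisher's LSD procedure give the same decision. A short derivation, added here as a sketch, multiplies both sides of the t rejection rule by the (positive) standard error:

```latex
\begin{align*}
|t| > t_{\alpha/2}
&\iff \frac{|\bar{x}_i - \bar{x}_j|}{\sqrt{\text{MSW}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}} > t_{\alpha/2} \\
&\iff |\bar{x}_i - \bar{x}_j| > t_{\alpha/2}\sqrt{\text{MSW}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = \text{LSD}
\end{align*}
```

When all sample sizes are equal, LSD is the same for every pair, so it only needs to be computed once.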

Example: Reed Manufacturing

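Fisher's LSD
Assuming $\alpha = .05$,
$$\text{LSD} = 2.179 \sqrt{25.667\left(\tfrac{1}{5} + \tfrac{1}{5}\right)} = 6.98$$

Hypotheses (A)
$H_0$: $\mu_1 = \mu_2$
$H_a$: $\mu_1 \neq \mu_2$

Test Statistic
$|\bar{x}_1 - \bar{x}_2| = |55 - 68| = 13$

Conclusion
Since 13 > 6.98, we reject $H_0$. The mean number of hours worked at Plant 1 is not equal to the mean number worked at Plant 2.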

Example: Reed Manufacturing

Fisher's LSD
Hypotheses (B)
$H_0$: $\mu_1 = \mu_3$
$H_a$: $\mu_1 \neq \mu_3$

Test Statistic
$|\bar{x}_1 - \bar{x}_3| = |55 - 57| = 2$

Conclusion
Since 2 < 6.98, we cannot reject $H_0$. There is no significant difference between the mean number of hours worked at Plant 1 and the mean number of hours worked at Plant 3.

Example: Reed Manufacturing

Fisher's LSD
Hypotheses (C)
$H_0$: $\mu_2 = \mu_3$
$H_a$: $\mu_2 \neq \mu_3$

Test Statistic
$|\bar{x}_2 - \bar{x}_3| = |68 - 57| = 11$

Conclusion
Since 11 > 6.98, we reject $H_0$. The mean number of hours worked at Plant 2 is not equal to the mean number worked at Plant 3.
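The three comparisons (A), (B), and (C) can be run in one pass. The sketch below is an illustration, not the slides' own procedure: the function name `fisher_lsd` is hypothetical, it assumes equal sample sizes, and the critical value 2.179 is taken from a t table rather than computed.

```python
from itertools import combinations
from math import sqrt

def fisher_lsd(means, n_per_sample, msw, t_critical):
    """Return LSD and, for each pair of samples, whether the means differ.

    Assumes every sample has the same size n_per_sample, so 1/n_i + 1/n_j
    reduces to 2/n and a single LSD value applies to all pairs.
    """
    lsd = t_critical * sqrt(msw * (2 / n_per_sample))
    significant = {
        (i, j): abs(mean_i - mean_j) > lsd
        for (i, mean_i), (j, mean_j) in combinations(means.items(), 2)
    }
    return lsd, significant

plant_means = {1: 55, 2: 68, 3: 57}     # sample means by plant number
lsd, significant = fisher_lsd(plant_means, 5, 25.667, 2.179)
print(round(lsd, 2))
for pair, differs in significant.items():
    print(pair, "differ" if differs else "no significant difference")
```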

An Introduction to Experimental Design

Statistical studies can be classified as being either experimental or observational.
In an experimental study, one or more factors are controlled so that data can be obtained about how the factors influence the variables of interest.
In an observational study, no attempt is made to control the factors.
Cause-and-effect relationships are easier to establish in experimental studies than in observational studies.

An Introduction to Experimental Design

A factor is a variable that the experimenter has selected for investigation.
A treatment is a level of a factor.
Experimental units are the objects of interest in the experiment.
A completely randomized design is an experimental design in which the treatments are randomly assigned to the experimental units.
If the experimental units are heterogeneous, blocking can be used to form homogeneous groups, resulting in a randomized block design.
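Random assignment in a completely randomized design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, the units and treatment labels are made up, and treatments are assumed to receive equal numbers of units, which the definition above does not require.

```python
import random

def completely_randomized_assignment(units, treatments, seed=None):
    """Randomly assign treatments to experimental units in equal-sized groups."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)       # a seed makes the assignment reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)           # random order of the experimental units
    per_treatment = len(shuffled) // len(treatments)
    # Consecutive slices of the shuffled units go to successive treatments
    return {
        treatment: shuffled[i * per_treatment:(i + 1) * per_treatment]
        for i, treatment in enumerate(treatments)
    }

units = list(range(1, 16))          # 15 hypothetical experimental units
assignment = completely_randomized_assignment(units, ["A", "B", "C"], seed=0)
for treatment, group in assignment.items():
    print(treatment, sorted(group))
```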

Completely Randomized Designs

Between-Treatments Estimate of Population Variance
Within-Treatments Estimate of Population Variance
Comparing the Variance Estimates: The F Test
The ANOVA Table
Pairwise Comparisons

Between-Treatments Estimate
of Population Variance

In the context of experimental design, the between-samples estimate of $\sigma^2$ is referred to as the mean square due to treatments (MSTR).
It is the same as what we previously called mean square between (MSB).
The formula for MSTR is

$$\text{MSTR} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$

The numerator is called the sum of squares due to treatments (SSTR).
The denominator $k - 1$ represents the degrees of freedom associated with SSTR.

Within-Treatments Estimate
of Population Variance

The second estimate of $\sigma^2$, the within-samples estimate, is referred to as the mean square due to error (MSE).
It is the same as what we previously called mean square within (MSW).
The formula for MSE is

$$\text{MSE} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$

The numerator is called the sum of squares due to error (SSE).
The denominator $n_T - k$ represents the degrees of freedom associated with SSE.

ANOVA Table for a Completely Randomized Design

Source of      Sum of     Degrees of    Mean
Variation      Squares    Freedom       Squares                 F
Treatments     SSTR       k - 1         MSTR = SSTR/(k - 1)     MSTR/MSE
Error          SSE        nT - k        MSE = SSE/(nT - k)
Total          SST        nT - 1

Example: Home Products, Inc.


Home Products, Inc. is considering marketing a long-lasting car wax. Three different waxes (Type 1, Type 2, and Type 3) have been developed.
In order to test the durability of these waxes, 5 new cars were waxed with Type 1, 5 with Type 2, and 5 with Type 3. Each car was then repeatedly run through an automatic carwash until the wax coating showed signs of deterioration. The number of times each car went through the carwash is shown on the next slide.

Example: Home Products, Inc.


                   Wax        Wax        Wax
Observation        Type 1     Type 2     Type 3
1                  48         73         51
2                  54         63         63
3                  57         66         61
4                  54         64         54
5                  62         74         56

Sample Mean        55         68         57
Sample Variance    26.0       26.5       24.5

Example: Home Products, Inc.

Completely Randomized Design

Hypotheses
$H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_a$: Not all the means are equal
where:
$\mu_1$ = mean number of washes for Type 1 wax
$\mu_2$ = mean number of washes for Type 2 wax
$\mu_3$ = mean number of washes for Type 3 wax

Example: Home Products, Inc.

Completely Randomized Design

Mean Square Between Treatments
Since the sample sizes are all equal,
$\bar{\bar{x}} = (\bar{x}_1 + \bar{x}_2 + \bar{x}_3)/3 = (55 + 68 + 57)/3 = 60$
$\text{SSTR} = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
$\text{MSTR} = 490/(3 - 1) = 245$
Mean Square Error
$\text{SSE} = 4(26.0) + 4(26.5) + 4(24.5) = 308$
$\text{MSE} = 308/(15 - 3) = 25.667$

Example: Home Products, Inc.

Completely Randomized Design

Rejection Rule
Assuming $\alpha = .05$, $F_{.05} = 3.89$ (2 d.f. numerator and 12 d.f. denominator). Reject $H_0$ if $F > 3.89$.
Test Statistic
$F = \text{MSTR}/\text{MSE} = 245/25.667 = 9.55$
Conclusion
Since $F = 9.55 > F_{.05} = 3.89$, we reject $H_0$. The mean number of carwashes is not the same for all three waxes.

Example: Home Products, Inc.

Completely Randomized Design

ANOVA Table

Source of      Sum of     Degrees of    Mean
Variation      Squares    Freedom       Squares    F
Treatments     490        2             245        9.55
Error          308        12            25.667
Total          798        14

Randomized Block Design

The ANOVA Procedure
Computations and Conclusions

The ANOVA Procedure

The ANOVA procedure for the randomized block design requires us to partition the sum of squares total (SST) into three groups: sum of squares due to treatments, sum of squares due to blocks, and sum of squares due to error.
The formula for this partitioning is
$\text{SST} = \text{SSTR} + \text{SSBL} + \text{SSE}$
The total degrees of freedom, $n_T - 1$, are partitioned such that $k - 1$ degrees of freedom go to treatments, $b - 1$ go to blocks, and $(k - 1)(b - 1)$ go to the error term.
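As a quick check that the three pieces account for all of the degrees of freedom, assume each of the $k$ treatments is observed once in each of the $b$ blocks, so that $n_T = kb$ (an assumption made here for the sketch):

```latex
\begin{align*}
(k - 1) + (b - 1) + (k - 1)(b - 1)
&= (k - 1) + (b - 1) + (kb - k - b + 1) \\
&= kb - 1 \\
&= n_T - 1
\end{align*}
```

For instance, with $k = 3$ treatments and $b = 5$ blocks, the partition is $2 + 4 + 8 = 14 = 15 - 1$.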

ANOVA Table for a Randomized Block Design

Source of      Sum of     Degrees of        Mean
Variation      Squares    Freedom           Squares                          F
Treatments     SSTR       k - 1             MSTR = SSTR/(k - 1)              MSTR/MSE
Blocks         SSBL       b - 1             MSBL = SSBL/(b - 1)
Error          SSE        (k - 1)(b - 1)    MSE = SSE/[(k - 1)(b - 1)]
Total          SST        nT - 1

Example: Eastern Oil Co.


Eastern Oil has developed three new blends of gasoline and must decide which blend or blends to produce and distribute. A study of the miles per gallon ratings of the three blends is being conducted to determine if the mean ratings are the same for the three blends.
Five automobiles have been tested using each of the three gasoline blends, and the miles per gallon ratings are shown on the next slide.

Example: Eastern Oil Co.


                 Type of Gasoline (Treatment)
Automobile                                        Block
(Block)          Blend X    Blend Y    Blend Z    Means
1                31         30         30         30.333
2                30         29         29         29.333
3                29         29         28         28.667
4                33         31         29         31.000
5                26         25         26         25.667

Treatment
Means            29.8       28.8       28.4

Example: Eastern Oil Co.

Randomized Block Design

Mean Square Due to Treatments
The overall sample mean is 29. Thus,
$\text{SSTR} = 5[(29.8 - 29)^2 + (28.8 - 29)^2 + (28.4 - 29)^2] = 5.2$
$\text{MSTR} = 5.2/(3 - 1) = 2.6$
Mean Square Due to Blocks
$\text{SSBL} = 3[(30.333 - 29)^2 + \cdots + (25.667 - 29)^2] = 51.33$
$\text{MSBL} = 51.33/(5 - 1) = 12.8$
Mean Square Due to Error
$\text{SSE} = 62 - 5.2 - 51.33 = 5.47$
$\text{MSE} = 5.47/[(3 - 1)(5 - 1)] = .68$
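The computations above can be reproduced from the raw miles-per-gallon table. The following Python sketch is illustrative only; the function name `randomized_block_anova` and the dictionary keys are hypothetical, and one observation per block-treatment combination is assumed.

```python
def randomized_block_anova(table):
    """Sums of squares and mean squares for a randomized block design.

    table[i][j] is the single observation for block i under treatment j.
    """
    b = len(table)          # number of blocks
    k = len(table[0])       # number of treatments
    grand_mean = sum(sum(row) for row in table) / (b * k)
    treatment_means = [sum(row[j] for row in table) / b for j in range(k)]
    block_means = [sum(row) / k for row in table]

    sst = sum((x - grand_mean) ** 2 for row in table for x in row)
    sstr = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssbl = k * sum((m - grand_mean) ** 2 for m in block_means)
    sse = sst - sstr - ssbl             # from SST = SSTR + SSBL + SSE

    mstr = sstr / (k - 1)
    msbl = ssbl / (b - 1)
    mse = sse / ((k - 1) * (b - 1))
    return {"SST": sst, "SSTR": sstr, "SSBL": ssbl, "SSE": sse,
            "MSTR": mstr, "MSBL": msbl, "MSE": mse, "F": mstr / mse}

# Rows are automobiles (blocks); columns are Blend X, Blend Y, Blend Z.
mpg = [
    [31, 30, 30],
    [30, 29, 29],
    [29, 29, 28],
    [33, 31, 29],
    [26, 25, 26],
]
result = randomized_block_anova(mpg)
for name, value in result.items():
    print(name, round(value, 3))
```

Because the sketch does not round MSE before dividing, its F value is about 3.80 rather than the 3.82 obtained from the rounded figure 2.6/.68; the comparison with the critical value 4.46 is unaffected.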

Example: Eastern Oil Co.

Randomized Block Design

Rejection Rule
Assuming $\alpha = .05$, $F_{.05} = 4.46$ (2 d.f. numerator and 8 d.f. denominator). Reject $H_0$ if $F > 4.46$.

Test Statistic
$F = \text{MSTR}/\text{MSE} = 2.6/.68 = 3.82$
Conclusion
Since 3.82 < 4.46, we cannot reject $H_0$. There is not sufficient evidence to conclude that the miles per gallon ratings differ for the three gasoline blends.

End of Chapter 13
