One-way ANOVA: only one classification factor is considered.

Factor: the classification variable whose effect is studied.
Treatment (samples): a level of the factor (1, 2, ..., k).
Response (outcome, dependent variable): the values measured in each sample.
Replicates (1, ..., j): the objects subjected to a given treatment.

To use the one-way ANOVA test, the following assumptions must be true:
The populations under study are normally distributed.
The samples are drawn randomly, and each sample is independent of the other samples.
All the populations from which the sample values are obtained have the same unknown population variance; that is, for k populations,
$\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2 = \sigma^2$
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
All population means are equal.
No treatment effect.

$H_a$: not all $\mu_i$ are equal
At least 2 population means are different.
Treatment effect.
Writing $H_a$ as $\mu_1 \neq \mu_2 \neq \cdots \neq \mu_k$ is wrong.

[Figure: distributions f(X) under $H_0$, where $\mu_1 = \mu_2 = \mu_3$, and under $H_a$, where $\mu_1 = \mu_2 \neq \mu_3$. Labels: mean square (variance) within, mean square among.]

As production manager, you want to see if three filling machines have different mean filling times. You assign 15 similarly trained and experienced workers, 5 per machine, to the machines. At the .05 level of significance, is there a difference in mean filling times?

Mach1   Mach2   Mach3
25.40   23.40   20.00
26.31   21.80   22.20
24.10   23.50   19.75
23.74   22.75   20.60
25.10   21.60   20.40
The summary statistics for the three filling machines are shown in the table below.

Machine   n   Total    Mean
Mach1     5   124.65   24.93
Mach2     5   113.05   22.61
Mach3     5   102.95   20.59

The null hypothesis is that the means are all equal:
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
If the null hypothesis is true, we would expect all the sample means to be close to one another (and, as a result, close to the grand mean).
If the alternative hypothesis is true, at least some of the sample means would differ.
Thus, we measure the variability between the sample means (and hence MSTr, the mean square for treatments).

Variation
Variation is the sum of the squares of the deviations between each value and the mean of the values.
As long as the values are not identical, there will be variation.
Abbreviated as SS, for Sum of Squares.
Are all of the values identical?
No, so there is some variation in the data.
This is called the total variation.
Denoted SS(Total), for the total Sum of Squares (variation).
Sum of Squares is another name for variation.

Are all of the sample means identical?
No, so there is some variation between the groups.
This is called the between group variation.
Sometimes called the variation due to the factor.
Denoted SS(A), for Sum of Squares (variation) between the groups.
Are the values within each group identical?
No, there is some variation within the groups.
This is called the within group variation.
Sometimes called the error variation.
Denoted SS(E), for the error Sum of Squares.

Variation is described as a Sum of Squares.
The total variation is partitioned as follows:

SS(Total) = SS(Between) + SS(Within)

\[ \sum_{ij} (x_{ij} - \bar{x})^2 = \sum_{ij} (\bar{x}_i - \bar{x})^2 + \sum_{ij} (x_{ij} - \bar{x}_i)^2 \]
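The partition can be checked numerically. The following is a minimal sketch in plain Python, using the filling-time data from the three-machine example; the variable names (`ss_total`, `ss_between`, `ss_within`) are illustrative choices, not notation from the slides.

```python
# Filling times from the three-machine example.
groups = {
    "Mach1": [25.40, 26.31, 24.10, 23.74, 25.10],
    "Mach2": [23.40, 21.80, 23.50, 22.75, 21.60],
    "Mach3": [20.00, 22.20, 19.75, 20.60, 20.40],
}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


all_values = [x for xs in groups.values() for x in xs]
grand_mean = mean(all_values)

# Total variation: every value against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Between group variation: each group mean against the grand mean,
# counted once per observation in that group.
ss_between = sum(len(xs) * (mean(xs) - grand_mean) ** 2 for xs in groups.values())

# Within group variation: every value against its own group mean.
ss_within = sum((x - mean(xs)) ** 2 for xs in groups.values() for x in xs)

print(f"SS(Total)   = {ss_total:.4f}")
print(f"SS(Between) = {ss_between:.4f}")
print(f"SS(Within)  = {ss_within:.4f}")
print(f"Between + Within = {ss_between + ss_within:.4f}")
```

The last two printed lines should agree with the first up to rounding, which is the partition stated above.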
One-way Analysis of Variance
Source  DF      SS      MS      F      P
Factor   2  2510.5  1255.3  93.44  0.000
Error   12   161.2    13.4
Total   14  2671.7

Source means "find the components of variation in this column".
DF means degrees of freedom.
SS means sums of squares.
MS means mean square.
F means F test statistic.
P means P-value.

Factor means variability between groups, or variability due to the factor of interest.
Error means variability within groups, or unexplained random variation.
Total means total variation from the grand mean.
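Source  DF   SS           MS   F        P
Factor  a-1  SS(Between)  MSA  MSA/MSE
Error   n-a  SS(Error)    MSE
Total   n-1  SS(Total)

Here a is the number of groups and n is the total number of observations.

n-1 = (a-1) + (n-a)
SS(Total) = SS(Between) + SS(Error)
MSA = SS(Between)/(a-1)
MSE = SS(Error)/(n-a)

\[ SSE = \sum_{obs} (x_{ij} - \bar{x}_i)^2 \]

\[ SSA = \sum_{obs} (\bar{x}_i - \bar{x})^2 = \sum \frac{(\sum x_i)^2}{n_i} - \frac{(\sum x)^2}{n} \]

\[ SST = SSA + SSE; \quad MS = \frac{SS}{DF}; \quad F = \frac{MSA}{MSE} \]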
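\[ SSA = \sum_{obs} (\bar{x}_i - \bar{x})^2 = \sum \frac{(\sum x_i)^2}{n_i} - \frac{(\sum x)^2}{n} \]

\[ SSA = \frac{124.65^2}{5} + \frac{113.05^2}{5} + \frac{102.95^2}{5} - \frac{(340.65)^2}{15} = 7783.326 - 7736.162 = 47.164 \]

\[ SST = \sum_{obs} (x_{ij} - \bar{x})^2 = \sum x_{ij}^2 - \frac{(\sum x)^2}{n} \]

\[ SST = [25.4^2 + 26.31^2 + 24.1^2 + \cdots + 20.4^2] - 7736.162 = 7794.379 - 7736.162 = 58.2172 \]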
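The hand computation above uses the shortcut (computational) formulas, which need only group totals and the sum of squared values. A minimal sketch in plain Python reproduces the same arithmetic; `correction` is a name chosen here for the term $(\sum x)^2 / n$.

```python
# Filling times, one list per machine (Mach1, Mach2, Mach3).
groups = [
    [25.40, 26.31, 24.10, 23.74, 25.10],
    [23.40, 21.80, 23.50, 22.75, 21.60],
    [20.00, 22.20, 19.75, 20.60, 20.40],
]

n = sum(len(xs) for xs in groups)            # total number of observations
grand_total = sum(sum(xs) for xs in groups)  # sum of all values
correction = grand_total ** 2 / n            # (sum x)^2 / n

# SSA: squared group totals divided by group sizes, minus the correction term.
ssa = sum(sum(xs) ** 2 / len(xs) for xs in groups) - correction

# SST: sum of squared values, minus the correction term.
sst = sum(x ** 2 for xs in groups for x in xs) - correction

# SSE follows from the partition SST = SSA + SSE.
sse = sst - ssa

print(f"group totals = {[round(sum(xs), 2) for xs in groups]}")
print(f"correction   = {correction:.3f}")
print(f"SSA = {ssa:.4f}, SST = {sst:.4f}, SSE = {sse:.4f}")
```

The shortcut formulas avoid computing each deviation separately, which is why they are convenient by hand; in software either form gives the same result up to rounding.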
SST = SSA + SSE
SSE = SST - SSA
    = 58.2172 - 47.164
    = 11.0532

Source              SS       df  MS  F  p
Between (Machines)  47.1640
Within (Error)      11.0532
Total               58.2172
Completing the MS gives

Source              SS       df           MS       F  p
Between (Machines)  47.1640  3 - 1 = 2    23.5820
Within (Error)      11.0532  15 - 3 = 12  .9211
Total               58.2172  15 - 1 = 14

F test statistic
An F test statistic is the ratio of two sample variances.
MS(A) and MS(E) are two sample variances, and that's what we divide to find F.
F = MS(A) / MS(E)
F = (variance between groups) / (variance within groups)
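The mean squares and the F statistic follow directly from the sums of squares and degrees of freedom in the table. The sketch below is illustrative: `anova_from_ss` and `f_upper_tail_two_numerator_df` are hypothetical helper names. The P-value helper relies on a closed form that holds only when the numerator has exactly 2 degrees of freedom, as in this three-group example; it is not a general F distribution routine.

```python
def anova_from_ss(ss_between, ss_within, a, n):
    """Return (df_between, df_within, MS(A), MS(E), F) for a groups, n observations."""
    df_between = a - 1
    df_within = n - a
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return df_between, df_within, ms_between, ms_within, ms_between / ms_within


def f_upper_tail_two_numerator_df(f_stat, df_within):
    """P(F >= f_stat) for an F distribution with 2 numerator df.

    Assumption: valid ONLY for 2 numerator degrees of freedom, where the
    upper tail reduces to (1 + 2 f / d2) ** (-d2 / 2).
    """
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# Sums of squares from the filling-machine table: 3 machines, 15 workers.
df_a, df_e, ms_a, ms_e, f_stat = anova_from_ss(47.1640, 11.0532, a=3, n=15)
p_value = f_upper_tail_two_numerator_df(f_stat, df_e)

print(f"MS(A) = {ms_a:.4f} on {df_a} df")
print(f"MS(E) = {ms_e:.4f} on {df_e} df")
print(f"F     = {f_stat:.2f}")
print(f"P     = {p_value:.6f}")
```

For designs with more than three groups the closed form no longer applies, and the P-value would come from an F table or a statistics package.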
One-way ANOVA: time versus Machine
Source DF SS MS F P
Machine 2 47.164 23.582 25.60 0.000
Error 12 11.053 0.921
Total 14 58.217
There is enough evidence to support the claim that there is a difference in the mean scores of the front, middle, and back rows in class.
The ANOVA doesn't tell which row is different; you would need to look at confidence intervals or run post hoc tests to determine that.

An experiment was performed to determine whether the annealing temperature of ductile iron affects its tensile strength. Five specimens were annealed at each of four temperatures. The tensile strength (in ksi) was measured for each temperature. The results are presented in the following table. Can you conclude that there are differences among the mean strengths?
Temperature (°C)  Sample Values
750               19.72  20.88  19.63  18.68  17.89
800               16.01  20.04  18.10  20.28  20.53
850               16.66  17.38  14.49  18.21  15.58
900               16.93  14.49  16.15  15.53  13.25

Source       DF     SS     MS     F      P
Temperature   3  58.65  19.55  8.49  0.001
Error        16  36.84   2.30
Total        19  95.49
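The ANOVA table for the annealing data can be reproduced from the raw values. The sketch below is a plain-Python illustration; `one_way_anova` is a hypothetical helper written for this example, not a library function, and it stops at the F statistic rather than computing the P-value.

```python
def one_way_anova(groups):
    """Compute the one-way ANOVA quantities for a list of samples."""
    a = len(groups)                       # number of groups
    n = sum(len(xs) for xs in groups)     # total number of observations
    grand_mean = sum(sum(xs) for xs in groups) / n

    # Between group and within group sums of squares.
    ss_between = sum(
        len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2 for xs in groups
    )
    ss_error = sum((x - sum(xs) / len(xs)) ** 2 for xs in groups for x in xs)

    ms_between = ss_between / (a - 1)
    ms_error = ss_error / (n - a)
    return {
        "df_between": a - 1,
        "df_error": n - a,
        "ss_between": ss_between,
        "ss_error": ss_error,
        "ss_total": ss_between + ss_error,
        "ms_between": ms_between,
        "ms_error": ms_error,
        "f": ms_between / ms_error,
    }


# Tensile strength (ksi) at 750, 800, 850 and 900 degrees C.
strength = [
    [19.72, 20.88, 19.63, 18.68, 17.89],
    [16.01, 20.04, 18.10, 20.28, 20.53],
    [16.66, 17.38, 14.49, 18.21, 15.58],
    [16.93, 14.49, 16.15, 15.53, 13.25],
]

result = one_way_anova(strength)
for name, value in result.items():
    print(f"{name:>10}: {value:.2f}")
```

Rounded to two decimals, the printed sums of squares, mean squares, and F should match the Temperature and Error rows of the table.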
When the null hypothesis is rejected, it may be desirable to find which mean(s) is (are) different.
Two statistical inference procedures, geared at doing this, are presented:
regular confidence interval calculations
Tukey test

Two means are considered different if the confidence interval for the difference between the corresponding sample means does not contain 0. In this case the larger sample mean is believed to be associated with a larger population mean.

How do we calculate the confidence intervals?

Confidence interval for each mean, $\mu_i$:

\[ \bar{x}_i \pm t_{\alpha/2,\, n-a} \sqrt{\frac{MSE}{n_i}} \]

Confidence interval for the difference between two means:

\[ (\bar{X}_1 - \bar{X}_2) \pm t \sqrt{MSE \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]

where t is obtained from the t table with (n - k) degrees of freedom and MSE = SSE/(n - k). Here k is the number of groups, written a in the ANOVA table.
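As an illustration of the regular confidence interval procedure, the sketch below applies the difference-of-means interval to the filling-machine data. It is not the Tukey test. The critical value 2.179 is an assumption taken from a standard t table (two-sided 95%, 12 degrees of freedom); `difference_interval` is a hypothetical helper name.

```python
import math
from itertools import combinations

groups = {
    "Mach1": [25.40, 26.31, 24.10, 23.74, 25.10],
    "Mach2": [23.40, 21.80, 23.50, 22.75, 21.60],
    "Mach3": [20.00, 22.20, 19.75, 20.60, 20.40],
}

# Assumption: t table value for alpha/2 = 0.025 with n - k = 12 df.
T_CRIT = 2.179


def mean(values):
    return sum(values) / len(values)


n = sum(len(xs) for xs in groups.values())
k = len(groups)

# MSE = SSE / (n - k), with SSE the within group sum of squares.
sse = sum((x - mean(xs)) ** 2 for xs in groups.values() for x in xs)
mse = sse / (n - k)


def difference_interval(xs, ys):
    """Confidence interval for mean(xs) - mean(ys) using the pooled MSE."""
    diff = mean(xs) - mean(ys)
    half_width = T_CRIT * math.sqrt(mse * (1 / len(xs) + 1 / len(ys)))
    return diff - half_width, diff + half_width


intervals = {}
for (name1, xs), (name2, ys) in combinations(groups.items(), 2):
    low, high = difference_interval(xs, ys)
    intervals[(name1, name2)] = (low, high)
    contains_zero = low <= 0 <= high
    verdict = "no difference shown" if contains_zero else "means differ"
    print(f"{name1} - {name2}: ({low:.3f}, {high:.3f}) -> {verdict}")
```

An interval that excludes 0 marks the pair as different, following the rule stated above. These intervals are not adjusted for making several comparisons at once, which is what the Tukey test addresses.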
Two-way ANOVA: only two classification factors are considered.
The standard two-way ANOVA tests are valid under the following conditions: