Академический Документы
Профессиональный Документы
Культура Документы
Inf. Stats
Typical applications
Analysis of Variance
Inf. Stats
single ANOVA compare time needed for lexical recognition in healthy adults, patients with Wernickes aphasia, patients with Brocas aphasia multiple ANOVA compare lexical recognition time in male and female in same three groups
for two groups: t-test testing at p = 0.05 shows signicance 1 time in 20 if there is no difference in population mean (effect of chance) but suppose there are 7 groups, i.e.,
7 2
= 21 pairs
caution: several tests (on same data) run the risk of nding signicance through sheer chance
Example: Suppose you run three tests, always seeking a result signicant at 0.05. The chance of nding this in one of the three is Bonferronis family-wise -level
F W
= = = = =
to guarantee a family-wise alpha of 0.05, divide this by number of tests Example: 0.05/3 = 0.017 (set at 0.017) note: 0.9833 0.95 ANOVA indicated, takes group effects into account
Analysis of Variance
Inf. Stats
based on F distribution
F distribution
s2 F = 1 s2 2 always positive, since variance positive two degrees of freedom interesting, one for s1, one for s2
Inf. Stats
s2 F = 1 s2 2 used in F -test H0: samples from same distribution (s1 = s2) Ha: samples from diff. distribution (s1 = s2 ) value 1 indicates same variance values near 0 or + indicate diff. F -test very sensitive to deviations from normal ANOVA uses F distribution, but is different ANOVA = F -test!
F Distribution
Inf. Stats
Note symmetry: P (
s2 1 s2 2
> x) = P (
s2 2 s2 1
1 < x)
Example: height group boys girls sample size 16 9 mean
F -test
Inf. Stats
std. dev.
180cm 168
6cm 4
examine F =
boys girls
s2
degrees of freedom:
sboys sgirls
16 1 91
8
Inf. Stats
P (F (15, 8) < x)
1 x
Reject H0 if F < 0.31 or F > 4.1 62 Here, F = 42 = 2.25 (no evidence of diff. in distribution)
ANOVA
Inf. Stats
Analysis of Variance (ANOVA) most popular statistical test for numerical data
several types single one-way multiple, i.e., two-, three-, ...n-way examines variation between-groups sex, age,... within-subject, within-groups overall automatically corrects for looking at several relationships (like Bonferroni correction) uses F test, where F (n, m) xes n typically at number of groups (less 1), m at number of subjects (data points) (less number of groups)
10
Inf. Stats
Example: Gisela Redeker identied three roles for literary book reviews in newspapers Taalbeheersing 21(4), 1999, 295-310:
communicate emotional reactions, subjective opinions communicate expert opinion, objective facts motivate reading and purchasing of book
She investigated whether different reviewers emphasized different roles: Tom van Deel (Trouw), Arnold Heumakers (de Volkskrant), and Carel Peeters (Vrij Nederland) stylistic elements indicate one of the three functions, e.g., ik, maar nee, ben ik bang, lijkt, eerlijk gezegd, ik bedoel,... indicate subjective opinions; logical connectives want, temeer dat, ... and quotes indicate an objective point of view; etc. N.b. validating link between style and perspectives is important (see Redeker)
11
Gisela Redeker investigates role of lit. criticism, asking whether different critics did not differ in the degree to which they emphasize one or another role. Sample: reviews of the same books (by Hermans, Heijne and Mulisch), all published 1989-92. Similar in length. Data: relative frequency of, e.g., reader-oriented elements (per 1, 000 words). We are comparing three averages, asking whether their is a difference. Because she compared more than two averages ANOVA is needed.
12
Inf. Stats
26.0
29.8
39.7
H 0 : 1 = 2 = 3 Ha : 1 = 2 or 1 = 3or 2 = 3
Results: statistically signicant difference (p < 0.02) Similar comparisons for subjective style, and argumentative style (differences present, not statistically signicant)
13
One-Way ANOVA
Inf. Stats
Question: Are exam grades of four groups of foreign students Nederlands voor anderstaligen the same? More exactly, are four averages the same?
H 0 : 1 = 2 = 3 = 4 Ha : 1 = 2 or 1 = 3 . . . or 3 = 4
i.e., alternative: at least one group has different mean for the question of whether any particular pair is the same, the t-test is appropriate for testing whether all language groups are the same, pairwise t-tests will exaggerate differences (increase the chance of type I error). we want to apply 1-way ANOVA
14
Four groups of ten each: Groups Amer. Af ri. 33 26 21 25 . . . . . . 20 15 21.9 23.1 6.61 5.92 43.66 34.99
15
Anova Conditions
Inf. Stats
Assumption: normal distribution per group, check with normal quantile plot, e.g., for Europeans (and to be repeated for every group):
No rm a l Q -Q P lo t o f to e ts nl. vo o r a nd e rsta lig e
2 .0
1.5
1.0
.5
E xp e c te d N o rm a l V a lu e
0 .0
-.5
-1.0
-1.5 -2 .0 0 10 20 30 40
O b se rve d V a lu e
16
Anova Conditions
Inf. Stats
-1
-2
-3 0 10 20 30 40
17
ANOVA assumptions:
Anova Conditions
Inf. Stats
normal distribution per subgroup same variance in subgroups: least sd > one-half of greatest sd independent observations: watch out for test-retest situations!
Check differences in SDs! (some SPSS computing) Valid N 10 10 10 10
18
Label
40
Visualizing Anova
Inf. Stats
Is there any signicant difference in the means (of the groups being contrasted)?
30
20
10
0
N = 10 10 10 10
Europa
Am eric a
Afric a
Azie
19
1 Eur. . . . x1,j . . . x1
Sketch of Anova
Inf. Stats
(xi,j x)
total residue
= =
(xi x)
group diff
20
Two Variances
Inf. Stats
21
Two Variances
Inf. Stats
(xi,j x)2
(xi,j x)
= =
(xi,j x)
=
j=1
(xi x)
Ni
+
j=1
(xi,j xi)2
+
j=1
22
Two Variances
Inf. Stats
Since:
Ni j=1 Ni j=1
Ni
j=1
(xi,j xi ) and
(xi,j xi) = 0
23
Ni j=1
Sketch of Anova
Inf. Stats
Ni
Ni
(xi,j x)
=
j=1
(xi x)
Ni
+
j=1
(xi,j xi)2
(+
j=1
Therefore:
Ni j=1 Ni Ni
(xi,j x)
=
j=1
(xi x) SSG
+
j=1
SST
24
For any data point xi,j
Anova Terminology
Inf. Stats
total residue
(xi,j x)
= = = =
group diff
(xi x)
+ + + + + + + +
(xi,j xi)
error
(xi,j x)2
SST
Total Sum of Squares
(xi x)2
SSG
Group Sum of Squares
(xi,j xi)2
SSE
Error Sum of Squares
and DFT (n 1)
Total Degrees of Freedom
= =
DFG (I 1)
Group Degrees of Freedom
DFE (n I)
Error Degrees of Freedom
25
Inf. Stats
SST/DFT
&
SSE/DFE (=MSE)
In ANOVA, we compare MSG (variance betwee groups) and MSE (variance within groups), i.e. we measure
F =
M SG M SE
26
H 0 : 1 = 2 = 3 = 4
Two Variances
Inf. Stats
ANOVA: calculate MSG ( 2 between groups) and MSE ( 2 within groups), i.e. we measure
F =
M SG M SE
27
Two Variances
Inf. Stats
= = =
(n1 1)s2 +(n2 1)s2 +(n3 1)s2 +(n4 1)s2 1 2 3 4 (n1 1)+(n2 1)+(n3 1)+(n4 1) 966.22+943.66+934.99+947.57 9+9+9+9 595.98+392.94+314.91+428.13 = 48.11 36
estimates variance in groups (using dF ), aka within-groups estimate of variance 2. suppose H0 true (a) then group have sample means , variance 2/10, (& sd / 10) (b) 4 means, 25.0, 21.9, 23.1, 21.3, where s = 1.63, s2 = 2.66 (c) s2 estimate of 2/10, i.e., 10 s2 is estimate of 2( s2 = 26.6) (d) this is between-groups variance (MSG)
28
Inf. Stats
= 0.55
P (F (3, 30) > 2.92) = 0.05, (see tables), so no evidence of nonuniform behavior
29
ANOVA Summary
Inf. Stats
Summary to-date (exam results for NL voor anderstalige) Source between-g within-g Total
dF 3 36 39
M SS 26.6 48.1
F F(3,36) = 0.55
30
- - - - Variable By Variable
SPSS Summary
Inf. Stats
O N E W A Y
- - - - -
NL_NIVO GROUP
toets nl. voor anderstalige gebied van afkomst Analysis of Variance Sum of Squares 79.9 1731.9 1811.8 Mean Squares 26.6 48.1 F Ratio .55 F Prob. .65
D.F. 3 36 39
31
Other Questions
Inf. Stats
ANOVA has H0: 1 = 2 = . . . = n But sometimes particular contrasts important e.g., are Europeans better (in learning Dutch)? Distinguish (in reporting results):
prior contrasts questions asked before data collected and analyzed post-hoc (posterior) questions questions after collection and analysis data-snooping is exploratory, cannot contribute to hypothesis testing
32
Prior Contrasts
Inf. Stats
Questions asked before collection and analysis e.g., are Europeans better (in learning Dutch)? Another formulation: is Ha :
33
(the mean of) every group gets a coefcient sum of coefcients is zero a t-test is carried out, & two-tailed p value is reported (as usual)
Contrast 1 Eur -1.0 Am. .3 Afr. .3 Azie .3
Contrast
Value -2.9
T Prob. .260
No signicant difference here (of course) Note: prior contrasts are legitimate as hypothesis tests as long as they are formulated before collection and analysis
34
Post-hoc Questions
Inf. Stats
Assume H0 rejected: which means are distinct? Data-snooping problem: in large set, some distinctions are likely to be stat. signicant But we can still look (we just cannot claim to have tested the hypothesis) We are asking whether m1 m2 is signicantly larger, we apply a variant of the t-test The relevant sd is MSE/n (differences among scores), but theres a correction since were looking at half the scores in any one comparison.
35
sd =
Ni + N j
48.1 10+10 40 10 + 10
48.1 2
20
24.05 = 4.9/ 20 20
and the t value is calculated as p/c where p is the desired signicance level and c is number of comparisons. For pairwise comparisons, c =
I 2
36
SPSS Post-Hoc Bonferroni searches among all groupings for statistically signicant ones.
- - - - Variable By Variable NL_NIVO GROUP O N E W A Y - - - - -
The difference between two means is significant if MEAN(J)-MEAN(I) >= 4.9045 * RANGE * SQRT(1/N(I) + 1/N(J)) with the following value(s) for RANGE: 3.95 - No two groups significantly different at .05 level Homogeneous Subsets (highest \& lowest means not sig. diff.) Group Mean Azie 21.3 America 21.9 Africa 23.1 Europa 25.0
37
F =
M SG M SE
1. MSG increases, differences in means grow larger 2. MSE decreases, overall variation grows smaller
38
xi,j xi,j
First model
= =
+ i,j + i +
i,j
real group effect each datapoint represents error ( ) around an overall mean () combined with a group adjustment (i)
ANOVA: is there sufcient evidence for i?
39
Inf. Stats
40