
Hypothesis Testing with One-Way ANOVA

Statistics
Arlo Clark-Foos

Conceptual Refresher
1. Standardized z distributions of scores and of means can be represented as percentile rankings.
2. t distributions of means, mean differences, and differences between means can all be standardized, allowing us to analyze differences between 2 means.
3. The numerator of a test statistic is always some difference (between scores, means, mean differences, or differences between means).
4. The denominator represents some measure of variability (or form of standard deviation).

Calculating Refresher
• Test Statistics
• Numerator = Differences between groups
  • Example: Men are taller than women
• Denominator = Variability within groups
  • Example: Not all men/women are the same height
* There is overlap between these distributions.

$$z = \frac{M - \mu_M}{\sigma_M}$$

$$t = \frac{M - \mu_M}{s_M}$$

$$t = \frac{(M_X - M_Y) - (\mu_X - \mu_Y)}{s_{Difference}} = \frac{M_X - M_Y}{s_{Difference}}$$

Analysis of Variance (ANOVA)
• Hypothesis test typically used with one or more nominal IVs (with at least 3 groups overall) and an interval DV.
• t Test: Distance between two distributions
• F ratio: Uses two measures of variability

F Ratio (Sir Ronald Fisher)

$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$$

• Between-Groups Variance: An estimate of the population variance based on the differences among the means of the samples
• Within-Groups Variance: An estimate of the population variance based on the differences within each of the three or more sample distributions

More than two groups
• Example: Speech rates in America, Japan, & Wales
[Figure: pairwise comparisons among the three groups, each labeled "t test?"]
• Two Sources of Variance: Between & Within

Problem of Too Many Tests
p(A AND B) = p(A) × p(B) (for independent events)
p(A OR B) = p(A) + p(B) (for mutually exclusive events)
• The probability of a Type I error (rejecting the null when the null is true) greatly increases with the number of comparisons.
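The growth in Type I error risk can be made concrete with a short sketch. It assumes the comparisons are independent and that each is run at the same alpha level, which is a simplification; the function names are illustrative and do not come from the slides.

```python
def n_pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct pairs of groups (one t test per pair)."""
    return n_groups * (n_groups - 1) // 2


def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one Type I error across independent tests.

    Each test avoids a false positive with probability (1 - alpha),
    so all of them do with probability (1 - alpha) ** n_comparisons.
    """
    return 1 - (1 - alpha) ** n_comparisons


for groups in (2, 3, 4, 5):
    tests = n_pairwise_comparisons(groups)
    rate = familywise_error_rate(0.05, tests)
    print(f"{groups} groups -> {tests} t tests -> P(any Type I error) = {rate:.3f}")
```

Under these assumptions, three groups already need three t tests, and the chance of at least one false positive is well above the nominal .05.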

Fishing Expedition
"If you torture the data long enough, the numbers will prove anything you want" (Bernstein, 1996)


Types of ANOVA
• Always preceded by two adjectives
  1. Number of Independent Variables
  2. Experimental Design
• One-Way ANOVA: Hypothesis test that includes one nominal IV with more than two levels and an interval DV.
• Within-Groups One-Way ANOVA: ANOVA where each sample is composed of the same participants (AKA repeated measures ANOVA).
• Between-Groups One-Way ANOVA: ANOVA where each sample is composed of different participants.

Assumptions of ANOVA

from 1st edition of textbook

Assumption of Homoscedasticity
• Homoscedastic populations have the same variance
• Heteroscedastic populations have different variances

to the Six Steps


• Research Question:
  • What influences foreign students to choose an American graduate program? In particular, how important are financial aspects to students in Arts & Sciences, Education, Law, & Business?
• Data Source:
  • Survey of 17 graduate students from foreign countries currently enrolled in universities in the U.S.
[Table: Importance Scores, with columns Arts & Sciences, Education, Law, Business]

1. Identify
• Populations: All foreign graduate students enrolled in __________ programs in the U.S.
• Comparison Distribution: F distribution
• Test: One-Way Between-Subjects ANOVA
• Assumptions:
  • Participants not randomly selected
    • Be careful generalizing results
  • Not clear if population distributions are normal. Data are not skewed.
  • Homoscedasticity
    • We will return to this later during calculations. Don't forget!

2. Hypotheses
• Null: Foreign graduate students in Arts & Sciences, Education, Law, and Business all rate financial factors the same, on average.
  $\mu_1 = \mu_2 = \mu_3 = \mu_4$
• Research: Foreign graduate students in Arts & Sciences, Education, Law, and Business do not all rate financial factors the same, on average.
  $\mu_1 \neq \mu_2 \neq \mu_3 \neq \mu_4$

3. Determine characteristics
• > 2 groups and interval DV: F distribution
• df for each sample: $N_{Sample} - 1$
  • Arts & Sciences: $df_1 = 5 - 1 = 4$
  • Education: $df_2 = 4 - 1 = 3$
  • Law: $df_3 = 4 - 1 = 3$
  • Business: $df_4 = 4 - 1 = 3$
• $df_{Between}$: $N_{Groups} - 1 = 4 - 1 = 3$
  • Numerator df
• $df_{Within}$: $df_1 + df_2 + df_3 + df_4 = 4 + 3 + 3 + 3 = 13$
  • Denominator df
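The degrees-of-freedom bookkeeping above can be sketched as a small helper. The sample sizes (5, 4, 4, 4) are the ones used in this example; the function name is an assumption made for illustration.

```python
def anova_degrees_of_freedom(sample_sizes):
    """Return (df_between, df_within, df_total) for a one-way between-groups ANOVA."""
    df_between = len(sample_sizes) - 1               # N_Groups - 1 (numerator df)
    df_within = sum(n - 1 for n in sample_sizes)     # sum of each sample's N - 1 (denominator df)
    df_total = sum(sample_sizes) - 1                 # N_Total - 1
    return df_between, df_within, df_total


# Arts & Sciences, Education, Law, Business
sizes = [5, 4, 4, 4]
df_between, df_within, df_total = anova_degrees_of_freedom(sizes)
print(f"df_between = {df_between}, df_within = {df_within}, df_total = {df_total}")
```

Note that the between and within df always add up to the total df, which is a quick check on the arithmetic.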

4. Determine Critical Values
• p = .05
• $df_{Between} = 3$
• $df_{Within} = 13$
• $F_{Critical} = 3.41$

5. Calculate the Test Statistic
• In order to do this, we need 2 measures of variance
  • Between-Groups Variance
  • Within-Groups Variance
• We will do this shortly

6. Make a Decision
• If our calculated test statistic exceeds our cutoff, we reject the null hypothesis and can say the following:
  Foreign graduate students studying in the U.S. rate financial factors differently depending on the type of program in which they are enrolled.
• ANOVA does not tell us where our differences are!
  • We just know that there is a difference somewhere.

Logic of ANOVA: Quantifying Overlap

$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$$

• Whenever differences between sample means are large and differences between scores within each sample are small, the F statistic will be large.
• Remember that large test statistics indicate statistically significant results

Logic of ANOVA: Quantifying Overlap
a) Large within-groups variability & small between-groups variability
b) Large within-groups variability & large between-groups variability
c) Small within-groups variability & small between-groups variability

Less Overlap!

Logic of ANOVA: Quantifying Overlap

$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$$

• If between-groups = within-groups, F = 1
• Null hypothesis predicts F = 1
  • No differences between groups
• Within-groups variance based on scores, between-groups variance based on means.
  • Need correction.

Calculating the F Statistic: The Source Table
• Source Table: Presents the important calculations and final results of an ANOVA in a consistent and easy-to-read format.

Calculating the F Statistic: The Source Table
Col. 1: The sources of variability
Col. 2: Sum of Squares
Col. 3: Degrees of freedom
Col. 4: Mean Square: arithmetic average of squared deviations
Col. 5: Value of test statistic, F ratio

$$MS_{Between} = \frac{SS_{Between}}{df_{Between}}$$

$$MS_{Within} = \frac{SS_{Within}}{df_{Within}}$$

$$F = \frac{MS_{Between}}{MS_{Within}}$$

Sums of Squared Deviations
Put all of your scores in one column, with samples denoted in another column.

$$SS_{Total} = \sum (X - GM)^2$$

Grand Mean: Refers to the mean of all scores in a study, regardless of their sample.

$$GM = \frac{\sum X}{N_{Total}}$$

Sums of Squared Deviations

$$SS_{Within} = \sum (X - M)^2$$

Calculate the squared deviation of each score from its own particular sample mean.

Sums of Squared Deviations

$$SS_{Between} = \sum (M - GM)^2$$

Calculate the squared deviation of each sample mean from the grand mean, counted once for every score in that sample.
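The three sums of squares can be sketched in a few lines of Python. The scores below are made up for illustration (they are not the textbook's data), and the function name `sums_of_squares` is an assumption of this sketch rather than anything defined in the slides.

```python
from statistics import fmean


def sums_of_squares(groups):
    """Return (SS_total, SS_within, SS_between) for a list of samples."""
    all_scores = [x for scores in groups for x in scores]
    grand_mean = fmean(all_scores)  # GM = sum(X) / N_Total

    # Every score's squared deviation from the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    ss_within = 0.0
    ss_between = 0.0
    for scores in groups:
        sample_mean = fmean(scores)
        # Each score's squared deviation from its own sample mean
        ss_within += sum((x - sample_mean) ** 2 for x in scores)
        # The sample mean's squared deviation from GM, once per score
        ss_between += len(scores) * (sample_mean - grand_mean) ** 2
    return ss_total, ss_within, ss_between


# Hypothetical scores for three samples
example_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ss_total, ss_within, ss_between = sums_of_squares(example_groups)
print(ss_total, ss_within, ss_between)
```

A useful check on hand calculations is that the within and between sums of squares add up to the total.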

Sums of Squared Deviations
[Table: sums of squared deviations for the example, from 1st edition of textbook]

Source Table for our Example
[Table: source table for the example, from 1st edition of textbook]

What is our decision?
• Back to Step 1.
  • Homoscedasticity
• Because the largest variance (.500) is not more than twice (unequal sample sizes) the smallest variance (.251), we have met this assumption.

What is our decision?
• Step 6. Make a decision
  $F = 3.94 > F_{crit} = 3.41$
• We can reject the null hypothesis. There is a difference (or are differences) somewhere.
  • Where?
• Post-hoc test: Statistical procedure frequently carried out after we reject the null hypothesis in an ANOVA; it allows us to make multiple comparisons among several means.
  • post-hoc: Latin for "after this"
  • Examples: Tukey's HSD, Scheffé, Dunnett, Duncan, Bonferroni

Reporting ANOVA in APA Style
1. Italic letter F: F
2. Open parenthesis: F(
3. Between Groups df then comma: F(df_Between,
4. Within Groups df: F(df_Between, df_Within
5. Close parentheses, equal sign: F(df_Between, df_Within) =
6. F Statistic then comma: F(df_Between, df_Within) = 1.23,
7. Lower case, italic letter p: F(df_Between, df_Within) = 1.23, p
8. Significant, less than .05: F(df_Between, df_Within) = 1.23, p < .05
   • OR non significant: F(df_Between, df_Within) = 1.23, p > .05
   • OR exact p value: F(df_Between, df_Within) = 1.23, p = .02
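The eight steps amount to a string template, which can be sketched as a small formatter. The function names and the `exact` flag are assumptions made for this illustration; plain text cannot carry the italics that APA style requires for F and p, so those still have to be applied in the final document.

```python
def drop_leading_zero(value: float, digits: int = 2) -> str:
    """Format a number below 1 without its leading zero (.05, not 0.05)."""
    text = f"{value:.{digits}f}"
    return text[1:] if text.startswith("0.") else text


def apa_f_report(df_between, df_within, f_value, p_value, alpha=0.05, exact=False):
    """Build an APA-style results string for a one-way ANOVA."""
    report = f"F({df_between}, {df_within}) = {f_value:.2f}, p "
    if exact:
        return report + f"= {drop_leading_zero(p_value)}"
    sign = "<" if p_value < alpha else ">"
    return report + f"{sign} {drop_leading_zero(alpha)}"


# Placeholder values (F = 1.23 and p = .02 are the slide's own placeholders)
print(apa_f_report(2, 14, 1.23, 0.02))
print(apa_f_report(2, 14, 1.23, 0.02, exact=True))
print(apa_f_report(2, 14, 1.23, 0.20))
```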

Between-Subjects One-Way ANOVA
Example: Memory for Emotional Stimuli

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
• Do you have differences in memory for emotional vs. neutral events?
• Do others have the same differences or is it something unique to you?
• Let's find out…
• Research Question: Will people asked to study pure lists of either positive, negative, or neutral pictures have differences in recall of those pure lists?
• Research Design: We asked 17 participants to study one single list of either 30 positive, 30 negative, or 30 neutral pictures (from IAPS). Following a brief delay, all participants were asked to recall as many of the 30 studied photos as they could. These data are on the following slide.

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
Already Stated: $N_{Total} = 17$, one IV with 3 levels (Emotion) is between-subjects.
Below are the proportions of pictures on their studied lists that each participant successfully recalled (100% = perfect memory):

Negative    Neutral     Positive
0.69        0.59        0.64
0.84        0.64        0.73
0.93        0.62        0.51
0.91        0.71        0.68
0.89        0.50        0.61
0.90        0.60
M = .86     M = .61     M = .634

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
Already Stated/Calculated
$N_{Total} = 17$, $df_{Between} = 2$, $df_{Within} = 14$

        Negative    Neutral     Positive
N       6           6           5
df      5           5           4
M       .86         .61         .634

• Six Steps to Hypothesis Testing…again!
1. Population: All memories for negative, neutral, and positive events.
   Comparison Distribution: F distribution
   Test: One-Way Between-Subjects ANOVA
   • Assumptions:
     • Participants were randomly selected from subject pool
     • Not clear if population distributions are normal. Data are not skewed.
     • Homoscedasticity

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
2. Hypotheses
Null: On average, memories for negative, neutral, and positive pictures will not differ.
$\mu_{Neg} = \mu_{Neut} = \mu_{Pos}$
Research: On average, memories for negative, neutral, and positive pictures will be different.
$\mu_{Neg} \neq \mu_{Neut} \neq \mu_{Pos}$

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
3. Determine characteristics
• > 2 groups and interval DV: F distribution

        Negative    Neutral     Positive
M       .86         .61         .634
s²      .00784      .00472      .00683

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
Digression: Test for Homoscedasticity

Rule: If sample sizes differ across conditions, the largest variance must not be more than twice (2x) the smallest variance.

        Negative    Neutral     Positive
M       .86         .61         .634
s²      .00784      .00472      .00683

Largest variance: .00784
Smallest variance × 2: .00472 × 2 = .00944
.00784 < .00944, so this assumption is met.
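The rule of thumb above can be checked directly from the recall scores listed earlier. This is a minimal sketch: the dictionary layout and variable names are assumptions, and `statistics.variance` is used because it computes the sample variance (N - 1 in the denominator), which matches the s² values on the slide.

```python
from statistics import variance

# Proportion of pictures recalled, by list type (data from the slides)
recall = {
    "negative": [0.69, 0.84, 0.93, 0.91, 0.89, 0.90],
    "neutral": [0.59, 0.64, 0.62, 0.71, 0.50, 0.60],
    "positive": [0.64, 0.73, 0.51, 0.68, 0.61],
}

# Sample variance (s squared) for each condition
variances = {group: variance(scores) for group, scores in recall.items()}
largest = max(variances.values())
smallest = min(variances.values())

# Rule of thumb for unequal sample sizes: largest must not exceed 2x smallest
assumption_met = largest <= 2 * smallest

for group, s2 in variances.items():
    print(f"{group}: s2 = {s2:.5f}")
print(f"largest = {largest:.5f}, 2 x smallest = {2 * smallest:.5f}, met = {assumption_met}")
```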

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
4. Determine critical values
Already Stated/Calculated: $N_{Total} = 17$, $df_{Between} = 2$, $df_{Within} = 14$

$F_{crit} = 3.74$

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic

Source      SS      df      MS      F
Between             2
Within              14
Total               16

$$GM = \frac{\sum X}{N_{Total}}$$

$$SS_{Total} = \sum (X - GM)^2$$

$$SS_{Within} = \sum (X - M)^2$$

$$SS_{Between} = \sum (M - GM)^2$$

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic

$$SS_{Total} = \sum (X - GM)^2, \quad GM = \frac{\sum X}{N_{Total}} = .7053$$

X       (X - GM)    (X - GM)²
0.69    -0.02       0.0002
0.84    0.135       0.0181
0.93    0.225       0.0505
0.91    0.205       0.0419
0.89    0.185       0.0341
0.90    0.195       0.0379
0.59    -0.12       0.0133
0.64    -0.07       0.0043
0.62    -0.09       0.0073
0.71    0.005       0.0
0.50    -0.21       0.0421
0.60    -0.11       0.0111
0.64    -0.07       0.0043
0.73    0.025       0.0006
0.51    -0.2        0.0381
0.68    -0.03       0.0006
0.61    -0.1        0.0091

$SS_{Total} = .3135$

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic

$$SS_{Within} = \sum (X - M)^2$$

$M_{Neg} = .86$, $M_{Neut} = .61$, $M_{Pos} = .634$

X       (X - M)     (X - M)²
0.69    -0.17       0.0289
0.84    -0.02       0.0004
0.93    0.07        0.0049
0.91    0.05        0.0025
0.89    0.03        0.0009
0.90    0.04        0.0016
0.59    -0.02       0.0004
0.64    0.03        0.0009
0.62    0.01        0.0001
0.71    0.1         0.01
0.50    -0.11       0.0121
0.60    -0.01       0.0001
0.64    0.006       0
0.73    0.096       0.0092
0.51    -0.124      0.0154
0.68    0.046       0.0021
0.61    -0.024      0.0006

$SS_{Within} = .0901$

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic

$$SS_{Between} = \sum (M - GM)^2, \quad GM = .7053$$

X       M       (M - GM)    (M - GM)²
0.69    0.86    0.155       0.024
0.84    0.86    0.155       0.024
0.93    0.86    0.155       0.024
0.91    0.86    0.155       0.024
0.89    0.86    0.155       0.024
0.90    0.86    0.155       0.024
0.59    0.61    -0.1        0.009
0.64    0.61    -0.1        0.009
0.62    0.61    -0.1        0.009
0.71    0.61    -0.1        0.009
0.50    0.61    -0.1        0.009
0.60    0.61    -0.1        0.009
0.64    0.634   -0.07       0.005
0.73    0.634   -0.07       0.005
0.51    0.634   -0.07       0.005
0.68    0.634   -0.07       0.005
0.61    0.634   -0.07       0.005

$SS_{Between} = .223$

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
5. Calculate a test statistic

Source      SS        df      MS       F
Between     .223      2       .1115    17.969
Within      .0901     14      .0064
Total       ~.3135    16

$$MS_{Between} = \frac{SS_{Between}}{df_{Between}}$$

$$MS_{Within} = \frac{SS_{Within}}{df_{Within}}$$

$$F = \frac{MS_{Between}}{MS_{Within}}$$
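As a cross-check on the hand calculations, the whole source table can be rebuilt from the raw recall scores. This is a sketch under the assumption that the scores listed earlier are complete; the variable names are illustrative. Because it keeps full precision at every step, its F ratio comes out near 17.4 rather than the 17.969 shown in the table. Either value is far above $F_{crit} = 3.74$, so the decision is the same.

```python
from statistics import fmean

# Proportion of pictures recalled, by list type (data from the slides)
recall = {
    "negative": [0.69, 0.84, 0.93, 0.91, 0.89, 0.90],
    "neutral": [0.59, 0.64, 0.62, 0.71, 0.50, 0.60],
    "positive": [0.64, 0.73, 0.51, 0.68, 0.61],
}

all_scores = [x for scores in recall.values() for x in scores]
grand_mean = fmean(all_scores)

ss_within = 0.0
ss_between = 0.0
for scores in recall.values():
    group_mean = fmean(scores)
    ss_within += sum((x - group_mean) ** 2 for x in scores)
    ss_between += len(scores) * (group_mean - grand_mean) ** 2
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

df_between = len(recall) - 1                 # 3 groups - 1
df_within = len(all_scores) - len(recall)    # 17 scores - 3 groups
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"{'Source':<8} {'SS':>7} {'df':>3} {'MS':>7} {'F':>7}")
print(f"{'Between':<8} {ss_between:7.4f} {df_between:3d} {ms_between:7.4f} {f_ratio:7.2f}")
print(f"{'Within':<8} {ss_within:7.4f} {df_within:3d} {ms_within:7.4f}")
print(f"{'Total':<8} {ss_total:7.4f} {df_between + df_within:3d}")
```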

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
6. Make a decision

Source      SS        df      MS       F
Between     .223      2       .1115    17.969
Within      .0901     14      .0064
Total       ~.3135    16

$F_{crit} = 3.74$

Between-Subjects One-Way ANOVA: Memory for Emotional Stimuli
6. Make a decision

$F = 17.97 > F_{crit} = 3.74$

• Recall of negative, neutral, and positive pictures was different, F(2, 14) = 17.97, p < .05.
• But which pictures were remembered best? Worst?

A Priori & Post-Hoc Tests

Hindsight is 20-20
• Although your data may suggest a new relationship, and thus new analyses…
• Theory should guide research, and thus comparisons should be decided on before you conduct your experiment.

Planned & A Priori Comparisons
• Based on literature review
• Theoretical
• Planned comparisons
  • A test that is conducted when there are multiple groups of scores, but specific comparisons have been specified prior to data collection.
• A Priori Comparisons

Planned & A Priori Comparisons
• If you have planned comparisons
  • Just run t tests
  • Subjective Decision about p value
    • p = .05?
    • p = .01?
    • Bonferroni Correction?

Post-Hoc: Tukey HSD
• Tukey Honestly Significant Difference
  • Determines differences between means in terms of standard error
  • "Honest" because we adjust for making multiple comparisons
  • The HSD is compared to a critical value
• Overview
  1. Calculate differences between a pair of means
  2. Divide this difference by the standard error
  * Basically this is a variant of a t test *
Oh no, that means the six steps again…sort of.

Tukey HSD

$$HSD = \frac{M_1 - M_2}{s_M}$$

$$t = \frac{M_1 - M_2}{s_{Difference}}$$

• For Tukey HSD, standard error is calculated differently depending on whether your sample sizes are equal or not.

Tukey HSD
• Equal Sample Sizes

$$s_M = \sqrt{\frac{MS_{Within}}{N}}$$

N = Sample size within each group

• Unequal Sample Sizes

$$s_M = \sqrt{\frac{MS_{Within}}{N'}}, \quad N' = \frac{N_{Groups}}{\sum \frac{1}{N}}$$

Tukey HSD
• Determine Critical Value from Table
• Make a Decision
• Let's go back to our memory for emotional pictures example

Tukey HSD: Example
• Memory for Emotional Pictures Example: Between-Subjects One-Way ANOVA
• Decision: Recall of negative, neutral, and positive pictures was different, F(2, 14) = 17.97, p < .05.
• Where are our differences?
• Let's get our $q_{crit}$ first

Tukey HSD: Example
Already Stated/Calculated
$N_{Total} = 17$, $df_{Between} = 2$ (k = 3), $df_{Within} = 14$

        Negative    Neutral     Positive
N       6           6           5
df      5           5           4
M       .86         .61         .634

$q_{crit} = 3.70$

Tukey HSD: Example
Already Stated/Calculated

Source      SS        df      MS       F
Between     .223      2       .1115    17.969
Within      .0901     14      .0064
Total       ~.3135    16

$q_{crit} = 3.70$

Tukey HSD: Example
• Standard Error: Unequal Sample Sizes

$$N' = \frac{N_{Groups}}{\sum \frac{1}{N}} = \frac{3}{\frac{1}{6} + \frac{1}{6} + \frac{1}{5}} = \frac{3}{.533} = 5.625$$

$$s_M = \sqrt{\frac{MS_{Within}}{N'}} = \sqrt{\frac{.0064}{5.625}} = \sqrt{.0011378} = 0.034$$

Tukey HSD: Example
• Negative (M = 0.86) vs. Neutral (M = 0.61)

$$HSD = \frac{M_1 - M_2}{s_M} = \frac{.86 - .61}{.034} = 7.35$$

• Negative (M = 0.86) vs. Positive (M = 0.634)

$$HSD = \frac{M_1 - M_2}{s_M} = \frac{.86 - .634}{.034} = 6.65$$

• Neutral (M = 0.61) vs. Positive (M = 0.634)

$$HSD = \frac{M_1 - M_2}{s_M} = \frac{.61 - .634}{.034} = -0.71$$
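The three comparisons can be reproduced with a short sketch. It takes $MS_{Within} = .0064$, the group sizes, the group means, and $q_{crit} = 3.70$ from the slides; the variable names are assumptions. The standard error is rounded to three decimals only so that the results line up with the hand calculation, and each HSD is compared with the critical value in absolute terms.

```python
from itertools import combinations
from math import sqrt

ms_within = 0.0064
q_crit = 3.70
group_sizes = {"negative": 6, "neutral": 6, "positive": 5}
means = {"negative": 0.86, "neutral": 0.61, "positive": 0.634}

# Harmonic-mean sample size for unequal groups: N' = N_Groups / sum(1 / N)
n_prime = len(group_sizes) / sum(1 / n for n in group_sizes.values())

# Standard error, rounded to match the slide's .034
s_m = round(sqrt(ms_within / n_prime), 3)

hsd = {}
for first, second in combinations(means, 2):
    hsd[(first, second)] = (means[first] - means[second]) / s_m

for (first, second), value in hsd.items():
    verdict = "significant" if abs(value) > q_crit else "not significant"
    print(f"{first} vs. {second}: HSD = {value:.2f} ({verdict})")
```

Only the two comparisons involving negative pictures exceed the critical value in this sketch.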

Tukey HSD: Example
• Make a Decision
  • Post hoc comparisons using the Tukey HSD test revealed that negative pictures were better remembered (M = .86) than either positive (M = .634) or neutral (M = .61) pictures, with no differences between the latter two.

Bonferroni Correction
An alternative post-hoc strategy

Bonferroni Correction
Fishing Expedition
• Remember the problem of too many tests?
  • Inflates the risk of a Type I error.
  • False positives
• Is there a way to address that without a new test?
  • We've hinted at it already
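The slides do not spell out the arithmetic, so the following is a sketch of the usual form of the Bonferroni correction: divide the overall alpha by the number of comparisons and use the result as the cutoff for each individual test. The function names are illustrative, and the familywise check assumes independent comparisons.

```python
def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison alpha that keeps the overall Type I error rate near alpha."""
    return alpha / n_comparisons


def familywise_error_rate(alpha_per_test: float, n_comparisons: int) -> float:
    """Chance of at least one Type I error across independent tests."""
    return 1 - (1 - alpha_per_test) ** n_comparisons


# Three pairwise comparisons among negative, neutral, and positive pictures
n_comparisons = 3
per_test = bonferroni_alpha(0.05, n_comparisons)
overall = familywise_error_rate(per_test, n_comparisons)
print(f"per-test alpha = {per_test:.4f}, overall Type I error rate = {overall:.4f}")
```

With three comparisons each t test would be evaluated against roughly p = .017 instead of .05, which keeps the overall false-positive risk just under .05.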


Summary
• Between-Subjects One-Way ANOVA
  • Two Sources of Variance
  • New Sums of Squares
  • New df
  • Homoscedasticity
• The problem of too many tests
• Source Table
• Post-Hoc tests
  • Tukey's HSD
  • Bonferroni
  • LSD
  • etc.