
# Hypothesis Testing with One-Way ANOVA

Statistics
Arlo Clark-Foos

Conceptual Refresher
1. Standardized z distribution of scores and of means can be represented as percentile rankings.
2. t distribution of means, mean differences, and differences between means can all be standardized, allowing us to analyze differences between 2 means.
3. Numerator of test statistic is always some difference (between scores, means, mean differences, or differences between means).
4. Denominator represents some measure of variability (or form of standard deviation).

Calculating Refresher
- Test Statistics
  - Numerator = Differences between groups
    - Example: Men are taller than women
  - Denominator = Variability within groups
    - Example: Not all men/women are the same height
* There is overlap between these distributions.

$$z = \frac{M - \mu_M}{\sigma_M}$$

$$t = \frac{M - \mu_M}{s_M}$$

$$t = \frac{(M_X - M_Y) - (\mu_X - \mu_Y)}{s_{Difference}} = \frac{M_X - M_Y}{s_{Difference}}$$

## Analysis of Variance (ANOVA)

- Hypothesis test typically used with one or more nominal IV (with at least
- t Test: Distance between two distributions
- F ratio: Uses two measures of variability

## F Ratio (Sir Ronald Fisher)

$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$$

- Between-Groups Variance: An estimate of the population variance based on the differences among the means of the samples.
- Within-Groups Variance: An estimate of the population variance based on the differences within each of the three or more sample distributions.

## More than two groups

- Example: Speech rates in America, Japan, & Wales

[Figure: the three groups, with each pairwise comparison labeled "t test?". Two Sources of Variance: Between & Within]

## Problem of Too Many Tests

p(A) AND p(B) = p(A) x p(B)
p(A) OR p(B) = p(A) + p(B)
- The probability of a Type I error (rejecting the null when the null is true) greatly increases with the number of comparisons.
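As a rough illustration of how quickly the Type I error risk grows, the sketch below assumes every pairwise t test is independent and run at the same alpha of .05 (a simplification; real comparisons on shared groups are not fully independent). The function name `familywise_error_rate` is a hypothetical helper, not something from the slides.

```python
from math import comb


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type I error across n_tests independent tests."""
    # P(no error on one test) = 1 - alpha; the AND rule multiplies these together.
    return 1.0 - (1.0 - alpha) ** n_tests


for n_groups in (2, 3, 4, 5):
    n_tests = comb(n_groups, 2)  # number of distinct pairs of groups
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_groups} groups -> {n_tests} t tests, "
          f"P(at least one Type I error) = {rate:.3f}")
```

With three groups (three pairwise t tests) the chance of at least one false positive is already about .14 under these assumptions, nearly triple the nominal .05.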

Fishing Expedition
If you torture the data long enough,
the numbers will prove anything you want (Bernstein, 1996)


Types of ANOVA
- Always preceded by two adjectives
  1. Number of Independent Variables
  2. Experimental Design
- One-Way ANOVA: Hypothesis test that includes one nominal IV with more than two levels and an interval DV.
- Within-Groups One-Way ANOVA: ANOVA where each sample is composed of the same participants (AKA repeated measures ANOVA).
- Between-Groups One-Way ANOVA: ANOVA where each sample is composed of different participants.

Assumptions of ANOVA

[Table: assumptions of ANOVA, from 1st edition of textbook]

Assumption of Homoscedasticity
- Homoscedastic populations have the same variance
- Heteroscedastic populations have different variances

## to the Six Steps

- Research Question:
  - What influences foreign students to choose an American graduate program? In particular, how important are financial aspects to students in Arts & Sciences, Education, Law, & Business?
- Data Source:
  - Survey of 17 graduate students from foreign countries currently enrolled in universities in the U.S.

[Table: Importance Scores by program (Arts & Sciences, Education, Law, Business)]

1. Identify
- Populations: All foreign graduate students enrolled in __________ programs in the U.S.
- Comparison Distribution: F distribution
- Test: One-Way Between-Subjects ANOVA
- Assumptions:
  - Participants not randomly selected
    - Be careful generalizing results
  - Not clear if population dist. are normal. Data are not skewed.
  - Homoscedasticity
    - We will return to this later during calculations. Don't forget!

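2. Hypotheses
- Null: Foreign graduate students in Arts & Sciences, Education, Law, and Business all rate financial factors the same, on average.

$$\mu_1 = \mu_2 = \mu_3 = \mu_4$$

- Research: Foreign graduate students in Arts & Sciences, Education, Law, and Business do not all rate financial factors the same, on average.

$$\mu_1 \neq \mu_2 \neq \mu_3 \neq \mu_4$$

(that is, at least one mean differs from the others)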

3. Determine characteristics
- > 2 groups and interval DV: F distribution
- Arts & Sciences: df1 = 5 - 1 = 4
- Education: df2 = 4 - 1 = 3
- Law: df3 = 4 - 1 = 3
- Business: df4 = 4 - 1 = 3
- dfBetween: NGroups - 1 = 4 - 1 = 3
  - Numerator df
- dfWithin: df1 + df2 + df3 + df4 = 4 + 3 + 3 + 3 = 13
  - Denominator df
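The degrees-of-freedom bookkeeping above can be sketched in a few lines. This is only an illustration; the helper name `anova_degrees_of_freedom` is made up for this example, and the group sizes (5, 4, 4, 4) are the ones implied by the df values on the slide.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return (per-group df, df_between, df_within) for a between-groups one-way ANOVA."""
    df_groups = [n - 1 for n in group_sizes]   # each sample: N - 1
    df_between = len(group_sizes) - 1          # numerator df: NGroups - 1
    df_within = sum(df_groups)                 # denominator df: sum of per-group df
    return df_groups, df_between, df_within


# Arts & Sciences, Education, Law, Business
df_groups, df_between, df_within = anova_degrees_of_freedom([5, 4, 4, 4])
print(df_groups, df_between, df_within)
```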

4. Determine critical values
- p = .05
- dfBetween = 3
- dfWithin = 13
- FCritical = 3.41

5. Calculate the Test Statistic
- In order to do this, we need 2 measures of variance
  - Between-Groups Variance
  - Within-Groups Variance
- We will do this shortly

6. Make a Decision
- If our calculated test statistic exceeds our cutoff, we reject the null hypothesis and can say the following: Foreign students studying in the U.S. rate financial factors differently depending on the type of program in which they are enrolled.
- ANOVA does not tell us where our differences are!
  - We just know that there is a difference somewhere.

Logic of ANOVA: Quantifying Overlap

$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$$

- Whenever differences between sample means are large and differences between scores within each sample are small, the F statistic will be large.
- Remember that large test statistics indicate statistically significant results.

Logic of ANOVA: Quantifying Overlap
a) Large within-groups variability & small between-groups variability
b) Large within-groups variability & large between-groups variability
c) Small within-groups variability & small between-groups variability

Less Overlap!

Logic of ANOVA: Quantifying Overlap

$$F = \frac{\text{between-groups variance}}{\text{within-groups variance}}$$

- If between-groups = within-groups, F = 1
- Null hypothesis predicts F = 1
  - No differences between groups

variance based on means.
- Need correction.

Calculating the F Statistic: The Source Table
- Source Table: Presents the important calculations and final results of an ANOVA in a consistent and easy-to-read format.

Calculating the F Statistic: The Source Table
- Col. 1: The sources of variability
- Col. 2: Sum of Squares
- Col. 3: Degrees of freedom
- Col. 4: Mean Square: arithmetic average of squared deviations
- Col. 5: Value of test statistic, F ratio

$$MS_{Between} = \frac{SS_{Between}}{df_{Between}}$$

$$MS_{Within} = \frac{SS_{Within}}{df_{Within}}$$

$$F = \frac{MS_{Between}}{MS_{Within}}$$
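To make the column-by-column logic of the source table concrete, here is a minimal sketch that fills in MS and F from the SS and df columns. The function name `source_table` and the input numbers (SS of 30 and 60, df of 2 and 12) are invented purely for illustration and do not come from any example in these slides.

```python
def source_table(ss_between, df_between, ss_within, df_within):
    """Build the rows of a one-way ANOVA source table from SS and df."""
    ms_between = ss_between / df_between   # MS = SS / df
    ms_within = ss_within / df_within
    return {
        "Between": {"SS": ss_between, "df": df_between,
                    "MS": ms_between, "F": ms_between / ms_within},
        "Within": {"SS": ss_within, "df": df_within, "MS": ms_within},
        "Total": {"SS": ss_between + ss_within, "df": df_between + df_within},
    }


# Made-up inputs, chosen only so the arithmetic is easy to follow.
table = source_table(ss_between=30.0, df_between=2, ss_within=60.0, df_within=12)
for source, row in table.items():
    print(source, row)
```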

## Sums of Squared Deviations

Put all of your scores in one column, with samples denoted in another column.

[Table from 1st edition of textbook]

$$SS_{Total} = \sum (X - GM)^2$$

Grand Mean: Refers to the mean of all scores in a study, regardless of their sample.

$$GM = \frac{\sum X}{N_{Total}}$$

$$SS_{Within} = \sum (X - M)^2$$

Calculate the squared deviation of each score from its own particular sample mean.

## Sums of Squared Deviations

$$SS_{Between} = \sum (M - GM)^2$$

Calculate the squared deviation of each sample mean from the grand mean, once for every score in that sample.
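The three sums of squares can be checked against each other, because the total splits exactly into the between and within parts. The sketch below uses a tiny made-up data set (three groups of three scores, not from the slides) and a hypothetical helper `sums_of_squares`.

```python
def sums_of_squares(groups):
    """Return (GM, SS_total, SS_within, SS_between) for a list of score lists."""
    all_scores = [x for group in groups for x in group]
    gm = sum(all_scores) / len(all_scores)                 # grand mean
    ss_total = sum((x - gm) ** 2 for x in all_scores)
    ss_within = 0.0
    ss_between = 0.0
    for group in groups:
        m = sum(group) / len(group)                        # this sample's mean
        ss_within += sum((x - m) ** 2 for x in group)
        # (M - GM)^2 is counted once per score, i.e. len(group) times
        ss_between += len(group) * (m - gm) ** 2
    return gm, ss_total, ss_within, ss_between


# Made-up scores for illustration only.
gm, ss_total, ss_within, ss_between = sums_of_squares([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(gm, ss_total, ss_within, ss_between)
```

In this toy case the within and between parts (6 and 54) add up to the total (60), which is a handy arithmetic check when filling in a source table by hand.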

## What is our decision?

- Back to Step 1.
  - Homoscedasticity
    - We have met this assumption.

## What is our decision?

- Step 6. Make a decision

F = 3.94 > Fcrit = 3.41

- We can reject the null hypothesis. There is (are) a difference(s) somewhere.
  - Where?
- post-hoc test: Statistical procedure frequently carried out after we reject the null hypothesis in an ANOVA; it allows us to make multiple comparisons among several means.
  - post-hoc: Latin for "after this"
  - Examples: Tukey's HSD, Scheffé, Dunnett, Duncan, Bonferroni

1. Italic letter F: F
2. Open parenthesis: F(
3. Between-groups degrees of freedom, then a comma: F(dfBetween,
4. Within-groups degrees of freedom: F(dfBetween, dfWithin
5. Close parentheses, equal sign: F(dfBetween, dfWithin) =
6. Value of the test statistic, then a comma: F(dfBetween, dfWithin) = 1.23,
7. Lower case, italic letter p: F(dfBetween, dfWithin) = 1.23, p
8. Significant, less than .05: F(dfBetween, dfWithin) = 1.23, p < .05
   - OR non significant: F(dfBetween, dfWithin) = 1.23, p > .05
   - OR exact p value: F(dfBetween, dfWithin) = 1.23, p = .02
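The reporting steps above amount to a fixed string template. A minimal sketch follows, with a hypothetical helper `report_f` that covers only the "p < .05" and "p > .05" forms (italics are left to the word processor).

```python
def report_f(df_between, df_within, f_value, significant, alpha=0.05):
    """Format an F test result as: F(dfBetween, dfWithin) = value, p < .05"""
    # Drop the leading zero from alpha, e.g. 0.05 -> ".05"
    alpha_text = f"{alpha:.2f}".lstrip("0")
    comparison = "<" if significant else ">"
    return f"F({df_between}, {df_within}) = {f_value:.2f}, p {comparison} {alpha_text}"


# The foreign graduate student example: F = 3.94 with df = (3, 13)
print(report_f(3, 13, 3.94, significant=True))
```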

## Between-Subjects One Way ANOVA

Example: Memory for Emotional Stimuli

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

- Do you have differences in memory for emotional vs. neutral events?
- Do others have the same differences or is it something unique to you?
- Let's find out

positive, negative, or neutral pictures have differences in recall of those pure lists?
- Research Design: We asked 17 participants to study one single list of either 30 positive, 30 negative, or 30 neutral pictures (from IAPS). Following a brief delay, all participants were asked to recall as many of the 30 studied photos as they could. These data are on the following slide.

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

Already Stated: NTotal = 17, one IV with 3 levels (Emotion) is between-subjects.
Below are the proportions of pictures on their studied lists that each participant successfully recalled (100% = perfect memory):

| Negative | Neutral | Positive |
|---|---|---|
| 0.69 | 0.59 | 0.64 |
| 0.84 | 0.64 | 0.73 |
| 0.93 | 0.62 | 0.51 |
| 0.91 | 0.71 | 0.68 |
| 0.89 | 0.50 | 0.61 |
| 0.90 | 0.60 | |
| M = .86 | M = .61 | M = .634 |

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

NTotal = 17; NNeg = 6, NNeut = 6, NPos = 5
dfNeg = 5, dfNeut = 5, dfPos = 4; dfBetween = 2, dfWithin = 14
MNeg = .86, MNeut = .61, MPos = .634

1. Identify
- Population: All memories for negative, neutral, and positive events.
- Comparison Distribution: F distribution
- Test: One-Way Between-Subjects ANOVA
- Assumptions:
  - Participants were randomly selected from subject pool
  - Not clear if population dist. are normal. Data are not skewed.
  - Homoscedasticity

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

2. Hypotheses
- Null: On average, memories for negative, neutral, and positive pictures will not differ.

$$\mu_{Neg} = \mu_{Neut} = \mu_{Pos}$$

- Research: On average, memories for negative, neutral, and positive pictures will be different.

$$\mu_{Neg} \neq \mu_{Neut} \neq \mu_{Pos}$$

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

3. Determine characteristics
- F distribution

| | Negative | Neutral | Positive |
|---|---|---|---|
| M | .86 | .61 | .634 |
| s² | .00784 | .00472 | .00683 |

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

Digression: Test for Homoscedasticity

Rule: If sample sizes differ across conditions, the largest variance must not be more than twice (2x) the smallest variance.

| | Negative | Neutral | Positive |
|---|---|---|---|
| M | .86 | .61 | .634 |
| s² | .00784 | .00472 | .00683 |

$$.00784 < .00472 \times 2 = .00944$$
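The rule of thumb above is easy to check directly from the raw recall scores. The sketch below uses Python's `statistics.variance` (the sample variance, dividing by N - 1) on the three lists from the data slide; the variable names are just illustrative.

```python
from statistics import variance

# Proportion recalled, copied from the data slide.
scores = {
    "Negative": [0.69, 0.84, 0.93, 0.91, 0.89, 0.90],
    "Neutral": [0.59, 0.64, 0.62, 0.71, 0.50, 0.60],
    "Positive": [0.64, 0.73, 0.51, 0.68, 0.61],
}

# Sample variance (N - 1 in the denominator) for each condition.
variances = {name: variance(values) for name, values in scores.items()}
largest = max(variances.values())
smallest = min(variances.values())

# Rule of thumb: largest variance no more than twice the smallest.
meets_rule = largest <= 2 * smallest

for name, s2 in variances.items():
    print(f"{name}: s2 = {s2:.5f}")
print(f"largest / smallest = {largest / smallest:.2f}, rule met: {meets_rule}")
```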

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

4. Determine critical values
- dfBetween = 2
- dfWithin = 14
- Fcrit = 3.74
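Critical values normally come from an F table. As a cross-check for this particular example, the F distribution happens to have a simple closed form when the numerator df is exactly 2: the upper-tail probability is (1 + 2F/dfWithin)^(-dfWithin/2). The sketch below relies on that special case only (it does not work for other numerator df), assumes a p level of .05, and uses made-up function names.

```python
def f_upper_tail_df1_is_2(f_value: float, df_within: int) -> float:
    """P(F > f_value) for an F distribution with numerator df = 2 (special case only)."""
    return (1.0 + 2.0 * f_value / df_within) ** (-df_within / 2.0)


def f_critical_df1_is_2(alpha: float, df_within: int) -> float:
    """Cutoff with upper-tail area alpha, numerator df = 2 (inverse of the above)."""
    return (df_within / 2.0) * (alpha ** (-2.0 / df_within) - 1.0)


f_crit = f_critical_df1_is_2(alpha=0.05, df_within=14)
print(f"Fcrit(2, 14) at p = .05: {f_crit:.2f}")
print(f"tail area beyond that cutoff: {f_upper_tail_df1_is_2(f_crit, 14):.3f}")
```

The computed cutoff agrees with the tabled value of 3.74 to two decimals.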

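## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

5. Calculate a test statistic

| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | | 2 | | |
| Within | | 14 | | |
| Total | | 16 | | |

$$GM = \frac{\sum X}{N_{Total}}$$

$$SS_{Total} = \sum (X - GM)^2$$

$$SS_{Within} = \sum (X - M)^2$$

$$SS_{Between} = \sum (M - GM)^2$$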

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

5. Calculate a test statistic

$$SS_{Total} = \sum (X - GM)^2, \qquad GM = \frac{\sum X}{N_{Total}} = .7053$$

| X | (X - GM) | (X - GM)² |
|---|---|---|
| 0.69 | -0.02 | 0.0002 |
| 0.84 | 0.135 | 0.0181 |
| 0.93 | 0.225 | 0.0505 |
| 0.91 | 0.205 | 0.0419 |
| 0.89 | 0.185 | 0.0341 |
| 0.90 | 0.195 | 0.0379 |
| 0.59 | -0.12 | 0.0133 |
| 0.64 | -0.07 | 0.0043 |
| 0.62 | -0.09 | 0.0073 |
| 0.71 | 0.005 | 0.0 |
| 0.50 | -0.21 | 0.0421 |
| 0.60 | -0.11 | 0.0111 |
| 0.64 | -0.07 | 0.0043 |
| 0.73 | 0.025 | 0.0006 |
| 0.51 | -0.2 | 0.0381 |
| 0.68 | -0.03 | 0.0006 |
| 0.61 | -0.1 | 0.0091 |

SSTotal = .3135

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

$$SS_{Within} = \sum (X - M)^2$$

MNeg = .86, MNeut = .61, MPos = .634

| X | (X - M) | (X - M)² |
|---|---|---|
| 0.69 | -0.17 | 0.0289 |
| 0.84 | -0.02 | 0.0004 |
| 0.93 | 0.07 | 0.0049 |
| 0.91 | 0.05 | 0.0025 |
| 0.89 | 0.03 | 0.0009 |
| 0.90 | 0.04 | 0.0016 |
| 0.59 | -0.02 | 0.0004 |
| 0.64 | 0.03 | 0.0009 |
| 0.62 | 0.01 | 0.0001 |
| 0.71 | 0.1 | 0.01 |
| 0.50 | -0.11 | 0.0121 |
| 0.60 | -0.01 | 0.0001 |
| 0.64 | 0.006 | 0 |
| 0.73 | 0.096 | 0.0092 |
| 0.51 | -0.124 | 0.0154 |
| 0.68 | 0.046 | 0.0021 |
| 0.61 | -0.024 | 0.0006 |

SSWithin = .0901

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

5. Calculate a test statistic

$$SS_{Between} = \sum (M - GM)^2, \qquad GM = .7053$$

| X | M | (M - GM) | (M - GM)² |
|---|---|---|---|
| 0.69 | 0.86 | 0.155 | 0.024 |
| 0.84 | 0.86 | 0.155 | 0.024 |
| 0.93 | 0.86 | 0.155 | 0.024 |
| 0.91 | 0.86 | 0.155 | 0.024 |
| 0.89 | 0.86 | 0.155 | 0.024 |
| 0.90 | 0.86 | 0.155 | 0.024 |
| 0.59 | 0.61 | -0.1 | 0.009 |
| 0.64 | 0.61 | -0.1 | 0.009 |
| 0.62 | 0.61 | -0.1 | 0.009 |
| 0.71 | 0.61 | -0.1 | 0.009 |
| 0.50 | 0.61 | -0.1 | 0.009 |
| 0.60 | 0.61 | -0.1 | 0.009 |
| 0.64 | 0.634 | -0.07 | 0.005 |
| 0.73 | 0.634 | -0.07 | 0.005 |
| 0.51 | 0.634 | -0.07 | 0.005 |
| 0.68 | 0.634 | -0.07 | 0.005 |
| 0.61 | 0.634 | -0.07 | 0.005 |

SSBetween = .223

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

5. Calculate a test statistic

| Source | SS | df | MS | F |
|---|---|---|---|---|
| Between | .223 | 2 | .1115 | 17.969 |
| Within | .0901 | 14 | .0064 | |
| Total | ~.3135 | 16 | | |

$$MS_{Between} = \frac{SS_{Between}}{df_{Between}}$$

$$MS_{Within} = \frac{SS_{Within}}{df_{Within}}$$

$$F = \frac{MS_{Between}}{MS_{Within}}$$
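The whole source table can be recomputed from the raw recall scores as a check on the hand calculations. The sketch below is one possible way to do it in plain Python; the function name `one_way_anova` is hypothetical. Working from unrounded values, it gives an F of roughly 17.4, slightly lower than the 17.969 shown in the table (the hand-calculated entries were rounded along the way), and the decision against the cutoff of 3.74 is the same either way.

```python
def one_way_anova(groups):
    """Between-groups one-way ANOVA: returns the source-table quantities as a dict."""
    all_scores = [x for group in groups for x in group]
    gm = sum(all_scores) / len(all_scores)               # grand mean

    ss_between = 0.0
    ss_within = 0.0
    for group in groups:
        m = sum(group) / len(group)                      # sample mean
        ss_between += len(group) * (m - gm) ** 2         # (M - GM)^2 once per score
        ss_within += sum((x - m) ** 2 for x in group)    # (X - M)^2

    df_between = len(groups) - 1
    df_within = sum(len(group) - 1 for group in groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "GM": gm,
        "SS_between": ss_between, "SS_within": ss_within,
        "SS_total": ss_between + ss_within,
        "df_between": df_between, "df_within": df_within,
        "MS_between": ms_between, "MS_within": ms_within,
        "F": ms_between / ms_within,
    }


negative = [0.69, 0.84, 0.93, 0.91, 0.89, 0.90]
neutral = [0.59, 0.64, 0.62, 0.71, 0.50, 0.60]
positive = [0.64, 0.73, 0.51, 0.68, 0.61]

result = one_way_anova([negative, neutral, positive])
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```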

## Between-Subjects One Way ANOVA: Memory for Emotional Stimuli

6. Make a decision

F = 17.97 > Fcrit = 3.74

Recall of negative, neutral, and positive pictures was different, F(2, 14) = 17.97, p < .05.
But which pictures were remembered best? Worst?

Hindsight is 20-20
- Your data may suggest a new relationship, and thus new analyses, although...
- Theory should guide research, and thus comparisons should be decided on before you conduct your experiment.

## Planned & A Priori Comparisons

- Based on literature review
- Theoretical
- Planned comparisons
  - A test that is conducted when there are multiple groups of scores, but specific comparisons have been specified prior to data collection.
- A Priori Comparisons

## Planned & A Priori Comparisons

- If you have planned comparisons
  - Just run t tests
  - Subjective Decision about p value
    - p = .05?
    - p = .01?
    - Bonferroni Correction?

## Post Hoc Tukey HSD

Post-Hoc:
- Tukey Honestly Significant Difference
  - Determines differences between means in terms of standard error
  - Honest because we adjust for making multiple comparisons
  - The HSD is compared to a critical value
- Overview
  1. Calculate differences between a pair of means
  2. Divide this difference by the standard error

* Basically this is a variant of a t test *
Oh no, that means the six steps again... sort of.

Tukey HSD

$$HSD = \frac{M_1 - M_2}{s_M}$$

$$t = \frac{M_1 - M_2}{s_{Difference}}$$

The standard error, $s_M$, is calculated differently depending on whether your sample sizes are equal or not.

Tukey HSD
- Equal Sample Sizes

$$s_M = \sqrt{\frac{MS_{Within}}{N}}$$

N = Sample size within each group

- Unequal Sample Sizes

$$s_M = \sqrt{\frac{MS_{Within}}{N'}}, \qquad N' = \frac{N_{Groups}}{\sum \frac{1}{N}}$$
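A short sketch of the unequal-sample-size case: N' is the harmonic mean of the group sizes, and it replaces N in the standard error. The helper names are hypothetical; the inputs (group sizes 6, 6, 5 and MSWithin = .0064) are taken from the memory example's source table.

```python
from math import sqrt


def harmonic_n(group_sizes):
    """N' = NGroups / sum(1 / N): the harmonic mean of the group sizes."""
    return len(group_sizes) / sum(1.0 / n for n in group_sizes)


def tukey_standard_error(ms_within, group_sizes):
    """s_M = sqrt(MSWithin / N'); when all sizes are equal, N' is just N."""
    return sqrt(ms_within / harmonic_n(group_sizes))


n_prime = harmonic_n([6, 6, 5])
s_m = tukey_standard_error(0.0064, [6, 6, 5])
print(f"N' = {n_prime:.3f}, s_M = {s_m:.3f}")
```

Because the harmonic mean of equal numbers is that number, the same function also covers the equal-sample-size formula.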

Tukey HSD
- Determine Critical Value from Table
- Make a Decision
- Let's go back to our memory for emotional pictures example

Tukey HSD: Example
- Memory for Emotional Pictures Example: Between-Subjects One Way ANOVA
- Decision: Recall of negative, neutral, and positive pictures was different, F(2, 14) = 17.97, p < .05.
- Where are our differences?
- Let's get our qcrit first

Tukey HSD: Example

NTotal = 17; NNeg = 6, NNeut = 6, NPos = 5
dfNeg = 5, dfNeut = 5, dfPos = 4
dfBetween = 2 (k = 3), dfWithin = 14
MNeg = .86, MNeut = .61, MPos = .634

qcrit = 3.70


Tukey HSD: Example
- Standard Error: Unequal Sample Sizes

$$N' = \frac{N_{Groups}}{\sum \frac{1}{N}} = \frac{3}{\frac{1}{6} + \frac{1}{6} + \frac{1}{5}} = \frac{3}{.533} = 5.625$$

$$s_M = \sqrt{\frac{MS_{Within}}{N'}} = \sqrt{\frac{.0064}{5.625}} = \sqrt{.0011378} = 0.034$$

Tukey HSD: Example
- Negative (M = 0.86) vs. Neutral (M = 0.61)

$$HSD = \frac{M_1 - M_2}{s_M} = \frac{.86 - .61}{.034} = 7.35$$

- Negative (M = 0.86) vs. Positive (M = 0.634)

$$HSD = \frac{M_1 - M_2}{s_M} = \frac{.86 - .634}{.034} = 6.65$$

- Neutral (M = 0.61) vs. Positive (M = 0.634)

$$HSD = \frac{M_1 - M_2}{s_M} = \frac{.61 - .634}{.034} = -0.71$$
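The three comparisons can be run in a loop. This sketch takes the rounded standard error (.034) and the critical value (qcrit = 3.70) as given in the example, and compares the absolute value of each HSD with the cutoff, since the sign only reflects which mean was listed first.

```python
from itertools import combinations

means = {"Negative": 0.86, "Neutral": 0.61, "Positive": 0.634}
s_m = 0.034      # standard error from the example (rounded)
q_crit = 3.70    # critical value from the example (k = 3, dfWithin = 14)

results = {}
for (name_1, m_1), (name_2, m_2) in combinations(means.items(), 2):
    hsd = (m_1 - m_2) / s_m
    # A pair differs when |HSD| exceeds the critical value.
    results[(name_1, name_2)] = (hsd, abs(hsd) > q_crit)

for (name_1, name_2), (hsd, significant) in results.items():
    verdict = "different" if significant else "not different"
    print(f"{name_1} vs. {name_2}: HSD = {hsd:.2f} -> {verdict}")
```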

Tukey HSD: Example
- Make a Decision
  - Post hoc comparisons using the Tukey HSD test revealed that negative pictures were better remembered (M = .86) than either positive (M = .634) or neutral (M = .61) pictures, with no differences between the latter two.

Bonferroni Correction
An alternative post-hoc strategy

Bonferroni Correction
- Remember the problem of too many tests? (Fishing Expedition)
  - Inflates the risk of a Type I error.
  - False positives
- Is there a way to address that without a new test?
  - We've hinted at it already
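In its usual form, the Bonferroni correction divides the overall p level by the number of comparisons and uses the result as the cutoff for each individual test. A minimal sketch follows, assuming three pairwise comparisons at an overall level of .05; the function names are hypothetical, and the familywise check assumes independent tests.

```python
def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison p level that keeps the overall Type I error rate near alpha."""
    return alpha / n_comparisons


def familywise_error_rate(alpha_per_test: float, n_tests: int) -> float:
    """P(at least one Type I error) across n_tests independent tests."""
    return 1.0 - (1.0 - alpha_per_test) ** n_tests


per_test = bonferroni_alpha(0.05, 3)
print(f"per-comparison p level: {per_test:.4f}")
print(f"uncorrected familywise rate: {familywise_error_rate(0.05, 3):.3f}")
print(f"corrected familywise rate:   {familywise_error_rate(per_test, 3):.3f}")
```

The trade-off is a stricter cutoff for every comparison, which makes each individual test harder to pass.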

Summary
- Between-Subjects One Way ANOVA
  - Two Sources of Variance
  - New Sums of Squares
  - New df
  - Homoscedasticity
- The problem of too many tests
- Source Table
- Post-Hoc tests
  - Tukey's HSD
  - Bonferroni
  - LSD
  - etc.