
The Paired t-Test
(a.k.a. Dependent Samples t-Test, or t-Test for Correlated Groups)
When to use the paired t-test
• In many research designs, it is helpful to measure
the same people more than once.
• A common example is testing for performance
improvements (or decrements) over time.
• However, in any circumstance where multiple
measurements are made on the same person (or
“experimental unit”), it may be useful to observe if
there are mean differences between these
measurements.
• The paired t-test will show whether the
differences observed in the 2 measures will
be found reliably in repeated samples.

Example
• In this example, we will look at the throwing
distance for junior varsity javelin toss (in
meters).
• Five players are selected at random from
the entire league.
• We are interested in the following research
question:
• Do players improve on their distance
between the pre and post season?
• The average throwing distance, in both pre
and post season, is recorded in columns
1 and 2 (see next slide) for each of 5
people (P1-P5):
Player Pre Post
P1 1 2
P2 2 4.5
P3 2 3
P4 2 4.5
P5 3 6

Hypothesis
• Ho: Mean throwing distances are equal before and after the season
• Ha: Mean throwing distances are different
Choose significance level
• Two-tailed test
• Common choices: α = 1%, α = 5%, α = 10%
Calculating t-value

t = (x̄1 − x̄2) / sqrt[ (s²x1 + s²x2 − 2·rx1x2·sx1·sx2) / (n − 1) ],  df = n − 1

where rx1x2 is the correlation between the two measurements.
Example

Column 1      Column 2       Column 3      Column 4
X1: Pre       X2: Post       (x1 − x̄1)²    (x2 − x̄2)²
P1: 1         2              1             4
P2: 2         4.5            0             0.25
P3: 2         3              0             1
P4: 2         4.5            0             0.25
P5: 3         6              1             4
x̄1 = 2        x̄2 = 4

s²x = Σ(x − x̄)² / n  →  s²x1 = 0.4,  s²x2 = 1.9
Example
• Second, we need to find the correlation between the pre- and post-season distances (rx1x2), i.e. between columns 1 and 2.
• rx1x2 = 0.9177.
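This correlation can be checked with NumPy; a minimal sketch using the five players' distances from the table above:

```python
import numpy as np

# Pre- and post-season throwing distances for the five players (P1-P5)
pre = np.array([1, 2, 2, 2, 3])
post = np.array([2, 4.5, 3, 4.5, 6])

# Pearson correlation between the two measurements
r = np.corrcoef(pre, post)[0, 1]
print(round(r, 4))  # 0.9177
```

The ratio is the same whether variances are computed with n or n − 1 in the denominator, so this matches the slide's value exactly.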
Example (cont.)

t = (2 − 4) / sqrt[ (0.4 + 1.9 − 2(0.9177)(0.6325)(1.3784)) / (5 − 1) ] = −4.78,  df = 4
Conclusion
• For a two-tailed test with df = 4 and α = .05, the critical value is t = 2.776. The absolute value of our “t” at 4.78 is greater than the critical value, therefore we reject the null hypothesis.
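The whole paired test can be reproduced with SciPy; a minimal sketch using the javelin data above:

```python
from scipy import stats

# Pre- and post-season throwing distances for the five players (P1-P5)
pre = [1, 2, 2, 2, 3]
post = [2, 4.5, 3, 4.5, 6]

# Paired (dependent-samples) t-test: tests whether the mean difference is zero
t, p = stats.ttest_rel(pre, post)
print(round(t, 2))  # -4.78
```

SciPy works from the differences directly, which is algebraically equivalent to the correlation-based formula on the earlier slide; the two-sided p-value it returns is below .05, matching the rejection above.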
INDEPENDENT SAMPLES T-
TEST (OR 2-SAMPLE T-TEST)

WHEN TO USE THE
INDEPENDENT SAMPLES T-TEST
• The independent samples t-test is probably the
single most widely used test in statistics.
• It is used to compare differences between
separate groups.
• In Psychology, these groups are often composed
by randomly assigning research participants to
conditions.
• However, this test can also be used to explore
differences in naturally occurring groups.
• For example, we may be interested in differences in emotional intelligence between males and females.
WHEN TO USE THE
INDEPENDENT SAMPLES T-TEST
(CONT.)
• Any differences between groups can be explored with the independent t-test, as long as the tested members of each group are reasonably representative of the population. [1]

[1] There are some technical requirements as well. Principally, each variable must come from a normal (or nearly normal) distribution.
EXAMPLE
• Suppose we put people on 2 diets:
the pizza diet and the beer diet.
• Participants are randomly assigned to either
1-week of eating exclusively pizza or 1-week
of exclusively drinking beer.

EXAMPLE (CONT.)
• At the end of the week, we measure weight gain by each participant.
• Which diet causes more weight gain?
• In other words, the null hypothesis is:

Ho: wt. gain pizza diet = wt. gain beer diet
Person Pizza Beer
P1 1 3
P2 2 4
P3 2 4
P4 2 4
P5 3 5
EXAMPLE
• Ho: NO differences in weight gain between these 2 diets.
• Ha: there ARE differences in weight gain between the 2 diets.
• α = 5%
FORMULA
The formula for the independent samples t-test is:

t = (x̄1 − x̄2) / sqrt[ s²x1/(n1 − 1) + s²x2/(n2 − 1) ],  df = (n1 − 1) + (n2 − 1)
EXAMPLE

Column 1      Column 2      Column 3      Column 4
X1: Pizza     X2: Beer      (x1 − x̄1)²    (x2 − x̄2)²
1             3             1             1
2             4             0             0
2             4             0             0
2             4             0             0
3             5             1             1
x̄1 = 2        x̄2 = 4

s²x = Σ(x − x̄)² / n  →  s²x1 = 0.4,  s²x2 = 0.4
EXAMPLE (CONT.)
• From the calculations previously, we have everything that is needed to find the “t.”

t = (2 − 4) / sqrt[ 0.4/4 + 0.4/4 ] = −4.47,  df = (5 − 1) + (5 − 1) = 8

After calculating the “t” value, we need to know if it is large enough to reject the null hypothesis.
EXAMPLE
• The calculated t-value of −4.47 is larger in magnitude than the C.V. of 2.31 (two-tailed, df = 8, α = .05), therefore we can reject the null hypothesis.
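This two-sample test can also be verified with SciPy; a minimal sketch using the weight-gain data for the two diet groups above:

```python
from scipy import stats

# Weight gain for the two diet groups (5 participants each)
group1 = [1, 2, 2, 2, 3]
group2 = [3, 4, 4, 4, 5]

# Independent (two-sample) t-test, assuming equal variances (pooled)
t, p = stats.ttest_ind(group1, group2, equal_var=True)
print(round(t, 2))  # -4.47
```

The `equal_var=True` (pooled) form is the one that matches the slide's formula; with unequal sample variances, Welch's version (`equal_var=False`) would be the safer default.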
ONE-WAY ANOVA
ONE-WAY ANALYSIS OF VARIANCE
ONE-WAY ANOVA
• The one-way analysis of variance is used to test the claim that three or more population means are equal
• This is an extension of the two independent samples t-test
ONE-WAY ANOVA
• The response variable is the variable you’re comparing
• The factor variable is the categorical variable being used to define the groups
• We will assume k samples (groups)
• one-way is because we have only one factor
• Examples include comparisons by gender, race, political party, color, etc.
ONE-WAY ANOVA

• Conditions or Assumptions
• The data are randomly sampled
• The variances of each sample are assumed equal
• The residuals are normally distributed
ONE-WAY ANOVA
• The null hypothesis is that the means are all equal:

H0: μ1 = μ2 = μ3 = … = μk

• The alternative hypothesis is that at least one of the means is different
ONE-WAY ANOVA
• The statistics classroom is divided into three rows: front, middle, and back
• The instructor noticed that the further the students were from him, the more likely they were to miss class or use an instant messenger during class
• He wanted to see if the students further away did worse on the exams
ONE-WAY ANOVA
The ANOVA doesn’t test that one mean is less than another, only whether they’re all equal or at least one is different.

H0: μF = μM = μB
ONE-WAY ANOVA
• A random sample of the students in each row was taken
• The score for those students on the second exam was recorded
• Front: 82, 83, 97, 93, 55, 67, 53
• Middle: 83, 78, 68, 61, 77, 54, 69, 51, 63
• Back: 38, 59, 55, 66, 45, 52, 52, 61
ONE-WAY ANOVA
The summary statistics for the grades of each row are shown in the table below

Row          Front    Middle   Back
Sample size  7        9        8
Mean         75.71    67.11    53.50
St. Dev      17.63    10.95    8.96
Variance     310.90   119.86   80.29
ONE-WAY ANOVA

• Variation
• Variation is the sum of the squares of the deviations
between a value and the mean of the value
• Sum of Squares is abbreviated by SS and often followed
by a variable in parentheses such as SS(B) or SS(W) so
we know which sum of squares we’re talking about
ONE-WAY ANOVA

• Are all of the values identical?


• No, so there is some variation in the data
• This is called the total variation
• Denoted SS(Total) for the total Sum of Squares (variation)
• Sum of Squares is another name for variation
ONE-WAY ANOVA

• Are all of the sample means identical?


• No, so there is some variation between the groups
• This is called the between group variation
• Sometimes called the variation due to the factor
• Denoted SS(B) for Sum of Squares (variation) between
the groups
ONE-WAY ANOVA

• Are each of the values within each group identical?


• No, there is some variation within the groups
• This is called the within group variation
• Sometimes called the error variation
• Denoted SS(W) for Sum of Squares (variation) within the
groups
ONE-WAY ANOVA

• There are two sources of variation


• the variation between the groups, SS(B), or the variation
due to the factor
• the variation within the groups, SS(W), or the variation
that can’t be explained by the factor so it’s called the error
variation
ONE-WAY ANOVA

• Here is the basic one-way ANOVA table

Source SS df MS F p

Between

Within

Total
ONE-WAY ANOVA
• Grand Mean
• The grand mean is the average of all the values when the factor is ignored
• It is a weighted average of the individual sample means:

x̄ = Σ ni·x̄i / Σ ni = (n1·x̄1 + n2·x̄2 + … + nk·x̄k) / (n1 + n2 + … + nk)
ONE-WAY ANOVA
• Grand Mean for our example is 65.08

x̄ = (7(75.71) + 9(67.11) + 8(53.50)) / (7 + 9 + 8) = 1562 / 24 = 65.08
ONE-WAY ANOVA
• Between Group Variation, SS(B)
• The between group variation is the variation between each sample mean and the grand mean
• Each individual variation is weighted by the sample size:

SS(B) = Σ ni·(x̄i − x̄)² = n1(x̄1 − x̄)² + n2(x̄2 − x̄)² + … + nk(x̄k − x̄)²
ONE-WAY ANOVA
The Between Group Variation for our example is SS(B) = 1902

SS(B) = 7(75.71 − 65.08)² + 9(67.11 − 65.08)² + 8(53.50 − 65.08)²
SS(B) = 1900.8376 ≈ 1902

I know that doesn’t round to 1902, but if you don’t round the intermediate steps, then it does. My goal here is to show an ANOVA table from MINITAB and it returns 1902.
ONE-WAY ANOVA

• Within Group Variation, SS(W)


• The Within Group Variation is the weighted total of
the individual variations
• The weighting is done with the degrees of freedom
• The df for each sample is one less than the sample
size for that sample.
ONE-WAY ANOVA
Within Group Variation

SS(W) = Σ dfi·si² = df1·s1² + df2·s2² + … + dfk·sk²
ONE-WAY ANOVA
• The within group variation for our example is 3386

SS(W) = 6(310.90) + 8(119.86) + 7(80.29)
SS(W) = 3386.31 ≈ 3386
ONE-WAY ANOVA

• After filling in the sum of squares, we have …

Source SS df MS F p

Between 1902

Within 3386

Total 5288
ONE-WAY ANOVA

• Degrees of Freedom, df
• A degree of freedom occurs for each value that can
vary before the rest of the values are predetermined
• For example, if you had six numbers that had an
average of 40, you would know that the total had to
be 240. Five of the six numbers could be anything,
but once the first five are known, the last one is fixed
so the sum is 240. The df would be 6-1=5
• The df is often one less than the number of values
ONE-WAY ANOVA
• The between group df is one less than the number of groups
• We have three groups, so df(B) = 2
• The within group df is the sum of the individual df’s of each group
• The sample sizes are 7, 9, and 8
• df(W) = 6 + 8 + 7 = 21
• The total df is one less than the sample size
• df(Total) = 24 – 1 = 23
ONE-WAY ANOVA

• Filling in the degrees of freedom gives this …

Source SS df MS F p

Between 1902 2

Within 3386 21

Total 5288 23
ONE-WAY ANOVA
• Variances
• The variances are also called the Mean of the Squares and abbreviated by MS, often with an accompanying variable MS(B) or MS(W)
• They are an average squared deviation from the mean and are found by dividing the variation by the degrees of freedom:

MS = SS / df  (Variance = Variation / df)
ONE-WAY ANOVA

• MS(B) = 1902 / 2 = 951.0


• MS(W) = 3386 / 21 = 161.2
• MS(T) = 5288 / 23 = 229.9
• Notice that the MS(Total) is NOT the sum of MS(Between)
and MS(Within).
• This works for the sum of squares SS(Total), but not the
mean square MS(Total)
• The MS(Total) isn’t usually shown
ONE-WAY ANOVA

• Completing the MS gives …

Source SS df MS F p

Between 1902 2 951.0

Within 3386 21 161.2

Total 5288 23 229.9


ONE-WAY ANOVA
• Special Variances
• The MS(Within) is also known as the pooled estimate of the variance since it is a weighted average of the individual variances
• Sometimes abbreviated s²p
• The MS(Total) is the variance of the response variable.
• Not technically part of the ANOVA table, but useful nonetheless
ONE-WAY ANOVA

• F test statistic
• An F test statistic is the ratio of two sample variances
• The MS(B) and MS(W) are two sample variances and
that’s what we divide to find F.
• F = MS(B) / MS(W)
• For our data, F = 951.0 / 161.2 = 5.9
ONE-WAY ANOVA

• Adding F to the table …

Source SS df MS F p

Between 1902 2 951.0 5.9

Within 3386 21 161.2

Total 5288 23 229.9


ONE-WAY ANOVA

• The F test is a right tail test


• The F test statistic has an F distribution with df(B)
numerator df and df(W) denominator df
• The p-value is the area to the right of the test
statistic
• P(F2,21 > 5.9) = 0.009
ONE-WAY ANOVA

• Completing the table with the p-value

Source SS df MS F p

Between 1902 2 951.0 5.9 0.009

Within 3386 21 161.2

Total 5288 23 229.9


ONE-WAY ANOVA
• The p-value is 0.009, which is less than the significance level of 0.05, so we reject the null hypothesis.
• The means of the three rows are not all equal.
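The completed ANOVA table can be checked in one call with SciPy; a minimal sketch using the classroom data:

```python
from scipy import stats

# Exam scores by classroom row
front = [82, 83, 97, 93, 55, 67, 53]
middle = [83, 78, 68, 61, 77, 54, 69, 51, 63]
back = [38, 59, 55, 66, 45, 52, 52, 61]

# One-way ANOVA: tests H0 that all three row means are equal
f, p = stats.f_oneway(front, middle, back)
print(round(f, 1))  # 5.9  (p ≈ 0.009, matching the table above)
```

`f_oneway` computes exactly the MS(B)/MS(W) ratio built up in the preceding slides; it does not report the sums of squares themselves, so the manual table is still useful for seeing where F comes from.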
TWO-WAY ANOVA
TWO-WAY ANOVA
• So far, our ANOVA problems had only one dependent variable and one independent variable (factor). (e.g. compare gas mileage across different brands)
• What if we want to use two or more independent variables? (e.g. compare gas mileage across different brands of cars and in different states)
• We will only look at the case of two independent variables, but the process is the same for a larger number of independent variables.
• When we are examining the effect of two independent variables, this is called a Two-Way ANOVA.
TWO-WAY ANOVA
• In a Two-way ANOVA, the effects of two factors can be investigated simultaneously. Two-way ANOVA permits the investigation of the effects of either factor alone (e.g. the effect of brand of car on the gas mileage, and the effect of the state on the gas mileage) and also the two factors together (e.g. the combined effect of the model of the car and the effect of state on gas mileage).
• This ability to look at both factors together is the advantage of a Two-Way ANOVA compared to two One-Way ANOVAs (one for each factor)
TWO-WAY ANOVA
• The effect on the population mean (of
the dependent variable) that can be
attributed to the levels of either factor
alone is called a main effect. This is
what you would detect using two
separate one-way ANOVA’s.
HYPOTHESES
• Two questions are answered by a Two-way ANOVA
• Is there any effect of Factor A on the outcome? (Main Effect of A).
• Is there any effect of Factor B on the outcome? (Main Effect of B).
• This means that we will have two sets of hypotheses, one set for each question.
Hypotheses
1) Main effect of Factor A:
H0: α1 = α2 = α3 = … = αa = 0, or αi = 0 for all i = 1 to a
(a = # of levels of A)
H1: not all αi are 0, or at least one αi ≠ 0

2) Main effect of Factor B:
H0: β1 = β2 = β3 = … = βb = 0, or βj = 0 for all j = 1 to b
(b = # of levels of B)
H1: not all βj are 0, or at least one βj ≠ 0
HYPOTHESES
Effect of brand of car (Factor A) and tire (Factor B) on gas mileage (dependent variable).

1) Main effect of Car Brand:
H0: There is no difference in average gas mileage across different brands of cars.
H1: There are differences in average gas mileage across different brands of cars.

2) Main effect of Tire Brand:
H0: There is no difference in average gas mileage across different brands of tires.
H1: There are differences in average gas mileage across different brands of tires.
SUM SQUARES
• In One-way ANOVA, the relationship between the sums of squares was:

SST = SS(B) + SS(W)

• In Two-way ANOVA, we have two factors, which means we have separate treatment levels for those two factors. Thus the relationship becomes:

SST = SSA + SSB + SSE

Where:
SSA: Variation between different levels of factor A
SSB: Variation between different levels of factor B
SSE: Error (unexplained) variation
Mean Squares and F value
Mean Squares:                  F-calculated:
MSA = SSA / (a − 1)            FA = MSA / MSE
MSB = SSB / (b − 1)            FB = MSB / MSE
MSE = SSE / (a − 1)(b − 1)
Two Way ANOVA Table

Source of   Sum of
Variation   Squares  df          Mean Square             F-ratio       F-critical
Factor A    SSA      a-1         MSA = SSA/(a-1)         F = MSA/MSE   F[(a-1),(a-1)(b-1)]
Factor B    SSB      b-1         MSB = SSB/(b-1)         F = MSB/MSE   F[(b-1),(a-1)(b-1)]
Error       SSE      (a-1)(b-1)  MSE = SSE/(a-1)(b-1)
TOTAL       SST      ab-1

a: Number of treatment levels (categories) for Factor A.
b: Number of treatment levels (categories) for Factor B.
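As a sketch of how this table is filled in, here is a small worked example in plain Python. The mileage numbers are invented for illustration (3 hypothetical car brands × 2 tire brands, one observation per cell):

```python
# Hypothetical gas mileage: rows = 3 car brands (Factor A),
# columns = 2 tire brands (Factor B), one observation per cell
data = [[20, 25],
        [22, 26],
        [27, 30]]
a, b = len(data), len(data[0])

grand = sum(sum(row) for row in data) / (a * b)
row_means = [sum(row) / b for row in data]
col_means = [sum(data[i][j] for i in range(a)) / a for j in range(b)]

# Sums of squares from the decomposition SST = SSA + SSB + SSE
ssa = b * sum((m - grand) ** 2 for m in row_means)
ssb = a * sum((m - grand) ** 2 for m in col_means)
sst = sum((x - grand) ** 2 for row in data for x in row)
sse = sst - ssa - ssb

msa, msb = ssa / (a - 1), ssb / (b - 1)
mse = sse / ((a - 1) * (b - 1))
print(ssa, ssb, sse)         # 39.0 24.0 1.0
print(msa / mse, msb / mse)  # F-ratios: 39.0 48.0
```

Each F-ratio would then be compared against the F-critical value from the table row above, with (a−1)(b−1) denominator degrees of freedom.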
TWO WAY ANOVA WITH
INTERACTIONS HYPOTHESES
• Three questions are answered by a Two-way ANOVA with Interactions
• Is there any effect of Factor A on the outcome? (Main Effect of A).
• Is there any effect of Factor B on the outcome? (Main Effect of B).
• Is there any effect of the interaction of Factor A and Factor B on the outcome? (Interactive Effect of AB)
• This means that we will have three sets of hypotheses, one set for each question.
Hypotheses
1) Main effect of Factor A:
H0: α1 = α2 = α3 = … = αa = 0, or αi = 0 for all i = 1 to a
(a = # of levels of A)
H1: not all αi are 0, or at least one αi ≠ 0

2) Main effect of Factor B:
H0: β1 = β2 = β3 = … = βb = 0, or βj = 0 for all j = 1 to b
(b = # of levels of B)
H1: not all βj are 0, or at least one βj ≠ 0

3) Interactive Effect of AB:
H0: (αβ)ij = 0 for all i and j
H1: the (αβ)ij are not all 0
Two Way ANOVA with Interactions Table

Source of    Sum of
Variation    Squares  df          Mean Square              F-ratio        F-critical
Factor A     SSA      a-1         MSA = SSA/(a-1)          F = MSA/MSE    F[(a-1),ab(n-1)]
Factor B     SSB      b-1         MSB = SSB/(b-1)          F = MSB/MSE    F[(b-1),ab(n-1)]
Interaction  SSAB     (a-1)(b-1)  MSAB = SSAB/(a-1)(b-1)   F = MSAB/MSE   F[(a-1)(b-1),ab(n-1)]
Error        SSE      ab(n-1)     MSE = SSE/ab(n-1)
TOTAL        SST      abn-1

a: Number of treatment levels (categories) for Factor A.
b: Number of treatment levels (categories) for Factor B.
n: number of observations per cell
Mean Squares and F value
Mean Squares:                    F-calculated:
MSA = SSA / (a − 1)              FA = MSA / MSE
MSB = SSB / (b − 1)              FB = MSB / MSE
MSAB = SSAB / (a − 1)(b − 1)     FAB = MSAB / MSE
MSE = SSE / ab(n − 1)
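These formulas can be sketched on a tiny balanced design with replication. The data below are invented for illustration (2 levels of A × 2 levels of B, n = 2 observations per cell):

```python
# Hypothetical data: Factor A (2 levels) x Factor B (2 levels),
# n = 2 observations per cell; values invented for illustration
cells = {(0, 0): [10, 12], (0, 1): [14, 16],
         (1, 0): [20, 22], (1, 1): [16, 18]}
a, b, n = 2, 2, 2

mean = lambda xs: sum(xs) / len(xs)
cell_mean = {k: mean(v) for k, v in cells.items()}
grand = mean([x for v in cells.values() for x in v])
a_mean = [mean([x for (i, j), v in cells.items() if i == ai for x in v]) for ai in range(a)]
b_mean = [mean([x for (i, j), v in cells.items() if j == bj for x in v]) for bj in range(b)]

# Decomposition SST = SSA + SSB + SSAB + SSE
ssa = b * n * sum((m - grand) ** 2 for m in a_mean)
ssb = a * n * sum((m - grand) ** 2 for m in b_mean)
ssab = n * sum((cell_mean[i, j] - a_mean[i] - b_mean[j] + grand) ** 2
               for i in range(a) for j in range(b))
sse = sum((x - cell_mean[i, j]) ** 2 for (i, j), v in cells.items() for x in v)

print(ssa, ssb, ssab, sse)  # 72.0 0.0 32.0 8.0
mse = sse / (a * b * (n - 1))
print(ssab / ((a - 1) * (b - 1)) / mse)  # F for the interaction: 16.0
```

Note how this data set was chosen so the B main effect is zero while the interaction is large: whether tire B1 or B2 gives better mileage flips between the two car brands, which is exactly the pattern only the interaction term can detect.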
