
Introduction to hypothesis testing and confidence intervals

One-sample Student's t-test and confidence interval

Jon Michael Gran


Department of Biostatistics, UiO

MF9130 Introductory course in statistics


Wednesday 25.05.2011


Overview

Aalen chapters 8.1-8.5, Kirkwood and Sterne chapters 4, 6 and 7


Properties of the mean value
The Student t-distribution
Confidence interval for the mean
One sample t-test
Paired data


Properties of the mean value

Mean
The (arithmetic) sample mean \bar{X} is the sum of all observations divided by the number of observations:

\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i,

where n is the sample size

It is an estimate of the population mean

Median
Another measure of the average value is the sample median \tilde{X}. This is the middle observation when all observations are arranged in increasing order:

\tilde{X} = \begin{cases} Y_{((n+1)/2)} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left(Y_{(n/2)} + Y_{(n/2+1)}\right) & \text{if } n \text{ is even} \end{cases}

where Y_{(1)}, \ldots, Y_{(n)} are the ascending ordered observations X_1, \ldots, X_n, and n is the sample size

Mode
The mode is the most frequently occurring value in the sample

Example: 4.1 in Kirkwood & Sterne


We have measurements of the plasma volumes (in litres) of eight
healthy adult males.
Subject          1     2     3     4     5     6     7     8
Plasma volume  2.75  2.86  3.37  2.76  2.62  3.49  3.05  3.12

We find that the sample mean is given by

\bar{X} = \frac{1}{8}(2.75 + 2.86 + \ldots + 3.12) = 3.00,

and the sample median is given by

\tilde{X} = \frac{1}{2}(2.86 + 3.05) = 2.96

Since all the values are different, there is no estimate of the mode
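
The same summaries are quick to verify in software. A minimal sketch in Python (not part of the original slides), using the plasma volumes above:

```python
# Sample mean and median of the eight plasma volumes from Example 4.1
from statistics import mean, median

plasma_volumes = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]

print(mean(plasma_volumes))    # about 3.00 litres
print(median(plasma_volumes))  # about 2.96 litres (average of the two middle values)
```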


Choice of measure
The choice of measure for the average value depends on the type of distribution

Measure           When to use
arithmetic mean   usually the preferred measure
median            when there are one or two extremely high or low values
mode              seldom used

The mean, median and mode are equal when the distribution is symmetrical and unimodal

Variation
Measures of variation are used to indicate the spread of the values
in a distribution


Range and interquartile range


The range is the difference between the largest and smallest values in the sample:

R = Y_{(n)} - Y_{(1)},

where Y_{(1)} = \min(X) and Y_{(n)} = \max(X)

The interquartile range is the difference between the upper and lower quartiles:

IQR = Q_3 - Q_1,

where Q_1 and Q_3 are the lower and upper quartiles respectively. It indicates the spread of the middle 50% of the distribution

Variance
The population variance \sigma^2 may be estimated by the empirical variance s^2. It is found by averaging the squares of the deviations of the observations from the sample mean:

s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1},

where (n - 1) is called the number of degrees of freedom (d.f.) of the variance

Standard deviation
The population standard deviation \sigma is found as the square root of the variance. It may be estimated by the empirical standard deviation s, which is the square root of the empirical variance:

s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2 / n}{n - 1}}

When the underlying population corresponds to a normal distribution we have that:
- about 70% of the observations lie within one standard deviation of their mean
- about 95% of the observations lie within two standard deviations of their mean

Example: 4.2 in Kirkwood & Sterne


We want to calculate the standard deviation of the eight plasma volume measurements of Example 4.1 in Kirkwood & Sterne. The columns below give the plasma volume X, the deviation from the mean X - \bar{X}, the squared deviation (X - \bar{X})^2, and the squared observation X^2.

             X   X - \bar{X}   (X - \bar{X})^2       X^2
          2.75         -0.25            0.0625     7.5625
          2.86         -0.14            0.0196     8.1796
          3.37          0.37            0.1369    11.3569
          2.76         -0.24            0.0576     7.6176
          2.62         -0.38            0.1444     6.8644
          3.49          0.49            0.2401    12.1801
          3.05          0.05            0.0025     9.3025
          3.12          0.12            0.0144     9.7344
Totals   24.02          0.00            0.6780    72.7980

The sum of squared deviations from the sample mean is \sum_{i} (X_i - \bar{X})^2 = 0.6780, and we have n - 1 = 7 degrees of freedom. The empirical standard deviation is given by

s = \sqrt{0.6780 / 7} = 0.31
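
A minimal sketch in Python (not from the slides) reproducing these sums and both algebraically equivalent forms of the formula for s from the previous slide:

```python
import math

x = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]
n = len(x)
xbar = sum(x) / n

ss_dev = sum((xi - xbar) ** 2 for xi in x)            # about 0.678
ss_raw = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n   # same value, "computational" form

s = math.sqrt(ss_dev / (n - 1))                       # about 0.31
print(ss_dev, ss_raw, s)
```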


Standard error
The sample mean \bar{X} also has a distribution! It has
- mean equal to the population mean \mu
- standard deviation, called the standard error, equal to \sigma / \sqrt{n}

The central limit theorem says that this distribution is approximately a normal distribution, whether or not the underlying population is normal (when the sample size is not too small)

The estimated standard error of the sample mean is given by

\widehat{s.e.} = s_{\bar{X}} = \frac{s}{\sqrt{n}},

where s is the empirical standard deviation, and n is the sample size

Example: 4.3 in Kirkwood & Sterne


Once again, we return to the eight plasma volumes of Example
4.1 and Example 4.2 in Kirkwood & Sterne (2003). We found that
the sample mean is 3.00 litres, and the empirical standard
deviation is 0.31 litres. The estimated standard error of the
sample mean (in litres) is given by
\widehat{s.e.} = s_{\bar{X}} = \frac{0.31}{\sqrt{8}} = 0.11
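
A minimal sketch in Python (not from the slides) for the same standard error:

```python
import math
from statistics import stdev

plasma_volumes = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]
se = stdev(plasma_volumes) / math.sqrt(len(plasma_volumes))  # s / sqrt(n)
print(se)  # about 0.11 litres
```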


Standard deviation vs. standard error


Remember that
the standard deviation measures the amount of variability in

the population
the standard error of the sample mean measures the amount of variability in the sample mean from sample to sample


Example: 8.2 in Aalen et al.


We have a sample of 4 independent measurements of cholesterol from a population with mean \mu = 6.5 mmol/l and standard deviation \sigma = 0.5 mmol/l

The expected value of the sample mean equals 6.5 mmol/l, and the standard error of the sample mean is \sigma / \sqrt{n} = 0.5 / \sqrt{4} = 0.25 mmol/l

Summing up the properties of the mean value


The mean value: \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i

Expectation of the mean: E(\bar{X}) = \mu

Variance of the mean: Var(\bar{X}) = \frac{\sigma^2}{n}

Standard deviation of the mean = standard error: SD(\bar{X}) = \frac{\sigma}{\sqrt{n}}

The distribution of the mean: \bar{X} \approx N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) (the central limit theorem)
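
A minimal sketch in Python (not from the slides): a small simulation in the cholesterol setting of Example 8.2, illustrating that sample means cluster around \mu with spread \sigma/\sqrt{n}:

```python
import math
import random
import statistics

random.seed(1)
mu, sigma, n = 6.5, 0.5, 4   # population mean, SD and sample size from Example 8.2

# Draw 10000 samples of size n and record each sample mean
means = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(10000)]

print(statistics.mean(means))   # close to mu = 6.5
print(statistics.stdev(means))  # close to sigma / sqrt(n) = 0.25
print(sigma / math.sqrt(n))
```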


The Student t-distribution


Using the estimated standard error

We have learned that \bar{X} \approx N(\mu, \sigma/\sqrt{n}), which means that

\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \approx N(0, 1)

But often \sigma is not known, and we use the estimated standard error s_{\bar{X}} = s/\sqrt{n} instead of \sigma/\sqrt{n}. We then get that

t = \frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t(n - 1),

or in other words that t is Student t-distributed with n - 1 degrees of freedom

The higher the number of degrees of freedom, the closer the Student t-distribution is to the standard normal distribution N(0,1)
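
A minimal sketch (assuming SciPy is available, which is not mentioned on the slides) showing this convergence through the two-sided 5% points:

```python
from scipy.stats import norm, t

for df in (2, 5, 10, 30, 100):
    print(df, t.ppf(0.975, df))   # two-sided 5% point of t with df degrees of freedom
print("normal:", norm.ppf(0.975))  # 1.96
```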



Confidence interval for a mean


We want to construct a range of likely values, called a confidence interval (CI), for the (unknown) population mean, based on the sample mean and its standard error

[Figure: Use of confidence intervals to make inferences about the population from which the sample was drawn]

CI when population standard deviation is known


Providing that the sample size is not too small and that the population distribution is not severely non-normal, the 95% confidence interval for the population mean is given by

95% CI = \left( \bar{X} - 1.96 \frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}} \right),

where 1.96 is the two-sided 5% point of the standard normal distribution, and \bar{X} - 1.96\sigma/\sqrt{n} and \bar{X} + 1.96\sigma/\sqrt{n} are called the lower and upper 95% confidence limits for the population mean respectively

CI when population standard deviation is not known


In the large-sample case, with sample size n > 60, we have that
- the sampling distribution of sample means is well approximated by the normal distribution
- the sample standard deviation s is a reliable estimate of the population standard deviation

The 95% confidence interval for the population mean is then given by

95% CI = \left( \bar{X} - 1.96 \frac{s}{\sqrt{n}},\ \bar{X} + 1.96 \frac{s}{\sqrt{n}} \right),

where 1.96 is the two-sided 5% point of the standard normal distribution, and s/\sqrt{n} is the estimated standard error of the sample mean

Example 6.1 in Kirkwood & Sterne


We want to estimate the amount of insecticide that would be
required to spray all the 10000 houses in a rural area as part of a
malaria control programme. A random sample of 100 houses is
chosen and the sprayable surface of each of these is measured.
The mean sprayable surface area for these 100 houses is \bar{X} = 24.2 m^2, and the estimated standard deviation is s = 5.9 m^2. The estimated standard error of the sample mean is s/\sqrt{n} = 5.9/\sqrt{100} = 0.6 m^2.


The 95% confidence interval is:

(24.2 - 1.96 \times 0.6,\ 24.2 + 1.96 \times 0.6) = (23.0, 25.4)

The upper 95% confidence limit is used in budgeting for the amount of insecticide required per house. One litre of insecticide is sufficient to spray 50 m^2, and so the amount (in litres) budgeted for is:

10000 \times \frac{25.4}{50} = 5080
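
A minimal sketch in Python (not from the slides) for this large-sample interval and the budget figure:

```python
import math

xbar, s, n = 24.2, 5.9, 100
se = s / math.sqrt(n)
lower, upper = xbar - 1.96 * se, xbar + 1.96 * se
print(lower, upper)        # about (23.0, 25.4)

# Litres for 10000 houses at 50 m^2 per litre, using the upper confidence limit
print(10000 * upper / 50)  # about 5071 litres (5080 when the limit is rounded to 25.4)
```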


Small-sample case
In the small-sample case, when the sample size n is not large, we have that:
- the sample standard deviation s, which is itself subject to sampling variation, may not be a reliable estimate of the population standard deviation
- when the distribution in the population is not normal, the distribution of the sample mean may also be non-normal

The latter of these two effects is of practical importance only

when the sample size is very small, say n < 15, and when the
distribution in the population is extremely non-normal. The
former effect invalidates the use of the normal distribution,
and instead we use the t distribution in calculating the
confidence intervals


In the small-sample case, a confidence interval for the population mean is given by

CI = \left( \bar{X} - t' \frac{s}{\sqrt{n}},\ \bar{X} + t' \frac{s}{\sqrt{n}} \right),

where t' is the appropriate percentage point of the t distribution with (n - 1) degrees of freedom, and s/\sqrt{n} is the estimated standard error of the sample mean

Example 6.3 in Kirkwood & Sterne


The numbers of hours of relief obtained by six arthritic patients after receiving a new drug are recorded.

Patient no.       1    2    3    4    5    6
Number of hours  2.2  2.4  4.9  2.5  3.7  4.3

The sample mean is \bar{X} = 3.3 hours, the empirical standard deviation is s = 1.13 hours and the estimated standard error of the sample mean equals s/\sqrt{n} = 0.46 hours. The number of degrees of freedom is (n - 1) = 5

The 95% confidence interval (in hours) for the average number of hours of relief for arthritic patients in general is

(3.3 - 2.57 \times 0.46,\ 3.3 + 2.57 \times 0.46) = (2.1, 4.5),

where 2.57 is the two-sided 5% point of the t distribution with 5 degrees of freedom
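
A minimal sketch (assuming SciPy) for the same small-sample interval:

```python
import math
from statistics import mean, stdev
from scipy.stats import t

hours = [2.2, 2.4, 4.9, 2.5, 3.7, 4.3]
n = len(hours)
xbar, se = mean(hours), stdev(hours) / math.sqrt(n)
tcrit = t.ppf(0.975, n - 1)                  # about 2.57
print(xbar - tcrit * se, xbar + tcrit * se)  # about (2.1, 4.5)
```

SciPy's t.interval(0.95, n - 1, loc=xbar, scale=se) should return the same limits in a single call.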


Severe non-normality
When the distribution in the population is markedly non-normal, it
may be desirable to
use a transformation on the scale on which the variable X is

measured, or
calculate a non-parametric confidence interval, or
use bootstrap methods


Confidence interval vs. reference range


If the population distribution is approximately normal, the 95% reference range is given by

95% reference range = (\mu - 1.96\sigma,\ \mu + 1.96\sigma),

where \mu is the population mean and \sigma is the population standard deviation

There is a clear distinction between the CI and the reference range:
- the reference range describes the variability between individual observations in the population
- the confidence interval is a range of plausible values for the population mean, given the sample mean and its standard error

Since the sample size n > 1, the confidence interval will always be narrower than the reference range.
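
A minimal sketch (an illustration only, reusing \mu = 6.5 and \sigma = 0.5 from Example 8.2; this comparison is not on the original slides):

```python
import math

mu, sigma, n = 6.5, 0.5, 4
ref_range = (mu - 1.96 * sigma, mu + 1.96 * sigma)  # covers about 95% of individual observations
ci_width = 2 * 1.96 * sigma / math.sqrt(n)          # width of a 95% CI for the mean

print(ref_range)                   # about (5.52, 7.48)
print(2 * 1.96 * sigma, ci_width)  # 1.96 vs 0.98: the CI is narrower by a factor sqrt(n)
```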

One sample t-test


Hypothesis testing in general
State your null hypothesis H_0
Derive the test statistic, which has a known distribution when H_0 is true
Calculate the p-value, i.e. the probability of observing your data (or something more extreme) given that H_0 is true. If the p-value is below a chosen significance level, you can reject H_0

One sample t-test


The one-sample Student t-test is one of the most frequently

applied tests in statistics. It is used to test a certain


hypothesis about the unknown population mean


Background

The t-test was devised by William Sealy Gosset, working for the Guinness brewery in Dublin, to cheaply monitor the quality of stout

Published in Biometrika in 1908 under the pen name "Student", as Guinness regarded the fact that they used statistics as a trade secret

The one sample t-test by example


We have 30 measurements of lactate dehydrogenase (LD)

Question: is \mu = 105? H_0: \mu = 105, H_a: \mu \neq 105

We know that if H_0 is true, then

T_0 = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \sim t(n - 1),

which is our test statistic

For our example: T_0 = \frac{108.8 - 105}{7.88 / \sqrt{30}} = 2.64

When would you reject the null hypothesis? When |T_0| is large, that is when |T_0| > t_{n-1, \alpha/2}, or equivalently when p < \alpha

In our example: |T_0| = 2.64 > t_{29, 0.025} = 2.045, so we reject H_0
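
A minimal sketch (assuming SciPy; the 30 raw LD values are not given on the slides, so only the reported summary statistics are used):

```python
import math
from scipy.stats import t

n, xbar, s, mu0 = 30, 108.8, 7.88, 105
T0 = (xbar - mu0) / (s / math.sqrt(n))
p = 2 * t.sf(abs(T0), n - 1)   # two-sided p-value

print(T0, p)  # about 2.64 and 0.013, so H0 is rejected at the 5% level
```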


How to get the P-value

p = 2 \cdot P_{H_0}(t > |T_0|)

SPSS or other statistical software produces the p-value automatically

Look up in a table...

Example: One sample t-test summary


We have a dataset, like the one with birthweights for 189

newborns:

Given that the data is approximately normal, can we conclude

that the true mean is less than 3500 grams?


Compute the p-value and conclude


Type I and Type II errors

Type I error (false positive): P(H_0 rejected | H_0 true) = \alpha. \alpha is determined in advance, usually 5%

Type II error (false negative): P(H_0 not rejected | H_0 false) = \beta

Paired data

Paired measurements
In medical settings we often deal with paired measurements, that is, two outcomes measured on
- the same individual under different exposure (or treatment) circumstances, or
- two individuals matched by certain key characteristics

The pairing in the data is taken into account by considering the differences between each pair of outcome observations. In that way the data are turned into a single sample of differences

Example: 7.3 in Kirkwood & Sterne


We consider the results of a clinical trial to test the effectiveness of a sleeping drug. The sleep of ten patients was observed during one night with the drug and one night with placebo. For each patient a pair of sleep times was recorded and the difference between these calculated.

           Hours of sleep
Patient    Drug   Placebo   Difference
1           6.1     5.2         0.9
2           6.0     7.9        -1.9
3           8.2     3.9         4.3
4           7.6     4.7         2.9
5           6.5     5.3         1.2
6           5.4     7.4        -2.0
7           6.9     4.2         2.7
8           6.7     6.1         0.6
9           7.4     3.8         3.6
10          5.8     7.3        -1.5
Mean       \bar{X}_1 = 6.66   \bar{X}_0 = 5.58   \bar{X} = 1.08

The observed mean difference in sleep time was \bar{X} = 1.08 hours, and the empirical standard deviation of the differences was s = 2.31 hours. The estimated standard error of the mean difference is s/\sqrt{n} = 2.31/\sqrt{10} = 0.73 hours

A 95% confidence interval for the mean difference in sleep time in the population is given by

(1.08 - 2.26 \times 0.73,\ 1.08 + 2.26 \times 0.73) = (-0.57, 2.73),

where 2.26 is the two-sided 5% point of the t distribution with (n - 1) = 9 degrees of freedom

The mean difference in sleep time was \bar{X} = 1.08 hours, and the estimated standard error was s/\sqrt{n} = 0.73 hours. The test statistic is given by

t = 1.08 / 0.73 = 1.48,

which is t distributed with (n - 1) = 9 degrees of freedom when the null hypothesis of no effect is true. The corresponding P-value, which is the probability of getting a t value at least this large in absolute value in a t distribution with 9 degrees of freedom, is p = 0.17

So, there is no evidence against the null hypothesis that the drug does not affect sleep time
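
A minimal sketch (assuming SciPy) of the paired analysis, either as a one-sample test on the differences or directly with the paired test:

```python
from scipy.stats import ttest_1samp, ttest_rel

drug    = [6.1, 6.0, 8.2, 7.6, 6.5, 5.4, 6.9, 6.7, 7.4, 5.8]
placebo = [5.2, 7.9, 3.9, 4.7, 5.3, 7.4, 4.2, 6.1, 3.8, 7.3]
diffs   = [d - p for d, p in zip(drug, placebo)]

print(ttest_1samp(diffs, 0))     # t about 1.48, p about 0.17
print(ttest_rel(drug, placebo))  # the paired test gives the identical result
```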


SPSS

1. Define the variables needed for the analysis
2. Click on Analyze → Compare Means → One-Sample T Test...
3. Move the test variable over to the empty field by clicking on the right arrow. Select the appropriate test value
4. Click on OK
5. Interpret the results from the output

Exercise in SPSS: One-sample t-test

Altman p. 183

Compare energy intake (in kJ) for a group of 11 women between 22 and 30 years, measured as a mean over 10 days

Mean intake for all women is 6753.6 kJ

What can we say about the energy intake of these women in relation to a recommended daily intake of 7725 kJ?

Exercise in SPSS: Paired t-test

Altman p. 190

Compare pre-menstrual and post-menstrual energy intake (in kJ) for a group of 11 women between 22 and 30 years, measured as a mean over 10 days

Mean intake for all women is 6753.6 kJ pre-menstrually and 5433.2 kJ post-menstrually

Can we conclude that there is a difference between pre and post?

Summary
Key words
Sample mean
Sample standard deviation
Standard error vs. standard deviation
The central limit theorem
The Student t-distribution
Confidence intervals for the mean
One sample t-test (also for paired data)

Notation
\bar{X} and \mu
s and \sigma
