
Introduction to hypothesis testing and confidence intervals

One-sample Student's t-test and confidence interval

Jon Michael Gran


Department of Biostatistics, UiO

MF9130 Introductory course in statistics


Wednesday 25.05.2011


Overview

Aalen chapters 8.1-8.5, Kirkwood and Sterne chapters 4, 6 and 7


Properties of the mean value
The Student t-distribution
Confidence interval for the mean
One sample t-test
Paired data


Properties of the mean value

Mean
The (arithmetic) sample mean \bar{X} is the sum of all observations divided by the number of observations:

\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i,

where n is the sample size

It is an estimate of the population mean

Median
Another measure of the average value is the sample median \tilde{X}. This is the middle observation when all observations are arranged in increasing order:

\tilde{X} = \begin{cases} Y_{((n+1)/2)} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left(Y_{(n/2)} + Y_{(n/2+1)}\right) & \text{if } n \text{ is even} \end{cases}

where Y_{(1)}, \ldots, Y_{(n)} are the ascending ordered observations X_1, \ldots, X_n, and n is the sample size

Mode
The mode is the most frequently occurring value in the sample

Example: 4.1 in Kirkwood & Sterne


We have measurements of the plasma volumes (in litres) of eight
healthy adult males.
Subject          1     2     3     4     5     6     7     8
Plasma volume  2.75  2.86  3.37  2.76  2.62  3.49  3.05  3.12

We find that the sample mean is given by

\bar{X} = \frac{1}{8}(2.75 + 2.86 + \ldots + 3.12) = 3.00,

and the sample median is given by

\tilde{X} = \frac{1}{2}(2.86 + 3.05) = 2.96

Since all the values are different, there is no estimate of the mode
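
The same summaries are quick to verify in software. A minimal sketch in Python (not part of the original slides), using the plasma volumes above:

```python
# Sample mean and median of the eight plasma volumes from Example 4.1
from statistics import mean, median

plasma_volumes = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]

print(mean(plasma_volumes))    # about 3.00 litres
print(median(plasma_volumes))  # about 2.96 litres (average of the two middle values)
```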


Choice of measure
The choice of measure for the average value depends on the type of distribution

Measure           When to use
arithmetic mean   usually the preferred measure
median            when there are one or two extremely high or low values
mode              seldom used

The mean, median and mode are equal when the distribution is symmetrical and unimodal

Variation
Measures of variation are used to indicate the spread of the values
in a distribution


Range and interquartile range


The range is the difference between the largest and smallest values in the sample:

R = Y_{(n)} - Y_{(1)},

where Y_{(1)} = \min(X) and Y_{(n)} = \max(X)

The interquartile range is the difference between the upper and lower quartiles:

IQR = Q_3 - Q_1,

where Q_1 and Q_3 are the lower and upper quartiles respectively. It indicates the spread of the middle 50% of the distribution

Variance
The population variance \sigma^2 may be estimated by the empirical variance s^2. It is found by averaging the squares of the deviations of the observations from the sample mean:

s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1},

where (n - 1) is called the number of degrees of freedom (d.f.) of the variance

Standard deviation
The population standard deviation \sigma is found as the square root of the variance. It may be estimated by the empirical standard deviation s, which is the square root of the empirical variance:

s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2 / n}{n - 1}}

When the underlying population corresponds to a normal distribution we have that:
- about 70% of the observations lie within one standard deviation of their mean
- about 95% of the observations lie within two standard deviations of their mean

Example: 4.2 in Kirkwood & Sterne


We want to calculate the standard deviation of the eight plasma volume measurements of Example 4.1 in Kirkwood & Sterne. The columns below give the plasma volume X, the deviation from the mean X - \bar{X}, the squared deviation (X - \bar{X})^2, and the squared observation X^2.

             X   X - \bar{X}   (X - \bar{X})^2       X^2
          2.75         -0.25            0.0625     7.5625
          2.86         -0.14            0.0196     8.1796
          3.37          0.37            0.1369    11.3569
          2.76         -0.24            0.0576     7.6176
          2.62         -0.38            0.1444     6.8644
          3.49          0.49            0.2401    12.1801
          3.05          0.05            0.0025     9.3025
          3.12          0.12            0.0144     9.7344
Totals   24.02          0.00            0.6780    72.7980

The sum of squared deviations from the sample mean is \sum_{i} (X_i - \bar{X})^2 = 0.6780, and we have n - 1 = 7 degrees of freedom. The empirical standard deviation is given by

s = \sqrt{0.6780 / 7} = 0.31
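
A minimal sketch in Python (not from the slides) reproducing these sums and both algebraically equivalent forms of the formula for s from the previous slide:

```python
import math

x = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]
n = len(x)
xbar = sum(x) / n

ss_dev = sum((xi - xbar) ** 2 for xi in x)            # about 0.678
ss_raw = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n   # same value, "computational" form

s = math.sqrt(ss_dev / (n - 1))                       # about 0.31
print(ss_dev, ss_raw, s)
```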


Standard error
The sample mean \bar{X} also has a distribution! It has
- mean equal to the population mean \mu
- standard deviation, called the standard error, equal to \sigma / \sqrt{n}

The central limit theorem says that this distribution is approximately a normal distribution, whether or not the underlying population is normal (when the sample size is not too small)

The estimated standard error of the sample mean is given by

\widehat{s.e.} = s_{\bar{X}} = \frac{s}{\sqrt{n}},

where s is the empirical standard deviation, and n is the sample size

Example: 4.3 in Kirkwood & Sterne


Once again, we return to the eight plasma volumes of Example
4.1 and Example 4.2 in Kirkwood & Sterne (2003). We found that
the sample mean is 3.00 litres, and the empirical standard
deviation is 0.31 litres. The estimated standard error of the
sample mean (in litres) is given by
\widehat{s.e.} = s_{\bar{X}} = \frac{0.31}{\sqrt{8}} = 0.11
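
A minimal sketch in Python (not from the slides) for the same standard error:

```python
import math
from statistics import stdev

plasma_volumes = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]
se = stdev(plasma_volumes) / math.sqrt(len(plasma_volumes))  # s / sqrt(n)
print(se)  # about 0.11 litres
```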


Standard deviation vs. standard error


Remember that
the standard deviation measures the amount of variability in

the population
the standard error of the sample mean measures the amount of variability in the sample mean from sample to sample


Example: 8.2 in Aalen et al.


We have a sample of 4 independent measurements of cholesterol from a population with mean \mu = 6.5 mmol/l and standard deviation \sigma = 0.5 mmol/l

The expected value of the sample mean equals 6.5 mmol/l, and the standard error of the sample mean is \sigma / \sqrt{n} = 0.5 / \sqrt{4} = 0.25 mmol/l

Summing up the properties of the mean value


The mean value: \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i

Expectation of the mean: E(\bar{X}) = \mu

Variance of the mean: Var(\bar{X}) = \frac{\sigma^2}{n}

Standard deviation of the mean = standard error: SD(\bar{X}) = \frac{\sigma}{\sqrt{n}}

The distribution of the mean: \bar{X} \approx N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) (the central limit theorem)
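
A minimal sketch in Python (not from the slides): a small simulation in the cholesterol setting of Example 8.2, illustrating that sample means cluster around \mu with spread \sigma/\sqrt{n}:

```python
import math
import random
import statistics

random.seed(1)
mu, sigma, n = 6.5, 0.5, 4   # population mean, SD and sample size from Example 8.2

# Draw 10000 samples of size n and record each sample mean
means = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(10000)]

print(statistics.mean(means))   # close to mu = 6.5
print(statistics.stdev(means))  # close to sigma / sqrt(n) = 0.25
print(sigma / math.sqrt(n))
```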


The Student t-distribution


Using the estimated standard error

We have learned that \bar{X} \approx N(\mu, \sigma/\sqrt{n}), which means that

\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \approx N(0, 1)

But often \sigma is not known, and we use the estimated standard error s_{\bar{X}} = s/\sqrt{n} instead of \sigma/\sqrt{n}. We then get that

t = \frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t(n - 1),

or in other words that t is Student t-distributed with n - 1 degrees of freedom

The higher the number of degrees of freedom, the closer the Student t-distribution is to the standard normal distribution N(0,1)
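
A minimal sketch (assuming SciPy is available, which is not mentioned on the slides) showing this convergence through the two-sided 5% points:

```python
from scipy.stats import norm, t

for df in (2, 5, 10, 30, 100):
    print(df, t.ppf(0.975, df))   # two-sided 5% point of t with df degrees of freedom
print("normal:", norm.ppf(0.975))  # 1.96
```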



Confidence interval for a mean


We want to construct a range of likely values, called a confidence interval (CI), for the (unknown) population mean, based on the sample mean and its standard error

[Figure: Use of confidence intervals to make inferences about the population from which the sample was drawn]

CI when population standard deviation is known


Providing that the sample size is not too small and that the population distribution is not severely non-normal, the 95% confidence interval for the population mean is given by

95% CI = \left( \bar{X} - 1.96 \frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}} \right),

where 1.96 is the two-sided 5% point of the standard normal distribution, and \bar{X} - 1.96\sigma/\sqrt{n} and \bar{X} + 1.96\sigma/\sqrt{n} are called the lower and upper 95% confidence limits for the population mean respectively

CI when population standard deviation is not known


In the large-sample case, with sample size n > 60, we have that
- the sampling distribution of sample means is well approximated by the normal distribution
- the sample standard deviation s is a reliable estimate of the population standard deviation

The 95% confidence interval for the population mean is then given by

95% CI = \left( \bar{X} - 1.96 \frac{s}{\sqrt{n}},\ \bar{X} + 1.96 \frac{s}{\sqrt{n}} \right),

where 1.96 is the two-sided 5% point of the standard normal distribution, and s/\sqrt{n} is the estimated standard error of the sample mean

Example 6.1 in Kirkwood & Sterne


We want to estimate the amount of insecticide that would be
required to spray all the 10000 houses in a rural area as part of a
malaria control programme. A random sample of 100 houses is
chosen and the sprayable surface of each of these is measured.
The mean sprayable surface area for these 100 houses is \bar{X} = 24.2 m^2, and the estimated standard deviation is s = 5.9 m^2. The estimated standard error of the sample mean is s/\sqrt{n} = 5.9/\sqrt{100} = 0.6 m^2.


The 95% confidence interval is:

(24.2 - 1.96 \times 0.6,\ 24.2 + 1.96 \times 0.6) = (23.0, 25.4)

The upper 95% confidence limit is used in budgeting for the amount of insecticide required per house. One litre of insecticide is sufficient to spray 50 m^2, and so the amount (in litres) budgeted for is:

10000 \times \frac{25.4}{50} = 5080
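
A minimal sketch in Python (not from the slides) for this large-sample interval and the budget figure:

```python
import math

xbar, s, n = 24.2, 5.9, 100
se = s / math.sqrt(n)
lower, upper = xbar - 1.96 * se, xbar + 1.96 * se
print(lower, upper)        # about (23.0, 25.4)

# Litres for 10000 houses at 50 m^2 per litre, using the upper confidence limit
print(10000 * upper / 50)  # about 5071 litres (5080 when the limit is rounded to 25.4)
```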


Small-sample case
In the small-sample case, when the sample size n is not large, we have that:
- the sample standard deviation s, which is itself subject to sampling variation, may not be a reliable estimate of the population standard deviation
- when the distribution in the population is not normal, the distribution of the sample mean may also be non-normal

The latter of these two effects is of practical importance only

when the sample size is very small, say n < 15, and when the
distribution in the population is extremely non-normal. The
former effect invalidates the use of the normal distribution,
and instead we use the t distribution in calculating the
confidence intervals


In the small-sample case, a confidence interval for the population mean is given by

CI = \left( \bar{X} - t' \frac{s}{\sqrt{n}},\ \bar{X} + t' \frac{s}{\sqrt{n}} \right),

where t' is the appropriate percentage point of the t distribution with (n - 1) degrees of freedom, and s/\sqrt{n} is the estimated standard error of the sample mean

Example 6.3 in Kirkwood & Sterne


The numbers of hours of relief obtained by six arthritic patients after receiving a new drug are recorded.

Patient no.       1    2    3    4    5    6
Number of hours  2.2  2.4  4.9  2.5  3.7  4.3

The sample mean is \bar{X} = 3.3 hours, the empirical standard deviation is s = 1.13 hours and the estimated standard error of the sample mean equals s/\sqrt{n} = 0.46 hours. The number of degrees of freedom is (n - 1) = 5

The 95% confidence interval (in hours) for the average number of hours of relief for arthritic patients in general is

(3.3 - 2.57 \times 0.46,\ 3.3 + 2.57 \times 0.46) = (2.1, 4.5),

where 2.57 is the two-sided 5% point of the t distribution with 5 degrees of freedom
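
A minimal sketch (assuming SciPy) for the same small-sample interval:

```python
import math
from statistics import mean, stdev
from scipy.stats import t

hours = [2.2, 2.4, 4.9, 2.5, 3.7, 4.3]
n = len(hours)
xbar, se = mean(hours), stdev(hours) / math.sqrt(n)
tcrit = t.ppf(0.975, n - 1)                  # about 2.57
print(xbar - tcrit * se, xbar + tcrit * se)  # about (2.1, 4.5)
```

SciPy's t.interval(0.95, n - 1, loc=xbar, scale=se) should return the same limits in a single call.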


Severe non-normality
When the distribution in the population is markedly non-normal, it
may be desirable to
use a transformation on the scale on which the variable X is

measured, or
calculate a non-parametric confidence interval, or
use bootstrap methods


Confidence interval vs. reference range


If the population distribution is approximately normal, the 95% reference range is given by

95% reference range = (\mu - 1.96\sigma,\ \mu + 1.96\sigma),

where \mu is the population mean and \sigma is the population standard deviation

There is a clear distinction between the CI and the reference range:
- the reference range describes the variability between individual observations in the population
- the confidence interval is a range of plausible values for the population mean, given the sample mean and its standard error

Since the sample size n > 1, the confidence interval will always be narrower than the reference range.
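
A minimal sketch (an illustration only, reusing \mu = 6.5 and \sigma = 0.5 from Example 8.2; this comparison is not on the original slides):

```python
import math

mu, sigma, n = 6.5, 0.5, 4
ref_range = (mu - 1.96 * sigma, mu + 1.96 * sigma)  # covers about 95% of individual observations
ci_width = 2 * 1.96 * sigma / math.sqrt(n)          # width of a 95% CI for the mean

print(ref_range)                   # about (5.52, 7.48)
print(2 * 1.96 * sigma, ci_width)  # 1.96 vs 0.98: the CI is narrower by a factor sqrt(n)
```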

One sample t-test


Hypothesis testing in general
State your null hypothesis H_0
Derive the test statistic, which has a known distribution when H_0 is true
Calculate the p-value, i.e. the probability of observing your data (or something more extreme) given that H_0 is true. If the p-value is below a chosen significance level, you can reject H_0

One sample t-test


The one-sample Student t-test is one of the most frequently

applied tests in statistics. It is used to test a certain


hypothesis about the unknown population mean


Background

The t-test was devised by William Sealy Gosset, working for the Guinness brewery in Dublin, to cheaply monitor the quality of stout

Published in Biometrika in 1908 under the pen name "Student", as Guinness regarded the fact that they used statistics as a trade secret

The one sample t-test by example


We have 30 measurements of lactate dehydrogenase (LD)

Question: is \mu = 105? H_0: \mu = 105, H_a: \mu \neq 105

We know that if H_0 is true, then

T_0 = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \sim t(n - 1),

which is our test statistic

For our example: T_0 = \frac{108.8 - 105}{7.88 / \sqrt{30}} = 2.64

When would you reject the null hypothesis? When |T_0| is large, that is when |T_0| > t_{n-1, \alpha/2}, or equivalently when p < \alpha

In our example: |T_0| = 2.64 > t_{29, 0.025} = 2.045, so we reject H_0
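
A minimal sketch (assuming SciPy; the 30 raw LD values are not given on the slides, so only the reported summary statistics are used):

```python
import math
from scipy.stats import t

n, xbar, s, mu0 = 30, 108.8, 7.88, 105
T0 = (xbar - mu0) / (s / math.sqrt(n))
p = 2 * t.sf(abs(T0), n - 1)   # two-sided p-value

print(T0, p)  # about 2.64 and 0.013, so H0 is rejected at the 5% level
```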


How to get the P-value

p = 2 \cdot P_{H_0}(t > |T_0|)

SPSS or other statistical software produces the p-value automatically

Look up in a table...

Example: One sample t-test summary


We have a dataset, like the one with birthweights for 189

newborns:

Given that the data is approximately normal, can we conclude

that the true mean is less than 3500 grams?


Compute the p-value and conclude


Type I and Type II errors

Type I error (false positive): P(H_0 rejected | H_0 true) = \alpha. \alpha is determined in advance, usually 5%

Type II error (false negative): P(H_0 not rejected | H_0 false) = \beta

Paired data

Paired measurements
In medical settings we often deal with paired measurements, that is, two outcomes measured on
- the same individual under different exposure (or treatment) circumstances, or
- two individuals matched by certain key characteristics

The pairing in the data is taken into account by considering the differences between each pair of outcome observations. In that way the data are turned into a single sample of differences

Example: 7.3 in Kirkwood & Sterne


We consider the results of a clinical trial to test the effectiveness of a sleeping drug. The sleep of ten patients was observed during one night with the drug and one night with placebo. For each patient a pair of sleep times was recorded and the difference between these calculated.

           Hours of sleep
Patient    Drug   Placebo   Difference
1           6.1     5.2         0.9
2           6.0     7.9        -1.9
3           8.2     3.9         4.3
4           7.6     4.7         2.9
5           6.5     5.3         1.2
6           5.4     7.4        -2.0
7           6.9     4.2         2.7
8           6.7     6.1         0.6
9           7.4     3.8         3.6
10          5.8     7.3        -1.5
Mean       \bar{X}_1 = 6.66   \bar{X}_0 = 5.58   \bar{X} = 1.08

The observed mean difference in sleep time was \bar{X} = 1.08 hours, and the empirical standard deviation of the differences was s = 2.31 hours. The estimated standard error of the mean difference is s/\sqrt{n} = 2.31/\sqrt{10} = 0.73 hours

A 95% confidence interval for the mean difference in sleep time in the population is given by

(1.08 - 2.26 \times 0.73,\ 1.08 + 2.26 \times 0.73) = (-0.57, 2.73),

where 2.26 is the two-sided 5% point of the t distribution with (n - 1) = 9 degrees of freedom

The mean difference in sleep time was \bar{X} = 1.08 hours, and the estimated standard error was s/\sqrt{n} = 0.73 hours. The test statistic is given by

t = 1.08 / 0.73 = 1.48,

which is t distributed with (n - 1) = 9 degrees of freedom when the null hypothesis of no effect is true. The corresponding P-value, which is the probability of getting a t value at least this large in absolute value in a t distribution with 9 degrees of freedom, is p = 0.17

So, there is no evidence against the null hypothesis that the drug does not affect sleep time
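
A minimal sketch (assuming SciPy) of the paired analysis, either as a one-sample test on the differences or directly with the paired test:

```python
from scipy.stats import ttest_1samp, ttest_rel

drug    = [6.1, 6.0, 8.2, 7.6, 6.5, 5.4, 6.9, 6.7, 7.4, 5.8]
placebo = [5.2, 7.9, 3.9, 4.7, 5.3, 7.4, 4.2, 6.1, 3.8, 7.3]
diffs   = [d - p for d, p in zip(drug, placebo)]

print(ttest_1samp(diffs, 0))     # t about 1.48, p about 0.17
print(ttest_rel(drug, placebo))  # the paired test gives the identical result
```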


SPSS

1. Define the variables needed for the analysis
2. Click on Analyze → Compare Means → One-Sample T Test...
3. Move the test variable over to the empty field by clicking on the right arrow. Select the appropriate test value
4. Click on OK
5. Interpret the results from the output

Exercise in SPSS: One-sample t-test

Altman p. 183

Compare energy intake (in kJ) for a group of 11 women between 22 and 30 years, measured as a mean over 10 days

Mean intake for all women is 6753.6 kJ

What can we say about the energy intake of these women in relation to a recommended daily intake of 7725 kJ?

Exercise in SPSS: Paired t-test

Altman p. 190

Compare pre-menstrual and post-menstrual energy intake (in kJ) for a group of 11 women between 22 and 30 years, measured as a mean over 10 days

Mean intake for all women is 6753.6 kJ pre-menstrually and 5433.2 kJ post-menstrually

Can we conclude that there is a difference between pre and post?

Summary
Key words
Sample mean
Sample standard deviation
Standard error vs. standard deviation
The central limit theorem
The Student t-distribution
Confidence intervals for the mean
One sample t-test (also for paired data)

Notation
\bar{X} and \mu
s and \sigma
