
PROBABILITY & STATISTICAL INFERENCE
LECTURE 5
MSc in Computing (Data Analytics)

Lecture Outline
Introduction to hypothesis testing
Hypothesis Testing on the Mean
Hypothesis Testing
Statistical hypothesis testing and confidence
interval estimation of parameters are the
fundamental methods used at the data analysis
stage of a comparative experiment, in which
the experimenter is interested, for example, in
comparing the mean of a population to a
specified value.

Example
For example, suppose that we are interested in
the burning rate of a solid propellant used to
power aircrew escape systems.
Now burning rate is a random variable that can
be described by a probability distribution.
Suppose that our interest focuses on the mean
burning rate (a parameter of this distribution).
Specifically, we are interested in deciding
whether or not the mean burning rate is 50
centimeters per second.

Judicial Analogy
Hypothesis → Significance Level → Collect Evidence → Decision Rule
Judicial Analogy
A defendant is put on trial. They are suspected of being guilty of a crime.
Determine the null hypothesis $H_0$ and the alternative hypothesis $H_1$.
The null hypothesis is what you assume to be true when you start your analysis. It is the logical opposite of what you are trying to prove.
In the judicial analogy:
$H_0$: The defendant is innocent
$H_1$: The defendant is guilty
Judicial Analogy
You select a significance level. In the judicial
example it is the amount of evidence needed
to convict. In a court of law there must be
enough evidence to convict beyond a
reasonable doubt.
You collect evidence.
You use the decision rule to make a judgement.
If the evidence is sufficiently strong, reject the null hypothesis: the defendant is proven guilty.
If the evidence is not strong enough, do not reject the null hypothesis.

Coin Example
You suspect that a coin is not fair and set out to prove that it is not fair.
$H_0$: The coin is fair
$H_1$: The coin is not fair

Significance level: if you observe more than 8 heads or more than 8 tails in ten tosses, you conclude that the coin is not fair; otherwise you state that there is not enough evidence.
Toss the coin ten times and count the number of heads and tails.
You evaluate the data using your decision rule and conclude that there is either:
Enough evidence to reject the assumption that the coin is fair, or
Not enough evidence to reject the assumption that the coin is fair.
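The decision rule for the coin implicitly fixes a significance level, namely the probability of rejecting a fair coin. The sketch below computes it, assuming the rule means "reject if 9 or 10 heads, or 9 or 10 tails, appear in ten independent tosses". The function name `rejection_probability` is illustrative and not part of the lecture.

```python
from math import comb

def rejection_probability(n=10, threshold=8, p=0.5):
    """P(reject H0) when heads ~ Binomial(n, p) and H0 is rejected
    if more than `threshold` heads or more than `threshold` tails occur."""
    total = 0.0
    for heads in range(n + 1):
        tails = n - heads
        if heads > threshold or tails > threshold:
            total += comb(n, heads) * p**heads * (1 - p)**tails
    return total

# For a fair coin this is the type I error probability of the rule.
alpha = rejection_probability()
print(f"alpha = {alpha:.4f}")  # 22/1024, about 0.0215
```

Calling `rejection_probability(p=0.8)` instead gives the chance that the same rule detects a coin biased towards heads, under the same assumptions.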



Example


Tests of Statistical Hypotheses

Decision criteria for testing $H_0: \mu = 50$ centimeters per second versus $H_1: \mu \neq 50$ centimeters per second.
Some Definitions

There is a chance you could be wrong!
Errors in Hypothesis Tests
| Decision | Actual: $H_0$ | Actual: $H_1$ |
| --- | --- | --- |
| $H_0$ | Correct | Type II error |
| $H_1$ | Type I error | Correct |
Sometimes the type I error probability is called the significance level, or the $\alpha$-error, or the size of the test.
Errors in Hypothesis Tests

$\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \text{ when } H_0 \text{ is false})$

The power is computed as $1 - \beta$, and power can be interpreted as the probability of correctly rejecting a false null hypothesis. We often compare statistical tests by comparing their power properties.
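As an illustration of how power depends on the true mean, the sketch below computes $\beta$ and the power for a two-sided test on a normal mean with known standard deviation. It is only a sketch under assumptions: the helper `power_two_sided_z` and the values of `mu_true`, `sigma` and `n` are hypothetical, and only the null value of 50 centimeters per second comes from the burning rate example.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of the two-sided z-test of H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under mu_true the standardized sample mean is normal with this mean and unit variance.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # beta = probability that the statistic stays inside the acceptance region.
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - beta

# Hypothetical values: only mu0 = 50 is taken from the lecture example.
power = power_two_sided_z(mu0=50, mu_true=52, sigma=2.5, n=10)
beta = 1 - power
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

When `mu_true` equals `mu0` the function returns `alpha`, since rejecting a true null hypothesis is exactly the type I error.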

Which Hypothesis is of interest
Suppose you have a question about the quantity of cereal in a box of cornflakes. You can use one of three types of test:
A two-tail test if you suspect the true mean is different from that claimed.
An upper-tail test if you suspect the true mean is higher than claimed.
A lower-tail test if you suspect that the true mean is lower than claimed.
Critical Regions
Two tail test:
$H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$

Upper tail test:
$H_0: \mu \leq \mu_0$
$H_1: \mu > \mu_0$

Lower tail test:
$H_0: \mu \geq \mu_0$
$H_1: \mu < \mu_0$
General Steps in Hypothesis Testing
1. From the problem context, identify the parameter of interest.
2. State the null hypothesis, $H_0$.
3. Specify an appropriate alternative hypothesis, $H_1$.
4. Choose a significance level, $\alpha$.
5. Determine an appropriate test statistic.
6. State the rejection region for the statistic.
7. Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute that value.
8. Decide whether or not $H_0$ should be rejected and report that in the problem context.
Tests on the Mean of a Normal Distribution, $\sigma^2$ Known
Hypothesis Tests on the Mean
We wish to test:
$H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$
The test statistic is:
$$Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
Tests on the Mean of a Normal Distribution, $\sigma^2$ Known
Reject $H_0$ if the observed value of the test statistic $z_0$ is either:
$z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$

Fail to reject $H_0$ if
$-z_{\alpha/2} < z_0 < z_{\alpha/2}$
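The two-sided rejection rule can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `two_sided_z_test` and the sample values (`xbar`, `sigma`, `n`) are hypothetical, and only the null value of 50 centimeters per second is taken from the burning rate example.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Return (z0, z_crit, reject) for H0: mu = mu0 versus H1: mu != mu0,
    assuming the population standard deviation sigma is known."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    reject = z0 > z_crit or z0 < -z_crit
    return z0, z_crit, reject

# Hypothetical sample: only mu0 = 50 comes from the lecture.
z0, z_crit, reject = two_sided_z_test(xbar=51.3, mu0=50, sigma=2, n=25)
print(f"z0 = {z0:.2f}, z_crit = {z_crit:.2f}, reject H0: {reject}")
```

With these illustrative numbers the statistic is 3.25, which exceeds the critical value of about 1.96, so the null hypothesis would be rejected at the 0.05 level.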


Example
Example
We can solve this problem by using the 8 steps as follows:
$$Z_0 = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
Example
Recap
Assumptions

The population variance is known.
The sample means are normally distributed. (Invoke the CLT)
Exercises
The life in hours of a battery is known to be approximately normally distributed with a standard deviation $\sigma = 1.25$ hours. A random sample of 40 batteries has a mean life of $\bar{x} = 40.5$ hours.
Is there evidence to support that battery life exceeds 40 hours? Use $\alpha = 0.05$.

The mean water temperature downstream from a power plant cooling tower discharge pipe should be no more than 38 °C. Past experience has indicated that the standard deviation of the temperature is 1.1 °C. The water temperature is measured on 35 randomly chosen days and the average temperature is found to be 37 °C.
Is there evidence that the water temperature is acceptable at $\alpha = 0.05$?
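One possible way to check answers to these two exercises is sketched below. It assumes the battery question is set up as an upper-tail test ($H_1: \mu > 40$, with the sample mean read as 40.5 hours) and the water temperature question as a lower-tail test ($H_1: \mu < 38$); both formulations, and the helper name `one_sided_z_test`, are assumptions rather than part of the exercise text.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_z_test(xbar, mu0, sigma, n, alpha, tail):
    """z-test with known sigma.
    tail = "upper" tests H1: mu > mu0, tail = "lower" tests H1: mu < mu0."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper alpha point of N(0, 1)
    if tail == "upper":
        reject = z0 > z_alpha
    elif tail == "lower":
        reject = z0 < -z_alpha
    else:
        raise ValueError("tail must be 'upper' or 'lower'")
    return z0, reject

# Battery life: H0: mu = 40 versus H1: mu > 40 (assumed formulation).
z_battery, reject_battery = one_sided_z_test(40.5, 40, 1.25, 40, 0.05, "upper")
# Water temperature: H0: mu = 38 versus H1: mu < 38 (assumed formulation).
z_water, reject_water = one_sided_z_test(37, 38, 1.1, 35, 0.05, "lower")

print(f"battery: z0 = {z_battery:.2f}, reject H0: {reject_battery}")
print(f"water:   z0 = {z_water:.2f}, reject H0: {reject_water}")
```

Under these assumptions both null hypotheses are rejected, since each statistic falls beyond the one-sided critical value of about 1.645 in the relevant direction.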
Hypothesis Tests on the Mean, $\sigma^2$ Unknown


Two tail test:
$H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$

Upper tail test:
$H_0: \mu \leq \mu_0$
$H_1: \mu > \mu_0$

Lower tail test:
$H_0: \mu \geq \mu_0$
$H_1: \mu < \mu_0$
Example
Example
The sample mean is $\bar{x} = 0.83725$ and the sample standard deviation is $s = 0.02456$. The normal probability plot of the data on the next slide supports the assumption that the sample means come from a normal distribution. Use the 8 steps to test that the mean coefficient of restitution exceeds 0.82.


Normal probability plot of the coefficient of restitution data from the example.
Example
Exercise
An article in a journal describes a study of the thermal inertia properties of autoclaved aerated concrete used as a building material. Five samples of the material were tested in a structure, and the average interior temperatures (°C) reported were as follows: 23.01, 22.22, 22.04, 22.62 and 22.59.
Test the hypotheses $H_0: \mu = 22.5$ versus $H_1: \mu \neq 22.5$ using $\alpha = 0.05$.
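A sketch of how this exercise could be checked numerically follows. It assumes the usual one-sample t statistic $t_0 = (\bar{x} - \mu_0)/(s/\sqrt{n})$ with $n - 1$ degrees of freedom. The critical value 2.776 is the tabulated $t_{0.025,4}$, hard-coded here because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

temps = [23.01, 22.22, 22.04, 22.62, 22.59]  # data from the exercise
mu0 = 22.5
T_CRIT = 2.776  # t_{0.025, 4} from a standard t-table (two-sided, alpha = 0.05)

n = len(temps)
xbar = mean(temps)
s = stdev(temps)  # sample standard deviation (divisor n - 1)
t0 = (xbar - mu0) / (s / sqrt(n))
reject = abs(t0) > T_CRIT

print(f"xbar = {xbar:.3f}, s = {s:.4f}, t0 = {t0:.3f}, reject H0: {reject}")
```

With these data the statistic lies well inside the interval from -2.776 to 2.776, so under the stated assumptions there is not enough evidence to reject $H_0$.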
Consider this computer output:

| Variable | N | Mean | StDev | SE Mean | 95% CI | t |
| --- | --- | --- | --- | --- | --- | --- |
| X | 16 | 35.274 | 1.783 | ? | (34.324, 36.224) | ? |

a) How many degrees of freedom are there on the t-test statistic?
b) Fill in the missing quantities.
c) Test the hypotheses $H_0: \mu = 34.5$ versus $H_1: \mu \neq 34.5$ using $\alpha = 0.05$.
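The missing quantities can be recomputed from the summary statistics alone. The sketch below assumes the standard relations SE Mean $= s/\sqrt{n}$ and $t_0 = (\bar{x} - \mu_0)/\text{SE Mean}$, and uses the tabulated value $t_{0.025,15} = 2.131$ as a hard-coded constant. Rebuilding the confidence interval is a useful consistency check against the one printed in the output.

```python
from math import sqrt

n, xbar, s = 16, 35.274, 1.783  # values from the computer output
mu0 = 34.5
T_CRIT = 2.131  # t_{0.025, 15} from a standard t-table

df = n - 1
se_mean = s / sqrt(n)
t0 = (xbar - mu0) / se_mean
ci = (xbar - T_CRIT * se_mean, xbar + T_CRIT * se_mean)
reject = abs(t0) > T_CRIT

print(f"df = {df}, SE Mean = {se_mean:.4f}, t0 = {t0:.3f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), reject H0: {reject}")
```

The recomputed interval agrees with the reported one to three decimals, and 34.5 lies inside it, which matches the decision not to reject $H_0$ at the 0.05 level.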
Tests on a Population
Proportion
Large-Sample Tests on a Proportion

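An appropriate test statistic is
$$Z_0 = \frac{X - np_0}{\sqrt{np_0(1 - p_0)}}$$
where $X$ is the number of observations in a random sample of size $n$ that belong to the class of interest and $p_0$ is the hypothesized proportion.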
