
Hypothesis Testing
Dallas’ Lesson
Chapter Objectives
You will be able to:
Lesson Objectives
You will be able to:

1. understand the terms null hypothesis and alternative hypothesis,
2. understand the significance level, rejection region and acceptance region,
3. understand the terms Type I error and Type II error in relation to hypothesis tests,
4. understand the difference between one-tailed and two-tailed tests.
Prerequisite Skills

01 Sampling
Understand the distinction between a sample and a population, and appreciate the necessity for randomness in choosing samples.

02 Normal Distribution
Understand the use of a normal distribution to model a continuous random variable.

03 The Poisson Distribution
Understand the relevance of the Poisson distribution to the distribution of random events, and use the Poisson distribution as a model.

04 Discrete & Continuous Variables
Understand the concept of discrete and continuous random variables.
Pre-test
How ready are
you?
The heart of statistics

Hypothesis testing is not really a difficult concept to understand. Unfortunately, most of the textbooks out there make it entirely too hard.
Why? Because different cases need different parameters.
“There are many different
sorts of hypothesis test
used in statistics; in this
chapter you meet only
two of them.”
The ideal hypothesis test:

1 Establish the null and alternative hypotheses.
2 Decide on the significance level.
3 Collect suitable data using a random sampling procedure that ensures the items are independent.
4 Conduct the test, doing the necessary calculations.
5 Interpret the result in terms of the original claim, theory or problem.
01

Hypothesis
Null and Alternative
Defining the terms

Hypothesis: A claim that needs to be tested
Null Hypothesis (Ho): The commonly accepted fact
Alternative Hypothesis (Ha): The research hypothesis
Null Hypothesis
For example:
The mean height of MSHS Students might be 𝜇 = 162 cm
So, Null hypothesis = Ho : 𝜇 = 162 cm

The null hypothesis always says that the population mean (or parameter) is
normal; nothing new or different is happening.
Alternative Hypothesis
If we think the population mean for the height of MSHS students is different from the claim, then we state that in the alternative hypothesis.
We could use the alternative hypothesis to make these claims about the height of MSHS students:
● The mean height of MSHS students is different from 162 cm:
Ha : 𝜇 ≠ 162 cm

● The mean height of MSHS students is greater than 162 cm:
Ha : 𝜇 > 162 cm

● The mean height of MSHS students is less than 162 cm:
Ha : 𝜇 < 162 cm
Null and Alternative (Mean)

Ho : 𝜇 = 162 cm Ha : 𝜇 ≠ 162 cm

Ho : 𝜇 ≤ 162 cm Ha : 𝜇 > 162 cm

Ho : 𝜇 ≥ 162 cm Ha : 𝜇 < 162 cm

The null hypothesis states the status quo that the population
parameter is ≥, =, or ≤ the claimed value.
Null Hypothesis (proportion)
For example:
The proportion of MSHS students with brown eyes might be p = 0.15

So, Null hypothesis = Ho : p = 0.15

Remember: The null hypothesis always says that the population mean
(or parameter) is normal; nothing new or different is happening.
Alternative Hypothesis
If we think the population proportion of MSHS students with brown eyes is different from this 15% claim, then we state that in the alternative hypothesis. For instance, we could use the alternative hypothesis to make these claims about MSHS students with brown eyes:
● The proportion of MSHS students with brown eyes is different from 0.15:
Ha : p ≠ 0.15

● The proportion of MSHS students with brown eyes is greater than 0.15:
Ha : p > 0.15

● The proportion of MSHS students with brown eyes is less than 0.15:
Ha : p < 0.15
Null and Alternative (Proportion)

Ho : p = 0.15 Ha : p ≠ 0.15

Ho : p ≤ 0.15 Ha : p > 0.15

Ho : p ≥ 0.15 Ha : p < 0.15

Remember: The null hypothesis states the status quo that the population
parameter is ≥, =, or ≤ the claimed value.
Quiz
02

Level of significance
Rejection and Acceptance Regions
Point estimate vs Interval Estimate
The sample mean and sample proportion are both
examples of a point estimate, because they
estimate a particular point. The benefit of using a
point estimate is that it’s easy to calculate. The
drawback is that calculating a point estimate
doesn’t give you any idea of how good the
estimate really is. The point estimate could be a
really good estimate or a really bad estimate, and
we wouldn’t know it either way.
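As a small illustration of the two point estimates named above, the sketch below computes a sample mean and a sample proportion. The data are made up for the example (eight hypothetical students), not taken from the lesson.

```python
from statistics import mean

# Hypothetical sample of 8 students (invented data, for illustration only).
heights_cm = [158, 165, 160, 171, 162, 155, 168, 163]
has_brown_eyes = [True, False, False, True, False, False, False, False]

# Point estimate of the population mean: the sample mean.
sample_mean = mean(heights_cm)

# Point estimate of the population proportion: the sample proportion.
sample_proportion = sum(has_brown_eyes) / len(has_brown_eyes)

print(f"sample mean       = {sample_mean} cm")
print(f"sample proportion = {sample_proportion}")
```

Each result is a single number, which is exactly why it says nothing about how far it might be from the true population value.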
In contrast, we can find an interval estimate, which instead gives us a range of values in which the population parameter may lie. It’s a little harder to calculate than a point estimate, but it gives us much more information. With an interval estimate, we’re able to make statements like:

“I’m 95% confident that the population mean lies in the interval (a, b),”

or “I’m 99% confident that the population proportion lies in the interval (a, b).”
Confidence level
These 95% and 99% values we’re referring to
are called confidence levels. A confidence
level is the probability that an interval
estimate will include the population
parameter.

It’s most common to choose 90%, 95%, or 99% as your confidence level, and then find the interval that’s associated with that particular confidence level.
The alpha value α
So for a 95% confidence level, we’re saying that 5% of the confidence intervals we construct won’t contain the population parameter.

This 5% (or 10% for a 90% confidence level, or 1% for a 99% confidence level) is called the alpha value, α.

We also call it the level of significance: the probability of making a Type I error, that is, of rejecting a null hypothesis that is actually true.
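The claim that about 5% of 95% intervals miss the parameter can be checked by simulation. The sketch below is only an illustration under stated assumptions: heights are drawn from a normal population with a mean of 162 cm and a standard deviation of 8 cm (both invented), and the interval used is the standard known-σ form, sample mean ± z*·σ/√n.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(mu=162.0, sigma=8.0, n=25, confidence=0.95,
                  trials=4000, seed=1):
    """Fraction of simulated confidence intervals that contain mu."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    alpha = 1 - confidence
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_star * sigma / sqrt(n)    # half-width of every interval
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"intervals containing mu: {rate:.3f}")
print(f"intervals missing mu:    {1 - rate:.3f}")
```

The share of intervals that miss should land near α = 0.05, with some random variation from run to run if the seed is changed.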
The confidence interval
When the population standard deviation σ is known, the confidence interval (a, b) is given by

$$(a, b) = \bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$

where (a, b) is the confidence interval, $\bar{x}$ is the sample mean, z* is the critical value (which is the z-score for the confidence level we’ve chosen), σ is the population standard deviation, and n is our sample size.
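A minimal sketch of this calculation, assuming the standard known-σ interval (sample mean ± z*·σ/√n). The function name and the example numbers (a sample mean of 162.75 cm, σ = 8 cm, n = 64) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (a, b) for a population mean when sigma is known."""
    alpha = 1 - confidence
    z_star = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z*
    margin = z_star * sigma / sqrt(n)             # z* times the standard error
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs, for illustration only.
a, b = mean_confidence_interval(sample_mean=162.75, sigma=8, n=64)
print(f"95% confidence interval: ({a:.2f}, {b:.2f})")
```

Raising the confidence level increases z* and widens the interval; raising n shrinks σ/√n and narrows it.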
The confidence interval (proportion)

If we’re sampling without replacement from a population of finite size N, then the confidence interval for the population proportion is:
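A common textbook form of this interval, assuming the usual finite population correction factor, is sketched below. The symbols $\hat{p}$ (sample proportion) and n (sample size) are assumptions introduced here; z* is the critical value for the chosen confidence level, as in the interval for the mean.

```latex
(a, b) = \hat{p} \pm z^* \sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n}} \cdot \sqrt{\frac{N - n}{N - 1}}
```

When the sample is a small fraction of the population, the last factor is close to 1 and the interval reduces to the uncorrected form $\hat{p} \pm z^* \sqrt{\hat{p}(1 - \hat{p})/n}$.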
Region of rejection
When we pick, for example, a 95% confidence level, we know that the alpha value is 1 − 95% = 5%. If we have an alpha value of 5%, that means we can expect the smallest 2.5% and largest 2.5% of values to fall outside the confidence interval.

Because α = 0.05, α/2 = 0.05/2 = 0.025 of the area lies under the far left of the probability distribution and another α/2 = 0.025 lies under the far right, and both of these areas fall outside the 95% confidence interval. Using a z-table, the z-values that cut off a tail area of 0.025 on each side are +1.96 and −1.96. This means the boundaries of the 95% confidence interval are zα/2 = 1.96 and −zα/2 = −1.96.
Region of rejection
From this, we can conclude that any z-value outside
of z = ± 1.96 will put us outside the 95%
confidence interval, and inside the region of
rejection. So ± zα/2 are the boundaries of the
region of rejection.
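The z-table lookup can be reproduced with the standard normal inverse CDF. This is a minimal sketch; the function names are hypothetical.

```python
from statistics import NormalDist

def two_tailed_critical_value(confidence):
    """Return z_(alpha/2), the upper boundary of the acceptance region."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def in_rejection_region(z, confidence=0.95):
    """True if z lies beyond either boundary -z_(alpha/2) or +z_(alpha/2)."""
    return abs(z) > two_tailed_critical_value(confidence)

for confidence in (0.90, 0.95, 0.99):
    z_crit = two_tailed_critical_value(confidence)
    print(f"{confidence:.0%} confidence: boundaries at ±{z_crit:.3f}")
```

For the three common confidence levels this gives boundaries of roughly ±1.645, ±1.960 and ±2.576.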
Quiz
03

Types of error
Type I and Type II
Two types of risks
Whenever we’re using hypothesis testing, we always run the risk that the sample we chose isn’t representative of the population. Even if the sample was random, it might not be representative.

For instance, suppose we have been told that 15% of MSHS students have brown eyes, and we have set up null and alternative hypotheses to test this claim:
Ho: 15% of MSHS students have brown eyes
Ha: The percentage of MSHS students with brown eyes is not 15%

Then, when we take a sample and investigate it, we still run two types of risk.

Two types of risks
Risk number one.
Let’s assume that the null hypothesis is true and that the percentage of MSHS students with brown eyes is 15%.

We might pull a sample of 100 students, find that 40 of them have brown eyes, and get a sample proportion of p̂ = 40%.

If we use our sample data to reject the null hypothesis, but the null hypothesis was actually true, we’ve just made a Type I error.

The probability of making a Type I error is alpha, α, also called the level of significance.
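The Type I error probability for the brown-eyes example can be computed directly. The sketch below assumes a two-tailed large-sample z-test for a proportion, with test statistic z = (p̂ − p0)/√(p0(1 − p0)/n); that test and the function names are assumptions for illustration, not something the lesson specifies.

```python
from math import comb, sqrt
from statistics import NormalDist

def rejects_null(successes, n, p0, alpha=0.05):
    """Two-tailed z-test for a proportion: True if Ho: p = p0 is rejected."""
    p_hat = successes / n                        # sample proportion
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standardised distance from p0
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

def type_i_error_rate(n=100, p0=0.15, alpha=0.05):
    """P(reject Ho) when Ho is true, summed over every possible sample count."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)  # binomial P(X = k) under Ho
        for k in range(n + 1)
        if rejects_null(k, n, p0, alpha)
    )

# The sample from the example: 40 students with brown eyes out of 100.
print("reject Ho for 40 out of 100:", rejects_null(40, 100, 0.15))
print("probability of a Type I error:", round(type_i_error_rate(), 4))
```

The computed rate is close to α rather than exactly α, because the number of brown-eyed students is a whole number and the z-test relies on a normal approximation.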
Two types of risks
Risk number two.
Let’s assume that the null hypothesis is false (which is the same as saying that the alternative hypothesis is true), and that the percentage of MSHS students with brown eyes is in fact not 15%.

Let’s say we pull another sample of 100 students and get a sample proportion of p̂ = 15%. If we use our sample data to accept the null hypothesis, even though we should have rejected it because it’s actually false, then we’ve just made a Type II error.
The probability of making a Type II error is beta, β.
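Unlike α, the value of β depends on what the true proportion actually is. The sketch below computes β for a few hypothetical true proportions, again assuming a two-tailed large-sample z-test of Ho: p = 0.15 with n = 100; the test, the function names and the chosen true values are illustrative assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

def rejects_null(successes, n, p0, alpha=0.05):
    """Two-tailed z-test for a proportion: True if Ho: p = p0 is rejected."""
    z = (successes / n - p0) / sqrt(p0 * (1 - p0) / n)
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def type_ii_error_rate(true_p, n=100, p0=0.15, alpha=0.05):
    """beta = P(Ho is not rejected) when the true proportion is true_p."""
    return sum(
        comb(n, k) * true_p**k * (1 - true_p) ** (n - k)  # P(X = k) under true_p
        for k in range(n + 1)
        if not rejects_null(k, n, p0, alpha)
    )

# Hypothetical true proportions, each different from the claimed 0.15.
for true_p in (0.20, 0.25, 0.35):
    beta = type_ii_error_rate(true_p)
    print(f"true p = {true_p:.2f}: beta = {beta:.3f}")
```

The further the true proportion lies from the claimed 15%, the smaller β becomes, because a false null hypothesis is easier to detect when it is badly wrong.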
Type I and II error

• Small sample size


• Narrow confidence interval
• High confidence level
Quiz
04

Test Statistic
One-Tailed and Two-Tailed
The ideal hypothesis test:

1 Establish the null and alternative hypotheses.
2 Decide on the significance level.
3 Collect suitable data using a random sampling procedure that ensures the items are independent.
4 Conduct the test, doing the necessary calculations.
5 Interpret the result in terms of the original claim, theory or problem.
Null and Alternative (Mean)

Ho : 𝜇 = 162 cm Ha : 𝜇 ≠ 162 cm

Ho : 𝜇 ≤ 162 cm Ha : 𝜇 > 162 cm

Ho : 𝜇 ≥ 162 cm Ha : 𝜇 < 162 cm


Ho : 𝜇 = 162 cm Ha : 𝜇 ≠ 162 cm

When the null and alternative hypotheses use the = and ≠ signs, we’ll use a two-tailed test (also called a two-sided or nondirectional test).

Ho : 𝜇 ≤ 162 cm Ha : 𝜇 > 162 cm

If we predict that the population parameter is greater than the stated value, then the only rejection region will be in the upper tail, so we call this an upper-tail test.

Ho : 𝜇 ≥ 162 cm Ha : 𝜇 < 162 cm

If we predict that the population parameter is less than the stated value, then the only rejection region will be in the lower tail, so we call this a lower-tail test.
Choosing a one-tail or two-tail test
Whether we use a one- or two-tail test is determined by the hypothesis statements we write in the first place. With that in mind, we really want to think ahead when we are writing our hypothesis statements, and consider which kind of test we want to set ourselves up for.

We really want to be careful about which test we use. At a given significance level, a one-tail test has a larger region of rejection in its tail, because all of the area that represents the region of rejection (α) is consolidated into one tail. A two-tail test, on the other hand, has the region of rejection split into two tails (α/2 each), which means each individual rejection region for the two-tail test is smaller than the single rejection region from the one-tail test.
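The three kinds of test can be compared on one sample. The sketch below assumes a z-test for a mean with known σ, using the test statistic z = (x̄ − μ0)/(σ/√n), which matches the known-σ confidence interval; the function name and the sample figures (a sample mean of 164 cm, σ = 8 cm, n = 50) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """Return (z, critical value, reject?) for a z-test on a mean."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))   # test statistic
    std_normal = NormalDist()
    if tail == "two":        # Ha: mu != mu0, alpha split between both tails
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif tail == "upper":    # Ha: mu > mu0, all of alpha in the upper tail
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    elif tail == "lower":    # Ha: mu < mu0, all of alpha in the lower tail
        critical = std_normal.inv_cdf(alpha)
        reject = z < critical
    else:
        raise ValueError("tail must be 'two', 'upper' or 'lower'")
    return z, critical, reject

# Hypothetical sample, tested against the claimed mean of 162 cm.
for tail in ("two", "upper", "lower"):
    z, critical, reject = z_test_mean(164, 162, sigma=8, n=50, tail=tail)
    print(f"{tail:>5}-tail: z = {z:.3f}, critical = {critical:.3f}, reject Ho: {reject}")
```

With these numbers z is about 1.77. That falls short of the two-tail boundary of 1.96 but exceeds the upper-tail boundary of about 1.645, so the same data lead to different decisions depending on how the hypotheses were written.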
Quiz
Evaluation

http://gg.gg/demoteaching
Lesson Objectives
1. understand the terms null hypothesis and alternative hypothesis,
2. understand the significance level, rejection region and acceptance region,
3. understand the terms Type I error and Type II error in relation to hypothesis tests,
4. understand the difference between one-tailed and two-tailed tests.
Thanks!
