TABLE OF CONTENTS

MODULE

1 Hypothesis Testing, z-Test
    Lesson 1 Hypothesis Testing
        Importance of Hypothesis Testing
        Hypothesis, Null and Alternative Hypotheses
        Hypothesis Testing
        Types of Hypothesis Tests
        Level of Significance
        Errors in Hypothesis Testing
        Power of the Test
        Difference Between Parametric and Nonparametric Tests
    Lesson 2 z-Test, One-Sample z-Test
        The z-Test
        One-Sample z-Test
    Lesson 3 Two-Sample z-Test
        Two-Sample z-Test
    Exercises

2 The t-Test
    Lesson 1 Preliminaries
        Degree of Freedom
        The t-Distribution
        How to Perform a t-Test
        How to Determine Critical Values for t
        Difference Between Independent and Paired Sample
    Lesson 2 The One-Sample t-Test
        Performing a One-Sample t-Test
    Lesson 3 The t-Test for Independent Samples
        Performing a t-Test for Independent Samples
    Lesson 4 The t-Test for Paired Samples
        Performing a t-Test for Paired or Correlated Samples
    Exercises

3 Analysis of Variance (ANOVA)
    Lesson 1 The One-Way Analysis of Variance
    Lesson 2 Post-Hoc Test
    Exercises

TABLE OF CRITICAL VALUES
    t-Distribution: Appendix A
    F-Distribution ($\alpha = 0.05$): Appendix B
    F-Distribution ($\alpha = 0.01$): Appendix C


MODULE 1
Hypothesis Testing, z-Test

OVERVIEW

• Hypothesis Testing
• Types of Hypothesis Tests
• Level of Significance
• Errors in Hypothesis Testing
• z-Test on Means
• Approaches to Hypothesis Testing

LEARNING OBJECTIVES
At the end of this module, students are expected to:
1. Give the meaning of hypothesis.
2. Explain why there is a need to test hypotheses.
3. Define important terms in hypothesis testing.
▪ statistical hypotheses
▪ null hypothesis
▪ alternative hypothesis
4. Differentiate between parametric and non-parametric tests.
5. Determine the types of hypothesis tests based on the hypotheses.
▪ One-tailed right directional
▪ One-tailed left directional
▪ Two-tailed non-directional
6. Explain the meaning of level of significance.
7. Formulate null and alternative hypotheses.
8. Perform a simple test of hypothesis using the z-test statistic.

PREREQUISITES
Students must have sufficient knowledge about parameters, statistics, normal probability
distribution, sampling distribution, and the central limit theorem.


LESSON 1. HYPOTHESIS TESTING


OBJECTIVES
Upon completion of the lesson, you should be able to:
• Give the importance of hypothesis testing.
• Define the key terms: hypothesis, null hypothesis, alternative hypothesis, hypothesis
testing.
• Formulate null and alternative hypotheses.
• Enumerate the types of hypothesis tests.
• Define level of significance, errors in hypothesis tests, and power of a test.
• Differentiate between parametric and nonparametric tests.

1.1 Importance of Hypothesis Testing
One of the principal objectives of research is comparison: How does one group differ from
another?
Here are some examples of questions in research:
1. What is the mean serum cholesterol level of a group of middle-aged men? How does
it differ from that of women?
2. Is the latest drug effective in reducing cholesterol level?
3. Are three different drugs for the treatment of rheumatoid arthritis equally effective?
Out of these questions, we can make these assumptions:
1. The average cholesterol level of middle-aged men is 240 mg/dL.
2. Men have higher cholesterol levels than women.
3. The latest drug is effective in reducing cholesterol level.
4. There is no difference in effectiveness among the three different drugs for the
treatment of rheumatoid arthritis.

How can we verify the validity of these assumptions without guessing? How can we say they
are true? How can we say they are not true? We cannot simply take a small sample out of the
population consisting of middle-aged men, take their sample average cholesterol level, and then
reject or accept an assumption based only on this average. We need a more robust
procedure to validate or reject the assumption. This procedure is called hypothesis testing.

1.2 Hypothesis
A hypothesis is
- an assumption about the population parameter.
- a statement of belief used in the evaluation of population values.
- an educated guess about the population parameter.

1.2.1 Examples of Hypotheses:
1. The average cholesterol level of middle-aged men is 240 mg/dL.
2. The average cholesterol level of middle-aged men is not 240 mg/dL.

1.3 Null Hypothesis


A null hypothesis is
- usually denoted by $H_0$.
- a statement about the population that will be assumed to be true unless it can be shown
to be incorrect beyond a reasonable doubt.
- a statement that the researcher usually hopes to reject.

1.4 Alternative Hypothesis
The alternative hypothesis is
- usually denoted by $H_a$.
- a claim about the population that is contradictory to $H_0$ and what we conclude when we
reject $H_0$.
- a statement that challenges $H_0$.

Note: The null and alternative hypotheses contain opposing viewpoints.

1.4.1 Examples:
Null Hypothesis: The average cholesterol level of middle-aged men is 240 mg/dL.
Alternative Hypothesis: The average cholesterol level of middle-aged men is not 240 mg/dL.

1.5 Hypothesis Testing


Hypothesis testing is
- the use of statistics to decide whether the sample data provide enough evidence against a
given hypothesis.
- the process of making an inference or generalization on population parameters based on
the results of the study on samples.

1.5.1 Example. Setting up the Null and Alternative Hypotheses
Suppose someone claims that the mean age of a population of 7683 individuals is
53.00 years. How can we verify (or reject) this claim?
We can start by formulating our hypotheses.

$H_0: \mu = 53$    Remember that $\mu$ denotes the population mean.
$H_a: \mu \neq 53$

You could also include interpretations of the hypotheses.

$H_0: \mu = 53$; The mean age of the 7683 individuals is 53.00.
$H_a: \mu \neq 53$; The mean age of the 7683 individuals is not 53.00.

After setting up the null and alternative hypotheses, we can draw a sample of, say, 100
persons and then compute the mean of this sample. Then we will compute the appropriate
test statistic to decide whether to reject or not to reject the null hypothesis ($H_0$).

1.5.2 Example. An evaluation of Online Learning


Suppose that, based on the result of a study, the average grade point average (GPA) of LSPU
students was found to be 83%. After the introduction of an online learning program, a
researcher wants to know whether online learning has increased the average GPA of LSPU
students from 83%.
The research hypotheses may be as follows.
$H_0: \mu \le 83$; The online learning program has not increased the average GPA of
LSPU students.
$H_a: \mu > 83$; The online learning program has increased the average GPA of LSPU
students.

1.5.3 Example.
We want to test if college students taking up degree courses take less than 5 years to
graduate from college, on the average.
$H_0: \mu \ge 5$; Students taking up degree courses take at least (not less than) 5 years to
graduate from college.
$H_a: \mu < 5$; Students taking up degree courses take less than 5 years to graduate from
college.

1.5.4 Example.
We want to test whether the tutorial services offered by the students’ math society of a
certain university have lowered the number of failures in all math courses from 30% of the
total enrolment.
Here, we are talking about a test of proportions. So, our hypotheses would be:
$H_0: p \ge 30\%$; Tutorial services have not reduced the number of failures.
$H_a: p < 30\%$; Tutorial services have lowered the number of failures.

Now, you already have some idea on how to formulate the null and alternative hypotheses.
Have you seen the glaring clues on how to formulate them?

NOTE: $H_0$ always has a symbol with an equal sign in it ($=$, $\le$, or $\ge$). $H_a$ never
has a symbol with an equal sign in it ($\neq$, $<$, or $>$). The choice of symbol depends on
the wording of the hypothesis test.

When we say testing a hypothesis, we mean gathering evidence in order to reject
the null hypothesis. The alternative hypothesis ($H_a$) plays a major role in deciding the
type of test to use.


1.6 The Types of Hypothesis Tests Based on the Alternative Hypothesis ( Ha )

There are 3 types of hypothesis tests that depend on the way you formulated the alternative
hypothesis. They are as follows:
1. One-Tailed Left Directional Test
This is used if $H_a$ uses the $<$ symbol.

[Figure 1. One-Tailed Left Directional Test: a normal curve centered at 0, with the shaded rejection region in the left tail beyond the critical value (also called the tabular value) and the acceptance region covering the rest of the curve.]

In the one-tailed left directional test, we compute the test statistic and then
compare it with the critical value, also called the tabular value (a value found in a table).
If the computed value is $\le$ the tabular value, then we may reject the null hypothesis. If
this happens, then we may say that “THE FINDING IS SIGNIFICANT”.

Note: A test statistic is a single numerical value resulting from the use of a certain
formula. This is also called the computed value.
For example, a z-test statistic can be -1.83 while the tabular (or critical) value
can be -1.65. In this case the test statistic is $<$ the tabular value.

2. One-Tailed Right Directional Test
This is used if $H_a$ uses the $>$ symbol.

[Figure 2. One-Tailed Right Directional Test: a normal curve centered at 0, with the shaded rejection region in the right tail beyond the critical value and the acceptance region covering the rest of the curve.]

In the one-tailed right directional test, if the computed value is $\ge$ the tabular value,
then we may reject the null hypothesis. If this happens, then we may say that “THE
FINDING IS SIGNIFICANT”.

3. Two-Tailed Non-directional Test
This is used if $H_a$ uses the $\neq$ symbol.
In the two-tailed test, notice that there are two rejection regions in the distribution.
There are also two critical values which are numerically equal (their absolute values are
the same). If the value of the test statistic falls outside the acceptance region (within either
rejection region), then we may reject the null hypothesis.

[Figure 3. Two-Tailed Test: a normal curve centered at 0, with a rejection region in each tail beyond the two critical values and the acceptance region between them.]

Therefore:
1. If $H_a$ uses the $\neq$ symbol, the test is two-tailed.
2. If $H_a$ uses the $<$ symbol, the test is one-tailed left directional.
3. If $H_a$ uses the $>$ symbol, the test is one-tailed right directional.
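
The three rules above, together with the earlier note that the null hypothesis always carries the complementary symbol with an equal sign, can be sketched as a small lookup table. This is only an illustration: the names `TEST_TYPES` and `classify_test` are hypothetical, and the symbols are written in plain ASCII (`!=`, `<=`, `>=`).

```python
# Map the symbol used in Ha to the matching symbol in H0 and to the type of test.
# TEST_TYPES and classify_test are illustrative names, not part of the module.
TEST_TYPES = {
    "!=": ("=", "two-tailed non-directional"),
    "<": (">=", "one-tailed left directional"),
    ">": ("<=", "one-tailed right directional"),
}


def classify_test(ha_symbol):
    """Return (symbol used in H0, type of test) for the symbol used in Ha."""
    if ha_symbol not in TEST_TYPES:
        raise ValueError("Ha must use one of: !=, <, >")
    return TEST_TYPES[ha_symbol]


for symbol in ("!=", "<", ">"):
    h0_symbol, test_type = classify_test(symbol)
    print(f"Ha uses {symbol:2} -> H0 uses {h0_symbol:2} -> {test_type}")
```

Keeping the pairing in one table makes it harder to write a null hypothesis whose symbol does not match the alternative.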

Regardless of whether you use a one-tailed or a two-tailed test, the choice should be made
before you collect data and before you begin the data analysis. This is generally considered the
proper way to conduct scientific research. To collect and partially analyze the data, and then to
decide whether the test should be one- or two-tailed, is simply not an appropriate course of
action.

Question: Should I use the one-tailed test or the two-tailed test?

Answer: If you are not told whether to use a one- or a two-tailed test, then use
the two-tailed test. However, you may appropriately use the one-tailed test
in these contexts:
a. “where there is truly concern for the outcomes in one tail only”; or
b. “where it is completely inconceivable that the results could go in the
opposite direction”


1.7 Level of Significance


In performing a hypothesis test, we gather evidence (expressed as a probability or
chance) that the null hypothesis ($H_0$) is not true. Because this evidence comes from a
sample, we could make a wrong decision. We can, for example, reject the null hypothesis when it
is in fact true, or fail to reject (accept) it when it is in fact false.

So we want to limit the chance of making a wrong decision. If we want to be 95% sure that we
are not rejecting a null hypothesis ($H_0$) that is actually true, then the probability of such
a wrong rejection must not be greater than 5%. This 5% or 0.05 is called the level of
significance.
The level of significance is also the area of the rejection region, designated by the Greek
letter alpha ($\alpha$). The typical values used by researchers are $\alpha = 0.05$ and $\alpha = 0.01$.

Notes:
• You are not prevented from using $\alpha = 0.02$, $\alpha = 0.06$, etc. However,
researchers typically use $\alpha = 0.05$ and $\alpha = 0.01$ for most tests.
• Most published statistical tables do not have entries for $\alpha = 0.02$, $\alpha = 0.06$,
etc. There are entries, however, for multiples of 0.025 like 0.025, 0.05, and
0.10.

Hypothesis testing is decision-making. You need to decide whether to reject or not to reject
the null hypothesis ($H_0$).
• The moment you reject $H_0$, it means you have a reason (e.g. a computed statistic or a
computed probability) to believe that it is incorrect. You have sufficient evidence to reject
it.
• When you accept $H_0$, it does not mean it is correct. It means that you simply don’t have
enough evidence to reject it.

Therefore: The only decision that can be made regarding $H_0$ is either:
• Reject $H_0$; or
• Do not reject $H_0$ (accept it).
There is no partial acceptance or partial rejection.

1.8 Errors in Hypothesis Testing


In making decisions, you may sometimes commit errors. In statistics, they are referred to as
either Type I or Type II errors.
The table summarizes the types of errors that you can commit.

Decision                  | When $H_0$ is actually true... | When $H_0$ is actually false...
You rejected $H_0$.       | You committed a Type I error.  | You made the right decision.
You did not reject $H_0$. | You made the right decision.   | You committed a Type II error.

1.9 Power of the Test


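The statistical power of the test is the probability of not making a Type II error, that is,
of correctly rejecting a null hypothesis that is false.
The probability of committing a Type I error is designated by $\alpha$, while the probability of
committing a Type II error is designated by $\beta$. The power of the test is therefore $1 - \beta$.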
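
As a rough illustration of how power relates to the probability of a Type II error, the sketch below computes the power of a one-sample, right-tailed z-test using the standard normal-theory result. It assumes a known population standard deviation and a normally distributed sample mean. The function name `power_right_tailed_z` and all numbers in the usage lines are hypothetical, not taken from this module.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power (1 - beta) of a right-tailed one-sample z-test.

    mu0     : mean stated in the null hypothesis
    mu_true : assumed true population mean (a hypothetical value)
    sigma   : known population standard deviation
    n       : sample size
    alpha   : level of significance
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)       # critical value of z
    shift = (mu_true - mu0) * sqrt(n) / sigma    # how far the true mean sits from mu0
    beta = std_normal.cdf(z_crit - shift)        # P(do not reject H0 | H0 is false)
    return 1 - beta


# Hypothetical numbers: H0 says the mean is 50, but the true mean is 53.
print(round(power_right_tailed_z(mu0=50, mu_true=53, sigma=10, n=36), 3))
# When the true mean equals mu0, the probability of rejecting is just alpha.
print(round(power_right_tailed_z(mu0=50, mu_true=50, sigma=10, n=36), 3))
```

Under these assumptions, a larger sample size or a larger gap between the true mean and the hypothesized mean raises the power.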

What does $\alpha = 0.01$ mean?
An $\alpha$ of 0.01 means the researcher is being relatively careful. He/she is only willing to risk
being wrong once in 100 times in rejecting a null hypothesis which is really true. That is, if the
null hypothesis were true and the researcher performed the same test 100 times using equal
sample sizes, then he/she would be expected to reject it wrongly only about once.

So, what $\alpha$ should we really use?
Typically, $\alpha = 0.05$. If you are going to make a decision that may result in a lot of deaths,
then you may want to choose a very small value of $\alpha$.
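
The "once in 100 times" reading of a 0.01 level of significance can be checked with a small simulation. The sketch below is an illustration only: it assumes a hypothetical normal population (mean 100, standard deviation 15) for which the null hypothesis is true, repeats a right-tailed z-test many times, and counts how often the null hypothesis is rejected anyway. The function name `type_i_error_rate` and every number in it are assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_i_error_rate(alpha=0.01, n=25, trials=5000, seed=1):
    """Fraction of right-tailed z-tests that reject a null hypothesis that is TRUE."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    mu0, sigma = 100.0, 15.0            # hypothetical population; H0: mu <= 100 holds
    z_crit = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) * sqrt(n) / sigma
        if z >= z_crit:                 # a rejection here is a Type I error
            rejections += 1
    return rejections / trials


print(type_i_error_rate())  # should be close to alpha = 0.01
```

The printed fraction varies with the seed, but over many repetitions it should stay near the chosen level of significance.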

1.10 Difference Between Parametric and Nonparametric Tests

A parametric test is
- a test wherein at least one sample statistic is obtained to estimate the population
parameter.
- an estimation process that involves at least one sample, a sampling distribution, and a
population.
- a test used when the data have a normal distribution.

You can use a parametric test if, among other assumptions, you know any one of the
following:
• Each sample came from a population with a normal distribution.
• Each sample size is at least 30 (the Central Limit Theorem applies).

Examples of parametric tests are the z-test, the t-test, and analysis of variance (ANOVA).

Nonparametric tests are tests that do not rely on a particular distribution. Therefore, you
can use them when you do not (or cannot) meet the assumption of normality. Examples of
nonparametric tests are the Mann-Whitney U test, the Kruskal-Wallis analysis of ranks, and
Wilcoxon’s matched pairs test.

What if I want to use a parametric test but my sample size is small ($n < 30$)?
If you don’t know whether a sample came from a population with a normal distribution and at
the same time the Central Limit Theorem does not apply ($n < 30$), then you may perform a
normality test on your sample. If the result of this normality test shows that you have a sample
that probably came from a normal distribution, then you may use a parametric test.
An example of a normality test is the Anderson-Darling test for normality. This test is not
covered in this module.

LESSON 2. z-TEST, ONE-SAMPLE z-TEST


OBJECTIVES
Upon completion of this lesson, you should be able to:
• Determine when the z-test is appropriate.
• Use the one-sample z-test in hypothesis testing.

2.1 The z-Test
The z-test is a parametric procedure used to test the significance of the difference between:
• the population mean and a hypothesized or perceived mean; or
• two sample means.

2.2 The One-Sample z-Test

The one-sample z-test is used to:
• know whether our sample comes from a particular population, or
• test whether a population parameter is significantly different from some hypothesized
value.
The one-sample z-test can be used when:
• We know the population standard deviation ($\sigma$).
• We have the sample statistics $n$ and $\bar{x}$, or the raw sample data so we can compute
$\bar{x}$.
The formula to determine the test statistic (also called the computed value) is:

$$z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$$

This test statistic has a normal distribution.

where:
$\bar{x}$ = sample mean
$\mu$ = hypothesized value of the population mean
$\sigma$ = population standard deviation
$n$ = sample size
Note: Depending on the values of the variables, this formula may yield positive or negative
values.
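
The formula can be written as a short function. This is a minimal sketch: the name `one_sample_z` and the numbers in the usage line are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from math import sqrt


def one_sample_z(sample_mean, mu0, sigma, n):
    """Computed z for a one-sample z-test.

    sample_mean : mean of the sample
    mu0         : hypothesized value of the population mean
    sigma       : population standard deviation (must be known)
    n           : sample size
    """
    return (sample_mean - mu0) * sqrt(n) / sigma


# Hypothetical values: (52 - 50) * sqrt(64) / 8 = 2 * 8 / 8
z = one_sample_z(sample_mean=52.0, mu0=50.0, sigma=8.0, n=64)
print(round(z, 2))  # 2.0
```

A sample mean below the hypothesized mean gives a negative z, which is why the sign of the computed value matters when choosing the tail of the test.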

2.2.1 Example. One-Sample z-Test
The average score in the final examination in College Algebra at ABC College is known
to be 80 with a standard deviation of 10. A random sample of 39 students was taken from
this year’s batch and it was found that they have a mean score of 84. Is this an indication
that this year’s batch performed better in College Algebra than the previous batches?

Solution:
The formula for the test statistic is:

$$z = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma}$$

Substituting $\mu = 80$, $\bar{x} = 84$, $n = 39$, and $\sigma = 10$ in the formula, we obtain:

$$z = \frac{(84 - 80)\sqrt{39}}{10} = \frac{(4)(6.244997998)}{10} = +2.4979992$$

$z \approx +2.50$    This is called the z-test statistic or computed value of $z$.

Later on, we will compare this computed value of $z$ against another value found in the
following table.

Table 1. Critical Values for the z-Test

Test       | $\alpha = 0.01$ | $\alpha = 0.05$
One-Tailed | 2.33            | 1.65
Two-Tailed | 2.58            | 1.96

We should present our solution in a stepwise method. Explanations of the steps are as
follows:

Step 1. Set up the null and alternative hypotheses.
$H_0: \mu \le 80$; This year’s batch is as good as the previous batches in College Algebra.
$H_a: \mu > 80$; This year’s batch is better in College Algebra than the previous batches.

To clarify things a little bit more, note that $\mu$ denotes the mean of the population
where our sample came from. In our hypotheses, we want to find out whether the
population where our sample came from has a significantly higher mean than 80. The $\mu$
in the hypotheses doesn’t refer to the mean of the previous batches of students because
that was not the population where our sample came from.
However, in the formula, the $\mu$ refers to the value to which we wish to compare the
mean of the population where our sample came from.

Step 2.
Since we are not told to use a specific level of significance, let us set $\alpha = 0.05$.
Furthermore, the test is one-tailed right directional due to the $>$ symbol in the alternative
hypothesis ($H_a$). The critical value of $z$ found in Table 1 is +1.65. Notice that we picked
the positive critical value since the test is right directional (the rejection region is at the
right tail of the distribution).


Step 3.
Decision Rule: Reject $H_0$ if the computed value of $z$ is $\ge +1.65$.

[Figure: normal curve centered at 0 with the critical value +1.65 marked. The acceptance region lies to the left of +1.65 and the rejection region to its right.]

Note: If you don’t understand what’s going on, take a careful look at the figure. Any
computed value greater than or equal to +1.65 will fall outside the acceptance
region. It will be within the rejection region.

Step 4.
Decision: We reject $H_0$ because the computed value of $z$ is +2.50, which is $\ge$ the
tabular value of +1.65. NOTE: THE FINDING IS SIGNIFICANT.

[Figure: the same curve with the computed value +2.50 marked to the right of the critical value +1.65. The computed value is within the rejection region.]

Step 5.
Conclusion: This year’s batch is better in College Algebra than the previous batches.

Now, let us polish up our presentation of the solution. Here is the final write-up.

Step 1.
$H_0: \mu \le 80$; This year’s batch is as good as the previous batches in College Algebra.
$H_a: \mu > 80$; This year’s batch is better in College Algebra than the previous batches.

Step 2.
$\alpha = 0.05$; one-tailed test; $z_{\text{tabular}} = +1.65$

Step 3.
Decision Rule: Reject $H_0$ if $z_{\text{computed}} \ge +1.65$.

Step 4.
Decision: We reject $H_0$ because $z_{\text{computed}}$ $(+2.50) \ge +1.65$.

Step 5.
Conclusion: This year’s batch is better in College Algebra than the previous batches.
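
The write-up above can be traced in a few lines of code. This sketch uses the values of Example 2.2.1 and the tabular value from Table 1; the variable names are only illustrative.

```python
from math import sqrt

# Values from Example 2.2.1
x_bar, mu0, sigma, n = 84, 80, 10, 39
z_tabular = 1.65   # Table 1: alpha = 0.05, one-tailed

# Computed value of z
z_computed = (x_bar - mu0) * sqrt(n) / sigma

# Decision rule for a right directional test: reject H0 if z_computed >= z_tabular
reject_h0 = z_computed >= z_tabular

print(f"z computed = {z_computed:+.2f}")   # z computed = +2.50
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Only the comparison in the decision rule changes when the test is left directional or two-tailed; the computed value is obtained in the same way.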

2.2.2 Example.
The average score in the final examination in College Algebra at ABC College is known
to be 80 with a standard deviation of 10. A random sample of 39 students was taken from
this year’s batch and it was found that they have a mean score of 84. Is the mean score in
College Algebra of this year’s batch different from 80?

Solution:
We have the same data as in Example 2.2.1, so we will get the same computed statistic
($z_{\text{computed}} = +2.50$).
Again, the $\mu$ in the hypotheses refers to the mean of the population where our sample
came from. This time we have a two-tailed test because we will have a $\neq$ symbol in the
alternative hypothesis ($H_a$). The tabular values for a two-tailed test with $\alpha = 0.05$ found in
the table are $\pm 1.96$.

Note: There are actually two critical values and two rejection regions. Why? Because this is a
two-tailed test.

[Figure: normal curve centered at 0 with the two critical values -1.96 and +1.96 marked.]

Here is the 5-step solution:


Step 1.
$H_0: \mu = 80$; The mean score of this year’s batch is equal to 80.
$H_a: \mu \neq 80$; The mean score of this year’s batch is not equal to 80.

Step 2.
$\alpha = 0.05$; two-tailed test; $z_{\text{tabular}} = \pm 1.96$

Step 3.
Decision Rule: Reject $H_0$ if $z_{\text{computed}} \le -1.96$ or $z_{\text{computed}} \ge +1.96$.

Step 4.
Decision: We reject $H_0$ because $z_{\text{computed}}$ $(+2.50) \ge +1.96$.
NOTE: THE FINDING IS SIGNIFICANT.

Step 5.
Conclusion: The mean score of this year’s batch is not equal to 80.

[Figure: normal curve centered at 0 with the critical values -1.96 and +1.96 and the computed value +2.50 marked. The computed value falls in one of the two rejection regions.]
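
For a two-tailed test, the two-part decision rule is equivalent to comparing the absolute value of the computed z with the positive tabular value. A minimal sketch follows; the function name `reject_two_tailed` is hypothetical, and the second call uses a made-up value of z for contrast.

```python
def reject_two_tailed(z_computed, z_tabular):
    """Two-tailed rule: reject H0 if z <= -z_tabular or z >= +z_tabular."""
    return abs(z_computed) >= z_tabular


print(reject_two_tailed(2.50, 1.96))    # True: the computed value of Example 2.2.2
print(reject_two_tailed(-1.20, 1.96))   # False: a made-up value inside the acceptance region
```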


PRACTICE EXERCISE #1
Animal studies suggest that the anticholinesterase drug physostigmine improves memory.
This could have some clinical applications in humans (e.g., senility, Alzheimer’s disease).
Studies with humans typically report that we remember an average of seven of 15 words given
an 80-minute retention interval. These studies also suggest a standard deviation for the
population of two.
The following table shows the scores for a sample of 20 subjects.

9 8 8 9 9
8 10 8 10 7
7 7 8 8 10
9 8 8 7 9

The research question is: Does physostigmine improve memory in humans?

Solution:
Here, we have $n = 20$, $\sigma = 2$, and hypothesized mean $\mu = 7$. But we don’t know the
sample mean ($\bar{x}$).

So, use your calculator to determine the sample statistic.

$\bar{x} =$

Determine the z statistic.

$$z_{\text{computed}} = \frac{(\bar{x} - \mu)\sqrt{n}}{\sigma} =$$

Write your 5-step solution:

Step 1.
$H_0$:

$H_a$:

Step 2.
$\alpha = 0.05$; ______-tailed test; $z_{\text{tabular}}$ = _____________
Draw the normal curve and then shade the rejection region.
Mark the critical/tabular value.

Step 3.
Decision Rule:

Step 4.
Decision:

Step 5.
Conclusion:


LESSON 3. THE TWO-SAMPLE z-TEST


OBJECTIVES:
At the end of this lesson, you should be able to:
• Determine when the two-sample z-test is used.
• Use the two-sample z-test in hypothesis testing.

3.1 The Two-Sample z-Test
The two-sample z-test is used if we want to compare the means of two samples.
Note: Technically speaking, we are comparing two population means. It just happens that we
have the sample means.
The formula to determine the test statistic is:

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where:
$\bar{x}_1, \bar{x}_2$ = sample means
$s_1, s_2$ = sample standard deviations
$n_1, n_2$ = sample sizes

3.1.1 Example. Two-Sample z-Test
The dean of ABC University wants to know which method is better in teaching
Biochemistry. She took a random sample of 40 students handled by only one teacher in
lecture and laboratory, and found it to have a mean final grade of 83 with a standard
deviation of 7. Fifty students from a group handled by two different teachers in lecture and
laboratory were randomly taken and it was found that they have a mean final grade of 87
with a standard deviation of 10. Does this indicate that the two-teacher setup is better than the
one-teacher setup? Test at $\alpha = 0.01$.

Solution: Here, we are comparing two groups (samples).
So, the test statistic to be used is:

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where: $\bar{x}_1 = 83$, $\bar{x}_2 = 87$, $s_1 = 7$, $s_2 = 10$, $n_1 = 40$, $n_2 = 50$.

Group 1                                                        | Group 2
Students handled by one teacher in both lecture and laboratory | Students handled by two different teachers in lecture and laboratory
$n_1 = 40$                                                     | $n_2 = 50$
$\bar{x}_1 = 83$                                               | $\bar{x}_2 = 87$
$s_1 = 7$                                                      | $s_2 = 10$

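We substitute these values in the formula. We get:

$$
\begin{aligned}
z &= \frac{83 - 87}{\sqrt{\dfrac{7^2}{40} + \dfrac{10^2}{50}}} \\
  &= \frac{83 - 87}{\sqrt{\dfrac{49}{40} + \dfrac{100}{50}}} \\
  &= \frac{-4}{\sqrt{1.225 + 2}} \\
  &= \frac{-4}{\sqrt{3.225}} \\
  &= \frac{-4}{1.7958284996}
\end{aligned}
$$

$z = -2.2273841856 \approx -2.23$    This is our computed value of $z$.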
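
The same computation can be sketched as a function, which is handy for checking the arithmetic. The name `two_sample_z` is hypothetical; the inputs are the values of this example.

```python
from math import sqrt


def two_sample_z(mean1, mean2, s1, s2, n1, n2):
    """Computed z for comparing two means using the sample standard deviations."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error


# Group 1: one-teacher setup; Group 2: two-teacher setup
z = two_sample_z(mean1=83, mean2=87, s1=7, s2=10, n1=40, n2=50)
print(round(z, 2))  # -2.23
```

Swapping the two groups changes only the sign of the result, which is why the direction of the test must follow the way the groups are labeled.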

Here’s the 5-step solution:
Step 1.
$H_0: \mu_{\text{one-teacher}} \ge \mu_{\text{two-teacher}}$; The one-teacher setup is at least as good as the two-teacher
setup.
$H_a: \mu_{\text{one-teacher}} < \mu_{\text{two-teacher}}$; The one-teacher setup is inferior to the two-teacher setup.

Note: You can swap the two groups such that group 1 will be the students
under the two-teacher setup while group 2 will be the students under the
one-teacher setup. You’ll just have to be very careful about the formulation
of the hypotheses. Also, the direction of the test will change.

Step 2.
$\alpha = 0.01$; one-tailed left directional test; $z_{\text{tabular}} = -2.33$

Step 3.
Decision Rule: Reject $H_0$ if $z_{\text{computed}} \le -2.33$.

Step 4.
Decision: We cannot reject $H_0$ because $z_{\text{computed}}$ $(-2.23)$ is not $\le -2.33$.
Note: The finding is NOT SIGNIFICANT.


Step 5.
Conclusion: At $\alpha = 0.01$, there is not enough evidence to conclude that the two-teacher
setup is better than the one-teacher setup.

3.1.2 Example. Two-Sample z-Test

Note: In this example, the researcher had to compute some statistics first
before using the formula for the computed z-value.

A Biostatistics professor wants to know if students with ordinary scientific calculators got
significantly lower scores in Biostatistics than those with more advanced scientific
calculators like the Texas Instruments TI-83 or TI-84. To verify her claim, she did the
following:

Step 1. She took a sample of 40 students who use ordinary scientific calculators and
recorded their midterm exam scores:

79 80 83 90 70 65 60 71 85 89
80 87 85 85 75 73 74 71 70 68
88 78 81 85 87 91 93 90 83 84
81 80 74 73 71 66 65 60 78 78

Step 2. She then took a sample of 50 students who use advanced scientific calculators and
got the following scores:

88 80 81 66 75 75 78 61 85 90
81 85 88 84 86 83 84 85 88 81
95 90 99 98 65 60 61 90 91 93
82 85 91 90 88 81 80 80 87 87
89 88 91 83 80 80 83 81 86 83

Step 3. She computed the mean and standard deviation of the scores of each group of
students and got the following:
$\bar{x}_1 = 78.15$, $s_1 = 8.73$ for students using ordinary scientific calculators
$\bar{x}_2 = 83.22$, $s_2 = 8.65$ for students using advanced scientific calculators

Activity: Verify her results using your calculator.

Step 4. She chose $\alpha = 0.05$.

Step 5. She computed the z statistic for two samples and got the following:
$z_{\text{computed}} = -2.748$

Activity: Verify her result using your calculator.
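
The professor's results can also be checked with a short script instead of a calculator. The sketch below uses Python's standard `statistics` module, whose `stdev` function computes the sample standard deviation (dividing by n - 1). The variable names are only illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Midterm scores of students using ordinary scientific calculators (n = 40)
ordinary = [
    79, 80, 83, 90, 70, 65, 60, 71, 85, 89,
    80, 87, 85, 85, 75, 73, 74, 71, 70, 68,
    88, 78, 81, 85, 87, 91, 93, 90, 83, 84,
    81, 80, 74, 73, 71, 66, 65, 60, 78, 78,
]
# Midterm scores of students using advanced scientific calculators (n = 50)
advanced = [
    88, 80, 81, 66, 75, 75, 78, 61, 85, 90,
    81, 85, 88, 84, 86, 83, 84, 85, 88, 81,
    95, 90, 99, 98, 65, 60, 61, 90, 91, 93,
    82, 85, 91, 90, 88, 81, 80, 80, 87, 87,
    89, 88, 91, 83, 80, 80, 83, 81, 86, 83,
]

x1, s1, n1 = mean(ordinary), stdev(ordinary), len(ordinary)
x2, s2, n2 = mean(advanced), stdev(advanced), len(advanced)

# Two-sample z statistic
z = (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)

print(f"x1 = {x1:.2f}, s1 = {s1:.2f}")   # x1 = 78.15, s1 = 8.73
print(f"x2 = {x2:.2f}, s2 = {s2:.2f}")   # x2 = 83.22, s2 = 8.65
print(f"z = {z:.3f}")                    # z = -2.748
```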



The following is the 5-step solution:
Let $\mu_1$ be the mean score of those with ordinary scientific calculators and $\mu_2$ be the
mean score of those with advanced scientific calculators.

Step 1.
$H_0: \mu_1 \ge \mu_2$; The group with ordinary scientific calculators performed at least as well
as the group using advanced scientific calculators.
$H_a: \mu_1 < \mu_2$; The group with ordinary scientific calculators performed lower than the group
using advanced scientific calculators.

Step 2.
$\alpha = 0.05$; one-tailed left directional test; $z_{\text{tabular}} = -1.65$

Step 3.
Decision Rule: Reject $H_0$ if $z_{\text{computed}} \le -1.65$.

Step 4.
Decision: We reject $H_0$ because $z_{\text{computed}}$ $(-2.748) \le -1.65$. Note: Significant finding.

Step 5.
Conclusion: Students who use ordinary scientific calculators performed lower than those
with more advanced scientific calculators.

[Figure: normal curve centered at 0 with the tabular (critical) value -1.65 and the computed value -2.748 marked. The computed value falls within the rejection region.]

PRACTICE EXERCISE #2

A researcher wishes to determine if there is a significant difference in the systolic
pressures of female nursing graduates of the same age who are reviewing for a board exam
and those who have just finished taking the board exam.

Systolic BPs of BS Nursing Graduates
Reviewing for the Board Exam
120 115 130 140 125
130 120 130 140 130
125 130 140 110 140
120 150 150 110 150
140 140 120 150 130

Systolic BPs of BS Nursing Graduates Who
Have Just Taken the Board Exam
110 90 125 120 120
100 120 100 100 110
100 110 110 100 130
90 100 100 120 100
110 90 90 110 100

Solution:
Use your calculator to determine the sample statistics.

$n_1 = 25$    $n_2 = 25$

$\bar{x}_1 =$

$\bar{x}_2 =$

$s_1 =$

$s_2 =$


The test statistic to be used is:

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

$z =$

Write your 5-step solution:

Step 1.
$H_0$:

$H_a$:

Step 2.
$\alpha = 0.05$; ______-tailed test; $z_{\text{tabular}}$ = _____________
Draw the normal curve and then shade the rejection region.
Mark the critical/tabular value.

Step 3.
Decision Rule:

Step 4.
Decision:

Step 5.
Conclusion:

EXERCISES
1. Hyperactive children are often disruptive in the typical classroom setting because they
find it difficult to remain seated for extended periods of time. Baseline data from a very large
study show that the typical frequency of “out-of-seat behaviors” was 12.38 per 30-minute
period, with a standard deviation of 3.52. A treatment known as covert positive
reinforcement was applied to a group of 30 hyperactive children. The mean number of “out-
of-seat behaviors” was reduced to 11.59 per 30-minute observation period. Using the 0.01

LY
significance level, can we conclude that this decline in “out-of-seat behaviors” is significant?

2. A machine is set to fire 30.00 decigrams of chocolate pellets into a box of cake mix as it
moves along the production line. Of course, there is some variation in the weight of the
pellets. A sample of 36 boxes of mix revealed that the average weight of the chocolate
pellets was 30.08 decigrams, with a sample standard deviation of 0.50 decigrams. Is the
increase in the weight of the pellets significant at the 0.05 level? Apply the usual five steps
to be followed in hypothesis testing.

3. There have been complaints that resident physicians and nurses at the Las Palmas
Hospital desk respond slowly to emergency calls from senior citizens who are medical or
surgical patients. It is claimed that other patients receive faster service. The 0.01 level of
significance is to be used to test the hypothesis that the response times to emergency calls
from senior citizens and from other patients are the same. The alternative hypothesis is that
the response times for the senior citizens are greater than those for other medical or
surgical patients.
Unknown to the resident physicians and nurses, lengths of time it took them to respond
to the calls of both senior citizens and other patients were recorded. The sample results are
summarized as follows:

Patients          Sample Mean    Sample Standard Deviation    Number in Sample
Senior citizens   5.5 minutes    0.4 minutes                  50
Other patients    5.3 minutes    0.3 minutes                  100

4. The amount of a certain trace element in blood is known to vary with a standard
deviation of 14.1 ppm (parts per million) for male blood donors and 9.5 ppm for female
donors. Random samples of 75 male and 50 female donors yield concentration means of 28
and 33 ppm, respectively. What is the likelihood that the population means of concentrations
of the element are the same for men and women? Test at α = 0.05 and at α = 0.05.

5. In an experiment to see if light from ultraviolet sun lamps affects muscle size, 50 pairs of
laboratory animals were segregated into two groups. One group received ultraviolet light
treatments daily for a month. The other did not. A particular muscle on each animal was
then weighed.
If the mean for the ultraviolet group is 89 milligrams, with a standard deviation of 9, and
the mean for the control group is 57 milligrams, with a standard deviation of 7, what is the
value of the test statistic? Is there any indication that ultraviolet sun lamps affect muscle
size?

MODULE 2
The t-Test

OVERVIEW
• degree of freedom
• t-test

LEARNING OBJECTIVES
At the end of this lesson, students are expected to:
1. Intuitively define “degree of freedom”.
2. Compare the t-distribution with the normal distribution.
3. Perform statistical testing on differences between means using the t-test.
• hypothesized mean vs. sample mean
• two independent samples
• dependent or correlated samples

PREREQUISITES
Students must have sufficient knowledge about hypothesis testing, e.g. significance
level, critical values, implementing decision rules, z-test.

LESSON 1. PRELIMINARIES
OBJECTIVES
At the end of this lesson, you should be able to:
• Determine when the t-test is appropriate.
• Understand what degree of freedom is.
• Determine the degree of freedom for a given sample.
• Understand the t-distribution.
• Look up the critical values for t in a table.
• Differentiate between independent and paired samples.

1.1 The t-Test

The t-test is used if n is small; in this module, it will be used if n < 30 and σ is unknown.
If the sample size is small (n < 30), the values of the mean and standard deviation fluctuate from
sample to sample. When s is used in place of σ, the standardized sample mean may no longer
follow a standard normal distribution. Instead, it follows what we call a t-distribution.
The t-distribution is similar to the z-distribution. Both are symmetrical about the mean and
bell-shaped, but the t-distribution is more variable since t-values depend on the fluctuations
of both the mean and the standard deviation, whereas the z-values depend only on the
fluctuation of the means of the samples.

1.2 Degree of Freedom

Examine the following formula for the sample standard deviation.


$$s = \sqrt{\frac{\sum f\,(x - \bar{x})^2}{n - 1}}$$

The divisor n − 1 is called the degree of freedom (df). If the means and standard deviations
are computed from samples of size n, the values of t follow a t-distribution with df = n − 1. So, if
you have n = 23, then you have a t-distribution whose df = 22. If you have n = 15, then you
have a t-distribution whose df = 14. Therefore, you have a different t curve for each possible
sample size.

Further Explanation about Degree of Freedom

Degree of freedom is the number of values in the final calculation of a statistic that are
free to vary.

Let us talk about the average of five numbers. Remember that the average is just the mean,
a statistic. Suppose you are asked to provide five numbers whose average is 3. We can give the
following first four numbers one at a time:


1st number : 4
2nd number : 1
3rd number : 3
4th number : 5

These four numbers are free to vary. But the fifth one is not. You have no other choice but to
make it 2 to make the average of the five numbers equal to 3 as required. In this case, the
degree of freedom is n − 1 = 5 − 1 = 4 .
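
The same idea can be checked with a short calculation. This is a minimal sketch using the four numbers given above; the variable names are chosen only for illustration.

```python
# Four numbers chosen freely (taken from the text above).
free_values = [4, 1, 3, 5]
required_mean = 3
n = 5

# The total of all five numbers is fixed by the required mean,
# so the last number is determined by the other four.
fifth_value = required_mean * n - sum(free_values)
degrees_of_freedom = n - 1

print(fifth_value)         # 2
print(degrees_of_freedom)  # 4
```

Changing any of the four free values changes the forced fifth value, but the count of freely chosen values stays at n − 1.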

1.3 The t Distribution


The figure below shows different t distributions with df = 4, 10, 20, and ∞.

[Figure: t-distribution curves for df = 4, 10, 20, and ∞. The curve for df = ∞ is the normal
distribution. As df approaches infinity, the t distribution approximates the normal distribution.]

Notice that as df gets higher, the t curve approaches the normal curve. That is why researchers
often use the t-test even for sample sizes greater than 30. In fact, some statistical software
packages do not have a two-sample z-test. You can use the two-sample t-test instead.

1.4 How to Perform a t-Test


The steps in testing the significance of the difference between means using the t-test are
the same as in the z-test. The difference lies in the use of the t-distribution with n − 1 degrees of
freedom instead of the normal distribution.

As n approaches infinity, the critical values of t approach the critical values of z.
That is, “t becomes z”. In fact, for just n = 30, the critical values of t are already close to the
critical values of z.

1.5 How to Determine the Critical Values for t


You will use the table of critical values for the t-distribution. First, you need to prepare the
following information:
1. type of test: two-tailed or one-tailed?
2. the level of significance (α)
3. the degree of freedom (df)

Now, the next part is also easy. You simply need to look at the Table of Critical Values for
the t Distribution included in the appendix of this workbook.

1.5.1 Examples:
• If you have a two-tailed test with α = 0.05 and df = 5, then $t_{tabular}$ = 2.571.
• If you have a one-tailed test with α = 0.01 and df = 10, then $t_{tabular}$ = 2.764.
• If you have a two-tailed test with α = 0.01 and df = 20, then $t_{tabular}$ = __________.
• If you have a one-tailed test with α = 0.05 and df = 30, then $t_{tabular}$ = __________.

1.6 Difference Between Independent and Paired Samples


If two sample groups are independent, there is no connection between any subject in group
1 and any subject in group 2. A comparison of test scores between males and females is an
example of independent samples: there is no connection between any female’s test score and
that of any male. This also means that the samples do not have to be of the same size.
Paired samples are exactly what the name implies. In this case, there is a connection
between scores in one group and scores in the other. For example, the persons in group 1 are the
same persons in group 2. This scenario occurs if, for example, you want to test the effects of
certain drugs on the blood pressures of patients. Here, group 1 will be the patients before the
treatment, while group 2 will be the same patients after treatment. Group 1 and group 2 are
related because they are the same persons, and the two groups are of the same size. Another
example is when you pair people by age. You form two groups by choosing two 15-year-olds and
assigning one to each group. Then you pick two 17-year-olds and assign one to each group,
then two 20-year-olds, and so on. In this case, you will have two groups of the
same size paired by age. You can even pair by height, or even by foot size.

The terms independent, uncorrelated, and unpaired can be used interchangeably, and so
can the terms dependent, correlated, and paired. There are three types of t-test:
(1) one-sample t-test, (2) t-test for independent samples, and (3) t-test for dependent samples.
The t-test for independent samples is also called the unpaired t-test. The t-test for dependent
samples is also called the paired t-test or correlated t-test.

LESSON 2. THE ONE-SAMPLE t-TEST


OBJECTIVES
At the end of this lesson you should be able to:
• Determine when the one-sample t-test is appropriate.
• Use one-sample t-test in hypothesis testing.

2.1 The One-Sample t-Test
The one-sample t-test is a statistical procedure used to determine whether there is a
significant difference between the mean of the sample and a known or hypothesized
value of the population mean. It is used if we have a sample and we know or can compute
the following:
• the sample size (n)
• the sample mean ($\bar{x}$)
• the sample standard deviation (s)
• the population mean or its hypothesized or assumed value (μ)

2.1.1 Example. One-Sample t-test


The purpose of the study was to determine whether touch therapists (TT) could detect a
human energy field (HEF) without seeing the patient. Fifteen touch therapists
participated in the initial test. A coin was flipped to determine which of each TT’s hands
would be the target. The experimenter then held her right hand, palm down, 8-10 cm above
the TT’s hand and said, “Okay.” The TT had to determine which of his or her hands was
below and nearer to the experimenter’s hand. Each TT was given 10 opportunities to select
the correct hand.
In this example, we have data from the sample, but no population data. In order to
calculate the t-value, we need the population mean (μ), the sample mean ($\bar{x}$), and the
sample standard deviation (s). The experimenter gathered (computed) the following from
the sample:

n = 15 touch therapists
df = 15 − 1 = 14
$\bar{x}$ = 4.67
s = 1.74
If a HEF does not exist, you would expect the TT to be successful at most 50% of
the time. Your population mean (μ) would be 50% of 10, or μ = 5 (hypothesized mean). If a
HEF does exist, then you would expect a mean number of successes significantly greater than 5.

Step 1.
H0 : μ ≤ 5 ; The TT’s cannot detect HEF.
Ha : μ > 5 ; The TT’s can detect HEF.

Step 2.
α = 0.05 ; one-tailed test; $t_{tabular}$ = +1.761 (found in the table)

[Figure: t-distribution with the rejection region to the right of the critical value +1.761
and the acceptance region to its left.]

Step 3.
Decision Rule: Reject H0 if $t_{computed} \ge t_{tabular}$.

Step 4.
$$t = \frac{(\bar{x} - \mu)\sqrt{n}}{s} = \frac{(4.67 - 5)\sqrt{15}}{1.74} = -0.73$$

Decision: We cannot reject H0 because the computed value of t is less than the critical
value.

[Figure: t-distribution with the computed value t = −0.73 inside the acceptance region, to the
left of the critical value +1.761 found in the table.]

Step 5.
Conclusion: Based on the sample evidence, we cannot conclude that touch therapists can
detect a human energy field (HEF).
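
The arithmetic in Step 4 can be reproduced from the summary statistics. The snippet below is only a sketch; the helper name `one_sample_t` is hypothetical, and the critical value is simply copied from the table value quoted in Step 2.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, s, n):
    """t = (sample mean - hypothesized mean) * sqrt(n) / s"""
    return (sample_mean - hypothesized_mean) * math.sqrt(n) / s


t_computed = one_sample_t(sample_mean=4.67, hypothesized_mean=5, s=1.74, n=15)
t_tabular = 1.761  # one-tailed, alpha = 0.05, df = 14 (from the table in the text)

print(round(t_computed, 2))     # -0.73
print(t_computed >= t_tabular)  # False, so H0 is not rejected
```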

2.1.2 Example. One-Sample t-test


Ten randomly selected oil wells in a large field produced 21, 19, 20, 22, 24, 21, 19, 22,
22, and 20 barrels of crude oil per day. Is this enough evidence to conclude that the oil wells
are not producing an average of 22.5 barrels of crude oil per day? Test at the 0.01 level of
significance.
The given data are:
μ = 22.5 (the hypothesized mean)
α = 0.01
df = 10 − 1 = 9

The sample mean ($\bar{x}$) and the standard deviation (s) are not given; therefore you have
to determine them using your calculator. Try to get these values:
$\bar{x}$ = 21
s = 1.56
Substituting these values into the formula, we have:

$$t_{computed} = \frac{(\bar{x} - \mu)\sqrt{n}}{s} = \frac{(21 - 22.5)\sqrt{10}}{1.56} = -3.04$$

The 5-step solution:

Step 1.
H0 : μ = 22.5 ; The oil wells are producing 22.5 barrels of oil a day.
Ha : μ ≠ 22.5 ; The oil wells are not producing 22.5 barrels of oil a day.

Step 2.
α = 0.01 ; two-tailed test; df = 9; $t_{tabular}$ = ±3.25

[Figure: t-distribution centered at 0 with the critical values −3.25 and +3.25 marked.]

Step 3.
Decision Rule: Reject H0 if $t_{computed} \le -t_{tabular}$ or $t_{computed} \ge +t_{tabular}$.

Step 4.
$$t_{computed} = \frac{(\bar{x} - \mu)\sqrt{n}}{s} = \frac{(21 - 22.5)\sqrt{10}}{1.56} = -3.04$$

Decision: We cannot reject H0 because the computed value of t is neither ≤ −3.25
nor ≥ +3.25. That is, it does not fall outside the acceptance region.

[Figure: t-distribution with the computed value −3.04 inside the acceptance region, between the
critical values −3.25 and +3.25.]

Step 5.
Conclusion: At the 0.01 level of significance, there is not enough evidence to conclude that
the oil wells are not producing an average of 22.5 barrels of oil a day.
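
As a check on the calculator work, the whole example can be recomputed from the raw data. This is a sketch using only Python's standard library; the variable names are illustrative and the critical value is the one quoted from the table in Step 2.

```python
import math
import statistics

barrels = [21, 19, 20, 22, 24, 21, 19, 22, 22, 20]
hypothesized_mean = 22.5
t_tabular = 3.25  # two-tailed, alpha = 0.01, df = 9 (from the text)

n = len(barrels)
sample_mean = statistics.mean(barrels)
s = statistics.stdev(barrels)  # sample standard deviation (divisor n - 1)

t_computed = (sample_mean - hypothesized_mean) * math.sqrt(n) / s
reject_h0 = abs(t_computed) >= t_tabular

print(f"{sample_mean:.2f} {s:.2f}")  # 21.00 1.56
print(round(t_computed, 2))          # -3.03
print(reject_h0)                     # False
```

The unrounded result is −3.03 rather than −3.04 because the worked solution rounds s to 1.56 before dividing; the decision is the same either way.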


LESSON 3. THE t-TEST FOR INDEPENDENT SAMPLES


OBJECTIVES
At the end of this lesson, you should be able to:
• Determine when the t-test for independent samples is appropriate.
• Use the t-test for independent samples in hypothesis testing.

3.1 Performing a t-Test for Independent Samples
The independent samples t-test compares the mean scores of two groups on a given
variable. Before performing this test, we need to meet the following assumptions:
• The samples came from normal distributions.
• The two groups have approximately equal variance.
• The two groups are independent of one another.

The formula:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
To compute $s_p$, the pooled standard deviation, we use the following equation:
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
The degree of freedom is df = n1 + n2 − 2. (We will use df when we look up values from the
table.)

3.1.1 Example. t-test for Independent Samples


A teacher wants to find out if the Team-Based Instruction (TBI) method of teaching
Biostatistics is more effective than the Individually-Guided Instruction (IGI) method. Two
classes of approximately equal intelligence were selected. From one class, she considered
15 students with whom she used the TBI method, and from the other class, she considered 14
students with whom she used the IGI method. After several sessions, a 30-item test was
given. The scores are shown in the following table.

Students
Method  #1  #2  #3  #4  #5  #6  #7  #8  #9  #10  #11  #12  #13  #14  #15
TBI     30  28  29  20  18  19  16  27  22  24   26   28   30   29   18
IGI     25  27  20  30  16  21  15  25  28  21   19   17   18   13

Based on the result of the test, can you say that the TBI method of teaching is more
effective than the IGI method? Use α = 0.05.

Solution:
We need to determine the sample statistics. Use your calculator and check whether you
also get the following values.

Method    Statistics
1. TBI    n1 = 15    $\bar{x}_1$ = 24.27    s1 = 4.98
2. IGI    n2 = 14    $\bar{x}_2$ = 21.07    s2 = 5.21

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(15 - 1)(4.98)^2 + (14 - 1)(5.21)^2}{15 + 14 - 2}} = 5.092 \quad \text{(pooled standard deviation)}$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{24.27 - 21.07}{5.092 \sqrt{\dfrac{1}{15} + \dfrac{1}{14}}} = +1.69 \quad \text{(computed value)}$$

df = n1 + n2 − 2 = 15 + 14 − 2 = 27 (degree of freedom)

Below is the 5-step solution:

Step 1.
H0 : $\mu_{TBI} \le \mu_{IGI}$ ; TBI is only as effective as IGI.
Ha : $\mu_{TBI} > \mu_{IGI}$ ; TBI is more effective than IGI.

Step 2.
α = 0.05 ; one-tailed test; df = 27; $t_{tabular}$ = +1.703 (found in the table)

[Figure: t-distribution with the rejection region to the right of the critical value +1.703
and the acceptance region to its left.]

Step 3.
Decision Rule: Reject H0 if $t_{computed} \ge t_{tabular}$.

Step 4.
Decision: We cannot reject H0 because the computed value of t (+1.69) is less than the
critical value (+1.703).

[Figure: t-distribution with the computed value t = +1.69 inside the acceptance region, just to
the left of the critical value +1.703 found in the table.]

Step 5.
Conclusion: Based on the sample evidence, we cannot conclude that the team-based
instruction (TBI) method is more effective than the individually-guided instruction (IGI) method.
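
The pooled standard deviation and the t statistic in this example can be recomputed directly from the scores. The following is a sketch using only Python's standard library; variable names are illustrative, and the critical value is copied from Step 2.

```python
import math
import statistics

tbi = [30, 28, 29, 20, 18, 19, 16, 27, 22, 24, 26, 28, 30, 29, 18]
igi = [25, 27, 20, 30, 16, 21, 15, 25, 28, 21, 19, 17, 18, 13]
t_tabular = 1.703  # one-tailed, alpha = 0.05, df = 27 (from the text)

n1, n2 = len(tbi), len(igi)
mean1, mean2 = statistics.mean(tbi), statistics.mean(igi)
var1, var2 = statistics.variance(tbi), statistics.variance(igi)  # divisor n - 1

df = n1 + n2 - 2
pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)
t_computed = (mean1 - mean2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))

print(df)                       # 27
print(round(pooled_sd, 2))      # 5.09
print(round(t_computed, 2))     # 1.69
print(t_computed >= t_tabular)  # False, so H0 is not rejected
```

Working from the raw scores avoids the small rounding differences that appear when s1 and s2 are rounded to two decimals first.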

LESSON 4. THE t-TEST for PAIRED or CORRELATED SAMPLES


OBJECTIVES
At the end of this lesson, you should be able to:
• Determine when the paired t-test is appropriate.
• Use the paired t-test in hypothesis testing.

4.1 Performing a t-Test for Paired or Correlated Samples
The formula for the test statistic is:
$$t = \frac{\bar{d}\,\sqrt{n}}{s_d}, \qquad df = n - 1$$
The formula is somewhat different from the previous ones. What is the variable d doing
there? The variable d stands for the differences between pairs of values from the
two correlated samples. Here $\bar{d}$ is the mean of these differences, $s_d$ is their standard
deviation, and n is the number of pairs.

4.1.1 Example. Paired t-Test


The following are the weights in pounds of 15 students before and after six months of
attending aerobics.

Before  243  179  201  165  183  153  170  180  212  169  178  209  158  192  144
After   231  173  199  162  179  152  164  177  207  170  171  196  159  190  140

Test at α = 0.05 if aerobics is effective in reducing weight.

Solution:
Since the two groups of weight measurements are taken from the same set of persons,
they are correlated. Hence, we use the paired t-test.

Let us first compute the differences. Just perform a subtraction on each pair of
measurements.

Before   After   Difference (d)
243      231     12
179      173     6
201      199     2
165      162     3
183      179     4
153      152     1
170      164     6
180      177     3
212      207     5
169      170     -1
178      171     7
209      196     13
158      159     -1
192      190     2
144      140     4

Now, let us determine the mean of the differences. Check if you can get $\bar{d}$ = 4.4.
Next, let us determine the standard deviation of the differences. Verify that it is
$s_d$ = 4.049691 ≈ 4.05.

Our degree of freedom is df = n − 1 = 15 − 1 = 14. What happened to df = n1 + n2 − 2?
The answer lies in the fact that we are no longer working with two samples. We are
working with a single variable d, and d has only n = 15 items.

Let us now compute the test statistic.
$$t = \frac{\bar{d}\,\sqrt{n}}{s_d} = \frac{4.40\sqrt{15}}{4.05} = 4.21$$

Let $\mu_B$ = the mean before attending aerobics.
Let $\mu_A$ = the mean after attending aerobics.

The 5-step solution is as follows:

Step 1.
H0 : $\mu_B \le \mu_A$ ; Aerobics is not effective in reducing weight.
Ha : $\mu_B > \mu_A$ ; Aerobics is effective in reducing weight.

Don’t get confused here. We’re talking about reducing the weights. Therefore, for
aerobics to be effective, the weights before attending aerobics must be significantly
higher than the weights after attending aerobics ($\mu_B > \mu_A$).

Note: It is critical to take note of how we computed each of the differences.
The formula we used is d = weight before − weight after. So, in our hypotheses,
we have $\mu_B$ on the left side of the relations and $\mu_A$ on the right side. If we
computed the differences using d = weight after − weight before, then we would
have to reverse the positions of $\mu_B$ and $\mu_A$ in the hypotheses.

Step 2.
α = 0.05 ; one-tailed test; df = 14; $t_{tabular}$ = +1.761 (found in the table)

[Figure: t-distribution with the rejection region to the right of the critical value +1.761
and the acceptance region to its left.]

Step 3.
Decision Rule: Reject H0 if $t_{computed} \ge +t_{tabular}$.

Step 4.
Decision: We reject H0 because the computed value of t (+4.21) is greater than the critical
value (+1.761).

Step 5.
Conclusion: Based on the sample evidence, aerobics is effective in reducing weight.

[Figure: t-distribution with the computed value +4.21 in the rejection region, to the right of
the critical value +1.761.]
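
The paired computation above can be reproduced from the before and after weights. This is a sketch using only Python's standard library; variable names are illustrative, and the critical value is the one quoted in Step 2.

```python
import math
import statistics

before = [243, 179, 201, 165, 183, 153, 170, 180, 212, 169, 178, 209, 158, 192, 144]
after = [231, 173, 199, 162, 179, 152, 164, 177, 207, 170, 171, 196, 159, 190, 140]
t_tabular = 1.761  # one-tailed, alpha = 0.05, df = 14 (from the text)

# d = weight before - weight after, matching the direction used in the hypotheses.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_d = statistics.mean(differences)
sd_d = statistics.stdev(differences)  # divisor n - 1
t_computed = mean_d * math.sqrt(n) / sd_d

print(round(mean_d, 2), round(sd_d, 2))  # 4.4 4.05
print(round(t_computed, 2))              # 4.21
print(t_computed >= t_tabular)           # True, so H0 is rejected
```

If the differences were taken as after minus before, the sign of t would flip and the rejection region would move to the left tail, as the note in Step 1 explains.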



SUMMARY AND KEY POINTS:


1. The t-test actually compares population means even though we are using the means of the
samples in the formulas.
2. The t-test can be used to compare the population mean against a hypothesized or
assumed mean. This is called a one-sample t-test.
3. The t-test can be used to compare the means of two samples (actually the means of the
populations from which the two samples came). This is called the two-sample t-test.
4. The two-sample t-test has two variations:
• the t-test for independent samples; and
• the t-test for paired samples
5. The t-test is used for small sample sizes (n < 30). However, it can still be used for samples
where n ≥ 30 in place of the z-test. That is why some statistical software packages do
not have a two-sample z-test.
6. The t-test is parametric. The implication is that it is used when you can meet the
assumption of normality; that is, your samples came from a normally distributed
population, or you can assume that each of your samples came from a normal
distribution, e.g. by performing a normality test.

PRACTICE EXERCISE #1

The Food and Drug Administration is conducting tests on a certain drug to determine if it
has the undesirable side effect of reducing the body’s temperature. It is known that the mean
human temperature is 98.6 °F. The new drug is administered to 25 patients and the patients’
mean temperature drops to 98.3 °F, with a standard deviation of 0.64. At the 0.05 significance
level, is there sufficient reason to conclude that the drug reduces body temperature?

Solution:

Determine the test statistic.

$t_{computed}$ =

Write your 5-step solution.

Step 1.
H0 :
Ha :

Step 2.
α = 0.05 ; _____-tailed test; df = ____; $t_{tabular}$ = ___________ (found in the table)

Step 3.
Decision Rule:

Draw the t-distribution and then mark the critical and computed t-values.

Step 4.
Decision:

Step 5.
Conclusion:

PRACTICE EXERCISE #2

A researcher wishes to conduct a study on the effectiveness of using Drug A and Drug B
as supplements for underweight breast-fed infants. The same group of underweight breast-fed
infants is exposed to the two drugs as vitamin supplements. The infants are almost of the same
age, 1 month, and almost of the same weight. They take Drug A (control or old brand) as a
supplement for three months, and their weight increments are monitored until the third month.
For the next three months, they take Drug B (experimental or new brand) as a supplement, and
their weight increments are monitored every month. In other words, the breast-fed infants are
seven months old when the experiment is through. To determine if there is a significant difference in
the weight increments of breast-fed infants using Drug A (control) and Drug B (experimental) as
supplements, the t-test is used.
The specific research problem of the foregoing study is, “Is there a significant difference
in the weight increments of underweight breast-fed infants using Drug A and Drug B as
supplements?”
The artificial result of the foregoing research problem is shown in the following table.

Mean Weight Increment in Kilograms
                        Month 1   Month 2   Month 3   Month 4   Month 5   Month 6
Drug A (control)        5.4       6.3       7.5
Drug B (experimental)                                 8.5       10.7      12.8

Solution:
Write your preliminary analysis here:

Explain why the t-test for independent samples must be used even though the values came
from the same set of infants.

Determine the sample statistics using your calculator.

Control            Experimental
5.4                8.5
6.3                10.7
7.5                12.8
n1 =               n2 =
$\bar{x}_1$ =      $\bar{x}_2$ =
s1 =               s2 =

Write your 5-step solution here.

Step 1.
H0 :
Ha :

Step 2.
α = 0.01 ; two-tailed test; $t_{tabular}$ = ____________ (look it up in the table)

Draw your t-distribution, mark the critical value, and then shade the rejection region.

Step 3.
Decision Rule:

Step 4.
Compute the t-test statistic and then decide whether or not to reject the null
hypothesis.

Decision:

Step 5.
Conclusion:

PRACTICE EXERCISE #3
A drug manufacturer is conducting a study on the safety of its new vaccine and
wants to find out if it causes a significantly lower lymphocyte count. The vaccine was tested on
5 volunteers, and their blood counts were taken.

Lymphocyte Count
Volunteer    After Vaccination    Before Vaccination
#1           150                  165
#2           155                  170
#3           152                  151
#4           146                  164
#5           152                  160

Solution:

EXERCISES
1. We have a random sample of 25 fifth grade pupils who were asked to do pushups after
completing a special physical education program. The results are as follows:

Number of Pushups

12 20 9 12 10 11 10 15 20
15 20 17 15 12 12 17 18 25
15 15 15 15 15 15 15

Does the average of these fifth grade pupils differ significantly from the population value
of 12?

2. The following table shows the scores of two groups of children who have been scored on
a learning variable.

Group 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
Score 3 7 6 2 9 11 13 8 10 2 5 8 12 12 10 17 12 10 13 8

The condition to which a child was randomly assigned is coded in this table as the
"Group" variable, with either a value of 1 or 2. If Group = 1, the child was assigned to the
no-reward group. If Group = 2, the child was assigned to the reward group. The values under
"Score" are the number of letters the child correctly pronounced during the testing phase.
Is there a significant difference between the mean scores of the two groups of children?
Test at α = 0.01.

3. Are non-smoking rooms less polluted?

Dr. Debruil, an environmental health professional, has been hired by Hotels International
to investigate whether the air in rooms designated as non-smoking contains less airborne
pollution. He has developed a sensing device that measures the volume of airborne cigarette
pollutants in parts per million. This device is installed in a sample of 20 rooms. Ten are
located on the west side of a hall and the remainder on the east side. Those on the west
side are reserved for non-smoking patrons. The readings are as follows:

Smoking Room Number   Pollutants (ppm)   Nonsmoking Room Number   Pollutants (ppm)
312                   6.36               313                      4.72
314                   7.43               315                      5.52
316                   8.14               317                      6.57
318                   8.40               319                      6.60
320                   10.11              321                      8.17
322                   9.62               323                      8.30
324                   10.34              325                      8.72
326                   14.09              327                      12.89
328                   12.19              329                      10.62
330                   12.97              331                      10.96

4. When labor has to be induced, the mother's cervix can fail to soften and enlarge,
prolonging the labor and perhaps requiring delivery by caesarean section. To investigate
whether the cervix can be softened and dilated by treating it with a gel containing
prostaglandin E2, C. O'Herlihy and H. MacDonald (“Influence of Preinduction Prostaglandin
E2 Vaginal Gel on Cervical Ripening and Labor,” Obstet. Gynecol., 54: 708–710, 1979)
applied such a gel to the cervixes of 21 women who were having labor induced and a
placebo gel that contained no active ingredients to 21 other women who were having labor
induced. The two groups of women were of similar ages, heights, weeks of gestation, and
initial extent of cervical dilation before applying the gel. The labor of women treated with
prostaglandin E2 averaged 8.5 h, and the labor of control women averaged 13.9 h. The
standard deviations for these two groups were 4.7 and 4.1 h, respectively. Is there evidence
that the prostaglandin gel shortens labor?

5. A striking example where the correlation between two groups is due to a seasonal effect
follows. Although it is a weather example, these kinds of results can occur easily in clinical
trial data as well. The data are fictitious but are realistic temperatures for the two cities at
various times during the year. We are considering two temperature readings from stations
that are located in neighboring cities, A and B. We may think that it tends to be a little
warmer in City A, but seasonal effects could mask a slight difference of a few degrees. We
want to test the null hypothesis that the average daily temperatures of the two cities are the
same. We will test this hypothesis versus the two-sided alternative that there is a difference
between the cities. The following table shows the mean temperature on the 15th of each
month during a 12-month period.

Daily Temperatures in Cities A and B

Day                  City A Mean Temperature (°F)   City B Mean Temperature (°F)
1 (January 15)       31                             28
2 (February 15)      35                             33
3 (March 15)         40                             37
4 (April 15)         52                             45
5 (May 15)           70                             68
6 (June 15)          76                             74
7 (July 15)          93                             89
8 (August 15)        90                             85
9 (September 15)     74                             69
10 (October 15)      55                             51
11 (November 15)     32                             27
12 (December 15)     26                             24

MODULE 3
Analysis of Variance (ANOVA)

OVERVIEW
• ANOVA
• Post-Hoc Comparison Test

LEARNING OBJECTIVES
At the end of this lesson, students are expected to:
1. Explain when ANOVA is used.
2. Perform a One-Way Analysis of Variance (ANOVA).
3. Explain when a post-hoc analysis is needed.
4. Perform a Post-Hoc Comparison Test.

PREREQUISITES
Students must have sufficient knowledge about hypothesis testing, e.g. significance
level, critical values, implementing decision rules, z-test, degrees of freedom, t-test.

LESSON 1. THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)


OBJECTIVES
At the end of this lesson, you should be able to:
• Know when ANOVA is used.
• Compute different types of variances.
• Use ANOVA in hypothesis testing.

1.1 The One-Way Analysis of Variance (ANOVA)
The ANOVA is used to analyze or to test the significance of differences among the means of
3 or more groups simultaneously. ANOVA is not used to show that variances are different; it is
used to show that means are different by analyzing some types of variances.

1.2 An Illustration of When ANOVA is Used
Three methods used to dissolve a powder in water are compared by the time (in minutes) it
takes until the powder is fully dissolved. The results are summarized in the following table:

Method 1    Method 2    Method 3
15          22          18
18          27          24
19          18          19
22          21          16
11          17          22
                        15

It is thought that the population means of the three methods, $\mu_1$, $\mu_2$, and $\mu_3$, are not
all equal, i.e., at least one $\mu_i$ is different from the others. How can this be tested?
One way is to use multiple two-sample t-tests and compare Method 1 with Method 2,
Method 1 with Method 3, and Method 2 with Method 3 (comparing all the pairs). But if α for
each test is 0.05, the probability of making at least one Type I error when running three tests
would be greater than 0.05.

A better method is called Analysis of Variance, or ANOVA, which is a statistical technique
for determining the existence of differences among several population means. The technique
requires the analysis of different forms of variances – hence the name.
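
To see why running several t-tests inflates the Type I error mentioned above, the overall error rate can be estimated with a short calculation. The sketch below assumes the three tests are independent, which is a simplification (the pairwise comparisons share data), so the result is only a rough illustration.

```python
alpha = 0.05
number_of_tests = 3  # Method 1 vs 2, Method 1 vs 3, and Method 2 vs 3

# Probability of at least one Type I error across all tests,
# ASSUMING the tests are independent (a simplification).
familywise_error = 1 - (1 - alpha) ** number_of_tests

print(round(familywise_error, 4))  # 0.1426
```

Under this assumption the chance of at least one false rejection is roughly 14%, not 5%, which is the motivation for testing all the means at once with ANOVA.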

1.3 Performing an ANOVA


Basically, ANOVA compares two types of variances: the variance within each sample and
the variance between different samples. The assumption is: If the population means are
different, then the variance within the samples must be small compared to the variance between
the samples. Hence, if the variance between samples divided by the variance within each
sample is large, then we say that the means are different.

Steps for Using ANOVA

Step 1. Compute the Variance Between Samples


First, the sum of squares (SS) between sample means is computed:

$$SS_{between} = \sum n\,(\bar{x} - \bar{\bar{x}})^2$$
where $\bar{x}$ is the sample mean and $\bar{\bar{x}}$ is the overall mean (or the grand mean, or
mean of the sample means).

Use your calculator to compute the mean of each sample.

Method 1              Method 2              Method 3
15                    22                    18
18                    27                    24
19                    18                    19
22                    21                    16
11                    17                    22
                                            15
n1 = 5                n2 = 5                n3 = 6
$\bar{x}_1$ = 17.00   $\bar{x}_2$ = 21.00   $\bar{x}_3$ = 19.00

So, we have 3 means: 17.00, 21.00, and 19.00. These are just numbers, so we can
compute their variance. The method that you already know will be used.
There are 3 columns, one for each sample, so we also have k = 3 means.
Note: We are using the variable k to denote the number of columns, which is always equal
to the number of sample means. If you have 5 samples or groups, then you’ll have k = 5.

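So, if we compute the variance between the 3 sample means, we will get:

$n$   $\bar{x}$   $n\bar{x}$   $\bar{x} - \bar{\bar{x}}$   $(\bar{x} - \bar{\bar{x}})^2$   $n(\bar{x} - \bar{\bar{x}})^2$
5     17.00       85.00        17 − 19 = −2                4                               20
5     21.00       105.00       21 − 19 = 2                 4                               20
6     19.00       114.00       19 − 19 = 0                 0                               0
      $\sum n\bar{x}$ = 304.00                             $SS_{between} = \sum n(\bar{x} - \bar{\bar{x}})^2$ = 40

The weighted mean is
$$\bar{\bar{x}} = \frac{\sum n\bar{x}}{\sum n} = \frac{304}{5 + 5 + 6} = 19$$

Now, the variance between the means is
$$s^2_{between} = \frac{\sum n(\bar{x} - \bar{\bar{x}})^2}{k - 1} = \frac{SS_{between}}{k - 1} = \frac{40}{3 - 1} = \frac{40}{2} = 20.$$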

Summary of what we have done so far:

                      n    $n(\bar{x} - \bar{\bar{x}})^2$
Method 1              5    20
Method 2              5    20
Method 3              6    0
SS Factor                  40
k                          3
k − 1                      2     (degree of freedom)
variance between           20

Take note of some terminologies in the table.
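
The Step 1 quantities can be checked with a few lines of code. This is a sketch that starts from the sample sizes and sample means already computed above; the variable names are illustrative.

```python
sizes = [5, 5, 6]           # n for Method 1, Method 2, Method 3
means = [17.0, 21.0, 19.0]  # sample means from the text

k = len(means)
# Weighted (grand) mean: sum of n * mean divided by the total number of items.
grand_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
variance_between = ss_between / (k - 1)

print(grand_mean, ss_between, variance_between)  # 19.0 40.0 20.0
```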

Step 2. Compute the Variance Within Samples
Again, first compute the sum of squares within. This is:

$$SS_{within} = \sum \sum (x - \bar{x})^2$$

Don't let this formula confuse you. It simply means that you're going to determine the sum of squares $\sum (x - \bar{x})^2$ for each sample. So you'll have three of these sums because you have three sample groups. Then, you will simply need to add them all up.

Let's determine $\sum (x - \bar{x})^2$ for each sample.

Therefore,

$$SS_{within} = SS_1 + SS_2 + SS_3 = 70 + 62 + 60 = 192.$$
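
The three sums of squares used above can be checked with a short Python sketch. The helper name sum_of_squares is hypothetical (our own), and the data are the same three samples from Step 1.

```python
# Same three samples as in Step 1.
samples = {
    "Method 1": [15, 18, 19, 22, 11],
    "Method 2": [22, 27, 18, 21, 17],
    "Method 3": [18, 24, 19, 16, 22, 15],
}


def sum_of_squares(values):
    """Sum of squared deviations of the values from their own sample mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


# One sum of squares per sample (SS1, SS2, SS3), then their total.
ss_per_sample = {name: sum_of_squares(values) for name, values in samples.items()}
ss_within = sum(ss_per_sample.values())

print(ss_per_sample)  # {'Method 1': 70.0, 'Method 2': 62.0, 'Method 3': 60.0}
print(ss_within)      # 192.0
```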

To obtain the variance within, use this formula:

$$s^2_{within} = \frac{SS_{within}}{n - k}$$

So, we have:

$$s^2_{within} = \frac{SS_{within}}{n - k} = \frac{192}{16 - 3} = 14.769$$

Note: n = number of all items in all samples.

Summary of what we did:

SS within          192
n                  16
n - k              13     (another degree of freedom)
variance within    14.769

Step 3. Compute the Ratio of variance between and variance within.

$$\frac{\text{variance}_{between}}{\text{variance}_{within}} = \frac{20}{14.769} = 1.354 = F_{computed}$$
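
As a quick check of the arithmetic in Steps 1 to 3, the two variances and their ratio can be recomputed from the sums of squares. This is a minimal sketch; the variable names are our own.

```python
# Values obtained in Steps 1 and 2 of the worked example.
ss_between = 40.0
ss_within = 192.0
k = 3    # number of samples
n = 16   # number of all items in all samples

variance_between = ss_between / (k - 1)   # numerator, df = k - 1
variance_within = ss_within / (n - k)     # denominator, df = n - k
f_computed = variance_between / variance_within

print(round(variance_within, 3))  # 14.769
print(round(f_computed, 3))       # 1.354
```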
This is called the F-ratio, which follows an F-distribution. We have to look up the critical value from the table. But first, we need to remember the degrees of freedom.
Examine the formula:

$$F = \frac{\text{variance}_{between}}{\text{variance}_{within}}$$

There is a numerator and there is a denominator. Remember how you computed them.
Now, answer the following questions:
• What is the degree of freedom (df) for the numerator? Answer: k-1=2.
• What is the degree of freedom (df) for the denominator? Answer: n-k=13.

So, in our Tables of Critical Values of F (appendix), for $\alpha = 0.05$, df numerator = 2, and df denominator = 13, we will get a critical value of 3.81.
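
The tabular value can also be cross-checked without the appendix in this particular case. The sketch below relies on a shortcut that is not part of the module: when the numerator degrees of freedom equal exactly 2, the upper-tail area of the F-distribution has the closed form $P(F > f) = (1 + 2f/d)^{-d/2}$, where $d$ is the denominator degrees of freedom. The function name is hypothetical, and the shortcut does not apply to other numerator degrees of freedom, for which the table or a statistics package is still needed.

```python
def f_critical_df_numerator_2(alpha, df_denominator):
    """Upper-tail critical value of F when the numerator df is exactly 2.

    Solves (1 + 2 f / d) ** (-d / 2) = alpha for f, with d = df_denominator.
    This closed form is valid only for a numerator df of 2.
    """
    d = df_denominator
    return (d / 2) * (alpha ** (-2 / d) - 1)


f_tabular = f_critical_df_numerator_2(0.05, 13)
print(round(f_tabular, 2))  # 3.81
```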

[Figure: The F-distribution with $\alpha = 0.05$ and degrees of freedom 2 and 13. The critical value 3.81 (found in the table) is marked.]

Note: There is no such thing as a two-tailed test in ANOVA. Even the one-tailed
diagram should not be viewed as a one-tailed test.

Step 4. Decide whether or not to reject the assumption that the means of the 3 groups are equal.

Since the F computed value of 1.354 is less than the F tabular value of 3.81, we cannot reject the assumption. Therefore, we do not have sufficient evidence to say that the means of the three methods differ.
We can summarize the computations in an ANOVA table:

Source    DF    SS        MS        F
Factor     2     40.00    20.000    1.354
Error     13    192.00    14.769
Total     15    232.00

ANOVA Table
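
The whole table can be produced in one pass. The following is a minimal sketch of such a routine, assuming the samples are given as a list of lists; the function name one_way_anova and the dictionary layout are our own and do not come from any library.

```python
def one_way_anova(samples):
    """Return the Factor, Error and Total rows of a one-way ANOVA table."""
    all_values = [x for values in samples for x in values]
    n, k = len(all_values), len(samples)
    grand_mean = sum(all_values) / n
    means = [sum(values) / len(values) for values in samples]

    # Between-sample (Factor) and within-sample (Error) sums of squares.
    ss_factor = sum(len(v) * (m - grand_mean) ** 2 for v, m in zip(samples, means))
    ss_error = sum((x - m) ** 2 for v, m in zip(samples, means) for x in v)

    ms_factor = ss_factor / (k - 1)
    ms_error = ss_error / (n - k)
    return {
        "Factor": {"DF": k - 1, "SS": ss_factor, "MS": ms_factor, "F": ms_factor / ms_error},
        "Error": {"DF": n - k, "SS": ss_error, "MS": ms_error},
        "Total": {"DF": n - 1, "SS": ss_factor + ss_error},
    }


table = one_way_anova([
    [15, 18, 19, 22, 11],        # Method 1
    [22, 27, 18, 21, 17],        # Method 2
    [18, 24, 19, 16, 22, 15],    # Method 3
])
for source, row in table.items():
    print(source, {name: round(value, 3) for name, value in row.items()})
```

Rounded to three decimals, the rows agree with the ANOVA table above.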

Let us formalize our solution using the 5-step method.

Step 1.
$H_0: \mu_1 = \mu_2 = \mu_3$; The three methods dissolve powder in the same amount of time.
$H_a: \mu_i \neq \mu_j$ for some $i \neq j$; At least one method has a different mean time from another.

Step 2.
$\alpha = 0.05$; df numerator = 2; df denominator = 13; $F_{tabular} = 3.81$

[Figure: F-distribution with the computed F-ratio 1.354 and the critical value 3.81 marked.]

Step 3.
Decision Rule: Reject $H_0$ if $F_{computed} > F_{tabular}$.

Step 4.
Decision: We cannot reject $H_0$ because the computed value of F, 1.354, is not greater than the critical value 3.81. That is, it does not fall within the rejection region.

Step 5.
Conclusion: There is no significant difference in the mean amount of time in which the three methods dissolve the powder.


PRACTICE EXERCISE #1

Suppose a researcher wishes to determine the effectiveness of teaching Maternal Nursing to nursing students using Method 1, Method 2, Method 3, and Method 4 in a certain College of Nursing in Region 6. There are 12 sections of nursing students who take Maternal Nursing. These 12 sections are divided into four groups. The grouping of the nursing students is heterogeneous, and they are grouped according to surname; for instance, students whose surnames start with the letter A are in Section 1; letters B and C, Section 2; and so on. Sections 1 to 3 are taught using Method 1; Sections 4 to 6, Method 2; Sections 7 to 9, Method 3; and Sections 10 to 12, Method 4. Their mean grades for every departmental test, namely, the preliminary, midterm, and final examinations, are taken.

The specific research problem of the study is: "Is there a significant difference in the effectiveness of teaching Maternal Nursing using Method 1, Method 2, Method 3, and Method 4 to nursing students in a certain College of Nursing in Region 6?" Test at $\alpha = 0.01$.
                     Mean Grade
Treatment            Preliminary    Midterm    Final
Teaching Method 1    1.8            1.6        1.4
Teaching Method 2    1.9            1.7        1.6
Teaching Method 3    2.5            2.0        1.9
Teaching Method 4    2.1            1.9        1.8

SOLUTION:
Step 1. Compute the Variance Between Samples

Use your calculator to determine the means.

Method 1    Method 2    Method 3    Method 4
$x_1$       $x_2$       $x_3$       $x_4$
1.8         1.9         2.5         2.1
1.6         1.7         2.0         1.9
1.4         1.6         1.9         1.8

$n_1 =$ ____    $n_2 =$ ____    $n_3 =$ ____    $n_4 =$ ____
$\bar{x}_1 =$ ____    $\bar{x}_2 =$ ____    $\bar{x}_3 =$ ____    $\bar{x}_4 =$ ____

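Next, determine the variance between the sample means.

$n$     $\bar{x}$    $\bar{x} - \bar{\bar{x}}$    $(\bar{x} - \bar{\bar{x}})^2$    $n(\bar{x} - \bar{\bar{x}})^2$
____    ____         ____                         ____                             ____
____    ____         ____                         ____                             ____
____    ____         ____                         ____                             ____
____    ____         ____                         ____                             ____

$\sum \bar{x} =$ ____          $SS_{between} = \sum n(\bar{x} - \bar{\bar{x}})^2 =$ ____

$\bar{\bar{x}} = \dfrac{\sum \bar{x}}{k} =$ ____

Now, compute the variance between the means of the samples.

$s^2_{between} = \dfrac{\sum n(\bar{x} - \bar{\bar{x}})^2}{k - 1} = \dfrac{SS_{between}}{k - 1} =$ ____ $=$ ____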

Summary of what you have done so far:

                    $n$     $n(\bar{x} - \bar{\bar{x}})^2$
Method 1            ____    ____
Method 2            ____    ____
Method 3            ____    ____
Method 4            ____    ____
SS Factor                   ____
k                           ____
k - 1                       ____    (df numerator)
variance between            ____

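Step 2. Compute the Variance Within Samples

Method 1

$x_1$    $x_1 - \bar{x}_1$    $(x_1 - \bar{x}_1)^2$
____     ____                 ____
____     ____                 ____
____     ____                 ____

$\bar{x}_1 =$ ____     $\sum (x_1 - \bar{x}_1)^2 =$ ____     This is $SS_1$.

Method 2

$x_2$    $x_2 - \bar{x}_2$    $(x_2 - \bar{x}_2)^2$
____     ____                 ____
____     ____                 ____
____     ____                 ____

$\bar{x}_2 =$ ____     $\sum (x_2 - \bar{x}_2)^2 =$ ____     This is $SS_2$.

Method 3

$x_3$    $x_3 - \bar{x}_3$    $(x_3 - \bar{x}_3)^2$
____     ____                 ____
____     ____                 ____
____     ____                 ____

$\bar{x}_3 =$ ____     $\sum (x_3 - \bar{x}_3)^2 =$ ____     This is $SS_3$.

Method 4

$x_4$    $x_4 - \bar{x}_4$    $(x_4 - \bar{x}_4)^2$
____     ____                 ____
____     ____                 ____
____     ____                 ____

$\bar{x}_4 =$ ____     $\sum (x_4 - \bar{x}_4)^2 =$ ____     This is $SS_4$.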

Determine the total sum of squares within samples.

$SS_{within} = SS_1 + SS_2 + SS_3 + SS_4 =$ ____

Compute the variance within samples.

$s^2_{within} = \dfrac{SS_{within}}{n - k} =$ ____

Summary of what you did:

SS within          ____
n                  ____
n - k              ____    (df denominator)
variance within    ____

Step 3. Compute the Ratio of Variance Between and Variance Within

$F_{computed} = \dfrac{\text{variance}_{between}}{\text{variance}_{within}} =$ ____

Fill in the ANOVA table.

Source    DF    SS    MS    F
Factor
Error
Total

Present your analysis in the familiar 5-step method:

Step 1.
$H_0$:
$H_a$:

Step 2.
$\alpha = 0.01$; df numerator = ____; df denominator = ____; $F_{tabular}$ = _____

[Figure: F-distribution with a blank to fill in for the critical value: ______ Critical Value]

Step 3.
Decision Rule:

Step 4.
Decision:

Step 5.
Conclusion:

PRACTICE EXERCISE #2

This time, use the statistical capabilities of your calculator or computer to quickly solve the following problem.

A drug manufacturer conducted a study on the acceptability of its four major brands of cough syrup. The following table shows the acceptability levels of the four brands in terms of taste. The higher the level, the more acceptable the syrup is to respondents.

A    B     C    D
7    9     2    4
3    8     3    5
5    8     4    7
6    7     5    8
9    6     6    3
4    9     4    4
3    10    2    5

Test the following hypothesis at $\alpha = 0.05$.
$H_0: \mu_A = \mu_B = \mu_C = \mu_D$; The mean levels of acceptability of the four brands are the same.

O
Solution:

LESSON 2. POST-HOC TEST


OBJECTIVES
At the end of this lesson, you should be able to:
• Determine when post-hoc analysis is required.
• Perform a post-hoc test using Scheffe’s test.
• Know that there are other post-hoc methods available.

2.1 Post-Hoc Test
To illustrate when post-hoc analysis is needed, let us tackle the problem in the last practice
exercise. The problem is as follows:

A drug manufacturer conducted a study on the acceptability of its four major brands of cough syrup. The following table shows the acceptability levels of the four brands in terms of taste. The higher the level, the more acceptable the syrup is to respondents.

A    B     C    D
7    9     2    4
3    8     3    5
5    8     4    7
6    7     5    8
9    6     6    3
4    9     4    4
3    10    2    5

After doing the ANOVA, the table below was obtained. It indicates that there is a significant difference between at least one pair of brands at $\alpha = 0.05$.

Source    DF    SS        MS       F
Factor     3     72.29    24.10    7.97
Error     24     72.57     3.02
Total     27    144.86

The computed F of 7.97 is greater than the critical value of 3.01.

But which pairs of brands are different? We cannot answer this question from the table.

To find out where the differences lie, another test must be used.
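
For readers who want to verify the ANOVA table for the cough syrup data by computer, the following Python sketch recomputes it from the raw acceptability levels. The variable names are our own; only the data come from the problem.

```python
# Acceptability levels for the four brands, copied from the problem table.
brands = {
    "A": [7, 3, 5, 6, 9, 4, 3],
    "B": [9, 8, 8, 7, 6, 9, 10],
    "C": [2, 3, 4, 5, 6, 4, 2],
    "D": [4, 5, 7, 8, 3, 4, 5],
}

all_values = [x for values in brands.values() for x in values]
n = len(all_values)   # 28 items in all samples
k = len(brands)       # 4 brands
grand_mean = sum(all_values) / n
means = {b: sum(v) / len(v) for b, v in brands.items()}

# Sum of squares between brands (Factor) and within brands (Error).
ss_factor = sum(len(v) * (means[b] - grand_mean) ** 2 for b, v in brands.items())
ss_error = sum((x - means[b]) ** 2 for b, v in brands.items() for x in v)

ms_factor = ss_factor / (k - 1)
ms_error = ss_error / (n - k)
f_computed = ms_factor / ms_error

print(round(ss_factor, 2), round(ss_error, 2))   # 72.29 72.57
print(round(ms_error, 2), round(f_computed, 2))  # 3.02 7.97
```

The results match the Factor and Error rows of the table, with degrees of freedom k - 1 = 3 and n - k = 24.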

2.2 Scheffé’s Test


To determine whether a pair of means differs significantly, we can use Scheffé's test. Its formula is as follows:

$$F' = \frac{(\bar{x}_i - \bar{x}_j)^2}{\dfrac{s^2_{within}(n_i + n_j)}{n_i n_j}}$$

Where:
$F'$ = Scheffé's test statistic
$\bar{x}_i, \bar{x}_j$ = means of the samples to compare
$n_i, n_j$ = sizes of the samples to compare
$s^2_{within}$ = the variance within or mean squares within (overall, found in the ANOVA table that you constructed after performing your ANOVA test)
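
The formula translates directly into a small function. This is a minimal sketch with a hypothetical function name (scheffe_f); the example call uses the rounded means of brands A and B (5.28 and 8.14), the mean squares within (3.02), and the sample sizes (7 and 7) from this lesson.

```python
def scheffe_f(mean_i, mean_j, ms_within, n_i, n_j):
    """Scheffé's F' for one pair of sample means.

    F' = (mean_i - mean_j)^2 / (ms_within * (n_i + n_j) / (n_i * n_j))
    """
    return (mean_i - mean_j) ** 2 / (ms_within * (n_i + n_j) / (n_i * n_j))


# Brand A versus brand B, using the rounded values printed in this lesson.
f_ab = scheffe_f(5.28, 8.14, 3.02, 7, 7)
print(round(f_ab, 2))  # 9.48
```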


We need to apply this test pairwise. That is: A vs B, A vs C, A vs D, B vs C, B vs D, C vs D.

So, let us perform pairwise comparisons among the different brands.

Brand A versus B:

$$F'_{AB} = \frac{(\bar{x}_A - \bar{x}_B)^2}{\dfrac{s^2_{within}(n_A + n_B)}{n_A n_B}} = \frac{(5.28 - 8.14)^2}{\dfrac{3.02(7 + 7)}{7 \cdot 7}} = 9.48$$

Brand A versus C:

$$F'_{AC} = \frac{(\bar{x}_A - \bar{x}_C)^2}{\dfrac{s^2_{within}(n_A + n_C)}{n_A n_C}} = \frac{(5.28 - 3.71)^2}{\dfrac{3.02(7 + 7)}{7 \cdot 7}} = 2.86$$

Brand A versus D:

$$F'_{AD} = \frac{(\bar{x}_A - \bar{x}_D)^2}{\dfrac{s^2_{within}(n_A + n_D)}{n_A n_D}} = \frac{(5.28 - 5.14)^2}{\dfrac{3.02(7 + 7)}{7 \cdot 7}} = 0.02$$

Brand B versus C:

$$F'_{BC} = \frac{(\bar{x}_B - \bar{x}_C)^2}{\dfrac{s^2_{within}(n_B + n_C)}{n_B n_C}} = \frac{(8.14 - 3.71)^2}{\dfrac{3.02(7 + 7)}{7 \cdot 7}} = 22.74$$

Brand B versus D:

$$F'_{BD} = \frac{(\bar{x}_B - \bar{x}_D)^2}{\dfrac{s^2_{within}(n_B + n_D)}{n_B n_D}} = \frac{(8.14 - 5.14)^2}{\dfrac{3.02(7 + 7)}{7 \cdot 7}} = 10.43$$

Brand C versus D:

$$F'_{CD} = \frac{(\bar{x}_C - \bar{x}_D)^2}{\dfrac{s^2_{within}(n_C + n_D)}{n_C n_D}} = \frac{(3.71 - 5.14)^2}{\dfrac{3.02(7 + 7)}{7 \cdot 7}} = 2.37$$

The critical value of F at $\alpha = 0.05$ and degrees of freedom 3 and 24 is:

$$F_{tabular} = 3.01$$

Between Brand    F'       Critical Value    Interpretation
A vs B            9.48    3.01              significant
A vs C            2.86    3.01              not significant
A vs D            0.02    3.01              not significant
B vs C           22.74    3.01              significant
B vs D           10.43    3.01              significant
C vs D            2.37    3.01              not significant

The analysis shows that there is a significant difference in the levels of acceptability between brand A and brand B, brand B and brand C, and brand B and brand D. However, brands A and C, brands A and D, and brands C and D do not have significant differences in their acceptability.
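
All six comparisons can be generated in one loop. The sketch below is only illustrative: it uses the rounded means and the mean squares within exactly as printed in this lesson, compares each F' with the tabular value 3.01 as the lesson does, and the variable names are our own.

```python
from itertools import combinations

# Rounded sample means, sample sizes, and mean squares within from this lesson.
means = {"A": 5.28, "B": 8.14, "C": 3.71, "D": 5.14}
sizes = {"A": 7, "B": 7, "C": 7, "D": 7}
ms_within = 3.02
f_tabular = 3.01   # critical value at alpha = 0.05, df 3 and 24

results = {}
for i, j in combinations(means, 2):
    # Scheffé's F' for the pair (i, j).
    denominator = ms_within * (sizes[i] + sizes[j]) / (sizes[i] * sizes[j])
    f_prime = (means[i] - means[j]) ** 2 / denominator
    results[(i, j)] = (round(f_prime, 2), f_prime > f_tabular)

for (i, j), (f_prime, significant) in results.items():
    label = "significant" if significant else "not significant"
    print(f"{i} vs {j}: F' = {f_prime:.2f} ({label})")
```

Many references compare F' against (k - 1) times the tabular F instead, which here is 3 × 3.01 = 9.03. For this data set, the same three pairs (A vs B, B vs C, B vs D) exceed that stricter value, so the interpretation does not change.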

2.3 Other Post-Hoc Tests


There are other post-hoc methods available. Each has its advantages and disadvantages. Some of the other commonly used post-hoc techniques are:
• Fisher's least significant difference (LSD)
• Tukey's range test
• Dunnett's test
• Duncan's new multiple range test
• Bonferroni correction

Statistical software packages implement most of these methods.


EXERCISES

1. The quantity of oxygen dissolved in water is a measure of water pollution. Samples are taken at three lakeside locations in a lake and the quantity of dissolved oxygen is recorded as follows (lower readings indicate greater pollution):

Location     Quantity
Sta. Cruz    6.5    6.4    6.9
Tanay        6.7    7.1    6.9    7.3
Angono       7.4    6.9    7.2

Are the levels of pollution the same in the three locations? Test at $\alpha = 0.05$.

2. A physician randomly selects 18 patients among those she is treating for high blood pressure. These patients are randomly assigned to three groups and treated with three different drugs, all designed to reduce blood pressure. The amount of reduction, in millimetres of mercury, is shown. At the 0.01 level of significance, is there sufficient evidence to show that the drugs act differently?

Drug A    Drug B    Drug C
10        13        9
10        14        8
9         11        6
10        10        10
7         9         10
6         10        7


3. A psychologist wants to investigate the effect of social background on the time (in minutes) it takes freshmen to solve a puzzle. A random sample of students from different backgrounds is selected, resulting in the following data. Use the 0.05 level of significance to test the hypothesis that social background has no effect on the time required to solve the puzzle. State your conclusion and interpret the results.

Inner City Urban Suburban Rural
16.5 10.9 18.6 14.2
5.2 5.2 8.1 24.5
12.1 10.8 6.4 14.8
14.3 8.9 24.9
16.1 5.1


4. An oncologist (a physician who specializes in the treatment of tumors) has 24 patients with advanced lung cancer. He is aware of three treatments, reported in medical journals, that may gain remission for his patients. To assess the effect of the treatments, the doctor randomly assigns patients to each treatment, then keeps careful records on the number of days the patients live after treatment starts. At the 0.05 level, can it be concluded that there is any difference in the effect of the treatments?

Laetrile Chemotherapy Radiation
75 80 64
88 82 90
62 64 58
97 45 64
62 67 82
81 84 71
93 55 59
39 66
60

5. A research study was conducted to examine the clinical efficacy of a new antidepressant. Depressed patients were randomly assigned to one of three groups: a placebo group, a group that received a low dose of the drug, and a group that received a moderate dose of the drug. After four weeks of treatment, the patients completed the Beck Depression Inventory. The higher the score, the more depressed the patient. The data are presented below.

Placebo    Low Dose    Moderate Dose
38         22          14
47         19          26
39         8           11
25         23          18
42         31          5

Did the dosage of antidepressant have an effect on the level of depression? Test at the 95% confidence level ($\alpha = 0.05$). Do a post-hoc test if necessary.
