
Research Design: Topic 7

Module 2: Topic 1

Topic 1: Basic Concepts of Experimental Design

Dr Amirul Islam

Acknowledged to: Dr Jahar Bhowmik

Course notes STA60004

Semester 1, 2015


Contents

1.1 Topic introduction
1.2 Topic learning objectives
1.3 Important Terms and Definitions of Experimental Design
1.4 Principles of an Experimental Design
1.5 Design of Experiments in Marketing
1.6 Sample Surveys versus Experimental Design
1.7 The Parallels between Experimental Designs & Sample Surveys
1.8 Study Design in Medical Research
1.9 Guidelines for Designing Experiments
1.10 Research questions and hypotheses
Revision Exercises
Solution to Revision Exercises
References

Note: Some of the materials are adapted from standard texts and guides (see references).

1.1 Topic introduction


"The formulation of a problem is often more essential than its solution, which may be
merely a matter of mathematical or experimental skill." (Albert Einstein)
Design of Experiments is a structured, organized method that is used to determine the
relationship between the different factors affecting a process and the output of that
process. This method was first developed in the 1920s and 1930s by Sir Ronald A. Fisher,
the renowned mathematician and geneticist.
This chapter examines the basic concepts of experimental design. Experimentation and
making inferences are twin features of general scientific methodology. The subject-matter
of experimental design includes:
(i) Planning the experiment,
(ii) Obtaining relevant information from it regarding the statistical hypothesis under study, and
(iii) Making a statistical analysis of the data.
Experimental design is a term which includes efficient methods for planning the
collection of data, in order to obtain the maximum amount of information for the least
amount of work. Data are everywhere. Anyone who collects and analyses data, be it in the
lab, the field, or the production plant, can benefit from knowledge about experimental
design. Directed experimentation generates critical events. An experiment is an
invitation for an informative event to occur (Box et al., 2005).
Experience has shown that proper consideration of the statistical analysis before the
experiment is conducted forces the experimenter to plan the design of the experiment
more carefully. The observations obtained from a carefully planned and well-designed
experiment support valid inferences.
Experiments are usually more structured than sample surveys and include the additional
step of treating the elements. In a sample survey we select elements from a frame and then
take measurements (such as responses to a questionnaire), but in an experimental design
we select experimental units, allocate treatments and then take measurements (on either a
few or all elements).

1.2 Topic learning objectives


Learning objectives
When you have worked through this topic you should:
Understand the idea of experimental design.
Know the basic definition of experimental design.
Understand the basic concepts that underlie scientific investigations.

1.3 Important Terms and Definitions of Experimental Design
Observational (Correlational) and Experimental Studies

A study in which the researcher observes and records what has already happened is called
an observational approach. On the other hand, an experimental study or trial is initiated
by a researcher. In an "ideal" experiment, the researcher manipulates the independent
variable(s), holds all other variables constant, and observes the changes in the dependent
variable. In experimental studies or trials we determine which experimental units receive
which treatment, whereas in observational studies we have to take what is observed.
Observational studies often show an association between two variables, but they cannot in
themselves prove cause and effect.
For example, consider the hypothesis:
"Driving ability varies with blood alcohol level".
The researcher would manipulate the amount of alcohol given to the drivers and then
observe changes in their driving skills. If all other variables are held constant, then any
changes in driving skill must be caused by the effects of the alcohol.
Consider an alternative means of collecting data. The researcher stands outside a pub
on a Friday night and asks for volunteers among the people leaving. Each volunteer undergoes a
driving test and also has his/her blood alcohol level measured. The researcher then
compares the driving skills of volunteers with zero blood alcohol level to the driving
skills of those drivers whose alcohol level is over 0.05. This is an observational design.
The researcher is merely observing the blood alcohol level of each subject, rather than
controlling or manipulating it.

Experiment
An experiment is the device or the means of getting the answer to the problem under
investigation, e.g. comparison of different manures or fertilizers, different varieties of a
crop, different cultivation processes, or different diets or medicines in a dietary or medical
experiment.
An experiment is a planned inquiry to discover new facts, or to confirm or deny the
results of previous investigations (Petersen, 1985).

Nuisance variables
Nuisance variables are associated with variation in an outcome (dependent variable) that
is extraneous to the effects of independent variables that are of primary interest to the
researcher. In our description of an "ideal" experiment, we stated that "all other variables"
should be held constant. If, for example, we are interested in the effects of alcohol on
driving ability, then any other variable which may influence driving ability is known as a
nuisance variable. Such things as the type of car, the driving course, temperature,
humidity, time of day, and the driver's age, reflexes and level of experience would all
have an influence on a driving test score. These are all referred to as nuisance variables.

Confounding variables
Variables that are not controlled for and that systematically change experimental results
are called confounding variables. A confounding variable has two properties. First, a
confounding variable is related to the explanatory (independent) variable in the sense that
individuals who differ on the explanatory variable are also likely to differ on the
confounding variable. Second, a confounding variable affects the response (dependent)
variable.
Suppose we are interested in the effects of alcohol on driving ability. If all of the zero
alcohol level driving tests were performed in the morning, and all of the 0.05 alcohol level
driving tests were completed in the evening, we could not tell if the resulting differences
in driving abilities were due to differences in the alcohol level, or if they were due to
differences in the time of day of the test. In this case, "time of day" is known as a
confounding factor, because it literally confounds our interpretation of the experiment.
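
The effect of a confounding variable can be illustrated with a small simulation. The sketch below is hypothetical throughout: the function name mean_score_gap, the driving-score scale and the effect sizes (alcohol lowers the score by 5 points, an evening test by 8 points) are assumptions chosen for illustration, not values from any study. It compares a design in which time of day is tied to the alcohol level with one in which time of day is assigned at random.

```python
import random
import statistics

ALCOHOL_EFFECT = 5.0   # assumed drop in score at the 0.05 alcohol level
EVENING_EFFECT = 8.0   # assumed drop in score for a test run in the evening

def mean_score_gap(confounded, n_drivers=2000, seed=0):
    """Return mean(zero-alcohol scores) - mean(alcohol scores) for simulated drivers."""
    rng = random.Random(seed)
    sober, alcohol = [], []
    for i in range(n_drivers):
        drinks = (i % 2 == 1)               # every second driver receives alcohol
        if confounded:
            evening = drinks                # all alcohol tests are run in the evening
        else:
            evening = rng.random() < 0.5    # time of day is assigned at random
        score = (70.0
                 - ALCOHOL_EFFECT * drinks
                 - EVENING_EFFECT * evening
                 + rng.gauss(0.0, 3.0))     # remaining unexplained variation
        (alcohol if drinks else sober).append(score)
    return statistics.fmean(sober) - statistics.fmean(alcohol)

print("confounded design gap:", round(mean_score_gap(confounded=True), 2))
print("randomised design gap:", round(mean_score_gap(confounded=False), 2))
```

Under these assumptions the confounded design reports a gap close to 13 points, the alcohol and time-of-day effects added together, whereas the design with randomised time of day gives a gap close to the assumed alcohol effect of 5 points.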

Treatments
The various objects of comparison in a comparative experiment are known as treatments;
e.g., in field experimentation, different fertilizers, different varieties of crop or different
methods of cultivation are treatments.
A treatment is one or a combination of categories of the explanatory variable(s) assigned
by the experimenter. The plural term treatments incorporates a collection of conditions,
each of which is one treatment.

Factor and Level


A factor of an experiment is a controlled independent variable; the levels of the variable
are set by the experimenter.
A factor is a general type or category of treatment. Different treatments constitute
different levels of a factor. For example, suppose three different groups of runners are
subjected to three different training methods. The runners are the experimental units and
the training methods are the treatments; the three types of training method constitute three
levels of the factor 'type of training'. The states of a factor, i.e., the treatments within
the class, are called the levels of the factor.

Experimental Units
The individuals in an experiment are referred to as experimental units. The smallest
division of the experimental material, to which we apply the treatments and on which we
make observations on the variable under study, is termed an experimental unit.
Experimental units can be people, animals, batteries, etc. In a field experiment the plot of
land is the experimental unit. Other examples may be a patient in a hospital, pigs in a
pen, or a batch of seeds. With animal trials, an experimental unit can be a paddock of
animals, a single animal, or even part of an animal.

Blocks
In agricultural experiments, most of the time we divide the whole experimental area
(field) into relatively homogenous sub-groups or strata (shown in the following diagrams).
These strata, each of which is internally uniform, are known as blocks. That is, a
block is a group of homogenous experimental units that are similar in a way that is
expected to affect the response to the treatments.
The term blocking was first used by R. A. Fisher in agronomic experiments (1920). In the
statistical theory of the design of experiments, blocking is the arranging of experimental
units in groups (blocks) that are similar to one another. Typically, a blocking factor is a
source of variability that is not of primary interest to the experimenter. Blocking is
sometimes used for nuisance factors that can be controlled. Nuisance factors are those
that may affect the measured result, but are not of primary interest. For example, in
applying a treatment, nuisance factors might be the specific operator who prepared the
treatment, the time of day the experiment was run, or the room temperature. All
experiments have nuisance factors. The experimenter will typically need to spend some
time deciding which nuisance factors are important enough to keep track of or control if
possible, during the experiment.

Figure 1: Non-homogenous experimental units

Figure 2: Blocking into homogenous groups

Replication
Replication means the repetition of a test or an experiment more than once. In other
words, the repetition of treatments under investigation is known as replication.

Precision

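The reciprocal of the variance of the mean is termed the precision, or the amount of
information, of a design. Thus for an experiment replicated r times, the precision is given
by
$$\frac{1}{\operatorname{var}(\bar{x})} = \frac{r}{\sigma^{2}}$$
where $\sigma^{2}$ is the variance of a single observation.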
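
The relationship between precision and the number of replicates can be checked numerically. The following is a minimal sketch, assuming normally distributed observations with a hypothetical mean of 10 and standard deviation of 2; the function names simulated_precision and theoretical_precision are illustrative only. It estimates the variance of the treatment mean from many simulated experiments and compares its reciprocal with r divided by the variance of a single observation.

```python
import random
import statistics

def theoretical_precision(r, sigma=2.0):
    """Precision of a mean of r replicates: r / sigma^2."""
    return r / sigma ** 2

def simulated_precision(r, sigma=2.0, n_experiments=20000, seed=1):
    """Estimate 1 / var(mean) by repeating an r-replicate experiment many times."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(10.0, sigma) for _ in range(r)])
        for _ in range(n_experiments)
    ]
    return 1.0 / statistics.pvariance(means)

for r in (1, 4, 16):
    print(f"r = {r:2d}  theoretical = {theoretical_precision(r):.2f}  "
          f"simulated = {simulated_precision(r):.2f}")
```

Under these assumptions, quadrupling the number of replicates quadruples the precision, which is the same as halving the standard error of the mean.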

Experimental Error
Let us suppose that a large homogenous field is divided into different plots (of equal
shape and size) and different treatments are applied to those plots. If the yields from some
of the treatments are more than those of the others, the experimenter is faced with the
problem of deciding if the observed differences are really due to treatment effects or they
are due to chance (uncontrolled) factors. In field experiments, it is a common experience
that the fertility gradient of the soil does not follow any systematic pattern but behaves in
an erratic fashion. Experience tells us that even if the same treatment is used on all the
plots, the yields would still vary due to the differences in soil fertility. Such variation
from plot to plot, which is due to random (or chance or non-assignable) factors beyond
human control, is spoken of as experimental error. It may be pointed out that the term
error used here is not synonymous with mistake but is a technical term which includes
all types of extraneous variations due to:
(i) the inherent variability in the experimental material to which treatments are applied,
(ii) the lack of uniformity in the methodology of conducting the experiments, or in other words failure to standardise the experimental technique, and
(iii) lack of representativeness of the sample to the population under study.

Blind Experiment
Blinding is used in some scientific studies to prevent research outcomes
from being influenced by either the placebo effect or observer bias. In a blind
experiment, the subjects do not know whether they are in the treatment group or the
control group. The idea is that the groups studied, including the control, should be
unaware of the group they are placed in. In medicine, when researchers are testing a new
medicine, they ensure that the placebo looks, and tastes, the same as the actual medicine.
There is strong evidence of a placebo effect with medicine, where, if people believe that
they are receiving a medicine, they show some signs of improvement in health. A blind
experiment reduces the risk of bias from this effect, giving an honest baseline for the
research, and allowing a realistic statistical comparison.
Ideally, the subjects should not be told that a placebo was being used at all, but this is
regarded as unethical.

Natural sources of error in field experiments


Plant variability
    type of plant: larger variation among larger plants
    competition: variation among closely spaced plants is smaller
    plot-to-plot variation because of plot location (border effects)
Seasonal variability
    climatic differences from year to year
    rodent, insect, and disease damage varies
    conduct tests for several years before drawing firm conclusions
Soil variability
    differences in texture, depth, moisture-holding capacity, drainage, available nutrients
    since these differences persist from year to year, the pattern of variability can be mapped with a uniformity trial

1.4 The Three Basic Principles of Experimental Design
Professor Ronald A. Fisher pioneered the study of experimental designs with his classical
book, The Design of Experiments. According to him, the basic principles of the design of
experiments are:
(i) Randomisation,
(ii) Replication, and
(iii) Local Control or Error Control or Blocking.

The roles they play in data collection and interpretation are discussed below.

Randomisation
By randomisation we mean that both the allocation of the experimental material and the
order in which the individual runs or trials of the experiment are to be performed are
randomly determined. After the treatments and the experimental units are decided, the
treatments are allotted to the experimental units at random to avoid any type of personal
or subjective bias which may be conscious or unconscious. This brings to the
experimenter the question of allocation of treatments to experimental units so that each
treatment gets an equal chance of showing its worth. In the absence of prior knowledge of
the variability of the experimental material, this objective is achieved through
randomisation, a process of assigning the treatments to various experimental units in a
purely chance manner. The following are the main objectives of randomisation:
(i) To eliminate bias,
(ii) To ensure independence among the observations.
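
The random allocation described above can be sketched in a few lines. This is an illustrative example only: the function name completely_randomised_allocation, the plot labels and the three treatment names are assumptions, and equal replication of each treatment is assumed for simplicity.

```python
import random
from collections import Counter

def completely_randomised_allocation(units, treatments, seed=None):
    """Assign treatments to units purely by chance, with equal replication."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    reps = len(units) // len(treatments)
    labels = list(treatments) * reps     # one label per experimental unit
    rng.shuffle(labels)                  # the chance mechanism
    return dict(zip(units, labels))

plots = [f"plot {i}" for i in range(1, 13)]
allocation = completely_randomised_allocation(plots, ["A", "B", "C"], seed=7)
for plot, treatment in allocation.items():
    print(plot, "->", treatment)
print(Counter(allocation.values()))      # each treatment appears four times
```

Fixing the seed makes the allocation reproducible for record keeping; in a real trial the seed would be chosen before the units are inspected, so that the experimenter cannot influence the outcome.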

Criteria for randomisation in clinical trial studies

1. Unpredictability
Each participant has the same chance of receiving any of the interventions.
Allocation is carried out using a chance mechanism so that neither the
participant nor the investigator will know in advance which intervention will be
assigned.
2. Balance
Treatment groups are of a similar size & constitution; groups are alike in all
important aspects and only differ in the intervention each group receives.
3. Simplicity
Easy for investigator/staff to implement.

Replication
As pointed out earlier, replication means the execution of an experiment more than once.
In other words, the repetition of treatments under investigation is known as replication.
An experimenter resorts to replication in order to average out the influence of the chance
factors on different experimental units. Thus, the repetition of treatments results in a more
reliable estimate than is possible with a single observation. Replication is necessary to
increase the accuracy of estimates of the treatment effects. Although a larger number of
replications gives a better estimate, the number cannot be increased indefinitely because
each replication adds to the cost of experimentation.
Replication serves a number of purposes in an experimental design:
(i) It allows the experimenter to obtain an estimate of the experimental error.
(ii) It permits the experimenter to increase precision by reducing standard errors.
(iii) It can expand the base for making inference.

Local Control or Blocking


Blocking means to arrange the experimental materials into groups, or blocks, of more or
less uniform experimental units. If the experimental material, say a field for agricultural
experimentation, is heterogenous and different treatments are allocated to various units
(plots) at random over the entire field, the soil heterogeneity will also enter the
uncontrolled factors and thus increase the experimental error. It is desirable to reduce the
experimental error as far as practicable without unduly increasing the number of
replications or without interfering with the statistical requirement of randomness, so that
even smaller differences between treatments can be detected as significant.
In addition to the principles of replication and randomisation discussed earlier, the
experimental error can further be reduced by making use of the fact that neighbouring
areas in a field are relatively more homogenous than those widely spread. In order to
separate the soil fertility effects from the experimental error, the whole experimental area
(field) is divided into homogenous groups (blocks) row-wise or column-wise or both,
according to the fertility gradient of the soil such that the variation within each block is
minimum and between the blocks is maximum. The treatments are then allocated at
random within each block. The process of reducing the experimental error by dividing the
relatively heterogenous experimental area (field) into homogenous blocks is known as
local control.

Example 1.1
Consider the very simple agricultural problem of comparing two varieties of tomatoes.
The purpose of the comparison is to find the variety which produces the greater quantity
of marketable quality fruit from a given area for large scale commercial planting. What
should we do? A simple approach would be to plant a block of land of each variety and
measure the total weight of marketable fruit produced. However, there are some obvious
difficulties. The variety that cropped most heavily may have done so simply because it
was growing in better soil. There are a number of factors which affect growth: soil
fertility, soil acidity, irrigation and drainage, wind exposure, exposure to sunlight
(e.g. shading, north-facing or south-facing hillside). Unfortunately no one knows exactly
to what extent changes in these factors affect growth. So unless the two blocks of land are
comparable with respect to all of these features, we won't be able to conclude that the
more heavily producing variety is better as it may just be planted in a block that is better
suited to growth.
If it was possible (and it never will be) to find two tracts of land that were identical in
these respects, using just those two blocks for comparison would result in a fair
comparison but the differences found might be so special to that particular combination of
growing conditions that the results obtained were not a good guide to full scale
agricultural production anyway.

Why randomise? Let us think about it another way. Suppose we took a large block of
land and subdivided it into smaller plots by laying down a rectangular grid. By using
some sort of systematic design to decide what variety to plant in each plot, we may come
unstuck if there is a feature of the land like an unknown fertility gradient. We may still
end up giving one variety better plots on average. Instead, let's do it randomly by
numbering the plots and randomly choosing half of them to receive the first variety. The
rest receive the second variety. We might expect the random assignment to ensure that
both varieties were planted in roughly the same numbers of high fertility and low fertility
plots, high pH and low pH plots, well drained and poorly drained plots etc.
In that sense we might expect the comparison of yields to be fair. Moreover, although we
have thought of some factors affecting growth, there will be many more that we, and even
the specialist, will not have thought of. And we can expect the random assignment of
treatments to ensure some rough balancing of those as well!

Why replicate? Random sampling gives representative samples, on average. However, in
small samples, it may happen, just by chance, that a sample is a 'bit weird'.
Unfortunately, we can only expect the random allocation of treatments to lead to balanced
samples (e.g. a fair division of the more and less fertile plots) if we have a large number
of experimental units to randomise. In many experiments this is not true (e.g. using plots
to compare varieties) so that in any particular experiment there may well be a lack of
balance on some important factor. Random assignment still leads to fair or unbiased
comparisons, but only in the sense of being fair or unbiased when averaged over a whole
sequence of experiments. This is one of the reasons why there is such an emphasis in
science on results being repeatable.
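
The point about small experiments can be made exact with a short calculation. The sketch below assumes, purely for illustration, that exactly half of the plots are more fertile, that half of all plots are chosen at random for the first variety, and that an allocation counts as badly unbalanced when at least five-sixths of the first variety's plots are fertile. The function name prob_at_least is hypothetical; the counts follow the hypergeometric distribution.

```python
from math import comb

def prob_at_least(n_fertile, n_poor, n_chosen, threshold):
    """P(at least `threshold` of the `n_chosen` randomly chosen plots are fertile)."""
    total = comb(n_fertile + n_poor, n_chosen)        # all equally likely allocations
    upper = min(n_fertile, n_chosen)
    favourable = sum(
        comb(n_fertile, k) * comb(n_poor, n_chosen - k)
        for k in range(threshold, upper + 1)
    )
    return favourable / total

small = prob_at_least(6, 6, 6, 5)        # 12 plots, 6 per variety
large = prob_at_least(60, 60, 60, 50)    # 120 plots, 60 per variety
print(f"12 plots:  P(at least 5 of 6 fertile)   = {small:.4f}")
print(f"120 plots: P(at least 50 of 60 fertile) = {large:.2e}")
```

With 12 plots the probability is 37/924, or about 4%, so an imbalance of this size is not rare in a single small experiment. With 120 plots the same proportional imbalance is so unlikely as to be negligible.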
Why block? Partly because random assignment of treatments does not necessarily ensure
a fair comparison when the number of experimental units is small. In this case more
complicated experimental designs are available to ensure fairness with respect to those
factors which we believe to be very important. Suppose with our tomato example that,
because of the small variation in the fertility of the land we were using, the only thing that
we thought mattered greatly was drainage. We could then try and divide the land into two
blocks, one well drained and one badly drained. These would then be subdivided into
smaller plots, say 6 plots per block. Then in each block, 3 plots are assigned at random to
the first variety and the remaining 3 plots to the second variety. We would then only
compare the two varieties within each block so that well drained plots are only compared
with well drained plots, and similarly for badly drained plots. This idea is called blocking.
By allocating varieties to plots within a block at random we would provide some
protection against other extraneous factors.
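
The blocked layout in this example (two drainage blocks of 6 plots, with 3 plots per variety in each block) can be sketched as follows. The function name randomised_block_design and the plot labels are hypothetical, and the sketch assumes that the number of plots in every block is a multiple of the number of treatments.

```python
import random
from collections import Counter

def randomised_block_design(blocks, treatments, seed=None):
    """Randomise treatments separately within each block, with equal replication."""
    rng = random.Random(seed)
    layout = {}
    for block_name, plots in blocks.items():
        if len(plots) % len(treatments) != 0:
            raise ValueError("plots per block must be a multiple of the number of treatments")
        reps = len(plots) // len(treatments)
        labels = list(treatments) * reps     # e.g. 3 plots of each variety
        rng.shuffle(labels)                  # randomisation restricted to this block
        layout[block_name] = dict(zip(plots, labels))
    return layout

blocks = {
    "well drained": [f"W{i}" for i in range(1, 7)],
    "badly drained": [f"B{i}" for i in range(1, 7)],
}
layout = randomised_block_design(blocks, ["variety 1", "variety 2"], seed=2015)
for block_name, allocation in layout.items():
    print(block_name, allocation, Counter(allocation.values()))
```

Because the shuffle is carried out separately within each block, both varieties always occupy three well drained and three badly drained plots, so drainage cannot favour either variety, while the remaining extraneous factors are still handled by chance.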

1.5 Design of Experiments in Marketing


Design of experiments, or conjoint analysis as it is known in a marketing context, is
regarded as one of the most powerful statistical methods for establishing the linkage between a
customer's decision-making process and the service or product being offered. After
effective application of design of experiments, companies find it easier to gain an insight
into the significant variables affecting a customer's decisions.

Marketing Problems
Ultimately, the primary aim of marketing is to predict the future market share, net
sales, or profitability of an offering, thus allowing a company to:
Foretell customer buying tendency
Boost customer retention
Ascertain trade-off strategies during contract negotiation
Ascertain competitive pricing
Predict sales
Control brand equity
Devise product elements
Establish price sensitivity
Forecast and reduce customer switch rates
Ascertain best market position for new product introductions.

1.6 Sample Surveys versus Experimental Design


Experiments are usually more structured than sample surveys and include the additional
step of treating the elements in some way.

Sample Survey               | Experimental Design
Select elements from frame. | Select experimental units.
Take measurements.          | Allocate treatments.
                            | Take measurements.

1.7 The Parallels between Experimental Designs & Sample Surveys

Sample Survey | Experimental Design
Random selection is the method used to choose units from the population for the sample. | Randomisation is used to assign treatments to experimental units.
The sampling error can be minimized by stratification. | Method of blocking/local control is common to reduce error.
Partial grouping is useful in cluster sampling. | Partial grouping is useful in split-plot designs.
For analysis, regression techniques are useful. | For analysis, ANOVA (analysis of variance) and ANCOVA (analysis of covariance) are useful.

1.8 Study Design in Medical Research


(Taken from Dawson, B. & Trapp, R.G. (2004): Basic & Clinical Biostatistics, p.7)
Study designs in medicine fall into two categories: studies in which subjects are observed
(observational), and studies in which the effect of an intervention is observed
(experimental).

Classification of Study Designs


With a little practice, the classification of study designs outlined below would help us to
read medical articles and classify studies with little difficulty.
1. Observational Studies
    a. Descriptive or case-series
    b. Case-control studies (retrospective studies)
        i. Causes and incidence of disease
        ii. Identification of risk factors
    c. Cross-sectional studies, surveys (prevalence)
        i. Disease description
        ii. Diagnosis and staging
        iii. Disease processes, mechanisms
    d. Cohort studies (prospective studies)
        i. Causes and incidence of disease
        ii. Natural history, prognosis
        iii. Identification of risk factors
    e. Historical cohort studies
2. Experimental studies
    a. Controlled trials
        i. Parallel or concurrent controls
            1. Randomised
            2. Not randomised
        ii. Sequential controls
            1. Self-controlled
            2. Crossover
        iii. External controls (including historical)
    b. Studies with no controls
3. Meta-analysis.

1.9 Guidelines for Designing Experiments


(Taken from Montgomery, D.C. (2005): Design and Analysis of Experiments).
To use the statistical approach in designing and analysing an experiment, it is necessary
for everyone involved in the experiment to have a clear idea in advance of exactly what is
to be studied, how the data are to be collected, and at least a qualitative understanding of
how these data are to be analysed. An outline of the recommended procedure by
Montgomery (2005) is as below:
Step 1: Recognition of and statement of the problem
Step 2: Selection of the response variable*
Step 3: Choice of factors, levels and ranges*
(Steps 1 to 3 together make up the pre-experimental planning.)
Step 4: Choice of experimental design
Step 5: Performing the experiment
Step 6: Statistical analysis of the data
Step 7: Conclusions and recommendations.
*In practice, steps 2 & 3 are often done simultaneously or in reverse order.

Step 1: The first step for designing an experiment is to develop all ideas about the
objectives of the experiment. It is usually helpful to prepare a list of specific problems or
questions that are to be addressed by the experiment. A clear statement of the problem
often contributes substantially to better understanding of the phenomenon being studied
and the final solution of the problem. It is also important to keep an overall objective in
mind; for example, is this a new process or system (in which case the initial objective is
likely to be characterization or factor screening), or is it a mature or reasonably
well-understood system that has been previously characterized (in which case the objective
may be optimization)?
Step 2: In selecting the response variable, the experimenter should be certain that this
variable really provides useful information about the process under study. Most often, the
average or standard deviation (or both) of the measured characteristic will be the response
variable. Multiple responses are not unusual. It is usually critically important to identify
issues related to defining the responses of interest and how they are to be measured before
conducting the experiment.
Step 3: When considering the factors that may influence the performance of a process or
system, the experimenter usually discovers that these factors can be classified as either
potential design factors or nuisance factors. The potential design factors are those factors
that the experimenter may wish to vary in the experiment. Nuisance factors are often
classified as controllable, uncontrollable, or noise factors. Once the experimenter has
selected the design factors, he or she must choose the ranges over which these factors will
be varied and the specific levels at which runs will be made. We reiterate how crucial it is
to bring out all points of view and process information in steps 1 through 3. We refer to
this as pre-experimental planning.
Step 4: If the above pre-experimental planning activities are done correctly, this step is
relatively easy. Choice of design involves consideration of sample size (number of
replicates), selection of a suitable run order for the experimental trials, and determination
of whether or not blocking or other randomisation restrictions are involved. In Topic 2 we
discuss some of the important types of experimental designs for a wide variety of
problems. In selecting the design, it is important to keep the experimental objectives in mind.
Step 5: When running the experiment, it is vital to monitor the process carefully to ensure
that everything is being done according to plan. Errors in experimental procedure at this
stage will usually destroy experimental validity. Up-front planning is crucial to success. It
is easy to underestimate the logistical and planning aspects of running a designed
experiment in a complex manufacturing or research and development environment. This
step suggests re-visiting the decisions made in steps 1-4, if necessary.
Step 6: Statistical methods should be used to analyse the data so that results and
conclusions are objective rather than judgemental in nature. If the experiment has been
designed correctly and performed according to the design, the statistical methods required
are not elaborate. Remember that statistical methods cannot prove that a factor (or
factors) has a particular effect. They only provide guidelines as to the reliability and
validity of results.

Step 7: Once the data have been analysed, the experimenter must draw practical
conclusions about the results and recommend a course of action. Graphical methods are
often used in this stage, particularly in presenting the results to others. Follow-up runs and
confirmation testing should also be performed to validate the conclusions from the
experiment.

1.10 Research questions and hypotheses


The development of a research question from a research idea is largely a matter of
organising one's thoughts into a concise statement of what one intends to do and why.
Research questions and hypotheses are closely related but are not quite the same. A
hypothesis is a statement, at a higher level, in which an attempt is made to generalise
about the nature of the universe in which we live.
Research begins with a question. Such questions may come about through talking with friends,
reading the scientific literature, or in any number of other ways. When reading the
current literature as a means to inform your research, you will need to ask three questions:
1. Is my idea based solidly in theory? 2. Is this idea the next most obvious step for the
discipline to take? 3. Is my idea novel in some way? Having satisfied yourself
that your idea is worth pursuing, it is necessary to turn it into a specific research question.
In doing so, you will have to tease out various parts of your idea, making each a more
focused question. Through this process there is the genesis of experimental/research
hypotheses.
There is an art to devising good experimental/research hypotheses. As a general rule there
should be one hypothesis per experiment. Put another way, each experiment should have
only one question to answer. As to how we state an experimental/research hypothesis, it is
more or less a convention to treat it as a proposition of only one sentence. Begin with the
word "That...". Within the hypothesis include the general sort of manipulation you will
be performing, known as the independent variable, and what it is you will be measuring,
referred to as the dependent variable. However, an excellent hypothesis goes one
step further by suggesting how specific treatments, known as the levels of the independent
variable, will affect the dependent variable.
Research design and analysis is a method of thought. It begins with a good idea that is
then refined into an experimental/research hypothesis but does not conclude until the
experiment is completed and the results published. At its heart is an experimental design
that limits error and thus promotes a simple and honest analysis of the data.
(Taken from Edwards, T. 2008).
Example: A suitable research question might be:
Does drug treatment of hypertension reduce the morbidity associated with cardiovascular
disease?
A suitable hypothesis for the above research question might be:
Participants with hypertension who are treated with a specific drug will experience less
morbidity associated with cardiovascular disease than participants who were not.

Revision Exercises

1. Explain the differences between a sample survey and an experimental design.

2. A pharmaceutical company wishes to test a new medication it thinks will reduce
cholesterol. A group of 20 volunteers is formed and each person has their cholesterol
level measured. Half the group is randomly assigned to take the new drug and the other
half is given a placebo. After 6 months the volunteers' cholesterol is measured again and
any change from the beginning of the study is recorded. In this experiment, identify the
experimental unit, factors, treatments, and response variable.

3. An agricultural researcher is interested in determining how much water and
fertilizer are optimal for growing a certain plant. Twenty-four plots of land are available
to grow the plant. The researcher will apply three different amounts of fertilizer (low,
medium, and high) and two different amounts of water (light and heavy). Each of the six
fertilizer-water combinations will be applied at random to four plots. After 6 weeks, the
plants' heights in each plot will be recorded.
Identify the experimental units, factors and their levels, treatments (treatment
combinations), and response variable in this study.

4. In 1930, it was decided to carry out an experiment in Lanarkshire schools to assess the
possible beneficial effects of giving the children free milk during the school day. Twenty
thousand children took part and over the course of five months, February to June, half of
them had three-quarters of a pint of either raw or pasteurised milk while the remainder
did not have milk. All the children were weighed and had their heights measured before
and after the experiment, but contrary to expectation the average increase for the children
who had not had milk exceeded that for the children who had milk. This unexpected
result was later attributed to unconscious bias in the formation of the groups being
compared. In each school the division of the children into a "milk" or a "no-milk" group
was made either by ballot or by using an alphabetic system, but if the outcome appeared
to give groups with an undue preponderance of well-nourished or ill-nourished children,
some arbitrary interchange was carried out in an effort to balance them. In this
interchange the teachers must have unconsciously tended to put a preponderance of ill-nourished
children into the group receiving milk. The results of the experiment were
further complicated by the fact that the children were weighed in their clothes and this
probably introduced a differential effect as between winter and summer and children from
poorer and wealthier homes. Because of the deficiencies in design the results of the
experiment were ambiguous despite the very large sample of children concerned.
(a) Suggest an appropriate research hypothesis.
(b) What is the independent/predictor variable?
(c) What is the dependent/outcome variable?
(d) Is it an observational study or a designed experiment?

5. From the shelf of a fresh juice shop, all the bottles of a certain brand of orange
juice on the shelf on a particular day were taken and analysed to observe the
vitamin C in orange juice. There were 21 bottles, and their vitamin C readings
(mg/100gm) were as follows:
15, 21, 20, 21, 18, 17, 15, 17, 13, 22, 23, 16,
13, 19, 23, 20, 25, 14, 26, 22, 23.
(a) Is it an observational study or a designed experiment?
(b) Is the random variable discrete or continuous?
(c) What parameters are we likely to be interested in estimating?
(d) What null hypothesis might be taken?

6. A researcher conducted an experiment to examine the efficiency of three types of
fungal sprays (T1, T2, & T3) in controlling fungal rots on blueberries. Three
adjacent rows of blueberries are available, each with 24 plants. Sprays can be
applied to individual blueberry plants. The outcome/response variable is the
proportion of blueberries with rot. For the following two designs, specify the
experimental unit, blocking factor, and number of replications of the treatments.
(a) The sprays are randomly allocated to rows and 8 blueberry plants
randomly selected from each row for assessment.
(b) Each row is divided into 3 plots of 8 plants each. The sprays are
randomly allocated to plots within each row.
7. A new drug was given to a group of 20 patients who suffered from hay fever. Of these,
15 reported that the remedy was very helpful in treating their hay fever. From this
information we can conclude:
(A) The new drug is effective for the treatment of hay fever;
(B) Sample size is too small to make a decision;
(C) This result is not valid because there was no control group for
comparison.
8. Why is randomisation important?
9. Suppose a toy company wants to know if certain colors are more appealing and
attractive to toddlers than others. They decide to measure this by choosing five
colors of blocks and making sets of blocks in each of the five colors. Then they
found 30 toddlers to participate in the study, and they randomly assigned each
toddler a block color. They observed each toddler separately at the same time of
the day, and gave them no other toys to play with. They recorded the length of
time each toddler played with the blocks, to see if some colors of blocks were
played with longer than other colors. All toddlers in the experiment were the same
age (2 years old) and an equal number of girls and boys played with each color of
blocks.

(A) What is the explanatory variable (IV) and what is the response variable
(DV)?
(B) Is this study an observational study or an experiment?
(C) Name one confounding variable that was controlled for in this study.
(D) Give two reasons why we must sometimes use an observational study
instead of an experiment.

10. A common mistake made by the media, the general public, and some researchers,
is to think that a link between two variables in any study implies that one variable
causes the other. Explain what is wrong with this automatic conclusion.
11. How can a researcher try to address the problem of confounding variables when
designing an observational study?
12. Explain why each of the following is used in experiments:
a) Placebo treatments
b) Blinding
c) Control groups.

Solution to revision exercises


1. The sample survey focuses on the selection of individuals from the population. We
discover the effect of applying a stimulus to subjects from experiments. The experimental
design focuses on the formation of comparison groups that allow conclusions about the
effect of the stimulus to be drawn.
2. The experimental units (subjects) in this study are the 20 volunteers. There is one
factor, the medication and it has two levels, the active pill and the placebo. There are two
treatments; the active pill and the placebo. The response variable is the change in
cholesterol over the period of the study.
3. The experimental units in this study are plots of land. There are two factors, fertilizer
and water. Fertilizer has three levels: low, medium, and high. Water has two levels: light
and heavy. There are a total of six treatments of fertilizer-water combinations: low-light,
low-heavy, medium-light, medium-heavy, high-light, high-heavy. The response variable
is the height of the plants at the end of the study.
4. (a) Average height will increase for the children who had milk as compared to the
children who had not had milk.
(b) Milk
(c) Height
(d) Designed experiment.

5. (a) Observational study.


(b) Continuous random variable.
(c) Example: mean vitamin C concentration in all bottles of that brand of orange juice
stocked at the fresh juice shop over a period of time.
(d) Example: mean vitamin C equal to 20 mg/100gm.
6. (a) Experimental unit = row,
No blocking factor,
Number of replications is one.
Layout of the design (each 0 represents a plant):
Row-1   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Row-2   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Row-3   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Single replication

(b) Experimental unit = plot of 8 plants,
Blocking factor is row,
Number of replications is 3.
Layout of the design:
Row-1   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Row-2   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Row-3   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 replications

7. (C) This result is not valid because there was no control group for comparison.
8. Why randomisation?
The basic benefits of randomisation include
i. Elimination of selection bias.
ii. Formation of basis for statistical tests, a basis for an assumption-free statistical test of
the equality of treatments.
In general, a randomised trial is an essential tool for testing the efficacy of the treatment.
9. (A) Explanatory variable is Block Colour and response variable is Playing Time.
(B) An experiment.
(C) Any of the following: age, time of the day, other toys, interaction with other children
etc.
(D) 1) It is unethical or impossible in certain situations to assign people to receive a
specific treatment (such as smoking); 2) certain explanatory variables, such as left vs.
right handedness, are inherent traits and cannot be randomly assigned.
10. If the link is based on an observational study, there is simply no way to rule out all
potential confounding factors, so cause and effect cannot be established.
11. Measure all the potential confounding variables he/she can think of and include them
in the analysis to see whether they are related to the response variable; or use a case-control
study and choose the controls to be as similar as possible to the cases.
12. a) The power of suggestion may lead to changes in the participants, and those changes
would be mistakenly attributed to the treatment or drug.
b) Participants are kept blind so they don't alter their behavior or outcome to please the
experimenter. Those collecting the measurements are kept blind so they don't
inadvertently bias the measurements in the desired direction.
c) Control groups are used to compare the effect of the treatment with what would have
happened under similar circumstances without the treatment.

References
Box G.E.P., Hunter W.G. & Hunter J.S. (2005). Statistics for Experimenters: Design,
Innovation and Discovery. 2nd edition. New York: Wiley.
Cox D.R. (1958). Planning of Experiments. New York: Wiley.
Das M.N. & Giri N.C. (1986). Design and Analysis of Experiments. 2nd edition. New
Delhi: Wiley Eastern Ltd.
Dawson B. & Trapp R.G. (2004). Basic and Clinical Biostatistics. New York: McGraw-Hill.
Edwards T. (2008). Research Designs and Statistics. New York: McGraw-Hill.
Gupta S.C and Kapoor V.K. (1984). Applied Statistics. Sultan Chand & Sons, New Delhi.
Hinkelmann, K. and Kempthorne, O. (2008). Design and Analysis of Experiments, John
Wiley & Sons, Inc.
Jones B. & Kenward M.G. (2003). Design and Analysis of Crossover Trials. 2nd edition.
London: Chapman & Hall.
Montgomery D.C. (2005). Design and Analysis of Experiments. 6th edition. New York:
Wiley.
Petersen, R.G. (1985). Design and Analysis of Experiments. New York: Marcel Dekker,
INC.
Utts J.M. (2005). Seeing Through Statistics. Third Edition. Brooks/Cole Cengage
Learning, CA, USA.

Research Design: Topic 8

Module 2: Topic 2

Topic 2: Common Designs

Dr Amirul Islam

Acknowledged to: Dr Jahar Bhowmik

Course notes STA60004

Semester 2/SP3, 2015

Contents

2.1 Topic introduction
2.2 Topic learning objectives
2.3 Completely Randomised Designs
2.4 Randomised Block Designs
2.5 Latin Square Designs
2.6 Factorial Experiments
2.7 Nested Designs
2.8 Repeated Measures Design
Revision Exercises
Solutions to Revision Exercises
References

Note: Some of the materials are adapted from standard texts and guides (see references).

2.1 Topic introduction


In the previous chapter we have explored the fundamental principles of good
experimental design. In this chapter we apply these principles to some of the basic
designs that are commonly used in practice. These are: (i) Completely Randomised
Designs (CRD), (ii) Randomised Block Designs (RBD) and (iii) Latin Square Designs
(LSD). These designs are described below one by one. We also consider the analysis of
data from these basic designs. In practice most experimental data are continuous, so we
restrict our attention to continuous response (outcome) variables.

2.2 Topic learning objectives


Learning objectives
When you have worked through this topic you should:
- Recognise the designs commonly used in practice.
- Understand the principles of basic designs.
- Understand which design would be useful for a particular research project.

2.3 Completely Randomised Designs (CRD)


The completely randomised design is the simplest of all the designs, based on principles
of randomisation and replication. In this design treatments are allocated at random to the
experimental units over the entire experimental material and each treatment is repeated an
equal number of times. This design is very flexible in that any number of treatments and
any number of replications may be used. A completely randomised design is one in which
all experimental units are assigned treatments solely by chance. No grouping of
experimental units is done prior to assignment of treatments. In general, an equal number
of replications for each treatment should be made except in particular cases when some
treatments are of greater interest than others or when practical limitations dictate
otherwise.
In this design treatments are assigned to the experimental units completely at random.
There are a variety of ways of doing this in practice, usually with a computer program,
but all have the feature that each experimental unit has an equal chance of being
allocated to each treatment group. Suppose we want to conduct an experiment with four
treatments, each replicated five times. This will require 20 experimental units, which we
number from 1 to 20 as in Figure 2.1 below. We can now assign the experimental units to
the treatments in many ways; two methods are explained here. Method 1 is rarely used in
practice, whereas Method 2 is the usual approach.

Method 1:
1. Obtain 20 identical pieces of paper (one for each experimental unit). Label five of
them Treatment A, five of them Treatment B, five of them Treatment C and five
of them Treatment D.
2. Place the pieces of paper in a box and mix thoroughly.
3. Pick a piece of paper at random. The treatment named on this piece is assigned to
experimental unit 1.
4. Without returning the first piece of paper to the box, select another piece. The
treatment named on this piece is assigned to experimental unit 2.
5. Continue this way until all 20 pieces of paper have been drawn.
6. Any allocation obtained this way is just one example; it will vary according to what is
drawn at random.
Treatment      Experimental units (5 units in each treatment)
Treatment A    1    2    3    4    5
Treatment B    6    7    8    9    10
Treatment C    11   12   13   14   15
Treatment D    16   17   18   19   20

Figure 2.1: Assignment of numbers to experimental units.

Method 2: Using EXCEL


1. Put the numbers 1 to 20 in column A.
2. Enter the formula =RAND() in cell B1 and fill down to B20. This generates 20
random numbers between 0 and 1. The number of decimal places displayed does not matter.
3. Copy column B onto itself using Paste Special > Values, so that the random numbers are
no longer recalculated.
4. Select a cell in either column A or B, and sort the worksheet by column B (Data > Sort).
5. We now have the numbers 1 to 20 in column A in random order.
6. Give the first five to Treatment A, the next five to Treatment B, and so on, which gives
Treatment      Experimental units (5 units in each treatment)
Treatment A    20, 16, 13, ...
Treatment B    10, 12, 19, 17, ...
Treatment C    15, ...
Treatment D    18, 14, 11, ...

Figure 2.2: Assignment of numbers to experimental units.
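
The Excel procedure of Method 2 can also be sketched in a few lines of Python. This is only an illustration and not part of the original notes: the function name `randomise_crd` and its `seed` argument are hypothetical, introduced here so that an allocation can be reproduced. The sketch assumes the number of units is a multiple of the number of treatments.

```python
import random


def randomise_crd(n_units, treatments, seed=None):
    """Allocate units 1..n_units to treatments completely at random.

    Mirrors Method 2: put the unit numbers in random order, then give the
    first group of units to the first treatment, the next group to the
    second treatment, and so on (equal replication is assumed).
    """
    rng = random.Random(seed)            # optional seed, for reproducibility only
    units = list(range(1, n_units + 1))
    rng.shuffle(units)                   # plays the role of sorting on the RAND() column
    reps = n_units // len(treatments)    # units per treatment
    return {
        treatment: sorted(units[i * reps:(i + 1) * reps])
        for i, treatment in enumerate(treatments)
    }


allocation = randomise_crd(20, ["A", "B", "C", "D"], seed=1)
for treatment, assigned in allocation.items():
    print(f"Treatment {treatment}: {assigned}")
```

Each run with a different seed (or no seed) gives a different, equally valid allocation, just as re-running the Excel steps would.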

2.3.1 Analysis
A completely randomised design provides one-way classified data according to the levels of
a single factor, treatment. The data from this design can be analysed by a one-way
analysis of variance (ANOVA). The ANOVA results help us to answer the following:
- How much variation is due to differences between treatments?
- How much variation is due to differences within each set of observations for the
same treatment?
- Is there any difference across the treatments, i.e., Treatment A vs. Treatment B and
so on? The ANOVA provides the test of this hypothesis.

An appropriate linear statistical model for one-way classified data is

Response = general mean effect (overall mean) + effect of treatment i + error

$y_{ij} = \mu + \tau_i + e_{ij}$;  $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, r$,

where $y_{ij}$ is the yield or response from the jth unit receiving the ith treatment, $\mu$ is the
general mean effect, $\tau_i$ is the effect due to the ith treatment, and $e_{ij}$ is the error component
due to chance. The error components are assumed to be independently and normally
distributed with mean 0 and constant variance $\sigma^2$.
The general form of the ANOVA table for a completely randomised design with p
treatments, each replicated r times, with N (= rp) experimental units is given below.
Table 2.1: ANOVA for CRD

Source of variation (SV)   Degrees of freedom (df)   Sum of squares (SS)   Mean square (MS)   F statistic
Treatment                  p-1                       SST                   MST=SST/(p-1)      FT=MST/MSE
Error                      N-p                       SSE                   MSE=SSE/(N-p)
Total                      N-1                       SSTot

The error df is the total df minus the treatment df: (N-1)-(p-1) = N-p.

SST= Between treatments sum of squares (or between groups sum of squares) which is
the sum of squares of the differences between the treatment means and the overall mean.

SSE= Residual sum of squares or error sum of squares (or within groups sum of squares)
which is the sum of the squares of the differences between the observations and their
respective treatment means.
SSTot= Total sum of squares which is the sum of the squares of the differences between
the observations and the overall mean. Note that SSTot=SST+SSE.
In this design the total variation is partitioned into two components:
(a) Variation among treatment means (treatments).
(b) Variation among units within treatments (error).
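
In symbols, the three sums of squares defined above can be written as follows. This is a sketch for the balanced case (p treatments, r units each). The bar-and-dot notation for means is an assumption introduced here and does not appear in the original notes: $\bar{y}_{i\cdot}$ denotes the mean of treatment i and $\bar{y}_{\cdot\cdot}$ the overall mean.

```latex
\begin{align*}
\mathrm{SST}   &= r \sum_{i=1}^{p} \left(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\right)^2, \\
\mathrm{SSE}   &= \sum_{i=1}^{p} \sum_{j=1}^{r} \left(y_{ij} - \bar{y}_{i\cdot}\right)^2, \\
\mathrm{SSTot} &= \sum_{i=1}^{p} \sum_{j=1}^{r} \left(y_{ij} - \bar{y}_{\cdot\cdot}\right)^2
                = \mathrm{SST} + \mathrm{SSE}.
\end{align*}
```

The factor r in SST arises because the deviation of each treatment mean from the overall mean is counted once for every unit receiving that treatment.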
Example 2.1
The following table shows some of the results of an experiment on the effect of
applications of sulphur in reducing scab disease of potatoes. The object in
applying sulphur is to increase the acidity of the soil, since scab does not thrive in very
acid soil. In addition to untreated plots, which serve as controls [O], three
amounts of dressing are compared: 300, 600 and 1200 lb per acre. Both a spring [S3, S6, S12]
and a fall [F3, F6, F12] application of each amount were tested, so that in all there were
seven distinct treatments.
Field plan and scab indices for a completely randomized experiment on potatoes
F3

S6

F12

S6

S12

S3

F6

12

18

10

24

17

30

16

S3

F12

F6

S3

S6

10

10

21

24

29

12

F3

S12

S6

F6

S12

F3

F12

18

30

18

16

16

S3

S12

S6

F12

F3

18

17

19

32

26

Results grouped by treatments for data analysis


O

F3

S3

F6

S6

F12

S12

12

30

30

16

18

10

17

10

18

10

24

24

32

16

21

18

12

16

29

26

18

19

17

Treatment   O      F3     S3     F6     S6     F12    S12
Totals      181    38     67     62     73     23     57
Means       22.6   9.5    16.8   15.5   18.2   5.8    14.2

Figure 2.1: Mean disease index plots for the treatments O, F3, S3, F6, S6, F12 and S12
The figure shows the highest mean index for the control O. Among the sulphur treatments,
the means were higher for the S applications than for the corresponding F applications,
and the mean differences were largest for S3 vs F3 and S12 vs F12.

Example 2.2 (taken from Petersen R.G. 1985, p.14)


An anthropologist was interested in studying physical differences, if any, among the
various races of people inhabiting Hawaii. As a part of her study she obtained a random
sample of eight 5-year-old girls from each of three races: Caucasian, Japanese, and
Chinese. She made a number of anthropometric measurements on each girl. She wanted
to determine whether the Oriental races differ from the Caucasian, and whether the
Oriental races differ from each other. The results of the head width measurements (cm)
are given in the following table. The anthropologist is interested in knowing whether or
not head width means differ among the races.
Head width (cm)

Caucasian   Japanese   Chinese
14.20       12.85      14.15
14.30       13.65      13.90
15.00       13.40      13.65
14.60       14.20      13.60
14.55       12.75      13.20
15.15       13.35      13.20
14.60       12.50      14.05
14.55       12.80      13.80

Total:   116.95   105.50   109.55
Mean:    14.619   13.188   13.694
Grand mean: 13.83

ANOVA table of head width

SV                 d.f.       SS      MS     F
Race (Treatment)   3-1=2      8.43    4.21   23.39
Error              24-3=21    3.84    0.18
Total              24-1=23    12.27

Calculations:
Error sum of squares
$= (14.20-14.619)^2 + \cdots + (14.55-14.619)^2 + (12.85-13.188)^2 + \cdots + (13.80-13.694)^2 = 3.84$
Total sum of squares $= (14.20-13.83)^2 + \cdots + (13.80-13.83)^2 = 12.27$,
i.e., the sum over all observations of (observation - grand mean)$^2$.
Treatment sum of squares = SS total - SS error = 12.27 - 3.84 = 8.43.
In case of CRD, the total variation is due to treatment and error.
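
The hand calculation above can be checked with a short Python sketch using only the standard library. It is an illustration, not the method used in the notes; the function name `one_way_anova` and the dictionary layout are assumptions made here. With unrounded mean squares the F ratio comes out at about 23.04; the 23.39 in the hand table arises from dividing the rounded values 4.21 and 0.18.

```python
from statistics import mean

# Head width data (cm) from Example 2.2, one list per group.
head_width = {
    "Caucasian": [14.20, 14.30, 15.00, 14.60, 14.55, 15.15, 14.60, 14.55],
    "Japanese":  [12.85, 13.65, 13.40, 14.20, 12.75, 13.35, 12.50, 12.80],
    "Chinese":   [14.15, 13.90, 13.65, 13.60, 13.20, 13.20, 14.05, 13.80],
}


def one_way_anova(groups):
    """Partition the total sum of squares for a completely randomised design."""
    all_obs = [y for obs in groups.values() for y in obs]
    grand_mean = mean(all_obs)

    # Error (within-groups) SS: deviations from each group's own mean.
    ss_error = 0.0
    for obs in groups.values():
        group_mean = mean(obs)
        ss_error += sum((y - group_mean) ** 2 for y in obs)

    # Total SS: deviations of every observation from the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
    ss_treat = ss_total - ss_error          # treatment SS by subtraction

    df_treat = len(groups) - 1              # p - 1
    df_error = len(all_obs) - len(groups)   # N - p
    f_stat = (ss_treat / df_treat) / (ss_error / df_error)
    return {
        "ss_treat": ss_treat, "ss_error": ss_error, "ss_total": ss_total,
        "df_treat": df_treat, "df_error": df_error, "f": f_stat,
    }


result = one_way_anova(head_width)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```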

If we do the analysis in SPSS, then the data entry should be like this.

Here, 0 = Caucasian; 1= Japanese; and 2= Chinese


If the data are in SPSS, the analysis will produce the following output.
ANOVA
headwidth

                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   8.428            2    4.214         23.041   .000
Within Groups    3.841            21   .183
Total            12.268           23


Explanation of the ANOVA Table:


The between-groups degrees of freedom is 2, because there are 3 ethnic groups (the
number of groups minus 1). There are eight girls in each ethnic group, so the total degrees of
freedom equal 3×8 - 1 = 23. The error degrees of freedom = 23 - 2 = 21. Each mean square
equals the sum of squares divided by its degrees of freedom. F = (4.214/0.183) = 23.04. With
(2, 21) degrees of freedom, F is significant at the 10% level if it is greater than 2.57
(the corresponding 5% critical value is 3.47); the observed F exceeds both. Critical values can
be obtained from an F table (available online from the link).
http://www.socr.ucla.edu/applets.dir/f_table.html.

Descriptives
Headwidth

            N    Mean      Std. Deviation
Caucasian   8    14.6188   .31953
Japanese    8    13.1875   .56553
Chinese     8    13.6938   .35601
Total       24   13.8333   .73035

2.3.2 Advantages of CRD


There are a number of advantages of a completely randomised design:
(i) The design is very flexible. Any number of treatments can be used, and
different treatments can be replicated an unequal number of times without unduly
complicating the statistical analysis in most cases. The number of
replications need not be the same from one treatment to another, although
comparisons are most precise when the treatments are equally replicated.
(ii) The statistical analysis remains simple if some or all of the observations for
any treatment are rejected, lost or missing for purely random or
accidental reasons. Moreover, the loss of information due to missing data
is smaller than with any other design.
(iii) The design provides the maximum degrees of freedom for the estimation of
the error variance, which increases the sensitivity or precision of the
experiment for small experiments, i.e., experiments with a small
number of treatments.
(iv) The design makes maximum use of the experimental units, since all
the experimental material can be used.

2.3.3 Disadvantages of CRD


There is one principal disadvantage of this design:
(i) If the experimental material is not uniform its precision is low. Since
randomisation is not restricted in any direction to ensure that the units receiving
one treatment are similar to those receiving the other treatment, the whole
variation among the experimental units is included in the residual variance. This
makes the design less efficient and results in less sensitivity in detecting
significant effects.

2.3.4 Applications of CRD


Although other designs have more precision, the CRD has a number of uses:
(i) It is most useful in laboratory techniques and methodological studies, e.g.,
in physics, chemistry or cookery, in chemical and biological experiments,
in some greenhouse studies, etc., where the experimental material is
uniform.
(ii) This design is also recommended in situations where a large fraction of
units is likely to be destroyed or to fail to respond.
(iii) This design may be useful for experiments in which the total number of
units is limited.

2.4 Randomised Block Designs (RBD)


The second commonly used design is the randomised block design. If a researcher has reason to
believe that subgroups of the experimental units will respond differently to treatments
because of some characteristic, the units are sorted into those subgroups before treatments
are assigned. In an experiment these subgroups are called blocks. Once units are assigned
to blocks, treatments are randomly assigned to the units in each block. Blocking is a form
of control to reduce unwanted variability in the response variable due to some variable
other than the treatment(s).
In field experimentation, if the whole of the experimental area is not homogeneous (uniform)
and the fertility gradient is only in one direction, then a simple method of controlling the
variability of the experimental material consists in stratifying or grouping the whole area
into relatively homogeneous strata or sub-groups (or blocks), perpendicular to the direction
of the fertility gradient. If the treatments are then applied at random to relatively
homogeneous units within each stratum or block and replicated over all the blocks, the design
is a randomised block design (RBD). In CRD no such local control measure is adopted, except
that the experimental units should be homogeneous and treatments are allocated at random to
the experimental units. In randomised block designs, by contrast, treatments are allocated at
random within the units of each stratum or block, i.e., randomisation is restricted. Therefore,
homogeneous grouping of experimental units and the random allocation of the treatments
separately in each block are the two main characteristic features of randomised block designs.
RBD is an improvement on CRD obtained by providing error control measures. The error control
measures in RBD consist of making the units in each of these blocks homogeneous.
Layout of RBD: In the RBD the experimental units are first grouped into blocks or
strata. Treatments are then randomly assigned to the units within the blocks. A separate
randomisation is used in each block. To illustrate the procedure, suppose we want to run
an experiment with five treatments (A, B, C, D and E) replicated four times in an
agricultural field with a fertility gradient (see Petersen R.G, 1985, p. 36). We construct a
RBD using the following steps:
[Figure: blocks I, II, III and IV laid out across the fertility gradient, each containing five plots for Treatments A to E.]
Figure 2.3: Assignment of numbers to units blocked to remove effects of a gradient
Step 1: Form four blocks of five plots each perpendicular to the gradient. Number the
plots from 1 to 5 within each block as shown in Figure 2.3.
Step 2: Use a table of random numbers or some other procedure (e.g., using EXCEL) to
assign treatments to the units in the first block. To illustrate for
Block I:

Sequence   Random number          Rank of the sorted      Treatment according to the rank
           (generated in Excel)   random number           (1=A, 2=B, 3=C, 4=D, 5=E)
1          293                    2 (second smallest)     B
2          078                    1 (smallest)            A
3          721                    5 (largest)             E
4          569                    3 (3rd smallest)        C
5          612                    4 (4th smallest)        D

Step 3: Repeat step 2 for the remaining three blocks:



Block II
Sequence   Random number   Rank (plot)   Treatment
1          962             4             D
2          036             1             A
3          844             3             C
4          963             5             E
5          097             2             B

Block III
Sequence   Random number   Rank (plot)   Treatment
1          675             3             C
2          936             5             E
3          709             4             D
4          591             1             A
5          665             2             B

Block IV
Sequence   Random number   Rank (plot)   Treatment
1          230             1             A
2          981             5             E
3          687             4             D
4          604             3             C
5          454             2             B


The final plan of the RBD is given in the following figure.


Figure 2.4: Final experimental plan with treatments randomly assigned to units within
blocks in a RBD
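
The ranking procedure of Steps 2 and 3 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the notes' own method: it reads the Block I table as "rank k receives the k-th treatment" (1=A, 2=B, and so on), and the function names `assign_by_rank` and `randomise_rbd`, like the `seed` argument, are hypothetical.

```python
import random

TREATMENTS = ["A", "B", "C", "D", "E"]


def assign_by_rank(random_numbers, treatments=TREATMENTS):
    """Rank the random numbers (1 = smallest); rank k gets the k-th treatment."""
    order = sorted(range(len(random_numbers)), key=lambda i: random_numbers[i])
    ranks = [0] * len(random_numbers)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return [treatments[rank - 1] for rank in ranks]


def randomise_rbd(n_blocks, treatments=TREATMENTS, seed=None):
    """Carry out a separate randomisation within each block."""
    rng = random.Random(seed)
    plan = {}
    for block in range(1, n_blocks + 1):
        # Distinct three-digit random numbers, so that ranks have no ties.
        numbers = rng.sample(range(1000), len(treatments))
        plan[block] = assign_by_rank(numbers, treatments)
    return plan


# Block I, using the random numbers listed in Step 2.
block_1 = assign_by_rank([293, 78, 721, 569, 612])
print(block_1)  # ['B', 'A', 'E', 'C', 'D']

plan = randomise_rbd(4, seed=0)
for block, layout in plan.items():
    print(f"Block {block}: {layout}")
```

Because the randomisation is repeated independently for each block, every treatment appears exactly once per block, which is the restriction that distinguishes the RBD from the CRD.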
Example 2.3
Suppose we are interested in how weight gain (Y) in rats is affected by source of protein
(Beef, Cereal, and Pork) and by level of protein (High or Low). There are a total of 6
(3 x 2) treatment combinations of the two factors (Beef-High Protein, Cereal-High
Protein, Pork-High Protein, Beef-Low Protein, Cereal-Low Protein, and Pork-Low
Protein). Suppose we have available to us a total of N = 66 experimental rats to which
we are going to apply the different diets based on the t = 6 treatment combinations. Prior
to the experimentation the rats were divided into n = 11 homogeneous groups (blocks) of size 6.
The grouping was based on factors that had previously been ignored (for example, initial
weight and appetite). Within each of the 11 blocks, each rat is randomly assigned one of the
six treatment combinations (diets). The weight gain (in grams) after six months is measured for
each of the test animals and is tabulated in the following table.
Weight gain (grams) by block; the number in parentheses at the head of each column is the treatment combination (diet):

Block   (1)   (2)   (3)   (4)   (5)   (6)
1       107    96   112    83    87    90
2        98    72   101    82    70    94
3       102    76   101    85    95    89
4        97    70    93    65    71    61
5       109    79   101    75    75    81
6       101    70    98    82    77    79
7       128    89   104    85    84    89
8        56    70    71    64    62    67
9        99    91    92    80    71    85
10       82    63    87    87    81    61
11      101   102   110    83    93    83


Example 2.4
A group of researchers are interested in comparing the effects of four different chemicals
(A, B, C and D) in producing water resistance (y) in textiles. A strip of material,
randomly selected from each bolt, is cut into four pieces (samples), and the pieces are
randomly assigned to receive one of the four chemical treatments. This process is
replicated three times, with each bolt forming a block, producing a Randomised Block (RB) design. Moisture resistance
(y) was measured for each of the samples. (Low readings indicate low moisture
penetration.) The data are given below.

Blocks (Bolt Samples)

Block 1      Block 2      Block 3
C   9.9      D  13.4      B  12.7
A  10.1      B  12.9      D  12.9
B  11.4      A  12.2      C  11.4
D  12.1      C  12.3      A  11.9

Completed Design
Example 2.5
An experiment was carried out on wheat. Three varieties of wheat (A, B and C) were tested for
their yield in four randomised blocks. Each of the four blocks was divided into three plots,
and the plots of each block were assigned at random to the three varieties. The plan and yield
per plot in kg are given below:

Plan and yield (kg) per plot:

Block 1:   A  8    C 12    B 10
Block 2:   C 10    B  8    A  8
Block 3:   A  6    B  9    C 10
Block 4:   B 10    A  8    C  9

Wheat yield (kg) by variety:

           A     B     C
Block 1    8    10    12
Block 2    8     8    10
Block 3    6     9    10
Block 4    8    10     9

Example 2.6
A researcher is carrying out a study of the effectiveness of four different skin creams for
the treatment of a certain skin disease. He has eighty subjects and plans to assign them
into 4 treatment groups of twenty subjects each. Using a randomised block design, the
subjects are assessed and put in blocks of four according to how severe their skin
condition is; the four most severe cases are the first block, the next four most severe cases
are the second block, and so on to the twentieth block. The four members of each block
are then randomly assigned, one to each of the four treatment groups.
(Example taken from Valerie J. Easton and John H. McColl's Statistics Glossary).

2.4.1 Analysis
If in an RBD a single observation is made on each of the experimental units, then the data
from an RBD can be analysed by a two-way ANOVA. In this design the ANOVA enables
us to partition the total variation into blocks, treatments and error. A randomised block
experiment is assumed to be a two-factor experiment. The factors are blocks and
treatments.
An appropriate linear statistical model for RBD is
Response = general mean effect (overall mean) + treatment effect + block effect + error

$y_{ij} = \mu + \tau_i + b_j + e_{ij}$;  $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, r$,

where $y_{ij}$ is the yield or response of the experimental unit from the ith treatment and jth block,
$\mu$ is the general mean effect, $\tau_i$ is the effect due to the ith treatment, $b_j$ is the effect due to the jth
block or replicate and $e_{ij}$ is the error component due to chance. The error components are
assumed to be independently and normally distributed with 0 mean and constant variance
$\sigma^2$.
The general form of the ANOVA table with p treatments each replicated r times in a
randomised block design with r blocks of p units each, is given below.
Table 2.2: ANOVA for RBD

Source of        Degrees of      Sum of         Mean square (MS)            F statistic
variation (SV)   freedom (df)    squares (SS)
Treatment        p-1             SST            MST = SST/(p-1)             FT = MST/MSE
Block            r-1             SSB            MSB = SSB/(r-1)
Error            (p-1)(r-1)      SSE            MSE = SSE/[(p-1)(r-1)]
Total            rp-1            SSTot

Analysis output using SPSS from example 2.5 (four blocks and 3 varieties of wheat)

Analysis summary

Between-Subjects Factors: BLOCK, TREATNUM


Tests of Between-Subjects Effects
Dependent Variable: YEILD

Source              Type III Sum of Squares   df   Mean Square   F   Sig.
Corrected Model       26.000                  11     2.364       .   .
Intercept            972.000                   1   972.000       .   .
BLOCK                  4.667                   3     1.556       .   .
TREATNUM              15.500                   2     7.750       .   .
BLOCK * TREATNUM       5.833                   6      .972       .   .
Error                   .000                   0      .
Total                998.000                  12
Corrected Total       26.000                  11

a. R Squared = 1.000 (Adjusted R Squared = .)

Interpretation of the Table


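There are four blocks, so the df for blocks is 4 - 1 = 3; similarly, the df for
variety/treatment is 3 - 1 = 2 and for the interaction (4-1)(3-1) = 6, or simply 3 × 2 = 6. Because
there is only one observation per block-treatment combination, fitting the interaction leaves
no error df, so no F value could be computed. In the RBD analysis the block × treatment
term serves as the error term.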
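
As a rough check on the output above, the sums of squares for Example 2.5 can be recomputed by hand using the layout of Table 2.2, with the block × treatment term taken as error. The sketch below uses plain Python; the variable names are illustrative and are not part of the SPSS output.

```python
# Yields (kg per plot) from Example 2.5: rows are blocks 1-4, columns are varieties A, B, C.
yields = [
    [8, 10, 12],
    [8, 8, 10],
    [6, 9, 10],
    [8, 10, 9],
]

r = len(yields)       # number of blocks
p = len(yields[0])    # number of treatments (varieties)
n = r * p

grand_total = sum(sum(row) for row in yields)
correction = grand_total ** 2 / n   # matches the "Intercept" sum of squares

ss_total = sum(y ** 2 for row in yields for y in row) - correction
ss_block = sum(sum(row) ** 2 for row in yields) / p - correction
ss_treat = sum(sum(row[j] for row in yields) ** 2 for j in range(p)) / r - correction
ss_error = ss_total - ss_block - ss_treat   # block x treatment residual

df_treat = p - 1
df_block = r - 1
df_error = df_treat * df_block

ms_treat = ss_treat / df_treat
ms_block = ss_block / df_block
ms_error = ss_error / df_error
f_treat = ms_treat / ms_error

print(f"SST={ss_treat:.3f}  SSB={ss_block:.3f}  SSE={ss_error:.3f}  SSTot={ss_total:.3f}")
print(f"MST={ms_treat:.3f}  MSB={ms_block:.3f}  MSE={ms_error:.3f}  FT={f_treat:.2f}")
```

The sums of squares agree with the SPSS rows for TREATNUM, BLOCK and BLOCK * TREATNUM; once the interaction is treated as error, 6 error df become available and the treatment F ratio can be formed.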

2.4.2 Advantages of RBD


There are a number of advantages of a randomised block design. The chief advantages of
RBD can be outlined as follows:
(i) This design is more efficient or accurate than CRD for most types of experimental
work. Blocking can increase precision by removing one source of variation from
experimental error.
(ii) In this design no restrictions are placed on the number of treatments or the number
of replicates. Any number of blocks and any number of treatments can be used so long
as each treatment is replicated the same number of times in each block. However, for
better management of the experiment, it is advisable not to use a large number of
treatments.
(iii) Statistical analysis is relatively simple and rapid.

2.4.3 Disadvantages of RBD


There are a few disadvantages of RBD:
(i) RBD is not suitable for a large number of treatments. The efficiency of the design
decreases as the number of treatments and, hence, the block size increases.
(ii) In the analysis, missing data can cause some difficulty.

2.4.4 Applications of RBD


This design has a number of applications:
(i) RBD provides unbiased estimates of the means for blocking categories, providing
additional information from the experiment.
(ii) This design can remove one source of variation from the experimental error and
thus increase precision.

2.5 Latin Square Designs (LSD)


In RBD the whole of the experimental area is divided into relatively homogeneous groups
(blocks) to control one source of variation, and treatments are allocated at random to units
within each block. But in field experimentation, it may happen that an experimental area
(field) exhibits fertility in strips, e.g., cultivation might result in alternating strips of high
and low fertility. RBD will be effective if the blocks happen to be parallel to these strips
and would be extremely inefficient if the blocks are across the strips. Initially the fertility
gradient is seldom known. A useful method of eliminating fertility variations consists of
an experimental layout which will control variation in two perpendicular directions. A
design which controls two sources of variation in this way is called a Latin square design. A Latin
square design incorporates two blocking factors, which are usually represented as rows
and columns.
Layout of LSD: In this design the number of treatments is equal to the number of
replications. To construct a Latin square design for p treatments we require $p \times p = p^2$
experimental units. The whole of the experimental area is divided into $p^2$ experimental
units (plots) arranged in a square so that each row as well as each column contains p units
(plots). The p treatments are then allocated at random to these rows and columns in such a
way that every treatment occurs once and only once in each row and in each column.
Such a layout is known as a $p \times p$ Latin Square Design (LSD) and is extensively used in
agricultural experiments. For example, if we are interested in studying the effects of p types of
fertilizers (treatments) on the yield of a certain variety of wheat, it is customary to
conduct the experiment on a square field with $p^2$ plots of equal area and to associate
treatments with different fertilizers and row and column effects with variations in fertility
of soil.
The basic pattern of a Latin square design with p = 5 treatments, A, B, C, D, and E, in a
5x5 square is given below; the rows and columns represent the two blocking factors (for
example, soil fertility varying in two perpendicular directions):

                 Column
Row      1    2    3    4    5
1        A    B    C    D    E
2        B    C    D    E    A
3        C    D    E    A    B
4        D    E    A    B    C
5        E    A    B    C    D

Figure 2.4: Basic design for a 5x5 Latin square.
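
The basic (cyclic) square above can be generated programmatically, and one common way to randomise it is to shuffle its rows and columns. The sketch below assumes that approach; the function names are hypothetical, and permuting rows and columns is only one of several valid randomisation schemes.

```python
import random

def cyclic_latin_square(treatments):
    """Basic square: each row is the previous row shifted one place to the left."""
    p = len(treatments)
    return [[treatments[(i + j) % p] for j in range(p)] for i in range(p)]

def randomise_latin_square(treatments, seed=None):
    """Randomly permute the rows and then the columns of the basic square."""
    rng = random.Random(seed)
    square = cyclic_latin_square(treatments)
    rng.shuffle(square)                       # permute rows
    cols = list(range(len(treatments)))
    rng.shuffle(cols)                         # permute columns
    return [[row[c] for c in cols] for row in square]

def is_latin_square(square):
    """Check that every treatment occurs exactly once in each row and each column."""
    p = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == p and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(p)} == symbols for j in range(p))
    return len(symbols) == p and rows_ok and cols_ok

basic = cyclic_latin_square(list("ABCDE"))
randomised = randomise_latin_square(list("ABCDE"), seed=3)
for row in randomised:
    print(" ".join(row))
```

Shuffling rows and columns preserves the Latin property, so the randomised layout still has every treatment once per row and once per column.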

Example 2.7
An experiment was conducted to compare the effectiveness of four types of food
supplements for increasing the milk yield of dairy cows in a farm. The supplements (A,
B, C and D) were given to four cows, and repeated in four successive time periods while
rotating the cows. Milk yields, in grams/day, are recorded. The cow (1, 2, 3 or 4) was one
blocking factor and the time period (I, II, III or IV) was the other. The plan and yields are
given in the following table:

                    Time Period
Cow      I          II         III        IV
1        A  882     B  605     C  947     D  772
2        B 1078     C  705     D  712     A  756
3        C  702     D  659     A  824     B  644
4        D  690     A  789     B  930     C  762

Example 2.8
An experiment was conducted to compare the effectiveness of five manurial treatments
A, B, C, D and E on the yield of sugarcane (in kg/plot). The following are the results of
the Latin Square experiment.
B 405    A 525    E 463    D 441    C 481
C 325    D 445    B 429    A 513    E 493
E 471    B 492    A 472    C 381    D 410
A 552    C 431    D 425    E 572    B 410
D 430    E 469    C 432    B 467    A 460

Example 2.9 (taken from Petersen R.G, page 57)


A ceramics engineer wanted to test the strength of high-tension insulators made from four
clay mixtures A, B, C, D and a control, E. He made five insulators from each mixture. He
suspected that there was a temperature gradient from front to back and from top to bottom
in his oven. He decided to use a Latin square design with shelves (top to bottom) as rows
and positions on the shelves (front to back) as columns. The insulators were placed in the
oven in the Latin square arrangement. After firing, the strength of each insulator was
measured. The experimental layout and strength measurements were as shown in the
following table:
           Front                                  Back
Top        A 33.8   B 33.7   D 30.4   C 32.7   E 24.4
           D 35.0   E 28.8   B 33.5   A 26.7   C 33.4
           C 35.8   D 35.6   A 36.9   E 26.7   B 35.1
           E 33.2   A 37.1   C 37.4   B 38.1   D 34.1
Bottom     B 34.8   C 39.1   E 32.7   D 37.4   A 36.4

2.5.1 Analysis
In a Latin square design there are three factors: row, column and treatment. The data
collected from this design can be analysed by a three-way ANOVA.
An appropriate linear statistical model for the ith row, jth column and kth treatment
is:
Response = general mean effect (overall mean) + row effect + column effect + treatment
effect + error

$y_{ijk} = \mu + r_i + c_j + \tau_k + e_{ijk}$;  $i, j, k = 1, 2, \ldots, p$,

where $y_{ijk}$ is the yield or response of the experimental unit from the ith row, jth column and kth
treatment, $\mu$ is the general mean effect, $r_i$ is the effect due to the ith row, $c_j$ is the effect
due to the jth column, $\tau_k$ is the effect due to the kth treatment and $e_{ijk}$ is the error component
due to chance. As usual the error components are assumed to be independently and
normally distributed with 0 mean and constant variance $\sigma^2$.
The general form of the ANOVA table for a Latin square design with p treatments is
presented in the following table.
Table 2.3: ANOVA for LSD

Source of        Degrees of      Sum of         Mean square (MS)            F statistic
variation (SV)   freedom (df)    squares (SS)
Rows             p-1             SSR            MSR = SSR/(p-1)             FR = MSR/MSE
Columns          p-1             SSC            MSC = SSC/(p-1)             FC = MSC/MSE
Treatment        p-1             SST            MST = SST/(p-1)             FT = MST/MSE
Error            (p-1)(p-2)      SSE            MSE = SSE/[(p-1)(p-2)]
Total            p^2-1           SSTot

Degrees of freedom for Error: $(p^2 - 1) - (p-1) - (p-1) - (p-1) = p^2 - 3p + 2 = p(p-1) - 2(p-1) = (p-1)(p-2)$,
taking out the common factor $(p-1)$.
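
The partition of the total sum of squares into rows, columns, treatments and error can be illustrated with the milk yield data of Example 2.7. The following is a minimal sketch in plain Python; the variable names are illustrative and the data are entered with cows as rows and time periods as columns.

```python
# Example 2.7: rows are cows 1-4, columns are time periods I-IV.
layout = ["ABCD", "BCDA", "CDAB", "DABC"]   # supplement given to each cow in each period
yields = [
    [882, 605, 947, 772],
    [1078, 705, 712, 756],
    [702, 659, 824, 644],
    [690, 789, 930, 762],
]

p = len(yields)
grand_total = sum(sum(row) for row in yields)
correction = grand_total ** 2 / p ** 2

row_totals = [sum(row) for row in yields]
col_totals = [sum(yields[i][j] for i in range(p)) for j in range(p)]
treat_totals = {}
for i in range(p):
    for j in range(p):
        t = layout[i][j]
        treat_totals[t] = treat_totals.get(t, 0) + yields[i][j]

ss_total = sum(y ** 2 for row in yields for y in row) - correction
ss_rows = sum(t ** 2 for t in row_totals) / p - correction
ss_cols = sum(t ** 2 for t in col_totals) / p - correction
ss_treat = sum(t ** 2 for t in treat_totals.values()) / p - correction
ss_error = ss_total - ss_rows - ss_cols - ss_treat

df_each = p - 1                 # rows, columns and treatments each have p-1 df
df_error = (p - 1) * (p - 2)
ms_error = ss_error / df_error

for name, ss in [("Rows", ss_rows), ("Columns", ss_cols), ("Treatment", ss_treat)]:
    ms = ss / df_each
    print(f"{name:10s} SS={ss:12.2f} df={df_each} MS={ms:12.2f} F={ms / ms_error:6.2f}")
print(f"{'Error':10s} SS={ss_error:12.2f} df={df_error} MS={ms_error:12.2f}")
```

With p = 4 the error term has (4-1)(4-2) = 6 degrees of freedom, in line with the formula derived above.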

2.5.2 Advantages of LSD


There are a number of advantages of a Latin square design. Chief advantages of LSD are:
(i) This design allows the experimenter to control two sources of variation (row wise
and column wise). With two way grouping or stratification, LSD controls more of
the variation than CRD or RBD.
(ii) In this design more than one factor can be investigated simultaneously and with
fewer trials than more complicated designs.
(iii) The statistical analysis is simple though slightly more complicated than that for
RBD. Even with missing data (not many missing) the analysis remains relatively
simple as compared to RBD.

2.5.3 Disadvantages of LSD


The Latin square design has a number of disadvantages:
(i) In LSD the number of treatments is restricted to the number of replications and
this limits its field of application. This design is suitable for the number of
treatments between 5 and 10 and for more than 10 to 12 treatments the design is
seldom used since in that case the square becomes too large and does not remain
homogenous.
(ii) The fundamental assumption in this design, that there is no interaction between
different factors (i.e., the factors are independent), may not be true in general.
(iii) In case of missing plots, when several units are missing the statistical analysis
becomes complex.

2.5.4 Applications of LSD


LSD has a number of applications:
(i) Latin square design is frequently used in agricultural experiments.
(ii) It is used in food manufacturing industry.
(iii) This design is useful in the engineering industry.
(iv) LSD is also used by the dairy farm industry.

2.6 Factorial Designs


In the foregoing experiments performed in CRD, RBD or LSD, we were primarily
concerned with the comparison and estimation of the effects of a single set of treatments
like varieties of wheat, manure or different methods of cultivation etc. Such experiments
which deal with one factor only may be called simple experiments. According to Fisher
(1926), in a factorial experiment the treatments are all the combinations of the different factors
under study. In these experiments an attempt is made to estimate the effects of each of the
factors and also the interaction effects, i.e., the variation in the effect of one factor as a
result of different levels of other factors.
As a simple illustration let us consider two fertilizers (independent variables), say, Potash
(K) and Nitrogen (N). Let us suppose that there are three different varieties of Potash and
four different varieties of Nitrogen; three and four are termed the levels of the factors
Potash and Nitrogen respectively. To find the effectiveness of various treatments, e.g.,
different levels of Potash or Nitrogen, we might conduct two simple experiments, one for
Potash and the other for Nitrogen. A series of experiments in which only one factor is
varied at a time would be both lengthy and costly and might still be unsatisfactory
because of systematic changes in the general background conditions. Moreover, these
simple experiments do not give us any information regarding the dependence or
independence of one factor on the other, i.e., they do not tell us anything about the
interaction effect (Potash × Nitrogen). The only alternative is to try to investigate the
variation in several factors simultaneously by conducting the above experiment as a 3 × 4
factorial experiment, where 3 and 4 are the levels of factors Potash and Nitrogen
respectively. Factorial experiments are ones in which two or more independent variables
are manipulated systematically.
In factorial experiments the treatments are combinations of two or more levels of more than
one factor. For example, with two factors (i) nitrogen fertilizer at three levels,
denoted by $n_0$, $n_1$ and $n_2$, and (ii) irrigation at two levels, $I_0$ and $I_1$, in an agricultural
experiment we can form the following six treatment combinations taking one level from
each factor: $I_0n_0$, $I_0n_1$, $I_0n_2$, $I_1n_0$, $I_1n_1$ and $I_1n_2$. Such combinations form treatments in factorial
experiments.
Factorial designs have several advantages over single factor designs. It is more efficient to
collect information of the effects of two or three independent variables in one experiment,
rather than to run a separate experiment for each factor. Also factorial designs allow us to
investigate interactions between independent variables.
In a factorial design, all possible combinations of the levels of the factors are
investigated in each complete trial or replication of the experiment. When factors are arranged in a
factorial design, they are often said to be crossed. The effect of a factor is
defined as the change in response produced by a change in the level of the factor; this is
called the main effect. For example, consider the following simple experiment taken from
Montgomery (2009). This is a two-factor factorial experiment with both design factors at
two levels. We have called these levels low and high and denoted them - and +,
respectively.
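[Figure: two two-factor factorial experiments, with Factor A on the horizontal axis and Factor B on the vertical axis, each at a low (-) and a high (+) level; the response is shown at each of the four treatment combinations]

First experiment:
                      Factor A low (-)   Factor A high (+)
Factor B high (+)           30                 52
Factor B low (-)            20                 40

Second experiment:
                      Factor A low (-)   Factor A high (+)
Factor B high (+)           40                 12
Factor B low (-)            20                 50
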

The main effect of factor A in this two-level design can be thought of as the difference
between the average response at the high level of A and the average response at the low
level of A. Numerically, this is

$A = \frac{40 + 52}{2} - \frac{20 + 30}{2} = 21$.

That is, increasing factor A from the low level to the high level causes an average
response increase of 21 units. Similarly, the main effect of B is

$B = \frac{30 + 52}{2} - \frac{20 + 40}{2} = 11$.

In some experiments, we may find that the difference in response between the levels of
one factor is not the same at all levels of the other factors. When this occurs, there is an
interaction between the factors.
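
The main-effect arithmetic can be checked with a short Python sketch. The function name `effects_2x2` is hypothetical, and the interaction contrast AB (half the difference between the effect of A at high B and the effect of A at low B) is the standard textbook definition for a two-level design rather than a formula stated in these notes. The second call uses the responses 20, 50, 40 and 12 from the second two-factor experiment in the Montgomery illustration.

```python
def effects_2x2(y_ll, y_hl, y_lh, y_hh):
    """Main effects and interaction for a 2 x 2 factorial with one response per cell.

    y_ll: A low,  B low     y_hl: A high, B low
    y_lh: A low,  B high    y_hh: A high, B high
    """
    a = (y_hl + y_hh) / 2 - (y_ll + y_lh) / 2    # average at high A minus average at low A
    b = (y_lh + y_hh) / 2 - (y_ll + y_hl) / 2    # average at high B minus average at low B
    ab = ((y_hh - y_lh) - (y_hl - y_ll)) / 2     # half the change in the A effect across B
    return a, b, ab

first = effects_2x2(20, 40, 30, 52)
second = effects_2x2(20, 50, 40, 12)
print("First experiment  (A, B, AB):", first)
print("Second experiment (A, B, AB):", second)
```

In the first experiment the interaction contrast is small relative to the main effects, whereas in the second the effect of A reverses sign between the two levels of B, so the main effect of A alone is a poor summary.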

Example: Quality of cakes


The effect of two factors, the amount of milk plasma (four levels) and baking powder (two
levels), on the quality of cakes was studied in an experiment with a completely
randomised design. Four batches of each type were made, and the quality was rated by a panel
of experts. The quality ratings were recorded in the following table.

                           Amount of milk plasma
                       0%      5%      10%     15%
No baking powder       3.9     4.1     4.3     4.7
                       3.3     4.0     4.4     4.6
                       3.5     4.3     4.5     4.7
                       3.7     4.4     4.6     5.0
Baking powder          4.1     4.2     4.4     4.5
                       3.9     4.2     4.5     4.5
                       3.8     3.9     4.3     4.4
                       4.1     4.5     4.6     4.7

2.7 Nested Designs


In certain multifactor experiments, the levels of one factor (e.g., factor B) are similar but
not identical for different levels of another factor (e.g., A). Such an arrangement is called
nested, or hierarchical, design, with the levels of factor B nested under the levels of factor
A. In this type of experimental design the variables have an implicit hierarchy. For
example, a hospital has two wings (I and II). Patients in wing I are randomly assigned to
either consultant A or consultant B. Patients in wing II are randomly assigned to either
consultant C or consultant D. Thus consultants A and B are nested within the wing I
patients, and consultants C and D are nested within the wing II patients.

Example: Suppose we have five different machines making the same part and each
machine has two operators, one for the day shift and one for the night shift. We take five
samples from each machine for each operator to obtain the following data:

                            Machine
                    1      2      3      4      5
Operator Day      .125   .118   .123   .126   .118
                  .127   .122   .125   .128   .129
                  .125   .120   .125   .126   .127
                  .126   .124   .124   .127   .120
                  .128   .119   .126   .129   .121
Operator Night    .124   .116   .122   .126   .125
                  .128   .125   .121   .129   .123
                  .127   .119   .124   .125   .114
                  .126   .125   .126   .130   .124
                  .129   .120   .125   .124   .117

This is a two-stage nested design, with operators nested under machines. This design can
be analysed using a nested ANOVA table.
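
A sketch of the sums of squares for this two-stage nested design is given below. It assumes the data layout of five samples (rows) for each of the five machines (columns) under each shift, and uses the usual decomposition into machines, operators within machines, and error; the variable names are illustrative only.

```python
# Five samples (rows) for each machine (columns 1-5), one table per operator shift.
day = [
    [0.125, 0.118, 0.123, 0.126, 0.118],
    [0.127, 0.122, 0.125, 0.128, 0.129],
    [0.125, 0.120, 0.125, 0.126, 0.127],
    [0.126, 0.124, 0.124, 0.127, 0.120],
    [0.128, 0.119, 0.126, 0.129, 0.121],
]
night = [
    [0.124, 0.116, 0.122, 0.126, 0.125],
    [0.128, 0.125, 0.121, 0.129, 0.123],
    [0.127, 0.119, 0.124, 0.125, 0.114],
    [0.126, 0.125, 0.126, 0.130, 0.124],
    [0.129, 0.120, 0.125, 0.124, 0.117],
]

a, b, n = 5, 2, 5   # machines, operators per machine, samples per operator

def mean(values):
    return sum(values) / len(values)

# cells[m] holds [day samples, night samples] for machine m
cells = [[[row[m] for row in shift] for shift in (day, night)] for m in range(a)]

grand_mean = mean([y for machine in cells for op in machine for y in op])
machine_means = [mean([y for op in machine for y in op]) for machine in cells]

ss_machine = b * n * sum((mm - grand_mean) ** 2 for mm in machine_means)
ss_operator = n * sum(
    (mean(op) - mm) ** 2 for machine, mm in zip(cells, machine_means) for op in machine
)
ss_error = sum((y - mean(op)) ** 2 for machine in cells for op in machine for y in op)
ss_total = sum((y - grand_mean) ** 2 for machine in cells for op in machine for y in op)

df_machine = a - 1              # 4
df_operator = a * (b - 1)       # 5 (operators within machines)
df_error = a * b * (n - 1)      # 40

print(f"Machines            SS={ss_machine:.6f} df={df_machine}")
print(f"Operators(Machines) SS={ss_operator:.6f} df={df_operator}")
print(f"Error               SS={ss_error:.6f} df={df_error}")
print(f"Total               SS={ss_total:.6f} df={a * b * n - 1}")
```

Because the day operator on machine 1 is a different person from the day operator on machine 2, operator effects are only meaningful within a machine, which is why the operator sum of squares is computed around each machine mean rather than around the grand mean.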

Advantages of nested design: A nested design is recommended for studying the effect of
sources of variability that manifest themselves over time. Data collection and analysis are
straightforward, and there is no reason to estimate interaction terms when dealing with
time-dependent errors. Nested designs can be run at several levels. In nested designs,
carry over effects are not a problem, as individuals are measured only once.

2.8 Repeated Measures Design


The repeated measures design is a frequently used ANOVA design in which all subjects
participate under all levels of the independent variable (hence subjects are repeatedly
measured). It is also referred to as a totally within subjects design. The main advantage of
the repeated measure design is that it controls for subject heterogeneity (individual
differences).
If, for example, we are interested in the effects of alcohol on driving ability, then any
other variable which may influence driving ability is known as a nuisance variable. Such
things as the type of car, the driving course, temperature, humidity, time of day, and the
driver's level of experience, age and reflexes would all have an influence on the score on
a driving test. These are all referred to as nuisance variables. If the same subjects are used
in each treatment condition, then most of the subject nuisance variables will remain
constant across the different conditions of the experiment. In our driving ability example,
the same subject would be given two driving tests. One when his or her blood alcohol
level was zero, and another, when his or her blood alcohol level was .05. In this way the
driver's experience, age, personality and reflexes etc. would all be the same in both of the
treatment conditions (alcohol levels). By comparing the driving skill at zero blood
alcohol level to the driving skill of the same driver at .05 blood alcohol level, the effects
of the alcohol will not be obscured by differences in basic driving skill. This is known as
a repeated measures design. It is the most sensitive of the three designs we will consider.
It can, however, have one major drawback. Suppose all of the drivers first perform at the
zero blood alcohol level, then we give them morning tea, laced with alcohol and then test
their driving skill again. This introduces practice effects (these are also known as order
effects). In the first driving test, the subjects were not familiar with the driving course. In
the second test, the subjects performed better, because they were more familiar with the
course. We could reduce this practice effect by allowing all drivers an hour of practice on
the course before the testing began. However, practice effects could still be present to
some extent.

Example
An educational psychologist was interested in examining the effects of a program
designed to improve the attentiveness of children in primary school classes. In a random
sample of 120 primary school aged children, attentiveness was measured at the start of the
year (before the program started), and on three other occasions throughout the year after
3 months, after 6 months and after 12 months. The psychologist expected that
attentiveness would improve after the program commenced, and would continue to
improve throughout the year. Attentiveness was measured on a metric scale taking values
from 0 to 50, with higher values representing greater attentiveness.

Revision Exercises

1. To test the effect of small proportions of coal in sand for manufacturing concrete,
several batches were mixed under practically identical conditions except for the
variation in the percentage of coal. From each batch, several cylinders were made
and tested for breaking strength. The results obtained are shown below. (Taken
from Gupta & Kapoor (1984)).

                 Percentage of coal
0.00     0.05     0.10     0.50     1.00
1690     1550     1625     1725     1530
1580     1445     1450     1550     1545
1745     1645     1510     1430     1665
1685     1545     -        1445     1520

(i) What is the experimental unit?
(ii) What is the treatment factor?
(iii) What type of design is used (e.g. CRD, RBD or LSD)?
(iv) Calculate the error degrees of freedom.

2. A plum orchard has 24 trees set aside for an experiment which aims to examine
the effect of mulching on tree growth. There are 4 mulching treatments: (A)
Control (no mulch); (B) Wood chips; (C) Garden compost; and (D) Clippings from
own collection. The trees are in a 4 × 6 rectangle. The ground slopes down from the
left to the right.
(i) What is the experimental unit?
(ii) What type of design is used?
(iii) If it is a randomised block design, state the blocking factor.

3. In a hypothetical study, the effects of noise on the performance of a perceptual
task were tested with 40 participants. The participants practised the task until they
reached a consistent level of performance. Participants were then asked to perform
the perceptual task in each of the three noise conditions. In treatment I quiet
conditions were used, in treatment II an intermittent noise was used, and in
treatment III the noise was continuous. The researchers observed the number of
errors made in each condition. The order of the treatments was counterbalanced
across the participants. The results were recorded for analysis.
(i) What is the experimental unit?
(ii) What experimental design has been used?
(iii) What measures have been taken to control practice effects?
(iv) Give the research hypothesis.


Solutions to revision exercises


1. (i) Experimental unit is cylinder.
(ii) Treatment is percentage of coal.
(iii) This is a completely randomised design (CRD) (no blocking factor).
(iv) Error degrees of freedom = total number of observations - number of treatments
= N - p = 19 - 5 = 14.
2. (i) Experimental unit is a tree.
(ii) It is a randomised block design.
(iii) The blocking factor is slope of the ground.

3.
(i) The experimental unit is a participant.
(ii) Repeated measures design.
(iii) Participants practiced the task until they had reached a consistent level of
performance. Also, participants performed the three conditions in different orders.
(iv) Noise condition affects the number of errors made on a perceptual task.

References
Box, G.E.P., Hunter, W.G. & Hunter J.S. (2005). Statistics for Experimenters: Design,
Innovation and Discovery. 2nd edition. New York: Wiley.
Cox, D.R. (1958). Planning of Experiments. New York: Wiley.
Das, M.N. & Giri, N.C. (1986). Design and Analysis of Experiments. 2nd edition. New
Delhi: Wiley Eastern Ltd.
Dawson, B. & Trapp, R.G. (2004). Basic and Clinical Biostatistics. New York: McGraw-Hill.
Jones, B. & Kenward, M.G. (2003). Design and Analysis of Crossover Trials. 2nd edition.
London: Chapman & Hall.
Mead, R. (1988). The Design of Experiments. Cambridge University Press, Cambridge.
Montgomery, D.C. (2005). Design and Analysis of Experiments. 6th edition. New York:
Wiley.
Petersen, R.G. (1985). Design and Analysis of Experiments. New York: Marcel Dekker, Inc.

Research Design: Topic 9

Module 2: Topic 3

Topic 3: Incidence, Prevalence, Measures of Ratio and Fertility Statistics

Dr Amirul Islam

Acknowledged to Dr Jahar Bhowmik

Course notes STA60004

Semester 1, 2015

Contents

3.1 Topic introduction
3.2 Topic learning objectives
3.3 Definitions
3.4 Some disadvantages of morbidity data
3.5 Morbidity measures
3.6 Relationship between Incidence and Prevalence
3.7 Measures of Ratio
3.8 Difference Measures 1: Attributable Risk
3.9 Measures of Fertility
Revision exercises
Solution to the revision exercises
References

Note: Some of the materials are adapted from standard texts and guides (see references).

3.1 Topic introduction


Welcome to Morbidity Statistics. Morbidity means the relative incidence of
sickness and injury occurring among a given group of people. Incidence and
prevalence rates are both measures of morbidity, which measure the various
effects of disease on a population. Morbidity is often expressed as the percentage of people in a
population who get sick from a particular disease. In this topic, we will explore the
issues associated with comparing morbidity (disease, illness) statistics in
populations and common measures of fertility statistics.

3.2 Topic learning objectives


Learning objectives
When you have worked through this topic you should:

Know sources of morbidity data (in Australia)


Define incidence and prevalence; state the relationship between them
Know how to calculate prevalence and incidence
Know how to evaluate morbidity data by person, place and time.

3.3 Definitions
Some key health statistics definitions are as follows. Note that there is a
distinction between rate and risk which are often confused. A risk is a
proportion and a rate is a measure over time.
In morbidity statistics, there are both measures of risks and rates.
Term                        Definition
Numerator                   Top number in a fraction
Population                  Any defined group of people, e.g. all children of school age
Morbidity                   Disease, illness, injury (non-death health outcome)
Mortality                   Death
Risk (%)                    Number of events/number in population
Rate                        Number of events/time measure
Ratio                       Number of cases/population
Mid-year population         Time measure
Person-years observation    Time measure

Further Notes
A proportion is a dimensionless number between 0.0 and 1.0 (if a probability) or,
equivalently, between 0% and 100% (if a percentage) consisting of one count as
the numerator divided by another count as the denominator. Note that for
consistent, unbiased interpretation, 1) all the individuals in the numerator must
also be included in the denominator, 2) each individual in the denominator must
be at risk of being in the numerator, and 3) all the individuals at risk of being in
the numerator in a group must be in the denominator.
A ratio is a value obtained by dividing the numerator by the denominator. The
numerator and denominator do not have to be related to each other. For example,
if there were 10 smokers and 500 non-smokers in a factory, then the ratio of
smokers to non-smokers would be 10/500 = 0.02 or 1:50. Similarly, the ratio of
males to females in a classroom of 25 boys and 5 girls is 25:5 or 5:1.

A risk is a proportion and it is a special type of ratio in which the numerator is
actually included in the denominator. If there were 5523 deaths in a population, of
which 221 were due to coronary heart disease, then the proportion of total deaths
due to coronary heart disease is 221/5523 or 4%. The risk can also be
expressed as a decimal (0.04), a fraction (4/100) or a percentage (4%).

A rate is another type of ratio: it expresses the frequency (number) of events
occurring within a defined population over a specified time period.
The use of rates rather than raw numbers allows one to compare health problems
in the same population at different points in time, or between different
populations in different places.
Risks and rates usually have values < 1, and since small decimals are hard to
discuss (e.g. talking about fractions of a death), they are usually multiplied by a
constant. This can be 100 (to make a percentage), or 1000, 10,000 or
1,000,000. Note that the multiplier only changes the units in which the measure
is expressed (0.004 is the same as 4 per 1000); the underlying value is not
changed.
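
As a small illustration of the multiplier, the sketch below re-expresses the proportion from the coronary heart disease example (221 of 5523 deaths) per 100, per 1,000 and per 100,000. The helper name `per` is made up for this illustration and is not part of the notes.

```python
def per(value, multiplier):
    """Express a proportion or rate per `multiplier` units (hypothetical helper)."""
    return value * multiplier

chd_deaths = 221    # deaths due to coronary heart disease
all_deaths = 5523   # total deaths in the population
proportion = chd_deaths / all_deaths

# The same quantity in different units; only the scale changes.
print(f"decimal:     {proportion:.3f}")
print(f"per 100:     {per(proportion, 100):.1f}")
print(f"per 1,000:   {per(proportion, 1_000):.1f}")
print(f"per 100,000: {per(proportion, 100_000):.0f}")
```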
An association exists if two variables appear to be related by a mathematical
relationship; that is, a change of one appears to be related to the change in the
other. Association is necessary for a causal relationship to exist but association
alone does not prove that a causal relationship exists. A correlation coefficient
or the risk measures often quantify associations.


Cause is the combination of necessary and sufficient factors (e.g., attributes and
exposures) the presence of which, alone or in combination, at some time during an
individual's life, inevitably results in disease in that individual.

3.4 Some disadvantages of morbidity data


Morbidity data can be a useful measure of disease, especially for the many
diseases where mortality is not appropriate. However, it is not routinely collected
(except that there are some infectious diseases that must be notified to the
Department of Human Services). Morbidity data can be expensive to collect and
can go out of date as knowledge of the disease increases rapidly (eg. changes to
criteria for including or excluding a case from the collection).

Surveillance
Surveillance is a continuous and systematic process of collection, analysis,
interpretation and dissemination of descriptive information for monitoring health
problems (World Health Organisation). Surveillance systems are networks of
people and activities that maintain this process and may function at a range of
levels, from local to international.
There are essentially three types of surveillance system:

Passive: most of the surveillance routinely done is passive surveillance. Here,
those who are required to report diseases, such as doctors, laboratories and
hospitals, are given the mailing forms with instructions and are expected to
report the cases of reportable diseases that they come across. This is relatively
inexpensive to maintain, but the data tend to be less complete because some
cases go unreported.

Active: requires periodic (e.g. weekly) telephone calls or personal visits to the
reporting individuals and institutions to obtain the required data. It is more
labour intensive and costly, but provides more complete and accurate data.
E.g. cancer registries reviewing hospital records for new cases of cancer and
benign tumours.

Sentinel: relies on reports of cases of a disease (a preventable event) that can
be used to identify a problem in the system of prevention, detection, or
treatment. An example of a sentinel event is a case of polio, which indicates the
need for attention to immunization in the population. Although relatively
inexpensive to maintain, sentinel surveillance lacks specificity regarding the
cause of death and the risk factors to which the population has been exposed.

More precisely, Disease Surveillance:

Follow long term trends or patterns
Secular trends (ignoring fluctuations)
Forecast future trends or patterns
Recognise new agents, changes in host response
Assess potential for new diseases to emerge
Understand & prepare for greater susceptibility
Detect sudden changes (person, place & time)
Disease -> new population subgroup
Disease -> new areas
Identification of epidemics & outbreaks
Identification of bio-terrorist activity

3.5 Morbidity measures


Incidence and prevalence are the two major measures of disease frequency.

Incidence
Incidence is the number of new cases of disease within a defined population
during a specified time period. Examples of incidence are the number of new cases
of lung cancer diagnosed in Australia in 2003, or the number of new HIV-positive
cases diagnosed in Victoria in 2003.
The incidence rate (IR) is defined as the number of newly reported cases of a
given disease in a calendar year, divided by the mid-year population, the quotient
being multiplied by a convenient factor, usually 1000, 100,000, or 1,000,000.
There are two different types of incidence:
Cumulative incidence and
Incidence density
The difference between them relates to the way in which they are
measured.
Cumulative incidence (CI)
Defined as the proportion of the population at risk that becomes diseased during
a specified period of time. Cumulative incidence provides an estimate of the
probability or risk that an individual will develop a disease during a specified
period of time.

$$\mathrm{CI} = \frac{\text{number of new cases in a time period}}{\text{population at risk}}$$

The cumulative incidence assumes that the entire population at risk has been
followed for the entire specified period of time. For example, if 1000 coal miners
were followed-up for 10 years and 200 of them were found to develop lung
cancer, then the cumulative incidence for lung cancer would be (200/1000) = 20%
over a 10 year observation period (note that the period of observation is
specified). However, not all studies are designed like this. For instance, most
participants in follow-up studies enter the study over a period of time, often over
several years. Others will become lost to contact during the follow-up period so
that their information is not available at the end of the study. The length of time of
the study or follow-up will therefore not be the same for each participant.
Incidence density (ID)
Incidence density accounts for the varying time periods of follow-up and thus
makes use of whatever data are available for each person to examine the
occurrence of new disease. Incidence density is given by the formula:

$$\mathrm{ID} = \frac{\text{number of new cases during a given time period}}{\text{total person-time of observation (disease-free)}}$$

Although the numerator is the same as in the calculation of cumulative incidence,
the denominator is now the sum of each individual's time at risk, that is, the sum
of the time that each person remained under observation and free from disease.

Example:

[Figure: follow-up timelines for five subjects between 1990 and 2000, marking time followed and disease onset]

Subject | Time at risk (years)
Sub A | 6.0
Sub B | 6.0
Sub C | 11.0
Sub D | 9.5
Sub E | 5.0
Total years at risk | 37.5

The figure provides data concerning five subjects in a defined population who are
followed for the development of a given disease. Each person is followed up for a
varying period of time between 1990 and 2000, with two people developing the
disease of interest. Subject A was followed up for 6 years (1990 to 1995), subject B
for 6 years (1992 to 1997) until diagnosed with the disease, subject C for 11 years,
subject D for 9.5 years and subject E for 5 years before being diagnosed.
Therefore, the incidence density is:

$$\mathrm{ID} = \frac{2 \text{ new cases}}{37.5 \text{ disease-free years (person-years of observation)}} = 0.053$$

= 5.3 per 100 person-years of observation.
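
The same calculation can be sketched in a few lines of Python. The years at risk and the two cases (subjects B and E) come from the example; the dictionary layout and the function name `incidence_density` are illustrative choices, not part of the notes.

```python
# Years at risk and whether the subject developed the disease,
# taken from the five-subject example (subjects B and E became cases).
follow_up = {
    "A": (6.0, False),
    "B": (6.0, True),
    "C": (11.0, False),
    "D": (9.5, False),
    "E": (5.0, True),
}

def incidence_density(records):
    """New cases divided by total disease-free person-time."""
    new_cases = sum(1 for _, diseased in records.values() if diseased)
    person_years = sum(years for years, _ in records.values())
    return new_cases / person_years

density = incidence_density(follow_up)
print(f"{density * 100:.1f} per 100 person-years")
```

Because each subject contributes only the time actually spent under observation, subjects who enter late or are lost early still add information to the denominator.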


For more please see this link: http://www.ispub.com/journal/the-internet-journalof-internal-medicine/volume-2-number-2/density-incidence-and-cumulativeincidence-a-fundamental-difference.html

Prevalence
Prevalence is defined as the number of affected persons in the population at a
specific time divided by the number of persons in the population at that time. It is
given by the formula:

$$\text{Prevalence per 1,000} = \frac{\text{number of cases of a disease present in the population at a specified time}}{\text{number of persons in the population at that specified time}} \times 1000$$

Examples include the percentage of children with skin cancer when tested at
school entry in 2003 and proportion of people with lung cancer in Australia.
There are two types of prevalence.

Point prevalence
Defined as the number of persons (old and new cases) in a defined population who
have a specified outcome (e.g. disease) at a single point in time. Example: number
of women in a maternity ward in The Royal Children's Hospital giving birth right
now.

Period prevalence
Defined as the number of persons who had the disease at any time during the
specified time interval. Example: number of women in the maternity ward giving
birth at any time during December 2003.
Note that period prevalence combines point prevalence and incidence: it is the
sum of the point prevalence at the beginning of the interval plus the incidence
during the interval.

3.6 Relationship between Incidence and Prevalence


Incidence measures the appearance of disease; prevalence measures the existence
of disease. Incidence means new, and prevalence means all. Prevalence of a
disease depends upon two factors: the incidence and duration of disease. Thus, a
change in disease prevalence may reflect a change in incidence, duration of
disease (e.g. by an introduction of a new and effective treatment), or both.


Assuming a steady-state situation, the relationship between prevalence, incidence
and duration can be expressed as:
Prevalence = Incidence x Duration of disease
A high prevalence may arise from a high incidence of disease, a longer duration of
disease, or both. For example, improvements in therapy that prevent death
without producing recovery may increase the prevalence of the disease.
A low prevalence may result from a low incidence of disease, a shorter duration of
disease (e.g. rapid recovery or rapid death), or both.
The four common measures of morbidity are summarized in the following table.

Type | Disease | Measure | Definition
Risk | Existing | Point prevalence | Number of persons with disease at a given time / total population at that time
Risk | New | Cumulative incidence | Number of new cases in a time period / number of persons at risk at beginning of the period
Risk | Existing and new | Period prevalence | Total number of cases in a time period / mid-period population
Rate | New | Incidence rate (or density) | Number of new cases in a time period / person-years of observation

3.7 Measures of Ratio

Tell us how many times more likely it is that someone who is exposed to a risk
factor will experience a particular health outcome than someone who is not
exposed.
Do not tell us anything about the actual amount of disease occurring in either
group.

$$\text{Rate Ratio} = \frac{\text{Incidence rate in exposed}}{\text{Incidence rate in unexposed}} = \frac{IR_e}{IR_o}$$

$$\text{Risk Ratio} = \frac{\text{Cumulative incidence in exposed}}{\text{Cumulative incidence in unexposed}} = \frac{CI_e}{CI_o}$$

$$\text{Prevalence Ratio} = \frac{\text{Prevalence in exposed}}{\text{Prevalence in unexposed}} = \frac{P_e}{P_o}$$

3.8 Difference Measures 1: Attributable Risk

Tells us how much extra disease is occurring among those exposed to something
compared to those who are unexposed.

Tells us how much disease among those who are exposed could potentially be
prevented by removing the exposure.

Attributable Risk

Rate difference or Excess rate
= Incidence rate in exposed - Incidence rate in unexposed

$$\text{Rate difference} = IR_e - IR_o$$

Risk difference or Excess risk
= Cumulative incidence in exposed - Cumulative incidence in unexposed

$$\text{Risk difference} = CI_e - CI_o$$

Attributable Fraction

Tells us the proportion of disease in those exposed that can be attributed to the
exposure.

$$\text{Attributable Fraction (AF)} = \frac{\text{Attributable Risk}}{\text{Incidence in exposed}} \times 100$$

Using Incidence Rate:

$$\mathrm{AF} = \frac{IR_e - IR_o}{IR_e} \times 100$$

Using Cumulative Incidence:

$$\mathrm{AF} = \frac{CI_e - CI_o}{CI_e} \times 100$$

An Example: Obesity and Type-2 Diabetes


Imagine that:
30% of a community is obese
82.5% of diabetics are obese
the rate of type-2 diabetes is
330 per 10^5 person-years in the obese (IR_e),
30 per 10^5 person-years in the non-obese (IR_o)

The rate ratio

$$RR = \frac{IR_e}{IR_o} = \frac{330}{30} = 11.0$$

The rate ratio tells us the rate of type-2 diabetes is 11 times higher among people
who are obese than among non-obese people
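
The notes work out only the rate ratio for this example. As a sketch, the difference measures of section 3.8 can be applied to the same two rates; the variable names below are illustrative, and the attributable risk and attributable fraction values are an extension of the example rather than results quoted in the notes.

```python
# Rates of type-2 diabetes from the example, per 100,000 person-years.
ir_exposed = 330.0    # obese
ir_unexposed = 30.0   # non-obese

# Ratio measure: how many times higher the rate is in the exposed.
rate_ratio = ir_exposed / ir_unexposed

# Difference measure: excess rate among the exposed (attributable risk).
rate_difference = ir_exposed - ir_unexposed

# Share of the rate in the exposed that is attributable to the exposure.
attributable_fraction = rate_difference / ir_exposed * 100

print(f"Rate ratio:            {rate_ratio:.1f}")
print(f"Attributable risk:     {rate_difference:.0f} per 100,000 person-years")
print(f"Attributable fraction: {attributable_fraction:.1f}%")
```

The ratio says nothing about how much disease occurs; the difference and the fraction do, which is why both kinds of measure are usually reported together.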

3.9 Measures of Fertility


In demography, the word fertility is used in relation to the actual production of
children or occurrence of births, especially live births. Fertility must be
distinguished from fecundity which refers to the capacity to bear children. In fact,
fecundity provides an upper bound for fertility. As a measure of the rate of growth
of population, various fertility rates are computed.

Crude Birth Rate (CBR): This rate reflects the number of births in a defined
population during a specified period (usually a year) divided by the midyear
(July 1) population, the quotient being multiplied by 1000. Because of variations
in age composition and other factors, crude rates are seldom useful for
comparisons.
For example: In 1986 the mid-year (July 1) Australian population was
15,602,156 and 243,408 live births were recorded. So, the crude birth rate is

$$\mathrm{CBR} = \frac{243{,}408}{15{,}602{,}156} \times 1000 = 15.6$$

live births per 1000 population per year.

General Fertility Rate (GFR): The general fertility rate is defined as the
number of live births in a calendar year, divided by the number of women aged
15-44 (or 15-49) at midyear, the quotient being multiplied by 1000.
For example: In the US in 1987, live births: 3,829,000; number of women aged
15-44: 58,012,000; neonatal deaths: 2780.

$$\mathrm{GFR} = \frac{3{,}829{,}000}{58{,}012{,}000} \times 1000 = 66$$

live births per 1000 women aged 15-44 per year.
This rate is more sensitive than the CBR because its denominator is restricted to
women of child-bearing age.
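
Both rates share the same arithmetic and differ only in the denominator. A minimal sketch using the figures quoted above; the function name `rate_per_1000` is an illustrative choice.

```python
def rate_per_1000(events, population):
    """Events per 1000 members of the chosen denominator population."""
    return events / population * 1000

# Crude birth rate: Australia 1986, all persons at mid-year in the denominator.
cbr = rate_per_1000(243_408, 15_602_156)

# General fertility rate: US 1987, only women aged 15-44 in the denominator.
gfr = rate_per_1000(3_829_000, 58_012_000)

print(f"CBR: {cbr:.1f} live births per 1000 population per year")
print(f"GFR: {gfr:.1f} live births per 1000 women aged 15-44 per year")
```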

Age-Specific Birth Rate (ASBR) or Age-Specific Fertility Rate (ASFR): GFR gives a
heterogeneous figure since it overlooks the age composition of the female
population of child-bearing age. To overcome this drawback, fertility rates are
computed separately for the different age groups within the reproductive age
range; these are called age-specific birth rates.
Age-specific birth rate is the number of resident live births to women in a specific
age group for a specified geographic area (country, state, county, etc.), divided by
the total population of women in the same age group for the same geographic area
(for a specified time period, usually a calendar year). This figure is usually
multiplied by 1000 to give a rate per 1000 population.

Example: In a specific state in the US, there were 36,000 live births in 2008
among state resident women aged 20-24, and 310,000 state resident women aged
20-24 in 2008. The age-specific birth rate (ASBR) for the age group 20-24 is
(36,000/310,000) x 1000 = 116.1 live births per 1000 state resident women aged
20-24 in 2008.

Total Fertility Rate (TFR):

In order to arrive at a more practical measure of population growth, the
age-specific fertility rates for the different age groups are combined to give a
single quantity known as the total fertility rate.
TFR is the sum of the age-specific birth rates (5-year age groups between 10 and
49) for female residents of a specified geographic area (nation, state, county, etc.)
during a specified time period (usually a calendar year) multiplied by 5, because
each rate applies to the five years a woman spends in that age group. (NOTE:
This rate estimates the number of children a hypothetical cohort of 1,000 females
in the specified population would bear if they all went through their childbearing
years experiencing the same age-specific birth rates for a specified time period.)

$$\mathrm{TFR} = 5 \times \sum \mathrm{ASBR}$$
Example:
The Total Fertility Rate for a state in the US for the year 2000 is calculated below:

Age Group | Births in 2000 | Female Population in 2000 | ASBR
10-14 | 300* | 165,000 | 1.8
15-19 | 11,000 | 179,000 | 61.5
20-24 | 20,000 | 192,000 | 104.2
25-29 | 22,000 | 222,000 | 99.1
30-34 | 20,000 | 213,000 | 93.9
35-39 | 10,000 | 212,000 | 47.2
40-44 | 2,000 | 210,000 | 9.5
45-49 | 500* | 200,000 | 2.5
Total (sum) of ASBRs | | | 419.7

(Figures are grossly rounded.)

TFR = 419.7 x 5 = 2,098.5 live births per 1,000 female state residents in 2000 who
live through their reproductive years.
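
The table can be reproduced with a short script. This is a sketch under one stated assumption: each ASBR is rounded to one decimal place before summing, as the table does; the dictionary layout and function name are illustrative.

```python
# Births and female population by 5-year age group (state example, year 2000).
age_groups = {
    "10-14": (300, 165_000),
    "15-19": (11_000, 179_000),
    "20-24": (20_000, 192_000),
    "25-29": (22_000, 222_000),
    "30-34": (20_000, 213_000),
    "35-39": (10_000, 212_000),
    "40-44": (2_000, 210_000),
    "45-49": (500, 200_000),
}

def asbr(births, women):
    """Age-specific birth rate per 1000 women in the age group."""
    return births / women * 1000

# Assumption: round each rate to one decimal, matching the printed table.
rates = {group: round(asbr(b, w), 1) for group, (b, w) in age_groups.items()}

# Each rate applies for the 5 years spent in the age group.
tfr = 5 * sum(rates.values())

for group, rate in rates.items():
    print(f"{group}: {rate:.1f}")
print(f"TFR: {tfr:.1f} live births per 1000 women")
```

Summing unrounded rates would give a slightly different total, so the rounding step matters when comparing against the table.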


Revision Exercise

1. What are the differences between each of the following terms?


Incidence and prevalence
Cumulative incidence and incidence density

2. Indicate which type of measure of disease frequency best describes each of the
following scenarios?
i. Percentage of students enrolled in a postgraduate course who
developed influenza during the mid-term exam.
ii. Percentage of students enrolled in an epidemiology class who had
sore throats on the first day of class
iii. Percent of breast cancer patients who underwent mastectomy
during year 2003.
iv. Percent of men found to have high cholesterol level at their yearly
physical examination.
v. Number of newly diagnosed cases of SARS in a year per 1 million
persons.
vi. Percent of drivers found to be legally drunk at the time of their car
accident.
vii. A group of 1000 children who were free of asthma were followed up for
5 years. The rate of developing asthma in this group was 10
per 1000 person-years.
viii. A study was carried out to examine the health status of the elderly
in the Ivanhoe nursing home on the 10th of June 2003. 13% of the
subjects in the nursing home had high blood pressure.

3. Suppose that you want to estimate the average duration of a disease from its onset
to death or cure. Which two measures of disease frequency do you need to know
in order to calculate your estimate?


4. A study was carried out to compare the risk of coronary heart disease (CHD)
between pre- and post-menopausal women and the results are tabulated below.

Group | Person-years at risk | Cases
Post-menopause | 7524 | 32
Pre-menopause | 9583 | 10

(a) Calculate the risk of CHD in both pre- and post-menopause groups.

(b) What type of measure is this?

(c) How would you summarize the findings?

5. A population of 1000 people is monitored for a year for the development of
measles. No one has measles at the start of the investigation. Thirty people
develop measles on June 30 and twenty people develop measles on September 30.
Eight people are lost to follow-up on March 31 and twenty-four people are lost to
follow-up on November 30. None of those lost to follow-up had developed
measles prior to becoming lost.

(a) What is the cumulative incidence of measles in this population?

(b) What is the incidence rate of measles?

(c) What is the prevalence of measles on July 1?

6. Questions 6 & 7 are taken from Morton RF, Hebel JR & McCarter RJ (2001). A
Study Guide to Epidemiology and Biostatistics, 5th ed.
From the data in the table below, compute the average duration, in
years, of the five chronic neurological conditions listed.
Table: Prevalence and incidence of selected neurological diseases
in Rochester (rates per 100,000 population)

Disease | Prevalence | Incidence
Epilepsy | 376 | 30.8
Multiple sclerosis | 55 | 5.0
Parkinson's disease | 157 | 20.0
Motor neuron disease | 7 | 1.7
Central nervous system neoplasms | 69 | 17.3

7. Assume that the prevalence of coronary heart disease decreases after age 70,
while its incidence continues to increase with age. What is the most probable
explanation for the divergence of these rates?

8. Figure taken from Lilienfeld and Stolley (2000). Foundations of Epidemiology,
3rd edn.

(a) What was the prevalence of disease on January 1, 1992?

(b) What was the prevalence of disease on December 31, 1992?

(c) What was the cumulative incidence of disease during the 1-year follow-up period?

(d) Calculate the period prevalence of the disease for 1992.

9. Calculate the 1986 birth rate for the Australian population if 243,408 live births
were recorded and the mid-year population was 15,602,156.


10. The following table gives the female population of an Indian city for the year
1975, together with the estimated number of births based on a special vital
statistics enquiry conducted in that city.

Age group (year) | Number of women (000) | Number of births
15-19 | 16.0 | 260
20-24 | 16.4 | 2244
25-29 | 15.8 | 1894
30-34 | 15.2 | 1320
35-39 | 14.8 | 916
40-44 | 15.0 | 280
45-49 | 14.5 | 145
Total | 107.7 | 7059

Compute (i) GFR, (ii) ASBR/ASFR and (iii) TFR.


Solution to the revision exercises


1. What are the differences between each of the following terms?
Incidence and prevalence
Cumulative incidence and incidence density
Incidence measures new cases while prevalence measures existing cases.
The key difference between cumulative incidence and incidence rate is the way that
each handles time. Time is not integrated into the cumulative incidence measure (it is
mentioned in words that go along with the number) while time is an integral part of
the incidence rate denominator.

2. Indicate which type of measure of disease frequency best describes each of
the following scenarios?
(i) Cumulative incidence
(ii) Prevalence
(iii) Cumulative incidence
(iv) Prevalence
(v) Incidence density
(vi) Prevalence
(vii) Incidence density
(viii) Point prevalence.

3. Suppose that you want to estimate the average duration of a disease from its
onset to death or cure. Which two measures of disease frequency do you need
to know in order to calculate your estimate?
Prevalence (P) = Incidence (I) x duration of disease (D). To solve for D, you need
P and I, since D = P/I.

4. (a) Calculate the risk of CHD in both pre- and post-menopause groups.
(32 / 7524) x 1000 = 0.00425 x 1000 = 4.25 per 1000 person-years (per
year) for the post-menopausal group.
(10 / 9583) x 1000 = 0.00104 x 1000 = 1.04 per 1000 person-years (per
year) for the pre-menopausal group.

(b) What type of measure is this?
This is a rate, and specifically, it is the incidence density or incidence rate.

(c) How would you summarize the findings?
Post-menopausal women have a higher rate of coronary heart disease,
about 4 times higher (0.00425 / 0.00104 = 4.09).
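
A quick check of these figures, as a sketch with illustrative variable names. The ratio computed from unrounded rates comes out very slightly below the 4.09 obtained from the rounded values above; both support "about 4 times higher".

```python
# Cases and person-years at risk from the table in question 4.
groups = {
    "post-menopause": (32, 7524),
    "pre-menopause": (10, 9583),
}

# Incidence density per 1000 person-years for each group.
rates = {name: cases / person_years * 1000
         for name, (cases, person_years) in groups.items()}

# Rate ratio: post-menopausal relative to pre-menopausal women.
rate_ratio = rates["post-menopause"] / rates["pre-menopause"]

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} per 1000 person-years")
print(f"rate ratio: {rate_ratio:.2f}")
```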

5. (a) What is the cumulative incidence of measles in this population?
(50 / 1000) x 1000 = 50 cases per 1000 population over a year.

(b) What is the incidence rate of measles?
IR = (50 / 992) x 1000 = 50.4 new cases of measles per 1000 population per
year (where the mid-year population is 1000 - 8 = 992).

(c) What is the prevalence of measles on July 1?
(30 / 992) x 1000 = 30.2 cases of measles per 1000 population on July 1.
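
The sketch below reproduces the three answers using the mid-year population (1000 - 8 = 992) as the denominator, and then shows, for comparison only, what a person-time denominator would give. The person-time version is not the method used in the solution; it assumes that cases stop contributing time at onset and that people lost to follow-up contribute time up to the date of loss. All names are illustrative.

```python
from fractions import Fraction

population = 1000
# (number of people, fraction of the year contributed before the event)
cases = [(30, Fraction(6, 12)), (20, Fraction(9, 12))]    # onset June 30, Sept 30
losses = [(8, Fraction(3, 12)), (24, Fraction(11, 12))]   # lost March 31, Nov 30

new_cases = sum(n for n, _ in cases)

# (a) Cumulative incidence per 1000 over the year.
cumulative_incidence = new_cases / population * 1000

# (b) Incidence rate with the mid-year population as denominator.
mid_year_population = population - 8   # only the March losses have left by mid-year
incidence_rate_mid_year = new_cases / mid_year_population * 1000

# (c) Prevalence on July 1: the 30 June cases among those still observed.
prevalence_july_1 = 30 / mid_year_population * 1000

# Comparison (assumption, not the solution's method): person-time denominator.
full_year = population - sum(n for n, _ in cases + losses)
person_years = full_year + sum(n * t for n, t in cases + losses)
incidence_density = new_cases / float(person_years) * 1000

print(f"(a) {cumulative_incidence:.0f} per 1000 over the year")
print(f"(b) {incidence_rate_mid_year:.1f} per 1000 population per year")
print(f"(c) {prevalence_july_1:.1f} per 1000 on July 1")
print(f"person-time version: {incidence_density:.1f} per 1000 person-years")
```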

6. Epilepsy = 12.2 years (376/30.8); multiple sclerosis = 11 years; Parkinson's
disease = 7.9 years; motor neuron disease = 4.1 years; central nervous system
neoplasms = 4.0 years.
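
These durations follow from rearranging Prevalence = Incidence x Duration to Duration = Prevalence / Incidence, which holds under the steady-state assumption stated in section 3.6. A sketch with the Rochester figures (rates per 100,000 population); the function name is illustrative.

```python
# (prevalence, incidence) per 100,000 population, from the question 6 table.
diseases = {
    "Epilepsy": (376, 30.8),
    "Multiple sclerosis": (55, 5.0),
    "Parkinson's disease": (157, 20.0),
    "Motor neuron disease": (7, 1.7),
    "Central nervous system neoplasms": (69, 17.3),
}

def average_duration(prevalence, incidence):
    """Average disease duration in years, assuming a steady state."""
    return prevalence / incidence

durations = {name: average_duration(p, i) for name, (p, i) in diseases.items()}

for name, years in durations.items():
    print(f"{name}: {years:.2f} years")
```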

7. Patients older than age 70 who develop coronary heart disease have shorter
survival times than the younger patients.

8. (a) What was the prevalence of disease on January 1, 1992?
3 were cases on January 1 (prevalence = 3/8)

(b) What was the prevalence of disease on December 31, 1992?
4 were cases on December 31 (prevalence = 4/8)

(c) What was the cumulative incidence of disease during the 1-year
follow-up period?
Number of new cases/population at risk = 4/8

(d) Calculate the period prevalence of the disease for 1992.
Period prevalence = Incidence + Point prevalence
= 4/8 + 3/8 = 7/8
= (7/8) x 1000 = 875 per 1000 people.

9. Calculate the 1986 birth rate for the Australian population if 243,408 live
births were recorded and the mid-year population was 15,602,156.
Crude birth rate:
CBR = (243,408/15,602,156) x 1000
= 15.6 per 1000 population per year.


10.

Age group (year) | Number of women (000) | Number of births | (ii) Age-Specific Fertility Rate (ASFR) (per 1000)
15-19 | 16.0 | 260 | (260/16000) x 1000 = 16
20-24 | 16.4 | 2244 | 137
25-29 | 15.8 | 1894 | 120
30-34 | 15.2 | 1320 | 87
35-39 | 14.8 | 916 | 62
40-44 | 15.0 | 280 | 19
45-49 | 14.5 | 145 | 10
Total | 107.7 | 7059 | 451

(i) GFR = (7059/107700) x 1000 = 65.5, or 66 (approx.) per thousand.
(iii) TFR = 5 x (Sum of the ASFR)
= 5 x 451
= 2255 per thousand (about 2252 if the ASFRs are summed before rounding).
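
The three quantities can be checked with a short script. This is a sketch with illustrative names. It computes the TFR twice, because the result depends on whether the ASFRs are rounded to whole numbers before summing (5 x 451 = 2255) or summed unrounded (about 2252).

```python
# (number of women, number of births) by age group, from the question 10 table.
table = {
    "15-19": (16_000, 260),
    "20-24": (16_400, 2244),
    "25-29": (15_800, 1894),
    "30-34": (15_200, 1320),
    "35-39": (14_800, 916),
    "40-44": (15_000, 280),
    "45-49": (14_500, 145),
}

total_women = sum(women for women, _ in table.values())
total_births = sum(births for _, births in table.values())

# (i) General fertility rate per 1000 women aged 15-49.
gfr = total_births / total_women * 1000

# (ii) Age-specific fertility rates per 1000 women.
asfr = {group: births / women * 1000 for group, (women, births) in table.items()}
asfr_rounded = {group: round(rate) for group, rate in asfr.items()}

# (iii) Total fertility rate: 5-year age groups, so multiply the sum by 5.
tfr_rounded = 5 * sum(asfr_rounded.values())
tfr_unrounded = 5 * sum(asfr.values())

print(f"GFR: {gfr:.1f} per 1000")
print(f"ASFR: {asfr_rounded}")
print(f"TFR from rounded ASFRs:   {tfr_rounded}")
print(f"TFR from unrounded ASFRs: {tfr_unrounded:.1f}")
```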


References

Clayton D. and Hills M. (1993), Statistical Models in Epidemiology. Oxford: Oxford
University Press.

Dawson, B. and Trapp, R.G. (2001), Basic and Clinical Biostatistics, 3rd Edition
(international edition), Lange Medical Book/McGraw-Hill.

Gordis, L. (2013), Epidemiology, 5th Edition, Elsevier/Saunders (available at
Swinburne Bookshop and students may buy this book).

Gupta S.C. and Kapoor V.K. (1984), Fundamentals of Applied Statistics. Sultan
Chand & Sons, New Delhi.

Kahn H.A. (1989), Statistical Methods in Epidemiology. Oxford: Oxford
University Press.

Kuzma J.W. (1998), Basic Statistics for the Health Sciences, 3rd Edition, Mayfield
Publishing Company, Mountain View, California, USA.

Pagano M. and Gauvreau K. (1993), Principles of Biostatistics, Duxbury Press,
USA.


Research Design: Topic 10

Module 2: Topic 4

Topic 4: Mortality Statistics and Standardization of Rates

Dr Amirul Islam

Acknowledged to: Dr Jahar Bhowmik

STA60004: Research Design

Semester 2/SP 3 2015


Contents

4.1 Topic introduction
4.2 Topic learning objectives
4.3 Suggested Reading
4.4 Why study mortality statistics?
4.5 Source of Mortality data
4.6 Problems with Death Certificate
4.7 Other sources of mortality data
4.8 Mortality studies
4.9 Mortality Statistics
4.10 Age affects the rate
4.11 Standardized Rates
4.11.1 Direct Standardization
4.11.2 Indirect Standardization
Revision exercises
Solution to the revision exercises
References

Note: Some of the materials are adapted from standard texts and guides (see references).


4.1 Topic introduction


Welcome to Mortality Statistics. Mortality is the risk of death of a given person based on
factors such as age, gender, lifestyle factors and diseases. In this topic, we will explore the
issues associated with comparing mortality (death) statistics (rates) in populations. We
will also explore the concepts and basic methods for deriving measures that are
comparable across populations that differ in age and other demographic variables.
Standardization (or adjustment) of rates is used to enable the valid comparison of groups
(e.g., those studied in different places or times) that differ regarding an important health
determinant (most commonly age). The standardized rate reflects the number of new
cases that would arise in a hypothetical population.

4.2 Topic learning objectives


Learning objectives
When you have worked through this topic you should be able to:

Identify the sources of mortality data in Australia
Define and interpret basic measures of mortality
Directly standardise mortality rates
Indirectly standardise mortality rates.

4.3 Suggested Reading:


1. Gordis, L. (2009). Epidemiology, 4th Ed. Chapter 4.
2. Dawson, B. and Trapp, R.G. (2001), Basic and Clinical Biostatistics, 3rd
Ed. (international edition), Lange Medical Book/McGraw-Hill.

4.4 Why study mortality statistics?


The study of mortality provides information to prevent early and needless death.
Mortality data is just one of the vital statistics that are collected by government
bodies throughout the world. Vital statistics are records of vital events such as
births and adoptions, deaths, marriages, divorces, separations and annulments.
They characterise the health and social well-being of a population and are usually
periodically presented in summary form by geographic location.


4.5 Source of Mortality data


The major source of mortality data in Australia, as in other countries, is death
certificates. Mortality data is available from the Australian Institute of Health
and Welfare (AIHW). Death certificates generally contain information about the
person who died, including age at death, gender, race, occupation, marital status,
and birth cohort. Comparisons by such characteristics provide much information
about subgroups at risk for different causes of death and help target public health
efforts.

4.6 Problems with Death Certificate

Problem with assigning one cause of death
Deaths are coded on the death certificate according to the underlying cause
(i.e., the disease or injury that initiated the chain of events leading to death).
However, the underlying cause of death does not provide a complete
representation of events. It excludes information on the immediate cause of
death (final cause), intermediate conditions and other contributing factors.

Problem with accuracy and completeness of information provided
Causes of death on the death certificate represent a medical opinion that
might vary among individual physicians. In signing the death certificate,
the physician, medical examiner, or coroner certifies that, in his/her
medical opinion, the individual died from the reported causes of death. The
certifier's opinion and confidence in that opinion are based upon his/her
training, knowledge of medicine, available medical history, symptoms,
diagnostic tests, and available autopsy results for the decedent. Even if
extensive information is available to the certifier, causes of death may be
difficult to determine, so the certifier may indicate uncertainty by
qualifying the causes on the death certificate.

Studies have evaluated (a) underreporting of some diseases (e.g. AIDS),
(b) over-reporting of certain diseases (e.g. stroke) and (c) differences
among physicians' classifications of the underlying cause of death using the
same case histories.

International differences in the quality of data on death certificates
Differences may arise through differences in availability of medical
services, diagnostic practices of physicians, and classification procedures.

Use of International Classification of Diseases (ICD) codes
The ICD is the international coding manual for classifying diagnoses,
which is revised periodically, every eight to ten years. The revisions entail
changes in code numbers and the addition of different disease entities to
categories within a specific code.


4.7 Other sources of mortality data

Autopsy
Hospital records
Occupational records
Financial records (e.g. insurance, pension funds)

4.8 Mortality studies


Mortality studies describe disease in given populations (defining populations
involves specifying time and place) and among certain groups of people (subgroups of persons, e.g. greater than 55, female of Asian origin).
A difference in mortality trends over time or between populations may be
artifactual (the result of errors in the numerator or denominator) or real.
Artifactual
(a)

Errors in the numerator due to:


Changes in the recognition of disease
Improvements in medical services over any given time are reflected in
improved diagnoses of disease and, in turn, in the accuracy of
statements of the cause of death on death certificates.
E.g. in Figure 5.1 from L & S(pp 78) uterine cancer mortality has
declined because of earlier detection of the disease, not because of
decline in mortality due to uterine cancer.

Changes in coding rules

Changes in the classification


Increase (or decrease) in mortality from one cause of death may be
explained by the decline (or increment) in another.

(b)

Errors in the denominator due to


Errors in counting population
May arise because the degree of error in the census differs from one
decade to another, and varies by age, sex, and race. For example, young
black males in the U.S. population are more likely to be undercounted than any
other subgroup. Thus, mortality rates among black males in this subgroup
can be overestimated (number of deaths/underestimated denominator). If
the degree of undercount changes between census years with no
change in the quality of death certification, artifactual trends in mortality
will result (see Table 5.3 in L & S, p. 81).
Errors in classifying by demographic characteristics
e.g., age, gender, social class, ethnicity

Real

Changes in incidence of disease


May occur as a result of:
(i) Genetic factors. If a large increase or decrease in mortality trends
occurs, this indicates that a new environmental agent has been
introduced into or removed from the population. For example, the
pattern of mortality from AIDS indicates the introduction into a
specific population of a new virus that causes a fatal disease.
(ii) Environmental factors: changes in personal habits (e.g. exercise,
alcohol, drugs etc.), occupation, air and water pollution.

Changes in survivorship
E.g. decreased mortality from ectopic pregnancy.

Changes in age distribution of the population


Some diseases vary from the pattern of increasing mortality with age.
E.g. AIDS: affects young to middle aged people more than the elderly
because the behaviours leading to exposure occur more frequently in the
young.
Childbirth: occurs in women of child-bearing age but disappears as women
reach menopause.

4.9 Mortality Statistics


Rates
There are two frequently used time denominators:
mid-year population and
Person-years.
The denominator mid-year population is taken as an estimate of the
average population during the whole calendar year.
For example, in 1991 there were 119,146 deaths recorded in Australia.
The mid-year population estimate from ABS was 17,292,000 persons.
Therefore the mortality rate for the Australian population was
119,146/17,292,000 = 0.0069.
If this figure is multiplied by 1000, then the crude rate becomes 6.9 per
1000 per year (It is easier to deal in whole numbers rather than decimals).
Similar rates can be calculated for subgroups of the population.
For example, in 1986 there were 94,575 babies born to Australian women
aged 25-29 years. The mid-year population for women in this age group
was 648,677.

Therefore the specific fertility rate was (94,575/648,677) x 1000 = 145.8,
or about 146 per 1000 women per year.
For epidemiological research studies, annual rates are often inappropriate
because periods of follow-up are not always whole years and recruitment
into studies can occur unevenly throughout years. Therefore, another
denominator used is person-years of observation.
Person-years of observation mean the total amount of time for which
people were observed to be disease-free during the time of follow-up. If
death occurs during the follow-up then individuals can be counted only if
they die from the cause under study, not if they die from another disease or
migrate away.
For example, a study follows 10 people from 1982 to 1992 and the
outcome of interest is death from a particular disease. There were
three deaths recorded. The total person-years are the combined time that
people were alive (disease-free) during the study. That is 8.5 + 10 + 5.5 + 10 +
... + 2.5 = 64.5 person-years.
The mortality rate for the study is (3/64.5) x 1000 = 46.5 per 1000 person-years.
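
The two denominators can be handled by the same small calculation. The following is a minimal sketch in Python, not part of the original notes; the function name rate_per_k is an assumption chosen for illustration, and the figures are the ones quoted in the examples above.

```python
def rate_per_k(events: float, denominator: float, k: int = 1000) -> float:
    """Return the number of events per k units of the denominator.

    The denominator may be a mid-year population or a total of person-years.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return events / denominator * k


# Crude rate with a mid-year population denominator (Australia, 1991).
crude_rate_1991 = rate_per_k(119_146, 17_292_000)

# Rate with a person-years denominator (the 10-person follow-up study).
study_rate = rate_per_k(3, 64.5)

print(round(crude_rate_1991, 1))  # 6.9 per 1000 per year
print(round(study_rate, 1))       # 46.5 per 1000 person-years
```

The only difference between the two calls is what the denominator counts: people at mid-year in the first case, accumulated observation time in the second.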

Crude and Specific rates


Crude rates are rates in which the result for a study population is taken as a
whole, without any subdivision or refinement. The total mortality rate for a
population is an example of a crude rate. On the other hand, specific rates
are rates in which some subdivision of the data has occurred. Sex, age and
race specific rates are frequently reported.
Crude mortality rates (CMR) or Crude death rates (CDR) apply to an
entire population without reference to any characteristics of the individuals
in it. The total mortality rate in Victoria for 2003 is an example of a
crude rate.
\[ \text{CMR} = \frac{\text{all deaths during a calendar year}}{\text{population at mid-year}} \times 1000 \]

CMRs produce an overall summary measure of what is happening in the
total population. While CMRs are valid rates, they are often misleading
since they mask important differences which may be occurring among
certain subgroups.
These rates reflect the number of deaths in a defined population during a
specified period (usually a year) divided by the midyear (July 1)
population. Because of variations in age composition and other factors,
crude rates are seldom useful for comparisons.

For example: A city contains 100,000 people (45,000 males and 55,000
females), and 1,000 people die per year (600 males and 400 females).
Crude mortality rate for that city:
\[ \text{CMR} = \frac{1000}{100000} \times 1000 = 10 \]
deaths per 1000 population per year.
Merits of CMR:
(i) It is simple to understand and calculate.
(ii) It is one of the most widely used of any vital statistics rates. As an
index of mortality, it is used in numerous demographic and public
health problems.
(iii) Since the entire population of the region is exposed to the risk of
mortality, CMR is a probability rate indicating the probability that a
person belonging to the given population will die in the given
period.
Demerits of CMR: The most serious drawback of CMR is that it
completely ignores the age and sex distribution of the population.
Experience shows that mortality is different in different segments of the
population. Children, in the early ages of life, and the older generation are
exposed to higher risk of mortality as compared to younger people.
Moreover, the mortality rate for females differs from that of their male
counterparts, irrespective of age group.

Specific mortality rates (SMR) or specific death rates (SDR) are rates in
which some subdivision of the data has occurred. Here, a population is
divided into more homogeneous subgroups based on particular
characteristics of interest (e.g. age, sex, disease, and race) and rates are
calculated within these groups (age-specific rates, sex-specific rates,
disease-specific rates, race-specific rates, respectively). In order to arrive at
a more useful figure than CMR, we must take into account the fact that the
mortality pattern is different in different segments of the population.
Various segments may be sex, age, occupation, social status, etc. For
example, the people engaged in infant or child welfare work would be
interested to know the mortality condition in the age groups below one
year, 1-4 years, 5-9 years, etc. Those engaged in maternal health programs
would like to know the number of deaths occurring among women in the
reproductive period (usually 15 to 49 years); insurance authorities would
be interested in the mortality pattern at different ages of the population.
Mortality rate computed for a particular specified section of the population
is termed as specific mortality rate (SMR) or specific death rate (SDR).

\[ \text{SMR or SDR} = \frac{\text{total number of deaths in the specified section of the population in the given period}}{\text{total population of the specified section in the same period}} \times k \]

where k = 1000 usually.
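
As an illustration of the formula, the sketch below computes sex-specific rates for the city used in the CMR example (45,000 males with 600 deaths, 55,000 females with 400 deaths) and compares them with the crude rate. It is a minimal Python sketch; the function name specific_rate and the dictionary layout are assumptions made for this example.

```python
def specific_rate(deaths: float, population: float, k: int = 1000) -> float:
    """Deaths in one section of the population per k people in that section."""
    return deaths / population * k


# (deaths, population) for each section of the example city.
city = {"male": (600, 45_000), "female": (400, 55_000)}

total_deaths = sum(deaths for deaths, _ in city.values())
total_population = sum(pop for _, pop in city.values())

crude = specific_rate(total_deaths, total_population)
by_sex = {sex: specific_rate(d, p) for sex, (d, p) in city.items()}

print(round(crude, 1))             # 10.0 per 1000 per year
print(round(by_sex["male"], 1))    # 13.3 per 1000 males per year
print(round(by_sex["female"], 1))  # 7.3 per 1000 females per year
```

The crude rate of 10 per 1000 hides the fact that the male rate is almost twice the female rate, which is the drawback of CMR described above.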

Merits of SMR:
(i) The death rates specific to age and sex overcome the drawback of
CMR, since they are computed by taking into consideration the age
and sex composition of the population. By eliminating the variation
in the death rates due to the age-sex distribution of the population,
SMRs provide more appropriate measures of the relative mortality
situation in the regions.
(ii) For general analytical purposes, the death rate specific for age and
sex is one of the most important and widely applicable types of
death rates. It also supplies the essential components required for
the construction of a life table.

Demerits of SMR:
(i) SMRs are not of much utility for overall comparison of mortality
conditions in two different regions.
(ii) Moreover, in addition to the age and sex distribution of the population,
social, occupational and topographical factors come into operation,
causing what is called differential mortality. SMRs completely
ignore these factors. In order to eliminate such spurious effects,
standardized death rates are computed.

The age-specific mortality rate (ASMR) or age-specific death rate
(ASDR) is defined as the number of deaths in a specific age group in a
calendar year, divided by the population of the same age group on July 1 of
that year, multiplied by 1000.
For example: in the US in 1987, for the age group 25-34 years: population
43,513,000; deaths 57,701.

\[ \text{ASMR} = \frac{57701 \times 1000}{43513000} = 1.3 \]
deaths per 1000 population per year for the age group 25-34.

Cause-Specific Mortality Rate (CSMR): The greatest value of
mortality rates for studies of common conditions like cancer and coronary
heart disease is in comparisons of cause-specific mortality rates in different
populations (different regions, occupations, time periods). Such
comparisons have illuminated understanding of many causal and
associated factors and have prompted much detailed study of these
diseases.
The cause-specific mortality rate is defined as the number of deaths
assigned to a specific cause in a calendar year, divided by the population
on July 1 of that year multiplied by 100,000.

For example: in the US in 1987, for the cause accidents: population
243,827,000; deaths 94,840.

\[ \text{CSMR} = \frac{94840 \times 100000}{243827000} = 39 \]
accidental deaths per 100,000 population per year.

Maternal Mortality Ratio (MMR): This is a measure of the risk of
dying from puerperal causes, that is, causes associated with pregnancy, childbirth
and the postpartum period. The World Health Organization defines this period as
extending up to forty-two days after termination of pregnancy, irrespective
of the duration of pregnancy or its outcome (live birth, stillbirth,
miscarriage, or abortion). Maternal mortality ratios are very low in the
industrial nations, reflecting high standards of care during pregnancy and
childbirth. In countries where women have no other recourse than induced
abortion to control their fertility, almost a million women die annually of
puerperal causes, a terrible loss of life that could be prevented by easier
access to family planning. Cultural, political, and religious opposition
inhibits these societies from addressing this appalling problem.
The maternal mortality ratio is defined as the number of deaths assigned to
puerperal causes (i.e., those related to childbearing) in a calendar year,
divided by the number of live births in that year, the quotient multiplied by
100,000.
For example: in the US in 1987, deaths assigned to puerperal causes: 253; live
births: 3,829,000.

\[ \text{MMR} = \frac{253 \times 100000}{3829000} = 6.6 \]
maternal deaths per 100,000 live births per year.

Infant Mortality Rate (IMR): The indicator of health most
commonly used for comparing nations and trends over time is the infant
mortality rate. This is the number of deaths in a calendar year of infants
under one year of age, divided by the number of live born infants, the
quotient being multiplied by 1000. It is strongly correlated with social and
economic conditions.

For example: in California, US, in 1987: live births 494,053; infant deaths
4,546.

\[ \text{IMR} = \frac{4546 \times 1000}{494053} = 9.2 \]
infant deaths per 1000 live births per year.
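
The ASMR, CSMR, MMR and IMR examples all share the same form: a count of events divided by a denominator and multiplied by a constant. The Python sketch below recomputes the four 1987 examples from the figures quoted above; the dictionary layout and variable names are assumptions made for illustration.

```python
# name: (events, denominator, multiplier k), using the 1987 figures quoted above.
examples = {
    "ASMR, ages 25-34": (57_701, 43_513_000, 1_000),
    "CSMR, accidents": (94_840, 243_827_000, 100_000),
    "MMR": (253, 3_829_000, 100_000),
    "IMR, California": (4_546, 494_053, 1_000),
}

rates = {
    name: events / denominator * k
    for name, (events, denominator, k) in examples.items()
}

for name, value in rates.items():
    k = examples[name][2]
    print(f"{name}: {value:.1f} per {k:,}")
```

Note that the denominators differ in kind: mid-year population for the ASMR and CSMR, live births for the MMR and IMR. The CSMR evaluates to about 38.9, which the text rounds to 39.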

Case Fatality Ratio (CFR): The case fatality ratio for a particular
condition is the number of deaths caused by the condition, divided by the
total number of identified cases of the condition in a specified population.
Although the case fatality ratio is commonly used to describe the deaths
attributable to specific infectious diseases, it can also be used to describe
the deaths attributable to any condition (see Epidemiology,
Biostatistics, and Preventive Medicine by Jekel et al.).

4.10 Age affects the rate


Most diseases (and death!) are strongly related to age so when comparing
rates between two populations, differences can arise from either different
levels of disease or different age structures in the populations. Before
exploring the mortality in two populations, account must be taken of
differences in age composition.
Rates that do not take account of age differences are called crude rates. In
contrast, rates can be calculated to remove any effect of age and these rates
are known as adjusted/specific rates.
This is illustrated in the following case study that was undertaken to
investigate mortality rates from heart disease of officers and other ranked
soldiers.

Case Study: Comparison of mortality rates between
officers and other ranked soldiers.

             Officers                       Non-officers
Age (yrs)    Deaths  Person-yrs  Rate       Deaths  Person-yrs  Rate
15-19             0         500    0.0           0      126500    0.0
20-24             0        9000    0.0           4      226000    1.8
25-29             0       13000    0.0          10      164500    6.07
30-34             0       11500    0.0          26       88000   29.5
35-39             2       13500   14.8          36       60500   59.5
40-44             6       15000   40.0          19       21500   88.4
45-49            14       13500  103.7          11        5500  200.0
50-54            19        7500  253.3           2        1500  133.3
All ages         41       83500   49.1         108      694000   15.6

Notes: Rate per 100,000 person-years

In the case study above, the crude mortality rate for officers (49.1 per
100,000 per year) is much higher than the crude rate for non-officers (15.6
per 100,000 per year). On the surface, it appears that the officers have a
greater risk of dying from heart disease than the non-officers.
However, when age-specific rates are compared, the rates are higher for non-officers than for officers at almost every age. Remember the crude rates
ignore the age structure of the two populations. Officers are, on average,
much older than the non-officers and therefore their crude rate is greater.
But when like with like is compared, the officers are generally at lower risk
than the non-officers.
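
The reversal between crude and age-specific comparisons can be checked directly from the case-study figures. The following Python sketch is illustrative only. It assumes that death counts not legible in the table can be inferred from the printed rates and the "All ages" totals (41 and 108 deaths), and that the officers' person-years for ages 45-49 are 13,500, the value consistent with the printed rate and total.

```python
# (deaths, person-years) by age group, from the case-study table.
# Assumption: counts missing from the printed table are inferred from the
# printed rates and the "All ages" totals.
AGE_GROUPS = ["15-19", "20-24", "25-29", "30-34",
              "35-39", "40-44", "45-49", "50-54"]
OFFICERS = [(0, 500), (0, 9000), (0, 13000), (0, 11500),
            (2, 13500), (6, 15000), (14, 13500), (19, 7500)]
NON_OFFICERS = [(0, 126500), (4, 226000), (10, 164500), (26, 88000),
                (36, 60500), (19, 21500), (11, 5500), (2, 1500)]


def rate_per_100k(deaths, person_years):
    """Deaths per 100,000 person-years."""
    return deaths / person_years * 100_000


def crude_rate(groups):
    """Crude rate: total deaths over total person-years, ignoring age."""
    total_deaths = sum(d for d, _ in groups)
    total_person_years = sum(py for _, py in groups)
    return rate_per_100k(total_deaths, total_person_years)


crude_officers = crude_rate(OFFICERS)
crude_non_officers = crude_rate(NON_OFFICERS)

# Age groups in which the non-officer rate exceeds the officer rate.
higher_for_non_officers = [
    age
    for age, (d_o, py_o), (d_n, py_n) in zip(AGE_GROUPS, OFFICERS, NON_OFFICERS)
    if rate_per_100k(d_n, py_n) > rate_per_100k(d_o, py_o)
]

print(round(crude_officers, 1), round(crude_non_officers, 1))  # 49.1 15.6
print(higher_for_non_officers)
```

Under these assumptions the crude rate is higher for officers, yet the age-specific rate is higher for non-officers in six of the eight age groups (equal at 15-19, lower only at 50-54).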

4.11 Standardized Rates or Adjustment of Rates


Crude rates can be used to make approximate comparisons between
different populations. But the comparisons are invalid if the populations
are dissimilar with respect to an important characteristic such as age, sex,
or race.
When comparing populations with different age structures one can do three
things:
a. Ignore the age differences and use the CMR (will draw
inappropriate conclusions);
b. Calculate and compare the age-specific mortality rates within each
age group (useful but may be clumsy with a mass of figures); or
c. Calculate a summary statistic, called a standardized rate, that takes
into account the differences in the age structures of the two or more
populations being compared.
The standardized rates, also commonly referred to as adjusted rates, are
CMRs that have been adjusted to control for the effects of age (or other
characteristics) and thus provide valid comparisons of rates among
different populations. There are two main methods of calculating
age-standardized rates, direct and indirect, and both methods ensure that any
differences in rates observed are not due to the population age structure.

Direct: study rates x standard population

Indirect: standard population rates x study population

4.11.1 Direct Standardization


The direct method of adjustment applies a standard population to the death
rates of two comparison groups. For each group, the sum of the expected
deaths is then used to compute the adjusted death rate (dividing the
expected deaths by the total of the standard population). This method can
be used when the age-specific mortality rates for the population(s) you
are interested in are known. There are essentially five steps involved in the
calculation of direct standardization. The following example is adapted
from Jekel, Katz and Elmore, Epidemiology, Biostatistics, and Preventive
Medicine (2001), 2nd Ed. to illustrate the steps involved in the calculation
of direct standardized mortality rates.

1. Calculate the age-specific mortality rates for each age group in
the population(s) you wish to compare.

              Population A                          Population B
Age group     Population  No. of  Age-specific      Population  No. of  Age-specific
              size        deaths  death rate        size        deaths  death rate
Young          1,000         1    0.001              4,000         8    0.002
Middle-aged    5,000        50    0.010              5,000       100    0.020
Older          4,000       400    0.100              1,000       200    0.200
Total         10,000       451                      10,000       308

For example, 50 people died out of the total 5,000 people in the middle-aged
group in Population A, giving an age-specific death rate of
50/5000 = 0.01.
The CMR for Population A = 451/10,000 = 4.51%
The CMR for Population B = 308/10,000 = 3.08%
2. Select a population whose age-distribution is well defined to serve
as your standard or reference population. The age
distribution of a hypothetical reference population is shown
below.

Reference population
Age group      Population size
Young           5,000
Middle-aged    10,000
Older           5,000
Total          20,000

3. Multiply the number of people in each age group of the reference
population by the age-specific mortality rate in the comparable
age group of the population(s) of interest, in this case Population
A and B. This gives you the hypothetical number of people who
would be expected to have died in the reference population if that
population had the same mortality rates as Population A and B.

              Population A                            Population B
Age group     Population  Age-specific  Expected      Population  Age-specific  Expected
              size (ref)  death rate    no. of deaths size (ref)  death rate    no. of deaths
Young          5,000      0.001             5          5,000      0.002            10
Middle-aged   10,000      0.010           100         10,000      0.020           200
Older          5,000      0.100           500          5,000      0.200         1,000
Total         20,000                      605         20,000                    1,210

4. Sum the total number of expected deaths across all age
groups of the reference population. For Population A this comes
to 605, while for Population B this comes to 1,210.

5. Divide the total number of expected deaths by the total number
of people in the reference population to yield the summary
age-adjusted mortality rate.
The age-adjusted mortality rate for Population A is:
\[ \text{Population A} = \frac{605}{20{,}000} = 3.03\% \]
The age-adjusted mortality rate for Population B is:
\[ \text{Population B} = \frac{1{,}210}{20{,}000} = 6.05\% \]

Thus although the CMR for Population A was higher (4.51%) than
that of Population B (3.08%), the age-adjusted rates indicate that
the risk of death is actually twice as high in Population B as in
Population A.
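
The five steps can be condensed into a few lines of code. The following is a minimal Python sketch of direct standardization using the age-specific rates and reference population from the example above; the function name direct_adjusted_rate and the dictionary keys are assumptions made for illustration.

```python
# Age-specific death rates from step 1 and the reference population from step 2.
RATES_A = {"young": 0.001, "middle-aged": 0.010, "older": 0.100}
RATES_B = {"young": 0.002, "middle-aged": 0.020, "older": 0.200}
REFERENCE = {"young": 5_000, "middle-aged": 10_000, "older": 5_000}


def direct_adjusted_rate(study_rates, reference_population):
    """Apply the study population's age-specific rates to the reference
    population (step 3), sum the expected deaths (step 4) and divide by
    the reference total (step 5)."""
    expected_deaths = sum(
        study_rates[group] * size for group, size in reference_population.items()
    )
    return expected_deaths / sum(reference_population.values())


adjusted_a = direct_adjusted_rate(RATES_A, REFERENCE)
adjusted_b = direct_adjusted_rate(RATES_B, REFERENCE)

print(f"Population A: {adjusted_a:.3%}")  # 3.025%
print(f"Population B: {adjusted_b:.3%}")  # 6.050%
```

Because both populations are weighted by the same reference age structure, the ratio of the two adjusted rates (2.0) reflects only the difference in age-specific rates.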

4.11.2 Indirect Standardization


The indirect method of adjustment is somewhat different from the direct
method. The indirect standardization method is used either when the
age-specific mortality rates are not available in the region or when they are
statistically unstable (this happens when the population to be standardized
is small).
In the indirect method, we calculate the Standardized Mortality Ratio
(SMR), which is defined as:
\[ \text{SMR} = \frac{\text{Observed number of deaths}}{\text{Expected number of deaths}} \]

There are essentially five steps involved in calculating the standardized
mortality rates by the indirect method. The following example is adapted
from Jekel, Katz and Elmore, Epidemiology, Biostatistics, and Preventive
Medicine (2001), 2nd Ed. to illustrate the steps involved in the calculation
of the indirect standardized mortality rates.
Example: Suppose that an investigator wanted to see if the death rates for
male workers in a certain company were similar to or greater than the death
rates for males in the US population.

1. Choose a reference or standard population, as in the direct
method. Make sure that in your reference population, the
age-specific mortality rates are known.

Age group      Proportion of people    Age-specific
               in ref. population      mortality rate
Young          0.40                    0.0001
Middle-aged    0.30                    0.0010
Older          0.30                    0.0100

2. Calculate the observed number of deaths in the population(s) of
interest.

Males in reference population
Age group      Proportion of     Age-specific     Observed
               ref. size         death rate       death rate
Young          0.40          x   0.0001       =   0.00004
Middle-aged    0.30          x   0.0010       =   0.00030
Older          0.30          x   0.0100       =   0.00300
Total          1.00                               0.00334

Males in the company
Age group      Number of         Age-specific     Observed
               workers           death rate       deaths
Young           2,000        x   ?            =   ?
Middle-aged     3,000        x   ?            =   ?
Older           5,000        x   ?            =   ?
Total          10,000                             48

Observed death rate for males in the reference population
= 0.00334, or 334 per 100,000
Observed rate for males in the company
= 48 per 10,000 or 480 per 100,000

3. Multiply the age-specific death rates from each age group in the
reference population by the number of workers in the
corresponding age groups in the company. This gives the number
of deaths that would be expected in each age group of workers if
they had the death rates of the reference population.

Males in reference population
Age group      Proportion of     Age-specific     Observed
               ref. size         death rate       death rate
Young          0.40          x   0.0001       =   0.00004
Middle-aged    0.30          x   0.0010       =   0.00030
Older          0.30          x   0.0100       =   0.00300
Total          1.00                               0.00334

Males in the company
Age group      Number of         Ref. age-specific    Expected
               workers           death rate           deaths
Young           2,000        x   0.0001           =    0.2
Middle-aged     3,000        x   0.0010           =    3.0
Older           5,000        x   0.0100           =   50.0
Total          10,000                                 53.2

4. Sum the total number of expected deaths for each population of
interest. For males in the company this sums to 53.2 deaths per
10,000.
5. Divide the total number of observed deaths in the population(s) of
interest by the total number of expected deaths to obtain the
SMR.
\[ \text{SMR} = \frac{\text{Observed number of deaths}}{\text{Expected number of deaths}} \times 100 = \frac{0.0048}{0.00532} \times 100 = 90\% \]

Thus, males in the company actually had a death rate that was only 90% of
the value that would be expected, based on the death rates in the standard
population.
Note that if the employees in this example had an SMR of 140, it means
that their mortality was 40% greater than was expected on the basis of the
age-specific death rates of the reference population.
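
The indirect method can be sketched in the same way. The Python example below recomputes the expected deaths and the SMR for the company example, using only the figures given above (reference age-specific rates, number of workers per age group and 48 observed deaths); the variable names are assumptions made for illustration.

```python
# Age-specific death rates of the reference population (step 1).
REFERENCE_RATES = {"young": 0.0001, "middle-aged": 0.0010, "older": 0.0100}

# Number of male workers in each age group of the company.
WORKERS = {"young": 2_000, "middle-aged": 3_000, "older": 5_000}

OBSERVED_DEATHS = 48  # total deaths observed among the 10,000 workers

# Step 3: expected deaths per age group if workers had the reference rates.
expected_by_group = {
    group: REFERENCE_RATES[group] * count for group, count in WORKERS.items()
}

# Step 4: total expected deaths.
expected_deaths = sum(expected_by_group.values())

# Step 5: standardized mortality ratio, expressed as a percentage.
smr = OBSERVED_DEATHS / expected_deaths * 100

print(round(expected_deaths, 1))  # 53.2
print(round(smr, 1))              # 90.2
```

Only the total number of observed deaths in the company is needed, which is why the indirect method still works when the study population's own age-specific rates are unknown or unstable.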

Revision Exercises

1. Taken from Lilienfeld & Stolley (1994). Foundations of Epidemiology, 3rd
edn.
Most western countries have shown sharp declines in infant mortality since
1900. In the United States, infant mortality has declined for both whites and
African Americans, but differences between the races have persisted over
time. Variations over time and between racial groups can give information
about several causes of infant mortality.
Infant mortality rates in the United States (deaths under 1 year of age per 1,000
live births) are listed below by year and race.
Infant mortality (deaths under 1 year per 1,000 live births)

Year          White    African American
1915-1919     92.8     150.4
1920-1924     73.3     117.3
1930-1934     55.2      90.5
1940          43.2      72.9
1950          26.8      43.9
1960          22.9      44.3
1970          17.8      32.6
1980          11.0      21.4
1987           8.6      17.9

a) What are some arguments that the time trends are real, not artifactual?
b) What effect would improved birth registration have on these data?
c) What effect would improved death registration have in these data?
d) What might explain the decline in infant mortality between 1915 and
1960?
e) What might explain the decline in infant mortality between 1960 and
1987?
f) What are three hypotheses that might explain the differences in infant
mortality between whites and blacks?

2. Calculate the 1986 age-sex specific mortality rate for Australian
men aged 65-74 years if 17,032 deaths were recorded and the mid-year
population for this group was 463,802.

3. Below is a common way of depicting graphically the disease
patterns in a small study (9 people).

[Figure: follow-up timelines for persons 1 to 9 on a time axis from 1980 to 1990. D = death, L = lost to follow up]

a. What is the number of deaths in this study?
b. What is the total person-years of observation for this study?
c. Calculate the mortality rate for this study population.

4. There are 10,000 individuals in City A, which is located in NSW. Eight young
individuals and 420 old individuals develop the flu over the course of a year.

(a) Use these data to calculate the crude influenza rate per 1,000
individuals in City A.

(b) Calculate the crude rate of influenza in City A. The needed
information is given in the table below.

             % of population in age group        Influenza rate per 1,000 person-years
Age group    City A   City B   City C   NSW      City A   City B   City C
Young        40%      50%      80%      60%        2        10       30
Old          60%      50%      20%      40%       70       110        5

(c) What is the crude rate of influenza in City B?

(d) What is the crude rate of influenza in City C?

(e) Calculate an age-adjusted influenza rate for each of the cities.
Use the age distribution for NSW as the standard or reference.

5. The following table describes hypothetical age-specific rates of heart disease
in the United Kingdom and the United States in 2000. Also included are hypothetical
age distributions for the two countries and for the entire world population.

             % of population in age group     Heart disease rate per 100,000 person-years
Age group    UK      USA     World            UK      USA
<30          60%     30%     50%               50      75
30-55        30%     40%     30%               80     150
>55          10%     30%     20%              120     400

(a) Calculate the crude rate of heart disease for each of the two
countries.

(b) Suppose that you want to compare the rate of heart disease in the
UK to that in the USA. Should you use the two crude rates to
compare the two countries? Why or why not?

(c) Calculate an age-adjusted rate for heart disease in each country.
Use the age distribution of the entire world as your standard.

(d) Based on these answers, would you say that the age differences
between the UK and the USA account for the entire difference in crude
heart disease rates between the two countries?

6. Taken from Morton RF, Hebel JR & McCarter RJ. (2001). A Study Guide to
Epidemiology and Biostatistics, 5th ed.
A city contains 100,000 people (45,000 males and 55,000 females), and
1,000 people die per year (600 males and 400 females). There were 50
cases (40 males and 10 females) of lung cancer per year, of which 45 died
(36 males and 9 females). Compute:

a. Crude mortality rate
b. Sex-specific mortality rate
c. Cause-specific mortality rate for lung cancer
d. Case fatality ratio for lung cancer
e. Proportionate mortality rate (PMR) from lung cancer.

7. Match the term with its definition. Each term may be selected once, more than
once, or not at all.
Age-specific death rate
case fatality ratio
cause-specific death rate
crude birth rate
direct standardization
incidence rate
indirect standardization
infant mortality rate
prevalence rate
standardized mortality ratio
standardized rate
(a) This is calculated after the two populations to be compared are
given the same age distribution, which is then applied to the
observed age-specific death rates of each population.

(b) This is the number of new cases over a defined study period,
divided by the mid-point population at risk.

(c) This is used if age-specific death rates are not available in the
population whose crude death rate is to be adjusted.

(d) This is not a true rate; it is actually a proportion.

(e) This is the observed total deaths in a population, divided by the
expected deaths in that population, multiplied by 100.

(f) This is useful for studying trends in the causes of death over time.

(g) This is the proportion of individuals with a given condition who die
of the condition.

(h) This is the number of live births, divided by the mid-period
population.

(i) This provides the death rate within a defined age range.

Solution to the revision exercises


1(a). What are some arguments that the time trends are real, not
artifactual?
The trend is steadily downwards over 70 years. The decline in mortality is
dramatic; in 1987 it is less than one-tenth of what it was in 1915-1919 for
whites, and slightly above one-tenth for blacks. The decline occurred in
both whites and blacks.
(b) What effect would improved birth registration have on these data?
Improved birth registration would increase the denominator (live births)
and decrease the infant mortality rate.
(c) What effect would improved death registration have on these data?
Improved death registration would increase the numerator (deaths for
infants under 1 year of age) and increase the infant mortality rate.
(d) What might explain the decline in infant mortality between 1915 and
1960?
The elimination of many infectious diseases through improved sanitation,
sterile techniques, vaccinations, and antibiotics could explain the decline in
infant mortality between 1915 and 1960.
(e) What might explain the decline in infant mortality between 1960 and
1987?
Between 1960 and 1987 hospital-based technology reduced the mortality
for many high-risk infants, including those born prematurely, those born at
low or very low birth weight and those with congenital malformations.

(f) What are three hypotheses that might explain the differences in infant
mortality between whites and blacks?
Some hypotheses are:
o Black infants have genetically determined lower survival rates
o Black mothers are less likely to have access to medical care during
pregnancy
o Black mothers are more likely to be of lower socioeconomic class
than white mothers
o Deaths of white infants are less likely to be recorded than deaths of
black infants
o Births of black infants are less likely to be recorded than are births
of white infants.
2. Calculate the 1986 age-sex specific mortality rate for Australian men
aged 65-74 years if 17,032 deaths were recorded and the mid-year
population for this group was 463,802.
Age-specific mortality rate:
ASMR = (17,032/463,802) x 1000
= 36.72 per 1000 population per year for the age group 65-74.

3. (a) Number of deaths = D = 3

(b) Total person-years of observation:
7 + 11 + 11 + 6 + 11 + 5 + 4 + 5 + 5 = 65

(c) Mortality rate:
3/65 = 0.046 per person-year, or 4.6 per 100 person-years.

4. Crude influenza rate
(a) Since a total of 428 people develop the flu and the population at risk is
10,000, the crude flu rate is simply 428/10,000 or 42.8 per 1,000 per year.

(b) To calculate the crude rate in City A you need to multiply the
proportion of the population in each age category by the flu rate (per 1,000)
for that age category, and then sum over all age categories:
0.40(2) + 0.60(70) = 42.8 per 1,000 population per year.

(c) City B = 0.50(10) + 0.50(110) = 60 per 1,000 population per year.

(d) City C = 0.80(30) + 0.20(5) = 25 per 1,000 population per year.

(e) City A: 0.60(2) + 0.40(70) = 29.2 per 1,000 population per year.
City B: 0.60(10) + 0.40(110) = 50 per 1,000 population per year.
City C: 0.60(30) + 0.40(5) = 20 per 1,000 population per year.

5. (a) Crude rate of heart disease for UK:
UK = [0.60(50) + 0.30(80) + 0.10(120)] / 100,000 = 66 per 100,000 population in 2003.
Crude rate of heart disease for USA:
USA = [0.30(75) + 0.40(150) + 0.30(400)] / 100,000 = 202.5 per 100,000 population in 2003.

(b) No. We should not compare the crude rates because the age
distributions of the two countries are very different and age is a
confounder for heart disease.


(c) Age-adjusted heart disease rate for UK:
UK = [0.50(50) + 0.30(80) + 0.20(120)] / 100,000 = 73 per 100,000 population in 2003.
Age-adjusted heart disease rate for USA:
USA = [0.50(75) + 0.30(150) + 0.20(400)] / 100,000 = 162.5 per 100,000 population in 2003.

(d) Age does not account for all of the difference between the USA and
the UK. The age-adjusted rates are quite different; in fact, the USA
adjusted rate is more than twice the adjusted rate for the UK.
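
The crude and age-adjusted rates in solution 5 are both weighted sums of the same age-specific rates; only the weights differ. The sketch below reproduces the figures, assuming the three age-specific rates and age distributions used in the solution. The function name `weighted_rate` and the variable names are illustrative only.

```python
def weighted_rate(weights, rates):
    """Sum of stratum-specific rates, weighted by population proportions."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("population proportions must sum to 1")
    return sum(w * r for w, r in zip(weights, rates))


# Age-specific heart disease rates per 100,000 (three age groups).
uk_rates = [50, 80, 120]
usa_rates = [75, 150, 400]

# Each country's own age distribution gives the crude rate.
uk_age_dist = [0.60, 0.30, 0.10]
usa_age_dist = [0.30, 0.40, 0.30]

# A single standard age distribution gives the directly standardised rates.
standard_dist = [0.50, 0.30, 0.20]

crude_uk = weighted_rate(uk_age_dist, uk_rates)
crude_usa = weighted_rate(usa_age_dist, usa_rates)
adjusted_uk = weighted_rate(standard_dist, uk_rates)
adjusted_usa = weighted_rate(standard_dist, usa_rates)

print(f"Crude:    UK {crude_uk:.1f}, USA {crude_usa:.1f} per 100,000")
print(f"Adjusted: UK {adjusted_uk:.1f}, USA {adjusted_usa:.1f} per 100,000")
print(f"Adjusted rate ratio USA/UK: {adjusted_usa / adjusted_uk:.2f}")
```

Applying the same standard weights to both countries removes the effect of their different age structures, which is why the adjusted rates can be compared while the crude rates cannot.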

6. (a) Crude mortality rate = (1,000/100,000) x 1000 = 10 per 1,000 population per year.

(b) Sex-specific mortality rate = (600/45,000) x 1000 = 13.3 per 1,000 per year for males; and
(400/55,000) x 1000 = 7.3 per 1,000 per year for females.

(c) Cause-specific mortality rate for lung cancer = (45/100,000) x 1000 = 0.45 per 1,000 persons per year.

(d) Case fatality ratio for lung cancer = (45/50) x 100 = 90%.

(e) Proportionate mortality rate (PMR) for lung cancer = (45/1000) x 100 = 4.5%.
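
The five measures in solution 6 differ only in which numerator is divided by which denominator. A minimal sketch, using the counts that appear in the solution (the variable and function names are illustrative):

```python
def per_1000(events, population):
    """Rate expressed per 1,000 population."""
    return events / population * 1000


def percent(part, whole):
    """Proportion expressed as a percentage."""
    return part / whole * 100


# Counts used in solution 6.
population = 100_000
deaths_all_causes = 1_000
males, females = 45_000, 55_000
deaths_males, deaths_females = 600, 400
lung_cancer_cases, lung_cancer_deaths = 50, 45

crude_rate = per_1000(deaths_all_causes, population)           # all deaths / whole population
male_rate = per_1000(deaths_males, males)                      # male deaths / male population
female_rate = per_1000(deaths_females, females)                # female deaths / female population
cause_specific = per_1000(lung_cancer_deaths, population)      # lung cancer deaths / whole population
case_fatality = percent(lung_cancer_deaths, lung_cancer_cases) # lung cancer deaths / lung cancer cases
proportionate = percent(lung_cancer_deaths, deaths_all_causes) # lung cancer deaths / all deaths

print(f"Crude {crude_rate:.1f}, male {male_rate:.1f}, female {female_rate:.1f} per 1,000")
print(f"Cause-specific {cause_specific:.2f} per 1,000")
print(f"Case fatality {case_fatality:.0f}%, proportionate mortality {proportionate:.1f}%")
```

Note that the case fatality ratio and the proportionate mortality use cases and deaths as denominators, not the population, so neither is a rate in the strict sense.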

7. (a) Direct standardization
(b) Incidence rate
(c) Indirect standardization
(d) Prevalence rate
(e) Standardized mortality ratio
(f) Cause-specific death rate
(g) Case fatality ratio
(h) Crude birth rate
(i) Age-specific death rate

Revision questions/Activity
Activity 1:
The population of Australia in 2005 was 20,395,000 with 10,128,000 males and
10,267,000 females. At the end of 2004, 654,977 living persons had been
diagnosed with cancer at some time in the last 23 years. During 2005, 22,017
males and 17,080 females died from cancer, 1119 males and 1062 females were
diagnosed with pancreatic cancer and 964 males and 1062 females died from
pancreatic cancer.

Questions of the activity:


Q1. What was the prevalence of cancer in Australia on 1 January 2005?
Q2. What was the cancer-specific mortality rate in 2005?
Q3. What were the sex-specific incidence rates for pancreatic cancer in 2005?
Q4. Estimate the case-fatality rate for pancreatic cancer.
Q5. What proportion of all cancer deaths are due to pancreatic cancer?
Activity 2:
In a cohort study, men were classified as having high or low cholesterol levels. Among
the group with low cholesterol levels there were 45 heart attacks in 6000 person-years of
follow-up and among the group with high cholesterol levels there were 100 heart attacks
in 4000 person-years of follow-up.

Questions of the activity:


Q1. What was the incidence of heart attack among (i) men with low cholesterol levels, (ii)
men with high cholesterol levels and (iii) all men?
Q2. How strong was the association between high cholesterol and heart attack?
Q3. What proportion of heart attacks in men could theoretically be prevented if nobody
had a high cholesterol level?

References

Clayton D, Hills M. (1993), Statistical Models in Epidemiology. Oxford: Oxford
University Press.

Dawson, B. and Trapp, R.G. (2001), Basic and Clinical Biostatistics, 3rd Edition
(international edition), Lange Medical Book/McGraw-Hill.

Gordis, L., (2013), Epidemiology, 5th Edition, Elsevier/Saunders (available at
Swinburne Bookshop and students may buy this book).

Gupta, S.C. & Kapoor V.K. (1984). Fundamentals of Applied Statistics. Sultan
Chand & Sons. New Delhi.

Jekel, F.J., Katz, D.L. and Elmore, J.G. (2001). Epidemiology, Biostatistics, and
Preventive Medicine, 2nd Ed., W.B. Saunders Company, New York.

Kahn H.A. (1989), Statistical Methods in Epidemiology. Oxford: Oxford
University Press.

Kuzma J.W. (1998), Basic Statistics for the Health Sciences, 3rd Edition, Mayfield
Publishing Company, Mountain View, California, USA.

Lilienfeld, D.E. and Stolley, P.D. (1994). Foundations of Epidemiology, 3rd ed.
New York: Oxford University Press.

Morton, R.F., Hebel, J.R. & McCarter, R.J. (2001). A Study Guide to
Epidemiology and Biostatistics, 5th ed.

Pagano M., and Gauvreau, K. (1993), Principles of Biostatistics, Duxbury Press,
USA.

Study types in epidemiology

Descriptive studies
     o Cross-sectional studies
     o Longitudinal studies

Analytical studies
     Observational studies
          o Cross-sectional studies
          o Cohort studies
          o Case-control studies
          o Ecological studies
     Experimental studies
          o Randomised controlled trials
          o Community or cluster trials

Objectives of different study types

o Descriptive studies
     o Used to describe a health-related issue
o Analytic studies
     o Analyse associations between factors
     o Including cause and effect

Descriptive studies
o Describe a health state or event in a defined population
o Provide measures of prevalence or incidence
     o e.g. How common is obesity among people who eat fast food?
     o High prevalence may suggest a causal link or can be due to genetic factors
     o But we also need to know the prevalence of obesity among people who don't eat fast food

Analytic studies
o Designed to answer questions:
     o Is there any (causal) association?
     o How strong is an association?
o Involve planned comparisons between groups
o Generate measures of association

What are we studying?

o Outcome factor
     o the outcome of interest in our study
     o often discussed in terms of disease
o Study factor
     o the factor (or factors) that we are interested in, that may be related to the outcome
     o often called exposure

Identifying study factors and outcome factors

Examples of study questions:

Does smoking cause lung cancer?
Outcome factor: lung cancer
Study factor: smoking

Does hormone replacement therapy cause ovarian cancer?
Outcome factor: ovarian cancer
Study factor: hormone replacement therapy

Example of reporting a study outcome (stroke) and study factor (smoking)

TIME AND TYPES OF STUDY

[Figure: a time axis (past, present, future) locating retrospective, cross-sectional and prospective studies.]

o Cross-sectional studies, cohort studies, case-control studies and RCTs are explained
in more detail in Weeks 10 and 11 (Topics 5 and 6 in Module 2).
o This document briefly explains the ecological study.

Ecological studies
o In most epidemiological studies individuals are counted and analysed in terms of
their exposure and outcome status
o Ecological studies analyse aggregates of individuals or other larger units
o Usually because aggregate data is all that is available

[Figure: ecological study example. Source: Beaglehole, Bonita, Kjellstrom. 1993. Epidemiology. Geneva, WHO.]

[Figures: ecological study examples. Source: Gerstman BB. 2003. Epidemiology kept simple. Hoboken, New Jersey, Wiley-Liss.]

Ecological studies
o Strengths for answering questions on associations, i.e. CAUSE and EFFECT questions
     o Can use existing data collected for different purposes
     o Relatively quick and easy to do
o Important type of study for phenomena that apply at group or area level, e.g.
     o Characteristics of physical or social environment

Research Design: Topic 11

Module 2: Topic 5

Topic 5: Randomised Trials and Cohort Studies

Course notes STA60004

Semester 2, 2014


Contents
5.1 Randomised clinical trials
5.1.1 Topic Background
5.1.2 Topic learning objectives
5.1.3 Suggested Reading
5.1.4 Design of a randomised trial
5.1.5 Major Concepts
5.1.5.1 Prospective study
5.1.5.2 Observational and experimental studies
5.1.5.3 Design and Conduct of the Randomised Clinical Trial (RCT)
5.1.5.4 Ethical Considerations
5.1.6 Discussion Point
5.1.7 Extra readings
5.2 Cohort Studies
5.2.1 Topic Background
5.2.2 Topic learning objectives
5.2.3 Suggested Reading
5.2.4 Design of a cohort study
5.2.5 Types of cohort study
5.2.6 Major Concepts
5.2.7 Advantages and disadvantages of cohort studies
5.2.8 Measure of effects for categorical data
5.2.8.1 Relative Risk (RR)
5.2.9 Comparing cohort studies with randomised trials
Revision exercises
Solution to the revision exercises
References

Note: Some of the materials are adapted from standard texts and guides (see references).


5.1 Randomised Clinical Trials


5.1.1 Background
In medicine, the evaluation of the effectiveness of medical treatment occurs in a
systematic fashion. Clinical trials compare new treatments with established treatments or
placebos under standardised conditions. Such treatments may be drugs or surgical
procedures. Ethical standards are maintained by ethics committees and they also ensure
that the clinical trial procedure is explained to patients. The system was established with
the development of new medicines in the 1940s and 1950s.
Clinical trials are a type of epidemiologic study and have been defined as any form of
planned experiment that involves patients and is designed to elucidate the most
appropriate treatment of future patients with a given medical condition (Pocock, 1983).
The epidemiologic study types have different advantages and disadvantages which must
be assessed when deciding on the best study type to answer a given research question.
Randomised clinical trials (RCTs) are often considered to produce the most accurate
results. This topic explores the many issues related to RCTs.
A randomised controlled trial (RCT) is a type of scientific experiment most commonly
used in testing the efficacy or effectiveness of healthcare services (such as medicine or
nursing) or health technologies (such as pharmaceuticals, medical devices or surgery).
RCTs are also employed in other research areas, such as judicial, educational, and social
research. As their name suggests, RCTs involve the random allocation of different
interventions (treatments or controls) to subjects. As long as numbers of subjects are
sufficient, this ensures that both known and unknown confounding factors are evenly
distributed between treatment groups.
Randomised controlled trials (RCTs) are considered the gold standard of study design.
They can provide evidence for causal relationships and support changes in clinical
practice. In an RCT, subjects are randomly assigned to receive the intervention or control
treatment, and outcomes are evaluated after the intervention period. The control group is
the group which receives the standard of care (or a placebo).
It is very important that the investigator and subjects are blinded to which treatment the
subjects receive. This is called a double-blind study. Conducting a double-blind study
helps to reduce any potential bias in the results of the study.

5.1.2 Learning Objectives


By the end of this topic you should:

o Identify the major epidemiologic study designs and which types are classified as
experimental or observational
o Understand the essential elements of a randomised clinical trial
o Understand some of the ethical considerations of research involving human subjects

5.1.3 Suggested Reading


Leon Gordis (2004). Epidemiology (3rd ed.). Chapters: 7-8.

5.1.4 Design of a Randomised Trial

[Figure: design of a randomised trial. The study population is randomised into a treatment group and a control group; participants in each group are then classified as improved or not improved.]

Reference: Leon Gordis (2004). Epidemiology (3rd edn.). Chapter 7.

5.1.5 Major Concepts


5.1.5.1 Prospective Study
A prospective study is a study in which we have two groups of patients. One group
consists of patients possessing the risk factor and the other group consists of patients who
do not possess the risk factor. The patients are followed into the future (that is, they are
followed prospectively) and a record is kept on the number of patients in each sample
who, at some point of time, are classifiable into each of the categories of the outcome
variable. The number of subjects in each sample is determined by the investigator
following some specific rules. Prospective cohort
studies and randomised clinical trials (RCT) are examples of prospective studies.
Prospective study design is the best design for establishing relationships between your
outcome of interest and exposure variables. The primary feature of prospective designs is
the outcome has not occurred at the time the study is initiated, and information is
collected over time to assess relationships with the outcome.

5.1.5.2 Observational and Experimental Studies


An observational study draws a conclusion by comparing subjects against a control group,
in cases where the researcher has no control over the experiment. In order to study the
relationships among variables, observational studies are performed. Unlike controlled
experimental designs where only certain variables are allowed to vary, in observational
studies the variables are observed and recorded. Often some of the variables are
controlled as much as possible. Consider a long term study on a drug involving humans
where a variable that needs to be controlled is diet. The diet guidelines are set but these
will probably be broken from time to time (or maybe often) by some of the human
subjects. Contrast this with a lab setting, where the diet of animals can be controlled.
A research study comparing the risk of developing lung cancer, between smokers and
non-smokers, would be a good example of an observational study.
Observational studies consist of four main types: case-series, case-control, cross-sectional (including surveys), and cohort studies.
A statistical investigation in which the researcher can influence events in the study is
known as an experimental study. Experimental studies are generally easier to identify
than observational studies in medical research. Experimental studies in medicine that
involve humans are called clinical trials because their purpose is to draw conclusions
about a particular procedure or treatment. Controlled trials (randomised, not randomised,
self-controlled, crossover) are experimental studies.

5.1.5.3 Design and Conduct of the Randomised Clinical Trial (RCT)


Randomisation
The essential element of RCTs is the R - randomisation. Theoretically, randomisation
eliminates the potential bias of observers or participants who expect a certain outcome due
to a given treatment, provided they are masked to the random treatment allocation.
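
As an illustration of random allocation, the sketch below shuffles a list of participants and assigns them alternately to two arms, which gives equal group sizes. This is only one simple scheme, chosen here for illustration; the function name `randomise`, the participant identifiers and the seed are hypothetical, and real trials typically use a concealed allocation schedule prepared independently of the investigators.

```python
import random


def randomise(participants, arms=("treatment", "control"), seed=None):
    """Randomly allocate participants to arms with (near) equal group sizes."""
    rng = random.Random(seed)      # a seed makes the schedule reproducible for auditing
    shuffled = list(participants)
    rng.shuffle(shuffled)          # random order removes any systematic pattern
    # Deal the shuffled participants to the arms in turn.
    return {pid: arms[i % len(arms)] for i, pid in enumerate(shuffled)}


# Twenty hypothetical participant identifiers.
ids = [f"P{i:02d}" for i in range(1, 21)]
allocation = randomise(ids, seed=2014)

for arm in ("treatment", "control"):
    members = sorted(pid for pid, a in allocation.items() if a == arm)
    print(arm, len(members), members)
```

Because allocation depends only on the random shuffle, neither the investigator nor the participant can influence who receives which treatment.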
Types of Randomised Clinical Trials
Clinical trials can be categorised according to their objectives and also the way they are
organised. For example, they might be prevention trials, treatment trials or diagnostic
and screening trials.

Interventional trials (also called treatment trials) are clinical trials which set out
to test treatments or combinations of treatments which have not yet been
officially approved. For example, a pharmaceutical company may have developed
a new drug which it believes would be effective in the treatment of Alzheimer's
disease but must first test it on human volunteers in accordance with strict and
rigorous guidelines in order to ensure that it is safe and effective. Some drugs
might be aimed at curing a particular condition, whereas others might be aimed
at better controlling symptoms of a particular condition.

Prevention trials involve tests to find ways to prevent particular medical
conditions or, if people have them already, to prevent them from reoccurring. The
emphasis of these studies might be on medicines, vitamins and minerals or
lifestyle changes.

Observational trials investigate health issues in large groups of people. The
participants in such trials do not receive any treatment but may be asked to
provide information or blood samples.

Diagnostic and screening trials are aimed at finding new ways to detect and
diagnose medical conditions (e.g. a better test, a more effective procedure or a
more sophisticated tool).

Clinical trials can also be classed according to whether they are considered therapeutic
or non-therapeutic.

A therapeutic trial is one in which the treatment under investigation is believed
to be likely to benefit the participants in some way (at least those receiving the
experimental drug - in the case of drug trials).

A non-therapeutic trial, on the other hand, is one which is unlikely to produce
any direct benefit to the participants involved. The aim of a non-therapeutic trial
is to obtain knowledge which may contribute towards the future development of
new forms of treatment or procedure.

Evidence from approved RCTs is required for medications to receive marketing approval
by the Therapeutic Goods Administration (TGA) in Australia.
Placebo Effects
A well-known phenomenon in medicine is that patients given only inert substance (e.g.
sugar pills) will often show subsequent clinical improvement when compared with
similar untreated patients. This phenomenon, termed the placebo effect, must be taken
into account in clinical trial design if effects in the intervention group are to be ascribed to
the intervention itself and not to the generic effect of being treated. The usual method to
accomplish this is to use an inert treatment that is indistinguishable from the primary
intervention in the control group. Thus, the only difference between groups is the specific
intervention under study.


Problems in Randomised Clinical Trials


The National Health and Medical Research Council has developed the National Statement
on Ethical Conduct in Research Involving Humans, which can be downloaded from the
website:
http://www.nhmrc.gov.au/publications/synopses/e35syn.htm
Under this statement it is unethical to pay study subjects to participate in studies if it
would induce them to participate when they would otherwise not participate.
5.1.5.4 Ethical Considerations
A set of Ethical Guidelines for Epidemiologists has been proposed (J Clin. Epidemiology
Vol. 44, 1991). The proposed obligations are:
I. Obligations to the subjects of research:
     o to protect their welfare
     o to obtain their informed consent
     o to protect their privacy
     o to maintain confidential information

The National Health and Medical Research Council has developed guidelines for the
organisation and conduct of human research ethics committees to review research
involving humans. This involves the provision of plain language information to potential
study participants so that they are truly informed of all aspects of the trial and can thus
give written, informed consent. In the case of minors, a parent or legal guardian can give
consent provided the minor agrees to participate. Further information is available from
the National Health and Medical Research Council in Canberra.
II. Obligations to society:
     o to avoid conflicts of interest
     o to avoid partiality
     o to widen the scope of epidemiology
     o to pursue responsibilities with due diligence
     o to maintain public confidence

The basis of this obligation is the accurate interpretation and reporting of research results.
These concepts will be discussed in more detail in future topics.
III. Obligations to funders and employers:
     o to specify obligations
     o to protect privileged information

Oftentimes, epidemiologists do research for third parties and are obliged not to reveal
proprietary information.
IV. Obligations to colleagues:
     o to report methods and results
     o to confront unacceptable behaviour and conditions
     o to communicate ethical requirements


These obligations occur in two common situations: who should be included as authors on
articles and the necessity to report negative results. It has been shown that research
articles that comprise statistically significant results are more likely to be published than
non-significant results. When this occurs, further resources may be expended to prove
something that has already been shown not to be significant.

5.1.6 Discussion Point


Would a randomised clinical trial be an appropriate epidemiologic study methodology to
evaluate the effect of polio vaccination versus a placebo on life expectancy? Why or why
not? (See answer to the revision exercises.)


5.1.7 Extra Readings


5.2 Cohort Studies


5.2.1 Background
A cohort is a well-defined group of people who have had a common experience. For
example, a group of people born during a particular period or year is called a birth cohort.
Another example would be a group of students who reach a milestone event together at
the same time. For example, a group of students who matriculate at the same time is an
entering cohort. On the other hand, a group of students who receive their degrees on the
same date comprise a graduating cohort.
In a clinical study, a well-defined group of subjects or patients who have had a common
experience or exposure and are then followed up for the incidence of new diseases or
events is called a cohort.
In statistics and demography, a cohort is a group of subjects, most often humans from a
given population, defined by experiencing an event (typically birth) in a particular time
span.
A cohort study, also referred to as a prospective, incidence, longitudinal or follow-up study,
is a study in which subjects who presently have a certain condition and/or receive a
particular treatment are followed over time and compared with another group who are not
affected by the condition under investigation. For research purposes, a cohort is any group
of individuals who are linked in some way or who have experienced the same significant
life event within a given period. There are many kinds of cohorts, including birth (for
example, all those who were born between 1970 and 1975), disease, education, employment,
family formation, etc. Any study in which there are measures of some characteristic of
one or more cohorts at two or more points in time is a cohort analysis. The cohort study is
generally considered to be the epitome of observational epidemiologic study designs
because the issue of temporality can be resolved. With this advantage, of course, are
some associated disadvantages. Issues related to the design of cohort studies will be
explored in this topic, while analytic issues will be discussed in more detail in a later
topic.

5.2.2 Learning Objectives


By the end of this topic you should:

o Understand the basic features of a cohort study
o Know when a cohort study is the appropriate epidemiologic research design to
answer a scientific question
o Know the advantages and disadvantages of cohort studies
o Know the types of summary rates calculated with data from cohort studies


5.2.3 Suggested Reading


Leon Gordis (2004). Epidemiology (3rd edn.). Chapters: 9.

5.2.4 Cohort study


5.2.5 Types of Cohort Study
There are two major types of cohort study in health-related research: (i) the prospective cohort
study (concurrent cohort or longitudinal study) and (ii) the retrospective cohort study (non-concurrent or historical cohort study).
A study comprising participants with and without exposure who are selected from the
population at the start of the study and followed into the future is called a concurrent or
prospective cohort study.
On the other hand, a study in which an investigator goes back in time to select subjects or
participants based on exposure and traces these subjects/participants over time to the
present (that is, at the time the study is initiated, both exposure and outcome have occurred)
is known as a non-concurrent or retrospective cohort study.

Prospective Cohort Study

In a prospective cohort study, exposure status is measured first,
then participants are followed up until outcome status is established.

[Figure: prospective cohort study. From the population, people without disease are classified as exposed or unexposed; each group is followed up and divided into those who develop the disease and those who do not.]

Source: the figures above are for prospective cohort studies (Medical research library
of Brooklyn),
http://servers.medlib.hscbklyn.edu/ebm/2400.htm


5.2.6 Major Concepts


The key feature of a cohort study is the prospective nature of the study design.
Information about risk behaviours and disease status at baseline is known and then
compared to outcomes at follow-up in the future.
Measuring Association in Cohort Studies


What is the interpretation of an incidence rate? How does it differ from a
prevalence rate?

Exposure Assessment
The two key elements in the identification of relative risks associated with the
development of new disease are the measurement of exposure and outcome. Recall from
an earlier topic the components of screening. Tools to measure exposures and disease
must be both reliable and valid. One of the potential problems in cohort studies is that the
measure of exposure or outcome can change over the study period. For example, a new
International Classification of Disease may be implemented half way through a 10-year
cohort study. If the investigators were tracking death certificates for certain disease
classifications, they may have difficulties.


An abstract from a prospective cohort study which reports the
measures of association:
Am J Ophthalmol. 2007 Jun;143(6):970-6.
Three-year incidence and cumulative prevalence of retinopathy: the Atherosclerosis
Risk in Communities Study, by
Wong TY, Klein R, Amirul Islam FM, Cotch MF, Couper DJ, Klein BE, Hubbard LD,
Sharrett AR.
 PURPOSE: To describe the three-year incidence and cumulative prevalence of
retinopathy and its risk factors.
 DESIGN: Population-based, prospective cohort study in four US communities.
 METHODS: In the Atherosclerosis Risk in Communities (ARIC) Study, 981
participants had retinal photography of one randomly selected eye at the third
examination (1993 to 1995) and three years later at the fourth examination (1996).
Photographs were graded on both occasions for retinopathy signs (for example,
microaneurysm, retinal hemorrhage, and/or cotton-wool spots). Incidence was
defined as participants without retinopathy at the third examination who
developed retinopathy at the fourth examination, and cumulative prevalence was
defined to include incident retinopathy as well as participants who had retinopathy
at both the third and fourth examinations.
 RESULTS: The three-year incidence and cumulative prevalence of any
retinopathy in the whole cohort was 3.8% and 7.7%, respectively. In multivariable
analysis, incident retinopathy was related to higher mean arterial blood pressure
(odds ratio [OR] 1.5, 95% confidence interval [CI] 1.0 to 2.3, per standard
deviation increase in risk factor levels), fasting serum glucose (OR 1.6, 95% CI
1.3 to 2.1), serum total cholesterol (OR 1.4, 95% CI 1.0, 2.0), and plasma
fibrinogen (OR 1.4, 95% CI 1.1 to 1.9). Among persons without diabetes, the
three-year incidence and cumulative prevalence of nondiabetic retinopathy was
2.9% and 4.3%, respectively. Incident nondiabetic retinopathy was related to
higher mean arterial blood pressure (OR 1.4, 95% CI 0.9 to 2.3) and fasting serum
glucose (OR 1.5, 95% CI 1.0 to 2.3). Among persons with diabetes, the three-year
incidence and cumulative prevalence of diabetic retinopathy was 10.1% and
27.2%, respectively.
 CONCLUSIONS: Retinopathy signs occur frequently in middle-aged people,
even in those without diabetes. Hypertension and hyperglycemia are risk factors
for incident retinopathy.
Cross-sectional study
A cross-sectional study is a study in which a study population is ascertained at one point
in time and all subjects in the population (or a sample of the population) are investigated
for outcome and/or exposure. As noted earlier, cross-sectional studies are not cohort
studies but can be the starting point for a cohort study, in which exposed and non-exposed
groups are followed up over time and the development of disease is recorded. A cross-sectional
study may be used as the first step in a longitudinal study, or may be used to
obtain cases and controls for a later case-control analysis. This type of study is sometimes
called a prevalence study, because the prevalence of disease at one point in time is
compared between exposed and unexposed individuals. The following diagram shows
how the studies are linked and what the objectives of the studies are.

5.2.7 Advantages and disadvantages of cohort studies


Advantages:
o ethically safe;
o subjects can be matched;
o can establish timing and directionality of events;
o eligibility criteria and outcome assessments can be standardised.

Disadvantages:
o controls may be difficult to identify;
o exposure may be linked to a hidden confounder;
o blinding is difficult;
o randomisation not present;
o administratively time consuming and more costly than RCT;
o for rare disease, large sample sizes or long follow-up necessary.

5.2.8 Measure of effects for categorical data


We would like to compare the frequency of disease between exposed and unexposed
subjects. Doing so is most straightforward in the context of prospective studies, where we
compare incidence rates, or in cross-sectional studies, where we compare prevalence rates
between exposed and unexposed individuals. Assumptions related to categorical data
analysis are:
o The independent (explanatory) variable is categorical (nominal or ordinal)
o The dependent (response) variable is categorical (nominal or ordinal)

Tests for Binary/Categorical outcomes (commonly used measures)

Outcome variable: binary or categorical (e.g. fracture, yes/no)

Are the observations correlated?

Independent observations:
o Chi-square test: compares proportions between two or more groups
o Relative risks: odds ratios or risk ratios
o Logistic regression: multivariate technique used when outcome is binary;
gives multivariate-adjusted odds ratios

Correlated observations:
o McNemar's chi-square test: compares binary outcome between correlated groups
(e.g., before and after)
o Conditional logistic regression: multivariate regression technique for a binary
outcome when groups are correlated (e.g., matched data)
o GEE modeling: multivariate regression technique for a binary outcome when
groups are correlated (e.g., repeated measures)

Alternative to the chi-square test if sparse cells:
o Fisher's exact test: compares proportions between independent groups when there
are sparse data (some cells <5).
o McNemar's exact test: compares proportions between correlated groups when there
are sparse data (some cells <5).

5.2.8.1 Relative Risk (RR)


Relative risk is appropriate for cross-sectional studies, prospective cohort studies and
clinical trials. It is calculated as the ratio of the risk of developing a disease among
patients in the exposed group to the risk of developing the disease among patients in the
unexposed group. More simply, we calculate the percentage of patients who developed the
disease in the exposed and unexposed groups, then divide the percentage for the exposed
group by that for the unexposed group. This gives the required RR.
Equivalently, the relative risk may be defined as the ratio of the incidence of disease in an
exposed group to the incidence of disease in a non-exposed group.
\[
RR = \frac{\text{Incidence in exposed}}{\text{Incidence in nonexposed}}
\]

The data resulting from a prospective study in which the outcome variable and the risk
factor are both dichotomous may be displayed in a 2 x 2 table as given below.
Table 5.1: General representation of two nominal characteristics

Risk Factor    Develop disease    Do not develop disease    Total
Exposed        a                  b                         a + b
Unexposed      c                  d                         c + d
Total          a + c              b + d                     a + b + c + d

From the above table, a out of the (a + b) patients in the exposed group developed the
disease, so the proportion of patients who developed disease in the exposed group is
a /(a + b). Similarly, the proportion of patients who developed disease in the unexposed
group is c /(c + d).
Thus, the relative risk can be represented symbolically as

\[
RR = \frac{a/(a+b)}{c/(c+d)} = \frac{a(c+d)}{c(a+b)}
\]
The RR can take any value from zero upwards. In general we encounter the
following three possibilities:
- An RR of 1, or very close to 1, indicates that patients in the exposed and unexposed
groups have the same risk.
- An RR greater than 1 indicates that patients in the exposed group are at increased risk
compared to those in the unexposed group.
- An RR less than 1 indicates that patients in the exposed group are at lower risk than
patients in the unexposed group.
Note: To test the null hypothesis of no association between the factors (exposure and
disease) we use the chi-squared test, and to obtain an estimate of the magnitude of the
association between exposure and disease we use the RR (cohort study) or the OR
(case-control study).

Example: (Reference: Evans DA et al., N. Engl. J. Med. 1978; 299:536)


This was a follow-up study of oral contraceptive (OC) use and the subsequent development of
urinary tract infection (UTI) in women aged 16-49 years. Among 2390 subjects free of
UTI at an initial survey in 1973, 482 were OC users and the remaining 1908 were
not OC users. At a second survey in 1976, 27 of the OC users had developed UTI, as
had 77 of the non-users.
                       Outcome
Exposure       UTI     No UTI    Total
OC use          27        455      482
No OC use       77       1831     1908
Total          104       2286     2390

Absolute risk:
- expresses the probability of a disease or outcome of interest occurring in the population;
- is the same as the incidence of the disease.
From the table above:
- 27 of 482 OC users developed UTI over the 3-year period, so the absolute risk of UTI over
3 years in OC users is 27/482 = 0.056, i.e. 0.056 x 100 = 5.6%;
- for non-users of OCs the absolute risk is 77/1908 = 0.040, i.e. 0.040 x 100 = 4.0%.
The results suggest that users of OCs may be at a higher risk of developing
UTI during a three-year interval.
- 3-year risk in OC users (exposed) = 5.6%
- 3-year risk in non-users (unexposed) = 4.0%
- RR = Risk in exposed / Risk in unexposed = 5.6/4.0 = 1.4
The relative risk of 1.4 indicates that in this study, OC users were 40% more
likely to develop UTI during a 3-year interval than those who did not use OCs.
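
The relative risk calculation for the OC and UTI example can be checked with a few lines of Python. This is a minimal sketch rather than part of the original notes: the function name `relative_risk` is an assumption chosen for illustration, and the arguments follow the a, b, c, d layout of Table 5.1.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2 x 2 table laid out as in Table 5.1.

    a = exposed, developed disease;    b = exposed, no disease
    c = unexposed, developed disease;  d = unexposed, no disease
    """
    risk_exposed = a / (a + b)      # absolute risk (incidence) in the exposed group
    risk_unexposed = c / (c + d)    # absolute risk (incidence) in the unexposed group
    return risk_exposed / risk_unexposed


# OC use and UTI: 27 of 482 OC users and 77 of 1908 non-users developed UTI.
rr_oc = relative_risk(27, 455, 77, 1831)
print(f"Risk in OC users:  {27 / 482:.3f}")
print(f"Risk in non-users: {77 / 1908:.3f}")
print(f"Relative risk:     {rr_oc:.2f}")
```

Working from the unrounded risks gives an RR of about 1.39, which agrees with the value of 1.4 obtained above from the rounded percentages.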

5.2.9 Comparing Cohort Studies with Randomised Trials


(Reference: Leon Gordis (2004). Epidemiology (3rd edn.), Chapter 9.)
Both an observational cohort study and a randomised trial (an experimental cohort design)
compare exposed with non-exposed groups. Because, for ethical and other reasons,
we cannot randomise people to receive a harmful substance, the exposure in most trials
is a treatment or preventive measure. In cohort studies investigating etiology, the
exposure is often to a possibly toxic or carcinogenic agent. In both types of design,
however, an exposed group is compared with a non-exposed group or with a group with
another exposure. The difference between these two designs, the presence or absence of
randomisation, is critical with regard to interpreting the study findings.
A simple comparison between these two designs is presented in the following figure.
[Figure: Observational study: Population, then other-than-random allocation (e.g. self-selection) into Group A and Group B. Experimental study: Population, then random allocation into Group A and Group B.]

Revision Exercises

5.1 Randomised Clinical Trials (RCT)


1. Give an example of a therapeutic, intervention and preventive RCT.
Therapeutic:

Intervention:

Preventive:

2. In an RCT to assess the effectiveness of a new type of open heart surgery, could the
trial be single blind, double blind or triple blind? Why?

3. Fill in the missing blanks in the following table:

Aetiological Factor     Risk Factor                    Disease
low dietary calcium     ______                         osteoporosis
salmonella              unhygienic food preparation    ______
______                  cigarette smoking              lung cancer

4. If investigators find no statistically significant relationship between a new medication
and mortality in a clinical trial, what are two explanations for this finding?
5. What is a potential strategy to assess subject compliance with a quit smoking campaign?

6. What are some potential means of keeping subjects enrolled in clinical trials?
Analysis of the Results
7. If you discover that, by chance, the baseline characteristics of your study groups in a
clinical trial are significantly different, what are some possible solutions?
8. Read the following supplementary article:
Shackel NA, Day RO, Kellett B, Brooks PM. Copper-salicylate gel for pain relief
in osteoarthritis: a randomised controlled trial. Med J Aust 1997; 167:134-136.
Can be downloaded from the website:
http://espace.library.uq.edu.au/eserv/UQ:10099/copper.pdf
A PDF copy of this article is also attached to the Topic 5 notes on Blackboard.
(i). Were people in the treatment group in the supplementary article significantly
more likely to withdraw during the trial? How could this affect the results of the
clinical trial?
(ii). Recalculate the chi-square and p-value associated with the number of patients
in each group that reported adverse events during the trial.
9. Why is randomisation essential and what does it accomplish?
Discussion Point
10. Would a randomised clinical trial be an appropriate epidemiologic study methodology
to evaluate the effect of polio vaccination versus a placebo on life expectancy? Why or
why not?
11. If investigators find no statistically significant relationship between a new medication
and mortality in a clinical trial, what are two possible explanations for this finding?

5.2 Cohort Studies


1. What are some examples of cohorts that could be studied?
2. What is the interpretation of an incidence rate? How does it differ from a prevalence
rate?

3. How might you measure exposure to second-hand cigarette smoke in a cohort study of
the effects of second-hand smoke on the incidence of asthma in children?
4. A common measure of dietary intake is the 24-hour food recall. Write down
everything that you had to eat and drink in the last 24 hours. Do you think that those 24
hours were representative of your usual food intake? Usually, a series of random
24-hour recalls is undertaken to classify a person's usual dietary intake.

5. Read the following supplementary article:
Crofts N, Aitken CK. Incidence of bloodborne virus infection and risk behaviours
in a cohort of injecting drug users in Victoria, 1990-1995.
Med J Aust 1997;167:17-20.
Can be downloaded from the website:
https://www.mja.com.au/journal/1997/167/1/incidence-bloodborne-virusinfection-and-risk-behaviours-cohort-injecting-drug
(i) Recall from an earlier topic that Prevalence = Incidence x Duration.
Use this formula and the information in Table 3 of the article to estimate the
overall duration in years for people infected with the Hepatitis C virus (HCV).
(ii) Calculate the chi-square value for the relationship between reports of sharing
of needles and syringes, and HCV serostatus in Table 3 of the article. How many
degrees of freedom are there? How is this chi-square value interpreted?

6. The following table shows a hypothetical cohort study of 3,000 cigarette smokers
and 5,000 non-smokers to investigate the relation of smoking to the development
of coronary heart disease (CHD) over a 1-year period.

                           CHD develops    CHD does not develop    Total
Smoke cigarettes           84              2916                    3000
Do not smoke cigarettes    87              4913                    5000

Calculate the relative risk and interpret your answer.


7. Of 595 patients who had received blood transfusions and 712 patients who had not, 75
and 16 respectively, developed hepatitis during a 2.5 yr follow-up in a study designed to
evaluate relative risk.
a) Construct a 2 x 2 table.
b) Calculate the relative risk of developing hepatitis in patients receiving a blood
transfusion compared with patients not receiving a blood transfusion.
c) Interpret the answer.
8. The Roper Organization (1992) conducted a study as part of a larger survey to ascertain
the number of American adults who had experienced phenomena such as seeing a ghost,
feeling as if they had left their body, and seeing a UFO. A representative sample of adults
(18 and over) in the continental United States were interviewed in their homes during
July, August, and September 1991. The results when respondents were asked about
seeing a ghost are shown in the following table (taken from Utts, J.M., Seeing Through
Statistics, p.235).

                   Reportedly has seen a ghost
Age group          Yes      No       Total
Aged 18 to 29      212      1313     1525
Aged 30 or over    465      3912     4377
Total              677      5225     5902

What is the relative risk (RR) of reportedly seeing a ghost for one group compared to the
other? Write a sentence to explain your answer in a form that could be understood by
someone who knows nothing about statistics.

Solution to the revision exercises


5.1 Solutions: Randomised Clinical Trials

1. Give an example of a therapeutic, intervention and preventive RCT.

Therapeutic: A trial comparing laser surgery for myopia (short-sightedness) with
contact lens vision correction (hypothetical); the Beta-Blocker Heart Attack Trial.

Intervention: Pap smears to detect early signs of cervical cancer; cessation of a
high-cholesterol diet and smoking to lower the risk of myocardial infarction.

Preventive: Oestrogen treatment to prevent osteoporosis; vaccine trials to prevent
infection.

2. In an RCT to assess the effectiveness of a new type of open heart surgery, could
the trial be single blind, double blind or triple blind? Why?
Single blinding would involve the patient (subject) being unaware of which type of
surgery (s)he experienced. This would not be difficult to achieve. Double blinding would
involve the observer of the subject not being aware of the type of surgery the subject has
received. This would not be possible if the observer were the surgeon (as would most
likely be the case) or a member of theatre staff. Triple blinding is not possible if double
blinding is not achieved. In many cases the person reviewing the data is the statistician
whose responsibility it is to arrange and manage double blinding.
3. Fill in the missing blanks in the following table:

Aetiological Factor     Risk Factor                    Disease
low dietary calcium     no dairy products in diet      osteoporosis
salmonella              unhygienic food preparation    food poisoning
carcinogens in smoke    cigarette smoking              lung cancer

4. If investigators find no statistically significant relationship between a new
medication and mortality in a clinical trial, what are two explanations for this
finding?
(i) No relationship exists. The medication has no effect on the disease.
(ii) There is a relationship but the trial has been unable to detect it because
of low power due to poor design, poor control of confounding factors
or too small a sample size. (There has been a type II error.)

5. What is a potential strategy to assess subject compliance with a quit smoking
campaign?
Testing of blood for nicotine or other smoking-related residues. Lung capacity testing.
(Reliable methods of counting the number of cigarettes smoked are difficult to imagine.)
6. What are some potential means of keeping subjects enrolled in clinical trials?
Offering an incentive or inducement to participate eg free medical examinations or
health care

Appealing to their altruism (their desire to help others with the disease)
Promising that should they be assigned to the control group they will be given the new
treatment after the trial, if it proves effective
7. If you discover that, by chance, the baseline characteristics of your study groups in a
clinical trial are significantly different, what are some possible solutions?
If the discovery is made before the trial is in progress you could reallocate subjects
randomly to groups after stratifying on key variables that are possibly related to the
outcome.
Randomly discard subjects with certain characteristics (the ones showing imbalance)
from one or other of the groups. This strategy may be viable in very large studies.
Include the variables showing imbalance in the analysis as covariates so as to
statistically control for them.
8. Read the following supplementary article:
Shackel NA, Day RO, Kellett B, Brooks PM. Copper-salicylate gel for pain relief in
osteoarthritis: a randomised controlled trial. Med J Aust 1997;167:134-136.

(i) Were people in the treatment group in the supplementary article significantly
more likely to withdraw during the trial? How could this affect the results of
the clinical trial?

Yes, people in the treatment group were significantly more likely to withdraw during the
trial. This could affect the results by decreasing the power of the study and increasing the
likelihood of a type II error.
(ii) Recalculate the chi-square and p-value associated with the number of
patients in each group that reported adverse events during the trial.
The two-by-two table to test the association between the report of adverse events and the
groups is shown below (note that intention-to-treat analyses are assumed and all 58 people
in each group have been included). These data were obtained from Table 2. Note that the
information in Table 1 relates to people who withdrew from the trial due to their adverse
events, not the total number of adverse events.
Observed
Reported adverse event    Copper-salicylate gel    Placebo    Total
Yes                       48                       29         77
No                        10                       29         39
Total                     58                       58         116

Expected
Reported adverse event    Copper-salicylate gel    Placebo    Total
Yes                       38.50                    38.50      77
No                        19.50                    19.50      39
Total                     58                       58         116

\[
\chi^2 = \sum \frac{(O - E)^2}{E}
\]
\[
\chi^2 = \frac{(48 - 38.5)^2}{38.5} + \frac{(10 - 19.5)^2}{19.5} + \frac{(29 - 38.5)^2}{38.5} + \frac{(29 - 19.5)^2}{19.5}
\]
\[
\chi^2 = 2.3441 + 4.6282 + 2.3441 + 4.6282
\]
\[
\chi^2 = 13.9
\]
A chi-square value of 13.9 on one degree of freedom has a p-value of less than 0.001.
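
The chi-square calculation above can be reproduced in Python. The following is a minimal sketch using only the standard library; the function names `chi_square_2x2` and `p_value_1df` are assumptions made for illustration, no continuity correction is applied (matching the hand calculation), and the p-value uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2 x 2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under no association: row total x column total / n
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_1df(stat):
    """Upper-tail p-value for a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Rows: copper-salicylate gel, placebo. Columns: adverse event yes, no.
observed = [[48, 10], [29, 29]]
chi2_stat = chi_square_2x2(observed)
p_value = p_value_1df(chi2_stat)
print(f"chi-square = {chi2_stat:.2f}, p = {p_value:.5f}")
```

The orientation of the table (groups as rows or as columns) does not affect the statistic, so the layout used here is only a convenience.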
9. Why is randomisation essential and what does it accomplish?


Randomisation is essential because then and only then can the difference in outcome
between the groups be ascribed to the treatment applied. If randomization is omitted, the
difference in outcome may be due to differing characteristics in the competing groups.
Randomisation achieves an unbiased distribution of risk factors (e.g. age, sex, severity of
disease). This principle is the underlying concept on which randomised clinical trials are
based.
10. Would a randomised clinical trial be an appropriate epidemiologic study
methodology to evaluate the effect of polio vaccination versus a placebo on life
expectancy? Why or why not?
No, because it would involve the withholding of an effective vaccine from some people at
risk of contracting polio, thereby exposing them to a possibly disabling disease which
would reduce permanently the quality of their lives.
However, it would be ethical to compare the effect of polio vaccination using a new
vaccine with the polio vaccines that are currently used.
11. If investigators find no statistically significant relationship between a new
medication and mortality in a clinical trial, what are two possible explanations for
this finding?

(i) No relationship exists. The medication has no effect on the disease.
(ii) There is a relationship but the trial has been unable to detect it because
of low power due to poor design, poor control of confounding factors
or too small a sample size. (There has been a type II error.)

5.2 Solutions: Cohort Studies


1. What are some examples of cohorts that could be studied?
There are endless possibilities. Here are a couple of examples:


Women of childbearing age (say 20-25 years) could be followed for (say) 20 years to
assess how maternal diet affects the birth weight of children. A difficulty would be
in defining the exposures (diet during what time period, how is diet assessed or
measured?).

Children with asthma could be followed over time to assess the relative effectiveness
of various treatments or treatment regimes.

2. What is the interpretation of an incidence rate? How does it differ from a
prevalence rate?
An incidence rate is the rate of appearance of new cases of a disease. A prevalence rate is
the proportion of cases in a population. Incidence has person-time in the denominator while
prevalence has the number of persons.

3. How might you measure exposure to second-hand cigarette smoke in a cohort study
of the effects of second-hand smoke on the incidence of asthma in children?
Since the children will spend much of their time at home, the exposure could be measured
by the amount of smoking done by their parents. Other sources of second-hand smoke
would be difficult to associate with individual children.

4. A common measure of dietary intake is the 24-hour food recall. Write down
everything that you had to eat and drink in the last 24 hours. Do you think that those 24
hours were representative of your usual food intake? Usually, a series of random
24-hour recalls is employed to classify a person's usual dietary intake.
You need to think how to do this yourself. Try to write down what you see as the
problems associated with this kind of recall measurement.
5. Supplementary reading:
Crofts N, Aitken CK. Incidence of bloodborne virus infection and risk behaviours in a
cohort of injecting drug users in Victoria, 1990-1995. Med J Aust. 1997;167:17-20.

(i) Recall from topic 3 that Prevalence = Incidence x Duration.

Use this formula and the information in Table 3 of the article to estimate the
overall duration in years for people with the Hepatitis C virus.
Prevalence = No. with positive HCV serostatus / Total population at risk
Prevalence = (49 + 52 + 43)/202 = 144/202
Prevalence = 0.71

Incidence = No. of people who seroconverted / Total person-years at risk
Incidence = 15/130.5
Incidence = 0.115 per person-year at risk

Duration = Prevalence/Incidence
Duration = 0.71/0.115
Duration = 6.2 years
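
The duration estimate can be checked with a short Python sketch. The counts are those quoted in the solution above; the variable names are assumptions chosen for readability.

```python
# Counts quoted in the solution above (from Table 3 of Crofts & Aitken, 1997).
hcv_positive = 49 + 52 + 43   # subjects with positive HCV serostatus
cohort_size = 202             # total population at risk
seroconversions = 15          # new HCV infections during follow-up
person_years = 130.5          # total person-years at risk

prevalence = hcv_positive / cohort_size     # proportion of existing cases
incidence = seroconversions / person_years  # new cases per person-year
duration = prevalence / incidence           # from Prevalence = Incidence x Duration

print(f"Prevalence = {prevalence:.2f}")
print(f"Incidence  = {incidence:.3f} per person-year")
print(f"Duration   = {duration:.1f} years")
```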
(ii) Calculate the chi-square value for the relationship between reports of
sharing of needles and syringes, and HCV serostatus in table 3 of the article.
How many degrees of freedom are there? How is this chi-square value
interpreted?
Observed
                      HCV serostatus
Reports of sharing    +        -        Total
≥ 50%                 49       14       63
< 50%                 52       21       73
None                  43       23       66
Total                 144      58       202

Expected
                      HCV serostatus
Reports of sharing    +        -        Total
≥ 50%                 44.91    18.09    63
< 50%                 52.04    20.96    73
None                  47.05    18.95    66
Total                 144      58       202

\[
\chi^2 = \sum \frac{(O - E)^2}{E}
\]
\[
\chi^2 = \frac{(49 - 44.91)^2}{44.91} + \frac{(52 - 52.04)^2}{52.04} + \frac{(43 - 47.05)^2}{47.05} + \frac{(14 - 18.09)^2}{18.09} + \frac{(21 - 20.96)^2}{20.96} + \frac{(23 - 18.95)^2}{18.95}
\]
\[
\chi^2 = 0.3725 + 0 + 0.3486 + 0.9247 + 0 + 0.8656
\]
\[
\chi^2 = 2.5114 \quad \text{(which corresponds with the value in the paper)}
\]

Degrees of freedom = (r - 1) (c - 1) = 2 x 1 = 2
Interpretation:
The calculated chi-square value is less than 5.99, the critical chi-square value at the 0.05
level with 2 degrees of freedom. Therefore, do not reject the null hypothesis. There is no
significant association between the frequency of reporting shared
use of needles and syringes and HCV serostatus.
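
The same calculation generalises to any table with r rows and c columns. The sketch below is an illustration under stated assumptions: the function name `chi_square_table` is hypothetical, the observed counts are those quoted in the solution, and the critical value 5.99 (0.05 level, 2 degrees of freedom) is taken from the text rather than computed.

```python
def chi_square_table(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Rows: the three reported levels of sharing; columns: HCV positive, HCV negative.
sharing_by_hcv = [[49, 14], [52, 21], [43, 23]]
chi2_hcv, df_hcv = chi_square_table(sharing_by_hcv)

CRITICAL_VALUE = 5.99  # 0.05 level, 2 degrees of freedom (value quoted in the text)
reject_null = chi2_hcv > CRITICAL_VALUE
print(f"chi-square = {chi2_hcv:.2f} on {df_hcv} df, reject null: {reject_null}")
```

Because the code uses unrounded expected counts, the statistic may differ from the hand calculation in the third decimal place, but the conclusion is unchanged.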
6. The following table shows a hypothetical cohort study of 3,000 cigarette smokers
and 5,000 non-smokers to investigate the relation of smoking to the development of
coronary heart disease (CHD) over a 1-year period.

                           CHD develops    CHD does not develop    Total
Smoke cigarettes           84 (a)          2916 (b)                3000 (a+b)
Do not smoke cigarettes    87 (c)          4913 (d)                5000 (c+d)

\[
RR = \frac{a(c+d)}{c(a+b)} = \frac{84 \times 5000}{87 \times 3000} \approx 1.61
\]
Thus the risk of developing CHD for a smoker is about 1.6 times that for a non-smoker.
7. (a) The 2 x 2 table:

Exposure          Hepatitis    No hepatitis    Total
Transfusion       75 (a)       520 (b)         595 (a+b)
No transfusion    16 (c)       696 (d)         712 (c+d)
Total             91           1216            1307

(b) RR = (75/595) / (16/712) = (75 x 712)/(595 x 16) = 5.6


(c) Patients who had blood transfusions were 5.6 times more likely to suffer from
hepatitis compared to patients who did not have blood transfusions.
8. The relative risk is RR = (212/1,525)/(465/4,377) = 0.1390/0.1062 = 1.31. We would say:
People aged 18 to 29 are 1.31 times as likely to report having seen a ghost as those aged
30 or over (or some similar wording).


References

Concato J, Shah N, Horwitz RI (2000). Randomized, controlled trials,
observational studies, and the hierarchy of research designs. N Engl J Med;
342:1887-1892.

Clayton D, Hills M (1993). Statistical Models in Epidemiology. Oxford: Oxford
University Press.

Dawson B, Trapp RG (2001). Basic and Clinical Biostatistics, 3rd Edition
(international edition). Lange Medical Book/McGraw-Hill.

Crofts N, Aitken CK (1997). Incidence of bloodborne virus infection and risk
behaviours in a cohort of injecting drug users in Victoria, 1990-1995. Med J Aust;
167:17-20.
Can be downloaded from the website:
http://www.mja.com.au/public/issues/jul7/crofts/crofts.html

Friedman LM, Furberg CD, DeMets DL (1985). Fundamentals of Clinical Trials.
Littleton, Massachusetts: PSG Publishing Company.

Gordis L (2013). Epidemiology, 5th Edition. Elsevier/Saunders, Chapters 7-8
(available at Swinburne Bookshop and students may buy this book).

Jekel FJ, Katz DL, Elmore JG (2001). Epidemiology, Biostatistics, and
Preventive Medicine, 2nd Ed. W.B. Saunders Company, New York.

Kelsey JL, Thompson WD, Evans AS (1986). Methods in Observational
Epidemiology. New York: Oxford University Press.

Lilienfeld DE, Stolley PD (1994). Foundations of Epidemiology, 3rd ed.
New York: Oxford University Press.

Morton RF, Hebel JR, McCarter RJ (2001). A Study Guide to
Epidemiology and Biostatistics, 5th Edn.

Pagano M, Gauvreau K (1993). Principles of Biostatistics. Duxbury Press,
USA.

Pocock SJ (1983). Clinical Trials: A Practical Approach. New York: Wiley/Liss.

Rosner B (2006). Fundamentals of Biostatistics, 6th Edition. Thomson, Australia.

Shackel NA, Day RO, Kellett B, Brooks PM (1997). Copper-salicylate gel for
pain relief in osteoarthritis: a randomised controlled trial. Med J Aust; 167:134-136.

Online Blog-Experiment Resources.com:
http://www.experiment-resources.com/observational-study.html


Research Design: Topic 12

Module 2: Topic 6

Topic 6: Case-Control
Studies

Course notes STA60004

Semester 1, 2014


Contents
6.1 Topic background
6.2 Topic learning objectives
6.3 Suggested reading
6.4 Design of a case-control study
6.5 Major Concepts
    6.5.1 Retrospective Study
    6.5.2 Measuring Association in Case-Control Studies
    6.5.3 The Selection of Cases and Controls
    6.5.4 Potential Sources of Bias
    6.5.5 Analysing Case-Control Studies
6.6 Advantages and Disadvantages of Case-Control Studies
6.7 Discussion Point
Revision exercises
Solution to the revision exercises
References

Note: Some of the materials are adapted from standard texts and guides (see references).

6.1 Background

In today's medical literature the case-control study is one of the most frequently
encountered research designs. It is a study that compares two groups of people: those with the
disease or condition under study (cases) and a very similar group of people who do not
have the disease or condition (controls). Researchers study the medical and lifestyle
histories of the people in each group to learn what factors may be associated with the
disease or condition. For example, one group may have been exposed to a particular
substance that the other was not. This type of study is sometimes called a retrospective
study.
Case-control studies begin with individuals who have the outcome (cases) and compare
them to individuals who do not have the outcome (controls) according to past history of
exposure to a factor. Case-control studies are appropriate when: (1) the outcome is rare
and (2) there is reliable evidence of past exposure. One issue to consider in ascertaining
past exposure is recall bias. Past exposure is generally ascertained by interviewing
subjects or analyzing historical records or charts. If cases and controls differentially recall
past exposures or there is more or less thorough documentation on cases compared to
controls, study results may be biased. For case-control studies, the general concern is that
cases will be more likely than controls to recall past exposures because they have already
considered the potential causes of their disease. Similarly, interviewer bias occurs when
study investigators interview cases more thoroughly regarding past exposures than
controls because they know the subject is a case.
Like cohort studies, the purpose of case-control studies is to establish association
between exposure to risk factors and disease. Case-control studies are one of the most
common types of observational epidemiological studies because they generally take less
time to complete than cohort studies. Of course, this saving in time and other resources is
associated with some disadvantages. Issues related to the design and analysis of
case-control studies will be explored in this topic.

6.2 Learning Objectives

By the end of this topic you should:
- understand the basic features of a case-control study;
- know when a case-control study is the appropriate epidemiologic research design
to answer a scientific question;
- know the advantages and disadvantages of case-control studies;
- know the summary rates calculated with data from case-control studies.

6.3 Suggested Reading

Gordis L (2009). Epidemiology (4th edn.). Chapters 10, 11 & 13.
Siskind V, Green A, Bain C, Purdie D. Breastfeeding, menopause, and epithelial ovarian
cancer. Epidemiology 1997, Vol. 8, pp. 188-191 (copy attached at the end).

6.4 Design of a Case-Control Study

In a case-control study subjects are selected on the basis of whether or not they have the
disease of interest. Cases (those with disease) are then compared to controls (those
without disease) in terms of their history of exposure to a hypothesised causal factor. This
is illustrated in Figure 5.1 (Source: Beaglehole et al., 1993).

[Figure: The direction of inquiry runs backwards in time. The study starts with cases (people with disease) and controls (people without disease) drawn from the population, and each group is then classified as exposed or not exposed.]
Source: Medical Research Library of Brooklyn
http://servers.medlib.hscbklyn.edu/ebm/toc.html
Figure 5.1: Design of a Case-Control Study

6.5 Major Concepts

6.5.1 Retrospective Study

A retrospective study (or case-control study) is the reverse of a prospective study. In
retrospective studies, samples are selected from those falling into the categories of the
outcome variable. The investigator then looks back (that is, takes a retrospective look) at
the patients and determines which ones have (or had) and which ones do not
have (or did not have) the risk factor. In this type of study individuals are initially
classified into two groups: (1) a group that has the disease under study (the cases) and (2)
a group that does not have the disease under study (the controls). An attempt is then made
to relate their prior health habits to their current disease status.
6.5.2 Measuring Association in Case-Control Studies
A case-control study has four stages: the selection of cases, the selection of controls, the
measurement of the risk factor, and the analysis of the association.

6.5.3 The Selection of Cases and Controls


To avoid bias, cases in a case-control study should be newly diagnosed. Although this
will reduce the numbers available for study, a group containing a mixture of newly
diagnosed cases and long-standing chronic cases can cause great difficulty in interpreting
the results. The diagnostic criteria for defining a case must be explicit.
The selection of controls is the most difficult part of a case-control study. Ideally, the
controls should be representative of the general population without the disease, and thus
be a random sample from the same population that gave rise to the cases. However,
practical considerations usually dictate that controls are taken from a hospital population
of patients who have neither the disease under investigation nor a disease positively
related to the risk factor being studied. However, it must be remembered that patients
admitted to a hospital are not necessarily representative of the general population.
A further problem arises in the selection of controls. A random sample from the hospital
patients not affected by the disease could be taken, but the cases and controls may differ
with regard to variables that may be related to exposure to the risk factor(s) and therefore
to development of the disease. These variables, or confounders, could be controlled for at
the analysis stage but are better adjusted for in the design of the study. To avoid the
confounding effects of these variables, the controls are often matched, either through pair
matching or frequency matching.
Sometimes cases and controls are selected from within a cohort study. This
special type of study design is called a nested case-control study. Nested case-control studies
are often chosen for smaller studies within a larger study when the resources are not
available to examine the entire cohort. A common example is the analysis of stored
blood samples of cases and controls for a potential genetic marker. Genetic studies can
be very expensive, so both time and money can be saved by analysing only a sub-sample of the
larger cohort.
6.5.4 Potential Sources of Bias
Bias is a systematic error in the design, conduct or analysis of a study, resulting in over- or
under-representation of one or more cells in the 2 x 2 table and a mistaken estimate of
the relationship between exposure and disease.
Selection bias is the error due to systematic differences in characteristics between those
who are included in a study and those who are not.
Information bias is the inaccurate classification of study subjects with respect to disease
or exposure status.
Two types of misclassification can occur:
a) Differential
   • the amount or direction of misclassification is different in the cases and controls
   • risk estimates can be biased in either direction
b) Non-differential
   • the amount and direction of misclassification is the same in cases and controls
   • risk estimates are biased towards the null (i.e. towards 1.0)
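The direction of bias under the two types of misclassification can be checked numerically. The sketch below is illustrative only: the true counts and the sensitivity and specificity of exposure measurement are hypothetical values chosen for the example, and the function names are not from the notes. It works with expected counts rather than a random simulation.

```python
def odds_ratio(a, b, c, d):
    """OR = ad / bc, with a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)

def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed (exposed, unexposed) counts when exposure is
    measured with the given sensitivity and specificity."""
    obs_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    obs_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return obs_exposed, obs_unexposed

# Hypothetical true counts: the true OR is 100*150 / (50*100) = 3.0
true_a, true_c = 100, 100   # cases: exposed, unexposed
true_b, true_d = 50, 150    # controls: exposed, unexposed
true_or = odds_ratio(true_a, true_b, true_c, true_d)

# Non-differential: the same measurement error in cases and controls.
a, c = misclassify(true_a, true_c, sensitivity=0.8, specificity=0.9)
b, d = misclassify(true_b, true_d, sensitivity=0.8, specificity=0.9)
nondifferential_or = odds_ratio(a, b, c, d)   # about 2.16, pulled towards 1.0

# Differential: cases report exposure more completely than controls.
a, c = misclassify(true_a, true_c, sensitivity=0.95, specificity=0.9)
b, d = misclassify(true_b, true_d, sensitivity=0.60, specificity=0.9)
differential_or = odds_ratio(a, b, c, d)      # about 3.81, pushed away from 1.0

print(round(true_or, 2), round(nondifferential_or, 2), round(differential_or, 2))
```

Swapping the two sensitivities in the differential case (better reporting by controls than by cases) moves the estimate in the opposite direction, which is why differential misclassification can bias the result either way.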

Misclassification can result from:

• Inaccurate or incomplete recall of prior exposures, symptoms, etc. by study subjects
• Inaccurate reporting of disease or exposure information
• Systematic error due to interviewers gathering data selectively
• Inaccurate measurement of exposure or disease-related parameters, e.g. calibration
errors, non-standardised measurement methods
• Inaccurate diagnosis of disease
6.5.5 Analysing Case-Control Studies
The basic analysis undertaken in case-control studies involves comparing cases and
controls with respect to their frequency of exposure to hypothesised causal factors. The
important thing to remember in the analysis and interpretation of case-control studies is
the retrospective nature of the data collection. Because a case-control study does not
provide true incidence data, the relative risk (RR) for exposure to a given factor cannot be
calculated directly and is instead estimated by the odds ratio (OR). When the outcome is rare,
and the cases and controls are derived from the same community, the OR is a good
approximation to the RR. The OR measures how much higher the odds of exposure to the
risk factor are among the cases than among the controls. If exposure is
associated with the disease, we would expect the odds of exposure to be higher for cases
than for controls.

6.5.5.1 Odds Ratio (OR)

The odds ratio is generally appropriate for case-control or other retrospective studies.
It is a way of comparing whether the odds of a certain event are the
same for two groups. In a case-control study, two groups of patients are selected
according to disease status: a group with the disease and another group without it.
In such circumstances it will, in general, not be possible to determine the risk of disease
amongst patients exposed to a particular factor in comparison with those not exposed, so
the RR cannot be estimated directly. However, for a large sample and a rare disease, the
RR and OR are approximately the same.
In a cohort study, the odds ratio is defined as the ratio of the odds of development of
disease in exposed persons to the odds of development of disease in non-exposed persons.
It can be calculated as follows:

                          Exposed   Non-exposed
Develop disease              a           c
Do not develop disease       b           d

Odds Ratio,
$$\mathrm{OR} = \frac{ad}{bc}.$$

Like the RR, the OR can take any value from zero upwards. In
practice we encounter one of three possibilities:
• An OR of one: the odds of disease are the same in the exposed
and unexposed groups.
• An OR greater than one: the odds of disease are higher in the
exposed group than in the unexposed group.
• An OR less than one: the odds of disease are lower in the exposed
group than in the unexposed group.
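The earlier remark that the RR and OR nearly coincide when the disease is rare can be checked with a small numerical sketch. It uses the a, b, c, d layout of the 2 x 2 table above. The two sets of counts are hypothetical and were chosen so that the RR is 2.0 in both, once for a rare and once for a common outcome.

```python
def relative_risk(a, b, c, d):
    """Risk of disease in the exposed, a/(a+b), over the risk in the non-exposed, c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds of disease in the exposed, a/b, over the odds in the non-exposed, c/d."""
    return (a * d) / (b * c)

# Hypothetical cohort counts: a, c develop disease; b, d do not.
rare_disease = dict(a=20, b=9980, c=10, d=9990)     # risks of 0.2% and 0.1%
common_disease = dict(a=400, b=600, c=200, d=800)   # risks of 40% and 20%

for label, table in (("rare", rare_disease), ("common", common_disease)):
    rr = relative_risk(**table)
    oratio = odds_ratio(**table)
    print(f"{label}: RR = {rr:.2f}, OR = {oratio:.2f}")
```

For the rare outcome the two measures agree to two decimal places, whereas for the common outcome the OR (about 2.67) clearly overstates the RR of 2.0.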

Example: Consider a case-control study investigating the association between bronchial
carcinoma and asbestos exposure in the Canadian chrysotile mines and mills; the data are
given in Table 8.4. Calculate an appropriate measure of association between carcinoma
and asbestos exposure and comment.
              Cases   Controls   Total
Exposed        148      372       520
Unexposed       75      343       418
Total          223      715       938

From the above table we have a = 148, b = 372, c = 75 and d = 343. Thus OR =
(ad)/(bc) = (148 × 343) / (372 × 75) = 1.82. This means that the odds of
bronchial carcinoma in the asbestos-exposed group are 1.82 times the odds of
bronchial carcinoma in the unexposed group.
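The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch; the function and variable names are illustrative. It also computes the ratio of the odds of exposure among cases to that among controls, which is what a case-control study observes directly, to show that it gives the same number as ad/bc.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio ad / bc for a 2 x 2 table."""
    return (a * d) / (b * c)

# Counts from the asbestos example:
# a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls
a, b, c, d = 148, 372, 75, 343

asbestos_or = odds_ratio(a, b, c, d)

# Odds of exposure among cases and among controls.
exposure_odds_cases = a / c
exposure_odds_controls = b / d
exposure_or = exposure_odds_cases / exposure_odds_controls

print(round(asbestos_or, 2))   # 1.82
print(round(exposure_or, 2))   # 1.82
```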
6.6 Advantages and Disadvantages of Case-Control Studies

Advantages:

• They are relatively quick and cheap to undertake compared with other analytic
designs.
• They are valuable for studying rare or uncommon conditions.
• They need fewer subjects than cross-sectional studies.
• Odds ratios can be calculated from this type of study; for a rare disease these are a
good approximation of the relative risk.

Disadvantages:

• They are usually inefficient if the exposure is rare.
• They do not establish the time sequence of events.

• Selecting an appropriate control group is difficult.
• They are more prone to bias than other designs.

6.7 Discussion Point
In a study of a rare type of cancer, do you think that it would be best to design a
cohort study or a case-control study? Think not only of the study rigour, but also
the implications for resources.


Revision Exercises

1. Why are case-control studies sometimes called retrospective studies?

2. State in words the meaning of an odds ratio, remembering that the data about risk
factors are collected retrospectively.

3. In a study to examine the association between smoking and drinking and the
development of head and neck cancer, from where would you source cases and controls?
What factors would you match for? What data would you collect and how would the data
be obtained?
4. A case-control study of bladder cancer was undertaken in a region with a large dye
manufacturing plant. Newly diagnosed cases of bladder cancer in that region were
identified from the state cancer registry. Controls were identified by telephone sampling
in the region and matched for age and sex. Exposures to chemicals, dyes, dietary factors
and smoking were assessed and odds ratios for these exposures were calculated. During
this process, the investigators learned that the dye plant had recently begun a medical
surveillance program for its employees that included screening for bladder cancer. A few
subclinical cases of bladder cancer were detected through the medical surveillance
program and were included among the case group. Could the differential medical
surveillance of the dye manufacturing workers bias the results of this study? If so, why?
5. A case-control study of lung cancer used identified cases of newly diagnosed lung
cancer and age/sex matched controls diagnosed with other conditions in the same
hospital. Rather than interviewing cases and controls, investigators reviewed the medical
histories to collect data on exposures. The medical charts of the lung cancer cases were
8.5 times more likely to mention a history of asbestos exposure compared with the
medical histories of the controls. Is this differential or non-differential misclassification?
What impact might this have had on the outcome of the study?
6. Compare the use and interpretation of a relative risk and an odds ratio.
7. How would you explain to someone the difference between an absolute and a relative
risk?
8. Why can you not calculate an incidence rate from a case-control study?
9. Why is it easier to study a rare disease with a case-control study than with a cohort
study?
10. Read the following supplementary article (extra reading):
Siskind V, Green A, Bain C, Purdie D. Breastfeeding, menopause, and epithelial
ovarian cancer. Epidemiology 1997;8:188-191.

(i) How were the cases and controls selected? What potential biases could
have been introduced by this selection method?
(ii) How was the exposure measured? How could this method have biased the
results?
(iii) What information was provided about the non-participants? Based on this
information, do you think that it is likely that non-participation could have
influenced the results?
(iv) Calculate the unadjusted odds ratio that measures the association between
unsupplemented breastfeeding and epithelial ovarian cancer for all women. (N.B.
the data will need to be extracted from Table 1 on page 190). Interpret this odds
ratio.
(v) How would you interpret the overall results of this study for a lay person? Do
you trust the results of this one study?
11. A case-control study investigated the association between bronchial carcinoma and
asbestos exposure in the Canadian chrysotile mines and mills; the data are
given in the table below. Calculate an appropriate measure of association
between carcinoma and asbestos exposure and comment.
Table: Association between bronchial carcinoma and asbestos exposure.

Asbestos exposure   Cases   Controls   Total
Exposed              148      372       520
Unexposed             75      343       418
Total                223      715       938


12. A case-control study of myocardial infarction gave results for prior alcohol
consumption as shown. Calculate the odds ratio for each category of drinkers compared
with the non-drinkers. Does the table show a positive or negative association between
alcohol consumption and myocardial infarction?
Standard drinks per day   Cases   Controls
0 (non-drinkers)           136      110
≤2                         202      238
3-5                         42       46
≥6                          11       24
Total                      391      418

13. In a study of risk factors for congenital defects of the neural tube, maternal
deficiency of folate was found in 15 out of a total of 100 mothers of cases and 10 out of a
total of 200 mothers of controls. Calculate the odds ratio for exposure and interpret the
result.
14. The data given in the following table are reproduced from a Case Study and represent
employees laid off by the U.S. Department of Labor (taken from Utts J.M., Seeing Through
Statistics, p. 238).

Ethnic Group        Laid off: Yes   Laid off: No   Total   % Laid Off
African American         130            1382        1512       8.6
White                     87            2813        2900       3.0
Total                    217            4195        4412

a) Compute the odds of being retained to being laid off for each ethnic group.
b) Use your results in part (a) to compute the odds ratio. Write a sentence to explain
your answer in a form that could be understood by someone who knows nothing
about statistics.

Solution to the revision exercises

1. Why are case-control studies sometimes called retrospective studies?
The study starts with the end point or disease state, eg lung cancer, and goes backward in
time to identify risk factors, eg history of smoking or exposure to asbestos. There is no
period of follow-up in case-control studies as the disease (outcome) and exposure have
already occurred at the time of study.

2. State in words the meaning of an odds ratio, remembering that the data about
risk factors are collected retrospectively.
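The odds ratio is an approximation of the relative risk of disease in those exposed to a
risk factor compared with those who were not exposed to the risk factor. Because the
exposure data are collected retrospectively, it is obtained by dividing the odds of exposure
among the cases by the odds of exposure among the controls; this is numerically equal to
the odds of an exposed person having the disease divided by the odds of
a non-exposed person having the disease.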

3. In a study to examine the association between smoking and drinking and the
development of head and neck cancer, from where would you source cases and
controls? What factors would you match for? What data would you collect and
how could the data be obtained?
Cases could be selected from newly diagnosed patients attending a particular hospital
over a specified time period for treatment of cancer of the head and neck. Controls could
be sourced from the general population in the catchment area of the hospital through
electoral rolls or, alternatively, from patients attending the same hospital for the treatment
of disease not related to smoking or alcohol intake.
The subjects should be matched for age (in 5-10 year groups) and sex.
Details of sex, age, tobacco consumption, alcohol consumption, education and occupation
could be collected. These data could be obtained through medical records or through
interviewing the subjects using a pre-designed questionnaire. The latter would be
preferred as medical records are often incomplete and a poor source of information on
exposure.

4. A case-control study of bladder cancer was carried out in a region with a large
dye manufacturing plant. Newly diagnosed cases of bladder cancer in that region
were identified from the state cancer registry. Controls were identified by telephone
sampling in the region and matched for age and sex. Exposures to chemicals, dyes,
dietary factors and smoking were assessed and odds ratios for these exposures were
calculated. During this process, the investigators learned that the dye plant had
recently begun a medical surveillance program for its employees that included
screening for bladder cancer. A few subclinical cases of bladder cancer were
detected through the medical surveillance program and were included among the
case group. Could the differential medical surveillance of the dye manufacturing
workers bias the results of this study? If so, why?
Yes, the results would be biased. Subclinical cases of bladder cancer among individuals in
the community who were not dye workers would have gone undetected and so would not
have been included in the study. Dye workers are therefore over-represented among the
cases, which inflates the odds ratio for exposure to dyes.

5. A case-control study of lung cancer used identified cases of newly diagnosed lung
cancer and age/sex matched controls diagnosed with other conditions in the same
hospital. Rather than interviewing cases and controls, investigators reviewed the
medical histories to collect data on exposures. The medical charts of the lung cancer
cases were 8.5 times more likely to mention a history of asbestos exposure compared
with the medical histories of the controls. Is this differential or non-differential
misclassification? What impact might this have had on the outcome of the study?
This represents differential misclassification, since the misclassification of asbestos
exposure is different between the cases and controls.
This can produce bias in either direction, raising or lowering the estimation of risk.
Because most physicians are aware of the relationship between asbestos and lung cancer,
they are likely to ask lung cancer patients if they had ever been exposed to asbestos. For
most other patients, they are unlikely to ask about asbestos or other occupational
exposures and asbestos exposure is less likely to be noted for the controls, compared with
the cases. This is another illustration of medical records being a poor source of data on
exposure!
6. Compare the use and interpretation of a relative risk and an odds ratio.
A relative risk measures the likelihood of developing the disease in the exposed group
relative to those who are unexposed. Relative risk is used to measure the association
between exposure and outcome for data collected in a cohort study or randomised clinical
trial.
For a case-control study, the relative risk cannot be calculated directly. Instead, the odds
ratio is used, which compares the odds of exposure to a risk factor among cases with that among
controls. For a rare condition, the odds ratio approximates the relative risk.

7. How would you explain to someone the difference between an absolute and a
relative risk?
Absolute risk is the risk of having a disease, e.g. if the incidence of a disease is 1 in 1000,
then the absolute risk is 1 in 1000 or 0.1%.
Relative risk is the risk of the outcome of interest in the exposed group relative to the risk
of the outcome in the unexposed group, e.g. a relative risk of 3.0 means that the risk of the
outcome in the exposed group is three times that in the unexposed group.

8. Why can't you calculate an incidence rate from a case-control study?

The design of case-control studies prevents the measurement of absolute risk or incidence
of disease, as the subjects are not followed forward in time and the proportion of subjects
with the disease of interest is fixed by the study design.

9. Why is it easier to study a rare disease with a case-control study than with a
cohort study?
Both cohort (prospective) and case-control (retrospective) studies can be utilised to
examine associations between risk factors and disease. However to study a rare disease,
cohort studies will require a very large sample size and extensive follow-up of subjects
over time. Thus, many persons must be studied over a long period to obtain even a few
cases that develop the end-point of interest. In contrast, a case-control study utilises
existing cases and requires a much smaller sample size. Case-control studies are
therefore cheaper to carry out and provide results much faster.

10. Read the following supplementary article:

Siskind V, Green A, Bain C, Purdie D. Breastfeeding, menopause, and epithelial
ovarian cancer. Epidemiology 1997;8:188-191.

(i) How were the cases and controls selected? What potential biases could
have been introduced by this selection method?
Histologically confirmed incident cases of epithelial ovarian cancer were recruited from
the gynaecologic oncology treatment centres in Victoria, New South Wales, and
Queensland. The controls were randomly selected from the electoral rolls to be similar to
the anticipated age and geographic distribution of the cases. Misclassification should not
be a problem with the cases, but may exist with the controls. As with any case control
study, recall bias is a potential problem with the cases.

(ii) How was the exposure measured? How could this method have
biased the results?

Breastfeeding history was obtained through interview with both the cases and the
controls. Again, recall bias is a potential problem with self-report of exposure.

(iii) What information was provided about the non-participants? Based on
this information, do you think that it is likely that non-participation could
have influenced the results?
No information, other than number of women, is given about the non-participants.
Therefore, it is not possible to determine if non-response could have biased these results.

(iv) Calculate the unadjusted odds ratio that measures the association
between unsupplemented breastfeeding and epithelial ovarian
cancer for all women. (N.B. the data will need to be extracted from
Table 1 on p. 190). Interpret this odds ratio.

                     Cases   Controls   Total
Breastfed             450      569      1,019
Did not breastfeed    168      155        323
Total                 618      724      1,342

OR = ad/bc
= (450 x 155) / (569 x 168)
= 69,750 / 95,592
= 0.73
i.e. the unadjusted odds ratio of 0.73 for all women indicates that women with a history
of breastfeeding have about 27% lower odds of developing epithelial ovarian cancer than
women without a history of breastfeeding.

(v) How would you interpret the overall results of this study for a lay
person? Do you trust the results of this one study?

In this case control study, breastfeeding was found to be protective for epithelial ovarian
cancer in pre-menopausal women, but the protective effect was not present for postmenopausal women. Before it is accepted that breastfeeding provides protection against
epithelial ovarian cancer, the results of this study would need to be reproduced by other
larger studies in a range of population groups.
11. This is a case-control study, so the appropriate measure of association is the odds ratio:

$$\mathrm{OR} = \frac{ad}{bc} = \frac{148 \times 343}{372 \times 75} = 1.82$$

The odds of bronchial carcinoma among those exposed to asbestos are 1.82 times
the odds among those who are unexposed.

12. A case-control study of myocardial infarction gave results for prior alcohol
consumption as shown. Calculate the odds ratio for each category of drinkers
compared with the nondrinkers. Does the table show a positive or negative association
between alcohol consumption and myocardial infarction?
Standard drinks per day   Cases   Controls
0 (non-drinkers)           136      110
≤2                         202      238
3-5                         42       46
≥6                          11       24
Total                      391      418


a) Odds ratio for the ≤2 standard drinks group compared with the non-drinkers:
OR = (202 x 110)/(238 x 136) = 0.69

b) Odds ratio for the 3-5 standard drinks group compared with the non-drinkers:
OR = (42 x 110)/(46 x 136) = 0.74

c) Odds ratio for the ≥6 standard drinks group compared with the non-drinkers:
OR = (11 x 110)/(24 x 136) = 0.37

d) Odds ratio for all drinkers compared with the non-drinkers:
OR = (255 x 110)/(308 x 136) = 0.67

All of the odds ratios are below 1, so the table shows a negative association: alcohol
consumption appears to have a protective effect against acute myocardial infarction (AMI).
The odds of an AMI are lowest in the heaviest drinking category, although the decrease
across categories is not strictly monotonic (0.69, 0.74, 0.37).
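The four odds ratios above can be reproduced with a short script. This is only a sketch: the dictionary keys are labels made up for the example, and the category labels "<=2" and ">=6" are an assumption about the drinking categories in the table. Each category is compared with the non-drinkers as the reference group.

```python
# Counts from the myocardial infarction table; "0" is the non-drinker reference group.
cases = {"0": 136, "<=2": 202, "3-5": 42, ">=6": 11}
controls = {"0": 110, "<=2": 238, "3-5": 46, ">=6": 24}
reference = "0"

def odds_ratio_vs_reference(level):
    """Odds of drinking at this level (vs not drinking) in cases over controls."""
    return (cases[level] * controls[reference]) / (controls[level] * cases[reference])

category_ors = {level: round(odds_ratio_vs_reference(level), 2)
                for level in cases if level != reference}

# Pool all drinking categories for the overall drinker vs non-drinker comparison.
drinker_cases = sum(n for level, n in cases.items() if level != reference)
drinker_controls = sum(n for level, n in controls.items() if level != reference)
all_drinkers_or = round(
    (drinker_cases * controls[reference]) / (drinker_controls * cases[reference]), 2)

print(category_ors)      # {'<=2': 0.69, '3-5': 0.74, '>=6': 0.37}
print(all_drinkers_or)   # 0.67
```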

13. In a study of risk factors for congenital defects of the neural tube, maternal
deficiency of folate was found in 15 out of a total of 100 mothers of cases and 10 out of
a total of 200 mothers of controls. Calculate the odds ratio for exposure and interpret
the result.

                                     Cases   Controls   Total
Maternal folate deficiency present     15       10        25
Maternal folate deficiency absent      85      190       275
Total                                 100      200       300

Odds ratio = ad / bc
= (15 x 190) / (10 x 85)
= 2,850 / 850
= 3.4
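The odds of maternal folate deficiency were 3.4 times higher among mothers of cases than
among mothers of controls. Because neural tube defects are rare, this suggests that babies
whose mothers had a folate deficiency were roughly 3.4 times more likely to suffer a
congenital defect of the neural tube than babies whose mothers were not folate
deficient.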
14. The data given in the following table are reproduced from a Case Study and represent
employees laid off by the U.S. Department of Labor (taken from Utts J.M., Seeing Through
Statistics, p. 238).

Ethnic Group        Laid off: Yes   Laid off: No   Total   % Laid Off
African American         130            1382        1512       8.6
White                     87            2813        2900       3.0
Total                    217            4195        4412

a) Compute the odds of being retained to being laid off for each ethnic group.

The odds are 1,382 to 130, or about 10.6 to 1, for African Americans and 2,813
to 87, or about 32.3 to 1, for Whites.
b) Use your results in part (a) to compute the odds ratio. Write a sentence to explain
your answer in a form that could be understood by someone who knows nothing
about statistics.

The odds ratio is (2,813/87) / (1,382/130) = 3.04. That means the odds of being laid
off rather than retained are about 3 times higher for African Americans than
for Whites.
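As a check on parts (a) and (b), the sketch below recomputes the odds and the odds ratio from the table. The dictionary and variable names are illustrative only. It also shows that the ratio comes out the same whether it is built from the odds of being retained or from the odds of being laid off, provided the groups are swapped accordingly.

```python
# Counts from the lay-off table.
laid_off = {"African American": 130, "White": 87}
retained = {"African American": 1382, "White": 2813}

# Part (a): odds of being retained to being laid off, per group.
odds_retained = {group: retained[group] / laid_off[group] for group in laid_off}

# Part (b): odds ratio built from the odds of being retained (White vs African American).
or_from_retained = odds_retained["White"] / odds_retained["African American"]

# The same ratio built from the odds of being laid off (African American vs White).
odds_laid_off = {group: laid_off[group] / retained[group] for group in laid_off}
or_from_laid_off = odds_laid_off["African American"] / odds_laid_off["White"]

for group, odds in odds_retained.items():
    print(f"{group}: about {odds:.1f} to 1")
print(round(or_from_retained, 2), round(or_from_laid_off, 2))
```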

References
Beaglehole R, Bonita R, Kjellstrom T (1993). Basic Epidemiology. Geneva: WHO.
Dawson B, Trapp RG (2001). Basic and Clinical Biostatistics, 3rd Edition
(international edition). Lange Medical Books/McGraw-Hill.
Gordis L (2013). Epidemiology (5th ed.). Chapters 10, 11, 13.
Siskind V, Green A, Bain C, Purdie D (1997). Breastfeeding, menopause, and epithelial
ovarian cancer. Epidemiology, Vol. 8, pp. 188-191.
Kelsey JL, Thompson WD, Evans AS (1986). Methods in Observational Epidemiology.
New York: Oxford University Press.
Rosner B (2006). Fundamentals of Biostatistics, 6th Edn. Thomson, Australia.
Schlesselman JJ (1982). Case-Control Studies: Design, Conduct, Analysis. Oxford:
Oxford University Press.
Applications of the Case-control Method. Epidemiologic Reviews, 1994, Vol. 16, pp. 1-164.
Utts JM (2005). Seeing Through Statistics, Third Edition. Brooks/Cole Cengage
Learning, CA, USA.
