A Short Introduction to Epidemiology
Second Edition
Neil Pearce
Centre for Public Health Research
Massey University Wellington Campus
Private Box 756
Wellington, New Zealand
Phone: 64-4-3800-606
Fax: 64-4-3800-600
E-mail: cphr@massey.ac.nz
Website: http://www.publichealth.ac.nz/
2nd edition
February 2005
ISBN 0-476-01236-8
ISSN 1176-1237
To Irihapeti Ramsden
Preface
Part 2 then addresses study design issues. Chapter 5 discusses issues of study size and precision. Chapter 6 considers general issues of validity, namely selection bias, information bias, and confounding. Chapter 7 discusses effect modification.

Part 3 then discusses the practical issues of conducting a study. Chapter 8 addresses issues of measurement of exposure and disease. Chapters 9-11 then discuss the conduct of cohort studies, case-control studies and cross-sectional studies respectively.

Finally, Part 4 considers what happens after the data are collected, with chapter 12 addressing data analysis and chapter 13 the interpretation of the findings of epidemiologic studies.

I should stress that this book provides no more than a very preliminary introduction to the field. In doing so I have attempted to use a wide range of examples, which give some indication of the broad range of situations in which epidemiologic methods can be used. However, there are undoubtedly many other types of epidemiologic hypotheses and epidemiologic studies which are not represented in this book. In particular, my focus is on the use of epidemiology in public health, particularly with regard to non-communicable disease, and I include few examples from clinical epidemiology or from communicable disease outbreak investigations. Nevertheless, I hope that the book will be of interest not only to epidemiologists, but also to others who have other training but are involved in epidemiologic research, including public health professionals, policy makers, and clinical researchers.

Neil Pearce
Centre for Public Health Research
Massey University Wellington Campus
Private Box 756
Wellington, New Zealand
Acknowledgements
A Short Introduction to Epidemiology
Contents
1. Introduction
– Germs and miasmas
– Risk factor epidemiology
– Epidemiology in the 21st century

PART 1: STUDY DESIGN OPTIONS

2. Incidence studies
– Incidence studies
– Incidence case-control studies

3. Prevalence studies
– Prevalence studies
– Prevalence case-control studies

4. More complex study designs
– Other axes of classification
– Continuous outcome measures
– Ecologic and multilevel studies

PART 3: CONDUCTING A STUDY

8. Measurement of exposure and health status
– Exposure
– Health status

9. Cohort studies
– Defining the source population and risk period
– Measuring exposure
– Follow-up

10. Case-control studies
– Defining the source population and risk period
– Selection of cases
– Selection of controls
– Measuring exposure
CHAPTER 1. Introduction
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)
Table 1.1
               Prevention              Treatment
----------------------------------------------------------------------
Populations    Public health           Health systems research
Individuals    Primary health care/    Medicine (including primary health care)
               Health education
----------------------------------------------------------------------
1.1 Germs and Miasmas
Table 1.2
Deaths and death rates from cholera in London 1854 in households supplied by the Southwark and Vauxhall Water Company and by the Lambeth Water Company
------------------------------------------------------------------------------------------------
                          Houses     Cholera deaths    Deaths per 10,000 houses
------------------------------------------------------------------------------------------------
Southwark and Vauxhall    40,046     1,263             315
Lambeth Company           26,107     98                37
Rest of London            256,423    1,422             59
------------------------------------------------------------------------------------------------
Source: (Snow, 1936; quoted in Winkelstein, 1995)
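The death rates in table 1.2 are cholera deaths divided by houses supplied, scaled to 10,000 houses. A minimal Python sketch of that arithmetic for the two water companies follows; the variable and function names are illustrative and do not come from the source.

```python
# Houses and cholera deaths for the two water companies, from table 1.2.
table_1_2 = {
    "Southwark and Vauxhall": {"houses": 40_046, "deaths": 1_263},
    "Lambeth Company": {"houses": 26_107, "deaths": 98},
}


def deaths_per_10000_houses(houses: int, deaths: int) -> float:
    """Cholera deaths per 10,000 houses supplied."""
    return 10_000 * deaths / houses


rates = {name: deaths_per_10000_houses(**row) for name, row in table_1_2.items()}
rate_ratio = rates["Southwark and Vauxhall"] / rates["Lambeth Company"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} deaths per 10,000 houses")
print(f"Ratio of the two death rates: {rate_ratio:.1f}")
```

The computed rates agree with the tabulated 315 and 37 to within rounding, and their ratio is what lies behind the comparison of the two companies in the text.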
Perhaps the most commonly quoted epidemiologic legend is that of Snow, who studied the causes of cholera in London in the mid-19th century (Winkelstein, 1995). Snow was able to establish that the cholera death rate was much higher in areas supplied by the Southwark and Vauxhall Company, which took water from the Thames downstream from London (i.e. after it had been contaminated with sewage), than in areas supplied by the Lambeth Company, which took water from upstream, with the death rates being intermediate in areas served by both companies. Subsequently, Snow (1936) studied the area supplied by both companies, and within this area walked the streets to determine, for each house in which a cholera death had occurred, which company supplied the water. The death rate was almost ten times as high in houses supplied with water containing sewage (table 1.2).

Although epidemiologists and other researchers continue to battle over Snow’s legacy and its implications for epidemiology today (Cameron and Jones, 1983; Loomis and Wing, 1991; Samet, 2000; Vandenbroucke, 1994), it is clear that Snow was able to discover, and establish convincing proof for, the mode of transmission of cholera, and to take preventive action several decades before the biological basis of his observations was understood. Thus, it was not until several decades after the work of Snow that Pasteur and others established the role of the transmission of specific pathogens in what became known as the “infectious diseases”, and it was another century, in most instances, before effective vaccines or antibiotic treatments became available. Nevertheless, a dramatic decline in mortality from these diseases occurred from the mid-nineteenth century, long before the development of modern pharmaceuticals. This has been attributed to improvements in nutrition, sanitation, and general living conditions (McKeown, 1979), although it has been argued that specific public health interventions on factors such as urban congestion actually played the major role (Szreter, 1988).
2001). Subsequent decades have seen major discoveries relating to other causes of chronic disease such as asbestos, ionizing radiation, viruses, diet, outdoor air pollution, indoor air pollution, water pollution, and genetic factors. These epidemiologic successes have in some cases led to successful preventive interventions without the need for major social or political change. For example, occupational carcinogens can, with some difficulty, be controlled through regulatory measures, and exposures to known occupational carcinogens have been reduced in industrialized countries in recent decades. Another example is the successful World Health Organisation (WHO) campaign against smallpox. More recently, some countries have passed legislation to restrict advertising of tobacco and smoking in public places and have adopted health promotion programmes aimed at changes in "lifestyle".

Individual lifestyle factors would ideally be investigated using a randomised controlled trial, but this is often unethical or impractical (e.g. tobacco smoking). Thus, it is necessary to do observational studies, and epidemiology has made major contributions to the understanding of the role of individual lifestyle factors in health. Because such factors would ideally be investigated in randomised controlled trials, and in fact would be ideally suited to such trials if it were not for the ethical and practical constraints, epidemiologic theory and practice have, quite appropriately, been based on the theory and practice of randomised trials. Thus, the aim of an epidemiologic study investigating the effect of a specific risk factor (e.g. smoking) on a particular disease (e.g. lung cancer) is to obtain the same findings that would have been obtained from a randomised controlled trial. Of course, an epidemiologic study will usually experience more problems of bias than a randomised controlled trial, but the randomised trial is the “gold standard”.

This approach has led to major developments in epidemiologic theory (presented most elegantly and comprehensively in Rothman and Greenland, 1998). In particular, there have been major developments in the theory of cohort studies (which mimic a randomised trial, but without the randomisation) and case-control studies (which attempt to obtain the same findings as a full cohort study, but in a more efficient manner). It is these basic methods, which follow a randomised controlled trial “paradigm”, which receive most of the attention in this short introductory text. However, while presenting these basic methods, it is important also to recognise their limitations, and to consider different or more complex methods that may be more appropriate when epidemiology is used in the public health context.
lifestyle, and little attention paid to the population-level determinants of health (Susser and Susser, 1996a, 1996b; Pearce, 1996; McMichael, 1999). Furthermore, the success of risk factor epidemiology has been more temporary and more limited than might have been expected. For example, the limited success of legislative measures in industrialised countries has led the tobacco industry to shift its promotional activities to developing countries so that more people are exposed to tobacco smoke than ever before (Barry, 1991; Tominaga, 1986). Similar shifts have occurred for some occupational carcinogens (Pearce et al, 1994). Thus, on a global basis the "achievement" of the public health movement has often been to move public health problems from rich countries to poor countries and from rich to poor populations within the industrialized countries.

It should be acknowledged that not all epidemiologists share these concerns (e.g. Savitz, 1994; Rothman et al, 1998; Poole and Rothman, 1998), and some have regarded these discussions as an attack on the field itself, rather than as an attempt to broaden its vision. Nevertheless, the debate has progressed and there is an increasing recognition of the importance of taking a more global approach to epidemiologic research and of the importance of maintaining an appropriate balance and interaction between macro-level (population), individual-level (e.g. lifestyle), and micro-level (e.g. genetic) research (Pearce, 2004).

There are three crucial concepts which have received increasing attention in this regard.

The Importance of Context

The first, and most important issue, is the need to consider the population context when conducting epidemiologic studies. Even if one is focusing on individual “lifestyle” risk factors, there is good reason to conduct studies at the population level (Rose, 1992). Moreover, every population has its own history, culture, and economic and social divisions which influence how and why people are exposed to specific risk factors, and how they respond to such exposures. For example, New Zealand (Aotearoa) was colonised by Great Britain more than 150 years ago, resulting in major loss of life by the indigenous people (the Māori). It is commonly assumed that this loss of life occurred primarily due to the arrival of infectious diseases to which Māori had no natural immunity. However, a more careful analysis of the history of colonisation throughout the Pacific reveals that the indigenous people mainly suffered major mortality from imported infectious diseases when their land was taken (Kunitz, 1994), thus disrupting their economic base, food supply and social networks. This example is not merely of historical interest, since it is these same infectious diseases that have returned in strength in Eastern Europe in the last decade, after lying dormant for nearly a century (Bobak and Marmot, 1996). Similarly, the effects of occupational carcinogens may be greater in developing countries where workers may be relatively young or may be affected by malnutrition or other diseases (Pearce et al, 1994).

These issues are likely to become more important because, not only is epidemiology changing, but the world that epidemiologists study is also rapidly changing. We are seeing the effects of economic globalization, structural adjustment (Pearce et al, 1994) and climate change (McMichael, 1993, 1995), and the last few decades have seen the occurrence of the “informational revolution” which is having effects as great as the previous agricultural and industrial revolutions (Castells, 1996).
In industrialized countries, this is likely to prolong life expectancy for some, but not all, sections of the population. In developing countries, the benefits have been even more mixed (Pearce et al, 1994), while the countries of Eastern Europe are experiencing the largest sudden drop in life expectancy that has been observed in peacetime in recorded human history (Bobak and Marmot, 1996) with a major rise in alcoholism and “forgotten” diseases such as tuberculosis and cholera.

This increased interest in population-level determinants of health has been particularly marked by increased interest in techniques such as multilevel modelling which allow individual lifestyle risk factors to be considered “in context” and in parallel with macro-level determinants of health (Greenland, 2000). Such a shift in approach is important, not only because of the need to emphasize the role of diversity and local knowledge (Kunitz, 1994), but also because of the more general moves within science to consider macro-level systems and processes (Cohen and Stewart, 1994) rather than taking a solely reductionist approach (Pearce, 1996).

Problem-Based Epidemiology

A second issue is that a problem-based approach may be particularly valuable in encouraging epidemiologists to focus on the major public health problems and to take the population context into account (Pearce, 2001; Thacker and Buffington, 2001). A problem-based approach to teaching clinical medicine has been increasingly adopted in medical schools around the world. The value of this approach is that theories and methods are taught in the context of solving real-life problems. Starting with “the problem” at the population level provides a “reality check” on existing etiological theories and identifies the major public health problems which new theories must be able to explain. A fruitful research process can then be generated with positive interaction between epidemiologists and other researchers. Studying real public health problems in their historical and social context does not exclude learning about sophisticated methods of study design and data analysis (in fact, it necessitates it), but it may help to ensure that the appropriate questions are asked (Pearce, 1999).

Appropriate Technology

A related issue is the need to use “appropriate technology” to address the most important public health research questions. In particular, as attention moves “upstream” to the population level (McKinlay, 1993) new methods will need to be developed (McMichael, 1995). One example of this, noted above, is the recent rise in interest in multilevel modelling (Blakely and Woodward, 2000; Pearce, 2000), although it is important to stress that it is an increase in “multilevel thinking” in the development of epidemiologic hypotheses and the design of studies that is required, rather than just the use of new statistical techniques of data analysis. The appropriateness of any research methodology depends on the phenomenon under study: its magnitude, the setting, the current state of theory and knowledge, the availability of valid measurement tools, and the proposed uses of the information to be gathered, as well as the community resources and skills available and the prevailing norms and values at the national, regional or local level (Pearce and McKinlay, 1998). Thus, there has been increased interest in the interface between epidemiology and social science (Krieger, 2000), and in the
development of theoretical and methodological frameworks appropriate for epidemiologic studies in developing countries (Barreto et al, 2001; Barreto, 2004; Loewenson, 2004), and in indigenous people in “Western” countries (Durie, 2004). As noted above, this short introductory text focuses on the most basic epidemiologic methods, but I attempt to refer to more complex issues, and the potential use of more complex methods, where this is appropriate.
Summary
References
on epidemiology in an age of change. Am J Epidemiol 149: 887-97.
McKeown T (1979). The role of medicine. Princeton, NJ: Princeton University Press.
McKinlay JB (1993). The promotion of health through planned sociopolitical change: challenges for research and policy. Soc Sci Med 36: 109-17.
Pearce N (1996). Traditional epidemiology, modern epidemiology, and public health. AJPH 86: 678-83.
Pearce N (1999). Epidemiology as a population science. Int J Epidemiol 28: S1015-8.
Pearce N (2000). The ecologic fallacy strikes back. J Epidemiol Comm Health 54: 326-7.
Pearce N (2001). The future of epidemiology: a problem-based approach using evidence-based methods. Australasian Epidemiologist 8.1: 3-7.
Pearce N (2004). The globalization of epidemiology: introductory remarks. Int J Epidemiol 33: 1127-31.
Pearce N, McKinlay J (1998). Back to the future in epidemiology and public health. J Clin Epidemiol 51: 643-6.
Pearce NE, Matos E, Vainio H, Boffetta P, Kogevinas M (eds) (1994). Occupational cancer in developing countries. Lyon: IARC.
Poole C, Rothman KJ (1998). Our conscientious objection to the epidemiology wars. J Epidemiol Comm Health 52: 613-4.
Rose G (1992). The strategy of preventive medicine. Oxford: Oxford University Press.
Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.
Rothman KJ, Adami H-O, Trichopoulos D (1998). Should the mission of epidemiology include the eradication of poverty? Lancet 352: 810-3.
Samet JM (2000). Epidemiology and policy: the pump handle meets the new millennium. Epidemiologic Reviews 22: 145-54.
Saracci R (1999). Epidemiology in progress: thoughts, tensions and targets. Int J Epidemiol 28: S997-9.
Savitz DA (1994). In defense of black box epidemiology. Epidemiology 5: 550-2.
Schairer E, Schöninger E (2001). Lung cancer and tobacco consumption. Int J Epidemiol 30: 24-7.
Snow J (1936). On the mode of communication of cholera. (Reprint). New York: The Commonwealth Fund, pp 11-39.
Susser M, Susser E (1996a). Choosing a future for epidemiology: I. Eras and paradigms. Am J Publ Health 86: 668-73.
Susser M, Susser E (1996b). Choosing a future for epidemiology: II. From black boxes to Chinese boxes. Am J Publ Health 86: 674-8.
Szreter S (1988). The importance of social intervention in Britain's mortality decline c.1850-1914: a reinterpretation of the role of public health. Soc Hist Med 1: 1-37.
Terris M (1987). Epidemiology and the public health movement. J Publ Health Policy 7: 315-29.
Thacker SB, Buffington J (2001). Applied epidemiology for the 21st century. Int J Epidemiol 30: 320-5.
Tominaga S (1986). Spread of smoking to the developing countries. In: Zaridze D, Peto R (eds). Tobacco: a major international health hazard. Lyon: IARC, pp 125-33.
Vandenbroucke JP (1994). New public health and old rhetoric. Br Med J 308: 994-5.
Winkelstein W (1995). A new perspective on John Snow’s communicable disease theory. Am J Epidemiol 142: S3-9.
Wynder EL, Graham EA (1950). Tobacco smoking as a possible etiologic factor in bronchiogenic carcinoma. J Am Med Assoc 143: 329-38.
Part I
CHAPTER 2. Incidence Studies
In this chapter and the next one I review the possible study designs for the simple situation where individuals are exposed to a particular risk factor (e.g. a particular chemical) and when a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a particular disease). Thus, the aim is to estimate the effect of a (dichotomous) exposure on the occurrence of a (dichotomous) disease outcome or health state.

It should first be emphasized that all epidemiologic studies are (or should be) based on a particular source population (also called the “study population” or “base population”) followed over a particular risk period. Within this framework a fundamental distinction is between studies of disease incidence (i.e. the number of new cases of disease over time) and studies of disease prevalence (i.e. the number of people with the disease at a particular point in time). Studies involving dichotomous outcomes can then be classified according to two questions:

a. Are we studying incidence or prevalence?
b. Is there sampling on the basis of outcome?

The responses to these two questions yield four basic types of epidemiologic studies (Morgenstern and Thomas, 1993; Pearce, 1998):

1. Incidence studies
2. Incidence case-control studies
3. Prevalence studies
4. Prevalence case-control studies

These four study types represent cells in a two-way cross-classification (table 2.1). Such studies may be conducted to describe the occurrence of disease (e.g. to estimate the burden of diabetes in the community by conducting a prevalence survey), or to estimate the effect of a particular exposure on disease (e.g. to estimate whether the incidence of new cases of diabetes is greater in people with a high fat diet than in people with a low fat diet) in order to find out how we can prevent the disease occurring. In the latter situation we are comparing the occurrence of disease in an “exposed” group with that in a “non-exposed” group, and we are estimating the effect of exposure on the occurrence of the disease, while controlling for other known causes of the disease.
Table 2.1
The four basic study types in studies involving a dichotomous health outcome
------------------------------------------------------------------------------
                     Sampling on outcome
Study outcome        No                    Yes
------------------------------------------------------------------------------
Incidence            Incidence studies     Incidence case-control studies
Prevalence           Prevalence studies    Prevalence case-control studies
------------------------------------------------------------------------------
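The cross-classification in table 2.1 can be read as a lookup on the answers to the two questions above. The following Python sketch is one hypothetical way to write that down; the function name and argument names are illustrative, not from the source.

```python
def study_type(measure: str, sampling_on_outcome: bool) -> str:
    """Return the basic study type (table 2.1) for a dichotomous outcome.

    measure: "incidence" or "prevalence" (question a).
    sampling_on_outcome: whether subjects are sampled on outcome (question b).
    """
    if measure not in ("incidence", "prevalence"):
        raise ValueError("measure must be 'incidence' or 'prevalence'")
    name = measure.capitalize()
    if sampling_on_outcome:
        return f"{name} case-control study"
    return f"{name} study"


for measure in ("incidence", "prevalence"):
    for sampled in (False, True):
        print(measure, sampled, "->", study_type(measure, sampled))
```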
Thus, we might conclude that “lung cancer is five times more common in asbestos workers than in other workers, even after we have controlled for differences in age, gender, and smoking”. In some instances we may have multiple categories of exposure (high, medium, low) or individual exposure “scores”, but we will start with the simple situation in which individuals are classified as “exposed” or “non-exposed”.

In this chapter I discuss incidence studies, and in the following chapter I consider prevalence studies. In chapter 4, I then consider studies involving more complex measurements of health status (e.g. continuous lung function or blood pressure measurements) and more complex study designs (ecologic and multilevel studies). As noted in chapter 1, the latter situation is perhaps the norm, rather than the exception, when conducting studies in the public health context. However, for logical and practical reasons I will first address the simpler situation of a dichotomous exposure (in individuals) and a dichotomous health outcome measure.
censored, and after that we stop counting them. This approach is followed because we may not get a fair comparison between the “exposed” and the “non-exposed” groups if they have been followed for different lengths of time, e.g. if one group has many more people lost to follow-up than the other group.

However, the person-time approach would be necessary even if no-one was lost to follow-up and both groups were followed for the same length of time. For example, consider a cohort study of 1,000 exposed and 1,000 non-exposed people in which no-one was lost to follow-up and everyone was followed until they died. Assume also that the exposure causes some deaths so the exposed group, on the average, died at a younger age than the non-exposed group. If we only calculated the percentage of people who died, then it would be 100% in both groups, and we would see no difference. However, if we take into account the person-time contributed by each group, then it becomes clear that both groups had the same number of deaths (1,000), but that in the exposed group these deaths occurred earlier and the person-time contributed was therefore lower. Thus, the average age at death would be lower in the exposed group; to say the same thing another way, the death rate (deaths divided by person-years) would be higher. To see this, we need to consider not only how many people were in each group, but how much person-time they contributed, i.e. how long they were followed for.
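The contrast between the percentage who died and the death rate per person-year can be illustrated with a small Python sketch. The group sizes and death counts follow the example in the text; the mean follow-up times (30 and 40 years) are invented purely for illustration and do not come from the source.

```python
# Hypothetical cohort from the text: 1,000 exposed and 1,000 non-exposed
# people, all followed until death. The mean follow-up times are assumed
# values chosen for illustration only.
cohort = {
    "exposed": {"people": 1_000, "deaths": 1_000, "mean_years_followed": 30.0},
    "non-exposed": {"people": 1_000, "deaths": 1_000, "mean_years_followed": 40.0},
}

summary = {}
for group, g in cohort.items():
    person_years = g["people"] * g["mean_years_followed"]
    summary[group] = {
        "proportion_died": g["deaths"] / g["people"],
        "death_rate": g["deaths"] / person_years,  # deaths per person-year
    }

for group, s in summary.items():
    print(f"{group}: {s['proportion_died']:.0%} died, "
          f"{s['death_rate']:.4f} deaths per person-year")
```

Under these assumed follow-up times both groups show 100% mortality, while the death rate is higher in the exposed group because the same number of deaths is spread over less person-time.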
Figure 2.1
Occurrence of disease in a hypothetical population followed from birth
Example 2.1
Martinez et al (1995) studied 1246 newborns in the Tucson, Arizona area enrolled between May 1980 and October 1984. Parents were contacted shortly after the children were born, and completed a questionnaire about their history of respiratory illness, smoking habits, and education. Further parental questionnaires were completed during the child’s second year of life and again at six years. At the age of six years, 51.5% of the children had never wheezed, 19.9% had had at least one lower respiratory tract illness with wheezing during the first three years of life but had no wheezing at six years, 15.0% had no wheezing before the age of three years but had wheezing at six years, and 13.7% had wheezing both before three years of age and at six years. The authors concluded that the majority of infants with wheezing have transient conditions and do not have increased risks of asthma or allergies later in life.
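One way to read the percentages in Example 2.1 is to ask what share of the children who wheezed before age three no longer wheezed at six. The Python sketch below does that arithmetic using only the reported percentages; the variable names are illustrative labels, not the authors' terminology.

```python
# Percentages of children at age six reported in Example 2.1.
never_wheezed = 51.5
early_only = 19.9    # wheezing before age three, none at six
late_only = 15.0     # no wheezing before age three, wheezing at six
early_and_late = 13.7  # wheezing before age three and at six

total = never_wheezed + early_only + late_only + early_and_late
early_wheezers = early_only + early_and_late
share_not_wheezing_at_six = early_only / early_wheezers

print(f"Categories sum to {total:.1f}% (rounding in the reported figures)")
print(f"Early wheezers with no wheezing at six: {share_not_wheezing_at_six:.1%}")
```

On these figures a little under 60% of early wheezers had no wheezing at six, which is consistent with the conclusion quoted in the example.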
persons living in a particular geographical area) and incidence studies involving sampling on the basis of exposure, since the latter procedure merely redefines the source population (cohort) (Miettinen, 1985).

Measures of Disease Occurrence

I will briefly review the basic measures of disease occurrence that are used in incidence studies, using the notation depicted in table 2.2 which shows the findings of a hypothetical incidence study of 20,000 persons followed for 10 years (statistical analyses using these measures are discussed further in chapter 12).

Three measures of disease incidence are commonly used in incidence studies.

Perhaps the most common measure of disease occurrence is the person-time incidence rate (or hazard rate, force of mortality or incidence density (Miettinen, 1985)) which is a measure of the disease occurrence per unit population time, and has the reciprocal of time as its dimension. In this example (table 2.2), there were 952 cases of disease diagnosed in the non-exposed group during the ten years of follow-up, which involved a total of 95,163 person-years; this is less than the total possible person-time of 100,000 person-years since people who developed the disease before the end of the ten-year period were no longer “at risk” of developing it, and stopped contributing person-years at that time (for simplicity I have ignored the problem of people whose disease disappears and then reoccurs over time, and I have assumed that we are studying the incidence of the first occurrence of disease). Thus, the incidence rate in the non-exposed group (b/Y0) was 952/95,163 = 0.0100 (or 1000 per 100,000 person-years).

A second measure of disease occurrence is the incidence proportion or average risk which is the proportion of people who experience the outcome of interest at any time during the follow-up period (the incidence proportion is often called the cumulative incidence, but the latter term is also used to refer to cumulative hazards (Breslow and Day, 1987)). Since it is a proportion it is dimensionless, but it is necessary to specify the time period over which it is being measured. In this instance, there were 952 incident cases among the 10,000 people in the non-exposed group, and the incidence proportion (b/N0) was therefore 952/10,000 = 0.0952 over the ten year follow-up period. When the outcome of interest is rare over the follow-up period (e.g. an incidence proportion of less than 10%), then the incidence proportion is approximately equal to the incidence rate multiplied by the length of time that the population has been followed (in the example, this product is 0.1000 whereas the incidence proportion is 0.0952). I have assumed, for simplicity, that no-one was lost to follow-up during the study period (and therefore stopped contributing person-years to the study). However, as noted above, when this assumption is not valid (i.e. when a significant proportion of people have died or have been lost to follow-up), then the incidence proportion cannot be estimated directly, but must be estimated indirectly from the incidence rate (which takes into account that follow-up was not complete) or from life tables (which stratify on follow-up time).
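The incidence rate and incidence proportion for the non-exposed group, and the rare-outcome approximation linking them, can be checked with a few lines of Python. The counts are those quoted in the text; the exponential formula at the end is an added assumption (a constant rate in a closed cohort), not something the text states.

```python
import math

# Non-exposed group figures quoted in the text (table 2.2 notation).
b = 952        # incident cases
N0 = 10_000    # persons at risk at the start of follow-up
Y0 = 95_163    # person-years at risk
years = 10     # length of follow-up

incidence_rate = b / Y0          # cases per person-year
incidence_proportion = b / N0    # risk over the ten-year period

# Rare-outcome approximation described in the text: rate x time.
approx_proportion = incidence_rate * years

# Assumption (not stated in the text): with a constant rate and no loss to
# follow-up, the proportion equals 1 - exp(-rate x time).
exponential_proportion = 1 - math.exp(-incidence_rate * years)

print(f"Incidence rate:        {incidence_rate:.4f} per person-year")
print(f"Incidence proportion:  {incidence_proportion:.4f} over {years} years")
print(f"Rate x time:           {approx_proportion:.4f}")
print(f"1 - exp(-rate x time): {exponential_proportion:.4f}")
```

Under the constant-rate assumption the exponential formula reproduces the incidence proportion closely, while the simple product of rate and time slightly overstates it.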
A third possible measure of disease occurrence is the incidence odds (Greenland, 1987) which is the ratio of the number of people who experience the outcome (b) to the number of people who do not experience the outcome (d). As for the incidence proportion, the incidence odds is dimensionless, but it is necessary to specify the time period over which it is being measured. In this example, the incidence odds (b/d) is 952/9,048 = 0.1052. When the outcome is rare over the follow-up period then the incidence odds is approximately equal to the incidence proportion. Once again, if loss to follow-up is significant, then the incidence odds cannot be estimated directly, but must be estimated indirectly from the incidence rate (via the incidence proportion, or via life-table methods). The incidence odds is not very interesting or useful as a measure of disease occurrence, but it is presented here because the incidence odds is used to calculate the incidence odds ratio which is estimated in certain case-control studies (see below).

These three measures of disease occurrence all involve the same numerator: the number of incident cases of disease (b). They differ in whether their denominators represent person-years at risk (Y0), persons at risk (N0), or survivors (d).
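As a brief check on the incidence odds, the Python sketch below computes it from the counts quoted in the text and compares it with the incidence proportion. The identity odds = proportion / (1 - proportion) is standard but is added here rather than taken from the text.

```python
b = 952      # incident cases in the non-exposed group
d = 9_048    # people who did not develop the disease (survivors)
N0 = b + d   # persons at risk at the start of follow-up

incidence_proportion = b / N0
incidence_odds = b / d

# The odds can equivalently be obtained from the proportion.
odds_from_proportion = incidence_proportion / (1 - incidence_proportion)

print(f"Incidence proportion (b/N0): {incidence_proportion:.4f}")
print(f"Incidence odds (b/d):        {incidence_odds:.4f}")
```

Because the denominator excludes the cases, the odds are always somewhat larger than the proportion, and the gap narrows as the outcome becomes rarer.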
Table 2.2
Findings from a hypothetical cohort study of 20,000 persons followed for 10 years
Measures of Effect in Incidence Studies

Corresponding to these three measures of disease occurrence, there are three principal ratio measures of effect which can be used in incidence studies. The measure of interest is often the rate ratio (incidence density ratio), the ratio of the incidence rate in the exposed group (a/Y1) to that in the non-exposed group (b/Y0). In the example in table 2.2, the incidence rates are 0.02 per person-year in the exposed group and 0.01 per person-year in the non-exposed group, and the rate ratio is therefore 2.00.

A second commonly used effect measure is the risk ratio (incidence proportion ratio or cumulative incidence ratio) which is the ratio of the incidence proportion in the exposed group (a/N1) to that in the non-exposed group (b/N0). In this example, the risk ratio is 0.1813/0.0952 = 1.90. When the outcome is rare over the follow-up period the risk ratio is approximately equal to the rate ratio.

A third possible effect measure is the incidence odds ratio which is the ratio of the incidence odds in the exposed group (a/c) to that in the non-exposed group (b/d). In this example the odds ratio is 0.2214/0.1052 = 2.11. When the outcome is rare over the study period the incidence odds ratio is approximately equal to the incidence rate ratio.

These three multiplicative effect measures are sometimes referred to under the generic term of relative risk. Each involves the ratio of a measure of disease occurrence in the exposed group to that in the non-exposed group. The various measures of disease occurrence all involve the same numerators (incident cases), but differ in whether their denominators are based on person-years, persons, or survivors (people who do not develop the disease at any time during the follow-up period). They are all approximately equal when the disease is rare during the follow-up period (e.g. an incidence proportion of less than 10%). However, the odds ratio has been severely criticised as an effect measure (Greenland, 1987; Miettinen and Cook, 1981), and has little intrinsic meaning in incidence studies, but it is presented here because it is the standard effect measure in incidence case-control studies (see below).

Finally, it should be noted that an analogous approach can be used to calculate measures of effect based on differences rather than ratios, in particular the rate difference and the risk difference. Ratio measures are usually of greater interest in etiologic research, because they have more convenient statistical properties, and it is easier to assess the strength of effect and the possible role of various sources of bias when using ratio measures (Cornfield et al, 1951). Thus, I will concentrate on the use of ratio measures in the remainder of this text. However, other measures (e.g. risk difference, attributable fraction) may be of value in certain circumstances, such as evaluating the public health impact of a particular exposure, and I encourage readers to consult standard texts for a comprehensive review of these measures (e.g. Rothman and Greenland, 1998).
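As a rough check on the arithmetic, the three ratio measures can be recomputed from the counts quoted in this chapter. The Python sketch below assumes a = 1,813 and b = 952 incident cases (implied by the incidence proportions 0.1813 and 0.0952 in 10,000 persons per group) and the person-years Y1 = 90,635 and Y0 = 95,613 quoted in section 2.2; the function name is illustrative only.

```python
def ratio_measures(a, b, n1, n0, y1, y0):
    """Return (rate ratio, risk ratio, incidence odds ratio) for a cohort.

    a, b   : incident cases in the exposed / non-exposed groups
    n1, n0 : persons at risk at the start of follow-up
    y1, y0 : person-years at risk
    """
    c, d = n1 - a, n0 - b                 # survivors (never developed disease)
    rate_ratio = (a / y1) / (b / y0)      # denominators: person-years
    risk_ratio = (a / n1) / (b / n0)      # denominators: persons
    odds_ratio = (a / c) / (b / d)        # denominators: survivors
    return rate_ratio, risk_ratio, odds_ratio


# Counts assumed from the figures quoted in the text (see lead-in).
rate_ratio, risk_ratio, odds_ratio = ratio_measures(
    a=1813, b=952, n1=10_000, n0=10_000, y1=90_635, y0=95_613
)
print(f"rate ratio {rate_ratio:.3f}, risk ratio {risk_ratio:.3f}, "
      f"odds ratio {odds_ratio:.3f}")
```

Because the text works from rounded rates and odds, the unrounded results can differ from the quoted 2.00 and 2.11 in the second decimal place; the ordering (risk ratio closest to 1, odds ratio furthest from 1) is the point of the comparison.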
2.2. Incidence Case-Control Studies
Incidence studies are the most comprehensive approach to studying the causes of disease, since they use all of the information about the source population over the risk period. However, they are very expensive in terms of time and resources. For example, the hypothetical study presented in table 2.2 would involve enrolling 20,000 people and collecting exposure information (on both past and present exposure) for all of them. The same findings can be obtained more efficiently by using a case-control design.

An incidence case-control study involves studying all (or a sample) of the incident cases of the disease that occurred in the source population over the risk period, and a control group sampled from the same population over the same period (the possible methods of sampling controls are described below).

Table 2.3 shows the data from a hypothetical case-control study, which involved studying all of the 2,765 incident cases which would have been identified in the full incidence study, and a sample of 2,765 controls (one for each case). Such a case-control study would achieve the same findings as the full incidence study, but would be much more efficient, since it would involve ascertaining the exposure histories of 5,530 people (2,765 cases and 2,765 controls) rather than 20,000. When the outcome under study is very rare, an even more remarkable gain in efficiency can be achieved with very little reduction in the precision of the effect estimate.
Table 2.3
Findings from a hypothetical incidence case-control study based on the cohort in table 2.2
Measures of Effect in Incidence Case-Control Studies

In case-control studies, the relative risk is estimated using the odds ratio.

Suppose that a case-control study is conducted in the study population shown in table 2.2; such a study might involve all of the 2,765 incident cases and a group of 2,765 controls (table 2.3). The effect measure which the odds ratio obtained from this case-control study will estimate depends on the manner in which controls are selected. Once again, there are three main options (Miettinen, 1985; Pearce, 1993; Rothman and Greenland, 1998).

One option, called cumulative (or cumulative incidence) sampling, is to select controls from those who do not experience the outcome during the follow-up period, i.e. the survivors (those who did not develop the disease at any time during the follow-up period). In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds (c/d = 8187/9048 = 1313/1452) of the survivors, and the odds ratio obtained in the case-control study will therefore estimate the incidence odds ratio in the source population over the study period (2.11). Early presentations of the case-control approach usually assumed this context (Cornfield, 1951), and it was emphasised that the odds ratio was approximately equal to the risk ratio when the disease was rare.

It was later recognised that controls can be sampled from the entire source population (those at risk at the beginning of follow-up), rather than just from the survivors (those at risk at the end of follow-up). This approach, which was previously used by Thomas (1972) and Kupper et al (1975), has more recently been termed case-cohort sampling (Prentice, 1986), or case-base sampling (Miettinen, 1982). In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds in the source population of persons at risk at the start of follow-up (N1/N0 = 10000/10000 = 1383/1383), and the odds ratio obtained in the case-control study will therefore estimate the risk ratio in the source population over the study period (1.90). In this instance the method of calculation of the odds ratio is the same as for any other case-control study, but minor changes are needed in the standard methods for calculating confidence intervals and p-values to take into account that some cases may also be selected as controls (Greenland, 1986).

The third approach is to select controls longitudinally throughout the course of the study (Sheehe, 1962; Miettinen, 1976); this is sometimes described as risk-set sampling (Robins et al, 1986), sampling from the study base (the person-time experience) (Miettinen, 1985), or density sampling (Kleinbaum et al, 1982). In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds in the person-time (Y1/Y0 = 90635/95613 = 1349/1416), and the odds ratio obtained in the case-control study will therefore estimate the rate ratio in the study population over the study period (2.00).

Case-control studies have traditionally been presented in terms of cumulative sampling (e.g. Cornfield, 1951), but most case-control studies actually involve density sampling (Miettinen, 1976), often with matching on a time variable such as calendar time or age, and therefore estimate the rate ratio without the need for any rare disease assumption (Sheehe, 1962; Miettinen, 1976; Greenland and Thomas, 1982).
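The dependence of the odds ratio on the control sampling scheme can be illustrated with the numbers quoted above. The following Python sketch takes the case counts (1,813 exposed and 952 non-exposed, summing to the 2,765 incident cases) and the three exposed/non-exposed control splits given in the text; the dictionary labels and function name are illustrative assumptions, not terminology from the book.

```python
# Incident cases (exposed, non-exposed); these sum to the 2,765 cases in the text.
CASES_EXPOSED, CASES_UNEXPOSED = 1813, 952

# Exposed / non-exposed controls under each sampling scheme, as quoted in the text.
CONTROLS = {
    "cumulative (survivors)": (1313, 1452),
    "case-cohort (persons at start of follow-up)": (1383, 1383),
    "density (person-time)": (1349, 1416),
}


def case_control_odds_ratio(a, b, c, d):
    """Exposure odds in cases (a/b) divided by exposure odds in controls (c/d)."""
    return (a / b) / (c / d)


estimates = {
    scheme: case_control_odds_ratio(CASES_EXPOSED, CASES_UNEXPOSED, c, d)
    for scheme, (c, d) in CONTROLS.items()
}

for scheme, value in estimates.items():
    print(f"{scheme}: odds ratio = {value:.2f}")
```

The calculation is identical in all three cases; only the control group changes, and with it the parameter being estimated (incidence odds ratio, risk ratio, or rate ratio respectively).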
Example 2.2
Summary
References
Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Sheehe PR (1962). Dynamic risk analysis of matched pair studies of disease. Biometrics 18: 323-41.

Thomas DB (1972). The relationship of oral contraceptives to cervical carcinogenesis. Obstet Gynecol 40: 508-18.
CHAPTER 3. Prevalence Studies
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)
Incidence studies are ideal for studying events such as mortality or cancer incidence, since they involve collecting and analysing all of the relevant information on the source population and we can get better information on when exposure and disease occurred. However, incidence studies involve lengthy periods of follow-up and large resources, in terms of both time and funding, and it may be difficult to identify incident cases of non-fatal chronic conditions such as diabetes. Thus, in some settings (e.g. some developing countries) and/or for some conditions (e.g. chronic non-fatal disease) prevalence studies are the only option. Furthermore, in some instances we may be more interested in factors which affect the current burden of disease in the population. Consequently, although incidence studies are usually preferable, there is also an important role for prevalence studies, both for practical reasons, and because such studies enable the assessment of the level of morbidity and the population “disease burden” for a non-fatal condition.
The term prevalence denotes the number of cases of the disease under study existing in the source population at a particular time. This can be defined as point prevalence estimated at one point in time, or period prevalence which denotes the number of cases that existed during some time interval (e.g. one year).

The prevalence is a proportion, and the statistical methods for calculating a confidence interval for the prevalence are identical to those for calculating a confidence interval for the incidence proportion (chapter 12).

In some instances, the aim of a prevalence study may simply be to compare the disease prevalence among a specific population with that in other communities or countries. This may be done, for example, in order to discover differences in disease prevalence and to thus suggest possible risk factors for the disease. These further studies may involve testing specific hypotheses by comparing prevalence in subgroups of people who have or have not been exposed to a particular risk factor (e.g. passive smoking) in the past.

Prevalence studies often represent a considerable saving in resources compared with incidence studies, since it is only necessary to evaluate disease prevalence at one point in time, rather than continually searching for incident cases over an extended period of time. On the other hand, this gain in efficiency is achieved at the cost of greater risk of biased inferences, since it may be much more difficult to understand the temporal relationship between various exposures and the occurrence of disease. For example, an exposure that increases the risk of death in people with pre-existing chronic heart disease will be negatively associated with the prevalence of heart disease (in people who are alive!), and will therefore appear to be ‘protective’ against heart disease in a prevalence study.
Example 3.1
Figure 3.1
Twelve month period prevalence of asthma symptoms in 13-14 year old children in
Phase I of the International Study of Asthma and Allergies in Childhood (ISAAC)
[World map not reproduced. Legend categories: 20%; 10 to <20%; 5 to <10%; <5%]
Measures of Effect in Prevalence Studies
Figure 3.2 shows the relationship between incidence and prevalence of disease in a “steady state” population. Assume that the population is in a “steady state” (stationary) over time, in that the numbers within each subpopulation defined by exposure, disease and covariates do not change with time (this usually requires that incidence rates and exposure and disease status are unrelated to the immigration and emigration rates and population size), and that average disease duration (D) does not change over time. Then, if we denote the prevalence of disease in the study population by P, the prevalence odds is equal to the incidence rate (I) times the average disease duration (Alho, 1992):

$$\frac{P}{1-P} = ID$$
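Figure 3.2
[Flow diagram not reproduced. Two compartments: non-asthmatic population, N(1-P), and asthma cases, NP. Flow labels: N(1-P) x I and NP/D, so that P/(1-P) = I x D. P = prevalence, I = incidence, D = duration, N = population.]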
average duration of disease is the same in the exposed and non-exposed groups (i.e. D1 = D0), then the prevalence odds ratio satisfies the equation:

$$\mathrm{POR} = \frac{I_1}{I_0}$$

i.e. under the above assumptions, the prevalence odds ratio directly estimates the incidence rate ratio (Pearce, 2004). However, it should be emphasised that prevalence depends on both incidence and average disease duration, and a difference in prevalence between two groups could entirely depend on differences in disease duration (e.g. because of factors which prolong or exacerbate symptoms) rather than differences in incidence. Changes in incidence rates, disease duration and population sizes over time can also bias the POR away from the rate ratio, as can migration into and out of the population at risk or the prevalence pool.
Table 3.1
Findings from a hypothetical prevalence study of 20,000 persons
                      Exposed          Non-exposed      Ratio
--------------------------------------------------------------
Cases                 909 (a)          476 (b)
Non-cases             9,091 (c)        9,524 (d)
--------------------------------------------------------------
Total population      10,000 (N1)      10,000 (N0)
--------------------------------------------------------------
Prevalence            0.0909 (P1)      0.0476 (P0)      1.91
Prevalence odds       0.1000 (O1)      0.0500 (O0)      2.00
Table 3.1 shows data from a prevalence study of 20,000 people. This is based on the incidence study represented in table 2.2 (chapter 2), with the assumptions that, for both populations, the incidence rate and population size are constant over time, that the average duration of disease is five years, and that there is no migration of people with the disease into or out of the population (such assumptions may not be realistic, but are made here for purposes of illustration). In this situation, the number of cases who "lose" the disease each year is balanced by the number of new cases generated from the source population. For example, in the non-exposed group, there are 476 prevalent cases, and 95 (20%) of these "lose" their disease each year; this is balanced by the 95 people who develop the disease each year (0.0100 of the susceptible population of 9524 people). With the additional assumption that the average duration of disease is the same in the exposed and non-exposed groups, then the prevalence odds ratio (2.00) validly estimates the incidence rate ratio (see table 2.2).
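The steady-state relationship can be checked numerically against table 3.1. The Python sketch below solves P/(1-P) = ID for P and uses the incidence rates (0.02 and 0.01 per person-year), the five-year average duration and the group size of 10,000 stated in the text; the function and variable names are illustrative assumptions.

```python
def steady_state_prevalence(incidence_rate, mean_duration):
    """Solve P / (1 - P) = I * D for the prevalence P."""
    odds = incidence_rate * mean_duration
    return odds / (1 + odds)


GROUP_SIZE = 10_000   # persons in each exposure group (table 3.1)
DURATION = 5.0        # average disease duration in years (stated assumption)

p_exposed = steady_state_prevalence(0.02, DURATION)
p_unexposed = steady_state_prevalence(0.01, DURATION)

cases_exposed = round(p_exposed * GROUP_SIZE)
cases_unexposed = round(p_unexposed * GROUP_SIZE)

# Prevalence odds ratio; equals I1/I0 when duration is the same in both groups.
por = (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))

# Annual flows in the non-exposed group: cases lost vs. new cases arising.
lost_per_year = cases_unexposed / DURATION
new_per_year = 0.01 * (GROUP_SIZE - cases_unexposed)

print(cases_exposed, cases_unexposed, round(por, 2))
print(round(lost_per_year), round(new_per_year))
```

Under these assumptions the sketch reproduces the 909 and 476 prevalent cases of table 3.1, and the two annual flows in the non-exposed group both come to about 95.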
3.2. Prevalence Case-Control Studies
Table 3.2
Findings from a hypothetical prevalence case-control study based on the population
represented in table 3.1
                      Exposed          Non-exposed      Ratio
--------------------------------------------------------------
Cases                 909 (a)          476 (b)
Controls              676 (c)          709 (d)
--------------------------------------------------------------
Prevalence odds       1.34 (O1)        0.67 (O0)        2.00
--------------------------------------------------------------
Example 3.2
Summary
When a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a disease) four main types of studies can be identified: incidence studies, incidence case-control studies, prevalence studies, and prevalence case-control studies (Morgenstern and Thomas, 1993; Pearce, 1998). Prevalence studies involve measuring the prevalence of the disease in the source population at a particular time, rather than the incidence of the disease over time. Prevalence case-control studies involve sampling on the basis of outcome, i.e. they usually involve all prevalent cases in the source population and a control group (of non-cases) sampled from the source population.
References
ISAAC Steering Committee (1998a). Worldwide variation in prevalence of symptoms of asthma, allergic rhinoconjunctivitis and atopic eczema: ISAAC. Lancet 351: 1225-32.

ISAAC Steering Committee (1998b). Worldwide variations in the prevalence of asthma symptoms: International Study of Asthma and Allergies in Childhood (ISAAC). Eur Respir J 12: 315-35.

Morgenstern H, Thomas D (1993). Principles of study design in environmental epidemiology. Environ Health Perspectives 101: S23-S38.

Pearce N (1998). The four basic epidemiologic study types. J Epidemiol Biostat 3: 171-7.

Pearce N (2004). Effect measures in prevalence studies. Environmental Health Perspectives 112: 1047-50.

Pearce NE, Weiland S, Keil U, et al (1993). Self-reported prevalence of asthma symptoms in children in Australia, England, Germany and New Zealand: an international comparison using the ISAAC protocol. Eur Resp J 6: 1455-61.
CHAPTER 4. More Complex Study Designs
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)
In the previous two chapters I reviewed the possible study designs for the simple situation where individuals are exposed to a particular risk factor (e.g. a particular chemical) and when a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a particular disease). I now consider studies involving other axes of classification, continuous measurements of health status (e.g. continuous lung function or blood pressure measurements) and more complex study designs (ecologic and multilevel studies).
The four basic study types discussed in chapters 2 and 3 are defined in terms of: (a) the type of outcome under study (incidence or prevalence); and (b) whether there is sampling on the basis of outcome. They do not involve any consideration of the nature of the exposure data. This provides additional axes of classification.

Continuous Exposure Data

Firstly, it should be noted that in discussing the above classification we have assumed that exposure is dichotomous (i.e. study participants are exposed or not exposed). In reality, there may be multiple exposure categories (e.g. high, medium and low exposure), or exposure may be measured as a continuous variable (see chapter 8). However, although this requires minor changes to the data analysis (see chapter 12), it does not alter the four-fold categorisation of study design options presented above.

The Timing of Collection of Exposure Information

Perhaps the feature that has received the most attention in various classification schemes is the timing of the collection of exposure information. This has dominated discussions of “directionality”, particularly with regard to case-control studies. In fact, for all of the four basic study types, exposure information can be collected prospectively or retrospectively. For example, an incidence study or incidence case-control study of occupational cancer may collect exposure information prospectively, or use historical information that was collected prospectively but abstracted retrospectively by the investigator (e.g. occupational hygiene monitoring records), or use exposure information that was collected retrospectively (e.g. recall of duration and intensity of pesticide use). An unfortunate aspect of some discussions of the merits of case-control studies is that they have often been labelled as “retrospective” studies, when this is in fact not an inherent part of their design. The potential “problem” of bias due to exposure ascertainment errors (e.g. recall bias) arises from the retrospective collection of exposure information, irrespective of whether the study is an incidence, incidence case-control, prevalence, or prevalence case-control study.

Sources of Exposure Information

Another set of issues that occur in practice involve the sources of exposure information (e.g. routine records, job-exposure-matrices, questionnaires, biological samples). However, as noted above, these issues are important in understanding sources of bias but are not fundamental to the classification of study types since, as with issues of directionality, they do not affect the parameterization of the exposure-outcome association.

The Level of Measurement of Exposure

A third additional axis of classification involves the level of measurement of exposure. In particular, in ecologic studies exposure information may be collected on a group rather than on individuals (e.g. average level of meat consumption) although others may still be available for individuals (e.g. age, gender). This situation is discussed in section 4.3.
dust exposure), or at a previous time (e.g. from historical records on past exposure levels) or integrated over time. The key feature of cross-sectional studies is that they involve studying disease at a particular point in time. Exposure information can be collected for current and/or historical exposures, and a wide variety of exposure assessment methods can be used within this general category of study (these are discussed further in chapter 8).

Just as a prevalence case-control study can be based on a prevalence survey, a cross-sectional study can also involve sampling on the basis of the disease outcome. For example, a cross-sectional study of bronchial hyperresponsiveness (BHR) could involve testing all study participants for BHR and then categorising the test results into severe BHR, mild BHR, and no BHR, and then obtaining exposure information on all severe BHR cases and from random samples of the other two groups.

Measures of Effect in Cross-Sectional Studies

In a simple cross-sectional study involving continuous outcome data, the basic methods of statistical analysis involve comparing the mean level of the outcome in “exposed” and “non-exposed” groups, e.g. the mean levels of blood pressure in “exposed” and “non-exposed” people. Standard statistical methods of analysis for comparing means (perhaps after a suitable transformation to normalise the data), and calculating confidence intervals (and associated p-values) for differences between means, can be used to analyse such studies (see chapter 12). More generally, regression methods can be used to model the relationship between the level of exposure (measured as a continuous variable) and the level of the outcome measure (also measured as a continuous variable) (e.g. Armitage et al, 2002).
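A comparison of means of the kind described here might be sketched as follows. The blood pressure values are invented purely for illustration, and the interval uses a simple large-sample normal approximation (z = 1.96) with unpooled variances; this is an assumption of the sketch and not necessarily the method presented in chapter 12.

```python
from math import sqrt
from statistics import mean, variance


def mean_difference_ci(exposed, non_exposed, z=1.96):
    """Difference in means with an approximate 95% confidence interval.

    Uses the unpooled standard error sqrt(s1^2/n1 + s0^2/n0) and a normal
    quantile, which is only a rough approximation in small samples.
    """
    diff = mean(exposed) - mean(non_exposed)
    se = sqrt(variance(exposed) / len(exposed)
              + variance(non_exposed) / len(non_exposed))
    return diff, diff - z * se, diff + z * se


# Hypothetical systolic blood pressure readings (mmHg), not from the text.
exposed = [132, 128, 141, 135, 139, 130]
non_exposed = [124, 129, 121, 133, 126, 127]

diff, lower, upper = mean_difference_ci(exposed, non_exposed)
print(f"difference {diff:.1f} mmHg, approx. 95% CI {lower:.1f} to {upper:.1f}")
```

With real data one would normally use a t-based interval or a regression model, which also allows adjustment for covariates such as age.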
Example 4.1
Longitudinal Studies
Example 4.2
The Tokelau Island Migrant Study (Wessen et al, 1992) examined the effects of migration on development of ‘Western diseases’ within a population which initially had a low incidence of these conditions. Round I surveys were conducted in the Tokelau Islands in 1968/1971, and these were repeated (Round II) in both the Tokelau Islands (1976) and in New Zealand (1975-7). A regression analysis of changes in blood pressure between Round I and Round II (adjusted for age) found that the mean annual increase in blood pressure was greater in those who had migrated than in those who had not: the mean differences were 1.43 for systolic and 1.15 for diastolic in men, and 0.66 and 0.46 respectively in women. These differences in rates of annual increase in blood pressure were maintained in subsequent surveys in men, but not in women.
Time series

One special type of longitudinal study is that of “time series” comparisons in which variations in exposure levels and symptom levels are assessed over time with each individual serving as their own control. Thus, the comparison of “exposed” and “non-exposed” involves the same persons evaluated at different times, rather than different groups of persons being compared (often at the same time) as in other longitudinal studies. The advantage of the time series approach is that it reduces or eliminates confounding (see chapter 6) by factors which vary among subjects but not over time (e.g. genetic factors), or whose day to day variation is unrelated to the main exposure (Pope and Schwartz, 1996). On the other hand, time series data often require special statistical techniques because any two factors that show a time trend will be correlated (Diggle et al, 1994). For example, even a three-month study of lung function in children will generally show an upward trend due to growth, as well as learning effects (Pope and Schwartz, 1996). A further problem is that the change in a measure over time may depend on the baseline value, e.g. changes in lung function over time may depend on the baseline level (Schouten and Tager, 1996).

Time series can involve dichotomous (binary) data, continuous data, or “counts” of events (e.g. hospital admissions) (Pope and Schwartz, 1996), and the changes in these values may be measured over minutes, hours, days, weeks, months or years (Dockery and Brunekreef, 1996). In many instances, such data can be analysed using the standard statistical techniques outlined above. For example, a study of daily levels of air pollution and asthma hospital admission rates can be conceptualised as a study of the incidence of hospital admission in a population exposed to air pollution compared with that in a population not exposed to air pollution. The key difference is that only a single population is involved, and it is regarded as exposed on high pollution days and as non-exposed on low pollution days. Provided that the person-time of exposure is appropriately defined and assessed, then the basic methods of analysis are not markedly different from other studies involving comparisons of exposed and non-exposed groups.

However, the analysis of time series may be complicated because the data for an individual are not independent and serial data are often correlated (Sherrill and Viegi, 1996), i.e. the value of a continuous outcome measure on a particular day may be correlated with the value for the previous day. Furthermore, previous exposure may be as relevant as, or more relevant than, current exposure. For example, the effects of air pollution may depend on exposure on preceding days as well as on the current day (Pope and Schwartz, 1996).
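Serial correlation of the kind described here can be quantified with a lag-1 autocorrelation coefficient. The Python sketch below uses one common estimator (sum of products of adjacent deviations from the overall mean, divided by the total sum of squared deviations) and two made-up series; neither the estimator choice nor the data come from the text.

```python
def lag1_autocorrelation(series):
    """Lag-1 autocorrelation: correlation of each value with the previous one."""
    n = len(series)
    overall_mean = sum(series) / n
    dev = [x - overall_mean for x in series]
    numerator = sum(dev[t] * dev[t + 1] for t in range(n - 1))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


# Hypothetical daily measurements: a steady upward trend (e.g. growth)...
trending = list(range(1, 11))
# ...and a series that flips sign every day.
alternating = [1, -1] * 5

r_trending = lag1_autocorrelation(trending)
r_alternating = lag1_autocorrelation(alternating)
print(r_trending, r_alternating)
```

A trending series gives a strongly positive coefficient, which is why two unrelated factors that both show a time trend will appear correlated, and why methods that assume independent observations can understate uncertainty in such data.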
Example 4.3
Table 4.1
Relative risks* (and 95% CIs) of cardiovascular disease mortality associated with air
pollution concentrations in the Netherlands
------------------------------------------------------------------------------------------------
*Relative risks per 1st to 99th percentile pollution difference
Relative risks per 150 µg/m3 for ozone (8-hour maximum of the previous day), per 120 µg/m3 for CO, per 80 µg/m3 for PM10, per 30 µg/m3 for NO2, and per 40 µg/m3 for black smoke and SO2, all as 7-day moving averages
Source: Hoek et al (2001)
4.3 Ecologic and Multilevel Studies
The basic study designs described in chapters 2 and 3 involved the measurement of exposure and disease in individuals. In this section, I consider more complex study designs in which exposures are measured in populations instead of, or in addition to, individuals.

Ecologic Studies

In ecologic studies exposure information may be collected on a group rather than on individuals. In the past, ecologic studies have been regarded as an inexpensive but unreliable method for studying individual-level risk factors for disease. For example, rather than go to the time and expense to establish a cohort study or case-control study of fat intake and breast cancer, one could simply use national dietary and cancer incidence data and, with minimal time and expense, show a strong correlation internationally between fat intake and breast cancer. In this situation, an ecologic study does not represent a fundamentally different study design, but merely a particular variant of the four basic study designs described in chapter 2 in which information on average levels of exposure in populations is used as a surrogate measure of exposure in individuals.

This approach has been quite rightly regarded as inadequate and unreliable because of the many additional forms of bias that can occur in such studies compared with studies of individuals within a population. In particular, not only will measures of exposure in populations often be poor surrogates for exposures in individuals, but the ‘ecologic fallacy’ (see below) can occur in that factors that are associated with national disease rates may not be associated with disease in individuals (Greenland and Robins, 1994). Thus, ecologic studies have recently been regarded as a relic of the “pre-modern” phase of epidemiology before it became firmly established with a methodologic paradigm based on the theory of randomized controlled trials of individuals.

However, population-level studies are now experiencing a revival for two important reasons (Pearce, 2000).

Firstly, it is increasingly recognised that, even when studying individual-level risk factors, population-level studies play an essential role in defining the most important public health problems to be addressed, and in generating hypotheses as to their potential causes. Many important individual-level risk factors for disease simply do not vary enough within populations to enable their effects to be identified or studied (Rose, 1992). More importantly, such studies are a key component of the continual cycle of theory and hypothesis generation and testing (Pearce, 2000). Historically, the key area in which epidemiologists have been able to “add value” has been through this population focus (Pearce, 1996, 1999). For example, many of the recent discoveries on the causes of cancer (including dietary factors and colon cancer, hepatitis B and liver cancer, aflatoxins and liver cancer, human papilloma virus and cervical cancer) have their origins, directly or indirectly, in the systematic international comparisons of cancer incidence conducted in the 1950s and 1960s (Doll et al, 1966). These suggested hypotheses concerning the possible causes of the international patterns, which were investigated in more depth in further studies. In some instances these hypotheses were consistent with biological knowledge at the time, but in other instances they were new and striking, and might not have been proposed, or investigated further, if the population level analyses had not been done.
Example 4.4
A second reason that ecologic studies are experiencing a revival is that it is increasingly being recognised that some risk factors for disease genuinely operate at the population level (Pearce, 2000). In some instances they may directly cause disease, but perhaps more commonly they may cause disease as effect modifiers or determinants of exposure to individual-level risk factors. For example, being poor in a rich country or neighbourhood may be worse than having the same income level in a poor country or neighbourhood, because of problems of social exclusion and lack of access to services and resources (Yen et al, 1999). The failure to take account of the importance of population context, as an effect modifier and determinant of individual-level exposures, could be termed the “individualistic fallacy” (Diez-Roux, 1998) in which the major population determinants of health are ignored and undue attention is focussed on individual characteristics. In this situation, the associations between these individual characteristics and health can be validly estimated, but their importance relative to other potential interventions, and the importance of the context of such interventions, may be ignored.
Figure 4.1
Association of tuberculosis notification rates for the period 1980-1982 (in countries with
valid tuberculosis notification data) and the prevalence of asthma symptoms in 13-14
year old children in the International Study of Asthma and Allergies in Childhood (ISAAC)
[Scatter plot not reproduced: wheeze in the last 12 months (%, written questionnaire) vs tuberculosis notification rate for the period 1980-1982 in countries with valid tuberculosis notification data]
Source: von Mutius et al (2000)
Example 4.5
Ecologic Fallacies
While stressing the potential value of ecologic analyses, it is also important to recognise their limitations. In particular, ecologic studies are a very poor means of assessing the effects of individual exposures (e.g. diet or tobacco smoking) since confounding (and effect modification) can occur at the individual level, the country (population) level, or both (Morgenstern, 1998). For example, almost any disease that is associated with affluence and westernisation has in the past been associated at the national level with sales of television sets, and nowadays is probably associated at the national level with rates of internet usage. This does not mean that watching television causes every type of disease, but rather that in many instances the association between sales of television sets and disease at the national level is confounded by other exposures (at both the national and individual level). A hypothetical example is given in example 4.6. Another problem is that individual level effects can confound ecologic estimates of population-level effects (Greenland, 2001). These problems of cross-level inference are avoided (or reduced) in multilevel analyses (see below).
Example 4.6
Table 4.2 shows the data for a hypothetical ecological analysis. The numbers of cases and population numbers (and hence disease rates), as well as the percentage of the population exposed, are known for each country. Thus, the numbers of people exposed and non-exposed within each country are known, but it is not known how many cases were exposed and how many were not; thus it is not possible to estimate the rates in the exposed and non-exposed groups within each country. The country-level data indicate a negative association between exposure and disease at the country level: if a regression is performed on the country-level data it indicates (comparing 100% exposure with 0% exposure) a relative risk of 0.5. However, it is not known whether this association applies to individuals, since the data are not available.

Tables 4.3-4.5 give three different scenarios, each of which could generate the data in table 4.2. In table 4.3, there is no confounding at the country level (because the rate in the non-exposed is the same - 200 per 1,000 - in each country), although there could of course still be uncontrolled confounding at the individual level. Thus, the ecologic analysis correctly estimates the individual-level relative risk of 0.5. In table 4.4, there is confounding at the country level (because the rate in the non-exposed differs by country) and there is in fact no association at the individual level. In table 4.5, there is effect modification at the country level, and the relative risk is positive, but of differing magnitude, in all three countries. These three very different situations (a protective effect, no effect, a positive effect which is different in each country) all yield the same country-level data shown in table 4.2.
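The logic of example 4.6 can be sketched numerically. The figures below are illustrative assumptions only (they are not the values of tables 4.2-4.5): three hypothetical countries with 20%, 40% and 60% of the population exposed, and three individual-level scenarios that each reproduce the same country-level rates. The function names are likewise hypothetical.

```python
# Hypothetical country-level data (illustrative, not the book's table 4.2):
# proportion exposed and overall disease rate per 1,000 in three countries.
prop_exposed = [0.2, 0.4, 0.6]
country_rate = [180.0, 160.0, 140.0]

def ecologic_relative_risk(p, rate):
    """Fit rate = a + b*p by least squares, then compare 100% with 0% exposure."""
    n = len(p)
    mean_p = sum(p) / n
    mean_r = sum(rate) / n
    slope = (sum((x - mean_p) * (y - mean_r) for x, y in zip(p, rate))
             / sum((x - mean_p) ** 2 for x in p))
    intercept = mean_r - slope * mean_p
    return (intercept + slope) / intercept

# Each scenario lists (rate in the non-exposed, individual-level relative risk)
# for the three countries. All values are assumptions chosen for illustration.
scenarios = {
    "protective effect, no confounding by country": [(200.0, 0.5)] * 3,
    "no effect, confounding by country": [(180.0, 1.0), (160.0, 1.0), (140.0, 1.0)],
    "harmful effect, modified by country": [(150.0, 2.0), (160.0 / 1.8, 3.0), (50.0, 4.0)],
}

def overall_rates(scenario, p):
    """Country rate implied by the non-exposed rate r0 and relative risk rr."""
    return [r0 * ((1 - pe) + pe * rr) for (r0, rr), pe in zip(scenario, p)]

print("ecologic relative risk:", round(ecologic_relative_risk(prop_exposed, country_rate), 3))
for name, scenario in scenarios.items():
    rates = [round(r, 1) for r in overall_rates(scenario, prop_exposed)]
    print(name, "->", rates)
```

Under these assumed figures the country-level regression gives a relative risk of 0.5 in every case, although the individual-level relative risks are 0.5, 1.0, and 2.0 to 4.0 respectively.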
Table 4.2
Table 4.3
Hypothetical example of an ecologic analysis:
No confounding by country
Table 4.4
Table 4.5
Multilevel Studies
Example 4.7
Summary
which each subject serves as his or her own control.

Ecologic studies play an important role in the process of hypothesis generation and testing, but they pose additional problems of bias when attempting to estimate the effects of exposures in individuals. These problems are avoided (or reduced) in multilevel analyses, which permit us to take the population context of exposure into account.
References
health? J Epidemiol Comm Health 54: 404-8.

Morgenstern H (1998). Ecologic studies. In: Rothman K, Greenland S. Modern epidemiology. Philadelphia: Lippincott-Raven, pp 459-80.

Nersesyan AK, Boffetta P, Sarkisyan TF, et al (2001). Chromosome aberrations in lymphocytes of persons exposed to an earthquake in Armenia. Scand J Work Environ Health 27: 120-4.

Pearce N (1996). Traditional epidemiology, modern epidemiology, and public health. Am J Publ Health 86: 678-83.

Pearce N (1999). Epidemiology as a population science. Int J Epidemiol 28: S1015-8.

Pearce N (2000). The ecologic fallacy strikes back. J Epidemiol Comm Health 54: 326-7.

Pearce N, Davey Smith G (2003). Is social capital the key to inequalities in health? Am J Publ Health 93: 122-9.

Pearce NE, Weiland S, Keil U, et al (1993). Self-reported prevalence of asthma symptoms in children in Australia, England, Germany and New Zealand: an international comparison using the ISAAC protocol. Eur Resp J 6: 1455-61.

data. Am J Respir Crit Care Med 154: S229-S233.

Rose G (1992). The strategy of preventive medicine. Oxford: Oxford University Press.

Schouten JP, Tager IB (1996). Interpretation of longitudinal studies: an overview. Am J Respir Crit Care Med 154: S278-S284.

Schwartz J (2000). The distributed lag between air pollution and daily deaths. Epidemiol 11: 320-6.

Sherill D, Viegi G (1996). On modeling longitudinal pulmonary function data. Am J Respir Crit Care Med 154: S217-S222.

Von Mutius E, Pearce N, Beasley R, Cheng S, Von Ehrenstein O, Björkstén B, Weiland S, on behalf of the ISAAC Steering Committee (2000). International patterns of tuberculosis and the prevalence of symptoms of asthma, rhinitis and eczema. Thorax 55: 449-53.

Wessen AF, Hooper A, Huntsman J, et al (1992). Migration and health in a small society: The case of Tokelau. Oxford: Clarendon Press, pp 318-57.

Wilkinson RG (1992). Income distribution and life expectancy. Br Med J 304: 165-8.
Part II
CHAPTER 5: Precision
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)
Random error will occur in any epidemiologic study, just as it occurs in experimental studies. It is often referred to as chance, although it can perhaps more reasonably be regarded as "ignorance" (although it is not the only thing that we may be ignorant about as our study may be biased by unknown confounders, measurement error, etc). For example, if we toss a coin 50 times, then ideally we might be able to predict the outcome of each “toss” based on the speed, spin, and trajectory of the coin. In practice, we do not have all of the necessary information (because of “ignorance”), or the computing power to use it (because of chaotic behaviour), and we therefore regard the outcome of each “toss” as a “chance” phenomenon. However, we may note that, on the average, 50% of the “tosses” are heads and therefore we may say that a particular toss has a “50% chance” of producing a head.

Similarly, suppose that 50 lung cancer deaths occurred among 10,000 people aged 35-39 exposed to a particular factor during one year. Then, if each person had exactly the same cumulative exposure, we might expect two subgroups of 5,000 people each to experience 25 deaths during the one-year period. However, just as 50 tosses of a coin will not usually produce exactly 25 heads and 25 tails, neither will there be exactly 25 deaths in each group. This occurs because of differences in exposure to other risk factors for lung cancer, and differences in individual susceptibility between the two groups. Ideally, we should attempt to gather information on all known risk factors (potential confounders), and to adjust for these in the analysis (see chapter 12). However, there will always be other unknown or unmeasurable risk factors operating, and hence the disease rates in particular subgroups will fluctuate about the average. This will occur even if each subgroup has exactly the same exposure history.

Even in an experimental study, in which participants are randomised into "exposed" and "non-exposed" groups, there will be "random" differences in background risk between the compared groups, but these will diminish in importance (i.e. the random differences will tend to “even out”) as the study size grows. In epidemiological studies, because of the lack of randomisation, there is no guarantee that differences in baseline (background) risk will "even out" between the exposure groups as the study size grows.

The basic principles of analysis of epidemiologic data are discussed in chapter 12. However, at this stage it is important to discuss some basic statistical principles and methods since they are relevant to the calculation of the appropriate study size.
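The coin-tossing analogy can be made concrete with a short calculation. The sketch below assumes a simple binomial model (independent tosses with probability 0.5 of a head); the same model applies to the split of 50 deaths between two equal subgroups only under the idealised assumption that each death is equally likely to fall in either subgroup. The function name is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability that 50 tosses give exactly 25 heads (or, under the idealised
# assumption above, that 50 deaths split exactly 25/25 between two subgroups).
p_exactly_25 = binomial_pmf(25, 50)

# Probability that the number of heads lies anywhere between 20 and 30.
p_20_to_30 = sum(binomial_pmf(k, 50) for k in range(20, 31))

print(f"P(exactly 25 of 50) = {p_exactly_25:.4f}")
print(f"P(20 to 30 of 50)   = {p_20_to_30:.4f}")
```

An exact 25/25 split occurs in only about 11% of repetitions, which illustrates why some fluctuation about the average is to be expected even when the underlying risks are identical.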
5.1: Basic Statistics
one can calculate the proportion with malformations (i.e. the mean score for a population in which a malformation scores 1 and a completely healthy baby scores 0), and the standard deviation of this proportion (i.e. the standard error of the mean score), and if the sample is sufficiently large one can analyze these estimates based on the normal distribution.

Testing and Estimation

Usually, in epidemiologic studies, we wish to measure the difference in disease occurrence between groups exposed and not exposed to a particular factor. For example, if we have estimated the proportion of pregnancies involving congenital malformations in an area with high nitrate levels in drinking water, then we would wish to compare this to the corresponding proportion in an area with low nitrate levels (or with the proportion in all births nationally). In doing so, we not only wish to estimate the size of the observed association, but also whether an association as large as this is likely to have arisen by chance, if in fact there is no causal association between exposure and disease. The p-value is the probability that differences as large as or larger than those observed could have arisen by chance if the null hypothesis (of no association between exposure and disease) is correct. In the past, it has been common to “test” the statistical significance of the study findings by seeing whether the p-value is less than an arbitrary value (e.g. p<0.05). The limitations of statistical significance testing are discussed in chapter 12. However, even if we do not intend to use p-values when reporting the findings of a study, the statistical principles involved are nevertheless relevant to determining the appropriate study size.
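As an illustration of the comparison of two proportions described above, the following sketch computes the difference, a test statistic and a two-sided p-value using the large-sample normal approximation with a pooled proportion. This particular test is an assumption made for illustration (the text does not prescribe one), and the counts are invented.

```python
import math

def two_proportion_test(cases1, n1, cases0, n0):
    """Difference in proportions, z statistic and two-sided p-value
    (normal approximation with a pooled proportion under the null)."""
    p1, p0 = cases1 / n1, cases0 / n0
    pooled = (cases1 + cases0) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    z = (p1 - p0) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # area in both tails
    return p1 - p0, z, p_value

# Hypothetical counts: 30 malformations in 1,000 births in the high-nitrate
# area and 20 in 1,000 births in the low-nitrate area.
diff, z, p_value = two_proportion_test(30, 1000, 20, 1000)
print(f"difference = {diff:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

With these invented counts the proportion is 50% higher in the high-nitrate area, yet the p-value is well above 0.05, which is one reason the size of the study matters when planning it.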
The most effective means of reducing random error is by increasing the study size, so that the precision of the measure of association (the effect estimate) will be increased, i.e. the confidence intervals will be narrower. Random error thus differs from systematic error (see chapter 6) which cannot be reduced simply by increasing the study size.

A second factor that can affect precision, given a fixed total study size, is the relative size of the reference group (the unexposed group in a cohort study, or the controls in a case-control study). When exposure is not associated with disease (i.e. the true relative risk is 1.0), and the costs (of recruitment, data collection, etc) of index and reference subjects are the same, then a 1:1 ratio is most efficient for a given total study size. When exposure increases the risk of the outcome, or referents are cheaper to include in the study than index subjects, then a larger ratio may be more efficient. The optimal reference:index ratio is rarely greater than 2:1 for a simple unstratified analysis (Walter, 1977) with equal index and referent costs, but a larger average ratio may be desirable in order to assure an adequate ratio in each stratum for stratified analyses.

The ideal study would be infinitely large, but practical considerations set limits on the number of participants that can be included. Given these limits, it is desirable to find out, before commencing the study, whether it is large enough to be informative. One method is to calculate the "power" of the study. This depends on five factors:

• the cutoff value (i.e. alpha level) below which the p-value from the study would be considered “statistically significant”. This value is usually set at 0.05 or 5%;

• the disease rate in the non-exposed group in a cohort study or the exposure prevalence of the controls in a case-control study;

• the expected relative risk (i.e. the specified value of the relative risk under the alternative (non-null) hypothesis);

• the ratio of the sizes of the two groups being studied;

• the total number of study participants.
$$Z_\beta = \frac{N_0^{0.5}\,|P_1 - P_0|\,B^{0.5} - Z_\alpha B}{K^{0.5}}$$

where:
Zβ = standard normal deviate corresponding to a given statistical power
Zα = standard normal deviate corresponding to an alpha level (the largest p-value that would be considered "statistically significant")
N0 = number of persons in the reference group (i.e. the non-exposed group in a cohort study, or the controls in a case-control study)
P1 = outcome proportion in study group
P0 = outcome proportion in the reference group
A = allocation ratio of referent to study group (i.e., the relative size of the two groups)
B = (1-P0) (P1 + (A-1) P0) + P0 (1-P1)
C = (1-P0) (AP1 - (A-1) P0) + AP0 (1-P1)
K = BC - A (P1-P0)^2
Example 5.1
Consider a proposed study of 5,000 exposed persons and 5,000 non-exposed persons. Suppose that on the basis of mortality rates in a comparable group of workers, the expected number of cases of the disease of interest is 25 in the non-exposed group. However, we expect that exposure will double the risk of disease, so the number of cases observed will be 50 in the exposed group.

Then:
Zα = 1.96 (if a two-tailed significance test, for an alpha-level of 0.05, is to be used)
N0 = 5,000
P1 = 0.010 (= 50/5000)
P0 = 0.005 (= 25/5000)
A = 1

Using the equation above, the standard normal deviate corresponding to the power of the study to detect a statistically significant lung cancer excess in the exposed group is:
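The calculation in example 5.1 can be sketched in code. The sketch assumes that the power equation takes the form Zβ = (N0^0.5 |P1 - P0| B^0.5 - Zα B) / K^0.5, with B, C and K as defined in the list of symbols; this form is an assumption and should be checked against the printed equation. The function names are illustrative.

```python
import math

def z_beta(n0, p1, p0, a=1.0, z_alpha=1.96):
    """Standard normal deviate for power, under the assumed form
    Z_beta = (N0^0.5 * |P1 - P0| * B^0.5 - Z_alpha * B) / K^0.5."""
    b = (1 - p0) * (p1 + (a - 1) * p0) + p0 * (1 - p1)
    c = (1 - p0) * (a * p1 - (a - 1) * p0) + a * p0 * (1 - p1)
    k = b * c - a * (p1 - p0) ** 2
    return (math.sqrt(n0) * abs(p1 - p0) * math.sqrt(b) - z_alpha * b) / math.sqrt(k)

def normal_cdf(z):
    """Cumulative standard normal distribution, used to convert Z_beta to power."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Inputs from example 5.1: 5,000 per group, P1 = 0.010, P0 = 0.005, A = 1.
z = z_beta(n0=5000, p1=0.010, p0=0.005, a=1.0, z_alpha=1.96)
power = normal_cdf(z)
print(f"Z_beta = {z:.3f}, power = {power:.2f}")
```

Under that assumed form, the inputs of example 5.1 give a deviate just under 1.0, corresponding to a power in the region of 0.84. Doubling both group sizes, or changing the allocation ratio A, can be explored by changing the arguments.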
Related approaches are to estimate the minimum sample sizes required to detect an association (e.g., relative risk) of specified magnitudes (Beaumont and Breslow, 1981), and to estimate the minimum detectable association for a given alpha level, power and study size (Armstrong, 1987).

Occasionally, the outcome is measured as a continuous rather than a dichotomous variable (e.g. blood pressure). In this situation the standard normal deviate corresponding to the study power is:

$$Z_\beta = \frac{N_0^{0.5}\,(\mu_1 - \mu_0)}{s\,(A + 1)^{0.5}} - Z_\alpha$$

where:
µ1 = mean outcome measure in exposed group
µ0 = mean outcome measure in reference group
s = estimated standard deviation of outcome measure

The power is not the probability that the study will estimate the size of the association correctly. Rather, it is the probability that the study will yield a "statistically significant" finding when an association of the postulated size exists. The observed association could be greater or less than expected, but still be "statistically significant". The overemphasis on statistical significance is the source of many of the limitations of power calculations. Many features such as the significance level are completely arbitrary; issues of confounding, misclassification and effect modification are generally ignored (although appropriate methods are available - see Schlesselman, 1982; Greenland, 1983); and the size of the expected association is often just a guess. Nevertheless, power calculations are an essential aspect of planning a study since, despite all their assumptions and uncertainties, they provide a useful general indication as to whether a proposed study will be large enough to satisfy the objectives of the study.

Estimating the expected precision can also be useful (Rothman and Greenland, 1998). This can be done by "inventing" the results, based on the same assumptions used in power calculations, and carrying out an analysis involving calculations of effect estimates and confidence limits. This approach has particular advantages when the exposure is expected to have no association with disease, since the concept of power is not applicable but precision is still of concern. However, this approach should be used with considerable caution, as the results may be misleading unless interpreted carefully. In particular, a study with an expected lower limit equal to a particular value (e.g. 1.0) will have only a 50% chance of yielding an observed lower confidence limit above that value.

In practice, the study size depends on the number of available participants and the available resources. Within these limitations it is desirable to make the study as large as possible, taking into account the trade-off between including more participants and gathering more detailed information about a smaller number of participants (Greenland, 1988). Hence, power calculations can only serve as a rough guide as to whether a feasible study is large enough to be worthwhile. Even if such calculations suggest that a particular study would have very low power, the study may still be worthwhile if exposure information is collected in a form which will permit the study to contribute to the broader pool of information concerning a particular issue. For example, the International Agency for Research on Cancer (IARC) has organised several international collaborative studies such as those of occupational exposure to man-made mineral fibers (Simonato et al, 1986) and phenoxy herbicides and contaminants (Saracci et al, 1991). The man-made mineral fiber study involved pooling the findings from individual cohort studies of 13 European factories. Most of the individual cohorts were too small to be informative in themselves, but each contributed to the overall pool of data.

Once a study has been completed, there is little value in retrospectively performing power calculations since the confidence limits of the observed measure of effect provide the best indication of the range of likely values for the true association (Smith and Bates, 1992; Goodman and Berlin, 1994). In the next chapter, random error will be ignored, and the discussion will concentrate on issues of systematic error.
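The idea of estimating expected precision by "inventing" the results can be sketched with the expected counts from example 5.1 (50 cases among 5,000 exposed and 25 among 5,000 non-exposed). The confidence interval formula used here, a large-sample interval on the log scale for a risk ratio, is an assumption chosen for illustration rather than a method taken from the text.

```python
import math

def expected_rr_interval(cases1, n1, cases0, n0, z=1.96):
    """Risk ratio with an approximate 95% interval on the log scale,
    applied to 'invented' (expected) results rather than observed data."""
    rr = (cases1 / n1) / (cases0 / n0)
    se_log_rr = math.sqrt(1 / cases1 - 1 / n1 + 1 / cases0 - 1 / n0)
    lower = rr * math.exp(-z * se_log_rr)
    upper = rr * math.exp(z * se_log_rr)
    return rr, lower, upper

# Expected results from example 5.1.
rr, lower, upper = expected_rr_interval(50, 5000, 25, 5000)
print(f"expected RR = {rr:.1f}, expected 95% limits = {lower:.2f} to {upper:.2f}")
```

With these expected counts the interval runs from roughly 1.2 to 3.2. As cautioned above, an expected lower limit just above 1.0 does not guarantee that the observed lower limit will exceed 1.0.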
Summary
References
CHAPTER 6: Validity
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)
6.1: Confounding
First, a confounder is a factor which is predictive of disease in the absence of the exposure under study. Note that a confounder need not be a genuine cause of disease, but merely "predictive". Hence, surrogates for causal factors (e.g. age) may be regarded as potential confounders, even though they are rarely directly causal factors.

Second, a confounder is associated with exposure in the source population at the start of follow-up (i.e. at baseline). In case-control studies this implies that a confounder will tend to be associated with exposure among the controls. An association can occur among the cases simply because the study factor and a potential confounder are both risk factors for the disease, but this does not cause confounding in itself unless the association also exists in the source population.

Thirdly, a variable which is affected by the exposure or the disease (e.g. an intermediate in the causal pathway between exposure and disease, or a symptom of disease) should not be treated as a confounder because to do so could introduce serious bias into the results (Greenland and Neutra, 1981; Robins, 1987; Weinberg, 1993). For example, in a study of high fat diet and colon cancer, it would be inappropriate to control for serum cholesterol levels if it was considered that high serum cholesterol levels were a consequence of a high fat diet, and hence a part of the causal chain leading from diet to colon cancer. On the other hand, if serum cholesterol itself was of primary interest, then this should be studied directly, and high fat diet would be regarded as a potential confounder if it also involved exposure to other risk factors for colon cancer. Evaluating this type of possibility requires information external to the study to determine whether a factor is likely to be a part of the causal chain. Intermediate variables can sometimes be used in the analysis, but special techniques are then required to avoid adding bias (Robins, 1989; Robins et al, 1992; Robins et al, 2000).
Example 6.1
Table 6.1
factor which is associated with exposure, but is not a risk factor for the disease of interest. However, matching on a strong risk factor will usually increase the precision of effect estimates.

Control in the Analysis

Confounding can also be controlled in the analysis, although it may be desirable to match on potential confounders in the design to optimize the efficiency of the analysis. The analysis ideally should control simultaneously for all confounding factors. Control of confounding in the analysis involves stratifying the data according to the levels of the confounder(s) and calculating an effect estimate which summarizes the association across strata of the confounders. It is usually not possible to control simultaneously for more than 2 or 3 confounders in a stratified analysis. For example, in a cohort study, finer stratification will often lead to many strata containing no exposed or no non-exposed persons. Such strata are uninformative; thus fine stratification is wasteful of information. This problem can be mitigated to some extent by the use of multiple regression, which allows for simultaneous control of more confounders by "smoothing" the data across confounder strata.
Example 6.2
If the data presented in example 6.1 (table 6.1) are analysed separately in smokers and non-smokers, then the prevalence odds ratio is 1.0 in each of the two subgroups (i.e. 1.0 in smokers and 1.0 in non-smokers). Taking a weighted average of these two stratum-specific estimates (see chapter 12) then yields an overall smoking-adjusted prevalence odds ratio of 1.0.
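A stratified analysis of this kind can be sketched as follows. The counts are invented for illustration (they are not those of table 6.1), and the Mantel-Haenszel estimator is used here as one common choice of weighted average; the text does not specify which weighting is intended.

```python
def odds_ratio(a, b, c, d):
    """a: exposed with disease, b: exposed without disease,
    c: non-exposed with disease, d: non-exposed without disease."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Summary odds ratio across strata, each given as (a, b, c, d)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical data in which smokers are both more often exposed
# and more often diseased than non-smokers.
strata = {
    "smokers": (80, 120, 40, 60),
    "non-smokers": (10, 90, 30, 270),
}
crude_counts = tuple(sum(column) for column in zip(*strata.values()))

crude_or = odds_ratio(*crude_counts)
stratum_ors = {name: odds_ratio(*counts) for name, counts in strata.items()}
adjusted_or = mantel_haenszel_or(strata.values())

print(f"crude OR = {crude_or:.2f}")
print("stratum-specific ORs:", stratum_ors)
print(f"smoking-adjusted OR = {adjusted_or:.2f}")
```

In these invented data the crude odds ratio is about 2.0, while the odds ratio is 1.0 within each smoking stratum and the adjusted summary is also 1.0, so the crude association is entirely due to confounding by smoking.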
Example 6.3
Suppose that a cohort study of lung cancer involves a comparison with national mortality rates in a country where 50% of the population are non-smokers, 40% are moderate smokers with a 10-fold risk of lung cancer (compared to non-smokers), and 10% are heavy smokers with a 20-fold risk of lung cancer. Then, it can be calculated that the national lung cancer incidence rate will be 6.5 (= 0.50 x 1.0 + 0.40 x 10 + 0.10 x 20) times the rate in non-smokers. Suppose that it was considered most unlikely that the cohort under study contained more than 50% moderate smokers and 20% heavy smokers. Then, the incidence rate in the study cohort would be 9.3 (= 0.30 x 1.0 + 0.50 x 10 + 0.20 x 20) times the rate in non-smokers. Hence, the observed incidence rate would be biased upwards by a factor of 9.3/6.5 = 1.4, i.e. it would be 1.4 times higher than the national rate due to confounding by smoking. Table 6.2 gives a range of such calculations presented by Axelson (1978) using data from Sweden. The last column indicates the likely bias in the observed rate ratio due to confounding by smoking (a value of 1.00 indicates no bias).
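The arithmetic of example 6.3 can be written out as a short calculation. The smoking distributions and relative risks are those stated in the example; the cohort is taken to contain 30% non-smokers, 50% moderate smokers and 20% heavy smokers, for which the weighted sum evaluates to 9.3 and the ratio rounds to 1.4. The function and variable names are illustrative.

```python
def rate_relative_to_non_smokers(distribution, relative_risks):
    """Population rate as a multiple of the rate in non-smokers:
    the sum over smoking categories of (proportion x relative risk)."""
    return sum(distribution[group] * relative_risks[group] for group in distribution)

relative_risks = {"non-smokers": 1.0, "moderate": 10.0, "heavy": 20.0}
national = {"non-smokers": 0.50, "moderate": 0.40, "heavy": 0.10}
cohort = {"non-smokers": 0.30, "moderate": 0.50, "heavy": 0.20}  # extreme case considered

national_rate = rate_relative_to_non_smokers(national, relative_risks)
cohort_rate = rate_relative_to_non_smokers(cohort, relative_risks)
confounding_bias = cohort_rate / national_rate

print(f"national: {national_rate:.1f} x non-smoker rate")
print(f"cohort:   {cohort_rate:.1f} x non-smoker rate")
print(f"bias due to smoking: {confounding_bias:.2f}")
```

Other assumed smoking distributions for the cohort can be substituted to see how large the confounding by smoking could plausibly be.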
Table 6.2
Assessment of Confounding

When one lacks data on a suspected confounder (and thus cannot control confounding directly) it is still desirable to assess the likely direction and magnitude of the confounding it produces. It may be possible to obtain information on a surrogate for the confounder of interest (for example, social class is associated with many lifestyle factors such as smoking, and may therefore be a useful surrogate for some lifestyle-related confounders). Even though confounder control will be imperfect in this situation, it is still possible to examine whether the exposure effect estimate changes when the surrogate is controlled in the analysis, and to assess the strength and direction of the change. For example, if the relative risk actually increases (e.g. from 2.0 to 2.5), or remains stable (e.g. at 2.0) when social class is controlled for, then this is evidence that the observed excess risk is not due to confounding by smoking, since social class is correlated with smoking (Kogevinas et al, 1997), and control for social class involves partial control for smoking.

exposed and non-exposed groups in order to check that the average level of humidity in the home is similar in the two groups. Such limited information, if taken in all exposure-disease subgroups, can also be used to directly control confounding (White, 1982; Walker, 1982; Rothman and Greenland, 1998).

Finally, even if it is not possible to obtain confounder information for any study participants, it may still be possible to estimate how strong the confounding is likely to be from particular risk factors. For example, this is often done in occupational studies, where tobacco smoking is a potential confounder, but smoking information is rarely available; in fact, although smoking is one of the strongest risk factors for lung cancer, with relative risks of 10 or 20, it appears that smoking rarely exerts a confounding effect of greater than 1.5 times in studies of occupational disease (Axelson, 1978; Siemiatycki, 1988), because few occupations are strongly associated with smoking, although this degree of confounding may still be important in some contexts.
6.2: Selection Bias
Example 6.4
Although we should recognize the possible biases arising from subject selection, it is important to note that epidemiologic studies need not be based on representative samples to avoid bias. For example, in a cohort study persons who develop disease might be more likely to be lost to follow-up than persons who did not develop disease; however, this would not affect the relative risk estimate provided that loss to follow-up applied equally to the exposed and non-exposed populations (Criqui, 1979). Analogously, case-control studies have differing selection probabilities as an integral part of their design, in that the selection probability of diseased persons is usually close to 1.0 provided that most persons with disease are identified, whereas that for non-diseased persons is substantially less; however, this does not affect the relative risk estimate provided that these selection probabilities apply equally within each exposure group.

Additional forms of selection bias can occur in case-control studies because these involve sampling from the source population. In particular, selection bias can occur in a case-control study (involving either incident or prevalent cases) if controls are chosen in a non-representative manner, e.g. if exposed people were more likely to be selected as controls than non-exposed people.

Minimizing Selection Bias

If selection bias has occurred in the enumeration of the exposed group, it may still be possible to avoid bias by choosing an appropriate non-exposed comparison group. For example, if the exposed group does not include all workers in a particular industry, but is restricted to union members (because the records are available), then the non-exposed comparison group could be other workers in the same geographical area who are members of the same union, and/or a similar union.

Control of Selection Bias

Selection bias can sometimes be controlled in the analysis by identifying factors which are related to subject selection and controlling for them as confounders (provided that these factors are not affected by the study exposure or disease). For example, if white-collar workers are more likely to be selected for (or participate in) a study than manual workers (and white collar work is negatively or positively related to the exposure of interest), then this bias can be partially controlled by collecting information on social class and controlling for social class in the analysis as a confounder.
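The earlier point in this section, that unequal selection probabilities for diseased and non-diseased persons need not bias a case-control estimate so long as they apply equally within each exposure group, can be checked with expected counts. The source population and selection probabilities below are invented, and the odds ratio is used as the effect measure; this is a sketch under those assumptions only.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed non-cases,
    c: non-exposed cases, d: non-exposed non-cases."""
    return (a * d) / (b * c)

def expected_sample(population, selection_probabilities):
    """Expected number selected from each of the four exposure-disease cells."""
    return tuple(n * p for n, p in zip(population, selection_probabilities))

# Hypothetical source population: (a, b, c, d) as defined above.
population = (200, 9800, 100, 9900)
true_or = odds_ratio(*population)

# Cases selected with probability 0.9 and non-cases with probability 0.02,
# regardless of exposure: the odds ratio is unchanged.
equal_selection = expected_sample(population, (0.9, 0.02, 0.9, 0.02))
or_equal = odds_ratio(*equal_selection)

# Exposed non-cases twice as likely to be selected as controls as
# non-exposed non-cases: the odds ratio is halved.
unequal_selection = expected_sample(population, (0.9, 0.04, 0.9, 0.02))
or_unequal = odds_ratio(*unequal_selection)

print(f"true OR = {true_or:.2f}")
print(f"OR with selection independent of exposure = {or_equal:.2f}")
print(f"OR when exposed controls are over-selected = {or_unequal:.2f}")
```

The bias in the second scenario equals the ratio of the control selection probabilities in the two exposure groups, which is why the choice of controls matters so much.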
likely to be misclassified according to disease outcome, or if diseased and non-diseased persons are equally likely to be misclassified according to exposure. Non-differential misclassification of exposure usually (but not always) biases the relative risk estimate towards the null value of 1.0 (Copeland et al, 1977; Dosemeci et al, 1990). Hence, non-differential misclassification tends to produce "false negative" findings and is of particular concern in studies which find a negligible association
Example 6.5
In many cohort studies some exposed persons will be classified as non-exposed, and vice versa. Table 6.3 illustrates this situation with hypothetical data from a study of lung cancer incidence in asbestos workers. Suppose the true incidence rates are 100 per 100,000 person-years in the high exposure group, and 10 per 100,000 person-years in the low exposure group, and the relative risk is thus 10. If 15% of high exposed persons are incorrectly classified, then 15 of every 100 deaths and 15,000 of every 100,000 person-years will be incorrectly allocated to the low exposure group. Similarly, if 10% of low exposed persons are incorrectly classified, then 1 of every 10 deaths and 10,000 of every 100,000 person-years will be incorrectly allocated to the high exposure group. As a result, the observed incidence rates per 100,000 person-years will be 91 and 23 respectively, and the observed relative risk will be 4.0 instead of 10.0. Due to non-differential misclassification, incidence rates in the high exposed group have been biased downwards, and incidence rates in the low exposure group have been biased upwards.
Table 6.3
Hypothetical data from a cohort study in which 15% of highly exposed persons and 10% of low exposed persons are incorrectly classified.

                        Actual                  Observed
                        --------------------    ------------------------------------------------------
                        High        Low         High exposure               Low exposure
                        exposure    exposure
--------------------------------------------------------------------------------------------------------
Deaths                  100         10          85 + 1 = 86                 9 + 15 = 24
Person-years            100,000     100,000     85,000 + 10,000 = 95,000    90,000 + 15,000 = 105,000
--------------------------------------------------------------------------------------------------------
Incidence rate per      100         10          91                          23
100,000 person-years
--------------------------------------------------------------------------------------------------------
Rate ratio              10.0                    4.0
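The figures in table 6.3 can be reproduced with a few lines of code. The sketch simply moves the stated fractions of deaths and person-years between the two exposure groups; the function names are illustrative.

```python
def misclassify(high, low, frac_high_to_low, frac_low_to_high):
    """high and low are (deaths, person_years) tuples for the true groups.
    Returns the observed (deaths, person_years) for each group after the
    stated fractions have been wrongly allocated to the other group."""
    moved_down = tuple(x * frac_high_to_low for x in high)
    moved_up = tuple(x * frac_low_to_high for x in low)
    observed_high = tuple(h - d + u for h, d, u in zip(high, moved_down, moved_up))
    observed_low = tuple(l - u + d for l, u, d in zip(low, moved_up, moved_down))
    return observed_high, observed_low

def rate_per_100000(deaths, person_years):
    return deaths / person_years * 100_000

true_high = (100, 100_000)  # deaths, person-years
true_low = (10, 100_000)

observed_high, observed_low = misclassify(true_high, true_low, 0.15, 0.10)
observed_ratio = rate_per_100000(*observed_high) / rate_per_100000(*observed_low)

print("observed high exposure:", observed_high, round(rate_per_100000(*observed_high)))
print("observed low exposure: ", observed_low, round(rate_per_100000(*observed_low)))
print(f"observed rate ratio = {observed_ratio:.1f} (true rate ratio = 10.0)")
```

Changing the two misclassification fractions shows how quickly a strong association is diluted towards the null.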
between exposure and disease. One important condition is needed to ensure that exposure misclassification produces bias towards the null, however: the exposure classification errors must be independent of other errors. Without this condition, non-differential exposure misclassification can produce bias in any direction (Chavance et al, 1992; Kristensen, 1992).

Furthermore, there are several other situations in which non-differential misclassification will not produce a bias towards the null.

Firstly, when the specificity of the method of identifying the disease under study is 100%, but the sensitivity is less than 100%, then the risk difference will be biased towards the null, but the risk ratio (or rate ratio) will not be biased by the misclassification. For example, if only 80% of the deaths are identified in a study, but this under-ascertainment applies equally to the exposed and non-exposed groups, then this will not affect the relative risk estimate.

Secondly, the effect estimate may be biased away from the null for some exposure categories when there are multiple exposure categories (see example 6.6).

Finally, when there is positive confounding, and there is non-differential misclassification of the confounder, then confounding control will be incomplete and the adjusted effect estimate will consequently be biased away from the null.
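The first of these situations (perfect specificity but incomplete sensitivity of disease ascertainment) can be illustrated with assumed risks. The true risks of 0.10 and 0.05 below are invented; the 80% sensitivity matches the example of 80% of deaths being identified.

```python
def observed_risk(true_risk, sensitivity, specificity=1.0):
    """Proportion classified as diseased: true cases that are detected
    plus non-cases wrongly classified as diseased."""
    return true_risk * sensitivity + (1 - true_risk) * (1 - specificity)

true_exposed, true_unexposed = 0.10, 0.05  # hypothetical true risks

# Sensitivity 80%, specificity 100%, the same in both exposure groups.
obs_exposed = observed_risk(true_exposed, 0.8)
obs_unexposed = observed_risk(true_unexposed, 0.8)
risk_ratio = obs_exposed / obs_unexposed
risk_difference = obs_exposed - obs_unexposed

# For contrast: the same sensitivity with specificity of 99%.
risk_ratio_imperfect = (observed_risk(true_exposed, 0.8, 0.99)
                        / observed_risk(true_unexposed, 0.8, 0.99))

print(f"risk ratio = {risk_ratio:.2f} (true 2.00)")
print(f"risk difference = {risk_difference:.3f} (true 0.050)")
print(f"risk ratio with 99% specificity = {risk_ratio_imperfect:.2f}")
```

With perfect specificity the risk ratio is unchanged while the risk difference shrinks; once specificity falls below 100%, the risk ratio is also pulled towards the null.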
Example 6.6
Table 6.4
Hypothetical data from a cohort study in which 15% of highly exposed persons and 10% of
low exposed persons are incorrectly classified, but the non-exposed are correctly classified
Actual Observed
--------------------------------------- ---------------------------------------
High Low Non-Exposed High Low Non-Exposed
------------------------------------------------------------------------------------------------------------
Deaths 100 10 5 86 24 5
Person-years 100,000 100,000 100,000 95,000 105,000 100,000
------------------------------------------------------------------------------------------------------------
Rate 100 10 5 91 23 5
------------------------------------------------------------------------------------------------------------
Rate ratio 20.0 2.0 1.0 18.1 4.6 1.0
One special type of non-differential misclassification occurs when the study outcome is not well-defined and includes a wide range of etiologically unrelated outcomes (e.g., all deaths). This may obscure the effect of exposure on one specific disease since a large increase in risk for this disease may only produce a small increase in risk for the overall group of diseases under study. A similar bias can occur when the exposure measure is not well defined and includes a wide range of etiologically unrelated exposures, possibly due to a non-specific exposure definition or due to the inclusion of exposures which could not have caused the disease of interest because they occurred after, or shortly before, diagnosis. It could be argued that these phenomena do not represent misclassification because these are not errors in measurement. However, they do involve misclassification in the sense that the etiologically relevant exposure (or disease) has not been measured appropriately.

Differential Misclassification

Differential misclassification occurs when the probability of misclassification of exposure is different in diseased and non-diseased persons, or the probability of misclassification of disease is different in exposed and non-exposed persons. This can bias the observed effect estimate either toward or away from the null value. For example, in a nested case-control study of lung cancer, with a
Example 6.7
Table 6.5 shows data from a hypothetical case-control study in which 70 of the 100 cases and 50 of the 100 controls have actually been exposed to some chemical. The true odds ratio is thus (70/30) / (50/50) = 2.3. If 90% (63) of the 70 exposed cases, but only 60% (30) of the 50 exposed controls are classified correctly, then the observed odds ratio would be (63/37) / (30/70) = 4.0.
Table 6.5
Hypothetical data from a case-control study in which 90% of exposed cases and 60% of
exposed controls are correctly classified
                    Actual                      Observed
             Exposed   Non-exposed       Exposed   Non-exposed
Cases           70          30              63          37
Controls        50          50              30          70
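The arithmetic of example 6.7 can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and it assumes (as the example implicitly does) that exposed subjects who are misclassified are recorded as non-exposed, while no truly non-exposed subject is recorded as exposed.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from the four cells of a case-control 2x2 table."""
    return (exposed_cases / unexposed_cases) / (exposed_controls / unexposed_controls)


def observe(exposed, unexposed, sensitivity):
    """Return the observed (exposed, non-exposed) counts for one group.

    Assumption: only a fraction `sensitivity` of truly exposed subjects is
    recorded as exposed; the rest are counted as non-exposed. Specificity is
    taken to be perfect.
    """
    seen_exposed = round(exposed * sensitivity)
    return seen_exposed, unexposed + (exposed - seen_exposed)


true_or = odds_ratio(70, 30, 50, 50)
obs_cases = observe(70, 30, 0.90)       # (63, 37)
obs_controls = observe(50, 50, 0.60)    # (30, 70)
observed_or = odds_ratio(*obs_cases, *obs_controls)

print(round(true_or, 1), round(observed_or, 1))  # 2.3 4.0
```

Changing the two sensitivity values shows how the direction of the bias depends on which group recalls exposure better.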
control group selected from among non-diseased members of the cohort, the recall of occupational exposures in controls might be different from that of the cases. In this situation, differential misclassification would occur, and it could bias the odds ratio towards or away from the null, depending on whether members of the cohort who did not develop lung cancer were more or less likely to recall such exposure than the cases.

As can be noted from example 6.7, misclassification can drastically affect the validity of a study. Given limited resources, it will often be more desirable to reduce information bias by obtaining more detailed information on a limited number of subjects than to reduce random error by including more subjects. However, a certain amount of misclassification is unavoidable, and it is usually desirable to ensure that it is towards the null value (as usually occurs with nondifferential exposure misclassification) to minimize the chance of false positive results.
Example 6.8
the groups being compared, even if this means ignoring more detailed exposure information if this is not available for both groups. However, this is not always the case (Greenland and Robins, 1985).

Assessment of information bias

Information bias is usually of most concern in historical cohort studies or case-control studies when information is obtained by personal interview. Despite these concerns, relatively little information is generally available on the accuracy of recall of exposures. When possible, it is important to attempt to validate the classification of exposure or disease, e.g., by comparing interview results with other data sources such as employer records, and to assess the potential magnitude of bias due to misclassification of exposure.

Relationship of Selection and Information Bias to Confounding

Selection bias and confounding are not always clearly demarcated. In particular, selection bias can sometimes be viewed as a type of confounding, since both can be reduced by controlling for surrogates for the determinants of the bias (e.g. social class). Unfortunately, selection affected by exposure and disease generates a bias that cannot be reduced in this fashion. Some consider any bias that can be controlled in the analysis as confounding. Other biases are then categorized according to whether they arise from the selection of study subjects (selection bias), or their classification (information bias).
Summary
Again, one should appreciate the limitations of these observations: it may be difficult to be sure that the exposure and disease misclassification is nondifferential, and nondifferential misclassification of a confounder can lead to bias away from the null if the confounder produces confounding away from the null.
References
Criqui MH (1979). Response bias and risk ratios in epidemiologic studies. Am J Epidemiol 109: 394-9.

Kristensen P (1992). Bias from nondifferential but dependent misclassification of exposure and outcome. Epidemiol 3: 210-5.

Robins J (1989). The control of confounding by intermediate variables. Stat Med 8: 679-701.

Robins JM, Blevins D, Ritter G, et al (1992). G-estimation of the effect of prophylaxis therapy for pneumocystis carinii pneumonia on the survival of AIDS patients. Epidemiol 3: 319-36.

Robins JM, Hernán MA, Brumback B (2000). Marginal structural models and causal inference in epidemiology. Epidemiol 11: 550-62.

Walker AM (1982). Anamorphic analysis: sampling and estimation for covariate effects when both exposure and disease are known. Biometrics 38: 1025-32.

Weinberg CR (1993). Toward a clearer definition of confounding. Am J Epidemiol 137: 1-8.

White JE (1982). A two-stage design for the study of the relationship between a rare exposure and a rare disease. Am J Epidemiol 115: 119-28.
CHAPTER 7: Effect Modification
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)
In the previous chapter I discussed the problem of confounding which occurs when the exposed and non-exposed subpopulations of the source population are inherently different in background disease risk. This should not be confused with effect modification which occurs when the measure of the effect of the study factor depends on the level of another factor in the study population (Miettinen, 1974). The term statistical interaction denotes a similar phenomenon in the observed data. However, the terms “interaction” and “effect modification” are also used in a variety of other contexts, with a variety of meanings. In particular, the term “interaction” has different meanings for biostatisticians, lawyers, clinicians, public health professionals, epidemiologists and biologists.
Example 7.1
Katsouyanni et al (1993) studied the effects of air pollution and high temperature in the causation of excess mortality during a major heat wave in Greece in July 1987. They found that the effects of the heat wave were modified by the presence (or absence) of high air pollution levels. In Athens (where air pollution levels are high) the increase in deaths on extremely hot days was 97%, but it was 33% in other urban areas and 27% in non-urban areas. Further analyses suggested that the threshold of effect of various air pollutants appeared to be lower on extremely hot days.
asbestos exposure and smoking. However, in this case the asbestos exposure has already occurred and the factory has now closed, so our focus is on smoking. We want to know whether the “effect” of smoking is “modified” by asbestos exposure, i.e. do smoking and asbestos exposure “interact”?

Two Biostatisticians

Suppose that we first consult a biostatistician about how to interpret this data. The first biostatistician we talk to uses relative risk measures of effect. They note that the relative risk for smoking and lung cancer is 7.0 (35/5) in asbestos workers and 10.0 (10/1) in other people. Thus, the effect of smoking on lung cancer is less in asbestos workers and there is therefore a negative statistical interaction between the effects of smoking and asbestos (table 7.2). They may even fit a multiplicative model with an interaction term and show that the interaction term is negative.

We can see the logic of this argument, but are somewhat surprised by the conclusion, since we can see the very high rates in people who both smoke and are exposed to asbestos. We therefore consult a second biostatistician. This “alternative” biostatistician uses the risk difference as the effect measure. They note that the risk difference for smoking and lung cancer is 30 per 1,000 (35 - 5) in asbestos workers and 9 per 1,000 (10 - 1) in other people. Thus, the effect of smoking is greater in asbestos workers and there is a positive statistical interaction between the effects of smoking and asbestos (table 7.2). They may even fit an additive model with an interaction term and show that the interaction term is positive.

We eventually get our two biostatistical consultants together and they argue that there is no contradiction in the advice they have given us. Effect modification and statistical interaction are merely statistical concepts which depend on the methods used. In fact, all secondary risk factors modify either the rate ratio or the rate difference, and uniformity over one measure implies non-uniformity over the other (Koopman, 1981; Steenland and Thun, 1986), e.g. an apparent additive joint effect implies a departure from a multiplicative model. Several authors (e.g. Kupper and Hogan, 1978; Walter and Holford, 1978) have demonstrated the dependence of statistical interaction on the underlying statistical measure of effect, and have therefore argued that the assessment of interaction is "model-dependent".
Table 7.1
Lung cancer risk per 1,000 people (and RR) in relation to exposure to cigarette smoke
and asbestos
                                Asbestos
                        Yes                  No
------------------------------------------------------
Smoking     Yes    35/1000 (35.0)      10/1000 (10.0)
            No      5/1000 (5.0)        1/1000 (1.0)
------------------------------------------------------
Rate difference    30/1000              9/1000
------------------------------------------------------
Rate ratio          7.0                10.0
------------------------------------------------------
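The two biostatisticians' calculations can be checked directly from the risks in table 7.1. The following is a minimal sketch; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Lung cancer risk per 1,000 people, copied from Table 7.1.
risk = {
    ("smoker", "asbestos"): 35,
    ("non-smoker", "asbestos"): 5,
    ("smoker", "no asbestos"): 10,
    ("non-smoker", "no asbestos"): 1,
}


def smoking_effect(asbestos_stratum):
    """Return (risk ratio, risk difference per 1,000) for smoking within one stratum."""
    smokers = risk[("smoker", asbestos_stratum)]
    non_smokers = risk[("non-smoker", asbestos_stratum)]
    return smokers / non_smokers, smokers - non_smokers


rr_asbestos, rd_asbestos = smoking_effect("asbestos")
rr_other, rd_other = smoking_effect("no asbestos")

print(rr_asbestos, rr_other)  # 7.0 10.0 -> ratio is smaller in asbestos workers
print(rd_asbestos, rd_other)  # 30 9     -> difference is larger in asbestos workers
```

The same four numbers therefore give a "negative" interaction on the ratio scale and a "positive" interaction on the difference scale.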
A Lawyer
Table 7.2
A Clinician

Next we consult with a clinician. She/he says “I advise my patients to give up smoking, and I tell them that if they do manage to stop then they will reduce their risk of lung cancer. They ask ‘by how much?’ So I want to know what the reduction in their individual risk will be if they give up smoking”. Well, if their patient is an asbestos worker then they will reduce their risk by 30 per 1,000 (over five years) by giving up smoking; other people will reduce their risk by 9 per 1,000 (once again, this is a little simplistic since this does not tell us exactly how many years of life they will gain). Thus, the effect of smoking is greater in asbestos workers and there is therefore a positive statistical interaction between the effects of smoking and asbestos (table 7.2).

A Public Health Worker

The public health worker that we consult has a similar approach to the clinician, except that they are concerned about the population rather than about individual patients. They say “I want to conduct population smoking prevention campaigns and persuade people to give up smoking and that if they do then they will reduce their risk of lung cancer. I only have a limited amount of resources so I want to know if I can prevent more cases of lung cancer by focusing on asbestos workers, or by doing my campaigns in the same number of people in the general population”. If they prevent 1,000 asbestos workers from smoking, then (once there has been time for the reduction in risk to start occurring) they will have prevented 30 lung cancer cases each year. If they prevent 1,000 other people from smoking then each year they will have prevented 9 cases of lung cancer. Thus, the effect of smoking is greater in asbestos workers and there is therefore a positive statistical interaction between the effects of smoking and asbestos (table 7.2).
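Figure 7.1
[Figure: four sufficient causes in the group exposed to both factors: U alone; U’ with A; U’’ with S; U’’’ with A and S (A = asbestos, S = smoking, U = unknown background exposures)]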
An Epidemiologist

I have argued in chapter 1 that epidemiology is part of public health, and therefore I might be quite content to accept the public health worker’s approach. However, as an epidemiologist I do want to know more about the causation of disease, since what I learn may be relevant to other exposures or other diseases. Thus, I may be particularly interested in the combination of smoking and asbestos to produce cases of lung cancer. Rothman and Greenland (1998) have thus adopted an unambiguous epidemiological definition of interaction in which two factors are not "independent" if they are component causes in the same sufficient cause. This concept of independence of effects leads to the adoption of additivity of incidence rates as the state of "no interaction". Thus, the fact that the lung cancer rate in the group exposed to both factors (35/1000) is greater than the sum of the baseline risk (1/1000) plus the effect of asbestos alone (5/1000 - 1/1000) plus the effect of smoking alone (10/1000 - 1/1000) indicates that there are some cases of disease that are occurring due to the combination of exposures and which would not have occurred if either of the exposures had been eliminated. We can do the same calculations using the relative risks (relative to the group with exposure to neither factor) rather than incidence rates: the joint effect is 35.0 times, whereas it would be 1+(5.0-1)+(10.0-1)=14.0 if it were additive. This situation is summarized in figure 7.1. It shows that in the group exposed to both factors, 1 case (3%) occurred through unknown “background” exposures (U), 4 cases (11%) through mechanisms involving asbestos exposure alone (and not smoking) together with unknown background exposures (U’), 9 cases (26%) occurred through mechanisms involving smoking alone (and not asbestos) together with unknown background exposures (U’’), and 21 cases (60%) occurred through mechanisms involving both factors together with unknown background exposures (U’’’). This means that 86% of the cases (26% + 60%) could have been prevented by preventing smoking, whereas 71% (11% + 60%) could have been prevented by preventing asbestos exposure. Thus, the attributable risks for the individual factors of smoking (86%) and asbestos (71%) sum to more than 100% because of the cases that occur through mechanisms involving both exposures and which consequently could be prevented by preventing either exposure.

One apparent exception should be noted (Koopman, 1977). If two factors (A and B) belong to different sufficient causes, but a third factor (C) belongs to both sufficient causes, then A and B are competing for a single pool of susceptible individuals (those who have C). Consequently the joint effect of A and B will be less than additive (Miettinen (1982) reaches a similar conclusion based on a model of individual outcomes). However, this phenomenon can be incorporated directly into the causal constellation model by clarifying a previous ambiguity in the description of antagonism in the model's terms. Specifically, the absence of B can be included in the causal constellation involving A, and vice versa. Then, two factors would not be "independent" if the presence or absence of the factors (or particular levels of both factors) were component causes in the same sufficient cause (Greenland and Poole, 1988; Rothman and Greenland, 1998).

A Biologist

Finally, it should be stressed that this epidemiological concept of independence of effects is distinct from
some biological concepts of independence. For example, Siemiatycki and Thomas (1981) give a definition in which two factors are considered to be biologically independent "if the qualitative nature of the mechanism of action of each is not affected by the presence or absence of the other". However, this concept does not lead to an unambiguous definition of independence of effects, and thus does not produce clear analytic implications. Rothman's concept of independence is at a more abstract conceptual level in which a particular biologic model, rather than being accepted as the "baseline", is itself evaluated in terms of the co-participation of factors in a sufficient cause. For example, two factors which act at different stages of a multistage process are not independent since they are joint components of at least one sufficient cause. This occurs irrespective of whether they affect each other's qualitative mechanism of action (the ambiguity in Siemiatycki and Thomas' formulation stems from the ambiguity of this concept).
7.3: Joint Effects
Summary
The terms interaction and effect modification are used in a variety of contexts, with a variety of meanings. In particular, the term “interaction” has different meanings for biostatisticians, lawyers, clinicians, public health professionals, epidemiologists and biologists. In each instance, they are interested in the same question, namely does the effect of exposure A depend on whether exposure B is also present (or absent)? However, the word “effect” has different meanings in different contexts. In contrast to definitions based on statistical concepts, Rothman has adopted an unambiguous epidemiological definition of interaction in which two factors are not "independent" if they are component causes in the same sufficient cause. This leads to the adoption of additivity of incidence rates as the state of "no interaction". However, there are other considerations which generally favor the use of multiplicative models. This implies an apparent dilemma as to how an analysis can be conducted which combines the advantages of ratio measures of effect with the assessment of independence in terms of a departure from additivity. These apparently contradictory goals can be reconciled through the analysis of separate and joint effects.
References
Miettinen OS (1982). Causal and preventive interdependence: elementary principles. Scand J Work Environ Health 8: 159-68.

Moolgavkar SH, Venzon DJ (1987). General relative risk models for epidemiologic studies. Am J Epidemiol 126: 949-61.

Pearce NE (1989). Analytic implications of epidemiological concepts of interaction. Int J Epidemiol 18: 976-80.

Rothman KJ, Greenland S, Walker AM (1980). Concepts of interaction. Am J Epidemiol 112: 467-70.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Saracci R (1987). The interactions of tobacco smoking and other agents in cancer etiology. Epidemiologic Reviews 9: 175-93.

Selikoff I, Sedman H, Hammond E (1980). Mortality effects of cigarette smoking among amosite asbestos factory workers. JNCI 65: 507-13.

Siemiatycki J, Thomas DC (1981). Biological models and statistical interactions: an example from multistage carcinogenesis. Int J Epidemiol 10: 383-7.

Steenland K, Thun M (1986). Interaction between tobacco smoking and occupational exposures in the causation of lung cancer. Journal of Occupational Medicine 28: 110-8.

Walter SD, Holford TR (1978). Additive, multiplicative and other models for disease risks. Am J Epidemiol 108: 341-6.
Part III
Conducting a study
CHAPTER 8: Measurement of Exposure and
Health Status
In this chapter I briefly review the various options for measuring exposure and disease status. In the following chapters I then discuss the practicalities of conducting cohort, case-control and cross-sectional studies.
8.1: Exposure
years of exposure to a sensitising agent (Antó et al, 1996).

General Approaches to Exposure Assessment

Methods of exposure measurement include personal interviews or self-administered questionnaires (completed either by the study participant or by a proxy respondent), diaries, observation, routine records, physical or chemical measurements on the environment, or physical or chemical measurements on the person (Armstrong et al, 1992). For example, table 8.1 summarizes the types of exposure data most commonly used in occupational epidemiology studies (Checkoway et al, 2004). Measurements on the person can relate either to exogenous exposure (e.g. airborne dust) or internal dose (e.g. plasma cotinine); the other measurement options (e.g. questionnaires) all relate to exogenous exposures.

... can be measured in a variety of ways, including occupation, income, and education (Liberatos et al, 1988; Berkman and MacIntyre, 1997). These measures may pose problems in some demographic groups; for example, occupation and income may be poor measures of socio-economic status in women, for whom the total family situation may reflect their socio-economic status better than their individual situation, and measures of socio-economic status in children must be based on the situation of the parents or the total family situation. Nevertheless, the various measures of socio-economic status are strongly correlated with each other, and asthma epidemiology studies are usually based on whichever measures are available, unless socio-economic status is the main focus of the research and it is necessary to obtain more detailed information.

Questionnaires
Table 8.1
Example 8.1
Example 8.2
Vartia (2001) studied the consequences of workplace bullying in the municipal sector in Helsinki, Finland. Every 35th member of the Municipal Officials Union was selected and 1037 (65.5%) responded to a postal questionnaire. A definition of bullying was provided and study participants were asked if they felt themselves subjected to such behaviour, or if they had observed someone else at their workplace being bullied. They were also asked about the frequency and duration of such acts. Both the targets of bullying and the observers reported more general stress and mental stress reactions than did respondents from workplaces with no bullying. The targets of bullying used sleep-inducing drugs and sedatives more often than did the respondents who were not bullied.
Environmental Measurements and Job-Exposure Matrices

In many studies, e.g. community-based case-control studies, questionnaires are the only source of exposure information. However, in some instances, particularly in occupational studies, questionnaires may be combined with environmental exposure measurements (e.g. industrial hygiene surveys) to obtain a quantitative estimate of individual exposures. Table 8.2 shows environmental measurements in an asbestos textile plant in South Carolina (Dement et al, 1983; Checkoway et al, 2004). It shows that for each job title, exposure levels decreased over time, but increased again during the 1966-75 time period. Within each time period, the highest exposures were in raw fiber handling and the lowest were in general area workers. This historical exposure information can be combined with information from employment records to obtain exposure estimates for individual workers. For example, table 8.3 shows the cumulative exposure for a worker who worked as a card operator during 1933-1938 and then worked in “clean-up” during 1939-1948.
Example 8.3
Table 8.2
Table 8.3
Example of an exposure history of an individual worker
Job               Years      Mean exposure    Cumulative exposure
Card operator     1933-35        10.8               32.4
Card operator     1936-38         6.5               41.9
Clean-up          1939-45         8.8              103.5
Clean-up          1946-48         4.0              115.5
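The idea of combining a work history with period-specific mean exposures can be sketched as a running sum of (years in job) × (mean exposure). This is a simplified illustration under stated assumptions, not the method used to build table 8.3: each period is assumed to cover whole calendar years inclusive, and the function name is hypothetical. Under that assumption the first row (32.4) and the increments for the two clean-up periods (61.6 and 12.0) agree with the table, but the second period contributes 3 × 6.5 = 19.5 rather than the 9.5 implied by the printed 41.9, so the running totals below are 10.0 higher than the printed values from the second row onward. The sketch does not attempt to resolve that difference.

```python
# (job, first year, last year, mean exposure) as printed in Table 8.3.
history = [
    ("Card operator", 1933, 1935, 10.8),
    ("Card operator", 1936, 1938, 6.5),
    ("Clean-up", 1939, 1945, 8.8),
    ("Clean-up", 1946, 1948, 4.0),
]


def cumulative_exposure(periods):
    """Running total of years x mean exposure, assuming inclusive whole years."""
    total = 0.0
    running = []
    for _job, first_year, last_year, mean_exposure in periods:
        years = last_year - first_year + 1
        total += years * mean_exposure
        running.append(round(total, 1))
    return running


print(cumulative_exposure(history))  # [32.4, 51.9, 113.5, 125.5]
```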
Example 8.4
Biomarkers
More recently, there has been increasing emphasis on the use of molecular markers of internal dose (Schulte, 1993). In fact, there are a number of major limitations of currently available biomarkers of exposure (Armstrong et al, 1992), particularly with regard to historical exposures (Pearce et al, 1995). For example, serum levels of micronutrients reflect recent rather than historical dietary intake (Willett, 1990). Some biomarkers are better than others in this respect (particularly markers of exposure to biological agents), but even the best markers of chemical exposures usually reflect only the last few weeks or months of exposure. On the other hand, with some biomarkers it may be possible to estimate historical levels provided that certain assumptions are met. For example, it may be possible to estimate historical levels of exposure to pesticides (or contaminants) from current serum levels provided that the exposure period is known, and the half-life is known. Similarly, information on recent exposures can be used if it is reasonable to assume that exposure levels (or at least relative exposure levels) have remained stable over time (this may be particularly relevant in occupational studies), and have not been affected by lifestyle changes, or by the occurrence of the disease. However, if the aim is to measure historical exposures, then historical information on exposure surrogates may be more valid than direct measurements of current exposure or dose levels. This situation has long been recognised in occupational epidemiology, where the use of work history records in combination with a job-exposure matrix (based on historical exposure measurements of work areas rather than individuals) is usually considered to be more valid than current exposure measurements (whether based on environmental measurements or biomarkers) if the aim is to estimate historical exposure levels (Checkoway et al, 2004). On the other hand, some biomarkers have potential value in
validation of questionnaires which can then be used to estimate historical exposures. Furthermore, biomarkers of internal dose may have relatively good validity in studies involving an acute effect of exposure.

A more fundamental problem of measuring internal dose with a biomarker is that it is not always clear whether one is measuring the exposure, the biological effect, or some stage of the disease process itself (Saracci, 1984b). Thus the findings may be uninterpretable in terms of the causal association between exposure and disease. When it is known that the biologically effective dose is the most appropriate measure, then the use of appropriate biomarkers clearly has some scientific advantages. However, choosing the appropriate biomarker is a major dilemma, and biomarkers are frequently chosen on the basis of an incomplete or erroneous understanding of the etiologic process (or simply because a particular marker can be measured). An environmental exposure (e.g. tobacco smoke) may involve hundreds of different chemicals, each of which may produce hundreds of measurable biological responses (there are exceptions to this, of course, such as environmental lead exposure, but most environmental exposure involves complex mixtures). A biomarker typically measures one of the biological responses to one of the chemicals. If the chosen biomarker measures the key etiological factor, then it may yield relatively good exposure data; however, if a biomarker is chosen which has little relationship to the etiological component of the complex exposure mixture then the biomarker will yield relatively poor exposure data.

A further major problem with the use of biomarkers is that the resulting expense and complexity may drastically reduce the study size, even in a case-control study, and therefore greatly reduce the statistical power for detecting an association between exposure and disease.
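The earlier remark that historical levels may be estimated from current serum levels when the exposure period and the half-life are known can be illustrated with a simple back-extrapolation. The sketch below assumes first-order (exponential) elimination with a constant half-life and no exposure after the end of the exposure period; these modelling assumptions, the function name, and the numerical values are illustrative and are not taken from the text.

```python
def back_extrapolate(current_level, years_since_exposure_ended, half_life_years):
    """Estimate the serum level at the end of exposure.

    Assumes single-compartment, first-order elimination: the level halves
    every `half_life_years` once exposure has stopped.
    """
    return current_level * 2 ** (years_since_exposure_ended / half_life_years)


# Hypothetical values: current level 10.0 (arbitrary units), exposure ended
# 14 years ago, half-life 7 years, i.e. two half-lives have elapsed.
print(back_extrapolate(10.0, 14.0, 7.0))  # 40.0
```

In practice the estimate is only as good as the assumed half-life and the knowledge of when exposure stopped, which is why the text stresses that both must be known.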
Example 8.5
biological measurements) will vary from study to study, and from exposure to exposure within the same study, or within the same complex chemical mixture (e.g. in tobacco smoke).
The type of information required for measuring health status in epidemiological studies may be different from that which is required in clinical practice. As with exposure data, the key issue is that information should be of similar quality for the various groups being compared. For example, suppose that the bladder cancer incidence in a particular geographical area is being compared with national incidence rates; then it would be inappropriate to conduct a pathological review and reclassification of the cases of the cancer identified in the area, since such a reclassification had not been made for the national data and the information would not be comparable. Rather, the cancer cases in the area should be classified exactly as they had been classified in routine national cancer statistics. Thus, the emphasis should be on the comparability of information across the various groups being compared.

The types of health outcome data used in epidemiological studies include: mortality; disease registers; health service records; and morbidity surveys. These can be grouped into data based on routinely collected records, and morbidity data that is collected for a specific epidemiologic study.

Routine Records

Most countries maintain comprehensive death registration systems at the national or regional levels, and cause of death information for identified deaths can be obtained by requesting copies of death certificates from national, state, or municipal vital statistics offices. In most instances the causes of death are coded by a nosologist trained in the rules specified in the International Classification of Diseases (ICD) volumes compiled by the World Health Organisation. Revisions to the ICD coding are made about every ten years, and in some instances the ICD code for a particular cause of death may change (Checkoway et al, 2004).

Some countries or states also maintain incidence registers for conditions such as cancer, congenital malformations or epilepsy. These have most commonly been established for cancer registration and the International Agency for Research on Cancer (IARC) has been attempting to encourage the establishment of cancer registries and to standardise methods of cancer registration throughout the world (Jensen et al, 1991). Provided that registration is relatively complete, then cancer registrations can provide valuable additional health status information (and increase the number of identified cases) in a cohort study. Furthermore, cancer registries are invaluable for identifying newly diagnosed cases who can be interviewed (while they are still alive) for population-based case-control studies.

Many Western countries have notification systems for occupational diseases. For example, in the United Kingdom the
Surveillance of Work Related and Occupational Respiratory Disease (SWORD) project was established in 1989 as a national surveillance scheme for occupational respiratory disease (Meredith et al, 1991).

As discussed in chapter 9, other routinely collected records can be used for determining health status in cohort studies, or to create informal “registers” for identifying cases for case-control studies; these include hospital admission records, health insurance claims, health maintenance organisation (HMO) records, and family doctor (general practitioner) records.
Example 8.6
Morbidity Surveys
"diagnosed asthma" in asthma prevalence studies, since the diagnosis of "variable airflow obstruction" usually requires several medical consultations over an extended period. It is therefore not surprising that several studies have found the prevalence of physician-diagnosed asthma to be substantially lower than the prevalence of asthma symptoms. Such problems of differences in diagnostic practice could be minimised by using a standardised protocol for asthma diagnosis in prevalence studies. However, this is rarely a realistic option since it requires repeated contacts between the study participants and physicians, and this is not possible or affordable in large-scale epidemiological studies. Thus, most epidemiological studies must, by necessity, focus on factors which are related to, or symptomatic of, asthma but which can be readily assessed on a particular day. The main options in this regard are symptoms and physiological measurements (Pearce et al, 1998). In particular, standardised symptoms questionnaires have been developed for use in adults (Burney et al, 1994) and children (Asher et al, 1995).
Example 8.7
Health status can also be measured by more general morbidity and “quality of life” questionnaires. Perhaps the most widely used questionnaire has been the Medical Outcomes Study Short Form (SF-36) (Ware, 1993). This includes scales to measure physical functioning, role functioning, bodily pain, mental health, and general health perceptions. The SF-36 scales have been widely used in clinical research in a wide variety of populations to assess overall health status.
Summary
References
Pearce N, Sanjose S, Boffetta P, et al (1995). Limitations of biomarkers of exposure in cancer epidemiology. Epidemiol 6: 190-4.

Polednak AP (1989). Racial and ethnic differences in disease. New York: Oxford University Press.

Pomare E, Tutengaehe H, Ramsden I, et al (1992). Asthma in Maori people. NZ Med J 105: 469-70.

Raum E, Arabin B, Schlaud M, et al (2001). The impact of maternal education on intrauterine growth: a comparison of former West and East Germany. Int J Epidemiol 30: 81-7.

Ross RK, Yuan J-M, Yu MC, et al (1992). Urinary aflatoxin biomarkers and risk of hepatocellular carcinoma. Lancet 339: 943-6.

Saracci R, Simonato L, Acheson ED, et al (1984a). Mortality and incidence of cancer of workers in the man made vitreous fibres producing industry: an international investigation at 13 European plants. Br J Ind Med 41: 425-36.

Saracci R (1984b). Assessing exposure of individuals in the identification of disease determinants. In: Berlin A, Draper M, Hemminki K, Vainio H (eds). Monitoring human exposure to carcinogenic and mutagenic agents. Lyon: IARC.

Schulte PA (1993). A conceptual and historical framework for molecular epidemiology. In: Schulte P, Perera FP. Molecular epidemiology: principles and practices. New York: Academic Press, pp 3-44.

Vartia M (2001). Consequences of workplace bullying with respect to the well-being of its targets and the observers of bullying. Scand J Work Environ Health 27: 63-9.

Ware JE (1993). SF-36 Health Survey, Manual and Interpretation Guide. Boston: The Health Institute.

World Health Organisation (WHO) (1985). WHO Study Group: Diabetes mellitus. Technical Report Series no 727. Geneva: World Health Organisation.

Willett W (1990). Nutritional epidemiology. New York: Oxford University Press.
CHAPTER 9: Cohort Studies
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)
historical cohort study might involve “all workers who worked for at least one month in the factory at any time during 1970-1999”. The list of such workers can be enumerated using personnel records which also provide information on their job titles and departments (which can be used to estimate their historical exposures). The risk period

the country who did not work in the factory). However, this is rarely feasible in practice, and is usually a trivial problem if the exposure is rare. Thus, the comparison is usually made between the exposed group and the national population as a whole.
Example 9.1
Example 9.2
9.2: Measuring exposure
Example 9.3
9.3: Follow-up
registration records, and it is not necessary (or desirable) to recode death registrations for a specific study. However, the ICD codes have changed over time, and when using routine death registration records it is necessary to be aware of which ICD revision was in effect at the time of death.

Person-time

In a study of a specific population, e.g. workers in a particular factory, participants may enter the study on the date that the study starts (1/1/70), or the date that they first meet the eligibility criteria (i.e. employment for one month), whichever is the latest date. If they started working in the factory after the start of the study, then they would only start being followed on the date they started work (or a subsequent date when they met the eligibility criteria).

They stop contributing person-time when they die (or are diagnosed with the disease in an incidence study), emigrate, are lost to follow-up, or the study finishes (31/12/99), whichever is the earliest.
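The entry and exit rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `person_years`, the three example workers, and the approximation of "one month" as 30 days are all hypothetical and are not taken from the study described in the text.

```python
from datetime import date, timedelta

STUDY_START = date(1970, 1, 1)
STUDY_END = date(1999, 12, 31)
ELIGIBILITY = timedelta(days=30)  # assumption: "one month" approximated as 30 days


def person_years(hire_date, exit_date=None):
    """Person-years contributed by one worker.

    exit_date is the earliest of death, diagnosis, emigration or loss to
    follow-up, or None if none of these occurred before the study ended.
    """
    # Follow-up starts at the LATEST of study start and the date eligibility is met.
    entry = max(STUDY_START, hire_date + ELIGIBILITY)
    # Follow-up stops at the EARLIEST of the exit event and the end of the study.
    stop = STUDY_END if exit_date is None else min(exit_date, STUDY_END)
    if stop <= entry:
        return 0.0  # never under observation while eligible
    return (stop - entry).days / 365.25


# Hypothetical workers: (hire date, exit date or None)
workers = [
    (date(1965, 3, 1), None),                # employed before 1970, followed to study end
    (date(1980, 6, 15), date(1990, 6, 15)),  # hired during the study, died in 1990
    (date(1999, 12, 20), None),              # hired too late to meet the eligibility rule
]
total_person_years = sum(person_years(hire, stop) for hire, stop in workers)
print(round(total_person_years, 2))
```

The sum of these individual contributions is the person-time denominator used for the incidence rates in the cohort.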
Example 9.4
Summary
Cohort studies provide the most comprehensive approach for evaluating patterns of exposure and disease, since they involve studying the entire source population (assuming that there is a 100% response rate) over the entire risk period.

Thus, the cohort design ideally includes all of the relevant person-time experience of the source population over the risk period. A cohort study may be based on a particular community (e.g. a geographical community), or on a more specific population defined by a particular exposure (e.g. workers in a particular factory). In both instances, an internal comparison would ideally be made between those participants exposed and those participants not exposed to a particular risk factor. However, in some instances, all of the study participants may be exposed, or valid individual exposure information may not be available, and it may be necessary to make an external comparison, e.g. with national mortality rates (in which case the national population comprises the source population for the study). It is important that any comparisons are made over the same risk period, and that follow-up is as complete as possible. The basic effect measures in a cohort study are the rate ratio and risk ratio. Methods of data analysis for these effect measures are described in chapter 12.
References
population-based study (Iceland). Cancer Causes and Control 12: 95-101.

Wentworth DN, Neaton JD, Rasmussen WL (1983). An evaluation of the Social Security Administration Master Beneficiary Record file and the National Death Index in the ascertainment of vital status. Am J Public Health 73: 1270-1274.
CHAPTER 10: Case-control Studies
Example 10.1
In a population-based study the first step in the selection of cases is to attempt to ascertain all cases generated by the source population over the risk period (Checkoway et al, 2004). If complete case ascertainment is not achieved, then the relative risk estimate (odds ratio) will not necessarily be biased unless case ascertainment is associated with exposure history, e.g. if people who are prescribed a particular drug receive more intensive medical screening and are therefore more likely to be diagnosed with a non-fatal myocardial infarction.

In a registry-based study the case-group usually consists of all incident cases occurring in the registry during the risk period. The “registry” could consist of a formal population-based registry (e.g. a cancer registry or birth defects registry), or could involve an ad hoc “registry”, e.g. based on admission records for the major hospitals in a city.
Example 10.2
Mian et al (2001) studied homicide in Orangi, the largest squatter settlement in Karachi with an estimated population of 1.2 million. They defined the cases as individuals who lived in Orangi and were killed in Orangi between January 1994 and January 1997, due to intentional violence, by firearms, sharp or blunt trauma. Cases were identified in the 15 neighbourhoods (out of 103 in total in Orangi) which field workers identified as the highest violence neighbourhoods. Field workers identified households where they knew someone had been killed; in a few neighbourhoods they also contacted other social organisations in the community to identify further cases. Controls were selected from a random sample of households enrolled in a related study conducted at the same time in the same 15 neighbourhoods. For both cases and controls, the interviews were conducted with their wife, or if she was inaccessible or unwilling it was conducted with the wife of the head of the household. People who were killed were 34 times more likely to have attended all political processions (29% versus 1%, odds ratio (OR) = 34, 95% CI 4-749), 19 times more likely to have attended political meetings (31% versus 2%, OR = 19, 95% CI 4-136), and 17 times more likely to have held an important position in a political party (29% versus 2%, OR = 17, 95% CI 3-120). The authors concluded that homicide in Orangi was political and that efforts to build trust between ethnic groups and to build legitimacy for non-violent forms of conflict resolution are important steps to limit future violence.
living in the same country during the same year, actually involves density sampling with calendar year as the “time” matching variable (possibly with additional matching on the additional “time” variable of age).

Sources of controls

In a population-based case-control study, controls are usually sampled at random from the entire source population (perhaps with matching on factors such as age and gender). In some instances, it may be necessary to restrict the source population in order to achieve valid control sampling. For example, if controls are to be selected from voter registration rolls, and these are known to be less than 100% complete for the geographical area under study, then the source population might be restricted to persons appearing on the voter registration roll, and cases that were not registered to vote would be excluded; controls would then be sampled from this redefined source population by taking a random sample of the roll.

In registry-based studies, selection of controls may not be so straightforward because the source population may not be so easy to define and enumerate. For example, if there are two major hospitals in a city, and a study is based on lung cancer admissions in one of them during a defined risk period, then the source population is “all those who would have come to this hospital for treatment if they had developed lung cancer during this risk period”. This population may be difficult to define and enumerate, particularly if cases may also be referred from smaller regional hospitals. The best solution is usually to define a more specific source population (e.g. all people living in the city) and to attempt to identify all cases generated by that source population, e.g. by including admissions from all major hospitals in the city and excluding cases who do not live in the city; controls can then be sampled from that defined source population.

If it proves impossible to define and enumerate the source population, then one possibility is to select controls from people appearing in the same “register” for other health conditions (e.g. admissions to the hospital for other causes). This may not only produce a valid sample of the “source population”, but may also have advantages in making the case and control recall more comparable (Smith et al, 1988). However, it may result in bias if the other health conditions are also caused (or prevented) by the exposure under study (Pearce and Checkoway, 1988). For this reason, the population-based approach is preferable, although registry-based studies may still be valuable when population-based studies are not practicable, provided that careful consideration is given to possible sources of bias.

Matching

In some instances it may be appropriate to match cases and controls on potential confounders (e.g. age and gender). This can be done by 1:1 matching (e.g. for each case, choose a control of the same age and gender) or by frequency matching (e.g. if there are 25 male cases in the 30-34 age-group then choose the same number of male controls for this age-group). It is important to emphasize, however, that this will not remove confounding in a case-control study, but will merely facilitate its control in the analysis. For example, in a case-control study of lung cancer, the cases will generally be relatively old whereas a random general population control sample will be relatively young. This may lead to inefficiencies when age is controlled in the analysis since the older age-groups will contain many cases and few controls, whereas the younger age-groups will contain many controls and few cases. Matching on age will ensure that there are approximately equal numbers of cases and controls in each age stratum and will thereby improve the precision of the effect estimates (given a fixed number of cases and controls). However, it will not remove confounding by age; it merely makes it easier to control in the analysis (Checkoway et al, 2004).

It is also important to emphasize that if “pair” matching (i.e. 1:1 matching) has been done, then it is important to control for the matching factors in the analysis, but that this need not involve a “matched analysis”. For example, if pair matching has been done on age and gender, then it is important to control for age and gender in the analysis, but this can be done with simple stratification on age (e.g. by five-year age-groups) and gender and it is not necessary to retain the 1:1 matched pairs in the analysis (Rothman and Greenland, 1998).

There are also potential disadvantages of matching. In particular, matching may actually reduce precision in a case-control study if it is done on a factor that is associated with exposure but is not a risk factor for the disease under study and hence is not a true confounder (Rothman and Greenland, 1998). Furthermore, matching is often expensive and/or time consuming. For these reasons, it is usually sufficient, and preferable, to only match on basic demographic factors such as age and gender, and to then control for other potential confounders (along with age and gender) in the analysis (Checkoway et al, 2004).
Example 10.3
10.4: Measuring exposure
Once the cases and controls have been selected, information on previous exposures is then obtained for both groups. As discussed in chapter 8, there are a variety of possible methods for measuring exposure in case-control studies. In some instances this may be from historical records, e.g. personnel records that contain work history information.

Perhaps more commonly, exposure information may be obtained from questionnaires. It is this latter feature of case-control studies which has left them open to criticism as being particularly prone to bias, e.g. because the recall of past exposures (e.g. eating meat, drinking alcohol, spraying pesticides) may be different between cases of disease and healthy controls. However, collecting exposure information from questionnaires is not an inherent feature of case-control studies, and is sometimes also a feature of cohort studies. Thus, there is nothing inherently biased in the case-control design; rather what is important is the validity of the exposure information that is collected, whatever study design is employed.
Summary
The only conceptual difference between a full cohort study based on a specified source population and risk period, and an (incidence) case-control study based on the same source population and risk period, is that the latter involves outcome-specific samples of the source population, rather than an analysis of the entire source population. There is usually little loss of precision compared to a full cohort study, and there may be considerable savings in terms of time and expense, particularly if the study disease is rare or has a long induction time.

The key feature of good case-control study design is that the study should be based on a specified source population and risk period. The tasks are then to: (i) identify all cases generated by the source population over the risk period; (ii) select a random sample of controls from the source population over the risk period (ideally by density matching); (iii) obtain exposure information from cases and controls in a standardised and unbiased manner.

The standard effect estimate in a case-control study is the odds ratio. If controls are selected by density matching, then the odds ratio will estimate the incidence rate ratio (in the source population and risk period) in an unbiased manner without the need for any rare disease assumption. Methods of data analysis for odds ratios are described in chapter 12.
References
CHAPTER 11: Prevalence Studies
Example 11.1
prevalence will be exactly equal to the true difference in prevalence. More commonly, Youden's Index will be less than 1 and the observed prevalence difference will be reduced accordingly, e.g. if Youden's Index is 0.75 then the observed prevalence difference will be 0.75 times the true prevalence difference. Youden's Index therefore provides the most appropriate measure of the validity of a particular question or technique in prevalence comparisons (Pekkanen and Pearce, 1999).

In this respect, basic symptom questionnaires may often perform better than supposedly more “objective” measures such as bronchial responsiveness testing (Pearce et al, 1998).
Example 11.2
Table 11.1

                          Actual                        Observed
                  ----------------------    --------------------------------
                  Exposed    Non-exposed    Exposed          Non-exposed
----------------------------------------------------------------------------
Asthmatics           40          20         32 + 6 = 38      16 + 8 = 24
Non-asthmatics       60          80         54 + 8 = 62      72 + 4 = 76
----------------------------------------------------------------------------
Total               100         100            100              100
----------------------------------------------------------------------------
Prevalence          40%         20%            38%              24%
----------------------------------------------------------------------------
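The relationship between Youden's Index and the observed prevalence difference can be checked against the figures in Table 11.1. In the sketch below the sensitivity (0.8) and specificity (0.9) are inferred from the table entries (32 of 40 true asthmatics and 54 of 60 true non-asthmatics correctly classified); the function name `observed_prevalence` is hypothetical.

```python
def observed_prevalence(true_prevalence, sensitivity, specificity):
    """Prevalence that would be measured with an imperfect question or test."""
    true_positives = true_prevalence * sensitivity
    false_positives = (1 - true_prevalence) * (1 - specificity)
    return true_positives + false_positives


# Inferred from Table 11.1: 32/40 asthmatics and 54/60 non-asthmatics correctly classified
sensitivity, specificity = 0.8, 0.9
youden = sensitivity + specificity - 1

exposed = observed_prevalence(0.40, sensitivity, specificity)
non_exposed = observed_prevalence(0.20, sensitivity, specificity)

true_difference = 0.40 - 0.20
observed_difference = exposed - non_exposed

print(round(youden, 2), round(exposed, 2), round(non_exposed, 2))
print(round(observed_difference, 2), round(youden * true_difference, 2))
```

With these values the observed prevalences are 38% and 24%, as in the table, and the observed difference of 14 percentage points equals Youden's Index (0.7) times the true difference of 20 percentage points.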
collected for both current smoking as
well as smoking history.
Example 11.3
Summary
Incidence studies are usually the preferred approach, but in some settings and for some conditions prevalence studies are the only option. Furthermore, in some instances we may be more interested in factors that affect the current burden of disease in the population rather than disease incidence. The conduct of a prevalence study is (at least in theory) relatively straightforward. A source population is defined, and at one point in time the prevalence of disease is measured in the population. Exposure information is then obtained for all members of the source population (a prevalence study), or for all cases of the disease under study and a control sample of the non-cases (a prevalence case-control study). The standard effect estimate in a prevalence study is the odds ratio. Methods of data analysis for odds ratios are described in chapter 12.
References
Part IV
CHAPTER 12: Data Analysis
of coding instructions that should have been prepared prior to data collection. For instance, a detailed occupational history may have been taken in a semi-narrative form, and must be subsequently coded. It is usually preferable to do this when entering the data directly onto a PC, since this minimizes transcription errors.

Once the data are coded and entered, programmes should be run that seek strange data, contradictions, and impossible data (e.g. a systolic blood pressure of 40 mm Hg). These programmes should not be restricted to a search for logic errors or impermissible symbols. They should also include procedures that identify values that lie outside plausible limits. The values being queried should be listed, and decisions on how the "errors" are dealt with should be documented. With many packages, this process can be conducted during the actual data entry since the range of permissible values (for numeric variables) or legal codes (for alphanumeric variables) can be specified, as well as variables which must not be left blank, conditional jumps (e.g. if the answer is "NO" the computer skips to the next relevant question), repeat fields (so that the value of a variable is set by default to that of the last record entered or displayed), and logical links between variables. The best method of data checking is to enter all of the data twice, and to compare the two files for discrepancies. This approach, combined with extensive edit checks at the time of data entry, should minimize errors.

Even with double data entry and sophisticated checking procedures, errors may occur, and it is therefore important to run further edit checks before data analysis begins. It is particularly important to finish all edit checks and to have a final version of the data file before any data analysis is done, both to avoid confusion, and also to avoid any possibility of the data coding and checking being influenced by the results of preliminary analyses. Once the data have been entered and edited, there is usually a major task of data management. This typically involves the use of a computer package to transform the data, compute new variables, and prepare new files suitable for statistical analysis.

Data Analysis

The basic aim of the analysis of a single study is to estimate the effect of exposure on the outcome under study while controlling for confounding and minimizing other possible sources of bias. In addition, when confounding and other sources of bias cannot be removed, then it is important to assess their likely strength and direction. This latter task was discussed in chapter 7. In this chapter I focus on the control of confounding.

Effect estimation

The basic effect measures, and methods of controlling confounding, are described below. Usually, in epidemiologic studies, we wish to measure the difference in disease occurrence between groups exposed and not exposed to a particular factor.

The analysis ideally should control simultaneously for all confounding factors. Control of confounding in the analysis involves stratifying the data according to the levels of the confounder(s) and calculating an effect estimate which summarizes the information across strata of the confounder(s). For example, controlling for age (grouped into 5 categories) and gender (with 2 categories) might involve grouping the data into the 10 (= 5 x 2) confounder strata and calculating a
summary effect estimate which is a weighted average of the stratum-specific effect estimates.

Confidence intervals

As well as estimating the effect of an exposure, it is also important to estimate the statistical precision of the effect estimate. The confidence interval (usually the 95% confidence interval) provides a range of values in which it is plausible (provided that there is no uncontrolled confounding or other bias) that the true effect may lie. If the statistical model is correct, and there is no bias, then the confidence intervals derived from an infinite series of study repetitions would contain the true effect with a frequency no less than its confidence level (Rothman and Greenland, 1998).

The usual practice is to use 90% or 95% confidence intervals, but these values are completely arbitrary. Given a large enough sample, an approximate 95% confidence interval for the true population mean is:

$$m \pm 1.96\,SE$$

where m is the observed sample mean and SE is its standard error.

In most instances, epidemiologic data involve binomial (i.e. with persons in the denominator) or Poisson (i.e. with person-years in the denominator) outcome variables and ratio measures of effect. The estimated relative risk (rate ratio, risk ratio, odds ratio) has an approximate log normal distribution, and the ln(RR) can be written as the difference of the natural logs of the two compared risks:

$$\ln(RR) = \ln(R_1/R_0) = \ln(R_1) - \ln(R_0)$$

Thus (assuming no bias) the 95% confidence interval for the natural log (ln) of the relative risk is:

$$\ln(RR) \pm 1.96\,SE$$

Thus the confidence interval for the relative risk itself is:

$$RR\,e^{\pm 1.96\,SE}$$

P-Values
chance if the null hypothesis (that there is no difference in reality) were true.

In the past, p-values have often been used to describe the results of a study as "significant" or "not significant" on the basis of decision rules involving an arbitrary alpha level as a “cutoff” for significance (e.g. alpha=0.05). However, it is now recognised that there are major problems with this approach (Rothman and Greenland, 1998).

First, the p-value associated with a difference in outcome between two groups depends on two factors: the size of the difference; and the size of the study. A very small difference may be statistically significant if the study is very large, whereas a very large difference may not be significant if the study is very small. p-values thus combine two phenomena which should be kept separate: the size of the effect; and the size of the study used to measure it.

A second problem with significance testing is more fundamental. The purpose of significance testing is to reach a decision. However, in environmental research, decisions should ideally not be based on the results of a single study, but should be based on information from all available studies, as well as non-statistical considerations such as the plausibility and coherence of the effect in the light of current theoretical and empirical knowledge (see chapter 13).

The problems of significance testing can be avoided by recognizing that the principal aim of an individual study should be to estimate the size of the effect rather than just to decide whether or not an effect is present. The point estimate should be accompanied by a confidence interval (the interval estimate) which indicates the precision of the point estimate by providing a range of values within which it is most plausible that the true treatment effect may lie if no bias were present (Gardner and Altman, 1986; Rothman and Greenland, 1998). The point estimate reflects the size of the effect, whereas the confidence interval reflects the study size on which this effect estimate is based. This approach also facilitates the comparison of the study findings with those of previous studies. Note that all conventional statistical methods assume “no bias is present”. Because this assumption is rarely if ever correct, further considerations beyond the statistics presented here are always needed (see chapter 13).
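The point that a p-value mixes effect size with study size can be illustrated numerically. The sketch below is an assumption-laden illustration, not a method from this chapter's worked examples: it uses a simple Wald test on the log rate ratio, with the Poisson large-sample standard error (1/a + 1/b)^0.5, and the counts and the function name `rate_ratio_summary` are hypothetical. Both hypothetical studies have the same rate ratio of 2.0, but one is 100 times larger.

```python
import math


def rate_ratio_summary(a, b, y1, y0):
    """Rate ratio, approximate 95% CI and two-sided Wald p-value.

    a, b: cases in the exposed and non-exposed groups
    y1, y0: person-years in the exposed and non-exposed groups
    """
    rr = (a / y1) / (b / y0)
    se = math.sqrt(1 / a + 1 / b)          # SE of ln(RR) under a Poisson model
    lower = rr * math.exp(-1.96 * se)
    upper = rr * math.exp(1.96 * se)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail probability
    return rr, lower, upper, p


small = rate_ratio_summary(a=10, b=5, y1=1_000, y0=1_000)
large = rate_ratio_summary(a=1_000, b=500, y1=100_000, y0=100_000)

for label, (rr, lower, upper, p) in (("small", small), ("large", large)):
    print(f"{label}: RR={rr:.2f}, 95% CI {lower:.2f}-{upper:.2f}, p={p:.3g}")
```

The small study is "not significant" and the large one is "highly significant", even though the estimated effect is identical; the confidence intervals show directly that the difference lies in precision rather than in the size of the effect.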
The basic measures of disease occurrence and association have been introduced in chapter 2. In this section I consider them in more depth and show how to calculate confidence intervals for the commonly used measures. In the next section I extend these methods to adjust for potential confounders. I will only present “large sample” methods of analysis which have sample size requirements for valid use. To avoid statistical bias, more complex techniques are required for analyses of studies involving very small numbers or sparse stratifications (Greenland et al, 2000). Once again, readers are referred to standard texts (particularly Rothman and Greenland, 1998) for a more comprehensive review of these methods. I will emphasise confidence intervals, but will also present methods for calculating p-values.

Table 12.1 shows the findings of a hypothetical incidence study of 20,000 persons followed for 10 years. As noted in chapter 2, three measures of disease incidence are commonly used in incidence studies.

The observed incidence rate in the non-exposed group (table 9.1) has the form:

$$I_0 = \frac{\text{cases}}{\text{person-time}} = \frac{b}{Y_0}$$

Table 12.1
Findings from a hypothetical cohort study of 20,000 persons followed for 10 years
The observed incidence proportion in the non-exposed group has the form:

$$R_0 = \frac{\text{cases}}{\text{persons}} = \frac{b}{N_0}$$

cases of disease (b). They differ in whether their denominators represent person-years at risk ($Y_0$), persons at risk ($N_0$), or survivors (d).

Measures of Effect
The natural log of the incidence odds ($\ln(O_0)$) has (under a binomial model) an approximate standard error of:

$$SE(\ln(O_0)) = (1/b + 1/d)^{0.5}$$

and a 95% confidence interval for $O_0$ is:

$$O_0\,e^{\pm 1.96\,SE}$$

An approximate p-value for the null hypothesis that the rate ratio equals the null value of 1.0 can be obtained using the person-time version of the Mantel-Haenszel chi-square (Breslow and Day, 1987). This test statistic compares the observed number of exposed cases with the number expected under the null hypothesis that $I_1 = I_0$:

$$\chi^2 = \frac{[Obs(a) - Exp(a)]^2}{Var(Exp(a))} = \frac{[a - Y_1 M_1/T]^2}{M_1 Y_1 Y_0/T^2}$$
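The person-time chi-square above can be evaluated directly. The following sketch uses hypothetical counts (30 exposed and 15 non-exposed cases, each group contributing 10,000 person-years); the function name `person_time_chi_square` is an assumption made for illustration, and the p-value uses the fact that the upper tail of a one degree-of-freedom chi-square equals erfc((χ²/2)^0.5).

```python
import math


def person_time_chi_square(a, b, y1, y0):
    """Person-time Mantel-Haenszel chi-square (1 df) and its p-value.

    a, b: exposed and non-exposed cases; y1, y0: exposed and non-exposed person-years.
    """
    m1 = a + b                         # total cases (M1)
    t = y1 + y0                        # total person-time (T)
    expected = y1 * m1 / t             # Exp(a) under the null hypothesis I1 = I0
    variance = m1 * y1 * y0 / t ** 2   # Var(Exp(a))
    chi2 = (a - expected) ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p


chi2, p = person_time_chi_square(a=30, b=15, y1=10_000, y0=10_000)
print(round(chi2, 2), round(p, 3))
```

Here 22.5 exposed cases are expected under the null hypothesis, with variance 11.25, giving a chi-square of 5.0.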
The natural logarithm of the rate ratio has (under a Poisson model for a and b) an approximate standard error of:

$$SE[\ln(RR)] = (1/a + 1/b)^{0.5}$$

An approximate 95% confidence interval for the risk ratio is then given by:

$$RR\,e^{\pm 1.96\,SE}$$
12.3: Control of Confounding
Pooling

Pooling involves calculating a summary effect estimate assuming stratum-specific effects are equal. There are a number of different methods of obtaining pooled effect estimates, but a commonly used method which is both simple and close to being statistically optimal (even when there are small numbers in all strata) is the method of Mantel and Haenszel (1959).

The Mantel-Haenszel summary rate ratio has the form:

$$RR = \frac{\sum a_i Y_{0i}/T_i}{\sum b_i Y_{1i}/T_i}$$

An approximate standard error for the natural log of the summary rate ratio is:

$$SE = \frac{[\sum M_{1i} Y_{1i} Y_{0i}/T_i^2]^{0.5}}{[(\sum a_i Y_{0i}/T_i)(\sum b_i Y_{1i}/T_i)]^{0.5}}$$

Thus, an approximate 95% confidence interval for the summary rate ratio is then given by:

$$RR\,e^{\pm 1.96\,SE}$$

The Mantel-Haenszel summary risk ratio has the form:

$$RR = \frac{\sum a_i N_{0i}/T_i}{\sum b_i N_{1i}/T_i}$$
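The Mantel-Haenszel summary rate ratio and its confidence interval can be computed stratum by stratum, as in the sketch below. The two strata and the function name `mantel_haenszel_rate_ratio` are hypothetical; the data were chosen so that the rate ratio is 2.0 in each stratum while the crude (unstratified) rate ratio is about 2.57, i.e. the crude estimate is confounded by the stratification variable.

```python
import math


def mantel_haenszel_rate_ratio(strata):
    """Mantel-Haenszel summary rate ratio with an approximate 95% CI.

    strata: iterable of (a, b, y1, y0) tuples, one per confounder stratum, where
    a, b are exposed and non-exposed cases and y1, y0 the corresponding person-years.
    """
    numerator = denominator = variance_term = 0.0
    for a, b, y1, y0 in strata:
        t = y1 + y0                            # T_i, total person-time in the stratum
        m1 = a + b                             # M_1i, total cases in the stratum
        numerator += a * y0 / t
        denominator += b * y1 / t
        variance_term += m1 * y1 * y0 / t ** 2
    rr = numerator / denominator
    se = math.sqrt(variance_term / (numerator * denominator))  # SE of ln(RR)
    return rr, rr * math.exp(-1.96 * se), rr * math.exp(1.96 * se)


# Hypothetical data for two strata, e.g. a younger and an older age-group
strata = [
    (20, 10, 5_000, 5_000),
    (40, 10, 2_000, 1_000),
]
rr, lower, upper = mantel_haenszel_rate_ratio(strata)
crude = (60 / 7_000) / (20 / 6_000)
print(f"pooled RR={rr:.2f} (95% CI {lower:.2f}-{upper:.2f}), crude RR={crude:.2f}")
```

The pooled estimate recovers the common stratum-specific rate ratio of 2.0, which is the sense in which pooling "controls" for the stratification variable.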
An approximate p-value for the hypothesis that the summary risk ratio is 1.0 can be obtained from the one degree-of-freedom Mantel-Haenszel summary chi-square (Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[\sum Obs(a) - \sum Exp(a)]^2}{\sum Var(Exp(a))} = \frac{[\sum a_i - \sum N_{1i} M_{1i}/T_i]^2}{\sum M_{1i} M_{0i} N_{1i} N_{0i}/[T_i^2 (T_i - 1)]}$$

where $M_{1i}$, $M_{0i}$, $N_{1i}$, $N_{0i}$ and $T_i$ are as depicted in table 9.1.

An approximate standard error for the natural log of the risk ratio is (Greenland and Robins, 1985b):

$$SE = \frac{[\sum M_{1i} N_{1i} N_{0i}/T_i^2 - \sum a_i b_i/T_i]^{0.5}}{[(\sum a_i N_{0i}/T_i)(\sum b_i N_{1i}/T_i)]^{0.5}}$$

Thus, an approximate 95% confidence interval for the summary risk ratio is then given by:

$$RR\,e^{\pm 1.96\,SE}$$

The Mantel-Haenszel summary odds ratio has the form:

$$OR = \frac{\sum a_i d_i/T_i}{\sum b_i c_i/T_i}$$

An approximate p-value for the hypothesis that the summary odds ratio is 1.0 can be obtained from the one degree-of-freedom Mantel-Haenszel summary chi-square (Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[\sum Obs(a) - \sum Exp(a)]^2}{\sum Var(Exp(a))} = \frac{[\sum a_i - \sum N_{1i} M_{1i}/T_i]^2}{\sum M_{1i} M_{0i} N_{1i} N_{0i}/[T_i^2 (T_i - 1)]}$$

where $M_{1i}$, $M_{0i}$, $N_{1i}$, $N_{0i}$ and $T_i$ are as depicted in table 12.1.

An approximate standard error for the natural log of the odds ratio (under a binomial or hypergeometric model) is (Robins et al, 1986):

$$SE = \left[\frac{\sum PR}{2R_+^2} + \frac{\sum (PS + QR)}{2R_+ S_+} + \frac{\sum QS}{2S_+^2}\right]^{0.5}$$

where:

P = $(a_i + d_i)/T_i$
Q = $(b_i + c_i)/T_i$
R = $a_i d_i/T_i$
S = $b_i c_i/T_i$
$R_+ = \sum R$
$S_+ = \sum S$

Thus, an approximate 95% confidence interval for the summary odds ratio is then given by:

$$OR\,e^{\pm 1.96\,SE}$$

Standardisation

Standardisation is an alternative approach to obtaining a summary effect estimate (Miettinen, 1974; Rothman and Greenland, 1998). Pooling involves calculating the effect estimate under the assumption that the measure (e.g. the rate ratio) would be the same (uniform) across strata if random error were absent. In contrast, standardisation involves taking a weighted average of the disease occurrence across strata (e.g. the standardized rate) and then comparing the standardized occurrence measure between exposed and non-exposed (e.g. the standardized rate ratio) with no assumptions of uniformity of effect. Standardisation is more prone than pooling to suffer from statistical instability due to small numbers in
specific strata; by comparison, pooling with Mantel-Haenszel estimators is robust and in general its statistical stability depends on the overall numbers rather than the numbers in specific strata. However, direct standardisation has practical advantages when more than two groups are being compared, e.g. when comparing multiple exposure groups or making comparisons between multiple countries or regions, and does not require the assumption of constant effects across strata.

The standardized rate has the form:

$$R = \frac{\sum w_i R_i}{\sum w_i}$$

where $R_i$ is the rate in stratum i and $w_i$ is the weight for stratum i. The natural log of the standardized rate has an approximate standard error (under the binomial model for random error) of:

$$SE = \frac{[\sum w_i^2 R_i (1 - R_i)/N_i]^{0.5}}{R \sum w_i}$$

where $N_i$ is the number of persons in stratum i. An approximate 95% confidence interval for the standardized rate is thus:

$$R\,e^{\pm 1.96\,SE}$$
log-linear rate regression, risk ratios can be modelled using binomial log-linear risk regression, and odds ratios can be modelled using binomial logistic regression (Pearce et al, 1988; Rothman and Greenland, 1998).

Similarly, continuous outcome variables (e.g. in a cross-sectional study) can be modelled with standard multiple linear regression methods. These models all have similar forms, with minor variations to take into account the different data types. They provide powerful tools when used appropriately, but are often used inappropriately, and should always be used in combination with the more straightforward methods presented here (Rothman and Greenland, 1998). Mathematical modelling methods and issues are reviewed in depth in a number of standard texts (e.g. Breslow and Day, 1980, 1987; Checkoway et al, 2004; Clayton and Hills, 1993; Rothman and Greenland, 1998), and will not be discussed in detail here.

Table 12.2

Segi’s World population

Age-group        Population
-----------------------------
0-4 years          12,000
5-9 years          10,000
10-14 years         9,000
15-19 years         9,000
20-24 years         8,000
25-29 years         8,000
30-34 years         6,000
35-39 years         6,000
40-44 years         6,000
45-49 years         6,000
50-54 years         5,000
55-59 years         4,000
60-64 years         4,000
65-69 years         3,000
70-74 years         2,000
75-79 years         1,000
80-84 years           500
85+ years             500
-----------------------------
Total             100,000
-----------------------------
Source: Segi (1960)
Summary
The basic aim of the analysis of a single study is to estimate the effect of exposure on the outcome under study while controlling for confounding and minimizing other possible sources of bias. In addition, when confounding and other sources of bias cannot be removed, then it is important to assess their likely strength and direction. Control of confounding in the analysis involves stratifying the data according to the levels of the confounder(s) and calculating an effect estimate which summarizes the information across strata of the confounder(s). In general, control of confounding requires careful use of a priori knowledge, together with assessment of the extent to which the effect estimate changes when the factor is controlled in the analysis. There are two basic methods of calculating a summary effect estimate to control confounding: pooling and standardisation. Multiple regression allows for the simultaneous control of more confounders by "smoothing" the data across confounder strata. It provides a powerful tool when used appropriately, but is often used inappropriately, and should always be used in combination with the more straightforward methods presented here.
CHAPTER 13: Interpretation
[In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005]
In this chapter I first consider the issues involved in interpreting the findings of a single epidemiological study. I then consider problems of interpretation of all of the available evidence. Interpreting the findings of a single study includes considering the strength and precision of the effect estimate and the possibility that it may have been affected by various possible biases (confounding, selection bias, information bias). If it is concluded that the observed associations are likely to be valid, then attention shifts to more general causal inference, which should be based on all available information. In both situations, it should be stressed that epidemiological studies almost always contain potential biases, and the focus should be on assessing the likely direction and magnitude of the biases, and whether they could explain the observed associations.
alone?” and this issue is usually assessed by calculating the p-value. This is the probability (assuming that there are no biases) that a test statistic as large as that actually observed would be found in a study if the null hypothesis were true, i.e. that there was in reality no causal effect of exposure. However, recent reviews have stressed the limitations of p-values and significance testing (Rothman, 1978; Gardner and Altman, 1986; Poole, 1987; Pearce and Jackson, 1988). Foremost among these is that significance testing attempts to reach a decision on the basis of the data from a single study, whereas what is more important is the strength and precision of the effect estimate and whether the findings of a particular study are consistent with those of previous studies. These issues are better addressed by calculating confidence intervals rather than p-values (Gardner and Altman, 1986; Rothman and Greenland, 1998). Similarly, the possibility that the lack of a statistically significant association could be due to lack of precision (lack of study power) is more appropriately addressed by considering the confidence interval of the effect estimate rather than by making post hoc power calculations (Smith and Bates, 1992).

What are the likely strengths and directions of possible biases?

Systematic error is distinguished from random error in that it would be present even with an infinitely large study, whereas random error can be reduced by increasing the study size. Thus, systematic error, or "bias", occurs if there is a systematic difference between what the study is actually estimating and what it is intended to estimate. The types of bias (confounding, selection bias, information bias) have already been discussed in chapter 6. In the current context the key issue is that any epidemiologic study will involve biases. The problem is not to identify possible biases (these will almost always exist), but rather to ascertain what direction they are likely to be in, and how strong they are likely to be.

Confounding

In assessing whether an observed association could be due to confounding, the first consideration is whether all potential confounders have been appropriately controlled for or appropriately assessed (e.g. by collecting and using confounder information in a sample of study participants). If not, it is essential to assess the potential strength and direction of uncontrolled confounding.

In some areas of epidemiologic research, e.g. occupational and environmental studies, the strength of uncontrolled confounding is often less than might be expected. For example, Axelson (1978) has shown that for plausible estimates of the smoking prevalence in occupational populations, confounding by smoking can rarely account for a relative risk of lung cancer of greater than 1.5. Similarly, Siemiatycki et al (1988) have found that confounding by smoking is generally even weaker for internal comparisons (in which exposed workers are compared with non-exposed workers in the same factory or industry). On the other hand, the potential for confounding can be severe in studies of lifestyle and related factors (e.g. diet, nutrition, exercise).

It is unreasonable to simply assume that a strong association could be due to confounding by unknown risk factors, since to be a strong confounder a factor must be a very strong risk factor as well as being strongly associated with exposure. For example, if an occupational study found a relative risk of 2.0 for lung cancer in exposed workers, it is highly unlikely that this could be due to confounding by smoking, and it would be unreasonable to dismiss the study findings merely because smoking information had not been available. On the other hand, small relative risks (e.g. those in the range of 0.7-1.5, as frequently occur in dietary studies) are not so difficult to explain by lack of measurement, or poor measurement and control, of confounders.

Selection bias

Whereas confounding generally involves biases inherent in the source population, selection bias involves biases arising from the procedures by which the study subjects are chosen from the source population. As with confounding, if it is not possible to directly control for selection bias, it still may be possible to assess its likely strength and direction. It is unreasonable to dismiss the findings of a particular study because of possible selection bias, without at least attempting to assess which direction the possible selection bias would have been in, and how strong it might have been.

Information bias

With regard to information bias, the key issue is whether misclassification is likely to have been differential or non-differential. In the latter case, the bias will usually be in a known direction, i.e. towards the null. If misclassification has been differential, then it is important to attempt to assess what direction the bias is likely to have been in. The important issue is not whether information bias could have occurred (this is almost always the case since there are almost always problems of misclassification of exposure and/or disease) but rather the likely direction and strength of such bias. In particular, if a study has yielded a positive finding (i.e. an effect estimate markedly different from the null value) then it is not valid to dismiss it because of the possibility of non-differential misclassification, or differential misclassification that is likely (although not guaranteed) to produce a bias towards the null.

Summary of Issues of Systematic Error

In summary, when assessing whether the findings of a particular study could be due to such biases, the important issue is not whether such biases are likely to have occurred (since they will almost always be present to some extent), but rather what their direction and strength are likely to be, and whether, taken together, they could explain the observed association. In particular, epidemiological studies are often criticized on the grounds that observed associations could be due to uncontrolled confounding or errors in the classification of exposure or disease. However, the likely strength of uncontrolled confounding is sometimes less than might be expected, and non-differential misclassification of exposure will usually (though not always) produce a tendency for false negative findings rather than false positive findings.
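The argument that confounding by smoking can rarely explain a large relative risk in an occupational study can be made concrete with a simple indirect adjustment. The Python sketch below is one common way of doing this calculation and is not necessarily the exact method used in the papers cited above. The smoking categories, rate ratios and prevalences are hypothetical values chosen for illustration, and the function name is an assumption.

```python
def confounding_risk_ratio(rate_ratios, prev_exposed, prev_reference):
    """Relative risk expected from confounding alone.

    rate_ratios: disease rate ratio for each confounder category
                 (relative to the lowest category).
    prev_exposed / prev_reference: proportion of each population in each
                 category; each list must sum to 1.
    Hypothetical helper for illustration only.
    """
    if not (len(rate_ratios) == len(prev_exposed) == len(prev_reference)):
        raise ValueError("all inputs must have the same length")
    for prevalences in (prev_exposed, prev_reference):
        if abs(sum(prevalences) - 1.0) > 1e-9:
            raise ValueError("prevalences must sum to 1")

    # Expected disease rate in each population, relative to the lowest category
    expected_exposed = sum(p * rr for p, rr in zip(prev_exposed, rate_ratios))
    expected_reference = sum(p * rr for p, rr in zip(prev_reference, rate_ratios))
    return expected_exposed / expected_reference


# Hypothetical values: non-smokers, moderate smokers, heavy smokers
smoking_rate_ratios = [1.0, 10.0, 20.0]
prev_reference = [0.5, 0.4, 0.1]   # assumed smoking distribution, comparison group
prev_exposed = [0.3, 0.5, 0.2]     # assumed distribution, exposed workers

crr = confounding_risk_ratio(smoking_rate_ratios, prev_exposed, prev_reference)
observed_rr = 2.0                  # hypothetical observed relative risk
adjusted_rr = observed_rr / crr

print(f"confounding risk ratio = {crr:.2f}")      # about 1.43
print(f"externally adjusted RR = {adjusted_rr:.2f}")  # about 1.40
```

Even with a considerably heavier smoking pattern assumed for the exposed workers, the relative risk attributable to smoking alone stays below 1.5 in this example, so it could not fully account for an observed relative risk of 2.0.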
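The tendency of non-differential misclassification of exposure to bias an effect estimate towards the null can also be shown with a small calculation. The sketch below assumes a binary exposure, a case-control layout, and the same sensitivity and specificity of exposure classification in cases and controls. All counts and accuracy values are hypothetical, and the function names are assumptions made for this example.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from a 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts after imperfect exposure classification."""
    observed_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    observed_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


# Hypothetical true exposure counts
true_cases = (200, 800)       # exposed, unexposed
true_controls = (100, 900)

# Same sensitivity and specificity in cases and controls (non-differential)
sensitivity, specificity = 0.8, 0.9
obs_cases = misclassify(*true_cases, sensitivity, specificity)
obs_controls = misclassify(*true_controls, sensitivity, specificity)

true_or = odds_ratio(*true_cases, *true_controls)
observed_or = odds_ratio(*obs_cases, *obs_controls)

print(f"true odds ratio     = {true_or:.2f}")      # 2.25
print(f"observed odds ratio = {observed_or:.2f}")  # about 1.54
```

The observed odds ratio is closer to 1.0 than the true value. Applying different sensitivity or specificity values to cases and controls (differential misclassification) can move the estimate in either direction.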
13.2: Appraisal of All of the Available Evidence
exclude alternative explanations for the observed association.

A dose-response relationship occurs when changes in the level of exposure are associated with changes in the prevalence or incidence of the effect that one would expect from biologic considerations. The absence of an expected dose-response relationship provides evidence against a causal relationship, while the presence of an expected relationship narrows the scope of biases that could explain the relationship.

Experimental evidence provides strong evidence of causality, but this is rarely available for occupational exposures.

Meta-Analysis

In the past, epidemiological evidence has been assessed in literature reviews, but in recent years there has been an increasing emphasis on formal meta-analysis, i.e. systematic quantitative reviews. One benefit of a meta-analysis is that it can reduce the probability of false negative results because of small numbers in specific studies (Egger and Davey-Smith, 1997), and may enable the effect of an exposure to be estimated with greater precision than is possible in a single study. Furthermore, although a meta-analysis should ideally be based on individual data, relatively simple methods are available for meta-analyses of published studies in which the study (rather than the individual) is the unit of statistical analysis (Rothman and Greenland, 1998). Such methods can be used to address the causal considerations outlined above, in particular the overall strength of association and the shape and strength of the dose-response curve. Just as importantly, statistical methods can also be used to assess consistency between studies, but because statistical tests for homogeneity often have relatively low power, it is more appropriate to examine the magnitude of variation instead of relying on formal statistical tests (Rothman and Greenland, 1998).

The limitations of meta-analyses should also be emphasized (Greenland, 1994; Egger and Davey-Smith, 1997; Egger et al, 1997). Strikingly different results can be obtained depending on which studies are included in a meta-analysis. Publication bias is of particular concern, given the tendency of journals to publish “positive findings” and for the publication of “negative findings” to be delayed (Egger and Davey-Smith, 1998), but naive graphical approaches to its assessment can be misleading (Greenland, 1994).

Even when an “unbiased” and comprehensive list of studies is included in a meta-analysis, there still remain the same problems of selection bias, information bias, and confounding that need to be addressed in assessing individual studies. Thus, a systematic quantitative review (i.e. meta-analysis) is like a report of a single study in that both quantitative and narrative elements are required to produce a balanced picture (Rothman and Greenland, 1998). Essentially the same issues need to be addressed as in a report of a single study: what is the overall magnitude and precision of the effect estimate (if it is considered appropriate to calculate a summary effect estimate), and what are the likely strengths and directions of possible biases?

An advantage of meta-analysis is that these issues can often be better addressed by contrasting the findings of studies based on different populations, or using different study designs. Thus, possible systematic biases can be addressed with actual data from specific studies rather than by hypothetical examples. For example, in a study of an occupational exposure and lung cancer, there might be concern that an observed association was due to confounding by smoking. If smoking data had not been available, then the best that could be done would be to attempt to assess the likely extent of confounding by smoking (see chapter 6), for example by sensitivity analysis (Rothman and Greenland, 1998). However, in a meta-analysis, if smoking information were available for some (but not all) studies then these studies could be examined to assess the likely strength and direction of confounding by smoking (if any).

available for analysis and will therefore reduce random error. However, it will not necessarily reduce systematic error, and may even increase it (because of publication bias). Nevertheless, a careful meta-analysis will enable various possible biases to be addressed, using actual data from specific studies, rather than hypothetical examples. Such a meta-analysis will therefore facilitate the consideration of the causal considerations listed above, and in some instances will provide a valid summary estimate of the overall strength of association and the shape and strength of the dose-response curve (Greenland, 2003).
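To illustrate how a summary estimate can be calculated from published results, with the study as the unit of analysis, the following Python sketch pools relative risks using inverse-variance weights on the log scale. This is one widely used simple approach and is not necessarily the specific method the cited texts recommend. The study results are hypothetical, and the standard errors are recovered from the published 95% confidence limits on the assumption that these are symmetric on the log scale.

```python
import math


def pool_relative_risks(studies):
    """Inverse-variance pooled relative risk with a 95% CI.

    studies: list of (relative_risk, lower_95, upper_95) tuples.
    Hypothetical helper for illustration only.
    """
    log_estimates, weights = [], []
    for rr, lower, upper in studies:
        # SE of ln(RR), assuming the CI is symmetric on the log scale
        se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
        log_estimates.append(math.log(rr))
        weights.append(1.0 / se ** 2)

    total_weight = sum(weights)
    pooled_log = sum(w * x for w, x in zip(weights, log_estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    return (
        math.exp(pooled_log),
        math.exp(pooled_log - 1.96 * pooled_se),
        math.exp(pooled_log + 1.96 * pooled_se),
    )


# Hypothetical published results: (relative risk, lower 95% limit, upper 95% limit)
studies = [(1.8, 1.1, 2.9), (2.4, 1.2, 4.8), (1.3, 0.9, 1.9)]

pooled, lower, upper = pool_relative_risks(studies)
print(f"pooled RR = {pooled:.2f} (95% CI {lower:.2f} to {upper:.2f})")

# Examine the magnitude of variation between studies directly,
# rather than relying only on a formal test of homogeneity
estimates = [rr for rr, _, _ in studies]
print(f"study-specific RRs range from {min(estimates)} to {max(estimates)}")
```

The pooled interval is narrower than that of any single study, which reflects the reduction in random error. The calculation does nothing to remove systematic error in the contributing studies, so the spread of the study-specific estimates still needs to be examined and explained.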
Summary
The task of interpreting the findings of a single epidemiological study should be differentiated from that of interpreting all of the available evidence. Interpreting the findings of a single study includes considering the strength and precision of the effect estimate and the possibility that it may have been affected by various possible biases (confounding, selection bias, information bias). The important issue is not whether such biases are likely to have occurred (since they will almost always be present to some extent), but rather what their direction and strength are likely to be, and whether together they could explain the observed association. If the observed associations seem likely to be valid, then attention shifts to more general causal inference, which should be based on all available information. This includes assessing the specificity, strength and consistency of the association and the dose-response across all epidemiological studies. This may include the use of meta-analysis, but it is often not appropriate to derive a single summary effect estimate across all studies. Rather, a meta-analysis can be used to examine hypotheses about reasons for differences between study findings and the likely magnitude of possible biases. Furthermore, causal inference also necessitates considering non-epidemiological evidence from other sources (animal studies, mechanistic studies) in the consideration of more general causal criteria including the plausibility and coherence of the overall evidence.

Despite the continual need to assess possible biases, and to consider possible imperfections in the epidemiological data, it is also important to ensure that preventive action occurs when this is warranted, albeit on the basis of imperfect data. As Hill (1965) writes:

"All scientific work is incomplete - whether it be observational or experimental. All scientific work is liable to be upset or modified by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge that we already have, or to postpone the action that it appears to demand at a given time."
References
Beaglehole R, Bonita R, Kjellstrom T (1993). Basic epidemiology. Geneva: WHO.

Dickersin K, Berlin JA (1992). Meta-analysis: state-of-the-science. Epidemiologic Reviews 14: 154-76.

Egger M, Davey-Smith G (1997). Meta-analysis: principles and promise. Br Med J 315: 1371-4.

Egger M, Davey-Smith G, Phillips A (1997). Meta-analysis: principles and procedures. Br Med J 315: 1533-7.

Egger M, Davey-Smith G (1998). Meta-analysis: bias in location and selection of studies. Br Med J 316: 61-6.

Feinstein AR (1988). Scientific standards in epidemiologic studies of the menace of daily life. Science 242: 1257-63.

Gardner MJ, Altman DG (1986). Confidence intervals rather than p values: estimation rather than hypothesis testing. Br Med J 292: 746-50.

Greenland S (1994). A critical look at some popular meta-analytic methods. Am J Epidemiol 140: 290-6.

Greenland S (2003). The impact of prior distributions for uncontrolled confounding and response bias: a case study of the relation of wire codes and magnetic fields to childhood leukemia. J Am Statist Assoc 98: 1-8.

Hardell L, Sandstrom A (1979). Case-control study: soft-tissue sarcomas and exposure to phenoxyacetic acids or chlorophenols. Br J Cancer 39: 711-7.

Hardell L, Eriksson M, Lenner P, Lundgren E (1981). Malignant lymphoma and exposure to chemicals, especially organic solvents, chlorophenols and phenoxy acids: a case-control study. Br J Cancer 43: 169-76.

Hill AB (1965). The environment and disease: association or causation? Proc R Soc Med 58: 295-300.

Pearce NE, Smith AH, Howard JK, et al (1986). Non-Hodgkin's lymphoma and exposure to phenoxyherbicides, chlorophenols, fencing work and meat works employment: a case-control study. Brit J Ind Med 43: 75-83.

Pearce NE, Jackson RT (1988). Statistical testing and estimation in medical research. NZ Med J 101: 569-70.

Poole C (1987). Beyond the confidence interval. AJPH 77: 195-9.

Rothman KJ (1978). A show of confidence. N Engl J Med 299: 1362-3.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Siemiatycki J, Wacholder S, Dewar R, et al (1988). Smoking and degree of occupational exposure: Are internal analyses in cohort studies likely to be confounded by smoking status? American Journal of Industrial Medicine 13: 59-69.

Smith AH, Bates M (1992). Confidence limit analyses should replace power calculations in the interpretation of epidemiologic studies. Epidemiol 3: 449-52.