
A Short Introduction to Epidemiology
Second Edition

Neil Pearce

Occasional Report Series No 2

Centre for Public Health Research


Massey University Wellington Campus
Private Box 756
Wellington, New Zealand

Centre for Public Health Research
Massey University Wellington Campus
Private Box 756
Wellington, New Zealand
Phone: 64-4-3800-606
Fax: 64-4-3800-600
E-mail: cphr@massey.ac.nz
Website: http://www.publichealth.ac.nz/

Copies of this publication can be purchased in hard copy through our website (NZ$36.744 incl GST), or downloaded for free in pdf form from the website.

2nd edition

February 2005

ISBN 0-476-01236-8

ISSN 1176-1237

To Irihapeti Ramsden

Preface

Who needs another introductory epidemiology text? Certainly, there are many introductory epidemiology books currently in print, and many of them are excellent. Nevertheless, there are four reasons why I believe that this new text is justified.

Firstly, it is much shorter than most introductory texts, many of which contain more material than is required for a short introductory course. This is a short introduction to epidemiology, and is not intended to be comprehensive.

Secondly, I have endeavoured to show clearly how the different basic epidemiologic methods “fit together” in a logical and systematic manner. For example, I attempt to show how the different possible study designs relate to each other, and how they are different approaches to a common task. Similarly, I attempt to show how the different study design issues (confounding and other types of bias) relate to each other, and how the principles and methods of data analysis are consistent across different study designs and data types.

Thirdly, in this context, rather than attempt a comprehensive review of available methods (e.g. multiple methods for estimating confidence intervals for the summary risk ratio), I have attempted to select only one standard method for each application, which is reasonably robust and accurate, and which is consistent and coherent with the other methods presented in the text.

Finally, the field of epidemiology is changing rapidly, not only with regards to its basic methods, but also with regards to the hypotheses which these methods are used to investigate. In particular, in recent years there has been a revival in public health applications of epidemiology, not only at the national level, but also at the international level, as epidemiologists tackle global problems such as climate change. This text does not attempt to review the more complex measures used to consider such issues. However, it does provide a coherent and systematic summary of the basic methods in the field, which can be used as a logical base for the teaching and development of research into these more complex issues.

Chapter 1 gives a brief introduction to the field, with an emphasis on the broad range of applications and situations in which epidemiologic methods have been used historically, and will continue to be used in the future.

Part 1 then addresses study design options. Chapter 2 discusses incidence studies (including cohort studies) and describes the basic study design and the basic effect measures (i.e. incidence rates and rate ratios). It then presents incidence case-control studies as a more efficient means of obtaining the same findings. Chapter 3 similarly discusses prevalence studies, and prevalence case-control studies. Chapter 4 then considers study designs incorporating other axes of classification, continuous outcome measures (e.g. blood pressure) such as cross-sectional studies and longitudinal studies, or more complex study designs such as ecologic and multi-level studies.

Part 2 then addresses study design issues. Chapter 5 discusses issues of study size and precision. Chapter 6 considers general issues of validity, namely selection bias, information bias, and confounding. Chapter 7 discusses effect modification.

Part 3 then discusses the practical issues of conducting a study. Chapter 8 addresses issues of measurement of exposure and disease. Chapters 9-11 then discuss the conduct of cohort studies, case-control studies and cross-sectional studies respectively.

Finally, Part 4 considers what happens after the data are collected, with chapter 12 addressing data analysis and chapter 13 the interpretation of the findings of epidemiologic studies.

I should stress that this book provides no more than a very preliminary introduction to the field. In doing so I have attempted to use a wide range of examples, which give some indication of the broad range of situations in which epidemiologic methods can be used. However, there are undoubtedly many other types of epidemiologic hypotheses and epidemiologic studies which are not represented in this book. In particular, my focus is on the use of epidemiology in public health, particularly with regard to non-communicable disease, and I include few examples from clinical epidemiology or from communicable disease outbreak investigations. Nevertheless, I hope that the book will be of interest not only to epidemiologists, but also to others who have other training but are involved in epidemiologic research, including public health professionals, policy makers, and clinical researchers.

Neil Pearce
Centre for Public Health Research
Massey University Wellington Campus
Private Box 756
Wellington, New Zealand

Acknowledgements

During the writing of this text, my salary was funded by the Health Research Council of New Zealand. I wish to thank Sander Greenland and Jonny Myers for their comments on the draft manuscript. I also wish to thank Massey University for support for my research programme.

Contents

1. Introduction
– Germs and miasmas
– Risk factor epidemiology
– Epidemiology in the 21st century

PART 1: STUDY DESIGN OPTIONS

2. Incidence studies
– Incidence studies
– Incidence case-control studies

3. Prevalence studies
– Prevalence studies
– Prevalence case-control studies

4. More complex study designs
– Other axes of classification
– Continuous outcome measures
– Ecologic and multilevel studies

PART 2: STUDY DESIGN ISSUES

5. Precision
– Basic statistics
– Study size and power

6. Validity
– Confounding
– Selection bias
– Information bias

7. Effect modification
– Concepts of interaction
– Additive and multiplicative models
– Joint effects

PART 3: CONDUCTING A STUDY

8. Measurement of exposure and health status
– Exposure
– Health status

9. Cohort studies
– Defining the source population and risk period
– Measuring exposure
– Follow-up

10. Case-control studies
– Defining the source population and risk period
– Selection of cases
– Selection of controls
– Measuring exposure

11. Prevalence studies
– Defining the source population
– Measuring health status
– Measuring exposure

PART 4: ANALYSIS AND INTERPRETATION OF STUDIES

12. Data analysis
– Basic principles
– Basic analyses
– Controlling for confounding

13. Interpretation
– Appraisal of a single study
– Appraisal of all of the available evidence

CHAPTER 1. Introduction
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

Public health is primarily concerned with the prevention of disease in human populations. It differs from clinical medicine both in its emphasis on prevention rather than treatment, and in its focus on populations rather than individual patients (table 1.1). Epidemiology is the branch of public health which attempts to discover the causes of disease in order to make disease prevention possible. Epidemiological methods can be used in other contexts (particularly in clinical research), but this short introductory text focuses on the use of epidemiology in public health, i.e. on its use as part of the wider process of discovering the causes of disease and preventing its occurrence in human populations.

In this context, epidemiology has been defined as (Last, 1988):

"the study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to control of health problems"

This broad definition could in theory include a broad range of research methodologies including qualitative research and quantitative randomised controlled trials. Some epidemiologists recognise the complementary nature of the former (McKinlay, 1993), and some texts include the latter in their definition of epidemiology. However, the key feature of epidemiological studies is that they are quantitative (rather than qualitative) observational (rather than experimental) studies of the determinants of disease in human populations (rather than individuals). This will be my focus here, while recognising the value, and complementary nature, of other research methodologies. The observational approach is a major strength of epidemiology as it enables a study to be conducted in a situation where a randomized trial would be unethical or impractical (because of the large numbers of subjects required). It is also the main limitation of epidemiological studies in that the lack of randomization means that the groups being compared may differ with respect to various causes of disease (other than the main exposure under investigation). Thus, epidemiological studies, in general, experience the same potential problems as randomized controlled trials, but may suffer additional problems of bias because exposure has not been randomly allocated and there may be differences in baseline disease risk between the populations being compared.

Table 1.1

The defining features of public health: populations and prevention

              Prevention                             Treatment
----------------------------------------------------------------------------------------------
Populations   Public health                          Health systems research
Individuals   Primary health care/Health education   Medicine (including primary health care)

1.1 Germs and Miasmas

Epidemiology is as old as public health itself, and it is not difficult to find epidemiological observations made by physicians dating back to Hippocrates who observed that:

“Whoever wishes to investigate medicine properly should proceed thus: in the first place to consider the seasons of the year, and what effects each of them produces… when one comes into a city in which he is a stranger, he should consider its situation, how it lies as to the winds and the rising of the sun… One should consider most attentively the waters which the inhabitants use… and the ground… and the mode in which the inhabitants live, and what are their pursuits, whether they are fond of drinking and eating to excess, and given to indolence, or are fond of exercise and labor”. (Hippocrates, 1938; quoted in Hennekens and Buring, 1987)

Many other examples of epidemiological reasoning were published through the ages. However, epidemiology was founded as an independent discipline in a number of Western countries in parallel with the industrial revolution of the 19th century. In Anglophone countries it is considered to have been founded by the work of Chadwick, Engels, Snow and others who exposed the appalling social conditions during the industrial revolution, and the work of Farr and others who revealed major socioeconomic differences in disease in the 19th century. At that time, epidemiology was generally regarded as a branch of public health and focused on the causes and prevention of disease in populations, in comparison with the clinical sciences which were branches of medicine and focused on disease pathology and treatment of disease in individuals. Thus, the emphasis was on the prevention of disease and the health needs of the population as a whole. In this context, the fundamental importance of population-level factors (the urban environment, housing, socioeconomic factors, etc) was clearly acknowledged (Terris, 1987).

Table 1.2
Deaths and death rates from cholera in London 1854 in households supplied by the
Southwark and Vauxhall Water Company and by the Lambeth Water Company

                                    Cholera    Deaths per
                          Houses    deaths     10,000 houses
------------------------------------------------------------------------------------------------
Southwark and Vauxhall    40,046    1,263      315
Lambeth Company           26,107       98       37
Rest of London           256,423    1,422       59
------------------------------------------------------------------------------------------------
Source: (Snow, 1936; quoted in Winkelstein, 1995)
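
The rates in the final column of table 1.2 are simply deaths divided by houses, scaled to 10,000 houses. As a minimal sketch (not part of the original text), the arithmetic for the two water companies can be reproduced as follows; the function name `deaths_per_10000_houses` and the dictionary layout are illustrative choices, and only the counts come from the table.

```python
# Counts as printed in table 1.2: (houses supplied, cholera deaths).
# Only the two water companies are included in this sketch.
TABLE_1_2 = {
    "Southwark and Vauxhall": (40_046, 1_263),
    "Lambeth Company": (26_107, 98),
}


def deaths_per_10000_houses(houses, deaths):
    """Crude cholera death rate expressed per 10,000 houses (hypothetical helper)."""
    return deaths / houses * 10_000


rates = {
    name: deaths_per_10000_houses(houses, deaths)
    for name, (houses, deaths) in TABLE_1_2.items()
}

# Ratio of the two rates: how many times higher the death rate was in
# houses supplied by Southwark and Vauxhall than in houses supplied by Lambeth.
rate_ratio = rates["Southwark and Vauxhall"] / rates["Lambeth Company"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} deaths per 10,000 houses")
print(f"Rate ratio: {rate_ratio:.1f}")
```

Run as written, this gives about 315 and 37.5 deaths per 10,000 houses, in line with the printed column, and a ratio between the two companies of roughly 8.4.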

Perhaps the most commonly quoted epidemiologic legend is that of Snow who studied the causes of cholera in London in the mid-19th century (Winkelstein, 1995). Snow was able to establish that the cholera death rate was much higher in areas supplied by the Southwark and Vauxhall Company which took water from the Thames downstream from London (i.e. after it had been contaminated with sewage) than in areas supplied by the Lambeth Company which took water from upstream, with the death rates being intermediate in areas served by both companies. Subsequently, Snow (1936) studied the area supplied by both companies, and within this area walked the streets to determine for each house in which a cholera death had occurred, which company supplied the water. The death rate was almost ten times as high in houses supplied with water containing sewage (table 1.2).

Although epidemiologists and other researchers continue to battle over Snow’s legacy and its implications for epidemiology today (Cameron and Jones, 1983; Loomis and Wing, 1991; Samet, 2000; Vandenbroucke, 1994), it is clear that Snow was able to discover, and establish convincing proof for, the mode of transmission of cholera, and to take preventive action several decades before the biological basis of his observations was understood. Thus, it was not until several decades after the work of Snow that Pasteur and others established the role of the transmission of specific pathogens in what became known as the “infectious diseases”, and it was another century, in most instances, before effective vaccines or antibiotic treatments became available. Nevertheless, a dramatic decline in mortality from these diseases occurred from the mid-nineteenth century long before the development of modern pharmaceuticals. This has been attributed to improvements in nutrition, sanitation, and general living conditions (McKeown, 1979) although it has been argued that specific public health interventions on factors such as urban congestion actually played the major role (Szreter, 1988).

1.2 Risk Factor Epidemiology

This decline in the importance of communicable disease was accompanied by an increase in morbidity and mortality from non-communicable diseases such as heart disease, cancer, diabetes, and respiratory disease. This led to major developments in the theory and practice of epidemiology, particularly in the second half of the 20th century. There has been a particular emphasis on aspects of individual lifestyle (diet, exercise, etc) and in the last decade the human genome project has seen an accelerated interest in the role of genetic factors (Beaty and Khoury, 2000).

Thus, epidemiology became widely recognized with the establishment of tobacco smoking as a cause of lung cancer in the early 1950s (Doll and Hill, 1950; Wynder and Graham, 1950), although this association had already been established in Germany in the 1930s (Schairer and Schöninger, 2001). Subsequent decades have seen major discoveries relating to other causes of chronic disease such as asbestos, ionizing radiation, viruses, diet, outdoor air pollution, indoor air pollution, water pollution, and genetic factors. These epidemiologic successes have in some cases led to successful preventive interventions without the need for major social or political change. For example, occupational carcinogens can, with some difficulty, be controlled through regulatory measures, and exposures to known occupational carcinogens have been reduced in industrialized countries in recent decades. Another example is the successful World Health Organisation (WHO) campaign against smallpox. More recently, some countries have passed legislation to restrict advertising of tobacco and smoking in public places and have adopted health promotion programmes aimed at changes in "lifestyle".

Individual lifestyle factors would ideally be investigated using a randomised controlled trial, but this is often unethical or impractical (e.g. tobacco smoking). Thus, it is necessary to do observational studies and epidemiology has made major contributions to the understanding of the role of individual lifestyle factors in health. Because such factors would ideally be investigated in randomised controlled trials, and in fact would be ideally suited to such trials if it were not for the ethical and practical constraints, epidemiologic theory and practice has, quite appropriately, been based on the theory and practice of randomised trials. Thus, the aim of an epidemiologic study investigating the effect of a specific risk factor (e.g. smoking) on a particular disease (e.g. lung cancer) is to obtain the same findings that would have been obtained from a randomised controlled trial. Of course, an epidemiologic study will usually experience more problems of bias than a randomised controlled trial, but the randomised trial is the “gold standard”.

This approach has led to major developments in epidemiologic theory (presented most elegantly and comprehensively in Rothman and Greenland, 1998). In particular, there have been major developments in the theory of cohort studies (which mimic a randomised trial, but without the randomisation) and case-control studies (which attempt to obtain the same findings as a full cohort study, but in a more efficient manner). It is these basic methods, which follow a randomised controlled trial “paradigm”, which receive most of the attention in this short introductory text. However, while presenting these basic methods, it is important to also recognise their limitations, and to also consider different or more complex methods that may be more appropriate when epidemiology is used in the public health context.

1.3 Epidemiology in the 21st Century

In particular, in the last decade there has been increasing concern expressed about the limitations of the risk factor approach, and considerable debate about the future direction of epidemiology (Saracci, 1999). In particular, it has been argued that there has been an overemphasis on aspects of individual lifestyle, and little attention paid to the population-level determinants of health (Susser and Susser, 1996a, 1996b; Pearce, 1996; McMichael, 1999). Furthermore, the success of risk factor epidemiology has been more temporary and more limited than might have been expected. For example, the limited success of legislative measures in industrialised countries has led the tobacco industry to shift its promotional activities to developing countries so that more people are exposed to tobacco smoke than ever before (Barry, 1991; Tominaga, 1986). Similar shifts have occurred for some occupational carcinogens (Pearce et al, 1994). Thus, on a global basis the "achievement" of the public health movement has often been to move public health problems from rich countries to poor countries and from rich to poor populations within the industrialized countries.

It should be acknowledged that not all epidemiologists share these concerns (e.g. Savitz, 1994; Rothman et al, 1998; Poole and Rothman, 1998), and some have regarded these discussions as an attack on the field itself, rather than as an attempt to broaden its vision. Nevertheless, the debate has progressed and there is an increasing recognition of the importance of taking a more global approach to epidemiologic research and of the importance of maintaining an appropriate balance and interaction between macro-level (population), individual-level (e.g. lifestyle), and micro-level (e.g. genetic) research (Pearce, 2004).

There are three crucial concepts which have received increasing attention in this regard.

The Importance of Context

The first, and most important issue, is the need to consider the population context when conducting epidemiologic studies. Even if one is focusing on individual “lifestyle” risk factors, there is good reason to conduct studies at the population level (Rose, 1992). Moreover, every population has its own history, culture, and economic and social divisions which influence how and why people are exposed to specific risk factors, and how they respond to such exposures. For example, New Zealand (Aotearoa) was colonised by Great Britain more than 150 years ago, resulting in major loss of life by the indigenous people (the Māori). It is commonly assumed that this loss of life occurred primarily due to the arrival of infectious diseases to which Māori had no natural immunity. However, a more careful analysis of the history of colonisation throughout the Pacific reveals that the indigenous people mainly suffered major mortality from imported infectious diseases when their land was taken (Kunitz, 1994), thus disrupting their economic base, food supply and social networks. This example is not merely of historical interest, since it is these same infectious diseases that have returned in strength in Eastern Europe in the last decade, after lying dormant for nearly a century (Bobak and Marmot, 1996). Similarly, the effects of occupational carcinogens may be greater in developing countries where workers may be relatively young or may be affected by malnutrition or other diseases (Pearce et al, 1994).

These issues are likely to become more important because, not only is epidemiology changing, but the world that epidemiologists study is also rapidly changing. We are seeing the effects of economic globalization, structural adjustment (Pearce et al, 1994) and climate change (McMichael, 1993, 1995), and the last few decades have seen the occurrence of the “informational revolution” which is having effects as great as the previous agricultural and industrial revolutions (Castells, 1996). In industrialized countries, this is likely to prolong life expectancy for some, but not all, sections of the population. In developing countries, the benefits have been even more mixed (Pearce et al, 1994), while the countries of Eastern Europe are experiencing the largest sudden drop in life expectancy that has been observed in peacetime in recorded human history (Bobak and Marmot, 1996) with a major rise in alcoholism and “forgotten” diseases such as tuberculosis and cholera.

This increased interest in population-level determinants of health has been particularly marked by increased interest in techniques such as multilevel modelling which allow individual lifestyle risk factors to be considered “in context” and in parallel with macro-level determinants of health (Greenland, 2000). Such a shift in approach is important, not only because of the need to emphasize the role of diversity and local knowledge (Kunitz, 1994), but also because of the more general moves within science to consider macro-level systems and processes (Cohen and Stewart, 1994) rather than taking a solely reductionist approach (Pearce, 1996).

Problem-Based Epidemiology

A second issue is that a problem-based approach may be particularly valuable in encouraging epidemiologists to focus on the major public health problems and to take the population context into account (Pearce, 2001; Thacker and Buffington, 2001). A problem-based approach to teaching clinical medicine has been increasingly adopted in medical schools around the world. The value of this approach is that theories and methods are taught in the context of solving real-life problems. Starting with “the problem” at the population level provides a “reality check” on existing etiological theories and identifies the major public health problems which new theories must be able to explain. A fruitful research process can then be generated with positive interaction between epidemiologists and other researchers. Studying real public health problems in their historical and social context does not exclude learning about sophisticated methods of study design and data analysis (in fact, it necessitates it), but it may help to ensure that the appropriate questions are asked (Pearce, 1999).

Appropriate Technology

A related issue is the need to use “appropriate technology” to address the most important public health research questions. In particular, as attention moves “upstream” to the population level (McKinlay, 1993) new methods will need to be developed (McMichael, 1995). One example of this, noted above, is the recent rise in interest in multilevel modelling (Blakely and Woodward, 2000; Pearce, 2000), although it is important to stress that it is an increase in “multilevel thinking” in the development of epidemiologic hypotheses and the design of studies that is required, rather than just the use of new statistical techniques of data analysis. The appropriateness of any research methodology depends on the phenomenon under study: its magnitude, the setting, the current state of theory and knowledge, the availability of valid measurement tools, and the proposed uses of the information to be gathered, as well as the community resources and skills available and the prevailing norms and values at the national, regional or local level (Pearce and McKinlay, 1998). Thus, there has been increased interest in the interface between epidemiology and social science (Krieger, 2000), and in the development of theoretical and methodological frameworks appropriate for epidemiologic studies in developing countries (Barreto et al, 2001; Barreto, 2004; Loewenson, 2004), and in indigenous people in “Western” countries (Durie, 2004). As noted above, this short introductory text focuses on the most basic epidemiologic methods, but I attempt to refer to more complex issues, and the potential use of more complex methods, where this is appropriate.
“Western “ countries (Durie, 2004). As

Summary

Public health is primarily concerned with the prevention of disease in human populations, and epidemiology is the branch of public health which attempts to discover the causes of disease in order to make disease prevention possible. It thus differs from clinical medicine both in its emphasis on prevention (rather than treatment) and in its focus on populations (rather than individual patients). Thus, the epidemiological approach to a particular disease is intended to identify high-risk subgroups within the population, to determine the causes of such excess risks, and to determine the effectiveness of subsequent preventive measures. Although the epidemiological approach has been used for more than a century for the study of communicable diseases, epidemiology has considerably grown in scope and sophistication in the last few decades as it has been increasingly applied to the study of non-communicable diseases. At the beginning of the 21st century, the field of epidemiology is changing rapidly, not only with regards to its basic methods, but also with regards to the hypotheses which these methods are used to investigate. In particular, in recent years there has been a revival in public health applications of epidemiology, not only at the national level, but also at the international level, as epidemiologists tackle global problems such as climate change. This text does not attempt to review the more complex methods used to study such issues. However, it does provide a coherent and systematic summary of the basic methods in the field, which can be used as a logical base for the teaching and development of research into these more complex issues.

References

Barreto ML (2004). The globalization of epidemiology: critical thoughts from Latin America. Int J Epidemiol 33: 1132-7.

Barreto ML, Almeida-Filho N, Breihl J (2001). Epidemiology is more than discourse: critical thoughts from Latin America. J Epidemiol Comm Health 55: 158-9.

Barry M (1991). The influence of the U.S. tobacco industry on the health, economy, and environment of developing countries. New Engl J Med 324: 917-20.

Beaty TH, Khoury MJ (2000). Interface of genetics and epidemiology. Epidemiologic Reviews 22: 120-5.

Blakely T, Woodward AJ (2000). Ecological effects in multi-level studies. J Epidemiol Comm Health 54: 367-74.

Bobak M, Marmot M (1996). East-West mortality divide and its potential explanations: proposed research agenda. Br Med J 312: 421-5.

Cameron D, Jones IG (1983). John Snow, the Broad Street pump and modern epidemiology. Int J Epidemiol 12: 393-6.

Castells M (1996). The information age: Economy, society and culture. Vol 1. The rise of the network society. Oxford: Blackwell.

Cohen J, Stewart I (1994). The collapse of chaos: discovering simplicity in a complex world. London: Penguin.

Doll R, Hill AB (1950). Smoking and carcinoma of the lung. Br Med J 2: 739-48.

Durie M (2004). Understanding health and illness: research at the interface between science and indigenous knowledge. Int J Epidemiol 33: 1138-43.

Greenland S (2000). Principles of multilevel modelling. Int J Epidemiol 29: 158-67.

Hennekens CH, Buring JE (1987). Epidemiology in medicine. Boston: Little, Brown.

Hippocrates (1938). On airs, waters and places. Med Classics 3: 19.

Krieger N (2000). Epidemiology and social sciences: towards a critical reengagement in the 21st century. Epidemiologic Reviews 22: 155-63.

Kunitz S (1994). Disease and social diversity. New York: Oxford University Press.

Last JM (ed) (1988). A dictionary of epidemiology. New York: Oxford University Press.

Loewenson R (2004). Epidemiology in the era of globalization: skills transfer or new skills? Int J Epidemiol 33: 1144-50.

Loomis D, Wing S (1991). Is molecular epidemiology a germ theory for the end of the twentieth century? Int J Epidemiol 19: 1-3.

McKeown T (1979). The role of medicine. Princeton, NJ: Princeton University Press.

McKinlay JB (1993). The promotion of health through planned sociopolitical change: challenges for research and policy. Soc Sci Med 36: 109-17.

McMichael AJ (1993). Planetary overload: global environmental change and the health of the human species. Cambridge: Cambridge University Press.

McMichael AJ (1995). The health of persons, populations, and planets: epidemiology comes full circle. Epidemiol 6: 633-5.

McMichael AJ (1999). Prisoners of the proximate: loosening the constraints on epidemiology in an age of change. Am J Epidemiol 149: 887-97.

Pearce N (1996). Traditional epidemiology, modern epidemiology, and public health. Am J Publ Health 86: 678-83.

Pearce N (1999). Epidemiology as a population science. Int J Epidemiol 28: S1015-8.

Pearce N (2000). The ecologic fallacy strikes back. J Epidemiol Comm Health 54: 326-7.

Pearce N (2001). The future of epidemiology: a problem-based approach using evidence-based methods. Australasian Epidemiologist 8.1: 3-7.

Pearce N (2004). The globalization of epidemiology: introductory remarks. Int J Epidemiol 33: 1127-31.

Pearce N, McKinlay J (1998). Back to the future in epidemiology and public health. J Clin Epidemiol 51: 643-6.

Pearce NE, Matos E, Vainio H, Boffetta P, Kogevinas M (eds) (1994). Occupational cancer in developing countries. Lyon: IARC.

Poole C, Rothman KJ (1998). Our conscientious objection to the epidemiology wars. J Epidemiol Comm Health 52: 613-4.

Rose G (1992). The strategy of preventive medicine. Oxford: Oxford University Press.

Rothman KJ, Adami H-O, Trichopoulos D (1998). Should the mission of epidemiology include the eradication of poverty? Lancet 352: 810-3.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Samet JM (2000). Epidemiology and policy: the pump handle meets the new millennium. Epidemiologic Reviews 22: 145-54.

Saracci R (1999). Epidemiology in progress: thoughts, tensions and targets. Int J Epidemiol 28: S997-9.

Savitz DA (1994). In defense of black box epidemiology. Epidemiology 5: 550-2.

Schairer E, Schöninger E (2001). Lung cancer and tobacco consumption. Int J Epidemiol 30: 24-7.

Snow J (1936). On the mode of communication of cholera. (Reprint). New York: The Commonwealth Fund, pp 11-39.

Susser M, Susser E (1996a). Choosing a future for epidemiology: I. Eras and paradigms. Am J Publ Health 86: 668-73.

Susser M, Susser E (1996b). Choosing a future for epidemiology: II. From black boxes to Chinese boxes. Am J Publ Health 86: 674-8.

Szreter S (1988). The importance of social intervention in Britain's mortality decline c.1850-1914: a reinterpretation of the role of public health. Soc Hist Med 1: 1-37.

Terris M (1987). Epidemiology and the public health movement. J Publ Health Policy 7: 315-29.

Thacker SB, Buffington J (2001). Applied epidemiology for the 21st century. Int J Epidemiol 30: 320-5.

Tominaga S (1986). Spread of smoking to the developing countries. In: Zaridze D, Peto R (eds). Tobacco: a major international health hazard. Lyon: IARC, pp 125-33.

Vandenbroucke JP (1994). New public health and old rhetoric. Br Med J 308: 994-5.

Winkelstein W (1995). A new perspective on John Snow’s communicable disease theory. Am J Epidemiol 142: S3-9.

Wynder EL, Graham EA (1950). Tobacco smoking as a possible etiologic factor in bronchiogenic carcinoma. J Am Med Assoc 143: 329-38.

Part I

Study Design Options

CHAPTER 2. Incidence Studies
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

In this chapter and the next one I review the possible study designs for the simple situation where individuals are exposed to a particular risk factor (e.g. a particular chemical) and when a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a particular disease). Thus, the aim is to estimate the effect of a (dichotomous) exposure on the occurrence of a (dichotomous) disease outcome or health state.

It should first be emphasized that all epidemiologic studies are (or should be) based on a particular source population (also called the “study population” or “base population”) followed over a particular risk period. Within this framework a fundamental distinction is between studies of disease incidence (i.e. the number of new cases of disease over time) and studies of disease prevalence (i.e. the number of people with the disease at a particular point in time). Studies involving dichotomous outcomes can then be classified according to two questions:

a. Are we studying incidence or prevalence?
b. Is there sampling on the basis of outcome?

The responses to these two questions yield four basic types of epidemiologic studies (Morgenstern and Thomas, 1993; Pearce, 1998):

1. Incidence studies
2. Incidence case-control studies
3. Prevalence studies
4. Prevalence case-control studies

These four study types represent cells in a two-way cross-classification (table 2.1). Such studies may be conducted to describe the occurrence of disease (e.g. to estimate the burden of diabetes in the community by conducting a prevalence survey), or to estimate the effect of a particular exposure on disease (e.g. to estimate whether the incidence of new cases of diabetes is greater in people with a high fat diet than in people with a low fat diet) in order to find out how we can prevent the disease occurring. In the latter situation we are comparing the occurrence of disease in an “exposed” group with that in a “non-exposed” group, and we are estimating the effect of exposure on the occurrence of the disease, while controlling for other known causes of the disease.

Table 2.1
The four basic study types in studies involving a dichotomous health outcome

                        Sampling on outcome
                        ------------------------------------------------------
                        No                    Yes
------------------------------------------------------------------------------
Study outcome
  Incidence             Incidence studies     Incidence case-control studies
  Prevalence            Prevalence studies    Prevalence case-control studies
------------------------------------------------------------------------------

Thus, we might conclude that “lung cancer is five times more common in asbestos workers than in other workers, even after we have controlled for differences in age, gender, and smoking”. In some instances we may have multiple categories of exposure (high, medium, low) or individual exposure “scores”, but we will start with the simple situation in which individuals are classified as “exposed” or “non-exposed”.

In this chapter I discuss incidence studies, and in the following chapter I consider prevalence studies. In chapter 4, I then consider studies involving more complex measurements of health status (e.g. continuous lung function or blood pressure measurements) and more complex study designs (ecologic and multilevel studies). As noted in chapter 1, the latter situation is perhaps the norm, rather than the exception, when conducting studies in the public health context. However, for logical and practical reasons I will first address the simpler situation of a dichotomous exposure (in individuals) and a dichotomous health outcome measure.

2.1 Incidence Studies

The most comprehensive approach involves collecting data on the experience of the entire source population over the risk period in order to estimate disease incidence (the development of a disease for the first time) or mortality (i.e. death, which is a particular type of incidence measure). Figure 2.1 shows the experience of a source population in which all persons are followed from a particular date. For simplicity, I will initially assume that the source population is confined to persons born in a particular year, i.e. a birth cohort. In the hypothetical study shown in figure 2.1, the outcome under study is the “event” of developing a particular disease. However, the concept of incidence applies equally to studies of other health events, such as hospitalisation or death. The key feature of incidence studies is that they involve an event (e.g. developing a disease for the first time) which occurs at a particular point in time, rather than a state (e.g. having a disease) which can exist over an extended period of time.

In the hypothetical study shown in figure 2.1, people enter the study when they are born, and some of them subsequently develop disease. Of these, some subsequently “lose” their disease (although they may “regain” it at a later date), and some have the condition all their lives; some persons die from the disease under study, but most eventually die from another cause. However, the information is “censored” since the study cannot last indefinitely; i.e. follow-up stops by a particular age, at which time some members of the study population have died, and some have been lost to follow-up for other reasons (e.g. emigration). For example, several people in figure 2.1 were “censored” before follow-up finished, either because they died of the disease we were studying (if we were studying the incidence of disease, rather than deaths, they would be “censored” as soon as they developed the disease), they died of something else, or because they were “lost to follow-up”. Each person only contributes “person-time” to the study until they are censored, and after that we stop counting them. This approach is followed because we may not get a fair comparison between the “exposed” and the “non-exposed” groups if they have been followed for different lengths of time, e.g. if one group has many more people lost to follow-up than the other group.

However, the person-time approach would be necessary even if no-one was lost to follow-up and both groups were followed for the same length of time. For example, consider a cohort study of 1,000 exposed and 1,000 non-exposed people in which no-one was lost to follow-up and everyone was followed until they died. Assume also that the exposure causes some deaths so the exposed group, on the average, died at a younger age than the non-exposed group. If we only calculated the percentage of people who died, then it would be 100% in both groups, and we would see no difference. However, if we take into account the person-time contributed by each group, then it becomes clear that both groups had the same number of deaths (1,000), but that in the exposed group these deaths occurred earlier and the person-time contributed was therefore lower. Thus, the average age at death would be lower in the exposed group; to say the same thing another way, the death rate (deaths divided by person-years) would be higher. To see this, we need to consider not only how many people were in each group, but how much person-time they contributed, i.e. how long they were followed for.
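The person-time argument above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the ages at death are invented for illustration (they are not data from any study), and the names `summarise`, `exposed_ages` and `non_exposed_ages` are hypothetical.

```python
# Hypothetical ages at death, invented purely for illustration.
COHORT_SIZE = 1000
exposed_ages = [60.0] * 500 + [70.0] * 500        # 1,000 exposed people
non_exposed_ages = [70.0] * 500 + [80.0] * 500    # 1,000 non-exposed people


def summarise(ages_at_death):
    """Summarise a birth cohort in which everyone is followed until death."""
    deaths = len(ages_at_death)
    # Each person contributes person-time from birth until they die.
    person_years = sum(ages_at_death)
    return {
        "proportion_dead": deaths / COHORT_SIZE,
        "person_years": person_years,
        "mean_age_at_death": person_years / deaths,
        "death_rate": deaths / person_years,  # deaths per person-year
    }


exposed = summarise(exposed_ages)
non_exposed = summarise(non_exposed_ages)
rate_ratio = exposed["death_rate"] / non_exposed["death_rate"]

for label, group in (("exposed", exposed), ("non-exposed", non_exposed)):
    print(
        f"{label}: proportion dead {group['proportion_dead']:.0%}, "
        f"person-years {group['person_years']:.0f}, "
        f"death rate {group['death_rate']:.4f} per person-year"
    )
print(f"death rate ratio (exposed / non-exposed): {rate_ratio:.2f}")
```

With these invented numbers the proportion dying is 100% in both groups, yet the exposed group contributes less person-time and therefore has the higher death rate.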

Figure 2.1
Occurrence of disease in a hypothetical population followed from birth

[Figure: individual life lines running from birth to the end of follow-up. Legend: “at risk”; disease symptoms; severe symptoms; death from disease under study; other death; lost to follow-up.]

Example 2.1
Martinez et al (1995) studied 1246 newborns in the Tucson, Arizona area enrolled between May 1980 and October 1984. Parents were contacted shortly after the children were born, and completed a questionnaire about their history of respiratory illness, smoking habits, and education. Further parental questionnaires were completed during the child’s second year of life and again at six years. At the age of six years, 51.5% of the children had never wheezed, 19.9% had had at least one lower respiratory tract illness with wheezing during the first three years of life but had no wheezing at six years, 15.0% had no wheezing before the age of three years but had wheezing at six years, and 13.7% had wheezing both before three years of age and at six years. The authors concluded that the majority of infants with wheezing have transient conditions and do not have increased risks of asthma or allergies later in life.

In some circumstances, a study might be conducted to study the “natural history” of a disease (e.g. diabetes). In such “clinical epidemiology” studies, the population (denominator) under study comprises people who already have a particular disease or condition, and the goal is to ascertain which factors affect the disease prognosis. More typically, one might be interested in a particular hypothesis about developing disease, such as “a high cholesterol diet increases the risk of developing ischaemic heart disease”. In this situation, the population under study comprises healthy individuals and we are interested in factors that determine who develops the disease under study (and who doesn’t). The data generated by such an incidence study involve comparing “exposed” and “non-exposed” groups and are similar to those generated by a randomised controlled trial, except that dietary “exposure” has not been randomly allocated.

Incidence studies ideally measure exposures, confounders and outcome times on all population members. When the source population has been formally defined and enumerated (e.g. a group of workers exposed to a particular chemical) then the study may be termed a cohort study or follow-up study (Rothman and Greenland, 1998) and the former terminology will be used here. Incidence studies also include studies where the source population has been defined but a cohort has not been formally enumerated by the investigator. Perhaps the most common examples are descriptive studies, e.g. of national death rates. In fact, as Rothman and Greenland (1998) note, no qualitative distinction distinguishes “descriptive” variables from the variables that are studied in “analytic” studies of risk factors. Thus, the distinction between “descriptive” incidence studies and “analytic” incidence studies is at best only a distinction based on data source (e.g. obtaining information from routine records rather than collecting the information specifically for the study).

Similarly, there is no fundamental distinction between incidence studies based on a broad population (e.g. all workers at a particular factory, or all persons living in a particular geographical area) and incidence studies involving sampling on the basis of exposure, since the latter procedure merely redefines the source population (cohort) (Miettinen, 1985).

Measures of Disease Occurrence

I will briefly review the basic measures of disease occurrence that are used in incidence studies, using the notation depicted in table 2.2 which shows the findings of a hypothetical incidence study of 20,000 persons followed for 10 years (statistical analyses using these measures are discussed further in chapter 12).

Three measures of disease incidence are commonly used in incidence studies.

Perhaps the most common measure of disease occurrence is the person-time incidence rate (or hazard rate, force of mortality or incidence density (Miettinen, 1985)) which is a measure of the disease occurrence per unit population time, and has the reciprocal of time as its dimension. In this example (table 2.2), there were 952 cases of disease diagnosed in the non-exposed group during the ten years of follow-up, which involved a total of 95,163 person-years; this is less than the total possible person-time of 100,000 person-years since people who developed the disease before the end of the ten-year period were no longer “at risk” of developing it, and stopped contributing person-years at that time (for simplicity I have ignored the problem of people whose disease disappears and then reoccurs over time, and I have assumed that we are studying the incidence of the first occurrence of disease). Thus, the incidence rate in the non-exposed group (b/Y0) was 952/95,163 = 0.0100 per person-year (or 1,000 per 100,000 person-years).

A second measure of disease occurrence is the incidence proportion or average risk which is the proportion of people who experience the outcome of interest at any time during the follow-up period (the incidence proportion is often called the cumulative incidence, but the latter term is also used to refer to cumulative hazards (Breslow and Day, 1987)). Since it is a proportion it is dimensionless, but it is necessary to specify the time period over which it is being measured. In this instance, there were 952 incident cases among the 10,000 people in the non-exposed group, and the incidence proportion (b/N0) was therefore 952/10,000 = 0.0952 over the ten year follow-up period. When the outcome of interest is rare over the follow-up period (e.g. an incidence proportion of less than 10%), then the incidence proportion is approximately equal to the incidence rate multiplied by the length of time that the population has been followed (in the example, this product is 0.1000 whereas the incidence proportion is 0.0952). I have assumed, for simplicity, that no-one died or was lost to follow-up during the study period (and therefore stopped contributing person-years to the study). However, as noted above, when this assumption is not valid (i.e. when a significant proportion of people have died or have been lost to follow-up), then the incidence proportion cannot be estimated directly, but must be estimated indirectly from the incidence rate (which takes into account that follow-up was not complete) or from life tables (which stratify on follow-up time).

A third possible measure of disease occurrence is the incidence odds (Greenland, 1987) which is the ratio of the number of people who experience the outcome (b) to the number of people who do not experience the outcome (d). As for the incidence proportion, the incidence odds is dimensionless, but it is necessary to specify the time period over which it is being measured. In this example, the incidence odds (b/d) is 952/9,048 = 0.1052. When the outcome is rare over the follow-up period then the incidence odds is approximately equal to the incidence proportion. Once again, if loss to follow-up is significant, then the incidence odds cannot be estimated directly, but must be estimated indirectly from the incidence rate (via the incidence proportion, or via life-table methods). The incidence odds is not very interesting or useful as a measure of disease occurrence, but it is presented here because the incidence odds is used to calculate the incidence odds ratio which is estimated in certain case-control studies (see below).

These three measures of disease occurrence all involve the same numerator: the number of incident cases of disease (b). They differ in whether their denominators represent person-years at risk (Y0), persons at risk (N0), or survivors (d).

Table 2.2
Findings from a hypothetical cohort study of 20,000 persons followed for 10 years

                              Exposed          Non-exposed      Ratio
------------------------------------------------------------------------
Cases                         1,813 (a)        952 (b)
Non-cases                     8,187 (c)        9,048 (d)
------------------------------------------------------------------------
Initial population size       10,000 (N1)      10,000 (N0)
------------------------------------------------------------------------
Person-years                  90,635 (Y1)      95,163 (Y0)
------------------------------------------------------------------------
Incidence rate                0.0200 (I1)      0.0100 (I0)      2.00
Incidence proportion          0.1813 (R1)      0.0952 (R0)      1.90
  (average risk)
Incidence odds                0.2214 (O1)      0.1052 (O0)      2.11
------------------------------------------------------------------------
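The three measures of disease occurrence for the non-exposed group can be reproduced directly from the counts in table 2.2. The sketch below is illustrative only; the variable names are my own. The last calculation assumes a constant incidence rate over the ten years with no other losses, so that risk = 1 - exp(-rate x time); this assumption is not stated in the text, but it appears to be consistent with the figures in the table.

```python
import math

# Non-exposed column of table 2.2
cases = 952              # b
non_cases = 9_048        # d
persons = 10_000         # N0
person_years = 95_163    # Y0
years = 10               # length of follow-up

incidence_rate = cases / person_years      # b/Y0, per person-year
incidence_proportion = cases / persons     # b/N0, over ten years
incidence_odds = cases / non_cases         # b/d, over ten years

# Rare-outcome approximation described in the text:
# incidence proportion ~ incidence rate x length of follow-up
rate_times_time = incidence_rate * years

# Assumption (not from the text): constant rate, no other losses
risk_from_rate = 1 - math.exp(-incidence_rate * years)

print(f"incidence rate        {incidence_rate:.4f} per person-year")
print(f"incidence proportion  {incidence_proportion:.4f}")
print(f"incidence odds        {incidence_odds:.4f}")
print(f"rate x time           {rate_times_time:.4f}")
print(f"1 - exp(-rate x time) {risk_from_rate:.4f}")
```

The product of rate and time slightly overstates the incidence proportion, which is the sense in which the approximation holds only when the outcome is rare.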

Measures of Effect in Incidence Studies

Corresponding to these three measures of disease occurrence, there are three principal ratio measures of effect which can be used in incidence studies. The measure of interest is often the rate ratio (incidence density ratio), the ratio of the incidence rate in the exposed group (a/Y1) to that in the non-exposed group (b/Y0). In the example in table 2.2, the incidence rates are 0.02 per person-year in the exposed group and 0.01 per person-year in the non-exposed group, and the rate ratio is therefore 2.00.

A second commonly used effect measure is the risk ratio (incidence proportion ratio or cumulative incidence ratio) which is the ratio of the incidence proportion in the exposed group (a/N1) to that in the non-exposed group (b/N0). In this example, the risk ratio is 0.1813/0.0952 = 1.90. When the outcome is rare over the follow-up period the risk ratio is approximately equal to the rate ratio.

A third possible effect measure is the incidence odds ratio which is the ratio of the incidence odds in the exposed group (a/c) to that in the non-exposed group (b/d). In this example the odds ratio is 0.2214/0.1052 = 2.11. When the outcome is rare over the study period the incidence odds ratio is approximately equal to the incidence rate ratio.

These three multiplicative effect measures are sometimes referred to under the generic term of relative risk. Each involves the ratio of a measure of disease occurrence in the exposed group to that in the non-exposed group. The various measures of disease occurrence all involve the same numerators (incident cases), but differ in whether their denominators are based on person-years, persons, or survivors (people who do not develop the disease at any time during the follow-up period). They are all approximately equal when the disease is rare during the follow-up period (e.g. an incidence proportion of less than 10%). However, the odds ratio has been severely criticised as an effect measure (Greenland, 1987; Miettinen and Cook, 1981), and has little intrinsic meaning in incidence studies, but it is presented here because it is the standard effect measure in incidence case-control studies (see below).

Finally, it should be noted that an analogous approach can be used to calculate measures of effect based on differences rather than ratios, in particular the rate difference and the risk difference. Ratio measures are usually of greater interest in etiologic research, because they have more convenient statistical properties, and it is easier to assess the strength of effect and the possible role of various sources of bias when using ratio measures (Cornfield et al, 1951). Thus, I will concentrate on the use of ratio measures in the remainder of this text. However, other measures (e.g. risk difference, attributable fraction) may be of value in certain circumstances, such as evaluating the public health impact of a particular exposure, and I encourage readers to consult standard texts for a comprehensive review of these measures (e.g. Rothman and Greenland, 1998).
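The ratio and difference measures described above can be computed from the counts in table 2.2 as follows. This is a sketch with variable names of my own choosing. Because the table counts are themselves rounded, the odds ratio computed from them comes out at about 2.105, which may differ in the last printed digit from the 2.11 shown in the table.

```python
# Counts from table 2.2
a, b = 1_813, 952          # cases: exposed, non-exposed
c, d = 8_187, 9_048        # non-cases: exposed, non-exposed
n1, n0 = 10_000, 10_000    # persons at risk at the start of follow-up
y1, y0 = 90_635, 95_163    # person-years at risk

# Ratio measures of effect
rate_ratio = (a / y1) / (b / y0)   # incidence rate ratio
risk_ratio = (a / n1) / (b / n0)   # incidence proportion ratio
odds_ratio = (a / c) / (b / d)     # incidence odds ratio

# Difference measures of effect
rate_difference = a / y1 - b / y0  # per person-year
risk_difference = a / n1 - b / n0  # over the ten-year follow-up

print(f"rate ratio      {rate_ratio:.2f}")
print(f"risk ratio      {risk_ratio:.2f}")
print(f"odds ratio      {odds_ratio:.3f}")
print(f"rate difference {rate_difference:.4f} per person-year")
print(f"risk difference {risk_difference:.4f}")
```

The ordering risk ratio < rate ratio < odds ratio seen here reflects the different denominators (persons, person-years, survivors); the three converge as the outcome becomes rarer.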

2.2. Incidence Case-Control Studies

Incidence studies are the most comprehensive approach to studying the causes of disease, since they use all of the information about the source population over the risk period. However, they are very expensive in terms of time and resources. For example, the hypothetical study presented in table 2.2 would involve enrolling 20,000 people and collecting exposure information (on both past and present exposure) for all of them. The same findings can be obtained more efficiently by using a case-control design.

An incidence case-control study involves studying all (or a sample) of the incident cases of the disease that occurred in the source population over the risk period, and a control group sampled from the same population over the same period (the possible methods of sampling controls are described below).

Table 2.3 shows the data from a hypothetical case-control study, which involved studying all of the 2,765 incident cases which would have been identified in the full incidence study, and a sample of 2,765 controls (one for each case). Such a case-control study would achieve the same findings as the full incidence study, but would be much more efficient, since it would involve ascertaining the exposure histories of 5,530 people (2,765 cases and 2,765 controls) rather than 20,000. When the outcome under study is very rare, an even more remarkable gain in efficiency can be achieved with very little reduction in the precision of the effect estimate.

Table 2.3
Findings from a hypothetical incidence case-control study based on the cohort in table 2.2

                                  Exposed        Non-exposed     Odds Ratio
------------------------------------------------------------------------------
Cases                             1,813 (a)      952 (b)
Controls:
  from survivors
  (cumulative sampling)           1,313 (c)      1,452 (d)       2.11
  from source population
  (case-cohort sampling)          1,383 (c)      1,383 (d)       1.90
  from person-years
  (density sampling)              1,349 (c)      1,416 (d)       2.00
------------------------------------------------------------------------------

Measures of Effect in Incidence Case-Control Studies

In case-control studies, the relative risk is estimated using the odds ratio.

Suppose that a case-control study is conducted in the study population shown in table 2.2; such a study might involve all of the 2,765 incident cases and a group of 2,765 controls (table 2.3). The effect measure which the odds ratio obtained from this case-control study will estimate depends on the manner in which controls are selected. Once again, there are three main options (Miettinen, 1985; Pearce, 1993; Rothman and Greenland, 1998).

One option, called cumulative (or cumulative incidence) sampling, is to select controls from those who do not experience the outcome during the follow-up period, i.e. the survivors (those who did not develop the disease at any time during the follow-up period). In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds (c/d = 8187/9048 = 1313/1452) of the survivors, and the odds ratio obtained in the case-control study will therefore estimate the incidence odds ratio in the source population over the study period (2.11). Early presentations of the case-control approach usually assumed this context (Cornfield, 1951), and it was emphasised that the odds ratio was approximately equal to the risk ratio when the disease was rare.

It was later recognised that controls can be sampled from the entire source population (those at risk at the beginning of follow-up), rather than just from the survivors (those at risk at the end of follow-up). This approach, which was previously used by Thomas (1972) and Kupper et al (1975), has more recently been termed case-cohort sampling (Prentice, 1986), or case-base sampling (Miettinen, 1982). In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds in the source population of persons at risk at the start of follow-up (N1/N0 = 10000/10000 = 1383/1383), and the odds ratio obtained in the case-control study will therefore estimate the risk ratio in the source population over the study period (1.90). In this instance the method of calculation of the odds ratio is the same as for any other case-control study, but minor changes are needed in the standard methods for calculating confidence intervals and p-values to take into account that some cases may also be selected as controls (Greenland, 1986).

The third approach is to select controls longitudinally throughout the course of the study (Sheehe, 1962; Miettinen, 1976); this is sometimes described as risk-set sampling (Robins et al, 1986), sampling from the study base (the person-time experience) (Miettinen, 1985), or density sampling (Kleinbaum et al, 1982). In this instance, the ratio of exposed to non-exposed controls will estimate the exposure odds in the person-time (Y1/Y0 = 90635/95163 = 1349/1416), and the odds ratio obtained in the case-control study will therefore estimate the rate ratio in the study population over the study period (2.00).

Case-control studies have traditionally been presented in terms of cumulative sampling (e.g. Cornfield, 1951), but most case-control studies actually involve density sampling (Miettinen, 1976), often with matching on a time variable such as calendar time or age, and therefore estimate the rate ratio without the need for any rare disease assumption (Sheehe, 1962; Miettinen, 1976; Greenland and Thomas, 1982).
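The correspondence between the control sampling scheme and the effect measure estimated can be verified from tables 2.2 and 2.3. The sketch below is illustrative (the dictionary layout and names are my own): for each scheme it compares the exposure odds among the controls with the exposure odds in the part of the source population being sampled, and then computes the case-control odds ratio.

```python
# Cases from tables 2.2 and 2.3
cases_exposed, cases_non_exposed = 1_813, 952

# For each scheme: the controls in table 2.3 (exposed, non-exposed) and the
# quantity in table 2.2 that the controls are sampled from (exposed, non-exposed).
schemes = {
    "cumulative": {"controls": (1_313, 1_452), "sampled_from": (8_187, 9_048)},      # survivors
    "case-cohort": {"controls": (1_383, 1_383), "sampled_from": (10_000, 10_000)},   # persons at start
    "density": {"controls": (1_349, 1_416), "sampled_from": (90_635, 95_163)},       # person-years
}

results = {}
for name, scheme in schemes.items():
    c, d = scheme["controls"]
    source_exposed, source_non_exposed = scheme["sampled_from"]
    results[name] = {
        # Exposure odds among controls should mirror those in the sampled population
        "control_exposure_odds": c / d,
        "source_exposure_odds": source_exposed / source_non_exposed,
        # Case-control odds ratio: (a/b) / (c/d)
        "odds_ratio": (cases_exposed / cases_non_exposed) / (c / d),
    }

for name, r in results.items():
    print(
        f"{name:12s} control odds {r['control_exposure_odds']:.4f}  "
        f"source odds {r['source_exposure_odds']:.4f}  "
        f"odds ratio {r['odds_ratio']:.2f}"
    )
```

Under these counts the three odds ratios correspond to the incidence odds ratio, the risk ratio and the rate ratio of table 2.2 respectively, with no rare disease assumption needed for the second and third.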

Example 2.2

Gustavsson et al (2001) studied the risk of myocardial infarction from occupational exposure to motor exhaust, other combustion products, organic solvents, lead, and dynamite. They identified first-time, nonfatal myocardial infarctions among men and women aged 45-70 years in Stockholm County from 1992-1994. They selected controls from the general population living in the same County during the same period (i.e. density sampling), matched for sex, age, year, and hospital catchment area. The odds ratio (estimating the rate ratio) of myocardial infarction was 2.11 (95% CI 1.23-3.60) among those highly exposed occupationally, and 1.42 (95% CI 1.05-1.92) in those moderately exposed, compared with persons not occupationally exposed to combustion products from organic material.

Summary

When a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a disease) a fundamental distinction is between studies of incidence and studies of prevalence. Thus, four main types of studies can be identified: incidence studies, incidence case-control studies, prevalence studies, and prevalence case-control studies (Morgenstern and Thomas, 1993; Pearce, 1998). These various study types differ according to whether they involve incidence or prevalence data and whether or not they involve sampling on the basis of the outcome under study. Incidence studies involve collecting and analysing data on the exposure and disease experience of the entire source population. They may resemble randomized trials, but they may involve additional problems of confounding because exposure has not been randomly assigned. The other potential study designs all involve sampling from the source population, and therefore may include additional biases arising from the sampling process (chapter 6). In particular, incidence case-control studies involve sampling on the basis of outcome, i.e. they usually involve all incident cases generated by the source population and a control group (of non-cases) sampled at random from the source population.

References

Breslow NE, Day NE (1987). Statistical methods in cancer research. Vol II: The analysis of cohort studies. Lyon, France: IARC.

Checkoway H, Pearce N, Kriebel D (2004). Research methods in occupational epidemiology. 2nd ed. New York: Oxford University Press.

Cornfield J (1951). A method of estimating comparative rates from clinical data: applications to cancer of the lung, breast and cervix. JNCI 11: 1269-75.

Greenland S (1986). Adjustment of risk ratios in case-base studies (hybrid epidemiologic designs). Stat Med 5: 579-84.

Greenland S (1987). Interpretation and choice of effect measures in epidemiologic analyses. Am J Epidemiol 125: 761-8.

Greenland S, Thomas DC (1982). On the need for the rare disease assumption in case-control studies. Am J Epidemiol 116: 547-53.

Gustavsson P, Plato N, Hallqvist J, et al (2001). A population-based case-referent study of myocardial infarction and occupational exposure to motor exhaust, other combustion products, organic solvents, lead and dynamite. Epidemiol 12: 222-8.

Kleinbaum DG, Kupper LL, Morgenstern H (1982). Epidemiologic research. Principles and quantitative methods. Belmont, CA: Lifetime Learning Publications.

Kupper LL, McMichael AJ, Spirtas R (1975). A hybrid epidemiologic design useful in estimating relative risk. J Am Stat Assoc 70: 524-8.

Martinez FD, Wright AJ, Taussig LM, et al (1995). Asthma and wheezing in the first six years of life. New Engl J Med 332: 133-8.

Miettinen OS (1976). Estimability and estimation in case-referent studies. Am J Epidemiol 103: 226-35.

Miettinen OS, Cook EF (1981). Confounding: essence and detection. Am J Epidemiol 114: 593-603.

Miettinen O (1982). Design options in epidemiologic research: an update. Scand J Work Environ Health 8(suppl 1): 7-14.

Miettinen OS (1985). Theoretical epidemiology. New York: Wiley.

Morgenstern H, Thomas D (1993). Principles of study design in environmental epidemiology. Environ Health Perspectives 101: S23-S38.

Pearce N (1993). What does the odds ratio estimate in a case-control study? Int J Epidemiol 22: 1189-92.

Pearce N (1998). The four basic epidemiologic study types. J Epidemiol Biostat 3: 171-7.

Prentice RL (1986). A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73: 1-11.

Robins JM, Breslow NE, Greenland S (1986). Estimation of the Mantel-Haenszel variance consistent with both sparse-data and large-strata limiting models. Biometrics 42: 311-23.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Sheehe PR (1962). Dynamic risk analysis of matched pair studies of disease. Biometrics 18: 323-41.

Thomas DB (1972). The relationship of oral contraceptives to cervical carcinogenesis. Obstet Gynecol 40: 508-18.

CHAPTER 3. Prevalence Studies
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

Incidence studies are ideal for studying events such as mortality or cancer incidence, since they involve collecting and analysing all of the relevant information on the source population and we can get better information on when exposure and disease occurred. However, incidence studies involve lengthy periods of follow-up and large resources, in terms of both time and funding, and it may be difficult to identify incident cases of non-fatal chronic conditions such as diabetes. Thus, in some settings (e.g. some developing countries) and/or for some conditions (e.g. chronic non-fatal disease) prevalence studies are the only option. Furthermore, in some instances we may be more interested in factors which affect the current burden of disease in the population. Consequently, although incidence studies are usually preferable, there is also an important role for prevalence studies, both for practical reasons, and because such studies enable the assessment of the level of morbidity and the population “disease burden” for a non-fatal condition.
developing countries) and/or for some

3.1. Prevalence Studies

The term prevalence denotes the number of cases of the disease under study existing in the source population at a particular time. This can be defined as point prevalence, estimated at one point in time, or period prevalence, which denotes the number of cases that existed during some time interval (e.g. one year).

When expressed relative to the size of the source population, the prevalence is a proportion, and the statistical methods for calculating a confidence interval for the prevalence are identical to those for calculating a confidence interval for the incidence proportion (chapter 12).

In some instances, the aim of a prevalence study may simply be to compare the disease prevalence among a specific population with that in other communities or countries. This may be done, for example, in order to discover differences in disease prevalence and thus to suggest possible risk factors for the disease. Further studies may then involve testing specific hypotheses by comparing prevalence in subgroups of people who have or have not been exposed to a particular risk factor (e.g. passive smoking) in the past.

Prevalence studies often represent a considerable saving in resources compared with incidence studies, since it is only necessary to evaluate disease prevalence at one point in time, rather than continually searching for incident cases over an extended period of time. On the other hand, this gain in efficiency is achieved at the cost of greater risk of biased inferences, since it may be much more difficult to understand the temporal relationship between various exposures and the occurrence of disease. For example, an exposure that increases the risk of death in people with pre-existing chronic heart disease will be negatively associated with the prevalence of heart disease (in people who are alive!), and will therefore appear to be ‘protective’ against heart disease in a prevalence study.

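As a rough illustration of the confidence interval calculation mentioned above, the following Python sketch computes a point prevalence and an approximate 95% confidence interval. The survey counts are hypothetical, and the normal-approximation (Wald) interval is an assumption made here for simplicity; chapter 12 remains the reference for the methods used in this text.

```python
import math

def prevalence_with_ci(cases, population, z=1.96):
    """Return (prevalence, lower, upper) using a Wald (normal approximation) interval."""
    if population <= 0 or not 0 <= cases <= population:
        raise ValueError("need population > 0 and 0 <= cases <= population")
    p = cases / population
    se = math.sqrt(p * (1 - p) / population)  # standard error of a proportion
    lower = max(0.0, p - z * se)              # keep the limits inside [0, 1]
    upper = min(1.0, p + z * se)
    return p, lower, upper

# Hypothetical survey: 120 prevalent cases among 2,000 people examined.
p, lower, upper = prevalence_with_ci(120, 2000)
print(f"prevalence = {p:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

The Wald interval can behave poorly when the prevalence is close to 0 or 1 or when the sample is small, so other intervals may be preferable in those settings.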
Example 3.1

The International Study of Asthma and Allergies in Childhood (ISAAC) (Asher et al, 1995; Pearce et al, 1993) involved a simple Phase I global asthma symptom prevalence survey and a more in-depth Phase II survey. The emphasis was on obtaining the maximum possible participation across the world in order to obtain a global overview of childhood asthma prevalence, and the Phase I questionnaire modules were designed to be simple and to require minimal resources to administer. In addition, a video questionnaire involving the audio-visual presentation of clinical signs and symptoms of asthma was developed in order to minimise translation problems. The population of interest was schoolchildren aged 6-7 years and 13-14 years within specified geographical areas. The older age-group was chosen to reflect the period when morbidity from asthma is common and to enable the use of self-completed questionnaires. The younger age-group was chosen to give a reflection of the early childhood years, and involves parent-completion of questionnaires. The Phase I findings, involving more than 700,000 children, showed striking international differences in asthma symptom prevalence (ISAAC Steering Committee, 1998a, 1998b). Figure 3.1 shows the findings for current wheeze (i.e. wheeze in the previous 12 months). There are a number of interesting features of the figure: (i) there is a particularly high prevalence of reported asthma symptoms in English-speaking countries; (ii) centres in Latin America also had particularly high symptom prevalence; (iii) there is also high asthma prevalence in Western Europe, with lower prevalences in Eastern and Southern Europe - for example, there is a clear Northwest-Southeast gradient within Europe, with the highest prevalence in the world being in the United Kingdom, and some of the lowest prevalences in Albania and Greece; (iv) Africa and Asia generally showed relatively low asthma prevalence. These striking findings call into question many of the “established” theories of asthma causation, and have played a major role in the development of new theories of asthma causation in recent years (Douwes and Pearce, 2003).

Figure 3.1

Twelve month period prevalence of asthma symptoms in 13-14 year old children in Phase I of the International Study of Asthma and Allergies in Childhood (ISAAC)

[Figure not reproduced. Legend categories: 20% or more; 10 to <20%; 5 to <10%; <5%]

Source: ISAAC Steering Committee (1998b)

Measures of Effect in Prevalence Studies

Figure 3.2 shows the relationship between incidence and prevalence of disease in a “steady state” population. Assume that the population is in a “steady state” (stationary) over time, in that the numbers within each subpopulation defined by exposure, disease and covariates do not change with time (this usually requires that incidence rates and exposure and disease status are unrelated to the immigration and emigration rates and population size), and that average disease duration (D) does not change over time. Then, if we denote the prevalence of disease in the study population by P, the prevalence odds is equal to the incidence rate (I) times the average disease duration (Alho, 1992):

P/(1-P) = ID

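Figure 3.2

Relationship between prevalence and incidence in a “steady state” population

[Figure: flow diagram. The non-asthmatic pool, of size N(1-P), feeds the pool of asthma cases, of size NP, through N(1-P) x I new cases per unit time; NP/D cases leave the asthma pool per unit time. When these flows balance, P/(1-P) = I x D.]

P=prevalence, I=incidence, D=duration, N=population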

Now suppose that we compare two populations (indexed by 1=exposed and 0=non-exposed) and that both satisfy the above conditions. Then, the prevalence odds is directly proportional to the disease incidence, and the prevalence odds ratio (POR) satisfies the equation:

POR = [P1/(1-P1)]/[P0/(1-P0)] = I1D1/I0D0

An increased prevalence odds ratio may thus reflect the influence of factors that increase the duration of disease, as well as those that increase disease incidence. However, in the special case where the average duration of disease is the same in the exposed and non-exposed groups (i.e. D1 = D0), then the prevalence odds ratio satisfies the equation:

POR = I1/I0

i.e. under the above assumptions, the prevalence odds ratio directly estimates the incidence rate ratio (Pearce, 2004). However, it should be emphasised that prevalence depends on both incidence and average disease duration, and a difference in prevalence between two groups could entirely depend on differences in disease duration (e.g. because of factors which prolong or exacerbate symptoms) rather than differences in incidence. Changes in incidence rates, disease duration and population sizes over time can also bias the POR away from the rate ratio, as can migration into and out of the population at risk or the prevalence pool.

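The steady-state relationship used in this section can be recovered in a few lines by balancing the flow of new cases into the prevalence pool against the flow of cases out of it. The derivation below is a sketch using the symbols of figure 3.2 (N, P, I, D), with subscripts 1 and 0 for the exposed and non-exposed populations; it assumes, as the text does, that nothing changes over time and that there is no migration.

```latex
% Steady state: new cases entering the prevalence pool per unit time
% equal the cases leaving it per unit time.
\begin{align*}
N(1-P)\,I &= \frac{NP}{D} \\[4pt]
\frac{P}{1-P} &= I D \\[4pt]
\mathrm{POR} = \frac{P_1/(1-P_1)}{P_0/(1-P_0)} &= \frac{I_1 D_1}{I_0 D_0}
  = \frac{I_1}{I_0} \quad \text{when } D_1 = D_0
\end{align*}
```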
Table 3.1
Findings from a hypothetical prevalence study of 20,000 persons

                      Exposed          Non-exposed      Ratio
--------------------------------------------------------------
Cases                    909 (a)          476 (b)
Non-cases              9,091 (c)        9,524 (d)
--------------------------------------------------------------
Total population      10,000 (N1)      10,000 (N0)
--------------------------------------------------------------
Prevalence            0.0909 (P1)      0.0476 (P0)      1.91
Prevalence odds       0.1000 (O1)      0.0500 (O0)      2.00

Table 3.1 shows data from a prevalence study of 20,000 people. This is based on the incidence study represented in table 2.2 (chapter 2), with the assumptions that, for both populations, the incidence rate and population size are constant over time, that the average duration of disease is five years, and that there is no migration of people with the disease into or out of the population (such assumptions may not be realistic, but are made here for purposes of illustration). In this situation, the number of cases who "lose" the disease each year is balanced by the number of new cases generated from the source population. For example, in the non-exposed group, there are 476 prevalent cases, and 95 (20%) of these "lose" their disease each year; this is balanced by the 95 people who develop the disease each year (0.0100 of the susceptible population of 9,524 people). With the additional assumption that the average duration of disease is the same in the exposed and non-exposed groups, the prevalence odds ratio (2.00) validly estimates the incidence rate ratio (see table 2.2).

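The figures in table 3.1 can be reproduced from the steady-state relationship P/(1-P) = ID. The Python sketch below does this under the assumptions stated in the text (10,000 people per group, a five-year average duration, no migration). The incidence rate of 0.01 per year for the non-exposed group is quoted in the text; the rate of 0.02 for the exposed group is inferred here from the tabulated prevalence odds (0.1000 divided by five years) and should be read as an assumption.

```python
def steady_state_prevalence(incidence_rate, duration):
    """Prevalence implied by P/(1-P) = I x D in a steady-state population."""
    odds = incidence_rate * duration
    return odds / (1 + odds)

N = 10_000      # size of each group, as in table 3.1
DURATION = 5    # average disease duration in years, as stated in the text
# Incidence rates per person-year; the exposed rate is inferred (see lead-in).
rates = {"exposed": 0.02, "non-exposed": 0.01}

results = {}
for group, rate in rates.items():
    p = steady_state_prevalence(rate, DURATION)
    cases = N * p
    results[group] = {
        "prevalence": p,
        "odds": p / (1 - p),
        "cases": cases,
        "new_cases_per_year": rate * (N - cases),  # inflow from the non-cases
        "cases_lost_per_year": cases / DURATION,   # outflow from the prevalence pool
    }
    print(f"{group}: prevalence {p:.4f}, odds {p / (1 - p):.4f}, cases {cases:.0f}")

por = results["exposed"]["odds"] / results["non-exposed"]["odds"]
prevalence_ratio = results["exposed"]["prevalence"] / results["non-exposed"]["prevalence"]
print(f"prevalence odds ratio {por:.2f}, prevalence ratio {prevalence_ratio:.2f}")
```

Changing the duration for one group only (for example, setting it to ten years for the exposed) shows how the prevalence odds ratio moves away from the incidence rate ratio when D1 and D0 differ.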
3.2. Prevalence Case-Control Studies

Just as an incidence case-control study can be used to obtain the same findings as a full incidence study, a prevalence case-control study can be used to obtain the same findings as a full prevalence study in a more efficient manner.

For example, in a prevalence study, obtaining exposure information may be difficult or costly, e.g. if it involves lengthy interviews, or expensive testing of biological samples. In this situation, a considerable gain in efficiency can be achieved by only obtaining exposure information on the prevalent cases and a sample of controls selected at random from the non-cases, rather than collecting exposure information for everyone in the prevalence study.

Measures of Effect in Prevalence Case-Control Studies

Suppose that a nested case-control study is conducted in the study population (table 3.1), involving all of the 1,385 prevalent cases and a group of 1,385 controls (table 3.2). The usual approach is to select controls from the non-cases. The ratio of exposed to non-exposed controls will then estimate the exposure odds (c/d) of the non-cases, and the odds ratio obtained in the prevalence case-control study will therefore estimate the prevalence odds ratio in the source population (2.00), which in turn estimates the incidence rate ratio provided that the assumptions described above are satisfied in the exposed and non-exposed populations.

Table 3.2
Findings from a hypothetical prevalence case-control study based on the population represented in table 3.1

                      Exposed          Non-exposed      Ratio
--------------------------------------------------------------
Cases                    909 (a)          476 (b)
Controls                 676 (c)          709 (d)
--------------------------------------------------------------
Prevalence odds         1.34 (O1)        0.67 (O0)      2.00
--------------------------------------------------------------

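The link between tables 3.1 and 3.2 can be checked with a short calculation. The Python sketch below takes the non-cases of table 3.1, works out how 1,385 controls sampled at random from them would be expected to divide between exposed and non-exposed, and compares the resulting case-control odds ratio with the prevalence odds ratio in the source population. The function and variable names are illustrative only; a real sample would of course vary around these expected counts.

```python
# Source population from table 3.1.
cases = {"exposed": 909, "non-exposed": 476}
non_cases = {"exposed": 9091, "non-exposed": 9524}

n_controls = sum(cases.values())          # 1,385 controls, one per case
total_non_cases = sum(non_cases.values())

# Expected exposure distribution of controls sampled at random from the non-cases.
expected_controls = {
    group: n_controls * count / total_non_cases
    for group, count in non_cases.items()
}

def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc, with a, b = exposed and non-exposed cases and
    c, d = exposed and non-exposed non-cases (or controls)."""
    return (a * d) / (b * c)

por_source = odds_ratio(909, 476, 9091, 9524)   # full prevalence study (table 3.1)
or_case_control = odds_ratio(909, 476, 676, 709)  # case-control sample (table 3.2)

print({group: round(n) for group, n in expected_controls.items()})
print(f"source POR {por_source:.2f}, case-control OR {or_case_control:.2f}")
```

Both ratios round to 2.00, which is the point of the design: exposure information is needed for 2,770 people rather than 20,000, and the same effect estimate is obtained.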
Example 3.2

Studies of congenital malformations usually involve estimating the prevalence of malformations at birth (i.e. this is a prevalence rather than an incidence measure). Garcia et al (1999) conducted a (prevalence) case-control study of occupational exposure to pesticides and congenital malformations in Comunidad Valenciana, Spain. A total of 261 cases and 261 controls were selected from those infants born in eight public hospitals during 1993-1994. For mothers who were involved in agricultural activities in the month before conception and the first trimester of pregnancy, the adjusted prevalence odds ratio for congenital malformations was 3.2 (95% CI 1.1-9.0). There was no such association with paternal agricultural work.

Summary

When a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a disease) four main types of studies can be identified: incidence studies, incidence case-control studies, prevalence studies, and prevalence case-control studies (Morgenstern and Thomas, 1993; Pearce, 1998). Prevalence studies involve measuring the prevalence of the disease in the source population at a particular time, rather than the incidence of the disease over time. Prevalence case-control studies involve sampling on the basis of outcome, i.e. they usually involve all prevalent cases in the source population and a control group (of non-cases) sampled from the source population.

References

Alho JM (1992). On prevalence, incidence, and duration in general stable populations. Biometrics 48: 587-92.

Asher I, Keil U, Anderson HR, et al (1995). International study of asthma and allergies in childhood (ISAAC): rationale and methods. Eur Respir J 8: 483-91.

Douwes J, Pearce N (2003). Asthma and the Westernization “package”. Int J Epidemiol 31: 1098-1102.

Garcia AM, Fletcher T, Benavides FG, Orts E (1999). Parental agricultural work and selected congenital malformations. Am J Epidemiol 149: 64-74.

ISAAC Steering Committee (1998a). Worldwide variation in prevalence of symptoms of asthma, allergic rhinoconjunctivitis and atopic eczema: ISAAC. Lancet 351: 1225-32.

ISAAC Steering Committee (1998b). Worldwide variations in the prevalence of asthma symptoms: International Study of Asthma and Allergies in Childhood (ISAAC). Eur Respir J 12: 315-35.

Morgenstern H, Thomas D (1993). Principles of study design in environmental epidemiology. Environ Health Perspectives 101: S23-S38.

Pearce N (1998). The four basic epidemiologic study types. J Epidemiol Biostat 3: 171-7.

Pearce N (2004). Effect measures in prevalence studies. Environ Health Perspectives 112: 1047-50.

Pearce NE, Weiland S, Keil U, et al (1993). Self-reported prevalence of asthma symptoms in children in Australia, England, Germany and New Zealand: an international comparison using the ISAAC protocol. Eur Respir J 6: 1455-61.

CHAPTER 4. More Complex Study Designs
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

In the previous two chapters I reviewed the possible study designs for the simple situation where individuals are exposed to a particular risk factor (e.g. a particular chemical) and when a dichotomous outcome is under study (e.g. being alive or dead, or having or not having a particular disease). I now consider studies involving other axes of classification, continuous measurements of health status (e.g. continuous lung function or blood pressure measurements) and more complex study designs (ecologic and multilevel studies).

4.1: Other Axes of Classification

The four basic study types discussed in chapters 2 and 3 are defined in terms of: (a) the type of outcome under study (incidence or prevalence); and (b) whether there is sampling on the basis of outcome. They do not involve any consideration of the nature of the exposure data. This provides additional axes of classification.

Continuous Exposure Data

Firstly, it should be noted that in discussing the above classification we have assumed that exposure is dichotomous (i.e. study participants are exposed or not exposed). In reality, there may be multiple exposure categories (e.g. high, medium and low exposure), or exposure may be measured as a continuous variable (see chapter 8). However, although this requires minor changes to the data analysis (see chapter 12), it does not alter the four-fold categorisation of study design options presented above.

The Timing of Collection of Exposure Information

Perhaps the feature that has received the most attention in various classification schemes is the timing of the collection of exposure information. This has dominated discussions of “directionality”, particularly with regard to case-control studies. In fact, for all of the four basic study types, exposure information can be collected prospectively or retrospectively. For example, an incidence study or incidence case-control study of occupational cancer may collect exposure information prospectively, or use historical information that was collected prospectively but abstracted retrospectively by the investigator (e.g. occupational hygiene monitoring records), or use exposure information that was collected retrospectively (e.g. recall of duration and intensity of pesticide use). An unfortunate aspect of some discussions of the merits of case-control studies is that they have often been labelled as “retrospective” studies, when this is in fact not an inherent part of their design. The potential “problem” of bias due to exposure ascertainment errors (e.g. recall bias) arises from the retrospective collection of exposure information, irrespective of whether the study is an incidence, incidence case-control, prevalence, or prevalence case-control study.

Sources of Exposure Information

Another set of issues that occur in practice involve the sources of exposure information (e.g. routine records, job-exposure-matrices, questionnaires, biological samples). However, as noted above, these issues are important in understanding sources of bias but are not fundamental to the classification of study types since, as with issues of directionality, they do not affect the parameterization of the exposure-outcome association.

The Level of Measurement of Exposure

A third additional axis of classification involves the level of measurement of exposure. In particular, in ecologic studies exposure information may be collected on a group rather than on individuals (e.g. average level of meat consumption), although other information may still be available for individuals (e.g. age, gender). This situation is discussed in section 4.3.

4.2: Continuous Outcome Measures

Cross-Sectional Studies

In chapters 2 and 3, the health outcome under study was a state (e.g. having or not having hypertension). Studies could involve observing the incidence of the event of acquiring the disease state (e.g. the incidence of being diagnosed with hypertension), or the prevalence of the disease state (e.g. the prevalence of hypertension). More generally, the health state under study may have multiple categories (e.g. non-hypertensive, mild hypertension, moderate hypertension, severe hypertension) or may be represented by a continuous measurement (e.g. blood pressure). Since these measurements are taken at a particular point in time, such studies are often referred to as cross-sectional studies. Prevalence studies (see chapter 3) are a subgroup of cross-sectional studies in which the disease outcome is dichotomous.

Although cross-sectional studies are sometimes described as studies in which exposure and disease information is collected at the same point in time (e.g. Kramer and Boivin, 1988; Last, 1988), this is not in fact an inherent feature of such studies. In most cross-sectional studies (including prevalence studies), information on exposure will be physically collected by the investigator at the same time that information on disease is collected. Nonetheless, exposure information may include factors that do not change over time (e.g. gender) or change in a predictable manner (e.g. age) as well as factors that do change over time. The latter may have been measured at the time of data collection (e.g. current levels of airborne dust exposure), or at a previous time (e.g. from historical records on past exposure levels) or integrated over time. The key feature of cross-sectional studies is that they involve studying disease at a particular point in time. Exposure information can be collected for current and/or historical exposures, and a wide variety of exposure assessment methods can be used within this general category of study (these are discussed further in chapter 8).

Just as a prevalence case-control study can be based on a prevalence survey, a cross-sectional study can also involve sampling on the basis of the disease outcome. For example, a cross-sectional study of bronchial hyperresponsiveness (BHR) could involve testing all study participants for BHR and then categorising the test results into severe BHR, mild BHR, and no BHR, and then obtaining exposure information on all severe BHR cases and from random samples of the other two groups.

Measures of Effect in Cross-Sectional Studies

In a simple cross-sectional study involving continuous outcome data, the basic methods of statistical analysis involve comparing the mean level of the outcome in “exposed” and “non-exposed” groups, e.g. the mean levels of blood pressure in “exposed” and “non-exposed” people. Standard statistical methods of analysis for comparing means (perhaps after a suitable transformation to normalise the data), and calculating confidence intervals (and associated p-values) for differences between means, can be used to analyse such studies (see chapter 12). More generally, regression methods can be used to model the relationship between the level of exposure (measured as a continuous variable) and the level of the outcome measure (also measured as a continuous variable) (e.g. Armitage et al, 2002).

Example 4.1

Nersesyan et al (2001) studied chromosome aberrations in lymphocytes of persons exposed to an earthquake in Armenia. They collected blood samples from 41 victims of the 1988 earthquake and from 47 reference blood donors. Those “exposed” to the earthquake had a higher proportion of cells with chromosome aberrations (3.1% (SD 2.1)) than the referents (1.7% (SD 1.3)). The differences persisted when the data were adjusted for age and gender. The authors suggested that the findings could be due either to environmental exposures related to the earthquake or to severe psychogenic stress. They noted that studies in wild rodents living in seismic regions have shown similar findings.

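The comparison of means described for cross-sectional studies can be sketched using the summary figures quoted in example 4.1. The Python code below treats 3.1% and 1.7% as group means of a per-person measurement, with the quoted standard deviations and group sizes, and computes the crude difference with an approximate 95% confidence interval. The unequal-variance standard error and the normal approximation are assumptions made here; this is not the authors' analysis, and it ignores the adjustment for age and gender mentioned in the example.

```python
import math

def mean_difference_ci(mean1, sd1, n1, mean0, sd0, n0, z=1.96):
    """Difference in means (group 1 minus group 0) with an approximate CI,
    using an unequal-variance standard error and a normal approximation."""
    diff = mean1 - mean0
    se = math.sqrt(sd1**2 / n1 + sd0**2 / n0)
    return diff, diff - z * se, diff + z * se

# Summary figures quoted in example 4.1 (percentage of cells with aberrations).
diff, lower, upper = mean_difference_ci(3.1, 2.1, 41, 1.7, 1.3, 47)
print(f"difference = {diff:.2f} percentage points "
      f"(approximate 95% CI {lower:.2f} to {upper:.2f})")
```

With small groups a t-based interval would be slightly wider, and a transformation may be appropriate if the measurements are skewed, as noted in the text.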
Longitudinal Studies

Longitudinal studies (cohort studies) involve repeated observation of study participants over time (Pearce et al, 1998). Incidence studies (chapter 2) are a subgroup of longitudinal study in which the outcome measure is dichotomous. More generally, longitudinal studies may involve repeated assessment of categorical or continuous outcome measures over time (e.g. a series of linked cross-sectional studies in the same population). They thus can involve incidence data, a series of prevalence surveys, or a series of cross-sectional continuous outcome measures.

General longitudinal studies

A simple longitudinal study may involve comparing the disease outcome measure, or more usually changes in the measure over time, between exposed and non-exposed groups. For example, rather than comparing the incidence of hypertension (as in an incidence study), or the prevalence at a particular time (as in a prevalence study), or the mean blood pressure at a particular point in time (as in a cross-sectional study), a longitudinal study might involve measuring baseline blood pressure in exposed and non-exposed persons and then comparing changes in mean blood pressure (i.e. the change from the baseline measure) over time in the two groups. Such a comparison of means can be made using standard statistical methods for comparing means and calculating confidence intervals and associated p-values for the difference between the means (Armitage et al, 2002; Beaglehole et al, 1993). More generally, regression methods (Diggle et al, 1994) might be used to model the relationship between the level of exposure (measured as a continuous variable) and the level of the outcome measure (also measured as a continuous variable, in this instance the change in blood pressure).

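A minimal sketch of the regression approach just described is given below: the change in blood pressure from baseline is computed for each participant and then regressed on a continuous exposure measure by ordinary least squares. The six participants and their values are entirely hypothetical (and deliberately tidy), and simple least squares is used only for illustration; the methods for longitudinal data cited in the text are considerably more general.

```python
def least_squares(x, y):
    """Return (intercept, slope) of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: exposure level (arbitrary units) and systolic blood
# pressure (mmHg) at baseline and at follow-up for six participants.
exposure = [0, 1, 2, 3, 4, 5]
baseline = [120, 118, 125, 130, 122, 128]
follow_up = [121, 121, 130, 137, 131, 139]

# Outcome measure: change from the baseline measurement.
change = [after - before for before, after in zip(baseline, follow_up)]

intercept, slope = least_squares(exposure, change)
print(f"change in blood pressure = {intercept:.1f} + {slope:.1f} x exposure")
```

The slope is the estimated additional change in blood pressure per unit of exposure; in practice it would be reported with a confidence interval and adjusted for factors such as age.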
Example 4.2

The Tokelau Island Migrant Study (Wessen et al, 1992) examined the effects of migration on development of ‘Western diseases’ within a population which initially had a low incidence of these conditions. Round I surveys were conducted in the Tokelau Islands in 1968/1971, and these were repeated (Round II) in both the Tokelau Islands (1976) and in New Zealand (1975-7). A regression analysis of changes in blood pressure between Round I and Round II (adjusted for age) found that the mean annual increase in blood pressure was greater in those who had migrated than in those who had not: the mean differences were 1.43 for systolic and 1.15 for diastolic in men, and 0.66 and 0.46 respectively in women. These differences in rates of annual increase in blood pressure were maintained in subsequent surveys in men, but not in women.

Time series

One special type of longitudinal study is that of “time series” comparisons in which variations in exposure levels and symptom levels are assessed over time with each individual serving as their own control. Thus, the comparison of “exposed” and “non-exposed” involves the same persons evaluated at different times, rather than different groups of persons being compared (often at the same time) as in other longitudinal studies. The advantage of the time series approach is that it reduces or eliminates confounding (see chapter 6) by factors which vary among subjects but not over time (e.g. genetic factors), or whose day to day variation is unrelated to the main exposure (Pope and Schwartz, 1996). On the other hand, time series data often require special statistical techniques because any two factors that show a time trend will be correlated (Diggle et al, 1994). For example, even a three-month study of lung function in children will generally show an upward trend due to growth, as well as learning effects (Pope and Schwartz, 1996). A further problem is that the change in a measure over time may depend on the baseline value, e.g. changes in lung function over time may depend on the baseline level (Schouten and Tager, 1996).

Time series can involve dichotomous (binary) data, continuous data, or “counts” of events (e.g. hospital admissions) (Pope and Schwartz, 1996), and the changes in these values may be measured over minutes, hours, days, weeks, months or years (Dockery and Brunekreef, 1996). In many instances, such data can be analysed using the standard statistical techniques outlined above. For example, a study of daily levels of air pollution and asthma hospital admission rates can be conceptualised as a study of the incidence of hospital admission in a population exposed to air pollution compared with that in a population not exposed to air pollution. The key difference is that only a single population is involved, and it is regarded as exposed on high pollution days and as non-exposed on low pollution days. Provided that the person-time of exposure is appropriately defined and assessed, then the basic methods of analysis are not markedly different from other studies involving comparisons of exposed and non-exposed groups.

However, the analysis of time series may be complicated because the data for an individual are not independent and serial data are often correlated (Sherrill and Viegi, 1996), i.e. the value of a continuous outcome measure on a particular day may be correlated with the value for the previous day. Furthermore, previous exposure may be as relevant as, or more relevant than, current exposure. For example, the effects of air pollution may depend on exposure on preceding days as well as on the current day (Pope and Schwartz, 1996).

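Two of the complications mentioned above, serial correlation and the relevance of exposure on preceding days, can be made concrete with a short Python sketch. The first function estimates the correlation between each day's value and the previous day's value; the second builds a moving average of exposure over the current and preceding days, the kind of lagged exposure measure used in air pollution studies. The ten daily values are hypothetical and the function names are illustrative.

```python
def lag1_autocorrelation(series):
    """Estimate the correlation between each value and the previous day's value."""
    n = len(series)
    mean = sum(series) / n
    numerator = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    denominator = sum((value - mean) ** 2 for value in series)
    return numerator / denominator

def moving_average(series, window):
    """Mean of the current and preceding (window - 1) values.
    Days without a full window of data are returned as None."""
    averages = []
    for t in range(len(series)):
        if t + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(series[t - window + 1 : t + 1]) / window)
    return averages

# Hypothetical daily pollutant concentrations (arbitrary units) over ten days.
pollution = [10, 12, 15, 14, 13, 18, 20, 19, 16, 12]

r1 = lag1_autocorrelation(pollution)
three_day_mean = moving_average(pollution, 3)
print(f"lag-1 autocorrelation: {r1:.2f}")
print("3-day moving average:", three_day_mean)
```

A clearly positive lag-1 autocorrelation is a warning that consecutive observations do not carry independent information, so standard errors from methods that assume independence will tend to be too small.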
Example 4.3

Hoek et al (2001) studied associations between daily variations in air pollution and mortality in The Netherlands during 1986-1994. The authors found (table 4.1) that heart disease deaths were increased during periods with high levels of ozone, black smoke, particulate matter 10 microns in diameter (PM10), carbon monoxide (CO), sulfur dioxide (SO2) and nitrogen dioxide (NO2). As with previously published studies, the effects depended on exposures on the previous few days, and were weaker when the analysis only considered exposures on a particular day without using any lag period (Schwartz, 2000). The authors reported that deaths due to heart failure, arrhythmia, cerebrovascular causes and thrombocytic causes were more strongly associated with air pollution than were cardiovascular deaths in general. In particular, heart failure deaths, which made up 10% of all cardiovascular deaths, were responsible for about 30% of the excess cardiovascular deaths related to air pollution from particulate matter, SO2, CO, and NO2.

Table 4.1

Relative risks* (and 95% CIs) of cardiovascular disease mortality associated with air pollution concentrations in the Netherlands

Pollutant                   Total CVD mortality     Heart failure mortality
---------------------------------------------------------------------------
Ozone (1 day lag)           1.055 (1.032-1.079)     1.079 (1.009-1.154)
Black smoke (7 day mean)    1.029 (1.013-1.046)     1.081 (1.031-1.134)
PM10 (7 day mean)           1.012 (0.984-1.041)     1.036 (0.960-1.118)
CO (7 day mean)             1.026 (0.993-1.060)     1.109 (1.012-1.216)
SO2 (7 day mean)            1.029 (1.012-1.046)     1.098 (1.043-1.156)
NO2 (7 day mean)            1.023 (1.009-1.036)     1.064 (1.024-1.106)
---------------------------------------------------------------------------
*Relative risks per 1st to 99th percentile pollution difference: per 150 µg/m3 for ozone (8-hour maximum of the previous day); per 120 µg/m3 for CO, per 80 µg/m3 for PM10, per 30 µg/m3 for NO2, and per 40 µg/m3 for black smoke and SO2, all as 7-day moving averages
Source: Hoek et al (2001)

4.3: Ecologic and Multilevel Studies

The basic study designs described in chapters 2 and 3 involved the measurement of exposure and disease in individuals. In this section, I consider more complex study designs in which exposures are measured in populations instead of, or in addition to, individuals.

Ecologic Studies

In ecologic studies exposure information may be collected on a group rather than on individuals. In the past, ecologic studies have been regarded as an inexpensive but unreliable method for studying individual-level risk factors for disease. For example, rather than go to the time and expense to establish a cohort study or case-control study of fat intake and breast cancer, one could simply use national dietary and cancer incidence data and, with minimal time and expense, show a strong correlation internationally between fat intake and breast cancer. In this situation, an ecologic study does not represent a fundamentally different study design, but merely a particular variant of the four basic study designs described in chapters 2 and 3 in which information on average levels of exposure in populations is used as a surrogate measure of exposure in individuals.

This approach has been quite rightly regarded as inadequate and unreliable because of the many additional forms of bias that can occur in such studies compared with studies of individuals within a population. In particular, not only will measures of exposure in populations often be poor surrogates for exposures in individuals, but the ‘ecologic fallacy’ (see below) can occur in that factors that are associated with national disease rates may not be associated with disease in individuals (Greenland and Robins, 1994). Thus, ecologic studies have recently been regarded as a relic of the “pre-modern” phase of epidemiology before it became firmly established with a methodologic paradigm based on the theory of randomized controlled trials of individuals.

However, population-level studies are now experiencing a revival for two important reasons (Pearce, 2000).

Firstly, it is increasingly recognised that, even when studying individual-level risk factors, population-level studies play an essential role in defining the most important public health problems to be addressed, and in generating hypotheses as to their potential causes. Many important individual-level risk factors for disease simply do not vary enough within populations to enable their effects to be identified or studied (Rose, 1992). More importantly, such studies are a key component of the continual cycle of theory and hypothesis generation and testing (Pearce, 2000). Historically, the key area in which epidemiologists have been able to “add value” has been through this population focus (Pearce, 1996, 1999). For example, many of the recent discoveries on the causes of cancer (including dietary factors and colon cancer, hepatitis B and liver cancer, aflatoxins and liver cancer, human papilloma virus and cervical cancer) have their origins, directly or indirectly, in the systematic international comparisons of cancer incidence conducted in the 1950s and 1960s (Doll et al, 1966). These suggested hypotheses concerning the possible causes of the international patterns, which were investigated in more depth in further studies. In some instances these hypotheses were consistent with biological knowledge at the time, but in other instances they were new and striking, and might not have been proposed, or investigated further, if the population level analyses had not been done.
instances these hypotheses were

Example 4.4

The International Study of Asthma and Allergies in Childhood (ISAAC) (Asher et al, 1995; Pearce et al, 1993) was described in example 3.1. Figure 4.1 shows the findings for current wheeze (i.e. wheeze in the previous 12 months) and its association with tuberculosis notification rates (von Mutius et al, 2000). It shows a negative association between tuberculosis rates and asthma prevalence. This is not compelling evidence in itself (because of the major shortcomings of ecologic analyses that are described below), but it is generally consistent with the “hygiene hypothesis” that suggests that asthma prevalence is increasing in Western countries because of the loss of a protective effect from infections such as tuberculosis in early life.

A second reason that ecologic studies are experiencing a revival is that it is increasingly being recognised that some risk factors for disease genuinely operate at the population level (Pearce, 2000). In some instances they may directly cause disease, but perhaps more commonly they may cause disease as effect modifiers or determinants of exposure to individual-level risk factors. For example, being poor in a rich country or neighbourhood may be worse than having the same income level in a poor country or neighbourhood, because of problems of social exclusion and lack of access to services and resources (Yen and Kaplan, 1999). The failure to take account of the importance of population context, as an effect modifier and determinant of individual-level exposures, could be termed the “individualistic fallacy” (Diez-Roux, 1998), in which the major population determinants of health are ignored and undue attention is focussed on individual characteristics. In this situation, the associations between these individual characteristics and health can be validly estimated, but their importance relative to other potential interventions, and the importance of the context of such interventions, may be ignored.

Figure 4.1

Association of tuberculosis notification rates for the period 1980-1982 (in countries with
valid tuberculosis notification data) and the prevalence of asthma symptoms in 13-14
year old children in the International Study of Asthma and Allergies in Childhood (ISAAC)
Source: von Mutius et al (2000)

[Figure: scatter plot of wheeze in the last 12 months (%, written questionnaire; vertical axis, 0 to 40) against tuberculosis notification rate per 100,000 for the period 1980-1982 (horizontal axis, 0 to 80).]

Example 4.5

Wilkinson (1992) has analysed measures of income inequality and found them to be positively associated with national mortality rates in a number of Western countries. This is a true “ecologic exposure” since the level of income inequality is a characteristic of a country, and not of an individual. If this evidence is correct, this is clearly of crucial importance since it implies that ‘development’ in itself may not automatically be good for health, and that the way in which the Gross National Product (GNP) is 'shared' may be as important as its absolute level. It should be noted, however, that this evidence has been disputed by other researchers (e.g. Lynch et al, 2000; Pearce and Davey Smith, 2003) who have argued that the level of income inequality in a country, or in a state, is a surrogate measure for other socioeconomic factors, including the provision of public education and health services, as well as social welfare services.
evidence is correct, this disputed by other

Ecologic Fallacies

While stressing the potential value of ecologic analyses, it is also important to recognise their limitations. In particular, ecologic studies are a very poor means of assessing the effects of individual exposures (e.g. diet or tobacco smoking) since confounding (and effect modification) can occur at the individual level, the country (population) level, or both (Morgenstern, 1998). For example, almost any disease that is associated with affluence and westernisation has in the past been associated at the national level with sales of television sets, and nowadays is probably associated at the national level with rates of internet usage. This does not mean that watching television causes every type of disease, but rather that in many instances the association between sales of television sets and disease at the national level is confounded by other exposures (at both the national and individual level). A hypothetical example is given in example 4.6. Another problem is that individual level effects can confound ecologic estimates of population-level effects (Greenland, 2001). These problems of cross-level inference are avoided (or reduced) in multilevel analyses (see below).

Example 4.6

Table 4.2 shows the data for a hypothetical ecological analysis. The numbers of cases and population numbers (and hence disease rates), as well as the percentage of the population exposed, are known for each country. Thus, the numbers of people exposed and non-exposed within each country are known, but it is not known how many cases were exposed and how many were not; thus it is not possible to estimate the rates in the exposed and non-exposed groups within each country. The country-level data indicate a negative association between exposure and disease at the country level: if a regression is performed on the country-level data it indicates (comparing 100% exposure with 0% exposure) a relative risk of 0.5. However, it is not known whether this association applies to individuals, since the data are not available.

Tables 4.3-4.5 give three different scenarios, each of which could generate the data in table 4.2. In table 4.3, there is no confounding at the country level (because the rate in the non-exposed is the same, 200 per 100,000, in each country), although there could of course still be uncontrolled confounding at the individual level. Thus, the ecologic analysis correctly estimates the individual-level relative risk of 0.5. In table 4.4, there is confounding at the country level (because the rate in the non-exposed differs by country) and there is in fact essentially no association at the individual level (rate ratios of 1.0 to 1.1). In table 4.5, there is effect modification at the country level, and the association is positive (a relative risk greater than 1.0), but of differing magnitude, in all three countries. These three very different situations (a protective effect, no effect, a positive effect which is different in each country) all yield the same country-level data shown in table 4.2.

Table 4.2

Hypothetical example of an ecologic analysis

               Country 1          Country 2          Country 3
               (35% exposed)      (50% exposed)      (65% exposed)
               Cases/N    Rate    Cases/N    Rate    Cases/N    Rate
----------------------------------------------------------------------
Exposed        ?/7000     ?       ?/10000    ?       ?/13000    ?
Non-exposed    ?/13000    ?       ?/10000    ?       ?/7000     ?
----------------------------------------------------------------------
Total          33/20000   165     30/20000   150     27/20000   135

Cases/N = number of cases / population; rates are per 100,000.
Source: Adapted from Morgenstern (1998)

Table 4.3

Hypothetical example of an ecologic analysis:
No confounding by country

               Country 1          Country 2          Country 3
               (35% exposed)      (50% exposed)      (65% exposed)
               Cases/N    Rate    Cases/N    Rate    Cases/N    Rate
----------------------------------------------------------------------
Exposed        7/7000     100     10/10000   100     13/13000   100
Non-exposed    26/13000   200     20/10000   200     14/7000    200
----------------------------------------------------------------------
Total          33/20000   165     30/20000   150     27/20000   135
----------------------------------------------------------------------
Rate ratio     0.5                0.5                0.5

Cases/N = number of cases / population; rates are per 100,000.
Source: Adapted from Morgenstern (1998)

Table 4.4

Hypothetical example of an ecologic analysis:
Confounding by country

               Country 1          Country 2          Country 3
               (35% exposed)      (50% exposed)      (65% exposed)
               Cases/N    Rate    Cases/N    Rate    Cases/N    Rate
----------------------------------------------------------------------
Exposed        12/7000    171     15/10000   150     18/13000   139
Non-exposed    21/13000   162     15/10000   150     9/7000     129
----------------------------------------------------------------------
Total          33/20000   165     30/20000   150     27/20000   135
----------------------------------------------------------------------
Rate ratio     1.1                1.0                1.1

Cases/N = number of cases / population; rates are per 100,000.
Source: Adapted from Morgenstern (1998)

Table 4.5

Hypothetical example of an ecologic analysis:
Effect modification by country

               Country 1          Country 2          Country 3
               (35% exposed)      (50% exposed)      (65% exposed)
               Cases/N    Rate    Cases/N    Rate    Cases/N    Rate
----------------------------------------------------------------------
Exposed        20/7000    286     20/10000   200     20/13000   154
Non-exposed    13/13000   100     10/10000   100     7/7000     100
----------------------------------------------------------------------
Total          33/20000   165     30/20000   150     27/20000   135
----------------------------------------------------------------------
Rate ratio     2.9                2.0                1.5

Cases/N = number of cases / population; rates are per 100,000.
Source: Adapted from Morgenstern (1998)
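
The arithmetic behind example 4.6 can be checked with a short script. The following is only an illustrative sketch in Python (the function and variable names are assumptions introduced here, not part of the original example): it fits a least-squares line to the three country-level rates in table 4.2, compares the predicted rate at 100% exposure with that at 0% exposure, and then confirms that the individual-level case counts in tables 4.3-4.5 all add up to the same country totals.

```python
# Country-level data from table 4.2: (number exposed, population, total cases).
COUNTRIES = [(7000, 20000, 33), (10000, 20000, 30), (13000, 20000, 27)]

# Individual-level cases (exposed, non-exposed) per country, tables 4.3-4.5.
SCENARIOS = {
    "table 4.3": [(7, 26), (10, 20), (13, 14)],
    "table 4.4": [(12, 21), (15, 15), (18, 9)],
    "table 4.5": [(20, 13), (20, 10), (20, 7)],
}


def ecologic_rate_ratio(countries):
    """Regress the country rate on the proportion exposed (ordinary least
    squares) and return predicted rate at 100% exposure / rate at 0%."""
    xs = [n_exp / pop for n_exp, pop, _ in countries]
    ys = [cases / pop * 100_000 for _, pop, cases in countries]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return (intercept + slope) / intercept


def individual_rate_ratios(scenario, countries):
    """Rate ratio (exposed / non-exposed) within each country."""
    ratios = []
    for (cases_exp, cases_unexp), (n_exp, pop, _) in zip(scenario, countries):
        rate_exp = cases_exp / n_exp
        rate_unexp = cases_unexp / (pop - n_exp)
        ratios.append(rate_exp / rate_unexp)
    return ratios


def country_totals(scenario):
    """Total cases per country implied by an individual-level scenario."""
    return [exp + unexp for exp, unexp in scenario]


if __name__ == "__main__":
    print("ecologic rate ratio:", round(ecologic_rate_ratio(COUNTRIES), 2))
    for name, scenario in SCENARIOS.items():
        ratios = [round(r, 1) for r in individual_rate_ratios(scenario, COUNTRIES)]
        print(name, "totals:", country_totals(scenario), "rate ratios:", ratios)
```

All three scenarios reproduce the totals of 33, 30 and 27 cases, which is why the country-level data alone cannot distinguish between them.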

Multilevel Studies

If individual as well as population-level data are available, then the problems of cross-level confounding and effect modification (illustrated in example 4.6) can be avoided (or reduced) by using multilevel modelling (Greenland, 2000, 2002). This enables the simultaneous consideration of individual level effects (e.g. individual income) and population-level effects (e.g. per capita national income, or income inequality). This approach therefore combines the best features of individual level analyses and population-level analyses. In particular, it enables us to take the population context of exposure into account (Pearce, 2000). However, it should be stressed that multilevel modelling is complex, and requires intensive consideration of possible biases at the population level, as well as at the individual level (Blakely and Woodward, 2000).

Example 4.7

Yen and Kaplan (1999) conducted a multi-level analysis of neighbourhood social environment and risk of death in the Alameda County Study, comprising 6,928 non-institutionalised adult residents of the County recruited in 1965. Mortality risks were significantly higher in neighbourhoods with a “low social environment”, even after account was taken of individual income level, education, ethnicity, perceived health status, smoking status, body mass index, and alcohol consumption. The authors concluded that the findings demonstrate the importance of area characteristics as a health risk factor.

Summary

The basic study designs presented in chapters 2 and 3 can be extended in two ways: by the inclusion of continuous outcome measures; and by the use of exposure information on populations rather than individuals.

Cross-sectional studies can include a variety of measurements of the health outcome under study (e.g. lung function or blood pressure measurements). Prevalence studies are a subgroup of cross-sectional studies in which the outcome measure is dichotomous. Similarly, longitudinal studies can involve incidence data, but may also involve a series of cross-sectional measurements. Incidence studies are a subgroup of longitudinal studies in which the outcome measure is dichotomous. Time series studies are a particular type of longitudinal study in which each subject serves as his or her own control.

Ecologic studies play an important role in the process of hypothesis generation and testing, but they pose additional problems of bias when attempting to estimate the effects of exposures in individuals. These problems are avoided (or reduced) in multilevel analyses, which permit us to take the population context of exposure into account.

References

Armitage P, Berry G, Matthews JNS (2002). Statistical methods in medical research. 4th ed. Oxford: Blackwell.

Asher I, Keil U, Anderson HR, Beasley R, et al (1995). International study of asthma and allergies in childhood (ISAAC): rationale and methods. Eur Resp J 8: 483-91.

Beaglehole R, Bonita R, Kjellstrom T (1993). Basic epidemiology. Geneva: WHO.

Blakely T, Woodward AJ (2000). Ecological effects in multi-level studies. J Epidemiol Comm Health 54: 367-74.

Diez-Roux AV (1998). Bringing context back into epidemiology: variables and fallacies in multilevel analysis. Am J Publ Health 88: 216-22.

Diggle PJ, Liang K-Y, Zeger SL (1994). Analysis of longitudinal data. Oxford: Clarendon.

Dockery DW, Brunekreef B (1996). Longitudinal studies of air pollution effects on lung function. Am J Respir Crit Care Med 154: S250-S256.

Doll R, Payne P, Waterhouse J (eds) (1966). Cancer Incidence in Five Continents: A Technical Report. Berlin: Springer-Verlag (for UICC).

Greenland S, Robins J (1994). Ecologic studies - biases, misconceptions, and counterexamples. Am J Epidemiol 139: 747-60.

Greenland S (2000). Principles of multilevel modelling. Int J Epidemiol 29: 158-67.

Greenland S (2001). Ecologic versus individual-level sources of bias in ecologic estimates of contextual health effects. Int J Epidemiol 30: 1343-50.

Greenland S (2002). A review of multilevel theory for ecologic analyses. Stat Med 21: 389-95.

Hoek G, Brunekreef B, Fischer P, van Mijnen J (2001). The association between air pollution and heart failure, arrhythmia, embolism, thrombosis, and other cardiovascular causes of death in a time series study. Epidemiol 12: 355-57.

Kramer MS, Boivin J-F (1988). The importance of directionality in epidemiologic research design. J Clin Epidemiol 41: 717-8.

Last JM (ed) (1988). A dictionary of epidemiology. New York: Oxford University Press.

Lynch J, Due P, Muntaner C, Davey Smith G (2000). Social capital – is it a good investment strategy for public health? J Epidemiol Comm Health 54: 404-8.

Morgenstern H (1998). Ecologic studies. In: Rothman K, Greenland S. Modern epidemiology. Philadelphia: Lippincott-Raven, pp 459-80.

Nersesyan AK, Boffetta P, Sarkisyan TF, et al (2001). Chromosome aberrations in lymphocytes of persons exposed to an earthquake in Armenia. Scand J Work Environ Health 27: 120-4.

Pearce N (1996). Traditional epidemiology, modern epidemiology, and public health. Am J Publ Health 86: 678-83.

Pearce N (1999). Epidemiology as a population science. Int J Epidemiol 28: S1015-8.

Pearce N (2000). The ecologic fallacy strikes back. J Epidemiol Comm Health 54: 326-7.

Pearce N, Davey Smith G (2003). Is social capital the key to inequalities in health? Am J Publ Health 93: 122-9.

Pearce NE, Weiland S, Keil U, et al (1993). Self-reported prevalence of asthma symptoms in children in Australia, England, Germany and New Zealand: an international comparison using the ISAAC protocol. Eur Resp J 6: 1455-61.

Pearce N, Beasley R, Burgess C, Crane J (1998). Asthma epidemiology: principles and methods. New York: Oxford University Press.

Pearce N, Douwes J, Beasley R (2000). The rise and rise of asthma: a new paradigm for the new millennium? J Epidemiol Biostat 5: 5-16.

Pope CA, Schwartz J (1996). Time series for the analysis of pulmonary health data. Am J Respir Crit Care Med 154: S229-S233.

Rose G (1992). The strategy of preventive medicine. Oxford: Oxford University Press.

Schouten JP, Tager IB (1996). Interpretation of longitudinal studies: an overview. Am J Respir Crit Care Med 154: S278-S284.

Schwartz J (2000). The distributed lag between air pollution and daily deaths. Epidemiol 11: 320-6.

Sherill D, Viegi G (1996). On modeling longitudinal pulmonary function data. Am J Respir Crit Care Med 154: S217-S222.

Von Mutius E, Pearce N, Beasley R, Cheng S, Von Ehrenstein O, Björkstén B, Weiland S, on behalf of the ISAAC Steering Committee (2000). International patterns of tuberculosis and the prevalence of symptoms of asthma, rhinitis and eczema. Thorax 55: 449-53.

Wessen AF, Hooper A, Huntsman J, et al (1992). Migration and health in a small society: The case of Tokelau. Oxford: Clarendon Press, pp 318-57.

Wilkinson RG (1992). Income distribution and life expectancy. Br Med J 304: 165-8.

Yen IH, Kaplan GA (1999). Neighbourhood social environment and risk of death: multilevel evidence from the Alameda County Study. Am J Epidemiol 149: 898-907.

Part II

Study Design Issues

CHAPTER 5: Precision
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

Random error will occur in any epidemiologic study, just as it occurs in experimental studies. It is often referred to as chance, although it can perhaps more reasonably be regarded as "ignorance" (it is not, however, the only thing that we may be ignorant about, since our study may also be biased by unknown confounders, measurement error, etc). For example, if we toss a coin 50 times, then ideally we might be able to predict the outcome of each “toss” based on the speed, spin, and trajectory of the coin. In practice, we do not have all of the necessary information (because of “ignorance”), or the computing power to use it (because of chaotic behaviour), and we therefore regard the outcome of each “toss” as a “chance” phenomenon. However, we may note that, on the average, 50% of the “tosses” are heads and therefore we may say that a particular toss has a “50% chance” of producing a head.

Similarly, suppose that 50 lung cancer deaths occurred among 10,000 people aged 35-39 exposed to a particular factor during one year. Then, if each person had exactly the same cumulative exposure, we might expect two subgroups of 5,000 people each to experience 25 deaths during the one-year period. However, just as 50 tosses of a coin will not usually produce exactly 25 heads and 25 tails, neither will there be exactly 25 deaths in each group. This occurs because of differences in exposure to other risk factors for lung cancer, and differences in individual susceptibility between the two groups. Ideally, we should attempt to gather information on all known risk factors (potential confounders), and to adjust for these in the analysis (see chapter 12). However, there will always be other unknown or unmeasurable risk factors operating, and hence the disease rates in particular subgroups will fluctuate about the average. This will occur even if each subgroup has exactly the same exposure history.

Even in an experimental study, in which participants are randomised into "exposed" and "non-exposed" groups, there will be "random" differences in background risk between the compared groups, but these will diminish in importance (i.e. the random differences will tend to “even out”) as the study size grows. In epidemiological studies, because of the lack of randomisation, there is no guarantee that differences in baseline (background) risk will "even out" between the exposure groups as the study size grows.

The basic principles of analysis of epidemiologic data are discussed in chapter 12. However, at this stage it is important to discuss some basic statistical principles and methods since they are relevant to the calculation of the appropriate study size.
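
The coin-toss analogy for the 50 lung cancer deaths can be illustrated with a small simulation. This is only a sketch under a simplifying assumption that is not in the text: each of the 50 deaths is treated as independently equally likely to fall in either of the two equal-sized subgroups. The function name, the number of repeats and the random seed are all hypothetical choices.

```python
import random


def simulate_subgroup_deaths(n_deaths=50, n_repeats=1000, seed=1):
    """For each repeat, allocate n_deaths at random (probability 0.5 each)
    between two equal-sized subgroups and return the count in the first."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = []
    for _ in range(n_repeats):
        deaths_in_first = sum(1 for _ in range(n_deaths) if rng.random() < 0.5)
        counts.append(deaths_in_first)
    return counts


counts = simulate_subgroup_deaths()
mean_count = sum(counts) / len(counts)
share_exactly_25 = sum(1 for c in counts if c == 25) / len(counts)

if __name__ == "__main__":
    print("average deaths in first subgroup:", round(mean_count, 1))
    print("range observed:", min(counts), "to", max(counts))
    print("share of repeats with exactly 25 deaths:", round(share_exactly_25, 2))
```

The average is close to 25, but only a minority of the simulated repeats give exactly 25 deaths in each subgroup, which is the point made in the text.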

5.1: Basic Statistics

Basic Concepts

Data can be summarized in various forms, including frequency tables, histograms, bar charts, cross-tabulations and pie charts. However, it is usually also useful to give a summary measure of central tendency. The mean (or average) is the most commonly used measure of central tendency, because of its convenient statistical properties. The next step is data smoothing, which involves the combination of the data with a statistical model. In the simplest case, this involves assuming a particular statistical distribution in order to obtain a summary measure of variability of the data. The most common measure of variability is the standard deviation (Armitage et al, 2002). The standard deviation is especially useful when the underlying data distribution is approximately normal (i.e. symmetric with a special type of bell-shape). If the data are not normally distributed, then they can often be made approximately normally distributed by an appropriate transformation (e.g. a log transformation), but these transformations may distort the scientific meaning of the findings, and make them difficult to interpret.

Usually it is not possible to study the entire population in which one is interested (theoretically, this is almost always infinite since we usually wish to generalise our findings not only to the population we are studying, but also to other populations). It is therefore necessary to consider a random sample and to relate its characteristics to the total population. If repeated samples are taken from the same population, then the mean will vary between samples. Even if the underlying population is not normally distributed, the means of the samples will be approximately normally distributed provided that the samples are sufficiently large (how "large" depends on how non-normally distributed the population is). The standard deviation of the sample means is termed the standard error of the mean. Since the means are approximately normally distributed, about 95% of sample means will lie within 1.96 standard errors of the overall population mean. Usually, a study only involves one sample, but the standard error can be estimated by dividing the standard deviation of the sample by the square root of the number of people in the sample.

Most epidemiological studies involve categorical rather than continuous outcome data. For example, in a particular area one might estimate the proportion of births involving congenital malformations over a particular time period (this is actually the prevalence at birth; it is very difficult to calculate the incidence of congenital malformations because this requires information on abortions and stillbirths as well as live births). This involves the calculation of a proportion (p). Under the binomial distribution, if the sample is sufficiently large, the sampling distribution will approximate to the normal distribution with mean (p) and standard deviation:

$$s = \left(\frac{p(1-p)}{n}\right)^{0.5}$$

where n is the sample size and the “0.5” indicates the square root of the expression in parentheses. Thus,

one can calculate the proportion with malformations (i.e. the mean score for a population in which a malformation scores 1 and a completely healthy baby scores 0), and the standard deviation of this proportion (i.e. the standard error of the mean score), and if the sample is sufficiently large one can analyze these estimates based on the normal distribution.

Testing and Estimation

Usually, in epidemiologic studies, we wish to measure the difference in disease occurrence between groups exposed and not exposed to a particular factor. For example, if we have estimated the proportion of pregnancies involving congenital malformations in an area with high nitrate levels in drinking water, then we would wish to compare this to the corresponding proportion in an area with low nitrate levels (or with the proportion in all births nationally). In doing so, we not only wish to estimate the size of the observed association, but also whether an association as large as this is likely to have arisen by chance, if in fact there is no causal association between exposure and disease. The p-value is the probability that differences as large as or larger than those observed could have arisen by chance if the null hypothesis (of no association between exposure and disease) is correct. In the past, it has been common to “test” the statistical significance of the study findings by seeing whether the p-value is less than an arbitrary value (e.g. p<0.05). The limitations of statistical significance testing are discussed in chapter 12. However, even if we do not intend to use p-values when reporting the findings of a study, the statistical principles involved are nevertheless relevant to determining the appropriate study size.
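
As a rough illustration of these ideas, the sketch below computes the standard error of a proportion from the formula in section 5.1 and a two-sided p-value for the difference between two proportions, using the normal approximation. The birth counts are invented purely for illustration, the function names are assumptions, and the pooled two-proportion z-test is one common choice of test rather than a method prescribed by the text.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a proportion: (p(1-p)/n) ** 0.5."""
    return math.sqrt(p * (1 - p) / n)


def two_proportion_p_value(cases_1, n_1, cases_0, n_0):
    """Two-sided p-value for the difference between two proportions,
    using a pooled-variance z statistic and the normal approximation."""
    p_1 = cases_1 / n_1
    p_0 = cases_0 / n_0
    pooled = (cases_1 + cases_0) / (n_1 + n_0)
    se_diff = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_0))
    z = (p_1 - p_0) / se_diff
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical data: malformations among births in a high-nitrate area
# and a low-nitrate area (invented numbers, not taken from any study).
HIGH_CASES, HIGH_BIRTHS = 30, 1000
LOW_CASES, LOW_BIRTHS = 20, 1000

p_high = HIGH_CASES / HIGH_BIRTHS
se_high = proportion_standard_error(p_high, HIGH_BIRTHS)
interval_95 = (p_high - 1.96 * se_high, p_high + 1.96 * se_high)
z_stat, p_val = two_proportion_p_value(HIGH_CASES, HIGH_BIRTHS, LOW_CASES, LOW_BIRTHS)

if __name__ == "__main__":
    print("proportion:", p_high, "standard error:", round(se_high, 4))
    print("approximate 95% interval:", tuple(round(x, 3) for x in interval_95))
    print("z:", round(z_stat, 2), "two-sided p-value:", round(p_val, 3))
```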

5.2: Study Size and Power

The most effective means of reducing random error is by increasing the study size, so that the precision of the measure of association (the effect estimate) will be increased, i.e. the confidence intervals will be narrower. Random error thus differs from systematic error (see chapter 6) which cannot be reduced simply by increasing the study size.

A second factor that can affect precision, given a fixed total study size, is the relative size of the reference group (the unexposed group in a cohort study, or the controls in a case-control study). When exposure is not associated with disease (i.e. the true relative risk is 1.0), and the costs (of recruitment, data collection, etc) of index and reference subjects are the same, then a 1:1 ratio is most efficient for a given total study size. When exposure increases the risk of the outcome, or referents are cheaper to include in the study than index subjects, then a larger ratio may be more efficient. The optimal reference:index ratio is rarely greater than 2:1 for a simple unstratified analysis (Walter, 1977) with equal index and referent costs, but a larger average ratio may be desirable in order to assure an adequate ratio in each stratum for stratified analyses.

The ideal study would be infinitely large, but practical considerations set limits on

the number of participants that can be included. Given these limits, it is desirable to find out, before commencing the study, whether it is large enough to be informative. One method is to calculate the "power" of the study. This depends on five factors:

• the cutoff value (i.e. alpha level) below which the p-value from the study would be considered “statistically significant”. This value is usually set at 0.05 or 5%;
• the disease rate in the non-exposed group in a cohort study or the exposure prevalence of the controls in a case-control study;
• the expected relative risk (i.e. the specified value of the relative risk under the alternative (non-null) hypothesis);
• the ratio of the sizes of the two groups being studied;
• the total number of study participants.

Once these quantities have been determined, standard formulas are then available to calculate the statistical power of a proposed study (Walter, 1977; Schlesselman, 1982). The standard normal deviate corresponding to the power of the study (derived from Rothman and Boice, 1982) is then:

$$Z_\beta = \frac{N_0^{0.5}\,|P_1 - P_0|\,B^{0.5} - Z_\alpha B}{K^{0.5}}$$

where:
$Z_\beta$ = standard normal deviate corresponding to a given statistical power
$Z_\alpha$ = standard normal deviate corresponding to an alpha level (the largest p-value that would be considered "statistically significant")
$N_0$ = number of persons in the reference group (i.e. the non-exposed group in a cohort study, or the controls in a case-control study)
$P_1$ = outcome proportion in study group
$P_0$ = outcome proportion in the reference group
$A$ = allocation ratio of referent to study group (i.e., the relative size of the two groups)
$B = (1-P_0)(P_1 + (A-1)P_0) + P_0(1-P_1)$
$C = (1-P_0)(AP_1 - (A-1)P_0) + AP_0(1-P_1)$
$K = BC - A(P_1-P_0)^2$

Standard calculator and microcomputer programmes incorporating procedures for power calculations are widely available. In particular, EPI-INFO (Dean et al, 1990) can be downloaded for free from http://www.cdc.gov/epiinfo/, and Rothman’s Episheet programme (Rothman, 2002) can be downloaded for free from http://www.oup-usa.org/epi/rothman/

Example 5.1

Consider a proposed study of 5,000 exposed persons and 5,000 non-exposed persons. Suppose that on the basis of mortality rates in a comparable group of workers, the expected number of cases of the disease of interest is 25 in the non-exposed group. However, we expect that exposure will double the risk of disease, so the expected number of cases is 50 in the exposed group.

Then:
$Z_\alpha$ = 1.96 (if a two-tailed significance test, for an alpha-level of 0.05, is to be used)
$N_0$ = 5,000
$P_1$ = 0.010 (= 50/5000)
$P_0$ = 0.005 (= 25/5000)
$A$ = 1

Using the equation above, the standard normal deviate corresponding to the power of the study to detect a statistically significant lung cancer excess in the exposed group is:

$$Z_\beta = \frac{5000^{0.5}\,(0.010-0.005)\,(0.0149)^{0.5} - 1.96 \times 0.0149}{0.000197^{0.5}} = 0.994$$

From tables for the (one-sided) standard normal distribution, it can be seen that this corresponds to a power of 83%. This means that if 100 similar studies of this size were performed, then we would expect 83 of them to show a statistically significant (p<0.05) excess of cases in the exposed group.

An alternative approach is to carry out a standard analysis of the hypothesized results. If we make the assumptions given above, then the relative risk would be 2.0, with a 90% confidence interval of 1.4-3.0. This approach only has an indirect relationship to the power calculations. For example, if the lower 95% confidence limit is 1.0, then the power for a two-tailed test (of p<0.05) would be only 50%. This simulated confidence interval gives the additional information that the observed relative risk could be as large as 3.0 or as low as 1.4 if the observed relative risk is 2.0.
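
The calculation in example 5.1 can be reproduced with a few lines of code. The following Python sketch implements the power formula given earlier in this section, with B, C and K as defined there; the function names are assumptions introduced here, and the normal distribution function is computed from the standard library rather than read from tables. With the inputs of example 5.1 it gives a standard normal deviate of about 0.994 and a power of about 0.84, in line with the figure of roughly 83% quoted in the example.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def power_deviate(n0, p1, p0, a=1.0, z_alpha=1.96):
    """Standard normal deviate for power, for a dichotomous outcome.

    n0: number in the reference group; p1, p0: outcome proportions in the
    study and reference groups; a: allocation ratio (referent : study);
    z_alpha: deviate for the chosen alpha level (1.96 for two-sided 0.05).
    """
    b = (1 - p0) * (p1 + (a - 1) * p0) + p0 * (1 - p1)
    c = (1 - p0) * (a * p1 - (a - 1) * p0) + a * p0 * (1 - p1)
    k = b * c - a * (p1 - p0) ** 2
    return (math.sqrt(n0) * abs(p1 - p0) * math.sqrt(b) - z_alpha * b) / math.sqrt(k)


# Inputs from example 5.1: 5,000 per group, 50 versus 25 expected cases.
z_beta = power_deviate(n0=5000, p1=50 / 5000, p0=25 / 5000, a=1.0)
power = normal_cdf(z_beta)

if __name__ == "__main__":
    print("Z_beta:", round(z_beta, 3))
    print("power:", round(power, 2))
```

Changing n0, p1, p0 or a shows how each of the factors listed above affects the power of a proposed study.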

Related approaches are to estimate the minimum sample sizes required to detect an association (e.g., relative risk) of specified magnitudes (Beaumont and Breslow, 1981), and to estimate the minimum detectable association for a given alpha level, power and study size (Armstrong, 1987).

Occasionally, the outcome is measured as a continuous rather than a dichotomous variable (e.g. blood pressure). In this situation the standard normal deviate corresponding to the study power is:

$$Z_\beta = \frac{N_0^{0.5}\,(\mu_1 - \mu_0)}{s\,(A + 1)^{0.5}} - Z_\alpha$$

where:
$\mu_1$ = mean outcome measure in exposed group
$\mu_0$ = mean outcome measure in reference group
$s$ = estimated standard deviation of outcome measure

The power is not the probability that the study will estimate the size of the association correctly. Rather, it is the probability that the study will yield a "statistically significant" finding when an association of the postulated size exists. The observed association could be greater or less than expected, but still be "statistically significant". The overemphasis on statistical significance is the source of many of the limitations of power calculations. Many features such as the significance level are completely arbitrary, issues of confounding, misclassification and effect modification are generally ignored (although appropriate methods are available - see Schlesselman, 1982; Greenland, 1983), and the size of the expected association is often just a guess. Nevertheless, power calculations are an essential aspect of planning a study since, despite all their assumptions and uncertainties, they provide a useful general indication as to whether a proposed study will be large enough to satisfy the objectives of the study.

Estimating the expected precision can also be useful (Rothman and Greenland, 1998). This can be done by "inventing" the results, based on the same assumptions used in power calculations, and carrying out an analysis involving calculations of effect estimates and confidence limits. This approach has particular advantages when the exposure is expected to have no association with disease, since the concept of power is not applicable but precision is still of concern. However, this approach should be used with considerable caution, as the results may be misleading unless interpreted carefully. In particular, a study with an expected lower limit equal to a particular value (e.g. 1.0) will have only a 50% chance of yielding an observed lower confidence limit above that value.

In practice, the study size depends on the number of available participants and the available resources. Within these limitations it is desirable to make the study as large as possible, taking into account the trade-off between including more participants and gathering more detailed information about a smaller number of participants (Greenland, 1988). Hence, power calculations can only serve as a rough guide as to whether a feasible study is large enough to be worthwhile. Even if such calculations suggest that a particular study would have very low power, the study may still be worthwhile if exposure information is

collected in a form which will permit the study to contribute to the broader pool of information concerning a particular issue. For example, the International Agency for Research on Cancer (IARC) has organised several international collaborative studies such as those of occupational exposure to man-made mineral fibers (Simonato et al, 1986) and phenoxy herbicides and contaminants (Saracci et al, 1991). The man-made mineral fiber study involved pooling the findings from individual cohort studies of 13 European factories. Most of the individual cohorts were too small to be informative in themselves, but each contributed to the overall pool of data.

Once a study has been completed, there is little value in retrospectively performing power calculations since the confidence limits of the observed measure of effect provide the best indication of the range of likely values for the true association (Smith and Bates, 1992; Goodman and Berlin, 1994). In the next chapter, random error will be ignored, and the discussion will concentrate on issues of systematic error.
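
The power formula for a difference in means given earlier in this chapter can be evaluated with a few lines of Python. This is only a minimal sketch: it assumes that N0 is the size of the reference group and that A is the ratio of the reference group size to the exposed group size (both are defined outside this passage), and the function name and the input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_for_mean_difference(n0, mu1, mu0, s, a=1.0, z_alpha=1.96):
    """Return (z_beta, power) for a comparison of two means.

    Assumptions (not confirmed by the text): n0 is the reference group
    size and a is the reference-to-exposed allocation ratio.
    z_alpha=1.96 corresponds to a two-sided 5% significance level.
    """
    z_beta = sqrt(n0) * (mu1 - mu0) / (s * sqrt(a + 1)) - z_alpha
    # Power is the standard normal probability below z_beta.
    power = NormalDist().cdf(z_beta)
    return z_beta, power


# Hypothetical inputs: 100 people per group, a difference of half a
# standard deviation between the exposed and reference means.
z_beta, power = power_for_mean_difference(n0=100, mu1=10.5, mu0=10.0, s=1.0)
print(f"Z_beta = {z_beta:.2f}, power = {power:.2f}")
```

As the text stresses, the result is only a rough guide: the significance level is arbitrary and the postulated difference is usually a guess.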

Summary

Random error will occur in any epidemiologic study, just as it occurs in experimental studies. The most effective means of reducing random error is by increasing the study size, so that the precision of the effect estimate will be increased. Random error thus differs from systematic error which cannot be reduced simply by increasing the study size. The ideal study would be infinitely large, but practical considerations set limits on the number of participants that can be included. Given these limits, it is desirable to find out, before commencing the study, whether it is large enough to be informative. One method is to calculate the "power" of the study. In practice, the study size depends on the number of available participants and the available resources. Within these limitations it is desirable to make the study as large as possible, taking into account the trade-off between including more participants and gathering more detailed information about a smaller number of participants. Hence, power calculations can only serve as a rough guide as to whether a feasible study is large enough to be worthwhile.

References

Armitage P, Berry G, Matthews JNS (2002). Statistical methods in medical research. 4th ed. Oxford: Blackwell.

Armstrong B (1987). A simple estimator of minimum detectable relative risk, sample size, or power in cohort studies. Am J Epidemiol 125: 356-358.

Beaumont JJ, Breslow NE (1981). Power considerations in epidemiologic studies of vinyl chloride workers. Am J Epidemiol 114: 725-734.

Dean J, Dean A, Burton A, Dicker R (1990). Epi Info. Version 5.01. Atlanta, GA: CDC.

Goodman SN, Berlin JA (1994). The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Ann Intern Med 121: 200-6.

Greenland S (1983). Tests for interaction in epidemiologic studies: a review and a study of power. Statist Med 2: 243-51.

Greenland S (1988). Statistical uncertainty due to misclassification: implications for validation substudies. J Clin Epidemiol 41: 1167-74.

Rothman KJ, Boice JD (1983). Epidemiologic Analysis with a Programmable Calculator. Epidemiology Resources, Inc.: Boston, MA.

Rothman KJ (2002). Epidemiology: an introduction. New York, Oxford University Press.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Saracci R, Kogevinas M, Bertazzi P, et al (1991). Cancer mortality in an international cohort of workers exposed to chlorophenoxy herbicides and chlorophenols. Lancet 338: 1027-32.

Schlesselman JJ (1982). Case-control studies: design, conduct, analysis. New York: Oxford University Press.

Simonato L, Fletcher AC, Cherrie J, et al. (1986). Scandinavian Journal of Work, Environment and Health 12: (suppl 1) 34-47.

Smith AH, Bates M (1992). Confidence limit analyses should replace power calculations in the interpretation of epidemiologic studies. Epidemiol 3: 449-52.

Walter SD (1977). Determination of significant relative risks and optimal sampling procedures in prospective and retrospective studies of various sizes. Am J Epidemiol 105: 387-97.

CHAPTER 6: Validity
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

Systematic error (lack of validity) is distinguished from random error (lack of precision) in that it would be present even with an infinitely large study, whereas random error can be reduced by increasing the study size. Thus, systematic error, or bias, occurs if there is a systematic difference between what the study is actually estimating and what it is intended to estimate.

There are many different types of bias, but in studies of cause and effect most biases fall into one of three major categories (Rothman and Greenland, 1998): confounding; selection bias; and information bias. In general terms, these refer to biases arising from differences in baseline disease risk between the exposed and non-exposed subpopulations of the source population (confounding), biases resulting from the manner in which study participants are selected from the source population (selection bias), and biases resulting from the misclassification of these study participants with respect to exposure or disease (information bias).

6.1: Confounding

Confounding occurs when the exposed and non-exposed groups (in the source population) are not comparable due to inherent differences in background disease risk (Greenland and Robins, 1986) because of differences in the distribution of other risk factors between the exposed and non-exposed groups. For example, this could occur if we were studying the risk of heart disease in people who exercise frequently and those who do not, and if the people who exercised frequently smoked less than those who did not exercise; thus they might have a lower risk of heart disease because they smoked less, and not because they exercised more. Similar problems can occur in randomised trials because randomisation may fail, leaving the treatment groups with different characteristics (and different baseline disease risk) at the time that they enter the study, and because of differential loss and non-compliance across treatment groups. However, there is more concern about non-comparability in epidemiological studies because of the absence of randomisation. The concept of confounding thus generally refers to the source population, although confounding can also be introduced (or removed) by the manner in which study participants are selected from the source population (Pearce and Greenland, 2004).

If no other biases are present, three conditions are necessary for a factor to be a confounder (Rothman and Greenland, 1998).

First, a confounder is a factor which is predictive of disease in the absence of the exposure under study. Note that a confounder need not be a genuine cause of disease, but merely "predictive". Hence, surrogates for causal factors (e.g. age) may be regarded as potential confounders, even though they are rarely directly causal factors.

Second, a confounder is associated with exposure in the source population at the start of follow-up (i.e. at baseline). In case-control studies this implies that a confounder will tend to be associated with exposure among the controls. An association can occur among the cases simply because the study factor and a potential confounder are both risk factors for the disease, but this does not cause confounding in itself unless the association also exists in the source population.

Third, a variable which is affected by the exposure or the disease (e.g. an intermediate in the causal pathway between exposure and disease, or a symptom of disease) should not be treated as a confounder because to do so could introduce serious bias into the results (Greenland and Neutra, 1981; Robins, 1987; Weinberg, 1993). For example, in a study of high fat diet and colon cancer, it would be inappropriate to control for serum cholesterol levels if it was considered that high serum cholesterol levels were a consequence of a high fat diet, and hence a part of the causal chain leading from diet to colon cancer. On the other hand, if serum cholesterol itself was of primary interest, then this should be studied directly, and high fat diet would be regarded as a potential confounder if it also involved exposure to other risk factors for colon cancer. Evaluating this type of possibility requires information external to the study to determine whether a factor is likely to be a part of the causal chain. Intermediate variables can sometimes be used in the analysis, but special techniques are then required to avoid adding bias (Robins, 1989; Robins et al, 1992; Robins et al, 2000).

Example 6.1

Table 6.1 presents a hypothetical example of confounding by tobacco smoking in a prevalence case-control study. One-half of the study participants are "exposed" to the risk factor of interest and one-half are not. However, two-thirds of the exposed people are smokers compared with one-third of the non-exposed people. Thus, although "exposure" is not associated with disease either within the subgroup of smokers (POR=1.0) or within the subgroup of non-smokers (POR=1.0), it is associated with disease overall (POR=1.38) when the two subgroups are combined. This occurs because smoking is associated with the exposure (as noted above) and is an independent risk factor for the disease (40% of non-exposed smokers have the disease compared with 20% of non-exposed non-smokers). Thus, smoking is a confounder and the "crude" prevalence odds ratio of 1.38 is invalid because it is not adjusted for smoking.

Table 6.1

Hypothetical example of confounding by tobacco smoking in a prevalence case-control study

                         Smokers                Non-smokers            Total
                         Exposed  Non-exposed   Exposed  Non-exposed   Exposed  Non-exposed
Cases                      800       400          200       400        1,000       800
Non-cases                1,200       600          800     1,600        2,000     2,200
Total                    2,000     1,000        1,000     2,000        3,000     3,000
Prevalence (%)              40        40           20        20         33.3      26.7
Prevalence odds ratio      1.0                    1.0                   1.38
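
The contrast between the stratum-specific and crude prevalence odds ratios in Table 6.1 can be checked directly from the cell counts. The following is a small illustrative sketch; the function and variable names are hypothetical and are not taken from the text.

```python
def odds_ratio(exp_cases, exp_noncases, unexp_cases, unexp_noncases):
    """Odds of disease in the exposed divided by odds in the non-exposed."""
    return (exp_cases * unexp_noncases) / (exp_noncases * unexp_cases)


# Cell counts from Table 6.1, ordered as
# (exposed cases, exposed non-cases, non-exposed cases, non-exposed non-cases).
table_6_1 = {
    "smokers": (800, 1200, 400, 600),
    "non_smokers": (200, 800, 400, 1600),
}

# Odds ratio within each smoking stratum.
stratum_por = {name: odds_ratio(*cells) for name, cells in table_6_1.items()}

# Crude table: add the two strata cell by cell, ignoring smoking.
crude_cells = tuple(sum(column) for column in zip(*table_6_1.values()))
crude_por = odds_ratio(*crude_cells)

print(stratum_por)          # both strata give 1.0
print(round(crude_por, 2))  # 1.38 when smoking is ignored
```

The crude estimate departs from 1.0 only because smokers are over-represented among the exposed and have a higher prevalence of disease.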

Control of Confounding

Misclassification of a confounder leads to a loss of ability to control confounding, although control may still be useful provided that misclassification of the confounder is non-differential (Greenland, 1980). Misclassification of exposure poses a greater problem because factors which influence misclassification may appear to be confounders, but control of these factors may increase the net bias (Greenland and Robins, 1985). In general, control of confounding requires careful use of a priori knowledge, as well as inference from the observed data.

Control in the Study Design

Confounding can be controlled in the study design, or in the analysis, or both. Control at the design stage involves three main methods (Rothman and Greenland, 1998).

The first method is randomization, i.e., random allocation to exposure categories, but this is rarely an option in epidemiology which generally involves observational studies (it is debatable whether randomised studies are part of epidemiology or whether they constitute a separate methodology).

A second method of control at the design stage is to restrict the study to narrow ranges of values of the potential confounders, e.g., by restricting the study to white males aged 35-54. This approach has a number of conceptual and computational advantages, but may severely restrict the number of potential study subjects and the generalizability of the study, as effects in younger or older people will not be observable.

A third method of control involves matching study subjects on potential confounders. For example, in a cohort study one would match a white male non-exposed subject aged 35-39 with an exposed white male aged 35-39. This will prevent age-sex-race confounding in a cohort study, but is seldom done because it may be very expensive. Matching can also be expensive in case-control studies, and does not prevent confounding in such studies, but does facilitate its control in the analysis. Matching may actually reduce precision in a case-control study if it is done on a

factor which is associated with exposure, but is not a risk factor for the disease of interest. However, matching on a strong risk factor will usually increase the precision of effect estimates.

Control in the Analysis

Confounding can also be controlled in the analysis, although it may be desirable to match on potential confounders in the design to optimize the efficiency of the analysis. The analysis ideally should control simultaneously for all confounding factors. Control of confounding in the analysis involves stratifying the data according to the levels of the confounder(s) and calculating an effect estimate which summarizes the association across strata of the confounders. It is usually not possible to control simultaneously for more than 2 or 3 confounders in a stratified analysis. For example, in a cohort study, finer stratification will often lead to many strata containing no exposed or no non-exposed persons. Such strata are uninformative, thus fine stratification is wasteful of information. This problem can be mitigated to some extent by the use of multiple regression, which allows for simultaneous control of more confounders by "smoothing" the data across confounder strata.

Example 6.2

If the data presented in example 6.1 (table 6.1) are analysed separately in smokers and non-smokers, then the prevalence odds ratio is 1.0 in each of the two subgroups (i.e. 1.0 in smokers and 1.0 in non-smokers). Taking a weighted average of these two stratum-specific estimates (see chapter 12) then yields an overall smoking-adjusted prevalence odds ratio of 1.0.
prevalence odds ratio is these two stratum-

In general, control of confounding requires careful use of a priori knowledge, together with assessment of the extent to which the effect estimate changes when the factor is controlled in the analysis. Most epidemiologists prefer to make a decision based on the latter criterion, although it can be misleading, particularly if misclassification is present (Greenland and Robins, 1985). The decision to control for a presumed confounder can certainly be made with more confidence if there is supporting prior knowledge that the factor is predictive of disease.

Misclassification of a confounder leads to a loss of ability to control confounding, although control may still be useful provided that misclassification of the confounder was nondifferential (unbiased) (Greenland, 1980). Misclassification of exposure is more problematic, since factors which influence misclassification may appear to be confounders, but control of these factors may increase the net bias (Greenland and Robins, 1985).

Example 6.3

Suppose that a cohort study of lung cancer involves a comparison with national mortality rates in a country where 50% of the population are non-smokers, 40% are moderate smokers with a 10-fold risk of lung cancer (compared to non-smokers), and 10% are heavy smokers with a 20-fold risk of lung cancer. Then, it can be calculated that the national lung cancer incidence rate will be 6.5 (= 0.50 x 1.0 + 0.40 x 10 + 0.10 x 20) times the rate in non-smokers. Suppose that it was considered most unlikely that the cohort under study contained more than 50% moderate smokers and 20% heavy smokers. Then, the incidence rate in the study cohort would be 9.3 (= 0.30 x 1.0 + 0.50 x 10 + 0.20 x 20) times the rate in non-smokers. Hence, the observed incidence rate would be biased upwards by a factor of 9.3/6.5 = 1.4, i.e. it would be 1.4 times higher than the national rate due to confounding by smoking. Table 6.2 gives a range of such calculations presented by Axelson (1978) using data from Sweden. The last column indicates the likely bias in the observed rate ratio due to confounding by smoking (a value of 1.00 indicates no bias).

Table 6.2

Estimated crude rate ratios in relation to fraction of smokers in various hypothetical populations

Population fraction (%)
----------------------------------------------------   Bias in
Nonsmokers   Moderate smokers (a)   Heavy smokers (a)   relative risk
------------------------------------------------------------------------
   100              --                     --               0.15
    80              20                     --               0.43
    70              30                     --               0.57
    60              35                      5               0.78
    50              40                     10               1.00 (b)
    40              45                     15               1.22
    30              50                     20               1.43
    20              55                     25               1.65
    10              60                     30               1.86
    --              65                     35               2.08
    --              25                     75               2.69
    --              --                    100               3.08
------------------------------------------------------------------------
Source: Axelson (1978)
(a) Two different risk levels are assumed for smokers: 10 times for moderate smokers; and 20 times for heavy smokers.
(b) Reference population with rates similar to those in general population in countries such as Sweden.
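
The entries in the last column of Table 6.2 follow from the arithmetic of Example 6.3 and can be reproduced with a short calculation. This sketch uses the rate ratios and reference fractions stated in the example; the function names are hypothetical.

```python
# Rate ratios relative to non-smokers, as assumed in Example 6.3:
# (non-smokers, moderate smokers, heavy smokers).
RATE_RATIOS = (1.0, 10.0, 20.0)

# Smoking distribution of the national reference population (50/40/10).
REFERENCE_FRACTIONS = (0.50, 0.40, 0.10)


def relative_rate(fractions, rate_ratios=RATE_RATIOS):
    """Population rate expressed as a multiple of the non-smoker rate."""
    return sum(f * rr for f, rr in zip(fractions, rate_ratios))


def confounding_bias(cohort_fractions, reference_fractions=REFERENCE_FRACTIONS):
    """Rate ratio expected from differences in smoking habits alone."""
    return relative_rate(cohort_fractions) / relative_rate(reference_fractions)


print(relative_rate(REFERENCE_FRACTIONS))                    # 6.5
# 0.30 x 1 + 0.50 x 10 + 0.20 x 20 = 9.3, so the bias is 9.3 / 6.5.
print(round(confounding_bias((0.30, 0.50, 0.20)), 2))        # 1.43
print(round(confounding_bias((0.00, 0.00, 1.00)), 2))        # 3.08
```

Even the extreme case of a cohort made up entirely of heavy smokers gives a bias of only about 3, which is why confounding by smoking is often smaller than might be expected.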

Assessment of Confounding

When one lacks data on a suspected confounder (and thus cannot control confounding directly) it is still desirable to assess the likely direction and magnitude of the confounding it produces. It may be possible to obtain information on a surrogate for the confounder of interest (for example, social class is associated with many lifestyle factors such as smoking, and may therefore be a useful surrogate for some lifestyle-related confounders). Even though confounder control will be imperfect in this situation, it is still possible to examine whether the exposure effect estimate changes when the surrogate is controlled in the analysis, and to assess the strength and direction of the change. For example, if the relative risk actually increases (e.g. from 2.0 to 2.5), or remains stable (e.g. at 2.0) when social class is controlled for, then this is evidence that the observed excess risk is not due to confounding by smoking, since social class is correlated with smoking (Kogevinas et al, 1997), and control for social class involves partial control for smoking.

Alternatively, it may be possible to obtain accurate confounder information for a subgroup of participants in the study, and to assess the effects of confounder control in this subgroup. A related approach, known as two-stage sampling, involves obtaining confounder information for a sample of the source population (or a sample of the controls in a case-control study). For example, in a study of asthma in children, it may not be possible to obtain information on humidity levels in the home in all children. However, it may still be possible to obtain humidity measurements for a sample of the exposed and non-exposed groups in order to check that the average level of humidity in the home is similar in the two groups. Such limited information, if taken in all exposure-disease subgroups, can also be used to directly control confounding (White, 1982; Walker, 1982; Rothman and Greenland, 1998).

Finally, even if it is not possible to obtain confounder information for any study participants, it may still be possible to estimate how strong the confounding is likely to be from particular risk factors. For example, this is often done in occupational studies, where tobacco smoking is a potential confounder, but smoking information is rarely available; in fact, although smoking is one of the strongest risk factors for lung cancer, with relative risks of 10 or 20, it appears that smoking rarely exerts a confounding effect of greater than 1.5 times in studies of occupational disease (Axelson, 1978; Siemiatycki, 1988), because few occupations are strongly associated with smoking, although this degree of confounding may still be important in some contexts.

6.2: Selection Bias

Whereas confounding generally involves biases that are inherent in the source population, and therefore would occur even if everyone in the source population took part in the study, selection bias involves biases arising from the procedures by which the study participants are selected from the source population. Thus, selection bias is not an issue in a cohort study involving complete follow-up, since in this case the study cohort comprises the entire source population. However, selection bias can occur if participation in the study or follow-up is incomplete. For example, in a cohort mortality study, if a national population registry (or some surrogate for this such as the United States Social Security system) were not available, then it might be necessary to attempt to contact each worker or his next-of-kin to verify vital status (i.e. whether the worker was still alive). Bias could occur if the response rate was higher in the most heavily exposed persons who had been diagnosed with disease than in other persons.

Example 6.4

Wrensch et al (2000) conducted a case-control study of 476 adults newly diagnosed with glioma in the San Francisco Bay Area between August 1991 and April 1994, and 462 age-, gender- and ethnicity-matched controls. In addition, limited information was obtained during a brief telephone interview with 101 controls who declined participation in the lengthy in-person interview. Controls who participated in the full interview were more likely than controls who only completed the telephone interview to report head injury. Thus there was evidence of a selection bias in the recruitment of controls. The odds ratio for cases versus controls who completed the full interview was 0.9, whereas when both control groups were combined the odds ratio was 1.3.

Although we should recognize the possible biases arising from subject selection, it is important to note that epidemiologic studies need not be based on representative samples to avoid bias. For example, in a cohort study persons who develop disease might be more likely to be lost to follow-up than persons who did not develop disease; however, this would not affect the relative risk estimate provided that loss to follow-up applied equally to the exposed and non-exposed populations (Criqui, 1979). Analogously, case-control studies have differing selection probabilities as an integral part of their design, in that the selection probability of diseased persons is usually close to 1.0 provided that most persons with disease are identified, whereas that for

non-diseased persons is substantially less; however, this does not affect the relative risk estimate provided that these selection probabilities apply equally within each exposure group.

Additional forms of selection bias can occur in case-control studies because these involve sampling from the source population. In particular, selection bias can occur in a case-control study (involving either incident or prevalent cases) if controls are chosen in a non-representative manner, e.g. if exposed people were more likely to be selected as controls than non-exposed people.

Minimizing Selection Bias

If selection bias has occurred in the enumeration of the exposed group, it may still be possible to avoid bias by choosing an appropriate non-exposed comparison group. For example, if the exposed group does not include all workers in a particular industry, but is restricted to union members (because the records are available), then the non-exposed comparison group could be other workers in the same geographical area who are members of the same union, and/or a similar union.

Control of Selection Bias

Selection bias can sometimes be controlled in the analysis by identifying factors which are related to subject selection and controlling for them as confounders (provided that these factors are not affected by the study exposure or disease). For example, if white-collar workers are more likely to be selected for (or participate in) a study than manual workers (and white-collar work is negatively or positively related to the exposure of interest), then this bias can be partially controlled by collecting information on social class and controlling for social class in the analysis as a confounder.
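
The point made earlier in this section, that unequal selection probabilities for diseased and non-diseased persons do not bias the odds ratio as long as they apply equally within each exposure group, can be illustrated numerically. The source population counts and selection probabilities below are invented for illustration, and the function names are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """(a, b, c, d) = (exposed cases, exposed non-cases,
    non-exposed cases, non-exposed non-cases)."""
    return (a * d) / (b * c)


def expected_sample(population, selection_probs):
    """Expected cell counts after sampling each cell with its own probability."""
    return tuple(n * p for n, p in zip(population, selection_probs))


# Hypothetical source population.
source = (200, 9800, 100, 9900)
true_or = odds_ratio(*source)

# Case-control sampling: all cases, 2% of non-cases in BOTH exposure groups.
unbiased = expected_sample(source, (1.0, 0.02, 1.0, 0.02))

# Exposed non-cases are twice as likely to be chosen as controls.
biased = expected_sample(source, (1.0, 0.04, 1.0, 0.02))

print(round(true_or, 2))                # 2.02
print(round(odds_ratio(*unbiased), 2))  # 2.02, unchanged
print(round(odds_ratio(*biased), 2))    # 1.01, biased towards the null
```

Only the second sampling scheme, in which selection depends on exposure as well as on disease status, distorts the odds ratio.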
workers in a particular industry, but is

6.3: Information Bias

Information bias involves misclassification of the study participants with respect to disease or exposure status. Thus, the concept of information bias refers to those people actually included in the study, whereas selection bias refers to the selection of the study participants from the source population, and confounding generally refers to non-comparability of subgroups within the source population. Information bias involves misclassification of the study subjects with respect to exposure, confounders, or disease.

It is customary to consider two types of misclassification: non-differential and differential misclassification.

Non-Differential Misclassification

Non-differential misclassification occurs when the probability of misclassification of exposure is the same for cases and non-cases (or when the probability of misclassification of disease is the same for exposed and non-exposed persons). This can occur if exposed and non-exposed persons are equally

likely to be misclassified according to disease outcome, or if diseased and non-diseased persons are equally likely to be misclassified according to exposure. Non-differential misclassification of exposure usually (but not always) biases the relative risk estimate towards the null value of 1.0 (Copeland et al, 1977; Dosemeci et al, 1990). Hence, non-differential misclassification tends to produce "false negative" findings and is of particular concern in studies which find a negligible association

Example 6.5

In many cohort studies some exposed persons will be classified as non-exposed, and vice versa. Table 6.3 illustrates this situation with hypothetical data from a study of lung cancer incidence in asbestos workers. Suppose the true incidence rates are 100 per 100,000 person-years in the high exposure group, and 10 per 100,000 person-years in the low exposure group, and the relative risk is thus 10. If 15% of high exposed persons are incorrectly classified, then 15 of every 100 deaths and 15,000 of every 100,000 person-years will be incorrectly allocated to the low exposure group. Similarly, if 10% of low exposed persons are incorrectly classified, then 1 of every 10 deaths and 10,000 of every 100,000 person-years will be incorrectly allocated to the high exposure group. As a result, the observed incidence rates per 100,000 person-years will be 91 and 23 respectively, and the observed relative risk will be 4.0 instead of 10.0. Due to non-differential misclassification, incidence rates in the high exposed group have been biased downwards, and incidence rates in the low exposure group have been biased upwards.

Table 6.3
Hypothetical data from a cohort study in which 15% of highly exposed persons and 10% of low exposed persons are incorrectly classified.

                      Actual                          Observed
                      -----------------------------   ----------------------------------------------------
                      High exposure   Low exposure    High exposure              Low exposure
----------------------------------------------------------------------------------------------------------
Deaths                100             10              85 + 1 = 86                9 + 15 = 24
Person-years          100,000         100,000         85,000 + 10,000 = 95,000   90,000 + 15,000 = 105,000
----------------------------------------------------------------------------------------------------------
Incidence rate per    100             10              91                         23
100,000 person-years
----------------------------------------------------------------------------------------------------------
Rate ratio            10.0                            4.0

between exposure and disease. One important condition is needed to ensure that exposure misclassification produces bias towards the null, however: the exposure classification errors must be independent of other errors. Without this condition, non-differential exposure misclassification can produce bias in any direction (Chavance et al, 1992; Kristensen, 1992).

Furthermore, there are several other situations in which non-differential misclassification will not produce a bias towards the null.

Firstly, when the specificity of the method of identifying the disease under study is 100%, but the sensitivity is less than 100%, then the risk difference will be biased towards the null, but the risk ratio (or rate ratio) will not be biased by the misclassification. For example, if only 80% of the deaths are identified in a study, but this under-ascertainment applies equally to the exposed and non-exposed groups, then this will not affect the relative risk estimate.

Secondly, the effect estimate may be biased away from the null for some exposure categories when there are multiple exposure categories (see example 6.6).

Finally, when there is positive confounding, and there is non-differential misclassification of the confounder, then confounding control will be incomplete and the adjusted effect estimate will consequently be biased away from the null.
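
The first of these situations (100% specificity with sensitivity below 100% for disease ascertainment) is easy to verify with a small calculation. The risks used below are hypothetical; only the 80% ascertainment figure comes from the text.

```python
def observed_risk(true_risk, sensitivity):
    """Observed risk when specificity is 100% (no false positives)
    and only a fraction `sensitivity` of true cases is identified."""
    return true_risk * sensitivity


# Hypothetical true risks in the exposed and non-exposed groups.
risk_exposed, risk_unexposed = 0.02, 0.01
sensitivity = 0.80  # 80% of deaths identified in both groups

obs_exposed = observed_risk(risk_exposed, sensitivity)
obs_unexposed = observed_risk(risk_unexposed, sensitivity)

true_rr = risk_exposed / risk_unexposed
observed_rr = obs_exposed / obs_unexposed
true_rd = risk_exposed - risk_unexposed
observed_rd = obs_exposed - obs_unexposed

print(true_rr, round(observed_rr, 6))   # ratio is unchanged
print(true_rd, round(observed_rd, 6))   # difference shrinks towards zero
```

The sensitivity cancels out of the ratio but scales the difference, which is why only the risk difference is biased towards the null.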

Example 6.6

Table 6.4 gives hypothetical data from a cohort study in which the findings for the high and low exposure groups are the same as in example 6.5, but there is also a non-exposed group for which there is no misclassification. In this instance, the non-differential misclassification between the high and low exposure groups produces a bias away from the null when the low exposure group is compared to the non-exposed group: the relative risk is 4.6 instead of 2.0.

Table 6.4
Hypothetical data from a cohort study in which 15% of highly exposed persons and 10% of low exposed persons are incorrectly classified, but the non-exposed are correctly classified

                Actual                              Observed
                ---------------------------------   ---------------------------------
                High      Low       Non-exposed     High      Low       Non-exposed
-------------------------------------------------------------------------------------
Deaths          100       10        5               86        24        5
Person-years    100,000   100,000   100,000         95,000    105,000   100,000
-------------------------------------------------------------------------------------
Rate            100       10        5               91        23        5
-------------------------------------------------------------------------------------
Rate ratio      20.0      2.0       1.0             18.1      4.6       1.0
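
The observed figures in Tables 6.3 and 6.4 can be reproduced by moving the stated fractions of deaths and person-years between the high and low exposure groups. The sketch below is illustrative only; the function and variable names are hypothetical.

```python
def rate_per_100k(deaths, person_years):
    return deaths / person_years * 100_000


def observed_groups(true_groups, p_high_to_low, p_low_to_high):
    """Reallocate deaths and person-years between the high and low groups.

    Misclassification is non-differential: the same fraction of deaths
    and of person-years is moved. The non-exposed group is left untouched.
    """
    high_d, high_py = true_groups["high"]
    low_d, low_py = true_groups["low"]
    return {
        "high": (high_d * (1 - p_high_to_low) + low_d * p_low_to_high,
                 high_py * (1 - p_high_to_low) + low_py * p_low_to_high),
        "low": (low_d * (1 - p_low_to_high) + high_d * p_high_to_low,
                low_py * (1 - p_low_to_high) + high_py * p_high_to_low),
        "none": true_groups["none"],
    }


# True (deaths, person-years) from Table 6.4.
true_groups = {"high": (100, 100_000), "low": (10, 100_000), "none": (5, 100_000)}
observed = observed_groups(true_groups, p_high_to_low=0.15, p_low_to_high=0.10)
rates = {group: rate_per_100k(*cells) for group, cells in observed.items()}

print({group: round(rate) for group, rate in rates.items()})  # 91, 23, 5
print(round(rates["high"] / rates["low"], 1))   # 4.0 (true value 10.0)
print(round(rates["low"] / rates["none"], 1))   # 4.6 (true value 2.0)
```

The high versus low comparison is pulled towards the null, while the low versus non-exposed comparison is pushed away from it, as the two examples describe.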

One special type of non-differential misclassification occurs when the study outcome is not well-defined and includes a wide range of etiologically unrelated outcomes (e.g., all deaths). This may obscure the effect of exposure on one specific disease since a large increase in risk for this disease may only produce a small increase in risk for the overall group of diseases under study. A similar bias can occur when the exposure measure is not well defined and includes a wide range of etiologically unrelated exposures, possibly due to a non-specific exposure definition or due to the inclusion of exposures which could not have caused the disease of interest because they occurred after, or shortly before, diagnosis. It could be argued that these phenomena do not represent misclassification because these are not errors in measurement. However, they do involve misclassification in the sense that the etiologically relevant exposure (or disease) has not been measured appropriately.

Differential Misclassification

Differential misclassification occurs when the probability of misclassification of exposure is different in diseased and non-diseased persons, or the probability of misclassification of disease is different in exposed and non-exposed persons. This can bias the observed effect estimate either toward or away from the null value. For example, in a nested case-control study of lung cancer, with a

Example 6.7

Table 6.5 shows data from a hypothetical case-control study in which 70 of the 100 cases and 50 of the 100 controls have actually been exposed to some chemical. The true odds ratio is thus (70/30) / (50/50) = 2.3. If 90% (63) of the 70 exposed cases, but only 60% (30) of the 50 exposed controls are classified correctly, then the observed odds ratio would be (63/37) / (30/70) = 4.0.

Table 6.5

Hypothetical data from a case-control study in which 90% of exposed cases and 60% of exposed controls are correctly classified

              Actual                    Observed
              Exposed   Non-exposed     Exposed   Non-exposed
Cases         70        30              63        37
Controls      50        50              30        70
Odds ratio    2.3                       4.0

control group selected from among non-diseased members of the cohort, the recall of occupational exposures in controls might be different from that of the cases. In this situation, differential misclassification would occur, and it could bias the odds ratio towards or away from the null, depending on whether members of the cohort who did not develop lung cancer were more or less likely to recall such exposure than the cases.

As can be noted from example 6.7, misclassification can drastically affect the validity of a study. Given limited resources, it will often be more desirable to reduce information bias by obtaining more detailed information on a limited number of subjects than to reduce random error by including more subjects. However, a certain amount of misclassification is unavoidable, and it is usually desirable to ensure that it is towards the null value (as usually occurs with nondifferential exposure misclassification) to minimize the chance of false positive results.

Example 6.8

In the case-control study of lung cancer in Example 6.7, the
misclassification could be made non-differential by selecting controls from
cohort members with other types of cancer, or other diseases, in order that
their recall of exposure would be more similar to that of the cases. As
before, 63 (90%) of the exposed cases would recall exposure, but now 45
(90%) of the 50 exposed controls would recall their exposure. The observed
odds ratio would be (63/37) / (45/55) = 2.1. This estimate is still biased
in comparison with the correct value of 2.3. However, the bias is
non-differential, is much smaller than before, and is in a predictable
direction, towards the null. However, it should be noted that making a bias
non-differential will not always make it smaller, and that the direction of
bias from non-differential misclassification is sometimes predictable in
advance.
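
The arithmetic in Examples 6.7 and 6.8 can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and it assumes (as the examples implicitly do) that truly non-exposed subjects are never recorded as exposed.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds ratio from the four cells of a case-control 2x2 table."""
    return (exp_cases / unexp_cases) / (exp_controls / unexp_controls)


def observed_counts(exposed, unexposed, fraction_recalled):
    """Observed (exposed, non-exposed) counts when only a fraction of the
    truly exposed report their exposure. Assumption: the truly
    non-exposed are never recorded as exposed."""
    seen = round(exposed * fraction_recalled)
    return seen, unexposed + (exposed - seen)


true_or = odds_ratio(70, 30, 50, 50)

# 90% of the 70 exposed cases recall their exposure in both examples.
cases = observed_counts(70, 30, 0.90)                      # (63, 37)

# Example 6.7: differential recall, 60% of exposed controls recall exposure.
controls_differential = observed_counts(50, 50, 0.60)      # (30, 70)
or_differential = odds_ratio(*cases, *controls_differential)

# Example 6.8: non-differential recall, 90% of exposed controls as well.
controls_nondifferential = observed_counts(50, 50, 0.90)   # (45, 55)
or_nondifferential = odds_ratio(*cases, *controls_nondifferential)

print(f"true: {true_or:.1f}  differential: {or_differential:.1f}  "
      f"non-differential: {or_nondifferential:.1f}")
# prints: true: 2.3  differential: 4.0  non-differential: 2.1
```

Changing the two recall fractions shows how quickly the observed odds ratio moves away from the true value when recall differs between cases and controls.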

Minimizing Information Bias

Misclassification can drastically affect the validity of a study. It is
often helpful to ensure that the misclassification is non-differential, by
ensuring that exposure information is collected in an identical manner in
cases and non-cases (or that disease information is collected in an
identical manner in the exposed and non-exposed groups). In this situation,
if it is independent of other errors, exposure misclassification tends to
produce false negative findings and is thus of greatest concern in studies
which have not found an important effect of exposure. Thus, in general it
is important to ensure that information bias is non-differential and,
within this constraint, to keep it as small as possible. Thus, it can be
argued that the aim of data collection is not to collect perfect
information, but to collect information in a similar manner from
the groups being compared, even if this means ignoring more detailed
exposure information if this is not available for both groups. However,
this is not always the case (Greenland and Robins, 1985).

Assessment of Information Bias

Information bias is usually of most concern in historical cohort studies or
case-control studies when information is obtained by personal interview.
Despite these concerns, relatively little information is generally
available on the accuracy of recall of exposures. When possible, it is
important to attempt to validate the classification of exposure or disease,
e.g., by comparing interview results with other data sources such as
employer records, and to assess the potential magnitude of bias due to
misclassification of exposure.

Relationship of Selection and Information Bias to Confounding

Selection bias and confounding are not always clearly demarcated. In
particular, selection bias can sometimes be viewed as a type of
confounding, since both can be reduced by controlling for surrogates for
the determinants of the bias (e.g. social class). Unfortunately, selection
affected by exposure and disease generates a bias that cannot be reduced in
this fashion. Some consider any bias that can be controlled in the analysis
as confounding. Other biases are then categorized according to whether they
arise from the selection of study subjects (selection bias), or their
classification (information bias).

Summary

The greatest concern in epidemiological studies usually relates to
confounding, because exposure has not been randomly allocated, and the
groups under study may therefore be noncomparable with respect to their
baseline disease risk. However, to be a significant confounder, a factor
must be strongly predictive of disease and strongly associated with
exposure. Thus, although confounding is constantly a source of concern, the
strength of confounding is often considerably less than might be expected
(it should be appreciated, however, that this appearance may be illusory,
for nondifferential misclassification of a confounder, which is common,
will usually make the confounding appear smaller than it really is).

Provided that information has been collected in a standardized manner (and
its accuracy is unrelated to other errors), then misclassification will be
non-differential, and any bias it produces will usually be towards the null
value. In this situation, misclassification tends to produce false negative
findings and is thus of greatest concern in studies which have not found an
important effect of exposure; it is of much less concern in studies with
positive findings, since these findings are likely to have been even more
strongly positive if misclassification had not occurred. Again, one should
appreciate the limitations of these observations: it may be difficult to be
sure that the exposure and disease misclassification is nondifferential,
and nondifferential misclassification of a confounder can lead to bias away
from the null if the confounder produces confounding away from the null.

References

Axelson O (1978). Aspects on confounding in occupational health
epidemiology. Scand J Work Environ Health 4: 85-9.

Chavance M, Dellatolas G, Lellouch J (1992). Correlated nondifferential
misclassifications of disease and exposure: application to a
cross-sectional study of the relationship between handedness and immune
disorders. Int J Epidemiol 21: 537-46.

Copeland KT, Checkoway H, McMichael AJ, et al (1977). Bias due to
misclassification in the estimation of relative risk. Am J Epidemiol 105:
488-95.

Criqui MH (1979). Response bias and risk ratios in epidemiologic studies.
Am J Epidemiol 109: 394-9.

Dosemeci M, Wacholder S, Lubin JH (1990). Does nondifferential
misclassification of exposure always bias a true effect toward the null
value? Am J Epidemiol 132: 746-8.

Greenland S (1980). The effect of misclassification in the presence of
covariates. Am J Epidemiol 112: 564-9.

Greenland S, Neutra R (1981). An analysis of detection bias and proposed
corrections in the study of estrogens and endometrial cancer. J Chron Dis
34: 433-8.

Greenland S, Robins JM (1985). Confounding and misclassification. Am J
Epidemiol 122: 495-506.

Greenland S, Robins JM (1986). Identifiability, exchangeability and
epidemiological confounding. Int J Epidemiol 15: 412-8.

Kogevinas M, Pearce N, Susser M, Boffetta P (1997). Social inequalities and
cancer. In: Kogevinas M, Pearce N, Susser M, Boffetta P (eds). Social
inequalities and cancer. Lyon: IARC, pp 1-15.

Kristensen P (1992). Bias from nondifferential but dependent
misclassification of exposure and outcome. Epidemiol 3: 210-5.

Pearce N, Greenland S (2004). Confounding and interaction. In: Ahrens W,
Krickeberg K, Pigeot I (eds). Handbook of epidemiology. Heidelberg:
Springer-Verlag, pp 375-401.

Robins J (1987). A graphical approach to the identification and estimation
of causal parameters in mortality studies with sustained exposure periods.
J Chron Dis 40 (suppl 2): 139S-161S.

Robins J (1989). The control of confounding by intermediate variables. Stat
Med 8: 679-701.

Robins JM, Blevins D, Ritter G, et al (1992). G-estimation of the effect of
prophylaxis therapy for pneumocystis carinii pneumonia on the survival of
AIDS patients. Epidemiol 3: 319-36.

Robins JM, Hernán MA, Brumback B (2000). Marginal structural models and
causal inference in epidemiology. Epidemiol 11: 550-62.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia:
Lippincott-Raven.

Siemiatycki J, Wacholder S, Dewar R, et al (1988). Smoking and degree of
occupational exposure: Are internal analyses in cohort studies likely to be
confounded by smoking status?

Walker AM (1982). Anamorphic analysis: sampling and estimation for
covariate effects when both exposure and disease are known. Biometrics 38:
1025-32.

Weinberg CR (1993). Toward a clearer definition of confounding. Am J
Epidemiol 137: 1-8.

White JE (1982). A two-stage design for the study of the relationship
between a rare exposure and a rare disease. Am J Epidemiol 115: 119-28.

Wrensch M, Miike R, Neuhaus J (2000). Are prior head injuries or diagnostic
X-rays associated with glioma in adults? The effects of control selection
bias. Neuroepidemiology 19: 234-44.

CHAPTER 7: Effect Modification
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

In the previous chapter I discussed the problem of confounding, which
occurs when the exposed and non-exposed subpopulations of the source
population are inherently different in background disease risk. This should
not be confused with effect modification, which occurs when the measure of
the effect of the study factor depends on the level of another factor in
the study population (Miettinen, 1974). The term statistical interaction
denotes a similar phenomenon in the observed data. However, the terms
“interaction” and “effect modification” are also used in a variety of other
contexts, with a variety of meanings. In particular, the term “interaction”
has different meanings for biostatisticians, lawyers, clinicians, public
health professionals, epidemiologists and biologists.

Example 7.1

Katsouyanni et al (1993) studied the effects of air pollution and high
temperature in the causation of excess mortality during a major heat wave
in Greece in July 1987. They found that the effects of the heat wave were
modified by the presence (or absence) of high air pollution levels. In
Athens (where air pollution levels are high) the increase in deaths on
extremely hot days was 97%, but it was 33% in other urban areas and 27% in
non-urban areas. Further analyses suggested that the threshold of effect of
various air pollutants appeared to be lower on extremely hot days.

7.1: Concepts of Interaction

The different concepts of interaction will be illustrated with data from a
hypothetical study of the risk of lung cancer per 1,000 population (e.g.
over a five year period) in relation to exposure to cigarette smoke and
asbestos (Table 7.1). The risk difference due to smoking is 30 per 1,000 in
asbestos workers and 9 per 1,000 in other people. On the other hand, the
rate ratio for smoking is 7.0 in asbestos workers and 10.0 in other people.
I will now consider how this data might be interpreted by different
researchers and policy makers. In each instance, it is recognized that it
is important to prevent or reduce both asbestos exposure and smoking.
However, in this case the asbestos exposure has already occurred and the
factory has now closed, so our focus is on smoking. We want to know whether
the “effect” of smoking is “modified” by asbestos exposure, i.e. do smoking
and asbestos exposure “interact”?

Two Biostatisticians

Suppose that we first consult a biostatistician about how to interpret this
data. The first biostatistician we talk to uses relative risk measures of
effect. They note that the relative risk for smoking and lung cancer is 7.0
(35/5) in asbestos workers and 10.0 (10/1) in other people. Thus, the
effect of smoking on lung cancer is less in asbestos workers and there is
therefore a negative statistical interaction between the effects of smoking
and asbestos (table 7.2). They may even fit a multiplicative model with an
interaction term and show that the interaction term is negative.

We can see the logic of this argument, but are somewhat surprised by the
conclusion, since we can see the very high rates in people who both smoke
and are exposed to asbestos. We therefore consult a second biostatistician.
This “alternative” biostatistician uses the risk difference as the effect
measure. They note that the risk difference for smoking and lung cancer is
30 per 1,000 (35 - 5) in asbestos workers and 9 per 1,000 (10 - 1) in other
people. Thus, the effect of smoking is greater in asbestos workers and
there is a positive statistical interaction between the effects of smoking
and asbestos (table 7.2). They may even fit an additive model with an
interaction term and show that the interaction term is positive.

We eventually get our two biostatistical consultants together and they
argue that there is no contradiction in the advice they have given us.
Effect modification and statistical interaction are merely statistical
concepts which depend on the methods used. In fact, all secondary risk
factors modify either the rate ratio or the rate difference, and uniformity
over one measure implies non-uniformity over the other (Koopman, 1981;
Steenland and Thun, 1986), e.g. an apparent additive joint effect implies a
departure from a multiplicative model. Several authors (e.g. Kupper and
Hogan, 1978; Walter and Holford, 1978) have demonstrated the dependence of
statistical interaction on the underlying statistical measure of effect,
and have therefore argued that the assessment of interaction is
"model-dependent".

Table 7.1

Lung cancer risk per 1,000 people (and relative risk, RR) in relation to
exposure to cigarette smoke and asbestos

                                Asbestos
                          Yes                No
------------------------------------------------------
Smoking      Yes     35/1000 (35.0)     10/1000 (10.0)
             No       5/1000 (5.0)       1/1000 (1.0)
------------------------------------------------------
Rate difference         30/1000             9/1000
------------------------------------------------------
Rate ratio                7.0                10.0
------------------------------------------------------
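
The two biostatisticians' calculations from Table 7.1 can be reproduced in a short Python sketch. The dictionary layout and function name are illustrative only; the numbers are those in the table.

```python
# Risk of lung cancer per 1,000 people from Table 7.1,
# keyed by (smoker, asbestos_exposed).
risk = {
    (True, True): 35, (True, False): 10,
    (False, True): 5, (False, False): 1,
}


def smoking_effect(asbestos_exposed):
    """Return (risk difference per 1,000, risk ratio) for smoking
    within one asbestos stratum."""
    smokers = risk[(True, asbestos_exposed)]
    non_smokers = risk[(False, asbestos_exposed)]
    return smokers - non_smokers, smokers / non_smokers


rd_asbestos, rr_asbestos = smoking_effect(True)
rd_others, rr_others = smoking_effect(False)

# The sign of the "interaction" depends on the chosen effect measure.
# (Equal effects would mean no interaction; that case does not arise here.)
additive_direction = "+ve" if rd_asbestos > rd_others else "-ve"
multiplicative_direction = "+ve" if rr_asbestos > rr_others else "-ve"

print(rd_asbestos, rd_others, additive_direction)        # 30 9 +ve
print(rr_asbestos, rr_others, multiplicative_direction)  # 7.0 10.0 -ve
```

The same four numbers give a positive interaction on the difference scale and a negative one on the ratio scale, which is the sense in which the assessment is model-dependent.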

A Lawyer

Next we consult a lawyer (I do not advise this as a real course of action;
this is just a hypothetical consultation!). She/he is also concerned about
the effect of smoking, but the effect they are interested in is “what is
the probability that my client’s lung cancer was caused by their smoking?”
If we look at the asbestos workers, we find that if they smoked their risk
of lung cancer was 35 per 1,000 whereas it was 5 per 1,000 if they didn’t
smoke. Thus, assuming there is no confounding by other factors, then of
every 35 lung cancers occurring in the smokers, 5 would have happened
anyway, and 30 are additional cases due to smoking. Thus, for an individual
lung cancer case, the probability that smoking caused the cancer is
100*30/35 which is 86% (this is just 100*(R-1)/R where R is the relative
risk of 7.0). The corresponding estimate for other people (not exposed to
asbestos) is 100*9/10 which is 90%. Thus, the probability of causation by
smoking is slightly less in asbestos workers and there is therefore a
negative interaction between the effects of smoking and asbestos (table
7.2). It should be noted that this lawyer’s approach is a little simplistic
(Greenland, 1999), but the key issue here is that the “effect” that is
being measured, and the inference about interaction, is different from that
of the two biostatisticians, although it is more consistent with that of
the biostatistician who uses the relative risk as the measure of effect.
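
As a quick check of the lawyer's figures, the formula 100*(R-1)/R can be applied to the relative risks for smoking in Table 7.1 (35/5 in asbestos workers, 10/1 in other people). The function name below is illustrative, and the calculation inherits the simplifications noted above (no confounding, and the caveats of Greenland, 1999).

```python
def probability_of_causation(relative_risk):
    """Simplistic 'probability of causation', 100 * (R - 1) / R, in percent."""
    return 100 * (relative_risk - 1) / relative_risk


pc_asbestos_workers = probability_of_causation(35 / 5)   # R = 7.0
pc_other_people = probability_of_causation(10 / 1)       # R = 10.0

print(f"{pc_asbestos_workers:.0f}% {pc_other_people:.0f}%")
# prints: 86% 90%
```

Because the measure depends only on R, it necessarily ranks the two groups in the same order as the relative risk does.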

Table 7.2

The approaches of different consultants to interpreting the data in table 7.1

                                     Size of effect
                                     -----------------------  Inherent
                   Effect            Asbestos                 statistical      Is there an
Consultant         measure           workers      Others      model            interaction?  Direction?
-------------------------------------------------------------------------------------------------------

Biostatistician 1  Relative risk     7.0          10.0        Relative risk    Yes           -ve

Biostatistician 2  Risk difference   30/1000      9/1000      Risk difference  Yes           +ve

Lawyer             Probability of    86%          90%         Relative risk    Yes           -ve
                   causation

Clinician          Individual risk   30 per       9 per       Risk difference  Yes           +ve
                                     1,000        1,000

Public health      Deaths            30 per       9 per       Risk difference  Yes           +ve
worker             prevented         1,000        1,000

Epidemiologist     Combination of    21 cases     Not         Risk difference  Yes           +ve
                   factors to cause  out of 35    applicable
                   disease           (60%) are
                                     due to the
                                     combination
                                     of exposures

A Clinician

Next we consult with a clinician. She/he says “I advise my patients to give
up smoking, and I tell them that if they do manage to stop then they will
reduce their risk of lung cancer. They ask ‘by how much?’ So I want to know
what the reduction in their individual risk will be if they give up
smoking”. Well, if their patient is an asbestos worker then they will
reduce their risk by 30 per 1,000 (over five years) by giving up smoking;
other people will reduce their risk by 9 per 1,000 (once again, this is a
little simplistic since this does not tell us exactly how many years of
life they will gain). Thus, the effect of smoking is greater in asbestos
workers and there is therefore a positive statistical interaction between
the effects of smoking and asbestos (table 7.2).

A Public Health Worker

The public health worker that we consult has a similar approach to the
clinician, except that they are concerned about the population rather than
about individual patients. They say “I want to conduct population smoking
prevention campaigns and persuade people to give up smoking, telling them
that if they do then they will reduce their risk of lung cancer. I only
have a limited amount of resources so I want to know if I can prevent more
cases of lung cancer by focusing on asbestos workers, or by doing my
campaigns in the same number of people in the general population”. If they
prevent 1,000 asbestos workers from smoking, then (once there has been time
for the reduction in risk to start occurring) they will have prevented 30
lung cancer cases over a five-year period. If they prevent 1,000 other
people from smoking then over the same period they will have prevented 9
cases of lung cancer. Thus, the effect of smoking is greater in asbestos
workers and there is therefore a positive statistical interaction between
the effects of smoking and asbestos (table 7.2).
has a similar approach to the clinician,

Figure 7.1

Numbers of cases occurring through background factors, asbestos alone,
smoking alone, and their combination in people exposed to both factors

Sufficient cause       Component causes    Cases

Background             U                    1/35 (3%)
Asbestos               A, U’                4/35 (11%)
Smoking                S, U’’               9/35 (26%)
Asbestos & Smoking     A, S, U’’’          21/35 (60%)

An Epidemiologist

I have argued in chapter 1 that epidemiology is part of public health, and
therefore I might be quite content to accept the public health worker’s
approach. However, as an epidemiologist I do want to know more about the
causation of disease, since what I learn may be relevant to other exposures
or other diseases. Thus, I may be particularly interested in the
combination of smoking and asbestos to produce cases of lung cancer.
Rothman and Greenland (1998) have thus adopted an unambiguous
epidemiological definition of interaction in which two factors are not
"independent" if they are component causes in the same sufficient cause.
This concept of independence of effects leads to the adoption of additivity
of incidence rates as the state of "no interaction". Thus, the fact that
the lung cancer rate in the group exposed to both factors (35/1000) is
greater than the sum of the baseline risk (1/1000) plus the effect of
asbestos alone (5/1000 - 1/1000) plus the effect of smoking alone
(10/1000 - 1/1000) indicates that there are some cases of disease that are
occurring due to the combination of exposures and which would not have
occurred if either of the exposures had been eliminated. We can do the same
calculations using the relative risks (relative to the group with exposure
to neither factor) rather than incidence rates: the joint effect is 35.0
times, whereas it would be 1+(5.0-1)+(10.0-1)=14.0 if it were additive.
This situation is summarized in figure 7.1. It shows that in the group
exposed to both factors, 1 case (3%) occurred through unknown “background”
exposures (U), 4 cases (11%) through mechanisms involving asbestos exposure
alone (and not smoking) together with unknown background exposures (U’), 9
cases (26%) occurred through mechanisms involving smoking alone (and not
asbestos) together with unknown background exposures (U’’), and 21 cases
(60%) occurred through mechanisms involving both factors together with
unknown background exposures (U’’’). This means that 86% of the cases (26%
+ 60%) could have been prevented by preventing smoking, whereas 71% (11% +
60%) could have been prevented by preventing asbestos exposure. Thus, the
attributable risks for the individual factors of smoking (86%) and asbestos
(71%) sum to more than 100% because of the cases that occur through
mechanisms involving both exposures and which consequently could be
prevented by preventing either exposure.

One apparent exception should be noted (Koopman, 1977). If two factors (A
and B) belong to different sufficient causes, but a third factor (C)
belongs to both sufficient causes, then A and B are competing for a single
pool of susceptible individuals (those who have C). Consequently the joint
effect of A and B will be less than additive (Miettinen (1982) reaches a
similar conclusion based on a model of individual outcomes). However, this
phenomenon can be incorporated directly into the causal constellation model
by clarifying a previous ambiguity in the description of antagonism in the
model's terms. Specifically, the absence of B can be included in the causal
constellation involving A, and vice versa. Then, two factors would not be
"independent" if the presence or absence of the factors (or particular
levels of both factors) were component causes in the same sufficient cause
(Greenland and Poole, 1988; Rothman and Greenland, 1998).

A Biologist

Finally, it should be stressed that this epidemiological concept of
independence of effects is distinct from
some biological concepts of independence. For example, Siemiatycki and
Thomas (1981) give a definition in which two factors are considered to be
biologically independent "if the qualitative nature of the mechanism of
action of each is not affected by the presence or absence of the other".
However, this concept does not lead to an unambiguous definition of
independence of effects, and thus does not produce clear analytic
implications. Rothman's concept of independence is at a more abstract
conceptual level in which a particular biologic model, rather than being
accepted as the "baseline", is itself evaluated in terms of the
co-participation of factors in a sufficient cause. For example, two factors
which act at different stages of a multistage process are not independent
since they are joint components of at least one sufficient cause. This
occurs irrespective of whether they affect each other's qualitative
mechanism of action (the ambiguity in Siemiatycki and Thomas' formulation
stems from the ambiguity of this concept).
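
Returning to the epidemiologist's additive benchmark described earlier in this section, the breakdown of the 35 cases per 1,000 among people exposed to both factors (Figure 7.1) can be sketched as follows. The variable names are illustrative, and the calculation assumes, as the text does, that risks add under "no interaction" and that there is no confounding.

```python
# Lung cancer risk per 1,000 from Table 7.1.
baseline, asbestos_only, smoking_only, both = 1, 5, 10, 35

# Cases (per 1,000 doubly exposed people) by causal pathway.
cases = {
    "background": baseline,
    "asbestos alone": asbestos_only - baseline,
    "smoking alone": smoking_only - baseline,
}
expected_if_additive = sum(cases.values())                 # 1 + 4 + 9 = 14
cases["asbestos and smoking"] = both - expected_if_additive

# Cases that would not have occurred had one exposure been prevented.
prevented_without_smoking = cases["smoking alone"] + cases["asbestos and smoking"]
prevented_without_asbestos = cases["asbestos alone"] + cases["asbestos and smoking"]

for pathway, n in cases.items():
    print(f"{pathway}: {n}/{both} ({100 * n / both:.0f}%)")
print(f"preventable via smoking: {100 * prevented_without_smoking / both:.0f}%")
print(f"preventable via asbestos: {100 * prevented_without_asbestos / both:.0f}%")
```

The two attributable percentages (86% and 71%) sum to more than 100% because the 21 joint cases are counted in both.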

7.2: Multiplicative and Additive Models

Rothman's approach is attractive because it is based on epidemiological
concepts which have a clear biologic interpretation, and because it leads
to an unambiguous definition of independence of effects which is identical
to that obtained through public health considerations (Rothman et al,
1980). However, the analytic implications of these concepts are not
straightforward, since assessing independence of effects is usually only
one of the analytic goals of an epidemiological study. Rather, there are
several other considerations which often favour the use of multiplicative
models.

First, multiplicative models have convenient statistical properties.
Estimation in non-multiplicative models may have problems of convergence,
and inference based on the asymptotic standard errors may be flawed unless
the study size is very large (Moolgavkar and Venzon, 1987).

Second, it has been argued that multiplicative models facilitate the
assessment of the extent of unknown confounding or bias (Cornfield et al,
1959), although this is not always the case.

Third, if it is desired to keep statistical interaction (effect
modification) to a minimum, then a multiplicative model may be more
appropriate. It is not uncommon for risk factors to have approximately
multiplicative effects (Saracci, 1987). This presumably occurs because they
are a part of common causal processes, although other sufficient causes
usually also operate, and exact multiplicativity may not occur.
Nevertheless, in this situation there may be less masking of heterogeneity
in calculating an overall rate ratio than in calculating an overall rate
difference; there are also many instances of non-multiplicative departures
from additivity, however (Selikoff et al, 1980; Saracci, 1987).

7.3: Joint Effects

These considerations imply an apparent dilemma. How can an analysis be
conducted which combines the advantages of ratio measures of effect with
the assessment of independence in terms of a departure from additivity?
These apparently contradictory goals can be reconciled in analyses which
concentrate on the estimation of separate and joint effects (Pearce, 1989).

Thus, when studying asbestos, smoking and lung cancer, relative risks might
be presented for smoking (in non-asbestos workers), asbestos exposure (in
non-smokers) and exposure to both factors, relative to persons exposed to
neither factor. These relative risks would be adjusted for all other
factors (e.g. age) which are potential confounders, but not of immediate
interest as effect modifiers.

The estimation of separate and joint effects may be difficult when the
factors of interest are closely correlated, and there are therefore only
small numbers of people who are exposed to either factor alone. However,
when it is feasible, this approach combines the best features of
multiplicative models and additive independence assessment, but also
permits readers with other concepts of independence to draw their own
conclusions (as in table 7.1).

When the assessment of joint effects is a fundamental goal of the study, it
can be accomplished by calculating stratum-specific effect estimates, as in
Example 7.1 above. On the other hand, it is less clear how to proceed when
effect modification is occurring, but assessment of joint effects is not an
analytical goal. Conventional statistical analysis strategies are based on
the principle that it is not appropriate to calculate an overall effect
estimate if interaction is present. However, this principle is commonly
ignored if the difference in stratum-specific effect estimates is not too
great. In fact standardized rate ratios (see chapter 12) have been
developed for precisely this situation, and will consistently estimate
meaningful epidemiological parameters even under heterogeneity (Greenland,
1982). Nevertheless, some authors have proposed modeling strategies in
which the first step in the analysis involves testing for statistical
interaction. A related approach has been the development of generalized
families of models which include the additive and multiplicative models as
special cases. An alternative general strategy can be based on
epidemiological considerations (Pearce, 1989). The key difference is that
interaction is assessed (rather than tested) in terms of a departure from
additivity in order to elaborate an observed effect, rather than being
tested for departure from an arbitrary effect measure as an essential
initial analytic step. This procedure can be achieved within the confines
of statistically convenient multiplicative models through the analysis of
separate and joint effects.
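
A presentation of separate and joint effects for the hypothetical data in Table 7.1 might look like the sketch below. It uses the crude relative risks from the table; in a real analysis these would be adjusted estimates from a fitted model. The variable names, and the label "excess over additivity", are illustrative rather than taken from the text.

```python
# Relative risks from Table 7.1, relative to people exposed to neither factor.
rr_smoking_only = 10.0    # smokers not exposed to asbestos
rr_asbestos_only = 5.0    # asbestos workers who do not smoke
rr_both = 35.0            # exposed to both factors

# Joint effect expected under each "no interaction" benchmark.
expected_additive = 1 + (rr_smoking_only - 1) + (rr_asbestos_only - 1)
expected_multiplicative = rr_smoking_only * rr_asbestos_only

# Departure of the observed joint effect from additivity.
excess_over_additivity = rr_both - expected_additive

print(f"separate effects: smoking {rr_smoking_only}, asbestos {rr_asbestos_only}")
print(f"joint effect: {rr_both}")
print(f"expected if additive: {expected_additive}")              # 14.0
print(f"expected if multiplicative: {expected_multiplicative}")  # 50.0
print(f"excess over additivity: {excess_over_additivity}")       # 21.0
```

Reporting the three relative risks side by side lets a reader see that the joint effect is more than additive but less than multiplicative, whichever benchmark they prefer.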

Summary

The terms interaction and effect modification are used in a variety of
contexts, with a variety of meanings. In particular, the term “interaction”
has different meanings for biostatisticians, lawyers, clinicians, public
health professionals, epidemiologists and biologists. In each instance,
they are interested in the same question, namely does the effect of
exposure A depend on whether exposure B is also present (or absent)?
However, the word “effect” has different meanings in different contexts. In
contrast to definitions based on statistical concepts, Rothman has adopted
an unambiguous epidemiological definition of interaction in which two
factors are not "independent" if they are component causes in the same
sufficient cause. This leads to the adoption of additivity of incidence
rates as the state of "no interaction". However, there are other
considerations which generally favor the use of multiplicative models. This
implies an apparent dilemma as to how an analysis can be conducted which
combines the advantages of ratio measures of effect with the assessment of
independence in terms of a departure from additivity. These apparently
contradictory goals can be reconciled through the analysis of separate and
joint effects.

References

Cornfield J, Haenszel W, Hammond EC, et al (1959). Smoking and lung cancer:
recent evidence and a discussion of some questions. JNCI 22: 173-203.

Greenland S (1982). Interpretation and estimation of summary ratios under
heterogeneity. Statist Med 1: 217-27.

Greenland S, Poole C (1988). Invariants and noninvariants in the concept of
interdependent effects. Scand J Work Environ Health 14: 125-9.

Greenland S (1999). Relation of probability of causation to relative risk
and doubling dose: a methodologic error that has become a social problem.
AJPH 89: 1166-9.

Katsouyanni K, Pantazopoulou A, Touloumi G, et al (1993). Evidence for
interaction between air pollution and high temperature in the causation of
excess mortality. Arch Environ Health 48: 235-42.

Koopman JS (1977). Causal models and sources of interaction. Am J Epidemiol
106: 439-44.

Koopman JS (1981). Interaction between discrete causes. Am J Epidemiol 113:
716-24.

Kupper LL, Hogan MD (1978). Interaction in epidemiologic studies. Am J
Epidemiol 108: 447-53.

Miettinen OS (1974). Confounding and effect modification. Am J Epidemiol
100: 350-3.

Miettinen OS (1982). Causal and preventive interdependence: elementary
principles. Scand J Work Environ Health 8: 159-68.

Moolgavkar SH, Venzon DJ (1987). General relative risk models for
epidemiologic studies. Am J Epidemiol 126: 949-61.

Pearce NE (1989). Analytic implications of epidemiological concepts of
interaction. Int J Epidemiol 18: 976-80.

Rothman KJ, Greenland S, Walker AM (1980). Concepts of interaction. Am J
Epidemiol 112: 467-70.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia:
Lippincott-Raven.

Saracci R (1987). The interactions of tobacco smoking and other agents in
cancer etiology. Epidemiologic Reviews 9: 175-93.

Selikoff I, Sedman H, Hammond E (1980). Mortality effects of cigarette
smoking among amosite asbestos factory workers. JNCI 65: 507-13.

Siemiatycki J, Thomas DC (1981). Biological models and statistical
interactions: an example from multistage carcinogenesis. Int J Epidemiol
10: 383-7.

Steenland K, Thun M (1986). Interaction between tobacco smoking and
occupational exposures in the causation of lung cancer. Journal of
Occupational Medicine 28: 110-8.

Walter SD, Holford TR (1978). Additive, multiplicative and other models for
disease risks. Am J Epidemiol 108: 341-6.

Part III

Conducting a study

CHAPTER 8: Measurement of Exposure and
Health Status
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

In this chapter I briefly review the various options for measuring exposure
and disease status. In the following chapters I then discuss the
practicalities of conducting cohort, case-control and cross-sectional
studies.

8.1: Exposure

As discussed in chapter 1, epidemiological studies involve a wide variety of exposures ranging from the population level to the individual and micro-levels. The term “exposure” is thus used generically to refer to any factor that is under study, and exposures may include population factors (e.g. income inequality), individual-level socio-economic factors (e.g. income), physical environmental factors (e.g. air pollution), aspects of individual lifestyle (e.g. diet), as well as “exposures” measured at the level of the body (e.g. total body burden of dioxin), organ (e.g. the concentration of asbestos in the lung), cell, or molecule (e.g. DNA adducts). These various situations are discussed here briefly; a more detailed discussion can be found in Armstrong et al (1992).

Exposure and Dose

Strictly speaking, the term exposure refers to the presence of a substance (e.g. fine particulate matter) in the external environment, whereas the term dose refers to the amount of substance that reaches susceptible targets within the body, such as the airways. In some situations (e.g. in a coal mine) measurements of external exposures may be strongly correlated with internal dose, whereas in other situations (e.g. environmental lead exposure) the dose may depend on individual lifestyle and activities and may therefore be only weakly correlated with the environmental exposure levels.

Exposure levels can be assessed with regard to the intensity of the substance in the environment (e.g. dust concentration in the air) and the duration of time for which exposure occurs. The risk of developing disease may be much greater if the duration of exposure is long and/or the exposure is intense, and the total cumulative exposure may therefore be important. For protracted etiologic processes, the time-pattern of exposure may be important and it is possible to assess this by examining the separate effects of exposures in various time windows prior to the occurrence and recognition of clinical disease (Pearce, 1992). For example, in cancer studies recent exposures may not be relevant since the cancer may have first become established some years previously (Pearce, 1988). Similarly, recent work suggests that occupational asthma is most likely to occur after about 1-3
years of exposure to a sensitising agent (Antó et al, 1996).

General Approaches to Exposure Assessment

Methods of exposure measurement include personal interviews or self-administered questionnaires (completed either by the study participant or by a proxy respondent), diaries, observation, routine records, physical or chemical measurements on the environment, or physical or chemical measurements on the person (Armstrong et al, 1992). For example, table 8.1 summarizes the types of exposure data most commonly used in occupational epidemiology studies (Checkoway et al, 2004). Measurements on the person can relate either to exogenous exposure (e.g. airborne dust) or internal dose (e.g. plasma cotinine); the other measurement options (e.g. questionnaires) all relate to exogenous exposures.

Demographic Factors

In most instances, information on demographic factors such as age, gender and ethnicity can be obtained in a straightforward manner from routine health care records or with questionnaires. In studies focusing on ethnicity, the etiologically relevant definition will depend on the extent to which an ethnic difference is considered to be due to genetic and/or cultural and environmental factors, but the available information will vary from country to country depending on historical and cultural considerations. For example, in New Zealand, Māori ethnicity is defined as ‘a person who has Māori ethnicity and chooses to identify as Māori’ (Pomare et al, 1992), whereas some other countries use solely biologically-based definitions of “race” or ethnicity (Polednak, 1989).

Socio-economic status poses more significant measurement problems. It can be measured in a variety of ways, including occupation, income, and education (Liberatos et al, 1988; Berkman and MacIntyre, 1997). These measures may pose problems in some demographic groups; for example, occupation and income may be poor measures of socio-economic status in women, for whom the total family situation may reflect their socio-economic status better than their individual situation, and measures of socio-economic status in children must be based on the situation of the parents or the total family situation. Nevertheless, the various measures of socio-economic status are strongly correlated with each other, and asthma epidemiology studies are usually based on whichever measures are available, unless socio-economic status is the main focus of the research and it is necessary to obtain more detailed information.

Questionnaires

Traditionally, exposure to most non-biological risk factors (e.g. tobacco smoking) has been measured with questionnaires, and this approach has a long history of successful use in epidemiology (Armstrong et al, 1992). Questionnaires may be self-administered (e.g. postal questionnaires) or interviewer-administered (e.g. in telephone or face-to-face interviews) and may be completed by the study subject or by a proxy (e.g. parental completion of questionnaires in a study of children, or completion by the spouse of deceased cases). The validity of questionnaire data also depends on the structure, format, content and wording of questionnaires, as well as methods of administration and selection and training of interviewers (Armstrong et al, 1992).
Table 8.1

Types of exposure data commonly used in occupational epidemiology studies

(Source: Adapted from Checkoway et al, 2004)

• Ever employed in the industry
• Duration of employment in the industry
• Ordinally ranked jobs or tasks
• Job-exposure matrices
• Quantified personal measurements

Example 8.1

Raum et al (2001) studied the impact of maternal socio-economic status on intrauterine growth in the former West and East Germany. Information on socio-demographic or lifestyle factors and pregnancy outcome was available for 3,374 live-born singletons from West Germany (1987/88) and 3,070 from East Germany (1990/91). Women were recruited during pregnancy and given a self-administered 30-page questionnaire covering socio-demographic, psychosocial, nutritional, environmental and occupational factors. The two school systems were not identical, but in each system maternal educational level was grouped into five categories. Women with the lowest education had a significantly elevated risk of small-for-gestational-age (SGA) newborns compared to women with the highest education in both the West (OR = 2.58, 95% CI 1.17-5.67) and the East (OR = 2.77, 95% CI 1.54-5.00). The authors concluded that social inequalities existed and caused health inequalities in both the West and the former socialist country of East Germany.

Example 8.2

Vartia (2001) studied the consequences of workplace bullying in the municipal sector in Helsinki, Finland. Every 35th member of the Municipal Officials Union was selected and 1037 (65.5%) responded to a postal questionnaire. A definition of bullying was provided and study participants were asked if they felt themselves subjected to such behaviour, or if they had observed someone else at their workplace being bullied. They were also asked about the frequency and duration of such acts. Both the targets of bullying and the observers reported more general stress and mental stress reactions than did respondents from workplaces with no bullying. The targets of bullying used sleep-inducing drugs and sedatives more often than did the respondents who were not bullied.
Environmental Measurements and Job-Exposure Matrices

In many studies, e.g. community-based case-control studies, questionnaires are the only source of exposure information. However, in some instances, particularly in occupational studies, questionnaires may be combined with environmental exposure measurements (e.g. industrial hygiene surveys) to obtain a quantitative estimate of individual exposures. Table 8.2 shows environmental measurements in an asbestos textile plant in South Carolina (Dement et al, 1983; Checkoway et al, 2004). It shows that for each job title, exposure levels decreased over time, but increased again during the 1966-75 time period. Within each time period, the highest exposures were in raw fiber handling and the lowest were in general area workers. This historical exposure information can be combined with information from employment records to obtain exposure estimates for individual workers. For example, table 8.3 shows the cumulative exposure for a worker who worked as a card operator during 1933-1938 and then worked in “clean-up” during 1939-1948.

Example 8.3

Saracci et al (1984a) conducted a historical cohort study of mortality and cancer incidence of workers exposed to man-made vitreous fibres at 13 European plants. At 12 of the plants an environmental survey was conducted to measure present concentrations of fibres in air samples. This was used to create a job-exposure matrix. Within each plant, job/plant areas were grouped into six main occupational categories: not specified, office, preproduction, production, secondary processes and maintenance. For each worker a cumulative exposure index was created by multiplying the time spent in each job category by the mean concentration of respirable fibres in the job category. The relative risk of lung cancer was elevated, particularly in the group with 30 years or more since first employment (RR=1.92, 95% CI 1.17-3.07). There was a tendency for the risk to increase with cumulative exposure, but the pattern was not consistent.

Table 8.2

Asbestos concentrations (fibers/cc) in job categories in an asbestos textile plant

(Source: Adapted from Checkoway et al, 2004)

Job category          1930-35  1936-45  1946-65  1966-75
General area             10.8      5.3      2.4      4.3
Card operators           13.3      6.5      2.9      5.3
Clean-up                 18.1      8.8      4.0      7.2
Raw fiber handling       22.8     11.0      5.0      9.0

Table 8.3

Example of an exposure history of an individual worker

Job             Years     Mean exposure   Cumulative exposure
Card operator   1933-35            10.8                  32.4
Card operator   1936-38             6.5                  41.9
Clean-up        1939-45             8.8                 103.5
Clean-up        1946-48             4.0                 115.5
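
The way a job-exposure matrix is combined with employment records can be sketched in a few lines of code. The following is a minimal illustration rather than the method used in the original study: it takes the concentrations from Table 8.2 as printed, assumes that every calendar year in the work history counts as one full year of exposure, and uses hypothetical names (`JEM`, `concentration`, `cumulative_exposure`). Note that Table 8.3 as printed lists 10.8 fibers/cc for the first period, whereas Table 8.2 gives 13.3 for card operators in 1930-35, so the total computed here is not expected to reproduce the printed running totals.

```python
# Job-exposure matrix taken from Table 8.2: mean asbestos concentration
# (fibers/cc) by job category and calendar period (inclusive year ranges).
JEM = {
    "General area": {(1930, 1935): 10.8, (1936, 1945): 5.3,
                     (1946, 1965): 2.4, (1966, 1975): 4.3},
    "Card operators": {(1930, 1935): 13.3, (1936, 1945): 6.5,
                       (1946, 1965): 2.9, (1966, 1975): 5.3},
    "Clean-up": {(1930, 1935): 18.1, (1936, 1945): 8.8,
                 (1946, 1965): 4.0, (1966, 1975): 7.2},
    "Raw fiber handling": {(1930, 1935): 22.8, (1936, 1945): 11.0,
                           (1946, 1965): 5.0, (1966, 1975): 9.0},
}


def concentration(job, year):
    """Look up the mean concentration for a job in a given calendar year."""
    for (start, end), value in JEM[job].items():
        if start <= year <= end:
            return value
    raise ValueError(f"no measurement for {job} in {year}")


def cumulative_exposure(work_history):
    """Sum concentration x time over a work history.

    work_history is a list of (job, first_year, last_year) tuples with
    inclusive years; each year is assumed to be a full year of exposure.
    """
    total = 0.0
    for job, first_year, last_year in work_history:
        for year in range(first_year, last_year + 1):
            total += concentration(job, year)
    return total


# The worker described in the text: card operator 1933-1938, clean-up 1939-1948.
history = [("Card operators", 1933, 1938), ("Clean-up", 1939, 1948)]
total = cumulative_exposure(history)
print(f"Cumulative exposure: {total:.1f} fibers/cc x years")
```

Looking up the concentration year by year, rather than once per job, means that a job spanning several measurement periods is handled without splitting the work history by hand.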

Quantified Personal Measurements

In some instances, quantified personal exposure measurements may be available, e.g. in radiation workers wearing radiation dosimeters (Checkoway et al, 2004). This information is invaluable when it is available, but it is rarely available for historical exposures with the exception of some industries such as the nuclear power industry. Such information can of course be collected prospectively. This is rarely practical for cohort studies of rare diseases with long latency periods (e.g. cancer), but is more appropriate for cohort studies of relatively common conditions. For example, infant cohort studies of respiratory disease frequently prospectively collect information on individual levels of allergen exposure (e.g. Lau et al, 2001).

Quantified personal exposure measurements can also be used in case-control studies to estimate historical exposures. However, a potential problem in this situation is that exposure may have changed over time, or study participants may change their behaviour as a result of having been diagnosed with disease. This has been a particular issue in case-control studies of electromagnetic field exposure and childhood leukemia where it has been argued that current personal exposure measurements may be inferior to “wire code” information (i.e. whether the wiring to the house is underground, or by overhead wires, etc) in estimating historical exposures (Neutra and del Pizzo, 1996).

Example 8.4

Wing et al (1991) conducted a historical cohort mortality study among workers at Oak Ridge National Laboratory, Tennessee. Individual exposures to external penetrating radiation, primarily gamma rays, were measured using pocket ionising chambers from 1943 until June 1944, film badges from then until 1975, and thermoluminescent dosimeters since 1975. This information was used to estimate individual exposures over time. After accounting for age, birth cohort, socio-economic status, and active worker status, external radiation with a 20-year exposure lag (i.e. exposures were only considered up until 20 years previously) was associated with an increased risk of death (2.68% increase per 10 mSv cumulative exposure), particularly from cancer (4.94% increase per 10 mSv).
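
The idea of an exposure lag in Example 8.4 can be illustrated with a short sketch. The annual doses below are hypothetical and the function name `lagged_cumulative_dose` is an assumption made for illustration; the code shows only how a lag changes the cumulative exposure assigned to a person at a given point in follow-up, not how the original analysis was carried out.

```python
def lagged_cumulative_dose(annual_doses, at_year, lag_years=20):
    """Cumulative dose at `at_year`, ignoring the most recent `lag_years`.

    annual_doses maps calendar year -> dose received in that year (mSv).
    Only doses received up to and including (at_year - lag_years) count.
    """
    cutoff = at_year - lag_years
    return sum(dose for year, dose in annual_doses.items() if year <= cutoff)


# Hypothetical dosimetry record for one worker (mSv per year).
annual_doses = {1950: 2.0, 1955: 3.0, 1960: 1.5, 1970: 4.0, 1980: 2.5}

unlagged = lagged_cumulative_dose(annual_doses, at_year=1985, lag_years=0)
lagged = lagged_cumulative_dose(annual_doses, at_year=1985, lag_years=20)

print(f"Unlagged cumulative dose in 1985: {unlagged} mSv")
print(f"20-year lagged cumulative dose in 1985: {lagged} mSv")
```

With a 20-year lag, the doses received in 1970 and 1980 do not contribute to the exposure assigned in 1985, which reflects the assumption that recent exposures are unlikely to be etiologically relevant for diseases with long latency.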

Biomarkers

More recently, there has been increasing emphasis on the use of molecular markers of internal dose (Schulte, 1993). In fact, there are a number of major limitations of currently available biomarkers of exposure (Armstrong et al, 1992), particularly with regard to historical exposures (Pearce et al, 1995). For example, serum levels of micronutrients reflect recent rather than historical dietary intake (Willett, 1990). Some biomarkers are better than others in this respect (particularly markers of exposure to biological agents), but even the best markers of chemical exposures usually reflect only the last few weeks or months of exposure. On the other hand, with some biomarkers it may be possible to estimate historical levels provided that certain assumptions are met. For example, it may be possible to estimate historical levels of exposure to pesticides (or contaminants) from current serum levels provided that the exposure period is known, and the half-life is known. Similarly, information on recent exposures can be used if it is reasonable to assume that exposure levels (or at least relative exposure levels) have remained stable over time (this may be particularly relevant in occupational studies), and have not been affected by lifestyle changes, or by the occurrence of the disease. However, if the aim is to measure historical exposures, then historical information on exposure surrogates may be more valid than direct measurements of current exposure or dose levels. This situation has long been recognised in occupational epidemiology, where the use of work history records in combination with a job-exposure matrix (based on historical exposure measurements of work areas rather than individuals) is usually considered to be more valid than current exposure measurements (whether based on environmental measurements or biomarkers) if the aim is to estimate historical exposure levels (Checkoway et al, 2004). On the other hand, some biomarkers have potential value in
validation of questionnaires which can then be used to estimate historical exposures. Furthermore, biomarkers of internal dose may have relatively good validity in studies involving an acute effect of exposure.

A more fundamental problem of measuring internal dose with a biomarker is that it is not always clear whether one is measuring the exposure, the biological effect, or some stage of the disease process itself (Saracci, 1984b). Thus the findings may be uninterpretable in terms of the causal association between exposure and disease. When it is known that the biologically effective dose is the most appropriate measure, then the use of appropriate biomarkers clearly has some scientific advantages. However, choosing the appropriate biomarker is a major dilemma, and biomarkers are frequently chosen on the basis of an incomplete or erroneous understanding of the etiologic process (or simply because a particular marker can be measured). An environmental exposure (e.g. tobacco smoke) may involve hundreds of different chemicals, each of which may produce hundreds of measurable biological responses (there are exceptions to this, of course, such as environmental lead exposure, but most environmental exposure involves complex mixtures). A biomarker typically measures one of the biological responses to one of the chemicals. If the chosen biomarker measures the key etiological factor, then it may yield relatively good exposure data; however, if a biomarker is chosen which has little relationship to the etiological component of the complex exposure mixture then the biomarker will yield relatively poor exposure data.

A further major problem with the use of biomarkers is that the resulting expense and complexity may drastically reduce the study size, even in a case-control study, and therefore greatly reduce the statistical power for detecting an association between exposure and disease.
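
The earlier remark in this section that historical serum levels can sometimes be estimated from current levels, when the exposure period and the half-life are known, can be made concrete with a small sketch. It rests on assumptions that are not stated in the text: simple first-order (exponential) elimination, a single known half-life, and no further exposure after the exposure period ended. The function name and the numbers are hypothetical.

```python
def back_extrapolate(current_level, years_since_exposure_ended, half_life_years):
    """Estimate the serum level at the end of the exposure period.

    Assumes first-order elimination, so the level halves every
    `half_life_years`, and assumes no exposure since the period ended.
    """
    elapsed_half_lives = years_since_exposure_ended / half_life_years
    return current_level * 2 ** elapsed_half_lives


# Hypothetical example: a serum level of 10 units measured 14 years after
# exposure ceased, for a compound with an assumed half-life of 7 years.
estimated_past_level = back_extrapolate(10.0, 14.0, 7.0)
print(f"Estimated level when exposure ended: {estimated_past_level:.1f} units")
```

The estimate is only as good as the assumed half-life and exposure period; if exposure continued at a low level, or if the half-life varies between individuals, the back-extrapolated value may be seriously in error.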

Example 8.5

Ross et al (1992) studied urinary aflatoxin biomarkers and risk of hepatocellular carcinoma as part of an ongoing prospective study of 18,244 middle-aged men in Shanghai. After 35,299 person-years of follow-up, a nested case-control study was conducted based on the 22 identified cases of liver cancer, and 140 density-matched controls (matched for age and neighbourhood of residence). The cases of liver cancer were more likely than controls to have detectable concentrations of aflatoxin metabolites (OR = 2.4, 95% CI 1.0-5.9).

Thus, questionnaires and environmental measurements will continue to play a major role in exposure assessment in epidemiology, but biomarkers may be expected to become increasingly useful over time, as new techniques are developed. The emphasis should be on using “appropriate technology” to obtain the most practical and valid estimate of the etiologically relevant exposure. The appropriate approach (questionnaires, environmental measurements or biological measurements) will vary from study to study, and from exposure to exposure within the same study, or within the same complex chemical mixture (e.g. in tobacco smoke).

8.2: Health Status

The type of information required for measuring health status in epidemiological studies may be different from that which is required in clinical practice. As with exposure data, the key issue is that information should be of similar quality for the various groups being compared. For example, suppose that the bladder cancer incidence in a particular geographical area is being compared with national incidence rates; then it would be inappropriate to conduct a pathological review and reclassification of the cases of the cancer identified in the area, since such a reclassification had not been made for the national data and the information would not be comparable. Rather, the cancer cases in the area should be classified exactly as they had been classified in routine national cancer statistics. Thus, the emphasis should be on the comparability of information across the various groups being compared.

The types of health outcome data used in epidemiological studies include: mortality; disease registers; health service records; and morbidity surveys. These can be grouped into data based on routinely collected records, and morbidity data that is collected for a specific epidemiologic study.

Routine Records

Most countries maintain comprehensive death registration systems at the national or regional levels, and cause of death information for identified deaths can be obtained by requesting copies of death certificates from national, state, or municipal vital statistics offices. In most instances the causes of death are coded by a nosologist trained in the rules specified in the International Classification of Diseases (ICD) volumes compiled by the World Health Organisation. Revisions to the ICD coding are made about every ten years, and in some instances the ICD code for a particular cause of death may change (Checkoway et al, 2004).

Some countries or states also maintain incidence registers for conditions such as cancer, congenital malformations or epilepsy. These have most commonly been established for cancer registration and the International Agency for Research on Cancer (IARC) has been attempting to encourage the establishment of cancer registries and to standardise methods of cancer registration throughout the world (Jensen et al, 1991). Provided that registration is relatively complete, then cancer registrations can provide valuable additional health status information (and increase the number of identified cases) in a cohort study. Furthermore, cancer registries are invaluable for identifying newly diagnosed cases who can be interviewed (while they are still alive) for population-based case-control studies.

Many Western countries have notification systems for occupational diseases. For example, in the United Kingdom the
Surveillance of Work Related and Occupational Respiratory Disease (SWORD) project was established in 1989 as a national surveillance scheme for occupational respiratory disease (Meredith et al, 1991).

As discussed in chapter 9, other routinely collected records can be used for determining health status in cohort studies, or to create informal “registers” for identifying cases for case-control studies; these include hospital admission records, health insurance claims, health maintenance organisation (HMO) records, and family doctor (general practitioner) records.
routinely collected records can be used

Example 8.6

Jones et al (1998) performed a record linkage study of pre-natal and early life risk factors for childhood onset diabetes mellitus. They identified 160 boys and 155 girls born during 1965-1986 who had been admitted to hospital in Oxfordshire, England with a diagnosis of diabetes during 1965-1987. For each case, up to eight controls were chosen from records for live births in the same area, matched on sex, year of birth and hospital or place of birth. They then linked the hospital record for each child to all of that child’s hospital records and to his or her mother’s maternity record. There were no significant associations between subsequent diabetes and birthweight, gestational age, birthweight for gestational age, maternal age and parity. There were non-significantly increased risks with not breastfeeding (OR=1.33, 95% CI 0.76-2.34) and with diabetes recorded in the mother during pregnancy (OR=5.87, 95% CI 0.90-38.3), and a significantly raised risk with pre-eclampsia or eclampsia during pregnancy (OR=1.48, 95% CI 1.05-2.10). They hypothesized that pre-eclampsia may be the result of an immunogenetic incompatibility between mother and fetus, and that this early immunological disturbance may be related to the incidence of diabetes later in life.
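
The odds ratios and confidence intervals quoted in Example 8.6 show how “significant” and “non-significant” findings are read off from whether the interval excludes 1.0. The sketch below assumes (this is not stated in the text) that the intervals are Wald-type intervals, symmetric on the log scale, in which case the point estimate should lie close to the geometric mean of the two limits and the standard error of the log odds ratio can be recovered from the width of the interval. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def log_se_from_ci(lower, upper, z=Z_95):
    """Standard error of log(OR) implied by a log-symmetric interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def summarize(lower, upper):
    """Return (implied SE of log OR, geometric midpoint, excludes null)."""
    se = log_se_from_ci(lower, upper)
    midpoint = math.sqrt(lower * upper)
    excludes_null = lower > 1.0 or upper < 1.0
    return se, midpoint, excludes_null


# Estimates quoted in Example 8.6: (OR, lower limit, upper limit).
results = {
    "not breastfeeding": (1.33, 0.76, 2.34),
    "maternal diabetes in pregnancy": (5.87, 0.90, 38.3),
    "pre-eclampsia or eclampsia": (1.48, 1.05, 2.10),
}

for factor, (odds_ratio, lower, upper) in results.items():
    se, midpoint, excludes_null = summarize(lower, upper)
    label = "excludes 1.0" if excludes_null else "includes 1.0"
    print(f"{factor}: OR={odds_ratio}, midpoint={midpoint:.2f}, "
          f"SE(log OR)={se:.2f}, interval {label}")
```

The very wide interval for maternal diabetes reflects a large standard error, which is what would be expected when an estimate is based on a small number of exposed cases.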

Morbidity Surveys

In some circumstances, routine records may not be available for the health outcome under study, or may not be sufficiently complete or accurate for use in epidemiological studies. Although this could in theory apply to mortality records, more commonly this is an issue for non-fatal conditions, particularly chronic diseases such as respiratory disease and diabetes. Morbidity surveys conducted for this purpose may involve clinical examinations (e.g. a clinical history and peak flow measurements for asthma), more invasive testing (e.g. blood tests for diabetes), questionnaires, or a combination of these methods.

To take the example of asthma, the essential feature of the condition (at least in clinical and epidemiological terms) is variable airflow obstruction which can be reversed by treatment or is self-limiting (Pearce et al, 1998). This poses several problems with the use of “diagnosed asthma” in asthma prevalence studies, since the diagnosis of “variable airflow obstruction” usually requires several medical consultations over an extended period. It is therefore not surprising that several studies have found the prevalence of physician-diagnosed asthma to be substantially lower than the prevalence of asthma symptoms. Such problems of differences in diagnostic practice could be minimised by using a standardised protocol for asthma diagnosis in prevalence studies. However, this is rarely a realistic option since it requires repeated contacts between the study participants and physicians, and this is not possible or affordable in large-scale epidemiological studies. Thus, most epidemiological studies must, by necessity, focus on factors which are related to, or symptomatic of, asthma but which can be readily assessed on a particular day. The main options in this regard are symptoms and physiological measurements (Pearce et al, 1998). In particular, standardised symptoms questionnaires have been developed for use in adults (Burney et al, 1994) and children (Asher et al, 1995).

Example 8.7

Dowse et al (1990) studied the prevalence of non-insulin dependent diabetes mellitus (NIDDM) in adults aged 25-74 years in Mauritius. A random sample of 5,892 individuals was chosen and 5,080 (83.4%) participated. They used a 75g oral glucose tolerance test with fasting and 2-h post load blood collection. Glucose tolerance was classified according to the World Health Organisation (WHO) criteria (World Health Organisation, 1985). The prevalence of NIDDM was similar in men (12.1%) and women (11.7%). Age and sex-standardised prevalence was similar in Hindu Indians (12.4%), Muslim Indians (13.3%), Creoles (10.4%) and Chinese (11.9%). The authors commented that the findings in Indians were similar to those in other studies of Indian migrant communities, but the findings in Creoles and Chinese were unexpected. “Potent environmental factors shared between ethnic groups in Mauritius may be responsible for the epidemic of glucose intolerance”.

Health status can also be measured by more general morbidity and “quality of life” questionnaires. Perhaps the most widely used questionnaire has been the Medical Outcomes Study Short Form (SF-36) (Ware, 1993). This includes scales to measure physical functioning, role functioning, bodily pain, mental health, and general health perceptions. The SF-36 scales have been widely used in clinical research in a wide variety of populations to assess overall health status.

Summary

Methods of exposure measurement include personal interviews or self-administered questionnaires (completed either by the study participant or by a proxy respondent), diaries, observation, routine records, physical or chemical measurements on the environment, or physical or chemical measurements on the person. Measurements on the person can relate either to exogenous exposure (e.g. airborne dust) or internal dose (e.g. plasma cotinine); the other measurement options (e.g. questionnaires) all relate to exogenous exposures. Traditionally, exposure to most non-biological risk factors (e.g. cigarette smoking) has been measured with questionnaires (either self-administered or interviewer-administered), and this approach has a long history of successful use in epidemiology. Questionnaires may be combined with environmental exposure measurements (e.g. pollen counts, industrial hygiene surveys) to obtain a quantitative estimate of individual exposures. More recently, there has been increasing emphasis on the use of molecular markers of internal dose (Schulte, 1993). However, questionnaires and environmental measurements have good validity and reproducibility with regard to current exposures and are likely to be superior to biological markers with respect to historical exposures. The emphasis should be on using “appropriate technology” to obtain the most practical and valid estimate of the etiologically relevant exposure.

Similar considerations apply to the collection of information on health status. Once again, it is important that the information obtained should be of comparable quality in the exposed and non-exposed populations. With this proviso, the specific methods used will differ according to the hypothesis and population under study, but the main options include use of routine records (mortality, incidence, hospital admission, health insurance, general practitioner, etc) and the mounting of a special morbidity survey (using clinical examinations, biological testing or questionnaires).

References

Antó JM, Sunyer J, Newman-Taylor AJ (1996). Comparison of soybean epidemic asthma and occupational asthma. Thorax 51: 743-9.

Armstrong BK, White E, Saracci R (1992). Principles of exposure measurement in epidemiology. New York: Oxford University Press.

Asher I, Keil U, Anderson HR, et al (1995). International Study of Asthma and Allergies in Childhood (ISAAC): rationale and methods. Eur Resp J 8: 483-91.

Berkman LF, MacIntyre S (1997). The measurement of social class in health studies: old measures and new formulations. In: Kogevinas M, Pearce N, Susser M, Boffetta P (eds). Socioeconomic factors and cancer. Lyon: IARC, pp 51-64.

Burney PGJ, Luczynska C, Chinn S, Jarvis D (1994). The European Community Respiratory Health Survey. Eur Resp J 7: 954-60.

Checkoway HA, Pearce N, Kriebel D (2004). Research methods in occupational epidemiology. 2nd ed. New York: Oxford University Press.

Dement JM, Harris RL, Symons MJ, Shy CM (1983). Exposures and mortality among chrysotile asbestos workers. Part II: Mortality. American Journal of Industrial Medicine 4: 421-33.

Dowse GK, Gareeboo H, Zimmet PZ, et al (1990). High prevalence of NIDDM and impaired glucose tolerance in Indian, Creole, and Chinese Mauritians. Diabetes 39: 390-6.

Jensen OM, Parkin DM, MacLennan R, et al (1991). Cancer registration: principles and methods. Lyon: IARC.

Jones ME, Swerdlow AJ, Gill LE, Goldacre MJ (1998). Pre-natal and early life risk factors for childhood onset diabetes mellitus: a record linkage study. Int J Epidemiol 27: 444-9.

Lau S, Illi S, Sommerfeld C, et al (2001). Early exposure to house-dust mite and cat allergens and development of childhood asthma: a cohort study. Multicentre Allergy Study Group. Lancet 356: 1392-7.

Liberatos P, Link BG, Kelsey JL (1988). The measurement of social class in epidemiology. Epidemiologic Reviews 10: 87-121.

Meredith SK, Taylor VM, McDonald JC (1991). Occupational respiratory disease in the UK 1989: a report by the SWORD project group. Br J Ind Med 48: 292-8.

Neutra RR, del Pizzo V (1996). When “wire codes” predict cancer better than spot measurements of magnetic fields. Epidemiol 7: 217-8.

Pearce N (1988). Multistage modeling of lung cancer mortality in asbestos textile workers. Int J Epidemiol 17: 747-52.

Pearce N (1992). Methodological problems of time-related variables in occupational cohort studies. Rev Epidem et Santé Publ 40: S43-S54.

Pearce N, Beasley R, Burgess C, Crane J (1998). Asthma epidemiology: principles and methods. New York: Oxford University Press.

Pearce N, Sanjose S, Boffetta P, et al (1995). Limitations of biomarkers of exposure in cancer epidemiology. Epidemiol 6: 190-4.

Polednak AP (1989). Racial and ethnic differences in disease. New York: Oxford University Press.

Pomare E, Tutengaehe H, Ramsden I, et al (1992). Asthma in Maori people. NZ Med J 105: 469-70.

Raum E, Arabin B, Schlaud M, et al (2001). The impact of maternal education on intrauterine growth: a comparison of former West and East Germany. Int J Epidemiol 30: 81-7.

Ross RK, Yuan J-M, Yu MC, et al (1992). Urinary aflatoxin biomarkers and risk of hepatocellular carcinoma. Lancet 339: 943-6.

Saracci R, Simonato L, Acheson ED, et al (1984a). Mortality and incidence of cancer of workers in the man made vitreous fibres producing industry: an international investigation at 13 European plants. Br J Ind Med 41: 425-36.

Saracci R (1984b). Assessing exposure of individuals in the identification of disease determinants. In: Berlin A, Draper M, Hemminki K, Vainio H (eds). Monitoring human exposure to carcinogenic and mutagenic agents. Lyon: IARC.

Schulte PA (1993). A conceptual and historical framework for molecular epidemiology. In: Schulte P, Perera FP. Molecular epidemiology: principles and practices. New York: Academic Press, pp 3-44.

Vartia M (2001). Consequences of workplace bullying with respect to the well-being of its targets and the observers of bullying. Scand J Work Environ Health 27: 63-9.

Ware JE (1993). SF-36 Health Survey, Manual and Interpretation Guide. Boston: The Health Institute.

Willett W (1990). Nutritional epidemiology. New York: Oxford University Press.

World Health Organisation (WHO) (1985). WHO Study Group: Diabetes mellitus. Technical Report Series no 727. Geneva: World Health Organisation.

CHAPTER 9: Cohort Studies
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

As discussed in chapter 2, an incidence study is a subtype of longitudinal study in which the outcome measure is dichotomous (e.g. death or disease incidence). Perhaps the simplest type of incidence study involves “descriptive” analyses using routine mortality or incidence records for a defined geographic population. For example, most countries have comprehensive death registration schemes, as well as regular national censuses, a population register, or other methods of estimating population numbers. These can then be used as the numerator and denominator respectively to calculate overall national death rates, as well as the death rates by age-group and gender. In some countries, information may also be available to calculate death rates by other demographic variables such as ethnicity, socio-economic status, employment status, occupation or geographical area. However, the validity of such analyses may be questionable, because in most countries death certificates (or other routine records such as cancer registration records) are not linked directly to the corresponding population records. Thus, problems may occur if factors such as ethnicity are coded differently on the death records and on the population records. Nevertheless, such “descriptive” analyses have played a major role in identifying public health problems and suggesting priorities for public health research.

However, the limitations of analyses based on routine records usually mean that a specific “cohort” must be constructed for many epidemiologic studies. In this chapter I discuss the practicalities of conducting a cohort study.
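The numerator and denominator calculation described above can be illustrated with a minimal sketch. The counts, the age-groups and the function name `rate_per_100k` are all hypothetical and are used only to show the arithmetic; they are not taken from any real registry or census.

```python
# Hypothetical counts for illustration only; not taken from any real registry.
# Numerator: deaths registered in one year, by age-group.
deaths = {"0-44": 1200, "45-64": 4800, "65+": 30000}
# Denominator: census or population-register counts for the same year.
population = {"0-44": 2_400_000, "45-64": 800_000, "65+": 500_000}


def rate_per_100k(events, persons):
    """Return an annual rate per 100,000 persons."""
    return events * 100_000 / persons


# Age-specific death rates.
age_specific = {
    group: rate_per_100k(deaths[group], population[group]) for group in deaths
}

# Overall (crude) death rate for the whole population.
overall = rate_per_100k(sum(deaths.values()), sum(population.values()))

for group, rate in age_specific.items():
    print(f"{group}: {rate:.1f} per 100,000")
print(f"overall: {overall:.1f} per 100,000")
```

The same calculation applies to any other demographic variable, provided that the variable is coded in the same way in the death records and in the population records.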

9.1: Defining the source population and risk period

Community-based cohort studies

For studies investigating environmental factors or general lifestyle (diet, exercise, etc.), a cohort study may be based on a particular community which is followed (usually prospectively) over time. For example, a cohort may be based on “all persons aged 20 years or more” living in a particular city or county in a particular year. This would usually require a special survey to be conducted at the start of the follow-up period, with further surveys being conducted at regular intervals.

More specific cohorts

Cohorts may also be constructed on the basis of more specific exposures. Perhaps the most common example of this approach involves studies that are based on workers in a particular factory or industry (Checkoway et al, 2004). Such studies may be based on historical records, enabling follow-up to be conducted retrospectively. Typically, such a historical cohort study might involve “all workers who worked for at least one month in the factory at any time during 1970-1999”. The list of such workers can be enumerated using personnel records, which also provide information on their job titles and departments (which can be used to estimate their historical exposures).

Comparison populations

In community-based cohorts, comparisons are usually made internally between study participants exposed and those not exposed to a particular risk factor (e.g. low dietary beta carotene intake compared with high dietary beta carotene intake).

In studies of specific populations, an internal comparison may still be possible, e.g. by comparing workers with high benzene exposure to those with low benzene exposure. However, in some instances this may not be possible because good individual exposure information is not available (apart from the fact that workers in the factory received high exposure on average) or because there is not sufficient variation in exposure within the population (e.g. because everyone who worked in the factory had high exposure). In this situation, an external comparison may be made, e.g. with national death rates or cancer registration rates. The source population for the study is then effectively the national population, and a comparison is being made between the subgroup in the source population that worked in a particular factory (for example) and the entire source population. Ideally the comparison should be made between the exposed group and the source population minus the exposed group (i.e. everyone else in the country who did not work in the factory). However, this is rarely feasible in practice, and including the exposed group in the comparison population is usually a trivial problem if the exposure is rare. Thus, the comparison is usually made between the exposed group and the national population as a whole.

The risk period

Once the source population has been defined, the risk period must also be specified. It is important that the risk period is the same for the two or more groups being compared. For example, it would be inappropriate to compare deaths from ischaemic heart disease (IHD) in two different communities at two different time periods, since there is a continuing decline in IHD mortality, and spurious differences between the communities may be observed if they are not studied over the same risk period.

In a historical cohort study, participants may be followed from some date in the past (e.g. the date the factory opened) up until the present (or some recent date for which death records or cancer registration records are complete). In a prospective cohort study, participants may be followed from the present until some specified future date (e.g. a ten-year follow-up of participants in a recent survey). In both instances, not all study participants will be followed for the entire risk period. For example, someone who moved into the community during the risk period and was “recruited” during a later survey would only be followed from the time of that survey. Similarly, someone who emigrated during the risk period would only be followed until their date of emigration.

Example 9.1

The Renfrew/Paisley study was based on two adjacent urban burghs considered to be typical of the West of Scotland. During 1972-1976, men and women aged between 45 and 64 and identified by door-to-door census as living in Renfrew and Paisley were invited to take part. The response rate was 80% (7,052 men and 8,354 women). Participants completed a questionnaire which included self-reported smoking history, occupation, address, age, gender, and respiratory symptoms. Study participants were “flagged” at the National Health Service Central Register in Edinburgh and followed for 20 years. Hart et al (2001) reported that high lung cancer mortality risks were seen for manual compared with non-manual workers. The risk reduced when adjusted for smoking, and reduced further when adjusted for lung function, phlegm and (area) deprivation category. They concluded that the social class difference in lung cancer mortality was explained by poor lung health, deprivation and poor socio-economic conditions throughout life, in addition to smoking.

Example 9.2

Rafnsson et al (2001) studied cancer incidence in a cohort of 1,690 flight attendants working with two airline companies in Iceland. The total number of person-years of follow-up was 27,148. Among the 1,532 women flight attendants, there were 64 cases of cancer, whereas 51.6 were expected on the basis of national cancer incidence rates (RR=1.2). There was a particularly elevated risk for breast cancer in those who had been hired in 1971 or later and therefore had had the heaviest exposure to cosmic radiation at a young age (RR=4.1). The authors concluded that the association may be due to cosmic radiation or disturbance of circadian rhythm.
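An expected number such as the 51.6 in Example 9.2 is typically obtained by applying reference (e.g. national) rates to the person-years of the cohort within each stratum and summing over strata. The following is a minimal sketch of that external comparison. The strata, person-years and reference rates are hypothetical and are not the data used by Rafnsson et al; only the final line uses the observed and expected counts quoted in the example.

```python
# Hypothetical person-years and reference rates for illustration only;
# these are NOT the data used by Rafnsson et al.
cohort_person_years = {"20-39": 15000.0, "40-59": 10000.0, "60+": 2000.0}
reference_rate_per_100k = {"20-39": 50.0, "40-59": 250.0, "60+": 900.0}


def expected_cases(person_years, rates_per_100k):
    """Sum over strata of cohort person-years times the reference rate."""
    return sum(
        person_years[stratum] * rates_per_100k[stratum] / 100_000
        for stratum in person_years
    )


def observed_to_expected(observed, expected):
    """Ratio of observed to expected cases (reported as RR in Example 9.2)."""
    return observed / expected


# Expected number for the hypothetical cohort above.
expected = expected_cases(cohort_person_years, reference_rate_per_100k)

# Ratio using the observed and expected counts quoted in Example 9.2.
ratio_example = observed_to_expected(64, 51.6)
print(f"expected (hypothetical cohort): {expected:.1f}")
print(f"observed/expected (Example 9.2): {ratio_example:.1f}")
```

In practice the strata would usually be defined by age-group, gender and calendar period, so that the cohort and the reference population are compared over the same risk period.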

9.2: Measuring exposure

As discussed in chapter 8, there are a variety of possible methods for measuring exposure in cohort studies. These include routine records, questionnaires, environmental measurements, Job-Exposure-Matrices (JEMs), quantified personal measurements, and biomarkers of exposure.

Ideally, exposures should be measured continuously, or at least at regular intervals, through the risk period (i.e. the period of follow-up). For some risk factors (e.g. for demographic factors such as age, gender and ethnicity), the risk factor status is unlikely to change during the risk period, and can simply be ascertained at baseline. For other exposures that do change over time (e.g. smoking, diet, occupational exposures), regular surveys, or regular examination of routine records, may be desirable to update the exposure information. However, in many studies this is not feasible and information is only collected in a baseline survey; it is then necessary to assume that the exposure level (e.g. serum cholesterol level) has not changed meaningfully during the subsequent follow-up.

In occupational studies, more detailed exposure information may be available through the combination of personnel records (which include changes of job title and department) and Job-Exposure-Matrices (JEMs) based on workplace exposure surveys and/or personal measurements in a subgroup of the workforce (see chapter 8).
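The combination of personnel records with a Job-Exposure-Matrix can be sketched as a simple lookup. This is an illustration under stated assumptions: the matrix values, the department and job titles, and the function `cumulative_exposure` are all hypothetical, and cumulative exposure (intensity multiplied by duration) is only one of several possible exposure summaries.

```python
# Hypothetical job-exposure matrix: (department, job title) -> average
# exposure intensity in arbitrary units. All values are invented.
JEM = {
    ("production", "operator"): 10.0,
    ("production", "supervisor"): 4.0,
    ("office", "clerk"): 0.5,
}


def cumulative_exposure(work_history, jem):
    """Sum of intensity x duration over each job held (unit-years)."""
    return sum(jem[(dept, job)] * years for dept, job, years in work_history)


# One worker's personnel record: (department, job title, years in job).
history = [
    ("production", "operator", 5.0),
    ("production", "supervisor", 10.0),
    ("office", "clerk", 2.0),
]

total = cumulative_exposure(history, JEM)
print(f"cumulative exposure: {total:.1f} unit-years")
```

Because the personnel record lists each change of job title and department, the estimate reflects changes in exposure over the risk period rather than a single baseline value.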

Example 9.3

Prescott et al (2004) studied vital exhaustion (fatigue, hopelessness and depression) as a risk factor for ischaemic heart disease (IHD) in 4084 men and 5479 women in Copenhagen. The study was based on participants in the Copenhagen City Heart Study, and the analyses were based on 10,135 people who attended the third follow-up examination in 1991-1993. Cardiovascular risk factors were assessed by a self-administered questionnaire checked with the participant by trained staff, and by various laboratory tests. Vital exhaustion was assessed using a 17-item questionnaire. Participants were followed until 31 December 1997 for fatal and non-fatal IHD, with the information being obtained from the National Board of Health and National Hospital Discharge Register respectively. Subjects with self-reported and verified IHD prior to enrolment were excluded. During follow-up, 483 experienced an IHD event, of which 25% were fatal, and 1559 subjects died from all causes. All but 4 of the 17 items were significantly associated with IHD with significant relative risks ranging from 1.36 to 2.10. The RR for IHD in those with a vital exhaustion score of 10 or more was 2.57 (95% CI 1.65-4.00) and this altered little after adjustment for biological, behavioural and socioeconomic risk factors.

9.3: Follow-up

Vital status ascertainment

In some instances, particularly in community-based studies, follow-up may involve regular contact with the study participants, including repeated surveys of health status. Perhaps more commonly, follow-up may not involve further contact with the study participants, but may be done by routine record linkage.

For example, study participants may be followed over time by linking the study information with national death records, or incidence records (e.g. a national cancer registry), as well as with other record systems (e.g. social security records, driver's licence records) to confirm vital status in those who are not found to have died during the follow-up period.

Although most developed countries have complete systems of death registration, and it is easy in theory to identify all deaths in a particular cohort, this may not be so straightforward in practice. For example, many countries do not have national identification numbers and record linkage may have to be done on the basis of name and date of birth. This may not be infallible because of differences in spelling of names, or inaccuracies in date of birth, but various record linkage programmes are available to identify “near matches” (Jones and Sujansky, 2004). These will be ineffective, however, for people who have changed their name, e.g. because of marriage.

A further problem is that some countries do not have national death registrations, and these may be done on a regional or state basis instead, making it necessary to search multiple registers. Since 1979 a National Death Index for the United States has been compiled and computerized and is available for vital status tracing (Wentworth et al, 1983).

The fact that someone has not been identified in death records does not mean that they are still alive and “at risk”, since they may have emigrated or may not have been identified in death registrations for some other reason. It is therefore desirable to confirm that they are alive using other record sources such as driver's licence records, voter registrations, social security records, etc. In the United States, the Social Security Administration (SSA) records have been frequently used in the past, and in Great Britain the Central Record Office of the Ministry of Pensions and National Insurance is the analogous tracing source (Checkoway et al, 2004).

Coding of the disease outcome

It is not only necessary to determine if and when an event such as a death or hospital admission occurred. It is also necessary to verify, for example, the cause of death, or the cause of a hospital admission. Coding of causes of death should be performed by a nosologist trained in the rules specified by the International Classification of Diseases (ICD) volumes compiled by the World Health Organisation. In many countries this is done routinely for national death registration records, and it is not necessary (or desirable) to recode death registrations for a specific study. However, the ICD codes have changed over time, and when using routine death registration records it is necessary to be aware of which ICD revision was in effect at the time of death.

Person-time

In a study of a specific population, e.g. workers in a particular factory, participants may enter the study on the date that the study starts (1 January 1970), or the date that they first meet the eligibility criteria (i.e. employment for one month), whichever is later. If they started working in the factory after the start of the study, then they would only start being followed on the date they started work (or a subsequent date when they met the eligibility criteria).

They stop contributing person-time when they die (or are diagnosed with the disease in an incidence study), emigrate, are lost to follow-up, or the study finishes (31 December 1999), whichever is earliest.
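The entry and exit rules for person-time described above can be sketched as follows. This is an illustration rather than a definitive implementation: the function name `person_years` is hypothetical, the study dates are those of the factory example in the text, “one month” of employment is approximated as 30 days, and a year is approximated as 365.25 days.

```python
from datetime import date, timedelta

STUDY_START = date(1970, 1, 1)
STUDY_END = date(1999, 12, 31)
# Assumption: the "one month" eligibility criterion is approximated as 30 days.
MIN_EMPLOYMENT = timedelta(days=30)


def person_years(hire_date, death=None, emigration=None, lost=None):
    """Person-years contributed between entry and exit; 0.0 if never at risk."""
    eligible_from = hire_date + MIN_EMPLOYMENT
    # Entry: study start or date eligibility is met, whichever is later.
    entry = max(STUDY_START, eligible_from)
    # Exit: death, emigration, loss to follow-up or study end, whichever
    # is earliest.
    exits = [d for d in (death, emigration, lost) if d is not None]
    exit_date = min(exits + [STUDY_END])
    if exit_date <= entry:
        return 0.0
    return (exit_date - entry).days / 365.25


# Worker hired before the study started and followed to the end of the study.
print(round(person_years(date(1960, 6, 1)), 2))
# Worker hired during the study who died before the study finished.
print(round(person_years(date(1980, 1, 1), death=date(1985, 1, 31)), 2))
```

In an incidence study the date of diagnosis would be passed in place of the date of death, since a participant stops being at risk of a first diagnosis once the disease has occurred.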

Example 9.4

Munk Nielsen et al (2003) studied long-term mortality after poliomyelitis by identifying a group of 5,977 patients diagnosed with poliomyelitis in Copenhagen between 1919 and 1954. This involved a review of more than 80,000 consecutive hospital records for Blegdamshospitalet, which served as the primary centre for diagnosing and treating patients with acute poliomyelitis in the area of greater Copenhagen. Information extracted from the records included name, sex, date and place of birth, date of admission and discharge, and details of the acute severity of the case.

Since 1 April 1968, all Danish citizens have been given a unique identification number, which is recorded in the Danish Civil Registration System (CRS). The cohort was linked to the CRS to identify individual CRS numbers, which were then used to identify deaths in the Danish Cause-of-Death Register. Patients not identified in the CRS were believed to have died or emigrated before 1 April 1968 and for these patients the Cause-of-Death Register was searched for their name and date of birth.

Patients were followed from the initiation of the Cause-of-Death Register in 1943 or the month after the hospital discharge (whichever came later) until the date of death, emigration or 1 May 1997 (whichever came earlier).

There were 1295 deaths compared with an expected number of 1141 (SMR 1.14, 95% CI 1.07-1.20). Excess mortality was restricted to polio patients with a history of severe paralysis of the extremities (SMR = 1.69, 95% CI 1.32-2.15) or patients who had been treated for respiratory failure during the epidemics (SMR = 2.71, 95% CI 2.18-3.37).
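The standardised mortality ratio (SMR) in Example 9.4 is the ratio of observed to expected deaths. The sketch below adds an approximate confidence interval, treating the observed count as Poisson and using a normal approximation on the log scale. That method is my assumption for illustration, and the paper may have used a different one. With the rounded counts quoted in the example, the point estimate comes out at about 1.13 rather than the published 1.14, and the interval is close to the published 1.07-1.20.

```python
import math


def smr_with_ci(observed, expected, z=1.96):
    """SMR with an approximate 95% CI.

    Assumed method: the observed count is treated as Poisson and the
    interval is computed on the log scale with standard error 1/sqrt(observed).
    """
    smr = observed / expected
    se_log = 1.0 / math.sqrt(observed)
    lower = smr * math.exp(-z * se_log)
    upper = smr * math.exp(z * se_log)
    return smr, lower, upper


# Observed and expected deaths as quoted in Example 9.4.
smr, lower, upper = smr_with_ci(1295, 1141)
print(f"SMR = {smr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

For small numbers of observed deaths an exact Poisson interval would usually be preferred to this approximation.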

Summary

Cohort studies provide the most comprehensive approach for evaluating patterns of exposure and disease, since they involve studying the entire source population (assuming that there is a 100% response rate) over the entire risk period.

Thus, the cohort design ideally includes all of the relevant person-time experience of the source population over the risk period. A cohort study may be based on a particular community (e.g. a geographical community), or on a more specific population defined by a particular exposure (e.g. workers in a particular factory). In both instances, an internal comparison would ideally be made between those participants exposed and those participants not exposed to a particular risk factor. However, in some instances, all of the study participants may be exposed, or valid individual exposure information may not be available, and it may be necessary to make an external comparison, e.g. with national mortality rates (in which case the national population comprises the source population for the study). It is important that any comparisons are made over the same risk period, and that follow-up is as complete as possible. The basic effect measures in a cohort study are the rate ratio and risk ratio. Methods of data analysis for these effect measures are described in chapter 12.

References

Checkoway HA, Pearce N, Kriebel D (2004). Research methods in occupational epidemiology. 2nd ed. New York: Oxford University Press.

Hart CL, Hole DJ, Gillis CR, et al (2001). Social class differences in lung cancer mortality: risk factor explanations using two Scottish cohort studies. Int J Epidemiol 30: 268-74.

Jones L, Sujansky W (2004). Patient data matching software: a buyers guide for the health conscious. Oakland, CA: California HealthCare Foundation (www.chcf.org).

Munk Nielsen N, Rostgaard K, Juel K, et al (2003). Long-term mortality after poliomyelitis. Epidemiol 14: 355-60.

Prescott E, Holst C, Gronbaek M, et al (2004). Vital exhaustion as a risk factor for ischaemic heart disease and all-cause mortality in a community sample: a prospective study of 4084 men and 5479 women in the Copenhagen City Heart Study. Int J Epidemiol 33: 990-7.

Rafnsson V, Tulinius H, Jónasson JG, Hrafnkelsson J (2001). Risk of breast cancer in female flight attendants: a population-based study (Iceland). Cancer Causes and Control 12: 95-101.

Wentworth DN, Neaton JD, Rasmussen WL (1983). An evaluation of the Social Security Administration Master Beneficiary Record file and the National Death Index in the ascertainment of vital status. Am J Public Health 73: 1270-1274.

CHAPTER 10: Case-control Studies

As discussed in chapter 2, the only conceptual difference between a full cohort study based on a specified source population and risk period, and an (incidence) case-control study based on the same source population and risk period, is that the latter involves outcome-specific samples of the source population, rather than an analysis of the entire source population. There is usually little loss of precision compared to a full cohort study, and there may be considerable savings in terms of time and expense, particularly if the study disease is rare or has a long induction time.

In this chapter I discuss the practicalities of conducting an (incidence) case-control study.

10.1: Defining the source population and risk period

An incidence case-control study should be based on a specified source population and risk period. The task in such a population-based case-control study is then to identify all cases of the outcome under study that are generated by the source population over the risk period. Controls are then sampled at random from the source population.

In some instances, cases may be identified from a particular disease register which is not comprehensive with respect to the population of any defined geographical area. This may be a formal register (such as a Cancer Register) or a similar data source (e.g. admission records for a particular hospital). In such a registry-based study the task is to identify the source population for the register (e.g. all persons who would have been admitted to the hospital if they had developed the disease under study). This obviously poses more problems in the appropriate selection of controls than is the case for a population-based study; these issues are discussed in more depth below.

Example 10.1

Bigert et al (2001) studied myocardial infarction (MI) among professional drivers. The source population comprised all men aged 45-70 years free of previous MI and living in Stockholm County during 1992-1993. Cases of first MI generated by this source population and risk period were identified from three sources: the medical care units at the 10 emergency hospitals within Stockholm County (87% of the cases), other hospital units (1% of the cases, obtained from a computerized hospital discharge register), or death certificates from the Causes of Death Register at Statistics Sweden (12% of the cases). Controls were selected at random from a computerized population register, stratified for sex, 5-year age-group, hospital catchment area and year of enrolment in the study (1992 or 1993). The 1,067 cases and 1,482 controls completed a postal questionnaire (for fatal cases the questionnaires were completed by next-of-kin). Of the cases, 45 (4.2%) had worked as a bus driver, compared with 31 (2.1%) of the controls, yielding an odds ratio of 2.14 (95% CI 1.34-3.41). The corresponding odds ratios for taxi drivers and truck drivers were 1.88 (95% CI 1.19-2.98) and 1.66 (95% CI 1.22-2.26) respectively. Adjustment for potential confounders gave lower odds ratios: 1.49, 1.34 and 1.10 respectively.
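The odds ratio in a case-control study compares the odds of exposure in cases with the odds of exposure in controls. The sketch below computes a crude odds ratio with a Woolf (log-based) confidence interval from the counts quoted in Example 10.1. It assumes that all cases and controls who had not worked as bus drivers form the comparison group. Under that assumption the crude estimate is about 2.06, which is close to but not identical to the published 2.14; the published analysis may have used a different reference group or taken account of the stratified sampling. The function name `odds_ratio` is hypothetical.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval."""
    estimate = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    se_log = math.sqrt(
        1 / exposed_cases
        + 1 / unexposed_cases
        + 1 / exposed_controls
        + 1 / unexposed_controls
    )
    lower = estimate * math.exp(-z * se_log)
    upper = estimate * math.exp(z * se_log)
    return estimate, lower, upper


# Counts quoted in Example 10.1: 45 of 1,067 cases and 31 of 1,482 controls
# had worked as bus drivers. Assumption: everyone else is the unexposed group.
crude_or, lower, upper = odds_ratio(45, 1067 - 45, 31, 1482 - 31)
print(f"crude OR = {crude_or:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```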

10.2: Selection of cases

In a population-based study the first step in the selection of cases is to attempt to ascertain all cases generated by the source population over the risk period (Checkoway et al, 2004). If complete case ascertainment is not achieved, the relative risk estimate (odds ratio) will not necessarily be biased; bias arises when case ascertainment is associated with exposure history, e.g. if people who are prescribed a particular drug receive more intensive medical screening and are therefore more likely to be diagnosed with a non-fatal myocardial infarction.

In a registry-based study the case-group usually consists of all incident cases occurring in the registry during the risk period. The “registry” could consist of a formal population-based registry (e.g. a cancer registry or birth defects registry), or could involve an ad hoc “registry”, e.g. based on admission records for the major hospitals in a city.

Example 10.2

Mian et al (2001) studied homicide in Orangi, the largest squatter settlement in Karachi, with an estimated population of 1.2 million. They defined the cases as individuals who lived in Orangi and were killed in Orangi between January 1994 and January 1997 due to intentional violence, by firearms, sharp or blunt trauma. Cases were identified in the 15 neighbourhoods (out of 103 in total in Orangi) which field workers identified as the highest violence neighbourhoods. Field workers identified households where they knew someone had been killed; in a few neighbourhoods they also contacted other social organisations in the community to identify further cases. Controls were selected from a random sample of households enrolled in a related study conducted at the same time in the same 15 neighbourhoods. For both cases and controls, the interviews were conducted with their wife, or if she was inaccessible or unwilling they were conducted with the wife of the head of the household. People who were killed were 34 times more likely to have attended all political processions (29% versus 1%, odds ratio (OR) = 34, 95% CI 4-749), 19 times more likely to have attended political meetings (31% versus 2%, OR = 19, 95% CI 4-136), and 17 times more likely to have held an important position in a political party (29% versus 2%, OR = 17, 95% CI 3-120). The authors concluded that homicide in Orangi was political and that efforts to build trust between ethnic groups and to build legitimacy for non-violent forms of conflict resolution are important steps to limit future violence.

10.3: Selection of controls

Control sampling options

As discussed in chapter 2, there are three main options for selection of controls: (i) cumulative sampling involves selecting controls from those who do not experience the outcome during the risk period (i.e. the survivors) and will estimate the incidence odds ratio; (ii) case-cohort sampling involves selecting controls from the entire source population and will estimate the risk ratio; (iii) density sampling involves selecting controls longitudinally throughout the course of the study and will estimate the rate ratio. Density sampling is therefore usually the preferred approach since the rate ratio is usually the effect measure of interest. Fortunately, although case-control studies have traditionally been presented in terms of cumulative sampling (e.g. Cornfield, 1951), most case-control studies actually involve density sampling (Miettinen, 1976), often with matching on a time variable (such as calendar time and/or age), and therefore estimate the rate ratio without the need for any rare disease assumption (Pearce, 1993). In particular, the “standard” population-based case-control design in which all cases occurring in a country (or state or city) in a particular year are compared with a control sample of all other people living in the same country during the same year, actually involves density sampling with calendar year as the “time” matching variable (possibly with additional matching on the additional “time” variable of age).

Sources of controls

In a population-based case-control study, controls are usually sampled at random from the entire source population (perhaps with matching on factors such as age and gender). In some instances, it may be necessary to restrict the source population in order to achieve valid control sampling. For example, if controls are to be selected from voter registration rolls, and these are known to be less than 100% complete for the geographical area under study, then the source population might be restricted to persons appearing on the voter registration roll, and cases that were not registered to vote would be excluded; controls would then be sampled from this redefined source population by taking a random sample of the roll.

In registry-based studies, selection of controls may not be so straightforward because the source population may not be so easy to define and enumerate. For example, if there are two major hospitals in a city, and a study is based on lung cancer admissions in one of them during a defined risk period, then the source population is “all those who would have come to this hospital for treatment if they had developed lung cancer during this risk period”. This population may be difficult to define and enumerate, particularly if cases may also be referred from smaller regional hospitals. The best solution is usually to define a more specific source population (e.g. all people living in the city) and to attempt to identify all cases generated by that source population, e.g. by including admissions from all major hospitals in the city and excluding cases who do not live in the city; controls can then be sampled from that defined source population.

If it proves impossible to define and enumerate the source population, then one possibility is to select controls from people appearing in the same “register” for other health conditions (e.g. admissions to the hospital for other causes). This may not only produce a valid sample of the “source population”, but may also have advantages in making the case and control recall more comparable (Smith et al, 1988). However, it may result in bias if the other health conditions are also caused (or prevented) by the exposure under study (Pearce and Checkoway, 1988). For this reason, the population-based approach is preferable, although registry-based studies may still be valuable when population-based studies are not practicable, provided that careful consideration is given to possible sources of bias.

Matching

In some instances it may be appropriate to match cases and controls on potential confounders (e.g. age and gender). This can be done by 1:1 matching (e.g. for each case, choose a control of the same age and gender) or by frequency matching (e.g. if there are 25 male cases in the 30-34 age-group then choose the same number of male controls for this age-group). It is important to emphasize, however, that this will not remove confounding in a case-control study, but will merely facilitate its control in the analysis. For example, in a case-control study of lung cancer, the cases will generally be relatively old whereas a random general population control sample will be relatively young. This may lead to inefficiencies when age is controlled in the analysis, since the older age-groups will contain many cases and few controls, whereas the younger age-groups will contain many controls and few cases. Matching on age will ensure that there are approximately equal numbers of cases and controls in each age stratum and will thereby improve the precision of the effect estimates (given a fixed number of cases and controls). However, it will not remove confounding by age; it merely makes it easier to control in the analysis (Checkoway et al, 2004).

It is also important to emphasize that if “pair” matching (i.e. 1:1 matching) has been done, then it is important to control for the matching factors in the analysis, but that this need not involve a “matched analysis”. For example, if pair matching has been done on age and gender, then it is important to control for age and gender in the analysis, but this can be done with simple stratification on age (e.g. by five-year age-groups) and gender, and it is not necessary to retain the 1:1 matched pairs in the analysis (Rothman and Greenland, 1998).

There are also potential disadvantages of matching. In particular, matching may actually reduce precision in a case-control study if it is done on a factor that is associated with exposure but is not a risk factor for the disease under study and hence is not a true confounder (Rothman and Greenland, 1998). Furthermore, matching is often expensive and/or time consuming. For these reasons, it is usually sufficient, and preferable, to match only on basic demographic factors such as age and gender, and then to control for other potential confounders (along with age and gender) in the analysis (Checkoway et al, 2004).
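The point that matching must be followed by control for the matching factors in the analysis can be illustrated with a stratified odds ratio. The sketch below uses the Mantel-Haenszel summary estimator, which is one common way of combining strata and is my choice for illustration rather than a method prescribed in this chapter. The counts are hypothetical: controls have been frequency matched to cases within two age strata (50 cases and 50 controls in each).

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio.

    strata: iterable of (a, b, c, d) = (exposed cases, unexposed cases,
    exposed controls, unexposed controls), one tuple per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical frequency-matched data: 50 cases and 50 controls per stratum.
strata = [
    (10, 40, 5, 45),   # younger stratum, stratum-specific OR = 2.25
    (30, 20, 20, 30),  # older stratum, stratum-specific OR = 2.25
]

# Crude odds ratio, ignoring the matching factor.
a, b, c, d = (sum(column) for column in zip(*strata))
crude_or = (a * d) / (b * c)

# Stratified odds ratio, controlling for the matching factor.
mh_or = mantel_haenszel_or(strata)
print(f"crude OR = {crude_or:.2f}, Mantel-Haenszel OR = {mh_or:.2f}")
```

In this constructed example the crude odds ratio (2.00) differs from the stratified estimate (2.25) even though cases and controls are balanced on age, which is why the matching factor still has to be controlled in the analysis.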
age and gender in the analysis, but this

Example 10.3

Cole et al (2000) studied time urgency and risk of non-fatal myocardial infarction (MI) in a study of 340 cases and an equal number of age, sex and community-matched controls. Cases were identified from admissions to the coronary or intensive care units of six suburban Boston hospitals between 1 January 1982 and 31 December 1983. Those eligible for inclusion were white men and women under 76 years old living in the Boston area with no previous history of MI. For each case, a control subject of the same sex and age (± 5 years) was selected at random from the residents’ list of the town in which the patient resided. Each subject was interviewed in his or her home by one of two trained nurse interviewers approximately 8 weeks after discharge from the hospital. A sense of time urgency/impatience was ascertained using four items from the 10-item Framingham Type A scale. A dose-response relation was apparent among subjects who rated themselves higher on the four-item urgency/impatience scale, with a matched odds ratio for non-fatal MI of 4.45 (95% CI 2.20-8.99) comparing those with the highest rating to those with the lowest.
were white men and the hospital. A sense of

10.4: Measuring exposure

Once the cases and controls have been selected, information on previous exposures is then obtained for both groups. As discussed in chapter 8, there are a variety of possible methods for measuring exposure in case-control studies. In some instances this may be from historical records, e.g. personnel records that contain work history information.

Perhaps more commonly, exposure information may be obtained from questionnaires. It is this latter feature of case-control studies which has left them open to criticism as being particularly prone to bias, e.g. because the recall of past exposures (e.g. eating meat, drinking alcohol, spraying pesticides) may be different between cases of disease and healthy controls. However, collecting exposure information from questionnaires is not an inherent feature of case-control studies, and is sometimes also a feature of cohort studies. Thus, there is nothing inherently biased in the case-control design; rather what is important is the validity of the exposure information that is collected, whatever study design is employed.

Summary

The only conceptual difference between a full cohort study based on a specified source population and risk period, and an (incidence) case-control study based on the same source population and risk period, is that the latter involves outcome-specific samples of the source population, rather than an analysis of the entire source population. There is usually little loss of precision compared to a full cohort study, and there may be considerable savings in terms of time and expense, particularly if the study disease is rare or has a long induction time.

The key feature of good case-control study design is that the study should be based on a specified source population and risk period. The tasks are then to: (i) identify all cases generated by the source population over the risk period; (ii) select a random sample of controls from the source population over the risk period (ideally by density matching); (iii) obtain exposure information from cases and controls in a standardised and unbiased manner.

The standard effect estimate in a case-control study is the odds ratio. If controls are selected by density matching, then the odds ratio will estimate the incidence rate ratio (in the source population and risk period) in an unbiased manner without the need for any rare disease assumption. Methods of data analysis for odds ratios are described in chapter 12.

References

Bigert C, Gustavsson P, Hallqvist J, et al (2003). Myocardial infarction among professional drivers. Epidemiol 14: 333-9.

Checkoway HA, Pearce N, Kriebel D (2004). Research methods in occupational epidemiology. New York: Oxford University Press.

Cornfield J (1951). A method of estimating comparative rates from clinical data: applications to cancer of the lung, breast and cervix. JNCI 11: 1269-75.

Mian A, Mahmood SF, Chotani H, Luby S (2001). Vulnerability to homicide in Karachi: political activity as a risk factor. Int J Epidemiol 31: 581-5.

Miettinen OS (1976). Estimability and estimation in case-referent studies. Am J Epidemiol 103: 226-35.

Pearce N (1993). What does the odds ratio estimate in a case-control study? Int J Epidemiol 22: 1189-92.

Pearce N, Checkoway H (1988). Case-control studies using other diseases as controls: problems of excluding exposure-related diseases. Am J Epidemiol 127: 851-6.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Smith AH, Pearce N, Callas PW (1988). Cancer case-control studies with other cancers as controls. Int J Epidemiol 17: 298-306.

CHAPTER 11: Prevalence Studies
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

As discussed in chapter 3, incidence studies are usually the preferred approach, but may be time consuming and expensive, and it may be difficult to identify incident cases of non-fatal chronic conditions such as diabetes. In particular, some degenerative diseases (e.g. chronic bronchitis) may have no clear point of onset.

Thus, in some settings (e.g. developing countries) and for some conditions (e.g. chronic non-fatal disease) prevalence studies may be the only realistic option. Furthermore, in some instances we may be more interested in factors that affect the current burden of disease in the population (i.e. prevalence) rather than disease incidence.

Examples of prevalence surveys include general household surveys conducted by government agencies (e.g. Ministry of Health, 1999), more focussed general population surveys (e.g. Australian Bureau of Statistics, 1998) such as the National Health and Nutrition Examination Survey (NHANES) (Kuczmarski et al, 1994), international surveys of the prevalence of conditions such as asthma (Burney et al, 1994; Asher et al, 1995), and surveys in populations with specific exposures (e.g. surveys of asthma in children living on farms (Braun-Fahrländer et al, 1999)).

In this chapter I discuss the practicalities of conducting a prevalence study.

11.1: Defining the source population

Prevalence studies usually involve surveys in a source population defined by a geographic region or a particular exposure (e.g. an industry or factory). As with an incidence study, it is important that this source population is well-defined and that a high response rate is obtained.

In a prevalence study, disease prevalence is measured at a specific point in time, rather than over a specified risk period. However, this “point in time” may not necessarily be the same “date” for each study participant. For example, studies of congenital malformations usually involve measuring the prevalence of congenital malformations at birth. Thus the source population may be “all babies born in this city during 2004” and the “time” at which prevalence is measured may be “birth”, which will be a different date and time for each member of the source population.

Example 11.1

Wilks et al (1999) conducted a survey of the prevalence of diabetes in the population of Spanish Town, Jamaica. A random population sample was recruited by door-to-door canvassing (n=1,303) and oral glucose tolerance testing was conducted after an overnight fast (response rate = 60%). The prevalence of Type 2 diabetes mellitus was 15.7% among women and 9.8% among men. The sex patterns were consistent with the fourfold excess of diabetes in women compared to men, but obesity could not entirely account for the high prevalences observed, which exceed those previously reported among European populations.

11.2: Measuring health status

Prevalence studies differ from incidence studies in that the measurement of health status most commonly involves a morbidity survey, rather than identifying incident cases through routine records (e.g. hospital admissions or cancer registration records). Methods that can be used for such surveys have already been discussed in chapter 8 and will only be considered briefly here.

As discussed in chapter 8, methods of measuring disease status that are most appropriate in clinical practice may not be appropriate or applicable in epidemiologic surveys. Furthermore, the criteria for deciding the most valid method to use may differ between clinical practice and epidemiological surveys. In the clinical setting the emphasis is often on the positive predictive value of a test, which depends in turn on the sensitivity, specificity, and the underlying population prevalence of the disease.

In fact, if we are using a particular method to measure the prevalence of a disease, and:

Sn = sensitivity

Sp = specificity

P is the true prevalence of the disease in the source population

then the observed prevalence that will be obtained in the survey is:

$$Sn \cdot P + (1 - Sp)(1 - P) = P\,(Sn + Sp - 1) + (1 - Sp)$$

Therefore if two populations are being compared, and their true prevalences (according to the gold standard) are P1 and P0 respectively, then the observed difference in prevalence between the two populations is:

$$(P_1 - P_0)(Sn + Sp - 1)$$

The expression (Sn + Sp - 1) is Youden's Index. When this is equal to 1 (which only occurs when the sensitivity and specificity are both 1) then the observed difference in prevalence will be exactly equal to the true difference in prevalence. More commonly, Youden's Index will be less than 1 and the observed prevalence difference will be reduced accordingly, e.g. if Youden’s Index is 0.75 then the observed prevalence difference will be 0.75 times the true prevalence difference. Youden's Index therefore provides the most appropriate measure of the validity of a particular question or technique in prevalence comparisons (Pekkanen and Pearce, 1999).

In this respect, basic symptom questionnaires may often perform better than supposedly more “objective” measures such as bronchial responsiveness testing (Pearce et al, 1998).

Example 11.2

Table 11.1 shows hypothetical data from a study of asthma prevalence in childhood. The true prevalence rates were 40% in the exposed group, and 20% in the non-exposed group; the true prevalence difference was thus 20%. If 20% of asthmatics are incorrectly classified as non-asthmatics (i.e. a sensitivity of 0.80), and 10% of non-asthmatics are incorrectly classified as asthmatics (i.e. a specificity of 0.90), then the observed prevalences will be 38% and 24% respectively (table 11.1); the observed prevalence difference will then be 14% (instead of the true value of 20%). The net effect is to bias the prevalence difference towards the null value of zero. The extent of the bias is related to Youden’s Index: this is 0.80+0.90-1.0=0.7, and the observed prevalence difference of 14% is 0.7 times the true value of 20%. If the sensitivity and specificity had been perfect (1.0) then Youden’s Index would have been 1.0 and there would have been no reduction in the observed prevalence difference; on the other hand, if the sensitivity and specificity had been no better than chance (e.g. both equal to 0.5) then Youden’s Index would have been zero, and the expected value of the observed prevalence difference would also have been zero (although the observed value might be different from zero due to chance variation).

Table 11.1

Hypothetical data from a prevalence study in which 20% of asthmatics and 10% of non-asthmatics are incorrectly classified

                         Actual                    Observed
                  ----------------------   -----------------------------
                  Exposed   Non-exposed    Exposed         Non-exposed
------------------------------------------------------------------------
Asthmatics           40         20         32 + 6 = 38     16 + 8 = 24
Non-asthmatics       60         80         54 + 8 = 62     72 + 4 = 76
------------------------------------------------------------------------
Total               100        100         100             100
------------------------------------------------------------------------
Prevalence          40%        20%         38%             24%
------------------------------------------------------------------------
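
The arithmetic behind Example 11.2 and Table 11.1 can be checked with a few lines of code. The following is a minimal Python sketch, not part of the original text; the function and variable names are illustrative assumptions, and the inputs are the hypothetical values from the example.

```python
def observed_prevalence(true_prevalence, sensitivity, specificity):
    """Expected observed prevalence: Sn*P + (1 - Sp)*(1 - P)."""
    true_positives = sensitivity * true_prevalence
    false_positives = (1.0 - specificity) * (1.0 - true_prevalence)
    return true_positives + false_positives


def youden_index(sensitivity, specificity):
    """Youden's Index: Sn + Sp - 1."""
    return sensitivity + specificity - 1.0


# Hypothetical values from Example 11.2 / Table 11.1.
sensitivity, specificity = 0.80, 0.90
true_exposed, true_unexposed = 0.40, 0.20

observed_exposed = observed_prevalence(true_exposed, sensitivity, specificity)
observed_unexposed = observed_prevalence(true_unexposed, sensitivity, specificity)
observed_difference = observed_exposed - observed_unexposed

# The same difference obtained as Youden's Index times the true difference.
expected_difference = youden_index(sensitivity, specificity) * (
    true_exposed - true_unexposed
)

print(f"Observed prevalences: {observed_exposed:.2f} and {observed_unexposed:.2f}")
print(f"Observed difference:  {observed_difference:.2f}")
print(f"Youden's Index x true difference: {expected_difference:.2f}")
```

Both routes give the same observed difference, which is the point of the example: non-differential misclassification shrinks the true difference by a factor equal to Youden's Index.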

11.3: Measuring exposure

As discussed in chapter 8, there are a variety of possible methods for measuring exposure in prevalence studies. These include questionnaires, biological measurements, and examination of historical records (e.g. personnel and work history records).

In a full prevalence study, exposure is measured in all members of the source population. In a prevalence case-control study, exposure information is obtained for the cases and for a control sample of non-cases (chapter 3). Thus, a prevalence case-control study can be based on routine records (see example 3.2) or conducted as a second phase of a specific prevalence survey.

Whereas incidence case-control studies involve at least three possible methods of selecting controls, in a prevalence case-control study there is only one valid option, i.e. controls should be selected at random from the non-cases. For both groups, information on historical and current exposures may be obtained, as well as information on potential confounders.

It is important to emphasize that although a prevalence study involves measuring disease status at one point in time, information can be collected on historical exposures. For example, a prevalence survey of bronchitis might involve assessing whether a person has “current bronchitis” on a particular day, but exposure information could be collected for both current smoking and smoking history.

Example 11.3

Guha Mazumder et al (2000) studied arsenic in drinking water and the prevalence of respiratory effects in West Bengal, India. A cross-sectional survey involving 7,683 participants of all ages was conducted in an arsenic-affected region between April 1995 and March 1996. The source population was based on two areas of the arsenic-affected districts south of Calcutta. A convenience sampling strategy was used in which the field team went to the centre of each village and selected the most convenient hamlet to begin sampling; all household members were invited to participate and sampling continued from house to house until sufficient numbers had been recruited. Participants were clinically examined and interviewed, and the arsenic content of their current primary drinking water source was measured. There were few smokers and analyses were confined to non-smokers (6,864 participants). Among both males and females, the prevalence of cough, shortness of breath, and chest sounds (crepitations and/or rhonchi) in the lungs rose with increasing arsenic concentrations in drinking water. In participants with arsenic-related skin lesions, the age-adjusted prevalence odds ratios for cough were 7.8 for females (95% CI 3.1-19.5), and 5.0 for males (95% CI 2.6-9.9); the corresponding findings for chest sounds were 9.6 (95% CI 4.0-22.9) and 6.9 (95% CI 5.8-92.8), and those for shortness of breath were 23.3 (95% CI 5.8-92.8) and 3.7 (95% CI 1.3-10.6). The authors concluded that these results add to evidence that long-term ingestion of arsenic can cause respiratory effects.

Summary

Incidence studies are usually the preferred approach, but in some settings and for some conditions prevalence studies are the only option. Furthermore, in some instances we may be more interested in factors that affect the current burden of disease in the population rather than disease incidence. The conduct of a prevalence study is (at least in theory) relatively straightforward. A source population is defined, and at one point in time the prevalence of disease is measured in the population. Exposure information is then obtained for all members of the source population (a prevalence study), or for all cases of the disease under study and a control sample of the non-cases (a prevalence case-control study). The standard effect estimate in a prevalence study is the odds ratio. Methods of data analysis for odds ratios are described in chapter 12.

References

Asher I, Keil U, Anderson HR, et al (1995). International study of asthma and allergies in childhood (ISAAC): rationale and methods. Eur Resp J 8: 483-91.

Australian Bureau of Statistics (1998). National Nutrition Survey: Nutrient Intakes and Physical Measurements, Australia, 1995. Canberra: Australian Bureau of Statistics.

Braun-Fahrländer CH, Gassner M, Grize L, et al (1999). Prevalence of hay fever and allergic sensitization in farmer’s children and their peers living in the same rural community. Clin Exp Allergy 29: 28-34.

Burney PGJ, Luczynska C, Chinn S, Jarvis D (1994). The European Community Respiratory Health Survey. Eur Resp J 7: 954-60.

Checkoway HA, Pearce N, Kriebel D (2004). Research methods in occupational epidemiology. 2nd ed. New York: Oxford University Press.

Guha Mazumder DN, Haque R, Ghosh N, et al (2000). Arsenic in drinking water and the prevalence of respiratory effects in West Bengal, India. Int J Epidemiol 29: 1047-52.

Kuczmarski RJ, Flegal KM, Campbell SM, Johnson CL (1994). Increasing prevalence of overweight among US adults. The National Health and Nutrition Examination Surveys, 1960 to 1991. JAMA 272: 205-11.

Ministry of Health (1999). Taking the pulse: the 1996/1997 New Zealand Health Survey. Wellington: Ministry of Health.

Pearce N, Beasley R, Burgess C, Crane J (1998). Asthma epidemiology: principles and methods. New York: Oxford University Press.

Pekkanen J, Pearce N (1999). Defining asthma in epidemiological studies. Eur Respir J 14: 951-7.

Wilks R, Rotimi C, Bennett F, et al (1999). Diabetes in the Caribbean: results of a population survey from Spanish Town, Jamaica. Diabetic Medicine 16: 875-83.

Part IV

Analysis and Interpretation of Studies

CHAPTER 12: Data Analysis
(In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005)

In this chapter I describe the basic principles of data analysis in epidemiologic studies including the estimation of effects and calculation of confidence intervals while controlling for potential confounders. I only cover the basic methods for dichotomous exposures and dichotomous health outcomes (chapters 2 and 3) and I do not consider more complex study designs (chapter 4). Readers requiring a more formal and detailed statistical presentation are referred to standard texts (particularly Rothman and Greenland, 1998).

12.1: Basic Principles

Data Management

With the rapid advances in computer technology in recent years, almost any epidemiological study can be analysed on a personal computer (PC). In addition, a wide variety of software is available for data entry, data analysis and graphical presentation of data on PCs (much of which is not available for mainframe computers). One particularly useful package is EPI-INFO (Dean et al, 1990), which is available through WHO (Geneva) and CDC (Atlanta) and can be downloaded from http://www.cdc.gov/epiinfo. This package is particularly useful for data entry and editing, and can be used on small laptop computers in the field as well as on desktop computers. However, the same facilities are available in many other packages, some of which are more sophisticated both statistically and in terms of data management (e.g. Stata (Hills and De Stavola, 2002)). A catalogue of epidemiological resources, including epidemiological software which is available free of charge, or at minimal cost, has been produced by the Epidemiology Monitor, and this publication also has a regular feature reviewing such software (see the Epidemiology Monitor Website at http://www.epimonitor.net/). There is also an excellent epidemiology Excel spreadsheet (Episheet) available, which can be used to do most of the analyses described in this chapter (Rothman, 2002). It can be downloaded from http://www.oup-usa.org/epi/rothman/.

Given the huge amount of work usually involved in collecting data for epidemiologic studies, it is essential to examine the raw data very carefully for errors and to make every attempt to avoid errors in the transfer of data from questionnaires onto the computer. In most cases, the first step is to translate some of the information into numerical (or alphabetical) codes, following a set

of coding instructions that should have been prepared prior to data collection. For instance, a detailed occupational history may have been taken in a semi-narrative form, and must be subsequently coded. It is usually preferable to do this when entering the data directly onto a PC, since this minimizes transcription errors.

Once the data are coded and entered, programmes should be run that seek strange data, contradictions, and impossible data (e.g. a systolic blood pressure of 40 mm Hg). These programmes should not be restricted to a search for logic errors or impermissible symbols. They should also include procedures that identify values that lie outside plausible limits. The values being queried should be listed, and decisions on how the "errors" are dealt with should be documented. With many packages, this process can be conducted during the actual data entry since the range of permissible values (for numeric variables) or legal codes (for alphanumeric variables) can be specified, as well as variables which must not be left blank, conditional jumps (e.g. if the answer is "NO" the computer skips to the next relevant question), repeat fields (so that the value of a variable is set by default to that of the last record entered or displayed), and logical links between variables. The best method of data checking is to enter all of the data twice, and to compare the two files for discrepancies. This approach, combined with extensive edit checks at the time of data entry, should minimize errors.

Even with double data entry and sophisticated checking procedures, errors may occur, and it is therefore important to run further edit checks before data analysis begins. It is particularly important to finish all edit checks and to have a final version of the data file before any data analysis is done, both to avoid confusion, and also to avoid any possibility of the data coding and checking being influenced by the results of preliminary analyses. Once the data have been entered and edited, there is usually a major task of data management. This typically involves the use of a computer package to transform the data, compute new variables, and prepare new files suitable for statistical analysis.

Data Analysis

The basic aim of the analysis of a single study is to estimate the effect of exposure on the outcome under study while controlling for confounding and minimizing other possible sources of bias. In addition, when confounding and other sources of bias cannot be removed, then it is important to assess their likely strength and direction. This latter task was discussed in chapter 7. In this chapter I focus on the control of confounding.

Effect estimation

The basic effect measures, and methods of controlling confounding, are described below. Usually, in epidemiologic studies, we wish to measure the difference in disease occurrence between groups exposed and not exposed to a particular factor.

The analysis ideally should control simultaneously for all confounding factors. Control of confounding in the analysis involves stratifying the data according to the levels of the confounder(s) and calculating an effect estimate which summarizes the information across strata of the confounder(s). For example, controlling for age (grouped into 5 categories) and gender (with 2 categories) might involve grouping the data into the 10 (= 5 x 2) confounder strata and calculating a

summary effect estimate which is a weighted average of the stratum-specific effect estimates.

Confidence intervals

As well as estimating the effect of an exposure, it is also important to estimate the statistical precision of the effect estimate. The confidence interval (usually the 95% confidence interval) provides a range of values in which it is plausible (provided that there is no uncontrolled confounding or other bias) that the true effect may lie. If the statistical model is correct, and there is no bias, then the confidence intervals derived from an infinite series of study repetitions would contain the true effect with a frequency no less than its confidence level (Rothman and Greenland, 1998).

The usual practice is to use 90% or 95% confidence intervals, but these values are completely arbitrary. Given a large enough sample, an approximate 95% confidence interval for the true population mean is:

$$m \pm 1.96\,SE$$

where m is the observed mean of the sample, and SE is its standard error, estimated from the standard deviation of the sample divided by the square root of the sample size.

This confidence interval depends on two quantities (m and SE) which are estimated from the sample itself, and different results will be obtained from different samples. Provided that the samples are sufficiently large, then 95% of the time, the confidence interval estimated from the sample would contain the true population mean. One should note, however, that this is no guarantee that the interval from one’s data contains the true value.

In most instances, epidemiologic data involve binomial (i.e. with persons in the denominator) or Poisson (i.e. with person-years in the denominator) outcome variables and ratio measures of effect. The estimated relative risk (rate ratio, risk ratio, odds ratio) has an approximate log normal distribution, and the ln(RR) can be written as the difference of the logarithms of the two compared risks:

$$\ln(RR) = \ln(R_1/R_0) = \ln(R_1) - \ln(R_0)$$

Thus (assuming no bias) the 95% confidence interval for the natural log (ln) of the relative risk is:

$$\ln(RR) \pm 1.96\,SE$$

Thus the confidence interval for the relative risk itself is:

$$RR \cdot e^{\pm 1.96\,SE}$$

P-Values

As discussed in chapter 5, the p-value is the probability that a test statistic as large as or larger than that observed could have arisen by chance if there is no bias and if the null hypothesis (of no association between exposure and disease) is correct. The test statistic defines the p-value and usually has the form:

$$z = D/SE$$

where D is the observed difference and SE is the standard error of the difference.

This provides a test statistic (z) which can be used to calculate the probability (p-value) that a difference as large as that observed would have occurred by

chance if the null hypothesis (that there is no difference in reality) were true.

In the past, p-values have often been used to describe the results of a study as "significant" or "not significant" on the basis of decision rules involving an arbitrary alpha level as a “cutoff” for significance (e.g. alpha=0.05). However, it is now recognised that there are major problems with this approach (Rothman and Greenland, 1998).

First, the p-value associated with a difference in outcome between two groups depends on two factors: the size of the difference; and the size of the study. A very small difference may be statistically significant if the study is very large, whereas a very large difference may not be significant if the study is very small. p-values thus combine two phenomena which should be kept separate: the size of the effect; and the size of the study used to measure it.

A second problem with significance testing is more fundamental. The purpose of significance testing is to reach a decision. However, in environmental research, decisions should ideally not be based on the results of a single study, but should be based on information from all available studies, as well as non-statistical considerations such as the plausibility and coherence of the effect in the light of current theoretical and empirical knowledge (see chapter 13).

The problems of significance testing can be avoided by recognizing that the principal aim of an individual study should be to estimate the size of the effect rather than just to decide whether or not an effect is present. The point estimate should be accompanied by a confidence interval (the interval estimate) which indicates the precision of the point estimate by providing a range of values within which it is most plausible that the true treatment effect may lie if no bias were present (Gardner and Altman, 1986; Rothman and Greenland, 1998). The point estimate reflects the size of the effect, whereas the confidence interval reflects the study size on which this effect estimate is based. This approach also facilitates the comparison of the study findings with those of previous studies. Note that all conventional statistical methods assume “no bias is present”. Because this assumption is rarely if ever correct, further considerations beyond the statistics presented here are always needed (see chapter 13).
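
The large-sample formulas in this section (m ± 1.96 SE, the interval for a ratio measure calculated on the log scale, and the p-value corresponding to z = D/SE) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the numbers in the usage lines are assumptions made for the example, and the normal approximation is only appropriate when samples are reasonably large.

```python
import math


def mean_confidence_interval(values, z=1.96):
    """Approximate confidence interval for a mean: m +/- z * SE."""
    n = len(values)
    m = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))
    se = sd / math.sqrt(n)
    return m - z * se, m + z * se


def ratio_confidence_interval(rr, se_log_rr, z=1.96):
    """Interval for a ratio measure: RR * exp(+/- z * SE[ln(RR)])."""
    return rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr)


def two_sided_p_value(z_statistic):
    """Two-sided p-value for a standard normal test statistic z = D/SE."""
    return math.erfc(abs(z_statistic) / math.sqrt(2.0))


# Hypothetical inputs, chosen only to exercise the functions.
print(mean_confidence_interval([1.0, 2.0, 3.0, 4.0, 5.0]))
print(ratio_confidence_interval(2.0, 0.1))
print(two_sided_p_value(1.96))
```

Because the ratio interval is symmetric on the log scale, its lower and upper limits multiply to the square of the point estimate. A z statistic of 1.96 corresponds to a two-sided p-value of approximately 0.05, which is why 1.96 appears in the 95% confidence interval.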

12.2: Basic Analyses

Measures of Disease Occurrence

The basic measures of disease occurrence and association have been introduced in chapter 2. In this section I consider them in more depth and show how to calculate confidence intervals for the commonly used measures. In the next section I extend these methods to adjust for potential confounders. I will only present “large sample” methods of analysis which have sample size requirements for valid use. To avoid statistical bias, more complex techniques are required for analyses of studies involving very small numbers or sparse stratifications (Greenland et al, 2000). Once again, readers are referred to standard texts (particularly Rothman and Greenland, 1998) for a more comprehensive review of these methods. I will emphasise confidence intervals, but will also present methods for calculating p-values.

Table 12.1 shows the findings of a hypothetical incidence study of 20,000 persons followed for 10 years. As noted in chapter 2, three measures of disease incidence are commonly used in incidence studies.

The observed incidence rate in the non-exposed group (table 9.1) has the form:

$$I_0 = \frac{\text{cases}}{\text{person-time}} = \frac{b}{Y_0}$$

Table 12.1

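Findings from a hypothetical cohort study of 20,000 persons followed for 10 years

                                       Exposed        Non-exposed    Ratio
---------------------------------------------------------------------------
Cases                                   1,813 (a)        952 (b)
Non-cases                               8,187 (c)      9,048 (d)
Initial population size                10,000 (N1)    10,000 (N0)
Person-years                           90,635 (Y1)    95,163 (Y0)
---------------------------------------------------------------------------
Incidence rate                         0.0200 (I1)    0.0100 (I0)    2.00
Incidence proportion (average risk)    0.1813 (R1)    0.0952 (R0)    1.90
Incidence odds                         0.2214 (O1)    0.1052 (O0)    2.11
---------------------------------------------------------------------------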
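
The rates, proportions and odds in Table 12.1 follow directly from the cell counts and person-years. The short Python sketch below recomputes them; the variable names are illustrative assumptions that mirror the table's notation.

```python
# Cell counts and person-years from Table 12.1.
a, b = 1813, 952            # cases: exposed, non-exposed
c, d = 8187, 9048           # non-cases: exposed, non-exposed
n1, n0 = 10000, 10000       # initial population sizes
y1, y0 = 90635, 95163       # person-years at risk

# The three measures of disease occurrence in each group.
rate_exposed, rate_unexposed = a / y1, b / y0    # incidence rates (I1, I0)
risk_exposed, risk_unexposed = a / n1, b / n0    # incidence proportions (R1, R0)
odds_exposed, odds_unexposed = a / c, b / d      # incidence odds (O1, O0)

# The corresponding ratio measures of effect.
rate_ratio = rate_exposed / rate_unexposed
risk_ratio = risk_exposed / risk_unexposed
odds_ratio = odds_exposed / odds_unexposed

print(f"Incidence rates:       {rate_exposed:.4f}, {rate_unexposed:.4f}")
print(f"Incidence proportions: {risk_exposed:.4f}, {risk_unexposed:.4f}")
print(f"Incidence odds:        {odds_exposed:.4f}, {odds_unexposed:.4f}")
print(f"Rate ratio {rate_ratio:.3f}, risk ratio {risk_ratio:.3f}, "
      f"odds ratio {odds_ratio:.3f}")
```

Computed from the integer cell counts, the odds ratio is about 2.105; the 2.11 shown in the table presumably reflects values that were not rounded to whole persons.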

The natural logarithm of I0 has an approximate standard error (under the Poisson model for random variation in b) of:

$$SE[\ln(I_0)] = (1/b)^{0.5}$$

and an approximate 95% confidence interval for the incidence rate is thus:

$$I_0 \, e^{\pm 1.96\,SE}$$

The observed incidence proportion in the non-exposed group has the form:

$$R_0 = \frac{\text{cases}}{\text{persons}} = \frac{b}{N_0}$$

Its logarithm has an approximate standard error (under the binomial model for random variation in b) of:

$$SE[\ln(R_0)] = (1/b - 1/N_0)^{0.5}$$

and an approximate 95% confidence interval for the incidence proportion is thus:

$$R_0 \, e^{\pm 1.96\,SE}$$

The observed incidence odds in the non-exposed group has the form:

$$O_0 = \frac{\text{cases}}{\text{non-cases}} = \frac{b}{d}$$

The natural log of the incidence odds (ln(O0)) has (under a binomial model) an approximate standard error of:

$$SE[\ln(O_0)] = (1/b + 1/d)^{0.5}$$

and a 95% confidence interval for O0 is:

$$O_0 \, e^{\pm 1.96\,SE}$$

These three measures of disease occurrence all involve the same numerator: the number of incident cases of disease (b). They differ in whether their denominators represent person-years at risk (Y0), persons at risk (N0), or survivors (d).

Measures of Effect

Corresponding to these three measures of disease occurrence, there are three principal ratio measures of effect which can be used in incidence studies: the rate ratio, the risk ratio, and the odds ratio. In incidence case-control studies, the measure of effect is always the odds ratio (though what this is estimating depends on how the controls were chosen). In prevalence studies, the effect measure is usually the prevalence odds ratio, and the statistical methods are identical to those used in incidence case-control studies.

The observed incidence rate ratio has the form (table 12.1):

$$RR = \frac{I_1}{I_0} = \frac{a/Y_1}{b/Y_0}$$

An approximate p-value for the null hypothesis that the rate ratio equals the null value of 1.0 can be obtained using the person-time version of the Mantel-Haenszel chi-square (Breslow and Day, 1987). This test statistic compares the observed number of exposed cases with the number expected under the null hypothesis that I1 = I0:

$$\chi^2 = \frac{[\mathrm{Obs}(a) - \mathrm{Exp}(a)]^2}{\mathrm{Var}(\mathrm{Exp}(a))} = \frac{[a - Y_1 M_1/T]^2}{M_1 Y_1 Y_0/T^2}$$

where M1, Y1, Y0 and T are as depicted in table 12.1.
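
As a worked illustration, the confidence intervals for the three occurrence measures in the non-exposed group, and the person-time Mantel-Haenszel chi-square for the rate ratio, can be computed from the Table 12.1 data. The Python sketch below makes two labelled assumptions: M1 is taken as the total number of cases (a + b) and T as the total person-time (Y1 + Y0). The helper name is hypothetical, and the one degree-of-freedom p-value is obtained from the standard normal tail via `math.erfc`.

```python
import math


def log_scale_interval(estimate, se_log, z=1.96):
    """Interval of the form estimate * exp(+/- z * SE) used for rates, risks, odds."""
    return estimate * math.exp(-z * se_log), estimate * math.exp(z * se_log)


# Data from Table 12.1.
a, b = 1813, 952
d = 9048
n0 = 10000
y1, y0 = 90635, 95163

# Occurrence measures in the non-exposed group with approximate 95% intervals.
rate0 = b / y0
rate0_ci = log_scale_interval(rate0, math.sqrt(1 / b))            # Poisson model
risk0 = b / n0
risk0_ci = log_scale_interval(risk0, math.sqrt(1 / b - 1 / n0))   # binomial model
odds0 = b / d
odds0_ci = log_scale_interval(odds0, math.sqrt(1 / b + 1 / d))    # binomial model

# Person-time Mantel-Haenszel chi-square for the rate ratio.
m1 = a + b            # assumption: M1 = total number of cases
t = y1 + y0           # assumption: T = total person-time
expected_a = y1 * m1 / t
variance_a = m1 * y1 * y0 / t ** 2
chi_square = (a - expected_a) ** 2 / variance_a

# Upper tail of a chi-square distribution with one degree of freedom.
p_value = math.erfc(math.sqrt(chi_square / 2.0))

print(f"I0 = {rate0:.4f}, 95% CI {rate0_ci[0]:.4f} to {rate0_ci[1]:.4f}")
print(f"R0 = {risk0:.4f}, 95% CI {risk0_ci[0]:.4f} to {risk0_ci[1]:.4f}")
print(f"O0 = {odds0:.4f}, 95% CI {odds0_ci[0]:.4f} to {odds0_ci[1]:.4f}")
print(f"Chi-square = {chi_square:.1f}, p = {p_value:.3g}")
```

With a study of this size and a rate ratio of about 2, the chi-square is very large and the p-value is vanishingly small, which illustrates why the text emphasises the estimate and its interval rather than the p-value.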

The natural logarithm of the rate ratio has (under a Poisson model for a
and b) an approximate standard error of:

$$SE[\ln(RR)] = (1/a + 1/b)^{0.5}$$

An approximate 95% confidence interval for the rate ratio is then given
by (Rothman and Greenland, 1998):

$$RR \, e^{\pm 1.96\,SE}$$

The risk ratio has the form:

$$RR = \frac{R_1}{R_0} = \frac{a/N_1}{b/N_0}$$

An approximate p-value for the null hypothesis that the risk ratio equals
the null value of 1.0 can be obtained using the Mantel-Haenszel
chi-square (Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[Obs(a) - Exp(a)]^2}{Var(Exp(a))} = \frac{[a - N_1 M_1/T]^2}{M_1 M_0 N_1 N_0 / [T^2 (T-1)]}$$

where M1, M0, N1, N0 and T are as depicted in table 9.1.

The natural logarithm of the risk ratio has (under a binomial model for a
and b) an approximate standard error of:

$$SE[\ln(RR)] = (1/a - 1/N_1 + 1/b - 1/N_0)^{0.5}$$

An approximate 95% confidence interval for the risk ratio is then given
by:

$$RR \, e^{\pm 1.96\,SE}$$

The incidence odds ratio has the form:

$$OR = \frac{O_1}{O_0} = \frac{a/c}{b/d} = \frac{ad}{bc}$$

An approximate p-value for the hypothesis that the odds ratio equals the
null value of 1.0 can be obtained from the Mantel-Haenszel chi-square
(Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[Obs(a) - Exp(a)]^2}{Var(Exp(a))} = \frac{[a - N_1 M_1/T]^2}{M_1 M_0 N_1 N_0 / [T^2 (T-1)]}$$

where M1, M0, N1, N0 and T are as depicted in table 9.1.

The natural logarithm of the odds ratio has (under a binomial model) an
approximate standard error of:

$$SE[\ln(OR)] = (1/a + 1/b + 1/c + 1/d)^{0.5}$$

An approximate 95% confidence interval for the odds ratio is then given
by:

$$OR \, e^{\pm 1.96\,SE}$$
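
As a rough illustration, the ratio measures, standard errors and
confidence limits described above can be computed directly from the cell
counts. The sketch below is not part of the original text. It assumes
that a and b are the exposed and non-exposed cases and c and d the
corresponding non-cases (so that N1 = a + c, N0 = b + d, M1 = a + b and
M0 = c + d), which is the layout implied by the formulas; the function
names and the example counts are hypothetical.

```python
import math

Z = 1.96  # multiplier for an approximate 95% interval


def ratio_ci(estimate, se_log):
    """Limits estimate * exp(-Z*SE) and estimate * exp(+Z*SE)."""
    return estimate * math.exp(-Z * se_log), estimate * math.exp(Z * se_log)


def rate_ratio(a, b, y1, y0):
    """Rate ratio and SE[ln(RR)] = (1/a + 1/b)^0.5 (Poisson model)."""
    return (a / y1) / (b / y0), math.sqrt(1 / a + 1 / b)


def risk_ratio(a, b, n1, n0):
    """Risk ratio and SE[ln(RR)] = (1/a - 1/N1 + 1/b - 1/N0)^0.5."""
    return (a / n1) / (b / n0), math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)


def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc and SE[ln(OR)] = (1/a + 1/b + 1/c + 1/d)^0.5."""
    return (a * d) / (b * c), math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def mh_chi_square(a, b, c, d):
    """Single-table Mantel-Haenszel chi-square and its 1-df p-value."""
    n1, n0, m1, m0 = a + c, b + d, a + b, c + d
    t = n1 + n0
    expected = n1 * m1 / t
    variance = m1 * m0 * n1 * n0 / (t ** 2 * (t - 1))
    chi2 = (a - expected) ** 2 / variance
    # For one degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts: 30 cases among 100 exposed, 15 among 100 non-exposed.
a, b, c, d = 30, 15, 70, 85
rr, se_rr = risk_ratio(a, b, a + c, b + d)
or_, se_or = odds_ratio(a, b, c, d)
chi2, p = mh_chi_square(a, b, c, d)
print("risk ratio", round(rr, 2), ratio_ci(rr, se_rr))
print("odds ratio", round(or_, 2), ratio_ci(or_, se_or))
print("chi-square", round(chi2, 2), "p", round(p, 4))
```

The interval is computed on the log scale and then exponentiated, which
is why the limits are not symmetric around the point estimate.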

12.3: Control of Confounding

In general, control of confounding requires careful use of a priori
knowledge, together with assessment of the extent to which the effect
estimate changes when the factor is controlled in the analysis. Most
epidemiologists prefer to make a decision based on the latter criterion,
although it can be misleading, particularly if misclassification is
present (Greenland and Robins, 1985a). The decision to control for a
presumed confounder can certainly be made with more confidence if there
is supporting prior knowledge that the factor is predictive of disease.

There are two methods of calculating a summary effect estimate to control
confounding: pooling and standardisation (Rothman and Greenland, 1998).

Pooling

Pooling involves calculating a summary effect estimate assuming
stratum-specific effects are equal. There are a number of different
methods of obtaining pooled effect estimates, but a commonly used method
which is both simple and close to being statistically optimal (even when
there are small numbers in all strata) is the method of Mantel and
Haenszel (1959).

The Mantel-Haenszel summary rate ratio has the form:

$$RR = \frac{\sum a_i Y_{0i}/T_i}{\sum b_i Y_{1i}/T_i}$$

where Ti = Y1i + Y0i

An approximate p-value for the null hypothesis that the summary rate
ratio is 1.0 can be obtained from the person-time version of the one
degree-of-freedom Mantel-Haenszel summary chi-square (Shore et al, 1976):

$$\chi^2 = \frac{[\sum Obs(a) - \sum Exp(a)]^2}{\sum Var(Exp(a))} = \frac{[\sum a_i - \sum Y_{1i} M_{1i}/T_i]^2}{\sum M_{1i} Y_{1i} Y_{0i}/T_i^2}$$

where M1i, Y1i, Y0i and Ti are as depicted in table 12.1.

An approximate standard error for the natural log of the rate ratio is
(Greenland and Robins, 1985b):

$$SE = \frac{[\sum M_{1i} Y_{1i} Y_{0i}/T_i^2]^{0.5}}{[(\sum a_i Y_{0i}/T_i)(\sum b_i Y_{1i}/T_i)]^{0.5}}$$

Thus, an approximate 95% confidence interval for the summary rate ratio
is then given by:

$$RR \, e^{\pm 1.96\,SE}$$

The Mantel-Haenszel summary risk ratio has the form:

$$RR = \frac{\sum a_i N_{0i}/T_i}{\sum b_i N_{1i}/T_i}$$
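
The Mantel-Haenszel summary rate ratio, its chi-square and its confidence
interval, as given above, can be sketched in a few lines. This is an
illustration only and is not part of the original text: the function
name, the tuple layout (a_i, b_i, Y1_i, Y0_i) and the two strata of
example data are hypothetical.

```python
import math


def mh_rate_ratio(strata, z=1.96):
    """Mantel-Haenszel summary rate ratio across strata.

    strata: iterable of (a_i, b_i, Y1_i, Y0_i), i.e. exposed cases,
    non-exposed cases, exposed person-time, non-exposed person-time.
    Returns (RR, chi-square, SE of ln(RR), (lower, upper)).
    """
    num = den = observed = expected = variance = 0.0
    for a, b, y1, y0 in strata:
        t = y1 + y0   # T_i = Y1_i + Y0_i
        m1 = a + b    # M1_i: total cases in the stratum
        num += a * y0 / t
        den += b * y1 / t
        observed += a
        expected += y1 * m1 / t
        variance += m1 * y1 * y0 / t ** 2
    rr = num / den
    chi2 = (observed - expected) ** 2 / variance
    se = math.sqrt(variance) / math.sqrt(num * den)
    return rr, chi2, se, (rr * math.exp(-z * se), rr * math.exp(z * se))


# Hypothetical data: two strata, each with a stratum-specific rate ratio of 2.
strata = [(20, 10, 1000.0, 1000.0), (8, 2, 400.0, 200.0)]
rr, chi2, se, (lower, upper) = mh_rate_ratio(strata)
print(round(rr, 2), round(chi2, 2), round(lower, 2), round(upper, 2))
```

With a single stratum the summary estimate and its standard error reduce
to the crude rate ratio and (1/a + 1/b)^0.5, which is a convenient check
on any implementation.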

An approximate p-value for the hypothesis that the summary risk ratio is
1.0 can be obtained from the one degree-of-freedom Mantel-Haenszel
summary chi-square (Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[\sum Obs(a) - \sum Exp(a)]^2}{\sum Var(Exp(a))} = \frac{[\sum a_i - \sum N_{1i} M_{1i}/T_i]^2}{\sum M_{1i} M_{0i} N_{1i} N_{0i} / [T_i^2 (T_i - 1)]}$$

where M1i, M0i, N1i, N0i and Ti are as depicted in table 9.1.

An approximate standard error for the natural log of the risk ratio is
(Greenland and Robins, 1985b):

$$SE = \frac{[\sum M_{1i} N_{1i} N_{0i}/T_i^2 - \sum a_i b_i/T_i]^{0.5}}{[(\sum a_i N_{0i}/T_i)(\sum b_i N_{1i}/T_i)]^{0.5}}$$

Thus, an approximate 95% confidence interval for the summary risk ratio
is then given by:

$$RR \, e^{\pm 1.96\,SE}$$

The Mantel-Haenszel summary odds ratio has the form:

$$OR = \frac{\sum a_i d_i/T_i}{\sum b_i c_i/T_i}$$

An approximate p-value for the hypothesis that the summary odds ratio is
1.0 can be obtained from the one degree-of-freedom Mantel-Haenszel
summary chi-square (Mantel and Haenszel, 1959):

$$\chi^2 = \frac{[\sum Obs(a) - \sum Exp(a)]^2}{\sum Var(Exp(a))} = \frac{[\sum a_i - \sum N_{1i} M_{1i}/T_i]^2}{\sum M_{1i} M_{0i} N_{1i} N_{0i} / [T_i^2 (T_i - 1)]}$$

where M1i, M0i, N1i, N0i and Ti are as depicted in table 12.1.

An approximate standard error for the natural log of the odds ratio
(under a binomial or hypergeometric model) is (Robins et al, 1986):

$$SE = \left[ \frac{\sum PR}{2R_+^2} + \frac{\sum (PS + QR)}{2R_+ S_+} + \frac{\sum QS}{2S_+^2} \right]^{0.5}$$

where:  P = (ai + di)/Ti
        Q = (bi + ci)/Ti
        R = aidi/Ti
        S = bici/Ti
        R+ = ΣR
        S+ = ΣS

Thus, an approximate 95% confidence interval for the summary odds ratio
is then given by:

$$OR \, e^{\pm 1.96\,SE}$$

Standardisation

Standardisation is an alternative approach to obtaining a summary effect
estimate (Miettinen, 1974; Rothman and Greenland, 1998). Pooling involves
calculating the effect estimate under the assumption that the measure
(e.g. the rate ratio) would be the same (uniform) across strata if random
error were absent. In contrast, standardisation involves taking a
weighted average of the disease occurrence across strata (e.g. the
standardized rate) and then comparing the standardized occurrence measure
between exposed and non-exposed (e.g. the standardized rate ratio) with
no assumptions of uniformity of effect. Standardisation is more prone
than pooling to suffer from statistical instability due to small numbers in
specific strata; by comparison, pooling with Mantel-Haenszel estimators
is robust and in general its statistical stability depends on the overall
numbers rather than the numbers in specific strata. However, direct
standardisation has practical advantages when more than two groups are
being compared, e.g. when comparing multiple exposure groups or making
comparisons between multiple countries or regions, and does not require
the assumption of constant effects across strata.

The standardized rate has the form:

$$R = \frac{\sum w_i R_i}{\sum w_i}$$

The natural log of the standardized rate has an approximate standard
error (under the Poisson model for random error) of:

$$SE = \frac{[\sum w_i^2 R_i/Y_i]^{0.5}}{R \sum w_i}$$

where Yi is the person-time in stratum i. An approximate 95% confidence
interval for the standardized rate is thus:

$$R \, e^{\pm 1.96\,SE}$$

The standardized risk has the form:

$$R = \frac{\sum w_i R_i}{\sum w_i}$$

The natural log of the standardized risk has an approximate standard
error (under the binomial model for random error) of:

$$SE = \frac{[\sum w_i^2 R_i (1 - R_i)/N_i]^{0.5}}{R \sum w_i}$$

where Ni is the number of persons in stratum i. An approximate 95%
confidence interval for the standardized risk is thus:

$$R \, e^{\pm 1.96\,SE}$$

Standardisation is not usually used for odds, since the odds is only used
in the context of a case-control study, where the odds ratio is the
effect measure of interest, but standardized odds ratios can be computed
from case-control data (Miettinen, 1985; Rothman and Greenland, 1998).

A common choice of weights in international comparisons is Segi's World
Population (Segi, 1960) shown in table 12.2, although it does reflect a
“developed countries” bias in its age structure. In etiologic studies a
better approach is to use the structure of the overall source population
as the weights when calculating standardized rates or risks in subgroups
of the source population. When one is specifically interested in the
effects that exposure had, or would have, on a particular subpopulation,
then weights should be taken from that subpopulation.

Multiple Regression

Multiple regression allows for the simultaneous control of more
confounders by "smoothing" the data across confounder strata. In
particular, rate ratios (based on person-time data) can be modelled using
Poisson
log-linear rate regression, risk ratios can be modelled using binomial
log-linear risk regression, and odds ratios can be modelled using
binomial logistic regression (Pearce et al, 1988; Rothman and Greenland,
1998).

Similarly, continuous outcome variables (e.g. in a cross-sectional study)
can be modelled with standard multiple linear regression methods. These
models all have similar forms, with minor variations to take into account
the different data types. They provide powerful tools when used
appropriately, but are often used inappropriately, and should always be
used in combination with the more straightforward methods presented here
(Rothman and Greenland, 1998). Mathematical modelling methods and issues
are reviewed in depth in a number of standard texts (e.g. Breslow and
Day, 1980, 1987; Checkoway et al, 2004; Clayton and Hills, 1993; Rothman
and Greenland, 1998), and will not be discussed in detail here.

Table 12.2

Segi’s World population

Age-group       Population
-----------------------------
0-4 years         12,000
5-9 years         10,000
10-14 years        9,000
15-19 years        9,000
20-24 years        8,000
25-29 years        8,000
30-34 years        6,000
35-39 years        6,000
40-44 years        6,000
45-49 years        6,000
50-54 years        5,000
55-59 years        4,000
60-64 years        4,000
65-69 years        3,000
70-74 years        2,000
75-79 years        1,000
80-84 years          500
85+ years            500
-----------------------------
Total            100,000
-----------------------------
Source: Segi (1960)
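
Direct standardisation with the weights of table 12.2 can be sketched as
follows. The code is an illustration only, not part of the original
text: it applies the standardized rate and the Poisson-model standard
error given earlier in this section, and the function name and the
age-specific cases and person-time are hypothetical.

```python
import math

# Segi's world population weights (table 12.2), age groups 0-4 ... 85+.
SEGI = [12000, 10000, 9000, 9000, 8000, 8000, 6000, 6000, 6000,
        6000, 5000, 4000, 4000, 3000, 2000, 1000, 500, 500]


def standardized_rate(cases, person_time, weights=SEGI, z=1.96):
    """Directly standardized rate with an approximate 95% interval."""
    rates = [d / y for d, y in zip(cases, person_time)]
    w_total = sum(weights)
    r = sum(w * ri for w, ri in zip(weights, rates)) / w_total
    # SE of ln(R), Poisson model: [sum w_i^2 R_i / Y_i]^0.5 / (R * sum w_i)
    se = math.sqrt(sum(w ** 2 * ri / y
                       for w, ri, y in zip(weights, rates, person_time)))
    se /= r * w_total
    return r, (r * math.exp(-z * se), r * math.exp(z * se))


# Hypothetical age-specific data for an exposed and a non-exposed group.
person_time = [5000.0] * 18
cases_exposed = [1, 1, 1, 2, 2, 3, 3, 4, 5, 6, 8, 10, 12, 14, 16, 18, 20, 22]
cases_unexposed = [1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11]

r1, ci1 = standardized_rate(cases_exposed, person_time)
r0, ci0 = standardized_rate(cases_unexposed, person_time)
print("standardized rates:", r1, r0)
print("standardized rate ratio:", round(r1 / r0, 2))
```

Because both groups are weighted to the same age structure, the ratio of
the two standardized rates is not distorted by differences in age
distribution between them.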

Summary

The basic aim of the analysis of a single study is to estimate the effect
of exposure on the outcome under study while controlling for confounding
and minimizing other possible sources of bias. In addition, when
confounding and other sources of bias cannot be removed, then it is
important to assess their likely strength and direction. Control of
confounding in the analysis involves stratifying the data according to
the levels of the confounder(s) and calculating an effect estimate which
summarizes the information across strata of the confounder(s). In
general, control of confounding requires careful use of a priori
knowledge, together with assessment of the extent to which the effect
estimate changes when the factor is controlled in the analysis. There are
two basic methods of calculating a summary effect estimate to control
confounding: pooling and standardisation. Multiple regression allows for
the simultaneous control of more confounders by "smoothing" the data
across confounder strata. It provides a powerful tool when used
appropriately, but is often used inappropriately, and should always be
used in combination with the more straightforward methods presented here.

References

Breslow NE, Day NE (1980). Statistical methods in cancer research. Vol I:
   The analysis of case-control studies. Lyon, France: IARC.

Breslow NE, Day NE (1987). Statistical methods in cancer research. Vol
   II: The analysis of cohort studies. Lyon, France: IARC.

Checkoway HA, Pearce N, Kriebel D (2004). Research methods in
   occupational epidemiology. 2nd ed. New York: Oxford University Press.

Clayton D, Hills M (1993). Statistical models in epidemiology. Oxford:
   Oxford Scientific Publications.

Dean J, Dean A, Burton A, Dicker R (1990). Epi Info. Version 5.01.
   Atlanta, GA: CDC.

Gardner MJ, Altman DG (1986). Confidence intervals rather than p values:
   estimation rather than hypothesis testing. Br Med J 292: 746-50.

Greenland S, Robins JM (1985a). Confounding and misclassification. Am J
   Epidemiol 122: 495-506.

Greenland S, Robins JM (1985b). Estimation of a common effect parameter
   from sparse follow-up data. Biometrics 41: 55-68.

Greenland S, Schwartsbaum JA, Finkle WD (2000). Problems due to small
   samples and sparse data in conditional logistic regression analysis.
   Am J Epidemiol 2000; 191; 530-9.

Hills M, De Stavola BL (2002). A short introduction to Stata for
   biostatistics. London: Timberlake.

Mantel N, Haenszel W (1959). Statistical aspects of the analysis of data
   from retrospective studies of disease. J Natl Cancer Inst 22: 719-48.

Miettinen OS (1974). Standardization of risk ratios. Am J Epidemiol 96:
   383-8.

Miettinen OS (1985). Theoretical epidemiology. New York: Wiley and Sons.

Pearce NE, Checkoway HA, Dement JM (1988). Exponential models for
   analyses of time-related factors: illustrated with asbestos textile
   worker mortality data. J Occ Med 30: 517-22.

Robins JM, Breslow NE, Greenland S (1986). Estimation of the
   Mantel-Haenszel variance consistent with both sparse-data and
   large-strata limiting models. Biometrics 42: 311-23.

Rothman KJ (2002). Epidemiology: an introduction. New York: Oxford
   University Press.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed.
   Philadelphia: Lippincott-Raven.

Segi M (1960). Cancer mortality for selected sites in 24 countries
   (1950-1957). Sendai, Japan: Department of Public Health, Tohoku
   University School of Medicine.

Shore RE, Pasternak BS, Curnen MG (1976). Relating influenza epidemics to
   childhood leukaemia in tumor registries without a defined population
   base. Am J Epidemiol 103: 527-35.

CHAPTER 13: Interpretation
[In: Pearce N. A Short Introduction to Epidemiology. 2nd ed. Wellington, CPHR, 2005]

In this chapter I first consider the issues involved in interpreting the
findings of a single epidemiological study. I then consider problems of
interpretation of all of the available evidence. Interpreting the
findings of a single study includes considering the strength and
precision of the effect estimate and the possibility that it may have
been affected by various possible biases (confounding, selection bias,
information bias). If it is concluded that the observed associations are
likely to be valid, then attention shifts to more general causal
inference, which should be based on all available information. In both
situations, it should be stressed that epidemiological studies almost
always contain potential biases, and the focus should be on assessing the
likely direction and magnitude of the biases, and whether they could
explain the observed associations.

13.1: Appraisal of a Single Study

It is easy to criticize an epidemiological study. Populations do not
usually randomize themselves by exposure status, do not always respond to
requests to participate in epidemiological studies, may supply incomplete
or inaccurate exposure histories for known or possible risk factors, and
cannot be asked about unknown risk factors. Thus, although some studies
are clearly better than others, it is important to emphasize that perfect
epidemiological studies do not exist. Furthermore, it is usually not
possible, nor desirable, to reach conclusions on the basis of the
findings of a single study, and it is essential to consider all of the
available evidence.

Nevertheless, when confronted with a new study, perhaps with unexpected
findings, it is valuable to first consider possible explanations for the
observed associations found (or a lack of association) in the study,
before proceeding to consider other evidence. However, the emphasis
should not be on simply preparing a list of possible biases (e.g.
Feinstein, 1988). Rather, it is essential to attempt to assess the likely
strength and direction of each possible bias, and to assess whether these
biases (and their possible interactions) could explain the observed
associations.

What is the magnitude and precision of the effect estimate?

As discussed in chapter 5, random error (lack of precision) will occur in
any epidemiologic study, just as it occurs in experimental studies. The
possible role of random error is often addressed through the question
“could the association be due to chance
alone?” and this issue is usually assessed by calculating the p-value.
This is the probability (assuming that there are no biases) that a test
statistic as large as that actually observed would be found in a study if
the null hypothesis were true, i.e. that there was in reality no causal
effect of exposure. However, recent reviews have stressed the limitations
of p-values and significance testing (Rothman, 1978; Gardner and Altman,
1986; Poole, 1987; Pearce and Jackson, 1988). Foremost among these is
that significance testing attempts to reach a decision on the basis of
the data from a single study, whereas what is more important is the
strength and precision of the effect estimate and whether the findings of
a particular study are consistent with those of previous studies. These
issues are better addressed by calculating confidence intervals rather
than p-values (Gardner and Altman, 1986; Rothman and Greenland, 1998).
Similarly, the possibility that the lack of a statistically significant
association could be due to lack of precision (lack of study power) is
more appropriately addressed by considering the confidence interval of
the effect estimate rather than by making post hoc power calculations
(Smith and Bates, 1992).

What are the likely strengths and directions of possible biases?

Systematic error is distinguished from random error in that it would be
present even with an infinitely large study, whereas random error can be
reduced by increasing the study size. Thus, systematic error, or "bias",
occurs if there is a systematic difference between what the study is
actually estimating and what it is intended to estimate. The types of
bias (confounding, selection bias, information bias) have already been
discussed in chapter 6. In the current context the key issue is that any
epidemiologic study will involve biases. The problem is not to identify
possible biases (these will almost always exist), but rather to ascertain
what direction they are likely to be in, and how strong they are likely
to be.

Confounding

In assessing whether an observed association could be due to confounding,
the first consideration is whether all potential confounders have been
appropriately controlled for or appropriately assessed (e.g. by
collecting and using confounder information in a sample of study
participants). If not, it is essential to assess the potential strength
and direction of uncontrolled confounding.

In some areas of epidemiologic research, e.g. occupational and
environmental studies, the strength of uncontrolled confounding is often
less than might be expected. For example, Axelson (1978) has shown that
for plausible estimates of the smoking prevalence in occupational
populations, confounding by smoking can rarely account for a relative
risk of lung cancer of greater than 1.5. Similarly, Siemiatycki et al
(1988) have found that confounding by smoking is generally even weaker
for internal comparisons (in which exposed workers are compared with
non-exposed workers in the same factory or industry). On the other hand,
the potential for confounding can be severe in studies of lifestyle and
related factors (e.g. diet, nutrition, exercise).

It is unreasonable to simply assume that a strong association could be
due to confounding by unknown risk factors, since to be a strong
confounder a factor must be a very strong risk factor as well as being
strongly associated with exposure. For example, if an occupational study
found a relative risk of 2.0 for lung cancer in exposed
workers, it is highly unlikely that this could be due to confounding by
smoking, and it would be unreasonable to dismiss the study findings
merely because smoking information had not been available. On the other
hand, small relative risks (e.g. those in the range of 0.7-1.5, as
frequently occur in dietary studies) are not so difficult to explain by
lack of measurement, or poor measurement and control, of confounders.

Selection bias

Whereas confounding generally involves biases inherent in the source
population, selection bias involves biases arising from the procedures by
which the study subjects are chosen from the source population. As with
confounding, if it is not possible to directly control for selection
bias, it still may be possible to assess its likely strength and
direction. It is unreasonable to dismiss the findings of a particular
study because of possible selection bias, without at least attempting to
assess which direction the possible selection bias would have been in,
and how strong it might have been.

Information bias

With regards to information bias, the key issue is whether
misclassification is likely to have been differential or
non-differential. In the latter case, the bias will usually be in a known
direction, i.e. towards the null. If misclassification has been
differential, then it is important to attempt to assess what direction
the bias is likely to have been in. The important issue is not whether
information bias could have occurred (this is almost always the case
since there are almost always problems of misclassification of exposure
and/or disease) but rather the likely direction and strength of such
bias. In particular, if a study has yielded a positive finding (i.e. an
effect estimate markedly different from the null value) then it is not
valid to dismiss it because of the possibility of non-differential
misclassification, or differential misclassification that is likely
(although not guaranteed) to produce a bias towards the null.

Summary of Issues of Systematic Error

In summary, when assessing whether the findings of a particular study
could be due to such biases, the important issue is not whether such
biases are likely to have occurred (since they will almost always be
present to some extent), but rather what their direction and strength are
likely to be, and whether they taken together could explain the observed
association. In particular, epidemiological studies are often criticized
on the grounds that observed associations could be due to uncontrolled
confounding or errors in the classification of exposure or disease.
However, the likely strength of uncontrolled confounding is sometimes
less than might be expected, and non-differential misclassification of
exposure will usually (though not always) produce a tendency for false
negative findings rather than false positive findings.
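
The point that uncontrolled confounding is often weaker than expected can
be made concrete with a small calculation. The sketch below is not part
of the original text and does not reproduce the figures of Axelson
(1978). It uses a standard formula for the apparent relative risk
produced by a single dichotomous confounder when exposure has no effect,
and the smoking prevalences and the smoking relative risk are
hypothetical.

```python
def confounding_rate_ratio(rr_confounder, p_exposed, p_unexposed):
    """Relative risk produced by confounding alone (no true exposure effect).

    rr_confounder: relative risk of disease for the confounder (e.g. smoking)
    p_exposed, p_unexposed: prevalence of the confounder in each group
    """
    return ((p_exposed * (rr_confounder - 1) + 1)
            / (p_unexposed * (rr_confounder - 1) + 1))


# Hypothetical values: smoking multiplies lung cancer risk by 10, and
# 70% of exposed workers smoke compared with 50% of non-exposed workers.
bias = confounding_rate_ratio(10.0, 0.7, 0.5)
print(round(bias, 2))  # 7.3 / 5.5, about 1.33
```

Under these assumed values a fairly large difference in smoking
prevalence yields an apparent relative risk of about 1.3, well short of
2.0. The calculation treats smoking as a simple yes/no factor, so it is
only a first approximation.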

13.2: Appraisal of All of the Available Evidence

If it is concluded that the association in a particular study is unlikely
to be primarily due to bias and chance, attention then shifts to
assessing whether this association exists more generally, and whether the
association is likely to be causal. This should involve a review of all
of the available evidence including non-epidemiological studies. A
systematic quantitative review of the epidemiological evidence may
involve a formal meta-analysis with statistical pooling of information
from the various studies (e.g. Dickerson and Berlin, 1992; Rothman and
Greenland, 1998). However, such a summary of the various study findings
is just one step in the process of causal inference. A systematic
approach to causal inference was elaborated by Hill (1965) and has since
been widely used and adapted (e.g. Beaglehole et al, 1993). I will divide
these considerations into those that involve systematic review of the
epidemiological evidence (including meta-analyses) and those that also
involve consideration of evidence from animal or mechanistic studies.

Evidence From Epidemiological Studies

Considerations for assessing the epidemiological evidence include
temporality, specificity, consistency, strength of association and
whether there is evidence of a dose-response relationship (Hill, 1965).

Temporality is crucial; the cause must precede the effect. This is
usually self-evident, but difficulties may arise in studies (particularly
case-control studies) when measurements of exposure and effect are made
at the same time (e.g. by questionnaire, blood tests, etc).

The criterion of specificity has been criticised (e.g. Rothman and
Greenland, 1998), on the grounds that there are many instances of
exposures that have multiple (i.e. non-specific) effects. These include
tobacco smoke and ionizing radiation, both of which cause many different
types of cancer. Nevertheless, the specificity of the effect may be
relevant in assessing the possibility of various biases. For example, if
an exposure is associated with esophageal cancer but is not associated
with lung cancer, then the association is unlikely to be due to
confounding by smoking.

Consistency is demonstrated by several studies giving similar results,
and corresponds to the statistical concept of homogeneity across studies
(Rothman and Greenland, 1998). This is particularly important when a
variety of designs are used in different settings, since the likelihood
that all studies are suffering from the same biases may thereby be
reduced. On the other hand, a lack of consistency does not exclude a
causal association, because different exposure levels and other
conditions may alter the effect of exposure in certain studies.

The strength of association is important in that a relative risk that is
far from the null value of 1.0 is more likely to be causal than a weak
association, which could be more easily explained by confounding or other
biases. However, the fact that an association is weak does not preclude
it from being causal; rather it means that it is more difficult to
exclude alternative explanations for the observed association.

A dose-response relationship occurs when changes in the level of exposure
are associated with changes in the prevalence or incidence of the effect
that one would expect from biologic considerations. The absence of an
expected dose-response relationship provides evidence against a causal
relationship, while the presence of an expected relationship narrows the
scope of biases that could explain the relationship.

Experimental evidence provides strong evidence of causality, but this is
rarely available for occupational exposures.

Meta-Analysis

In the past, epidemiological evidence has been assessed in literature
reviews, but in recent years there has been an increasing emphasis on
formal meta-analysis, i.e. systematic quantitative reviews. One benefit
of a meta-analysis is that it can reduce the probability of false
negative results because of small numbers in specific studies (Egger and
Davey-Smith, 1997), and may enable the effect of an exposure to be
estimated with greater precision than is possible in a single study.
Furthermore, although a meta-analysis should ideally be based on
individual data, relatively simple methods are available for
meta-analyses of published studies in which the study (rather than the
individual) is the unit of statistical analysis (Rothman and Greenland,
1998). Such methods can be used to address the causal considerations
outlined above, in particular the overall strength of association and the
shape and strength of the dose-response curve. Just as importantly,
statistical methods can also be used to assess consistency between
studies, but because statistical tests for homogeneity often have
relatively low power, it is more appropriate to examine the magnitude of
variation instead of relying on formal statistical tests (Rothman and
Greenland, 1998).

The limitations of meta-analyses should also be emphasized (Greenland,
1994; Egger and Davey-Smith, 1997; Egger et al, 1997). Strikingly
different results can be obtained depending on which studies are included
in a meta-analysis. Publication bias is of particular concern, given the
tendency of journals to publish “positive findings” and for the
publication of “negative findings” to be delayed (Egger and Davey-Smith,
1998), but naive graphical approaches to its assessment can be misleading
(Greenland, 1994).

Even when an “unbiased” and comprehensive list of studies is included in
a meta-analysis, there still remain the same problems of selection bias,
information bias, and confounding, that need to be addressed in assessing
individual studies. Thus, a systematic quantitative review (i.e.
meta-analysis) is like a report of a single study in that both
quantitative and narrative elements are required to produce a balanced
picture (Rothman and Greenland, 1998). Essentially the same issues need
to be addressed as in a report of a single study: what is the overall
magnitude and precision of the effect estimate (if it is considered
appropriate to calculate a summary effect estimate), and what are the
likely strengths and directions of possible biases?

An advantage of meta-analysis is that these issues can often be better
addressed by contrasting the findings of studies based on different
populations, or using different study designs. Thus, possible systematic
biases can be addressed with actual data from specific studies rather
than by hypothetical
examples. For example, in a study of an occupational exposure and lung
cancer, there might be concern that an observed association was due to
confounding by smoking. If smoking data had not been available, then the
best that could be done would be to attempt to assess the likely extent
of confounding by smoking (see chapter 6), for example by sensitivity
analysis (Rothman and Greenland, 1998). However, in a meta-analysis, if
smoking information were available for some (but not all) studies then
these studies could be examined to assess the likely strength and
direction of confounding by smoking (if any).

Similarly, studies of exposure to phenoxy herbicides and the development
of soft tissue sarcoma and non-Hodgkin’s lymphoma have produced widely
differing findings, and it has been suggested that the high relative
risks obtained in the Swedish studies could be due to “recall bias” (a
particular type of information bias) in that cases of cancer (soft tissue
sarcoma or non-Hodgkin’s lymphoma) were compared with healthy general
population controls, and that patients with cancer may be more likely to
recall previous chemical exposures. This hypothesis was tested in
specific studies (e.g. Hardell et al, 1979, 1981), but can also be tested
more generally by comparing the findings of studies that used general
population controls with those that used “other cancer” controls. In
particular, one New Zealand study (Pearce et al, 1986) used both types of
controls and found similar results with each, indicating that recall bias
was not an important problem in this study.

In summary, a key advantage of meta-analysis is that pooling findings
from studies will increase the numbers available for analysis and will
therefore reduce random error. However, it will not necessarily reduce
systematic error, and may even increase it (because of publication bias).
Nevertheless, a careful meta-analysis will enable various possible biases
to be addressed, using actual data from specific studies, rather than
hypothetical examples. Such a meta-analysis will therefore facilitate the
consideration of the causal considerations listed above, and in some
instances will provide a valid summary estimate of the overall strength
of association and the shape and strength of the dose-response curve
(Greenland, 2003).

Combination of Epidemiological Evidence With Evidence From Other Sources

Epidemiological evidence should be considered together with all other
available evidence, including animal experiments. An association is
plausible if it is consistent with other knowledge, whereas the
epidemiological evidence is coherent if it is not inconsistent with other
knowledge. For instance, laboratory experiments may have shown that a
particular environmental exposure can cause cancer in laboratory animals,
and this would make more plausible the hypothesis that this exposure
could cause cancer in humans. However, biological plausibility is a
relative concept; many epidemiological associations were considered
implausible when they were first discovered but were subsequently
confirmed by other evidence, e.g. the relation of lice to typhus. Lack of
plausibility may simply reflect lack of knowledge (medical, biological,
or social) which is continually changing and evolving.
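
To illustrate how published estimates can be combined when the study is
the unit of analysis, the sketch below pools relative risks on the log
scale with inverse-variance weights and reports a heterogeneity
statistic. This is a commonly used approach chosen here for illustration
and is an assumption, not a method prescribed in the text; the function
name and the three study results are hypothetical.

```python
import math


def pool_fixed_effect(studies, z=1.96):
    """Inverse-variance pooling of ratio estimates on the log scale.

    studies: list of (estimate, lower, upper) with 95% limits as published.
    Returns the pooled estimate, its 95% interval, and Cochran's Q.
    """
    logs, weights = [], []
    for est, lower, upper in studies:
        # Recover SE[ln(estimate)] from the width of the published interval.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        logs.append(math.log(est))
        weights.append(1 / se ** 2)
    w_total = sum(weights)
    pooled_log = sum(w * x for w, x in zip(weights, logs)) / w_total
    se_pooled = math.sqrt(1 / w_total)
    q = sum(w * (x - pooled_log) ** 2 for w, x in zip(weights, logs))
    pooled = math.exp(pooled_log)
    interval = (pooled * math.exp(-z * se_pooled),
                pooled * math.exp(z * se_pooled))
    return pooled, interval, q


# Hypothetical published relative risks with 95% confidence limits.
studies = [(1.8, 1.1, 2.9), (1.3, 0.9, 1.9), (2.4, 1.0, 5.8)]
pooled, ci, q = pool_fixed_effect(studies)
print(round(pooled, 2), [round(x, 2) for x in ci], "Q =", round(q, 2))
```

In line with the caution above about the low power of homogeneity tests,
Q is best read alongside the study-specific estimates themselves rather
than as a pass/fail test, and the pooled value is only meaningful when a
single summary estimate is considered appropriate.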

Summary

The task of interpreting the findings of a single epidemiological study
should be differentiated from that of interpreting all of the available
evidence. Interpreting the findings of a single study includes
considering the strength and precision of the effect estimate and the
possibility that it may have been affected by various possible biases
(confounding, selection bias, information bias). The important issue is
not whether such biases are likely to have occurred (since they will
almost always be present to some extent), but rather what their direction
and strength are likely to be, and whether together they could explain
the observed association. If the observed associations seem likely to be
valid, then attention shifts to more general causal inference, which
should be based on all available information. This includes assessing the
specificity, strength and consistency of the association and the
dose-response across all epidemiological studies. This may include the
use of meta-analysis, but it is often not appropriate to derive a single
summary effect estimate across all studies. Rather, a meta-analysis can
be used to examine hypotheses about reasons for differences between study
findings and the likely magnitude of possible biases. Furthermore, causal
inference also necessitates considering non-epidemiological evidence from
other sources (animal studies, mechanistic studies) in the consideration
of more general causal criteria including the plausibility and coherence
of the overall evidence.

Despite the continual need to assess possible biases, and to consider
possible imperfections in the epidemiological data, it is also important
to ensure that preventive action occurs when this is warranted, albeit on
the basis of imperfect data. As Hill (1965) writes:

    "All scientific work is incomplete - whether it be observational or
    experimental. All scientific work is liable to be upset or modified
    by advancing knowledge. That does not confer upon us a freedom to
    ignore the knowledge that we already have, or to postpone the action
    that it appears to demand at a given time."

References

Axelson O (1978). Aspects on confounding in occupational health epidemiology. Scand J Work Environ Health 4: 85-9.

Beaglehole R, Bonita R, Kjellstrom T (1993). Basic epidemiology. Geneva: WHO.

Dickersin K, Berlin JA (1992). Meta-analysis: state-of-the-science. Epidemiologic Reviews 14: 154-76.

Egger M, Davey-Smith G (1997). Meta-analysis: principles and promise. Br Med J 315: 1371-4.

Egger M, Davey-Smith G, Phillips A (1997). Meta-analysis: principles and procedures. Br Med J 315: 1533-7.

Egger M, Davey-Smith G (1998). Meta-analysis: bias in location and selection of studies. Br Med J 316: 61-6.

Feinstein AR (1988). Scientific standards in epidemiologic studies of the menace of daily life. Science 242: 1257-63.

Gardner MJ, Altman DG (1986). Confidence intervals rather than p values: estimation rather than hypothesis testing. Br Med J 292: 746-50.

Greenland S (1994). A critical look at some popular meta-analytic methods. Am J Epidemiol 140: 290-6.

Greenland S (2003). The impact of prior distributions for uncontrolled confounding and response bias: a case study of the relation of wire codes and magnetic fields to childhood leukemia. J Am Statist Assoc 98: 1-8.

Hardell L, Sandstrom A (1979). Case-control study: soft-tissue sarcomas and exposure to phenoxyacetic acids or chlorophenols. Br J Cancer 39: 711-7.

Hardell L, Eriksson M, Lenner P, Lundgren E (1981). Malignant lymphoma and exposure to chemicals, especially organic solvents, chlorophenols and phenoxy acids: a case-control study. Br J Cancer 43: 169-76.

Hill AB (1965). The environment and disease: association or causation? Proc R Soc Med 58: 295-300.

Pearce NE, Smith AH, Howard JK, et al (1986). Non-Hodgkin's lymphoma and exposure to phenoxyherbicides, chlorophenols, fencing work and meat works employment: a case-control study. Br J Ind Med 43: 75-83.

Pearce NE, Jackson RT (1988). Statistical testing and estimation in medical research. NZ Med J 101: 569-70.

Poole C (1987). Beyond the confidence interval. AJPH 77: 195-9.

Rothman KJ (1978). A show of confidence. N Engl J Med 299: 1362-3.

Rothman KJ, Greenland S (1998). Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven.

Siemiatycki J, Wacholder S, Dewar R, et al (1988). Smoking and degree of occupational exposure: Are internal analyses in cohort studies likely to be confounded by smoking status? Am J Ind Med 13: 59-69.

Smith AH, Bates M (1992). Confidence limit analyses should replace power calculations in the interpretation of epidemiologic studies. Epidemiol 3: 449-52.
