Notes On RR, Odds, Or, AR, AR% - Epidemiology

Author: Dr. Raghupathy Anchala, MD MPH PDCR, IIPH Hyderabad

Epidemiology made easy for beginners
Author's note: These notes use a succinct style and real-life examples that health professionals
commonly encounter, to help learners understand the concepts behind risk, relative risk,
attributable risk (risk difference), attributable risk percent, odds, odds ratios, sensitivity,
specificity, true positives, true negatives, likelihood ratios, and prior and posterior probability
(and odds). The aim is to remove the difficulties that learners commonly face with measures of
association and effect, and to make these measures easier to recall. Epidemiology rests on these
foundational measures of association between an exposure and an outcome. The fundamental concepts
(what epidemiology aims to cover, the difference between descriptive and analytic epidemiology,
the epidemiological triad, causal factors, natural history, and the steps of an investigation) are
listed first, and the measures are then explained in detail.
**********************************************************************
Concept one - Distribution (in terms of person, place and time) and determinants (in terms of
agent, host and environment) are the two fundamental aspects that an epidemiologist uses
frequently to arrive at a hypothesis in conducting an investigational study that deals with
diseases and epidemics.
Concept two - Examining, identifying, and reporting on the frequency and distribution of disease
in a population constitutes descriptive epidemiology. Analytic Epidemiology looks at testing a
hypothesis about the cause of disease by studying how exposures relate to the disease.
Concept three - Agent, host and environment together determine the susceptibility of a person to
develop a disease. The probability and severity of the disease depend on the host (the sufferer):
immune constitution, personal traits, behaviors and genetic predisposition. The agent (biological,
physical or chemical) is defined as the factor that is necessary for the disease to occur. The
environment (external conditions, whether physical, biological or social) contributes to the
disease process. Epidemics arise when host, agent and environmental factors are not in balance
(homeostasis).
A new agent, a change in an existing agent (infectivity, pathogenicity, virulence), a change in
the number of susceptible people, or environmental changes that affect the transmission or growth
of the agent can lead to the occurrence of a disease (or to an epidemic, that is, an excess of
cases over what is normally expected).

Concept four - Causal factors can be necessary or sufficient. A necessary factor is one without
which the disease does not occur. A sufficient factor (or set of factors) is one whose presence
inevitably produces the disease. A necessary factor is often not sufficient on its own; disease
then results from the necessary factor acting together with other contributing factors.
Examples - Without HIV infection, AIDS does not develop: HIV is a necessary factor.
Development of tuberculosis requires M. tuberculosis and other factors, such as
immunosuppression, to cause disease: the bacterium is necessary, but not sufficient, to cause the
disease.
Concept five: Public Health is an integrated discipline. Health protection, disease control, risky
behavioral change, community development, primary health care and surveillance are the
notable fields in which the study of determinants and distribution (epidemiology) comes into
play.
Concept six: The natural history of disease is the history of a particular disease in the absence of
intervention, prevention or treatment. Epidemiology deals in primary, secondary and tertiary
prevention (both at an individual and population level) based on the natural history of the
disease.
Concept seven: The steps of an epidemic investigation or any causal study can be summarized
as below:
1. Begin with a general, broad problem.
2. Collect information about the problem.
3. Study the specific information collected.
4. Reassess the results and draw conclusions.
5. Re-evaluate the problem.
6. Re-formulate the question.
7. Collect additional information that shows the relationship between the exposure and the
outcome or event.
Applications of epidemiology
Establish patterns of endemic and epidemic diseases
Determine origin of diseases with unknown etiology
Investigate/control diseases whose cause is unknown or poorly understood
Describe the ecology/natural history of disease
Plan and monitor control programs
Assess economic impact of disease
Development of prevention programs
Determine cost and benefits of alternate treatment, prevention, control programs

************************************************************************
Measures - The following examples have been devised to introduce the importance and
emphasize the measures of association and effect that we commonly encounter.
For the 2 x 2 table:

                            Heart attack    No heart attack    Totals
                            (Disease)       (No disease)
Smoking (Exposed)           (a) 80          (b) 20             a+b = 80+20 = 100
                                                               (all persons exposed to smoking)
No smoking (Not exposed)    (c) 10          (d) 90             c+d = 10+90 = 100
                                                               (all persons who are not smokers)
Totals                      a+c = 80+10     b+d = 20+90        200
                            = 90            = 110
                            (all persons    (all persons with
                            with heart      no heart attack)
                            attack)

Number of persons who are exposed with disease = a = 80

Number of persons who are exposed but NOT having disease = b = 20
Number of persons who are NOT exposed but having disease = c = 10
Number of persons who are NOT exposed and NOT having disease = d =90
1) Risk of disease in the exposed = risk of heart attack among smokers
= (exposed with disease) / (all exposed persons)
= a/(a+b) = 80/(80+20) = 80/100 = 0.8 = 80%
This means that the risk of heart attack among smokers is 80%: out of 100 people exposed to
smoking (smokers), 80 will have a heart attack.
2) Risk of disease in the unexposed = risk of heart attack among non-smokers
= (unexposed with disease) / (all unexposed persons)
= c/(c+d) = 10/(10+90) = 10/100 = 0.1 = 10%
This means that the risk of heart attack among non-smokers is 10%: out of 100 non-smokers, 10
will have a heart attack.
3) Relative risk (RR) of heart attack for smokers
Relative means "compared to"; here we compare smokers to non-smokers.
So, the question is how many times higher the risk of heart attack is in smokers than in
non-smokers.
RR = Risk of disease in the exposed / Risk of disease in the unexposed
= Risk of heart attack in smokers / Risk of heart attack in non-smokers
= 0.8 / 0.1 = 8
This means that the risk of heart attack in smokers is 8 times the risk in non-smokers.
Equivalently, the risk in smokers is 700% higher than in non-smokers: 8 minus 1 = 7, and
7 x 100 = 700%.
The null value for RR is 1 (no association). An RR above 1 indicates a higher risk in the
exposed, and an RR below 1 a lower risk. Whether the null hypothesis can be rejected depends on a
statistical test or confidence interval, not on the point estimate alone.
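The two risks and the relative risk above can be checked with a few lines of Python. This is only a sketch using the counts from the 2 x 2 table; the function and variable names are illustrative and not part of the original notes.

```python
# Counts from the 2 x 2 smoking / heart attack example in the text.
a, b = 80, 20   # smokers: with heart attack, without heart attack
c, d = 10, 90   # non-smokers: with heart attack, without heart attack

def risk(cases, non_cases):
    """Proportion of a group that develops the disease."""
    return cases / (cases + non_cases)

risk_smokers = risk(a, b)
risk_non_smokers = risk(c, d)
relative_risk = risk_smokers / risk_non_smokers
percent_higher = (relative_risk - 1) * 100   # excess over the null value of 1

print(f"Risk in smokers:     {risk_smokers:.2f}")
print(f"Risk in non-smokers: {risk_non_smokers:.2f}")
print(f"Relative risk:       {relative_risk:.1f}")
print(f"Percent higher:      {percent_higher:.0f}%")
```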
4) Attributable risk (AR) = Excess risk
AR is the risk difference (RD) between two groups: the excess risk of heart attack that smokers
have compared to non-smokers.
So, if the question asks for the AR of heart attack for smokers, subtract the risk of heart
attack in non-smokers from the risk of heart attack in smokers.
AR = (risk of heart attack in smokers) - (risk of heart attack in non-smokers)
= 0.8 - 0.1 = 0.7 or 70%
This means that the risk difference between smokers and non-smokers is 70 percentage points:
smokers have an excess risk of 70% compared to non-smokers.

The difference between AR and RR should be very clear:
RR tells you how many times higher the risk is in smokers than in non-smokers (a ratio).
AR tells you how much higher the risk of heart attack is in smokers than in non-smokers (a
difference).
5) AR% - also called the attributable proportion
AR% = (risk difference between the two groups) / (risk in the exposed group) x 100
= (risk in exposed - risk in unexposed) / (risk in exposed) x 100
= (risk of heart attack in smokers - risk of heart attack in non-smokers) / (risk of heart attack
in smokers) x 100
= (0.8 - 0.1) / 0.8 x 100 = 0.875 x 100 = 87.5%
This means that 87.5% of the risk of heart attack among smokers can be attributed to smoking.
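As a quick check of the arithmetic, the sketch below computes the attributable risk and the attributable risk percent from the same two risks. The AR% is the risk difference divided by the risk in the exposed group, which gives 0.7 / 0.8 = 87.5% for these data. Variable names are illustrative.

```python
# Risks from the smoking / heart attack example in the text.
risk_exposed = 80 / (80 + 20)     # risk of heart attack in smokers
risk_unexposed = 10 / (10 + 90)   # risk of heart attack in non-smokers

# Attributable risk: the absolute difference between the two risks.
attributable_risk = risk_exposed - risk_unexposed

# Attributable risk percent: the share of the exposed group's risk
# that is in excess of the unexposed group's risk.
attributable_risk_percent = attributable_risk / risk_exposed * 100

print(f"AR  = {attributable_risk:.2f}")
print(f"AR% = {attributable_risk_percent:.1f}%")
```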
6) Odds
Odds is the chance that an event occurs divided by the chance that it does not occur.
So, if the odds of a horse winning a race are 4/7 (4 to 7), the horse is expected to win 4 times
for every 7 times it loses.
In other words, if the horse runs in 11 (4+7) races, it is expected to win 4 races and lose 7
races, so the probability of winning is 4/11.
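The relationship between odds and probability in the horse example can be sketched as follows. The two helper functions are hypothetical names introduced for illustration; `Fraction` is used so that the results stay exact.

```python
from fractions import Fraction

def odds_to_probability(odds):
    """Convert odds (event : non-event) to a probability."""
    return odds / (1 + odds)

def probability_to_odds(probability):
    """Convert a probability to odds (event : non-event)."""
    return probability / (1 - probability)

horse_odds = Fraction(4, 7)                  # 4 wins for every 7 losses
p_win = odds_to_probability(horse_odds)      # 4 wins out of 11 races
back_to_odds = probability_to_odds(p_win)    # recovers the original odds

print(p_win, back_to_odds)
```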
7) Odds ratio (OR) of exposure
= (odds of disease in the exposed) / (odds of disease in the unexposed)
Odds of disease in the exposed = (chance that an exposed person has the disease) / (chance that
an exposed person does not have the disease)
Odds of disease in the unexposed = (chance that an unexposed person has the disease) / (chance
that an unexposed person does not have the disease)
Therefore,
OR = [(chance of heart attack in smokers) / (chance of no heart attack in smokers)]
     / [(chance of heart attack in non-smokers) / (chance of no heart attack in non-smokers)]
So, in the 2 x 2 table above (a = 80, b = 20, c = 10, d = 90):
Odds of disease in the exposed = odds of heart attack in smokers
= (chance of heart attack in smokers) / (chance of no heart attack in smokers)
= (80/100) / (20/100) = 80/20
Odds of disease in the unexposed = odds of heart attack in non-smokers
= (chance of heart attack in non-smokers) / (chance of no heart attack in non-smokers)
= (10/100) / (90/100) = 10/90

Therefore, the odds ratio for exposure
= (odds of heart attack in smokers) / (odds of heart attack in non-smokers)
= (80/20) / (10/90) = (80 x 90) / (20 x 10) = 36
This means that the odds of heart attack in smokers are 36 times the odds in non-smokers.
Also note that OR = (a x d) / (b x c). This is called the cross-product ratio, because it
multiplies the diagonal cells of the 2 x 2 table:

                            Heart attack    No heart attack
Smoking (Exposed)           (a) 80          (b) 20
No smoking (Not exposed)    (c) 10          (d) 90


8) Similarly, we can also calculate the odds ratio of disease
= (odds of exposure in the diseased) / (odds of exposure in the non-diseased)
= [(chance that a diseased person was exposed) / (chance that a diseased person was not exposed)]
  / [(chance that a non-diseased person was exposed) / (chance that a non-diseased person was not
  exposed)]
Therefore, the odds ratio for the diseased
= [(chance of smoking among people with heart attack) / (chance of not smoking among people with
  heart attack)]
  / [(chance of smoking among people with no heart attack) / (chance of not smoking among people
  with no heart attack)]
= [(80/90) / (10/90)] / [(20/110) / (90/110)]
= (80/10) / (20/90) = (80/10) x (90/20) = 36
OR of disease = OR of heart attack = 36. This means that the odds of having been exposed to
smoking are 36 times as high in people with heart attack as in people with no heart attack.
Therefore, OR of disease = OR of exposure = cross-product ratio
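The equality of the two odds ratios and the cross-product ratio can be verified numerically. The sketch below uses the counts from the smoking example; the variable names are illustrative.

```python
# Counts from the 2 x 2 smoking / heart attack example in the text.
a, b, c, d = 80, 20, 10, 90

# Odds of heart attack in smokers divided by odds of heart attack in non-smokers.
or_exposure = (a / b) / (c / d)

# Odds of smoking among heart attack cases divided by odds of smoking among non-cases.
or_disease = (a / c) / (b / d)

# Cross-product ratio: product of one diagonal over the product of the other.
cross_product = (a * d) / (b * c)

print(round(or_exposure, 6), round(or_disease, 6), cross_product)
```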
9) Validity
Sensitivity and specificity

                       Old test (gold standard)
New test               Gold standard    Gold standard    Totals
                       positive         negative
New test positive      (a) 80           (b) 220          a+b = 300
                                                         (all persons positive on new test)
New test negative      (c) 10           (d) 90           c+d = 10+90 = 100
                                                         (all persons negative on new test)
Totals                 a+c = 80+10      b+d = 220+90     400
                       = 90             = 310
                       (all persons     (all persons
                       positive on      negative on
                       gold standard)   gold standard)

Sensitivity of the new test = (positive on both the new test and the gold standard) / (all
persons positive on gold standard test)
= a/(a+c) = 80/90 = 0.889 = 88.9%
This means that the new test detects (catches) about 89% of the people who are actually positive.
Specificity of the new test = (negative on both the new test and the gold standard) / (all
persons negative on gold standard)
= d/(b+d) = 90/310 = 0.29 = 29%
This means that the new test correctly identifies 29% of the people who are actually negative.
In other words,

                       Old test (gold standard)
New test               Gold standard        Gold standard        Totals
                       positive             negative
New test positive      (a) 80               (b) 220              a+b = 300
                       (True positives)     (False positives)    (all positive on new test)
New test negative      (c) 10               (d) 90               c+d = 100
                       (False negatives)    (True negatives)     (all negative on new test)
Totals                 a+c = 90             b+d = 310            400
                       (all positive on     (all negative on
                       gold standard)       gold standard)

Therefore,
Sensitivity = True positives / (True positives + False negatives)
Specificity = True negatives / (True negatives + False positives)
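The two definitions can be written as small functions and applied to the new test example (80 true positives, 220 false positives, 10 false negatives, 90 true negatives). This is a minimal sketch; the function names are illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Share of truly diseased people that the test calls positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Share of truly disease-free people that the test calls negative."""
    return true_neg / (true_neg + false_pos)

# Cells of the new test vs gold standard table in the text.
tp, fp, fn, tn = 80, 220, 10, 90

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)

print(f"Sensitivity = {sens:.3f}")
print(f"Specificity = {spec:.3f}")
```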
                    Disease
Screening test      Present            Absent
Positive            True positives     False positives
Negative            False negatives    True negatives


Validity of Screening Tests
Sensitivity: The probability of testing positive if the disease is truly present.
Sensitivity = a / (a + c)
Specificity: The probability of screening negative if the disease is truly absent.
Specificity = d / (b + d)


Screening Principles
Sensitivity
the ability of a test to correctly identify those who have a disease
a test with high sensitivity will have few false negatives
Specificity
the ability of a test to correctly identify those who do not have the disease
a test with high specificity will have few false positives


Example: screening for breast cancer by physical exam and mammography

Physical exam and       Breast cancer
mammography             Present        Absent
Positive                (a) 132        (b) 983
Negative                (c) 45         (d) 63650

Sensitivity = a / (a + c) = 132 / (132 + 45) = 74.6%
Specificity = d / (b + d) = 63650 / (983 + 63650) = 98.5%
Sensitivity and specificity alone do not describe how the screening test performs in a given
population, because they do not take into account how common the disease is.
Thus, the positive and negative predictive values are needed.
Predictive Value Positive (PV+): the proportion of people with a positive screening test result
who actually have the disease (that is, who also test positive on the diagnostic test).
Predictive Value Negative (PV-): the proportion of people with a negative screening test result
who are actually free of disease.

Performance Yield
Predictive value positive (PV+): The probability that a person actually has the disease given
that he or she tests positive.
PV+ = a / (a + b)
Predictive value negative (PV-): The probability that a person is truly disease free given that
he or she tests negative.
PV- = d / (c + d)


Performance Yield: worked example

Results of            True Disease Status
screening test        +              -
Positive              (a) 400        (b) 995
Negative              (c) 100        (d) 98905

Sensitivity: a / (a + c) = 400 / (400 + 100) = 80%
Specificity: d / (b + d) = 98905 / (995 + 98905) = 99%
PV+: a / (a + b) = 400 / (400 + 995) = 29%
Among persons who screen positive, 29% are found to have the disease.
PV-: d / (c + d) = 98905 / (100 + 98905) = 99.9%
Among persons who screen negative, 99.9% are found to be disease free.
Factors that influence PV+ and PV-
1. The more specific the test, the higher the PV+
2. The higher the prevalence of preclinical disease in the screened population, the higher the PV+
3. The more sensitive the test, the higher the PV-
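The effect of prevalence on the predictive values can be illustrated with Bayes' rule. In the sketch below, the sensitivity (0.80) and specificity (0.99) are the rounded values from the worked screening example, while the list of prevalences is hypothetical and chosen only to show the trend.

```python
def ppv(sensitivity, specificity, prevalence):
    """Probability of disease given a positive test."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def npv(sensitivity, specificity, prevalence):
    """Probability of no disease given a negative test."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Hypothetical prevalences; sensitivity and specificity are held fixed.
for prevalence in (0.001, 0.005, 0.05, 0.20):
    pv_pos = ppv(0.80, 0.99, prevalence)
    pv_neg = npv(0.80, 0.99, prevalence)
    print(f"prevalence {prevalence:.3f}: PV+ = {pv_pos:.3f}, PV- = {pv_neg:.4f}")
```

With the test characteristics held fixed, PV+ rises steeply as the disease becomes more common, which is why the same test can perform very differently in a screening population and in a referral clinic.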
Therefore, in revision:

                    Disease
Screening test      Present    Absent    Totals
Positive            a          b         a+b
Negative            c          d         c+d
Totals              a+c        b+d

Sensitivity = a / (a+c) and Specificity = d / (b+d): quantify a screening test's accuracy given
the known disease status of subjects.
PPV = a / (a+b) and NPV = d / (c+d): quantify a screening test's accuracy given only the test
results of subjects.
Concept of Likelihood ratios

            Disease +    Disease -
Test +      a            b
Test -      c            d

Likelihood ratio (+) = LR+ = (T+ / all D+) / (T+ / all D-) = [a/(a+c)] / [b/(b+d)]
= sensitivity / (1 - specificity)
Likelihood ratio (-) = LR- = (T- / all D+) / (T- / all D-) = [c/(a+c)] / [d/(b+d)]
= (1 - sensitivity) / specificity
Pre-test odds, also called prior odds = Prob of disease / (1 - Prob of disease)
Prob of disease = (a+c) / (a+b+c+d)
Prior odds = [(a+c) / (a+b+c+d)] / [1 - (a+c) / (a+b+c+d)] = (a+c) / (b+d)
Post odds (+) = (LR+) x prior odds
= {[a/(a+c)] / [b/(b+d)]} x {[(a+c) / (a+b+c+d)] / [1 - (a+c) / (a+b+c+d)]} = a/b
Post odds (-) = (LR-) x prior odds
= {[c/(a+c)] / [d/(b+d)]} x {[(a+c) / (a+b+c+d)] / [1 - (a+c) / (a+b+c+d)]} = c/d
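A numerical sketch of these formulas is given below. It reuses the counts from the worked screening example (a = 400, b = 995, c = 100, d = 98905), which is an assumption made here for illustration since this section gives no numbers of its own. Converting the post-test odds back to a probability recovers PV+ for a positive test.

```python
# Counts reused from the worked screening example (an illustrative choice).
a, b, c, d = 400, 995, 100, 98905

sensitivity = a / (a + c)
specificity = d / (b + d)

lr_pos = sensitivity / (1 - specificity)   # likelihood ratio of a positive test
lr_neg = (1 - sensitivity) / specificity   # likelihood ratio of a negative test

prior_prob = (a + c) / (a + b + c + d)     # prevalence in the tested group
prior_odds = prior_prob / (1 - prior_prob)

post_odds_pos = lr_pos * prior_odds        # odds of disease after a positive test
post_odds_neg = lr_neg * prior_odds        # odds of disease after a negative test

def odds_to_probability(odds):
    """Convert odds to a probability."""
    return odds / (1 + odds)

post_prob_pos = odds_to_probability(post_odds_pos)   # equals PV+ = a / (a + b)
post_prob_neg = odds_to_probability(post_odds_neg)   # equals 1 - PV- = c / (c + d)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
print(f"Post-test probability after a positive test = {post_prob_pos:.3f}")
print(f"Post-test probability after a negative test = {post_prob_neg:.4f}")
```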
Receiver Operating Characteristic (ROC) Curve
An ROC curve plots the false positive rate (1 - specificity) on the X axis against the true
positive rate (sensitivity) on the Y axis.
[Figure: examples of ROC curves]
ROC analysis provides a useful means to assess the diagnostic accuracy of a test and to compare the
performance of more than one test for the same outcome. However, the usefulness of the test must be
considered in the light of the clinical circumstances.
For example, the ability of two continuous variables to diagnose an outcome can be compared using
their ROC curves and the areas under the ROC curves (AUROCs).
[Fig. 3 and Table 6: ROC curves and AUROCs for urea and for lactate]
Reference for figures and table: Bewick et al., Critical Care, December 2004, Vol 8 No 6.
The AUROC for urea is greater than that for lactate (0.730 for urea as compared to 0.64 for lactate),
suggesting that urea may provide a better predictive test for mortality.
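To make the construction of an ROC curve concrete, the sketch below computes the (false positive rate, true positive rate) points at every threshold and the area under the curve by the trapezoidal rule. The scores and outcomes are hypothetical and are not the urea or lactate data from Bewick et al.; the function names are illustrative.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    A subject is called test-positive when the score is at or above the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical test scores and true outcomes (1 = outcome present).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auroc = area_under_curve(points)
print(points)
print(f"AUROC = {auroc:.4f}")
```

An AUROC of 0.5 corresponds to a test no better than chance and 1.0 to perfect discrimination.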
Tests for Reliability
Example: a self-reported depression test (new test) compared with the standard approach to
diagnosing depression (clinical exam), in 100 paired observations.

                               Clinical exam
Self-reporting test            Depressed    Not depressed
Self reported depressed        25           5
Self reported not depressed    10           60

1. Percent Agreement: Divide the number of paired observations in the agreement cells by the total
number of paired observations.
Using the data from our example:
(25+60)/100 x 100% = 85%
Advantages
Simple to use
Can be extended to discrete scores with more than two levels (for example: not depressed, mild
depression, severe depression)
Disadvantage
Values tend to be high whenever the absent-absent cell is large.

2. Percent Positive Agreement: Divide the number of positive paired observations by the average
number of positives across both ratings.
Using the data from our example:
25/[((25+10) + (25+5))/2] x 100% = 25/32.5 x 100% = 76.9%
This represents the number of times both ratings give a positive result out of the
average number of positives given by either rating.
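Both agreement measures can be reproduced from the four cells of the depression example. In the sketch below the off-diagonal count of 5 is taken from the (25+5) term in the calculation above; the variable names are illustrative.

```python
# Cells of the self-report vs clinical exam table.
both_depressed = 25
self_only_depressed = 5        # self-report positive, clinical exam negative
clinical_only_depressed = 10   # self-report negative, clinical exam positive
both_not_depressed = 60

total = (both_depressed + self_only_depressed
         + clinical_only_depressed + both_not_depressed)

# Percent agreement: agreement cells over all paired observations.
percent_agreement = (both_depressed + both_not_depressed) / total * 100

# Percent positive agreement: joint positives over the average number of
# positives given by the two ratings.
positives_clinical = both_depressed + clinical_only_depressed
positives_self = both_depressed + self_only_depressed
average_positives = (positives_clinical + positives_self) / 2
percent_positive_agreement = both_depressed / average_positives * 100

print(f"Percent agreement          = {percent_agreement:.1f}%")
print(f"Percent positive agreement = {percent_positive_agreement:.1f}%")
```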
3. Kappa Statistic: The fraction of the observed agreement not due to chance, relative to the
maximum possible non-chance agreement.
K = (P0 - Pe) / (1 - Pe)
P0 = the proportion of observed agreement
Pe = the proportion of agreement expected to occur by chance alone
From our example:
P0 = (25+60)/100 = .85
Pe is the sum of the chance-expected agreement for each cell on the diagonal. The expected count
for each cell is the product of the corresponding marginals divided by the total:
(25+5) x (25+10) / 100 = 10.5
(60+5) x (60+10) / 100 = 45.5
Therefore, Pe = (10.5+45.5)/100 = .56
K = (P0 - Pe)/(1 - Pe) = (.85 - .56)/(1 - .56) = .66
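The kappa calculation can be sketched for any 2 x 2 agreement table as below. The function name and argument order are illustrative; the counts are those of the depression example, with 5 as the off-diagonal cell used in the (25+5) marginal.

```python
def cohen_kappa(both_pos, first_only_pos, second_only_pos, both_neg):
    """Kappa for a 2 x 2 table of paired ratings."""
    total = both_pos + first_only_pos + second_only_pos + both_neg
    observed = (both_pos + both_neg) / total

    # Marginal totals for each rating.
    first_pos = both_pos + first_only_pos
    first_neg = second_only_pos + both_neg
    second_pos = both_pos + second_only_pos
    second_neg = first_only_pos + both_neg

    # Chance-expected counts on the diagonal: product of marginals over total.
    expected_pos = first_pos * second_pos / total
    expected_neg = first_neg * second_neg / total
    expected = (expected_pos + expected_neg) / total

    return (observed - expected) / (1 - expected)

# Self-report (first rating) vs clinical exam (second rating).
kappa = cohen_kappa(both_pos=25, first_only_pos=5, second_only_pos=10, both_neg=60)
print(f"Kappa = {kappa:.2f}")
```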
Range of Kappa: -1 to 1
-1: Complete disagreement
0: Random agreement
1: Complete agreement
Suggested guidelines
< .4 Poor agreement beyond chance
.4-.75 Fair to good agreement beyond chance
> .75 Excellent agreement beyond chance
Reference: Landis and Koch (1977). The measurement of observer agreement for categorical data.
Biometrics, 33:159-174.
4. Intraclass Correlation Coefficient (ICC): estimates the fraction of the total measurement
variability caused by variation among individuals.
This is an extension of the kappa; same range of scores (-1 to 1)
ICC = Vb / (Vb + Ve)
Vb = Variance between individuals
Ve = Error variance
The ICC can be calculated from an ANOVA table.
More complex approaches to estimating the ICC also exist, which take into account the
random effects of subjects and raters.
5. Coefficient of Variability (CV): the standard deviation expressed as a percentage of the mean
value of two sets of paired observations.
For each paired set of observations, calculate the variance.
For a pair of scores of 25 and 35, the mean of the two observations is 30 and the variance is
(25-30)^2 + (35-30)^2 = 50 (the sum of squared deviations divided by n - 1 = 1).
The CV for the pair is the standard deviation of the paired observations divided by the mean of
the pair:
SQRT(50)/30 = .24 (24%)
This is then repeated for each pair.
The overall CV is the average of the pairwise CVs.
The lower the CV, the less variation there is between the repeated measurements.
If there are no differences within pairs, the CV is zero.
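The pairwise CV calculation can be sketched with Python's standard `statistics` module. The first pair (25, 35) is the one from the text; the other pairs are hypothetical and are included only to show the averaging step.

```python
from statistics import mean, stdev

def pair_cv(pair):
    """Sample standard deviation of a pair divided by the mean of the pair."""
    return stdev(pair) / mean(pair)

# First pair is from the text; the remaining pairs are hypothetical.
pairs = [(25, 35), (40, 42), (18, 18), (30, 27)]

pairwise_cvs = [pair_cv(pair) for pair in pairs]
overall_cv = mean(pairwise_cvs)

for pair, cv in zip(pairs, pairwise_cvs):
    print(f"{pair}: CV = {cv:.3f}")
print(f"Overall CV = {overall_cv:.3f}")
```

`stdev` uses the n - 1 denominator, which for a pair of observations is 1, matching the variance of 50 computed above.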
For validity, we could use sensitivity, specificity, PV+ and PV-.