Sensitivity and specificity


Sensitivity and specificity are statistical measures of the performance of a binary classification test. Sensitivity
(also called recall rate in some fields) measures the proportion of actual positives which are correctly identified as
such (e.g. the percentage of sick people who are correctly identified as having the condition). Specificity measures
the proportion of negatives which are correctly identified (e.g. the percentage of healthy people who are correctly
identified as not having the condition). These two measures are closely related to the concepts of type I and type II
errors. A theoretical, optimal prediction can achieve 100% sensitivity (i.e. predict all people from the sick group as
sick) and 100% specificity (i.e. not predict anyone from the healthy group as sick).
For any test, there is usually a trade-off between the measures. For example: in an airport security setting in which
one is testing for potential threats to safety, scanners may be set to trigger on low-risk items like belt buckles and
keys (low specificity), in order to reduce the risk of missing objects that do pose a threat to the aircraft and those
aboard (high sensitivity). This trade-off can be represented graphically using an ROC curve.
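
To make the trade-off concrete, the short sketch below (Python, with made-up labels and scores rather than data from this article) sweeps a decision threshold over a scoring test and reports the resulting sensitivity/specificity pairs; plotting such pairs across all thresholds is what produces an ROC curve.

```python
# Minimal sketch: sweeping a decision threshold to show the
# sensitivity/specificity trade-off behind an ROC curve.
# Labels and scores are hypothetical illustrative data.

def sens_spec(labels, scores, threshold):
    """Classify score >= threshold as positive and count the outcomes."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

labels = [1, 1, 1, 0, 0, 0, 0, 0]                   # 1 = actual positive, 0 = actual negative
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]  # hypothetical test scores

for t in (0.2, 0.5, 0.8):  # lower threshold -> higher sensitivity, lower specificity
    sens, spec = sens_spec(labels, scores, t)
    print(f"threshold={t}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```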

Definitions
Imagine a study evaluating a new test that screens people for a disease. Each person taking the test either has or does
not have the disease. The test outcome can be positive (predicting that the person has the disease) or negative
(predicting that the person does not have the disease). The test results for each subject may or may not match the
subject's actual status. In that setting:
• True positive: Sick people correctly diagnosed as sick
• False positive: Healthy people incorrectly identified as sick
• True negative: Healthy people correctly identified as healthy
• False negative: Sick people incorrectly identified as healthy.
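
These four categories translate directly into counts. A minimal sketch, assuming (hypothetically) that the actual statuses and the test predictions are available as 0/1 lists:

```python
# Minimal sketch (hypothetical data): tallying the four outcomes
# from actual disease status and test predictions.

actual    = [1, 1, 1, 0, 0, 0, 0]   # 1 = has the disease, 0 = does not
predicted = [1, 0, 1, 0, 1, 0, 0]   # 1 = test positive, 0 = test negative

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # sick, called sick
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # healthy, called sick
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # healthy, called healthy
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # sick, called healthy

print(tp, fp, tn, fn)  # 2 1 3 1
```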

Specificity
Consider detection dogs used by law enforcement to search for drugs. One dog may be trained specifically to find cocaine, while another is trained to find cocaine, heroin and marijuana. The second dog is searching for so many smells that it can become confused and sometimes picks out odours like shampoo, leading the agents to innocent packages; it is therefore less specific. A much larger number of packages will be "picked up" as suspicious by the second dog, producing false positives – test results labeled as positive (drugs) that are really negative (shampoo).
In terms of specificity, the first dog does not pick out shampoo or other non-target odours, so it is very specific: if it makes a call, it has a high chance of being right. The second dog finds more drugs (higher sensitivity), but is less specific for cocaine because it also flags shampoo; it makes more calls, but each call has a greater chance of being wrong. Which dog you choose depends on what you want to do.

A specificity of 100% means that the test recognizes all actual negatives: for example, all healthy people will be recognized as healthy. Because 100% specificity means no negatives are incorrectly tagged as positive, a positive result from a high-specificity test is useful for confirming the disease. This maximum can, however, be achieved trivially by a test that always reports negative, i.e. one that classifies everybody as healthy regardless of their true condition. Specificity alone therefore does not tell us how well the test recognizes positive cases; we also need to know the sensitivity of the test.
A test with a high specificity has a low type I error rate.
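
As a small illustration (hypothetical counts, not from the article), the sketch below computes specificity as TN / (TN + FP) and shows how an always-negative test trivially reaches 100% specificity:

```python
# Minimal sketch (hypothetical counts): specificity, and the
# degenerate "always negative" test described above.

def specificity(tn, fp):
    """Proportion of actual negatives correctly identified: TN / (TN + FP)."""
    return tn / (tn + fp)

print(specificity(tn=96, fp=4))   # 0.96 for an ordinary test

# A test that always reports negative produces no false positives,
# so its specificity is trivially 1.0 even though it finds no disease.
print(specificity(tn=100, fp=0))  # 1.0, yet its sensitivity would be 0
```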

Sensitivity
Continuing with the example of the law enforcement tracking dog, an old dog might be retired because its nose
becomes less sensitive to picking up the odor of drugs, and it begins to miss lots of drugs that it ordinarily would
have sniffed out. This dog illustrates poor sensitivity, as it would give an "all clear" to not only those packages that
do not contain any drugs (true negatives), but also to some packages that do contain drugs (false negatives).

A sensitivity of 100% means that the test recognizes all actual positives: for example, all sick people are recognized as being ill. Thus, in contrast to a high-specificity test, a negative result from a high-sensitivity test is used to rule out the disease.
Sensitivity alone does not tell us how well the test handles the other class (the negative cases); in binary classification, as illustrated above, that is measured by the corresponding specificity.
Sensitivity is not the same as the precision or positive predictive value (ratio of true positives to combined true and
false positives), which is as much a statement about the proportion of actual positives in the population being tested
as it is about the test.
The calculation of sensitivity does not take into account indeterminate test results. If a test cannot be repeated, the
options are to exclude indeterminate samples from analysis (but the number of exclusions should be stated when
quoting sensitivity), or, alternatively, indeterminate samples can be treated as false negatives (which gives the
worst-case value for sensitivity and may therefore underestimate it).
A test with a high sensitivity has a low type II error rate.
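
A minimal sketch of these points (hypothetical counts; the function name and parameters are illustrative, not from the article): sensitivity as TP / (TP + FN), with indeterminate results either excluded from the calculation or counted as false negatives for a worst-case value.

```python
# Minimal sketch (hypothetical counts): sensitivity, with the two ways
# of handling indeterminate results described above.

def sensitivity(tp, fn, indeterminate=0, treat_indeterminate_as_fn=False):
    """TP / (TP + FN); indeterminate results are either excluded or counted as FN."""
    if treat_indeterminate_as_fn:
        fn += indeterminate          # worst case: lowest possible sensitivity
    return tp / (tp + fn)

# Excluding the 2 indeterminate samples (the number of exclusions should be stated):
print(sensitivity(tp=40, fn=8, indeterminate=2))                                  # ~0.833
# Treating them as false negatives gives the worst-case value:
print(sensitivity(tp=40, fn=8, indeterminate=2, treat_indeterminate_as_fn=True))  # 0.8
```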

Medical example
In medical diagnostics, test sensitivity is the ability of a test to correctly identify those with the disease (the true positive rate), whereas test specificity is the ability of the test to correctly identify those without the disease (the true negative rate)[1]. If 100 patients known to have a disease were tested and 43 test positive, then the test has 43% sensitivity. If 100 patients with no disease are tested and 96 return a negative result, then the test has 96% specificity.
A highly specific test is unlikely to give a false positive result: a positive result should thus be regarded as a true positive. A highly sensitive test rarely misses a condition, so a negative result should be reassuring (the disease tested for is absent).
SPIN and SNOUT are commonly used mnemonics for this: a highly SPecific test, when Positive, rules IN disease (SP-P-IN), and a highly SeNsitive test, when Negative, rules OUT disease (SN-N-OUT).
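
As a trivial check, the quoted figures follow directly from the counts given in the text:

```python
# Sanity check of the figures quoted above (counts as given in the text).
sensitivity = 43 / 100   # 43 of 100 patients with the disease test positive
specificity = 96 / 100   # 96 of 100 patients without the disease test negative
print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")  # 43%, 96%
```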

Worked example
Relationships among terms

                           Condition (as determined by the "Gold standard")
                           Positive                  Negative

Test outcome   Positive    True Positive             False Positive            → Positive predictive value
                                                     (Type I error)

               Negative    False Negative            True Negative             → Negative predictive value
                           (Type II error)

                           ↓                         ↓
                           Sensitivity               Specificity

A worked example
The fecal occult blood (FOB) screen test was used in 203 people to look for bowel cancer:

                           Patients with bowel cancer
                           (as confirmed on endoscopy)
                           Positive                  Negative

Fecal occult   Positive    TP = 2                    FP = 18                   → Positive predictive value
blood screen                                                                     = TP / (TP + FP)
test outcome                                                                     = 2 / (2 + 18)
                                                                                  = 2 / 20
                                                                                  = 10%

               Negative    FN = 1                    TN = 182                  → Negative predictive value
                                                                                  = TN / (FN + TN)
                                                                                  = 182 / (1 + 182)
                                                                                  = 182 / 183
                                                                                  ≈ 99.5%

                           ↓                         ↓
                           Sensitivity               Specificity
                           = TP / (TP + FN)          = TN / (FP + TN)
                           = 2 / (2 + 1)             = 182 / (18 + 182)
                           = 2 / 3                   = 182 / 200
                           ≈ 66.67%                  = 91%

Related calculations
• False positive rate (α) = FP / (FP + TN) = 18 / (18 + 182) = 9% = 1 − specificity
• False negative rate (β) = FN / (TP + FN) = 1 / (2 + 1) = 33% = 1 − sensitivity
• Power = sensitivity = 1 − β
• Likelihood ratio positive = sensitivity / (1 − specificity) = 66.67% / (1 − 91%) = 7.4
• Likelihood ratio negative = (1 − sensitivity) / specificity = (1 − 66.67%) / 91% = 0.37
Hence, with large numbers of false positives and few false negatives, a positive FOB screen test is in itself poor at confirming cancer (PPV = 10%) and further investigations must be undertaken; it will, however, pick up 66.7% of all cancers (the sensitivity). As a screening test, though, a negative result is very good at reassuring that a patient does not have cancer (NPV = 99.5%), and at this initial screen the test correctly identifies 91% of those who do not have cancer (the specificity).
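
All of the figures in this worked example can be reproduced from the four cell counts of the table; a short sketch (variable names are illustrative):

```python
# Reproducing the FOB worked example from its confusion matrix.
TP, FP, FN, TN = 2, 18, 1, 182

sensitivity = TP / (TP + FN)                   # 2/3     ≈ 66.67%
specificity = TN / (FP + TN)                   # 182/200 = 91%
ppv         = TP / (TP + FP)                   # 2/20    = 10%
npv         = TN / (FN + TN)                   # 182/183 ≈ 99.5%
fpr         = FP / (FP + TN)                   # 9%  = 1 - specificity
fnr         = FN / (TP + FN)                   # 33% = 1 - sensitivity
lr_pos      = sensitivity / (1 - specificity)  # ≈ 7.4
lr_neg      = (1 - sensitivity) / specificity  # ≈ 0.37

print(f"sensitivity={sensitivity:.4f} specificity={specificity:.2f} "
      f"PPV={ppv:.2f} NPV={npv:.4f} FPR={fpr:.2f} FNR={fnr:.2f} "
      f"LR+={lr_pos:.1f} LR-={lr_neg:.2f}")
```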

Terminology in information retrieval


In information retrieval positive predictive value is called precision, and sensitivity is called recall.
The F-measure can be used as a single measure of performance of the test. The F-measure is the harmonic mean of precision and recall:

F = 2 × (precision × recall) / (precision + recall)
In the traditional language of statistical hypothesis testing, the sensitivity of a test is called the statistical power of the test, although the word power has a more general usage in that context which does not apply here. A sensitive test will have fewer type II errors.

Terminology and derivations from a confusion matrix

true positive (TP)
  eqv. with hit
true negative (TN)
  eqv. with correct rejection
false positive (FP)
  eqv. with false alarm, Type I error
false negative (FN)
  eqv. with miss, Type II error
sensitivity or true positive rate (TPR)
  eqv. with hit rate, recall
  TPR = TP / (TP + FN)
false positive rate (FPR)
  eqv. with false alarm rate, fall-out
  FPR = FP / (FP + TN) = 1 − SPC
accuracy (ACC)
  ACC = (TP + TN) / (TP + TN + FP + FN)
specificity (SPC) or true negative rate
  SPC = TN / (FP + TN)
positive predictive value (PPV)
  eqv. with precision
  PPV = TP / (TP + FP)
negative predictive value (NPV)
  NPV = TN / (TN + FN)
false discovery rate (FDR)
  FDR = FP / (FP + TP) = 1 − PPV
Matthews correlation coefficient (MCC)
  MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))

Source: Fawcett (2004).

See also
• Binary classification
• Detection theory
• Receiver operating characteristic
• Statistical significance
• Type I and type II errors
• Selectivity
• Youden's J statistic
• Matthews correlation coefficient
• Gain (information retrieval)
• Accuracy and precision

Further reading
• Altman DG, Bland JM (1994). "Diagnostic tests. 1: Sensitivity and specificity" [2]. BMJ 308 (6943): 1552.
PMID 8019315. PMC 2540489.

External links
• Sensitivity and Specificity [3] Medical University of South Carolina
• Calculators:
• Vassar College's Sensitivity/Specificity Calculator [4]
• OpenEpi software program

References
[1] MedStats.Org (http://medstats.org/sens-spec)
[2] http://www.bmj.com/cgi/content/full/308/6943/1552
[3] http://www.musc.edu/dc/icrebm/sensitivity.html
[4] http://faculty.vassar.edu/lowry/clin1.html
