
JOURNAL OF BONE AND MINERAL RESEARCH

Volume 16, Number 5, 2001


© 2001 American Society for Bone and Mineral Research

Classification of Osteoporosis Based on Bone Mineral Densities

YING LU,1 HARRY K. GENANT,1 JOHN SHEPHERD,1 SHOUJUN ZHAO,1 ASHWINI MATHUR,2
THOMAS P. FUERST,1 and STEVEN R. CUMMINGS3

ABSTRACT

In this article we examine the role of bone mineral density (BMD) in the diagnosis of osteoporosis. Using
information from 7671 women in the Study of Osteoporotic Fractures (SOF) with BMD measurements at the
proximal femur, lumbar spine, forearm, and calcaneus, we examine three models with differing criteria for the
diagnosis of osteoporosis. Model 1 is based on the World Health Organization (WHO) criteria using a T score
of −2.5 relative to the manufacturers' young normative data aged 20–29 years, with modifications using
information from the Third National Health and Nutrition Examination Survey (NHANES). Model 2 uses a
T score of −1 relative to women aged 65 years at the baseline of the SOF population. Model 3 classifies women
as osteoporotic if their estimated osteoporotic fracture risk (spine and/or hip) based on age and BMD is above
14.6%. We compare the agreement in osteoporosis classification according to the different BMD measure-
ments for the three models. We also consider whether reporting additional BMD parameters at the femur or
forearm improves risk assessment for osteoporotic fractures. We observe that using the WHO criteria with the
manufacturers' normative data results in very inconsistent diagnoses. Only 25% of subjects are consistently
diagnosed by all of the eight BMD variables. Such inconsistency is reduced by using a common elderly
normative population as in model 2, in which case 50% of the subjects are consistently diagnosed as
osteoporotic by all of the eight diagnostic methods. Risk-based diagnostic criteria as in model 3 improve
consistency substantially to 68%. Combining the results of BMD assessments at more than one region of
interest (ROI) from a single scan significantly increases prediction of hip and/or spine fracture risk and
elevates the relative risk with increasing number of low BMD subregions. We conclude that standardization
of normative data, perhaps referenced to an older population, may be necessary when applying T scores as
diagnostic criteria in patient management. A risk-based osteoporosis classification does not depend on the
manufacturers' reference data and may be more consistent and efficient for patient diagnosis. (J Bone Miner
Res 2001;16:901–910)

Key words: bone densitometry, classification, osteoporosis, diagnostic agreement

INTRODUCTION

Bone mineral density (BMD) and bone mineral content (BMC) values are used to diagnose osteoporosis.(1) According to the World Health Organization (WHO) working group,(2) white women whose BMD or BMC values are below −2.5 T score compared with young normal adults are considered to have osteoporosis. Although based on the properties of dual X-ray absorptiometry (DXA) scans of spine and hip, this diagnostic approach does not specify the

1 Osteoporosis and Arthritis Research Group, Department of Radiology, University of California San Francisco, San Francisco, California, USA.
2 Statistical Sciences, SmithKline Beecham Pharmaceuticals, King of Prussia, Pennsylvania, USA.
3 Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California, USA.

902 LU ET AL.

anatomic measurement sites and/or techniques and, therefore, as the WHO working group notes, individuals will be categorized differently according to the site and technique of measurements and the equipment and the reference population used.(2) The choice of bone measurement technology, the corresponding reference population, and the measurement sites will have a direct impact on the management of individual patients in clinical practice.

Although there is a wide array of bone mineral measurement techniques available or under development, the most commonly used is X-ray absorptiometry, either single X-ray absorptiometry (SXA) or dual X-ray absorptiometry (DXA). DXA and/or SXA scanners can measure many anatomic sites.(3) The most common sites are the lumbar spine, proximal femur, forearm, and calcaneus. Each provides a reasonable capacity for generic fracture risk prediction,(1) and each correlates moderately well to the others.(4,5) However, this modest correlation does not necessarily lead to consistent diagnostic classification of individual patients when threshold-based criteria are used.(5–13) These inconsistencies may be caused by differences in reference data, differences in peak bone levels and rates of bone loss, in technique-related sources of error, and in individual anatomic interrelationships.(3) These inconsistencies also make it unclear whether combining the results of several measurement techniques and/or sites reduces misclassification or improves fracture risk assessment.(7,9–11,14) Furthermore, inconsistencies in patient diagnosis make it difficult to develop treatment and reimbursement standards.(15)

The use of bone densitometry in patient management differs from its use in mass screening or epidemiological study. Under managed care, physicians are obligated to reduce the cost of examinations by eliminating unnecessary scans. At the same time, they are responsible for accurately diagnosing their patients; a truly osteoporotic patient must not go undiagnosed, and a healthy patient must not be falsely diagnosed as osteoporotic. Using only one diagnostic technique and/or one BMD parameter may fail to detect patients who might be diagnosed with another method. Measuring multiple sites reduces the likelihood of missing an osteoporotic patient but increases expense; false-positive rates will increase, resulting in some normal subjects receiving treatment. The choice of technique and site will affect the diagnostic conclusions as well as the cost, directly affecting the choice of intervention strategy for an individual patient.

Given these conditions, we attempted to distinguish the differences in the diagnosis of osteoporosis based on different BMD sites and to examine their clinical ramifications. Using a prospective cohort from the Study of Osteoporotic Fractures (SOF), we applied three approaches to investigate the magnitude of discordance in the diagnosis of osteoporosis. Model 1 was based on the WHO criteria and the manufacturers' young normative data for different measurement sites. Model 2 used a standardized older reference population to derive a T score–like measure. Model 3 used a cut-off threshold based on osteoporotic fracture risk. Finally, we considered whether assessing more than one BMD region of interest (ROI) from a single scan site helps predict hip and/or spinal fractures in a clinical setting.

MATERIALS AND METHODS

Subjects

From 1986 to 1988, the SOF recruited 9704 white women aged 65 years or older from population-based listings in four areas of the United States. At baseline, BMD was measured at the calcaneus, distal radius, and proximal radius, using single photon absorptiometry (SPA). At the second visit (1988–1989), surviving participants had BMD measurements of the postero-anterior (PA) spine (L1–L4) and proximal femur (neck, trochanter, Ward's triangle, and total hip ROIs) using DXA. Fractures of the hip and spine were recorded for each subject at each visit. Details of the study design and the data have been published.(16,17)

We included 7671 women from the study for whom all the BMD measurements were available. All analyses of the classification of osteoporosis according to BMD measurements in this article were based on this study population.

Fractures

All women who had fractures of the hip and/or spine were included, whether they were followed for 5 years or not. We also included all women who actually were followed for 5 years, whether they had fractures in that time or not. Women who were followed for less than 5 years and had no fractures were excluded. These parameters gave us the records of 5568 women that were suitable for analyses related to hip fracture.

Lateral spine radiographs were obtained at baseline and about 3 years later. Incident vertebral deformities were defined by quantitative morphometry as a 20% reduction and at least 4-mm loss in the height of anterior, posterior, or midheight of any vertebra between L4 and T4, during the time between baseline and follow-up visits.(18,19) The average time between the baseline and follow-up radiographs was about 3.7 years. An experienced radiologist provided visual confirmation of the diagnoses. The records of 6561 women were suitable for the analyses related to spinal fractures.

By combining hip and spine fractures, 5066 women met our inclusion criteria of at least one fracture or completion of 5 years' follow-up. We considered hip and/or spine fractures in the SOF because they are probably the most relevant osteoporotic fractures. Unless specified otherwise, in this article "fracture" will refer to osteoporotic fractures of the hip and/or the spine.

BMD-based classification of patients

The DXA scanners used in the SOF were the Hologic QDR 1000 (Hologic, Inc., Waltham, MA, USA). The SPA scanners were from OsteoAnalyzer (Siemens-Osteon, Wahiawa, HI, USA). The unit of BMD measurement was in grams per square centimeter. Lumbar spine BMD was measured at L1–L4. The reference parameters (peak mean and SD of young women aged 20–29 years) for hip, spine, and calcaneus were provided by the manufacturers. The hip BMD normative data represent the new version based on results from the National Health and Nutrition Examination Survey (NHANES) study.(20,21) The forearm reference data
BMD CLASSIFICATIONS FOR OSTEOPOROSIS 903

TABLE 1. MEAN AND SDS OF NORMATIVE BMD VALUES (g/cm²) FROM MANUFACTURERS (MODEL 1), SOF 65 (MODEL 2),
AND OVERALL SOF DATA

                         Reference values for    Reference values for    Descriptive statistics
                         model 1 (young women    model 2 (564 SOF women  of SOF participants
                         aged 20–29 years)       aged 65 years at        (n = 7671)
                                                 baseline)
                         Peak mean (peak SD)     Mean (SD)               Mean (SD)

Age at baseline                                  65 (0)                  71.26 (5.04)
Total hip, BMD           0.942 (0.122)a          0.796 (0.13)            0.758 (0.131)
Trochanteric, BMD        0.703 (0.101)a          0.585 (0.10)            0.558 (0.102)
Femoral neck, BMD        0.849 (0.111)a          0.680 (0.11)            0.650 (0.110)
Ward's triangle, BMD     0.734 (0.117)a          0.462 (0.11)            0.428 (0.110)
Calcaneus, BMD           0.425 (0.07)b           0.435 (0.09)            0.407 (0.093)
Distal radius, BMD       0.38 (0.05)c            0.383 (0.08)            0.363 (0.084)
Proximal radius, BMD     0.79 (0.05)c            0.670 (0.10)            0.637 (0.103)
Lumbar spine, BMD        1.047 (0.11)b           0.872 (0.17)            0.857 (0.169)

a New Hologic reference values for model 1 were from NHANES study.(20)
b Reference values for model 1 were from the manufacturers.
c Reference values for forearms for model 1 were from Davis et al.(22)
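
As an illustration of how the reference values in Table 1 translate into a classification, the following minimal Python sketch computes a standardized score at the femoral neck against the young reference (model 1) and against the SOF 65 reference (model 2). A T score is taken here as the measured BMD minus the reference mean, divided by the reference SD. The function names and the example BMD value are hypothetical and are not part of the study; only the reference means, SDs, and the cut-offs of −2.5 and −1 come from the article.

```python
# Reference values (mean, SD) in g/cm^2 for the femoral neck, taken from Table 1.
YOUNG_REF = (0.849, 0.111)   # model 1: young women aged 20-29 years
SOF65_REF = (0.680, 0.11)    # model 2: SOF women aged 65 years at baseline


def t_score(bmd, ref_mean, ref_sd):
    """Standardized score: (measured BMD - reference mean) / reference SD."""
    return (bmd - ref_mean) / ref_sd


def is_osteoporotic(bmd, ref, cutoff):
    """Hypothetical helper: True if the standardized score is below the cut-off."""
    mean, sd = ref
    return t_score(bmd, mean, sd) < cutoff


# Hypothetical femoral neck BMD for one woman (illustrative value only).
bmd = 0.560
model1 = is_osteoporotic(bmd, YOUNG_REF, -2.5)   # model 1 criterion: T score below -2.5
model2 = is_osteoporotic(bmd, SOF65_REF, -1.0)   # model 2 criterion: score below -1
```

The same two functions apply to any site in Table 1 by swapping in that site's reference mean and SD, which is why the choice of reference population alone can change a woman's classification.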

were derived from Davis et al.(22) These reference values are listed in the first column of Table 1.

We used the WHO criteria to classify women as osteoporotic based on the young normative values.(2) We considered a woman to be osteoporotic if her BMD T score was less than −2.5. (A T score is defined as the BMD value minus the sex-matched young reference BMD divided by the sex-matched young reference SD at the corresponding anatomical site.) We refer to this classification as model 1.

In model 2, we substituted the peak young reference values in the T score with reference values from the SOF women aged 65 years at baseline. We refer to this alternative T score definition as T′ (T prime). The midcolumn of Table 1 lists reference values for this cohort. To make classifications based on this reference cohort comparable with model 1, we used a T′ score of −1 as the cut-off point for the SOF 65 group, which derived a comparable prevalence of osteoporosis based on the femoral neck BMD.

Model 3 was based on the estimated risk of osteoporotic fracture of the hip and/or spine for a given BMD measurement. The estimated risk was determined with a logistic regression equation derived from the SOF fracture data. Because age is always available without cost, we used both age and BMD values in the regression equations. In this model, classifications of osteoporosis were based on a threshold of the estimated fracture probability as illustrated in the Appendix, that is, patients with a fracture risk above the cut-off values were designated as osteoporotic. The cut-off values for fracture risk were 14.6% for model 3, based on follow-up periods of 5 years for the hip and 3.5 years for the spine. The fracture risk value of 14.6% was chosen to generate a prevalence of osteoporosis similar to models 1 and 2 according to neck BMD so that comparisons among models would be possible.

Statistics

Descriptive statistics were used to summarize the study population. Pearson's correlation coefficients were used to evaluate the association of BMD values from different anatomic measurement sites. κ-Statistics and percentage agreements were used to study the agreement in classifications.

To evaluate the diagnostic agreement for model 3, we used a 10-fold cross-prediction method to classify the women. We randomly divided the SOF data into 10 subsets. We derived cut-off values from nine of the subsets and used those values to classify women in the remaining subset. We rotated the target subset among all subsets so that all patients in the SOF study were classified based on other women. This was done to reduce the bias in classification because the classification rules and results to be compared are based on the same data.

Logistic regression analysis was used to assess the risk of fractures. All statistical analyses were carried out using a statistical software package (SAS 6.09; SAS Institute, Cary, NC, USA). The level of statistical significance was set at 0.05 throughout the study.

RESULTS

Classification of osteoporosis

Descriptive statistics of the study participants used in this article are shown in the last column of Table 1. Correlation coefficients between BMD parameters measured at different sites are shown in Table 2. The correlation coefficients ranged from 0.51 between proximal radius and Ward's triangle BMD to 0.90 between total hip and trochanteric BMD. Measurements from the same anatomic sites tended to have higher correlations than those from different sites. Correlation coefficients between two different sites were moderate (around 0.6).

In model 1, using the WHO criteria and the manufacturers' reference values, the prevalence of osteoporosis ranged from 3% based on calcaneal BMD to 60% based on the BMD of the proximal radius (Fig. 1). Examination of the cross-tables of classifications of osteoporosis, specifically

TABLE 2. CORRELATION COEFFICIENTS AMONG BMD MEASUREMENTSa

                  Trochant  Femoral  Ward's    Calcaneal  Distal  Proximal  Spine
                            neck     triangle             radius  radius

Total hip         0.901     0.861    0.838     0.693      0.599   0.597     0.662
Trochant                    0.760    0.766     0.669      0.555   0.539     0.646
Femoral neck                         0.868     0.617      0.532   0.516     0.603
Ward's triangle                                0.583      0.535   0.514     0.580
Calcaneal                                                 0.667   0.634     0.638
Distal radius                                                     0.757     0.566
Proximal radius                                                             0.551

a n = 7671; p < 0.0001.

the upper triangle of Table 3, shows that the percentages of agreement in the diagnosis of osteoporosis ranged from 43% between proximal radius and calcaneal BMD to 92% between total hip and trochanter BMD. Overall, the percentages of agreement varied greatly, indicating poor agreement in diagnosis. This is also evident in the very low κ-statistics given in the lower triangle of Table 3. The percentage of women who were consistently classified by all eight measurement sites was only 25%: 24% were classified as normal and 1% as osteoporotic (Fig. 2). These results suggest that using the WHO criteria and the manufacturers' reference data could result in very different diagnoses depending on the anatomic site and method used.

FIG. 1. Prevalence of osteoporosis based on eight BMD measurements.

Model 2 further examined whether the differences were caused by the use of younger reference values or by the use of threshold-based diagnostic criteria in general. The prevalence of osteoporosis classified in model 2 was substantially more consistent than in model 1 (Fig. 1). The most noticeable changes were for osteoporosis at Ward's triangle, calcaneus, and forearm. The upper triangle of Table 4 shows the percentages of agreement in classifications based on two BMD measurements. The percentages ranged from 75% between femoral neck and proximal radius to 89% between total hip and trochanter BMD. The κ-statistics for pairwise agreement in diagnosis ranged from 0.33 to 0.70, given in the lower triangle of Table 4. Overall, model 2 showed substantially improved agreement in classifications, particularly among classifications based on different femoral BMD measurements and when comparing calcaneal BMD to other locations. The percentage of women who were classified consistently by all eight BMD measurements was 49%: 45% as normal and 4% as osteoporotic (Fig. 2). Even with this improvement over model 1, more than 50% of the women were diagnosed differently depending on the anatomic sites measured. Most of the percentage agreements between two sites were above 80%, indicating that a fifth of the women would be classified differently if they were diagnosed based on two different BMD measurement sites. Except for measurements of hip BMD sites, the κ-statistics of other combinations were mostly less than 0.5, suggesting only moderate agreement among different sites after adjusting for agreement by chance.

Model 3 used a risk-based approach and further improved the agreement in classifications. The prevalence of osteoporosis by this model ranged from 22% to 26%, as shown in Fig. 1. The model substantially improved the κ-statistics (lower triangle of Table 5) as well as the percentage agreement (upper triangle of Table 5), although moderate discrepancies remained. The percentage of women who were classified consistently by all eight BMD measurements was 69%: 59% as normal and 10% as osteoporotic (Fig. 2) based on our 10-fold cross-prediction. We found that low hip BMD missed only 6% of women diagnosed as low by other BMD measurements in model 3.

Fracture outcome in the diagnostic classification of osteoporotic patients

Agreement in diagnostic classification is one aspect of the comparison of diagnostic criteria. Equally important is whether a diagnostic method reflects patient prognosis.(23)

Figure 3 shows the sensitivity, specificity, and the positive and negative predictive values for hip and/or spine fracture for each measurement in the three classification models. Although sensitivities and specificities varied most in model 1 and were most consistent in model 3, the variations in positive and negative predictive values in model 1 and the improvement in model 3 were not substantial. This is because both positive and negative predictive values were regulated by the low prevalence of fracture, which was 7%. Still, model 1 had the most variations, with positive predictive values ranging from 12% Ward's trian-

TABLE 3. PERCENTAGE AGREEMENT (UPPER TRIANGLE) AND κ-STATISTICS (LOWER TRIANGLE IN PERCENTAGE) IN
OSTEOPOROSIS DIAGNOSIS FOR MODEL 1a

Percent (%)       Total   Trochant  Femoral  Ward's    Calcaneal  Distal  Proximal  PA
                  hip               neck     triangle             radius  radius    spine

Total hip         -       92        86       58        85         84      54        75
Trochant          68      -         83       55        88         86      51        74
Femoral neck      59      45        -        66        77         77      57        74
Ward's triangle   25      19        38       -         45         49      69        64
Calcaneal         20      24        14       4         -          91      43        69
Distal radius     29      30        22       10        24         -       48        71
Proximal radius   18      15        23       37        4          11      -         63
PA spine          35      29        36       32        9          20      31        -

The upper triangle of the table is the percentage of agreement in classification of osteoporosis by two BMD measurements. The lower triangle of the table is the κ-statistic of classification of osteoporosis by two BMD measurements. The row and column titles are the two BMD sites to be measured.
a n = 7671.
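
The two agreement measures reported in this table can be computed from paired binary classifications. The sketch below, in Python, is a generic implementation of percentage agreement and Cohen's κ for two binary raters; it is not the SAS code used in the study, and the function name and the toy data are hypothetical.

```python
def agreement_and_kappa(a, b):
    """Return (percentage agreement, Cohen's kappa) for two binary classifications.

    a, b: equal-length sequences of 0/1 (1 = osteoporotic) for the same women.
    """
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa = sum(a) / n          # proportion classified osteoporotic by site A
    pb = sum(b) / n          # proportion classified osteoporotic by site B
    expected = pa * pb + (1 - pa) * (1 - pb)   # agreement expected by chance
    if expected == 1.0:
        kappa = float("nan")  # undefined when both sites give one identical constant label
    else:
        kappa = (observed - expected) / (1 - expected)
    return 100 * observed, kappa


# Toy data for ten hypothetical women (illustrative values only).
site_a = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
site_b = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0]
pct, kappa = agreement_and_kappa(site_a, site_b)
```

Because κ subtracts the agreement expected by chance, two sites that both classify few women as osteoporotic can agree on most women and still have a low κ, as with calcaneal and distal radius BMD in this table (91% agreement, κ of 24%).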

gle (WARDS) to 34% calcaneus (CALC). In contrast, the ranges of positive predictive values were from 17% (both forearm BMDs) to 24% (WARDS) for model 2 and 22% (PRAD) to 25% (all hip BMDs) for model 3. Negative predictive values were all above 90%.

FIG. 2. Pie chart for the distribution of the number of low BMD sites. The pie chart consists of SOF participants in nine categories according to their number of osteoporotic sites. The area of each category represents the percentage of participants in it. Participants who are classified consistently by eight BMD measurement sites had either no low BMD sites or eight low BMD sites.

Figure 4 shows the relative risks for fracture for osteoporotic women (those with lower BMD) compared with all the nonosteoporotic women (those with higher BMD) in the study, according to the three different diagnostic models. Relative risk here is for 5-year hip and/or spine fracture and compares two groups rather than the traditional 1 SD change in BMD. Osteoporotic women had significantly elevated risks of fracture in each model. Although model 2 provided significantly better agreements in classification and homogeneity of sensitivity and specificity, the relative risks were not all higher than those in model 1. The relative risks for classification by Ward's triangle, calcaneal, and distal radius BMD in model 1 were higher than the relative risks in model 2. However, given the very high prevalence of Ward's triangle osteoporosis and very low prevalence of calcaneus and distal radius osteoporosis in model 1, such advantages in relative risk were of no practical use. Model 3 had consistently higher relative risks across all eight measurement sites, compared with both models 1 and 2. Thus, model 3 was most relevant to the prognosis of the classified subjects.

Combination of measurements

The disagreement in patient classification and the multiple BMD parameters measured in hip and forearm scans lead to the question of whether additional BMD parameters derived from different ROIs of the same scan would be beneficial in patient management, without additional measurement cost. We examined the benefits of combining multiple BMD results from the hip or from the forearm scan measurements.

The relative risk of osteoporotic fracture as a function of the number (>0) of osteoporotic ROIs in a hip or forearm scan, in reference to women without osteoporotic ROIs in the corresponding site, is listed in Table 6. The risk increased with the number of osteoporotic ROIs assessed for either hip or forearm scans in all three models (Armitage trend test, p < 0.001). This suggests that the combination of osteoporotic status of all BMD ROIs from a hip or a forearm scan could help physicians better assess the prognosis of a patient without increasing diagnostic expense. Therefore, it may be beneficial to combine BMD information from different ROIs in the same hip or forearm scan in reporting patient bone status.

DISCUSSION

A number of important conclusions can be drawn from this study. The correlation between BMD measures is good, but the agreement in threshold-based classifications is only

TABLE 4. PERCENTAGE OF AGREEMENT (UPPER TRIANGLE) AND κ-STATISTICS (LOWER TRIANGLE IN PERCENTAGE) IN
OSTEOPOROSIS DIAGNOSIS FOR MODEL 2a

Percent (%)       Total   Trochant  Femoral  Ward's    Calcaneal  Distal  Proximal  PA
                  hip               neck     triangle             radius  radius    spine

Total hip         -       89        87       85        79         77      77        80
Trochant          70      -         83       83        79         76      76        80
Femoral neck      65      54        -        87        77         76      75        78
Ward's triangle   59      52        64       -         76         75      75        78
Calcaneal         46      45        41       37        -          78      77        78
Distal radius     39      35        34       33        44         -       83        77
Proximal radius   39      36        33       34        42         55      -         77
PA spine          42      41        36       35        40         34      34        -

This table is laid out in the same way as Table 3.
a n = 7671.

TABLE 5. PERCENTAGE OF AGREEMENT (UPPER TRIANGLE) AND κ-STATISTICS (LOWER TRIANGLE IN PERCENTAGE) IN
OSTEOPOROSIS DIAGNOSIS FOR MODEL 3a

Percent (%)       Total   Trochant  Femoral  Ward's    Calcaneal  Distal  Proximal  PA
                  hip               neck     triangle             radius  radius    spine

Total hip         -       92        90       89        87         86      86        86
Trochant          77      -         87       87        87         85      85        86
Femoral neck      74      65        -        91        86         85      85        85
Ward's triangle   71      65        76       -         85         85      85        85
Calcaneal         64      64        61       59        -          89      89        88
Distal radius     60      59        60       59        70         -       92        88
Proximal radius   60      58        58       58        69         78      -         88
PA spine          62      61        60       59        66         66      67        -

This table is laid out in the same way as Table 3.
a n = 7671.

modest. Threshold-based classifications depend heavily on the reference data used. Standardization of the reference data will reduce but not eliminate inconsistencies in osteoporosis diagnosis. Risk-based classifications further improve the consistency among different measurement sites and result in nearly 70% agreement among the eight BMD sites studied. Reporting the status of all ROIs at the hip or forearm enhances fracture risk prediction.

The ideas and models in this article are not new. The conclusions are not surprising. Many authors have reported inconsistency in WHO classifications.(5–10,12) There has been considerable effort to determine new definitions that will lead to more consistent classification.(23,24) Our article is the first to evaluate the impact of diagnostic classification using data from a large prospective study. Our results support previous observations based on small data sets and reinforce the suggestions that classification of osteoporosis should focus on risk of fracture rather than on T scores selected to provide a certain prevalence.(24–27)

Although consensus is forming that the diagnosis and treatment of osteoporosis should not be based solely on BMD but should include other considerations and risk factors,(1,24) BMD values are still the most frequently used and important factor. The WHO criteria produce inconsistent diagnoses of osteoporosis and are better applied in epidemiological and population-based studies than in individual diagnostic procedures or as thresholds for therapeutic decisions. This inconsistency has more consequences in clinical practice than in epidemiological studies. Physicians face the dilemma of deciding which patients should receive long-term treatments. Our study suggests that many of the inconsistencies are methodological and can be reduced by standardization to an older reference population and by using risk-based criteria. With increasing therapeutic choices in patient management, such information also can be used in choosing the optimal treatment strategies based on patient prognosis.

Because of logistical and proprietary concerns, manufacturers do not codevelop or share their reference populations. Device manufacturers typically use different reference populations for different machines. Our study supports the notion that different normative populations within and among manufacturers are a source of inconsistencies in classification based on T scores.(8,12,13) However, it seems that age may play a more dominant role in the disagreement. The disagreement in classifications could be reduced by using a unified elderly normative population in which the differences in age-related bone losses have been negated. Unfortunately, the current study cannot explicitly separate the effect of a standardized reference population and of

differential aging effects because we do not have complete manufacturer-based reference data for age 65 years for all measured sites. The greatest disagreements in model 1 were between peripheral BMD and central BMD measures. Because of changes in technology, we are not able to obtain reliable references for 65-year-old white women from manufacturers for forearm and calcaneal BMD. The International Committee for Standards in Bone Measurement has recommended the adoption of hip BMD measurements from the NHANES(20,21) as the standardized reference population for white women. Developing a universal reference population for all bone measurement sites and technologies would be a formidable task. However, there are efforts underway as proposed by Miller.(25) As diagnostic and treatment options increase, the importance of studying methodologies to calibrate future BMD measurements to the same reference population increases also.

FIG. 3. Sensitivity, specificity, and positive and negative predictive values for osteoporotic fracture based on eight BMD measurements.

FIG. 4. Relative fracture risk for hip and/or spine fracture based on eight BMD measurements. Relative risks for hip and/or spine fracture for osteoporotic women (lower BMD) compared with all the nonosteoporotic women (higher BMD) in the study.

Currently, the T score–based threshold approach for defining osteoporosis is widely applied. However, there are many advantages to a risk-based approach rather than T scores. We have previously underscored some of the pitfalls of using T scores for patient diagnosis.(9,10,26) Black et al. suggested avoiding the use of T scores as classification tools altogether and proposed including the relative risk of fractures as a standard tool in reporting bone densitometry results.(27) This article supports these arguments. Because relative risk depends on the definition of baseline fracture risk, it is much easier to define osteoporosis based on absolute fracture risks within a defined time window.

The risk-based diagnostic criteria used in this study showed strong and consistent separation of osteoporotic women from normal women. The better agreement, to a certain degree, could be attributed to the use of age as a variable in estimating risk and perhaps because the logistic regression model used information of fracture risks over the whole age range. Because age is a known risk factor for osteoporosis and costs nothing to obtain, it is beneficial to include it in our diagnostic criteria. Dropping age in the estimation of fracture risks in model 3 would result in a classification scheme very similar to model 2. Using age in the model to estimate the fracture risks causes the BMD cut-off values to vary with age. This should not be a problem in practice because the manufacturers could program the logistic regression equations and report such risks to physicians. It is already being used in a limited way. It is important to point out that model 3 offers significantly

TABLE 6. RELATIVE OSTEOPOROTIC FRACTURE RISK AS A FUNCTION OF NUMBER OF OSTEOPOROTIC ROIs IN HIP
OR FOREARM SCANS

                  No. of                                              Trend test
                  osteoporotic  Prevalence  Relative riska
Model  Scan       sites         (%)         (95% CI)            χ²      p Value

1      Hip        1             30          2.43 (1.86, 3.19)   83.14   <0.0001
                  2             12          4.47 (3.37, 5.93)
                  3             7           5.34 (3.95, 7.23)
                  4             9           8.65 (6.71, 11.16)
       Forearm    1             52          1.92 (1.57, 2.36)   55.19   <0.0001
                  2             9           3.83 (3.00, 4.89)
2      Hip        1             10          2.19 (1.64, 2.93)   73.15   <0.0001
                  2             8           3.06 (2.32, 4.03)
                  3             8           4.17 (3.26, 5.35)
                  4             12          5.73 (4.71, 6.98)
       Forearm    1             17          1.73 (1.40, 2.13)   41.47   <0.0001
                  2             17          2.70 (2.25, 3.24)
3      Hip        1             7           2.60 (1.88, 3.59)   77.97   <0.0001
                  2             6           3.19 (2.37, 4.28)
                  3             6           4.61 (3.54, 6.00)
                  4             16          5.97 (4.95, 7.21)
       Forearm    1             8           2.34 (1.81, 3.04)   52.36   <0.0001
                  2             19          3.56 (3.01, 4.21)

Relative risks were calculated for women with a given number (>0) of osteoporotic ROIs in a hip or forearm scan in comparison to those without osteoporotic ROI in the corresponding site.
a Relative risk was for either hip or spine fracture during the study and relative to subjects with zero osteoporotic ROI.
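
For readers who want to reproduce quantities of the kind shown in this table from their own counts, the following Python sketch computes a relative risk against a reference group together with an approximate 95% confidence interval. It uses the standard log-transformation (Katz) interval, which is an assumption: the article does not state how its intervals were computed. The function name and the example counts are hypothetical and are not SOF data.

```python
import math


def relative_risk(events_exposed, n_exposed, events_ref, n_ref, z=1.96):
    """Relative risk and approximate 95% CI (log method) for two groups.

    events_*: number of women with a fracture; n_*: group size.
    All event counts must be greater than zero.
    """
    risk_exposed = events_exposed / n_exposed
    risk_ref = events_ref / n_ref
    rr = risk_exposed / risk_ref
    # Standard error of log(RR)
    se = math.sqrt(1 / events_exposed - 1 / n_exposed + 1 / events_ref - 1 / n_ref)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical counts: 40 fractures among 200 women with one osteoporotic ROI,
# 50 fractures among 1000 women with no osteoporotic ROI (the reference group).
rr, ci_low, ci_high = relative_risk(40, 200, 50, 1000)
```

Applying the function once per row, always against the group with zero osteoporotic ROIs, yields one relative risk per number of osteoporotic ROIs, which is the structure of the table above.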

better prediction of osteoporotic fractures than does age alone, and may be more relevant to classification. An additional advantage of the risk-based diagnostic criteria is that risk information from multiple sources can be integrated into the estimation of global risk of fractures.

Using an older rather than younger reference age in T score calculations reduces age-related loss discrepancies between sites for postmenopausal women. Prevalence is still not equal because each manufacturer's T score is normalized by unique population SDs. To compensate, site and technology-specific T score thresholds for osteopenia and osteoporosis could be developed that, for large populations, would produce equivalent prevalence to a reference site such as the femoral neck. This approach is being considered by the joint committees of the NOF and ISCD and is outlined in Faulkner et al.(13) Although this approach could be implemented with existing reference curves, prevalence-based approaches generally are intended for epidemiological study. Also, this type of approach does not explicitly take into account fracture risk estimates that can differ by more than a factor of 2. It is not clear how well this approach would work to reduce inconsistent diagnoses in individual patients when different measurement techniques are used.

As expected, the risk-based diagnostic criteria used in this study were shown to be preferable based on the consistency among different BMD measurements. This supports the proposal by Kanis and Gluer for the Committee of Scientific Advisors of IOF.(24) The risk-based approach of model 3 is still dependent on the population used to derive the estimates of fracture probability. Although we do not have data to confirm it, we can speculate that the estimation equations derived from different populations will result, somewhat, in inconsistent osteoporosis diagnoses. The use of risk-based diagnostic criteria does not reduce the need for a standard reference population.

A substantial disadvantage of a risk-based approach in clinical use is that it requires prospective clinical studies for the risk of fractures, which are expensive and time consuming. Almost all of the prospective data available to link BMD and fracture pertain to elderly white women; yet interest is growing in assessing osteoporosis risk in women under the age of 65 years, in nonwhite women, and in men. Thus, it will be difficult to establish clinical thresholds for new techniques and populations other than white women. Also, risk estimates based on BMD may not be very accurate because BMD is only one of many factors that influence fracture risks.(17) The use of cross-sectional studies has been proposed to evaluate diagnostic cut-off values based on observed odds ratios (ORs). Possible limitations of such an approach are that selection of fracture cases can produce biases and, further, the observed BMD already may be affected by the fracture and may differ from the prospective studies.

The selection of hip and/or spine fracture as the osteoporotic fractures in this study was based on their relative clinical importance and abundance. In unpublished work and other conference presentations,(9,10) we have performed similar analyses using only hip or spine fracture as the endpoints. The results were very similar to the results in this article and are available through the corresponding author.

The cut-off value of 14.6% for osteoporotic fracture in model 3 was derived to make the prevalence of osteoporosis at the femoral neck comparable with the other two models so that comparison among models would be possible. It was not optimally selected and was not based on any clinical
validation. Such a cut-off value requires further careful study and should not be used directly for clinical applications without further validation. In the Appendix, we provide a formula and corresponding parameters for calculating the corresponding BMD value for given percentages of cut-off points. This allows the reader to select the risk levels and derive BMD cut-off values accordingly.

Regardless of which definition of osteoporosis is used, we should always expect inconsistencies when measuring different body sites because of biological and technical variations. However, it is important to examine whether the proportion of misdiagnosed patients is small enough to offset the cost of identifying them and whether this additional knowledge enhances the prognostic information and changes the treatment strategy. Although there have been limited studies of the cost-effectiveness of the use of BMD,(1,28) Black et al.(14) compared the risk of hip fractures using BMD of the femoral neck and lumbar spine. They found that by altering the cut-off values of the T score to -2.2 for femoral neck BMD, they could get a classification with the same risk prediction power as using either a neck or spine BMD with a T score less than -2.5. They concluded that the additional spine scan was unjustified.

Our approach of combining BMD parameters from a given site has several differences from that study. First, we used a combination of a scan of the hip or forearm, which gives more information than assessment of a single parameter. Second, our comparison was based on individual patients, not the similarity in statistical properties of populations. Although diagnosis and risk assessment are two different concepts with different uses, we only wish to emphasize that combining available information on diagnosis from the same scan can provide prognostic information and should not be ignored.

This study has several limitations. First, our conclusions are limited to women over the age of 65 years, to BMD measured by DXA/SPA, and to hip and/or spinal fractures, and may not apply to other bone measures, other osteoporotic fractures, or other populations. With the rapid development of new bone mineral measurement technologies, including quantitative ultrasound, it is important to determine the appropriate selection of single and/or multiple technologies and to integrate this information and its appropriate use into patient management.

Second, the hip and spinal fractures used in this article had different time windows. Hip fractures occurred within 5 years after all BMD measurements. However, spinal fractures could have occurred before the hip and spine DXA scans. Thus, the risk of incident spinal fractures for some participants may not be evaluated prospectively, as were the hip fractures. The window of spinal fractures averaged about 3.7 years, which is different from hip fractures. Thus, fracture outcomes may be weighted more toward hip fractures than spine fractures. Given the relative clinical importance of hip fractures, a slight overemphasis on hip fractures may be reasonable. Our osteoporotic fracture risk estimation only serves as a surrogate measure of the risk of future fractures. However, separate analyses indicated that our conclusions remained the same regardless of whether we studied only hip fracture or only spine fracture based on the same population.(9,10)

Third, the calcaneal and forearm BMD measures were obtained 2 years before the hip and spine BMD. Although it seems unlikely, we do not know whether the difference in measurement time would affect the conclusions.

Fourth, the statistical method used in this article may introduce some biases in favor of model 2 and model 3 over model 1. Model 2 used women aged 65 years at baseline of the SOF study as reference. Thus, the reference data used in model 2 is part of the SOF data to be evaluated and is more consistent for other age groups than the reference data used in model 1. Using the 10-fold cross-prediction, model 3 built classification rules using data different from the data to be evaluated. Nevertheless, it still used SOF data of all age ranges. Such biases are not thought to be substantial enough to change the conclusions. The observed differences among the three models can be attributed mainly to the methodological differences in patient classifications.

In summary, we conclude that standardization of normative data, perhaps referenced to an older population, may be necessary for applying T scores as diagnostic criteria in patient management. A risk-based osteoporosis classification does not depend on the manufacturers' reference data and may be more consistent and efficient in patient diagnosis.

ACKNOWLEDGMENT

We thank David Breazeale for editorial assistance and manuscript preparation. We also thank both reviewers and editors for their comments and suggestions.

REFERENCES

1. National Osteoporosis Foundation (NOF) 1998 Osteoporosis: Review of the evidence for prevention, diagnosis, and treatment and cost-effectiveness analysis. Status report. Osteoporos Int 8(Suppl 4):S3–S80.
2. World Health Organization (WHO) 1994 Assessment of fracture risk and its application to screening for postmenopausal osteoporosis: Report of a WHO study group. WHO Technical Report Series, Report No. 843. WHO, Geneva, Switzerland.
3. Genant H, Engelke K, Fuerst T, Gluer C, Grampp S, Harris S, Jergas M, Lang T, Lu Y, Majumdar S, Mathur A, Takada M 1996 Noninvasive assessment of bone mineral and structure: State of the art. J Bone Miner Res 11:707–730.
4. Steiger P, Cummings SR, Black DM, Spencer NE, Genant HK 1992 Age-related decrements in bone mineral density in women over 65. J Bone Miner Res 7:625–632.
5. Grampp S, Genant H, Mathur A, Lang P, Jergas M, Takada M, Gluer CC, Lu Y, Chavez M 1997 Comparisons of noninvasive bone mineral measurements in assessing age-related loss, fracture discrimination, and diagnostic classification. J Bone Miner Res 12:697–711.
6. Davis JW, Ross PD, Wasnich RD 1994 Evidence for both generalized and regional low bone mass among elderly women. J Bone Miner Res 9:305–309.
7. Pouilles JM, Tremollieres F, Ribot C 1993 Spine and femur densitometry at the menopause: Are both sites necessary in the assessment of the risk of osteoporosis? Calcif Tissue Int 52:344–347.
8. Greenspan S, Maitland-Ramsey L, Myers E 1996 Classification of osteoporosis in the elderly is dependent on site-specific analysis. Calcif Tissue Int 58:409–414.
9. Genant H, Lu Y, Mathur A, Fuerst T, Cummings S 1996 Classification based on DXA measurements for assessing the risk of hip fractures. J Bone Miner Res 11:S120.
10. Genant H, Lu Y, Mathur A, Fuerst T, Zhao S, Cummings S 1997 Diverse diagnostic armamentarium in bone mass measures: Opportunities and dilemmas. Osteoporos Int 7:258.
11. Baran D, Faulkner K, Genant H, Miller P, Pacifici R 1997 Diagnosis and management of osteoporosis: Guidelines for the utilization of bone densitometry. Calcif Tissue Int 61:433–440.
12. Faulkner K, von Stetten E, Steiger P, Miller P 1998 Discrepancies in osteoporosis prevalence at different skeletal sites: Impact on the WHO criteria. Bone 23(5 Suppl):S194.
13. Faulkner K, von Stetten E, Miller P 1999 Discordance in patient classification using T-scores. J Clin Densitom 2:343–350.
14. Black D, Bauer D, Lu Y, Tabor H, Genant H, Cummings S 1995 Should BMD be measured at multiple sites to predict fracture risk in elderly women? J Bone Miner Res 10:S7.
15. Miller P, Bonnick S, Johnston C 1998 The challenges of peripheral bone density testing, which patients need additional central density skeletal measurements. J Clin Densitom 1:211–217.
16. Cummings S, Black D, Nevitt M, Browner W, Cauley J, Ensrud K, Genant HK, Palermo L, Scott J, Vogt TM 1993 Bone density at various sites for prediction of hip fractures. The Study of Osteoporotic Fractures Research Group. Lancet 341:72–75.
17. Cummings S, Nevitt M, Browner W, Stone K, Fox K, Ensrud K, Cauley J, Black D, Vogt TM 1995 Risk factors for hip fracture in white women. Study of Osteoporosis Research Group. N Engl J Med 332:767–773.
18. Black DM, Palermo L, Nevitt MC, Genant HK, Epstein R, San Valentin R, Cummings SR 1995 Comparison of methods for defining prevalent vertebral deformities: The Study of Osteoporotic Fractures. J Bone Miner Res 10:890–902.
19. Black DM, Palermo L, Nevitt MC, Genant HK, Christensen L, Cummings SR 1999 Defining incident vertebral deformity: A prospective comparison of several approaches. The Study of Osteoporotic Fractures Research Group. J Bone Miner Res 14:90–101.
20. Looker A, Wahner H, Dunn W, Calvo M, Harris T, Heyse S, Johnston CC Jr, Lindsay R 1998 Updated data on proximal femur bone mineral levels of US adults. Osteoporos Int 8:468–489.
21. Hanson J 1997 Standardization of femur BMD. J Bone Miner Res 12:1316–1317 (letter).
22. Davis J, Novotny R, Ross P, Wasnich R 1994 The peak bone mass of Hawaiian, Filipino, Japanese, and white women living in Hawaii. Calcif Tissue Int 55:249–252.
23. Windeler J 2000 Prognosis - what does the clinician associate with this notion? Stat Med 19:425–430.
24. Kanis J, Gluer C 2000 An update on the diagnosis and assessment of osteoporosis with densitometry (for the Committee of Scientific Advisors, International Osteoporosis Foundation). Osteoporos Int 11:192–202.
25. Miller PD 2000 Controversies in bone mineral density diagnostic classifications. Calcif Tissue Int 66:317–319 (editorial).
26. Lu Y, Mathur A, Genant H 1995 Pitfalls of using Z-score, T-score for discriminant purpose. J Bone Miner Res 11:S1;S264.
27. Black D, Palermo L, Genant H, Cummings S 1996 Four reasons to avoid the use of BMD T-scores in treatment decisions for osteoporosis. J Bone Miner Res 11:S1;S118.
28. Langton CM, Ballard PA, Langton DK, Purdie DW 1997 Maximizing the cost effectiveness of BMD referral for DXA using ultrasound as a selective population pre-screen. Technol Health Care 5:235–241.

Address reprint requests to:
Harry K. Genant, M.D.
Department of Radiology
University of California, San Francisco
San Francisco, CA 94143-0628, USA

Received in original form February 15, 2000; in revised form November 20, 2000; accepted December 19, 2000.

APPENDIX

Because of the logistic regression model, for a given risk of fracture (p) and age (a), the corresponding BMD cut-off values for model 3 can be derived from the following formula:

\[ \mathrm{BMD} = \frac{\log p - \log(1 - p) - \beta_0 - \beta_1 a}{\beta_2}, \tag{1} \]

where β0, β1, and β2 are the intercept, regression coefficient for age, and regression coefficient for BMD, respectively, from the logistic regression equation, and log denotes the natural logarithm. The corresponding parameters are given in Table A1. For example, if we are interested in a risk of fracture of 14.6% for 67-year-old women based on their total hip BMD measurement, we can use parameters in Table A1 and formula (1) to calculate the cut-off value for total hip BMD as

\[ \mathrm{BMD} = \frac{\log 0.146 - \log 0.854 + 4.4164 - 0.0899 \times 67}{-6.1517} = 0.5483\ \mathrm{g/cm^2}. \]

Thus, if we use a 14.6% fracture risk as our criterion, 67-year-old women who had a total hip BMD of less than 0.5483 g/cm² on a Hologic scanner will be classified as osteoporotic.

TABLE A1. PARAMETERS USED TO DERIVE BMD CUT-OFF VALUES FOR MODEL 3

Site              β0        β1       β2
Total hip         -4.4164   0.0899   -6.1517
Trochanteric      -4.7714   0.0935   -8.2243
Femoral neck      -4.2003   0.0903   -7.5798
Ward's triangle   -5.5963   0.0871   -7.7108
Calcaneal         -7.0780   0.0989   -6.3198
Distal radius     -8.1840   0.1072   -5.6293
Proximal radius   -7.6478   0.1043   -3.6623
PA spine          -8.0441   0.1175   -3.4860
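
Formula (1) of the Appendix can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' software: the function names `bmd_cutoff` and `fracture_risk` are hypothetical, natural logarithms are assumed, and the signs of the total hip coefficients (negative intercept, positive age coefficient, negative BMD coefficient) are an assumption chosen so that estimated risk rises with age and falls with BMD.

```python
import math

# Total hip parameters from Table A1 as (beta0, beta1, beta2).
# ASSUMPTION: the signs are inferred (negative intercept, positive age
# coefficient, negative BMD coefficient); only the magnitudes are printed.
TOTAL_HIP = (-4.4164, 0.0899, -6.1517)


def bmd_cutoff(p, age, beta0, beta1, beta2):
    """Invert logit(p) = beta0 + beta1*age + beta2*BMD for BMD (formula 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    logit = math.log(p) - math.log(1.0 - p)
    return (logit - beta0 - beta1 * age) / beta2


def fracture_risk(bmd, age, beta0, beta1, beta2):
    """Forward logistic model: estimated fracture risk for a BMD and age."""
    z = beta0 + beta1 * age + beta2 * bmd
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    cutoff = bmd_cutoff(0.146, 67, *TOTAL_HIP)
    # With log(1 - 0.146) = log(0.854), this evaluates to about 0.548.
    print(f"Total hip BMD cut-off at 14.6% risk, age 67: {cutoff:.4f}")
    # Round trip: the cut-off maps back to the chosen risk level.
    print(f"Risk at that cut-off: {fracture_risk(cutoff, 67, *TOTAL_HIP):.3f}")
```

Because the BMD coefficient is negative, a measured BMD below the returned cut-off corresponds to an estimated risk above the chosen level, which is the classification rule of model 3. Other sites can be handled by passing the corresponding row of Table A1 under the same sign assumption.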