MOS-SF Reliability and Validity

The MOS Short-Form General Health Survey: Reliability and Validity in a Patient Population
Author(s): Anita L. Stewart, Ron D. Hays and John E. Ware, Jr.

Source: Medical Care, Vol. 26, No. 7 (Jul., 1988), pp. 724-735
Published by: Lippincott Williams & Wilkins
Stable URL: http://www.jstor.org/stable/3765494
MEDICAL CARE
July 1988, Vol. 26, No. 7

Communication

The MOS Short-form General Health Survey
Reliability and Validity in a Patient Population

ANITA L. STEWART, PHD, RON D. HAYS, PHD, AND JOHN E. WARE, JR., PHD

There is a great demand for measures of physical and mental health, social and role functioning, and other general health concepts for use in evaluating health care [1-3]. To be useful, these instruments should represent multiple health concepts and a range of health states pertaining to general functioning and well-being [4]. They should adhere to conventional standards of reliability and validity [5].

To be useful in clinic settings, measures must also be simple and easy to use. Patients tend to be sicker than the general population, their attention is divided, and time is limited. Therefore, measures that work well in general populations may not work well for patients.

Health surveys that are comprehensive and satisfy psychometric standards are currently available, including the McMaster Health Index Questionnaire, the Sickness Impact Profile (SIP), the Functional Status Questionnaire, the Duke-UNC Health Profile, the RAND Health Insurance Experiment (HIE) measures, the Nottingham Health Profile, and the Index of Well-being (IWB) [6-12]. However, these instruments are too long to be practical in most clinic settings. For example, the HIE health scales included 108 questionnaire items that took an average of 45 minutes to complete. The short form of the SIP includes 136 questions that take an average of 30 minutes [13]. The IWB is interviewer-administered and takes about 18 minutes [14].

From the Department of Behavioral Sciences, The RAND Corporation, Santa Monica, California.
Supported by grants for the Medical Outcomes Study from the Robert Wood Johnson Foundation, The Henry J. Kaiser Family Foundation, and the Pew Charitable Trusts.
The opinions expressed are those of the authors and do not necessarily reflect the opinions of the sponsors or The RAND Corporation.
Address correspondence to: Anita L. Stewart, The RAND Corporation, 1700 Main Street, Santa Monica, CA 90406.

The length of most available instruments has prompted investigators to adopt surveys based on a few single-item measures [15]. For example, Spitzer, Dobson, Hall, et al. developed a quality of life index that aggregates five single-item measures of health and health-related concepts; the index required about a minute for completion by physicians [16]. A single-item rating of health in general is one of the most commonly used measures, such as in the National Health and Nutrition Examination Survey [17]. Single-item measures of subjective well-being have been proposed by several investigators [18-20]. In general, single-item measures are less satisfactory than multi-item scales because single items are generally less precise, less reliable, and less valid [21,22]. Multi-item scales also provide more options for estimating scores when a response to a given item is missing.

A compromise between lengthy instruments and single-item measures of health was sought. A small subset of items from long-form measures has been shown to satisfy standards of acceptability, reliability, and validity in a general population [23]. Reported here are results from administering such a survey to patients. The results are compared with those from administration of this survey to a general population.

Methods
Sampling
Patients surveyed were participating in the Medical Outcomes Study (MOS), an observational study of variations in physician practice styles and patient outcomes in different systems of care. At each of three study sites (Boston, Chicago, Los Angeles), physicians (general internists, family physicians, cardiologists, endocrinologists, diabetologists, psychiatrists), psychologists, and other mental health providers were sampled from health maintenance organizations (HMOs), large multispecialty groups, and solo fee-for-service practices. Altogether, 526 health care providers aged 31-55 were included in the MOS; all reported direct patient care as their primary professional activity and had been in their current practice setting for at least one year.

The information in this article is based on a sample of 11,186 adult, English-speaking patients who visited these providers during the sampling period (lasting 9 days on average). Ages ranged from 18 to 103 (mean age was 47). Thirty-eight percent were male and 87% had completed high school (average of 13.7 years of education). Fifty percent of the sample had a total household income of at least $20,000 in 1985 dollars. Seventy-nine percent were white, 11% black, 5% Latino, and 3% Asian or Pacific Islander. The general population sample of adults representing United States households, to which we compare results, is described elsewhere [23]. As expected [24], the patient sample was slightly older and overrepresented women relative to the general population sample. The patients were also slightly more educated and had slightly higher income.

Data Collection
Data from patients were collected from February through October 1986. The 20 health items were located in the middle of a 75-item self-administered questionnaire, which was completed by patients as they waited to see their doctor. In all solo practices and in some group practices, surveys were distributed by office staff. In most group practices, they were distributed by MOS field representatives. The entire questionnaire took an average of 13 minutes to complete, of which it is estimated that the health items took from 3-4 minutes, on average. Questionnaires were returned for about 74% of the eligible patient visits in group practices and about 65% of such visits in fee-for-service practices. These return rates underestimate patient acceptance of the questionnaire because, when practices were very busy, staff were encouraged to survey every other patient.

Health Measures
In accordance with the minimum standard of content validity for a comprehensive health measure suggested by Ware [4] and consistent with previous definitions of health [25-27], 20 items were selected to represent six health concepts: physical functioning, role functioning, social functioning, mental health, health perceptions, and pain. Physical functioning was assessed by limitations in a variety of physical activities, ranging from strenuous to basic, due to health. Role and social functioning were defined by limitations due to health problems. Mental health was assessed in terms of psychological distress and well-being. The measure of health perceptions tapped patients' own ratings of their current health in general. Pain was included to capture differences in physical discomfort. Definitions of physical functioning, mental health, and health perceptions tap positive as well as negative states of health. The definitions are summarized in Table 1. Questionnaire items are presented in the appendix.

Eighteen of the 20 items were adapted from longer HIE measures of these concepts and were used successfully in a general population survey [23]. The two additional single-item measures (social functioning and pain) were added after that administration, based on experience with similar measures in the HIE.

TABLE 1. Definitions of Health Concepts

Measure | No. of Items | Definition | Item Numbers (a)
Physical functioning | 6 | Extent to which health interferes with a variety of activities (e.g., sports, carrying groceries, climbing stairs, and walking) | 16a-16f
Role functioning | 2 | Extent to which health interferes with usual daily activity such as work, housework, or school | 18, 19
Social functioning | 1 | Extent to which health interferes with normal social activities such as visiting with friends during past month | 20
Mental health | 5 | General mood or affect, including depression, anxiety, and psychologic well-being during the past month | 21-25
Health perceptions | 5 | Overall ratings of current health in general | 2, 26a-26d
Pain | 1 | Extent of bodily pain in past 4 weeks | 17

(a) See appendix.

Analysis Plan
Analyses were designed to evaluate the extent to which very short multi-item scales would satisfy traditional psychometric criteria [5]. A multitrait scaling method was used to test item convergent and discriminant validity [28]. This method consists of three steps designed to determine whether items have equivalent variances, whether each item in a hypothesized group is substantially related (r > 0.40) to the total score computed from other items in that group (item convergent validity criterion), and whether each item correlates significantly higher with its hypothesized scale than with other scales (item discriminant validity criterion). If these conditions are met, it is appropriate to combine items as hypothesized into simple summated ratings scales. These multitrait scaling tests were performed for the patients having complete data on all 20 items (N = 8,294, 73% of respondents).

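The convergent and discriminant checks described above can be illustrated with a minimal sketch. It is not the authors' program: the function names, the list-of-rows data layout, and the toy responses are assumptions made here for illustration, and the discriminant check is simplified to a plain "higher than" comparison rather than a test of statistical significance.

```python
import math

def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corrected_item_scale_correlations(rows):
    """Correlate each item with the sum of the OTHER items in its scale.

    rows: one list of item responses per respondent (a single scale).
    Leaving the item out of the total is the "correction for overlap".
    """
    k = len(rows[0])
    result = []
    for j in range(k):
        item = [r[j] for r in rows]
        rest = [sum(r) - r[j] for r in rows]
        result.append(pearson(item, rest))
    return result

def item_other_scale_correlations(rows, other_rows):
    """Correlate each item with the total score of a different scale."""
    other_total = [sum(r) for r in other_rows]
    k = len(rows[0])
    return [pearson([r[j] for r in rows], other_total) for j in range(k)]

def meets_scaling_criteria(rows, other_rows, threshold=0.40):
    """Simplified check: r > threshold with own scale, and higher than with the other scale."""
    own = corrected_item_scale_correlations(rows)
    other = item_other_scale_correlations(rows, other_rows)
    convergent = all(r > threshold for r in own)
    discriminant = all(o > x for o, x in zip(own, other))
    return convergent and discriminant

# Toy data (hypothetical): three respondents, a two-item scale and a one-item comparison scale.
scale_a = [[1, 2], [2, 4], [3, 6]]
scale_b = [[3], [1], [2]]
print(corrected_item_scale_correlations(scale_a))
print(meets_scaling_criteria(scale_a, scale_b))
```

In practice the same check would be repeated for every hypothesized scale against every other scale, using only respondents with complete data.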
Cronbach's alpha [29], a measure of internal-consistency reliability, was estimated for the four multi-item scales. Reliability is considered acceptable for group comparisons when alpha is 0.50 or above [30]. On the strength of experience with longer forms of these measures, the authors thought it would be possible to achieve reliability coefficients above 0.70, as recommended by Nunnally [31]. Reliability was evaluated for the total sample and in two subsamples for whom data quality was hypothesized to be lower based on prior studies: those with less than a high school education and those over age 75 [32]. In addition, because patients with serious health problems might have trouble completing such a questionnaire, the authors tested the reliability separately for groups of patients with congestive heart failure, depressive symptoms, diabetes, and/or recent myocardial infarction. Where possible, the reliability estimates for the short-form scales were also compared with those obtained for longer versions of the same scales.

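For readers unfamiliar with the coefficient, the following sketch computes Cronbach's alpha from item-level responses using the standard formula, alpha = k/(k-1) x (1 - sum of item variances / variance of the total score). The function name, data layout, and toy responses are assumptions for illustration only; the sketch uses complete cases and is not the authors' code.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for one scale.

    rows: one list of k item responses per respondent, with no missing values.
    Uses sample variances (n - 1 in the denominator).
    """
    k = len(rows[0])
    item_variances = [variance([r[j] for r in rows]) for j in range(k)]
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Toy data (hypothetical): four respondents answering a three-item scale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [4, 3, 4],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
print("acceptable for group comparisons (>= 0.50):", alpha >= 0.50)
print("meets the 0.70 target:", alpha > 0.70)
```

The two thresholds printed at the end are the ones cited in the text above (0.50 for group comparisons, 0.70 as the authors' target).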
Preliminary tests of validity were also possible within the available data set, including product-moment correlations among the health measures, discrimination between patient and general population groups, and correlations with sociodemographic characteristics. All correlations were examined to determine whether the short-form measures produced the same pattern of results as observed for long-form measures in previous research. Patient and general population groups were compared by examining the magnitude of the difference in the proportion of those scoring in the "poor" health range (i.e., known groups validity).

The variability of score distributions was examined for all six health measures. Our goal was to achieve scores that were normally distributed or at least not highly skewed or kurtotic.
Construction of Health Measures
Consistent with previous studies, limitations in physical and role functioning were counted regardless of duration and were scored to reflect the number of limitations present [33,34]. Scores were reversed so that a high value indicated better functioning. Mental health scales were scored by summing the item responses, after reversing the scoring of some items so that a high score indicated better health. Before combining items in the health perceptions scale, the authors recoded the response choices of the overall health item (item 1), to better reflect the unequal intervals of the item.* The scale was scored by summing the item responses, after recoding some items so that a high score indicated better health. The single-item measures were scored so that high scores indicated better social functioning and more pain. Finally, for all measures, scores were transformed linearly to 0-100 scales, with 0 and 100 assigned to the lowest and highest possible scores, respectively.†

* To do this, the authors calculated the mean general health score for the other four items for each response level on the overall item. These mean scores were then transposed into a 1-5 scale to correspond to the scale used for the other 4 items. This resulted in the following transformation: 1 = 5, 2 = 4.36, 3 = 3.43, 4 = 1.99, 5 = 1.

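The recoding of the overall health item and the linear 0-100 transformation can be sketched as follows. The recode values are those given in the footnote; the function names and the example responses are hypothetical, and the raw range of 5-25 for the health perceptions scale is an assumption that follows from five items each scored 1-5.

```python
# Recode values for the overall health item, taken from the footnote
# (1 = Excellent ... 5 = Poor on the questionnaire; high recoded value = better health).
OVERALL_HEALTH_RECODE = {1: 5.0, 2: 4.36, 3: 3.43, 4: 1.99, 5: 1.0}

def recode_overall_health(response):
    """Map the 1-5 response on the overall health item to its recoded value."""
    return OVERALL_HEALTH_RECODE[response]

def to_0_100(raw, lowest, highest):
    """Linear transformation: lowest possible raw score -> 0, highest possible -> 100."""
    return 100.0 * (raw - lowest) / (highest - lowest)

# Hypothetical respondent: "Very Good" on the overall item, plus four other
# health perceptions items already coded 1-5 with high = better health.
other_items = [4, 4, 3, 5]
raw_score = recode_overall_health(2) + sum(other_items)

# Assumed raw range: five items, each 1-5, so 5 (worst) to 25 (best).
print(to_0_100(raw_score, lowest=5, highest=25))
```

The same `to_0_100` helper applies to any of the six measures once the lowest and highest possible raw scores for that measure are supplied.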
Results
Multitrait Scaling
The item means and standard deviations were roughly equal within each scale, thus meeting the first criterion for summated ratings scales, with one minor exception in the physical functioning scale. Item-scale correlations (corrected for overlap) for the four multi-item scales indicated that our stringent criterion of convergent validity was met in all cases. Item-scale correlations for hypothesized scales ranged from 0.45 to 0.79, with a median of 0.68. All items in each hypothesized scale also exceeded the discriminant validity criterion. These results support the construction of simple summated ratings scales based on hypothesized item groupings.

Variability of Health Measures
The mean and standard deviation for each of the measures are shown in Table 2. The full range of possible scores was observed for all measures (data not reported).‡ The distributions of mental health and health perceptions scores were roughly symmetric, as desired. The distributions of physical and role functioning scores were skewed, with more people scoring along the positive end of the scale but to a lesser degree than in general populations [34]. The role functioning scale had a somewhat bimodal distribution with the least prevalent category being the middle one; 72% had perfect functioning and 17% scored at the worst level. The distribution of the social functioning item was quite skewed and kurtotic, with the modal score (69% of the sample) being perfect functioning. The pain item was well distributed even though the modal score was no pain (30%), with approximately 9% reporting severe or very severe pain and about 20% reporting moderate pain.

TABLE 2. Descriptive Statistics for Health Scales and Percent of Patient (N = 11,186) and General Population (N = 2,008) Samples Scoring in the "Poor" Health Range

                                                     % in Poor Health (b)
Measure (a)            No. of Items   Mean    SD     Patients   General Population (c)
Physical functioning        6         78.5   30.8       45            22
Role functioning            2         77.5   38.3       28            12
Social functioning          1         87.2   23.6        9            (d)
Mental health               5         72.6   20.2       31            19
Health perceptions          5         63.0   26.8       52            20
Pain                        1         31.4   27.7       29            (d)

(a) Observed range of all scores was 0-100. A high score indicates better health except for pain, where a high score indicates more pain.
(b) Poor health defined as: physical and role functioning = one or more limitations; social functioning = limitations a good bit of the time or more; mental health = lowest 19% of scores in general population sample (score of 67 or lower) (cutoff defined as close as possible to the bottom 20%); health perceptions = lowest 20% of scores in general population sample (score of 70 or lower); pain = moderate, severe, or very severe pain.
(c) T-tests of difference between proportions in patient and general population samples were statistically significant (P < 0.01) for every possible comparison.
(d) Not available.

† For the mental health and general health scales, a missing score was assigned only if all five items in the scale were missing. For the two-item role-functioning scale, a missing score was assigned initially if either item was missing. However, if the one nonmissing item indicated that the person was unable to work (limited on item 18), the lowest possible functioning score of 0 was assigned. If the one nonmissing item indicated that the person had no limitations in the kind or amount of work (not limited on item 19), the highest possible functioning score of 100 was assigned. For the physical functioning scale, a missing score was assigned if more than one item was missing. However, if the first item (item 16a) was answered as not limited and any remaining items were missing, a score of 100 was assigned. The authors found about 8% of respondents were assigned a missing value on physical and role functioning, 3% on mental health, and less than 1% on general health. Missing data rates for the single-item pain and social functioning measures were both about 3%. Overall, 15% of the sample had missing data on one or more of the final scales. Missing data rates were highest among those over 75 (39%).

‡ Actual score distributions are available from the senior author upon request.

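The missing-data rule for the two-item role-functioning scale, as described in the footnote on scale scoring, can be written out as a small decision function. This is a sketch under stated assumptions: each item is represented here as True (limited), False (not limited), or None (missing), which is a hypothetical encoding rather than the questionnaire's actual response codes, and the 0/50/100 values follow from counting limitations and rescaling to 0-100.

```python
from typing import Optional

def role_functioning_score(limited_18: Optional[bool],
                           limited_19: Optional[bool]) -> Optional[float]:
    """Score the two-item role-functioning scale on 0-100 (high = better).

    limited_18: True if health keeps the person from working (item 18).
    limited_19: True if limited in the kind or amount of work (item 19).
    None means the item was left blank.
    """
    if limited_18 is not None and limited_19 is not None:
        # Both answered: count limitations, reverse, and rescale to 0-100.
        limitations = int(limited_18) + int(limited_19)
        return 100.0 * (2 - limitations) / 2
    if limited_18 is True:
        # Item 19 missing, but the person is unable to work: lowest score.
        return 0.0
    if limited_19 is False:
        # Item 18 missing, but no limitation in kind or amount of work: highest score.
        return 100.0
    # Any other pattern with a missing item stays missing.
    return None

for answers in [(False, False), (False, True), (True, None), (None, False), (False, None)]:
    print(answers, "->", role_functioning_score(*answers))
```

The logic reflects the nesting of the two items: being unable to work implies a limitation in the kind or amount of work, and having no such limitation implies being able to work, so one answer can settle the score in those two cases.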
Reliability of Health Measures
Reliability coefficients for the multi-item health scales ranged from 0.81 to 0.88 (Table 3). These estimates are nearly identical to those for the same scales in the general population sample (range was from 0.76 to 0.88). Estimates for the four multi-item scales were similar for depressed patients (0.82 to 0.87) and for other subgroups analyzed: congestive heart failure (0.77 to 0.87), diabetes (0.83 to 0.87), myocardial infarction (0.77 to 0.88), less than a high school education (0.86 to 0.88), and over age 75 (0.84 to 0.89).

The internal-consistency reliabilities of these short-form measures were lower, but not much lower, than their full-length versions. The internal-consistency reliability of the five-item health perceptions measure was 0.87, compared to 0.88 for a nine-item general health scale [35]. The reliability of the five-item mental health measure was 0.88, compared with 0.96 for a 38-item version [36]. The reliability of the six-item physical functioning measure was 0.86, compared with 0.90 for a 10-item similar measure [37]. Finally, the reliability of the two-item role functioning was 0.81, compared with a coefficient of reproducibility of 0.92 for a three-item version [34].

TABLE 3. Reliability Estimates and Correlations Among Health Measures

Measure                       PF       RF       SF       MH       HP       P
Physical Functioning (PF)   (0.86)
Role Functioning (RF)        0.65    (0.81)
Social Functioning (SF)      0.47     0.56     (-)
Mental Health (MH)           0.24     0.33     0.45    (0.88)
Health Perceptions (HP)      0.53     0.57     0.53     0.45    (0.87)
Pain (P)                    -0.39    -0.42    -0.39    -0.42    -0.47     (-)

Note: Ns varied from 9,729 to 10,860, due to missing data. All correlation coefficients are statistically significant (P < 0.01). Internal-consistency reliabilities are given on the diagonal for multi-item scales.

Validity of Health Measures
Results for the three types of validity analyses, introduced in the methods section, are presented below.

Correlations Among Health Measures. All correlations among the health measures were statistically significant (P < 0.01) and most were substantial in magnitude (see Table 3). This pattern of correlations corresponds well with that observed from studies of full-length versions of these measures [10,38]. Because the social functioning item had the same format as the mental health items, it might be expected to correlate highest with mental health. It did not, suggesting that this method effect did not dominate the results. Consistent with previous research [35], the health perceptions scale correlated substantially with both physical and mental health; in fact, that scale correlated substantially with all of the other health scales.

Comparison of Patient and General Population Samples. The authors calculated the percent scoring in the "poor" health range for each of the health measures (see Table 2 for definitions of poor health). The percentage of respondents with poor health was significantly greater (P < 0.01) in the patient sample than in the general population sample on all four comparable measures, consistent with previous studies [13,16,39,40]. The percentage of patients with physical or role limitations or poor health perceptions was about twice that observed for the general population sample. The percentage of respondents with poor mental health was 50% larger in the patient sample than in the general population sample. These differences could not be accounted for by differences in sociodemographic characteristics between the two samples.

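As an illustration of comparing the proportion in "poor" health between two samples, the sketch below uses a pooled two-proportion z-test. This is a substitute technique chosen for simplicity, not the t-test the authors report, and the inputs use the nominal sample sizes from Table 2 (11,186 and 2,008) even though the actual Ns per measure were smaller because of missing data. The function name is hypothetical.

```python
import math

def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled z-test for the difference between two independent proportions.

    Returns (z, two_sided_p_value). Uses the normal approximation.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Physical functioning row of Table 2: 45% of patients vs 22% of the
# general population scoring in the "poor" range (nominal Ns, an assumption).
z, p = two_proportion_z_test(0.45, 11186, 0.22, 2008)
print(f"z = {z:.1f}, significant at 0.01: {p < 0.01}")
```

With samples this large, even modest differences in proportions reach significance, which is why the text emphasizes the magnitude of the differences rather than the P values alone.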
Correlations Between Health Measures and Sociodemographics. Correlations between the health measures and age, sex, education, income, and race (not shown) were consistent with results using longer form measures [38]. People with more education and income tended to have better health. In this study, men reported slightly better health than women on all measures except health perceptions. Older people tended to report poorer health than younger people on all measures except mental health, as expected. Nonwhites tended to report poorer health perceptions, poorer social functioning, and more pain than whites.

Discussion
The authors' goal was to develop a general health survey that is comprehensive and psychometrically sound, yet short enough to be practical for use in large-scale studies of patients in practice settings. The resulting 20-item survey assesses physical functioning, role and social functioning, mental health, health perceptions, and pain. A full range of favorable and unfavorable health levels is tapped by items in these measures. Thus, the survey achieves breadth and depth of measurement, while permitting self-administration in only 3-4 minutes. By virtue of its reduction of respondent burden by 80% relative to the lengthier measures from which it was derived, this short-form survey offers a more practical approach to patient health assessment.

The reliability of each of the multi-item scales is acceptable for group comparisons, even in subgroups of patients over age 75, with serious chronic conditions, with depressive symptoms, and with less than a high school education. The reliability of the mental health and the health perceptions measures, however, might be slightly inflated because these items were asked in a sequence. Future tests of the reliability of these items should split them up to minimize recall effects.

The fact that the reliabilities observed here were not substantially lower than those for long-form measures is encouraging. However, it does not necessarily follow that short-form measures will achieve equivalent precision in measuring changes in health over time. Such precision is essential for studies of health outcomes. Although some sacrifice in precision is likely with short measures, compared with lengthier ones, these short-form scales represent a gain in precision relative to single-item measures, which are typically coarse [22]. Tradeoffs between short- and long-form measures in detecting changes in health over time are currently being evaluated in the MOS.

The results also offer preliminary support for the validity of the measures. First, excellent item discrimination among hypothesized scales in the multitrait scaling analyses was observed. Second, correlations among the health measures and between the measures and sociodemographic characteristics were similar to correlations observed using longer form versions of these measures. Third, substantial differences in health between the patient and general population samples were observed and the pattern of differences (across measures) was consistent with previous research [39].

The validity of a health survey cannot be established in a single study. Future studies of these short-form measures should evaluate how well they discriminate among groups differing in diagnosis and disease severity. Their validity in predicting future health and utilization of health services should also be tested. Short-form measures may not do as well as long-form measures in tests of validity [22]. Thus, it is important to establish the limits of the short-form measures and understand fully the tradeoffs involved in their use.

A relatively high rate of missing data for items in the physical and role functioning scales (8%) was observed. The pattern of missing data in the role functioning scale suggests that some people overlooked the second role functioning item (item 19). This item should be printed directly below the first one in future administrations of this battery. For the items, instructions should make clearer that respondents should answer every item, not just those describing problems that apply.§ Missing data were more prevalent among patients older than 75, with less than a high school education, and with diabetes or heart disease. The authors suspect that this problem is not unique to this study and suggest that special steps be taken to guard against this problem in studies of these populations. An advantage of multi-item scales in this regard is the option they provide for estimating missing item scores from other items in the same scale. Using this method, the percentage of respondents available for analysis was increased from 73% to 85%.

§ The modified instructions are as follows: For how long has your health limited you in each activity listed? Please provide an answer for each of the activities listed.

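One common way to estimate a scale score when some items are missing is to substitute the respondent's own mean on the answered items, which is equivalent to prorating the sum. The article does not specify the exact estimation method used, so the sketch below is an assumption for illustration; the function name and the `min_answered` parameter are hypothetical.

```python
def prorated_scale_sum(responses, min_answered=1):
    """Estimate a summated scale score when some items are missing.

    responses: item responses for one respondent, with None for missing items.
    Missing items are replaced by the mean of the answered items, which equals
    (mean of answered items) x (number of items in the scale).
    Returns None when fewer than min_answered items were answered.
    """
    answered = [r for r in responses if r is not None]
    if len(answered) < min_answered:
        return None
    return sum(answered) / len(answered) * len(responses)

# Hypothetical five-item scale with two unanswered items.
print(prorated_scale_sum([1, 2, None, 3, None]))
# A respondent who skipped every item stays missing.
print(prorated_scale_sum([None] * 5))
```

Setting `min_answered=1` mirrors the rule reported for the mental health and general health scales, where a score was treated as missing only if all five items were missing; a stricter threshold could be passed for other scales.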
The results of this study offer some encouragement regarding the potential of short-form health measures in surveys of both patients and general populations. It may no longer be necessary to completely omit measures of functional status and well-being from large-scale studies because of practical constraints. Further, for some purposes the same short-form measures may be appropriate for use in both populations. The usefulness of these measures in health policy research to evaluate the effects of health care as well as in clinical trials also warrants further study.

(Key words: health; health assessment; functioning.)

References
1. McDermott W. Absence of indicators of the influence of its physicians on a society's health. Am J Med 1981;70:833.
2. Schroeder SA. Outcome assessment 70 years later: are we ready? N Engl J Med 1987;316:160.
3. Tarlov AR. Shattuck lecture: the increasing supply of physicians, the changing structure of the health-services system, and the future practice of medicine. N Engl J Med 1983;308:1235.
4. Ware JE. Standards for validating health measures: definition and content. J Chronic Dis 1987;40:473.
5. Ware JE. Methodological considerations in the selection of health status assessment procedures. In: Wenger NK, Mattson ME, Furberg CD, et al., eds. Assessment of Quality of Life in Clinical Trials of Cardiovascular Therapies. New York: Le Jacq Publishing, 1984.
6. Chambers LW, MacDonald LA, Tugwell P, et al. The McMaster Health Index Questionnaire as a measure of quality of life for patients with rheumatoid disease. J Rheumatol 1982;9:780.
7. Bergner M, Bobbitt RA, Carter WB, et al. The Sickness Impact Profile: development and final revision of a health status measure. Med Care 1981;19:787.
8. Jette AM, Davies AR, Cleary PD, et al. The Functional Status Questionnaire: reliability and validity when used in primary care. J Gen Intern Med 1986;1:143.
9. Parkerson GR, Gehlbach SH, Wagner EH, et al. The Duke-UNC health profile: an adult health status instrument for primary care. Med Care 1981;19:806.
10. Brook RH, Ware JE, Davies-Avery A, et al. Conceptualization and measurement of health for adults in the Health Insurance Study: vol VIII, overview. Santa Monica, CA: The RAND Corporation (publication number R-1987/8-HEW), 1979.
11. Hunt SM, McKenna SP, McEwen J, et al. The Nottingham Health Profile: subjective health status and medical consultations. Soc Sci Med 1981;15A:221.
12. Patrick DL, Bush JW, Chen MM. Toward an operational definition of health. J Health Soc Behav 1973;14:6.
13. Bergner M. The Sickness Impact Profile (SIP). In: Wenger NK, Mattson ME, Furberg CD, et al., eds. Assessment of Quality of Life in Clinical Trials of Cardiovascular Therapies. New York: Le Jacq Publishing, 1984.
14. Read JL, Quinn RJ, Hoefer MA. Measuring overall health: an evaluation of three important approaches. J Chronic Dis 1987;40(Supp):7S.
15. Diener E. Subjective well-being. Psychol Bull 1984;95:542.
16. Spitzer WO, Dobson AJ, Hall J, et al. Measuring the quality of life of cancer patients: a concise QL-index for use by physicians. J Chronic Dis 1981;34:585.
17. Wan TTH, Livieratos B. Interpreting a general index of subjective well-being. Milbank Mem Fund Q 1978;56:531.
18. Andrews FM, Withey SB. Developing measures of perceived life quality: results from several national surveys. Social Indicators Research 1974;1:1.
19. Cantril H. The Pattern of Human Concerns. New Brunswick, NJ: Rutgers University Press, 1965.
20. Gurin G, Veroff J, Feld S. Americans View Their Mental Health. New York: Basic Books, 1960.
21. Ware JE, Karmos AH. Development and validation of scales to measure perceived health and patient role propensity: volume II of a final report. Springfield, VA: National Technical Information Services (NTIS publication no. PB288-331), 1976.
22. Manning WG, Newhouse JP, Ware JE. The status of health in demand estimation; or, beyond excellent, good, fair, and poor. In: Fuchs VR, ed. Economic Aspects of Health. Chicago: University of Chicago Press, 1982.
23. Ware JE, Sherbourne CA, Davies AR, et al. A short-form general health survey. Santa Monica: The RAND Corporation (publication number P-7444), 1988.
24. McKinlay JB. Some approaches and problems in the study of the use of services: an overview. J Health Soc Behav 1972;13:115.
25. World Health Organization. Constitution of the World Health Organization. In: Basic Documents. Geneva: World Health Organization, 1948.
26. Bergner M. Measurement of health status. Med Care 1985;23:696.
27. Breslow L. A quantitative approach to the World Health Organization definition of health: physical, mental and social well-being. Int J Epidemiol 1972;1:347.
28. Ware JE, Brook RH, Davies-Avery A, et al. Conceptualization and measurement of health for adults in the Health Insurance Study: vol. I, model of health and methodology. Santa Monica: The RAND Corporation (publication number R-1987/1-HEW), 1980.
29. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297.
30. Helmstadter GC. Principles of Psychological Measurement. New York: Appleton-Century-Crofts, 1964.
31.
Nunnally JC. Psychometric Theory,
2nd ed.
New York:
McGraw-Hill,
1978.
32. Andrews FM. Construct validity and error components of survey measures: a structural modeling approach. Public Opinion Quarterly 1984;48:409.
33. Stewart AL, Ware JE, Brook RH. Advances in the measurement of functional status: construction of aggregate indexes. Med Care 1981;19:473.
34. Stewart AL, Ware JE, Brook RH. Construction and scoring of aggregate functional status measures: volume I. Santa Monica: The RAND Corporation (publication number R-2551-1-HHS), 1982.
35. Davies AR, Ware JE. Measuring health perceptions in the Health Insurance Experiment. Santa Monica: The RAND Corporation (publication number R-2711-HHS), 1981.
36. Veit CT, Ware JE. The structure of psychological distress and well-being in general populations. J Consult Clin Psychol 1983;51:730.
37. Stewart AL, Ware JE, Brook RH, et al. Conceptualization and measurement of health for adults in the Health Insurance Study: vol. II, physical health in terms of functioning. Santa Monica: The RAND Corporation (publication number R-1987/2-HEW), 1978.
38. Ware JE, Davies-Avery A, Brook RH. Conceptualization and measurement of health for adults in the Health Insurance Study: vol. VI, analysis of relationships among health status measures. Santa Monica: The RAND Corporation (publication number R-1987/6-HEW), 1980.
39. Nelson E, Conger B, Douglass R, et al. Functional health status levels of primary care patients. JAMA 1983;249:3331.
40. Cassileth BR, Lusk EJ, Strouse TB, et al. Psychosocial status in chronic illness: a comparative analysis of six diagnostic groups. N Engl J Med 1984;311:506.
Appendix

Short-form Health Survey: Medical Outcomes Study

2. In general, would you say your health is:
   1 [ ] Excellent
   2 [ ] Very Good
   3 [ ] Good
   4 [ ] Fair
   5 [ ] Poor

17. How much bodily pain have you had during the past 4 weeks?
   1 [ ] None
   2 [ ] Very mild
   3 [ ] Mild
   4 [ ] Moderate
   5 [ ] Severe

16. For how long (if at all) has your health limited you in each of the following activities? (Check One Box on Each Line)

   Response options: 1 = Limited for more than 3 months; 2 = Limited for 3 months or less; 3 = Not limited at all

   a. The kinds or amounts of vigorous activities you can do, like lifting heavy objects, running or participating in strenuous sports   1 [ ]  2 [ ]  3 [ ]
   b. The kinds or amounts of moderate activities you can do, like moving a table, carrying groceries or bowling   1 [ ]  2 [ ]  3 [ ]
   c. Walking uphill or climbing a few flights of stairs   1 [ ]  2 [ ]  3 [ ]
   d. Bending, lifting or stooping   1 [ ]  2 [ ]  3 [ ]
   e. Walking one block   1 [ ]  2 [ ]  3 [ ]
   f. Eating, dressing, bathing, or using the toilet   1 [ ]  2 [ ]  3 [ ]

18. Does your health keep you from working at a job, doing work around the house or going to school?
   1 [ ] Yes, for more than 3 months
   2 [ ] Yes, for 3 months or less
   3 [ ] No

19. Have you been unable to do certain kinds or amounts of work, housework or schoolwork because of your health?
   1 [ ] Yes, for more than 3 months
   2 [ ] Yes, for 3 months or less
   3 [ ] No

For each of the following questions, please check the box for the one answer that comes closest to the way you have been feeling during the past month. (Check One Box on Each Line)

   Response options: 1 = All of the Time; 2 = Most of the Time; 3 = A Good Bit of the Time; 4 = Some of the Time; 5 = A Little of the Time; 6 = None of the Time

20. How much of the time, during the past month, has your health limited your social activities (like visiting with friends or close relatives)?   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]  6 [ ]
21. How much of the time, during the past month, have you been a very nervous person?   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]  6 [ ]
22. During the past month, how much of the time have you felt calm and peaceful?   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]  6 [ ]
23. How much of the time, during the past month, have you felt downhearted and blue?   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]  6 [ ]
24. During the past month, how much of the time have you been a happy person?   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]  6 [ ]
25. How often, during the past month, have you felt so down in the dumps that nothing could cheer you up?   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]  6 [ ]

26. Please check the box that best describes whether each of the following statements is true or false for you. (Check One Box on Each Line)

   Response options: 1 = Definitely True; 2 = Mostly True; 3 = Not Sure; 4 = Mostly False; 5 = Definitely False

   a. I am somewhat ill   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]
   b. I am as healthy as anybody I know   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]
   c. My health is excellent   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]
   d. I have been feeling bad lately   1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]

NOTE: Item numbers indicate the order in which the questions appeared in the questionnaire.
