DOI 10.1007/s10067-012-2101-6
ORIGINAL ARTICLE
Received: 16 February 2012 / Revised: 17 September 2012 / Accepted: 1 October 2012 / Published online: 13 October 2012
© Clinical Rheumatology 2012
Introduction
F. Franchignoni · M. Ottonello
Department of Physical and Rehabilitation Medicine,
Salvatore Maugeri Foundation,
Clinica del Lavoro e della Riabilitazione, IRCCS,
Nervi,
Genoa, Italy
A. Giordano
Unit of Bioengineering, Salvatore Maugeri Foundation,
Clinica del Lavoro e della Riabilitazione, IRCCS,
Veruno,
Novara, Italy
Rheumatoid arthritis (RA) is a chronic inflammatory arthropathy associated with articular damage, attendant
comorbidities, increasing disability, and socioeconomic decline [1, 2]. Traditional clinical disease markers of RA such
as joint counts and serum C-reactive protein quantify some
of the physical aspects of the disease, but fail to fully
identify the broad spectrum of disease effects. Consequently, patient-reported outcome (PRO) measures of health-related quality of life have been developed for use in clinical
practice and research, health policy evaluation, and general
population surveys [3].
Several self-report questionnaires have been constructed
over the past decades. The Health Assessment Questionnaire
(HAQ) disability index (DI) [4] and Arthritis Impact Measurement Scales 2 [5] now represent the primary measures of
physical function in RA, accompanied by generic measures
such as the Medical Outcomes Study 36-Item Short Form (SF-36) [6]. Over its long history, the HAQ-DI has played an
influential role in the paradigm shift to establishing PROs,
but it has been criticized in terms of floor and ceiling effects,
nonlinearity, and the fact that some items are confusing for
patients [7]. One barrier to a more widespread use of such
questionnaires is their length; consequently, existing questionnaires have been shortened by omission of subscales, items, or
both [7, 8]. Short versions of an instrument are also important
to enhance response rates in postal surveys or to capture PRO
information in RA patients using an interactive touch-screen
computer system [9]. To broaden the application of patient
questionnaires beyond clinical research to standard clinical
rheumatology care, we have developed and validated a simpler self-administered questionnaire, namely the Recent-Onset Arthritis Disability (ROAD) questionnaire [10, 11].
The construct/convergent validity, predictive validity, and
sensitivity to change of ROAD have been established in
previous studies [11]. We showed that ROAD subscales are
internally consistent and correlate significantly with a wide
variety of health status measures, including self-report measures and biochemical and clinical data. Further, the ROAD
subscale scores improve significantly post-intervention, indicating that ROAD is a valid measure of change over time [11].
Previous analyses of ROAD provided a three-factor model
explaining 70.1 % of the variance. However, modern psychometric theory poses further demands on health status instruments [10], and the advantages of using new approaches like
Rasch analysis in the development and evaluation of clinical
tools for health care have been clearly documented in the
literature [12]. Rasch analysis evaluates psychometric properties of a questionnaire that are not analyzed by classical test
theory (CTT) techniques, e.g., how well an item performs in
terms of its relevance or usefulness in measuring the underlying construct, the amount of the construct targeted by each
question, the possible redundancy of an item relative to other
items in the scale, and the appropriateness of the response
categories [13]. The aim of this study was to extend previous
assessments of ROAD's validity, performing a comprehensive
psychometric analysis of the questionnaire [10, 11] using both
CTT and Rasch analysis.
Internal consistency
Patients
individual judgments; (2) the Spearman rank correlation coefficients (rs), to examine to what degree each item was correlated with the total score, omitting that item from the total
(item–total correlation). The usual rule of thumb is that an
item should correlate with the total score with rs > 0.30 [16].
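The corrected item–total correlation described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis script: the response matrix is hypothetical, and the helper names (`average_ranks`, `spearman`, `corrected_item_total`) are introduced here only for the example. Ties receive average ranks, which is the usual convention for Spearman's coefficient on ordinal item scores.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def corrected_item_total(responses):
    """Correlate each item with the total score of the remaining items.

    responses: one row per subject, one column per item.
    """
    n_items = len(responses[0])
    result = []
    for item in range(n_items):
        item_scores = [row[item] for row in responses]
        rest_scores = [sum(row) - row[item] for row in responses]
        result.append(spearman(item_scores, rest_scores))
    return result


# Hypothetical responses of five subjects to three items (not study data).
responses = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 3, 2],
    [3, 2, 4],
    [4, 4, 3],
]
for number, r_s in enumerate(corrected_item_total(responses), start=1):
    flag = "ok" if r_s > 0.30 else "below 0.30"
    print(f"item {number}: r_s = {r_s:.2f} ({flag})")
```

Removing the item from the total before correlating avoids inflating the coefficient, since an item always correlates with a sum that contains it.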
Dimensionality
Previous studies had shown that the health status construct
tapped by ROAD can be modelled using three factors [10]:
(1) upper limb function, consisting of questions concerning
fine movements of the hand and finger and arm function
(items 1–5); (2) lower limb function, exploring the locomotor activities of the lower extremity (items 6–9); and (3)
activities of daily living/work, a factor more likely to represent the interferences caused by arthritis in daily life and
work activities (items 10–12). Accordingly, we verified this
finding using confirmatory factor analysis (CFA) for ordinal
data: model parameters were determined via diagonally
weighted least squares estimation from the polychoric correlation matrix, as outlined in LISREL (LISREL 8.80, SSI Inc., 2007)
[17]. The goodness of fit was evaluated by examining the following indexes: Tucker–Lewis fit index (or non-normed fit
index, NNFI), comparative fit index (CFI), root mean square
error of approximation (RMSEA), and the standardized root
mean square residual (SRMR). The NNFI and the CFI range
from 0 (poor fit) to 1 (good fit): values greater than 0.95 are
indicative of an acceptable fit. An RMSEA value lower than 0.08
is considered to reflect adequate fit, and a value equal to or less
than 0.05–0.06 is usually taken as an indication of good fit.
Similarly, as a rule of thumb, the SRMR should be less than
0.05 for a good fit, whereas values smaller than 0.10 may be
interpreted as acceptable [18].
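The rules of thumb above can be collected into a small helper. This is a sketch under stated assumptions: the function name `classify_fit` and the label "poor" are introduced here for illustration, the RMSEA cutoff for adequate fit is taken as the conventional 0.08, and the upper end (0.06) of the quoted 0.05–0.06 range is used for good fit. The example call uses the fit indices reported for the three-factor model in the Results (0.98, 0.99, 0.079, and 0.047).

```python
def classify_fit(nnfi, cfi, rmsea, srmr):
    """Label each fit index using the rule-of-thumb cutoffs quoted in the text."""

    def incremental(value):
        # NNFI and CFI: values greater than 0.95 indicate acceptable fit.
        return "acceptable" if value > 0.95 else "poor"

    # RMSEA: at most 0.06 is good (assumed upper end of 0.05-0.06),
    # below 0.08 is adequate.
    if rmsea <= 0.06:
        rmsea_label = "good"
    elif rmsea < 0.08:
        rmsea_label = "adequate"
    else:
        rmsea_label = "poor"

    # SRMR: below 0.05 is good, below 0.10 is acceptable.
    if srmr < 0.05:
        srmr_label = "good"
    elif srmr < 0.10:
        srmr_label = "acceptable"
    else:
        srmr_label = "poor"

    return {
        "NNFI": incremental(nnfi),
        "CFI": incremental(cfi),
        "RMSEA": rmsea_label,
        "SRMR": srmr_label,
    }


# Fit indices reported for the three-factor CFA model of ROAD.
print(classify_fit(nnfi=0.98, cfi=0.99, rmsea=0.079, srmr=0.047))
```

With these inputs the RMSEA falls in the adequate band and the SRMR in the good band, which matches the "adequate to good" wording used for the model.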
After confirmation of the three-factor structure, to investigate the reasons underlying this factorial complexity, we performed an exploratory analysis using a bifactor model [19] to
check for the presence of content clusters, a subset of items
sharing specific sources of variance, which can mask an
underlying common factor. A bifactor model (BM) is a latent
structure in which each item loads on a general factor, which
reflects what is common among the items and represents the
individual differences in the target dimension that a researcher
is most interested in. Moreover, a bifactor structure specifies
two or more orthogonal group factors (three in the ROAD
case) representing common factors measured by the items that
potentially explain item response variance not accounted for
by the general factor. By comparing item loadings between the
general factor of the BM and a single-factor analysis, it is
possible to ascertain the amount of distortion caused by multidimensionality [19]. The FACTOR software [20] was used
to obtain the bifactor solution through Schmid–Leiman
orthogonalization.
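As a rough illustration of what the Schmid–Leiman orthogonalization does, the sketch below handles the simplest case, in which every item loads on exactly one first-order group factor and each group factor loads on a single second-order general factor. The loadings are hypothetical, the function name `schmid_leiman` is introduced here, and the code is not the procedure implemented in FACTOR, which works on full loading matrices. In this simple case the general-factor loading is the product of the first-order and second-order loadings, and the group-factor loading is the first-order loading scaled by the square root of the group factor's residual variance.

```python
from math import sqrt


def schmid_leiman(first_order, second_order):
    """Split first-order loadings into general and residualized group loadings.

    first_order:  {item: (group_factor, loading of the item on that factor)}
    second_order: {group_factor: loading of that factor on the general factor}
    Assumes simple structure (one group factor per item).
    """
    result = {}
    for item, (factor, loading) in first_order.items():
        g = second_order[factor]
        result[item] = {
            "general": loading * g,                # variance routed through G
            "group": loading * sqrt(1.0 - g * g),  # variance left to the group factor
        }
    return result


# Hypothetical loadings for three items on two group factors (not study data).
first_order = {"item_1": ("F1", 0.80), "item_2": ("F1", 0.70), "item_3": ("F2", 0.75)}
second_order = {"F1": 0.80, "F2": 0.70}

for item, parts in schmid_leiman(first_order, second_order).items():
    print(f"{item}: general = {parts['general']:.2f}, group = {parts['group']:.2f}")
```

The transformation preserves each item's communality: the squared general and group loadings add up to the squared first-order loading.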
Rasch analysis
Once the factorial structure of the scale was determined, the
matrix of single raw scores for each subject of the entire
scale and subsequently of each subscale underwent Rasch
analysis (rating scale model) using the WINSTEPS software, v. 3.68.2 [21]. As a first step, we investigated,
through rating scale diagnostics, whether the rating scale
was being used in the expected manner, according to the
criteria suggested by Linacre [21, 22]. Reliability was evaluated in terms of separation (G), defined as the ratio of the
true spread of the measures to their measurement error. The
separation indices give an estimate (in standard error units)
of the spread or separation of items or persons along the
measurement construct and reflect the number of strata of
measures which are statistically discernible. A separation of
2.0 is considered good and enables the distinction of three
groups or strata. A related index is the reliability of these
separation indices, providing the degree of confidence that
can be placed in the consistency of the estimates (range 0–1;
coefficients >0.80 are considered as good, and coefficients
>0.90 are considered as excellent) [13]. A principal component analysis (PCA) on the standardized residuals was used
to inspect: (1) the presence of subdimensions. Unidimensionality in this context assumes that, after removal of the trait
that the scale is intended to measure (the Rasch factor), the
residuals will be uncorrelated and normally distributed (i.e.,
there are no principal components). The following criteria
were used to determine whether additional factors were likely
to be present in the residuals: (a) a cutoff of 50 % of the
variance explained by the Rasch factor, (b) eigenvalue of the
first contrast smaller than 3, and (c) variance explained by
each contrast smaller than 5 %; (2) the local independence of
items. High correlation (>0.30) of residuals for two items
indicates that they may not be locally independent, either
because they duplicate some feature of each other or because
they both incorporate some other shared dimension [21].
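Two quantities from this paragraph lend themselves to a short sketch: the link between separation, reliability, and strata, and the screening of residual correlations for local dependence. The formulas used are the standard Rasch relations G = sqrt(R / (1 - R)) and strata = (4G + 1) / 3. The function names and the residuals are hypothetical and serve only as an illustration; the study itself obtained these statistics from WINSTEPS.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def separation_from_reliability(reliability):
    """Separation G implied by a separation reliability R (0 <= R < 1)."""
    return sqrt(reliability / (1.0 - reliability))


def reliability_from_separation(g):
    """Separation reliability R implied by a separation G."""
    return g * g / (1.0 + g * g)


def strata(g):
    """Number of statistically distinct levels implied by a separation G."""
    return (4.0 * g + 1.0) / 3.0


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def locally_dependent_pairs(residuals, cutoff=0.30):
    """Return item pairs whose standardized residuals correlate above cutoff.

    residuals: {item name: list of standardized residuals, one per subject}
    """
    flagged = []
    for a, b in combinations(sorted(residuals), 2):
        r = pearson(residuals[a], residuals[b])
        if r > cutoff:
            flagged.append((a, b, round(r, 2)))
    return flagged


# A separation of 2.0 corresponds to three strata and a reliability of 0.80.
print(strata(2.0), reliability_from_separation(2.0))

# Hypothetical standardized residuals for three items (not study data).
residuals = {
    "item_a": [1.0, -0.5, 0.3, -1.2, 0.8],
    "item_b": [0.9, -0.4, 0.1, -1.0, 0.7],
    "item_c": [-0.2, 0.9, -1.1, 0.4, 0.1],
}
print(locally_dependent_pairs(residuals))
```

In the hypothetical data only the first two items are flagged, because their residuals rise and fall together after the Rasch factor has been removed.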
Validity was analyzed by evaluating the fit of individual items to the latent trait as per the Rasch model
(unidimensionality) and examining if the pattern of item difficulties was consistent with the model expectation. The
Rasch model estimates goodness-of-fit (or simply fit) of
the real data to the modelled data. Information-weighted
(infit) and outlier-sensitive (outfit) mean-square statistics
(MnSq) for each item were calculated (similarly to a chi-square analysis) to test if there were items which did not fit
with the model expectancies. We considered as an indicator of
acceptable fit MnSq > 0.8 and MnSq < 1.2: items outside this
range were considered underfitting (MnSq ≥ 1.2) or overfitting
(MnSq ≤ 0.8). At the same time, we calculated the level of
difficulty represented by each item (item difficulty) and where
each individual subject fits along the continuum (subject
ability). The fit criteria for the Rasch model were adopted
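The infit and outfit statistics described above can be sketched from their standard definitions: outfit is the plain mean of the squared standardized residuals, and infit is the same quantity weighted by the model variance of each response. The example below is a minimal sketch, assuming that the expected scores and variances have already been obtained from a fitted Rasch model. The numbers are hypothetical and dichotomous for brevity (the study used a rating scale model in WINSTEPS), and the function names are introduced here for illustration.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for one item.

    observed: responses of each subject to the item
    expected: model-expected score of each subject on the item
    variance: model variance of each response
    """
    sq_resid = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(s / w for s, w in zip(sq_resid, variance)) / len(observed)
    # Infit: information-weighted, i.e. sum of squared residuals over sum of variances.
    infit = sum(sq_resid) / sum(variance)
    return infit, outfit


def classify_mnsq(mnsq, low=0.8, high=1.2):
    """Apply the 0.8-1.2 window used in the text."""
    if mnsq >= high:
        return "underfit"
    if mnsq <= low:
        return "overfit"
    return "acceptable"


# Hypothetical dichotomous item: expected = p, variance = p * (1 - p).
observed = [1, 0, 1, 1, 0]
expected = [0.8, 0.3, 0.6, 0.9, 0.2]
variance = [p * (1 - p) for p in expected]

infit, outfit = fit_mean_squares(observed, expected, variance)
print(f"infit = {infit:.2f} ({classify_mnsq(infit)})")
print(f"outfit = {outfit:.2f} ({classify_mnsq(outfit)})")
```

Values near 1 mean the responses are about as noisy as the model predicts. Larger values signal unexpected responses (underfit), and smaller values signal responses that are more predictable than the model expects (overfit).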
Results
Score distributions of ROAD
Median ROAD score (10th–90th percentile) was 3.1 (1–6.6),
whereas the three subscales showed the following results:
upper limb function, 3 (0–7); lower limb function, 1.3 (0–6.3); activities of daily living/work, 4.2 (1.7–8.3). Only one
subject had the minimum score (0), and none obtained
the maximum score (10).
Internal consistency and dimensionality
Cronbach's alpha was 0.90 for the total ROAD questionnaire and ranged from 0.87 (activities of daily living/work)
to 0.91 (both upper and lower extremity function) in the
three subscales, thus endorsing the use of the scores for
individual judgement. Item–total correlation ranged from
0.57 ("wash and dry your body") to 0.75 ("take a 2 kg object
from above") in the entire scale and from 0.69 ("wash and
dry your body") to 0.85 ("climb up five steps or stairs") in
the subscales.
Data fit for the CFA three-factor model was adequate to
good (NNFI = 0.98, CFI = 0.99, RMSEA = 0.079, SRMR = 0.047), with standardized item-to-factor loadings ranging
from 0.60 to 0.90 and a cumulative explained variance of
83 %; inter-factor correlation was moderate (from 0.63 to
0.64). The bifactor model of ROAD presented a clean
independent cluster structure (each variable loaded non-negligibly on just one group factor); the loadings in the
unidimensional model were very similar to those on the
general factor in the bifactor model (Table 1).
Rasch analysis
Discussion
The proliferation of instruments and measurement systems
has resulted in an increasing need for validated, comparable
measures. In psychosocial and medical sciences, constructs
such as disability or quality of life are usually measured
by means of multi-item health status questionnaires. It is
important that the psychometric properties of these instruments are extensively tested because the ability of these
measures to improve decision making in clinical research
relies heavily on their psychometric robustness.
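Table 1 Item loadings in the single-factor model and in the bifactor model (G general factor; F1, F2, F3 group factors)
Items   Single factor   Bifactor
                        G       F1      F2      F3
1.      0.69            0.61    0.46
2.      0.75            0.68    0.61
3.      0.79            0.71    0.58
4.      0.76            0.68    0.47
5.      0.79            0.70    0.43
6.      0.70            0.63            0.56
7.      0.72            0.66            0.65
8.      0.76            0.69            0.60
9.      0.85            0.76            0.51
10.     0.60            0.56                    0.54
11.     0.72            0.69                    0.69
12.     0.70            0.66                    0.62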
Table 2 Subjects' ability levels, reliability indices, and principal component analysis of the residuals for the three ROAD subscales
                    Separate subscales
ROAD                Upper extremity     Lower extremity     Activities of daily
                    function            function            living/work
1.08                0.83                1.57                0.30
(−4.38 to +3.34)    (−4.09 to +4.14)    (−4.54 to +5.47)    (−5.93 to +5.88)
2.74                2.2                 1.99                2.13
0.88                0.83                0.80                0.82
0.92                0.91                0.91                0.87
−1.28 to +0.91      −0.42 to +0.29      −0.47 to +0.41      −0.99 to +1.17
11.64               3.75                3.53                10.64
0.99                0.93                0.93                0.99
59.6                65.3                65.2                75.8
3.2                 1.8                 1.7                 1.7
Measure   S.E.    Infit MnSq   Outfit MnSq   PtMeas Corr
Upper limb function (items 1–5)
0.02      0.07    1.21         1.19          0.83
0.26      0.07    0.94         0.93          0.85
0.29      0.07    0.88         0.88          0.85
0.15      0.07    0.95         0.95          0.86
0.42      0.07    0.99         0.99          0.86
Lower limb function (items 6–9)
0.04      0.08    1.19         1.17          0.87
0.41      0.08    0.88         0.86          0.88
0.47      0.08    0.79         0.77          0.91
0.10      0.08    1.12         1.11          0.87
Activities of daily living/work (items 10–12)
1.17      0.08    1.21         1.23          0.85
0.99      0.08    0.88         0.84          0.91
0.18      0.08    0.84         0.81          0.91
Containing item calibration (measure) with standard error (S.E.), fit information (infit and outfit MnSq), and the point–measure correlation (PtMeas Corr)
RA patients. Our results support the use of separate subscales for upper extremity function, lower extremity function, and activities of daily living/work, and confirm the
appropriateness of reporting an overall score based on summation of the three subscales.
Acknowledgments We thank Giardino A., Ori A., Zanoli L., and
Manfredi A. for their contribution to the organization of the study and
data management. The authors also wish to thank the other members of
the NEW INDICES Study Group, as follows: Lagan B. (Roma),
Varcasia G. (Mormanno, Cosenza), Pomponio G. (Ancona), Mozzani
F. (Parma), Scarpa R. (Napoli), Maier A. (Bolzano), Romeo N.
(Cuneo), Migliore A. (Roma), Scarpellini M. (Magenta, Milano),
Corsaro S.M. (Napoli), Pusceddu M. (Iglesias, Cagliari), Foti R.
(Catania), Rotondi M. (Velletri, Roma), Marasini B. (Rozzano,
Milano), Ferraccioli G. (Roma), Malavolta N. (Bologna), Sabadini L.
(Arezzo), Cimmino M. (Genova), Ricioppo A. (Vimercate, Milano),
Triolo G. (Palermo), Mascia M.T. (Modena), Minisola G. (Roma),
Botsios K. (Montebelluna, Treviso), Pellerito R. (Torino), Noto A.
(Cosenza), Cazzato M. (Pisa), Bombardieri S. (Pisa). Part of the study
(data collection) was supported by an unrestricted educational grant
from Pfizer S.p.A., Italy. Editorial support was provided by Kim
Brown of UBC Scientific Solutions and was funded by Pfizer Inc.
Disclosures FS has attended advisory board meetings for Bristol-Myers Squibb, Abbott Immunology, Wyeth Lederle, and Pfizer, and
has received research support from Bristol-Myers Squibb. The remaining authors have no financial or other competing interests to declare.
References
1. Verstappen SM, Symmons DP (2011) What is the outcome of RA
in 2011 and can we predict it? Best Pract Res Clin Rheumatol
25(4):485–96
2. Salaffi F, De Angelis R, Stancati A, Grassi W, MArche Pain;
Prevalence INvestigation Group (MAPPING) study (2005)
Health-related quality of life in multiple musculoskeletal conditions: a cross-sectional population based epidemiological study. II.
The MAPPING study. Clin Exp Rheumatol 23(6):829–39
3. Kalyoncu U, Dougados M, Daurès JP, Gossec L (2009) Reporting
of patient-reported outcomes in recent trials in rheumatoid arthritis:
a systematic literature review. Ann Rheum Dis 68(2):183–90
4. Fries JF, Spitz P, Kraines RG, Holman HR (1980) Measurement of
patient outcome in arthritis. Arthritis Rheum 23(2):137–45
5. Meenan RF, Mason JH, Anderson JJ, Guccione AA, Kazis LE
(1992) AIMS2. The content and properties of a revised and expanded Arthritis Impact Measurement Scales health status questionnaire. Arthritis Rheum 35:1–10
6. Ware JE Jr, Sherbourne CD (1992) The MOS 36-item short form
health survey (SF-36). 1. Conceptual framework and item selection. Med Care 30:473–81
7. Oude Voshaar MA, ten Klooster PM, Taal E, van de Laar MA
(2011) Measurement properties of physical function scales validated for use in patients with rheumatoid arthritis: a systematic review
of the literature. Health Qual Life Outcomes 9:99
8. Pincus T, Yazici Y, Bergman M (2005) Development of a multidimensional health assessment questionnaire (MDHAQ) for the
infrastructure of standard clinical care. Clin Exp Rheumatol
23:S19–28
9. Salaffi F, Gasparini S, Grassi W (2009) The use of computer touchscreen technology for the collection of patient-reported outcome
10.
11.
12.
13.
14.
15.
16.
17.
18.
19. Reise SP, Morizot J, Hays RD (2007) The role of the bifactor
model in resolving dimensionality issues in health outcomes measures. Qual Life Res 16(S1):19–31
20. Lorenzo-Seva U, Ferrando PJ (2006) FACTOR: a computer program to fit the exploratory factor analysis model. Behav Res
Methods Instrum Comput 38(1):88–91
21. Linacre JM (2009) A user's guide to WINSTEPS MINISTEP:
Rasch-model computer programs. Program manual 3.68.0. Chicago:
WINSTEPS.com; 2009. http://www.winsteps.com/a/winsteps.pdf.
Accessed 29 July 2010
22. Linacre JM (2002) Optimizing rating scale category effectiveness.
J Appl Meas 3:85–106
23. Smith AB, Rush R, Fallowfield LJ, Velikova G, Sharpe M (2008)
Rasch fit statistics and sample size considerations for polytomous
data. BMC Med Res Methodol 8:33
24. Frost MH, Reeve BB, Liepa AM, Stauffer JW, Hays RD, Mayo/
FDA Patient-Reported Outcomes Consensus Meeting Group
(2007) What is sufficient evidence for the reliability and validity
of patient-reported outcome measures? Value Health 10(Suppl 2):
S94–105
25. Reeve BB, Hays RD, Bjorner JB et al (2007) Psychometric
evaluation and calibration of health-related quality of life item
banks: plans for the Patient-Reported Outcomes Measurement
Information System (PROMIS). Med Care 45(Suppl 1):S22–31
26. Coste J, Boue S, Ecosse E, Leplège A, Pouchot J (2005) Methodological issues in determining the dimensionality of composite
health measures using principal component analysis: case illustration and suggestions for practice. Qual Life Res 14:641–54
27. ten Klooster PM, Veehof MM, Taal E, van Riel PL, van de Laar MA
(2008) Confirmatory factor analysis of the Arthritis Impact Measurement Scales 2 short form in patients with rheumatoid arthritis. Arthritis Rheum 59(5):692–8
28. Guillemin F, Coste J, Pouchot J, Ghezail M, Bregeon C, Sany J,
The French Quality of Life in Rheumatology Group (1997) The
AIMS2-SF: a short form of the Arthritis Impact Measurement
Scales 2. Arthritis Rheum 40:1267–74
29. Ren XS, Kazis L, Meenan RF (1999) Short-Form Arthritis Impact
Measurement Scales 2: tests of reliability and validity among
patients with osteoarthritis. Arthritis Care Res 12:163–71