Classical Test Theory and Rasch Analysis Validation

Clin Rheumatol (2013) 32:211–217
DOI 10.1007/s10067-012-2101-6
ORIGINAL ARTICLE
Classical test theory and Rasch analysis validation of the Recent-Onset Arthritis Disability questionnaire in rheumatoid arthritis patients
Fausto Salaffi & Franco Franchignoni &
Andrea Giordano & Alessandro Ciapetti &
Stefania Gasparini & Marcella Ottonello &
on behalf of the NEW INDICES Study Group
Received: 16 February 2012 / Revised: 17 September 2012 / Accepted: 1 October 2012 / Published online: 13 October 2012
© Clinical Rheumatology 2012
Abstract Disability has been identified as a core outcome
measure in rheumatoid arthritis (RA). The aim of this study
was to perform a comprehensive psychometric analysis of
the Recent-Onset Arthritis Disability (ROAD) questionnaire
in patients with RA. The questionnaire was completed by
583 patients with RA: 196 subjects participating in the
NEW INDICES study and 387 subjects who were taking
part in a long-term observational study. At confirmatory
factor analysis for categorical data, data fit for a three-factor
model was adequate to good (non-normed fit index = 0.98,
comparative fit index = 0.99, root mean square error of
approximation = 0.079, standardized root mean square
residual = 0.047), with standardized item-to-factor
loadings ranging from 0.60 to 0.90 and a cumulative
explained variance of 83 %. The bifactor model of ROAD
presented a clean independent cluster structure. The loadings
in the unidimensional model were very similar to those
on the general factor in the bifactor model. Rasch analysis
showed a correct functioning of rating categories, a good fit
of the data to the model for all three subscales, and satisfactory
separation indexes and respective reliability (for both
persons and items). This study, using both classical test
theory and Rasch analysis methods, provides psychometric
evidence of the reliability and internal and structural validity
of ROAD in RA patients. Our results support the use of
separate subscores for upper limb function, lower limb
function, and activities of daily living/work, and the
appropriateness of reporting an overall score (i.e., the mean of the
three subscales).
Keywords Disability · Functional status · Patient-reported
outcome · Psychometrics · Recent-Onset Arthritis Disability
(ROAD) questionnaire · Rheumatoid arthritis

F. Salaffi (*) · A. Ciapetti · S. Gasparini
Department of Rheumatology, Polytechnic University of Marche,
Ospedale C. Urbani, Via dei Colli 52,
60035 Jesi, Ancona, Italy
e-mail: fsalaff@tin.it

F. Franchignoni · M. Ottonello
Department of Physical and Rehabilitation Medicine,
Salvatore Maugeri Foundation,
Clinica del Lavoro e della Riabilitazione, IRCCS,
Nervi,
Genoa, Italy

A. Giordano
Unit of Bioengineering, Salvatore Maugeri Foundation,
Clinica del Lavoro e della Riabilitazione, IRCCS,
Veruno,
Novara, Italy

Introduction
Rheumatoid arthritis (RA) is a chronic inflammatory arthropathy associated with articular damage, attendant
comorbidities, increasing disability, and socioeconomic decline [1, 2]. Traditional clinical disease markers of RA such
as joint counts and serum C-reactive protein quantify some
of the physical aspects of the disease, but fail to fully
identify the broad spectrum of disease effects. Consequently, patient-reported outcome (PRO) measures of health-related quality of life have been developed for use in clinical
practice and research, health policy evaluation, and general
population surveys [3].
Several self-report questionnaires have been constructed
over the past decades. The Health Assessment Questionnaire
(HAQ) disability index (DI) [4] and Arthritis Impact Measurement Scales 2 [5] now represent the primary measures of
physical function in RA, accompanied by generic measures
such as the Medical Outcomes Study 36-Item Short Form (SF-36) [6]. Over its long history, the HAQ-DI has played an
influential role in the paradigm shift to establishing PROs,
but it has been criticized in terms of floor and ceiling effects,
nonlinearity, and the fact that some items are confusing for
patients [7]. One barrier to a more widespread use of such
questionnaires is their length; consequently, existing questionnaires have been shortened by omission of subscales, items, or
both [7, 8]. Short versions of an instrument are also important
to enhance response rates in postal surveys or to capture PRO
information in RA patients using an interactive touch-screen
computer system [9]. To broaden the application of patient
questionnaires beyond clinical research to standard clinical
rheumatology care, we have developed and validated a simpler self-administered questionnaire, namely the Recent-Onset Arthritis Disability (ROAD) questionnaire [10, 11].
The construct/convergent validity, predictive validity, and
sensitivity to change of ROAD have been established in
previous studies [11]. We showed that ROAD subscales are
internally consistent and correlate significantly with a wide
variety of health status measures, including self-report measures and biochemical and clinical data. Further, the ROAD
subscale scores improve significantly post-intervention, indicating that ROAD is a valid measure of change over time [11].
Previous analyses of ROAD provided a three-factor model
explaining 70.1 % of the variance. However, modern psychometric theory poses further demands on health status instruments [10], and the advantages of using new approaches like
Rasch analysis in the development and evaluation of clinical
tools for health care have been clearly documented in the
literature [12]. Rasch analysis evaluates psychometric properties of a questionnaire that are not analyzed by classical test
theory (CTT) techniques, e.g., how well an item performs in
terms of its relevance or usefulness in measuring the underlying construct, the amount of the construct targeted by each
question, the possible redundancy of an item relative to other
items in the scale, and the appropriateness of the response
categories [13]. The aim of this study was to extend previous
assessments of ROAD's validity, performing a comprehensive
psychometric analysis of the questionnaire [10, 11] using both
CTT and Rasch analysis.
Patients and methods

Patients

Five hundred eighty-three subjects were enrolled in the study
and completed the ROAD questionnaire: 196 RA outpatients
(F = 163, mean age 56.7 ± 12.1 years, mean duration of disease
5.1 ± 5.9 years), who were participants in the NEW INDICES
study [14, 15], and 387 RA patients (F = 298, mean age 57.6 ±
13.2 years, mean duration of disease 5.3 ± 6.1 years), who
were taking part in a long-term observational study conducted
by the Department of Rheumatology of the Polytechnic University
of Marche, Ancona, Italy. Details are described elsewhere
[14, 15]. The study was approved by the national health
authorities and ethics committees of all participating hospitals.
All patients gave informed written consent.

The ROAD questionnaire

The ROAD questionnaire consists of 12 items that tap a
combination of common symptoms related to the patient's
level of functional ability and assess three patient-relevant
dimensions: upper limb function (5 items), lower limb function
(4 items), and activities of daily living/work (3 items). For
each item, patients are asked to rate the level of difficulty over
the past week on a 5-point scale ranging from 0 (no difficulty)
to 4 (unable to do) [10, 11]. The item scores of each
dimension are first summed and then, in order to express
them in a more clinically meaningful format, a simple
mathematical normalization procedure is performed so that all
subscores are expressed in the range of 0–10, with 0 representing
better status and 10 representing poorer status. In this
way, three physical function subscores ranging from 0 to 10 can
be presented graphically as a "ROAD disability profile". The
overall ROAD score is the mean of the three subscores.
ROAD can be scored in 15 to 20 s. The questionnaire was
developed jointly by groups in Italy [10], through an interactive
process involving expert opinion from health care providers
and interviews with patients. Items retained for this
scale were those that had a prevalence >60 % in the sample
population and a mean importance rating >2.0 (on a scale of
1–5). Internal consistency of the subscales was excellent
(Cronbach's alpha = 0.90–0.98). Test–retest reliability was also
acceptable for each of the subscales (intraclass correlation
coefficient = 0.70–0.86). Construct validity was confirmed
against a variety of measures, including the HAQ [10, 15].
The results of responsiveness analysis indicated that the
ROAD subscales were slightly more sensitive to perceived
change in functional disability than the HAQ, SF-36 PCS, or
global self-assessment of functional disability [11].

Statistical analysis

Internal consistency

The items of ROAD were examined for internal consistency by
calculating: (1) Cronbach's coefficient alpha, to measure the
extent to which items measure the same characteristic. Alpha
values >0.70 are recommended for group-level comparison,
whereas a minimum of 0.85–0.90 is desirable for
individual judgments; (2) the Spearman rank correlation
coefficients (rs), to examine to what degree each item was
correlated with the total score, omitting that item from the total
(item–total correlation). The usual rule of thumb is that an
item should correlate with the total score with r > 0.30 [16].
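The two internal-consistency statistics described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`cronbach_alpha`, `corrected_item_total`), the helper functions, and the toy ratings are assumptions made for illustration, and the sketch handles complete data only.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (no missing values)."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for j in range(start, end + 1):
            ranks[order[j]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(rows, item):
    """Spearman correlation of one item with the total of the remaining items."""
    item_scores = [r[item] for r in rows]
    rest_totals = [sum(r) - r[item] for r in rows]
    # Spearman's coefficient is the Pearson correlation of the ranks.
    return pearson(average_ranks(item_scores), average_ranks(rest_totals))

# Hypothetical 0-4 ratings from four respondents on three items.
toy = [[0, 1, 0], [1, 1, 2], [2, 3, 2], [4, 3, 4]]
alpha = cronbach_alpha(toy)
r_item0 = corrected_item_total(toy, 0)
```

With real data, `alpha` would be compared against the 0.70 and 0.85–0.90 thresholds and each `corrected_item_total` value against 0.30.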
Dimensionality
Previous studies had shown that the health status construct
tapped by ROAD can be modelled using three factors [10]:
(1) upper limb function, consisting of questions concerning
fine movements of the hand and finger and arm function
(items 1–5); (2) lower limb function, exploring the locomotor activities of the lower extremity (items 6–9); and (3)
activities of daily living/work, a factor more likely to represent the interferences caused by arthritis in daily life and
work activities (items 10–12). Accordingly, we verified this
finding using confirmatory factor analysis (CFA) for ordinal
data: model parameters were determined via diagonally
weighted least square estimation from the polychoric correlation matrix, as outlined in LISREL (LISREL 8.80, SSI Inc., 2007)
[17]. The goodness of fit was evaluated by examining the following indexes: Tucker–Lewis fit index (or non-normed fit
index, NNFI), comparative fit index (CFI), root mean square
error of approximation (RMSEA), and the standardized root
mean square residual (SRMR). The NNFI and the CFI range
from 0 (poor fit) to 1 (good fit): values greater than 0.95 are
indicative of an acceptable fit. An RMSEA value lower than 0.08
is considered to reflect adequate fit, and a value equal to or less
than 0.05–0.06 is usually taken as an indication of good fit.
Similarly, as a rule of thumb, the SRMR should be less than
0.05 for a good fit, whereas values smaller than 0.10 may be
interpreted as acceptable [18].
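As a small illustration, these rules of thumb can be written as a lookup. This is a sketch under stated assumptions, not part of the original analysis: the function name `classify_fit` and the label "poor" for values outside the quoted ranges are hypothetical, the RMSEA bound for adequate fit is taken as 0.08, and the 0.05–0.06 bound for good fit is taken as 0.06.

```python
def classify_fit(nnfi, cfi, rmsea, srmr):
    """Label each index with the rule-of-thumb cut-offs quoted in the text."""
    def incremental(value):          # NNFI and CFI: higher is better
        return "acceptable" if value > 0.95 else "poor"

    def rmsea_label(value):          # lower is better
        if value <= 0.06:
            return "good"
        return "adequate" if value < 0.08 else "poor"

    def srmr_label(value):           # lower is better
        if value < 0.05:
            return "good"
        return "acceptable" if value < 0.10 else "poor"

    return {
        "NNFI": incremental(nnfi),
        "CFI": incremental(cfi),
        "RMSEA": rmsea_label(rmsea),
        "SRMR": srmr_label(srmr),
    }

# Indices reported in the abstract for the three-factor model.
reported = classify_fit(nnfi=0.98, cfi=0.99, rmsea=0.079, srmr=0.047)
```

Applied to the indices reported in the abstract, the lookup labels RMSEA as adequate and SRMR as good.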
After confirmation of the three-factor structure, to investigate the reasons underlying this factorial complexity, we performed an exploratory analysis using a bifactor model [19] to
check for the presence of content clusters, a subset of items
sharing specific sources of variance, which can mask an
underlying common factor. A bifactor model (BM) is a latent
structure in which each item loads on a general factor, which
reflects what is common among the items and represents the
individual differences in the target dimension that a researcher
is most interested in. Moreover, a bifactor structure specifies
two or more orthogonal group factors (three in the ROAD
case) representing common factors measured by the items that
potentially explain item response variance not accounted for
by the general factor. By comparing item loadings between the
general factor of the BM and a single-factor analysis, it is
possible to ascertain the amount of distortion caused by multidimensionality [19]. The FACTOR software [20] was used
to obtain the bifactor solution through Schmid–Leiman
orthogonalization.
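The algebra of a Schmid–Leiman orthogonalization can be sketched in a few lines. This is only a generic illustration of the transformation applied to a higher-order solution, not a reproduction of what the FACTOR program does internally; the function name `schmid_leiman` and all loadings below are invented for the example.

```python
from math import sqrt

def schmid_leiman(first_order, second_order):
    """
    first_order:  items x groups matrix of first-order loadings.
    second_order: loading of each group factor on the general factor.
    Returns (general loadings, residualized group loadings).
    """
    general = [
        sum(l * g for l, g in zip(row, second_order))
        for row in first_order
    ]
    group = [
        [l * sqrt(1 - g * g) for l, g in zip(row, second_order)]
        for row in first_order
    ]
    return general, group

# Hypothetical higher-order solution: four items, two group factors.
first_order = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6]]
second_order = [0.6, 0.8]
general, group = schmid_leiman(first_order, second_order)
```

Each item's communality is preserved: the squared general loading plus the squared group loading equals the squared first-order loading, so variance is only redistributed between the general and the group factors.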
Rasch analysis
Once the factorial structure of the scale was determined, the
matrix of single raw scores for each subject of the entire
scale and subsequently of each subscale underwent Rasch
analysis (rating scale model) using the WINSTEPS software, v. 3.68.2 [21]. As a first step, we investigated,
through rating scale diagnostics, whether the rating scale
was being used in the expected manner, according to the
criteria suggested by Linacre [21, 22]. Reliability was evaluated in terms of separation (G), defined as the ratio of the
true spread of the measures to their measurement error. The
separation indices give an estimate (in standard error units)
of the spread or separation of items or persons along the
measurement construct and reflect the number of strata of
measures which are statistically discernible. A separation of
2.0 is considered good and enables the distinction of three
groups or strata. A related index is the reliability of these
separation indices, providing the degree of confidence that
can be placed in the consistency of the estimates (range 0–1;
coefficients >0.80 are considered as good, and coefficients
>0.90 are considered as excellent) [13]. A principal component analysis (PCA) on the standardized residuals was used
to inspect: (1) the presence of subdimensions. Unidimensionality in this context assumes that, after removal of the trait
that the scale intended to measure (the "Rasch factor"), the
residuals will be uncorrelated and normally distributed (i.e.,
there are no principal components). The following criteria
were used to determine whether additional factors were likely
to be present in the residuals: (a) a cutoff of 50 % of the
variance explained by the Rasch factor, (b) eigenvalue of the
first contrast smaller than 3, and (c) variance explained by
each contrast smaller than 5 %; (2) the local independence of
items. High correlation (>0.30) of residuals for two items
indicates that they may not be locally independent, either
because they duplicate some feature of each other or because
they both incorporate some other shared dimension [21].
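The link between the separation index G, the number of statistically distinct strata, and the separation reliability can be made concrete with the standard Rasch formulas, strata = (4G + 1)/3 and reliability = G²/(1 + G²). The sketch below is illustrative; the function names are hypothetical.

```python
def strata(separation):
    """Number of statistically distinct levels for a separation index G."""
    return (4 * separation + 1) / 3

def separation_reliability(separation):
    """Reliability implied by a separation index G (ranges from 0 to 1)."""
    return separation ** 2 / (1 + separation ** 2)

# G = 2.0 is the "good" benchmark quoted above.
benchmark_strata = strata(2.0)
benchmark_reliability = separation_reliability(2.0)
```

For G = 2.0 this yields three strata and a reliability of 0.80, matching the benchmarks quoted in the text.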
Validity was analyzed by evaluating the fit of individual items to the latent trait as per the Rasch model
(unidimensionality) and examining if the pattern of item difficulties was consistent with the model expectation. The
Rasch model estimates goodness-of-fit (or simply fit) of
the real data to the modelled data. Information-weighted
(infit) and outlier-sensitive (outfit) mean-square statistics
(MnSq) for each item were calculated (similarly to a chi-square analysis) to test if there were items which did not fit
with the model expectancies. We considered as an indicator of
acceptable fit MnSq >0.8 and MnSq <1.2: items outside this
range were considered underfitting (MnSq ≥1.2) or overfitting
(MnSq ≤0.8). At the same time, we calculated the level of
difficulty represented by each item (item difficulty) and where
each individual subject fits along the continuum (subject
ability). The fit criteria for the Rasch model were adopted
according to Smith et al. [23], who observed that sample size
invariance appears to exist for the mean-square fit statistics in
polytomous Rasch models, suggesting a 0.7–1.3 range which
we conservatively reduced to 0.8–1.2.
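To show how the infit and outfit mean-squares arise, the following sketch computes them for one item under the rating scale model from already-estimated parameters. It does not reproduce WINSTEPS estimation: the function names, the threshold values, and the toy responses are assumptions for illustration only.

```python
from math import exp

def category_probabilities(theta, delta, thresholds):
    """Rating scale model: P(X = k) for k = 0..len(thresholds)."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + theta - delta - tau)
    weights = [exp(v) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_and_variance(theta, delta, thresholds):
    probs = category_probabilities(theta, delta, thresholds)
    expected = sum(k * p for k, p in enumerate(probs))
    var = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    return expected, var

def item_fit(observed, thetas, delta, thresholds):
    """Return (infit, outfit) mean-squares for a single item."""
    sq_resid, variances = [], []
    for x, theta in zip(observed, thetas):
        e, w = expected_and_variance(theta, delta, thresholds)
        sq_resid.append((x - e) ** 2)
        variances.append(w)
    infit = sum(sq_resid) / sum(variances)            # information-weighted
    outfit = sum(r / w for r, w in zip(sq_resid, variances)) / len(observed)
    return infit, outfit

def fit_label(mnsq, low=0.8, high=1.2):
    if mnsq >= high:
        return "underfit"
    if mnsq <= low:
        return "overfit"
    return "acceptable"

# Hypothetical thresholds for a 5-category scale and four toy respondents.
thresholds = [-1.5, -0.5, 0.5, 1.5]
thetas = [-1.0, 0.0, 1.0, 2.0]
observed = [1, 2, 3, 3]
infit, outfit = item_fit(observed, thetas, delta=0.0, thresholds=thresholds)
```

Infit weights each squared residual by its model variance, so it is dominated by respondents located near the item; outfit averages the standardized squared residuals and is therefore more sensitive to unexpected responses from respondents far from the item.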
Results

Score distributions of ROAD

Median ROAD score (10th–90th percentile) was 3.1 (1–6.6),
whereas the three subscales showed the following results:
upper limb function, 3 (0–7); lower limb function, 1.3
(0–6.3); activities of daily living/work, 4.2 (1.7–8.3). Only one
subject had the minimum score (0), and none obtained
the maximum score (10).

Internal consistency and dimensionality

Cronbach's alpha was 0.90 for the total ROAD questionnaire, and ranged from 0.87 (activities of daily living/work)
to 0.91 (both upper and lower extremity function) in the
three subscales, thus endorsing the use of the scores for
individual judgement. Item–total correlation ranged from
0.57 ("wash and dry your body") to 0.75 ("take a 2 kg object
from above") in the entire scale and from 0.69 ("wash and
dry your body") to 0.85 ("climb up five steps or stairs") in
the subscales.
Data fit for the CFA three-factor model was adequate to
good (NNFI = 0.98, CFI = 0.99, RMSEA = 0.079, SRMR =
0.047), with standardized item-to-factor loadings ranging
from 0.60 to 0.90 and a cumulative explained variance of
83 %; inter-factor correlation was moderate (from 0.63 to
0.64). The bifactor model of ROAD presented a clean
independent cluster structure (each variable loaded non-negligibly
on just one group factor); the loadings in the
unidimensional model were very similar to those on the
general factor in the bifactor model (Table 1).

Table 1 Exploratory factor analysis of single- and bifactor models

Items                                    Single factor   Bifactor
                                                         G      F1     F2     F3
1. Close your hand completely            0.69            0.61   0.46
2. Accept a hand shake                   0.75            0.68   0.61
3. Do up buttons                         0.79            0.71   0.58
4. Open previously opened jars           0.76            0.68   0.47
5. Take a 2 kg object from above         0.79            0.70   0.43
6. Stand up                              0.70            0.63          0.56
7. Walk on flat ground                   0.72            0.66          0.65
8. Climb up five steps or stairs         0.76            0.69          0.60
9. Get into and out of a car             0.85            0.76          0.51
10. Wash and dry your body               0.60            0.56                 0.54
11. Run errands and shop                 0.72            0.69                 0.69
12. Do housework or/and your paid job    0.70            0.66                 0.62

G general factor, F1–F3 group factors. Loadings <0.25 omitted

Rasch analysis

Rating scale diagnostics did not evidence any incorrect use
of the categories. Reliability, validity, and PCA of the residual results, for the entire ROAD and for each of the three
subscales, are presented in Table 2. Separation indexes and
respective reliability were satisfactory for both the entire
scale and the subscales (person separation reliability
≥0.80; item separation reliability >0.90), while residual
analysis confirmed the presence of multidimensionality in
the entire ROAD (eigenvalue of the first residual factor >3),
along with local independence violations (seven item pairs
with standardized residual correlation >0.30). On the other
hand, the ROAD subscales showed unidimensionality and
absence of item dependence. Fit of the data to the Rasch
model was good for all subscales (Table 3).

Discussion

The proliferation of instruments and measurement systems
has resulted in an increasing need for validated, comparable
measures. In psychosocial and medical sciences, constructs
such as disability or quality of life are usually measured
by means of multi-item health status questionnaires. It is
important that the psychometric properties of these instruments are extensively tested because the ability of these
measures to improve decision making in clinical research
relies heavily on their psychometric robustness.
Table 2 Subjects' ability levels, reliability indices, and principal component analysis of the residuals for the three ROAD subscales

                                                          ROAD               Separate subscales
                                                                             Upper extremity    Lower extremity    Activities of daily
                                                                             function           function           living/work
Average subjects' ability levels                          −1.08              −0.83              −1.57              −0.30
  (range)                                                 (−4.38 to +3.34)   (−4.09 to +4.14)   (−4.54 to +5.47)   (−5.93 to +5.88)
Person separation index                                   2.74               2.2                1.99               2.13
Person separation reliability                             0.88               0.83               0.80               0.82
Cronbach's alpha                                          0.92               0.91               0.91               0.87
Range of item-difficulty estimates                        −1.28 to +0.91     −0.42 to +0.29     −0.47 to +0.41     −0.99 to +1.17
Item separation index                                     11.64              3.75               3.53               10.64
Item separation reliability                               0.99               0.93               0.93               0.99
Variance explained by the Rasch factor (%)                59.6               65.3               65.2               75.8
Eigenvalue of the first residual factor                   3.2                1.8                1.7                1.7
Item pairs with standardized residual correlations >0.3   7                  None               None               None
The ROAD questionnaire has been previously assessed
by classical psychometric methods [10, 11, 15], but recently
new and detailed psychometric approaches, including a mix
of CTT and item response theory (IRT) methods, have been
recommended [24, 25]. This is the first study to assess in
detail the measurement assumptions and properties of
ROAD with both traditional and IRT (Rasch) measurement
methods. The study focused on analysis of dimensionality,
rating scale diagnostics, reliability, and construct validity of
the questionnaire.
There is no standard approach to determine whether an
outcome measure is sufficiently unidimensional or when the
presence of secondary dimensions threatens the accuracy of
unidimensional IRT calibrations. The most used approaches
are derived from factor analysis: CFA is used to evaluate the
fit of the data to pre-existing models, whereas exploratory
factor analysis (EFA) is used to
study the number of latent factors and the contribution of
individual items to those traits in new or modified models
(e.g., after language adaptation) after some form of estimation of the relevant factors [26].
Table 3 Summary of Rasch analysis of the three ROAD subscales

Item                                           Measure   S.E.   Infit MnSq   Outfit MnSq   PtMeas Corr
Subscale F1: upper extremity function
F1-1  Close your hand completely                0.02     0.07   1.21         1.19          0.83
F1-2  Accept a hand shake                       0.26     0.07   0.94         0.93          0.85
F1-3  Do up buttons                             0.29     0.07   0.88         0.88          0.85
F1-4  Open previously opened jars              −0.15     0.07   0.95         0.95          0.86
F1-5  Take a 2 kg object from above            −0.42     0.07   0.99         0.99          0.86
Subscale F2: lower extremity function
F2-1  Stand up                                 −0.04     0.08   1.19         1.17          0.87
F2-2  Walk on flat ground                       0.41     0.08   0.88         0.86          0.88
F2-3  Climb up five steps or stairs            −0.47     0.08   0.79         0.77          0.91
F2-4  Get into and out of a car                 0.10     0.08   1.12         1.11          0.87
Subscale F3: activities of daily living/work
F3-1  Wash and dry your body                    1.17     0.08   1.21         1.23          0.85
F3-2  Run errands and shop                     −0.99     0.08   0.88         0.84          0.91
F3-3  Do housework or/and your paid job        −0.18     0.08   0.84         0.81          0.91

Containing item calibration (measure) with standard error (S.E.), fit information (infit and outfit MnSq), and the point–measure correlation (PtMeas Corr)

The ROAD questionnaire was designed as a multidimensional scale composed of three subscales (upper extremity
function, lower extremity function, and activities of daily
living/work) that can be ranked and separately reported [10,
11, 15]. This scoring procedure is confirmed by the present
results. While we were compelled to split ROAD into three
separate subscales to obtain correct calibration using Rasch
analysis, the bifactor analysis pointed out the presence of a
strong general factor with group factors, which does not
distort the parameter estimates in the unidimensional model
[19]. The presence of three subdomains and an underlying
unidimensional latent trait allows reporting both a profile
(ROAD disability profile, given by the scores of each subscale) and an overall score. This captures both specific
functional impairments (subscores) and the global impact
of RA on a patient's performance (overall score). Our study
confirms the findings by ten Klooster et al. [27], Guillemin
et al. [28], and Ren et al. [29] that upper limb limitations
should be distinguished as a separate factor from lower limb
limitations. In ROAD, some items are clearly associated
with either upper limb activities (e.g., "close your hand
completely", "accept a hand shake", or "do up buttons"),
others with lower limb activities (e.g., "stand up", "walk on
flat ground", or "climb up five steps or stairs"), but the
questionnaire also contains items (such as "run errands and
shop" or "are you still able to do house work and/or your
paid job") that involve global body functions and are related
to activities of daily living or work.
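The scoring rule behind the profile and the overall score (item ratings summed per dimension, rescaled to 0–10, then averaged) can be sketched as follows. The source does not spell out the normalization formula, so the linear rescaling used here (observed sum divided by the maximum possible sum, times 10) is an assumption, as are the function name `road_scores`, the dictionary keys, and the example ratings.

```python
# Item positions (0-based) for items 1-5, 6-9, and 10-12.
SUBSCALES = {
    "upper_limb": range(0, 5),
    "lower_limb": range(5, 9),
    "adl_work": range(9, 12),
}
MAX_RATING = 4  # each item is rated from 0 (no difficulty) to 4 (unable to do)

def road_scores(ratings):
    """Return (profile, overall) for 12 item ratings, each an integer 0-4."""
    if len(ratings) != 12 or any(r not in range(MAX_RATING + 1) for r in ratings):
        raise ValueError("expected 12 ratings, each an integer from 0 to 4")
    profile = {}
    for name, positions in SUBSCALES.items():
        raw = sum(ratings[i] for i in positions)
        # Assumed normalization: rescale the raw sum linearly to 0-10.
        profile[name] = 10 * raw / (MAX_RATING * len(positions))
    overall = sum(profile.values()) / len(profile)
    return profile, overall

# Hypothetical respondent.
profile, overall = road_scores([1, 1, 2, 2, 2, 0, 1, 1, 2, 2, 3, 3])
```

Because each subscore is rescaled before averaging, the three dimensions contribute equally to the overall score even though they contain different numbers of items.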
The rating scale diagnostics of ROAD showed that the
five response categories (0 = without any difficulty, 1 = with
a little difficulty, 2 = with some difficulty, 3 = with much
difficulty, 4 = unable to do) were correctly functioning and
that each one was distinct in representing a different disability level. After that, data of each of the three subscales were
analyzed to calculate fit statistics and PCA on standardized
residuals, extract Rasch-modelled parameters of ability and
difficulty, and then examine validity and reliability issues
(Table 3). Overall, these results show that item selection was
adequate, and the subscales were able to define a hierarchy of
persons along each measured construct with clearly different
levels of disability (in upper extremity function, lower extremity function, and activities of daily living/work, respectively) and did not contain significant secondary dimensions.
Also, the consistency (reliability) of both person-ability
and item-difficulty estimates obtained by the ROAD subscales was good. This is supported by the values of Cronbach's alpha (always exceeding 0.85), which allow high-stakes decisions concerning individuals.
While this study was based on a large nationwide sample
drawn from rheumatology practices, it has some potential
limitations in that the sample was composed primarily of
female patients (79 %) and included individuals with longstanding RA (mean disease duration >5 years). Therefore,
these results may not be generalizable to the entire population of patients with RA.
In summary, this study provides psychometric evidence
of the reliability and validity of the ROAD questionnaire in
RA patients. Our results support the use of separate subscales for upper extremity function, lower extremity function, and activities of daily living/work, and confirm the
appropriateness of reporting an overall score computed as the mean of the three subscales.
Acknowledgments We thank Giardino A., Ori A., Zanoli L., and
Manfredi A. for their contribution to the organization of the study and
data management. The authors also wish to thank the other members of
the NEW INDICES Study Group, as follows: Laganà B. (Roma),
Varcasia G. (Mormanno, Cosenza), Pomponio G. (Ancona), Mozzani
F. (Parma), Scarpa R. (Napoli), Maier A. (Bolzano), Romeo N.
(Cuneo), Migliore A. (Roma), Scarpellini M. (Magenta, Milano),
Corsaro S.M. (Napoli), Pusceddu M. (Iglesias, Cagliari), Foti R.
(Catania), Rotondi M. (Velletri, Roma), Marasini B. (Rozzano,
Milano), Ferraccioli G. (Roma), Malavolta N. (Bologna), Sabadini L.
(Arezzo), Cimmino M. (Genova), Ricioppo A. (Vimercate, Milano),
Triolo G. (Palermo), Mascia M.T. (Modena), Minisola G. (Roma),
Botsios K. (Montebelluna, Treviso), Pellerito R. (Torino), Noto A.
(Cosenza), Cazzato M. (Pisa), Bombardieri S. (Pisa). Part of the study
(data collection) was supported by an unrestricted educational grant
from Pfizer S.p.A., Italy. Editorial support was provided by Kim
Brown of UBC Scientific Solutions and was funded by Pfizer Inc.
Disclosures FS has attended advisory board meetings for Bristol-Myers Squibb, Abbott Immunology, Wyeth Lederle, and Pfizer, and
has received research support from Bristol-Myers Squibb. The remaining authors have no financial or other competing interests to declare.
References
1. Verstappen SM, Symmons DP (2011) What is the outcome of RA
in 2011 and can we predict it? Best Pract Res Clin Rheumatol
25(4):485–96
2. Salaffi F, De Angelis R, Stancati A, Grassi W, MArche Pain;
Prevalence INvestigation Group (MAPPING) study (2005)
Health-related quality of life in multiple musculoskeletal conditions: a cross-sectional population based epidemiological study. II.
The MAPPING study. Clin Exp Rheumatol 23(6):829–39
3. Kalyoncu U, Dougados M, Daurès JP, Gossec L (2009) Reporting
of patient-reported outcomes in recent trials in rheumatoid arthritis:
a systematic literature review. Ann Rheum Dis 68(2):183–90
4. Fries JF, Spitz P, Kraines RG, Holman HR (1980) Measurement of
patient outcome in arthritis. Arthritis Rheum 23(2):137–45
5. Meenan RF, Mason JH, Anderson JJ, Guccione AA, Kazis LE
(1992) AIMS2. The content and properties of a revised and expanded Arthritis Impact Measurement Scales health status questionnaire. Arthritis Rheum 35:1–10
6. Ware JE Jr, Sherbourne CD (1992) The MOS 36-item short form
health survey (SF-36). 1. Conceptual framework and item selection. Med Care 30:473–81
7. Oude Voshaar MA, ten Klooster PM, Taal E, van de Laar MA
(2011) Measurement properties of physical function scales validated for use in patients with rheumatoid arthritis: a systematic review
of the literature. Health Qual Life Outcomes 9:99
8. Pincus T, Yazici Y, Bergman M (2005) Development of a multidimensional health assessment questionnaire (MDHAQ) for the
infrastructure of standard clinical care. Clin Exp Rheumatol
23:S19–28
9. Salaffi F, Gasparini S, Grassi W (2009) The use of computer touch-screen technology for the collection of patient-reported outcome
data in rheumatoid arthritis: comparison with standardized paper
questionnaires. Clin Exp Rheumatol 27:459–68
10. Salaffi F, Bazzichi L, Stancati A, Neri R, Cazzato M, Consensi A,
Grassi W, Bombardieri S (2005) Development of a functional
disability measurement tool to assess early arthritis: the Recent-Onset Arthritis Disability (ROAD) questionnaire. Clin Exp Rheumatol 23(5):628–36
11. Salaffi F, Stancati A, Neri R, Grassi W, Bombardieri S (2005)
Measuring functional disability in early rheumatoid arthritis: the
validity, reliability and responsiveness of the Recent-Onset Arthritis Disability (ROAD) index. Clin Exp Rheumatol 23:S31–42
12. Conrad KJ, Smith EV Jr (2004) International conference on objective measurement: applications of Rasch analysis in health care.
Med Care 42(Suppl 1):I1–6
13. Bond TG, Fox CM (2007) Applying the Rasch model: fundamental
measurement in the human sciences, 2nd edn. Lawrence Erlbaum
Associates, Mahwah
14. Salaffi F, Migliore A, Scarpellini M, Corsaro SM, Laganà B,
Mozzani F, Varcasia G, Pusceddu M, Pomponio G, Romeo N,
Maier A, Foti R, Scarpa R, Gasparini S, Bombardieri S (2010)
Psychometric properties of an index of three patient reported
outcome (PRO) measures, termed the CLinical ARthritis Activity
(PRO-CLARA) in patients with rheumatoid arthritis. The NEW
INDICES study. Clin Exp Rheumatol 28(2):186–200
15. Salaffi F, Ciapetti A, Gasparini S, Migliore A, Scarpellini M,
Corsaro SM, Laganà B, Mozzani F, Varcasia G, Pusceddu M,
Castriotta M, Serale F, Maier A, Foti R, Scarpa R, Bombardieri
S, NEW INDICES Study Group (2010) Comparison of the Recent-Onset Arthritis Disability questionnaire with the Health Assessment Questionnaire disability index in patients with rheumatoid
arthritis. Clin Exp Rheumatol 28(6):855–65
16. Streiner DL, Norman GR (2005) Health measurement scales: a
practical guide to their development and use, 2nd edn. Oxford
University Press, Oxford
17. Jöreskog KG (1990) New developments in LISREL: analysis of
ordinal variables using polychoric correlations and weighted least
squares. Qual Quant 24:387–404
18. Hu L, Bentler PM (1999) Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equat Model 6:1–55
19. Reise SP, Morizot J, Hays RD (2007) The role of the bifactor
model in resolving dimensionality issues in health outcomes measures. Qual Life Res 16(S1):19–31
20. Lorenzo-Seva U, Ferrando PJ (2006) FACTOR: a computer program to fit the exploratory factor analysis model. Behav Res
Methods Instrum Comput 38(1):88–91
21. Linacre JM (2009) A user's guide to WINSTEPS MINISTEP:
Rasch-model computer programs. Program manual 3.68.0. Chicago:
WINSTEPS.com; 2009. http://www.winsteps.com/a/winsteps.pdf.
Accessed 29 July 2010
22. Linacre JM (2002) Optimizing rating scale category effectiveness.
J Appl Meas 3:85–106
23. Smith AB, Rush R, Fallowfield LJ, Velikova G, Sharpe M (2008)
Rasch fit statistics and sample size considerations for polytomous
data. BMC Med Res Methodol 8:33
24. Frost MH, Reeve BB, Liepa AM, Stauffer JW, Hays RD, Mayo/
FDA Patient-Reported Outcomes Consensus Meeting Group
(2007) What is sufficient evidence for the reliability and validity
of patient-reported outcome measures? Value Health 10(Suppl 2):
S94–105
25. Reeve BB, Hays RD, Bjorner JB et al (2007) Psychometric
evaluation and calibration of health-related quality of life item
banks: plans for the Patient-Reported Outcomes Measurement
Information System (PROMIS). Med Care 45(Suppl 1):S22–31
26. Coste J, Boue S, Ecosse E, Leplège A, Pouchot J (2005) Methodological issues in determining the dimensionality of composite
health measures using principal component analysis: case illustration and suggestions for practice. Qual Life Res 14:641–54
27. ten Klooster PM, Veehof MM, Taal E, van Riel PL, van de Laar MA
(2008) Confirmatory factor analysis of the Arthritis Impact Measurement Scales 2 short form in patients with rheumatoid arthritis. Arthritis Rheum 59(5):692–8
28. Guillemin F, Coste J, Pouchot J, Ghezail M, Bregeon C, Sany J,
The French Quality of Life in Rheumatology Group (1997) The
AIMS2-SF: a short form of the Arthritis Impact Measurement
Scales 2. Arthritis Rheum 40:1267–74
29. Ren XS, Kazis L, Meenan RF (1999) Short-Form Arthritis Impact
Measurement Scales 2: tests of reliability and validity among
patients with osteoarthritis. Arthritis Care Res 12:163–71
