
Rasch Analysis

Rasch analysis can be applied to assessments in a wide range of disciplines, including health studies, education, psychology,
marketing, economics and social sciences.
Many assessments in these disciplines involve a well defined group of people responding to a set of items for assessment.
Generally, the responses to the items are scored 0, 1 (for two ordered categories); or 0, 1, 2 (for three ordered categories); or 0,
1, 2, 3 (for four ordered categories) and so on, to indicate increasing levels of a response on some variable such as health
status or academic achievement. These responses are then added across items to give each person a total score. This total
score summarises the responses to all the items, and a person with a higher total score than another is deemed to show
more of the variable assessed. Summing the scores of the items to give a single score for a person implies that the items are
intended to measure a single variable, often referred to as a unidimensional variable.
The Rasch model is the only item response theory (IRT) model in which the total score across items characterizes a person
totally. It is also the simplest of such models having the minimum of parameters for the person (just one), and just one parameter
corresponding to each category of an item. This item parameter is generically referred to as a threshold. There is just one in the
case of a dichotomous item, two in the case of three ordered categories, and so on.
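As a minimal sketch of the ideas above, the following Python fragment shows the usual logistic form of the dichotomous Rasch model together with the total score. The function names, the person location and the threshold values are illustrative assumptions, not taken from any particular program.

```python
import math

def rasch_probability(person_location, item_threshold):
    """Probability of a response scored 1 on a dichotomous item.

    One parameter for the person, one threshold for the item.
    """
    return 1.0 / (1.0 + math.exp(-(person_location - item_threshold)))

def total_score(responses):
    """Sum of the item scores; under the Rasch model this is the
    statistic that characterizes the person."""
    return sum(responses)

# Hypothetical thresholds for three dichotomous items (one threshold each).
item_thresholds = [-1.0, 0.0, 1.5]
responses = [1, 1, 0]

print("total score:", total_score(responses))
for threshold in item_thresholds:
    print(threshold, round(rasch_probability(0.5, threshold), 3))
```

When the person location equals the item threshold, the probability of a score of 1 is exactly 0.5; it rises as the person location moves above the threshold.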

1. What is Rasch Analysis?


The Rasch model, where the total score summarizes completely a person's standing on a variable,
arises from a more fundamental requirement: that the comparison of two people is independent of which items may be used
within the set of items assessing the same variable. Thus the Rasch model is taken as a criterion for the structure of the
responses, rather than a mere statistical description of the responses. For example, the comparison of the performance of two
students' work marked by different graders should be independent of the graders.
In this case it is considered that the researcher is deliberately developing items that are valid for the purpose and that meet the
Rasch requirements of invariance of comparisons.
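The invariance requirement can be illustrated numerically. In the sketch below (dichotomous case, with hypothetical person locations and item thresholds), the difference in log-odds of success between two people comes out the same whichever item is used for the comparison.

```python
import math

def log_odds_of_success(person_location, item_threshold):
    """Log-odds of a score of 1 under the dichotomous Rasch model."""
    p = 1.0 / (1.0 + math.exp(-(person_location - item_threshold)))
    return math.log(p / (1.0 - p))

def compare_persons(location_a, location_b, item_threshold):
    """Difference in log-odds between two people on the same item."""
    return (log_odds_of_success(location_a, item_threshold)
            - log_odds_of_success(location_b, item_threshold))

# Hypothetical values: two people, four items of different difficulty.
location_a, location_b = 1.2, -0.3
for threshold in (-2.0, 0.0, 1.0, 2.5):
    print(threshold, round(compare_persons(location_a, location_b, threshold), 6))
```

Every line prints the same comparison, the difference between the two person locations, because the item threshold cancels out of the comparison.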
Analyzing data according to the Rasch model, that is, conducting a Rasch analysis, gives a range of details for checking
whether or not adding the scores is justified in the data. This is called the test of fit between the data and the model. If the
invariance of responses across different groups of people does not hold, then taking the total score to characterize a person is
not justified. Of course, data never fit the model perfectly, and it is important to consider the fit of data to the model with respect
to the uses to be made of the total scores. If the data do fit the model adequately for the purpose, then the Rasch analysis also
linearises the total score, which is bounded by 0 and the maximum score on the items, into measurements. The linearised value
is the location of the person on the unidimensional continuum - the value is called a parameter in the model and there can be
only one number in a unidimensional framework. This parameter can then be used in analysis of variance and regression more
readily than the raw total score which has floor and ceiling effects.
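The linearisation step can be sketched as follows, assuming dichotomous items whose thresholds are already known. The person location is found as the value at which the expected total score equals the observed total score, using Newton-Raphson iteration. This is an illustration under those assumptions, not a description of how any particular program estimates its parameters; the names and numbers are hypothetical.

```python
import math

def expected_score(location, thresholds):
    """Expected total score at a given person location."""
    return sum(1.0 / (1.0 + math.exp(-(location - d))) for d in thresholds)

def location_from_total(total, thresholds, tol=1e-8, max_iter=100):
    """Maximum likelihood person location for a raw total score."""
    n = len(thresholds)
    if not 0 < total < n:
        # Scores of 0 and the maximum have no finite estimate.
        raise ValueError("extreme scores have no finite location estimate")
    location = math.log(total / (n - total))  # starting value
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(location - d))) for d in thresholds]
        gradient = total - sum(probs)                  # observed minus expected
        information = sum(p * (1.0 - p) for p in probs)
        step = gradient / information
        location += step
        if abs(step) < tol:
            break
    return location

# Hypothetical thresholds for four dichotomous items.
thresholds = [-1.5, -0.5, 0.5, 1.5]
for total in (1, 2, 3):
    print(total, round(location_from_total(total, thresholds), 3))
```

The raw score is bounded by 0 and the number of items, whereas the location is unbounded; the extreme scores are rejected here because they correspond to no finite location.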

2. Why undertake a Rasch analysis?

A researcher who is developing items of a test or questionnaire intending to sum the scores on the items can use a Rasch model
analysis to check the degree to which this scoring and summing is defensible in the data collected. For example, if two groups
are to be compared on the variable of interest (e.g. males and females), it is important to demonstrate that the items work in the
same way in the two groups. Items working in the same way permit the total score to be interpreted as meaning the same in the
two groups.

In checking how well the data fit the model, it is important to be able to diagnose very quickly where the misfit is the worst, and
then proceed to try to understand this misfit in terms of the construction of the items and the understanding of the variable in
terms of its theoretical development.

A very important part of the Rasch analysis from this perspective is to be in dynamic and interactive control of an analysis and to
be able to follow the evidence to see where the responses may be invalid.

3. The research paradigm and the Rasch model

There is a philosophical or paradigm difference between the application of the Rasch model and other IRT models, for example
the two-parameter and three-parameter models, which are designed for responses scored just 0, 1. In the paradigm of other IRT
models, the emphasis is on finding a model that best characterizes the given data; in the Rasch paradigm, the emphasis is on
identifying and studying anomalies in the data disclosed by the Rasch model. Thus, in the paradigm of applying other than
Rasch models, if the Rasch model does not work then a more complicated model, relative to the simpler Rasch model, that
might explain the data better is sought. In the case of dichotomously scored data, this might be the two parameter model which
has a second parameter for each item.
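To make the contrast concrete, here is a small sketch of the two-parameter logistic model in its usual form, in which each item has a discrimination parameter as well as a difficulty. The parameter names and values are illustrative assumptions. With the discrimination fixed at 1 the expression reduces to the dichotomous Rasch model; with other values, the comparison of two people depends on which item is used.

```python
import math

def two_parameter_probability(location, difficulty, discrimination=1.0):
    """Two-parameter logistic model; discrimination=1.0 gives the Rasch model."""
    return 1.0 / (1.0 + math.exp(-discrimination * (location - difficulty)))

def log_odds_difference(location_a, location_b, difficulty, discrimination):
    """Comparison of two people on one item, on the log-odds scale."""
    def log_odds(location):
        p = two_parameter_probability(location, difficulty, discrimination)
        return math.log(p / (1.0 - p))
    return log_odds(location_a) - log_odds(location_b)

# Hypothetical people and items: same difficulty, different discriminations.
for discrimination in (0.5, 1.0, 2.0):
    print(discrimination,
          round(log_odds_difference(1.2, -0.3, 0.0, discrimination), 6))
```

The comparison scales with the item's discrimination, so it is no longer independent of the item; this is the invariance that the Rasch model retains and the more complicated model gives up.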

4. Is there more than one Rasch model?

There is only one Rasch model for unidimensional responses at the level of one person responding to one item.

However, there are different specifications when more than two ordered response categories are present. In one specification,
all items might be hypothesized to have the same parameters across all items, as for example in the case that all items have the
same response structure (e.g. SD, D, A, SA). In a second specification, different parameters across items may be needed when
items do not have the same response categories, as in achievement testing when different items may have a different number of
ordered categories and most certainly a different description of the categories.

5. Different Rasch Model Specifications

For the case where the response categories are the same across items (e.g. SD, D, A, SA), the Rasch model has been called
"the rating scale model"; the case where the response categories are different across items has been called the "partial credit
model". It is stressed, however, that the structure and response process for a person responding to an item is identical in the two
specifications. Rather than emphasizing two models for the above different specifications, it can be more efficient to refer to one
Rasch Unidimensional Measurement Model (RUMM) with different numbers of categories and different parameterizations (as in
RUMM2030). Thus it might be better to refer to the former as a rating scale parameterization; the latter as a partial credit
parameterization.
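The two parameterizations can be sketched as two ways of building the thresholds for each item. In this illustration, which uses hypothetical item names and values and is not the data format of any program, the rating scale parameterization shifts one shared set of thresholds by each item's location, while the partial credit parameterization lets each item carry its own thresholds and its own number of categories.

```python
def rating_scale_parameterization(item_locations, shared_thresholds):
    """Same threshold structure for every item, shifted by the item location."""
    return {name: [location + t for t in shared_thresholds]
            for name, location in item_locations.items()}

def partial_credit_parameterization(item_thresholds):
    """Each item has its own thresholds, possibly a different number of them."""
    return {name: list(thresholds) for name, thresholds in item_thresholds.items()}

def number_of_categories(thresholds):
    """An item with m thresholds has m + 1 ordered categories."""
    return len(thresholds) + 1

# Rating scale: two items with the same four categories (SD, D, A, SA).
rating_scale = rating_scale_parameterization(
    {"q1": -0.5, "q2": 1.0}, shared_thresholds=[-1.0, 0.0, 1.0])

# Partial credit: two achievement items with different numbers of categories.
partial_credit = partial_credit_parameterization(
    {"task1": [-1.2, 0.3], "task2": [-0.4, 0.1, 1.9]})

for name, thresholds in {**rating_scale, **partial_credit}.items():
    print(name, thresholds, number_of_categories(thresholds))
```

Once the thresholds of an item are given, the response model applied to that item is the same in both cases, which is the point made in the text.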

6. Thresholds and Steps

One particular difference that has arisen in different Rasch analysis reporting is the use of "step", when the parameters are
different across items, and "thresholds", when they are the same across items. The term "step" can give the impression that the
response process characterized by the Rasch model is a sequential process. However, the Rasch model is NOT a sequential
processing model but a static model, which just specifies the probability of a person with a given location responding, or being
classified, in one of the categories of an item.

For example, the term "step" is not used in the dichotomous case because it would imply, implausibly, that a person goes from
being wrong to being right, or goes from disagreeing to agreeing. Instead the person is either wrong or right, or either disagrees
or agrees; there is no sequential processing here. The response process is a classification into ordered categories defined by
thresholds which can be seen as analogous to markings on a ruler except that the thresholds do not have to be equidistant as
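they are on a ruler; they are estimated from the data. A threshold is the point on the continuum at which a response in either of
the two adjacent categories is equally likely, that is, each has a probability of 50% given that the response is in one of those two
categories.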
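The threshold property can be checked numerically. The sketch below uses the usual form of the Rasch model for ordered categories with hypothetical, unequally spaced thresholds: at each threshold, the probability of the higher of the two adjacent categories, given that the response is in one of them, is 0.5.

```python
import math

def category_probabilities(location, thresholds):
    """Rasch probabilities for the ordered categories 0..len(thresholds)."""
    exponents = [0.0]
    for threshold in thresholds:
        exponents.append(exponents[-1] + (location - threshold))
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def conditional_probability_of_higher(location, thresholds, k):
    """P(category k | response is in category k-1 or k), for threshold k (1-based)."""
    probs = category_probabilities(location, thresholds)
    return probs[k] / (probs[k - 1] + probs[k])

# Hypothetical thresholds, not equidistant.
thresholds = [-1.0, 0.5, 2.0]
for k, threshold in enumerate(thresholds, start=1):
    print(k, threshold,
          round(conditional_probability_of_higher(threshold, thresholds, k), 3))
```

The function returns a single static classification probability for each category; nothing in it describes a person moving through the categories in sequence.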

7. Disordered Thresholds as an Anomaly

As in the case of a ruler, thresholds marking off successive categories need to be ordered to be interpretable. However, in
estimating the thresholds from the data, it is possible to discover that the estimates are not properly ordered. This is a sign that
the categories are not working as intended; it discloses an anomaly in the data that needs to be understood and corrected.

Prior to the work of Rasch, Thurstone had constructed a model for ordered categories which also involved thresholds. These
may be derived from the Rasch thresholds. The problem with the Thurstone thresholds is that they are always ordered as a
property of the model no matter what the features of the data - they have no use in disclosing whether categories are working in
the ordering intended. Thurstone thresholds cannot disclose any anomalies in the ordering; indeed they will hide them.
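A small sketch can show why the Thurstone thresholds hide disordering. Here a Thurstone threshold is taken to be the location at which the probability of responding in category k or above is 0.5, found by bisection. The threshold values are hypothetical and deliberately disordered.

```python
import math

def category_probabilities(location, thresholds):
    """Rasch probabilities for the ordered categories 0..len(thresholds)."""
    exponents = [0.0]
    for threshold in thresholds:
        exponents.append(exponents[-1] + (location - threshold))
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def thresholds_are_ordered(thresholds):
    """True when each threshold is higher than the one before it."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))

def thurstone_threshold(thresholds, k, low=-20.0, high=20.0):
    """Location at which P(response in category k or above) = 0.5."""
    for _ in range(200):
        mid = (low + high) / 2.0
        if sum(category_probabilities(mid, thresholds)[k:]) < 0.5:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

# Hypothetical, deliberately disordered Rasch threshold estimates.
rasch_thresholds = [0.5, -0.5]
thurstone = [thurstone_threshold(rasch_thresholds, k) for k in (1, 2)]

print("Rasch thresholds ordered:", thresholds_are_ordered(rasch_thresholds))
print("Thurstone thresholds:", [round(t, 3) for t in thurstone])
print("Thurstone thresholds ordered:", thresholds_are_ordered(thurstone))
```

The probability of category k or above can never be smaller than the probability of category k+1 or above, so the 50% points come out in order whatever the data, while the Rasch threshold estimates are free to reveal the disordering.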

8. Who should use a Rasch analysis?


As implied from the above summary, a Rasch analysis should be undertaken by any researcher who wishes to use the total
score on a test or questionnaire to summarize each person. There is an important contrast here between the Rasch model and
Traditional or Classical Test Theory, which also uses the total score to characterize each person. In Traditional Test Theory the
total score is simply asserted as the relevant statistic; in the Rasch model, it follows mathematically from the requirement of
invariance of comparisons among persons and items.

A Rasch analysis provides evidence of anomalies with respect to

the operation of any particular item which may over or under discriminate relative to the summary discrimination of all
items.

two or more groups in which any item might show differential item functioning (DIF).

the statistical independence of the items.

If the anomalies do not threaten the validity of the Rasch model or the measurement of the construct, then

people can be located on the same linear scale as the items.

the locations of the thresholds of the items on the continuum permit a better understanding of the variable at different
parts of the scale.


The aim of a Rasch analysis is analogous to helping construct a ruler, but with the data of a test or questionnaire.

9. An ideal approach to a Rasch analysis

A Rasch analysis is consistent with the Rasch paradigm when the researcher is in control when accumulating evidence of the
validity of the responses. No one single statistic is generally enough to decide whether a set of data fit the model for the
purpose. Each analysis is a case study in determining the diagnostic evidence for the internal consistency and validity of the
data. Often, there is no simple "yes" or "no" answer. It is important to use both statistical and graphical evidence simultaneously
and interactively, and not mechanistically and sequentially, in making different decisions, such as whether to discard or modify
an item. The researcher must use professional judgment by considering all the evidence, statistical, graphical and conceptual, in
making decisions on evidence produced by a Rasch analysis.

10. Recommended Rasch Software


Cloud Rasch Analysis Engine & APIs

RummLab and Timewatch have developed a cloud-based Rasch analysis engine and APIs which allow anyone to take full
advantage of the power of the Rasch engine, without the need to understand or implement the complex mathematics involved.

The system allows for data and test instrument setup data (created via Rumm2030 or any other means) to be transmitted
securely via SSL to a cloud server, with resultant data securely returned. The cloud servers do store customer setup data, test or
result data, and we recommend that data passed to the servers be anonymous or use random reference IDs so that only the
entity calling the service can interpret the meaning of the data and the result.
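The recommendation about random reference IDs can be sketched as below. This fragment only prepares data locally; it makes no calls to the cloud service, and the record layout and field names are hypothetical, not part of any documented API.

```python
import secrets

def pseudonymise(records, id_field="person_id"):
    """Replace identifying IDs with random reference IDs.

    Returns the anonymous records and the lookup key. The key stays with
    the caller, so only the caller can interpret the returned results.
    """
    key = {}
    anonymous = []
    for record in records:
        original = record[id_field]
        if original not in key:
            key[original] = secrets.token_hex(8)  # random reference ID
        anonymous.append({**record, id_field: key[original]})
    return anonymous, key

# Hypothetical response records for three test sittings by two people.
records = [
    {"person_id": "alice@example.org", "responses": [1, 0, 1]},
    {"person_id": "bob@example.org", "responses": [0, 0, 1]},
    {"person_id": "alice@example.org", "responses": [1, 1, 1]},
]
anonymous_records, key = pseudonymise(records)

# After results come back, the caller maps reference IDs to people again.
reverse_key = {reference: original for original, reference in key.items()}
for record in anonymous_records:
    print(record["person_id"], "is", reverse_key[record["person_id"]])
```

The same person always receives the same reference ID within one run, so repeated records can still be linked by the caller without the identity leaving the caller's system.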

Rasch Analysis APIs

The Rasch Cloud Server provides a test page to which customers can upload their test instrument data (items, item labels,
thresholds etc.). The page is a demonstration: it shows how, from instrument data alone, a fully responsive web application that
supports PCs, Macs, phones and tablets in portrait or landscape mode can support real-time testing, and it shows the speed
and versatility of the Rasch Analysis engine as resultant data is displayed on the test page. The intention is that developers who
wish to create their own test or questionnaire systems can do so while securely using our Rasch Analysis APIs to analyze
their data and provide resultant test scores.

Contact Us

If you wish to learn more about Rasch Analysis APIs or how to integrate our APIs with your software, please contact either Barry
Sheridan at RummLab or Graeme Wright at Timewatch Inc.

RUMM Software

The RUMM2030 program has been developed with the Rasch paradigm and with the case-study approach to a Rasch analysis.
It is used interactively to move between many different graphic displays and corresponding statistics in making decisions.
Thus RUMM2030 places the researcher in dynamic and interactive control in being able to look at complementary evidence in
making a decision.

The RUMM2030 program has been written to permit the researcher to study relevant graphs and statistical tables. The easy
interactive moving between statistics and graphs also makes RUMM2030 an ideal tool for teaching and learning about the
Rasch model.

For example, RUMM2030 makes it possible, using a simple selection option, to analyze items with more than two categories by
hypothesizing either the same thresholds across items (rating scale parameterization) or different thresholds across items (partial
credit parameterization).

The RUMM2030 program has the following features characteristic of a full Windows application:

The use of standard tools and objects familiar to any Windows operator.

A simple visual structure for data input and item specifications.

Easy creation and importing of templates for involved and elaborate analyses.

User friendly and efficient displays for analyzing data according to the Rasch paradigm.

Extensive use of elaborate graphics to both complement and extend the meaning of the different parameter estimates
and fit statistics.

A range of advanced extensions involving item rescoring, DIF, subtests (for assessing local independence in items),
anchoring strategies, distractor analysis (for multiple choice items), test equating (providing equivalent test scores
between tests of different sub sets of items), assessment of dimensionality, assessment of local response dependence,
conditional test-of-fit (for a pair of polytomous items or a pair of tests), tailored test analysis (for testing the significance
of guessing), facet analysis (for up to 3-way item response structure), principal component analyses of residuals, and
tests specifically regarding the profiles of individuals.

The DIF diagnostic is an advanced and comprehensive routine for not only identifying DIF but also for accounting for
the DIF. Instead of resorting to deleting items which show between group DIF, RUMM2030 provides a simple facility to
split items routinely to account for the DIF.

Comprehensive tools for editing specification labels and graphic-based labels; displaying and recovering to files the
original response data file and test specifications; saving to file the full range of data tables and graphics.

Facility to copy and paste easily all, or part, of the data table displays to spreadsheets or statistics packages (e.g. Excel or
SPSS) for additional analyses.

RUMM2030 comes complete with

Manuals: to assist both the beginner and experienced researcher.

Interpretation Guides: up to seven Interpreting the RUMM2030 Analysis monographs, providing a detailed guide on
how to employ the RUMM paradigm in the conduct of any Rasch analysis.

One testimonial describes the Interpreting RUMM2030 publication as "the best documentation for any Rasch program by a
long shot. Excellent writing and well explained".

RUMM2030 is suggested for

a researcher wanting to conduct an item analysis leading to the construction of a meaningful variable suitable for
providing consistent measurement, or

a teacher seeking a user-friendly computer application for instructing students in the good conduct of item analyses.

In either case the RUMM2030 program should be a great help. The ease of use means that RUMM2030 is a great time
saver, as it does not involve any stop-start requirements associated with the preparation of batch files before moving on
to the next procedure or display.
However, once specifications are read in, RUMM2030 does save batch files (called templates), and with large data sets, batch
files/templates which specify the format of the data and the scoring of the responses can be constructed readily and imported
interactively. It must be emphasized that the RUMM2030 batch files/templates are aids, or tools, for creating a new analysis
setup (such as anchoring). Once a setup is created and implemented, the subsequent Rasch analysis within RUMM2030
proceeds as normal under the RUMM paradigm; that is, no batch files are required in the conduct and diagnostic interpretation
of the analysis.

11. The RUMM approach to Rasch analyses

RUMM2030 has become a research tool in many disciplines world-wide, such as education, health studies,
psychology, marketing and business studies and the social sciences. Its ease of use, which saves significant
time and labour, makes the RUMM2030 program a comprehensive and powerful research tool that takes full
advantage of the Rasch paradigm for constructing quality measuring instruments. Universities and colleges
world-wide also use RUMM2030 to assist lecturers and research assistants in conducting small scale research
projects and to instruct students in item analysis.
