— Stanley Stevens, On the Theory of Scales of Measurement

This course focuses on cognitive and affective test scores as operationalizations of
constructs in education and psychology. As noted above, these test scores often
produce ordinal scales with some amount of meaning in their intervals. The particular
rules for assigning values within these scales depend on the type of scoring
mechanism used. Here, we’ll cover the two most common scoring mechanisms,
dichotomous and polytomous, and we’ll discuss how these are used to create rating
scales and composite scores.
Dichotomous scoring
Dichotomous scoring refers to the assignment of one of two possible values based on a
person’s performance or response to a test question. A simple example is the use of
correct and incorrect to score a cognitive item response. These values are mutually
exclusive, and describe the correctness of a response in the simplest terms possible, as
completely incorrect or completely correct. Most cognitive tests involve at least some
dichotomously scored items. Multiple-choice questions, which will be discussed further
in Chapter 3, are usually scored dichotomously.
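As a minimal sketch of dichotomous scoring, the Python snippet below compares one examinee's multiple-choice responses to an answer key and assigns 1 for a correct response and 0 for an incorrect one. The function name, the answer key, and the responses are hypothetical and serve only to illustrate the idea.

```python
def score_dichotomous(responses, key):
    """Score each response 1 if it matches the keyed answer, else 0."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return [1 if r == k else 0 for r, k in zip(responses, key)]


# Hypothetical answer key and one examinee's responses to five items.
answer_key = ["B", "D", "A", "C", "B"]
examinee = ["B", "A", "A", "C", "D"]

item_scores = score_dichotomous(examinee, answer_key)
print(item_scores)  # [1, 0, 1, 1, 0]
```

Each item score takes one of only two mutually exclusive values, which is what makes the scoring dichotomous.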
Dichotomous scoring can also involve score values other than correct and
incorrect. The most common example is scoring that represents a response of either
yes or no. Affective measures, such as attitude surveys and behavior checklists, often
use this type of dichotomous scoring. Depression inventories, for example, may present
individuals with lists of statements that people with depression typically identify strongly
with. Individuals then respond to each statement by indicating whether or not the
statement is characteristic of them.
Other dichotomous scores that do not indicate the presence or absence of a
construct are sometimes used, but these are not discussed here.
Polytomous scoring
Polytomous scoring simply refers to the assignment of three or more possible values for
a given test question or item. In cognitive testing, a simple example is the use of rating
scales to score written responses such as essays. In this case, score values may still
describe the correctness of a response, but with differing levels of correctness, for
example, incorrect, partially correct, and fully correct.
Polytomous scoring with cognitive tests can be less straightforward and less
objective than dichotomous scoring, primarily because it usually requires human
raters, and it is difficult for raters to apply categories such as partially correct
with a consistent meaning. The issue of interrater reliability will be discussed in
Chapter 6.
Polytomous scoring with affective or non-cognitive measures most often occurs with
the use of rating scales. For example, individuals may use a rating scale to describe
how much they identify with a statement, or how well a statement represents them,
rather than simply saying yes or no. Such rating scales measure multiple levels of
agreement (e.g., from disagree to agree) or preference (e.g., from dislike to like). In this
case, because individuals provide their own responses, subjectivity in scoring is not an
issue as it is with polytomous scoring in cognitive tests. Instead, the challenge with
rating scales becomes ensuring that individuals interpret the rating categories in the
same way. For example, strongly disagree could mean different things to different
people, which limits how well the resulting scores can be compared across individuals.
Except in essay scoring and with some affective measures, individual questions,
whether dichotomously or polytomously scored, are rarely used to measure a construct.
Instead, scores from multiple items are combined to create composite scores or rating
scale scores.
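One common way to combine item scores is to sum them. The sketch below assumes a five-category agreement scale coded 1 through 5 and computes a composite score for one person by summing the coded responses. The category labels, the coding, and the use of a simple sum are illustrative assumptions; other ways of combining items, such as averaging, are also possible.

```python
# Assumed agreement categories, ordered from lowest to highest.
AGREEMENT = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]

# Code the categories 1 through 5 in order.
CODES = {label: value for value, label in enumerate(AGREEMENT, start=1)}


def composite_score(responses, codes=CODES):
    """Sum the numeric codes for one person's responses across items."""
    return sum(codes[r] for r in responses)


# Hypothetical responses from one person to four rating scale items.
person = ["agree", "strongly agree", "neutral", "agree"]
print(composite_score(person))  # 4 + 5 + 3 + 4 = 16
```

The same approach applies to dichotomously scored items, where summing the 0 and 1 item scores gives the number of correct or endorsed items.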