
Start from book. Norm and criterion intro.

Characteristics: Norm-Referenced Tests (NRTs) vs. Criterion-Referenced Tests (CRTs)

Definition
  Norm-referenced: measure the performance of one group of test takers against another group of test takers.
  Criterion-referenced: measure the performance of test takers against the criteria covered in the curriculum.

Type of interpretation
  Norm-referenced: NRTs are designed to compare the performances of students to one another in relative terms.
  Criterion-referenced: CRTs are built to identify the amount or percent of the material each examinee knows or can do in absolute terms.

Type of measurement
  Norm-referenced: NRTs tend to measure very general language abilities (for proficiency or placement purposes).
  Criterion-referenced: CRTs usually focus on specific, well-defined (and usually objectives-based) language knowledge or skills (for diagnostic or achievement purposes).

Purpose
  Norm-referenced: NRTs are primarily designed to spread examinees out on a continuum of general abilities so examinees’ performances can be compared to each other.
  Criterion-referenced: CRTs are designed to assess the amount of material known, or learned, by each student.

Content
  Norm-referenced: measure broad skill areas taken from a variety of textbooks and syllabi.
  Criterion-referenced: measure the skills the test taker has acquired on finishing a curriculum.

Test structure
  Norm-referenced: Each skill is usually tested by fewer than four items. Items vary in difficulty. Items are selected that discriminate between high and low achievers.
  Criterion-referenced: Each skill is tested by at least four items in order to obtain an adequate sample of student performance and to minimize the effect of guessing. The items which test any given skill are parallel in difficulty.

Administration
  Norm-referenced: must be administered in a standardized format.
  Criterion-referenced: need not be administered in a standardized format.

Score reporting
  Norm-referenced: scores are reported in a percentile rank.
  Criterion-referenced: scores are reported in categories or percentage.

Score interpretation
  Norm-referenced: if a test taker ranks 95%, it implies that he/she has performed better than 95% of the other test takers.
  Criterion-referenced: the score determines how much of the curriculum is understood by the test taker.

Knowledge of questions
  Norm-referenced: students have no idea about the content in test items.
  Criterion-referenced: students know exactly what content to expect in test items.

Information obtained
  Norm-referenced: often use a multiple-choice format with little to no writing.
  Criterion-referenced: may include multiple-choice questions, true-false questions, and open-ended questions.

Examples
  Norm-referenced: Examples of NRTs include SAT, GRE tests. IQ tests are among the most well-known norm-referenced tests.
  Criterion-referenced: Examples of CRTs include driving tests, high-stake tests and low-stake tests.
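
The difference between the two score interpretations can be made concrete with a small calculation. The sketch below is only illustrative: the function names, the ten-person norm group, and the 40-item test length are assumptions, not data from the source. It defines percentile rank as the percent of the comparison group scoring strictly below the examinee, which is one common convention among several.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced view: percent of the comparison group scoring
    strictly below `score` (one common convention, assumed here)."""
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def percent_mastery(items_correct, items_total):
    """Criterion-referenced view: percent of the tested material answered
    correctly, independent of how anyone else performed."""
    return 100.0 * items_correct / items_total


# Hypothetical raw scores of a norm group on a 40-item test.
norm_group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]
raw_score = 31

print(percentile_rank(raw_score, norm_group))  # 80.0 (8 of 10 scored lower)
print(percent_mastery(raw_score, 40))          # 77.5
```

The same raw score yields two different statements: the first says where the examinee stands relative to others, the second says how much of the material was answered correctly.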

UTILIZATION IN THE TEACHING-LEARNING PROCESS

Norm-referenced and criterion-referenced tests serve a variety of purposes due
to the array of educational situations that exist in today’s schools. Testing can
rank students against each other or against some other sociocultural norm, or it
can be based on performance criteria that focus on assessing certain
understandings or skill sets. Ideally, a combination of both testing types exists in a
way that is valid, reliable, and fair. However, given that many classrooms contain
students with different socioeconomic and cultural backgrounds, testing becomes
quite a challenge. Therefore, in order to ensure that all students receive the most
appropriate feedback, a variety of testing techniques is needed so that
decisions and actions can be made that best suit the learner.

Virtually all students have taken some kind of standardized test by the time they
enter high school or college. Moreover, many standardized tests (i.e., high-stakes
tests) are used as a condition of graduation, acceptance, or financial aid. Because
these tests are used as a way to rank or compare students, they are often
referred to as norm-referenced tests (NRTs) (Kubiszyn and Borich, 2007). NRTs are
commonly used when stakeholders are interested in the central tendency of the
results of a group of students, as when descriptive statistics are used to find the
mean, median, and mode of a particular data set. When tests are used to
diagnose or to gauge the aptitude of a student, inferences are made based on
how students compare with each other or with some other sample based on a social
norm. Since results are “objective” (test items are usually scored as right or
wrong) and since many tests can be administered at once, NRTs are typically
more appropriate for making non-instructional decisions.
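
As a rough illustration of the descriptive statistics mentioned above, the following sketch computes the mean, median, and mode of a set of scores with Python's standard `statistics` module. The score list is invented for the example and does not come from the source.

```python
import statistics

# Hypothetical scores for one group of students on the same test.
scores = [65, 70, 70, 80, 85, 90, 100]

mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value of the sorted scores
mode_score = statistics.mode(scores)      # most frequent score

print(mean_score, median_score, mode_score)  # 80 80 70

# A norm-referenced reading places each student relative to the group,
# for example by the distance of each score from the group mean.
deviations = [s - mean_score for s in scores]
print(deviations)  # [-15, -10, -10, 0, 5, 10, 20]
```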

In addition to NRTs being used externally to rank students (e.g., SAT, ACT, etc.),
teachers oftentimes use NRTs to test students in the classroom. Multiple-choice,
true-false, matching, and essay questions are common testing types that fall
under this same category. Test results are gathered, averaged, and ranked so
that teachers can make their best inference about how well a student has
understood the material, acquired the necessary skill set, or developed the
intended disposition, based on the goals and objectives of the classroom.
Subsequently, instructional decisions are often made based on these results, either
by reviewing past information that students continue to struggle with or by
continuing on with new information that makes up part of the curriculum. Having
framed the NRT first as an external instrument, such as the ACT, then as an internal
instrument used by teachers in their classrooms, one can see a noticeable
difference in why it is used in each circumstance. The former is to make decisions
regarding achievement while the latter is to make decisions regarding instruction. This
distinction is important when talking about a second type of test that is based on
criteria.

Instead of ranking students against a norm, another testing method judges
student performance in terms of meeting certain criteria. Kubiszyn and
Borich (2007) define a criterion-referenced test (CRT) as one that “tells us about a
student’s level of proficiency in or mastery of some skill or set of skills” (p. 66).
Wiggins and McTighe (2005) also put forth the notion of promoting the six facets
of understanding (i.e., explain, interpret, apply, perspective,
empathy, and self-knowledge) when testing students regarding what they know
and the dispositions they possess. In other words, CRTs can provide teachers with
greater insight for adjusting instructional decisions when student
performances are assessed in terms of performance criteria. Rubrics are often
used in order to qualitatively assess performances and products. Arter and
McTighe (2001) distinguish between a holistic and analytical trait rubric
when they state “A holistic rubric gives a single score or rating for an entire
product or performance based on an overall impression of a student’s work” and
“an analytical trait rubric divides a product or performance into essential traits or
dimensions so that they can be judged separately-one analyzes a product or
performance for essential traits” (p. 18).
Communicating these “essential traits” with students provides the basis for what
constitutes a “good” and “bad” performance or product, and is essential in setting
the expectations between teacher and student. Indeed, CRTs are specifically
suited for assessing understandings, knowledge, skills, and dispositions in terms
of subsequent inferences towards instructional decision-making adjustments and
adjustments to student learning tactics.
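
The contrast between the two rubric types quoted above can be sketched in a few lines of code. Everything in this example is hypothetical: the trait names, the 1 to 4 scale, and the `analytic_summary` helper are assumptions chosen for illustration and are not taken from Arter and McTighe.

```python
def analytic_summary(trait_scores, max_per_trait=4):
    """Summarize an analytical trait rubric: each trait is judged separately,
    so the weakest trait can be reported alongside the total."""
    total = sum(trait_scores.values())
    possible = max_per_trait * len(trait_scores)
    weakest = min(trait_scores, key=trait_scores.get)
    return {"total": total, "possible": possible, "weakest_trait": weakest}


# Holistic rubric: a single overall rating for the whole essay (1 to 4 scale assumed).
holistic_rating = 3

# Analytical trait rubric: the same essay scored trait by trait (traits are hypothetical).
essay_traits = {"organization": 3, "evidence": 2, "mechanics": 4}

print(holistic_rating)                 # 3
print(analytic_summary(essay_traits))
# {'total': 9, 'possible': 12, 'weakest_trait': 'evidence'}
```

The holistic rating gives one number, while the analytic summary also shows which trait most needs attention, which is the kind of information that can feed instructional adjustments.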

Regardless of the test being administered, reliability, validity, and
“absence-of-bias” (Popham, 2008, p. 73) drive the level of predictability an
instrument has in
making proper inferences on a student’s achievement. Reliability in NRTs is of
high concern since many versions of the ACT, for example, are expected to
contain test items that measure the same content. Similarly, the same ACT should
yield similar results (i.e., a high correlation coefficient) if students retake the exam
without being exposed to a learning intervention in the interim. The validity of a
test pertains to the three Cs: “content, criterion, and construct” (Popham, 2008,
p. 53). Content validity addresses how test items represent concepts that are
covered in the curriculum. Criterion validity in NRTs deals with how accurate the
testing items are in predicting future behavior (e.g., ACT and SAT scores and
subsequent academic success or failure). Criterion validity in CRTs deals with
rubric traits and how valid they are in terms of a student’s future performance.
The final C, construct validity, has to do with how a student’s performance over
time is gauged in terms of meeting criteria that are aligned to the curriculum.
And finally, absence-of-bias centers on how test items present information that is
fair; that is, it does not lean toward a certain group of people based on
socioeconomic status, race, ethnic background, gender, or sexual orientation.
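
The test-retest idea described in this paragraph, similar results on a retake as shown by a high correlation coefficient, can be sketched as a Pearson correlation between two sittings. The scores below are invented for illustration, and Pearson's r is only one possible reliability estimate; the source does not specify which coefficient is meant.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical scores of the same five students on two sittings of one exam,
# with no learning intervention in between.
first_sitting = [18, 21, 24, 27, 30]
second_sitting = [19, 20, 25, 28, 29]

r = pearson_r(first_sitting, second_sitting)
print(round(r, 2))  # a value close to 1 suggests consistent (reliable) results
```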

NRTs and CRTs should not be considered dichotomous; they are two different
approaches to assessing students that complement each other. Ranking and
comparing students has a purpose when the goal is to measure achievement and
to predict future academic success. Conversely, testing understandings,
knowledge, skills, and disposition through performance and product criteria
serves a vital role in making inferences that influence instructional decisions and
student tactic adjustments. In order for tests to be valid, reliable, and free of
bias, test designers should conduct a variety of reviews to ensure that tests
measure curricular aims, are reliable within the same and different versions of an
exam, and do not discriminate against groups based on age, race, gender,
socioeconomic status, or sexual orientation. Tests are the link between the
written and taught curriculum, the ideal and the reality of what schools are for all
their stakeholders. Thus, in order to continue developing and improving the
feedback that tests provide to all of their stakeholders, a collaborative effort is needed
in bringing together a community of practice that addresses these important
aspects of testing and assessment.
