https://jalt.org/pansig/2004/HTML/Nakamura.htm

10/10/2015
8 am to 11 am

Features of the holistic rating scale used in this study.

the main idea was indicated, but not clearly

the essay was not so well organized

the vocabulary choice was fair

some major grammatical errors

Features of the analytic rating scale used in this study.


Raters: Three trained native speakers of English.
Items: Five criteria were rated: (1) Originality of Content, (2) Organization, (3) Vocabulary, (4) Grammar, (5) Cohesion & Logical Consistency
Rating scale: A four-point scale (1, 2, 3, 4) was used with the following criteria:
Originality of Content
4 points: interesting ideas were stated clearly
3 points: interesting ideas were stated fairly clearly
2 points: ideas somewhat unclear
1 point: ideas not clear

Organization
4 points: well organized
3 points: fairly well organized
2 points: loosely organized
1 point: ideas disconnected

Vocabulary
4 points: very effective choice of words
3 points: effective choice of words
2 points: fairly good vocabulary
1 point: limited range of vocabulary

Grammar
4 points: almost no errors
3 points: few minor errors
2 points: some errors
1 point: many errors

Cohesion & Logical Consistency
4 points: sentences logically combined
3 points: sentences fairly logically combined
2 points: sentences poorly combined
1 point: many unfinished sentences
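The analytic scale above (five criteria, four-point scale, three raters) can be represented as a small data structure. The sketch below is only an illustration: the source does not say how the three raters' scores were combined, so averaging per criterion and summing the averages is an assumption, and the rater names and scores are hypothetical.

```python
from statistics import mean

# The five criteria named in the analytic rating scale.
CRITERIA = [
    "Originality of Content",
    "Organization",
    "Vocabulary",
    "Grammar",
    "Cohesion & Logical Consistency",
]


def summarize_analytic(ratings):
    """Average each criterion across raters and sum the averages.

    ratings maps a rater name to a dict of criterion -> score (1 to 4).
    Averaging and summing is an assumed aggregation rule, not one
    stated in the study.
    """
    for rater, scores in ratings.items():
        for criterion in CRITERIA:
            if scores[criterion] not in (1, 2, 3, 4):
                raise ValueError(f"{rater}: {criterion} must be 1, 2, 3 or 4")
    per_criterion = {
        c: mean(scores[c] for scores in ratings.values()) for c in CRITERIA
    }
    return per_criterion, sum(per_criterion.values())


# Hypothetical scores for one essay from three raters.
example_ratings = {
    "rater_a": dict(zip(CRITERIA, [3, 3, 2, 3, 3])),
    "rater_b": dict(zip(CRITERIA, [4, 3, 2, 2, 3])),
    "rater_c": dict(zip(CRITERIA, [3, 2, 2, 3, 3])),
}

per_criterion, total = summarize_analytic(example_ratings)
for criterion, average in per_criterion.items():
    print(f"{criterion}: {average:.2f}")
print(f"Total (out of {4 * len(CRITERIA)}): {total:.2f}")
```

Keeping the per-criterion averages alongside the total preserves the diagnostic detail that distinguishes an analytic scale from a single holistic score.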

A comparison of holistic and analytic scoring methods in the assessment of writing


by Yuji Nakamura (Tokyo Keizai University)
Conclusions

Several conclusions can be drawn from this study.


1. For practical and economical reasons, holistic (one-item) assessment can be used, but to avoid risky
idiosyncratic ratings, analytic assessment (with several evaluation items) is strongly recommended.
2. In terms of rating options, the best practice is to have multiple raters and multiple rating items. The next best
practice is to have one overall evaluation item and multiple raters. In order of preference, the third choice would
be to have one rater and multiple items. The least recommended solution would be to have one rater and one
item. Even worse than this, however, would be to have one rater and an impressionistic scale.
3. This study suggests that it is very risky for one classroom teacher to judge students using a holistic rating system
(cf. Table 1 and the discussion).
4. The more ratings a person receives, the higher the rating precision, though one obvious condition is that
construct and content validity must come before statistical reliability. Otherwise, we do not know what the test is
measuring (cf. Table 3 and Table 4 and the discussion).
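One standard way to illustrate the claim in point 4, that more ratings give higher precision, is the Spearman-Brown prophecy formula. The sketch below is not necessarily the analysis used in this study; it assumes the ratings are parallel (equally reliable and measuring the same construct), and the single-rating reliability of 0.5 is a hypothetical value chosen only for illustration.

```python
def spearman_brown(single_rating_reliability, n_ratings):
    """Predicted reliability of the mean of n parallel ratings.

    Spearman-Brown: r_n = n * r / (1 + (n - 1) * r).
    Assumes every rating is equally reliable, which real raters
    and rating items only approximate.
    """
    r = single_rating_reliability
    if not 0.0 <= r <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if n_ratings < 1:
        raise ValueError("n_ratings must be at least 1")
    return n_ratings * r / (1 + (n_ratings - 1) * r)


# Hypothetical single-rating reliability of 0.5.
for n in (1, 3, 5, 15):
    print(f"{n:2d} ratings -> predicted reliability {spearman_brown(0.5, n):.2f}")
```

With three raters and five items, an essay receives fifteen ratings instead of one, which is why the multiple-rater, multiple-item design ranks first in point 2. The formula says nothing about validity, so the condition stated in point 4 still applies.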

As Weigle (2002) further suggests:


1. If large numbers of students need to be placed into writing courses with limited time and limited resources, a
holistic scale may be the most appropriate choice in terms of practicality. Reliability, validity and impact can be
ameliorated by the possibility of adjusting placement scores.
2. A test of writing used for research purposes should have reliability and construct validity as central concerns, and
practicality and impact issues should be of lesser significance.
3. The choice of testing procedures should involve finding the best possible combination of the qualities (reliability,
validity, etc.) and deciding which qualities are most relevant in a given situation.
References