
VALIDATION

The purpose of validation is to determine the characteristics of the test as a whole, namely its validity and reliability. Validation is the process of collecting and analyzing evidence to support the meaningfulness and usefulness of the test.

Validity
We defined validity as the extent to which a test measures what it purports to measure, or as the appropriateness, correctness, meaningfulness, and usefulness of the specific decisions a teacher makes based on the test results. These two definitions of validity differ in the sense that the first refers to the test itself, while the second refers to the decisions the teacher makes based on the test.

A teacher who conducts test validation might want to gather different kinds of evidence. There are essentially three main types of evidence that may be collected: content-related evidence of validity, criterion-related evidence of validity, and construct-related evidence of validity.

Content-related evidence of validity refers to the content and format of the instrument.

Criterion-related evidence of validity refers to the relationship between scores obtained using the instrument and scores obtained using one or more other tests. How strong is this relationship? How well do such scores estimate present performance or predict future performance of a certain type?
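Criterion-related evidence is often summarized as a validity coefficient: the correlation between scores on the test and scores on the criterion measure. As a minimal sketch, the Pearson correlation can be computed directly from its definition; the score lists below are hypothetical.

```python
# Criterion-related validity as a correlation (validity coefficient).
# The test_scores and criterion values below are hypothetical examples.

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

test_scores = [78, 85, 62, 90, 71, 88, 67, 80]   # scores on the test
criterion   = [75, 89, 60, 94, 70, 85, 72, 78]   # e.g. final course grades

print(f"validity coefficient r = {pearson_r(test_scores, criterion):.2f}")
```

The closer the coefficient is to 1.0, the better the test scores estimate or predict performance on the criterion.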

Construct-related evidence of validity refers to the nature of the psychological construct or characteristic being measured by the test. How well does a measure of the construct explain differences in the behavior of individuals or their performance on a certain task?

Such a two-way table is easy to construct: the test categories are listed down the left-hand side and the criterion categories horizontally along the top of the chart. The criterion measure used here is the final average grade of the students in high school: Very Good, Good, and Needs Improvement. The two-way table lists the number of students falling under each of the possible pairs, as shown below.

                     Grade Point Average
Test Score           High    Average    Low
Very Good              20         10      1
Good                   10         25     10
Needs Improvement       5          5     14
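The tallying step described above can be sketched in a few lines: count how many students fall into each (test category, criterion category) pair and print the counts as a two-way table. The list of pairs below is a small hypothetical sample, not the data behind the table above.

```python
# Building a two-way expectancy table from paired categories.
# The (test score, GPA) pairs below are hypothetical examples.
from collections import Counter

pairs = [
    ("Very Good", "High"), ("Very Good", "High"), ("Very Good", "Average"),
    ("Good", "Average"), ("Good", "Low"), ("Needs Improvement", "Low"),
]
table = Counter(pairs)  # maps each (row, column) pair to its count

rows = ["Very Good", "Good", "Needs Improvement"]
cols = ["High", "Average", "Low"]
print(f"{'Test Score':<18}" + "".join(f"{c:>9}" for c in cols))
for r in rows:
    print(f"{r:<18}" + "".join(f"{table[(r, c)]:>9}" for c in cols))
```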

RELIABILITY
Reliability refers to the consistency of the scores obtained: how consistent they are for each individual from one administration of an instrument to another and from one set of items to another. We already gave the formulas for computing the reliability of a test: for internal consistency, for instance, we could use the split-half method or the Kuder-Richardson formulas (KR-20 or KR-21).
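As a minimal sketch of one of these formulas, KR-20 can be computed from a matrix of dichotomous (0/1) item scores: k is the number of items, p and q are the proportions answering each item correctly and incorrectly, and the variance is taken over students' total scores. The answer matrix below is hypothetical.

```python
# Kuder-Richardson formula 20 (KR-20) for internal consistency.
# The 0/1 answer matrix below is a hypothetical example.

def kr20(scores):
    """scores: list of per-student lists of 0/1 item scores."""
    k = len(scores[0])                     # number of items
    n = len(scores)                        # number of students
    totals = [sum(row) for row in scores]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n  # variance of totals
    pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n           # proportion correct
        pq += p * (1 - p)                               # p * q for this item
    return (k / (k - 1)) * (1 - pq / var_t)

answers = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1],
]
print(f"KR-20 = {kr20(answers):.2f}")  # prints "KR-20 = 0.65"
```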

Reliability and validity are related concepts. If an instrument is unreliable, it cannot yield valid results. As reliability improves, validity may improve. However, if an instrument is shown scientifically to be valid, then it is almost certain that it is also reliable. The following table gives a standard followed almost universally in educational tests and measurement.

Reliability       Interpretation
.90 and above     Excellent reliability; at the level of the best standardized tests.
.80 - .90         Very good for a classroom test.
.70 - .80         Good for a classroom test; in the range of most. There are probably a few items which could be improved.
.60 - .70         Somewhat low. This test needs to be supplemented by other measures to determine grades. There are probably some items which could be improved.
.50 - .60         Suggests need for revision of test, unless it is quite short. The test definitely needs to be supplemented by other measures for grading.
.50 or below      Questionable reliability. This test should not contribute heavily to the course grade, and it needs revision.
