
PRINCIPLES OF LANGUAGE ASSESSMENT

Principles of Language Assessment


Practicality
Reliability
Validity
Authenticity
Washback

1- Practicality

Not excessively expensive,
Within appropriate time constraints,
Relatively easy to administer,
With a scoring/evaluation procedure that is specific and time-efficient.

Example of Practicality Checklist


1. Are administrative details clearly established before the test?
2. Can students complete the test reasonably within the set time frame?
3. Is the cost of the test within budget limits?

2- Reliability
Reliability is the degree to which an assessment tool produces stable and consistent results.
A reliable test is consistent and dependable: if you give the same test to the same students on two different occasions, it should yield similar results.
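As a rough illustration of this consistency, reliability is often estimated by correlating the scores from two administrations of the same test. Below is a minimal sketch in Python; the score lists are invented for illustration, and statistics.correlation assumes Python 3.10 or later.

    # Minimal sketch: test-retest reliability as the correlation between
    # two administrations of the same test to the same students.
    # The scores below are invented for illustration only.
    from statistics import correlation

    first_sitting  = [72, 65, 88, 54, 91, 60, 77, 83]   # administration 1
    second_sitting = [70, 68, 85, 58, 93, 57, 75, 80]   # administration 2

    reliability_estimate = correlation(first_sitting, second_sitting)  # Pearson r
    print(f"Test-retest reliability estimate: {reliability_estimate:.2f}")
    # Values close to 1.0 suggest the test yields consistent results;
    # lower values point to one of the sources of unreliability listed below.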

There are four factors that might contribute to the unreliability of a test:

Student-related reliability
Rater reliability (scoring)
Test administration reliability
Test reliability

2- Reliability
a. Student-related reliability
is related to factors within the students themselves, such as illness, fatigue, anxiety, or other physical and psychological factors that might make the observed score unreflective of the student's actual performance.
Also includes test-wiseness, or strategies for efficient test-taking.
b. Rater reliability
is related to human subjectivity, bias, and error.

Inter-rater unreliability occurs when two or more scorers yield inconsistent scores for the same test, due to lack of attention, inexperience, or even bias.
Intra-rater unreliability is common among classroom teachers, often because of unclear scoring criteria or bias toward particular "bad" or "good" students.
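A common classroom check on rater reliability is to have two raters score the same set of papers independently and see how often they agree. A minimal sketch with invented scores, assuming each paper is rated on a 1-5 band scale:

    # Minimal sketch: exact-agreement rate between two raters who scored
    # the same eight essays independently on a 1-5 band scale (invented data).
    rater_a = [4, 3, 5, 2, 4, 3, 5, 4]
    rater_b = [4, 3, 4, 2, 4, 2, 5, 4]

    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    agreement_rate = agreements / len(rater_a)
    print(f"Exact agreement between raters: {agreement_rate:.0%}")  # 75%
    # A low agreement rate signals inter-rater unreliability: the scoring
    # criteria may need to be clarified or the raters trained further.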

2- Reliability
c. Test administration reliability
is related to the conditions in which the test is administered.
Factors such as heat, light, noise, confusing directions, and different amounts of testing time allowed to different students can affect students' scores.
d. Test reliability
is related to the test itself:

Is the test too long?
Is the time limit appropriate?
Is the test poorly written?
Is there more than one answer to a question?
Is the wording of the test clear?
Is the test subjective or objective?

2- Reliability
Factors that can affect the reliability of a test
Test length factors
longer tests produce higher reliabilities (see the Spearman-Brown sketch after this list).
Teacher-student factors
a good teacher-student relationship helps increase the consistency of the results.
Environment factors
a favourable environment will improve the reliability of the test.
Test administration factors
test administrators should strive to provide clear and accurate instructions.
Marking factors
clear and consistently applied marking criteria reduce scorer variation.
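The claim that longer tests produce higher reliabilities is usually expressed with the Spearman-Brown prophecy formula, which predicts the reliability of a test lengthened n times with comparable items. A minimal sketch (the starting reliability of 0.70 is an invented value):

    # Minimal sketch: Spearman-Brown prophecy formula.
    # Predicted reliability when a test is lengthened n times with comparable
    # items: r_new = n * r / (1 + (n - 1) * r)
    def spearman_brown(r: float, n: float) -> float:
        return n * r / (1 + (n - 1) * r)

    current_reliability = 0.70                        # invented starting value
    doubled = spearman_brown(current_reliability, 2)  # test doubled in length
    print(f"Predicted reliability of the doubled test: {doubled:.2f}")  # ~0.82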

3- Validity
Validity refers to how well a test measures what it is purported to measure.
Tests themselves are not valid or invalid; instead, we validate the use of a test score.
Validity is a matter of degree, not all or none.
Test validity refers to the extent to which inferences made from the assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment.

A test of reading ability must actually measure reading ability.
Does the test content/items match the course content or unit being taught?

3- Validity
a. Content-related validity
is achieved when students are asked to perform the behavior that is being measured. For example, if you are assessing a student's ability to speak a second language but you ask him/her to answer paper-and-pencil multiple-choice questions that require grammatical judgment, then you are not achieving content validity. (direct vs. indirect testing)
Does the test follow the logic of the lesson or unit?
Are the objectives of the lessons clear and present in the test?
The test should assess real course objectives and test performance directly.

3- Validity
b. Criterion-related validity
is achieved when the test has demonstrated its effectiveness in predicting a criterion, or indicators, of a construct.
It looks at the relationship between a test score and an outcome. For example, SAT scores are used to determine whether a student will be successful in college; first-year grade point average becomes the criterion for success.
Looking at the relationship between test scores and the criterion can tell you how valid the test is for determining success in college.
The criterion can be any measure of success for the behaviour of interest.
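In practice, criterion-related validity is usually reported as a validity coefficient: the correlation between the test scores and the criterion measure. A minimal sketch of the SAT/first-year GPA example above, with invented numbers, again assuming Python 3.10+ for statistics.correlation:

    # Minimal sketch: a validity coefficient is the correlation between test
    # scores and the criterion (here, first-year GPA). All data are invented.
    from statistics import correlation

    sat_scores     = [1050, 1200, 980, 1350, 1100, 1420, 900, 1250]
    first_year_gpa = [2.8,  3.2,  2.5, 3.7,  3.0,  3.8,  2.3, 3.4]

    validity_coefficient = correlation(sat_scores, first_year_gpa)
    print(f"Criterion-related validity coefficient: {validity_coefficient:.2f}")
    # The closer the coefficient is to 1.0, the better the test scores predict
    # success on the criterion of interest.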


3- Validity
A criterion-related validation study can be either predictive of later behaviour or a concurrent measure of behaviour or knowledge.
i) Concurrent validity

A test has concurrent validity if its results are supported by other concurrent performance beyond the assessment itself.
Statements of concurrent validity indicate the extent to which test scores may be used to estimate an individual's present standing on a criterion.
E.g., a high score on the final English language exam would be substantiated by actual proficiency in the language.

3- Validity
ii) Predictive validity
refers to the form of criterion-related validity that is an index of the degree to which a test score predicts some criterion measure obtained at a future time.
It assesses a test-taker's likelihood of future success.
It is important in the case of placement tests.

3- Validity
c. Construct-related validity
Construct validity refers to the degree to which a test or other measure assesses the underlying theoretical construct it is supposed to measure (i.e., the test is measuring what it is purported to measure).
It is concerned with the theoretical relationships among constructs and the corresponding observed relationships among measures (read page 33).

3- Validity
c. Construct-related validity
As an example, think about a general knowledge test of basic algebra. If a test is designed to assess knowledge of facts concerning rate, time, distance, and their interrelationships, but the test questions are phrased in long and complex reading passages, then perhaps reading skills are inadvertently being measured instead of factual knowledge of basic algebra.
To demonstrate construct validity, evidence that the test measures what it purports to measure (in this case basic algebra) and evidence that the test does not measure irrelevant attributes (reading ability) are both required.

3- Validity
d. Consequential validity (impact)
Some professionals feel that, in the real world, the consequences that follow from the use of assessments are important indications of validity.
Two levels of impact:
i) macro level: the effect on society and educational systems
ii) micro level: the effect on individual test takers

3- Validity
e. Face validity
A test is said to have face validity if it "looks like" it is going to measure what it is supposed to measure.
Face validity is not empirical; one is saying that the test appears it will work, as opposed to saying it has been shown to work.
It refers to the extent to which students view the assessment as fair, relevant, and useful for improving learning.
Students will judge a test to be face valid if:

Directions are clear
The test is organized in a logical way
There are no surprises
The time allowed is appropriate
The level of difficulty is appropriate

4- Authenticity
The language is as natural as possible.
Items are contextualized rather than isolated.
Topics are meaningful (relevant, interesting) for the learner.
Some thematic organization is provided for the items, such as through a story line or episode.
Tasks represent, or closely approximate, real-world tasks.

5- Washback
The effect of testing on teaching and learning
(Hughes in Brown, 2004).
Generally refers to the effects tests have on
instruction in terms of how students prepare for
the test (Brown, 2004).

References
Brown, H. Douglas. 2004. Language Assessment: Principles and Classroom Practices. Pearson Education, Inc.
Chitravelu, Nesamalar. 2005. ELT Methodology: Principles and Practice. Penerbit Fajar Bakti Sdn. Bhd.

