
2012

Language testing

Ammar Mustafa Mahadi, Open University of Sudan

Teaching and Testing Introduction

Teaching and testing are so closely interrelated that it is impossible to work in either field without being concerned with the other. Students learn a language; teachers teach it and test samples of what students have learned. The samples are usually chosen from the learning problems, since knowing the problems is knowing the language.

The Need for Testing


- To evaluate students' performance.
- To enable teachers to increase their own effectiveness in teaching.
- To locate the precise difficulties encountered by students.
- To see which parts of the syllabus need amendment.
- To motivate learners, since students learn from their weaknesses.
- To detect weaknesses so that remedial work and/or additional practice can be provided.
- To reinforce learning as well as teaching.
- To have washback on the teaching methods and learning techniques used.
- To have washback on the syllabus taught.

Backwash (also washback)

Backwash is the effect of testing on teaching and learning. Language test washback is either positive or negative.

Positive washback results when a testing procedure that encourages good teaching practice is used. e.g. The use of an oral interview in a final examination may encourage teachers to practise conversational language with their students.

Negative washback may occur when the test items have little relationship to the teaching curriculum. e.g. If the writing skill is tested by multiple-choice items.

2. Kinds of Tests

There are several kinds of tests. The kind of test to be used in each case depends on the purpose of the test.

In this course we will discuss four kinds of tests:

- Proficiency tests.
- Achievement tests.
- Diagnostic tests.
- Placement tests.

Proficiency Tests

The purpose of this type of test is to see how far candidates are proficient in using the language for a certain purpose.

- Designed to measure people's ability in the language: what they can do with the language, or whether they can use it for a particular purpose.
- Regardless of any training they have had, i.e. the test is not designed on a certain syllabus or textbook.

Example: whether a student can take a university course on electricity in English.

Another type of proficiency test may be designed to see whether candidates have reached a certain level of proficiency. Examples are the Cambridge examinations:

- Cambridge First Certificate Examination.
- Cambridge Proficiency Examination.

Achievement Tests

Achievement tests are directly related to language courses. The purpose of such tests is to see how successful the learners have been in achieving the learning objectives. Achievement tests are of two types:

Final Achievement Tests

- Administered at the end of the course.
- May be written by the Ministry of Education or another board.
- Based on the course or on the content of the syllabus taught.

Progress Tests

- Intended to measure the progress the students are making.
- Usually administered at different stages of the course being taught.
- Hence should be based on the parts of the course that the learners have already covered.

Diagnostic Tests

Diagnostic tests are used to identify students' strengths and weaknesses. A disadvantage of a diagnostic test is that it cannot be short: it has to be extensive in order to ensure reliability, because one or two questions on a language category might be answered correctly by chance.

Placement Tests

Placement tests are used to provide information which will help to place students at the stage of the teaching programme most suitable to their abilities. Successful placement tests are those constructed for a particular situation: every institution (with a different language course) should design its own placement test.

Test Validity

A valid test is one which measures accurately what it is intended to measure.

The Aspects of Validity:

1. Content Validity

The content of the test must be a representative sample of the language categories that the test is designed to measure. This means that a test for beginners cannot be used for advanced students. To ensure content validity, the language and skill items to be measured should be specified in advance. This will also give fair washback.

2. Criterion-related Validity

There are two types of criterion-related validity:

a. Concurrent validity.
b. Predictive validity.

Concurrent Validity (the more commonly used)

This is done when there is not enough time to give the whole test to every student. Each student is therefore given a shorter time on some language categories of the test. In addition, a random sample of the students is given the appropriate time to take the whole test. The results of the two groups are then compared and a validity coefficient can be calculated.

Example: a class of 50 students; a test of oral communication.

- Validity group, group (1): 50 students; each student is interviewed for 5 minutes on some parts of the test.
- Criterion group, group (2): 10 students; each takes the whole test in the full test time of 30 minutes.

The percentage scores of the two groups are compared and the validity of the test taken by the first group is calculated. If the comparison reveals a high level of agreement between the scores of the two groups, then the validity of the test given to group one is good.
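One way to picture the comparison in this example is as a correlation between scores on the short interview and scores on the full test. The sketch below is only an illustration under stated assumptions: it assumes the 10 students of the criterion group have a score on both versions, and it uses Pearson's correlation as the validity coefficient, which these notes do not specify. The function name and all scores are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical percentage scores for the 10 criterion-group students:
# the 5-minute interview (short) and the full 30-minute test (full).
short_scores = [55, 62, 70, 48, 81, 66, 90, 73, 59, 77]
full_scores = [58, 60, 74, 50, 85, 63, 88, 70, 61, 80]

coefficient = pearson(short_scores, full_scores)
print(f"validity coefficient: {coefficient:.2f}")
```

A coefficient close to 1 would indicate the high level of agreement that the example describes; how high is high enough is a judgment the tester still has to make.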

Construct Validity

A test has construct validity if it is constructed to measure a specific characteristic of language learning. e.g. If the communicative approach is adopted throughout a course, then the test should be designed to measure the four skills as integrated skills.

Face Validity

To ensure face validity, the test designer may show the test to colleagues and friends, who might view it more objectively. A test designed in one country might have low face validity in another country, so authentic material should be used.

Content Validity

A test with content validity is designed to contain a representative sample of all the content covered by the course objectives. The relation between the objectives should also be considered: more important items should have more weight. A list of the course objectives should be put in a table and then the weights can be assigned.

Reliability

A test is said to be reliable if it produces the same measurements when it is administered to the same candidates on different occasions. A reliable test is consistent in its measurement. This can be seen in two ways:

- Test / re-test reliability: when the test is administered on different occasions, without additional teaching between them.

- Mark / re-mark reliability: when the test papers are marked by different teachers and are awarded the same marks or grades.

Factors Affecting Reliability

- The extent of the sample: the greater the sample, the better the reliability.
- The administration conditions: e.g. time, recording of oral tests.
- Individuals' personal factors: such as illness and motivation.
- Instructions: whether the rubrics are clear to all candidates.
- Scoring the test: objective multiple-choice items are scored more reliably than compositions marked subjectively.

Profile Reporting Reliability

A recent dimension of the concept of reliability, introduced by communicative language testing. In order to get a full profile of a student's ability in the target language, it is necessary to assess his / her performance in each of the different areas of communication.

The communicative areas of the language include: listening comprehension, speaking and listening, reading, reading and writing (summarizing), and writing.

Stages of Test Construction

The tester decides what it is that he / she wants to know and for what purpose. Hence, the tester has to answer the following questions:

a. What kind of test is it to be?
b. What is its precise purpose?
c. What are the language abilities to be tested?
d. How detailed must the results be?
e. How accurate must the results be?
f. How important is backwash?
g. What constraints are involved?

The stages of test construction are:

1. Planning Stage: deciding on
   o the content.
   o the general layout.
   o the types of test items (questions).
   o the length.
   o the time limit.
   o the instructions to be given.
   o the scoring method.

2. Pre-pilot Stage: trying the rough draft of the test on a small group of people in order to see the general impact of the test and to identify the unsatisfactory items.

3. Pilot Stage: trying out the draft on a larger sample of the same kind of people. This will provide feedback for test administration and for revision.

4. Final Validation: trying out the final form of the test. This will give evidence of practicality and validity.

Test Techniques and Measuring Overall Ability

The phrase "test techniques" means ways of eliciting behaviour from candidates which will tell us about their language abilities. A technique should be valid, reliable and economical in time, and should have useful washback.

Multiple Choice

In this technique a question consists of a stem and options. One of the options is the correct answer and the others are distractors.

Example:

Stem: He has been here ____ half an hour.
Options: A: during  B: for  C: while  D: since

Options A, C and D are distractors; option B is the correct answer. The candidate is to choose the correct option.

The main advantages of multiple choice are that its scoring is perfectly reliable, rapid and economical in time.
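Because each item has exactly one correct option, multiple-choice papers can be scored mechanically against an answer key, which is what makes the scoring reliable and rapid. A minimal sketch follows; the key, the responses and the function names are invented for illustration. The second function shows the average score that blind guessing would yield, assuming every item has the same number of options.

```python
def score_multiple_choice(answer_key, responses):
    """Count the items where the candidate's response matches the key."""
    if len(answer_key) != len(responses):
        raise ValueError("the key and the responses must have the same length")
    return sum(1 for key, given in zip(answer_key, responses) if key == given)


def expected_chance_score(n_items, n_options):
    """Average score of a candidate who guesses blindly on every item."""
    return n_items / n_options


answer_key = ["B", "A", "D", "C", "B"]  # hypothetical key for five items
responses = ["B", "C", "D", "C", "A"]   # hypothetical candidate answers

print(score_multiple_choice(answer_key, responses), "out of", len(answer_key))
print(expected_chance_score(n_items=20, n_options=4), "expected from guessing on 20 items")
```

Any marker applying the same key to the same paper arrives at the same score, whereas the chance figure is a reminder that part of any multiple-choice score may reflect guessing.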

Disadvantages of Multiple-Choice Tests

- The technique tests only recognition knowledge, i.e. a candidate who recognizes the correct option might not be able to use the same form when speaking or writing.
- The effect of guessing: if each item has four options, a candidate has a 25% chance of answering it correctly by guessing. This also means that some candidates might score higher, and others lower, than their real ability warrants.
- Distractors are not always available, which limits what can be tested. Example: for the past simple versus the present perfect, other distractors are not quite relevant.
- Difficulty of writing questions: the difficulty of writing good test items makes testers commit mistakes, such as:
  o More than one possible answer.
  o Clues in the options as to which is correct.
  o Ineffective distractors.
- Cheating might be facilitated. To avoid cheating, one suggestion is to have two versions of the same test, with a different order of options.

Cloze Test Technique

Cloze test techniques are of various types. They can be used to measure more than one language ability at a time, though they are recommended for testing reading skills.

Varieties of Cloze Test Procedure

Blank-filling

The tester deletes a number of words in a passage, leaving blanks for the testee to fill in. The first sentences are usually left intact as a lead-in. Words are deleted at a fixed interval, e.g. every seventh word is deleted. Careful selection of the text and of the deleted words is needed.

Characteristics of the Blank-Filling Procedure

Positives:
o Easy to construct.
o Easy to administer.
o Easy to score.
o Integrative method, i.e. it measures more than one language ability.

Negatives:
o If reading is measured, we cannot be sure of the speaking ability of the testee.
o Different passages give different results.
o Deleting different words might give a different result.
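The fixed-interval deletion described above can be sketched in a few lines. This is only an illustration: the function name, the blank format and the naive sentence splitting are assumptions, and a real test would still need the careful selection of text and deleted words that the notes call for.

```python
import re
import string


def make_cloze(text, n=7, lead_in_sentences=1):
    """Blank every n-th word after the lead-in; return (gapped_text, answers)."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    lead_in = sentences[:lead_in_sentences]
    words = " ".join(sentences[lead_in_sentences:]).split()
    answers = []
    gapped = []
    for position, word in enumerate(words, start=1):
        core = word.rstrip(string.punctuation)
        tail = word[len(core):]  # trailing punctuation stays visible
        if position % n == 0 and core:
            answers.append(core)
            gapped.append(f"({len(answers)}) ____{tail}")
        else:
            gapped.append(word)
    return " ".join(lead_in + gapped), answers


passage = (
    "Before reading tests in the second or foreign language can be "
    "successfully constructed, the first language reading skills of the "
    "students must be ascertained. Clearly there is often little purpose in "
    "testing those reading skills in the second language which the students "
    "have not yet developed in their first language."
)
gapped_text, answer_key = make_cloze(passage, n=7)
print(gapped_text)
print(answer_key)
```

Changing `n` or the passage changes which words are removed, which is one way to see why different deletions can give different results.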

Example: fill in the blanks:

Before reading tests in the second or foreign language can be successfully constructed, the first language reading skills of the students must be ascertained. Clearly there (1) ____ often little purpose in testing those reading (2) ____ in the second language which the students have not yet developed in their (3) ____ language. However, the mere fact that a (4) ____ has mastered the reading skills in the first (5) ____ is no guarantee that he / she will be able to transfer those skills of (6) ____ into the (7) ____ language.

Answers: (1) is (2) skills (3) first (4) student (5) language (6) reading (7) second

The C-Test

The C-test is a type of cloze test. However, instead of deleting whole words, the second half of every second word is deleted.

Characteristics of the C-test:

- Shorter passages can be used.
- More passages can be used.
- Only exact scoring is necessary.
- Can be used to test parts of speech.
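The C-test deletion rule can be sketched as follows. The details are assumptions made for illustration, not part of these notes: the first sentence is kept as a lead-in, counting starts with the first word after it, one-letter words are left intact, and for words with an odd number of letters the larger part is deleted. The function name is hypothetical.

```python
import re
import string


def make_c_test(text, lead_in_sentences=1):
    """Delete the second half of every second word after the lead-in."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    lead_in = sentences[:lead_in_sentences]
    words = " ".join(sentences[lead_in_sentences:]).split()
    mutilated = []
    answers = []
    for position, word in enumerate(words, start=1):
        core = word.rstrip(string.punctuation)
        tail = word[len(core):]  # trailing punctuation stays visible
        if position % 2 == 0 and len(core) > 1:
            keep = len(core) // 2  # letters left for the testee to see
            answers.append(core)
            mutilated.append(core[:keep] + "_" * (len(core) - keep) + tail)
        else:
            mutilated.append(word)
    return " ".join(lead_in + mutilated), answers


passage = (
    "There are usually five men in the crew of a fire engine. "
    "One of them drives the engine. The leader sits beside the driver."
)
c_test_text, c_test_answers = make_c_test(passage)
print(c_test_text)
print(c_test_answers)
```

Because only the whole original word restores each item, the answer list is all that is needed for the exact scoring mentioned above.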

The advantages listed for blank-filling also apply to the C-test.

Example: complete the words in the paragraph:

There are usually five men in the crew of a fire engine. One o___ them dri___ the eng___. The lea___ sits bes___ the dri___. The o___ firemen s___ inside t___ cab of the f___ engine.

Dictation

Advantages of Dictation:

- Dictation can be used to test spelling and punctuation.
- Dictation can be used to test listening.
- A dictation test is easy to create and to administer.

Steps for Dictation:

1. The tester (teacher) reads the whole text straight through. Testees (students) listen.
2. The tester reads the text in stretches (meaningful parts), pausing after each stretch long enough for students to write.
3. After testees finish writing, the tester may re-read the text once more for testees to check.

A Disadvantage:

o Dictation is not easy to score, especially when jumbled words are dictated to be written as sentences.
o To make scoring easy, partial dictation can be used, i.e. a text is dictated while testees have the written form with gaps to fill in.

Testing Writing


In order to test writing, we have to make testees write. The writing skills that we should test are those that testees have been trained to master.

General Writing Skills:

- Mechanical abilities: the ability to use correct spelling, words and punctuation.
- Language use: the ability to write correct sentences.
- Treatment of content: the ability to think creatively and write relevant information.
- Stylistic skills: the ability to manipulate sentences and paragraphs, and use language effectively.
- Judgment skills: the ability to write for a particular purpose with a particular audience in mind (communicatively).

Specifications of a Writing Test

A writing test should measure only the writing skills.

Content: (for a class test) the test should measure the writing skills the students were trained in.

Tasks: the test is to measure the types of texts and functions that students have been learning to perform (those included in the syllabus textbook).

Example:

First Level:
o greetings
o expressions of thanks
o wants / needs
o apology

Second Level:
o describe
o explain
o compare / contrast
o argue

Material: test material should be authentic; otherwise, the degree to which the authentic material is to be altered should be considered.

Rubrics: must be clear.

Length of the test: should be fixed.

Types of questions: chosen as appropriate.

Using Techniques to Test Writing

A) Multiple-choice Questions

Example:

Instruction: check the alternative that is correct.

Punctuation:

Do you plan to come tomorrow ( ) Yes, I do.
a. (?)  b. (.)  c. (,)  d. (:)

Spelling:

Did you rec__ve a letter from home today?
(a) ei  (b) ee  (c) i  (d) ie

A group of words is a __rase.
(a) f  (b) gh  (c) p  (d) ph

There were three men and two w__men present.
(a) i  (b) e  (c) o  (d) u

Vocabulary:

My nephew's sister is my ____.
(a) a...t  (b) n...e  (c) c...n  (d) s...r

B) Filling the Gap

This technique can be used to measure writing skills in different categories.

Example:

Grammar (tenses): fill in the blank with the correct form:
Yesterday I ____ (watch) a good film on T.V.

Vocabulary:
A group of words that gives full meaning is a ____.

Punctuation: put in the correct punctuation marks:
He is a clever student ( ) last exam he won the school prize ( )

Guided Composition

Because candidates should know exactly what is required of them, the task should be restricted. The guided composition technique uses different kinds of prompts to do that, such as notes or pictures.

Example: Compare the benefits of university education in English with its drawbacks. Use the points given below. You should write about one page.

Arabic:
o Easier for students.
o Easier for most teachers.
o Saves a year in most cases.

English:
o Most scientific references are in English.
o English is an international language, which means better jobs.
o Learning a second language is an advantage (for more education and culture).

Scoring Writing Performance

There are generally two methods of scoring students' performance on a writing test.

Analytic Method: the tester allots a certain score to each category he / she is measuring, e.g. grammar, word order, spelling.

Holistic Method (also called the impressionistic method): the tester assigns a single score to a piece of writing on the basis of an overall impression of it.

Testing Oral Ability

The main objective here is to measure the extent to which the learner has developed the ability to interact in the target language.

Therefore the tester should set tasks that represent samples of the performance the learner is expected to produce.

Task Specifications:

Operations: at an intermediate level, a learner is expected to perform tasks such as expressing:
- thanks
- requirements
- opinions
- information
- wants / needs

Accuracy: a certain level of accuracy is expected from the learner when producing the target language, though some grammatical / lexical errors that do not destroy communication are acceptable.

Communicative Function: the appropriate use of language for the function should be considered.

Flexibility: to measure the learner's ability to initiate a conversation and to adapt to new topics.

Size: the ability to produce more complex utterances and develop these into discourse.

Oral Test Formats

Interview: this is the most used format. Drawbacks:
a. The interviewee speaks to a superior; there are no initiatives from the testee.
b. Many functions (such as asking for information) would not be assessed.

Interaction with Peers: two or more students are asked to discuss a topic. A drawback is that one student might affect the others, e.g. by taking most of the initiative and depriving the others of it.

Response to Tape-recordings: this procedure helps to elicit responses in a uniform way. Therefore it has high reliability. A drawback is that there is no way to follow up the testee's flexibility in responses.


Elicitation Techniques

Questions and Requests for Information: direct questions can be used, e.g. Can you tell me what you think of ...? Explain how / why ...? Yes / no questions should be avoided.

Pictures: a single picture can be used for eliciting descriptions. A series of pictures is suitable as a basis for narration.

Role Play: candidates can be asked to assume a role in a particular situation. The tester observes. An advantage is that different functions can be tested. A disadvantage is that one testee might affect the others.

Testing Reading

The task of the tester is to set reading tasks that will result in behaviour demonstrating their successful completion. Setting such reading tasks is not easy, because receptive skills do not usually manifest themselves in overt behaviour.

Therefore, it is important to specify clearly what the testee should be able to do.

Reading Skills to be Tested (different levels):

- Scanning for specific information.
- Skimming to get the gist, i.e. the general meaning of the text.
- Identifying stages of an argument.
- Identifying examples presented in support of an argument.

Micro-Skills to be Tested:

- Identifying the referents of reference words, e.g. pronouns.
- Using context to guess the meaning of unfamiliar words.
- Understanding relations between parts of texts by recognising indicators in discourse, especially for the introduction, development, transition and conclusion of ideas.
- Recognising the significance of the use of different tenses.

Selecting a Text for a Reading Test:

- Select a representative sample: one that represents the reading skills you need to test.
- Choose a text of a suitable length:
  o Testing scanning needs a longer text.
  o For intensive reading, short passages are enough.
- To test scanning, choose a passage with plenty of discrete pieces of information.

- Choose a topic which will interest the testees.
- Avoid texts whose content is within students' general knowledge.
- Avoid texts with a specific cultural bias.
- Do not use a text that students have already read.

Possible Techniques for Writing Reading Tests

To measure reading skills, testers should use question techniques that measure only reading skills and do not entail writing abilities:

- Multiple-choice questions.
- True / false questions.
- Unique answer: filling a gap with a word from the passage.
- Short answers: should not entail much writing ability.
- Identifying the order of events, arguments or topics, e.g. list the events (using numbers) as told in the passage.
- Identifying referents, e.g. What does the pronoun "it" in line 21 refer to?
- Guessing the meaning of unfamiliar words, e.g. Which word in line 15 means the same as "warm and comfortable"?

Testing Listening Skills


There are occasions when only listening is practised, such as listening to the radio, a tape-recorder or a lecture. However, in most cases listening is naturally practised together with speaking (oral skills). When testing listening, most techniques used for reading tests can be applied, since both are receptive skills.

Tested Listening Categories

Testers usually test two listening categories:

- Phoneme discrimination and sensitivity to stress and intonation.
- Listening comprehension.

Listening Skills to be Tested:

- Listening for specific information.
- Listening for the general idea.
- Following directions.
- Following instructions.
- Interpretation of intonation patterns (e.g. sarcasm).
- Recognition of the function of structures (e.g. an interrogative as a request: Can you switch on the light?).

Texts Used in Testing Listening

- Monologues.
- Dialogues.
- Conversations.
- Others, such as talks, lectures, announcements, instructions, etc.

Note: the tester should choose the appropriate category and the appropriate text that are relevant to the testees' level.

Types of Listening Tests

(1) Phoneme Discrimination Tests

1. A. This type of test consists of a picture accompanied by three or four words spoken by the examiner, in person or on tape. The testee is to write the number of the word that is appropriate for the picture.
   e.g. 1. pin  2. pen  3. pair  4. pain

   B. Different pictures are shown. The testee hears only one word, spoken twice. The testee is to choose the right picture.

2. The testee hears three sentences. Two are identical, one is different. The testee is to indicate which two are the same and which one is different.

3. The testee is given a sheet with four written words. He / she listens to only one word spoken by the examiner. The testee is to indicate which word he / she heard.
   e.g. written: ten  pen  den  ben
        spoken: pen

4. The testee hears a sentence. Then he / she chooses the word, from four written words, which occurred in the sentence.
   e.g. spoken: Put the pan in some hot water.
        written: pan  pen  pin  pain

(2) Tests of Stress and Intonation

Two types of tests are generally used:

1. Testees listen to a taped sentence and indicate the syllable which carries the main stress of the whole structure. They show the main stress by putting a cross in the bracket under the appropriate syllable.
   e.g. spoken:  I've got three books now.
        written: I've got three books now.
                 ( )  ( )  (x)   ( )   ( )

2. The examiner says a sentence (in person or on tape). The testee has to identify the mode of the sentence.
   Spoken: He's a fine goalkeeper.
   Written: The speaker is:
   o making a statement.
   o being sarcastic.
   o asking a question.

(3) Testing Statements and Dialogues

These types of tests are designed to measure how well students can understand short samples of speech. They can be used to test listening comprehension in categories such as grammar and lexis. Two types can be used:

1. The testee hears a statement and then chooses the best option from four written paraphrases.
   e.g. spoken: I wish you'd done it when I told you.
        written:
        A: I told you and you did it then.
        B: I didn't tell you but you did it then.
        C: I told you but you didn't do it then.
        D: I didn't tell you and you didn't do it then.

2. The testee hears a short question and has to select the correct response.
   e.g. spoken: Why are you going home?
        written:
        A: At six o'clock.
        B: Yes, I am.
        C: To help my mother.
        D: By bus.

(4) Testing Comprehension through Visual Materials

Pictures, diagrams and maps can be used for testing listening in different categories, such as grammar and lexis. The advantage of using such materials is that the testee's performance is independent of other skills. There are many types of test design:

Type 1: A picture is used in conjunction with a set of spoken statements. The testee decides which statements are true and which are not.

Type 2: Testees are given a set of five pictures, each somewhat different from the others. The testee listens to four sentences describing one of the pictures and has to pick out the picture described.

Type 3: Simple diagrams: students listen to a dialogue about one diagram from a set of four diagrams. The testee is to indicate which diagram the dialogue is about.
   e.g. spoken:
   Look! What's that inside the square?
   It's a white circle.
   Is that a black circle?
   Whereabouts?
   Above the square.
   Yes, it is.

Type 4: Testees listen to a talk (a lecture) and then answer questions about the talk.

Testing Speaking

Most of the techniques and types used to test listening can also be used to test speaking. However, others can be added:

- Oral interview: an interview conducted by the tester with two (or more) testees.
- A short problem-solving activity involving a comparison or sequencing of pictures.
- A role play: each testee takes a role after the situation is set by the tester.
- A longer activity, comprising group discussion.

Testing Grammar

Grammar can be tested by means of many types of questions:

1. Paraphrase: these items require the student to write a sentence equivalent in meaning to one that is given. To restrict the student to the grammatical structure being tested, part of the paraphrase is given.
   e.g. Testing the present perfect with "for":
   It is six years since I last saw him.
   I ____ six years.

2. Completion: the testee is asked to complete the given sentences.
   e.g. Mr. Cole is being interviewed for a job.
   Mr. John: Good morning, Mr. Cole. Which school ____?
   Mr. Cole: Oxford College.
   Mr. John: And when ____?
   Mr. Cole: In 2005.
   Mr. John: And since then what ____?
   Mr. Cole: I worked in a bank for two years. Then I took a job selling cars.

3. Cloze Test Questions: this type can be used to test several parts of grammar, e.g.
   A: Prepositions of place: a picture can be used. The picture contains several objects, and the testee is to indicate the position of each item in relation to another.
   B: Articles: testees are required to write a, an, the or no article.
   e.g. Write a, an, the or x (for no article):
   In England children go to ____ school from Monday to Friday. ____ school that Mary goes to is very small. She walks there each morning with ____ friend. One morning they saw ____ man throwing ____ stones at ____ dog. ____ dog was afraid of ____ man.
   C: Linking words: one word is required.
   e.g. Students should be careful about their examinations. ____, they will fail.

Testing Vocabulary

Several techniques can be used to test vocabulary:

1. Synonyms: the testee is required to indicate the word closest in meaning to a given word.
   e.g. Choose the word that is closest in meaning to the word on the left:
   gleam:  a. gather  b. shine  c. welcome  d. clean

2. Definitions: a number of definitions are given for the tested word. The testee is to indicate which definition is the right one.
   e.g. A: After walking in the cold weather, I was happy to find the room quite cosy.
   "cosy" means:
   a. warm and comfortable
   b. cold and uncomfortable
   c. hot but comfortable
   d. hot and uncomfortable

   B: ____ is the second month of the year.

3. Gap filling: testees are asked to fill in a gap in context. The context should not contain another word that the testee does not know.
   e.g. The strong wind ____ the man's effort to put up the tent.
   a. disabled  b. deranged  c. hampered  d. regaled

4. Production tests: pictures, definitions and gap filling can also be used to test vocabulary in achievement tests.
   e.g. Write down the names of the following objects.
   Here students are presented with pictures of different objects, such as a watch, gloves, etc.
