Appendix 10

SUMMATIVE TEST
(Teacher-made test for the 1st semester of the 1st class at SMPN 4 Jambi,
2004/2005 school year)
I. Speaking Test
It was held on Monday, December 27th, 2004. The teacher divided the students into
groups, then asked each group to make a conversation about “Introduction” and
perform it in front of the class. The speaking tests for Classes 1A and 1B were
held together.

II. Written Test


It was held on Tuesday, December 28th, 2004. The written test is arranged as follows:

A. Listen to the song and find words about job! Put the job words in their right places!

PENNY LANE
The Beatles
In Penny Lane there is a (1)_______________ showing photographs,
of every head he’s had the pleasure to know.
And all the people that come and go,
stop and say hello.

On the corner is a (2) _______________ with a motorcar,
the little children laugh at him behind his back.
And the (3) ___________________ never wears a mack,
in the pouring rain, very strange.

Penny Lane is in my ears and in my eyes.
There, beneath the blue, suburban skies,
I sit, and meanwhile back.

In Penny Lane there is a (4) ________________ with an hourglass,
and in his pocket is a portrait of the Queen.
He likes to keep his fire engine clean,
it’s a clean machine.

Penny Lane is in my ears and in my eyes.
A four of fish and finger pies,
in summer, meanwhile back.

Behind the shelter in the middle of a roundabout,
the pretty (5) ________________ is selling poppies from a tray.
And tho’ she feels as if she’s in a play,
she is anyway.

In Panny Lane the (6) _______________ shaves another customer,
we see the (7) _______________ sitting waiting for a trim.
And then the (8) _________________ rushes in,
from the pouring rain, very strange.

Penny Lane is in my ears and in my eyes.
There, beneath the blue, suburban skies,
I sit, and meanwhile back.
Penny Lane is in my ears and in my eyes.
There, beneath the blue, suburban skies,
Penny Lane.

B. Write a description about person at the picture!

____________________________________
____________________________________
____________________________________
____________________________________
____________________________________
____________________________________
___________________________________ .
DAVID BECKHAM

C. Choose the correct answer!


1. These are my friends. _______ ______ Vira and Anwar.
a. You are b. They are
c. We are d. She is
2. I have a sister. ________ _____ smart.
a. I am b. He is
c. She is d. It is
3. I have a dog. It has a house. ______ house is small.
a. My b. Your
c. Its d. Our
4. Friend : Hi, how are you?
You : _____________
a. Very well, and you? b. Thank you.
c. Nice to meet you. d. Have a nice day.
5. Look at the picture!
Music room R. 1A R. 1B R. 2A

Hall Library Language Lab Toilet

Teachers room R. 3A R. 3B R. 2B
X : Excuse me, where is the Language Lab please?
Y : It is __________________________
a. Beside the Grade 1B room. b. In front of music room
c. Beside the library d. Between the hall and library

6. Andrew never _____________ about his schedule.
a. think b. thinks
c. thought d. is thought
7. How __________ Anna and July ___________ to school?
a. do, go b. does, go
c. are, go d. is, go
8. A living room is __________________
a. a place to wash b. a place to grow flower
c. a place to cook d. a place to relax to talk
9. A dining room is __________________
a. a place to eat b. a place to cook
c. a place to keep a car d. a place to sleep
10. We need ______________ sugar to make ________________ of coffee.
a. a loaf, a glass b. a spoon, a plate
c. a loaf, a plate d. a spoon, a cup
Answer the questions based on the following text for questions number 11 – 15!

Tania’s Schedule

1:00 pm – 4:00 pm Bank open
1:00 pm – 2:30 pm Rest period
2:30 pm – 3:30 pm Computer
3:45 pm – 4:30 pm Music or sports
4:45 pm – 5:30 pm Internet or sports
6:00 pm – 6:30 pm Dinner
6:45 pm – 9:00 pm Shopping Center open

11. Tania save their money _______________
a. before rest time b. at rest time
c. after rest time d. in the morning
12. How long is the bank open?
a. 3 hours b. less than 2 hours
c. 2 hours d. 4 hours
13. Tania buy her daily needs __________________
a. in the morning b. in the afternoon
c. in the evening d. before dinner
14. What does Tania do from 3:45 pm – 4:30 pm?
a. She goes to computer class b. She plays musical instrument
c. She accesses the internet d. She buys sports equipment
15. How long does Tania take a rest?
a. Forty-five minutes b. Thirty minutes
c. One hour d. One and a half hours

Good Luck

Appendix
RELIABILITY AND VALIDITY TEST
(BACHMAN AND PALMER (1996) FRAMEWORK OF EVALUATION)

The Bachman and Palmer (1996) framework for evaluating test usefulness is used
here to evaluate the teacher-made English test for the 1st semester of the 1st
class at SMPN 4 Jambi, 2004/2005 school year. The questions for logical
evaluation of usefulness posed by Bachman and Palmer are identified in italics.

RELIABILITY
1) To what extent do characteristics of the test setting vary from one
administration of the test to another?
All students take the tests in comfortable classrooms without air-conditioning
and with minimal background noise. All students took the speaking test on
December 27th, 2004. The written test was held on December 28th, 2004.

2) To what extent do characteristics of the test rubric vary in an unmotivated
way from one part of the test to another, or on different forms of the test?
Instructions and questions are clear throughout, except that question 10 in
part C is missing the word “of” (We need __________ sugar to make _________ of
coffee. Answer choices: (a) a loaf, a glass; (b) a spoon, a plate; (c) a loaf,
a plate; (d) a spoon, a cup).

3) To what extent do characteristics of the test input vary in an unmotivated way
from one part of the test to another and on different forms of the test?
The input is taken from the textbook used by the students, which is based on
the goals of the curriculum. This is satisfactory.

4) To what extent do characteristics of the expected response vary in an
unmotivated way from one part of the test to another or on different forms of
the test?
The response for the speaking test is a conversation. In the written test, the
responses for part A are short answers: students fill in the missing words while
listening to the song. The response for part B is a descriptive paragraph. The
responses for part C are multiple choice. Part C contains grammar, vocabulary,
and reading questions, and the vocabulary questions are mixed with the grammar
questions: questions 1 to 3 and 6 to 7 are grammar questions, questions 4 to 5
and 8 to 10 are vocabulary questions, and questions 11 to 15 are reading
questions. It would be more logical, and easier for students, if the grammar
questions were clustered together.

5) To what extent do characteristics of the relationship between input and
response vary in an unmotivated way from one part of the test to another, or on
different forms of the test?
The input does not vary, but the responses do: conversation, filling in missing
words while listening to the song, writing a descriptive paragraph, and
multiple choice.

VALIDITY
6) Is the language ability construct for this test clearly and unambiguously
defined?
Yes, it is. There are reading, grammar, vocabulary, and conversational
components, but there are only a few written questions, which does not support
a valid interpretation of English ability at all.

7) Is the language ability construct for the test relevant to the purpose of the
test?
The purpose of the semester test for the 1st class at SMPN 4 Jambi, 2004/2005
school year, is to rank students and to set a pass level for proceeding to the
2nd class. Students are ranked and compared among classes. The test has
speaking, listening, and writing components, and a multiple-choice section
tests reading, grammar, and vocabulary ability. The test construct is therefore
both active and passive.

8) To what extent does the test task reflect the construct definition?
The speaking test reflects conversational ability. The written test reflects
listening ability, writing ability, reading ability, and grammatical ability.

9) To what extent do the scoring procedures reflect the construct definition?
It is fair to mark the students on their performance in speaking, writing, and
listening, and on choosing the correct answer in the multiple-choice questions.
The multiple-choice items have at least one alternative correct answer. Only
one answer is marked correct. "Writing good test questions for multiple choice
exams, evaluating multiple choice tests" states, "Make sure there is only one
correct answer."
The speaking, writing, listening, and multiple-choice scores are adequate to
assess a range of English skills and depth of skills, or to indicate whether a
student is ready to progress to the 2nd class.

10) Will the scores obtained from the test help us to make the desired
interpretations about the test takers' language ability?
It is both an active and a passive test. Although there are only a few
questions, listening, speaking, and writing ability are tested. The results can
help in interpreting learners' English ability.

11) What characteristics of the test setting are likely to cause different test takers
to perform differently?
All students write their answers on folio paper. All conditions are the same.
This aspect is satisfactory.

12) What characteristics of the test rubric are likely to cause different test takers to
perform differently?
The instructions for individual questions are in English and are basic in
structure. No instruction is likely to cause test takers to perform
differently.

13) What characteristics of the test input are likely to cause different test takers to
perform differently?
Incorrect grammar, words, spelling, and punctuation are input that can cause
problems for test takers. For the most part this does not occur in these tests:
incorrect grammar is found only in question 10 of part C. Incorrect test
questions invalidate the test.

14) What characteristics of the expected response are likely to cause different test
takers to perform differently?
Once again, having several possible correct answers to a question, whilst
marking only one as correct, can cause major problems for test takers. The
responses in the speaking and writing tests differ from student to student
according to their competence, and they are scored using a scoring rubric. The
listening and multiple-choice questions each have one possible correct answer.

15) What characteristics of the relationship between input and response are likely
to cause different test takers to perform differently?
The test includes Indonesian boys' and girls' names, which do not cause
problems when trying to identify gender or subject. This is quite good.
There is no question that has cultural bias. The questions have input
commonly known to Indonesians.

AUTHENTICITY
16) To what extent does the description of tasks in the TLU [target language use]
domain include information about the setting, input, expected response, and
relationship between input and response?
The setting is included in some reading questions, but absent from grammatical
and vocabulary questions. Question 7 epitomizes the confusion over expected
response and input. The expected response by the test constructor is "a)
which" [photographs]. However, "b) what" and even "c) whose" are
grammatically acceptable. The expected response must match the input given
in an authentic, valid test. Kehoe (1995, para. 3) states, "As a rule one is
concerned with writing stems that are clear and parsimonious, answers that are
unequivocal and chosen by the students who do best on the test." Question 28
lacks input for the response. There is no specific information on which to base
the answer of "b". In essence, the TLU needs more contextual support and only
one correct response. Question 42 relies heavily on knowledge of an
Indonesian folk tale. Without knowledge of the folk tale, construction of the
paragraph could differ from the expected response. There is a very small
minority of students in Indonesia who do not know this folk tale, for example,
foreign nationals sitting the tests.

17. To what extent do the characteristics of the test task correspond to those of
the TLU tasks?
Conversational items tested as multiple choice items are far from authentic!
Thus, question 53 in reality could be a, b, c, or d depending on context. There
are also many grammatical mistakes, making the test non-authentic.

INTERACTIVENESS
18. To what extent does the task presuppose the appropriate area or level of
topical knowledge, and to what extent can we expect the test takers to have
this area or level of topical knowledge?
As previously mentioned, question 42 presupposes topical knowledge of an
Indonesian folk tale. Overall, the topical knowledge is appropriate for 14- to
15-year-old students, for example, the areas of media, sickness, and sport. There
is generally no great need to use specific topical knowledge to answer
questions.

19. To what extent are the personal characteristics of the test takers included in
the design statement?
The design is for final year students of junior high schools in Indonesia in the
Jakarta district. It is assumed all test takers are Indonesian, aged 14 to 15, and
all speak Indonesian. All have done the pre-UAN test and it is assumed all
have completed nine years of formal schooling. It is to be noted that there is a
tiny minority of foreign nationals who also sit the test, but the government
assumes and expects that they get schooled in international schools.

20. To what extent are the characteristics of the test tasks suitable for test
takers with the specified personal characteristics?
The test tasks are very much appropriate for the average and lower ability
students. In this regard the test is suitable, but it fails to take into account the
higher ability students, for whom the tasks are at a functional level far below
their ability. That is, some students who achieve excellent results in native
speaking English tests, for the same educational level in English, are tested on
tasks that neither challenge nor address their level of ability. Year nine students
at the school where I teach have performed well above average for the 2001
year 9/10 Australian English schools' competition test items, with one student
scoring 100%. However, many students from Indonesia fail the UAN test.
Thus there is a big range of ability, but the test tasks do not cover the
whole range.

21. Does the processing required in a test task involve a very narrow range of
areas of language knowledge?
As discussed previously, the tasks engage a very limited range of language
knowledge.

22. What language functions, other than the simple demonstration of language
ability, are involved in processing the input and formulating a response?
None.

23. To what extent are the test tasks interdependent?
They are not. All questions are dependent on the specific questions or reading
passage and are independent of other items.

24. How much opportunity for strategy involvement is provided?
A pre-UAN test is administered to all students under the same test conditions
two weeks prior to the UAN test. A text with past years' UAN test items is also
available to teachers in all schools to prepare students. The construction of
the tests is very similar from year to year and thus provides students and
teachers with ample time to prepare.

25. Is this test likely to evoke an affective response that would make it
relatively easy or difficult for the test takers to perform at their best?
No. The topics are culturally sensitive and non-emotive.

IMPACT
26. To what extent might the experience of taking the test or the feedback
received affect characteristics of test takers that relate to language use?
This test is passive: it tests only understanding of the language, neglecting
higher skills such as processing, comparing, debating, and even production of
language. Hughes (2003, p. 1) claims, "If the
skill of writing, for example, is tested only by multiple choice items then there
is great pressure to practise such items rather than practise the skill of writing
itself. This is clearly undesirable." The UAN test aims to test grammar, but
students are not required to construct any sentences. The students are to learn
conversational conventions, but are not tested orally. Research by Hadiatmaja,
cited by Somantri (2003, para. 6) observes that Indonesian school students
learning English "are passive and receptive only [translation]." Thus the
backwash effect of the UAN tests can be seen in students' passive and
receptive skill focus with problems in construction of discourse in speaking
and writing.

27. What provisions are there for involving test takers directly, or for
collecting and utilizing feedback from test takers, in the design and
development of the tests?
There are no known provisions. Students do not have the opportunity to
provide any feedback or have any input into the development of the test.

28. How relevant, complete, and meaningful is the feedback that is provided
to test takers?
Correct answers and students' responses are given, showing their mistakes. A
final score and school ranking are also given. These are just statistics, and
students are not given any explanation of why answers to test items are correct.
No information is given on their language ability or mastery of subject matter.
It is difficult for the individual teacher to provide good feedback due to the
number of alternative correct answers.

29. Are decision procedures and criteria applied uniformly to all groups of test
takers?
Yes. All schools follow the same criteria of the UAN score, and scores are
objective, independent of participation, attendance, attitude, or other factors.

30. How relevant and appropriate are the test scores to the decisions to be
made?
The test score is the single factor in determining the grade and in determining
whether the student can proceed to senior high school.

31. Are test takers fully informed about the procedures and criteria that will be
used in making decisions?

32. Are these procedures and criteria actually followed in making the
decisions?
Yes. There are no exceptions, though those who fail may sit for the test again.

33. How consistent are the areas of language ability to be measured with those
that are included in teaching materials?
Teachers' teaching materials usually match the language ability to be
measured, as is the case in the majority of schools. However, schools such as
my school do not follow the national curriculum per se and go well beyond it,
including active skills and listening, speaking, and writing skills, in
addition to the reading and grammar of the national curriculum. These
schools, their teachers, and their students feel uncomfortable with the test, as it
neither matches their learning content nor tests most of their ability.

34. How consistent are the characteristics of the test and test tasks with the
characteristics of teaching and learning activities?
This is dependent on the individual teacher. Due to the passive nature of the
tests, a lot of students learn English in a passive manner and as a result
Artsiyanti (2002, para. 6) claims, "Students do not know when structures
[grammar] have to be used and how to apply them in everyday life
[translation]." The test tasks contribute to a negative backwash effect in the
classrooms.

35. How consistent is the purpose of the test with the values and goals of
teachers and of the instructional program?
The test is far from achieving the goals of English at IPEKA. Due to its
limitation to passive, receptive skills, it is also not consistent with the goals of
other schools' English courses, even though it is consistent with the national
curriculum.

36. Are the interpretations we make of the test scores consistent with the
values and goals of society and the education system?
If wages are a reflection of worth, society does not value the worth of teachers
in Indonesia in comparison to western countries. The average wage of a
teacher is Rp 700,000 to Rp 800,000 (just over $100 AUD) a month (Sistem
pendidikan harus dirombak secara radikal, 2004). Schools are often
dilapidated and some students cannot afford their tuition. Many language
teachers do not have adequate mastery of English to teach effectively and
efficiently in schools in Indonesia. Somehow test scores are regarded as highly
valid and respected by most as the major measure of performance in English
and as a means to determine the academic progression of students to the next
level.

There are more pressing concerns here of terrorism, hunger, and work. The
acceptance by society and the educational system of the test scores should not
equate with the usefulness of the test. The UAN needs reform!

37. To what extent do the values and goals of the test developer coincide or
conflict with those of society and the education system?
There is agreement with the education system and most of society.

38. What are the potential consequences, both positive and negative, for
society and the education system, of using a test in this particular way?
The backwash effect contributes to passive learners and English speakers not
confident in production of English, which is the case in Indonesia today.

39. What is the most desirable positive consequence, or the best thing that
could happen as a result of using the test in this particular way, and how likely
is this to happen?
The test could act as a motivating factor for some in mastering passive
English. This is still not likely.

40. What is the least desirable negative consequence, or the worst thing that
could happen as a result of using the test in this particular way, and how likely
is this to happen?
As mentioned previously, many students will learn an understanding of
reading and grammar in a passive and receptive manner without learning
active skills and to the exclusion of speaking, listening, and writing. This is
highly likely as it is already evidenced throughout the country.

PRACTICALITY
41. What type and relative amounts of resources are required for: (a) the
design stage, (b) the operationalization stage, and (c) the administration
stage?
There is not much money available for the UAN tests, nor time, nor expertise.
The design is done by a few local English teachers with no resources provided
by the government, apart from the syllabus and test construction design. The
operation is done by a central team by Scanton computer marking. The
administration of the tests is by individual schools.

42. What resources will be available for carrying out (a), (b), and (c) above?
Teachers, computers, printers and paper are available. Resources are very
limited in Indonesia due to its massive student population and limited budget.

CONCLUSION

The UAN is not very useful. It is not valid, authentic, or interactive, and it has
negative impacts on learning. It is, however, reasonably reliable and practical.

The purpose of the UAN is to measure a level of English competence to
progress to senior high school. It obviously fails to do so. It is necessary first to
determine the aim or goal of the test. Kitao and Kitao (1996b, para. 2) state,
"The goal of the test is what you want to measure." There are many
unmeasured skills that can be tested. Listening, writing, and speaking can all
be assessed in addition to reading and grammar. In regard to grammar, Kitao
and Kitao (1996a, conc. para.) state, "While the testing of grammatical
knowledge is limited -- it does not necessarily indicate whether the testee can
use the grammatical knowledge in a communicative situation -- it is
sometimes necessary and useful."

Indonesian schools are moving towards an outcome-based curriculum. A
criterion-referenced test (CRT) could well be an excellent alternative to the
present UAN test. Gorsuch (1997, para. 25) claims, "Only CRTs will allow
teachers to set standards, measure achievement and give students valuable
feedback at the course level." This could then present the opportunity for
positive backwash so that students are active users of English and confident in
all skills.

All in all the UAN fails to be useful because of its test construction which is
riddled with mistakes and contains many alternative multiple choice answers
that are correct. Hughes (2003, p.2) claims, "Students' true abilities are not
always reflected in the test scores that they obtain." This is the case with the
UAN test.
