Standardized tests
Standardized tests measure someone's performance against the average performance of larger groups of people. In other words, a standardized test has been given so often that we have average scores for large numbers of people; those averages are the standards. Your individual score on a standardized test reflects how you stand in relation to other people who have taken the test. An I.Q. test, for instance, is a standardized test, and your score reveals what percentage of people scored lower than you did and what percentage scored higher. A GRE score functions in the same manner: your specific GRE score tells you how many people scored lower or higher than you did. Naturally, for purposes of entering college, you would want to rank high on the percentile scale.
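The idea of standing "in relation to other people" can be sketched in a few lines of Python. This is only an illustration: the norm group below is a small made-up list of scores, not data from any real test, and the function names are hypothetical.

```python
def percent_scoring_lower(score, reference_scores):
    """Percentage of the reference group that scored strictly below `score`."""
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    lower = sum(1 for s in reference_scores if s < score)
    return 100.0 * lower / len(reference_scores)


def percent_scoring_higher(score, reference_scores):
    """Percentage of the reference group that scored strictly above `score`."""
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    higher = sum(1 for s in reference_scores if s > score)
    return 100.0 * higher / len(reference_scores)


# Hypothetical norm group: invented scores used only for illustration.
norm_group = [85, 90, 95, 100, 100, 105, 110, 115, 120, 130]

my_score = 110
print(f"{percent_scoring_lower(my_score, norm_group):.0f}% scored lower")
print(f"{percent_scoring_higher(my_score, norm_group):.0f}% scored higher")
```

A real standardized test uses a far larger norm group, but the comparison works the same way: the score means something only in relation to everyone else's scores.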
met specific learning objectives. Testing can be done with traditional multiple-choice tests, essay tests, oral exams, performance tests, or any other means, as long as the evaluation focuses on the objectives that the student should have met. From this perspective, too, a student's poor performance on a test does not necessarily indicate poor study habits on the part of the student; it might also, God forbid, be the result of poor teaching.
1. Validity
A valid test is one that measures the learning objectives realistically and effectively. To test someone's swimming ability, for instance, you would not devise a written test, but one that had to be taken in the water. The written test in this case would be an invalid test, since it cannot measure someone's performance in the water. By the same token, if you wanted to test a student's ability to draw microorganisms from a drop of water under a microscope, you would not want to give that student an essay or multiple-choice test on the uses of the microscope. To decide whether you have a valid test, ask yourself whether the type of test you are using is really the best way of assessing the most important types of learning that the students should have engaged in. In other words, do you have a valid tool for the task?
Task: Read the following examples and determine whether each test item is valid or invalid.
A life science teacher has decided that the following three objectives are her most important learning objectives for a unit on the California chaparral biome.
1. After a slide session showing examples of different types of biomes, students will be able to identify those slides that show the visual characteristics of the California chaparral.
2. Given actual samples of different plants of the chaparral, students will be able to make generalizations about the possible methods used by plants to survive prolonged drought and heat conditions.
3. After a video about fires in the chaparral, students will be able to describe the role played by such fires in the chaparral life cycle.
On her test, this teacher decided that, among other questions, the following were the most important ones. Please identify the items as being valid or invalid and explain why.
1. Define the term chaparral.
2. Please explain in your own words how different chaparral plants manage to survive prolonged drought and heat.
3. On a map of California, please identify the places where chaparral occurs.
4. Please identify by name the chaparral plants pictured below.
5. Please describe the role of fire in the life cycle of a manzanita.
If there is one generalization you could make about the issue of validity and teacher-made tests, what would it be?
2. Reliability
By a reliable test we mean a test whose score is a trustworthy assessment of a student's skills. After a test, a teacher should always ask himself or herself the following question: is the score that this student received on this test an accurate reflection of the student's understanding of the material, or is the score due to other factors that have nothing to do with the student's understanding? Many factors can influence the reliability of a test. Among them are the following:
1. Lack of instructions. Every test should have instructions for the test as a whole and for its different sections. Make sure that you tell your students how you want them to answer the test and how they should mark their answers. It helps to provide examples.
2. Clarity. If a test is difficult to read because of spelling mistakes, sloppiness, hand-written items, etc., the student might answer the question incorrectly through no fault of his or her own.
3. Ambiguity. If the wording of a question is puzzling or if the directions are unclear, the student may provide an answer that is not the answer you are looking for.
4. Statistical Random Error. This is a fancy way of describing the effect of guessing on a test item. The more chances there are that a student can make a lucky guess, the larger the statistical random error on a test. What would be the percentage of error in a true/false question? What would be the percentage in a multiple-choice question with 4 choices? What type of question might provide the least amount of random error?
5. Length and Variety of the Test. If the test consists of too many questions of the same type, the students will become fatigued and bored with the test. The result is often that they no longer want to think carefully about the items. Keep tests relatively short and vary the types of questions on a test. Remember that classroom tests are not tests of stamina.
6. Other Factors. What other factors might affect a student's score and hence the reliability of that score?
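The effect of guessing can be checked with a small Python sketch. It assumes the simplest possible case, which is not stated in the handout: the student knows nothing about the item and picks one of the offered choices completely at random. The function names are hypothetical.

```python
from fractions import Fraction


def chance_of_lucky_guess(num_choices):
    """Probability of a correct blind guess among equally likely choices."""
    if num_choices < 1:
        raise ValueError("an item needs at least one choice")
    return Fraction(1, num_choices)


def expected_guessing_score(num_items, num_choices):
    """Average number of items a blind guesser would get right."""
    return num_items * chance_of_lucky_guess(num_choices)


# Assumption: pure guessing, no partial knowledge of the material.
for label, choices in [("true/false", 2), ("multiple choice, 4 choices", 4)]:
    chance = chance_of_lucky_guess(choices)
    print(f"{label}: {float(chance):.0%} chance of a lucky guess")
```

Under the same assumption, `expected_guessing_score(20, 4)` gives the number of items a student could expect to get right on a 20-item, 4-choice test without knowing any of the material.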
Task: Please read the following examples, determine whether each test is reliable or unreliable, and explain the reasons for your response.
1. The students are given a test on a chaparral unit. Although the teacher thought that she had included some difficult questions, every single student obtained an A.
2. The teacher had to give students a test and on the morning of the test discovered he had no access to a typewriter or computer. He had to write the test in longhand. Although students had some difficulty reading the test, the teacher felt that the test was fair. Most students scored in the C+, B- range.
3. Carol had a fever when she took the test. She told the teacher that she had studied hard, but she still received a D on the exam.
The following items found on tests are of dubious reliability. Please analyze each item and explain why these items might be unreliable.
1. The Judiciary Committee's impeachment deliberation resulted in a resolution in favor of
a. no impeachment
b. the majority voted for three articles
c. a sharp division between the two parties
d. one article cited obstruction of justice
2. America was discovered by _____________________________
3. Which of the following is the best brief description of the novelist's writing?
a. An approach to characterization and description that might be considered flowery and convoluted.
b. An approach to characterization and theme development that might be considered to be over-action focused.
c. An approach to characterization that is psychoanalytical.
d. An approach to characterization and character development that is focused on inner-feeling.
4. The value of π is 3.14 (true/false)
3. Comprehensiveness
A test should cover all the objectives of a unit adequately with more emphasis on those objectives that are considered more important than others. In other words, the test should be weighted in terms of content. Among the more important ones should be objectives testing for higher types of thinking skills such as application, analysis, evaluation, and synthesis.
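One way to picture "weighted in terms of content" is a simple test blueprint that splits the available items among the objectives according to their importance. The Python sketch below is only an illustration: the objectives and the importance weights are invented, and the rounding rule (largest remainder) is just one reasonable choice.

```python
def allocate_items(weights, total_items):
    """Split total_items among objectives in proportion to their weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must add up to a positive number")
    # Exact (possibly fractional) share of items for each objective.
    exact = {name: total_items * w / total_weight for name, w in weights.items()}
    # Start from the whole-number part of each share.
    counts = {name: int(share) for name, share in exact.items()}
    # Hand out any leftover items to the largest fractional remainders.
    leftover = total_items - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical objectives and weights (higher weight = more important).
weights = {
    "identify polygons by name": 1,
    "calculate the area of polygons": 2,
    "calculate areas of polygons within polygons": 3,
}
blueprint = allocate_items(weights, total_items=30)
for objective, count in blueprint.items():
    print(f"{count:2d} items: {objective}")
```

Comparing such a blueprint with the actual test shows quickly whether an important objective has been left out or under-represented.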
Task: Please read the following examples and determine whether each test is comprehensive or non-comprehensive.
Example for analysis 1
A teacher is teaching a unit on polygons. She has written a number of objectives, among which are identifying by name various polygons, calculating the area of various polygons, and calculating the areas of polygons within polygons. Her test consisted of 30 items: 10 problems asking students to calculate the areas of rectangles, 10 problems asking students to calculate the areas of triangles, and 10 asking students to calculate the areas of trapezoids.
Example for analysis 2
A teacher teaching the play Romeo and Juliet spent 1 day on a brief history of the Renaissance, 1 day on the stage under Elizabeth I, 1 day on the biography of Shakespeare, 3 days on watching a movie of the play, and 4 days discussing the play, which the students read at home. On the test, 60% of the grade depended on the answer to the following question: Given the situation of Romeo and Juliet, can you think of any political situation today where this scenario could occur? Please write a brief scenario of such an event, using modern social groups and terms.
4. Objectivity
Whether or not a student answered an item correctly should not be a matter of personal interpretation by the grader. The more precise the criteria for the test are, the more objective the test will be.
Task: Please read the following examples, explain why they might be non-objective, and describe how you might want to change the item or situation.
1. Everyone in the United States has an equal opportunity in education. (circle one: true false)
2. A student obtained a low grade on an essay test graded by the teacher's assistant. Upset, the student turned to the professor who taught the class, who reread the essay test. On the second reading the professor thought that the student did indeed make some important points, and he therefore increased the grade.
5. Practicality
A test should be relatively easy to administer, take, and grade. These are the most practical characteristics to keep in mind, and they are important ones. If the test is too cumbersome to use or too difficult to administer, it will put too much emphasis on the evaluation rather than on the teaching and learning aspects of a lesson or unit. Thus, although you must keep the criteria of objectivity, reliability, validity, and comprehensiveness in mind, you must balance these with the practical considerations of time and ease of administration.
Task: Please rank the following items from the most practical type of test to the least practical. Put a 1 in front of the item you consider to be the most practical, a 2 in front of the next most practical item, etc.
A strictly multiple-choice test.
A test consisting of 3 essay questions with detailed evaluation criteria.
A test which has been used in the past by the teacher and which has been shown to be reliable and valid.
An exhaustive test that tests every objective 3 times in order to maintain reliability and validity.
A ready-made test which is highly reliable and valid but which needs to be bought and costs quite a bit of money.
Types of Assessments
Pop quizzes
Unit Tests (multiple choice, essay, fill-ins, etc.)
Authentic Assessment
Enhanced Multiple Choice
Open-ended essays
Projects
Performance Assessment
Investigations
Portfolio
kok 2006
Finding out whether or not kids have met the objectives for a lesson, a unit, or a curriculum:
Criterion-referenced tests. (Example: By the end of the lesson, the students will be able to explain in their own words the reasons why Goldilocks selected the medium chair.) See also various classroom assessment tools (next page).
Finding out how much kids know or can do in comparison with most kids of their age level: Standardized Tests
California English Language Development Tests (CELDT)
California Standards Tests (CSTs)
Grant Wiggins (1989), Teaching to the Authentic Test, Educational Leadership, 45(7), p. 44.
Level Two: The pieces are dated, titles recorded; chronological; potential for showing development
Level Three: Date, title, form (writing), genre (reading); audience, level of process development
Level Four: Date, title, form, genre, audience, process; skills/goals; ideas to write about
A few selected pieces (samples) scored holistically; reflection/self-assessment; student and teacher input; conscious statement of growth