Cut Score
A selected point on the score scale of a test. The cut score separates students into various categories, such as passing and failing. The process of setting a cut score is called standard setting, which is a multi-stage, judgmental process.
March 27, 2012
Nomenclature
Cutoff scores
Cutting scores
Cut points
Cutscores
Passing scores
Classification scores
Achievement levels
Mastery levels
Criterion levels
Performance levels
Performance standards
Standards

(ETS, 2005)
Why is there a need to set cut scores?
What benefits are expected from the use of cut scores?
What decisions will be made on the basis of the cut scores?
How are those decisions being made now in the absence of cut scores?
There's no objective way to set cut scores. Cut scores are constructed, not found. They are based on judgments about people, student work, or test items. Judges depend on subjective, internalized norms about what people can do.
Errors of Classification
Some people who deserve to pass will fail. Some people who deserve to fail will pass. Raising or lowering the cut score to reduce one type of error will increase the other type of error. There's no way to prove that a cut score is correct.
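The tradeoff described above can be illustrated with a small simulation. This is an illustrative sketch, not part of the original slides: the scores and "truly qualified" flags below are invented example data, and in a real testing program an examinee's true status is not directly observable.

```python
# Illustrative sketch (hypothetical data): moving the cut score trades
# one classification error for the other.

def error_counts(scores_with_status, cut):
    """Count false fails (qualified but below cut) and
    false passes (unqualified but at or above cut)."""
    false_fail = sum(1 for s, qualified in scores_with_status
                     if qualified and s < cut)
    false_pass = sum(1 for s, qualified in scores_with_status
                     if not qualified and s >= cut)
    return false_fail, false_pass

# (score, truly_qualified) pairs -- invented examinees
examinees = [(55, False), (60, False), (62, True), (65, False),
             (68, True), (70, True), (72, False), (75, True),
             (80, True), (85, True)]

for cut in (60, 65, 70, 75):
    ff, fp = error_counts(examinees, cut)
    print(f"cut={cut}: false fails={ff}, false passes={fp}")
```

As the cut rises from 60 to 75 on this toy data, false passes fall from 3 to 0 while false fails climb from 0 to 3, with no cut eliminating both.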
1. Select judges.
2. Teach judges about cut scores.
3. Define borderline knowledge/skills for a particular task/assessment.
4. Train judges in the use of the method.
5. Implement the cut score study.
6. Document the results.
Judges
Be qualified
Know subject and population
Be representative and diverse
Be acceptable to stakeholders
Be willing to follow procedures
Teach them about test purpose, cut score purpose, and consequences of passing and failing.
Defining Borderline
1. Make sure judges understand what the assessment measures and how scores will be used.
2. Ask judges to describe in their own words a person whose knowledge and skills would represent the borderline between acceptable and unacceptable levels of the knowledge/skills the assessment measures.
Examples of Borderlines
Worst reader who should get a diploma / best reader who should not
Worst surgeon who should be licensed / best surgeon who should not
Worst essay that deserves a score of 6 / best essay that deserves a 5
Highest bacteria count in safe water / lowest bacteria count in polluted water
Judge Training
Overview of the standard setting method: What is it? Why do it? How is it done?
Practice using the standard setting method.
Observe judges. Correct errors. Answer all questions.
Practice more until all judges are …
Selecting a Method
There's no one best method. Different methods will result in different cut scores. Some methods are better for certain types of assessments and assessment situations.
People
Small assessment situations where faculty have strong knowledge of students' abilities
Student work
Judges examine each question on an assessment and estimate the probability that a borderline student would answer the question correctly.
Or: Imagine a group of 100 borderline students and estimate how many of them would answer the question correctly.
Discuss probabilities. Aim for a range of 10–15 percentage points per question. Judges can change judgments based on discussion. Sum each judge's probabilities for all questions to get a recommended cut score for each judge. Average the judges' recommended cut scores to arrive at an average cut score for minimum competency.
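The Modified Angoff arithmetic just described is easy to sketch in code. The judges and their item probabilities below are hypothetical; the function simply sums each judge's item probabilities and then averages the judges' recommended cut scores.

```python
# Minimal sketch of the Modified Angoff arithmetic described above.
# All ratings are hypothetical: each inner list is one judge's estimated
# probability that a borderline student answers each question correctly.

def angoff_cut_score(ratings_by_judge):
    """Sum each judge's item probabilities, then average across judges."""
    per_judge = [sum(ratings) for ratings in ratings_by_judge]
    return sum(per_judge) / len(per_judge), per_judge

ratings = [
    [0.70, 0.55, 0.80, 0.40, 0.65],  # Judge 1
    [0.60, 0.50, 0.85, 0.45, 0.60],  # Judge 2
    [0.75, 0.60, 0.75, 0.35, 0.70],  # Judge 3
]

cut, per_judge = angoff_cut_score(ratings)
print(per_judge)      # each judge's recommended cut score (raw points)
print(round(cut, 2))  # averaged cut score for minimum competency
```

On this made-up panel the three judges recommend 3.10, 3.00, and 3.15 raw-score points, averaging to about 3.08 out of 5 items.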
[Table: example item-probability ratings for Judge 1]
Modified Angoff
Pros:
Most researched method
Stands up to court challenges
Does not require student data, so it can be carried out prior to test administration

Cons:
A lot of data entry
Difficult cognitive task
Does not work well with heavily open-ended/performance-based tests
Faculty consider all they know about a population of students.
Faculty predict individual students' level of performance on an assessment (e.g., expected passing/failing, proficient/nonproficient) without reference to scores.
Obtain assessment scores.
Another Example
Graph the scores of expected-passing and expected-failing students as two separate distributions. The point at which the two distributions intersect is the cut score.
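One rough way to code the idea above is to tally both score distributions and take the lowest score at which expected-pass students match or outnumber expected-fail students, a crude stand-in for the graphical intersection point. All data here are hypothetical, and real studies typically smooth the distributions before locating the crossing.

```python
# Rough sketch (hypothetical data) of the Contrasting Groups idea:
# find where the expected-pass distribution overtakes the
# expected-fail distribution.

from collections import Counter

def contrasting_groups_cut(fail_scores, pass_scores):
    """Return the lowest score at which expected-pass students match or
    outnumber expected-fail students -- a simple stand-in for the point
    where the two graphed distributions intersect. Assumes a single
    crossing; real studies smooth the distributions first."""
    fail_counts = Counter(fail_scores)
    pass_counts = Counter(pass_scores)
    for score in sorted(set(fail_scores) | set(pass_scores)):
        if pass_counts[score] >= fail_counts[score] and pass_counts[score] > 0:
            return score
    return None

expected_fail = [50, 55, 55, 60, 60, 60, 65, 65, 70]   # hypothetical
expected_pass = [60, 65, 65, 70, 70, 70, 75, 80, 85]   # hypothetical

print(contrasting_groups_cut(expected_fail, expected_pass))  # prints 65
```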
Contrasting Groups
Pros:
Uses real data, not conjecture
Uses external validation info in addition to test scores
Easy to explain

Cons:
Inconvenient to get judgments of people
Relies on human judgment without examining actual test performance
Need scores before cut is set
Pre-work:
Performance level categories, as well as performance level descriptors for each performance level, must be established and agreed upon. Scoring of a large sample of student work must occur before standard-setting can begin.
Select 40 to 50 intact samples of student work to represent the range of student performance on an assessment.
Range-Finding phase:
Identical sets of student work are provided to each judge. Judges are asked to independently categorize the student work samples based on the performance level descriptors, without any discussion.
This process reveals which work samples (e.g., Graduation Portfolios) generate the most agreement and which generate the most disagreement.
Pinpointing phase:
Judges examine sets of work about which they disagreed in the rangefinding phase, along with additional work samples representing those same score intervals. Judges assign performance levels to these work samples.
The minimum score for each performance level is precisely "pinpointed" by determining the score …
Body of Work
Pros:
Uses real data, not conjecture
More intuitive; panelists are not asked to imagine a hypothetical minimally competent examinee or to estimate the …

Cons:
Need scores before cut is set
Requires a lot of prior preparation, volumes of materials
Grueling work
Setting the Operational Cut Score
Only a legally authorized entity (e.g., policy makers) can authorize the use of a cut score. Once authorized, a study cut score becomes an operational cut score.
The Standards of the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education specify that the following be included in the standard setting process:
Document how judges were selected, their qualifications and training, the procedures used, whether or not judges' ratings were independent, the level of …
Seek opinions on cut scores. Find out what happened to people who failed.
Is there evidence that any people who passed are really unqualified? What were the consequences of misclassification errors?
Comments? Questions?
Discussion Questions
How could you use standard setting in your setting? Which standard setting approach might you utilize? Why? What challenges do you anticipate in implementing this approach? How might you address these challenges?
References
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, DC: American Psychological Association.
Cizek, G. J. (1996). An NCME instructional module on setting passing scores. Educational Measurement: Issues and Practice, 15(2).
Cizek, G. J., Bunch, M. B., & Koons, H. (Winter 2004). Setting performance standards: Contemporary methods. Educational Measurement: Issues and Practice, 23(4).
Horn, C., Ramos, M., Blumer, I., & Madaus, G. (2000). Cut scores: Results may vary. National Board on Educational Testing and Public Policy Monographs, 1(1). Chestnut Hill, MA: Boston College.
Livingston, S. A., & Zieky, M. J. (1982). Excerpts from Passing Scores: A Manual for Setting Standards of Performance on Educational and Occupational Tests. Princeton, NJ: Educational Testing Service.
Maine Department of Education. Measured measures: Technical considerations for developing a local assessment system. Augusta, ME.
Pitoniak, M. Considerations in Setting Performance Standards (Cutscores). Training session at the National Council on Measurement in Education conference, Montreal, Canada.