2009

Title page

A detailed report on: Determining the difficulty index (p), discrimination index (D) and descriptive statistics (mean, mode, median and standard deviation) for given multiple-choice test data with twenty questions answered by twenty-five learners.

Armstrong Simelane (student no. 29307679), University of Pretoria, Masters in Education (CIE).
Table of figures
Terminology list
Acknowledgements
1.0 Introduction
2.0 Procedure followed
3.0 Descriptive statistics
3.1 Tabulation of Data
3.2 The mean, mode and median
3.3 The standard deviation of the data set
3.4 Grouped frequencies
3.5 The Histogram
3.6 The frequency polygon
3.7 Cumulative Frequency polygon
4.0 Item analysis
4.2 Difficulty index for the test items
4.3 Discrimination index for the test items
5.0 Summary of Findings
5.1 Conclusion
6.0 Reference list
Table of figures

Table 1: scores
Box 1: measures of central tendency
Table 2: grouped frequencies
Table 3: cumulative frequencies
Figure 1: histogram
Figure 2: frequency polygon
Figure 3: ogive
Figure 4: item analysis
Figure 5: test compilation
Figure 6: p values
Box 2: groups
Figure 7: D values
Table 4: findings
Terminology list

CRITERION-REFERENCED TEST (CRT): a test that provides information about a student's performance based on a defined set of standards, a set criterion, a level of proficiency, or mastery of a skill or set of skills.

DATA: raw facts and figures which have not been manipulated to give the mean, median or mode.
Acknowledgements

It arises from the dunes of my heart to thank Professor J.G. Knoetze for being the 'GPRS' in this course, eae 880. I have personally learnt a lot from him and no one can take that away from me. My heart specially goes out to my wife Khanyi and my one-year-old daughter Mumu, and to my brother Mawina, whose prayers, love and support kept me going. May God bless you all the time.
1.0 Introduction

Among the many facets of CIE (computer integrated education) is CIA (computer integrated assessment). This facet deals with measurement, testing, assessment and evaluation issues in classroom life. It also covers the evaluation and assessment of E-learning programs.

This report covers the various processes that an e-learning facilitator should go through in the evaluation and assessment of E-learning. A number of descriptors were used to determine the nature of the test, using test data provided by the course lecturer, Professor J.G. Knoetze, through the website www.jgknoetze.co.za/eae880_2009.
The purpose of this report is to table the necessary evidence required to reach a conclusion on whether the test items used were of the appropriate standard. The measures of central tendency and variability were calculated to provide reliable information; these included the arithmetic mean, median and mode of the test scores.
In determining the level of difficulty of the questions, (p) values were calculated. The
difficulty index values were interpreted in terms of a scale provided in the course eae880.
The discrimination index was also calculated and was captured as (D) values which were also
interpreted to ascertain whether or not to accept specified questions.
The need for quality in the tests used cannot be overemphasised. Tests are necessary and vital to the education process. The purpose of testing is to obtain scores that assist in making instructional decisions, grading decisions, diagnostic decisions, selection decisions, placement decisions, counselling and guidance decisions, program and curriculum decisions, as well as timely and necessary administrative decisions (Kubiszyn and Borich, 2003).
2.0 Procedure followed

Test data captured from a multiple-choice test with twenty questions was used for this report. The file retrieved from the server was named testdata.xls. Each multiple-choice question had four options: A, B, C and D. The test was administered to twenty-five learners.

To obtain the descriptive statistics, an analysis was done in Microsoft Excel 2007. The test data was retrieved, formatted and tabulated as shown in Appendix A. Calculations of the total test mark, the total score for each of the twenty-five learners, the lowest and highest test scores in the data set, the range of the data set, the frequencies, the arithmetic mean, the median and the mode were tabulated as shown in Table 1.
3.0 Descriptive statistics

3.1 Tabulation of Data

Table 1: scores

Scores (sorted z-a): 100, 100, 90, 90, 90, 85, 85, 85, 75, 70, 70, 65, 65, 65, 65, 60, 55, 55, 50, 50, 45, 40, 30, 30, 15

Groups    f
95-104    2
85-94     6
75-84     1
65-74     6
55-64     3
45-54     3
35-44     1
25-34     2
15-24     1
3.2 The mean, mode and median

The mean, mode and median are referred to as the measures of central tendency because they summarise the centre of given data: they measure the location of the middle of a distribution, the "middle" value of the data set, and the most frequently occurring data value. This is summarised in Box 1 below.
Box 1: measures of central tendency

Number of scores: 25
Lowest score: 15
Highest score: 100
Range: 85
Mean: 65
Mode: 65
Standard deviation: 23

Formula for the mean: M = (ΣX) / N
M: symbol for the mean.
Σ: sigma symbol, tells you to sum up whatever follows (here the scores X).
N: the number of scores.
3.3 The standard deviation of the data set

The variability of the scores from the test data was measured using the standard deviation (SD or σ). Standard deviation is a measure of the dispersion of a set of data from its mean: the more spread apart the data, the higher the deviation. A low standard deviation indicates that the data are clustered around the mean, whereas a high standard deviation indicates that the data are widely spread (Knoetze, 2009).
SD = √( Σ(X − M)² / N )

σ or SD = standard deviation
The standard deviation for the data set was computed and found to be 23.
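The values in Box 1 and the standard deviation can be checked independently; a minimal sketch in Python (the report's analysis was done in Excel), using the 25 scores from Table 1:

```python
from statistics import mean, median, mode, stdev

# The 25 test scores from Table 1 (sorted z-a).
scores = [100, 100, 90, 90, 90, 85, 85, 85, 75, 70, 70, 65, 65,
          65, 65, 60, 55, 55, 50, 50, 45, 40, 30, 30, 15]

print("Number of scores:", len(scores))         # 25
print("Lowest score:", min(scores))             # 15
print("Highest score:", max(scores))            # 100
print("Range:", max(scores) - min(scores))      # 85
print("Mean:", round(mean(scores)))             # 65
print("Median:", median(scores))                # 65
print("Mode:", mode(scores))                    # 65
# stdev() is the sample standard deviation, matching Excel's STDEV.
print("SD:", round(stdev(scores)))              # 23
```

Note that `stdev()` divides by N − 1; the formula above divides by N and gives roughly 22.2, so the reported value of 23 evidently comes from the sample formula.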
3.4 Grouped frequencies

To display the grouped data, the class intervals and interval size for the data set, the grouped frequencies, the midpoints of the class intervals and the cumulative frequencies, Tables 2 and 3 were used as shown.

Table 2: tabulating grouped frequencies

LL   UP   MP    f   Groups   f
15   24   19.5  1   15-24    1
25   34   29.5  2   25-34    2
35   44   39.5  1   35-44    1
45   54   49.5  3   45-54    3
55   64   59.5  3   55-64    3
65   74   69.5  6   65-74    6
75   84   79.5  1   75-84    1
85   94   89.5  6   85-94    6
95   104  99.5  2   95-104   2
Table 3: calculating cumulative frequencies

LL UP MP f Groups UP (a-z) cf
15 24 19.5 1 15-24 24 1
25 34 29.5 2 25-34 34 3
35 44 39.5 1 35-44 44 4
45 54 49.5 3 45-54 54 7
55 64 59.5 3 55-64 64 10
65 74 69.5 6 65-74 74 16
75 84 79.5 1 75-84 84 17
85 94 89.5 6 85-94 94 23
95 104 99.5 2 95-104 104 25
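The grouped and cumulative frequencies above can be regenerated from the raw scores; a short sketch, using class intervals of size 10 starting at a lower limit of 15, as in the tables:

```python
# The 25 test scores from Table 1.
scores = [100, 100, 90, 90, 90, 85, 85, 85, 75, 70, 70, 65, 65,
          65, 65, 60, 55, 55, 50, 50, 45, 40, 30, 30, 15]

cf = 0
print("LL  UP   MP    f  cf")
for ll in range(15, 105, 10):               # lower limits 15, 25, ..., 95
    up = ll + 9                             # upper limit of the interval
    mp = (ll + up) / 2                      # midpoint, e.g. (15 + 24) / 2 = 19.5
    f = sum(ll <= s <= up for s in scores)  # grouped frequency
    cf += f                                 # running cumulative frequency
    print(f"{ll:<3} {up:<4} {mp:<5} {f:<2} {cf}")
```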
3.5 The Histogram

The histogram shown in Figure 1 indicates that most of the students passed the test, as 15 of the 25 scores lie at or above the mean, mode and median.
Figure 1: histogram of the grouped frequencies (frequency per score interval)
3.6 The frequency polygon

The frequency polygon shown in Figure 2 illustrates the midpoints 69.5 and 89.5 as the two peaks of the curve, while 19.5, 39.5 and 79.5 are the lowest points (a frequency of 1 each).
Figure 2: frequency polygon (frequency per interval midpoint)
3.7 Cumulative Frequency polygon

Figure 3 shows the cumulative frequency polygon (ogive), plotting the cumulative frequency against the upper limit of each class interval.

Figure 3: ogive (cumulative frequency per interval upper limit)
4.0 Item analysis

Item analysis is conducted to determine the number of items to include in the final version of a test. The item discrimination index and the difficulty index are two of the quantitative aspects of item analysis (Sapp, 2003). Figure 4 presents a pictorial view of the levels of difficulty and discrimination for the multiple-choice items.
Figure 4: p [Diff.index] and D [Disc.index] per question (Q1-Q20)
Figure 5: test compilation (number of learners who answered and who answered correctly, per question Q1-Q20)
4.2 Difficulty index for the test items

The difficulty index, expressed as p values, is shown in Figure 6. It is the proportion of correct answers: the number of correct answers divided by the number of learners who answered the question.
Figure 6
Questions Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20
# Correct 21 22 17 12 21 17 11 12 13 8 23 19 15 21 20 22 15 8 13 16
# Answered 25 25 25 25 25 25 25 23 25 24 25 25 25 25 25 24 24 24 25 25
p [Diff.index] 0.84 0.88 0.68 0.48 0.84 0.68 0.44 0.52 0.52 0.33 0.92 0.76 0.60 0.84 0.80 0.92 0.63 0.33 0.52 0.64
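Each p value is simply the number of correct answers divided by the number of learners who answered; a sketch using a few of the counts from Figure 6:

```python
# (# correct, # answered) for a sample of questions, taken from Figure 6.
counts = {"Q1": (21, 25), "Q2": (22, 25), "Q3": (17, 25), "Q10": (8, 24)}

for q, (correct, answered) in counts.items():
    p = correct / answered        # difficulty index
    print(f"{q}: p = {p:.2f}")    # Q1 0.84, Q2 0.88, Q3 0.68, Q10 0.33
```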
4.3 Discrimination index for the test items

A good test should be able to differentiate the high scorers from the low scorers (Knoetze, 2009). Figure 7 presents the discrimination indices for all twenty multiple-choice questions. The calculation of this index considers the number of correct answers in the upper group (UG) and the number in the lower group (LG).
Figure 7
Questions Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20
# LG Correct 6 7 3 4 6 5 2 2 3 0 9 5 3 6 6 7 3 3 1 5
# UG Correct 15 15 14 8 15 12 9 10 10 8 14 14 12 15 14 15 12 5 12 11
D [Disc.index] 0.60 0.53 0.79 0.50 0.60 0.58 0.78 0.80 0.70 1.00 0.36 0.64 0.75 0.60 0.57 0.53 0.75 0.40 0.92 0.55
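The D values in Figure 7 are reproduced by D = (UG − LG) / UG, i.e. the difference in correct answers between the upper and lower groups, divided by the upper-group count. (Textbook definitions more commonly divide by the size of one group; the divisor here is inferred from the figures, not stated in the report.) A sketch:

```python
# (# LG correct, # UG correct) for a sample of questions, from Figure 7.
groups = {"Q1": (6, 15), "Q8": (2, 10), "Q10": (0, 8), "Q19": (1, 12)}

for q, (lg, ug) in groups.items():
    d = (ug - lg) / ug            # discrimination index as computed in this report
    print(f"{q}: D = {d:.2f}")    # Q1 0.60, Q8 0.80, Q10 1.00, Q19 0.92
```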
5.0 Summary of Findings

The final analysis in Table 4 presents 8 questions (1, 2, 5, 11, 12, 14, 15 and 16) whose difficulty index is not acceptable: these 8 questions are too easy. In terms of the discrimination index, all 20 questions are acceptable. The 8 flagged questions may have poor distractors, obvious answers, or verbal clues.
Table 4: findings

Questions | p values | D values | Result | Reason | Action
Q1 | 0.84 | 0.60 | Unacceptable | Too easy | Remove and replace from test item
Q2 | 0.88 | 0.53 | Unacceptable | Too easy | Remove and replace from test item
Q3 | 0.68 | 0.79 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q4 | 0.48 | 0.50 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q5 | 0.84 | 0.60 | Unacceptable | Too easy | Remove and replace from test item
Q6 | 0.68 | 0.58 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q7 | 0.44 | 0.78 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q8 | 0.52 | 0.80 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q9 | 0.52 | 0.70 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q10 | 0.33 | 1.00 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q11 | 0.92 | 0.36 | Unacceptable | Too easy | Remove and replace from test item
Q12 | 0.76 | 0.64 | Unacceptable | Too easy | Remove and replace from test item
Q13 | 0.60 | 0.75 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q14 | 0.84 | 0.60 | Unacceptable | Too easy | Remove and replace from test item
Q15 | 0.80 | 0.57 | Unacceptable | Too easy | Remove and replace from test item
Q16 | 0.92 | 0.53 | Unacceptable | Too easy | Remove and replace from test item
Q17 | 0.63 | 0.75 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q18 | 0.33 | 0.40 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q19 | 0.52 | 0.92 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q20 | 0.64 | 0.55 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
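The accept/remove decisions above follow a simple threshold on p; a sketch assuming the cut-off implied by the table (questions with p above 0.75 are flagged as too easy; every D value in this data set is high enough to pass). The 0.75 figure is an inference from the table's decisions, not a value stated in the report:

```python
# (p, D) for a sample of questions, taken from Table 4.
items = {"Q1": (0.84, 0.60), "Q3": (0.68, 0.79), "Q12": (0.76, 0.64)}

for q, (p, d) in items.items():
    # Hypothetical threshold inferred from the report's decisions.
    if p > 0.75:
        print(f"{q}: Unacceptable (too easy) - remove and replace")
    else:
        print(f"{q}: Acceptable - maintain in the test")
```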
5.1 Conclusion
This is a good test because 60% (12 out of 20) of the questions are acceptable.
6.0 Reference list

Knoetze, J.G. 2009. Evaluation and assessment of E-learning: CIE module notes. Department of Science, Mathematics and Technology, Faculty of Education, University of Pretoria.

Mills, C.N. 2002. Computer-based assessment: Building the foundation for future assessments. New Jersey: Lawrence Erlbaum Associates.

Nitko, A.J. 2004. Educational assessment of students. 4th edition. New Jersey: Pearson, Merrill and Prentice Hall.

Sapp, M. 2003. Psychological and educational test scores: What are they? Illinois: Charles Thomas.

Sim, S.M. and Rasiah, R.I. 2006. Relationship between item difficulty and discrimination indices in true/false-type multiple choice questions of a para-clinical multidisciplinary paper. Ann Acad Med Singapore 2006; 35:67-71. Retrieved 19 August 2009 from http://www.ams.edu.

Tucker, S. Using Remark Statistics for Test Reliability and Item Analysis. Retrieved 19 August 2009 from http://www.proftesting.com.