2009

Title page

A detailed report on: Determining the difficulty index (p), discrimination index (D) and descriptive statistics (mean, mode, median and standard deviation) for given multiple-choice test data with twenty questions answered by twenty-five learners.

Armstrong Simelane (student no. 29307679), University of Pretoria, Masters in Education (CIE).
Table of figures
Terminology list
Acknowledgements
1.0 Introduction
2.0 Procedure followed
3.0 Descriptive statistics
3.1 Tabulation of Data
3.2 The mean, mode and median
3.3 The standard deviation of the data set
3.4 Grouped frequencies
3.5 The Histogram
3.6 The frequency polygon
3.7 Cumulative Frequency polygon
4.0 Item analysis
4.2 Difficulty index for the test items
4.3 Discrimination index for the test items
5.0 Summary of Findings
5.1 Conclusion
6.0 Reference list
Table of figures

Table 1: scores
Box 1: measures of central tendency
Table 2: grouped frequencies
Table 3: cumulative frequencies
Figure 1: histogram
Figure 2: frequency polygon
Figure 3: ogive
Figure 4: item analysis
Figure 5: test compilation
Figure 6: p values
Box 2: groups
Figure 7: D values
Table 4: findings
Terminology list

CRITERION-REFERENCED TEST (CRT): a test that provides information about a student's performance based on a defined set of standards, a set criterion, a level of proficiency, or mastery of a skill or set of skills.

DATA: raw facts and figures which have not been manipulated to give the mean, median or mode.
Acknowledgements

It arises from the dunes of my heart to thank Professor J.G. Knoetze for being the 'GPRS' in this course, eae 880. I have personally learnt a lot from him and no one can take that away from me. My heart specially goes out to my wife Khanyi and my one-year-old daughter Mumu, and to my brother Mawina, whose prayers, love and support kept me going. May God bless you all the time.
1.0 Introduction

Among the many facets of CIE (computer integrated education) is CIA (computer integrated assessment). This facet deals with measurement, testing, assessment and evaluation issues in classroom life. It also covers the evaluation and assessment of E-learning programs.

This report covers the various processes that an e-learning facilitator should go through in the evaluation and assessment of E-learning. A number of descriptors were used to determine the nature of the test, using test data provided by the course lecturer, Professor J.G. Knoetze, through the website www.jgknoetze.co.za/eae880_2009.
The purpose of this report is to table the necessary evidence required to reach a conclusion on whether the test items used were of the appropriate standard. The measures of central tendency and variability were calculated to provide reliable information; these included the arithmetic mean, median and mode of the test scores.
In determining the level of difficulty of the questions, (p) values were calculated. The
difficulty index values were interpreted in terms of a scale provided in the course eae880.
The discrimination index was also calculated and was captured as (D) values which were also
interpreted to ascertain whether or not to accept specified questions.
The need for quality in the tests used cannot be overemphasised. Tests are necessary and vital to the education process. The purpose of testing is to obtain scores that assist in making instructional decisions, grading decisions, diagnostic decisions, selection decisions, placement decisions, counselling and guidance decisions, program and curriculum decisions, as well as timely and necessary administrative decisions (Kubiszyn and Borich, 2003).
2.0 Procedure followed

Test data captured from a multiple-choice test with twenty questions was used for this report. The file retrieved from the server was named testdata.xls. Each multiple-choice question had four options: A, B, C and D. The test was administered to twenty-five learners.

To obtain the descriptive statistics, an analysis was done in Microsoft Excel 2007. The test data was retrieved, formatted and tabulated as shown in Appendix A. Calculations of the total test mark, the total score for each of the twenty-five learners, the lowest and highest test scores in the data set, the range of the data set, the frequencies, the arithmetic mean, the median and the mode were tabulated as shown in Table 1.
3.0 Descriptive statistics

3.1 Tabulation of Data

Table 1: scores

Scores (sorted z-a): 100, 100, 90, 90, 90, 85, 85, 85, 75, 70, 70, 65, 65, 65, 65, 60, 55, 55, 50, 50, 45, 40, 30, 30, 15

Groups    f
95-104    2
85-94     6
75-84     1
65-74     6
55-64     3
45-54     3
35-44     1
25-34     2
15-24     1
3.2 The mean, mode and median

The mean, mode and median are referred to as the measures of central tendency because they summarise the centre of given data: they measure the location of the middle of a distribution, the "middle" value of the data set, and the most frequently occurring data value. This is summarised in Box 1 below.
Box 1: measures of central tendency

Number of scores: 25
Lowest score: 15
Highest score: 100
Range: 85
Mean: 65
Mode: 65
Standard deviation: 23

Formula for the mean: M = (ΣX) / N
M: symbol for the mean.
Σ: sigma symbol, tells you to sum up whatever follows (here the scores X).
N: the number of scores.
3.3 The standard deviation of the data set

The variability of the scores from the test data was measured using the standard deviation (SD or σ). Standard deviation is a measure of the dispersion of a set of data from its mean: the more spread apart the data, the higher the deviation. A low standard deviation indicates that the data are clustered around the mean, whereas a high standard deviation indicates that the data are widely spread (Knoetze, 2009).
SD = √( Σ(X − M)² / N )

σ or SD = standard deviation
The standard deviation for the data set was computed and found to be 23.
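The values in Box 1 and the standard deviation can be checked independently; a minimal sketch in Python (the report's analysis was done in Excel), using the 25 scores from Table 1:

```python
from statistics import mean, median, mode, stdev

# The 25 test scores from Table 1 (sorted z-a).
scores = [100, 100, 90, 90, 90, 85, 85, 85, 75, 70, 70, 65, 65,
          65, 65, 60, 55, 55, 50, 50, 45, 40, 30, 30, 15]

print("Number of scores:", len(scores))         # 25
print("Lowest score:", min(scores))             # 15
print("Highest score:", max(scores))            # 100
print("Range:", max(scores) - min(scores))      # 85
print("Mean:", round(mean(scores)))             # 65
print("Median:", median(scores))                # 65
print("Mode:", mode(scores))                    # 65
# stdev() is the sample standard deviation, matching Excel's STDEV.
print("SD:", round(stdev(scores)))              # 23
```

Note that `stdev()` divides by N − 1; the formula above divides by N and gives roughly 22.2, so the reported value of 23 evidently comes from the sample formula.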
3.4 Grouped frequencies

To display the grouped data, the class intervals and interval size for the data set, the grouped frequencies, the midpoints of the class intervals and the cumulative frequencies, Tables 2 and 3 were used as shown.

Table 2: tabulating grouped frequencies

LL   UP   MP    f   Groups   f
15   24   19.5  1   15-24    1
25   34   29.5  2   25-34    2
35   44   39.5  1   35-44    1
45   54   49.5  3   45-54    3
55   64   59.5  3   55-64    3
65   74   69.5  6   65-74    6
75   84   79.5  1   75-84    1
85   94   89.5  6   85-94    6
95   104  99.5  2   95-104   2
Table 3: calculating cumulative frequencies

LL UP MP f Groups UP (a-z) cf
15 24 19.5 1 15-24 24 1
25 34 29.5 2 25-34 34 3
35 44 39.5 1 35-44 44 4
45 54 49.5 3 45-54 54 7
55 64 59.5 3 55-64 64 10
65 74 69.5 6 65-74 74 16
75 84 79.5 1 75-84 84 17
85 94 89.5 6 85-94 94 23
95 104 99.5 2 95-104 104 25
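The grouped and cumulative frequencies above can be regenerated from the raw scores; a short sketch, using class intervals of size 10 starting at a lower limit of 15, as in the tables:

```python
# The 25 test scores from Table 1.
scores = [100, 100, 90, 90, 90, 85, 85, 85, 75, 70, 70, 65, 65,
          65, 65, 60, 55, 55, 50, 50, 45, 40, 30, 30, 15]

cf = 0
print("LL  UP   MP    f  cf")
for ll in range(15, 105, 10):               # lower limits 15, 25, ..., 95
    up = ll + 9                             # upper limit of the interval
    mp = (ll + up) / 2                      # midpoint, e.g. (15 + 24) / 2 = 19.5
    f = sum(ll <= s <= up for s in scores)  # grouped frequency
    cf += f                                 # running cumulative frequency
    print(f"{ll:<3} {up:<4} {mp:<5} {f:<2} {cf}")
```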
3.5 The Histogram

The histogram shown in Figure 1 indicates that most of the students passed the test, as 15 of the 25 scores lie at or above the mean, mode and median.
Figure 1: histogram of the grouped frequencies (frequency per score interval)
3.6 The frequency polygon

The frequency polygon shown in Figure 2 illustrates the midpoints 69.5 and 89.5 as the two peaks of the curve, while 19.5, 39.5 and 79.5 are the lowest points (a frequency of 1 each).
Figure 2: frequency polygon (frequency per interval midpoint)
3.7 Cumulative Frequency polygon

Figure 3 shows the cumulative frequency polygon (ogive), plotting the cumulative frequency against the upper limit of each class interval.

Figure 3: ogive (cumulative frequency per interval upper limit)
4.0 Item analysis

Item analysis is conducted to determine the number of items to include in the final version of a test. The item discrimination index and the difficulty index are two of the quantitative aspects of item analysis (Sapp, 2003). Figure 4 presents a pictorial view of the levels of difficulty and discrimination for the multiple-choice items.
Figure 4: p [Diff.index] and D [Disc.index] per question (Q1-Q20)
Figure 5: test compilation (number of learners who answered and who answered correctly, per question Q1-Q20)
4.2 Difficulty index for the test items

The difficulty index, expressed as p values, is shown in Figure 6. It is the proportion of correct answers: the number of correct answers divided by the number of learners who answered the question.
Figure 6
Questions Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20
# Correct 21 22 17 12 21 17 11 12 13 8 23 19 15 21 20 22 15 8 13 16
# Answered 25 25 25 25 25 25 25 23 25 24 25 25 25 25 25 24 24 24 25 25
p [Diff.index] 0.84 0.88 0.68 0.48 0.84 0.68 0.44 0.52 0.52 0.33 0.92 0.76 0.60 0.84 0.80 0.92 0.63 0.33 0.52 0.64
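Each p value is simply the number of correct answers divided by the number of learners who answered; a sketch using a few of the counts from Figure 6:

```python
# (# correct, # answered) for a sample of questions, taken from Figure 6.
counts = {"Q1": (21, 25), "Q2": (22, 25), "Q3": (17, 25), "Q10": (8, 24)}

for q, (correct, answered) in counts.items():
    p = correct / answered        # difficulty index
    print(f"{q}: p = {p:.2f}")    # Q1 0.84, Q2 0.88, Q3 0.68, Q10 0.33
```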
4.3 Discrimination index for the test items

A good test should be able to differentiate the high scorers from the low scorers (Knoetze, 2009). Figure 7 presents the discrimination indices for all twenty multiple-choice questions. The calculation of this index considers the number of correct answers in the upper group (UG) and the number in the lower group (LG).
Figure 7
Questions Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20
# LG Correct 6 7 3 4 6 5 2 2 3 0 9 5 3 6 6 7 3 3 1 5
# UG Correct 15 15 14 8 15 12 9 10 10 8 14 14 12 15 14 15 12 5 12 11
D [Disc.index] 0.60 0.53 0.79 0.50 0.60 0.58 0.78 0.80 0.70 1.00 0.36 0.64 0.75 0.60 0.57 0.53 0.75 0.40 0.92 0.55
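The D values in Figure 7 are reproduced by D = (UG − LG) / UG, i.e. the difference in correct answers between the upper and lower groups, divided by the upper-group count. (Textbook definitions more commonly divide by the size of one group; the divisor here is inferred from the figures, not stated in the report.) A sketch:

```python
# (# LG correct, # UG correct) for a sample of questions, from Figure 7.
groups = {"Q1": (6, 15), "Q8": (2, 10), "Q10": (0, 8), "Q19": (1, 12)}

for q, (lg, ug) in groups.items():
    d = (ug - lg) / ug            # discrimination index as computed in this report
    print(f"{q}: D = {d:.2f}")    # Q1 0.60, Q8 0.80, Q10 1.00, Q19 0.92
```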
5.0 Summary of Findings

The final analysis in Table 4 presents 8 questions (1, 2, 5, 11, 12, 14, 15 and 16) whose difficulty index is not acceptable: these 8 questions are too easy. In terms of the discrimination index, all 20 questions are acceptable. The 8 flagged questions may have poor distractors, obvious answers, or verbal clues.
Table 4: findings

Questions | p values | D values | Result | Reason | Action
Q1 | 0.84 | 0.60 | Unacceptable | Too easy | Remove and replace from test item
Q2 | 0.88 | 0.53 | Unacceptable | Too easy | Remove and replace from test item
Q3 | 0.68 | 0.79 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q4 | 0.48 | 0.50 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q5 | 0.84 | 0.60 | Unacceptable | Too easy | Remove and replace from test item
Q6 | 0.68 | 0.58 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q7 | 0.44 | 0.78 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q8 | 0.52 | 0.80 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q9 | 0.52 | 0.70 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q10 | 0.33 | 1.00 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q11 | 0.92 | 0.36 | Unacceptable | Too easy | Remove and replace from test item
Q12 | 0.76 | 0.64 | Unacceptable | Too easy | Remove and replace from test item
Q13 | 0.60 | 0.75 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q14 | 0.84 | 0.60 | Unacceptable | Too easy | Remove and replace from test item
Q15 | 0.80 | 0.57 | Unacceptable | Too easy | Remove and replace from test item
Q16 | 0.92 | 0.53 | Unacceptable | Too easy | Remove and replace from test item
Q17 | 0.63 | 0.75 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q18 | 0.33 | 0.40 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q19 | 0.52 | 0.92 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
Q20 | 0.64 | 0.55 | Acceptable | Discriminates + good level of difficulty | Maintain in the test item
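The accept/remove decisions above follow a simple threshold on p; a sketch assuming the cut-off implied by the table (questions with p above 0.75 are flagged as too easy; every D value in this data set is high enough to pass). The 0.75 figure is an inference from the table's decisions, not a value stated in the report:

```python
# (p, D) for a sample of questions, taken from Table 4.
items = {"Q1": (0.84, 0.60), "Q3": (0.68, 0.79), "Q12": (0.76, 0.64)}

for q, (p, d) in items.items():
    # Hypothetical threshold inferred from the report's decisions.
    if p > 0.75:
        print(f"{q}: Unacceptable (too easy) - remove and replace")
    else:
        print(f"{q}: Acceptable - maintain in the test")
```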
5.1 Conclusion
This is a good test because 60% (12 out of 20) of the questions are acceptable.
6.0 Reference list

Knoetze, J.G. 2009. Evaluation and assessment of E-learning: CIE module notes. Department of Science, Mathematics and Technology, Faculty of Education, University of Pretoria.

Mills, C.N. 2002. Computer-based assessment: Building the foundation for future assessments. New Jersey: Lawrence Erlbaum Associates.

Nitko, A.J. 2004. Educational assessment of students. 4th edition. New Jersey: Pearson, Merrill and Prentice Hall.

Sapp, M. 2003. Psychological and educational test scores: What are they? Illinois: Charles Thomas.

Sim, S.M. and Rasiah, R.I. 2006. Relationship between item difficulty and discrimination indices in true/false-type multiple choice questions of a para-clinical multidisciplinary paper. Ann Acad Med Singapore 2006; 35:67-71. Retrieved 19 August 2009 from http://www.ams.edu.

Tucker, S. Using Remark Statistics for Test Reliability and Item Analysis. Retrieved 19 August 2009 from http://www.proftesting.com.