
Statistical analysis of a sample of exam grades

Table of Contents
Introduction
Quantitative variable: central tendency measures
Quantitative variable: measures of variability
Qualitative variable: frequencies
Grades distribution
Grouped data-central tendency indicators
Grouped data-variation and asymmetry
Sampling
The analysis of variation between two intervals
Conclusion
References

Introduction
The purpose of this report is to analyze a sample of final exam grades of
students from The Bucharest Academy of Economic Studies. All data were
collected from the web page of the university. The report focuses mainly on
presenting and explaining the central tendency and variability measures.
The following data were collected:
Table 1: Raw data
Source: www.ase.ro

Nr. Crt.  Gender  Grade (xi)
67        F       9,48
68        F       8,18
69        F       9,52
70        F       9,52
71        F       9,25
72        M       8,69
73        F       9,03
74        M       8,66
75        F       9,25
76        F       8,78
77        F       8,44
78        F       9,42
79        F       8,40
80        M       8,79
81        M       8,88
82        F       9,16
83        F       9,29
84        F       8,74
85        F       7,69
86        M       8,25
87        F       8,50
88        F       9,20
89        F       9,53
90        F       9,42
91        F       9,16
92        F       9,58
93        F       8,35
94        F       9,46
95        F       8,57
96        M       9,05
97        F       8,39
98        F       9,48
99        F       9,28
100       F       8,97
101       F       9,58
102               9,25
103       F       9,35
104       F       9,10
105       F       8,91
106       F       8,73
107       F       9,20
108       F       9,52
109       F       8,91
110       M       9,19
111       F       9,55
112       M       9,38
113       F       9,08
114       F       9,75
115       F       9,11
116       F       9,45

Quantitative variable: central tendency measures.


First, the grades of the students were considered. Because the grades differ
from one another, it is useful to find the general level and where the
majority of the values lie. The arithmetic mean was computed first and it
showed that the average grade of these fifty students was 9.05.
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{7.69 + 8.18 + \dots + 9.75}{50} = \frac{452.42}{50} \approx 9.05 \text{ points} \]

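Next, the harmonic, quadratic and geometric means were also computed, all of
which have similar values.

\[ \bar{x}_h = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} = \frac{50}{\frac{1}{7.69} + \dots + \frac{1}{9.75}} \approx 9.03 \text{ points} \]

\[ \bar{x}_p = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n}} = \sqrt{\frac{7.69^2 + \dots + 9.75^2}{50}} \approx 9.06 \text{ points} \]

\[ \bar{x}_g = \sqrt[n]{\prod_{i=1}^{n} x_i} = \sqrt[50]{7.69 \cdot \ldots \cdot 9.75} \approx 9.04 \text{ points} \]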
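The four means above can be checked with a short script. This is only a sketch: the list `GRADES` is transcribed from Table 1 (decimal commas written as points), and the variable names are illustrative, not part of the original report.

```python
import math
import statistics

# Grades transcribed from Table 1, in the order they appear there.
GRADES = [
    9.48, 8.18, 9.52, 9.52, 9.25, 8.69, 9.03, 8.66, 9.25, 8.78,
    8.44, 9.42, 8.40, 8.79, 8.88, 9.16, 9.29, 8.74, 7.69, 8.25,
    8.50, 9.20, 9.53, 9.42, 9.16, 9.58, 8.35, 9.46, 8.57, 9.05,
    8.39, 9.48, 9.28, 8.97, 9.58, 9.25, 9.35, 9.10, 8.91, 8.73,
    9.20, 9.52, 8.91, 9.19, 9.55, 9.38, 9.08, 9.75, 9.11, 9.45,
]

n = len(GRADES)
arithmetic = statistics.fmean(GRADES)              # sum(x) / n
harmonic = statistics.harmonic_mean(GRADES)        # n / sum(1 / x)
quadratic = math.sqrt(statistics.fmean([x * x for x in GRADES]))  # sqrt(sum(x^2) / n)
geometric = statistics.geometric_mean(GRADES)      # n-th root of the product

print(f"n = {n}")
print(f"arithmetic = {arithmetic:.2f}")
print(f"harmonic   = {harmonic:.2f}")
print(f"quadratic  = {quadratic:.2f}")
print(f"geometric  = {geometric:.2f}")
```

For any set of positive values the four means are ordered as harmonic, geometric, arithmetic, quadratic, which matches the values reported above.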

Furthermore, the median was computed. It equals 9.16, meaning that half of
the students in the sample have grades at or below it, while the other half
have grades at or above it. Because the data series contains an even number
of observations (50), the median is the average of the two middle
observations (the 25th and the 26th in ascending order):

\[ Me = \frac{9.16 + 9.16}{2} = 9.16 \text{ points} \]

Next, the mode was determined. It is 9.25 points, the most frequent value
among these students: it occurs three times. It should be noted that 9.52
also occurs three times, so strictly speaking the series has two modes (9.25
and 9.52); every other value occurs at most twice.
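As a cross-check of the median and of the most frequent values, the following sketch uses Python's standard `statistics` module. The list `GRADES` is transcribed from Table 1; the names are illustrative assumptions.

```python
import statistics
from collections import Counter

# Grades transcribed from Table 1, in the order they appear there.
GRADES = [
    9.48, 8.18, 9.52, 9.52, 9.25, 8.69, 9.03, 8.66, 9.25, 8.78,
    8.44, 9.42, 8.40, 8.79, 8.88, 9.16, 9.29, 8.74, 7.69, 8.25,
    8.50, 9.20, 9.53, 9.42, 9.16, 9.58, 8.35, 9.46, 8.57, 9.05,
    8.39, 9.48, 9.28, 8.97, 9.58, 9.25, 9.35, 9.10, 8.91, 8.73,
    9.20, 9.52, 8.91, 9.19, 9.55, 9.38, 9.08, 9.75, 9.11, 9.45,
]

# For an even number of observations, median() averages the two middle values.
median = statistics.median(GRADES)

# multimode() returns every value that reaches the highest frequency.
modes = sorted(statistics.multimode(GRADES))
top_frequency = max(Counter(GRADES).values())

print(f"median = {median:.2f}")
print(f"modes = {modes}, each occurring {top_frequency} times")
```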
Table 2: Central tendency calculus
Source: www.ase.ro

Grade (xi)  1/xi     xi^2
7,69        0,1300   59,14
8,18        0,1222   66,91
8,25        0,1212   68,06
8,35        0,1198   69,72
8,39        0,1192   70,39
8,40        0,1190   70,56
8,44        0,1185   71,23
8,50        0,1176   72,25
8,57        0,1167   73,44
8,66        0,1155   75,00
8,69        0,1151   75,52
8,73        0,1145   76,21
8,74        0,1144   76,39
8,78        0,1139   77,09
8,79        0,1138   77,26
8,88        0,1126   78,85
8,91        0,1122   79,39
8,91        0,1122   79,39
8,97        0,1115   80,46
9,03        0,1107   81,54
9,05        0,1105   81,90
9,08        0,1101   82,45
9,10        0,1099   82,81
9,11        0,1098   82,99
9,16        0,1092   83,91
9,16        0,1092   83,91
9,19        0,1088   84,46
9,20        0,1087   84,64
9,20        0,1087   84,64
9,25        0,1081   85,56
9,25        0,1081   85,56
9,25        0,1081   85,56
9,28        0,1078   86,12
9,29        0,1076   86,30
9,35        0,1070   87,42
9,38        0,1066   87,98
9,42        0,1062   88,74
9,42        0,1062   88,74
9,45        0,1058   89,30
9,46        0,1057   89,49
9,48        0,1055   89,87
9,48        0,1055   89,87
9,52        0,1050   90,63
9,52        0,1050   90,63
9,52        0,1050   90,63
9,53        0,1049   90,82
9,55        0,1047   91,20
9,58        0,1044   91,78
9,58        0,1044   91,78
9,75        0,1026   95,06
Total
452,42      5,5398   4103,56

Quantitative variable: measures of variability


Second, the range was computed. The absolute range is 2.06 points, the
difference between the highest and the lowest grade, while the relative range
is 23% of the mean, which indicates that the data series is homogeneous, so
the average is representative.

\[ R = x_{max} - x_{min} = 9.75 - 7.69 = 2.06 \text{ points} \]

\[ R(\%) = \frac{x_{max} - x_{min}}{\bar{x}} \cdot 100 = \frac{9.75 - 7.69}{9.05} \cdot 100 \approx 23\% \]

Moreover, the dispersion of all the observations is measured by the variance,
which has the value of 0.20. The corresponding standard deviation equals
0.44. This means that, on average, the grades of the students differ (above
or below) by 0.44 points from the average of 9.05. In addition, the
coefficient of variation was computed; it is 5%, underlining that the data
are very homogeneous and the average is representative.

\[ \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} = \frac{(7.69 - 9.05)^2 + \dots + (9.75 - 9.05)^2}{50} = \frac{9.89}{50} \approx 0.20 \]

\[ \sigma = \sqrt{\sigma^2} = \sqrt{0.1978} \approx 0.44 \text{ points} \]

\[ v = \frac{\sigma}{\bar{x}} \cdot 100 = \frac{0.44}{9.05} \cdot 100 \approx 5\% \]
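The variability measures above (range, variance, standard deviation, coefficient of variation) can be reproduced with the sketch below. It assumes, as the formulas in this section do, the population form of the variance (division by n, not n - 1); `GRADES` is transcribed from Table 1 and the variable names are illustrative.

```python
import statistics

# Grades transcribed from Table 1, in the order they appear there.
GRADES = [
    9.48, 8.18, 9.52, 9.52, 9.25, 8.69, 9.03, 8.66, 9.25, 8.78,
    8.44, 9.42, 8.40, 8.79, 8.88, 9.16, 9.29, 8.74, 7.69, 8.25,
    8.50, 9.20, 9.53, 9.42, 9.16, 9.58, 8.35, 9.46, 8.57, 9.05,
    8.39, 9.48, 9.28, 8.97, 9.58, 9.25, 9.35, 9.10, 8.91, 8.73,
    9.20, 9.52, 8.91, 9.19, 9.55, 9.38, 9.08, 9.75, 9.11, 9.45,
]

mean = statistics.fmean(GRADES)

# Absolute range and range relative to the mean (in percent).
value_range = max(GRADES) - min(GRADES)
relative_range = 100 * value_range / mean

# Population variance and standard deviation (divide by n).
variance = statistics.pvariance(GRADES)
std_dev = statistics.pstdev(GRADES)

# Coefficient of variation in percent.
cv = 100 * std_dev / mean

print(f"range = {value_range:.2f} points ({relative_range:.0f}% of the mean)")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
print(f"coefficient of variation = {cv:.0f}%")
```

Using `statistics.variance` and `statistics.stdev` instead would divide by n - 1, which gives slightly larger values for a sample of fifty.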
Table 3: Dispersion calculus
Data source: www.ase.ro

Grade (xi)  xi-average  (xi-av)^2
7,69        -1,36       1,85
8,18        -0,87       0,75
8,25        -0,80       0,64
8,35        -0,70       0,49
8,39        -0,66       0,43
8,40        -0,65       0,42
8,44        -0,61       0,37
8,50        -0,55       0,30
8,57        -0,48       0,23
8,66        -0,39       0,15
8,69        -0,36       0,13
8,73        -0,32       0,10
8,74        -0,31       0,10
8,78        -0,27       0,07
8,79        -0,26       0,07
8,88        -0,17       0,03
8,91        -0,14       0,02
8,91        -0,14       0,02
8,97        -0,08       0,01
9,03        -0,02       0,00
9,05        0,00        0,00
9,08        0,03        0,00
9,10        0,05        0,00
9,11        0,06        0,00
9,16        0,11        0,01
9,16        0,11        0,01
9,19        0,14        0,02
9,20        0,15        0,02
9,20        0,15        0,02
9,25        0,20        0,04
9,25        0,20        0,04
9,25        0,20        0,04
9,28        0,23        0,05
9,29        0,24        0,06
9,35        0,30        0,09
9,38        0,33        0,11
9,42        0,37        0,14
9,42        0,37        0,14
9,45        0,40        0,16
9,46        0,41        0,17
9,48        0,43        0,19
9,48        0,43        0,19
9,52        0,47        0,22
9,52        0,47        0,22
9,52        0,47        0,22
9,53        0,48        0,23
9,55        0,50        0,25
9,58        0,53        0,28
9,58        0,53        0,28
9,75        0,70        0,49
Total
452,42                  9,89

Qualitative variable: frequencies


Apart from the grades, the sample recorded one more characteristic: gender.
The calculation revealed that, of all fifty students, only eight are male,
representing 16%. At the same time, the randomly selected sample contains
forty-two females, corresponding to 84% of the total number of observations.

\[ \text{Rel. frequency (male)} = \frac{8}{50} \cdot 100 = 16\% \]

\[ \text{Rel. frequency (female)} = \frac{42}{50} \cdot 100 = 84\% \]
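The relative frequencies can be obtained with a few lines of code. The sketch below rebuilds the gender column from the two counts stated above (8 males, 42 females) rather than from the individual rows; the names are illustrative.

```python
from collections import Counter

# Gender column rebuilt from the counts reported in this section.
genders = ["M"] * 8 + ["F"] * 42

absolute = Counter(genders)
total = sum(absolute.values())

# Relative frequency in percent for each category.
relative = {gender: 100 * count / total for gender, count in absolute.items()}

for gender in ("M", "F"):
    print(f"{gender}: {absolute[gender]} students, {relative[gender]:.0f}%")
```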

Table 4: Gender distribution
Data source: www.ase.ro

         Absolute frequency  Relative frequency (%)
Male     8                   16%
Female   42                  84%
Total    50                  100%

Figure 1
Data source: www.ase.ro

Grades distribution
To analyze the positioning of the students in terms of their grades more
accurately, the results were divided into 5 intervals (classes). The width of
each class should have been

\[ \frac{\text{Range}}{\text{No of classes}} = \frac{2.06}{5} \approx 0.41, \]

but it was rounded to 0.5 in order to deliver a more expressive
interpretation of the results. The result is that one student was between
7.50 and 8.00, six were between 8.00 and 8.50, twelve were between 8.50 and
9.00, twenty-three were between 9.00 and 9.50 and eight had grades of 9.50
points or higher (the lower limit of each class is included). The results
are presented in the following histogram.
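The class frequencies can be reproduced by assigning each grade to the interval whose lower limit it reaches. The sketch below is one way to do this with the standard `bisect` module; `GRADES` is transcribed from Table 1, and `EDGES` encodes the five classes of width 0.5 described above. It assumes no grade equals the top edge 10.00, which holds for this sample.

```python
from bisect import bisect_right

# Grades transcribed from Table 1, in the order they appear there.
GRADES = [
    9.48, 8.18, 9.52, 9.52, 9.25, 8.69, 9.03, 8.66, 9.25, 8.78,
    8.44, 9.42, 8.40, 8.79, 8.88, 9.16, 9.29, 8.74, 7.69, 8.25,
    8.50, 9.20, 9.53, 9.42, 9.16, 9.58, 8.35, 9.46, 8.57, 9.05,
    8.39, 9.48, 9.28, 8.97, 9.58, 9.25, 9.35, 9.10, 8.91, 8.73,
    9.20, 9.52, 8.91, 9.19, 9.55, 9.38, 9.08, 9.75, 9.11, 9.45,
]

# Class limits: [7.50, 8.00), [8.00, 8.50), ..., [9.50, 10.00).
EDGES = [7.5, 8.0, 8.5, 9.0, 9.5, 10.0]

counts = [0] * (len(EDGES) - 1)
for grade in GRADES:
    # bisect_right counts the edges <= grade, so a grade equal to a
    # lower limit (for example 8.50) falls into the class starting there.
    counts[bisect_right(EDGES, grade) - 1] += 1

for lower, upper, count in zip(EDGES, EDGES[1:], counts):
    print(f"{lower:.2f}-{upper:.2f}: {count}")
```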

Figure 2: Students distribution according to grades (histogram of No of students (ni) by Grades, expressed in points)
Data source: www.ase.ro

Grouped data-central tendency indicators


With the data grouped in this way, the indicators are computed as weighted
values, with the class frequencies as weights. In order to be able to work
with these intervals, the class midpoints first had to be computed. Each
class midpoint equals the value halfway between the lower limit and the upper
limit of the class:

\[ x_i = \frac{x_{min} + x_{max}}{2} \]

The average grade of the students from this sample is 9.06:

\[ \bar{x} = \frac{\sum_{i} x_i n_i}{\sum_{i} n_i} = \frac{7.75 \cdot 1 + 8.25 \cdot 6 + 8.75 \cdot 12 + 9.25 \cdot 23 + 9.75 \cdot 8}{50} = 9.06 \text{ points} \]
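The weighted mean above can be verified with the following sketch, which uses the class midpoints and frequencies from this section. The list names are illustrative.

```python
# Class midpoints (xi) and class frequencies (ni) of the five grade intervals.
MIDPOINTS = [7.75, 8.25, 8.75, 9.25, 9.75]
COUNTS = [1, 6, 12, 23, 8]

# Weighted mean: sum(xi * ni) / sum(ni).
weighted_sum = sum(x * n for x, n in zip(MIDPOINTS, COUNTS))
total = sum(COUNTS)
grouped_mean = weighted_sum / total

print(f"sum(xi * ni) = {weighted_sum}, n = {total}")
print(f"grouped mean = {grouped_mean:.2f}")
```

The grouped mean (9.06) differs slightly from the mean of the raw data (9.05) because every grade in a class is replaced by the class midpoint.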

The median equals 9.14, which means that half of the grades are below 9.14
and the other half are above this value.

\[ Loc_{Me} = \frac{n + 1}{2} = \frac{51}{2} = 25.5 \]

The median interval is the first interval for which the cumulative frequency
is equal to or larger than the median location. For this sample, the median
interval is [9.00-9.50], since 1 + 6 + 12 = 19 < 25.5 and
1 + 6 + 12 + 23 = 42 >= 25.5.

\[ Me = x_0 + k_{Me} \cdot \frac{0.5 (n + 1) - \sum_{i=1}^{Me-1} n_i}{n_{Me}} = 9.00 + 0.5 \cdot \frac{0.5 (50 + 1) - (1 + 6 + 12)}{23} \approx 9.14 \text{ points} \]
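The interpolation used for the grouped median can be sketched as follows. The code locates the median interval by accumulating frequencies and then applies the formula above; `LOWER_LIMITS`, `COUNTS` and `WIDTH` are illustrative names for the values in this section.

```python
# Lower class limits, class frequencies and the common class width.
LOWER_LIMITS = [7.5, 8.0, 8.5, 9.0, 9.5]
COUNTS = [1, 6, 12, 23, 8]
WIDTH = 0.5

n = sum(COUNTS)
location = (n + 1) / 2  # median location, 25.5 for n = 50

cumulative_before = 0   # frequencies of the classes before the current one
for lower, count in zip(LOWER_LIMITS, COUNTS):
    if cumulative_before + count >= location:
        # First class whose cumulative frequency reaches the location.
        median_lower = lower
        grouped_median = lower + WIDTH * (location - cumulative_before) / count
        break
    cumulative_before += count

print(f"median interval starts at {median_lower:.2f}")
print(f"grouped median = {grouped_median:.2f}")
```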

The mode equals 9.21, so this is the most common grade in this sample:

The modal interval is the interval with the highest frequency: [9.00-9.50].

\[ Mo = x_0 + k_{Mo} \cdot \frac{\Delta_1}{\Delta_1 + \Delta_2} = 9.00 + 0.50 \cdot \frac{23 - 12}{(23 - 12) + (23 - 8)} \approx 9.21 \text{ points} \]
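The grouped mode formula can be sketched in code as well. Here `d1` and `d2` stand for the differences between the modal frequency and the frequencies of the preceding and following classes; treating a missing neighbour as frequency 0 is an assumption of this sketch and is not needed for this sample.

```python
# Lower class limits, class frequencies and the common class width.
LOWER_LIMITS = [7.5, 8.0, 8.5, 9.0, 9.5]
COUNTS = [1, 6, 12, 23, 8]
WIDTH = 0.5

# Index of the modal class (highest frequency).
k = max(range(len(COUNTS)), key=COUNTS.__getitem__)

# Differences to the neighbouring classes (0 assumed beyond the ends).
previous_count = COUNTS[k - 1] if k > 0 else 0
next_count = COUNTS[k + 1] if k + 1 < len(COUNTS) else 0
d1 = COUNTS[k] - previous_count
d2 = COUNTS[k] - next_count

grouped_mode = LOWER_LIMITS[k] + WIDTH * d1 / (d1 + d2)

print(f"modal interval starts at {LOWER_LIMITS[k]:.2f}")
print(f"grouped mode = {grouped_mode:.2f}")
```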

Table 5: Grouped data-central tendency calculus
Data source: www.ase.ro

Grade interval          No of          Class          xi*ni
(lower limit included)  students (ni)  midpoint (xi)
7,50-8,00               1              7,75           7,75
8,00-8,50               6              8,25           49,5
8,50-9,00               12             8,75           105
9,00-9,50               23             9,25           212,75
9,50-10,00              8              9,75           78
Total                   50                            453

Grouped data-variation and asymmetry


The computed variance for these data is 0.23 and the standard deviation is
0.48. This means that, on average, the grades obtained by the students differ
(above or below) by 0.48 points from the average of 9.06. The resulting
coefficient of variation is 5%, which is below 35%, showing that the data
series is homogeneous and the average is representative.

\[ \sigma^2 = \frac{\sum_{i} (x_i - \bar{x})^2 n_i}{\sum_{i} n_i} = \frac{(7.75 - 9.06)^2 \cdot 1 + \dots + (9.75 - 9.06)^2 \cdot 8}{50} \approx 0.23 \]

\[ \sigma = \sqrt{\sigma^2} = \sqrt{0.23} \approx 0.48 \text{ points} \]

\[ v = \frac{\sigma}{\bar{x}} \cdot 100 = \frac{0.48}{9.06} \cdot 100 \approx 5\% \]

The aforementioned results were computed based on the calculations presented
in the following table:
Table 6: The distribution of students-dispersion and skewness calculus
Data source: www.ase.ro

Grade interval          No of          Class          xi-av   (xi-av)^2  (xi-av)^2*ni
(lower limit included)  students (ni)  midpoint (xi)
7,50-8,00               1              7,75           -1,31   1,72       1,72
8,00-8,50               6              8,25           -0,81   0,66       3,94
8,50-9,00               12             8,75           -0,31   0,1        1,15
9,00-9,50               23             9,25           0,19    0,04       0,83
9,50-10,00              8              9,75           0,69    0,48       3,81
Total                   50                                               11,45
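The grouped variance, standard deviation and coefficient of variation from this section can be recomputed from the class midpoints and frequencies. The sketch below is illustrative; the list names are assumptions.

```python
import math

# Class midpoints (xi) and class frequencies (ni) of the five grade intervals.
MIDPOINTS = [7.75, 8.25, 8.75, 9.25, 9.75]
COUNTS = [1, 6, 12, 23, 8]

total = sum(COUNTS)
mean = sum(x * n for x, n in zip(MIDPOINTS, COUNTS)) / total

# Weighted sum of squared deviations, then the population variance.
squared_deviation_sum = sum((x - mean) ** 2 * n for x, n in zip(MIDPOINTS, COUNTS))
variance = squared_deviation_sum / total
std_dev = math.sqrt(variance)
cv = 100 * std_dev / mean

print(f"sum((xi - mean)^2 * ni) = {squared_deviation_sum:.2f}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
print(f"coefficient of variation = {cv:.0f}%")
```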

Furthermore, the skewness was also analyzed in order to describe the shape of
the data series. The indicator of absolute skewness is equal to -0.15,
meaning that the distribution is negatively skewed (skewed to the left, with
the longer tail towards the lower grades), so the mode is on the right side
of the mean. At the same time, the coefficient of skewness equals -0.32,
which is slightly above 0.3 in absolute value, so the asymmetry of the data
series is somewhat stronger than moderate.

\[ as = \bar{x} - Mo = 9.06 - 9.21 = -0.15 \]

\[ C_{as} = \frac{\bar{x} - Mo}{\sigma} = \frac{9.06 - 9.21}{0.48} \approx -0.32 \]
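The two skewness indicators can be recomputed from the grouped data. The sketch below derives the mean, the mode and the standard deviation from the class table and keeps their unrounded values, which is why the coefficient comes out as -0.32 rather than the -0.31 obtained from the rounded inputs. All names are illustrative.

```python
import math

# Lower class limits, midpoints, frequencies and the common class width.
LOWER_LIMITS = [7.5, 8.0, 8.5, 9.0, 9.5]
MIDPOINTS = [7.75, 8.25, 8.75, 9.25, 9.75]
COUNTS = [1, 6, 12, 23, 8]
WIDTH = 0.5

total = sum(COUNTS)
mean = sum(x * n for x, n in zip(MIDPOINTS, COUNTS)) / total
std_dev = math.sqrt(sum((x - mean) ** 2 * n for x, n in zip(MIDPOINTS, COUNTS)) / total)

# Grouped mode; the modal class here is an interior class (index 3).
k = COUNTS.index(max(COUNTS))
d1 = COUNTS[k] - COUNTS[k - 1]
d2 = COUNTS[k] - COUNTS[k + 1]
mode = LOWER_LIMITS[k] + WIDTH * d1 / (d1 + d2)

absolute_skewness = mean - mode
skewness_coefficient = absolute_skewness / std_dev

print(f"absolute skewness = {absolute_skewness:.2f}")
print(f"coefficient of skewness = {skewness_coefficient:.2f}")
```

A negative value means the mean lies below the mode, that is, the longer tail of the distribution is on the side of the lower grades.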

Sampling
The analyzed raw data form a representative sample which was randomly chosen
from the first-year students of ASE. In order to estimate the population
mean, the margin of error, the lower confidence limit and the upper
confidence limit were computed. All calculations were based on a 95%
confidence level, with a corresponding significance level of \alpha = 5% and
a critical value of z_{\alpha/2} = 1.96. The population from which the sample
is drawn can be treated as infinite because the population size N (all
first-year students of ASE) is larger than 20 times the sample size
(N > 20*50). Because of that, the standard error of the mean is:

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.48}{\sqrt{50}} \approx 0.068 \approx 0.07 \]

The lower confidence limit:

\[ LCL = \bar{x} - z_{\alpha/2} \, \sigma_{\bar{x}} = 9.06 - 1.96 \cdot 0.068 \approx 8.93 \]

The upper confidence limit:

\[ UCL = \bar{x} + z_{\alpha/2} \, \sigma_{\bar{x}} = 9.06 + 1.96 \cdot 0.068 \approx 9.19 \]

Hence, the confidence interval is:

\[ 8.93 \le \bar{x}_0 \le 9.19 \]

The conclusion is that, with a 95% confidence level, the average grade of all
the first-year students lies between 8.93 points and 9.19 points.
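The confidence limits can be reproduced with the sketch below. It takes the grouped mean (9.06) and standard deviation (0.48) reported above as inputs and obtains the critical value from the standard normal distribution; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Inputs reported in the text: sample size, grouped mean and standard deviation.
n = 50
mean = 9.06
sigma = 0.48
confidence = 0.95

# Two-sided critical value z_{alpha/2}; about 1.96 for a 95% level.
alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)

standard_error = sigma / math.sqrt(n)   # infinite-population formula
margin = z * standard_error
lcl = mean - margin
ucl = mean + margin

print(f"z = {z:.2f}, standard error = {standard_error:.3f}")
print(f"confidence interval: {lcl:.2f} to {ucl:.2f}")
```

Carrying the unrounded standard error through the calculation gives limits of 8.93 and 9.19; rounding it to 0.07 first would shift both limits by about 0.01.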

The analysis of variation between two intervals


When looking separately at the two categories (males and females) the results
were different from the aggregate ones.
Table 7: Variance calculus-MALES
Data source: www.ase.ro

Grade interval          Males   Class           yi1*ni1  (yi1-av1)^2*ni1
(lower limit included)  (ni1)   midpoint (yi1)
7,50-8,00               0       7,75            0        0,00
8,00-8,50               1       8,25            8,25     0,39
8,50-9,00               4       8,75            35       0,06
9,00-9,50               3       9,25            27,75    0,42
9,50-10,00              0       9,75            0        0,00
Total                   8                       71       0,88

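\[ \bar{y}_1 = \frac{\sum_{i} y_{i1} n_{i1}}{\sum_{i} n_{i1}} = \frac{7.75 \cdot 0 + \dots + 9.75 \cdot 0}{8} \approx 8.88 \text{ points} \]

\[ \sigma_1^2 = \frac{\sum_{i} (y_{i1} - \bar{y}_1)^2 n_{i1}}{\sum_{i} n_{i1}} = \frac{(7.75 - 8.88)^2 \cdot 0 + \dots + (9.75 - 8.88)^2 \cdot 0}{8} \approx 0.11 \]

\[ \sigma_1 = \sqrt{\sigma_1^2} = \sqrt{0.11} \approx 0.33 \text{ points} \]

\[ v_1 = \frac{\sigma_1}{\bar{y}_1} \cdot 100 = \frac{0.33}{8.88} \cdot 100 \approx 4\% \]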

Table 8: Variance calculus-FEMALES
Data source: www.ase.ro

Grade interval          Female  Class           yi2*ni2  (yi2-av2)^2*ni2
(lower limit included)  (ni2)   midpoint (yi2)
7,50-8,00               1       7,75            7,75     1,81
8,00-8,50               5       8,25            41,25    3,57
8,50-9,00               8       8,75            70       0,95
9,00-9,50               20      9,25            185      0,48
9,50-10,00              8       9,75            78       3,43
Total                   42                      382      10,24

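\[ \bar{y}_2 = \frac{\sum_{i} y_{i2} n_{i2}}{\sum_{i} n_{i2}} = \frac{7.75 \cdot 1 + \dots + 9.75 \cdot 8}{42} \approx 9.10 \text{ points} \]

\[ \sigma_2^2 = \frac{\sum_{i} (y_{i2} - \bar{y}_2)^2 n_{i2}}{\sum_{i} n_{i2}} = \frac{(7.75 - 9.10)^2 \cdot 1 + \dots + (9.75 - 9.10)^2 \cdot 8}{42} \approx 0.24 \]

\[ \sigma_2 = \sqrt{\sigma_2^2} = \sqrt{0.24} \approx 0.49 \text{ points} \]

\[ v_2 = \frac{\sigma_2}{\bar{y}_2} \cdot 100 = \frac{0.49}{9.10} \cdot 100 \approx 5\% \]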

When comparing the two sets of results, it can be noticed that, on average,
the grades of the females are higher than the grades of the males by 0.22
points. In addition, both the standard deviation and the coefficient of
variation are higher for the females, showing that the female subsample
presents more variability.
The value of the explained (between-group) variance is 0.01; it measures the
deviation of the group means (by gender) from the overall mean. The residual
(within-group) variance equals 0.22, showing the importance of factors other
than gender in determining the level of the grades. Finally, the coefficient
of determination equals 3%, meaning that only 3% of the total variation in
the level of grades is explained by gender. In conclusion, the score received
at the exam practically does not depend on the student's gender.

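\[ \delta^2 = \frac{\sum_{j} (\bar{y}_j - \bar{y}_0)^2 n_j}{\sum_{j} n_j} = \frac{(8.88 - 9.06)^2 \cdot 8 + (9.10 - 9.06)^2 \cdot 42}{50} \approx 0.01 \]

\[ \bar{\sigma}^2 = \frac{\sum_{j} \sigma_j^2 n_j}{\sum_{j} n_j} = \frac{0.11 \cdot 8 + 0.24 \cdot 42}{50} \approx 0.22 \]

Verifying the rule of dispersion:

\[ \sigma_0^2 = \delta^2 + \bar{\sigma}^2 \quad\Rightarrow\quad 0.23 = 0.01 + 0.22 \quad\Rightarrow\quad \text{the rule is verified.} \]

\[ D = R^2 = \frac{\delta^2}{\sigma_0^2} \cdot 100 \approx 3\% \quad \text{(computed from the unrounded variances; 0.01 and 0.23 are rounded values)} \]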
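The decomposition of the total variance into a between-group and a within-group part can be checked with the sketch below. It uses the class frequencies of Tables 7 and 8; the function and variable names are illustrative assumptions, and all intermediate values are kept unrounded.

```python
# Class midpoints and the class frequencies of each gender (Tables 7 and 8).
MIDPOINTS = [7.75, 8.25, 8.75, 9.25, 9.75]
GROUP_COUNTS = {
    "male": [0, 1, 4, 3, 0],
    "female": [1, 5, 8, 20, 8],
}


def grouped_mean_and_variance(counts):
    """Return (size, mean, population variance) for grouped data."""
    size = sum(counts)
    mean = sum(y * n for y, n in zip(MIDPOINTS, counts)) / size
    variance = sum((y - mean) ** 2 * n for y, n in zip(MIDPOINTS, counts)) / size
    return size, mean, variance


# Overall distribution: class by class sum of the two groups.
overall_counts = [sum(pair) for pair in zip(*GROUP_COUNTS.values())]
n_total, overall_mean, total_variance = grouped_mean_and_variance(overall_counts)

between = 0.0  # variance of the group means around the overall mean
within = 0.0   # weighted mean of the group variances
for counts in GROUP_COUNTS.values():
    size, mean, variance = grouped_mean_and_variance(counts)
    between += (mean - overall_mean) ** 2 * size / n_total
    within += variance * size / n_total

determination = 100 * between / total_variance

print(f"between = {between:.2f}, within = {within:.2f}, total = {total_variance:.2f}")
print(f"coefficient of determination = {determination:.0f}%")
```

The unrounded between-group variance is about 0.0065, which is why the coefficient of determination is about 3% even though the rounded ratio 0.01 / 0.23 would suggest a larger figure.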


Conclusion
After analyzing the grades of the fifty students in the randomly chosen
sample, it can be noticed that the central tendency measures have values a
little above 9, with an average of 9.05. This means the overall level is
quite high. The range of the grades is 2.06 points, indicating a homogeneous
set of data. The majority of the students (84%) are females, who have higher
grades on average but also present more variability. The largest share of
the grades lies in the [9.00-9.50] interval, with 23 students out of 50
belonging to this class. The confidence interval shows that, with a 95%
confidence level, the average grade of all the first-year students of ASE
lies between 8.93 and 9.19 points. Finally, it was shown that the variation
in the level of grades practically does not depend on gender.

References:
1. www.ase.ro
