
Describing Data:

Measures of Dispersion
Types of Dispersion
Range
Mean Deviation
Variance & Standard Deviation
Relative Dispersion
Coefficient of Variation
Standard Score
Other Measures of Dispersion
Quartile Deviation
Coefficient of Quartile Variation


Range
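Ungrouped data:
$R = \text{Highest value} - \text{Lowest value}$
Grouped data:
$R = U_n - L_1$ (upper limit of the highest class minus lower limit of the lowest class)
$R = X_n - X_1$ (midpoint of the highest class minus midpoint of the lowest class)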
The major characteristics of the range:
Only two values are used in its calculation
It is influenced by extreme values
It is easy to compute and to understand
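As a minimal sketch, the range can be computed as follows. The class limits are those of the sales table used later in this deck; the ungrouped observations in `values` are hypothetical and only serve as an illustration.

```python
# Class limits of the grouped sales table used in this deck (20-30, ..., 80-90).
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90)]

# Hypothetical ungrouped observations, for illustration only.
values = [23, 41, 37, 58, 62, 79, 85]

# Ungrouped data: highest value minus lowest value.
range_ungrouped = max(values) - min(values)

# Grouped data, first form: upper limit of the last class minus lower limit of the first.
range_limits = classes[-1][1] - classes[0][0]

# Grouped data, second form: midpoint of the last class minus midpoint of the first.
midpoints = [(lower + upper) / 2 for lower, upper in classes]
range_midpoints = midpoints[-1] - midpoints[0]

print(range_ungrouped, range_limits, range_midpoints)  # 62 70 60.0
```

Both grouped forms use only the first and last class, which is why the range ignores everything in between.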


Mean Deviation
Mean Deviation: The arithmetic mean of
the absolute values of the deviations from
the arithmetic mean.
Also called the Mean Absolute Deviation (MAD)
Formula:
Ungrouped data:
$MD = \dfrac{\sum |X - \bar{X}|}{n}$
Grouped data:
$MD = \dfrac{\sum f\,|X - \bar{X}|}{n}$
Mean Deviation for Grouped Data
Sales    f    Xi    f |Xi - X̄|
20-30    4    25    117.60
30-40    7    35    135.80
40-50    8    45     75.20
50-60   12    55      7.20
60-70    9    65     95.40
70-80    8    75    164.80
80-90    2    85     61.20
Total   50          657.20
Here $\bar{X} = \sum f X_i / n = 2{,}720 / 50 = 54.40$.
$MD = \dfrac{657.20}{50} = 13.14$
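The grouped mean deviation above can be reproduced with a short script. This is a sketch that takes the class midpoints and frequencies from the sales table; the variable names are illustrative.

```python
# Grouped sales data from the table: (class midpoint Xi, frequency f).
table = [(25, 4), (35, 7), (45, 8), (55, 12), (65, 9), (75, 8), (85, 2)]

n = sum(f for _, f in table)                        # total frequency
mean = sum(f * x for x, f in table) / n             # grouped arithmetic mean
md = sum(f * abs(x - mean) for x, f in table) / n   # mean deviation

print(f"n = {n}, mean = {mean:.2f}, MD = {md:.2f}")
# n = 50, mean = 54.40, MD = 13.14
```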
Characteristics of Mean Deviation
It is not unduly influenced by large or
small values
All observations are used in the
calculation
The absolute values are somewhat
difficult to work with
Variance & Standard Deviation
The variance is the arithmetic mean of
the squared deviations from the mean.
The Standard Deviation is the square
root of the variance
The sample variance/standard
deviation is used as an estimator of the
population variance/standard deviation
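Variance
Ungrouped Data
Population:
$\sigma^2 = \dfrac{\sum (X - \mu)^2}{n} = \dfrac{\sum X^2 - \dfrac{(\sum X)^2}{n}}{n}$
Sample:
$s^2 = \dfrac{\sum (X - \bar{X})^2}{n - 1} = \dfrac{\sum X^2 - \dfrac{(\sum X)^2}{n}}{n - 1}$
Grouped Data
Population:
$\sigma^2 = \dfrac{\sum f (X - \mu)^2}{n} = \dfrac{\sum f X^2 - \dfrac{(\sum f X)^2}{n}}{n}$
Sample:
$s^2 = \dfrac{\sum f (X - \bar{X})^2}{n - 1} = \dfrac{\sum f X^2 - \dfrac{(\sum f X)^2}{n}}{n - 1}$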
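Standard Deviation
Ungrouped Data
Population:
$\sigma = \sqrt{\dfrac{\sum (X - \mu)^2}{n}} = \sqrt{\dfrac{\sum X^2 - \dfrac{(\sum X)^2}{n}}{n}}$
Sample:
$s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\dfrac{\sum X^2 - \dfrac{(\sum X)^2}{n}}{n - 1}}$
Grouped Data
Population:
$\sigma = \sqrt{\dfrac{\sum f (X - \mu)^2}{n}} = \sqrt{\dfrac{\sum f X^2 - \dfrac{(\sum f X)^2}{n}}{n}}$
Sample:
$s = \sqrt{\dfrac{\sum f (X - \bar{X})^2}{n - 1}} = \sqrt{\dfrac{\sum f X^2 - \dfrac{(\sum f X)^2}{n}}{n - 1}}$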
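The following sketch checks, on a small hypothetical data set, that the definitional and computational forms of the ungrouped variance agree, and that they match Python's standard `statistics` module. The data values are made up for illustration.

```python
import math
import statistics

# Hypothetical ungrouped observations, for illustration only.
data = [4, 8, 6, 5, 3, 7, 9]
n = len(data)
mean = sum(data) / n

# Definitional form: sum of squared deviations from the mean.
squared_dev = sum((x - mean) ** 2 for x in data)
pop_var = squared_dev / n          # population variance
samp_var = squared_dev / (n - 1)   # sample variance

# Computational form: sum(X^2) - (sum X)^2 / n in the numerator.
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)
pop_var_short = (sum_x2 - sum_x ** 2 / n) / n
samp_var_short = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Standard deviations are the square roots of the variances.
pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

print(pop_var, pop_var_short, statistics.pvariance(data))
print(samp_var, samp_var_short, statistics.variance(data))
```

The sample versions divide by n - 1 instead of n, so they are always slightly larger than the population versions for the same data.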
Variance & Standard Deviation for
Grouped Data
Sales    f    Xi    f (Xi - X̄)^2
20-30    4    25     3,457.44
30-40    7    35     2,634.52
40-50    8    45       706.88
50-60   12    55         4.32
60-70    9    65     1,011.24
70-80    8    75     3,394.88
80-90    2    85     1,872.72
Total   50          13,082.00
$\sigma^2 = \dfrac{13{,}082.00}{50} = 261.64 \;\Rightarrow\; \sigma = \sqrt{\dfrac{13{,}082.00}{50}} = \sqrt{261.64} = 16.18$
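A short sketch that recomputes the grouped variance and standard deviation from the sales table. It follows the worked example in dividing by n = 50 (the population form); the sample form is shown alongside for comparison.

```python
import math

# Grouped sales data from the table: (class midpoint Xi, frequency f).
table = [(25, 4), (35, 7), (45, 8), (55, 12), (65, 9), (75, 8), (85, 2)]

n = sum(f for _, f in table)
mean = sum(f * x for x, f in table) / n
sum_sq_dev = sum(f * (x - mean) ** 2 for x, f in table)

variance = sum_sq_dev / n            # population form, as in the worked example
std_dev = math.sqrt(variance)
sample_variance = sum_sq_dev / (n - 1)  # sample form, for comparison

print(f"sum f(X - mean)^2 = {sum_sq_dev:.2f}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
# variance = 261.64, standard deviation = 16.18
```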
Characteristics of Variance & Standard
Deviation
Variance
All observations are used in the calculation
It is unduly influenced by extreme observations
It cannot be negative
The units are somewhat difficult to work with;
they are the original units squared
Standard Deviation
It is the square root of the average squared distance
from the mean
It is the most widely reported measure of
dispersion

Interpretation and Uses of the Standard
Deviation
Chebyshev's theorem: For any set of
observations, the proportion of
the values that lie within k standard
deviations of the mean is at least $1 - 1/k^2$,
where k is any constant greater than 1.
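To make the bound concrete, the sketch below evaluates $1 - 1/k^2$ for a few values of k. The function name `chebyshev_lower_bound` is illustrative.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%}")
# k = 1.5: at least 55.6%
# k = 2: at least 75.0%
# k = 3: at least 88.9%
```

The bound holds for any distribution, which is why it is much looser than the figures quoted for bell-shaped data.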
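Empirical Rule: For any symmetrical, bell-
shaped distribution, approximately 68% of
the observations will lie within $\pm 1\sigma$ of
the mean ($\mu \pm 1\sigma$); approximately 95% of the
observations will lie within $\pm 2\sigma$ of the
mean ($\mu \pm 2\sigma$); and approximately 99.7% within
$\pm 3\sigma$ of the mean ($\mu \pm 3\sigma$).
[Figure: bell-shaped curve showing the relationship between $\sigma$ and $\mu$, with the horizontal axis marked at $\mu - 3\sigma$, $\mu - 2\sigma$, $\mu - 1\sigma$, $\mu$, $\mu + 1\sigma$, $\mu + 2\sigma$, $\mu + 3\sigma$.]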
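The three percentages can be checked for the special case of an exactly normal distribution, which is an assumption of this sketch (the rule itself is stated for any symmetrical, bell-shaped distribution). It uses the identity P(|Z| <= k) = erf(k / sqrt(2)) for a standard normal Z.

```python
import math

def normal_coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_coverage(k):.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```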
Relative Dispersion
The coefficient of variation
is the ratio of the standard deviation to the
arithmetic mean, expressed as a
percentage
It is useful for comparing distributions with
different units or means
Population:
$V = \dfrac{\sigma}{\mu} \times 100\%$
Sample:
$V = \dfrac{s}{\bar{X}} \times 100\%$
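As an illustration, the coefficient of variation of the grouped sales data from this deck can be computed as below. The sketch uses the population form (dividing by n), in line with the worked variance example.

```python
import math

# Grouped sales data from the deck's table: (class midpoint Xi, frequency f).
table = [(25, 4), (35, 7), (45, 8), (55, 12), (65, 9), (75, 8), (85, 2)]

n = sum(f for _, f in table)
mean = sum(f * x for x, f in table) / n
sigma = math.sqrt(sum(f * (x - mean) ** 2 for x, f in table) / n)

# Coefficient of variation: standard deviation as a percentage of the mean.
cv = sigma / mean * 100
print(f"V = {cv:.2f}%")  # V = 29.73%
```

Because V is unit-free, it could be compared directly with the coefficient of variation of a data set measured in different units.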
Relative Dispersion
Standard score (Indonesian: Angka Standar, AS)
is the deviation of a value from the
arithmetic mean, expressed in units of the
standard deviation
Population:
$AS = \dfrac{X - \mu}{\sigma}$
Sample:
$AS = \dfrac{X - \bar{X}}{s}$
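A minimal sketch of the standard score, using the mean and standard deviation of the grouped sales data from this deck. The observed value `x = 75` is a hypothetical sale chosen for illustration.

```python
import math

# Grouped sales data from the deck's table: (class midpoint Xi, frequency f).
table = [(25, 4), (35, 7), (45, 8), (55, 12), (65, 9), (75, 8), (85, 2)]

n = sum(f for _, f in table)
mean = sum(f * v * 1.0 for v, f in table) / n
sigma = math.sqrt(sum(f * (v - mean) ** 2 for v, f in table) / n)

def standard_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

x = 75  # hypothetical observed value
score = standard_score(x, mean, sigma)
print(f"AS = {score:.2f}")  # AS = 1.27
```

A value equal to the mean has a standard score of 0, and values below the mean have negative scores.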
Other Measures of Dispersion
Quartile Deviation:
$Q_d = \dfrac{Q_3 - Q_1}{2}$
Coefficient of Quartile Variation:
$V_Q = \dfrac{(Q_3 - Q_1)/2}{M_d}$ or $V_Q = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
where $M_d$ is the median.
The interquartile range is the distance
between the third quartile $Q_3$ and the
first quartile $Q_1$, that is, $Q_3 - Q_1$.
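The quartile-based measures can be sketched as follows. The quartiles are the delivery-time figures from the box plot example in this deck (Q1 = 15, median = 18, Q3 = 22 minutes); the variable names are illustrative.

```python
# Quartiles from the delivery-time example in this deck (minutes).
q1, median, q3 = 15, 18, 22

iqr = q3 - q1                            # interquartile range
quartile_deviation = iqr / 2             # half the interquartile range
cqv_median = quartile_deviation / median # coefficient relative to the median
cqv_sum = (q3 - q1) / (q3 + q1)          # coefficient relative to Q3 + Q1

print(f"IQR = {iqr}, Qd = {quartile_deviation}")
print(f"V_Q (median form) = {cqv_median:.4f}, V_Q (sum form) = {cqv_sum:.4f}")
# IQR = 7, Qd = 3.5
# V_Q (median form) = 0.1944, V_Q (sum form) = 0.1892
```

The two coefficient forms give slightly different values because the median is not exactly midway between Q1 and Q3 in this example.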
Box Plots
A box plot is a graphical display, based
on quartiles, that helps to picture a set
of data.
Five pieces of data are needed to
construct a box plot: the Minimum
Value, the First Quartile, the Median,
the Third Quartile, and the Maximum
Value.
EXAMPLE
Based on a sample of 20 deliveries,
Marcos Pizza determined the following
information: minimum value = 13
minutes, Q1 = 15 minutes, median = 18
minutes, Q3 = 22 minutes, maximum
value = 30 minutes. Develop a box plot
for the delivery times.
EXAMPLE continued
[Figure: box plot of the delivery times on an axis from 12 to 32 minutes, marking min = 13, Q1 = 15, median = 18, Q3 = 22, and max = 30.]
Skewness
Skewness is the measurement of the lack of
symmetry of the distribution.
Pearson coefficient:
$Sk = \dfrac{\bar{X} - M_o}{s}$
Modified Pearson coefficient:
$Sk = \dfrac{3(\bar{X} - M_d)}{s}$
where $M_o$ is the mode and $M_d$ is the median.
Sk = 0 (symmetric)
Sk > 0 (positively skewed)
Sk < 0 (negatively skewed)
Croxton & Cowden: $-3 \le Sk \le 3$
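A small sketch of the modified Pearson coefficient, 3(mean - median)/s, using Python's `statistics` module. Both data sets are hypothetical and chosen only to show a positive and a zero result.

```python
import statistics

def modified_pearson_skewness(data):
    """Modified Pearson coefficient: 3 * (mean - median) / sample standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)
    return 3 * (mean - median) / s

# Hypothetical observations, for illustration only.
right_skewed = [2, 3, 3, 4, 5, 6, 12]   # long tail to the right
symmetric = [1, 2, 3, 4, 5]

sk_right = modified_pearson_skewness(right_skewed)
sk_sym = modified_pearson_skewness(symmetric)
print(f"right-skewed: {sk_right:.2f}, symmetric: {sk_sym:.2f}")
```

In the right-skewed sample the large value pulls the mean above the median, so the coefficient is positive.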
Bowley:
$Sk_B = \dfrac{(Q_3 - Q_2) - (Q_2 - Q_1)}{(Q_3 - Q_2) + (Q_2 - Q_1)}$
$Sk_B = 0$: symmetric ($Q_2 - Q_1 = Q_3 - Q_2$)
$Sk_B > 0$: positively skewed ($Q_2 - Q_1 < Q_3 - Q_2$)
$Sk_B < 0$: negatively skewed ($Q_2 - Q_1 > Q_3 - Q_2$)
$Sk_B$ around $\pm 0.1$: not significantly skewed
$Sk_B$ beyond $\pm 0.3$: significantly skewed
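Bowley's coefficient can be illustrated with the quartiles from the delivery-time example in this deck (Q1 = 15, Q2 = median = 18, Q3 = 22 minutes). This is a sketch; the function name is illustrative.

```python
def bowley_skewness(q1: float, q2: float, q3: float) -> float:
    """Bowley's coefficient: compares the upper and lower halves of the interquartile range."""
    upper = q3 - q2
    lower = q2 - q1
    return (upper - lower) / (upper + lower)

# Quartiles from the delivery-time example (minutes).
sk_b = bowley_skewness(15, 18, 22)
print(f"Sk_B = {sk_b:.4f}")  # Sk_B = 0.1429
```

Here Q3 - Q2 = 4 exceeds Q2 - Q1 = 3, so the coefficient is positive, indicating a mild positive skew.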
Relative skewness:
Ungrouped data:
$\alpha_3 = \dfrac{\frac{1}{n}\sum (X - \bar{X})^3}{s^3}$
Grouped data:
$\alpha_3 = \dfrac{\frac{1}{n}\sum f\,(X - \bar{X})^3}{s^3}$
Karl Pearson: $\alpha_3 > 0.5$
Kenny & Keeping:
$-2 \le \alpha_3 \le 2$ (moderately skewed)
$\alpha_3 > 2$ (significantly skewed)
Skewness for Grouped Data
Sales    f    Xi    f (Xi - X̄)^3
20-30    4    25    -101,648.74
30-40    7    35     -51,109.69
40-50    8    45      -6,644.67
50-60   12    55           2.59
60-70    9    65      10,719.14
70-80    8    75      69,934.53
80-90    2    85      57,305.23
Total   50           -21,441.60
$\alpha_3 = \dfrac{\frac{1}{50}(-21{,}441.60)}{(16.18)^3} = -0.10$
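The relative skewness of the grouped sales data can be recomputed with the sketch below. Like the worked example, it uses the standard deviation obtained by dividing by n = 50.

```python
import math

# Grouped sales data from the table: (class midpoint Xi, frequency f).
table = [(25, 4), (35, 7), (45, 8), (55, 12), (65, 9), (75, 8), (85, 2)]

n = sum(f for _, f in table)
mean = sum(f * x for x, f in table) / n
sigma = math.sqrt(sum(f * (x - mean) ** 2 for x, f in table) / n)

# Third moment about the mean, then scaled by the cube of the standard deviation.
sum_cubed_dev = sum(f * (x - mean) ** 3 for x, f in table)
alpha3 = (sum_cubed_dev / n) / sigma ** 3

print(f"sum f(X - mean)^3 = {sum_cubed_dev:.2f}")
print(f"alpha3 = {alpha3:.2f}")  # alpha3 = -0.10
```

The small negative value indicates a slight negative skew in the sales distribution.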
Peakedness
Formula:
Ungrouped data:
$\alpha_4 = \dfrac{\frac{1}{n}\sum (X - \bar{X})^4}{s^4}$
Grouped data:
$\alpha_4 = \dfrac{\frac{1}{n}\sum f\,(X - \bar{X})^4}{s^4}$
Note:
$\alpha_4 = 3$: mesokurtic (normal)
$\alpha_4 > 3$: leptokurtic
$\alpha_4 < 3$: platykurtic
Peakedness for Grouped Data
Sales    f    Xi    f (Xi - X̄)^4
20-30    4    25    2,988,472.84
30-40    7    35      991,527.95
40-50    8    45       62,459.92
50-60   12    55            1.56
60-70    9    65      113,622.93
70-80    8    75    1,440,651.28
80-90    2    85    1,753,540.10
Total   50          7,350,276.56
$\alpha_4 = \dfrac{\frac{1}{50}(7{,}350{,}276.56)}{(16.18)^4} = 2.15$
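A sketch that recomputes the peakedness coefficient for the grouped sales data and classifies it using the note above. As in the worked example, the standard deviation is obtained by dividing by n = 50.

```python
import math

# Grouped sales data from the table: (class midpoint Xi, frequency f).
table = [(25, 4), (35, 7), (45, 8), (55, 12), (65, 9), (75, 8), (85, 2)]

n = sum(f for _, f in table)
mean = sum(f * x for x, f in table) / n
sigma = math.sqrt(sum(f * (x - mean) ** 2 for x, f in table) / n)

# Fourth moment about the mean, scaled by the fourth power of the standard deviation.
sum_fourth_dev = sum(f * (x - mean) ** 4 for x, f in table)
alpha4 = (sum_fourth_dev / n) / sigma ** 4

if math.isclose(alpha4, 3):
    shape = "mesokurtic"
elif alpha4 > 3:
    shape = "leptokurtic"
else:
    shape = "platykurtic"

print(f"alpha4 = {alpha4:.2f} ({shape})")  # alpha4 = 2.15 (platykurtic)
```

Since the result is below 3, the sales distribution is flatter than a normal curve.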