
CHAPTER

Measures of Dispersion

Studying this chapter should enable you to:
• know the limitations of averages;
• appreciate the need for measures of dispersion;
• enumerate various measures of dispersion;
• calculate the measures and compare them;
• distinguish between absolute and relative measures.

1. INTRODUCTION

In the previous chapter, you have studied how to sum up the data into a single representative value. However, that value does not reveal the variability present in the data. In this chapter you will study those measures, which seek to quantify variability of the data.

Three friends, Ram, Rahim and Maria are chatting over a cup of tea. During the course of their conversation, they start talking about their family incomes. Ram tells them that there are four members in his family and the average income per member is Rs 15,000. Rahim says that the average income is the same in his family, though the number of members is six. Maria says that there are five members in her family, out of which one is not working. She calculates that the average income in her family too, is Rs 15,000. They are a little surprised since they know that Maria’s father is earning a huge salary. They go into details and gather the following data:

2015-16
MEASURES OF DISPERSION 75

Family Incomes

Sl. No.          Ram      Rahim    Maria
1.               12,000   7,000    0
2.               14,000   10,000   7,000
3.               16,000   14,000   8,000
4.               18,000   17,000   10,000
5.               -----    20,000   50,000
6.               -----    22,000   ------
Total income     60,000   90,000   75,000
Average income   15,000   15,000   15,000

Do you notice that although the average is the same, there are considerable differences in individual incomes?

It is quite obvious that averages try to tell only one aspect of a distribution i.e. a representative size of the values. To understand it better, you need to know the spread of values also.

You can see that in Ram’s family, differences in incomes are comparatively lower. In Rahim’s family, differences are higher and in Maria’s family, the differences are the highest. Knowledge of only average is insufficient. If you have another value which reflects the quantum of variation in values, your understanding of a distribution improves considerably. For example, per capita income gives only the average income. A measure of dispersion can tell you about income inequalities, thereby improving the understanding of the relative standards of living enjoyed by different strata of society.

Dispersion is the extent to which values in a distribution differ from the average of the distribution.

To quantify the extent of the variation, there are certain measures namely:
(i) Range
(ii) Quartile Deviation
(iii) Mean Deviation
(iv) Standard Deviation

Apart from these measures which give a numerical value, there is a graphic method for estimating dispersion.

Range and Quartile Deviation measure the dispersion by calculating the spread within which the values lie. Mean Deviation and Standard Deviation calculate the extent to which the values differ from the average.

2. MEASURES BASED UPON SPREAD OF VALUES
Range
Range (R) is the difference between the
largest (L) and the smallest value (S)
in a distribution. Thus,
R=L–S
Higher value of Range implies
higher dispersion and vice-versa.
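
As an illustration, the Range of the three family incomes from the Introduction can be computed with a short Python sketch. The dictionary `incomes` and the helper `value_range` are names chosen here for the example; they are not part of the chapter.

```python
# Family incomes (Rs) copied from the Family Incomes table in the Introduction.
incomes = {
    "Ram": [12000, 14000, 16000, 18000],
    "Rahim": [7000, 10000, 14000, 17000, 20000, 22000],
    "Maria": [0, 7000, 8000, 10000, 50000],
}


def value_range(values):
    """Range R = L - S: the largest value minus the smallest value."""
    return max(values) - min(values)


# Mean and Range for each family.
summary = {
    name: {"mean": sum(values) / len(values), "range": value_range(values)}
    for name, values in incomes.items()
}

for name, stats in summary.items():
    print(f"{name}: mean = {stats['mean']:.0f}, range = {stats['range']}")
```

All three means come out as 15,000, while the Range grows from 6,000 (Ram) to 15,000 (Rahim) to 50,000 (Maria), which is the difference the average alone hides.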

76 STATISTICS FOR ECONOMICS

Activities

Look at the following values:
20, 30, 40, 50, 200
• Calculate the Range.
• What is the Range if the value 200 is not present in the data set?
• If 50 is replaced by 150, what will be the Range?

Range: Comments
Range is unduly affected by extreme values. It is not based on all the values. As long as the minimum and maximum values remain unaltered, any change in other values does not affect range. It cannot be calculated for open-ended frequency distribution.

Notwithstanding some limitations, Range is understood and used frequently because of its simplicity. For example, we see the maximum and minimum temperatures of different cities almost daily on our TV screens and form judgments about the temperature variations in them.

Open-ended distributions are those in which either the lower limit of the lowest class or the upper limit of the highest class or both are not specified.

Activity
• Collect data about 52-week high/low of shares of 10 companies from a newspaper. Calculate the range of share prices. Which company’s share is most volatile and which is the most stable?

Quartile Deviation
The presence of even one extremely high or low value in a distribution can reduce the utility of range as a measure of dispersion. Thus, you may need a measure which is not unduly affected by the outliers.

In such a situation, if the entire data is divided into four equal parts, each containing 25% of the values, we get the values of Quartiles and Median. (You have already read about these in Chapter 5).

The upper and lower quartiles (Q3 and Q1, respectively) are used to calculate Inter Quartile Range which is Q3 – Q1.

Inter-Quartile Range is based upon middle 50% of the values in a distribution and is, therefore, not affected by extreme values. Half of the Inter-Quartile Range is called Quartile Deviation (Q.D.). Thus:

$\text{Q.D.} = \frac{Q_3 - Q_1}{2}$

Q.D. is therefore also called Semi-Inter Quartile Range.

Calculation of Range and Q.D. for ungrouped data

Example 1
Calculate Range and Q.D. of the following observations:
20, 25, 29, 30, 35, 39, 41, 48, 51, 60 and 70
Range is clearly 70 – 20 = 50
For Q.D., we need to calculate values of Q3 and Q1.


Q1 is the size of the $\frac{n+1}{4}$th value.

n being 11, Q1 is the size of 3rd value.

As the values are already arranged in ascending order, it can be seen that Q1, the 3rd value is 29. [What will you do if these values are not in an order?]

Similarly, Q3 is the size of the $\frac{3(n+1)}{4}$th value; i.e. 9th value which is 51. Hence Q3 = 51

$\text{Q.D.} = \frac{Q_3 - Q_1}{2} = \frac{51 - 29}{2} = 11$

Do you notice that Q.D. is the average difference of the Quartiles from the median?

Activity
• Calculate the median and check whether the above statement is correct.

Calculation of Range and Q.D. for a frequency distribution.

Example 2
For the following distribution of marks scored by a class of 40 students, calculate the Range and Q.D.

TABLE 6.1

Class intervals   No. of students
CI                (f)
0–10              5
10–20             8
20–40             16
40–60             7
60–90             4
                  40

Range is just the difference between the upper limit of the highest class and the lower limit of the lowest class. So Range is 90 – 0 = 90. For Q.D., first calculate cumulative frequencies as follows:

Class-Intervals   Frequencies   Cumulative Frequencies
CI                f             c.f.
0–10              5             5
10–20             8             13
20–40             16            29
40–60             7             36
60–90             4             40
                  n = 40

Q1 is the size of the $\frac{n}{4}$th value in a continuous series. Thus it is the size of the 10th value. The class containing the 10th value is 10–20. Hence Q1 lies in class 10–20. Now, to calculate the exact value of Q1, the following formula is used:

$Q_1 = L + \frac{\frac{n}{4} - c.f.}{f} \times i$

Where L = 10 (lower limit of the relevant Quartile class)
c.f. = 5 (Value of c.f. for the class preceding the Quartile class)
i = 10 (interval of the Quartile class), and
f = 8 (frequency of the Quartile class)
Thus,

$Q_1 = 10 + \frac{10 - 5}{8} \times 10 = 16.25$

Similarly, Q3 is the size of the $\frac{3n}{4}$th


value; i.e., 30th value, which lies in class 40–60. Now using the formula for Q3, its value can be calculated as follows:

$Q_3 = L + \frac{\frac{3n}{4} - c.f.}{f} \times i$

$Q_3 = 40 + \frac{30 - 29}{7} \times 20$

$Q_3 \approx 42.86$

$\text{Q.D.} = \frac{42.86 - 16.25}{2} \approx 13.30$

In individual and discrete series, Q1 is the size of the $\frac{n+1}{4}$th value, but in a continuous distribution, it is the size of the $\frac{n}{4}$th value. Similarly, for Q3 and median also, n is used in place of n+1.

If the entire group is divided into two equal halves and the median calculated for each half, you will have the median of better students and the median of weak students. These medians differ from the median of the entire group by 13.30 on an average. Similarly, suppose you have data about incomes of people of a town. Median income of all people can be calculated. Now if all people are divided into two equal groups of rich and poor, medians of both groups can be calculated. Quartile Deviation will tell you the average difference between medians of these two groups belonging to rich and poor, from the median of the entire group.

Quartile Deviation can generally be calculated for open-ended distributions and is not unduly affected by extreme values.

3. MEASURES OF DISPERSION FROM AVERAGE

Recall that dispersion was defined as the extent to which values differ from their average. Range and Quartile Deviation are not useful in measuring how far the values are from their average. Yet, by calculating the spread of values, they do give a good idea about the dispersion. Two measures which are based upon deviation of the values from their average are Mean Deviation and Standard Deviation.

Since the average is a central value, some deviations are positive and some are negative. If these are added as they are, the sum will not reveal anything. In fact, the sum of deviations from Arithmetic Mean is always zero. Look at the following two sets of values.

Set A : 5, 9, 16
Set B : 1, 9, 20

You can see that values in Set B are farther from the average and hence more dispersed than values in Set A. Calculate the deviations from Arithmetic Mean and sum them up. What do you notice? Repeat the same with Median. Can you comment upon the quantum of variation from the calculated values?


Mean Deviation tries to overcome this problem by ignoring the signs of deviations, i.e., it considers all deviations positive. For standard deviation, the deviations are first squared and averaged and then square root of the average is found. We shall now discuss them separately in detail.

Mean Deviation
Suppose a college is proposed for students of five towns A, B, C, D and E which lie in that order along a road. Distances of towns in kilometres from town A and number of students in these towns are given below:

Town   Distance from town A   No. of Students
A      0                      90
B      2                      150
C      6                      100
D      14                     200
E      18                     80
                              620

Now, if the college is situated in town A, 150 students from town B will have to travel 2 kilometres each (a total of 300 kilometres) to reach the college. The objective is to find a location so that the average distance travelled by students is minimum.

You may observe that the students will have to travel more, on an average, if the college is situated at town A or E. If on the other hand, it is somewhere in the middle, they are likely to travel less. Mean deviation is the appropriate statistical tool to estimate the average distance travelled by students. Mean Deviation is the arithmetic mean of the differences of the values from their average. The average used is either the arithmetic mean or median.

(Since the mode is not a stable average, it is not used to calculate Mean Deviation.)

Activities
• Calculate the total distance to be travelled by students if the college is situated at town A, at town C, or town E and also if it is exactly half way between A and E.
• Decide where, in your opinion, the college should be established, if there is only one student in each town. Does it change your answer?

Calculation of Mean Deviation from Arithmetic Mean for ungrouped data.

Direct Method
Steps:
(i) The A.M. of the values is calculated.
(ii) Difference between each value and the A.M. is calculated. All differences are considered positive. These are denoted as |d|.
(iii) The A.M. of these differences (called deviations) is the Mean Deviation.

i.e. $\text{M.D.} = \frac{\Sigma |d|}{n}$

Example 3
Calculate the Mean Deviation of the following values; 2, 4, 7, 8 and 9.


The A.M. $= \frac{\Sigma X}{n} = 6$

X      |d|
2      4
4      2
7      1
8      2
9      3
       12

$\text{M.D.}(\bar{x}) = \frac{12}{5} = 2.4$

Assumed Mean Method
Mean Deviation can also be calculated by calculating deviations from an assumed mean. This method is adopted especially when the actual mean is a fractional number. (Take care that the assumed mean is close to the true mean).

For the values in example 3, suppose value 7 is taken as assumed mean, M.D. can be calculated as under:

Example 4

X      |d|
2      5
4      3
7      0
8      1
9      2
       11

In such cases, the following formula is used,

$\text{M.D.}(\bar{x}) = \frac{\Sigma |d| + (\bar{x} - A\bar{x})(\Sigma f_B - \Sigma f_A)}{n}$

Where Σ|d| is the sum of absolute deviations taken from the assumed mean.
x̄ is the actual mean.
Ax̄ is the assumed mean used to calculate deviations.
ΣfB is the number of values below the actual mean including the actual mean.
ΣfA is the number of values above the actual mean.

Substituting the values in the above formula:

$\text{M.D.}(\bar{x}) = \frac{11 + (6 - 7)(2 - 3)}{5} = \frac{12}{5} = 2.4$

Mean Deviation from median for ungrouped data.

Direct Method
Using the values in example 3, M.D. from the Median can be calculated as follows,
(i) Calculate the median which is 7.
(ii) Calculate the absolute deviations from median, denote them as |d|.
(iii) Find the average of these absolute deviations. It is the Mean Deviation.

Example 5

[X − Median]
X      |d|
2      5
4      3
7      0
8      1
9      2
       11
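
The calculations of Examples 3 to 5 for the values 2, 4, 7, 8 and 9 can be sketched in Python. The function name `mean_deviation` and the variable names are assumptions made for this illustration. The assumed-mean correction is applied exactly as in the formula of Example 4, which relies on no value lying strictly between the assumed mean and the actual mean.

```python
from statistics import mean, median

values = [2, 4, 7, 8, 9]
n = len(values)


def mean_deviation(data, centre):
    """Arithmetic mean of the absolute deviations |d| taken from `centre`."""
    return sum(abs(x - centre) for x in data) / len(data)


# Direct method, from the arithmetic mean (6) and from the median (7).
md_from_mean = mean_deviation(values, mean(values))
md_from_median = mean_deviation(values, median(values))

# Assumed mean method with assumed mean 7, as in Example 4.
assumed = 7
actual = mean(values)
sum_abs_d = sum(abs(x - assumed) for x in values)  # 11
n_below = sum(1 for x in values if x <= actual)    # values up to the actual mean: 2
n_above = sum(1 for x in values if x > actual)     # values above the actual mean: 3
md_assumed = (sum_abs_d + (actual - assumed) * (n_below - n_above)) / n

print(md_from_mean, md_assumed, md_from_median)
```

Both routes to the Mean Deviation from the mean give 2.4, and the Mean Deviation from the median is 2.2.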


M.D. from Median is thus,

$\text{M.D.}(\text{Median}) = \frac{\Sigma |d|}{n} = \frac{11}{5} = 2.2$

Short-cut method
To calculate Mean Deviation by short cut method, a value (A) is used to calculate the deviations and the following formula is applied.

$\text{M.D.}(\text{Median}) = \frac{\Sigma |d| + (\text{Median} - A)(\Sigma f_B - \Sigma f_A)}{n}$

where, A = the constant from which deviations are calculated. (Other notations are the same as given in the assumed mean method).

Mean Deviation from Mean for Continuous distribution

TABLE 6.2

Profits of companies   Number of
(Rs in lakhs)          Companies
Class-intervals
10–20                  5
20–30                  8
30–50                  16
50–70                  8
70–80                  3
                       40

Steps:
(i) Calculate the mean of the distribution.
(ii) Calculate the absolute deviations |d| of the class midpoints from the mean.
(iii) Multiply each |d| value with its corresponding frequency to get f|d| values. Sum them up to get Σf|d|.
(iv) Apply the following formula,

$\text{M.D.}(\bar{x}) = \frac{\Sigma f|d|}{\Sigma f}$

Mean Deviation of the distribution in Table 6.2 can be calculated as follows:

Example 6

C.I.     f     m.p.   |d|     f|d|
10–20    5     15     25.5    127.5
20–30    8     25     15.5    124.0
30–50    16    40     0.5     8.0
50–70    8     60     19.5    156.0
70–80    3     75     34.5    103.5
         40                   519.0

$\text{M.D.}(\bar{x}) = \frac{\Sigma f|d|}{\Sigma f} = \frac{519}{40} = 12.975$

Mean Deviation from Median

TABLE 6.3

Class intervals   Frequencies
20–30             5
30–40             10
40–60             20
60–80             9
80–90             6
                  50

The procedure to calculate Mean Deviation from the median is the same as it is in case of M.D. from Mean, except that deviations are to be taken from the median as given below:


Example 7

C.I.     f     m.p.   |d|    f|d|
20–30    5     25     25     125
30–40    10    35     15     150
40–60    20    50     0      0
60–80    9     70     20     180
80–90    6     85     35     210
         50                  665

$\text{M.D.}(\text{Median}) = \frac{\Sigma f|d|}{\Sigma f} = \frac{665}{50} = 13.3$

Mean Deviation: Comments
Mean Deviation is based on all values. A change in even one value will affect it. It is the least when calculated from the median i.e., it will be higher if calculated from the mean. However it ignores the signs of deviations and cannot be calculated for open-ended distributions.

Standard Deviation
Standard Deviation is the positive square root of the mean of squared deviations from mean. So if there are five values x1, x2, x3, x4 and x5, first their mean is calculated. Then deviations of the values from mean are calculated. These deviations are then squared. The mean of these squared deviations is the variance. Positive square root of the variance is the standard deviation.
(Note that Standard Deviation is calculated on the basis of the mean only).

Calculation of Standard Deviation for ungrouped data
Four alternative methods are available for the calculation of standard deviation of individual values. All these methods result in the same value of standard deviation. These are:
(i) Actual Mean Method
(ii) Assumed Mean Method
(iii) Direct Method
(iv) Step-Deviation Method

Actual Mean Method:
Suppose you have to calculate the standard deviation of the following values:
5, 10, 25, 30, 50

Example 8

X      d (X − x̄)   d²
5      –19          361
10     –14          196
25     +1           1
30     +6           36
50     +26          676
       0            1270

Following formula is used:

$\sigma = \sqrt{\frac{\Sigma d^2}{n}}$

$\sigma = \sqrt{\frac{1270}{5}} = \sqrt{254} = 15.937$

Do you notice the value from which deviations have been calculated in the above example? Is it the Actual Mean?

Assumed Mean Method
For the same values, deviations may be calculated from any arbitrary value


Ax̄ such that d = X – Ax̄. Taking Ax̄ = 25, the computation of the standard deviation is shown below:

Example 9

X      d (X − Ax̄)   d²
5      –20           400
10     –15           225
25     0             0
30     +5            25
50     +25           625
       –5            1275

Formula for Standard Deviation

$\sigma = \sqrt{\frac{\Sigma d^2}{n} - \left(\frac{\Sigma d}{n}\right)^2}$

$\sigma = \sqrt{\frac{1275}{5} - \left(\frac{-5}{5}\right)^2} = \sqrt{254} = 15.937$

Note that the sum of deviations from a value other than actual mean will not be equal to zero.

Direct Method
Standard Deviation can also be calculated from the values directly, i.e., without taking deviations, as shown below:

Example 10

X      X²
5      25
10     100
25     625
30     900
50     2500
120    4150

(This amounts to taking deviations from zero)
Following formula is used.

$\sigma = \sqrt{\frac{\Sigma x^2}{n} - (\bar{x})^2}$

or $\sigma = \sqrt{\frac{4150}{5} - (24)^2}$

or $\sigma = \sqrt{254} = 15.937$

Standard Deviation is not affected by the value of the constant from which deviations are calculated. The value of the constant does not figure in the standard deviation formula. Thus, Standard Deviation is Independent of Origin.

Step-deviation Method
If the values are divisible by a common factor, they can be so divided and standard deviation can be calculated from the resultant values as follows:

Example 11
Since all the five values are divisible by a common factor 5, we divide and get the following values:

x      x'     d (x' − x̄')   d²
5      1      –3.8           14.44
10     2      –2.8           7.84
25     5      +0.2           0.04
30     6      +1.2           1.44
50     10     +5.2           27.04
              0              50.80

(Steps in the calculation are same as in actual mean method).
The following formula is used to calculate standard deviation:


$\sigma = \sqrt{\frac{\Sigma d^2}{n}} \times c$

$x' = \frac{x}{c}$

c = common factor

Substituting the values,

$\sigma = \sqrt{\frac{50.80}{5}} \times 5$

$\sigma = \sqrt{10.16} \times 5$

$\sigma = 15.937$

Alternatively, instead of dividing the values by a common factor, the deviations can be divided by a common factor. Standard Deviation can be calculated as shown below:

Example 12

x      d (x − 25)   d' (d/5)   d'²
5      –20          –4         16
10     –15          –3         9
25     0            0          0
30     +5           +1         1
50     +25          +5         25
                    –1         51

Deviations have been calculated from an arbitrary value 25. Common factor of 5 has been used to divide deviations.

$\sigma = \sqrt{\frac{\Sigma d'^2}{n} - \left(\frac{\Sigma d'}{n}\right)^2} \times c$

$\sigma = \sqrt{\frac{51}{5} - \left(\frac{-1}{5}\right)^2} \times 5$

$\sigma = \sqrt{10.16} \times 5 = 15.937$

Standard Deviation is not independent of scale. Thus, if the values or deviations are divided by a common factor, the value of the common factor is used in the formula to get the value of Standard Deviation.

Standard Deviation in Continuous frequency distribution:
Like ungrouped data, S.D. can be calculated for grouped data by any of the following methods:
(i) Actual Mean Method
(ii) Assumed Mean Method
(iii) Step-Deviation Method

Actual Mean Method
For the values in Table 6.2, Standard Deviation can be calculated as follows:

Example 13

(1)      (2)   (3)   (4)    (5)      (6)       (7)
CI       f     m     fm     d        fd        fd²
10–20    5     15    75     –25.5    –127.5    3251.25
20–30    8     25    200    –15.5    –124.0    1922.00
30–50    16    40    640    –0.5     –8.0      4.00
50–70    8     60    480    +19.5    +156.0    3042.00
70–80    3     75    225    +34.5    +103.5    3570.75
         40          1620            0         11790.00

Following steps are required:
1. Calculate the mean of the distribution.

$\bar{x} = \frac{\Sigma fm}{\Sigma f} = \frac{1620}{40} = 40.5$

2. Calculate deviations of mid-values from the mean so that d = m − x̄ (Col. 5)
3. Multiply the deviations with their


corresponding frequencies to get ‘fd’ values (Col. 6) [Note that Σfd = 0]
4. Calculate ‘fd²’ values by multiplying ‘fd’ values with ‘d’ values (Col. 7). Sum up these to get Σfd².
5. Apply the formula as under:

$\sigma = \sqrt{\frac{\Sigma fd^2}{n}} = \sqrt{\frac{11790}{40}} = 17.168$

Assumed Mean Method
For the values in example 13, standard deviation can be calculated by taking deviations from an assumed mean (say 40) as follows:

Example 14

(1)      (2)   (3)   (4)    (5)     (6)
CI       f     m     d      fd      fd²
10–20    5     15    –25    –125    3125
20–30    8     25    –15    –120    1800
30–50    16    40    0      0       0
50–70    8     60    +20    +160    3200
70–80    3     75    +35    +105    3675
         40                 +20     11800

The following steps are required:
1. Calculate mid-points of classes (Col. 3)
2. Calculate deviations of mid-points from an assumed mean such that d = m – Ax̄ (Col. 4). Assumed Mean = 40.
3. Multiply values of ‘d’ with corresponding frequencies to get ‘fd’ values (Col. 5). (Note that the total of this column is not zero since deviations have been taken from assumed mean).
4. Multiply ‘fd’ values (Col. 5) with ‘d’ values (Col. 4) to get fd² values (Col. 6). Find Σfd².
5. Standard Deviation can be calculated by the following formula.

$\sigma = \sqrt{\frac{\Sigma fd^2}{n} - \left(\frac{\Sigma fd}{n}\right)^2}$

or $\sigma = \sqrt{\frac{11800}{40} - \left(\frac{20}{40}\right)^2}$

or $\sigma = \sqrt{294.75} = 17.168$

Step-deviation Method
In case the values of deviations are divisible by a common factor, the calculations can be simplified by the step-deviation method as in the following example.

Example 15

(1)      (2)   (3)   (4)    (5)   (6)    (7)
CI       f     m     d      d'    fd'    fd'²
10–20    5     15    –25    –5    –25    125
20–30    8     25    –15    –3    –24    72
30–50    16    40    0      0     0      0
50–70    8     60    +20    +4    +32    128
70–80    3     75    +35    +7    +21    147
         40                       +4     472

Steps required:
1. Calculate class mid-points (Col. 3) and deviations from an arbitrarily chosen value, just like in the assumed mean method. In this example, deviations have been taken from the value 40. (Col. 4)
2. Divide the deviations by a common factor denoted as ‘c’. c = 5 in the


above example. The values so obtained are ‘d'’ values (Col. 5).
3. Multiply ‘d'’ values with corresponding ‘f’ values (Col. 2) to obtain ‘fd'’ values (Col. 6).
4. Multiply ‘fd'’ values with ‘d'’ values to get ‘fd'²’ values (Col. 7)
5. Sum up values in Col. 6 and Col. 7 to get Σfd' and Σfd'² values.
6. Apply the following formula.

$\sigma = \sqrt{\frac{\Sigma fd'^2}{\Sigma f} - \left(\frac{\Sigma fd'}{\Sigma f}\right)^2} \times c$

or $\sigma = \sqrt{\frac{472}{40} - \left(\frac{4}{40}\right)^2} \times 5$

or $\sigma = \sqrt{11.8 - 0.01} \times 5$

or $\sigma = \sqrt{11.79} \times 5$

$\sigma = 17.168$

Standard Deviation: Comments
Standard Deviation, the most widely used measure of dispersion, is based on all values. Therefore a change in even one value affects the value of standard deviation. It is independent of origin but not of scale. It is also useful in certain advanced statistical problems.

5. ABSOLUTE AND RELATIVE MEASURES OF DISPERSION

All the measures, described so far, are absolute measures of dispersion. They calculate a value which, at times, is difficult to interpret. For example, consider the following two data sets:

Set A   500        700        1000
Set B   1,00,000   1,20,000   1,30,000

Suppose the values in Set A are the daily sales recorded by an ice-cream vendor, while Set B has the daily sales of a big departmental store. Range for Set A is 500 whereas for Set B, it is 30,000. The value of Range is much higher in Set B. Can you say that the variation in sales is higher for the departmental store? It can be easily observed that the highest value in Set A is double the smallest value, whereas for the Set B, it is only 30% higher. Thus absolute measures may give misleading ideas about the extent of variation specially when the averages differ significantly.

Another weakness of absolute measures is that they give the answer in the units in which original values are expressed. Consequently, if the values are expressed in kilometers, the dispersion will also be in kilometers. However, if the same values are expressed in meters, an absolute measure will give the answer in meters and the value of dispersion will appear to be 1000 times.

To overcome these problems, relative measures of dispersion can be used. Each absolute measure has a relative counterpart. Thus, for Range, there is Coefficient of Range which is calculated as follows:

$\text{Coefficient of Range} = \frac{L - S}{L + S}$

where L = Largest value
S = Smallest value

Similarly, for Quartile Deviation, it is Coefficient of Quartile Deviation which can be calculated as follows:


$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$

where Q3 = 3rd Quartile
Q1 = 1st Quartile

For Mean Deviation, it is Coefficient of Mean Deviation.

$\text{Coefficient of Mean Deviation} = \frac{\text{M.D.}(\bar{x})}{\bar{x}}$ or $\frac{\text{M.D.}(\text{Median})}{\text{Median}}$

Thus if Mean Deviation is calculated on the basis of the Mean, it is divided by the Mean. If Median is used to calculate Mean Deviation, it is divided by the Median.

For Standard Deviation, the relative measure is called Coefficient of Variation, calculated as below:

$\text{Coefficient of Variation} = \frac{\text{Standard Deviation}}{\text{Arithmetic Mean}} \times 100$

It is usually expressed in percentage terms and is the most commonly used relative measure of dispersion. Since relative measures are free from the units in which the values have been expressed, they can be compared even across different groups having different units of measurement.

7. LORENZ CURVE

The measures of dispersion discussed so far give a numerical value of dispersion. A graphical measure called Lorenz Curve is available for estimating inequalities in distribution. You may have heard of statements like ‘top 10% of the people of a country earn 50% of the national income while top 20% account for 80%’. An idea about income disparities is given by such figures. Lorenz Curve uses the information expressed in a cumulative manner to indicate the degree of inequality. For example Lorenz curve of income gives a relationship between percentage of population and its share of income in total income. It is specially useful in comparing the variability of two or more distributions by drawing two or more Lorenz curves on the same axis.

Given below are the monthly incomes of employees of a company.

TABLE 6.4

Incomes          Number of employees
0–5,000          5
5,000–10,000     10
10,000–20,000    18
20,000–40,000    10
40,000–50,000    7

Example 16

Income class   Frequency   Percentage   Cumulative   Mid-points   Total     Percentage   Cumulative
                           Frequency    percentage                income    share of     Percentage
                                        Frequency                           income       share of income
(1)            (2)         (3)          (4)          (5)          (6)       (7)          (8)
0–5000         5           10.00        10.00        2500         12500     1.29         1.29
5000–10000     10          20.00        30.00        7500         75000     7.71         9.00
10000–20000    18          36.00        66.00        15000        270000    27.76        36.76
20000–40000    10          20.00        86.00        30000        300000    30.85        67.61
40000–50000    7           14.00        100.00       45000        315000    32.39        100.00
Total          50          100.00       -            -            972500    100.00       -

Column (3) = [col. (2) ÷ total of col. (2)] × 100.
Column (6) = [col. (2) × col. (5)].
Column (7) = [col. (6) ÷ total of col. (6)] × 100.


Construction of the Lorenz Curve

Since Lorenz curve provides a relationship between percentage of population and its share in cumulative manner, we will first compute percentage population in each income class, and its total, then the cumulative distribution of population. After that we will compute the share, in percentage terms, of each income class and then the cumulative distribution of share. Finally, the curve using different combinations of cumulative percentage of population and cumulative percentage of its (population's) share will give the Lorenz curve. Please refer to Example 16 given above.

Following steps are required.
1. Calculate the percentage frequency by the formula [col. (2) ÷ total of col. (2)] × 100 as given in col. (3).
2. Calculate cumulative percentage frequencies as in col. (4).
3. Calculate class mid-points as in col. (5), which can be approximated as the average income of each person in the income class.
4. Now to calculate the share of each income class, we need to know the total income of each class and total income of all classes. The total income of each class will be obtained by multiplying col. (2) with col. (5) as in col. (6).
5. Calculate the percentage share by the formula [col. (6) ÷ total of col. (6)] × 100 as given in col. (7).
6. Calculate cumulative percentage share as in col. (8).
7. Now, on the graph paper, take the cumulative percentage of the variable (share of incomes) [i.e., col. (8)] on Y-axis and cumulative percentage of frequencies (number of employees) [i.e., col. (4)] on X-axis, as in figure 6.1. Thus each axis will have values from ‘0’ to ‘100’. The coordinate (0,0) will imply 0% population has 0% share of income, which is obvious.
8. Draw a line joining co-ordinate (0,0) with (100,100). This is called the line of equal distribution shown as line ‘OC’ in figure 6.1.
9. Plot the cumulative percentage shares against the corresponding cumulative percentages of frequency. Join these points to get the curve OAC.

Studying the Lorenz Curve
OC is called the line of equal distribution, since it would imply a situation like, top 20% people earn 20% of total income and top 60% earn 60% of the total income. The farther the curve OAC from this line, the greater is the inequality present in the distribution. If there are two or more curves on the same axes, the one which is the farthest from line OC has the highest inequality. However, if two Lorenz curves intersect each other, no conclusion can be drawn.

[Fig. 6.1: Lorenz curve. X-axis: Cumulative Percentages of Employees; Y-axis: Cumulative Percentages of Income share]
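
Steps 1 to 6 can be sketched in Python to obtain the points through which the curve OAC passes. This is a minimal illustration using the figures of Table 6.4; the helper `cumulative_percentages` and the other names are assumptions made for this example, and no plotting is attempted.

```python
# Income classes and number of employees from Table 6.4.
classes = [(0, 5000), (5000, 10000), (10000, 20000), (20000, 40000), (40000, 50000)]
frequencies = [5, 10, 18, 10, 7]


def cumulative_percentages(amounts):
    """Running total of `amounts` as a percentage of their sum, to 2 decimals."""
    total = sum(amounts)
    running, result = 0, []
    for amount in amounts:
        running += amount
        result.append(round(100 * running / total, 2))
    return result


# Col. (5): class mid-points; col. (6): total income of each class.
midpoints = [(lower + upper) / 2 for lower, upper in classes]
class_income = [f * m for f, m in zip(frequencies, midpoints)]

# Col. (4) and col. (8): the X and Y coordinates of the Lorenz curve.
cum_pct_employees = cumulative_percentages(frequencies)
cum_pct_income = cumulative_percentages(class_income)

# The curve starts at the origin O (0, 0) and ends at C (100, 100).
lorenz_points = [(0.0, 0.0)] + list(zip(cum_pct_employees, cum_pct_income))
for x, y in lorenz_points:
    print(f"{x:6.2f}% of employees -> {y:6.2f}% of income")
```

Every point lies on or below the line of equal distribution, for example (30.00, 9.00): the lowest-paid 30% of employees receive 9% of the total income.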


8. CONCLUSION

Although Range is the simplest to calculate and understand, it is unduly affected by extreme values. Q.D. is not affected by extreme values as it is based on only middle 50% of the data. However, it is more difficult to interpret M.D. and S.D. Both are based upon deviations of values from their average. M.D. calculates average of deviations from the average but ignores signs of deviations and therefore appears to be unmathematical. Standard Deviation attempts to calculate average deviation from mean. Like M.D., it is based on all values and is also applied in more advanced statistical problems. It is the most widely used measure of dispersion.

Recap
• A measure of dispersion improves our understanding about the
behaviour of an economic variable.
• Range and Quartile Deviation are based upon the spread of values.
• M.D. and S.D. are based upon deviations of values from the average.
• Measures of dispersion could be Absolute or Relative.
• Absolute measures give the answer in the units in which data are
expressed.
• Relative measures are free from these units, and consequently
can be used to compare different variables.
• A graphic method, which estimates the dispersion from shape
of a curve, is called Lorenz Curve.
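
As a worked recap, the four ungrouped-data methods for Standard Deviation (Examples 8 to 12) and the Coefficient of Variation can be sketched in Python for the values 5, 10, 25, 30 and 50. The variable names are assumptions made for this illustration; the assumed mean 25 and the common factor 5 are the ones used in the examples.

```python
from math import sqrt

values = [5, 10, 25, 30, 50]
n = len(values)
mean = sum(values) / n  # 24.0

# (i) Actual mean method: deviations from the actual mean.
sd_actual = sqrt(sum((x - mean) ** 2 for x in values) / n)

# (ii) Assumed mean method: deviations from the assumed mean 25.
assumed = 25
d = [x - assumed for x in values]
sd_assumed = sqrt(sum(di ** 2 for di in d) / n - (sum(d) / n) ** 2)

# (iii) Direct method: no deviations, only the values and their squares.
sd_direct = sqrt(sum(x ** 2 for x in values) / n - mean ** 2)

# (iv) Step-deviation method: deviations divided by the common factor c = 5.
c = 5
d_step = [di / c for di in d]
sd_step = sqrt(sum(di ** 2 for di in d_step) / n - (sum(d_step) / n) ** 2) * c

# Relative measure: Coefficient of Variation, in percentage terms.
cv = sd_actual / mean * 100

print(round(sd_actual, 3), round(sd_assumed, 3), round(sd_direct, 3), round(sd_step, 3))
print(round(cv, 1))
```

All four methods give 15.937, in line with the statement that they result in the same value, and the Coefficient of Variation works out to about 66.4 per cent.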

EXERCISES

1. A measure of dispersion is a good supplement to the central value in
understanding a frequency distribution. Comment.
2. Which measure of dispersion is the best and how?
3. Some measures of dispersion depend upon the spread of values whereas
some are estimated on the basis of the variation of values from a central
value. Do you agree?
4. In a town, 25% of the persons earned more than Rs 45,000 whereas
75% earned more than 18,000. Calculate the absolute and relative values
of dispersion.
5. The yield of wheat and rice per acre for 10 districts of a state is as under:
District 1 2 3 4 5 6 7 8 9 10
Wheat 12 10 15 19 21 16 18 9 25 10
Rice 22 29 12 23 18 15 12 34 18 12
Calculate for each crop,
(i) Range
(ii) Q.D.
(iii) Mean Deviation about Mean
(iv) Mean Deviation about Median


(v) Standard Deviation
(vi) Which crop has greater variation?
(vii) Compare the values of different measures for each crop.
6. In the previous question, calculate the relative measures of variation
and indicate the value which, in your opinion, is more reliable.
7. A batsman is to be selected for a cricket team. The choice is between X
and Y on the basis of their scores in five previous tests which are:
X 25 85 40 80 120
Y 50 70 65 45 80
Which batsman should be selected if we want,
(i) a higher run getter, or
(ii) a more reliable batsman in the team?
8. To check the quality of two brands of lightbulbs, their life in burning
hours was estimated as under for 100 bulbs of each brand.
Life No. of bulbs
(in hrs) Brand A Brand B
0–50 15 2
50–100 20 8
100–150 18 60
150–200 25 25
200–250 22 5
100 100

(i) Which brand gives higher life?
(ii) Which brand is more dependable?
9. Average daily wage of 50 workers of a factory was Rs 200 with a Standard
Deviation of Rs 40. Each worker is given a raise of Rs 20. What is the
new average daily wage and standard deviation? Have the wages become
more or less uniform?
10. If in the previous question, each worker is given a hike of 10 % in wages,
how are the Mean and Standard Deviation values affected?
11. Calculate the Mean Deviation using mean and Standard Deviation for
the following distribution.
Classes Frequencies
20–40 3
40–80 6
80–100 20
100–120 12
120–140 9
50
12. The sum of 10 values is 100 and the sum of their squares is 1090. Find
out the Coefficient of Variation.
