
Frequency Distribution

FREQUENCY DISTRIBUTION
4 6 5 4 4
3 5 4 5 3
4 4 5 6 4
5 5 3 3 5
7 6 5 5 4

Table 1: Daily number of car accidents in a certain city, recorded
over a period of 30 days.

Simple series or ungrouped data: a series of observations recorded
without any systematic arrangement, e.g. Table 1. Ungrouped data are
often so numerous that their significance cannot be readily comprehended.
In such cases it becomes necessary to summarise the data in an easily
manageable form. We may summarise the raw data with the help of a
frequency distribution.
FREQUENCY DISTRIBUTION

A frequency distribution is a statistical table which shows the
values of the variable arranged in order of magnitude, with
the corresponding frequencies side by side.

Q) What is Frequency?

The number of times a particular value is repeated is known as its
frequency. To facilitate counting we may prepare a column of tally
bars. In one column, place all possible values of the variable from
the lowest to the highest. In the next, mark one tally bar for each
occurrence of the value. Then count the tally bars for each value
and enter the count in the frequency column, as below.
Daily no. of car   Tally bars     Frequency
accidents                         (no. of days)
3                  IIIII           5
4                  IIIIIIIII       9
5                  IIIIIIIIIII    11
6                  IIII            4
7                  I               1
Total                             30

Table 2: Simple frequency distribution of car accidents.
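The tallying procedure can be sketched in Python as follows. This is a minimal illustration, assuming only the standard library; the list `accidents` holds just the first ten values of Table 1, chosen for brevity, so the counts are those of the sample and not of the full 30 days.

```python
from collections import Counter

# Illustrative sample: the first ten values of Table 1 (not the full 30 days).
accidents = [4, 6, 5, 4, 4, 3, 5, 4, 5, 3]

# Counter does the tallying: it maps each distinct value to its frequency.
frequency = Counter(accidents)

# Print the values in order of magnitude with a tally column.
for value in sorted(frequency):
    count = frequency[value]
    print(f"{value:>5}  {'I' * count:<10}  {count}")
print(f"Total  {'':<10}  {sum(frequency.values())}")
```

Sorting the keys gives the "order of magnitude" arrangement that the definition of a frequency distribution asks for.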
Types of frequency distribution

There are two types of frequency distribution:

1) Simple frequency distribution: shows the values of the variable
individually, e.g. Table 2.
2) Grouped frequency distribution: shows the values of the
variable in groups. When the variable takes a large number of
distinct values, a grouped frequency distribution is generally
more advantageous than a simple frequency distribution. For
example, consider the following data showing the age (in
years) of 200 unemployed persons who registered their names
in an Employment Exchange on a certain day.
33 26 33 30 28 25 28 22 17 30
26 17 29 22 18 40 20 33 21 25
25 20 21 29 27 26 19 37 32 21
20 40 20 20 27 19 20 20 29 20
21 49 34 26 29 31 25 33 23 16
21 20 23 29 23 17 23 27 43 22
24 26 29 29 20 17 32 23 24 25
36 19 28 26 31 52 25 57 16 26
27 19 24 19 39 23 24 25 17 19
31 18 26 23 19 30 20 19 50 40
22 27 17 22 43 33 22 17 21 25
24 46 27 20 26 20 21 24 22 21
25 22 24 32 24 20 22 15 34 22
22 20 37 19 24 21 28 27 19 31
22 24 17 29 24 17 23 21 21 31
31 18 22 19 33 25 21 25 23 21
27 18 15 19 24 19 22 22 16 21
17 18 25 31 26 29 24 31 23 18
31 24 23 21 24 18 21 19 30 24
24 22 30 19 25 21 21 25 45 24
The first step towards summarization of the data is to arrange
the figures in the form of a simple frequency distribution, as
follows:
Age (years)  Frequency    Age (years)  Frequency    Age (years)  Frequency
15             2          26            10          39             1
16             3          27             7          40             3
17            10          28             3          43             2
18             8          29             9          45             1
19            14          30             5          46             1
20            14          31             8          49             1
21            15          32             3          50             1
22            19          33             6          52             1
23            16          34             2          57             1
24            11          36             1
25            20          37             2          Total        200
Although the simple frequency distribution gives a somewhat clearer
picture, the number of distinct values is still large. The data are
easier to take in if we group the observations (ages) into a number
of intervals, i.e. 15-19, 20-24 etc., and show the frequency of
occurrence (here the number of persons falling in the particular age
group) separately for each interval. Such a frequency table is known
as a Grouped Frequency Distribution.

Age in years    Frequency (no. of persons)
15-19            37
20-24            81
25-29            43
30-34            24
35-44             9
45-59             6
Total           200
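Grouping can be sketched as a small counting routine. This is an illustrative sketch, not a prescribed method: the function name `grouped_frequency` is hypothetical, both class limits are assumed to be inclusive, and `ages` holds only the first ten ages from the raw data, so the printed counts describe that sample only.

```python
def grouped_frequency(values, classes):
    """Count how many values fall in each (lower_limit, upper_limit) class.

    Both limits are treated as inclusive, as for the discrete age classes.
    """
    counts = {cls: 0 for cls in classes}
    for v in values:
        for lower, upper in classes:
            if lower <= v <= upper:
                counts[(lower, upper)] += 1
                break  # classes are mutually exclusive, so stop at the first match
    return counts

# Illustrative sample: the first ten ages from the raw data above.
ages = [33, 26, 33, 30, 28, 25, 28, 22, 17, 30]
# Class limits; the last class is written as 45-59 so that ages above 49 have a class.
classes = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 44), (45, 59)]

table = grouped_frequency(ages, classes)
for (lower, upper), f in table.items():
    print(f"{lower}-{upper}: {f}")
```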
Different components of a grouped
frequency distribution
a) Class Interval

When the observations are numerous and vary over a wide
range, they are usually classified into several groups
according to the size of their values. Each of these
groups is called a class interval, or simply a class.
In our example we have classified the ages into 6
different intervals, i.e. 15-19, ..., 45-59, since the
ages vary over a wide range, from 15 to 57.
Although there is no hard and fast rule regarding
the number of class intervals, it is generally agreed
that the number of classes should be neither too large,
since then the basic purpose of classification (i.e.
summarization of data into an easily manageable form)
will not be served, nor too small, since then the true
nature of the distribution will be obscured.

As a working rule this number should lie between 5
and 15, depending upon the number of observations
available.
The class width should preferably be five or a
multiple of five, like 10, 20, 25 etc. The reason is
that the human mind is more accustomed to thinking
in terms of multiples of 5, 10 and the like.

Classes must be mutually exclusive, meaning that
each of the given values is included in exactly one
of the classes.

For example: 15-19, 20-24 etc.
(not 15-20, 20-25 etc., where the value 20 would
belong to two classes)
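The mutual-exclusivity requirement can be checked mechanically. The helper below is a hypothetical sketch that assumes each class is given as an inclusive (lower limit, upper limit) pair.

```python
def mutually_exclusive(classes):
    """Return True if no two (lower, upper) classes share a value.

    Limits are inclusive, so each class must start strictly after the
    previous class ends.
    """
    ordered = sorted(classes)
    return all(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))

good = [(15, 19), (20, 24), (25, 29)]
bad = [(15, 20), (20, 25), (25, 30)]   # 20 and 25 each fall in two classes

print(mutually_exclusive(good))
print(mutually_exclusive(bad))
```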
b) Class Frequency

The number of observations falling within
a class is called the class frequency. The sum
of all class frequencies is the total frequency.
In our example the class frequencies are 37, 81 etc.
c) Class Limit

The two numbers used to specify the limits
of a class interval are called class limits.
The smaller of the pair is known as the lower
class limit and the larger as the upper class
limit. In our example, column (3) shows the lower
class limits and column (4) the upper class
limits, as shown in Table-3.
Class       Class        Class limit
interval    frequency    Lower    Upper
(1)         (2)          (3)      (4)
15-19        37          15       19
20-24        81          20       24
25-29        43          25       29
30-34        24          30       34
35-44         9          35       44
45-59         6          45       59
Total       200
Table-3
d) Class Boundaries

In our present example the variable is discrete, since
the age is expressed only in whole years.
But if we consider age as a continuous variable, it may
take values such as 14.5 years or 19.5 years, and then the
concept of class limits will not suffice. (In which class
would the observation 14.5 or 19.5 be included?) So in the
case of a continuous variable, a certain adjustment of the
class intervals is needed to obtain continuity.
The extreme values which could ever be included in a
class interval, once the intervals are adjusted for a
continuous variable, are known as class boundaries.
The lower extreme point is known as the lower class
boundary and the upper extreme point as the upper class
boundary of the particular class. In our example,
column (5) shows the lower class boundaries and column (6)
the upper class boundaries (as shown in Table-4).
Correction factor (d)
= (Lower limit of the second class - Upper limit of the first class) / 2

In our example d = (20 - 19) / 2 = 0.5.

Lower Class Boundary = Lower Class Limit - d
Upper Class Boundary = Upper Class Limit + d
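The correction-factor rule translates directly into code. The sketch below uses a hypothetical helper `class_boundaries` and assumes that the gap between neighbouring classes is the same throughout, so that one value of d serves every class.

```python
def class_boundaries(classes):
    """Convert inclusive class limits to class boundaries.

    d is half the gap between the upper limit of the first class and the
    lower limit of the second class (assumes at least two classes and the
    same gap between every pair of neighbouring classes).
    """
    d = (classes[1][0] - classes[0][1]) / 2
    return [(lower - d, upper + d) for lower, upper in classes]

limits = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 44), (45, 59)]
boundaries = class_boundaries(limits)

for (lo, hi), (lb, ub) in zip(limits, boundaries):
    print(f"{lo}-{hi}: {lb} to {ub}")
```

After the adjustment the upper boundary of each class equals the lower boundary of the next, which is the continuity the text asks for.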
Class       Class        Class boundaries
interval    frequency    Lower    Upper
(1)         (2)          (5)      (6)
15-19        37          14.5     19.5
20-24        81          19.5     24.5
25-29        43          24.5     29.5
30-34        24          29.5     34.5
35-44         9          34.5     44.5
45-59         6          44.5     59.5
Total       200
Table-4
e) Class Mark

The value exactly at the middle of a class interval is called
the class mark. The class mark is the representative value of the
class interval for the calculation of the mean, SD etc. In our
example column (7) of Table-5 shows the class marks.

Class Mark = (Lower Class Limit + Upper Class Limit) ÷ 2
or
Class Mark = (Lower Class Boundary + Upper Class Boundary) ÷ 2
Class interval   Class frequency   Class mark
(1)              (2)               (7)
15-19             37               (15 + 19) ÷ 2 = 17
20-24             81               (20 + 24) ÷ 2 = 22
25-29             43               (25 + 29) ÷ 2 = 27
30-34             24               (30 + 34) ÷ 2 = 32
35-44              9               (35 + 44) ÷ 2 = 39.5
45-59              6               (45 + 59) ÷ 2 = 52
Total            200
Table-5
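As a short illustration of how class marks stand in for their classes, the sketch below computes the class marks and then an approximate mean from the grouped table. The grouped-mean step is an added example of the "calculation of mean" mentioned above, using the usual convention that every observation in a class is treated as if it sat at the class mark. The names are illustrative.

```python
def class_mark(lower, upper):
    """Mid-point of a class, from its limits or equally from its boundaries."""
    return (lower + upper) / 2

limits = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 44), (45, 59)]
frequencies = [37, 81, 43, 24, 9, 6]      # class frequencies from the table

marks = [class_mark(lo, hi) for lo, hi in limits]

# Approximate mean: each observation is replaced by the mark of its class.
n = sum(frequencies)
approx_mean = sum(f * m for f, m in zip(frequencies, marks)) / n

print(marks)
print(round(approx_mean, 4))
```

The same marks result from the boundaries, e.g. `class_mark(14.5, 19.5)` also gives 17.0.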
f) Width of class or Size of the class

1) Width of class (in case of continuous variable)
= Upper Class Boundary - Lower Class Boundary
or, when adjacent classes share a limit (e.g. 15-20, 20-25),
= Upper Class Limit - Lower Class Limit

For example:

Class boundaries   Width      Class limits   Width
14.5 - 19.5        5          15 - 20        5
19.5 - 24.5        5          20 - 25        5
2) Width of class (in case of discrete variable)
= Lower Class Limit of the next class - Lower Class Limit
of the given class

For example:

Class limits   Width
15-19          20 - 15 = 5
20-24          25 - 20 = 5
25-29          30 - 25 = 5
30-34          35 - 30 = 5
35-44          45 - 35 = 10
45-59          60 - 45 = 15
(for the last class, 60 is the lower limit that a following
class would have)
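The two width rules can be compared side by side. This is a sketch under the assumption that the classes are the six age classes with inclusive integer limits; the variable names are illustrative.

```python
limits = [(15, 19), (20, 24), (25, 29), (30, 34), (35, 44), (45, 59)]

# Rule for continuous data: upper boundary - lower boundary.
d = (limits[1][0] - limits[0][1]) / 2          # correction factor, 0.5 here
widths_from_boundaries = [(upper + d) - (lower - d) for lower, upper in limits]

# Rule for discrete data: lower limit of the next class - lower limit of this class.
# The last class has no successor, so it is left out of this list.
widths_from_limits = [nxt[0] - cur[0] for cur, nxt in zip(limits, limits[1:])]

print(widths_from_boundaries)
print(widths_from_limits)
```

Both rules agree wherever both apply, and the boundary rule also covers the last class without needing an imaginary next class.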
How to choose the number of classes
and class limits
Step I: Remember the following points
• The number of classes should preferably
be between 5 and 15.
• The width of the classes should preferably
be 5 or a multiple of 5, e.g. 15-19, 20-24 etc.
• One of the class limits should preferably be
a multiple of 5.
Step II:
Apply Sturges' formula to determine the
approximate number of classes, i.e.

k = 1 + 3.322 log N
where k = the approximate number of classes,
N = the total number of observations,
and log is the logarithm to base 10.
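A minimal sketch of the formula in Python, assuming the base-10 logarithm; the function name `sturges_classes` is illustrative.

```python
import math

def sturges_classes(n):
    """Approximate number of classes k = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

for n in (30, 40, 200):
    k = sturges_classes(n)
    print(f"N = {n:>3}: k = {k:.2f}, rounded to {round(k)}")
```

Because k grows with the logarithm of N, even 200 observations call for only about 9 classes, which stays inside the 5 to 15 working range.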
Step III:
The third step is to find the maximum and
minimum of the observations and hence the
"range", i.e.

Range = Maximum value - Minimum value

If we would like classes of equal width, the
approximate width of the class may be obtained
by dividing the range by the approximate number
of classes, i.e.

Approximate width = Range / (1 + 3.322 log N)
For example:
Problem 1:
The profits (in lakh rupees) of 30 companies for the
year 2008 are given below.
20, 22, 35, 42, 37, 42, 48, 53, 49, 65, 39, 48, 67,
18, 16, 23, 37, 35, 49, 63, 65, 55, 45, 58, 57, 69, 25, 29, 58, 65.

Classify the above data taking a suitable class interval.

Solution:
Step I: Find the approximate number of classes.

k = 1 + 3.322 log N, where N = 30

k = 1 + 3.322 log 30
= 1 + 3.322 x 1.4771
= 1 + 4.91
= 5.91 or 6
Step II: Width of the classes

Width of the classes = (Maximum - Minimum) / (1 + 3.322 log N)
= (69 - 16) / 5.91
= 53 / 5.91
= 8.97 or 9

But since the width of the classes should preferably be 5 or a multiple
of 5 (i.e. 10, 15, ...), we take the width as 10 instead of 9.
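The width calculation can be sketched as below. Rounding up to the next multiple of 5 is an assumption made for this sketch: the text only says that a multiple of 5 is preferred, and rounding up happens to reproduce the width of 10 chosen here. Both function names are illustrative.

```python
import math

def approximate_width(minimum, maximum, n):
    """Range divided by the Sturges number of classes."""
    k = 1 + 3.322 * math.log10(n)
    return (maximum - minimum) / k

def round_up_to_multiple(x, base=5):
    """Smallest multiple of `base` that is >= x (assumed rounding rule)."""
    return base * math.ceil(x / base)

w = approximate_width(16, 69, 30)      # Problem 1: min 16, max 69, N = 30
chosen = round_up_to_multiple(w)

print(round(w, 2), chosen)
```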
Frequency distribution of the profits

Profit (Rs. lakh)    No. of companies
15 - 24               5
25 - 34               2
35 - 44               7
45 - 54               6
55 - 64               5
65 - 74               5
Total                30
The lower limit of the first class need not coincide
with the minimum observation, since the width of
the classes and one of the class limits should
preferably be multiples of 5.

For example: here the lowest value of the data is 16,
the class width is 10 and one of the class limits is to
be a multiple of 5, so the first class is taken as
15 - 24, not 16 - 25.
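The choice of class limits for Problem 1 can be sketched as follows. The helper `make_classes` is hypothetical: it assumes integer data, inclusive limits, and a first lower limit equal to the largest multiple of 5 that does not exceed the minimum.

```python
def make_classes(minimum, maximum, width, base=5):
    """Build inclusive integer classes of equal width covering [minimum, maximum].

    The first lower limit is the largest multiple of `base` not exceeding
    the minimum, so it need not coincide with the minimum itself.
    """
    lower = (minimum // base) * base
    classes = []
    while lower <= maximum:
        classes.append((lower, lower + width - 1))
        lower += width
    return classes

profits = [20, 22, 35, 42, 37, 42, 48, 53, 49, 65, 39, 48, 67,
           18, 16, 23, 37, 35, 49, 63, 65, 55, 45, 58, 57, 69, 25, 29, 58, 65]

classes = make_classes(min(profits), max(profits), width=10)
frequencies = [sum(lo <= p <= hi for p in profits) for lo, hi in classes]

for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo} - {hi}: {f}")
```

Starting from 15 rather than 16 keeps every lower limit a multiple of 5 and still covers the maximum of 69 with six classes.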
Problem 2:
The following set of numbers presents the mutual
fund prices at the end of the week for 40 selected
nationally sold funds.
11, 17, 15, 22, 11, 16, 19, 24, 29, 18
25, 26, 32, 14, 17, 20, 23, 27, 30, 12
15, 18, 24, 36, 18, 15, 21, 28, 33, 38
34, 13, 11, 16, 20, 22, 29, 29, 23, 31
Classify the above data taking a suitable class interval.
Solution:
k = 1 + 3.322 log 40
= 1 + 3.322 x 1.602
= 1 + 5.32
= 6.32 or 6

Width = (38 - 11) / 6.32 = 4.27, taken as 5
Class interval    Frequency
10 - 14            6
15 - 19           11
20 - 24            9
25 - 29            7
30 - 34            5
35 - 39            2
Total             40
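For classes of equal width, the class of each observation can be found by integer division instead of comparing against every pair of limits. The sketch below applies this to the Problem 2 data, taking the first lower limit as 10 and the width as 5 from the solution above; the variable names are illustrative.

```python
from collections import Counter

prices = [11, 17, 15, 22, 11, 16, 19, 24, 29, 18,
          25, 26, 32, 14, 17, 20, 23, 27, 30, 12,
          15, 18, 24, 36, 18, 15, 21, 28, 33, 38,
          34, 13, 11, 16, 20, 22, 29, 29, 23, 31]

first_lower, width = 10, 5

# (p - first_lower) // width is 0 for 10-14, 1 for 15-19, and so on.
index_counts = Counter((p - first_lower) // width for p in prices)

for i in sorted(index_counts):
    lower = first_lower + i * width
    print(f"{lower} - {lower + width - 1}: {index_counts[i]}")
print("Total:", sum(index_counts.values()))
```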
