Академический Документы
Профессиональный Документы
Культура Документы
data is 4.4. Let us consider six classes of equal width and the following table is
called grouped frequency distribution.
In the above table, the left side value of each class, i.e., 2.7, 3.0, is called lower
class limit and the right side of each class, i.e., 2.9, 3.2, "a" 4.4 is called upper
class limit. The width of each class is 0.2.
In this example the classes are not continuous. To make it continuous we add
0.05 to the upper limits and subtract 0.05 from the lower limits where 0.05 +
0.05 = 0.1 is the difference between the previous upper limit and the next lower
limit between any two consecutive classes. In this case, the class limits are
called class boundaries. The middle value of any class is called class mark. So we
obtain the following table:
Class boundaries Class mark Frequency
GRAPHICAL REPRESENTATIONS
There are mainly four graphical representation of frequency distribution, viz, (a)
Histogram, (b) Frequency polygon, (c) Bar chart, (d) O give. (a) Histogram. In this
graph the sides of the column represent the upper and lower class boundaries
and their heights are proportional to the respective frequencies. Consider the
following grouped frequency distribution.
Weight (lbs.), No. of persons, FREQUENCY POLYGON
The histogram is drawn as follows (Fig. 1.1)
Note: If the width of the class intervals are not same then calculate Relative
frequency density' (rdf) for all classes as follows:
Rdf = Frequency /Total frequency x class width. Then take rdf on y-axis and class
intervals on x-axis to draw histogram.
(b) Frequency polygon. Consider the mid-points with a height proportional to
class frequency in the histogram. If these points are joined by straight lines then
the resultant graph is-,-called frequency polygon.
(c) Bar chart. A bar chart is a graphical representation of the frequency
distribution in which the bars are centered at the mid-points of the cells. The
heights of the bars are proportional to the respective class frequencies. If a
single attribute is presented then it is called simple bar chart (Fig. 1.2). When
more than one attribute is presented then it is called multiple bar charts.
WEIGHT Fig. 1.2
(d) Ogive. There are two types of cumulative frequency distribution less than
type and more-than type which are illustrated in the following table. Daily
Wagess, No. of Workers (Frequency) , Cumulative frequency (Less than, more
than)
Such a cumulative frequency distributions may be represented graphically and
the graph known as ogive because 4 its similarity the ogee curve of the architect
and the dam designer. The intersection point of the two curves gives the median
of the distribution.
For grouped frequency distribution, the less than ogive must be plotted against
the upper class boundary and not against the class mark, whereas for 'more
than' ogive the cumulative frequency must be plotted against lower class
boundary.
In this book if the type of the cumulative distribution is not mentioned it is tobe
understood that it is less than type.
SUMMARY
Statistics is a branch of scientific method comprising of collection, presentation,
analysis and interpretation of data which are obtained by measuring some
characteristics. The frequency distribution is a tabulation of data which-are
obtained from measurement, or observation or experiment, arranged in
ascending or descending order.
PROBLEMS
1. From the following prepare a frequency distribution table having class
intervals of 5: Also draw the histogram.
2. A machine shop produces steel pins. The width of 30 pins (in mm )was
checked after machining and data was recorded as follows: Construct the
grouped frequency distribution by taking six classes.
3. For a machine making resistors the successive 40 items were checked and the
following resistances in ohms were noted: Prepare the simple frequency
distribution table. Draw the ogives.
4. Exhibit the absolute and cumulative frequency in respect of the formations
given below:
5. Exhibit the absolute and cumulative frequency in respect of the formations
given below:
6. Consider the following distribution: Draw the (a) Histogram, (b) frequency
polygon (c) ogives.
7. The hourly wage of employees (rs 00) in an organization is given below: Draw
the histogram.
Unit 2: Measures of location.
Definition of Central Tendency /location:
For quantitative data it is observed that there is a tendency of the data to be
distributed about a central value which is a typical value and is called a measure
of central tendency. It is also called a measure of location because it gives the
position of the distribution on the axis of the variable.
There are three commonly used measures of central tendency, viz. mean,
median and Mode. The mean again may be of three types, viz. Arithmetic an
(A.M.), Geometric Mean (G.M.), and harmonic Mean (H.M.). Below we all discuss
these different measures.
Arithmetic mean
The arithmetic mean is simply called Average'. For the observations x1, x2, ......,
xn the A.M. is defined as : simple frequency distribution:
For grouped frequency -distribution, xi is taken as class mark.
Summary
PROBLEMS
Solution. Consider the following: bx. From the combined mean we obtain,
Solution: The calculations are shown in the following table. Mean deviation about
mean . Variance
Summary
Problems
the variables are plotted in the xy-plane which is known as scatter diagram. This
gives an idea about the correlation of the two variables.
COEFFICIENT OF CORRELATION
2A. Karl Pearson has given a coefficient to measure the degree of linear
relationship between two variables which is known as coefficient of correlation
(or, correlation coefficient). For a bivariate distribution (x, y) the coefficient of
correlation denoted by and is defined as : where cov (x, y) = Covariance between
x and y. For the values (xi vi t ) i = 1, 2n of a bivariate distribution,
2B. Limitations of r
1. The coefficient of correlation can be used as a measure of linear relationship
between two variables. In case of non-linear or any other relationship the
coefficient of correlation does not provide any measure at all. So the inspection
of scatter diagram is essential.
2. Correlation must be used to the data drawn from the same source. If distinct
sources are used then the two variables may show correlation but in each source
they may be uncorrelated.
3. For two variables with a positive or negative correlation it does not necessarily
mean that there exists causal relationship. There may be the effect of some
other variables in both of them. On elimination of this effect it may be found that
the net correlation is nil.
2C. Properties 1. The coefficient Of correlation is independent of the origin and
scale of reference,