1st Year Full

Let Me Expose Statistics and
Probability
For Intermediate
Part-I
First Edition
By
Zafar Ali
M.Sc. Statistics
PREFACE
With the grace of Almighty Allah, I feel pleasure in bringing out the First Edition of my book for Intermediate (Part-I), "Let Me Expose Statistics and Probability". This book is according to the new syllabus framed for all educational institutions in Khyber Pakhtunkhwa.
It is hoped that the book in its present form will be useful for the students. Any suggestions for improvement of the book will be welcome.
First Edition: October 2013
Zafar Ali
0333-0314-9004086, 0345-9282215
Course Contents
Chapter # 01 The Basic Concepts of Statistics
Chapter # 02 Collection and Organization of Data
Chapter # 03 Measures of Central Tendency
Chapter # 04 Measures of Dispersion, Moments, Skewness and Kurtosis
Chapter # 05 Index Numbers
Chapter # 06 Set Theory and Basic Probability
Chapter # 07 Random Variables
Chapter # 08 Some Special Probability Distributions
CHAPTER 01
The Basic
Concepts of Statistics
Chapter Contents
You should read this chapter if you need to learn about:
• Information, Observation and Data
• Constant
• Variable and its Types
• Individual, Population and Sample
• Parameter and Statistic
• Sampling
• Sampling with replacement and without replacement
• Frequency and Frequency Distribution
• Origin of Statistics
• Meaning of the word Statistics
• Definition of Statistics
• Descriptive and Inferential Statistics
• Functions of Statistics
• Scope and Importance of Statistics in Different Fields
• Students study Statistics for several reasons
• Summation Notations
• Exercise
Suppose your teacher asks some questions in the classroom:
• What is your name?
• What is your height?
• What is your age?
• What is your favorite color?
• What is your class number/roll number?
The replies of the students to these questions are called information. The recording, listing or observing of a single piece of information by the teacher is called an observation, and the collected observations are collectively called data.
Information
“To know about something is known as information”.
Observation
“Any recording of information (numeric or non-numeric) is called observation”.
Data
“Originally collected observations are collectively called data”.
• Data of selected students’ names: Ajmal, Arif and Ali, etc.
• Data of selected students’ class numbers: 39, 56 and 47, etc.
• Data of selected students’ heights: 60”, 65” and 66”, etc.
• Data of selected students’ ages: 19, 20 and 21, etc.
• Data of selected students’ favorite colors: Red, Blue and Green, etc.

Names   Class No.   Heights   Age   Color
Ajmal   39          60        19    Red
Arif    56          65        20    Blue
Ali     47          66        21    Green
Constant
“A fixed quantity is called a constant”.
• π ≈ 3.14
• e ≈ 2.718 (called Euler’s number), etc.
A constant is usually denoted by the first letters of the alphabet, e.g. a, b or c.
Variable
“A characteristic that can vary from one person or object to another is called a variable”.
• Height and weight of a person
• Eye color of people
• Number of children in a family, etc.
Variables are usually denoted by the last letters of the alphabet, e.g. X, Y or Z.
Types of Variable
A variable is either a qualitative variable or a quantitative variable. A quantitative variable is in turn either a discrete variable or a continuous variable.
Qualitative Variable
“A variable is qualitative if it can be expressed non-numerically”.
• Color
• Religion
• Gender (Female and Male)
• Education level
• Grades of students in a class, etc.
Qualitative variables are also called attributes.
Quantitative Variable
“A variable is quantitative if it can be expressed numerically”.
• Age
• Weight
• Height
• Number of children in a family
• Number of deaths in an accident
• Speed
• Income, etc.
A quantitative variable is either a discrete variable or a continuous variable.
Discrete Variable
“A quantitative variable is called a discrete variable if it has counting phenomena and there can be a certain jump or gap between two possible values of the variable. Further, it is free from the unit of measurement”.
• Family sizes
• No. of pages in a book
• No. of apples in a basket
• No. of deaths in an accident
• No. of housing units in different blocks of a colony
• No. of passengers carried by PIA in the last ten years
Continuous Variable
“A quantitative variable is called a continuous variable if it has measuring phenomena and there can be an infinite number of values between two possible values of the variable. Further, it has a unit of measurement”.
• Students’ heights, ages, weights
• Speed of a car
• Temperature of a place
• Income of a family
• The amount of milk given by a cow
• The lifetime of a TV tube
• Fortnightly petrol prices
If you count to get the value of a variable, it is discrete. If you measure to get the value of the variable, it is continuous. When deciding whether a variable is discrete or continuous, ask yourself if it is counted or measured! Further, a discrete variable takes on values that are usually integers or whole numbers, while a continuous variable takes on values that are real numbers.
The values of variables such as age, height, and weight are usually rounded to the nearest year, inch, or pound. However, these values represent measured data, so they are continuous variables.
Individual
“An element from which information may be collected is called an individual”.
Population
“An aggregate of individuals is called a population”.
A population can be either finite or infinite, depending upon whether it contains a countable or an uncountable number of units.
• All students in a college (Finite)
• The population of all licensed motor drivers (Finite)
• The population of all houses in a country (Finite)
• The population of all points on a line (Infinite)
• The population of stars in the sky (Infinite), etc.
The term population does not mean only the human population; it refers to a collection of measurements on individuals or objects having some common characteristics. The objects may be concrete (physical) things, like the motor cars of a particular type produced by a company or the wheat produced on a large farm, or they may be abstract (theoretical) things, like the opinions of students about an examination system.
Sample
“A representative part which we select from a population is called a sample”.
• The runs scored by a batsman in tests in the last one year are a sample of his whole career scores.
• A few drops of blood are a sample of the blood contained in the whole body of a person.
Parameter
“Any numerical value (mean, variance or standard deviation, etc.) describing a characteristic of a population is called a parameter”. OR
“The numerical value such as mean, variance or standard deviation etc. computed from population data is called a parameter”.
Greek letters are used for parameters: μ (mu), σ², σ.
Statistic
“Any numerical value (mean, variance or standard deviation, etc.) describing a characteristic of a sample is called a statistic”. OR
“The numerical value such as mean, variance or standard deviation etc. computed from sample data is called a statistic”.
Roman letters are used for statistics: X̄, S², S.
The words Population and Parameter both start with the letter “P”, and the words Sample and Statistic both start with the letter “S”.
A parameter is a fixed value, while a statistic is a variable because it varies from sample to sample. It is also to be noted that a parameter is usually denoted by a Greek letter and a statistic is usually denoted by a Roman letter. For example, the population mean is denoted by μ while the sample mean is denoted by X̄. Similarly, the standard deviation of a population is denoted by σ while the sample standard deviation is denoted by S.
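The difference between a parameter and a statistic can be seen in a small computation. The sketch below is only an illustration: the ten heights, the sample size and the helper name sample_means are assumptions made for the example, not data from this book.

```python
import random
import statistics

# Hypothetical population: heights (in inches) of ten students.
population = [60, 62, 63, 64, 65, 65, 66, 67, 68, 70]

# Parameter: computed from every unit of the population, so it is a fixed value.
mu = statistics.mean(population)

def sample_means(units, n, repeats, seed=1):
    """Draw `repeats` samples of size n (without replacement) and return their means."""
    rng = random.Random(seed)  # fixed seed only so that the run can be repeated
    return [statistics.mean(rng.sample(units, n)) for _ in range(repeats)]

# Statistic: computed from a sample, so it varies from sample to sample.
x_bars = sample_means(population, 4, 3)

print("parameter (population mean):", mu)
print("statistics (sample means):", x_bars)
```

The population mean stays the same however many times the code is run, while the sample means generally differ from one sample to the next.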
Sampling
“The process of selecting a sample from a population such that the sample selected has the characteristics of the whole population is called sampling”.
• A teacher judges the performance of his students just by asking a few questions.
• Someone decides the taste of the food by tasting a little bit of it.
• In medical science, a few drops of blood are taken and tested to know whether the blood contains some abnormality or not.
Sampling with Replacement
“If the sampling unit selected is returned to the population before drawing the next sampling unit, then sampling is said to be with replacement”.
In sampling with replacement:
• The sampling unit can be selected more than once.
• The population will be considered infinite.
• The number of samples of size “n” that could be drawn with replacement from a population of size “N” will be equal to N^n.
• The sampling units will be independent.
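The count N^n can be checked by listing the samples. The following is a minimal sketch, assuming a hypothetical population of three units labelled A, B and C and samples of size 2; the function name samples_with_replacement is also just for illustration.

```python
from itertools import product

def samples_with_replacement(population, n):
    """List every ordered sample of size n when each unit is returned before the next draw."""
    return list(product(population, repeat=n))

population = ["A", "B", "C"]  # hypothetical population, N = 3
samples = samples_with_replacement(population, 2)

for s in samples:
    print(s)
print("number of samples:", len(samples))  # N**n = 3**2 = 9
```

Note that samples such as ("A", "A") appear in the list, because the same unit can be selected more than once.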
Sampling without Replacement
“If the sampling unit selected is not returned to the population before drawing the next sampling unit, then sampling is said to be without replacement”.
In sampling without replacement:
• The sampling unit can be selected only once.
• The population will be considered finite.
• The number of samples of size “n” that could be drawn without replacement from a population of size “N” will be equal to NCn, i.e. N! / (n! (N − n)!).
• The sampling units will be dependent.
A finite population from which sampling is done with replacement can theoretically be considered infinite because any number of samples can be drawn without exhausting (finishing) the population.
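The count NCn can be verified in the same way. This is a sketch under the assumption of a hypothetical population of four units labelled A to D and samples of size 2.

```python
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D"]  # hypothetical population, N = 4
n = 2

# Every unordered sample of size n in which no unit is repeated.
samples = list(combinations(population, n))

for s in samples:
    print(s)
print("number of samples:", len(samples))       # 6
print("NCn from the formula:", comb(len(population), n))  # 6
```

Here no sample contains the same unit twice, which is why the count is smaller than the N^n = 16 ordered samples possible with replacement.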
Frequency
“The number of occurrences of a particular observation in a data set is called frequency”.
OR
“The number of observations falling in a particular group (class) is called frequency”.
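Counting frequencies is simple to do by machine. The sketch below uses Python's collections.Counter on a hypothetical list of eight observed blood groups; the data are assumed for illustration only.

```python
from collections import Counter

# Hypothetical observations: blood groups of eight students.
blood_groups = ["A", "B", "O", "B", "A", "B", "O", "B"]

# Frequency = number of occurrences of each observation.
frequency = Counter(blood_groups)

for group in sorted(frequency):
    print(group, frequency[group])
```

The frequencies always add up to the total number of observations, which is a quick check on any frequency count.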
Frequency Distribution
“The organization of raw data in table form, along with frequencies, is called a frequency distribution”.
• The data in the form of a frequency distribution are called grouped data.
• The purpose of a frequency distribution is to produce a meaningful pattern for the overall distribution of the data from which conclusions can be drawn.
The types of frequency distributions that will be considered here are:
• Categorical Frequency Distribution
• Ungrouped Frequency Distribution
• Grouped Frequency Distribution

• A categorical frequency distribution represents data that can be placed in different categories such as gender, hair color, blood group etc. along with their frequencies. The categorical frequency distribution is also called a frequency table.

Categorical Frequency Distribution (Categorical Frequency Table)
Blood Group   No. of students (f)
A             5
B             8
O             4

• An ungrouped frequency distribution simply lists the data values with the corresponding frequencies. The ungrouped frequency distribution is also called discrete grouped data.

Ungrouped Frequency Distribution (Discrete Grouped Data)
Marks   No. of students (f)
50      4
60      6
45      3

• A grouped frequency distribution is obtained by constructing classes (or intervals) for the data values along with corresponding frequencies. The grouped frequency distribution is also called continuous grouped data.

Grouped Frequency Distribution (Continuous Grouped Data)
Height   No. of students (f)
3–4      7
5–6      30
7–8      6
Mid Points
Class limits   f    Mid Points
60 – 61        2    60.5
62 – 63        8    62.5
64 – 65        11   64.5
66 – 67        6    66.5
68 – 69        5    68.5
70 – 71        5    70.5
72 – 73        1    72.5
Total          38   --
The mid point of a class is the average of its lower and upper class limits. For the first class:
Mid point = (60 + 61) / 2 = 60.5
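The mid points in the table can be reproduced with a few lines of code. This is a minimal sketch; the function name mid_points is an assumption for the example, while the class limits are copied from the table.

```python
# Class limits copied from the table above.
class_limits = [(60, 61), (62, 63), (64, 65), (66, 67), (68, 69), (70, 71), (72, 73)]

def mid_points(limits):
    """Mid point of each class = (lower limit + upper limit) / 2."""
    return [(lower + upper) / 2 for lower, upper in limits]

print(mid_points(class_limits))
# [60.5, 62.5, 64.5, 66.5, 68.5, 70.5, 72.5]
```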
Cumulative Frequency
Class limits   f    C.F
60 – 61        2    2
62 – 63        8    8 + 2 = 10
64 – 65        11   11 + 10 = 21
66 – 67        6    6 + 21 = 27
68 – 69        5    5 + 27 = 32
70 – 71        5    5 + 32 = 37
72 – 73        1    1 + 37 = 38
Total          38   --
• The C.F of the first class is taken equal to the frequency of that class.
• For the other classes, the C.F is obtained by adding the cumulative frequency of the preceding class to the frequency of the class.
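The running addition that produces the C.F column is a cumulative sum. One way to sketch it is with itertools.accumulate from the Python standard library, using the frequencies of the table.

```python
from itertools import accumulate

# Frequencies of the seven classes 60-61, 62-63, ..., 72-73.
frequencies = [2, 8, 11, 6, 5, 5, 1]

# Each cumulative frequency = previous cumulative frequency + current frequency.
cumulative = list(accumulate(frequencies))

print(cumulative)  # [2, 10, 21, 27, 32, 37, 38]
```

The last cumulative frequency equals the total of the f column, 38.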
Class Boundaries
Class limits   f    Class boundaries
60 – 61        2    59.5 – 61.5
62 – 63        8    61.5 – 63.5
64 – 65        11   63.5 – 65.5
66 – 67        6    65.5 – 67.5
68 – 69        5    67.5 – 69.5
70 – 71        5    69.5 – 71.5
72 – 73        1    71.5 – 73.5
Total          38   --
Subtract the upper limit of the first class from the lower limit of the second class and divide the difference by 2. Then subtract the resulting value from each lower limit and add it to each upper limit. Here (62 − 61) / 2 = 0.5, so the first class 60 – 61 has the boundaries 59.5 – 61.5.
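The rule for class boundaries can be written as a short function. This is a sketch that assumes the gap between consecutive classes is the same throughout the table; the function name class_boundaries is chosen for the example.

```python
# Class limits copied from the table above.
class_limits = [(60, 61), (62, 63), (64, 65), (66, 67), (68, 69), (70, 71), (72, 73)]

def class_boundaries(limits):
    """Turn class limits into class boundaries.

    Half of the gap between the upper limit of the first class and the
    lower limit of the second class is subtracted from every lower limit
    and added to every upper limit.
    """
    half_gap = (limits[1][0] - limits[0][1]) / 2
    return [(lower - half_gap, upper + half_gap) for lower, upper in limits]

for lower, upper in class_boundaries(class_limits):
    print(lower, "-", upper)
```

With boundaries, the upper boundary of each class coincides with the lower boundary of the next, so no gap is left between classes.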
Origin of Statistics
The word statistics has been derived from the Latin word “status”, the Italian word “statista” or the German word “statistik”, each of which means an organized political state. It was born as the Science of Kings. It had its origin in the needs of the administrators in ancient days for collecting and maintaining quantitative information about their population, wealth and armaments (weaponry used by the military). With the passage of time this word changed its shape and is now used as “statistics”.
The word “statistik” was first used by Gottfried Achenwall (1719-1772). Dr. Zimmerman (1787) introduced the word Statistics into England. Its use was popularized by Sir John Sinclair (1754-1835) in the 1798 publication of his book on a statistical account of Scotland.
For the last few centuries, considerable interest has been developed in the collection and analysis of statistical data. Adolf Quetelet (1796-1874) applied statistical methods in the fields of education and sociology.
Outstanding contributions were also made by Pascal (1623-1662), Bernoulli (1654-1705), Gauss (1777-1855), Chebyshev (1821-1894), Francis Galton (1822-1911), Karl Pearson (1857-1936), William Sealy Gosset (1876-1937), R.A. Fisher (1890-1962), Jerzy Neyman (1894-1981), Wald (1902-1950), John Tukey (1915-2000) and many others.
Meaning of the word Statistics
The word statistics is generally used in three different meanings:
• Firstly, the word statistics refers to “numerical facts systematically arranged with a definite purpose in view”. In this sense, the word statistics is always used in the plural, e.g. statistics of prices, statistics of road accidents, statistics of crimes, statistics of births, statistics of educational institutions etc.
• Secondly, the word statistics is defined as “the procedures and techniques used to collect, process and analyze numerical data to make inferences and to reach decisions in the face of uncertainty”. In this sense, the word statistics is used in the singular.
• Thirdly, the word statistics refers to “numerical quantities calculated from sample observations”. The word statistics is plural when used in this sense. The mean, median, mode, etc. calculated from sample observations are examples in this sense.
Definition of Statistics
• “Statistics is the study of the principles and methods applied in the collection, summarization and description of numerical data. Further it deals with the procedures of making inferences about the characteristics of a population on the basis of a sample taken from the same population”.
• “The science which enables us to draw conclusions about various phenomena of real-life data (collected on a sample basis) is called statistics”.
Descriptive Statistics
“Descriptive statistics deals with the concepts and methods concerned with the collection, summarization and description of numerical data”.
By summarization we mean the classification of data, tabulation and their graphical displays; while the description is the computation of a few numerical quantities, i.e. measures of central tendency, measures of dispersion, moments, skewness and kurtosis etc.
In descriptive statistics we deal with:
• Collection
• Classification
• Tabulation
• Graphical displays
• Numerical quantities
Inferential Statistics
“Inferential statistics deals with the procedures of making inferences (conclusions) about the characteristics of a population on the basis of a sample taken from the same population”.
This category consists of estimation of population parameters and testing of hypotheses.
In inferential statistics we deal with:
• Estimation
• Hypothesis testing
Functions of Statistics
• The complex mass of data is made simple and understandable with the help of statistical methods.
• Statistical methods are used to study the relationship between two or more phenomena.
• Statistics helps in formulating policies in different fields.
• Statistical methods are highly useful tools for forecasting.
• Statistics helps in decision making in the face of uncertainty.
• One important function of Statistics is to provide techniques for making comparisons.
Scope and Importance of Statistics in Different Fields
In ancient times the scope of statistics was limited. A census of population and wealth was conducted in those days to determine the strength of manpower and material wealth for the purpose of wars. That is why it was called the science of kings.
With the passage of time the scope of statistics became wider and wider. With the development of the theory of probability, insurance companies benefited. Thus statistical methods began to be used in other sciences.
• Statistics plays an important role in business.
• The whole structure of insurance is based on statistics.
• The banks make use of statistics while framing their policies.
• Statistical data are now widely used in taking all administrative decisions.
• Statistics has a vast use in Economics, Management, Industry, Transport, Communication, Physics, Chemistry, Zoology, Agriculture, Health, Atomic Energy, Petroleum, Medicine, Astronomy and many more.
Nowadays the science of statistics has shown its worth so much that there is hardly any field in which its need is not felt.
Students study statistics for several reasons
• Like professional people, you must be able to read and understand the various statistical studies performed in your fields. To have this understanding, you must be knowledgeable about the vocabulary, symbols, concepts, and statistical procedures used in these studies.
• You may be called on to conduct research in your field, since statistical procedures are basic to research. To accomplish this, you must be able to design experiments; collect, organize, analyze, and summarize data; and possibly make reliable predictions or forecasts for future use. You must also be able to communicate the results of the study in your own words.
• You can also use the knowledge gained from studying statistics to become a better consumer and citizen. For example, you can make intelligent decisions about what products to purchase based on consumer studies, about government spending based on utilization studies, and so on.
Summation Notation
Suppose the heights of some students are 54”, 58”, 64”, …, 57”.
We can denote the height of the:
• First student by $X_1$
• Second student by $X_2$
• Last or nth student by $X_n$.
We can use the symbol $X_i$ to denote any of the heights, where i = 1, 2, …, n.
Now the sum of the values $X_1, X_2, \dots, X_n$, i.e. $X_1 + X_2 + \dots + X_n$, is denoted by $\sum_{i=1}^{n} X_i$, where the symbol $\Sigma$ (capital sigma) is a Greek letter and denotes sum.
Consider the following examples:
• $X_1 + X_2 + X_3 + X_4 = \sum_{i=1}^{4} X_i$
• $X_1Y_1 + X_2Y_2 + X_3Y_3 = \sum_{i=1}^{3} X_i Y_i$
• $X_1^2 + X_2^2 + X_3^2 + X_4^2 = \sum_{i=1}^{4} X_i^2$
• $(X_1 + X_2 + X_3 + X_4)^2 = \left(\sum_{i=1}^{4} X_i\right)^2$
• $\sum_{i=1}^{n} a = a + a + \dots + a = na$
• $aX_1 + aX_2 + aX_3 + aX_4 = a\sum_{i=1}^{4} X_i$
• $(X_1 - a) + (X_2 - a) + \dots + (X_n - a) = \sum_{i=1}^{n} (X_i - a)$
• $(X_1 - a)^2 + (X_2 - a)^2 + \dots + (X_n - a)^2 = \sum_{i=1}^{n} (X_i - a)^2$
• $\left[(X_1 - a) + (X_2 - a) + \dots + (X_n - a)\right]^2 = \left[\sum_{i=1}^{n} (X_i - a)\right]^2$
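The summation rules can be checked numerically. The sketch below uses the four heights 54, 58, 64 and 57 as X; the second list Y and the constant a = 2 are hypothetical values assumed only for the illustration.

```python
X = [54, 58, 64, 57]  # heights X1..X4
Y = [1, 2, 3, 4]      # hypothetical second variable Y1..Y4
a = 2                 # hypothetical constant

sum_x = sum(X)                               # sum of Xi
sum_xy = sum(x * y for x, y in zip(X, Y))    # sum of Xi*Yi
sum_x_sq = sum(x ** 2 for x in X)            # sum of Xi squared
sq_sum_x = sum(X) ** 2                       # square of the sum of Xi
sum_a = sum(a for _ in X)                    # sum of a constant = n*a
sum_ax = sum(a * x for x in X)               # sum of a*Xi = a * sum of Xi
sum_dev = sum(x - a for x in X)              # sum of (Xi - a)

print(sum_x, sum_xy, sum_x_sq, sq_sum_x)     # 233 590 13625 54289
print(sum_a == len(X) * a)                   # True
print(sum_ax == a * sum_x)                   # True
print(sum_dev == sum_x - len(X) * a)         # True
```

Notice that the sum of squares (13625) is not the same as the square of the sum (54289); the two notations must not be confused.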
Sharpen your Pencil
MCQs
(1) To know about something is known as ______
(A) Information (B) Data (C) Observation (D) None of these
(2) A constant can assume ______ value.
(A) 0 (B) single (C) two (D) None of these
(3) Life of a T.V tube is a _____ variable.
(A) Discrete (B) Continuous (C) Qualitative (D) None of these
(4) Color of hair is a _____ variable.
(A) Qualitative (B) Quantitative (C) Continuous (D) None of these
(5) The number of flowers on a plant is a ______ variable.
(6) The amount of milk given by a cow is a _____ variable.
(7) The number of accidents is a _____ variable.
(8) A discrete variable is _____
(A) Not unit free (B) unit free (C) both A & B (D) None of these
(9) The number of branches on different trees is a _____ variable.
(10) A small part taken from a population is called a _____
(A) Population (B) Sample (C) Observation (D) None of these
(11) The word statistics was first used by _____
(A) Newton (B) Einstein (C) Achenwall (D) None of these
(12) The statistik is a _____ word.
(A) Italian (B) Latin (C) German (D) None of these
(13) The quantity computed from a population is called _____
(A) parameter (B) statistic (C) observation (D) None of these
(14) The quantity computed from a sample is called _____
(A) parameter (B) statistic (C) observation (D) None of these
(15) The word Statistics has _____ different meanings.
(A) one (B) two (C) three (D) None of these
(16) Atmospheric pressure at a certain place is a _____ variable.
Exercise
Short Questions
Q.1.01. What is the difference between a parameter and a statistic?
Q.1.02. What is a variable? Differentiate between discrete and continuous variables. Give examples.
Q.1.03. What is the difference between a variable and a constant?
Q.1.04. Differentiate between descriptive and inferential statistics.
Q.1.05. Define population and sample. Give examples.
Q.1.06. Differentiate between quantitative and qualitative variables.
Q.1.07. Define sampling. Give examples.
Q.1.08. Differentiate between sampling with and without replacement.
Q.1.09. Why is a course in Statistics important to you as a student?
Hi!!!
• I am a scientific calculator.
• I will help you in each and every problem of statistics.
• But you need to learn most of my functions.
• So, let’s make friendship!!!
CHAPTER 02
Collection and
Organization of Data
Chapter Contents
You should read this chapter if you need to learn about:
• Data
• Types of Data by Source
• Types of Data by Nature
• Organization of Data
• Classification of Data
• Tabulation
• Frequency and Frequency Distribution
• Construction of Frequency Distribution for Qualitative Data
• Construction of Ungrouped Frequency Distribution for Quantitative Data
• Some important points in Grouped Frequency Distribution
• Construction of Grouped Frequency Distribution for Quantitative Data
• Relative Frequency Distribution
• Percentage Relative Frequency Distribution
• Cumulative Frequency
• Cumulative Frequency Distribution and its Types
• Relative Cumulative Frequency Distribution
• Percentage Relative Cumulative Frequency Distribution
• Diagrams: Bar Chart, Pie Chart etc.
• Graphs: Histogram, Frequency Polygon etc.
• False Base line or the Broken line
Data
“Originally collected observations are collectively called data”.
• Data of selected students’ names: Ajmal, Arif and Ali, etc.
• Data of selected students’ class numbers: 39, 56 and 47, etc.
• Data of selected students’ heights: 60”, 65” and 66”, etc.
• Data of selected students’ ages: 19, 20 and 21, etc.
• Data of selected students’ favorite colors: Red, Blue and Green, etc.

Names   Class No.   Heights   Age   Color
Ajmal   39          60        19    Red
Arif    56          65        20    Blue
Ali     47          66        21    Green
Types of Data by Source
• Primary Data
• Secondary Data
Primary Data
“The data that have been originally collected and have not undergone any sort of statistical treatment are called primary data. In other words, fresh data are called primary data”.
Thus primary data are the first-hand information collected for a certain purpose.
• For example, the data in the Population Census Reports are primary because these are originally collected by the Population Census Organization.
Methods of Collection of Primary Data
The following methods are used for the collection of primary data:
• Direct Personal Observation: In this method, the investigator collects the information directly from the source concerned. The investigator must be qualified and experienced in the related field of study and should put simple questions in simple language, which can be answered easily.
• Indirect Personal Investigation: Sometimes the informants would either not disclose the facts at all or would give wrong information. For example, businessmen do not disclose their true incomes to the income tax authorities. In such a situation, information is collected from a third party.
• Registration: In this method, the information is reported to the appropriate authority. For example, births and deaths are registered with the Municipal Committee or Corporation in urban areas and the Union Council in rural areas.
• Estimates through Local Correspondents: There is no formal collection of data in this method. Local agents or correspondents send the required information using their own judgment. It is a time-saving method and does not cost much. It is, however, a subjective method and gives only estimates.
• Investigation through Enumerators: In this method, information is collected through trained enumerators. The investigators get the forms of inquiry (called schedules) filled in by the informants. They help the informants in filling in the schedules correctly. This method is considered to be very accurate and time-saving. It is used in large-scale government inquiries like the Population Census.
• Mailed Questionnaire Method: In this method, questionnaires along with a letter of request are sent by mail to the informants. The informants fill in the questionnaires and return them to the investigator. This method is very cheap and time-saving. But sometimes the questionnaires are returned incomplete or full of errors.
Secondary Data
“The data that have undergone any sort of treatment by statistical methods at least once, i.e. the data that have been collected, classified, tabulated or presented in some form for a certain purpose, are called secondary data”.
• For example, the data in the Economic Survey of Pakistan are secondary because the Federal Bureau of Statistics, the State Bank of Pakistan, the Central Board of Revenue, etc. originally collect these.
Methods of Collection of Secondary Data
Secondary data can be obtained from the following sources:
• International Publications: e.g. the publications of the World Bank, I.M.F, UNESCO, UNO, ILO, etc.
• Official (Government) Sources: e.g. publications of the Federal Bureau of Statistics, Ministries of Agriculture, Finance, Communications and Railways, Provincial Bureaus of Statistics and Provincial Departments of Agriculture, Health and Education.
• Semi-official (Semi-Government) Sources: e.g. publications of the State Bank of Pakistan, Central Cotton Committee, Economic Research Institutes, District Councils, Municipal Committees, WAPDA, P.I.D.C, etc.
• Private (Non-Government) Sources: e.g. publications of Trade Associations, Chambers of Commerce and Industry, Market Committees, etc.
• Publications of Research Organizations: e.g. Punjab University Institute of Education and Research, Irrigation Research Institute, Punjab University Social Sciences Research Center, Pakistan Institute of Development Economics, etc.
Types of Data by Nature
• Qualitative Data
• Quantitative Data
Qualitative Data
“Data collected by a qualitative variable are called qualitative data”.
• Color
• Religion
• Gender (Female and Male)
• Education level
• Grades of students in a class, etc.
Quantitative Data
“Data collected by a quantitative variable are called quantitative data”.
• Age
• Weight
• Height
• Speed
• Income
• Number of children in a family
• Number of deaths in an accident, etc.
Quantitative data are either discrete data or continuous data.
When data were first analyzed statistically by Karl Pearson and Francis Galton, almost all were continuous data. In 1899, Pearson began to analyze discrete data. Pearson found that some data, such as eye color, could not be measured, so he termed such data qualitative data.
Discrete Data
“Data collected by a discrete variable are called discrete data”.
• Family sizes
• No. of pages in a book
• No. of apples in a basket
• No. of deaths in an accident
• No. of shares sold every day in the stock market
• No. of housing units in different blocks of a colony
• No. of passengers carried by PIA in the last ten years
Continuous Data
“Data collected by a continuous variable are called continuous data”.
• Speed of a car
• Temperature of a place
• Income of a family
• The amount of milk given by a cow
• The lifetime of a TV tube
• Fortnightly petrol prices
Organization of Data
To describe situations, draw conclusions, or make inferences about events, the researcher must organize the data in some meaningful way. Thus “Organization of data means reformatting the collected data in a more understandable form”. The most convenient methods of organizing data are classification, tabulation (frequency distributions) and the construction of statistical diagrams and graphs.
Classification
“The process of arranging data into classes or categories according to some common characteristic present in the data is called classification”.
Objectives of Classification
The main objectives of classification are:
• It brings out points of similarity and dissimilarity.
• By condensing the details, it saves one from mental strain.
• It enables one to make comparisons and draw inferences simply.
• It prepares the ground for the proper presentation of statistical facts.
Types of Classification
Classification is either descriptive or numerical.
Descriptive Classification
“When the data are classified on the basis of qualities or attributes, which are incapable of quantitative measurement, then the classification is said to be descriptive”, e.g. gender, marital status, educational standard, etc. Descriptive classification is also called classification according to attributes.
Numerical Classification
“When the data are classified on the basis of quantitative measurements, then the classification is said to be numerical”, e.g. age, income, height, weight, etc.
Tabulation
“A table is a systematic arrangement of data into vertical columns and horizontal rows. Thus the process of arranging data into rows and columns is called tabulation”.
The two separate headings “Classification” and “Tabulation” should not lead the readers to assume that these are two distinct processes. In fact, they go together; classification is the first step in tabulation. Before the data are put in tabular form, they have to be classified into different classes or groups having common characteristics. After this step the data are displayed under different columns and rows so that their relationship can be easily understood.
Frequency
“The number of occurrences of a particular observation in a data is called frequency”.

OR
“The number of observations falling in a particular group (class) is called frequency”.
Frequency Distribution
“The organization of raw data in table form, along with frequencies is called frequency distribution”.
The types of frequency distributions that will be considered here are:
• Categorical frequency distribution
• Ungrouped frequency distribution
• Grouped frequency distribution
• A categorical frequency distribution represents data that can be placed in different categories, such as gender, hair color or blood group, along with their frequencies. The categorical frequency distribution is also called a frequency table. A categorical frequency distribution of the students' blood groups is given below:

Categorical Frequency Distribution (Categorical Frequency Table)
Blood Group   No. of students (f)
A             5
B             8
O             4

• An ungrouped frequency distribution simply lists the data values with the corresponding frequencies. The ungrouped frequency distribution is also called discrete grouped data. An ungrouped frequency distribution of the students' marks is given below:

Ungrouped Frequency Distribution
Marks   No. of students (f)
50      4
60      6
45      3

• A grouped frequency distribution is obtained by constructing classes (or intervals) for the data values along with the corresponding frequencies. The grouped frequency distribution is also called continuous grouped data. A grouped frequency distribution of the heights of students is given below:

Grouped Frequency Distribution (Continuous Grouped Data)
Height   No. of students (f)
3–4      7
4–5      30
5–6      6

Notes:
• Data in the form of a frequency distribution are called grouped data.
• The purpose of a frequency distribution is to produce a meaningful pattern for the overall distribution of the data from which conclusions can be drawn.
Construction of frequency distribution (or frequency table) for Categorical (or Qualitative) data
Suppose that in all there are 625 first-year students in a large college. Suppose some of these students have come from Urdu medium schools and the others have come from English medium schools. If we interview the students about their schooling, we will get observations as follows:
U, U, E, U, E, E, E, U …
(U: URDU MEDIUM) (E: ENGLISH MEDIUM)
Now the frequency table of the “medium of institution” is given as follows:
Medium of institution   No. of Students (f)
Urdu                    400
English                 225
Total                   625
This frequency table is called a univariate frequency table because it is constructed for one categorical variable, i.e. the medium of institution.

Note: We must decide how many categories or classes to use. These categories must be chosen so as to accommodate all the data and so that no item is placed under more than one category. The concepts of class limits, class boundaries, and class marks are of no concern when constructing a frequency distribution using categorical data.

Now suppose that along with the medium of institution, you are also recording the gender of the student, i.e.
Student No.   Medium   Gender
1             E        F
2             E        M
3             U        F
.             .        .
.             .        .
Then the frequency table of the “medium of institution” and the “gender of the students” is given as follows:
Medium of Institute   Gender: Male   Gender: Female   Total
Urdu                  300            100              400
English               100            125              225
Total                 400            225              625
This frequency table is called a bivariate frequency table because it is constructed for two categorical variables, i.e. the medium of institution and the gender of the students.
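The univariate and bivariate tables can be produced by simple counting. The following is a minimal sketch using Python's `collections.Counter`. The `records` list is a small hypothetical sample (only the first three records come from the text), so its counts do not reproduce the totals for the 625 students.

```python
from collections import Counter

# Hypothetical sample of (medium, gender) records; "U" = Urdu, "E" = English.
# The first three records follow the text; the remaining ones are invented.
records = [("E", "F"), ("E", "M"), ("U", "F"), ("U", "M"), ("U", "M"), ("E", "F")]

# One-way (univariate) table: count by medium only.
by_medium = Counter(medium for medium, _ in records)

# Two-way (bivariate) table: count by (medium, gender) pairs.
by_medium_gender = Counter(records)

print("Medium  Male  Female  Total")
for medium in sorted(by_medium):
    male = by_medium_gender[(medium, "M")]
    female = by_medium_gender[(medium, "F")]
    print(f"{medium:<6}  {male:>4}  {female:>6}  {by_medium[medium]:>5}")
```

Each row total equals the sum of the male and female counts for that medium, which is the same check that the "Total" column of the bivariate table provides.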
• When the data are sorted according to one criterion only, the classification is called one-way classification, e.g. the students classified by the medium of institution. Tabulation in this case is called one-way tabulation.
• When the data are sorted according to two criteria, the classification is called two-way classification, e.g. the students classified by the medium of institution and their gender. Tabulation in this case is called two-way tabulation.
• When the data are sorted according to three criteria, the classification is called three-way classification, e.g. the students classified by the medium of institution, their gender and their residence. Tabulation in this case is called three-way tabulation.
• When the data are sorted according to many criteria, the classification is called manifold classification. Tabulation in this case is called complex tabulation.
Construction of Ungrouped Frequency Distribution
The following steps are used for constructing an ungrouped frequency distribution:
Step 1: Denote the variable by X and make a column of the X values that are in the data.
Step 2: Construct two more columns adjacent to the column of X. The first of these two columns is for tally marks and the second for frequency.
Step 3: Sum the frequency column and check the sum against the total number of observations.
EXAMPLE 2.01
The following are the number of flowers on different branches of a plant:
2 4 6 1 3 3 5 7 8 6 2 9
4 7 4 2 1 3 6 4 2 5 1 4
7 9 1 2 10 1 8 9 2 3 8 2
1 2 3 4 4 4 6 6 5 5 6 1
4 5 8 5 4 3 3 2 5 0 9 1
5 9 8 10 0 10 10 -- -- -- -- --
Solution
• The variable involved is “no. of flowers”, which is discrete.
• Therefore the ungrouped frequency distribution for this data is:
X (no. of flowers)   Tally          f
0                    II             2
1                    IIIII III      8
2                    IIIII IIII     9
3                    IIIII II       7
4                    IIIII IIIII    10
5                    IIIII III      8
6                    IIIII I        6
7                    III            3
8                    IIIII          5
9                    IIIII          5
10                   IIII           4
Total                --             67
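The three construction steps can be checked mechanically. The sketch below, assuming plain Python with `collections.Counter`, counts the data of Example 2.01, prints one tally stroke per observation, and performs the Step 3 check that the frequencies sum to the number of observations.

```python
from collections import Counter

# Data of Example 2.01 (number of flowers on different branches).
flowers = [
    2, 4, 6, 1, 3, 3, 5, 7, 8, 6, 2, 9,
    4, 7, 4, 2, 1, 3, 6, 4, 2, 5, 1, 4,
    7, 9, 1, 2, 10, 1, 8, 9, 2, 3, 8, 2,
    1, 2, 3, 4, 4, 4, 6, 6, 5, 5, 6, 1,
    4, 5, 8, 5, 4, 3, 3, 2, 5, 0, 9, 1,
    5, 9, 8, 10, 0, 10, 10,
]

# Steps 1 and 2: one row per distinct X value, with tally and frequency.
freq = Counter(flowers)
for x in sorted(freq):
    print(f"{x:>2}  {'I' * freq[x]:<10}  {freq[x]:>2}")

# Step 3: the frequencies must add up to the number of observations.
total = sum(freq.values())
print("Total", total)
```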
Test Yourself
The following are the number of flowers on different branches of a plant. Construct a frequency distribution.
12 14 16 11 13 13 15 17 18 16 12 19
14 17 14 12 11 13 16 14 12 15 11 14
17 19 11 12 20 11 18 19 12 13 18 12
11 12 13 14 14 14 16 16 15 15 16 11
14 15 18 15 14 13 13 12 15 10 19 11
15 19 18 20 10 20 20 -- -- -- -- --
Some important Points in a Grouped Frequency Distribution
Class Interval (Class)
In the following table each of the groups (110-119), (120-129) and (130-139) is called a class
interval (or class).
Classes f
110 - 119 1
120 - 129 3
130 - 139 2
Class Limits
“The smaller and larger numbers, which describe the class interval, are called the class limits”
• The smaller number is the lower class limit and the larger number is the upper class limit. Class limits should be well defined and there should be no overlapping. In other words the limits should be inclusive, i.e. the values corresponding exactly to the lower limit or the upper limit should be included in that class.
Classes
110 - 119
120 - 129
130 - 139
140 - 149
• Sometimes classes are taken as given in the following table. In such a case, it is difficult to decide where to place an item which is exactly 120, 130, 140, etc., because each one of them seems to belong to two classes. Such overlapping class limits should, therefore, be avoided.
Classes
110 - 120
120 - 130
130 - 140
140 - 150
• Sometimes a class has either no lower class limit or no upper class limit; such a class is called an open-end class, as in the following table:
Classes
Below (under) 15
15 - 19
20 - 24
25 - 29
30 - 34
35 and over (above)
It is clear from the above table that in the class “Below 15” there is no lower class limit and in the class “35 and over” there is no upper class limit.

Note: Arithmetic mean, harmonic mean and geometric mean cannot be computed from an open-end frequency distribution, because the midpoints of the open-end classes cannot be determined. Therefore it is a bad practice to use open-end classes.
Class Boundaries
“The precise (true) numbers, which remove the discontinuity between two classes, are called class
boundaries or true class limits”
 A class boundary is located halfway between the upper limit of a class and the lower limit of the
next higher class.
Classes     Class boundaries
110 - 119   109.5 - 119.5
120 - 129   119.5 - 129.5
130 - 139   129.5 - 139.5
• If the classes are in the form:
Classes
110 - 120
120 - 130
130 - 140
then the class limits are themselves the class boundaries, because there is no discontinuity between two classes.
• Sometimes the distribution has open-end classes:
Classes               Class boundaries
Below (under) 15      Up to 14.5
15 - 19               14.5 - 19.5
20 - 24               19.5 - 24.5
25 - 29               24.5 - 29.5
30 - 34               29.5 - 34.5
35 - 39               34.5 - 39.5
40 and over (above)   39.5 and over
Class Mark or Mid Point
“The number which divides each class into two equal parts is called the class mark”
It can be obtained by dividing either the sum of the lower and upper limits of a class or the sum of the lower and upper class boundaries of the class by 2.

Class Width (Class Size)
“The difference between the lower class limits of two consecutive classes is called the class width”. OR
“The difference between the upper class boundary and the lower class boundary of a particular class is called the class width”.
The width (or size) of the class intervals is denoted by “h”.

Note: The class width may or may not be equal for all the classes. If the class width is equal for all the classes then it is called the “common width”. In practice it is desirable to have equal class widths whenever possible.
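The definitions of class boundaries, class marks and class width can be checked numerically. The following is a minimal sketch, assuming at least two classes given as inclusive (lower, upper) limits with the same gap between every pair of consecutive classes. The function name `class_summary` is only illustrative.

```python
def class_summary(limits):
    """Return (lower boundary, upper boundary, class mark, width) per class.

    `limits` is a list of inclusive (lower, upper) class limits in
    increasing order; the gap between consecutive classes is assumed constant.
    """
    # Gap between the upper limit of one class and the lower limit of the next.
    gap = limits[1][0] - limits[0][1]
    rows = []
    for lower, upper in limits:
        lower_boundary = lower - gap / 2   # halfway towards the previous class
        upper_boundary = upper + gap / 2   # halfway towards the next class
        mark = (lower + upper) / 2         # class mark (mid-point)
        width = upper_boundary - lower_boundary
        rows.append((lower_boundary, upper_boundary, mark, width))
    return rows


rows = class_summary([(110, 119), (120, 129), (130, 139)])
for row in rows:
    print(row)
```

For the classes 110-119, 120-129 and 130-139 this gives boundaries 109.5-119.5, 119.5-129.5 and 129.5-139.5, with a common width h = 10, which also equals the difference between consecutive lower class limits.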
Construction of Grouped Frequency Distribution (Continuous Grouped Data)
The following steps are used for constructing a grouped frequency distribution:
Step 1: Decide the number of classes. For this purpose there are no hard and fast rules, but statistical experience tells us that no fewer than 5 and no more than 20 classes are generally used.
Rule: If $2^k \ge N$, then we take “k” classes, where “N” is the total number of observations and “k” is the number of classes. The smallest k that satisfies the inequality is used.
Note: H.A. Sturges has proposed an empirical rule for determining the number of classes into which a set of observations should be grouped. The rule is $k = 1 + 3.3 \log N$ (logarithm to base 10), where k denotes the number of classes and N is the total number of observations.
Step 2: Determine the range of variation in the data, i.e. $R = X_m - X_0$, where R is the range, $X_m$ is the largest value and $X_0$ is the smallest value.
Step 3: Determine the approximate width (size) of the class by dividing the range (R) of variation by the number of classes (k).
Step 4: Decide where to locate the lower class limit of the lowest class. The lowest class usually starts with the smallest data value or a number less than it (preferably a multiple of the class width).
Step 5: List all the classes and class boundaries.
Step 6: Distribute the data into the appropriate classes by using a tally column.
Step 7: Complete the frequency column.
Note: A frequency distribution should have a minimum of 5 and a maximum of 20 classes. For small data, use between 5 and 10 classes. For large data, use up to 20 classes.
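Steps 1 to 3 involve only a little arithmetic, sketched below in Python. The function names are illustrative, and Sturges' rule is evaluated with the base-10 logarithm.

```python
import math


def classes_by_power_rule(n):
    """Smallest k such that 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k


def classes_by_sturges(n):
    """Sturges' rule k = 1 + 3.3 log10(n), returned without rounding."""
    return 1 + 3.3 * math.log10(n)


def approximate_width(values, k):
    """Range R = largest value - smallest value, divided by k classes."""
    return (max(values) - min(values)) / k


n = 35
print(classes_by_power_rule(n))          # 2**5 = 32 < 35 <= 64 = 2**6, so 6
print(round(classes_by_sturges(n), 2))   # about 6.1
```

Both rules only suggest a starting point; the width from Step 3 is normally rounded up to a convenient whole number before the classes are listed.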
EXAMPLE 2.02
The following data indicate the number of people in different localities:
20 50 60 70 35 45 39
61 74 80 25 30 39 40
58 60 67 71 81 82 85
86 80 94 89 56 58 40
45 56 63 72 79 40 18
Solution
• The variable of the given data is “number of people”, which is discrete.
• Range = 94 – 18 = 76.
• Approximate number of classes: $2^k \ge N \Rightarrow 2^6 \ge 35 \Rightarrow k = 6$ (N = total no. of observations).
• Class Width = Range/k = 76/6 = 12.67 ≈ 13
• Hence the grouped frequency distribution is:
Classes    Tally         f
18 – 30    IIII          4
31 – 43    IIIII I       6
44 – 56    IIIII         5
57 – 69    IIIII II      7
70 – 82    IIIII IIII    9
83 – 95    IIII          4
Total      --            35
Note: In calculating the class width of a frequency distribution, use the next whole number as the class width. Doing this ensures that you will have enough space in your frequency distribution for all the data values.

Alternate Method
• We may take the approximate desired no. of classes = 8 (assumed).
• Class Width = Range/k = 76/8 = 9.5 ≈ 10
Classes    Tally        f
15 – 24    II           2
25 – 34    II           2
35 – 44    IIIII I      6
45 – 54    III          3
55 – 64    IIIII III    8
65 – 74    IIIII        5
75 – 84    IIIII        5
85 – 94    IIII         4
Total      --           35
Note: There is no need of class boundaries because the variable is discrete in this case.
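The grouping in Example 2.02 can be reproduced with a short script. This is a sketch assuming whole-number data and inclusive classes that each cover `width` consecutive whole numbers. `grouped_frequency` is an illustrative helper, not a library function.

```python
import math

# Data of Example 2.02 (number of people in different localities).
people = [
    20, 50, 60, 70, 35, 45, 39,
    61, 74, 80, 25, 30, 39, 40,
    58, 60, 67, 71, 81, 82, 85,
    86, 80, 94, 89, 56, 58, 40,
    45, 56, 63, 72, 79, 40, 18,
]


def grouped_frequency(values, start, width, k):
    """Count whole-number values into k inclusive classes of equal width."""
    counts = [0] * k
    for v in values:
        counts[(v - start) // width] += 1   # index of the class containing v
    return [
        ((start + i * width, start + (i + 1) * width - 1), counts[i])
        for i in range(k)
    ]


# First method: k = 6 classes, width rounded up to the next whole number.
k = 6
width = math.ceil((max(people) - min(people)) / k)
table = grouped_frequency(people, min(people), width, k)

# Alternate method: 8 classes of width 10, starting at 15.
alternate = grouped_frequency(people, 15, 10, 8)

for (lower, upper), f in table:
    print(f"{lower} - {upper}: {f}")
```

Starting the classes at a value at or below the smallest observation and rounding the width up keeps every observation inside one of the k classes.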
EXAMPLE 2.03
The following data relate to heights of 1st year students (heights in inches):
62 67 65 64 70 70 66 64 63 65
66 68 71 60 64 63 62 64 63 65
66 70 71 72 69 68 62 65 64 62
68 67 65 60 69 64 66 63 -- --
Solution
• The variable involved is “height”, which is a continuous variable.
• Range = 72 – 60 = 12
• No. of classes = 7 (assumed)
• Class Width = 12/7 = 1.714 ≈ 2
Classes    Tally          f
60 – 62    II             2
62 – 64    IIIII IIII     9
64 – 66    IIIII IIIII    10
66 – 68    IIIII I        6
68 – 70    IIIII          5
70 – 72    IIIII          5
72 – 74    I              1
Total      --             38
Note: Here the class limits and class boundaries are the same, but it is difficult to decide where to place an item which is exactly 62, 64, 66, etc., because each one of them seems to belong to two classes. Such overlapping class limits should therefore be avoided.

Alternate Method
Classes    Class boundaries   Tally            f
60 – 61    59.5 – 61.5        II               2
62 – 63    61.5 – 63.5        IIIII III        8
64 – 65    63.5 – 65.5        IIIII IIIII I    11
66 – 67    65.5 – 67.5        IIIII I          6
68 – 69    67.5 – 69.5        IIIII            5
70 – 71    69.5 – 71.5        IIIII            5
72 – 73    71.5 – 73.5        I                1
Total      --                 --               38
Test Yourself
Construct a frequency distribution:
1) The following data indicate the number of people in different localities:
30 60 70 80 45 55 49
71 84 90 35 40 49 50
68 70 77 81 91 92 95
96 90 104 99 66 68 50
55 66 73 82 89 50 28
2) The following data relate to heights of 1st year students (heights in inches):
72 77 75 74 80 80 76 74 73 75
76 78 81 70 74 73 72 74 73 75
76 80 81 82 79 78 72 75 74 72
78 77 75 70 79 74 76 73 -- --
Relative Frequency Distribution
“The frequency of a class divided by the total of the frequencies is called the relative frequency of that class and a table showing the relative frequencies is called a relative frequency distribution”.
$$\text{R.F} = \frac{\text{frequency of a class}}{\text{total of frequencies of all classes}}$$
Percentage Relative Frequency Distribution
“The frequency of a class divided by the total of the frequencies and multiplied by 100, is called the percentage relative frequency of that class and a table showing the percentage relative frequencies is called a percentage relative frequency distribution”.
$$\text{P.R.F} = \frac{\text{frequency of a class}}{\text{total of frequencies of all classes}} \times 100$$
Classes    Class boundaries   f     R.F      P.R.F
60 – 61    59.5 – 61.5        2     2/38     (2/38) x 100
62 – 63    61.5 – 63.5        8     8/38     (8/38) x 100
64 – 65    63.5 – 65.5        11    11/38    (11/38) x 100
66 – 67    65.5 – 67.5        6     6/38     (6/38) x 100
68 – 69    67.5 – 69.5        5     5/38     (5/38) x 100
70 – 71    69.5 – 71.5        5     5/38     (5/38) x 100
72 – 73    71.5 – 73.5        1     1/38     (1/38) x 100
Total      --                 38    1        100
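The two formulas can be applied to the whole frequency column at once. A minimal sketch in Python, using the class frequencies of the height data (variable names are illustrative):

```python
# Classes and frequencies of the height data (alternate method of Example 2.03).
classes = ["60-61", "62-63", "64-65", "66-67", "68-69", "70-71", "72-73"]
freqs = [2, 8, 11, 6, 5, 5, 1]

total = sum(freqs)
rel = [f / total for f in freqs]     # relative frequencies (R.F)
pct = [100 * r for r in rel]         # percentage relative frequencies (P.R.F)

for label, f, r, p in zip(classes, freqs, rel, pct):
    print(f"{label}  {f:>2}  {r:.4f}  {p:.2f}")
print("Total", total, round(sum(rel), 4), round(sum(pct), 2))
```

Apart from rounding error, the relative frequencies add up to 1 and the percentage relative frequencies add up to 100.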
Cumulative Frequency
• Cumulative frequency for an ungrouped frequency distribution is defined as “the total frequency that is obtained by adding the frequency of each value to the frequencies of the preceding values”.
• Cumulative frequency for a grouped frequency distribution is defined as “the total frequency of all classes less than the upper class boundary of a given class, or the total frequency of all classes greater than the lower class boundary of a given class”.
Cumulative Frequency Distribution
“A tabular form of the variable along with the cumulative frequencies is called a cumulative frequency distribution”, e.g.

Ungrouped data (flowers data of Example 2.01):
X       f     C.F
0       2     2
1       8     2 + 8 = 10
2       9     10 + 9 = 19
3       7     19 + 7 = 26
4       10    26 + 10 = 36
5       8     36 + 8 = 44
6       6     44 + 6 = 50
7       3     50 + 3 = 53
8       5     53 + 5 = 58
9       5     58 + 5 = 63
10      4     63 + 4 = 67
Total   67    --

Grouped data (heights data of Example 2.03):
Classes    Class boundaries   f     C.F
60 – 61    59.5 – 61.5        2     2
62 – 63    61.5 – 63.5        8     8 + 2 = 10
64 – 65    63.5 – 65.5        11    11 + 10 = 21
66 – 67    65.5 – 67.5        6     6 + 21 = 27
68 – 69    67.5 – 69.5        5     5 + 27 = 32
70 – 71    69.5 – 71.5        5     5 + 32 = 37
72 – 73    71.5 – 73.5        1     1 + 37 = 38
Total      --                 38    --
Types of Cumulative Frequency Distribution
• “Less than” C.F
• “Or more” C.F

“Less than” Cumulative Frequency Distribution
“A “less than” cumulative frequency distribution is one where the cumulative frequency is obtained by totalling the frequencies of all classes less than the upper class boundary of a given class; it starts with the lower class boundary of the first class, indicating that there is no frequency below it”.
Classes    Class boundaries   f     Class boundaries in “less than” form   C.F
60 – 61    59.5 – 61.5        2     Less than 59.5                         0
62 – 63    61.5 – 63.5        8     Less than 61.5                         2
64 – 65    63.5 – 65.5        11    Less than 63.5                         2 + 8 = 10
66 – 67    65.5 – 67.5        6     Less than 65.5                         10 + 11 = 21
68 – 69    67.5 – 69.5        5     Less than 67.5                         21 + 6 = 27
70 – 71    69.5 – 71.5        5     Less than 69.5                         27 + 5 = 32
72 – 73    71.5 – 73.5        1     Less than 71.5                         32 + 5 = 37
--         --                 --    Less than 73.5                         37 + 1 = 38
Total      --                 38    --                                     --
“OR more” OR “more than” Cumulative Frequency Distribution
“An “or more” cumulative frequency distribution is that, where the cumulative frequency is
obtained by the total frequency of all classes more than the lower class boundary of a given class
and ends with the upper class boundary of the last class indicating that there is no frequency
above it.”
Classes    Class boundaries   f     Class boundaries in “or more” form   C.F
60 – 61    59.5 – 61.5        2     59.5 or more                         38
62 – 63    61.5 – 63.5        8     61.5 or more                         38 – 2 = 36
64 – 65    63.5 – 65.5        11    63.5 or more                         36 – 8 = 28
66 – 67    65.5 – 67.5        6     65.5 or more                         28 – 11 = 17
68 – 69    67.5 – 69.5        5     67.5 or more                         17 – 6 = 11
70 – 71    69.5 – 71.5        5     69.5 or more                         11 – 5 = 6
72 – 73    71.5 – 73.5        1     71.5 or more                         6 – 5 = 1
--         --                 --    73.5 or more                         0
Total      --                 38    --                                   --
Whenever we refer to a cumulative frequency distribution without any qualification, we always mean a “less than” type cumulative frequency distribution.
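Both types of cumulative frequency can be computed from the same frequency column. The sketch below, assuming Python's `itertools.accumulate`, builds the “less than” and “or more” columns for the height data at each class boundary.

```python
from itertools import accumulate

# Class boundaries and frequencies of the height data.
boundaries = [59.5, 61.5, 63.5, 65.5, 67.5, 69.5, 71.5, 73.5]
freqs = [2, 8, 11, 6, 5, 5, 1]
total = sum(freqs)

# "Less than" type: running total of frequencies below each boundary,
# starting from 0 at the lower boundary of the first class.
less_than = [0] + list(accumulate(freqs))

# "Or more" type: whatever is not below a boundary lies at or above it.
or_more = [total - c for c in less_than]

for b, lt, om in zip(boundaries, less_than, or_more):
    print(f"Less than {b}: {lt:>2}    {b} or more: {om:>2}")
```

At every boundary the two cumulative frequencies add up to the total frequency, which gives a quick check on both columns.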
Relative Cumulative Frequency Distribution
“The cumulative frequency of a class divided by the total of frequencies is called the relative cumulative frequency and a table showing relative cumulative frequencies is called the relative cumulative frequency distribution”.
$$\text{R.C.F} = \frac{\text{cumulative frequency of a class}}{\text{total of frequencies}}$$
Percentage Relative Cumulative Frequency Distribution
“The cumulative frequency of a class divided by the total of frequencies and multiplied by 100, is called the percentage relative cumulative frequency and a table showing percentage relative cumulative frequencies is called the percentage relative cumulative frequency distribution”.
$$\text{P.R.C.F} = \frac{\text{cumulative frequency of a class}}{\text{total of frequencies}} \times 100$$
Classes    Class boundaries   f     C.F             R.C.F            P.R.C.F
60 – 61    59.5 – 61.5        2     2               2/38 = 0.0526    (2/38) x 100 = 5.2632
62 – 63    61.5 – 63.5        8     8 + 2 = 10      10/38 = 0.2632   (10/38) x 100 = 26.3158
64 – 65    63.5 – 65.5        11    11 + 10 = 21    21/38 = 0.5526   (21/38) x 100 = 55.2632
66 – 67    65.5 – 67.5        6     6 + 21 = 27     27/38 = 0.7105   (27/38) x 100 = 71.0526
68 – 69    67.5 – 69.5        5     5 + 27 = 32     32/38 = 0.8421   (32/38) x 100 = 84.2105
70 – 71    69.5 – 71.5        5     5 + 32 = 37     37/38 = 0.9737   (37/38) x 100 = 97.3684
72 – 73    71.5 – 73.5        1     1 + 37 = 38     38/38 = 1        (38/38) x 100 = 100
Total      --                 38    --              --               --
Diagrams (Charts)
Charts or diagrams give visual representations of qualitative data. Diagrams also show comparisons between two or more sets of qualitative data. Diagrams should be clear and easy to read and understand. Too much information should not be shown in the same diagram, otherwise it might become confusing. The diagrams considered here are:
• Bar Charts
• Pie Chart or Circle Diagram

Bar Charts
The types of bar charts are:
• Simple bar chart
• Multiple bar chart or cluster chart
• Component bar chart, also called sub-divided bar chart or stacked bar chart
Simple Bar Chart
This chart consists of vertical or horizontal bars of equal width. The length of the bars represents the
magnitude of the values of the variable i.e. the lengths of the bars vary depending on the size of data
values.
EXAMPLE 2.04
The following table gives the population of five different cities. Draw a simple bar chart:
Cities       A    B    C    D    E
Population   30   45   60   63   69
Solution
Step 1: Draw the X and Y axes and place the population on the Y-axis.
Step 2: Draw the bars corresponding to the population.
[Figure: Bar chart of the Population. X-axis: Cities (A to E); Y-axis: Population, scaled from 0 to 75.]
Note: In 1786 William Playfair introduced the bar chart.
Test Yourself
The following table gives the population of six different cities. Draw a simple bar chart:
Cities       A    B    C    D    E    F
Population   50   60   70   80   90   100
Multiple Bars Chart
A multiple bars chart represents two or more sets of inter-related data. The technique of the simple bar chart is used to draw this chart, but the difference is that we use different shades, colors or dots to distinguish between different phenomena. A multiple bars chart facilitates comparison between more than one phenomenon.
EXAMPLE 2.05
The following table gives the imports and exports of Pakistan for five months. Draw a multiple bar chart:
Months     Imports   Exports
January    8         4
February   10        6
March      12        9
April      18        13
May        20        17
Solution
Step 1: Draw the X and Y axes and place the amount on the Y-axis.
Step 2: For each month, draw two bars (one for the imports and one for the exports) side by side, corresponding to the amounts.
[Figure: Multiple Bars Chart. X-axis: Months (Jan to May); Y-axis: Million (Rs.), scaled from 0 to 25; legend: Import, Export.]
Test Yourself
The following table gives the imports and exports of Pakistan for five months. Draw a multiple bar chart:
Months     Imports   Exports
January    9         5
February   13        9
March      15        8
April      14        16
May        23        12
EXAMPLE 2.06
Construct a Multiple Bar Chart to show the population of the cities given in the following table:
Population in thousand
City May Jun July
A 70 110 200
B 80 90 160
C 90 100 120
Solution
Step 1: Draw the X and Y axes and place the population on the Y-axis.
Step 2: For each month, draw three bars (one for each of the three cities) side by side, corresponding to the populations.
[Figure: Multiple Bars Chart. X-axis: Months (May, Jun, July); Y-axis: Population, scaled from 0 to 250; legend: city A, city B, city C.]
Test Yourself
Construct a Multiple Bar Chart to show the population of the cities given in the following table:
Population in thousand
City Jan Feb March
A 90 120 180
B 60 80 200
C 70 120 100
Sub-divided Bar Chart
A sub-divided bar chart is an effective technique in which each bar is sub-divided into two or more
parts. The component parts are shaded or colored differently to increase the overall effectiveness of the
diagram.
EXAMPLE 2.07
The following table represents the monthly development in the fields of industry, transport and agriculture of Pakistan. Construct a sub-divided bar chart:
Months   Industry   Transport   Agriculture   Total
Jan      100        80          40            220
Feb      120        100         50            270
March    130        120         60            310
Solution
Step 1: Draw the X and Y axes and place the development on the Y-axis.
Step 2: For each month, draw one bar corresponding to the total development, then sub-divide each bar into parts for agriculture, transport and industry according to their corresponding developments.
[Figure: Sub-divided Bars Chart. X-axis: Months (Jan, Feb, March); Y-axis: Development, scaled from 0 to 350; legend: agriculture, transport, industry.]
Test Yourself
The following table represents the monthly development in the fields of industry, transport and agriculture of Pakistan. Construct a sub-divided bar chart:
Months   Industry   Transport   Agriculture   Total
May      120        90          50            260
June     140        110         70            320
July     150        100         40            290
Pie Chart or Circle Diagram
A pie diagram, also known as a sector or circle diagram, is a device consisting of a circle divided into sectors or pie-shaped pieces whose areas are proportional to the various parts into which the whole quantity is divided. The sectors are shaded or colored differently.
The procedure of constructing a pie chart is very simple:
• Draw a circle of some suitable radius.
• As a circle consists of 360°, the whole quantity to be displayed is equated to 360.
• Then divide the circle into different sectors by constructing angles at the center by means of a protractor and draw the corresponding radii.
• The angles are calculated by the following formula:
$$\text{Angle} = \frac{\text{Component Part}}{\text{Whole Quantity}} \times 360^\circ$$
Note: The earliest known pie chart, dating from 1801, is generally credited to William Playfair.
A compass and a protractor are used to draw the pie chart.
[Figure: How to draw a pie chart, illustrated in Steps 1 to 8.]
EXAMPLE 2.08
Draw a Pie-diagram for the following data:
Items Expenditure in Rs.

Food 190
Clothing 64
Rent 100
Medical 46
Other 80
Solution
Step 1: To construct the pie diagram, first we find the angles, using Angle = (Component Part / Whole Quantity) × 360°.
Items      Expenditure in Rs.   Angle
Food       190                  142.5°
Clothing   64                   48°
Rent       100                  75°
Medical    46                   34.5°
Other      80                   60°
Total      480                  360°
Step 2: Next, using a protractor and a compass, draw the chart using the angles found in Step 1, and label each sector with its name.
[Figure: Pie Chart with sectors labelled Food, Clothing, Rent, Medical and Others.]
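The angle calculation of Example 2.08 can be sketched in a few lines of Python. The dictionary name `expenditure` is illustrative. The multiplication by 360 is done before the division so that the results stay exact for these values.

```python
# Expenditure in Rs. from Example 2.08.
expenditure = {"Food": 190, "Clothing": 64, "Rent": 100, "Medical": 46, "Other": 80}

whole = sum(expenditure.values())

# Angle = (component part / whole quantity) * 360 degrees.
angles = {item: part * 360 / whole for item, part in expenditure.items()}

for item, angle in angles.items():
    print(f"{item:<9} {expenditure[item]:>4}  {angle:>6.1f}")
print(f"{'Total':<9} {whole:>4}  {sum(angles.values()):>6.1f}")
```

The angles must add up to 360°, which is a useful check before drawing the sectors with the protractor.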
Test Yourself
Draw a Pie-diagram for the following data:
Items Expenditure in Rs.

Food 160
Clothing 80
Rent 120
Medical 50
Other 90
Graphs
Graphs give visual representations of quantitative data. A graph consists of curves or straight lines. Graphs provide a very good method of showing fluctuations and trends in statistical data. Graphs can also be used to make predictions and forecasts. The graphs considered here are:
• Graphs of a grouped frequency distribution: Histogram, Frequency Polygon, Frequency Curve and Cumulative Frequency Polygon (Ogive)
• Graph of Ungrouped Frequency Distribution
• Graph of Time Series
Histogram
A histogram consists of a set of rectangles having bases on a horizontal axis, i.e. the X-axis (note that these bases are marked off by class boundaries, not class limits), with centers at the class marks and areas proportional to the class frequencies.
• If the widths of the classes are equal then the heights of the rectangles are also proportional to the class frequencies and are taken numerically equal to the class frequencies.
• If the widths of the classes are not equal then the heights of the rectangles have to be adjusted.
Note: In 1891, Pearson introduced the term “histogram”.
First Method (Equal Class Width)
 Draw X-axis and Y-axis.

 Take class boundaries on X-axis and frequencies on Y-axis.
 Construct joint rectangles. The resulting figure is the required histogram.
EXAMPLE 2.09
Construct Histogram from the following frequency distribution:
Classes     40-49   50-59   60-69   70-79   80-89   90-99   100-109
Frequency   1       3       4       5       4       2       1
Solution
To draw a histogram we proceed with the following steps:
Step 1: Find the class boundaries.
Step 2: Mark the class boundaries along the x-axis and the frequencies along the y-axis.
Step 3: Construct rectangles having widths proportional to the class widths and heights proportional to the class frequencies.
Step 4: The resulting graph will be the histogram as given below.
Classes   f   Class boundaries
40-49     1   39.5-49.5
50-59     3   49.5-59.5
60-69     4   59.5-69.5
70-79     5   69.5-79.5
80-89     4   79.5-89.5
90-99     2   89.5-99.5
100-109   1   99.5-109.5
[Figure: Histogram with equal class width. X-axis: class boundaries (39.5 to 109.5); Y-axis: Frequency, scaled from 0 to 6.]
Test Yourself
Classes 30-39 40-49 50-59 60-69 70-79 80-89 90-99

Frequency 2 4 5 7 4 3 2
Second Method (Unequal Class Width)

 Take class boundaries on X-axis and adjusted frequencies on Y-axis.
(Frequencies are adjusted by dividing them by their respective class width)
 Construct joint rectangles. The resulting figure is the required histogram.
EXAMPLE 2.10
classes   40-49   50-53   54-64   65-79   80-89   90-99   100-109
f         10      12      44      75      40      20      10
Solution
To draw a histogram we proceed with the following steps:
Step 1: Find the class boundaries and adjusted frequencies.
Step 2: Mark the class boundaries along the x-axis and the adjusted frequencies along the y-axis.
Step 3: Construct rectangles having widths proportional to the class widths and heights proportional to the adjusted class frequencies.
Step 4: The resulting graph will be the histogram as given below.
Class boundaries   Class width (h)   Adjusted frequency = f/h
39.5-49.5          10                1
49.5-53.5          4                 3
53.5-64.5          11                4
64.5-79.5          15                5
79.5-89.5          10                4
89.5-99.5          10                2
99.5-109.5         10                1
[Figure: Histogram with unequal class width. X-axis: class boundaries (39.5, 49.5, 53.5, 64.5, 79.5, 89.5, 99.5, 109.5); Y-axis: Frequency (adjusted), scaled from 0 to 6.]
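The adjustment for unequal class widths in Example 2.10 is a division of each frequency by its own class width. A minimal Python sketch, with illustrative variable names:

```python
# Class boundaries and frequencies of Example 2.10.
boundaries = [39.5, 49.5, 53.5, 64.5, 79.5, 89.5, 99.5, 109.5]
freqs = [10, 12, 44, 75, 40, 20, 10]

# Class width h = upper class boundary - lower class boundary.
widths = [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]

# Adjusted frequency = f / h, used as the height of each rectangle.
adjusted = [f / h for f, h in zip(freqs, widths)]

for lower, upper, h, a in zip(boundaries, boundaries[1:], widths, adjusted):
    print(f"{lower}-{upper}  h = {h:g}  adjusted frequency = {a:g}")
```

With heights f/h, the area of each rectangle (height times width) equals the class frequency f, so the areas stay proportional to the frequencies as the definition of a histogram requires.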
Test Yourself
classes 20-29 30-33 34-44 45-59 60-69 70-79 80-89

f 12 15 48 80 30 25 15
Frequency Polygon
A frequency polygon is a many sided closed figure. It is constructed by plotting the class frequencies
against their corresponding class marks (mid-points) and then joining the resulting points by means of
straight lines. It can also be obtained by joining the mid-points of the tops of rectangles in the
histograms.
Method
• Take class marks on the X-axis and frequencies on the Y-axis.
• Join the points by means of straight lines. The resulting figure is the required frequency polygon.
Note: In this method the ends of the graph do not meet the X-axis, whereas a polygon is a many-sided closed figure. We may therefore add extra classes with zero frequencies at both ends of the frequency distribution. By doing so the polygon forms a closed figure.
EXAMPLE 2.11
Construct a frequency polygon from the following frequency distribution:
Classes     10-19   20-29   30-39   40-49   50-59
Frequency   5       15      40      20      10
Solution
To draw a frequency polygon we proceed with the following steps:
Step 1: Find the class marks (mid-points).
Classes   Frequency   Mid-points
10-19     5           14.5
20-29     15          24.5
30-39     40          34.5
40-49     20          44.5
50-59     10          54.5
Step 2: Mark mid-points along the x-axis and the frequencies along y-axis.
Step 3: Place a dot against each mid-point with respect to its class frequency.
Step 4: Join the dots by straight lines to get Frequency Polygon as given below.
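The points to be plotted in Example 2.11, including the two extra zero-frequency classes that close the polygon, can be listed with a short Python sketch. It assumes equal class widths, so that the extra mid-points lie one class width beyond each end.

```python
# Class limits and frequencies of Example 2.11.
limits = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
freqs = [5, 15, 40, 20, 10]

# Class mark = (lower limit + upper limit) / 2.
marks = [(lower + upper) / 2 for lower, upper in limits]

# Distance between consecutive class marks (the common class width).
step = marks[1] - marks[0]

# Add a zero-frequency class at each end so that the polygon is closed.
points = [(marks[0] - step, 0)] + list(zip(marks, freqs)) + [(marks[-1] + step, 0)]

for x, y in points:
    print(x, y)
```

Joining these points in order by straight lines gives the closed frequency polygon, running from 4.5 to 64.5 on the mid-point axis.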
[Figure: Frequency Polygon. X-axis: Mid Points (4.5 to 64.5); Y-axis: Frequency, scaled from 0 to 45.]
We can also draw a frequency polygon by the following method:
• Draw a histogram.
• Join the mid-points of the tops of the rectangles by straight lines. The resulting figure is the required frequency polygon.
[Figure: Frequency Polygon drawn over the histogram. X-axis: Mid Points (4.5 to 64.5); Y-axis: Frequency, scaled from 0 to 45.]
Test Yourself
Construct Frequency Polygon from the following frequency distribution:
Classes 20-29 30-39 40-49 50-59 60-69

Frequency 7 13 42 23 6
Frequency Curve
When the frequency polygon is smoothed out as a curve, it becomes a frequency curve. In other words, when the frequencies are plotted against the mid-points, a smooth curve passing through these points is called a frequency curve.
Method
• Take class marks on the X-axis and frequencies on the Y-axis.
• Plot the frequencies against the class marks.
• Join the plotted points by a smooth curve, which gives the frequency curve.
EXAMPLE 2.12
Construct Frequency Curve from the following frequency distribution:

Classes     10-19   20-29   30-39   40-49   50-59
Frequency   5       15      40      20      10
Solution
To draw a frequency curve we proceed with the following steps:
Step 1: Find the class marks (mid-points).
Classes   Frequency   Mid-points
10-19     5           14.5
20-29     15          24.5
30-39     40          34.5
40-49     20          44.5
50-59     10          54.5
Note: The smoothed curve should pass above the highest points of the polygon.
Step 2: Mark the mid-points along the x-axis and the frequencies along the y-axis.
Step 3: Place a dot against each mid-point with respect to its class frequency.
Step 4: Join the dots by a smooth line to get the frequency curve as given below.
[Figure: Frequency Curve. X-axis: Mid Points (4.5 to 64.5); Y-axis: Frequency, scaled from 0 to 45.]
We can also draw a frequency curve by the following method:
• Draw a histogram.
• Draw a smooth curve through the tops of the rectangles. The resulting figure is the required frequency curve.
[Figure: Frequency Curve drawn over the histogram. X-axis: Mid Points (4.5 to 64.5); Y-axis: Frequency, scaled from 0 to 45.]
Test Yourself
Construct Frequency Curve from the following frequency distribution:
Classes 20-29 30-39 40-49 50-59 60-69

Cumulative Frequency Polygon (Ogive)
When a curve is based on cumulative frequencies then it is called an ogive.
An ogive is of two types: Less than Type and More than Type.
Less than Type
Method
 First calculate the cumulative frequencies.
 Take upper class boundaries on X-axis and the cumulative frequencies on Y-axis.
 Plot the cumulative frequencies against the upper class boundaries.
 Join the plotted points by straight lines. The resulting figure is the required less than cumulative frequency polygon or less than ogive.
EXAMPLE 2.13
Construct Less Than Cumulative Frequency Polygon (Ogive) from the following frequency distribution:
Classes     10-19   20-29   30-39   40-49   50-59
Frequency   5       25      45      15      10
Solution To draw a Cumulative Frequency Polygon (Ogive) we proceed with the following steps:
Step 1: Find class-boundaries and cumulative frequencies.
Classes   Frequency   Cumulative Frequency   Class boundaries
10-19     5           5                      9.5-19.5
20-29     25          30                     19.5-29.5
30-39     45          75                     29.5-39.5
40-49     15          90                     39.5-49.5
50-59     10          100                    49.5-59.5
Step 2: Mark upper class-boundaries along the x-axis and cumulative frequencies along
y-axis.
Step 3: Place a dot against each upper class-boundary with respect to its class cumulative
frequency.
Step 4: Join the dots by straight line to get Cumulative Frequency Polygon (Ogive) as
given below.
[Figure: Less than Ogive. X-axis: upper class boundaries (19.5 to 59.5); Y-axis: Cumulative Frequency]
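The cumulative frequencies in Step 1 of Example 2.13 can also be checked with a few lines of Python. This is only a minimal sketch: the variable names are illustrative and not part of the text, and it assumes the class limits are whole numbers, so that each upper class boundary is the upper limit plus 0.5.

```python
from itertools import accumulate

# Class limits and frequencies from Example 2.13
classes = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
frequencies = [5, 25, 45, 15, 10]

# Upper class boundary = upper class limit + 0.5 (assumes whole-number limits)
upper_boundaries = [upper + 0.5 for _, upper in classes]

# "Less than" cumulative frequency = running total of the frequencies
less_than_cf = list(accumulate(frequencies))

# Each pair is one point of the less than ogive
for boundary, cf in zip(upper_boundaries, less_than_cf):
    print(f"less than {boundary}: {cf}")
```

The printed pairs are the points that are plotted and joined by straight lines in Steps 2 to 4.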
More than Type

Method
 First calculate the cumulative frequencies.
 Take lower class boundaries on X-axis and the cumulative frequencies on Y-axis.
 Plot the cumulative frequencies against the lower class boundaries.
 Join the plotted points by straight lines. The resulting figure is the required more than cumulative frequency polygon or more than ogive.
Note: If we join the points of a cumulative frequency polygon by a smooth curve, then we get a smoothed ogive.
EXAMPLE 2.14
Construct More Than Cumulative Frequency Polygon (Ogive) from the following frequency
distribution:
Classes 10-19 20-29 30-39 40-49 50-59
Frequency 5 25 45 15 10
Solution To draw a Cumulative Frequency Polygon (Ogive) we proceed with the following steps:
Step 1: Find class-boundaries and cumulative frequencies.
Classes   Frequency   Cumulative Frequency   Class boundaries
10-19     5           100                    9.5-19.5
20-29     25          95                     19.5-29.5
30-39     45          70                     29.5-39.5
40-49     15          25                     39.5-49.5
50-59     10          10                     49.5-59.5
Step 2: Mark lower class-boundaries along the x-axis and cumulative frequencies along y-axis.
Step 3: Place a dot against each lower class-boundary with respect to its class cumulative frequency.
Step 4: Join the dots by straight lines to get the Cumulative Frequency Polygon (Ogive) as given below.
[Figure: More than Ogive. X-axis: lower class boundaries (19.5 to 59.5); Y-axis: Cumulative Frequency]
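For the more than type, the frequencies are accumulated from the last class backwards. The following Python lines are a minimal sketch of Step 1 of Example 2.14; the variable names are illustrative, and the lower class boundary is taken as the lower limit minus 0.5, which assumes whole-number class limits.

```python
from itertools import accumulate

# Class limits and frequencies from Example 2.14
classes = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
frequencies = [5, 25, 45, 15, 10]

# Lower class boundary = lower class limit - 0.5 (assumes whole-number limits)
lower_boundaries = [lower - 0.5 for lower, _ in classes]

# Accumulate from the last class to the first, then restore the original order
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

# Each pair is one point of the more than ogive
for boundary, cf in zip(lower_boundaries, more_than_cf):
    print(f"more than {boundary}: {cf}")
```

The first class always carries the total frequency (here 100) and the last class carries only its own frequency.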
Test Yourself
Construct Both More Than and Less Than Cumulative Frequency Polygon (Ogive) from the
following frequency distribution:
Classes 30-39 40-49 50-59 60-69 70-79

Frequency 4 13 43 26 12
Graph of Ungrouped Frequency Distribution
A vertical lines graph is a visual representation of an ungrouped frequency distribution. It consists of a
set of vertical lines that are perpendicular to the X-axis and intersect it at the values of the
discrete variable; the height of each line is proportional to its frequency.
First Method
 Take the values of the discrete variable on X-axis and frequencies on Y-axis.
 Draw vertical lines for each value of the variable such that the height of each line is proportional to its frequency.
EXAMPLE 2.15
Construct vertical lines graph for the following data:
X   3   4   5   6   7   8   9
f   2   3   7   9   8   3   5
Solution
Step 1: Draw X-axis and Y-axis.
Step 2: Take the variable “X” on X-axis and frequencies along Y-axis.
Step 3: Draw vertical lines for each value of “X” with height equal to its frequency.
Note: For a discrete variable, if we make a Histogram we first find class boundaries. These class boundaries are called factitious class boundaries because the discrete variable cannot assume such values.
[Figures: Vertical Lines Graph and Histogram. X-axis: X (3 to 9); Y-axis: f (0 to 10)]
Test Yourself
Construct vertical lines graph and the histogram for the following data:
X 30 40 50 60 70 80 90
Frequency 4 6 7 13 12 5 1
Graph of Time Series
A curve showing changes in the value of one or more items from one period of time to the next is known
as the graph of time series. Thus a Graph of time series displays the variations in time series dealing
with prices, production, imports, population etc.
Method
 Take time (years, months, weeks, etc.) along X-axis and the corresponding values along Y-axis.
 Plot the various points. Join the plotted points by straight lines. The resulting figure is the required graph of time series.
Note: Do not try to fit a smooth curve through the data points.
EXAMPLE 2.16
Construct Graph from the following Time Series:
Year     1995   1996   1997   1998   1999   2000   2001   2002
Values   5      7      10     8      2      11     12     10
Solution
Step 1: Draw X-axis and Y-axis.
Step 2: Take time along X-axis and the corresponding values along Y-axis.
Step 3: Plot the various points. Join the plotted points by straight lines. The resulting figure is the required graph of time series.
[Figure: Graph of Time Series. X-axis: Years (1995 to 2002); Y-axis: Values]
Note: In 1786 William Playfair invented the line graph.
Test Yourself
Construct Graph from the following Time Series:
Year 2000 2001 2002 2003 2004 2005 2006 2007

Values 8 9 15 12 10 9 11 7
False Base Line or the broken line
In all the above graphs and diagrams, if the horizontal scale is started from zero, it would not only be
difficult to accommodate the whole data on the graph paper, but the graph would also be pushed to the right of the
paper. To avoid this, a false base line is used. With a false base line, instead of showing the entire
horizontal scale from zero to the highest value involved, only that portion of the scale is shown
which serves the purpose. Thus the portion of the scale from zero to the minimum value is
omitted.
[Figure: Graph of Time Series with a broken horizontal axis. X-axis: Years (1995 to 2002); Y-axis: Values]
Note: The broken line has been used along the horizontal line to indicate that we are not showing the numbers between 0 and 1995.
The Difference between Bar Charts and Histograms
 Here is the main difference between bar charts and histograms. With bar charts,
each column represents a group defined by a categorical variable; and with
histograms, each column represents a group defined by a quantitative variable.
 It is always appropriate to talk about the skewness of a histogram; that is, the
tendency of the observations to fall more on the low end or the high end of the X
axis.
 With bar charts, however, the X axis does not have a low end or a high end; because
the labels on the X axis are categorical - not quantitative. As a result, it is less
appropriate to comment on the skewness of a bar chart.
Following differences may be noted between diagrams and graphs.
 In the construction of a graph, graph paper is used. A graph helps to study the
mathematical relation between two variables such as price and demand; income and
consumption, time and population etc. On the other hand, diagrams are generally
constructed on a plain paper. A diagram is used for sake of comparison but not for
studying the relation between two variables.
 Graphs are more precise and accurate than diagrams. They are more helpful to a
researcher for studying the relationship between two variables and for further
statistical analysis and interpretation. Diagrams furnish only approximate
information on the problem under study. These are not of much use to a researcher
for further analysis.
 Graphs are used to present time series data and frequency distributions. Diagrams
are useful in presenting qualitative data. Presentation of data through graphs is
easier than through diagrams.
Sharpen your Pencil
MCQ’s
(1) Class boundaries of 2.5 – 3.5 is ______
(A) 2.45 – 3.55   (B) 2.4 – 3.6   (C) 20 – 40   (D) None of these
(2) Class boundaries of 2 – 5 is ______
(A) 2.45 – 5.55   (B) 2.5 – 5.5   (C) 1.5 – 5.5   (D) None of these
(3) Class Mark of 2 – 5 is ______
(A) 4.5   (B) 3.5   (C) 5.5   (D) None of these
(4) Class width of 2 – 3, 4 – 5, 6 – 7 is ______
(A) 1   (B) 2   (C) 1.5   (D) None of these
(5) Class width of 2.45 – 5.45 is ______
(A) 4   (B) 5   (C) 3   (D) None of these
(6) Class boundaries of 2.05 – 3.05 is ______
(A) 2.045 – 3.055   (B) 2.045 – 3.065   (C) 2 – 3   (D) None of these
(7) For construction of an ogive we find _____ frequencies.
(A) Relative   (B) Cumulative   (C) Percentage   (D) None of these
(8) For relative frequency we divide the class frequency by _____
(A) 100   (B) Sum of frequencies   (C) Cumulative Frequency   (D) None of these
(9) The Graph of adjacent rectangles is called _____
(A) Frequency curve   (B) Histogram   (C) Ogive   (D) None of these
(10) For construction of more than ogive we draw _____ on x-axis.
(A) Class boundaries   (B) Upper class boundaries   (C) Lower class boundaries   (D) None of these
(11) For relative frequency we divide the class frequency by _____
(A) 100   (B) Sum of frequencies   (C) Cumulative Frequency   (D) None of these
(12) Number of classes is equal to the range divided by _____
(A) 100   (B) Class interval   (C) Mid-point   (D) None of these
(13) For construction of less than ogive we draw _____ on x-axis.
(A) Class boundaries   (B) Upper class boundaries   (C) Lower class boundaries   (D) None of these
(14) The process of arranging data into rows and columns is called _____
(A) Classification   (B) Tabulation   (C) Both A & B   (D) None of these
(15) A frequency curve is _____
(A) Horizontal line   (B) Straight line   (C) Smoothed graph   (D) None of these
(16) An ogive is of _____
(A) 4 types   (B) 3 types   (C) 2 types   (D) None of these
(17) Data classified by attributes is called _____
(A) Quantitative data   (B) Qualitative data   (C) Time series data   (D) None of these
Exercise
Short Questions
Q.2.01. Define Primary and Secondary data.
Q.2.02. Define Classification and Tabulation.
Q.2.03. What is a frequency distribution?
Q.2.04. What are the methods for collecting primary data?
Q.2.05. Draw a Pie Chart and Bar Chart for the following data:
Items         Food   Clothing   Bills   Rent   Misc.
Expenditure   60     10         12      10     8
Q.2.06. What points should be considered in the construction of a frequency distribution?
Q.2.07. Represent the following frequency distribution by frequency polygon:
Classes   1-3   4-6   7-9   10-12   13-15
f         1     2     10    5       3
Q.2.08. Calculate cumulative frequencies and relative frequencies for the following data:
Classes   1-3   4-6   7-9   10-12   13-15
f         1     2     10    5       3
Q.2.09. Explain the use of Pie chart in presenting statistical data.
Q.2.10. Draw a sub-divided bar chart for the following data:
Month   Wheat   Maize   Rice
March   80      110     90
April   100     80      90
May     120     130     60
Q.2.11. What is a Frequency Curve? Draw a frequency curve from the following data:
Classes   10-19   20-29   30-39   40-49   50-59   60-69
f         8       15      30      21      12      5
Exercise
Long Questions
Q.2.01. Construct a frequency distribution using class-intervals of 5. Indicate the class-boundaries and class-limits clearly:
79.4   71.6   95.5   73.0   74.2   81.8   90.6   72.1
55.9   75.2   81.9   68.9   74.2   80.7   65.7   71.6
67.6   82.9   88.1   77.8   69.4   83.2   82.7   59.4
73.8   64.2   63.9   58.3   48.6   83.5   70.8   77.6
Q.2.02. Classify the data taking class intervals of size (Width) one:
1   3   4   5   6   2   3   4   6   3
5   4   2   8   4   2   4   5   3   5
4   5   1   0   2   3   4   5   3   4
3   2   5   5   3   5   4   6   5   6
3   4   6   1   2   4   4   3   4   5
3   7   4   6   3   4   5   7   7   4
Q.2.03. Construct a frequency table for the following data by using 0.02 as the width of the class-intervals.
0.27   0.22   0.20   0.24   0.26   0.27   0.28   0.28   0.27   0.29
0.27   0.27   0.27   0.27   0.22   0.24   0.26   0.27   0.29   0.29
0.30   0.35   0.33   0.26   0.31   0.26   0.30   0.27   0.31   0.32
0.26   0.23   0.24   0.22   0.23   0.25   0.23   0.25   0.27   0.30
Q.2.04. The following responses were obtained when 49 randomly selected residents of a small city were asked the question “How safe do you think your neighborhood is for kids?”
very, very, not sure, very, not very, not sure, not at all
very, not sure, somewhat, very, not at all, not very, not very
very, very, very, not very, somewhat, somewhat, very
very, very, not sure, not at all, not very, very, not very
very, not sure, very, not very, very, very, very
not very, somewhat, somewhat, somewhat, very, very, very
not very, not at all, very, somewhat, very, somewhat, very
Construct a frequency distribution, Bar chart and Pie chart for this data.
Q.2.05. Construct a frequency table for the following data by using 10 as the width of the class-intervals.
76   70   54   70   104   58   88   94   89   57
86   62   58   73   103   90   84   90   88   59
84   63   65   72   101   56   87   92   60   87
83   69   57   71   102   57   83   93   61   86
Also construct a Histogram, Frequency Polygon and Frequency Curve.
Q.2.06. Find Relative Frequencies and Percentage Relative Frequencies for the Data in Question 2.04.
CHAPTER 03
Measures of Central Tendency
Chapter Contents
You should read this chapter if you need to learn about:
 Types of Measures of Central Tendency
 Arithmetic Mean
 Properties of Arithmetic Mean
 Change of Origin and Scale
 Weighted Arithmetic Mean
 Geometric Mean
 Harmonic Mean
 Relationship between Arithmetic Mean, Geometric Mean and Harmonic Mean
 Mode
 Median
 Symmetrical Distribution
 Empirical Relation between Mean, Median and Mode
 Quartiles, Deciles and Percentiles
 Main Objects of Averages
 Requisites (desirable qualities) of a Good Average
 Uses of Averages in Different Situations
 Prove that: $\sum (x_i - \bar{x})^2 < \sum (x_i - A)^2$
Usually, when two or more data sets are to be compared, it is necessary to condense the data; but for
comparison, condensing a data set into a frequency distribution and presenting it visually are not enough. It is
then necessary to summarize the data set in a single value. Such a value usually lies somewhere in the center
and represents the entire data set, and hence it is called a measure of central tendency or an average. Since a
measure of central tendency (i.e. an average) indicates the location or the general position of the
distribution on the X-axis, it is also known as a measure of location or position.
Types of Measure of Central Tendency
The types are:
 Arithmetic Mean
 Geometric Mean
 Harmonic Mean
 Median
 Mode
Arithmetic Mean or Simply Mean
“A value obtained by dividing the sum of all the observations by the number of observations is called arithmetic mean”
$\text{Mean} = \dfrac{\text{Sum of All the Observations}}{\text{Number of Observations}}$
The mean is that central point where the sum of the negative deviations (absolute value)
from the mean and the sum of the positive deviations from the mean are equal. This is why
the mean is considered a measure of central tendency.
Methods of Finding Arithmetic Mean
There are three methods:
 Direct Method
 Short-cut Method
 Step-deviation Method (or Coding Method or Short Method)

Direct Method
Ungrouped data: $\bar{x} = \dfrac{\sum x_i}{n}$
Grouped data: $\bar{x} = \dfrac{\sum fx}{n}$; here $n = \sum f$

Short-cut Method
Ungrouped data: $\bar{x} = A + \dfrac{\sum D}{n}$
Grouped data: $\bar{x} = A + \dfrac{\sum fD}{n}$; here $n = \sum f$
Where $D = X_i - A$ and A is the provisional or assumed mean.

Step-deviation Method
Ungrouped data: $\bar{x} = A + \dfrac{\sum u}{n} \times h$
Grouped data: $\bar{x} = A + \dfrac{\sum fu}{n} \times h$; here $n = \sum f$
Where $u = \dfrac{X_i - A}{h}$ and h is the common width of the class intervals.
EXAMPLE 3.01
Find A.M from the following data: (ungrouped data)
2, 4, 6, 8, 10
Solution
Direct Method:
X
2
4
6
8
10
Total 30
$\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{30}{5} = 6.0$
Note: The Arithmetic mean is simply called Mean. We denote the Mean by $\bar{x}$ (read as “X Bar”).

Short-cut Method: (Let A = 4)
X       D = Xi - A
2       -2
4       0
6       2
8       4
10      6
Total 30   10
$\bar{x} = A + \dfrac{\sum D}{n} = 4 + \dfrac{10}{5} = 6.0$
Note: When computing the mean, round it off to one more decimal place than the original data values. For example, if the data are given in whole numbers, then the mean should be rounded off to the nearest tenth. If the data are given in tenths, then the mean should be rounded off to the nearest hundredth, and so on.

Step-deviation Method: (Here h = 2 and let A = 8)
X       u = (Xi - A)/h
2       -3
4       -2
6       -1
8       0
10      1
Total 30   -5
$\bar{x} = A + \dfrac{\sum u}{n} \times h = 8 + \dfrac{(-5)}{5} \times 2 = 6.0$
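The three methods of Example 3.01 can be compared side by side in a short Python sketch. The function names are illustrative only and are not part of the text; each function simply follows the corresponding ungrouped-data formula.

```python
def mean_direct(values):
    """Direct method: sum of the observations divided by their number."""
    return sum(values) / len(values)


def mean_short_cut(values, assumed_mean):
    """Short-cut method: A + (sum of D) / n, where D = x - A."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


def mean_step_deviation(values, assumed_mean, width):
    """Step-deviation method: A + (sum of u) / n * h, where u = (x - A) / h."""
    coded = [(x - assumed_mean) / width for x in values]
    return assumed_mean + sum(coded) / len(values) * width


data = [2, 4, 6, 8, 10]
print(mean_direct(data))                # 6.0
print(mean_short_cut(data, 4))          # 4 + 10/5 = 6.0
print(mean_step_deviation(data, 8, 2))  # 8 + (-5/5) * 2 = 6.0
```

All three calls give the same mean, whatever provisional mean A is chosen; the short-cut and step-deviation methods only make the hand arithmetic lighter.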
EXAMPLE 3.02
Find A.M from the following data: (Discrete Grouped data)
X   10   15   20   25   30
f   1    2    3    2    1
Solution
Direct Method:
X       f    fX
10      1    10
15      2    30
20      3    60
25      2    50
30      1    30
Total   9    180
$\bar{x} = \dfrac{\sum fx}{n} = \dfrac{180}{9} = 20.0$ (Here $n = \sum f = 9$)
Note: In grouped data the number of observations “n” is equal to $\sum f$.

Short-cut Method: (Here A = 20 and $n = \sum f = 9$)
X       f    D = Xi - A   fD
10      1    -10          -10
15      2    -5           -10
20      3    0            0
25      2    5            10
30      1    10           10
Total   9    --           0
$\bar{x} = A + \dfrac{\sum fD}{n} = 20 + \dfrac{0}{9} = 20.0$

Step-deviation Method: (Here A = 20, h = 5 and $n = \sum f = 9$)
X       f    u = (Xi - A)/h   fu
10      1    -2               -2
15      2    -1               -2
20      3    0                0
25      2    1                2
30      1    2                2
Total   9    --               0
$\bar{x} = A + \dfrac{\sum fu}{n} \times h = 20 + \dfrac{0}{9} \times 5 = 20.0$
EXAMPLE 3.03
Find A.M from the following data: (Continuous Grouped data)
Weight   11-20   21-30   31-40   41-50   51-60
f        1       2       3       2       1
Solution
Direct Method:
Weight   f    X (mid points)   fX
11-20    1    15.5             15.5
21-30    2    25.5             51.0
31-40    3    35.5             106.5
41-50    2    45.5             91.0
51-60    1    55.5             55.5
Total    9    --               319.5
$\bar{x} = \dfrac{\sum fx}{n} = \dfrac{319.50}{9} = 35.50$ (here $n = \sum f = 9$)

Short-cut Method: (Here A = 35.5 and $n = \sum f = 9$)
Weight   f    X (mid points)   D = Xi - A   fD
11-20    1    15.5             -20          -20
21-30    2    25.5             -10          -20
31-40    3    35.5             0            0
41-50    2    45.5             10           20
51-60    1    55.5             20           20
Total    9    --               --           0
$\bar{x} = A + \dfrac{\sum fD}{n} = 35.5 + \dfrac{0}{9} = 35.50$

Step-deviation Method: (Here A = 35.5, h = 10 and $n = \sum f = 9$)
Weight   f    X (mid points)   u = (Xi - A)/h   fu
11-20    1    15.5             -2               -2
21-30    2    25.5             -1               -2
31-40    3    35.5             0                0
41-50    2    45.5             1                2
51-60    1    55.5             2                2
Total    9    --               --               0
$\bar{x} = A + \dfrac{\sum fu}{n} \times h = 35.5 + \dfrac{0}{9} \times 10 = 35.50$
Note: The concept of Arithmetic Mean was first used by Greek astronomers in the third century BC. But in 1755, Thomas Simpson officially proposed the use of Arithmetic Mean.
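For continuous grouped data the only extra step is finding the mid-points. The sketch below repeats the direct method of Example 3.03 in Python; the variable names are illustrative, and each mid-point is taken as the average of the two class limits.

```python
# Class limits and frequencies from Example 3.03
classes = [(11, 20), (21, 30), (31, 40), (41, 50), (51, 60)]
frequencies = [1, 2, 3, 2, 1]

# Mid-point (class mark) = (lower limit + upper limit) / 2
mid_points = [(lower + upper) / 2 for lower, upper in classes]

n = sum(frequencies)  # n = sum of f = 9
sum_fx = sum(f * x for f, x in zip(frequencies, mid_points))  # 319.5

grouped_mean = sum_fx / n
print(grouped_mean)  # 35.5
```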
Test Yourself
Find the A.M from the following data:
1) 1, 3, 5, 7, 9, 11, 13, 15
2)
X   20   25   30   35   40
f   2    4    9    3    1
3)
Weight   21-30   31-40   41-50   51-60   61-70
f        1       3       5       4       2
Note: To find the Mean of the population use the following formula: $\mu = \dfrac{\sum x}{N}$, where $\mu$ is read as “mu”.
Properties of Arithmetic Mean
The following are the properties of arithmetic mean:
 The mean of a constant is the same constant.
For the data 4, 4, 4, 4, 4:
$\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{4 + 4 + 4 + 4 + 4}{5} = \dfrac{20}{5} = 4.0$

 The sum of deviations from the mean is equal to zero, i.e. $\sum (X_i - \bar{X}) = 0$
For the data 2, 4, 6, 8, 10: $\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{30}{5} = 6.0$
X       $(X_i - \bar{X})$
2       -4
4       -2
6       0
8       2
10      4
Total 30   0
Hence $\sum (X_i - \bar{X}) = 0$

 The sum of squared deviations from the mean is smaller than the sum of squared deviations
from any arbitrary value or provisional mean, i.e. $\sum (x_i - \bar{x})^2 < \sum (x_i - A)^2$
For the data 2, 4, 6, 8, 10: $\bar{x} = 6.0$ and let A = 4
X       $(X_i - \bar{X})$   $(X_i - \bar{X})^2$   $(X_i - A)$   $(X_i - A)^2$
2       -4                  16                    -2            4
4       -2                  4                     0             0
6       0                   0                     2             4
8       2                   4                     4             16
10      4                   16                    6             36
Total 30   --               40                    --            60
Hence $\sum (X_i - \bar{X})^2 = 40 < \sum (X_i - A)^2 = 60$

 The arithmetic mean is affected by the change of origin and scale, i.e. when a constant is
added to or subtracted from each value of a variable, or if each value of a variable is
multiplied or divided by a constant, then the arithmetic mean is affected by these changes.
Variable        Mean
$X_i$           $\bar{X}$
$X_i \pm a$     $\bar{X} \pm a$
$aX_i$          $a\bar{X}$
$X_i / a$       $\bar{X} / a$
For the data 2, 4, 6, 8, 10: $\bar{x} = \dfrac{30}{5} = 6.0$. Let $Y_i = 2X_i + 3$ (a = 3, b = 2)
X       Y = 2X + 3
2       7
4       11
6       15
8       19
10      23
Total 30   75
Now $\bar{Y} = \dfrac{\sum Y_i}{n} = \dfrac{75}{5} = 15.0$
therefore $\bar{Y} = b\bar{X} + a = (2)(6) + 3 = 15.0$

 If k subgroups consist of $n_1, n_2, \ldots, n_k$ observations having their respective means $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$,
then the mean of all the data, or combined mean, is denoted by $\bar{x}_c$ and is defined by:
$\bar{x}_c = \dfrac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \ldots + n_k \bar{x}_k}{n_1 + n_2 + \ldots + n_k}$
For example, suppose three sections of a statistics class containing 28, 32, and 35 students averaged 83,
80 and 76 respectively on the same final examination. Then the combined mean for all 3
sections is:
$n_1 = 28$ ; $\bar{x}_1 = 83$
$n_2 = 32$ ; $\bar{x}_2 = 80$
$n_3 = 35$ ; $\bar{x}_3 = 76$
$\bar{x}_c = \dfrac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3}{n_1 + n_2 + n_3} = \dfrac{(28)(83) + (32)(80) + (35)(76)}{28 + 32 + 35} = 79.4$
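The properties listed above can be checked numerically. The following Python lines are a small sketch that repeats the worked figures (data 2, 4, 6, 8, 10 with A = 4, the transformation Y = 2X + 3, and the three class sections); the helper names are illustrative and not part of the text.

```python
data = [2, 4, 6, 8, 10]
n = len(data)
mean = sum(data) / n  # 6.0

# Property: the deviations from the mean add up to zero
sum_of_deviations = sum(x - mean for x in data)


def sum_squared_deviations(values, origin):
    """Sum of squared deviations of the values from a chosen origin."""
    return sum((x - origin) ** 2 for x in values)


# Property: squared deviations are smallest when taken about the mean
about_mean = sum_squared_deviations(data, mean)  # 40
about_a = sum_squared_deviations(data, 4)        # 60, with A = 4

# Property: if Y = bX + a then mean(Y) = b * mean(X) + a
a, b = 3, 2
y = [b * x + a for x in data]
mean_y = sum(y) / n  # 15.0


def combined_mean(sizes, means):
    """Combined mean of subgroups with the given sizes and means."""
    total = sum(size * m for size, m in zip(sizes, means))
    return total / sum(sizes)


sections = combined_mean([28, 32, 35], [83, 80, 76])
print(sum_of_deviations, about_mean, about_a, mean_y, round(sections, 1))
```

Trying other values of A in `sum_squared_deviations` always gives a total larger than 40, except when A equals the mean itself.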
Change of Origin
If we add a constant to each value of a variable or subtract a constant from each value of a variable,
then this is called change of origin. The arithmetic mean is affected by these changes but the
standard deviation (discussed in Chapter 04) is independent of these changes. For example:
Old Variable        New Variable
X       X²      Y = X + 3   Y²
0       0       3           9
3       9       6           36
6       36      9           81
9       81      12          144
12      144     15          225
Total 30   270   45         495
$\text{Mean}(x) = \dfrac{\sum x}{n} = \dfrac{30}{5} = 6$ and $\text{Mean}(y) = \dfrac{\sum y}{n} = \dfrac{45}{5} = 9$
$S.D(x) = \sqrt{\dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2} = \sqrt{\dfrac{270}{5} - \left(\dfrac{30}{5}\right)^2} = 4.2$
$S.D(y) = \sqrt{\dfrac{\sum y^2}{n} - \left(\dfrac{\sum y}{n}\right)^2} = \sqrt{\dfrac{495}{5} - \left(\dfrac{45}{5}\right)^2} = 4.2$
The following figure illustrates the idea of change of origin:
[Figure: the original distribution on the scale 0, 3, 6, 9, 12 with Mean = 6 and S.D = 4.2, and the shifted distribution on the scale 3, 6, 9, 12, 15 with New Mean = 9 and New S.D = 4.2. Caption: “I have just changed my position on the X-axis.”]
It is now clear that if we change the origin by adding “3” to each value of the variable, then the A.M will
be affected by this change but the S.D will not be changed, i.e.
$\text{New Mean} = \text{Old Mean} + 3 = 6 + 3 = 9$ and $\text{New S.D} = \text{Old S.D} = 4.2$
Change of Scale
If each value of a variable is multiplied or divided by a constant, then this is called change of scale. The
arithmetic mean and standard deviation are both affected by these changes. For example:
Old Variable        New Variable
X       X²      Y = X/3   Y²
0       0       0         0
3       9       1         1
6       36      2         4
9       81      3         9
12      144     4         16
Total 30   270   10       30
$\text{Mean}(x) = \dfrac{\sum x}{n} = \dfrac{30}{5} = 6$ and $\text{Mean}(y) = \dfrac{\sum y}{n} = \dfrac{10}{5} = 2$
$S.D(x) = \sqrt{\dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2} = \sqrt{\dfrac{270}{5} - \left(\dfrac{30}{5}\right)^2} = 4.2$
$S.D(y) = \sqrt{\dfrac{\sum y^2}{n} - \left(\dfrac{\sum y}{n}\right)^2} = \sqrt{\dfrac{30}{5} - \left(\dfrac{10}{5}\right)^2} = 1.4$
The following figure illustrates the idea of change of scale:
[Figure: the original distribution on the scale 0, 3, 6, 9, 12 with Mean = 6 and S.D = 4.2, and the rescaled distribution on the scale 0, 1, 2, 3, 4 with New Mean = 2 and New S.D = 1.4. Caption: “I have changed the scale; the original is wider than me.”]
It is now clear that if we change the scale by dividing each value of the variable by “3”, then both the A.M
and S.D will be affected by this change, such that:
$\text{New Mean} = \dfrac{\text{Old Mean}}{3} = \dfrac{6}{3} = 2$ and $\text{New S.D} = \dfrac{\text{Old S.D}}{3} = \dfrac{4.2}{3} = 1.4$
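Both effects can be verified together with a few lines of Python. This is a sketch only: the function names are illustrative, and the standard deviation uses the same formula as the worked examples above, i.e. the square root of the mean of squares minus the squared mean.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def std_dev(values):
    """S.D = sqrt(sum(x^2)/n - (sum(x)/n)^2), as used in the worked examples."""
    n = len(values)
    return sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


x = [0, 3, 6, 9, 12]
shifted = [v + 3 for v in x]  # change of origin: Y = X + 3
scaled = [v / 3 for v in x]   # change of scale:  Y = X / 3

print(round(mean(x), 1), round(std_dev(x), 1))              # 6.0 4.2
print(round(mean(shifted), 1), round(std_dev(shifted), 1))  # 9.0 4.2
print(round(mean(scaled), 1), round(std_dev(scaled), 1))    # 2.0 1.4
```

Adding a constant moves the mean but leaves the spread untouched, while dividing by a constant shrinks both the mean and the standard deviation by the same factor.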
Merits and Demerits of Arithmetic Mean
Merits
 The A.M is clearly defined by a mathematical formula.
 It is based on all the observations in the data and is easy to calculate.
 It is capable of further algebraic treatment.
 It is always unique, i.e. a set of data has only one mean.
 It is a relatively stable statistic with the fluctuations of sampling.
 It provides a basis for statistical inference.
Demerits
 It is greatly affected by extreme values in the data.
 It cannot be calculated for qualitative data.
 If the grouped data have “open-end” classes, the mean cannot be accurately computed.
 It is not an appropriate average for a highly skewed distribution.
Weighted Arithmetic Mean
Up till now we have discussed the simple A.M, or in other words the un-weighted
A.M. In calculating the arithmetic mean we assume that the values of a variable
have equal importance. But it is not necessary that all the values have the same
relative importance. Thus whenever it is required to find the mean of certain
variables which are not of equal importance, we assign certain numerical
quantities to these variables, which express their relative importance. Such
numerical quantities are technically called weights.
So it is obvious that we would modify the formula of the simple A.M and apply
the formula of the weighted A.M, i.e.
$\bar{X}_w = \dfrac{\sum wx}{\sum w}$
EXAMPLE 3.04
Calculate the weighted mean from the following data:
Item           Expenditure (X)   Weights (W)
Food           290               7.5
Rent           54                2.0
Clothing       98                1.5
Fuel & Light   75                1.0
Cosmetics      75                0.5
Solution
Since $\bar{X}_w = \dfrac{\sum wx}{\sum w}$
Item           Expenditure (x)   Weights (w)   wx
Food           290               7.5           2175.0
Rent           54                2.0           108.0
Clothing       98                1.5           147.0
Fuel & Light   75                1.0           75.0
Cosmetics      75                0.5           37.5
Total          --                12.5          2542.5
Therefore $\bar{X}_w = \dfrac{\sum wx}{\sum w} = \dfrac{2542.5}{12.5} = 203.4$
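The same calculation can be written as a short Python sketch. The function name `weighted_mean` is illustrative only; it applies the weighted A.M formula to the figures of Example 3.04.

```python
def weighted_mean(values, weights):
    """Weighted A.M = sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Expenditure and weights from Example 3.04
expenditure = [290, 54, 98, 75, 75]
weights = [7.5, 2.0, 1.5, 1.0, 0.5]

result = weighted_mean(expenditure, weights)
print(round(result, 1))  # 203.4

# With equal weights the weighted mean reduces to the simple A.M
simple = weighted_mean(expenditure, [1, 1, 1, 1, 1])
print(simple)  # 118.4
```

The weighted mean (203.4) is far above the simple mean (118.4) because Food, the largest expenditure, also carries the largest weight.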
Test Yourself
Calculate the weighted mean from the following data:
Item Expenditure (X) Weights (W)

Food 390 9.5
Rent 44 3.0
Clothing 199 2.5
Fuel & Light 67 3.8
Other items 85 5.5
Geometric Mean
“The nth root of the product of “n” positive values is called geometric mean”
$\text{Geometric Mean} = \sqrt[n]{\text{Product of } n \text{ Positive Values}}$
The following are the formulae of geometric mean:
Ungrouped data: $G = \text{Antilog}\left(\dfrac{\sum \log x}{n}\right)$
Grouped data: $G = \text{Antilog}\left(\dfrac{\sum f \log x}{n}\right)$; here $n = \sum f$
EXAMPLE 3.05
Find geometric mean from the following data: (ungrouped data)
5, 8, 10, 12, 15
Solution
X       log X
5       0.6990
8       0.9031
10      1.0000
12      1.0792
15      1.1761
Total   4.8573
$G = \text{Antilog}\left(\dfrac{\sum \log x}{n}\right) = \text{Antilog}\left(\dfrac{4.8573}{5}\right) = 9.4$
EXAMPLE 3.06
Find G.M from the following data: (Discrete Grouped data)
X   13   14   15   16   17
f   2    5    13   7    3
Solution
X       f    log X    f log X
13      2    1.1139   2.2279
14      5    1.1461   5.7306
15      13   1.1761   15.2892
16      7    1.2041   8.4288
17      3    1.2304   3.6913
Total   30   --       35.3679
$G = \text{Antilog}\left(\dfrac{\sum f \log x}{n}\right) = \text{Antilog}\left(\dfrac{35.3679}{30}\right) = 15.1$
EXAMPLE 3.07
Find G.M from the following data: (Continuous Grouped data)
Weights   65-84   85-104   105-124   125-144   145-164   165-184   185-204
f         9       10       17        10        5         4         5
Solution
Weights   f    X       log X    f log X
65-84     9    74.5    1.8722   16.8494
85-104    10   94.5    1.9754   19.7543
105-124   17   114.5   2.0588   34.9997
125-144   10   134.5   2.1287   21.2872
145-164   5    154.5   2.1889   10.9446
165-184   4    174.5   2.2418   8.9672
185-204   5    194.5   2.2889   11.4446
Total     60   --      --       124.2470
$G = \text{Antilog}\left(\dfrac{\sum f \log x}{n}\right) = \text{Antilog}\left(\dfrac{124.2470}{60}\right) = 117.7$
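The logarithm tables in Examples 3.05 to 3.07 can be reproduced with a short Python sketch. The function name is illustrative only; taking the antilog is done here by raising 10 to the mean of the logarithms, and ungrouped data are handled by giving every value a frequency of 1.

```python
from math import log10


def geometric_mean_grouped(values, frequencies):
    """G = Antilog(sum(f * log x) / n), where n = sum(f)."""
    n = sum(frequencies)
    mean_log = sum(f * log10(x) for x, f in zip(values, frequencies)) / n
    return 10 ** mean_log  # antilog to base 10


# Example 3.05 (ungrouped: every frequency is 1)
g1 = geometric_mean_grouped([5, 8, 10, 12, 15], [1, 1, 1, 1, 1])

# Example 3.06 (discrete grouped data)
g2 = geometric_mean_grouped([13, 14, 15, 16, 17], [2, 5, 13, 7, 3])

# Example 3.07 (continuous grouped data, using the mid-points as x)
mid_points = [74.5, 94.5, 114.5, 134.5, 154.5, 174.5, 194.5]
g3 = geometric_mean_grouped(mid_points, [9, 10, 17, 10, 5, 4, 5])

print(round(g1, 1), round(g2, 1), round(g3, 1))
```

The function fails for zero or negative values because their logarithms are undefined, which matches the demerits of the geometric mean.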
Test Yourself
Find the G.M from the following data:
1) 1, 3, 5, 7, 9, 11, 13, 15
2)
X   20   25   30   35   40
f   2    4    9    3    1
3)
Weight   21-30   31-40   41-50   51-60   61-70
f        1       3       5       4       2
Merits and Demerits of Geometric Mean
Merits
 The G.M is clearly defined by a mathematical formula.
 It is unique and based on all the observations.
 It is comparatively less affected by extreme values as compared to A.M.
 It gives equal weight to all the observations and is not much affected by fluctuations of sampling.
Demerits
 It is neither easy to calculate nor simple to understand.
 It vanishes if any observation is zero.
 It cannot be calculated for qualitative data.
 In case of negative values, it cannot be computed at all.
 If the grouped data have “open-end” classes, the geometric mean cannot be accurately computed.
Harmonic Mean
“The reciprocal of the arithmetic mean of the reciprocals of the values is called harmonic mean”
$\text{Harmonic Mean} = \text{Reciprocal of } \left(\dfrac{\text{Sum of Reciprocals of the Values}}{\text{The Number of Values}}\right)$
The following are the formulae of harmonic mean:
Ungrouped data: $H = \dfrac{n}{\sum \left(\dfrac{1}{x}\right)}$
Grouped data: $H = \dfrac{n}{\sum \left(\dfrac{f}{x}\right)}$; here $n = \sum f$
EXAMPLE 3.08
Find Harmonic mean from the following data: (ungrouped data)
5, 8, 10, 12, 15
Solution
X       1/X
5       0.2000
8       0.1250
10      0.1000
12      0.0833
15      0.0667
Total   0.5750
$H = \dfrac{n}{\sum \left(\dfrac{1}{x}\right)} = \dfrac{5}{0.5750} = 8.7$
Note: In 1874, William Stanley Jevons introduced the Geometric Mean and Harmonic Mean.
EXAMPLE 3.09
Find Harmonic mean from the following data: (Discrete Grouped data)
X   13   14   15   16   17
f   2    5    13   7    3
Solution
X       f    f/X
13      2    0.1538
14      5    0.3571
15      13   0.8667
16      7    0.4375
17      3    0.1765
Total   30   1.9916
$H = \dfrac{n}{\sum \left(\dfrac{f}{x}\right)} = \dfrac{30}{1.9916} = 15.1$
EXAMPLE 3.10
Find H.M from the following data: (Continuous Grouped data)
Weights   65-84   85-104   105-124   125-144   145-164   165-184   185-204
f         9       10       17        10        5         4         5
Solution
Weights   f    X       f/X
65-84     9    74.5    0.1208
85-104    10   94.5    0.1058
105-124   17   114.5   0.1485
125-144   10   134.5   0.0743
145-164   5    154.5   0.0324
165-184   4    174.5   0.0229
185-204   5    194.5   0.0257
Total     60   --      0.5304
$H = \dfrac{n}{\sum \left(\dfrac{f}{x}\right)} = \dfrac{60}{0.5304} = 113.1$
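Examples 3.08 to 3.10 follow one pattern, which the Python sketch below captures. The function name is illustrative only; as with the geometric mean, ungrouped data are treated as grouped data in which every frequency is 1.

```python
def harmonic_mean_grouped(values, frequencies):
    """H = n / sum(f / x), where n = sum(f)."""
    n = sum(frequencies)
    return n / sum(f / x for x, f in zip(values, frequencies))


# Example 3.08 (ungrouped: every frequency is 1)
h1 = harmonic_mean_grouped([5, 8, 10, 12, 15], [1, 1, 1, 1, 1])

# Example 3.09 (discrete grouped data)
h2 = harmonic_mean_grouped([13, 14, 15, 16, 17], [2, 5, 13, 7, 3])

# Example 3.10 (continuous grouped data, using the mid-points as x)
mid_points = [74.5, 94.5, 114.5, 134.5, 154.5, 174.5, 194.5]
h3 = harmonic_mean_grouped(mid_points, [9, 10, 17, 10, 5, 4, 5])

print(round(h1, 1), round(h2, 1), round(h3, 1))
```

A zero among the values raises a division error, which mirrors the demerit that the harmonic mean cannot be determined if any value is zero.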
Test Yourself
Find the H.M from the following data:
1) 1, 3, 5, 7, 9, 11, 13, 15
2)
X   20   25   30   35   40
f   2    4    9    3    1
3)
Weight   21-30   31-40   41-50   51-60   61-70
f        1       3       5       4       2
Merits and Demerits of Harmonic Mean
Merits
 The H.M is clearly defined by a mathematical formula.
 It is unique and based on all the observations.
 It is capable of further algebraic treatment.
 It is comparatively less affected by extreme values as compared to A.M and G.M.
 It is not much affected by fluctuations of sampling.
Demerits
 It is neither easy to calculate nor simple to understand.
 It cannot be determined if any value is zero.
 It cannot be calculated for qualitative data.
 If the grouped data have “open-end” classes, the harmonic mean cannot be accurately computed.
Notes:
 If any value of the data is negative, then the G.M will become ill-defined and the remaining two averages may relate to each other inversely, i.e. H.M > A.M.
 If any value of the data is zero, then the H.M will become ill-defined and the G.M will be zero.
 The A.M, G.M and H.M of two values “a” and “b” are:
$A.M = \dfrac{a + b}{2}$
$G.M = (a \times b)^{1/2}$
$H.M = \dfrac{2ab}{a + b}$
Relationship between Arithmetic Mean, Geometric Mean and Harmonic Mean
 A.M > G.M > H.M (for positive values that are not all equal)
 The three averages are exactly equal if the data set is constant, i.e. A.M = G.M = H.M
 $(G.M)^2 = (A.M) \times (H.M)$ (this holds exactly for two observations)

Consider the data: 2, 4, 6, 8, and 10
X       log X    1/X
2       0.3010   0.5000
4       0.6021   0.2500
6       0.7782   0.1667
8       0.9031   0.1250
10      1.0000   0.1000
Total 30   3.5844   1.1417
$\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{30}{5} = 6$
$G = \text{Antilog}\left(\dfrac{\sum \log x}{n}\right) = \text{Antilog}\left(\dfrac{3.5844}{5}\right) = 5.2$
$H = \dfrac{n}{\sum \left(\dfrac{1}{x}\right)} = \dfrac{5}{1.1417} = 4.4$
Hence it is clear that: A.M > G.M > H.M
Note: In 1970, the relationship between Arithmetic Mean, Geometric Mean and Harmonic Mean was described by D.S. Mitrinovic.

Consider the data: 10, 10, 10, 10, and 10
X       log X   1/X
10      1       0.1
10      1       0.1
10      1       0.1
10      1       0.1
10      1       0.1
Total 50   5    0.5
$\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{50}{5} = 10$
$G = \text{Antilog}\left(\dfrac{\sum \log x}{n}\right) = \text{Antilog}\left(\dfrac{5}{5}\right) = 10$
$H = \dfrac{n}{\sum \left(\dfrac{1}{x}\right)} = \dfrac{5}{0.5} = 10$
Hence it is clear that: A.M = G.M = H.M

The A.M of two observations is 127.5 and their G.M is 60. Find their H.M.
A.M = 127.5
G.M = 60
H.M = ?
$(G.M)^2 = (A.M) \times (H.M) \Rightarrow H.M = \dfrac{(G.M)^2}{A.M} = \dfrac{(60)^2}{127.5} = 28.2$
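The relationships above can be confirmed with a brief Python sketch. The function names are illustrative. The pair 15 and 240 used for the last problem is not given in the text: it is obtained here by solving a + b = 255 and ab = 3600, which are the two conditions implied by A.M = 127.5 and G.M = 60.

```python
from math import prod


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # nth root of the product of n positive values
    return prod(values) ** (1 / len(values))


def harmonic_mean(values):
    return len(values) / sum(1 / x for x in values)


# A.M > G.M > H.M for positive values that are not all equal
data = [2, 4, 6, 8, 10]
am, gm, hm = arithmetic_mean(data), geometric_mean(data), harmonic_mean(data)
print(round(am, 1), round(gm, 1), round(hm, 1))  # 6.0 5.2 4.4

# For exactly two observations, (G.M)^2 = (A.M) x (H.M)
pair = [15, 240]  # assumed pair with A.M = 127.5 and G.M = 60
am2, gm2, hm2 = arithmetic_mean(pair), geometric_mean(pair), harmonic_mean(pair)
print(am2, round(gm2, 1), round(hm2, 1))  # 127.5 60.0 28.2
```

For the five-value data the product A.M × H.M is about 26.3 while (G.M)² is about 27.1, so the identity should only be relied on for two observations.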
Mode
Mode in case of Ungrouped Data
“A value that occurs most frequently in a data set is called mode”
e.g. 2, 3, 4, 2, 5, 6, 2, 7
Mode = 2
“If two or more values occur the same number of times but more frequently than the other values, then there is more than one mode”
e.g. 2, 9, 11, 9, 2, 13, 14, 7, 18
Mode = 2, 9
Note: If each value occurs the same number of times, then there is no mode.
e.g. 1, 2, 3, 4 (there is no mode)
e.g. 5, 6, 5, 7, 6, 7 (there is no mode)
 The data having one mode is called uni-modal distribution.
 The data having two modes is called bi-modal distribution.
 The data having more than two modes is called multi-modal distribution.
Mode in case of Discrete Grouped Data
“A value which has the largest frequency in a set of data is called mode”
e.g.
X   41   42   43   44   45
f   1    3    5    2    1
Mode = 43 (against the maximum frequency)
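The rules for ungrouped data can be expressed as a small Python function. This is a sketch under the conventions stated above; the function name `modes` is illustrative, and it returns an empty list when every distinct value occurs equally often (the "no mode" case).

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means there is no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    # If several distinct values all occur equally often, there is no mode
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 3, 4, 2, 5, 6, 2, 7]))           # [2]
print(modes([2, 9, 11, 9, 2, 13, 14, 7, 18]))    # [2, 9]
print(modes([1, 2, 3, 4]))                       # []
print(modes([5, 6, 5, 7, 6, 7]))                 # []
```

A result with one entry corresponds to a uni-modal distribution, two entries to a bi-modal distribution, and more than two to a multi-modal distribution.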
Mode in case of Continuous Grouped Data
In case of continuous grouped data, the mode lies in the class that carries
the highest frequency. This class is called the modal class. The formula
used to compute the value of the mode is given below:
$\text{Mode} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
Where l = lower class boundary of the modal class
h = class-width of the modal class
$f_m$ = frequency of the modal class
$f_1$ = frequency of the class preceding the modal class
$f_2$ = frequency of the class following the modal class
Note: In 1894, Karl Pearson used the term “Mode”.
EXAMPLE 3.11
Find mode from the following data:
Marks             30-39   40-49   50-59   60-69   70-79   80-89   90-99
No. of Students       8      87     190     304     211      85      20
Solution
Marks    No. of Students   Class boundaries
30-39            8           29.5-39.5
40-49           87           39.5-49.5
50-59          190           49.5-59.5
60-69          304           59.5-69.5
70-79          211           69.5-79.5
80-89           85           79.5-89.5
90-99           20           89.5-99.5
Modal class: 59.5-69.5
l = 59.5, f1 = 190, f2 = 211, fm = 304, h = 69.5 - 59.5 = 10
\[ \text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h = 59.5 + \frac{304 - 190}{(304 - 190) + (304 - 211)} \times 10 = 59.5 + \frac{114}{207} \times 10 \approx 65 \]
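The grouped-data mode formula can be sketched in Python as below. This is a hedged illustration using the marks data of Example 3.11; the function name and the convention of treating a missing neighbouring class as frequency 0 are assumptions.

```python
def grouped_mode(boundaries, freqs):
    """Mode of continuous grouped data.

    boundaries: list of (lower, upper) class boundaries
    freqs: frequency of each class, in the same order
    """
    i = freqs.index(max(freqs))                       # modal class position
    l, upper = boundaries[i]
    h = upper - l                                     # class width
    fm = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                 # assumption: 0 if no preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0    # assumption: 0 if no following class
    return l + (fm - f1) / ((fm - f1) + (fm - f2)) * h

boundaries = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5),
              (69.5, 79.5), (79.5, 89.5), (89.5, 99.5)]
freqs = [8, 87, 190, 304, 211, 85, 20]
mode_value = grouped_mode(boundaries, freqs)
print(round(mode_value, 4))
```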
Mode Graphically
• Construct a histogram from the continuous grouped data.
• Locate the modal class, i.e. the class with the highest rectangle.
• Draw a line from the top right-hand corner of the modal class rectangle to the point where the top of the adjacent rectangle on the left touches it. Similarly, join the top left-hand corner of the modal class rectangle to the point where the top of the adjacent rectangle on the right touches it.
• From the intersection of these two lines, draw a perpendicular on the X-axis.
• The mode is the point where the perpendicular meets the X-axis.
EXAMPLE 3.12
Find mode graphically from the following data:
Marks             30-39   40-49   50-59   60-69   70-79   80-89   90-99
No. of Students       8      87     190     304     211      85      20
Solution
Marks    No. of Students   Class boundaries
30-39            8           29.5-39.5
40-49           87           39.5-49.5
50-59          190           49.5-59.5
60-69          304           59.5-69.5
70-79          211           69.5-79.5
80-89           85           79.5-89.5
90-99           20           89.5-99.5
[Figure: histogram of the data, with frequency (0 to 350) on the Y-axis and class boundaries 29.5 to 99.5 on the X-axis]
Mode = 65.0072
Merits and Demerits of Mode
Merits
• It is simple to understand and easy to calculate.
• In some cases it may be obtained just by inspection.
• It is not affected by extreme values.
• It is also useful for qualitative data.
• It can be located even in open-end classes.
Demerits
• It is not clearly defined by a mathematical formula.
• It may not exist in some cases.
• It may not be unique: a data set can have more than one mode.
• It is not capable of further algebraic treatment.
• It is not based on all the observations.
• It is unsatisfactory for statistical inference.
Test Yourself
Find the Mode from the following data:
1) 1, 3, 5, 7, 7, 11, 13, 7
2) X    20   25   30   35   40
   f     2    4    9    3    1
3) Weight   21-30   31-40   41-50   51-60   61-70
   f            1       3       5       4       2
Median
“When the observations are arranged in ascending or descending order, then a value that divides a distribution into two equal parts is called the median.”
Median in case of Ungrouped Data
In this case we first arrange the observations in increasing or decreasing order, then we use the following formulae for the median:
If “n” is odd:
\[ \text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation} \]
If “n” is even:
\[ \text{Median} = \frac{\text{size of } \left[\left(\frac{n}{2}\right)\text{th} + \left(\frac{n}{2}+1\right)\text{th}\right] \text{observation}}{2} \]
EXAMPLE 3.13
Find Median from the following data:
3, 4, 5, 8, 2, 9, 7, 6, 10
Solution
Ascending order: 2, 3, 4, 5, 6, 7, 8, 9, 10 (n = 9, odd)
\[ \text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation} = \text{size of } \left(\frac{9+1}{2}\right)\text{th observation} = \text{size of 5th observation} = 6 \]
Note: The number of values above the median balances (equals) the number of values below the median, i.e. 50% of the data falls above the median and 50% falls below it.
EXAMPLE 3.14
Find Median from the following data:
13, 14, 15, 18, 12, 19, 17, 16, 10, 20
Solution
Ascending order: 10, 12, 13, 14, 15, 16, 17, 18, 19, 20 (n = 10, even)
\[ \text{Median} = \frac{\text{size of } \left[\left(\frac{n}{2}\right)\text{th} + \left(\frac{n}{2}+1\right)\text{th}\right] \text{observation}}{2} = \frac{\text{size of } \left[\left(\frac{10}{2}\right)\text{th} + \left(\frac{10}{2}+1\right)\text{th}\right] \text{observation}}{2} \]
\[ = \frac{\text{size of (5th + 6th) observation}}{2} = \frac{15 + 16}{2} = 15.5 \]
Note: The concept of Median was used by Gauss at the beginning of the 19th century.
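The odd and even cases for ungrouped data can be sketched in Python as follows. This is an illustrative sketch; the function name is an assumption, and Python's built-in `statistics.median` gives the same results.

```python
def median(values):
    """Median of ungrouped data using the (n+1)/2 rule for odd n
    and the average of the (n/2)th and (n/2 + 1)th values for even n."""
    data = sorted(values)                  # arrange in ascending order
    n = len(data)
    if n % 2 == 1:
        return data[(n + 1) // 2 - 1]      # lists are 0-based, positions are 1-based
    return (data[n // 2 - 1] + data[n // 2]) / 2

print(median([3, 4, 5, 8, 2, 9, 7, 6, 10]))
print(median([13, 14, 15, 18, 12, 19, 17, 16, 10, 20]))
```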
Median in case of Discrete Grouped Data
In case of discrete grouped data, first we find the cumulative frequencies and then use the following formula for the median:
\[ \text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation}; \quad \text{here } n = \sum f \]
Note: Around 1874, Francis Galton first introduced the median as a statistical concept.
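EXAMPLE 3.15
Find Median from the following data:
X    20   21   22   23   24   25
f     1    3    5    2    2    2
Solution
X        f     Cumulative Frequency
20       1       1
21       3       4
22       5       9
23       2      11
24       2      13
25       2      15
Total   15      --
\[ \text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation} = \text{size of } \left(\frac{15+1}{2}\right)\text{th observation} = \text{size of 8th observation} = 22 \]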
EXAMPLE 3.16
Find Median from the following data:
X    41   42   43   44   45   46
f     2    4    4    2    1    3
Solution
X        f     Cumulative Frequency
41       2       2
42       4       6
43       4      10
44       2      12
45       1      13
46       3      16
Total   16      --
\[ \text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation} = \text{size of } \left(\frac{16+1}{2}\right)\text{th observation} = \text{size of 8.5th observation} = 43 \]
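Locating an observation through the cumulative frequencies, as done in Examples 3.15 and 3.16, can be sketched in Python as below. The function names are assumptions, and treating a ".5" position as the average of the two neighbouring observations is one reading of the 8.5th observation above.

```python
from bisect import bisect_left
from itertools import accumulate

def kth_observation(xs, cum_freq, k):
    """Value of the k-th ordered observation (k is 1-based)."""
    # First class whose cumulative frequency reaches k.
    return xs[bisect_left(cum_freq, k)]

def discrete_grouped_median(xs, fs):
    cum_freq = list(accumulate(fs))        # cumulative frequencies
    n = cum_freq[-1]                       # n = sum of f
    if n % 2 == 1:
        return kth_observation(xs, cum_freq, (n + 1) // 2)
    lower = kth_observation(xs, cum_freq, n // 2)
    upper = kth_observation(xs, cum_freq, n // 2 + 1)
    return (lower + upper) / 2

print(discrete_grouped_median([20, 21, 22, 23, 24, 25], [1, 3, 5, 2, 2, 2]))
print(discrete_grouped_median([41, 42, 43, 44, 45, 46], [2, 4, 4, 2, 1, 3]))
```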
Median in case of Continuous Grouped Data
In continuous grouped data, when we are finding the median, we first construct the class boundaries if the classes are discontinuous. Then we find the cumulative frequencies and use the following two steps:
• First we determine the median class using n/2.
• When the median class is determined, the following formula is used to find the value of the median:
\[ \text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right); \quad \text{here } n = \sum f \]
Where l = lower class boundary of the median class
h = class-width of the median class
f = frequency of the median class
C = cumulative frequency of the class preceding the median class.
EXAMPLE 3.17
Find Median from the following data:
Marks             30-39   40-49   50-59   60-69   70-79   80-89   90-99
No. of Students       8      87     190     304     211      85      20
Solution
Step 1:
Marks    No. of Students   C.F    Class boundaries
30-39            8            8     29.5-39.5
40-49           87           95     39.5-49.5
50-59          190          285     49.5-59.5
60-69          304          589     59.5-69.5
70-79          211          800     69.5-79.5
80-89           85          885     79.5-89.5
90-99           20          905     89.5-99.5
Total          905           --        --
\[ \text{Median} = \text{size of } \left(\frac{n}{2}\right)\text{th observation} = \text{size of } \left(\frac{905}{2}\right)\text{th observation} = \text{size of 452.5th observation} \]
Since the 452.5th observation lies in the class (59.5-69.5), this is the median class.
Here l = 59.5, f = 304, C = 285, h = 10
Step 2:
\[ \text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right) = 59.5 + \frac{10}{304}\left(452.5 - 285\right) \approx 65 \]
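The two steps of Example 3.17 (find the median class from n/2, then apply the formula) can be sketched in Python as follows. The function name and argument layout are assumptions made for illustration.

```python
from itertools import accumulate

def grouped_median(boundaries, freqs):
    """Median of continuous grouped data: l + (h / f) * (n / 2 - C)."""
    cum = list(accumulate(freqs))          # cumulative frequencies
    n = cum[-1]
    # Step 1: median class = first class whose cumulative frequency reaches n/2.
    i = next(k for k, c in enumerate(cum) if c >= n / 2)
    # Step 2: apply the formula.
    l, upper = boundaries[i]
    h = upper - l
    C = cum[i - 1] if i > 0 else 0         # c.f. of the preceding class
    return l + (h / freqs[i]) * (n / 2 - C)

boundaries = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5),
              (69.5, 79.5), (79.5, 89.5), (89.5, 99.5)]
freqs = [8, 87, 190, 304, 211, 85, 20]
median_value = grouped_median(boundaries, freqs)
print(round(median_value, 2))
```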
Test Yourself
Find the Median from the following data:
1) 1, 3, 5, 7, 7, 11, 13, 7, 6
2) 30, 44, 34, 46, 55, 47, 20, 58
3) X    20   25   30   35   40
   f     2    4    9    3    1
4) Weight   21-30   31-40   41-50   51-60   61-70
   f            1       3       5       4       2
Graphic Representation of Median
• Draw an ogive on the basis of the “less than” or “more than” type.
• Compute (n/2) and locate this point on the vertical scale (y-axis).
• Draw a perpendicular from the located point to the ogive.
• Now draw a perpendicular on the x-axis from the point where the first perpendicular cuts the ogive.
• The point at which this perpendicular intersects the x-axis is the median of the distribution.
EXAMPLE 3.18
Find Median graphically from the following data:
Marks             30-39   40-49   50-59   60-69   70-79   80-89   90-99
No. of Students       8      87     190     304     211      85      20
Solution
Marks    No. of Students   C.F    Class boundaries
30-39            8            8     29.5-39.5
40-49           87           95     39.5-49.5
50-59          190          285     49.5-59.5
60-69          304          589     59.5-69.5
70-79          211          800     69.5-79.5
80-89           85          885     79.5-89.5
90-99           20          905     89.5-99.5
Total          905           --        --
Here we construct the “less than” cumulative frequency distribution:
Class boundaries      C.F
Less than 29.5          0
Less than 39.5          8
Less than 49.5         95
Less than 59.5        285
Less than 69.5        589
Less than 79.5        800
Less than 89.5        885
Less than 99.5        905
[Figure: “less than” ogive, with cumulative frequency (0 to 1000) on the Y-axis and class boundaries 29.5 to 99.5 on the X-axis; the point n/2 is marked on the Y-axis]
Median = 65
Merits and Demerits of Median
Merits
• It is simple to understand and easy to calculate.
• It is not affected by extreme values.
• It is also useful for qualitative data.
• It can be located even in open-end classes.
• Like the mean, it always exists and is unique for any set of data.
• It is the most appropriate average in a highly skewed distribution.
Demerits
• It is not clearly defined by a mathematical formula.
• It is not based on all the observations.
• It is necessary to arrange the values in an array before finding the median, which is tedious (boring) work.
Notes:
• An answer for an average that lies outside the range of the data is incorrect.
• Whenever you hear the word “average”, be aware that the word may not always refer to the mean. It may refer to the median, the mode, etc.
The averages that are obtained by using mathematical formulae are called mathematical averages, e.g.
• Arithmetic Mean
• Harmonic Mean
• Geometric Mean
The averages that are obtained by simple inspection of the data are called positional averages, e.g.
• Mode
• Median
All these averages are affected by a change of origin and scale.
Symmetrical Distribution
“A distribution is said to be symmetric if the values of mean, median and mode are equal”, i.e.
Mean = Median = Mode
In a symmetrical distribution the sum of the deviations from the mean, mode or median is zero. The shape of such a distribution is typically that of a bell, as shown in the figure.
[Figure: bell-shaped curve with Mean = Median = Mode at the centre]
“For a symmetric distribution, we know that the values of mean, median and mode are equal, but if these values differ, then the distribution is said to be skewed or asymmetric.”
The following figures show skewed distributions:
[Figure: two skewed curves. +ve Skewness: Mean > Median > Mode. –ve Skewness: Mean < Median < Mode]
Empirical Relation between Mean, Median and Mode
“The difference between mean and mode is three times the difference between mean and median”, i.e.
Mean – Mode = 3 (Mean – Median)
OR
“The difference between median and mode is twice the difference between mean and median”, i.e.
Median – Mode = 2 (Mean – Median)
If two averages are given, then we can find the third one using:
\[ \text{Mean} = \frac{3\,\text{Median} - \text{Mode}}{2}, \qquad \text{Median} = \frac{\text{Mode} + 2\,\text{Mean}}{3}, \qquad \text{Mode} = 3\,\text{Median} - 2\,\text{Mean} \]
If Mean = 28.5 and Median = 30, then by the empirical relation:
\[ \text{Mode} = 3\,\text{Median} - 2\,\text{Mean} = 3(30) - 2(28.5) = 33 \]
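The three rearrangements of the empirical relation can be sketched as small Python functions. The function names are assumptions chosen for illustration; the relation itself is only an approximation for moderately skewed data.

```python
def mode_from(mean, median):
    """Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean

def mean_from(median, mode):
    """Mean = (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2

def median_from(mean, mode):
    """Median = (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3

print(mode_from(28.5, 30))      # the worked case above
print(mean_from(30, 33))
print(median_from(28.5, 33))
```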
Quartiles
“When the observations are arranged in increasing order, then the values that divide the whole data into four (4) equal parts are called quartiles.”
These values are denoted by Q1, Q2 and Q3.
It is to be noted that 25% of the data falls below Q1, 50% of the data falls below Q2 and 75% of the data falls below Q3.
Note: Quartiles, Deciles and Percentiles are also called Quantiles or Fractiles.
Deciles
“When the observations are arranged in increasing order, then the values that divide the whole data into ten (10) equal parts are called deciles.”
These values are denoted by D1, D2, …, D9.
It is to be noted that 10% of the data falls below D1, 20% of the data falls below D2, …, and 90% of the data falls below D9.
Percentiles
“When the observations are arranged in increasing order, then the values that divide the whole data into hundred (100) equal parts are called percentiles.”
These values are denoted by P1, P2, …, P99.
It is to be noted that 1% of the data falls below P1, 2% of the data falls below P2, …, and 99% of the data falls below P99.
Note: For a data set, the 2nd quartile, 5th decile and 50th percentile are equal to the median, i.e.
Q2 = D5 = P50 = Median
Measures, Data Type and Formulas
Quartiles (j = 1, 2, 3)
• Ungrouped data: \( Q_j = \text{size of } \left[\frac{j(n+1)}{4}\right]\text{th observation} \)
• Discrete grouped data: \( Q_j = \text{size of } \left[\frac{j(n+1)}{4}\right]\text{th observation} \); here \( n = \sum f \)
Deciles (j = 1, 2, …, 9)
• Ungrouped data: \( D_j = \text{size of } \left[\frac{j(n+1)}{10}\right]\text{th observation} \)
• Discrete grouped data: \( D_j = \text{size of } \left[\frac{j(n+1)}{10}\right]\text{th observation} \); here \( n = \sum f \)
Percentiles (j = 1, 2, …, 99)
• Ungrouped data: \( P_j = \text{size of } \left[\frac{j(n+1)}{100}\right]\text{th observation} \)
• Discrete grouped data: \( P_j = \text{size of } \left[\frac{j(n+1)}{100}\right]\text{th observation} \); here \( n = \sum f \)
Continuous Grouped Data
In continuous grouped data, we use the following two steps:
Quartiles
• First we determine the jth quartile class using jn/4.
• When the jth quartile class is determined, the following formula is used to find the value of the jth quartile:
\[ Q_j = l + \frac{h}{f}\left(\frac{jn}{4} - C\right); \quad \text{here } n = \sum f \]
l = lower class boundary of the jth quartile class
h = class-width of the jth quartile class
f = frequency of the jth quartile class
C = cumulative frequency of the class preceding the jth quartile class.
Deciles
• First we determine the jth decile class using jn/10.
• When the jth decile class is determined, the following formula is used to find the value of the jth decile:
\[ D_j = l + \frac{h}{f}\left(\frac{jn}{10} - C\right); \quad \text{here } n = \sum f \]
l = lower class boundary of the jth decile class
h = class-width of the jth decile class
f = frequency of the jth decile class
C = cumulative frequency of the class preceding the jth decile class.
Percentiles
• First we determine the jth percentile class using jn/100.
• When the jth percentile class is determined, the following formula is used to find the value of the jth percentile:
\[ P_j = l + \frac{h}{f}\left(\frac{jn}{100} - C\right); \quad \text{here } n = \sum f \]
l = lower class boundary of the jth percentile class
h = class-width of the jth percentile class
f = frequency of the jth percentile class
C = cumulative frequency of the class preceding the jth percentile class.
EXAMPLE 3.19
Find Q1, Q3, D5, and P50 from the following data:
50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60; (n = 11)
Solution
\[ Q_1 = \text{size of } \left[\frac{1(n+1)}{4}\right]\text{th observation} = \text{size of } \left[\frac{1(11+1)}{4}\right]\text{th observation} = \text{size of 3rd observation} = 52 \]
\[ Q_3 = \text{size of } \left[\frac{3(n+1)}{4}\right]\text{th observation} = \text{size of } \left[\frac{3(11+1)}{4}\right]\text{th observation} = \text{size of 9th observation} = 58 \]
\[ D_5 = \text{size of } \left[\frac{5(n+1)}{10}\right]\text{th observation} = \text{size of } \left[\frac{5(11+1)}{10}\right]\text{th observation} = \text{size of 6th observation} = 55 \]
\[ P_{50} = \text{size of } \left[\frac{50(n+1)}{100}\right]\text{th observation} = \text{size of } \left[\frac{50(11+1)}{100}\right]\text{th observation} = \text{size of 6th observation} = 55 \]
EXAMPLE 3.20
Find Q1, Q3, D6 and P80 from the following data:
150, 151, 152, 153, 154, 155, 156, 157, 158, 159 (n = 10)
Solution
\[ Q_1 = \text{size of } \left[\frac{1(n+1)}{4}\right]\text{th observation} = \text{size of } \left[\frac{1(10+1)}{4}\right]\text{th observation} = \text{size of 2.75th observation} \]
= size of [2nd + 0.75(3rd - 2nd)] observation
= 151 + 0.75(152 - 151) = 151.75
\[ Q_3 = \text{size of } \left[\frac{3(n+1)}{4}\right]\text{th observation} = \text{size of } \left[\frac{3(10+1)}{4}\right]\text{th observation} = \text{size of 8.25th observation} \]
= size of [8th + 0.25(9th - 8th)] observation
= 157 + 0.25(158 - 157) = 157.25
\[ D_6 = \text{size of } \left[\frac{6(n+1)}{10}\right]\text{th observation} = \text{size of } \left[\frac{6(10+1)}{10}\right]\text{th observation} = \text{size of 6.6th observation} \]
= 155 + 0.6(156 - 155) = 155.6
\[ P_{80} = \text{size of } \left[\frac{80(n+1)}{100}\right]\text{th observation} = \text{size of } \left[\frac{80(10+1)}{100}\right]\text{th observation} = \text{size of 8.8th observation} \]
= 157 + 0.8(158 - 157) = 157.8
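The position rule j(n+1)/parts with interpolation for fractional positions, as used in Examples 3.19 and 3.20, can be sketched in Python as below. The function name is an assumption, and clamping positions that fall outside the data to the first or last observation is an added convention not stated in the text.

```python
import math

def quantile(values, j, parts):
    """j-th quantile of ungrouped data; parts = 4 (quartiles),
    10 (deciles) or 100 (percentiles)."""
    data = sorted(values)
    n = len(data)
    pos = j * (n + 1) / parts      # 1-based position, may be fractional
    k = math.floor(pos)            # whole part of the position
    frac = pos - k                 # fractional part of the position
    if k < 1:                      # assumption: clamp to the smallest value
        return data[0]
    if k >= n:                     # assumption: clamp to the largest value
        return data[-1]
    # k-th value plus a fraction of the gap to the (k+1)-th value
    return data[k - 1] + frac * (data[k] - data[k - 1])

data = list(range(150, 160))       # data of Example 3.20
print(quantile(data, 1, 4), quantile(data, 3, 4))
print(quantile(data, 6, 10), quantile(data, 80, 100))
```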
EXAMPLE 3.21
Find Q1, Q3, D4 and P60 from the following data:
X    20   21   22   23   24   25
f     1    3    5    2    2    2
Solution
X        f     Cumulative Frequency
20       1       1
21       3       4
22       5       9
23       2      11
24       2      13
25       2      15
Total   15      --
\[ Q_1 = \text{size of } \left[\frac{1(n+1)}{4}\right]\text{th observation} = \text{size of } \left[\frac{1(15+1)}{4}\right]\text{th observation} = \text{size of 4th observation} = 21 \]
\[ Q_3 = \text{size of } \left[\frac{3(n+1)}{4}\right]\text{th observation} = \text{size of } \left[\frac{3(15+1)}{4}\right]\text{th observation} = \text{size of 12th observation} = 24 \]
\[ D_4 = \text{size of } \left[\frac{4(n+1)}{10}\right]\text{th observation} = \text{size of } \left[\frac{4(15+1)}{10}\right]\text{th observation} = \text{size of 6.4th observation} = 22 \]
\[ P_{60} = \text{size of } \left[\frac{60(n+1)}{100}\right]\text{th observation} = \text{size of } \left[\frac{60(15+1)}{100}\right]\text{th observation} = \text{size of 9.6th observation} = 23 \]
EXAMPLE 3.22
Find Q1, Q3, D8 and P40 from the following data:
Marks             30-39   40-49   50-59   60-69   70-79   80-89   90-99
No. of Students       8      87     190     304     211      85      20
Solution
Marks    No. of Students   C.F    Class boundaries
30-39            8            8     29.5-39.5
40-49           87           95     39.5-49.5
50-59          190          285     49.5-59.5
60-69          304          589     59.5-69.5
70-79          211          800     69.5-79.5
80-89           85          885     79.5-89.5
90-99           20          905     89.5-99.5
Total          905           --        --
Lower quartile (Q1)
Step 1:
\[ Q_1 = \text{size of } \left(\frac{1 \times n}{4}\right)\text{th observation} = \text{size of } \left(\frac{1 \times 905}{4}\right)\text{th observation} = \text{size of 226.25th observation} \]
Since the 226.25th observation lies in the class (49.5-59.5), this is the lower quartile class.
Here l = 49.5, f = 190, C = 95, h = 10
Step 2:
\[ Q_1 = l + \frac{h}{f}\left(\frac{1 \times n}{4} - C\right) = 49.5 + \frac{10}{190}\left(226.25 - 95\right) \approx 56.41 \]
Upper quartile (Q3)
Step 1:
\[ Q_3 = \text{size of } \left(\frac{3n}{4}\right)\text{th observation} = \text{size of } \left(\frac{3 \times 905}{4}\right)\text{th observation} = \text{size of } \left(\frac{2715}{4}\right)\text{th observation} = \text{size of 678.75th observation} \]
Since the 678.75th observation lies in the class (69.5-79.5), this is the upper quartile class.
Here l = 69.5, f = 211, C = 589, h = 10
Step 2:
\[ Q_3 = l + \frac{h}{f}\left(\frac{3n}{4} - C\right) = 69.5 + \frac{10}{211}\left(678.75 - 589\right) \approx 73.75 \]
8th decile (D8)
Step 1:
\[ D_8 = \text{size of } \left(\frac{8n}{10}\right)\text{th observation} = \text{size of } \left(\frac{8 \times 905}{10}\right)\text{th observation} = \text{size of } \left(\frac{7240}{10}\right)\text{th observation} = \text{size of 724th observation} \]
Since the 724th observation lies in the class (69.5-79.5), this is the 8th decile class.
Here l = 69.5, f = 211, C = 589, h = 10
Step 2:
\[ D_8 = l + \frac{h}{f}\left(\frac{8n}{10} - C\right) = 69.5 + \frac{10}{211}\left(724 - 589\right) \approx 75.90 \]
40th percentile (P40)
Step 1:
\[ P_{40} = \text{size of } \left(\frac{40n}{100}\right)\text{th observation} = \text{size of } \left(\frac{40 \times 905}{100}\right)\text{th observation} = \text{size of } \left(\frac{36200}{100}\right)\text{th observation} = \text{size of 362nd observation} \]
Since the 362nd observation lies in the class (59.5-69.5), this is the 40th percentile class.
Here l = 59.5, f = 304, C = 285, h = 10
Step 2:
\[ P_{40} = l + \frac{h}{f}\left(\frac{40n}{100} - C\right) = 59.5 + \frac{10}{304}\left(362 - 285\right) \approx 62.03 \]
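The quartile, decile and percentile formulas for continuous grouped data share one pattern, which can be sketched as a single Python function. The function name and the `parts` argument (4, 10 or 100) are assumptions made for this illustration.

```python
from bisect import bisect_left
from itertools import accumulate

def grouped_quantile(boundaries, freqs, j, parts):
    """j-th quantile of continuous grouped data: l + (h / f) * (j*n/parts - C)."""
    cum = list(accumulate(freqs))          # cumulative frequencies
    n = cum[-1]
    target = j * n / parts                 # e.g. jn/4 for quartiles
    i = bisect_left(cum, target)           # class containing the target observation
    l, upper = boundaries[i]
    h = upper - l
    C = cum[i - 1] if i > 0 else 0         # c.f. of the preceding class
    return l + (h / freqs[i]) * (target - C)

boundaries = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5),
              (69.5, 79.5), (79.5, 89.5), (89.5, 99.5)]
freqs = [8, 87, 190, 304, 211, 85, 20]
q1 = grouped_quantile(boundaries, freqs, 1, 4)
q3 = grouped_quantile(boundaries, freqs, 3, 4)
d8 = grouped_quantile(boundaries, freqs, 8, 10)
p40 = grouped_quantile(boundaries, freqs, 40, 100)
print(round(q1, 2), round(q3, 2), round(d8, 2), round(p40, 2))
```

Calling the same function with j = 2 and parts = 4 gives the grouped median, since Q2 = Median.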
Main Objects of Average
• The main object (purpose) of the average is to give a bird’s eye view (summary) of the statistical data. The average removes all the unnecessary details of the data and gives a concise (to the point or short) picture of the huge data under investigation.
• The average is also of great use for the purpose of comparison (i.e. the comparison of two or more groups in which the units of the variables are the same) and for the further analysis of the data.
• Averages are very useful for computing various other statistical measures such as dispersion, skewness, kurtosis etc.
Requisites (desirable qualities) of a Good Average
An average will be considered good if:
• It is mathematically defined.
• It utilizes all the values given in the data.
• It is not much affected by extreme values.
• It can be calculated in almost all cases.
• It can be used in further statistical analysis of the data.
• It does not give misleading results.
Uses of Averages in Different Situations
• A.M is an appropriate average for all situations where there are no extreme values in the data.
• G.M is an appropriate average for calculating the average percent increase in sales, population, production, etc. It is one of the best averages for the construction of index numbers.
• H.M is an appropriate average for calculating the average rate of increase of profits of a firm, or for finding the average speed of a journey or the average price at which articles are sold.
• Mode is an appropriate average in case of qualitative data, e.g. when someone speaks of the opinion of an average person, he is probably referring to the most frequently expressed opinion, which is the modal opinion.
• Median is an appropriate average in a highly skewed distribution, e.g. in the distribution of wages, incomes etc.
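Prove that: \( \sum (x_i - \bar{x})^2 \le \sum (x_i - A)^2 \)
Proof:
Taking
\[ \sum (x_i - A)^2 = \sum (x_i - \bar{x} + \bar{x} - A)^2 \]
\[ = \sum [(x_i - \bar{x}) + (\bar{x} - A)]^2 \]
\[ = \sum [(x_i - \bar{x})^2 + (\bar{x} - A)^2 + 2(x_i - \bar{x})(\bar{x} - A)] \]
\[ = \sum (x_i - \bar{x})^2 + \sum (\bar{x} - A)^2 + 2\sum (x_i - \bar{x})(\bar{x} - A) \]
\[ = \sum (x_i - \bar{x})^2 + n(\bar{x} - A)^2 + 2(\bar{x} - A)\sum (x_i - \bar{x}) \]
\[ = \sum (x_i - \bar{x})^2 + n(\bar{x} - A)^2 \qquad \because \sum (x_i - \bar{x}) = 0 \]
\[ \Rightarrow \sum (x_i - A)^2 \ge \sum (x_i - \bar{x})^2 \qquad \because n(\bar{x} - A)^2 \ge 0 \]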
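The identity at the heart of the proof, that the sum of squared deviations about any value A exceeds the sum about the mean by n times the squared distance between the mean and A, can be checked numerically. The data set and the trial values of A below are hypothetical, chosen only for illustration.

```python
def sum_sq_dev(data, a):
    """Sum of squared deviations of the data from the value a."""
    return sum((x - a) ** 2 for x in data)

data = [2, 4, 6, 8, 10]                 # hypothetical observations
n = len(data)
mean = sum(data) / n

for A in (0, 5, 6, 7.5, 12):            # hypothetical trial values of A
    about_A = sum_sq_dev(data, A)
    about_mean = sum_sq_dev(data, mean)
    # about_A should equal about_mean + n * (mean - A)^2
    print(A, about_A, about_mean + n * (mean - A) ** 2)
```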
Sharpen your Pencil

MCQ’s
(1) Mean = ______

3Median  Mode
(A) Mean  Mode (B)
2
(C) Median (D) None of these
om
(2) Mean  Mode = ______
l.c
(A) 3( x  Median) (B) x  Median
(C) Median (D) None of these
ai
1
Mean  (3Median  ______ )
gm
(3)
2
(A) 2Mean (B) Mode (C) G.M (D) None of these
s@
(4) Mode  (3median  _____ )
(A) x (B) 2x (C) H.M (D) None of these

t
ta
1
(5) Median  ( Mode  ______ )
3
es
(A) x (B) 2x (C) G.M (D) None of these
 ( xi  x )  _____
ze
(6)
(A)  ( xi  a) (B) 0 (C) 1 (D) None of these
(7)  ( xi  a)  _____
(A) nx  na (B) x a (C)  xi  a (D) None of these
(8) G.M of (X + A) is_____.
(A) G.M of (X) + A (B) G.M of (X ) + nA

(C) G.M of (X) (D) None of these
(9) If X = -2, -1, 20, 40 then_____ cannot be calculated.
(A) A.M (B) G.M (C) H.M (D) None of these
om
(10) If X = 22, 21, 0, 20 then_____ cannot be calculated.
l.c
(11) For two positive integers G.M=_____ ai
(A) ( A.M )( H .M ) (B) (A.M)(H.M)
gm
(C) A.M (D) None of these
(12) G.M of ―a‖ and ―b‖ is_____

s@
ab ab
(A) (B) (ab)1/ 2 (C) (D) None of these
2 3
2ab
t
(13) H.M of "a" and "b" is

ta
_____
es
(A) ab (B) a b (C) ab (D) None of these

ze
(14) G 2 =_____
(A) AxH (B) A+H (C) (AxH)/2 (D) None of these
(15) If G.M = 60 and A.M = 110.2 then H.M is _____
(A) 28 (B) 38 (C) 32.7 (D) None of these
(16) For a set of data A.M _____G.M _____H.M.
(A) > (B) < (C) = (D) None of these
(17) If X  3.5 and n = 10 then,  X  ____

(A) 0.35 (B) 35 (C) 17.5 (D) None of these
(18) Median  Q2  D5  _____
(A) P2 (B) D10 (C) P50 (D) None of these
om
(19) If y  a  bx then y =_____
a  bx
l.c
(A) (B) bx (C) x (D) None of these
(20) For symmetric distribution mean, median and mode are _____
ai
(A) Different (B) same (C) both A & B (D) None of these
Short Questions
ExeRciSe
Q.3.01. The A.M of two observations is 127.5 and their G.M is 60; find their H.M.
Q.3.02. What is mode? From the following data find out mode: 5, 6, 3, 4, 5, 9, 2, 7, 5.
om
Q.3.03. A group of 20 students obtained a mean score of 70 marks on an examination. A
second group of 30 students obtained a mean score of 80 marks on the same
examination. Find the mean score for the 50 students of the class?
l.c
Q.3.04. Calculate the A.M of the data: 25, 27, 28, 29, 30, 32, 34, and 36
ai
Q.3.05. Show that G.M lies between A.M and H.M of the two values 16 and 25?
gm
Q.3.06. Find Q1 and Q3 : 9, 9, 10, 12, 15, 15, 13, 8, 4, 7, and 8

s@
Q.3.07. Find mode in each case:
(i) 10, 6, 8, 0, 3, 2,
(ii) 120, 5, 4, 5, 2, 1, 0, 5, 4, 7, 8, 4
t
(iii) 1, 3, 3, 0, 5, 0, 9, 0, 10, 0
ta
Q.3.08. State the empirical relation between mean, median and mode?
es
Q.3.09. If \( \sum f = 20 \), \( \sum fD = 200 \) and \( D = X_i - 20 \), then find the mean.

ze
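Q.3.10. If \( \bar{X} = 87 \) and Median = 90, use the empirical relation to find the mode.
Q.3.11. If \( u_i = \frac{x_i - 15}{10} \), \( \sum fu = 90 \) and \( n = 100 \), find the A.M.
Q.3.12. Given that
\( n_1 = 10, \quad n_2 = 15, \quad n_3 = 20 \)
\( \bar{x}_1 = 2.5, \quad \bar{x}_2 = 4.9, \quad \bar{x}_3 = 5.1 \)
Find the combined mean.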
Q.3.13. What are the disadvantages of A.M?
Q.3.14. Describe partitioned values specially the use of median as a partitioned value.
Q.3.15. In a class of 10 students 4 students failed in a test. The marks of 6 students who
passed were 4, 6, 7, 8, 8, and 9. What is median of all the 10 students?
om
Q.3.16. Calculate A.M, G.M and H.M. Show that A.M > G.M > H.M for the values 4, 9.
l.c
Q.3.17. Prove that \( \sum (x_i - \bar{x})^2 \le \sum (x_i - A)^2 \).
Q.3.18. Calculate the harmonic mean of 5, 2, 10, 4.
gm
Q.3.19. What are the advantages of Median?
Q.3.20. Write down the properties of A.M.

s@
Q.3.21. Find A.M, G.M and H.M from the following data if possible, if not possible give
reason, -1, 2, 3, 100, 89, 31, 0, 49, 50, 70
t
ta
Q.3.22. The mean of 20 observations is 10 and median is 15, if 5 is added to each

observation. Find new mean and median.
es
Q.3.23. Find the geometric mean of the series \( 1, 3, 9, \ldots, 3^n \).

ze
Q.3.24. The H.M, A.M and G.M of a set of 5 observations are 10.2, 16 and 14 respectively. Comment.
Q.3.25. Find Mean of 1, 2, 3… 20
Q.3.26. Define Arithmetic mean and Geometric mean
Q.3.27. Find Q3 and P25 from the given data:
Class marks 15 20 25 30 35 40
Frequency 1 1 2 3 2 1
Q.3.28. Define mean, median, mode and geometric mean.
124
Long Questions
ExeRciSe
Q.3.01. Calculate the mean by direct method and step deviation method from the
following data:
Class marks 6 7 8 9 10 11 12
om
Frequency 3 6 9 13 8 5 3
Q.3.02. Calculate the mean by direct method, geometric mean and harmonic mean from
l.c
the following data:
Hourly wages 4 5 6 7 8 9 10 11 12 13 14 15
ai
No. of employees 3 18 23 42 62 78 118 200 198 82 14 5
gm
Q.3.03. Calculate the mean, median and mode from the following data:
Daily
s@
50- 59.9 60- 69.9 70- 79.9 80-89.9 90-99.9 100- 109.9 110-119.9 120-129.9
wages
No. of
7 9 10 15 13 12 6 3
employees
t
ta
Q.3.04. Calculate the mean, median and mode from the following data:
es
Score in Quiz 6 7 8 9 10 11 12
No. of Students 3 6 9 13 8 5 3
ze
Q.3.05. Calculate the mean, median, mode, H.M and G.M from the following data:
Marks 0-10 10-20 20-30 30-40 40-50 50-60

f 8 11 19 16 10 5
Q.3.06. Calculate Median, Mode and Quartiles from the following data:
classes 20- 24 25- 29 30- 34 35-39 40-44

No. of Students 4 8 11 9 2
Q.3.07. Calculate D5 , P50 and P75 from the following data:
classes 9.3- 9.7 9.8- 10.2 10.3- 11.2 11.3-11.7 11.8-12.2

f 2 5 12 17 10
125
Long Questions
ExeRciSe
Q.3.08. Who is better on the average? Use the following data:
Sales representative A 4 7 5 9
Sales representative B 2 12 4 8
om
Q.3.09. Given the following data: 32, 35, 36, 37, 39, 41, and 43:
Calculate A.M, G.M and H.M and show that A.M > G.M > H.M
l.c
Q.3.10. Calculate D8 , P2 and Q3 from the following data: ai
Score in Quiz 6 7 8 9 10 11 12
gm
No. of Students 3 6 9 13 8 5 5
Q.3.11. Find G.M and H.M for the data and show that G.M > H.M.
s@
classes 3- 7 8- 10 11- 13 14-16 17-20

No. of Students 14 24 38 20 4
t
ta
Q.3.12. Find Mean, Median, Q1, Q3 and Mode from the data: 137, 146, 145, 181, 132,
175, 160, 190, 164, 180, 176, 130, 125, 140 and 150.
es
Q.3.13. The Reciprocals of 11 values of X are given below, find A.M, G.M and H.M of X
ze
0.015, 0.0454, 0.04, 0.0333, 0.0285, 0.0213, 0.02, 0.0182, 0.0151, 0.0143, 0.0232
Q.3.14. By taking x = -2, -1, 0, 1, 2, 3 prove or disprove the following relations:
(i) (xi  A.M )  0

(ii) (xi  2 )  A.M
 x
2
(iii)  (xi  A.M )   x

2 2

n
Q.3.15. Find Mean, Median and Mode?
classes 0- 10 10- 20 20- 30 30-40 40-50

126
CHAPTER 04
Measures of Dispersion, Moments, Skewness and Kurtosis
Chapter Contents
You should read this chapter if you need to learn about:
• Dispersion: (P128)
• Measures of Dispersion: (P129-P130)
• Range and its Coefficient: (P130-P132)
• Quartile Deviation and its Coefficient: (P133-P137)
• Mean Deviation and its Coefficient: (P138-P142)
• Standard Deviation and its Coefficient: (P143-P150)
• Properties of Variance and S.D: (P151)
• Moments: (P151-P156)
• Symmetrical Distribution: (P157)
• Skewness and Coefficient of Skewness: (P158-P163)
• Kurtosis: (P164-P166)
Sometimes, when two or more different data sets are compared using a measure of central tendency (an average), we get the same result.
Consider the runs scored by two batsmen in their last ten matches as follows:
Batsman A: 30, 91, 0, 64, 42, 80, 30, 5, 117, 71
Batsman B: 53, 46, 48, 50, 53, 53, 58, 60, 57, 52
Clearly, the mean of the runs scored by both batsmen A and B is the same, i.e. 53.
Can we say that the performance of the two players is the same? Clearly not, because the scores of batsman A vary from 0 to 117, whereas the scores of batsman B vary from 46 to 60.
Let us now plot the above scores as dots on a number line. We find the following diagrams:
[Figure: dot plots of the scores of Batsman A and Batsman B, each on a number line from 0 to 120]
We can see that the dots corresponding to batsman B are close to each other and cluster around the measure of central tendency (mean), while those corresponding to batsman A are scattered, or more spread out. Thus, the measures of central tendency are not sufficient to give complete information about a given data set. In such a situation the comparison becomes very difficult. We therefore need some additional information for comparison, concerning how the data is dispersed (spread out) about the average. This can be done by measuring the dispersion. As with ‘measures of central tendency’, we want to have a single number to describe variability. This single number is called a ‘measure of dispersion’.
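The batsmen comparison can be reproduced with a short Python sketch: both means are equal, yet the spread of the scores differs widely. The helper names are assumptions; the spread is measured here simply as largest value minus smallest value.

```python
batsman_a = [30, 91, 0, 64, 42, 80, 30, 5, 117, 71]
batsman_b = [53, 46, 48, 50, 53, 53, 58, 60, 57, 52]

def mean(values):
    return sum(values) / len(values)

def spread(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

print(mean(batsman_a), mean(batsman_b))      # same average
print(spread(batsman_a), spread(batsman_b))  # very different variability
```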
Dispersion
“The variability (spread) that exists between the values of a data set is called dispersion.”
OR
“The extent to which the observations are spread around an average is called dispersion or scatter.”
[Figure: two panels labelled “No Dispersion” and “Dispersion”]
Measures of Dispersion
As we know, there are quite a few ways of measuring the central tendency of a data set, i.e. A.M, G.M, H.M, Mode and Median. Similarly, we have different ways of measuring and comparing the dispersion of the distribution(s). There are two important types of measures of dispersion.
Types of Measures of Dispersion
Absolute Measure of Dispersion      Relative Measure of Dispersion
Range                               Coefficient of Range
Quartile Deviation                  Coefficient of Q.D
Mean Deviation                      Coefficient of M.D
Standard Deviation                  Coefficient of Variation
Absolute Measure of Dispersion
“An absolute measure of dispersion measures the variability in terms of the same units as the data”, e.g. if the units of the data are Rs, meters, kg, etc., the units of the measures of dispersion will also be Rs, meters, kg, etc.
The common absolute measures of dispersion are:
• Range
• Quartile Deviation or Semi Inter-Quartile Range
• Average Deviation or Mean Deviation
• Standard Deviation
Relative Measure of Dispersion
“A relative measure of dispersion compares the variability of two or more data sets independently of the units of measurement.”
In other words, “A relative measure of dispersion expresses the absolute measure of dispersion relative to the relevant average, and is often multiplied by 100”, i.e.
\[ \text{Relative Dispersion} = \frac{\text{Absolute Dispersion}}{\text{Average}} \]
or
\[ \text{Relative Dispersion} = \frac{\text{Absolute Dispersion}}{\text{Average}} \times 100 \]
This is a pure number and is independent of the units in which the data has been expressed. It is used to compare the dispersion of one data set with the dispersion of another.
The common relative measures of dispersion are:
• Coefficient of Dispersion or Coefficient of Range
• Coefficient of Quartile Deviation
• Coefficient of Mean Deviation
• Coefficient of Standard Deviation or Coefficient of Variation (C.V)
Note: The major difference between absolute and relative measures of dispersion is that an absolute measure of dispersion measures only the variability of the data and carries the unit of measurement; a relative measure of dispersion, on the other hand, is used to compare the variation of two or more distributions and is unit-less.
Range
• Ungrouped Data and Discrete Grouped Data
“The difference between the largest and the smallest value in a set of data is called the range”, i.e.
R = Xm – X0
Where R is the range, Xm is the largest value and X0 is the smallest value.
• Continuous Grouped Data
“In continuous grouped data, the difference between the upper class boundary of the highest class and the lower class boundary of the lowest class is called the range.”
Note: In 1892, Pearson introduced the statistical concept of “range”.
Coefficient of Range or Coefficient of Dispersion
The coefficient of range or coefficient of dispersion is a relative measure of dispersion and is given by:
\[ \text{Coefficient of Range} = \frac{X_m - X_0}{X_m + X_0} \]
EXAMPLE 4.01
Find Range and the Coefficient of Range from the following data:
51, 50, 40, 90, 75, 60, 44, 30, 23, 20 (ungrouped data)
Solution
Here Xm = 90; X0 = 20
R = Xm - X0 = 90 – 20 = 70
\[ \text{Coefficient of Range} = \frac{X_m - X_0}{X_m + X_0} = \frac{90 - 20}{90 + 20} = 0.64 \]
EXAMPLE 4.02
Find Range and the Coefficient of Range from the following data: (Discrete Grouped data)
Marks (X)              13   14   15   16   17
No. of Students (f)     2    5   13    7    3
Solution
Here Xm = 17; X0 = 13
R = Xm - X0 = 17 – 13 = 4
\[ \text{Coefficient of Range} = \frac{X_m - X_0}{X_m + X_0} = \frac{17 - 13}{17 + 13} = 0.13 \]
EXAMPLE 4.03
Find Range and the Coefficient of Range from the following data: (Continuous Grouped data)
Weight   11-20   21-30   31-40   41-50   51-60
f            1       2       3       2       1
Solution
Weight    f    Class Boundaries
11-20     1      10.5-20.5
21-30     2      20.5-30.5
31-40     3      30.5-40.5
41-50     2      40.5-50.5
51-60     1      50.5-60.5
Total     9         --
Here Xm = 60.5; X0 = 10.5
R = Xm - X0 = 60.5 – 10.5 = 50
\[ \text{Coefficient of Range} = \frac{X_m - X_0}{X_m + X_0} = \frac{60.5 - 10.5}{60.5 + 10.5} = 0.70 \]
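The range and its coefficient, as computed in Examples 4.01 to 4.03, can be sketched in Python as follows. The function name is an assumption; for continuous grouped data the largest and smallest values are taken to be the extreme class boundaries, as in Example 4.03.

```python
def range_and_coefficient(x_max, x_min):
    """Return (R, coefficient of range) for largest value x_max and smallest x_min."""
    r = x_max - x_min
    coefficient = (x_max - x_min) / (x_max + x_min)
    return r, coefficient

data = [51, 50, 40, 90, 75, 60, 44, 30, 23, 20]                   # Example 4.01
r_ungrouped, c_ungrouped = range_and_coefficient(max(data), min(data))
r_grouped, c_grouped = range_and_coefficient(60.5, 10.5)          # Example 4.03
print(r_ungrouped, round(c_ungrouped, 2))
print(r_grouped, round(c_grouped, 2))
```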
Merits and Demerits of Range
Merits
• It is the simplest measure of dispersion.
• It gives a quick picture of the variability.
Demerits
• It is not based on each and every value of the data.
• It cannot be computed in case of open-end distributions.
• It is affected by extreme values.
• It is affected by fluctuations of sampling.
Test Yourself
Find the Range and Coefficient of Range from the following data:
1) 1, 3, 5, 7, 9, 11, 13, 15
2) X        20     25     30     35     40
   f         2      4      9      3      1
3) Weight  21-30  31-40  41-50  51-60  61-70
   f         1      3      5      4      2
Quartile Deviation or Semi-interquartile Range
"Half of the difference between the upper quartile and the lower quartile is called the semi-interquartile range or quartile deviation", i.e.
$$\text{Quartile Deviation} = \frac{Q_3 - Q_1}{2}$$
Note: The difference between the upper quartile and the lower quartile is called the interquartile range, i.e.
Interquartile range = $Q_3 - Q_1$
Coefficient of Quartile Deviation
The coefficient of quartile deviation is a relative measure of dispersion and is given by:
$$\text{Coefficient of Q.D} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
EXAMPLE 4.04
Find Q.D and the Coefficient of Q.D from the following data:
50,51,52,53,54,55,56,57,58,59,60; (n = 11)
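Solution
$$Q_1 = \text{size of } \left(\frac{1(n+1)}{4}\right)\text{th observation} = \text{size of } \left(\frac{1(11+1)}{4}\right)\text{th observation}$$
= size of 3rd observation = 52
$$Q_3 = \text{size of } \left(\frac{3(n+1)}{4}\right)\text{th observation} = \text{size of } \left(\frac{3(11+1)}{4}\right)\text{th observation}$$
= size of 9th observation = 58
Here $Q_1 = 52$; $Q_3 = 58$
$$\text{Q.D} = \frac{Q_3 - Q_1}{2} = \frac{58 - 52}{2} = 3$$
$$\text{Coefficient of Q.D} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{58 - 52}{58 + 52} = 0.0545$$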
EXAMPLE 4.05
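Find Q.D and the Coefficient of Q.D from the following data:
20, 21, 22, 23, 24, 25, 26, 27 (ungrouped data) (n = 8)
Solution
$$Q_1 = \text{size of } \left(\frac{1(n+1)}{4}\right)\text{th observation} = \text{size of } \left(\frac{1(8+1)}{4}\right)\text{th observation} = \text{size of 2.25th observation}$$
= size of [2nd + 0.25(3rd - 2nd)] observation
= 21 + 0.25(22 - 21) = 21.25
$$Q_3 = \text{size of } \left(\frac{3(n+1)}{4}\right)\text{th observation} = \text{size of } \left(\frac{3(8+1)}{4}\right)\text{th observation} = \text{size of 6.75th observation}$$
= size of [6th + 0.75(7th - 6th)] observation
= 25 + 0.75(26 - 25) = 25.75
Here $Q_1 = 21.25$; $Q_3 = 25.75$
$$\text{Q.D} = \frac{Q_3 - Q_1}{2} = \frac{25.75 - 21.25}{2} = 2.25$$
$$\text{Coefficient of Q.D} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{25.75 - 21.25}{25.75 + 21.25} = 0.0957$$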
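The position rule used in Examples 4.04 and 4.05 can be sketched in Python as follows. This is a minimal illustration assuming the $k(n+1)/4$ position rule with linear interpolation between neighbouring observations, as in the worked solutions; the function names `quartile` and `quartile_deviation` are hypothetical.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) by the k(n + 1)/4 position rule."""
    data = sorted(values)
    n = len(data)
    position = k * (n + 1) / 4     # 1-based position, may be fractional
    whole = int(position)          # whole part of the position
    fraction = position - whole    # fractional part used for interpolation
    if whole < 1:
        return data[0]
    if whole >= n:
        return data[-1]
    # interpolate between the whole-th and (whole + 1)-th observations
    return data[whole - 1] + fraction * (data[whole] - data[whole - 1])


def quartile_deviation(values):
    """Return (Q.D, coefficient of Q.D) for ungrouped data."""
    q1 = quartile(values, 1)
    q3 = quartile(values, 3)
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


# Data of Example 4.05
qd, coeff = quartile_deviation([20, 21, 22, 23, 24, 25, 26, 27])
print(qd, round(coeff, 4))
```

Statistical libraries often use other interpolation rules for quartiles, so their results may differ slightly from this textbook rule.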
EXAMPLE 4.06
Find Q.D and the Coefficient of Q.D from the following data: (Discrete grouped data)
X   20  21  22  23  24  25
f    1   3   5   2   2   2
Solution
X       f    Cumulative Frequency
20      1     1
21      3     4
22      5     9
23      2    11
24      2    13
25      2    15
Total  15    --
$$Q_1 = \text{size of } \left(\frac{1(n+1)}{4}\right)\text{th observation} = \text{size of } \left(\frac{1(15+1)}{4}\right)\text{th observation}$$
= size of 4th observation = 21
$$Q_3 = \text{size of } \left(\frac{3(n+1)}{4}\right)\text{th observation} = \text{size of } \left(\frac{3(15+1)}{4}\right)\text{th observation}$$
= size of 12th observation = 24
Here $Q_1 = 21$; $Q_3 = 24$
$$\text{Q.D} = \frac{Q_3 - Q_1}{2} = \frac{24 - 21}{2} = 1.5$$
$$\text{Coefficient of Q.D} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{24 - 21}{24 + 21} = 0.0667$$
EXAMPLE 4.07
Find Q.D and the Coefficient of Q.D from the following data: (Continuous grouped data)
Marks   30-39  40-49  50-59  60-69  70-79  80-89  90-99
f          8     87    190    304    211     85     20
Solution
Marks    No. of Students (f)   C.F   Class boundaries
30-39            8               8    29.5-39.5
40-49           87              95    39.5-49.5
50-59          190             285    49.5-59.5
60-69          304             589    59.5-69.5
70-79          211             800    69.5-79.5
80-89           85             885    79.5-89.5
90-99           20             905    89.5-99.5
Total          905              --    --
Lower quartile
Step 1:
$$Q_1 = \text{Size of } \left(\frac{1 \cdot n}{4}\right)\text{th observation} = \text{Size of } \left(\frac{905}{4}\right)\text{th observation} = \text{Size of 226.25th observation}$$
Since the 226.25th observation lies in the class (49.5-59.5), this is the lower quartile class.
Here $l = 49.5$, $f = 190$, $C = 95$, $h = 10$
Step 2:
$$Q_1 = l + \frac{h}{f}\left(\frac{1 \cdot n}{4} - C\right) = 49.5 + \frac{10}{190}(226.25 - 95) = 56.40$$
Upper quartile
Step 1:
$$Q_3 = \text{Size of } \left(\frac{3n}{4}\right)\text{th observation} = \text{Size of } \left(\frac{3 \times 905}{4}\right)\text{th observation} = \text{Size of 678.75th observation}$$
Since the 678.75th observation lies in the class (69.5-79.5), this is the upper quartile class.
Here $l = 69.5$, $f = 211$, $C = 589$, $h = 10$
Step 2:
$$Q_3 = l + \frac{h}{f}\left(\frac{3n}{4} - C\right) = 69.5 + \frac{10}{211}(678.75 - 589) = 73.75$$
Here $Q_1 = 56.40$; $Q_3 = 73.75$
$$\text{Q.D} = \frac{Q_3 - Q_1}{2} = \frac{73.75 - 56.40}{2} = 8.6750$$
$$\text{Coefficient of Q.D} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{73.75 - 56.40}{73.75 + 56.40} = 0.1333$$
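The two-step procedure of Example 4.07 (locate the quartile class, then interpolate inside it) can be sketched in Python. This is an illustrative sketch only; the function name `grouped_quartile` and its argument layout are assumptions, not part of the book.

```python
def grouped_quartile(boundaries, freqs, k):
    """k-th quartile (k = 1, 2, 3) for continuous grouped data.

    boundaries: list of (lower, upper) class boundaries
    freqs:      class frequencies in the same order
    """
    n = sum(freqs)
    target = k * n / 4            # position of the required observation
    cumulative = 0                # C: cumulative frequency before the class
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= target:
            h = upper - lower     # class width
            # Q = l + (h / f) * (k * n / 4 - C)
            return lower + (h / f) * (target - cumulative)
        cumulative += f
    raise ValueError("quartile class not found")


# Data of Example 4.07
boundaries = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5),
              (69.5, 79.5), (79.5, 89.5), (89.5, 99.5)]
freqs = [8, 87, 190, 304, 211, 85, 20]
q1 = grouped_quartile(boundaries, freqs, 1)
q3 = grouped_quartile(boundaries, freqs, 3)
print(round(q1, 2), round(q3, 2), round((q3 - q1) / 2, 2))
```

With `k = 2` the same function gives the median of the grouped data.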
Merits and Demerits of Quartile Deviation
Merits
• It is simple to understand and easy to calculate.
• It is a good measure for open-end distributions.
Demerits
• It is not based on each and every value of the data.
• It is affected by fluctuations of sampling.
Test Yourself
Find the Q.D and Coefficient of Q.D from the following data:
1) 1, 3, 5, 7, 9, 11, 13, 15, 20, 19, 21
2) 30, 33, 23, 22, 34, 40, 41, 28, 35, 39
3) X        20     25     30     35     40
   f         2      4      9      3      1
4) Weight  21-30  31-40  41-50  51-60  61-70
   f         1      3      5      4      2
Mean Absolute Deviation or Mean Deviation (Average Deviation)
"The arithmetic mean of the absolute deviations from an average (mean, median, etc.) is called the mean deviation or average deviation."
M.D from Mean:
  Ungrouped data: $M.D = \frac{\sum |x_i - \bar{x}|}{n}$
  Grouped data:   $M.D = \frac{\sum f|x_i - \bar{x}|}{n}$
M.D from Median:
  Ungrouped data: $M.D = \frac{\sum |x_i - \text{Med}|}{n}$
  Grouped data:   $M.D = \frac{\sum f|x_i - \text{Med}|}{n}$
For grouped data, $n = \sum f$.
Coefficient of Mean Deviation
The coefficient of mean deviation is a relative measure of dispersion and is given by:
$$\text{Coefficient of M.D (from mean)} = \frac{\text{M.D (from mean)}}{\text{Mean}}$$
$$\text{Coefficient of M.D (from median)} = \frac{\text{M.D (from median)}}{\text{Median}}$$
EXAMPLE 4.08
Find M.D and the Coefficient of M.D from mean, using the data:
50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60 (ungrouped data)
Solution
X      $X_i - \bar{X}$   $|X_i - \bar{X}|$
50         -5                5
51         -4                4
52         -3                3
53         -2                2
54         -1                1
55          0                0
56          1                1
57          2                2
58          3                3
59          4                4
60          5                5
605        --               30
Here $\bar{x} = \frac{\sum x_i}{n} = \frac{605}{11} = 55$
$$M.D = \frac{\sum |x_i - \bar{x}|}{n} = \frac{30}{11} = 2.7273$$
$$\text{Coefficient of M.D} = \frac{M.D}{\bar{X}} = \frac{2.7273}{55} = 0.0496$$
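The mean deviation from the mean can be checked with a short Python sketch. The function name `mean_deviation_from_mean` and the optional `freqs` argument (frequencies for discrete grouped data) are assumptions introduced for this illustration.

```python
def mean_deviation_from_mean(values, freqs=None):
    """Return (M.D from mean, coefficient of M.D).

    For ungrouped data leave freqs as None; for discrete grouped
    data pass the frequency of each value.
    """
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    # arithmetic mean of the absolute deviations from the mean
    md = sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n
    return md, md / mean


# Data of Example 4.08
md, coeff = mean_deviation_from_mean(list(range(50, 61)))
print(round(md, 4), round(coeff, 4))
```

Replacing the mean by the median inside the function would give the mean deviation from the median in the same way.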
EXAMPLE 4.09
Find M.D and the Coefficient of M.D from median, using the data:
50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 61 (ungrouped data)
Solution
$$\text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation} = \text{size of } \left(\frac{11+1}{2}\right)\text{th observation}$$
= size of 6th observation = 55
X      $X_i - \text{Med}$   $|X_i - \text{Med}|$
50         -5                  5
51         -4                  4
52         -3                  3
53         -2                  2
54         -1                  1
55          0                  0
56          1                  1
57          2                  2
58          3                  3
59          4                  4
61          6                  6
--         --                 31
$$M.D = \frac{\sum |x_i - \text{median}|}{n} = \frac{31}{11} = 2.8182$$
$$\text{Coefficient of M.D} = \frac{M.D}{\text{Median}} = \frac{2.8182}{55} = 0.0512$$
EXAMPLE 4.10
Find M.D and the Coefficient of M.D from mean. (Discrete grouped data)
X   20  21  22  23  24  25
f    1   3   5   2   2   2
Solution
X       f    fX    $|X_i - \bar{X}|$   $f|X_i - \bar{X}|$
20      1    20        2.47               2.47
21      3    63        1.47               4.41
22      5   110        0.47               2.35
23      2    46        0.53               1.06
24      2    48        1.53               3.06
25      2    50        2.53               5.06
Total  15   337         --               18.41
Here $\bar{x} = \frac{\sum f x_i}{n} = \frac{337}{15} = 22.47$
$$M.D = \frac{\sum f|x_i - \bar{x}|}{n} = \frac{18.41}{15} = 1.23$$
$$\text{Coefficient of M.D} = \frac{M.D}{\bar{X}} = \frac{1.23}{22.47} = 0.05$$
EXAMPLE 4.11
Find M.D and the Coefficient of M.D from median. (Discrete grouped data)
X 20 21 22 23 24 25
f 1 3 5 2 2 2
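Solution
$$\text{Median} = \text{size of } \left(\frac{n+1}{2}\right)\text{th observation} = \text{size of } \left(\frac{15+1}{2}\right)\text{th observation}$$
= size of 8th observation = 22
X       f    C.F   $f|X_i - \text{Med}|$
20      1     1        2
21      3     4        3
22      5     9        0
23      2    11        2
24      2    13        4
25      2    15        6
Total  15    --       17
$$M.D = \frac{\sum f|x_i - \text{median}|}{n} = \frac{17}{15} = 1.13$$
$$\text{Coefficient of M.D} = \frac{M.D}{\text{Median}} = \frac{1.13}{22} = 0.05$$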
EXAMPLE 4.12
Find M.D and the Coefficient of M.D from mean. (Continuous grouped data)
Marks             30-39  40-49  50-59  60-69  70-79  80-89  90-99
No. of Students      8     87    190    304    211     85     20
Solution
Marks     X      f      fX        $f|X_i - \bar{X}|$
30-39    34.5     8     276         244.69
40-49    44.5    87    3871.5      1790.95
50-59    54.5   190   10355        2011.27
60-69    64.5   304   19608         178.03
70-79    74.5   211   15719.5      1986.43
80-89    84.5    85    7182.5      1650.22
90-99    94.5    20    1890         588.29
Total     --    905   58902.5      8449.88
Here $\bar{x} = \frac{\sum f x_i}{n} = \frac{58902.5}{905} = 65.09$
$$M.D = \frac{\sum f|x_i - \bar{x}|}{n} = \frac{8449.88}{905} = 9.34$$
$$\text{Coefficient of M.D} = \frac{M.D}{\bar{X}} = \frac{9.34}{65.09} = 0.14$$
EXAMPLE 4.13
Find M.D and the Coefficient of M.D from median. (Continuous grouped data)
Marks             30-39  40-49  50-59  60-69  70-79  80-89  90-99
No. of Students      8     87    190    304    211     85     20
Solution
Step 1:
$$\text{Median} = \text{Size of } \left(\frac{n}{2}\right)\text{th observation} = \text{Size of } \left(\frac{905}{2}\right)\text{th observation} = \text{Size of 452.5th observation}$$
Since the 452.5th observation lies in the class (59.5-69.5), this is the median class.
Here $l = 59.5$, $f = 304$, $C = 285$, $h = 10$
Step 2:
$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right) = 59.5 + \frac{10}{304}(452.5 - 285) = 65$$
Marks     X     No. of Students (f)   C.F   Class boundaries   $f|X_i - \text{Median}|$
30-39    34.5          8                8     29.5-39.5           244
40-49    44.5         87               95     39.5-49.5          1783.5
50-59    54.5        190              285     49.5-59.5          1995
60-69    64.5        304              589     59.5-69.5           152
70-79    74.5        211              800     69.5-79.5          2004.5
80-89    84.5         85              885     79.5-89.5          1657.5
90-99    94.5         20              905     89.5-99.5           590
Total     --         905               --        --              8426.5
$$M.D = \frac{\sum f|x_i - \text{median}|}{n} = \frac{8426.5}{905} = 9.31$$
$$\text{Coefficient of M.D} = \frac{M.D}{\text{Median}} = \frac{9.31}{65} = 0.14$$
Merits and Demerits of Mean Deviation
Merits
• It is simple to understand.
• It is based on each and every value of the data.
Demerits
• It is not a good measure for open-end distributions.
• It is difficult to handle mathematically, because there is an element of artificiality, i.e. the deviations are not taken with their proper signs.
Test Yourself
a) Find the M.D from Mean and Coefficient of M.D from the following data:
1) 1, 3, 5, 7, 9, 11, 13, 15, 20, 19, 21
2) X        20     25     30     35     40
   f         2      4      9      3      1
3) Weight  21-30  31-40  41-50  51-60  61-70
   f         1      3      5      4      2
b) Find the M.D from Median and Coefficient of M.D from the following data:
1) 1, 3, 5, 7, 9, 11, 13, 15, 20, 19, 21
2) 30, 33, 23, 22, 34, 40, 41, 28, 35, 39
3) X        20     25     30     35     40
   f         2      4      9      3      1
4) Weight  21-30  31-40  41-50  51-60  61-70
   f         1      3      5      4      2
Standard Deviation
"The positive square root of the variance is called the standard deviation."
OR
"The positive square root of the arithmetic mean of the squared deviations from the mean is called the standard deviation."
Note: The arithmetic mean of the squared deviations of the values measured from the mean is called the variance.
S.D for Population:
  Ungrouped data: $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
  Grouped data:   $\sigma = \sqrt{\frac{\sum f(x_i - \mu)^2}{N}}$
S.D for Sample:
  Ungrouped data: $S = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$
  Grouped data:   $S = \sqrt{\frac{\sum f(x_i - \bar{x})^2}{n}}$
Notes:
• For a set of data, in general: Range > S.D > M.D > Q.D
• The measures of dispersion are never negative.
Methods of Calculating Variance and Standard Deviation
In each method the standard deviation is the positive square root of the variance, $S = \sqrt{S^2}$. Here $D = x_i - A$ and $u_i = \frac{x_i - A}{h}$, where $A$ is an arbitrary origin and $h$ is a common factor (the class interval for grouped data).
Ungrouped Data
• Direct Method: $S^2 = \frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2$
• Short cut Method: $S^2 = \frac{\sum D^2}{n} - \left(\frac{\sum D}{n}\right)^2$
• Step-deviation Method: $S^2 = h^2\left[\frac{\sum u_i^2}{n} - \left(\frac{\sum u_i}{n}\right)^2\right]$
Grouped Data
• Direct Method: $S^2 = \frac{\sum f x_i^2}{n} - \left(\frac{\sum f x_i}{n}\right)^2$
• Short cut Method: $S^2 = \frac{\sum f D^2}{n} - \left(\frac{\sum f D}{n}\right)^2$
• Step-deviation Method: $S^2 = h^2\left[\frac{\sum f u_i^2}{n} - \left(\frac{\sum f u_i}{n}\right)^2\right]$
Coefficient of Standard Deviation OR Coefficient of Variation
The coefficient of standard deviation is a relative measure of dispersion and is given by:
$$\text{Coefficient of S.D} = \frac{\text{Standard Deviation}}{\text{Mean}}$$
The coefficient of standard deviation is also called the coefficient of variation, denoted by C.V, and is given by:
$$\text{C.V} = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100$$
The coefficient of variation was introduced by Karl Pearson. It is used to compare the variation, or the performance, of two sets of data. A large value of C.V indicates greater variability and vice versa. Similarly, the smaller the C.V, the more consistent the performance and vice versa.
Note: In 1893, Pearson introduced the statistical concept of "S.D". He also introduced the coefficient of variation for the comparison between two different groups.
EXAMPLE 4.14
Find Variance and Standard deviation from the following data: (ungrouped data)
2, 4, 6, 8, 10
Solution
Direct Method
X      X²
2       4
4      16
6      36
8      64
10    100
30    220
$$\text{Variance} = S^2 = \frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2 = \frac{220}{5} - \left(\frac{30}{5}\right)^2 = 8$$
$$\text{Standard Deviation} = S = \sqrt{\frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2} = \sqrt{\frac{220}{5} - \left(\frac{30}{5}\right)^2} = 2.8$$
Short-cut Method (let A = 4)
X     D = Xi - A    D²
2        -2          4
4         0          0
6         2          4
8         4         16
10        6         36
--       10         60
$$S^2 = \frac{\sum D^2}{n} - \left(\frac{\sum D}{n}\right)^2 = \frac{60}{5} - \left(\frac{10}{5}\right)^2 = 8$$
$$S = \sqrt{\frac{\sum D^2}{n} - \left(\frac{\sum D}{n}\right)^2} = \sqrt{\frac{60}{5} - \left(\frac{10}{5}\right)^2} = 2.8$$
Step-deviation Method (here h = 2 and let A = 8)
X     u = (Xi - A)/h    u²
2         -3             9
4         -2             4
6         -1             1
8          0             0
10         1             1
30        -5            15
$$S^2 = h^2\left[\frac{\sum u_i^2}{n} - \left(\frac{\sum u_i}{n}\right)^2\right] = 2^2\left[\frac{15}{5} - \left(\frac{-5}{5}\right)^2\right] = 8$$
$$S = h\sqrt{\frac{\sum u_i^2}{n} - \left(\frac{\sum u_i}{n}\right)^2} = 2\sqrt{\frac{15}{5} - \left(\frac{-5}{5}\right)^2} = 2.8$$
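The three methods of Example 4.14 always give the same variance, which can be verified with a small Python sketch. The function names below are hypothetical, and the choices A = 4, A = 8 and h = 2 simply repeat those made in the example.

```python
import math


def variance_direct(xs):
    """S^2 = sum(x^2)/n - (sum(x)/n)^2."""
    n = len(xs)
    return sum(x * x for x in xs) / n - (sum(xs) / n) ** 2


def variance_shortcut(xs, a):
    """Short-cut method with deviations D = x - A."""
    n = len(xs)
    d = [x - a for x in xs]
    return sum(v * v for v in d) / n - (sum(d) / n) ** 2


def variance_step_deviation(xs, a, h):
    """Step-deviation method with u = (x - A) / h."""
    n = len(xs)
    u = [(x - a) / h for x in xs]
    return h * h * (sum(v * v for v in u) / n - (sum(u) / n) ** 2)


data = [2, 4, 6, 8, 10]
v1 = variance_direct(data)
v2 = variance_shortcut(data, a=4)
v3 = variance_step_deviation(data, a=8, h=2)
print(v1, v2, v3, round(math.sqrt(v1), 2))
```

The short-cut and step-deviation methods only shift and rescale the data, which is why the choice of A and h does not change the result.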
EXAMPLE 4.15
Find Variance and Standard deviation from the following data: (Discrete grouped data)
X   10  15  20  25  30
f    1   2   3   2   1
Solution
Direct Method
X       f    fX    fX²
10      1    10    100
15      2    30    450
20      3    60   1200
25      2    50   1250
30      1    30    900
Total   9   180   3900
$$S^2 = \frac{\sum f x_i^2}{n} - \left(\frac{\sum f x_i}{n}\right)^2 = \frac{3900}{9} - \left(\frac{180}{9}\right)^2 = 33.33$$
$$S = \sqrt{\frac{\sum f x_i^2}{n} - \left(\frac{\sum f x_i}{n}\right)^2} = \sqrt{\frac{3900}{9} - \left(\frac{180}{9}\right)^2} = 5.77$$
Short-cut Method (here A = 20)
X       f    D = Xi - A    fD    fD²
10      1      -10        -10    100
15      2       -5        -10     50
20      3        0          0      0
25      2        5         10     50
30      1       10         10    100
Total   9       --          0    300
$$S^2 = \frac{\sum f D^2}{n} - \left(\frac{\sum f D}{n}\right)^2 = \frac{300}{9} - \left(\frac{0}{9}\right)^2 = 33.33$$
$$S = \sqrt{\frac{\sum f D^2}{n} - \left(\frac{\sum f D}{n}\right)^2} = \sqrt{\frac{300}{9} - \left(\frac{0}{9}\right)^2} = 5.77$$
Step-deviation Method (here A = 20, h = 5)
X       f    u = (Xi - A)/h    fu    fu²
10      1        -2            -2     4
15      2        -1            -2     2
20      3         0             0     0
25      2         1             2     2
30      1         2             2     4
Total   9        --             0    12
$$S^2 = h^2\left[\frac{\sum f u_i^2}{n} - \left(\frac{\sum f u_i}{n}\right)^2\right] = 5^2\left[\frac{12}{9} - \left(\frac{0}{9}\right)^2\right] = 33.33$$
$$S = h\sqrt{\frac{\sum f u_i^2}{n} - \left(\frac{\sum f u_i}{n}\right)^2} = 5\sqrt{\frac{12}{9} - \left(\frac{0}{9}\right)^2} = 5.77$$
EXAMPLE 4.16
Find Variance and Standard deviation from the following data: (Continuous grouped data)
Weight (kg)   11-20   21-30   31-40   41-50   51-60
f               1       2       3       2       1
Solution
Direct Method
Weight (kg)   f     X      fX       fX²
11-20         1    15.5    15.5     240.25
21-30         2    25.5    51.0    1300.5
31-40         3    35.5   106.5    3780.75
41-50         2    45.5    91.0    4140.5
51-60         1    55.5    55.5    3080.25
Total         9     --    319.5   12542.25
$$S^2 = \frac{\sum f x_i^2}{n} - \left(\frac{\sum f x_i}{n}\right)^2 = \frac{12542.25}{9} - \left(\frac{319.5}{9}\right)^2 = 133.3 \text{ kg}^2$$
$$S = \sqrt{\frac{\sum f x_i^2}{n} - \left(\frac{\sum f x_i}{n}\right)^2} = \sqrt{\frac{12542.25}{9} - \left(\frac{319.5}{9}\right)^2} = 11.5 \text{ kg}$$
Short-cut Method (here A = 35.5)
Weight   f     X     D = Xi - A    fD    fD²
11-20    1    15.5     -20        -20    400
21-30    2    25.5     -10        -20    200
31-40    3    35.5       0          0      0
41-50    2    45.5      10         20    200
51-60    1    55.5      20         20    400
Total    9     --       --          0   1200
$$S^2 = \frac{\sum f D^2}{n} - \left(\frac{\sum f D}{n}\right)^2 = \frac{1200}{9} - \left(\frac{0}{9}\right)^2 = 133.3 \text{ kg}^2$$
$$S = \sqrt{\frac{\sum f D^2}{n} - \left(\frac{\sum f D}{n}\right)^2} = \sqrt{\frac{1200}{9} - \left(\frac{0}{9}\right)^2} = 11.5 \text{ kg}$$
Step-deviation Method (here A = 35.5, h = 10)
Weight   f     X     u = (Xi - A)/h    fu    fu²
11-20    1    15.5       -2            -2     4
21-30    2    25.5       -1            -2     2
31-40    3    35.5        0             0     0
41-50    2    45.5        1             2     2
51-60    1    55.5        2             2     4
Total    9     --        --             0    12
$$S^2 = h^2\left[\frac{\sum f u_i^2}{n} - \left(\frac{\sum f u_i}{n}\right)^2\right] = 10^2\left[\frac{12}{9} - \left(\frac{0}{9}\right)^2\right] = 133.3 \text{ kg}^2$$
$$S = h\sqrt{\frac{\sum f u_i^2}{n} - \left(\frac{\sum f u_i}{n}\right)^2} = 10\sqrt{\frac{12}{9} - \left(\frac{0}{9}\right)^2} = 11.5 \text{ kg}$$
Notes:
• When computing the variance or S.D, round the result to one more decimal place than the original data values.
• The unit of the S.D is the same as that of the raw data, so it is preferable to use the S.D instead of the variance.
Test Yourself
Find the Variance and S.D from the following data:
1) 1, 3, 5, 7, 9, 11, 13, 15, 20, 19, 21
2) X        20     25     30     35     40
   f         2      4      9      3      1
3) Weight  21-30  31-40  41-50  51-60  61-70
   f         1      3      5      4      2
Note: A negative answer in calculating a measure of dispersion is always incorrect.
EXAMPLE 4.17
The number of runs scored by two cricketers A and B during a test series of 5 test matches is shown below for each of the 10 innings. Using the coefficient of variation, find which player is more consistent.
A    5   26   97   76   112   89    6   108   24   16
B   51   47   36   60    58   39   44    42   71   50
Solution
The formulae used for cricketer A (x) and cricketer B (y) are:
$$\text{C.V}(x) = \frac{\text{S.D}(x)}{\bar{x}} \times 100, \qquad \text{C.V}(y) = \frac{\text{S.D}(y)}{\bar{y}} \times 100$$
$$\bar{x} = \frac{\sum x_i}{n}, \qquad \bar{y} = \frac{\sum y_i}{n}$$
$$\text{S.D}(x) = \sqrt{\frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2}, \qquad \text{S.D}(y) = \sqrt{\frac{\sum y_i^2}{n} - \left(\frac{\sum y_i}{n}\right)^2}$$
x       x²        y      y²
5         25     51    2601
26       676     47    2209
97      9409     36    1296
76      5776     60    3600
112    12544     58    3364
89      7921     39    1521
6         36     44    1936
108    11664     42    1764
24       576     71    5041
16       256     50    2500
559    48883    498   25832
Cricketer A:
$\bar{x} = \frac{559}{10} = 55.9$
$$\text{S.D}(x) = \sqrt{\frac{48883}{10} - \left(\frac{559}{10}\right)^2} = 41.994$$
$$\text{C.V}(x) = \frac{41.994}{55.9} \times 100 = 75.12\%$$
Cricketer B:
$\bar{y} = \frac{498}{10} = 49.8$
$$\text{S.D}(y) = \sqrt{\frac{25832}{10} - \left(\frac{498}{10}\right)^2} = 10.157$$
$$\text{C.V}(y) = \frac{10.157}{49.8} \times 100 = 20.40\%$$
Since the C.V for player B is smaller than the C.V for player A, player B is more consistent.
EXAMPLE 4.18
Goals scored by two teams A and B in a football season were as follows:
No. of goals scored      Number of Matches (frequencies)
in a match (xi)             A        B
0                          27       17
1                           9        9
2                           8        6
3                           5        5
4                           4        3
Using the coefficient of variation, find which team may be considered more consistent.
Solution
The formulae used for each team are:
$$\text{C.V} = \frac{\text{S.D}}{\bar{x}} \times 100, \qquad \bar{x} = \frac{\sum f x_i}{n}, \qquad \text{S.D} = \sqrt{\frac{\sum f x_i^2}{n} - \left(\frac{\sum f x_i}{n}\right)^2}$$
where $f = f_A$, $n = n_A$ for Team A and $f = f_B$, $n = n_B$ for Team B.
xi      fA    fB    fA·xi    fA·xi²    fB·xi    fB·xi²
0       27    17      0        0         0        0
1        9     9      9        9         9        9
2        8     6     16       32        12       24
3        5     5     15       45        15       45
4        4     3     16       64        12       48
Total   53    40     56      150        48      126
Team A:
$\bar{x} = \frac{56}{53} = 1.057$
$$\text{S.D} = \sqrt{\frac{150}{53} - \left(\frac{56}{53}\right)^2} = 1.309$$
$$\text{C.V} = \frac{1.309}{1.057} \times 100 = 123.9\%$$
Team B:
$\bar{x} = \frac{48}{40} = 1.20$
$$\text{S.D} = \sqrt{\frac{126}{40} - \left(\frac{48}{40}\right)^2} = 1.308$$
$$\text{C.V} = \frac{1.308}{1.20} \times 100 = 109.0\%$$
Since the C.V for Team B is smaller than the C.V for Team A, Team B is more consistent.
Test Yourself
1) The number of runs scored by two cricketers A and B during a test series of 5 test matches is shown below for each of the 10 innings. Using the coefficient of variation, find which player is more consistent.
A   15   34   27   55    0    0    6    4   123    34
B    5   67   36   55   89   33   37   89    88   111
2) Goals scored by two teams A and B in a football season were as follows:
No. of goals scored      Number of Matches (frequencies)
in a match (xi)             A        B
0                          16       20
1                           8        7
2                           4        8
3                           6        2
4                           3        1
Using the coefficient of variation, find which team may be considered more consistent.
Merits and Demerits of Standard Deviation
Merits
• It is simple to understand.
• It is clearly defined by a mathematical formula.
• It is based on each and every value of the data.
• It is less affected by the fluctuations of sampling.
• It provides a basis for statistical inference.
Demerits
• Its calculation is not very simple.
• It is affected by extreme values.
• It is not a good measure for open-end distributions.
Properties of Variance and Standard Deviation
• The variance and S.D of a constant are zero, i.e.
$\text{Var}(c) = 0$ and $\text{S.D}(c) = 0$, where "c" is any constant.
• The variance and S.D are unaffected by a change of origin, i.e. when a constant is added to or subtracted from each value of a variable, the variance and S.D remain unchanged:
$\text{Var}(X \pm c) = \text{Var}(X)$ and $\text{S.D}(X \pm c) = \text{S.D}(X)$
• The variance and S.D are affected by a change of scale, i.e. when each observation of a variable is multiplied or divided by a constant, the variance and S.D change as follows:
$\text{Var}(cX) = c^2\,\text{Var}(X)$ and $\text{S.D}(cX) = |c|\,\text{S.D}(X)$
$\text{Var}\left(\frac{X}{c}\right) = \frac{1}{c^2}\,\text{Var}(X)$ and $\text{S.D}\left(\frac{X}{c}\right) = \frac{1}{|c|}\,\text{S.D}(X)$
• The variance of the sum or difference of two independent variables is equal to the sum of their respective variances, and the S.D is the positive square root of that sum, i.e.
$\text{Var}(X \pm Y) = \text{Var}(X) + \text{Var}(Y)$ and $\text{S.D}(X \pm Y) = \sqrt{\text{Var}(X) + \text{Var}(Y)}$
Moments
"The arithmetic means of the rth powers of the deviations taken from the mean, from zero or from any arbitrary origin (provisional mean) are called moments."
• When the deviations are computed from the arithmetic mean, the moments are called moments about the mean (mean moments) or central moments, denoted by $m_r$ and given by:
  Ungrouped data: $m_r = \frac{\sum (x_i - \bar{x})^r}{n}$
  Grouped data:   $m_r = \frac{\sum f(x_i - \bar{x})^r}{n}$
  where r = 1, 2, 3, 4, …
• When the deviations of the values are computed from the origin or zero, the moments are called moments about the origin, denoted by $m'_r$ and given by:
  Ungrouped data: $m'_r = \frac{\sum x_i^r}{n}$
  Grouped data:   $m'_r = \frac{\sum f x_i^r}{n}$
  where r = 1, 2, 3, 4, …
• When the deviations of the values are computed from any arbitrary value, say "A" (provisional mean), the moments are called moments about the provisional mean, denoted by $m'_r$:
  Ungrouped data: $m'_r = \frac{\sum D^r}{n}$
  Grouped data:   $m'_r = \frac{\sum f D^r}{n}$
  where r = 1, 2, 3, 4, … and $D = x_i - A$
Note: Moments about the provisional mean and moments about zero are called raw moments (denoted by $m'_r$).
EXAMPLE 4.19
Calculate the first four moments about the mean from the following data.
2, 4, 6, 8, 10
Solution
Here $\bar{x} = \frac{\sum x_i}{n} = \frac{30}{5} = 6$
xi    $(x_i - \bar{x})$   $(x_i - \bar{x})^2$   $(x_i - \bar{x})^3$   $(x_i - \bar{x})^4$
2          -4                 16                  -64                 256
4          -2                  4                   -8                  16
6           0                  0                    0                   0
8           2                  4                    8                  16
10          4                 16                   64                 256
30          0                 40                    0                 544
$$m_1 = \frac{\sum (x_i - \bar{x})}{n} = 0, \qquad m_2 = \frac{\sum (x_i - \bar{x})^2}{n} = \frac{40}{5} = 8$$
$$m_3 = \frac{\sum (x_i - \bar{x})^3}{n} = 0, \qquad m_4 = \frac{\sum (x_i - \bar{x})^4}{n} = \frac{544}{5} = 108.8$$
EXAMPLE 4.20
Calculate the first four moments about zero from the following data.
2, 4, 6, 8, 10
Solution
xi     xi²     xi³      xi⁴
2        4       8       16
4       16      64      256
6       36     216     1296
8       64     512     4096
10     100    1000    10000
30     220    1800    15664
$$m'_1 = \frac{\sum x_i}{n} = \frac{30}{5} = 6, \qquad m'_2 = \frac{\sum x_i^2}{n} = \frac{220}{5} = 44$$
$$m'_3 = \frac{\sum x_i^3}{n} = \frac{1800}{5} = 360, \qquad m'_4 = \frac{\sum x_i^4}{n} = \frac{15664}{5} = 3132.8$$
EXAMPLE 4.21
Calculate the first four moments about the provisional mean (P.M) from the following data.
2, 4, 6, 8, 10
Solution
Here D = xi – A (let A = 4)
X     D = Xi - A    D²     D³      D⁴
2        -2          4     -8      16
4         0          0      0       0
6         2          4      8      16
8         4         16     64     256
10        6         36    216    1296
30       10         60    280    1584
$$m'_1 = \frac{\sum D}{n} = \frac{10}{5} = 2, \qquad m'_2 = \frac{\sum D^2}{n} = \frac{60}{5} = 12$$
$$m'_3 = \frac{\sum D^3}{n} = \frac{280}{5} = 56, \qquad m'_4 = \frac{\sum D^4}{n} = \frac{1584}{5} = 316.8$$
EXAMPLE 4.22
Calculate the first four moments about the mean from the following data:
xi   2   3   4   5   6
f    1   3   7   3   1
Solution
Here $\bar{x} = \frac{\sum f x_i}{n} = \frac{60}{15} = 4$
xi      f    fx   $(x_i - \bar{x})$   $f(x_i - \bar{x})$   $f(x_i - \bar{x})^2$   $f(x_i - \bar{x})^3$   $f(x_i - \bar{x})^4$
2       1     2       -2                 -2                   4                    -8                   16
3       3     9       -1                 -3                   3                    -3                    3
4       7    28        0                  0                   0                     0                    0
5       3    15        1                  3                   3                     3                    3
6       1     6        2                  2                   4                     8                   16
Total  15    60       --                  0                  14                     0                   38
$$m_1 = \frac{\sum f(x_i - \bar{x})}{n} = \frac{0}{15} = 0, \qquad m_2 = \frac{\sum f(x_i - \bar{x})^2}{n} = \frac{14}{15} = 0.933$$
$$m_3 = \frac{\sum f(x_i - \bar{x})^3}{n} = \frac{0}{15} = 0, \qquad m_4 = \frac{\sum f(x_i - \bar{x})^4}{n} = \frac{38}{15} = 2.533$$
EXAMPLE 4.23
Calculate the first four moments about zero from the following data:
xi   2   3   4   5   6
f    1   3   7   3   1
Solution
xi      f    fx    fx²    fx³     fx⁴
2       1     2      4      8      16
3       3     9     27     81     243
4       7    28    112    448    1792
5       3    15     75    375    1875
6       1     6     36    216    1296
Total  15    60    254   1128    5222
$$m'_1 = \frac{\sum f x_i}{n} = \frac{60}{15} = 4, \qquad m'_2 = \frac{\sum f x_i^2}{n} = \frac{254}{15} = 16.93$$
$$m'_3 = \frac{\sum f x_i^3}{n} = \frac{1128}{15} = 75.2, \qquad m'_4 = \frac{\sum f x_i^4}{n} = \frac{5222}{15} = 348.13$$
EXAMPLE 4.24
Calculate the first four moments about the provisional mean (P.M) from the following data:
xi   2   3   4   5   6
f    1   3   7   3   1
Solution
Here D = xi – A (let A = 3)
xi      f    D = xi – A    fD    fD²    fD³    fD⁴
2       1       -1         -1     1     -1      1
3       3        0          0     0      0      0
4       7        1          7     7      7      7
5       3        2          6    12     24     48
6       1        3          3     9     27     81
Total  15       --         15    29     57    137
$$m'_1 = \frac{\sum f D}{n} = \frac{15}{15} = 1, \qquad m'_2 = \frac{\sum f D^2}{n} = \frac{29}{15} = 1.933$$
$$m'_3 = \frac{\sum f D^3}{n} = \frac{57}{15} = 3.8, \qquad m'_4 = \frac{\sum f D^4}{n} = \frac{137}{15} = 9.133$$
Test Yourself
Find Moments about Mean, about Zero and about P.M from the following data:
1) 11, 13, 15, 17, 19, 21, 23, 25, 30, 29, 31
2) xi   20   30   40   50   60
   f     1    5    9    4    1
All the raw moments can be converted into central moments (mean moments, or moments about the mean) by using the following relations:
• $m_1 = 0$
• $m_2 = m'_2 - (m'_1)^2$
• $m_3 = m'_3 - 3m'_1 m'_2 + 2(m'_1)^3$
• $m_4 = m'_4 - 4m'_1 m'_3 + 6(m'_1)^2 m'_2 - 3(m'_1)^4$
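The conversion from raw moments to central moments is purely mechanical, so it can be sketched as a small Python function. The name `central_from_raw` is an assumption for this sketch; the inputs are the first four raw moments about any origin.

```python
def central_from_raw(r1, r2, r3, r4):
    """Convert the first four raw moments (m'1..m'4) into
    central moments (m1..m4) using the standard relations."""
    m1 = 0.0
    m2 = r2 - r1 ** 2
    m3 = r3 - 3 * r1 * r2 + 2 * r1 ** 3
    m4 = r4 - 4 * r1 * r3 + 6 * r1 ** 2 * r2 - 3 * r1 ** 4
    return m1, m2, m3, m4


# Raw moments about zero: 4, 16.93, 75.2 and 348.13
m1, m2, m3, m4 = central_from_raw(4, 16.93, 75.2, 348.13)
print(round(m2, 2), round(m3, 2), round(m4, 2))
```

Because the relations do not depend on which origin the raw moments were taken about, the same function applies to moments about zero and to moments about a provisional mean.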
EXAMPLE 4.25
The first four moments about the origin X = 0 are 4, 16.93, 75.2 and 348.13 respectively. Find the moments about the mean.
Solution
Given that
$m'_1 = 4$, $m'_2 = 16.93$, $m'_3 = 75.2$, $m'_4 = 348.13$
Now we use:
$m_1 = 0$
$m_2 = m'_2 - (m'_1)^2$
⇒ $m_2 = 16.93 - (4)^2 = 0.93$
$m_3 = m'_3 - 3m'_1 m'_2 + 2(m'_1)^3$
⇒ $m_3 = 75.2 - 3(4)(16.93) + 2(4)^3 = 0.04$
$m_4 = m'_4 - 4m'_1 m'_3 + 6(m'_1)^2 m'_2 - 3(m'_1)^4$
⇒ $m_4 = 348.13 - 4(4)(75.2) + 6(4)^2(16.93) - 3(4)^4 = 2.21$
EXAMPLE 4.26
The first four moments about X = 12 are 2.40, 43.0, 337.50 and 5500 respectively. Find the moments about the mean.
Solution
Given that
$m'_1 = 2.40$, $m'_2 = 43.0$, $m'_3 = 337.50$, $m'_4 = 5500$
Now we know that
$m_1 = 0$
$m_2 = m'_2 - (m'_1)^2$
⇒ $m_2 = 43.0 - (2.40)^2 = 37.24$
$m_3 = m'_3 - 3m'_1 m'_2 + 2(m'_1)^3$
⇒ $m_3 = 337.50 - 3(2.40)(43) + 2(2.40)^3 = 55.548$
$m_4 = m'_4 - 4m'_1 m'_3 + 6(m'_1)^2 m'_2 - 3(m'_1)^4$
⇒ $m_4 = 5500 - 4(2.40)(337.50) + 6(2.40)^2(43) - 3(2.40)^4 = 3646.5472$
Note: For population data we use "$\mu$" (mu) instead of "m" in all the formulae of moments.
Test Yourself
1) The first four moments about the origin X = 0 are 8, 83.71, 1019.43 and 123100 respectively. Find the moments about the mean.
2) The first four moments about X = 25 are -1.9, 20.5, -96.3 and 906.1 respectively. Find the moments about the mean.
Symmetrical Distribution
• A distribution in which the values of the mean, median and mode are equal is called a symmetrical distribution, i.e.
Mean = Median = Mode
• A distribution in which the two quartiles are equidistant from the median is called a symmetrical distribution, i.e.
$Q_3 - \text{Median} = \text{Median} - Q_1$
or $Q_3 + Q_1 - 2\,\text{Median} = 0$
• A distribution is said to be symmetrical if:
$b_1 = 0$
• A distribution in which the two tails are equal in length from the central value is called a symmetrical distribution. A symmetrical distribution is typically bell-shaped.
Moment-Ratios
              1st Moment Ratio                     2nd Moment Ratio
Sample        $b_1 = \frac{m_3^2}{m_2^3}$          $b_2 = \frac{m_4}{m_2^2}$
Population    $\beta_1 = \frac{\mu_3^2}{\mu_2^3}$  $\beta_2 = \frac{\mu_4}{\mu_2^2}$
Moment-ratios are independent of the origin and of the units of measurement, i.e. they are dimensionless quantities.
Skewness
We know that for a symmetrical distribution the values of the mean, median and mode are equal, and that the two tails of the distribution are equal in length from the central value.
"Skewness is the degree of asymmetry"
OR
"Skewness is the lack (absence) of symmetry around the central value (average)"
The presence of skewness tells us that a particular distribution is not symmetrical or, in other words, that it is skewed. In a skewed distribution the curve is turned more to one side than the other.
To measure the skewness we will use:
  Sample:     $\alpha_3 = \sqrt{b_1} = \frac{m_3}{\sqrt{m_2^3}}$
  Population: $\gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\sqrt{\mu_2^3}}$
Positive Skewness
• Skewness is said to be positive if the mean is greater than the median and the median is greater than the mode, i.e.
Mean > Median > Mode
• Skewness is said to be positive if:
$Q_3 + Q_1 - 2\,\text{Median} > 0$
• In terms of moments, skewness is said to be positive if:
$\alpha_3 > 0$
• Skewness is said to be positive if the right tail of a distribution is longer than its left tail.
[Figure: positively skewed curve showing the positions of the mean, median and mode]
Negative Skewness
• Skewness is said to be negative if the mean is smaller than the median and the median is smaller than the mode, i.e.
Mean < Median < Mode
• Skewness is said to be negative if:
$Q_3 + Q_1 - 2\,\text{Median} < 0$
• In terms of moments, skewness is said to be negative if:
$\alpha_3 < 0$
• Skewness is said to be negative if the left tail of a distribution is longer than its right tail.
[Figure: negatively skewed curve showing the positions of the mean, median and mode]
Measures of Skewness
Absolute measures of skewness
• A.S = Mean – Mode
• A.S = Mean – Median
• A.S = Q3 + Q1 – 2 Median
Relative measures of skewness
• Karl Pearson's measures of Skewness
• Bowley's measures of Skewness
• Coefficient of Skewness based on Moments
EXAMPLE 4.27
Calculate absolute skewness if Mean = 13.25 and Median = 12.96.
Solution
Given that
Mean = 13.25
Median = 12.96
To calculate absolute skewness we use the formula:
Absolute skewness = Mean – Median = 13.25 – 12.96 = 0.29
Hence the distribution is positively skewed.
EXAMPLE 4.28
Calculate absolute skewness if Mean = 12.61 and Mode = 13.25.
Solution
Given that
Mean = 12.61
Mode = 13.25
Absolute skewness = Mean – Mode = 12.61 – 13.25 = -0.64
Hence the distribution is negatively skewed.
EXAMPLE 4.29
Calculate absolute skewness if $Q_1 = 13.73$, $Q_3 = 38.29$ and Median = 26.01.
Solution
To calculate absolute skewness we use the formula:
Absolute skewness = $Q_3 + Q_1 - 2\,\text{Median}$ = 38.29 + 13.73 – 2(26.01) = 0
Hence the distribution is symmetrical.
Test Yourself
1) Calculate absolute skewness if Mean = 33.25 and Median = 32.96.
2) Calculate absolute skewness if Mean = 42.61 and Mode = 43.25.
3) Calculate absolute skewness if $Q_1 = 124.87$, $Q_3 = 146.53$ and Median = 135.7.
Karl Pearson's measures of Skewness
It is defined as:
$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$$
"It is to be noted that this measure was suggested by Karl Pearson (1857-1936) and is known as the Pearsonian coefficient."
Since in many cases the mode is ill-defined, we replace (Mean – Mode) by its equivalent from the empirical relation, i.e. 3(Mean – Median), and hence:
$$S_k = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$
This coefficient usually varies between –3 and +3.
Note: Karl Pearson introduced the coefficient of skewness based on the mean, median, mode and S.D to measure the skewness of a distribution.
EXAMPLE 4.30
Calculate Pearson's coefficient of skewness if Mean = 13.25, Mode = 12.61 and S.D = 3.73.
Solution
Given that
Mean = 13.25, Mode = 12.61, S.D = 3.73
$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} = \frac{13.25 - 12.61}{3.73} = 0.1716$$
Bowley's measures of Skewness
It is defined as:
$$S_k = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$$
"It is also to be noted that this measure was suggested by Bowley (1869-1957) and is known as Bowley's coefficient."
This coefficient usually varies between –1 and +1.
Note: Bowley introduced the coefficient of skewness based on quartiles to measure the skewness of a distribution.
EXAMPLE 4.31
Calculate Bowley's coefficient of skewness if $Q_1 = 14.6$, $Q_3 = 25.2$ and Median = 18.8.
Solution
To calculate Bowley's coefficient of skewness we use the formula:
$$S_k = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1} = \frac{25.2 + 14.6 - 2(18.8)}{25.2 - 14.6} = 0.21$$
Coefficient of Skewness based on Moments
It is defined by:
$$\alpha_3 = \frac{m_3}{\sqrt{m_2^3}}$$
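The three relative measures of skewness described above can be sketched side by side in Python. The function names are hypothetical; each function simply evaluates the corresponding formula from this chapter.

```python
import math


def pearson_skewness(mean, mode, sd):
    """Karl Pearson's coefficient: (Mean - Mode) / S.D."""
    return (mean - mode) / sd


def pearson_skewness_median(mean, median, sd):
    """Alternative form used when the mode is ill-defined."""
    return 3 * (mean - median) / sd


def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2 Median) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


def moment_skewness(m2, m3):
    """Moment coefficient of skewness: m3 / sqrt(m2^3)."""
    return m3 / math.sqrt(m2 ** 3)


print(round(pearson_skewness(13.25, 12.61, 3.73), 4))
print(round(bowley_skewness(14.6, 18.8, 25.2), 2))
print(round(moment_skewness(2.5, 0.7), 2))
```

In every case a positive value indicates positive skewness, a negative value indicates negative skewness, and zero indicates symmetry.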
EXAMPLE 4.32
The first four moments about the origin X = 0 are 4, 16.93, 75.2 and 348.13 respectively. Find the moments about the mean and also find the coefficient of skewness based on moments.
Solution
Given that
$m'_1 = 4$, $m'_2 = 16.93$, $m'_3 = 75.2$, $m'_4 = 348.13$
Now we know that
$m_1 = 0$
$m_2 = m'_2 - (m'_1)^2$
⇒ $m_2 = 16.93 - (4)^2 = 0.93$
$m_3 = m'_3 - 3m'_1 m'_2 + 2(m'_1)^3$
⇒ $m_3 = 75.2 - 3(4)(16.93) + 2(4)^3 = 0.04$
$m_4 = m'_4 - 4m'_1 m'_3 + 6(m'_1)^2 m'_2 - 3(m'_1)^4$
⇒ $m_4 = 348.13 - 4(4)(75.2) + 6(4)^2(16.93) - 3(4)^4 = 2.21$
Now
$$\alpha_3 = \frac{m_3}{\sqrt{m_2^3}} = \frac{0.04}{\sqrt{0.93^3}} = 0.0446$$
Test Yourself
1) Calculate Pearson's coefficient of skewness if Mean = 50, Mode = 55 and S.D = 12.5.
2) Calculate Bowley's coefficient of skewness if $Q_1 = 13.73$, $Q_3 = 38.29$ and Median = 26.01.
3) The first four moments about the origin X = 0 are 23.5, 297, 5299.6 and 110306.94 respectively. Find the moments about the mean and also find the coefficient of skewness based on moments.
A distribution is said to be normal if its $b_1 = 0$ and $b_2 = 3$. The curve of the normal distribution is bell-shaped and symmetric.
• For a bell-shaped symmetric distribution:
  • 68.27% of the area of the normal curve lies within the range $\mu \pm \sigma$
  • 95.45% of the area of the normal curve lies within the range $\mu \pm 2\sigma$
  • 99.73% of the area of the normal curve lies within the range $\mu \pm 3\sigma$
  • Mean Deviation = $\frac{4}{5}$ (Standard Deviation)
  • Quartile Deviation = $\frac{2}{3}$ (Standard Deviation)
  • Quartile Deviation = $\frac{5}{6}$ (Mean Deviation)
Kurtosis
"The degree of peakedness or flatness of a frequency distribution relative to the normal distribution is called kurtosis."
OR
"The characteristic by which we compare the 'hump' of a distribution with that of the normal distribution is called kurtosis."
Kurtosis indicates whether a particular distribution is flatter or more peaked than the normal curve. To measure the kurtosis we will use:
$$b_2 = \frac{m_4}{m_2^2}$$
• If $b_2 > 3$, then the distribution is known as leptokurtic
• If $b_2 = 3$, then the distribution is known as mesokurtic
• If $b_2 < 3$, then the distribution is known as platykurtic
EXAMPLE 4.33
The first four moments about X = 170 are 5.2, 664, 10720 and 1456000 respectively. Calculate $b_2$ and determine the kurtosis of the distribution.
Solution
Given that
$m'_1 = 5.2$, $m'_2 = 664$, $m'_3 = 10720$, $m'_4 = 1456000$
Now
$m_1 = 0$
$m_2 = m'_2 - (m'_1)^2$
⇒ $m_2 = 664 - (5.2)^2 = 637$
$m_3 = m'_3 - 3m'_1 m'_2 + 2(m'_1)^3$
⇒ $m_3 = 10720 - 3(5.2)(664) + 2(5.2)^3 = 642.82$
$m_4 = m'_4 - 4m'_1 m'_3 + 6(m'_1)^2 m'_2 - 3(m'_1)^4$
⇒ $m_4 = 1456000 - 4(5.2)(10720) + 6(5.2)^2(664) - 3(5.2)^4 = 1338558$
$$b_2 = \frac{m_4}{m_2^2} = \frac{1338558}{637^2} = 3.3$$
Since $b_2$ is more than 3, the distribution is leptokurtic.
EXAMPLE 4.34
Find the kurtosis using $m_4 = 2.533$ and $m_2 = 0.933$.
Solution
Given that $m_4 = 2.533$ and $m_2 = 0.933$
Now
$$b_2 = \frac{m_4}{m_2^2} = \frac{2.533}{0.933^2} = 2.91$$
Since $b_2$ is less than 3, the distribution is platykurtic.
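The classification rule for kurtosis can be sketched in Python as below. The function name `kurtosis_type` and the small tolerance `tol` (used to compare the floating-point value of $b_2$ with 3) are assumptions made for this illustration.

```python
def kurtosis_type(m2, m4, tol=1e-9):
    """Return (b2, label) where b2 = m4 / m2^2."""
    b2 = m4 / m2 ** 2
    if abs(b2 - 3) <= tol:
        label = "mesokurtic"      # same peakedness as the normal curve
    elif b2 > 3:
        label = "leptokurtic"     # more peaked than the normal curve
    else:
        label = "platykurtic"     # flatter than the normal curve
    return b2, label


# Central moments of Example 4.34
b2, label = kurtosis_type(0.933, 2.533)
print(round(b2, 2), label)
```

In practice a computed $b_2$ is rarely exactly 3, so values very close to 3 are usually described as approximately mesokurtic.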
EXAMPLE 4.35
The first four central moments of a distribution are 0, 2.5, 0.7 and 18.75 respectively. Test the skewness and kurtosis.
Solution
Given that
$m_1 = 0$, $m_2 = 2.5$, $m_3 = 0.7$, $m_4 = 18.75$
Skewness:
$$\alpha_3 = \frac{m_3}{\sqrt{m_2^3}} = \frac{0.7}{\sqrt{2.5^3}} = 0.18$$
Therefore the distribution is positively skewed.
Kurtosis:
$$b_2 = \frac{m_4}{m_2^2} = \frac{18.75}{2.5^2} = 3$$
Therefore the distribution is mesokurtic.
Test Yourself
1) The first four moments about X = 34.5 are -11, 260, -5000 and 128000 respectively. Calculate $b_2$ and determine the kurtosis of the distribution.
2) Find the kurtosis using $m_4 = 3646.54$ and $m_2 = 37.24$.
3) The first four central moments of a distribution are 0, 37.24, 55.55 and 3646.54 respectively. Test the skewness and kurtosis.
Sharpen your Pencil

MCQ’s
(1) var(3x  2)  _____
(A) 9 var( x) (B) 9 var( x)  2 (C) var( x) (D) None of these
(2) $\operatorname{var}\left(\frac{3x}{2}\right) =$ _____
(A) $\frac{9}{2}\operatorname{var}(x)$ (B) $\frac{9}{4}\operatorname{var}(x)$ (C) $9\operatorname{var}(x)$ (D) None of these
(3) S.D is always _____ than range.
(A) More (B) Less (C) Both A & B (D) None of these
(4) S.D is always _____ than M.D.
(A) More (B) Less (C) Both A & B (D) None of these
(5) M.D is always _____ S.D
(A) Less than (B) Greater than (C) Equal to (D) None of these
(6) If each observation is multiplied by 5 then S.D is _____
(A) 5 S.D (B) 5 S.D (C) $5^2$ S.D (D) None of these
(7) If $m_2 = 4$ and $m_4 = 16$ then $b_2 =$ _____
(8) S.D of a, a, a, a is _____
(A) 0 (B) a (C) 1 (D) None of these
(9) If $m_2 = 4$ then S.D = _____
(10) For symmetric distribution _____
(A) $b_1 = 0$ (B) $b_1 = 3$ (C) Both A & B (D) None of these
(11) Coefficient of variation is infinite if _____ is zero.
(A) S.D (B) Mean (C) G.M (D) None of these
(12) C.V is zero if _____ is zero.
2
(A) x (B) (C) S.D (D) None of these
(13) S.D = _____
(A) $\frac{5}{4}$ M.D (B) $\frac{4}{5}$ M.D (C) $\frac{1}{2}$ Q.D (D) None of these
(14) If var(X) = 4 then var(2X+4) is _____
(A) 4 (B) 16 (C) none (D) None of these
(15) Measures of dispersion can _____ be negative.
(A) Always (B) Never (C) Sometimes (D) None of these
(16) $m_2 =$ _____
(A) Variance (B) S.D (C) Mean (D) None of these
(17) If $\bar{x} = 20$ and $S = 5$ then C.V = _____
(A) 25% (B) 125% (C) 80% (D) None of these
Short Questions
ExeRciSe
Q.4.01. The first two moments of a distribution about x = 2 are 1 and 5. Find its mean and S.D.
Q.4.02. Find Bowley's coefficient of skewness if $Q_1 = 8.88$, $Q_3 = 13.42$ and Median = 11.45.
Q.4.03. The first four central moments of a distribution are 0, 2.5, 0.7 and 18.75. Test the kurtosis of the distribution.
Q.4.04. Find the range and its coefficient from the following data: 5, 19, 3, 6, 5, 8, 9, 40, 5 and 6.
Q.4.05. Find the Q.D of: 2, 3, 5, 5, 5, 6, 6, 8, 9, 9, 30, 38 and 40.
Q.4.06. What is meant by coefficient of variation?
Q.4.07. What are the types of absolute and relative measures of dispersion?
Q.4.08. Define absolute and relative measures of dispersion.
Q.4.09. Define kurtosis. How is it measured?
Q.4.10. Distinguish between symmetry and skewness.
Q.4.11. If Mean = 42, Median = 42.2 and Mode = 42.3, find the skewness.
Q.4.12. If mean = 50.9 and variance = 16, find the C.V.
Q.4.13. If a data set has $\sum f = 15$, $\sum fx = 60$, $\sum fx^2 = 254$, $\sum fx^3 = 1128$ and $\sum fx^4 = 5222$, find the 3rd moment about the mean.
Q.4.14. If the first four moments of a distribution about “0” are -1.5, 17, -30 and 108, find the coefficient of skewness $b_1$.
Q.4.15. If Mean = 10 and Median = 8, calculate the mode. What will be the skewness of the distribution?
Q.4.16. If mean = 10 and $m_2 = 16$, find the C.V.
Q.4.17. What do you understand by the term dispersion? Name the methods of measuring dispersion.
Q.4.18. If n = 10, $\bar{x} = 12$ and $\sum x^2 = 1530$, find the coefficient of variation.
Q.4.19. Calculate the variance of the following data: 6, 9, 12, 15 and 18.
Q.4.20. If a data set has $Q_1 = 10$, $Q_2 = 18$ and $Q_3 = 30$, find the coefficient of Q.D.
Q.4.21. In each of the cases below, determine whether the data set is skewed and, if so, whether it is negatively or positively skewed:
(i) Mean = 32.9, Median = 21.4, Mode = 3.5
(ii) Mean = 10.9, Median = 10.9, Mode = 10.9
(iii) Mean = 42, Median = 42.2, Mode = 42.3
Q.4.22. Give some merits and demerits of S.D.
Q.4.23. Define skewness. Explain positive and negative skewness.
Q.4.24. Sum of values = 129, sum of squares of values = 3371, number of values = 5. Find the mean, variance and standard deviation.
Q.4.25. Define mean deviation and variance.
Q.4.26. Define moments about the mean and write down their computing formulae.
Long Questions
ExeRciSe
Q.4.01. Compute the M.D from (i) Mean (ii) Median. Also calculate their coefficients.
Marks   0-10   11-21   22-32   33-43   44-54   55-65   66-76   77-87
f       3      7       21      17      10      9       4       1
Q.4.02. Find the range and quartile deviation; also find their coefficients.
Groups  25-50   50-75   75-100   100-125   125-150   150-175
f       10      12      16       17        20        18
Q.4.03. Find the coefficient of variation.
Groups  1.5-1.9   2.0-2.4   2.5-2.9   3.0-3.4   3.5-3.9   4.0-4.4   4.5-4.9
f       2         1         4         15        10        5         3
Q.4.04. After 10 weeks of tuition, the results of the students are given below:
Teacher A   12   15   6    73   7   19   99   36   84   29
Teacher B   47   12   76   48   4   51   37   48   13   0
Which teacher is more consistent?
Q.4.05. The first four moments of a distribution about the value X = 4 of the variable are -1.5, 17, -30 and 108. Find the moments about the mean.
Q.4.06. Find Bowley's and Pearson's coefficients of skewness. Also find the coefficient of skewness based on moments.
Groups                   20-24   25-29   30-34   35-39   40-44   45-49   50-54
Cumulative frequencies   22      50      268     495     730     946     1000
Q.4.07. Find the first four moments about the mean; also find $b_1$ and $b_2$.
Marks             0-10   11-21   22-32   33-43   44-54   55-65   66-76   77-87
No. of Students   3      7       21      17      10      9       4       1
Q.4.08. In the following data, are doctors' salaries more consistent than those of peons?
                    Mean    S.D
Doctors' Salaries   20000   6500
Peons' Salaries     900     250
Q.4.09. Find the shape of the distribution from the data given below:
$\sum f = 100$, $\sum fD = 15$, $\sum fD^2 = 97$, $\sum fD^3 = 33$, $\sum fD^4 = 253$ and $D = X_i - 67$
Q.4.10. Find the mean deviation and standard deviation for the data given below:
Groups  0-10   10-20   20-30   30-40   40-50   50-60
f       6      7       10      8       4       2
Q.4.11. Find the first four moments about the mean from the following data:
x   2   3   4   5   6
f   1   3   7   3   1
Q.4.12. Compute the coefficient of variation of (i) x (ii) y = 2x, where “x” has the values 2, 3, 3, 5, 5, 5, 8, 10, 12. Are the two results the same? If so, give reasons.
Q.4.13. The five temperature readings in °C are as follows:
X = 15.3, 21.3, 17.4, 20.1, 15.9
(i) Calculate the variance.
(ii) Make the transformation y = x - 12.5.
(iii) Calculate the variance of the transformed observations.
(iv) What is the effect of this transformation on the variance of the original observations?
Q.4.14. Find Bowley's coefficient of skewness.
Classes  22-25   25-28   28-31   31-34   34-37
f        2       5       9       6       1
CHAPTER 05
Index Numbers
Chapter Contents
You should read this chapter if you need to learn about:
• Definition of Index Number
• Types of Index Numbers by Nature
• The Base Period (Year)
• Fixed Base and Chain Base Methods
• Types of Index Numbers by Treatment
• Wholesale Price Index Numbers
• Steps involved in the construction of WPI
• Consumer Price Index Numbers
• Problems involved in the Construction of Index Numbers
• Exercise
Index Number
“A statistical measure of the average change in the price, quantity or value of a variable (commodity) or group of variables with respect to time or space is called an Index Number.”
Index numbers are used, for example:
• To compare the change in the average price of milk in 2013 with that in 2010.
• To compare the retail price of rice in KPK with that in Lahore or Karachi.
• To know the increase in the yield of wheat in Pakistan during 2013 as compared to 2010.
Types of Index Number by Nature

There are three types of index numbers by nature:
• Price Index Number
• Quantity Index Number
• Value Index Number
Price Index Number
“Price index number is a measure of the changes in the prices of certain commodities with respect
to time or space”.
Quantity Index Number
“A quantity index number, or volume index number, measures the changes in the quantity or volume produced, consumed or sold of certain commodities with respect to time or space.”
Value Index Number
“A value index number measures the changes in the value of commodities in a given period with reference to the base period.”
When the price “p” of a commodity during a period is multiplied by its quantity “q” produced, purchased or sold, we get the value, denoted by v (= pq). The value index is therefore
$$P_{on} = \frac{\sum p_n q_n}{\sum p_o q_o} \times 100$$
where $p_n$ and $q_n$ denote the prices and quantities of the given year and $p_o$ and $q_o$ denote the prices and quantities of the base year.
The Base Period (Year)

“The period with which we like to compare the relative changes is known as the reference period or base period.” The base period may be a year, month, week or a day.
A base period (year) should be a normal year. By a normal year we mean a year of economic stability, free from crises caused by wars, strikes, earthquakes, floods etc. If a single year of normal conditions is not available, then the average of several years is used as the base period.
Note:
• The index for the base period is always taken as 100, and the index number of the current year is then compared to 100.
• An index number is a percentage, but the percent sign is usually omitted.
There are two methods of selecting a base period:
• Fixed base method
• Chain base method
Fixed Base Method
According to this method, a particular year is chosen as the base period, which remains unchanged during the lifetime of the index.
To compute index numbers by the fixed base method, the value of the base year is taken as 100. Index numbers for other periods are computed by dividing the price of a given year by the base year price and multiplying the result by 100. The values so obtained are called price relatives, i.e.
$$\text{Price Relative} = \frac{\text{Prices of Commodities in Given Year}}{\text{Prices of Commodities in Base Year}} \times 100$$
$$\Rightarrow P_{on} = \frac{p_n}{p_o} \times 100$$
where $p_n$ denotes the prices of the given year and $p_o$ denotes the prices of the base year.
Chain Base Method

According to this method, the base year is not fixed but changes from year to year. Here the prices of the previous year are taken as the base and the relatives are computed accordingly. The price relatives computed by the chain base method are known as Link Relatives, i.e.
$$\text{Link Relative} = \frac{\text{Prices of Commodities in Given Year}}{\text{Prices of Commodities in the Preceding Year}} \times 100$$
$$\Rightarrow P_{n-1,n} = \frac{p_n}{p_{n-1}} \times 100$$
where $p_n$ denotes the prices in the given year and $p_{n-1}$ denotes the prices in the preceding year.
The link relatives cannot be used directly to make comparisons. For the purpose of comparison the link relatives have to be converted to a fixed base. The indices so obtained are called Chain Indices. The chain index for a year is obtained by multiplying the average of the link relatives of that year by the chain index of the preceding year and then dividing the resulting product by 100.
Types of Index Number by Treatment
There are two types of index numbers by treatment:
• Simple Index Number
• Composite Index Number
Simple Index Number
“An index number is called a simple index number when it is computed for one variable (commodity).”
A simple index number can be computed by the:
• Fixed Base Method
• Chain Base Method
Fixed Base Method
In this method the index number is computed as a price relative, as given below:
$$\text{Price Relative} = \frac{p_n}{p_o} \times 100$$
Chain Base Method
In this method the index number is computed in two steps.
Step 1: First, we calculate the link relatives by dividing the current period price by the price of the immediately preceding period and multiplying this ratio by 100, i.e.
$$P_{n-1,n} = \frac{p_n}{p_{n-1}} \times 100$$
Step 2: Second, we reverse the procedure of Step 1. To get the chain indices, we multiply the current period link relative by the chain index of the immediately preceding period and divide this product by 100. The chain index of the base period is 100.
EXAMPLE 5.01
The price of wheat (per maund) is given for the years 1964 to 1973. Calculate simple index numbers using:
(i) 1964 as base
(ii) Average of the prices for the first five years as base
(iii) Average of the prices of all the ten years as base.
Years          1964   1965   1966   1967   1968   1969   1970   1971   1972   1973
Prices (Rs.)   20     18     23     24     25     27     28     30     32     33
Solution The average price of the first five years is (20 + 18 + 23 + 24 + 25)/5 = 22 and the average price of all ten years is 260/10 = 26. The necessary computations are given below:
                 Simple index numbers taking
Years   Prices   1964 as base           Average of first five years as base   Average of all ten years as base
1964    20       (20/20) × 100 = 100    (20/22) × 100 = 90.9                  (20/26) × 100 = 76.9
1965    18       (18/20) × 100 = 90     (18/22) × 100 = 81.8                  (18/26) × 100 = 69.2
1966    23       (23/20) × 100 = 115    104.5                                 88.5
1967    24       120                    109.1                                 92.3
1968    25       125                    113.6                                 96.2
1969    27       135                    122.7                                 103.8
1970    28       140                    127.3                                 107.7
1971    30       150                    136.4                                 115.4
1972    32       160                    145.5                                 123.1
1973    33       165                    150.0                                 126.9
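The three columns of Example 5.01 differ only in the base price used as the divisor. The sketch below illustrates this under that reading; the helper name `simple_index` is hypothetical, and rounding to one decimal place is an assumption made to match the table.

```python
# Wheat prices (Rs. per maund) from Example 5.01.
prices = {1964: 20, 1965: 18, 1966: 23, 1967: 24, 1968: 25,
          1969: 27, 1970: 28, 1971: 30, 1972: 32, 1973: 33}


def simple_index(prices, base_price):
    """Price relative of every year against one base price (1 decimal place)."""
    return {year: round(p / base_price * 100, 1) for year, p in prices.items()}


first_five = [prices[y] for y in sorted(prices)[:5]]
base_first5 = sum(first_five) / len(first_five)    # average of 1964-1968
base_all = sum(prices.values()) / len(prices)      # average of all ten years

index_1964 = simple_index(prices, prices[1964])
index_first5 = simple_index(prices, base_first5)
index_all = simple_index(prices, base_all)

for year in sorted(prices):
    print(year, index_1964[year], index_first5[year], index_all[year])
```

Changing the base only rescales the whole series, so the year-to-year ratios of the index are the same in all three columns.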
Test Yourself
The price of wheat (per maund) is given for the years 2000 to 2009. Calculate simple index numbers using:
(i) 2000 as base
(ii) Average of the prices for the first five years as base
(iii) Average of the prices of all the ten years as base.
Years          2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
Prices (Rs.)   120    118    123    124    125    127    128    130    132    133
EXAMPLE 5.02
The price of wheat (per maund) is given for the years 1960 to 1967. Calculate index numbers by the chain base method using 1960 as base:
Years          1960   1961   1962   1963   1964   1965   1966   1967
Prices (Rs.)   40     45     48     50     52     54     56     60
Years   Prices (Rs.)   Link Relatives           Chain Indices
1960    40             100                      100
1961    45             (45/40) × 100 = 112.5    (112.5 × 100)/100 = 112.5
1962    48             (48/45) × 100 = 106.7    (106.7 × 112.5)/100 = 120.04
1963    50             104.2                    125.08
1964    52             104.0                    130.08
1965    54             103.8                    135.02
1966    56             103.7                    140.02
1967    60             107.1                    149.96
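The two steps of the chain base method in Example 5.02 can be sketched as follows. This is only an illustration: the helper `round_half_up` is hypothetical, and the rounding (link relatives to one decimal place, chain indices to two) is an assumption made to mirror the table. `Decimal` is used so that rounding behaves like hand calculation.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, step):
    """Round a Decimal to the given step (e.g. "0.1") the way it is done by hand."""
    return value.quantize(Decimal(step), rounding=ROUND_HALF_UP)


# Wheat prices for 1960..1967 from Example 5.02.
years = list(range(1960, 1968))
prices = [Decimal(p) for p in (40, 45, 48, 50, 52, 54, 56, 60)]

# Step 1: link relative = current price / preceding price x 100.
link_relatives = [Decimal("100")]
for prev, curr in zip(prices, prices[1:]):
    link_relatives.append(round_half_up(curr / prev * 100, "0.1"))

# Step 2: chain index = link relative x preceding chain index / 100.
chain_indices = [Decimal("100")]
for link in link_relatives[1:]:
    chain_indices.append(round_half_up(link * chain_indices[-1] / 100, "0.01"))

for year, link, chain in zip(years, link_relatives, chain_indices):
    print(year, link, chain)
```

For a single commodity the chain index is the fixed base index (price divided by the 1960 price, times 100) up to rounding, which is why the values stay close to 112.5, 120, 125 and so on.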
Test Yourself
The price of wheat (per maund) is given for the years 2001 to 2008. Calculate index numbers by the chain base method using 2001 as base:
Years          2001   2002   2003   2004   2005   2006   2007   2008
Prices (Rs.)   140    145    148    150    152    154    156    160
Composite Index Number
“An index number is called a composite index number when it is computed for more than one variable (commodity).”
Composite (price or quantity) index numbers may further be classified as:
• Un-weighted Index Number
• Weighted Index Number
Un-weighted Index Number or Un-weighted Indices
“An index number that measures the changes in the prices of a group of commodities when the relative importance, i.e. the weights, of the commodities is not taken into account is called an un-weighted index number or un-weighted index.”
Un-weighted index numbers may be further classified as:
• Simple Aggregative Index Number
• Simple Average of Relatives Index Number
Simple Aggregative Index
A simple aggregative index number can be computed by the:
• Fixed Base Method
• Chain Base Method
Fixed Base Method
Under this method the index number is obtained by dividing the sum of the given year prices of all the commodities by the sum of the base year prices of the same commodities and multiplying the result by 100, i.e.
$$P_{on} = \frac{\sum p_n}{\sum p_o} \times 100$$
Note: In 1738, the French economist Dutot introduced the simple aggregative method of index numbers.
Chain Base Method
Under this method:
Step 1: First, we compute the link relative for each year by dividing the current year's total of prices by the immediately preceding year's total of prices and expressing the result as a percentage.
Step 2: Second, to get the chain indices we reverse the procedure used in calculating the link relatives, i.e. we multiply each year's link relative by the preceding year's chain index and divide this product by 100.
EXAMPLE 5.03
The following table shows wholesale prices of wheat, rice and mutton for the years 1972, 1973 and 1974. Compute the simple aggregative price indices for 1973 and 1974 using 1972 as a base:
        Commodity (Prices in Rs.)
Years   Wheat   Rice   Mutton
1972    30      80     240
1973    32      100    300
1974    37      110    400
Solution
Years          Wheat   Rice   Mutton   Total
1972 ($p_o$)   30      80     240      350
1973 ($p_1$)   32      100    300      432
1974 ($p_2$)   37      110    400      547
Therefore, for 1973:
$$P_{o1} = \frac{\sum p_1}{\sum p_o} \times 100 = \frac{432}{350} \times 100 = 123.43\%$$
for 1974:
$$P_{o2} = \frac{\sum p_2}{\sum p_o} \times 100 = \frac{547}{350} \times 100 = 156.29\%$$
Test Yourself
The following table shows wholesale prices of wheat, rice and mutton for the years 2005, 2006 and 2007. Compute the simple aggregative price indices for 2006 and 2007 using 2005 as a base:
        Commodity (Prices in Rs.)
Years   Wheat   Rice   Mutton
2005    130     180    340
2006    132     200    400
2007    137     210    500
EXAMPLE 5.04
The following table shows wholesale prices of wheat, rice and mutton for the years 1972, 1973 and 1974. Compute the chain indices for 1973 and 1974 using 1972 as a base:
        Commodity (Prices in Rs.)
Years   Wheat   Rice   Mutton
1972    30      80     240
1973    32      100    300
1974    37      110    400
Solution
        Commodity (Prices in Rs.)
Years   Wheat   Rice   Mutton   Total   Link Relative               Chain Indices
1972    30      80     240      350     100                         100
1973    32      100    300      432     (432/350) × 100 = 123.43    (123.43 × 100)/100 = 123.43
1974    37      110    400      547     (547/432) × 100 = 126.62    (126.62 × 123.43)/100 = 156.29
Test Yourself

The following table shows wholesale prices of wheat, rice and mutton for the years 2000, 2001 and 2002. Compute the chain indices for 2001 and 2002 using 2000 as a base:
Years   Wheat   Rice   Mutton
2000    40      90     340
2001    42      110    400
2002    47      120    500
Simple Average of Relatives
Simple average of relatives can be computed by the:
• Fixed Base Method
• Chain Base Method
Note: In 1764, the Italian economist Carli introduced the arithmetic average of price relatives. In 1863, the English economist Jevons introduced the geometric average of price relatives.
Fixed Base Method
Under this method the simple average of price relatives is obtained as follows:
Step 1: First, we find the price relatives for each commodity given.
Step 2: Then we average these relatives using the arithmetic mean, median or geometric mean. The averages so obtained are known as index numbers by simple average of relatives.
Chain Base Method
Under this method:
Step 1: First, we find the link relatives for the given commodities.
Step 2: Second, we take the average (arithmetic mean, median or geometric mean) of the link relatives.
Step 3: Third, we find the chain indices.
EXAMPLE 5.05
From the data given below, compute the index numbers of prices, taking 1962 as base. Use:
(i) Simple average (A.M) of price relatives
(ii) Median of price relatives and
(iii) Geometric mean of price relatives.
Years   Firewood   Soft coke   Kerosene oil   Matches
1962    3.25       2.50        0.20           0.06
1963    3.44       2.80        0.22           0.06
1964    3.50       2.00        0.25           0.06
1965    3.75       2.50        0.25           0.06
Solution
        Price Relative                                                              Index Number By
Years   Firewood                   Soft coke   Kerosene oil   Matches   Total   (i) A.M   (ii) Median   (iii) G.M
1962    100                        100         100            100       400     100       100           100
1963    (3.44/3.25) × 100 = 106    112         110            100       428     107       108           107
1964    (3.50/3.25) × 100 = 108    80          125            100       413     103       104           102
1965    (3.75/3.25) × 100 = 115    100         125            100       440     110       108           109
Where, for price relatives a, b, c and d:
(i) $\text{A.M} = \frac{a + b + c + d}{4}$
(ii) Median = exact central value after arranging the values a, b, c and d in order.
(iii) $\text{G.M} = \sqrt[4]{a \times b \times c \times d}$
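The three averages used in Example 5.05 are available in Python's standard `statistics` module. The sketch below starts from the rounded price relatives shown in the solution table (an assumption made so the figures can be compared with the book) and prints the unrounded averages.

```python
from statistics import mean, median, geometric_mean

# Rounded price relatives (base 1962 = 100) for firewood, soft coke,
# kerosene oil and matches, as listed in the solution table.
relatives = {
    1962: [100, 100, 100, 100],
    1963: [106, 112, 110, 100],
    1964: [108, 80, 125, 100],
    1965: [115, 100, 125, 100],
}

for year, values in relatives.items():
    am = mean(values)              # (i) simple average of relatives
    med = median(values)           # (ii) middle value (mean of the two central ones)
    gm = geometric_mean(values)    # (iii) fourth root of the product
    print(year, round(am, 2), med, round(gm, 2))
```

The table reports these averages as whole numbers, so small differences from the printed values (for instance a median of 107.5 shown as 108) come from rounding only.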
Test Yourself
From the data given below, compute the index numbers of prices, taking 2000 as base. Use:
(i) Simple average (A.M) of price relatives
(ii) Median of price relatives and
(iii) Geometric mean of price relatives.
Years   Firewood   Soft coke   Kerosene oil   Matches
2000    4.25       5.50        1.20           1.06
2001    4.44       5.80        1.22           0.07
2002    4.50       5.00        1.25           1.07
2003    4.75       5.50        1.25           1.03
EXAMPLE 5.06

Construct chain indices for the prices of the three commodities, taking 1940 as the base; using
(i) Simple average (A.M)
(ii) Median
(iii) Geometric Mean
Years   A      B       C
1940    2.80   10.50   2.70
1941    3.40   10.80   3.20
1942    3.60   10.60   3.50
1943    4.00   11.00   3.80
1944    4.20   11.50   4.00
Solution
(i) Using the simple average (A.M):
        Link Relative
Years   A                          B     C     Total   Simple Average   Chain Indices
1940    100                        100   100   300     100              100
1941    (3.40/2.80) × 100 = 121    103   119   343     114              (114 × 100)/100 = 114
1942    (3.60/3.40) × 100 = 106    98    109   313     104              (104 × 114)/100 = 118.6
1943    (4.00/3.60) × 100 = 111    104   109   324     108              (108 × 118.6)/100 = 128.1
1944    (4.20/4.00) × 100 = 105    105   105   315     105              (105 × 128.1)/100 = 134.5
(ii) Using the median:
        Link Relative
Years   A                          B     C     Median   Chain Indices
1940    100                        100   100   100      100
1941    (3.40/2.80) × 100 = 121    103   119   119      (119 × 100)/100 = 119
1942    (3.60/3.40) × 100 = 106    98    109   106      (106 × 119)/100 = 126.14
1943    (4.00/3.60) × 100 = 111    104   109   109      (109 × 126.14)/100 = 137.49
1944    (4.20/4.00) × 100 = 105    105   105   105      (105 × 137.49)/100 = 144.37
(iii) Using the geometric mean:
        Link Relative
Years   A                          B     C     G.M      Chain Indices
1940    100                        100   100   100      100
1941    (3.40/2.80) × 100 = 121    103   119   114.04   (114.04 × 100)/100 = 114.04
1942    (3.60/3.40) × 100 = 106    98    109   104.23   (104.23 × 114.04)/100 = 118.86
1943    (4.00/3.60) × 100 = 111    104   109   107.96   (107.96 × 118.86)/100 = 128.33
1944    (4.20/4.00) × 100 = 105    105   105   105      (105 × 128.33)/100 = 134.74
Where, for link relatives a, b and c:
(i) $\text{A.M} = \frac{a + b + c}{3}$
(ii) Median = exact central value after arranging the values a, b and c in order.
(iii) $\text{G.M} = \sqrt[3]{a \times b \times c}$
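Part (i) of Example 5.06 can be reproduced by averaging the link relatives of each year and chaining the result onto the preceding chain index. The sketch below is illustrative only: the function name `chain_indices` is hypothetical, and the rounding (average to a whole number, chain index to one decimal place) is an assumption that mirrors part (i) of the solution, not parts (ii) and (iii).

```python
from statistics import mean

# Rounded link relatives for commodities A, B and C, from the solution table.
link_relatives = {
    1941: [121, 103, 119],
    1942: [106, 98, 109],
    1943: [111, 104, 109],
    1944: [105, 105, 105],
}


def chain_indices(link_relatives, average=mean, base_year=1940):
    """Chain each year's averaged link relative onto the preceding chain index."""
    chain = {base_year: 100.0}
    previous = 100.0
    for year in sorted(link_relatives):
        avg = round(average(link_relatives[year]))    # whole number, as in part (i)
        previous = round(avg * previous / 100, 1)     # one decimal, as in part (i)
        chain[year] = previous
    return chain


print(chain_indices(link_relatives))
```

Passing `statistics.median` or `statistics.geometric_mean` as `average` gives the other two variants, although the rounding used here would then differ from the one used in the corresponding solution tables.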
Test Yourself
Construct chain indices for the prices of the three commodities, taking 2001 as the base; using
(i) Simple average (A.M)
(ii) Median
(iii) Geometric Mean
Years   A      B       C
2001    5.80   18.50   5.70
2002    6.40   12.80   2.20
2003    6.60   15.60   6.50
2004    7.00   13.00   8.80
2005    7.20   18.50   2.00
Weighted Index Numbers or Weighted Indices

“An index number that measures the changes in the prices of a group of commodities when the relative importance, i.e. the weights, of the commodities is taken into account is called a weighted index number or weighted index.”
Weighted index numbers may be further classified as:
• Weighted Aggregative Index Number
• Weighted Average of Relatives Index Number
Weighted Aggregative Index Numbers
There are various kinds of weighted aggregative index numbers; some of them are discussed below:
Laspeyres Index Numbers
It is defined as:
$$P_{on} = \frac{\sum p_n q_o}{\sum p_o q_o} \times 100$$
Since Laspeyres' formula uses the base year quantities as weights (base year prices, in the case of a quantity index), it is called the base year weighted index. It is to be noted that Laspeyres' index is subject to upward bias (it is expected to overestimate).
Paasche Index Numbers
It is defined as:
$$P_{on} = \frac{\sum p_n q_n}{\sum p_o q_n} \times 100$$
Since Paasche's formula uses the current year quantities as weights (current year prices, in the case of a quantity index), it is called the current year weighted index. It is to be noted that Paasche's index is subject to downward bias (it is expected to underestimate).
Note: Around 1875, the German economists Laspeyres and Paasche introduced their formulae for index numbers.
Fisher (Ideal) Index Number
Fisher's index number is the geometric mean of Laspeyres' and Paasche's index numbers, i.e. $P_{on} = \sqrt{L \times P}$:
$$P_{on} = \sqrt{\frac{\sum p_n q_o}{\sum p_o q_o} \times \frac{\sum p_n q_n}{\sum p_o q_n}} \times 100$$
Note: Around 1921-1922, the American economist and statistician Irving Fisher introduced his formula for index numbers.
Fisher called this index as Ideal, because of the following reasons:
 It takes into account both base period as well as current period prices and
quantities.
 It is based on geometric mean, which is theoretically considered as the best
average for the construction of an index number.
 It is free from bias. The Laspeyres index is subject to upward bias (expected to
overestimate) and Paasche index to downward bias (expected to underestimate).
These two types of bias are crossed geometrically, i.e. by an averaging process
that of itself has no bias.
Marshall-Edgeworth Index Number
It is defined as:
$$P_{on} = \frac{\sum p_n (q_o + q_n)}{\sum p_o (q_o + q_n)} \times 100$$
This formula can also be written as:
$$P_{on} = \frac{\sum p_n q_o + \sum p_n q_n}{\sum p_o q_o + \sum p_o q_n} \times 100$$
EXAMPLE 5.07
Construct the following weighted aggregative price index numbers for 1960 using 1956 as a base, from the given data.
(i) Laspeyres' index
(ii) Paasche's index
(iii) Fisher's “Ideal” index
(iv) Marshall-Edgeworth index
            Prices (Rs. per md)        Quantities (tons)
Commodity   1956 $p_o$   1960 $p_1$    1956 $q_o$   1960 $q_1$
A           64           75            270          276
B           40           45            124          118
C           18           21            130          121
D           58           68            185          267
Note: Around 1887, the English economist Marshall and the Irish economist Edgeworth both introduced a formula for index numbers.
Solution The necessary computations are given below:
Commodity   $p_1 q_o$   $p_o q_o$   $p_1 q_1$   $p_o q_1$
A           20250       17280       20700       17664
B           5580        4960        5310        4720
C           2730        2340        2541        2178
D           12580       10730       18156       15486
Total       41140       35310       46707       40048
(i) Laspeyres' index
$$P_{o1} = \frac{\sum p_1 q_o}{\sum p_o q_o} \times 100 = \frac{41140}{35310} \times 100 = 116.5$$
(ii) Paasche's index
$$P_{o1} = \frac{\sum p_1 q_1}{\sum p_o q_1} \times 100 = \frac{46707}{40048} \times 100 = 116.6$$
(iii) Fisher's “Ideal” index
$$P_{o1} = \sqrt{\frac{\sum p_1 q_o}{\sum p_o q_o} \times \frac{\sum p_1 q_1}{\sum p_o q_1}} \times 100 = \sqrt{\frac{41140}{35310} \times \frac{46707}{40048}} \times 100 = 116.6$$
(iv) Marshall-Edgeworth index
$$P_{o1} = \frac{\sum p_1 q_o + \sum p_1 q_1}{\sum p_o q_o + \sum p_o q_1} \times 100 = \frac{41140 + 46707}{35310 + 40048} \times 100 = 116.57$$
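The four weighted aggregative formulae share the same four sums of products, so they can be computed together. The following is a sketch under that observation; the function name `weighted_indices` and the tuple layout `(p0, p1, q0, q1)` are assumptions made for illustration.

```python
from math import sqrt

# Data of Example 5.07: (p0, p1, q0, q1) for commodities A to D.
data = {
    "A": (64, 75, 270, 276),
    "B": (40, 45, 124, 118),
    "C": (18, 21, 130, 121),
    "D": (58, 68, 185, 267),
}


def weighted_indices(rows):
    """Return the Laspeyres, Paasche, Fisher and Marshall-Edgeworth price indices."""
    rows = list(rows)
    p1q0 = sum(p1 * q0 for p0, p1, q0, q1 in rows)
    p0q0 = sum(p0 * q0 for p0, p1, q0, q1 in rows)
    p1q1 = sum(p1 * q1 for p0, p1, q0, q1 in rows)
    p0q1 = sum(p0 * q1 for p0, p1, q0, q1 in rows)
    laspeyres = p1q0 / p0q0 * 100          # base year quantities as weights
    paasche = p1q1 / p0q1 * 100            # current year quantities as weights
    fisher = sqrt(laspeyres * paasche)     # geometric mean of the two
    marshall_edgeworth = (p1q0 + p1q1) / (p0q0 + p0q1) * 100
    return laspeyres, paasche, fisher, marshall_edgeworth


L, P, F, ME = weighted_indices(data.values())
print(round(L, 2), round(P, 2), round(F, 2), round(ME, 2))
```

Fisher's index always lies between the Laspeyres and Paasche values because it is their geometric mean.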
Test Yourself
Construct the following weighted aggregative price index numbers for 2003 using 2000 as a base, from the given data.
(i) Laspeyres' index
(ii) Paasche's index
(iii) Fisher's “Ideal” index
(iv) Marshall-Edgeworth index
            Prices (Rs. per md)        Quantities (tons)
Commodity   2000 $p_o$   2003 $p_1$    2000 $q_o$   2003 $q_1$
A           74           80            370          376
B           77           56            224          218
C           67           78            230          221
D           76           77            285          367
EXAMPLE 5.08
Construct the following weighted aggregative price index numbers for 1960 and 1961, using 1956 as a base, from the given data.
Commodity   1956 $p_o$   1960 $p_1$   1961 $p_2$   1956 $q_o$   1960 $q_1$   1961 $q_2$
A           64           75           80           270          276          290
B           40           45           41           124          118          144
C           18           21           20           130          121          137
D           58           68           56           185          267          355
Solution
Commodity   $p_1 q_o$   $p_o q_o$   $p_1 q_1$   $p_o q_1$   $p_2 q_o$   $p_2 q_2$   $p_o q_2$
A           20250       17280       20700       17664       21600       23200       18560
B           5580        4960        5310        4720        5084        5904        5760
C           2730        2340        2541        2178        2600        2740        2466
D           12580       10730       18156       15486       10360       19880       20590
Total       41140       35310       46707       40048       39644       51724       47376
(i) Laspeyres' index
For 1960:
$$P_{o1} = \frac{\sum p_1 q_o}{\sum p_o q_o} \times 100 = \frac{41140}{35310} \times 100 = 116.5$$
For 1961:
$$P_{o2} = \frac{\sum p_2 q_o}{\sum p_o q_o} \times 100 = \frac{39644}{35310} \times 100 = 112.3$$
(ii) Paasche's index
For 1960:
$$P_{o1} = \frac{\sum p_1 q_1}{\sum p_o q_1} \times 100 = \frac{46707}{40048} \times 100 = 116.6$$
For 1961:
$$P_{o2} = \frac{\sum p_2 q_2}{\sum p_o q_2} \times 100 = \frac{51724}{47376} \times 100 = 109.2$$
(iii) Fisher's “Ideal” index
For 1960:
$$P_{o1} = \sqrt{\frac{\sum p_1 q_o}{\sum p_o q_o} \times \frac{\sum p_1 q_1}{\sum p_o q_1}} \times 100 = \sqrt{\frac{41140}{35310} \times \frac{46707}{40048}} \times 100 = 116.6$$
For 1961:
$$P_{o2} = \sqrt{\frac{\sum p_2 q_o}{\sum p_o q_o} \times \frac{\sum p_2 q_2}{\sum p_o q_2}} \times 100 = \sqrt{\frac{39644}{35310} \times \frac{51724}{47376}} \times 100 = 110.7$$
(iv) Marshall-Edgeworth index
For 1960:
$$P_{o1} = \frac{\sum p_1 q_o + \sum p_1 q_1}{\sum p_o q_o + \sum p_o q_1} \times 100 = \frac{41140 + 46707}{35310 + 40048} \times 100 = 116.57$$
For 1961:
$$P_{o2} = \frac{\sum p_2 q_o + \sum p_2 q_2}{\sum p_o q_o + \sum p_o q_2} \times 100 = \frac{39644 + 51724}{35310 + 47376} \times 100 = 110.50$$
It is to be noted that the formulae used to obtain the price index number can also be used
to obtain the quantity index number simply by interchanging the p’s and q’s in the price
index number formula.
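As an illustration of this interchange, swapping the p's and q's in Laspeyres' and Paasche's price formulae gives the corresponding quantity index numbers. The symbols $Q_{on}^{L}$ and $Q_{on}^{P}$ are introduced here only for illustration and are an assumption of this sketch:

```latex
Q_{on}^{L} = \frac{\sum q_n p_o}{\sum q_o p_o} \times 100,
\qquad
Q_{on}^{P} = \frac{\sum q_n p_n}{\sum q_o p_n} \times 100
```

In the Laspeyres form the base year prices now act as the weights, and in the Paasche form the current year prices do.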
Test Yourself
Construct the following weighted aggregative price index numbers for 2010 and 2011, using 2009 as a base, from the given data.
Commodity   2009 $p_o$   2010 $p_1$   2011 $p_2$   2009 $q_o$   2010 $q_1$   2011 $q_2$
A           84           55           70           670          376          790
B           60           35           31           524          518          544
C           48           61           70           530          621          737
D           68           48           46           585          467          355
Wholesale Price Index Number (WPI)
“A wholesale price index number measures the changes in the prices of goods in wholesale markets.”
These goods include food grains (wheat, rice, etc.), raw materials (sugarcane, cotton, etc.), manufactured goods (textiles, fabrics, sugar, vegetable ghee, soap, etc.), electricity, gas, petrol, building materials, etc.
The wholesale price index numbers measure the variation of prices in general. They do not measure the effects of the rise and fall of prices on the general standard of living of the various groups of people in a society.
Steps Involved in Construction of Wholesale Price Index Number
The following steps are involved in the construction of wholesale price index number:
Step 1: Purpose and Scope of Index Number
This is the most important step in the construction of an index number because most of the other points are decided in the light of it. It must be decided at the outset what the index number is to measure and why. There is no single index number that can serve all purposes; every index is limited to a particular use. The scope, i.e. the field covered by the index, is also decided in the light of the purpose and the data available.
Step 2: Selection of Commodities and Collection of Prices
There is no hard and fast rule for the inclusion of commodities, but a reasonable number of commodities should be included on the basis of their importance. According to Dr. Irving Fisher, more than 20 commodities should be included and 50 is a better number.
Step 3: Selection of the Base Period
The period with which we like to compare the relative changes is known as the reference period or base period. The base period may be a year, month, week or a day. It is to be noted that the index for the base period is always taken as 100. A base period (year) should be a normal year. By a normal year we mean a year of economic stability, free from crises caused by wars, strikes, earthquakes, floods etc. If a single year of normal conditions is not available, then the average of several years is used as the base period.
There are two methods of selecting a base period:
• Fixed base method
• Chain base method
Step 4: Selection of Averages
Theoretically any average, i.e. mean, median, mode, geometric mean, harmonic mean etc., can be used in the construction of an index number. In practice the geometric mean is the most suitable average, because in the construction of index numbers we are concerned with relative changes and the geometric mean is the appropriate measure of relative changes.
Step 5: Selection of Appropriate Weights
All the commodities selected are not equally important, e.g. eggs and coffee cannot be given the same importance as wheat and rice. Wheat is much more important than coffee; it is therefore desirable that wheat be given more importance. Thus weights are assigned to the commodities depending upon their relative importance.
Consumer Price Index Number (CPI), Cost of Living Index Number, OR Retail Price Index Number
“Consumer price index numbers (or cost of living index numbers) measure the changes in the cost of living, i.e. the cost of goods of daily use purchased by a particular class of people in a town or area.”
These goods (called the market basket) consist of food, house rent, clothing, fuel and lighting, education and miscellaneous items, e.g. medicine, transport, washing, haircut, newspaper, etc.
Note: The prices in the CPI are the average retail prices paid by consumers for the purchase of goods; that is why the consumer price index is also known as the retail price index number.
There are two methods for the computation of the cost of living index number:
• Aggregative expenditure method
• Household budget method
Aggregative Expenditure Method
In this method, we use Laspeyres' formula, i.e.
$$P_{on} = \frac{\sum p_n q_o}{\sum p_o q_o} \times 100$$
Household (Family) Budget Method
In this method, we use the following formula:
$$P_{on} = \frac{\sum IW}{\sum W}$$
where $I = \frac{p_n}{p_o} \times 100$ and $W = p_o q_o$.
Note: To study the changes in the prices of different commodities in wholesale markets, we construct a wholesale price index number. On the other hand, to study the effects of changes in the prices of different commodities on the public, we construct a cost of living index number.
EXAMPLE 5.09
Construct the Cost of Living index number from the given data, using both the Aggregative Expenditure and Family Budget Methods.
Commodities   Quantity    Unit       Price 1990 (po)   Price 2000 (p1)
A             50 kg       25 kg      5                 9
B             5 kg        1 kg       15                20
C             10 kg       5 kg       10                15
D             20 dozen    1 dozen    10                20
E             3 Liter     1 Liter    15                25
Solution:
Commodities   qo    po    p1    p1qo    W = poqo    I         WI
A             50    5     9     450     250         180       45000
B             5     15    20    100     75          133.33    10000
C             10    10    15    150     100         150       15000
D             20    10    20    400     200         200       40000
E             3     15    25    75      45          166.67    7500
Total         --    --    --    1175    670         --        117500
Aggregative Expenditure Method:
$P_{o1} = \dfrac{\sum p_1 q_o}{\sum p_o q_o} \times 100 = \dfrac{1175}{670} \times 100 = 175.37$
Household (Family) Budget Method:
$P_{o1} = \dfrac{\sum IW}{\sum W} = \dfrac{117500}{670} = 175.37$
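The two methods of Example 5.09 can be checked with a short Python sketch. The dictionary layout and the function names below are illustrative assumptions, not part of the book; only the figures come from the example.

```python
# Data from Example 5.09: commodity -> (base quantity qo, base price po, current price p1)
basket = {
    "A": (50, 5, 9),
    "B": (5, 15, 20),
    "C": (10, 10, 15),
    "D": (20, 10, 20),
    "E": (3, 15, 25),
}

def aggregative_expenditure_index(data):
    """Laspeyre's formula: sum(p1*qo) / sum(po*qo) * 100."""
    numerator = sum(p1 * qo for qo, po, p1 in data.values())
    denominator = sum(po * qo for qo, po, p1 in data.values())
    return numerator / denominator * 100

def family_budget_index(data):
    """Weighted average of price relatives: sum(I*W) / sum(W)."""
    total_iw = 0.0
    total_w = 0.0
    for qo, po, p1 in data.values():
        relative = p1 / po * 100   # I = (p1 / po) * 100
        weight = po * qo           # W = po * qo
        total_iw += relative * weight
        total_w += weight
    return total_iw / total_w

print(round(aggregative_expenditure_index(basket), 2))
print(round(family_budget_index(basket), 2))
```

Both functions give the same value because I × W reduces to p1qo × 100, which is why the two methods always agree when W = poqo.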
Test Yourself
Construct the Cost of Living index number for 2011 from the given data, using both the Aggregative Expenditure and Family Budget Methods.
Commodities   Quantity    Unit       Price 2010    Price 2011
A             50 kg       25 kg      5             9
B             5 kg        1 kg       15            20
C             10 kg       5 kg       10            15
Problems involved in the Construction of Index Numbers
The problems involved in the construction of index numbers are described below:
• The first problem is to understand the purpose for which an index number is to be constructed, because no single index number serves all purposes. For example, if it is required to study the changes in the consumer prices of a specified group of people, the cost of living index number should be constructed. On the other hand, to measure the general price level in the country, the wholesale price index number should be constructed.
• The next problem is to decide what data should be included. The data to be included depend on the purpose for which the index number is to be constructed, e.g. if a cost of living index number is to be constructed, then retail prices should be collected; if a wholesale price index number is to be constructed, then wholesale prices should be collected.
• The next problem is to decide what period should be chosen as the base period.
Sharpen your Pencil
MCQ’s
(1) The index number for the base period is always equal to ______
(A) 100 (B) 1000 (C) one (D) None of these
(2) A chain index number provides comparison between ______
(A) year to year (B) first and last year
(C) both A & B (D) None of these
(3) CPI measures ______
(A) Consumer Price Index (B) Wholesale Price Index
(4) Fisher’s index number is ______
(A) Best Index Number (B) Ideal Index Number
(C) Simple Index Number (D) None of these
(5) WPI measures ______
(A) Consumer Price Index (B) Wholesale Price Index
(6) Fisher’s index number is ______ of Laspeyre’s and Paasche’s index numbers.
(7) Base year weighted index number is ______
(A) Laspeyre’s (B) Paasche’s
(C) Fisher’s (D) None of these
(8) The data for each month are expressed as a percentage of the data for the previous month; these percentages are called ______
(A) Price relatives (B) Link relatives
(9) Laspeyre’s index number has ______ bias.
(A) Upward (B) Downward
(10) Paasche’s index number has ______ bias.
(A) Upward (B) Downward
(C) Both A & B (D) None of these
(11) An index number computed for two or more variables is called ______
(A) Simple index number (B) Composite index number
(12) Steps involved in the construction of index number are ______
(13) Consumer price index numbers are computed by ______ formula
(A) Laspeyre’s (B) Paasche’s
(C) Fisher’s (D) None of these
(14) Price Relative = ______
(A) $\dfrac{p_n}{p_o} \times 100$ (B) $\dfrac{p_n}{p_{n-1}} \times 100$
(C) $\dfrac{\sum p_n q_n}{\sum p_o q_o} \times 100$ (D) None of these
(15) Link Relative = ______
(A) $\dfrac{p_n}{p_o} \times 100$ (B) $\dfrac{p_n}{p_{n-1}} \times 100$
(C) $\dfrac{\sum p_n q_n}{\sum p_o q_o} \times 100$ (D) None of these
Exercise
Short Questions
Q.5.01. What are the steps involved in the construction of a price index number?
Q.5.02. Differentiate between the fixed base and chain base methods.
Q.5.03. Explain the terms simple and composite index numbers.
Q.5.04. Differentiate between weighted and un-weighted index numbers.
Q.5.05. What are the problems in the construction of an index number?
Q.5.06. If $\sum p_n q_o = 1650$, $\sum p_n q_n = 2240$, $\sum p_o q_o = 1640$ and $\sum p_o q_n = 2160$, then find
Paasche’s and Marshall index numbers.
Laspeyre and Fisher index numbers.
Q.5.08. Define index number and give examples.
Q.5.09. Calculate the simple aggregative index number:
Commodities   Rice   Wheat   Maize   Tea   Tobacco   Sugar
Po            30     8       20      100   15        50
Pn            35     10      22      120   10        50
Q.5.10. Give the formulae of the four basic weighted index numbers.
Q.5.11. Laspeyre price index number = 254.17 and Fisher price index number = 252.37. Find Paasche’s price index number.
Base year weighted index number and Current year weighted index number.
Long Questions
Q.5.01. Compute the price relatives taking 1988 as base and the link relatives from the following data:
Year     1988   1989   1990   1991   1992   1993   1994   1995   1996
Prices   5      5.5    6      6.5    7      7.5    8      8.5    9
Q.5.02. Compute chain indices from the following data:
Year     1980   1981   1982   1983   1984   1985   1986   1987
Prices   15     17     20     18     22     24     23     24
Q.5.03. From the data given below, compute the index numbers of prices, taking 1990 as base. Use (i) the simple average (ii) the median and (iii) the geometric mean.
Years   A      B      C      D
1990    2.10   0.75   1.25   1.30
1991    2.15   1.05   1.30   1.30
1992    2.25   0.80   1.35   1.32
1993    2.75   1.15   2.05   1.35
1994    3.25   1.75   3.00   1.70
Q.5.04. From the data of Q.5.03, compute chain indices, taking 1990 as base. Use (i) the simple average (ii) the median and (iii) the geometric mean.
Q.5.05. The price of wheat (per maund) is given for the years 1984 to 1993. Calculate the simple index number using
(i) 1984 as base
(ii) Average of the prices for the first five years as base
(iii) Average of the prices of all the ten years as base.
Years   Prices (Rs.)   Years   Prices (Rs.)
1984    120            1989    127
1985    118            1990    128
1986    123            1991    130
1987    124            1992    132
1988    125            1993    133
Q.5.06. Construct the following weighted aggregative price index numbers for 1991 and 1992, using 1990 as a base, from the given data.
(i) Laspeyre’s index (ii) Paasche’s index
(iii) Fisher’s “Ideal” index (iv) Marshall-Edgeworth index
Food item   Prices                     Quantities
            1990    1991    1992      1990   1991   1992
Meat        38      38      50        10     13     12
Bread       11.25   11.50   11.75     3      4      6
Milk        13.75   13.75   13.75     7      7      9
Butter      16.00   16.50   17.00     2      4      5
Q.5.07. Construct the following weighted aggregative price index numbers for 2000 using
(i) Laspeyre’s index (ii) Paasche’s index
(iii) Fisher’s “Ideal” index (iv) Marshall-Edgeworth index
Commodities   1987             2000
              Price   Value    Price   Value
A             2       148      3       246
B             5       625      4       560
C             7       280      6       198
Q.5.08. Construct the Cost of Living index number from the given data, using both the Aggregative Expenditure and Family Budget Methods.
Commodities   Quantity    Unit       Price 1990    Price 2000
A             50 kg       25 kg      5             9
B             5 kg        1 kg       15            20
C             10 kg       5 kg       10            15
CHAPTER 06
Set Theory
and Basic Probability
Chapter Contents
You should read this chapter if you need to learn about:
• Set and its Types: (P202-P205)
• Tree diagram and Venn diagram: (P205-P206)
• Operations on Sets: (P207-P208)
• Experiment and Random Experiment: (P209)
• Trial, Outcome and Sample Space: (P209-P211)
• Event and its Types: (P212-P214)
• Counting Techniques: (P214-P225)
• Origin of Probability: (P226)
• Probability: (P227)
• Definition of Probability: (P227-P235)
• Addition Rules of Probability for Mutually Exclusive Events: (P236-P239)
• Addition Rules of Probability for Not Mutually Exclusive Events: (P240-P242)
• Understanding the meaning of the words “AND” and “OR”: (P243)
• Rule of Complementation: (P244)
• Conditional Probability: (P244-P246)
• Independent and Dependent Events: (P247)
• Multiplication Rule of Probability for Independent Events: (P248-P249)
• Multiplication Rule of Probability for Dependent Events: (P250-P252)
• Interesting in Playing: (P255)
Set
“A well-defined collection of distinct objects is called a set”
The objects in a set may be numbers, people, letters, books, rivers, etc.
Sets are usually denoted by capital letters such as A, B, C, etc.
• Set of vowels in the English alphabet
• Set of books in a library
• Set of students in a college etc.
Note: The modern study of set theory was initiated by two German mathematicians, Georg Cantor and Richard Dedekind, in the 1870s.
Finite and Infinite Sets
“A set consisting of a finite number of elements is called a finite set”
• Set of vowels
• Set of months of a year
• Set of days in a week etc.
On the other hand, “a set consisting of an infinite number of elements is called an infinite set”
• Set of points on a line
• Set of stars in the sky
• Set of odd numbers
• Set of even numbers etc.
Null Set or Empty Set
“A set that contains no elements is called an empty set or null set”
A null set is denoted by the symbol ∅ (phi) or by { }.
• Set of male students in a girls’ college
• Set of first year statistics students older than 200 years etc.
Sub-Set
“If each element of a set A is also an element of a set B, then A is said to be a subset of B, written as A ⊆ B”
If A = {1, 2, 3}
And B = {1, 2, 3, 4, 5}
Then A ⊆ B
[Venn diagram: A = {1, 2, 3} drawn inside B = {1, 2, 3, 4, 5}]
Note: Every set is a subset of itself and the null set is a subset of every set.
Proper Sub-Set
We call “A” a proper sub-set of “B” if:
• “A” is a sub-set of “B”
• A ≠ B
Written as A ⊂ B
If A = {1, 2, 3}
And B = {1, 2, 3, 4, 5}
Then A ⊂ B
Improper Sub-Set
We call “A” an improper sub-set of “B” if:
• “A” is a sub-set of “B”
• A = B
If A = {1, 2, 3}
And B = {1, 2, 3}
Then A is an improper sub-set of B.
Equal Sets
“Two sets “A” and “B” are said to be equal if they contain exactly the same elements”
In other words,
If A ⊆ B and B ⊆ A then A = B
If A = {1, 2, 3}
And B = {3, 1, 2}
Then A = B
Power Set
“The set of all possible sub-sets of a set A is called the power set and is denoted by P(A)”
The number of subsets in the power set of a set with n elements is $2^n$.
If A = {1, 2, 3} then the power set contains $2^3 = 8$ subsets, i.e.
P(A) = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}
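The power set of a small set can be listed mechanically. The following Python sketch is only an illustration; the function name power_set is an assumption and not notation from the book.

```python
from itertools import combinations

def power_set(elements):
    """Return every subset of `elements`, from the empty set up to the full set."""
    items = sorted(elements)
    subsets = []
    for size in range(len(items) + 1):
        for chosen in combinations(items, size):
            subsets.append(frozenset(chosen))
    return subsets

subsets_of_a = power_set({1, 2, 3})
print(len(subsets_of_a))                     # number of subsets, 2**3
print([sorted(s) for s in subsets_of_a])     # the subsets themselves
```

The count doubles with every extra element, because each element is either in a subset or not.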
Disjoint Sets
“If there is no element common to the two sets “A” and “B”, then they are called disjoint sets”
Disjoint sets are also called mutually exclusive sets.
If A = {1, 2, 3} and B = {a, b, c}
Then A and B are disjoint sets.
[Venn diagram: A = {1, 2, 3} and B = {a, b, c} drawn as separate circles inside S]
Overlapping sets
“If at least one element is common to two sets such that they are not subsets of each other, then they are called overlapping sets”
If A = {1, 2, 3, 4} and B = {4, 5, 6, 7}
Then A and B are overlapping sets.
[Venn diagram: A and B overlap in the element 4]
Universal Set
“The set which consists of all the elements specified for some discussion is called the universal set”.
It is denoted by U or S.
Product Set OR
Cartesian product of Sets
The Cartesian product of sets “A” and “B”, denoted by A × B (read as “A” cross “B”), is the set of all ordered pairs (x, y) where x ∈ A and y ∈ B.
If A = {H, T} and B = {1, 2}
Then A × B = {(H, 1), (H, 2), (T, 1), (T, 2)}
Note: The Cartesian product is named after René Descartes, a French mathematician.
Tree diagram
“A systematic method of finding the Cartesian product through a diagram is called a tree diagram”
If for a coin, A = {H, T} and for a die, B = {1, 2, 3, 4, 5, 6}
Then A × B = {(H, 1), (H, 2), (H, 3), (H, 4), (H, 5), (H, 6),
              (T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)}
[Figure: tree diagram with branches H and T from A, each followed by branches 1 to 6 from B]
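The Cartesian product of the coin and the die can also be generated with Python's standard itertools.product, which walks the branches of the tree in the same order. The variable names here are illustrative assumptions.

```python
from itertools import product

coin = ["H", "T"]            # set A
die = [1, 2, 3, 4, 5, 6]     # set B

# Every ordered pair (x, y) with x from the coin and y from the die
a_cross_b = list(product(coin, die))

print(a_cross_b)
print(len(a_cross_b))        # 2 * 6 ordered pairs
```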
Venn diagram
“The simple and effective way of representing the relationships between sets diagrammatically is called a Venn diagram”.
In a Venn diagram the universal set U (or S) is represented by a rectangle and the sub-sets are represented by circles inside the rectangle, e.g.
If A = {1, 2, 3, 4} and B = {4, 5, 6, 7} then they can be represented by the Venn diagram as:
[Venn diagram: circles A and B inside S, overlapping in the element 4]
Note: In 1880, the British philosopher John Venn introduced Venn diagrams.
Operations on Sets
Like algebraic operations such as addition, subtraction, multiplication and division in mathematics, we have basic operations on sets, i.e.:
• Union of two sets
• Intersection of two sets
• Difference of two sets
• Complement of a set
Union of sets
The union of two sets “A” and “B” is the set of all elements that belong to “A” or to “B” or to both “A” and “B”. The union of two sets “A” and “B” is denoted by A ∪ B.
If A = {1, 2, 3, 4} and B = {5, 6, 7}
Then A ∪ B = {1, 2, 3, 4, 5, 6, 7}
[Venn diagram: A ∪ B is shaded]
If A = {1, 2, 3, 4, 5, 6, 7} and B = {4, 5, 6, 7, 8, 9}
Then A ∪ B = {1, 2, 3, 4, 5, 6, 7, 8, 9}
[Venn diagram: A ∪ B is shaded]
Intersection of sets
The intersection of two sets “A” and “B” is the set of elements that belong to both “A” and “B”. The intersection of two sets “A” and “B” is denoted by A ∩ B.
If A = {1, 2, 3, 4, 5, 6, 7}
And B = {4, 5, 6, 7, 8, 9}
Then A ∩ B = {4, 5, 6, 7}
[Venn diagram: A ∩ B is shaded]
Difference of sets
The difference of sets “A” and “B” is the set of elements that belong to “A” but do not belong to “B”. The difference of two sets “A” and “B” is denoted by $A - B$ or $A - (A \cap B)$ or $A \cap \bar{B}$.
If A = {1, 2, 3, 4, 5, 6, 7}
And B = {4, 5, 6, 7, 8, 9}
Then A − B = {1, 2, 3}
[Venn diagram: A − B is shaded]
If A = {1, 2, 3, 4, 5}
And B = {1, 2, 3}
Then A − B = {4, 5}
[Venn diagram: A − B is shaded]
If A = {1, 2, 3}
And B = {a, b, c}
Then A − B = {1, 2, 3}
[Venn diagram: A − B is shaded]
Complement of a Set
The complement of a set “A” is the set of elements that belong to “S” but do not belong to “A”. The complement of set “A” is denoted by $\bar{A}$ or $A^c$.
If S = {1, 2, 3, 4, 5, 6}
And A = {2, 4, 6}
Then $\bar{A}$ = S − A = {1, 3, 5}
[Venn diagram: $\bar{A}$ is shaded]
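The four set operations map directly onto Python's built-in set type. The sketch below reuses the numbers from the examples in this section; the variable names are illustrative assumptions.

```python
A = {1, 2, 3, 4, 5, 6, 7}
B = {4, 5, 6, 7, 8, 9}

union = A | B            # elements in A or B or both
intersection = A & B     # elements in both A and B
difference = A - B       # elements in A but not in B

# Complement needs a universal set S; here S and E are taken from the complement example
S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}
complement_of_e = S - E  # elements of S that are not in E

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
print(sorted(complement_of_e))
```

The identity A − B = A − (A ∩ B) can be confirmed by comparing `A - B` with `A - (A & B)`.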
Experiment
“An experiment is a process in which we obtain results”
Random Experiment
In our daily life, we perform many activities which have a fixed result no matter how many times they are repeated. For example, given any triangle, without knowing the three angles, we can definitely say that the sum of the measures of the angles is 180°. We also perform many experimental activities where the result may not be the same when they are repeated under identical conditions. For example, when a coin is tossed it may turn up a head or a tail, but we are not sure which one of these results will actually be obtained. Such experiments are called random experiments.
A random experiment satisfies the following three properties:
• It can be repeated any number of times.
• It has more than one possible outcome.
• It is not possible to predict the outcome in advance.
Hence we may define the random experiment as “An experiment that generates uncertain results under similar conditions is called a random experiment”
• Tossing of a coin
• Rolling of a die
• Drawing a card from a pack of 52 playing cards etc.
Trial
“A single performance of an experiment is called a trial”
Outcome
“A possible result of a random experiment is called an outcome”
If we toss a coin then “H” (Head) or “T” (Tail) may be the outcomes.
Sample space
“A set consisting of all possible outcomes of a random experiment is called a sample space”. It is denoted by “S” and each element of a sample space is called a sample point.
Note: The number of sample points in the sample space of a coin tossing experiment can be determined by $2^n$, where “n” is the number of coins, and of a die rolling experiment by $6^n$, where “n” is the number of dice.
• If a coin is tossed
Then S = {H, T}
• If two coins are tossed
Then S = {HH, HT, TH, TT}
• If three coins are tossed
Then S = {HHH, HHT, HTH, THH, TTH, THT, HTT, TTT}
• If a die is rolled
Then S = {1, 2, 3, 4, 5, 6}
• If two dice are rolled, then
S = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6),
     (2,1), (2,2), (2,3), (2,4), (2,5), (2,6),
     (3,1), (3,2), (3,3), (3,4), (3,5), (3,6),
     (4,1), (4,2), (4,3), (4,4), (4,5), (4,6),
     (5,1), (5,2), (5,3), (5,4), (5,5), (5,6),
     (6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}
• If we draw a card from a deck of 52 playing cards then the sample points in the sample space are:
[Figure: the 52 playing cards, grouped into Hearts, Diamonds, Spades and Clubs]
• If we draw a ball from a basket having 3 different color balls then the sample points in the sample space may be as follows:
[Figure: the three balls of different colors]
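The sample spaces for tossing n coins or rolling n dice, and the counts 2^n and 6^n, can be generated with a few lines of Python. The function names below are assumptions chosen for this sketch.

```python
from itertools import product

def coin_sample_space(n_coins):
    """All outcomes of tossing n coins, written as strings such as 'HT'."""
    return ["".join(outcome) for outcome in product("HT", repeat=n_coins)]

def dice_sample_space(n_dice):
    """All outcomes of rolling n dice, written as tuples such as (3, 5)."""
    return list(product(range(1, 7), repeat=n_dice))

print(coin_sample_space(2))
print(len(coin_sample_space(3)))   # 2**3 sample points
print(len(dice_sample_space(2)))   # 6**2 sample points
```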
Event
“Any sub-set of a sample space is called an event”
Events are usually denoted by A, B, C etc.
If we toss two coins
Then S = {HH, HT, TH, TT}
Now if A = {HH}, then “A” is called an event.
Note:
• Each element of a sample space “S” is called a sample point.
• The total number of sample points in the sample space is denoted by n(S).
• The number of favorable cases of an event “A” is denoted by n(A).
Simple Event
“If an event contains only one sample point from the sample space then it is called a simple event”
If we toss two coins then S = {HH, HT, TH, TT}
If A = {HH}, then “A” is called a simple event.
Compound Event
“If an event contains two or more sample points from the sample space then it is called a compound event”
If we toss two coins then S = {HH, HT, TH, TT}
If A = {HT, TH}, then “A” is called a compound event.
The Certain or Sure Event
“An event consisting of the sample space itself is called the sure event”
Impossible event
“An event consisting of the null set is called the impossible event”
Mutually Exclusive (Disjoint) Events
“Events in the same sample space are said to be mutually exclusive if they cannot occur together”
For two mutually exclusive events “A” and “B”, A ∩ B = ∅
If we toss a coin then “H” and “T” are mutually exclusive, because if “H” occurs then “T” cannot take place; similarly 1, 2, 3, 4, 5 and 6 are mutually exclusive when a die is rolled. In other words, they exclude each other.
[Venn diagrams: the separate outcomes H and T for a coin, and 1 to 6 for a die]
Not Mutually Exclusive (Overlapping) Events
“Events in the same sample space are said to be not mutually exclusive if they can occur together”
For two not mutually exclusive events “A” and “B”, A ∩ B ≠ ∅
If a card is drawn at random from a pack of 52 playing cards then it may be at the same time an “Ace” and a “Diamond”; therefore “Ace” and “Diamond” are not mutually exclusive.
[Venn diagram: overlapping events A (Ace) and D (Diamond)]
Equally likely Events
“Events are said to be equally likely if they have the same chance of occurrence”
If we toss a fair coin then “H” and “T” are equally likely, because they have the same chance of occurrence.
Exhaustive Events
“Two or more events defined in the same sample space are said to be exhaustive if their union is equal to the sample space”
If S = {1, 2, 3, 4, 5, 6}
Let A = {1, 3, 5} and B = {2, 4, 6}
Then A ∪ B = {1, 2, 3, 4, 5, 6} = S
Therefore “A” and “B” are exhaustive events.
Note: An event “A” and its complement “$\bar{A}$” are always exhaustive, i.e. $A \cup \bar{A} = S$.
Counting Techniques
Sometimes it is very difficult to list all the sample points of a sample space; therefore we use some mathematical techniques for finding the number of sample points of the sample space. These techniques are called counting techniques, i.e.
• Factorial
• Rule of Multiplication
• Permutation
• Combination
Factorial
“The product of the first “n” natural numbers is called Factorial and is denoted by n!”
2! = 2 × 1 = 2
5! = 5 × 4 × 3 × 2 × 1 = 120
In general n! = n(n − 1)(n − 2)(n − 3) … 3 · 2 · 1
Note: 0! = 1 and 1! = 1
Rule of Multiplication
“If a selection operation can be performed in “m” ways and a second selection operation can be performed in “n” ways, then the two operations can be performed together in “m × n” ways”
• A coin is tossed and a die is rolled; here operation one, i.e. the coin, gives {H, T} and the second operation, i.e. the die, gives {1, 2, 3, 4, 5, 6}; hence the two operations can be performed in 2 × 6 = 12 ways.
• If a man has 3 suits and 5 ties, then he can wear a suit and a tie in 3 × 5 = 15 ways.
Permutation
“A permutation is an arrangement of “r” objects taken from “n” distinct objects in a particular order”
It is denoted by $^nP_r$ and is given by: $^nP_r = \dfrac{n!}{(n-r)!}$
Instead of $^nP_r$ we can also use $_nP_r$ or $P(n, r)$.
Note: The first book on permutations and combinations was written by the Swiss mathematician Jacob Bernoulli in 1713 A.D.
EXAMPLE 6.01
How many different permutations can be formed from the letters A, B, C when two letters are taken at a time?
Solution: Here n = 3 and r = 2
Therefore $^3P_2 = \dfrac{3!}{(3-2)!} = 6$
The six permutations are AB, AC, BA, BC, CA, CB.
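The permutation formula n!/(n − r)! can be checked against an explicit listing. This is a small Python sketch using the standard library; the variable name arrangements is an illustrative assumption.

```python
from itertools import permutations
from math import perm

# Ordered arrangements of two letters taken from A, B, C
arrangements = ["".join(p) for p in permutations("ABC", 2)]

print(arrangements)
print(len(arrangements), perm(3, 2))   # the listing and the formula agree
print(perm(4, 3))                      # 3 persons seated on 4 chairs
print(perm(8, 3))                      # 3 of 8 distinct books on a shelf
```

Because order matters, AB and BA are counted as two different permutations.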
EXAMPLE 6.02
In how many ways can “3” persons be seated on “4” chairs?
Solution: Here n = 4 and r = 3
Therefore $^4P_3 = \dfrac{4!}{(4-3)!} = 24$
EXAMPLE 6.03
In how many ways can a president, vice-president, secretary and treasurer be selected from nine members of a committee?
Solution: Here n = 9 and r = 4
Therefore $^9P_4 = \dfrac{9!}{(9-4)!} = 3024$
EXAMPLE 6.04
In how many ways can 2 lottery tickets be drawn from 16 for the 1st and 2nd prizes?
Solution: Here n = 16 and r = 2
Therefore $^{16}P_2 = \dfrac{16!}{(16-2)!} = 240$
EXAMPLE 6.05
In how many ways can two different books out of 5 books be arranged on a shelf?
Solution: Here n = 5 and r = 2
Therefore $^5P_2 = \dfrac{5!}{(5-2)!} = 20$
EXAMPLE 6.06
In how many ways can 5 different books be arranged on a shelf?
Solution: Here n = 5
Therefore Number of permutations = n! = 5! = 120
Note: The total number of permutations of “n” distinct objects taking all “n” at a time is equal to “n!”.
EXAMPLE 6.07
In how many ways can four people be lined up to get on a bus?
Solution: Here n = 4
Therefore Number of permutations = n! = 4! = 24
EXAMPLE 6.08
How many different words can be formed from the letters of the word “BOXER” if:
1) All the letters are taken at a time
2) Three letters are taken at a time
Solution
1) All the letters are taken at a time:
Here n = 5
Therefore Number of permutations = n! = 5! = 120
2) Three letters are taken at a time:
Here n = 5 and r = 3
Therefore $^5P_3 = \dfrac{5!}{(5-3)!} = 60$
EXAMPLE 6.09
Find the number of arrangements of 8 distinct books on a shelf if:
1) All the books are taken at a time
2) Three books are taken at a time
Solution
1) All the books are taken at a time:
Here n = 8
Therefore Number of permutations = n! = 8! = 40320
2) Three books are taken at a time:
Here n = 8 and r = 3
Therefore $^8P_3 = \dfrac{8!}{(8-3)!} = 336$
Note: If we arrange objects in a circle then there is no starting point to it; therefore we fix one object and the remaining objects are arranged as in a linear permutation. The formula for arranging “n” objects in a circle is (n − 1)!
EXAMPLE 6.10
In how many ways can 4 people be seated at a round table?
Solution: Here n = 4
Therefore
Number of circular permutations = (n − 1)! = (4 − 1)! = 3! = 6
Group Permutation
The number of distinct permutations of “n” things when “n1” are alike, “n2” are alike but different from the first group, “n3” are alike but different from the first and second groups, and so on, for “k” groups, is:
$P = \dfrac{n!}{n_1!\, n_2! \cdots n_k!}$ where $n = \sum_{i=1}^{k} n_i$
EXAMPLE 6.11
How many possible permutations can be formed from the letters of the word “STATISTICS”?
Solution: Here n = 10
n1 = number of “S” = 3
n2 = number of “T” = 3
n3 = number of “A” = 1
n4 = number of “I” = 2
n5 = number of “C” = 1
Therefore $P = \dfrac{n!}{n_1!\, n_2!\, n_3!\, n_4!\, n_5!} = \dfrac{10!}{3! \times 3! \times 1! \times 2! \times 1!} = 50400$
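The group permutation formula can be applied to any word by counting how often each letter occurs. The Python sketch below is an illustration; the function name distinct_permutations is an assumption.

```python
from collections import Counter
from math import factorial

def distinct_permutations(word):
    """n! divided by the factorial of each letter's count (group permutation)."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)   # exact at every step
    return total

print(distinct_permutations("STATISTICS"))
print(Counter("STATISTICS"))   # S: 3, T: 3, I: 2, A: 1, C: 1
```

The same function can be used to check the “MISSISSIPPI” exercise in Test Yourself.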
EXAMPLE 6.12
In how many different ways can 3 red, 3 yellow and 3 blue balls be arranged in a string with 9 sockets?
Solution: Here n = 9
n1 = number of red balls = 3
n2 = number of yellow balls = 3
n3 = number of blue balls = 3
Therefore $P = \dfrac{n!}{n_1!\, n_2!\, n_3!} = \dfrac{9!}{3! \times 3! \times 3!} = 1680$
EXAMPLE 6.13
In how many possible orders can two boys and three girls be born to a family having five children?
Solution: Here n = 5
n1 = number of boys = 2
n2 = number of girls = 3
Therefore $P = \dfrac{n!}{n_1!\, n_2!} = \dfrac{5!}{2! \times 3!} = 10$
Test Yourself
1) How many permutations can be formed out of the letters of the word “MISSISSIPPI”?
2) Make permutations of A, B, C, D.
3) In how many ways can 4 people be seated at a round table?
4) Find $^7P_3$, $^4P_2$, $^{12}P_5$, $^{10}P_8$
5) Find the number of arrangements of 6 distinct books on a shelf if:
(i) All the books are taken at a time (ii) Three books are taken at a time
6) In how many ways can 8 people be lined up to get on a bus?
The Order is important in Permutation!!!
There are six different ways in which three horses can finish a race, as shown in the figure:
(Assume that there are no ties and that every horse finishes)
[Figure: the six finishing orders of horses A, B and C, labelled 1st way to 6th way]
Combination
“A combination is a selection of “r” objects taken from “n” distinct objects without regard to order”
It is denoted by $^nC_r$ and is given by:
$^nC_r = \dfrac{n!}{r!\,(n-r)!}$
Instead of $^nC_r$ we can also use $_nC_r$, $C(n, r)$ or $\binom{n}{r}$.
EXAMPLE 6.14
How many combinations of the letters A, B, C can be made if two letters are taken at a time?
Solution: Here n = 3 and r = 2
Therefore $^3C_2 = \dfrac{3!}{2!\,(3-2)!} = 3$
The three combinations are AB, AC, BC.
EXAMPLE 6.15
In how many ways can a team of 11 players be chosen from a total of 15 players?
Solution: Here n = 15 and r = 11
Therefore $^{15}C_{11} = \dfrac{15!}{11!\,(15-11)!} = 1365$
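The combination formula n!/(r!(n − r)!), as used in the worked examples, can be checked with Python's standard library. The variable name selections is an illustrative assumption.

```python
from itertools import combinations
from math import comb

# Unordered selections of two letters taken from A, B, C
selections = ["".join(c) for c in combinations("ABC", 2)]

print(selections)
print(len(selections), comb(3, 2))   # the listing and the formula agree
print(comb(15, 11))                  # a team of 11 chosen from 15 players
```

Unlike permutations, AB and BA are the same selection here, so each pair is listed only once.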
EXAMPLE 6.16
In how many ways can we select a committee of 4 people from a group of 10 people?
Solution: Here n = 10 and r = 4
Therefore $^{10}C_4 = \dfrac{10!}{4!\,(10-4)!} = 210$
EXAMPLE 6.17
In how many ways can we select a set of 6 books from 10 different books?
Solution: Here n = 10 and r = 6
Therefore $^{10}C_6 = \dfrac{10!}{6!\,(10-6)!} = 210$
EXAMPLE 6.18
In how many ways can we select a card from a pack of 52 playing cards?
Solution
Here n = 52 and r = 1
Therefore $^{52}C_1 = \dfrac{52!}{1!\,(52-1)!} = 52$
The 52 ways are shown in the following figure:
[Figure: the 52 playing cards]
EXAMPLE 6.19
A bag contains 7 balls; in how many ways can we select 3 balls?
Solution: Here n = 7 and r = 3
Therefore $^7C_3 = \dfrac{7!}{3!\,(7-3)!} = 35$
EXAMPLE 6.20
A basket contains 5 white and 4 black balls; in how many ways can we select 3 white and 2 black balls?
Solution: Here
White   Black   Total
5       4       9
“3” white balls can be selected out of “5” in $^5C_3 = \dfrac{5!}{3!\,(5-3)!} = 10$ ways
“2” black balls can be selected out of “4” in $^4C_2 = \dfrac{4!}{2!\,(4-2)!} = 6$ ways
Hence the number of ways in which “3” white and “2” black balls are selected = 10 × 6 = 60
EXAMPLE 6.21
In how many ways can a consonant and a vowel be chosen out of the letters of the word SCHOLAR?
Solution: Here
Consonants   Vowels   Total
5            2        7
A consonant can be selected out of “5” in $^5C_1 = \dfrac{5!}{1!\,(5-1)!} = 5$ ways
A vowel can be selected out of “2” in $^2C_1 = \dfrac{2!}{1!\,(2-1)!} = 2$ ways
Hence the number of ways in which a consonant and a vowel are selected = 5 × 2 = 10
EXAMPLE 6.22
From 4 black, 5 white and 6 gray balls, how many selections of 9 balls are possible if 3 balls of each color are to be selected?
Solution: Here
Black   White   Gray   Total
4       5       6      15
“3” black balls can be selected out of “4” in $^4C_3 = \dfrac{4!}{3!\,(4-3)!} = 4$ ways
“3” white balls can be selected out of “5” in $^5C_3 = \dfrac{5!}{3!\,(5-3)!} = 10$ ways
“3” gray balls can be selected out of “6” in $^6C_3 = \dfrac{6!}{3!\,(6-3)!} = 20$ ways
Hence the number of ways in which 3 balls of each color are selected = 4 × 10 × 20 = 800
EXAMPLE 6.23
A committee of 5 persons is to be selected out of 6 men and 2 women. Find the number of ways in which more men are selected than women.
Solution: Here
Men   Women   Total
6     2       8
More men than women can be selected in “3” mutually exclusive ways, i.e.
(5 men, 0 women) or (4 men, 1 woman) or (3 men, 2 women)
$\binom{6}{5}\binom{2}{0}$ or $\binom{6}{4}\binom{2}{1}$ or $\binom{6}{3}\binom{2}{2}$
Since the three ways are mutually exclusive, the number of ways in which more men than women can be chosen is
$\binom{6}{5}\binom{2}{0} + \binom{6}{4}\binom{2}{1} + \binom{6}{3}\binom{2}{2} = 6 + 30 + 20 = 56$
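Selections from several groups combine the rule of multiplication (within one case) with addition (across mutually exclusive cases). The Python sketch below reproduces Examples 6.20 and 6.23; the function name and its parameters are assumptions made for illustration.

```python
from math import comb

# Example 6.20: 3 white out of 5 AND 2 black out of 4 (rule of multiplication)
white_and_black = comb(5, 3) * comb(4, 2)

def committees_with_more_men(men, women, size):
    """Count committees of the given size that contain more men than women."""
    total = 0
    for n_women in range(women + 1):
        n_men = size - n_women
        # Keep only the cases that are possible and have a male majority
        if 0 <= n_men <= men and n_men > n_women:
            total += comb(men, n_men) * comb(women, n_women)
    return total

print(white_and_black)
print(committees_with_more_men(men=6, women=2, size=5))
```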
The Order Doesn’t matter in Combination!!!
There are three different combinations in which three horses can finish a race, as shown in the figure:
(Assume that there are no ties and that every horse finishes)
[Figure: horses A, B and C, labelled 1st combination to 3rd combination]
Test Yourself
1) In how many ways can we select a set of 3 tables from 9 different tables?
2) A bag contains 6 balls; in how many ways can we select 4 balls?
3) A bag contains 9 white and 8 black balls; in how many ways can we select 6 white and 4 black balls?
4) Find $^7C_3$, $^4C_2$, $^{12}C_5$, $^{10}C_8$
5) In how many ways can a consonant and a vowel be chosen out of the letters of the word CHOSEN?
6) In how many ways can a team of 11 players be chosen from a total of 13 players?
Origin of Probability!!!
Probability theory had its origin in the 16th century, when the Italian physician and mathematician J. Cardan (Cardano) wrote the first book on the subject, “The Book on Games of Chance”. Cardan was an astrologer, philosopher, physician, mathematician, and gambler. This book was published in 1663, after his death. It contained techniques on how to cheat and how to catch others at cheating.
In 1654, a professional gambler named Chevalier de Mere approached the well-known French philosopher and mathematician Blaise Pascal about certain dice problems. Pascal became interested in these problems, studied them and discussed them with another French mathematician, Pierre de Fermat. Both Pascal and Fermat solved the problems independently. This work was the beginning of Probability Theory.
Outstanding contributions to probability theory were also made by J. Bernoulli, De Moivre, Pierre Laplace, Lagrange, Chebyshev, Markov, Bayes, Huygens and Kolmogorov.
The equation of the normal curve was first published in 1733 by De Moivre. The same result was later developed by two mathematical astronomers, Laplace and Gauss.
[Portraits: J. Cardano, B. Pascal, Fermat, J. Bernoulli, De Moivre, Laplace, Lagrange, Chebyshev, Markov, Bayes, Huygens, Gauss, Kolmogorov]
Probability
In our daily life we often make statements such as:
• It will probably rain today
• I will probably go abroad this year
• He is almost certain that he will win this game
All these statements are related to uncertainty, which can be measured numerically by means of “probability”. Thus we may simply define probability as “the numeric measure of uncertainty”.
Though probability started with gambling, it has been used extensively in the fields of Physical Sciences, Commerce, Biological Sciences, Medical Sciences, Weather Forecasting, etc.
Definition of Probability
Usually the probability of an event is defined by adopting either of the following two approaches:
1) Subjective approach
2) Objective approach
Subjective Approach
In the subjective approach the probability of an event is defined as “the measure of belief in the occurrence of an event by a particular person”. Probability in this sense is purely subjective, and is based on whatever evidence is available to the person.
For example:
• A sports-writer may say that there is a 70% probability that Australia will win the world cup.
• A physician might say that there is a 30% chance the patient will need an operation etc.
Objective Approach
In the objective approach, the probability of an event is defined in the following three ways:
• Classical or Priori or Theoretical Definition of Probability
• Relative Frequency or Empirical or Experimental Definition of Probability
• The Axiomatic Definition of Probability
Classical Definition
“If a random experiment can produce “n” mutually exclusive and equally likely outcomes, and if “m” of these outcomes are favorable to the occurrence of an event “A”, then the probability of the event “A” is equal to the ratio m/n”. If we take P(A) as “the probability of A” then:
$P(A) = \dfrac{m}{n} = \dfrac{\text{No. of favourable outcomes}}{\text{No. of possible outcomes}}$
Note: The classical definition was formulated by the French mathematician P.S. Laplace.
• For example, when a fair coin is tossed, we know in advance that the possible outcomes are Head and Tail. Since the Head and Tail are equally likely, the probability of each is 1/2 or 0.5.
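The classical definition is a simple count of favourable outcomes over possible outcomes, which the following Python sketch makes explicit. It assumes all outcomes in the sample space are equally likely; the function name classical_probability is an illustrative assumption.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(A) = m / n for equally likely outcomes, returned as a reduced fraction."""
    outcomes = set(sample_space)
    favourable = set(event) & outcomes   # only outcomes that can actually occur
    return Fraction(len(favourable), len(outcomes))

# A fair coin: probability of a Head
print(classical_probability({"H"}, {"H", "T"}))

# A fair die: probability of an even number
print(classical_probability({2, 4, 6}, {1, 2, 3, 4, 5, 6}))
```

Using Fraction keeps the answer as a reduced fraction instead of a rounded decimal.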
Relative Frequency Definition
“If “m” is the number of occurrences of an event “A” in a large number of trials “n”, then the probability of “A” is the limit of the relative frequency m/n as the number of trials grows infinitely large.” If we take P(A) as “the probability of A” then:

$P(A) = \lim_{n \to \infty} \dfrac{m}{n}$

For example, if a coin has been loaded (unfair), then Head and Tail are not equally likely, so the probability of each cannot be taken as 0.5. For experiments that do not have equally likely outcomes, the probability is estimated from observation: if we flip the coin 10 times, say, and observe 4 heads, then, based on this information, we say that the chance of observing a head is 4/10 or 0.4, which is not the same as 0.5. The estimate becomes more reliable as the number of flips grows. If the coin were fair, we would instead expect about 50 percent of a large number of flips to result in a head.
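The relative frequency idea can be tried out with a short simulation. The sketch below is only an illustration and is not part of the original text: the function name `relative_frequency`, the assumed true probability 0.4 for the loaded coin, and the fixed seed are all assumptions made for the example.

```python
import random

def relative_frequency(p_head, n_trials, seed=0):
    """Flip a coin with P(Head) = p_head n_trials times and return m/n."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    heads = sum(1 for _ in range(n_trials) if rng.random() < p_head)
    return heads / n_trials

if __name__ == "__main__":
    # The ratio m/n settles near the assumed value 0.4 as n grows.
    for n in (10, 1_000, 100_000):
        print(n, relative_frequency(0.4, n))
```

With only 10 flips the ratio can be far from 0.4; with a large number of flips it stays close to it, which is what the limit in the definition expresses.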
gm
The Axiomatic Definition

Let S be a sample space with the sample points A1, A2, …, Ai, …, An. To each sample point we assign a real number, denoted by P(Ai) and called the probability of Ai, that must satisfy the following basic axioms:
• Axiom 1: For any event Ai, $0 \le P(A_i) \le 1$
• Axiom 2: $P(S) = 1$
• Axiom 3: If Ai and Aj are mutually exclusive events, then $P(A_i \cup A_j) = P(A_i) + P(A_j)$

In this case P(Ai) is defined by the formula:

$P(A_i) = \dfrac{n(A_i)}{n(S)} = \dfrac{\text{No. of sample points in the event } A_i}{\text{No. of sample points in the sample space}}$

Note: The axiomatic definition was introduced in 1933 by the Russian mathematician A.N. Kolmogorov.

Subjective probability is purely subjective, i.e. two or more persons faced with the same evidence may arrive at different probabilities. On the other hand, objective probability relates to those situations where everyone will arrive at the same conclusion.
Range of Probability
If the probability of an event is 1, the event is certain to occur. If the probability of an event is 0, the event is impossible. A probability of 0.5 indicates that an event has an even chance of occurring. The following graph shows the possible range of probabilities and their meanings.

Note: The closer the probability is to 1, the more likely the event is to occur. Similarly, the closer the probability is to 0, the less likely the event is to occur.
EXAMPLE 6.24
A fair coin is tossed only once. What is the probability that a Head will appear?
Solution Since a coin is tossed,
Therefore $S = \{H, T\} \Rightarrow n(S) = 2$
Let “A” denote the event of getting “a Head”.
Then $A = \{H\} \Rightarrow n(A) = 1$
Hence $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{1}{2} = 0.50$

Note: Probabilities should be expressed as reduced fractions or rounded to two or three decimal places. When the probability of an event is an extremely small decimal, it is permissible to round the decimal to the first nonzero digit after the point. For example, 0.0000587 would be 0.00006.

EXAMPLE 6.25
Two fair coins are tossed simultaneously. What is the probability that at least one head will appear?
Solution Since two coins are tossed,
Therefore $S = \{HH, HT, TH, TT\} \Rightarrow n(S) = 4$
Let “A” denote the event of getting “at least one Head”.
Then $A = \{HH, HT, TH\} \Rightarrow n(A) = 3$
Hence $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{3}{4}$
EXAMPLE 6.26
A die is rolled. Find the probability of getting a six.
Solution Since a die is rolled,
Therefore $S = \{1, 2, 3, 4, 5, 6\} \Rightarrow n(S) = 6$
Let “A” denote the event of getting “a six”.
Then $A = \{6\} \Rightarrow n(A) = 1$
Hence $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{1}{6}$

Note: Probabilities can be expressed as fractions, decimals, or percentages. If you ask, “What is the probability of getting a head when a coin is tossed?” typical responses can be any of the following three: “1/2”, “0.5”, “50%”. These answers are all equivalent.
EXAMPLE 6.27
Two dice are rolled. Find the probability that the sum is:
(1) Exactly “5” (2) At least “9” (3) At most “4”
(4) Even (5) Less than “3”

Solution Since two dice are rolled, the sample space S is:
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
so that $n(S) = 36$.

1) The sum is “Exactly 5”
Let “A” be the event of getting “sum is exactly 5”.
Then $A = \{(1,4), (2,3), (3,2), (4,1)\} \Rightarrow n(A) = 4$
Hence $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{4}{36} = \dfrac{1}{9}$
2) The sum is “At least 9”
Let “B” be the event of getting “sum is at least 9”. Then
$B = \{(3,6), (4,5), (5,4), (6,3), (4,6), (5,5), (6,4), (5,6), (6,5), (6,6)\} \Rightarrow n(B) = 10$
Hence $P(B) = \dfrac{n(B)}{n(S)} = \dfrac{10}{36} = \dfrac{5}{18}$

3) The sum is “At most 4”
Let “C” be the event of getting “sum is at most 4”. Then
$C = \{(1,1), (1,2), (1,3), (2,1), (2,2), (3,1)\} \Rightarrow n(C) = 6$
Hence $P(C) = \dfrac{n(C)}{n(S)} = \dfrac{6}{36} = \dfrac{1}{6}$

4) The sum is “Even”
Let “D” be the event of getting “sum is even”. Then
$D = \{(1,1), (1,3), (2,2), (3,1), (1,5), (2,4), (3,3), (4,2), (5,1), (2,6), (3,5), (4,4), (5,3), (6,2), (4,6), (5,5), (6,4), (6,6)\} \Rightarrow n(D) = 18$
Hence $P(D) = \dfrac{n(D)}{n(S)} = \dfrac{18}{36} = \dfrac{1}{2}$
5) The sum is “Less than 3”
Let “E” be the event of getting “sum is less than 3”. Then
$E = \{(1,1)\} \Rightarrow n(E) = 1$
Hence $P(E) = \dfrac{n(E)}{n(S)} = \dfrac{1}{36}$
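The counting in Example 6.27 can be checked by listing the sample space on a computer. The following is a minimal sketch under the classical definition (equally likely outcomes); the helper name `prob` and the dictionary `results` are illustrative choices, not part of the original text.

```python
from fractions import Fraction
from itertools import product

# Sample space of Example 6.27: all ordered pairs (first die, second die).
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability n(event)/n(S); `event` is a test on one outcome."""
    favourable = sum(1 for outcome in S if event(outcome))
    return Fraction(favourable, len(S))

results = {
    "exactly 5": prob(lambda o: sum(o) == 5),
    "at least 9": prob(lambda o: sum(o) >= 9),
    "at most 4": prob(lambda o: sum(o) <= 4),
    "even": prob(lambda o: sum(o) % 2 == 0),
    "less than 3": prob(lambda o: sum(o) < 3),
}

if __name__ == "__main__":
    for name, p in results.items():
        print(name, p)  # Fraction prints each answer in reduced form
```

`Fraction` reduces every answer automatically, so 4/36 is reported as 1/9, in line with the advice to express probabilities as reduced fractions.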
om
EXAMPLE 6.28
A card is drawn at random from an ordinary pack of 52 playing cards. Find the probability that the card drawn is an “8”.

Solution Since a card is drawn, therefore

Eights  Others  Total
4       48      52

S = the pack of 52 cards $\Rightarrow n(S) = \binom{52}{1} = 52$
Let “A” be the event that “the card is an eight”. Then $n(A) = \binom{4}{1} = 4$
Hence $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{4}{52} = \dfrac{1}{13}$
EXAMPLE 6.29
A basket contains 5 white and 4 black balls; what is the probability of selecting 3 white balls?

White  Black  Total
5      4      9

Solution Since “3” balls are selected out of “9”,
Therefore $n(S) = \binom{9}{3} = 84$
Let “W” be the event of “selecting 3 white balls”.
Then $n(W) = \binom{5}{3} = 10$
Hence $P(W) = \dfrac{n(W)}{n(S)} = \dfrac{10}{84} = \dfrac{5}{42}$
EXAMPLE 6.30
A box contains 3 gray and 5 black balls. If 4 balls are drawn together from the box, find the probability of getting:
(i) At least 2 black balls (ii) At most 2 gray balls.

Solution Since 4 balls are drawn from 8 balls:

Gray  Black  Total
3     5      8

Therefore $n(S) = \binom{8}{4} = 70$
Let “A” be the event of getting “at least 2 black balls”, i.e. two or more black balls.
Now “A” can occur in the following mutually exclusive ways:
(2 black, 2 gray) OR (3 black, 1 gray) OR (4 black, 0 gray)
$n(A) = \binom{5}{2}\binom{3}{2} + \binom{5}{3}\binom{3}{1} + \binom{5}{4}\binom{3}{0} = 30 + 30 + 5 = 65$
Hence $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{65}{70} = \dfrac{13}{14}$
Let “B” be the event of getting “at most 2 gray balls”, i.e. two or fewer gray balls.
Now “B” can occur in the following mutually exclusive ways:
(0 gray, 4 black) OR (1 gray, 3 black) OR (2 gray, 2 black)
$n(B) = \binom{3}{0}\binom{5}{4} + \binom{3}{1}\binom{5}{3} + \binom{3}{2}\binom{5}{2} = 5 + 30 + 30 = 65$
Hence $P(B) = \dfrac{n(B)}{n(S)} = \dfrac{65}{70} = \dfrac{13}{14}$
Since 4 balls are drawn, “at most 2 gray” means the same as “at least 2 black”, so the two parts have the same answer.
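The combination counts in Example 6.30 can be verified with `math.comb` (available in Python 3.8 and later). This is a small sketch, not the author's method; the function name `prob_black_count` and its default arguments are assumptions chosen to match the example, and part (ii) is taken as stated in the question, “at most 2 gray balls”.

```python
from fractions import Fraction
from math import comb

def prob_black_count(k, gray=3, black=5, drawn=4):
    """P(exactly k black balls) when `drawn` balls are taken together."""
    ways = comb(black, k) * comb(gray, drawn - k)  # choose blacks, then grays
    return Fraction(ways, comb(gray + black, drawn))

# (i) at least 2 black balls: 2, 3 or 4 black
p_at_least_2_black = sum(prob_black_count(k) for k in range(2, 5))

# (ii) at most 2 gray balls: 0, 1 or 2 gray, i.e. 4, 3 or 2 black
p_at_most_2_gray = sum(prob_black_count(4 - g) for g in range(0, 3))

if __name__ == "__main__":
    print(p_at_least_2_black, p_at_most_2_gray)
```

Both sums run over the same three cases, which is why the two probabilities agree.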
Test Yourself
1) A fair coin is tossed only once. What is the probability that a Tail will appear?
2) Two fair coins are tossed. What is the probability that at least two heads will appear?
3) A die is rolled. Find the probability of getting a four.
4) Two dice are rolled. Find the probability that the sum is:
(i) Exactly “4” (ii) At least “10” (iii) At most “5”
(iv) Odd (v) Less than “2”
5) A card is drawn at random from an ordinary pack of 52 playing cards. Find the probability that the card drawn is a “picture” card.
6) A basket contains 6 white and 3 black balls; what is the probability of selecting 4 white balls?
7) A box contains 4 gray and 6 black balls. If 4 balls are drawn together from the box, find the probability of getting:
(i) At least 2 black balls (ii) At most 3 gray balls.
Addition Rule of Probability for Mutually Exclusive Events
Statement: Let “A” and “B” be two mutually exclusive events; then the probability that “A” or “B” occurs is equal to the probability that “A” occurs plus the probability that “B” occurs, i.e.
P(A or B) = P(A) + P(B)
OR $P(A \cup B) = P(A) + P(B)$

Proof: To prove the theorem, consider the two mutually exclusive events “A” and “B” in the Venn diagram.
[Venn diagram: disjoint sets A (p points) and B (q points) inside S (m points); $A \cup B$, with p + q points, is shaded.]
It is clear from the Venn diagram that:
$n(S) = m$
$n(A) = p$
$n(B) = q$
$n(A \cup B) = p + q$
Now
$P(A) = \dfrac{n(A)}{n(S)} = \dfrac{p}{m}$
$P(B) = \dfrac{n(B)}{n(S)} = \dfrac{q}{m}$
Therefore
$P(A \cup B) = \dfrac{n(A \cup B)}{n(S)} = \dfrac{p + q}{m} = \dfrac{p}{m} + \dfrac{q}{m} = P(A) + P(B)$
$\therefore P(A \cup B) = P(A) + P(B)$ Hence proved
EXAMPLE 6.31
Suppose that we roll a pair of dice. What is the probability of getting a sum of 5 or a sum of 11?
Solution Since a pair of dice is rolled, the sample space S is:
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
so that $n(S) = 36$.
Let “A” be the event of getting “sum is exactly 5”. Then
$A = \{(1,4), (2,3), (3,2), (4,1)\} \Rightarrow n(A) = 4$
$\therefore P(A) = \dfrac{n(A)}{n(S)} = \dfrac{4}{36}$
Let “B” be the event of getting “sum is 11”.
Then $B = \{(5,6), (6,5)\} \Rightarrow n(B) = 2$
$\therefore P(B) = \dfrac{n(B)}{n(S)} = \dfrac{2}{36}$
Now we have to find P(A or B), and the two events “A” and “B” are mutually exclusive (because they cannot occur together):
$\therefore P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) = \dfrac{4}{36} + \dfrac{2}{36} = \dfrac{6}{36} = \dfrac{1}{6}$
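Example 6.31 can be checked by building the two events as sets and confirming that they share no outcome before adding their probabilities. This is an illustrative sketch; the names `S`, `A`, `B` and `P` mirror the notation of the example and are otherwise assumptions.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # 36 outcomes for a pair of dice
A = {o for o in S if sum(o) == 5}         # sum is exactly 5
B = {o for o in S if sum(o) == 11}        # sum is 11

def P(event):
    """Classical probability of an event given as a subset of S."""
    return Fraction(len(event), len(S))

mutually_exclusive = (A & B == set())     # no outcome belongs to both events
p_a_or_b = P(A | B)                       # probability of the union

if __name__ == "__main__":
    print(mutually_exclusive, p_a_or_b, P(A) + P(B))
```

Because the intersection is empty, the probability of the union equals the plain sum P(A) + P(B).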
EXAMPLE 6.32
A card is drawn from a well-shuffled deck of 52 cards; find the probability that the card is a red or black queen.
Solution Since a card is drawn, therefore

Red Queens  Black Queens  Others  Total
2           2             48      52

S = the pack of 52 cards $\Rightarrow n(S) = \binom{52}{1} = 52$
Let “R” be the event that the card is a “red queen”. Then $n(R) = \binom{2}{1} = 2$ and $P(R) = \dfrac{n(R)}{n(S)} = \dfrac{2}{52}$
Let “B” be the event that the card is a “black queen”. Then $n(B) = \binom{2}{1} = 2$ and $P(B) = \dfrac{n(B)}{n(S)} = \dfrac{2}{52}$
Now we have to find P(R or B), and the two events “R” and “B” are mutually exclusive:
$\therefore P(R \text{ or } B) = P(R \cup B) = P(R) + P(B) = \dfrac{2}{52} + \dfrac{2}{52} = \dfrac{4}{52} = \dfrac{1}{13}$
EXAMPLE 6.33
A basket contains 5 white and 4 black balls; what is the probability that a ball drawn at random is white or black?
Solution Since a ball is drawn out of “9”, $n(S) = \binom{9}{1} = 9$

White  Black  Total
5      4      9

Let “W” be the event of “drawing a white ball”.
Then $n(W) = \binom{5}{1} = 5$
Therefore $P(W) = \dfrac{n(W)}{n(S)} = \dfrac{5}{9}$
Let “B” be the event of “drawing a black ball”.
Then $n(B) = \binom{4}{1} = 4$
Therefore $P(B) = \dfrac{n(B)}{n(S)} = \dfrac{4}{9}$
Now we have to find P(W or B), and the two events “W” and “B” are mutually exclusive:
$\therefore P(W \text{ or } B) = P(W \cup B) = P(W) + P(B) = \dfrac{5}{9} + \dfrac{4}{9} = 1$
Addition Rule of Probability for Not Mutually Exclusive Events
Statement: Let “A” and “B” be two not mutually exclusive events; then the probability of event “A” or “B” or “both” occurring is equal to the probability that “A” occurs plus the probability that “B” occurs minus the probability that both events “A” and “B” occur together, i.e.
P(A or B) = P(A) + P(B) - P(A and B)
OR $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Proof: To prove the theorem, consider the two not mutually exclusive events “A” and “B” in the Venn diagram.
[Venn diagram: overlapping sets A and B inside S (m points); the region in A only has p - t points, the overlap $A \cap B$ (shaded) has t points, and the region in B only has q - t points.]
It is clear from the Venn diagram that:
$n(S) = m$, $n(A) = p$, $n(B) = q$
$n(A \cap B) = t$
$n(A \cup B) = (p - t) + t + (q - t) = p + q - t$
Now
$P(A) = \dfrac{n(A)}{n(S)} = \dfrac{p}{m}$
$P(B) = \dfrac{n(B)}{n(S)} = \dfrac{q}{m}$
$P(A \cap B) = \dfrac{n(A \cap B)}{n(S)} = \dfrac{t}{m}$
Therefore
$P(A \cup B) = \dfrac{n(A \cup B)}{n(S)} = \dfrac{p + q - t}{m} = \dfrac{p}{m} + \dfrac{q}{m} - \dfrac{t}{m} = P(A) + P(B) - P(A \cap B)$
$\therefore P(A \cup B) = P(A) + P(B) - P(A \cap B)$ Hence proved
EXAMPLE 6.34
If a card is selected at random from a deck of 52 playing cards, what is the probability that the card is a diamond or a picture card or both?
Solution Since a card is drawn, therefore $n(S) = \binom{52}{1} = 52$
Let “A” be the event that the card is “a diamond”. Then $n(A) = \binom{13}{1} = 13$ and $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{13}{52}$
Let “B” be the event that the card is “a picture card”. Then $n(B) = \binom{12}{1} = 12$ and $P(B) = \dfrac{n(B)}{n(S)} = \dfrac{12}{52}$
Since the two events “A” and “B” are not mutually exclusive (because they can occur together: the jack, queen and king of diamonds), therefore $n(A \cap B) = 3$
Now the probability that both “A” and “B” occur together is: $P(A \cap B) = \dfrac{n(A \cap B)}{n(S)} = \dfrac{3}{52}$
Hence $P(A \text{ or } B \text{ or both}) = P(A \cup B) = P(A) + P(B) - P(A \cap B) = \dfrac{13}{52} + \dfrac{12}{52} - \dfrac{3}{52} = \dfrac{22}{52} = \dfrac{11}{26}$
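The general addition rule in Example 6.34 can be checked on an explicit deck. In this sketch the deck is modelled as (rank, suit) pairs and picture cards are taken to be jacks, queens and kings; the names `RANKS`, `SUITS`, `DECK` and `P` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = set(product(RANKS, SUITS))                   # 52 (rank, suit) pairs

A = {c for c in DECK if c[1] == "diamonds"}         # diamond cards
B = {c for c in DECK if c[0] in {"J", "Q", "K"}}    # picture cards

def P(event):
    """Classical probability of an event given as a subset of DECK."""
    return Fraction(len(event), len(DECK))

direct = P(A | B)                     # count the union directly
by_rule = P(A) + P(B) - P(A & B)      # addition rule, overlap removed once

if __name__ == "__main__":
    print(direct, by_rule)
```

Subtracting P(A ∩ B) removes the three diamond picture cards that would otherwise be counted twice.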
EXAMPLE 6.35
In a certain college 25% of the students failed math, 15% of the students failed stats and 10% of the students failed both math and stats. A student is selected at random; what is the probability that he/she failed math or stats?
Solution Given that
25% of students failed Math $\Rightarrow$ P(Math) = 0.25
15% of students failed Stats $\Rightarrow$ P(Stats) = 0.15
10% of students failed both Math and Stats $\Rightarrow P(\text{Math} \cap \text{Stats}) = 0.10$
Now since the two events are not mutually exclusive, therefore
$P(\text{a student failed Math or Stats}) = P(\text{Math} \cup \text{Stats})$
$= P(\text{Math}) + P(\text{Stats}) - P(\text{Math} \cap \text{Stats})$
$= 0.25 + 0.15 - 0.10 = 0.30$
Test Yourself
1) Suppose that we roll a pair of dice. What is the probability of getting a sum of 5 or a sum of 11?
2) A card is drawn from a well-shuffled deck of 52 cards; find the probability that the card is a red or black King.
3) A basket contains 7 white and 3 black balls; what is the probability that a ball drawn at random is white or black?
4) If a card is selected at random from a deck of 52 playing cards, what is the probability that the card is a Heart or a picture card or both?
5) A customer enters a food store. The probability that the customer buys bread is 0.60, milk is 0.50 and both bread and milk is 0.30. What is the probability that the customer would buy either bread or milk or both?
Understand the meaning of the words “AND” and “OR”!!!
The word “AND” has a single meaning.
• For example, if you were asked to find the probability of getting a queen and a heart when drawing a single card from a deck, you would be looking for the queen of hearts. Here the word “and” means “at the same time.”
[Venn diagram: Heart and Queen]
The word “OR” has two meanings.
• For example, if you were asked to find the probability of selecting a queen or a heart when one card is selected from a deck, you would be looking for one of the 4 queens or one of the 13 hearts. In this case, the queen of hearts is included in both groups and would be counted twice unless it is subtracted once. Since both events can occur at the same time, we say that this is an example of the inclusive or.
[Venn diagram: Heart or Queen]
• On the other hand, if you were asked to find the probability of getting a queen or a king, you would be looking for one of the 4 queens or one of the 4 kings. In this case, both events cannot occur at the same time, and we say that this is an example of the exclusive or.
[Venn diagram: King or Queen]
The Rule of Complementation
The probability that an event “A” will not occur, denoted by $P(\overline{A})$, is equal to one minus the probability that “A” will occur, i.e.
$P(A) + P(\overline{A}) = 1$
$P(\overline{A}) = 1 - P(A)$
In other words, “If the probability of an event or the probability of its complement is known, then the other can be found by subtracting the probability from 1”.
EXAMPLE 6.36
A coin is tossed 5 times. What is the probability that at least one tail occurs?
Solution Since a coin is tossed 5 times, therefore $n(S) = 2^5 = 32$
Let “A” be the event of getting at least one tail (i.e. one, two, three, four or five tails).
So “$\overline{A}$” is the event of getting no tail (i.e. HHHHH)
$\Rightarrow n(\overline{A}) = 1$
Now $P(\overline{A}) = \dfrac{n(\overline{A})}{n(S)} = \dfrac{1}{32}$
Hence $P(A) = 1 - P(\overline{A}) = 1 - \dfrac{1}{32} = \dfrac{31}{32}$
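The complement shortcut in Example 6.36 can be compared with a direct count over all 32 outcomes. This is a minimal sketch; the variable names are illustrative and not from the original text.

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=5))        # all 2**5 sequences of five tosses

no_tail = [s for s in S if "T" not in s]             # only HHHHH
p_no_tail = Fraction(len(no_tail), len(S))

p_by_complement = 1 - p_no_tail                      # rule of complementation
p_direct = Fraction(sum(1 for s in S if "T" in s), len(S))  # count directly

if __name__ == "__main__":
    print(p_no_tail, p_by_complement, p_direct)
```

The direct count has to look at 31 favourable sequences, while the complement needs only one, which is why the rule is convenient for “at least one” questions.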
Conditional Probability
The probability that event “A” will occur once event “B” has already occurred is called the conditional probability of “A” given “B”, denoted by P(A/B), and is given as:
$P(A/B) = \dfrac{P(A \cap B)}{P(B)}$ ; P(B) > 0
Similarly $P(B/A) = \dfrac{P(A \cap B)}{P(A)}$ ; P(A) > 0

Note: The conditional probability was first introduced by Fermat, a French mathematician.
EXAMPLE 6.37
Two fair dice are thrown. Let “A” denote “the sum of dots is 10” and “B” denote “the two dice show the same number”; then find
(i) P(A/B) (ii) P(B/A)
Solution Since two fair dice are rolled, the sample space S is:
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
so that $n(S) = 36$.
Let “A” be the event of getting “sum is 10”. Then
$A = \{(4,6), (5,5), (6,4)\} \Rightarrow n(A) = 3$
$\therefore P(A) = \dfrac{n(A)}{n(S)} = \dfrac{3}{36}$
Let “B” be the event of getting “same numbers”. Then
$B = \{(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)\} \Rightarrow n(B) = 6$
$\therefore P(B) = \dfrac{n(B)}{n(S)} = \dfrac{6}{36}$
Also $A \cap B = \{(5,5)\} \Rightarrow n(A \cap B) = 1$
$\therefore P(A \cap B) = \dfrac{n(A \cap B)}{n(S)} = \dfrac{1}{36}$
(i) $P(A/B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{1/36}{6/36} = \dfrac{1}{6}$
(ii) $P(B/A) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{1/36}{3/36} = \dfrac{1}{3}$
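The two conditional probabilities of Example 6.37 can be computed from the definition P(A/B) = P(A ∩ B)/P(B). The sketch below assumes equally likely outcomes; the helper names `P` and `conditional` are illustrative, not from the original text.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))
A = {o for o in S if sum(o) == 10}        # sum of dots is 10
B = {o for o in S if o[0] == o[1]}        # both dice show the same number

def P(event):
    """Classical probability of an event given as a subset of S."""
    return Fraction(len(event), len(S))

def conditional(X, Y):
    """P(X/Y) = P(X and Y) / P(Y); only defined when P(Y) > 0."""
    if not Y:
        raise ValueError("conditioning event has probability 0")
    return P(X & Y) / P(Y)

p_a_given_b = conditional(A, B)
p_b_given_a = conditional(B, A)

if __name__ == "__main__":
    print(p_a_given_b, p_b_given_a)
```

Conditioning on B shrinks the sample space to the 6 doubles, of which only (5,5) has sum 10; conditioning on A shrinks it to the 3 outcomes with sum 10.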
EXAMPLE 6.38
A card is selected at random from a pack. What is the probability that the card is a King given that it is a picture card?
Solution Since a card is drawn, therefore $n(S) = \binom{52}{1} = 52$
Let “A” be the event that the card is “a king”. Then $n(A) = \binom{4}{1} = 4$ and $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{4}{52}$
Let “B” be the event that the card is “a picture card”. Then $n(B) = \binom{12}{1} = 12$ and $P(B) = \dfrac{n(B)}{n(S)} = \dfrac{12}{52}$
Since the two events “A” and “B” are not mutually exclusive (because they can occur together: every king is a picture card), therefore $n(A \cap B) = 4$
Now the probability that both “A” and “B” occur together is: $P(A \cap B) = \dfrac{n(A \cap B)}{n(S)} = \dfrac{4}{52}$
Hence the probability that the card is a King given that it is a picture card is:
$P(A/B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{4/52}{12/52} = \dfrac{1}{3}$
Test Yourself
1) In a certain college 25% of the students failed math, 15% of the students failed stats and 10% of the students failed both math and stats. A student is selected at random:
(i) If he failed statistics, what is the probability that he failed math?
(ii) If he failed math, what is the probability that he failed statistics?
2) Two fair dice are thrown. Let “A” denote “the sum of dots is 9” and “B” denote “the two dice show odd numbers”; then find
(i) P(A/B) (ii) P(B/A)
3) A card is selected at random from a pack. What is the probability that the card is a queen given that it is a picture card?

Independent Events
“Two events are independent if the occurrence of one of the events does not affect the probability of the occurrence of the other event”.
The following are some examples of independent events:
• Rolling a die and getting a 6, and then rolling a second die and getting a 3.
• Drawing a card from a deck and getting a queen, replacing it, and drawing a second card and getting a queen.
Dependent Events
“Two events are dependent if the occurrence of one of the events affects the probability of the occurrence of the other event”.
The following are some examples of dependent events:
• Drawing a card from a deck, not replacing it, and then drawing a second card.
• Selecting a ball from an urn, not replacing it, and then selecting a second ball.
Multiplication Rule of Probability for Independent Events
“If “A” and “B” are two independent events, then the probability that both of them occur is equal to the probability that “A” occurs multiplied by the probability that “B” occurs”, i.e.
$P(A \text{ and } B) = P(A \cap B) = P(A) \times P(B)$

Note: To find the probability of two events occurring in sequence, you can use the Multiplication Laws.

EXAMPLE 6.39
A pair of dice is thrown twice. What is the probability of getting a total of 6 on the first throw and a total of 9 on the second?
Solution Let “A1” be the event of getting a “total of 6” with a pair of dice in the first throw:
$A_1 = \{(1,5), (2,4), (3,3), (4,2), (5,1)\} \Rightarrow n(A_1) = 5$
$\therefore P(A_1) = \dfrac{n(A_1)}{n(S)} = \dfrac{5}{36}$
And let “A2” be the event of getting a “total of 9” in the second throw:
$A_2 = \{(3,6), (4,5), (5,4), (6,3)\} \Rightarrow n(A_2) = 4$
$\therefore P(A_2) = \dfrac{n(A_2)}{n(S)} = \dfrac{4}{36}$
Now we have to find P(A1 and A2), and the two events “A1” and “A2” are independent because they belong to two different throws:
$\therefore P(A_1 \text{ and } A_2) = P(A_1) \times P(A_2) = \dfrac{5}{36} \times \dfrac{4}{36} = \dfrac{5}{324}$
EXAMPLE 6.40
Two cards are drawn in succession from a pack of playing cards and the card drawn in the first attempt is replaced in the pack before the second attempt. Find the probability that both the drawn cards are queens.
Solution Let “Q1” be the event of getting “a queen” on the first draw.

Attempt No.  Queen  Others  Total
1st          4      48      52

$\therefore P(Q_1) = \dfrac{\binom{4}{1}}{\binom{52}{1}} = \dfrac{4}{52}$
And let “Q2” be the event of getting “a queen” on the second draw, given that the first card is replaced:

Attempt No.  Queen  Others  Total
2nd          4      48      52

$\therefore P(Q_2) = \dfrac{\binom{4}{1}}{\binom{52}{1}} = \dfrac{4}{52}$
Now we have to find P(Q1 and Q2), and the two events “Q1” and “Q2” are independent because the first card is replaced:
$\therefore P(Q_1 \text{ and } Q_2) = P(Q_1) \times P(Q_2) = \dfrac{4}{52} \times \dfrac{4}{52} = \dfrac{1}{169}$

Note: With replacement means that the events are independent (the probabilities don't change).
Multiplication Rule of Probability for Dependent Events
“If “A” and “B” are two dependent events, then the probability that both of them occur is equal to the probability that “A” occurs multiplied by the conditional probability of “B” given that “A” has already occurred”, i.e.
$P(A \text{ and } B) = P(A \cap B) = P(A) \times P(B/A)$
Similarly $P(A \text{ and } B) = P(A \cap B) = P(B) \times P(A/B)$
EXAMPLE 6.41
Two cards are drawn in succession from a pack of playing cards and the card drawn in the first attempt is not replaced in the pack before the second attempt. Find the probability that both the drawn cards are queens.
Solution Let “Q1” be the event of getting “a queen” on the first draw:

Attempt No.  Queen  Others  Total
1st          4      48      52

$\therefore P(Q_1) = \dfrac{\binom{4}{1}}{\binom{52}{1}} = \dfrac{4}{52}$
And let “Q2” be the event of getting “a queen” on the second draw, given that the first card is not replaced:

Attempt No.  Queen  Others  Total
2nd          3      48      51

$\therefore P(Q_2/Q_1) = \dfrac{\binom{3}{1}}{\binom{51}{1}} = \dfrac{3}{51}$
(Because “Q1” has already occurred)
Now we have to find P(Q1 and Q2), and the two events “Q1” and “Q2” are dependent because the first card is not replaced:
$\therefore P(Q_1 \text{ and } Q_2) = P(Q_1) \times P(Q_2/Q_1) = \dfrac{4}{52} \times \dfrac{3}{51} = \dfrac{1}{221}$
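Examples 6.40 and 6.41 differ only in whether the first card goes back into the pack, so one small function can cover both. This is a sketch under that reading of the two examples; the function name `both_queens` and its parameters are hypothetical.

```python
from fractions import Fraction

def both_queens(replace, queens=4, cards=52):
    """P(two queens in two successive draws), with or without replacement."""
    p_first = Fraction(queens, cards)
    if replace:
        # Independent events: the pack is restored before the second draw.
        p_second = Fraction(queens, cards)
    else:
        # Dependent events: one queen and one card are gone, so this is
        # the conditional probability P(Q2/Q1).
        p_second = Fraction(queens - 1, cards - 1)
    return p_first * p_second

if __name__ == "__main__":
    print(both_queens(replace=True), both_queens(replace=False))
```

The only change between the two cases is the second factor, which is exactly the difference between P(Q2) and P(Q2/Q1) in the two multiplication rules.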
EXAMPLE 6.42
A box contains 3 gray and 2 black balls. Two balls are drawn in succession. Find the probability that both balls drawn are black when the balls are not replaced after being drawn.
Solution Let “B1” be the event of getting “a black ball” on the first draw:

Draw No.  Gray  Black  Total
1st       3     2      5

$\therefore P(B_1) = \dfrac{\binom{2}{1}}{\binom{5}{1}} = \dfrac{2}{5}$
And let “B2” be the event of getting “a black ball” on the second draw, given that the first ball is not replaced:

Draw No.  Gray  Black  Total
2nd       3     1      4

$\therefore P(B_2/B_1) = \dfrac{\binom{1}{1}}{\binom{4}{1}} = \dfrac{1}{4}$
(Because “B1” has already occurred)
Now we have to find P(B1 and B2), and the two events “B1” and “B2” are dependent because the first ball is not replaced:
$\therefore P(B_1 \text{ and } B_2) = P(B_1) \times P(B_2/B_1) = \dfrac{2}{5} \times \dfrac{1}{4} = \dfrac{1}{10}$

Note: Without replacement means that the events are dependent (the probability changes).

EXAMPLE 6.43
Two drawings, each of 3 balls, are made from a box containing 4 gray and 7 black balls; the balls are not replaced before the second draw. Find the probability that the first drawing gives 3 black balls and the second 3 gray balls.
Solution Let “B” be the event of getting “3 black balls” on the first draw:

Draw No.  Gray  Black  Total
1st       4     7      11

$\therefore P(B) = \dfrac{\binom{7}{3}}{\binom{11}{3}} = \dfrac{35}{165}$
And let “G” be the event of getting “3 gray balls” on the second draw, given that the first three balls are not replaced:

Draw No.  Gray  Black  Total
2nd       4     4      8

$\therefore P(G/B) = \dfrac{\binom{4}{3}}{\binom{8}{3}} = \dfrac{4}{56}$
(Because “B” has already occurred)
Now we have to find P(B and G), and the two events “B” and “G” are dependent because the first three balls are not replaced:
$\therefore P(B \text{ and } G) = P(B) \times P(G/B) = \dfrac{35}{165} \times \dfrac{4}{56} = \dfrac{1}{66}$
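The arithmetic of Example 6.43 can be reproduced with `math.comb` (Python 3.8 or later). This is a sketch rather than a general routine: the function name `draw_same_colour` and its arguments are assumptions made for the illustration.

```python
from fractions import Fraction
from math import comb

def draw_same_colour(colour_count, total, k):
    """P(all k balls drawn together have one given colour)."""
    return Fraction(comb(colour_count, k), comb(total, k))

# First drawing: 3 black out of 4 gray + 7 black = 11 balls.
p_b = draw_same_colour(7, 11, 3)

# Second drawing: 3 black balls are gone, so 4 gray + 4 black = 8 remain.
p_g_given_b = draw_same_colour(4, 8, 3)

# Multiplication rule for dependent events.
p_b_and_g = p_b * p_g_given_b

if __name__ == "__main__":
    print(p_b, p_g_given_b, p_b_and_g)
```

Note that the second call uses the reduced box (8 balls), which is where the dependence between the two drawings enters.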
Test Yourself
1) Find the probability of drawing a picture card on each of two consecutive draws from a standard pack with replacement of the first card.
2) Two drawings, each of 4 balls, are made from a box containing 5 white and 8 black balls; the balls are not replaced before the second draw. Find the probability that the first drawing gives 4 black balls and the second 4 white balls.

Note: If two events are independent, it doesn't mean that they can't occur at the same time. Many people make the mistake of thinking of independent events as being totally separate from each other. In probability, two independent events can occur at the same time; they just don't affect each other in terms of probabilities, as discussed in Examples 6.39 and 6.40.
EXAMPLE 6.44
Let “A” and “B” be the two possible outcomes of an experiment and suppose:
$P(A) = 0.33$, $P(A \cap B) = 0.25$, $P(B) = p$ and $P(A \cup B) = 0.75$
1) Find “p”, if “A” and “B” are not mutually exclusive.
2) Find “p”, if “A” and “B” are independent.
Solution
1) Find “p”, if “A” and “B” are not mutually exclusive.
If “A” and “B” are not mutually exclusive then:
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$\Rightarrow 0.75 = 0.33 + p - 0.25$
$\Rightarrow p = 0.67$
2) Find “p”, if “A” and “B” are independent.
If “A” and “B” are independent then:
$P(A \cap B) = P(A) \times P(B)$
$\Rightarrow P(B) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{0.25}{0.33} \Rightarrow p \approx 0.76$

Note: Sometimes there is confusion between independent events and mutually exclusive events. The term ‘independent’ is defined in terms of the probability of events, whereas ‘mutually exclusive’ is defined in terms of the events themselves (subsets of the sample space). Moreover, mutually exclusive events never have an outcome in common, but independent events may have common outcomes. Clearly, ‘independent’ and ‘mutually exclusive’ do not have the same meaning. In other words, two independent events having non-zero probabilities of occurrence cannot be mutually exclusive, and conversely, two mutually exclusive events having non-zero probabilities of occurrence cannot be independent.
EXAMPLE 6.45
Let “A” and “B” be the two possible outcomes of an experiment and suppose:
$P(A) = 0.60$, $P(B) = p$ and $P(A \cup B) = 0.92$
1) Find “p”, if “A” and “B” are mutually exclusive.
2) Find “p”, if “A” and “B” are independent.
Solution
1) Find “p”, if “A” and “B” are mutually exclusive.
If “A” and “B” are mutually exclusive then:
$P(A \cup B) = P(A) + P(B)$
$\Rightarrow 0.92 = 0.60 + p$
$\Rightarrow p = 0.32$
2) Find “p”, if “A” and “B” are independent.
If “A” and “B” are independent then they are not mutually exclusive, and $P(A \cap B) = P(A) \times P(B)$:
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$\Rightarrow P(A \cup B) = P(A) + P(B) - P(A) \times P(B)$
$\Rightarrow 0.92 = 0.60 + p - 0.60p$
$\Rightarrow 0.32 = 0.40p$
$\Rightarrow p = 0.80$

Note:
• Two events A and B are independent if: $P(A/B) = P(A)$ or $P(B/A) = P(B)$
• Two events A and B are dependent if: $P(A/B) \ne P(A)$ or $P(B/A) \ne P(B)$
• Mutually exclusive events are always dependent.
• Two dependent events A and B cannot be mutually exclusive, unless $P(A/B) = 0$
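Both parts of Example 6.45 amount to solving the addition rule for the unknown p. The sketch below writes the two rearranged formulas as functions; the function names are hypothetical, and exact fractions are used so that 0.92 - 0.60 does not pick up floating-point rounding.

```python
from fractions import Fraction

def p_if_mutually_exclusive(p_a, p_union):
    """Solve P(A or B) = P(A) + p for p."""
    return p_union - p_a

def p_if_independent(p_a, p_union):
    """Solve P(A or B) = P(A) + p - P(A)*p for p (requires P(A) < 1)."""
    return (p_union - p_a) / (1 - p_a)

P_A = Fraction("0.60")
P_UNION = Fraction("0.92")

p_me = p_if_mutually_exclusive(P_A, P_UNION)
p_ind = p_if_independent(P_A, P_UNION)

if __name__ == "__main__":
    print(float(p_me), float(p_ind))
```

The second formula comes from collecting the terms in p: P(A or B) - P(A) = p(1 - P(A)).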
Interesting Facts about Playing Cards
• There are 52 cards in a deck of playing cards. There are four suits (clubs, diamonds, hearts and spades) in it, each having 13 cards. The clubs and spades are black in color while hearts and diamonds are red in color. Total black cards are 26 and total red cards are also 26.
• There are four aces.
• Number of picture cards is 12, which includes 4 jacks, 4 queens and 4 kings (one from each suit).
• Number of face cards is 16, which includes 4 aces, 4 jacks, 4 queens and 4 kings (one from each suit).

Note:
• No. of spots on cards: 365 (days in year)
• Cards in pack: 52 (weeks in year)
• No. of suits: 4 (weeks in month)
• No. of picture cards: 12 (months in year)
Results: Consider the Venn diagrams of two overlapping events A and B in the sample space S, which is divided into the regions $A \cap \overline{B}$, $A \cap B$ and $B \cap \overline{A}$.
Result #01
$A \cup B = A \cup (B \cap \overline{A}) \Rightarrow P(A \cup B) = P(A) + P(B \cap \overline{A})$
Result #02
$B = (A \cap B) \cup (B \cap \overline{A}) \Rightarrow P(B) = P(A \cap B) + P(B \cap \overline{A})$
Result #03
$B \cap \overline{A} = B - (A \cap B) \Rightarrow P(B \cap \overline{A}) = P(B) - P(A \cap B)$
Result #04
$A \cap \overline{B} = A - (A \cap B) \Rightarrow P(A \cap \overline{B}) = P(A) - P(A \cap B)$
Important!!!
While reading probability problems, pay special attention to key phrases that translate into mathematical symbols. The following table lists various phrases and their corresponding mathematical equivalents:

Math Symbol   Phrases
>             “greater than” or “more than” or “exceed” or “better than” or “taller than” or “above”
<             “less than” or “smaller than” or “below” or “under” or “fewer than”
≥             “at least” or “greater than or equal to” or “no less than”
≤             “at most” or “less than or equal to” or “no more than”
=             “exactly” or “equal” or “is”
Sharpen your Pencil
MCQ’s
(1) Permutation of STATISTICS is _____
(2) $^nC_r =$ _____
(A) $\frac{n!}{n!\,(n-r)!}$ (B) $\frac{n!}{(n-r)!}$ (C) $\frac{r!}{n!\,(n-r)!}$ (D) None of these
(3) Probability of a king from a pack of 52 cards is _____
(A) 4/52 (B) 1/4 (C) 1/52 (D) None of these
(4) $P(A) + P(\bar{A}) =$ _____
(A) $P(A)$ (B) $P(\bar{A})$ (C) 1 (D) None of these
(5) Total possible cases with two dice _____
(A) $2^6$ (B) $6^2$ (C) $^6C_2$ (D) None of these
(6) P(S) = _____
(A) 1 (B) 0 (C) ∞ (D) None of these
(7) P(∅) = _____
(A) 1 (B) 0 (C) -1 (D) None of these
(8) For not mutually exclusive events $P(A \cup B) = P(A) + P(B) -$ _____
(A) $P(A/B)$ (B) $P(B/A)$ (C) $P(A \cap B)$ (D) None of these
(9) For mutually exclusive events $P(A \cup B) = P(A) +$ _____
(A) $P(B)$ (B) $P(\bar{A})$ (C) $P(A \cap B)$ (D) None of these
(10) Two mutually exclusive events are always _____
(A) Independent (B) Dependent (C) Nothing can be said in terms of independence (D) None of these
(11) If “A” and “B” are two independent events with P(A) = 0.5 and P(B) = 0.3 then $P(A \cap B) =$ _____
(A) 0 (B) 0.15 (C) 0.3 (D) None of these
(12) Two dice are thrown; the probability of obtaining a sum of “2” is _____
(A) 1/6 (B) 1/36 (C) 1/18 (D) None of these
(13) An event that contains only one sample point is called _____
(A) Exhaustive (B) Compound (C) Simple (D) None of these
(14) If two dice are thrown then the total number of sample points is _____
(15) $^8C_5 =$ _____
(16) The range of probability is _____
(A) (0, 1) (B) (-1, 1) (C) (-1, 0) (D) None of these
(17) If three fair dice are rolled, the total number of sample points is _____
(18) If 4 coins are tossed, the total number of sample points is _____
Short Questions
Exercise
Q.6.01. State the addition rule of probability.
Q.6.02. How many permutations can be formed out of the letters of the word “MISSISSIPPI”?
Q.6.03. A pair of dice is rolled. Make the sample space and calculate the probability that the sum of dots is at least 9.
Q.6.04. Make permutations of A, B, C, D.
Q.6.05. Define the terms:
(i) Event (ii) Mutually Exclusive Events (iii) Equally likely Events (iv) Sample Space
Q.6.06. Evaluate the following:
(i) $\binom{52}{13}$ (ii) $\binom{48}{10}\binom{4}{3}$ (iii) $\binom{39}{13}\binom{13}{4}$ (iv) $\binom{7}{2}\binom{3}{2}\binom{4}{1}$
Q.6.07. Distinguish between independent and dependent events.
Q.6.08. Evaluate the following:
(i) $^{10}P_3$ (ii) $^{52}P_{13}$ (iii) $^{10}P_{4,2,3,1}$
Q.6.09. A pair of dice is rolled. Make the sample space and calculate the probability that the sum of dots is at least 8.
Q.6.10. Define with examples:
(i) Set (ii) Null set (iii) Sub-set (iv) Universal set (v) Equal sets
Q.6.11. What is an experiment and what is a random experiment?
Q.6.12. A card is drawn at random from a pack of 52 cards. Find the probability of obtaining: (i) Red Card (ii) King of Spades
Long Questions
Exercise
Q.6.01. Two dice are rolled; find the probability that the sum of dots is:
(i) Exactly 3
(ii) Odd
(iii) More than 9
(iv) More than 5 but less than or equal to 10
(v) At least 7
(vi) At most 6
Q.6.02. State and prove the addition rule of probability for not mutually exclusive events.
Q.6.03. A card is drawn at random from an ordinary pack of 52 playing cards. Find the probability that the card:
(a) Is a seven
(b) Is not a seven
Q.6.04. A card is drawn at random from an ordinary pack of 52 cards; find the probability that the card is:
1) A club or a diamond
2) A club or a king
Q.6.05. Two dice are rolled; find the conditional probability that the sum of dots is “7” given that:
(i) The sum is 6 or more
(ii) The sum is less than 9
(iii) The sum is more than 5
Q.6.06. In a college 30% students failed English, 40% students failed Urdu and 10% students failed both. A student is selected at random. What is the probability that he failed English or Urdu?
Q.6.07. What is the probability of getting a sum of dots of 14 when 3 fair dice are rolled?
Q.6.08. A bag contains 16 balls of which 5 are marked. If 8 balls are drawn out together, what is the probability that all the marked balls are among the 8 balls?
Q.6.09. A card is drawn from a well shuffled pack of 52 playing cards. Find the probability that the drawn card is:
(i) Spade (ii) Jack of Clubs (iii) King
(iv) Queen, King of Diamonds, Ace of Hearts or Jack
Q.6.10. If $U = \{5, 6, 7, \ldots, 15\}$, $A = \{4, 6, 8\}$ and $B = \{11, 12, 13, 14, 15\}$ then show that $\overline{A \cup B} = \bar{A} \cap \bar{B}$
Q.6.11. The probability that a person will be alive in the next 20 years is 2/3. What is the probability that he will not be alive in the next 20 years?
Q.6.12. Two coins are tossed. What is the probability that two heads result, given that there is at least one head?
Q.6.13. In how many ways can the letters of the following words be rearranged:
(i) Mathematics (ii) Manufacturer (iii) Convocation (iv) Sociology
Q.6.14. How many possible permutations can be formed from the letters of each word?
(i) Infinity (ii) Unusual (iii) Statistics (iv) Hyperbola
Q.6.15. State and prove the addition rule of probability for mutually exclusive events.
Q.6.16. If 3 books are picked at random from a shelf containing 5 novels, 3 books of poems and a dictionary, what is the probability that:
(i) The dictionary is selected?
(ii) 2 novels and one book of poems are selected?
CHAPTER 07
Random Variables
Chapter Contents
You should read this chapter if you need to learn about:
 Random Variable
 Discrete Random Variables
 Continuous Random Variables
 Discrete Probability Distribution
 Graph of Discrete Probability Distribution
 How to find Probabilities using Discrete Probability Distributions
 Continuous Probability Distribution
 Graph of Continuous Probability Distribution
 How to find Probabilities using Continuous Probability Distributions
 Mathematical Expectation of a Discrete random variable
 Variance and S.D of a Discrete random variable
 Properties of Expectation
 Properties of Variance and S.D
 Amazing Histogram
Suppose your teacher asked you to write your names on slips distributed by him. You returned your slips; the teacher folded the slips in a uniform pattern and mixed them well. Then he asked one of the students to draw ten slips, one by one or together. The teacher opened the drawn slips one by one, read the names of the selected students and asked them the following questions:
 What is your age?
 How many brothers and sisters do you have?
 How many living rooms are available to your family?
 What is your height?
In this example, the selection of ten students by the method explained above is a random experiment and the procedure of selection is a random process. The students selected in this way are the outcomes of the experiment, and the questions asked of the selected students are the characteristics in which we are interested. Since each characteristic can assume different values from outcome to outcome of the random experiment, these characteristics are not only variables but are known as random variables, chance variables or stochastic variables.
Random Variable
“A variable whose values are determined by the outcomes of a random experiment is called a random variable”
If we toss two coins then the sample space is:
$S = \{HH, HT, TH, TT\}$
Let “X” be a variable denoting the “number of heads”; then “X” takes the values 0, 1, 2. Since these values are determined by the results (outcomes) of the random experiment, “X” is called a random variable.

Outcome (S)    X
HH    2
HT    1
TH    1
TT    0
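The mapping from outcomes to values can be sketched in a few lines of Python. This is only an illustration of the definition; the names `sample_space` and `number_of_heads` are chosen here and do not come from the text.

```python
from itertools import product

# Sample space for tossing two coins: HH, HT, TH, TT
sample_space = ["".join(p) for p in product("HT", repeat=2)]

def number_of_heads(outcome):
    """The random variable X: it assigns a number to each outcome."""
    return outcome.count("H")

X = {outcome: number_of_heads(outcome) for outcome in sample_space}
print(X)                        # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
print(sorted(set(X.values())))  # [0, 1, 2]
```

The random variable is the function itself; the values 0, 1, 2 are simply what that function returns over the sample space.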
The following are some examples of random variables:
 The number of deaths in an accident
 The number of heads in tossing two coins
 The number of daily admissions in a hospital
 The amount of rain that falls at a certain place, etc.
Random variables are usually denoted by the last letters of the alphabet, e.g. X, Y or Z.
Types of Random Variable
There are two types of random variable:
 Discrete Random Variable
 Continuous Random Variable
Discrete Random Variable
“A random variable is called a discrete random variable if it arises from a counting phenomenon and there is a definite jump or gap between two possible values of the random variable. Further, it is free from the unit of measurement.”
 The number of heads in tossing two coins
 No. of deaths in an accident
 No. of apples in a basket
 No. of passengers carried by PIA in the last ten years
 The number of daily admissions in a hospital, etc.
Continuous Random Variable
“A random variable is called a continuous random variable if it arises from a measuring phenomenon and there can be an infinite number of values between two possible values of the variable. Further, it has a unit of measurement.”
 The amount of milk given by a cow
 The amount of rain that falls at a certain place, etc.
A discrete random variable has either a finite or a countably infinite number of values, which are usually integers or whole numbers. The values of a discrete random variable can be plotted on a number line with space between each point.
[Number line: X = No. of calls in one day, with separate points at 0, 1, 2, 3, 4]
On the other hand, a continuous random variable has infinitely many values that are real numbers. The values of a continuous random variable can be plotted on a line in an uninterrupted fashion.
[Number line: X = Time spent making calls in one day, an unbroken line marked at 0, 6, 12, 18, 24]
Discrete probability distribution
“A table listing all possible values that a discrete random variable can take on, together with the associated probabilities, is called a discrete probability distribution”
Let “X” be a discrete random variable which can take the values x1, x2, …, xn with associated probabilities f(x1), f(x2), …, f(xn) respectively; then the discrete probability distribution is given as:
x x1 x2 … xn
f(x) or P(x) f(x1) f(x2) … f(xn)
The function f(x) or P(x) that is used to assign the probabilities to different values of the random variable “X” is called the probability function or probability mass function (p.m.f).
The discrete probability mass function may be defined as:
$$f(x) = P(X = x) = \begin{cases} \text{Function of } x ; & x = x_1, x_2, \ldots, x_n \\ 0 ; & \text{otherwise} \end{cases}$$
A p.m.f has the following two properties:
(i) $f(x) \ge 0$ for all “x”
(ii) $\sum_{\text{all } x} f(x) = 1$
Note: Some writers do not make any distinction between the terms probability function and probability distribution; they use them interchangeably.
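The two properties of a p.m.f can be checked mechanically. The sketch below is illustrative only: the helper name `is_valid_pmf` is an assumption of this example, and exact fractions are used so that the sum is compared with 1 without rounding error.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Check property (i) f(x) >= 0 and property (ii) sum of f(x) = 1."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return non_negative and sums_to_one

# Number of heads when three coins are tossed
heads = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
print(is_valid_pmf(heads))                                   # True

# Probabilities that add up to only 3/4 do not form a p.m.f
print(is_valid_pmf({0: Fraction(1, 2), 1: Fraction(1, 4)}))  # False
```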
Graph of Discrete probability distribution
The discrete probability distribution is usually displayed by a vertical lines graph or a probability histogram. In both types of graph we take the values of X on the X-axis and the probabilities on the Y-axis, as for the following distribution:
x 0 1 2 3 Total
f(x) 1/8 3/8 3/8 1/8 1
Class Boundaries -0.5 – 0.5 0.5 – 1.5 1.5 – 2.5 2.5 – 3.5 --
[Figure: Vertical Lines Graph and Probability Histogram of f(x) against x = 0, 1, 2, 3, with heights 1/8, 3/8, 3/8, 1/8]
Note: To make a probability histogram we first find class boundaries. These class boundaries are called fictitious class boundaries because the discrete random variable cannot assume such values.
EXAMPLE 7.01
Find the probability distribution of the number of heads when two coins are tossed.
Solution Since two coins are tossed, therefore:
$S = \{HH, HT, TH, TT\}$
Let “X” be a random variable denoting the number of heads; then x = 0, 1, 2
Now the probabilities are:
If X = 0 then $f(0) = \frac{1}{4}$, If X = 1 then $f(1) = \frac{2}{4}$, If X = 2 then $f(2) = \frac{1}{4}$
Hence the probability distribution of the number of heads (X) becomes:
x 0 1 2 Total
f(x) 1/4 2/4 1/4 1
EXAMPLE 7.02
Find the probability distribution of the number of heads when three coins are tossed.
Solution Since three coins are tossed, therefore:
$S = \{HHH, HHT, HTH, THH, TTH, THT, HTT, TTT\}$
Let “X” be a random variable denoting the number of heads; then x = 0, 1, 2, 3
If X = 0 then $f(0) = \frac{1}{8}$, If X = 1 then $f(1) = \frac{3}{8}$
If X = 2 then $f(2) = \frac{3}{8}$, If X = 3 then $f(3) = \frac{1}{8}$
Hence the probability distribution of the number of heads (X) becomes:
x 0 1 2 3 Total
f(x) 1/8 3/8 3/8 1/8 1
EXAMPLE 7.03
Find the probability distribution of the sum of dots when two dice are rolled.
Solution Since two dice are rolled, the sample space is:
S = { (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
      (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
      (3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
      (4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
      (5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
      (6,1) (6,2) (6,3) (6,4) (6,5) (6,6) }
For the sum of dots we may write the sample space as:
S = { 2 3 4 5 6 7
      3 4 5 6 7 8
      4 5 6 7 8 9
      5 6 7 8 9 10
      6 7 8 9 10 11
      7 8 9 10 11 12 }
Let “X” be a random variable denoting the sum of dots; then x = 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
If X = 2 then $f(2) = \frac{1}{36}$, If X = 3 then $f(3) = \frac{2}{36}$, If X = 4 then $f(4) = \frac{3}{36}$
If X = 5 then $f(5) = \frac{4}{36}$, If X = 6 then $f(6) = \frac{5}{36}$, If X = 7 then $f(7) = \frac{6}{36}$
If X = 8 then $f(8) = \frac{5}{36}$, If X = 9 then $f(9) = \frac{4}{36}$, If X = 10 then $f(10) = \frac{3}{36}$
If X = 11 then $f(11) = \frac{2}{36}$, If X = 12 then $f(12) = \frac{1}{36}$
Hence the probability distribution of the sum of dots (X) becomes:
x 2 3 4 5 6 7 8 9 10 11 12 Total
f(x) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36 1
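The counting in this example can be reproduced by listing all 36 outcomes and tallying the sums. The following is a minimal sketch; the variable names `counts` and `pmf` are chosen for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Tally how many of the 36 equally likely outcomes give each sum
counts = Counter(a + b for a, b in product(faces, repeat=2))

# Divide each count by 36 to get the probability distribution of the sum
pmf = {total: Fraction(counts[total], 36) for total in sorted(counts)}

for total in pmf:
    print(total, f"{counts[total]}/36")

print(sum(pmf.values()))  # 1
```

Note that `Fraction` reduces 6/36 to 1/6 internally, which is why the loop prints the unreduced count over 36 to match the table.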
EXAMPLE 7.04
A basket contains 6 balls: 2 white balls and 4 black balls. If three balls are selected at random, find the probability distribution of the number of black balls.
Solution
Black White Total
4 2 6
Since “3” balls are selected out of “6”, the total number of possible selections is $\binom{6}{3} = 20$.
Let “X” be a random variable denoting the number of black balls; then x = 1, 2, 3 (because 3 balls are selected). The value X = 0 is impossible, as there are only 2 white balls in total.
Now the probabilities are:
If X = 1 then $f(1) = \frac{\binom{4}{1}\binom{2}{2}}{20} = \frac{4}{20}$,
If X = 2 then $f(2) = \frac{\binom{4}{2}\binom{2}{1}}{20} = \frac{12}{20}$,
If X = 3 then $f(3) = \frac{\binom{4}{3}\binom{2}{0}}{20} = \frac{4}{20}$
Hence the probability distribution of the number of black balls (X) becomes:
x 1 2 3 Total
f(x) 4/20 12/20 4/20 1
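The same distribution can be obtained with the standard-library function `math.comb`, which also confirms that no black balls at all (three white balls) cannot happen. This is a sketch under the numbers of this example; the variable names are chosen for illustration.

```python
from fractions import Fraction
from math import comb

black, white, drawn = 4, 2, 3
total_ways = comb(black + white, drawn)   # 6C3 = 20

pmf = {}
for x in range(drawn + 1):                # x = number of black balls drawn
    w = drawn - x                         # the remaining balls must be white
    ways = comb(black, x) * comb(white, w)  # comb gives 0 when w > white
    if ways > 0:
        pmf[x] = Fraction(ways, total_ways)

print(total_ways)  # 20
print(pmf)         # probabilities for x = 1, 2, 3 only (4/20, 12/20, 4/20 in lowest terms)
```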
Test Yourself
1) Find the probability distribution of the number of tails when two coins are tossed.
2) Find the probability distribution of the number of tails when three coins are tossed.
3) Find the probability distribution of the difference of dots when two dice are rolled.
4) A basket contains 6 balls: 2 white balls and 4 black balls. If three balls are selected at random, find the probability distribution of the number of white balls.
How to find the probabilities using a discrete probability distribution or a discrete probability mass function
Let “X” be a discrete random variable; then the discrete probability distribution is given as:
x x1 x2 … xn
f(x) f(x1) f(x2) … f(xn)
Similarly the discrete probability mass function f(x) is given as:
$$f(x) = P(X = x) = \begin{cases} \text{Function of } x ; & x = x_1, x_2, \ldots, x_n \\ 0 ; & \text{otherwise} \end{cases}$$
Now to find the probabilities (with the values in increasing order $x_1 < x_2 < \cdots < x_n$) we have:
 $P(X = x_1) = f(x_1)$
 $P(x_1 \le X \le x_3) = f(x_1) + f(x_2) + f(x_3)$
 $P(x_1 < X \le x_3) = f(x_2) + f(x_3)$
 $P(X \le x_2) = f(x_1) + f(x_2)$
 $P(X < x_2) = f(x_1)$
EXAMPLE 7.05
Find the probability distribution of the sum of dots when two dice are rolled.
Also find the probability that:
(i) The sum of dots is exactly 4
(ii) The sum of dots is less than 6
(iii) The sum of dots is greater than 10
(iv) The sum of dots is at least 9
Solution Since two dice are rolled, the sample space is:
S = { (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
      (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
      (3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
      (4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
      (5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
      (6,1) (6,2) (6,3) (6,4) (6,5) (6,6) }
For the sum of dots we may write the sample space as:
S = { 2 3 4 5 6 7
      3 4 5 6 7 8
      4 5 6 7 8 9
      5 6 7 8 9 10
      6 7 8 9 10 11
      7 8 9 10 11 12 }
Let “X” be a random variable denoting the sum of dots; then x = 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
If X = 2 then $f(2) = \frac{1}{36}$, If X = 3 then $f(3) = \frac{2}{36}$, If X = 4 then $f(4) = \frac{3}{36}$
If X = 5 then $f(5) = \frac{4}{36}$, If X = 6 then $f(6) = \frac{5}{36}$, If X = 7 then $f(7) = \frac{6}{36}$
If X = 8 then $f(8) = \frac{5}{36}$, If X = 9 then $f(9) = \frac{4}{36}$, If X = 10 then $f(10) = \frac{3}{36}$
If X = 11 then $f(11) = \frac{2}{36}$, If X = 12 then $f(12) = \frac{1}{36}$
Hence the probability distribution of the sum of dots (X) becomes:
x 2 3 4 5 6 7 8 9 10 11 12 Total
f(x) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36 1
(i) The sum of dots is exactly 4
P(The sum of dots is exactly 4) = P(X = 4)
= f(4)
= 3/36
(ii) The sum of dots is less than 6
P(The sum of dots is less than 6) = P(X < 6)
= P(X = 2 or X = 3 or X = 4 or X = 5)
= P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)
= f(2) + f(3) + f(4) + f(5)
= 1/36 + 2/36 + 3/36 + 4/36
= 10/36
(iii) The sum of dots is greater than 10
P(The sum of dots is greater than 10) = P(X > 10)
= P(X = 11 or X = 12)
= P(X = 11) + P(X = 12)
= f(11) + f(12)
= 2/36 + 1/36
= 3/36
(iv) The sum of dots is at least 9
P(The sum of dots is at least 9) = P(X ≥ 9)
= P(X = 9 or X = 10 or X = 11 or X = 12)
= P(X = 9) + P(X = 10) + P(X = 11) + P(X = 12)
= f(9) + f(10) + f(11) + f(12)
= 4/36 + 3/36 + 2/36 + 1/36
= 10/36
EXAMPLE 7.06
Given that: $f(x) = \frac{x^4}{98}$, x = 0, 1, 2, 3
Find (i) $P(1 \le X \le 3)$ (ii) $P(X < 2)$ (iii) $P(X = 3)$
Solution Given that $f(x) = \frac{x^4}{98}$, we have:
$f(0) = \frac{0^4}{98} = \frac{0}{98}$, $f(1) = \frac{1^4}{98} = \frac{1}{98}$, $f(2) = \frac{2^4}{98} = \frac{16}{98}$, $f(3) = \frac{3^4}{98} = \frac{81}{98}$
(i) $P(1 \le X \le 3)$
$P(1 \le X \le 3) = P(X = 1 \text{ or } X = 2 \text{ or } X = 3)$
= P(X = 1) + P(X = 2) + P(X = 3)
= f(1) + f(2) + f(3)
= 1/98 + 16/98 + 81/98 = 1
(ii) $P(X < 2)$
$P(X < 2) = P(X = 0 \text{ or } X = 1)$
= P(X = 0) + P(X = 1)
= f(0) + f(1)
= 0/98 + 1/98 = 1/98
(iii) $P(X = 3)$
P(X = 3) = f(3) = 81/98
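Every probability in this example is a sum of f(x) over the values of x that satisfy a condition. A small sketch of that idea follows; the helper name `prob` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

# p.m.f of Example 7.06: f(x) = x**4 / 98 for x = 0, 1, 2, 3
pmf = {x: Fraction(x**4, 98) for x in range(4)}

def prob(pmf, event):
    """Add f(x) over every value x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

print(prob(pmf, lambda x: 1 <= x <= 3))  # 1
print(prob(pmf, lambda x: x < 2))        # 1/98
print(prob(pmf, lambda x: x == 3))       # 81/98
```

Writing the condition as a function keeps the distinction between "<" and "≤" explicit, which matters for a discrete random variable.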
EXAMPLE 7.07
What value of “k” makes the following function a probability mass function?
$f(x) = kx^4$, x = 0, 1, 2, 3
Solution To find the value of “k” we use:
$\sum_{x=0}^{3} f(x) = 1$
$\Rightarrow \sum_{x=0}^{3} kx^4 = 1$
$\Rightarrow k\left[(0)^4 + (1)^4 + (2)^4 + (3)^4\right] = 1$
$\Rightarrow k\left[0 + 1 + 16 + 81\right] = 1$
$\Rightarrow k(98) = 1$
$\Rightarrow k = \frac{1}{98}$
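Both ways of finding an unknown constant rest on the rule that the probabilities must add up to 1. The sketch below applies that rule to f(x) = kx⁴ and to a table with one missing entry (the values 1/8, 3/8, 1/8 are those of Example 7.08); the variable names are chosen for this illustration.

```python
from fractions import Fraction

# f(x) = k * x**4 for x = 0, 1, 2, 3: k is 1 divided by the sum of x**4
weights = [x**4 for x in range(4)]   # [0, 1, 16, 81]
k = Fraction(1, sum(weights))
print(k)                             # 1/98

# A table with one unknown probability K: K is 1 minus the known probabilities
known = [Fraction(1, 8), Fraction(3, 8), Fraction(1, 8)]
K = 1 - sum(known)
print(K)                             # 3/8
```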
EXAMPLE 7.08
Find “K” for the probability distribution given below:
x 0 1 2 3
f(x) 1/8 K 3/8 1/8
Solution To find the value of “K” we use:
$\sum_{x=0}^{3} f(x) = 1$
$\Rightarrow 1/8 + K + 3/8 + 1/8 = 1$
$\Rightarrow K + 5/8 = 1$
$\Rightarrow K = 1 - 5/8$
$\Rightarrow K = 3/8$
Test Yourself
1) Find the probability distribution of the sum of dots when two dice are rolled. Also find the probability that the sum of dots is exactly 6.
2) Given that: $f(x) = \frac{x^4}{98}$, x = 0, 1, 2, 3
Find (i) P(1  X  3) (ii) P( X  1) (iii) P( X  0)
3) What value of “k” makes the following function a probability mass function?
$f(x) = k\,{}^4C_x$, x = 0, 1, 2, 3, 4
4) Find the value of “k” from the following probability distribution:
x -2 -1 0 1 2 3
f(x) 0.1 0.1 0.2 2k 0.3 k
Continuous probability distribution
“Since a continuous random variable takes all possible values in a given range, we cannot obtain the probability of a continuous random variable at a particular point, nor can we express its probability distribution in tabular form. Hence the continuous probability distribution can only be expressed in the form of a mathematical equation, which is known as the probability function or probability density function”
Let “X” be a continuous random variable which can take values in the interval (a, b) or $(-\infty, \infty)$; then the function f(x) is called the probability function or probability density function (p.d.f) of the random variable “X”.
The continuous probability density function may be defined as:
$$f(x) = \begin{cases} \text{Function of } x ; & a \le x \le b \\ 0 ; & \text{otherwise} \end{cases}$$
A p.d.f has the following two properties:
(i) $f(x) \ge 0$ for all “x”
(ii) $P(a \le X \le b) = \left[\frac{f(a) + f(b)}{2}\right](b - a) = 1$
Note: The formula in (ii) is the area of a trapezium, so it gives the exact area under the curve only when f(x) is a straight line between a and b, as in the examples that follow.
Graph of Continuous probability distribution
The continuous probability distribution is usually displayed by a continuous probability curve. In this type of graph we take the range of X on the X-axis and the values of f(x) on the Y-axis.
[Figure: continuous probability curve]
How to find the probabilities using a continuous probability distribution OR a continuous probability density function
Let “X” be a continuous random variable; then the continuous probability density function f(x) is given as:
$$f(x) = \begin{cases} \text{Function of } x ; & a \le x \le b \\ 0 ; & \text{otherwise} \end{cases}$$
Now to find the probabilities we have:
 $P(X = x_1) = 0$ (because there is no area over a single point)
 $P(x_1 \le X \le x_2) = \left[\frac{f(x_1) + f(x_2)}{2}\right](x_2 - x_1)$
[Figure: density curve over a ≤ x ≤ b with the area between x1 and x2 marked]
Note: In the discrete case the probabilities $P(x_1 \le X \le x_3)$ and $P(x_1 < X < x_3)$ have different meanings, but in the continuous case they are the same.
EXAMPLE 7.09
Given that:
$f(x) = \frac{x + 1}{8}$, $2 \le x \le 4$
(i) Show that the area under the curve is equal to unity
(ii) Find $P(X \le 3)$ (iii) Find $P(3 \le X \le 4)$ (iv) Find $P(X = 3)$
Solution
Given that $f(x) = \frac{x + 1}{8}$, we have:
$f(2) = \frac{2 + 1}{8} = \frac{3}{8}$, $f(3) = \frac{3 + 1}{8} = \frac{4}{8}$, $f(4) = \frac{4 + 1}{8} = \frac{5}{8}$
(i) Show that the area under the curve is equal to unity.
Here we use:
$P(a \le X \le b) = \left[\frac{f(a) + f(b)}{2}\right](b - a)$
$P(2 \le X \le 4) = \left[\frac{f(2) + f(4)}{2}\right](4 - 2)$
$= \left[\frac{f(2) + f(4)}{2}\right](2)$
$= f(2) + f(4)$
$= 3/8 + 5/8$
$= 1$
Hence the area under the curve is equal to unity.
(ii) $P(X \le 3)$
$P(X \le 3) = P(2 \le X \le 3)$
$= \left[\frac{f(2) + f(3)}{2}\right](3 - 2)$
$= \left[\frac{f(2) + f(3)}{2}\right](1)$
$= \frac{3/8 + 4/8}{2} = 7/16$
(iii) $P(3 \le X \le 4)$
$P(3 \le X \le 4) = \left[\frac{f(3) + f(4)}{2}\right](4 - 3)$
$= \left[\frac{4/8 + 5/8}{2}\right](1)$
$= 9/16$
(iv) $P(X = 3)$
P(X = 3) = 0 (for a continuous random variable the probability at a single point is zero)
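The area formula used in this example is easy to evaluate by machine. The following is a minimal sketch that assumes, as the example does, a density that is a straight line over the interval; the helper name `trapezium_area` is chosen here for illustration.

```python
from fractions import Fraction

def f(x):
    """Density of Example 7.09: f(x) = (x + 1)/8 on 2 <= x <= 4 (x given as an integer)."""
    return Fraction(x + 1, 8)

def trapezium_area(f, lo, hi):
    """[(f(lo) + f(hi)) / 2] * (hi - lo); exact only when f is a straight line."""
    return (f(lo) + f(hi)) / 2 * (hi - lo)

print(trapezium_area(f, 2, 4))  # 1     total area under the curve
print(trapezium_area(f, 2, 3))  # 7/16  P(X <= 3)
print(trapezium_area(f, 3, 4))  # 9/16  P(3 <= X <= 4)
print(trapezium_area(f, 3, 3))  # 0     no area over a single point
```

The two partial areas add up to the total area of 1, which is a quick check on the arithmetic.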
EXAMPLE 7.10
Given that $f(x) = kx$, $0 \le x \le 2$
(i) Find the value of “k” (ii) Find $P(0.5 \le X \le 1.5)$ (iii) Find $P(X \ge 1)$
Solution
(i) Given that $f(x) = kx$, we have $f(0) = k(0) = 0$ and $f(2) = k(2) = 2k$. Here we use:
$P(a \le X \le b) = 1$
$\Rightarrow \left[\frac{f(a) + f(b)}{2}\right](b - a) = 1$
$\Rightarrow \left[\frac{f(0) + f(2)}{2}\right](2 - 0) = 1$
$\Rightarrow \left[\frac{f(0) + f(2)}{2}\right](2) = 1$
$\Rightarrow f(0) + f(2) = 1$
$\Rightarrow 0 + 2k = 1$
$\Rightarrow k = 1/2$
Hence the p.d.f can be written as: $f(x) = kx \Rightarrow f(x) = \frac{x}{2}$, $0 \le x \le 2$
(ii) $P(0.5 \le X \le 1.5)$
Since $f(x) = \frac{x}{2}$, we have $f(0.5) = \frac{0.5}{2} = 0.25$ and $f(1.5) = \frac{1.5}{2} = 0.75$
$P(0.5 \le X \le 1.5) = \left[\frac{f(0.5) + f(1.5)}{2}\right](1.5 - 0.5)$
$= \left[\frac{f(0.5) + f(1.5)}{2}\right](1)$
$= \frac{0.25 + 0.75}{2} = 0.5$
(iii) $P(X \ge 1)$
Since $f(x) = \frac{x}{2}$, we have $f(1) = \frac{1}{2} = 0.5$ and $f(2) = \frac{2}{2} = 1$
$P(X \ge 1) = P(1 \le X \le 2)$
$= \left[\frac{f(1) + f(2)}{2}\right](2 - 1)$
$= \frac{0.5 + 1}{2} = 0.75$
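The value of “k” can also be found by computing the area with k = 1 and taking its reciprocal, since the area is proportional to k. The sketch below follows that route for f(x) = kx on 0 ≤ x ≤ 2; the names `unscaled` and `trapezium_area` are assumptions of this illustration, and the area formula is exact here only because the density is a straight line.

```python
from fractions import Fraction

def trapezium_area(f, lo, hi):
    """[(f(lo) + f(hi)) / 2] * (hi - lo); exact only when f is a straight line."""
    return (f(lo) + f(hi)) / 2 * (hi - lo)

def unscaled(x):
    """The density with k = 1, i.e. f(x) = x."""
    return Fraction(x)

# The area with k = 1 is 2, so k must be 1/2 to make the total area 1
k = 1 / trapezium_area(unscaled, 0, 2)

def f(x):
    return k * Fraction(x)

print(k)                                                   # 1/2
print(trapezium_area(f, Fraction(1, 2), Fraction(3, 2)))   # 1/2  P(0.5 <= X <= 1.5)
print(trapezium_area(f, 1, 2))                             # 3/4  P(X >= 1)
```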
Test Yourself
1) If $f(x) = \frac{1}{30}(5 + 2x)$, $1 \le x \le 4$
(i) Show that f(x) is a density function
(ii) Find P( X  3) (iii) Find P( X  3)
2) Given that $f(x) = kx$, $0 \le x \le 2$
(i) Find the value of “k” (ii) Find $P(1 \le X \le 1.5)$
(iii) Find P( X  0.5)
The function that assigns probability for a discrete random variable is called a probability mass function, because it shows how much probability (or mass) is given to each value of the random variable. Mass is thought of as weight; in this case the total mass (or weight) for a probability distribution equals one. The function of a continuous random variable doesn’t actually assign probability or mass; it assigns density, which means it tells you how dense the probability is around x for any value x of X. You find probabilities for intervals of X, not for particular values of X, when X is continuous. Continuous random variables have no probability at any single point because there is no area over a single point.
Mathematical Expectation OR Expected Value of a Discrete Random Variable
A very important concept in probability is the idea of expected values. The expected value is the long-term mean or average value of a random variable. If the random variable is observed over a long period of time, we would expect the average value of the observations generated by the random process to be close to the expected value. The larger the number of observations, the closer the average value of the observations tends to be to the expected value. Thus we define expected value as “The theoretical average of a random variable is called expected value.”
Note: In 1657, Christiaan Huygens published the first book on probability theory. In that text, he introduced the idea of expected value.
Let “X” be a discrete random variable which can take the values x1, x2, …, xn with associated probabilities f(x1), f(x2), …, f(xn) respectively; then the expectation of “X” (denoted by $E(X)$, $\mu_x$ or $\mu$) is defined as:
$\mu_x = E(X) = \sum_{\text{all } x} x f(x)$
OR $\mu_x = E(X) = \sum_{\text{all } x} x P(x)$
The expected value of X doesn’t have to be equal to a possible value of X because it represents a long-term average value. It does, however, have to lie between the smallest and largest possible values of X, which is something to check after you have calculated E(X). Also, note that E(X) is not a probability, so it is guaranteed to fall between zero and one only if all the possible values of X are between zero and one.
A Practical Example!!!
If we toss three coins and let “X” represent the number of heads, so that x = 0, 1, 2, 3, then the sample space and the probability distribution of the number of heads are:
$S = \{HHH, HHT, HTH, THH, TTH, THT, HTT, TTT\}$
x 0 1 2 3 Total
f(x) 1/8 3/8 3/8 1/8 1
Thus $E(X) = \sum x f(x) = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 12/8 = 1.5$
If we toss three coins 50 times and record the number of heads each time, we may get the following table:
0 2 3 1 1 2 2 2 0 3
2 1 1 1 2 3 1 1 1 2
0 1 1 3 0 1 2 1 3 2
1 1 1 2 1 1 0 1 1 2
2 1 2 1 2 1 3 2 1 3
Now the mean of this data is:
Mean = (0 + 2 + 3 + … + 1 + 3)/50 = 74/50 = 1.48
Now if we toss three coins 100 times and record the number of heads each time, we may get the following table:
0 3 0 1 0 2 2 2 0 3
2 2 1 2 2 3 2 1 1 2
0 1 1 3 0 2 2 1 3 2
1 2 0 2 1 2 0 1 1 2
1 2 2 3 1 1 1 2 1 3
2 1 1 2 0 2 3 1 1 2
1 2 2 1 3 1 1 2 1 2
1 2 1 2 1 0 2 3 2 1
2 1 2 1 2 1 2 1 2 1
1 0 1 1 2 1 3 2 1 2
Now the mean of this data is:
Mean = (0 + 3 + 0 + 1 + … + 1 + 2)/100 = 150/100 = 1.50
It is clear that this mean is even closer to the expected value 1.5 than the mean of the 50 tosses.
Hence, “as the number of repetitions of the experiment increases, we expect the actual mean to get closer to the expected (theoretical) mean”
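The same experiment can be imitated on a computer. The sketch below uses Python's `random` module with a fixed seed (2013, chosen arbitrarily) so that the run is repeatable; the printed means depend on that seed and are therefore not quoted here, but for large numbers of repetitions they should settle near 1.5.

```python
import random
from statistics import mean

def heads_in_three_tosses(rng):
    """Toss three fair coins once and count the heads."""
    return sum(rng.random() < 0.5 for _ in range(3))

rng = random.Random(2013)   # fixed seed: an arbitrary choice for repeatability
sample_means = {}
for n in (50, 100, 10_000, 100_000):
    observations = [heads_in_three_tosses(rng) for _ in range(n)]
    sample_means[n] = mean(observations)
    print(n, round(sample_means[n], 3))
```

With only 50 or 100 repetitions the mean can still be noticeably away from 1.5; the agreement becomes reliable only as the number of repetitions grows.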
Variance and Standard Deviation of a Discrete Random Variable
Let “X” be a discrete random variable which can take the values x1, x2, …, xn with associated probabilities f(x1), f(x2), …, f(xn) respectively; then the variance and S.D of “X” are defined as:
$Var(X) = \sigma_x^2 = E(X^2) - [E(X)]^2$
$S.D(X) = \sigma_x = \sqrt{E(X^2) - [E(X)]^2}$ (the symbol $\sigma$ is read as “sigma”)
Here $E(X^2) = \sum_{\text{all } x} x^2 f(x)$
EXAMPLE 7.11
A random variable X has a probability distribution:
x 0 1 2
f(x) 1/4 2/4 1/4
Find the Expected value, Variance and S.D of the random variable X.
Solution
x f(x) xf(x) x²f(x)
0 1/4 0 0
1 2/4 2/4 2/4
2 1/4 2/4 4/4
-- 1 4/4 6/4
$E(X) = \sum x f(x) = 4/4 = 1.0$
And $E(X^2) = \sum x^2 f(x) = 6/4 = 1.5$
Therefore $Var(X) = E(X^2) - [E(X)]^2$
$= 1.5 - (1.0)^2$
$= 1.5 - 1 = 0.5$
$S.D(X) = \sqrt{E(X^2) - [E(X)]^2}$
$= \sqrt{1.5 - (1.0)^2}$
$= \sqrt{1.5 - 1} = \sqrt{0.5} = 0.7$
Note: When computing E(X), round it off to one more decimal place than the values of the random variable x. This round-off rule is also used for the variance and S.D of a probability distribution.
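The column totals of the table above translate directly into code. This is a minimal sketch; the function name `summarise` is chosen for this illustration.

```python
from fractions import Fraction
from math import sqrt

def summarise(pmf):
    """Return E(X), Var(X) and S.D(X) for a table of values and probabilities."""
    e_x = sum(x * p for x, p in pmf.items())        # total of the x f(x) column
    e_x2 = sum(x**2 * p for x, p in pmf.items())    # total of the x^2 f(x) column
    var = e_x2 - e_x**2
    return e_x, var, sqrt(var)

pmf = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}
e_x, var, sd = summarise(pmf)
print(e_x)           # 1
print(var)           # 1/2
print(round(sd, 1))  # 0.7
```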
EXAMPLE 7.12
Find “K” for the probability distribution given below:
x 0 1 2 3
f(x) 1/8 K 3/8 1/8
Also find the value of the Mean and Variance of the random variable X.
Solution
To find the value of “K” we use:
$\sum_{x=0}^{3} f(x) = 1$
$\Rightarrow 1/8 + K + 3/8 + 1/8 = 1$
$\Rightarrow K + 5/8 = 1$
$\Rightarrow K = 1 - 5/8$
$\Rightarrow K = 3/8$
x f(x) xf(x) x²f(x)
0 1/8 0 0
1 K = 3/8 3/8 3/8
2 3/8 6/8 12/8
3 1/8 3/8 9/8
-- 1 12/8 24/8
$Mean = E(X) = \sum x f(x) = 12/8 = 1.5$
And $E(X^2) = \sum x^2 f(x) = 24/8 = 3$
Therefore $Var(X) = E(X^2) - [E(X)]^2$
$= 3 - (1.5)^2$
$= 3 - 2.25$
$= 0.75$
Test Yourself
1) Find E(X), Var(X) and S.D(X) from the following Probability Distribution:
x 0 1 2 3
f(x) 1/4 1/6 2/6 1/4
2) Find the value of “K”, E(X), Var(X) and S.D(X) from the following Probability Distribution:
x 1 2 3 4
f(x) 2/12 K 4/12 3/12
Properties of Expectation
 $E(a) = a$
 $E(aX) = aE(X)$
 $E(X \pm a) = E(X) \pm a$
 $E(aX \pm b) = aE(X) \pm b$
 $E(X \pm Y) = E(X) \pm E(Y)$
 $E(XY) = E(X)E(Y)$ (If X and Y are independent)
EXAMPLE 7.13
Given the following Probability Distribution:
x 1 2 3 4
f(x) 1/8 1/4 1/2 1/8
Find (1) $E(X)$ (2) $E(X + 10)$ (3) $E(2 - X)$
(4) $E(3X)$ (5) $E(4X + 100)$ (6) $E(20 - 5X)$
Solution
x f(x) xf(x)
1 1/8 1/8
2 1/4 2/4
3 1/2 3/2
4 1/8 4/8
-- 1 21/8
1) $E(X) = \sum x f(x) = 21/8 = 2.6$
2) $E(X + 10) = E(X) + 10 = 2.6 + 10 = 12.6$
3) $E(2 - X) = 2 - E(X) = 2 - 2.6 = -0.6$
4) $E(3X) = 3E(X) = 3(2.6) = 7.8$
5) $E(4X + 100) = 4E(X) + 100 = 4(2.6) + 100 = 110.4$
6) $E(20 - 5X) = 20 - 5E(X) = 20 - 5(2.6) = 7.0$
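The rule E(aX + b) = aE(X) + b can be checked against a direct calculation of the expectation of the transformed values. The sketch below uses exact fractions, so E(X) is 21/8 = 2.625 rather than the rounded 2.6 used in the worked solution; the answers therefore differ slightly in the last digit (for example 110.5 instead of 110.4). The helper name `expect` is chosen for this illustration.

```python
from fractions import Fraction

pmf = {1: Fraction(1, 8), 2: Fraction(1, 4), 3: Fraction(1, 2), 4: Fraction(1, 8)}

def expect(pmf, g=lambda x: x):
    """E[g(X)]: add g(x) * f(x) over all values x."""
    return sum(g(x) * p for x, p in pmf.items())

e_x = expect(pmf)
print(e_x)                                   # 21/8

# Direct calculation of E(4X + 100) versus the rule 4E(X) + 100
direct = expect(pmf, lambda x: 4 * x + 100)
via_rule = 4 * e_x + 100
print(direct, via_rule, float(direct))       # 221/2 221/2 110.5

# E(2 - X) is negative here because E(X) is larger than 2
print(float(expect(pmf, lambda x: 2 - x)))   # -0.625
```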
Test Yourself
x 4 5 6 7
f(x) 1/8 1/4 1/2 1/8
Find (1) E ( X ) (2) E ( X  9) (3) E (12  X )

(4) E (8 X ) (5) E (3 X  70) (6) E (17  3 X )
Properties of Variance and Standard Deviation
 $Var(c) = 0 \Rightarrow S.D(c) = 0$
 $Var(X \pm c) = Var(X) \Rightarrow S.D(X \pm c) = S.D(X)$
 $Var(cX) = c^2\,Var(X) \Rightarrow S.D(cX) = |c|\,S.D(X)$
 $Var\left(\frac{X}{c}\right) = \frac{1}{c^2}\,Var(X) \Rightarrow S.D\left(\frac{X}{c}\right) = \frac{1}{|c|}\,S.D(X)$
 If X and Y are independent then $Var(X \pm Y) = Var(X) + Var(Y)$, and therefore $S.D(X \pm Y) = \sqrt{Var(X) + Var(Y)}$
EXAMPLE 7.14
Given the following Probability Distribution:
x -50 -100 150
f(x) 1/5 3/10 1/2
Find (1) $E(X)$ (2) $E(X^2)$ (3) $Var(X)$
(4) $S.D(X)$ (5) $Var(X + 3)$ (6) $S.D(2 - 3X)$
(7) $S.D(3X)$ (8) $Var\left(\frac{X}{5}\right)$ (9) $S.D\left(\frac{X}{5}\right)$
Solution
x f(x) xf(x) x²f(x)
-50 1/5 -10 500
-100 3/10 -30 3000
150 1/2 75 11250
-- 1 35 14750
1) $E(X) = \sum x f(x) = 35.0$
2) $E(X^2) = \sum x^2 f(x) = 14750$
3) $Var(X) = E(X^2) - [E(X)]^2 = 14750 - (35)^2 = 13525.0$
4) $S.D(X) = \sqrt{E(X^2) - [E(X)]^2} = \sqrt{14750 - (35)^2} = \sqrt{13525} = 116.3$
5) $Var(X + 3) = Var(X) = 13525.0$
6) $S.D(2 - 3X) = 3\,S.D(X) = 3(116.3) = 348.9$
7) $S.D(3X) = 3\,S.D(X) = 3(116.3) = 348.9$
8) $Var\left(\frac{X}{5}\right) = \left(\frac{1}{25}\right)Var(X) = \left(\frac{1}{25}\right)(13525) = 541.0$
9) $S.D\left(\frac{X}{5}\right) = \left(\frac{1}{5}\right)S.D(X) = \left(\frac{1}{5}\right)(116.3) = 23.3$
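The scaling rules can be verified by transforming the values of X and recomputing the variance from scratch. The sketch below takes the third value of X as 150, which is the value consistent with the xf(x) and x²f(x) columns (75 and 11250); the helper names `variance` and `transform` are chosen for this illustration.

```python
from fractions import Fraction
from math import isclose, sqrt

pmf = {-50: Fraction(1, 5), -100: Fraction(3, 10), 150: Fraction(1, 2)}

def variance(pmf):
    """Var(X) = E(X^2) - [E(X)]^2."""
    e_x = sum(x * p for x, p in pmf.items())
    return sum(x**2 * p for x, p in pmf.items()) - e_x**2

def transform(pmf, g):
    """Distribution of g(X): same probabilities attached to the new values."""
    out = {}
    for x, p in pmf.items():
        out[g(x)] = out.get(g(x), 0) + p
    return out

var_x = variance(pmf)
sd_x = sqrt(var_x)
print(var_x, round(sd_x, 1))                                         # 13525 116.3

# Var(X/5) = Var(X)/25, and S.D(X/5) = S.D(X)/5 (not Var(X)/5)
var_x5 = variance(transform(pmf, lambda x: Fraction(x, 5)))
print(var_x5, round(sqrt(var_x5), 1))                                # 541 23.3

# S.D(2 - 3X) = |-3| S.D(X): the sign and the added constant do not matter
sd_lin = sqrt(variance(transform(pmf, lambda x: 2 - 3 * x)))
print(round(sd_lin, 1), isclose(sd_lin, 3 * sd_x))                   # 348.9 True
```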
Test Yourself
x -40 -900 1400
f(x) 1/5 3/10 1/2
Find (1) E ( X ) (2) E ( X 2 )

(3) Var ( X ) (4) S.D( X ) (5) Var ( X  4)
 2X   3X 
(6) S.D(5  2 X ) (7) S.D(6 X ) (8) Var   (9) S .D  
 7   8 
Important!!!
Math Symbol    Phrases
>    “greater than” or “more than” or “exceed” or “better than” or “taller than” or “above”
<    “less than” or “smaller than” or “below” or “under” or “fewer than”
≥    “at least” or “greater than or equal to” or “no less than”
≤    “at most” or “less than or equal to” or “no more than”
Amazing Histogram!!!
If two dice are rolled and “X” represents the sum of dots then:
S = { 2 3 4 5 6 7
      3 4 5 6 7 8
      4 5 6 7 8 9
      5 6 7 8 9 10
      6 7 8 9 10 11
      7 8 9 10 11 12 }
x 2 3 4 5 6 7 8 9 10 11 12 Total
f(x) 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36 1
[Figure: probability histogram of f(x) against X = 2, 3, …, 12; the bars rise from 1/36 at X = 2 to a peak of 6/36 at X = 7 and fall symmetrically to 1/36 at X = 12]
Sharpen your Pencil
MCQ’s
(1) Expected value of a random variable is called _____
(A) Mean (B) Median (C) Mode (D) None of these
(2) If “X” and “Y” are independent random variables then E(XY) = _____
(A) XY (B) E(X)+E(Y) (C) E(X)E(Y) (D) None of these
(3) If “X” is a random variable and “a” and “b” are constants, then E(aX+b) = _____
(A) a²E(X)+b (B) aE(X)+b (C) aE(X) (D) None of these
(4) If “X” is a random variable then E(2X+3) = _____
(A) 2²E(X)+3 (B) 2E(X)+3 (C) 2E(X) (D) None of these
(5) If “a” is a constant then E(aX) = _____
(A) a (B) aE(X) (C) E(X) (D) None of these
(6) If “X” and “Y” are two random variables then E(X+Y) = _____
(A) X+Y (B) E(X)+E(Y) (C) E(X)-E(Y) (D) None of these
(7) A random variable is also called a _____ variable.
(A) Chance (B) Qualitative (C) Discrete (D) None of these
(8) For a discrete random variable “X”, E(X) = _____
(A) $\sum x P(x)$ (B) $\sum x^2 P(x)$ (C) $\sum P(x)$ (D) None of these
(9) The probability for a continuous random variable at a particular point is _____
(A) 0 (B) 1 (C) -1 (D) None of these
(10) If X is a random variable with E(X) = 7, then E(2X-3) = _____
Sharpen your Pencil

MCQ’s
(11) If “a” is a constant then E(a) = _____
(A) 0 (B) a (C) 1 (D) None of these
(12) If Var(X) = 10, Var(Y) = 15 and if X and Y are independent then, Var(X-Y) = _____
(A) -5 (B) 5 (C) 150 (D) None of these
(13) The sum of probabilities in a probability distribution is _____
(A) Negative (B) 0 (C) 1 (D) None of these
(14) If f(x) is a p.d.f of a random variable X then f(x) _____
(A) f ( x)  (B) f ( x)  0 (C) f ( x)  0 (D) None of these

(15) Var(-5X) = _____ when Var(X) = 25
(A) 625 (B) -625 (C) 25 (D) None of these
(16) If X and Y are independent then, Var(X-Y) = _____
(A) Var(X) + Var(Y) (B) 0 (C) Var(X) - Var(Y) (D) None of these
289
Short Questions
ExeRciSe
Q.7.01. If E(X) = 3 then find E(2X-1) and E(X+1).
Q.7.02. Find “K” for the probability distribution given below and find E(X):
x 0 1 2 3 4
f(x) 12/210 80/210 K 24/210 1/210
Q.7.03. Given that P(x) = 0.2 for x = 2, 3, 4, 5, 6. Calculate E(X)
ai
Q.7.04. What value of “k” makes the following function a mass function?
gm
f(x) = kx6 , x = 0, 1, 2, 3
Q.7.05. What value of “k” makes the following function a mass function?
s@
f ( x)  k 7Cx , x = 0, 1, 2, 3, 4
t
Q.7.06. Find the value of “k” from the following probability distribution:
x -2 -1 0 1 2 3
f(x) 0.1 0.1 0.2 2k 0.3 k
Also find the probability distribution of X by substituting the calculated value of “k”.
Q.7.07. What value of “k” makes the following function a density function?
f ( x)  k (4  x) , 1 x  3
Q.7.08. What value of “A” makes the following function a density function?
f ( x)  A(2 x  3) , 1 x  2
Q.7.09. Define random variable. Write down the properties of probability distribution.
290
Long Questions
ExeRciSe
Q.7.01. Consider the following probability distribution:
x 0 1 2 3
P(x) 0.1 0.4 0.3 0.2
om
(i) Calculate Mean and Variance
(ii) Calculate E(3X-1)
l.c
(iii) Calculate Variance of (3X-1)
Q.7.02. Find E(X), $E(X^2)$ and V(X) from the following table:
x 0 1 2 3
f(x) 1/4 1/6 2/6 1/4
1
s@
Q.7.03. If f ( x)  (5  2 x) , 1 x  4
30
(i) Show that f(x) is a density function

t
(ii) Find P( X  3) (iii) Find P( X  3)

ta
es
Q.7.04. A random variable X that can assume values between x = 2 and x = 5 has a density function given by: $f(x) = \frac{2(1 + x)}{27}$
(i) Find P( X  4) (ii) Find P(2  X  3)
Q.7.05. Given that f ( x)  0.25x , 1 x  3
(i) Show that P(1  X  3)  1

(ii) Find P( X  2) and P(2  X  3)
Q.7.06. Given that f ( x)  k ( x  1) , 2 x5
(i) Find “k” (ii) Find P(2  X  3)

(iii) Find P( X  4) (iv) Find P( X  4)
Q.7.07. Find the probability distribution of the No. of tails when two coins are tossed.
291
Long Questions
ExeRciSe
Q.7.08. Find E(X), $E(X^2)$, V(X) and S.D(X) from the following table:
x -2 3 1
f(x) 1/3 1/2 1/6
om
Q.7.09 Find Mean and S.D if f(-1) = 3/8, f(0) = 2/8 and f(1) = 3/8
l.c
Q.7.10. Check whether the following is a density function:
5  2x
f ( x)  , 0 x4
ai
30
gm
(i) Find P( X  2) (ii) Find P(2  X  3)

s@
Q.7.11. A random variable X has a probability distribution:
x 0 1 2
f(x) 1/4 2/4 1/4
t
ta
Find the value of Mean and Variance of the random variable X.

es
1
Q.7.12. Given that f ( x)  (5  2 x) , 0 x2
6
ze
(i) Find P( X  1) (ii) Find P(0.25  X  1.25)

(iii) Find P( X  0.75)
2(5  x)
Q.7.13. Given that: f ( x)  0 x5
25
(i) Find P(0  X  5) (ii) Find P( X  3) (iii) Find P( X  2)
Q.7.14. From the following table find E(X), $E(X^2)$ and E(X+3)
x 1 2 3 4
f(x) 2/12 3/12 4/12 3/12
292
CHAPTER 08
Some Special
Probability Distributions
Chapter Contents
You should read this chapter if you need to learn about:
• Bernoulli Trials: (P294)
• Bernoulli Distribution: (P294)
• Mean, Variance and S.D of Bernoulli Distribution: (P295)
• Binomial Experiment: (P295)
• Binomial Distribution: (P296)
• Mean, Variance and S.D of Binomial Distribution: (P296-P299)
• Properties of Binomial Distribution: (P300-P302)
• Pascal’s Triangle: (P303-P304)
• Hypergeometric Experiment: (P305)
• Hypergeometric Distribution: (P306)
• Mean, Variance and S.D of Hypergeometric Distribution: (P306)
• Properties of Hypergeometric Distribution: (P307)
• Discrete Uniform Distribution: (P309)
• Mean, Variance and S.D of Discrete Uniform Distribution: (P309)
• Continuous Uniform Distribution: (P310)
• Mean, Variance and S.D of Continuous Uniform Distribution: (P310)
293
Chapter 08 Some Special Probability Distributions
Bernoulli Trial
• A trial is said to be a Bernoulli trial if it can result only in a success or a failure.
Here is a simple example of a Bernoulli trial. From a standard deck of cards, you pick a card and note whether it is a club or not. So the outcome of the trial can be classified into two categories: selecting a club (success) and selecting another suit (failure).
The probabilities of success and failure are denoted by “p” and “q” respectively. If the random variable x represents the number of clubs selected, then the possible values of the random variable are 0 and 1.
The random variable “X” representing the number of successes in a Bernoulli trial is called a Bernoulli random variable.
Bernoulli distribution
The probability distribution of the Bernoulli variable “X” is called the Bernoulli distribution.
The probability mass function of the Bernoulli distribution is given below:
$$P(X = x) = f(x) = \begin{cases} p^x q^{1-x} & ;\ x = 0, 1 \\ 0 & ;\ \text{otherwise} \end{cases}$$
Where
• x = number of successes
• p = probability of success
• q = probability of failure
The Bernoulli random variable is named in honor of the mathematician Jacob Bernoulli (1654-1705).
Note: “p” is the parameter of the Bernoulli distribution.
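As a quick numerical check of the mass function above, here is a minimal sketch using the club example, where p = 13/52 = 0.25. The function name `bernoulli_pmf` is an illustrative choice, not something defined in the text. Computing the mean and variance directly from the two probabilities should give p and pq.

```python
def bernoulli_pmf(x, p):
    """P(X = x) = p**x * q**(1 - x) for x in {0, 1}, and 0 otherwise."""
    if x not in (0, 1):
        return 0.0
    q = 1 - p
    return p**x * q**(1 - x)

p = 0.25  # probability of drawing a club from a standard deck (13/52)
mean = sum(x * bernoulli_pmf(x, p) for x in (0, 1))
variance = sum((x - mean) ** 2 * bernoulli_pmf(x, p) for x in (0, 1))

print(bernoulli_pmf(1, p), bernoulli_pmf(0, p))  # p and q
print(mean, variance)                            # should match p and p*q
```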
294
Mean, Variance and S.D of Bernoulli distribution
Measure | Formula
Mean | $\mu = p$
Variance | $\sigma^2 = pq$
Standard Deviation | $\sigma = \sqrt{pq}$

Binomial experiment
An experiment that has the following properties is called a binomial experiment:
• Every trial results in a success or a failure.
• The successive trials are independent.
• The probability of success remains constant from trial to trial.
• The number of trials is fixed in advance.
The prefix “bi” means “two”. This should help you remember that binomial experiments deal with situations in which there are only two outcomes, i.e. success and failure.
Here is a simple example of a binomial experiment. From a standard deck of cards, you draw 5 cards in succession, each time noting whether the card is a club or not and then replacing it. So the outcome of each trial can be classified into two categories: selecting a club (success) and selecting another suit (failure).
The probabilities of success and failure are denoted by “p” and “q” respectively. The probability of success remains the same because each card drawn is replaced before the next draw. If the random variable x represents the number of clubs selected, then the possible values of the random variable are 0, 1, 2, 3, 4, and 5. Note that x is a discrete random variable because its possible values can be listed.
The random variable “X” representing the number of successes in a binomial experiment is called a binomial random variable.
295
Binomial Distribution
The probability distribution of the binomial variable “X” is called the binomial distribution.
The probability mass function of the binomial distribution is given below:
$$P(X = x) = f(x) = \begin{cases} \binom{n}{x} p^x q^{n-x} & ;\ x = 0, 1, 2, \ldots, n \\ 0 & ;\ \text{otherwise} \end{cases}$$
Where
• x = number of successes
• p = probability of success
• q = probability of failure
• n = number of trials, fixed in advance
Note: “n” and “p” are the parameters of the binomial distribution.
The binomial distribution is a very important discrete probability distribution. It was discovered by James Bernoulli about the year 1700.
The terms “success” and “failure” in the binomial setting do not necessarily mean that a success is good and a failure is bad. Success means that you get the outcome you want to count, and failure means you get the outcome you don’t want to count. For example, if you select ten 18-year-old male drivers, then a success may be an 18-year-old driver who was involved in an accident.

Mean, Variance and S.D of Binomial Distribution
Measure | Formula
Mean | $\mu = np$
Variance | $\sigma^2 = npq$
Standard Deviation | $\sigma = \sqrt{npq}$
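The mass function and the formulas for the mean and variance can be checked against each other numerically. The sketch below is only an illustration: the function name `binomial_pmf` and the values n = 10, p = 1/4 are assumptions chosen for the check, not values taken from the text. Exact fractions are used so that the comparison is free of rounding error.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p**x * q**(n - x) for x = 0..n, else 0."""
    if not (isinstance(x, int) and 0 <= x <= n):
        return Fraction(0)
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

n, p = 10, Fraction(1, 4)   # illustrative values, not from the text
q = 1 - p
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed from the definition E(X) and E[(X - mean)^2].
mean = sum(x * f for x, f in enumerate(probs))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(probs))

print(sum(probs), mean, variance)   # total probability, np, npq
```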
296
EXAMPLE 8.01
Find the complete binomial distribution having n = 5 and p = 1/2.
Solution: Here $n = 5 \Rightarrow x = 0, 1, 2, 3, 4, 5$
And $p = 1/2 \Rightarrow q = 1 - p = 1 - 1/2 = 1/2$
Now $f(x) = \binom{n}{x} p^x q^{n-x}$
$$\Rightarrow f(x) = \binom{5}{x} (1/2)^x (1/2)^{5-x}$$
Hence the complete binomial distribution becomes:
x | $f(x) = \binom{5}{x}(1/2)^x(1/2)^{5-x}$
0 | $f(0) = \binom{5}{0}(1/2)^0(1/2)^{5-0} = 1\,(1/2)^5 = 1/32$
1 | $f(1) = \binom{5}{1}(1/2)^1(1/2)^{5-1} = 5\,(1/2)^5 = 5/32$
2 | $f(2) = \binom{5}{2}(1/2)^2(1/2)^{5-2} = 10\,(1/2)^5 = 10/32$
3 | $f(3) = \binom{5}{3}(1/2)^3(1/2)^{5-3} = 10\,(1/2)^5 = 10/32$
4 | $f(4) = \binom{5}{4}(1/2)^4(1/2)^{5-4} = 5\,(1/2)^5 = 5/32$
5 | $f(5) = \binom{5}{5}(1/2)^5(1/2)^{5-5} = 1\,(1/2)^5 = 1/32$
Note: A binomial distribution having n = 5 and p = 1/2 can also be written as $\left(\frac{1}{2} + \frac{1}{2}\right)^5$.
297
EXAMPLE 8.02
An event has p = 3/8 and n = 5. Find the probability of:
(i) $P(X = 3)$ (ii) $P(X > 3)$ (iii) $P(X \ge 3)$ (iv) $P(X \le 3)$
Solution: Here $n = 5 \Rightarrow x = 0, 1, 2, 3, 4, 5$
And $p = 3/8 \Rightarrow q = 1 - p = 1 - 3/8 = 5/8$
Now $P(X = x) = \binom{n}{x} p^x q^{n-x}$
$$\Rightarrow P(X = x) = \binom{5}{x} (3/8)^x (5/8)^{5-x}$$
(i) $P(X = 3)$
$$P(X = 3) = \binom{5}{3}(3/8)^3(5/8)^{5-3} = 10\,(3/8)^3(5/8)^2 = 0.21$$
(ii) $P(X > 3)$
$$P(X > 3) = P(X = 4) + P(X = 5)$$
$$= \binom{5}{4}(3/8)^4(5/8)^{5-4} + \binom{5}{5}(3/8)^5(5/8)^{5-5}$$
$$= 5\,(3/8)^4(5/8) + 1\,(3/8)^5 = 0.07$$
(iii) $P(X \ge 3)$
$$P(X \ge 3) = P(X = 3) + P(X = 4) + P(X = 5)$$
$$= \binom{5}{3}(3/8)^3(5/8)^{5-3} + \binom{5}{4}(3/8)^4(5/8)^{5-4} + \binom{5}{5}(3/8)^5(5/8)^{5-5}$$
$$= 10\,(3/8)^3(5/8)^2 + 5\,(3/8)^4(5/8) + 1\,(3/8)^5 = 0.28$$
Note: In the binomial distribution we cannot find probabilities for non-integer values such as 2.3, because the binomial r.v. X can take only integer values.
298
(iv) $P(X \le 3)$
$$P(X \le 3) = 1 - P(X > 3) = 1 - 0.07 = 0.93$$
Note: $P(X \le a) + P(X > a) = 1$
EXAMPLE 8.03
Find mean, variance and S.D for a binomial distribution having n = 20 and p = 0.3
Solution: We know that for a binomial distribution
Mean $= np = (20)(0.3) = 6$
Variance $= npq = (20)(0.3)(0.7) = 4.2$ (since $q = 1 - p = 1 - 0.3 = 0.7$)
S.D $= \sqrt{npq} = \sqrt{(20)(0.3)(0.7)} = 2.05$
Note: For a binomial distribution Mean > Variance.
t
ta
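The four probabilities of Example 8.02, namely P(X = 3), P(X > 3), P(X ≥ 3) and P(X ≤ 3), and the summary measures of Example 8.03 can be reproduced with a few lines of code. This is a minimal sketch; the helper name `binomial_pmf` and the variable names are illustrative assumptions rather than anything defined in the text.

```python
from fractions import Fraction
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p**x * q**(n - x)."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Example 8.02: n = 5, p = 3/8 (exact fractions avoid rounding surprises)
n, p = 5, Fraction(3, 8)
p_eq_3 = binomial_pmf(3, n, p)
p_gt_3 = binomial_pmf(4, n, p) + binomial_pmf(5, n, p)
p_ge_3 = p_eq_3 + p_gt_3
p_le_3 = 1 - p_gt_3                      # complement rule
for label, value in [("P(X = 3)", p_eq_3), ("P(X > 3)", p_gt_3),
                     ("P(X >= 3)", p_ge_3), ("P(X <= 3)", p_le_3)]:
    print(label, round(float(value), 2))

# Example 8.03: n = 20, p = 0.3
n3, p3 = 20, 0.3
mean3 = n3 * p3
variance3 = n3 * p3 * (1 - p3)
print(round(mean3, 2), round(variance3, 2), round(sqrt(variance3), 2))
```

Rounded to two decimal places, the results agree with the worked answers 0.21, 0.07, 0.28 and 0.93, and with the mean 6, variance 4.2 and S.D 2.05.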
EXAMPLE 8.04
The mean and variance of a binomial distribution are 42 and 12.6. Find the values of the parameters “n” and “p”.
Solution: Given that
Mean $= 42 \Rightarrow np = 42$ ----- (a)
And Variance $= 12.6 \Rightarrow npq = 12.6$ ----- (b)
Putting $np = 42$ in equation (b) we have:
$$42q = 12.6 \Rightarrow q = \frac{12.6}{42} = 0.3$$
Now since $p = 1 - q \Rightarrow p = 1 - 0.3 = 0.7$
Putting $p = 0.7$ in equation (a) we have:
$$n(0.7) = 42 \Rightarrow n = \frac{42}{0.7} = 60$$
Hence the values of the parameters are: n = 60 and p = 0.7
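The steps of Example 8.04 can be sketched as a small function: divide the variance by the mean to get q, subtract from 1 to get p, then divide the mean by p to get n. The name `binomial_parameters` is hypothetical, and rounding n to a whole number assumes the given mean and variance really come from a binomial distribution.

```python
def binomial_parameters(mean, variance):
    """Recover (n, p) from mean = n*p and variance = n*p*q."""
    # For a binomial distribution q = variance / mean must lie between 0 and 1.
    if not 0 < variance < mean:
        raise ValueError("a binomial distribution needs 0 < variance < mean")
    q = variance / mean
    p = 1 - q
    n = mean / p
    # n must be a whole number of trials; round away floating-point noise.
    return round(n), p

n, p = binomial_parameters(42, 12.6)
print(n, round(p, 2))
```

Calling the function with a variance larger than the mean raises an error, because no binomial distribution can have such values.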
299
Prove that the mean of the Binomial distribution is “np”
Proof:
We know that Mean $= \mu = E(X) = \sum_{x=0}^{n} x f(x)$
$$\Rightarrow \mu = E(X) = \sum_{x=0}^{n} x \binom{n}{x} p^x q^{n-x} \qquad \left(\because f(x) = \binom{n}{x} p^x q^{n-x}\right)$$
$$= 0\binom{n}{0} p^0 q^{n-0} + 1\binom{n}{1} p^1 q^{n-1} + 2\binom{n}{2} p^2 q^{n-2} + \cdots + n\binom{n}{n} p^n q^{n-n}$$
$$= np\,q^{n-1} + n(n-1)\,p^2 q^{n-2} + \cdots + n p^n$$
$$= np\left[q^{n-1} + (n-1)\,p\,q^{n-2} + \cdots + p^{n-1}\right]$$
$$= np\,(q + p)^{n-1}$$
$$= np \qquad (\because p + q = 1)$$
Hence proved.
Note: the symbol $\mu$ is read “mu”.
s@
Properties of Binomial Distribution
• The mean and variance of the binomial distribution are “np” and “npq” respectively.
• For the binomial distribution Mean > Variance.
• The shape of the binomial distribution depends on the values of “n” and “p”.
• The binomial distribution is
o Symmetric if p = q = 1/2
o Positively skewed if p < q
o Negatively skewed if p > q
• The binomial distribution approaches the normal distribution as $n \to \infty$, such that np > 5 and nq > 5.
• The moments about the mean of the binomial distribution are:
o $\mu_1 = 0$
o $\mu_2 = npq$
o $\mu_3 = npq(1 - 2p)$
o $\mu_4 = 3n^2p^2q^2 + npq(1 - 6pq)$
• For the binomial distribution:
o coefficient of variation $= \sqrt{\dfrac{q}{np}} \times 100$
o coefficient of skewness $= \dfrac{q - p}{\sqrt{npq}}$
o coefficient of kurtosis $= 3 + \dfrac{1 - 6pq}{npq}$
om
EXAMPLE 8.05
Is it possible to have a binomial distribution with mean = 5 and S.D = 3?
Solution: Given that
Mean $= 5$
S.D $= 3 \Rightarrow$ Variance $= 3^2 = 9$
But we know that for any binomial distribution Mean > Variance.
Hence it is not possible to have a binomial distribution with mean = 5 and S.D = 3.
t
ta
EXAMPLE 8.06
If X is a binomial r.v with n = 20 and p = 0.5 then find:
(i) Coefficient of variation
(ii) Coefficient of skewness
(iii) Coefficient of kurtosis
Solution: (i) Coefficient of variation
$$\text{Coefficient of variation} = \sqrt{\frac{q}{np}} \times 100 = \sqrt{\frac{0.5}{(20)(0.5)}} \times 100 = 22.4\% \qquad (\because q = 1 - p = 1 - 0.5 = 0.5)$$
301
(ii) Coefficient of skewness
$$\text{Coefficient of skewness} = \frac{q - p}{\sqrt{npq}} = \frac{0.5 - 0.5}{\sqrt{(20)(0.5)(0.5)}} = 0$$
Hence the distribution is symmetric.
(iii) Coefficient of kurtosis
$$\text{Coefficient of kurtosis} = 3 + \frac{1 - 6pq}{npq} = 3 + \frac{1 - 6(0.5)(0.5)}{(20)(0.5)(0.5)} = 2.9$$
Hence the distribution is platykurtic.
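The three coefficients of Example 8.06 follow directly from n and p, so they are easy to wrap in one helper. The sketch below uses a hypothetical function name, `binomial_shape`, and assumes 0 < p < 1 so that no division by zero occurs.

```python
from math import sqrt

def binomial_shape(n, p):
    """Return (coefficient of variation in %, skewness, kurtosis)."""
    q = 1 - p
    cv = sqrt(q / (n * p)) * 100
    skewness = (q - p) / sqrt(n * p * q)
    kurtosis = 3 + (1 - 6 * p * q) / (n * p * q)
    return cv, skewness, kurtosis

cv, skew, kurt = binomial_shape(20, 0.5)
print(round(cv, 1), skew, round(kurt, 1))
```

A skewness of zero corresponds to the symmetric case p = q, and a kurtosis below 3 corresponds to a platykurtic distribution.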

t
ta
EXAMPLE 8.07
If X is a binomial r.v with n = 20 and p = 0.3 then find E(2X-3).
Solution: We know that for a binomial distribution
Mean $= E(X) = np = (20)(0.3) = 6$
$E(2X - 3) = 2E(X) - 3 = 2(6) - 3 = 9$
302
Pascal’s Triangle!!!
Consider the binomial expansion:
$$(p + q)^n = \binom{n}{0} p^n q^0 + \binom{n}{1} p^{n-1} q^1 + \binom{n}{2} p^{n-2} q^2 + \cdots + \binom{n}{n} p^0 q^n$$
The coefficients $\binom{n}{0}, \binom{n}{1}, \binom{n}{2}, \ldots, \binom{n}{n}$ are called the binomial coefficients. These coefficients can be easily written down by using an arrangement of numbers, called Pascal’s triangle, given below:
[Figure: Pascal’s triangle, and so on . . .]
In 1653, Blaise Pascal created a triangle of numbers, called Pascal’s triangle, that can be used in the binomial distribution.
Power | Binomial Expansion | Coefficients
2 | $(p + q)^2 = p^2 + 2pq + q^2$ | 1 2 1
3 | $(p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3$ | 1 3 3 1
4 | $(p + q)^4 = p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4$ | 1 4 6 4 1
and so on…
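Each row of Pascal’s triangle can be built from the row above it, since every inner entry is the sum of the two entries directly above. Here is a minimal sketch of that rule; the function name `pascal_row` is an illustrative choice.

```python
def pascal_row(n):
    """Return the binomial coefficients C(n, 0), ..., C(n, n)."""
    row = [1]
    for _ in range(n):
        # Each inner entry is the sum of the two entries above it.
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return row

for power in range(2, 5):
    print(power, pascal_row(power))
```

The rows for powers 2, 3 and 4 reproduce the coefficient column of the table above.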
303
EXAMPLE 8.08
Expand the binomial distribution $\left(\frac{1}{3} + \frac{2}{3}\right)^4$
Solution: Here p = 1/3, q = 2/3 and n = 4
Now since $(p + q)^4 = p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4$
$$\left(\frac{1}{3} + \frac{2}{3}\right)^4 = \left(\frac{1}{3}\right)^4 + 4\left(\frac{1}{3}\right)^3\left(\frac{2}{3}\right) + 6\left(\frac{1}{3}\right)^2\left(\frac{2}{3}\right)^2 + 4\left(\frac{1}{3}\right)\left(\frac{2}{3}\right)^3 + \left(\frac{2}{3}\right)^4$$
$$\left(\frac{1}{3} + \frac{2}{3}\right)^4 = \frac{1}{81} + \frac{8}{81} + \frac{24}{81} + \frac{32}{81} + \frac{16}{81}$$
This is the required expansion.
gm
Test Yourself
1) Find complete binomial distribution having n = 6 and p = 1/4
2) Find mean, variance and S.D for a binomial distribution having n = 15 and p = 0.7
3) The mean and variance of a binomial distribution are 1.2 and 0.84. Find the values of the parameters “n” and “p”.
4) If X is a binomial r.v with n = 14 and p = 0.8 then find:
5) An event has the p = 1/8 and n = 5 find the probability of:
(i) P( X  2) (ii) P( X  2) (iii) P( X  4) (iv) P( X  2)
6) Expand the binomial distribution $\left(\frac{1}{4} + \frac{3}{4}\right)^5$
304
Hypergeometric experiment
An experiment that has the following properties is called a hypergeometric experiment:
• Every trial results in a success or a failure.
• The successive trials are dependent.
• The probability of success changes from trial to trial.
• The number of trials is fixed in advance.
Here is a simple example of a hypergeometric experiment. Suppose 5 cards are drawn at random without replacement from a standard deck and we are interested in selecting red cards. For instance, the probability of 3 red cards on the first draw of 5 cards is:
$$P(3\ \text{red}) = \frac{\binom{26}{3}\binom{26}{2}}{\binom{52}{5}} = 0.325$$
First Draw
Red Black Total
26 26 52
If those 5 cards (3 red and 2 black) are not replaced, then on the second draw of 5 cards the probability becomes:
$$P(3\ \text{red}) = \frac{\binom{23}{3}\binom{24}{2}}{\binom{47}{5}} = 0.319$$
and so on.
Second Draw
Red Black Total
23 24 47
Thus the probability of success changes from trial to trial in this case, because the cards are drawn without replacement. If the random variable x represents the number of red cards selected, then the possible values of the random variable are 0, 1, 2, 3, 4, and 5. Note that x is a discrete random variable because its possible values can be listed.
The random variable “X” representing the number of successes in a hypergeometric experiment is called a hypergeometric random variable.
305
Hypergeometric Distribution
The probability distribution of the hypergeometric variable “X” is called the hypergeometric distribution.
The probability mass function of the hypergeometric distribution is given below:
$$P(X = x) = f(x) = \begin{cases} \dfrac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}} & ;\ x = 0, 1, 2, \ldots, n \ \text{ if } n \le k, \quad x = 0, 1, 2, \ldots, k \ \text{ if } n > k \\ 0 & ;\ \text{otherwise} \end{cases}$$
Where
• N = number of units in the population
• n = number of units in the sample (also “n” is the number of trials, fixed in advance)
• k = number of successes in the population
• x = number of successes in the sample
Note: “N”, “n” and “k” are the parameters of the hypergeometric distribution.
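The mass function translates almost directly into code. The sketch below is a minimal illustration using the red-card setting (N = 52, k = 26, n = 5); the function name `hypergeometric_pmf` is an assumption of this example. It also computes the mean from the definition, which should equal nk/N.

```python
from fractions import Fraction
from math import comb

def hypergeometric_pmf(x, N, k, n):
    """P(X = x) = C(k, x) * C(N - k, n - x) / C(N, n), else 0."""
    if not (isinstance(x, int) and 0 <= x <= n):
        return Fraction(0)
    # math.comb returns 0 when x > k or n - x > N - k, so impossible
    # values automatically get probability 0.
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

N, k, n = 52, 26, 5          # 52 cards, 26 red, 5 drawn without replacement
probs = [hypergeometric_pmf(x, N, k, n) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(probs))

print(round(float(probs[3]), 3))   # probability of exactly 3 red cards
print(sum(probs), mean)            # total probability and the mean
```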
ta
es

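Mean, Variance and S.D of Hypergeometric Distribution
Measure | Formula | If $\frac{k}{N} = p$ and $q = 1 - p$ | If $N \to \infty$
Mean | $\mu = \frac{nk}{N}$ | $\mu = np$ | $\mu = np$
Variance | $\sigma^2 = \frac{nk}{N}\left(\frac{N-k}{N}\right)\left(\frac{N-n}{N-1}\right)$ | $\sigma^2 = npq\left(\frac{N-n}{N-1}\right)$ | $\sigma^2 = npq$
Standard Deviation | $\sigma = \sqrt{\frac{nk}{N}\left(\frac{N-k}{N}\right)\left(\frac{N-n}{N-1}\right)}$ | $\sigma = \sqrt{npq\left(\frac{N-n}{N-1}\right)}$ | $\sigma = \sqrt{npq}$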
306
Properties of Hypergeometric Distribution
• The mean and variance of the hypergeometric distribution are $\frac{nk}{N}$ and $\frac{nk}{N}\left(\frac{N-k}{N}\right)\left(\frac{N-n}{N-1}\right)$ respectively.
• The hypergeometric distribution approaches the binomial distribution as $N \to \infty$.
EXAMPLE 8.09
If 6 cards are drawn from a deck of 52 playing cards, what is the probability that 2 will be hearts?
Solution: By using the hypergeometric distribution with:
N = 52 : Total cards in a deck
k = 13 : There are 13 hearts, i.e. successes in the population
n = 6 : Sample of 6 cards is selected
x = 2 : Exactly 2 hearts, i.e. successes in the sample
Cards
Hearts Others Total
13 39 52
We have:
$$P(2\ \text{hearts}) = \frac{\binom{13}{2}\binom{39}{4}}{\binom{52}{6}} = 0.315$$
EXAMPLE 8.10
A committee of size 5 is to be selected at random from 4 women and 5 men. Find the probability distribution for the number of women on the committee.
Solution: Let X be a random variable for the number of women on the committee. Then x = 0, 1, 2, 3, 4
Women Men Total
4 5 9
For X = 0: $P(0\ \text{women}) = \dfrac{\binom{4}{0}\binom{5}{5}}{\binom{9}{5}} = \dfrac{1}{126}$
For X = 1: $P(1\ \text{woman}) = \dfrac{\binom{4}{1}\binom{5}{4}}{\binom{9}{5}} = \dfrac{20}{126}$
For X = 2: $P(2\ \text{women}) = \dfrac{\binom{4}{2}\binom{5}{3}}{\binom{9}{5}} = \dfrac{60}{126}$
For X = 3: $P(3\ \text{women}) = \dfrac{\binom{4}{3}\binom{5}{2}}{\binom{9}{5}} = \dfrac{40}{126}$
For X = 4: $P(4\ \text{women}) = \dfrac{\binom{4}{4}\binom{5}{1}}{\binom{9}{5}} = \dfrac{5}{126}$
Hence the probability distribution for “X” is given as follows:
x 0 1 2 3 4 Total
P(x) 1/126 20/126 60/126 40/126 5/126 1
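The committee distribution of Example 8.10 can be reproduced by counting committees directly. This is a minimal sketch; the variable names `women`, `men` and `size` are illustrative.

```python
from fractions import Fraction
from math import comb

women, men, size = 4, 5, 5
total = comb(women + men, size)          # number of possible committees
distribution = {
    x: Fraction(comb(women, x) * comb(men, size - x), total)
    for x in range(women + 1)
}

for x, prob in distribution.items():
    # Keep the common denominator so the output matches the table above.
    print(x, f"{prob * total}/{total}")
```

The same pattern should work for the Test Yourself questions by changing the three numbers at the top.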

es
ze
Test Yourself
1) If 8 cards are drawn from a deck of 52 playing cards, what is the probability that 3 will be
hearts?
2) A committee of size 6 is to be selected at random from 4 women and 5 men. Find the
probability distribution for the number of men on the committee?
308
Discrete Uniform Distribution
A discrete random variable “X” is said to have a uniform distribution if its p.m.f is defined as:
$$f(x) = \begin{cases} \dfrac{1}{N} & ;\ x = 1, 2, \ldots, N \\ 0 & ;\ \text{otherwise} \end{cases}$$
Note: “N” is the parameter of the discrete uniform distribution.

Mean, Variance and S.D of Discrete Uniform Distribution
Measure | Formula
Mean | $\mu = \frac{N+1}{2}$
Variance | $\sigma^2 = \frac{N^2-1}{12}$
Standard Deviation | $\sigma = \sqrt{\frac{N^2-1}{12}}$
es
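Prove that the mean of the discrete uniform distribution is $\frac{N+1}{2}$
Proof: We know that: Mean $= \mu = E(X) = \sum_{x=1}^{N} x f(x)$
$$\Rightarrow \mu = E(X) = \sum_{x=1}^{N} x \cdot \frac{1}{N} \qquad \left(\because f(x) = \frac{1}{N}\right)$$
$$= \frac{1}{N}\sum_{x=1}^{N} x$$
$$= \frac{1}{N}\left(1 + 2 + \cdots + N\right)$$
$$= \frac{1}{N}\left(\frac{N(N+1)}{2}\right) \qquad \left(\because 1 + 2 + \cdots + N = \frac{N(N+1)}{2}\right)$$
$$= \frac{N+1}{2}$$
Note: the symbol $\mu$ is read “mu”.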
309
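EXAMPLE 8.11
From the following series find mean and variance:
1001, 1002, 1003, ..., 1009
Solution: The series consists of N = 9 consecutive numbers. Subtracting 1000 from each value gives 1, 2, ..., 9, which follows the discrete uniform distribution with N = 9.
Mean of 1, 2, ..., 9: $\mu = \frac{N+1}{2} = \frac{9+1}{2} = \frac{10}{2} = 5$
Adding the 1000 back, the mean of the series is 1000 + 5 = 1005.
Subtracting a constant does not change the variance, so:
Variance: $\sigma^2 = \frac{N^2-1}{12} = \frac{9^2-1}{12} = \frac{81-1}{12} = \frac{80}{12} = 6.67$
Note: For consecutive natural numbers we may use the discrete uniform distribution to find their mean and variance.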
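A direct calculation on the series is a useful cross-check for Example 8.11. Note that the formula (N + 1)/2 = 5 is the mean of the values 1, 2, ..., 9; the series itself is shifted by 1000, so its mean is 1005, while the variance is not affected by the shift. The sketch below assumes population (not sample) variance, and its variable names are illustrative.

```python
from fractions import Fraction

series = list(range(1001, 1010))         # 1001, 1002, ..., 1009
N = len(series)

# Direct calculation from the data (population mean and variance).
direct_mean = Fraction(sum(series), N)
direct_variance = sum((x - direct_mean) ** 2 for x in series) / N

# Discrete uniform formulas, with the shift of 1000 added back to the mean.
shift = series[0] - 1
formula_mean = shift + Fraction(N + 1, 2)
formula_variance = Fraction(N**2 - 1, 12)

print(direct_mean, round(float(direct_variance), 2))
print(formula_mean, round(float(formula_variance), 2))
```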
om
12 12 12 12
l.c
Continuous Uniform Distribution
A continuous random variable “X” is said to have a uniform distribution over the interval (a, b) if its p.d.f is defined as:
$$f(x) = \begin{cases} \dfrac{1}{b - a} & ;\ a \le x \le b \\ 0 & ;\ \text{otherwise} \end{cases}$$
Note: “a” and “b” are the parameters of the continuous uniform distribution.

Mean, Variance and S.D of Continuous Uniform Distribution
Measure | Formula
Mean | $\mu = \frac{a+b}{2}$
Variance | $\sigma^2 = \frac{(b-a)^2}{12}$
Standard Deviation | $\sigma = \frac{b-a}{\sqrt{12}}$
310
EXAMPLE 8.12
Let X have a continuous uniform distribution over the interval (2, 5). Find its mean and variance.
Solution: Since the interval is (2, 5), here a = 2 and b = 5.
Now Mean: $\mu = \frac{a+b}{2} = \frac{2+5}{2} = \frac{7}{2} = 3.5$
And Variance: $\sigma^2 = \frac{(b-a)^2}{12} = \frac{(5-2)^2}{12} = \frac{3^2}{12} = \frac{9}{12} = 0.75$
12 12 12 12
Important!!!
Math Symbol | Phrases
> | “greater than” or “more than” or “exceed” or “better than” or “taller than” or “above”
< | “less than” or “smaller than” or “below” or “under” or “fewer than”
≥ | “at least” or “greater than or equal to” or “no less than”
≤ | “at most” or “less than or equal to” or “no more than”
311
Sharpen your Pencil

MCQ’s
(1) The probability of success is denoted by _____
(A) p (B) q (C) n (D) None of these
(2) The sum of “p” and “q” is = _____
(3) Binomial distribution has _____ parameters.
(A) 2 (B) 3 (C) 1 (D) None of these
(4) Discrete uniform distribution has _____ parameter.
(5) Continuous uniform distribution has _____ parameters.
(A) 3 (B) 2 (C) 1 (D) None of these
(6) Hypergeometric distribution has _____ parameters.
(7) The binomial distribution is symmetrical if p = q = _____
(A) 1/2 (B) 1/4 (C) 1 (D) None of these
(8) The binomial experiment becomes the Bernoulli experiment if n = _____
(9) The shape of binomial distribution depends on the values of _____
(A) n and p (B) n and q (C) p and q (D) None of these
(10) When p = 0.4 and n = 6 the Mean of binomial distribution is _____
312
Sharpen your Pencil

MCQ’s
(11) For binomial distribution Mean is ______ than its Variance.
(A) smaller (B) greater (C) equal (D) None of these
(12) Coefficient of variation of binomial distribution is _____
om
q q
(A) (B) 100 (C) npq 100 (D) None of these
np np
l.c
(13) In a Binomial experiment the repeated trials are _____
(A) Independent (B) Dependent (C) Mixed (D) None of these
gm
(14) The Binomial distribution is negatively skewed if _____
(A) pq (B) pq (C) pq (D) None of these

s@
(15) The Binomial distribution is positively skewed if _____
(A) pq (B) pq (C) pq (D) None of these

t
ta
(16) For Binomial distribution S.D = _____

es
(A) npq (B) np (C) npq (D) None of these

ze
(17) In Binomial distribution n = 20 and p = 3/5 then its S.D = _____
(18) In Binomial distribution n = 10 and p = 3/5 then its Variance = _____
(19) The probability of failure is denoted by _____
(A) p (B) q (C) 0 (D) None of these
(20) The parameters of Hypergeometric distribution are _____
(A) N, n, k (B) N, k (C) n, k (D) None of these
313
Short Questions
ExeRciSe
Q.8.01. What are the properties of Binomial distribution?
Q.8.02. Prove that the mean of the Uniform distribution is $\frac{N+1}{2}$.
Q.8.03. If X is a binomial random variable with n = 5 and p = 0.6, then find E(2X-3) and Var(2X-3).
Q.8.04. If X is a binomial random variable with n = 20 and p = 0.5, find its variance and coefficient of variation.
Q.8.05. From the following series find mean and variance (using uniform distribution):
5001, 5002, 5003, ..., 5007
Q.8.06. Let X have a continuous uniform distribution over the interval (3, 7). Find its mean and variance.
Q.8.07. Is it possible to have a binomial distribution with mean = 8 and S.D = 9?
Q.8.08. If 4 cards are drawn from a deck of 52 cards, what is the probability that 2 cards are spades?
Q.8.09. Expand the following binomial distributions:
(i) $\left(\frac{1}{4} + \frac{3}{4}\right)^5$ (ii) $(0.65 + 0.35)^4$
Q.8.10. If 5 cards are drawn from a deck of 52 cards, what is the probability that 3 cards will be clubs?
314
Long Questions
ExeRciSe
Q.8.01. An event has the p = 3/8 and n = 5 find the probability of:
(i) P( x  3) (ii) P( x  3)
(iii) P( x  3) (iv) P( x  3)
Q.8.02. If n = 5 and p = 1/3, find the complete Binomial distribution.
Q.8.03. If n = 4 and p = 1/2, find the complete Binomial distribution.
Q.8.04. If X is a binomial random variable with n = 20 and p = 0.5 then find:
Q.8.05. The mean and variance of a binomial distribution are 42 and 12.6. Find “n” and “p”.
Q.8.06. The mean and variance of a binomial distribution are 3 and 1.5. Find its parameters.
Q.8.07. A committee of 3 members is to be selected from 3 men and 4 women. Find the probability distribution for the number of men on the committee.
315
Prepared by
Zafar Ali (M.Sc. Statistics)
Cell No. 0333-9004086, 0345-9282215

Types of Index Number
[Chart: index numbers divided into Simple Index Numbers and Composite Index Numbers. Labels shown in the chart: Fixed Base, Chain Base; Unweighted Index numbers and Weighted Index numbers; Simple Aggregative Method, Simple average of relative Method; Weighted Aggregative Method, Weighted average of relative Method; Fixed Base Method and Chain Base Method]
8 Study Tips for Statistics Students
• Do homework daily
Doing homework on a daily basis is essential to succeeding in Statistics class. Take notes in class and use them daily. Make a file where you can keep all your notes handy for future reference.
• Don't be afraid to ask for help
Don't be afraid or too proud to ask for help. Your teacher will gladly help you and it will give you the tools to conquer the problem.
• Do sample tests and use a calculator
Take time to do sample tests. This way you can identify problem areas quickly and eliminate them before it's time for the real test. Retest yourself regularly. This way you will be able to establish your weak points. Do make use of a calculator to simplify the computations.
• Form a study group
Form a study group that can meet at least once a week where you can discuss problems or any difficulties and help each other. Compare answers with one another.
• Take your time until you understand a problem
Don't rush through problems. Take your time and make sure you understand each one. What you don't understand today will become a problem tomorrow.
• Don't rush through problems
Take your time and make sure you understand each one. What you don't understand today will become a problem tomorrow.
• Practice makes perfect
Statistics is something you need to practice. Repetition is what will give you the skill to overcome any problem.
• Relaxation techniques
When you feel all flustered or fear grips your heart, try to relax.
