
Collection, Processing & Presentation of Business Data
Data

 A set of observations obtained from a particular enquiry is called data or a data set.

 Usually numerical results of scientific measurements

 Example: Income of workers, IQ scores of students, etc.

2 Collection, Processing & Presentation 1/29/2018


Variable
 Data collected or generated are always associated with some
characteristics of individuals
 These characteristics can be qualitative or quantitative
 These characteristics are found to vary from individual to
individual
 ‘A variable is a characteristic, often but not always
quantitatively measured, containing two or more
values or categories that can vary from person to
person, object to object or from phenomenon to
phenomenon’
 Example: Age of worker, Gender of a garment worker, etc.



Types of Variables
Variable
- Qualitative: Brand of PC, Marital Status, Hair Color
- Quantitative
  - Discrete: Children in a Family, Strokes on a golf hole, TV sets owned
  - Continuous: Amount of income tax paid, Weight of a student

Levels of Measurements
 Measurement is a process of assigning numbers to some characteristics
or variables or events according to scientific rules

 Data can be classified according to levels of measurement

 If the measurement procedures in a statistical investigation are poor, the usefulness of the findings of the investigation will be severely affected.

 There are four levels of measurement


Levels of Measurements
- Nominal: data may only be classified. Examples: 1) Jersey numbers 2) Names of the students
- Ordinal: data are ranked. Examples: 1) Economic status 2) Education
- Interval: meaningful differences between values, and 0 is not an absolute zero. Examples: 1) Temperature 2) IQ marks
- Ratio: meaningful 0 point and meaningful ratios between values. Examples: 1) Height of students 2) Distance to class


Sources of Data

 Primary Data
- Data collected by the investigator himself or by a research institution for the purpose of a specific inquiry or study

 Secondary Data
- Data that have already been collected by others and are then used by the investigator


Methods of Collecting Primary Data
 The tools and techniques used to collect data depend largely on the
objective of the study
 The important methods of collecting primary data are:
1) Through Interview
2) Through Questionnaires
3) Through Schedules
4) Through Local agents
5) Through Observation
6) Through Experimentation



Through Interview
 Most important method of collecting primary data
 Can be categorized as
1) Unstructured Interview
2) Structured Interview
3) Face to Face Interview
4) Indirect Personal Interview
5) Telephone interview
6) Computer-Assisted Interviewing


Through Interview
 Unstructured Interview
- interviewer is free to ask any questions relating to the objective of the study
- purpose is to bring out some preliminary issues so that the researcher can determine
which variables need further in-depth investigation
- open-ended questions are asked

 Structured Interview
- has a list of predetermined questions to be asked to the respondents
- same questions will be asked to everybody

 Face to Face Interview
- Investigator presents himself personally before the respondents and asks
questions in face to face contact
- Useful in the investigation of a specific issue or opinion survey


Through Interview
 Indirect Personal Interview
- Investigator interviews several third persons who are directly or indirectly
concerned with the subject matter of the enquiry
- Data regarding drug addicts, gamblers, etc. are successfully obtained by this
method

 Telephone Interview
- Investigator contacts respondents by telephone and collects information from
them
- Method is best when information is to be obtained from a large number of
respondents spread over a wide geographical area

 Computer-Assisted Interviewing (CAI)
- Questions are flashed onto the computer screen and the interviewer can enter the
answers of the respondent directly into the computer


Through Questionnaires
 Investigator prepares a questionnaire containing a number of questions
relating to the objectives of the study

 Questionnaires are sent by post to the respondents with a polite cover
letter explaining the aims and objectives of collecting information

 Usually adopted by research workers, private individuals, and private
and public organizations


Through Questionnaires
 Merits
- Large field of investigation may be covered at a very low cost
- Most economical method in terms of time, money and
manpower
- Personal bias is eliminated by this method

 Demerits:
- Can be applied to educated respondents only
- Return rate of the filled questionnaires is not always
satisfactory


Through Questionnaires
 Requirements of a good questionnaire
1. The number of questions should be related to the objective of the
study
2. The size of the questionnaire should not be large
3. Questions should be logically arranged
4. Questions should be clear, brief and unambiguous
5. Questions of a sensitive nature should be avoided
6. Questions should be simple and easy to understand
7. Questions involving mathematical calculations should not be asked
8. If a particular question needs clarification, it should be explained by
way of a footnote
9. The answers to most questions should be multiple choice; that is,
there should be more closed questions than open questions


Through Schedule
 Instead of sending a questionnaire, the investigator, through
enumerators carrying a questionnaire (schedule), asks respondents
questions and records their replies
 Merits
- Information is more reliable
- Non-response rate is very low
- Can be applied to illiterate respondents
 Limitations
- Expensive
- Time consuming
- Skilled enumerators are needed to get reliable data



Questionnaire Vs Schedule
Questionnaire Method
 Sent to respondent with a cover letter
 Cheaper and economical
 Fast
 Not always clear who replies
 Success depends on the quality of the questionnaire

Schedule Method
 Filled out by the researcher
 Expensive compared to questionnaire
 Slow
 Identity of the respondent is known
 Success depends upon the honesty and competence of enumerators


Through Local Agents
 Information is collected by local agents, also known as
correspondents
 They collect information and transmit it to the investigator
 They apply their own judgment as to the best method of obtaining
information
 Usually employed by newspaper and periodical agencies
 Merits
- Cheap and economical for extensive investigation
- Information can be obtained expeditiously, since only rough
estimates are required
 Limitations
- Information may be biased by the agents' personal preferences and
interests

Through Observation & Experimentation
 Through Observation
- Purposeful, systematic, and selective way of watching and
listening to an interaction or phenomenon
- Used in studying animal behavior or the behavior of a group of
people
 Through Experimentation
- Used in natural sciences like physics, chemistry, etc.
- Data are obtained when treatments are applied to
experimental units in a controlled laboratory experiment


Sources of Collecting Secondary Data
 Published Sources
- National organizations and international agencies collect and publish
statistical data relating to business, trade, prices, consumption, production, etc.
- Government and semi-government organizations, municipalities, local bodies and
universities publish a lot of official statistics in report form
- Bangladesh Bureau of Statistics (BBS) is the main source of official government
statistics
- UNO, FAO, UNDP, ILO, IMF, UNESCO, WHO, etc. have their own official publications
containing valuable information
- Trade journals, financial journals and newspapers contain a good deal of
statistical data
 Unpublished Sources
- Records maintained by private firms or business houses that may not like to
release their data to outside agencies
- Universities or research institutes may also provide useful statistical data


Presentation of Data
 Data which have just been collected are in raw or disorganized form
 It is quite difficult to extract information about the characteristics of
interest from these raw data
 Hence it is required to arrange or present data using some
statistical tools
 The process of arranging data in different groups or classes
according to their affinities is known as classification
 Facts of one class differ from those of another class with respect to
some characteristics called the basis of classification
 Once classification has been done, the classified data can be
condensed and summarized in two basic forms
1) Tabular form (known as frequency distribution)
2) Diagrammatical and graphical form

Frequency Distribution
 The number of times an observation occurs in a set of data is called the
frequency of that observation
 A frequency distribution is a grouping of data into mutually exclusive classes
showing the number of observations in each class
 Steps
1) Decide on the number of classes
2) Determine the class interval or width
3) Set the individual class limits
4) Put tally marks
5) Count the number of items in each class


Frequency Distribution
 Decide on the number of classes:
- A class is an interval containing observations, each observation
being classified into one and only one class
- Number of classes should be such that the true nature of the
distribution is made manifest
- Too many or too few classes might not reveal the
basic shape of the data set
- A useful recipe to determine the number of classes (k) is the
“2 to the k rule”
- Select the smallest number of classes k such that $2^k$ is greater than the
number of observations (n), that is, $2^k > n$
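
The “2 to the k rule” can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `number_of_classes` is an assumption made here and does not come from the slides. It searches for the smallest k with 2^k greater than n.

```python
def number_of_classes(n):
    """Return the smallest k such that 2**k > n (the "2 to the k" rule)."""
    if n < 1:
        raise ValueError("n must be a positive number of observations")
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


# With 20 observations: 2**4 = 16 is not greater than 20, but 2**5 = 32 is.
print(number_of_classes(20))  # prints 5
```

The rule gives a starting point only; the final number of classes may still be adjusted so that the class limits are convenient.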



Frequency Distribution
 Determine the class interval or width
- Generally the class interval or width should be the same for
all classes
- The classes all together taken must cover at least the distance
from the lowest value in the raw data up to the highest values
in the raw data
- Formula for determining the interval or width is:
$$\text{Class interval} = \frac{H - L}{k}$$
where H = highest value, L = lowest value, k = number of classes
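
As a rough sketch, the width formula can be evaluated in Python as below. The function name `class_width` is hypothetical, and rounding the result up to the next multiple of 10 is an assumption made here to obtain convenient class limits; the values H = 98, L = 51 and k = 5 are those of the oil change example in these slides.

```python
import math


def class_width(highest, lowest, k):
    """Return (H - L) / k, the smallest width that covers the data with k classes."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return (highest - lowest) / k


raw_width = class_width(98, 51, 5)
# Assumption: round up to the next multiple of 10 for convenient class limits.
width = math.ceil(raw_width / 10) * 10
print(raw_width, width)  # prints 9.4 10
```

Rounding up rather than down keeps the classes wide enough to cover the whole range from the lowest to the highest value.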
Frequency Distribution
 Set the individual class limits
- State clear class limits so that each observation falls into only
one class
- Overlapping or unclear class limits must be avoided

 Tally marks
- Each time a value falls within an interval, a vertical
bar (|), called a tally mark, is put against that interval
 Count the number of items in each class
- Finally we count the number of bars corresponding to each interval
of the variable


Construction of Frequency Distribution
The Quick Oil Company has a number of outlets in the metropolitan
Seattle area. The numbers of oil changes at the Oak Street outlet in the
past 20 days are:
65, 98, 55, 62, 79, 59, 51, 90, 72, 56, 70, 62, 66, 80, 94, 79, 63, 73, 71, 85
The data are to be organized into a frequency distribution.
1) How many classes would you recommend?
2) What class interval would you suggest?
3) What lower limit would you recommend for the first class?
4) Organize the number of oil changes into a frequency distribution.
5) Comment on the shape of the frequency distribution. Also
determine the relative frequency distribution.


Construction of Frequency Distribution
1) Decide on the number of classes
Total observations n = 20
Applying the $2^k$ rule we get $2^5 = 32 > 20$
So I would recommend 5 classes
2) Determine the class interval or width
We know that
$$\text{Class interval} = \frac{H - L}{k}$$
where H = 98, L = 51, k = 5, so
$$\frac{98 - 51}{5} = 9.4 \approx 10$$
So I suggest 10 as the class interval
3) From the raw data we can see that the lowest value is 51, so I
would recommend 50 as the lower limit for the first class


Frequency Distribution
Class Interval   Tally Marks   Frequency   Relative Frequency
50-60            ||||          4           4/20 = 0.20
60-70            |||||         5           5/20 = 0.25
70-80            ||||| |       6           6/20 = 0.30
80-90            ||            2           2/20 = 0.10
90-100           |||           3           3/20 = 0.15
Total                          20          1.00

Each class includes its lower limit and excludes its upper limit (for
example, the value 70 is counted in the 70-80 class).

From the frequency distribution we can see that the 70-80 class has the
maximum number of observations while the 80-90 class has the minimum
number of observations. So we can say that the number of oil changes at
the Oak Street outlet was most often between 70 and 80 and least often
between 80 and 90.
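
The table above can be reproduced with a short Python sketch. The function name `frequency_distribution` and its parameters are assumptions made for illustration; each class is treated as including its lower limit and excluding its upper limit, which is the convention that matches the frequencies in the table.

```python
oil_changes = [65, 98, 55, 62, 79, 59, 51, 90, 72, 56,
               70, 62, 66, 80, 94, 79, 63, 73, 71, 85]


def frequency_distribution(data, lower, width, k):
    """Count observations in k classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * k
    for x in data:
        index = (x - lower) // width  # position of the class that contains x
        if not 0 <= index < k:
            raise ValueError(f"{x} lies outside the {k} classes")
        counts[index] += 1
    n = len(data)
    # One row per class: lower limit, upper limit, frequency, relative frequency
    return [
        (lower + i * width, lower + (i + 1) * width, count, count / n)
        for i, count in enumerate(counts)
    ]


table = frequency_distribution(oil_changes, lower=50, width=10, k=5)
for low, high, freq, rel in table:
    print(f"{low}-{high}: frequency {freq}, relative frequency {rel:.2f}")
```

Integer division by the class width plays the role of the tally marks: every observation is assigned to exactly one class.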
Qualitative Frequency Distribution
 Leverock’s Waterfront Steakhouse in Madeira Beach, Florida
uses a questionnaire to ask customers how they rate the
server, food quality, cocktails, prices and atmosphere at the
restaurant. Each characteristic is rated on a scale of
outstanding (O), very good (V), good (G), average (A) &
poor (P). Use descriptive statistics to summarize the
following data collected on food quality. What is your feeling
about food quality ratings at the restaurant?
G O V G A O V O V G O V A V O P V O G A O O O G O V
V A G O V P V O O G O O V O G A O V O O G V A G


Frequency Distribution
Table: Frequency Distribution of Food Rating
Ratings        Frequency   Relative Frequency
Outstanding    19          0.38
Very Good      13          0.26
Good           10          0.20
Average        6           0.12
Poor           2           0.04
Total          50          1.00
From the frequency distribution we can see that the largest share of customers
(38%) rated the food quality outstanding, while very few customers (4%) rated it
poor. Together with the 26% who rated it very good, 64% of customers rated the
food quality very good or outstanding, so the overall feeling about food quality
at the restaurant is favorable.
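
For qualitative data the counting can be sketched with Python's standard `collections.Counter`. The variable names below are illustrative assumptions; the 50 rating codes are the ones listed in the exercise.

```python
from collections import Counter

# The 50 food quality ratings from the questionnaire, one code per customer
ratings = ("G O V G A O V O V G O V A V O P V O G A O O O G O V "
           "V A G O V P V O O G O O V O G A O V O O G V A G").split()

labels = {"O": "Outstanding", "V": "Very Good", "G": "Good",
          "A": "Average", "P": "Poor"}

counts = Counter(ratings)  # frequency of each rating code
n = len(ratings)

for code, label in labels.items():
    # Relative frequency = frequency / total number of observations
    print(f"{label:<12}{counts[code]:>4}{counts[code] / n:>6.2f}")
```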
Diagrammatic Presentation
 Visual form of presentation of statistical data
 Refers to various types of devices such as bars, circles, maps,
etc.
 Diagrams do not add any new meaning to the statistical facts, but they
exhibit the results more clearly
 An ordinary person understands pictures and diagrams more
easily than figures
 Types of Diagram
• Bar Diagram
• Pie Diagram
• Pictogram etc.
Bar Diagram
• Widely used because they are easy to construct and easy to
understand
• Effective tools for visual interpretations or comparison of data.
• Television, newspapers and reports make substantial use of this
diagram to communicate visually with the viewer
• Most popular diagrammatical representation of qualitative data
• Three types of Bar Diagram
1. Simple Bar Diagram
2. Multiple Bar Diagram
3. Component Bar Diagram



Simple Bar Diagram
 Most popular diagrammatical representation of qualitative
data
 On one axis (usually horizontal axis), we specify the labels
that are used for the classes
 A frequency, relative frequency, or percent frequency can
be used for the other axis
 Using a bar of fixed width drawn above each class label we
extend the length of the bar until we reach the frequency
 For qualitative data the bars should be separated to
emphasize the fact that each class is separate


[Figure: simple bar diagram]


Multiple Bar Diagram
 Two or more sets of interrelated data are represented
 Technique of drawing is the same as for a simple bar diagram
 Since more than one phenomenon is represented, different shades,
colours, dots or crossings are used to distinguish between the bars
 Example: The following data give the sales of different sizes of shirts sold by
a departmental store for the last five months. Construct a multiple bar diagram to
compare the sales of different sizes of shirts for the last five months

Size/Month     April   May   June   July   August
Small          40      50    60     80     100
Medium         100     175   225    300    370
Large          60      90    140    180    220
Extra Large    20      30    40     60     90


Multiple Bar Diagram
[Figure: Multiple bar diagram of different kinds of shirts (Small, Medium, Large, Extra Large), April to August]


Component Bar Diagram
 Used to represent various parts of the total
 The various components in each bar should be kept in the same order
 A common and helpful arrangement is to present each bar in order of
magnitude, from the largest component at the base of the bar to the smallest at the
top
 To distinguish between different components, it is useful to use different shades
or colours
 The following table gives the expenditure in taka of a family on different items
for three years. Draw a component bar diagram with the following data

Item/Year    2005   2007   2009
Food         5000   8500   12000
Clothing     1500   2500   4000
Education    2000   3000   5000
Others       2500   4000   7000
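
Before drawing a component bar diagram, each component needs a starting and an ending height inside its bar. The sketch below computes these from the expenditure table; the function name `stacked_segments` is an assumption made for illustration, and the components are stacked in the order of the table so that every bar keeps the same order.

```python
years = [2005, 2007, 2009]
expenditure = {
    "Food": [5000, 8500, 12000],
    "Clothing": [1500, 2500, 4000],
    "Education": [2000, 3000, 5000],
    "Others": [2500, 4000, 7000],
}


def stacked_segments(table, n_bars):
    """For each bar, return (item, bottom, top) segments stacked in table order."""
    bars = []
    for j in range(n_bars):
        bottom = 0
        segments = []
        for item, values in table.items():
            top = bottom + values[j]  # each component sits on the previous one
            segments.append((item, bottom, top))
            bottom = top
        bars.append(segments)
    return bars


bars = stacked_segments(expenditure, len(years))
for year, segments in zip(years, bars):
    total = segments[-1][2]  # top of the last segment is the bar's total height
    print(year, total, segments)
```

The top of the last segment in each bar equals the total expenditure for that year, which is the full height of the bar.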



Component Bar Diagram
[Figure: Component bar diagram showing the expenditure on different items (Food, Clothing, Education, Others) in 2005, 2007 and 2009]
Pie Diagram
 Effective way of presenting percentage parts when the whole
quantity is taken as 100
 Useful device for presenting categorical data
 Consists of a circle sub-divided into sectors, whose areas are
proportional to the various parts into which the whole
quantity is divided
 Before constructing a pie diagram, we need to convert relative
frequencies (%) into angles
 As a circle consists of 360 degrees, the whole quantity to be
represented is equal to 360 degrees


Pie Diagram

Ratings        Freq   %RF    Angle (degrees)
Outstanding    19     38%    0.38 × 360 = 136.8
Very Good      13     26%    93.6
Good           10     20%    72
Average        6      12%    43.2
Poor           2      4%     14.4
Total          50     100%   360
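
The conversion from frequencies to sector angles can be sketched as follows. The function name `pie_angles` is an illustrative assumption; each angle is the relative frequency multiplied by 360 degrees.

```python
ratings = {"Outstanding": 19, "Very Good": 13, "Good": 10,
           "Average": 6, "Poor": 2}


def pie_angles(frequencies):
    """Map each category to its sector angle: frequency / total * 360 degrees."""
    total = sum(frequencies.values())
    return {label: f * 360 / total for label, f in frequencies.items()}


angles = pie_angles(ratings)
for label, angle in angles.items():
    print(f"{label:<12}{angle:>6.1f}")
```

The angles add up to 360 degrees because the relative frequencies add up to 1.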
Graphic Presentation
 Graphical representation can be advantageously employed to
bring out clearly the statistical nature of frequency
distribution
 Graph paper is needed
 The most commonly used graphs are
1. Histogram
2. Frequency Polygon
3. Frequency Curve
4. Ogive or Cumulative Frequency Polygon



Histogram
 Most common form of graphical presentation of a frequency
distribution
 Constructed by placing the class boundaries on the horizontal
axis & frequency on the vertical axis
 Each class is shown on the graph by drawing a rectangle
whose base spans the class boundaries & whose height is the
corresponding frequency for that class
 A histogram is two-dimensional (both the width and the height of each
rectangle carry meaning), whereas a bar diagram is one-dimensional
(only the height of each bar carries meaning)
 Constructed for numerical data in a continuous frequency
distribution


[Figure: histogram]


Frequency Polygon
 Provides an alternative to the histogram as a way of graphically
presenting the distribution of a continuous variable
 Presentation involves placing the mid values on the
horizontal axis & frequency on the vertical axis
 Instead of drawing rectangles, we locate the class midpoints on
the horizontal axis and plot a point directly above each
midpoint at a height corresponding to the frequency of that class;
the points are then joined by straight lines
 Classes of zero frequency are added at each end of the
frequency distribution so that the polygon closes on the horizontal axis
 The area under the frequency polygon equals the area of the histogram
 Largely used for comparison of two or more distributions
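
The points of a frequency polygon can be computed as in the sketch below. The function name `polygon_points` is an assumption made for illustration, and the frequencies 4, 5, 6, 2, 3 with first lower limit 50 and width 10 are those of the oil change example in these slides.

```python
def polygon_points(lower, width, frequencies):
    """Return (midpoint, frequency) pairs, with a zero-frequency class at each end."""
    padded = [0] + list(frequencies) + [0]
    start = lower - width  # lower limit of the extra class added on the left
    return [(start + i * width + width / 2, f) for i, f in enumerate(padded)]


points = polygon_points(50, 10, [4, 5, 6, 2, 3])
print(points)
```

Joining consecutive points with straight lines gives the polygon; the two added zero-frequency classes (midpoints 45 and 105 here) bring it down to the horizontal axis at both ends.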



Frequency Polygon
[Figure: frequency polygon; horizontal axis: Mid Value (55, 65, 75, 85, 95); vertical axis: Frequency]
Frequency Curve
 A smoothed frequency polygon is called a frequency curve

[Figure: Frequency Curve; horizontal axis: Mid Value; vertical axis: Frequency]
Ogives
 When cumulative frequencies are plotted on graph paper,
the curve obtained is called an ogive or
cumulative frequency polygon
 Ogives can be used to determine the median, quartiles, percentiles, etc.
 The class limits are shown along the X-axis and cumulative
frequencies along the Y-axis
 There are two methods of constructing ogives
1. Less than ogives
2. More than ogives



Less than ogive
 In this method we start with the upper class limits of the classes
and go on adding the frequencies to get the less than
cumulative frequency table
 Upper limits of the class intervals are plotted on the X-axis
and cumulative frequencies are plotted on the Y-axis
 A point is then plotted directly above each upper class limit at a
height corresponding to the cumulative frequency
 These points are then joined by straight lines and the resulting
polygon is called a less than cumulative frequency polygon


Less than Ogive
CI               CF
Less than 60     4
Less than 70     4 + 5 = 9
Less than 80     4 + 5 + 6 = 15
Less than 90     15 + 2 = 17
Less than 100    17 + 3 = 20
[Figure: Less than Ogive; horizontal axis: Upper Class Boundaries; vertical axis: Less than Cumulative Frequency]
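
The less than cumulative frequencies are running totals of the class frequencies, which can be sketched with `itertools.accumulate` from the Python standard library. The variable names are illustrative assumptions.

```python
from itertools import accumulate

upper_limits = [60, 70, 80, 90, 100]
frequencies = [4, 5, 6, 2, 3]

# Running total of the frequencies: observations below each upper class limit
less_than_cf = list(accumulate(frequencies))

# Points to plot for the less than ogive: (upper class limit, cumulative frequency)
less_than_points = list(zip(upper_limits, less_than_cf))
print(less_than_points)
```

The last cumulative frequency always equals the total number of observations, here 20.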
More than Ogive
 In this method we start with the lower class limits of the
classes and from the total frequency we subtract the
frequency of each class to get the more than cumulative
frequency table
 Lower limits of the class intervals are plotted on the X-axis
and more than cumulative frequencies are plotted on the Y-axis
 A point is then plotted directly above each lower class limit at a
height corresponding to the cumulative frequency
 These points are then joined by straight lines and the resulting
polygon is called a more than cumulative frequency polygon


More than Ogive
CI              CF
More than 50    20
More than 60    20 - 4 = 16
More than 70    16 - 5 = 11
More than 80    11 - 6 = 5
More than 90    5 - 2 = 3
[Figure: More than Ogive; horizontal axis: Lower Class Boundaries; vertical axis: More than Cumulative Frequency]
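
The more than cumulative frequencies can be sketched in the same spirit, starting from the total and subtracting one class frequency at a time. The function name `more_than_cumulative` is an assumption made for illustration.

```python
lower_limits = [50, 60, 70, 80, 90]
frequencies = [4, 5, 6, 2, 3]


def more_than_cumulative(freqs):
    """Start from the total and subtract each class frequency in turn."""
    remaining = sum(freqs)
    result = []
    for f in freqs:
        result.append(remaining)  # observations at or above this lower limit
        remaining -= f
    return result


more_than_cf = more_than_cumulative(frequencies)

# Points to plot for the more than ogive: (lower class limit, cumulative frequency)
more_than_points = list(zip(lower_limits, more_than_cf))
print(more_than_points)
```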



Draw more than and less than Ogives



Line Graph
 Represents a special type of numerical data called time
series data
 When the values of a variable are available at different time
points, the set of values is known as a time series
 Time is plotted along the X-axis and values of the variable are
plotted along the Y-axis
 A point is plotted directly above each time point at a height
corresponding to the value of the variable
 The points are then connected by straight lines and the
resulting graph is called a line graph


Line Graph
Year    Sales
1995    235
1996    355
1997    395
1998    420
1999    430
2000    490
2001    430
[Figure: Line Graph; horizontal axis: Year; vertical axis: Sales]


Any Queries?
