
Quantitative Methods for

Management
Day-2
Recap..
Introduction
Definition
Terms and terminologies
Types of statistics
Types of data
Levels of measurements
Application of statistics in business
Data warehousing & data mining
Sources of data
Organizing /Classification of data
Qualitative
Quantitative
Geographical
Chronological
Time series: a set of observations collected at usually discrete, equally spaced time intervals (e.g., the daily closing price of a certain stock recorded over the last six weeks)
Cross-sectional: observations from different individuals or groups at a single point in time (e.g., the inventory of all ice creams in stock at a particular store)
VISUALIZING/PRESENTATION OF
DATA
TABULAR
DIAGRAMS
GRAPHS
TABULATION

SPECIMEN OF A TABLE

Stub heading    Caption (column headings)      Total
Stub entries    Body of the table (entries)    Row totals
Total           Column totals                  Grand Total

Foot Note
Sources
Descriptive Statistics:
Tabular and Graphical Presentations
Summarizing Categorical Data
Summarizing Quantitative Data

Categorical data use labels or names to identify categories of like items.
Quantitative data are numerical values that indicate how much or how many.
Categorical Data Are Summarized By
Tables & Graphs

Categorical Data
Tabulating Data: Summary Table
Graphing Data: Bar Charts, Pie Charts, Pareto Chart

Chap 2-8
Summarizing Categorical Data
Frequency Distribution/ contingency table

Relative Frequency Distribution


Percent Frequency Distribution
Bar Chart
Pie Chart
Frequency Distribution

A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several non-overlapping classes.
The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.
Relative Frequency Distribution

The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.
A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.
Percent Frequency Distribution
The percent frequency of a class is the relative frequency multiplied by 100.
A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.
Frequency Distribution
Example: 4 soft drinks, 15 households

Coke      Pepsi   7 Up   Coke      Mirinda
Coke      7 Up    7 Up   Coke      Coke
Mirinda   7 Up    Coke   Mirinda   Coke

Drink      Frequency
Coke       7
Pepsi      1
Mirinda    3
7 Up       4
Total      15
Frequency Distribution
Soft Drink   Frequency   Relative frequency   Percent frequency
Coke         7           0.47                 47
Pepsi        1           0.07                 7
Mirinda      3           0.20                 20
7 Up         4           0.27                 27
Total        15          1.00                 100
Note: the rounded entries add to 1.01 and 101; the unrounded values total exactly 1.00 and 100.
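The three distributions can be computed directly from the raw responses. The following is a minimal sketch, assuming the 15 responses are held in a Python list; the names `responses` and `frequency_table` are illustrative and not part of the slides. Note that Coke's relative frequency, 7/15, is about 0.467.

```python
from collections import Counter

# The 15 household responses from the soft drink example.
responses = [
    "Coke", "Pepsi", "7 Up", "Coke", "Mirinda",
    "Coke", "7 Up", "7 Up", "Coke", "Coke",
    "Mirinda", "7 Up", "Coke", "Mirinda", "Coke",
]

def frequency_table(items):
    """Return {category: (frequency, relative frequency, percent frequency)}."""
    counts = Counter(items)
    n = len(items)
    return {cat: (f, f / n, 100 * f / n) for cat, f in counts.items()}

table = frequency_table(responses)
for cat, (f, rel, pct) in table.items():
    # Relative frequency to two decimals, percent frequency to the nearest whole number.
    print(f"{cat:<8} {f:>3} {rel:6.2f} {pct:5.0f}")
```

Because each relative frequency is rounded separately, the printed column can add to slightly more or less than 1.00 even though the exact values sum to 1.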
Frequency Distribution
Example: Marada Inn

Guests staying at Marada Inn were asked to rate the quality of their accommodations as being excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are:

Below Average, Average, Above Average,
Above Average, Above Average, Above Average,
Above Average, Below Average, Below Average,
Average, Poor, Poor,
Above Average, Excellent, Above Average,
Average, Above Average, Average,
Above Average, Average
Frequency Distribution

Example: Marada Inn

Rating Frequency
Poor 2
Below Average 3
Average 5
Above Average 9
Excellent 1
Total 20
Relative Frequency and
Percent Frequency Distributions
Example: Marada Inn

Rating           Relative Frequency   Percent Frequency
Poor             .10                  10
Below Average    .15                  15
Average          .25                  25
Above Average    .45                  45
Excellent        .05                  5
Total            1.00                 100

For example, the relative frequency for Excellent is 1/20 = .05, and the percent frequency for Poor is .10(100) = 10.
Bar Chart
A bar chart is a graphical device for depicting
qualitative data.
On one axis (usually the horizontal axis), we specify
the labels that are used for each of the classes.
A frequency, relative frequency, or percent frequency
scale can be used for the other axis (usually the
vertical axis).
Using a bar of fixed width drawn above each class
label, we extend the height appropriately.
The bars are separated to emphasize the fact that each
class is a separate category.
Bar Chart
[Figure: bar chart titled "Marada Inn Quality Ratings". Horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent). Vertical axis: Frequency, scaled from 1 to 10.]
Pareto Diagram

In quality control, bar charts are used to identify the most important causes of problems.
When the bars are arranged in descending order of height from left to right (with the most frequently occurring cause appearing first), the bar chart is called a Pareto diagram.
This diagram is named after Vilfredo Pareto, an Italian economist.
Pie Chart
The pie chart is a commonly used graphical device
for presenting relative frequency and percent
frequency distributions for categorical data.
First draw a circle; then use the relative frequencies
to subdivide the circle into sectors that correspond to
the relative frequency for each class.
Since there are 360 degrees in a circle, a class with a
relative frequency of .25 would consume .25(360) = 90
degrees of the circle.
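The sector-angle rule can be checked with a short calculation. This is a sketch only, using the Marada Inn relative frequencies from the earlier table; the dictionary and function names are assumptions made for illustration.

```python
# Relative frequencies from the Marada Inn example.
relative_freq = {
    "Poor": 0.10,
    "Below Average": 0.15,
    "Average": 0.25,
    "Above Average": 0.45,
    "Excellent": 0.05,
}

def sector_degrees(rel):
    """Convert each relative frequency to a pie-chart sector angle in degrees."""
    return {label: r * 360 for label, r in rel.items()}

angles = sector_degrees(relative_freq)
for label, deg in angles.items():
    print(f"{label:<14} {deg:6.1f} degrees")
```

The angles add to 360 degrees whenever the relative frequencies add to 1.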
Pie Chart
[Figure: pie chart titled "Marada Inn Quality Ratings". Sectors: Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.]
Tables and Charts for
Numerical Data
Numerical Data
Ordered Array
Stem-and-Leaf Display
Frequency Distributions and Cumulative Distributions: Histogram, Polygon, Ogive

Example: Marada Inn

Insights Gained from the Preceding Pie Chart

One-half of the customers surveyed gave Marada a quality rating of above average or excellent (looking at the left side of the pie). This might please the manager.
For each customer who gave an excellent rating, there were two customers who gave a poor rating (looking at the top of the pie). This should displease the manager.
Summarizing
Quantitative/Numerical Data
Frequency Distribution
Relative Frequency and
Percent Frequency Distributions/ordered array
Dot Plot
Histogram
Cumulative Distributions
Ogive
Organizing Numerical Data:
Ordered Array
An ordered array is a sequence of data, in rank order, from the smallest value to the largest value.
Shows range (minimum value to maximum value)
May help identify outliers (unusual observations)

Age of Surveyed College Students
Day Students:    16 17 17 18 18 18 19 19 20 20 21 22 22 25 27 32 38 42
Night Students:  18 18 19 19 20 21 23 28 32 33 41 45
Quantitative Data

Quantitative data indicate how many or how much:
discrete, if measuring how many
continuous, if measuring how much
Quantitative data are always numeric.
Ordinary arithmetic operations are meaningful for quantitative data.
Ungrouped Versus Grouped Data
Ungrouped data
have not been summarized in any way
are also called raw data
Grouped data
have been organized into a frequency
distribution
RAW DATA (OR) INDIVIDUAL SERIES
  ARRANGED (ARRAY)
  UNARRANGED (RANDOM)
GROUPED DATA
  DISCRETE SERIES
  CONTINUOUS SERIES
    INCLUSIVE
    EXCLUSIVE
    OPEN END
  CUMULATIVE FREQUENCY
    LESS THAN
    MORE THAN
  BIVARIATE DATA
Frequency Distribution
It is a tabular summary of data showing the
number of items in each of the non
overlapping classes
A table that organises data into classes or
groups of values
They divide the range of the data into classes of equal width:

Width of class interval (CI) = (Largest data value - Smallest data value) / Number of class intervals
Frequency Distribution
Table with two columns listing:
Each and every group or class or interval
of values
Associated frequency of each group
Number of observations assigned to
each group
Sum of frequencies is number of
observations
N for population
n for sample
Frequency Distribution
Class midpoint is the middle value of a
group or class or interval
Relative frequency is the proportion of total observations in each class (multiply by 100 to express it as a percentage)
Sum of relative frequencies = 1
Example of Ungrouped Data
Ages of a Sample of Managers from Urban Child Care Centers in the United States

42 26 32 34 57
30 58 37 50 30
53 40 30 47 49
50 40 32 31 40
52 28 23 35 25
30 36 32 26 50
55 30 58 64 52
49 33 43 46 32
61 31 30 40 60
74 37 29 43 54
Frequency Distribution of Child
Care Managers Ages

Class Interval Frequency


20-under 30 6
30-under 40 18
40-under 50 11
50-under 60 11
60-under 70 3
70-under 80 1
Data Range

Range = Largest - Smallest
      = 74 - 23
      = 51

Smallest value in the ages data: 23
Largest value in the ages data: 74
Number of Classes and Class Width
The number of classes should be between 5 and 15.
Fewer than 5 classes cause excessive summarization.
More than 15 classes leave too much detail.
Class Width
Divide the range by the number of classes for an
approximate class width
Round up to a convenient number

Approximate Class Width = 51 / 6 = 8.5
Class Width = 10
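The range and class-width arithmetic can be reproduced from the raw ages. This is a minimal sketch: the helper names are hypothetical, and "round up to a convenient number" is interpreted here as rounding up to the next multiple of 10, which is an assumption rather than a rule stated on the slide.

```python
import math

# Ages of the 50 child care managers (the ungrouped data above).
ages = [
    42, 26, 32, 34, 57, 30, 58, 37, 50, 30,
    53, 40, 30, 47, 49, 50, 40, 32, 31, 40,
    52, 28, 23, 35, 25, 30, 36, 32, 26, 50,
    55, 30, 58, 64, 52, 49, 33, 43, 46, 32,
    61, 31, 30, 40, 60, 74, 37, 29, 43, 54,
]

def approximate_class_width(data, num_classes):
    """Range divided by the chosen number of classes."""
    return (max(data) - min(data)) / num_classes

def round_up_to_multiple(value, step):
    """Round value up to the next multiple of step (assumed 'convenient' width)."""
    return math.ceil(value / step) * step

data_range = max(ages) - min(ages)
approx_width = approximate_class_width(ages, 6)
width = round_up_to_multiple(approx_width, 10)
print(data_range, approx_width, width)  # 51 8.5 10
```

A different step, such as 5, would be an equally defensible choice of convenient width; it would simply produce more classes for the same data.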
Relative Frequency
Relative
Class Interval Frequency Frequency
20-under 30 6 .12
30-under 40 18 .36
40-under 50 11 .22
50-under 60 11 .22
60-under 70 3 .06
70-under 80 1 .02
Total 50 1.00
LESS THAN CUMULATIVE FREQUENCY SERIES

HOURS           NO. OF WORKERS
LESS THAN 10    5
LESS THAN 30    15
LESS THAN 60    30
LESS THAN 90    50
MORE THAN CUMULATIVE FREQUENCY SERIES

PROFITS (RS. IN LAKHS) NO. OF COMPANIES

MORE THAN 100 150


MORE THAN 150 90
MORE THAN 200 40
MORE THAN 250 5
INCLUSIVE CLASS INTERVAL

CLASS INTERVAL   FREQUENCY
10 - 19          17
20 - 29          15
30 - 39          12
40 - 49          10
EXCLUSIVE CLASS INTERVAL

REVENUE (RS.)   NO. OF PRODUCTS
100 - 200       15
200 - 300       20
300 - 400       10
400 - 500       5
TOTAL           50
OPEN END CLASS INTERVAL

SALARY (RS.)      NO. OF CLERKS
LESS THAN 1500    10
1500 - 1700       25
1700 - 1900       45
1900 - 2100       11
MORE THAN 2100    9
TOTAL             100
Cumulative Frequency
Cumulative
Class Interval Frequency Frequency
20-under 30 6 6
30-under 40 18 24
40-under 50 11 35
50-under 60 11 46
60-under 70 3 49
70-under 80 1 50
Total 50
Class Midpoints, Relative Frequencies, and
Cumulative Frequencies

Relative Cumulative
Class Interval Frequency Midpoint Frequency Frequency
20-under 30 6 25 .12 6
30-under 40 18 35 .36 24
40-under 50 11 45 .22 35
50-under 60 11 55 .22 46
60-under 70 3 65 .06 49
70-under 80 1 75 .02 50
Total 50 1.00
Cumulative Relative Frequencies
Cumulative
Relative Cumulative Relative
Class Interval Frequency Frequency Frequency Frequency
20-under 30 6 .12 6 .12
30-under 40 18 .36 24 .48
40-under 50 11 .22 35 .70
50-under 60 11 .22 46 .92
60-under 70 3 .06 49 .98
70-under 80 1 .02 50 1.00
Total 50 1.00
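The columns of the preceding tables (frequency, midpoint, relative frequency, cumulative frequency, cumulative relative frequency) can all be derived from the raw ages in one pass. The sketch below assumes "20-under 30" means 20 <= age < 30 and uses illustrative variable names; it is one way to do the tally, not a prescribed method.

```python
from itertools import accumulate

# Ages of the 50 child care managers.
ages = [
    42, 26, 32, 34, 57, 30, 58, 37, 50, 30,
    53, 40, 30, 47, 49, 50, 40, 32, 31, 40,
    52, 28, 23, 35, 25, 30, 36, 32, 26, 50,
    55, 30, 58, 64, 52, 49, 33, 43, 46, 32,
    61, 31, 30, 40, 60, 74, 37, 29, 43, 54,
]

width = 10
lower_limits = list(range(20, 80, width))  # 20, 30, ..., 70
n = len(ages)

# Count the ages falling in each class [lo, lo + width).
freqs = [sum(lo <= a < lo + width for a in ages) for lo in lower_limits]
cum = list(accumulate(freqs))  # running totals of the class frequencies

rows = []
for lo, f, c in zip(lower_limits, freqs, cum):
    label = f"{lo}-under {lo + width}"
    midpoint = lo + width / 2
    rows.append((label, f, midpoint, f / n, c, c / n))

for label, f, mid, rel, c, crel in rows:
    print(f"{label:<12} {f:>3} {mid:>5.0f} {rel:5.2f} {c:>3} {crel:5.2f}")
```

The last cumulative frequency equals the number of observations, and the last cumulative relative frequency equals 1.00, which gives a quick check on the tally.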
Frequency Distribution
Example: BMW manufactures racing cars and has gathered the following information on the number of engine models in each size category used in the racing market it serves.

Engine Size (cu inches)   # of models
101 - 150                 1
151 - 200                 7
201 - 250                 7
251 - 300                 8
301 - 350                 17
351 - 400                 16
401 - 450                 15
451 - 500                 7
Frequency Distribution
- Construct a cumulative relative frequency distribution.

- 70% of engine models are larger than what size?

- What is the approx middle value in the original data set?


Common Statistical Graphs
Histogram -- vertical bar chart of frequencies
Frequency Polygon -- line graph of frequencies
Ogive -- line graph of cumulative frequencies
Pie Chart -- proportional representation for
categories of a whole
Stem and Leaf Plot
Pareto Chart
Scatter Plot Exploratory Data Analysis
Bar Chart (Illustration)
[Figure 1-11: "Shifting Gears". Bar chart of quarterly net income for General Motors (in billions), 1Q 2003 through 1Q 2004. Vertical axis scaled from 0.0 to 1.5.]
Pie Chart Calculations for Company A

Company   2d Quarter Truck Production   Proportion   Degrees
A         357,411                       .388         140
B         354,936                       .386         139
C         160,997                       .175         63
D         34,099                        .037         13
E         12,747                        .014         5
Totals    920,190                       1.000        360

For Company A: 357,411 / 920,190 = .388 and .388 × 360 = 140
PIE DIAGRAM
Complaints by Amtrak Passengers

COMPLAINT           NUMBER    PROPORTION   DEGREES
Stations, etc.      28,000    .40          144.0
Train Performance   14,700    .21          75.6
Equipment           10,500    .15          54.0
Personnel           9,800     .14          50.4
Schedules, etc.     7,000     .10          36.0
Total               70,000    1.00         360.0

[Figure: pie chart titled "Complaints by Amtrak Passengers". Sectors: Stations, etc. 40%; Train Performance 21%; Equipment 15%; Personnel 14%; Schedules, etc. 10%.]
Histogram
A histogram is a chart made of bars of
different heights.
Widths and locations of bars
correspond to widths and locations of
data groupings
Heights of bars correspond to
frequencies or relative frequencies of
data groupings
Frequency Histogram
Relative Frequency Histogram
Histogram

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Figure: frequency histogram of the ages. Horizontal axis: Years (0 to 80). Vertical axis: Frequency (0 to 20).]
Histogram Construction

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Figure: histogram under construction from the table above. Horizontal axis: Years (0 to 80). Vertical axis: Frequency (0 to 20).]
Frequency Polygon

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Figure: frequency polygon of the ages. Horizontal axis: Years (0 to 80). Vertical axis: Frequency (0 to 20).]
Ogive

Class Interval   Cumulative Frequency
20-under 30      6
30-under 40      24
40-under 50      35
50-under 60      46
60-under 70      49
70-under 80      50

[Figure: ogive of the ages. Horizontal axis: Years (0 to 80). Vertical axis: Cumulative Frequency (0 to 60).]
Relative Frequency Ogive

Class Interval   Cumulative Relative Frequency
20-under 30      .12
30-under 40      .48
40-under 50      .70
50-under 60      .92
60-under 70      .98
70-under 80      1.00

[Figure: relative frequency ogive of the ages. Horizontal axis: Years (0 to 80). Vertical axis: Cumulative Relative Frequency (0.00 to 1.00).]
Stem-and-Leaf Display

A simple way to see how the data are distributed and where concentrations of data exist

METHOD: Separate the sorted data series into leading digits (the stems) and the trailing digits (the leaves)
Organizing Numerical Data:
Stem and Leaf Display
A stem-and-leaf display organizes data into groups (called
stems) so that the values within each group (the leaves)
branch out to the right on each row.
Age of Surveyed College Students

Day Students:    16 17 17 18 18 18 19 19 20 20 21 22 22 25 27 32 38 42
Night Students:  18 18 19 19 20 21 23 28 32 33 41 45

Day Students        Night Students
Stem | Leaf         Stem | Leaf
1    | 67788899     1    | 8899
2    | 0012257      2    | 0138
3    | 28           3    | 23
4    | 2            4    | 15
Safety Examination Scores
for Plant Trainees

Raw Data
86 77 91 60 55
76 92 47 88 67
23 59 72 75 83
77 68 82 97 89
81 75 74 39 67
79 83 70 78 91
68 49 56 94 81

Stem | Leaf
2    | 3
3    | 9
4    | 79
5    | 569
6    | 07788
7    | 0245567789
8    | 11233689
9    | 11247
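The stem-and-leaf method (sort, then split each value into a leading digit and a trailing digit) can be sketched in a few lines. This assumes two-digit integer scores, so the stem is the tens digit and the leaf is the units digit; the function name is illustrative.

```python
from collections import defaultdict

# Safety examination scores for the 35 plant trainees.
scores = [
    86, 77, 91, 60, 55, 76, 92, 47, 88, 67,
    23, 59, 72, 75, 83, 77, 68, 82, 97, 89,
    81, 75, 74, 39, 67, 79, 83, 70, 78, 91,
    68, 49, 56, 94, 81,
]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (units digits) as a string."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return {s: "".join(str(d) for d in leaves) for s, leaves in sorted(stems.items())}

display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```

Sorting first guarantees the leaves on each row appear in increasing order, which is what makes concentrations of data easy to see.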
Organizing Categorical Data:
Pareto Chart
Used to portray categorical data (nominal
scale)
A vertical bar chart, where categories are
shown in descending order of frequency
A cumulative polygon is shown in the same
graph
Used to separate the vital few from the
trivial many
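The ordering and the cumulative polygon behind a Pareto chart can be tabulated before anything is drawn. The sketch below reuses the soft drink frequencies from the earlier example purely as illustrative categorical data; the function name and the choice of data are assumptions.

```python
# Category frequencies from the soft drink example.
frequencies = {"Coke": 7, "Pepsi": 1, "Mirinda": 3, "7 Up": 4}

def pareto_table(freq):
    """Return rows of (category, frequency, percent, cumulative percent),
    sorted in descending order of frequency."""
    total = sum(freq.values())
    ordered = sorted(freq.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0
    for cat, f in ordered:
        running += f
        rows.append((cat, f, 100 * f / total, 100 * running / total))
    return rows

pareto_rows = pareto_table(frequencies)
for cat, f, pct, cum_pct in pareto_rows:
    print(f"{cat:<8} {f:>3} {pct:6.1f}% {cum_pct:6.1f}%")
```

The bar heights come from the percent column and the line graph from the cumulative percent column; the first few rows are the "vital few".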
Organizing Categorical Data:
Pareto Chart

[Figure: Pareto chart titled "Pareto Chart For Banking Preference". Categories shown: In person at branch, Internet, Drive-through service at branch, ATM, Automated or live telephone. Left vertical axis: % in each category (bar graph), 0% to 100%. Right vertical axis: cumulative % (line graph), 0% to 100%.]
Pareto Chart
[Figure: Pareto chart with categories Poor Wiring, Short in Coil, Defective Plug, and Other. Left vertical axis: Frequency (0 to 100). Right vertical axis: cumulative percent (0% to 100%).]
Scatter Plot

Registered Vehicles (1000's)   Gasoline Sales (1000's of Gallons)
5                              60
15                             120
9                              90
15                             140
7                              60

[Figure: scatter plot of the data above. Horizontal axis: Registered Vehicles (0 to 20). Vertical axis: Gasoline Sales (0 to 200).]
Time Plot
[Figure: time plot titled "Monthly Steel Production". Vertical axis: Millions of Tons (5.5 to 8.5). Horizontal axis: Month, running from January of the first year through October of the third year.]
Cross Tabulations
Used to study patterns that may exist between
two or more categorical variables.

Cross tabulations can be presented in Contingency Tables
Cross Tabulations:
The Contingency Table

A cross-classification (or contingency) table presents the results of two categorical variables. The joint responses are classified so that the categories of one variable are located in the rows and the categories of the other variable are located in the columns.

The cell is the intersection of the row and column and the value in the cell represents the data corresponding to that specific pairing of row and column categories.
Cross Tabulations:
The Contingency Table

A survey was conducted to study the importance of brand name to consumers as compared to a few years ago. The results, classified by gender, were as follows:

Importance of Brand Name   Male   Female   Total
More                       450    300      750
Equal or Less              3300   3450     6750
Total                      3750   3750     7500
Scatter Plots
Scatter plots are used for numerical data consisting of paired observations taken from two numerical variables
One variable is measured on the vertical axis and the other variable is measured on the horizontal axis
Scatter plots are used to examine possible relationships between two numerical variables
Scatter Plot Example

Volume per day   Cost per day
23               125
26               140
29               146
33               160
38               167
42               170
50               188
55               195
60               200

[Figure: scatter plot titled "Cost per Day vs. Production Volume". Horizontal axis: Volume per Day (20 to 70). Vertical axis: Cost per Day (0 to 250).]
Time Series Plot

A Time Series Plot is used to study patterns in the values of a numeric variable over time

The Time Series Plot:
Numeric variable is measured on the vertical axis and the time period is measured on the horizontal axis
Time Series Plot Example

Year   Number of Franchises
1996   43
1997   54
1998   60
1999   73
2000   82
2001   95
2002   107
2003   99
2004   95

[Figure: time series plot titled "Number of Franchises, 1996-2004". Horizontal axis: Year (1994 to 2006). Vertical axis: Number of Franchises (0 to 120).]
Principles of Excellent Graphs

The graph should not distort the data.
The graph should not contain unnecessary adornments (sometimes referred to as chart junk).
The scale on the vertical axis should begin at zero.
All axes should be properly labeled.
The graph should contain a title.
The simplest possible graph should be used for a given set of data.
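Graphical Errors: Chart Junk
[Figure: two presentations of the minimum wage, one labeled "Bad Presentation" and one labeled "Good Presentation". Values shown: 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80. One panel is a plain chart with a $ axis (0 to 4) against the years 1960, 1970, 1980, 1990.]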
Graphical Errors:
No Relative Basis
[Figure: two bar charts of A's received by students, by class (FR, SO, JR, SR). Bad Presentation: vertical axis in frequencies (0 to 300). Good Presentation: vertical axis in percentages (0% to 30%).]

FR = Freshmen, SO = Sophomore, JR = Junior, SR = Senior
Graphical Errors:
Compressing the Vertical Axis
[Figure: two charts of Quarterly Sales for Q1 through Q4. Bad Presentation: $ axis scaled 0 to 200. Good Presentation: $ axis scaled 0 to 50.]
Graphical Errors: No Zero Point on the
Vertical Axis
[Figure: charts of Monthly Sales for the months J, F, M, A, M, J. Bad Presentation: $ axis runs from 36 to 45 with no zero point. Good Presentations: $ axis shows 0 as well as 36, 39, 42, 45.]

Graphing the first six months of sales
Chapter Summary
In this chapter, we have

Organized categorical data using the summary table, bar chart, pie chart, and Pareto chart.
Organized numerical data using the ordered array, stem-and-leaf display, frequency distribution, histogram, polygon, and ogive.
Examined cross tabulated data using the contingency table.
Developed scatter plots and time series graphs.
Examined the dos and don'ts of graphically displaying data.
Cross Tabulation
Understanding the relationship between 2 variables
Example: Quality rating and meal price at 12 restaurants

#    Rating      Price
1    Good        18
2    Very Good   22
3    Good        28
4    Excellent   38
5    Good        33
6    Very Good   28
7    Excellent   19
8    Very Good   11
9    Good        23
10   Very Good   13
11   Excellent   18
12   Excellent   33
Cross Tabulation
One variable is qualitative (Rating) and the other quantitative (Price)
Row % included

                      Price
Rating       10 - 19   20 - 29   30 - 39   Total
Good         1         2         1         4
             25%       50%       25%       100%
Very Good    2         2         0         4
             50%       50%       0%        100%
Excellent    2         0         2         4
             50%       0%        50%       100%
Total        5         4         3         12
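The cross tabulation with row percentages can be rebuilt from the restaurant list. This is a minimal sketch assuming price classes of width 10 labeled like "10-19"; the helper `price_class` and the variable names are hypothetical.

```python
from collections import Counter

# (rating, meal price) for the 12 restaurants in the example.
meals = [
    ("Good", 18), ("Very Good", 22), ("Good", 28), ("Excellent", 38),
    ("Good", 33), ("Very Good", 28), ("Excellent", 19), ("Very Good", 11),
    ("Good", 23), ("Very Good", 13), ("Excellent", 18), ("Excellent", 33),
]

def price_class(price):
    """Assign a price to its class of width 10, e.g. 18 -> '10-19'."""
    lo = (price // 10) * 10
    return f"{lo}-{lo + 9}"

# Count each (rating, price class) pairing; missing pairings count as 0.
counts = Counter((rating, price_class(price)) for rating, price in meals)

ratings = ["Good", "Very Good", "Excellent"]
classes = ["10-19", "20-29", "30-39"]

for rating in ratings:
    row = [counts[(rating, c)] for c in classes]
    total = sum(row)
    row_pct = [100 * x / total for x in row]  # row percentages
    print(rating, row, total, [f"{p:.0f}%" for p in row_pct])

column_totals = [sum(counts[(r, c)] for r in ratings) for c in classes]
print("Total", column_totals, sum(column_totals))
```

Row percentages divide each cell by its row total, so each row sums to 100%; dividing by the column totals instead would answer a different question (how ratings are distributed within a price class).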
Cross Tabulation
Problem: In a study of job satisfaction for 4 occupations, higher scores indicate higher satisfaction.
Provide a cross tabulation of occupation and satisfaction score.

Lawyer        44     Comp Analyst  54     Lawyer        53
Doctor        80     Lawyer        42     Physiatrist   48
Lawyer        62     Physiatrist   59     Doctor        62
Physiatrist   55     Doctor        79     Lawyer        86
Lawyer        64     Physiatrist   76     Comp Analyst  79
Comp Analyst  73     Doctor        50     Physiatrist   60
Physiatrist   86     Comp Analyst  86     Doctor        52
Lawyer        71     Comp Analyst  50     Lawyer        79
Doctor        78     Physiatrist   76     Comp Analyst  69


TWO WAY FREQUENCY SERIES/BIVARIATE SERIES

CLASS INTERVAL   0 - 5   5 - 10   10 - 15   15 - 20
0 - 10           1       -        2         -
10 - 20          4       3        -         -
20 - 30          -       -        1         -
30 - 40          2       -        1         -
Tabular and Graphical Methods
Data
  Categorical Data
    Tabular Methods: Frequency Distribution, Rel. Freq. Dist., Percent Freq. Distribution, Cross tabulation
    Graphical Methods: Bar Chart, Pie Chart
  Quantitative Data
    Tabular Methods: Frequency Distribution, Rel. Freq. Dist., % Freq. Dist., Cum. Freq. Dist., Cum. Rel. Freq. Distribution, Cum. % Freq. Distribution, Cross tabulation
    Graphical Methods: Histogram, Ogive, Stem-and-Leaf Display, Scatter Diagram
Methods of Summarizing Data
Tabular Graphical
Presentation Presentation
Frequency Distribution Dot Plot, Line Chart
Relative Frequency Distribution Histogram
Percent Frequency Distribution Bar diagram, Pie
Cumulative Frequency Distribution Ogive, Freq polygon
Cum Relative Frequency Distribution Freq Curve
Cum Percent Frequency Distribution Stem & Leaf Display
Cross tabulation Scatter Diagram
