Вы находитесь на странице: 1из 50

Statistics for Business and Economics

6th Edition

Chapter 2

Describing Data: Graphical

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-1

Chapter Goals
After completing this chapter, you should be able to:

Identify types of data and levels of measurement Create and interpret graphs to describe categorical variables: frequency distribution, bar chart, pie chart, Pareto diagram Create a line chart to describe time-series data Create and interpret graphs to describe numerical variables: frequency distribution, histogram, ogive, stem-and-leaf display Construct and interpret graphs to describe relationships between variables: Scatter plot, cross table Describe appropriate and inappropriate ways to display data graphically

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-2

Types of Data
Data

Categorical
Examples:

Numerical

Marital Status Are you registered to vote? Eye Color (Defined categories or groups)

Discrete
Examples:

Continuous
Examples:

Number of Children Defects per hour (Counted items)

Weight Voltage (Measured characteristics)

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-3

Measurement Levels
Differences between measurements, true zero exists Differences between measurements but no true zero

Ratio Data
Quantitative Data

Interval Data

Ordered Categories (rankings, order, or scaling)

Ordinal Data
Qualitative Data

Categories (no ordering or direction)

Nominal Data
Chap 2-4

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Graphical Presentation of Data

Data in raw form are usually not easy to use for decision making
Some type of organization is needed Table Graph The type of graph to use depends on the variable being summarized

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-5

Graphical Presentation of Data


(continued)

Techniques reviewed in this chapter:


Categorical Variables Numerical Variables Line chart Frequency distribution Histogram and ogive Stem-and-leaf display Scatter plot

Frequency distribution Bar chart Pie chart Pareto diagram

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-6

Tables and Graphs for Categorical Variables


Categorical Data

Tabulating Data Frequency Distribution Table

Graphing Data

Bar Chart

Pie Chart

Pareto Diagram

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-7

The Frequency Distribution Table


Summarize data by category Example: Hospital Patients by Unit
Hospital Unit Cardiac Care Emergency Intensive Care Maternity Surgery
(Variables are categorical)
Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 2-8

Number of Patients 1,052 2,245 340 552 4,630

Bar and Pie Charts

Bar charts and Pie charts are often used for qualitative (category) data
Height of bar or size of pie slice shows the frequency or percentage for each category

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-9

Bar Chart Example


Hospital Unit Cardiac Care Emergency Intensive Care Maternity Surgery Number of Patients 1,052 2,245 340 552 4,630
Hospital Patients by Unit
5000

Number of patients per year

4000 3000 2000 1000 0


Cardiac Care Emergency Maternity Intensive Care Surgery
Chap 2-10

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Pie Chart Example


Hospital Unit Cardiac Care Emergency Intensive Care Maternity Surgery Total Number of Patients 1,052 2,245 340 552 4,630 8,819 % of Total 11.93 25.46 3.86 6.26 52.50 100.00
Surgery 53%

Hospital Patients by Unit


Cardiac Care 12%

Emergency 25%

(Percentages are rounded to the nearest percent)


Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Intensive Care 4% Maternity 6%

Chap 2-11

Pareto Diagram

Used to portray categorical data A bar chart, where categories are shown in descending order of frequency A cumulative polygon is often shown in the same graph

Used to separate the vital few from the trivial many

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-12

Pareto Diagram Example


Example: 400 defective items are examined for cause of defect:
Source of Manufacturing Error Bad Weld Poor Alignment Missing Part Paint Flaw Electrical Short Cracked case Total
Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Number of defects 34 223 25 78 19 21 400


Chap 2-13

Pareto Diagram Example


(continued)

Step 1: Sort by defect cause, in descending order


Step 2: Determine % in each category
Source of Manufacturing Error Poor Alignment Paint Flaw Bad Weld Missing Part Cracked case Electrical Short Total Number of defects 223 78 34 25 21 19 400 % of Total Defects 55.75 19.50 8.50 6.25 5.25 4.75 100%
Chap 2-14

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Pareto Diagram Example


(continued)

Step 3: Show results graphically


Pareto Diagram: Cause of Manufacturing Defect

% of defects in each category (bar graph)

60%

100%

90%

cumulative % (line graph)

50% 80%

70% 40% 60%

30%

50%

40% 20% 30%

20% 10% 10%

0%

0%

Poor Alignment

Paint Flaw

Bad Weld

Missing Part

Cracked case

Electrical Short

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-15

Graphs for Time-Series Data

A line chart (time-series plot) is used to show the values of a variable over time

Time is measured on the horizontal axis


The variable of interest is measured on the vertical axis

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-16

Line Chart Example


Magazine Subscriptions by Year
350 300

Thousands of subscribers

250 200 150 100 50 0

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

1990

1991

1992

1993

1994

1995

1996

1997

1998

1999

2000

2001

2002

2003

2004

2005

2006
Chap 2-17

Graphs to Describe Numerical Variables


Numerical Data

Frequency Distributions and Cumulative Distributions

Stem-and-Leaf Display

Histogram

Ogive

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-18

Frequency Distributions
What is a Frequency Distribution?

A frequency distribution is a list or a table containing class groupings (categories or ranges within which the data fall) ... and the corresponding frequencies with which data fall within each class or category

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-19

Why Use Frequency Distributions?

A frequency distribution is a way to summarize data

The distribution condenses the raw data into a more useful form...
and allows for a quick visual interpretation of the data

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-20

Class Intervals and Class Boundaries


Each class grouping has the same width Determine the width of each interval by
largest number smallest number w interval width number of desired intervals

Use at least 5 but no more than 15-20 intervals Intervals never overlap Round up the interval width to get desirable interval endpoints
Chap 2-21

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Frequency Distribution Example


Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-22

Frequency Distribution Example


(continued)

Sort raw data in ascending order:


12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Find range: 58 - 12 = 46

Select number of classes: 5 (usually between 5 and 15)


Compute interval width: 10
less than 30, . . . , 60 but less than 70 (46/5 then round up)

Determine interval boundaries: 10 but less than 20, 20 but Count observations & assign to classes

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-23

Frequency Distribution Example


(continued)

Data in ordered array:


12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Interval

Frequency

Relative Frequency

Percentage

10 but less than 20 20 but less than 30 30 but less than 40 40 but less than 50 50 but less than 60 Total

3 6 5 4 2 20

.15 .30 .25 .20 .10 1.00

15 30 25 20 10 100
Chap 2-24

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Histogram

A graph of the data in a frequency distribution is called a histogram


The interval endpoints are shown on the horizontal axis the vertical axis is either frequency, relative frequency, or percentage

Bars of the appropriate heights are used to represent the number of observations within each class
Chap 2-25

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Histogram Example
Interval 10 but less than 20 20 but less than 30 30 but less than 40 40 but less than 50 50 but less than 60 Frequency

Histogram : Daily High Tem perature


3 6 5 4 2

7 6 5 4 3 2 1 0 3

6 5 4 2 0

Frequency

(No gaps between bars)

0
0 0 10 10 2020 30 30 40 40 50 50 60 60 70 Temperature in Degrees
Chap 2-26

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Histograms in Excel

1 Select Tools/Data Analysis

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-27

Histograms in Excel
(continued)

2
Choose Histogram

(
Input data range and bin range (bin range is a cell 3
range containing the upper interval endpoints for each class grouping)

Select Chart Output and click OK


Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 2-28

Questions for Grouping Data into Intervals

1. How wide should each interval be?


(How many classes should be used?)

2. How should the endpoints of the intervals be determined?

Often answered by trial and error, subject to user judgment The goal is to create a distribution that is neither too "jagged" nor too "blocky Goal is to appropriately show the pattern of variation in the data
Chap 2-29

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

How Many Class Intervals?

Many (Narrow class intervals)

3.5 3
Frequency

may yield a very jagged distribution with gaps from empty classes Can give a poor indication of how frequency varies across classes

2.5 2 1.5 1 0.5 0


4 8 12 16 20 24 28 32 36 40 44 48 52 56 60 More

Temperature

Few (Wide class intervals)


Frequency

12 10 8 6 4 2 0 0 30 60 More Temperature

may compress variation too much and yield a blocky distribution can obscure important patterns of variation.

(X axis labels are upper class endpoints)

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-30

The Cumulative Frequency Distribuiton


Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class 10 but less than 20 20 but less than 30 30 but less than 40

Frequency Percentage 3 6 5 15 30 25

Cumulative Cumulative Frequency Percentage 3 9 14 15 45 70

40 but less than 50


50 but less than 60 Total

4
2 20

20
10 100

18
20

90
100

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-31

The Ogive
Graphing Cumulative Frequencies
Interval Less than 10 10 but less than 20 20 but less than 30 30 but less than 40 40 but less than 50 50 but less than 60 Upper interval Cumulative endpoint Percentage 10 20 30 40 50 60 0 15 45 70 90 100

Ogive: Daily High Temperature


Cumulative Percentage

100 80 60 40 20 0 10 20 30 40 50 60

Interval endpoints
Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 2-32

Distribution Shape

The shape of the distribution is said to be symmetric if the observations are balanced, or evenly distributed, about the center.
Symmetric Distribution
10 9 8 7 6 5 4 3 2 1 0 1 2 3 4 5 6 7 8 9

Frequency

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-33

Distribution Shape
(continued)

The shape of the distribution is said to be skewed if the observations are not symmetrically distributed around the center.
Positively Skewed Distribution
12 10

A positively skewed distribution (skewed to the right) has a tail that extends to the right in the direction of positive values.

Frequency

8 6 4 2 0 1 2 3 4 5 6 7 8 9

A negatively skewed distribution (skewed to the left) has a tail that extends to the left in the direction of negative values.
Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Negatively Skewed Distribution


12 10
Frequency

8 6 4 2 0 1 2 3 4 5 6 7 8 9

Chap 2-34

Stem-and-Leaf Diagram

A simple way to see distribution details in a data set METHOD: Separate the sorted data series into leading digits (the stem) and the trailing digits (the leaves)

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-35

Example
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Here, use the 10s digit for the stem unit:


Stem Leaf

21 is shown as 38 is shown as

2 3

1 8

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-36

Example
(continued)

Data in ordered array:


21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Completed stem-and-leaf diagram:


Stem Leaves

2 3 4

1 4 4 6 7 7 0 2 8 1

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-37

Using other stem units

Using the 100s digit as the stem:

Round off the 10s digit to form the leaves


Stem

Leaf

613 would become


776 would become ... 1224 becomes

6
7 12

1
8 2

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-38

Using other stem units


(continued)

Using the 100s digit as the stem:

The completed stem-and-leaf display:


Data:
613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891, 894, 906, 928, 933, 955, 982, 1034, 1047,1056, 1140, 1169, 1224 Stem 6 7 8 Leaves 136 2258 346699

9
10 11 12

13368
356 47 2
Chap 2-39

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Relationships Between Variables

Graphs illustrated so far have involved only a single variable When two variables exist other techniques are used:
Categorical (Qualitative) Variables Cross tables Numerical (Quantitative) Variables Scatter plots

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-40

Scatter Diagrams

Scatter Diagrams are used for paired observations taken from two numerical variables
The Scatter Diagram: one variable is measured on the vertical axis and the other variable is measured on the horizontal axis

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-41

Scatter Diagram Example


Volume per day Cost per day

Cost per Day vs. Production Volume


250
Cost per Day

23
26 29 33 38 42 50

125
140 146 160 167 170 188

200 150 100 50 0 0 10 20 30 40 50 60 70 Volume per Day

55
60

195
200

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-42

Scatter Diagrams in Excel


1
Select the chart wizard

2
Select XY(Scatter) option, then click Next

3
When prompted, enter the data range, desired legend, and desired destination to complete the scatter diagram
Statistics for Business and Economics, 6e 2007 Pearson Education, Inc. Chap 2-43

Cross Tables

Cross Tables (or contingency tables) list the number of observations for every combination of values for two categorical or ordinal variables
If there are r categories for the first variable (rows) and c categories for the second variable (columns), the table is called an r x c cross table

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-44

Cross Table Example

4 x 3 Cross Table for Investment Choices by Investor (values in $1000s)


Investor A Investor B Investor C Total

Investment Category

Stocks
Bonds CD Savings Total

46.5
32.0 15.5 16.0 110.0

55
44 20 28 147

27.5
19.0 13.5 7.0 67.0

129
95 49 51 324

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-45

Graphing Multivariate Categorical Data


(continued)

Side by side bar charts


C o m p arin g In vesto rs
S a vin g s CD B onds S toc k s 0 10 In ve s t o r A 20 30 In ve s t o r B 40 50 In ve s t o r C 60

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-46

Side-by-Side Chart Example

Sales by quarter for three sales territories:


East West North 1st Qtr 2nd Qtr 3rd Qtr 4th Qtr 20.4 27.4 59 20.4 30.6 38.6 34.6 31.6 45.9 46.9 45 43.9

60 50 40 30 20 10 0 1st Qtr 2nd Qtr 3rd Qtr 4th Qtr


Chap 2-47

Eas West North

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Data Presentation Errors


Goals for effective data presentation:

Present data to display essential information Communicate complex ideas clearly and accurately

Avoid distortion that might convey the wrong

message

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chap 2-48

Data Presentation Errors


(continued)

Unequal histogram interval widths Compressing or distorting the vertical axis Providing no zero point on the vertical axis Failing to provide a relative basis in comparing data between groups
Chap 2-49

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.

Chapter Summary

Reviewed types of data and measurement levels Data in raw form are usually not easy to use for decision making -- Some type of organization is needed:
Table Graph

Techniques reviewed in this chapter:


Frequency distribution Bar chart Pie chart Pareto diagram

Line chart Frequency distribution Histogram and ogive Stem-and-leaf display Scatter plot Cross tables and side-by-side bar charts
Chap 2-50

Statistics for Business and Economics, 6e 2007 Pearson Education, Inc.