
Summarizing and Graphing

Chapter 2
Outline
2-1 Review and Preview
2-2 Frequency Distributions
2-3 Histograms
2-4 Graphs That Enlighten and Graphs That Deceive

Spring 2017 Math 115 / Statistics Chapter 2 / Page 2


Introduction
IQ data from children who lived near a
lead smelter, grouped by the amount of
lead found in their blood
Low, medium and high levels of lead
Low (78 children)



Review and Preview
Characteristics of Data

Center
A representative value that indicates where the
middle of the data set is located

Variation
A measure of the amount that the data values
vary
Review and Preview
Characteristics of Data
Distribution
The nature or shape of the spread of data over the
range of values
Bell-shaped, uniform, or skewed
Outliers
Sample values that lie very far away from the vast
majority of other sample values
Time
Changing characteristics of the data over time
Review and Preview
Time
In the US 100 years ago
8% of homes had a telephone
14% of homes had a bathtub
The mean life expectancy was 47 years
The mean hourly wage was 22 cents
There were approximately 230 murders in the entire
US
Statistical analysis should always consider
changing population characteristics
Review and Preview
Be careful of what you believe
Why do you believe what you believe?
Sources?
Legitimate?
Questionable?
Agenda?
https://www.ted.com/talks/hans_and_ola_rosling_how_not_to_be_ignorant_about_the_world
Frequency Distributions
When working with large data sets
It is often helpful to organize and summarize
data by constructing a table called a
frequency distribution
Because computer software and calculators
can generate frequency distributions
The details of constructing them are not as
important as what they tell us about data sets
Still need to know how to derive them!!
Frequency Distributions
Frequency Distribution (or Frequency
Table)
Shows how a data set is partitioned among
all of several categories (or classes) by
listing all of the categories along with the
number (frequency) of data values in each
of them



Frequency Distributions
To organize quantitative data
We group the observations into classes
Classes
Categories or bins for grouping data

Once in classes, we can construct


Frequency distributions
Relative-frequency distributions



Frequency Distributions
Reasons for Constructing Frequency
Distributions
1. Large data sets can be summarized
2. We can analyze the nature of data
3. We have a basis for constructing
important graphs



Frequency Distributions
Lower Class Limits
The smallest numbers that can belong to a
class
IQ Score   Frequency
50-69          2
70-89         33
90-109        35
110-129        7
130-149        1
Lower class limits: 50, 70, 90, 110, 130
Frequency Distributions
Upper Class Limits
The largest numbers that can belong to a
class
IQ Score   Frequency
50-69          2
70-89         33
90-109        35
110-129        7
130-149        1
Upper class limits: 69, 89, 109, 129, 149
Frequency Distributions
Class Boundaries
The numbers used to separate classes, but
without the gaps created by class limits
IQ Score   Frequency
50-69          2
70-89         33
90-109        35
110-129        7
130-149        1
Class boundaries: 49.5, 69.5, 89.5, 109.5, 129.5, 149.5
Frequency Distributions
Class Midpoints
The values in the middle of the classes
Found by averaging the lower and upper class limits
Add the class upper limit to the class lower limit and divide the sum by 2
IQ Score   Frequency   Class Midpoint
50-69          2       (50 + 69) / 2 = 119 / 2 = 59.5
70-89         33       (70 + 89) / 2 = 159 / 2 = 79.5
90-109        35       (90 + 109) / 2 = 199 / 2 = 99.5
110-129        7       (110 + 129) / 2 = 239 / 2 = 119.5
130-149        1       (130 + 149) / 2 = 279 / 2 = 139.5
Frequency Distributions
Class Width
The difference between two consecutive
lower class limits or two consecutive lower
class boundaries
IQ Score   Frequency   Class Width
50-69          2       70 - 50 = 20
70-89         33       90 - 70 = 20
90-109        35       110 - 90 = 20
110-129        7       130 - 110 = 20
130-149        1       150 - 130 = 20
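The definitions of class width, class midpoints, and class boundaries can be checked with a short sketch. It assumes only the class limits from the IQ table; the variable names are illustrative.

```python
# Class limits taken from the IQ frequency table.
lower_limits = [50, 70, 90, 110, 130]
upper_limits = [69, 89, 109, 129, 149]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Class midpoints: average of each class's lower and upper limit.
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

# Class boundaries: split the gap between one class's upper limit and the
# next class's lower limit, and extend by the same half-gap at both ends.
half_gap = (lower_limits[1] - upper_limits[0]) / 2
boundaries = [lo - half_gap for lo in lower_limits] + [upper_limits[-1] + half_gap]

print("Class width:", class_width)
print("Midpoints:", midpoints)
print("Boundaries:", boundaries)
```

Note that the width comes from consecutive lower limits, not from upper limit minus lower limit within one class (69 - 50 would give 19).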
Frequency Distributions
Constructing A Frequency Distribution
1. Determine the number of classes
Should be between 5 and 20
We choose 5 (could be 10)
2. Calculate the class width
class width ≈ (maximum value - minimum value) / (number of classes)
(141 - 50) / 5 = 91 / 5 = 18.2
Round up: choose 20
Frequency Distributions
Constructing A Frequency Distribution

3. Starting point
Choose the minimum data value
Or a convenient value below it as the first lower class limit
Choose 50
4. Using the first lower class limit and class
width
Proceed to list the other lower class limits
50, 70, 90, 110 and 130

Frequency Distributions
Constructing A Frequency Distribution

5. List the lower class limits in a vertical
column and proceed to enter the upper
class limits
69, 89, 109, 129 and 149
6. Take each individual data value and put a
tally mark in the appropriate class
Add the tally marks to get the frequency

Frequency Distributions
Constructing A Frequency Distribution

6. Take each individual data value and put a
tally mark in the appropriate class
Add the tally marks to get the frequency
50-69: 2
70-89: 33
90-109: 35
110-129: 7
130-149: 1
Total = 78
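The six construction steps can be sketched in Python. This is a minimal illustration, assuming the 78 low-lead IQ scores from the sorted list given later in this chapter and assuming that "round up" means rounding to the next multiple of 10; the variable names are not from the slides.

```python
import math

# The 78 low-lead IQ scores (sorted list from the "Comparing 2 data sets" slides).
iq_scores = [
    50, 56, 70, 72, 73, 74, 75, 76, 76, 76, 76, 76, 77, 77, 78, 80,
    80, 80, 84, 85, 85, 85, 85, 86, 86, 86, 86, 87, 87, 88, 88, 88,
    89, 89, 89, 91, 92, 93, 94, 94, 94, 95, 96, 96, 96, 96, 96, 96,
    96, 97, 97, 98, 99, 99, 99, 99, 100, 101, 101, 102, 104, 104, 105, 105,
    106, 107, 107, 107, 107, 108, 111, 115, 115, 118, 120, 125, 128, 141,
]

# Step 1: number of classes.
num_classes = 5

# Step 2: class width, rounded up (assumption: to the next multiple of 10).
raw_width = (max(iq_scores) - min(iq_scores)) / num_classes  # (141 - 50) / 5 = 18.2
class_width = math.ceil(raw_width / 10) * 10

# Step 3: starting point (the minimum value, which is already convenient).
first_lower = 50

# Steps 4 and 5: lower and upper class limits (integer data, so upper = next lower - 1).
lower_limits = [first_lower + i * class_width for i in range(num_classes)]
upper_limits = [lo + class_width - 1 for lo in lower_limits]

# Step 6: tally each value into its class.
frequencies = [0] * num_classes
for score in iq_scores:
    frequencies[(score - first_lower) // class_width] += 1

for lo, hi, f in zip(lower_limits, upper_limits, frequencies):
    print(f"{lo}-{hi}: {f}")
print("Total =", sum(frequencies))
```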
Frequency Distributions
Relative Frequency
The ratio of the frequency of a
class to the total number of observations
Relative Frequency = Frequency / Total Number of Observations



Frequency Distributions
Relative Frequency Distribution
Includes the same class limits as a
frequency distribution
But the frequency of a class is replaced with
relative frequencies
A proportion or a percentage frequency (a percent)
relative frequency = (class frequency) / (sum of all frequencies)



Frequency Distributions
Relative Frequency Distribution
A listing of the distinct values and their relative
frequencies
Provides a table of the values of the observations
and (relatively) how often they occur
To obtain a relative-frequency distribution
We find a frequency distribution
Then divide each frequency by the total number of
observations
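Dividing each frequency by the total can be sketched as follows. The frequencies are those of the IQ table; the dictionary layout is just one convenient choice.

```python
# Frequencies from the IQ frequency distribution.
frequencies = {"50-69": 2, "70-89": 33, "90-109": 35, "110-129": 7, "130-149": 1}

# Total number of observations (the sum of all frequencies).
total = sum(frequencies.values())

# Relative frequency of each class = class frequency / total.
relative = {label: f / total for label, f in frequencies.items()}

for label, r in relative.items():
    print(f"{label:>8}  {r:.4f}  {r:.1%}")
```

The relative frequencies always add up to 1 (or 100%), apart from rounding in the displayed digits.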



Frequency Distributions
Relative Frequency Distribution
IQ Score   Frequency   Relative Frequency
50-69          2            2.6%
70-89         33           42.3%
90-109        35           44.9%
110-129        7            9.0%
130-149        1            1.3%


Frequency Distributions
Example
Days to Maturity for Short-Term Investments
Organize this data into frequency and relative-frequency distributions



36 38 39 47 50 51 51 53 55 55
56 57 60 62 63 64 64 65 66 67
68 69 70 70 70 71 75 78 79 80
81 83 85 86 87 89 95 98 99 99

Frequency Distributions
Days to Maturity for Short-Term Investments

Frequency Distribution
Columns 1 and 3
Relative-frequency Distribution
Columns 1 and 4



36 38 39 47 50 51 51 53 55 55
56 57 60 62 63 64 64 65 66 67
68 69 70 70 70 71 75 78 79 80
81 83 85 86 87 89 95 98 99 99

Frequency Distributions
Days to     Tally        Number of     Frequency   Relative     Relative    Relative
maturity                 investments               frequency    frequency   frequency
                                                   (fraction)   (decimal)   (percentage)
30-39       III               3            3        3 / 40       0.0750        7.50%
40-49       I                 1            1        1 / 40       0.0250        2.50%
50-59       IIIIIIII          8            8        8 / 40       0.2000       20.00%
60-69       IIIIIIIIII       10           10       10 / 40       0.2500       25.00%
70-79       IIIIIII           7            7        7 / 40       0.1750       17.50%
80-89       IIIIIII           7            7        7 / 40       0.1750       17.50%
90-99       IIII              4            4        4 / 40       0.1000       10.00%
Total                        40           40       40 / 40 = 1   1.0000      100.00%



Frequency Distributions
Example
Weights of 18- to 24-Year-Old Males
Organize this data into frequency and relative-frequency distributions
First lower class limit: 120
Class width of 20
129.2 185.3 218.1 182.5 142.8
155.2 170.0 151.3 187.5 145.6
167.3 161.0 178.7 165.0 172.5
191.1 150.7 187.0 173.7 178.2
161.7 170.1 165.8 214.6 136.7
278.8 175.6 188.7 132.1 158.5
146.4 209.1 175.4 182.0 173.6
149.9 158.6
Frequency Distributions
Weights of 18- to 24-Year-Old Males
Weight (lb)       Frequency   Relative Frequency
120 ≤ W < 140         3            0.081
140 ≤ W < 160         9            0.243
160 ≤ W < 180        14            0.378
180 ≤ W < 200         7            0.189
200 ≤ W < 220         3            0.081
220 ≤ W < 240         0            0.000
240 ≤ W < 260         0            0.000
260 ≤ W < 280         1            0.027
Total                37            1.000
Frequency Distribution
Columns 1 and 2
Relative-frequency Distribution
Columns 1 and 3
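For data with decimals, each class includes its lower cutpoint and excludes its upper one. The sketch below groups the 37 weights that way, using the stated first lower class limit of 120 and class width of 20; the choice of eight classes is an assumption made so the largest value (278.8) has a class.

```python
# Weights (lb) of the 37 males aged 18 to 24 from the example.
weights = [
    129.2, 185.3, 218.1, 182.5, 142.8, 155.2, 170.0, 151.3, 187.5, 145.6,
    167.3, 161.0, 178.7, 165.0, 172.5, 191.1, 150.7, 187.0, 173.7, 178.2,
    161.7, 170.1, 165.8, 214.6, 136.7, 278.8, 175.6, 188.7, 132.1, 158.5,
    146.4, 209.1, 175.4, 182.0, 173.6, 149.9, 158.6,
]

first_lower = 120
class_width = 20
num_classes = 8  # assumption: enough classes to reach 280

# A weight w belongs to class i when
# first_lower + i * width <= w < first_lower + (i + 1) * width.
frequencies = [0] * num_classes
for w in weights:
    frequencies[int((w - first_lower) // class_width)] += 1

total = len(weights)
for i, f in enumerate(frequencies):
    lo = first_lower + i * class_width
    print(f"{lo} <= W < {lo + class_width}: {f:2d}  {f / total:.3f}")
```

The two empty classes (220 to 240 and 240 to 260) show how far 278.8 sits from the rest of the data.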



Frequency Distributions
Recall
Frequency
The number of observations that fall in a class
Count
Frequency Distribution
A listing of all classes and their frequencies
Single Value Class
Each class represents a single possible value
Usually a small number of distinct values
The X (horizontal) axis does not have ranges
Only single values



Frequency Distributions
Example
TVs per Household
Organize this data
into frequency and
relative-frequency
distributions



Frequency Distributions
TVs per Household

Frequency Distribution
Columns 1 and 2
Relative-frequency Distribution
Columns 1 and 3



Frequency Distributions
Gaps
The presence of gaps can show that we
have data from two or more different
populations
However, the converse is not true
Because data from different populations do not
necessarily result in gaps



Frequency Distributions
Gaps
Example
Weights of
Pennies
Frequency
distribution of
randomly
selected
pennies



Frequency Distributions
Gaps
Example
Pennies made before 1983 are 95% copper
and 5% zinc
Pennies made after 1983 are 2.5% copper and
97.5% zinc
Conclusion
The presence of gaps can suggest the data are
from two or more different populations
Frequency Distributions
Frequency distributions - qualitative data
One way of organizing qualitative data is to
construct a table that gives the number of times
each distinct value occurs
The number of times each value occurs is its frequency
(or count)
Frequency Distribution
Is a listing of the distinct values
And their frequencies
Provides a table of the values of the observations
And how often they occur



Frequency Distributions
To Construct a Frequency Distribution
1. List the distinct values of the observations
in the first column of a table
2. For each observation
Place a tally mark in the second column in the
row of the appropriate distinct value
Cross out the observation
3. Count the tallies for each distinct value
Record the totals in the third column
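The tally procedure for qualitative data maps directly onto Python's `collections.Counter`. The raw survey responses are not reproduced in these notes, so the list below is an assumption: it is rebuilt from the totals quoted in the political party example (13 Democrats, 18 Republicans, 9 Other), in arbitrary order.

```python
from collections import Counter

# Assumption: stand-in for the raw responses, rebuilt from the quoted totals.
responses = ["Democrat"] * 13 + ["Republican"] * 18 + ["Other"] * 9

# Counter tallies how many times each distinct value occurs.
counts = Counter(responses)
total = sum(counts.values())

print(f"{'Party':<12}{'Frequency':>10}{'Relative':>10}")
for party, f in counts.items():
    print(f"{party:<12}{f:>10}{f / total:>10.3f}")
print(f"{'Total':<12}{total:>10}{1:>10}")
```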
Frequency Distributions
Example
Political Party Affiliation
Construct a Frequency Distribution



Frequency Distributions
Political Party Affiliation
1. List the distinct values of the observations
in the first column of a table



Frequency Distributions
Political Party Affiliation
2. For each observation
Place a tally mark in the second column in the
row of the appropriate distinct value



Frequency Distributions
Political Party Affiliation
3. Count the tallies for each distinct value
Record the totals in the third column



Frequency Distributions
Political Party Affiliation
Interpretation
40 students are in the class
13 are Democrats, 18 are Republicans and 9 are
Other

By looking at frequency distributions


We can obtain useful information
Example: More students are Republican than
any other political party affiliation
Frequency Distributions
Example
Political Party Affiliation
Determine the
Relative-Frequency Distribution
1. Obtain a frequency distribution of the data
From the first and third columns
We have the distinct values and their frequencies



Frequency Distributions
Political Party Affiliation
2. Divide each frequency by
the total number of observations



Frequency Distributions
Political Party Affiliation
Interpretation
32.5% are Democrats
45.0% are Republicans
22.5% are Other
Relative-frequency distributions are better
than frequency distributions
For comparing two (or more) data sets
Why?
Frequency Distributions
Cumulative Frequency Distribution
IQ Score   Frequency   Cumulative Frequency
50-69          2               2
70-89         33              35
90-109        35              70
110-129        7              77
130-149        1              78
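Each cumulative frequency is the running total of the frequencies up to and including that class. A minimal sketch using the standard library's `itertools.accumulate`:

```python
from itertools import accumulate

classes = ["50-69", "70-89", "90-109", "110-129", "130-149"]
frequencies = [2, 33, 35, 7, 1]

# Running totals: 2, 2 + 33, 2 + 33 + 35, ...
cumulative = list(accumulate(frequencies))

for label, f, c in zip(classes, frequencies, cumulative):
    print(f"{label:>8}  {f:3d}  {c:3d}")
```

The last cumulative frequency always equals the total number of observations.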



Histograms
A histogram is
A visual tool that helps analyze the shape of a
distribution of data
A graph consisting of bars of equal width
drawn adjacent to each other
Unless there are gaps in the data
The horizontal scale represents the classes of
quantitative data values and the vertical scale
represents the frequencies
Histograms
Histogram
The heights of the bars correspond to the
frequency values
The frequency of each class is represented by a
vertical bar whose height is equal to the frequency
of the class
A histogram is a graph of a frequency
distribution
A graph of the values of the observations and how
often they occur
Histograms
Relative-frequency Histogram
A graph that displays the classes on the
horizontal axis and the relative
frequencies of the classes on the vertical
axis
The relative frequency of each class is
represented by a vertical bar whose height
is equal to the relative frequency of the
class



Histograms
Relative-frequency Histogram
The only difference is the height of each
bar (the Y or vertical value)
The height of each bar is equal to the
relative frequency of each class instead of
the frequency of each class
The general shapes are the same because
the frequencies and the relative
frequencies are proportional
Histograms
Example - Histogram
IQ Score Frequency

50-69 2
70-89 33
90-109 35
110-129 7
130-149 1
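As a rough stand-in for the plotted histogram, the sketch below prints one bar per class with a length equal to the class frequency. It is a text approximation only: the bars run horizontally rather than vertically, and the function name `text_histogram` is illustrative.

```python
classes = ["50-69", "70-89", "90-109", "110-129", "130-149"]
frequencies = [2, 33, 35, 7, 1]

def text_histogram(labels, heights):
    """Return one line per class: the label, a bar of '#' marks, and the count."""
    return [f"{label:>8} | {'#' * h} {h}" for label, h in zip(labels, heights)]

for line in text_histogram(classes, frequencies):
    print(line)
```

Even in this crude form the shape is visible: most scores fall in the two middle classes, with few at either extreme.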



Histograms
Relative Frequency Histogram
Has the same shape and horizontal scale
as a histogram, but the vertical scale is
marked with relative frequencies instead
of actual frequencies
IQ Score   Relative Frequency   Relative Frequency (%)
50-69           0.0256                 2.56%
70-89           0.4231                42.31%
90-109          0.4487                44.87%
110-129         0.0897                 8.97%
130-149         0.0128                 1.28%
Histograms
Example
Construct frequency and relative frequency
histograms for
Number of televisions per household
Days to maturity for short term investments
Weights of 18- to 24-year-old males



Histograms
TVs per Household/Single Valued Grouping



Histograms
Television Sets per Household

[Figure: frequency histogram; horizontal axis: Number of TVs (0 to 6); vertical axis: Frequency]



Histograms
Television Sets per Household

[Figure: relative-frequency histogram; horizontal axis: Number of TVs (0 to 6); vertical axis: Relative Frequency]



Histograms
Days to Maturity for Short-Term
Investments/Limit Grouping



Histograms
Short-Term Investments
[Figure: frequency histogram; horizontal axis: Days to maturity (class midpoints 35 to 95); vertical axis: Frequency]



Histograms

Short-Term Investments
[Figure: relative-frequency histogram; horizontal axis: Days to maturity (class midpoints 35 to 95); vertical axis: Relative Frequency]



Histograms
Weights of 18- to 24-year-old Males
Weight (lb)       Frequency   Relative Frequency
120 ≤ W < 140         3            0.081
140 ≤ W < 160         9            0.243
160 ≤ W < 180        14            0.378
180 ≤ W < 200         7            0.189
200 ≤ W < 220         3            0.081
220 ≤ W < 240         0            0.000
240 ≤ W < 260         0            0.000
260 ≤ W < 280         1            0.027
Total                37            1.000
Histograms



Histograms



Histograms
Some questions about distributions
What is the shape of this distribution?
What is the center?
How much variation is in the data?
Are there any outliers?



Histograms
Skewness
A distribution of data is skewed if it is not
symmetric and extends more to one side than
the other
Data skewed to the right have a longer right tail
Positively skewed or right skewed
Data skewed to the left have a longer left tail
Negatively skewed or left skewed



Histograms

Examples of skewness



Histograms
Distribution of a Data Set
Table, graph, or formula that provides the
values of the observations and how often
they occur

Example
Heights of 3,264 female students
With bell shaped curve
Histograms



Histograms
Interpreting Histograms
Objective is not simply to construct a histogram
But rather to
understand
something about
the data
Shape
Center
Spread
Modes
Outliers



Histograms
Interpreting Histograms
The author calls it CVDOT
C Center
V Variation
D Distribution
O Outliers
T Time



Histograms
Interpreting Histograms
Normal distribution
Frequencies start low
Then increase to one or
two high frequencies
Then decrease to a low frequency
The distribution is approximately symmetric
Frequencies preceding the maximum being roughly
a mirror image of those that follow the maximum



Histograms



Histograms
Household Size
Identify the distribution



Histograms
Histogram for a cumulative distribution
Recall: Low Lead Level

IQ Score   Frequency   Cumulative   Relative    Cumulative Relative
                       Frequency    Frequency   Frequency
50-69          2           2         0.0256       0.0256
70-89         33          35         0.4231       0.4487
90-109        35          70         0.4487       0.8974
110-129        7          77         0.0897       0.9872
130-149        1          78         0.0128       1.0000
Total         78                     1



Histograms
Histogram for a cumulative relative
distribution
Recall: Low Lead Level
IQ Score   Frequency   Cumulative   Relative    Cumulative Relative
                       Frequency    Frequency   Frequency
50-69          2           2         0.0256       0.0256
70-89         33          35         0.4231       0.4487
90-109        35          70         0.4487       0.8974
110-129        7          77         0.0897       0.9872
130-149        1          78         0.0128       1.0000
Total         78                     1



Histograms
Recall
Population data
The values of a variable for the entire
population
Population distribution
Sample data
The values of a variable for a sample of the
population
Sample distribution
Histograms
Simulation
Household
Size
Samples
of 100
households



Histograms
Population and Sample Distributions
We usually do not know the population
distribution
However
We can use the distribution of the simple random
sample from the population to get a rough idea of the
population distribution



Histograms
Population and Sample Distributions
For a simple random sample
The sample distribution approximates the
population distribution
The distribution of the variable under consideration
The larger the sample size
The better the approximation tends to be



Histograms
Many methods we will use later in the text
require that the sample data must be from
a population with a normal distribution
A normal quantile plot can be interpreted on
the following criteria
Normal Distribution
Points are reasonably close to a straight line
Not a Normal Distribution
Points are not reasonably close to a straight line, or the points show
some systematic pattern that is not a straight line
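The points of a normal quantile plot can be computed without any plotting library. The sketch below pairs each sorted data value with a standard normal quantile, using `statistics.NormalDist.inv_cdf` (Python 3.8 or later). Two assumptions: the plotting position (i - 0.5) / n is one common convention among several, and the sample used is the list of high-lead IQ scores from later in this chapter, chosen only for illustration.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (z, x) pairs: standard normal quantile paired with each sorted value."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumption: plotting position (i - 0.5) / n for the i-th smallest value.
    return [(std_normal.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

# High-lead IQ scores listed in the "Comparing 2 data sets" slides.
high_lead_iq = [75, 75, 76, 79, 80, 80, 80, 82, 83, 85, 85, 88, 88, 88, 89, 93,
                94, 96, 101, 104, 104]

points = normal_quantile_points(high_lead_iq)
for z, x in points:
    print(f"{z:6.2f}  {x}")
```

Plotting x against z and judging whether the points fall close to a straight line is the visual check described in the criteria.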



Histograms
Examples



Histograms
Comparing 2 data sets
Frequency distributions (Low and High
Lead Level)
Low Lead Level - Sorted
50 56 70 72 73 74 75 76 76 76 76 76 77 77 78 80
80 80 84 85 85 85 85 86 86 86 86 87 87 88 88 88
89 89 89 91 92 93 94 94 94 95 96 96 96 96 96 96
96 97 97 98 99 99 99 99 100 101 101 102 104 104 105 105
106 107 107 107 107 108 111 115 115 118 120 125 128 141

High Lead Level - Sorted


75 75 76 79 80 80 80 82 83 85 85 88 88 88 89 93
94 96 101 104 104



Histograms
Comparing 2 data sets
Frequency distributions (Low and High Lead Level)
Low Lead Level
IQ Score    Frequency   Relative Frequency   Percent Frequency
50-69           2            0.0256                2.56%
70-89          33            0.4231               42.31%
90-109         35            0.4487               44.87%
110-129         7            0.0897                8.97%
130-149         1            0.0128                1.28%
Total          78            1                   100.00%
High Lead Level
IQ Score    Frequency   Relative Frequency   Percent Frequency
50-69           0            0.0000                0.00%
70-89          15            0.7143               71.43%
90-109          6            0.2857               28.57%
110-129         0            0.0000                0.00%
130-149         0            0.0000                0.00%
Total          21            1                   100.00%



Histograms
Comparing 2 data sets
Frequency histograms (Low and High Lead Level)



Histograms
Comparing 2 data sets
Relative frequency histograms (Low and High Lead Level)



Histograms
When comparing data sets
Use relative frequency distributions and
relative histograms



Histograms
Relative-frequency histograms are
better for comparing two quantitative
data sets
Why?
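One way to see the answer is to put the two lead groups side by side. The sketch below takes the low-lead class frequencies from the table and counts the high-lead frequencies directly from the 21 sorted scores listed earlier; the layout and names are illustrative.

```python
classes = ["50-69", "70-89", "90-109", "110-129", "130-149"]

# Low-lead class frequencies (78 children), from the frequency table.
low_frequencies = [2, 33, 35, 7, 1]

# High-lead scores as listed, tallied with the same classes (width 20, start 50).
high_scores = [75, 75, 76, 79, 80, 80, 80, 82, 83, 85, 85, 88, 88, 88, 89, 93,
               94, 96, 101, 104, 104]
high_frequencies = [0] * len(classes)
for score in high_scores:
    high_frequencies[(score - 50) // 20] += 1

low_total = sum(low_frequencies)
high_total = sum(high_frequencies)

print(f"{'IQ Score':>8}  {'Low':>4} {'Low %':>7}  {'High':>4} {'High %':>7}")
for label, lo_f, hi_f in zip(classes, low_frequencies, high_frequencies):
    print(f"{label:>8}  {lo_f:4d} {lo_f / low_total:7.1%}  {hi_f:4d} {hi_f / high_total:7.1%}")
```

The raw counts mostly reflect that one group is several times larger than the other. The percentages put both groups on the same 0 to 100% scale, so the shift of the high-lead group toward the 70-89 class becomes visible.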



Histograms



Graphs That Enlighten and
Graphs That Deceive
This section discusses other types of
statistical graphs
Our objective is to identify a suitable
graph for representing the data set
The graph should be effective in revealing
the important characteristics of the data



Graphs That Enlighten and
Graphs That Deceive
Some graphs are bad in the sense that
they contain errors
Some are bad because they are
technically correct, but misleading
It is important to develop the ability to
recognize bad graphs and identify
exactly how they are misleading



Graphs That Enlighten and
Graphs That Deceive
Scatterplot or Scatter Diagram
A plot of paired (x, y) quantitative data with a
horizontal x-axis and a vertical y-axis
Used to determine
whether there is a
relationship between
the two variables
Males
x is waist circumference (cm)
y is arm circumference (cm)



Graphs That Enlighten and
Graphs That Deceive
Scatterplot or Scatter Diagram
What would no relationship look like?
Nonlinear?
A linear but negative relationship?
Recall our test for normality



Graphs That Enlighten and
Graphs That Deceive
Clusters and a Gap
Recall
The weight of 72 pennies
Two clusters
Separated by a gap
May erroneously conclude
There is a relationship between
the weight of a penny and the year it was made
No apparent relationship with the year
Apparent relationship with the makeup
Pre-1983: 95% copper and 5% zinc
Post-1983: 2.5% copper and 97.5% zinc



Graphs That Enlighten and
Graphs That Deceive
Time-Series Graph
Data that have been collected at different
points in time
Time-series
data
Yearly high
values of the
Dow Jones
Industrial
Average



Graphs That Enlighten and
Graphs That Deceive
Dotplots
A graph that displays the range of values
on the horizontal axis and dots represent
the data points
Each dot is placed immediately above the
value to which it corresponds
Dots are stacked (from bottom to top) if
values are duplicated



Graphs That Enlighten and
Graphs That Deceive
To Construct a Dotplot
1. Draw a horizontal axis
That displays the possible values of the
quantitative data
2. Record each observation
By placing a dot over the appropriate value on
the horizontal axis
3. Label the horizontal axis with the name of
the variable
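The three steps can be approximated in plain text. The sketch below uses the DVD player prices from the example that follows, gives every integer price between the minimum and maximum one character of horizontal space, and stacks a dot for each repeated value. The function name `text_dotplot` is illustrative.

```python
from collections import Counter

# Prices of DVD players from the example that follows.
prices = [197, 199, 199, 199, 208, 209, 210, 210, 212, 212, 214, 215, 219, 219, 219, 224]

def text_dotplot(values):
    """Return the rows of a text dotplot, top row first, one column per integer value."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)
    rows = []
    # Build from the tallest stack down so that dots stack bottom to top.
    for level in range(max(counts.values()), 0, -1):
        rows.append("".join("." if counts[v] >= level else " " for v in axis).rstrip())
    return rows

rows = text_dotplot(prices)
axis_length = max(prices) - min(prices) + 1
for row in rows:
    print(row)
print("-" * axis_length)
print(f"{min(prices)}{' ' * (axis_length - 6)}{max(prices)}")
print("Price")
```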
Graphs That Enlighten and
Graphs That Deceive
Example
Prices of DVD Players
Construct a Dotplot
Sorted List (Not required)
197 199 199 199 208 209 210 210 212 212 214 215 219 219 219 224
Graphs That Enlighten and
Graphs That Deceive
Prices of DVD Players
Dot Plot



Graphs That Enlighten and
Graphs That Deceive
Dotplot
Consists of a graph in which each data
value is plotted as a point (or dot) along a
scale of values
Dots representing equal values are stacked



199 200 202 203 207 208 208 209 210 210 210 210 212 213 214 215 217 218 218 221

Graphs That Enlighten and


Graphs That Deceive
Stem-and-leaf diagrams
A list of numbers organized into Stems (on the Left)
and Leaves (on the Right) separated by a vertical
line
Row of leaves
The most significant digits are the Stems
The least significant digits are Leaves
The Stems are in increasing order
From top to bottom
Leaves are in increasing order
From left to right
199 200 202 203 207 208 208 209 210 210 210 210 212 213 214 215 217 218 218 221

Graphs That Enlighten and
Graphs That Deceive
To construct a stem-and-leaf diagram
1. Think of each observation as a stem
Consisting of all but the rightmost digit
And a leaf, the rightmost digit
2. Write the stems from the smallest to largest in a
vertical column to the left of a vertical rule
3. Write each leaf to the right of the vertical rule in
the row that contains the appropriate stem
4. Arrange the leaves in each row in ascending
order
19 | 9
20 | 0 2 3 7 8 8 9
21 | 0 0 0 0 2 3 4 5 7 8 8
22 | 1
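The four steps can be sketched in Python using the 20 cholesterol values listed on this slide. The function name `stem_and_leaf` is illustrative, and this version prints only stems that have at least one leaf.

```python
from collections import defaultdict

cholesterol = [199, 200, 202, 203, 207, 208, 208, 209, 210, 210, 210, 210,
               212, 213, 214, 215, 217, 218, 218, 221]

def stem_and_leaf(values):
    """Return the rows of an ordered stem-and-leaf diagram for whole numbers."""
    rows = defaultdict(list)
    # Sorting first means the leaves land in each row in ascending order (step 4).
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # stem: all but the rightmost digit; leaf: rightmost digit
        rows[stem].append(leaf)
    # Stems from smallest to largest, leaves to the right of the vertical rule.
    return [f"{stem} | {' '.join(str(leaf) for leaf in rows[stem])}" for stem in sorted(rows)]

for row in stem_and_leaf(cholesterol):
    print(row)
```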
Graphs That Enlighten and
Graphs That Deceive
36 38 39 47 50 51 51 53 55 55
56 57 60 62 63 64 64 65 66 67
68 69 70 70 70 71 75 78 79 80
81 83 85 86 87 89 95 98 99 99
Example
Days to Maturity for Short-Term
Investments
Construct a
stem-and-leaf
diagram



Graphs That Enlighten and
Graphs That Deceive
36 38 39 47 50 51 51 53 55 55
56 57 60 62 63 64 64 65 66 67
68 69 70 70 70 71 75 78 79 80
81 83 85 86 87 89 95 98 99 99

[Figure: two stem-and-leaf diagrams (Stems | Leaves) for the days-to-maturity data: not ordered (left) and ordered (right)]


Graphs That Enlighten and
Graphs That Deceive
Stemplot or stem-and-leaf plot



Graphs That Enlighten and
Graphs That Deceive
Example
Cholesterol Levels
Construct sorted stem-and-leaf diagrams with
a. One line of leaves per stem
0-9
b. Two lines of leaves per stem
0-4 and 5-9
199 200 202 203 207 208 208 209 210 210 210 210 212 213 214 215 217 218 218 221
Graphs That Enlighten and
Graphs That Deceive
Cholesterol Levels
a. One line of leaves per stem
0-9
b. Two lines of leaves per stem
0-4 and 5-9
Graphs That Enlighten and
Graphs That Deceive
Stem-and-Leaf
Advantages
Similar to Histogram
Shows Raw Data

Disadvantages
Not useful for a large data set
Awkward with data containing many digits



Graphs That Enlighten and
Graphs That Deceive
Pie Chart
Consists of a disk divided into wedge-shaped
pieces
Each proportional to its relative frequency
To construct a pie chart
1.Obtain the relative-frequency distribution
2.Divide a disk into wedge-shaped pieces proportional
to the relative frequencies
3.Label the slices with the distinct values and their
relative frequencies (or percents)
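The three steps can be sketched numerically in Python. The category names and counts below are hypothetical, chosen only so that the relative frequencies are 45.0%, 32.5%, and 22.5%; the sketch computes the central angle of each wedge rather than drawing the disk.

```python
# Hypothetical counts for three categories (assumption, not from the slides)
counts = {"Party A": 18, "Party B": 13, "Party C": 9}

total = sum(counts.values())

# Step 1: relative-frequency distribution
relative = {name: count / total for name, count in counts.items()}

# Step 2: each wedge gets the same share of the 360-degree disk
angles = {name: 360 * count / total for name, count in counts.items()}

# Step 3: label each slice with its value and percent
for name in counts:
    print(f"{name}: {relative[name]:.1%} of the disk, {angles[name]:.0f} degrees")
```

Because the relative frequencies sum to 1, the wedge angles always sum to 360 degrees.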
Graphs That Enlighten and
Graphs That Deceive
Example
Political Party Affiliation
Construct a Pie Chart
1. Obtain the relative-frequency distribution

Graphs That Enlighten and
Graphs That Deceive
Political Party Affiliation
Construct a Pie Chart
2. Divide a disk into
wedge-shaped pieces
proportional to the relative
frequencies
3. Label the slices with the
distinct values and their
relative frequencies
Percentages
Graphs That Enlighten and
Graphs That Deceive
Pie Chart
What contributes most to happiness?
Coca-Cola Survey
12,500 respondents

Graphs That Enlighten and
Graphs That Deceive
Pareto Chart
A bar graph for qualitative data
With the bars
arranged in
descending
order according
to frequencies
Coca-Cola Survey
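The only step a Pareto chart adds to a bar graph is the ordering of the bars. The survey figures are not reproduced in the text, so the sketch below uses hypothetical category names and counts.

```python
# Hypothetical frequencies for four qualitative categories (assumption)
counts = {"Category A": 12, "Category B": 30, "Category C": 7, "Category D": 21}

# Arrange the categories in descending order of frequency
pareto_order = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for name, frequency in pareto_order:
    print(f"{name}: {frequency}")
```

Drawing the bars in this order puts the most frequent category first, which is what draws attention to the most important categories.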

Graphs That Enlighten and
Graphs That Deceive
Bar Graph
Uses bars of equal width to show frequencies of
categorical, or qualitative, data
Vertical scale represents frequencies or relative
frequencies
Horizontal scale identifies the different categories of
qualitative data
A multiple bar graph has two or more sets of
bars and is used to compare two or more data
sets
Graphs That Enlighten and
Graphs That Deceive
Bar Charts
Distinct values on the horizontal axis
Relative frequencies (or percents) on the
vertical axis
The bar height is equal to the relative frequency
The bars should not touch each other

Graphs That Enlighten and
Graphs That Deceive
Bar Charts
To construct a bar chart
1. Obtain the relative-frequency distribution
2. Draw a horizontal axis
And a vertical axis
3. For each distinct value
Construct a vertical bar whose height equals the relative
frequency
4. Label the bars with the distinct values
The horizontal axis with the name of the variable and the
vertical axis with Relative Frequency
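As a rough illustration of these steps, the Python sketch below prints a text bar chart. Two things are assumptions: the category names and counts are hypothetical, and the bars are drawn horizontally (one per line) because that is easier in plain text, whereas the steps above describe vertical bars.

```python
def text_bar_chart(counts, scale=40):
    """Return one text line per category; bar length is proportional
    to the relative frequency (scale = length of a 100% bar)."""
    total = sum(counts.values())
    width = max(len(name) for name in counts)
    lines = []
    for name, count in counts.items():
        rel = count / total                      # relative frequency
        bar = "#" * round(rel * scale)
        lines.append(f"{name:<{width}} | {bar} {rel:.1%}")
    return lines


# Hypothetical counts (assumption) giving 45.0%, 32.5%, and 22.5%
chart = text_bar_chart({"Party A": 18, "Party B": 13, "Party C": 9})
print("\n".join(chart))
```

The bars are separate lines that do not touch, matching the convention for qualitative data.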

Graphs That Enlighten and
Graphs That Deceive
Example
Political Party Affiliation
Construct a Bar Chart
1. Obtain the relative-frequency distribution

Graphs That Enlighten and
Graphs That Deceive
Political Party Affiliation
Construct a Bar Chart
2. Draw a horizontal axis
And a vertical axis
3. For each distinct value
Construct a vertical bar whose height equals the relative
frequency
4. Label the bars with the distinct values
The horizontal axis with the name of the variable and the
vertical axis with Relative Frequency
Bar labels in the chart: 45.0%, 32.5%, 22.5%

Graphs That Enlighten and
Graphs That Deceive
Political Party Affiliation chart (labels: 45.0%, 32.5%, 22.5%)

Graphs That Enlighten and
Graphs That Deceive
Example
Multiple
Bar
Graph
Median
income for
males and
females
by year

Graphs That Enlighten and
Graphs That Deceive
Relative-frequency histogram vs. bar
chart
In a bar chart, the bars do not touch
Why?
Hint: What kind of data is represented?
Any other differences?

Graphs That Enlighten and
Graphs That Deceive
Frequency Polygon
Uses line segments connected to points directly
above class midpoint values
IQ Score Frequency
50-69 2
70-89 33
90-109 35
110-129 7
130-149 1
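The points of a frequency polygon can be computed from the IQ table in Python. This is a sketch: the class limits and frequencies come from the table, while the extra zero-frequency points at each end follow a common textbook convention and are an assumption, not something stated on the slide.

```python
# Class limits and frequencies from the IQ score table
classes = [(50, 69), (70, 89), (90, 109), (110, 129), (130, 149)]
frequencies = [2, 33, 35, 7, 1]

# Class midpoint = (lower class limit + upper class limit) / 2
midpoints = [(low + high) / 2 for low, high in classes]

# Points to connect with line segments: (midpoint, frequency)
points = list(zip(midpoints, frequencies))

# Assumption: close the polygon on the horizontal axis, one class width
# before the first midpoint and one class width after the last
class_width = classes[1][0] - classes[0][0]
polygon = ([(midpoints[0] - class_width, 0)]
           + points
           + [(midpoints[-1] + class_width, 0)])

print(polygon)
```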

Graphs That Enlighten and
Graphs That Deceive
Relative Frequency Polygon
Uses relative frequencies (proportions or
percentages) for the vertical scale
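For the IQ score frequencies 2, 33, 35, 7, and 1 (78 children in total), the vertical coordinates of a relative frequency polygon can be computed as in this short Python sketch; the variable names are illustrative only.

```python
frequencies = [2, 33, 35, 7, 1]      # IQ classes 50-69 through 130-149
n = sum(frequencies)                 # 78 children

# Relative frequency = class frequency / total number of values
relative = [f / n for f in frequencies]

# The same values expressed as percentages, rounded to one decimal place
percents = [round(100 * r, 1) for r in relative]

print(percents)
```

The shape of the polygon is unchanged; only the vertical scale differs.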

Graphs That Enlighten and
Graphs That Deceive
Ogive
A line graph that depicts cumulative
frequencies
IQ Score Frequency Cumulative
Frequency
50-69 2 2
70-89 33 35
90-109 35 70
110-129 7 77
130-149 1 78
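The cumulative frequency column is a running total of the frequency column, as this Python sketch shows. Plotting the totals against upper class boundaries (upper class limit + 0.5) is a common textbook convention and is an assumption here, since the slide does not state the horizontal positions.

```python
from itertools import accumulate

classes = [(50, 69), (70, 89), (90, 109), (110, 129), (130, 149)]
frequencies = [2, 33, 35, 7, 1]

# Running total of the class frequencies
cumulative = list(accumulate(frequencies))

# Assumption: plot each total at the upper class boundary
upper_boundaries = [high + 0.5 for _, high in classes]
ogive_points = list(zip(upper_boundaries, cumulative))

print(cumulative)
print(ogive_points)
```

The last cumulative frequency always equals the total number of data values, here 78.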
Graphs That Enlighten and
Graphs That Deceive
Nonzero Axis
Graphs can be misleading because one or
both of the axes begin at some value other
than zero
So that
differences
are
exaggerated
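The size of the exaggeration can be worked out directly. The Python sketch below uses two hypothetical values (50 and 55) and a hypothetical axis start of 45; none of these numbers come from the slide.

```python
def apparent_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the drawn bar heights when the vertical axis
    begins at axis_start instead of zero."""
    return (larger - axis_start) / (smaller - axis_start)


# Hypothetical values: the second is 10% larger than the first
true_ratio = apparent_ratio(50, 55)                     # axis starts at 0
distorted_ratio = apparent_ratio(50, 55, axis_start=45)

print(true_ratio, distorted_ratio)
```

With the axis starting at 45, the second bar is drawn twice as tall as the first even though the value is only 10% larger.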

Graphs That Enlighten and
Graphs That Deceive
Pictographs
Drawings of objects
Three-dimensional objects - money bags, stacks
of coins, army tanks (for army expenditures),
people (for population sizes), barrels (for oil
production), and houses (for home construction)
are commonly used to depict data
These drawings can create false
impressions that distort the data

Graphs That Enlighten and
Graphs That Deceive
Pictographs
If you double each side of a square, the
area does not merely double; it increases
by a factor of four; if you double each side
of a cube, the volume does not merely
double; it increases by a factor of eight
Pictographs using areas and volumes can
therefore be very misleading

Graphs That Enlighten and
Graphs That Deceive
Pictographs
Consider
Mean salaries and education level
No high school diploma: $18,734
High school diploma: $27,915
Bachelor's degree: $51,206
Advanced degree: $74,602

Graphs That Enlighten and
Graphs That Deceive
Bars have
same width
Too busy
Too difficult
to understand

Graphs That Enlighten and
Graphs That Deceive
Misleading
Depicts one-dimensional data with three-
dimensional boxes
Last box is 64 times as large as first box
But income is only 4 times
as large
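The factor of 64 can be checked with a few lines of Python, using the mean salaries given earlier. The helper name size_factor is hypothetical; the sketch assumes each side of the box is scaled in proportion to the salary.

```python
salaries = {
    "No high school diploma": 18734,
    "High school diploma": 27915,
    "Bachelor's degree": 51206,
    "Advanced degree": 74602,
}


def size_factor(linear_scale, dimensions):
    """Factor by which length (1), area (2), or volume (3) grows
    when every side is multiplied by linear_scale."""
    return linear_scale ** dimensions


# Highest mean salary relative to the lowest: about 3.98, roughly 4
ratio = salaries["Advanced degree"] / salaries["No high school diploma"]
scale = round(ratio)

print(round(ratio, 2))           # income ratio
print(size_factor(scale, 2))     # area factor if squares were drawn
print(size_factor(scale, 3))     # volume factor for three-dimensional boxes
```

Doubling each side gives size_factor(2, 2) = 4 for a square and size_factor(2, 3) = 8 for a cube, the same effect on a smaller scale.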

Graphs That Enlighten and
Graphs That Deceive
Fair, objective,
unencumbered
by distracting
features

Graphs That Enlighten and
Graphs That Deceive
Important Principles Suggested by
Edward Tufte
For small data sets of 20 values or fewer
Use a table instead of a graph
A graph of data should make the viewer
focus on the true nature of the data
Not on other elements, such as eye-catching
but distracting design features

Graphs That Enlighten and
Graphs That Deceive
Important Principles Suggested by
Edward Tufte
Do not distort data
Construct a graph to reveal the true nature of
the data
Almost all of the ink in a graph should be
used for the data
Not for the other design elements

Graphs That Enlighten and
Graphs That Deceive
Exercise
What is a frequency distribution of
qualitative data and why is it useful?
A frequency distribution of qualitative data is a
listing of the distinct values and their
frequencies
A frequency distribution is useful for organizing
qualitative data so that the data are more
compact and easier to understand

Graphs That Enlighten and
Graphs That Deceive
Exercise
Explain the difference between
a. Frequency and Relative Frequency
A frequency is the number of times a particular distinct
value occurs, whereas a relative frequency is a ratio of
a frequency to the total number of observations
b. Percentage and Relative Frequency
A percentage equals 100 times a relative frequency
Equivalently, a relative frequency is a percentage
expressed as a decimal

Graphs That Enlighten and
Graphs That Deceive
Exercise
Answer true or false to each of the statements
in parts (a) and (b)
And explain your reasoning
a. If two data sets have identical frequency
distributions,
they have identical relative frequency distributions
True: identical frequencies give identical totals, so
the ratios of frequency to total are identical

Graphs That Enlighten and
Graphs That Deceive
Exercise
b. If two data sets have identical relative
frequency distributions,
they have identical frequency distributions
False: data sets of different sizes can have the
same relative frequencies

Graphs That Enlighten and
Graphs That Deceive
Exercise
c. Use your answers to parts (a) and (b) to
explain why relative frequency
distributions are better than frequency
distributions for comparing two data sets
Relative frequencies always lie between 0
and 1 and hence provide a standard for
comparison
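A small Python sketch illustrates parts (a) through (c). The two data sets are hypothetical: the second is ten copies of the first, so the sizes differ but the proportions do not. Exact fractions are used so that the comparison is not affected by rounding.

```python
from collections import Counter
from fractions import Fraction


def frequency_distribution(data):
    """Distinct values and the number of times each occurs."""
    return dict(Counter(data))


def relative_frequency_distribution(data):
    """Distinct values and frequency / total, as exact fractions."""
    n = len(data)
    return {value: Fraction(count, n) for value, count in Counter(data).items()}


# Hypothetical data sets (assumption): same proportions, different sizes
small = ["A", "A", "B"]
large = small * 10

print(frequency_distribution(small), frequency_distribution(large))
print(relative_frequency_distribution(small) == relative_frequency_distribution(large))
```

The frequencies differ (2 and 1 versus 20 and 10), but both data sets have relative frequencies 2/3 and 1/3, so only the relative frequency distributions can be compared directly.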

Graphs That Enlighten and
Graphs That Deceive
Exercise
Identify an important reason for grouping
data
Grouping can help to make a large and
complicated set of data more compact and
easier to understand

Graphs That Enlighten and
Graphs That Deceive
Exercise
Explain the difference between a frequency
histogram and a relative-frequency
histogram
A frequency histogram displays the class
frequencies on the vertical axis
A relative-frequency histogram displays the
class relative frequencies on the vertical axis

Review
This chapter covered organizing,
summarizing and graphing data sets
Important characteristics of a data set
Center
Variation
Distribution
Outliers
Any changing patterns over time

Review
Should be able to
Construct a frequency or relative frequency
distribution to summarize data
Construct a frequency or a relative frequency
histogram to show the distribution of the data
Examine a histogram or normal quantile plot
to determine whether the sample data
appear to be from a normal population

Review
Should be able to
Construct graphs of data using a
scatterplot, frequency polygon, dotplot,
stemplot, bar graph, multiple bar graph,
Pareto chart, pie chart, or time-series graph
Critically analyze a graph to determine
whether it objectively depicts data or is
somehow misleading or incorrect
