
Math 106 Lecture 7

Measures of Central Tendency


(summarizing data with a single number)
Mean, Median, Mode, Grouped Data

Intro to Dispersion: Quartiles

© m j winter, ss2003

Mean, Median, Mode


Data points: $x_1, x_2, \ldots, x_n$

Mean: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n} = \mu$

Median: list the data in order, $x_1 \le x_2 \le \cdots \le x_n$. If
n is odd, the median is the middle point;
n is even, the median is the average of the two middle points.

Mode: the value of x which occurs most often. There may be more than one; there may be none.
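
The three definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the lecture: the function names and the `sample` list are hypothetical, and `modes` returns an empty list when no value repeats, which is one way to read "may be none".

```python
from collections import Counter


def mean(data):
    """Sum of the data points divided by how many there are."""
    return sum(data) / len(data)


def median(data):
    """Middle point of the sorted data; average of the two middle points if n is even."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(data):
    """All values that occur most often; an empty list if no value repeats."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs exactly once: no mode
    return [value for value, count in counts.items() if count == top]


sample = [2, 3, 3, 5, 7, 7]  # hypothetical data, not from the lecture
print(mean(sample), median(sample), modes(sample))
```

Returning a list from `modes` keeps the "may be more than one" case visible instead of silently picking one value.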

Example: Starting Salaries

Data Set - Starting Salaries of Basket-weaving Majors:


$27,000, $27,000, $49,500, $37,300, $487,000,
$15,000, $32,000, $37,500, $41,300

What was the mean (average) starting salary?


What was the median starting salary?
What was the mode?

Example - 2

Data Set - Starting Salaries of Basket-weaving Majors:


$27,000, $27,000, $49,500, $37,300, $487,000,
$15,000, $32,000, $37,500, $41,300

Mean salary? (Add the salaries and divide by 9)
$83,733.33
Median salary? (List in order, take the middle)
15.0 27.0 27.0 32.0 37.3 37.5 41.3 49.5 487 (in thousands of dollars)
The middle (fifth) value is 37.3, so the median is $37,300.
Mode? $27,000
How useful are these numbers?
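
One way to check these numbers is with Python's standard `statistics` module. The sketch below uses the nine salaries from the slide; the last two lines, which recompute the mean without the $487,000 salary, are an added illustration and not part of the original example.

```python
import statistics

# The nine starting salaries from the slide
salaries = [27_000, 27_000, 49_500, 37_300, 487_000,
            15_000, 32_000, 37_500, 41_300]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
mode_salary = statistics.mode(salaries)

print(f"mean   = ${mean_salary:,.2f}")
print(f"median = ${median_salary:,}")
print(f"mode   = ${mode_salary:,}")

# Added illustration: the same data without the single $487,000 salary
trimmed = [s for s in salaries if s != 487_000]
print(f"mean without the largest salary = ${statistics.mean(trimmed):,.2f}")
```

A single extreme value pulls the mean far above every other salary in the list, while the median stays among the typical values.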

Measures of Central Tendency: Advantages, Disadvantages

Mean: can be influenced by outliers;
useful mathematically;
most useful when data is ‘continuous’.

Median: also a central number. Often more meaningful.

However, it is possible that no data points (or very few) lie anywhere near the mean or median.

Mode: useful when data is discrete, such as the number of cars in a family.

Questions
Suppose $x_1 < x_2 < \cdots < x_{10}$. Calculate the mean

$\mu = \frac{x_1 + x_2 + \cdots + x_{10}}{10}$

and the median

$m = \frac{x_5 + x_6}{2}$

Now increase the largest number by 20. What is the new mean? The new median?

New mean $= \frac{x_1 + x_2 + \cdots + (x_{10} + 20)}{10} = \mu + \frac{20}{10} = \mu + 2$

The median does not change: it is still $\frac{x_5 + x_6}{2}$.
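
A quick numerical check of this argument, as a sketch: the ten values below are hypothetical (any strictly increasing list of ten numbers would behave the same way), and only the largest one is raised by 20.

```python
import statistics

# Hypothetical ten sorted values; not taken from the lecture
data = [1, 3, 4, 6, 7, 9, 12, 15, 18, 25]

# Raise only the largest value by 20
shifted = data[:-1] + [data[-1] + 20]

mean_change = statistics.mean(shifted) - statistics.mean(data)
print("change in mean:", mean_change)
print("medians:", statistics.median(data), statistics.median(shifted))
```

The mean moves by 20/10 = 2 because every point contributes to the sum, while the median depends only on the two middle points, which are untouched.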

Detour – weighted averages
Calculate the average of: 3.2, 3.2, 3.2, 4.0, 2.5, 2.5

$\frac{3.2 + 3.2 + 3.2 + 4.0 + 2.5 + 2.5}{6} = \frac{3(3.2) + 1(4.0) + 2(2.5)}{6} = \frac{3}{6}(3.2) + \frac{1}{6}(4.0) + \frac{2}{6}(2.5) = 3.10$

This is a weighted average.

The numbers $\frac{3}{6}$, $\frac{1}{6}$, $\frac{2}{6}$ are called the weights.
Note that the sum of the weights is 1.

Another example of Weighted Averages

A student’s test average is 3.1 and the grade on the final exam is 2.8. If the exam is to count as 1/4 of the final average, how is this average computed?

The weights are 3/4 and 1/4.

$\frac{3}{4}(3.1) + \frac{1}{4}(2.8) = \frac{3(3.1) + 1(2.8)}{4} = 3.025$
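
Both weighted averages in this detour follow the same recipe: multiply each value by its weight and add. A minimal sketch, assuming a hypothetical helper named `weighted_average` that insists the weights sum to 1:

```python
import math


def weighted_average(values, weights):
    """Sum of weight * value; the weights are expected to sum to 1."""
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(w * v for v, w in zip(values, weights))


# First example: 3.2 occurs three times, 4.0 once, 2.5 twice
grades = weighted_average([3.2, 4.0, 2.5], [3 / 6, 1 / 6, 2 / 6])

# Second example: test average counts 3/4, final exam counts 1/4
final = weighted_average([3.1, 2.8], [3 / 4, 1 / 4])

print(round(grades, 3), round(final, 3))
```

The check on the weights is a design choice for this sketch; an alternative is to divide by the sum of the weights, which lets raw counts such as 3, 1, 2 be passed directly.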

Mean or Average of Grouped Data
Set of 17 integers between 2 and 9 (inclusive)

Interval   Freq
[2, 3]     7
[4, 5]     3
[6, 7]     3
[8, 9]     4

[Figure: frequency histogram of the grouped data, bar heights 7, 3, 3, 4]

With data in a group, use the midpoint value.

Mean or Average of Grouped Data - 2


Set of 17 numbers between 2 and 9 (inclusive). Use the mid-interval value.

[Figure: the same frequency histogram, bar heights 7, 3, 3, 4]

$\bar{x} = \frac{7 \cdot 2.5 + 3 \cdot 4.5 + 3 \cdot 6.5 + 4 \cdot 8.5}{17} = 4.9705\ldots$
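
The grouped-data calculation can be written as a short function. This is a sketch under the slide's convention that every value in a group is replaced by the interval midpoint; the name `grouped_mean` and the `(interval, frequency)` layout are assumptions made for the example.

```python
# (interval, frequency) pairs from the grouped table
groups = [((2, 3), 7), ((4, 5), 3), ((6, 7), 3), ((8, 9), 4)]


def grouped_mean(groups):
    """Mean of grouped data, using each interval's midpoint as its value."""
    total = sum(freq for _, freq in groups)
    weighted_sum = sum(freq * (low + high) / 2 for (low, high), freq in groups)
    return weighted_sum / total


result = grouped_mean(groups)
print(round(result, 4))
```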

Calculating the mean from a relative
frequency (density) histogram
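Interval   Midpoint   Freq   Relative freq
[2, 3]     2.5        7      .412
[4, 5]     4.5        3      .176
[6, 7]     6.5        3      .176
[8, 9]     8.5        4      .235

[Figure: relative frequency (density) histogram of the same grouped data]

$.412 \cdot 2.5 + .176 \cdot 4.5 + .176 \cdot 6.5 + .235 \cdot 8.5 \approx 4.96$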
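
The relative-frequency calculation is the grouped mean with the division by the total moved inside the sum. The short derivation below makes that explicit; the symbols $f_i$ (frequency), $m_i$ (midpoint), $r_i$ (relative frequency) and $N$ (number of data points) are notation introduced here, not taken from the slides.

```latex
\bar{x}
  = \frac{\sum_i f_i \, m_i}{N}
  = \sum_i \frac{f_i}{N} \, m_i
  = \sum_i r_i \, m_i ,
\qquad r_i = \frac{f_i}{N}, \quad \sum_i r_i = 1 .
```

With $N = 17$ the relative frequencies are $7/17 \approx .412$, $3/17 \approx .176$ and $4/17 \approx .235$. The small gap between this result and the 4.9705.. obtained from the raw frequencies comes only from rounding the relative frequencies to three decimals.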



Here’s the original data


Frequencies for noname.fma (column 1):

Value   Count   Percent
2       4       23.53%
3       3       17.65%
4       2       11.76%
5       1       5.88%
6       3       17.65%
7       0       0.00%
8       3       17.65%
9       1       5.88%

[Figure: frequency histogram of the ungrouped data]

Mean value: 4.76

The wider the bins, the more information you lose.
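
To see the loss of information numerically, the sketch below rebuilds the 17 raw values from the frequency table and recomputes the mean after binning. The helper `binned_mean` and the choice of bin widths 1, 2 and 4 are assumptions made for illustration; width 2 reproduces the grouping used on the earlier slides.

```python
from collections import Counter

# value -> count, copied from the frequency table above
counts = {2: 4, 3: 3, 4: 2, 5: 1, 6: 3, 7: 0, 8: 3, 9: 1}
raw = [value for value, count in counts.items() for _ in range(count)]


def binned_mean(data, start, width):
    """Mean after replacing each integer by the midpoint of its bin.

    Each bin holds `width` consecutive integers; the first bin begins at `start`.
    """
    bins = Counter((x - start) // width for x in data)
    total = 0.0
    for k, n in bins.items():
        low = start + k * width
        high = low + width - 1
        total += n * (low + high) / 2
    return total / len(data)


exact_mean = sum(raw) / len(raw)
for width in (1, 2, 4):
    print(width, round(binned_mean(raw, 2, width), 3))
```

With width 1 every value is its own midpoint, so the binned mean equals the exact mean; wider bins replace more values by a midpoint they may not be close to.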


The wider the bins, the more information you lose.

The next slide shows four histograms formed from the same data. The means are listed in the center.

Grouping Will Change the Mean!

They come from http://www.shodor.org/interactivate/activities/histogram/index.html


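[Figure: four histograms of the same data with different groupings. Means listed in the center: 139.84, 147.2, 112.24, 206.00]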

Elevator-Simulation Examples
Number of times passengers got off at different floors (3 passengers, 6 floors)
• List 1: (10 trials)
6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5
• List 2: (100 trials)
56, 52, 49, 56, 50, 57, 61, 63, 56, 52, 55, 58, 49, 64, 51, 51
• List 3: (400 trials)
213, 231, 221, 215


Sorted Lists
List 1

3 5 5 5 6 6 6 6 6 6 7 8 8 9 9
Mean = Median = Mode =
List 2
49 49 50 51 51 52 52 55 56 56 56 57 58 61 63 64

Mean = Median = Mode =


List 3
213, 215, 221, 231
Mean = Median = Mode =

Sorted Lists - 2
List 1

3 5 5 5 6 6 6 6 6 6 7 8 8 9 9
Mean = 6.33 Median = 6 Mode = 6
List 2
49 49 50 51 51 52 52 55 56 56 56 57 58 61 63 64

Mean = 55 Median = 55.5 Mode = 56


List 3
213, 215, 221, 231
Mean = 220 Median = 218 No Mode

Quartiles - use with median

3 5 5 5 6 6 6 6 6 6 7 8 8 9 9
The median is the midpoint: the number of elements below the median equals the number above it.
First quartile: take the median of the lower half.
Third quartile: take the median of the upper half.

3 5 5 [5] 6 6 6 [6] 6 6 7 [8] 8 9 9
(first quartile = 5, median = 6, third quartile = 8)

interquartile range: 8 – 5 = 3
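
The median-of-halves rule can be sketched as follows. One assumption is made explicit in the code: when n is odd, the median itself is left out of both halves, which is the reading that reproduces the quartiles 5 and 8 for List 1. Other quartile conventions exist and can give slightly different values. The function name `quartiles` is hypothetical.

```python
import statistics


def quartiles(data):
    """First and third quartiles as medians of the lower and upper halves.

    Assumption: for odd n the middle point is excluded from both halves.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    if half == 0:
        raise ValueError("need at least two data points")
    lower = ordered[:half]
    upper = ordered[-half:]
    return statistics.median(lower), statistics.median(upper)


list1 = [6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5]
q1, q3 = quartiles(list1)
print("Q1 =", q1, "Q3 =", q3, "IQR =", q3 - q1)
```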

Interquartile Range, Box Plot, 5-number summary

Roadhog: someone who takes his half of the road out of the middle.

The interquartile range is the width (range) of the middle half of your data.

5-number summary: {3, 5, 6, 8, 9}

min   first quartile   median   third quartile   max
3     5                6        8                9

[Figure: box plot with the box running from 5 to 8; two regions labeled 25.0%]

Estimating the mean from the 5-number summary

5-number summary: {3, 5, 6, 8, 9}

interval   weight   midpoint
[3, 5]     1/4      4
[5, 6]     1/4      5.5
[6, 8]     1/4      7
[8, 9]     1/4      8.5

$\frac{1}{4}(4) + \frac{1}{4}(5.5) + \frac{1}{4}(7) + \frac{1}{4}(8.5) = \frac{4 + 5.5 + 7 + 8.5}{4} = \frac{25}{4} = 6.25$
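
The estimate treats each quarter of the data as sitting at the midpoint of its interval, so it is a weighted average with four equal weights. A minimal sketch, with the hypothetical helper `estimate_mean`, compared against the actual mean of List 1:

```python
def estimate_mean(five_numbers):
    """Average of the four interval midpoints, each weighted 1/4."""
    pairs = zip(five_numbers, five_numbers[1:])
    midpoints = [(low + high) / 2 for low, high in pairs]
    return sum(midpoints) / 4


summary = [3, 5, 6, 8, 9]  # min, first quartile, median, third quartile, max
list1 = [6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5]

estimate = estimate_mean(summary)
actual = sum(list1) / len(list1)
print(estimate, round(actual, 2))
```

For List 1 the estimate is close to the actual mean, but it is only an approximation: the five numbers say nothing about where the points fall inside each interval.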

Commonly reported statistical results
List 1: (10 trials)
6, 8, 8, 6, 9, 6, 5, 7, 5, 9, 6, 6, 3, 6, 5

[Figure: frequency histogram of List 1]