Chapter 2

ORGANIZATION AND DESCRIPTION OF
DATA

2.1 (a) The percentage in other classes is 100 − (32.7 + 12.8 + 12.5 + 12.1 + 8.2) = 21.7%.
(b) [Bar chart of percent of waste by category: Paper, Yard, Food, Plastic, Metals, Other; vertical axis: Percent Waste, 0 to 35.]

(c) The percentage of waste that is paper and paperboard is 32.7%.
The percentage of waste in the top two categories is 32.7 + 12.8 = 45.5%.
The percentage in the top five categories is
32.7 + 12.8 + 12.5 + 12.1 + 8.2 = 78.3%.
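
The arithmetic in Exercise 2.1 can be checked with a short Python sketch. The pairing of category labels with percentages is an assumption read off the order of the chart labels, and the variable names are illustrative only.

```python
# Percentages from Exercise 2.1; the label-to-value pairing is assumed from the axis order.
named = {"Paper": 32.7, "Yard": 12.8, "Food": 12.5, "Plastic": 12.1, "Metals": 8.2}

other = round(100 - sum(named.values()), 1)    # remainder assigned to "Other"
ranked = sorted(named.values(), reverse=True)  # largest category first
top_two = round(sum(ranked[:2]), 1)
top_five = round(sum(ranked[:5]), 1)

print(other, top_two, top_five)
```

Rounding to one decimal place only removes floating-point noise; it does not change the reported values.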



2.2 The frequency table for blood type is


Blood type   Frequency   Relative Frequency
O            16          16/40 = 0.40
A            18          18/40 = 0.45
B             4           4/40 = 0.10
AB            2           2/40 = 0.05
Total        40          1.00

2.3 The frequency table for number of activities is

Number of Activities Frequency Relative Frequency
0 7 7/40 = 0.175
1 10 10/40 = 0.25
2 13 13/40 = 0.325
3 5 5/40 = 0.125
4 2 2/40 = 0.05
5 1 1/40 = 0.025
6 1 1/40 = 0.025
7 1 1/40 = 0.025
Total 40 1.00
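
Each relative frequency above is a count divided by the total of 40. A minimal Python sketch of that computation follows; the dictionary name `freq` is an illustrative choice, not something from the exercise.

```python
# Frequencies from the table in Exercise 2.3: number of activities -> count.
freq = {0: 7, 1: 10, 2: 13, 3: 5, 4: 2, 5: 1, 6: 1, 7: 1}

n = sum(freq.values())                           # total number of observations
rel = {k: count / n for k, count in freq.items()}

for k in freq:
    print(f"{k:>2}  {freq[k]:>3}  {rel[k]:.3f}")
print(f"Total {n}  {sum(rel.values()):.2f}")
```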

This is the relative frequency histogram:




2.4 The frequency table for number of crashes per month is

Number of Crashes Frequency Relative Frequency
0 5 5/59 = 0.085
1 12 12/59 = 0.203
2 11 11/59 = 0.186
3 14 14/59 = 0.237
4 8 8/59 = 0.136
5 8 8/59 = 0.136
6 1 1/59 = 0.017
Total 59 1.00 (rounding error)

This is the relative frequency histogram:


2.5 (a) The table of relative frequencies for workers in the department is

Mode of Transportation   Frequency   Relative Frequency
Drive alone              25          25/40 = 0.625
Car pool                  3           3/40 = 0.075
Ride bus                  7           7/40 = 0.175
Other                     5           5/40 = 0.125
Total                    40          1.000







(b) The pie chart for workers in the department is


2.6 The table of relative frequencies for the money raised (in million dollars) is

Source                         Frequency   Relative Frequency
Individuals and bequests       117         117/207 = 0.565
Industry and business           24          24/207 = 0.116
Foundations and associations    66          66/207 = 0.319
Total                          207         1.000

The pie chart for the university fund drive is


2.7 There are overlapping classes in the grouping. A report of 3 stolen bicycles will fall
in two classes.

2.8 There is a gap. A report of 6 complaints in one week does not fall in any class. The
last class should be 6 or more.


2.9 There is a gap. The response 5 close friends does not fall in any class. The last
class should be 5 or more.

2.10 The first class should be less than 175 pounds. Otherwise, a lightweight kicker
cannot be assigned to a class.

2.11 (a) Yes. (b) Yes. (c) Yes. (d) No. (e) No.

2.12 The frequency table of the survey response is

Response Frequency Relative Frequency
1          14          14/50 = 0.28
2          13          13/50 = 0.26
3           7           7/50 = 0.14
4          16          16/50 = 0.32
Total      50          1.00

2.13 (a) The relative frequencies are 0.18, 0.48, 0.26, and 0.08 for 0, 1, 2, and 3 bags,
respectively.


(b) Nearly one-half of the passengers check exactly one bag. The longest tail is to
the right.

(c) The proportion of passengers who fail to check a bag is 9/50 = 0.18.








2.14 The dot diagram of meter readings is
[Dot plot of the measurements; horizontal axis from 422 to 472.]

2.15 The dot diagram of amounts of radiation leakage is


2.16 The dot diagram of number of bad checks received is
[Dot plot; horizontal axis: Number of Bad Checks Received, 3 to 8.]






2.17 (a) The dot diagram of number of CFUs is

(b) There is a long tail to the right with one extremely large value of 1600 CFU
units.
(c) There is one day so the proportion is 1/15 = 0.067.

2.18 (a) The frequency distribution of tornado fatalities is given in the table below.

Class Interval Frequency Relative Frequency
[0, 25)      2    2/57 = 0.035
[25, 50)    19   19/57 = 0.333
[50, 75)    18   18/57 = 0.316
[75, 100)    7    7/57 = 0.123
[100, 150)   5    5/57 = 0.088
[150, 200)   2    2/57 = 0.035
[200, 250)   1    1/57 = 0.018
[250, 550)   3    3/57 = 0.053
Total       57   1.001 (rounding error)

(b) The relative frequency histogram is given below.
[Relative frequency histogram; horizontal axis: Number of Deaths, class boundaries 0, 25, 50, 75, 100, 150, 200, 250; vertical axis: Relative Frequency.]

(c) The proportion of years having 49 or fewer tornado fatalities is
0.035 + 0.333 = 0.368.
(d) There is a long tail to the right due to the fact that the last class interval is much
wider than the others yet still exhibits a low frequency of observations.



2.19 (a) In the following frequency distribution of lizard speed (in meters per second),
the left endpoint is included in the class interval but not the right endpoint.

Class Interval Frequency Relative Frequency
0.45 to 0.90 2 0.067
0.90 to 1.35 6 0.200
1.35 to 1.80 11 0.367
1.80 to 2.25 5 0.167
2.25 to 2.70 6 0.200
Total 30 1.001 (rounding error)


(b) All of the class intervals are of length 0.45 so we can graph rectangles whose
heights are the relative frequency. The histogram is

2.20 In the following frequency distribution of order of earthquake magnitudes (as given
on the Richter scale), the left endpoint is not included in the class interval, but the
right one is.

Class Interval Frequency Relative Frequency
(6.0, 6.3] 12 12/55 = 0.218
(6.3, 6.6] 15 15/55 = 0.273
(6.6, 6.9] 10 10/55 = 0.182
(6.9, 7.2] 10 10/55 = 0.182
(7.2, 7.5] 5 5/55 = 0.091
(7.5, 7.8] 2 2/55 = 0.036
(7.8, 8.1] 1 1/55 = 0.018
Total 55 1.0000 (rounding error)

The class intervals all have the same length so we take the option of making the
height of a rectangle equal to the relative frequency. The histogram is

[Frequency histogram; horizontal axis: Order of Earthquake Magnitude, class boundaries 6.3 to 8.1; vertical axis: Frequency, 0 to 16.]


2.21 This time, the frequency distribution is given by

Class Interval Frequency Relative Frequency
(6.0, 6.3] 12 12/55 = 0.218
(6.3, 6.6] 15 15/55 = 0.273
(6.6, 6.9] 10 10/55 = 0.182
(6.9, 7.2] 10 10/55 = 0.182
(7.2, 7.9] 8 8/55 = 0.145
Total 55 1.0000 (rounding error)

The corresponding frequency histogram is as follows:
[Frequency histogram; horizontal axis: Order of Earthquake Magnitude, classes ending at 6.3, 6.6, 6.9, 7.2 and a final class >7.2; vertical axis: Frequency, 0 to 16.]






2.22 The stem-and-leaf display of the scores is

9 58
10 6
11 559
12 6
13 135678
14 344557
15 2478
16 01222567
17 14688
18 24
19 04


2.23 The stem-and-leaf display of the amount of iron present in the oil is

0 6
1 2234455567777889
2 000000222445567799
3 022444566
4 1167
5 12


2.24 The corresponding measurements are

246 268 293 319 344 371 382 397 405 426 443 490 504 568 613

2.25 The double-stem display of the amount of iron present in the oil is
0 6
1 22344
1 55567777889
2 00000022244
2 5567799
3 022444
3 566
4 11
4 67
5 12



2.26 The corresponding measurements are

18 20 20 21 22 22 23 23 24 24 24 25 25 25 26 26 27 29 30

2.27 The five-stem display of the Consumer Price Index in 2001 for the given cities is

15 5
15
15 9
16
16
16
16 7
16 8
17 0
17 22333
17 4
17 677
17 88
18 11
18 2
18
18 67
18
19 011


2.28 (a) The median is 5. The sample mean is
x̄ = (3 + 7 + 4 + 11 + 5)/5 = 30/5 = 6
(b) The median is 3. The sample mean is
x̄ = (3 + 1 + 7 + 3 + 1)/5 = 15/5 = 3

2.29 (a) The median is 3. The sample mean is
x̄ = (2 + 5 + 1 + 4 + 3)/5 = 15/5 = 3
(b) The mean is
x̄ = (26 + 30 + 38 + 32 + 26 + 31)/6 = 183/6 = 30.5



The ordered measurements are: 26, 26, 30, 31, 32, 38

median = (30 + 31)/2 = 30.5
(c) The sample mean is
x̄ = (−1 + 2 + 0 + 1 + 4 + (−1) + 2)/7 = 7/7 = 1
The ordered measurements are: −1, −1, 0, 1, 2, 2, 4.
The median is 1.

2.30 The sample mean is

x̄ = (6.3 + 6.9 + 5.7 + 5.4 + 5.6 + 5.5 + 6.6 + 6.5)/8 = 48.5/8 = 6.0625

The ordered measurements are: 5.4, 5.5, 5.6, 5.7, 6.3, 6.5, 6.6, 6.9.

median = (5.7 + 6.3)/2 = 12/2 = 6
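
The mean and median in Exercise 2.30 can be reproduced with Python's standard `statistics` module. This is a sketch for checking the arithmetic; the list name `values` is an illustrative choice.

```python
import statistics

# The eight measurements listed in Exercise 2.30.
values = [6.3, 6.9, 5.7, 5.4, 5.6, 5.5, 6.6, 6.5]

mean = statistics.mean(values)      # sum of the values divided by 8
median = statistics.median(values)  # even count: average of the two middle values

print(sorted(values))
print(round(mean, 4), round(median, 4))
```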

2.31 (a) x̄ = 3810/15 = 254.

(b) The ordered observations are:


10 20 50 60 80 90 90 110
140 180 260 340 380 400 1600


So, the median is 110 CFU units. The one very large observation makes the
sample mean much larger. Hence, the sample median is better to use in this
instance.


2.32 (a) The ordered monthly incomes are: 2275 2350 2425 2450 2475 2650 4700.

x̄ = 19325/7 = 2760.7, median = 2450.

(b) For a typical salary, the median is better. Only one person earns more than the
mean.


2.33 The mean is 956/12 = 79.67. The claim ignores variability and is not true. It is
certainly unpleasant with a daily maximum temperature of 105°F in July.



2.34 The sample mean is

x̄ = (85 + 82 + 77 + 83 + 80 + 77 + 94)/7 = 578/7 = 82.57 cases

The ordered sales times are: 77, 77, 80, 82, 83, 85, 94

median = 82

2.35 (a) x̄ = 212/25 = 8.48
(b) The sample median is 8. Since the sample mean and median are about the
same, either of them can be used as an indication of radiation leakage.

2.36 The mean, 10.30, is one measure of central tendency and the median, 10.00, is
another. These values may be interpreted as follows. On average, there were 10.3
reports of aggravated assault at the 27 universities. Thirteen of the universities had
at least 10 such reports while 13 recorded at most 10 such reports. At least one
school logged exactly 10 reports.

2.37 The mean, 118.05, is one measure of central tendency and the median, 117.00, is
another. The value 118.05 tells us that, on average, a baby weighed 118.05
ounces. The median tells us that about half of the babies weighed at least 117
ounces while roughly half weighed at most 117 ounces.

2.38 (a) x̄ = [0(7) + 1(10) + 2(13) + 3(5) + 4(2) + 5(1) + 6(1) + 7(1)]/40 = 77/40 = 1.925 (activities)
(b) Sample median is 2 activities
(c) The large observations of 5, 6, and 7 activities did not drastically affect the
computation of the mean in this instance.

2.39 (a) x̄ = [1(7) + 2(9) + 3(6) + 4(5) + 5(3)]/30 = 78/30 = 2.6 (returns)
(b) Sample median is 2 returns.

2.40 (a) Sample median = (240 + 248)/2 = 244 (seconds).
(b) x̄ = 1239/6 = 206.5 (seconds).

2.41 (a) x̄ = 271/40 = 6.775 days.
(b) Sample median = (6 + 7)/2 = 6.5. Both the sample mean and the sample
median give a good indication of the amount of mineral lost.

2.42 (a) Sample median for males = (45.8 + 48.3)/2 = 47.05.
(b) Sample median for females = 30.3.
(c) Sample median for the combined set of males and females = 38.6.



2.43 Sample median = (176 + 187)/2 = 181.5 (minutes).

2.44 In Exercise 2.43, the sample mean = 1862/10 = 186.2 (minutes). The total time for
10 games is 10x̄ = 1862 minutes and this is meaningful. However, 10 × median
ignores the actual times of the long games and is therefore meaningless.

2.45 (a) The dot diagram for the diameters (in feet) of the Indian mounds in southern
Wisconsin is

(b) x̄ = 346/13 = 26.62. Sample median = 24.
(c) 13/4 = 3.25, so we count in 4 observations. Q1 = 22 and Q3 = 30.

2.46 40/4 = 10, an integer, so we average the 10th and 11th observations to get the first
quartile: Q1 = (1 + 1)/2 = 1. Similarly, we average the 20th and 21st observations to
get the median, or Q2 = (2 + 2)/2 = 2 days. Finally, we average the 30th and 31st
observations to get the third quartile: Q3 = (3 + 3)/2 = 3.

2.47 (a) Median = (152 + 154)/2 = 153.
(b) 40/4 = 10, so we need to count in 10 observations. The 11th smallest
observation also satisfies the definition. This yields Q1 = (135 + 136)/2 = 135.5.
Using a similar approach, we find that Q3 = (166 + 167)/2 = 166.5.

2.48 x̄ = 2283/25 = 91.32 calls per shift.

2.49 The ordered data are

50 57 68 69 72 73 73 80 82 91
92 93 94 96 96 100 102 104 105 106
108 109 118 118 127


Since the number of observations is 25, the median or second quartile is the 13th
ordered observation in the list. The first quartile is the 7th ordered observation and
the third quartile is the 19th ordered observation:
Q1 = 73, Q2 = 94, Q3 = 105
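
The counting rule used in Exercises 2.45 to 2.51 (average two neighbors when n·p is an integer, otherwise round up) can be sketched in Python as follows. The function name `sample_percentile` is hypothetical, and the rule is written as these solutions apply it, for 0 < p < 1; statistical software may use a different convention.

```python
import math

def sample_percentile(data, p):
    """100p-th sample percentile by the counting rule used in these solutions (0 < p < 1)."""
    xs = sorted(data)
    k = len(xs) * p
    if abs(k - round(k)) < 1e-9:         # n*p is an integer
        k = round(k)
        return (xs[k - 1] + xs[k]) / 2   # average the k-th and (k+1)-th observations
    return xs[math.ceil(k) - 1]          # otherwise round up and take that observation

# Ordered data from Exercise 2.49 (n = 25).
calls = [50, 57, 68, 69, 72, 73, 73, 80, 82, 91,
         92, 93, 94, 96, 96, 100, 102, 104, 105, 106,
         108, 109, 118, 118, 127]

q1 = sample_percentile(calls, 0.25)
q2 = sample_percentile(calls, 0.50)
q3 = sample_percentile(calls, 0.75)
print(q1, q2, q3)
```

The same function applied to the 15 CFU observations of Exercise 2.51 gives the 4th, 12th, and 14th ordered values for the first quartile, third quartile, and 90th percentile.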


2.50 (a) The ordered data are


0.50 0.76 1.02 1.04 1.20 1.24 1.28 1.29 1.36 1.49
1.55 1.56 1.57 1.57 1.63 1.70 1.72 1.78 1.78 1.92
1.94 2.10 2.11 2.17 2.47 2.52 2.54 2.57 2.66 2.67


Since the number of observations is 30, the median or second quartile is the
average of the 15th and 16th in the list. Sample median = (1.63 + 1.70)/2 =
1.665 meters per second. Because 30/4 = 7.5, the first quartile is the 8th
ordered observation, and because (0.75)(30) = 22.5, the third quartile is the 23rd
ordered observation:
Q1 = 1.29, Q2 = 1.665, Q3 = 2.11
(b) Since 0.9(30) = 27, the 90th percentile is the average of the 27th and 28th
observations in the ordered list. Sample 90th percentile = (2.54 + 2.57)/2 =
2.555.

2.51 (a) The ordered observations are

10 20 50 60 80 90 90 110
140 180 260 340 380 400 1600


Since the sample size is 15, the median is the 8th ordered observation, 110. To
obtain Q1, we find 15/4 = 3.75, so the first quartile is the 4th ordered
observation in the ordered list. To obtain Q3, we find 0.75(15) = 11.25, so that
the third quartile is the 12th ordered observation:
Q1 = 60, Q3 = 340

(b) The 90th percentile requires us to count in at least 0.9(15) = 13.5 or 14
observations. The 90th sample percentile = 400.

2.52 (a) The mean of the original data set is
x̄ = (4 + 8 + 8 + 7 + 9 + 6)/6 = 42/6 = 7
Adding c = 4 to the original data set we get: 8, 12, 12, 11, 13, 10. The mean of
the new data set is
mean of (x + c) = (8 + 12 + 12 + 11 + 13 + 10)/6 = 66/6 = 11
which equals x̄ + c = 7 + 4. Multiplying the original data set by d = 2 we get:
8, 16, 16, 14, 18, 12. The mean of the new data set is

mean of dx = (8 + 16 + 16 + 14 + 18 + 12)/6 = 84/6 = 14
which equals d·x̄ = 2(7).


(b) The median of the original data set is

median = (7 + 8)/2 = 7.5

When c = 4 is added to the original data set, the median of the new data set is

median of (x + 4) = (11 + 12)/2 = 11.5

which equals (median) + c = 7.5 + 4. When the original data set is multiplied
by d = 2, the median of the new data set is

median of 2x = (14 + 16)/2 = 15

which equals d(median) = 2(7.5).
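
The two properties verified in Exercise 2.52 (adding c shifts the mean and median by c; multiplying by d scales them by d) can be checked numerically. The sketch below uses Python's standard `statistics` module; the variable names are illustrative only.

```python
import statistics

x = [4, 8, 8, 7, 9, 6]   # data set from Exercise 2.52
c, d = 4, 2

shifted = [xi + c for xi in x]   # x + c
scaled = [d * xi for xi in x]    # d * x

mean_x, median_x = statistics.mean(x), statistics.median(x)

print(statistics.mean(shifted), mean_x + c)        # both 11
print(statistics.mean(scaled), d * mean_x)         # both 14
print(statistics.median(shifted), median_x + c)    # both 11.5
print(statistics.median(scaled), d * median_x)     # both 15.0
```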

2.53 (a) The ordered data are 73, 74, 76, 76, 80. The median is 76°F and the mean is
x̄ = 379/5 = 75.8°F.

(b) The mean of (°F − 32) is x̄ − 32 by property (i) of Exercise 2.52 with c = −32.
By property (ii),

mean of (5/9)(°F − 32) = (5/9)(mean of (°F − 32))
                       = (5/9)(x̄ − 32) = (5/9)(75.8 − 32) = 24.33°C

By similar properties for the median,

median of (5/9)(°F − 32) = (5/9)((median of °F) − 32) = (5/9)(76 − 32) = 24.44°C

2.54 (a) Company A. The average is highest and a superior machinist would earn above
the median.

(b) Company B. A medium quality machinist would be paid near the median.
Company B has the higher median.







2.55 (a)

(b) Lake Apopka: x̄ = 67/5 = 13.40. Lake Woodruff: x̄ = 454/9 = 50.44.
(c) From the dot diagrams, the males in Lake Apopka have lower levels of
testosterone and their sample mean is only about one-third of that for males in
(un-contaminated) Lake Woodruff. This finding is consistent with the
environmentalists' concern that the contamination has affected the testosterone
levels and the reproductive abilities.

2.56 (a)



(b) Males: x̄ = 67/5 = 13.40. Females: x̄ = 144/11 = 13.09.

(c) The dot diagrams of the amount of testosterone seem to be quite similar for
males and females, although there is a gap in the male diagram. The two means
are nearly the same, which suggests that the insecticide contamination has
pushed hormone concentrations far out of balance because, ordinarily, males
should have higher testosterone concentrations.

2.57 (a) We carry out all necessary calculations in the following table. The mean is
x̄ = 12/3 = 4.

        x    x − x̄    (x − x̄)²
        7      3         9
        2     −2         4
        3     −1         1
Total  12      0        14

(b) The variance and the standard deviation are
s² = 14/(3 − 1) = 7, s = √7 = 2.646


2.58 (a) We carry out all necessary calculations in the following table. The mean is
x̄ = 15/3 = 5.

        x    x − x̄    (x − x̄)²
        4     −1         1
        9      4        16
        2     −3         9
Total  15      0        26

(b) The variance and the standard deviation are
s² = 26/(3 − 1) = 13, s = √13 = 3.606


2.59 (a) We carry out all necessary calculations in the following table. The mean is
x̄ = 32/4 = 8.

        x    x − x̄    (x − x̄)²
        8      0         0
        6     −2         4
       14      6        36
        4     −4        16
Total  32      0        56

(b) The variance and the standard deviation are

s² = 56/(4 − 1) = 18.667, s = √18.667 = 4.320




2.60 (a) We carry out all necessary calculations in the following table. The mean is
x̄ = 9.5/5 = 1.9.

         x     x − x̄    (x − x̄)²
        2.5     0.6      0.36
        1.7    −0.2      0.04
        2.1     0.2      0.04
        1.5    −0.4      0.16
        1.7    −0.2      0.04
Total   9.5     0.0      0.64

(b) The variance and the standard deviation are
s² = 0.64/(5 − 1) = 0.16, s = √0.16 = 0.40



2.61 We carry out all necessary calculations in the following table.

        x     x²
        8     64
        3      9
        4     16
Total  15     89

The variance is
s² = [1/(n − 1)] [Σx² − (Σx)²/n] = (1/2)(89 − 15²/3) = (1/2)(89 − 75) = 7



2.62 We carry out all necessary calculations in the following table.

        x     x²
        8     64
        6     36
       14    196
        4     16
Total  32    312

The variance is
s² = [1/(n − 1)] [Σx² − (Σx)²/n] = (1/3)(312 − 32²/4) = (1/3)(312 − 256) = 56/3 = 18.667
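
Exercises 2.59 and 2.62 compute the same variance two ways: from squared deviations about the mean, and from the sums Σx and Σx². A short Python sketch confirming that the two formulas agree on that data set follows; the variable names are illustrative.

```python
import statistics

x = [8, 6, 14, 4]   # data set used in Exercises 2.59 and 2.62
n = len(x)
mean = sum(x) / n

# Definition: sum of squared deviations divided by n - 1.
by_definition = sum((xi - mean) ** 2 for xi in x) / (n - 1)

# Computational form: (sum of x^2 - (sum of x)^2 / n) divided by n - 1.
by_shortcut = (sum(xi * xi for xi in x) - sum(x) ** 2 / n) / (n - 1)

print(round(by_definition, 3), round(by_shortcut, 3))
print(round(statistics.variance(x), 3))   # library value for comparison
```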


2.63 (a) s² = (34 − 12²/5)/4 = 1.30.
(b) s² = (19 − (−7)²/6)/5 = 2.167.
(c) s² = (499 − 59²/7)/6 = 0.286.


2.64 (a) Many factors could explain the difference in apartment rents. One possible
factor is simply that different landlords may charge different rents. Other
factors are the size of the apartment, the proximity of the apartment to key
locations such as parks or public transportation, and whether utilities such as
water and electricity are included.
(b) s² = (3,836,675 − 5155²/7)/6 = 6730.95.
(c) s = √6730.95 = 82.04

2.65 s = √((9726 − 346²/13)/12) = 6.5643.

2.66 (a) s² = (142 − 26²/5)/4 = 1.70.
(b) s = √1.70 = 1.304

2.67 (a) s² = (3,140,900 − 3810²/15)/14 = 155,225.7.
(b) s = √155,225.7 = 393.99.
(c) s² = (580,900 − 2210²/14)/13 = 17,848.9, so s = √17,848.9 = 133.6. The
single very large value greatly inflates the standard deviation.
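
The effect described in Exercise 2.67 can be reproduced from the 15 CFU observations listed in Exercise 2.31. The sketch below is a check on the arithmetic, not part of the original solution; the helper name `sample_sd` is hypothetical.

```python
import math

# CFU observations from Exercise 2.31.
cfu = [10, 20, 50, 60, 80, 90, 90, 110, 140, 180, 260, 340, 380, 400, 1600]

def sample_sd(data):
    """Sample standard deviation via the computational formula used in the text."""
    n = len(data)
    total = sum(data)
    total_sq = sum(v * v for v in data)
    return math.sqrt((total_sq - total ** 2 / n) / (n - 1))

with_outlier = sample_sd(cfu)
without_outlier = sample_sd([v for v in cfu if v != 1600])

print(round(with_outlier, 2), round(without_outlier, 1))
```

Dropping the single value 1600 cuts the standard deviation to roughly one-third of its original size.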

2.68 (a) s² = (2410 − 212²/25)/24 = 25.51.
(b) s = √25.51 = 5.05.

2.69 (a) x̄ = 1862/10 = 186.2.
(b) s² = (353,796 − 1862²/10)/9 = 787.96.
(c) s = √787.96 = 28.07.

2.70 (a) x̄ = 62/50 = 1.24 bags.
(b) s² = (112 − 62²/50)/49 = 0.71673, so that s = .847.

2.71 (a) Median = 68.4.
(b) x̄ = 478.4/7 = 68.343.
(c) s² = (32730.34 − 478.4²/7)/6 = 5.853. Hence s = 2.419.

2.72 (a) The measure of variation displayed is 7.61, the sample standard deviation. The
sample variance is s² = 7.61² = 57.9121.
(b) The interquartile range is Q3 − Q1 = 14.00 − 5.00 = 9.00. This means the center
half of the data span an interval of length 9.
(c) Any value greater than 7.61 would correspond to greater variation.





2.73 (a) The measure of variation displayed is 15.47, the sample standard deviation.
The sample variance is s² = 15.47² = 239.321.

(b) The interquartile range is Q3 − Q1 = 131.00 − 106.00 = 25.00. This means the
center half of the data span an interval of length 25 ounces.

(c) Any value smaller than 15.47 would correspond to smaller variation.

2.74 (i) For the observations 5, 9, 9, 8, 10, 7, x̄ = 8, s² = 3.2 and s = 1.789. Adding c = 4
to the observations x, we have 9, 13, 13, 12, 14, 11. The sample mean and
variance of the new data set are

mean of (x + 4) = (9 + 13 + 13 + 12 + 14 + 11)/6 = 12
variance of (x + 4) = (9 + 1 + 1 + 0 + 4 + 1)/(6 − 1) = 16/5 = 3.2

So the standard deviation of the new data set x + 4 is √3.2 = 1.789, which is the
same as the standard deviation of x.

(ii) Multiply the observations x by d = 2. We get 10, 18, 18, 16, 20, 14. The
sample mean and variance of the new data set are

mean of 2x = (10 + 18 + 18 + 16 + 20 + 14)/6 = 16
variance of 2x = (36 + 4 + 4 + 0 + 16 + 4)/(6 − 1)
               = 4(9 + 1 + 1 + 0 + 4 + 1)/5 = 4(3.2) = 12.8

So the standard deviation of the new data set 2x is 2√3.2 = 3.578, or d times the
standard deviation of x.
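
The conclusions of Exercise 2.74 (adding a constant leaves s unchanged; multiplying by d multiplies s by d and s² by d²) can be checked with a few lines of Python. This is a sketch using the standard `statistics` module; the names are illustrative.

```python
import math
import statistics

x = [5, 9, 9, 8, 10, 7]   # observations from Exercise 2.74
c, d = 4, 2

var_x = statistics.variance(x)                           # sample variance of x
var_shifted = statistics.variance([xi + c for xi in x])  # variance of x + 4
var_scaled = statistics.variance([d * xi for xi in x])   # variance of 2x

print(round(var_x, 3), round(var_shifted, 3), round(var_scaled, 3))
print(round(math.sqrt(var_x), 3), round(math.sqrt(var_scaled), 3))
```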

2.75 Using the data set in Exercise 2.22, we determined in Exercise 2.47 that
Q1 = 135.5 and Q3 = 166.5. Hence,
Interquartile range = Q3 − Q1 = 166.5 − 135.5 = 31.0 points.

2.76 We determined in an earlier exercise (2.46) that Q1 = 1 and Q3 = 3. Hence,
Interquartile range = Q3 − Q1 = 3 − 1 = 2 days.





2.77 No. Typically, the middle half of a data set is much more concentrated than the
combination of the two quarters, one in each tail. As an example, for the water
quality data of Exercise 2.17, the range is 1600 − 10 = 1590 because of one
extremely large observation. From the quartiles determined in Exercise 2.51, the
interquartile range is 340 − 60 = 280. The range is nearly six times as large as the
interquartile range.

2.78 (a) x̄ = 150.125 and 2s = 49.354, so x̄ ± 2s is the interval (100.771, 199.479).
This interval contains 38 observations or proportion .95 of the observations.
And x̄ ± 3s is the interval (76.094, 224.156), which contains proportion 1 of the
observations.
(b) The empirical guidelines suggest proportion 0.95 in the interval x̄ ± 2s and we
observed 0.95. They suggest proportion 0.997 for the interval x̄ ± 3s and we
observed 1.000. The agreement is excellent.

2.79 (a) x̄ = 6.775 and s = √19.4096 = 4.406.
(b) The proportions of the observations are given in the following table:

              x̄ ± s              x̄ ± 2s              x̄ ± 3s
Interval:     (2.369, 11.181)    (−2.037, 15.587)    (−6.443, 19.993)
Proportion:   26/40 = 0.65       38/40 = 0.95        40/40 = 1.00
Guidelines:   0.68               0.95                0.997

(c) We observe a good agreement with the proportions suggested by the empirical
guideline.

2.80 (a) x̄ = 51.71/30 = 1.724 and s = √((98.641 − 51.71²/30)/29) = 0.5727.
(b) The proportions of the observations are given in the following table:

              x̄ ± s             x̄ ± 2s            x̄ ± 3s
Interval:     (1.151, 2.297)    (.579, 2.869)     (.006, 3.442)
Proportion:   20/30 = 0.667     29/30 = 0.967     30/30 = 1.00
Guidelines:   0.68              0.95              0.997

(c) We observe a good agreement with the proportions suggested by the empirical
guideline.

2.81 (a) x̄ = 2.6 and s = √1.69655 = 1.3025.
(b) The proportions of the observations are given in the following table:

              x̄ ± s               x̄ ± 2s             x̄ ± 3s
Interval:     (1.2975, 3.9025)    (−0.005, 5.205)    (−1.3075, 6.5075)
Proportion:   15/30 = 0.50        30/30 = 1.00       30/30 = 1.00
Guidelines:   0.68                0.95               0.997

(c) We observe a good agreement with the proportions suggested by the empirical
guideline.



2.82 (a) The z-values of 350 and 620 are

z = (350 − 490)/120 = −1.167, z = (620 − 490)/120 = 1.083.

(b) For the z-score of 2.4, the raw score is obtained by solving the equation

z = (x − 210)/50 = 2.4, so x = 210 + 50(2.4) = 330.

2.83 (a) z = (102 − 118.05)/15.47 = −1.037
(b) z = (144 − 118.05)/15.47 = 1.677

2.84 (a), (b) The boxplots for salaries in City A and City B are shown below.






(c) There is a greater difference between the cities with respect to the higher salaries.
For instance, any salary above the median in City B is greater than the 75th
percentile in City A.

2.85 For males, the minimum and the maximum horizontal velocity of a thrown ball are
25.2 and 59.9 respectively. The quartiles are:

Q1 = (38.6 + 39.1)/2 = 38.85,
median = (45.8 + 48.3)/2 = 47.05,
Q3 = (49.9 + 51.7)/2 = 50.8.

For females, the minimum and the maximum horizontal velocity of a thrown ball
are 19.4 and 53.7 respectively. The quartiles are

Q1 = 25.7, median = 30.3, Q3 = 33.5.

The boxplots of the male and female throwing speeds are

Comparing the two boxplots, we can see that males throw the ball faster than
females.

2.86 (a) The differences, arranged in order, between 2007 and 1992 Consumer Price
Index are

13 13 14 18 20 20 21 21 21 22 23 24 25 25 26 26
27 33 33 34 35 36 37 41


Q1 = (20 + 21)/2 = 20.5, Q2 = (24 + 25)/2 = 24.5, Q3 = (33 + 33)/2 = 33

The five-number summary is: 13, 20.5, 24.5, 33, 41



Alternatively, from Minitab (which uses a slightly different convention for the quartiles, so its Q1 is 20.25):

Descriptive Statistics:

Variable N N* Mean SE Mean StDev Minimum Q1 Median Q3 Maximum
C1 24 0 25.33 1.59 7.81 13.00 20.25 24.50 33.00 41.00

(b) [Boxplot of Increases in Consumer Price Indices; axis from 10 to 40.]



2.87 (a) x̄ = 608/24 = 25.33 and s = √((16806 − (608)²/24)/(24 − 1)) = 7.811

(b) Since x̄ − 2s = 25.33 − 2(7.811) = 9.708 and x̄ + 2s = 25.33 + 2(7.811) = 40.952,
only the increase of 41 for Honolulu lies outside the interval. The proportion
23/24 = 0.958 lies within the interval.
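
The interval check in Exercise 2.87 can be reproduced from the 24 ordered differences listed in Exercise 2.86. The Python sketch below recomputes x̄ and s and counts the observations inside x̄ ± 2s; the variable names are illustrative.

```python
import statistics

# Ordered CPI increases from Exercise 2.86.
increases = [13, 13, 14, 18, 20, 20, 21, 21, 21, 22, 23, 24,
             25, 25, 26, 26, 27, 33, 33, 34, 35, 36, 37, 41]

mean = statistics.mean(increases)
s = statistics.stdev(increases)            # sample standard deviation (divisor n - 1)
low, high = mean - 2 * s, mean + 2 * s

inside = [v for v in increases if low < v < high]
proportion = len(inside) / len(increases)

print(round(mean, 2), round(s, 3))
print(round(low, 3), round(high, 3), round(proportion, 3))
```

The endpoints differ from the text in the third decimal place because the text rounds x̄ to 25.33 before computing the interval.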











2.88 (a) Using the ordered data set from Example 5, we have

min = 4.5, max = 10.0
Q1 = 6.3, Q2 = 7.3, Q3 = 8.3


(b) The box plot depicting this data set is as follows:
[Boxplot; axis: Hours of Sleep, 4 to 10.]


2.89 From Exercise 2.38, we know that
x̄ = 1.925 and s = √((249 − (77)²/40)/(40 − 1)) = 1.607



2.90 (a) x̄ = 296/14 = 21.14
(b) s = √((12078 − 296²/14)/(14 − 1)) = 21.16
s

= =


(c) The dot plot is given below


(d) All are losses except for two gains in the 1998 and 2002 elections.



2.91 (a) The ordered data are
−8 −5 4 5 8 11 12 16 26 30 43 47 52 55
Median = (12 + 16)/2 = 14 seats lost.

(b) The maximum number of seats lost, 55, occurred when Harry S. Truman was
President. The minimum number, −8 (that is, a gain of 8 seats), occurred during
G.W. Bush's term as President.
(c) range = 55 − (−8) = 63

2.92

The process appears to be in statistical control. The pattern is nearly a horizontal
band with one possible low value.

2.93


The value 215 from the second pay period looks high and 194 from the fifth period
is possibly high.



2.94 We calculate x̄ = 2283/25 = 91.32 and s = √(9281.44/24) = 19.67, so the upper
limit is x̄ + 2s = 130.66 and the lower limit is x̄ − 2s = 51.98.

Only the value 50 calls for worker 20 is out of control.

2.95 We calculate x̄ = 2501/26 = 96.2 and s = √(65254/25) = 51.1, so the upper limit is
x̄ + 2s = 198.4 and the lower limit is x̄ − 2s = −6.0, which we take as 0.

Only the value 215 from the second pay period is out of control.
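
The control limits in Exercise 2.95 follow directly from the two sums quoted in the solution. The Python sketch below assumes, as the solution's formula implies, that 65254 is the sum of squared deviations from the mean; the variable names are illustrative.

```python
import math

n = 26                 # number of pay periods
total = 2501           # sum of the observations, as quoted in the solution
sum_sq_dev = 65254     # assumed: sum of squared deviations about the mean

mean = total / n
s = math.sqrt(sum_sq_dev / (n - 1))

upper = mean + 2 * s
lower = max(0.0, mean - 2 * s)   # a negative lower limit is taken as 0

print(round(mean, 1), round(s, 1), round(upper, 1), lower)
print(215 > upper)   # the second pay period exceeds the upper limit
```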








2.96
[Time Series Plot of Exchange Rate; horizontal axis: Year, 1992 to 2007; vertical axis: Exchange Rate, 1.0 to 1.6.]


The process appears to be in statistical control between 1994 and 2003, but then
begins to taper off from 2003 to 2007.

2.97
[I Chart of Exchange Rate; horizontal axis: Observation, 1 to 15; vertical axis: Individual Value; center line X̄ = 1.3544, UCL = 1.7918, LCL = 0.917.]

The process appears to be in statistical control.



2.98 We re-calculate without the outlier 5326: x̄ = 18329/15 = 1221.9 and
s = √(5195138/14) = 609.16, so the upper limit is x̄ + 2s = 2440.2 and the lower
limit is x̄ − 2s = 3.6. All of the points are within the control limits.

2.99 (a) The relative frequencies of the occupation groups are:

Relative Frequency
2007 2000
Goods Producing 0.139 0.161
Service (Private) 0.722 0.702
Government 0.139 0.136
Total 1.000 0.999

(b) The proportions of persons in private service occupations and in government have
increased while the proportion in goods producing has decreased from 2000 to
2007.

2.100 (a) The frequency table of intended major of the students is:

Intended major Frequency Relative Frequency
Biological Science 18 0.367
Humanities 4 0.082
Physical Sciences 9 0.184
Social Science 18 0.367
Total 49 1.000







(b) The frequency table of year in college of the students is:

Year Frequency Relative Frequency
1 4 0.082
2 10 0.204
3 20 0.408
4 15 0.306
Total 49 1.000

The histogram is:
[Relative frequency histogram; horizontal axis: Year in College, 1 to 4; vertical axis: Relative Frequency, 0 to 0.45.]

2.101 The dot diagrams of heights for the male and female students are









2.102 The frequency table of the causes for power outage is:
Frequency Table for Causes of Outage
Cause Frequency
Trees and limbs 12
Animals 9
Lightning 3
Wind storm 1
Fuse 1
Unknown 4

The Pareto chart for the cause of outage is

2.103 (a) Yes. The exact number of lunches is the sum of the frequencies of the first
three classes.
(b) Yes. The exact number of lunches is the sum of the frequencies of the last
three classes.
(c) No.

2.104 The sample mean and sample standard deviation are:

x̄ = (143 + 131 + 101 + 143 + 111)/5 = 125.8 and s = √(1452.8/(5 − 1)) = 19.1 millimeters.

2.105 (a) The mean, 227.4, is one measure of central tendency and the median, 232.5, is
another. These values may be interpreted as follows. On average, the 20
grizzly bears weigh 227.4 pounds apiece. Half of the grizzly bears sampled
weighed at least 232.5 pounds while half weighed at most 232.5 pounds.
(b) The sample standard deviation is 82.7 pounds.
(c) The z score for a weight of 320 pounds is
z = (320 − 227.4)/82.7 = 1.12





2.106 (a) Median = (64 + 67)/2 = 65.5.
(b) We count in 38/4 = 9.5 or 10 observations to find Q1 = 50 and Q3 = 79.
(c) The proportion of students who scored below 70 is 20/38 = 0.526.
The proportion of students who scored 80 or over is 9/38 = 0.237.

2.107 (a) Sample median = (9 + 9)/2 = 9.
(b) x̄ = 271/30 = 9.033.
(c) The sample variance is s² = (1/29)(2561 − 271²/30) = 3.895.

2.108 (a) The double stem display is
4 23
4 6677899
5 0011112244444
5 555566677778
6 0111244
6 589

(b) Median = (54 + 55)/2 = 54.5, Q1 = 50.5, Q3 = 57.5.

2.109 (a) x̄ = 7, s = 2
(b) By the properties, the new data set x + 100 has sample mean (7 + 100) =
107 and standard deviation 2. By direct calculation, we verify

x̄ = (106 + 108 + 104 + 109 + 108)/5 = 107
s² = [(106 − 107)² + (108 − 107)² + (104 − 107)² + (109 − 107)² + (108 − 107)²]/4 = 4
(c) By the properties, the new data set 3x has sample mean 3x̄ = 3(7) = 21
and standard deviation 3s = 3(2) = 6. By direct calculation, we verify
x̄ = (18 + 24 + 12 + 27 + 24)/5 = 21
s² = [(3)² + (−3)² + (9)² + (−6)² + (−3)²]/4 = 9(1 + 1 + 9 + 4 + 1)/4 = 9(4) = 36, so s = 6.

2.110 (a) For the heights of males, x̄ = 69.61, s = 2.97.
(b) For the heights of females, x̄ = 66.14, s = 2.60.
(c) For the heights of males, median = 70, Q1 = 68, Q3 = 72.
(d) For the heights of females, median = 66.5, Q1 = 65, Q3 = 67.


2.111 (a) The dot diagrams are


(b) From the dot diagrams we can see the number of flies (grape juice) is centered
at about 11 and the number of flies (regular food) is centered near 25. The
spread looks about the same.

(c) Regular food: x̄ = 25.1, s = 6.84.
Grape juice: x̄ = 11.05, s = 6.194.

2.112 (a) The dot diagram of the usage times per ounce of toothpaste is


(b) The relative frequency of usage times that do not exceed .80 is

(number of usage times less than or equal to .80)/24 = 12/24 = 0.50.
(c) x̄ = 0.794 and s = 0.115.
(d) Median = 0.805, Q1 = 0.755 and Q3 = 0.86.



2.113 (a) x̄ = 5.38 and s = 3.42.
(b) Median = 5.
(c) Range = 13 − 0 = 13.

2.114 (a) 0.75 since 96.0 is the third quartile.
(b) 0.50 since 84.0 is the median.
(c) 0.95 in the interval x̄ ± 2s or 59.5 to 101.5 if the frequency distribution is nearly
bell-shaped.
(d) 0.50 since Q1 = 75.5 and Q3 = 96.0.
(e) 0.997 in the interval x̄ ± 3s or 49.0 to 112.0 if the frequency distribution is
nearly bell-shaped.

2.115 (a) Median = 4.505, Q1 = 4.30 and Q3 = 4.70.
(b) 90th percentile = (4.80 + 5.07)/2 = 4.935.
(c) x̄ = 4.5074 and s = 0.368.
(d) The boxplot of acid rain in Wisconsin is



2.116 (a) and (b). We have x̄ = 4.507 and s = 0.368, so

              x̄ ± s             x̄ ± 2s            x̄ ± 3s
Interval:     (4.139, 4.875)    (3.771, 5.243)    (3.403, 5.611)
Proportion:   38/50 = 0.76      46/50 = 0.92      50/50 = 1.000
Guidelines:   0.68              0.95              0.997

(c) The observed proportions are somewhat close to those suggested by the
empirical guidelines. However, the proportion between one and two standard
deviations, (46 − 38)/50 = 0.16, is noticeably smaller than the expected
0.95 − 0.68 = 0.27.







2.117 (a) Median = 6.7, Q1 = 6.4 and Q3 = 7.1.
(b) x̄ = 370.4/55 = 6.7345 and s = √([2506.22 − (370.4)²/55]/54) = 0.466.
(c) The boxplot of the data is

2.118 (a)
Class Interval      Frequency   Relative Frequency
(−4500, −1600]       1          0.026
(−1600, −850]        1          0.026
(−850, −250]         2          0.051
(−250, 0]            8          0.205
(0, 250]            13          0.333
(250, 850]           6          0.154
(850, 1650]          5          0.128
(1650, 2450]         3          0.077
Total               39          1.0000

(b) The frequency and relative frequency histograms are:
[Frequency histogram, "Yearly Changes in Dow Jones Averages"; horizontal axis: class boundaries −1600, −850, −250, 0, 250, 850, 1650, 2450; vertical axis: Frequency, 0 to 14.]

[Relative frequency histogram, "Yearly Changes in Dow Jones Averages"; horizontal axis: Changes in DJ Average, classes 1 to 8; vertical axis: Relative Frequency, 0 to 0.35.]

(Here, the classes were renumbered 1 to 8, consecutively, for simplicity.)

(c) Since the value 0 is included in the interval (-250, 0], it is unclear how many
of the 8 observations contained in that interval are negative. It is, however,
unlikely that the Dow Jones average at the end of one year was exactly the
same as that in the previous year. It seems safe to assume that all 8
differences in the interval (-250, 0] are negative. The proportion of changes
that are negative is then (1 + 1 + 2 + 8)/39 = 0.308.
(d) The distribution is roughly bell-shaped centered around class 5.
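
The relative frequencies in part (a) and the proportion in part (c) follow directly from the class counts. What follows is a minimal Python sketch, assuming the class limits of the table with the four lowest classes taken as negative (as on the histogram axis); the variable names are illustrative only.

```python
# Class intervals (lower, upper] and frequencies from the table in part (a)
classes = [(-4500, -1600), (-1600, -850), (-850, -250), (-250, 0),
           (0, 250), (250, 850), (850, 1650), (1650, 2450)]
freqs = [1, 1, 2, 8, 13, 6, 5, 3]

total = sum(freqs)
rel_freqs = [round(f / total, 3) for f in freqs]

# Part (c): count every class whose upper endpoint is <= 0 as negative,
# i.e. assume none of the 8 changes in (-250, 0] is exactly 0.
negative = sum(f for (lo, hi), f in zip(classes, freqs) if hi <= 0)
prop_negative = round(negative / total, 3)

for (lo, hi), f, rf in zip(classes, freqs, rel_freqs):
    print(f"({lo}, {hi}]  {f:2d}  {rf:.3f}")
print(f"Total {total}; proportion negative = {prop_negative}")
```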

2.119 (a)
[Figure: time plot, "Winning Times in Minutes"; vertical axis: winning time (3.6 to 4.2); horizontal axis: Year (1964 to 2008)]



(b) It is not reasonable, because a frequency distribution would not show the
systematic decrease of the winning times over the years, which is the main
feature of these observations.

2.120 The mode is 1 bag, since the sample has twenty-four 1s.

2.121 We calculate $\bar{x} = \dfrac{4167}{50} = 83.34$ and
$s = \sqrt{\dfrac{419{,}411 - (4167)^2/50}{49}} = 38.368$.

2.122 (a) The time plot for this data set is as follows:

[Figure: "Time Plot for Deaths Due to Lightning"; vertical axis: Number of Deaths (50 to 200); horizontal axis: Year (1959 to 2004)]


(b) The number of deaths due to lightning is steadily declining over the indicated
time period. The mean and standard deviation, while important, would not
capture this trend. The 5-number summary would be an important supplement
to report when describing this data set.

2.123 (a) The partial MINITAB output is



(b) The partial MINITAB output for the acid rain data.


(Note that MINITAB uses a slightly different convention for determining $Q_1$ and $Q_3$.)

2.124 (a) The histogram and box plot reveal a longer right hand tail. The five
observations 4749, 4846, 4949, 5005, and 5157 seem to be detached and
larger than expected for a bell-shaped pattern.

(b) Textbook scheme: $Q_1 = 3470$. MINITAB scheme: $Q_1 = 3457.5$.
The scheme used by the text yields a slightly higher value.

2.125 The partial MINITAB output for the data set in Table 4.


2.126 The MINITAB commands and partial output of the final times to run 1.5 miles in
Data Bank D.5.


2.127 The mean and standard deviation given by MINITAB are the rounded off values
of the answer given by SAS.









2.128 (a) Freshwater growth of male salmon: $\bar{x} = 98.350$, $s = 30.03$,
median = 98.5, $Q_1 = 80$ and $Q_3 = 118$. The frequency table for freshwater
growth of male salmon is:

Class Interval Frequency Relative Frequency
40-60 6 0.150
60-80 4 0.100
80-100 11 0.275
100-120 10 0.250
120-140 6 0.150
140-160 3 0.075
Total 40 1.000

Because the cells have equal width, we take the option of using relative
frequency for the vertical scale. The histogram is shown below.

(b) Freshwater growth of female salmon: $\bar{x} = 114.825$, $s = 22.22$,
median = 115.5, $Q_1 = 98.5$ and $Q_3 = 132.5$. The frequency table for
freshwater growth of female salmon is:

Class Interval Frequency Relative Frequency
60-80 2 0.050
80-100 9 0.225
100-120 12 0.300
120-140 13 0.325
140-160 3 0.075
160-180 1 0.025
Total 40 1.000

Because the cells have equal width, we take the option of using relative
frequency for the vertical scale. The histogram is shown below.



(c) The box plots for freshwater growth of male and female salmon are:


2.129 (a) The histogram of the alligator data is

(b) $\bar{x} = \dfrac{4035}{37} = 109.1$ and $s = \sqrt{\dfrac{155672}{37 - 1}} = 65.8$.
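
The two values in part (b) can be recomputed from raw data. The sketch below assumes that the 37 ordered observations listed in 2.130 are this alligator data set; their total is 4035, which matches the numerator of the mean. It uses Python's standard `statistics` module.

```python
import statistics

# Assumption: these are the 37 ordered observations given in 2.130
x = [16, 18, 19, 19, 24, 29, 32, 33, 46, 58,
     68, 72, 78, 82, 82, 83, 99, 101, 109, 110,
     114, 118, 125, 134, 140, 141, 142, 143, 163, 170,
     184, 194, 200, 220, 220, 221, 228]

n = len(x)
mean = statistics.mean(x)
ss = sum((v - mean) ** 2 for v in x)   # sum of squared deviations
s = (ss / (n - 1)) ** 0.5              # same value as statistics.stdev(x)

print(f"n = {n}, sum = {sum(x)}, mean = {mean:.1f}")
print(f"sum of squared deviations = {ss:.0f}, s = {s:.1f}")
```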



2.130 (a) The ordered data are


16 18 19 19 24 29 32 33 46 58
68 72 78 82 82 83 99 101 109 110
114 118 125 134 140 141 142 143 163 170
184 194 200 220 220 221 228


Median = 109, $Q_1 = 58$ and $Q_3 = 143$.

(b) We count in 34 positions to find the 90th percentile = 220. The only two
observations that are higher were taken from females.
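
The quartiles and the 90th percentile above are consistent with the following counting rule, which is inferred from these solutions rather than quoted from the text: with n ordered observations and proportion p, if np is not an integer, round it up and take the observation in that position; if np is an integer, average the observations in positions np and np + 1. A minimal Python sketch of that rule (the function name `sample_percentile` is hypothetical):

```python
import math


def sample_percentile(sorted_data, p):
    """100p-th sample percentile for 0 < p < 1 under the rule described above.
    Positions are 1-based, as in the solutions."""
    n = len(sorted_data)
    k = n * p
    if abs(k - round(k)) < 1e-9:           # n*p is (numerically) an integer
        k = int(round(k))
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    return sorted_data[math.ceil(k) - 1]   # round n*p up to the next position


ordered = [16, 18, 19, 19, 24, 29, 32, 33, 46, 58,
           68, 72, 78, 82, 82, 83, 99, 101, 109, 110,
           114, 118, 125, 134, 140, 141, 142, 143, 163, 170,
           184, 194, 200, 220, 220, 221, 228]

for label, p in [("Q1", 0.25), ("Median", 0.50), ("Q3", 0.75), ("90th", 0.90)]:
    print(label, sample_percentile(ordered, p))
```

Statistical packages often interpolate between positions instead, which is why software output can differ slightly from hand-computed quartiles.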



2.131 (a)




(b) The ordered observations are


75.3 75.7 75.9 75.9 76.2 76.3 76.4 76.4 76.6 76.6
76.7 76.9 76.9 77.0 77.0 77.1 77.4 77.4 77.4 77.4
77.4 77.5 77.6 77.6 77.8 77.9 77.9 77.9 77.9 77.9
78.0 78.1 78.3 78.4 78.4 78.5 79.1 79.2 80.0 80.4


There are 40 observations so the median = (77.4 + 77.4)/2 = 77.4. The first
quartile is the average of the 40/4 = 10th and 11th observations in the sorted
list, and the third quartile is the average of the 30th and 31st:
$Q_1 = (76.6 + 76.7)/2 = 76.65$ and $Q_3 = (77.9 + 78.0)/2 = 77.95$.

(c) The interval $\bar{x} \pm s$, or (76.36, 78.56), has relative frequency 30/40 = 0.75
compared to 0.683. The interval $\bar{x} \pm 2s$, or (75.26, 79.66), has relative
frequency 38/40 = 0.95 compared to 0.95. The interval $\bar{x} \pm 3s$, or
(74.16, 80.76), has relative frequency 1 compared to 0.997. The agreement is
quite good.
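
The relative frequencies in part (c) can be confirmed by counting the ordered observations of part (b) that fall inside each interval. The sketch below is a minimal Python check that takes the three intervals as stated in the solution rather than recomputing the mean and standard deviation; the helper name `proportion_within` is illustrative.

```python
# The 40 ordered observations from part (b)
data = [75.3, 75.7, 75.9, 75.9, 76.2, 76.3, 76.4, 76.4, 76.6, 76.6,
        76.7, 76.9, 76.9, 77.0, 77.0, 77.1, 77.4, 77.4, 77.4, 77.4,
        77.4, 77.5, 77.6, 77.6, 77.8, 77.9, 77.9, 77.9, 77.9, 77.9,
        78.0, 78.1, 78.3, 78.4, 78.4, 78.5, 79.1, 79.2, 80.0, 80.4]

# Intervals mean +/- k*s for k = 1, 2, 3, as quoted in part (c)
intervals = {1: (76.36, 78.56), 2: (75.26, 79.66), 3: (74.16, 80.76)}


def proportion_within(values, low, high):
    """Fraction of values strictly inside the open interval (low, high)."""
    return sum(low < v < high for v in values) / len(values)


proportions = {k: proportion_within(data, lo, hi)
               for k, (lo, hi) in intervals.items()}
for k, prop in proportions.items():
    print(f"within {k} standard deviation(s): {prop:.3f}")
```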
