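4.1 a \( \bar{x} = \frac{\sum x_i}{n} = \frac{0 + 5 + 15 + 25 + 30 + 33 + 40 + 44 + 52 + 60 + 81 + 104}{12} = \frac{489}{12} = 40.75 \)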
Ordered data: 0, 5, 15, 25, 30, 33, 40, 44, 52, 60, 81, 104; Median = (33 + 40)/2 = 36.5
Mode = all
4.2 \( \bar{x} = \frac{\sum x_i}{n} = \frac{5 + 7 + 0 + 3 + 15 + 6 + 5 + 9 + 3 + 8 + 10 + 5 + 2 + 0 + 12}{15} = \frac{90}{15} = 6.0 \)
4.3 a \( \bar{x} = \frac{\sum x_i}{n} = \frac{5.5 + 7.2 + 1.6 + 22.0 + 8.7 + 2.8 + 5.3 + 3.4 + 12.5 + 18.6 + 8.3 + 6.6}{12} = \frac{102.5}{12} = 8.54 \)
Ordered data: 1.6, 2.8, 3.4, 5.3, 5.5, 6.6, 7.2, 8.3, 8.7, 12.5, 18.6, 22.0; Median = 6.9
Mode = all
b The mean number of miles jogged is 8.54. Half the sample jogged more than 6.9 miles and half
jogged less.
4.4 a \( \bar{x} = \frac{\sum x_i}{n} = \frac{33 + 29 + 45 + 60 + 42 + 19 + 52 + 38 + 36}{9} = \frac{354}{9} = 39.3 \)
Ordered data: 19, 29, 33, 36, 38, 42, 45, 52, 60; Median = 38
Mode: all
b The mean amount of time is 39.3 minutes. Half the group took less than 38 minutes.
4.5 a \( \bar{x} = \frac{\sum x_i}{n} = \frac{14 + 8 + 3 + 2 + 6 + 4 + 9 + 13 + 10 + 12 + 7 + 4 + 9 + 13 + 15 + 8 + 11 + 12 + 4 + 0}{20} = \frac{164}{20} = 8.2 \)
Ordered data: 0, 2, 3, 4, 4, 4, 6, 7, 8, 8, 9, 9, 10, 11, 12, 12, 13, 13, 14, 15; Median = 8.5
Mode = 4
b The mean number of days to submit grades is 8.2, the median is 8.5, and the mode is 4.
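The three measures of central location reported for Exercise 4.5 can be checked with Python's standard library. This is a minimal sketch; the variable names are illustrative, and the data are the twenty observations listed in the solution.

```python
from statistics import mean, median, multimode

# Days to submit grades, as listed in the solution to Exercise 4.5
days = [14, 8, 3, 2, 6, 4, 9, 13, 10, 12, 7, 4, 9, 13, 15, 8, 11, 12, 4, 0]

sample_mean = mean(days)        # sum of the observations divided by n
sample_median = median(days)    # average of the two middle values when n is even
sample_modes = multimode(days)  # list of the most frequently occurring value(s)

print(sample_mean, sample_median, sample_modes)
```

`multimode` returns a list, so a data set in which every value occurs once (the "Mode = all" cases in 4.1 to 4.4) simply returns every observation.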
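4.8 a \( R_1 = \frac{1200 - 1000}{1000} = .20 \)
\( R_2 = \frac{1200 - 1200}{1200} = 0 \)
\( R_3 = \frac{1500 - 1200}{1200} = .25 \)
\( R_4 = \frac{2000 - 1500}{1500} = .33 \)
b \( \bar{x} = \frac{\sum x_i}{n} = \frac{.20 + 0 + .25 + .33}{4} = \frac{.78}{4} = .195 \)
c \( R_g = \sqrt[4]{(1 + R_1)(1 + R_2)(1 + R_3)(1 + R_4)} - 1 = \sqrt[4]{2000/1000} - 1 = .189 \)
4.9 a \( R_1 = \frac{10 - 12}{12} = -.167 \)
\( R_2 = \frac{14 - 10}{10} = .40 \)
\( R_3 = \frac{15 - 14}{14} = .071 \)
\( R_4 = \frac{22 - 15}{15} = .467 \)
\( R_5 = \frac{30 - 22}{22} = .364 \)
\( R_6 = \frac{25 - 30}{30} = -.167 \)
b \( \bar{x} = \frac{\sum x_i}{n} = \frac{-.167 + .40 + .071 + .467 + .364 - .167}{6} = \frac{.968}{6} = .161 \)
Ordered data: -.167, -.167, .071, .364, .40, .467; Median = .218
c \( R_g = \sqrt[6]{(1 + R_1)(1 + R_2)(1 + R_3)(1 + R_4)(1 + R_5)(1 + R_6)} - 1 = \sqrt[6]{25/12} - 1 = .130 \)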
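The difference between the arithmetic mean return and the geometric mean return in Exercise 4.8 can be sketched in a few lines. The sequence 1000, 1200, 1200, 1500, 2000 is read off the numerators and denominators of the four returns; treating it as a list of successive investment values, and the function name `geometric_mean_return`, are assumptions made for illustration.

```python
import math

def geometric_mean_return(returns):
    """R_g = [(1 + R_1)(1 + R_2)...(1 + R_n)]^(1/n) - 1 for a list of period returns."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

# Successive values implied by the four returns in Exercise 4.8 (assumed interpretation)
values = [1000, 1200, 1200, 1500, 2000]
returns = [(later - earlier) / earlier for earlier, later in zip(values, values[1:])]

arithmetic = sum(returns) / len(returns)
geometric = geometric_mean_return(returns)
print(round(arithmetic, 3), round(geometric, 3))
```

Because the code carries the unrounded fourth return (1/3 rather than .33), the arithmetic mean can differ from a hand calculation in the third decimal place. The geometric mean depends only on the ratio of the last value to the first.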
b The mean expenditure is $117.08 and half the sample spent less than $1246.00.
4.15a
b
c The mean and median commuting times in New York are larger than those in Los Angeles.
4.16a
b The mean percentage is .81. Half the sample paid less than .83.
4.17a
b The mean speed is 32.91 mph. Half the sample traveled slower than 32 mph and half traveled
faster. The mode is 32.
4.18a
b The mean expenditure is $592.04. Half the sample spent less than $591.00
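4.19 \( \bar{x} = \frac{\sum x_i}{n} = \frac{9 + 3 + 7 + 4 + 1 + 7 + 5 + 4}{8} = \frac{40}{8} = 5 \)
\( s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{46}{7} = 6.57 \)
4.20 \( \bar{x} = \frac{\sum x_i}{n} = \frac{4 + 5 + 3 + 6 + 5 + 6 + 5 + 6}{8} = \frac{40}{8} = 5 \)
\( s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{8}{7} = 1.14 \)
4.21 \( \bar{x} = \frac{\sum x_i}{n} = \frac{12 + 6 + 22 + 31 + 23 + 13 + 15 + 17 + 21}{9} = \frac{160}{9} = 17.78 \)
\( s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{433.56}{8} = 54.19 \)
\( s = \sqrt{s^2} = \sqrt{54.19} = 7.36 \)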
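4.22 \( \bar{x} = \frac{\sum x_i}{n} = \frac{0 + (-5) + (-3) + 6 + 4 + (-4) + 1 + (-5) + 0 + 3}{10} = \frac{-3}{10} = -.30 \)
\( s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{136.1}{9} = 15.12 \)
\( s = \sqrt{s^2} = \sqrt{15.12} = 3.89 \)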
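The variance and standard deviation in Exercise 4.22 follow from the definitional formula, which the sketch below applies directly. The signs of the observations are an assumption: the parenthesized values are read as negative, which reproduces the reported variance of 15.12. The function name is illustrative.

```python
import math

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Exercise 4.22 observations; parenthesized values are assumed to be negative
data = [0, -5, -3, 6, 4, -4, 1, -5, 0, 3]

x_bar = sum(data) / len(data)
s2 = sample_variance(data)
s = math.sqrt(s2)
print(round(x_bar, 2), round(s2, 2), round(s, 2))
```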
4.31 Range = 25.85, \( s^2 = 29.46 \), and s = 5.43; there is considerable variation between prices; at least 75% of the prices lie within 10.86 of the mean; at least 88.9% of the prices lie within 16.29 of the mean.
4.32 \( s^2 = 40.73 \) mph\(^2\) and s = 6.38 mph; at least 75% of the speeds lie within 12.76 mph of the mean; at least 88.9% of the speeds lie within 19.14 mph of the mean.
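The 75% and 88.9% figures come from Chebysheff's Theorem, which states that at least 1 - 1/k^2 of the observations lie within k standard deviations of the mean for k > 1. The sketch below reproduces the Exercise 4.32 statements from s = 6.38 mph; the function name is an illustrative choice.

```python
def chebysheff_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebysheff's Theorem is informative only for k > 1")
    return 1 - 1 / k ** 2

s = 6.38  # standard deviation of speeds (mph) from Exercise 4.32

for k in (2, 3):
    proportion = chebysheff_bound(k)
    half_width = k * s
    print(f"at least {proportion:.1%} of speeds lie within {half_width:.2f} mph of the mean")
```

Unlike the Empirical Rule, this bound makes no assumption about the shape of the histogram, which is why the solutions fall back on it for skewed data.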
4.33 a
Punter   Variance   Standard deviation
1        40.22      6.34
2        14.81      3.85
3        3.63       1.91
\( \bar{x} = 175.73 \) and s = 62.1; at least 75% of the withdrawals lie within $124.20 of the mean; at least 88.9% of the withdrawals lie within $186.30 of the mean.
b.
c The histogram is approximately bell shaped allowing us to use the Empirical Rule.
Approximately 68% of adults are between 12.9 and 82.5 years old.
4.38a x
b.
c. The histogram is positively skewed, so we must use Chebysheff's Theorem. At least 75% of American adults watch between 0 and 249 minutes of television news.
4.39 a x
b.
The histogram is very positively skewed. As a result we can only use Chebysheff's Theorem. At least 75% of Americans born outside the United States were between 0 and 62.6 years old.
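\( L_{25} = (n + 1)\frac{25}{100} = (16)(.25) = 4 \); the fourth number is 3.
\( L_{50} = (n + 1)\frac{50}{100} = (16)(.5) = 8 \); the eighth number is 5.
\( L_{75} = (n + 1)\frac{75}{100} = (16)(.75) = 12 \); the twelfth number is 7.
\( L_{30} = (n + 1)\frac{30}{100} = (11)(.30) = 3.3 \); the 30th percentile is 22.3.
\( L_{80} = (n + 1)\frac{80}{100} = (11)(.80) = 8.8 \); the 80th percentile is 30.8.
\( L_{40} = (n + 1)\frac{40}{100} = (11)(.40) = 4.4 \); the 40th percentile is 52 + .4(60 - 52) = 55.2.
\( L_{25} = (n + 1)\frac{25}{100} = (14)(.25) = 3.5 \); the first quartile is 13.05.
\( L_{50} = (n + 1)\frac{50}{100} = (14)(.5) = 7 \); the second quartile is 14.7.
\( L_{75} = (n + 1)\frac{75}{100} = (14)(.75) = 10.5 \); the third quartile is 15.6.
\( L_{20} = (n + 1)\frac{20}{100} = (11)(.20) = 2.2 \); the 20th percentile is 43 + .2(51 - 43) = 44.6.
\( L_{30} = (n + 1)\frac{30}{100} = (16)(.30) = 4.8 \); the third decile is 5 + .8(7 - 5) = 6.6.
\( L_{60} = (n + 1)\frac{60}{100} = (16)(.60) = 9.6 \); the sixth decile is 17 + .6(18 - 17) = 17.6.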
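The percentile calculations above all follow the same two steps: find the location (n + 1)P/100 in the ordered data, then interpolate between the neighboring observations when the location is not a whole number. The sketch below implements that rule. The data set is hypothetical, chosen only so that its fourth, eighth, and twelfth ordered values are 3, 5, and 7; it is not the data from the exercise.

```python
def percentile(data, p):
    """P-th percentile via location L_P = (n + 1) * P / 100 with linear interpolation."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * p / 100
    position = int(location)         # whole part of the location (1-based position)
    fraction = location - position   # fractional part used for interpolation
    if position < 1:
        return ordered[0]            # location falls before the first observation
    if position >= n:
        return ordered[-1]           # location falls at or beyond the last observation
    below = ordered[position - 1]
    above = ordered[position]
    return below + fraction * (above - below)

# Hypothetical ordered sample with n = 15, so n + 1 = 16
sample = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 9, 10]

quartiles = [percentile(sample, p) for p in (25, 50, 75)]
print(quartiles)
```

Statistical packages use several slightly different interpolation conventions, so software output may not match this hand method exactly.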
Cats: First quartile = 743, second quartile = 856, and third quartile = 988.
Dogs cost more money than cats. Both sets of expenses are positively skewed.
4.52 First quartile = 50, second quartile = 125, and third quartile = 260. The amounts are positively skewed.
4.53 BA First quartile = 25,730, second quartile = 27,765, and third quartile = 29,836
BSc First quartile = 29,927, second quartile = 33,397, and third quartile = 36,745
BBA First quartile = 31,316, second quartile = 34,284, and third quartile = 39,551
Other First quartile = 28,254, second quartile = 29,951, and third quartile = 32,905
The starting salaries of BA and other are the lowest and least variable. Starting salaries for BBA and BSc
are higher.
4.54 a
b The times taken to complete rounds on the public course are longer and more variable than those on private courses.
4.56 a The quartiles are 26, 28.5, and 32
b the times are positively skewed.
4.57 The quartiles are 8081.81, 9890.48, and 11,692.92. One-quarter of mortgage payments are
less than $607.19 and one quarter exceed $909.38.
4.58 TIME1
TIME2
Americans spend more time watching news on television than reading news on the Internet.
4.59
4.60 EDUC
SPEDUC
\( r = \frac{s_{xy}}{s_x s_y} = \frac{150}{(16)(12)} = .7813 \)
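4.65a.
        x_i    y_i    x_i^2    y_i^2   x_i y_i
         20     14      400      196       280
         40     16    1,600      256       640
         60     18    3,600      324     1,080
         50     17    2,500      289       850
         50     18    2,500      324       900
         55     18    3,025      324       990
         60     18    3,600      324     1,080
         70     20    4,900      400     1,400
Total   405    139   22,125    2,437     7,220

\( \sum x_i = 405 \), \( \sum y_i = 139 \), \( \sum x_i^2 = 22{,}125 \), \( \sum y_i^2 = 2{,}437 \), \( \sum x_i y_i = 7{,}220 \)

\( s_{xy} = \frac{1}{n-1}\left[\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\right] = \frac{1}{8-1}\left[7{,}220 - \frac{(405)(139)}{8}\right] = 26.16 \)

\( s_x^2 = \frac{1}{n-1}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right] = \frac{1}{8-1}\left[22{,}125 - \frac{(405)^2}{8}\right] = 231.7 \)

\( s_x = \sqrt{s_x^2} = \sqrt{231.7} = 15.22 \)

\( s_y^2 = \frac{1}{n-1}\left[\sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}\right] = \frac{1}{8-1}\left[2{,}437 - \frac{(139)^2}{8}\right] = 3.13 \)

\( s_y = \sqrt{s_y^2} = \sqrt{3.13} = 1.77 \)

\( r = \frac{s_{xy}}{s_x s_y} = \frac{26.16}{(15.22)(1.77)} = .9711 \)

\( R^2 = r^2 = (.9711)^2 = .9430 \)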
The covariance is 26.16, the coefficient of correlation is .9711 and the coefficient of determination
is .9430.
94.30% of the variation in expenses is explained by the variation in total sales.
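The shortcut formulas used in Exercise 4.65 need only the five column totals, so the arithmetic is easy to check in code. The following is a sketch under that assumption; the function name `summary_statistics` is illustrative and the totals are those reported in the solution.

```python
import math

def summary_statistics(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Covariance, variances, and correlation from column totals (shortcut formulas)."""
    s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)
    s2_x = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    s2_y = (sum_y2 - sum_y ** 2 / n) / (n - 1)
    r = s_xy / math.sqrt(s2_x * s2_y)
    return s_xy, s2_x, s2_y, r

# Totals reported for Exercise 4.65 (n = 8 observations)
s_xy, s2_x, s2_y, r = summary_statistics(8, 405, 139, 22125, 2437, 7220)
print(round(s_xy, 2), round(s2_x, 1), round(r, 4), round(r ** 2, 4))
```

Carrying full precision gives a correlation of about .972 rather than .9711, because the hand calculation rounds the two standard deviations to two decimals before dividing. The conclusion of a very strong positive linear relationship is unaffected.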
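b.
\( b_1 = \frac{s_{xy}}{s_x^2} = \frac{26.16}{231.7} = .113 \)

\( \bar{x} = \frac{\sum x_i}{n} = \frac{405}{8} = 50.63 \)

\( \bar{y} = \frac{\sum y_i}{n} = \frac{139}{8} = 17.38 \)

\( b_0 = \bar{y} - b_1 \bar{x} = 17.38 - (.113)(50.63) = 11.66 \)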
The estimated variable cost is .113 and the estimated fixed cost is 11.66.
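The slope and intercept of the least squares line follow directly from the covariance, the variance of x, and the two sample means. The sketch below recomputes the Exercise 4.65 estimates from the values reported in the solution; the function name is an illustrative choice.

```python
def least_squares_coefficients(s_xy, s2_x, x_bar, y_bar):
    """Return (b0, b1) for the line y-hat = b0 + b1 * x."""
    b1 = s_xy / s2_x           # slope: covariance divided by the variance of x
    b0 = y_bar - b1 * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Covariance and variance as reported in the solution; means from the column totals
b0, b1 = least_squares_coefficients(s_xy=26.16, s2_x=231.7, x_bar=405 / 8, y_bar=139 / 8)
print(round(b0, 2), round(b1, 3))
```

In the cost setting of this exercise the slope is read as the variable cost per unit of x and the intercept as the fixed cost.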
4.66
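        x_i    y_i    x_i^2    y_i^2   x_i y_i
         40     77    1,600    5,929     3,080
         42     63    1,764    3,969     2,646
         37     79    1,369    6,241     2,923
         47     86    2,209    7,396     4,042
         25     51      625    2,601     1,275
         44     78    1,936    6,084     3,432
         41     83    1,681    6,889     3,403
         48     90    2,304    8,100     4,320
         35     65    1,225    4,225     2,275
         28     47      784    2,209     1,316
Total   387    719   15,497   53,643    28,712

\( \sum x_i = 387 \), \( \sum y_i = 719 \), \( \sum x_i^2 = 15{,}497 \), \( \sum y_i^2 = 53{,}643 \), \( \sum x_i y_i = 28{,}712 \)

\( s_{xy} = \frac{1}{n-1}\left[\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\right] = \frac{1}{10-1}\left[28{,}712 - \frac{(387)(719)}{10}\right] = 98.52 \)

\( s_x^2 = \frac{1}{n-1}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right] = \frac{1}{10-1}\left[15{,}497 - \frac{(387)^2}{10}\right] = 57.79 \)

\( s_y^2 = \frac{1}{n-1}\left[\sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}\right] = \frac{1}{10-1}\left[53{,}643 - \frac{(719)^2}{10}\right] = 216.32 \)

\( r = \frac{s_{xy}}{\sqrt{s_x^2 s_y^2}} = \frac{98.52}{\sqrt{(57.79)(216.32)}} = .8811 \)

\( R^2 = r^2 = (.8811)^2 = .7763 \)

\( b_1 = \frac{s_{xy}}{s_x^2} = \frac{98.52}{57.79} = 1.705 \)

\( \bar{x} = \frac{\sum x_i}{n} = \frac{387}{10} = 38.7 \)

\( \bar{y} = \frac{\sum y_i}{n} = \frac{719}{10} = 71.9 \)

\( b_0 = \bar{y} - b_1 \bar{x} = 71.9 - (1.705)(38.7) = 5.92 \)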
e. There is a strong positive linear relationship between marks and study time. For each additional
hour of study time marks increased on average by 1.705.
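As a cross-check on the shortcut calculations, the same statistics for Exercise 4.66 can be computed from the raw observations with the definitional formulas. In this sketch x is taken to be study time and y the mark, which matches the interpretation of the slope; the variable and function names are illustrative.

```python
import math

# Raw data from the Exercise 4.66 table: x = study time (hours), y = mark
study_hours = [40, 42, 37, 47, 25, 44, 41, 48, 35, 28]
marks = [77, 63, 79, 86, 51, 78, 83, 90, 65, 47]

def covariance(x, y):
    """Sample covariance; covariance(x, x) is the sample variance of x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

s_xy = covariance(study_hours, marks)
s2_x = covariance(study_hours, study_hours)
s2_y = covariance(marks, marks)

r = s_xy / math.sqrt(s2_x * s2_y)   # coefficient of correlation
b1 = s_xy / s2_x                    # slope: change in mark per additional hour
b0 = sum(marks) / len(marks) - b1 * (sum(study_hours) / len(study_hours))

print(round(s_xy, 2), round(r, 3), round(r ** 2, 3), round(b1, 3), round(b0, 2))
```

The definitional and shortcut formulas are algebraically identical, so any difference from the hand-calculated values is due to rounding of intermediate results.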
4.67
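         x_i     y_i       x_i^2    y_i^2    x_i y_i
         599     9.6     358,801    92.16    5,750.4
         689     8.8     474,721    77.44    6,063.2
         584     7.4     341,056    54.76    4,321.6
         631    10.0     398,161   100.00    6,310.0
         594     7.8     352,836    60.84    4,633.2
         643     9.2     413,449    84.64    5,915.6
         656     9.6     430,336    92.16    6,297.6
         594     8.4     352,836    70.56    4,989.6
         710    11.2     504,100   125.44    7,952.0
         611     7.6     373,321    57.76    4,643.6
         593     8.8     351,649    77.44    5,218.4
         683     8.0     466,489    64.00    5,464.0
Total  7,587   106.4   4,817,755   957.2    67,559.2

\( \sum x_i = 7{,}587 \), \( \sum y_i = 106.4 \), \( \sum x_i^2 = 4{,}817{,}755 \), \( \sum y_i^2 = 957.2 \), \( \sum x_i y_i = 67{,}559.2 \)

\( s_{xy} = \frac{1}{n-1}\left[\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\right] = \frac{1}{12-1}\left[67{,}559.2 - \frac{(7{,}587)(106.4)}{12}\right] = 26.16 \)

\( s_x^2 = \frac{1}{n-1}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right] = \frac{1}{12-1}\left[4{,}817{,}755 - \frac{(7{,}587)^2}{12}\right] = 1{,}897.7 \)

\( s_x = \sqrt{s_x^2} = \sqrt{1{,}897.7} = 43.56 \)

\( s_y^2 = \frac{1}{n-1}\left[\sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}\right] = \frac{1}{12-1}\left[957.2 - \frac{(106.4)^2}{12}\right] = 1.25 \)

\( s_y = \sqrt{s_y^2} = \sqrt{1.25} = 1.12 \)

\( r = \frac{s_{xy}}{s_x s_y} = \frac{26.16}{(43.56)(1.12)} = .5362 \)

\( R^2 = r^2 = (.5362)^2 = .2875 \)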
The covariance is 26.16, the coefficient of correlation is .5362, and the coefficient of
determination is .2875. The coefficient of determination tells us that 28.75% of the variation in
MBA GPAs is explained by the variation in GMAT scores.
4.68
\( R^2 = r^2 = (.6332)^2 = .4009 \); 40.09% of the variation in the employment rate is explained by the
variation in the unemployment rate.
4.69 a
\( R^2 = r^2 = (.2543)^2 = .0647 \).
b There is a weak linear relationship between age and medical expenses. Only 6.47% of the
variation in average medical bills is explained by the variation in age.
c The least squares line is \( \hat{y} = 5.966 + .2257x \)
d For each additional year of age mean medical expenses increase on average by $.2257 or 23
cents.
e Charge 25 cents per day per year of age.
4.70
4.71
Only 0.55% of the variation in the number of wells drilled is explained by the variation in the
price of oil. The relationship is too weak to interpret the value of the slope coefficient.
4.72
\( R^2 = (.0830)^2 = .0069 \).
There is a very weak positive relationship between the two variables.
4.73
4.74
\( \hat{y} = 263.4 + 71.65x \); estimated fixed costs = $263.40, estimated variable costs = $71.65
4.75a
b The slope coefficient is 510.37; home attendance increases on average by 510.37 for each win.
46.41% of the variation in home attendance is explained by the variation in the number of wins.
4.76a
4.77
a. The slope coefficient is .26; for each million dollars in payroll the number of wins increases on average by .26. Thus, the cost of winning one additional game is 1/.26 million = $3.846 million.
b. The coefficient of determination tells us that only 4.11.9% of the variation in the number of
wins is explained by the variation in payroll,
4.78
a. The slope coefficient is .0428; for each million dollars in payroll the number of wins increases on average by .0428. Thus, the cost of winning one additional game is 1/.0428 million = $23.364 million.
b. The coefficient of determination = .0866, which reveals that the linear relationship is very weak.
4.79
a. The slope coefficient is .1526; for each million dollars in payroll the number of wins increases on average by .1526. Thus, the cost of winning one additional game is 1/.1526 million = $6.553 million.
b. The coefficient of determination = .0876, which reveals that the linear relationship is very weak.
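The "cost of one additional win" in Exercises 4.77 to 4.79 is the reciprocal of the slope: if each million dollars of payroll buys b1 additional wins on average, one win costs 1/b1 million dollars. A small sketch, with an illustrative function name and the three slopes quoted in the solutions:

```python
def cost_per_additional_win(wins_per_million):
    """Payroll (in $ millions) associated with one extra win, given the regression slope."""
    if wins_per_million <= 0:
        raise ValueError("the reciprocal is only meaningful for a positive slope")
    return 1 / wins_per_million

# Slope coefficients reported in Exercises 4.77, 4.78, and 4.79
slopes = {"4.77": 0.26, "4.78": 0.0428, "4.79": 0.1526}

costs = {exercise: cost_per_additional_win(b1) for exercise, b1 in slopes.items()}
for exercise, cost in costs.items():
    print(f"Exercise {exercise}: ${cost:.3f} million per additional win")
```

These figures should be read alongside the coefficients of determination: when the linear relationship is very weak, the reciprocal of the slope is a rough average rather than a dependable price.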
4.80a
For each additional win home attendance increases on average by 84.391. The coefficient of
determination is .2468; there is a weak relationship between the number of wins and home
attendance.
b
For each additional win away attendance increases on average by 31.151. The coefficient of
determination is .4407; there is a moderately strong relationship between the number of wins and
away attendance.
4.81
\( R^2 = .4023 \). The relationship between wins and home attendance as a percentage of capacity is
weaker than the relationship between wins and home attendance.
4.82
For each additional win home attendance increases on average by 947.38. The coefficient of
determination is .1108; there is a very weak linear relationship between the number of wins and
home attendance.
For each additional win away attendance increases on average by 216.74. The coefficient of
determination is .0322; there is a very weak linear relationship between the number of wins and
away attendance.
4.83
\( R^2 = .3304 \). The relationship between wins and home attendance as a percentage of capacity is
stronger than the relationship between wins and home attendance.
4.84 a
There is a weak negative linear relationship between education and television watching.
b \( R^2 = .0572 \); 5.72% of the variation in the amount of television is explained by the variation in
education.
4.85 Correlation matrix
4.87
             b1       R2
AT&T         0.687    .318
Aetna        1.256    .296
Cigna        1.829    .463
Coca-Cola    0.601    .324
Disney       1.104    .592
Ford         2.654    .296
McDonalds    0.637    .314
4.88
                b1       R2
Barrick Gold    0.594    .071
[missing]       0.399    .089
[missing]       0.610    .164
Enbridge        0.314    .109
Fortis          0.211    .032
Methanex        1.301    .270
[missing]       1.465    .201
Telus           0.446    .097
[missing]       0.393    .197
4.89
                      b1       R2
Amazon                1.324    .267
Amgen                 0.492    .096
Apple                 1.358    .401
Cisco Systems         1.100    .604
[missing]             1.075    .327
Intel                 1.074    .556
Microsoft             0.865    .436
Oracle                0.866    .526
Research in Motion    1.920    .387
4.90 a
b We can see that among those who repaid the mean score is larger than that of those who did not, and the standard deviation is smaller. This information is similar to, but more precise than, that obtained in Exercise 3.23.
4.91 Repaid loan:
Defaulted on loan:
The box plots make it a little easier to see the overlap between the two sets of data (indicating that
the scorecard is not very good).
4.92
\( R^2 = (.6784)^2 = .4603 \); 46.03% of the variation in statistics marks is explained by the variation in
calculus marks. The coefficient of determination provides a more precise indication of the
strength of the linear relationship.
4.93
a \( \hat{y} = 17.933 + .6041x \)
b The coefficient of determination is .0505, which indicates that only 5.05% of the variation in
incomes is explained by the variation in heights.
4.95
The coefficient of determination is .0779, which indicates that only 7.79% of the variation in sales
is explained by the time between movies.
4.96a
b. The slope coefficient is .07; for each additional square foot the price increases on average by $.07 thousand. More simply, for each additional square foot the price increases on average by $70.
c. From the least squares line we can more precisely measure the relationship between the two
variables.
4.97 B.A.
B.Sc.
B.B.A.
Other
Using the same class limits the histograms provide more detail than do the box plots.
Public course
The information obtained here is more detailed than the information provided by the box plots.
4.99
a \( \bar{x} = 35.01 \), median = 36
b s = 7.68
c Half of the bone density losses lie below 36. At least 75% of the numbers lie between 19.64 and
50.38, at least 88.9% of the numbers lie between 11.96 and 58.06.
4.100
4.101
\( R^2 = r^2 = (.5742)^2 = .3297 \); 32.97% of the variation in bone loss is explained by the variation in age.
4.102 a & b
\( R^2 = .5489 \) and the least squares line is \( \hat{y} = 49{,}337 - 553.7x \)
c 54.9% of the variation in the number of coffees sold is explained by the variation in temperature. For each additional degree of temperature the number of coffees sold decreases on average by 553.7 cups. Alternatively, for each 1-degree drop in temperature the number of coffees sold increases on average by 553.7 cups.
d We can measure the strength of the linear relationship accurately and the slope coefficient gives
information about how temperature and the number of coffees sold are related.
4.103a mean, median, and standard deviation
\( \bar{x} = 93.90 \), s = 7.72
c We hope Chris is better at statistics than he is at golf.
4.104
c.
d The times are positively skewed. Half the times are above 26 hours.
4.105
80.21% of the variation in scores is explained by the variation in the number of putts.
4.106 a & b
\( R^2 = .412 \) and the least squares line is \( \hat{y} = 8.2897 + 3.146x \)
c 41.2% of the variation in Internet use is explained by the variation in education. For each
additional year of education Internet use increases on average by 3.146 hours.
d We can measure the strength of the linear relationship accurately and the slope coefficient gives
information about how education and Internet use are related.
4.107
\( \bar{x} = 150.77 \), median = 150.50, and s = 19.76. The average crop yield is 150.77 and there is a great deal of variation from one plot to another.
4.108a & b
c 36.92% of the variation in yield is explained by the variation in rainfall. For each additional
inch of rainfall yield increases on average by .128 bushels.
d We can measure the strength of the linear relationship accurately and the slope coefficient gives
information about how rainfall and crop yield are related.
4.109
c 15.49% of the variation in yield is explained by the variation in the amount of fertilizer. For each additional unit of fertilizer yield increases on average by .180 bushels.
d We can measure the strength of the linear relationship accurately and the slope coefficient gives
information about how the amount of fertilizer and crop yield are related.
4.110a
b The mean debt is $12,067. Half the sample incurred debts below $12,047 and half incurred debts
above. The mode is $11,621.
Case 4.1 a Scatter diagrams with time as the independent variable and temperature anomalies as
the dependent variable
Monthly average increase is .0006° Celsius. For the 1600-month period the increase was 1600(.0006) = .96° Celsius.
Scatter diagrams with carbon dioxide levels as the independent variable and temperature
anomalies as the dependent variable
The coefficient of determination is .5075, which means that 50.75% of the variation in temperature anomalies is explained by the variation in CO2 levels. There is a moderately strong linear relationship.
Case 4.2 1880 to 1940
From 1880 to 1940 the earth warmed at an average monthly rate of .0007° Celsius.
1941 to 1975
From 1941 to 1975 the earth cooled at an average monthly rate of .0004° Celsius.
1976 to 1997
From 1976 to 1997 the earth warmed at an average monthly rate of .0021° Celsius.
1998 to 2009
From 1998 to 2009 the earth warmed at an average monthly rate of .0012° Celsius.
Over different periods of time the earth has warmed and cooled.
The cost of winning one additional game is $1 million/.1526 = $6.553 million. However, the
coefficient of determination is only .0876, which tells us that there are many other variables that
determine how well a team will do.
2005-06 Season
The cost of winning one additional game is $1 million/.7795 = $1.283 million. The coefficient of
determination is .3072.
The small coefficient of determination in the year before the strike seems to indicate that team
owners were spending large amounts of money and getting little in return. The results are
markedly different in the year after the strike. There is a much stronger linear relationship between
payroll and the number of wins and the cost of winning one additional game is considerably
smaller.
Case 4.4
The coefficient of determination is \( (-.1787)^2 = .0319 \). There is a weak negative linear relationship between the percentage of rejected ballots and the percentage of yes votes.
The coefficient of determination is \( (.0678)^2 = .0046 \). There is a very weak positive linear relationship between the percentage of rejected ballots and the percentage of Allophones.
The statistics provide some evidence that electoral fraud has taken place.