
Question 10.15
A university career center conducted a study to determine whether there is an association
between starting salaries Y (in thousands of dollars) and grade point average X1,
age upon completion X2, and gender X3 for students in the school of engineering.
The career center obtained the following sample data:
Y (Starting Salary)   X1 (GPA)   X2 (Age)   X3 (Gender)
50.5                  2.95       22         F
52                    3.2        23         M
53.1                  3.4        22         M
54.4                  3.6        23         M
53.2                  3.5        24         M
47                    2.85       24         F
50                    3.1        25         F
50.8                  3.2        26         F
47.7                  3.05       23         M
46.4                  2.7        24         F
47.5                  2.75       28         F
49.2                  3.1        22         M
51                    3.15       22         M
49.2                  2.95       23         F
48.8                  0.75       26         M

(a) Fit an appropriate regression model to these data, evaluate it,
and revise it as suggested by your evaluation.
(b) Think of another potential predictor variable that could further
explain the variation in the sample starting salaries.
Solution
(a)
Gender is coded as X3: 0 = Female, 1 = Male.

Y      X1     X2   X3
50.5   2.95   22   0
52     3.2    23   1
53.1   3.4    22   1
54.4   3.6    23   1
53.2   3.5    24   1
47     2.85   24   0
50     3.1    25   0
50.8   3.2    26   0
47.7   3.05   23   1
46.4   2.7    24   0
47.5   2.75   28   0
49.2   3.1    22   1
51     3.15   22   1
49.2   2.95   23   0
48.8   0.75   26   1

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.703494178
R Square            0.494904058
Adjusted R Square   0.35715062
Standard Error      1.931858531
Observations        15

ANOVA
             df   SS        MS        F         Significance F
Regression   3    40.2245   13.4082   3.59268   0.04982
Residual     11   41.0529   3.73208
Total        14   81.2773

            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept   40.3640753     10.34532195      3.90167   0.00247   17.59418    63.134
X1          1.8664914      0.883585363      2.11241   0.05833   -0.07827    3.81125
X2          0.11969999     0.360596758      0.33195   0.74617   -0.67397    0.91337
X3          2.50171596     1.120612565      2.23245   0.04732   0.035264    4.96817
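
The t statistics in this output are each coefficient divided by its standard error, and the P-values indicate which predictor is the weakest candidate to keep. The following is a minimal sketch of that evaluation step, assuming a 5% significance level (the level is an assumption, not stated in the source); the numbers are copied from the output above.

```python
# Coefficient and standard error pairs copied from the full-model output.
coefs = {
    "Intercept": (40.3640753, 10.34532195),
    "X1": (1.8664914, 0.883585363),
    "X2": (0.11969999, 0.360596758),
    "X3": (2.50171596, 1.120612565),
}
# P-values as printed in the same output.
p_values = {"Intercept": 0.00247, "X1": 0.05833, "X2": 0.74617, "X3": 0.04732}

ALPHA = 0.05  # assumed significance level
predictors = ("X1", "X2", "X3")

# t statistic = coefficient / standard error
t_stats = {name: b / se for name, (b, se) in coefs.items()}

# Predictors whose P-value exceeds the assumed level, and the weakest one.
not_significant = [name for name in predictors if p_values[name] > ALPHA]
weakest = max(predictors, key=lambda name: p_values[name])

for name in coefs:
    print(f"{name:9s} t = {t_stats[name]:8.5f}  p = {p_values[name]:.5f}")
print("Not significant at 5%:", not_significant)
print("Candidate to drop first:", weakest)
```

Dropping only the weakest predictor and refitting, rather than removing every predictor above the threshold at once, is a common choice because the remaining P-values change once a variable leaves the model.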

Age (X2) is not significant in the first fit (P-value 0.74617), so the model is refit with X1 and X3 only.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.69989
R Square            0.48984
Adjusted R Square   0.40482
Standard Error      1.85885
Observations        15

ANOVA
             df   SS        MS        F          Significance F
Regression   2    39.8132   19.9066   5.761116   0.01763
Residual     12   41.4641   3.45534
Total        14   81.2773

            Coefficients   Standard Error   t Stat    P-value    Lower 95%   Upper 95%
Intercept   43.70409143    2.31443          18.8833   2.73E-10   38.6614     48.7468
X1          1.730310245    0.753            2.29789   0.040352   0.08966     3.37096
X3          2.334050035    0.96252          2.42493   0.032028   0.23689     4.43121

The model would be:

Y = 43.704 + 1.73031X1 + 2.33405X3
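
The revised fit of Y on GPA and gender can be reproduced from the sample data without a spreadsheet. The sketch below solves the normal equations with a small Gaussian elimination routine; the helper names `solve` and `ols` are illustrative choices, not part of the source, and the data are typed in exactly as listed in the question.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(columns, y):
    """Least squares with an intercept; returns [b0, b1, ...]."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


salary = [50.5, 52, 53.1, 54.4, 53.2, 47, 50, 50.8, 47.7, 46.4,
          47.5, 49.2, 51, 49.2, 48.8]                       # Y
gpa = [2.95, 3.2, 3.4, 3.6, 3.5, 2.85, 3.1, 3.2, 3.05, 2.7,
       2.75, 3.1, 3.15, 2.95, 0.75]                         # X1
male = [0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1]        # X3

b0, b1, b3 = ols([gpa, male], salary)

fitted = [b0 + b1 * g + b3 * m for g, m in zip(gpa, male)]
mean_y = sum(salary) / len(salary)
sse = sum((y - f) ** 2 for y, f in zip(salary, fitted))
sst = sum((y - mean_y) ** 2 for y in salary)
r_squared = 1 - sse / sst

print(f"Y = {b0:.3f} + {b1:.5f} X1 + {b3:.5f} X3, R^2 = {r_squared:.5f}")
```

The normal-equations approach is adequate for a problem this small; for larger or ill-conditioned data a QR-based solver from a numerical library would be the safer choice.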

Question 10.17
An insurance executive wished to estimate the relationship between the number of days
of work lost by auto accident victims Y and the age X1 and gender X2 of the victim. A representative
sample of 25 loss reports was selected, resulting in the least squares equation
Y-hat = 21.4 - 0.0072X1 - 2.5X2. For this equation, SST = 4.750, SSE = 3.180, SD(b1) = 0.11, SD(b2) = 0.99.
(a) Do you detect an association between the response variable Y and the two predictor
variables as a group? Support your answer.
(b) Is the incremental contribution of age discernible, given the person's gender? Explain.
(c) Is the incremental contribution of gender discernible, given the person's age? Explain.
(d) What do your conclusions in parts (a)-(c) suggest about basing premiums for income
replacement on the age and gender of the insured when work time is lost due to an
automobile accident?
Solution:
Y-hat = 21.4 - 0.0072X1 - 2.5X2
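
The figures quoted in the question are already enough to compute the overall F statistic and the two t statistics directly. The sketch below does that arithmetic; the 5% level and the critical values (taken from standard F and t tables for 2 and 22 degrees of freedom) are assumptions added here, not values given in the source.

```python
n = 25        # loss reports in the sample
k = 2         # predictors: age X1 and gender X2
sst = 4.750   # total sum of squares, as printed in the question
sse = 3.180   # error sum of squares, as printed in the question
b1, sd_b1 = -0.0072, 0.11   # age coefficient and its standard deviation
b2, sd_b2 = -2.5, 0.99      # gender coefficient and its standard deviation

df_error = n - k - 1        # 22
ssr = sst - sse             # regression sum of squares

# Overall F test for the two predictors as a group.
f_stat = (ssr / k) / (sse / df_error)

# t statistics for the incremental contribution of each predictor.
t_age = b1 / sd_b1
t_gender = b2 / sd_b2

# Assumed 5% critical values from standard tables.
F_CRIT = 3.44    # F(0.05; 2, 22)
T_CRIT = 2.074   # two-sided t(0.025; 22)

group_discernible = f_stat > F_CRIT
age_discernible = abs(t_age) > T_CRIT
gender_discernible = abs(t_gender) > T_CRIT

print(f"F = {f_stat:.4f}, group association discernible: {group_discernible}")
print(f"t(age) = {t_age:.4f}, discernible: {age_discernible}")
print(f"t(gender) = {t_gender:.4f}, discernible: {gender_discernible}")
```

Because F depends only on the ratio of the sums of squares, it is unchanged if the printed SST and SSE are read as 4,750 and 3,180.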
(a)

The data sample for analysis can be considered as follows
(X2: 0 = Female, 1 = Male; X3 = X1*X2):

Y         X1   X2   X3
21.2416   22   0    0
18.7344   23   1    23
18.7416   22   1    22
18.7344   23   1    23
18.7272   24   1    24
21.2272   24   0    0
21.22     25   0    0
21.2128   26   0    0
18.7344   23   1    23
21.2272   24   0    0
21.1984   28   0    0
18.7416   22   1    22
18.7416   22   1    22
21.2344   23   0    0
18.7128   26   1    26
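
The Y column in this table matches what the least squares equation from the question gives for each age and gender pair, and X3 is the product of X1 and X2. A minimal sketch of that construction follows; the function name `predicted_days_lost` is an illustrative choice, not from the source.

```python
ages = [22, 23, 22, 23, 24, 24, 25, 26, 23, 24, 28, 22, 22, 23, 26]   # X1
gender = [0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1]                # X2, 1 = Male


def predicted_days_lost(age, male):
    """Least squares equation quoted in the question."""
    return 21.4 - 0.0072 * age - 2.5 * male


# Y values rounded to four decimals, as in the table.
y = [round(predicted_days_lost(a, g), 4) for a, g in zip(ages, gender)]

# Interaction column X3 = X1 * X2 (age for males, 0 for females).
x3 = [a * g for a, g in zip(ages, gender)]

for row in zip(y, ages, gender, x3):
    print(*row)
```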

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.997180093
R Square            0.994368138
Adjusted R Square   0.993934918
Standard Error      0.100126163
Observations        15

ANOVA
             df   SS            MS           F
Regression   1    23.0109068    23.0109068   2295.295382
Residual     13   0.130328232   0.01002525
Total        14   23.14123503

            Coefficients   Standard Error   t Stat         P-value      Lower 95%
Intercept   21.21514683    0.037779413      561.5531057    6.8402E-30   21.13352937
X3          -0.107014068   0.002233683      -47.90924109   5.211E-16    -0.111839647

The model can be defined as
Y = 21.21515 - 0.10701X3

(b)

Considering the incremental contribution of age, given gender:
Y         X1   X2
21.2416   22   0
18.7344   22   1
18.7416   22   1
18.7344   22   1
18.7272   23   1
21.2272   23   0
21.22     23   0
21.2128   23   0
18.7344   24   1
21.2272   24   0
21.1984   24   0
18.7416   25   1
18.7416   26   1
21.2344   26   0
18.7128   28   1

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.999961958
R Square            0.999923917
Adjusted R Square   0.999911237
Standard Error      0.012112852
Observations        15

ANOVA
             df   SS            MS            F            Significance F
Regression   2    23.13947438   11.56973719   78855.2646   1.93966E-25
Residual     12   0.001760654   0.000146721
Total        14   23.14123503

            Coefficients   Standard Error   t Stat       P-value
Intercept   21.26468235    0.043425783      489.678729   3.54277E-27
X1          -0.001764706   0.001832039      -0.9632469   0.354433637
X2          -2.488829412   0.006317974      -393.92843   4.82175E-26

(c)

Considering the incremental contribution of gender, given age:

Y         X1   X2
21.2416   22   0
18.7344   23   0
18.7416   22   0
18.7344   23   0
18.7272   24   0
21.2272   24   0
21.22     25   0
21.2128   26   1
18.7344   23   1
21.2272   24   1
21.1984   28   1
18.7416   22   1
18.7416   22   1
21.2344   23   1
18.7128   26   1

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.413975326
R Square            0.171375571
Adjusted R Square   0.033271499
Standard Error      1.264100229
Observations        15

ANOVA
             df   SS            MS           F
Regression   2    3.965842359   1.98292118   1.240916135
Residual     12   19.17539267   1.59794939
Total        14   23.14123503

            Coefficients   Standard Error   t Stat       P-value
Intercept   12.6565445     4.625799429      2.73607723   0.018062093
X1          0.306936126    0.197591481      1.55338744   0.146295239
X2          -0.12434555    0.681414417      -0.1824815   0.858251378

Remaining columns of the Question 10.17 regression outputs (Significance F and 95% confidence limits):

Regression of Y on X3, part (a): Significance F = 5.21105E-16
            Lower 95%      Upper 95%
Intercept   21.13352937    21.29676429
X3          -0.111839647   -0.102188488

Regression of Y on X1 and X2, part (b):
            Lower 95%      Upper 95%
Intercept   21.1700657     21.35929901
X1          -0.005756376   0.002226964
X2          -2.502595094   -2.47506373

Regression of Y on X1 and X2, part (c): Significance F = 0.323702756
            Lower 95%      Upper 95%
Intercept   2.57779336     22.73529565
X1          -0.123578729   0.73745098
X2          -1.609020024   1.360328925

Question 10.23
A manufacturing firm wishes to predict the manufacturing unit cost Y (in dollars) of one
of its products as a function of fluctuating production rate X1 and material and labor costs X2.
(X1 is measured as a percentage of rated capacity and X2 is a standard index that combines
the costs of material and labor.) Representative data were collected over a 20-month span
during which the production rate and labor costs fluctuated considerably.

Y       X1    X2        Y       X1    X2
13.59   87    80        15.93   102   116
15.71   78    95        16.45   82    117
15.97   81    106       19.02   74    127
20.21   65    115       18.16   85    133
24.64   51    128       18.57   86    135
21.25   62    128       17.01   90    136
18.94   70    115       18.03   93    140
14.85   91    92        19.22   81    142
15.18   94    93        21.12   72    148
16.3    100   111       23.32   60    150

Fit an appropriate regression model to these data, evaluate the resulting least squares
equation, and revise it as necessary.
Solution
Month #   Y       X1    X2
1         13.59   87    80
2         15.71   78    95
3         15.97   81    106
4         20.21   65    115
5         24.64   51    128
6         21.25   62    128
7         18.94   70    115
8         14.85   91    92
9         15.18   94    93
10        16.3    100   111
11        15.93   102   116
12        16.45   82    117
13        19.02   74    127
14        18.16   85    133
15        18.57   86    135
16        17.01   90    136
17        18.03   93    140
18        19.22   81    142
19        21.12   72    148
20        23.32   60    150

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.95601
R Square            0.91396
Adjusted R Square   0.90384
Standard Error      0.89419
Observations        20

ANOVA
             df   SS        MS        F          Significance F
Regression   2    144.387   72.1937   90.28915   8.8E-10
Residual     17   13.5929   0.79958
Total        19   157.98

            Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept   20.28126941    2.12525          9.543      3.1E-08   15.79738    24.7652
X1          -0.137695838   0.01585          -8.68549   1.2E-07   -0.17114    -0.10425
X2          0.074245424    0.01096          6.77134    3.3E-06   0.051112    0.09738

The model would be:
Y = 20.2813 - 0.1377X1 + 0.0742X2
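
The headline statistics of this output all follow from the ANOVA sums of squares and degrees of freedom. As a quick consistency check, the sketch below recomputes R Square, Adjusted R Square, F and the Standard Error from the printed SS values; the variable names are illustrative.

```python
import math

n = 20          # observations (months)
k = 2           # predictors X1 and X2
ssr = 144.387   # regression sum of squares, from the ANOVA table
sse = 13.5929   # residual sum of squares, from the ANOVA table
sst = ssr + sse

df_error = n - k - 1              # 17
mse = sse / df_error              # residual mean square

r_squared = ssr / sst
adj_r_squared = 1 - mse / (sst / (n - 1))
f_stat = (ssr / k) / mse
std_error = math.sqrt(mse)        # standard error of the regression

print(f"R Square          {r_squared:.5f}")
print(f"Adjusted R Square {adj_r_squared:.5f}")
print(f"F                 {f_stat:.4f}")
print(f"Standard Error    {std_error:.5f}")
```

With all three P-values far below 0.05 and an adjusted R Square of about 0.90, this check gives no reason to drop either predictor.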

Question 10.21
How well can a taxpayer's taxes Y, as a percentage of gross income, be predicted from his or her
gross income X (in thousands of dollars)? The following represents a random sample of 14 federal
income tax returns in a given year:

Income X   % tax Y
45.6       10.4
62.2       11.8
77.6       14.7
118.8      16.7
30.4       5.8
50.1       10.2
60         13.9
49.3       10.9
36.1       7
38         9.1
108.2      16.1
54         12.6
42.1       9.8
90         16.6
Fit a simple linear regression model to these data, evaluate the resulting least squares
equation (including a residual analysis) and revise it as necessary.

[Figure: plot of % tax Y (vertical axis from 0 to 20)]

Linear regression equation:

y = 0.1157x + 4.7037
R² = 0.8363
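
The fitted line can be reproduced from the 14 data pairs with the usual closed-form formulas for simple linear regression, and the residuals the question asks about come out of the same calculation. This is a minimal sketch using only the sample data listed in the question; no residual plot is drawn, the residuals are just printed for inspection.

```python
income = [45.6, 62.2, 77.6, 118.8, 30.4, 50.1, 60, 49.3, 36.1, 38,
          108.2, 54, 42.1, 90]                                   # X
tax_pct = [10.4, 11.8, 14.7, 16.7, 5.8, 10.2, 13.9, 10.9, 7, 9.1,
           16.1, 12.6, 9.8, 16.6]                                # Y

n = len(income)
mean_x = sum(income) / n
mean_y = sum(tax_pct) / n

# Corrected sums of squares and cross products.
sxx = sum((x - mean_x) ** 2 for x in income)
syy = sum((y - mean_y) ** 2 for y in tax_pct)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, tax_pct))

slope = sxy / sxx
intercept = mean_y - slope * mean_x
r_squared = sxy ** 2 / (sxx * syy)

# Residual = observed minus fitted value.
residuals = [y - (intercept + slope * x) for x, y in zip(income, tax_pct)]

print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {r_squared:.4f}")
for x, e in sorted(zip(income, residuals)):
    print(f"income {x:6.1f}  residual {e:+.3f}")
```

Listing the residuals in order of income makes it easier to look for a systematic pattern, which is the kind of evidence that would motivate revising the straight-line model.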