SAJ_Assignment9
March 28, 2016
Question 11: How would you use one of the variable selection methods to choose a model with fewer variables? Select one of the
methods (either one of the stepwise or criterion-based methods)
and show which variables it would lead you to keep. Do you agree
with its results?
Stepwise regression is applied to select the best model on the basis of AIC (Akaike information criterion).
library(ggplot2)
library(gridExtra)
library(scatterplot3d)
library(car)
library(knitr)
## Warning: package 'knitr' was built under R version 3.2.4
require(MASS)
## Loading required package: MASS
## Warning: package 'MASS' was built under R version 3.2.4
head(mtcars)
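##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1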
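# Aliases for the mtcars columns used in the models below.
# Mpg through Aim are assumed to be defined in the same way as Gear and Carb.
Mpg  <- mtcars$mpg
Cyl  <- mtcars$cyl
Disp <- mtcars$disp
Hp   <- mtcars$hp
Drat <- mtcars$drat
Wt   <- mtcars$wt
Qsec <- mtcars$qsec
V.s  <- mtcars$vs
Aim  <- mtcars$am
Gear <- mtcars$gear
Carb <- mtcars$carb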
Regression<- lm(Mpg~Cyl+Disp+Hp+Drat+Wt+Qsec+V.s+Aim+Gear+Carb,data=mtcars)
step <- stepAIC(Regression, direction="both")
## Start:  AIC=70.9
## Mpg ~ Cyl + Disp + Hp + Drat + Wt + Qsec + V.s + Aim + Gear +
##     Carb
##
##          Df Sum of Sq    RSS    AIC
## - Cyl     1    0.0799 147.57 68.915
## - V.s     1    0.1601 147.66 68.932
## - Carb    1    0.4067 147.90 68.986
## - Gear    1    1.3531 148.85 69.190
## - Drat    1    1.6270 149.12 69.249
## - Disp    1    3.9167 151.41 69.736
## - Hp      1    6.8399 154.33 70.348
## - Qsec    1    8.8641 156.36 70.765
## <none>                147.49 70.898
## - Aim     1   10.5467 158.04 71.108
## - Wt      1   27.0144 174.51 74.280
##
## Step:  AIC=68.92
## Mpg ~ Disp + Hp + Drat + Wt + Qsec + V.s + Aim + Gear + Carb
##
##          Df Sum of Sq    RSS    AIC
## - V.s     1    0.2685 147.84 66.973
## - Carb    1    0.5201 148.09 67.028
## - Gear    1    1.8211 149.40 67.308
## - Drat    1    1.9826 149.56 67.342
## - Disp    1    3.9009 151.47 67.750
## - Hp      1    7.3632 154.94 68.473
## <none>                147.57 68.915
## - Qsec    1   10.0933 157.67 69.032
## - Aim     1   11.8359 159.41 69.384
## + Cyl     1    0.0799 147.49 70.898
## - Wt      1   27.0280 174.60 72.297
##
## Step:  AIC=66.97
## Mpg ~ Disp + Hp + Drat + Wt + Qsec + Aim + Gear + Carb
##
##          Df Sum of Sq    RSS    AIC
## - Carb    1    0.6855 148.53 65.121
## - Gear    1    2.1437 149.99 65.434
## - Drat    1    2.2139 150.06 65.449
## - Disp    1    3.6467 151.49 65.753
## - Hp      1    7.1060 154.95 66.475
## <none>                147.84 66.973
## - Aim     1   11.5694 159.41 67.384
## - Qsec    1   15.6830 163.53 68.200
## + V.s     1    0.2685 147.57 68.915
## + Cyl     1
## - Wt      1
##
## Step:  AIC=65.12
## Mpg ~ Disp + Hp + Drat + Wt + Qsec + Aim + Gear
##
##          Df Sum of Sq    RSS    AIC
## - Gear    1     1.565 150.09 63.457
## - Drat    1     1.932 150.46 63.535
## <none>                148.53 65.121
## - Disp    1    10.110 158.64 65.229
## - Aim     1    12.323 160.85 65.672
## - Hp      1    14.826 163.35 66.166
## + Carb    1     0.685 147.84 66.973
## + V.s     1     0.434 148.09 67.028
## + Cyl     1     0.414 148.11 67.032
## - Qsec    1    26.408 174.94 68.358
## - Wt      1    69.127 217.66 75.350
##
## Step:  AIC=63.46
## Mpg ~ Disp + Hp + Drat + Wt + Qsec + Aim
##
##          Df Sum of Sq    RSS    AIC
## - Drat    1     3.345 153.44 62.162
## - Disp    1     8.545 158.64 63.229
## <none>                150.09 63.457
## - Hp      1    13.285 163.38 64.171
## + Gear    1     1.565 148.53 65.121
## + Cyl     1     1.003 149.09 65.242
## + V.s     1     0.645 149.45 65.319
## + Carb    1     0.107 149.99 65.434
## - Aim     1    20.036 170.13 65.466
## - Qsec    1    25.574 175.67 66.491
## - Wt      1    67.572 217.66 73.351
##
## Step:  AIC=62.16
## Mpg ~ Disp + Hp + Wt + Qsec + Aim
##
##          Df Sum of Sq    RSS    AIC
## - Disp    1     6.629 160.07 61.515
## <none>                153.44 62.162
## - Hp      1    12.572 166.01 62.682
## + Drat    1     3.345 150.09 63.457
## + Gear    1     2.977 150.46 63.535
## + Cyl     1     2.447 150.99 63.648
## + V.s     1     1.121 152.32 63.927
## + Carb    1     0.011 153.43 64.160
## - Qsec    1    26.470 179.91 65.255
## - Aim     1    32.198 185.63 66.258
## - Wt      1    69.043 222.48 72.051
##
## Step:  AIC=61.52
## Mpg ~ Hp + Wt + Qsec + Aim
##
##          Df Sum of Sq    RSS    AIC
## - Hp      1     9.219 169.29 61.307
## <none>                160.07 61.515
## + Disp    1     6.629 153.44 62.162
## + Carb    1     3.227 156.84 62.864
## + Drat    1     1.428 158.64 63.229
## - Qsec    1    20.225 180.29 63.323
## + Cyl     1     0.249 159.82 63.465
## + V.s     1     0.249 159.82 63.466
## + Gear    1     0.171 159.90 63.481
## - Aim     1    25.993 186.06 64.331
## - Wt      1    78.494 238.56 72.284
##
## Step:  AIC=61.31
## Mpg ~ Wt + Qsec + Aim
##
##          Df Sum of Sq    RSS    AIC
## <none>                169.29 61.307
## + Hp      1     9.219 160.07 61.515
## + Carb    1     8.036 161.25 61.751
## + Disp    1     3.276 166.01 62.682
## + Cyl     1     1.501 167.78 63.022
## + Drat    1     1.400 167.89 63.042
## + Gear    1     0.123 169.16 63.284
## + V.s     1     0.000 169.29 63.307
## - Aim     1    26.178 195.46 63.908
## - Qsec    1   109.034 278.32 75.217
## - Wt      1   183.347 352.63 82.790
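The AIC values in the trace can be checked by hand. For a linear model, the criterion used in the stepwise search is, up to an additive constant, $n \log(\mathrm{RSS}/n) + 2p$, where $p$ is the number of estimated coefficients. The sketch below is only an illustration: it uses the raw `mtcars` column names rather than the capitalised aliases, and the names `full_fit`, `aic_by_hand` and `aic_builtin` are introduced here. Under those assumptions it should reproduce the starting value of about 70.9.

```r
# Full model on the built-in mtcars data (raw column names).
full_fit <- lm(mpg ~ ., data = mtcars)

n   <- nrow(mtcars)                 # number of observations
p   <- length(coef(full_fit))       # estimated coefficients, intercept included
rss <- sum(residuals(full_fit)^2)   # residual sum of squares

# AIC in the form used for lm fits during stepwise selection.
aic_by_hand <- n * log(rss / n) + 2 * p

# extractAIC() returns c(edf, AIC); the second element is the criterion.
aic_builtin <- extractAIC(full_fit)[2]

print(c(by_hand = aic_by_hand, builtin = aic_builtin))
```

Each row of the trace can be checked the same way, using that row's RSS and the number of coefficients left in the candidate model.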
The variables selected are Wt, Qsec, and Aim (transmission, A/m). I do not agree with the result, because Wt is significantly correlated with Aim (p-value < 0.001). The final fitted model is
$$\mathrm{Mpg} = 9.6178 - 3.9165\,\mathrm{Wt} + 1.2259\,\mathrm{Qsec} + 2.9358\,\mathrm{Aim}$$
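The objection above rests on the correlation between weight and transmission type. A minimal sketch of how that claim could be checked with base R follows. It uses the raw `mtcars` columns `wt` and `am`, and it computes the variance inflation factor by hand as $1/(1-R^2)$ instead of calling a package function; the names `ct`, `aux_fit` and `vif_wt` are introduced here for illustration.

```r
# Pearson correlation test between weight and transmission type.
ct <- cor.test(mtcars$wt, mtcars$am)
print(ct$estimate)   # sample correlation
print(ct$p.value)    # p-value for the null hypothesis of zero correlation

# Variance inflation factor of wt in the selected model mpg ~ wt + qsec + am:
# regress wt on the other predictors and use 1 / (1 - R^2).
aux_fit <- lm(wt ~ qsec + am, data = mtcars)
vif_wt  <- 1 / (1 - summary(aux_fit)$r.squared)
print(vif_wt)
```

A VIF far above 1 would support the concern about collinearity; a value close to 1 would weaken it.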
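# Backward elimination from the full model, for comparison with stepAIC
fit <- lm(mpg ~ ., data = mtcars)
# Store the result under a new name so the function name `step` is not reused
step_fit <- step(fit, direction = "backward", trace = FALSE)
summary(step_fit)$coeff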
##           Df Sum Sq Mean Sq F value    Pr(>F)
## wt         1 442.58  442.58  73.203 2.673e-09 ***
## qsec       1 109.03  109.03  18.034 0.0002162 ***
## Residuals 28 169.29    6.05
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.459 on 28 degrees of freedom
## Multiple R-squared:  0.8497, Adjusted R-squared:  0.8336
## F-statistic: 52.75 on 3 and 28 DF,  p-value: 1.21e-11
[Figure: correlation matrix plot of the mtcars variables; axis labels: mpg, cyl, disp, hp, drat, wt, qsec, vs, am, carb]
Question 13:
(a) Derive the coefficients from your regression using the $(X^T X)^{-1} X^T Y$ formula
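As a sketch of part (a), the coefficients can be computed directly from the normal equations. The example assumes the model selected earlier (mpg on wt, qsec and am) and uses the raw `mtcars` column names; `beta_hat` and `lm_fit` are names introduced here. `solve(A, b)` is used in place of an explicit matrix inverse, which gives the same result with better numerical behaviour.

```r
# Design matrix with an intercept column, and the response vector.
X <- cbind(Intercept = 1, as.matrix(mtcars[, c("wt", "qsec", "am")]))
Y <- mtcars$mpg

# beta = (X'X)^{-1} X'Y, obtained by solving the system (X'X) beta = X'Y.
beta_hat <- solve(t(X) %*% X, t(X) %*% Y)
print(beta_hat)

# Cross-check against lm(), which should give the same estimates.
lm_fit <- lm(mpg ~ wt + qsec + am, data = mtcars)
print(coef(lm_fit))
```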
Question 14: Add at least one quadratic term into your model and
interpret the results. Is it significant? What is the effect of a 1-unit
increase in that variable at its mean value?
# Note: in an R formula, Wt*Wt expands to Wt alone, so this call fits no squared term
lm(formula = Mpg ~ Wt + Wt*Wt, data = mtcars)
##
## Call:
## lm(formula = Mpg ~ Wt + Wt * Wt, data = mtcars)
##
## Coefficients:
## (Intercept)           Wt
##      37.285       -5.344
$$\mathrm{Mpg} = \text{Intercept} + b_1\,\mathrm{Wt} + b_2\,\mathrm{Wt}^2$$
H0 (null hypothesis): $b_2 = 0$ against H1 (alternative hypothesis): $b_2 \neq 0$. H0 is rejected, as the p-value is 0.00286.
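One caution about the formula syntax: inside an R formula, `Wt*Wt` is read as the interaction of `Wt` with itself and collapses to `Wt`, so no squared term is estimated. Wrapping the power in `I()` fits the quadratic model written above. The sketch below uses the raw `mtcars` columns, and the names `fit_star`, `fit_quad` and `effect_at_mean` are introduced here. It also evaluates the effect of a 1-unit increase in weight at its mean, taken as the derivative $b_1 + 2 b_2 \overline{\mathrm{Wt}}$, which approximates the exact finite difference.

```r
# wt*wt collapses to a single wt term: only two coefficients are estimated.
fit_star <- lm(mpg ~ wt + wt*wt, data = mtcars)
print(length(coef(fit_star)))

# I(wt^2) protects the arithmetic, so a true squared term enters the model.
fit_quad <- lm(mpg ~ wt + I(wt^2), data = mtcars)
print(summary(fit_quad)$coefficients)

# Marginal effect of weight at its mean: d(mpg)/d(wt) = b1 + 2 * b2 * mean(wt).
b <- coef(fit_quad)
effect_at_mean <- unname(b["wt"] + 2 * b["I(wt^2)"] * mean(mtcars$wt))
print(effect_at_mean)
```

The p-value reported for the quadratic term should be read from the `I(wt^2)` row of the coefficient table.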
# Wt*Wt reduces to Wt in a formula, so the estimates equal those of Mpg ~ Wt + Qsec + Aim
fit8 <- lm(Mpg ~ Wt + Qsec + Aim + Wt*Wt, data = mtcars)
summary(fit8)
##
## Call:
## lm(formula = Mpg ~ Wt + Qsec + Aim + Wt * Wt, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.4811 -1.5555 -0.7257  1.4110  4.6610
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   9.6178     6.9596   1.382 0.177915
## Wt           -3.9165     0.7112  -5.507 6.95e-06 ***
## Qsec          1.2259     0.2887   4.247 0.000216 ***
## Aim           2.9358     1.4109   2.081 0.046716 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Qsec*Qsec likewise reduces to Qsec, so the fit is unchanged
fit9 <- lm(Mpg ~ Wt + Qsec + Aim + Qsec*Qsec, data = mtcars)
summary(fit9)
##
## Call:
## lm(formula = Mpg ~ Wt + Qsec + Aim + Qsec * Qsec, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.4811 -1.5555 -0.7257  1.4110  4.6610
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   9.6178     6.9596   1.382 0.177915
## Wt           -3.9165     0.7112  -5.507 6.95e-06 ***
## Qsec          1.2259     0.2887   4.247 0.000216 ***
## Aim           2.9358     1.4109   2.081 0.046716 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Call:
## lm(formula = Mpg ~ Wt + Qsec + Aim + Qsec * Qsec + Wt * Wt, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.4811 -1.5555 -0.7257  1.4110  4.6610
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   9.6178     6.9596   1.382 0.177915
## Wt           -3.9165     0.7112  -5.507 6.95e-06 ***
## Qsec          1.2259     0.2887   4.247 0.000216 ***
## Aim           2.9358     1.4109   2.081 0.046716 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Question 15: Add at least one interaction term to your model and
interpret the results. Is it significant? What is the effect of a 1-unit
increase in one of those interacted variables holding the other at
its mean value?
fit11<- lm(Mpg~Wt+Qsec+Aim+Wt*Qsec,data=mtcars)
summary(fit11)
##
## Call:
## lm(formula = Mpg ~ Wt + Qsec + Aim + Wt * Qsec, data = mtcars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.5999 -1.6316 -0.6345  1.3839  4.2888
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -20.2272    28.0796  -0.720   0.4775
## Wt            5.7172     8.8117   0.649   0.5219
## Qsec          2.8927     1.5466   1.870   0.0723 .
## Aim           2.8596     1.4075   2.032   0.0521 .
## Wt:Qsec      -0.5403     0.4926  -1.097   0.2824
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.45 on 27 degrees of freedom
## Multiple R-squared:  0.8561, Adjusted R-squared:  0.8348
## F-statistic: 40.15 on 4 and 27 DF,  p-value: 5.416e-11
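With an interaction in the model, the effect of a 1-unit increase in Wt depends on the value of Qsec: it equals $b_{\mathrm{Wt}} + b_{\mathrm{Wt:Qsec}}\,\mathrm{Qsec}$, and symmetrically for Qsec. A minimal sketch that evaluates both effects with the other variable held at its mean is given below. It uses the raw `mtcars` columns, which assumes the capitalised aliases are plain copies of them, and the names `fit_int`, `effect_wt` and `effect_qsec` are introduced here.

```r
# Same model as fit11, written with the raw mtcars column names.
fit_int <- lm(mpg ~ wt + qsec + am + wt:qsec, data = mtcars)
b <- coef(fit_int)

# Effect of +1 unit of weight, holding qsec at its sample mean.
effect_wt <- unname(b["wt"] + b["wt:qsec"] * mean(mtcars$qsec))

# Effect of +1 unit of qsec, holding weight at its sample mean.
effect_qsec <- unname(b["qsec"] + b["wt:qsec"] * mean(mtcars$wt))

print(c(effect_wt = effect_wt, effect_qsec = effect_qsec))
```

Because the interaction coefficient has a p-value of 0.2824 in the output above, these conditional effects are estimated imprecisely and should be interpreted with caution.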
F value: 41.397
Adjusted R-squared: 0.867
On the basis of adjusted R-squared, the best model is the one that includes both the quadratic effect of weight and the interaction effect between Wt and Qsec.