β = 0 · Splitting · Corrected SS · Sequential testing
The model is y = Xβ + ε, where the errors ε have mean 0 and variance σ²I.
The first thing we want to test is for model relevance: does our
model contribute anything at all?
If none of the x variables have any relevance for predicting y, then
all the parameters will be 0. We test for this using the null
hypothesis
H0: β = 0.
Alternatively, if at least some of the x variables are relevant to
predicting y, then the corresponding parameters will be nonzero.
So our alternative hypothesis is
H1: β ≠ 0.
To test these hypotheses, we assume throughout the section that
the errors are normally distributed.
Linear Statistical Models: Inference for the full rank model
ANOVA
The method used to test the hypotheses is ANOVA.
If β = 0, then y = ε consists entirely of errors. In this case, y^T y,
the sum of squares of the errors, measures the variability of the
errors.
However, if β ≠ 0, then y = Xβ + ε. In this case, y^T y is not
made up solely of the errors but also of the model predictions.
Some of yT y will come from the errors and some from the model
predictions.
By separating yT y into the two parts, measuring variation due to
the model and variation due to the errors, we can compare them to
see how well the model is doing.
SSRes = (y − Xb)^T (y − Xb)
      = (y − Hy)^T (y − Hy)
      = y^T y − 2 y^T H y + y^T H² y
      = y^T y − y^T H y
      = y^T y − y^T X (X^T X)^{-1} X^T y,

which means that

y^T y = y^T X (X^T X)^{-1} X^T y + SSRes.

We call y^T X (X^T X)^{-1} X^T y = ŷ^T y the regression sum of squares
and denote it by SSReg. It reflects the variation in the response
variable that is accounted for by the model. If we call the total
variation in the response variable SSTotal = y^T y, then we have
divided it into:

SSTotal = SSReg + SSRes.
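This decomposition is easy to verify numerically. A minimal sketch in R, using an arbitrary simulated design (not data from these slides):

```r
# Check SSTotal = SSReg + SSRes for a small simulated example.
set.seed(1)
n <- 10
X <- cbind(1, 1:n)                      # intercept plus one predictor
y <- 2 + 0.5 * (1:n) + rnorm(n)
H <- X %*% solve(t(X) %*% X) %*% t(X)   # hat matrix
SSTotal <- sum(y^2)                     # y'y (uncorrected)
SSReg   <- as.numeric(t(y) %*% H %*% y)
SSRes   <- sum((y - H %*% y)^2)
all.equal(SSTotal, SSReg + SSRes)       # TRUE up to rounding
```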
If the errors were all 0, then y = Xβ exactly, and

SSReg = y^T X (X^T X)^{-1} X^T y
      = β^T X^T X (X^T X)^{-1} X^T X β
      = β^T X^T X β
      = y^T y = SSTotal,

and SSRes = 0.
Conversely, if the model predictions Xb were all 0, then

SSRes = (y − Xb)^T (y − Xb) = y^T y = SSTotal,

and SSReg = 0.
[Scatterplot of the six data points: y (ranging from about 2 to 5) against x (2 to 7).]
y = (1.9, 2.7, 4.2, 4.8, 4.8, 5.1)^T,   X =
[1 2
 1 3
 1 4
 1 5
 1 6
 1 7]
Since

SSTotal = y^T y = 1.9² + 2.7² + 4.2² + 4.8² + 4.8² + 5.1² = 100.63,

and SSRes = 1.1, we get

SSReg = SSTotal − SSRes = 99.53.

Since 99.53 > 1.1, informally we would say that there is some
linear signal in the data.
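The slide's numbers can be reproduced directly in R (the data are the six points above):

```r
y <- c(1.9, 2.7, 4.2, 4.8, 4.8, 5.1)
X <- cbind(1, 2:7)
SSTotal <- sum(y^2)
SSReg <- as.numeric(t(y) %*% X %*% solve(t(X) %*% X) %*% t(X) %*% y)
SSRes <- SSTotal - SSReg
round(c(SSTotal = SSTotal, SSReg = SSReg, SSRes = SSRes), 2)
# 100.63   99.53    1.10
```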
Theorem
In the full rank linear model, SSRes/σ² has a χ² distribution with
n − p degrees of freedom, SSReg/σ² has a noncentral χ²
distribution with p degrees of freedom and noncentrality parameter

λ = (1/(2σ²)) β^T X^T X β,

and the two are independent.
The test for β = 0 comes about when we observe that if the null
hypothesis is true, the noncentrality parameter for SSReg/σ² must
be 0.
Thus, under H0,

[SSReg/(σ² p)] / [SSRes/(σ² (n − p))] = (SSReg/p) / (SSRes/(n − p)) = MSReg/MSRes

has an F distribution with p and n − p degrees of freedom.
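Under H0 we can evaluate this ratio and its p-value; a sketch using the sums of squares quoted above for the six-point example (n = 6, p = 2):

```r
n <- 6; p <- 2
SSReg <- 99.53; SSRes <- 1.10
Fstat <- (SSReg / p) / (SSRes / (n - p))   # about 181
pf(Fstat, p, n - p, lower.tail = FALSE)    # a very small p-value
```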
So if β = 0, then E[SSReg/p] = σ², which is also what
E[SSRes/(n − p)] estimates; under H0 the two mean squares estimate
the same quantity and their ratio should be near 1, while a nonzero β
inflates SSReg.
Source of variation   Sum of squares                          degrees of freedom   Mean square       F ratio
Regression            y^T X (X^T X)^{-1} X^T y                p                    SSReg/p           MSReg/MSRes
Residual              y^T y − y^T X (X^T X)^{-1} X^T y        n − p                SSRes/(n − p)
Total                 y^T y                                   n
Files (x1)   Flows (x2)   Processes (x3)
 4            44           18
 2            33           15
20            80           80
 6            24           21
 6           227           50
 3            20           18
 4            41           13
16           187          137
 4            19           15
 6            50           21
 5            48           17
SSReg = y^T X (X^T X)^{-1} X^T y = 38978
y^T y = 39667
SSRes = y^T y − SSReg = 689
MSReg = SSReg/4 = 9745
MSRes = SSRes/(11 − 4) = 98
Variation    SS      d.f.   MS     F
Regression   38978   4      9745   99
Residual     689     7      98
Total        39667   11
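The table entries follow directly from the quoted sums of squares; as a check:

```r
SSReg <- 38978; SSTotal <- 39667
n <- 11; p <- 4
SSRes <- SSTotal - SSReg                 # 689
MSReg <- SSReg / p
MSRes <- SSRes / (n - p)
Fstat <- MSReg / MSRes                   # about 99
pf(Fstat, p, n - p, lower.tail = FALSE)  # far below 0.05
```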
We test H0: β = 0.
> (SS <- sum(y^2))
[1] 381.1864
> (SSRes <- sum((y - X %*% b)^2))
[1] 4.498653
> (SSReg <- SS - SSRes)
[1] 376.6877
> (SSReg <- t(y) %*% X %*% solve(t(X) %*% X) %*% t(X) %*% y)
[,1]
[1,] 376.6877
> (Fstat <- as.vector((SSReg/p)/(SSRes/(n-p))))
[1] 3768.005
> pf(Fstat, p, n-p, lower.tail=FALSE)
[1] 6.656806e-130
[anova() output comparing area ~ 0 with area ~ midrib + estim (3 degrees of freedom); the F test is highly significant, matching the p-value computed above.]
C =
[0 1 1 0
 0 0 1 1],   d = (0, 0)^T.
Test statistic
(Cb − d)^T [C (X^T X)^{-1} C^T]^{-1} (Cb − d) / σ²,

which under H0: Cβ = d has a χ² distribution with r degrees of freedom
(r the number of rows of C); in general the noncentrality parameter is

λ = (Cβ − d)^T [C (X^T X)^{-1} C^T]^{-1} (Cβ − d) / (2σ²).
b = (X^T X)^{-1} X^T y = (1.96, −0.12, 0.18, −0.8)^T,

so, writing β₀ for the hypothesised parameter vector,

b − β₀ = (0.04, 0.12, 0.18, 0.2)^T.
Under H0 the statistic has an F4,7 distribution (p = 4, n − p = 7), and here

F = (1110.18/4) / (668.63/7) ≈ 2.9.
b = (X^T X)^{-1} X^T y = (1.96, −0.12, 0.18, −0.8)^T.
Therefore

Cb = (0.06, −0.62)^T,

C (X^T X)^{-1} C^T =
[0.013    0.0024
 0.0024   0.00077],

and

(Cb)^T [C (X^T X)^{-1} C^T]^{-1} (Cb) = 1138.35.
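The whole computation can be wrapped in a small R function. A sketch of the general-linear-hypothesis F test (the data below are simulated, and `glh_test` is a name invented here, not one from the slides):

```r
glh_test <- function(X, y, C, d) {
  n <- nrow(X); p <- ncol(X); r <- nrow(C)
  XtXi <- solve(t(X) %*% X)
  b <- XtXi %*% t(X) %*% y
  s2 <- sum((y - X %*% b)^2) / (n - p)        # estimate of sigma^2
  diff <- C %*% b - d
  Fstat <- as.numeric(t(diff) %*% solve(C %*% XtXi %*% t(C)) %*% diff) /
    (r * s2)
  c(F = Fstat, p = pf(Fstat, r, n - p, lower.tail = FALSE))
}
set.seed(42)
X <- cbind(1, matrix(rnorm(60), 20, 3))
y <- X %*% c(1, 2, 0, 0) + rnorm(20)
# Test H0: beta2 = beta3 = 0 (true here by construction).
res <- glh_test(X, y, C = rbind(c(0, 0, 1, 0), c(0, 0, 0, 1)), d = c(0, 0))
res
```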
33/123
=0
Splitting
Corrected SS
Sequential testing
34/123
=0
Splitting
Corrected SS
Sequential testing
35/123
=0
>
>
>
>
Splitting
Corrected SS
Sequential testing
0
midrib + estim
Df Sum of Sq
3
Pr(>F)
36/123
=0
Splitting
Corrected SS
Sequential testing
H0: β0 = 1, β1 = β2

> ( C <- matrix(c(1,0,0,1,0,-1),2,3) )
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1   -1
Testing if part of β is 0
β = (β0, …, β_{r−1}, β_r, …, β_k)^T is split into

γ1 = (β0, …, β_{r−1})^T   and   γ2 = (β_r, …, β_k)^T,

so that γ1 holds the r parameters being tested and γ2 the remaining p − r.
[R(γ1 | γ2)/r] / [SSRes/(n − p)].
Theorem
R(γ1 | γ2) = R(β) − R(γ2),
where R(β) is the regression sum of squares for the full model

y = Xβ + ε = [X1 | X2] (γ1, γ2)^T + ε

and R(γ2) is the regression sum of squares for the reduced model

y = X2 γ2 + ε.
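A numerical illustration of the theorem on simulated data; note that when X2 is a column of ones, R(γ2) reduces to (Σ yi)²/n:

```r
set.seed(7)
n <- 30
X1 <- matrix(rnorm(2 * n), n, 2)        # the columns being tested
X2 <- matrix(1, n, 1)                   # reduced model: intercept only
X  <- cbind(X1, X2)
y  <- X %*% c(1, -1, 3) + rnorm(n)
reg_ss <- function(X, y)
  as.numeric(t(y) %*% X %*% solve(t(X) %*% X) %*% t(X) %*% y)
R_full    <- reg_ss(X, y)
R_reduced <- reg_ss(X2, y)
R_diff    <- R_full - R_reduced         # R(gamma1 | gamma2)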
Lemma
Suppose that

A = [A11  A12      A^{-1} = B = [B11  B12
     A21  A22],                  B21  B22],

and B22^{-1} exists. Then

A11^{-1} = B11 − B12 B22^{-1} B21.
We have

E[R(γ1 | γ2)] = E[y^T (H − H2) y]
  = σ² tr(H − H2) + (Xβ)^T (H − H2)(Xβ)
  = σ² (p − (p − r)) + β^T [X^T X − X^T X2 (X2^T X2)^{-1} X2^T X] β.

Writing X = [X1 | X2] and β = (γ1, γ2)^T, the bracketed matrix is

[X1^T X1   X1^T X2     [X1^T X2 (X2^T X2)^{-1} X2^T X1   X1^T X2
 X2^T X1   X2^T X2]  −  X2^T X1                           X2^T X2],

so every block involving X2 cancels, and

E[R(γ1 | γ2)] = r σ² + γ1^T [X1^T X1 − X1^T X2 (X2^T X2)^{-1} X2^T X1] γ1
             = r σ² + γ1^T A11^{-1} γ1.

The last step is the lemma applied with

B = X^T X = [X1^T X1   X1^T X2
             X2^T X1   X2^T X2],

so that A = (X^T X)^{-1}, A11 is its leading r × r block, and
A11^{-1} = B11 − B12 B22^{-1} B21 = X1^T X1 − X1^T X2 (X2^T X2)^{-1} X2^T X1
(note tr(A11^{-1} A11) = r).
Source of variation    Sum of squares   degrees of freedom   Mean square      F ratio
Full model             R(β)             p
Reduced model          R(γ2)            p − r
γ1 in presence of γ2   R(γ1 | γ2)       r                    R(γ1 | γ2)/r     [R(γ1 | γ2)/r]/MSRes
Residual               y^T y − R(β)     n − p                SSRes/(n − p)
Total                  y^T y            n
Here

γ1 = (β1, β2, β3)^T,   γ2 = (β0),

so r = 3 and p − r = 1.
X =
[4  44  18  1
 2  33  15  1
 ⋮   ⋮   ⋮  ⋮
 5  48  17  1]  = [X1 | X2].
R(γ2) = 21800.
From before,
R(β) = SSReg = 38978 and MSRes = 98,
so
R(γ1 | γ2) = R(β) − R(γ2) = 38978 − 21800 = 17178.
In other words, the intercept alone does not explain the variation in
the response variable adequately, and we are (reasonably) certain
that we need at least one of the terms in the model.
Variation              SS      d.f.   MS     F
Full model             38978   4
Reduced model          21800   1
γ1 in presence of γ2   17178   3      5726   58.2
Residual               689     7      98
Total                  39667   11
Source of variation    Sum of squares      degrees of freedom   Mean square         F ratio
Full model             R(β) = y^T H y      k + 1
Reduced model          (Σ_{i=1}^n yi)²/n   1
γ1 in presence of γ2   R(γ1 | γ2)          k                    R(γ1 | γ2)/k        [R(γ1 | γ2)/k]/MSRes
Residual               y^T y − R(β)        n − k − 1            SSRes/(n − k − 1)
Total                  y^T y               n
Σ_{i=1}^n (yi − ȳ)² = Σ_{i=1}^n yi² − (Σ_{i=1}^n yi)²/n = y^T y − R(γ2).
Source of variation   Sum of squares             degrees of freedom   Mean square         F ratio
Regression            SSReg − (Σ_{i=1}^n yi)²/n  k                    R(γ1 | γ2)/k        [R(γ1 | γ2)/k]/MSRes
Residual              SSRes                      n − k − 1            SSRes/(n − k − 1)
Total                 y^T y − (Σ_{i=1}^n yi)²/n  n − 1
Variation    SS      d.f.   MS     F
Regression   17178   3      5726   58.2
Residual     689     7      98
Total        17867   10

The actual test does not change: the F statistic and degrees of
freedom are the same.
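For the six-point example from earlier, the corrected quantities work out as follows (a sketch; the numbers match the earlier slides up to rounding):

```r
y <- c(1.9, 2.7, 4.2, 4.8, 4.8, 5.1)
X <- cbind(1, 2:7)
correction <- sum(y)^2 / length(y)      # about 92.04
SSReg <- as.numeric(t(y) %*% X %*% solve(t(X) %*% X) %*% t(X) %*% y)
SSRes <- sum(y^2) - SSReg
c(RegCorrected   = SSReg - correction,     # about 7.49
  Res            = SSRes,                  # about 1.10
  TotalCorrected = sum(y^2) - correction)  # about 8.59
```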
Clover example
H0: β0 = 0
> X2 <- X[,-1]
> b2 <- solve(t(X2) %*% X2, t(X2) %*% y)
> (SSRes2 <- sum((y - X2 %*% b2)^2))
[1] 6.296183
> (Rg2 <- SS - SSRes2)
[1] 374.8902
> (Rg2 <- t(y) %*% X2 %*% solve(t(X2) %*% X2) %*% t(X2) %*% y)
[,1]
[1,] 374.8902
> (Rg1g2 <- as.vector(SSReg - Rg2))
[1] 1.79753
H0: β0 = 0
> r <- 1
> (Fstat <- (Rg1g2/r)/(SSRes/(n-p)))
[1] 53.94204
> pf(Fstat, r, n-p, lower.tail=FALSE)
[1] 1.761625e-11
H0: β0 = 0

[anova() output comparing area ~ 0 + midrib + estim with area ~ midrib + estim (1 degree of freedom); the F test is highly significant, in agreement with the calculation above.]
H0: β1 = 0

[anova() output comparing area ~ estim with area ~ midrib + estim (1 degree of freedom); the F test for midrib is highly significant.]
H0: β2 = 0

[anova() output comparing area ~ midrib with area ~ midrib + estim (1 degree of freedom); the F test for estim is highly significant.]
H0: β1 = β2 = 0
[anova() output comparing area ~ 1 with area ~ midrib + estim (2 degrees of freedom); the F test is highly significant.]
> summary(model)

Call:
lm(formula = area ~ midrib + estim, data = clover)

Residuals:
     Min       1Q   Median       3Q      Max
-0.57603 -0.09824  0.01173  0.11355  0.51957

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.58458    0.21575  -7.345 1.76e-11 ***
midrib       0.76731    0.11295   6.793 3.19e-10 ***
estim        0.62183    0.06435   9.663  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sequential testing

Suppose we have a number of explanatory variables and would like a
parsimonious model: one which explains the variation in the response
y using a minimal number of explanatory variables. A parsimonious
model is less likely to suffer from overfitting.
For such a model, if we were to test whether parameter βi is 0 in the
presence of the other model parameters, we should always reject the
null.
How do we find such a minimal set of parameters?
That is, we can start with a simple model and sequentially add
parameters until adding another parameter no longer significantly
improves the fit; at that point we have a parsimonious model.
y = β0 + ε(0)
y = β0 + β1 x1 + ε(1)
⋮
y = β0 + β1 x1 + … + βk xk + ε(k).
Note that these are full regression sums of squares, i.e. we are
looking at the total variation explained by the model in the
presence of no other parameters.
Now by taking the difference between the sums of squares, we can
get the extra variation explained as we add variables to the model
one at a time:
Theorem
Suppose y = Xβ + ε where X is full rank and ε ∼ N(0, σ²I).
Let Xj be the first j + 1 columns of X (the first column is all ones),
and put

Hj = Xj (Xj^T Xj)^{-1} Xj^T,
Rj = y^T Hj y = R(β0, …, βj).
Lemma
Suppose X = [X1 | X2] is full rank, of size n × p with n ≥ p. Then

X2 = X (X^T X)^{-1} X^T X2.

Lemma
For X as above, with X1 of size n × r and X2 of size n × (p − r), we have that

A2 := X (X^T X)^{-1} X^T − X2 (X2^T X2)^{-1} X2^T = H − H2

is symmetric and idempotent, of rank r.
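Both lemmas are easy to check numerically on a random full-rank design (simulated, not from the slides):

```r
set.seed(3)
n <- 15
X1 <- matrix(rnorm(n * 2), n, 2)        # r = 2 columns
X2 <- cbind(1, rnorm(n))                # p - r = 2 columns
X  <- cbind(X1, X2)
H  <- X %*% solve(t(X) %*% X) %*% t(X)
H2 <- X2 %*% solve(t(X2) %*% X2) %*% t(X2)
A2 <- H - H2
all.equal(H %*% X2, X2)                 # first lemma: H X2 = X2
all.equal(A2 %*% A2, A2)                # idempotent
all.equal(A2, t(A2))                    # symmetric
sum(diag(A2))                           # trace = rank = r = 2
```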
Proof of lemmas
Proof of theorem
Note that this is still not entirely satisfactory, because the result
will depend heavily on the order of the parameters considered.
Different orderings can result in different sets of parameters being
included in the final model.
x1 : Beak length
x2 : Wing length
x5 : Width
Variation    SS       d.f.   MS      F
Regression   595.16   6      99.19   200.47
Residual     7.92     16     0.49
Total        603.08   22
R(β0) = 387.16
R(β1 | β0) = 199.15
R(β2 | β0, β1) = 0.127
R(β3 | β0, β1, β2) = 4.12
R(β4 | β0, β1, β2, β3) = 0.263
R(β5 | β0, β1, β2, β3, β4) = 4.35

Note that these sum to the regression sum of squares for the full
model, 595.16.
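In R, anova() on an lm fit reports exactly these sequential sums of squares, and the term rows add up to the (corrected) regression sum of squares. A sketch on simulated data:

```r
set.seed(11)
n <- 40
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 2 * d$x1 - d$x3 + rnorm(n)
fit <- lm(y ~ x1 + x2 + x3, data = d)
tab <- anova(fit)                       # sequential sums of squares
seq_sum <- sum(tab[c("x1", "x2", "x3"), "Sum Sq"])
reg_ss  <- sum((fitted(fit) - mean(d$y))^2)
all.equal(seq_sum, reg_ss)              # TRUE up to rounding
```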
Forward selection
1. Start with the intercept-only model.
2. Calculate the F values for the tests H0: βi = 0 for each variable
not yet in the model, in the presence of the terms already included.
3. If none of the tests is significant, stop.
4. Otherwise, add the most significant variable (the one with the
largest F value).
5. Return to step 2.
Backward elimination
A method which is conceptually very similar to forward selection is
backward elimination:
1. Start with the full model.
2. Calculate the F values for the tests H0: βi = 0, for all
parameters in the model, in the presence of the other
parameters in the model.
3. If all of the tests are significant (we reject all null hypotheses),
then stop.
4. Otherwise, remove the least significant parameter (the
parameter with the smallest F value).
5. Return to step 2.
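These steps translate directly into a small R loop built on drop1(); the data here are simulated and alpha is an arbitrary choice:

```r
set.seed(5)
n <- 50
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n),
                x4 = rnorm(n))
d$y <- 2 + 3 * d$x1 - 2 * d$x2 + rnorm(n)   # x3, x4 are irrelevant
model <- lm(y ~ x1 + x2 + x3 + x4, data = d)
alpha <- 0.05
repeat {
  tests <- drop1(model, test = "F")
  pvals <- tests[["Pr(>F)"]][-1]            # first row is <none>
  if (all(pvals < alpha)) break             # every term significant
  worst <- rownames(tests)[-1][which.max(pvals)]
  model <- update(model, as.formula(paste(". ~ . -", worst)))
}
formula(model)
```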
> pairs(heat)

[Scatterplot matrix produced by pairs(heat), showing x1, x2, x3 and x4 against each other.]
F value   Pr(>F)
12.6025   0.0045520 **
21.9606   0.0006648 ***
 4.4034   0.0597623 .
22.7985   0.0005762 ***
Backward elimination

> fullmodel <- lm(y ~ x1+x2+x3+x4, data=heat)
> drop1(fullmodel, scope= ~ ., test="F")
Single term deletions

Model:
y ~ x1 + x2 + x3 + x4
       Df Sum of Sq    RSS    AIC F value  Pr(>F)
<none>              47.864 26.944
x1      1   25.9509 73.815 30.576  4.3375 0.07082 .
x2      1    2.9725 50.836 25.728  0.4968 0.50090
x3      1    0.1091 47.973 24.974  0.0182 0.89592
x4      1    0.2470 48.111 25.011  0.0413 0.84407
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> model2 <- lm(y ~ x1+x2+x4, data=heat)
Stepwise selection

Stepwise selection functions similarly to forward or backward
selection, but with the possibility of either adding or eliminating a
variable at each step. We give a procedure using a goodness-of-fit
measure called the AIC, though it is easy to adjust it to use any
other goodness-of-fit statistic.
1. Start with any model.
2. Compute the AIC of all models which have either one extra
variable or one fewer variable than the current model.
3. If the AIC of all such models is higher than the AIC of the
current model, stop.
4. Otherwise, change to the model with the lowest AIC.
5. Return to step 2.
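R's step() function implements this AIC-driven search; a sketch on simulated data, starting from the intercept-only model:

```r
set.seed(9)
n <- 60
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 2 * d$x1 + rnorm(n)              # only x1 matters
null_model <- lm(y ~ 1, data = d)
chosen <- step(null_model, scope = ~ x1 + x2 + x3,
               direction = "both", trace = 0)
formula(chosen)                             # should involve x1
```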
Goodness-of-fit measures
The F test is used to compare nested models, that is, it requires
the variable set of one model to be fully contained in the variable
set of the other model. Thus we cannot use an F test to compare
models which, for example, have replaced one variable with
another variable.
Also, use of the F test requires the somewhat arbitrary choice of a
significance level.
To overcome these problems many authors have proposed
goodness-of-fit measures, which try to give a measure of how good
a model is, independently of other models (though still dependent
on the data in question).
R²

A commonly reported goodness-of-fit statistic is the proportion of
(corrected) total sum of squares that is explained by the model:

R² = 1 − SSRes / (SSTotal − (Σ_i yi)²/n).

R² lies between 0 and 1, and the larger it is, the more variation in
y is explained by the model. (We are assuming that β0 is always in
the model.)
However, R² can never decrease when we add a variable to a
model, as even an irrelevant variable will explain a small extra
amount of variation. We would like to remove irrelevant variables,
so, like the SSRes, R² is not appropriate for model selection.
Adjusted R²

The adjusted R² tries to account for model complexity by
introducing a penalty based on the number of parameters in the
model:

adj R² = 1 − [(n − 1)/(n − 1 − k)] (1 − R²).

Here we are assuming that β0 is in the model, and k is the number
of other parameters in the model.
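A quick simulated illustration: adding a pure-noise column never lowers R², while the adjusted R² is free to drop:

```r
set.seed(2)
n <- 30
d <- data.frame(x1 = rnorm(n))
d$y <- 1 + d$x1 + rnorm(n)
d$junk <- rnorm(n)                          # an irrelevant variable
small <- summary(lm(y ~ x1, data = d))
big   <- summary(lm(y ~ x1 + junk, data = d))
c(small$r.squared, big$r.squared)           # second never smaller
c(small$adj.r.squared, big$adj.r.squared)
```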
AIC

A very popular goodness-of-fit statistic is Akaike's information
criterion, or AIC. This is based on the likelihood of the observed
values of the response:

AIC = −2 ln(likelihood) + 2p
    = n ln(SSRes/n) + 2p + const.
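This is the form R itself uses during model selection: extractAIC() on an lm fit drops the constant and returns n ln(SSRes/n) + 2p, which we can reproduce by hand on simulated data:

```r
set.seed(4)
n <- 25
x <- rnorm(n)
y <- 1 + x + rnorm(n)
fit <- lm(y ~ x)
p <- length(coef(fit))                      # parameters in the mean
SSRes <- sum(resid(fit)^2)
by_hand <- n * log(SSRes / n) + 2 * p
all.equal(as.numeric(extractAIC(fit)[2]), by_hand)   # TRUE
```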
Start: AIC=71.44
y ~ 1

       Df Sum of Sq     RSS    AIC
+ x4    1   1831.90  883.87 58.852
+ x2    1   1809.43  906.34 59.178
+ x1    1   1450.08 1265.69 63.519
+ x3    1    776.36 1939.40 69.067
<none>              2715.76 71.444

Step: AIC=58.85
y ~ x4
       Df Sum of Sq     RSS    AIC
+ x1    1    809.10   74.76 28.742
+ x3    1    708.13  175.74 39.853
<none>               883.87 58.852
+ x2    1     14.99  868.88 60.629
- x4    1   1831.90 2715.76 71.444

Step: AIC=28.74
y ~ x4 + x1
       Df Sum of Sq     RSS    AIC
+ x2    1     26.79   47.97 24.974
+ x3    1     23.93   50.84 25.728
<none>                74.76 28.742
- x1    1    809.10  883.87 58.852
- x4    1   1190.92 1265.69 63.519

Step: AIC=24.97
y ~ x4 + x1 + x2
       Df Sum of Sq    RSS    AIC
<none>               47.97 24.974
- x4    1      9.93  57.90 25.420
+ x3    1      0.11  47.86 26.944
- x2    1     26.79  74.76 28.742
- x1    1    820.91 868.88 60.629

Call:
lm(formula = y ~ x4 + x1 + x2, data = heat)

Coefficients:
(Intercept)           x4           x1           x2
    71.6483      -0.2365       1.4519       0.4161
       Df Sum of Sq    RSS    AIC
<none>               47.97 24.974
- x4    1      9.93  57.90 25.420
+ x3    1      0.11  47.86 26.944
- x2    1     26.79  74.76 28.742
- x1    1    820.91 868.88 60.629

Call:
lm(formula = y ~ x1 + x2 + x4, data = heat)

Coefficients:
(Intercept)           x1           x2           x4
    71.6483       1.4519       0.4161      -0.2365
t tests

We can also use a t test for a partial test of one parameter, that
is, to test H0: βi = 0 against H1: βi ≠ 0 in the presence of all the
other parameters. The corresponding confidence interval is

bi ± t_{α/2} s √(cii),

where cii is the (i, i)th entry of (X^T X)^{-1}, and we use a t
distribution with n − p degrees of freedom. If this confidence
interval includes 0, we do not reject H0; otherwise, we reject it.
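The by-hand t statistic bi/(s√cii) agrees with the one printed by summary(); a sketch on simulated data:

```r
set.seed(6)
n <- 20
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)
X <- cbind(1, x)
XtXi <- solve(t(X) %*% X)
b <- XtXi %*% t(X) %*% y
s <- sqrt(sum((y - X %*% b)^2) / (n - 2))   # residual standard error
t_by_hand <- as.numeric(b[2] / (s * sqrt(XtXi[2, 2])))
t_summary <- summary(lm(y ~ x))$coefficients["x", "t value"]
all.equal(t_by_hand, t_summary)             # TRUE
```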
Let us compare this with our existing partial F test. The statistic
we use for this is

R(βi | β0, …, β_{i−1}, β_{i+1}, …, βk) / [SSRes/(n − p)].

The denominator is of course s².
One can show that the numerator equals

bi² / cii.
This means that the t test and the F test are (nearly) identical;
the t test is actually slightly more useful, because it also gives an
indication of the sign of the parameter.
Here b1 = 0.86 and c11 = 0.00057, giving

t = b1 / (s √c11) = 14.89,

which, using a t distribution with n − p = 6 − 2 = 4 degrees of
freedom, would reject the hypothesis β1 = 0 at the 0.05 level
(critical value 2.78). We can also say that β1 is almost certainly
positive.
Variation              SS       d.f.   MS       F
Full model             663.77   2
Reduced model          498.68   1
β1 in presence of β0   165.09   1      165.09   221.7
Residual               2.98     4      0.74
Total                  666.75   6
Shrinkage
Ridge regression shrinks the coefficients by minimising the penalised
sum of squares

Σ_{i=1}^n ei² + λ Σ_{j=0}^k bj².
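For the squared penalty the minimiser has a closed form, b(λ) = (X^T X + λI)^{-1} X^T y, and increasing λ shrinks the coefficients toward 0. A sketch on simulated data (in practice the predictors are usually standardised first):

```r
set.seed(8)
n <- 40
X <- cbind(1, matrix(rnorm(2 * n), n, 2))
y <- X %*% c(1, 2, -1) + rnorm(n)
ridge <- function(lambda)
  solve(t(X) %*% X + lambda * diag(ncol(X)), t(X) %*% y)
cbind(ols = ridge(0), shrunk = ridge(100))  # shrunk toward zero
```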
The lasso instead penalises the absolute values of the coefficients,
minimising

Σ_{i=1}^n ei² + λ Σ_{j=0}^k |bj|.