
MATH2831/2931

Linear Models / Higher Linear Models

August 19, 2013

Week 4 Lecture 3 - Last lecture:

Confidence Intervals for coefficients

Properties of multivariate Gaussian

Hypothesis testing for coefficients

Confidence intervals for the mean and prediction intervals.

Joint confidence regions.

Week 4 Lecture 3 - This lecture:

Decomposing variation

Introduction to the analysis of variance table

Sequential sums of squares.

Week 4 Lecture 3 - Decomposing variation


RECALL: Identity for simple linear regression

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$

that is,

$$SS_{total} = SS_{reg} + SS_{res}.$$

$SS_{total}$, the total sum of squares: the sum of squared deviations of the responses about their mean.

$SS_{reg}$, the regression sum of squares: the sum of squared deviations of the fitted values about their mean, which is $\bar{y}$.

$SS_{res}$, the residual sum of squares: the sum of squared deviations of the fitted values from the responses.
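
As a numerical illustration of the decomposition, here is a minimal Python sketch; the data are simulated and the coefficients are made up:

import numpy as np

rng = np.random.default_rng(0)

# Simulated data: n = 30 observations, an intercept and two predictors.
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

# Least squares fit and fitted values.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b

ss_total = np.sum((y - y.mean()) ** 2)      # responses about their mean
ss_reg = np.sum((y_hat - y.mean()) ** 2)    # fitted values about y-bar
ss_res = np.sum((y - y_hat) ** 2)           # responses about fitted values

# The identity holds up to floating point rounding.
assert np.isclose(ss_total, ss_reg + ss_res)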

Week 4 Lecture 3 - Decomposing variation

This identity, decomposing variation into a part explained by the model and a part left unexplained, holds in the general linear model.

For simple linear regression, the partitioning of variation was presented in the analysis of variance (ANOVA) table.

The ANOVA table was also a way of organizing calculations in hypothesis testing.

Week 4 Lecture 3 - Adjusted $R^2$

For simple linear regression,

$$R^2 = \frac{SS_{reg}}{SS_{total}}.$$

We also have the adjusted $R^2$, written as $\bar{R}^2$.

$\bar{R}^2 = 0.748$ here (or 74.8 percent).

What is the definition of $\bar{R}^2$?

Rewrite $R^2$ as

$$R^2 = 1 - \frac{SS_{res}}{SS_{total}}. \qquad (1)$$

Define $\bar{R}^2$ by replacing $SS_{res}$ in (1) by $\hat{\sigma}^2$ (which is $SS_{res}/(n-p)$) and replacing $SS_{total}$ by $SS_{total}/(n-1)$.

Week 4 Lecture 3 - Adjusted $R^2$

$$\bar{R}^2 = 1 - \frac{(n-1)\,SS_{res}}{(n-p)\,SS_{total}} \qquad (2)$$

or

$$\bar{R}^2 = 1 - \frac{\hat{\sigma}^2 (n-1)}{SS_{total}}. \qquad (3)$$

In terms of $R^2$,

$$\bar{R}^2 = 1 - \frac{n-1}{n-p}\,(1 - R^2).$$
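
A small Python sketch checking that forms (2) and (3) and the expression in terms of $R^2$ agree; the sample size, parameter count and sums of squares below are made-up numbers:

import numpy as np

n, p = 25, 8                     # hypothetical sample size and parameter count
ss_total, ss_res = 100.0, 25.8   # hypothetical sums of squares
sigma2_hat = ss_res / (n - p)    # residual variance estimate

r2 = 1.0 - ss_res / ss_total                           # form (1)
adj_a = 1.0 - (n - 1) * ss_res / ((n - p) * ss_total)  # form (2)
adj_b = 1.0 - sigma2_hat * (n - 1) / ss_total          # form (3)
adj_c = 1.0 - (n - 1) / (n - p) * (1.0 - r2)           # in terms of R^2

assert np.isclose(adj_a, adj_b) and np.isclose(adj_a, adj_c)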

Week 4 Lecture 3 - Adjusted $R^2$

What was the motivation for introducing $\bar{R}^2$?

$R^2$ is an easily interpreted measure of fit of a linear model: the proportion of total variation explained by the model.

We might be tempted to use $R^2$ as a basis for comparing models with different numbers of parameters.

IMPORTANT: $R^2$ is not helpful here: if a new predictor is added to a linear model, the residual sum of squares always decreases, and $R^2$ will increase.

Attempting to select a subset of good predictors from a set of possible predictors using $R^2$ therefore results in the full model, even if many of the predictors are irrelevant.

$\bar{R}^2$ does not necessarily increase as new predictors are added to a model.

Week 4 Lecture 3 - Adjusted $R^2$

Since

$$\bar{R}^2 = 1 - \frac{\hat{\sigma}^2 (n-1)}{SS_{total}},$$

$\bar{R}^2$ increases as $\hat{\sigma}^2$ decreases.

Ranking models using $\bar{R}^2$ is equivalent to ranking models based on $\hat{\sigma}^2$.

QUESTION: Does $\hat{\sigma}^2$ necessarily decrease as new predictors are added to the model, and hence must $\bar{R}^2$ increase?

Week 4 Lecture 3 - Adjusted $R^2$

Recall

$$\hat{\sigma}^2 = \frac{(y - Xb)'(y - Xb)}{n - p}.$$

Consider two models in which one model contains a subset of the predictors included in the other.

For the larger model, the numerator in the above expression (the residual sum of squares) is smaller, but the denominator will also be smaller, since p is larger.

Any reduction in the residual sum of squares must be large enough to overcome the reduction in the denominator.

$\bar{R}^2$ doesn't necessarily increase as we make the model more complicated.

So $\bar{R}^2$ may be useful as a crude device for model comparison!

Week 4 Lecture 3 - Analysis of variance table

Notation: $\beta = (\beta_0, \ldots, \beta_k)'$, partitioned as $\beta = (\beta^{(1)\prime}, \beta^{(2)\prime})'$, where $\beta^{(1)}$ is an $r \times 1$ subvector and $\beta^{(2)}$ is a $(p - r) \times 1$ subvector.

Week 4 Lecture 3 - Sequential sums of squares

Write $R(\beta^{(2)} \mid \beta^{(1)})$ for the increase in $SS_{reg}$ when the predictors corresponding to the parameters $\beta^{(2)}$ are added to a model involving the parameters $\beta^{(1)}$.

Think of $R(\beta^{(2)} \mid \beta^{(1)})$ as the variation explained by the term involving $\beta^{(2)}$ in the presence of the term involving $\beta^{(1)}$.

Define $R(\beta_1, \ldots, \beta_k \mid \beta_0)$ as $SS_{reg}$.

Week 4 Lecture 3 - Sequential sums of squares

Sequential sums of squares, shown below the analysis of variance table, are the values

$R(\beta_1 \mid \beta_0)$
$R(\beta_2 \mid \beta_0, \beta_1)$
$R(\beta_3 \mid \beta_0, \beta_1, \beta_2)$
$\vdots$
$R(\beta_k \mid \beta_0, \ldots, \beta_{k-1})$

Values add up to $R(\beta_1, \ldots, \beta_k \mid \beta_0) = SS_{reg}$ (see the sketch below).
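
A minimal Python sketch of how these values could be computed by refitting nested models, assuming the design matrix has the intercept first and the predictor columns already in the chosen order:

import numpy as np

def sequential_ss(X, y):
    """Sequential sums of squares R(b1|b0), R(b2|b0,b1), ...
    Assumes X has an intercept as its first column and the remaining
    columns in the order of interest. Each step refits by least squares
    with one more column and records the increase in SS_reg."""
    y_bar = y.mean()
    seq, ss_reg_prev = [], 0.0
    for j in range(2, X.shape[1] + 1):
        b, *_ = np.linalg.lstsq(X[:, :j], y, rcond=None)
        ss_reg = np.sum((X[:, :j] @ b - y_bar) ** 2)
        seq.append(ss_reg - ss_reg_prev)
        ss_reg_prev = ss_reg
    return np.array(seq)

Summing the returned values recovers the full-model $SS_{reg}$, consistent with the identity above.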

Week 4 Lecture 3 - Sequential sums of squares

Sequential sums of squares are useful when we have first ordered the variables in our model in a meaningful way (based on the underlying science or context).

They tell us how much a term contributes to explaining variation given all the previous terms in the table (but ignoring the terms which come after).

Week 4 Lecture 3 - Hypothesis testing

Simple linear regression model: t test (or equivalent F test) for examining the usefulness of a predictor.

General linear model: partial t test for the usefulness of a predictor in the presence of the other predictors.

Equivalent partial F test: the test statistic is the square of the partial t statistic.

Test for overall model adequacy: is the model including all the predictors better than the model containing just an intercept?

The F statistic in the analysis of variance table and the p-value relate to a test for overall model adequacy!

Week 4 Lecture 3 - Testing model adequacy

In the general linear model, if $\beta_1 = \cdots = \beta_k = 0$, then the statistic

$$F = \frac{SS_{reg}/k}{SS_{res}/(n-p)}$$

has an $F_{k,\,n-p}$ distribution.

This distributional result is the basis for a hypothesis test.

Week 4 Lecture 3 - Testing model adequacy


To test

$$H_0: \beta_1 = \cdots = \beta_k = 0$$

versus

$$H_1: \text{not all } \beta_j = 0, \quad j = 1, \ldots, k,$$

we use the test statistic

$$F = \frac{SS_{reg}/k}{SS_{res}/(n-p)}.$$

For a size $\alpha$ test the critical region is

$$F \ge F_{\alpha;\,k,\,n-p}.$$

Alternatively, the p-value for the test is $\Pr(F' \ge F)$, where $F' \sim F_{k,\,n-p}$ and $F$ is the observed value of the statistic.
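
A sketch in Python of carrying out this test from the sums of squares, using scipy.stats.f for the $F_{k,\,n-p}$ quantile and tail probability (the function name overall_f_test is mine, not from the lecture):

from scipy import stats

def overall_f_test(ss_reg, ss_res, n, p, alpha=0.05):
    """Test H0: beta_1 = ... = beta_k = 0 in a model with p parameters
    (so k = p - 1 predictors) fitted to n observations."""
    k = p - 1
    f_obs = (ss_reg / k) / (ss_res / (n - p))
    f_crit = stats.f.ppf(1.0 - alpha, k, n - p)  # F_{alpha; k, n-p}
    p_value = stats.f.sf(f_obs, k, n - p)        # Pr(F' >= f_obs)
    reject = f_obs >= f_crit
    return f_obs, p_value, reject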

Week 4 Lecture 3 - Testing model adequacy


ANOVA table: columns show source of variation (Source), degrees
of freedom (DF), sums of squares (SS), mean squares (MS), value
of F statistic for testing model adequacy (F) and corresponding
p-value (P).

Source       DF      SS         MS              F
Regression   p - 1   SSreg      SSreg/(p-1)     MSreg/MSres
Residual     n - p   SSres      SSres/(n-p)
Total        n - 1   SStotal

Week 4 Lecture 3 - Model adequacy for risk assessment

Risk assessment data: the response is mean risk assessment, with seven accounting-determined measures of risk as predictors.

Week 4 Lecture 3 - Model adequacy for risk assessment


RECALL: we are testing

$$H_0: \beta_1 = \cdots = \beta_k = 0$$

versus

$$H_1: \text{not all } \beta_j = 0, \quad j = 1, \ldots, k,$$

using the test statistic

$$F = \frac{SS_{reg}/k}{SS_{res}/(n-p)}.$$

The F statistic for testing overall model adequacy is 6.97, and the associated p-value is

$$p = \Pr(F' \ge 6.97), \quad F' \sim F_{7,17},$$

which gives $p = 0.001$ approximately.
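
As a quick check of the quoted figure, the tail probability can be computed with scipy (this is my own check, not part of the original slide):

from scipy import stats

# Pr(F' >= 6.97) with F' ~ F(7, 17); approximately 0.001 as quoted.
print(stats.f.sf(6.97, 7, 17))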

Week 4 Lecture 3 - Model adequacy for risk assessment

RESULT: Reject the null hypothesis

$$H_0: \beta_1 = \cdots = \beta_k = 0$$

in favour of the alternative

$$H_1: \text{not all } \beta_j = 0, \quad j = 1, \ldots, k.$$

What can we say about inclusion of predictors in the order we have selected?

Mean Risk Assessment = 2.19 + 0.443 Dividend Payout + 0.865 Current Ratio - 0.247 Asset Size + 1.96 Asset Growth + 3.59 Leverage + 0.135 Variability Earnings + 1.05 Covariability Earnings

From this ordering we have: $R(\beta_1 \mid \beta_0) = 18.42$; $R(\beta_2 \mid \beta_1, \beta_0) = 5.6042$; $R(\beta_3 \mid \beta_2, \beta_1, \beta_0) = 10.12$; $R(\beta_4 \mid \beta_3, \beta_2, \beta_1, \beta_0) = 1.64$; ...

Week 4 Lecture 3 - Model adequacy for risk assessment

Under a different ordering:

Mean Risk Assessment = 2.19 + 0.865 Current Ratio + 1.96 Asset Growth + 3.59 Leverage + 0.443 Dividend Payout - 0.247 Asset Size + 1.05 Covariability Earnings + 0.135 Variability Earnings

NOTE: The joint model adequacy F test statistic does not change, and neither does the result of the hypothesis test!

NOTE: The estimates S = 0.981620, R-Sq = 74.2% and R-Sq(adj) = 63.5% are unchanged!

NOTE: $R(\beta_1 \mid \beta_0)$, $R(\beta_2 \mid \beta_1, \beta_0)$, ... clearly do change!

Week 4 Lecture 3 - Learning Expectations.

Be familiar with decomposing variation in the general linear model.

Understand sequential sums of squares and be able to interpret and calculate them.

Understand $R^2$ versus $\bar{R}^2$ (adjusted $R^2$).
