
Multiple regression

Variance-Covariance Matrix of Estimators


Analysis of Variances
F-Test for Global Significance of Regression
Testing the Coefficients
Estimators Properties
The estimators are linear, unbiased and efficient.
The multiple regression model, with coefficients $a_0, a_1, \ldots, a_k$, can be written:

$$Y = Xa + \varepsilon$$

and, once estimated:

$$Y = X\hat{a} + e, \qquad \hat{Y} = X\hat{a}$$

where the residuals are:

$$e = Y - X\hat{a} = Y - \hat{Y}$$

The least-squares estimator is:

$$\hat{a} = (X'X)^{-1}X'Y = (X'X)^{-1}X'(Xa + \varepsilon) = (X'X)^{-1}X'Xa + (X'X)^{-1}X'\varepsilon = a + (X'X)^{-1}X'\varepsilon$$

The estimators are unbiased when:

$$E(\hat{a}) = a$$

This holds, since

$$E(\hat{a}) = a + (X'X)^{-1}X'E(\varepsilon) = a$$

because by the hypothesis

$$E(\varepsilon_t) = 0$$
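The identity $\hat{a} = a + (X'X)^{-1}X'\varepsilon$ can be checked numerically. The following is a minimal sketch on simulated data, assuming NumPy is available; the sample size, the "true" coefficients and the error scale are hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
n, k = 200, 2                           # hypothetical sample size and number of regressors

# Design matrix X: a column of ones (intercept a_0) plus k simulated regressors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
a = np.array([1.0, 2.0, -0.5])          # hypothetical "true" coefficients a_0, a_1, a_2
eps = rng.normal(scale=1.5, size=n)     # errors drawn with E(eps_t) = 0
Y = X @ a + eps                         # the model Y = Xa + eps

XtX_inv = np.linalg.inv(X.T @ X)
a_hat = XtX_inv @ X.T @ Y               # a_hat = (X'X)^{-1} X'Y
e = Y - X @ a_hat                       # residuals e = Y - X a_hat

# The estimation error equals (X'X)^{-1} X' eps, as in the derivation.
identity_holds = np.allclose(a_hat - a, XtX_inv @ X.T @ eps)
print(a_hat, identity_holds)
```

Because the error term has zero mean, the correction $(X'X)^{-1}X'\varepsilon$ averages out over repeated samples, which is the unbiasedness property stated above.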
Variance-Covariance Matrix of Estimators
- It is denoted $\Omega_{\hat{a}}$.
- It contains the variances, from which the standard deviations can be calculated, and the covariances of the regression coefficients.

$$\Omega_{\hat{a}} = E[(\hat{a} - a)(\hat{a} - a)']$$

$$\hat{a} - a = (X'X)^{-1}X'\varepsilon$$

$$(\hat{a} - a)' = \varepsilon' X (X'X)^{-1}$$

$(X'X)^{-1}$ is symmetrical and

$$\left((X'X)^{-1}\right)' = (X'X)^{-1}$$

$$(\hat{a} - a)(\hat{a} - a)' = (X'X)^{-1}X'\varepsilon\varepsilon' X (X'X)^{-1}$$

$$\Omega_{\hat{a}} = E[(\hat{a} - a)(\hat{a} - a)'] = (X'X)^{-1}X'E(\varepsilon\varepsilon')X(X'X)^{-1}$$
Variance-Covariance Matrix of Errors
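- It is $\Omega_{\varepsilon} = E(\varepsilon\varepsilon')$:

$$\Omega_{\varepsilon} = E(\varepsilon\varepsilon') =
\begin{pmatrix}
E(\varepsilon_1\varepsilon_1) & E(\varepsilon_1\varepsilon_2) & \cdots & E(\varepsilon_1\varepsilon_n) \\
E(\varepsilon_2\varepsilon_1) & E(\varepsilon_2\varepsilon_2) & \cdots & E(\varepsilon_2\varepsilon_n) \\
\vdots & \vdots & \ddots & \vdots \\
E(\varepsilon_n\varepsilon_1) & E(\varepsilon_n\varepsilon_2) & \cdots & E(\varepsilon_n\varepsilon_n)
\end{pmatrix}
=
\begin{pmatrix}
\sigma_{\varepsilon}^2 & 0 & \cdots & 0 \\
0 & \sigma_{\varepsilon}^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma_{\varepsilon}^2
\end{pmatrix}
= \sigma_{\varepsilon}^2 I$$

Then the variance-covariance matrix of estimators is:

$$\Omega_{\hat{a}} = \sigma_{\varepsilon}^2 (X'X)^{-1}X'X(X'X)^{-1} = \sigma_{\varepsilon}^2 (X'X)^{-1}$$

The variance of the errors has an unbiased estimator in the variance of the residuals:

$$\hat{\sigma}_{\varepsilon}^2 = \frac{e'e}{n - k - 1}$$

Replacing the errors variance by its estimator gives an estimate of the variance-covariance matrix of estimators:

$$\hat{\Omega}_{\hat{a}} = \hat{\sigma}_{\varepsilon}^2 (X'X)^{-1}$$

When the number of observations tends to $+\infty$, the variance of the estimators tends to 0, and we say that the estimator is convergent; under the classical hypotheses it also has minimum variance among linear unbiased estimators.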
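The estimated matrix $\hat{\sigma}_{\varepsilon}^2 (X'X)^{-1}$ can be computed directly. The sketch below assumes NumPy and uses the 14 observations from the example table that appears later in these slides; the variable names are illustrative. For reference, the slides report $\hat{\sigma}_{\varepsilon}^2 = 6.745$ and a standard deviation of 0.298 for the coefficient of $x_1$.

```python
import numpy as np

# Observations copied from the example table in these slides (columns: y, x1, x2, x3).
data = np.array([
    [17, 3, 42, 115], [19, 2, 40, 126], [15, 4, 40, 148], [21, 7, 44, 139],
    [19, 8, 39, 123], [24, 9, 38, 150], [26, 9, 29, 126], [24, 6, 30, 141],
    [26, 6, 38, 122], [21, 9, 35, 157], [24, 5, 29, 155], [26, 10, 28, 166],
    [30, 13, 32, 168], [26, 8, 26, 174],
], dtype=float)

y = data[:, 0]
X = np.column_stack([np.ones(len(y)), data[:, 1:]])   # intercept column + x1, x2, x3
n, k = len(y), X.shape[1] - 1                          # 14 observations, 3 regressors

XtX_inv = np.linalg.inv(X.T @ X)
a_hat = XtX_inv @ X.T @ y                 # a_hat = (X'X)^{-1} X'y
e = y - X @ a_hat                         # residuals
sigma2_hat = (e @ e) / (n - k - 1)        # unbiased estimate of the error variance
omega_hat = sigma2_hat * XtX_inv          # estimated variance-covariance matrix of a_hat
std_errors = np.sqrt(np.diag(omega_hat))  # standard deviations of the coefficients

print("a_hat      =", np.round(a_hat, 3))
print("sigma2_hat =", round(float(sigma2_hat), 3))
print("std errors =", np.round(std_errors, 3))
```

The diagonal of `omega_hat` holds the variances of the coefficients and the off-diagonal entries hold their covariances, so the matrix is symmetric by construction.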
Analysis of variances and adjustment quality
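TSS = ESS + RSS

$$\sum_{t=1}^{n}(y_t - \bar{y})^2 = \sum_{t=1}^{n}(\hat{y}_t - \bar{y})^2 + \sum_{t=1}^{n}(y_t - \hat{y}_t)^2$$

where TSS is the total sum of squares, ESS the explained sum of squares and RSS the residual sum of squares.

The quality of the adjustment (goodness of fit) is assessed with the coefficient of determination, $R^2$:

$$R^2 = \frac{\sum_{t=1}^{n}(\hat{y}_t - \bar{y})^2}{\sum_{t=1}^{n}(y_t - \bar{y})^2} = 1 - \frac{\sum_{t=1}^{n} e_t^2}{\sum_{t=1}^{n}(y_t - \bar{y})^2}$$

When the number of observations is small, $R^2$ is corrected by the number of degrees of freedom, giving the adjusted coefficient:

$$\bar{R}^2 = 1 - \frac{n-1}{n-k-1}\,(1 - R^2)$$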
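As a worked illustration of the decomposition and of the two coefficients, the sketch below uses the $y$ column of the example table given later in these slides. It assumes that RSS can be recovered from the error variance reported there ($\hat{\sigma}_{\varepsilon}^2 = 6.745$ with $n - k - 1 = 10$); that back-calculation is an assumption of this example, not a figure quoted in the slides.

```python
# Dependent variable y copied from the example table in these slides.
y = [17, 19, 15, 21, 19, 24, 26, 24, 26, 21, 24, 26, 30, 26]
n, k = len(y), 3                          # 14 observations, 3 explanatory variables

y_bar = sum(y) / n
tss = sum((v - y_bar) ** 2 for v in y)    # total sum of squares

# Assumption: RSS is recovered from the reported error variance,
# sigma2_hat = 6.745 = RSS / (n - k - 1).
rss = 6.745 * (n - k - 1)
ess = tss - rss                           # explained sum of squares, from TSS = ESS + RSS

r2 = 1 - rss / tss
r2_adj = 1 - (n - 1) / (n - k - 1) * (1 - r2)

print(f"TSS={tss:.2f}  ESS={ess:.2f}  RSS={rss:.2f}")
print(f"R2={r2:.3f}  adjusted R2={r2_adj:.3f}")
```

With these inputs $R^2$ comes out at roughly 0.70 and the adjusted value at roughly 0.61; the gap reflects the small sample relative to the number of regressors.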
The Analysis of Variances Fisher Test for Global Significance of Regression

The global significance test of the regression asks:
Is there at least one significant explanatory variable?
The hypotheses are:
H0: $a_1 = a_2 = \ldots = a_k = 0$
(all the coefficients are 0 and no explanatory variable contributes to explaining the variable y; the intercept $a_0$ is left out of the test, because a model in which only the constant term is significant has no economic meaning.)
H1: there is at least one coefficient $a_i \neq 0$.

Accepting H0 means that there is no significant linear relation between y and the variables $x_i$, i = 1, 2, ..., k.
Testing the null hypothesis amounts to testing whether the explained variance ESS is significantly different from 0.
H0: ESS = 0
H1: ESS $\neq$ 0

The test statistic is:

$$F^* = \frac{ESS / k}{RSS / (n - k - 1)}$$
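A short sketch of the statistic follows. The sums of squares are assumptions derived from the example later in these slides (TSS from the $y$ column, RSS from the reported error variance 6.745 with 10 degrees of freedom). The critical value 3.71 is the standard tabulated $F(3, 10)$ at the 5% level; it is not quoted in the slides.

```python
def global_f_statistic(ess, rss, n, k):
    """F* = (ESS / k) / (RSS / (n - k - 1)) for H0: a_1 = ... = a_k = 0."""
    return (ess / k) / (rss / (n - k - 1))

n, k = 14, 3
tss = 226.857          # computed from the y column of the example table
rss = 67.45            # assumption: 6.745 * (n - k - 1), from the reported error variance
ess = tss - rss        # from TSS = ESS + RSS

f_star = global_f_statistic(ess, rss, n, k)
f_crit = 3.71          # tabulated F(3, 10) at the 5% level; not quoted in the slides

reject_h0 = f_star > f_crit
print(f"F* = {f_star:.2f}, reject H0: {reject_h0}")
```

H0 is rejected when $F^*$ exceeds the tabulated value with $k$ and $n - k - 1$ degrees of freedom, which is the case for these inputs.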
Multiple Regression
Testing the Coefficients
Adding New Variables
Testing the Coefficients (1)
Comparing a parameter with a fixed
value a

Comparing a set of parameters with a
set of fixed values
The t-ratio - Testing the Coefficients (2)
Comparing a parameter $a_i$ with a fixed value $a$
H0: $a_i = a$
H1: $a_i \neq a$

The t-ratio is:

$$t^*_{\hat{a}_i} = \frac{|\hat{a}_i - a|}{\hat{\sigma}_{\hat{a}_i}}$$

If $t^*_{\hat{a}_i} > t^{\alpha/2}_{n-k-1}$, H0 is rejected: $a_i$ is significantly different from $a$ at the significance level $\alpha$, that is, with a probability of $1 - \alpha$.

If $t^*_{\hat{a}_i} \le t^{\alpha/2}_{n-k-1}$, H0 is accepted: $a_i$ is not significantly different from $a$ at the significance level $\alpha$, that is, with a probability of $1 - \alpha$.

A special case is when $a = 0$, and the t-ratio is:

$$t^*_{\hat{a}_i} = \frac{|\hat{a}_i|}{\hat{\sigma}_{\hat{a}_i}}$$
Testing the Coefficients (3)
Comparing a set of parameters with a set of fixed values
H0: $a_q = a_q^f$
H1: $a_q \neq a_q^f$,
where $a_q^f$ is the fixed vector of values, and q is the number of coefficients tested, which is the size of the two vectors.
To accept the hypothesis H0 it is sufficient that:

$$\frac{1}{q}\,(\hat{a}_q - a_q^f)'\,\hat{\Omega}_{\hat{a}_q}^{-1}\,(\hat{a}_q - a_q^f) \le F^{\alpha}_{(q,\,n-k-1)}$$

where $F^{\alpha}_{(q,\,n-k-1)}$ is the theoretical value from the Fisher table with q and n-k-1 degrees of freedom at the significance level $\alpha$, and $\hat{\Omega}_{\hat{a}_q}$ is the part of the variance-covariance matrix of estimators $\hat{\Omega}_{\hat{a}} = \hat{\sigma}_{\varepsilon}^2 (X'X)^{-1}$ corresponding to the analyzed coefficients.
Example of coefficients testing
a) Is $a_1$ significantly less than 1?
b) Are the coefficients $a_1$ and $a_2$ simultaneously and significantly different from 1 and -0.5, respectively?

The data (n = 14 observations, k = 3 explanatory variables):

 t    y   x1   x2   x3
 1   17    3   42  115
 2   19    2   40  126
 3   15    4   40  148
 4   21    7   44  139
 5   19    8   39  123
 6   24    9   38  150
 7   26    9   29  126
 8   24    6   30  141
 9   26    6   38  122
10   21    9   35  157
11   24    5   29  155
12   26   10   28  166
13   30   13   32  168
14   26    8   26  174

a) H0: $a_1 = 1$
H1: $a_1 < 1$

The test is one-sided (left tail), and

$$t^*_{\hat{a}_1} = \frac{|\hat{a}_1 - 1|}{\hat{\sigma}_{\hat{a}_1}} = \frac{|0.80 - 1|}{0.298} \approx 0.68 < t^{\alpha = 5\%}_{10\ \text{d.f.}} = 1.81$$

H0 is accepted.
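The arithmetic of test a) can be reproduced with a few lines. This is a minimal sketch that takes the estimate, its standard deviation and the critical value from the slides; the function name is illustrative. With the rounded inputs the ratio comes out near 0.67, close to the 0.68 shown on the slide, and the conclusion is the same.

```python
def t_ratio(a_hat_i, a_fixed, std_error):
    """Absolute t-ratio |a_hat_i - a| / sigma_hat(a_hat_i)."""
    return abs(a_hat_i - a_fixed) / std_error

# Values quoted in the example: a_hat_1 = 0.80 with standard deviation 0.298.
t_star = t_ratio(0.80, 1.0, 0.298)
t_crit = 1.81   # one-sided 5% critical value with 10 degrees of freedom (from the slides)

accept_h0 = t_star < t_crit
print(f"t* = {t_star:.2f}, accept H0: {accept_h0}")
```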
Example of a simultaneous test of
coefficients

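b) H0: $\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 1 \\ -0.5 \end{pmatrix}$

H1: $\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \neq \begin{pmatrix} 1 \\ -0.5 \end{pmatrix}$

q = 2 is the number of tested coefficients.
The estimators vector is:

$$\hat{a}_q = \begin{pmatrix} 0.80 \\ -0.38 \end{pmatrix}$$

The values vector is:

$$a_q^f = \begin{pmatrix} 1 \\ -0.5 \end{pmatrix}$$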
Example of a simultaneous test of
coefficients (cont.)
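The variance-covariance matrix of estimators is:

$$\hat{\Omega}_{\hat{a}} = \hat{\sigma}_{\varepsilon}^2 (X'X)^{-1}, \qquad \hat{\sigma}_{\varepsilon}^2 = (2.597)^2 = 6.745$$

With rows and columns ordered $\hat{a}_0, \hat{a}_1, \hat{a}_2, \hat{a}_3$:

$$\hat{\Omega}_{\hat{a}} =
\begin{pmatrix}
120.872 & -0.00133 & -1.4723 & -0.4795 \\
-0.00133 & 0.08906 & 0.0081 & -0.00634 \\
-1.4723 & 0.00806 & 0.0245 & 0.00388 \\
-0.4795 & -0.00634 & 0.00388 & 0.00271
\end{pmatrix}$$

The block corresponding to $\hat{a}_1$ and $\hat{a}_2$ is:

$$\hat{\Omega}_{\hat{a}_q} =
\begin{pmatrix}
0.08906 & 0.00806 \\
0.00806 & 0.02452
\end{pmatrix}$$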
Example of a simultaneous test of
coefficients (cont.)
We calculate

$$(\hat{a}_q - a_q^f) = \begin{pmatrix} 0.80 - 1 \\ -0.38 - (-0.5) \end{pmatrix} = \begin{pmatrix} 0.80 - 1 \\ -0.38 + 0.5 \end{pmatrix} = \begin{pmatrix} -0.20 \\ 0.12 \end{pmatrix}$$

$$(\hat{a}_q - a_q^f)' = \begin{pmatrix} 0.80 - 1 & -0.38 + 0.5 \end{pmatrix} = \begin{pmatrix} -0.20 & 0.12 \end{pmatrix}$$

$$F^* = \frac{1}{q}\,(\hat{a}_q - a_q^f)'\,\hat{\Omega}_{\hat{a}_q}^{-1}\,(\hat{a}_q - a_q^f) = \frac{1}{2}\begin{pmatrix} -0.20 & 0.12 \end{pmatrix}\begin{pmatrix} 11.572 & -3.802 \\ -3.802 & 42.036 \end{pmatrix}\begin{pmatrix} -0.20 \\ 0.12 \end{pmatrix} \approx 0.612$$

$$F^{\alpha}_{(q,\,n-k-1)} = F^{0.05}_{(2,\,10)\ \text{d.f.}} = 4.10$$

Since 0.612 < 4.10, we accept the H0 hypothesis: the two coefficients can simultaneously take the values 1 and -0.5.
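The quadratic form in the simultaneous test can be evaluated without any library for q = 2. The sketch below uses the rounded estimates and the 2x2 covariance block printed in the slides; the helper names are illustrative. With these rounded inputs the statistic comes out near 0.63 rather than exactly 0.612, presumably because the slide value was computed from unrounded estimates; the decision is unchanged.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def joint_f_statistic(a_hat_q, a_fixed_q, omega_q):
    """F* = (1/q) (a_hat_q - a_f)' Omega_q^{-1} (a_hat_q - a_f), for q = 2."""
    q = len(a_hat_q)
    diff = [est - fix for est, fix in zip(a_hat_q, a_fixed_q)]
    inv = inverse_2x2(omega_q)
    quad = sum(diff[i] * inv[i][j] * diff[j] for i in range(q) for j in range(q))
    return quad / q

a_hat_q = [0.80, -0.38]                   # estimates of a_1 and a_2 from the slides
a_fixed_q = [1.0, -0.5]                   # values under H0
omega_q = [[0.08906, 0.00806],
           [0.00806, 0.02452]]            # covariance block for a_1, a_2 from the slides

f_star = joint_f_statistic(a_hat_q, a_fixed_q, omega_q)
f_crit = 4.10                             # F(2, 10) at the 5% level, from the slides

accept_h0 = f_star <= f_crit
print(f"F* = {f_star:.3f}, accept H0: {accept_h0}")
```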
Omission of an Important Variable or
Inclusion of an Irrelevant Variable
Omission of an Important Variable
Consequence: The estimated coefficients on all the other
variables will be biased and inconsistent unless the excluded
variable is uncorrelated with all the included variables.
Even if this condition is satisfied, the estimate of the
coefficient on the constant term will be biased.
The standard errors will also be biased.

Inclusion of an Irrelevant Variable
Coefficient estimates will still be consistent and unbiased, but
the estimators will be inefficient.
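Both consequences for the coefficient estimates can be seen in a small simulation. The sketch below assumes NumPy; the data-generating model, the coefficient values and the sample size are hypothetical. It illustrates only the bias from omission and the absence of bias from inclusion, not the loss of efficiency.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000                                   # large sample so the bias is clearly visible

x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)         # x2 is correlated with x1
noise = rng.normal(size=n)                 # irrelevant regressor, unrelated to y
y = 1.0 + 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)   # hypothetical true model

def ols(y, *columns):
    """Least-squares coefficients with an intercept, solving (X'X) a = X'y."""
    X = np.column_stack([np.ones(len(y)), *columns])
    return np.linalg.solve(X.T @ X, X.T @ y)

full = ols(y, x1, x2)            # correctly specified model
short = ols(y, x1)               # x2 omitted: the slope on x1 absorbs 2.0 * 0.8
extra = ols(y, x1, x2, noise)    # irrelevant variable included

print("full :", np.round(full, 2))
print("short:", np.round(short, 2))
print("extra:", np.round(extra, 2))
```

In the short regression the slope on `x1` settles near 1.0 + 2.0 * 0.8 = 2.6 instead of 1.0, while adding the irrelevant `noise` column leaves the slopes on `x1` and `x2` close to their true values.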

Testing New Variable in the Model
H0: $RSS_1 = RSS$
H1: $RSS_1 \neq RSS$

where
$RSS_1$ is the residual sum of squares of the new model
$k'$ is the number of explanatory variables in the new model
$RSS$ is the residual sum of squares of the initial model
$k$ is the number of explanatory variables in the initial model

The Fisher value is:

$$F^* = \frac{(RSS - RSS_1)/(k' - k)}{RSS_1/(n - k' - 1)}$$

where the numerator degrees of freedom are $(n - k - 1) - (n - k' - 1) = k' - k$.

If $F^* < F^{\alpha}_{k'-k,\ n-k'-1}$ then we accept H0.
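A minimal sketch of this test follows. The residual sum of squares of the initial model (90.0, with one explanatory variable) is a hypothetical number chosen for illustration; the new model reuses n = 14, k' = 3 and RSS = 67.45 from the earlier example, and 4.10 is the F(2, 10) value at the 5% level used earlier in the slides.

```python
def added_variables_f(rss_initial, rss_new, n, k_initial, k_new):
    """F* = ((RSS - RSS_1) / (k' - k)) / (RSS_1 / (n - k' - 1))."""
    numerator = (rss_initial - rss_new) / (k_new - k_initial)
    denominator = rss_new / (n - k_new - 1)
    return numerator / denominator

n = 14
# rss_initial = 90.0 and k_initial = 1 are hypothetical; rss_new = 67.45 assumes
# the error variance 6.745 with 10 degrees of freedom from the earlier example.
f_star = added_variables_f(rss_initial=90.0, rss_new=67.45, n=n, k_initial=1, k_new=3)
f_crit = 4.10   # F(k' - k, n - k' - 1) = F(2, 10) at the 5% level

accept_h0 = f_star < f_crit
print(f"F* = {f_star:.3f}, accept H0: {accept_h0}")
```

With these inputs the reduction in the residual sum of squares is too small to be significant, so H0 is accepted and the added variables would not be kept.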