
F TEST OF GOODNESS OF FIT

Model: Y = b1 + b2X + u

$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum e_i^2$
$TSS = ESS + RSS$

In an earlier sequence it was demonstrated that the sum of the squared deviations of Y about its sample mean (TSS: total sum of squares) can be decomposed into the sum of the squared deviations of the fitted values about the mean (ESS: explained sum of squares) plus the sum of the squares of the residuals (RSS).
$R^2 = \dfrac{ESS}{TSS} = \dfrac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$

R2, the usual measure of goodness of fit, was then defined to be the ratio of the explained sum of squares to the total sum of squares.
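As a quick numerical illustration of the decomposition and of the definition of R2, the quantities can be computed directly in Python. This is a sketch on made-up data (the values and the hand-rolled OLS fit below are illustrative, not from the text):

```python
# A sketch verifying TSS = ESS + RSS and R^2 = ESS/TSS on made-up data,
# with the OLS coefficients computed by hand.
X = [1, 2, 3, 4, 5, 6]
Y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

b2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / sum(
    (x - xbar) ** 2 for x in X)
b1 = ybar - b2 * xbar

fitted = [b1 + b2 * x for x in X]
TSS = sum((y - ybar) ** 2 for y in Y)
ESS = sum((f - ybar) ** 2 for f in fitted)
RSS = sum((y - f) ** 2 for y, f in zip(Y, fitted))
R2 = ESS / TSS

print(abs(TSS - (ESS + RSS)) < 1e-9)   # True: the decomposition holds
print(0.0 <= R2 <= 1.0)                # True
```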
The null hypothesis that we are going to test is that the model has no explanatory power.
Null hypothesis:        H0: b2 = 0
Alternative hypothesis: H1: b2 ≠ 0

Since X is the only explanatory variable at the moment, the null hypothesis is that Y is not determined by X. Mathematically, we have H0: b2 = 0.
$F(k-1,\,n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{\frac{ESS}{TSS}/(k-1)}{\frac{RSS}{TSS}/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$

Hypotheses concerning goodness of fit are tested via the F statistic, defined as shown. k is the number of parameters in the regression equation, which at present is just 2.
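The two expressions for the F statistic are algebraically identical, which is easy to confirm numerically. A minimal sketch (the sums of squares below are illustrative values, not taken from the text):

```python
# The F statistic computed two ways: from the sums of squares and from R^2.
def f_from_ss(ESS, RSS, k, n):
    return (ESS / (k - 1)) / (RSS / (n - k))

def f_from_r2(R2, k, n):
    return (R2 / (k - 1)) / ((1 - R2) / (n - k))

ESS, RSS, k, n = 30.0, 70.0, 2, 20   # illustrative values
R2 = ESS / (ESS + RSS)
print(abs(f_from_ss(ESS, RSS, k, n) - f_from_r2(R2, k, n)) < 1e-12)  # True
```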
n – k is, as with the t statistic, the number of degrees of freedom (number of observations less the number of parameters estimated). For simple regression analysis, it is n – 2.
The F statistic may alternatively be written in terms of R2. First divide the numerator and denominator by TSS.
We can now rewrite the F statistic as shown. The R2 in the numerator comes straight from the definition of R2.
$\dfrac{RSS}{TSS} = \dfrac{TSS - ESS}{TSS} = 1 - \dfrac{ESS}{TSS} = 1 - R^2$

It is easily demonstrated that RSS/TSS is equal to 1 – R2.
F is a monotonically increasing function of R2. As R2 increases, the numerator increases and the denominator decreases, so for both of these reasons F increases.
[Figure: $F(1,18) = \dfrac{R^2/1}{(1-R^2)/18}$ plotted as a function of R2, for R2 from 0 to 1 and F from 0 to 140.]

Here is F plotted as a function of R2 for the case where there is 1 explanatory variable and 20 observations. Since k = 2, n – k = 18.
If the null hypothesis is true, F will be a random variable drawn from the F distribution with (k – 1) = 1 and (n – k) = 18 degrees of freedom.
There will be some critical value which it will exceed, as a matter of chance, only 5 percent of the time. If we are performing a 5 percent significance test, we will reject H0 if the F statistic is greater than this critical value.
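The claim that, under H0, the F statistic follows the F(1, 18) distribution can be checked by simulation: if we repeatedly generate Y unrelated to X with n = 20 and compute F each time, roughly 5 percent of the computed values should exceed the 5 percent critical value 4.41. A pure-Python sketch (the simulation design is ours, not the text's):

```python
# Monte Carlo check: under H0 (b2 = 0), about 5% of simulated F
# statistics for n = 20 should exceed the 5% critical value 4.41.
import random

random.seed(0)
n, reps, crit = 20, 2000, 4.41
X = list(range(n))
xbar = sum(X) / n
Sxx = sum((x - xbar) ** 2 for x in X)

exceed = 0
for _ in range(reps):
    Y = [random.gauss(0, 1) for _ in range(n)]   # H0: Y unrelated to X
    ybar = sum(Y) / n
    b2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / Sxx
    b1 = ybar - b2 * xbar
    RSS = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
    ESS = sum((b1 + b2 * x - ybar) ** 2 for x in X)
    F = ESS / (RSS / (n - 2))
    if F > crit:
        exceed += 1

print(exceed / reps)   # should be close to 0.05
```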
In the case of an F test, the critical value depends on the number of explanatory variables as well as the number of degrees of freedom. When there is one explanatory variable and 18 degrees of freedom, the critical value of F at the 5 percent significance level is 4.41.
$F_{crit,\,5\%}(1, 18) = 4.41$

For one explanatory variable and 18 degrees of freedom, F = 4.41 when R2 = 0.20.
If R2 is higher than 0.20, F will be higher than 4.41, and we will reject the null hypothesis at the 5 percent level.
$F_{crit,\,1\%}(1, 18) = 8.29$

If we were performing a 1 percent test, with one explanatory variable and 18 degrees of freedom, the critical value of F would be 8.29. F = 8.29 when R2 = 0.32.
If R2 is higher than 0.32, F will be higher than 8.29, and we will reject the null hypothesis at the 1 percent level.
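The R2 thresholds quoted for the two significance levels follow directly from inverting $F = \frac{R^2/1}{(1-R^2)/(n-k)}$, which gives $R^2 = F/((n-k)+F)$. A quick check against the text's critical values (4.41 at 5 percent, 8.29 at 1 percent, with 18 degrees of freedom):

```python
# Solve F = (R2/1) / ((1 - R2)/(n - k)) for R2:  R2 = F / ((n - k) + F).
def r2_at_f(F_crit, n_minus_k):
    return F_crit / (n_minus_k + F_crit)

print(round(r2_at_f(4.41, 18), 2))   # R2 threshold at the 5% level
print(round(r2_at_f(8.29, 18), 2))   # R2 threshold at the 1% level
```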
Why do we perform the test indirectly, through F, instead of directly through R2? After all, it would be easy to compute the critical values of R2 from those for F.

The reason is that an F test can be used for several tests of analysis of variance. Rather than have a specialized table for each test, it is more convenient to have just one.
Note that, for simple regression analysis, the null and alternative hypotheses are mathematically exactly the same as for a two-tailed t test. Could the F test come to a different conclusion from the t test?
The answer, of course, is no. We will demonstrate that, for simple regression analysis, the F statistic is the square of the t statistic.
Demonstration that F = t²:

$F(1,\,n-2) = \dfrac{ESS}{RSS/(n-2)} = \dfrac{\sum(\hat{Y}_i - \bar{Y})^2}{\sum e_i^2/(n-2)} = \dfrac{\sum([b_1 + b_2 X_i] - [b_1 + b_2\bar{X}])^2}{s_u^2}$

$= \dfrac{b_2^2 \sum(X_i - \bar{X})^2}{s_u^2} = \dfrac{b_2^2}{s_u^2 / \sum(X_i - \bar{X})^2} = \dfrac{b_2^2}{\text{s.e.}(b_2)^2} = t^2$

We start by replacing ESS and RSS by their mathematical expressions.
The denominator is $s_u^2$, the estimator of $\sigma_u^2$ for the simple regression model. We expand the numerator using the expression for the fitted relationship.
The b1 terms in the numerator cancel. The rest of the numerator can be grouped as shown.
We take the $b_2^2$ term out of the summation as a factor.
We move the term involving X to the denominator.
The denominator is the square of the standard error of b2.
Hence we obtain $b_2^2$ divided by the square of the standard error of b2. This is the t statistic, squared.
It can also be shown that the critical value of F, at any significance level, is equal to the square of the critical value of t. We will not attempt to prove this.
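The algebra above can be confirmed numerically with a hand-computed OLS fit. A sketch on made-up data (the point is only that F and t² coincide to numerical precision):

```python
# A numerical check that F = t^2 in simple regression, using a
# hand-computed OLS fit on made-up data.
import math

X = [1, 2, 3, 4, 5, 6, 7, 8]
Y = [1.2, 2.1, 2.8, 4.1, 4.9, 6.2, 6.8, 8.1]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
Sxx = sum((x - xbar) ** 2 for x in X)
b2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / Sxx
b1 = ybar - b2 * xbar

RSS = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
ESS = sum((b1 + b2 * x - ybar) ** 2 for x in X)

s_u2 = RSS / (n - 2)             # estimate of the disturbance variance
se_b2 = math.sqrt(s_u2 / Sxx)    # standard error of b2
t = b2 / se_b2
F = ESS / (RSS / (n - 2))

print(abs(F - t ** 2) < 1e-6)    # True: the identity holds numerically
```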
. reg EARNINGS S

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 1, 538) = 112.15
Model | 19321.5589 1 19321.5589 Prob > F = 0.0000
Residual | 92688.6722 538 172.283777 R-squared = 0.1725
-------------+------------------------------ Adj R-squared = 0.1710
Total | 112010.231 539 207.811189 Root MSE = 13.126

------------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
S | 2.455321 .2318512 10.59 0.000 1.999876 2.910765
_cons | -13.93347 3.219851 -4.33 0.000 -20.25849 -7.608444
------------------------------------------------------------------------------

Here is the output for the regression of hourly earnings on years of schooling for the
sample of 540 respondents from the National Longitudinal Survey of Youth.

$F(1,\,n-2) = \dfrac{ESS}{RSS/(n-2)} = \dfrac{19322}{92689/538} = \dfrac{19322}{172.28} = 112.15$

We shall check that the F statistic has been calculated correctly. The explained sum of squares (described in Stata as the model sum of squares) is 19322.
The residual sum of squares is 92689.
The number of degrees of freedom is 540 – 2 = 538.
The denominator of the expression for F is therefore 172.28. Note that this is $s_u^2$, an estimate of $\sigma_u^2$. Its square root, denoted in Stata by Root MSE, is an estimate of the standard deviation of u.
Our calculation of F agrees with that in the Stata output.
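The same arithmetic can be reproduced in Python from the sums of squares shown in the Stata output:

```python
# F from the sums of squares in the output: Model SS = 19321.5589,
# Residual SS = 92688.6722, n = 540.
ESS, RSS, n = 19321.5589, 92688.6722, 540
s_u2 = RSS / (n - 2)     # the Residual MS in the output
F = ESS / s_u2
print(round(s_u2, 2))    # 172.28
print(round(F, 2))       # 112.15, matching the reported F(1, 538)
```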
$F(1,\,n-2) = \dfrac{R^2}{(1-R^2)/(n-2)} = \dfrac{0.1725}{(1-0.1725)/538} = 112.15$

We will also check the F statistic using the expression for it in terms of R2. We see again that it agrees.
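The R2-based form of the statistic gives the same value from the R-squared reported in the output:

```python
# F computed from R-squared alone (R-squared = 0.1725, n = 540, k = 2).
R2, n, k = 0.1725, 540, 2
F = (R2 / (k - 1)) / ((1 - R2) / (n - k))
print(round(F, 2))   # 112.15
```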
We will also check the relationship between the F statistic and the t statistic for the slope coefficient.
$112.15 \approx 10.59^2$

Obviously, this is correct as well.
$F_{crit,\,0.1\%}(1, 500) = 10.96 \qquad t_{crit,\,0.1\%}(500) = 3.31 \qquad 10.96 \approx 3.31^2$

And the critical value of F is the square of the critical value of t. (We are using the values for 500 degrees of freedom because those for 538 do not appear in the table.)
The relationship is shown for the 0.1% significance level, but obviously it is also true for any other significance level.
F TEST OF GOODNESS OF FIT FOR THE WHOLE EQUATION

$Y = b_1 + b_2 X_2 + \dots + b_k X_k + u$

$H_0:\ b_2 = \dots = b_k = 0$
$H_1:$ at least one $b \neq 0$
• This sequence describes two F tests of goodness of fit in a multiple
regression model. The first relates to the goodness of fit of the equation
as a whole.
• We will consider the general case where there are k – 1 explanatory
variables. For the F test of goodness of fit of the equation as a whole, the
null hypothesis, in words, is that the model has no explanatory power at
all.
• Of course we hope to reject it and conclude that the model does have
some explanatory power.
• The model will have no explanatory power if it turns out that Y is
unrelated to any of the explanatory variables. Mathematically, therefore,
the null hypothesis is that all the coefficients b2, ..., bk are zero.
• The alternative hypothesis is that at least one of these b coefficients is
different from zero.
$F(k-1,\,n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)} = \dfrac{\frac{ESS}{TSS}/(k-1)}{\frac{RSS}{TSS}/(n-k)} = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$
• In the multiple regression model there is a difference between the roles of the F and t
tests. The F test tests the joint explanatory power of the variables, while the t tests test
their explanatory power individually.
• It can be expressed in terms of R2 by dividing the numerator and denominator by TSS,
the total sum of squares.
ESS / TSS is the definition of R2. RSS / TSS is equal to (1 – R2).
$S = b_1 + b_2 ASVABC + b_3 SM + b_4 SF + u$

The educational attainment model will be used as an example. We will suppose that S depends on ASVABC, the ability score, and on SM and SF, the highest grade completed by the mother and the father of the respondent, respectively.
$H_0:\ b_2 = b_3 = b_4 = 0, \qquad H_1:$ at least one $b \neq 0$

. reg S ASVABC SM SF
Source | SS df MS Number of obs = 540
-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943
------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

$F(k-1,\,n-k) = \dfrac{ESS/(k-1)}{RSS/(n-k)}: \qquad F(3, 536) = \dfrac{1181/3}{2024/536} \approx 104.3$

In this example, k – 1, the number of explanatory variables, is equal to 3 and n – k, the number of degrees of freedom, is equal to 536.
The numerator of the F statistic is the explained sum of squares divided by k – 1. In the Stata output these numbers are given in the Model row.
The denominator is the residual sum of squares divided by the number of degrees of freedom remaining.
Hence the F statistic is 104.3. All serious regression packages compute it for you as part of the diagnostics in the regression output.
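The arithmetic can be reproduced from the Model and Residual rows of the Stata output:

```python
# Whole-equation F from the output for S on ASVABC, SM, SF:
# Model SS = 1181.36981, Residual SS = 2023.61353, k - 1 = 3, n - k = 536.
ESS, RSS = 1181.36981, 2023.61353
k_1, n_k = 3, 536
F = (ESS / k_1) / (RSS / n_k)
print(round(F, 2))   # 104.3, matching the reported F(3, 536)
```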
$F_{crit,\,0.1\%}(3, 500) = 5.51 \qquad F(3, 536) = \dfrac{1181/3}{2024/536} \approx 104.3$

The critical value for F(3,536) is not given in the F tables, but we know it must be lower than that for F(3,500), which is given. At the 0.1% level, this is 5.51. Hence we easily reject H0 at the 0.1% level.
• This result could have been anticipated because both ASVABC and SF
have highly significant t statistics. So we knew in advance that both b2
and b4 were non-zero.

• It is unusual for the F statistic not to be significant if some of the t
  statistics are significant. In principle it could happen, though. Suppose
  that you ran a regression with 40 explanatory variables, none being a true
  determinant of the dependent variable. Then, purely by chance, one or two
  t statistics might appear significant even though the F statistic for the
  equation as a whole was not.

• The opposite can easily happen, though. Suppose you have a multiple
regression model which is correctly specified and the R2 is high. You
would expect to have a highly significant F statistic.

• However, if the explanatory variables are highly correlated and the model
is subject to severe multicollinearity, the standard errors of the slope
coefficients could all be so large that none of the t statistics is
significant.

• In this situation you would know that your model is a good one, but you
are not in a position to pinpoint the contributions made by the
explanatory variables individually.
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

$Y = b_1 + b_2 X_2 + u$
$Y = b_1 + b_2 X_2 + b_3 X_3 + b_4 X_4 + u$

We now come to more general F tests of goodness of fit. This is a test of the joint explanatory power of a group of variables when they are added to a regression model.

For example, in the original specification, Y may be written as a simple function of X2. In the second, we add X3 and X4.
$Y = b_1 + b_2 X_2 + u$  (residual sum of squares: $RSS_1$)
$Y = b_1 + b_2 X_2 + b_3 X_3 + b_4 X_4 + u$  (residual sum of squares: $RSS_2$)

$H_0:\ b_3 = b_4 = 0$
$H_1:\ b_3 \neq 0$ or $b_4 \neq 0$ or both

• The null hypothesis is that neither X3 nor X4 belongs in the model. The
  alternative hypothesis is that at least one of them does, perhaps both.

• When new variables are added to the model, RSS cannot rise. In general,
  it will fall. If the new variables are irrelevant, it will fall only by a random
  amount. The test evaluates whether the fall in RSS is greater than would
  be expected on a pure chance basis.
$F(\text{cost in d.f.},\ \text{d.f. remaining}) = \dfrac{\text{reduction in RSS}\ /\ \text{cost in d.f.}}{\text{RSS remaining}\ /\ \text{d.f. remaining}}$
• The appropriate test is an F test. For this test, and for several others which we will
encounter, it is useful to think of the F statistic as having the structure indicated above.
• The ‘reduction in RSS’ is the reduction when the change is made, in this case, when the
group of new variables is added.
• The ‘cost in d.f.’ is the reduction in the number of degrees of freedom remaining after
making the change. In the present case it is equal to the number of new variables added,
because that number of new parameters are estimated.
• The ‘RSS remaining’ is the residual sum of squares after making the change.
• The ‘degrees of freedom remaining’ is the number of degrees of freedom remaining after
making the change.
5
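The structure described above translates directly into code. The following is a minimal sketch; the function name and the toy numbers are illustrative, not taken from the slides.

```python
def f_statistic(rss_restricted, rss_unrestricted, cost_df, df_remaining):
    """F(cost in d.f., d.f. remaining) for comparing two nested models."""
    reduction = rss_restricted - rss_unrestricted   # improvement in fit
    return (reduction / cost_df) / (rss_unrestricted / df_remaining)

# Toy example: adding 2 variables cuts RSS from 120 to 100, with 50 d.f. left.
print(f_statistic(120.0, 100.0, 2, 50))  # (20/2)/(100/50) = 5.0
```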
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 1, 538) = 274.19
Model | 1081.97059 1 1081.97059 Prob > F = 0.0000
Residual | 2123.01275 538 3.94612035 R-squared = 0.3376
-------------+------------------------------ Adj R-squared = 0.3364
Total | 3204.98333 539 5.94616574 Root MSE = 1.9865

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .148084 .0089431 16.56 0.000 .1305165 .1656516
_cons | 6.066225 .4672261 12.98 0.000 5.148413 6.984036
------------------------------------------------------------------------------

We will illustrate the test with an educational attainment example. Here is S regressed on
ASVABC using Data Set 21. We make a note of the residual sum of squares.

11
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

Now we have added the highest grade completed by each parent. Does parental education
have a significant impact? Well, we can see that a t test would show that SF has a highly
significant coefficient, but we will perform the F test anyway. We make a note of RSS.
12
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b3  b4  0
H 1 : b 3  0 or b 4  0 or both b3 and b4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  2 2123.0  2023.6 / 2


F 2,540  4     13.16
RSS 2 540  4  2023.6 / 536

The improvement in the fit on adding the parental variables is the reduction in the residual
sum of squares.

13
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b3  b4  0
H 1 : b 3  0 or b 4  0 or both b3 and b4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  2 2123.0  2023.6 / 2


F 2,540  4     13.16
RSS 2 540  4  2023.6 / 536

The cost is 2 degrees of freedom because 2 additional parameters have been estimated.

14
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b3  b4  0
H 1 : b 3  0 or b 4  0 or both b3 and b4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  2 2123.0  2023.6 / 2


F 2,540  4     13.16
RSS 2 540  4  2023.6 / 536

The ‘RSS remaining’ is the residual sum of squares after adding SM and SF.

15
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b3  b4  0
H 1 : b 3  0 or b 4  0 or both b3 and b4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  2 2123.0  2023.6 / 2


F 2,540  4     13.16
RSS 2 540  4  2023.6 / 536

The number of degrees of freedom remaining is n – k, that is, 540 – 4 = 536.

16
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b3  b4  0
H 1 : b 3  0 or b 4  0 or both b3 and b4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  2 2123.0  2023.6 / 2


F 2,540  4     13.16
RSS 2 540  4  2023.6 / 536

The F statistic is 13.16.

17
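The arithmetic can be checked directly from the residual sums of squares printed in the two Stata outputs; nothing beyond those numbers is assumed here.

```python
rss1 = 2123.01275        # residual SS from the regression of S on ASVABC alone
rss2 = 2023.61353        # residual SS after adding SM and SF
cost = 2                 # two additional parameters estimated
df_remaining = 540 - 4   # n - k in the larger model

f = ((rss1 - rss2) / cost) / (rss2 / df_remaining)
print(f"{f:.2f}")  # 13.16
```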
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b3  b4  0
H 1 : b 3  0 or b 4  0 or both b3 and b4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  2 2123.0  2023.6 / 2


F 2,540  4     13.16
RSS 2 540  4  2023.6 / 536

Fcrit, 0.1% 2,500   7.00

The critical value of F(2,500) at the 0.1% level is 7.00. The critical value of F(2,536) must be
lower, so we reject H0 and conclude that the parental education variables do have
significant joint explanatory power.
18
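Rather than interpolating in a printed table, the exact critical value for F(2, 536) can be computed directly, assuming SciPy is available:

```python
from scipy.stats import f

# 0.1% critical value of F with (2, 536) degrees of freedom
crit = f.ppf(0.999, 2, 536)
print(crit)          # slightly below the tabulated F(2,500) value of 7.00
print(13.16 > crit)  # True: H0 is rejected at the 0.1% level
```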
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

• This sequence will conclude by showing that t tests are equivalent to
  marginal F tests when the additional group of variables consists of just
  one variable.

• Suppose that in the original model Y is a function of X2 and X3, and that in
the revised model X4 is added.
19
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b4  0
H1 : b 4  0

The null hypothesis for the F test of the explanatory power of the additional ‘group’ is that
all the new slope coefficients are equal to zero. There is of course only one new slope
coefficient, b4.
21
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b4  0
H1 : b 4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

The F test has the usual structure. We will illustrate it with an educational attainment model
where S depends on ASVABC and SM in the original model and on SF as well in the revised
model.
22
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 2, 537) = 147.36
Model | 1135.67473 2 567.837363 Prob > F = 0.0000
Residual | 2069.30861 537 3.85346109 R-squared = 0.3543
-------------+------------------------------ Adj R-squared = 0.3519
Total | 3204.98333 539 5.94616574 Root MSE = 1.963

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1328069 .0097389 13.64 0.000 .1136758 .151938
SM | .1235071 .0330837 3.73 0.000 .0585178 .1884963
_cons | 5.420733 .4930224 10.99 0.000 4.452244 6.389222
------------------------------------------------------------------------------

Here is the regression of S on ASVABC and SM. We make a note of the residual sum of
squares.

23
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

Now we add SF and again make a note of the residual sum of squares.

24
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b4  0
H1 : b 4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  1 2069.3  2023.6 / 1


F 1,540  4     12.10
RSS 2 540  4  2023.6 / 536

The reduction in the residual sum of squares is the reduction on adding SF.

25
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b4  0
H1 : b 4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  1 2069.3  2023.6 / 1


F 1,540  4     12.10
RSS 2 540  4  2023.6 / 536

The cost is just the single degree of freedom lost when estimating b4.

26
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b4  0
H1 : b 4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  1 2069.3  2023.6 / 1


F 1,540  4     12.10
RSS 2 540  4  2023.6 / 536

The RSS remaining is the residual sum of squares after adding SF.

27
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b4  0
H1 : b 4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  1 2069.3  2023.6 / 1


F 1,540  4     12.10
RSS 2 540  4  2023.6 / 536

The number of degrees of freedom remaining after adding SF is 540 – 4 = 536.

28
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b4  0
H1 : b 4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  1 2069.3  2023.6 / 1


F 1,540  4     12.10
RSS 2 540  4  2023.6 / 536

Hence the F statistic is 12.10.

29
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b4  0
H1 : b 4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  1 2069.3  2023.6 / 1


F 1,540  4     12.10
RSS 2 540  4  2023.6 / 536

Fcrit, 0.1% 1,500   10.96

The critical value of F at the 0.1% significance level with 500 degrees of freedom is 10.96.
The critical value with 536 degrees of freedom must be lower, so we reject H0 at the 0.1%
level.
30
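As a check, this sketch recomputes the F statistic from the printed residual sums of squares and compares it with the exact critical value, assuming SciPy is available:

```python
from scipy.stats import f

rss1 = 2069.30861   # residual SS with ASVABC and SM only
rss2 = 2023.61353   # residual SS after adding SF
f_stat = ((rss1 - rss2) / 1) / (rss2 / 536)
crit = f.ppf(0.999, 1, 536)   # exact 0.1% critical value for F(1, 536)

print(f"{f_stat:.2f}")   # 12.10
print(f_stat > crit)     # True: reject H0 at the 0.1% level
```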
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

Y  b1  b 2 X 2  b 3 X 3  u RSS1
Y  b1  b 2 X 2  b 3 X 3  b 4 X 4  u RSS 2

H0 : b4  0
H1 : b 4  0

reduction in RSS cost in d.f.


F (cost in d.f., d.f. remaining) =
RSS remaining degrees of freedom
remaining

 RSS1  RSS 2  1 2069.3  2023.6 / 1


F 1,540  4     12.10
RSS 2 540  4  2023.6 / 536

Fcrit, 0.1% 1,500   10.96

The null hypothesis we are testing is exactly the same as for a two-sided t test on the
coefficient of SF.

31
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

F(1,536) = ((2069.3 - 2023.6)/1) / (2023.6/536) = 12.10        Fcrit, 0.1% = 10.96

We will perform the t test. The t statistic is 3.48.

32
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

F(1,536) = ((2069.3 - 2023.6)/1) / (2023.6/536) = 12.10        Fcrit, 0.1% = 10.96
tcrit, 0.1% = 3.31
The critical value of t at the 0.1% level with 500 degrees of freedom is 3.31. The critical
value with 536 degrees of freedom must be lower. So we reject H0 again.

33
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

F(1,536) = ((2069.3 - 2023.6)/1) / (2023.6/536) = 12.10        Fcrit, 0.1% = 10.96
3.48² = 12.11        tcrit, 0.1% = 3.31
It can be shown that the F statistic for the F test of the explanatory power of a ‘group’ of one
variable must be equal to the square of the t statistic for that variable. (The difference in the
last digit is due to rounding error.)
34
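The equivalence can be verified numerically. This sketch uses synthetic data, with all names and coefficients made up for illustration, but the identity F = t² holds for any data set:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])  # const, X2, X3, X4
y = X @ np.array([1.0, 0.5, -0.3, 0.2]) + rng.normal(size=n)

def rss(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return e @ e

rss2 = rss(X, y)          # unrestricted model
rss1 = rss(X[:, :3], y)   # restricted model: X4 dropped
F = (rss1 - rss2) / (rss2 / (n - 4))

# t statistic for the coefficient of X4 in the unrestricted model
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
s2 = rss2 / (n - 4)
se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[3, 3])
t = beta[3] / se

print(np.isclose(F, t**2))  # True
```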
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

F(1,536) = ((2069.3 - 2023.6)/1) / (2023.6/536) = 12.10        Fcrit, 0.1% = 10.96
3.48² = 12.11        tcrit, 0.1% = 3.31        3.31² = 10.96
It can also be shown that the critical value of F must be equal to the square of the critical
value of t. (The critical values shown are for 500 degrees of freedom, but this must also be
true for 536 degrees of freedom.)
35
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

F(1,536) = ((2069.3 - 2023.6)/1) / (2023.6/536) = 12.10        Fcrit, 0.1% = 10.96
3.48² = 12.11        tcrit, 0.1% = 3.31        3.31² = 10.96
Hence the conclusions of the two tests must coincide.

36
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

F(1,536) = ((2069.3 - 2023.6)/1) / (2023.6/536) = 12.10        Fcrit, 0.1% = 10.96
3.48² = 12.11        tcrit, 0.1% = 3.31        3.31² = 10.96
This result means that the t test of the coefficient of a variable is a test of its marginal
explanatory power, after all the other variables have been included in the equation.

37
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

F(1,536) = ((2069.3 - 2023.6)/1) / (2023.6/536) = 12.10        Fcrit, 0.1% = 10.96
3.48² = 12.11        tcrit, 0.1% = 3.31        3.31² = 10.96
If the variable is correlated with one or more of the other variables, its marginal explanatory
power may be quite low, even if it genuinely belongs in the model.

38
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

F(1,536) = ((2069.3 - 2023.6)/1) / (2023.6/536) = 12.10        Fcrit, 0.1% = 10.96
3.48² = 12.11        tcrit, 0.1% = 3.31        3.31² = 10.96
If all the variables are correlated, it is possible for all of them to have low marginal
explanatory power and for none of the t tests to be significant, even though the F test for
their joint explanatory power is highly significant.
39
F TESTS RELATING TO GROUPS OF EXPLANATORY VARIABLES

. reg S ASVABC SM SF

Source | SS df MS Number of obs = 540


-------------+------------------------------ F( 3, 536) = 104.30
Model | 1181.36981 3 393.789935 Prob > F = 0.0000
Residual | 2023.61353 536 3.77539837 R-squared = 0.3686
-------------+------------------------------ Adj R-squared = 0.3651
Total | 3204.98333 539 5.94616574 Root MSE = 1.943

------------------------------------------------------------------------------
S | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
ASVABC | .1257087 .0098533 12.76 0.000 .1063528 .1450646
SM | .0492424 .0390901 1.26 0.208 -.027546 .1260309
SF | .1076825 .0309522 3.48 0.001 .04688 .1684851
_cons | 5.370631 .4882155 11.00 0.000 4.41158 6.329681
------------------------------------------------------------------------------

F(1,536) = ((2069.3 - 2023.6)/1) / (2023.6/536) = 12.10        Fcrit, 0.1% = 10.96
3.48² = 12.11        tcrit, 0.1% = 3.31        3.31² = 10.96
If this is the case, the model is said to be suffering from the problem of multicollinearity.

40
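The situation described in the last few slides can be reproduced on synthetic data. In this sketch (all data made up for illustration), two nearly identical regressors have strong joint explanatory power, yet each slope coefficient is estimated very imprecisely:

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)   # almost perfectly correlated with x1
y = x1 + x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
rss2 = np.sum((y - X @ beta) ** 2)    # unrestricted RSS
rss1 = np.sum((y - y.mean()) ** 2)    # restricted RSS: intercept only

F = ((rss1 - rss2) / 2) / (rss2 / (n - 3))
p_joint = f.sf(F, 2, n - 3)

se = np.sqrt((rss2 / (n - 3)) * np.diag(np.linalg.inv(X.T @ X)))
print("joint F p-value:", p_joint)                # effectively zero
print("se of the slope coefficients:", se[1], se[2])
```

The joint test rejects decisively, while the standard errors of the two slope coefficients are large relative to their true values of 1, so the individual t statistics will typically be insignificant.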
