
GENERAL LINEAR REGRESSION MODEL

INTRODUCTION

The general linear regression model is a statistical model that describes a data generation
process. The general linear regression model is a generalization of the classical linear regression
model. You can obtain the general linear regression from the classical linear regression by
changing one assumption: assume that the disturbances are nonspherical rather than spherical.
Because of this, the general linear regression model can be used to describe data generation
processes characterized by heteroscedasticity and autocorrelation.

SPECIFICATION

The specification of the general linear regression model is defined by the following set of
assumptions.

Assumptions

1. The functional form is linear in parameters.
$$y = X\beta + \varepsilon$$
2. The error term has mean zero.
$$E(\varepsilon) = 0$$
3. The errors are nonspherical.
$$\mathrm{Cov}(\varepsilon) = E(\varepsilon\varepsilon^T) = W$$
where $W$ is any nonsingular $T \times T$ variance-covariance matrix of disturbances.
4. The error term has a normal distribution.
$$\varepsilon \sim N$$
5. The error term is uncorrelated with each independent variable.
$$\mathrm{Cov}(\varepsilon, X) = 0$$

Sources of Nonspherical Errors

There are 2 major sources of nonspherical errors.

1. The error term does not have constant variance.

This is called heteroscedasticity. In this case, the disturbances are drawn from probability
distributions that have different variances. This often occurs when using cross-section data.
When the error term has nonconstant variance, the variance-covariance matrix of disturbances
is not given by a constant times the identity matrix (i.e., $W \neq \sigma^2 I$). This is because the elements
on the principal diagonal of $W$, which are the variances of the distributions from which the
disturbances are drawn, are not a constant given by $\sigma^2$ but have different values.

2. The errors are correlated.


This is called autocorrelation or serial correlation. In this case, the disturbances are correlated
with one another. This often occurs when using time-series data. When the disturbances are
correlated, the variance-covariance matrix of disturbances is not given by a constant times the
identity matrix (i.e., $W \neq \sigma^2 I$). This is because the elements off the principal diagonal of $W$,
which are the covariances of the disturbances, are non-zero numbers.

Classical Linear Regression Model as a Special Case of the General Linear Regression Model

If the error term has constant variance and the errors are uncorrelated, then $W = \sigma^2 I$ and the
general linear regression model reduces to the classical linear regression model.
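
As a rough illustration of the three cases above, the following NumPy sketch builds a spherical, a heteroscedastic, and an autocorrelated $W$. The function names are hypothetical, and the autocorrelated case assumes a stationary first-order autoregressive pattern, which is only one of many possible forms.

```python
import numpy as np

def spherical_cov(T, sigma2):
    """Classical case: W = sigma^2 * I."""
    return sigma2 * np.eye(T)

def heteroscedastic_cov(variances):
    """Heteroscedasticity only: unequal variances on the diagonal, zeros elsewhere."""
    return np.diag(np.asarray(variances, dtype=float))

def ar1_cov(T, rho, sigma2):
    """Autocorrelation (assumed stationary AR(1) pattern): W[i, j] = sigma2 * rho**|i - j|."""
    idx = np.arange(T)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

W_sph = spherical_cov(3, sigma2=2.0)
W_het = heteroscedastic_cov([1.0, 4.0, 9.0])   # diagonal entries differ
W_ar1 = ar1_cov(3, rho=0.5, sigma2=2.0)        # off-diagonal entries are non-zero
```

Only `W_sph` is a constant times the identity matrix; the other two are nonspherical.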

General Linear Regression Model Concisely Stated in Matrix Format

The sample of $T$ multivariate observations $(Y_t, X_{t1}, X_{t2}, \ldots, X_{tk})$ is generated by a process
described as follows.

$$y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, W)$$
or alternatively
$$y \sim N(X\beta, W)$$
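
To make the data generation process concrete, here is a minimal simulation sketch. It assumes $W$ is positive definite so that a Cholesky factor exists; the function name, the parameter values, and the choice of a variance that grows with the regressor are all hypothetical.

```python
import numpy as np

def simulate_glrm(X, beta, W, rng):
    """Draw one sample y = X beta + eps with eps ~ N(0, W)."""
    L = np.linalg.cholesky(W)                   # W = L L^T (needs W positive definite)
    eps = L @ rng.standard_normal(X.shape[0])   # Cov(eps) = L I L^T = W
    return X @ beta + eps

rng = np.random.default_rng(0)
T = 200
X = np.column_stack([np.ones(T), rng.uniform(1.0, 5.0, size=T)])
beta = np.array([1.0, 2.0])
W = np.diag(X[:, 1] ** 2)    # illustrative heteroscedasticity: variance grows with X
y = simulate_glrm(X, beta, W, rng)
```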

ESTIMATION

Choosing an Estimator

To obtain estimates of the parameters of the model, you need to choose an estimator. We will
consider the following 3 estimators:
1. Ordinary least squares (OLS) estimator
2. Generalized least squares (GLS) estimator
3. Feasible generalized least squares (FGLS) estimator

Ordinary Least Squares (OLS) Estimator

To obtain estimates of the parameters of the general linear regression model, you can apply the
OLS estimator to the sample data. The OLS estimator is given by the rule:

$$\hat{\beta} = (X^T X)^{-1} X^T y$$

The variance-covariance matrix of estimates conventionally computed for the OLS estimator is

$$\mathrm{Cov}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$$

Properties of the OLS Estimator


If the sample data are generated by the general linear regression model, then the OLS estimator
has the following properties.

1. The OLS estimator is unbiased.
2. The OLS estimator is inefficient.
3. The OLS estimator is not the maximum likelihood estimator.
4. The variance-covariance matrix of estimates is incorrect, and therefore the
estimates of the standard errors are biased and inconsistent.
5. Hypothesis tests are not valid.

Property 2 means that in the class of linear unbiased estimators, the OLS estimator does not
have minimum variance. Thus, an alternative estimator exists that will yield more precise
estimates.
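
The sketch below illustrates why the conventional covariance formula is incorrect when errors are nonspherical. It compares $\sigma^2 (X^T X)^{-1}$ with the covariance of the OLS estimator implied by $\mathrm{Cov}(\varepsilon) = W$, namely $(X^T X)^{-1} X^T W X (X^T X)^{-1}$, a standard result that is not stated in the text above. The function names and the small data set are hypothetical.

```python
import numpy as np

def ols(X, y):
    """OLS rule (X'X)^{-1} X'y, computed with a linear solve."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def cov_conventional(X, sigma2):
    """Conventional formula sigma^2 (X'X)^{-1}; correct only if W = sigma^2 I."""
    return sigma2 * np.linalg.inv(X.T @ X)

def cov_actual(X, W):
    """Covariance of the OLS estimator when Cov(eps) = W."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ X.T @ W @ X @ XtX_inv

x = np.arange(1.0, 6.0)                     # regressor values 1, 2, 3, 4, 5
X = np.column_stack([np.ones(5), x])
W = np.diag(x ** 2)                         # illustrative heteroscedastic W
sigma2 = np.trace(W) / 5                    # average error variance, 11.0

conv = cov_conventional(X, sigma2)
actual = cov_actual(X, W)
# Slope variance: about 1.10 under the conventional formula, about 1.24 in fact.
print(conv[1, 1], actual[1, 1])
```

In this example the conventional formula understates the variance of the slope estimate, which is the sense in which the reported standard errors are biased.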

Generalized Least Squares (GLS) Estimator

The GLS estimator is given by the rule:

$$\hat{\beta}_{GLS} = (X^T W^{-1} X)^{-1} X^T W^{-1} y$$

The variance-covariance matrix of estimates for the GLS estimator is

$$\mathrm{Cov}(\hat{\beta}_{GLS}) = (X^T W^{-1} X)^{-1}$$

Properties of the GLS Estimator

If the sample data are generated by the general linear regression model, then the GLS estimator
has the following properties.

1. The GLS estimator is unbiased.
2. The GLS estimator is efficient.
3. The GLS estimator is the maximum likelihood estimator.
4. The variance-covariance matrix of estimates is correct, and therefore the
estimates of the standard errors are unbiased and consistent.
5. Hypothesis tests are valid.

If the sample data are generated by the general linear regression model, then the GLS estimator
is the best linear unbiased estimator (BLUE) of the population parameters. The GLS estimator
is more precise than the OLS estimator because the OLS estimator wastes information. That is,
the OLS estimator does not use the information contained in $W$ about heteroscedasticity
and/or autocorrelation, while the GLS estimator does.
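
A minimal sketch of the GLS rule, assuming $W$ is known, is given below. The function name `gls` and the five-observation data set are hypothetical; the comparison uses the covariance of the OLS estimator implied by $\mathrm{Cov}(\varepsilon) = W$.

```python
import numpy as np

def gls(X, y, W):
    """GLS rule (X' W^{-1} X)^{-1} X' W^{-1} y and its covariance matrix."""
    W_inv_X = np.linalg.solve(W, X)          # W^{-1} X without forming W^{-1}
    W_inv_y = np.linalg.solve(W, y)          # W^{-1} y
    cov = np.linalg.inv(X.T @ W_inv_X)       # (X' W^{-1} X)^{-1}
    beta_hat = cov @ (X.T @ W_inv_y)
    return beta_hat, cov

x = np.arange(1.0, 6.0)
X = np.column_stack([np.ones(5), x])
W = np.diag(x ** 2)                          # illustrative heteroscedastic W
y = X @ np.array([1.0, 2.0])                 # noiseless data, so GLS recovers (1, 2)

beta_gls, cov_gls = gls(X, y, W)

# Covariance of the OLS estimator under the same W, for comparison.
XtX_inv = np.linalg.inv(X.T @ X)
cov_ols = XtX_inv @ X.T @ W @ X @ XtX_inv
```

In this example the diagonal of `cov_gls` is smaller than the diagonal of `cov_ols`, which is what "more precise" means here.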

Major Shortcoming of the GLS Estimator


To actually use the GLS estimator, you must know the elements of the variance-covariance
matrix of disturbances, $W$; that is, you must know the true values of the variances and
covariances of the disturbances. However, because you never know the true elements of $W$
in practice, you cannot actually use the GLS estimator, and it is therefore not a feasible
estimator.

Feasible Generalized Least Squares (FGLS) Estimator

To make the GLS estimator a feasible estimator, you can use the sample of data to obtain an
estimate of $W$. When you replace the true $W$ with its estimate $\hat{W}$, you get the FGLS estimator. The
FGLS estimator is given by the rule:

$$\hat{\beta}_{FGLS} = (X^T \hat{W}^{-1} X)^{-1} X^T \hat{W}^{-1} y$$

The variance-covariance matrix of estimates for the FGLS estimator is

$$\mathrm{Cov}(\hat{\beta}_{FGLS}) = (X^T \hat{W}^{-1} X)^{-1}$$

FGLS Estimator as a Weighted Least Squares Estimator

The FGLS estimator is also a weighted least squares estimator. The weighted least squares
estimator is derived as follows. Find a $T \times T$ transformation matrix $P$ such that $\varepsilon^* = P\varepsilon$, where $\varepsilon^*$
has variance-covariance matrix $\mathrm{Cov}(\varepsilon^*) = E(\varepsilon^* \varepsilon^{*T}) = \sigma^2 I$. This transforms the original
nonspherical error term into a new error term that is spherical. Use the matrix $P$ to derive a
transformed model.

$$Py = PX\beta + P\varepsilon$$

or

$$y^* = X^*\beta + \varepsilon^*$$

where $y^* = Py$, $X^* = PX$, and $\varepsilon^* = P\varepsilon$. The transformed model satisfies all of the assumptions of the
classical linear regression model. The FGLS estimator is the OLS estimator applied to the
transformed model. Note that the transformed model is a computational device only. We use it
to obtain efficient estimates of the parameters and standard errors of the original model of
interest.
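
The equivalence between OLS on the transformed model and the GLS rule can be checked numerically. The sketch below is one possible construction, not the only one: it takes $P$ to be the inverse of the Cholesky factor of $W$, so that $P W P^T = I$ (the case $\sigma^2 = 1$). It uses a known $W$ for clarity; with $\hat{W}$ in place of $W$ the same steps give the FGLS estimator. All names and parameter values are hypothetical.

```python
import numpy as np

def transformation_matrix(W):
    """Return P with P W P^T = I, using the Cholesky factorization W = L L^T."""
    L = np.linalg.cholesky(W)
    return np.linalg.inv(L)

rng = np.random.default_rng(1)
T = 50
idx = np.arange(T)
W = 0.6 ** np.abs(idx[:, None] - idx[None, :])      # illustrative autocorrelated W
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([1.0, 2.0]) + np.linalg.cholesky(W) @ rng.standard_normal(T)

P = transformation_matrix(W)
X_star, y_star = P @ X, P @ y                        # transformed model y* = X* beta + eps*

# OLS applied to the transformed model ...
beta_transformed = np.linalg.lstsq(X_star, y_star, rcond=None)[0]

# ... matches the GLS rule applied to the original model.
W_inv = np.linalg.inv(W)
beta_gls = np.linalg.solve(X.T @ W_inv @ X, X.T @ W_inv @ y)
```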

Major Problem with Using the FGLS Estimator

A major problem with using the FGLS estimator is that to estimate $W$ you must obtain an
estimate of each element in $W$ (i.e., each variance and covariance). The matrix $W$ is a $T \times T$
matrix and therefore contains $T^2$ elements. Because it is a symmetric matrix, only $T(T + 1)/2$ of these
elements are different. Thus, if you have a sample size of $T = 100$, then you must use these 100
observations to obtain estimates of $100 \times 101 / 2 = 5{,}050$ different variances and covariances. You cannot
obtain this many estimates with 100 observations because you do not have enough degrees of
freedom.

Resolving the Degrees of Freedom Problem

To circumvent the degrees of freedom problem and obtain estimates of the variances and
covariances in W, you must specify a model that describes what you believe is the nature of
heteroscedasticity and/or autocorrelation. You can then use the sample data to estimate the
parameters of your model of heteroscedasticity and/or autocorrelation. You can then use these
parameter estimates to obtain estimates of the variances and covariances in $W$.

Some commonly used models of heteroscedasticity are the following.

1. Assume that the error variance is a linear function of the explanatory variables.
2. Assume that the error variance is an exponential function of the explanatory
variables.
3. Assume that the error variance is a polynomial function of the explanatory variables.

Some commonly used models of autocorrelation are the following.

1. First-order autoregressive process
2. Second-order autoregressive process
3. Higher-order autoregressive process
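
To illustrate how a model of autocorrelation removes the degrees of freedom problem, the sketch below assumes a first-order autoregressive process, so that every element of $\hat{W}$ depends on a single parameter. The estimate of that parameter from lagged OLS residuals, the function names, and the simulated data are assumptions made for illustration. $\hat{W}$ is built only up to a scale factor, which cancels in the FGLS rule.

```python
import numpy as np

def estimate_rho(resid):
    """Regress e_t on e_{t-1} (no intercept) to estimate the AR(1) coefficient."""
    return (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])

def ar1_w_hat(T, rho):
    """AR(1) covariance pattern up to scale: W_hat[i, j] = rho**|i - j|."""
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :])

# Simulated data with AR(1) errors (illustrative values).
rng = np.random.default_rng(2)
T, rho_true = 1000, 0.7
u = rng.standard_normal(T)
eps = np.empty(T)
eps[0] = u[0] / np.sqrt(1.0 - rho_true ** 2)
for t in range(1, T):
    eps[t] = rho_true * eps[t - 1] + u[t]
X = np.column_stack([np.ones(T), rng.uniform(0.0, 10.0, size=T)])
y = X @ np.array([1.0, 2.0]) + eps

# Step 1: OLS residuals. Step 2: one parameter estimate. Step 3: W_hat. Step 4: FGLS.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
rho_hat = estimate_rho(y - X @ beta_ols)
W_hat = ar1_w_hat(T, rho_hat)
beta_fgls = np.linalg.solve(X.T @ np.linalg.solve(W_hat, X),
                            X.T @ np.linalg.solve(W_hat, y))
```

Here one estimated parameter fills in all of $\hat{W}$, instead of $T(T + 1)/2$ separate estimates.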

Properties of the FGLS Estimator

If the sample data are generated by the general linear regression model, then the FGLS
estimator has the following properties. The FGLS estimator may or may not be unbiased in
small samples. However, if $\hat{W}$ is a consistent estimator of $W$, then the FGLS estimator is
asymptotically unbiased, efficient, and consistent. In this case, Monte Carlo studies have shown
that the FGLS estimator generally yields better estimates than the OLS estimator.

Caveat

For $\hat{W}$ to be a consistent estimator of $W$, your model of heteroscedasticity or autocorrelation
must be a reasonable approximation of the true unknown heteroscedasticity or autocorrelation.
If it is not, then the FGLS estimator will not have desirable small or large sample properties.

HYPOTHESIS TESTING

The following statistical tests can be used to test hypotheses in the general linear regression
model: (1) the t-test, (2) the F-test, (3) the likelihood ratio test, (4) the Wald test, and
(5) the Lagrange multiplier test.

GOODNESS-OF-FIT

It is somewhat more difficult to measure the goodness-of-fit of the model when the sample data are
generated by the general linear regression model. The FGLS estimator is simply the OLS estimator
applied to a transformed regression that purges the heteroscedasticity and/or autocorrelation. Many
economists use as their measure of goodness of fit the $R^2$ statistic applied to the transformed regression.
However, the transformed regression is simply a computational device, not the original model of
interest. The fact that you have a good or bad fit for the transformed regression may be of no interest.

HETEROSCEDASTICITY AND THE GENERAL LINEAR REGRESSION MODEL

Consider the following general linear regression model with heteroscedasticity.

$$Y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + \varepsilon_t \quad \text{where} \quad \mathrm{var}(\varepsilon_t) = E(\varepsilon_t^2) = \sigma_t^2$$

The $t$ subscript attached to $\sigma^2$ indicates that the error for each unit in the sample is
drawn from a probability distribution with a different variance.

Models of Heteroscedasticity

It is often assumed that $\mathrm{var}(\varepsilon_t)$ is either a linear or an exponential function of the explanatory
variables. These two alternative models of heteroscedasticity can be written as follows.

Linear hetero: $\sigma_t^2 = \alpha_1 + \alpha_2 X_{t2} + \alpha_3 X_{t3}$

Exponential hetero: $\ln(\sigma_t^2) = \alpha_1 + \alpha_2 X_{t2} + \alpha_3 X_{t3}$

The model of exponential heteroscedasticity is written in log-linear form.
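
The two variance functions can be evaluated side by side. In the sketch below the coefficient values and data are made up, and the coefficient vector is called `alpha` purely for illustration. One practical difference, which the sketch makes visible, is that the exponential form always yields a positive variance, whereas the linear form can yield a negative value for some coefficient and data combinations.

```python
import numpy as np

def linear_variance(alpha, X):
    """Linear model: sigma_t^2 = alpha_1 + alpha_2 X_t2 + alpha_3 X_t3."""
    return X @ alpha

def exponential_variance(alpha, X):
    """Exponential model: ln(sigma_t^2) = alpha_1 + alpha_2 X_t2 + alpha_3 X_t3."""
    return np.exp(X @ alpha)

# Columns: constant, X_t2, X_t3 (made-up values).
X = np.array([[1.0, 0.0, 1.0],
              [1.0, 2.0, 0.5],
              [1.0, 4.0, 3.0]])
alpha = np.array([0.5, -0.3, 0.1])

var_linear = linear_variance(alpha, X)        # can dip below zero
var_exponential = exponential_variance(alpha, X)   # strictly positive
```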

Testing for Heteroscedasticity

There are four alternative tests for heteroscedasticity: 1) Breusch-Pagan test, 2) Harvey-Godfrey
test, 3) White test, and 4) Wooldridge test. The Breusch-Pagan test assumes that if
heteroscedasticity exists it is linear. The Harvey-Godfrey test assumes that if heteroscedasticity
exists it is exponential. The White and Wooldridge tests assume that if heteroscedasticity exists
it has an unspecified general form.
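
As a rough sketch of how a test based on the linear model might be computed, the code below uses the studentized (Koenker) variant of the Breusch-Pagan test, chosen here for simplicity: regress the squared OLS residuals on the explanatory variables and form $T \cdot R^2$, which is compared with a chi-square critical value whose degrees of freedom equal the number of non-constant regressors. The function name and simulated data are hypothetical.

```python
import numpy as np

def koenker_bp_statistic(X, y):
    """T * R^2 from regressing squared OLS residuals on X (X includes a constant)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e2 = (y - X @ beta) ** 2
    gamma = np.linalg.lstsq(X, e2, rcond=None)[0]
    ss_res = np.sum((e2 - X @ gamma) ** 2)
    ss_tot = np.sum((e2 - e2.mean()) ** 2)
    return len(y) * (1.0 - ss_res / ss_tot)

rng = np.random.default_rng(3)
T = 500
x = rng.uniform(1.0, 5.0, size=T)
X = np.column_stack([np.ones(T), x])
z = rng.standard_normal(T)
y_het = X @ np.array([1.0, 2.0]) + x * z     # error standard deviation grows with x
y_hom = X @ np.array([1.0, 2.0]) + z         # constant error variance

lm_het = koenker_bp_statistic(X, y_het)
lm_hom = koenker_bp_statistic(X, y_hom)

CHI2_1DF_5PCT = 3.841                        # 5% critical value, 1 degree of freedom
reject_het = lm_het > CHI2_1DF_5PCT
```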

Remedies for Heteroscedasticity

When there is evidence of heteroscedasticity, econometricians choose one of two alternatives.

1. Use the OLS estimator, and correct the estimated standard errors so that they
are consistent.
2. Use the FGLS estimator.

White Robust Standard Errors

If you are uncertain of the true model of heteroscedasticity, then you can estimate the
parameters of the model using the OLS estimator, and use White's correction to obtain
consistent estimates of the standard errors. These are called White robust standard errors or
White-Huber robust standard errors. If you choose this alternative, you will obtain unbiased but
inefficient estimates of the parameters of the model, and consistent estimates of the standard
errors. Hypothesis tests will be valid in large samples, but you will lose some precision.
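
A minimal sketch of White's correction follows, assuming its simplest form (often labelled HC0), in which each unknown error variance is replaced by the corresponding squared OLS residual. The function names and data are hypothetical, and small-sample refinements of the correction are not shown.

```python
import numpy as np

def white_cov(X, resid):
    """HC0 estimate: (X'X)^{-1} [sum_t e_t^2 x_t x_t'] (X'X)^{-1}."""
    XtX_inv = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)   # sum of e_t^2 * x_t x_t'
    return XtX_inv @ meat @ XtX_inv

def white_standard_errors(X, y):
    """OLS estimates together with White robust standard errors."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return beta, np.sqrt(np.diag(white_cov(X, resid)))

rng = np.random.default_rng(5)
T = 300
x = rng.uniform(1.0, 5.0, size=T)
X = np.column_stack([np.ones(T), x])
y = X @ np.array([1.0, 2.0]) + x * rng.standard_normal(T)

beta_hat, robust_se = white_standard_errors(X, y)
```

The parameter estimates are the ordinary OLS estimates; only the standard errors change.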

FGLS Estimator

If you are relatively certain about the true model of heteroscedasticity, then you can use the
FGLS estimator. The FGLS estimator is a weighted least squares (WLS) estimator.
To use the WLS estimator, begin by specifying a transformed model that satisfies all of
the assumptions of the classical linear regression model. The transformed model, which is a
computational device, is given by

$$w_t Y_t = \beta_1 w_t + \beta_2 (w_t X_{t2}) + \beta_3 (w_t X_{t3}) + w_t \varepsilon_t$$

The transformed model is obtained by multiplying each side of the statistical equation by an
appropriate weight, $w_t$. The appropriate weight is $w_t = 1/\sigma_t$, the reciprocal
of the standard deviation of the error. Note that the error variance in the transformed model is

$$\mathrm{var}(w_t \varepsilon_t) = \mathrm{var}[(1/\sigma_t)\varepsilon_t] = (1/\sigma_t)^2 \mathrm{var}(\varepsilon_t) = \mathrm{var}(\varepsilon_t)/\mathrm{var}(\varepsilon_t) = 1$$

so the transformed model has constant variance of 1, and therefore a homoscedastic error
term.
To implement the WLS estimator, you use the sample of data to estimate the weight $w_t = 1/\sigma_t$.
You then regress $w_t Y_t$ on $w_t$, $w_t X_{t2}$, and $w_t X_{t3}$ using the OLS estimator.
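
The implementation steps can be sketched as follows, assuming the exponential model of heteroscedasticity. The weights are estimated by regressing the log of the squared OLS residuals on the explanatory variables, which is one common way to do it and is an assumption here rather than something prescribed above. Function names and simulated values are hypothetical.

```python
import numpy as np

def wls(X, y, w):
    """Regress w_t*Y_t on w_t, w_t*X_t2, w_t*X_t3 using OLS."""
    return np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]

def fgls_exponential(X, y):
    """FGLS under assumed exponential heteroscedasticity."""
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_ols
    # Fit ln(e_t^2) = alpha_1 + alpha_2 X_t2 + alpha_3 X_t3 to the OLS residuals.
    alpha_hat = np.linalg.lstsq(X, np.log(resid ** 2), rcond=None)[0]
    sigma_hat = np.sqrt(np.exp(X @ alpha_hat))   # estimated error standard deviation
    return wls(X, y, 1.0 / sigma_hat)            # weights w_t = 1 / sigma_hat_t

rng = np.random.default_rng(4)
T = 1000
x2 = rng.uniform(0.0, 4.0, size=T)
x3 = rng.uniform(0.0, 4.0, size=T)
X = np.column_stack([np.ones(T), x2, x3])
sigma = np.exp(0.5 * x2)                         # true variance is exp(x2)
y = X @ np.array([1.0, 2.0, 3.0]) + sigma * rng.standard_normal(T)

beta_fgls = fgls_exponential(X, y)
```

Any constant factor in the estimated weights cancels in the weighted regression, so only the relative sizes of the weights matter for the coefficient estimates.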
