INTRODUCTION
The general linear regression model is a statistical model that describes a data generation
process. The general linear regression model is a generalization of the classical linear regression
model. You can obtain the general linear regression from the classical linear regression by
changing one assumption: assume that the disturbances are nonspherical rather than spherical.
Because of this, the general linear regression model can be used to describe data generation
processes characterized by heteroscedasticity and autocorrelation.
SPECIFICATION
The specification of the general linear regression model is defined by the following set of
assumptions.
Assumptions
This is called heteroscedasticity. In this case, the disturbances are drawn from probability
distributions that have different variances. This often occurs when using cross-section data.
When the error term has nonconstant variance, the variance-covariance matrix of disturbances
is not given by a constant times the identity matrix (i.e., $W \neq \sigma^2 I$). This is because the elements
on the principal diagonal of $W$, which are the variances of the distributions from which the
disturbances are drawn, are not a constant given by $\sigma^2$ but have different values.
Classical Linear Regression Model as a Special Case of the General Linear Regression Model
If the error term has constant variance and the errors are uncorrelated, then $W = \sigma^2 I$ and the
general linear regression model reduces to the classical linear regression model.
The sample of T multivariate observations $(Y_t, X_{t1}, X_{t2}, \ldots, X_{tk})$ is generated by a process
described as follows.
$$y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, W)$$
or alternatively
$$y \sim N(X\beta, W)$$
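As a rough illustration of this data generation process, the following Python sketch draws one sample with disturbances distributed N(0, W). The parameter values, the sample size, and the particular heteroscedastic form of W are assumptions made up for the example, not part of the model specification.

```python
import numpy as np

def simulate_glrm(X, beta, W, rng):
    """Draw one sample y = X @ beta + eps, with eps ~ N(0, W)."""
    T = X.shape[0]
    eps = rng.multivariate_normal(np.zeros(T), W)
    return X @ beta + eps

rng = np.random.default_rng(0)
T = 200
# Hypothetical design: an intercept and one regressor.
X = np.column_stack([np.ones(T), rng.uniform(1.0, 5.0, size=T)])
beta = np.array([1.0, 2.0])      # hypothetical true parameters
sigma_t = 0.5 * X[:, 1]          # assumed: error std. dev. grows with the regressor
W = np.diag(sigma_t ** 2)        # nonspherical: unequal variances on the diagonal
y = simulate_glrm(X, beta, W, rng)
```

Replacing the diagonal W with a matrix that has nonzero off-diagonal elements would give autocorrelated disturbances instead.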
ESTIMATION
Choosing an Estimator
To obtain estimates of the parameters of the model, you need to choose an estimator. We will
consider the following 3 estimators:
1. Ordinary least squares (OLS) estimator
2. Generalized least squares (GLS) estimator
3. Feasible generalized least squares (FGLS) estimator
To obtain estimates of the parameters of the general linear regression model, you can apply the
OLS estimator to the sample data. The OLS estimator is given by the rule:
$$\hat{\beta} = (X^T X)^{-1} X^T y$$
$$\mathrm{Cov}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$$
Property 2 means that in the class of linear unbiased estimators, the OLS estimator does not
have minimum variance. Thus, an alternative estimator exists that will yield more precise
estimates.
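A minimal NumPy sketch of the OLS rule and the conventional covariance formula follows. The function names are illustrative. The second function computes the covariance that the OLS estimator actually has when the disturbances have covariance matrix W, namely $(X^T X)^{-1} X^T W X (X^T X)^{-1}$; this is a standard result included here as background, and the two formulas coincide only when W is a constant times the identity matrix.

```python
import numpy as np

def ols(X, y):
    """OLS estimates and the conventional covariance s^2 (X'X)^{-1}."""
    T, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    s2 = resid @ resid / (T - k)      # usual estimate of sigma^2
    return beta_hat, s2 * XtX_inv

def ols_cov_under_W(X, W):
    """Covariance of the OLS estimator when Cov(eps) = W (sandwich form)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ X.T @ W @ X @ XtX_inv
```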
$$\mathrm{Cov}(\hat{\beta}_{GLS}) = (X^T W^{-1} X)^{-1}$$
If the sample data are generated by the general linear regression model, then the GLS estimator
is the best linear unbiased estimator (BLUE) of the population parameters. The GLS estimator is
more precise than the OLS estimator because the OLS estimator wastes information: it does not
use the information contained in $W$ about heteroscedasticity and/or autocorrelation, while the
GLS estimator does.
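The GLS computation can be sketched as follows, assuming W is known and positive definite. The rule used is the standard one, $\hat{\beta}_{GLS} = (X^T W^{-1} X)^{-1} X^T W^{-1} y$; the function name and the use of `np.linalg.solve` in place of an explicit inverse of W are choices made for this example.

```python
import numpy as np

def gls(X, y, W):
    """GLS estimates and covariance, assuming W is known and positive definite."""
    W_inv_X = np.linalg.solve(W, X)          # W^{-1} X
    W_inv_y = np.linalg.solve(W, y)          # W^{-1} y
    cov = np.linalg.inv(X.T @ W_inv_X)       # (X' W^{-1} X)^{-1}
    beta_hat = cov @ (X.T @ W_inv_y)
    return beta_hat, cov
```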
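The GLS estimator requires the true $W$, which is usually unknown. To make the GLS estimator a
feasible estimator, you can use the sample of data to obtain an estimate of $W$. When you replace
the true $W$ with its estimate $\hat{W}$, you get the FGLS estimator. The FGLS estimator is given by the rule:
$$\hat{\beta}_{FGLS} = (X^T \hat{W}^{-1} X)^{-1} X^T \hat{W}^{-1} y$$
$$\mathrm{Cov}(\hat{\beta}_{FGLS}) = (X^T \hat{W}^{-1} X)^{-1}$$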
The FGLS estimator is also a weighted least squares estimator. The weighted least squares
estimator is derived as follows. Find a $T \times T$ transformation matrix $P$ such that $\varepsilon^* = P\varepsilon$, where $\varepsilon^*$
has variance-covariance matrix $\mathrm{Cov}(\varepsilon^*) = E(\varepsilon^* \varepsilon^{*T}) = \sigma^2 I$. This transforms the original error term
that is nonspherical to a new error term that is spherical. Use the matrix $P$ to derive a
transformed model.
$$Py = PX\beta + P\varepsilon$$
or
$$y^* = X^*\beta + \varepsilon^*$$
where $y^* = Py$, $X^* = PX$, $\varepsilon^* = P\varepsilon$. The transformed model satisfies all of the assumptions of the
classical linear regression model. The FGLS estimator is the OLS estimator applied to the
transformed model. Note that the transformed model is a computational device only. We use it
to obtain efficient estimates of the parameters and standard errors of the original model of
interest.
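One way to construct such a matrix P, sketched below, uses the Cholesky factorization $W = LL^T$ and sets $P = L^{-1}$, so that $PWP^T = I$. This normalizes the constant variance to 1 and assumes W is known and positive definite; with an estimate of W in its place, the same steps give the FGLS estimates. The Cholesky choice is an assumption of the example, since any P for which $P^T P$ is proportional to $W^{-1}$ would serve.

```python
import numpy as np

def whitening_matrix(W):
    """Return P = L^{-1}, where W = L @ L.T, so that P @ W @ P.T = I."""
    L = np.linalg.cholesky(W)
    return np.linalg.inv(L)

def ols_on_transformed(X, y, W):
    """Apply OLS to the transformed model P y = P X beta + P eps."""
    P = whitening_matrix(W)
    X_star, y_star = P @ X, P @ y
    beta_hat, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta_hat
```

Because $P^T P = W^{-1}$ under this construction, the OLS estimates from the transformed data coincide with the direct formula $(X^T W^{-1} X)^{-1} X^T W^{-1} y$.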
A major problem with using the FGLS estimator is that to estimate $W$ you must obtain an
estimate of each element in $W$ (i.e., each variance and covariance). The matrix $W$ is a $T \times T$
matrix and therefore contains $T^2$ elements. Because it is a symmetric matrix, only $T(T + 1)/2$ of these
elements are distinct. Thus, if you have a sample size of T = 100, then you must use these 100
observations to obtain estimates of 5,050 different variances and covariances. You cannot
obtain this many estimates with 100 observations because you do not have enough degrees of
freedom.
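The count is easy to check: a symmetric T x T matrix has T diagonal elements plus T(T - 1)/2 elements above the diagonal, for T(T + 1)/2 distinct elements in total. A short sketch (the function name is illustrative):

```python
def distinct_elements(T):
    """Number of distinct entries in a symmetric T x T matrix."""
    return T * (T + 1) // 2

print(distinct_elements(100))  # 5050
```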
Resolving the Degrees of Freedom Problem
To circumvent the degrees of freedom problem and obtain estimates of the variances and
covariances in W, you must specify a model that describes what you believe is the nature of
heteroscedasticity and/or autocorrelation. You can then use the sample data to estimate the
parameters of your model of heteroscedasticity and/or autocorrelation. You can then use these
parameter estimates to obtain estimates of the variances and covariances in W. Some often
used models of heteroscedasticity are the following.
1. Assume that the error variance is a linear function of the explanatory variables.
2. Assume that the error variance is an exponential function of the explanatory
variables.
3. Assume the error variance is a polynomial function of the explanatory variables.
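The first two variance models can be sketched as functions that map a matrix of explanatory variables Z and a parameter vector alpha to the T error variances, which then form the diagonal of W. The names Z and alpha, and the functional forms sigma_t^2 = Z_t alpha and sigma_t^2 = exp(Z_t alpha), are assumptions chosen for illustration. A polynomial model has the linear form with powers of the regressors added as columns of Z.

```python
import numpy as np

def variance_linear(Z, alpha):
    """sigma_t^2 = Z_t @ alpha. Fitted values are not guaranteed to be positive."""
    return Z @ alpha

def variance_exponential(Z, alpha):
    """sigma_t^2 = exp(Z_t @ alpha). Always positive."""
    return np.exp(Z @ alpha)

def W_from_variances(sigma2):
    """Diagonal W for heteroscedastic but uncorrelated disturbances."""
    return np.diag(sigma2)
```

Either way, the number of unknowns falls from T variances to the length of alpha, which is what makes estimation possible.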
Some often used models of autocorrelation are the following.
1. First-order autoregressive process
2. Second-order autoregressive process
3. Higher-order autoregressive process
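For the first-order autoregressive case, W depends on only two parameters. Assuming the stationary process eps_t = rho * eps_{t-1} + u_t with |rho| < 1 and Var(u_t) = sigma_u^2 (this notation is introduced here for the example), the standard result is Cov(eps_t, eps_s) = sigma_u^2 * rho^|t-s| / (1 - rho^2). A sketch:

```python
import numpy as np

def ar1_covariance(T, rho, sigma_u2=1.0):
    """Covariance matrix of a stationary AR(1) error process of length T."""
    if abs(rho) >= 1:
        raise ValueError("stationarity requires |rho| < 1")
    idx = np.arange(T)
    lags = np.abs(idx[:, None] - idx[None, :])   # |t - s| for every pair
    return sigma_u2 * rho ** lags / (1.0 - rho ** 2)
```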
If the sample data are generated by the general linear regression model, then the FGLS
estimator has the following properties. The FGLS estimator may or may not be unbiased in
small samples. However, if $\hat{W}$ is a consistent estimator of $W$, then the FGLS estimator is
asymptotically unbiased, efficient, and consistent. In this case, Monte Carlo studies have shown
that the FGLS estimator generally yields better estimates than the OLS estimator.
Caveat
HYPOTHESIS TESTING
The following statistical tests can be used to test hypotheses in the general linear regression
model. 1) t-test. 2) F-test. 3) Likelihood ratio test. 4) Wald test. 5) Lagrange multiplier test.
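As one example, a Wald statistic for q linear restrictions R beta = r can be computed from any estimate of beta and its estimated covariance matrix. The sketch below is generic: R, r, and the function name are illustrative, and the chi-square reference distribution with q degrees of freedom is the usual large-sample approximation, assumed here rather than derived.

```python
import numpy as np

def wald_statistic(beta_hat, cov_beta, R, r):
    """Wald statistic (R b - r)' [R Cov(b) R']^{-1} (R b - r)."""
    diff = R @ beta_hat - r
    return float(diff @ np.linalg.solve(R @ cov_beta @ R.T, diff))
```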
GOODNESS-OF-FIT
It is somewhat more difficult to measure the goodness-of-fit of the model when the sample data are
generated by the general linear regression model. The FGLS estimator is simply the OLS estimator
applied to a transformed regression that purges the heteroscedasticity and/or autocorrelation. Many
economists use as their measure of goodness of fit the $R^2$ statistic applied to the transformed regression.
However, the transformed regression is simply a computational device, not the original model of
interest. The fact that you have a good or bad fit for the transformed regression may be of no interest.
HETEROSCEDASTICITY AND THE GENERAL LINEAR REGRESSION MODEL
The t subscript attached to sigma squared ($\sigma_t^2$) indicates that the error for each unit in the sample is
drawn from a probability distribution with a different variance.
Models of Heteroscedasticity
It is often assumed that $\mathrm{var}(\varepsilon_t)$ is either a linear or exponential function of the explanatory
variables. These two alternative models of heteroscedasticity can be written as follows.
1. Use the OLS estimator. Correct the estimates of the standard errors of the estimates so they
are unbiased and consistent.
2. Use the FGLS estimator.
If you are uncertain of the true model of heteroscedasticity, then you can estimate the
parameters of the model using the OLS estimator, and use White's correction to obtain
consistent estimates of the standard errors. These are called White robust standard errors or
White-Huber robust standard errors. If you choose this alternative, you will obtain unbiased but
inefficient estimates of the parameters of the model, and consistent estimates of the standard
errors. Hypothesis tests will be valid in large samples, but you will lose some precision.
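White's correction can be sketched in a few lines. The version below is the basic form, often labeled HC0, in which the squared OLS residuals stand in for the unknown error variances: $(X^T X)^{-1} X^T \mathrm{diag}(e_t^2) X (X^T X)^{-1}$. The function name is illustrative, and the small-sample adjustments that some software applies are omitted.

```python
import numpy as np

def ols_white_se(X, y):
    """OLS estimates with White (HC0) heteroscedasticity-robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    # X' diag(e^2) X, computed without forming the T x T diagonal matrix
    meat = (X * resid[:, None] ** 2).T @ X
    cov = XtX_inv @ meat @ XtX_inv
    return beta_hat, np.sqrt(np.diag(cov))
```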
FGLS Estimator
If you are relatively certain about the true model of heteroscedasticity, then you can use the
FGLS estimator. The FGLS estimator is a weighted least squares (WLS) estimator.
To use the WLS estimator, begin by specifying a transformed model that satisfies all of
the assumptions of the classical linear regression model. The transformed model, which is a
computational device, is given by
$$w_t Y_t = \beta_1 w_t + \beta_2 w_t X_{t1} + \beta_3 w_t X_{t2} + w_t \varepsilon_t$$
The transformed model is obtained by multiplying each side of the statistical equation by an
appropriate weight, $w_t$. The appropriate weight is $w_t = 1/\sigma_t$, the reciprocal
of the standard deviation of the error. Note that the error variance in the transformed model is
$$\mathrm{var}(w_t \varepsilon_t) = w_t^2 \sigma_t^2 = \sigma_t^2 / \sigma_t^2 = 1$$
so the transformed model has constant variance of 1, and therefore a homoscedastic error
term.
To implement the WLS estimator, you use the sample of data to estimate the weight $w_t = 1/\sigma_t$.
You then regress $w_t Y_t$ on $w_t$, $w_t X_{t1}$, and $w_t X_{t2}$ using the OLS estimator.
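The two steps can be sketched as follows. Estimating the weights requires a variance model; this sketch assumes the exponential form sigma_t^2 = exp(Z_t alpha) with Z = X, and estimates alpha by regressing the log of the squared OLS residuals on Z. That auxiliary regression, the function names, and the simulated data with their parameter values are all assumptions of the example.

```python
import numpy as np

def wls(X, y, w):
    """OLS of w_t * Y_t on the columns of X scaled by w_t."""
    beta_hat, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
    return beta_hat

def fgls_exponential(X, y):
    """Two-step FGLS assuming sigma_t^2 = exp(X_t @ alpha)."""
    b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b_ols
    # Step 1: estimate the variance model from the OLS residuals.
    alpha_hat, *_ = np.linalg.lstsq(X, np.log(resid ** 2), rcond=None)
    sigma_hat = np.sqrt(np.exp(X @ alpha_hat))
    w = 1.0 / sigma_hat                      # w_t = 1 / sigma_t
    # Step 2: OLS on the weighted data.
    return wls(X, y, w), w

# Simulated example with hypothetical parameter values.
rng = np.random.default_rng(3)
T = 2000
X = np.column_stack([np.ones(T), rng.uniform(0, 3, T), rng.uniform(0, 3, T)])
beta = np.array([1.0, 2.0, -1.0])
sigma_t = np.exp(0.5 * X[:, 1])              # std. dev. rises with X_t1
y = X @ beta + sigma_t * rng.normal(size=T)
beta_fgls, w = fgls_exponential(X, y)
```

Because X contains a column of ones, scaling its columns by w_t yields exactly the regressors w_t, w_t X_t1, and w_t X_t2. The intercept of the log-residual regression is known to be biased, but that only rescales every weight by the same constant, which leaves the WLS coefficient estimates unchanged.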