
Lecture 9

Violation of Assumptions of the CLR Model:
Multicollinearity

Nature and Types
Causes
Consequences
Detection
Correction

Multicollinearity:
Nature

Multicollinearity is a sample problem.

This is because the independent variables in a regression equation are assumed to be nonstochastic, so that the population covariance between them is zero by definition.

But in a sample from a given population, the independent variables may be correlated.

Multicollinearity:
Types

There are two types of multicollinearity:

- perfect multicollinearity
- imperfect multicollinearity

Perfect multicollinearity means the existence of an exact linear relationship between two or more independent variables.

Multicollinearity:
Types

For example, the regression

Y = β1 + β2 X2 + β3 X3 + u

would suffer from perfect multicollinearity if

X3 = λ1 + λ2 X2.

Note that there is no random error term in the second equation, which means |r23| = 1.
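
A minimal numerical sketch of this case (assuming Python with NumPy; the names X2 and X3 mirror the slide): an exact linear relation forces the sample correlation to one and makes the design matrix rank-deficient, so OLS cannot be computed.

```python
import numpy as np

rng = np.random.default_rng(0)
X2 = rng.normal(size=50)
X3 = 1.0 + 2.0 * X2                        # exact linear relation, no error term

print(np.corrcoef(X2, X3)[0, 1])           # 1.0 (up to rounding): |r23| = 1

X = np.column_stack([np.ones(50), X2, X3])
print(np.linalg.matrix_rank(X))            # 2 instead of 3: X'X is singular
```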

Multicollinearity:
Types

In the case of imperfect multicollinearity, r23 is large but not equal to one in absolute value.

This would be the case if, in the regression equation

Y = β1 + β2 X2 + β3 X3 + u,

we had

X3 = λ1 + λ2 X2 + v,

where v is a random error term.
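
A companion sketch for the imperfect case (same hypothetical setup, again assuming NumPy): the error term v keeps the correlation high but strictly below one, and the design matrix retains full rank.

```python
import numpy as np

rng = np.random.default_rng(1)
X2 = rng.normal(size=50)
v = rng.normal(scale=0.3, size=50)         # random error term
X3 = 1.0 + 2.0 * X2 + v                    # high but imperfect collinearity

print(np.corrcoef(X2, X3)[0, 1])           # large, but |r23| < 1

X = np.column_stack([np.ones(50), X2, X3])
print(np.linalg.matrix_rank(X))            # 3: OLS is still computable
```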

Causes of Perfect Multicollinearity


Perfect multicollinearity is typically caused by carelessness on the part of the researcher.

A special kind of perfect multicollinearity is caused by dominant independent variable(s), where one or more independent variables are perfectly collinear with the dependent variable.

Causes of Perfect Multicollinearity


This would arise if an independent variable is definitionally related to the dependent variable.

A dominant variable is so strongly correlated with the dependent variable that it dominates (overwhelms) all other variables.

Causes of Imperfect but High Multicollinearity


Since multicollinearity is a sample problem, high imperfect multicollinearity has to be caused by problems arising from the sample. These include:

- poor data
- manipulated data
- small sample size
- time-series data, especially macro data

Consequences of Perfect Multicollinearity


When there is perfect multicollinearity, the regression coefficients cannot be estimated, as they take the indeterminate form 0/0. The standard errors of the estimated coefficients tend to infinity, as they take the form σ²/0, implying a complete lack of precision.

Consequences of Imperfect but High Multicollinearity


- Regression coefficients are estimable.
- OLS estimates are BLUE.
- Standard errors of the estimates are too large.
- This makes the t ratios too small, and this increases the probability of a Type II error.

Consequences of Imperfect but High Multicollinearity


While R2 and adjusted R2 are not affected, we may encounter a situation where these are high and significant but none, or only a few, of the estimated regression coefficients are individually significant.

The estimated coefficients may have unexpected or "wrong" signs. The estimates and their standard errors will be sensitive to changes in the sample or in the model specification.
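
A small simulation that reproduces these symptoms (hypothetical data; assumes the statsmodels package): the overall fit is strong, yet the individual standard errors are inflated and the t ratios are small.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 40
X2 = rng.normal(size=n)
X3 = 1.0 + 2.0 * X2 + rng.normal(scale=0.05, size=n)   # nearly collinear with X2
Y = 1.0 + 2.0 * X2 + 3.0 * X3 + rng.normal(scale=2.0, size=n)

res = sm.OLS(Y, sm.add_constant(np.column_stack([X2, X3]))).fit()
print(res.rsquared, res.f_pvalue)   # high R2, highly significant F statistic
print(res.bse)                      # inflated standard errors on X2 and X3
print(res.tvalues)                  # individually small t ratios
```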

Detection of Perfect Multicollinearity


This is quite simple: the model cannot be estimated. For example, in EViews you would get the error message "near singular matrix", indicating that estimation cannot proceed.

Detection of Imperfect but High Multicollinearity


Seldom, if ever, would we have zero multicollinearity in a regression model. That is, we never have a regression equation in which all regressors are orthogonal. Thus the issue is not one of detection but one of the degree of intercorrelation. Moreover, because multicollinearity is a sample problem rather than a population problem, there is no formal test for its presence.

Detection of Imperfect but High Multicollinearity


All we have are some general guidelines (rules of thumb), mostly based on the symptoms of high but imperfect multicollinearity. At best, these guidelines may lead us to suspect high multicollinearity; they do not tell us much about the severity of its consequences. We discuss these guidelines below.

Some Guidelines for Assessing Imperfect Multicollinearity


Suspect a high degree of multicollinearity if...

1. The simple correlation coefficient between two independent variables is high and statistically significant. But there are two problems with this:
First, this is only a sufficient condition for high multicollinearity.
Second, it is not clear how high the correlation coefficients must be for there to be severe multicollinearity.

Some Guidelines for Assessing Imperfect Multicollinearity


A possible answer to the second problem is to compare the square of the simple correlation coefficient between two independent variables with the unadjusted R2 from the model.
If the squared simple correlation coefficient is greater than or equal to the unadjusted R2, we may conclude that the two explanatory variables are highly correlated with one another. Unfortunately, this rule itself suffers from a number of defects that render it unsatisfactory.
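
A sketch of this comparison on simulated data (assumes statsmodels; the cut-off is the rule of thumb from the slide, not a formal test).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
X2 = rng.normal(size=40)
X3 = 1.0 + 2.0 * X2 + rng.normal(scale=0.05, size=40)
Y = 1.0 + 2.0 * X2 + 3.0 * X3 + rng.normal(scale=2.0, size=40)

res = sm.OLS(Y, sm.add_constant(np.column_stack([X2, X3]))).fit()
r23_sq = np.corrcoef(X2, X3)[0, 1] ** 2   # squared simple correlation of X2, X3

print(r23_sq, res.rsquared)
if r23_sq >= res.rsquared:
    print("Rule of thumb: X2 and X3 appear highly collinear")
```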

Some Guidelines for Assessing Imperfect Multicollinearity


2. R2, adjusted R2, the F statistic, and the simple correlation coefficients between the dependent variable and each individual independent variable are high, but none, or only a few, of the estimated coefficients are individually significant.
Note, however, that this is only a sufficient condition for high multicollinearity.

Some Guidelines for Assessing Imperfect Multicollinearity


3. R2, adjusted R2, and the F statistic are high, but the partial correlation coefficients between the dependent variable and the independent variables are low.
Once again, this is only a sufficient condition for high multicollinearity.

Some Guidelines for Assessing Imperfect Multicollinearity


4. In a regression of the kth independent variable, Xk, on the remaining independent variables, the resulting R2 (known as R2-delete and denoted R2k) is high and significant based on an F test.

5. The variance-inflation factor (VIF), defined as

VIF(k) = 1/(1 - R2k),

is much larger than one, where R2k is as defined in (4) above.

Some Guidelines for Assessing Imperfect Multicollinearity


VIF may be viewed as the ratio of the variance of the estimated coefficient β̂k in the presence of multicollinearity to its variance in the absence of multicollinearity.

When there is no multicollinearity, R2k is zero and VIF = 1/(1 - 0) = 1. When there is perfect multicollinearity, R2k equals one and VIF = 1/(1 - 1) = 1/0; as R2k approaches one, VIF grows without bound.
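
A sketch of guidelines 4 and 5 on hypothetical data (assumes statsmodels, whose variance_inflation_factor helper regresses one column of the design matrix on the others).

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
X2 = rng.normal(size=40)
X3 = 1.0 + 2.0 * X2 + rng.normal(scale=0.1, size=40)
exog = sm.add_constant(np.column_stack([X2, X3]))   # columns: const, X2, X3

# Guideline 4: auxiliary regression of X3 on the remaining regressors
aux = sm.OLS(exog[:, 2], exog[:, :2]).fit()
r2_k = aux.rsquared
print(r2_k, 1.0 / (1.0 - r2_k))                     # R2k and VIF computed by hand

# Guideline 5: the same VIF via the statsmodels helper (column 2 = X3)
print(variance_inflation_factor(exog, 2))
```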

Some Guidelines for Assessing Imperfect Multicollinearity


6. Adding or dropping a few observations to or from the sample, or adding or dropping an independent variable, results in significant changes in the estimated values, their signs, and their statistical significance.

Some Remarks Concerning Detection of Imperfect Multicollinearity


Except for the VIF, the above rules only tell us how strongly the independent variables are correlated. They do not tell us how serious the consequences of multicollinearity are, say in terms of significantly lowering the t scores. Lawrence Klein suggests that multicollinearity significantly lowers the t scores if R2 < R2k.
The problem is that the t ratios may be quite high even though R2 < R2k.
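
A sketch of Klein's rule of thumb under the same hypothetical setup (assumes statsmodels): compare the auxiliary R2k with the R2 of the main regression.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
X2 = rng.normal(size=40)
X3 = 1.0 + 2.0 * X2 + rng.normal(scale=0.1, size=40)
Y = 1.0 + 2.0 * X2 + 3.0 * X3 + rng.normal(scale=2.0, size=40)

exog = sm.add_constant(np.column_stack([X2, X3]))
r2_model = sm.OLS(Y, exog).fit().rsquared               # R2 of the main regression
r2_k = sm.OLS(exog[:, 2], exog[:, :2]).fit().rsquared   # auxiliary R2k for X3

if r2_model < r2_k:
    print("Klein's rule: multicollinearity is likely to be a problem")
```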

Some Remarks Concerning Detection of Imperfect Multicollinearity


Even if every measure of intercorrelation among the independent variables points to the existence of a high degree of multicollinearity, there is no problem if the estimated t ratios are significant and all estimated coefficients have expected signs and reasonable magnitudes.

Multicollinearity:
Correction
The following options are available for alleviating, handling, or coping with multicollinearity:

1. Do Nothing
As mentioned above, even if every measure of intercorrelation among the independent variables points to the existence of strong multicollinearity, one need do nothing if the estimated t ratios are significant at reasonable levels and the estimated coefficients have the expected signs and reasonable magnitudes.

Multicollinearity:
Correction
2. Drop One of the Collinear Variables
Some suggest dropping one of the collinear variables from the model. While this is sometimes wise, at other times dropping a variable can cause omitted-variable bias.
Of course, in some cases this bias may be more than offset by the gain in efficiency. In that case the mean-squared error (MSE) of the estimated coefficient on the included variable declines, indicating an improvement.

Multicollinearity:
Correction
But how can we tell whether MSE increases or decreases when we omit a variable from the model? If the t ratio for the variable that is a candidate for dropping is less than one in absolute value, then dropping that variable would reduce the MSE of the estimated parameter on the included variable. A corollary to this rule is never to drop an independent variable whose estimated coefficient has a t ratio greater than one in absolute value, even if it has an unexpected (wrong) sign.
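
A sketch of this |t| < 1 screen on simulated data (assumes statsmodels; it is only the rule of thumb above, not a formal procedure).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
X2 = rng.normal(size=40)
X3 = 1.0 + 2.0 * X2 + rng.normal(scale=0.05, size=40)
Y = 1.0 + 2.0 * X2 + 3.0 * X3 + rng.normal(scale=2.0, size=40)

res = sm.OLS(Y, sm.add_constant(np.column_stack([X2, X3]))).fit()
t_x3 = res.tvalues[2]            # t ratio on the candidate variable X3

if abs(t_x3) < 1.0:
    print("Dropping X3 would reduce the MSE of the coefficient on X2")
else:
    print("Keep X3: |t| >= 1, even if its sign looks wrong")
```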

Multicollinearity:
Correction
3. Transform the Data
First-difference the variables: differencing reduces the spurious correlation that normally arises in time-series data in level form.
Express the variables as ratios: sometimes one can greatly reduce, or even eliminate, multicollinearity by combining two multicollinear variables into a ratio.
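
A sketch of both transformations with pandas (hypothetical annual series in levels; the column names consumption, income, and population are illustrative).

```python
import pandas as pd

df = pd.DataFrame({
    "consumption": [100.0, 108.0, 118.0, 131.0, 142.0],
    "income":      [120.0, 130.0, 143.0, 160.0, 172.0],
    "population":  [50.0, 51.0, 52.1, 53.2, 54.0],
})

# First differences: remove the common trend that inflates correlation in levels
d_income = df["income"].diff()
d_consumption = df["consumption"].diff()

# Ratios: combine collinear level variables into per-capita series
income_pc = df["income"] / df["population"]
consumption_pc = df["consumption"] / df["population"]

print(pd.DataFrame({"d_income": d_income, "income_pc": income_pc}).dropna())
```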

Multicollinearity:
Correction
4. Obtain Additional Data
Because multicollinearity is a sample problem, it is possible that in another sample it will not be as severe as in the first.
Furthermore, because the standard errors of the estimates are inversely related to the sample size, we can alleviate the major consequence of multicollinearity, inflated standard errors, by increasing the sample size.
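
A quick illustration of this last point (hypothetical simulation; assumes statsmodels): holding the degree of collinearity fixed, the standard errors shrink as the sample size grows.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
for n in (30, 300, 3000):
    X2 = rng.normal(size=n)
    X3 = 1.0 + 2.0 * X2 + rng.normal(scale=0.1, size=n)
    Y = 1.0 + 2.0 * X2 + 3.0 * X3 + rng.normal(scale=2.0, size=n)
    res = sm.OLS(Y, sm.add_constant(np.column_stack([X2, X3]))).fit()
    print(n, res.bse[1:])        # standard errors on X2 and X3 fall as n grows
```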
