Figure 3.1 Panel options for the application of the GLS Estimation Method
By using each of the equation specifications, several panel options can be selected, as presented in Figure 3.1. Based on these options, the following notes and comments are presented.
(1) Figure 3.1(a) presents three options for the cross-section effect specification, namely None, Fixed, and Random.
(2) Figure 3.1(a) also presents the options for GLS Weights: No Weight, which is the default option. There are four alternative weights, namely Cross-Section Weights, Cross-Section SUR, Period Weights, and Period SUR. It is not an easy task to select the best possible weight, since the statistical results are highly dependent on the data set used. For this reason, researchers could apply selected or all alternative options, and then identify which ones would be considered the best and the worst fit models. For an illustration, Baltagi (2009) presents statistical results using several alternative options. However, I would recommend applying the default option, if there is no good reason to do otherwise.
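Outside EViews, the idea behind the Cross-Section Weights option can be sketched by hand. The following Python fragment is only an illustration with simulated data (all names and numbers are hypothetical): fit pooled OLS, estimate one residual standard deviation per cross-section, then re-estimate with each observation weighted by the inverse of its cross-section's residual standard deviation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 5, 20                      # 5 cross-sections, 20 periods (hypothetical)
ids = np.repeat(np.arange(N), T)
x = rng.normal(size=N * T)
# heteroskedastic errors: each cross-section has its own error variance
sigma = np.array([0.5, 1.0, 2.0, 0.7, 1.5])[ids]
y = 1.0 + 2.0 * x + sigma * rng.normal(size=N * T)

X = np.column_stack([np.ones_like(x), x])

# Step 1: pooled OLS
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b_ols

# Step 2: per-cross-section residual standard deviation -> weights
s_i = np.array([resid[ids == i].std(ddof=0) for i in range(N)])
w = 1.0 / s_i[ids]

# Step 3: weighted (feasible GLS) re-estimation
b_fgls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
print(b_fgls)                     # close to the true coefficients (1.0, 2.0)
```

The same two-step feasible-GLS idea underlies the other weight options, with the variances estimated per period, or jointly (SUR), instead of per cross-section.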
Figure 3.2 Additional options for the application of the GLS Estimation Method
(3) Figure 3.1(b) presents eight options for the “Coef covariance method”, which should also be selected by trial and error; the statistical results obtained are unpredictable.
(4) In addition, Figure 3.1 also presents the “Options” tab, which in turn presents several alternative selections, as shown in Figure 3.2. Based on this figure, the following notes and comments are presented. Figure 3.2(a) presents three Weighting Options, specific to the Random effects method, and a choice of whether to select “Always keep GLS and IV/GMM weights”. Figure 3.2(b) presents the options if we are not using the random-effects method.
(5) We can never predict which combination of options would give the best possible results or estimates. Consider the following illustrative examples.
(6) In developing any model, including the models presented in previous chapters, the following notes or statements should be considered.
(i) Tukey (1962, in Gifi, 1990) makes the following statements.
In data analysis we must look to a very heavy emphasis on judgment. At least three different sorts or sources of judgment are likely to be involved in almost every instance:
(a1) judgment based upon the experience of the particular field of subject matter from which the data come,
(a2) judgment based upon a broad experience with how particular techniques of data analysis have worked out in a variety of fields of application,
(a3) judgment based upon abstract results about the properties of particular techniques, whether obtained by mathematical proof or empirical sampling.
(ii) Benzecri (1973, in Gifi, 1990, p. 25) presents notes on the relationship between the data set and the model as follows:
The model must follow the data, and not the other way around. This is another error in the application of mathematics to the human sciences: the abundance of models, which are built a priori and then confronted with the data by what one calls a ‘test’. Often the ‘test’ is used to justify a model in which the number of parameters to be fitted is larger than the number of data points. And often it is used, on the contrary, to reject as invalid the most judicious remarks of the experimenter. But what we need is a rigorous method to extract structure, starting from the data.
Classical (parametric) statistics derives results under the assumption that these models are strictly true. However, apart from some simple discrete models perhaps, such models are never exactly true. After all, a statistical model has to be simple (where ‘simple’, of course, has a relative meaning, depending on the state and standards of the subject matter).
3.2 GLS DVMS OR GLS ANOVA REGRESSIONS
As an extension of the DVMs or LS ANOVA Regressions presented in Chapter 1, this section presents examples of empirical statistical results based on weighted GLS regression, weighted FEM, and weighted REM. However, one should note that error messages can be obtained by using other alternative models. See the following examples. For the illustrated empirical results, the model in Figure 1.1(a), with the following equation, is applied.
LNSALE C @Expand(TP,B,@Droplast) @Expand(A,@Droplast)*@Expand(TP,B) (3.1)
Figure 3.1 Statistical results of three weighted GLS regressions.
3.1 There is no option for a weighted Two-way FEM available.
3.2 The estimate presented is the panel least squares estimate of a TW_FEM, having additive cross-section dummies and time-point (year) dummies as additional independent variables for the ANOVA model in (3.1). Refer to the limitation of TW_FEMs, specifically the Firm+Year FEMs, presented in the previous chapter.
Figure 3.2 Statistical results of two weighted GLS FEMs, and an ordinary Two-way FEM.
3.2.3 Weighted GLS REMs
that is, the individual error components have normal distributions with zero means and variances σε² and σµ², respectively; they are not correlated with each other and are not auto-correlated across either the cross-section or the time series units.
Take note that there are two error terms in this model: εi, which is the cross-section, or individual-specific, error, and µit, which is the combined time series and cross-section error component. Overall, the two errors are assumed to have zero means, and the individual error is not correlated with the combined error. As a result of the assumptions stated above, for wit = εi + µit it follows that:
E(wit) = E(εi + µit) = 0
var(wit) = σε² + σµ² (3.4)
ρ = corr(wis, wit) = σε²/(σε² + σµ²), s ≠ t
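The decomposition in (3.4) is easy to verify by simulation. A minimal Python sketch, with hypothetical variance values (not the values from any output in this chapter):

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 2000, 10                        # hypothetical panel size
sigma_e, sigma_mu = 1.5, 1.0           # sd of eps_i and mu_it (hypothetical values)

eps = sigma_e * rng.normal(size=N)     # individual-specific errors eps_i
mu = sigma_mu * rng.normal(size=(N, T))
w = eps[:, None] + mu                  # w_it = eps_i + mu_it

rho_theory = sigma_e**2 / (sigma_e**2 + sigma_mu**2)
# empirical correlation of w_is and w_it for two fixed periods s != t
rho_hat = np.corrcoef(w[:, 0], w[:, 1])[0, 1]
print(round(rho_theory, 4), round(rho_hat, 2))
```

With σε = 1.5 and σµ = 1.0, ρ = 2.25/3.25 ≈ 0.6923, and the sample correlation of any two columns of w should be close to that value.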
νt, t = 1, …, T are i.i.d. N(0, σν²)
µit, i = 1, …, N; t = 1, …, T are i.i.d. N(0, σµ²)
E(νt µit) = 0; E(νs νt) = 0, (s ≠ t) (3.6)
E(µis µit) = E(µit µjt) = E(µis µjs) = 0, (i ≠ j; s ≠ t)
that is, the error components have normal distributions with zero means and variances σν² and σµ², respectively; they are not correlated with each other and are not auto-correlated across either the cross-section or the time series units.
Take note that there are two error terms in this model: νt, which is the time series, or time-specific, error, and µit, which is the combined time series and cross-section error component. Overall, the two errors are assumed to have zero means, and the time-specific error is not correlated with the combined error. As a result of the assumptions stated above, for wit = νt + µit it follows that:
E(wit) = E(νt + µit) = 0
var(wit) = σν² + σµ² (3.7)
ρ = corr(wit, wjt) = σν²/(σν² + σµ²), i ≠ j
that is, the individual error components have normal distributions with zero means and variances σε², σν², and σµ², respectively; they are not correlated with each other and are not auto-correlated across either the cross-section or the time series units.
Take note that there are three error terms in this model: εi, which is the cross-section, or individual-specific, error; νt, which is the time series, or time-specific, error; and µit, which is the combined time series and cross-section error component. Overall, the three errors are assumed to have zero means, and the individual and time-specific errors are not correlated with the combined error. As a result of the assumptions stated above, for wit = εi + νt + µit it follows that:
E(wit) = E(εi + νt + µit) = 0
var(wit) = σε² + σν² + σµ², for all i & t
cov(wis, wjt) = σε², i = j, s ≠ t
             = σν², i ≠ j, s = t
corr(wis, wjt) = 1, i = j, s = t (3.10)
             = 0, i ≠ j, s ≠ t
Rho(CSRE) = σε²/(σε² + σν² + σµ²), i = j, s ≠ t
Rho(PERE) = σν²/(σε² + σν² + σµ²), i ≠ j, s = t
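The covariance structure in (3.10) can also be checked by constructing the full covariance matrix of the stacked wit directly, here for a small hypothetical panel:

```python
import numpy as np

N, T = 3, 4                                 # small hypothetical panel
s_e2, s_nu2, s_mu2 = 2.0, 0.5, 1.0          # var(eps_i), var(nu_t), var(mu_it): hypothetical

# cov(w_is, w_jt) = s_e2*[i=j] + s_nu2*[s=t] + s_mu2*[i=j and s=t],
# stacked with i as the outer index and t as the inner index
I_N, I_T = np.eye(N), np.eye(T)
J_N, J_T = np.ones((N, N)), np.ones((T, T))
Omega = (s_e2 * np.kron(I_N, J_T)
         + s_nu2 * np.kron(J_N, I_T)
         + s_mu2 * np.kron(I_N, I_T))

total = s_e2 + s_nu2 + s_mu2
assert Omega[0, 0] == total                 # var(w_it)
assert Omega[0, 1] == s_e2                  # same i, different t
assert Omega[0, T] == s_nu2                 # different i, same t
print(s_e2 / total, s_nu2 / total)          # Rho(CSRE) and Rho(PERE)
```

The two printed ratios are Rho(CSRE) and Rho(PERE) for these hypothetical variances.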
the test for normality is not testing whether the set of error terms has independent identical normal distributions, but whether the theoretical distribution of the residual is normal. In addition, I would make the following statement:
“The conclusion of testing a hypothesis does not directly mean that it represents the true characteristic(s) of the corresponding population.”
Example 3.3 Figure 3.2 presents the statistical results of two REMs, namely a weighted CSREM and a weighted PEREM, using the Wallace and Hussain weighting option. Based on these results, the following notes are presented.
(1) Based on the CSREM, the following empirical findings are presented.
1.1 The equation of the mean model C(i) is C(i) = 7.1401 + εi.
1.2 Rho(CSRE) = σε²/(σε² + σµ²) = 0.7457, and Rho(Id_RE) = 0.2543, with a total of 1 (one).
1.3 The regression obtained has the following equation, which is a single cell-means (ANOVA) model having an additional term [CX=R], to represent the two random errors εi and µit.
(2) Based on the PEREM, the following empirical findings are presented.
2.1 The equation of the mean model C(t) is C(t) = 7.7617 + νt.
2.2 Rho(PERE) = σν²/(σν² + σµ²) = 0.0076, and Rho(Id_RE) = 0.9924, with a total of 1 (one).
2.3 The regression obtained has the following equation, which is a single cell-means (ANOVA) model having an additional term [PER=R], to represent the two random errors νt and µit.
LNSALE = 7.7617 - 0.7858*(TP=1 AND B=1) - 0.5985*(TP=1 AND B=2)
- 0.2802*(TP=2 AND B=1) - 2.3978*(A=1)*(TP=1 AND B=1)
- 1.7286*(A=1)*(TP=1 AND B=2) - 2.8491*(A=1)*(TP=2 AND B=1)
- 2.10686*(A=1)*(TP=2 AND B=2)+ [PER=R]
Figure 3.3 Statistical results of a weighted CSREM, and a weighted PEREM, using the ES in (3.1)
2.4 So the total of 1,561 firm-year observed scores of LNSALE is presented, or estimated, by a total of only eight cell means.
(3) Additional notes.
3.1 The TW_REM cannot be applied because the data are unbalanced. However, by using another dependent variable, such as LNLIA = log(Liability), the statistical results of the TW_REM are obtained, as presented in Figure 3.4. Then the data analyses based on the CSREM and PEREM can easily be done.
Figure 3.4 Statistical results of a TW_REM of LNLIA, using the independent variables in (3.1)
3.2 The other two weighting options, namely Swamy-Arora and Wansbeek-Kapteyn, could easily be applied. Do this as an exercise.
ly c @expand(group,tp,@dropfirst) ly(-1)*lk*ll*@expand(group,tp)
ly(-1)*lk*@expand(group,tp) ly(-1)*ll*@expand(group,tp)
lk*ll*@expand(group,tp) ly(-1)*@expand(group,tp)
lk*@expand(group,tp) ll*@expand(group,tp) (3.11)
Note that unexpectedly good-fit reduced models might be obtained, such as hierarchical and nonhierarchical three-way or two-way interaction models by Group and TP, and an additive model by Group and TP, which will be presented in the following subsections. For the data analysis, it is recommended to apply the manual stepwise selection method, as presented in Section 2.6. Do this as an exercise.
Note that unexpectedly good-fit reduced models might be obtained, either hierarchical or nonhierarchical two-way HRMs by Group and TP, or the additive model in (3.13), which is highly dependent on the data set used. For the data analysis, it is recommended to apply the manual stepwise selection method, as presented in Section 2.6. Do this as an exercise.
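The manual stepwise rule used throughout this section — insert the candidate sets in their assumed order of importance, and keep a set only if every coefficient still has a p-value below the threshold — can be sketched as follows. The data, the variable names, and the normal approximation to the t-test are all illustrative assumptions, not part of the EViews procedure:

```python
import numpy as np
from math import erfc, sqrt

def ols_pvalues(y, X):
    """OLS fit; two-sided p-values via a normal approximation to the t-statistics."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    n, k = X.shape
    r = y - X @ b
    s2 = r @ r / (n - k)
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return b, np.array([erfc(abs(t) / sqrt(2)) for t in b / se])

rng = np.random.default_rng(2)
n = 300
x1, x2, x3 = rng.normal(size=(3, n))
y = 1 + 2 * x1 + 0.5 * x1 * x2 + rng.normal(size=n)   # x3 is irrelevant

# candidate sets, ordered from assumed-most important to least important
candidates = [("x1", x1), ("x1*x2", x1 * x2), ("x2", x2), ("x3", x3)]

kept, X = ["const"], np.ones((n, 1))
for name, col in candidates:
    X_try = np.column_stack([X, col])
    _, pvals = ols_pvalues(y, X_try)
    if (pvals > 0.30).any():     # some term fails the 0.30 rule: skip this set
        continue
    kept, X = kept + [name], X_try

print(kept)
```

The 0.30 cut-off mirrors the p-value rule used in the examples below; in EViews the same loop is done by hand, re-estimating the equation after each insertion.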
3.3.4 Application of Various GLS HRMs
It is recognized that data analysis based on various GLS regressions could be conducted using each of the general equations in (3.11), (3.12), and (3.13). However, the following examples only present some selected empirical results, using one of the equation specifications.
Figure 3.5 Summary of the statistical results of CS weight, Period weight, and
Period SUR models, based on the ES in (3.13)
3.3.4.1 Application of Weighted GLS Regression
Example 3.4 Figure 3.5 presents the statistical results of the EGLS (Cross-Section weight), EGLS (Period weight), and EGLS (Period SUR) models in (3.13), using the White standard errors & covariance. Based on these results, the following notes are presented.
(1) The data analysis can be done for each of the four GLS weights; however, for the weight Cross-Section SUR, the error message above is shown on the screen. Statistical results would be obtained by selecting any subsample of the firms of a size less than (T - 1) = (28 - 1), where the 1 corresponds to the LY(-1) used as an independent variable.
(2) The LV(1) models are applied to overcome the autocorrelation problems. Compare them to the models having very small DW-statistics, presented in Table 2.4.
(3) The three regressions in this figure are acceptable regression functions, in a statistical sense, even though some of the independent variables have p-values > 0.30. So, in a statistical sense, a reduced model should be explored, by deleting the corresponding independent variables one by one. Do this as an exercise.
“Representatives”. For instance, in the following four equations, the term [CX=F] represents the (82-1) cross-section fixed effects (dummy variables), which are divided into 2 groups, namely Group=1 and Group=2.
Figure 3.6 Statistical results of weighted CSFEM, weighted PEFEM, and ordinary TW_FEM
Figure 3.7 The estimation equation and the regression function of the weighted CSFEM
1.2 For each firm in Group=1, the first model above presents two-piece time series models of LY on LY(-1), LK, and LL, for TP=1 and TP=2. So all 40 firms in Group=1 are represented by a set of homogeneous two-piece time series regressions, having different intercepts. Note that this model in fact is a time series One-Way ANCOVA model, which is considered a not-recommended model, moreover for a large number of firms, because in general the independent variables have different slope parameters. Refer to the ANCOVA presented in the previous chapter. Similarly for all 42 firms in Group=2.
1.3 Since some of the independent variables have large p-values, say greater than 0.30, a reduced model should be explored, such as by using
ll*@expand(group,tp,@drop(1,2),@drop(2,1),@drop(2,2))
(2) Similar results and notes can be presented based on the weighted PEFEM, as follows:
2.1 The model presents the following 4 equations by Group and TP, where the term [PER=F] represents the (27-1) period fixed effects (dummy variables), which are divided into 2 groups:
For Group=2 & TP=1: LY = C(1) + C(4)*LY(-1) + C(8)*LK + C(12)*LL + [PER=F]
For Group=2 & TP=2: LY = C(1) + C(5)*LY(-1) + C(9)*LK + C(13)*LL + [PER=F]
2.2 For each time point in TP=1, the pair of equations presents a two-piece cross-section model of LY on LY(-1), LK, and LL, for Group=1 and Group=2. So all 14 time points in TP=1 are represented by a set of homogeneous two-piece cross-section regressions, having different intercepts. Similarly for all 14 time points in TP=2.
2.3 Since some of the independent variables have large p-values, say greater than 0.30, a reduced model should be explored.
(3) Finally, for the ordinary TW_FEM, we have the following findings and notes:
3.1 The model presents the following set of equations, where the term [CX=F,PER=F] represents the (82-1) firm dummies and the (27-1) time dummies, which are divided into four groups.
3.2 Note that the first model represents homogeneous regressions for the 560 (= 40 firms × 14 time-points) observations, with intercepts indicated by the additive dummies of the 40 firms and 14 time-points, which would never be observed in practice. Refer to the illustrative results and special notes in Example 2.7 and Example 2.8. So this model would also be considered the worst model, compared to all possible models having the same set of numerical independent variables. Similarly for the second model.
3.3 So it is recommended not to apply TW_FEMs, because the model in fact is an additive two-way ANCOVA model by Firms and Time-points, which is the worst ANCOVA model among all possible ANCOVA models having the same set of variables – refer to the previous chapter.
Example 3.6 (Random effects model using the manual stepwise selection method) Figure 3.8 presents the statistical results of a weighted TWRE_HRM in (3.11), using the manual stepwise selection method. Based on these results, the following findings and notes are presented.
(1) Note that the ordering of the sets of variables in (3.16) represents the levels of their importance. For instance, the interaction LY(-1)*LK*LL is considered the most important cause (source, or upstream) variable within each cell generated by GROUP and TP, because it is defined that the effect of LY(-1) on LY depends on LK and LL, referring to the path diagrams in Figure 2.1, and the model in (2.20a).
(2) The statistical results of each stage of the data analysis, from stage 1 up to stage 6, show that the interaction regression is a good fit model of LY, because each of its numerical independent variables has a p-value < 0.30. Note that each stage is presented between the bold lines. For instance, at Stages 1 and 2, the model has the intercepts and the three-way interaction numerical variables only, namely LY(-1)*LK*LL*(Group=i,TP=j), for i = 1, 2; and j = 1, 2.
(3) The model at Stage-5 is obtained by inserting the additional independent variables LK*@Expand(Group,TP). It is found that LK*(Group=2 and TP=1) should not be used in the model, because the interaction numerical variable in (Group=2 and TP=1) has a p-value > 0.30. The data analysis can then be done using the independent variables LK*@Expand(Group,TP,@Drop(2,1)).
(4) The model at Stage-6 is obtained by inserting the additional four independent variables in LL*@Expand(Group,TP). However, it is found that two of the independent variables should not be used in the model, because some of the independent variables in the model obtained at Stage-5 have p-values > 0.30. The data analysis can then be done using the independent variables LL*@Expand(Group,TP,@Drop(2,1))
Figure 3.8 A part of the manual stepwise selection method using the models in (3.13)
(5) Finally, by inserting the additional variables LY(-1)*@Expand(Group,TP), an error message is obtained, namely “Random effects estimation requires number of cross-section > number of coefs for between parameters”. It is really unexpected. So the model at Stage-6 should be considered the final model.
(6) It is recognized that by selecting the independent variables from the most important up to the least important, unexpected other final models could be obtained. So a final good fit model, in a statistical sense, would be highly dependent on the ranking of importance of the candidate independent variables, in addition to the multicollinearity between the numerical independent variables within each cell generated by GROUP and TP. See the following example.
(7) Note that the HRMs above can be modified to the HRMs by Group and the time T, and the models by Firm and TP, with the general equation specifications (2.23a) and (2.23b), respectively.
(8) They can also be extended to the HRMs with trend or the time-related effects, which have been presented in the previous chapter. Do these as exercises.
Example 3.7 (An alternative manual stepwise selection method) Corresponding to the error message obtained while inserting ly(-1)*@expand(group,tp) at the final stage in the previous example, I try choosing LY(-1) as the most important independent variable; for the other variables we then have many possible orderings. In this case, however, I select the following ordering, where the three interactions having the main factor LY(-1) are the least important, because they should be highly correlated with LY(-1) within each cell. The readers can apply other orderings as an exercise.
ly c @expand(group,tp,@dropfirst) ly(-1)*@expand(group,tp) lk*ll*@expand(group,tp)
lk*@expand(group,tp) ll*@expand(group,tp) ly(-1)*lk*ll*@expand(group,tp)
ly(-1)*lk*@expand(group,tp) ly(-1)*ll*@expand(group,tp) (3.14)
Figure 3.9 presents the final statistical results of the first four stages of the data analysis. Based
on these results, the following findings and notes are presented.
Figure 3.9 Statistical results of the reduced models in (3.16), using the manual stepwise selection method
(1) By inserting any one of the interactions having the main factor LY(-1) at Stage-5, it was found that some of the independent variables at Stage-4 have large p-values, say greater than 0.30. So none of these interactions are inserted as additional independent variables.
(2) Since a researcher would never apply all possible orderings, his/her final good fit model cannot be considered the best fit model. In this case, there are 7! = 5,040 possible orderings of the seven sets of variables, from ly(-1)*@expand(group,tp) up to ly(-1)*ll*@expand(group,tp). Refer to Section 2.6.2.
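The count of orderings quoted above is simply 7!:

```python
from math import factorial

print(factorial(7))   # 7! = 5,040 orderings of the seven sets of variables
```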
Table 3.1 The parameters of the final model, by (Group,TP)

Variable   (1,1)    (1,2)       (2,1)       (2,2)
C          C(1)     C(1)+C(2)   C(1)+C(3)   C(1)+C(4)
LY(-1)     C(5)     C(6)        C(7)        C(8)
LK*LL      C(9)     –           –           C(10)
LK         C(11)    –           C(12)       C(13)
LL         –        –           C(14)       C(15)
(3) The parameters of the final model in Figure 3.6 are presented in Table 3.1, which shows the following types of models:
3.1 Cell (1,1) presents a LV(1) nonhierarchical two-way interaction model, because it does not have the main factor LL as an independent variable, while LK*LL is an independent variable.
3.2 Cell (1,2) presents the simplest lagged-variable model, because it has only the independent variable LY(-1).
3.3 Cell (2,1) presents a LV(1) additive model, because it has the three main factors as independent variables.
3.4 Cell (2,2) presents a LV(1) hierarchical two-way interaction model, because it has the two-way interaction LK*LL together with both of its main variables as independent variables.
3.5 Since the four models have different sets of independent variables, it is inappropriate to test a hypothesis on the effects differences between the cells. If one wants to test their effects differences, the full model should be applied, so that the four models have the same set of independent variables.
(4) The statistical results are obtained using the Swamy-Arora weighting option. The other two weighting options, namely Wallace-Hussain and Wansbeek-Kapteyn, could easily be applied. Do this as an exercise.
Figure 3.10 Statistical results of the other reduced models of the model in (3.13)
Example 3.8 (Application of the Swamy-Arora weighting option) Figure 3.10 presents the statistical results of a weighted CSRE_HRM, a weighted PERE_HRM, and a weighted TWRE_HRM, which are reduced models of the models using the ES (3.13). Based on these results, the following findings and notes are presented.
(1) The statistical results of the reduced models are obtained using the manual stepwise selection method, for each of the weighted CSREM, PEREM, and TW_REM, using the Swamy-Arora weighting option. The other two weighting options could easily be done. Do them as an exercise.
(2) Note that the models in Figure 3.10 are translog linear models of LY on LY(-1), LK, and LL by the factors or dichotomous variables Group and TP, and they can also be considered reduced models of the model in (2.20).
Corresponding to the sets of models of LY on LY(-1), LK, and LL in (3.11), the data analysis based on the fixed effects models (FEMs) can easily be done using the following alternative equation specifications (ES). Note that in practice we could have unexpected good fit model(s), because of the unpredictable impacts of the multicollinearity between the numerical independent variables. Based on the data in CES.wf1, we can consider the following alternative ES.
It is recognized that there are two fixed effects models as single-factor ANCOVA models, namely the cross-section fixed effects model (CSFEM), and the period fixed effects model (PEFEM).
Example 3.9 (GLS Cross-section FEM) Figure 3.7 presents the statistical results of two CSFEMs, using the following ESs, respectively. Based on these results, the following findings and notes are presented.
(1) The results are obtained using the manual stepwise selection method. At Stage-1 the set of interactions is inserted as independent variables. Since each of the interactions has a significant effect, the main factors/variables are inserted at Stage-2.
(2) Both models have (82-1) cross-section (firm) dummies. However, their coefficients are not presented. Compare this with the FEMs presented in the previous chapter. They are ANCOVA models with a single factor, namely the FIRM or FIRM_Code. So they present a set of 82 time-series multiple regressions having the same slopes, which is inappropriate in a theoretical sense.
(3) Based on the statistical results at Stage-2, the following notes and comments are presented.
3.1 Compared to the ANCOVA model (3.17), the CSFEM in (3.16) can be presented using the ES as follows:
LY = C(1)+C(2)*LY(-1)*LK*LL+C(3)*LY(-1)*LK+C(4)*LY(-1)*LL+C(5)*LK*LL
+C(6)*LY(-1)+C(7)*LK+C(8)*LL + [CX=F] (3.17)
Note that the term [CX=F] represents 81 cross-section dummy variables.
3.2 Even though two of the interactions have large p-values, the four interactions have a significant joint effect, because the null hypothesis H0: C(2)=C(3)=C(4)=C(5)=0 is rejected, based on the F-statistic F0 = 6.826003 with df = (4,2115) and p-value = 0.0000. In addition, all independent variables, namely the seven numerical and 81 dummy variables, have a significant joint effect, based on the F-statistic F0 = 111020.9 with p-value = 0.0000 (in the output). So, in a statistical sense, the hierarchical three-way interaction model is a good fit model, presenting that the linear effect of LY(-1) on LY depends on LK and LL.
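The joint test reported above is the standard restricted-versus-unrestricted F-test. A sketch on simulated data (the series and the single-restriction H0 are hypothetical; the reported test drops four interaction terms, so q = 4 there, but the recipe is identical):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x1, x2 = rng.normal(size=(2, n))
y = 1 + 0.8 * x1 + 0.5 * x2 + 0.3 * x1 * x2 + rng.normal(size=n)

def rss(y, X):
    """Residual sum of squares of the OLS fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

ones = np.ones(n)
X_full = np.column_stack([ones, x1, x2, x1 * x2])   # unrestricted model
X_rest = np.column_stack([ones, x1, x2])            # H0: interaction coefficient = 0

q = X_full.shape[1] - X_rest.shape[1]               # number of restrictions (1 here)
df2 = n - X_full.shape[1]
F = ((rss(y, X_rest) - rss(y, X_full)) / q) / (rss(y, X_full) / df2)
print(round(F, 2))                                  # a large F rejects H0
```

EViews computes the same statistic for the redundant-variables test, with df = (q, n - k) of the full model.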
3.3 In practice, however, a researcher might want to reduce the model. In this case, the trial-and-error method should be applied, which would be very subjective, because there are several alternative ways of deleting the independent variables one by one; for instance, he/she can either keep or delete the interaction LY(-1)*LK*LL, since the results at Stage-1 show it has a significant effect. Do this as an exercise.
3.4 This example again shows the unpredictable impact of the multicollinearity between the numerical independent variables. Similarly for the following example.
Example 3.10 (GLS Period FEM) Similar to the previous example, Figure 3.12 presents the statistical results of two PEFEMs using the ESs in (3.15) and (3.16), with a two-stage manual selection method. Based on these results, the following findings and notes are presented.
(1) Both models have (27-1) period (time-point) dummies. However, their coefficients are not presented. Compare this with the FEMs presented in the previous chapter. They are ANCOVA models with a single factor, namely the Period or Time-point. So they present a set of 27 cross-section multiple regressions having the same slopes, which is inappropriate in a theoretical sense.
(2) Based on the statistical results at Stage-2, the following notes and comments are presented.
2.1 Compared to the ANCOVA model (3.17), the PEFEM in (3.16) can be presented using the ES as follows:
LY = C(1)+C(2)*LY(-1)*LK*LL+C(3)*LY(-1)*LK+C(4)*LY(-1)*LL+C(5)*LK*LL
+C(6)*LY(-1)+C(7)*LK+C(8)*LL + [PER=F] (3.18)
Note that the term [PER=F] represents 26 period fixed effects (dummy variables).
2.2 Three out of the four interactions have large p-values, say greater than 0.30, but, unexpectedly, the null hypothesis H0: C(2)=C(3)=C(4)=C(5)=0 is rejected, based on the F-statistic F0 = 4.716930 with df = (4,2180) and p-value = 0.0009. In addition, all independent variables, namely the seven numerical and 26 period dummy variables, have a significant joint effect, based on the F-statistic F0 = 105779.0 with p-value = 0.0000. So, in a statistical sense, the hierarchical three-way interaction model is a good fit model, presenting that the linear effect of LY(-1) on LY depends on LK and LL.
2.3 Note that four out of the seven numerical independent variables have very large p-values – greater than 0.30. So it is recommended to explore at least two alternative reduced models. Do this as an exercise.
Example 3.11 (PLS Two-way FEM) Figure 3.13 presents the statistical results of two panel least squares (PLS) TWFEMs in (3.16), using the two-stage manual selection method. Based on these results, the following findings and notes are presented.
(1) Both models have a total of 107 dummies, namely (82-1) cross-section (firm) dummies and (27-1) period (time-point) dummies. However, their coefficients are not presented. Compare this with the FEMs presented in the previous chapter. They are ANCOVA models with two factors, namely Firm and Period or Time-point. So they present a set of homogeneous multiple regressions having the same slopes, which is inappropriate in a theoretical sense.
(2) Based on the statistical results at Stage-2, the following notes and comments are presented.
2.1 Compared to the ANCOVA model (3.17), the TWFEM in (3.16) can be presented using the following ES, for t = Time > 1.
LY = C(1)+C(2)*LY(-1)*LK*LL+C(3)*LY(-1)*LK+C(4)*LY(-1)*LL+C(5)*LK*LL
+C(6)*LY(-1)+C(7)*LK+C(8)*LL + [CX=F,PER=F] (3.19)
Note that the term [CX=F,PER=F] represents 26 period fixed effects (dummy variables) and 81 cross-section fixed effects (dummy variables). So the model presents a set of 82 (firms) × 27 (time-points) = 2,214 homogeneous regressions, that is, the seven numerical variables have the same slopes. Referring to the TWFEMs presented in Examples 2.7 and 2.8, this model is the worst ANCOVA model, in a theoretical sense.
2.2 Three out of the four interactions have large p-values, say greater than 0.30, but, unexpectedly, the null hypothesis H0: C(2)=C(3)=C(4)=C(5)=0 is rejected, based on the F-statistic F0 = 4.716930 with df = (4,2180) and p-value = 0.0009. In addition, all independent variables have a significant joint effect, based on the F-statistic F0 = 105779.0 with p-value = 0.0000. So, in a statistical sense, the hierarchical three-way interaction model is a good fit model, presenting that the linear effect of LY(-1) on LY depends on LK and LL.
LY = C(1)+C(2)*LY(-1)*LK*LL+C(3)*LY(-1)*LK+C(4)*LY(-1)*LL+C(5)*LK*LL
+C(6)*LY(-1)+C(7)*LK+C(8)*LL + [CX=R] (3.20)
LY = C(1)+C(2)*LY(-1)*LK*LL+C(3)*LY(-1)*LK+C(4)*LY(-1)*LL+C(5)*LK*LL
+C(6)*LY(-1)+C(7)*LK+C(8)*LL + [CX=R,PER=R] (3.22)
Example 3.12 (Three alternative TWREMs) Figure 3.14 presents the statistical results of the TWREM (3.22), and two of its reduced models. Based on these results, the following notes and comments are presented.
(1) Stage-1 and Stage-2 present the statistical results of a manual stepwise selection method, which can also be done using multiple stages.
(2) The results at Stage-1 show that the model is a good fit model, presenting that the effect of LY(-1) on LY depends on LK and LL. At Stage-2, however, the model is not a good fit, because six out of the seven independent variables have very large p-values. However, it is surprising that the null hypothesis H0: C(2)=C(3)=C(4)=C(5)=0 is rejected, based on the F-statistic F0 = 4.369235 with df = (4,2206) and p-value = 0.0016. So it can be concluded that the interactions have a significant joint effect on LY. Hence, in a statistical sense, a researcher should apply a manual multistage selection method to obtain an acceptable reduced model. Do this as an exercise.
(3) In general, the manual multistage selection method can be done as follows:
3.1 At the first step, do the data analysis based on the model with the interaction independent variables. If each of the interactions has a significant effect, say a p-value < 0.20 (or 0.30), go to the second step. Otherwise, a reduced model should be explored to obtain a model in which all interactions have p-values < 0.20. At this step, we have Model-1, presenting that the linear effect of LY(-1) on LY significantly depends on LK and LL.
Figure 3.14 Statistical results of the TWREM (3.22), and two of its reduced models
3.2 At the second step, insert all the main factors/variables LY(-1), LK, and LL as additional independent variables of Model-1. In this case, we obtain a model in which all interactions have very large p-values. So the three main factors cannot be inserted at once.
Figure 3.15 Statistical results of a translog linear TWREM, and three of its possible reduced models
Figure 3.16 Scatter graphs of (K,Y), and (L,Y) with their Kernel fit curves.
3.3 At the third step, three alternative models should be observed, by inserting each pair of
the variables LY(-1), LK, and LL in Model-1. If there is no acceptable model, go to the
following step. In this case, a good fit Model-3 is found.
3.4 At the fourth step, three alternative models should be observed, by inserting each of the
main variables LY(-1), LK, and LL in Model-1. If there is no acceptable model, then
Model-1 would be the final three-way nonhierarchical model.
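The multistage idea in steps 3.1 to 3.4 can be sketched as a simple backward-elimination loop; the data, variable names, and the 0.20 threshold below are illustrative assumptions, not the chapter's data set:

```python
import numpy as np
from scipy import stats

def backward_select(X, y, names, p_max=0.20):
    """Repeatedly drop the regressor with the largest two-sided p-value
    until every remaining regressor (except the intercept) has p < p_max."""
    names = list(names)
    while True:
        n, k = X.shape
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        s2 = resid @ resid / (n - k)
        se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
        pvals = 2 * stats.t.sf(np.abs(beta / se), df=n - k)
        cand = [(p, i) for i, p in enumerate(pvals) if names[i] != "const"]
        worst_p, worst_i = max(cand)
        if worst_p < p_max:
            return names
        X = np.delete(X, worst_i, axis=1)
        names.pop(worst_i)

# Invented data: x2 has no main effect, so it is a candidate for deletion.
rng = np.random.default_rng(0)
n = 500
x1, x2 = rng.standard_normal((2, n))
y = 1.0 + 2.0 * x1 + 0.5 * x1 * x2 + rng.standard_normal(n)
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
kept = backward_select(X, y, ["const", "x1", "x2", "x1*x2"])
print(kept)  # the strong effects x1 and x1*x2 survive at this threshold
```

This is only the mechanical part of the method; the text's point stands that the order of insertion and deletion is a theoretical judgment by the researcher.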
(4) As an additional illustration, Figure 3.15 presents the statistical results of a translog linear
TWREM of LY on LK and LL, and three of its possible reduced models. Based on these
additional results, the following notes are presented.
4.1 Note that Model-3 is a good fit reduced model, which is obtained by deleting the
independent variable LY(-1) with the smallest p-value = 0.0000.
4.2 Compared to the three models in Figure 3.14, which one would you choose as the best model?
4.3 In fact, we can develop additional reduced models of the three-way interaction model (3.16),
or Model-2 in Figure 3.14, such as hierarchical and nonhierarchical two-way interaction
models. Do this as an exercise.
4.4 However, in practice a researcher would never consider all possible models, using all possible
estimation methods, based on a selected set of variables. Therefore, the presented models
would be subjectively acceptable models, since we would never know the true population
model (Agung, 2009a, Section 2.14).
(5) This example has demonstrated an unexpected impact of multicollinearity.
(6) On the other hand, piecewise, polynomial, or nonlinear relationships between the numerical
variables might give a better fit for a data set. Note that Figure 3.16a presents the scatter
graph of (K,Y) with its nearest neighbor fit curve, which shows three groups of K (capital)
over time, for t = time > 14. So a better fit model should be piecewise for t > 14. In
addition, Figure 3.16b shows five groups of L (labor). Therefore, these scatter graphs show
that a continuous REM for all firm-year observations is inappropriate. However, it is
not an easy task to develop a piecewise model based on both independent variables K and
L. Refer to the various alternative models presented in the main book, and Agung (2011, 2009a).
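A piecewise specification of the kind suggested here can be sketched with a hinge term; the break at t = 14 and all data below are hypothetical:

```python
import numpy as np

# Piecewise-linear sketch with an assumed break at t = 14: the slope is
# allowed to change for t > 14 via the hinge term max(t - 14, 0).
rng = np.random.default_rng(1)
t = np.arange(1, 28, dtype=float)            # 27 periods, as in the text's panel
y = 0.5 * t + 1.5 * np.maximum(t - 14, 0) + rng.normal(0, 0.3, t.size)

X = np.column_stack([np.ones_like(t), t, np.maximum(t - 14, 0)])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b.round(2))  # intercept, base slope (~0.5), slope change after t=14 (~1.5)
```

Doing the same with two covariates (K and L), each with its own break, is exactly the difficulty the text refers to, since the breaks need not coincide.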
(7) The models presented in Figures 3.14 and 3.15 can easily be extended to models with
trend, and to models with time-related effects. Refer to the models presented in the
previous chapter. Do these as exercises.
3.4.3 Mixed Effects Models
3.4.3.1 Cross-Section Fixed and Period Random Model
The statistical results of the cross-section fixed and period random model, namely the
CSF_PERM, using the ES in (3.16), present the following equation, where [CX=F,PER=R]
indicates 81 cross-section fixed effects (dummy variables), and the parameter C(1) is the mean
of the period random variable, as presented in (3.5). So this model presents 82 homogeneous
regressions with 82 different intercepts, indicated by C(1) and the coefficients of the 81 cross-
section dummies. Therefore, this model can be considered a special one-way ANCOVA model by
the single factor FIRM, with a set of covariates.
LY = C(1)+C(2)*LY(-1)*LK*LL+C(3)*LY(-1)*LK+C(4)*LY(-1)*LL+C(5)*LK*LL
+C(6)*LY(-1)+C(7)*LK+C(8)*LL + [CX=F,PER=R] (3.23)
The statistical results of the cross-section random and period fixed model, namely the
CSR_PEFM, using the ES in (3.16), present the following equation, where [CX=R,PER=F]
indicates that the parameter C(1) is the mean of the cross-section random variable, as presented
in (3.3), together with 26 period fixed effects (dummy variables). So this model presents 27
homogeneous regressions with 27 different intercepts, indicated by C(1) and the coefficients of
the 26 dummies. Therefore, this model can be considered a special one-way ANCOVA model by
the single factor TIME (Period), with a set of covariates.
LY = C(1)+C(2)*LY(-1)*LK*LL+C(3)*LY(-1)*LK+C(4)*LY(-1)*LL+C(5)*LK*LL
+C(6)*LY(-1)+C(7)*LK+C(8)*LL + [CX=R,PER=F] (3.24)
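The dummy-variable reading of the fixed effects in (3.23) and (3.24) can be illustrated outside EViews with a least-squares-dummy-variable (LSDV) sketch; the four firms, coefficients, and single covariate below are invented for illustration:

```python
import numpy as np
import pandas as pd

# LSDV sketch: firm fixed effects as dummy variables plus a covariate,
# mirroring the "one-way ANCOVA by the single factor FIRM" reading.
rng = np.random.default_rng(2)
firms = np.repeat(["F1", "F2", "F3", "F4"], 25)      # 4 firms, 25 periods each
alpha = {"F1": 0.0, "F2": 1.0, "F3": -0.5, "F4": 2.0}
x = rng.standard_normal(firms.size)
y = np.array([alpha[f] for f in firms]) + 0.8 * x + rng.normal(0, 0.1, firms.size)

D = pd.get_dummies(pd.Series(firms), drop_first=True).to_numpy(float)  # 3 dummies
X = np.column_stack([np.ones(firms.size), x, D])     # common intercept + dummies
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(b[1], 2))  # slope of x, close to the true 0.8
```

With 4 firms this gives 4 homogeneous regressions with 4 intercepts, exactly as the text describes for 82 and 27 intercepts.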
Example 3.14 (Application of the CSF_PERM) Figure 3.17 presents the statistical results of
the model (3.23) and three of its reduced models. Based on these results, the following notes
and comments are made.
Figure 3.17 Statistical results of the CSF_PERM (3.23), and three of its reduced models
(1) As the result of the first stage of analysis, Model-1 is a good fit nonhierarchical model
showing that the effect of LY(-1) on LY depends on LK and LL, with 81 cross-section
dummies, because each of the numerical independent variables has a significant adjusted effect.
(2) Model-2 can also be considered a good fit model, because only one out of seven numerical
independent variables has a p-value > 0.30. So each of the others has either a positive or a
negative significant effect on LY. For instance, LK*LL has a negative significant adjusted
effect on LY with a p-value = 0.2573/2 = 0.12865 < 0.15 (Lapin, 1973).
(3) In general, LL, having the largest p-value, would be deleted to obtain a reduced model. In
this case, by deleting LL, an unexpected reduced Model-3 is obtained, where LK*LL has a
p-value = 0.5808. So this reduced model is worse than Model-2. This finding shows the
unpredictable impacts of the correlation and multicollinearity between the numerical
independent variables. Explore other reduced models as an exercise.
(4) Another unexpected reduced Model-4 is obtained by using the manual stepwise selection or
the trial-and-error method. See the manual stepwise selection presented in the previous example.
Based on the general ESs in (2.69), (2.72), (2.73), and (2.75) to (2.77), we can easily conduct
the data analyses based on all alternative models, using various GLS estimation methods. In
general, unexpected good fit model(s) would be obtained; refer to the previous example.
Corresponding to the ANCOVA model in (2.74), various GLS ANCOVA models can easily be
estimated, with all their possible reduced models. As extensions, the models with trend and the
models with trend-related effects (TRE) can also easily be developed. See the following examples.
adjusted effect on LY, based on the t-statistic of t0 = -1.318168 with a p-value = 0.1876/2 =
0.0938 < 0.10. So this model can also be considered a good fit or an acceptable
ANCOVA model, in both theoretical and statistical senses, representing that the effect of the
time T (Year) on LY depends on LY(-1), LK, and LL.
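The halving of the EViews two-sided Prob. into a one-sided p-value, used throughout these notes, can be reproduced directly; the residual degrees of freedom below are an assumption, since they are not quoted here:

```python
from scipy import stats

t0 = -1.318168
df = 2200                            # assumed residual df (not quoted in the text)
prob = 2 * stats.t.sf(abs(t0), df)   # EViews-style two-sided Prob., ~0.1876
print(round(prob / 2, 4))            # one-sided p-value, ~0.0938
```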
Figure 3.18 Statistical results of two weighted ANCOVA models with trend and TRE, respectively
(3) Note that either a positive or a negative adjusted effect of a numerical independent variable
on a dependent variable is unpredictable or unexpected, because of the unpredictable impact
of the correlations and multicollinearity between the independent variables.
(4) The analysis can easily be done using other weights, such as Cross-Section SUR, Period
Weights, and Period SUR, with specific coef covariance methods. However, we never know
which one would give the best possible fit.
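The idea behind the cross-section weights option is feasible GLS with one residual variance per cross-section; a minimal sketch on invented data (the mechanism only, not the EViews implementation itself):

```python
import numpy as np

# Sketch of "cross-section weights": estimate by OLS, compute a residual
# variance per cross-section, then re-estimate by weighted LS with weight
# 1/sigma_i for every observation of cross-section i (feasible GLS).
rng = np.random.default_rng(3)
n_firms, n_periods = 5, 30
firm = np.repeat(np.arange(n_firms), n_periods)
x = rng.standard_normal(firm.size)
sigma = np.array([0.2, 0.5, 1.0, 1.5, 2.0])[firm]    # heteroskedastic by firm
y = 1.0 + 0.7 * x + rng.normal(0, 1, firm.size) * sigma

X = np.column_stack([np.ones(firm.size), x])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b_ols
s = np.sqrt(np.array([np.mean(resid[firm == i] ** 2) for i in range(n_firms)]))
w = 1.0 / s[firm]
b_wls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
print(b_wls.round(2))  # close to the true (1.0, 0.7)
```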
(5) These ANCOVA models can easily be modified to models by Group and the time T, or to
models by Firm (Individuals) and TP. Do this as an exercise.
3.4.4.1 Application of Cross-Section Random-Effects Models (CSREMs)
Example 3.16 (GLS CSREM of the model in Figure 3.18b) As a modification of the model in
Figure 3.18b, Figure 3.19 presents the statistical results of a CSREM by Group and TP, with
time-related effects, using the Wallace and Hussain estimator of component variances. Based on
these results, the following findings and notes are presented.
(1) It is found that LY has a significant positive correlation with each of the independent
variables T, T*LY(-1), T*LK, and T*LL, with p-values = 0.0000. However, the results show
that each of the numerical variables T, T*LY(-1), and T*LK has a very large p-value, and
only T*LL has a positive significant adjusted effect, based on the t-statistic of t0 = 1.702689
with a p-value = 0.0888/2 = 0.0444 < 0.05. Based on these findings, the following notes and
comments are made.
1.1 These results show the unpredictable impacts of the multicollinearity among all the
independent variables.
1.2 In order to develop an acceptable reduced model in a statistical sense, we have to rank the
variables, from the one or the subset defined as the most important upstream (source or
cause) variable down to the least important, in a theoretical sense, or corresponding to the
specific objectives of the analysis. So the reduced model obtained would be highly
dependent on the researcher.
1.3 Suppose the main objective of the analysis is to study how the effects of LY(-1), LK, and
LL on LY depend on the time T; then the variable T and its interactions should be considered
the most important upstream variables. So the data analysis should be done using the manual
stepwise selection method (MSSM), which was demonstrated in the previous chapter. As an
additional illustration, see the following example.
Example 3.17 (An MSSM based on the CSREM in Figure 3.19) Since T*LY(-1), T*LK, T*LL,
and T are defined to be the most important upstream variables, Figure 3.20 presents the results
of the first stage of the regression analysis.
Figure 3.20 Statistical results of the first stage of analysis based on the model in Figure 3.19
Figure 3.21 Statistical results of the second stage of analysis by inserting LY(-1)
Since all numerical independent variables have Prob. < 0.10, additional upstream
variable(s) can be inserted at the second stage. However, the problem is which variable or
subset of variables should be inserted. We have several alternative selections. Do this as an exercise.
In this case, however, LY(-1) is inserted at the second stage, because the regression in
Figure 3.20 has an autocorrelation problem, indicated by its very small Durbin-Watson statistic
of 0.029307. The results in Figure 3.21 have a DW statistic of 1.683498. However, the
statistical results show that T*LY(-1), T*LK, and T have very large p-values. Then there is a
question whether LY(-1) should be used as an additional independent variable or not. For
comparison, I have found that Baltagi (2009a, b) presents many GLS models with small
Durbin-Watson statistics.
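The Durbin-Watson statistic used here to flag the autocorrelation problem is easy to compute from residuals; a small sketch with simulated residuals:

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum of squared first differences of the residuals over their
    sum of squares; near 2 means no first-order autocorrelation, near 0
    strong positive autocorrelation."""
    resid = np.asarray(resid, float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(4)
e_white = rng.standard_normal(2000)
e_ar = np.empty(2000)
e_ar[0] = e_white[0]
for t in range(1, 2000):                    # AR(1) residuals, rho = 0.98
    e_ar[t] = 0.98 * e_ar[t - 1] + 0.2 * rng.standard_normal()

dw_white = durbin_watson(e_white)
dw_ar = durbin_watson(e_ar)
# white noise is near 2; the AR(1) residuals fall far below 2,
# like the 0.029307 quoted in the text
print(round(dw_white, 1), round(dw_ar, 2))
```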
(1) The PEREM in Figure 3.22a is a nonhierarchical two-way interaction model by Group and
TP, because it does not have the main variables LY(-1) and LK as independent variables. It
is a good fit ANCOVA PEREM, because each of the numerical variables has a significant
effect on LY at the 1% level of significance.
(2) Compared to the regression in Figure 3.20, this regression also has a very small Durbin-
Watson statistic of 0.113181. For additional comparison, many GLS regressions presented
in Baltagi (2009a, b) have small Durbin-Watson statistics. So this model can be considered
an acceptable model.
(3) Similar to the second stage of analysis presented in the previous example, Figure 3.22b
presents the results obtained by inserting LY(-1) as an additional variable. The results show
that six out of seven numerical independent variables have Prob. < 0.30. So each of them has
either a positive or a negative significant effect on LY at the 15% level of significance, since
its p-value = Prob./2 < 0.15 (Lapin, 1973). Hence this regression is an acceptable model, in a
statistical sense. With respect to its DW statistic, this model is better than the model in
Figure 3.22a.
Figure 3.23 Statistical results of TWREMs, as the modification of the models in Figure 3.18
Figure 3.24 Statistical results of a reduced model of the TWREM in Figure 3.23b
1.2 Based on this regression, however, it is found that the effect of LY(-1) on LY significantly
depends on LK*LL, LK, LL, and the time T (Year), as defined in a theoretical sense,
because the null hypothesis H0: C(5)=C(6)=C(7)=C(12)=0 is rejected based on the F-statistic
of F0 = 8538.857 with df = (4,2202) and p-value = 0.0000.
1.3 The other conditional effects can also easily be tested, such as the effect of LK, or the effect
of LL, conditional on the other independent variables. Do this as an exercise.
1.4 On the other hand, in a statistical sense, an acceptable reduced model should be explored
using the trial-and-error method, by deleting one or more of the variables T, T*LY(-1), and
T*LK; the results are presented in Figure 3.24. Based on this regression, the following
unexpected finding is presented.
Unexpectedly, the two interactions T*LY(-1) and T*LL have an insignificant joint effect
on LY, based on the F-statistic of F0 = 1.533614 with df = (2,2202) and p-value = 0.2160,
even though each interaction has a significant adjusted effect at the 10% level of
significance.
I would say the reason for this unexpected conclusion is that T*LY(-1) has a significant
negative adjusted effect with p-value = 0.0801/2 = 0.04005, whereas T*LL has a significant
positive adjusted effect with p-value = 0.0840/2 = 0.0420, at the 5% level of significance.
In addition, it is found that the effect of LY(-1) on LY significantly depends on
LK*LL, LK, LL, and T, as expected, because the null hypothesis H0: C(5)=
C(6)=C(7)=C(11)=0 is rejected based on the F-statistic of F0 = 8662.553 with df =
(4,2202) and p-value = 0.0000.
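Joint hypotheses such as H0: C(5)=C(6)=C(7)=C(11)=0 are Wald F-tests on a restriction matrix; a generic sketch of the underlying computation, on invented data:

```python
import numpy as np
from scipy import stats

def wald_f_test(X, y, R):
    """F-test of H0: R @ beta = 0 in the OLS model y = X beta + e.
    Returns (F, p-value) with df = (q, n - k)."""
    n, k = X.shape
    q = R.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    s2 = np.sum((y - X @ beta) ** 2) / (n - k)
    Rb = R @ beta
    F = Rb @ np.linalg.solve(R @ XtX_inv @ R.T, Rb) / (q * s2)
    return F, stats.f.sf(F, q, n - k)

# Hypothetical check: two regressors with real effects, two pure-noise ones.
rng = np.random.default_rng(5)
n = 400
Z = rng.standard_normal((n, 4))
X = np.column_stack([np.ones(n), Z])
y = 1.0 + 2.0 * Z[:, 0] + 1.0 * Z[:, 1] + rng.standard_normal(n)
R = np.zeros((2, 5)); R[0, 1] = 1; R[1, 2] = 1   # H0: both real coefs are zero
F, p = wald_f_test(X, y, R)
print(p < 0.01)  # True: the joint effect is clearly significant
```

The text's paradox (individually significant terms with an insignificant joint effect, or the reverse) arises from the off-diagonal terms of R (X'X)^{-1} R' in this formula.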
Example 3.20 (Alternative additive REMs by Group and TP) Figure 3.25 presents the statistical
results of three additive PEREMs by Group and TP, using the Wallace and Hussain estimator of
component variances. Based on these results, the following notes and comments are presented.
Figure 3.25 Statistical results of three additive PEREMs of LY by Group and TP
(1) Model-1 is a translog linear PEREM of LY on LK and LL, by Group and TP, which can be
considered an extension of the Cobb-Douglas production function. This model can be
extended to a quadratic translog PEREM of LY on LK and LL, which can be considered an
extension of the CES production function (Agung, 2011, 2008). Note that each of LK and LL
has a significant positive adjusted effect on LY, but this model has a very small Durbin-
Watson statistic, which indicates an autocorrelation problem. This can be improved by using
lag(s) of LY. Do this as an exercise.
(2) Model-2 is a translog linear PEREM of LY on LK and LL, with trend, by Group and TP. An
additional analysis shows that the time T has a positive significant effect on LY, which
contradicts its negative significant adjusted effect in this model. For this reason, I conducted
the analysis based on the CSREM and the TWREM of the same model, with the results
presented in Figure 3.26. It happens that, at the 10% level of significance, both the CSREM
and the TWREM show the time T to have significant positive adjusted effects, with p-values
of 0.0626 = 0.1252/2 and 0.0803 = 0.1606/2, respectively.
So I would say that the CSREM and the TWREM are better fit models than the PEREM,
and the CSREM is better than the TWREM, because it has the greater adjusted R-squared.
(3) Model-3 is an LV(1) translog linear PEREM of LY on LK, LL, and LY(-1), with trend, by
Group and TP. Since each of the three variables LK, LL, and T has a very large Prob., then, in
a statistical sense, acceptable reduced model(s) should be explored. If LY(-1) should be kept
in the model, then one or more of the other independent variables should be deleted from the
model, using the trial-and-error method. Do this as an exercise.
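The translog linear reading of Model-1 is a log-linear Cobb-Douglas regression; a sketch with invented elasticities shows how regressing LY on LK and LL recovers them:

```python
import numpy as np

# Log-linear Cobb-Douglas sketch: LY = a + b*LK + c*LL + e, so the
# "translog linear" model of LY on LK and LL recovers the output elasticities.
rng = np.random.default_rng(6)
n = 1000
LK = rng.normal(3, 1, n)        # hypothetical log-capital
LL = rng.normal(2, 1, n)        # hypothetical log-labor
LY = 0.5 + 0.6 * LK + 0.3 * LL + rng.normal(0, 0.2, n)

X = np.column_stack([np.ones(n), LK, LL])
b, *_ = np.linalg.lstsq(X, LY, rcond=None)
print(b.round(2))  # close to the true (0.5, 0.6, 0.3)
```

The quadratic translog extension mentioned in the text simply adds LK^2, LL^2, and LK*LL columns to this design matrix.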
(a) presents the Hausman test for the TWREM in Figure 3.5, which also presents the tests for
both the CSREM and the PEREM, with special notes, such as "Cross-section test variance is
invalid. Hausman statistic set to zero", and a warning that robust standard errors may not be
consistent with the assumptions of the Hausman test variance calculation. So it can be
concluded that the Hausman test cannot be applied in this case.
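The "invalid variance" note can be understood from the classical Hausman formula: when the estimated variance difference V_FE - V_RE is not positive definite, the statistic is undefined, which is the situation the software flags. A sketch with invented numbers showing both outcomes:

```python
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Classical Hausman statistic H = d' (V_fe - V_re)^{-1} d, d = b_fe - b_re.
    Returns None when the variance difference is not positive definite,
    i.e. when the test variance is invalid."""
    d = np.asarray(b_fe) - np.asarray(b_re)
    dV = np.asarray(V_fe) - np.asarray(V_re)
    if np.any(np.linalg.eigvalsh(dV) <= 0):
        return None                      # variance difference invalid
    return float(d @ np.linalg.solve(dV, d))

# Hypothetical numbers only, to show both outcomes:
H_ok = hausman([0.9, 0.4], [0.8, 0.5], np.diag([0.02, 0.02]), np.diag([0.01, 0.01]))
H_bad = hausman([0.9, 0.4], [0.8, 0.5], np.diag([0.01, 0.02]), np.diag([0.02, 0.01]))
print(H_ok is not None, H_bad is None)  # True True
```

Under H0, H is chi-squared with as many df as tested coefficients; the invalid case arises in finite samples because the two variance matrices are estimated separately.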