Conclusion
nonresponse in survey data and presented evidence on the advantages and disadvantages of the
methods. It showed that when applying imputation procedures, it is important to consider the
type of analysis and the type of point estimator of interest. Whether the goal of the researcher is
to produce unbiased and efficient estimates of means, totals, proportions and official aggregated
statistics, or a complete data file that can be used for a variety of different analyses and by
different users, the researcher should first clearly identify the type of analysis and the type of
Anyone faced with having to make decisions about imputation procedures will usually
have to choose some compromise between what is technically effective and what is operationally
expedient. If resources are limited, this is a hard choice. It is to be hoped that this study might be
There are other issues to consider in determining which imputation method should be
used for a particular assumption. Several practical issues arose, among them the choice of
programming language, because no available software could generate imputations for all the
methods the researchers intended to use. Of all the methods, overall mean imputation was the
simplest to use and the easiest to program. The other three methods required
the formation of imputation classes. Both regression imputations were the hardest to program
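As a rough illustration of why overall mean imputation is the easiest of the four methods to
program, a minimal sketch follows. The function name `overall_mean_impute`, the use of `None`
to mark a missing value, and the sample figures are assumptions made for illustration only; they
are not taken from the researchers' programs or from the FIES data.

```python
from statistics import mean


def overall_mean_impute(values):
    """Replace every missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical values, not FIES figures.
incomes = [120.0, None, 80.0, 100.0, None]
imputed = overall_mean_impute(incomes)
```

No grouping step is needed here, whereas the other three methods would first have to assign each
record to an imputation class.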
observations was compared using the 1997 Family Income Expenditure Survey (FIES) data set.
A set of criteria was computed for each method, based on the data set with imputed values and
the data set with actual values, to find the best imputation method for this data set. The criteria
for judging the best method were the bias and variance estimates of the population mean of the
imputed data, the preservation of the distribution of the actual data, and the other measures of
The results show that the choice of imputation method significantly affected the resulting
estimates. The similarities between the two best methods, namely the deterministic and
stochastic regression imputation methods, were due in part to the adequacy and predictive power
of the models.
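The difference between the two regression methods can be sketched as follows. This is only an
illustration under stated assumptions: a single predictor, an ordinary least squares line fitted to
the respondents, and a random residual drawn from the respondents' own residuals. The model
actually fitted in the study and the way its residuals were generated are not given in this section,
and the names `fit_line` and `regression_impute` are hypothetical.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def regression_impute(x, y, stochastic=False, rng=None):
    """Fill missing y (None) with a fitted value, optionally plus a random residual."""
    rng = rng or random.Random()
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    intercept, slope = fit_line([p[0] for p in observed], [p[1] for p in observed])
    # Assumption: residuals are resampled from the respondents' fitted residuals.
    residuals = [yi - (intercept + slope * xi) for xi, yi in observed]
    completed = []
    for xi, yi in zip(x, y):
        if yi is None:
            yi = intercept + slope * xi          # deterministic part
            if stochastic:
                yi += rng.choice(residuals)      # stochastic part
        completed.append(yi)
    return completed


# Hypothetical data, not FIES figures.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, None, 7.9, 10.0]
det = regression_impute(x, y)
sto = regression_impute(x, y, stochastic=True, rng=random.Random(0))
```

Both variants share the same fitted line, so when the model predicts well the two sets of imputed
values stay close to each other.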
The bias and variance estimates of the population mean of the imputed data
varied considerably across imputation methods. Unexpectedly, the hot deck
imputation method produced the highest estimates for the majority of the nonresponse rates and
variables. Stochastic regression, on the other hand, was the best method on this particular
criterion, since in the majority of the tests it delivered relatively small biases and
variances.
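A minimal sketch of this criterion is given below, assuming that the bias is the difference
between the mean of the imputed data and the mean of the actual data, and that the variance of the
mean is estimated as s^2 / n as under simple random sampling. The estimators actually used in the
study, which may account for the FIES sampling design, are not stated in this section, and the
function names and figures are hypothetical.

```python
from statistics import mean, variance


def bias_of_mean(imputed, actual):
    """Mean of the imputed data minus mean of the actual data."""
    return mean(imputed) - mean(actual)


def variance_of_mean(values):
    """Estimated variance of the sample mean, s^2 / n (simple random sampling assumed)."""
    return variance(values) / len(values)


# Hypothetical values: the 2nd and 4th actual values were set to missing
# and filled with the mean of the two remaining observations (20.0).
actual = [10.0, 20.0, 30.0, 40.0]
imputed = [10.0, 20.0, 30.0, 20.0]

bias = bias_of_mean(imputed, actual)
var_imputed = variance_of_mean(imputed)
var_actual = variance_of_mean(actual)
```

In this toy case the imputed data give a negative bias and a smaller variance estimate than the
actual data.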
The distributions of the imputed data for each method were checked for preservation
of the distribution using the Kolmogorov-Smirnov goodness-of-fit test. Among the methods used in
this study, both regression imputation methods retained the distribution of the data, especially
deterministic regression imputation, which generated exactly the same distribution as the actual
data.
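The idea behind this check can be sketched with the two-sample Kolmogorov-Smirnov statistic,
the largest vertical gap between the empirical distribution functions of the actual and the
imputed data. This is an assumption about the form of the test: the section does not say whether
a one-sample or two-sample version was applied, and the sketch computes only the statistic, not a
p-value. The function name and the data are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for point in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, point) / len(a)   # share of a <= point
        cdf_b = bisect_right(b, point) / len(b)   # share of b <= point
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical values: two of the four actual values replaced by 2.5.
actual = [1.0, 2.0, 3.0, 4.0]
imputed = [1.0, 2.5, 2.5, 4.0]

d_same = ks_statistic(actual, actual)
d_imputed = ks_statistic(actual, imputed)
```

A statistic of zero means the two empirical distributions coincide; larger values indicate that
imputation has distorted the distribution.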
In the other tests of accuracy and precision, namely the mean deviation, mean absolute
deviation and root mean square deviation, the different methods provided mixed results at all
nonresponse rates. Some methods did not consistently and clearly yield good
results. Only half of the methods used provided good results on one particular criterion, namely
the preservation of the distribution of the data. In the other results, inconsistency was obviously
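The three measures named in this paragraph can be sketched as below, assuming the usual
definitions in which each imputed value is compared with the actual value it replaces. The exact
formulas used in the study are not reproduced in this section, so the definitions, the function
name and the figures are assumptions.

```python
from math import sqrt


def deviation_measures(imputed, actual):
    """Return (mean deviation, mean absolute deviation, root mean square deviation)."""
    if len(imputed) != len(actual) or not imputed:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [i - a for i, a in zip(imputed, actual)]
    n = len(diffs)
    mean_dev = sum(diffs) / n                       # signed errors can cancel out
    mean_abs_dev = sum(abs(d) for d in diffs) / n   # average size of the error
    rmsd = sqrt(sum(d * d for d in diffs) / n)      # penalizes large errors more
    return mean_dev, mean_abs_dev, rmsd


# Hypothetical imputed values and the actual values they replaced.
md, mad, rmsd = deviation_measures([12.0, 18.0, 33.0], [10.0, 20.0, 30.0])
```

Because the mean deviation lets positive and negative errors cancel, a method can look good on
it and poor on the other two, which is one way mixed results can arise.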
Given the criteria and procedures for judging the best imputation procedure from the set of
four methods, the selection of the best method was difficult. Consequently, to determine
the best method of imputing nonresponse observations for each variable in the study, the methods
were ranked according to several criteria. A method received a rank value of 1 if it ranked first and
regression and stochastic regression imputation methods gave outstanding results. They
alternated between first and second rank in the majority of the criteria. The researchers concluded
that the stochastic regression imputation procedure was the best imputation method for this
study.
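The ranking procedure can be sketched as follows, assuming that a smaller value is better on
every criterion, that ranks run from 1 for the best method, and that ranks are summed across
criteria without special handling of ties. These are assumptions, since the full ranking rule is
not given in this section. The method labels and scores are invented for illustration and are not
results of the study.

```python
def rank_methods(scores):
    """scores maps criterion -> {method: value}; smaller values rank better."""
    totals = {}
    for per_method in scores.values():
        ordered = sorted(per_method, key=per_method.get)
        for rank, method in enumerate(ordered, start=1):
            totals[method] = totals.get(method, 0) + rank
    # Lowest rank total first.
    return sorted(totals.items(), key=lambda item: item[1])


# Invented scores for four unnamed methods.
scores = {
    "absolute_bias": {"method_a": 0.9, "method_b": 1.2, "method_c": 0.3, "method_d": 0.2},
    "variance": {"method_a": 0.8, "method_b": 1.1, "method_c": 0.4, "method_d": 0.3},
    "ks_statistic": {"method_a": 0.5, "method_b": 0.3, "method_c": 0.1, "method_d": 0.2},
}
ranking = rank_methods(scores)
```

In this invented example two methods trade first and second place across criteria, and the rank
totals settle which one comes out ahead overall.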
determination of the model and the random residual added to the deterministic imputed value.
The random residuals added to the deterministic imputation provided a change in making the
The deterministic regression imputation method performed much better than the hot deck
imputation method. It is surprising that the hot deck imputation method was less efficient than
deterministic regression, since in the related studies it emerged as the better method of the
two. Most likely, the selection of donors with replacement, and not the imputation classes, was
the cause of this poor performance. If the imputation classes were the cause, then the estimates
of both regression imputation methods would have been as poor as those of hot deck imputation,
even if the model is adequate.
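Donor selection with replacement can be sketched as below. The sketch assumes that each
record carries an imputation class label, that a missing value is marked with `None`, and that a
donor is drawn at random from the respondents in the same class and remains available for later
draws. The class labels, figures and the function name `hot_deck_impute` are hypothetical, and
the study's actual donor selection rules are not detailed in this section.

```python
import random


def hot_deck_impute(records, rng):
    """records is a list of (imputation_class, value or None) pairs."""
    donors = {}
    for cls, value in records:
        if value is not None:
            donors.setdefault(cls, []).append(value)
    completed = []
    for cls, value in records:
        if value is None:
            # With replacement: the chosen donor stays in the pool.
            # A class with no respondents raises KeyError here.
            value = rng.choice(donors[cls])
        completed.append((cls, value))
    return completed


# Hypothetical records, not FIES figures.
records = [
    ("urban", 50.0), ("urban", None), ("urban", 70.0),
    ("rural", 20.0), ("rural", None), ("rural", None),
]
completed = hot_deck_impute(records, random.Random(1))
```

In the hypothetical "rural" class the single respondent donates to both nonrespondents, which
shows how reusing donors can concentrate imputed values on a few observed ones.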