Revista Română de Statistică Trim. I/2013 - Supliment 38
Non-parametrical Estimation of the Regression used
in Economic Analyses

Prof. Constantin ANGHELACHE PhD
Artifex University of Bucharest
Academy of Economic Studies, Bucharest
Prof. Gabriela Victoria ANGHELACHE PhD
Academy of Economic Studies, Bucharest
Prof. Liviu BEGU PhD
Academy of Economic Studies, Bucharest
Georgeta BARDAU PhD Student
Academy of Economic Studies, Bucharest


Abstract
Non-parametric methods are useful, but they raise some problems. In
practice, they require a large number of observations and are applied to a
relatively small number of explanatory variables. Moreover, the result is
sensitive to the choice of the smoothing parameter and, to a lesser extent, to
the choice of the kernel (nucleus). They also pose a problem for the
presentation of results, which cannot be contained in a compact formula but can
only be described by graphs. A non-parametric analysis does not allow
extrapolation outside the range of observation, but from an econometric point
of view this is an advantage.
Key words: non-parametric methods, variables, regression
function, appraisal
JEL Classification: C01, C51

General aspects
Unlike other fields, economic theory rarely specifies functional forms;
usually it provides only a list of the relevant variables that explain a
phenomenon. The form of the relation is then determined, to a great extent, by
an empirical search for a model that works well. A first level of analysis
consists of writing down a model (linear, log-linear, non-linear etc.) and
performing the estimation without taking into account its approximate nature.
A second approach consists of specifying a parametric model whose incorrect
specification is explicitly acknowledged. This leads, for instance, to
correcting the expression of the variances or to selecting among misspecified
models.

In practice, these concerns can be addressed by adopting a non-parametric
approach to estimating the regression, in which the data themselves select the
form of the function to be built.
Various methods for estimating a non-parametric regression have been
developed and are now commonly used. Here we consider the kernel (nucleus)
method, which is simple, although in certain situations it is dominated by
other approaches.
Non-parametric methods are useful, but they raise certain problems. In
practice, they require a large number of observations and apply to a
relatively small number of explanatory variables. Moreover, the outcome is
sensitive to the choice of the smoothing parameter and, to a lesser extent, to
the choice of the kernel. Presenting the results is also a problem, since they
cannot be summarized in a compact formula and can only be described by graphs.
A non-parametric analysis does not allow extrapolation outside the observation
domain but, from the econometric point of view, this is an advantage.
To remedy some of these difficulties, semi-parametric methods have been
developed whose purpose is to estimate only certain characteristics of the
regression or to constrain the regression function to satisfy certain
conditions. The dimension of the problem is thus reduced and results are
easier to obtain. At the same time, structural restrictions can be imposed on
the model.
We begin with the standard kernel estimation of the regression and then
discuss certain problems of estimating specific characteristics of the
regression and of estimation under constraints.
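As an illustration of what a standard kernel estimate of the regression looks like, the following is a minimal sketch that assumes the Nadaraya-Watson form (a kernel-weighted local average of the observed y). The function names, the Gaussian kernel, the simulated data and the bandwidth value are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def nadaraya_watson(z_eval, z, y, h, kernel=gaussian_kernel):
    """Kernel estimate of g(z) = E(y | z) at the points z_eval.

    z, y : 1-D arrays of observations; h : bandwidth (smoothing parameter).
    """
    z_eval = np.atleast_1d(np.asarray(z_eval, dtype=float))
    # weights[i, j] = K((z_eval[i] - z[j]) / h)
    weights = kernel((z_eval[:, None] - z[None, :]) / h)
    # Weighted average of y with weights normalised to sum to one.
    return weights @ y / weights.sum(axis=1)

# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
z = rng.uniform(0.0, 1.0, size=500)
y = np.sin(2.0 * np.pi * z) + 0.2 * rng.normal(size=500)
grid = np.linspace(0.1, 0.9, 9)
g_hat = nadaraya_watson(grid, z, y, h=0.05)
```

The data alone determine the shape of `g_hat`; no functional form is imposed, which is the point made in the text. The result depends visibly on `h`, much less on the kernel.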

Bandwidths for the individual variables
The previous expression is transformed as follows. In the variance terms,
$h_n^q$ becomes $\prod_{j=1}^{q} h_{jn}$.
In addition, the same argument as the one applied to the density can be
used to choose the bandwidth and the kernel. We can use the expression of the
asymptotic mean integrated squared error (AMISE) to derive the optimal
bandwidth at fixed $z$.
This calculation requires $g$ and $f$ to be known. We can proceed by first
estimating $f$ and $g$ with an initial pair of bandwidth values and then using
these estimates to improve the bandwidth.
This procedure is rather delicate because it requires estimating the
derivatives, which converge slowly (and need a large sample), and the
conditional variance. This plug-in method has also been extended to the
selection of a specific bandwidth for each explanatory variable.
After replacing the bandwidth by its optimal value, we can look for an
optimal kernel, which is the Epanechnikov kernel, as in the case of density
estimation.
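A regression estimate with one bandwidth per explanatory variable can be sketched as follows. This is only an illustration: it assumes a product of univariate Epanechnikov kernels and the Nadaraya-Watson form, and the function names, data and bandwidth values are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def nw_product_kernel(z_eval, Z, y, h):
    """Nadaraya-Watson estimate with a product kernel.

    Z      : (n, q) array of observations of the q explanatory variables
    z_eval : (m, q) array of evaluation points
    h      : length-q vector, one bandwidth per explanatory variable
    """
    Z = np.asarray(Z, dtype=float)
    z_eval = np.atleast_2d(np.asarray(z_eval, dtype=float))
    h = np.asarray(h, dtype=float)
    # u[i, k, j] = (z_eval[i, j] - Z[k, j]) / h[j]
    u = (z_eval[:, None, :] - Z[None, :, :]) / h
    # Product over the q variables; the factor 1 / prod(h) cancels in the ratio.
    weights = epanechnikov(u).prod(axis=2)
    denom = weights.sum(axis=1)
    # Evaluation points with no observation inside the window give NaN.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, weights @ y / safe, np.nan)

# Simulated data with q = 2, purely for illustration.
rng = np.random.default_rng(0)
Z = rng.uniform(size=(2000, 2))
y = Z[:, 0] + Z[:, 1] ** 2 + 0.1 * rng.normal(size=2000)
est = nw_product_kernel([[0.5, 0.5]], Z, y, h=[0.1, 0.2])
```

Because the kernel has compact support, points far from the data have no neighbours and return NaN, which reflects the remark that a non-parametric analysis does not extrapolate outside the observation domain.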


An alternative approach to selecting the optimal bandwidth is the so-called
cross-validation method. The criterion does not require $g$ or $f$ to be known
and can be minimized numerically with respect to $h_n$ over a given interval.
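A common form of cross-validation is the leave-one-out criterion, sketched below. The assumptions are loud ones: the criterion is taken to be the mean squared error of predicting each y_i from all the other observations, the kernel is Gaussian, and the search is a simple grid search. The function names, data and grid are illustrative.

```python
import numpy as np

def loo_cv_score(h, z, y):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    u = (z[:, None] - z[None, :]) / h
    w = np.exp(-0.5 * u**2)       # Gaussian kernel; its constant cancels
    np.fill_diagonal(w, 0.0)      # observation i is not used to predict y[i]
    pred = w @ y / w.sum(axis=1)
    return np.mean((y - pred) ** 2)

def select_bandwidth(z, y, grid):
    """Return the bandwidth in `grid` with the smallest criterion."""
    scores = np.array([loo_cv_score(h, z, y) for h in grid])
    return grid[np.argmin(scores)], scores

# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
z = rng.uniform(0.0, 1.0, size=300)
y = np.sin(2.0 * np.pi * z) + 0.2 * rng.normal(size=300)
grid_h = np.linspace(0.01, 0.5, 50)
h_cv, scores = select_bandwidth(z, y, grid_h)
```

Only the data enter the criterion, so no preliminary estimate of the unknown functions is needed, in contrast with the plug-in approach.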
The AMISE calculation rests on two conditions: the fact that

$$\int u\,K(u)\,du = 0 \quad \text{but} \quad \int u^2 K(u)\,du \neq 0,$$

and the twice differentiability of the density of the observations. The
distance between $g$ and $\hat{g}_n$, measured by the AMISE, can be reduced by
assuming differentiability of a higher order or by selecting $K$ so that:

$$\int u^j K(u)\,du = 0 \quad \text{for } 1 \le j < r, \qquad \int u^r K(u)\,du \neq 0.$$

In this case, the $r$ in this formula is called the order of the kernel $K$.
Note that when $K$ is a probability density ($K$ non-negative), $r$ equals 2.
The bias term of the AMISE is then equal, up to a multiplicative constant,
to $h_n^{2\min(s,r)}$, where $s$ is the order of differentiability and $r$ is
the order of the kernel.
The disadvantage of higher-order kernels, of order greater than 2, is that
they are no longer densities and the estimated densities can be negative, at
least in small samples.
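The notion of kernel order can be checked numerically. The sketch below assumes the order is the index of the first non-zero moment of the kernel, and compares the Epanechnikov kernel with one polynomial fourth-order kernel chosen here purely as an example. All names are illustrative.

```python
import numpy as np

def epanechnikov(u):
    """Second-order kernel; it is a probability density."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def fourth_order_kernel(u):
    """Example polynomial kernel of order 4 on [-1, 1]; not a density."""
    poly = (15.0 / 32.0) * (3.0 - 10.0 * u**2 + 7.0 * u**4)
    return np.where(np.abs(u) <= 1.0, poly, 0.0)

def kernel_moment(kernel, j, n_grid=200001):
    """Trapezoidal approximation of the integral of u^j K(u) over [-1, 1]."""
    u = np.linspace(-1.0, 1.0, n_grid)
    f = u**j * kernel(u)
    du = u[1] - u[0]
    return du * (f.sum() - 0.5 * (f[0] + f[-1]))

def kernel_order(kernel, max_order=8, tol=1e-6):
    """Smallest j >= 1 whose moment is non-zero (up to tol), else None."""
    for j in range(1, max_order + 1):
        if abs(kernel_moment(kernel, j)) > tol:
            return j
    return None

order_epa = kernel_order(epanechnikov)
order_k4 = kernel_order(fourth_order_kernel)
value_near_edge = fourth_order_kernel(np.array([0.9]))[0]
```

The fourth-order kernel integrates to one but is negative near the edge of its support (`value_near_edge` is below zero), which is why estimates built from it can themselves be negative.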
When $h_n$ equals its optimal choice, the convergence rate $\sqrt{n h_n^q}$
becomes the optimal non-parametric rate of convergence in dimension $q$, which
can be compared with the usual parametric rate, namely $\sqrt{n}$. One can
check that the gap between the two rates indeed widens as $q$ increases.
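The rate can be made explicit with the usual bias-variance balance. The following is a sketch under assumptions not stated in the text: the variance term of the AMISE is taken to be of order $1/(n h_n^q)$, the bias term of order $h_n^{2\min(s,r)}$, and $m = \min(s, r)$ is shorthand introduced here.

```latex
\begin{align*}
\mathrm{AMISE}(h_n) &\asymp h_n^{2m} + \frac{1}{n h_n^{q}}, \qquad m = \min(s, r), \\
h_n^{*} &\propto n^{-1/(2m+q)}, \\
\sqrt{n\,(h_n^{*})^{q}} &\propto n^{m/(2m+q)}, \\
m = 2: \quad \sqrt{n\,(h_n^{*})^{q}} &\propto n^{2/(q+4)}.
\end{align*}
```

Under these assumptions the exponent $m/(2m+q)$ is always below $1/2$ and decreases with $q$: for $m = 2$ it equals $2/5$ when $q = 1$ and $2/9$ when $q = 5$, which illustrates how the gap with the parametric rate $\sqrt{n}$ grows with the number of explanatory variables.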
To use this result in practice, we must estimate the density and the
conditional variance. The density is estimated by the kernel method and so,
similarly, is the conditional variance.

Estimation of a transformation of the regression function
Instead of estimating the regression function, we can analyze a
transformation of this function. The choice of this transformation is
motivated by the economic analysis, which defines the parameters of interest.
Obviously, there are many transformations that can be considered, but we shall
focus on a specific class characterized by the relation:

$$\lambda = \int g(z)\,w(z)\,dz.$$

In this formula, $g(z) = E(\tilde{y} \mid \tilde{z} = z)$, and $w(z)$ is a
weight function which is either scalar or vectorial and satisfies $w(z) = 0$
if $f_{marg}(z) = 0$, which is natural since $g(z)$ is defined only if
$f_{marg}(z) > 0$. The parameter of interest $\lambda$ is scalar or vectorial.
This class of transformations is justified by the properties of the resulting
estimator $\hat{\lambda}$ and also by its relevance to many issues of applied
econometrics, which are special cases of this analysis.
Before entering into details, we note that this transformation does not
introduce over-identifying conditions on the distribution of the variables.
We first consider the estimation of the mean of the regression derivatives.
We have seen that the parametric estimation of a misspecified regression does
not allow us to consistently estimate the derivatives of this function at a
given point. In many econometric problems, the derivatives are the parameters
of interest. Their non-parametric estimation is possible, but its rate of
convergence is very slow and it consequently requires a large sample.
Nevertheless, in many applications it is enough to estimate the mean of the
regression derivatives, namely:

$$\lambda = \int \partial^{\alpha} g(z)\,v(z)\,dz,$$

where $\alpha$ is a multi-index of differentiation and $\partial^{\alpha}$ is
the derivative defined by this multi-index. The function $v(z)$ is a density
on the explanatory variables, which can be equal to $f_{marg}(z)$, the density
of the explanatory variables actually studied. We now analyze the
subadditivity test. To illustrate this
situation, let us assume that $C$ is the cost function, which associates an
expected cost with the quantities $z$ of the different products. Economic
theory is interested in the subadditivity of $C$, namely:

$$C\Big(\sum_{j=1}^{p} z_j\Big) \le \sum_{j=1}^{p} C(z_j),$$

which means that the cost of a company producing $\sum_{j=1}^{p} z_j$ is lower
than the total cost of several companies each producing $z_j$. The above
property must hold for each $p$ and each sequence $(z_1, \ldots, z_p)$. It is
easy to show that this property is equivalent to the following one. If $\mu$
denotes the density of $(z_1, \ldots, z_p)$, $\tilde{\mu}$ the density of the
sum $z_1 + \ldots + z_p$ and $\mu_j$ the density of $z_j$, then subadditivity
is equivalent to the fact that, for each $\mu$, we have:

$$\int C(z)\,\tilde{\mu}(z)\,dz \le \sum_{j=1}^{p} \int C(z)\,\mu_j(z)\,dz.$$

The converse follows by considering distributions on $(z_1, \ldots, z_p)$
concentrated at a single point. We now turn to the subadditivity test. The
previous relation suggests defining a parameter, denoted here by $\lambda$,
namely:

$$\lambda = \int C(z)\Big(\tilde{\mu}(z) - \sum_{j=1}^{p} \mu_j(z)\Big)\,dz,$$

the sign of this parameter having to be tested.
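The parameter whose sign is tested can be illustrated by simulation. The sketch below makes several assumptions that are not in the paper: the parameter is taken to be the expected cost of joint production minus the sum of the expected costs of separate production, the cost function is known rather than estimated, each firm produces a single scalar quantity, and the quantities are drawn from a uniform distribution. All names are hypothetical.

```python
import numpy as np

def subadditivity_parameter(cost, z_draws):
    """Monte Carlo value of E[C(z_1 + ... + z_p)] - sum_j E[C(z_j)].

    cost    : vectorised cost function C (assumed known here)
    z_draws : (n, p) array, each row one draw of (z_1, ..., z_p)
    """
    z_draws = np.asarray(z_draws, dtype=float)
    joint_cost = cost(z_draws.sum(axis=1))      # one firm producing the total
    separate_cost = cost(z_draws).sum(axis=1)   # p firms producing z_j each
    return np.mean(joint_cost - separate_cost)

# Simulated quantities for p = 3 firms, purely for illustration.
rng = np.random.default_rng(1)
z_draws = rng.uniform(0.5, 2.0, size=(10_000, 3))
lam_concave = subadditivity_parameter(np.sqrt, z_draws)    # sub-additive cost
lam_convex = subadditivity_parameter(np.square, z_draws)   # super-additive cost
```

With this sign convention a sub-additive cost gives a non-positive value and a super-additive cost a positive one. In an application the cost function would be replaced by a non-parametric estimate of the regression, and the sampling variability of the estimate would have to be taken into account before drawing a conclusion about the sign.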
The estimation of $\lambda$ can be carried out in two ways.
The first consists of estimating $g$ and then computing $\lambda$ from the
relation that defines it.
The second approach avoids the estimation of $g$ and is based on the
following property:

$$\lambda = E\left(\frac{\tilde{y}\,w(\tilde{z})}{f_{marg}(\tilde{z})}\right),$$

which can be estimated by a sample mean when $f_{marg}$ is known. This
condition is seldom satisfied. We can replace $f_{marg}$ with a parametric or
non-parametric estimate.
Implicitly, we assume that $w$ is given. In practice, $w$ can be partially or
totally unknown (since it is, for instance, a function of $f_{marg}$) and thus
$w$ must be replaced by an estimate.
A trimming procedure is sometimes introduced, which consists of eliminating
the data located at the boundary of the support of the distribution of the
explanatory variables. The trimming can be incorporated into the function $w$
in the form of a multiplicative indicator function.
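The two estimation routes can be compared on simulated data. This is a sketch under explicit assumptions: the parameter of interest is taken to be the integral of g(z) w(z), the weight w is a hypothetical indicator of an interior interval (which also plays the role of trimming), the marginal density is replaced by a Gaussian kernel estimate, and the data-generating process and bandwidth are invented for the illustration.

```python
import numpy as np

LO, HI = 0.2, 0.8   # hypothetical support of the weight function w

def gauss(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(z_eval, z, h):
    """Kernel estimate of the marginal density of z at the points z_eval."""
    return gauss((z_eval[:, None] - z[None, :]) / h).mean(axis=1) / h

def nw(z_eval, z, y, h):
    """Nadaraya-Watson estimate of g at the points z_eval."""
    k = gauss((z_eval[:, None] - z[None, :]) / h)
    return k @ y / k.sum(axis=1)

def weight(z):
    """Indicator of [LO, HI]: observations near the boundary are dropped."""
    return ((z >= LO) & (z <= HI)).astype(float)

def lambda_plug_in(z, y, h, n_grid=301):
    """Route 1: estimate g, then integrate g_hat * w (w = 1 on [LO, HI])."""
    grid = np.linspace(LO, HI, n_grid)
    f = nw(grid, z, y, h)
    du = grid[1] - grid[0]
    return du * (f.sum() - 0.5 * (f[0] + f[-1]))

def lambda_sample_mean(z, y, h):
    """Route 2: average of y_i * w(z_i) / f_hat(z_i); g is never estimated."""
    return np.mean(y * weight(z) / kde(z, z, h))

# Simulated data: z uniform on [0, 1], g(z) = z^2, so lambda = 0.168.
rng = np.random.default_rng(2)
z = rng.uniform(0.0, 1.0, size=2000)
y = z**2 + 0.1 * rng.normal(size=2000)
lam_1 = lambda_plug_in(z, y, h=0.05)
lam_2 = lambda_sample_mean(z, y, h=0.05)
```

Both routes average over many observations, so the resulting estimate of the scalar parameter is far less variable than a pointwise estimate of g.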

The main asymptotic result is the $\sqrt{n}$ convergence rate of
$\hat{\lambda}$ to $\lambda$. Indeed, we know that
$\sqrt{n}\,(\hat{\lambda} - \lambda)$ is asymptotically normal with mean zero,
under regularity conditions and provided that the bandwidths have an adequate
asymptotic behavior.
To limit the dimensionality problem or to impose certain restrictions
originating in economic theory, we often assume that the conditional
expectation $g(z)$, which is a function of $q$ variables, depends in fact on
functions of a smaller number of variables and, possibly, on certain
parameters. There are two points of view here: either we assume that $g$ is
actually restricted to this specific form, or we search for the best
approximation of $g$ by an element satisfying the considered restrictions.


