

1 Econometrics and economic data

Ezequiel Uriel
University of Valencia
Version: 09-2013
1.1 What is econometrics?
1.2 Steps in developing an econometric model
1.3 Economic data
1.1 What is econometrics?
First, let us see something about the origin of econometrics as a discipline. The
term econometrics is believed to have been coined by Ragnar Frisch, co-winner of the
first Nobel Prize in Economic Sciences in 1969, along with fellow econometrician Jan
Tinbergen. Both of them were founders of the Econometric Society in 1930. Section I
of the constitution of this society states:
The Econometric Society is an international society for the advancement of
economic theory in its relation to statistics and mathematics. Its main object shall be to
promote studies that aim at a unification of the theoretical-quantitative and the empirical-
quantitative approach to economic problems and that are penetrated by constructive and
rigorous thinking similar to that which has come to dominate the natural sciences.
In the first issue of Econometrica (1933), the Econometric Society journal,
Ragnar Frisch gives us an explanation about the meaning of econometrics:
But there are several aspects of the quantitative approach to economics, and no single
one of these aspects, taken by itself, should be confounded with econometrics. Thus,
econometrics is by no means the same as economic statistics. Nor is it identical with what we
call general economic theory, although a considerable portion of this theory has a definitely
quantitative character. Nor should econometrics be taken as synonymous with the application
of mathematics to economics. Experience has shown that each of these three viewpoints, that
of statistics, economic theory, and mathematics, is a necessary, but not by itself a sufficient
condition for a real understanding of the quantitative relations in modern economic life. It is
the unification of all three that is powerful. And it is this unification that constitutes
econometrics.
Today, we would also say that econometrics is the combined study of economic
models, mathematical statistics, and economic data. Within the field of econometrics,
econometric theory can be distinguished from applied econometrics.
Econometric theory concerns the development of tools and methods, and the
study of the properties of econometric methods. Econometric theory belongs to the field
of statistics.
Applied econometrics is a term describing the development of quantitative
economic models and the application of econometric methods to these models using
economic data. Applied econometrics is mainly used in the field of applied economics.
What are the goals of Econometrics? We are going to examine three:
1) Knowledge of the real economy. Econometric methods allow us to estimate
economic magnitudes such as the marginal propensity to consume or the
elasticity of labor with respect to output. These estimates refer to a
specific time and place: for example, Spain in the last quarter of the
20th century. In addition to estimation, in which numerical values are
obtained, econometric methods allow us to test hypotheses; for
example, in a production function, is the hypothesis of constant returns to
scale admissible?
2) Economic policy simulation. Econometric methods can be used to simulate
the effects of alternative policies. For example, with an appropriate
econometric model we could see, in quantitative terms, how different
increases in the tobacco tax affect the consumption of tobacco.
3) Prediction or forecasting. Very often econometric methods are used to predict
values of economic variables in the future. By making predictions we try to
reduce our uncertainty about the future of the economy. This is not an easy task,
since in general the predictions are only satisfactory when there are no drastic
changes in the economy. Although it would be useful to be able to predict
these drastic changes accurately, both econometric and other alternative
methods tend to be imprecise.
1.2 Steps in developing an econometric model
There are three main steps in developing an econometric model: specification,
estimation and validation.
While in a first approximation these stages follow a sequential order, in
econometric analysis it is generally necessary to go back more than once within this
sequence. It is necessary to continuously confront the model with the data and any other
information source, in order to obtain an econometric model compatible with the data.
The model can be used to analyze reality, offer better predictions or constitute a good
basis for making decisions. Now we will describe the steps listed above.
(a) Specification
In this first step, the model or models used must be defined, as well as data to be
used in the estimation stage.
In the specification step, we will refer to four elements: the economic model, the
econometric model, the statistical assumptions of the model and the data. In this section
we will refer to the first three elements; in the following section we will examine
different types of data used in econometric analysis.
The first element we need is an economic model. In some cases, a formal
economic model is constructed entirely using economic theory. In other cases,
economic theory is used less formally in constructing an economic model.
After we have an economic model, we must convert it into an econometric
model. We are going to see that with two examples.
EXAMPLE 1.1 Keynesian consumption function
Keynes formulated his well-known consumption function in three propositions:
Proposition 1: Consumption is a function of income, and both variables are measured in real
terms. Measuring the variables in real terms means that when consumers decide the proportion of
income devoted to consumption, they are not affected by money illusion.
Analytically, proposition 1 can be expressed in the following way:
$$cons = f(inc) \tag{1-1}$$
Proposition 2: Consumption is an increasing function of income, but an increase in income
always causes a smaller increase in consumption.
This proposition implies that the marginal propensity to consume is greater than 0 (consumption is an
increasing function of income) but smaller than 1 (an increase in income always causes a smaller
increase in consumption).
Analytically, proposition 2 can be expressed in the following way:
$$0 < \frac{d\,cons}{d\,inc} < 1 \tag{1-2}$$
Proposition 3: The proportion of income consumed is smaller when income increases. That is to
say, the proportion of the last euro earned devoted to consumption is smaller than the proportion of total
income earned devoted to consumption.
Analytically, proposition 3 can be expressed in the following way:
$$\frac{d\,cons}{d\,inc} < \frac{cons}{inc} \tag{1-3}$$
In other words, the marginal propensity to consume is smaller than the average propensity to
consume.
These three propositions constitute an economic model: the Keynesian consumption function.
To estimate and test this model we must convert it into an econometric model. For this
conversion, two requirements must be met.
According to the first requirement, it is necessary to specify the mathematical form of the
function. The linear function has been used in this case because, in addition to being simple, it is
compatible with the description made by Keynes.
In order to justify the second requirement, it must be taken into account that the model
formulated in proposition 1 is deterministic. That is to say, income is the only factor in the determination
of consumption. But in real life there are many other factors, other than income, which have an influence
on consumption. In an econometric model, all factors other than the included independent variables
are gathered in a variable called the random disturbance or error (u). The second requirement
is the introduction of this error term in the equation.
In general, all the relevant factors must be introduced explicitly in the econometric model; all the
other factors are taken into account in a single variable: the error or the random disturbance. In the
Keynesian consumption function the only relevant factor considered is income.
Taking into account these two requirements, the Keynesian consumption function can be expressed
in the following way:
$$cons = \beta_1 + \beta_2\, inc + u \tag{1-4}$$
This is an econometric model that can be estimated if you have data on consumption and income.
Let us now turn to the other two propositions. In this linear model, the marginal propensity to consume is
the following:
$$\frac{d\,cons}{d\,inc} = \beta_2 \tag{1-5}$$
Consequently, proposition 2 in this model is the following:
$$0 < \beta_2 < 1 \tag{1-6}$$
Once the model has been estimated, it is possible to test whether the estimate of $\beta_2$ is between 0
and 1.
The average propensity to consume in the linear model, considering that the error is equal to 0, is
the following:
$$\frac{cons}{inc} = \frac{\beta_1 + \beta_2\, inc}{inc} = \frac{\beta_1}{inc} + \beta_2 \tag{1-7}$$
Therefore, proposition 3 implies that
$$\beta_2 < \frac{\beta_1}{inc} + \beta_2 \quad \text{or} \quad \frac{\beta_1}{inc} > 0 \tag{1-8}$$
That is to say, since income is positive,
$$\beta_1 > 0 \tag{1-9}$$
Once the model has been estimated, testing proposition 3 is equivalent to testing whether the
intercept is significantly greater than 0.
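As an illustration, the following minimal Python sketch simulates data from a hypothetical linear consumption function and checks whether the point estimates are compatible with propositions 2 and 3. The data-generating values (an intercept of 50, a slope of 0.8, the income range and the error variance) are assumptions chosen only for this example, and the helper `ols_simple` is our own name. The sketch only inspects the signs of the point estimates; a formal significance test would also require standard errors.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = b1 + b2 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = mean_y - b2 * mean_x
    return b1, b2


# Hypothetical data-generating process (assumed, not taken from the text):
# cons = 50 + 0.8 * inc + u, with u normally distributed.
random.seed(0)
inc = [random.uniform(500, 3000) for _ in range(200)]
cons = [50 + 0.8 * x + random.gauss(0, 20) for x in inc]

b1_hat, b2_hat = ols_simple(inc, cons)

# Average propensity to consume at mean income, setting the error to 0.
mean_inc = sum(inc) / len(inc)
apc_at_mean = b1_hat / mean_inc + b2_hat

print(f"beta1_hat = {b1_hat:.2f}, beta2_hat = {b2_hat:.3f}")
print("Proposition 2 (0 < beta2_hat < 1):", 0 < b2_hat < 1)
print("Proposition 3 (beta1_hat > 0):", b1_hat > 0)
print("MPC below APC at mean income:", b2_hat < apc_at_mean)
```

With real data, `inc` and `cons` would be replaced by observed income and consumption series.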
EXAMPLE 1.2 Wage determination
Economic model:
Formal economic theory (human capital theory) says that education (educ), experience (exper)
and training are factors that affect productivity and hence the wage. Therefore, an economic model for
wage determination could be the following:
$$wage = f(educ, exper, training) \tag{1-10}$$
Incidentally, do you think there is any variable missing in this model?
Econometric model:
The corresponding econometric model, using a linear functional form, is the following:
$$wage = \beta_1 + \beta_2\, educ + \beta_3\, exper + \beta_4\, training + u \tag{1-11}$$
To sum up, to convert an economic model into an econometric model:
a) The form of the function f(.) has been specified.
b) A disturbance variable has been included to reflect the effect of other variables
affecting wage, but not appearing in the model.
An important element in the specification of the model is the formulation of a set
of statistical assumptions, which are used in subsequent steps. These statistical
assumptions play a key role in hypothesis testing and, in general, throughout the
inference process carried out with the model.
(b) Estimation
In the estimation process we obtain numerical values of the coefficients of an
econometric model. To complete this stage, data are required on all observable variables
that appear in the specified econometric model, while it is also necessary to select the
appropriate estimation method, taking into account the implications of this choice on the
statistical properties of estimators of the coefficients. The distinction between estimator
and estimate should be made clear. An estimator is the result of applying an estimation
method to an econometric specification. An estimate, on the other hand, is the
numerical value that an estimator takes for a given sample. For example, applying a
very simple estimation method, called ordinary least squares, to the specification of the
consumption function (1-4) provides expressions which determine the estimators
$\hat{\beta}_1$ and $\hat{\beta}_2$. Substituting the sample data in these expressions, two numbers are obtained:
one for $\hat{\beta}_1$ and one for $\hat{\beta}_2$, which provide estimates of the parameters $\beta_1$ and $\beta_2$.
In general, it is possible to obtain analytical expressions of the estimators,
particularly in the case of estimating linear relationships. But in non-linear procedures
of estimation it is often difficult to establish their analytical expression.
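The distinction between estimator and estimate can be sketched in a few lines of Python. In this sketch the function `ols_slope` plays the role of the estimator (a rule that can be applied to any sample), while the numbers it returns for two particular samples are estimates. The data-generating process in `draw_sample` and all its numerical values are assumptions made for the illustration.

```python
import random


def ols_slope(x, y):
    """Estimator of the slope: a formula that maps any sample to a number."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def draw_sample(rng, n=50):
    """Draw one sample from a hypothetical process cons = 50 + 0.8 * inc + u."""
    inc = [rng.uniform(500, 3000) for _ in range(n)]
    cons = [50 + 0.8 * x + rng.gauss(0, 100) for x in inc]
    return inc, cons


rng = random.Random(1)

# The same estimator applied to two different samples gives two estimates.
estimate_a = ols_slope(*draw_sample(rng))
estimate_b = ols_slope(*draw_sample(rng))

print(f"estimate from sample A: {estimate_a:.4f}")
print(f"estimate from sample B: {estimate_b:.4f}")
```

The two printed values differ because the samples differ, even though the estimator and the underlying parameter are the same.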
(c) Validation
In the validation stage we assess whether the
estimates obtained in the previous stage are acceptable, both theoretically and from the
statistical point of view. On the one hand, we analyze whether the estimates of the model
parameters have the expected signs and magnitudes: that is to say, whether they satisfy
the constraints established by economic theory.
From the statistical point of view, on the other hand, statistical tests are
performed on the significance of the parameters of the model, using the statistical
assumptions made in the specification step. In turn, it is important to test whether the
statistical assumptions of the econometric model are fulfilled, although it should be
noted that not all assumptions are testable. The violation of any of these assumptions
generally calls for another estimation method that allows us to obtain
estimators whose statistical properties are as good as possible.
One way to establish the ability of a model to make predictions is to use the
model to forecast outside the sample period, and then to compare the predicted values of
the endogenous variable with the values actually observed.
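A minimal sketch of this out-of-sample check, under assumed data, is the following. The model is fitted on the first 48 periods of a simulated series and used to forecast the last 12, and the forecasts are then compared with the observed values through the root mean squared error (RMSE). The simulated series, the split point and the choice of RMSE as the summary measure are all assumptions of this example. Note that the forecasts are conditional on the income actually observed in the hold-out period.

```python
import math
import random


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = b1 + b2 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b2 = sxy / sxx
    return mean_y - b2 * mean_x, b2


# Hypothetical series of 60 periods (assumed values, for illustration only).
random.seed(2)
inc = [1000 + 25 * t + random.gauss(0, 30) for t in range(60)]
cons = [50 + 0.8 * x + random.gauss(0, 20) for x in inc]

# Estimate with the first 48 periods, keep the last 12 for validation.
n_fit = 48
b1_hat, b2_hat = ols_simple(inc[:n_fit], cons[:n_fit])

# Forecast outside the sample period and compare with observed values.
forecasts = [b1_hat + b2_hat * x for x in inc[n_fit:]]
errors = [actual - pred for actual, pred in zip(cons[n_fit:], forecasts)]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(f"out-of-sample RMSE over {len(errors)} periods: {rmse:.2f}")
```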
1.3 Economic data
As we have seen, an empirical analysis uses data to test a theory or to estimate a
relationship. It is important to stress that in econometrics we use non-experimental data.
Non-experimental or observational data are collected by observing the real world in a
passive way. In this case, data are not the outcome of controlled experiments.
Experimental data, by contrast, are often collected in laboratory environments in the same
way as in the natural sciences. We now look at three types of data which can be
used in the estimation of an econometric model: time series, cross sectional data, and
panel data.
Time Series
In time series, data are observations on a variable over time. Examples are
magnitudes from national accounts such as consumption, imports, income, etc. The
chronological ordering of observations provides potentially important information.
Consequently, ordering matters.
Time series data cannot be assumed to be independent across time. Most
economic series are related to their recent histories. Typical examples include
macroeconomic aggregates such as prices and interest rates. This type of data is
characterized by serial dependence.
Given that most aggregated economic data are only available at a low frequency
(annual, quarterly or perhaps monthly), the sample size can be much smaller than in
typical cross sectional studies. The exception is financial data where data are available
at a high frequency (weekly, daily, hourly, etc.) and so sample sizes can be quite large.
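Serial dependence, and the fact that ordering matters, can be illustrated with a short simulation. The sketch below generates a hypothetical first-order autoregressive series (the persistence value 0.8 is an assumption) and computes its lag-1 sample autocorrelation, first in chronological order and then after shuffling the observations. Shuffling keeps exactly the same values but destroys the chronological information.

```python
import random


def lag1_autocorr(series):
    """Sample correlation between each observation and the previous one."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


rng = random.Random(3)
rho = 0.8  # assumed persistence of the hypothetical series

# y_t = rho * y_{t-1} + e_t: each value depends on its recent history.
ar1 = [0.0]
for _ in range(499):
    ar1.append(rho * ar1[-1] + rng.gauss(0, 1))

# Same values, random order: the chronological ordering is lost.
shuffled = ar1[:]
rng.shuffle(shuffled)

r_ordered = lag1_autocorr(ar1)
r_shuffled = lag1_autocorr(shuffled)

print(f"lag-1 autocorrelation, chronological order: {r_ordered:.3f}")
print(f"lag-1 autocorrelation, shuffled order:      {r_shuffled:.3f}")
```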
Cross Sectional Data
Cross sectional data sets have one observation per individual, and the data
refer to a given point in time. In most studies, the units surveyed are
individuals (for example, in the Labor Force Survey (EPA) more than 100,000
individuals are interviewed every quarter), households (for example, the Household
Budget Survey), firms (for example, industrial firm survey) or other economic agents.
Surveys are a typical source for cross-sectional data. In many contemporary
econometric cross sectional studies the sample size is quite large.
In cross sectional data, observations are assumed to be obtained by random sampling. Thus,
cross sectional observations are mutually independent. The ordering of observations in
cross sectional data does not matter for econometric analysis. If the data are not
obtained with a random sample, we have a sample selection problem.
So far we have referred to micro data, but there may also be cross sectional
data relating to aggregate units such as countries, regions, etc. Of course, data of this
type are not obtained by random sampling.
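The consequence of not sampling at random can be sketched with simulated micro data. In the example below, a hypothetical population of wages is generated, and the mean wage is computed from a random sample and from a sample in which only individuals earning more than the median are observed. The wage distribution and the selection rule are assumptions made purely for illustration.

```python
import random

rng = random.Random(4)

# Hypothetical population of 100,000 monthly wages (assumed distribution).
population = [rng.lognormvariate(7.5, 0.5) for _ in range(100_000)]
pop_mean = sum(population) / len(population)

# Random sampling: every individual has the same chance of being observed.
random_sample = rng.sample(population, 2000)
mean_random = sum(random_sample) / len(random_sample)

# Non-random selection: suppose only individuals above the median are observed.
median = sorted(population)[len(population) // 2]
selected_sample = [w for w in population if w > median][:2000]
mean_selected = sum(selected_sample) / len(selected_sample)

print(f"population mean:        {pop_mean:.0f}")
print(f"random sample mean:     {mean_random:.0f}")
print(f"selected sample mean:   {mean_selected:.0f}")
```

The random sample reproduces the population mean closely, while the selected sample overstates it; this is one simple form of the sample selection problem.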
Panel Data
Panel data (or longitudinal data) are time series for each cross sectional member
in a data set. The key feature is that the same cross sectional units are followed over a
given time period. Panel data combine elements of cross sectional and time series data.
These data sets consist of a set of individuals (typically people, households, or
corporations) surveyed repeatedly over time. The common modeling assumption is that
the individuals are mutually independent of one another, but for a given individual,
observations are mutually dependent. Thus, the ordering in the cross section of a panel
data set does not matter, but the ordering in the time dimension matters a great deal. If
we ignore the time dimension of panel data, we say that we are using pooled
cross sectional data.
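The structure of a panel can be sketched with a small, invented data set in "long" format, where each row records a unit, a period and a value. The household identifiers, years and incomes below are hypothetical. The rows are deliberately stored in arbitrary order: the order of the units is irrelevant, but within each unit the observations must be sorted by time before computing, for example, year-to-year changes. Dropping the unit and time labels yields a pooled cross section.

```python
from collections import defaultdict

# Hypothetical panel in long format: (household id, year, income).
panel = [
    ("h2", 2011, 31.0), ("h1", 2010, 20.0), ("h1", 2012, 24.0),
    ("h2", 2010, 30.0), ("h1", 2011, 21.0), ("h2", 2012, 35.0),
]

# Group the observations by cross sectional unit.
by_unit = defaultdict(list)
for unit, year, income in panel:
    by_unit[unit].append((year, income))

# Within each unit the time ordering matters: sort by year, then difference.
changes = {}
for unit, obs in by_unit.items():
    obs.sort()
    changes[unit] = [later[1] - earlier[1] for earlier, later in zip(obs, obs[1:])]

# Pooled cross section: ignore who and when, and treat all rows alike.
pooled = [income for _, _, income in panel]
pooled_mean = sum(pooled) / len(pooled)

print("year-to-year changes by household:", dict(changes))
print(f"pooled mean income: {pooled_mean:.2f}")
```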