EC114 Introduction to Quantitative Economics


11. Introduction to Econometrics
Marcus Chambers
Department of Economics
University of Essex
17/19 January 2012
Outline
1. The Purpose of Econometrics
2. Sample Correlations
3. Two-variable Regression
Reference: R. L. Thomas, Using Statistics in Economics,
McGraw-Hill, 2005, chapter 8 and sections 9.1-9.2.
The Purpose of Econometrics
What is Econometrics? Econometrics:
"may be defined as the social science in which the
tools of economic theory, mathematics, and statistical
inference are applied to the analysis of economic
phenomena"
(A.S. Goldberger, Econometric Theory, 1964, p.1)
"is concerned with the empirical determination of
economic laws"
(H. Theil, Principles of Econometrics, 1971, p.1)
"aims to put empirical flesh and blood on
theoretical structures"
(J. Johnston, Econometric Methods (Third Edition),
1984, p.5)
Furthermore, Econometrics:
"is the science and art of using economic theory
and statistical techniques to analyze economic data"
(J.H. Stock and M.W. Watson, Introduction to
Econometrics, 2003, p.3)
The latter authors (Stock and Watson) also state:
"Ask a half dozen econometricians what
econometrics is and you could get a half dozen
different answers" (p.3); and
"Econometrics can be a fun course for both
teacher and student" (p.xxvii)
Please remember this last point as the term progresses...
Econometrics, then, can be seen as an application of
Statistics to economic theory and data.
In this sense, during this term we will apply many of the
tools covered last term to economic problems.
For example, we will use the concepts of estimation and
inference, including t- and F-tests, to estimate economic
relationships and to test hypotheses of interest.
But why is Econometrics so important?
Economics is based on the study of relationships between
variables.
Examples include:
consumption and income (the consumption function);
quantity demanded and price (demand curve);
employment and wages (demand for labour).
Econometrics studies how to:
quantify these relationships and find values for their
parameters (i.e. estimates);
test the theories implied by the relationships;
use the relationships as a basis for predictions and
forecasts.
Example: consumer theory suggests that aggregate
consumers' expenditure (C) is a function of income (Y) and
the cost of borrowing (I):
C = C(Y, I).
It also suggests that a rise in Y leads to a rise in C, while a
rise in I leads to a fall in C, other things being equal.
However, some problems remain:
How do we actually define and measure C, Y and I?
What is the appropriate functional form? It could be linear:
$C = \alpha + \beta Y + \gamma I, \quad \beta > 0, \ \gamma < 0,$
or it could be nonlinear, e.g. the constant elasticity form
$C = A Y^{\beta} I^{\gamma}, \quad \beta > 0, \ \gamma < 0,$
or it could be something completely different!
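The two candidate functional forms can be compared numerically. The sketch below is only an illustration: the parameter values and the levels of Y and I are made up, not estimates from any data. It shows that in the linear form a one-unit rise in Y changes C by the slope on Y, while in the constant elasticity form a 1% rise in Y changes C by roughly the exponent on Y, in percent.

```python
# Hypothetical parameter values, chosen only for illustration.
ALPHA, BETA, GAMMA = 10.0, 0.8, -2.0     # linear form: C = ALPHA + BETA*Y + GAMMA*I
A, BETA_E, GAMMA_E = 2.0, 0.9, -0.1      # constant elasticity form: C = A * Y**BETA_E * I**GAMMA_E


def linear_consumption(y, i):
    """Linear consumption function."""
    return ALPHA + BETA * y + GAMMA * i


def constant_elasticity_consumption(y, i):
    """Constant elasticity consumption function."""
    return A * y ** BETA_E * i ** GAMMA_E


y, i = 100.0, 5.0  # made-up income and cost of borrowing

# Linear form: effect on C of a one-unit rise in Y, holding I fixed.
lin_effect = linear_consumption(y + 1.0, i) - linear_consumption(y, i)

# Constant elasticity form: percentage effect on C of a 1% rise in Y.
base = constant_elasticity_consumption(y, i)
pct_effect = 100.0 * (constant_elasticity_consumption(1.01 * y, i) / base - 1.0)

print(f"Linear form: unit rise in Y changes C by {lin_effect:.3f}")
print(f"Constant elasticity form: 1% rise in Y changes C by {pct_effect:.3f}%")
```

Which form is appropriate is an empirical question; the code only makes the difference in interpretation concrete.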
Even when we decide on the functional form, additional
problems remain:
What are the values of $\beta$ and $\gamma$?
The theory concerns equilibrium: do the data correspond
to equilibrium points?
The theory suggests no role for prices, P. Suppose we
include prices:
$C = \alpha + \beta Y + \gamma I + \delta P.$
Can we test whether $\delta = 0$?
Furthermore, economic relationships are never exact or
deterministic.
There will always be unknown factors that determine the
variable of interest.
In view of this we would rewrite the linear consumption
function as
$C = \alpha + \beta Y + \gamma I + \varepsilon,$
where $\varepsilon$ is a random disturbance that may be positive or
negative.
The presence of $\varepsilon$ reflects the fact that there may be other
(unknown) factors affecting C which we treat as being
random.
We therefore consider stochastic (or random) relationships
rather than deterministic relationships.
All of these aspects (and more besides) have to be
considered in an econometric analysis of the data.
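A minimal sketch of what a stochastic relationship means in practice. The parameter values, the levels of Y and I, and the normal distribution for the disturbance are all assumptions made for illustration. Holding Y and I fixed, the deterministic part of C never changes, but each observed C differs because of the random disturbance.

```python
import random

ALPHA, BETA, GAMMA = 10.0, 0.8, -2.0   # hypothetical parameter values
rng = random.Random(1)                 # fixed seed so the run is reproducible

income, interest = 100.0, 5.0          # the same Y and I in every draw
deterministic_part = ALPHA + BETA * income + GAMMA * interest

draws = []
for _ in range(5):
    eps = rng.gauss(0.0, 1.0)          # disturbance: assumed normal, mean 0, s.d. 1
    draws.append(deterministic_part + eps)

for c in draws:
    print(f"C = {c:.2f} (deterministic part = {deterministic_part:.2f})")
```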
Unlike the physical sciences, it is not typically possible to
conduct experiments to quantify the effects of interest in
economics.
For example, we can't hold I constant so as to isolate the
effects of Y in order to determine $\beta$: in reality all variables
change!
Fortunately, however, we can use a statistical technique
called multiple regression analysis to estimate the
parameters of interest, such as $\beta$ and $\gamma$.
Much of econometrics uses multiple regression analysis in
one form or another; it can be regarded as the
econometrician's substitute for a controlled experiment.
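The idea can be sketched on simulated data in which Y and I change at the same time. Everything here is an assumption for illustration: the "true" parameter values, the ranges of Y and I, the normal disturbance, and the use of least squares via the normal equations (estimation methods are covered later in the module). The helper names are hypothetical.

```python
import random


def solve_linear_system(a_matrix, b_vector):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b_vector)
    # Augmented matrix [A | b], copied so the inputs are not modified.
    aug = [list(row) + [rhs] for row, rhs in zip(a_matrix, b_vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def multiple_regression(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


rng = random.Random(0)
ALPHA, BETA, GAMMA = 10.0, 0.8, -2.0   # hypothetical "true" parameters

rows, consumption = [], []
for _ in range(200):
    income = rng.uniform(50.0, 150.0)   # Y and I both vary across observations
    interest = rng.uniform(1.0, 10.0)
    eps = rng.gauss(0.0, 1.0)           # random disturbance
    rows.append([1.0, income, interest])
    consumption.append(ALPHA + BETA * income + GAMMA * interest + eps)

a_hat, b_hat, g_hat = multiple_regression(rows, consumption)
print(f"alpha ~ {a_hat:.2f}, beta ~ {b_hat:.3f}, gamma ~ {g_hat:.3f}")
```

Although no variable is held constant in the simulated data, the estimates should land close to the values used to generate them.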
Sample Correlations
In the first part of the module it was shown that the
direction of the (linear) association between two variables
can be measured by the covariance.
The strength of the association is measured via the
correlation coefficient.
The formulation of these statistics differs depending on
whether they are computed for the population or the
sample.
The definitions were given in Lectures 9 and 10.
As a reminder we have:
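Population Covariance
$\mathrm{Cov}(X, Y) = E\{[X - E(X)][Y - E(Y)]\}$
Population Correlation
$\rho = \dfrac{\mathrm{Cov}(X, Y)}{\sqrt{V(X)}\sqrt{V(Y)}} = \dfrac{E(XY) - E(X)E(Y)}{\sqrt{V(X)}\sqrt{V(Y)}}$
Sample Correlation
$R = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2}\sqrt{\sum (Y - \bar{Y})^2}}$
Remember: $-1 \le \rho \le 1$ and $-1 \le R \le 1$.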
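The sample correlation formula translates directly into code. The sketch below uses a small made-up data set (not the data from Thomas), and the function name is hypothetical.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation R between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # made-up values, roughly linear in xs
r = sample_correlation(xs, ys)
print(f"R = {r:.4f}")
```

Because the made-up Y values rise almost linearly with X, R comes out close to, but not above, 1.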
As an example, consider the well-known macroeconomic
relation
MV = PG,
where M denotes money stock, V is the velocity of
circulation, P is the price level and G denotes GDP.
Defining k = P/V we can rewrite the equation as
M = kG.
Assuming k to be a positive constant this implies that M is
proportional to G, as in the following diagram:

If we could observe M and G we could calculate the
correlation between them and test to see whether it was
positive or not.
Table 9.1 in Thomas provides data for a cross-section of 30
countries in 1985.
A scatter diagram of the data is as follows:

There is a broadly increasing relationship between M and
G but the fact that the dots do not lie on a straight line
implies that the value of k is different across countries.
We can use the data to calculate the sample correlation.
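We find that, taking X = G and Y = M:
$\sum (X - \bar{X})(Y - \bar{Y}) = 116.60$
$\sum (X - \bar{X})^2 = 666.86$
$\sum (Y - \bar{Y})^2 = 26.403$
and hence
$R = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2}\sqrt{\sum (Y - \bar{Y})^2}} = \dfrac{116.60}{\sqrt{666.86}\sqrt{26.403}} = 0.8787,$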
which suggests a strong positive linear relation between M
and G.
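The arithmetic can be checked in a few lines, using only the three sums reported above (the variable names are illustrative):

```python
import math

# Sums of squares and cross-products for X = G and Y = M, as reported above.
S_XY = 116.60   # sum of (X - Xbar)(Y - Ybar)
S_XX = 666.86   # sum of (X - Xbar)^2
S_YY = 26.403   # sum of (Y - Ybar)^2

r = S_XY / math.sqrt(S_XX * S_YY)
print(f"R = {r:.4f}")   # prints R = 0.8787
```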
However, R is a sample statistic, and we are really
interested in the population correlation, $\rho$.
In particular, is R sufficiently different from 0 that we can
say that $\rho$ is also different from 0?
Put another way, can we test
$H_0: \rho = 0$ against $H_A: \rho > 0$
at, say, the 5% level of significance?
The answer is: yes!
Our test will be based on the statistic
$TS = \dfrac{R\sqrt{n - 2}}{\sqrt{1 - R^2}} \sim t_{n-2}$ under $H_0$.
From the t-table we find that $t_{28}^{0.05} = 1.701$ and so the test
criterion is:
reject $H_0: \rho = 0$ if $TS > 1.701$
and reserve judgment otherwise.
Substituting the values:
$TS = \dfrac{0.8787\sqrt{28}}{\sqrt{1 - 0.8787^2}} = 9.74.$
Hence TS = 9.74 > 1.701 and so we reject $H_0$ at the 5%
level of significance in favour of $H_A: \rho > 0$.
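The test above can be reproduced as a short calculation. The critical value 1.701 is taken from the t-table as quoted in the lecture rather than computed, and the function name is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """Test statistic for H0: rho = 0, distributed as t with n - 2 d.f. under H0."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


R = 0.8787          # sample correlation between G and M
N = 30              # number of countries in the sample
T_CRITICAL = 1.701  # 5% one-tailed critical value with 28 d.f., from the t-table

ts = correlation_t_statistic(R, N)
reject_h0 = ts > T_CRITICAL
print(f"TS = {ts:.2f}, reject H0: {reject_h0}")   # prints TS = 9.74, reject H0: True
```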
Our result implies that M and G are positively related.
However, R does not imply anything about causality, so we
can't say that M grows because G grows.
It can be the other way around, or M and G can influence
each other, or the relation exists by chance, i.e. the relation
between M and G is spurious.
For example, the sample correlation between UK beer
prices and Japanese petrol consumption from the 1950s
to the 1990s is as high as 0.93, but there is no causal
mechanism: the high correlation exists by chance, or is
spurious.
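A minimal sketch of how a high sample correlation can arise without any causal link. It assumes, purely for illustration, two made-up series that each drift upwards over time with independent noise. These are not the beer price or petrol consumption data, and a shared time trend is only one possible source of a spurious correlation.

```python
import math
import random


def sample_correlation(xs, ys):
    """Sample correlation R between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


rng = random.Random(7)
years = range(40)

# Two unrelated made-up series: each is a time trend plus its own noise.
series_a = [1.0 * t + rng.gauss(0.0, 1.0) for t in years]
series_b = [2.0 * t + rng.gauss(0.0, 1.0) for t in years]

r = sample_correlation(series_a, series_b)
print(f"R between the two unrelated series = {r:.3f}")
```

Neither series enters the construction of the other, yet R is close to 1 because both grow over time.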
Two-variable Regression
Regression analysis differs from correlation because:
1. An a priori assumption is made about the direction of
causality between two variables; and
2. An attempt is made to quantify the linear relationship
between the variables.
So, by writing M = f (G), we are assuming that M depends
on G and not vice versa.
Therefore, M is the dependent variable (or regressand)
and G is the explanatory variable (or regressor).
For consistency of notation we shall set:
Y : dependent variable;
X : explanatory variable.
In our example Y = M and X = G.
We assume that Y and X are linked by the population
regression equation which is a linear relationship:
$E(Y) = \alpha + \beta X.$
In this set-up:
E(Y): the expected demand for money of a country
with GDP of X;
$\alpha$, $\beta$: unknown population parameters;
$\alpha$: intercept;
$\beta$: slope, or gradient.
The actual demand for money, Y, of a country is not always
the same as the expected demand, E(Y).
The difference between the two is referred to as a
deviation, error or disturbance, which we represent with the
symbol $\varepsilon$.
We then have
$Y = E(Y) + \varepsilon.$
Recalling that $E(Y) = \alpha + \beta X$ this implies that
$Y = \alpha + \beta X + \varepsilon,$
i.e. Y is linearly related to X but the relationship is subject
to a random disturbance $\varepsilon$.
What does the disturbance actually represent?
All variables other than GDP that influence the demand for
money (which we are assuming to be quantitatively small,
otherwise we would need to allow for them explicitly);
Random variation in Y resulting from the basic
unpredictability of economic agents.
Even if GDP was the only variable influencing the demand
for money and even if GDP was identical in all 30
countries, we would still expect some variation in the
demand for money across countries.
The random disturbance, $\varepsilon$, represents all such random
factors.
Disturbances can be either positive or negative.
If $\varepsilon > 0$ then Y > E(Y) so that Y is above its expected value.
Alternatively, if $\varepsilon < 0$ then Y < E(Y) and Y is below its
expected value.
Extending the notation:
n: sample size (n = 30 in the current example);
i: index for observations: i = 1, ..., n;
$Y_i$: demand for money per head in country i;
$X_i$: GDP per head of country i;
$\varepsilon_i$: disturbance associated with country i.
The symbols X, Y and $\varepsilon$ without subscripts are a general
shorthand for the variables they represent: GDP per head,
the demand for money per head and the disturbance.
When X, Y and $\varepsilon$ appear with subscripts (e.g. $Y_8$ or $X_{10}$ or
$\varepsilon_{12}$) they must be interpreted as numbers referring to, in
this case, either GDP, the demand for money or the
disturbance values for particular countries.
Subscripted variables therefore satisfy:
$E(Y_i) = \alpha + \beta X_i, \quad i = 1, 2, 3, \ldots, n,$
$Y_i = \alpha + \beta X_i + \varepsilon_i, \quad i = 1, 2, 3, \ldots, n.$
Problem: the population parameters, $\alpha$ and $\beta$, and hence
the population regression line, are unknown.
We therefore estimate the population parameters using the
data.
The most common way to do this is to fit a straight line to
the scatter of points in Figure 9.1.
The result is the sample regression line, written
$\hat{Y} = a + bX,$
where a and b are the estimates of $\alpha$ and $\beta$, respectively,
and $\hat{Y}$ is the predicted value (or fitted value) of Y.

The dependent variable Y is represented on the vertical
axis and the independent variable X on the horizontal axis.
We can compare the population and sample regression
lines:

The population and sample regression lines are different
because a and b are only estimates of $\alpha$ and $\beta$.
We can calculate the predicted value of Y for any country
in our sample using
$\hat{Y}_i = a + bX_i, \quad i = 1, \ldots, n.$
For example, country 15 (Japan) has GDP per head of
$X_{15} = 10.9748$, and so
$\hat{Y}_{15} = a + (b \times 10.9748).$
The difference between the actual value of Y and the
predicted value is known as the residual:
$Y_i = \hat{Y}_i + e_i$
Actual = Predicted + Residual
Important: residuals and disturbances are different
quantities:
Disturbance: $\varepsilon_i = Y_i - E(Y_i) = Y_i - \alpha - \beta X_i,$
Residual: $e_i = Y_i - \hat{Y}_i = Y_i - a - bX_i.$
Disturbances are the parts of the $Y_i$ that are not explained
by the population regression; they are unobservable.
Residuals are the parts of the $Y_i$ that are not explained by
the sample regression; they can be calculated using the
formula above.
Disturbances and residuals can be depicted as follows:

In the diagram $\varepsilon_i > 0$ and $e_i > 0$ because $Y_i$ lies above both
the population and sample regression lines for the
corresponding value of $X_i$.
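The distinction between disturbances and residuals can be made concrete with simulated data, where the population parameters are known by construction. This is a sketch under stated assumptions: the parameter values and X values are made up, the disturbances are drawn from a normal distribution, and the line is fitted with the usual least-squares formulas, which this lecture leaves to next week.

```python
import random

ALPHA, BETA = 1.0, 0.2   # hypothetical population parameters
rng = random.Random(42)

xs = [float(x) for x in range(1, 31)]       # 30 made-up X values
eps = [rng.gauss(0.0, 0.5) for _ in xs]     # disturbances (unobservable in practice)
ys = [ALPHA + BETA * x + e for x, e in zip(xs, eps)]

# Fit the sample regression line Y-hat = a + bX by least squares (assumed method).
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residuals: deviations from the sample line, computable from the data alone.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print(f"population line: alpha = {ALPHA}, beta = {BETA}")
print(f"sample line:     a = {a:.3f}, b = {b:.3f}")
for i in range(3):
    print(f"i = {i + 1}: disturbance = {eps[i]:+.3f}, residual = {residuals[i]:+.3f}")
```

The disturbances are visible here only because the data were simulated; with real data only the residuals can be calculated.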
Summary
The purpose of Econometrics.
Sample correlations.
Two-variable regression.
Next week:
Ordinary least squares (OLS) estimation; goodness-of-fit.