Вы находитесь на странице: 1из 105

Chapter 3

A brief overview of the


classical linear regression model

Introductory Econometrics for 1


Regression

Regression is probably the single most important tool at the


econometricians disposal.

But what is regression analysis?

It is concerned with describing and evaluating the relationship


between a given variable (usually called the dependent variable) and
one or more other variables (usually known as the independent
variable(s)).

Introductory Econometrics for 2


Some Notation

Denote the dependent variable by y and the independent variable(s) by x1, x2, ... , xk
where there are k independent variables.

Some alternative names for the y and x variables:


y x
dependent variable independent variables
regressand regressors
effect variable causal variables
explained variable explanatory variable

Note that there can be many x variables but we will limit ourselves to the case
where there is only one x variable to start with. In our set-up, there is only one y
variable.

Introductory Econometrics for 3


Regression is different from Correlation

If we say y and x are correlated, it means that we are treating y and x in


a completely symmetrical way.

In regression, we treat the dependent variable (y) and the independent


variable(s) (xs) very differently. The y variable is assumed to be
random or stochastic in some way, i.e. to have a probability
distribution. The x variables are, however, assumed to have fixed
(non-stochastic) values in repeated samples.

Introductory Econometrics for 4


Simple Regression

For simplicity, say k=1. This is the situation where y depends on only one x
variable.

Examples of the kind of relationship that may be of interest include:


How asset returns vary with their level of market risk
Measuring the long-term relationship between stock prices and
dividends.
Constructing an optimal hedge ratio

Introductory Econometrics for 5


Simple Regression: An Example

Suppose that we have the following data on the excess returns on a fund
managers portfolio (fund XXX) together with the excess returns on a
market index:
Year, t Excess return Excess return on market index
= rXXX,t rft = rmt - rft
1 17.8 13.7
2 39.0 23.2
3 12.8 6.9
4 24.2 16.8
5 17.2 12.3
We have some intuition that the beta on this fund is positive, and we
therefore want to find whether there appears to be a relationship between
x and y given the data that we have. The first stage would be to form a
scatter plot of the two variables.

Introductory Econometrics for 6


Graph (Scatter Diagram)

45
40
Excess return on fund XXX

35
30
25
20
15
10
5
0
0 5 10 15 20 25
Excess return on market portfolio

Introductory Econometrics for 7


Finding a Line of Best Fit

We can use the general equation for a straight line,


y=a+bx
to get the line that best fits the data.

However, this equation (y=a+bx) is completely deterministic.

Is this realistic? No. So what we do is to add a random disturbance


term, u into the equation.
y t = + x t + u t
where t = 1,2,3,4,5

Introductory Econometrics for 8


Why do we include a Disturbance term?

The disturbance term can capture a number of features:

- We always leave out some determinants of yt


- There may be errors in the measurement of yt that cannot be
modelled.
- Random outside influences on yt which we cannot model

Introductory Econometrics for 9


Determining the Regression Coefficients

So how do we determine what and are?


Choose and so that the (vertical) distances from the data points to the
fitted lines are minimised (so that the line fits the data as closely as
possible): y

x
Introductory Econometrics for 10
Ordinary Least Squares

The most common method used to fit a line to the data is known as
OLS (ordinary least squares).

What we actually do is take each distance and square it (i.e. take the
area of each of the squares in the diagram) and minimise the total sum
of the squares (hence least squares).

Tightening up the notation, let


yt denote the actual data point t
y t denote the fitted value from the regression line
ut denote the residual, yt - y t

Introductory Econometrics for 11


Actual and Fitted Value

yi

u i

y i

xi x

Introductory Econometrics for 12


How OLS Works

So min. u1 u 2 u3 u 4 u5 , or minimise
2 2 2 2 2
u
t 1
2
t . This is known
as the residual sum of squares.

But what was ut ? It was the difference between the actual point and
the line, yt - y t .

So minimising y
y
t t is equivalent to minimising
2
t

u 2

with respect to and .

Introductory Econometrics for 13


Deriving the OLS Estimator

t xt , so let
But y L ( yt y t ) 2 ( yt xt ) 2
t i

Want to minimise L with respect to (w.r.t.)


and differentiate L w.r.t.
, so
and
L
2 ( yt xt ) 0 (1)
t
L
2 xt ( yt xt ) 0 (2)
t

From (1), ( y t x t ) 0 y t T x t 0
t

But y t Ty and x t Tx .

Introductory Econometrics for 14


Deriving the OLS Estimator (contd)

So we can write Ty T Tx 0 or y x 0 (3)


From (2), x ( y x ) 0
t t t
(4)
t

From (3), y x (5)


Substitute into (4) for from (5),
xt ( yt y x xt ) 0
t

t t t
x y y x
x t
x
t 0
x
2

t t
x y T y x
T x 2

t 0
x
2

Introductory Econometrics for 15


Deriving the OLS Estimator (contd)

Rearranging for ,

(Tx 2 xt2 ) Tyx xt yt

So overall we have


xt yt Tx y
and y x

xt2 Tx 2

This method of finding the optimum is known as ordinary least squares.

Introductory Econometrics for 16


What do We Use and For?
In the CAPM example used above, plugging the 5 observations
in to make up the formulae given above would lead to the

estimates
= -1.74 and = 1.64. We would write the fitted line as:
y t 1.74 1.64 x t

Question: If an analyst tells you that she expects the market to yield a return
20% higher than the risk-free rate next year, what would you expect the return
on fund XXX to be?

Solution: We can say that the expected value of y = -1.74 + 1.64 * value of x,
so plug x = 20 into ythe equation
1.74 1.to
64get
20the expected
31 .06 value for y:
i

Introductory Econometrics for 17


Accuracy of Intercept Estimate

Care needs to be exercised when considering the intercept estimate,


particularly if there are no or few observations close to the y-axis:
y

0 x

Introductory Econometrics for 18


The Population and the Sample

The population is the total collection of all objects or people to be studied,


for example,

Interested in Population of interest


predicting outcome the entire electorate
of an election

A sample is a selection of just some items from the population.

A random sample is a sample in which each individual item in the


population is equally likely to be drawn.

Introductory Econometrics for 19


The DGP and the PRF

The population regression function (PRF) is a description of the model


that is thought to be generating the actual data and the true relationship
between the variables (i.e. the true values of and ).

The PRF is yt xt ut

The SRF is y t xt
and we also know that ut yt y t.

We use the SRF to infer likely values of the PRF.

We also want to know how good our estimates of and are.

Introductory Econometrics for 20


Linearity

In order to use OLS, we need a model which is linear in the parameters (


and ). It does not necessarily have to be linear in the variables (y and x).

Linear in the parameters means that the parameters are not multiplied
together, divided, squared or cubed etc.

Some models can be transformed to linear ones by a suitable substitution or


manipulation, e.g. the exponential regression model

Yt e X t e ut ln Yt ln X t ut
Then let yt=ln Yt and xt=ln Xt

yt xt ut
Introductory Econometrics for 21
Linear and Non-linear Models

This is known as the exponential regression model. Here, the coefficients


can be interpreted as elasticities.

Similarly, if theory suggests that y and x should be inversely related:



yt ut
xt
then the regression can be estimated using OLS by substituting
1
zt
xt
But some models are intrinsically non-linear, e.g.

yt xt ut

Introductory Econometrics for 22


Estimator or Estimate?

Estimators are the formulae used to calculate the coefficients

Estimates are the actual numerical values for the coefficients.

Introductory Econometrics for 23


The Assumptions Underlying the
Classical Linear Regression Model (CLRM)
The model which we have used is known as the classical linear regression model.
We observe data for xt, but since yt also depends on ut, we must be specific about how
the ut are generated.
We usually make the following set of assumptions about the uts (the unobservable
error terms):
Technical Notation Interpretation
1. E(ut) = 0 The errors have zero mean
2. Var (ut) = 2 The variance of the errors is constant and finite
over all values of xt
3. Cov (ui,uj)=0 The errors are statistically independent of
one another
4. Cov (ut,xt)=0 No relationship between the error and
corresponding x variate
Introductory Econometrics for 24
The Assumptions Underlying the
CLRM Again

An alternative assumption to 4., which is slightly stronger, is that the


xts are non-stochastic or fixed in repeated samples.

A fifth assumption is required if we want to make inferences about the


population parameters (the actual and ) from the sample parameters
( and )

Additional Assumption
5. ut is normally distributed

Introductory Econometrics for 25


Properties of the OLS Estimator

If assumptions 1. through 4. hold, then the estimators and determined by


OLS are known as Best Linear Unbiased Estimators (BLUE).
What does the acronym stand for?

Estimator - is an estimator of the true value of .


Linear - is a linear estimator
Unbiased - On average, the actual value of the and s will be equal to
the true values.

Best - means that the OLS estimator has minimum variance among
the class of linear unbiased estimators. The Gauss-Markov
theorem proves that the OLS estimator is best.

Introductory Econometrics for 26


Consistency/Unbiasedness/Efficiency

Consistent
areconsistent.
The least squares estimators and That is, the estimates will converge
to their true values as the sample size increases to infinity. Need the assumptions
E(xtut)=0 and Var(ut)=2 < to prove this. Consistency implies that

lim Pr 0 0
T
Unbiased
The least squares estimates of and
unbiased. That is E( )=
are and E( )=
Thus on average the estimated value will be equal to the true values. To prove this
also requires the assumption that E(ut)=0. Unbiasedness is a stronger condition than
consistency.
Efficiency
An estimator of parameter is said to be efficient if it is unbiased and no other
unbiased estimator has a smaller variance. If the estimator is efficient, we are
minimising the probability that it is a long way off from the true value of .

Introductory Econometrics for 27


Precision and Standard Errors

Any set of regression estimates of


and specific to the sample used in their
are
estimation.
Recall that the estimators of and from the sample parameters ( and )are
given
by

t 2 t
x y Tx y
and y x
ofxt the
What we need is some measure Treliability
x 2
or precision of the estimators
( and ). The precision of the estimate is given by its standard error. Given
assumptions

1 - 4 above, then the standard errors can be shown to be given by

t xt
2 2
x
SE ( ) s s ,
T ( xt x ) 2
T x T x
2
t
2 2

) s 1 1
SE ( s
t
where s is the estimated standard deviation
( x of
x ) t
2the residuals.
x 2
T x 2

Introductory Econometrics for 28


Estimating the Variance of the Disturbance Term

The variance of the random variable ut is given by


Var(ut) = E[(ut)-E(ut)]2
which reduces to
Var(ut) = E(ut2)
2
u
We could estimate this using the average of t :
1
s 2 ut2
T
Unfortunately this is not workable since ut is not observable. We can use

the sample counterpart to ut, which is ut :
1
s2
T

ut2

But this estimator is a biased estimator of 2.

Introductory Econometrics for 29


Estimating the Variance of the Disturbance Term
(contd)

An unbiased estimator of is given by


s
t

u 2

T 2

where u 2
t is the residual sum of squares and T is the sample size.

Some Comments on the Standard Error Estimators



1. Both SE( ) and SE( ) depend on s2 (or s). The greater the variance s2,
then the more dispersed the errors are about their mean value and therefore
the more dispersed y will be about its mean value.

2. The sum of the squares of x about their mean appears in both formulae.
The larger the sum of squares, the smaller the coefficient variances.

Introductory Econometrics for 30


Some Comments on the Standard Error Estimators

Consider what happens if


tx x 2
is small or large:

y
y

y
y

0 x x 0 x x

Introductory Econometrics for 31


Some Comments on the Standard Error Estimators
(contd)

3. The larger the sample size, T, the smaller will be the coefficient
variances. T appears explicitly in SE( ) and implicitly in SE( ).

T appears implicitly since the sum


tx x 2
is from t = 1 to T.

4. The term x
2
t appears in the SE( ).
The reason is that xt
2
measures how far the points are away from the
y-axis.

Introductory Econometrics for 32


Example: How to Calculate the Parameters and
Standard Errors
Assume we have the following data calculated from a regression of y on a
single variable x and a constant over 22 observations.
Data:
x y 830102, T 22, x 416.5, y 86.65,
t t

x 3919654, RSS 130.6


2
t

Calculations: 830102 (22 * 416.5 * 86.65) 0.35


3919654 22 * (416.5) 2

86.65 0.35 * 416.5 59.12


We write y t xt
y t 59.12 0.35 xt

Introductory Econometrics for 33


Example (contd)

SE(regression), s u t2

130.6
2.55
T 2 20

3919654
SE ( ) 2.55 * 3.35

22 3919654 22 416.5
2

1
SE ( ) 2.55 * 0.0079

3919654 22 416.5 2

We now write the results as

y t 59.12 0.35 xt
(3.35) (0.0079)

Introductory Econometrics for 34


An Introduction to Statistical Inference

We want to make inferences about the likely population values from


the regression parameters.

Example: Suppose we have the following regression results:


y t 20.3 0.5091xt
(14.38) (0.2561)
0.5091 is a single (point) estimate of the unknown population
parameter, . How reliable is this estimate?

The reliability of the point estimate is measured by the coefficients


standard error.

Introductory Econometrics for 35


Hypothesis Testing: Some Concepts

We can use the information in the sample to make inferences about the
population.
We will always have two hypotheses that go together, the null hypothesis
(denoted H0) and the alternative hypothesis (denoted H1).
The null hypothesis is the statement or the statistical hypothesis that is actually
being tested. The alternative hypothesis represents the remaining outcomes of
interest.
For example, suppose given the regression results above, we are interested in
the hypothesis that the true value of is in fact 0.5. We would use the notation
H0 : = 0.5
H1 : 0.5
This would be known as a two sided test.

Introductory Econometrics for 36


One-Sided Hypothesis Tests

Sometimes we may have some prior information that, for example, we


would expect > 0.5 rather than < 0.5. In this case, we would do a
one-sided test:
H0 : = 0.5
H1 : > 0.5
or we could have had
H0 : = 0.5
H1 : < 0.5

There are two ways to conduct a hypothesis test: via the test of
significance approach or via the confidence interval approach.

Introductory Econometrics for 37


The Probability Distribution of the
Least Squares Estimators
We assume that ut N(0,2)

Since the least squares estimators are linear combinations of the random variables
i.e.
wt yt
The weighted sum of normal random variables is also normally distributed, so
N(, Var())
N(, Var())

What if the errors are not normally distributed? Will the parameter estimates still
be normally distributed?
Yes, if the other assumptions of the CLRM hold, and the sample size is
sufficiently large.

Introductory Econometrics for 38


The Probability Distribution of the
Least Squares Estimators (contd)

Standard normal variates can be constructed from and :


~ N 0,1 and ~ N 0,1
var var

But var() and var() are unknown, so


~ tT 2 and ~ tT 2
SE ( )
SE ( )

Introductory Econometrics for 39


Testing Hypotheses:
The Test of Significance Approach

Assume the regression equation is given by ,


yt xt ut for t=1,2,...,T

The steps involved in doing a test of significance are:


1. Estimate , and SE( ) , SE( ) in the usual way

2. Calculate the test statistic. This is given by the formula


*
test statistic
SE ( )
where * is the value of under the null hypothesis.

Introductory Econometrics for 40


The Test of Significance Approach (contd)

3. We need some tabulated distribution with which to compare the estimated test
statistics. Test statistics derived in this way can be shown to follow a t-
distribution with T-2 degrees of freedom.
As the number of degrees of freedom increases, we need to be less cautious in
our approach since we can be more sure that our results are robust.

4. We need to choose a significance level, often denoted . This is also


sometimes called the size of the test and it determines the region where we will
reject or not reject the null hypothesis that we are testing. It is conventional to
use a significance level of 5%.
Intuitive explanation is that we would only expect a result as extreme as this or
more extreme 5% of the time as a consequence of chance alone.
Conventional to use a 5% size of test, but 10% and 1% are also commonly used.

Introductory Econometrics for 41


Determining the Rejection Region for a Test of
Significance
5. Given a significance level, we can determine a rejection region and non-
rejection region. For a 2-sided test:
f(x)

2.5% 95% non-rejection 2.5%


rejection region region rejection region

Introductory Econometrics for 42


The Rejection Region for a 1-Sided Test (Upper Tail)

f(x)

95% non-rejection
region 5% rejection region

Introductory Econometrics for 43


The Rejection Region for a 1-Sided Test (Lower Tail)

f(x)

95% non-rejection region


5% rejection region

Introductory Econometrics for 44


The Test of Significance Approach: Drawing
Conclusions

6. Use the t-tables to obtain a critical value or values with which to


compare the test statistic.

7. Finally perform the test. If the test statistic lies in the rejection
region then reject the null hypothesis (H0), else do not reject H0.

Introductory Econometrics for 45


A Note on the t and the Normal Distribution

You should all be familiar with the normal distribution and its
characteristic bell shape.

We can scale a normal variate to have zero mean and unit variance by
subtracting its mean and dividing by its standard deviation.

There is, however, a specific relationship between the t- and the


standard normal distribution. Both are symmetrical and centred on
zero. The t-distribution has another parameter, its degrees of freedom.
We will always know this (for the time being from the number of
observations -2).

Introductory Econometrics for 46


What Does the t-Distribution Look Like?

normal distribution

t-distribution

Introductory Econometrics for 47


Comparing the t and the Normal Distribution

In the limit, a t-distribution with an infinite number of degrees of freedom is


a standard normal, i.e. t () N (0,1)

Examples from statistical tables:


Significance level N(0,1) t(40) t(4)
50% 0 0 0
5% 1.64 1.68 2.13
2.5% 1.96 2.02 2.78
0.5% 2.57 2.70 4.60

The reason for using the t-distribution rather than the standard normal is that
we had to estimate 2, the variance of the disturbances.

Introductory Econometrics for 48


The Confidence Interval Approach
to Hypothesis Testing

An example of its usage: We estimate a parameter, say to be 0.93, and


a 95% confidence interval to be (0.77,1.09). This means that we are
95% confident that the interval containing the true (but unknown)
value of .

Confidence intervals are almost invariably two-sided, although in


theory a one-sided interval can be constructed.

Introductory Econometrics for 49


How to Carry out a Hypothesis Test
Using Confidence Intervals

1. Calculate , and SE( ) , SE( ) as before.


2. Choose a significance level, , (again the convention is 5%). This is equivalent to


choosing a (1-)100% confidence interval, i.e. 5% significance level = 95%
confidence interval

3. Use the t-tables to find the appropriate critical value, which will again have T-2
degrees of freedom.


4. The confidence interval is given by ( t crit SE ( ), t crit SE ( ))

5. Perform the test: If the hypothesised value of (*) lies outside the confidence
interval, then reject the null hypothesis that = *, otherwise do not reject the null.

Introductory Econometrics for 50


Confidence Intervals Versus Tests of Significance

Note that the Test of Significance and Confidence Interval approaches always give the
same answer.

Under the test of significance approach, we would not reject H 0 that = * if the test
statistic lies within the non-rejection region, i.e. if

*
tcrit tcrit
SE ( )
Rearranging, we would not reject if

t crit SE ( ) * t crit SE ( )
But this is just therule interval approach.
crit SE ( ) * t crit SE ( )
tunder the confidence

Introductory Econometrics for 51


Constructing Tests of Significance and
Confidence Intervals: An Example

Using the regression results above,

y t 20.3 0.5091xt
, T=22
(14.38) (0.2561)
Using both the test of significance and confidence interval approaches,
test the hypothesis that =1 against a two-sided alternative.

The first step is to obtain the critical value. We want tcrit = t20;5%

Introductory Econometrics for 52


Determining the Rejection Region

f(x)

2.5% rejection region 2.5% rejection region

-2.086 +2.086

Introductory Econometrics for 53


Performing the Test

The hypotheses are:


H0 : = 1
H1 : 1

Test of significance Confidence interval


approach approach

test stat
* t crit SE ( )
SE ( )
05091
. 1 0.5091 2.086 0.2561
1917
.
0.2561 (0.0251,1.0433)
Do not reject H0 since Since 1 lies within the
test stat lies within confidence interval,
non-rejection region do not reject H0

Introductory Econometrics for 54


Testing other Hypotheses

What if we wanted to test H0 : = 0 or H0 : = 2?

Note that we can test these with the confidence interval approach.
For interest (!), test
H0 : = 0
vs. H1 : 0

H0 : = 2
vs. H1 : 2

Introductory Econometrics for 55


Changing the Size of the Test

But note that we looked at only a 5% size of test. In marginal cases


(e.g. H0 : = 1), we may get a completely different answer if we use a
different size of test. This is where the test of significance approach is
better than a confidence interval.

For example, say we wanted to use a 10% size of test. Using the test of
significance approach, *
test stat
SE ( )
05091
. 1
1917
.
0.2561
as above. The only thing that changes is the critical t-value.

Introductory Econometrics for 56


Changing the Size of the Test:
The New Rejection Regions

f(x)

5% rejection region 5% rejection region

-1.725 +1.725

Introductory Econometrics for 57


Changing the Size of the Test:
The Conclusion

t20;10% = 1.725. So now, as the test statistic lies in the rejection region,
we would reject H0.

Caution should therefore be used when placing emphasis on or making


decisions in marginal cases (i.e. in cases where we only just reject or
not reject).

Introductory Econometrics for 58


Some More Terminology

If we reject the null hypothesis at the 5% level, we say that the result
of the test is statistically significant.

Note that a statistically significant result may be of no practical


significance. E.g. if a shipment of cans of beans is expected to weigh
450g per tin, but the actual mean weight of some tins is 449g, the
result may be highly statistically significant but presumably nobody
would care about 1g of beans.

Introductory Econometrics for 59


The Errors That We Can Make
Using Hypothesis Tests

We usually reject H0 if the test statistic is statistically significant at a


chosen significance level.

There are two possible errors we could make:


1. Rejecting H0 when it was really true. This is called a type I error.
2. Not rejecting H0 when it was in fact false. This is called a type II error.
Reality
H0 is true H0 is false
Significant Type I error
Result of (reject H0) =
Test Insignificant Type II error
( do not =
reject H0)

Introductory Econometrics for 60


The Trade-off Between Type I and Type II Errors

The probability of a type I error is just , the significance level or size of test we
chose. To see this, recall what we said significance at the 5% level meant: it is only
5% likely that a result as or more extreme as this could have occurred purely by
chance.
Note that there is no chance for a free lunch here! What happens if we reduce the
size of the test (e.g. from a 5% test to a 1% test)? We reduce the chances of making a
type I error ... but we also reduce the probability that we will reject the null
hypothesis at all, so we increase the probability of a type II error:less likely
to falsely reject
Reduce size more strict reject null
of test criterion for hypothesis more likely to
rejection less often incorrectly not
reject
So there is always a trade off between type I and type II errors when choosing a
significance level. The only way we can reduce the chances of both is to increase the
sample size.
Introductory Econometrics for 61
The Exact Significance Level or p-value

This is equivalent to choosing an infinite number of critical t-values from tables.


It gives us the marginal significance level where we would be indifferent between
rejecting and not rejecting the null hypothesis.

If the test statistic is large in absolute value, the p-value will be small, and vice
versa. The p-value gives the plausibility of the null hypothesis.

e.g. a test statistic is distributed as a t62 = 1.47.


The p-value = 0.12.

Do we reject at the 5% level?...........................No


Do we reject at the 10% level?.........................No
Do we reject at the 20% level?.........................Yes

Introductory Econometrics for 62


Generalising the Simple Model to
Multiple Linear Regression

Before, we have used the model


yt xt ut t = 1,2,...,T
But what if our dependent (y) variable depends on more than one
independent variable?
For example the number of cars sold might plausibly depend on
1. the price of cars
2. the price of public transport
3. the price of petrol
4. the extent of the publics concern about global warming
Similarly, stock returns might depend on several factors.
Having just one independent variable is no good in this case - we want to
have more than one x variable. It is very easy to generalise the simple model
to one with k-1 regressors (independent variables).

Introductory Econometrics for 63


Multiple Regression and the Constant Term

Now we write
yt 1 , t=1,2,...,T
2 x2t 3 x3t ... k xkt ut
Where is x1? It is the constant term. In fact the constant term is usually represented by a
column of ones of length T:

1
1
x1


1 is the coefficient attached to the constant term 1 (which we called before).

Introductory Econometrics for 64


Different Ways of Expressing
the Multiple Linear Regression Model

We could write out a separate equation for every value of t:


y1 1 2 x21 3 x31 ... k xk1 u1
y2 1 2 x22 3 x32 ... k xk 2 u 2

yT 1 2 x2T 3 x3T ... k xkT uT
We can write this in matrix form
y = X +u
where y is T 1
X is T k
is k 1
u is T 1
Introductory Econometrics for 65
Inside the Matrices of the
Multiple Linear Regression Model

e.g. if k is 2, we have 2 regressors, one of which is a column of ones:

y1 1 x21 u1
y 1 x u
2 22 1 2

2

yT 1 x2T uT
T 1 T2 21 T1

Notice that the matrices written in this way are conformable.

Introductory Econometrics for 66


How Do We Calculate the Parameters (the )
in this Generalised Case?
Previously, we took the residual sum of squares, and minimised it w.r.t.
and .
In the matrix notation, we have
u1
u
u 2


u
T

The RSS would be given by


u1
u
u ' u u1 u2 uT 2 u12 u22 ... uT2 ut2


uT
Introductory Econometrics for 67
The OLS Estimator for the
Multiple Regression Model

In order to obtain the parameter estimates, 1, 2,..., k, we would


minimise the RSS with respect to all the s.

It can be shown that


1


2 ( X X ) 1 X y


k

Introductory Econometrics for 68


Calculating the Standard Errors for the
Multiple Regression Model

Check the dimensions: is k 1 as required.

But how do we calculate the standard errors of the coefficient estimates?

Previously, to estimate the variance of the errors, 2, we used s2


u2
t
.
T 2
2 u' u
Now using the matrix notation, we use s
Tk
where k = number of regressors. It can be proved that the OLS estimator of
the variance of is given by the diagonal elements of s2 ( X ' X ),1so that the
variance of is the
1first element, the variance of is the second
2 element, and
, and the variance of is the kth diagonal
element.
k
Introductory Econometrics for 69
Calculating Parameter and Standard Error Estimates
for Multiple Regression Models: An Example

Example: The following model with k=3 is estimated over 15 observations:


y 1 2 x2 3 x3 u
and the following data have been calculated from the original Xs.
2.0 35 .
. 10 30 .
( X ' X ) 1 35 . 10 . ,( X ' y) 2.2 , u' u 10.96
. 65
10 . 4.3
. 65 0.6
Calculate the coefficient estimates and their standard errors.
To calculate the coefficients, just multiply the matrix by the vector to obtain
X ' X 1 X ' y .
To calculate the standard errors, we need an estimate of 2.
RSS 10.96
s2 0.91
T k 15 3

Introductory Econometrics for 70


Calculating Parameter and Standard Error Estimates
for Multiple Regression Models: An Example (contd)

The variance-covariance matrix of is given by


183. 320
. 0.91
s2 ( X ' X ) 1 0.91( X ' X ) 1 320 . .
0.91 594
0.91 594
. .
393
The variances are on the leading diagonal:

Var ( 1 ) 183
. SE ( 1 ) 135
.
Var ( 2 ) 0.91 SE ( 2 ) 0.96
Var ( ) 3.93
3 SE ( ) 198
3.

We write:
y 1.10 4.40 x2t 19.88 x3t
1.35 0.96 1.98
Introductory Econometrics for 71
A Special Type of Hypothesis Test: The t-ratio

Recall that the formula for a test of significance approach to hypothesis


testing using a t-test was
i i*
test statistic
SE i
If the test is H 0 : i = 0
H 1 : i 0
i.e. a test that the population coefficient is zero against a two-sided
alternative, this is known as a t-ratio test:

i
Since i* = 0, test stat
SE ( i )

The ratio of the coefficient to its SE is known as the t-ratio or t-statistic.

Introductory Econometrics for 72


The t-ratio: An Example

In the last example above:


Coefficient 1.10 -4.40 19.88
SE 1.35 0.96 1.98
t-ratio 0.81 -4.63 10.04

Compare this with a tcrit with 15-3 = 12 d.f.


(2% in each tail for a 5% test) = 2.179 5%
= 3.055 1%

Do we reject H0: 1 = 0? (No)


H0: 2 = 0? (Yes)
H0: 3 = 0? (Yes)

Introductory Econometrics for 73


What Does the t-ratio tell us?

If we reject H0, we say that the result is significant. If the coefficient is not
significant (e.g. the intercept coefficient in the last regression above), then
it means that the variable is not helping to explain variations in y. Variables
that are not significant are usually removed from the regression model.
y In practice there are good statistical reasons for always having a constant
t

even if it is not significant. Look at what happens if no intercept is included:

xt

Introductory Econometrics for 74


Data Mining

Data mining is searching many series for statistical relationships


without theoretical justification.

For example, suppose we generate one dependent variable and twenty


explanatory variables completely randomly and independently of each
other.

If we regress the dependent variable separately on each independent


variable, on average one slope coefficient will be significant at 5%.

If data mining occurs, the true significance level will be greater than
the nominal significance level.

Introductory Econometrics for 75


An Example of the Use of a Simple t-test to Test a
Theory in Finance

Testing for the presence and significance of abnormal returns (Jensens


alpha - Jensen, 1968).

The Data: Annual Returns on the portfolios of 115 mutual funds from
1945-1964.

The model: R jt R ft j j ( Rmt R ft ) u jt for j = 1, , 115

We are interested in the significance of j.

The null hypothesis is H0: j = 0 .

Introductory Econometrics for 76


Frequency Distribution of t-ratios of Mutual Fund
Alphas (gross of transactions costs)

Source Jensen (1968). Reprinted with the permission of Blackwell publishers.

Introductory Econometrics for 77


Frequency Distribution of t-ratios of Mutual Fund
Alphas (net of transactions costs)

Source Jensen (1968). Reprinted with the permission of Blackwell publishers.

Introductory Econometrics for 78


Can UK Unit Trust Managers Beat the Market?

We now perform a variant on Jensens test in the context of the UK market,


considering monthly returns on 76 equity unit trusts. The data cover the
period January 1979 May 2000 (257 observations for each fund). Some
summary statistics for the funds are:
Mean Minimum Maximum Median
Average monthly return, 1979-2000 1.0% 0.6% 1.4% 1.0%
Standard deviation of returns over time 5.1% 4.3% 6.9% 5.0%

Jensen Regression Results for UK Unit Trust Returns, January 1979-May


2000
R jt R ft j j ( Rmt R ft ) jt

Introductory Econometrics for 79


Can UK Unit Trust Managers Beat the Market?
: Results

Estimates of Mean Minimum Maximum Median


-0.02% -0.54% 0.33% -0.03%
0.91 0.56 1.09 0.91
t-ratio on -0.07 -2.44 3.11
-0.25

In fact, gross of transactions costs, 9 funds of the


sample of 76 were able to significantly out-perform the
market by providing a significant positive alpha, while 7
funds yielded significant negative alphas.

Introductory Econometrics for 80


Performance of UK Unit Trusts 1979-2000

3500

3000

2500

2000

1500

1000

500

Introductory Econometrics for 81


The Overreaction Hypothesis and
the UK Stock Market
Motivation
Two studies by DeBondt and Thaler (1985, 1987) showed that stocks which
experience a poor performance over a 3 to 5 year period tend to outperform
stocks which had previously performed relatively well.
How Can This be Explained?
2 suggestions
1. A manifestation of the size effect
DeBondt & Thaler did not believe this a sufficient explanation, but Zarowin
(1990) found that allowing for firm size did reduce the subsequent return on the
losers.
2. Reversals reflect changes in equilibrium required returns
Ball & Kothari (1989) find the CAPM beta of losers to be considerably higher
than that of winners.

Introductory Econometrics for 82


The Overreaction Hypothesis and
the UK Stock Market (contd)

Another interesting anomaly: the January effect.


Another possible reason for the superior subsequent performance
of losers.
Zarowin (1990) finds that 80% of the extra return available from
holding the losers accrues to investors in January.

Example study: Clare and Thomas (1995)


Data:
Monthly UK stock returns from January 1955 to 1990 on all firms
traded on the London Stock exchange.

Introductory Econometrics for 83


Methodology

Calculate the monthly excess return of the stock over the market over a 12,
24 or 36 month period for each stock i:

Uit = Rit - Rmt n = 12, 24 or 36 months

Calculate the average monthly return for the stock i over the first 12, 24,
or 36 month period:
1 n
Ri U it
n t 1

Introductory Econometrics for 84


Portfolio Formation

Then rank the stocks from highest average return to lowest and from 5
portfolios:

Portfolio 1: Best performing 20% of firms


Portfolio 2: Next 20%
Portfolio 3: Next 20%
Portfolio 4: Next 20%
Portfolio 5: Worst performing 20% of firms.

Use the same sample length n to monitor the performance of each


portfolio.

Introductory Econometrics for 85


Portfolio Formation and
Portfolio Tracking Periods

How many samples of length n have we got?


n = 1, 2, or 3 years.

If n = 1year:
Estimate for year 1
Monitor portfolios for year 2
Estimate for year 3
...
Monitor portfolios for year 36

So if n = 1, we have 18 INDEPENDENT (non-overlapping) observation /


tracking periods.

Introductory Econometrics for 86


Constructing Winner and Loser Returns

Similarly, n = 2 gives 9 independent periods and n = 3 gives 6 independent periods.


Calculate monthly portfolio returns assuming an equal weighting of stocks in each
portfolio.
Denote the mean return for each month over the 18, 9 or 6 periods for the winner
and loser portfolios respectively as and respectively.
R pW R pL
Define the difference between these as = - .
RDt R pL R pW
Then perform the regression
= 1 + t (Test 1)

RDt
Look at the significance of 1.

Introductory Econometrics for 87


Allowing for Differences in the Riskiness
of the Winner and Loser Portfolios

Problem: Significant and positive 1 could be due to higher return being


required on loser stocks due to loser stocks being more risky.

Solution: Allow for risk differences by regressing against the market risk
premium:

RDt = 2 + (Rmt-Rft) + t (Test 2)

where
Rmt is the return on the FTA All-share
Rft is the return on a UK government 3 month t-bill.

Introductory Econometrics for 88


Is there an Overreaction Effect in the
UK Stock Market? Results

Panel A: All Months


n = 12 n = 24 n =36
Return on Loser 0.0033 0.0011 0.0129
Return on Winner 0.0036 -0.0003 0.0115
Implied annualised return difference -0.37% 1.68% 1.56%

Coefficient for (3.47): 1 -0.00031 0.0014** 0.0013


(0.29) (2.01) (1.55)

Coefficients for (3.48): 2 -0.00034 0.00147** 0.0013*


(-0.30) (2.01) (1.41)
-0.022 0.010 -0.0025
(-0.25) (0.21) (-0.06)
Panel B: All Months Except January
Coefficient for (3.47): 1 -0.0007 0.0012* 0.0009
(-0.72) (1.63) (1.05)
Notes: t-ratios in parentheses; * and ** denote significance at the 10% and 5% levels
respectively. Source: Clare and Thomas (1995). Reprinted with the permission of Blackwell
Publishers.

Introductory Econometrics for 89


Are the Losers Small?

Rank the firms according to size


i = 1 smallest 20% of firms
...
i = 5 largest 20% of firms

Then also rank firms according to their average return as previously.

j = 1 worst performing 20% of firms


...
j = 5 best performing 20% of firms

Introductory Econometrics for 90


Portfolio Ranking by Performance and Size

ij
So R is the return on the ith sized portfolio and the jth return ranking
portfolio (dropping time subscripts for simplicity)m, so
RD1 R 11 R 51 (4)
RD2 R 12 R 52 (5)
RD3 R 13 R 53 (6)
RD4 R 14 R 54 (7)
RD5 R 15 R 55 (8)

Introductory Econometrics for 91


Overreaction versus Small Firm Effects for
a 1-Year Horizon

Panel A: All Months Regression Equation


(3.57) (3.58) (3.59) (3.60) (3.61)
Return on Loser 0.00109 0.00509 0.00583 0.00825 0.00778
Return on Winner -0.00141 0.00116 0.00415 0.00473 0.00457

Coefficient for Test 1: 1 0.0025 0.0040 0.0017 0.0035 0.0032


(0.93) (1.30) (0.80) (1.05) (1.06)

Coefficients for Test 2: 2 0.0023 0.0032 0.0014 0.0028 0.026


(0.83) (1.73) (0.69) (1.12) (1.06)
-0.21 -0.63** -0.258 -0.625** -0.50**
(-0.99) (-4.50) (-1.68) (-3.24) (-2.63)
Panel B: All Months Except January
Coefficient for Test 1: 1 -0.0011 0.0035 0.002 0.0027 0.0038
(-0.44) (1.09) (0.94) (0.75) (1.16)
Notes: t-ratios in parentheses; * and ** denote significance at the 10% and 5% levels respectively.
Source: Clare and Thomas (1995). Reprinted with the permission of Blackwell Publishers.

Introductory Econometrics for 92


Overreaction versus Small Firm Effects for
a 2-Year Horizon

Panel A: All Months Regression Equation


(3.57) (3.58) (3.59) (3.60) (3.61)
Return on Loser 0.00591 0.0080 0.00715 0.0044 0.00708
Return on Winner 0.00218 0.0026 0.00197 0.0002 -0.00057

Coefficient for Test 1: 1 0.0037 0.0053** 0.0052** 0.0042** 0.008**


(1.39) (2.01) (1.97) (1.91) (3.03)

Coefficients for Test 2: 2 0.0035 0.004 0.0034 0.0035 0.0068**


(1.23) (1.56) (1.45) (1.54) (2.64)
-0.024 -0.129** -0.167** -0.075 -0.09
(-0.34) (-2.08) (-2.90) (-1.37) (-1.40)
Panel B: All Months Except January
Coefficient for Test 1: 1 0.0028 0.0054* 0.005* 0.0037 0.009**
(0.99) (1.92) (1.86) (1.60) (3.42)
Notes: t-ratios in parentheses; * and ** denote significance at the 10% and 5% levels respectively.
Source: Clare and Thomas (1995). Reprinted with the permission of Blackwell Publishers.

Introductory Econometrics for 93


Overreaction versus Small Firm Effects for
a 3-Year Horizon

Panel A: All Months Regression Equation


(3.57) (3.58) (3.59) (3.60) (3.61)
Return on Loser 0.01591 0.01484 0.01477 0.01202 0.12420
Return on Winner 0.01155 0.01335 0.01176 0.01130 0.00856

Coefficient for Test 1: 1 0.0044 0.0015 0.003 0.0007 0.0039


(1.55) (0.55) (1.21) (0.21) (1.13)

Coefficients for Test 2: 2 0.0071** 0.0046* 0.0054** 0.005 0.0071**


(2.56) (1.87) (2.21) (1.45) (2.14)
-0.26** -0.30** -0.23** -0.38** -0.32**
(-2.80) (-3.63) (-2.77) (-3.50) (-2.81)
Panel B: All Months Except January
Coefficient for Test 1: 1 0.0046* 0.001 0.0024 0.00004 0.003
(1.73) (0.37) (1.01) (0.01) (0.94)
Notes: t-ratios in parentheses; * and ** denote significance at the 10% and 5% levels respectively.
Source: Clare and Thomas (1995). Reprinted with the permission of Blackwell Publishers.

Introductory Econometrics for 94


Testing for Seasonal Effects in Overreactions

Is there evidence that losers out-perform winners more at one time of the
year than another?

To test this, calculate the difference between the winner & loser portfolios
as previously, RDt , and regress12this on 12 month-of-the-year dummies:
RDt i Mi t
i 1

Significant out-performance of losers over winners in,


June (for the 24-month horizon), and
January, April and October (for the 36-month horizon)
winners appear to stay significantly as winners in
March (for the 12-month horizon).

Introductory Econometrics for 95


Conclusions

Evidence of overreactions in stock returns.

Losers tend to be small so we can attribute most of the overreaction in the


UK to the size effect.

Comments

Small samples
No diagnostic checks of model adequacy

Introductory Econometrics for 96


Testing Multiple Hypotheses: The F-test

We used the t-test to test single hypotheses, i.e. hypotheses involving only
one coefficient. But what if we want to test more than one coefficient
simultaneously?

We do this using the F-test. The F-test involves estimating 2 regressions.

The unrestricted regression is the one in which the coefficients are freely
determined by the data, as we have done before.

The restricted regression is the one in which the coefficients are


restricted, i.e. the restrictions are imposed on some s.

Introductory Econometrics for 97


The F-test:
Restricted and Unrestricted Regressions

Example
The general regression is
yt = 1 + 2x2t + 3x3t + 4x4t + ut (1)

We want to test the restriction that 3+4 = 1 (we have some hypothesis
from theory which suggests that this would be an interesting hypothesis to
study). The unrestricted regression is (1) above, but what is the restricted
regression?
yt = 1 + 2x2t + 3x3t + 4x4t + ut s.t. 3+4 = 1

We substitute the restriction (3+4 = 1) into the regression so that it is


automatically imposed on the data.
3+4 = 1 4 = 1- 3

Introductory Econometrics for 98


The F-test: Forming the Restricted Regression

yt = 1 + 2x2t + 3x3t + (1-3)x4t + ut


yt = 1 + 2x2t + 3x3t + x4t - 3x4t + ut

Gather terms in s together and rearrange


(yt - x4t) = 1 + 2x2t + 3(x3t - x4t) + ut

This is the restricted regression. We actually estimate it by creating two new


variables, call them, say, Pt and Qt.
Pt = yt - x4t
Qt = x3t - x4t so
Pt = 1 + 2x2t + 3Qt + ut is the restricted regression we actually estimate.

Introductory Econometrics for 99


Calculating the F-Test Statistic

The test statistic is given by

RRSS URSS T k
test statistic
URSS m
where URSS = RSS from unrestricted regression
RRSS = RSS from restricted regression
m = number of restrictions
T = number of observations
k = number of regressors in unrestricted regression
including a constant in the unrestricted regression (or the total number
of parameters to be estimated).

Introductory Econometrics for 100


The F-Distribution

The test statistic follows the F-distribution, which has 2 d.f.


parameters.

The value of the degrees of freedom parameters are m and (T-k)


respectively (the order of the d.f. parameters is important).

The appropriate critical value will be in column m, row (T-k).

The F-distribution has only positive values and is not symmetrical. We


therefore only reject the null if the test statistic > critical F-value.

Introductory Econometrics for 101


Determining the Number of Restrictions in an F-test

Examples :
H0: hypothesis No. of restrictions, m
1 + 2 = 2 1
2 = 1 and 3 = -1 2
2 = 0, 3 = 0 and 4 = 0 3
If the model is yt = 1 + 2x2t + 3x3t + 4x4t + ut,
then the null hypothesis
H0: 2 = 0, and 3 = 0 and 4 = 0 is tested by the regression F-statistic. It
tests the null hypothesis that all of the coefficients except the intercept
coefficient are zero.
Note the form of the alternative hypothesis for all tests when more than one
restriction is involved: H1: 2 0, or 3 0 or 4 0

Introductory Econometrics for 102


What we Cannot Test with Either an F or a t-test

We cannot test using this framework hypotheses which are not linear
or which are multiplicative, e.g.
H0: 2 3 = 2 or H0: 2 2 = 1
cannot be tested.

Introductory Econometrics for 103


The Relationship between the t and the F-
Distributions

Any hypothesis which could be tested with a t-test could have been tested
using an F-test, but not the other way around.

For example, consider the hypothesis


H0: 2 = 0.5
H1: 2 0.5
2 0.5
We could have tested this using the usual t-test: test stat
SE ( 2 )
or it could be tested in the framework above for the F-test.
Note that the two tests always give the same result since the t-distribution
is just a special case of the F-distribution.
For example, if we have some random variable Z, and Z t (T-k) then also
Z2 F(1,T-k)

Introductory Econometrics for 104


F-test Example

Question: Suppose a researcher wants to test whether the returns on a company


stock (y) show unit sensitivity to two factors (factor x2 and factor x3) among three
considered. The regression is carried out on 144 monthly observations. The
regression is yt = 1 + 2x2t + 3x3t + 4x4t+ ut
- What are the restricted and unrestricted regressions?
- If the two RSS are 436.1 and 397.2 respectively, perform the test.
Solution:
Unit sensitivity implies H0:2=1 and 3=1. The unrestricted regression is the one in
the question. The restricted regression is (yt-x2t-x3t)= 1+ 4x4t+ut or letting
zt=yt-x2t-x3t, the restricted regression is zt= 1+ 4x4t+ut
In the F-test formula, T=144, k=4, m=2, RRSS=436.1, URSS=397.2
F-test statistic = 6.68. Critical value is an F(2,140) = 3.07 (5%) and 4.79 (1%).
Conclusion: Reject H0.

Introductory Econometrics for 105

Вам также может понравиться