
Lecture note 11
Forecasting, Box-Jenkins, and Unit Root Tests

11.1 Forecast Evaluation
11.2 Forecast Exercise
11.3 Box-Jenkins Methodology
11.4 Unit Root Tests









11.1 Forecast Evaluation
Recall:
Suppose we know the true generating process, for example the AR(1) model y_t = a_0 + a_1 y_{t-1} + ε_t. Then the one-step-ahead forecast would be

E_t[y_{t+1}] = a_0 + a_1 y_t.

The error ε_{t+1} is purely random at time t; it is the unpredictable part of y_{t+1}. Therefore, we always encounter forecast error.

In practice, we never know the actual order of the process or the parameters, i.e., a_0, a_1, ..., a_p. We need to estimate all of these parameters.

Therefore, the one-step-ahead forecast is

ŷ_{t+1|t} = â_0 + â_1 y_t.

Obviously, ŷ_{t+1|t} will not be the same as E_t[y_{t+1}], as the forecasts are made using the estimated model.


Consider a true generating process that is, say, an AR(2). If we use the AR(2) model, it will fit the generated data well. However, sometimes a simpler model, say an AR(1), may have a better forecast result than the true model.

Why so?
Generally, a large model contains in-sample parameter-estimation uncertainty (overfitting) that induces forecast errors. Clark and West (2007), Dimitrios and Guerard (2004) and Liu and Enders (2003) show that forecasts using overly parsimonious models with little uncertainty can provide better forecasts than models consistent with the actual data-generating process.

How good are my forecasts?
Compare to the realizations.
Compare to different forecast models.
Need to have comparison criteria.


How do we calculate and compare the forecast errors among the models?

Suppose we have 500 observations denoted y_1, ..., y_500. We can use 90% of the observations, i.e. 450 observations, to estimate the competing models. (The more observations we have, the more data we can withhold for evaluation.)

We use the estimated models to forecast y_451. Since we know the realization of y_451, we can easily calculate the one-step-ahead forecast error e_451 = y_451 - ŷ_451 for each competing model.

At t = 451, we can use y_1, ..., y_451 to re-estimate the models. We then use the newly estimated models to forecast y_452 and calculate e_452. [Recursive scheme]

Repeating the process, we will have 50 forecasts (ŷ_451, ..., ŷ_500) and 50 forecast errors (e_451, ..., e_500) for each competing model.


Remarks: Use the AR(1) model and one-step-ahead forecasts as an example.

(a) Recursive scheme:
At time 450, when we forecast y_451, the AR(1) is estimated based on y_1, ..., y_450.
At time 451, when we forecast y_452, the AR(1) is estimated based on y_1, ..., y_451.
At time 452, when we forecast y_453, the AR(1) is estimated based on y_1, ..., y_452.
(A Stata sketch of this scheme follows below.)

(b) Rolling scheme:
At time 450, when we forecast y_451, the AR(1) is estimated based on y_1, ..., y_450.
At time 451, when we forecast y_452, the AR(1) is estimated based on y_2, ..., y_451.
At time 452, when we forecast y_453, the AR(1) is estimated based on y_3, ..., y_452.

(c) Fixed scheme:
At time 450, when we forecast y_451, the AR(1) is estimated based on y_1, ..., y_450.
At time 451, when we forecast y_452, the AR(1) is still estimated based on y_1, ..., y_450.
At time 452, when we forecast y_453, the AR(1) is still estimated based on y_1, ..., y_450.
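
A minimal Stata sketch of the recursive scheme (a) for an AR(1); it assumes the data have been tsset with time variable t and 500 observations, and the names yhatA and errA are illustrative:

gen yhatA = .
forvalues i = 450/499 {
    quietly arima y if t <= `i', ar(1)         /* re-estimate on observations 1..i */
    quietly predict tmp, xb                    /* one-step-ahead (xb) predictions */
    quietly replace yhatA = tmp if t == `i' + 1
    drop tmp
}
gen errA = y - yhatA                           /* the 50 one-step-ahead forecast errors */

For the rolling scheme, replace the estimation line with arima y if t > `i' - 450 & t <= `i', ar(1); for the fixed scheme, estimate once on t <= 450 and only update the forecasts.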

Forecast Evaluation Criteria
(Let H denote the number of hold-out forecasts and e_i = y_i - ŷ_i the forecast errors.)

(A) Mean Squared Error: MSE = (1/H) Σ e_i²

(B) Root Mean Squared Error: RMSE = sqrt(MSE)

(C) Mean Percentage Error: MPE = (1/H) Σ (e_i / y_i) × 100%

(D) Mean Absolute Error: MAE = (1/H) Σ |e_i|

(E) Mean Absolute Percentage Error: MAPE = (1/H) Σ |e_i / y_i| × 100%
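
A quick Stata sketch for computing these criteria from one-step-ahead forecasts (the names y, yhat, and the generated variables are illustrative):

Stata> generate err = y - yhat
Stata> generate sqerr = err^2
Stata> generate abserr = abs(err)
Stata> generate pcterr = 100*err/y
Stata> quietly summarize sqerr
Stata> display "MSE = " r(mean) _n "RMSE = " sqrt(r(mean))
Stata> quietly summarize abserr
Stata> display "MAE = " r(mean)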




How do we know whether the MSE(s) of two models are statistically different from each other? Put differently, how do we know the difference between two models is not due to pure chance?

The F Statistic
Taking the MSE as an example, we can use the F test, F = MSE_A / MSE_B ~ F(H, H), where MSE_A is calculated based on the forecast errors of Model A and MSE_B is calculated based on the forecast errors of Model B. Note that the larger MSE should be put in the numerator.

If both models are indifferent in terms of forecasting performance, the F statistic value should be close to 1; if the F statistic value is larger than the critical value, then we reject the null hypothesis H_0 that Models A and B are indifferent.
o Note that the F test here is valid only if
(i) the forecast errors have zero mean and are normally distributed;
(ii) the forecast errors are not serially correlated (which is often violated when we have j-step-ahead forecasts, j > 1);
(iii) the forecast errors of the two competing models are not contemporaneously correlated.
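
A purely illustrative calculation (hypothetical numbers, not from the exercise below): suppose H = 50 hold-out forecasts give MSE_A = 0.04 and MSE_B = 0.02. Then

F = MSE_A / MSE_B = 0.04 / 0.02 = 2.0 > F_0.05(50, 50) ≈ 1.60,

so we would reject the null that the two models forecast equally well.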





The Granger-Newbold Test
Define the general loss function as g(e_t). For example, the quadratic loss function is g(e_t) = e_t².

Denote the loss differential between the two forecasts by d_t = g(e_1t) - g(e_2t).

Two models are equally good if E(d_t) = 0 for all t. Otherwise, E(d_t) ≠ 0.

If Assumptions (i) and (ii) discussed for the F test hold, Granger and Newbold (1976) showed that for the quadratic loss function, testing E(d_t) = 0 is equivalent to testing ρ_xz = 0, where ρ_xz is just the correlation coefficient between x_t = e_1t + e_2t and z_t = e_1t - e_2t.
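
The mechanics, for reference (a standard restatement, with H hold-out forecasts): under quadratic loss, d_t = e_1t² - e_2t² = (e_1t + e_2t)(e_1t - e_2t) = x_t z_t, so E(d_t) = 0 exactly when x_t and z_t are uncorrelated. The test statistic is

GN = r_xz / sqrt((1 - r_xz²)/(H - 1)) ~ t_(H-1) under H_0,

where r_xz is the sample correlation between x_t and z_t. This is exactly the statistic computed in the example below.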

Example:
One-step-ahead forecast errors for 7 periods.

e1:  0.01  0.05  -0.01   0.02  0.03  0.07   0.05
e2:  0.02  0.06   0.01  -0.04  0.01  0.03   0.1

Answer:
x = e1 + e2:   0.03   0.11   0.00  -0.02  0.04  0.10   0.16
z = e1 - e2:  -0.01  -0.01  -0.02   0.06  0.02  0.04  -0.05

. corr x z

             x        z
x       1.0000
z      -0.5915   1.0000

. display (-0.5915)/sqrt((1-0.5915*0.5915)/6)
-1.7969294

Since |GN| = 1.797 < t_0.025(6) = 2.447, we don't reject that both models are indifferent.

Harvey et al. (1997) suggested that researchers run a t test on d_t, i.e., test the null H_0: E(d_t) = 0 (where the alternative is H_1: E(d_t) ≠ 0). If the null is not rejected, both models are equally good. Otherwise, one is better than the other.

A simple and quick way is to regress d_t (the difference of the forecast errors between the two models) on a constant term only and use the t test to determine whether the estimated intercept is statistically different from zero.

. gen d = e1-e2
. reg d

Source    |         SS   df          MS    Number of obs =       7
          |                                F(  0,     6) =    0.00
Model     |          0    0           .    Prob > F      =       .
Residual  | .011571428    6  .001928571    R-squared     =  0.0000
          |                                Adj R-squared =  0.0000
Total     | .011571428    6  .001928571    Root MSE      =  .04392

        d |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
    _cons | -.0042857  .0165985   -0.26   0.805   -.0449008   .0363294

11.2 Forecast Exercise
Example:
Use 100 simulated data points from an AR(1) process, y_t = a_1 y_{t-1} + ε_t, where ε_t is white noise. We consider the AR(1) and AR(2) models, and use the first 90 observations for estimation. We keep the last 10 observations for forecast evaluation.

Stata> arima y, arima(1,0,0) noconstant

ARIMA regression
Sample: 1 - 100                                  Number of obs = 100
                                                 Wald chi2(1)  = 120.43
Log likelihood = -146.5589                       Prob > chi2   = 0.0000

          |            OPG
        y |     Coef.  Std. Err.      z   P>|z|   [95% Conf. Interval]
ARMA      |
  ar  L1. |  .7881723  .0718208  10.97   0.000    .6474061   .9289385
   /sigma |  1.042636  .0546465  19.08   0.000    .9355311   1.149742

. estat ic

Model |  Obs  ll(null)  ll(model)  df       AIC       BIC
    . |  100         .  -146.5589   2  297.1177  302.3281

Stata> arima y, arima(2,0,0) noconstant

ARIMA regression
Sample: 1 - 100                                  Number of obs = 100
                                                 Wald chi2(2)  = 120.05
Log likelihood = -146.5302                       Prob > chi2   = 0.0000

          |            OPG
        y |     Coef.  Std. Err.      z   P>|z|   [95% Conf. Interval]
ARMA      |
  ar  L1. |  .8056681  .0917879   8.78   0.000    .6257671   .9855691
      L2. | -.0255046  .1024235  -0.25   0.803   -.2262508   .1752417
   /sigma |  1.042314  .0551962  18.88   0.000     .934131   1.150496

. estat ic

Model |  Obs  ll(null)  ll(model)  df       AIC       BIC
    . |  100         .  -146.5302   3  299.0604  306.8759
Note: N=Obs used in calculating BIC; see [R] BIC note

The information criteria for the two models:
(a) The AIC and BIC suggest the AR(1) is better, even though the AR(2) has a slightly higher log likelihood and so fits the estimation sample slightly better.
(b) But this doesn't necessarily mean the AR(1) delivers a better forecast result.
(c) The estimates are based on 90% of the sample observations. We can evaluate the forecast performances of the AR(1) and the AR(2) based on the remaining 10% of the observations.

One-step-ahead forecast
Stata> predict yA if t > (90) /* 1-step-ahead forecast for y after the 90th observation */
Stata> tsline y yA /* graphically compare the true value y and the forecast yA */
Stata> tsline yA if t > (90) || tsline y if t < (91) /* plot the forecast yA and the historical y together */

The AR(1) model:
[Figures: y plotted with the one-step-ahead xb predictions, t = 0 to 100]

The AR(2) model:
[Figures: y plotted with the one-step-ahead xb predictions, t = 0 to 100]

Note: Although the BIC and AIC suggest the AR(1) model is better, graphically the forecast points are very similar between these 2 models.

Forecast Evaluation: One-step-ahead forecast
Stata> generate errorA = y - yA /* calculate the forecast error */
Stata> generate SqError = errorA*errorA /* square the forecast errors */
Stata> summarize SqError /* the mean squared error is reported as the mean */

Model A:
Variable |  Obs      Mean  Std. Dev.       Min       Max
   sqE_A |   10  2.568757   5.488667  .1343741  18.02609

Model B:
Variable |  Obs      Mean  Std. Dev.       Min       Max
   sqE_B |   11  2.330664   5.306183  .0487499  18.17375

If we look at the MSEs of Model A and Model B, Model B's MSE is smaller.

Suppose we create a variable called d (defined as the difference between errorA and errorB, where errorA contains the forecast errors of Model A). Regress d on the constant term only; we obtain the following result.

. gen d = errorA - errorB
(90 missing values generated)
. reg d

Source    |         SS   df          MS    Number of obs =      10
          |                                F(  0,     9) =    0.00
Model     |          0    0           .    Prob > F      =       .
Residual  | .007554524    9  .000839392    R-squared     =  0.0000
          |                                Adj R-squared =  0.0000
Total     | .007554524    9  .000839392    Root MSE      =  .02897

        d |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
    _cons |  -.008101  .0091618   -0.88   0.400   -.0288265   .0126245

Another test is the Granger-Newbold test. We create the variables x = errorA + errorB and z = errorA - errorB, and use the t test discussed previously.

. corr x z
(obs=10)

             x        z
x       1.0000
z       0.0106   1.0000

. di (0.01606)/sqrt( (1-0.1606*0.1606)/9)
.04881362

The t statistic is 0.0488, which is clearly less than the t_0.025(9) = 2.262 critical value. Both models are indifferent in terms of forecasting performance.

Confidence Interval for One-step forecast
We use Model A as an example.
Stata> predict sigma2, mse /* the mean squared error of the prediction */
Stata> generate upper = yA + 1.96*sqrt(sigma2)
Stata> generate lower = yA - 1.96*sqrt(sigma2)
Stata> tsline yA upper lower if t > (90) || tsline y if t < (91) /* plot the forecast yA with its 95% CI and the historical y together */

[Figure: one-step-ahead prediction with 95% CI and historical y, t = 0 to 100]

J-step-ahead forecast
Stata> predict y_J, dynamic(90) /* at time 90, forecast y_91, ..., y_100 */
Stata> tsline y_J if t > (90) || tsline y if t < (91)

The true model is an AR(1) without a constant, so the (long-term) mean is 0. [Recall: E(y_t) = a_0/(1 - a_1) = 0 when a_0 = 0.]
Therefore, over time the forecast values converge to 0.

[Figure: J-step-ahead forecasts and historical y, t = 0 to 100]

Example:
Use 200 simulated data points from an AR(1) process with a constant, y_t = a_0 + a_1 y_{t-1} + ε_t, where ε_t is white noise. We used 180 data points to estimate the model. Then we forecast 20 points ahead.
Stata> arima y if t < (181), ar(1) /* the estimation uses data points 1, 2, ..., 180 */
Stata> predict y_J, dynamic(180) /* at time 180, forecast y_181, ..., y_200 */
Stata> tsline y_J if t > (179) || tsline y if t < (181)

The long-term mean is a_0/(1 - a_1), to which the J-step-ahead forecasts converge.

[Figure: J-step-ahead forecasts and historical y, time = 0 to 200]

Example: Different commands for estimating ARIMA models.
Use 10,000 simulated data points from an AR(1) process with a constant, y_t = a_0 + a_1 y_{t-1} + ε_t (the estimates below are consistent with a_0 = 2, a_1 = 0.5, and σ = 1).

Stata A> arima y, arima(1,0,0)

Stata B> arima y, ar(1) /* If it is an arma model, then we can type > arima y, ar(1/p) ma(1/q) */

Both commands produce identical output:

ARIMA regression
Sample: 0 - 9999                                 Number of obs = 10000
                                                 Wald chi2(1)  = 3359.37
Log likelihood = -14251.09                       Prob > chi2   = 0.0000

          |            OPG
        y |     Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
y         |
    _cons |  4.017689  .0200645  200.24   0.000    3.978363   4.057015
ARMA      |
  ar  L1. |  .4988331  .0086065   57.96   0.000    .4819647   .5157015
   /sigma |  1.006175  .0069936  143.87   0.000    .9924676   1.019882

Stata C> reg y L1.y /* note that if we have 3 lags of y, we can type reg y L1.y L2.y L3.y */

. reg y L1.y

Source    |         SS     df          MS    Number of obs =    9999
          |                                  F(  1,  9997) = 3307.20
Model     |  3345.9872      1   3345.9872    Prob > F      =  0.0000
Residual  | 10114.2468   9997   1.0117282    R-squared     =  0.2486
          |                                  Adj R-squared =  0.2485
Total     |  13460.234   9998  1.34629266    Root MSE      =  1.0058

        y |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
   y  L1. |  .4984087  .0086667  57.51   0.000    .4814201   .5153972
    _cons |  2.015774  .0362439  55.62   0.000    1.944729    2.08682

Stata D> arima y L1.y /* the same regression, estimated by maximum likelihood within arima */

ARIMA regression
Sample: 1 - 9999                                 Number of obs = 9999
                                                 Wald chi2(1)  = 3351.87
Log likelihood = -14245.26                       Prob > chi2   = 0.0000

          |            OPG
        y |     Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
y         |
   y  L1. |  .4984087  .0086088   57.90   0.000    .4815358   .5152816
    _cons |  2.015774  .0357956   56.31   0.000    1.945616   2.085932
   /sigma |  1.005746  .0069965  143.75   0.000    .9920334   1.019459

Note: Command C gives us the same coefficient information (here _cons is the regression intercept, 2.016 ≈ 4.018 × (1 - 0.498), not the unconditional mean reported by arima y, ar(1)), but after reg we can't use the predict ..., dynamic() command.

11.3 Box-Jenkins Methodology
We have discussed the AR(p), MA(q), and ARMA(p, q) models.

Several assumptions are imposed on these models.
The time series y_t is weakly stationary. Speaking differently, the time series is invariant with respect to time. Thus,
o The constructed model (AR, or MA, or ARMA) is stable.
o If the series is not stationary, the system will explode.
o We can't do the forecasting exercise.
ε_t is white noise.
o Its mean is zero, its variance is σ², and the errors are uncorrelated across time periods.


Example:
Suppose we are given raw time series data, say y_t. How do we model (or forecast) y_t? Please state the procedure step by step. (A Stata sketch of the whole workflow follows the steps.)
(1) Identification
o Stationary or non-stationary? (If non-stationary, first take the difference of the variable, Δy_t = y_t - y_{t-1}. Sometimes we might want to remove the seasonal effect by using a seasonal difference, e.g., y_t - y_{t-4}.)
o If the series is stationary, then we might use correlograms to decide the models and the # of lags.
(2) Estimation
o Run the regressions. If we can't decide on the model in stage (1), we can use the AIC (or BIC) criterion here.
(3) Diagnostic Checking
We need to include enough AR & MA terms to make sure the residual terms in the models are white noise.
o The coefficients of the p-th and q-th lags must be significant, but the interior ones need not be. We can skip the interim terms if they are not useful.
o Test the residuals; if the residuals are white noise, the model is considered OK.
(4) Forecasting (either one-step-ahead or j-step-ahead)
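
A minimal Stata sketch of the four steps (the ARMA(1,1) order and the variable names are illustrative; assume the data have been tsset):

Stata> ac y
Stata> pac y /* (1) identification: inspect the correlograms */
Stata> arima y, arima(1,0,1) /* (2) estimation */
Stata> estat ic /* compare AIC/BIC across candidate models */
Stata> predict res, residuals /* (3) diagnostic checking */
Stata> wntestq res /* portmanteau (Ljung-Box) Q test; H0: residuals are white noise */
Stata> predict yhat, xb /* (4) one-step-ahead forecasts */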


11.4 Unit Root Tests
Stationarity is extremely important for time series analysis (not just for the AR, MA, and ARMA models).

If a time series is not stationary, we must transform the non-stationary series into a stationary series before any regression analysis. Otherwise, the regression results are not reliable. Usually, taking the (first) difference of the variable is one of the possible ways to stationarize the series, i.e., Δy_t = y_t - y_{t-1}.

Graphically, if the ACF diagram of the series y_t does not die out quickly enough after many lags, then the series is very likely a nonstationary series.


Example: We analyze the hp (UK housing price) variable.


Graphically, looking at the PACF we might conclude that the hp variable can be modeled as an AR(1). However, if we look at the ACF, it dies out very slowly (very likely nonstationary!).
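
The correlograms can be produced with (assuming the series has been tsset):

Stata> ac hp /* autocorrelation function */
Stata> pac hp /* partial autocorrelation function */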


What if you still intend to use the AR(1) model?

Dependent Variable: HP
Method: Least Squares
Date: 10/25/11 Time: 17:54
Sample (adjusted): 1991M02 2007M05
Included observations: 196 after adjustments
Convergence achieved after 10 iterations


Variable Coefficient Std. Error t-Statistic Prob.


C 25800.90 12064.50 2.138580 0.0337
AR(1) 1.010519 0.001691 597.4181 0.0000


R-squared 0.999457 Mean dependent var 88796.29
Adjusted R-squared 0.999454 S.D. dependent var 42311.82
S.E. of regression 988.7413 Akaike info criterion 16.64089
Sum squared resid 1.90E+08 Schwarz criterion 16.67434
Log likelihood -1628.808 F-statistic 356908.3
Durbin-Watson stat 1.394712 Prob(F-statistic) 0.000000


Inverted AR Roots 1.01
Estimated AR process is nonstationary



The statistical report shows that the variable is not stationary:
o The estimated AR(1) coefficient is 1.0105, which is (slightly) larger than 1.
o The inverted AR root is 1.01, and the report warns that the estimated AR process is nonstationary.
Therefore, the model is not reliable and will eventually be explosive.


We've seen that if the AR root is 1 or larger than 1, the variable is nonstationary, and therefore the regression result is not reliable. This type of regression is called a spurious regression. Usually the R-squared is very high, but it is misleading.

Graphically, for non-stationary variables, the ACF does not die out easily (very persistent!).
On the other hand, for stationary variables the ACF declines exponentially.

How do we statistically test whether the time series data y_t are stationary?
Suppose we have an AR(1) model, y_t = a_1 y_{t-1} + ε_t.

A unit root exists if a_1 = 1.

Therefore, we can test whether a_1 is significantly different from 1 or not.

Alternatively, subtracting y_{t-1} from both sides, we can test the regression Δy_t = γ y_{t-1} + ε_t, where γ = a_1 - 1, and examine whether γ is zero.


11.4.1 Dickey-Fuller (DF) Test

H_0: γ = 0 (there is a unit root) versus H_1: γ < 0 (the series is stationary)

There are three versions of the DF test.
Case 1: Δy_t = γ y_{t-1} + ε_t
o Case 1 is the simplest form.
Case 2: Δy_t = a_0 + γ y_{t-1} + ε_t
o We include an intercept in the model.
Case 3: Δy_t = a_0 + a_2 t + γ y_{t-1} + ε_t
o We include an intercept and a trend in the model.

Whether to include the intercept and/or the time trend is an empirical question.
The DF test is very restrictive, because it applies only to AR(1) processes.

Example: Fertility Rate

Example: Singapore Inflation Rate (data file can be found in edventure)

[Figure: Singapore monthly inflation rate (inf), 1980m1 to 2010m1]

Stata> dfuller inf, regress trend
/* inf (inflation) is the variable name; we include a time trend in the unit root test */

Dickey-Fuller test for unit root                 Number of obs = 357

          Interpolated Dickey-Fuller
          Test        1% Critical   5% Critical   10% Critical
          Statistic   Value         Value         Value
Z(t)      -3.784      -3.986        -3.426        -3.130

MacKinnon approximate p-value for Z(t) = 0.0174

Stata> dfuller inf, regress /* without including the trend */

Dickey-Fuller test for unit root                 Number of obs = 357

          Interpolated Dickey-Fuller
          Test        1% Critical   5% Critical   10% Critical
          Statistic   Value         Value         Value
Z(t)      -3.798      -3.451        -2.876        -2.570

MacKinnon approximate p-value for Z(t) = 0.0029

Augmented Dickey-Fuller (ADF) Test
As discussed previously, the DF test applies only to AR(1) processes. But an AR(1) might not capture all the serial correlation in y_t, in which case an AR(p) is more appropriate.

The ADF test is a more general unit root test than the DF test, as it can be used to test AR(p) models.

The unit root test model¹ is as follows:
Δy_t = γ y_{t-1} + Σ_{i=2}^{p} β_i Δy_{t-i+1} + ε_t

H_0: γ = 0 (there is a unit root) versus H_1: H_0 is wrong (γ < 0)

Similarly, we can include an intercept and/or a time trend in the model.
Case (1): Δy_t = γ y_{t-1} + Σ β_i Δy_{t-i+1} + ε_t
Case (2): Δy_t = a_0 + γ y_{t-1} + Σ β_i Δy_{t-i+1} + ε_t
Case (3): Δy_t = a_0 + a_2 t + γ y_{t-1} + Σ β_i Δy_{t-i+1} + ε_t

¹ It is called the augmented DF test because the test equation is augmented by lags of Δy_t.

We can graph the series y_t and decide which model (Case 1, 2, or 3) to use. Another natural question is: how many lags of Δy_t should we include?

Schwert (1989) suggested that we can set the number of lags at no larger than
p_max = int[12 × (T/100)^(1/4)],
where T is the number of observations.

Example:
If we have 358 monthly inflation data points, then we set the number of lags in the ADF test at most up to p_max = int[12 × (358/100)^(1/4)] = int[16.5] = 16.
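
We can check this arithmetic directly in Stata:

Stata> display int(12*(358/100)^0.25) /* returns 16 */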

If the estimate of the 16th lag of Δy_t is not significant, then we perform the unit root test again using 15 lags. Keep reducing the # of lags until the estimate of the longest lag is significant.


At first, we include 16 lags. Stata> dfuller inf, regress lags(16)

Augmented Dickey-Fuller test for unit root       Number of obs = 341

          Interpolated Dickey-Fuller
          Test        1% Critical   5% Critical   10% Critical
          Statistic   Value         Value         Value
Z(t)      -3.196      -3.453        -2.876        -2.570

MacKinnon approximate p-value for Z(t) = 0.0202

    D.inf |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
inf   L1. | -.0587156  .0183713  -3.20   0.002   -.0948581  -.0225731
      LD. |  .0012893  .0555281   0.02   0.981   -.1079531   .1105318
     L2D. |  .1678053  .0553966   3.03   0.003    .0588216    .276789
     L3D. |  .1720609  .0560726   3.07   0.002    .0617474   .2823744
     L4D. |  .0803618  .0567872   1.42   0.158   -.0313576   .1920813
     L5D. |  .1268932  .0488078   2.60   0.010    .0308719   .2229146
     L6D. |  .0498249  .0494312   1.01   0.314   -.0474228   .1470726
     L7D. |  .0525277  .0494198   1.06   0.289   -.0446976    .149753
     L8D. |  .0068582  .0490985   0.14   0.889    -.089735   .1034514
     L9D. |  .1000233  .0489448   2.04   0.042    .0037324   .1963141
    L10D. |  .0263241  .0488399   0.54   0.590   -.0697603   .1224086
    L11D. |  .0423739  .0475525   0.89   0.374   -.0511779   .1359256
    L12D. |  -.456482  .0468371  -9.75   0.000   -.5486263  -.3643378
    L13D. | -.0278045  .0527845  -0.53   0.599   -.1316493   .0760403
    L14D. |  .1053225  .0521724   2.02   0.044    .0026819   .2079632
    L15D. |   .071637  .0514199   1.39   0.165   -.0295233   .1727973
    L16D. |  .0526748  .0519866   1.01   0.312   -.0496001   .1549498
    _cons |  .1012828  .0377618   2.68   0.008    .0269927   .1755729

Since the estimate of the 16th lag (L16D., p-value 0.312) is insignificant, we can rerun the test including only up to 15 lags. It is almost sure that the ADF test rejects the null hypothesis.
The inflation rate variable doesn't contain a unit root. Therefore, it is a stationary process.

11.4.2 Other Unit Root Tests
The Phillips-Perron (PP) unit root test is very popular in the analysis of financial time series. Unlike the (A)DF tests, the PP test corrects for any serial correlation and heteroskedasticity in the errors ε_t.

Thus, practically, the PP test is better than the ADF test because it is more robust to violations of the classical linear assumptions. Another advantage is that we do not need to specify the number of lags.

Example: Inflation rate
Stata> pperron inf

Phillips-Perron test for unit root               Number of obs   = 357
                                                 Newey-West lags = 5

          Interpolated Dickey-Fuller
          Test        1% Critical   5% Critical   10% Critical
          Statistic   Value         Value         Value
Z(rho)    -27.674     -20.386       -14.000       -11.200
Z(t)      -4.243      -3.451        -2.876        -2.570

MacKinnon approximate p-value for Z(t) = 0.0006

The test statistic is larger (in absolute value) than the critical values. We reject the null hypothesis of a unit root.

Kwiatkowski, Phillips, Schmidt and Shin (KPSS) developed a complement to traditional unit root tests such as the ADF and PP tests.
For the ADF and PP tests: H_0: there is a unit root versus H_1: there is no unit root.
For the KPSS test: H_0: there is no unit root versus H_1: there is a unit root.

Statistically speaking, ADF (or PP) looks for evidence to reject a unit root, while KPSS looks for evidence to reject no unit root.

Example: Stata> kpss inf

KPSS test for inf
Maxlag = 16 chosen by Schwert criterion
Autocovariances weighted by Bartlett kernel
Critical values for H0: inf is trend stationary
10%: 0.119   5%: 0.146   2.5%: 0.176   1%: 0.216

Lag order   Test statistic
 0          1.22
 1          .628
 2          .43
 3          .332
 4          .274
 5          .236
 6          .21
 7          .19
 8          .176
 9          .165
10          .156
11          .149
12          .143
13          .139
14          .135
15          .132
16          .129

Appendix
[A] ARIMA and Other Models
If we perform unit root tests and find that the time series y_t is not stationary, then we must take a first difference of the series, i.e., Δy_t = y_t - y_{t-1}.

To make sure the first-differenced variable is stationary, we perform the unit root tests again on the first-differenced variable Δy_t. If Δy_t is stationary, then we can start the B-J approach. Otherwise, we take the difference of Δy_t again.

We call this an ARIMA²(p, 1, q) model. The 1 indicates that the series y_t is stationary after the first difference is taken. (A sketch of this workflow in Stata follows.)
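
A sketch of the difference-then-test loop in Stata (the orders in arima are illustrative):

Stata> dfuller y, regress /* unit root test on the level of y */
Stata> generate dy = D.y /* first difference */
Stata> dfuller dy, regress /* re-test the differenced series */
Stata> arima y, arima(1,1,1) /* if D.y is stationary: an ARMA(1,1) fitted to D.y */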



Example:
What does the 2 mean in ARIMA(p, 2, q)?
[Ans] The series is stationary only after being differenced twice, i.e., we model Δ²y_t = Δy_t - Δy_{t-1}.

Recall: Box & Jenkins Approach
Stationarity → Identification → Estimation → Checking → Forecasting.
Sometimes nonstationarity is due to seasonal effects, i.e., some seasons have a different pattern than the others. Therefore, we can take first-order differences combined with a seasonal difference at lag 4 (see the sketch below).

Example: energy consumption / electrical usage

For example, the demand is very high during summer in Taiwan, but the high demand happens during winter in Chicago, NYC, etc. Electricity demand usually shows seasonal variation.
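
In Stata, the combined time-series operator handles this in one step (a sketch; quarterly data with seasonality at lag 4 are assumed):

Stata> generate dsy = DS4.y /* first difference of the lag-4 seasonal difference: (y_t - y_t-4) - (y_t-1 - y_t-5) */
Stata> tsline dsy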

² AutoRegressive Integrated Moving Average (ARIMA)

ARIMA is a pure classical statistical technique; we don't really need to know the (economic) structure. Therefore, this model has been used in many fields, for example physics, biology, environmental issues, etc.
We can extend the ARMA models to models that have more economic meaning.
Example:
AR(p): y_t = a_0 + a_1 y_{t-1} + ... + a_p y_{t-p} + ε_t
Vector AutoRegression (VAR): multiple time series in generalized AR models.

[B] Parameter Instability and Structural Change (* not tested in the exam)
Time series data are often nonstationary. This can be due to a time trend, and/or a seasonal component, and/or a structural change.
Trend (stationary): once the trend is removed, the series is a stationary process (a detrending sketch follows).
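
For a trend-stationary series, detrending is a simple regression (a sketch; t is the time variable):

Stata> regress y t /* fit a linear time trend */
Stata> predict ydetrended, residuals /* the residuals form the detrended (stationary) series */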

How do we test for a structural change (or break)? How do we model the change?
We may know the date of the event, for instance the global financial crisis, 9/11, etc., that changes the (econometric) system.

Suppose you suspect there is a structural change at a particular date. It is straightforward to use the Chow test. For illustration, with an AR(1) in each regime:

(1) y_t = a_0 + a_1 y_{t-1} + ε_t
(2) y_t = b_0 + b_1 y_{t-1} + ε_t
We use the data before the change, for example 9/11 or July 1997, to estimate Model 1 and the data on and after the break to estimate Model 2.

H_0: a_0 = b_0 and a_1 = b_1 (no structural change) versus H_1: H_0 is wrong.


F = [(SSR_R - SSR_1 - SSR_2)/k] / [(SSR_1 + SSR_2)/(n_1 + n_2 - 2k)],

where SSR_R is the sum of squared residuals from the restricted (whole-sample) regression, SSR_1 and SSR_2 are the sums of squared residuals from the two subsample regressions, k is the number of estimated parameters, and n_1 and n_2 are the subsample sizes.

Example:
We simulated 100 data points from an AR(2) model (Model 1). In addition, we simulated another 113 data points from an AR(2) model with a different intercept (Model 2).

In practice, we plot the graph.

[Figure: simulated series y against time, t = 0 to 200]

We suspect there is a structural change at date 101 because after date 100, y changes dramatically. In this simple exercise, we also suspect that the model is an AR(2) because of the ac and pac graphs (if we are lucky enough).
We can see that the main difference between before and after date 100 is (perhaps) only the intercept. There is a jump.
So we can run the regression using the AR(2) model for the whole data set to get the restricted SSR. In addition, we run two regression models: one uses data points 1-100, and the other uses data points 101-213.

The whole data set (1-213 points):

. reg y l1.y l2.y

Source    |         SS    df          MS    Number of obs =     213
          |                                 F(  2,   210) = 5172.54
Model     | 36640.1616     2  18320.0808    Prob > F      =  0.0000
Residual  | 743.777018   210  3.54179533    R-squared     =  0.9801
          |                                 Adj R-squared =  0.9799
Total     | 37383.9386   212  176.339333    Root MSE      =   1.882

        y |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
   y  L1. |  .9205832    .06888  13.37   0.000    .7847984   1.056368
      L2. |  .0716431  .0690029   1.04   0.300    -.064384   .2076702
    _cons |  .0184165  .1335337   0.14   0.890   -.2448218   .2816548

For Model (1):

. reg y l1.y l2.y if t<(101)

Source    |         SS    df          MS    Number of obs =      98
          |                                 F(  2,    95) =  889.39
Model     |  1921.9442     2  960.972099    Prob > F      =  0.0000
Residual  | 102.646597    95   1.0804905    R-squared     =  0.9493
          |                                 Adj R-squared =  0.9482
Total     |  2024.5908    97  20.8720701    Root MSE      =  1.0395

        y |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
   y  L1. |  .7065436  .0993953   7.11   0.000     .509219   .9038683
      L2. |   .211723  .0948349   2.23   0.028    .0234518   .3999942
    _cons | -1.618651  .3933838  -4.11   0.000   -2.399617  -.8376855

For Model (2):

. reg y l1.y l2.y if t>(103)

Source    |         SS    df          MS    Number of obs =     112
          |                                 F(  2,   109) =  116.15
Model     | 276.248939     2  138.124469    Prob > F      =  0.0000
Residual  | 129.619879   109   1.1891732    R-squared     =  0.6806
          |                                 Adj R-squared =  0.6748
Total     | 405.868817   111  3.65647583    Root MSE      =  1.0905

        y |     Coef.  Std. Err.      t   P>|t|   [95% Conf. Interval]
   y  L1. |  .6979468  .0948083   7.36   0.000    .5100398   .8858538
      L2. |  .0949843   .090681   1.05   0.297   -.0847425    .274711
    _cons |  1.806572  .4558879   3.96   0.000    .9030167   2.710127

We calculate the F statistic value and compare it to the critical F value. Using the SSRs reported above (k = 3, n_1 = 98, n_2 = 112),

F = [(743.777 - 102.647 - 129.620)/3] / [(102.647 + 129.620)/(98 + 112 - 6)] ≈ 170.50 / 1.139 ≈ 149.7.

At the 5% significance level, the critical value is F_0.05(3, 204) ≈ 2.65.
Therefore, we reject the null hypothesis of no structural change.

To use the Chow test, we need to specify the date of the structural change and to assume that the change fully manifests itself at that date. But this may not always be appropriate; for example, there is no particular date at which we can say that significant climate change has occurred.

In addition, we need to have enough observations in each subsample. Otherwise, the estimated coefficients have little precision.

We can use recursive estimation to detect whether the estimated coefficients change abruptly.

> rolling, recursive window(#) clear: regress y l1.y
/* note that STATA will clear all the results except the rolling estimates */
/* first set: 1-#, second set: 1-(#+1), third set: 1-(#+2), etc. */

> tsset end
/* we need to re-tell STATA that this is time series data, as the previous data were cleared; we always use this command */

> tsline coefficient_name1 coefficient_name2, etc.

Example:
Observations 1-100 were simulated from Model 1, and the remaining observations were simulated from Model 2, t = 1, ..., 200. (As discussed below, Model 2 has a smaller intercept, 0.5.)

> In total, 131 sets of estimates were produced; the number of observations was increased by 1 each time.
> Since all the results were eliminated except the 131 sets of estimates, we need to use the tsset command again. Then we plot the intercept estimates and the slope estimates against time.

. rolling, recursive window(70) clear: reg y l1.y
(running regress on estimation sample)
Rolling replications (131)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50
..................................................   100
...............................

. tsset end
        time variable:  end, 70 to 200
                delta:  1 unit

. tsline _b_cons _stat_1


The intercept estimates (_b[_cons]) change dramatically after the 100th data point. This signals that there might be a structural change.

Similarly, the slope estimates seem to increase after the 100th data point, but the change is roughly just within (0.75, 0.95).

The estimated intercepts are within (1.55, 2) for the first 30 sets of estimates; then, after the 100th point, the estimated intercept converges to 0.5. This is because, for the very last few sets of estimates, the model estimated data simulated from a model with an intercept of 0.5.

[Figure: recursive estimates _b[_cons] and _b[L.y] plotted against end = 50 to 200]