Chapter 5: Box-Jenkins Methodology
5.1 Introduction
ARIMA modeling was popularized by George E. P. Box and Gwilym M. Jenkins, whose book Time Series Analysis: Forecasting and Control was first published in 1970 and revised in 1976.
ARIMA stands for Autoregressive Integrated Moving Average.
ARIMA models:
Are statistically sophisticated methods of extrapolating time series.
Are commonly applied to time series analysis, forecasting and control.
Require a long run of time series data and technical expertise on the
part of the forecaster.
5.2
5.3
On the other hand, a time series that is not stationary is called a non-stationary time series.
A series is non-stationary in mean if it does not fluctuate around a constant
mean or level (due to a trend or seasonal pattern); its level can then be
expressed as a deterministic function of t.
Example: A non-stationary series in mean
[Figure: time series plots]
$$r_k = \frac{\sum_{t=1}^{n-k} (y_t - \bar{y})(y_{t+k} - \bar{y})}{\sum_{t=1}^{n} (y_t - \bar{y})^2}$$
where
$$\bar{y} = \frac{1}{n} \sum_{t=1}^{n} y_t$$
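The autocorrelation coefficient r_k defined above can be computed directly from its definition. The following is a minimal sketch, assuming the numerator sum runs over t = 1, ..., n - k; the function name `sample_acf` and the toy series are illustrative only and do not come from the notes.

```python
def sample_acf(y, max_lag):
    """Return [r_1, ..., r_max_lag] for the series y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]          # deviations y_t - y_bar
    denom = sum(d * d for d in dev)      # sum of squared deviations
    acf = []
    for k in range(1, max_lag + 1):
        # cross-products of deviations that are k periods apart
        num = sum(dev[t] * dev[t + k] for t in range(n - k))
        acf.append(num / denom)
    return acf


toy = [1, 2, 3, 4, 5]                    # hypothetical series, not from the notes
print(sample_acf(toy, 2))                # r_1 = 0.4, r_2 = -0.1
```

For the toy series the mean is 3 and the sum of squared deviations is 10, so r_1 = 4/10 and r_2 = -1/10 can be checked by hand.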
Example 1
Table 1 shows the weekly sales of Ultra Shine Toothpaste.
 t    yt     t    yt     t    yt     t    yt
 1   235     7   286    13   355    19   435
 2   239     8   295    14   368    20   446
 3   244     9   310    15   384    21   456
 4   252    10   325    16   398    22   466
 5   264    11   336    17   413    23   478
 6   277    12   344    18   423    24   491
Table 1
[Figures 1 and 2: ACF plots for SALES, showing the coefficients with upper and lower confidence limits against Lag Number (1 to 16)]
Solution:
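$$r_{kk} = \begin{cases} r_1 & \text{if } k = 1 \\[4pt] \dfrac{r_k - \sum_{j=1}^{k-1} r_{k-1,j}\, r_{k-j}}{1 - \sum_{j=1}^{k-1} r_{k-1,j}\, r_j} & \text{if } k = 2, 3, \ldots \end{cases}$$
where
$$r_{k,j} = r_{k-1,j} - r_{kk}\, r_{k-1,k-j} \quad \text{for } j = 1, 2, \ldots, k-1$$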
The partial autocorrelation function (PACF) is a listing, or graph, of the
partial autocorrelations at lags k = 1, 2, ...
The PACF is used, together with the ACF, to identify the Box-Jenkins model.
Example 2
Table 2 shows the autocorrelation function for data in Example 1. Calculate
the partial autocorrelation at lag 1, 2 and 3.
 k     rk      k     rk      k      rk      k      rk
 1   0.887     5   0.406     9   -0.035    13   -0.326
 2   0.769     6   0.290    10   -0.123    14   -0.375
 3   0.694     7   0.176    11   -0.199    15   -0.410
 4   0.527     8   0.064    12   -0.268    16   -0.426
Table 2
Solution:
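One way to check a hand calculation is to code the recursion for r_kk given earlier. The sketch below assumes r_11 = r_1 for the first lag; the function name `pacf_from_acf` is illustrative, and the input values are the first three autocorrelations from Table 2.

```python
def pacf_from_acf(r, max_lag):
    """r[k-1] holds r_k; returns [r_11, r_22, ..., r_{max_lag,max_lag}]."""
    pacf = []
    prev = []                            # prev[j-1] holds r_{k-1,j}
    for k in range(1, max_lag + 1):
        if k == 1:
            r_kk = r[0]                  # assumed starting value r_11 = r_1
        else:
            num = r[k - 1] - sum(prev[j - 1] * r[k - j - 1] for j in range(1, k))
            den = 1.0 - sum(prev[j - 1] * r[j - 1] for j in range(1, k))
            r_kk = num / den
        # r_{k,j} = r_{k-1,j} - r_kk * r_{k-1,k-j} for j = 1, ..., k-1
        cur = [prev[j - 1] - r_kk * prev[k - j - 1] for j in range(1, k)]
        cur.append(r_kk)                 # r_{k,k} itself
        prev = cur
        pacf.append(r_kk)
    return pacf


acf_table2 = [0.887, 0.769, 0.694]       # r_1, r_2, r_3 from Table 2
print(pacf_from_acf(acf_table2, 3))
```

With these inputs the sketch gives approximately 0.887, -0.083 and 0.137 for lags 1, 2 and 3.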
5.4 Differencing
A non-stationary series can often be made stationary by differencing.
This is analogous to removing the trend pattern from the actual data.
The process is usually referred to as detrending, and the resulting
stationary series is called the detrended series.
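Let the time series have a linear trend, $y_t = \alpha + \beta t$.
Then the first order difference of $y_t$ is
$$\begin{aligned}
\nabla y_t &= y_t - y_{t-1} \\
&= (\alpha + \beta t) - [\alpha + \beta (t - 1)] \\
&= (\alpha + \beta t) - (\alpha + \beta t - \beta) \\
&= \alpha + \beta t - \alpha - \beta t + \beta \\
&= \beta
\end{aligned}$$
which is a constant.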
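The second order difference of $y_t$ is
$$\begin{aligned}
\nabla^2 y_t &= (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) \\
&= y_t - y_{t-1} - y_{t-1} + y_{t-2} \\
&= y_t - 2 y_{t-1} + y_{t-2}
\end{aligned}$$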
A series that requires first differencing to become stationary is said to be
integrated of order one, or a first order integrated series.
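The effect of differencing on a trend can be illustrated numerically. This is a small sketch with hypothetical data (a pure linear trend and a pure quadratic trend, no noise); the helper name `difference` is an assumption made for illustration.

```python
def difference(y, lag=1):
    """Return the differenced series y_t - y_{t-lag}."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


linear = [5 + 3 * t for t in range(1, 9)]    # hypothetical linear trend, slope 3
first = difference(linear)                   # every value equals the slope
print(first)                                 # [3, 3, 3, 3, 3, 3, 3]

quadratic = [t * t for t in range(1, 9)]     # hypothetical quadratic trend
second = difference(difference(quadratic))   # second order difference
# Same values as y_t - 2*y_{t-1} + y_{t-2}
direct = [quadratic[t] - 2 * quadratic[t - 1] + quadratic[t - 2]
          for t in range(2, len(quadratic))]
print(second == direct)                      # True
```

A linear trend needs one round of differencing to reach a constant level, while a quadratic trend needs two.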
$$B^{12} y_t = y_{t-12} \quad \text{(twelve-period shift backward)}$$
Therefore, the first order differencing can be written as
$$\begin{aligned}
\nabla y_t &= y_t - y_{t-1} \\
&= y_t - B y_t \\
&= (1 - B)\, y_t
\end{aligned}$$
NOTE:
Example 3
Write the following using backward shift operator
a) $\nabla^3 y_t$
b) $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2}$
Solution:
5.5
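$$y_t = \phi_1 B y_t + \phi_2 B^2 y_t + \cdots + \phi_p B^p y_t + \varepsilon_t$$
$$y_t - \phi_1 B y_t - \phi_2 B^2 y_t - \cdots - \phi_p B^p y_t = \varepsilon_t$$
$$(1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p)\, y_t = \varepsilon_t$$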
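The autoregressive form above says that each observation is a weighted sum of the previous p observations plus a new error term. A minimal sketch of that recursion follows, assuming no constant term and treating values before the start of the series as zero; the name `ar_process` and the coefficient 0.5 are hypothetical.

```python
def ar_process(phi, eps):
    """Build y_t = phi_1*y_{t-1} + ... + phi_p*y_{t-p} + eps_t.

    Values of y before the start of the series are taken as zero
    (an assumption made only to start the recursion).
    """
    y = []
    for t, e in enumerate(eps):
        value = e
        for i, coef in enumerate(phi, start=1):
            if t - i >= 0:
                value += coef * y[t - i]
        y.append(value)
    return y


# AR(1) with a single unit shock at t = 0: the effect decays geometrically.
print(ar_process([0.5], [1.0, 0.0, 0.0, 0.0]))   # [1.0, 0.5, 0.25, 0.125]
```

The geometric decay after one shock is the behaviour behind the "die down" pattern in the ACF of an autoregressive series.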
Example 4
Write the equation for the following AR model.
a) AR(1)
b) AR(2)
c) AR(4)
Solution:
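$$y_t = \varepsilon_t - \theta_1 B \varepsilon_t - \theta_2 B^2 \varepsilon_t - \cdots - \theta_q B^q \varepsilon_t$$
$$y_t = (1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q)\, \varepsilon_t$$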
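A moving average series is built from the current error and the previous q errors only. The sketch below assumes the Box-Jenkins sign convention (past errors enter with a minus sign) and no constant term; the name `ma_process` and the coefficient 0.4 are hypothetical.

```python
def ma_process(theta, eps):
    """Build y_t = eps_t - theta_1*eps_{t-1} - ... - theta_q*eps_{t-q}.

    The minus signs follow the Box-Jenkins convention, which is an
    assumption here; errors before the start of the series are ignored.
    """
    y = []
    for t, e in enumerate(eps):
        value = e
        for j, coef in enumerate(theta, start=1):
            if t - j >= 0:
                value -= coef * eps[t - j]
        y.append(value)
    return y


# MA(1) with a single unit shock at t = 0: the effect lasts exactly one period.
print(ma_process([0.4], [1.0, 0.0, 0.0]))   # [1.0, -0.4, 0.0]
```

Unlike the autoregressive case, the shock disappears completely after q periods, which is why the ACF of a moving average series cuts off rather than dying down.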
Example 5
Write the equation for the following MA model.
a) MA(1)
b) MA(2)
c) MA(5)
Solution:
Example 6
Write the equation for the following ARMA model.
a) ARMA(1, 1)
b) ARMA(2, 1)
c) ARMA(2, 2)
d) ARMA(1, 3)
Solution:
5.5.4
5.6
Model: MA(q), where q is the number of spikes in the ACF.
ACF: exponential decay (die down) and/or damped sinusoid. PACF: exponential decay (die down) and/or damped sinusoid. Model: ARMA(p, q), where p is the number of spikes in the PACF and q is the number of spikes in the ACF.
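The rule of reading p and q off as the number of spikes in the PACF and ACF can be roughed out in code. This is only a sketch: it assumes a spike is a coefficient outside the approximate limits of plus or minus 2 divided by the square root of n, which is a common rule of thumb and not something stated in these notes, and the name `count_leading_spikes` is hypothetical. Reading the plots by eye remains the intended method.

```python
import math


def count_leading_spikes(coeffs, n):
    """Count consecutive lags, starting at lag 1, whose coefficient lies
    outside the approximate limits +/- 2/sqrt(n) (an assumed rule of thumb)."""
    limit = 2.0 / math.sqrt(n)
    count = 0
    for c in coeffs:
        if abs(c) > limit:
            count += 1
        else:
            break                        # first lag inside the limits: cut-off
    return count


# Hypothetical ACF values for a series of n = 100 observations (limit = 0.2).
print(count_leading_spikes([0.8, 0.5, 0.1, 0.05], 100))   # 2
```

Applied to an ACF that cuts off, the count is a candidate q; applied to a PACF that cuts off, it is a candidate p.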
[Table: typical ACF and PACF patterns by model, with panels for AR(2), MA(1), MA(2) and ARMA(1,1)]
Example 9
Based on the given ACF and PACF of the original data (given in the Excel
file), decide the suitable Box-Jenkins model for this time series. Write down
the equation of the model.
[Figure: time series plot of the original data]
MA(2): $y_t = (1 - \theta_1 B - \theta_2 B^2)\, \varepsilon_t$
MA(3): $y_t = (1 - \theta_1 B - \theta_2 B^2 - \theta_3 B^3)\, \varepsilon_t$
Example 10
Based on the given ACF and PACF of the original data (given in the Excel
file), decide the suitable Box-Jenkins model for this time series. Write down
the equation of the model.
[Figure: time series plot of the original data, observations 1 to 89]
The time series plot and the ACF of the original data (which dies down
extremely slowly) show that the series is not stationary (it has a trend), so
first order differencing is needed to make it stationary.
ACF and PACF of the first difference of the original data
The ACF of the first difference dies down quickly, so the differenced data are
stationary. The PACF of the first difference cuts off after lag 1.
Thus a suitable model is ARIMA(1, 1, 0).
The equation of the model is
$$w_t = \phi_1 w_{t-1} + \varepsilon_t$$
where
$$w_t = y_t - y_{t-1}, \qquad w_{t-1} = y_{t-1} - y_{t-2}$$
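It may help to see the ARIMA(1, 1, 0) model written in terms of the original series. The following worked step substitutes the definitions of w_t and w_{t-1} into the model equation; the symbols \phi_1 and \varepsilon_t are assumed names for the AR coefficient and the error term.

```latex
\begin{aligned}
y_t - y_{t-1} &= \phi_1 (y_{t-1} - y_{t-2}) + \varepsilon_t \\
y_t &= (1 + \phi_1)\, y_{t-1} - \phi_1\, y_{t-2} + \varepsilon_t
\end{aligned}
```

In this form the model looks like an AR(2) in y_t whose coefficients sum to one, which is another way of seeing why the original series is not stationary.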
Example 11
Based on the given ACF and PACF of the original data (given in the Excel
file), decide the suitable Box-Jenkins model for this time series. Write down
the equation of the model.
[Figure: time series plot of the original data]
The time series plot, the ACF (which dies down extremely slowly) and the PACF
(with spikes at lags 1 and 13) of the original data show that the series is not
stationary (it has a trend and a seasonal component), so seasonal differencing
and ordinary (non-seasonal) differencing are needed to make it stationary.
ACF and PACF of the first seasonal difference of the original data
The data are still not stationary after the first seasonal differencing, so
ordinary (non-seasonal) differencing also needs to be applied.
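The need for both kinds of differencing can be seen on a small made-up series. The sketch below assumes monthly data (seasonal period 12) with a quadratic trend and a fixed seasonal pattern; the data and the helper name `difference` are hypothetical and are not the Excel data used in this example.

```python
def difference(y, lag=1):
    """Return the differenced series y_t - y_{t-lag}."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


# Hypothetical monthly pattern that repeats every 12 periods.
season = [10, 12, 15, 11, 9, 8, 7, 9, 13, 16, 18, 14]
y = [t * t + season[t % 12] for t in range(48)]

seasonal = difference(y, lag=12)      # removes the seasonal pattern,
print(seasonal[:4])                   # but a trend remains: [144, 168, 192, 216]

both = difference(seasonal, lag=1)    # ordinary differencing on top
print(set(both))                      # {24}: a constant level
```

Seasonal differencing removes the repeating pattern, and the ordinary difference then removes the trend that is left.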
ACF and PACF of the original data after first seasonal differencing and first
order ordinary differencing
The ACF and PACF of the data after first seasonal differencing and first order
ordinary differencing show that the data are now stationary.
The ACF suggests that q may be 1 or 2 and Q = 1 (2 spikes at lags 1 and 4,
1 spike at lag 12), while the PACF suggests that p may be 1 or 2 and P = 1
(2 spikes at lags 1 and 4, 1 spike at lag 12).
Thus the possible models are
ARIMA(1, 1, 1)(1, 1, 1)$_{12}$
ARIMA(2, 1, 1)(1, 1, 1)$_{12}$
ARIMA(1, 1, 2)(1, 1, 1)$_{12}$
ARIMA(2, 1, 2)(1, 1, 1)$_{12}$
$$(1 - \Phi_1 B^{12})\, z_t = (1 - \Theta_1 B^{12})\, \varepsilon_t$$
$$(1 - \Phi_1 B^{12})(1 - B^{12})\, y_t = (1 - \Theta_1 B^{12})\, \varepsilon_t \qquad (2)$$
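Equation (2) can be multiplied out to show which past values it involves. The worked expansion below assumes the coefficient names \Phi_1 and \Theta_1 for the seasonal AR and MA terms and the Box-Jenkins minus-sign convention; both are assumptions, since the symbols did not survive in this copy.

```latex
\begin{aligned}
(1 - \Phi_1 B^{12})(1 - B^{12})\, y_t &= (1 - \Theta_1 B^{12})\, \varepsilon_t \\
\bigl(1 - (1 + \Phi_1) B^{12} + \Phi_1 B^{24}\bigr)\, y_t &= \varepsilon_t - \Theta_1 \varepsilon_{t-12} \\
y_t &= (1 + \Phi_1)\, y_{t-12} - \Phi_1\, y_{t-24} + \varepsilon_t - \Theta_1\, \varepsilon_{t-12}
\end{aligned}
```

Under these assumptions the model relates each observation to the values 12 and 24 periods earlier and to the error 12 periods earlier.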
5.7
Hypotheses
$H_0$: The residuals are white noise (the residuals are random and
independent of one another)
$H_1$: The residuals are not white noise (the model is mis-specified or
inadequate)
Test statistic
Box-Pierce Q statistic
$$Q = N' \sum_{k=1}^{h} r_k^2$$
where
$N' = N - d$
N is the number of observations in the time series
h is the maximum lag being tested
p is the number of AR terms
q is the number of MA terms
$r_k$ is the autocorrelation of the residuals at lag k
d is the order of differencing applied to the original data
Ljung-Box statistic
$$Q^* = N'(N' + 2) \sum_{k=1}^{h} \frac{r_k^2}{N' - k}$$
Decision
Critical value: $\chi^2_{h-p-q,\,\alpha}$. Reject $H_0$ if the test statistic exceeds the critical value.
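Both portmanteau statistics are simple sums over the residual autocorrelations, so they are easy to compute by hand or in code. The sketch below assumes the residual autocorrelations r_1, ..., r_h are already available and uses N' = N - d as defined above; the function names and the numbers in the example are hypothetical. The critical value itself would still be read from a chi-square table with h - p - q degrees of freedom.

```python
def box_pierce(r, n_eff):
    """Q = N' * sum of r_k^2 for k = 1..h, where n_eff is N' = N - d."""
    return n_eff * sum(rk * rk for rk in r)


def ljung_box(r, n_eff):
    """Q* = N'(N' + 2) * sum of r_k^2 / (N' - k) for k = 1..h."""
    return n_eff * (n_eff + 2) * sum(
        rk * rk / (n_eff - k) for k, rk in enumerate(r, start=1)
    )


def degrees_of_freedom(h, p, q):
    """Degrees of freedom of the chi-square critical value."""
    return h - p - q


# Hypothetical residual autocorrelations at lags 1 and 2, with N' = 50.
resid_acf = [0.1, -0.2]
print(box_pierce(resid_acf, 50))
print(ljung_box(resid_acf, 50))
```

For these inputs Q is 2.5 and Q* is slightly larger, because the Ljung-Box statistic gives more weight to higher lags.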
Example 12
Using the data for Example 9, the models fitted using SPSS are given below.
Write the equations of the fitted models and decide which model is better.