Clara Cordeiro
DM/FCT, University of Algarve, Portugal
based on joint work with Professor M. Manuela Neves (supervisor)
DM/ISA, Technical University of Lisbon, Portugal
Outline
Introduction
Procedure
Case study
Closing comments
References
Introduction
Motivation
Boot motivation:
It is a resampling scheme that is widely applied in many fields;
It is a very popular methodology because of its simplicity and nice
properties;
There has been great development in the area of dependent data.
EXPOS motivation:
It refers to a set of methods that can be used to model a time series
and to obtain forecasts;
It is a versatile approach that continually updates a forecast,
emphasizing the most recent experience;
These are the most widely used forecasting methods.
Introduction
Idea
The idea is to join these two approaches and to construct a
computational algorithm to obtain:
forecasts;
forecast intervals (work in progress);
and, in the long term, missing data imputation (work in progress).
What to use?
Use EXPOS methods to select the model that best fits a data set;
Use Boot to resample and reconstruct a replica of the original
data set;
Use
Introduction
Work chronology
Past studies:
forecasting using the Holt-Winters method and some dependent bootstrap
approaches; the sieve bootstrap proved to be a good option;
model selection extended to include SES, Holt's linear, and HW
additive and multiplicative methods;
first sketch of a computational algorithm where some considerations
are made: stationarity, Box-Cox transformations, differencing, ...;
applied to the M3 competition; good behavior among six well-known
methods but problems with small data sets;
...
Introduction
Work chronology
Recent studies:
EXPOS selection has been extended to a choice of thirty methods
(additive and multiplicative error term);
a case study of 40 well-known time series is performed;
a Box-Cox transformation with 0 < λ < 1 is used;
the procedure is run on the M3 competition time series;
with this larger set of candidate models, better point forecasts were
obtained. Bootstrap intervals are narrower. BCa bootstrap intervals
can improve the results (work in progress);
...
Introduction
In summary:
Consider EXPOS for modeling time series, instead of the traditional
ARIMA class.
Combine EXPOS methods with the bootstrap methodology:
Boot.EXPOS.
Use the Boot.EXPOS procedure to obtain forecasts.
Test it on some well-known data sets and then use it on large data
sets.
Forecast performance is evaluated using some accuracy measures,
and the results are compared with those of other EXPOS forecasting methods.
Some notes...
Trend component            | Seasonal component
                           | N (None) | A (Additive) | M (Multiplicative)
N (None)                   | N,N      | N,A          | N,M
A (Additive)               | A,N      | A,A          | A,M
Ad (Additive damped)       | Ad,N     | Ad,A         | Ad,M
M (Multiplicative)         | M,N      | M,A          | M,M
Md (Multiplicative damped) | Md,N     | Md,A         | Md,M
For each method in the framework, additive error and multiplicative error
versions are considered.
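The count of thirty candidate methods follows from the table: 5 trend types times 3 seasonal types gives 15 methods, each with an additive-error and a multiplicative-error version. A minimal sketch of that enumeration is given below; the names `TREND`, `SEASONAL`, `ERROR` and `ets_candidates` are illustrative assumptions, and the (error, trend, seasonal) ordering follows the ETS(A,Ad,A) notation used elsewhere in these slides.

```python
from itertools import product

# Component labels as they appear in the taxonomy table.
TREND = ["N", "A", "Ad", "M", "Md"]
SEASONAL = ["N", "A", "M"]
ERROR = ["A", "M"]  # additive and multiplicative error versions


def ets_candidates():
    """Return every (error, trend, seasonal) combination in the framework."""
    return list(product(ERROR, TREND, SEASONAL))


candidates = ets_candidates()
print(len(candidates))  # 5 trend x 3 seasonal x 2 error types
```

Model selection then amounts to fitting each candidate and keeping the one with the smallest AIC.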
What is Bootstrap?
Questioning Boot
Does it really work?
Yes!
The key to success is making sure that the bootstrap resampling correctly
mimics the original sampling.
The key assumption is independence.
Any good model should yield residuals that do not show significant
patterns.
Most exponential smoothing models do not yield white noise
residuals.
In fact, some pattern is commonly left in the residuals.
To model such left-over patterns, an autoregressive process is
used to filter the EXPOS residual series.
Because the AR residuals are iid, the IID bootstrap can
easily be extended to the dependent case.
This model-based resampling for time series resamples
the AR residuals.
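The resampling idea above can be sketched as follows. This is a simplified illustration, not the Boot.EXPOS implementation: it assumes an AR(1) filter fitted by least squares (in practice the AR order would be chosen from the data), and the names `fit_ar1` and `sieve_replica` and the toy residual values are hypothetical.

```python
import random


def fit_ar1(x):
    """Least-squares fit of z[t] = phi * z[t-1] + e[t] on the mean-centred series."""
    mean = sum(x) / len(x)
    z = [v - mean for v in x]
    num = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    den = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    phi = num / den if den else 0.0
    resid = [z[t] - phi * z[t - 1] for t in range(1, len(z))]
    return mean, phi, resid


def sieve_replica(x, rng):
    """Build one bootstrap replica of x by resampling centred AR(1) residuals."""
    mean, phi, resid = fit_ar1(x)
    centre = sum(resid) / len(resid)
    resid = [e - centre for e in resid]  # centre so the resampled noise has mean zero
    z_prev = x[0] - mean                 # assumption: start from the first observed value
    replica = [x[0]]
    for _ in range(len(x) - 1):
        z_prev = phi * z_prev + rng.choice(resid)  # iid draw with replacement
        replica.append(mean + z_prev)
    return replica


# Toy stand-in for an EXPOS residual series (made-up values).
expos_residuals = [0.5, -0.2, 0.3, -0.4, 0.1, 0.6, -0.3, -0.1, 0.2, -0.5]
replica = sieve_replica(expos_residuals, random.Random(0))
```

Each replica of the residual series would then be added back to the fitted EXPOS components to rebuild a replica of the original series.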
Example
[Figure: dole series. Decomposition by the ETS(A,Ad,A) method, the best EXPOS model selected by AIC (panels: observed, level, slope, season; x-axis: Time). EXPOS residuals and AR residuals plotted by year, each with ACF and PACF plots against lag.]
Simulation examples
[Figure: Simulation 1 and Simulation 2. For each simulation, three panels: Resampling (#k), AR reconstruction (#k), and AR reconstruction (#k)+components.]
\[
\begin{pmatrix}
\vdots & \vdots & & \vdots \\
f_{B1} & f_{B2} & \cdots & f_{Bh}
\end{pmatrix}
\]
At the end of the B replications, the forecasts obtained through the
Boot.EXPOS procedure are the means of the columns, one per forecast horizon.
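As a small illustration of this averaging step, the sketch below stores the B replications as rows and the h horizons as columns and takes the mean of each column. The function name `column_means` and the toy numbers are assumptions for illustration only.

```python
def column_means(forecasts):
    """Average a B x h list of bootstrap forecasts over the B replications."""
    b = len(forecasts)
    h = len(forecasts[0])
    return [sum(row[j] for row in forecasts) / b for j in range(h)]


# Two replications (rows), three forecast horizons (columns); made-up values.
toy_forecasts = [
    [10.0, 12.0, 14.0],
    [12.0, 14.0, 16.0],
]
point_forecasts = column_means(toy_forecasts)
print(point_forecasts)  # [11.0, 13.0, 15.0]
```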
For example, for the dole time series:
[Figure: Forecast comparison, series dole: real values, EXPOS, and Boot.EXPOS forecasts; x-axis: forecast horizon; y-axis: monthly total.]
Procedure
Boot.EXPOS
Complete description
Remark: a previous EXPOS fit is required.
Construct a time series replica using the EXPOS components and the
previous bootstrap series;
Obtain h-step-ahead forecasts from the new time series using the
EXPOS fit;
Case study
The selection
package datasets;
package forecast.
Case study
The data
[Figure: time plots of the case-study series, including the number of pigs slaughtered in Victoria, Australia; x-axis: year; y-axis: monthly total.]
Case study
Computing
for each time series, the best EXPOS method is selected using AIC;
EXPOS residuals are extracted and tested for the white noise hypothesis;
apply Boot.EXPOS;
obtain the forecasts for h periods ahead;
evaluate the performance of the Boot.EXPOS procedure using some
accuracy measures:
Acronym | Definition                     | Formula
RMSE    | Root Mean Squared Error        | $\sqrt{\mathrm{mean}(E_t^2)}$
MAE     | Mean Absolute Error            | $\mathrm{mean}(|E_t|)$
MAPE    | Mean Absolute Percentage Error | $\mathrm{mean}(|100\,E_t / Y_t|)$
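The three accuracy measures can be computed as in the sketch below. It assumes that E_t is the forecast error (observed value minus forecast) and that Y_t is the observed value, which the slides do not state explicitly; the function name `accuracy` and the toy numbers are illustrative.

```python
import math


def accuracy(actual, forecast):
    """Return RMSE, MAE and MAPE for paired observed and forecast values."""
    # Assumption: E_t = Y_t - forecast_t, with Y_t the observed value.
    errors = [y - f for y, f in zip(actual, forecast)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when an observed value is zero.
    mape = sum(abs(100.0 * e / y) for e, y in zip(errors, actual)) / n
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape}


# Errors are 10 and -20, so MAE is 15 and MAPE is 10 percent.
result = accuracy([100.0, 200.0], [90.0, 220.0])
```

RMSE and MAE are on the scale of the data, so they can only be compared within a series; MAPE is scale-free and can be compared across series.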
Case study
Forecast results
[Figure: forecast comparison for each case-study series (first panel: series Nav): real values, EXPOS, and Boot.EXPOS forecasts; x-axis: forecast horizon; y-axis: monthly total.]
Case study
Accuracy results
Series         | n   | s  | h  | EXPOS fit | Method     | RMSE    | MAE     | MAPE
nav            | 279 | 12 | 12 | (M,A,M)   | EXPOS      | 3661.23 | 3369.51 | 10.15
               |     |    |    |           | Boot.EXPOS | 3456.60 | 3128.17 | 9.44
pigs           | 188 | 12 | 12 | (A,N,A)   | EXPOS      | 9377.28 | 7554.21 | 7.48
               |     |    |    |           | Boot.EXPOS | 8653.24 | 6475.87 | 6.49
ukdeaths       | 120 | 12 | 12 | (M,N,M)   | EXPOS      | 156.84  | 143.16  | 10.13
               |     |    |    |           | Boot.EXPOS | 89.84   | 71.28   | 4.89
writing        | 120 | 12 | 12 | (A,A,A)   | EXPOS      | 58.61   | 44.96   | 5.97
               |     |    |    |           | Boot.EXPOS | 57.21   | 43.95   | 5.92
UKDriverDeaths | 192 | 12 | 12 | (M,N,A)   | EXPOS      | 205.63  | 198.49  | 14.68
               |     |    |    |           | Boot.EXPOS | 87.78   | 70.60   | 5.09
gas            | 476 | 12 | 12 | (M,Md,M)  | EXPOS      | 2773.72 | 2097.73 | 4.22
               |     |    |    |           | Boot.EXPOS | 2348.16 | 1908.15 | 3.84
Closing comments
Acknowledgements: To the IIF for the travel award that made it possible
to participate in ISF 2009.
Thank you!
Complete references on the next slide.
References
Brockwell, P. and Davis, R., Introduction to Time Series and Forecasting, 2nd edition, Springer-Verlag New York, 2002.
Cordeiro, C. and Neves, M., Bootstrap and exponential smoothing working together in forecasting time series, Proceedings in Computational Statistics (COMPSTAT 2008), Paula Brito (editor), Physica-Verlag, 2008, pp. 891-899.
Cordeiro, C. and Neves, M., Forecasting time series with Boot.EXPOS procedure, to appear in REVSTAT Statistical Journal, June, 2009.
Davison, A.C. and Hinkley, D.V., Bootstrap Methods and their Application, Cambridge University Press, 1998.
Efron, B., Bootstrap Methods: Another Look at the Jackknife, The Annals of Statistics, 7 (1979), pp. 1-26.
Gardner Jr, E.S., Exponential Smoothing: The State of the Art-Part II, International Journal of Forecasting, 22 (2006), pp. 637-666.
Hyndman, R., forecast: Forecasting functions for time series; software available at http://www.robjhyndman.com/Rlibrary/forecast/.
Hyndman, R., expsmooth: Data sets for Forecasting with exponential smoothing; software available at http://www.robjhyndman.com/Rlibrary/forecast/.
Hyndman, R., fma: Data sets from Forecasting: methods and applications by Makridakis, Wheelwright & Hyndman (1998); software available at http://www.robjhyndman.com/Rlibrary/forecast/.
Hyndman, R., Koehler, A., Ord, J. and Snyder, R., Forecasting with Exponential Smoothing: The State Space Approach, Springer-Verlag Inc, 2008.
Lahiri, S.N., Resampling Methods for Dependent Data, Springer Verlag Inc, 2003.
R Development Core Team, R: A Language and Environment for Statistical Computing; software available at http://www.R-project.org.
Trapletti, A. and Hornik, K., tseries: Time Series Analysis and Computational Finance; R package version 0.10-18, 2009.