Long-Term Forecasting and Evaluation

Clive W.J. Granger [a,1], Yongil Jeon [b]
[a] Department of Economics, University of California, San Diego, La Jolla, CA 92093-0508, United States
[b] School of Economics, Sungkyunkwan University, 53 Myeongnyun-dong 3ga, Jongno-ku, Seoul, 110-745, South Korea
Abstract
Looking ahead thirty years is a difficult task, but is not impossible. In this paper we illustrate how to evaluate such long-term
forecasts. Long-term forecasting is likely to be dominated by trend curves, particularly the simple linear and exponential trends.
However, there will certainly be breaks in their parameter values at some unknown points, so that eventually the forecasts will
be unsatisfactory. We investigate whether or not simple methods of long-run forecasting can ever be successful, after one takes
into account the uncertainty level associated with the forecasts.
© 2007 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
JEL classification: C5
Keywords: Long-term trend fitting; Forecasting evaluation; Density forecasting
1. Introduction
Suppose that one was asked to forecast the general
state of the economy of the world in the year 2037
using data available in the year 2007. It would be a
daunting task. Most of the commonly used forecasting
techniques would be irrelevant. If a model assumes
stationarity, the long run forecast is just the mean of the
series. If the model assumes a unit root without drift,
then the long run forecast is essentially the most recent
value of the series. A unit root process with drift
simply gives a long run forecast that is a linear trend.
Long-term forecasting will probably be dominated by
trend curves, particularly the simple linear and exponential
trends. Unfortunately, there will also certainly
be breaks in their parameter values at unknown points,
so that eventually the forecasts will be unsatisfactory.
An objective of this paper is to investigate whether or
not simple methods of long-run forecasting can be
successful, after one fully considers the uncertainties
associated with the forecasts.
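The three long-run forecasts described above can be illustrated with a minimal sketch. The AR(1) form for the stationary case and all parameter values (mean, persistence, drift, last observation) are assumptions chosen only for illustration; they are not taken from the paper.

```python
# Long-run point forecasts implied by three simple models.
# All numerical values below are hypothetical.

def ar1_forecast(last, mean, phi, h):
    """h-step forecast of a stationary AR(1); it reverts to the mean as h grows."""
    return mean + (phi ** h) * (last - mean)

def random_walk_forecast(last, h, drift=0.0):
    """h-step forecast of a random walk: last value plus accumulated drift."""
    return last + drift * h

last_value = 120.0   # most recent observation (hypothetical)
horizon = 30         # years ahead

stationary = ar1_forecast(last_value, mean=100.0, phi=0.8, h=horizon)
no_drift = random_walk_forecast(last_value, horizon)
with_drift = random_walk_forecast(last_value, horizon, drift=2.0)

print(f"stationary AR(1): {stationary:.2f}")   # close to the series mean
print(f"random walk:      {no_drift:.2f}")     # equal to the last value
print(f"walk with drift:  {with_drift:.2f}")   # on a linear trend line
```

At a thirty-year horizon the stationary forecast is essentially the mean, the driftless random walk forecast is the last observation, and the forecast with drift lies on a straight line through the last observation.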
International Journal of Forecasting 23 (2007) 539–551
Available online at www.sciencedirect.com
www.elsevier.com/locate/ijforecast
Corresponding author. Tel.: +82 2 760 0487; fax: +82 2 744 5717.
E-mail addresses: cgranger@ucsd.edu (C.W.J. Granger), yjeon@skku.edu (Y. Jeon).
[1] Tel.: +1 858 534 3856; fax: +1 858 534 7040.
0169-2070/$ - see front matter © 2007 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
doi:10.1016/j.ijforecast.2007.07.002
There are various approaches to long-run
forecasting. Examples are:
(1) The World Economy. History and Prospects by
Rostow (1978), who views the past to look into the
future, and considers the stages of economic
growth (in a similar fashion to the familiar stages
of demographic change), the trends and the cycles,
with both short and long periods, particularly the
Kondratieff Wave. The book includes few actual
forecasts that can be evaluated.
(2) In Towards the Year 2000 edited by Daniel
Bell (1968), which is a report by the American
Academy's Commission on the Year 2000,
forty-three high quality contributors, all males
apart from Margaret Mead, contribute intellectual
essays and discussions. These produce
sensible accounts of many topics, viewed from
several perspectives, but fail to reach many
conclusions and give no specific forecasts.
(3) The Limits to Growth by Meadows, Meadows,
Randers, and Behrens (1972) contains
specific forecasts, called scenarios, which are
derived from a complicated, possibly non-linear,
feedback model. This model can produce
initially exponential trends which may eventually
clash and result in a collapse of the
economy. A later evaluation, Beyond the
Limits by Meadows, Meadows, Randers, and
Tinbergen (1992) displays results from thirteen
scenarios, of which three reach a maximum
upper limit for the material standard of living
around the year 2000, nine produce collapses
starting soon after 2000, and one expands living
standards until 2040 and then collapses. The
results are not robust to small changes in the
assumptions used in the models.
(4) The Year 2000 by Kahn and Wiener (1967),
which is the subject of the discussion in Section 2,
bases forecasts on trends of many kinds, considers
scenarios about how various movements will
interact in the future, and includes the impact of
scientific development.[2] The study came from the
Hudson Institute, a private non-profit public policy
think tank that is still active. The objective was
to use the data available in 1966 or so to forecast
the society, and particularly the economy, around
the year 2000. When one forecasts as far as thirty-three
years ahead, it should not be expected that
the forecasts will be precise either in size or in date.
In 1967 they would not be expected to make useful
forecasts of, say, unemployment or inflation for
New York or London for the year 2000. It would
be as unrealistic as our making such forecasts in
2000 for the year 2033. It is difficult enough
making forecasts for economic variables over a
one year horizon.[3] Some growth rates can be
expected to remain constant over a thirty year
period, while some could decline because of
changing tastes or scarcity of resources. Other
growth rates could increase because of new
opportunities and scientific advances. The real
forecasting skill comes in deciding what are
reasonable or likely changes in the growth rates.[4]
This paper unfolds as follows. Section 2 discusses
the book by Kahn and Wiener (1967), which illustrates
that long-range forecasting is difficult but not
impossible, and provides a useful template of how to
approach the task. We illustrate the evaluation of their
long-term forecasts of science and technology, and the
U.S. population and economy. In the remainder of our
discussion we examine the properties of trend fitting in
greater depth. This is hardly a new use, but it is one
that has not been systematically explored for its long-run
forecasting abilities. Section 3 considers the post-sample
evaluation of four different trend fitting curves
for total U.S. personal consumption, where the initial
forecasting was done in Granger (1980, 1989). Section
4 employs two popular trend fitting models, a random
walk model with drift and a trend-stationary model
(with equal weights and with exponential declining
weights). Although the two models indicate similar
long-term mean forecasts, their confidence intervals
diverge as the forecasting horizons increase. Using the
uncertainty measure helps to differentiate between two
similar models in the trend fitting. Section 5 presents
our conclusions.
[2] Kahn and Wiener use a variety of forecasting techniques, but
particularly:
(a) Trendline fitting for population and economic forecasts. This
produced a mixed success: good if a satisfactory trend curve is
chosen, but otherwise quite incorrect.
(b) Scenarios thinking about plausible futures. Kahn and Wiener
are again successful on occasions, but miss some substantial future
developments.
(c) A form of Delphi for looking at scientific breakthroughs,
which involves talking to prominent scientists. This technique is
often successful, as some current trends do continue, but others
prove to be dead ends.
[3] Some variables, such as population, change fairly slowly and
ponderously, while others, such as GDP, change in a steady fashion.
These are thus candidates for numerical forecasts, and hence for
evaluation.
[4] When considered together, one gets a diverse, varied set of
trends, described frequently in Kahn and Wiener by the delightfully
old-fashioned term "multifold trends".
2. Forecasting science: an evaluative review of
Kahn and Wiener (1967)
The activities of futurists in the 1960s could be
divided into "hard" and "soft" components. The hard
side dealt with technological forecasting, often based on
forecasts of scientific developments, as in Gabor (1963),
or on trend fitting, to extrapolate measures of technical
ability, such as computer speed or the maximum speed
of aircraft. The soft side largely consisted of scenario
writing, in which an attempt was made to describe how a
future society would look if various trends continued
and interacted with each other and with expected
scientific discoveries.
It is useful to consider Kahn and Wiener's study in
some detail, because it illustrates the possibility of
making some worthwhile long-run forecasts and the
obvious difficulties of doing so. To be sensible, the
society in the scenario has to fit together; that is, it must
be coherent. A central, or "surprise-free", scenario is
taken to be the most likely, but a variety of more extreme
possible scenarios can also be envisioned, such as the
weakening of communism, the breakdown of democracy,
or the aftermath of a nuclear war. About half of Kahn
and Wiener (1967) is concerned with such scenarios,
which make interesting enough reading, rather like
viewing George Orwell's 1984 in 1960, but such
projections or forecasts are virtually impossible to
evaluate. The surprise-free scenario for 2000 is roughly
correct in some respects but quite wrong in others. Part
of the problem is that a scenario does not forecast a
particularly tight period, such as the year 2000, but rather
a longer time period, say 1975 to 2010. For example,
there is a great deal of attention paid to weapons and
military spending, which was important for much of the
intervening period, but was less so by 2000.
Hard forecasts are rather easier to evaluate, although
difficulties remain.[5] The authors provide a list
of "100 Technical Innovations Very Likely in the Last
Third of the Twentieth Century". Using entirely
personal subjective criteria, we would judge that
thirty-five of these innovations have occurred, forty-eight
have not, fourteen have partially occurred (that
is, they are in the very early stages) and three could not
be classified because their statements were unclear.
Some examples of successful forecasts are more
reliable and longer-range weather forecasting; extensive
and intensive use of high altitude cameras; new
techniques for cheap, convenient, and reliable birth
control; a general and substantial increase in life
expectancy; high quality medical care for undeveloped
areas; automated grocery and department stores
(becoming used); and home computers to run
households and communicate with the outside world.
Examples of forecasts that are unsuccessful so far
are intensive and/or extensive expansion of tropical
agriculture and forestry (chopping down trees does not
qualify); some control of weather and/or climate; new
and more reliable educational and propaganda
techniques for affecting human behavior, both public
and private; human hibernation for extensive periods
(months to years); the capability to choose the sex of
unborn children (not counting selective abortion); new
techniques and institutions for the education of
children (computers?); and physically non-harmful
methods of over-indulging.
Some of these examples illustrate how difficult it is
to interpret the forecast and judge whether it is
successful or not.[6] We have listed just a few of the
one hundred innovations given in the book as
examples. If our personal classifications are accepted,
it is unclear whether having thirty-five out of one
hundred correct is considered a success, as there is
nothing with which to compare it. Many of the less
successful forecasts have a problem with time; these
innovations seem likely to come about in the next
decade or so.
[5] The first chapter lists a number of basic trends, most of which
proved to be correct, and several of which will be relevant in the
future. Also in chapter one, the classification of future solutions into
groups from pre-industrial to post-industrial needs redefining but
remains useful. The canonical variables from the standard world
from Chapter 6 are also helpful in organizing one's thinking.
[6] Examples of half-successes by our chosen scheme are
improved acceptability of the change of sex of children and/or
adults; permanent manned satellites and lunar installations, and
interplanetary travel; and flexible penology without using prisons.
Examples of forecasts that we could not judge include new or
improved super performance fabrics (papers, filters, plastics); a
greater use of underground buildings; and major improvements in
earth moving and construction equipment generally. It is unclear
how one evaluates words such as "new or improved", "greater use"
or "major improvements".
The authors also list twenty-five "less likely but
important possibilities", and a further list of ten "far
out" possibilities, none of which had occurred by the
year 2000. The first list includes, for example, room
temperature superconductors; the conversion of mammals
(humans?) to being fluid breathers; and a
technological equivalent of telepathy. The only
example that comes close to occurring is the prediction
of the use of automated freeways. The far-out list
includes anti-gravity, interstellar travel, and extremely
cheap electricity.
Some scientific advances that have happened but do
not appear on the lists are cloning, the human DNA
genome sequence map, and the development of the
high-production rice and wheat which produced the
"green revolution" that helped feed the world's
increase in population. The World Wide Web was
not specifically forecast, but the use of fast home
computers, linked together, and the widespread
availability of all the information in the Library of
Congress is discussed. With all forms of technological
forecasting, the forecasts are about what will be
possible rather than how this will be achieved. Thus,
although the web itself and its full implications were
not forecast in the book, many of the basic properties
of the system were indicated.[7]
Kahn and Wiener (1967) contains forecasts of
population, GNP, and GNP per capita for the U.S. and
several other countries to roughly the year 2000. For the
U.S., a number of other economic measures are also
forecast. On some occasions, low, middle, and high
forecasts are provided to suggest some kind of uncertainty
range, although these are not specific confidence
intervals. In what follows, some of the tables are
reproduced in a growth rate form, with actual growth
rate values for the year 2000 added. Most actuals are
available from the 2002 Statistical Abstract of the United
States (www.census.gov) or the World Population Data
Sheet, 2002, from the Population Reference Bureau.
The GNP per capita in the World Population Data
Sheet is in 2000 US$. To convert to 1965 US$ we
multiply by 0.2 (as the average inflation rate between
1965 and 2000 was about 4%). It is seen that the
forecast populations are generally higher than the
actual populations, and the GNP/capita forecasts are
much too high, as is shown in Table 1. The same
patterns occur with major individual countries, as is
shown in Table 2. It is worth noting that two major
countries in the original table, the Soviet Union and
Germany, have changed dramatically. An unintended
forecast being made by the book's authors was that
countries would stay unchanged.
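The deflation and growth-rate arithmetic used for these comparisons can be sketched as follows. The factor 0.2 and the 1965 to 2000 span come from the text; the function names and the example dollar values are hypothetical. Under the assumption of a constant annual rate, a factor of 0.2 corresponds to somewhat more than 4% a year, in line with the "about 4%" quoted above.

```python
# Arithmetic behind converting 2000 US$ to 1965 US$ and forming growth rates.
# FACTOR and the year span are from the text; example values are hypothetical.

YEARS = 2000 - 1965   # 35 years
FACTOR = 0.2          # conversion factor quoted in the text

def to_1965_dollars(value_in_2000_dollars):
    """Deflate a value in 2000 US$ to 1965 US$ using the quoted factor."""
    return value_in_2000_dollars * FACTOR

def implied_annual_inflation(factor, years):
    """Constant annual rate at which prices rise by 1/factor over `years`."""
    return (1.0 / factor) ** (1.0 / years) - 1.0

def growth_rate_percent(start, end):
    """Percentage growth between two levels, as reported in the tables."""
    return (end - start) / start * 100.0

rate = implied_annual_inflation(FACTOR, YEARS)
print(f"implied average inflation: {rate:.2%}")
print(f"30000 (2000 US$) in 1965 US$: {to_1965_dollars(30000.0):.0f}")
print(f"growth from 100 to 250: {growth_rate_percent(100.0, 250.0):.1f}%")
```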
The country-specific data shown in Table 3
indicates that, for GNP per capita for fourteen other
countries, the ratio of actual/forecast values varies from
0.09 to 1.9, which is a wide range. Table 3 illustrates
that only four forecasts were too high. Table 4 shows
that, in general, the forecasts for the U.S. were too high,
being optimistic in forecasting components of the U.S.
GNP per capita. In addition, GNP and two-thirds of the
components were forecast too high.
Kahn and Wiener also forecast labor market
variables, shown in Table 5. Men actually participate
somewhat less than expected, but women participate to
a much greater extent. Although this is not shown, it is
particularly the case for unmarried women, even
though married women's participation rates are also
high relative to expectations. Kahn and Wiener make
continued reference to declines in the work week, extra
vacations and days of rest, and general increases in
leisure. The problem of how society would cope with
the large amounts of leisure time that its workers
would enjoy around the year 2000 was a popular topic
amongst forecasters in the 1960s. This difficulty
really has not occurred, particularly in the United
States, as the above figures show, with the hours of
work in 2000 being similar to those in 1965, except for
the Retail Trade which has turned to a greater use of
part-time workers. Other developed economies do not
follow quite the same pattern, however.
As an article in Time magazine (13 June 2000)
reported, vacation times taken as mandated vary
greatly across the more-developed countries, from
30 days a year in France, Austria, Denmark, Spain and
Sweden, down to 18 in Germany and 16 in the USA.
The fact that the numerical forecasts are not exactly the
same as the actual figures for (around) 2000 should not
be interpreted negatively. It is unclear whether anyone
could have done much better. Forecasting with perfect
hindsight is all too easy, and is not to be trusted.
[7] The chapter on science and technology pays particular attention
to nuclear energy, lasers, and holography, choices which now seem
rather strange. All are important in differing degrees, but for the
latter two, less than was anticipated in 1967, we believe.
To appreciate some of the trends that produced the
forecasts of population and economic quantities, the
background to the mid-1960s should be considered. In
1967 the European Economic Community (EEC)
existed and the Vietnam War was in progress;
Thurgood Marshall was sworn in as the first black U.S.
Supreme Court Justice; and the General Agreement on
Tariffs and Trade (GATT) was signed. In 1968 the
Reverend Martin Luther King and Senator Robert
Kennedy were assassinated, and Richard M. Nixon
was elected President. Also, in 1967 the Microsoft
Corporation and Apple Computers did not exist, but
IBM was successfully making large computers. In
1972, the Dow Jones Industrial Index closed over 1000
for the first time, having risen from 753 in 1970.
Probably the last figure available to the authors for the
birth rate per 1000 in the population would have been
19.4, which was rather lower than the values for
previous years but was higher than any year since
1965. The birth rate was down to 14.8 by 1973, and
stayed below 16.0 until 1988, a decline that would
have been difficult to forecast in 1966. Similarly,
economic growth rates were high in the 1960s but took
a strong hit at the time of the first major oil price shock
in 1973–4. Thus, forecasts of population and economic
growth would be likely to be too high if trends were
based on previous economic growth levels.

Table 1
Population and GNP per capita of the continents
                                              Forecast       Actual
                                              growth rate    growth rate
Population of continents (Book Table 3)
  Africa                                        150.72         158.79
  Asia                                           95.92          94.75
  Europe                                         31.32           7.99
  Oceania                                        78.57         119.60
  North and South America                       116.77          81.76
GNP per capita of continents (Book Table 3)
  Africa                                         96.45         178.01
  Asia                                          279.61         463.16
  Europe                                        269.25         135.94
  Oceania                                       115.50          87.70
  North and South America                       116.21          87.28
Note: Growth rates from 1965 to 2000. The book divides the last
category into North and South America, but it is unclear which
countries fall into the two groups, particularly with Mexico. The
original source of the data is Kahn and Wiener (1967), but the raw
data were manipulated in Tables 1–6. For example, "Book Table 3"
implies that the raw data was adapted from the book, but that the
table represents data transformations made by the authors.

Table 2
Population and GNP per capita of major countries
                                     Forecast growth rate          Actual
                                     Low       Mid       High      growth rate
Population of major countries (Book Table 12)
  United States                      48.72     63.08     85.64      44.79
  Canada                             70.00     90.00    115.00      56.39
  France                             20.41     30.61     38.78      20.41
  U.K.                                0.00      9.09     16.36       9.09
  Japan                              18.37     25.51     41.84      29.29
  India                              87.68    102.87    131.62     105.89
  China                              31.39     68.34    111.92      67.22
GNP per capita of major countries (Book Table 14)
  United States                      33.82    185.63    250.86      91.73
  Canada                             63.96    186.93    251.87     120.54
  France                            132.85    254.99    371.41     153.85
  U.K.                               97.89    261.97    367.85     161.09
  Japan                             365.58    902.33   1066.86     531.97
  India                              13.13    172.73    310.10     372.73
  China                               8.16    227.55    888.78     700.00
Note: Growth rates from 1965 to 2000.

Table 3
GNP per capita for fourteen contender countries
                     Forecast       Actual
                     growth rate    growth rate
  Sweden               247.58          91.99
  Australia            129.57         148.58
  New Zealand           65.37          91.82
  Israel               337.71         189.81
  Poland               282.54          87.11
  Romania              325.89          68.03
  Argentina            164.23         389.84
  Mexico                49.45         286.37
  Brazil                80.71         421.43
  Colombia              29.60         337.55
  Thailand             219.05         903.17
  Pakistan             119.78         308.79
  Nigeria               50.60          92.77

Table 4
Components of US GNP and GNP per capita
                                               Forecast growth rate   Actual
                                               Low        High        growth rate
Components of US GNP (Book Table 26)
  Total GNP                                    219.68     432.75      189.95
  Personal consumption expenditure             236.11     460.19      211.50
  Gross private domestic investment            192.52     388.79      230.37
  Government purchases of goods and services   202.21     403.68      156.03
Components of US GNP per capita (Book Table 29)
  Per capita GNP                                95.60     229.81       99.35
  Per capita disposable income                 103.32     238.17      106.55
  Personal consumption                          98.20     215.32      114.58
In the thirty-three years from 1967 to 2000, the U.S.
working-age population doubled, the total population
increased by 40%, retail prices increased five times, real
GNP per capita increased by 80%, and stock prices
increased more than ten-fold (and more than doubled in
real terms), illustrating the challenges of long-run
forecasting. The price increases were not uniform across
industries, as shown by the relative consumer price
indices for major groups in Table 6. Although many of
the ratios are in the 4.0 to 5.0 range, medical care (at
9.06), and particularly fuel oil (at 15.96), are much higher
than other categories, whereas telecommunications (at
2.42) is notably low. It seems that the higher
quality resulting from scientific advances comes at a high
cost in the medical field, which is not well known for its
level of competitiveness. In comparison, in the highly
competitive telecommunications industry, advances
have resulted in both higher quality and lower costs.
What was missed by Kahn and Wiener (1967)?
Although they discussed the whole world, they do
concentrate on the United States as the largest economy,
and we will do the same. Here we ask the question, what
aspects of the U.S. economy did the writers fail to
forecast correctly for the year 2000 when writing in
1967? The growth in population and GNP per capita
were forecast, but the impacts of these tendencies were
not realized. Land and house prices increased in
particularly popular areas, which also attracted many
of the strongest growth industries. Further, people lived
longer due to improved lifestyles and medical
advances, so their earnings during their working years
had to provide for a longer retirement period, leading to
little or no increase in leisure, more women working, and
later rather than earlier retirement ages. There were no
forecasts of the effects of these developments on either
savings rates or social security.
Major developments in the social area that were not
discussed were the major decline of Communism in
Europe, the consequent break-up of the USSR and the
unification of Germany, the peaceful racial democratic
changes in South Africa, and the major migrations that
occurred. Immigration in the U.S. has been a major
factor in both population growth and economic development.
The subject index of Kahn and Wiener (1967)
makes no mention of either migration or immigration.
Pollution was forecast to be a major future problem,
but the form that this has taken, global warming, was
not foreseen. It would be unfair to be critical of these
missed forecasts, even though they are important, as
several were very difficult to predict. However, the
book did consider the possibility of a Mexican
migration flow but dismissed it on societal grounds,
ignoring the economic motivation.

Table 5
U.S. labor force participation: Rates and average weekly hours of work
                                   Forecast       Actual
                                   difference     difference
Labor force participation rates (%) by age and sex (Book Table 22), 1964 to 2000
  Male    Total                       0.9            2.5
          20–24                       0.9            4.0
          25–34                       0.1            2.7
          35–44                       0.4            3.4
          45–54                       0.6            5.8
          55–64                       0.9           16.8
          65+                         8.1            9.6
  Female  Total                       5.4           23.2
          20–24                       4.9           24.1
          25–34                       4.1           39.2
          35–44                       6.5           32.5
          45–54                      10.7           25.8
          55–64                       9.4           12.0
          65+                         0.4            0.2
Average weekly hours of work in selected industries (Book Table 23), 1965 to 2000
  Manufacturing                       9.1            0.2
  Contract construction               6.2            1.8
  Retail trade                        4.0            5.4
  Wholesale trade                     8.7            2.0
  Bituminous coal mining             10.0            5.5

Table 6
Consumer price index by major groups
                         Ratios        Ratios
                         1996/1965     2000/1965
  All items                4.98          5.47
  Commodities              4.36          4.77
  Energy                   7.47          8.45
  Food                     3.01          4.05
  Apparel and upkeep       2.24          2.76
  Medical care             5.67          6.08
  Fuel oil                15.96         18.24
  Electricity              4.24          3.32
  Telephone services       2.42          NA
  Shelter                  4.88          4.76
Depending on how we evaluate the success of long-
term forecasts, the predictions of Kahn and Wiener
may or may not have performed well. To sum up,
given the difficulties of long-term forecasting and the
lack of information they were faced with, we argue that
Kahn and Wiener made a great contribution with
regard to forecasting the future.
3. Trend forecasting with out-of-sample and post-
sample data: an example
Granger's textbook produced for his undergraduate
forecasting class, Forecasting in Business and Economics,
provides an example of trend-line fitting and
forecasting using total U.S. personal consumption data.
The first edition of the book, published in 1980, provides
a table of total U.S. personal consumption (billions of
1958 dollars) for the period 1947–1974. The annual data
for 1948–1964, which contain 17 observations, are used
to fit four different curves, and forecasts are then made,
using these curves, for the period 1965–1977 (which
are only evaluated up to 1974, however, due to data
availability). The second edition was published in 1989,
updating the total U.S. personal consumption (billions of
1958 dollars) up to the year 1985.[8] Unfortunately, the
second edition did not expand the fitting period, keeping
the same estimation period of 1948–1964, and thus
providing the same parameters for the four curves.
However, the evaluation is made for the updated sample
period of 1965–1988.
[8] Note that the conversion used to convert prices with a 1982 base
to a 1958 base is to multiply by 0.33.

Table 7
Trend forecasting percentage errors for U.S. real personal consumption expenditures
Year     Linear    Exponential    Modified       Parabolic
                                  exponential
1965      -7.73       -6.29          -6.03          -6.38
1966      -9.91       -7.75          -7.47          -7.93
1967     -10.17       -7.20          -6.88          -7.48
1968     -12.52       -8.76          -8.41          -9.15
1969     -13.51       -8.87          -8.50          -9.42
1970     -13.01       -7.35          -6.94          -8.08
1971     -14.35       -7.75          -7.31          -8.68
1972     -17.47      -10.06          -9.60         -11.19
1973     -19.50      -11.19         -10.70         -12.54
1974     -15.82       -5.94          -5.39          -7.66
1975     -19.85       -9.26          -8.69         -11.21
1976     -22.46      -11.00         -10.41         -13.23
1977     -24.09      -11.64         -11.02         -14.19
1978     -25.82      -12.40         -11.75         -15.30
1979     -26.16      -11.50         -10.81         -14.81
1980     -24.53       -8.16          -7.40         -12.02
1981     -24.18       -6.29          -5.48         -10.69
1982     -23.85       -4.36          -3.49          -9.35
1983     -26.65       -6.37          -5.49         -11.76
1984     -29.10       -7.99          -7.08         -13.81
1985     -31.42       -9.48          -8.54         -15.74
1986     -32.95       -9.98          -9.00         -16.75
1987     -34.01       -9.84          -8.82         -17.19
1988     -35.52      -10.33          -9.28         -18.23
1989     -36.26       -9.74          -8.64         -18.31
1990     -36.52       -8.46          -7.30         -17.78
1991     -35.62       -5.42          -4.18         -15.74
1992     -36.69       -5.22          -3.94         -16.26
1993     -37.79       -5.09          -3.76         -16.85
1994     -39.13       -5.32          -3.95         -17.78
1995     -39.84       -4.59          -3.16         -17.90
1996     -40.97       -4.53          -3.05         -18.61
1997     -42.30       -4.80          -3.27         -19.61
1998     -44.28       -6.21          -4.66         -21.57
1999     -46.24       -7.65          -6.08         -23.55
2000     -47.93       -8.69          -7.09         -25.19
2001     -48.47       -7.77          -6.10         -25.22
2002     -49.50       -7.70          -5.99         -25.97
2003     -50.37       -7.37          -5.60         -26.52

Fig. 1. Forecasting personal consumption using trend curves.
The four different trend curves fitted in Granger
(1980, 1989) are the linear, exponential, parabolic, and
modified exponential curves, although trend-curve
fitting methods rarely have any solid economic theory
underlying the forecasts. In the first attempt, Granger
(1980) uses a much simpler (but possibly suboptimal)
estimation method, where the averages of five adjacent
points are used to fit the curves. A three-point method
is used for fitting the parabolic curve and the modified
exponential curve. The three points are the weighted
average of the first five terms, the weighted average of
the middle five terms, and the weighted average of the
last five terms. Two points, the weighted
averages of the first five terms and of the last five terms,
completely determine the parameters of the two-parameter
curves, the straight line and the
exponential curve. The estimated parameters from
using either the two- or the three-point method are:[9]
(i) the linear line, $C(t) = a + bt = 192.6 + 9.687t$
(ii) the exponential curve, $C(t) = \exp(a + bt) = \exp(5.3033 + 0.0343t)$
(iii) the parabolic curve, $C(t) = a + bt + ct^{2} = 201.96 + 6.449t + 0.1675t^{2}$
(iv) the modified exponential curve, $C(t) = a + br^{t} = 3.131 + 197.79\,(1.0355)^{t}$.
[9] Granger (1980) estimates the linear function using OLS, $C(t) = 191 + 9.69t$, which is very similar to the two-point estimates. Granger
(1989, page 37) claims that using the OLS estimated exponential
function gives forecasting values that are too high, and thus that the
unsophisticated estimation of the two point method produces the better
forecasts, although this is not what we would usually expect to occur.
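The four fitted curves can be evaluated directly from the reported parameters, as in the minimal sketch below. The parameter values are those given above; the time origin is an assumption not stated in the text (t = 1 is taken to be 1948, the first year of the fitting sample, so that 1965 is t = 18), and the function names are hypothetical.

```python
import math

# Parameter values are those reported in the text for the four trend curves.
# Assumption (not stated in the text): t = 1 corresponds to 1948, the first
# year of the 1948-1964 fitting sample, so 1965 corresponds to t = 18.

def linear(t):
    return 192.6 + 9.687 * t

def exponential(t):
    return math.exp(5.3033 + 0.0343 * t)

def parabolic(t):
    return 201.96 + 6.449 * t + 0.1675 * t ** 2

def modified_exponential(t):
    return 3.131 + 197.79 * 1.0355 ** t

def time_index(year, first_year=1948):
    """Map a calendar year to the assumed time index."""
    return year - first_year + 1

CURVES = {
    "linear": linear,
    "exponential": exponential,
    "parabolic": parabolic,
    "modified exponential": modified_exponential,
}

# Forecasts in billions of 1958 dollars for a few selected years.
for year in (1965, 1985, 2003):
    t = time_index(year)
    row = ", ".join(f"{name}={curve(t):.1f}" for name, curve in CURVES.items())
    print(year, row)
```

Under this indexing the four curves give similar values in 1965 and then separate, with the linear curve lowest and the two exponential curves highest at long horizons.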
Fig. 2. Trend stationary forecasting: the case of the U.S.
Granger (1980, pp. 35–36) reports that, over the
estimation and out-of-sample periods, the straight line
is the worst approximation, while the other three
curves are indistinguishable. Each curve's downward
bias for the out-of-sample forecasting period from
1965 to 1974 was noted. Later, Granger (1989, p. 37)
claims that the exponential and modified exponential
forecasts are almost identical, and that both are
superior to those of the linear and parabolic curves,
with acceptable levels of error such as 1%–7%.
In this paper, the annual data of real personal
consumption expenditures is updated to 2003 (billions
of chained 2000 dollars) from the Federal
Reserve Bank of St. Louis.[10] Using post-sample data,
the trend forecasting is carried out for four different
methods, shown in Fig. 1, using the parameter
estimates from the 1980 book. The forecasting
evaluation is performed using
$$\text{error} = \frac{\text{forecast} - \text{actual}}{\text{actual}} \times 100.$$
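The sign convention of this error measure can be made explicit with a short sketch. The function name and the numbers are hypothetical and are not taken from the consumption data.

```python
def percentage_error(forecast, actual):
    """Forecast error as a percentage of the actual value."""
    return (forecast - actual) / actual * 100.0

# Hypothetical values: a forecast below the actual gives a negative error,
# a forecast above the actual gives a positive one.
under = percentage_error(forecast=90.0, actual=100.0)
over = percentage_error(forecast=110.0, actual=100.0)
print(under, over)
```

A trend curve with a downward bias therefore produces negative percentage errors throughout.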
Table 7 shows the out-of-sample forecast percentage
errors. These errors tend to grow in magnitude with the
horizon for the linear and parabolic curves, but not for
the two exponentials. Over a 39 year period, the simple
exponential has a percentage error of no more than 12.40
in magnitude on all occasions,
and for 23 horizons is less than 8% wrong. The
modified exponential percentage error is always less
than that of the simple exponential, with 28 horizons
under 8%. The most surprising statistic is that the
modified exponential curve has had single-digit
percentage errors since 1980, and has been no larger
than 7.09 in magnitude since 1991. Thus, it appears that this very
simple trend model can effectively forecast this
particular economic series almost forty years ahead.
[10] The conversion used for converting prices with a 2000 base to a
1958 base is to multiply by 0.20117, after comparing the common
data between 1947 and 1974.
Fig. 3. Trend stationary forecasts with exponential weights: the case of the U.S.
These two exponential curves can be written as
$\alpha + \beta\gamma^{t}$, where for the exponential, $\alpha = 0$, $\beta = e^{a}$,
$\gamma = e^{b}$, and for the modified exponential, $\alpha = a$, $\beta = b$,
$\gamma = r$. Thus, the modified exponential is superior,
because it adds a further, and useful, coefficient. It is
nevertheless worth noting that in Table 7, all of the
errors in the exponential columns are negative,
suggesting that the modified exponential would
have done even better if the original estimate of
had been larger. These results are also easily seen in
Fig. 1. The superiority of the exponential and the
modified exponential curves is due to the relatively
stable growth rate of consumption. The level of
consumption, however, is not easily forecastable
because of its non-stationarity.
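The relationship between the two exponential curves can be sketched in a few lines. This is an illustration only: the names `alpha`, `beta` and `gamma` for the three coefficients of the common form, and all parameter values used with it, are assumptions made here. The simple exponential exp(a + b t) is the special case with a zero additive term, which implies a constant growth ratio between consecutive periods.

```python
import math


def trend_curve(t: float, alpha: float, beta: float, gamma: float) -> float:
    """Common three-coefficient form: alpha + beta * gamma**t."""
    return alpha + beta * gamma ** t


def simple_exponential(t: float, a: float, b: float) -> float:
    """exp(a + b*t), written as the common form with alpha = 0."""
    return trend_curve(t, 0.0, math.exp(a), math.exp(b))


def modified_exponential(t: float, a: float, b: float, r: float) -> float:
    """a + b * r**t, i.e. the common form with a free additive term."""
    return trend_curve(t, a, b, r)


# Hypothetical parameters: the simple exponential grows by the same
# ratio exp(b) every period, whatever the starting point.
growth_ratio = simple_exponential(11, 1.0, 0.03) / simple_exponential(10, 1.0, 0.03)
```

The extra additive coefficient is what distinguishes the modified curve; with it set to zero the two functions coincide.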
4. Long-term forecast evaluation with uncertainty
In this section we consider the two simplest models
for long-term trend forecasts, a stochastic random
walk model with drift and a deterministic
trend-stationary model. These two empirical models
imply similar mean forecasts, but their confidence
intervals behave differently as the forecast horizon
lengthens. The distinction between difference
stationary and trend stationary models has been
debated. In its simplest form, the random walk model
with drift consists of a non-zero mean and a shock, and
its confidence interval widens as the forecast horizon
grows. A trend stationary model contains a linear time
trend and a white noise innovation, and its confidence
interval does not change across horizons. Thus,
forecast uncertainty offers another way to
differentiate between two popular models for
forecasting time trends. Note that adding complexity to
the cycles within these models does not necessarily add
valuable information.

Fig. 4. Differenced stationary forecasting: the case of the U.S.
In both models, long-term forecasts are made for
5 years, 10 years and 15 years into the future, where
the uncertainty is measured by quantiles such as the
25% and 75% quantiles around the point forecasts. The
natural question of interest is how much of the time
the actual values lie inside these error bands.
Specifically, the time series of interest include the
means, medians, and quantiles such as 10%, 25%, 75%
and 90%.[11] We iterate these sequences of forecasts,
and find that they can differentiate between the two
time series models.

Fig. 5. Likelihood for trend and difference stationary forecasts.

[11] Under the normality assumption, the mean is the same as the
median.
For the trend stationary model, we estimate the
linear time trend and the residual variance from
in-sample data, since the series contains a
deterministic trend accounting for the sustained
increase over time plus a stationary random
disturbance term. That is,
\[
y_t = a + b \cdot t + e_t ,
\]
where $e_t$ is white noise with zero mean and variance
$\sigma_e^2$. Thus the $h$-step-ahead forecast is
$f_{n,h} = \hat{a} + \hat{b}(n + h)$, and its forecast
error variance is $V(e_{n,h}) = \sigma_e^2$. Based on the
data and model information, we can form density
forecasts, and thus quantiles. Fixing the starting
year, we move in one-year increments and continue
iterating this cycle.
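A minimal sketch of one step of this cycle is given below, assuming ordinary least squares on t = 1, ..., n and normal quantiles. The function names, the n - 2 divisor for the residual variance, and the toy series are assumptions made for illustration; parameter uncertainty is ignored, so the band has the same width at every horizon.

```python
from statistics import NormalDist, mean


def fit_linear_trend(y):
    """OLS fit of y_t = a + b*t + e_t for t = 1..n.

    Returns (a_hat, b_hat, sigma_e). The residual standard deviation uses
    an n - 2 divisor, which is an assumption of this sketch.
    """
    n = len(y)
    t = list(range(1, n + 1))
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * t_bar
    resid = [yi - a_hat - b_hat * ti for ti, yi in zip(t, y)]
    sigma_e = (sum(r * r for r in resid) / (n - 2)) ** 0.5
    return a_hat, b_hat, sigma_e


def trend_quantiles(y, h, probs=(0.05, 0.5, 0.95)):
    """Quantiles of the h-step-ahead density forecast a_hat + b_hat*(n + h)."""
    a_hat, b_hat, sigma_e = fit_linear_trend(y)
    point = a_hat + b_hat * (len(y) + h)
    z = NormalDist()  # standard normal
    return {p: point + z.inv_cdf(p) * sigma_e for p in probs}


# Hypothetical log-level series, for illustration only.
toy = [1.0, 2.2, 2.9, 4.1, 5.0, 6.2]
q5 = trend_quantiles(toy, 5)
q15 = trend_quantiles(toy, 15)
```

Comparing `q5` and `q15` shows the point forecast moving along the trend line while the distance between the 5% and 95% quantiles stays fixed.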
Alternatively, rather than putting equal weights on
historical data, exponentially decreasing weights are
used instead. That is, we choose $\hat{a}$ and $\hat{b}$
to make the sum of discounted squared residuals as
small as possible:
\[
\sum_{t=1}^{T} \lambda^{T-t} \left( y_t - \hat{a} - \hat{b} \cdot t \right)^2 .
\]
The value of $\lambda = 0.8$ is used in an exploratory
study, while $\lambda = 1$ leads to ordinary least
squares estimation. After fixing the starting year
again, we move in one-year increments of $T$ and
continue iterating this cycle until the last observed
year.
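The discounted criterion has a closed-form solution, namely weighted least squares with weight equal to the discount factor raised to the power T - t on observation t. The sketch below is one way to compute it; the function name, the argument name `lam` and the toy series with a change in slope are assumptions for illustration.

```python
def fit_discounted_trend(y, lam=0.8):
    """Minimise sum over t = 1..T of lam**(T - t) * (y_t - a - b*t)**2.

    Returns (a_hat, b_hat). With lam = 1 this reduces to ordinary least squares.
    """
    T = len(y)
    t = list(range(1, T + 1))
    w = [lam ** (T - ti) for ti in t]  # most recent observation has weight 1
    sw = sum(w)
    t_bar = sum(wi * ti for wi, ti in zip(w, t)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (ti - t_bar) ** 2 for wi, ti in zip(w, t))
    sxy = sum(wi * (ti - t_bar) * (yi - y_bar) for wi, ti, yi in zip(w, t, y))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * t_bar
    return a_hat, b_hat


# Hypothetical series whose slope changes from 1 to 2 halfway through.
kinked = [1.0, 2.0, 3.0, 4.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
_, slope_ols = fit_discounted_trend(kinked, lam=1.0)
_, slope_discounted = fit_discounted_trend(kinked, lam=0.8)
```

On the kinked series the discounted slope lies closer to the recent growth rate than the equally weighted one, which is the motivation for down-weighting older data.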
For the random walk model with drift, the wandering
associated with the random walk is dominated by the
positive drift term. We take the first difference of
the series, and obtain the drift and the residuals.
The difference stationary model is then
\[
y_t - y_{t-1} = m + \varepsilon_t ,
\]
where $\varepsilon_t$ is white noise with zero mean and
variance $\sigma^2$. Therefore, the $h$-step-ahead
forecast is $g_{n,h} = h \cdot \hat{m} + y_n$, and its
forecast error variance is
$V(\varepsilon_{n,h}) = h\sigma^2$, which increases
as the forecast horizon $h$ increases.
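A corresponding sketch for the random walk with drift follows, again assuming normal quantiles and ignoring parameter uncertainty. The function name and the toy series are hypothetical. Because the forecast error variance grows in proportion to h, the width of the band grows with the square root of h.

```python
from statistics import NormalDist, mean, stdev


def drift_quantiles(y, h, probs=(0.05, 0.5, 0.95)):
    """Quantiles of the h-step-ahead forecast y_n + h * m_hat.

    m_hat is the mean of the first differences and sigma their sample
    standard deviation; the forecast standard deviation is sqrt(h) * sigma.
    """
    diffs = [later - earlier for earlier, later in zip(y, y[1:])]
    m_hat = mean(diffs)
    sigma = stdev(diffs)  # needs at least two differences
    point = y[-1] + h * m_hat
    spread = (h ** 0.5) * sigma
    z = NormalDist()  # standard normal
    return {p: point + z.inv_cdf(p) * spread for p in probs}


# Hypothetical series, for illustration only.
toy = [0.0, 1.0, 3.0, 4.0, 6.0]
q1 = drift_quantiles(toy, 1)
q4 = drift_quantiles(toy, 4)
```

Here the band at h = 4 is twice as wide as the band at h = 1, in contrast to the constant-width band of the trend stationary model.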
The data used are the annual real gross domestic
product for the United States (billions of chained 2000
dollars) between 1929 and 2003, obtained from the
Federal Reserve Bank of St. Louis.[12] Data from five
other countries are also considered.[13] Logs are taken
before modeling the series. The first models are
estimated using data from 1947 (1950 for the other
countries) to 1965, and are then used to forecast one
year ahead for 1966, five years ahead for 1970, ten
years ahead for 1975, and fifteen years ahead for 1980.
The residual variances are also calculated and used for
building the 5% and 95% quantiles under the standard
normality assumption. The procedure is iterated in
one-year increments from a fixed starting year, until
forecasts for the year 2003 are reached for each
forecasting horizon. The U.S. results are shown in
Figs. 2, 3 and 4. The likelihoods of the observed
values, calculated under the normality assumption, are
shown in Fig. 5.
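The iteration can be sketched as an expanding-window loop. The version below uses the random walk with drift as the forecasting model and reports both the share of realised values that fall inside the 5% to 95% band and the average normal log-likelihood of the realised values. The function name, arguments and synthetic series are assumptions made for illustration and do not reproduce the GDP results.

```python
import math
from statistics import NormalDist, mean, stdev


def rolling_evaluation(y, first_end, h, lo=0.05, hi=0.95):
    """Expanding-window evaluation of h-step random-walk-with-drift forecasts.

    The estimation sample always starts at y[0] and ends one observation
    later on each pass. Returns (coverage, mean_log_likelihood).
    Assumes the first differences are not all identical (sigma > 0).
    """
    z_lo, z_hi = NormalDist().inv_cdf(lo), NormalDist().inv_cdf(hi)
    hits, loglik = [], []
    for end in range(first_end, len(y) - h + 1):
        sample = y[:end]
        diffs = [b - a for a, b in zip(sample, sample[1:])]
        m_hat, sigma = mean(diffs), stdev(diffs)
        point = sample[-1] + h * m_hat
        spread = sigma * h ** 0.5
        actual = y[end + h - 1]  # value realised h steps after the sample ends
        hits.append(point + z_lo * spread <= actual <= point + z_hi * spread)
        # Normal log-density, computed directly to avoid underflow.
        loglik.append(-0.5 * math.log(2 * math.pi * spread ** 2)
                      - 0.5 * ((actual - point) / spread) ** 2)
    return sum(hits) / len(hits), mean(loglik)


# Synthetic log-level series with steady growth, then a copy with a level break.
steady = [0.02 * t + (0.01 if t % 2 else 0.0) for t in range(30)]
broken = steady[:20] + [v + 1.0 for v in steady[20:]]
cov_steady, ll_steady = rolling_evaluation(steady, 10, 2)
cov_broken, ll_broken = rolling_evaluation(broken, 10, 2)
```

On the steady series every realised value falls inside the band, whereas the series with a break shows lower coverage and a lower average log-likelihood, illustrating how a single break degrades otherwise adequate trend forecasts.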
Our discussion will concentrate on the ten and fifteen
year forecasts. Figs. 2 and 4 suggest that the two simple
models being considered are not easy to differentiate
between in terms of the central or mean forecast, but they
do differ substantially in terms of the spread, and thus in
likelihood, as seen in Fig. 5. Over both the ten- and
fifteen-year horizons, both models consistently forecast
quite well, but for this example, the random walk with
drift is superior. The actual GDP is quite close to the
forecast mean and well within the confidence intervals.
This is not so obviously the case with the linear model.
The exercises using data from other countries proved
to be much less satisfactory, as these GDP growth rates
were much further from constant. By giving recent
values of the data fitting period higher weight in the
estimation method than the earlier data, improved
forecasts were obtained, but when a substantial change
in growth occurs several years in the future, it is
clear that no trend curve method will perform very
well. To sum up, the results illustrate the superiority
of the random walk with drift, as it has more realistic
confidence intervals.
These confidence intervals have ignored parame-
ter uncertainty, as we follow the standard forecasting
procedure of estimating a model and then using it to
forecast, ignoring the fact that its parameters are
estimated. The model is a tool, and when it is used we
are not considering all the other tools that could have
been used, but were not. A workman using a hammer
does not worry about all the other hammers that could
have been made, but just considers the one in hand.
5. Conclusion: where next?
To look ahead thirty years is a particularly difficult
task, and forecasting the next decade is not easy
either. What we hope to have shown, however, is that
such tasks are not impossible. Will the low rates of
birth continue? Will technology-driven high rates of
labor productivity continue, providing low inflation
and unemployment rates? Will the business cycle return?
One can also ask how many of the scientific questions
will be resolved and become economically relevant. To
guesstimate the answers and the growth rates for trends
for the next ten years is feasible, but forecasting
further ahead is much more difficult. We doubt whether
anyone could have done much better than Kahn and
Wiener (1967), even though their forecasts turned out
to be far from perfect.

[12] http://research.stlouisfed.org/fred2/series/GDPCA.
[13] The data for the other countries, including Australia
(1950–2003), Japan (1950–2002), France (1950–2003), Germany
(1950–2003), and the United Kingdom (1950–2003), are obtained from
the IMF's International Financial Statistics. The analysis of these
countries is available upon request.
Long-run forecasting is a good field to participate
in, as it is a long time before your forecasts can be
evaluated. The forecasts evaluated in Sections 2 and 3
were made in 1967 and 1980, and thus used rather
simple methods. In use at that time were Delphi and a
group of simple deterministic trends. The simple
random walk model with drift was also available by
1980, but was not viewed as a plausible long-run
forecast. It would now be possible to use more
complicated time series models, such as fractional
unit roots with trends of fractional power. One may
have to bootstrap to get approximate confidence
intervals, and then wait for many years to decide
whether the method works. We would suggest
pretending that we do not know what has happened
over the past decade or so, fitting a trend to a data
set ending in 1990, say, and seeing how well it does
in forecasting 2000 and onward. An explanation of
new models and techniques is certainly worthwhile but
may be difficult to achieve. It is clear from both our
experience and that of Kahn and Wiener that the
occurrence of future major breaks is the main reason
that simple statistical long-term forecasts are of poor
quality. It seems that attention needs to be directed to
the equally difficult task of forecasting such breaks.
Certainly people keep trying; for example, McRae
(1994) discusses the world in 2020, and has a chapter
entitled "North America: The Giant in Retreat". It
should be interesting for someone to evaluate his
forecasts in twenty years' time.
Acknowledgement
We are grateful to Dennis Ahlburg for the helpful
comments and suggestions.
References
Bell, D. (Ed.). (1968). Towards the year 2000. Boston: Houghton,
Mifflin and Co.
Gabor, D. (1963). Inventing the future. London: Secker and
Warburg.
Granger, C. W. J. (1980). Forecasting in business and economics.
Academic Press Inc.
Granger, C. W. J. (1989). Forecasting in business and economics
(2nd Edition). Academic Press Inc.
Kahn, H., & Wiener, A. J. (1967). The year 2000: A framework for
speculation on the next thirty-three years. New York: Macmillan
Company.
McRae, H. (1994). The world in 2020. Cambridge: Harvard
Business School Press.
Meadows, D. H., Meadows, D. L., Randers, J., & Behrens, W. W.
(1972). The limits to growth. New York: Universe Books.
Meadows, D. H., Meadows, D. L., Randers, J., & Tinbergen, J.
(1992). Beyond the limits: Global collapse or a sustainable
future? London: Earthscan.
Rostow, W. W. (1978). The world economy. History and prospects.
Austin: University of Texas Press.
