
MANAGEMENT SCIENCE

Articles in Advance, pp. 1-17


ISSN 0025-1909 | EISSN 1526-5501
DOI 10.1287/mnsc.1110.1382
© 2011 INFORMS
Demand Forecasting Behavior:
System Neglect and Change Detection
Mirko Kremer, Brent Moritz
Smeal College of Business, Pennsylvania State University, University Park, Pennsylvania 16802
{mirko.kremer@psu.edu, bmoritz@psu.edu}
Enno Siemsen
Carlson School of Management, University of Minnesota, Minneapolis, Minnesota 55455, siems017@umn.edu
We analyze how individuals make forecasts based on time-series data. Using a controlled laboratory experiment, we find that forecasting behavior systematically deviates from normative predictions: Forecasters overreact to forecast errors in relatively stable environments, but underreact to errors in relatively unstable environments. The performance loss that is due to such systematic judgment biases is larger in stable than in unstable environments.
Key words: forecasting; behavioral operations; system neglect; exponential smoothing
History: Received December 22, 2009; accepted April 16, 2011, by Martin Lariviere, operations management.
Published online in Articles in Advance.
1. Introduction
Demand forecasting in time-series environments is
fundamental to many operational decisions. Poor
forecasts can result in inadequate capacity, excess
inventory, and inferior customer service. Given the
importance of good forecasts to operational success,
quantitative methods of time-series forecasting are
well known and widely available (cf. Makridakis et al.
1998). Although companies frequently have access to
sophisticated quantitative methods embedded in fore-
casting software, empirical evidence shows that real-
world forecasting often relies on human judgment.
In a study of 240 U.S. corporations, over 90% of
companies reported having access to some forecast-
ing software (Sanders and Manrodt 2003a). However,
only 29% of companies primarily used quantitative
forecasting methods, 30% mainly used judgmental
methods, and the remaining 41% applied both quanti-
tative and judgmental methods (Sanders and Manrodt
2003b). Although quantitative methods may provide
the basis for a forecast, it is common practice to mod-
ify computer-generated forecasts based on human
judgment (Fildes et al. 2009).
Several recent studies have examined decision
biases in a variety of operations contexts (cf. Bendoly
et al. 2006) and documented behavioral anoma-
lies related to demand forecasting even when fore-
casting was theoretically irrelevant to the task.
Schweitzer and Cachon (2000) investigated newsven-
dor inventory decision making under stationary and
known demand distributions; a key finding of their
study is that average order quantities are biased
toward mean demand relative to the expected profit-
maximizing order quantity. This biased ordering has
been attributed to randomness in decision making
(Su 2008), as well as to systematic biases like mean
demand anchoring and demand chasing (Schweitzer
and Cachon 2000). In a more complex beer game setting,
Croson and Donohue (2003) observed the bullwhip
effect with participants who faced a known
and stationary demand distribution. Croson et al.
(2005) observed this effect even with constant and
deterministic demand. To mitigate such suboptimal
behavior, Schweitzer and Cachon (2000, p. 419) high-
lighted the importance of separating the forecasting
task from the inventory decision task: "While the forecasting
task typically requires managerial judgment,
the task of converting a forecast into an order quantity
can be automated. A firm may reduce decision
bias by asking managers to generate forecasts that
are then automatically converted into order quantities."
In other words, inventory decisions can be
decomposed by estimating the probability distribu-
tion of future demand (a more judgmental forecasting
task), selecting a service level, and then using both to
determine an order quantity (a more automated task).
Therefore, it is important to consider human judg-
ment in forecasting to attenuate errors in higher-order
decisions such as purchasing, inventory, and capacity.
There is extensive literature on human judgment in
time-series forecasting (Lawrence et al. 2006). Central
results include the widespread use of heuristics such
Copyright: INFORMS holds copyright to this Articles in Advance version, which is made available to subscribers. The file may not be posted on any other website, including the author's site. Please send any questions regarding this policy to permissions@informs.org.
Published online ahead of print July 15, 2011
as anchoring and adjustment, as well as the influence
of feedback and task decomposition on forecasting
performance. However, findings remain inconclusive,
in part because forecasting behavior appears to be
sensitive to different components of the time series.
Furthermore, the judgmental forecasting literature is
typically concerned with the detection of patterns in
a time series, such as trends or seasonality (Harvey
2007). In contrast, our research focuses on reactions
to unpredictable change in the level of a time series:
How do individuals create time-series forecasts in
unstable environments? We study this question in a
laboratory setting where forecasters face time series
generated by a random walk with noise (Muth 1960),
a demand process that provides an intuitive map-
ping between a simple normative forecasting bench-
mark and the structural parameters describing the
demand environment. We show that time-series fore-
casting behavior is described by an error-response
model across a wide range of conditions. However,
forecasters tend to overreact to forecast errors in more
stable environments and underreact to forecast errors
in less stable environments. This pattern is consis-
tent with the system-neglect hypothesis (Massey and
Wu 2005), which posits that forecasters place too
much weight on recent signals relative to the envi-
ronment that produces these signals. Surprisingly, we
find that forecasting performance relative to the nor-
mative benchmark is poorer in stable environments
compared to less stable environments.
This paper proceeds as follows. The next section
outlines the academic literature that relates to our
research. Section 3 discusses our theoretical developments,
and §4 discusses the results of our study. We
discuss our results and conclude this paper in §5.
2. Related Literature
Existing research on judgmental forecasting provides
vast but somewhat inconclusive empirical evidence
regarding forecasting performance, cognitive pro-
cesses, and managerial interventions. Many studies
have been devoted to comparing the performance of
human forecasts to quantitative forecasting methods,
but the empirical evidence is not consistent (Lawrence
et al. 1985, Carbone and Gorr 1985, Sanders 1992,
Fildes et al. 2009). The literature has investigated a
variety of cognitive processes underlying the evolu-
tion of judgmental forecasts, such as variations of the
anchoring-and-adjustment heuristic (Harvey 2007).
Regarding managerial interventions, judgmental fore-
cast accuracy can improve with performance feed-
back (e.g., Stone and Opel 2000) and task property
feedback (e.g., Sanders 1997), but the effectiveness of
these levers depends on specic contextual elements
of the forecasting task (Lawrence et al. 2006). Existing
research on judgmental time-series forecasting pre-
dominantly examines pattern detection, that is, how
well human subjects can identify trends and seasonal-
ity in noisy time series (Andreassen and Kraus 1990,
Lawrence and O'Connor 1992, Bolger and Harvey
1993, Lawrence and O'Connor 1995). In contrast, our
research focuses on change detection, that is, how sub-
jects separate random noise from persistent change in
the level of a time series.
When observing demand variation in a time series,
a forecaster needs to identify whether there is sub-
stantive (and persistent) cause for this variation or
whether variation is noise and has no implications
for future observations. The ability to distinguish sub-
stantive change from random variation has been stud-
ied extensively in the literature on regime change
detection (Barry and Pitz 1979). A central conclusion
from regime change research is that decision mak-
ers underreact to change in environments that are
unstable and have precise signals and overreact in
environments that are stable and have noisy signals
(Griffin and Tversky 1992). This seemingly contra-
dictory reaction pattern is reconciled by the system-
neglect hypothesis (Massey and Wu 2005), which
posits that individuals overweigh signals relative to
the underlying system that generates the signals.
A related stream of research in financial eco-
nomics explains the pattern of short-term underreac-
tion and long-term overreaction to information that is
often observed in stock market investment decisions
(Poteshman 2001). Some theoretical work has been
devoted to explaining this behavioral pattern, linking
such behavior to the gambler's fallacy or the hot-
hand effect (Barberis et al. 1998, Rabin 2002, Rabin
and Vayanos 2010). In an asset pricing context, Brav
and Heaton (2002) illustrate how an over- or under-
reaction pattern arises from biased information pro-
cessing by investors, subject to the representativeness
heuristic (Kahneman and Tversky 1972) and conser-
vatism (Edwards 1968), and demonstrate how this
pattern can also arise from a fully Bayesian investor
who lacks structural knowledge about the possible
instability of the time series. Experimental tests of this
mixed-reaction pattern include those by Bloomfield
and Hales (2002) and Asparouhova et al. (2009).
A central difference between our research and exist-
ing research on change detection is the complexity
of the judgment environment. In Massey and Wu
(2005), participants faced binary signals (red or blue
balls) that were generated from one of two regimes
(draws from two urns with fixed proportions of red
and blue balls in each). Given a sequence of signals,
their experimental task was to identify when a regime
change (a switch from one urn to the other) had
occurred. Furthermore, because subjects had perfect
knowledge of the system parameters (the proportion
of blue balls in either urn), there was no ambiguity
concerning the relevant world. This environment
fits a binary forecasting task in which a well-known
phenomenon needs to be detected, such as when
a bull market turns into a bear market. Similarly,
in Bloomfield and Hales (2002) and Asparouhova
et al. (2009), participants faced fairly simple series of
signals generated from a symmetric binary random
walk. Brav and Heaton (2002) illustrated their the-
oretical considerations in an environment in which
a series of independently and identically distributed
assets exhibited a single structural break that shifted
the asset distribution only once during the time series.
A central question of our research is whether the
overreaction/underreaction patterns observed in such
fairly simple settings translate to the relatively richer
environment of time-series demand forecasting.
3. Theory
In the time-series judgment task we examine, a fore-
caster needs to decide whether an observed variation
in the time-series data provides a reason to modify a
previous forecast for the next period. Figure 1 illus-
trates this judgment task.
If the forecaster interprets the variation as ran-
dom noise, she can ignore it and uphold her fore-
cast as the long-run average. If she believes that
the variation represents a change in the underly-
ing level of the time series, recent demand observa-
tions contain more information about the future than
past observations do. Therefore, recent observations
should receive more weight in her revised forecast.
Finally, if she believes that this variation is indicative
of a trend, she would project the variation to not only
shift the level once, but also to continue doing so in
future periods.
Figure 1 Challenge of Time-Series Analysis
[Schematic: demand plotted over time (past, present, and future); an observed variation away from the long-run average may indicate noise, a change in level, or a trend.]

In practice, these options are not mutually exclusive.
A forecaster may decide that variation is partially
due to noise and partially due to a level change.
She may also believe that the variation represents
both a level change and a trend. The key challenge we
examine is whether she can successfully differentiate
level changes from noise. Although our empirical
analysis controls for the possibility that forecasters
include illusory trends, our simulated demand envi-
ronment does not contain any underlying trends;
a comprehensive discussion of trend detection is
beyond the scope of this paper.
3.1. Demand Environment
Forecasts are made after observing demand D_t in period t, with no additional information on future demand beyond that which is contained in the time series D^t = {D_t, D_{t-1}, D_{t-2}, ...}. The demand process follows

    D_t = μ_t + ε_t,        (1a)
    μ_t = μ_{t-1} + δ_t,    (1b)

where ε_t ~ N(0, n²) and δ_t ~ N(0, c²) are independent random variables. The time series thus contains two kinds of random components: temporary shocks (through ε_t) and permanent shocks (through δ_t). The standard deviation c captures the notion of change in the true (but unobserved) level μ_t, i.e., permanent shocks to the time series that persist in subsequent periods. The standard deviation n captures the noise surrounding the level, i.e., temporary shocks to the time series that last only for a single period.
This demand model has several appealing proper-
ties. By varying the two parameters (c and n), the
model provides a simple way to describe a fairly
wide range of different environments, ranging from
rather stable to highly unstable processes. In its lim-
its, the demand model produces pure random walks
for n =0, while yielding stationary white noise pro-
cesses for c = 0. Figure 2 illustrates how the shape
of a representative time series depends on the two
parameters. The demand model defined in Equations
(1a) and (1b) is descriptively accurate for many real-
world processes (see Makridakis and Hibon 2000,
Gardner 2006, and the references therein), often used
to describe nonstationary demand in supply chain
settings (e.g., Graves 1999), and discussed in prac-
tically every operations management textbook (e.g.,
Nahmias 2008, Chopra and Meindl 2009). Impor-
tant for our empirical investigation of judgmen-
tal forecasts and their performance, this demand
model also provides a direct and intuitive mapping
between a simple normative forecasting benchmark
and the structural parameters describing the demand
environment.
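For concreteness, the demand process in Equations (1a) and (1b) can be simulated in a few lines. This is a minimal sketch assuming nothing beyond those two equations; the function name and default parameter values are ours:

```python
import random

def simulate_demand(T, mu0=500.0, c=10.0, n=40.0, seed=0):
    """Random walk with noise: D_t = mu_t + eps_t, mu_t = mu_{t-1} + delta_t,
    where eps_t ~ N(0, n^2) is a temporary shock and
    delta_t ~ N(0, c^2) is a permanent shock to the level."""
    rng = random.Random(seed)
    mu, demand = mu0, []
    for _ in range(T):
        mu += rng.gauss(0.0, c)                # permanent level change
        demand.append(mu + rng.gauss(0.0, n))  # observed demand = level + noise
    return demand

stable = simulate_demand(50, c=0.0, n=10.0)    # stationary white noise around 500
unstable = simulate_demand(50, c=40.0, n=0.0)  # pure random walk
```

Setting c = 0 yields the stationary limiting case and n = 0 the pure random walk, matching the limits discussed above.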
3.2. Normative Benchmark
From Equation (1a), the best forecast F_{t+1} (made in period t for period t + 1) for our demand environment is the true level μ_t, but the true level is obscured
by noise and must be estimated.

Figure 2 Example Demand Time Series for Different c and n (μ_0 = 500)
[Six panels, each plotting demand over time: Condition 1 (c = 0, n = 10); Condition 2 (c = 0, n = 40); Condition 3 (c = 10, n = 10); Condition 4 (c = 10, n = 40); Condition 5 (c = 40, n = 10); Condition 6 (c = 40, n = 40).]

The challenge is
to optimally determine the extent to which a variation in the demand signal D_t is evidence for a permanent change in the level rather than a random, transient shock. When D_t follows the demand process in Equations (1a) and (1b), the optimal forecasting mechanism is the familiar single exponential smoothing method (Muth 1960). A forecast F_{t+1} is a weighted average of the most recent demand observation and the previous forecast,

    F_{t+1} = αD_t + (1 − α)F_t = F_t + α(D_t − F_t).    (2)
The latter part of Equation (2) highlights an important observation: A forecast is a function of an observed forecast error (D_t − F_t) and the weight α that is placed on this error. In conceptual terms, the optimal forecast revision (from F_t to F_{t+1}) in light of new evidence D_t is a function of the strength of this evidence relative to its weight (Griffin and Tversky 1992). The strength of evidence is provided by the forecast error (D_t − F_t) itself: all else equal, a forecast should be more strongly revised the larger the forecast error. The weight of evidence depends on the parameters of the system that generates the demand signals, i.e., the change (c) and noise (n) parameters governing the time series. In fact, the weight of evidence in our context can be precisely captured by the change-to-noise ratio W = c²/n², which itself is directly related to the optimal smoothing parameter α*. All else equal, a forecast should be revised according to the change-to-noise ratio: With low values of W (variations in demand are mostly noise), forecast errors should be mostly discarded and should not have much influence on the new forecast, whereas with high values of W (variations in demand mostly represent level changes), the forecast error should have greater influence on a forecast. This intuition can be formalized as follows (Harrison 1967, McNamara and Houston 1987):¹

    α*(W) = 2 / (1 + √(1 + 4/W)).    (3)

Thus, the optimal smoothing constant α* depends only on the change-to-noise ratio W, whereas the demand time series is driven by absolute levels of c and n. For example, condition 3 (c = 10 and n = 10) and condition 6 (c = 40 and n = 40) have the same W, implying the same α*(W). Combining the optimal structure of the forecasting mechanism in Equation (2) with its optimal parameter in Equation (3) yields the optimal forecast for our demand environment,

    F_{t+1} = F_t + α*(W)(D_t − F_t).    (4)
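To make Equations (2)-(4) concrete, here is a minimal sketch (the helper names are ours) that computes the optimal smoothing constant from the change-to-noise ratio and applies one error-response update:

```python
import math

def alpha_star(W):
    """Optimal smoothing constant, Equation (3):
    alpha*(W) = 2 / (1 + sqrt(1 + 4/W)); taken as 0 when W = 0."""
    return 0.0 if W == 0 else 2.0 / (1.0 + math.sqrt(1.0 + 4.0 / W))

def next_forecast(f_t, d_t, alpha):
    """Equation (2) in error-response form: revise the previous
    forecast by the fraction alpha of the observed forecast error."""
    return f_t + alpha * (d_t - f_t)

# Conditions 3 (c = 10, n = 10) and 6 (c = 40, n = 40) share W = 1,
# hence the same alpha*(W) of about 0.62:
a = alpha_star(1.0)
f_next = next_forecast(500.0, 540.0, a)  # moves ~62% of the way toward 540
```

For W = 1/16 the same formula gives about 0.22, and for W = 16 about 0.94, the values reported for the experimental conditions in Table 1.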
3.3. System Neglect in Forecasting Behavior
Besides being the normative forecasting benchmark
for the demand process described in Equations (1a)
and (1b), single exponential smoothing can be viewed
as a plausible model for describing human forecast-
ing behavior. Practically, exponential smoothing cor-
responds to the mental process of error detection and
subsequent adaptation,² i.e., trial-and-error learning,
where a forecaster observes an error and then adjusts
her next forecast based on that error. Furthermore,
exponential smoothing has two important character-
istics as a boundedly rational decision heuristic: First,
it does not require much memory because the most
recent forecast and demand contain all the informa-
tion necessary to make the next forecast. Second, it
is a robust heuristic in many different environments
beyond the particular one used in our study (e.g.,
Gardner 1985, 2006). These are compelling behavioral
reasons to assume that forecasters follow the error-
response logic of exponential smoothing. The cru-
cial question is how their behavioral error-response
¹ McNamara and Houston (1987) derived Equation (3) using Bayesian principles. Harrison (1967) derived the same expression (although articulated differently) as the argument that minimizes the variance of forecast errors.
² Error detection and adaptation are also fundamental principles of cybernetics (Wiener 1948) and the foundation for closed-loop theories of learning (Adams 1968). There is neurological evidence that the human brain supports such a process (Gehring et al. 1993).
parameter α(W) compares to the optimal α*(W) in Equation (3).
We assume that decision makers update their forecasts in the light of new evidence and that they incorporate both the strength of this evidence, i.e., the magnitude of the observed variation in the demand signal, and its statistical weight (W = c²/n²). However, we hypothesize that they suboptimally incorporate signal strength and weight. Specifically, we suggest that decision makers attribute too much weight to forecast errors at the expense of the system parameters (c, n) that produce these errors. The primary reason for this pattern is that demand signals and the associated forecast errors are highly salient, whereas the system parameters are not. In most instances, the system parameters c and n are unknown or even unknowable. Even if a decision maker knows the exact system parameters, those parameters are likely to remain latent in the background relative to the signals they produce. Massey and Wu (2005) integrated these ideas into their system-neglect hypothesis. Because the weight W is less a determinant of actual behavior than Equation (3) implies for the optimal benchmark, we expect the behavioral α(W) to be less responsive to W than α*(W):

    dα(W)/dW < dα*(W)/dW.    (5)
We hypothesize the following:
Hypothesis 1 (System Neglect). Individuals show
relatively more overreaction for low values of W, and rela-
tively more underreaction for high values of W.
Figure 3 Normative vs. System-Neglect Reaction
[Reaction α plotted against weight W (from 0 to 4): the normative reaction α*(W) rises steeply and concavely; the system-neglect reaction α(W) is flatter, lying above α*(W) for low W (overreaction) and below it for high W (underreaction).]

Such behavior is illustrated in Figure 3, which compares the optimal α* as a function of W according to Equation (3) to a possible system-neglect α as a function of W. Figure 3 needs to be interpreted with some caution, because it shows only one possible pattern of system neglect (absolute overreaction for low values of W, and absolute underreaction for high values of W). System neglect makes no specific predictions about absolute levels of over- and underreaction, as pointed out by Massey and Wu (2005). Specifically, system neglect does not predict the location of α(0). One could observe overreaction for all values of W if α(0) is very high, and underreaction for all values of W if α(0) = 0.
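As one way to see the hypothesized pattern, the sketch below contrasts the normative α*(W) with a hypothetical, deliberately flat behavioral response. The intercept and slope are illustrative values we chose to reproduce the qualitative shape described for Figure 3; they are not estimates from any data:

```python
import math

def alpha_star(W):
    # normative reaction, Equation (3)
    return 0.0 if W == 0 else 2.0 / (1.0 + math.sqrt(1.0 + 4.0 / W))

def alpha_neglect(W, a0=0.35, slope=0.02):
    # hypothetical system-neglect reaction: starts high at W = 0 and
    # responds only weakly to W (a0 and slope are illustrative only)
    return min(1.0, a0 + slope * W)

for W in (1.0 / 16.0, 1.0, 16.0):
    kind = "over" if alpha_neglect(W) > alpha_star(W) else "under"
    print(f"W = {W:g}: {kind}reaction relative to alpha*")
```

With these illustrative parameters, the flat response exceeds α* at low W (overreaction) and falls short of it at high W (underreaction), the single-crossing pattern the hypothesis allows but does not require.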
4. Experiment
4.1. Design and Implementation
4.1.1. Experimental Design. Subjects made se-
quential forecasts based on consecutive time-series
demand signals generated from a random walk with
noise in a controlled laboratory environment. Subjects
were told they were forecasting demand at a retail
store. Subjects began with 30 periods of demand his-
tory and were asked to make a point forecast in each
of 50 successive periods. Throughout the experiment,
an on-screen graph was updated to include demand
up to the current period. In addition, a table pro-
vided information on all past demand observations,
previous forecasts, absolute forecast errors, and rela-
tive forecast errors.
The theoretical developments in the previous sec-
tion posit that human forecasters react to forecast
errors and that their reaction pattern systematically
depends on the forecasting environment. To test
Hypothesis 1 (system neglect), we designed experi-
mental conditions along the two parameters of our
forecasting environment, c and n. First, we varied the
degree of change by letting c equal 0, 10, or 40. Sec-
ond, we varied the degree of noise by letting n equal
10 or 40. These variations resulted in six experimen-
tal conditions representing different demand environ-
ments that range from no change/low noise (c = 0,
n = 10) to high change/high noise (c = 40, n = 40).
A summary of these conditions is shown in Table 1.
Unlike stationary processes (c = 0), environments
characterized by a significant degree of change over
time are likely to produce distinct demand evolutions.
To address consistency between demand data and the
data-generating system, we generated four demand
data sets from each of the six environments. Each
Table 1 Overview of Experimental Conditions

                 n = 10                   n = 40
c = 0    Condition 1: 0 (0)       Condition 2: 0 (0)
c = 10   Condition 3: 1 (0.62)    Condition 4: 1/16 (0.22)
c = 40   Condition 5: 16 (0.94)   Condition 6: 1 (0.62)

Note. Cell entries are the change-to-noise ratio W (and α*(W) in parentheses).
data set within a condition started with the same his-
tory of 30 observations, which we provided to subjects
prior to their first forecast and throughout the
experiment. Data in each time series represented units
of demand in each period, and we implemented the
resulting 6 × 4 = 24 data sets in a between-subject
design. The example time-series data sets depicted in
Figure 2 are data sets from the experiment.
The forecasting task was implemented in the exper-
imental software zTree (Fischbacher 2007). To incen-
tivize accurate forecasting, we paid each subject
$10 multiplied by the subject's accuracy across the
T = 50 periods. Forecasting accuracy was defined as
(1 − MAPE), where MAPE is the mean absolute percentage
error calculated based on the entire history of
forecasts and demand observations. In addition, each
subject was paid a participation fee of $5. Payoffs
were rounded up to the next full dollar value, and the
average payoff was $14.80. Sessions lasted between 30
and 45 minutes.
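The incentive rule can be sketched as follows; we assume the absolute percentage error is computed against realized demand, a detail the text does not spell out:

```python
import math

def session_payoff(forecasts, demands, base=10.0, fee=5.0):
    """$10 x (1 - MAPE) plus the $5 participation fee,
    rounded up to the next full dollar, as described above."""
    apes = [abs(d - f) / d for f, d in zip(forecasts, demands)]
    mape = sum(apes) / len(apes)
    return math.ceil(base * (1.0 - mape) + fee)

# Perfect forecasts pay $15; a 10% average error pays $14.
```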
The study was conducted at a behavioral lab in a
large public university in the American Midwest. The
252 participants belonged to a subject pool associated
with the business school and registered for the study
in response to an online posting. About half of the
subjects were current undergraduate students from
various fields, and the other half were either graduate
students or staff at the university. Of the 24 data sets,
23 data sets have at least 10 subjects, and the remain-
ing data set has 8 subjects.
Let F_{it} denote the forecast for period t made by subject i in period t − 1, after observing demand D_{t-1}. We use vector notation for the history of forecasts and demands up to period t, F_i^t = {F_{it}, F_{it-1}, F_{it-2}, ...} and D^t = {D_t, D_{t-1}, D_{t-2}, ...}. To correct for errors and outliers, we examined all individual forecasts F_{it} with an absolute forecast error |D_t − F_{it}| > 300. In a few cases, typographical errors were obvious, and the forecasts were corrected. If the intended forecast could not be determined but the response appeared to be a typographical error (e.g., one forecast of 20 between a long series of forecasts between 700 and 900), that forecast was recorded as missing. However, such corrections were rare (<0.1% of all forecasts). These deletions left us with a sample of 11,833 forecasts for our study.
Prior to completing the study, we also performed a
pretest (261 subjects) at a different university located
in the American Northeast; details about the pretest
are given in Appendix A.
4.1.2. Rationality, Parameter Knowledge, and Beliefs About the Forecasting Environment. Testing our research hypothesis essentially reduces to comparing the normative benchmark α*(W) and observed behavior α(W) across different environments. Both values of interest, α*(W) and α(W), imply certain assumptions about the rationality of the forecaster and her beliefs about the demand environment. Next, we address the resulting empirical challenges.

Participants in our study had access to the history of demand observations before making their first forecast, shown throughout the experiment in both the graph and the history table. We did not give our participants any information beyond the data contained in the time series, i.e., they were not informed that data were generated by a random walk with noise, nor were they provided with the actual parameters c and n in their condition. Furthermore, subjects were not told that the data were constructed without trends or seasonality. In this light, our normative benchmark α*(W) appears restrictive, because it imposes three critical assumptions about the forecaster's degree of rationality and knowledge. Specifically, the optimal forecasting mechanism in Equation (4) implies that the forecaster has correct beliefs about the structure of the demand process in Equation (1), understands the structure of the optimal forecasting mechanism in Equation (2), and has access to an unbiased estimate of α*(W) from Equation (3). One could argue that comparing behavior to the optimal smoothing parameter α*(W) is unfair. Because they lack knowledge of the c and n of their experimental condition, subjects cannot calculate α*(W). Rather, the best they can do in any given period t is to estimate c and n using their existing data, i.e., the history of demand D^t, and then translate these estimates into an appropriate smoothing parameter. Alternatively, the forecaster can estimate the optimal smoothing parameter directly, updating it in every period t given the observed demand realizations. In this light, the fair benchmark for forecasting performance is α*_t(D^t), i.e., the smoothing parameter that ex post optimizes past forecasting performance.
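A simple way to operationalize such an ex post benchmark is a grid search over smoothing constants, each evaluated on the demand history observed so far. This is only a sketch with our own helper name; the paper's exact estimation procedure is described in its appendix:

```python
def ex_post_alpha(demand, f0, grid_points=101):
    """Return the smoothing constant in [0, 1] that minimizes the mean
    absolute one-step-ahead error over the observed demand history,
    starting from the initial forecast f0."""
    def mae(alpha):
        f, total = f0, 0.0
        for d in demand:
            total += abs(d - f)
            f += alpha * (d - f)  # exponential-smoothing update, Equation (2)
        return total / len(demand)
    grid = [i / (grid_points - 1) for i in range(grid_points)]
    return min(grid, key=mae)

# A history with a single persistent level shift is tracked best by a
# high alpha; here the grid search returns 1.0:
best = ex_post_alpha([500.0] * 10 + [700.0] * 20, f0=500.0)
```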
In practice, as we show in Appendix B, α*_t(D^t) rapidly converges to α*(W), and the available 30 periods of demand history are sufficient to achieve this convergence. To empirically support this argument, we compare the forecast accuracy resulting from a rational forecaster with knowledge of c and n to the performance resulting from a rational forecaster who lacks perfect knowledge of these environmental parameters but optimizes the smoothing constant based on available demand data: In each time period t for each data set s, we estimate an optimal α*_t(D^t) and create the resulting forecast F_{s,t+1} = F_{s,t+1}(D_{st}, F_{st}; α*_t(D^t)). The mean absolute forecast errors (MAEs) resulting from such out-of-sample forecasts were very close to the normative MAEs based on α*(W) (see Appendix B for details). Overall, the analyses suggest that α*(W) is a fair benchmark for the environments of our study.
Our hypothesis further assumes that the forecaster uses single exponential smoothing, although not necessarily the unbiased α*(W), which implies that she
C
o
p
y
r
i
g
h
t
:
I
N
F
O
R
M
S
h
o
l
d
s
c
o
p
y
r
i
g
h
t
t
o
t
h
i
s
A
r
t
i
c
l
e
s
i
n
A
d
v
a
n
c
e
v
e
r
s
i
o
n
,
w
h
i
c
h
i
s
m
a
d
e
a
v
a
i
l
a
b
l
e
t
o
s
u
b
s
c
r
i
b
e
r
s
.
T
h
e

l
e
m
a
y
n
o
t
b
e
p
o
s
t
e
d
o
n
a
n
y
o
t
h
e
r
w
e
b
s
i
t
e
,
i
n
c
l
u
d
i
n
g
t
h
e
a
u
t
h
o
r

s
s
i
t
e
.
P
l
e
a
s
e
s
e
n
d
a
n
y
q
u
e
s
t
i
o
n
s
r
e
g
a
r
d
i
n
g
t
h
i
s
p
o
l
i
c
y
t
o
p
e
r
m
i
s
s
i
o
n
s
@
i
n
f
o
r
m
s
.
o
r
g
.
Kremer, Moritz, and Siemsen: Demand Forecasting Behavior
Management Science, Articles in Advance, pp. 117, 2011 INFORMS 7
has correct beliefs about the structure of the data-generating process. Because beliefs are typically unobservable, an additional empirical challenge arises because all incorrect beliefs may be inappropriately attributed to α(R). For example, although the data-generating process from Equation (1) does not incorporate a systematic trend component, the process can produce sequences of demand signals that lend themselves to the perception of trends where there are none (see the sample demand paths in Figure 1). Although an overall assessment of the data should have led to the conclusion not to expect trends (see Appendix B), a nonholistic assessment of the time series may produce the illusion of short-term trends. In such a situation, simply imposing single exponential smoothing as the model of behavior could bias our conclusions. This essentially is Brav and Heaton's (2002) argument that it is difficult to distinguish between models of irrational behavior under structural certainty (individuals use single exponential smoothing but use biased estimates of α) and models of rational behavior under structural uncertainty (lacking knowledge of the true structure of the demand environment, individuals optimally update parameters of an incorrect forecasting model and only gradually learn its true structure). To address this empirical challenge, our analysis in §4.2.2 will consider more generalized forecasting models capable of capturing behavior that goes beyond single exponential smoothing.
4.2. Results
4.2.1. Initial Analyses. Let F̄_t = (1/I) Σ_i F_it denote the average forecast across all individuals within a given condition. The optimal forecast for period t is given as F*_t = F_t(D_{t−1}, α*(R)). Through its dependence on the smoothing constant α*(R) and the demand realizations D_{t−1}, F*_t is specific to each of the six conditions (which differ by R) as well as to each of the four demand sets within a condition (which differ by the vector of demand realizations D_t).
Table 2 compares the observed mean absolute forecast error MAE(D_t, F_it) = (1/SIT) Σ_{S,I,T} |F_sit − D_st|, which is the T-period average across all subjects in all S demand data sets within a given demand environment, over all conditions. Simple t-tests (p < 0.01) confirm that the observed mean absolute error is significantly larger than the corresponding error measure based on optimal forecasts MAE(D_t, F*_t). Furthermore, a comparison across environments is consistent with our expectation that performance deteriorates when noise n and change c increase.

Table 2 Observed Forecasting Performance Measured by MAE

          n = 10           n = 40
c = 0     10.15 (7.75)     38.55 (30.74)
c = 10    16.42 (12.86)    47.36 (36.51)
c = 40    38.94 (34.34)    64.03 (53.54)

Notes. Optimal performance is in parentheses. All differences between observed and optimal MAEs are significant (p < 0.01).

3. Alternatively, the researcher may try to experimentally control what subjects know directly, e.g., by revealing the structure and parameters of the data-generating process. Although this approach seems reasonable in the context of simple coin-toss experiments, it is questionable to assume that a decision maker can efficiently exploit information on the demand process described by Equations (1a) and (1b) and its parameters.
Figure 4 illustrates the evolution of demand D_t, average observed forecasts F̄_t, and normative forecasts F*_t for one example data set in each condition (other data sets look similar). We can make a number of observations without formal analysis. The average observed forecasts (gray line) lag the evolution of demand (dots), which is consistent with exponential smoothing. This behavior is suboptimal in the stable demand environments (conditions 1 and 2), where the correct forecasts F*_t (black line) do not react at all to demand signals. Furthermore, although both
Figure 4 Example Series of Demand, Average Observed Forecast, and Normative Forecast

[Six time-series panels, one per condition: condition 1 (c = 0, n = 10), condition 2 (c = 0, n = 40), condition 3 (c = 10, n = 10), condition 4 (c = 10, n = 40), condition 5 (c = 40, n = 10), and condition 6 (c = 40, n = 40). Each panel plots demand, the average observed forecast, and the normative forecast over time.]
Figure 5 Illustration of Ranges for the Adjustment Score α_it

[The figure sketches demand over past, present, and future relative to F_{i,t−1} and D_{t−1}: α_it < 0 indicates a negative trend belief or gambler's fallacy; 0 ≤ α_it ≤ 1 is consistent with exponential smoothing; α_it > 1 indicates the subject believes in a trend.]
the observed and the optimal forecasts smooth some
of the variability in demand signals, there is more
variability in the series of observed forecasts than in
the series of optimal forecasts. This is most visible in
condition 4.
Next, we analyze observed forecast adjustments. To formalize adjustments as a response to observed forecast errors, we define the adjustment score α_it = (F_it − F_{i,t−1})/(D_{t−1} − F_{i,t−1}), which follows immediately from rearranging the single exponential smoothing formula in Equation (2). We use this ratio to categorize behavior, as illustrated conceptually in Figure 5.
A score of α_it < 0 indicates that subjects adjusted their forecast in the opposite direction of their forecast error (11% of all observations). Possible explanations of such behavior include that subjects may have incorporated a prior (illusory) trend, or they acted in accordance with the gambler's fallacy, i.e., believing that high values of a stable series balance out with low values in small samples. An adjustment score of α_it = 0 (10% of all observations) indicates no reaction. If the adjustment score falls between 0 and 1 (42% of all observations), the forecast is consistent with single exponential smoothing. Finally, any adjustment score α_it > 1 (37% of all observations) indicates that subjects were extrapolating illusionary trends into the future. This initial analysis highlights that although error-response level adjustment is the most likely response pattern, there is strong evidence that subjects also tended to adjust their forecasts outside the range of possibilities consistent with single exponential smoothing.
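The adjustment score and the behavioral categories of Figure 5 can be computed mechanically from a forecast series. The following sketch is a hypothetical illustration of this bookkeeping (the array layout and function names are our assumptions, not the study's code); undefined scores are recorded as missing, as in the paper.

```python
import numpy as np

def adjustment_scores(forecasts, demand):
    """Adjustment score alpha_it = (F_t - F_{t-1}) / (D_{t-1} - F_{t-1}).
    Scores are undefined (nan) when the previous error is zero."""
    f_prev, f_curr = forecasts[:-1], forecasts[1:]
    d_prev = demand[:-1]
    denom = d_prev - f_prev
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(denom != 0.0, (f_curr - f_prev) / denom, np.nan)

def categorize(score):
    """Map a score onto the behavioral ranges of Figure 5."""
    if np.isnan(score):
        return "undefined"
    if score < 0.0:
        return "opposite direction (trend belief or gambler's fallacy)"
    if score == 0.0:
        return "no reaction"
    if score <= 1.0:
        return "consistent with exponential smoothing"
    return "trend extrapolation"

# Example: the forecast is raised from 100 to 110 after seeing demand 120,
# i.e., the subject closed half of the observed error -> score 0.5.
scores = adjustment_scores(np.array([100.0, 110.0, 105.0]),
                           np.array([120.0, 100.0, 90.0]))
print(scores)  # [0.5 0.5]
```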
To provide a brief aggregate analysis of forecast adjustments across conditions, we calculated average adjustment scores ᾱ = (1/SIT) Σ_{S,I,T} α_it(s), as shown in Table 3. Although such average scores should be interpreted with caution, we can make several directional observations. First, the average reaction ᾱ increases in c and decreases in n. This observation is directionally in line with the normative predictions from Equation (1). Second, with the exception of condition 5, the average adjustment score differs from the normative reaction and shows evidence of overreaction.

4. By construction, this score is not defined for the first period, nor for when D_{t−1} = F_{i,t−1}. In such cases, adjustment scores were recorded as missing. This ratio has also been used as an adjustment score in newsvendor research (e.g., Schweitzer and Cachon 2000).

5. Because excessively high and low adjustments can have a strong influence on this analysis, we remove all α_it < −1 (5% of observations) and all α_it > 2 (8% of observations) in our calculations here.
4.2.2. A Generalized Model of Forecasting Behavior. The previous section highlights that actual behavior is not completely captured by single exponential smoothing. Identifying a descriptively accurate behavioral forecasting model is ultimately an empirical question. Rather than imposing single exponential smoothing, we allow the data to select a preferred model of forecasting behavior. We include two generalizations in the empirical specification of behavior: initial anchoring and illusionary trends. Initial anchoring refers to the well-documented tendency of individuals to anchor their decisions on an artificial or self-generated value (Epley and Gilovich 2001). Illusionary trends refers to the idea that individuals are quick to see trends where there are none (DeBondt 1993).
We conceptualize forecasts as containing three structural components: a level estimate l_t, a trend estimate T_t, and "trembling hands" noise ε_t, leading to a generalized structural equation for forecast F_{t+1}:

F_{t+1} = l_t + T_t + ε_t.  (6)
We include the noise term because human decision making is known to be inherently stochastic (Rustichini 2008). Note that in our experiment, we do not observe the components in Equation (6) explicitly, and we therefore replace these components with forecasts and demand observations. We specify the level term in Equation (6) as

l_t = θ_t F_t + α(D_t − F_t) + (1 − θ_t)C.  (7)
The specification in Equation (7) introduces the anchoring parameter θ_t and the constant C. Although exponential smoothing suggests that forecasters correctly and exclusively anchor on their previous forecasts, the literature on anchor-and-adjustment heuristics often includes the initial values of a time series as an additional anchor (Chapman and Johnson 2002, Baucells et al. 2011). The parameter θ_t represents a generalization that allows individuals to either anchor their forecasts only on previous forecasts (θ_t = 1), only on the initial and constant value C (θ_t = 0), or on some weighted average of these two extremes (0 < θ_t < 1).

Table 3 Average Forecast Adjustment Scores ᾱ

               n = 10                                 n = 40
          ᾱ              α*(R)                   ᾱ              α*(R)
c = 0     0.59* (0.03)    0.00    (p = 0.53)     0.56* (0.03)    0.00
          (p < 0.01)                             (p < 0.01)
c = 10    0.75* (0.03)    0.62    (p = 0.27)     0.69* (0.03)    0.22
          (p < 0.01)                             (p < 0.01)
c = 40    0.89 (0.03)     0.94    (p = 0.05)     0.79* (0.03)    0.62

Notes. The unit of analysis is the forecast. Bold entries are the average adjustment scores, with standard errors reported in parentheses. All significance tests were done using Wald tests. The p-values recorded below or in between average adjustment scores show the results from Wald tests comparing average adjustment scores between similar conditions with changes in c or n. See Table C.2 for a breakdown of sample size by condition. *The average adjustment score is significantly different from its normative value at p = 0.01.

6. Note that to compare average adjustment scores between conditions, we did not rely on simple paired t-tests because of the nested nature of our data. Instead, comparisons were made using a nested random-effects model, with observations nested in subjects, subjects nested in data sets, and data sets nested in conditions.
Forecasters should never incorporate trend estimates in our time-series demand environment, but this normative aspect was unknown to our subjects. Our data-generating process can produce random successive level increases or decreases that can easily be perceived as trends (see Figure 2), and our initial analyses in §4.2.1 suggest that forecasters are quick to see trends where they do not exist. Hence, we specify the trend term in Equation (6) as

T_t = T_{t−1} + β(l_t − l_{t−1} − T_{t−1}),  (8)

which corresponds to double exponential smoothing. Using Equation (7), we can rewrite Equation (8) as

T_t = (1 − β)T_{t−1} + β(θ_t − α)(F_t − F_{t−1}) + αβ(D_t − D_{t−1}).  (9)
Combining Equations (6), (7), and (9), rearranging terms, and allowing Δ to symbolize first differences and e_t to represent the forecast error (D_t − F_t), we can specify our generalized forecasting model as follows:

F_{t+1} = θ_t F_t + α e_t + αβ ΔD_t + β(θ_t − α) ΔF_t + (1 − θ_t)C + (1 − β)T_{t−1} + ε_t.  (10)
Equation (10) serves as the basis of our empirical estimation (see Appendix B for additional remarks on model and error specification). This model nests the normative benchmark for our context (i.e., single exponential smoothing) as a special case for θ_t = 1 and β = 0.
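As a consistency check on the algebra, the collapsed regression form in Equation (10) can be verified numerically against the component form of Equations (6)-(8). The sketch below uses arbitrary illustrative parameter values (not estimates from the study), holds θ constant across the two periods, and omits the noise term ε_t.

```python
# Arbitrary illustrative values (not estimates from the paper).
alpha, beta, theta, C = 0.6, 0.3, 0.9, 500.0
T_prev = 2.0                 # trend estimate T_{t-1}
F_prev, F = 480.0, 490.0     # forecasts F_{t-1}, F_t
D_prev, D = 495.0, 510.0     # demands   D_{t-1}, D_t

# Component form, Eqs. (6)-(8), with the noise term epsilon_t omitted:
l_prev = theta * F_prev + alpha * (D_prev - F_prev) + (1 - theta) * C  # Eq. (7) at t-1
l = theta * F + alpha * (D - F) + (1 - theta) * C                      # Eq. (7) at t
T = T_prev + beta * (l - l_prev - T_prev)                              # Eq. (8)
F_next_components = l + T                                              # Eq. (6)

# Collapsed regression form, Eq. (10):
e = D - F  # forecast error e_t
F_next_collapsed = (theta * F + alpha * e
                    + alpha * beta * (D - D_prev)
                    + beta * (theta - alpha) * (F - F_prev)
                    + (1 - theta) * C + (1 - beta) * T_prev)

assert abs(F_next_components - F_next_collapsed) < 1e-9  # both forms agree
```

With these numbers both forms give a next-period forecast of 508, confirming that the substitution behind Equation (10) is term-by-term correct.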
We conduct a hierarchical analysis to examine whether it is empirically justified to simplify Equation (10) to the normative benchmark of single exponential smoothing. To that purpose, we estimate four different models: a full model (model 4), a model without ΔF_t as the independent variable (model 3), a model without ΔF_t and ΔD_t as independent variables (model 2), and a model without ΔF_t and ΔD_t as independent variables, without a constant, and with the constraint of θ_t = 1 (model 1). For each of the six experimental conditions, we estimate the structural parameters of these four models, including random slopes and intercepts (see Equation (C3) in Appendix C), using maximum likelihood (ML) estimation.
From the last row in each condition in Table C.2, we can see that the likelihood ratio tests indicated that any simplification of model 4 leads to a significant decrease in model fit across all conditions, which confirms our observation from the previous section. Based on the model fit statistics, single exponential smoothing does not fully describe observed behavior, and the generalized model of Equation (10) is the empirically preferred model. One can also see that α, our main parameter of interest (the effect of the forecast error e_t, shown in the first row under each condition for all models in Table C.2), generally increases as we constrain the model. In other words, the estimate for α in simpler models tends to suffer from a positive bias, i.e., it tends to be higher than its true value because these simpler models do not account for additional behavioral factors (besides error response) inherent in our data. Consequently, the simple average adjustment scores reported in Table 3 inflate the true reaction α, because they do not control for additional behavioral effects. Therefore, we focus our analysis on interpreting the behavioral parameters estimated in model 4.
4.2.3. Estimation. We use the estimates for model 4 in each condition (Table C.2 in Appendix C) to calculate the behavioral parameters (i.e., α, β, θ_t, and C) of Equation (10) (see Appendix B for details). We provide an overview of these behavioral parameters in Table 4. All parameter tests reported in this section are Wald tests on specific parameters or combinations of parameters; we use likelihood ratio tests only to test null hypotheses on random effects or to compare nested models (Verbeek 2000).

7. All analyses were conducted in Stata 11 using the xtmixed procedure. In a few instances, when models would not converge using ML estimation, restricted ML estimation was used instead.

Table 4 Overview of Behavioral Model Parameter Estimates and Hypothesis Tests

         Condition 1    Condition 2    Condition 3    Condition 4    Condition 5    Condition 6
         c=0, n=10      c=0, n=40      c=10, n=10     c=10, n=40     c=40, n=10     c=40, n=40
α*       0.00           0.00           0.62           0.22           0.94           0.62
α        0.39* (0.04)   0.48* (0.05)   0.68* (0.04)   0.60* (0.04)   0.70* (0.03)   0.56* (0.04)
β        0.28* (0.09)   0.05 (0.08)    0.27* (0.09)   0.07 (0.06)    0.45* (0.10)   0.45* (0.15)
θ_t      0.70* (0.03)   0.75* (0.03)   0.99 (0.01)    0.90* (0.02)   1.00 (0.00)    0.95* (0.01)
C        501* (1.2)     507* (7.2)     —              625* (15)      —              723* (31)
α = α*   p < 0.01       p < 0.01       p = 0.13       p < 0.01       p < 0.01       p = 0.11

Notes. Standard errors are in parentheses. The optimal β is 0 in all conditions; the optimal θ_t is 0 in conditions 1 and 2, and 1 in all other conditions. Significance tests for α, β, and C test whether these parameters are different from 0. For θ_t we test θ_t = 1. The parameter C is not reported if θ_t is not different from 1. α, β, and the hypothesis tests related to α are based on the mean estimates of the respective random slopes. See Appendix C for details. See Table C.2 for an overview of sample size (forecasts and subjects) in each condition. *p < 0.01.
Clear patterns emerge from our analysis. In all conditions, the reaction parameter α is positive and significant (p < 0.01), indicating that individuals react to their most recent forecast error. We further test whether the behavioral αs are different from their normative values, as Hypothesis 1 predicts. Note that α > α* implies overreaction, and α < α* implies underreaction. We find evidence for overreaction in conditions 1, 2, and 4; effectively correct reactions that are not statistically different from the normative value in conditions 3 and 6; and evidence for underreaction in condition 5. This pattern is consistent with system neglect and confirms our research hypothesis.
To test whether our behavioral estimates for α from Table 4 adjust to changes in our experimental parameters as expected, we simultaneously reestimated model 4 across two conditions, allowing model parameters to change between conditions. This contrast estimation allows us to test whether model parameters are significantly different from each other between similar conditions. The results from this analysis are reported in Table 5. We observe that α initially increases significantly in c (c = 0 versus c = 10), with no further increase for c = 40. In addition, α decreases significantly in n, except for the stationary demand conditions with c = 0.
It is interesting to contrast the observed α in conditions 3 and 6. Because the change-to-noise ratio R is identical in these two conditions, α estimates should also be the same. However, we observe a significant difference (p < 0.01) between α estimates in these two conditions (α = 0.68 and α = 0.56, respectively). This indicates that the behavioral α does not only react to the ratio R, but may also react to the actual scale of either change or noise. Specifically, our results suggest that α may be more sensitive to increases in noise than to increases in change. However, a thorough investigation of this observation would require a more comprehensive experimental design with more conditions sharing a common R.
The full model includes behavioral parameters besides α. We view these additional parameters primarily as statistical controls for behavioral effects that may otherwise be falsely attributed to α. Although an in-depth analysis of these other parameters is beyond the scope of this paper, three additional observations are worth a brief discussion. First, estimates for the anchoring parameter θ_t are significantly less than 1 in conditions 1, 2, 4, and 6. Given the structure of our behavioral forecasting model in Equation (7), this observation suggests that forecasts in these conditions are influenced by an initial and constant anchor C (with a weight of 1 − θ_t) in addition to the anchor provided by previous forecasts. This result is directionally correct in conditions 1 and 2, where, after a fairly short series of observations, the forecaster should ignore previous forecast errors entirely

Table 5 Contrasts of α Estimates Across Demand Conditions

          n = 10        Between-condition   n = 40
          α             comparison          α
c = 0     0.39 (0.04)   (p = 0.32)          0.48 (0.05)
          (p < 0.01)                        (p < 0.10)
c = 10    0.68 (0.04)   (p < 0.10)          0.60 (0.04)
          (p = 0.66)                        (p = 0.36)
c = 40    0.70 (0.03)   (p < 0.01)          0.56 (0.04)

Notes. Bold entries are estimated values for α from model 4, Table C.2, with standard errors reported in parentheses. The p-values below and between α estimates are the results from Wald tests that compare α between similar conditions. See Table C.2 for an overview of sample size (forecasts and subjects) in each condition.
(α* = 0) and instead base forecasts on the long-run average. Observing a tendency for such anchoring in these conditions is therefore not surprising. In conditions 3 and 5 (i.e., low-noise conditions), there is no evidence for initial anchoring, whereas there is a weak but significant tendency to base forecasts partially on an initial anchor in conditions 4 and 6 (high-noise conditions). This observation implies that noise in the time series may increase the tendency to incorporate initial anchors.
Second, estimates for the trend-updating parameter β are positive and significant in conditions 1, 3, 5, and 6, indicating that participants tend to incorporate illusionary trends into their forecasts. A rational forecaster optimally learning the unknown parameters should not perceive and update trends in any of the demand conditions of our study (see Appendix B). However, although the data-generating process in Equation (1) has no trend component by construction, it can produce consecutive demand signals that may easily be misinterpreted as trends by a human forecaster. Interestingly, the occurrence of trend-like sequences of demand signals, and their perception as real trends, seems to depend on the parameters c and n. For example, β is not significant in conditions 2 and 4, suggesting that illusionary trends are less prevalent with increasing noise. To some degree, this is expected: noise appears as temporary variation in the time series, thereby reducing the false impression of persistent (trend-like) changes in the level, though the temporary nature of noise may become more difficult to detect in conditions of higher change (as in condition 6). Clearly, acting on illusionary trends can have a detrimental impact on important planning decisions in many settings. A thorough investigation of illusionary trends would require a different experimental design, including demand conditions producing actual trends.
Third, our generalized forecasting model in Equation (10) includes ΔF_t = F_t − F_{t−1} as an independent variable, which captures the effect of previous forecast adjustments on the current forecast. We observe that the effects of ΔF_t are generally negative and significant (Table C.2 in Appendix C). In other words, participants tended to decrease their forecasts if they had recently increased them, and to increase their forecasts if they had recently decreased them. A possible post hoc explanation for these negative effects may be some form of counterpoise adjustment, where a recent change in a forecast is attenuated by a subsequent forecast adjustment in the opposite direction.
4.2.4. Performance Implications. Next, we explore how the decision biases uncovered in the previous section impact forecasting performance. Using the estimations from our generalized forecasting model, we can attribute loss in forecasting performance to two classes of (mis)behavior: systematic decision biases such as misspecified error response (α ≠ α*), initial anchoring (θ_t ≠ 1), or illusionary trends (β > 0), and unsystematic "trembling hands" random errors. To separate these two sources of performance loss, we calculate the forecast performance of three types of forecast evolutions for each demand seed s: the normative, the observed, and the behaviorally predicted forecasts.
The normative forecast for period t of seed s is defined by F*_st = F_st(D_{s,t−1}, F*_{s,t−1}, α*(R)), where α*(R) is common to all demand seeds within a demand environment. The observed forecast of subject i for period t of seed s is denoted by F_sit. The predicted forecasts are defined as F̂_sit = F̂_sit(D_{s,t−1}, F̂_{si,t−1}, Θ̂_si), where Θ̂_si are the estimated parameters of our generalized forecasting model, including best linear unbiased predictions of random effects at the data set and individual levels (see Bates and Pinheiro 1998). The predicted forecasts F̂_sit are seed- and individual-specific forecasts that, unlike the observed F_sit, were filtered through the structural estimation of our generalized forecasting model. We measure performance for each of our six demand conditions as the mean absolute forecast error, averaged across all subjects i and all S seeds s within that condition. Formally, for the observed forecasts, we define MAE_O = (1/SIT) Σ_{S,I,T} |F_sit − D_st|, and make equivalent definitions for the normative (MAE_N) and predicted (MAE_P) forecasts.
Using these definitions, we describe the relative total performance loss from observed forecasts as (MAE_O − MAE_N)/MAE_N. We capture the loss in forecasting performance that is due to systematic decision biases (Loss 1) as (MAE_P − MAE_N)/MAE_N, and the loss in forecasting performance due to unsystematic random noise in decision making (Loss 2) as (MAE_O − MAE_P)/MAE_N. Table 6 provides an overview of the results.
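The decomposition can be reproduced directly from the MAE definitions; for example, plugging in the c = 0, n = 10 cell of Table 6 recovers the reported 14%/17%/31% split. The helper function below is an illustrative sketch, not code from the study.

```python
def performance_losses(mae_normative, mae_predicted, mae_observed):
    """Split the total relative performance loss into the share due to
    systematic biases (Loss 1) and unsystematic noise (Loss 2)."""
    loss1 = (mae_predicted - mae_normative) / mae_normative
    loss2 = (mae_observed - mae_predicted) / mae_normative
    return loss1, loss2, loss1 + loss2

# Condition c = 0, n = 10 from Table 6:
l1, l2, total = performance_losses(7.75, 8.86, 10.15)
print(round(l1, 2), round(l2, 2), round(total, 2))  # 0.14 0.17 0.31
```

By construction both loss shares are scaled by the same normative MAE, so Loss 1 and Loss 2 add up exactly to the total relative loss.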
As expected, all MAEs increase in c and n, but this
result has to be interpreted with caution because dif-
ferent environments produce different forecast per-
formance because of inherent changes in randomness
and complexity. Forecasting performance relative to
optimal performance (Loss (Total)) improves in less
stable environments. In general, the relative loss in
performance that is due to random decision mak-
ing (Loss 2) is at least as high as the relative loss of
performance that is due to decision biases (Loss 1).
Counter to intuition, the relative loss in performance
that is due to systematic error (Loss 1) is lower in
8. Because insufficient forecasting history means we cannot fit our generalized forecasting model for periods 30-32, and period 79 is the last period that results in an observed error, we use all forecasts made in periods 33-79.
Table 6 Mean Absolute Forecast Errors and Performance Loss

         Normative (MAE_N)   Predicted (MAE_P)   Observed (MAE_O)   Loss 1 (systematic)   Loss 2 (random)   Loss (Total)
         n=10     n=40       n=10     n=40       n=10     n=40      n=10     n=40         n=10     n=40     n=10    n=40
c = 0    7.75     30.74      8.86     34.88      10.15    38.55     14%      13%          17%      12%      31%     25%
c = 10   12.86    36.51      14.34    42.01      16.42    47.36     11%      15%          16%      15%      28%     30%
c = 40   34.34    53.54      35.41    56.82      38.94    64.03     3%       6%           10%      13%      13%     20%

Notes. Loss 1 (systematic error) = (MAE_P − MAE_N)/MAE_N; Loss 2 (random error) = (MAE_O − MAE_P)/MAE_N; Loss (Total) = (MAE_O − MAE_N)/MAE_N.
conditions with high change (c = 40) than in condi-
tions with little (c =10) or no (c =0) change. It seems
that the decision heuristics individuals use to make
forecasts work comparatively better in unstable and
changing environments and become more biased in
stable environments.
5. Conclusion
We investigate judgmental time-series forecasting in environments that can be precisely described by their stability and noise. Behavior is somewhat consistent with the mechanics of single exponential smoothing, the normative benchmark in our context. However, subjects tend to overreact to observed forecast errors in relatively stable time series and to underreact to forecast errors in less stable time series. This pattern is consistent with the system-neglect hypothesis, which posits that individuals place too much emphasis on the signals they receive relative to the system that generates the signals (Massey and Wu 2005). Our research provides empirical support for this hypothesis in a "many small changes" time-series forecasting context, which is notably different from the "few big changes" environments commonly investigated in the regime-change literature.
Our results show that decisions made in a stable
environment exhibit stronger systematic decision
biases than decisions made in less stable envi-
ronments. The decline in forecasting performance
that is due to randomness in decisions (Loss 2)
is at least as strong as the decline in forecasting
performance that is due to the systematic biases
uncovered (Loss 1). Human judgment appears to be
better adapted to detecting change in volatile environ-
ments than to exploiting information in stable envi-
ronments. A tendency to overreact to noise may be
the result of a decision heuristic geared toward the
detection of and adaptation to change. This finding
suggests that managerial judgment in forecasting is
better suited to unstable environments than to sta-
ble ones, so particular emphasis should be placed on
automating decision making in stable environments.
Additionally, because randomness in decision making
may be mitigated by multiple individuals indepen-
dently preparing forecasts and then averaging those
forecasts (Larrick and Soll 2006), large benets may be
achieved by simply averaging multiple independent
judgments for a forecast.
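The variance-reduction logic behind averaging independent judgments is easy to demonstrate in simulation. The sketch below assumes unbiased forecasters with independent errors (a simplifying assumption of ours; real judgmental errors are often correlated, which weakens the benefit).

```python
import numpy as np

rng = np.random.default_rng(42)
true_level = 500.0
n_forecasters, n_periods = 10, 2000

# Hypothetical unbiased judgmental forecasters whose errors are
# independent noise around the true demand level.
individual = true_level + rng.normal(0.0, 40.0, size=(n_periods, n_forecasters))
consensus = individual.mean(axis=1)

mae_individual = np.mean(np.abs(individual[:, 0] - true_level))
mae_consensus = np.mean(np.abs(consensus - true_level))
print(mae_consensus < mae_individual)  # True: independent noise averages out
```

Averaging ten such forecasters shrinks the error standard deviation by a factor of √10, which is why the consensus MAE is roughly a third of an individual's.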
The system-neglect framework lends itself to the
design of managerial interventions. If decision bias
in forecasting is due to the salience of forecast errors
compared to the latent demand process that gen-
erated these errors, forecasting performance should
improve if either the salience of forecast errors is
reduced or if the demand process is reemphasized
before making a decision. Future research should
address these two avenues in more detail.
Our results relate to the growing literature on
behavioral operations management. For example,
experimental studies of simple newsvendor settings
have documented a persistent tendency to chase
demand in stationary environments (Schweitzer and
Cachon 2000, Bolton and Katok 2008, Kremer et al.
2010). Our study suggests that this tendency may
be a forecasting phenomenon and not exclusively
related to inventory ordering. While subjects in
most newsvendor studies are given full knowledge
about the underlying demand-generating system, the
system-neglect hypothesis suggests that the signals
and feedback they observe will encourage partial
neglect of that knowledge. Therefore, decomposing
such inventory decisions into their forecasting and
ordering components may be a fruitful and impor-
tant endeavor. Further, newsvendor studies often
assume that using a stationary and known demand
environment makes the forecasting task simpler, but
our results suggest that stable environments lead to
more biased decision making. If subjects neglect their
knowledge of the system and change forecasts based
9. We conducted an initial experiment based on the idea of highlighting
the demand process. Specifically, we had participants generate
forecasts sequentially in the four data sets within a condition, i.e.,
subjects would prepare a forecast in data set 1, then in data set 2,
then in data set 3, etc. This design was intended as a simple manip-
ulation that might reemphasize the underlying system. We com-
pared performance to a design where participants made successive
forecasts in one data set before moving on to the next. We found no
consistent performance improvement using the alternative design,
suggesting that we were unable to reemphasize the system with
this manipulation.
Kremer, Moritz, and Siemsen: Demand Forecasting Behavior
Management Science, Articles in Advance, pp. 117, 2011 INFORMS 13
on signals, performance deteriorates more in stable
environments than in unstable demand environments.
Finally, subjects in most newsvendor and
beer-game studies are confronted with demand stimuli
in quick succession, a context that provides particu-
larly salient demand signals. Our study suggests that
decision makers may perform better when the relative
salience of recent demand signals is mitigated, such
as by reemphasizing the environment before making
the next decision.
Our study has several limitations. Although our
analyses explicitly controlled for initial anchoring and
illusionary trends, our study was not designed to
analyze these behaviors in detail. Future research
should further explore these behavioral phenomena
and explicitly capture predictable changes in the level,
such as real trends and seasonality. Further, our
forecasting context assumes that forecasters have no
quantitative forecasting support available, other than
a graph and a history table. However, in practice,
many forecasts are judgmental adjustments based on
a quantitative forecasting method (Fildes et al. 2009).
Future research could explicitly address the impact
that quantitative decision support has on human
judgment.
In general, our understanding of human judgment
in nonstationary environments is limited, partially
because analyzing such contexts is not trivial. Our
research suggests a method of formally capturing
a persistent judgment bias and its relationship to
parameters describing a nonstationary environment.
Our results can provide a solid theoretical and empir-
ical basis for future research on how to design
information and incentive systems that are resistant
to the kinds of judgment biases we observed. For
example, this framework lends itself to the study
of real-world forecasting processes in more complex
organizational and functional environments, such as
incentive conicts that frequently arise in sales fore-
casting processes at the interface between marketing
and operations (e.g., Oliva and Watson 2009).
Additionally, the implications of our study may be
relevant to many elds beyond operations manage-
ment. For example, our framework may be useful for
the study of overreaction and illusionary trends in
stock markets, for examining how medical doctors
interpret longitudinal data of their patients, or per-
haps as a window for understanding human reactions
to climate change. In general, our research points out
that when faced with a time series, decision mak-
ers discount distant information and place stronger
weight on recent information, a strategy consistent
with adaptation to changing environments rather
than information exploitation in stable environments.
We hope that further studies will examine this phe-
nomenon in broader business and societal contexts,
and study its implications for performance, welfare,
and policy.
Acknowledgments
The authors gratefully acknowledge research support by
the Carlson School of Management Dean's Research Grant
and by a grant from the Smeal College of Business. They are
also thankful for the many helpful comments made by par-
ticipants at the annual Behavioral Operations conference, as
well as seminars at the Carlson School of Management and
the Darden School. They are also grateful for the construc-
tive feedback provided by an anonymous associate editor
and three anonymous reviewers.
Appendix A. Pretest Information
Prior to the study presented in this paper, we completed
a pretest of our experiment. The task, experimental param-
eters, software, and functionality were very similar to the
baseline study reported here, with two exceptions: First,
participants in the pretest made decisions for only 40 con-
secutive periods, whereas the data presented here are based
on 50 periods. Second, the students in the pretest were
given extra course credit for participating and were entered
into a drawing for one cash reward per section. We con-
ducted the same statistical tests on our pretest data and
found results that are directionally identical to the ones
reported here. The pretest was predominantly used to deter-
mine whether subjects should receive a graph of the time
series, and whether providing qualitative information on
the demand series (product with stable/unstable demand)
influenced performance. The final design (subjects received
a graph but no qualitative information) corresponds to the
setting in the pretest in which subjects had the best performance.
Appendix B. One-Step-Ahead
Exponential Smoothing
The optimal α is unknown and unknowable for subjects. The best they can do is estimate an optimal α, given the data that they have. This fact has implications for forecasting performance that we should consider when creating a normative benchmark. Subjects can only estimate an optimal α at any point of time, given the data they have until then, and use this estimate to predict the future. In this section, we briefly examine how such a one-step-ahead (OSA) procedure differs from the normative benchmark we employ in our study.
A first decision is which optimality criterion to use when estimating the optimal OSA alpha. We use the MAPE as an optimality criterion, in addition to a maximum likelihood approach (also referred to as SES) (Hyndman et al. 2002). We also use a maximum likelihood procedure that allows for simultaneous parameter estimation in the context of double exponential smoothing (DES) (Andrawis and Atiya 2009). Because we have 24 data sets, each with 50 forecasts, using three methods, this resulted in 24 × 50 × 3 = 3,600 optimizations to find all optimal OSA alphas.
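The grid-search variant of this procedure can be sketched as follows (an illustrative reimplementation, not the authors' code; the demand history is invented, and only the MAPE criterion is shown):

```python
def ses_errors(demand, alpha):
    """One-step-ahead errors of single exponential smoothing over a history."""
    forecast = demand[0]          # initialize with the first observation
    errors = []
    for d in demand[1:]:
        errors.append(d - forecast)
        forecast = alpha * d + (1 - alpha) * forecast  # smoothing update
    return errors, forecast       # forecast is now the prediction for t+1

def osa_alpha(demand, grid=101):
    """Grid-search the alpha minimizing in-sample one-step-ahead MAPE."""
    best_mape, best_alpha = float("inf"), 0.0
    for i in range(grid):
        alpha = i / (grid - 1)
        errors, _ = ses_errors(demand, alpha)
        mape = sum(abs(e) / d for e, d in zip(errors, demand[1:])) / len(errors)
        if mape < best_mape:
            best_mape, best_alpha = mape, alpha
    return best_alpha

# Invented history with a level shift after period 4.
history = [500, 512, 493, 505, 640, 655, 648, 660, 652, 645]
alpha = osa_alpha(history)
_, next_forecast = ses_errors(history, alpha)
print(alpha, round(next_forecast, 1))
```

On this invented history, the level shift pushes the estimated alpha toward 1, mirroring how the OSA alphas track high normative values in unstable conditions.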
All methods produce alphas close to 0 for all data sets in conditions 1 and 2. The OSA benchmark is therefore no different from our normative benchmark in these conditions. Furthermore, the double exponential smoothing method never produces an optimal beta that is far from 0, indicating that our data indeed show little evidence for trends. For the remaining conditions, alphas are generally close to the normative benchmark, and, with the possible exception of condition 6, show little evidence of being closer to the normative benchmark as more data are revealed throughout the experiment than just the initial demand history of 30 periods. We summarize the average alphas over all data sets in a condition, early and late in the experiment, in Table B.1.

Table B.1 OSA Alphas
              Condition 3    Condition 4    Condition 5    Condition 6
Period        30     80      30     80      30     80      30     80
Normative     0.62   0.62    0.22   0.22    0.94   0.94    0.62   0.62
OSA (MAPE)    0.65   0.66    0.21   0.22    0.90   0.85    1.00   0.78
OSA (SES)     0.69   0.67    0.24   0.24    0.90   0.88    0.86   0.73
OSA (DES)     0.65   0.67    NA     NA      0.90   0.87    0.77   0.71

Furthermore, we test whether the overall MAEs using any of these out-of-sample procedures are different from the MAEs using our normative benchmark. We calculate the absolute error for each observation in each condition using each method and then average these absolute errors over all observations within each condition to get an overall MAE for each condition and method. Results from this analysis are summarized in Table B.2.

Table B.2 Mean Absolute Errors Using OSA Alphas
Condition    Normative    OSA (MAPE)    OSA (SES)    OSA (DES)
3            13.11        13.01         13.02        13.00
4            36.59        36.64         36.66        NA
5            33.92        33.45         33.58        33.50
6            53.95        54.77         53.36        53.06
Avg.         33.66        33.74         33.32        33.19
As can be seen in Table B.2, forecasting performance employing the different OSA methods and using our normative approach is quite similar. This is a result of the OSA estimates being fairly close to the normative α, and of the objective function defined by the absolute forecast errors using single exponential smoothing being relatively smooth around the optimum. Therefore, it matters little whether we use the normative benchmark or any of the out-of-sample methods. Nevertheless, we replicate our performance comparison from Table 6 with the OSA (SES) errors in Table C.1 below.
Table C.1 Performance Comparison Using OSA Procedure
         Predicted       Observed                        Loss 1                 Loss 2                 Loss (Total)
         (MAE_A)         (MAE_F)        (MAE_0)          (MAE_F − MAE_A)/MAE_A  (MAE_0 − MAE_F)/MAE_A  (MAE_0 − MAE_A)/MAE_A
         σ=10   σ=40     σ=10   σ=40    σ=10    σ=40     σ=10    σ=40           σ=10    σ=40           σ=10    σ=40
c=0      7.75   30.74    8.86   34.88   10.15   38.55    14%     13%            17%     12%            31%     25%
c=10     12.81  36.59    14.34  42.01   16.42   47.36    12%     15%            16%     15%            28%     29%
c=40     34.00  52.67    35.41  56.82   38.94   64.03    4%      8%             10%     14%            15%     22%
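The relative flatness of the absolute-error objective around its optimum can be checked numerically (a sketch on simulated stationary demand with an invented level and noise scale, not the study's data):

```python
import random

random.seed(1)
# Stationary demand: constant level 500 plus i.i.d. noise, so small alphas fit best.
demand = [500 + random.gauss(0, 40) for _ in range(200)]

def mae_of_alpha(alpha):
    """Mean absolute one-step-ahead error of exponential smoothing at a given alpha."""
    forecast, abs_errors = demand[0], []
    for d in demand[1:]:
        abs_errors.append(abs(d - forecast))
        forecast = alpha * d + (1 - alpha) * forecast
    return sum(abs_errors) / len(abs_errors)

for alpha in (0.05, 0.1, 0.2, 0.3, 0.5):
    print(alpha, round(mae_of_alpha(alpha), 1))
# The MAE rises only gently as alpha moves away from its (small) optimum.
```

Because the objective is smooth near the optimum, modest misestimates of the smoothing constant cost little accuracy, which is why the OSA and normative benchmarks track each other so closely.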
Appendix C. Econometric Specification and Estimation Details
Equation (10) provides a basis for the behavioral model we estimate in our analysis. An empirical problem with Equation (10) is that we do not observe data on T_{t−1}, which could bias empirical results. To at least partially control for this potential bias, we propose to estimate Equation (10) with the additional independent variables ΔD_{t−1} and A_{t−1}, leading to the following empirical specification:

F_{t+1} = a_1 E_t + a_2 F_t + a_3 ΔD_t + a_4 ΔD_{t−1} + a_5 A_t + a_6 A_{t−1} + constant + ε_t.  (C1)
Finally, the following (equivalent) specification of Equation (C1) provides an easier comparison of nested models, so it serves as our primary empirical specification:

A_{t+1} = a_1 E_t + (a_2 − 1) F_t + a_3 ΔD_t + a_4 ΔD_{t−1} + a_5 A_t + a_6 A_{t−1} + constant + ε_t.  (C2)
In general, an observation at time t in the experiment is nested in subject i, who is nested in demand data set s, which is nested in experimental condition (i.e., demand environment) c. Because we estimate our model within each condition, this approach yields a three-level nested structure of error terms, such that we have random intercepts ν_s and ν_i. Furthermore, we believe that the behavioral parameters of our model vary considerably depending both on the actual data set being observed and on the individual performing the forecast. This expectation implies that a_1–a_4 should be modeled as random coefficients. However, results in our pretest show that although there was some variance over a_1 and a_3, there was little variance on the other two coefficients. Estimating random-coefficients models in which the coefficients have little variance can lead to nonconvergence and inappropriate standard errors. Therefore, we estimate only a_1 and a_3 as random slopes. This three-level random-effects model will effectively control for the dependence we have among observations in our data set.
In summary, we can write

A_{t+1,(c,s,i)} = a_1^{si} E_t + (a_2 − 1) F_t + a_3^{si} ΔD_t + a_4 ΔD_{t−1} + a_5 A_t + a_6 A_{t−1} + constant + ν_{s(c)} + ν_{i(s,c)} + ε_t.  (C3)
All random coefficients are estimated as having a normal distribution. In our results, we use μ and σ to refer to the mean and standard deviation of that distribution. For example, μ(E_t) refers to the mean of the random slope a_1, whereas σ_i(E_t) refers to the standard deviation of that slope at the individual level. The behavioral parameters of Equation (10) can then be calculated as follows: α = μ(E_t), θ_t = a_2 (the estimated coefficient on F_t plus 1), β = μ(ΔD_t)/μ(E_t), and C = constant/a_2.

Table C.2 Results from Behavioral Estimation Predicting F_{t+1}
[Estimates for Conditions 1–6, Models 1–4 each: the means and standard deviations of the fixed and random effects and the model-fit statistics are not reliably recoverable from this extraction.]
Notes. μ(x) stands for the mean of random effect x; σ_i(x) stands for the standard deviation of random effect x at the individual level (subscript s indicates the data set level). *p = 0.10; **p = 0.05; ***p = 0.01.
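To see how a specification of this adjustment form recovers behavior, the following toy example simulates a forecaster who corrects a fixed fraction of each forecast error and re-estimates that fraction by least squares (simulated data only, with invented parameters; the full model additionally includes the trend terms, lags, and nested random effects):

```python
import random

random.seed(3)
true_frac = 0.4                       # hypothetical error-response fraction
level = 500.0
forecasts, demands = [500.0], []
for _ in range(300):
    d = level + random.gauss(0, 30)   # stationary demand around the level
    demands.append(d)
    f = forecasts[-1]
    # The simulated forecaster corrects a fixed fraction of the error, plus noise.
    forecasts.append(f + true_frac * (d - f) + random.gauss(0, 5))

# Regress the adjustment A_{t+1} = F_{t+1} - F_t on the error E_t = D_t - F_t.
errors = [d - f for d, f in zip(demands, forecasts[:-1])]
adjust = [f1 - f0 for f0, f1 in zip(forecasts[:-1], forecasts[1:])]
n = len(errors)
mean_e, mean_a = sum(errors) / n, sum(adjust) / n
slope = (sum((e - mean_e) * (a - mean_a) for e, a in zip(errors, adjust))
         / sum((e - mean_e) ** 2 for e in errors))
print(round(slope, 2))  # recovers a value close to the simulated 0.4
```

The point here is only that the adjustment regression recovers the error-response that generated the data; the estimation in Table C.2 does the same with the remaining regressors and the random effects added.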
References
Adams, J. A. 1968. Response feedback and learning. Psych. Bull. 70(6) 486–504.
Andrawis, R. R., A. F. Atiya. 2009. A new Bayesian formulation for Holt's exponential smoothing. J. Forecasting 28(3) 218–234.
Andreassen, P. B., S. J. Kraus. 1990. Judgmental extrapolation and the salience of change. J. Forecasting 9(4) 347–372.
Asparouhova, E., M. Hertzel, M. Lemmon. 2009. Inference from streaks in random outcomes: Experimental evidence on beliefs in regime shifting and the law of small numbers. Management Sci. 55(11) 1766–1782.
Barberis, N., A. Shleifer, R. Vishny. 1998. A model of investor sentiment. J. Financial Econom. 49(3) 307–343.
Barry, D. M., G. F. Pitz. 1979. Detection of change in nonstationary, random sequences. Organ. Behav. Human Performance 24(1) 111–125.
Bates, D. M., J. C. Pinheiro. 1998. Computational methods for multilevel modeling. Technical Memorandum BL0112140-980226-01TM, Bell Labs, Lucent Technologies, Murray Hill, NJ.
Baucells, M., M. Weber, F. Welfens. 2011. Reference-point formation and updating. Management Sci. 57(3) 506–519.
Bendoly, E., K. Donohue, K. L. Schultz. 2006. Behavior in operations management: Assessing recent findings and revisiting old assumptions. J. Oper. Management 24(6) 737–752.
Bloomfield, R., J. Hales. 2002. Predicting the next step of a random walk: Experimental evidence of regime-shifting beliefs. J. Financial Econom. 65(3) 397–414.
Bolger, F., N. Harvey. 1993. Context-sensitive heuristics in statistical reasoning. Quart. J. Experiment. Psych. 46A(4) 779–811.
Bolton, G., E. Katok. 2008. Learning-by-doing in the newsvendor problem: A laboratory investigation of the role of experience and feedback. Manufacturing Service Oper. Management 10(3) 519–538.
Brav, A., J. B. Heaton. 2002. Competing theories of financial anomalies. Rev. Financial Stud. 15(2) 575–606.
Carbone, R., W. Gorr. 1985. Accuracy of judgmental forecasting of time series. Decision Sci. 16(2) 153–160.
Chapman, G. B., E. J. Johnson. 2002. Incorporating the irrelevant: Anchors in judgments of belief and value. T. Gilovich, D. Griffin, D. Kahneman, eds. Heuristics and Biases. Cambridge University Press, Cambridge, UK, 120–138.
Chopra, S., P. Meindl. 2009. Supply Chain Management. Prentice Hall, Upper Saddle River, NJ.
Croson, R., K. Donohue. 2003. Impact of POS data sharing on supply chain management: An experimental study. Production Oper. Management 12(1) 1–11.
Croson, R., K. Donohue, E. Katok, J. Sterman. 2005. Order stability in supply chains: Coordination risk and the role of coordination stock. Working paper, University of Texas at Dallas, Richardson.
DeBondt, W. F. M. 1993. Betting on trends: Intuitive forecasts of financial risk and return. Internat. J. Forecasting 9(3) 355–371.
Edwards, W. 1968. Conservatism in human information processing. B. Kleinmuntz, ed. Formal Representation of Human Judgment. Wiley, New York, 17–52.
Epley, N., T. Gilovich. 2001. Putting adjustment back in the anchoring and adjustment heuristic. Psych. Sci. 12(5) 391–396.
Fildes, R., P. Goodwin, M. Lawrence, K. Nikolopoulos. 2009. Effective forecasting and judgmental adjustments: An empirical evaluation and strategies for improvement in supply-chain planning. Internat. J. Forecasting 25(1) 3–23.
Fischbacher, U. 2007. z-Tree: Zurich toolbox for ready-made economic experiments. Experiment. Econom. 10(2) 171–178.
Gardner, E. S. 1985. Exponential smoothing: The state of the art. J. Forecasting 4(1) 1–28.
Gardner, E. S. 2006. Exponential smoothing: The state of the art–Part II. Internat. J. Forecasting 22(4) 637–666.
Gehring, W. J., B. Goss, M. G. H. Coles, D. E. Meyer, E. Donchin. 1993. A neural system for error detection and compensation. Psych. Sci. 4(6) 385–390.
Graves, S. C. 1999. A single-item inventory model for a nonstationary demand process. Manufacturing Service Oper. Management 1(1) 50–61.
Griffin, D., A. Tversky. 1992. The weighing of evidence and the determinants of confidence. Cognitive Psych. 24(3) 411–435.
Harrison, P. J. 1967. Exponential smoothing and short-term sales forecasting. Management Sci. 13(11) 821–842.
Harvey, N. 2007. Use of heuristics: Insights from forecasting research. Thinking Reasoning 13(1) 5–24.
Hyndman, R. J., A. B. Koehler, R. D. Snyder, S. Grose. 2002. A state space framework for automatic forecasting using exponential smoothing methods. Internat. J. Forecasting 18(3) 439–454.
Kahneman, D., A. Tversky. 1972. Subjective probability: A judgment of representativeness. Cognitive Psych. 3(3) 430–454.
Kremer, M., S. Minner, L. N. Van Wassenhove. 2010. Do random errors explain newsvendor behavior? Manufacturing Service Oper. Management 12(4) 673–681.
Larrick, R. P., J. B. Soll. 2006. Intuitions about combining opinions: Misappreciation of the averaging principle. Management Sci. 52(1) 111–127.
Lawrence, M., M. O'Connor. 1992. Exploring judgmental forecasting. Internat. J. Forecasting 8(1) 15–26.
Lawrence, M., M. O'Connor. 1995. The anchor and adjustment heuristic in time-series forecasting. J. Forecasting 14(5) 443–451.
Lawrence, M. J., R. H. Edmundson, M. J. O'Connor. 1985. An examination of the accuracy of judgmental extrapolation of time series. Internat. J. Forecasting 1(1) 25–35.
Lawrence, M., P. Goodwin, M. O'Connor, D. Önkal. 2006. Judgemental forecasting: A review of progress over the last 25 years. Internat. J. Forecasting 22(3) 493–518.
Makridakis, S., M. Hibon. 2000. The M3-competition: Results, conclusions and implications. Internat. J. Forecasting 16(4) 451–476.
Makridakis, S., S. Wheelwright, R. Hyndman. 1998. Forecasting: Methods and Applications. Wiley, New York.
Massey, C., G. Wu. 2005. Detecting regime shifts: The causes of under- and overreaction. Management Sci. 51(6) 932–947.
McNamara, J. M., A. I. Houston. 1987. Memory and the efficient use of information. J. Theoretical Biology 125(4) 385–395.
Muth, J. F. 1960. Optimal properties of exponentially weighted forecasts. J. Amer. Statist. Assoc. 55(290) 299–306.
Nahmias, S. 2008. Production and Operations Analysis. Irwin, Chicago.
Oliva, R., N. Watson. 2009. Managing functional biases in organizational forecasts: A case study of consensus forecasting in supply chain planning. Production Oper. Management 18(2) 138–151.
Poteshman, A. M. 2001. Underreaction, overreaction, and increasing misreaction to information in the options market. J. Finance 56(3) 851–876.
Rabin, M. 2002. Inference by believers in the law of small numbers. Quart. J. Econom. 117(3) 775–816.
Rabin, M., D. Vayanos. 2010. The gambler's and hot-hand fallacies: Theory and applications. Rev. Econom. Stud. 77(2) 730–778.
Rustichini, A. 2008. Neuroeconomics: Formal models of decision-making and cognitive neuroscience. P. W. Glimcher, C. Camerer, R. Poldrack, E. Fehr, eds. Neuroeconomics. Elsevier, London, 33–46.
Sanders, N. 1992. Accuracy of judgmental forecasts: A comparison. Omega 20(3) 353–364.
Sanders, N. 1997. The impact of task properties feedback on time series judgmental forecasting tasks. Omega 25(2) 135–144.
Sanders, N., K. B. Manrodt. 2003a. Forecasting software in practice: Use, satisfaction, and performance. Interfaces 33(5) 90–93.
Sanders, N., K. B. Manrodt. 2003b. The efficacy of using judgmental versus quantitative forecasting methods in practice. Omega 31(6) 511–522.
Schweitzer, M. E., G. Cachon. 2000. Decision bias in the newsvendor problem with a known demand distribution: Experimental evidence. Management Sci. 46(3) 404–420.
Stone, E. R., R. B. Opel. 2000. Training to improve calibration and discrimination: The effects of performance and environmental feedback. Organ. Behav. Human Decision Processes 83(2) 282–309.
Su, X. 2008. Bounded rationality in newsvendor models. Manufacturing Service Oper. Management 10(4) 566–589.
Verbeek, M. 2000. A Guide to Modern Econometrics. Wiley, New York.
Wiener, N. 1948. Cybernetics or Control and Communication in the Animal and the Machine. Wiley, New York.