
Transportation Research Part B 39 (2005) 641–657

www.elsevier.com/locate/trb

A semi-compensatory discrete choice model with explicit attribute thresholds of perception

Víctor Cantillo a,1, Juan de Dios Ortúzar b,*

a Department of Civil Engineering, Universidad del Norte, Km. 5 Antigua Vía a Puerto Colombia, Barranquilla, Colombia
b Department of Transport Engineering, Pontificia Universidad Católica de Chile, Casilla 306, Código 105, Santiago 22, Chile

Received 24 January 2004; received in revised form 13 August 2004; accepted 16 August 2004

Abstract

Random utility models are typically based on the assumptions that individuals exhibit compensatory
behaviour and that their choice sets are pre-specified. These assumptions may be unrealistic in many prac-
tical cases; in particular, it has been argued that thresholds may be part of choice–rejection mechanisms (i.e.
if the threshold of any attribute for a given alternative is surpassed the alternative is rejected) and, therefore,
act as explicit criteria to determine the set of feasible alternatives.
We formulate a semi-compensatory two-stage discrete choice model incorporating randomly distributed
thresholds for attribute acceptance. The formulation allows us to estimate the parameters of the threshold's
probability distribution; these parameters can be expressed as a function of the socio-economic
characteristics of the individual and the conditions under which the choice process takes place.
Applying the model to real and simulated data allowed us to conclude that if thresholds exist in the
population, the use of compensatory models such as Multinomial Logit, or even Mixed Logit, can lead to
serious errors in model estimation and, therefore, in the computation of marginal rates of substitution such as
the subjective value of time, as well as in prediction. Although the Mixed Logit model can deal with random
parameters, and thus with taste differences, we found it could not be used to accommodate
non-compensatory behaviour or to represent the existence of thresholds.
© 2004 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +56 2 354 4822.
E-mail addresses: vcantill@uninorte.edu.co (V. Cantillo), jos@ing.puc.cl (J. de Dios Ortúzar).
1 Tel.: +57 5 350 9236.

0191-2615/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.trb.2004.08.002
642 V. Cantillo, J. de Dios Ortúzar / Transportation Research Part B 39 (2005) 641–657

Keywords: Discrete choice; Thresholds; Semi-compensatory behaviour; Value of time

1. Introduction

Research on discrete choice random utility models has paid special attention recently to the
error term, trying to incorporate more flexible structures for its covariance matrix and developing
appropriate and efficient techniques for model estimation (McFadden and Train, 2000; Greene
and Hensher, 2003; Train, 2003). In parallel, social scientists have been exploring the cognitive
basis of the choice process for a long time (Payne et al., 1993). The challenge has been to formu-
late models with greater realism in the representation of how individuals actually process the
choice tasks.
Most discrete choice models have been based on the assumption that individuals choose follow-
ing compensatory behaviour 2; in fact, the usual linear-in-parameters specification of systematic
utility implies precisely that. However, people may behave in a non-compensatory manner, as in the
elimination-by-aspects 3 (EBA) process (Tversky, 1972) or the satisficing heuristic 4 (Simon, 1955).
More than 20 years ago Williams and Ortuzar (1982) verified the inconsistency of parameters esti-
mated using standard models when people behaved in non-compensatory fashion or if individual
choice sets were incorrectly specified.
In particular the EBA model postulates thresholds that set up a choice–rejection mechanism
(i.e. if the threshold of any attribute is surpassed for a given alternative, it is rejected); therefore,
thresholds can be seen as criteria to specify the set of feasible options (choice set) explicitly. In
typical discrete choice model applications the modeller assumes that choice sets can be established
deterministically. The availability of an alternative is treated as an observable binary variable de-
fined on the basis of often arbitrary criteria (i.e. a mode is available if its access time is less than a
certain duration). Thus, even though the modeller accepts that a threshold may exist, he does not
consider individual differences that could imply that for some people an alternative might still be
available (i.e. people may be prepared to walk more than the pre-defined duration criteria to
board a vehicle).
Determining the distribution of thresholds as a mechanism for the acceptance or rejection of
alternatives has practical importance in the application of transport policies: it facilitates a better
understanding of individual behaviour and reduces misspecification problems. Indeed, previous
research has considered lexicographic and semi-compensatory rules, 5 introduced thresholds
and studied the choice set generation process. Young et al. (1982) and Young and Richardson

2 Decision strategies that use tradeoffs among attributes are called compensatory strategies.
3 In the original EBA model individuals are assumed to have both a ranking of attributes and minimum acceptable
thresholds for each. The process begins with the most important attribute, the threshold of which is retrieved and all
alternatives with values over the threshold for that attribute are eliminated. The process is repeated for the remaining
attributes in order of importance until one alternative satisfies them all.
4 In the satisficing heuristic, alternatives are considered one at a time in any order and the individual may end the
search the moment he finds an acceptable alternative.
5 This could be, for example, an EBA-like strategy where after checking the thresholds for all attributes more than
one alternative remains and thus a compensatory strategy is used to choose the preferred one.

(1983), estimated discrete choice models based on the EBA framework; Kawamoto and Setti
(1992) proposed a new procedure for calibrating a semi-compensatory mode choice model. More
recently, Swait (2001) proposed a non-compensatory choice model incorporating attribute cut-offs
into the decision problem formulation and tested it as an extension to the traditional compensa-
tory utility maximization framework. On a different track, Batley and Daly (2003) established
conditions under which the Nested Logit model yields equivalent choice probabilities to the
EBA framework, a fact which had been suggested by McFadden (1981).
Some analysts have centred their attention on the choice set formation process based on the
two-stage choice paradigm and probabilistic choice set model presented by Manski (1977). Swait
and Ben Akiva (1987) are particularly important for our work as they proposed a behavioural
interpretation of the choice set generation process that is useful for structuring and specifying
models incorporating random constraints; they discussed several models in relation to their
behavioural plausibility and estimability. Ben Akiva and Boccara (1995) discussed models with
latent choice sets, by adding an explicit probabilistic representation of the various alternatives
considered by the individual; in their formulation they incorporated the effects of stochastic con-
straints or elimination criteria and the influence of attitudes and perception on the choice set gen-
eration process. Their idea was to estimate a choice set generation model using information
contained in the response to questions about the availability of each alternative. As a related
development, Morikawa (1995) proposed a method to estimate probabilistic choice set models
with a large number of alternatives. Finally, and in a different approach to the choice-set simula-
tion problem, Cascetta and Papola (2001) proposed a random utility model based on the concept
of intermediate degrees of availability/perception of alternatives, simulated through an ‘‘inclusion
function’’ introduced in the systematic utility of the model.
However, none of the above considered the importance of determining the distribution of at-
tribute thresholds in the population. Neither has anyone studied in-depth problems such as the
incidence of thresholds in the computation of marginal rates of substitution, in the incorrect spec-
ification problem associated to using compensatory models in the presence of thresholds, or in
model forecasts. 6
In this paper we formulate a hybrid semi-compensatory two-stage model loosely based on the
ideas of Williams and Ortuzar (1982), incorporating thresholds for the acceptance of attributes in
the process of discrete choice. We postulate that if thresholds exist they could be random, differ
among individuals and even be a function of socio-economic features and choice conditions. Our
formulation allows us to estimate both the taste parameters of the model and those of the threshold
probability distribution.
The rest of the paper is organized as follows: in the next section we discuss the theoretical for-
mulation of the model including hypotheses and constraints. After that, we present two empirical
analyses: one using simulated data and another using data collected as part of a stated preference
survey. In both cases we compare our method with the traditional compensatory model, including
willingness-to-pay calculations and model forecasts to various policies. Finally we present our
practical conclusions and discuss possible avenues for further research.

6 A referee pointed out that a new paper by Gilbride and Allenby (2004) signals a resurgence of interest in these topics.

2. Theoretical formulation

Manski (1977) suggested a probabilistic choice set model in two stages that can be used in the
case we are interested in:
$$P_{iq} = \sum_{C \in G} P_q(i/C)\, P_q(C/G), \qquad (1)$$

where $P_{iq}$ is the probability that individual $q$ chooses option $A_i$ ($A_i \in M$); $M$ is the master set of
alternatives; $P_q(i/C)$ is the probability that individual $q$ chooses $A_i$ given choice set $C$; and $P_q(C/G)$
is the probability that the choice set of individual $q$ is $C$. In this case $G$ is the set of all non-empty
subsets of $M$ or, what is the same, all the potential choice subsets. In general, if $M$ has $n_q$ options,
$G$ has a maximum of $2^{n_q} - 1$ elements. Therefore, the main problem of the approach is the high
degree of computational complexity associated with a large number of alternatives (more than six).
Morikawa (1995) suggests an appropriate method when M consists of too many alternatives,
whereby the set formation process is modeled by pair wise comparisons of alternatives, reducing
the computational effort.
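
The growth in the number of candidate choice sets can be checked with a short Python sketch. It simply enumerates the non-empty subsets of a master set; the function name `nonempty_subsets` and the example alternatives are illustrative assumptions, not part of the original formulation.

```python
from itertools import combinations


def nonempty_subsets(master_set):
    """Return every non-empty subset of the master set M as a tuple."""
    items = list(master_set)
    subsets = []
    for size in range(1, len(items) + 1):
        subsets.extend(combinations(items, size))
    return subsets


if __name__ == "__main__":
    # Three alternatives give 2**3 - 1 = 7 potential choice sets.
    G = nonempty_subsets(["Taxi", "Bus", "Metro"])
    print(len(G))
    # The count doubles (plus one) with each extra alternative.
    for n in (3, 6, 10, 15):
        print(n, 2 ** n - 1)
```

With six alternatives there are already 63 candidate sets per individual, which is why the sum in (1) becomes expensive for larger master sets.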

2.1. First stage: choice set formation

We will assume that the semi-compensatory choice process of individuals is characterized by
attributes considering thresholds that set up a choice or rejection mechanism. Thus, if a threshold
is surpassed by an attribute the alternative is rejected; this is also similar in spirit to the
formulation of Swait and Ben Akiva (1987). The process proceeds by eliminating alternatives sequentially.
At the end of this first stage the remaining options (if there is more than one) constitute the
choice set of the individual, who will then choose, for example, following a utility maximization
compensatory rule.
The process can imply an initial deterministic analysis to define a potential choice set nq for each
individual. For example, the option Car-driver may be available if there is a car in the household
and if the individual has a driving license. Then, we define a stochastic choice set generation stage
expressing the probability that set C is the actual choice set of the individual; this is based on the
existence of the attribute thresholds.
In certain cases it might be convenient to define each threshold in relation to an acceptable
standard for the individual. This can be done in multiple ways; for example, the threshold could
be taken as the most favorable value among those that the attribute can take for the set of poten-
tial alternatives; it could also be the value that the attribute takes for the chosen alternative, or
simply any reference value. In such cases it may be convenient to express the deviation with re-
spect to the threshold in percentage terms, because the perception of change of a given individual
might be related to the initial value of the attribute.
Alternatively, when an attribute does not vary much among individuals (e.g. the case of similar
origin-destination pairs, common to all individuals, in a mode or route choice context), we could
treat the thresholds in absolute terms; that is, check that the current value of the attribute ($CX_{kqj}$)
is not larger than a certain threshold $T_{Xkq}$ for the kth attribute of individual q. If $CX_{kqj} > T_{Xkq}$ the
individual rejects alternative $A_j$. Thus, in this behavioural model an individual will consider that
an alternative is part of her choice set if and only if all its attributes are within their respective
thresholds.
In a general perspective, let $T_q$ be the vector of thresholds for individual q. The thresholds are
distributed according to a joint density function $\Omega(\delta)$ with mean $E[T_q] = \bar{T}$ and covariance matrix
$\mathrm{Var}[T_q] = \Sigma$; note that we can allow the means to depend on socio-economic and context data.
By dividing each element $\sigma_{ij}$ of $\Sigma$ by $\sigma_i \sigma_j$, we obtain the correlation matrix $R$. The probability
$P_{q,j \in C}$ that each attribute of alternative $A_j$ is within its respective threshold for individual q,
so that the alternative is included in the choice set, is equal to the probability that
$CX_{kqj} \le T_{Xkq}\ \forall k$ or, in matrix notation, $P\{CX_{qj} \le T_q\}$. This can be written as:

$$P_{q,j \in C} = P\{CX_{qj} \le T_q\} = \int_{CX_{kjq}}^{\infty} \cdots \int_{CX_{2jq}}^{\infty} \int_{CX_{1jq}}^{\infty} \Omega(\delta)\, d\delta_1\, d\delta_2 \cdots d\delta_k. \qquad (2)$$

Obviously, the probability that alternative $A_j$ does not belong to choice set $C$ is
$P_{q,j \notin C} = 1 - P_{q,j \in C}$.
In addition, the probability that the choice set is empty is equal to the probability
that no alternative belongs to it, that is: 7

$$P_{q,C=\emptyset} = \prod_{j=1}^{n_q} P_{q,j \notin C} \qquad (3)$$

and as a result:

$$P_q(C/G) = \frac{1}{1 - P_{q,C=\emptyset}} \prod_{j=1}^{n_q} \left\{ P_{q,j \in C}^{\delta_{jC}}\; P_{q,j \notin C}^{(1-\delta_{jC})} \right\}, \qquad (4)$$

where $\delta_{jC}$ is one if alternative $A_j$ is an element of choice set $C$, and zero otherwise.
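
Expressions (3) and (4) can be illustrated with a minimal Python sketch. It assumes, as in footnote 7, that the inclusion events of different alternatives are uncorrelated; the function name `choice_set_probabilities` and the inclusion probabilities used in the example are hypothetical.

```python
from itertools import combinations


def choice_set_probabilities(p_in):
    """Return P_q(C/G) for every non-empty choice set C.

    p_in maps each alternative to the probability that it passes all its
    thresholds. Keys of the result are frozensets of alternatives.
    """
    alts = list(p_in)
    # Eq. (3): probability that no alternative survives the first stage.
    p_empty = 1.0
    for alt in alts:
        p_empty *= 1.0 - p_in[alt]
    probs = {}
    for size in range(1, len(alts) + 1):
        for C in combinations(alts, size):
            p = 1.0
            for alt in alts:
                # delta_jC = 1 selects P(in), delta_jC = 0 selects P(out).
                p *= p_in[alt] if alt in C else 1.0 - p_in[alt]
            # Eq. (4): renormalise over the non-empty sets.
            probs[frozenset(C)] = p / (1.0 - p_empty)
    return probs


if __name__ == "__main__":
    example = {"Taxi": 0.5, "Bus": 0.9, "Metro": 0.8}  # hypothetical values
    for C, p in choice_set_probabilities(example).items():
        print(sorted(C), round(p, 4))
```

The division by $1 - P_{q,C=\emptyset}$ makes the probabilities of the non-empty sets add up to one, so an individual is always assumed to end up with at least one feasible option.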
We need to propose an appropriate joint density function for the vector of thresholds (e.g. a
multivariate Normal distribution, MVN). The problem is that this implies multiple integrals,
which must be approximated numerically.
The model formulated by Swait and Ben Akiva (1987) assumed that thresholds were independ-
ent among attributes; in this simpler case, the probability that Aj is included in the choice set of
individual q is given by:
$$P_{q,j \in C} = \prod_{k=1}^{K_j} P_{kjq}, \qquad (5)$$

where $P_{kjq}$ is the probability that the kth attribute of alternative $A_j$ is within the threshold for
individual q, and $K_j$ is the number of attributes of option $A_j$ for which the existence of a threshold
will be evaluated. As an illustration, if $T_{Xkq}$ is distributed according to a left-truncated Normal
distribution with mean $\bar{T}_{Xk}$ and variance $\sigma^2_{Xk}$, with the point of truncation at zero (as attributes
are generally positive), the probability that the kth attribute of alternative $A_j$ is within the
threshold for individual q is equal to:

7 Assuming no correlation between each pair of probabilities $P_{q,l \notin C}$ and $P_{q,m \notin C}$, $l \neq m$.
$$P_{kjq} = \frac{1 - \Phi\!\left( \dfrac{CX_{kjq} - \bar{T}_{Xk}}{\sigma_{Xk}} \right)}{1 - \Phi\!\left( \dfrac{-\bar{T}_{Xk}}{\sigma_{Xk}} \right)}, \qquad (6)$$

where $\Phi(\cdot)$ is the standard Normal cumulative distribution function.
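
As a numerical illustration of (6), the following Python sketch evaluates the probability that an attribute value lies within a threshold drawn from a Normal distribution left-truncated at zero. The function names are illustrative, and the example uses the Cost threshold target values reported later for the simulated data (mean 40, standard deviation 5); attribute values are assumed to be non-negative.

```python
import math


def std_normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_within_threshold(x, mean, sd):
    """Eq. (6): P(threshold >= x) for a Normal(mean, sd) truncated below at 0."""
    numerator = 1.0 - std_normal_cdf((x - mean) / sd)
    denominator = 1.0 - std_normal_cdf(-mean / sd)
    return numerator / denominator


if __name__ == "__main__":
    for cost in (30, 40, 45, 50):
        print(cost, round(prob_within_threshold(cost, mean=40, sd=5), 4))
```

When the mean is many standard deviations above zero the denominator is practically one, so the truncation only matters for thresholds whose distribution places appreciable mass near zero.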
If the population is relatively homogenous the means of the distributions may be constant; nev-
ertheless, it is possible to consider variations among individuals by expressing the means as func-
tions of the individual features and choice conditions. For example, a linear in the parameters
expression might be used:
$$\bar{T}_{Xkq} = \tau + \rho\, Y_q, \qquad (7)$$

where $Y_q$ is a vector including socio-economic characteristics of the individual (e.g. sex, income,
car ownership) and choice conditions (e.g. purpose and schedule of the trip, if it is a frequent or
occasional trip, time restrictions, etc.), and $\tau$ and $\rho$ are parameters. In practice it may be
convenient to express the variables in $Y_q$ as binary 0–1 dummies.

2.2. Second stage: compensatory choice process

The compensatory choice second stage is based on random utility theory. We express the utility
of option Aj for individual q as:
$$U_{jq}(X_{jq}, \alpha_{jq}) = V_{jq}(X_{jq}, \alpha_{jq}) + \varepsilon_{jq}, \qquad (8)$$

where $\alpha$ is a vector of parameters, $X$ is a vector of attributes and individual socio-economic features
and $\varepsilon_{jq}$ a random component, the specification of which defines the type of discrete choice model to
be used (Ortuzar and Willumsen, 2001). $V_{jq}$ is the observable component of utility that can, as
usual, be considered linear in the parameters:

$$V_{jq} = \alpha_{jq} X_{jq}. \qquad (9)$$

The probability of choosing Ai given choice set C is:


$$P_q(i/C) = P\{\varepsilon_{jq} \le V_{iq} + \varepsilon_{iq} - V_{jq},\ \forall A_j \in C\}. \qquad (10)$$

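If $f(\varepsilon)$ is the probability distribution of the errors, then:

$$P_q(i/C) = \int_{R_C} f(\varepsilon)\, d\varepsilon, \quad \text{where } R_C = \{\varepsilon_{jq} \le \varepsilon_{iq} + V_{iq} - V_{jq},\ \forall A_j \in C\}. \qquad (11)$$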

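As a consequence, it is possible to express (1) as:

$$P_{iq} = \frac{1}{1 - P_{q,C=\emptyset}} \sum_{C \in G} \left[ \prod_{j=1}^{n_q} \left\{ P_{q,j \in C}^{\delta_{jC}}\; P_{q,j \notin C}^{(1-\delta_{jC})} \right\} \int_{R_C} f(\varepsilon)\, d\varepsilon \right]. \qquad (12)$$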

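In particular, if the residuals $\varepsilon$ are distributed IID Gumbel, then:

$$P_{iq} = \frac{1}{1 - P_{q,C=\emptyset}} \sum_{C \in G} \left[ \prod_{j=1}^{n_q} \left\{ P_{q,j \in C}^{\delta_{jC}}\; P_{q,j \notin C}^{(1-\delta_{jC})} \right\} \frac{\exp(V_{iq})}{\sum_{A_j \in C} \exp(V_{jq})} \right]. \qquad (13)$$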
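
The two stages can be put together in a compact Python sketch of the Gumbel case (13). It is only an illustration under stated assumptions: inclusion probabilities are taken as given inputs, the function name `choice_probabilities` and the numbers in the example are hypothetical, and an alternative is given zero choice probability in any set that does not contain it.

```python
import math
from itertools import combinations


def choice_probabilities(V, p_in):
    """Semi-compensatory choice probabilities with IID Gumbel errors.

    V maps each alternative to its systematic utility V_jq; p_in maps each
    alternative to the probability that it passes all its thresholds.
    """
    alts = list(V)
    p_empty = math.prod(1.0 - p_in[a] for a in alts)
    P = {a: 0.0 for a in alts}
    for size in range(1, len(alts) + 1):
        for C in combinations(alts, size):
            # First stage: probability that C is the choice set.
            p_C = math.prod(p_in[a] if a in C else 1.0 - p_in[a] for a in alts)
            # Second stage: Multinomial Logit restricted to the members of C.
            denom = sum(math.exp(V[a]) for a in C)
            for i in C:
                P[i] += p_C * math.exp(V[i]) / denom
    return {a: P[a] / (1.0 - p_empty) for a in alts}


if __name__ == "__main__":
    V = {"Taxi": -1.0, "Bus": -0.5, "Metro": 0.0}       # hypothetical utilities
    p_in = {"Taxi": 0.3, "Bus": 0.8, "Metro": 0.9}      # hypothetical values
    print(choice_probabilities(V, p_in))
```

Setting every inclusion probability to one leaves only the full choice set with positive probability, and the expression collapses to the ordinary Multinomial Logit.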

2.3. Model estimation

To estimate the model we can resort to the maximum likelihood method. Stated preference
(SP), revealed preference (RP) or mixed SP/RP information would be required for this purpose. If
we assume a more complex specification of the error term, simulated maximum likelihood can be
used, and if we use mixed SP/RP information it is necessary to use an appropriate scale factor such
that the variances of both data sets are equalised.
The log-likelihood function is:

$$l(\alpha, \varphi) = \sum_{q=1}^{Q} \sum_{j=1}^{n_q} g_{jq} \ln P_{jq}, \qquad (14)$$

where $g_{jq}$ is one if $A_j$ was chosen by individual q and zero otherwise; we denote as $\varphi$ the set of
parameters of the threshold distribution, and $P_{jq}$ is given by (12). It is important to note that
the parameters $\varphi$ are identifiable. To estimate the model we wrote a special code using the Maxlik
library in GAUSS 8 (Aptech Systems, 1994).
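
For readers without access to GAUSS, the structure of (14) can be sketched in a few lines of Python. This is not the authors' code: the function name `log_likelihood` and the data layout (one pair of chosen alternative and model probabilities per individual) are assumptions made for illustration.

```python
import math


def log_likelihood(observations):
    """Eq. (14): sum of ln P_jq over the alternative chosen by each individual.

    observations is a list of (chosen, probs) pairs, where probs maps every
    available alternative to its model probability P_jq. The indicator g_jq
    is implicit: only the chosen alternative contributes to the sum.
    """
    return sum(math.log(probs[chosen]) for chosen, probs in observations)


if __name__ == "__main__":
    sample = [
        ("Bus", {"Bus": 0.5, "Taxi": 0.5}),     # hypothetical individual 1
        ("Taxi", {"Bus": 0.75, "Taxi": 0.25}),  # hypothetical individual 2
    ]
    print(log_likelihood(sample))
```

In an estimation routine the probabilities would be recomputed from the taste and threshold parameters at every iteration, and a numerical optimiser would search for the parameter values that maximise this sum.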
The traditional discrete choice model is a particular case of the two-stage model above, in
which the means for each threshold are large enough and their variances small (this would
actually imply the non-existence of thresholds and therefore no alternative would be rejected at the first
stage). For this reason the common t-test for the threshold parameters (means and variances) does
not seem appropriate, since the null hypothesis that their values are zero is not interesting. The
hypothesis to test is whether the threshold means are big enough, and their variances small enough,
so that all observations are accepted. Given that it is difficult to determine which values to propose
for the null hypothesis, we propose the likelihood ratio (LR) test as a better alternative (Ortuzar
and Willumsen, 2001):

$$LR = -2\{l(\theta_r) - l(\theta)\}, \qquad (15)$$

where $l(\theta_r)$ is the log-likelihood at convergence for a restricted version of the model while $l(\theta)$ is
the log-likelihood of model (12). The statistic is asymptotically distributed $\chi^2$ with r degrees of
freedom; in this case r is the number of linear restrictions needed to pass from the general model (12)
to the traditional random utility model with no thresholds. So, if we consider $n_q$ variables with
threshold effects, and assume that each has two parameters (mean and variance), then r would
be equal to $2n_q$ when comparing (12) against the restricted model without thresholds.
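
A worked example of the LR test may help. The Python sketch below uses the log-likelihood values reported for the simulated data in Table 2, under the assumption that they are negative numbers (-7523.0 for the MNL and -6944.9 for the proposed model) and that r = 6 because three thresholds with two parameters each are restricted. The function names are illustrative, and the closed-form upper-tail formula applies to even degrees of freedom only.

```python
import math


def likelihood_ratio(ll_restricted, ll_general):
    """Eq. (15): LR = -2 {l(theta_r) - l(theta)}."""
    return -2.0 * (ll_restricted - ll_general)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variate with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k      # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    lr = likelihood_ratio(-7523.0, -6944.9)   # signs are an assumption
    print(round(lr, 1), chi2_sf_even_df(lr, 6))
```

Under these assumptions the statistic is far above the 5% critical value for six degrees of freedom (about 12.59), so the restricted model without thresholds would be rejected.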

3. Empirical analysis

The model proposed above was applied to three data sets: the first was a simulated database, the
second an SP survey and the third an RP survey. We will only report the first two here as the third
did not produce extra evidence. Also, since the emphasis was not on the error term of the utility
functions, we assumed for simplicity that $\varepsilon$ was distributed IID Gumbel.

8 The optimization algorithms used were BFGS, BHHH and Newton–Raphson, or a combination of these.

3.1. Application to simulated data

To examine the performance of the proposed model for a population where thresholds do exist,
we followed the Williams and Ortuzar (1982) tradition generating a simulated data bank with
three hypothetical alternatives: Taxi, Bus and Metro and included three attributes: Cost, Travel
time and Access time. To generate the data we used Normally distributed attributes as shown in
Table 1. Initially, 10,000 independent observations were generated assuming the population
behaved as follows:

1. Individuals compare each attribute with its threshold in turn; if the threshold is exceeded the
alternative under consideration is rejected.
2. If two or more alternatives are accepted after the first stage, the individual chooses among them
according to a compensatory utility maximizing process.
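
The two behavioural steps above can be sketched as a small Python simulation. It is a hedged illustration rather than the original data generator: the attribute and threshold distributions come from Tables 1 and 2, but the negative signs of the taste parameters, the clipping of attribute draws at zero and the redrawing of individuals whose choice set is empty are assumptions of this sketch, as are all function and variable names.

```python
import math
import random

# (mean, sd) of Cost, Travel time and Access time, taken from Table 1.
ATTRIBUTES = {
    "Taxi":  [(45, 10), (18, 5), (8, 4)],
    "Bus":   [(20, 3), (30, 10), (10, 5)],
    "Metro": [(15, 1), (10, 3), (15, 5)],
}
# (mean, sd) of the Cost, Travel time and Access time thresholds (Table 2 targets).
THRESHOLDS = [(40, 5), (45, 8), (28, 4)]
# Target taste parameters from Table 2; the negative signs are an assumption.
TASTES = [-0.005, -0.08, -0.16]


def gumbel(rng):
    """Draw a standard Gumbel error by inverting its distribution function."""
    u = rng.random()
    while u == 0.0:          # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_choice(rng):
    """Simulate one individual; return (choice, attributes, thresholds)."""
    while True:
        # Attribute draws are clipped at zero (an assumption of this sketch).
        x = {alt: [max(0.0, rng.gauss(m, s)) for m, s in spec]
             for alt, spec in ATTRIBUTES.items()}
        t = [rng.gauss(m, s) for m, s in THRESHOLDS]
        # Step 1: keep alternatives whose attributes are all within thresholds.
        feasible = [alt for alt in x
                    if all(xk <= tk for xk, tk in zip(x[alt], t))]
        if feasible:         # assumption: redraw when the choice set is empty
            break
    # Step 2: compensatory utility maximisation among the survivors.
    utility = {alt: sum(b * xk for b, xk in zip(TASTES, x[alt])) + gumbel(rng)
               for alt in feasible}
    return max(utility, key=utility.get), x, t


if __name__ == "__main__":
    rng = random.Random(2004)
    counts = {alt: 0 for alt in ATTRIBUTES}
    for _ in range(10000):
        counts[simulate_choice(rng)[0]] += 1
    print(counts)
```

Because rejected alternatives never enter the utility comparison, a purely compensatory model fitted to such data has to explain the rejections through its taste parameters, which is the source of the bias discussed in the text.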

Thresholds were also generated for each attribute following independent Normal distributions
(that is, without correlation, or R = I) with means and standard deviations as in Table 2. The
“true” (target) parameters of the utility function are also presented in Table 2 together with
the results of estimating three types of models. The first is a classical Multinomial Logit
(MNL) model fully consistent with compensatory behaviour; the second is the proposed model,
which includes thresholds for the attributes, and the last is a Mixed Logit (ML) model allowing
for Normally distributed random taste parameters. Two t-tests appear in parentheses: the first
corresponds to the traditional null hypothesis $\theta_k = 0$, and the second refers to the null hypothesis
$\theta_k = \theta_v$, where $\theta_v$ are the target values.
The t-tests in Table 2 allow us to conclude that the MNL and ML parameters for Cost and
Travel time are significantly different from their targets at the 95% level. It is important to note
that the specification of random parameters in the ML model does not compensate
for the existence of thresholds. In fact, it is evident that the MNL and ML models are equivalent
in this case and therefore it appears that the existence of thresholds simply cannot be modelled by
a compensatory choice model. On the other hand, contrasting the parameters estimated for the
proposed model against the targets we note an acceptable recovery; also, the values of the means
and standard deviations of the estimated thresholds are very close to the targets. The same
conclusion can be derived for the parameters of the utility function, and although there is a larger
difference in the case of the Cost parameter, for all attributes the null hypothesis $\theta_k = \theta_v$ is
not rejected at the 95% level.

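Table 1
Parameters used for attribute generation

| Attribute        | Statistic          | Taxi | Bus | Metro |
|------------------|--------------------|------|-----|-------|
| Cost (C)         | Mean               | 45   | 20  | 15    |
|                  | Standard deviation | 10   | 3   | 1     |
| Travel time (TT) | Mean               | 18   | 30  | 10    |
|                  | Standard deviation | 5    | 10  | 3     |
| Access time (AT) | Mean               | 8    | 10  | 15    |
|                  | Standard deviation | 4    | 5   | 5     |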

Table 2
Models with simulated database

| Parameter | Value range | Parameter target | MNL | Proposed model | ML |
|---|---|---|---|---|---|
| Cost | 12.0–82.2 | 0.005 | 0.0808 (53.1; 49.8) | 0.0074 (1.81; 0.6) | 0.0808 (53.1; 49.8) |
| Travel time | 4.0–67.7 | 0.08 | 0.0714 (45.1; 5.4) | 0.0805 (22.8; 0.1) | 0.0714 (45.1; 5.4) |
| Access time | 0.5–36.7 | 0.16 | 0.1618 (44.7; 0.5) | 0.1647 (31.4; 0.9) | 0.1619 (44.7; 0.5) |
| Threshold cost (mean) | | 40 | | 40.2 (118.4; 0.6) | |
| Threshold cost (standard deviation) | | 5 | | 5.3 (24.1; 1.4) | |
| Threshold travel time (mean) | | 45 | | 45.8 (26.7; 0.5) | |
| Threshold travel time (standard deviation) | | 8 | | 10.5 (6.2; 1.5) | |
| Threshold access time (mean) | | 28 | | 27.9 (32.2; 0.1) | |
| Threshold access time (standard deviation) | | 4 | | 3.0 (4.2; 1.4) | |
| Standard deviation of cost parameter (σC) | | | | | 0.00 (0.01) |
| Standard deviation of travel time (σTT) | | | | | 0.0003 (0.07) |
| Standard deviation of access time (σAT) | | | | | 0.0002 (0.03) |
| Sample size | | 10,000 | 10,000 | 10,000 | 10,000 |
| Log likelihood | | | 7523.0 | 6944.9 | 7523.0 |

Summarizing, the results show that the assumption of compensatory behaviour may lead to serious
errors. This is also demonstrated by the subjective value of time (SVT) results shown in Table 3
(which are totally different from the “real” values in the case of the MNL and ML models). In fact,
not only are the estimates from the compensatory models biased, but the targets lie outside their
confidence intervals (Armstrong et al., 2001). Meanwhile, the proposed model estimates have the
targets well within their confidence intervals. 9
Furthermore, to investigate the practical consequences of adopting models with and without
thresholds when there is evidence that they exist, we tested the performance of the MNL and
the proposed model in terms of their predictive capabilities. We used scenarios representing ‘‘pol-
icy changes’’ which ranged from slightly to quite different from the base data used for estimation.
This entailed changing attribute values for the options and re-executing the choice simulation pro-
cedure. We generated a series of simulated future scenarios which could be compared with the
models' predictions. Following Munizaga et al. (2000) we tested six policies, P1 to P6. Table 4
shows the specific changes to the Cost (C) and Travel time (TT) attributes associated with each

9 The estimation of SVT is only valid as long as the attributes Time and Cost are within their respective thresholds
(i.e. acceptable values for the individual). If an increment in travel time makes the attribute surpass the threshold, the
willingness-to-pay of the individual would tend to infinity. Also, if a cost increase makes this attribute surpass its
threshold, the willingness-to-pay would fall to zero.

Table 3
SVT and confidence intervals for simulated data

| SVT | Target | MNL point estimate | MNL confidence interval | Proposed model point estimate | Proposed model confidence interval | Mixed logit point estimate | Mixed logit confidence interval |
|---|---|---|---|---|---|---|---|
| Travel time | 16.0 | 0.88 | 0.84–0.92 | 10.9 | 5.48–120.0 | 0.88 | 0.84–0.92 |
| Access time | 32.0 | 2.00 | 1.94–2.06 | 22.3 | 11.7–234.9 | 2.00 | 1.94–2.06 |
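
The point estimates in Table 3 can be recovered from the coefficients in Table 2 if the SVT is computed as the ratio of the time coefficient to the cost coefficient, which is the usual definition for a linear-in-parameters utility. The Python sketch below makes that calculation explicit; the function name is illustrative and the coefficient magnitudes are those printed in Table 2.

```python
def subjective_value_of_time(time_coef, cost_coef):
    """SVT as the marginal rate of substitution between time and cost."""
    return time_coef / cost_coef


if __name__ == "__main__":
    # (time coefficient, cost coefficient) magnitudes from Table 2.
    cases = {
        "Travel time, target":   (0.08, 0.005),
        "Travel time, MNL":      (0.0714, 0.0808),
        "Travel time, proposed": (0.0805, 0.0074),
        "Access time, target":   (0.16, 0.005),
        "Access time, MNL":      (0.1618, 0.0808),
        "Access time, proposed": (0.1647, 0.0074),
    }
    for label, (time_coef, cost_coef) in cases.items():
        print(label, round(subjective_value_of_time(time_coef, cost_coef), 2))
```

The MNL ratio for travel time is below one while the target is 16, which shows how strongly the overestimated cost coefficient of the compensatory model distorts the willingness-to-pay measure.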

Table 4
Policy changes: percentage change in attribute values
Policy Cost taxi Cost bus Cost metro Travel time bus Travel time metro
P1 20
P2 +25
P3 15
P4 50
P5 +100
P6 50 +50 50 +50

policy. Policies P1 to P3 correspond to small changes; in contrast, P4 to P6 represent strong
policy changes.
The error measure considered was the percentage difference between observed or “true”
behaviour (i.e. that simulated for the modified attribute values) and that estimated with both models.
The minimum response error that can be considered a prediction error is given by the standard
deviation of the simulated observations generated with different seeds, as it reflects the inherent
variability of the simulation process. We performed 20 repetitions of the data generation process
using different seeds for the pseudo-random numbers and found that the largest value of the
coefficient of variation was 2.3%. Given these results we decided to take 3% as a reasonable threshold.
Therefore, any discrepancy exceeding this value was considered an estimation error.
To test goodness of fit we used a chi-squared test given by:

$$\chi^2 = \sum_i \frac{(\hat{N}_i - N_i)^2}{N_i}, \qquad (16)$$

where $\hat{N}_i$ is the model estimate of the number of individuals choosing alternative $A_i$, and $N_i$ is the
actual (simulated) number. The result should be contrasted with the critical $\chi^2$ value at the 5%
level with two degrees of freedom (i.e. 5.99).
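
Both error measures are easy to reproduce. The Python sketch below applies (16) and the percentage response error to the policy P1 figures reported in Table 5 (targets 1451, 3602 and 4947; MNL forecasts 1241, 3359 and 5399; proposed model forecasts 1455, 3610 and 4935). The function names are illustrative, and errors are reported as magnitudes.

```python
def chi_squared(predicted, observed):
    """Eq. (16): sum over alternatives of (N_hat - N)^2 / N."""
    return sum((p - o) ** 2 / o for p, o in zip(predicted, observed))


def percentage_errors(predicted, observed):
    """Absolute percentage difference between forecast and simulated counts."""
    return [100.0 * abs(p - o) / o for p, o in zip(predicted, observed)]


if __name__ == "__main__":
    target = [1451, 3602, 4947]      # Taxi, Bus, Metro under policy P1
    mnl = [1241, 3359, 5399]
    proposed = [1455, 3610, 4935]
    print(round(chi_squared(mnl, target), 1), percentage_errors(mnl, target))
    print(round(chi_squared(proposed, target), 1),
          percentage_errors(proposed, target))
```

The MNL statistic comes out at roughly 88, far above the critical value of 5.99, whereas the proposed model yields a value of about 0.1, in line with the figures in Table 5.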
Table 5 shows that the proposed model yields far superior results, with all response errors
within the threshold of 3%. Meanwhile, the MNL yields response errors larger than this value
in all cases. A similar analysis using the $\chi^2$ index shows that the errors for the proposed model
are not statistically significant at the 5% level for any policy; conversely, the MNL errors are
significant in all cases. These results allow us to conclude that if there are attribute thresholds for
accepting alternatives, a misspecified compensatory model might lead to significant response
errors even in the case of relatively mild policy changes.

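Table 5
Comparison of simulated and model forecasts

| Policy | Row | Target Taxi | Target Bus | Target Metro | MNL Taxi | MNL Bus | MNL Metro | MNL χ² | Proposed Taxi | Proposed Bus | Proposed Metro | Proposed χ² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| P1 | Individuals | 1451 | 3602 | 4947 | 1241 | 3359 | 5399 | 88.0 | 1455 | 3610 | 4935 | 0.1 |
| | Error | | | | 14.5% | 6.7% | 9.1% | | 0.3% | 0.2% | 0.2% | |
| P2 | Individuals | 1600 | 2753 | 5647 | 1589 | 3063 | 5348 | 50.7 | 1596 | 2770 | 5634 | 0.1 |
| | Error | | | | 0.7% | 11.2% | 5.3% | | 0.2% | 0.6% | 0.2% | |
| P3 | Individuals | 2764 | 2339 | 4897 | 1954 | 2503 | 5543 | 334.1 | 2728 | 2374 | 4897 | 1.0 |
| | Error | | | | 29.3% | 7.0% | 13.2% | | 1.3% | 1.5% | 0.0% | |
| P4 | Individuals | 1230 | 4963 | 3807 | 1002 | 4547 | 4451 | 186.3 | 1215 | 5020 | 3765 | 1.3 |
| | Error | | | | 18.5% | 8.4% | 16.9% | | 1.2% | 1.1% | 1.1% | |
| P5 | Individuals | 1658 | 2973 | 5369 | 2226 | 4173 | 3601 | 1260.7 | 1665 | 3001 | 5334 | 0.5 |
| | Error | | | | 34.3% | 40.3% | 32.9% | | 0.4% | 0.9% | 0.6% | |
| P6 | Individuals | 1339 | 5703 | 2958 | 870 | 7378 | 1752 | 1148.3 | 1300 | 5815 | 2885 | 5.2 |
| | Error | | | | 35.1% | 29.4% | 40.8% | | 2.9% | 2.0% | 2.5% | |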

To evaluate the effect of sample size on model estimation we generated another simulated
database with only 1000 draws. Although the estimated parameters were very similar to those
obtained for the sample of 10,000 observations, the target recovery was less successful than in the
experiment with the larger sample. Notwithstanding, the proposed model had again the best fit
with a significant improvement in log-likelihood. This confirms the effect of sample size in estima-
tion when dealing with complex models incorporating many attributes (Williams and Ortuzar,
1982; Munizaga et al., 2000).
Finally, to test the effect of correlated thresholds we generated a database with correlation
between the thresholds for Travel time and Access time, while the attribute Cost was assumed to be
independent. We considered correlated attributes distributed according to a bivariate Normal
distribution. The results are shown in Table 6.
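
One standard way to draw a correlated pair from a bivariate Normal distribution is to combine two independent standard Normal draws. The Python sketch below applies it to the Travel time and Access time thresholds using the target values listed in Table 6 (means 45 and 28, standard deviations 8 and 4, correlation 0.5). It is an assumption that the original database was generated in a comparable way; the function name is illustrative.

```python
import math
import random


def correlated_thresholds(rng, mean_tt, sd_tt, mean_at, sd_at, rho):
    """Draw (travel time, access time) thresholds with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    t_tt = mean_tt + sd_tt * z1
    # Mixing z1 and z2 induces the required correlation with t_tt.
    t_at = mean_at + sd_at * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return t_tt, t_at


if __name__ == "__main__":
    rng = random.Random(39)
    draws = [correlated_thresholds(rng, 45, 8, 28, 4, 0.5) for _ in range(5)]
    for t_tt, t_at in draws:
        print(round(t_tt, 1), round(t_at, 1))
```

The Cost threshold would still be drawn independently, as stated in the text.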
The MTI model considers all thresholds to be independent, while the MTMN model allows for
dependence between the thresholds of Travel time and Access time. Note that both models are
equivalent with the exact same log-likelihood, and that the correlation parameter in the MTMN
model is not significant (although correctly estimated). We carried out the same experiment var-
ying the correlation and obtained similar results; the log-likelihood function appears to be fairly
insensitive to correlation. Consequently, this evidence suggests that in the estimation of a choice
model incorporating thresholds the effect of threshold correlation may be marginal.

3.2. Application to a SP survey

These data came from a route choice stated preference survey for car trips between the cities of
Santiago and Valparaiso in Chile (Rizzi and Ortuzar, 2003). Three variables were used in the
experiment: travel time (min); toll charge (US$); and number of fatal accidents per year. Each
individual had to choose between two routes according to a fractional factorial design. More
than 900 survey forms were distributed and 395 returned. From these, 342 responses were
considered appropriate for the analysis and the total number of valid observations was 3071. We found

Table 6
Comparison of models considering correlation

| Parameter | Value range | Parameter target | MNL | MTI | MTMN |
|---|---|---|---|---|---|
| Cost | 12.0–82.2 | 0.005 | 0.0799 (52.9; 49.9) | 0.0070 (1.7; 0.5) | 0.0070 (1.7; 0.5) |
| Travel time | 4.0–67.7 | 0.08 | 0.0714 (45.2; 5.4) | 0.0798 (22.8; 0.1) | 0.0798 (22.8; 0.1) |
| Access time | 0.5–36.7 | 0.16 | 0.1607 (44.5; 0.2) | 0.1628 (33.5; 0.6) | 0.1628 (33.5; 0.6) |
| Threshold cost (mean) | | 40 | | 40.2 (117.5; 0.6) | 40.2 (117.5; 0.6) |
| Threshold cost (standard deviation) | | 5 | | 5.5 (25.2; 2.2) | 5.5 (25.3; 2.2) |
| Threshold travel time (mean) | | 45 | | 44.2 (35.6; 0.7) | 44.2 (35.4; 0.6) |
| Threshold travel time (standard deviation) | | 8 | | 8.6 (5.9; 0.4) | 8.6 (6.0; 0.4) |
| Threshold access time (mean) | | 28 | | 27.4 (44.6; 0.9) | 27.4 (43.9; 0.9) |
| Threshold access time (standard deviation) | | 4 | | 2.3 (3.2; 2.4) | 2.3 (3.2; 2.4) |
| Correlation between travel and access times | | 0.5 | | | 0.46 (0.1; 0.0) |
| Sample size | | 10,000 | 10,000 | 10,000 | 10,000 |
| Log likelihood | | | 7549.1 | 6973.2 | 6973.2 |

150 respondents (e.g. 1350 observations) who answered the choice game lexicographically (i.e.
they always chose the alternative with the ‘‘best’’ value of a particular attribute); of these, 72 were
lexicographic in the Accidents variable, 57 in the Travel time variable, and 21 in the Toll
variable. 10
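The lexicographic screening described above (a respondent who always picks the alternative with the best value of one attribute) can be illustrated with a short sketch. The data layout, the attribute names and the treatment of ties are assumptions made for this example; the paper does not describe how the screening was implemented.

```python
# Hypothetical layout: each observation is (route1, route2, chosen), where
# route1/route2 map attribute name -> value and chosen is 1 or 2.
# Lower values are assumed to be better for all three attributes.
ATTRIBUTES = ("accidents", "time", "toll")


def lexicographic_attribute(observations):
    """Return the first attribute the respondent always follows, else None.

    Observations in which both routes tie on the attribute are skipped,
    since they carry no information about that attribute.
    """
    for attr in ATTRIBUTES:
        informative = 0
        consistent = True
        for route1, route2, chosen in observations:
            if route1[attr] == route2[attr]:
                continue
            informative += 1
            best = 1 if route1[attr] < route2[attr] else 2
            if chosen != best:
                consistent = False
                break
        if consistent and informative > 0:
            return attr
    return None


# Toy respondent who always takes the route with fewer accidents.
safety_first = [
    ({"accidents": 16, "time": 120, "toll": 7.0},
     {"accidents": 28, "time": 90, "toll": 2.4}, 1),
    ({"accidents": 28, "time": 90, "toll": 2.4},
     {"accidents": 16, "time": 135, "toll": 7.0}, 2),
]
# Toy respondent who trades attributes off and follows no single one.
trader = [
    ({"accidents": 16, "time": 120, "toll": 7.0},
     {"accidents": 28, "time": 90, "toll": 2.4}, 1),
    ({"accidents": 28, "time": 90, "toll": 2.4},
     {"accidents": 16, "time": 135, "toll": 7.0}, 1),
]
print(lexicographic_attribute(safety_first))
print(lexicographic_attribute(trader))
```

A respondent could be consistent with more than one attribute in a short choice game; this sketch simply reports the first match in the order given by `ATTRIBUTES`.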
The deterministic component of indirect utility was modelled as:
$$V_i = \alpha A + \beta C + \gamma T, \tag{17}$$
where A is the annual accident rate, C stands for the toll charge and T for travel time. Initially we estimated models for the whole database (i.e. including the lexicographic individuals); the results are presented in Table 7. The first model is the conventional MNL; T1 is a model considering a threshold for the Toll, T2 considers a threshold for Travel time and T3 a threshold for Accidents. In model T4 the mean of the threshold for Accidents is expressed as $\tau + \rho\,YM$, where YM is a dummy variable that takes the value one for men aged 30 or less, and zero otherwise. This function was based on evidence that males under this age were less likely to be willing to pay for safety (Rizzi and Ortuzar, 2003). In all cases we assumed that the thresholds were independent and followed a left-truncated Normal distribution.
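The two building blocks named above, the linear utility of Eq. (17) and a left-truncated Normal threshold, can be sketched as below. This is an illustration under stated assumptions rather than the estimator used in the paper: the truncation point (zero), the coefficient signs in the usage lines, and all function names are assumptions, and the way the two stages are combined into a choice probability follows the model formulation given earlier in the paper, which is not reproduced here. The binary logit form follows from the IID Gumbel errors assumed at the compensatory stage.

```python
import math


def utility(accidents, toll, time, alpha, beta, gamma):
    """Deterministic utility V = alpha*A + beta*C + gamma*T, as in Eq. (17)."""
    return alpha * accidents + beta * toll + gamma * time


def binary_logit(v1, v2):
    """Probability of choosing route 1 over route 2 under IID Gumbel errors."""
    return 1.0 / (1.0 + math.exp(v2 - v1))


def normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def acceptance_probability(x, mean, sd, lower=0.0):
    """P(threshold >= x) for a Normal(mean, sd) left-truncated at `lower`.

    An alternative whose attribute value x exceeds the threshold is rejected,
    so this is the probability that the attribute does not rule it out.
    The truncation point `lower` is an assumption of this sketch.
    """
    if x <= lower:
        return 1.0
    tail_x = 1.0 - normal_cdf((x - mean) / sd)
    tail_lower = 1.0 - normal_cdf((lower - mean) / sd)
    return tail_x / tail_lower


# Usage with magnitudes from model T3 in Table 7; the negative signs on
# toll and travel time are assumed for illustration.
v1 = utility(accidents=16, toll=7.0, time=90, alpha=0.043, beta=-0.50, gamma=-0.0707)
v2 = utility(accidents=28, toll=2.4, time=135, alpha=0.043, beta=-0.50, gamma=-0.0707)
print("P(route 1 | both routes accepted):", round(binary_logit(v1, v2), 3))
print("P(route with 28 accidents is accepted):",
      round(acceptance_probability(28, mean=28.24, sd=5.97), 3))
```

As the threshold mean grows relative to the attribute range, the acceptance probability tends to one and only the compensatory stage remains.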
Table 7
Models for the SP survey including all individuals
| Coefficients (t-ratios) | Value range | MNL | T1 | T2 | T3 | T4 |
|---|---|---|---|---|---|---|
| Accidents ($\alpha$) | 16–28 | 0.1028 (14.7) | 0.1033 (13.9) | 0.1034 (13.4) | 0.043 (1.1) | 0.026 (0.7) |
| Toll ($\beta$) | 2.4–7.0 | 0.35 (7.9) | 0.35 (7.4) | 0.35 (7.8) | 0.50 (4.7) | 0.50 (5.3) |
| Travel time ($\gamma$) | 90–135 | 0.0391 (14.4) | 0.0392 (14.2) | 0.0388 (12.9) | 0.0707 (6.1) | 0.0673 (7.1) |
| Threshold accidents (mean) | | | | | 28.24 (42.9) | 28.14 (41.9) |
| Threshold accidents (standard deviation) | | | | | 5.97 (6.1) | 5.85 (6.9) |
| Threshold toll (mean) | | | 7.98 (0.3) | | | |
| Threshold toll (standard deviation) | | | 0.39 (0.03) | | | |
| Threshold travel time (mean) | | | | 159.6 (1.3) | | |
| Threshold travel time (standard deviation) | | | | 11.3 (0.2) | | |
| Young men ($\rho$) | | | | | | 6.30 (2.8) |
| Sample size | | 3071 | 3071 | 3071 | 3071 | 3071 |
| Log-likelihood | | 1795.5 | 1795.4 | 1795.4 | 1785.9 | 1773.3 |

A perusal of Table 7 reveals that there is no evidence of threshold effects for the attributes Toll and Travel time; in fact, the likelihood ratio (LR) test allows us to conclude that models MNL, T1 and T2 are equivalent (LR is smaller than the critical value $\chi^2_{95\%,\,2\,\mathrm{df}} = 5.99$). The results also indicate that all the effect of the Accidents variable appears at the non-compensatory stage, acting as a discriminating element. In contrast, both Toll and Travel time are clearly more relevant at the compensatory stage. On the other hand, models T3 and T4 have LR tests of 19.2 and 43.3 respectively when compared with the MNL, suggesting a clear threshold effect for Accidents. Note that the parameter of this attribute in the utilities of models T3 and T4 has a positive sign but a low t-ratio, suggesting it could be dropped from the specification. Finally, it is evident that model T4 has the best fit, and the positive sign and magnitude of the extra parameter $\rho$ allow us to conclude that the accident threshold for young men is larger than for the rest of the population.

[10] This proportion of lexicographic individuals is fairly typical of SP studies (Saelensminde, 2001).
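The likelihood ratio comparison used above can be reproduced with a few lines for the models that add two parameters to the MNL (T1, T2 and T3). This is a sketch under one assumption: the log-likelihoods of Table 7 are entered as negative numbers, as is usual for discrete choice models. The function names are illustrative. For two degrees of freedom the chi-square survival function is exp(-x/2), so the critical value has a closed form and no statistics library is needed.

```python
import math

# Log-likelihoods from Table 7, entered as negative values (assumption).
LOG_LIK = {"MNL": -1795.5, "T1": -1795.4, "T2": -1795.4, "T3": -1785.9}


def lr_statistic(ll_restricted, ll_unrestricted):
    """Likelihood ratio statistic LR = 2 * (LL_unrestricted - LL_restricted)."""
    return 2.0 * (ll_unrestricted - ll_restricted)


def chi2_critical_2df(confidence=0.95):
    """Chi-square critical value for 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(1.0 - confidence)


critical = chi2_critical_2df()
for model in ("T1", "T2", "T3"):
    lr = lr_statistic(LOG_LIK["MNL"], LOG_LIK[model])
    verdict = "threshold effect" if lr > critical else "equivalent to MNL"
    print(f"{model}: LR = {lr:.1f}, critical = {critical:.2f} -> {verdict}")
```

Model T4 adds a third parameter, so its comparison with the MNL would use the critical value for three degrees of freedom instead.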
A comparison of subjective values of time (SVT) shows that they are very similar for models MNL, T1 and T2, confirming their equivalence (Table 8). For models T3 and T4, however, the point estimate rises from about 0.11 to about 0.14 US$/min. This result is novel and interesting, and deserves a more detailed analysis of its microeconomic interpretation.

Table 8
Estimation of SVT for SP survey data
| SVT (US$/min) | MNL | T1 | T2 | T3 | T4 |
|---|---|---|---|---|---|
| Point estimate | 0.112 | 0.112 | 0.111 | 0.141 | 0.135 |
| Confidence interval | 0.098–0.133 | 0.096–0.138 | 0.094–0.136 | 0.106–0.206 | 0.075–0.257 |
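The point estimates in Table 8 are consistent with computing the SVT as the ratio of the Travel time coefficient to the Toll coefficient of Eq. (17). The sketch below checks this against the coefficients reported in Table 7; the dictionary layout and function name are illustrative, and the confidence intervals are not reproduced here.

```python
# (toll coefficient, travel time coefficient) for each model, from Table 7.
COEFFICIENTS = {
    "MNL": (0.35, 0.0391),
    "T1": (0.35, 0.0392),
    "T2": (0.35, 0.0388),
    "T3": (0.50, 0.0707),
    "T4": (0.50, 0.0673),
}


def subjective_value_of_time(beta_toll, gamma_time):
    """SVT in US$/min: marginal rate of substitution between time and toll."""
    return gamma_time / beta_toll


svt = {model: subjective_value_of_time(b, g) for model, (b, g) in COEFFICIENTS.items()}
for model, value in svt.items():
    print(f"{model}: SVT = {value:.3f} US$/min")
```

Because both coefficients enter the ratio with the same sign, the result does not depend on how the signs of the printed coefficients are read.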

We also estimated models for the lexicographic individuals alone (Table 9). Model L1 includes a threshold for Accidents, model L2 a threshold for Travel time and model L3 thresholds for both Accidents and Travel time. Our results show that the Multinomial Logit model (MNLL) has an incorrect sign for the Toll parameter, which is also non-significant. We do not report a model incorporating a threshold for Toll as it was substantially similar to MNLL. Models L1 and L2 have a clearly better fit than the MNLL, but in model L3 (with thresholds for both variables) it is evident that the effect of Accidents is more important than that of Travel time. Once more, a model including a different threshold for young men (L4) achieves the best fit. Finally, note that in the threshold models the compensatory stage seems to be of no relevance, as none of the parameters of the attributes in the utility function is significantly different from zero. This is obviously a reassuring finding.
It may be of interest to mention that the results obtained for a sample of just compensatory individuals were similar to those obtained with the complete sample. Again, the existence of a threshold for Accidents was clearly supported and there was no evidence of thresholds for Toll or Travel time.
Finally, we compared the performance of the MNL and the proposed T4 model in terms of re-
sponse to seven policy changes (Table 10). Policies P1 to P3 correspond to small changes, while
P4 to P7 represent strong policy changes. All scenarios make route two more attractive than route
one, except P7 which is a special case (i.e. an increase in the toll coupled with a decrease in both
accidents and travel time).
In the original survey 1009 observations chose route two (32.9%). Table 11 shows the percentage increase in patronage predicted by each model for this route when the policies, expressed as percentage changes in attribute values, were applied. When the changes are small both models give similar forecasts, but when the policies are strong the differences become large.

Table 9
Models for lexicographic individuals in the SP survey
| Coefficients (t-ratios) | Value range | MNLL | L1 | L2 | L3 | L4 |
|---|---|---|---|---|---|---|
| Accidents ($\alpha$) | 16–28 | 0.1237 (12.9) | 0.2415 (0.4) | 17.111 (0.1) | 0.1073 (0.09) | 3.8006 (0.1) |
| Toll ($\beta$) | 2.4–7.0 | 0.05 (0.6) | 0.00 (0.0) | 1.65 (0.8) | 0.10 (0.04) | 6.50 (0.1) |
| Travel time ($\gamma$) | 90–135 | 0.0212 (6.5) | 0.3667 (1.1) | 1.3820 (0.1) | 0.474 (0.7) | 3.0316 (0.1) |
| Threshold accidents (mean) | | | 24.19 (33.8) | | 24.17 (34.2) | 23.97 (39.0) |
| Threshold accidents (standard deviation) | | | 9.86 (14.7) | | 9.89 (14.6) | 8.69 (17.5) |
| Threshold travel time (mean) | | | | 136.22 (31.4) | 190.9 (0.3) | |
| Threshold travel time (standard deviation) | | | | 54.9 (8.9) | 13.95 (0.1) | |
| Young men | | | | | | 20.044 (4.5) |
| Sample size | | 1350 | 1350 | 1350 | 1350 | 1350 |
| Log-likelihood | | 689.3 | 664.9 | 668.0 | 664.5 | 654.9 |

Table 10
Policy changes: percentage change in attribute values
Policy Toll 1 Toll 2 Accident 2 Time 2
P1 20
P2 +25
P3 15
P4 50
P5 +100
P6 50
P7 +100 50 50

Table 11
Comparison of model forecasts
| Policy | MNL: Route 1 | MNL: Route 2 | MNL: Increase Route 2 (%) | T4: Route 1 | T4: Route 2 | T4: Increase Route 2 (%) |
|---|---|---|---|---|---|---|
| P1 | 1686 | 1385 | 37.3 | 1662 | 1409 | 39.7 |
| P2 | 1718 | 1353 | 34.1 | 1815 | 1256 | 24.5 |
| P3 | 1602 | 1469 | 45.6 | 1669 | 1402 | 38.9 |
| P4 | 1144 | 1927 | 91.0 | 1576 | 1495 | 48.2 |
| P5 | 774 | 2297 | 127.6 | 1314 | 1757 | 74.1 |
| P6 | 619 | 2452 | 143.0 | 1194 | 1877 | 86.0 |
| P7 | 713 | 2358 | 133.7 | 678 | 2393 | 137.2 |

In fact, the MNL tends to overestimate the number of individuals choosing route two in comparison with the proposed model. Policy P7 deserves special mention: its changes are strong, but some of them increase and others decrease the probability of using route two, and the forecasts of the two models are similar.
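The percentage increases reported in Table 11 follow from the forecast number of observations choosing route two and the 1009 observations that chose it in the original survey. A minimal sketch of that arithmetic, using the counts from Table 11, is given below; the names are illustrative, and small differences from the published figures can arise because the forecast counts are rounded.

```python
BASE_ROUTE2 = 1009  # observations choosing route two in the original survey

# Forecast observations on route two: policy -> (MNL, proposed model T4).
ROUTE2_FORECASTS = {
    "P1": (1385, 1409),
    "P2": (1353, 1256),
    "P3": (1469, 1402),
    "P4": (1927, 1495),
    "P5": (2297, 1757),
    "P6": (2452, 1877),
    "P7": (2358, 2393),
}


def pct_increase(forecast, base=BASE_ROUTE2):
    """Percentage increase in patronage relative to the base situation."""
    return 100.0 * (forecast - base) / base


for policy, (mnl, t4) in ROUTE2_FORECASTS.items():
    gap = pct_increase(mnl) - pct_increase(t4)
    print(f"{policy}: MNL {pct_increase(mnl):6.1f}%  T4 {pct_increase(t4):6.1f}%  "
          f"difference {gap:+6.1f} points")
```

The difference column makes the pattern in the text easy to see: it is small for P1 to P3 and for P7, and large for the strong single-attribute policies P4 to P6.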
To end this section we want to mention that another test was made using a very fine revealed
preference data set but results were not particularly informative. We suspect this happened be-
cause the choice set determination had been conducted with utmost care. The interested reader
might consult these results in Cantillo (2004).

4. Conclusions

The consideration of attribute thresholds in discrete choice models allows us not only to con-
sider seemingly compensatory individuals, but also individuals that may be using non-compensa-
tory choice mechanisms such as elimination by aspects. We propose a general model, a particular
case of which is the classic compensatory model when thresholds are large enough to reject no
option.
We found that if there is evidence about the existence of thresholds in the population the use of
a fully compensatory model, such as MNL or even Mixed Logit, can lead to serious errors in pre-
dictions and in estimation, and therefore in the computation of marginal rates of substitution such
as the subjective value of time. Mixed Logit models, although able to consider random variations
in the parameters including differences in tastes, appear not to be able to represent the existence of
non-compensatory behaviour. On the other hand, although we used IID Gumbel errors at the
compensatory stage for simplicity it is perfectly possible to establish another type of specification,
including a random parameters model.
The proposed model is able to consider correlation among thresholds but we found that in esti-
mation the effect of correlation was marginal and models incorporating correlated thresholds were
equivalent to those considering independence among thresholds. When we applied the model to
real data, we found a better fit when the distribution of the thresholds was expressed as a function
of socio-economic characteristics of the individual and the choice conditions prevalent, than when
we considered the same distribution over the whole sample.
An evident disadvantage of the model is the computational effort required for its estimation,
especially when the number of alternatives is large (six or more). However, at least in the case
of stated preference surveys it is unusual to design experiments with a large number of options.
There are several aspects of interest for future research. One is to evaluate the impact of sample
size on the estimation of models including thresholds; a first analysis seems to indicate that a large
number of observations is required, but how many? Another aspect relates to a more in-depth
analysis on the estimation of subjective values of time or other marginal rates of substitution
in the presence of thresholds, including their microeconomic implications. Finally, the influence
of thresholds on model elasticities also remains to be examined.

Acknowledgment

We are grateful to Ben Heydecker, Luis Rizzi, Mauricio Sillano and Huw Williams for their
intellectual help in developing these ideas. We are also grateful for the useful review and sugges-
tions of three anonymous referees. Notwithstanding and as usual, all errors are our sole respon-
sibility. Thanks are also due to the Chilean Fund for the Development of Scientific and
Technological Research (FONDECYT) for having supported this research through several pro-
jects (1970117, 1000616 and 1020981). Finally, we wish to acknowledge the support of the School
of Engineering Travelling Fund and MECE program, for having partially supported a visit of the
first author to the Centre of Transport Studies at University College London, where the first draft
of this paper was written.

References

Aptech Systems, 1994. Gauss User's Manuals. Maple Valley.
Armstrong, P.M., Garrido, R.A., Ortuzar, J. de D., 2001. Confidence intervals to bound the value of time.
Transportation Research 37E, 143–161.
Batley, R., Daly, A., 2003. Establishing equivalence between nested logit and hierarchical elimination-by-aspects choice
models. In: Proceedings European Transport Conference, Strasbourg, October 2003.
Ben Akiva, M.E., Boccara, B., 1995. Discrete choice models with latent choice sets. International Journal of Research
in Marketing 12, 9–24.
Cantillo, V., 2004. Modelación de la Demanda Incorporando Umbrales Mı́nimos de Percepción y Valoración de
Atributos. Ph.D. Thesis, Department of Transport Engineering, Pontificia Universidad Católica de Chile, Santiago
(in Spanish).
Cascetta, E., Papola, A., 2001. Random utility models with implicit availability/perception of choice alternatives for the
simulation of travel demand. Transportation Research 9C, 249–263.
Gilbride, T., Allenby, G., 2004. A choice model with conjunctive, disjunctive and compensatory screening rules.
Marketing Science 23, 391–406.
Greene, W., Hensher, D., 2003. A latent class model for discrete choice analysis: contrast with mixed logit.
Transportation Research 37B, 681–698.
Kawamoto, E., Setti, J.R., 1992. Procedure for the calibration of a semi-compensatory mode choice model.
Transportation Research Record 1357, 66–71.
Manski, C.F., 1977. The structure of random utility models. Theory and Decision 8, 229–254.
McFadden, D., 1981. Econometric models of probabilistic choice. In: Manski, C.F., McFadden, D. (Eds.), Structural
Analysis of Discrete Data: with Econometric Applications. MIT Press, Cambridge, Mass.
McFadden, D., Train, K., 2000. Mixed MNL models for discrete response. Journal of Applied Econometrics 15, 447–
470.
Morikawa, T., 1995. A hybrid probabilistic choice set model with compensatory and non-compensatory choice rules.
In: Proceedings 7th World Conference on Transport Research, vol. 1. Travel Behaviour. Pergamon, Oxford.
Munizaga, M.A., Heydecker, B.G., Ortuzar, J. de D., 2000. Representation of heteroscedasticity in discrete choice
models. Transportation Research 34B, 219–240.
Ortuzar, J. de D., Willumsen, L.G., 2001. Modelling Transport. Wiley, Chichester.
Payne, J., Bettman, J., Johnson, E., 1993. The Adaptive Decision Maker. Cambridge University Press, New York.
Rizzi, L.I., Ortuzar, J. de D., 2003. Stated preference in the valuation of interurban road safety. Accident Analysis and
Prevention 35, 1–14.
Saelensminde, K., 2001. Inconsistent choices in stated choice data. Transportation 28, 269–296.
Simon, H.A., 1955. A behavioural model of rational choice. Quarterly Journal of Economics 69, 99–118.
Swait, J.D., 2001. A non-compensatory choice model incorporating attribute cut-offs. Transportation Research 35B,
903–928.
Swait, J.D., Ben Akiva, M.E., 1987. Incorporating random constraints in discrete models of choice set generation.
Transportation Research 21B, 91–102.
Train, K., 2003. Discrete Choice Methods with Simulation. Cambridge University Press, Cambridge.
Tversky, A., 1972. Elimination by aspects: a theory of choice. Psychological Review 79, 281–299.
Williams, H.C.W.L., Ortuzar, J. de D., 1982. Behavioural theories of dispersion and the misspecification of travel
demand models. Transportation Research 16B, 167–219.
Young, W., Richardson, A.J., 1983. The application of an elimination by aspects model to residential location choice.
Australian Road Research 13, 100–112.
Young, W., Richardson, A.J., Ogden, K.W., Rattray, A.L., 1982. Road and rail freight mode choice: application of an
elimination by aspects model. Transportation Research Record 838, 38–44.
