Beruflich Dokumente
Kultur Dokumente
4/9/2019
Stochastic user equilibrium
1. Discrete Choice Models and Traffic Assignments
This following section formulates the route-choice as a process of selection among alternative
paths, on which the travel time is random. The formulation is based on the theory of
discrete choice models, which describe individuals' choices between competing alternatives.
The focus in this book is on travel alternatives such as the available paths from an origin
to a destination, alternative modes of transportation, or alternative destinations which could
be visited.
The relevant underlying theory and the two most commonly used discrete choice models
are introduced in Section 10.1. This section concentrates on the formulation of the models
and their relationship to the behavioral concept of utility maximization. Section 10.2 shows
how these models can be applied to calculate the flow assigned to each path connecting an
origin and a destination in a transportation network.
1.1. REVIEW OF DISCRETE CHOICE MODELS
The hypothesis underlying discrete choice models is that when faced with a choice
situation, an individual's preferences toward each alternative can be described by an
"attractiveness" or "utility" measure associated with each alternative. This utility is a
function of the attributes of the alternatives as well as the decision maker's characteristics.
The decision maker is assumed to choose the alternative that yields the highest utility.
Utilities, however, cannot be observed or measured directly. Furthermore, many of the
attributes that influence individuals' utilities cannot be observed and must therefore be
treated as random. Consequently, the utilities themselves are modeled as random, meaning
that choice models can give only the probability with which alternatives are chosen, not
the choice itself.
This section presents some of the basic concepts associated with discrete choice models
(which are also known as random utility models). It deals with the notions of a choice
function, a satisfaction function, and the application of discrete choice models to
populations with heterogeneous characteristics.
These issues are discussed in general as well as in relation to two specific
discrete choice model forms: the multinomial logit and the multinomial probit.
Choice Function
Let 𝑈 = (𝑈1 , . . ., 𝑈𝐾 ) denote the vector of utilities associated with a
given set of alternatives, 𝜘. This set includes K alternatives numbered
1, 2, . . ., 𝐾. The utility of each alternative to a specific decision maker
can be expressed as a function of the observed attributes of the
alternatives and the observed characteristics of this decision maker. Let
a denote the vector of variables which include these characteristics and
attributes. Thus 𝑈𝑘 = 𝑈𝑘 (𝑎). To incorporate the effects of unobserved
attributes and characteristics, the utility of each alternative is expressed
as a random variable consisting of a systematic (deterministic)
component, 𝑉𝑘 (𝑎), and an additive random "error term", 𝜉𝑘 (𝑎), that is,
𝑈𝑘 (𝑎) = 𝑉𝑘 (𝑎) + 𝜉𝑘 (𝑎) (1.1)
The random component of the utility satisfies 𝐸[𝜉𝑘 (𝑎)] = 0, meaning
that 𝐸[𝑈𝑘 (𝑎)] = 𝑉𝑘 (𝑎). In this context, 𝑈𝑘 (𝑎) is sometimes referred to
as the "perceived utility" and 𝑉𝑘 (𝑎) as the "measured utility.
The utility functions usually include a set of parameters that are
statistically estimated from individuals' observed choices. For now,
assume that the values of these parameters are known and thus 𝑈𝑘
changes only with a.
Given the distribution of possible utility values, the probability that a
particular alternative will be chosen by a decision maker, who is
randomly selected from a given population, can be calculated. The
distribution of the utilities is a function of the attribute vector, a.
Therefore, the probability that alternative 𝑘 (where 𝑘 𝜖𝜘) will be chosen,
𝑃𝑘 , can be related to a. The function relating 𝑃𝑘 to a is known as the
choice function and is denoted by 𝑃𝑘 (𝑎). The probability that alternative
𝑘 is chosen, given a, should be understood as the fraction of individuals
in a large population, all members of which are characterized by a, who
choose alternative 𝑘. The choice probability is then the probability that
𝑈𝑘 (𝑎) is higher than the utility of any other alternative (given a) and
𝑃𝑘 (𝑎) = 𝑃𝑟 [𝑈𝑘 (𝑎) ≥ 𝑈𝑙 (𝑎), ∀ 𝑘 𝜖𝜘 ] ∀ 𝑘 𝜖𝜘 (1.2)
The choice function, 𝑃𝑘 (𝑎) , has all the properties of an element of a
probability mass function, that is,
0 ≤ 𝑃𝑘 (𝑎) ≤ 1 ∀ 𝑘 𝜖𝜘 (1.3a)
∑𝑘𝑘=1 𝑃𝑘 (𝑎) = 1 (1.3b)
Once the distribution of the error term, k, is specified, the distribution of
the utilities can be determined, and the choice function can be
calculated explicitly. Consider, for example, a choice situation in which
only two alternatives are available. The utility of the first alternative is
𝑈1 = 3 + 𝜉 and the utility of the second alternative is 𝑈2 = 2. In this
case all parameters and characteristics are known and fixed. The
probability that the first alternative will be chosen, 𝑃1 , is
𝑃1 = 𝑃𝑟 (𝑈1 > 𝑈2 ) = 𝑃𝑟 (3 + 𝜉 ≥ 2)
= 𝑃𝑟 (𝜉 ≥ − 1) (1.4)
Assume now that the density of 𝜉 is uniform between -2 and 2, that is,
1
𝑓𝑜𝑟 − 2 ≤ 𝜔 ≤ 2
Pr(𝜔 ≤ 𝜉 ≤ 𝜔 + 𝑑𝜔) = {4 (1.5)
0 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
where 𝜔 is any real number. Given this distribution, the choice
probability expressed by 𝐸𝑞. (1.4) is 𝑃1 = 0.75, implying that 𝑃2 = 0.25.
This example demonstrates the nature of the choice probability; 75% of
all individuals for whom the utility functions are specified as above
will choose alternative 1, according to this model. If, for some other
population of individuals, 𝑈1 remains unchanged and 𝑈3 = 3 (instead of
2), the choice probability of the first alternative, 𝑃1 , will be 0.5
(assuming that the distribution of 𝜉 does not change). In other words,
half of the new population will choose alternative 1. The choice
probability thus depends on the characteristics (captured by the vector
a) of the utility functions, as well as on the distribution of the random
component.
The following two sections discuss two specific discrete choice models,
arising from two different specifications of the error term. These are the
logit and the probit models.
Multinomial Logit
One of the most widely used discrete choice models is the logit model.
This model, can be derived from the concepts of random utility and
utility maximization by assuming that the random terms of each utility
function are independently and identically distributed Gumbel variates.
The choice probability is then given by
𝑒 𝑉𝑘
𝑃𝑘 = ∑𝑘 𝑉𝑙 ∀ 𝑘 𝜖𝜘 (1.6a)
𝑘=1 𝑒
1
𝑃𝑘 = (1.6b)
1+∑𝑘
𝑙≠𝑘 𝑒
𝑉𝑙 −𝑉𝑘
In the case of binary logit, that is, when there are only two alternatives
(indexed k = 1 and k = 2 ), the model can be written as
𝑒 𝑉1 1
𝑃1 = == (1.7a)
𝑒 𝑉1 + 𝑒 𝑉2 1+ 𝑒 𝑉2−𝑉1
and, of course,
1
𝑃2 = 1 − 𝑃1 = (1.7b)
1+ 𝑒 𝑉1 −𝑉2
Figure 1.2 illustrates the automobile choice function, 𝑃𝑎𝑢𝑡𝑜(𝑡,𝑡̂) , versus the difference 𝑡̂ −
𝑡. Given 𝑡 and 𝑡̂ (or merely their difference), the model gives the share of all individuals in
the population under study, choosing the automobile alternative. If the size of this population
is known, the number of individuals choosing each mode can be determined.
The logit formula is easy to use in the multinomial case (see Eq. [1.6a]) as well. This, in fact,
is the main advantage of the logit as compared with other, more sophisticated discrete choice
models.
Multinomial Probit
Following many other statistical models, it may be natural to assume that the random error
term of each utility is normally distributed. This is the underlying assumption of the probit
model which is based on the postulate that the joint density function of these error terms
is the multivariate normal (MVN) function.
The MVN distribution is the multinomial extension of the well-known normal density function.
It describes the distribution of a random vector, 𝜉 = (𝜉1 , … 𝜉𝐾 . This distribution is
characterized by a (K-length) vector of means, µ , and a (𝐾 𝑥 𝐾) covariance matrix, 𝛴 . The
notation , 𝜉 ∼ 𝑀𝑉𝑁(µ, 𝛴) indicates that the vector, 𝜉 is MVN distributed with mean vector
P and covariance matrix 𝛴. The covariance matrix includes the variances of the components
of the random vector, and the covariances between these components, that is,
(𝛴)𝑘𝑘 = 𝑣𝑎𝑟(𝜉𝑘 ) ∀ 𝑘 𝑎𝑛𝑑 (𝛴)𝑘𝑙 = 𝑐𝑜𝑣(𝜉𝑘 , 𝜉𝑙 ) ∀ 𝑘 ≠ 𝑙 (1.10)
As shown there, this distribution conserves its shape under linear transformation, meaning, for example,
that the sum (as well as the difference) of two normal variates is normally distributed. Given a
covariance matrix (associated with the randomness in the perception of a set of alternatives) and a
vector of known alternatives' attributes, the distribution of the utility vector, 𝑈(𝑎) = [𝑈1 (𝑎), … 𝑈𝐾 (𝑎)]
can be modeled as multivariate normal; in other words,
𝑈(𝑎)~𝑀𝑉𝑁[𝑉(𝑎), 𝛴] (1.11)
As with any discrete choice model, the probability that a particular alternative will be chosen is
given by the probability that its utility is the highest in the choice set (see Eq. [10.2]). With probit
models, however, this choice probability cannot be expressed analytically since the cumulative normal
distribution function cannot be evaluated in closed form. In the case of a two-alternative (i.e., binary)
model, however, the choice probabilities can be computed by referring to cumulative normal tables,
as shown below.
To demonstrate the computation of a choice probability in the binary case, assume that 𝑈 = (𝑈1 ,
𝑈2 ) where 𝑈1 = 𝑉1 + 𝜉1 and 𝑈2 = 𝑉2 + 𝜉2 . The distribution of the error term vector is given by
𝜉 ~𝑀𝑉𝑁(𝟎, 𝛴)
Where 𝜉 = (𝜉1 , 𝜉2) ), 𝟎 is a vector of zeros, (0, 0), and
𝜎21 𝜎12
𝛴 = ( )
𝜎21 𝜎22
Now (from Eq. [1.2]), the probability of choosing the first alternative is given by
Where Φ(. ) is the standard cumulative normal curve. Once the values of 𝑉1, 𝑉2 and 𝜎
are given, the value of 𝑃1 can be found by referring to a standard normal
distribution table.
The Satisfaction Function
The focus of the first part of this section was on the choice function and the computation of
the choice probability. Another important quantity related to discrete choice models is the
satisfaction, 𝑆̃, and the satisfaction function, 𝑆̃(a). The satisfaction is a scalar, which captures
the expected utility that an individual (decision-maker) derives from a set of alternatives, 𝜘,
each with utility 𝑈𝑘 . Since each individual is assumed to select the alternative with the
maximum utility, the satisfaction is simply the expectation of the maximum utility alternative.
In other words,
𝑆̃ = 𝐸 [max{𝑈𝑘 }] [1.13a]
∀𝑘
where the expectation is taken with respect to the distribution of 𝑈 (which is,
of course, derived from the distribution of 𝜉 ). The satisfaction can be expressed
as a function of the vector a and the quantity 𝑆̃(a) is known as the satisfaction
function. Given that 𝑽 = 𝑽(𝒂) and given the distribution of 𝜉 , the satisfaction
can also be expressed as a function of the vector of measured utilities, V, that is,
𝑟𝑠
where 𝜉𝑘 is a random error term associated with the route under consideration.
𝑟𝑠
Furthermore, assume that 𝐸[𝜉𝑘 ] = 0, or 𝐸[𝐶𝑘𝑟𝑠 ] = 𝑐𝑘𝑟𝑠 . In other words, the
average perceived travel time is the actual travel time. (The average travel time
can be interpreted as the travel time used in the deterministic analysis.) If the
population of drivers between r and s is large, the share of drivers choosing the kth
route, 𝑃𝑘𝑟𝑠 , is given by
𝑃𝑘𝑟𝑠 = 𝑃𝑟 (𝐶𝑘𝑟𝑠 ≤ 𝐶𝑙𝑟𝑠 ∀ 𝑙 ∈ 𝜘𝑟𝑠 ) ∀ 𝑘, 𝑟, 𝑠 [1.19]
In other words, the probability that a given route is chosen is the probability
that its travel time is perceived to be the lowest of all the alternative routes.
Figure 2 Network example with one O-D pair and two routes.
𝑐2 − 𝑐1 increases (i.e., as 𝑐1
All the curves shown in the figure depict an increase in 𝑃1 as
becomes smaller, relative to 𝑐2 ). For very low variance, the curve is very steep,
meaning that travelers perceive the differences between 𝑐1 and 𝑐2 accurately and
act accordingly. In fact, if the variance is zero, 𝑃1 = 0 for (𝑐2 − 𝑐1 ) < 0 and
𝑃1 = 1 for (𝑐2 − 𝑐1 ) > 0. This is the route-choice behavior associated with the
deterministic model in which the drivers' perception ofthe travel times is assumed
to be perfectly accurate. In a deterministic (all-or-nothing) model, all users will
be assigned to the shortest (measured) travel-time path from their origin to their
destination.
If the variance is larger, the change in the choice probability is more moderate
since the perception of travel times is less accurate. At the limit, when the
variance is extremely large, the actual travel times do not affect the perception,
which is completely random. In this case 𝑃1 = 0.5 for any value of (𝑐2 − 𝑐1 ), as
shown in the figure. The actual magnitude of the perception variance should
be determined empirically on a case-by-case basis. There is no reason, however,
to think a priori that it will be zero, as implied by the deterministic approach.
where 𝑃1(𝒄) = 𝑷𝒓(𝐶1 ≤ 𝐶2), is the probability that route 1 will be chosen by a motorist
who is randomly drawn from the population of q drivers. The flow on route 2, 𝑓2 , is given by
𝑓2 = 𝑞𝑃2 (𝒄) = 𝒒[𝟏 − 𝑃1 (𝒄)] [1.21b]
The total travel time in the system, 𝐶𝑇 , is
𝐶𝑇 (𝒄) = 𝑐1 𝑓1 + 𝑐2 𝑓2
= 𝑞[𝑐1 𝑃1 (𝒄) + 𝑐2 𝑃2 (𝒄) [1.22]
To illustrate the paradox numerically, assume that 𝑐1 = 4.0, 𝑐2 = 2.0, and 𝑞 = 1.0 in some
appropriate units. Assume further that the choice model is given by the logit-like function
1
𝑃1 =
1 + 𝑒 𝑐1 −𝑐2
The flows in this case are 𝑓1 = 0.119 and 𝑓1 = 0.881. The total travel time is
𝐶𝑇 = 0.119 ∗ 4.0 + 0.881 ∗ 2.0 = 2.238 (flow time) units
Consider now an improvement of route 1 which reduces the travel time on thisroute from 4.0
time units to 3.0 time units. The flows in this case would be 𝑓1 = 0.269 and 𝑓1 = 0.731.
𝐶𝑇 = 0.269 ∗ 3.0 + 0.731 ∗ 2.0 = 2.269 (flow time) units
The total travel time, 𝐶𝑇 , increased in this case even though one of the links was improved(!)
The Expected Perceived Travel Time
Route choice models can be viewed as discrete choice models in which the utility of
traveling over a given path, 𝑈𝑘𝑟𝑠 , is given by
𝑟𝑠
𝑆̃𝑟𝑠 = 𝐸 [max {𝑈𝑘 }] [1.24]
𝑘𝜖𝜘𝑟𝑠
𝑟𝑠
𝑆̃𝑟𝑠 (𝒄𝑟𝑠 ) = 𝐸 [ min {𝐶𝑘 }] [1.25]
𝑘𝜖𝜘𝑟𝑠
The relationship between 𝑆𝑟𝑠 (𝒄𝑟𝑠 ) and the satisfaction function, 𝑆̃𝑟𝑠 (𝒄𝑟𝑠 ) can be derived by
𝑟𝑠 1 𝑟𝑠
substituting 𝐶𝑘 = (− 𝑈𝑘 ), that is,
𝜃
1 𝑟𝑠
𝑆𝑟𝑠 (𝒄𝑟𝑠 ) = 𝐸 [min {= (− 𝑈𝑘 )}]
𝑘 𝜃
1 𝑟𝑠 1
= −𝜃 𝐸 [max {𝑈𝑘 }] = − 𝜃 𝑆̃𝑟𝑠 (𝒄𝑟𝑠 ) [1.26]
𝑘𝜖𝜘𝑟𝑠
The expected perceived travel time function, then, is minus the satisfaction function, scaled
in travel time units. It captures the expected disutility of travel between O-D pair r-s (measured
in travel-time units) as perceived by a randomly selected motorist. (This motorist is randomly
selected from the 𝑞𝑟𝑠 motorists going from r to s.) The total expected perceived travel time
between O-D pair r-s is 𝑞𝑟𝑠 𝑆𝑟𝑠 (𝒄𝑟𝑠 ).
The properties of the expected perceived travel time are similar to those of the satisfaction
except for its orientation toward travel-time minimization rather than utility maximization.
Thus 𝑆𝑟𝑠 (𝒄𝑟𝑠 ) is concave with respect to 𝒄𝑟𝑠 . Furthermore, as with the satisfaction function,
𝑟𝑠
the marginal expected perceived travel time of O-D pair r-s, with respect to 𝑐𝑘 , is the.
probability of choosing that route, that is,
𝜕𝑆𝑟𝑠 (𝒄𝑟𝑠 ) 𝑟𝑠
= 𝑃𝑘 (𝒄𝑟𝑠 ) [1.17a]
𝜕 𝑐𝑟𝑠
𝑘
where 𝑃𝑘𝑟𝑠 (𝒄𝑟𝑠 ) k between r and s is
is the probability that the travel time on route
smaller than the travel time on any other route between r and s. Finally, 𝑆𝑟𝑠 (𝒄𝑟𝑠 ) is
monotonic with respect to the number of available paths between r and s, that is,
Annotate References
The modern interest in probit models, in the context of travel demand analysis, started in
the late 1970s. Daganzo et al. (1977) have adopted Clark's (1961) approximation formulas to
the computation of the probit choice function. The Monte-Carlo simulation technique for
estimating the probit choice probabilities was developed at the same time by Lerman and
Manski (1978). Both research teams (Daganzo et al., and Lerman and Manski) found that
Clark's approximation dominated the simulation method in the range of problems considered.
Other approximate analytical methods developed in the same time include the numerical
integration method of Hausman and Wise (1978). Many of these methods are summarized
in a survey paper by Sheffi et al. (1982). The state of the art (as of 1979) with regard to
discrete choice models in general, and the multinomial probit model in particular, is
summarized in Daganzo's (1979) book on the subject. (this reference also includes many
original contributions).
The concept of satisfaction in the context of discrete choice models was suggested by
Daganzo (1979). For logit models, however, it was developed earlier, as a system evaluation
criterion, by Williams (1977), as an accessibility measure by Ben-Akiva and Lerman (1978),
and in the context of networks by Sheffi and Daganzo (1978). The aggregation concept is
discussed in many of the above-mentioned references, as well as by McFadden and Reid
(1975), Koppelman (1976), and Bouthelier and Daganzo (1979).
Daganzo and Sheffi (1977) developed the view of stochastic network loading models in the
context of discrete choice analysis. The stochastic para- dox discussed in Section 10.2 was
pointed out by Sheffi and Daganzo (1978a) in the context of logit route-choice models. The
distribution of the flows under stochastic network loading models was formulated by Daganzo
(1977a), who also showed how this distribution can be used to develop statistical tests on
the possible values of the flow pattern.
Reference