A Hybrid GA To Solve The Multidepot VRP

A Hybrid Genetic Algorithm to Solve the
Multi-Depot Vehicle Routing Problem
Inger Kolman
24-04-2012
Document downloaded from http://www.logisticsresearch.nl
Abstract
Are there logistic benefits to be gained from delivering certain web-ordered shipments via the physical store? In
literature, the problem of determining how to distribute products when there are multiple possible sources
(distribution centers and intermediate points; stores, depots, hubs) is known as the Multi-Depot Vehicle Routing
Problem (MDVRP). This thesis investigates a variant of the Multi-Depot Vehicle Routing Problem (MDVRP) in which
all sources or nodes within the network are in fact interrelated. This interrelatedness exists because the sources
represent distribution centers, depots and physical stores of a single retail organization. Objective is to determine
a set of delivery routes that minimizes total costs such that capacity constraints and route duration constraints are
satisfied and all customers are served.
Supervising University:
University of Groningen
Supported by:
A Hybrid Genetic Algorithm to Solve the

Multi-Depot Vehicle Routing Problem
Master's Thesis Operations Research
Author:
Inger Kolman
i.b.kolman@student.rug.nl
Student Number: 1646761
Supervisor:
prof. dr. K.J. Roodbergen
April 24, 2012

Master's Thesis: A Hybrid Genetic Algorithm for the Multi-Depot Vehicle Routing Problem
Abstract
This thesis investigates a variant of the Multi-Depot Vehicle Routing Problem (MDVRP) in
which the sources, supplying the customers, are related. The sources include a distribution
center and several intermediate points, where these intermediate points can be physical stores,
depots and/or hubs. The relatedness of the sources is due to the assumption that the interme-
diate points are supplied by the distribution center, which is one of the sources as well. The
objective is to determine a set of routes that minimizes total costs such that capacity constraints
and route duration constraints are satisfied and all customers are served. Due to the related-
ness of the sources, the cost structure in this variant of the MDVRP is more complex than it is in
the regular MDVRP. A hybrid genetic algorithm is presented for this variant of the MDVRP. The
algorithm is benchmarked against the nearest neighbour algorithm, which repeatedly visits the
nearest unvisited customer location until every location is visited, while satisfying the capacity
constraints and route duration constraints. It is shown that the presented algorithm performs well
on numerous different input sets.
Keywords: Multi-Depot Vehicle Routing Problem, meta-heuristics, hybrid genetic algorithm.

Contents
1 Introduction
1.1 Situation
1.2 Motivation
1.3 Goal and Scope
1.4 Methodology
1.5 Structure of Remainder
2 Problem Formulation
3 Literature Review
4 Algorithm
4.1 Generation of Initial Population
4.1.1 Initial Population Size
4.1.2 Assignment Procedure
4.1.3 Routing Algorithm
4.1.4 Feasibility Check
4.2 Evaluation
4.3 Education
4.4 Generation of Offspring
4.4.1 Stopping Criteria Corresponding to Generation of Offspring
4.4.2 Parent Selection
4.4.3 Create Offspring
4.4.4 Routing Algorithm
4.4.5 Feasibility Check
4.4.6 Evaluation
4.4.7 Education
4.4.8 Condition 1: Maximum Size Subpopulation
4.4.9 Condition 2: Maximum Number of Iterations
4.5 Stopping Criterion
5 Computational Experiments
5.1 Parameter Specification
5.2 Experimental Design
5.2.1 Relevance of Education Step
5.2.2 Influence of Infeasibility
5.2.3 Restrictions Due to High-Level of Factor 1
5.2.4 Summarizing the Restrictions
5.2.5 Distance and Time Matrix
5.2.6 Demand and Service Time Vector
5.2.7 Vehicle Capacity and Maximum Route Duration
5.2.8 Transport and Handling Costs
5.2.9 Stopping Criteria
6 Results
6.1 Parameter Specification
6.1.1 Fluctuating and
6.1.2 Fluctuating nc
6.2 Results 2 × 2 × 2 Factorial Experiment
6.3 Comparing Availability of Multiple Sources with Only a Distribution Center Available
7 Conclusion
8 Further Recommendations
Bibliography
Appendices
Appendix I
List of Figures
1.1 Direct Distribution
1.2 Indirect Distribution
2.1 Possible Routing Scheme, Direct Distribution
2.2 Possible Routing Scheme, Both Direct and Indirect Distribution
4.1 Flowchart General Scheme Algorithm
6.1 Input set 1, execution 1
6.7 Parameter set 1: = 9, = 28
6.8 Parameter set 2: = 15, = 46
6.9 Parameter set 3: = 36, = 109
1 Input set 4, execution 1

List of Tables
5.1 Ranges Parameters (Vidal et al., 2011)
5.2 Calibration Results (Vidal et al., 2011)
5.3 Stopping Conditions (Vidal et al., 2011)
6.1 Possible values of nclose for different nc and
6.2 Confidence Intervals
Chapter 1
Introduction
1.1 Situation
Nowadays, almost all retailers not only sell their products in physical stores, but also on the
internet. This relatively new channel entails several changes for both customers and retail
companies. One of these changes concerns the distribution of products to customers. The
most classical form of distribution is a product delivery to a physical store, where the customer
buys it. However, it is increasingly common that (internet) purchases are delivered at home,
especially when there is an additional service included, such as the installation of a product.
1.2 Motivation
As the home delivery of products is increasingly common, it is interesting to investigate the
process of distributing products to customers and to examine cost saving possibilities. Retail-
ers have the possibility to outsource the distribution (to, in the Netherlands, PostNL or DHL for
example) or to execute it themselves. When focusing on retailers that execute the distribution
themselves, the most common manner in the Netherlands is direct distribution from a central
distribution center (DC) to customers throughout the whole country with vans or trucks. This
situation is presented in Figure 1.1. An alternative way to supply customers is through
indirect distribution: products are first transported with a larger truck to a store, hub or
depot and are then delivered to the customers' homes from this intermediate point (IP)
with smaller vehicles. This situation is presented in Figure 1.2. Indirect distribution might
result in cost savings compared to direct distribution. For instance, a retailer with physical
stores throughout the country could use these stores as intermediate points for indirect
distribution. The stores receive their usual store supply from the distribution center by
large trucks. When the remaining capacity within these trucks is used to transport products
meant for home deliveries to the stores, i.e. the intermediate points, the products are
brought closer to the end customers at hardly any extra cost.
Notice that a combination of direct and indirect distribution is possible as well: certain home
deliveries are supplied directly from a distribution center while others go through intermediate
points.
Obviously, there are some issues that need to be dealt with, for example the (remaining) ca-
pacity of the vehicles and the available storage capacity at the intermediate points. To deal
with these issues, some assumptions need to be made, which will be discussed after presenting
the goal of this thesis.
Figure 1.1: Direct Distribution
Figure 1.2: Indirect Distribution
In the literature, the problem of determining how to distribute products when there are mul-
tiple possible sources (distribution centers and intermediate points; stores, depots, hubs) is
known as the Multi-Depot Vehicle Routing Problem (MDVRP). This is an extension of the clas-
sical Vehicle Routing Problem (VRP). The classical VRP involves the determination of a set
of routes that minimizes total cost or travel distance, while satisfying certain constraints and
serving all customers from a single source. The MDVRP minimizes total cost or travel distance
while considering multiple sources serving the customers.
The difference between the regular MDVRP and our situation is that the sources are supposed
to be unrelated in the regular MDVRP while in our situation they are not. The intermediate
points in our situation of the MDVRP are supplied with stock from a distribution center which
is one of the sources as well. Using the remaining capacity of the vehicles to supply the interme-
diate points with products meant for home deliveries, results in a more complex cost structure
and hence, the objective function of minimizing costs is more complex.
The regular MDVRP is a well known problem in the literature and is still actively studied.
This literature will be discussed later on and the application of the literature to
cover our situation of the MDVRP will be presented as well.
1.3 Goal and Scope

The goal of this research is to solve the variant of the MDVRP with related sources. We want to
determine a near optimal routing scheme for the daily distribution of all products with home
delivery service when assuming that every home delivery can either come from a central dis-
tribution center or from an intermediate store, hub or depot. This goal can be summarized as
follows:
Determine a model that specifies how to distribute home delivery orders, when multiple sources
supplying these orders are available, while minimizing the total cost.
Several components can be included in the total cost, such as transport costs, handling costs
and inventory costs. Which components to include depends on the assumptions made.
The following assumptions give a clear view of the situation under consideration, i.e. they
define the scope of this research:
• Home delivery from an intermediate point is always possible, i.e. the storage capacity
  and inventory level at the intermediate points are always large enough.
• The capacity of the trucks used to supply the intermediate points is unrestricted, i.e. there
  is always enough remaining capacity in the trucks to distribute home delivery orders to
  the intermediate points.
• Two types of costs corresponding to the distribution of the products are included: transport
  costs, which are presented as a cost per kilometer, and handling costs at the intermediate
  points, which are presented as a fixed cost per unit of volume. No costs are included for
  the transport of products from the distribution center to an intermediate point, since
  existing transport lanes are used and the capacity of the trucks on these lanes is assumed
  to be unrestricted. Hence, the cost per kilometer only concerns the actual routes starting
  and ending at the sources.
• Orders are handled on a daily basis. Consolidation of orders over time is not considered.
• Return flows and return handling costs are outside the scope of this research.
• Delivery and departure time windows are outside the scope of this research.
• Vehicles start and end their route at the same source (a distribution center or intermediate
  point).
• The number of vehicles available at the sources is unrestricted. When the maximum
  vehicle capacity is reached, a new route is started and a new vehicle for the new route is
  assumed to be available.
As the scope of the research is defined, let us now discuss the approach to tackle the research
question.
1.4 Methodology
To solve the special situation of the MDVRP, we present a hybrid genetic algorithm. A hybrid
algorithm combines several heuristics to exploit their best properties. The genetic algorithm
concept, introduced by Holland (1975), is based on mimicking an evolutionary process. The
procedure starts with an initial population of candidate solutions where each solution in the
population is called a chromosome. In the MDVRP a chromosome represents a routing scheme.
Over time the population evolves as iteration by iteration the chromosomes evolve. These iter-
ations are called generations. During each iteration, i.e. during each generation, every chromo-
some in the population is evaluated by its fitness measure. This fitness measure represents the
quality of the chromosome. In the MDVRP it represents the quality of the routing scheme. This
quality might be expressed in the costs related to the routing scheme, but as we will discuss
later on, several other aspects may be considered as measurements as well. Chromosomes are
selected to execute genetic operations such as crossover and mutation according to their fitness.
The fitter the chromosome, the higher the probability of being selected, which enhances evolution.
In the crossover phase two chromosomes, called parents, are combined to generate offspring,
i.e. to generate a new routing scheme in the MDVRP. The mutation phase is concerned with
mutating the offspring to maintain a diverse population. Maintaining a diverse population
helps to avoid being trapped in a local optimum which is one of the nice properties of genetic
algorithms. When a prespecified number of offspring is created, a new generation is obtained
by selecting the best parents and offspring according to their fitness and removing the other
chromosomes. After a prespecified number of generations is executed the algorithm returns
the best chromosome. This chromosome, hopefully, represents a near optimal solution, i.e. a
near optimal routing scheme.
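
The generational loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MATLAB implementation used in this thesis: the names `evolve`, `path_length`, `order_crossover` and `swap_mutation` are hypothetical, the toy chromosome is a simple permutation rather than a routing scheme, and the inverse-cost selection weights are one possible choice.

```python
import random

def evolve(population, cost, crossover, mutate, n_offspring, n_generations, rng):
    """Generic generational loop; a lower cost means a fitter chromosome."""
    size = len(population)
    for _generation in range(n_generations):
        # Selection weights: inverse cost is an assumed, illustrative choice.
        weights = [1.0 / (1.0 + cost(c)) for c in population]
        offspring = []
        for _child in range(n_offspring):
            parent_a, parent_b = rng.choices(population, weights=weights, k=2)
            offspring.append(mutate(crossover(parent_a, parent_b, rng), rng))
        # New generation: keep the best chromosomes among parents and offspring.
        population = sorted(population + offspring, key=cost)[:size]
    return population[0]

# Toy problem (hypothetical): order the numbers 0..7 so that neighbours are close.
def path_length(c):
    return sum(abs(a - b) for a, b in zip(c, c[1:]))

def order_crossover(p1, p2, rng):
    """Copy a slice of p1 and fill the remaining positions in the order of p2."""
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    middle = p1[i:j]
    rest = [g for g in p2 if g not in middle]
    return rest[:i] + middle + rest[i:]

def swap_mutation(c, rng, rate=0.2):
    """With probability `rate`, swap two positions of a copy of the chromosome."""
    c = list(c)
    if rng.random() < rate:
        a, b = rng.sample(range(len(c)), 2)
        c[a], c[b] = c[b], c[a]
    return c

rng = random.Random(0)
initial = [rng.sample(range(8), 8) for _ in range(10)]
best = evolve(initial, path_length, order_crossover, swap_mutation,
              n_offspring=10, n_generations=30, rng=rng)
```

Because the best parents always survive into the next generation, the cost of the returned chromosome can never be worse than that of the best chromosome in the initial population.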
The genetic algorithm we propose to solve the MDVRP determines the routing and assignment
scheme for inserted order data. The assignment scheme presents the assignment of the home
deliveries to the sources and the routing scheme presents the actual routes. As the number of
vehicles available at the sources is assumed to be unrestricted, the hybrid genetic algorithm
also presents the required number of vehicles at each source.
The proposed algorithm is based on the heuristic introduced by Vidal et al. (2011). This
so-called Hybrid Genetic Search with Adaptive Diversity Control (HGSADC) solves three
extensions of the VRP: the MDVRP, the periodic VRP (PVRP) and the multi-depot periodic VRP
(MD-PVRP). Vidal et al. (2011) demonstrate that for all currently available benchmark instances
for the three problem classes, HGSADC identifies either the best known solutions, including the
optimal ones, or new best solutions. There is no recent research known to improve the method
proposed by Vidal et al. (2011). Therefore, this is the method we use to develop our algorithm.
The main difference between the heuristic proposed by Vidal et al. (2011) and our heuristic
is associated with the relatedness of the sources in our situation due to the distribution center,
being a source itself, supplying the other sources. To deal with this relatedness, the handling
costs at the intermediate points are included in our algorithm. This implies that home deliv-
eries supplied directly from a distribution center are only concerned with a cost related to the
actual transportation, while the home deliveries supplied from an intermediate point are not
only concerned with a cost related to the transportation but also with a cost related to the extra
handling required at the intermediate point.
Another difference between the two heuristics concerns the routing algorithm that is used.
Vidal et al. (2011) apply the giant-tour representation and the Split algorithm as Prins (2004)
did. We apply a routing algorithm that is easier to build within the used software (MATLAB),
namely the nearest neighbour algorithm. As this algorithm is less advanced, the outputted
routes are less advanced as well. This problem is overcome by the route improvement phase
that is included in the so called education phase of the algorithm.
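
The nearest neighbour idea can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the MATLAB code used in this thesis: the function name, the data layout (vertex 0 as the single source) and the small data set are hypothetical, and a route is closed as soon as the nearest unvisited customer would violate the vehicle capacity or the maximum route duration.

```python
def nearest_neighbour_routes(source, customers, dist, time, demand, service, Q, T):
    """Build routes from one source by repeatedly visiting the nearest unvisited customer."""
    unvisited = set(customers)
    routes = []
    while unvisited:
        route, load, duration, here = [source], 0.0, 0.0, source
        while unvisited:
            # Nearest unvisited customer; ties are broken by the lowest vertex number.
            nxt = min(unvisited, key=lambda j: (dist[here][j], j))
            new_load = load + demand[nxt]
            new_duration = duration + time[here][nxt] + service[nxt]
            # The vehicle must still be able to return to the source within T.
            if new_load > Q or new_duration + time[nxt][source] > T:
                break
            route.append(nxt)
            unvisited.remove(nxt)
            load, duration, here = new_load, new_duration, nxt
        if len(route) == 1:
            raise ValueError(f"customer {nxt} cannot be served from source {source}")
        routes.append(route + [source])
    return routes

# Hypothetical data: vertex 0 is the source, vertices 1..3 are customers.
dist = [[0, 2, 4, 6], [2, 0, 3, 5], [4, 3, 0, 2], [6, 5, 2, 0]]          # kilometers
time = [[0, 20, 40, 60], [20, 0, 30, 50], [40, 30, 0, 20], [60, 50, 20, 0]]  # minutes
demand = [0, 3, 4, 2]
service = [0, 10, 10, 10]

routes = nearest_neighbour_routes(0, [1, 2, 3], dist, time, demand, service, Q=8, T=180)
```

In this small example the third customer no longer fits in the first vehicle, so a second route is opened for it.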
Two other differences between the heuristic presented by Vidal et al. (2011) and ours are
related to the parent selection and the crossover procedure. The above detailed description of
the concept of genetic algorithms is provided to be able to understand these two differences.
Vidal et al. (2011) randomly select the parents that produce offspring, while we apply the
roulette wheel selection operator (Goldberg, 1989). This roulette wheel approach takes the
quality of the chromosomes into account, i.e. the fitness measure. It selects the best
chromosomes with the highest probability and the worst with the lowest. For the crossover
procedure Vidal et al. (2011) propose periodic crossover with insertions, which is related to
periodic routing problems. As we do not consider a periodic problem, we do not apply this
approach. Instead, we randomly combine the parents to create offspring.
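
Roulette wheel selection and a random combination of two parents can be sketched as follows. The sketch makes several assumptions that are not taken from the thesis: costs are turned into selection weights by taking their inverse, a chromosome is represented as a list that stores the assigned source for every customer, and "randomly combining" is read as a uniform crossover on that list. All function names are hypothetical.

```python
import random

def fitness_from_costs(costs):
    """Turn costs (lower is better) into positive weights (higher is better)."""
    return [1.0 / c for c in costs]  # assumed transformation, for illustration only

def roulette_wheel_select(fitness, rng):
    """Return index i with probability fitness[i] / sum(fitness)."""
    pick = rng.uniform(0, sum(fitness))
    cumulative = 0.0
    for i, f in enumerate(fitness):
        cumulative += f
        if pick <= cumulative:
            return i
    return len(fitness) - 1  # guard against floating-point round-off

def uniform_crossover(parent_a, parent_b, rng):
    """For every customer, copy the assigned source from a randomly chosen parent."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]

rng = random.Random(42)
costs = [63.0, 69.0, 90.0]           # hypothetical costs of three chromosomes
fitness = fitness_from_costs(costs)
counts = [0, 0, 0]
for _ in range(10_000):
    counts[roulette_wheel_select(fitness, rng)] += 1

parent_a = [1, 1, 1, 1]              # every customer assigned to source 1
parent_b = [2, 1, 1, 3]              # customers assigned to sources 2, 1, 1 and 3
child = uniform_crossover(parent_a, parent_b, rng)
```

Over many draws the cheapest chromosome is selected most often and the most expensive one least often, while every chromosome keeps a positive selection probability.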
The algorithm we propose is tested by a 2 × 2 × 2 factorial experiment. Hence, eight different
experiments are investigated to develop a clear overview of the capabilities of the algorithm.
The three factors correspond to the total number of locations included (source locations plus
customer locations), the demand of the customers and the required service time at the customer
locations.
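
The eight experiments follow from crossing the three two-level factors. A minimal Python sketch of this enumeration is given below; the level labels "low" and "high" and the variable names are illustrative assumptions.

```python
from itertools import product

factors = ("number of locations", "customer demand", "service time")
levels = ("low", "high")  # assumed labels for the two levels of each factor

# Every combination of one level per factor gives one experiment.
design = [dict(zip(factors, combo)) for combo in product(levels, repeat=len(factors))]

for number, experiment in enumerate(design, start=1):
    print(number, experiment)
```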
1.5 Structure of Remainder

The next chapter discusses the problem formulation with a formal statement of the MDVRP.
Chapter 3 exhibits the Literature Review in which an overview is presented of heuristics de-
veloped to solve the MDVRP, including the heuristic of Vidal et al. (2011). In Chapter 4 our
hybrid genetic algorithm is described. Chapter 5 discusses the setup of the 2 × 2 × 2 factorial
experiment in more detail. The results corresponding to the testing of the algorithm are ana-
lyzed in the sixth chapter and conclusions upon these results are drawn in Chapter 7. Finally,
in Chapter 8 recommendations for further research are discussed.
Chapter 2
Problem Formulation
As previously mentioned, the situation we consider can be described by the Multi-Depot Vehi-
cle Routing Problem (MDVRP) which is an extension of the classical Vehicle Routing Problem
(VRP). The VRP is a well known problem in several fields of study of operations research due
to its application in many industries, such as garbage collection, goods distribution and mail
delivery (Liong et al., 2008). Dantzig and Ramser (1959) defined the VRP as a generalization of
the Traveling Salesman Problem (TSP). The TSP is concerned with the task of determining the
shortest route or least cost route to visit a given set of customer locations exactly once, starting
and ending at the same depot. Dantzig and Ramser (1959) extended the TSP by specifying de-
mand at each customer location and imposing a capacity constraint on the vehicles supplying
these locations. Later on numerous extensions of the VRP have been investigated, including
the MDVRP.
Before presenting the formal statement of the MDVRP, let us present an overview of the vari-
ables used in this section and the rest of this thesis:
$n$                                    The total number of customers.
$s$                                    The total number of sources.
$V_C$                                  The set representing the customers.
$V_D$                                  The set representing the sources.
$V = V_C \cup V_D$                     The set representing all locations together.
$q_j$, $j \in V_C$                     The demand of customer $j$, in a prespecified unit of volume (m², m³ or pallets).
$\tau_j$, $j \in V_C$                  The service time required by customer $j$, in minutes.
$t_{ij}$, $i, j \in V$                 The time required to travel from $i$ to $j$, in minutes.
$d_{ij}$, $i, j \in V$                 The travel distance from $i$ to $j$, in kilometers.
$Q$                                    The vehicle capacity, in a prespecified unit of volume (m², m³ or pallets).
$T$                                    The maximum route duration, in minutes.
$c_1$                                  The transport costs per kilometer, in euros.
$c_2$                                  The handling costs per prespecified unit of volume (m², m³ or pallets), in euros.
$R$                                    The set representing the separate routes of one routing scheme.
$x_{ijk}$, $i, j \in V$, $k \in R$     The binary variable describing whether arc $(i, j)$ is included in route $k$.
$z_{ij}$, $i, j \in V$                 The binary variable describing whether customer $j$ is assigned to source $i$.
Now, let us present the mathematical representation of the MDVRP. Let $G = (V, A)$ represent a
complete graph where the set $V$ represents the vertices and $A$ the arcs. Let $V = V_D \cup V_C$ where
$V_D$ corresponds to the set containing the sources and $V_C$ to the set representing the customers.
Define the number of sources to be $s$ and the number of customers to be $n$, i.e. $V_D = \{1, \ldots, s\}$,
$V_C = \{s + 1, \ldots, s + n\}$ and hence, $V = \{1, \ldots, s, s + 1, \ldots, s + n\}$. Let the demand and service
duration of customer $j \in V_C$ be defined by $q_j$ and $\tau_j$ respectively. Let $Q$ define the capacity
of the vehicles that distribute the products to the customers and let $T$ represent the maximum
amount of time a route can last. The arcs $(i, j) \in A$ indicate the travel possibility from vertex $i$
to $j$, $i, j \in V$. Let the travel distance from $i$ to $j$, $i, j \in V$, be defined by $d_{ij}$. The time required to
travel from $i$ to $j$ is represented by $t_{ij}$, $i, j \in V$. The total duration of a route is the time required
for traveling plus the required service time. Solving the MDVRP involves the determination of
a set of routes that minimizes the total duration while satisfying the route duration and vehicle
capacity constraint and serving all customers. Another well known objective is minimizing
total costs instead of time.
In our special situation of the MDVRP we deal with the objective of minimizing total costs.
As discussed in the previous chapter, we include two types of costs: transport costs (a cost per
kilometer) and handling costs at intermediate points (a cost per unit of volume). Let the cost
per kilometer and the cost per unit of volume be defined by c1 and c2 respectively. To be able to
define the mathematical representation of our situation of the MDVRP the following variables
are required as well as the previously defined variables:
• Let the set $R$ represent the separate routes of one routing scheme.
• Let binary variable $x_{ijk}$, $i, j \in V$ and $k \in R$, describe whether arc $(i, j)$ is included in route
  $k$ in the routing scheme; hence, $x_{ijk}$ is equal to one if vertex $j$ is entered after vertex $i$ in
  route $k$ and zero otherwise.
• Let binary variable $z_{ij}$, $i \in V_D$ and $j \in V_C$, describe whether customer $j$ is assigned to
  source $i$; i.e. $z_{ij}$ is equal to one if customer $j$ is assigned to source $i$ and zero otherwise.
Now, the costs can be described as follows:

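\[ \text{transport costs} = \sum_{i \in V} \sum_{j \in V} \sum_{k \in R} x_{ijk} \, d_{ij} \, c_1 \tag{2.1} \]
\[ \text{handling costs} = \sum_{i \in V_D} \sum_{j \in V_C} z_{ij} \, q_j \, c_2 \tag{2.2} \]
\[ \text{total costs} = (2.1) + (2.2) \]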
Using this function of the total costs in the objective of minimizing these costs, results in the
following mathematical representation of our MDVRP, which is again based on Vidal et al.
(2011):
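\begin{align}
\min \quad & \sum_{i \in V} \sum_{j \in V} \sum_{k \in R} x_{ijk} \, d_{ij} \, c_1 + \sum_{i \in V_D} \sum_{j \in V_C} z_{ij} \, q_j \, c_2 \notag \\
\text{s.t.} \quad & \sum_{i \in V} \sum_{j \in V} x_{ijk} \, (t_{ij} + \tau_j) \le T && \forall k \in R \tag{2.3} \\
& \sum_{i \in V} \sum_{j \in V_C} x_{ijk} \, q_j \le Q && \forall k \in R \tag{2.4} \\
& \sum_{i \in V_D} \sum_{k \in R} x_{ijk} = 1 && \forall j \in V_C \tag{2.5} \\
& x_{ijk} \in \{0, 1\} && \forall k \in R \text{ and } \forall i, j \in V \tag{2.6} \\
& z_{ij} \in \{0, 1\} && \forall i, j \in V \tag{2.7}
\end{align}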
Constraint (2.3) corresponds to the route duration constraint, i.e. every route $k \in R$ can last
at most $T$ minutes. Hence, $\forall k \in R$ we must have $\tau(k) \le T$, where $\tau(k)$ represents the total
route duration of route $k$. This total route duration is the sum of the required travel time and
the service duration, i.e. $\tau(k) = \sum_{i \in V} \sum_{j \in V} x_{ijk} (t_{ij} + \tau_j)$. The second constraint represents
the vehicle capacity constraint, i.e. every route $k \in R$ can have a maximum load equal to $Q$.
Constraint (2.5) indicates that every customer $j \in V_C$ is visited exactly once. The fourth and
fifth constraint define the binary variables $x_{ijk}$ and $z_{ij}$ respectively.
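
The route duration, vehicle capacity and visiting constraints can be checked with a few lines of Python. The sketch below is illustrative only: it represents a route as a list of vertices instead of the binary variables $x_{ijk}$, the function names are hypothetical, and the small data set (vertex 0 as source, customers 1 and 2) is made up for the example.

```python
def route_duration(route, time, service):
    """Travel time over consecutive arcs plus service time at every vertex entered."""
    return sum(time[i][j] + service.get(j, 0.0) for i, j in zip(route, route[1:]))

def route_load(route, demand):
    """Total demand of the customers on the route (sources have no demand)."""
    return sum(demand.get(j, 0.0) for j in route)

def is_feasible(routes, customers, time, service, demand, Q, T):
    visits = [j for r in routes for j in r if j in customers]
    return (
        sorted(visits) == sorted(customers)                             # (2.5)
        and all(route_duration(r, time, service) <= T for r in routes)  # (2.3)
        and all(route_load(r, demand) <= Q for r in routes)             # (2.4)
    )

# Hypothetical data: vertex 0 is the source, vertices 1 and 2 are customers.
time = [[0, 20, 40], [20, 0, 30], [40, 30, 0]]  # minutes
service = {1: 10, 2: 15}                        # minutes
demand = {1: 3, 2: 4}
routes = [[0, 1, 2, 0]]

feasible = is_feasible(routes, [1, 2], time, service, demand, Q=8, T=120)
too_long = is_feasible(routes, [1, 2], time, service, demand, Q=8, T=100)
```

With these numbers the single route lasts 115 minutes and carries 7 units, so it is feasible for a maximum route duration of 120 minutes but not for 100 minutes.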
To be able to give more insight into our situation of the MDVRP, consider the following fic-
titious example (i.e. no real data is used):
• Assume the existence of one distribution center (DC), defined by 1;
• Assume the existence of two physical stores, which can be used as intermediate points
  for indirect distribution. Let these intermediate points be defined by 2 and 3;
• Assume the existence of four customers, represented by 4, 5, 6, 7;
From this it follows that we have $V_D = \{1, 2, 3\}$, $V_C = \{4, 5, 6, 7\}$ and hence, $V = V_D \cup V_C =
\{1, \ldots, 7\}$.
Define the following variables for the situation with three sources and four customers, where
again no real data is used, i.e. it is a fictitious example:
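• Let the following matrix define the travel distances from $i$ to $j$, $i, j \in V$, in kilometers:
\[
d_{ij} = \begin{pmatrix}
0 & 3 & 5 & 6 & 2 & 5 & 7.5 \\
3 & 0 & 9 & 3 & 4.5 & 7 & 11 \\
5 & 9 & 0 & 11.5 & 6 & 7 & 3 \\
6 & 3 & 11.5 & 0 & 8 & 6.5 & 12.5 \\
2 & 4.5 & 6 & 8 & 0 & 8 & 9 \\
5 & 7 & 7 & 6.5 & 8 & 0 & 6 \\
7.5 & 11 & 3 & 12.5 & 9 & 6 & 0
\end{pmatrix};
\]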
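• Let the following matrix define the travel time required to travel from $i$ to $j$, $i, j \in V$, in
  minutes:
\[
t_{ij} = \begin{pmatrix}
0 & 18 & 30 & 36 & 12 & 30 & 45 \\
18 & 0 & 54 & 18 & 27 & 42 & 66 \\
30 & 54 & 0 & 69 & 36 & 42 & 18 \\
36 & 18 & 69 & 0 & 48 & 39 & 75 \\
12 & 27 & 36 & 48 & 0 & 48 & 54 \\
30 & 42 & 42 & 39 & 48 & 0 & 36 \\
45 & 66 & 18 & 75 & 54 & 36 & 0
\end{pmatrix};
\]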
• Let the following vector define the demand of customer $j \in V_C$, in m²:
\[ q_j = \begin{pmatrix} 3 & 4 & 2.5 & 3 \end{pmatrix}; \]
• Let the following vector define the required service time of customer $j \in V_C$, in minutes:
\[ \tau_j = \begin{pmatrix} 30 & 40 & 25 & 30 \end{pmatrix}; \]
• Let the maximum amount of time a route can last be equal to $T = 180$ minutes;
• Let the capacity per vehicle be equal to $Q = 8$ m²;
• Let the transport costs be $c_1 = 2$ euro per kilometer;
• Let the handling costs at the intermediate points be $c_2 = 1.50$ euro per m².
As previously mentioned, the most common way of supplying customers is through direct dis-
tribution, i.e. assigning every customer to the DC. A possible routing scheme corresponding to
this assignment scheme is presented in Figure 2.1. This routing scheme consists of two routes:
1) 1 - 4 - 5 - 1 and 2) 1 - 6 - 7 - 1. The first route is 16 kilometers long and lasts 166 minutes. The
second route is 18.5 kilometers long and also lasts 166 minutes. The vehicle loads corresponding
to the routes are 7 m² and 5.5 m² respectively. The costs corresponding to this routing scheme
are equal to 2 × (16 + 18.5) = 69 euros. Note that no handling costs are included since there are no
deliveries from intermediate points.
Now impose the possibility of indirect distribution. Notice that supplying customer 4 from
intermediate point 2 and customer 7 from intermediate point 3 would be a reasonable option
to consider. A possible routing scheme corresponding to this situation is presented in
Figure 2.2. This routing scheme consists of three routes: 1) 1 - 5 - 6 - 1, 2) 2 - 4 - 2 and 3) 3 -
7 - 3. The first route is 15 kilometers long and the second and third routes are both 6
kilometers long. The route durations are 155, 66 and 66 minutes respectively. The vehicle loads
are 6.5, 3 and 3 m² respectively. The total costs corresponding to this routing scheme are
2 × (15 + 6 + 6) + 1.50 × (3 + 3) = 63 euros, of which 54 euros are related to the
actual transport and 9 euros to the handling of the products at the intermediate points.
Comparing the two situations indicates that the possibility of indirect distribution results in
cost savings.
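
The figures of this fictitious example can be recomputed with a short Python sketch. The helper names are hypothetical, routes are written as lists of vertices, and handling costs are charged only for routes that start at an intermediate point, which is the interpretation used in the example above (deliveries from the DC carry no handling costs).

```python
# Distance (km) and travel time (minutes) between vertices 1..7; the distances are symmetric.
D = [
    [0, 3, 5, 6, 2, 5, 7.5],
    [3, 0, 9, 3, 4.5, 7, 11],
    [5, 9, 0, 11.5, 6, 7, 3],
    [6, 3, 11.5, 0, 8, 6.5, 12.5],
    [2, 4.5, 6, 8, 0, 8, 9],
    [5, 7, 7, 6.5, 8, 0, 6],
    [7.5, 11, 3, 12.5, 9, 6, 0],
]
TIME = [
    [0, 18, 30, 36, 12, 30, 45],
    [18, 0, 54, 18, 27, 42, 66],
    [30, 54, 0, 69, 36, 42, 18],
    [36, 18, 69, 0, 48, 39, 75],
    [12, 27, 36, 48, 0, 48, 54],
    [30, 42, 42, 39, 48, 0, 36],
    [45, 66, 18, 75, 54, 36, 0],
]
q = {4: 3, 5: 4, 6: 2.5, 7: 3}        # demand per customer, in m2
tau = {4: 30, 5: 40, 6: 25, 7: 30}    # service time per customer, in minutes
c1, c2 = 2.0, 1.5                     # euro per km, euro per m2
DC = 1                                # vertex number of the distribution center

def arcs(route):
    return list(zip(route, route[1:]))

def length(route):
    return sum(D[i - 1][j - 1] for i, j in arcs(route))

def duration(route):
    return sum(TIME[i - 1][j - 1] + tau.get(j, 0) for i, j in arcs(route))

def load(route):
    return sum(q.get(j, 0) for j in route)

def total_cost(routes):
    transport = c1 * sum(length(r) for r in routes)
    # Assumption: handling is only charged for deliveries from an intermediate point.
    handling = c2 * sum(load(r) for r in routes if r[0] != DC)
    return transport + handling

direct = [[1, 4, 5, 1], [1, 6, 7, 1]]
mixed = [[1, 5, 6, 1], [2, 4, 2], [3, 7, 3]]

for name, scheme in (("direct", direct), ("direct and indirect", mixed)):
    print(name, [length(r) for r in scheme], [duration(r) for r in scheme],
          [load(r) for r in scheme], total_cost(scheme))
```

Both schemes respect the vehicle capacity of 8 m² and the maximum route duration of 180 minutes, so the comparison of their costs is a comparison of two feasible solutions.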
Figure 2.1: Possible Routing Scheme, Direct Distribution
Figure 2.2: Possible Routing Scheme, Both Direct and Indirect Distribution
Notice that the routing schemes presented above are two examples of numerous possibilities
to solve the considered situation. This situation is quite abstract and far from reality as no real
data is used and hence, it is not a surprise that cost savings occur. Although the situation is
unrealistic, it illustrates our problem well.
In this thesis a model is presented that solves this problem, i.e. determines a near optimal
routing scheme. Different situations are investigated to examine cost saving possibilities and
to create an overview of the capabilities of the algorithm.
Chapter 3
Literature Review
Numerous papers have been written on the VRP and several extensions of the problem have
been discussed over the years, including the MDVRP. Other extensions include the VRP with
time windows, the VRP with backhauls, the VRP with pick-ups and deliveries and the VRP
with multiple use of vehicles (Gendreau et al., 2008). Extensions of the VRP are still an
active area of research and are becoming increasingly broad: the aforementioned extensions
are combined and various constraints are added.
Exact algorithms as well as heuristics have been developed to solve the VRP and its numerous
extensions. The exact algorithms are time consuming and hardly ever applicable to problems
concerned with more than 50 customers (Liong et al., 2008). Therefore when considering real-
istic situations heuristics are usually more applicable.
An extremely broad review would be required to discuss the entire range of exact algorithms
and heuristics corresponding to the classical VRP and its extensions. Since we are dealing with
the MDVRP, the remainder of this literature review focuses on the heuristics developed to solve
this problem.
The first heuristics to solve the MDVRP were developed, among others, by Tillman and Cain
(1972) and Wren and Holliday (1972). Tillman and Cain (1972) introduced a heuristic based
on the savings algorithm proposed by Clarke and Wright (1964) which implies that combining
routes might result in cost savings. This heuristic, as many other heuristics developed in the
early literature, does not include an improvement stage on the initial solution to obtain a bet-
ter solution (Zhang et al., 2011). The heuristic introduced by Wren and Holliday (1972) does
include an improvement stage; in the first stage an initial feasible solution is obtained and in
the second stage several refinement heuristics are applied to this solution. This procedure of
obtaining an initial solution first and then improving it, is one of the two common structures of
the heuristics developed to solve the MDVRP (Ho et al., 2008).
The second well known structure involves decomposing the MDVRP into subproblems first
and then solving these subproblems separately before connecting them iteratively. Gillett and
Johnson (1976) introduced an algorithm based on this structure, also known as the cluster first
- route second principle. This principle is applied in our algorithm as well. It applies the
two-stage solution technique by assigning the customers to the depots in the first stage and
determining the routes in the second stage. Hence, the MDVRP is divided into single-depot
problems in the first stage and in the second stage the optimal routes corresponding to these
subproblems are obtained. Golden et al. (1977) also developed an algorithm using the cluster
first - route second principle. This algorithm clusters customers in the first stage according to
the distances to the depots closest and second closest to the customers. Raft (1982) developed
an algorithm for which the problem is decomposed in five, instead of two, stages. The five
stages are solved separately and then connected iteratively.
Chao et al. (1993) applied the cluster first - route second principle presented by Golden et al.
(1977) to obtain an initial solution and improve this solution by changing the depot assignments
of the customers. Hence, they combine the two commonly applied structures in heuristics solv-
ing the MDVRP. In the improvement procedure Chao et al. (1993) applied the record-to-record
approach (Dueck, 1993). To explain this record-to-record approach, define the best solution
obtained so far as $S$, an alternative solution as $S'$, the total distance corresponding to
solution $S$ as the record $R$, and the deviation $D$ as a preset percentage of $R$. The
alternative solution $S'$ is selected as the new solution if the objective value of $S'$ is
less than $R + D$. Hence, the record-to-record approach accepts limited deteriorations to avoid
being trapped in a local optimum and to eventually obtain a good solution.
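The acceptance rule described above can be illustrated with a minimal Python sketch. The function names and the way the deviation is passed (as a fraction of the record) are our own assumptions for illustration; they are not taken from Chao et al. (1993) or Dueck (1993).

```python
def rtr_accept(candidate_cost: float, record: float, deviation_pct: float) -> bool:
    """Return True if an alternative solution S' is accepted.

    record        -- R, the total distance of the best solution S found so far
    deviation_pct -- preset fraction of R that defines D = deviation_pct * R
                     (hypothetical parameter name)
    """
    deviation = deviation_pct * record
    return candidate_cost < record + deviation


def update_record(candidate_cost: float, record: float) -> float:
    """The record R only changes when a strictly better solution is found."""
    return min(candidate_cost, record)
```

With a record of 100 and a deviation of 5%, a candidate of cost 104 is accepted although it is worse than the record, while the record itself stays at 100.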
Tabu search algorithms for the MDVRP were first introduced by Renaud et al. (1996) and
Cordeau et al. (1997). These algorithms use memory structures to record the examined
solutions. A tabu search algorithm avoids revisiting recently examined solutions, as it marks
the visited solutions as tabu in its memory for a certain number of iterations (Glover, 1990).
During the 21st century research has primarily been dedicated to extensions of the MDVRP
but the classical MDVRP has been investigated as well. Pisinger and Ropke (2007) developed a
unified heuristic to solve five extensions of the classical VRP, including the MDVRP. This
heuristic uses the adaptive large neighbourhood search framework designed by Ropke and
Pisinger (2006). This framework expands and contracts the search for a better solution by
choosing adaptively among a number of heuristics that insert and remove customers in a route.
Ho et al. (2008) were the first to propose two hybrid genetic algorithms to solve the MDVRP.
The first algorithm generates the initial population randomly while the second algorithm uses
the savings algorithm and the nearest neighbour heuristic. The nearest neighbour heuristic
is a routing heuristic that solves the TSP. It searches for the nearest unvisited customer lo-
cation until every location is visited and then returns to the starting location. Mirabi et al.
(2010) presented three hybrid heuristics to solve the MDVRP where these heuristics combine
components from constructive heuristic search and improvement methods. With constructive
heuristic search a complete solution, i.e. all customers satisfied, is constructed by extending
an empty solution. The three hybrid heuristics presented by Mirabi et al. (2010) differ in the
improvement technique they apply.
As previously mentioned, the heuristic proposed in this thesis is based on the heuristic in-
troduced by Vidal et al. (2011) which is also a hybrid genetic algorithm. This so called Hybrid
Genetic Search with Adaptive Diversity Control (HGSADC) addresses three extensions of the
VRP including the MDVRP. Vidal et al. (2011) apply the giant-tour representation and the Split
algorithm as Prins (2004) did to obtain a routing scheme corresponding to the randomly
determined assignment scheme. The giant-tour representation implies that from each depot one
route starts, which serves all customers assigned to that depot and then returns to the depot.
The Split algorithm is used to find the optimal segmentation of the giant tour into separate routes.
One of the differences between regular genetic algorithms and the algorithm developed by
Vidal et al. (2011) is the control on the diversity of the population. Vidal et al. (2011) do not
include a mutation phase; instead they apply advanced population diversity management
schemes: they include feasible as well as infeasible solutions and refresh the population
when a prespecified maximum number of iterations without improving the best solution
obtained so far is reached. Also, a diversity contribution is included in the fitness measure used
to evaluate the chromosomes. This enhances the evolution of the population while preventing it
from converging prematurely.
Another difference is the inclusion of an education phase to further enhance the improvement
of chromosomes. In this phase local-search procedures educate the chromosomes by improv-
ing their routing schemes.
As previously mentioned, Vidal et al. (2011) demonstrated that their algorithm is the best per-
forming method to approach the MDVRP. Therefore, this is the method we use to develop our
own algorithm which is presented in the next chapter.
Chapter 4
Algorithm
The general scheme of our proposed meta-heuristic is presented below in text, General Scheme
Algorithm, as well as in a flowchart, Figure 4.1. The heuristic generates an initial population
of feasible and infeasible chromosomes which are separated in subpopulations. These chro-
mosomes represent a routing scheme corresponding to a randomly determined assignment
scheme. Next, the chromosomes undergo several route improvement techniques during the
so-called education step. Thereafter, the roulette wheel selection operation (Goldberg, 1989) is
applied to select parents, whose assignment schemes are randomly combined to produce offspring,
and a routing scheme corresponding to this newly created assignment scheme is determined. The
created offspring undergoes the education procedure and is included in the subpopulation
according to its feasibility.
General Scheme Algorithm
Step 0: While neither the computation time nor the number of iterations without improvement
has reached its maximum, do:
Step 1: Generate initial population.

1.1 While the maximum size of the initial population is not reached do:
1.2 Obtain an assignment scheme by randomly assigning every customer to a source.
1.3 Determine a routing scheme corresponding to the assignment scheme by applying the near-
est neighbour algorithm.
1.4 If the route duration constraint is not satisfied by one of the routes from the routing scheme
obtained in 1.3 then insert the chromosome into the infeasible subpopulation. Else insert the
chromosome into the feasible subpopulation.
1.5 End while.
Step 2: Evaluate every chromosome by its evaluation value.

Obtain the evaluation value for every chromosome in the population where this is a combina-
tion of the rank of the costs of the chromosome and the rank of the diversity contribution of the
chromosome.
Step 3: Educate the chromosomes.

The chromosomes are educated by applying some techniques to improve their routing schemes.
Step 4: Generate offspring, i.e. new chromosomes.

4.1 While the two subpopulations have not reached their maximum size and a prespecified
number of iterations allowed in a row without improving the best solution obtained so far is
not reached do:
4.2 Select parents according to the roulette wheel selection operation (Goldberg, 1989).
4.3 Apply random crossover to the assignment schemes of the parents to produce offspring.
4.4 Determine a routing scheme corresponding to the assignment scheme of the offspring by
applying the nearest neighbour algorithm.
4.5 If the route duration constraint is not satisfied by one of the routes from the routing scheme
obtained in 4.4 then insert the offspring into the infeasible subpopulation. Else insert the off-
spring into the feasible subpopulation.
4.6 Obtain the evaluation value of the offspring.
4.7 Educate the offspring.
4.8 If one of the two subpopulations reached its maximum size then select survivors and return
to 4.1.
4.9 If the prespecified number of iterations allowed in a row without improving the best solu-
tion obtained so far is reached then diversify the population by selecting survivors and return
to 1.1.
4.10 End while.
Step 5: End while.
Figure 4.1: Flowchart General Scheme Algorithm
In the following sections the algorithm is explained in more detail. The numbering of the
sections and subsections follows the numbering of the steps within the General Scheme
Algorithm. The first section considers the generation of the initial population. Second, the
evaluation process of the chromosomes is described. The third section is concerned with the
process corresponding to the education of the chromosomes. Next, the generation of offspring
is discussed. Last, the fifth section discusses the stopping criterion. As previously mentioned,
our heuristic is based on the method presented by Vidal et al. (2011). We will indicate the
differences between this heuristic and ours. When no difference is indicated, one can assume
similarity between the two methods.
4.1 Generation of Initial Population

4.1.1 Initial Population Size
In the first step of the algorithm, an initial population is generated containing $4\mu$ chromosomes,
where $\mu$ is the minimum size of both the feasible and the infeasible subpopulation. The set $Pop$
represents the chromosomes. At the end of the initialization step, at least one of the
subpopulations contains $\mu$ chromosomes. The other one might contain fewer than $\mu$ chromosomes
and is then incomplete, since $\mu$ is defined as the minimum size of both subpopulations. The
maximum size of each subpopulation is $\mu + \lambda$. A subpopulation reaching its maximum size
undergoes a procedure of selecting survivors. This survivor selection procedure is discussed
in subsection 4.4.8.
In the next subsections we explain the actual generation of the initial population.
4.1.2 Assignment Procedure

Chromosomes are created by randomly assigning the $n$ customer locations to a source $i$, where $i$
represents the distribution center, stores and/or depots, $i = 1, \ldots, s$. Hence, we are dealing with
$n$ customer locations and $s$ sources; therefore we can define the following sets of locations:
$V_D = \{1, \ldots, s\}$, $V_C = \{s + 1, \ldots, s + n\}$ and $V = V_D \cup V_C = \{1, \ldots, s, s + 1, \ldots, s + n\}$.
The assignment of customer location $j \in V_C$ to source $i \in V_D$ can be defined by $y_j = i$ and hence,
the assignment scheme of chromosome $r$ is represented by an array of $n$ assignments:
$[y_{s+1}, \ldots, y_{s+n}]$. This assignment scheme for chromosome $r$ can also be defined by the matrix
$Z(r) := (z_{ij}(r))$, where $z_{ij}(r) = 1$ if customer location $j \in V_C$ is assigned to source $i \in V_D$, and
zero otherwise. Both $y_j = i$ and $Z(r)$ are decision variables representing the same decision.
4.1.3 Routing Algorithm

The actual chromosomes not only represent an assignment scheme, but also the routes obtained
by applying the nearest neighbour heuristic to this assignment scheme. Here, our algorithm
differs from the one presented by Vidal et al. (2011) as they apply the giant-tour representation
and the Split algorithm as Prins (2004) did.
As previously described, the nearest neighbour heuristic is a routing heuristic for the
TSP. It searches for the nearest unvisited customer location until every location is visited and
then returns to the starting location. In our case the starting location is always one of the
sources $i \in V_D$. We apply an extension of the classical nearest neighbour heuristic by adding
a capacity constraint. From source $i \in V_D$ the nearest neighbour $j \in V_C$ for which $y_j = i$ holds
is located and visited. From this location again the nearest unvisited neighbour is located and
visited. This continues until the maximum vehicle capacity is reached or all customer locations
$j \in V_C$ for which $y_j = i$ holds are visited. When the maximum vehicle capacity is reached
while there remain unvisited customers, the route is ended by returning to $i$ and a new route is
started. After applying the nearest neighbour heuristic, the matrix $X(r) := (x_{ijk}(r))$ represents
the routing scheme corresponding to chromosome $r$. Hence, it describes which arcs $(i, j)$ are
driven in chromosome $r$ in route $k \in R(r)$, i.e. $x_{ijk}(r) = 1$ if arc $(i, j)$ is driven in route
$k \in R(r)$ and zero otherwise, where the set $R(r)$ represents the routes corresponding to chromosome $r$.
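The capacity-extended nearest neighbour procedure can be sketched in Python as follows. This is a minimal illustration under our own assumptions: a route is closed as soon as the nearest unvisited customer no longer fits in the vehicle, ties in distance are broken by the lowest index, and a customer whose demand exceeds the capacity is served alone so that the loop always terminates. The function and argument names are hypothetical.

```python
def nearest_neighbour_routes(source, customers, dist, demand, capacity):
    """Build routes for one source with a capacity-limited nearest neighbour rule.

    source    -- index of the source i
    customers -- indices of the customers j with y_j = i
    dist      -- dist[a][b] is the distance from location a to location b
    demand    -- demand[j] is the demand q_j of customer j
    capacity  -- vehicle capacity Q
    Returns a list of routes, each starting and ending at the source.
    """
    unvisited = set(customers)
    routes = []
    while unvisited:
        route, load, current = [source], 0.0, source
        while unvisited:
            # Nearest unvisited customer; ties broken by index (assumption).
            nxt = min(unvisited, key=lambda j: (dist[current][j], j))
            if load + demand[nxt] > capacity and len(route) > 1:
                break  # vehicle is full: close this route and start a new one
            route.append(nxt)
            load += demand[nxt]
            unvisited.remove(nxt)
            current = nxt
        route.append(source)  # return to the source
        routes.append(route)
    return routes
```

For a source at position 0 and customers at positions 1, 2 and 10 on a line, each with demand 1 and a vehicle capacity of 2, the sketch produces one route visiting the two nearby customers and a second route for the remote customer.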
4.1.4 Feasibility Check

Next, the feasibility of the chromosomes is investigated. When the route duration constraint
is not satisfied by one of the routes in the routing scheme of the chromosome, it is inserted
in the infeasible subpopulation. When every route satisfies the route duration constraint, the
chromosome is inserted in the feasible subpopulation. Mathematically, when $\tau(k) > T$
for at least one $k \in R(r)$, where $\tau(k)$ represents the route duration of route $k$ and $T$ the
maximum route duration, the solution is infeasible and hence it is inserted in the infeasible
subpopulation; otherwise it is inserted in the feasible subpopulation.
4.2 Evaluation
The chromosomes are evaluated by their fitness measure. From now on this fitness measure is
represented by the so called evaluation function. This function combines the rank of the costs of
the chromosome and the rank of the diversity contribution of the chromosome. As Vidal et al.
(2011) describe, evaluating the individuals not only by their objective value, i.e. their costs,
which is most common, but also by a diversity contribution has been shown to efficiently avoid
premature population convergence and to outperform traditional diversity management methods
with respect to the general behavior of the solution method. The
costs are composed of two components: total costs (i.e. transport and handling costs) and a
penalty cost corresponding to the feasibility of the chromosome. The two ranks correspond to
the position of the chromosome in the ordered lists of costs and diversity contributions.
Let us discuss the parts of the evaluation function separately, both in words and in
mathematics, in the following subsections.
Total Costs
For the total costs corresponding to the routing scheme of chromosome $r$ we distinguish
two costs: the transport costs and the handling costs. The transport costs are the costs related
to the actual driving from A to B. We define $c_1$ to be a fixed cost per kilometer. This cost per
kilometer takes into account multiple factors influencing the transport costs, such as fuel
costs, driver costs, insurance and fleet costs. The transport costs for chromosome $r$ are
defined as:
$$c_{\text{transport}}(r) = \sum_{k \in R(r)} \sum_{i \in V} \sum_{j \in V} x_{ijk}(r) \, d_{ij} \, c_1,$$
where $d_{ij}$ represents the distance of driving from $i$ to $j$. These distances are acquired using the
program XCargo.
Define the handling costs by $c_2$, a cost per unit of volume (per m², per m³ or per pallet). The
unit used depends on the investigated data. Let the demand at customer location $j \in V_C$
be defined by $q_j$, where $q_j$ is expressed in the same unit. Mathematically we have:
$$c_{\text{handling}}(r) = \sum_{i \in V_D} \sum_{j \in V_C} z_{ij}(r) \, q_j \, c_2.$$
The total costs for chromosome $r$ are represented by the sum of the transport and handling
costs, i.e. $c_{\text{total}}(r) = c_{\text{transport}}(r) + c_{\text{handling}}(r)$.
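As an illustration, the total costs can be computed directly from a list of routes and an assignment scheme. The sketch below assumes that each route is stored as a list of location indices that starts and ends at its source, and that the assignment scheme is a dictionary mapping each customer to its source; both representations and all function names are our own assumptions.

```python
def transport_cost(routes, dist, c1):
    """Sum d_ij * c1 over every arc (i, j) driven in any route."""
    return sum(dist[i][j] * c1
               for route in routes
               for i, j in zip(route, route[1:]))


def handling_cost(assignment, demand, c2):
    """Sum q_j * c2 over every customer j that is assigned to a source."""
    return sum(demand[j] * c2 for j in assignment)


def total_cost(routes, assignment, dist, demand, c1, c2):
    """Total costs: transport costs plus handling costs."""
    return transport_cost(routes, dist, c1) + handling_cost(assignment, demand, c2)
```

Because every customer is assigned to exactly one source, summing $z_{ij}(r) \, q_j$ over all sources and customers reduces to summing $q_j$ over the assigned customers, which is what `handling_cost` does.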
Penalized Costs
The penalty for infeasibility is the amount of time by which the route duration constraint is
exceeded, multiplied by a penalty parameter $\omega$. This parameter changes dynamically during the
execution of the algorithm to increase the favorability of naturally feasible solutions. The
penalty parameter is initially set equal to 1 and is adjusted according to the number of iterations
executed without improving the best solution obtained so far. This adjustment can be
mathematically defined by:
$$\omega = 1 + \frac{IT_{actual}}{IT_{allowed}},$$
where $IT_{actual}$ is the actual number of iterations executed without improvement so far and
$IT_{allowed}$ the maximum allowed number of iterations without improving the best solution
obtained so far. Notice that $\omega \in [1, 2]$. Hence, during the execution of the algorithm, the penalty
parameter increases and the penalty costs weigh more heavily.
The penalized cost for chromosome $r$ is defined by:
$$\phi(r) = \omega \sum_{k \in R(r)} \max\{0, \tau(k) - T\},$$
where $\tau(k)$ is the duration of route $k$ and $T$ the maximum route duration.
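The penalty computation can be sketched in a few lines of Python. The function names are hypothetical, and the route durations are assumed to be available as a plain list with one entry per route.

```python
def penalty_parameter(it_actual, it_allowed):
    """Penalty parameter: 1 at the start, 2 when it_actual equals it_allowed."""
    return 1.0 + it_actual / it_allowed


def penalized_cost(route_durations, max_duration, it_actual, it_allowed):
    """Total time in excess of the maximum route duration, times the penalty parameter."""
    excess = sum(max(0.0, d - max_duration) for d in route_durations)
    return penalty_parameter(it_actual, it_allowed) * excess
```

For example, with routes lasting 500, 620 and 700 minutes, a maximum route duration of 600 minutes and half of the allowed iterations used, the excess is 120 minutes and the penalized cost is 1.5 times 120.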
Diversity Contribution
The diversity contribution of chromosome $r$ is represented by $\Delta(r)$. It is the average distance
to its $n_{close}$ closest neighbours (Vidal et al., 2011). This distance is a figurative distance;
it represents the average difference between the assignment schemes of two individuals. Let
$y_j(r)$ denote the source to which customer location $j$ is assigned in chromosome $r$. The distance
between chromosomes $r_1$ and $r_2$ is then:
$$\delta(r_1, r_2) = \frac{1}{n} \sum_{j \in V_C} I\big(y_j(r_1) \neq y_j(r_2)\big),$$
where $I(y_j(r_1) \neq y_j(r_2))$ is an indicator function returning one if the source assignment of
customer location $j \in V_C$ is not the same for chromosomes $r_1$ and $r_2$ and zero otherwise.
The diversity contribution is now mathematically represented by:
$$\Delta(r) = \frac{1}{n_{close}} \sum_{r_2 \in N_{close}} \delta(r, r_2),$$
where the set $N_{close}$ represents the $n_{close}$ closest neighbours of chromosome $r$.
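The two formulas above translate into the following Python sketch. It assumes that each assignment scheme is stored as a list of source indices of equal length and that the population contains at least two chromosomes; the function names are our own.

```python
def assignment_distance(y1, y2):
    """Fraction of customers that are assigned to different sources in y1 and y2."""
    return sum(a != b for a, b in zip(y1, y2)) / len(y1)


def diversity_contribution(index, population, n_close):
    """Average distance from population[index] to its n_close closest neighbours."""
    distances = sorted(
        assignment_distance(population[index], other)
        for k, other in enumerate(population)
        if k != index
    )
    closest = distances[:n_close]
    return sum(closest) / len(closest)
```

A chromosome that shares most of its assignments with its closest neighbours obtains a low diversity contribution, while a chromosome whose assignment scheme is unlike any other obtains a value close to one.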
Evaluation Function
Now let the actual evaluation function be defined by:
$$EV(r) = \alpha_c \, RANK\left(\frac{1}{c_{\text{total}}(r) + \phi(r)}\right) + \alpha_{dc} \, RANK(\Delta(r)),$$
where $\phi(r)$ denotes the penalized cost and $\Delta(r)$ the diversity contribution of chromosome $r$.
$RANK\left(\frac{1}{c_{\text{total}}(r) + \phi(r)}\right)$ and $RANK(\Delta(r))$ represent the rank of individual $r$ with respect to
the costs and the diversity contribution, respectively. The rank of the costs considers the inverse
of the sum of the total costs and penalized costs to be able to assign the highest rank to the
chromosome with the lowest costs. In this situation the best chromosome $r$ maximizes the
evaluation value $EV(r)$. The ranks are weighted by the parameters $\alpha_c$ and $\alpha_{dc}$. These parameters
are initially equal to 0.5 and are dynamically adjusted during the execution of the algorithm. The
adjustments can be mathematically presented in the following way:
$$\alpha_c = 0.5 + 0.5 \, \frac{IT_{actual}}{IT_{allowed}}, \qquad \alpha_{dc} = 0.5 - 0.5 \, \frac{IT_{actual}}{IT_{allowed}}.$$
The adjustments ensure that during the execution the costs are weighted more heavily while
the weight on the diversity contribution becomes lighter.
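A minimal Python sketch of the evaluation function is given below. Two details are our own assumptions, since the text does not specify them: ranks run from 1 (smallest value) to the population size (largest value) with ties broken by position, and a larger diversity contribution receives a higher rank. The function names are hypothetical.

```python
def ranks(values):
    """Rank 1 for the smallest value, len(values) for the largest (ties by position)."""
    order = sorted(range(len(values)), key=lambda i: (values[i], i))
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def evaluation_values(total_costs, penalized_costs, diversity, it_actual, it_allowed):
    """Weighted sum of the cost rank and the diversity rank for every chromosome."""
    ratio = it_actual / it_allowed
    w_cost = 0.5 + 0.5 * ratio   # weight on the cost rank grows over time
    w_div = 0.5 - 0.5 * ratio    # weight on the diversity rank shrinks over time
    cost_rank = ranks([1.0 / (c + p) for c, p in zip(total_costs, penalized_costs)])
    div_rank = ranks(diversity)
    return [w_cost * rc + w_div * rd for rc, rd in zip(cost_rank, div_rank)]
```

At the start both ranks count equally; once the allowed number of iterations without improvement is reached, the evaluation value equals the cost rank alone.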
4.3 Education
When the evaluation process is complete, the $4\mu$ initial chromosomes are educated. This
education procedure is concerned with nine route improvement techniques. To discuss these
techniques first define the following variables:
- Let $u$ be a customer location, i.e. $u \in V_C$;
- Let route $k(u)$ represent the route containing customer location $u$ in chromosome $r$,
  $k(u) \in R(r)$;
- Let $(u, v)$ be the partial route from $u$ to $v$ in $k(u)$;
- Let the neighbourhood of $u$ be defined by the $h \cdot n$, $h \in [0, 1]$, closest neighbours of $u$;
- Let $v$ be a customer location in the neighbourhood of $u$;
- Let $x$ be the successor of $u$ in $k(u)$ and let $y$ be the successor of $v$ in $k(v)$.
For the nine route improvement techniques the following restrictions are required: $u \neq y$ and
$v \neq x$. Now the following nine moves can be defined:
1. Remove $u$ and place it after $v$;
2. Remove $(u, x)$ and place $(u, x)$ after $v$;
3. Remove $(u, x)$ and place $(x, u)$ after $v$;
4. Swap $u$ and $v$;
5. Swap $(u, x)$ and $v$;
6. Swap $(u, x)$ and $(v, y)$;
7. If $k(u) = k(v)$, replace $(u, x)$ and $(v, y)$ by $(u, v)$ and $(x, y)$;
8. If $k(u) \neq k(v)$, replace $(u, x)$ and $(v, y)$ by $(u, v)$ and $(x, y)$;
9. If $k(u) \neq k(v)$, replace $(u, x)$ and $(v, y)$ by $(u, y)$ and $(x, v)$.
The last two moves are inter-route moves, the seventh move is an intra-route move, the first
three moves are so called insertions and moves four to six are swaps.
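To illustrate the three families of moves, the sketch below implements one representative of each in Python: move 1 (an insertion), move 4 (a swap) and move 7 (the intra-route move, which amounts to reversing the segment between the two customers). It assumes that routes are lists of location indices that start and end at the source, that $u$ and $v$ are customers, and that distances are symmetric so that the direction of the reversed segment does not matter. The function names are hypothetical.

```python
def move_insert_after(routes, u, v):
    """Move 1: remove customer u and place it directly after customer v."""
    new_routes = [[c for c in route if c != u] for route in routes]
    for route in new_routes:
        if v in route:
            route.insert(route.index(v) + 1, u)
            break
    return new_routes


def move_swap(routes, u, v):
    """Move 4: swap the positions of customers u and v."""
    swap = {u: v, v: u}
    return [[swap.get(c, c) for c in route] for route in routes]


def move_two_opt(route, u, v):
    """Move 7: within one route, replace arcs (u, x) and (v, y) by (u, v) and (x, y)."""
    i, j = sorted((route.index(u), route.index(v)))
    # Reversing the segment between the two positions creates the new arcs.
    return route[:i + 1] + route[i + 1:j + 1][::-1] + route[j + 1:]
```

Each function returns new routes and leaves its input unchanged, so a move can be evaluated first and discarded if it does not improve the evaluation value.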
During the education step every chromosome is educated, which means that the routing schemes
obtained by the nearest neighbour algorithm are improved by the route improvement techniques.
A customer location $u$ is randomly selected, as well as a customer location $v$ from the
neighbourhood of $u$. The nine moves are investigated in random order and the first move yielding
an improvement of the evaluation value, $EV(r)$, is implemented. Then, for the same chromosome,
a new $u$ and $v$ are selected and again the nine moves are investigated in random order. This
process of randomly selecting $u$ and $v$ continues until a $u$ is selected for which none of the nine
moves results in an improvement. When this occurs, the feasibility of the chromosome is
investigated and the new evaluation value of the chromosome, $EV(r)$, is obtained. This implies
that the feasibility check and the evaluation are performed numerous times during the education
step. When the feasibility of the chromosome has changed, it is deleted from its old
subpopulation and inserted in the other one. Thereafter, the process starts all over again for the
next chromosome until all chromosomes have been educated.
4.4 Generation of Offspring

4.4.1 Stopping Criteria Corresponding to Generation of Offspring
The initial population is extended by creating offspring while two conditions are satisfied:
- the two subpopulations have not reached their maximum size (condition 1),
- and the prespecified number of iterations allowed in a row without improving the best
  solution obtained so far is not reached (condition 2).
When one of the two subpopulations reaches its maximum size, a prespecified number of
survivors is selected. With this group of survivors the procedure of generating offspring starts all
over again. This process is explained in more detail in subsection 4.4.8.
Let the number of iterations allowed in a row without improving the best solution obtained
so far be defined by L = 0.75 . When the number of iterations in a row without improving
the best solution obtained so far reaches L, the population is diversified by first selecting a
prespecified number of survivors and then returning to the first step of the algorithm to create
a new initial population containing the selected survivors. This diversification process is
discussed in more detail in subsection 4.4.9.
In the following subsections we will discuss the entire process of generating offspring step
by step.
4.4.2 Parent Selection

The first step in creating offspring is the selection of parents. We apply the roulette wheel
selection operation (Goldberg, 1989), which takes the evaluation of the chromosomes into account
to select the best chromosome with the highest probability and the worst with the lowest. Here
our algorithm differs from the one presented by Vidal et al. (2011), as they randomly select
parents.
The roulette wheel selection operation, as the name suggests, is based on a roulette wheel.
Let the interval $[0, 1]$ represent the roulette wheel. Every chromosome is represented on the
roulette wheel, i.e. on the interval $[0, 1]$, by a specific part. Let the probability of obtaining
chromosome $r$ be defined by:
$$P(r) = \frac{EV(r)}{\sum_{r' \in Pop} EV(r')}.$$
Now the part of the interval $[0, 1]$ related to chromosome $r$ can be defined by:
$$\left[ \sum_{i \in \{1, \ldots, r-1\}} P(i), \; \sum_{i \in \{1, \ldots, r\}} P(i) \right].$$
To illustrate this consider the following:
- chromosome $r = 1$ corresponds to the part $[0, P(1)]$,
- chromosome $r = 2$ corresponds to the part $[P(1), P(1) + P(2)]$,
- chromosome $r = 3$ corresponds to the part $[P(1) + P(2), P(1) + P(2) + P(3)]$,
- etcetera.
To select two parents, two random numbers on the interval $[0, 1]$ are drawn, i.e. the roulette
wheel is spun twice. The chromosomes corresponding to these two randomly generated
numbers are selected as parents.
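The roulette wheel can be sketched in Python as follows. The function names are our own, chromosomes are identified by their position in the list of evaluation values (starting at 0), and all evaluation values are assumed to be positive. Note that in this sketch the two spins are independent, so the same chromosome may be drawn twice; the text does not state whether this is excluded.

```python
import random


def roulette_index(evaluations, spin):
    """Return the chromosome whose part of the interval [0, 1] contains spin."""
    total = sum(evaluations)
    cumulative = 0.0
    for index, value in enumerate(evaluations):
        cumulative += value / total  # upper end of this chromosome's part
        if spin < cumulative:
            return index
    return len(evaluations) - 1  # guard against floating-point round-off


def select_parents(evaluations, rng=random):
    """Spin the wheel twice and return the indices of the two selected parents."""
    return (roulette_index(evaluations, rng.random()),
            roulette_index(evaluations, rng.random()))
```

With evaluation values 1, 2 and 1 the three chromosomes occupy the parts [0, 0.25], [0.25, 0.75] and [0.75, 1], so a spin of 0.5 selects the second chromosome.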
4.4.3 Create Offspring

To create offspring the assignment schemes of the two selected parents are randomly combined.
This random crossover results in a new assignment scheme for a new chromosome. From this
point onwards the newly created offspring follows the same procedure as the chromosomes do
in the process of generating the initial population:
4.4.4 Routing Algorithm

the nearest neighbour algorithm is applied to obtain a routing scheme corresponding to
the assignment scheme of the new chromosome,
4.4.5 Feasibility Check

its feasibility is investigated,
4.4.6 Evaluation
the evaluation value of the offspring is obtained and finally,
4.4.7 Education
the chromosome is educated and inserted in the subpopulation corresponding to its fea-
sibility.
4.4.8 Condition 1: Maximum Size Subpopulation

As previously described, offspring is created as long as the following two conditions are
satisfied:
- the two subpopulations have not reached their maximum size $\mu + \lambda$,
- and the prespecified number of iterations allowed in a row without improving the best
  solution obtained so far, $L$, is not reached.
In this section we describe what happens when one of the two subpopulations reaches its
maximum size. In the next section we discuss what happens when the second condition is
not satisfied.
When one of the two subpopulations reaches its maximum size, $\mu + \lambda$, the $\mu$ best chromosomes
are selected for both subpopulations. These two groups of $\mu$ chromosomes are the survivors.
The two remaining groups, both containing $\lambda$ chromosomes, are deleted from the
subpopulations. With the two groups of $\mu$ chromosomes the whole process re-enters the stage of
generating offspring until one of the two conditions is unsatisfied again.
4.4.9 Condition 2: Maximum Number of Iterations

When the number of iterations in a row without improving the best solution obtained so far
reaches $L$, the $\mu/3$ best chromosomes are selected for both subpopulations. These two groups
of $\mu/3$ chromosomes are the survivors. The remaining two groups of chromosomes are deleted
from the subpopulations. With the two groups of $\mu/3$ chromosomes the whole process re-enters
the stage of creating an initial population to diversify the chromosomes. The population is
expanded until it has size $4\mu$ again and the process continues with this new initial population.
4.5 Stopping Criterion

The whole process evolves generation by generation until either one of the two stopping crite-
ria is met. The two stopping criteria are a maximum computation time, Tmax , and a maximum
number of iterations the algorithm is allowed to make without improving the best solution ob-
tained so far, ITallowed . When the computation time reaches Tmax or the number of iterations
ITactual reaches ITallowed , the algorithm is stopped and the best chromosome, i.e. the best rout-
ing scheme, obtained so far is returned as the best solution.
Now, every aspect of the genetic algorithm is explained in much detail. Let us, in the next
chapters, describe how the algorithm is tested.
Chapter 5
Computational Experiments
The hybrid genetic algorithm presented in the previous chapter is tested by a $2 \times 2 \times 2$ factorial
experiment. The factors correspond to the total number of locations included (source locations
plus customer locations), the demand of the customers and the required service time at the
customer locations. For each factor two levels are specified: a high and a low level. Before
discussing these levels in more detail, the parameter specification is investigated.
5.1 Parameter Specification

In the previous chapter the following four parameters have been defined: $\mu$, $\lambda$, $n_{close}$ and $h$.
Vidal et al. (2011) specified the following ranges for these parameters to use in the calibration
of their values:
Parameter   Description                                                          Range
$\mu$       Minimum population size                                              [5, 200]
$\lambda$   Number of offspring in a generation                                  [1, 200]
$nc$        Proportion of close individuals considered in the diversity
            contribution, $n_{close} = nc \cdot \mu$                             [0, 0.25]
$h$         Proportion of close individuals considered in the education step     [0, 1]
Table 5.1: Ranges Parameters (Vidal et al., 2011)
According to Vidal et al. (2011) these ranges are appropriate due to values found in the
literature (this holds for the subpopulation sizes), conceptual requirements (a local distance
measure is assumed to involve not more than 25% of the population) and parameter definition
(this holds for the last proportion). For programming reasons we need to deal with the
following two constraints: $\lambda$ must be larger than $3\mu$ and $\mu$ must be divisible by three. Therefore
we define the ranges [6, 66] and [19, 199] for $\mu$ and $\lambda$, respectively. For $nc$ and $h$ we apply the
ranges Vidal et al. (2011) set.
Vidal et al. (2011) applied a meta-evolutionary method, the Evolutionary Strategy with Co-
variance Matrix Adaptation (CMA-ES) (Hansen and Ostermeier, 2001) to perform parameter
optimization. This resulted in the specification of the parameter values presented in Table 5.2.
Parameter   Description                                                          Value
$\mu$       Minimum population size                                              25
$\lambda$   Number of offspring in a generation                                  70
$nc$        Proportion of close individuals considered in the diversity
            contribution, $n_{close} = nc \cdot \mu$                             0.2
$h$         Proportion of close individuals considered in the education step     0.4
Table 5.2: Calibration Results (Vidal et al., 2011)
Instead of applying the CMA-ES as Vidal et al. (2011) did, we apply a trial and error procedure
to obtain parameter values within the previously mentioned ranges. The results according to
the specification of the parameters will be presented in the next chapter.
When the parameter values are specified, the algorithm can be tested. Let us, in the next sec-
tion, discuss the experimental design we apply.
5.2 Experimental Design

As previously mentioned, a $2 \times 2 \times 2$ factorial experiment is applied. Before discussing the
factors in more detail, let us present an overview of the input the algorithm requires:
1. A distance matrix, including the distances $d_{ij}$, $i, j \in V$, representing the distance from $i$
to $j$ in kilometers;
2. A time matrix, including the travel times $t_{ij}$, $i, j \in V$, representing the time
required to travel from $i$ to $j$ in minutes;
3. A demand vector, including the demands $q_j$, $j \in V_C$, representing the demand of
customer $j$ in m², m³ or in pallets (depending on the investigated situation);
4. A service time vector, including the service time of every customer $j \in V_C$, representing the
service time required at customer $j$ in minutes;
5. The vehicle capacity, $Q$ (in m², m³ or in pallets; again depending on the investigated
situation);
6. The maximum amount of time (in minutes) a route can last, $T$;
7. The transport costs $c_1$, per kilometer;
8. The handling costs at the intermediate points, $c_2$.
Some of these requirements are defined by the factors of the experiment. Let us therefore now
discuss these factors. For each factor two initial levels (high and low) are specified. We mention
initial since preliminary experiments demonstrated that the possible levels are restricted. Be-
fore discussing the results corresponding to the preliminary experiments and the consequences
for the possible levels, let us discuss the levels we initially defined:
Factor 1: The number of locations included (source locations plus customer locations).
High: 500 zip codes including one distribution center, 9 intermediate points and 490 cus-
tomer locations;
Low: 50 zip codes including one distribution center, 4 intermediate points and 40 cus-
tomer locations.
Factor 2: The demand of the customers.

A customer's demand is randomly assigned from a set of three values. We assume that orders
are not piled up during transport and therefore define demand in m².
High: {4; 5, 5; 5.5} m²;
Low: {0.5; 1; 1.5} m².
Factor 3: The required service time at the customer locations.

A customer's service time is randomly assigned from a set of three service times defined in
minutes.
High: {60; 90; 120} minutes;
Low: {10; 30; 60} minutes.
As previously mentioned, preliminary experiments indicated that the algorithm encounters
some issues when dealing with these initial levels. Let us therefore summarize the findings
of the preliminary experiments in the following subsections and indicate the consequences
for the algorithm and the levels of the factors.
5.2.1 Relevance of Education Step

One of the findings does not involve a problem and has no consequences for the levels of the
factors, but it weighs heavily on the execution of the algorithm and is therefore mentioned
first. The education step accounts for a large part of the total computation time, yet it does
not influence the eventual best solution, independent of the parameter settings and
independent of the input. Therefore, it is excluded from the algorithm, which implies that h
does not need to be defined.
5.2.2 Influence of Infeasibility

A problem was encountered concerning the infeasibility of solutions. A solution is infeasible
when at least one route within the solution exceeds the route duration constraint.
Independent of the parameter settings and independent of the input, the assignment procedure
we apply increases the number of infeasible solutions. As customers are randomly assigned to
a source instead of to a source nearby, the distances between the locations within one route
are usually quite large, especially when the set of locations is small. Hence, the time
required to travel is quite large as well, which results in a large route duration.
The two extreme settings, with all factors at their high levels or all at their low levels,
both increase the occurrence of infeasibility. In the low-levels setting this is because the
demands within the demand range are small; in the high-levels setting it is because the
service times within the service time range are large. The low-levels case can be explained as
follows: the routing algorithm, i.e. the nearest neighbour algorithm, inserts customers into a
route as long as the vehicle capacity allows this. When the demand of all customers is
relatively small, many customers are included in a single route, which results in a large
route duration and hence many infeasible solutions. In the high-levels setting, large service
times in combination with large traveling times due to the random assignment also result in
large route durations.
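The feasibility check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thesis implementation is in MATLAB, the function names and the toy data are hypothetical, and the route duration is assumed to be the sum of all travel times (including the return to the source) and all service times on the route.

```python
def route_duration(route, source, travel_time, service_time):
    """Minutes needed to leave `source`, serve `route` in order and drive back."""
    total = 0.0
    position = source
    for customer in route:
        total += travel_time[position][customer] + service_time[customer]
        position = customer
    return total + travel_time[position][source]


def is_feasible(solution, travel_time, service_time, max_duration=600.0):
    """A solution is feasible only if no route exceeds the duration limit T.

    `solution` is a list of (source, route) pairs (an assumed representation).
    """
    return all(
        route_duration(route, source, travel_time, service_time) <= max_duration
        for source, route in solution
    )


# Toy data (hypothetical): node 0 is the source, nodes 1 and 2 are customers.
travel_time = [[0, 30, 40], [30, 0, 20], [40, 20, 0]]
service_time = {1: 60, 2: 90}
duration = route_duration([1, 2], 0, travel_time, service_time)
print(duration)  # 30 + 60 + 20 + 90 + 40 minutes
```

The sketch makes the two effects visible: every extra customer on a route adds both a travel leg and a service time, so small demands (many customers per vehicle) and large service times both push the duration towards the limit T.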
The actual problem we encountered is that for the two extreme settings the population contained
only infeasible solutions. When starting with only infeasible solutions, the chance of
obtaining feasible solutions during the execution of the algorithm is small. Therefore, to be
able to investigate the extreme situations, the decision was made to change the assignment
procedure to increase the occurrence of feasible solutions. Instead of assigning all customers
randomly, from now on a customer is assigned randomly with 70% chance and is assigned to one of
its three nearest sources with 30% chance. Assigning customers to one of their nearest sources
decreases the traveling times and therefore increases the occurrence of feasibility.
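The adjusted assignment procedure can be sketched as follows. The sketch is an illustration only: the function name and arguments are hypothetical, and we assume that a customer falling in the 30% case picks uniformly among its three nearest sources, which the text does not specify.

```python
import random


def assign_customers(customers, sources, distance, rng, p_random=0.7, k_nearest=3):
    """Assign every customer to a source.

    With probability `p_random` the source is drawn uniformly from all sources;
    otherwise it is drawn uniformly from the customer's `k_nearest` sources
    (uniform choice among the nearest sources is an assumption).
    """
    assignment = {}
    for customer in customers:
        if rng.random() < p_random:
            assignment[customer] = rng.choice(sources)
        else:
            nearest = sorted(sources, key=lambda s: distance[s][customer])[:k_nearest]
            assignment[customer] = rng.choice(nearest)
    return assignment


# Toy instance on a line (hypothetical): sources 0-4, customers 5 and 6.
position = {0: 0, 1: 10, 2: 20, 3: 30, 4: 40, 5: 1, 6: 39}
distance = {i: {j: abs(position[i] - position[j]) for j in position} for i in position}
assignment = assign_customers([5, 6], [0, 1, 2, 3, 4], distance, random.Random(1))
print(assignment)
```

Setting `p_random=1.0` reproduces the original, purely random assignment, while `p_random=0.0` restricts every customer to its nearest sources.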
This does not solve the entire problem. The low-level demand range {0.5; 1; 1.5} m² still allows
too many customers to be in one route. Even when all customers are assigned to their nearest
source, the vehicle capacity is so large that many customers are included in one route and the
route duration constraint is exceeded. Therefore, from now on the low-level demand range is
specified as {1.5; 2; 2.5} m². Then, independent of the parameter settings, the population no
longer consists solely of infeasible solutions when investigating the low-levels setting.
For the high-levels setting the problem is also not entirely solved by the new assignment pro-
cedure. Therefore, the new high-level service time range is specified as {30; 60; 90} minutes.
Then, independent of the parameter settings, feasible solutions are included as well.
5.2.3 Restrictions Due to High-Level of Factor 1

For the high-level setting, we initially intended to investigate 500 zip codes. Unfortunately,
MATLAB encounters memory issues when dealing with such a large number of locations.
Therefore, the high level of Factor 1 is adjusted to 350 zip codes.
MATLAB is able to cope with 350 zip codes, although it still encounters memory issues when
dealing with large values of μ and λ, i.e. large populations. The maximum values for μ and λ
that MATLAB is able to deal with are 15 and 46 respectively. This implies that for the
high-levels setting new ranges for μ and λ need to be considered, namely [6, 15] and [19, 46]
respectively. For the low-level setting the larger, previously defined ranges are investigated.
5.2.4 Summarizing the Restrictions

The results of the preliminary experiments have changed the levels of the factors within
the 2 × 2 × 2 factorial experiment quite drastically. Let us therefore present a clear overview
of the new levels, so that the results of the different experiments can be discussed in
the next chapter.
Factor 1: The number of locations included (source locations plus customer locations).
High: 350 zip codes including one distribution center, 9 intermediate points and 340 cus-
tomer locations;
Low: 50 zip codes including one distribution center, 4 intermediate points and 45 customer
locations.
Factor 2: The demand of the customers.
High: {4.5; 5; 5.5} m²;
Low: {1.5; 2; 2.5} m².
Factor 3: The required service time at the customer locations.
High: {30; 60; 90} minutes;
Low: {10; 30; 60} minutes.
Combining these three factors, each with two levels, results in eight different experiments.
These experiments are all investigated to develop an overview of the capabilities of the
algorithm. The results of these experiments are presented in the following chapter. However,
before discussing the results, let us return to the input requirements, which are discussed in
the following sections.
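As a small illustration of how the eight experiments arise, the following Python sketch enumerates all combinations of the adjusted levels listed above. The dictionary keys and the function name are hypothetical; only the level values are taken from the text.

```python
from itertools import product

# Adjusted factor levels from Section 5.2.4 (key names are illustrative).
FACTOR_LEVELS = {
    "locations": {"high": 350, "low": 50},
    "demand_m2": {"high": (4.5, 5.0, 5.5), "low": (1.5, 2.0, 2.5)},
    "service_min": {"high": (30, 60, 90), "low": (10, 30, 60)},
}


def factorial_design(levels):
    """Return one dict per experiment, mapping each factor to 'high' or 'low'."""
    names = list(levels)
    return [
        dict(zip(names, combination))
        for combination in product(("high", "low"), repeat=len(names))
    ]


experiments = factorial_design(FACTOR_LEVELS)
for number, experiment in enumerate(experiments, start=1):
    print(number, experiment)
```

The numbering produced by this sketch is arbitrary and need not match the experiment numbers used in the results chapter.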
5.2.5 Distance and Time Matrix

The first two input requirements, a distance and a time matrix, depend on Factor 1: the
number of locations included. The number of zip codes defined by Factor 1 is randomly
drawn from the total of 4039 possible four-digit zip codes in The Netherlands. Next, the
program XCargo is used to obtain both the distance matrix and the time matrix.
5.2.6 Demand and Service Time Vector

The demand vector and service time vector depend on the level of Factor 2 and Factor 3
respectively. As previously mentioned, the demand vector is obtained by randomly assigning
to each customer a demand from the set specified by Factor 2. The service time vector is
obtained by randomly assigning to each customer a service time from the set specified by
Factor 3.
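A minimal Python sketch of this random assignment is given below. It assumes that each of the three values is equally likely, which the text does not state explicitly; the function name and the customer labels are hypothetical.

```python
import random


def draw_vector(customers, values, rng):
    """Assign to every customer one value drawn uniformly from `values`."""
    return {customer: rng.choice(values) for customer in customers}


rng = random.Random(2012)          # fixed seed only to make the sketch repeatable
customers = list(range(10, 20))    # ten hypothetical customer labels
demand = draw_vector(customers, (1.5, 2.0, 2.5), rng)   # low level of Factor 2, in m²
service = draw_vector(customers, (10, 30, 60), rng)     # low level of Factor 3, in minutes
print(demand)
print(service)
```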
5.2.7 Vehicle Capacity and Maximum Route Duration

As previously mentioned, due to the assumption that orders are not stacked during transport,
we use m² as the unit of vehicle capacity. Expert knowledge is used to set the capacity to
15 m² and the maximum route duration to 10 hours, i.e. 600 minutes.
5.2.8 Transport and Handling Costs

The program NEA is used to obtain a transport cost per kilometer of €1.53. This cost includes
several components, such as fuel costs, driver costs, insurance, fleet costs and maintenance
costs.
Expert knowledge is used to define the average time required to load and unload a vehicle and
the cost per hour for an employee to do this work. This information is combined to obtain the
handling costs of €4.25 per m².
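To give a feeling for the magnitude of these two cost parameters, the following Python sketch computes the cost of a single delivery route. It is a simplification under stated assumptions: handling costs are charged once per m² for goods that pass through an intermediate point, and the cost of supplying the intermediate point from the distribution center is left out. The function name and the example figures (100 km, 12 m²) are hypothetical.

```python
TRANSPORT_COST_PER_KM = 1.53   # c_1, in euro per kilometer
HANDLING_COST_PER_M2 = 4.25    # c_2, in euro per m² handled at an intermediate point


def route_cost(distance_km, volume_m2, via_intermediate_point):
    """Transport cost of the route plus, if applicable, handling cost of its load."""
    cost = TRANSPORT_COST_PER_KM * distance_km
    if via_intermediate_point:
        cost += HANDLING_COST_PER_M2 * volume_m2
    return cost


# A hypothetical 100 km route delivering 12 m²:
direct = route_cost(100, 12, via_intermediate_point=False)    # transport only
indirect = route_cost(100, 12, via_intermediate_point=True)   # transport + handling
print(round(direct, 2), round(indirect, 2))
```

In this example the handling of 12 m² adds 12 × 4.25 = 51 euro, which equals the transport cost of roughly 33 km. Delivery via an intermediate point therefore only pays off when it shortens the routes sufficiently.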
5.2.9 Stopping Criteria

Besides the input requirements, the stopping criteria need to be defined as well.
Vidal et al. (2011) applied three different sets of stopping conditions for IT_allowed and
T_max, namely:
(IT_allowed, T_max)
(10^4, 10 min)
(2 × 10^4, 30 min)
(5 × 10^4, 60 min)
Table 5.3: Stopping Conditions (Vidal et al., 2011)
We initially intend to investigate these same sets of stopping criteria. Whether the algorithm is
fast enough to indeed apply these sets, will be discussed in the following chapter.
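How such a pair of stopping conditions could be applied is sketched below in Python. The sketch rests on an assumption: IT_allowed is interpreted as a cap on the number of iterations and T_max as a cap on the elapsed time, and the loop stops as soon as either is reached. Whether IT_allowed instead counts iterations without improvement is not specified in this section. The names `run_until_stopped` and `step` are hypothetical.

```python
import time


def run_until_stopped(step, it_allowed=10_000, t_max_seconds=600.0, clock=time.monotonic):
    """Call `step()` repeatedly until the iteration cap or the time cap is reached.

    Returns the number of iterations that were executed.
    """
    start = clock()
    iterations = 0
    while iterations < it_allowed and clock() - start < t_max_seconds:
        step()
        iterations += 1
    return iterations


# Usage with a dummy step that only counts how often it was called.
calls = []
executed = run_until_stopped(lambda: calls.append(1), it_allowed=5, t_max_seconds=60.0)
print(executed)
```

Passing the clock as an argument keeps the loop easy to test, since an artificial clock can stand in for real time.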
Now, all required input of the experiment is defined and the experiment design is clear. The
following chapter discusses the results corresponding to the specification of the parameters
and the testing of the algorithm.
Chapter 6
Results
This chapter contains two sections of results, corresponding to two sets of experiments. The
first section is concerned with the results of the parameter specification and the second
deals with the results obtained when executing the algorithm with the parameter values
specified in the first section. Throughout this whole chapter the stopping criterion is
IT_allowed = 10^4 iterations.
6.1 Parameter Specification

To identify appropriate parameter values, several different input sets are investigated. For
each input set, the extreme parameter values, i.e. the minima and maxima within their ranges,
are investigated first. Next, a trial and error procedure is applied to obtain the most suitable
values for μ and λ. Let us summarize our findings in the following subsections. Notice that the
parameter h is not investigated, since the education step is excluded from the algorithm.
6.1.1 Fluctuating μ and λ

Unfortunately, it is hard to draw conclusions on the most suitable setting of the parameter
values μ and λ. Varying the settings of the parameter values for numerous input sets does
not lead to unambiguous results. This is illustrated by discussing several different input
sets, but before doing so, let us present three statements that hold in general, independent of
the parameter settings and input sets:
• The smaller μ and λ, the faster the algorithm reaches the stopping criterion, i.e. the
  smaller the required execution time.
• The larger μ and λ, the smaller the number of generations included while executing the
  algorithm.
• The larger μ and λ, the earlier the first large drop in total costs is observed. Figures
  6.1 to 6.6, presented in this chapter, and the figures presented in Appendix 8 illustrate
  this observation. The earlier occurrence of the drop is explained by the fact that the
  weight on costs increases during the execution of the algorithm, while the weight on the
  diversity contribution decreases. As μ and λ increase, the number of chromosomes generated
  in each population increases and hence more chromosomes are created within one generation
  that do not improve the best solution obtained so far. From this it follows that the
  weight on costs increases more rapidly when μ and λ are larger, which results in the
  earlier occurrence of a large drop in total costs.
  Although the drop occurs earlier for larger μ and λ, it may not be as large as a drop
  occurring for smaller μ and λ, as illustrated by the figures. Hence, it is not possible
  to state that larger μ and λ are more suitable.
Let us investigate different input sets to illustrate that we are not able to draw strict
conclusions on the most suitable settings of the parameters μ and λ. Each input set is
investigated for a fixed value of nc, namely nc = 0.20. The following section discusses the
actual influence of fluctuating the parameter value nc.
To be able to compare the figures corresponding to the different input sets, the results are
discussed on the following three pages.
Input set 1
The first investigated input set includes 25 locations: one distribution center, two intermediate
points and 22 customer locations. The demand range is [2; 3; 4] m² and the service time range is
[50; 60; 70] minutes. Figures 6.1 and 6.2 present results for several examined settings of μ and
λ. With only 25 locations included, small μ and λ are sufficient, i.e. they yield better
solutions, and therefore larger settings of μ and λ are excluded from the figures. The two
figures correspond to two executions of the algorithm for the exact same input set. Notice
that the patterns in the two figures are quite similar. For the different settings of μ and λ
the algorithm converges to approximately the same solution. As the algorithm is fastest for
the smallest μ and λ, and the final solution does not differ between the settings of the
parameter values, the figures suggest that the setting with the smallest values is best to
apply, i.e. μ = 6 and λ = 19. Notice that these two figures illustrate only two examples of
several executions of the algorithm for different demand and service time ranges. The observed
patterns hold in general. Hence, for an input set including 25 locations, μ and λ equal to 6
and 19 respectively are suitable, independent of the demand and service time range.
Figure 6.1: Input set 1, execution 1
Input set 2
Figures 6.3 and 6.4 illustrate results obtained by executing the algorithm twice for the same
input set. This input set includes 50 locations: one distribution center, four intermediate points
and 45 customer locations. The demand range is [2; 3.5; 5] m² and the service time range is
[45; 60; 75] minutes. Again, large settings of μ and λ are excluded since small parameter values
are most suitable. Notice that the results of the two executions are quite similar
for μ = 6 and λ = 19, although the patterns corresponding to the other settings of μ and λ
differ considerably. Other executions of the algorithm for the same input set yield yet other
patterns. This implies that it is not possible to specify the most suitable values of μ and λ for
this specific input set. Executing the algorithm for other input sets containing 50 locations also
results in numerous observed patterns and no unambiguous specification of the most suitable
parameter values.
Input set 3
The third investigated input set includes 200 locations: one distribution center, nine
intermediate points and 190 customer locations. The demand range is [3; 3.5; 4] m² and the
service time range is [10; 20; 30] minutes. Figures 6.5 and 6.6 present the results for several
settings of the parameter values. Notice that the overall patterns within the two figures are
quite similar, although a closer look suggests that it is again hard to determine the most
suitable parameter values for μ and λ. More executions of the algorithm, both for this input set
and for other input sets containing 200 locations, lead to the same result: we are not able to
set μ and λ.
The figures of the three input sets presented above suggest that as the input set
includes more locations, the most suitable parameter values increase. However, as there are
too many fluctuations within the observed patterns, there is no formal evidence. Appendix 8
contains the figures corresponding to two other input sets, which again illustrate that
specifying suitable parameter values for μ and λ requires further research. The only
investigated set that presents the same pattern for varying specifications of demand range,
service time range, μ and λ is the input set containing only 25 locations. For such a small
input set the smallest μ and λ are suitable. We are not able to make statements about suitable
values for μ and λ for larger input sets. Fortunately, independent of the values of μ and λ, the
algorithm turns out to work appropriately, as is illustrated in the following section. But
before discussing these results, let us investigate the specification of nc.
6.1.2 Fluctuating nc
The parameter nc defines how many close individuals, n_close, are considered when the
diversity contribution is investigated; n_close = nc × μ, where n_close is rounded. Table 6.1
presents an overview of the values of n_close corresponding to different values of nc and μ, to
give insight into how reasonable certain values are. Notice that for small μ the number of close
individuals investigated for the diversity contribution is quite small. Even the maximum value
of nc within the specified range, i.e. nc = 0.25, yields a small number for n_close. When only a
few close individuals are taken into account, the diversity contribution may be small when the
number of differences between the individuals is small. The diversity contributions of the
individuals would then lie close to one another, and hence the chance of undervaluing the
diversity contribution becomes high, which is undesirable. This implies that the most suitable
value for nc might depend on the most suitable parameter value of μ. When μ is small, a higher
nc is probably most appropriate. Let us investigate different settings for nc, μ and λ to
examine the influence of fluctuating the parameter value nc.

            μ =    6    9   12   15   18   24   30   36   42   48   54   60   66
nc = 0.05          0    1    1    1    1    1    2    2    2    2    3    3    3
nc = 0.10          1    1    2    2    2    2    3    4    4    5    5    6    7
nc = 0.15          1    2    2    3    3    4    5    5    6    7    8    9   10
nc = 0.20          2    2    3    4    4    5    6    7    8   10   11   12   13
nc = 0.25          2    3    4    5    5    6    8    9   11   12   14   15   17
Table 6.1: Possible values of n_close for different nc and μ.
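To illustrate why a small n_close can undervalue diversity, the following Python sketch computes a diversity contribution as the average distance to the n_close closest individuals. This follows the general idea of Vidal et al. (2011), but the details are assumptions: individuals are represented here as plain assignment vectors, the distance is a normalized Hamming distance, and the function names are hypothetical.

```python
def hamming_distance(a, b):
    """Fraction of positions in which two equally long vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def diversity_contribution(index, population, n_close):
    """Average distance from individual `index` to its `n_close` closest individuals."""
    individual = population[index]
    distances = sorted(
        hamming_distance(individual, other)
        for i, other in enumerate(population)
        if i != index
    )
    closest = distances[:n_close]
    if not closest:              # n_close = 0: no neighbours are considered
        return 0.0
    return sum(closest) / len(closest)


# Three hypothetical individuals: the first two are near-duplicates.
population = [[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1]]
print(diversity_contribution(0, population, n_close=1))   # only the near-duplicate counts
print(diversity_contribution(0, population, n_close=2))   # the distant individual counts too
```

With n_close = 1 the first individual is judged only against its near-duplicate and receives a low contribution, although the population also contains a very different individual. Considering more neighbours gives a more balanced value.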
Figures 6.7 to 6.9 present the results corresponding to five different values of nc for three
different settings of μ and λ. These three settings are all investigated for the same input set
with 100 locations: one distribution center, nine intermediate points and 190 customer
locations; the demand range is [2; 4; 6] m² and the service time range is [20; 40; 60] minutes.
Notice that the patterns observed within the figures are not similar. This holds for these
three examples, as well as in general. The general conclusion we can draw is that the most
suitable value for nc, independent of μ and λ and independent of the input set, is either
nc = 0.15 or nc = 0.20. In the following section nc = 0.20 is used to investigate the eight
different experiments as discussed in the previous chapter.
Notice that the figures corresponding to the different settings of nc and different values of
μ and λ suggest that larger μ and λ are more suitable for this input set. Unfortunately, we
cannot draw such a hard conclusion since, as described in the previous section, there are too
many situations in which different settings are suggested.
Figure 6.7: Parameter set 1: μ = 9, λ = 28
6.2 Results 2 × 2 × 2 Factorial Experiment

The previous chapter presented a detailed discussion of the 2 × 2 × 2 factorial experiment that
is applied to investigate the capabilities of the genetic algorithm. Let us now discuss the
results of the eight different experiments.
The first result to present is the fact that infeasibility still causes some issues. Three of the
eight experiments cannot be executed due to the fact that only infeasible solutions are included.
These three experiments correspond to the following settings of the factors:
1. Factor 1: high-level, i.e. 350 locations.
   Factor 2: low-level, i.e. [1.5; 2; 2.5] m².
   Factor 3: high-level, i.e. [30; 60; 90] minutes.
2. Factor 1: high-level, i.e. 350 locations.
   Factor 2: low-level, i.e. [1.5; 2; 2.5] m².
   Factor 3: low-level, i.e. [10; 30; 60] minutes.
3. Factor 1: low-level, i.e. 50 locations.
   Factor 2: low-level, i.e. [1.5; 2; 2.5] m².
   Factor 3: high-level, i.e. [30; 60; 90] minutes.
Notice that each of these experiments deals with the low-level demand range, which implies
that many customers are included in a single route. The first two experiments both contain 350
locations. The first experiment not only deals with the low demand level, but also has large
service times which, as explained in the previous chapter, increase the chance of solutions
being infeasible. The second experiment, however, deals with the low level of Factor 3, the
service time range, and cannot be executed either. This implies that even with smaller service
times, too many customers are included in one route when dealing with 350 locations. The third
experiment includes 50 locations and deals with low-level demand and the high level of the
service time range. When few locations are included, they are more spread out across the
country, so the nearest neighbour or nearest source might not be that near. This increases the
occurrence of long route durations and hence of infeasibility. Notice that it is possible to
execute the experiment with 50 locations, low-level demand and low-level service times. This
implies that when dealing with low-level service times it is possible to construct routes that
do not exceed the route duration constraint.
Modifying the nearest neighbour algorithm by including a route duration constraint can
solve the infeasibility issues, although a hard constraint would result in the exclusion of all
infeasible solutions. As including both feasible and infeasible solutions results in a more
diversified population and decreases the chance of being trapped in a local minimum (Vidal et
al., 2011), a soft constraint would probably be more suitable. The application of another, more
advanced routing algorithm might solve the issues with infeasibility as well.
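One possible reading of such a soft constraint is a penalty on the excess route duration, so that infeasible solutions remain in the population but become less attractive. The Python sketch below illustrates this reading only; it is not the approach implemented in this thesis, and the penalty weight and function name are hypothetical.

```python
def penalized_cost(cost, route_durations, max_duration=600.0, penalty_per_minute=1.0):
    """Cost of a solution plus a penalty for every minute a route exceeds the limit T."""
    excess = sum(max(0.0, duration - max_duration) for duration in route_durations)
    return cost + penalty_per_minute * excess


# Two routes of 500 and 650 minutes: the second exceeds T = 600 by 50 minutes.
print(penalized_cost(100.0, [500.0, 650.0]))                          # 100 + 1 * 50
print(penalized_cost(100.0, [500.0, 650.0], penalty_per_minute=2.0))  # 100 + 2 * 50
```

A hard constraint corresponds to an infinitely large penalty weight. A finite weight keeps slightly infeasible solutions comparable with feasible ones, which preserves diversity in the population.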
Now, let us discuss the results corresponding to the remaining experiments.
For the five remaining experiments it can be shown that the genetic algorithm outperforms
the nearest neighbour algorithm, independent of the parameter values of μ and λ. For each
experiment several parameter settings have been investigated. Remember that the possible
range for μ and λ is limited for the high-level setting of Factor 1, as described in Section
5.2.3, and hence no large values of μ and λ are investigated for this setting. One setting per
experiment is discussed to illustrate the performance of the genetic algorithm relative to the
performance of the nearest neighbour algorithm. The nearest neighbour algorithm is applied
in such a way that every customer location is assigned to its nearest source. For each source a
separate execution of the nearest neighbour algorithm determines the routing scheme.
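The benchmark procedure described above can be sketched in Python as follows. The sketch makes two assumptions that the text leaves open: a route is extended with the nearest unvisited customer that still fits in the vehicle, and a new route is started when no remaining customer fits. The function names and the toy instance are hypothetical, and no route duration check is included.

```python
def nearest_source_assignment(customers, sources, distance):
    """Group customers by their nearest source."""
    clusters = {source: [] for source in sources}
    for customer in customers:
        nearest = min(sources, key=lambda s: distance[s][customer])
        clusters[nearest].append(customer)
    return clusters


def nearest_neighbour_routes(source, customers, distance, demand, capacity=15.0):
    """Build routes from `source` by repeatedly visiting the nearest customer that fits."""
    unvisited = set(customers)
    routes = []
    while unvisited:
        route, load, position = [], 0.0, source
        while True:
            candidates = [c for c in sorted(unvisited) if load + demand[c] <= capacity]
            if not candidates:
                break
            nxt = min(candidates, key=lambda c: distance[position][c])
            route.append(nxt)
            load += demand[nxt]
            unvisited.remove(nxt)
            position = nxt
        if not route:
            raise ValueError("a customer's demand exceeds the vehicle capacity")
        routes.append(route)
    return routes


# Toy instance on a line (hypothetical): sources 0 and 5, customers 1 to 4.
position = {0: 0, 5: 11, 1: 1, 2: 2, 3: 3, 4: 10}
distance = {i: {j: abs(position[i] - position[j]) for j in position} for i in position}
demand = {1: 5.0, 2: 5.0, 3: 5.0, 4: 5.0}

clusters = nearest_source_assignment([1, 2, 3, 4], [0, 5], distance)
routes = {s: nearest_neighbour_routes(s, cs, distance, demand) for s, cs in clusters.items()}
print(clusters)
print(routes)
```

Because each source is routed separately, a customer is never served from a route that starts at another source, even when that would be cheaper. The genetic algorithm does not have this restriction.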
Table 6.2 presents the confidence intervals for the five experiments, obtained by executing the
algorithm 25 times per experiment. The confidence intervals are based on Student's
t-distribution. For a detailed description of the determination of the confidence intervals, we
refer to Law (2007). For all experiments and all 25 executions the best solution obtained by
the genetic algorithm is at least as good as the best solution obtained by the nearest neighbour
algorithm. To be precise, the genetic algorithm obtained the same best solution as the nearest
neighbour algorithm seven times out of the 125 (25 × 5) executions. For the other 118 executions,
the genetic algorithm presents a better solution.
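A confidence interval of this kind can be computed as sketched below in Python. The sketch assumes the standard interval for a mean, x̄ ± t · s / √n, applied to the percentage improvements of the individual executions; the exact procedure of Law (2007) may differ in detail. The critical value 2.064 is the two-sided 95% quantile of Student's t-distribution with 24 degrees of freedom and is therefore only valid for 25 executions. The function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

T_CRITICAL_95_DF24 = 2.064   # two-sided 95% quantile, 24 degrees of freedom (n = 25)


def confidence_interval(improvements, t_critical=T_CRITICAL_95_DF24):
    """Confidence interval for the mean of the observed percentage improvements."""
    n = len(improvements)
    centre = mean(improvements)
    half_width = t_critical * stdev(improvements) / sqrt(n)
    return centre - half_width, centre + half_width


# 25 hypothetical percentage improvements of the genetic algorithm over the benchmark.
sample = [1.0] * 12 + [2.0] + [3.0] * 12
lower, upper = confidence_interval(sample)
print(round(lower, 2), round(upper, 2))
```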
The confidence intervals represent the intervals in which the percentage of improvement lies
with 95 percent confidence. Notice that the largest improvements from applying the genetic
algorithm instead of the nearest neighbour algorithm are achieved for the experiments including
50 locations. This may be explained by the following reasoning: the nearest neighbour algorithm
assigns customers to their nearest source and determines the routing schemes by considering
the separate source problems. Customer location A may be located near customer location B,
while the nearest source for customer A may be source 1 and the nearest source for customer
B may be source 2. The nearest neighbour algorithm then includes customer location A within
a route that starts and ends at source 1, while it may be more efficient to include it in a route
that starts and ends at source 2; for example, it may be more efficient to visit customer
location A after visiting customer location B. This possibility is included in the genetic
algorithm. When considering only a small number of locations, these locations are more spread
out across the country, which implies that the achieved improvement may be higher than the
improvement obtained when considering a large number of locations.
Experiment   Level of   Level of   Level of    μ    λ    95% Confidence
Number       Factor 1   Factor 2   Factor 3              Interval
1            high       high       high        9   28    (1.04%, 1.54%)
2            high       high       low        12   37    (0.81%, 1.37%)
3            low        high       high        6   19    (3.36%, 6.44%)
4            low        high       low        24   73    (2.23%, 4.51%)
5            low        low        low        30   91    (6.36%, 10.83%)
Table 6.2: Confidence Intervals
Notice that for the five experiments several other settings of μ and λ have been investigated
as well. In accordance with earlier findings, the most suitable parameter values for μ and λ
remain unclear. As previously mentioned, further research is required.
6.3 Comparing Availability of Multiple Sources with Only a Distribution Center Available

Let us now compare the situation in which all customer locations are assigned to the
distribution center with the situation in which multiple sources are available. For the setting
of low levels for all three factors the following confidence intervals are obtained by again
performing 25 executions:
• Confidence interval corresponding to the genetic algorithm with multiple sources
  available versus the genetic algorithm with only the distribution center available for
  supplying the customer locations: (18.26%, 27.68%).
  This implies that including the possibility of supplying customer locations from multiple
  sources results, with a 95% confidence level, in cost savings of 18.26% to 27.68%.
• Confidence interval corresponding to the genetic algorithm with multiple sources
  available versus the nearest neighbour algorithm with only the distribution center available
  for supplying the customer locations: (23.35%, 33.50%).
  This large improvement is to be expected, as it has already been shown that the genetic
  algorithm improves on the nearest neighbour algorithm when both have multiple sources
  available.
• Confidence interval corresponding to the genetic algorithm with only the distribution
  center supplying the customer locations versus the nearest neighbour algorithm with
  only the distribution center as well: (5.69%, 9.10%).
  This illustrates again that the genetic algorithm obtains a better solution than the nearest
  neighbour algorithm: with a 95% confidence level, the improvement is 5.69% to 9.10%.
The genetic algorithm requires some modifications to obtain a solution for the situation in
which all customer locations are assigned to the distribution center. One of these modifica-
tions is that a route duration constraint is inserted to obtain feasible solutions. The drawback
of implementing a route duration constraint within the routing part of the genetic algorithm, is
that all chromosomes generated are similar. This is due to the fact that the assignment schemes
are all the same, i.e. all customer locations are assigned to the distribution center, and due to
the application of the nearest neighbour algorithm to obtain the routing schemes. To overcome
this problem, and to let the chromosomes converge, the education step is activated.
Notice that the comparison between the situation where multiple sources are available for dis-
tribution to customers, and the situation in which all customer locations are supplied from the
distribution center, is only discussed for one of the experiments. For the other four experiments
the algorithm can cope with, similar findings hold.
As mentioned at the beginning of this chapter, every experiment is executed with a stopping
criterion of IT_allowed = 10^4 iterations. The figures presented throughout this chapter all
indicate that the algorithm converges within this number of iterations. Therefore, no other
stopping criteria are investigated.
Now, let us present an overview of the main results:
• To be able to specify the parameter values for μ and λ, further research is required.
• The most suitable parameter value for nc is either nc = 0.15 or nc = 0.20.
• The genetic algorithm presented in this thesis outperforms the nearest neighbour
  algorithm.
• Including the possibility of supplying customer locations from multiple sources instead
  of supplying from one distribution center results in cost savings.
Chapter 7
Conclusion
In this thesis we have discussed the distribution of products to final customers by means of a
home delivery service. The most common manner of distributing products is direct distribution
from a central distribution center. An alternative is indirect distribution, in which customers
are supplied from intermediate points such as physical stores, hubs or depots, with these
intermediate points being supplied from the distribution center. Obviously, a combination of
direct and indirect distribution is possible as well.
Applying this combination of direct and indirect distribution requires the answer to an
interesting question: how to assign customers to the sources appropriately and determine
suitable routes? In this thesis, this problem is modeled as a variant of the Multi-Depot Vehicle
Routing Problem (MDVRP). A variant of the MDVRP is applied, instead of the regular MDVRP, due to
the relatedness of the sources. The regular MDVRP requires independent sources, while in the
situation of combining direct and indirect distribution the sources are related, since the
distribution center, being a source itself, also supplies the other sources, i.e. the
intermediate points.
A hybrid genetic algorithm is presented to solve the variant of the MDVRP and answer the
assignment and routing question. The test results show that the algorithm works appropriately
for several input sets. Eight situations are examined by applying a 2 × 2 × 2 factorial
experiment, where the factors correspond to the number of locations included, the demand range
and the service time range. For each factor a high and a low level is specified. The algorithm
turns out to deal well with five of the eight combinations of the levels. It encounters issues
when dealing with low-level demand due to infeasibility. Dealing with low-level demand in
combination with a large number of locations is not possible, independent of the specified
service time range. The same kind of problem is encountered for the situation of a small number
of locations, low-level demand and high-level service times.
The infeasibility problems could be solved by redefining the algorithm to ensure that always
at least one feasible solution is included. This may be done by the implementation of another
routing algorithm or the adjustment of the nearest neighbour algorithm by imposing a soft
route duration constraint.
Two main results have been demonstrated for the five experiments the algorithm could cope
with. First, the genetic algorithm was shown to outperform the nearest neighbour algorithm.
Second, including the possibility of supplying from multiple sources instead of from one dis-
tribution center, results in cost savings.
Chapter 8
Further Recommendations
The most explicit recommendation concerns further research on suitable parameter values
of μ and λ. In this thesis, we were not able to specify the most appropriate parameter values,
as different patterns were observed for each execution of the algorithm. Therefore, further
research is required to be able to properly define the parameters.
Furthermore, in this thesis several assumptions have been made to define the scope of the
research. Relaxing some restricting assumptions or applying different assumptions might re-
sult in interesting situations.
Examples of assumptions to apply, yielding different situations that may be worthwhile to
consider, are:
• Include restrictions on the storage capacity and inventory level at the sources. Although
  this increases the complexity of the problem structure, it makes it more realistic as well.
  A check must be included on the availability of the required products at the sources
  before assigning the customers.
• Include a capacity constraint for the vehicles supplying the intermediate points. Again,
  this increases the complexity of the structure of the problem while making it more
  realistic. It requires a check on the possibility of supplying the intermediate points with
  products meant for home delivery as well as the check of availability of the required
  products at the sources, before assigning customers.
• Include the possibility of consolidation of orders over time. This requires a multi-period
  approach.
• Include return flows and return handling costs. Dealing with return flows is a well-known
  problem nowadays. The problem structure would change as, besides the delivery
  of products, pickups need to be included as well. The available vehicle capacity should be
  considered in a slightly different way and the assignment and routing decisions become
  more complicated.
• Include delivery time windows. Nowadays, it is quite common that customers can choose
  a delivery time from prespecified time windows. Including such an option does not really
  change the problem when time windows are hard and no switches of customers between
  time windows occur. Routes would be planned per time window; the only modification
  that should be made is that routes within the time windows do not need to end at
  the same source the route started from. The only requirement is that at the end of the day,
  vehicles should return to this source. Dealing with soft delivery time windows requires
  more modifications.
• Include a restriction on the number of vehicles available at the sources. In this thesis, it is
  assumed that an infinite number of vehicles is available at each source. The best solution
  obtained when executing the algorithm presents, besides the most suitable assignment
  and routing scheme, the required number of vehicles per source. More realistic is the
  assumption that the number of vehicles available at the sources is fixed. To include this
  assumption, a constraint on the number of available vehicles is required, which increases
  the complexity of the assignment and routing decision.
• Include a cost related to the service at the customers. In this thesis, no costs are included
  corresponding to the service time at the customers. The vehicles do not drive and hence,
  no transport costs are made although the drivers do spend time at the customers while
they get paid. A cost could be included corresponding to the time spend at the customers.
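As a hedged illustration of the last item, the cost of a route could be extended with a term for the time drivers spend at the customers. The sketch below is not part of the thesis implementation: the hourly wage rate and all function and parameter names are hypothetical, and the default of 2 euro per kilometer merely echoes the transport cost c1 stated earlier in the thesis.

```python
def route_cost(distance_km, service_minutes, cost_per_km=2.0, wage_per_hour=0.0):
    """Return the transport cost plus an assumed service-time cost of one route.

    distance_km:     total distance driven on the route
    service_minutes: service times (in minutes) of the customers on the route
    cost_per_km:     transport cost per kilometer (c1 in the thesis)
    wage_per_hour:   hypothetical driver cost per hour spent at customers
    """
    transport_cost = cost_per_km * distance_km
    service_cost = wage_per_hour * sum(service_minutes) / 60.0
    return transport_cost + service_cost


# Example: a 40 km route serving three customers for 20, 40 and 60 minutes.
without_service = route_cost(40.0, [20, 40, 60])
with_service = route_cost(40.0, [20, 40, 60], wage_per_hour=30.0)
```

With a wage rate of zero the function reduces to the distance-based cost. A positive rate makes routes with long service times more expensive, which could shift the assignment of customers to sources.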
Besides investigating different situations by including and/or adapting assumptions, two other
aspects would be interesting to examine. First, it would be interesting to investigate an
input set with a large number of locations. Second, it would be interesting to investigate
a larger range of parameter values for such a large input set, as this may increase
the solution quality. In this thesis, these two aspects could not be investigated due to
memory issues MATLAB encountered. Another MATLAB version, or another computer with more
memory, might solve these memory issues.
Last but not least, the genetic algorithm may be adjusted by applying another routing
algorithm or by including a route duration constraint within the nearest neighbour algorithm,
in order to cope with the infeasibility issues we encountered and to investigate more input sets.
Bibliography
Chao, I.M., B.L. Golden, and E. Wasil (1993). A new heuristic for the multi-depot vehicle routing
problem that improves upon best-known solutions. American Journal of Mathematical and
Management Sciences 13(3), 371-406.
Clarke, G. and J.W. Wright (1964). Scheduling of vehicles from a central depot to a number of
delivery points. Operations Research 12(4), 568-581.
Cordeau, J.F., M. Gendreau, and G. Laporte (1997). A tabu search heuristic for periodic and
multi-depot vehicle routing problems. Networks 30, 105-119.
Dantzig, G.B. and J.H. Ramser (1959). The truck dispatching problem. Management Science 6(1),
80-91.
Dueck, G. (1993). New optimization heuristics: The great deluge algorithm and the
record-to-record travel. Journal of Computational Physics 104, 1086-1094.
Gendreau, M., J.-Y. Potvin, O. Bräysy, G. Hasle, and A. Løkketangen (2008). Metaheuristics for
the vehicle routing problem and its extensions: A categorized bibliography. Chapter in: The
Vehicle Routing Problem: Latest Advances and Challenges, 143-169. Golden, B., S. Raghavan, and
E. Wasil (Eds.), Springer.
Gillett, B.E. and J.G. Johnson (1976). Multi-terminal vehicle-dispatch algorithm. Omega 4,
711-718.
Glover, F. (1990). Tabu search - part II. ORSA Journal on Computing 2(1), 4-32.
Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning.
Addison-Wesley.
Golden, B., T. Magnanti, and H. Nguyen (1977). Implementing vehicle routing algorithms.
Networks 7, 113-148.
Hansen, N. and A. Ostermeier (2001). Completely derandomized self-adaptation in evolution
strategies. Evolutionary Computation 9(2), 159-195.
Ho, W., G.T.S. Ho, P. Ji, and H.C.W. Lau (2008). A hybrid genetic algorithm for the multi-depot
vehicle routing problem. Engineering Applications of Artificial Intelligence 21(4), 548-557.
Holland, J.H. (1975). Adaptation in natural and artificial systems. Ann Arbor: University of
Michigan Press.
Law, A.M. (2007). Simulation Modeling and Analysis. McGraw-Hill.
Liong, C.Y., I. Wan Rosmanira, O. Khairuddin, and M. Zirour (2008). Vehicle routing problem:
Models and solutions. Journal of Quality Measurement and Analysis 4(1), 205-218.
Mirabi, M., S.M.T. Fatemi Ghomi, and F. Jolai (2010). Efficient stochastic hybrid heuristics
for the multi-depot vehicle routing problem. Robotics and Computer-Integrated
Manufacturing 26(6), 564-569.
Pisinger, D. and S. Ropke (2007). A general heuristic for vehicle routing problems. Computers
& Operations Research 34(8), 2403-2435.
Prins, C. (2004). A simple and effective evolutionary algorithm for the vehicle routing problem.
Computers & Operations Research 31(12), 1985-2002.
Raft, O.M. (1982). A modular algorithm for an extended vehicle scheduling problem. European
Journal of Operational Research 11(1), 67-76.
Renaud, J., G. Laporte, and F.F. Boctor (1996). A tabu search heuristic for the multi-depot
vehicle routing problem. Computers & Operations Research 23, 229-235.
Ropke, S. and D. Pisinger (2006). An adaptive large neighborhood search heuristic for the
pickup and delivery problem with time windows. Transportation Science 40(4), 455-472.
Tillman, F.A. and T.M. Cain (1972). An upperbound algorithm for the single and multiple
terminal delivery problem. Management Science 18(11), 664-682.
Vidal, T., T.G. Crainic, M. Gendreau, N. Lahrichi, and W. Rei (2011). A hybrid genetic algorithm
for multi-depot and periodic vehicle routing problems. Technical Report 2011-05 CIRRELT,
University of Montreal.
Wren, A. and A. Holliday (1972). Computer scheduling of vehicles from one or more depots to
a number of delivery points. Operational Research Quarterly 23(3), 333-344.
Zhang, J., J. Tang, and F.Y.K. Fung (2011). A scatter search for multi-depot vehicle routing
problem with weight-related cost. Asia-Pacific Journal of Operations Research 28(3), 323-348.
Appendix I
Input set 4
The fourth investigated input set includes 100 locations: one distribution center, four
intermediate points and 95 customer locations. The demand range is [2; 4; 6] m² and the
service time range is [20; 40; 60] minutes.
Figure 1: Input set 4, execution 1

Input set 5
The fifth investigated input set includes 250 locations: one distribution center, nine
intermediate points and 240 customer locations. The demand range is [3; 4; 5] m² and the
service time range is [10; 45; 90] minutes.
