
Produktion und Logistik

Edited by
C. Bierwirth, Halle, Germany
B. Fleischmann, Augsburg, Germany
M. Fleischmann, Mannheim, Germany
M. Grunow, München, Germany
H.-O. Günther, Bremen, Germany
S. Helber, Hannover, Germany
K. Inderfurth, Magdeburg, Germany
H. Kopfer, Bremen, Germany
H. Meyr, Stuttgart, Germany
K. Schimmelpfeng, Stuttgart, Germany
Th. S. Spengler, Braunschweig, Germany
H. Stadtler, Hamburg, Germany
H. Tempelmeier, Köln, Germany
G. Wäscher, Magdeburg, Germany
This series serves the publication of new research results in the fields of production
and logistics. It primarily includes outstanding, quantitatively oriented dissertations
and habilitation theses. The publications offer innovative contributions to solving
practical application problems in production and logistics using quantitative methods
and modern information technology.

Edited by
Professor Dr. Christian Bierwirth, Universität Halle
Professor Dr. Bernhard Fleischmann, Universität Augsburg
Professor Dr. Moritz Fleischmann, Universität Mannheim
Professor Dr. Martin Grunow, Technische Universität München
Professor Dr. Hans-Otto Günther, Technische Universität Berlin
Professor Dr. Stefan Helber, Universität Hannover
Professor Dr. Karl Inderfurth, Universität Magdeburg
Professor Dr. Herbert Kopfer, Universität Bremen
Professor Dr. Herbert Meyr, Universität Hohenheim
Professor Dr. Katja Schimmelpfeng, Universität Hohenheim
Professor Dr. Thomas S. Spengler, Technische Universität Braunschweig
Professor Dr. Hartmut Stadtler, Universität Hamburg
Professor Dr. Horst Tempelmeier, Universität Köln
Professor Dr. Gerhard Wäscher, Universität Magdeburg

Contact
Professor Dr. Thomas S. Spengler
Technische Universität Braunschweig
Institut für Automobilwirtschaft
und Industrielle Produktion
Katharinenstraße 3
38106 Braunschweig
Thomas Kirschstein

Integrated Supply Chain Planning in Chemical Industry
Potentials of Simulation in Network Planning
Foreword by Prof. Dr. Claudia Becker and Prof. Dr. Christian Bierwirth
Thomas Kirschstein
Halle (Saale), Germany

Dissertation University of Halle (Saale), 2014

Produktion und Logistik


ISBN 978-3-658-08432-5 ISBN 978-3-658-08433-2 (eBook)
DOI 10.1007/978-3-658-08433-2

Library of Congress Control Number: 2014958973

Springer Gabler
© Springer Fachmedien Wiesbaden 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or
part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by
similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are
exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained
herein or for any errors or omissions that may have been made.

Printed on acid-free paper

Springer Gabler is a brand of Springer Fachmedien Wiesbaden


Springer Fachmedien Wiesbaden is part of Springer Science+Business Media
(www.springer.com)

Foreword
In the literature on operations management, the chemical industry is primarily addressed
through focused case studies. The limited flexibility of production processes and the
volatility of customer demand have been identified as the most prominent challenges for
chemical supply chain management. Most chemical production assets are complex, inflexible,
and incur high set-up costs. Hence, such assets can only be used in an economically
reasonable way if they are operated continuously. Logistical processes therefore contribute
substantially to a chemical company’s value added. For example, logistical processes ensure
the supply of downstream assets in case of asset breakdowns. More importantly, however,
logistical processes allow demand variations to be balanced within chemical production
networks. Transport and distribution planning is thus essential in both intra-site logistics,
which mostly relies on pipeline transportation, and inter-site logistics, which mostly relies
on rail and ship transportation.
With this book, Thomas Kirschstein covers this topic in an encompassing, profound,
and general way. Starting from basic chemical production processes, which are modelled
with time series methods, the book presents established and new logistical planning problems
for distributed chemical production networks. Finally, these components are integrated into a
simulation-based planning framework, which is implemented in a decision support system.
The decision support system is validated by means of case studies relying on
historical records of a real-world chemical production network.
The present work is an important contribution to the scientific literature from both a
methodological and an application-oriented point of view. It systematically develops the basic
elements for modelling complex chemical production networks and illustrates the benefits of
advanced decision support systems in chemical supply chain management. We wish the book
continued success and wide acceptance.

Claudia Becker and Christian Bierwirth



Preface
This book is the product of a process that started in 2006 as a cooperation seminar
between Dow Olefinverbund GmbH and the chairs of Production & Logistics, Statistics,
and Operations Research. At that time, I couldn’t imagine a result like this. Over the
years, not only has the project developed and changed, but my experience, skills, and
expectations have grown as well. These developments would have been impossible without the
support of many people. The space provided here does not suffice to duly thank all
of them, so the following thanks are exemplary rather than exhaustive.
Special thanks are due to my supervisors Prof. Dr. Claudia Becker, Prof. Dr. Christian
Bierwirth, and Prof. Dr. Taïeb Mellouli, who offered invaluable support through their advice
and constant commitment in all those years.
No less important was the support of all employees of Dow Olefinverbund GmbH, who
allowed me to study not only Dow’s production and logistics processes but also their
day-to-day business. In particular, I thank Wolfgang Schnabel, Andreas Kroupa, and Wubbe
Prins for organizing and coordinating the practical part of this project. Without their
support, this book would not exist.
Besides my direct supporters, I’d like to thank all colleagues at the department of
Economics and Business Administration. Thanks to them, research and teaching at the
university was always a pleasant and (almost) always a productive challenge. Representative
of all colleagues, I thank Heidrun Rudolph, Lisiane Schnegelsberg, and Dr. Steffen
Liebscher from the chair of Statistics as well as Ute Lorenz, Dorota Mańkowska, Jens
Kuhpfahl, and Prof. Dr. Frank Meisel from the chair of Production & Logistics.
My family and friends made an important contribution to the successful completion of
this book. Thanks to numerous discussions about the contents of the project and their
constant interest in its progress, I never lost sight of its goal.
Above all, it is unthinkable that this book would exist without the support of my wife
Susanne and my daughters Antonia and Emilia. All three of them are constant sources of joy,
inspiration, and recuperation, in particular in times of frustration and self-doubt. For
their support, I’m deeply grateful.

Thomas Kirschstein

Contents

List of Tables

List of Figures

List of Abbreviations

List of Notation

1 Introduction
1.1 Motivation
1.2 Outline of the thesis

2 Chemical production processes
2.1 Characterization of chemical production processes
2.2 Modelling chemical production processes
2.2.1 Chemical kinetics
2.2.2 Modelling & simulation of chemical processes
2.2.3 Process identification & control
2.3 Time series methodology
2.3.1 ARIMA models
2.3.2 GARCH models
2.3.3 Multivariate time series models
2.3.4 Data preparation, model specification and residual checking

3 Distribution planning in chemical industry logistics
3.1 Characteristics of chemical industry logistics
3.2 Planning problems for pipeline operations
3.2.1 Technical and organizational prerequisites
3.2.2 Single-product pipelines
3.2.3 Multi-product pipelines
3.2.3.1 Batch flow pipelines
3.2.3.2 Batch split pipelines
3.2.3.3 Multi-source pipeline systems
3.3 Planning problems for rail operations
3.3.1 Technical and organizational prerequisites
3.3.2 A short-term rail transportation problem
3.3.2.1 Problem formulation
3.3.2.2 Components for modelling rail transports
3.3.2.3 Components for modelling turnover processes
3.3.2.4 Components for modelling the objective function
3.3.2.5 Mathematical model
3.4 Planning problems for ship operations
3.4.1 Technical and organizational prerequisites
3.4.2 Maritime inventory routing problems
3.4.3 Maritime inventory shipping problems

4 Integrated planning of chemical supply chains
4.1 Literature review
4.2 Sources and effects of uncertainty in chemical industry
4.3 A framework for simulation-based integrated planning of supply chains in chemical industry
4.3.1 Conceptual modelling & data analysis
4.3.2 Components of chemical supply chain simulation models
4.3.3 Verification & validation
4.3.4 Planning of simulation experiments
4.3.4.1 Performance measures in (chemical) supply chain models
4.3.4.2 Experimental designs
4.3.4.3 Simulation optimization
4.3.5 Decision support

5 Conclusion and final remarks

Bibliography

Appendix

List of Tables

2.1 Typology of production processes
2.2 Characteristics of production processes in chemical industry
2.3 EACF for T = 100
2.4 EACF for T = 1000
2.5 Theoretical EACF
2.6 Information criteria for both sampled time series and various model specifications
2.7 Estimated parameters for the Naphtha time series
2.8 Coefficients of the initial VARX(3) model for de-alkylation plant
2.9 Coefficients of the VARX(3) model with outlier correction and variable selection

3.1 Categorization of pipeline types
3.2 Transition matrices and production modes for providers and consumers
3.3 All combinations of plant production modes
3.4 Transition matrix Q
3.5 Comparison of problem features for Magatão et al. (2004) and Relvas et al. (2006)
3.6 Set of parameters and decision variables for the sELSP
3.7 Transition costs $c^{trans}_{st}$ for the PIG insertion scenario
3.8 Interface quantities $d_{st}$ and transition costs $c^{trans}_{st}$ for the interface scenario
3.9 Net demand rate $\omega_s$ and holding cost rates $c^{hold}_s$ for both scenarios
3.10 Resulting optimal pumping cycles for the sELSP-BP and sELSP-BP-IF
3.11 Pumping times and idle times of the optimal schedules for the sELSP-BP and sELSP-BP-IF
3.12 Classification of literature on scheduling of one-to-many pipeline systems
3.13 Sets, parameters, variables, and decision variables for the MC-RTP
3.14 Exemplary assignment of RTCs to two trains
3.15 Technical parameters for the rail operations planning example
3.16 Consumption rate $\omega_{is}$, stock capacities $s^{Cap}_{is}$, initial stocks $s^{Ini}_{is}$, and target stock levels $s^{Tar}_{is}$ for all sites $i$ and chemicals $s$
3.17 Sets, parameters, variables, and decision variables for the MISP-STA
3.18 Technical specification of tankers and tanks ($n^{Cap}_{vk}$, $q^{Cap}_{sk}$, $c^{Trv}_{v}$)

4.1 Classification of literature on integrated SC configuration and management planning in chemical industry
4.2 Classification of literature on integrated SC management planning in chemical industry
4.3 Classification of sources of uncertainties and examples (per dimension)
4.4 Material balances (in t/hour) and storage capacities (in t) per site and chemical
4.5 Examples for processors according to processor type and processing attributes
4.6 Relation between aggregation levels and types of processors
4.7 Pseudo-code for chemical SC simulation model
4.8 Classification of V&V techniques
4.9 Attributes of variables
4.10 Examples and classification of performance measures in SCM
4.11 Values of the control variables for the experimental design
4.12 Effects of input variables on responses in example 11
4.13 Overview on simulation optimization methods
4.14 Optimal and baseline values for inventory parameters
4.15 Performance measures for optimal and baseline values of inventory parameters

A.1 Information criteria for VARX models of order 1 to 6 for the de-alkylation plant
A.2 Coefficients of the VARX(3) model with outlier correction
A.3 ANOVA table for the initial VARX(3) model (without outlier correction)
A.4 ANOVA table for the initial VARX(3) model with outlier correction and variable selection
A.5 Stationary flow rates (in t/h) for exemplary chemical production network
A.6 Coefficients of the time series models for cracker at site 1 (VARX(2) + AR(3))
A.7 Coefficients of the time series models for cracker at site 2 (VARX(2) + AR(3))
A.8 Coefficients of the time series models for hydrogenation plant at site 1 (VARX(3) + AR(2))
A.9 Coefficients of the time series models for hydrogenation plant at site 2 (VARX(3) + AR(3))
A.10 Coefficients of the time series models for Butex plant at site 2 (VARX(1) + AR(4))
A.11 Coefficients of input and output time series models of SISO and MISO plants
A.12 Un-/loading capacities, initial stock of empty RTCs, target and initial stock levels per transported chemical at both production sites
A.13 Cost rates for the MC-RTP instances
A.14 Distributions of inter-arrival time and deliver quantities for external customers/suppliers
A.15 Transition matrices for plants at site 1
A.16 Transition matrices for plants at site 2
A.17 Resolution V design for 6 dichotomous variables ("-1" encodes the variable’s lower level and "1" the upper level)
A.18 Estimated responses for all possible configurations (dominated configuration gray coloured)
A.19 Pseudocode for inventory system model
A.20 Summary for the logistic regression model of responses of efficient settings

List of Figures

1.1 Overview on the structure of the thesis

2.1 Chemical SC scheme with highlighted production plants
2.2 Production quantities of basic chemicals in 2008-2010
2.3 Exemplary production network of Naphtha derivatives
2.4 Scheme of a distillation column
2.5 Overview on chemical operations
2.6 Steps in chemical process modelling & control
2.7 Topics in chemical kinetic and key words/methods
2.8 Schematic flow sheet of a steam cracker
2.9 Chemical process control scheme
2.10 Theoretical and empirical ACF and PACF for an ARMA(2,1) process
2.11 ACF of residuals, ACF of squared residuals and QQ-plot of residuals for ARMA(2,0) model (small sample, T = 100) and for ARMA(2,1) model (large sample, T = 1000)
2.12 Time series plot of Naphtha inflow rate
2.13 ACF and PACF of corrected Naphtha time series
2.14 Flowsheet of the de-alkylation plant
2.15 Raw data for a de-alkylation plant
2.16 Residual diagnostic plots for VARX(3) model of the de-alkylation plant
2.17 Scatterplot of residuals for outlier corrected VARX(3) model of the de-alkylation plant
2.18 Real and fitted time series of both outflow rates (outlier indices superimposed)

3.1 Chemical SC scheme with highlighted inter-site transports
3.2 Modal split for chemical products in Germany in 2009 (based on total transported quantity in mill. tons)
3.3 Scheme of an exemplary serial multi-access pipeline
3.4 Cumulative distribution function and loss function for Y(4)
3.5 Inventory pattern in a batch flow system
3.6 Illustration of interface calculation in batch pipeline systems
3.7 Illustration of interface calculation in batch pipeline systems
3.8 One-to-many pipeline types
3.9 Four-layer flow network for one chemical
3.10 Optimal stock levels and network flows for periods 1 to 8 of the MC-RTP example
3.11 Optimal stock levels for the MC-RTP instance of example 6
3.12 Optimal stock levels and chemical flows for periods 1 to 8 of the MISP-STA example
3.13 Optimal stock levels for the MISP-STA instance of example 7

4.1 Chemical SC scheme for integrated planning
4.2 Adapted SCM matrix (based on Stadtler (2005))
4.3 Daily median inflow rates of aromatic hydrocarbons for a de-alkylation plant
4.4 Hourly average inflow rate of aromatic hydrocarbons (including confidence intervals at a confidence level of α = 0.1%)
4.5 Diagnostic plots of trimmed inflow rates of aromatic hydrocarbons for a de-alkylation plant
4.6 Steps in the simulation-based planning projects
4.7 Development scheme for simulation studies
4.8 Exemplary chemical supply chain with two sites
4.9 Flow chart of a part of the exemplary chemical SC using the notation of Table 4.5
4.10 Development scheme for simulation studies
4.11 Steps in experimental planning
4.12 Simulated and fitted responses per experimental configuration
4.13 Loss functions for Naphtha consumption during pipeline inspection and their relation
4.14 Density and loss function for total Naphtha consumption during order lead time
4.15 Scatterplot of Pareto front (β-service level in grey scale)
4.16 Grey-scaled levelplot of estimated Pareto front
4.17 Example of Delaunay triangulation-based sub-sample generation
4.18 Diagnostic plots of ARDs
4.19 Graphical determination of optimal performance vector

A.1 Diagnostic plots for the ARX(1) model of the Naphtha time series
A.2 Residual diagnostic plots for VARX(3) model with outlier compensation
A.3 QQ-plots of residuals for models (4.19)-(4.21)
A.4 Density function of the Weibull distribution with $k = 1.5$ and $\lambda = \frac{365}{\Gamma(5/3)}$
A.5 Loss functions for both sites during pipeline inspection
A.6 Diagnostic plots for (4.22)

List of Abbreviations

AB agent-based
ACF auto-correlation function
AIC Akaike information criterion
AR(X) auto-regressive [model] (with exogenous variables)
ARCH(X) auto-regressive conditional heteroscedasticity [model] (with exogenous variables)
ARIMA(X) auto-regressive integrated moving average [model] (with exogenous variables)
ARMA(X) auto-regressive moving average [model] (with exogenous variables)
ARD average relative deviation
CV corporate value
DE discrete event
DSS decision support system
EACF extended auto-correlation function
(s)ELSP(-BP-IF) (sequence-dependent) economic lot sequencing problem (with global stocks and interface handling)
FIR finite impulse-response
GARCH(X) generalized auto-regressive conditional heteroscedasticity [model] (with exogenous variables)
GA genetic algorithm
GDP gross domestic product
h hour
HQIC Hannan-Quinn information criterion
IRP inventory routing problem
IPPSP integrated production planning and scheduling problem
(s)LP (stochastic) linear program
MA(X) moving average [model] (with exogenous variables)
MAD median absolute deviation (from median)
MC-RTP multi-chemical rail transportation problem
(s)MILP (stochastic) mixed-integer linear program
MIMO multiple input, multiple output
MINLP mixed-integer non-linear program
MIP mixed-integer program
MIRP maritime inventory routing problem
MIRSP maritime inventory routing and scheduling problem
MISP(-STA) maritime inventory shipping problem (with ship type and tank assignment)
MISO multiple input, single output
NACE Nomenclature statistique des activités économiques dans la Communauté européenne
NLP non-linear program
NP-hard non-deterministic polynomial-time hard
NPV net present value
NSGA non-dominated sorting genetic algorithm
OLAP on-line analytical processing
PACF partial auto-correlation function
PE(T) polyethylene (terephthalate)
PDP pick-up and delivery problem
PIG pipeline inspection gauges
PP polypropylene
PS polystyrene
QQ-plot quantile-quantile plot
RMSE root mean squared error
ROV real-options-based value
RSA/RSM response surface approximation/methodology
RTC rail tank car
SBR styrene-butadiene rubber
SC supply chain
SCM supply chain management
SCOR Supply Chain Operations Reference
SD system dynamics
SDE stochastic differential equations
SIC Schwarz information criterion
SIMO single input, multiple output
SISO single input, single output
t ton
TSM time series methodology
TSP travelling salesman problem
V&V verification and validation
VAR(X) vector auto-regressive [model] (with exogenous variables)
VARMA(X) vector auto-regressive moving average [model] (with exogenous variables)
VLE vapour-liquid-equilibrium
VMA(X) vector moving average [model] (with exogenous variables)
VRP vehicle routing problem

List of Notation

General notation
$\mathbf{a}, \dots, \mathbf{z}, \boldsymbol{\alpha}, \dots, \boldsymbol{\omega}$ vectors
$\mathbf{A}, \dots, \mathbf{Z}, \boldsymbol{\Gamma}, \dots, \boldsymbol{\Omega}$ matrices
$\mathcal{A}, \dots, \mathcal{Z}$ sets
$i, j, k, b$ indices
$t, T$ time index, time horizon
$E(\cdot)$/$P(\cdot)$/$Var(\cdot)$/$Cov(\cdot, \cdot)$ expectation/probability/variance/covariance of $\cdot$
$\Gamma(\cdot)$ Gamma function
$\hat{x}$/$\hat{\hat{x}}$ estimate of $x$ / estimate of estimate of $x$
$Bin(n, p)$ Binomial distribution with $n$ trials and probability $p$
$N(\mu, \sigma^2)$ Normal distribution with mean $\mu$ and variance $\sigma^2$
$WB(k, \lambda)$ Weibull distribution with shape $k$ and scale $\lambda$
Notation of Chapter 2
$N$ index and number of lags for exogenous variables
$L$/$M$ number of output/exogenous variables
$K$ number of parameters
$p$/$q$/$P$/$Q$ order of AR/MA/ARCH/GARCH process
$\alpha, \beta$ coefficients of (G)ARCH models
$\gamma$/$\rho$, $Corr(\cdot, \cdot)$ partial correlation/correlation
$\phi, \theta$ AR and MA coefficients
$\boldsymbol{\Phi}, \boldsymbol{\Theta}$ VAR and VMA coefficient matrices
$\upsilon$/$\boldsymbol{\upsilon}$/$\boldsymbol{\Upsilon}$ coefficient scalars/vectors/matrices for exogenous regressors
$\delta(B), \omega(B), \theta(B), \phi(B)$ univariate lag polynomials
$\boldsymbol{\Phi}(B), \boldsymbol{\Theta}(B), \boldsymbol{\Upsilon}(B)$ multivariate lag polynomials
$\sigma^2$ variance
$\boldsymbol{\Sigma}$ covariance matrix
$\mu$/$\boldsymbol{\mu}$ mean scalar/vector
, ε, ξ, η/, ξ/Ξ error scalars/vectors/matrices
$B$ back-shift operator
$C(t)$ concentration function
$X(t)$ turnover function
$C_s$ starting concentration
$A$ reactant
$E_A$ activation energy
$R$ universal gas constant
$t_R$ Damköhler number
$m$ reaction order
$r$ reaction rate
$k$ reaction rate constant
$W(t)$ Wiener process
Notation of Chapter 3
$\mathcal{K}, \mathcal{H}$ set of state combinations
$\mathcal{I}$ set of lot positions
$\mathcal{N}$ set of access points
$\mathcal{O}$ set of states
$\mathcal{S}$ set of modes
$\rho^{Cap}$ pumping rate capacity
$\rho$ pumping rate
$\mathbf{Q}$ transition matrix
$q_{st}$ transition probability from state $s$ to state $t$
$\boldsymbol{\pi}$ steady state vector
$\omega_s$/$\omega_{\mathbf{s}}$ flow level of state $s$/state combination $\mathbf{s}$
$X_j$ flow rate at access point $j$
$y_h$ total material balance of state combination $h$
$p_h$ probability of state combination $h$
$\mu_C$ expected net deficit
$\alpha, \beta$ service level
$r$ stock level
$V(\cdot)$ expected loss function
$L$/$\bar{L}$ stock level / maximum stock level
$T^{stock}$/$T^{fill}$ stock-up time/pipeline fill time
$\tau$ time for pipeline transport
$c^{set}_s$/$c^{hold}_s$ set-up / holding costs for product $s$
$c^{trans}_{st}$ transition costs for a change from product $s$ to $t$
$b_{si}$ binary, 1 if at position $i = 0, \dots, I$ chemical $s$ is scheduled
$T_s$ cycle time of chemical $s \in \mathcal{S}$
$t^{b}_{i}$ starting time of position $i$
$x_{sti}$ binary, 1 if from position $i - 1$ to position $i$ a transition from chemical $s$ to $t$ occurs
$o_{sij}$ binary, 1 if chemical $s$ is lastly scheduled $j$ positions before $i$
$r_{si}$ binary, 1 if position $i$ is the last position where chemical $s$ is scheduled
$T^{cycle}$ fundamental cycle time
Notation of Section 3.3 summarized in Table 3.13
Notation of Section 3.4 summarized in Table 3.17
Notation of Chapter 4
D set of Delaunay simplices
I set of sites
O set of states
P set of products/chemicals
Y set of performance vectors
set of system configurations / efficient system
Z/Z ef f
configurations
ωs flow level of state s/state combination s
Q transition matrix
π steady state vector
ρCap pumping rate capacity
ρ pumping rate
q (opt) (optimal) order quantity
t(opt) (optimal) order interval
cbatch /chold batch-injection / holding cost for product
y vector of dependent measures
n number of dependent measures (length of y)
x vector of control variables
m number of control variables (length of x)
z vector of performance measures (typically z = (x, y))
ˆ real/simulation/meta-model mapping function linking
H(⋅)/Ĥ(⋅)/Ĥ(⋅)
x and y
C(⋅) cost function
U (⋅) profit/performance function
V (⋅) (first order) loss function
mi number of levels of control variable i
b binary vector, coding a discrete system configuration
μ intercept vector/basic performance
γ/γ/Γ coefficient scalar/vector/matrix of linear model
relative average target stock deviation at site i for
rip
product p
XXII

total average relative target stock deviation for all sites


r
and all products
sTar
ip target stock level for product p at site i
sCap
ip inventory capacity for product p at site i
sitp stock level for product p at site i in period t
tr total number of trains dispatched
trijt number of trains dispatched from i to j in period t
βip service level for product p at site i
β average service level for all products and sites
real
ωipt realized chemical flow of product p at site i in period t
plan planned/required chemical flow of product p at site i in
ωipt
period t
(si , Si ) inventory parameters (re-order level and stock-up level)
lit (Naphtha) stock level at location i in period t
l̄ average total stock level ( (1/T) ∑t ∑i lit )
δ number of shipments
talarm time interval between successive pipeline inspections
tinsp duration of pipeline inspections
tship order lead time
d Delaunay simplex (set of points constituting the simplex)
e/e relative performance deviation vector / average relative performance deviation

1 Introduction

1.1 Motivation
The chemical industry is a key industry in developed countries. Not only its size but also
its role as a raw material supplier for almost all other industries establishes the chemical
industry as one of the most important industries in a nation’s economy. The chemical
industry is defined by its type of production process: it focuses on the
modification of a substance’s chemical and/or physical properties.1
In this thesis the chemical industry and, particularly, the basic chemical industry is
the subject of interest. The basic chemical industry provides industrial goods (e.g. basic
plastics, coatings, fertilizer) as well as consumer products (e.g. detergents, tires, insulat-
ing materials). This industry can be seen as an indicator for a nation’s economy since
shocks in economic development (or the expectation of it) typically first affect the basic
industries. A general change in (expected) economic development induces a revision of
raw material stock levels to match the expected production plans in a broad range of com-
panies in various industries. Since almost all industries use various chemical products as
raw materials, the chemical industry is confronted with the accumulated "revision effect"
manifesting itself either in an increase or decrease of total customer demand. This is typi-
cally one of the first measurable and tangible effects of a change in economic development.
Hence, companies in the chemical industry face volatile customer demand that they can
hardly influence. Moreover, most chemical products are commodities, i.e. there are only
few possibilities for product differentiation, such that the market for chemical products is
highly competitive with a strong focus on prices.
Another crucial characteristic is that many chemical production processes are only cost-
efficient if operated continuously. Continuously operated chemical production plants are
expensive and complex technical systems. Typically, they are optimized for operation
at a specific conversion rate such that operational costs per output unit are minimized.
Deviations from the optimal production rates typically cause additional operational costs,
which strain the already small contribution margins.
Chemical transformation processes are typically split into multiple stages that are or-
ganizationally separated but technologically interdependent. Chemical production plants
are interconnected via direct energy and product flows but managed locally. Due to diverging
production processes in basic chemicals’ production as well as the processes’ continuity
and interconnectedness, chemical production plants are often clustered at large-scale
integrated chemical production sites. This form of organization is economically preferable
since co-products can be processed further with little logistical effort. However, managing
such industrial complexes is a challenging task since all product flows have to be kept in
balance simultaneously to maintain all production processes. This is particularly complicated
if unforeseen events occur. Therefore, local production and consumption as well as
material imports and exports have to be planned accordingly. This perspective is called
the intra-site perspective.

1
See Schönsleben (2011).

T. Kirschstein, Integrated Supply Chain Planning in Chemical Industry, Produktion und Logistik,
DOI 10.1007/978-3-658-08433-2_1, © Springer Fachmedien Wiesbaden 2015

Basic chemical companies often operate a certain number of such integrated production
sites such that a geographically dispersed network of production sites exists. To keep all
materials balanced at all sites, material transports between sites have to be considered.
This perspective is called the inter-site perspective.
In summary, basic chemical companies face a challenging combination of difficult market
conditions, inflexible and complex production processes, and a widespread, interdependent
production network. Hence, there is unavoidable pressure to minimize the operational
costs of production from a company-wide perspective. To identify and exploit
such a network’s cost saving potentials, elaborate decision support tools are necessary
taking into account not only the complex production network of each site but also the
interrelations between sites. Such a company-wide perspective typically addresses tac-
tical and strategical decisions since, on the operational level, the local management of
the production sites is responsible. However, operational processes have to be modelled
accurately to be able to anticipate the network’s reactions on an aggregated level. On the
tactical and strategical level, environmental influences have to be incorporated to reflect
the uncertain nature of future developments.
At this level of aggregation, multiple objectives are typically pursued. Besides the
operational costs of a certain network configuration, further criteria come into scope,
e.g. the investment costs to realize such a configuration as well as robustness aspects
and service measures.2 Thus, tactical/strategical decision support tools have to provide

• a reliable model of the production sites (including the production processes and the
local logistical processes),

• an accurate model of the logistical interactions between the production sites,

• the ability to handle multiple objectives, and

• a way to incorporate environmental stochastic processes.

2
Here, robustness can be understood as the ability of a system configuration to perform well under
various environmental scenarios such that the risk of a critical state of the network is minimized or,
at least, restricted to some upper bound.

Potential subjects for tactical/strategical network optimization are

• the capacities of the logistical system (for transport, turn-over, and stock holding),

• the inventory management in the network, and

• the capacities of production plants.

This thesis aims at providing a general framework for building a tactical/strategical decision
support model to optimize (large-scale) chemical production networks. This framework
consists of three core parts:

1. To model local chemical production processes at production sites, time series mod-
els are used which are able to accurately represent the dynamic, time-dependent
structure of chemical production processes.

2. To model logistical interactions in a chemical production network, mixed-integer
(linear) planning models are used to decide about the logistical activities within
the network on an operational level. Hence, the behaviour of the logistic systems is
described dynamically considering varying system states.

3. Finally, to integrate both components, a simulation environment is used that manages
the interactions between the production system and the logistical system.
Moreover, additional stochastic influences can be incorporated.

1.2 Outline of the thesis


This thesis is organized as follows: Chapter 2 gives an overview of chemical production
processes. Relevant chemical and physical concepts are introduced which are
essential to understand the dynamic nature of chemical reactions. It is shown that time
series models are adequate methods to describe the behaviour of chemical production
plants.
Chapter 3 provides an overview on logistical planning models typically occurring in
chemical production networks. These models are categorized by means of the transport
mode.
Chapter 4 constitutes the main chapter of this thesis. A literature review on integration
concepts for network planning in chemical industry is provided. A categorization of risk
sources affecting the performance of a network configuration is provided that subsequently
allows modelling of stochastic environmental disturbances and their effects. The integra-
tion of local production site models and logistical interaction models in a simulation
framework is described. Experimental designs and simulation optimization techniques
are described. These techniques are used to find configurations of the simulated network
which improve its performance.
The thesis concludes with Chapter 5 where the main results of this work are reviewed,
discussed, and summarized.
Throughout this thesis, all methods and techniques described are illustrated by small-scale
examples or case studies based on real-world data. In Chapter 4, a complex case
study inspired by a cooperation project with a chemical company is presented, integrating
most of the aforementioned methods. Figure 1.1 shows a graphical overview of
the structure of the thesis and the associated examples.

Chapter 1: Introduction
  section          example
  Motivation       -
  Outline          -

Chapter 2: Chemical production processes
  section          example
  Characteristics  -
  Classification   -
  Modelling        Ex. 1: Time series methodology (TSM)
                   Ex. 2: applied univariate TSM
                   Ex. 3: applied multivariate TSM

Chapter 3: Distribution planning
  section          example
  Pipeline         Ex. 4: Serial pipeline supply
                   Ex. 5: Batch pipeline sequencing
  Rail             Ex. 6: Rail distribution planning
  Ship             Ex. 7: Maritime inventory shipping

Chapter 4: Integrated planning of chemical supply chains
  section          example
  Literature       -
  Uncertainties    Ex. 8: Modelling plant states
  Case study       Ex. 9: Conceptual model of a chemical SC
                   Ex. 10: Implementation of a conceptual model
  Simulation       Ex. 11: Simulation optimization by experimental designs
                   Ex. 12: Simulation optimization by a genetic algorithm
                   Ex. 13: Post-optimization for decision support

Chapter 5: Conclusion

Figure 1.1: Overview on the structure of the thesis


2 Chemical production processes


This chapter provides an overview on basic definitions, terms, concepts, and techniques
to describe and model chemical production processes. This allows modelling of the core
components in chemical production networks. Figure 2.1 shows an exemplary chemical
production network where the production plants are highlighted.

[Figure: network scheme with suppliers (S1, S2), customers (C1, C2), storages, and production plants, connected by intra-site and inter-site flows]

Figure 2.1: Chemical SC scheme with highlighted production plants

From Figure 2.1, it becomes obvious that modelling the production plants provides the
basic data to describe the material flows within the whole network. Before the chemical
production processes are described in detail, some characteristics and key figures about
chemical products are provided.
Basically, two major groups of chemicals can be distinguished: inorganic and organic
chemicals. The latter subsumes all chemicals containing at least one bond of a hydrogen
and carbon atom. The former group encompasses all chemicals without such a bond.1
1
In particular, pure carbon, e.g. in the form of diamonds, is handled as an inorganic chemical. See
e.g. Seager and Slabaugh (2007) for more details.

T. Kirschstein, Integrated Supply Chain Planning in Chemical Industry, Produktion und Logistik,
DOI 10.1007/978-3-658-08433-2_2, © Springer Fachmedien Wiesbaden 2015

This work primarily focuses on organic chemicals which are produced by the (basic)
chemical industry. For the industrial production of organic chemicals, three basic natural
resources are available: natural gas, coal, and crude oil. Of these three materials, crude oil
is by far the most important raw material for the production of organic chemicals.
The (basic) chemical industry is an intermediary who transforms these raw materials
into basic and intermediate (organic) chemicals. Three main production phases can be
distinguished:

1. raw substance splitting: raw materials are split into (short-chained) basic chemicals

2. re-composition: basic chemicals are re-composed into intermediate chemicals

3. final composition: basic and intermediate chemicals react to final chemicals.

The category of basic chemicals comprises about 20-30 chemical substances building the
basis for all subsequent substances. This class summarizes basic organic substances as
well as basic gases and inorganic basic substances (such as Chlorine or Ammonia).2 These
substances are purely intermediate and not sold to consumer markets.
Intermediate chemicals are usually simply structured chemicals, most often composed
of basic chemicals, that are only exceptionally made for (private) consumer markets.
This class comprises, e.g., alcohols and many kinds of acids.3 Examples for marketable
intermediate chemicals are ammonia compounds used as basic fertilizers in agriculture.
The classes of basic and intermediate chemicals consist of a fairly small number of sub-
stances which are mainly fluids or gases. But these substances are used in vast quantities
to produce final chemical products. To give an impression, Figure 2.2 shows the produced
quantities of basic and intermediate chemicals in 2008-2010 for Germany (grouped as
stated above).4
[Figure: bar chart of production quantities in 1,000 tons (axis from 0 to 100,000) for organic chemicals, inorganic chemicals, and basic gases in 2008, 2009, and 2010]

Figure 2.2: Production quantities of basic chemicals in 2008-2010

2
See e.g. Behr et al. (2010, p. 10 ff.) or Baerns et al. (2006, ch. 16).
3
See e.g. Behr et al. (2010, p. 10 ff.) or Baerns et al. (2006, ch. 17).
4
These production statistics are calculated based on the European goods classification scheme NACE.
The three classes "basic gases", "inorganic chemicals", and "organic chemicals" correspond to the
NACE classes 20.11, 20.13, and 20.14, see Eurostat (2011).

The group of organic chemicals represents about 50% of the total quantity of produced
chemicals with a total quantity of about 40 million tons per year in Germany. These products
are typically crude oil derivatives. Of this quantity, approximately 20 million tons are
organic basic chemicals such as alkenes or aromatic compounds, whereas the remaining
20 million tons comprise organic intermediate chemicals such as alcohols or chlorine
derivatives.5
Final products in chemical industry are produced by chemical reactions of intermediate
and basic chemicals. These final products are widely used in almost all other industries.6
Final chemicals can be categorized into
1. polymers/plastics

2. agrochemicals (fertilizers, pesticides, etc.)

3. body care products (detergents, soaps, cosmetics, etc.)

4. speciality chemicals (coatings etc.)

5. pharmaceuticals.
Among these categories, pharmaceuticals and body care products form their own sub-industries
due to the special characteristics of their products and production processes.
These companies directly serve (private) consumer markets. The remaining three categories
comprise "classic" chemical companies mainly providing intermediate products
for other industries such as mechanical engineering industry, building industry, textile
industry, and plastics industry.
In the next section chemical production processes are characterized and categorized.
The theoretical modelling of the underlying chemical reactions is described subsequently.
Chemical production processes realise chemical reactions on an industrial scale in chemical
plants. Based on models of chemical reactions, methods are provided to describe the
behaviour of chemical production processes.

2.1 Characterization of chemical production processes


Production in the chemical industry differs in many respects from production in other industries.
In most industries final products are produced by mechanical transformation processes
such as assembling or machining. In contrast, in chemical industry the chemical and/or
physical properties of substances are altered. Most chemical production processes rely on
chemical reactions aiming at the transformation of reactants into substances of interest.
The term chemical reactions refers to changes of the reactants’ molecular structures.7 Besides
chemical reactions, physical transformation processes (so-called basic operations)
are also used to alter specific properties of the reactants.

5
See Eurostat (2011).
6
However, especially in the last two decades a trend towards deeper vertical integration and
specialization is visible in the chemical industry. This aims at product portfolios containing a higher
share of consumer goods which promise higher contribution margins and lower market risks.
7
See e.g. Baerns et al. (2006, p. 24 ff.).

Usually, chemical reactions require an initiation, i.e. they only take place under specific
circumstances. The list of reaction parameters is vast. Basic physical parameters are
pressure, temperature, electricity, or light. Specific constellations of these parameters
influence the reaction rate, i.e. how fast or slow a reaction takes place.8 If a reaction
can only take place by means of auxiliary chemicals, it is called catalytic and such an
auxiliary reactant is called a catalyst.9 Another important measure of chemical reactions
is the conversion rate of a reaction, i.e. what percentage of the input reactants’ mass
is transformed into the substances of interest.10 This measure is important to decide
whether a reaction can be realized in an economically profitable way.
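
The mass-based definition of the conversion rate given above can be illustrated by a minimal sketch. The function name, its arguments, and the example masses are assumptions made purely for illustration; they are not taken from the cited literature.

```python
def conversion_rate(input_mass: float, target_output_mass: float) -> float:
    """Percentage of the input reactants' mass that ends up in the
    substances of interest (hypothetical helper for illustration)."""
    if input_mass <= 0:
        raise ValueError("input_mass must be positive")
    if not 0 <= target_output_mass <= input_mass:
        raise ValueError("target_output_mass must lie between 0 and input_mass")
    return 100.0 * target_output_mass / input_mass


# Illustrative numbers only: 1,000 kg of reactants yield 850 kg of target product.
rate = conversion_rate(input_mass=1000.0, target_output_mass=850.0)
print(f"conversion rate: {rate:.1f} %")
```

Whether such a rate is economically sufficient depends on further factors (e.g. raw material prices and separation costs) that this sketch does not cover.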
The molecular structure of reactants can be changed in numerous ways:11 First the
reactants’ molecules can be combined which is called synthesis. A prominent example is
the hydrogenation of carbon dioxide to produce methanol.12
Second, a molecule or molecular fragment can also be split which is called decomposition.
To recycle bottles made of polyethylene terephthalate (PET), catalytic depolymerization
is used to split the PET into valuable components.13
In a substitution, a molecular fragment of a reactant is replaced by a fragment of another
reactant. A prominent example is the alkylation where an alkyl group is transferred from
one molecule to another. E.g., the production of Ethylbenzene from Benzene and Ethylene
by the so-called Friedel-Crafts alkylation is a standard process in chemical industry.14
If more than one molecular fragment is substituted, this is categorized as a metathesis.
A recent industrial application is the olefin metathesis to produce e.g. Propene from
Ethene and 2-Butene.15
The types of reactions presented above can be combined with the ordinary classification
of production processes into convergent, divergent, and transformation processes.
Depending on the number of input and output products this process classification can be
extended as Table 2.1 shows.
8
E.g. see Baerns et al. (2006, p. 32).
9
The catalyst is not part of the molecular reorganization, i.e. no part of the catalyst is part of the
resulting chemicals. See Behr et al. (2010, ch. 12) for more information.
10
E.g. see Baerns et al. (2006, ch. 11.2) for more details.
11
A similar classification scheme particularly addressing organic reactions can be found in Jones and
Bunnett (1989).
12
See Bill (1997).
13
The resulting components depend on the specific depolymerization process applied, see e.g. Mishra
et al. (2002) or Paszun and Spychaj (1997).
14
E.g. see Degnan et al. (2001).
15
See Mol (2004).

                  # outputs
# inputs      single    multiple
single        SISO      SIMO
multiple      MISO      MIMO

Table 2.1: Typology of production processes

A SISO process is a single input-single output process and corresponds to transformation
processes or substitution reactions. MISO processes (multiple inputs-single output)
comprise convergent processes and SIMO processes (single input-multiple outputs) comprise
divergent processes, similar to synthesis and decomposition, respectively. MIMO
processes (multiple inputs-multiple outputs) may include transformations as well as
convergent and divergent parts. They correspond to metatheses or coupled decomposition
reactions.
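
The typology of Table 2.1 depends only on the number of distinct input and output products. A minimal sketch is given below; the function name is hypothetical and the two example processes are simplified readings of reactions mentioned in this chapter (auxiliary materials and by-products are ignored).

```python
def classify_process(inputs, outputs):
    """Return SISO, SIMO, MISO or MIMO based on the number of distinct
    input and output products (hypothetical helper for illustration)."""
    if not inputs or not outputs:
        raise ValueError("a process needs at least one input and one output")
    input_tag = "SI" if len(set(inputs)) == 1 else "MI"
    output_tag = "SO" if len(set(outputs)) == 1 else "MO"
    return input_tag + output_tag


# Simplified examples: cracking is divergent, alkylation is convergent.
print(classify_process(["Naphtha"], ["Ethylene", "Propylene", "Butadiene", "Pygas"]))
print(classify_process(["Benzene", "Ethylene"], ["Ethylbenzene"]))
```

Note that the result depends on which materials are counted as products: if only one output component is considered valuable, a separation process may be treated as SISO instead of SIMO.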
Based on this classification of chemical reactions, corresponding characteristics of the
production systems can be deduced. The production of basic chemicals is in almost all
cases a split of long-chained raw materials (crude oil, coal, natural gas) into short-chained
substances (such as alkenes). These production processes are primarily divergent and can
be categorized as SIMO processes. The cracking of mixtures of long-chained substances
into their short-chained components is usually performed by thermal chemical reactions
(such as steam-cracking). The resulting mixture of short-chained substances is usually
separated by distillation processes. Such processes are in general continuous and hardly
interruptible. Typically, they are single-purpose assets i.e. assets designed to produce a
fixed set of chemicals.16 Depending on the raw materials’ composition and the operating
parameters (temperature, pressure, reaction time, etc.) the production coefficients can be
controlled under certain restrictions. The set-up and control of the production coefficients
depend on the precedence relation of the produced products. A focal main
product does not exist in all cases.17
The production of intermediate chemicals requires more diverse types of production
processes. Similar to the production of basic chemicals, chemical reactions are typically
accompanied by separation processes such that most of these reactions can be catego-
rized as SIMO or MIMO processes. Because intermediate chemicals are required for the
production of final chemical products in huge quantities, they are usually produced by
continuously operated plants. Typically, the production plants are specialized to perform
a specific reaction and, hence, are single-purpose plants.
Final chemical products are typically produced on multi-purpose plants which are de-
signed for a specific product family. Such production processes are usually convergent.
The composition of raw materials to produce a final chemical is called a recipe. Multi-
purpose plants are capable of handling multiple recipes, i.e. reactants and products handled
vary in both type and quantity. These processes can be mainly categorized as MISO or
MIMO processes.

16
However, the production coefficients may vary.
17
This depends primarily on the further use of the output products and/or their market prices. For
example steam crackers are usually optimized for Ethylene production because Ethylene is used in a
wide variety of final products.

Depending on the product portfolio as well as the structure of product demand, multi-
purpose plants are operated either in batch mode or continuously. There are plenty of
definitions of both terms.18 Here, a technological point of view is used, i.e. batch
processes are characterized by a fixed production capacity which is defined as the quan-
tity of produced chemicals after which a process interruption is required. In contrast,
continuous processes are characterized by a production rate, which is defined as the quan-
tity of goods produced in a given time. The time between process interruptions is not
technically limited. The former especially occurs for speciality chemicals and pharmaceuticals,
whereas the latter is typical for polymers/plastics and some agrochemicals.19 For
body care products the production technology is mixed depending on the product variety.
Low-volume products with many product variations and often changing recipes, such as
cosmetics, are usually produced in batch mode, whereas high-volume products with few
product variations, such as detergents, are usually produced continuously. In both cases,
the production processes are interruptible. Table 2.2 summarizes the above-mentioned
characteristics.20

                                                  final chemicals
            basic chemicals  intermed. chemicals  commodity    speciality
mode        continuous       continuous           continuous   batch
vergence    divergent        di-/convergent       convergent/transform.
purpose     single           single               multiple

Table 2.2: Characteristics of production processes in chemical industry
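
The technological distinction between batch and continuous operation used in this section can be illustrated by a small sketch: a batch process is described by a capacity per run, a continuous process by a production rate. Function names and all numbers are assumptions chosen for illustration only.

```python
import math


def batch_runs_needed(quantity: float, batch_capacity: float) -> int:
    """Number of batch runs (each ending with a process interruption)
    needed to produce the requested quantity."""
    if quantity < 0 or batch_capacity <= 0:
        raise ValueError("quantity must be >= 0 and batch_capacity > 0")
    return math.ceil(quantity / batch_capacity)


def continuous_run_time(quantity: float, production_rate: float) -> float:
    """Uninterrupted operating time needed at a given production rate
    (quantity per time unit)."""
    if quantity < 0 or production_rate <= 0:
        raise ValueError("quantity must be >= 0 and production_rate > 0")
    return quantity / production_rate


# Illustrative numbers: 2,500 t demand, 1,000 t per batch, 100 t per hour.
print(batch_runs_needed(2500, 1000))
print(continuous_run_time(2500, 100))
```

The sketch ignores set-up, cleaning, and ramp-up times, which in practice often dominate the comparison of both operating modes.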

The production of basic and intermediate chemicals is typically organized in a network
of continuously operated plants. This principally leads to an advantage for horizontally
coupled production processes and, hence, horizontally integrated chemical companies.
To exploit these economies of scope, locally concentrated production sites are necessary
to avoid logistical efforts. These integrated production sites comprise a great variety
of production plants which are interconnected by product and energy flows.21 A typi-
cal flow sheet example for sites based on cracking of Naphtha is depicted in Figure 2.3.
Main products are the simple alkenes Ethylene, Propylene, and Butadiene. Beside these
pure alkenes, a fraction called pyrolysis gasoline (Pygas) is extracted which is a mix-
ture of acyclic and, mainly, cyclic hydrocarbons (aromatics) such as Xylene, Toluene,
and Benzene. Benzene is the most important cyclic hydrocarbon and raw material for
e.g. Ethylbenzene which in turn can be de-hydrogenated to Styrene. Styrene can then be
used as raw material for Styrene-Butadiene rubber which is an important raw material
e.g. for the production of tires. Other branches of Benzene application are the production
of Aniline (which is widely used in polyurethane production) and Cumene which is a
composition of Benzene and Propylene and is mainly processed to resins.22

18
See e.g. Loos (1997, p. 48 ff.) and references therein.
19
Low-volume products such as specialized pesticides are often produced in batch mode, see e.g. Loos
(1997, p. 70 ff.).
20
Far more typical characteristics of production processes could be included, see e.g. Loos (1997, sec. 3.1).
However, the chosen characteristics are the most important with respect to logistical implications.
21
The main sites of BASF in Ludwigshafen and Dow Chemical in Midland are well-known, large-scale
examples of such integrated production sites.

[Figure: flow sheet. Naphtha is cracked into Ethylene, Propylene, Butadiene, and Pygas; Benzene is obtained from Pygas by distillation and processed to Ethylbenzene, Nitro-Benzene, and Cumene (process labels: synthesis, nitration, alkylation); Ethylbenzene is dehydrogenated to Styrene and Nitro-Benzene is hydrogenated to Aniline; Styrene is co-polymerized with Butadiene to Styrene-Butadiene rubber or polymerized to Polystyrene]

Figure 2.3: Exemplary production network of Naphtha derivatives

Note that each final product depicted in Figure 2.3 represents a set of sub-products
rather than a uniform substance. These sub-products have the same basic molecular
structure but can differ e.g. in certain physical characteristics or colour.23
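
To illustrate how such a production network can be represented for planning purposes, the following sketch stores the Naphtha network as a list of processes with their inputs and outputs and determines all products downstream of a given material. The data structure is an assumption for illustration; the process list is a simplified reading of the text and Figure 2.3 (auxiliary reactants, by-products, and the other Pygas components are omitted, and the two Benzene-consuming steps are named by their product).

```python
from collections import deque

# (process, inputs, outputs): simplified, hypothetical encoding of Figure 2.3
PROCESSES = [
    ("cracking", ["Naphtha"], ["Ethylene", "Propylene", "Butadiene", "Pygas"]),
    ("distillation", ["Pygas"], ["Benzene"]),
    ("Ethylbenzene production", ["Benzene", "Ethylene"], ["Ethylbenzene"]),
    ("dehydrogenation", ["Ethylbenzene"], ["Styrene"]),
    ("co-polymerization", ["Styrene", "Butadiene"], ["Styrene-Butadiene rubber"]),
    ("polymerization", ["Styrene"], ["Polystyrene"]),
    ("nitration", ["Benzene"], ["Nitro-Benzene"]),
    ("hydrogenation", ["Nitro-Benzene"], ["Aniline"]),
    ("Cumene production", ["Benzene", "Propylene"], ["Cumene"]),
]


def downstream_products(material):
    """Breadth-first search for all products that depend on `material`."""
    found = set()
    queue = deque([material])
    while queue:
        current = queue.popleft()
        for _, inputs, outputs in PROCESSES:
            if current in inputs:
                for product in outputs:
                    if product not in found:
                        found.add(product)
                        queue.append(product)
    return found


print(sorted(downstream_products("Benzene")))
```

Such a query indicates which product flows are affected if, for instance, the Benzene supply is disturbed, which is the kind of interdependency that has to be kept in balance at an integrated site.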
The vessels containing chemical reactions are called (chemical) reactors. Their design
and size depend on the intended reaction. The performance of a reactor is measured in
terms of the conversion rate, the purity of the substances of interest, and economic aspects
such as resistance, energy consumption etc. Basic types of reactors can be categorized
by different characteristics.24 The vessel design is a first category. Roughly, one can
distinguish between (stirred-)tank reactors and pipe reactors.
To effectively execute chemical reactions, the involved reactants have to be provided in
sufficient purity. Moreover, many chemical reactions result not only in one pure output
product but in a mixture of output products. Thus, preparation and post-production
processes have to be carried out to provide valuable reactants for subsequent chemical
reactions. These operations are called basic operations and can be categorized into thermic
and mechanic operations.25

22
See e.g. Baerns et al. (2006, ch. 16) for details.
23
This can be obtained e.g. by various additives or process settings, see Behr et al. (2010).
24
E.g. see Baerns et al. (2006) or Trambouze and Euzen (2004) for more detailed classification schemes.

The most prominent thermic basic operation is the rectification or distillation to sepa-
rate individual components from a mixture.26 For separation, the mixture is evaporated
completely and successively cooled down. The components can be effectively separated if
they have different boiling points or dew points, respectively. In the case of close boiling
points for the components in the feed mixture, distillation is still applicable if an auxiliary
chemical (so-called solvent) is available changing the boiling or dew points of at least one
component (so-called extractive distillation).27
A distillation column is a metal tube which is divided into compartments by so-called
trays or plates. The mixture to be separated is fed into the middle of the column. Inside
the column, temperature and pressure vary with height. At the
column’s top, both temperature and pressure are minimal. Conversely, at the bottom,
both temperature and pressure are maximal. Depending on the boiling points of the
mixture’s components, the composition of liquids and gases differs in each compartment
and at each tray, respectively. Ideally, at each tray a single fraction/component of the
feed mixture can be obtained.28 To guarantee constant conditions regarding pressure and
temperature, a surplus of residue liquids at the bottom of the column is (re-)boiled and
fed back into the column. Similarly, a surplus of gases at the top of the column
is condensed and fed back (so-called reflux). Figure 2.4 shows a schematic overview of a
prototypical column.29
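
The separation principle can be sketched as follows: components with lower boiling points are withdrawn further up the column, and pairs with close boiling points are flagged as hard to separate by simple distillation. The function, the 5 °C threshold, and the rounded boiling points are assumptions for illustration only and do not replace a thermodynamic column design.

```python
def withdrawal_order(boiling_points_c, min_gap_c=5.0):
    """Order components from column top (lowest boiling point) to bottom and
    list neighbouring pairs whose boiling points differ by less than
    `min_gap_c` (hypothetical threshold)."""
    ordered = sorted(boiling_points_c.items(), key=lambda item: item[1])
    close_pairs = [
        (low[0], high[0])
        for low, high in zip(ordered, ordered[1:])
        if high[1] - low[1] < min_gap_c
    ]
    return [name for name, _ in ordered], close_pairs


# Approximate, rounded normal boiling points in degrees Celsius.
aromatics = {"Toluene": 111.0, "Benzene": 80.0, "p-Xylene": 138.0}
order, close_pairs = withdrawal_order(aromatics)
print("top to bottom:", order)
print("close-boiling pairs:", close_pairs)
```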
Separation processes are designed and optimized for a specific mixture to be separated
which determines e.g. the number and position of trays as well as the atmospheric con-
ditions. Such processes are single-purpose plants for SIMO or SISO30 processes. The
production rate often can be varied in certain ranges without causing serious variations
of the separation accuracy.
The distinction between reactions and basic operations that prepare and post-process chemical
reactions is valid in most cases. However, integrating both process steps can be
advantageous in principle. Despite organizational and technical drawbacks, such an integration
is physically and chemically advantageous due to more favourable energy balances.
One example is the reactive distillation where chemical reactions take place inside the
distillation column.31 With the exception of such special techniques, chemical operations
can be categorized as displayed in Figure 2.5.

25
E.g. see Baerns et al. (2006) for details.
26
For a more general overview on (thermal) separation processes see e.g. Seader et al. (2011) or Baerns
et al. (2006, ch. 9).
27
E.g. see Behr et al. (2010, p. 88-89), Hoffman (1977) or Seader et al. (2011) for details.
28
Regarding the determination of the maximal or optimal number of trays, a large body of literature
exists and is still growing, e.g. see Seader et al. (2011) for an overview and Yeomans and
Grossmann (2000) or Viswanathan and Grossmann (1993) for more detailed insights.
29
In practice many adaptations and subtypes of this basic form have been derived depending on the
specific processes and circumstances the distillation column is intended for. See e.g. Smith (2005,
ch. 11 and 12) for an overview.
30
If only one component is considered as valuable.

[Figure: distillation column with the mixture/feed entering at mid-height, trays with rising vapor and descending liquid, a condenser with reflux and gas outlet at the top, a reboiler at the bottom, and product outlets from Prod. A (bottom) to Prod. D (top)]

Figure 2.4: Scheme of a distillation column

[Figure: tree diagram. Operations split into reactors (tank, pipe) and basic operations (thermic, mechanic); the lowest level lists batch, continuous, decompose, and mix]

Figure 2.5: Overview on chemical operations

Chemical production processes can be divided in chemical reactions and basic opera-
tions (i.e. physical transformations). In chemical production plants, multiple processes
from both classes are combined and take place in sub-plants which are closely intercon-
nected. The planning and configuration of such plants is very complex and expensive.
Hence, a detailed modelling of the underlying chemical and physical processes is necessary
to avoid misinvestments. The next section outlines an overview on the steps necessary to
31
For an overview on reactive distillation see e.g. Baerns et al. (2006, p. 322 ff.) and for details see
e.g. Taylor and Krishna (2000).

model and control a chemical production plant beginning with a brief introduction into
the mathematical modelling of chemical reactions.

2.2 Modelling chemical production processes


A chemical production process is a combination of physical and chemical transformation
processes. The behaviour of such a transformation system can be described in mathemat-
ical terms. To describe the behaviour of a chemical reaction system two general questions
have to be answered:

• What are the requirements and outcomes of the intended reaction(s) regarding en-
ergy and reactants?

• How can these requirements be maintained over time by technical systems?

Roughly speaking, the first question can be answered by thermodynamical analyses of
the intended reaction, whereas the answers to the second question are typically subsumed
under the term (chemical) kinetics.
Based on thermodynamical and kinetic descriptions of the individual process steps, a
meta-model can be developed which is able to describe and predict the behaviour of a
whole chemical production process. Such a process model can be developed for different
purposes and at different levels of detail: To design a chemical production process, a de-
tailed model of the potential plant(s) necessarily includes the description of the system’s
dynamics. In contrast, once the production process is designed, a model is necessary to de-
scribe the dependency of the system’s output w.r.t. certain control parameters. Figure 2.6
depicts a prototypical procedure in chemical process modelling.

[Figure: flow chart with the elements "Thermodynamics", "Kinetics", "process model", "simulation", "process design", "system identification", and "model predictive control"]

Figure 2.6: Steps in chemical process modelling & control

In the next subsection, kinetic definitions and concepts are introduced. Subsequently,
relevant properties of chemical operations w.r.t. their modelling are outlined. Special
attention is given to methods and concepts concerning the mathematical description of
existing chemical production processes, subsumed under the term "system identification".

2.2.1 Chemical kinetics


The theory that addresses problems regarding the inter-temporal description of chemical
process parameters is subsumed under the term kinetics of chemical processes. The ki-
netics of chemical reactions include the temporal description of a reaction, i.e. the way
the concentrations/shares of reactants and products of a chemical reaction are developing
during the reaction. This requires the description of mass transport processes and the
description of heat/energy transport processes. Figure 2.7 displays this trisection and
introduces some keywords/methods explained below.32

[Figure: tree diagram with root "chemical kinetics" and the branches "reaction kinetics" (reaction speed, concentration, Arrhenius eq.), "mass transport" (Fick’s law, diffusion, concentration), and "energy transport" (Fourier’s law, convection, conduction)]

Figure 2.7: Topics in chemical kinetics and keywords/methods

In reaction kinetics, the object of interest is the progress of a chemical reaction measured
in terms of the change in concentration of the corresponding reactants and products.33 The
change of a component’s amount over time is defined as the reaction rate of this component.34
Assuming constant volume, this equals the change in concentration of this component. The reaction
rate $r_j$ of component $j$ can thus be expressed as the change in concentration $c_j$ over time:

\[ r_j = \frac{dc_j}{dt}. \tag{2.1} \]

The change in concentration typically depends on the reaction time, the concentration
of the other components, the temperature, and reaction specific properties. Temperature
32
In the strict sense the term kinetics only refers to reaction kinetics, i.e. the temporal development of
chemical reactions. Mass and energy transport processes are usually categorized as purely physical
phenomena. However, here a classification scheme based on the topics’ time domain is chosen. For
an extensive introduction to chemical kinetics, see e.g. Levine (2005).
33
In the following, the term component subsumes both reactants and products.
34
There exist also some other definitions of reaction rates based on reaction-specific characteristics,
e.g. volume, mass or activated catalyst surface, see e.g. Baerns et al. (2006, p. 59-60) or Behr et al.
(2010, p. 41-42).

and concentration parts are combined multiplicatively:

\[ r_j = f(T) \cdot f(c, t, \ldots) \tag{2.2} \]

where $c$ is the vector of concentrations and $T$ is the temperature. For a given temperature, the
temperature effect is a constant (the so-called reaction rate constant $k$), which is obtained from the
Arrhenius equation:

\[ f(T) = k = k_0 \cdot e^{-\frac{E_A}{R \cdot T}} \tag{2.3} \]

where $k_0$ is a reaction-specific factor, $E_A$ is the activation energy (i.e. the energy necessary to start
the reaction) and $R$ is the universal gas constant.35 In the simplest case, the reaction rate depends
multiplicatively on powers of the reactants’ concentrations:

\[ r_j = k \cdot f(c) = k \cdot c_1^{m_1}(t) \cdot \ldots \cdot c_N^{m_N}(t) \tag{2.4} \]

where $m_i$ is the reaction order of reactant $i$ and $\sum_{i=1}^{N} m_i$ is the order of the whole reaction.36

The reaction orders depend on the structure of the reaction. Assume a simple
decomposition of one reactant, e.g. in cracking reactions of hydrocarbons in gasoline
production:

\[ \nu_1 \cdot A_1 \rightarrow \nu_2 \cdot A_2 + \ldots + \nu_n \cdot A_n \tag{2.5} \]

where $A_1$ is the (only) reactant and $A_2, \ldots, A_n$ are the products. The stoichiometric
coefficients $\nu_j$ refer to the number of molecules of chemical $j$ consumed/produced during
the reaction. The concentrations of all components depend on the concentration of the
reactant, i.e. for reactions of order $m_1 = 1$ the reaction rate of chemical $j$ is

\[ r_j = \nu_j \cdot k \cdot c_1(t). \tag{2.6} \]

If one molecule of $A_1$ is split into the products, it follows that $\nu_1 = -1$. The change in
concentration of the reactant is then

\[ r_1 = \frac{dc_1}{dt} = -k \cdot c_1(t) \tag{2.7} \]

which is the simplest form of a homogeneous linear differential equation of order 1 with
the solution

\[ c_1(t) = c_{s,1} \cdot e^{-k \cdot t} \tag{2.8} \]

where $c_{s,1}$ is the concentration of $A_1$ at the beginning of the reaction. Here, (2.8) implies
an exponentially decaying concentration of the reactant over time.
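
The relations above can be tried out numerically. The following minimal Python sketch assumes the usual form of the Arrhenius equation with a negative exponent, evaluates the closed-form decay (2.8), and cross-checks it against a simple Euler integration of (2.7). All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

R_GAS = 8.314462618  # universal gas constant R in J/(mol*K)


def arrhenius_rate_constant(k0: float, e_a: float, temperature: float) -> float:
    """Rate constant k = k0 * exp(-E_A / (R * T)), with T in kelvin."""
    return k0 * math.exp(-e_a / (R_GAS * temperature))


def first_order_concentration(c_start: float, k: float, t: float) -> float:
    """Closed-form solution c_1(t) = c_{s,1} * exp(-k * t) of dc_1/dt = -k * c_1."""
    return c_start * math.exp(-k * t)


def euler_first_order(c_start: float, k: float, t_end: float, steps: int) -> float:
    """Integrate dc_1/dt = -k * c_1 with explicit Euler steps as a numerical cross-check."""
    dt = t_end / steps
    c = c_start
    for _ in range(steps):
        c -= k * c * dt
    return c


# Hypothetical parameter values, chosen only for illustration.
k = arrhenius_rate_constant(k0=1.0e13, e_a=2.5e5, temperature=1100.0)
c_exact = first_order_concentration(c_start=1.0, k=k, t=0.1)
c_euler = euler_first_order(c_start=1.0, k=k, t_end=0.1, steps=10_000)
print(f"k = {k:.3f} 1/s, closed form: {c_exact:.4f}, Euler: {c_euler:.4f}")
```

Raising the temperature in this sketch increases k and therefore accelerates the decay, which is the qualitative effect the Arrhenius term is meant to capture.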

35
E.g. see Behr et al. (2010, p. 42).
36
There are also other forms of reaction rate equations, such as hyperbolic relations. See Baerns et al.
(2006, p. 62 ff).

In reality, most chemical reactions consist of numerous elementary reactions combined
in reaction networks that are much more difficult to describe.37 An interesting simple
reaction network is a two-step sequential reaction, which is relevant e.g. in modelling thermal
separation processes:

\[ A_1 \xrightarrow{k_1} A_2 \xrightarrow{k_2} A_3 \]

where all stoichiometric coefficients are equal to 1. Assuming again a first-order
reaction for both steps, it follows that

\begin{align}
r_1 &= \frac{dc_1(t)}{dt} = -k_1 \cdot c_1(t) \tag{2.9} \\
r_2 &= \frac{dc_2(t)}{dt} = k_1 \cdot c_1(t) - k_2 \cdot c_2(t) \tag{2.10} \\
r_3 &= \frac{dc_3(t)}{dt} = k_2 \cdot c_2(t). \tag{2.11}
\end{align}

This simple system of differential equations can be solved successively. The solution of
(2.9) is given by (2.8) with $k = k_1$ and, hence, can be substituted into (2.10):

\[ r_2 = \frac{dc_2(t)}{dt} = k_1 \cdot \left(c_{s,1} \cdot e^{-k_1 \cdot t}\right) - k_2 \cdot c_2(t). \tag{2.12} \]

Assuming $k_1 \neq k_2$ and $c_{s,2} = 0$, (2.12) can be solved using standard results for inhomogeneous
linear differential equations of first order:

\[ c_2(t) = \frac{k_1 \cdot c_{s,1}}{k_2 - k_1} \left(e^{-k_1 \cdot t} - e^{-k_2 \cdot t}\right) \tag{2.13} \]

Consequently, (2.11) can be solved by inserting (2.13) and assuming $c_{s,3} = 0$:

\[ c_3(t) = c_{s,1} - \frac{c_{s,1}}{k_2 - k_1} \left(k_2 \cdot e^{-k_1 \cdot t} - k_1 \cdot e^{-k_2 \cdot t}\right). \tag{2.14} \]
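
As a plausibility check of the sequential reaction, the sketch below evaluates the closed-form concentrations and compares them with an Euler integration of (2.9)-(2.11). It is only an illustration under stated assumptions: $c_3$ is obtained from the mass balance $c_1 + c_2 + c_3 = c_{s,1}$ (the three rates sum to zero), the helper `time_of_peak_intermediate` follows from setting $dc_2/dt = 0$, and all names and rate constants are hypothetical.

```python
import math


def sequential_concentrations(c_start: float, k1: float, k2: float, t: float):
    """Closed-form c_1, c_2, c_3 for A1 -> A2 -> A3 with c_2(0) = c_3(0) = 0 and k1 != k2."""
    e1, e2 = math.exp(-k1 * t), math.exp(-k2 * t)
    c1 = c_start * e1
    c2 = k1 * c_start / (k2 - k1) * (e1 - e2)
    c3 = c_start - c1 - c2  # mass balance: the three reaction rates sum to zero
    return c1, c2, c3


def euler_sequential(c_start: float, k1: float, k2: float, t_end: float, steps: int):
    """Explicit Euler integration of the three rate equations."""
    dt = t_end / steps
    c1, c2, c3 = c_start, 0.0, 0.0
    for _ in range(steps):
        r1 = -k1 * c1
        r2 = k1 * c1 - k2 * c2
        r3 = k2 * c2
        c1, c2, c3 = c1 + r1 * dt, c2 + r2 * dt, c3 + r3 * dt
    return c1, c2, c3


def time_of_peak_intermediate(k1: float, k2: float) -> float:
    """Time at which c_2 is maximal, from dc_2/dt = 0 (requires k1 != k2)."""
    return math.log(k2 / k1) / (k2 - k1)


# Hypothetical rate constants, for illustration only.
K1, K2, C_START = 2.0, 0.5, 1.0
exact = sequential_concentrations(C_START, K1, K2, 1.0)
approx = euler_sequential(C_START, K1, K2, 1.0, 20_000)
t_peak = time_of_peak_intermediate(K1, K2)
print(exact, approx, t_peak)
```

The intermediate $A_2$ first accumulates and then decays, so its concentration has a single maximum at `t_peak`.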

These examples of some simple types of chemical reactions show the general solution
methodology for determining the behaviour of chemical reactions over time. Especially
the modelling of catalytic and reversible reactions requires more sophisticated mathemat-
ical methods to determine the concentration rates.38 However, in the end more or less
complicated systems of differential equations have to be solved.
Another prominent thermodynamical relation often relevant in describing chemical pro-
cesses is the heat conduction formalized in Fourier’s law.39 This partial differential equa-
tion describes the diffusion of heat (energy) in time depending on (contact) area and
temperature gradient. This partial differential equation leads to some other prominent

37
For more detailed information about the deduction of reaction rates for other elementary reactions and
reaction networks e.g. see Levine (2005) or Baerns et al. (2006, ch. 4).
38
See e.g. Baerns et al. (2006, p. 65-72) or Levine (2005) for details.
39
See e.g. Behr et al. (2010), p. 49 ff.

thermodynamical equations such as the heat and diffusion equation.40 The supply and
transport of energy is the main control lever of most reactions. E.g. an energy-consuming
chemical reaction system is arranged such that the energy consumed by the chemical
transformation process is compensated by external energy supply (e.g. heating). Distillation
processes are typical examples of such processes.41
Another class of equations subsumes kinetic and phase transformations of all involved
reactants. Such equations describe how the reactants’ molecules are transformed and
distributed in the reactor depending on time and other environmental parameters. De-
pending on the kind of chemical processes under consideration, both classes of equations
are of varying importance for modelling. E.g. for catalytic packed-bed reactors the chem-
icals’ reaction rates heavily depend on local physical conditions at the (solid) catalyst
material. The precise modelling of the local physical conditions and the mixture of chem-
icals flowing is important and complex in this case. In contrast, for classic stirred-tank
reactors kinetic and phase transformations are comparatively easy to model.
Most mass balances are deduced from the chemical transformation process directly,
whereas most energy equations are mainly deduced from thermodynamic relations. The
kinetics of a chemical reaction subsume relationships determining how the reaction rate
depends on conditions such as pressure, temperature and catalysts.

2.2.2 Modelling & simulation of chemical processes


Combining all these thermodynamic and kinetic relations, theoretical mass and heat equa-
tions can be used to calculate (theoretical) reaction time and the size of a reactor to
generate a desired quantity of a certain product.42 This requires a model for the chemical
reaction quantifying the transformation rates in time depending on the concentration of
the involved reactants and catalysts as well as the physical parameters (e.g. pressure and
heat). There is usually a wide range of possibilities to process chemical reactions depend-
ing on the physical parameters. If a chemical reaction is isolated from the environment
(i.e. no energy transfer to the surroundings), this is called adiabatic. In contrast, if a
reaction is processed under constant physical conditions, it is called isothermal/isobaric
(i.e. usually heat has to be transferred to the environment).
Under isothermal/isobaric conditions the transformation processes of a chemical reac-
tion can be modelled depending on the behaviour of the concentrations in time given a
certain temperature.43 To get an idea, assume the simple splitting reaction

A1 → A2 + ... + An

40
There are many standard textbooks on this topic, see e.g. Sandler (2006), Koretsky (2004), or
Letcher (2004).
41
See e.g. Seader et al. (2011).
42
See Behr et al. (2010, p. 59 ff.) or Baerns et al. (2006, p. 145 ff.).
43
See e.g. Behr et al. (2010, p. 64-66).

mentioned above with stoichiometric coefficients equal to 1. The concentration of reactant
$A_1$ is given in (2.8). Now, define the turnover ratio $X(t) = 1 - \frac{c_1(t)}{c_{s,1}}$ as the part of reactant $A_1$
that is transformed at a given time $t$ relative to the initial concentration at the beginning
of the reaction. Plugging in (2.8) yields

\[ X(t) = 1 - e^{-k \cdot t}. \tag{2.15} \]

Assuming a constant volume of the reactants and processing the reaction in a stirred-tank
reactor, one important question is how much time is needed to achieve a desired
turnover ratio. To answer this question, (2.15) is solved for $t$, which yields $t = -\frac{1}{k} \ln(1 - X)$ for a
given turnover ratio $X$. Multiplying this result by the reaction rate constant $k$ leads to the
so-called Damköhler number for this specific class of reactions:

\[ Da = k \cdot t = -\ln(1 - X). \tag{2.16} \]

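In general, Damköhler numbers describe the (normalized) reaction time necessary to achieve a
desired concentration/turnover ratio. To calculate the Damköhler numbers for $n$th-order
reactions (with $n \neq 1$), the reaction time $t_R$ needed to reach a turnover ratio $X_R$ can be derived
from the concentration differential equation by substituting $c_1(t) = c_{s,1} \cdot (1 - X)$:

\begin{align}
& \frac{dc_1(t)}{dt} = -k \cdot c_1^n(t) \tag{2.17} \\
\Leftrightarrow\; & \frac{d(1 - X)}{dt} = -k \cdot (1 - X)^n \cdot c_{s,1}^{n-1} \tag{2.18} \\
\Leftrightarrow\; & -\frac{dX}{dt} = -k \cdot (1 - X)^n \cdot c_{s,1}^{n-1} \tag{2.19} \\
\Leftrightarrow\; & dt = \frac{1}{k \cdot (1 - X)^n \cdot c_{s,1}^{n-1}} \, dX \tag{2.20} \\
\Rightarrow\; & t_R = \frac{1}{c_{s,1}^{n-1} \cdot k} \int_0^{X_R} \frac{1}{(1 - Y)^n} \, dY \tag{2.21} \\
\Leftrightarrow\; & t_R = \frac{1}{c_{s,1}^{n-1} \cdot k} \cdot \left( \frac{1}{n - 1} \left( \frac{1}{(1 - X_R)^{n-1}} - 1 \right) \right) \tag{2.22}
\end{align}

For more complex reactions the derivation of Damköhler numbers is much more complicated.44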
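
The reaction-time formulas can be checked numerically. The sketch below is a hedged illustration, not a reference implementation: it assumes the rate law $dc_1/dt = -k \cdot c_1^n$ with $c_1 = c_{s,1} \cdot (1 - X)$, which gives the prefactor $c_{s,1}^{n-1}$, handles the first-order case via the logarithmic formula, and verifies the result with an Euler integration. Function names and numbers are hypothetical.

```python
import math


def reaction_time(x_target: float, k: float, c_start: float = 1.0, order: float = 1.0) -> float:
    """Time needed to reach turnover ratio x_target for dc/dt = -k * c**order, c(0) = c_start."""
    if not 0.0 <= x_target < 1.0:
        raise ValueError("x_target must lie in [0, 1)")
    if order == 1.0:
        return -math.log(1.0 - x_target) / k
    scale = k * c_start ** (order - 1.0)
    return ((1.0 - x_target) ** (1.0 - order) - 1.0) / ((order - 1.0) * scale)


def simulated_turnover(t_end: float, k: float, c_start: float = 1.0,
                       order: float = 1.0, steps: int = 20_000) -> float:
    """Integrate the rate law with explicit Euler steps and return the turnover ratio X(t_end)."""
    dt = t_end / steps
    c = c_start
    for _ in range(steps):
        c -= k * c ** order * dt
    return 1.0 - c / c_start


# Hypothetical second-order example: k = 0.5, initial concentration 2, desired turnover ratio 0.8.
t_r = reaction_time(0.8, k=0.5, c_start=2.0, order=2.0)
x_check = simulated_turnover(t_r, k=0.5, c_start=2.0, order=2.0)
print(t_r, x_check)
```

Simulating up to the computed time should reproduce the requested turnover ratio up to the discretization error of the Euler scheme.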
Under adiabatic conditions heat flows have to be modelled accordingly with respect to
the mass balances. For exothermic reactions usually a stimulation of the chemical reaction
is required due to the higher energy level at the beginning of the reaction. In general,
the interactions between heat and mass balances are more difficult to model.45 More
complicated cases occur when both conditions are mixed, i.e. under polytropic conditions.
Here, solutions for heat and mass balances are hardly available analytically. Instead,

44
See e.g. Baerns et al. (2006, ch. 5.2) for more details.
45
E.g. see Behr et al. (2010, p. 67).

numerical or simulation approaches are necessary to model such cases properly.


Given a specific reaction and type of reactor, it is usually possible to derive a theoretical
concentration model which is able to anticipate the reaction’s behaviour over time
depending on the specific environmental conditions such as pressure, temperature, and
reactant inflow rates. But although such a theoretical model of a chemical process can
be derived more or less easily, the relationships in its technical realization, i.e. in a
chemical plant, are far more complicated because a theoretical model necessarily
requires simplifications and assumptions. For example, a homogeneously distributed
mixture inside a reactor is often assumed, which is rarely the case in a real reactor. This
typically leads to a stochastification of theoretical relations, e.g. the differential equations
may turn into stochastic differential equations (SDEs). This turns the concentration
described by (2.7) into a stochastic variable with an associated distribution.46 For example,
(2.7) can be turned into

\[ dc_1(t) = -k \cdot c_1(t) \cdot dt + \sigma \cdot dW \tag{2.23} \]

where $dW$ denotes the increment of a stochastic process $W$ (the so-called Wiener process, which is described in
section 2.3) and $\sigma$ is a variability parameter indicating the influence of the stochastic
component. Equation (2.23) constitutes a specific Ornstein-Uhlenbeck process.47 For this
specific class of SDEs an analytic solution can be derived as48

\[ c_1(t) = c_{s,1} \cdot e^{-k \cdot t} + \sigma \cdot \int_0^t e^{k \cdot (s - t)} \, dW(s) \tag{2.24} \]

which extends (2.8) by an additive stochastic term. Note that (2.23) can be solved analytically
since the coefficient $\sigma$ of the stochastic component is constant in time and independent of the
concentration process $c_1(t)$. For most other types of SDEs or systems of SDEs analytic solutions are
not available.49
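
To illustrate how such an SDE behaves, the following sketch simulates (2.23) with the Euler-Maruyama scheme and compares the sample mean and variance at a fixed time with the values implied by (2.24). This is a minimal illustration under stated assumptions: the variance formula $\frac{\sigma^2}{2k}(1 - e^{-2kt})$ follows from the Itô isometry applied to the stochastic integral, and all names, parameter values and the random seed are hypothetical.

```python
import math
import random


def simulate_final_concentration(c_start, k, sigma, t_end, steps, rng):
    """Euler-Maruyama path of dc = -k * c * dt + sigma * dW; returns c(t_end)."""
    dt = t_end / steps
    c = c_start
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment ~ N(0, dt)
        c += -k * c * dt + sigma * dw
    return c


def theoretical_mean(c_start, k, t):
    """Mean of the solution: the deterministic decay c_{s,1} * exp(-k * t)."""
    return c_start * math.exp(-k * t)


def theoretical_variance(k, sigma, t):
    """Variance of the stochastic integral: sigma^2 / (2k) * (1 - exp(-2kt))."""
    return sigma ** 2 / (2.0 * k) * (1.0 - math.exp(-2.0 * k * t))


# Hypothetical parameters; the seed is fixed only to make the run reproducible.
rng = random.Random(42)
samples = [simulate_final_concentration(1.0, 1.0, 0.2, 1.0, 200, rng) for _ in range(2000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / (len(samples) - 1)
print(sample_mean, theoretical_mean(1.0, 1.0, 1.0))
print(sample_var, theoretical_variance(1.0, 0.2, 1.0))
```

Because the noise term is additive and Gaussian, simulated concentrations can become negative for large $\sigma$, which is one of the simplifications of this model class.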
The design of chemical plants usually combines a bundle of chemical reactions and
basic operations, where the chemical reaction of interest is rather the core of a network
surrounded by preparation and follow-up operations. Predicting the behaviour of such
a network is difficult and usually not manageable by analytic models even if theoreti-
cal models are available for all components. Therefore, simulation approaches are often
applied.50
To give an example, Figure 2.8 shows the schematic sketch of a typical steam cracker
splitting naphtha (or similar hydrocarbons) into short-chained components.
The focal product is usually the C2 fraction, known as ethylene.52 The chemical process

46
See e.g. Behr et al. (2010, p. 68).
47
Here, the parameter μ = 0, see Uhlenbeck and Ornstein (1930).
48
See e.g. Hassler (2007).
49
See e.g. Hassler (2007).
50
See e.g. Behr et al. (2010).
51
See Behr et al. (2010, p. 179).
52
Ethene (also called ethylene) is the simplest alkene, consisting of only two carbon atoms.
21

"# $




 


 





 

!
 !
  
 

!


 
 
 
 

Figure 2.8: Schematic flow sheet of a steam cracker51

of splitting the feed’s long-chained molecules into smaller ones is processed in a pipe
reactor (located in a furnace to provide the necessary temperature). To separate the
produced mixture of short-chained chemicals a bundle of separation steps is processed in
a system of distillation columns.
Modelling the complete production process as a single-input multiple-output (SIMO) process
requires the modelling of the splitting process first. This process can be characterized as a
continuously processed decomposition reaction in a pipe reactor.53 A model to describe the complete chemical
splitting process contains 680 single chemical reactions describing the dependencies of 37
substances.54 This decomposition describes the kinetic behaviour of the reaction depending
on energy (heat) and pressure.55 Additionally, the environmental parameters have
to be modelled in terms of energy and mass equations.56 This model simulates the output
of the splitting process depending on certain control variables. In this case, mainly the
residence time of the feed in the pipe (between 0.1 and 2 seconds), the composition of the
feed, and the cracking temperature (between 800 and 900 °C) determine the composition
of the output mixture. Depending on the costs associated with energy and feed as well
as the value of the output mixtures, it is still an ongoing effort to derive optimization
models maximising the profit by finding the optimal control setting.57 However, to predict
the outcome of the whole plant, not only the composition of the cracking mixture but also
the quality of the separation processes has to be known. For the separation processes, additional
models were developed.58
53
Classification according to Figure 2.5.
54
See Edwin and Balchen (2001).
55
Details can be found in Sundaram and Froment (1978), Willems and Froment (1988a), and Willems
and Froment (1988b). The latter references give information about the necessary activation energies
of the specific reactions.
56
Here specific models were developed, see e.g. Ghashghaee and Karimzadeh (2007) for a comprehensive
one.
57
E.g. see Edwin and Balchen (2001).
58
See e.g. Qi et al. (2002) for a reactive distillation model for C4 separation process or specialized

Hence, modelling a chemical plant precisely requires a lot of effort. This effort is typically
expended when a new plant is planned. However, such a precise process model is neither
necessary nor useful for other purposes. In short-term control of chemical production plants,
so-called predictive control models are used which focus on measuring the system
responses to certain changes of control variables on an empirical basis. These models use
basic relations of the theoretical models introduced above. The next subsection gives
a brief overview of such models.

2.2.3 Process identification & control


In general, the processes controlled in the chemical industry are potentially
dangerous. Hence, the monitoring and control of chemical production plants is an important
topic in theory and practice. A process control scheme for chemical processes can be
depicted as shown in Figure 2.9.

[Figure: control loop with the blocks "controller", "chem. process", and "estimator" and the signals "targets", "controls", "inputs", "disturbances", "outputs", and "estimates/model"]

Figure 2.9: Chemical process control scheme59

The chemical process can be controlled by various variables which affect physical con-
ditions as well as the product and energy inflows of the chemical plant. Output variables
comprise the physical properties of the plant’s outflows. In case of continuously operated
plants, outputs are measured as flow rates. Besides the output quantity, the output
quality, e.g. in terms of the composition of the output mixture, is an important measure.
To link inputs/controls with output measures, an estimator quantifies the relationship
between both components. This estimator relies on a predefined model representing the
theoretical background of the estimated process. This model is completed by estimated
parameters which are based on historical records of the process’ performance. Note that
the estimator has to be able to distinguish between undesired, uncontrollable disturbances
and desired previous control effects. The resulting parametrized model can then be used
to predict the future behaviour of the process depending on the control settings. This
model is reported to the controller unit. The controller unit varies the control variables
to meet the desired target configuration of the production process. The optimal sequence
of control variables may depend on economic parameters (e.g. input prices) resulting in
a simultaneous optimization of controls and outputs. This implies a dynamic process

textbooks such as Seader et al. (2011).


59
Based on Darby et al. (2009).

control. Usually, however, predefined static optimal target values for output measures are
to be met.60
Since this work deals with the aggregated simulation and planning of chemical produc-
tion processes, the focus is laid upon methods to determine estimations of the process
models. For process control this task is the crucial one as the estimations’ accuracy de-
termines the accuracy of the whole control process. The task to find an accurate process
model is often called process identification.61 To describe the input-output behaviour of
(continuously operated) chemical production plants finite impulse response (FIR) models
are widely used. These models can be seen as regression models where the historical
records of input/control measures determine the output measure. The term "finite" indi-
cates that a finite number of historical records is used to predict the process’ outputs.62
Often, chemical processes show a significant time-dynamic behaviour which is typically
reflected in auto-correlated and cross-correlated process measures. However, classic re-
gression models do not incorporate auto-correlation explicitly which in turn leads to a loss
in estimation efficiency or, even worse, biased estimates.63 Therefore, time series meth-
ods can be applied to incorporate auto-correlation effects. According to the classification
shown in Table 2.1 four basic types of FIR models can be distinguished.

SISO model

For the simplest case of a univariate input variable and a univariate output measure the
following model describes a linear dependency between the input (or control) variable
$x = (x_1, \ldots, x_T)'$ and the output variable $y = (y_1, \ldots, y_T)'$

\[ y_t = \mu + \sum_{b=0}^{N} \upsilon_b \cdot B^b x_t + \xi_t \tag{2.25} \]

where $T$ denotes the time horizon, $N$ the maximal input lag, and $B$ the so-called back-shift operator
with $B^b x_t = x_{t-b}$. The parameters $\upsilon_b$ reflect the (lagged) influence of the control variable
on the output variable. $\mu$ is the average output level. The noise vector $\xi = (\xi_1, \ldots, \xi_T)'$
contains the disturbances that cannot be explained by the model. Note that $\xi$ can be
derived from some coloured noise model as well as from a classic white noise model.
White noise refers to a stochastic process with zero mean and no auto-correlation, while
coloured noise accounts for auto-correlation. If the process under study fulfils the white
noise assumptions, (2.25) can be interpreted as a simple regression model and can be

60
About the conception of model predictive control schemes see e.g. the brief overview in Darby et al.
(2009) or textbooks such as Camacho and Bordons (2004).
61
E.g. see the overview on chemical process modelling provided by Lewin et al. (2002) or textbooks about
the topic such as Ljung (1999).
62
This finite number of regressors can be chosen quite large which is indicated by the term "non-
parsimonious" FIR models, see Dayal and MacGregor (1996).
63
See the seminal papers Yule (1921) and Yule (1926) or some standard textbook on time series analysis,
e.g. Cryer and Chan (2008, pp. 260-265).

estimated using ordinary regression estimators.64 If the process generates auto-correlated


residuals, more elaborate time series models should be estimated.65 These are discussed
in the next section.
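
The regression interpretation of the SISO model can be sketched in a few lines of Python. The example below is an illustration under the white noise assumption only: it simulates data from (2.25) with hypothetical parameters and recovers them by ordinary least squares, solving the normal equations with a small Gaussian elimination routine. The function names `fir_output`, `solve_linear_system` and `estimate_fir` are introduced here for illustration and are not part of any cited method.

```python
import random


def fir_output(x, mu, upsilon, t):
    """Noise-free part of the SISO model: mu + sum_b upsilon[b] * x[t - b]."""
    return mu + sum(u * x[t - b] for b, u in enumerate(upsilon))


def solve_linear_system(a, rhs):
    """Solve a * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def estimate_fir(x, y, n_lags):
    """Ordinary least squares for (mu, upsilon_0, ..., upsilon_N) via the normal equations."""
    rows = [[1.0] + [x[t - b] for b in range(n_lags + 1)] for t in range(n_lags, len(x))]
    targets = y[n_lags:]
    p = n_lags + 2
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical "true" parameters and white measurement noise.
rng = random.Random(1)
true_mu, true_upsilon = 2.0, [0.5, 0.3, 0.1]
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
# The first two outputs lack a full input history and are excluded from the estimation.
y = [0.0 if t < 2 else fir_output(x, true_mu, true_upsilon, t) + rng.gauss(0.0, 0.05)
     for t in range(500)]
estimates = estimate_fir(x, y, n_lags=2)
print(estimates)
```

With auto-correlated noise the same estimator would still run, but its estimates would lose efficiency, which is the motivation for the time series models discussed next.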

SIMO model

For multiple outputs the univariate control variable approach (2.25) has to be
reformulated using matrix notation

\[ \mathbf{y}_t = \boldsymbol{\mu} + \sum_{b=0}^{N} \boldsymbol{\upsilon}_b \cdot B^b x_t + \boldsymbol{\xi}_t \tag{2.26} \]

where $\mathbf{y}_t = (y_{1,t}, \ldots, y_{L,t})'$ is the vector of $L$ output variables, $\boldsymbol{\upsilon}_b = (\upsilon_{1,b}, \ldots, \upsilon_{L,b})'$ the vector
of regression parameters, and $\boldsymbol{\xi}_t = (\xi_{1,t}, \ldots, \xi_{L,t})'$ the vector of output errors. Similarly,
$\Xi = (\xi_{1,1}, \ldots, \xi_{L,1}, \ldots, \xi_{1,T}, \ldots, \xi_{L,T})$ is an $L \times T$ matrix which is derived from an $L$-dimensional
stochastic process, either white or coloured. $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_L)'$ denotes the vector of output
levels.

MISO model

The MISO model is used to describe converging production processes. Here, the structure
is similar to a SISO model, where the input is some (say) $M$-dimensional vector such that

\[ y_t = \mu + \sum_{b=0}^{N} \boldsymbol{\upsilon}_b' \cdot B^b \mathbf{x}_t + \xi_t \tag{2.27} \]

with $\mathbf{x}_t = (x_{1,t}, \ldots, x_{M,t})'$ the vector of control variables and $\boldsymbol{\upsilon}_b = (\upsilon_{1,b}, \ldots, \upsilon_{M,b})'$ the vector
of regression parameters for lag $b$. For the noise term the same remarks hold as for the
SISO model.

MIMO model

The last and most general class of processes combines both previous models. Here, inputs
and outputs are multivariate, leading to the following multivariate regression model

\[ \mathbf{y}_t = \boldsymbol{\mu} + \sum_{b=0}^{N} \Upsilon_b \cdot B^b \mathbf{x}_t + \boldsymbol{\xi}_t \tag{2.28} \]

with $\mathbf{x}_t = (x_{1,t}, \ldots, x_{M,t})'$ the vector of control variables, $\mathbf{y}_t = (y_{1,t}, \ldots, y_{L,t})'$ the vector of
output variables, $\Upsilon_b = (\boldsymbol{\upsilon}_{1,b}, \ldots, \boldsymbol{\upsilon}_{L,b}) = (\upsilon_{1,1,b}, \ldots, \upsilon_{1,L,b}, \ldots, \upsilon_{M,L,b})$ the matrix of regression
parameters for lag $b$, and $\boldsymbol{\xi}_t = (\xi_{1,t}, \ldots, \xi_{L,t})'$ the vector of noise. Again, $\Xi$ is some $L \times T$
matrix sampled from a coloured or white $L$-dimensional stochastic process.
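
How the MIMO model maps lagged input vectors to an output vector can be made concrete with a small sketch. It assumes, purely for illustration, that the coefficient matrix for each lag is stored as $L$ rows of length $M$ so that the matrix-vector product yields an $L$-dimensional output; the function name and all numbers are hypothetical, and the noise term is omitted.

```python
def mimo_fir_output(x_history, mu, upsilon):
    """Noise-free part of the MIMO model: mu + sum_b Upsilon_b * x_{t-b}.

    x_history[b] holds the input vector x_{t-b} (length M) and upsilon[b] the
    coefficient matrix for lag b, stored as L rows of length M (an assumption).
    """
    if len(x_history) != len(upsilon):
        raise ValueError("one input vector per lag matrix is required")
    output = list(mu)
    for lag_matrix, x_lagged in zip(upsilon, x_history):
        for out_idx, row in enumerate(lag_matrix):
            output[out_idx] += sum(coef * value for coef, value in zip(row, x_lagged))
    return output


# Hypothetical example with M = 2 inputs, L = 2 outputs and N = 1 lag.
mu = [1.0, 0.0]
upsilon = [
    [[0.5, 0.0], [0.25, 0.25]],  # coefficient matrix for lag 0
    [[0.25, 0.0], [0.0, 0.5]],   # coefficient matrix for lag 1
]
x_history = [[2.0, 4.0], [4.0, 8.0]]  # x_t and x_{t-1}
y_t = mimo_fir_output(x_history, mu, upsilon)
print(y_t)  # [3.0, 5.5]
```

The SISO, SIMO and MISO models are the special cases of this sketch with $M = L = 1$, $M = 1$, and $L = 1$, respectively.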
64
E.g. see Dayal and MacGregor (1996) for more information.
65
Note that there is still the option to ignore the time-correlation in favour of increasing the input lag
N . It can be shown that this model yields the correct estimates asymptotically, see Ljung (1999).
However, the estimates based on finite samples may be poor, see Dayal and MacGregor (1996).

2.3 Time series methodology


The models (2.25)-(2.28) can be interpreted as regression models if the corresponding noise
processes are white. Otherwise, the time-dependency structures of the processes have to
be taken into account to obtain accurate estimates of the desired regression coefficients.66
Therefore, (2.25)-(2.28) can be transformed into standard time series models for which
established estimation procedures already exist. For the single output cases ((2.25) and
(2.27)), this results in so-called autoregressive, moving-average models with exogenous
regressors (or briefly ARMAX models) whereas multiple output processes lead to so-called
vector ARMAX models (VARMAX).
The remainder of this section introduces the relevant notation with an additional focus
on the extension to heteroscedastic models (so-called (G)ARCH and ARMA-GARCH
models) as these can be seen as the discrete-time counterpart of continuous stochastic
processes formulated in terms of SDEs.

2.3.1 ARIMA models


This section describes the class of the most common ARMA models and some of their
extensions. The term ARMA combines both basic types of time-dependencies, the autoregressive
(AR) model and the moving average (MA) model. Suppose a time series $y =
(y_1, \ldots, y_T)$ with zero mean, collected over $T$ periods. Autoregressive dependency means
that any observation $y_t$ depends on previous observations $y_{t-i}$ of this time series with
$i = 1, \ldots, p$ such that

\[ y_t = \sum_{i=1}^{p} \phi_i \cdot y_{t-i} + \epsilon_t \tag{2.29} \]

with $\epsilon_t \sim wn(0, \sigma^2)$67 for all $t \in \{p+1, \ldots, T\}$. This denotes an autoregressive process of order
$p$ (in short AR($p$)). Typically, the stochastic nature of the analysed process is assumed
to be stable in time, i.e. the stochastic characteristics (such as variance, autocorrelations
and mean) of the recorded process do not change in time. This is called the stationarity
premise, which implies some restrictions for the parameters $\phi_i$. E.g. suppose an AR(1)
process with $\phi = 1$.68 It follows

\[ y_t = y_{t-1} + \epsilon_t = y_{t-2} + \epsilon_{t-1} + \epsilon_t = \ldots = \sum_{i=0}^{t} \epsilon_i. \tag{2.30} \]

66
Alternatively, the noise type can be ignored by applying so-called non-parsimonious finite impulse-response
(FIR) models, i.e. by increasing N , the number of lagged observations of the control variable(s). This
method has some serious drawbacks, e.g. biased and unstable estimates/forecasts in case of finite
samples. See e.g. Dayal and MacGregor (1996) and references therein.
67
The term $wn(\mu, \sigma^2)$ refers to a white noise process with mean $\mu$ and (constant) variance $\sigma^2$.
68
This defines a so-called random walk.

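For a sum of white noise residuals $\sum_{i=1}^{t} \epsilon_i$ (with the sum started at $i = 1$, i.e. for a starting value $y_0 = \epsilon_0 = 0$) it holds that

\begin{align}
E(y_t) &= E\left(\sum_{i=1}^{t} \epsilon_i\right) = \sum_{i=1}^{t} E(\epsilon_i) = 0 \tag{2.31} \\
Var(y_t) &= Var\left(\sum_{i=1}^{t} \epsilon_i\right) = \sum_{i=1}^{t} Var(\epsilon_i) = t \cdot \sigma^2 \tag{2.32} \\
Cov(y_t, y_s) &= Cov\left(\sum_{i=1}^{t} \epsilon_i, \sum_{j=1}^{s} \epsilon_j\right) = \sum_{i=1}^{t} \sum_{j=1}^{s} Cov(\epsilon_i, \epsilon_j) = t \cdot \sigma^2 \quad \text{with } s \geq t. \tag{2.33}
\end{align}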

Hence, this process is not stationary due to the time-varying, unbounded variance and
covariance.69 Only AR(1) processes with $|\phi| < 1$ are stationary. This result can be generalized
for any AR($p$) model for which the following conditions must hold:70

\begin{align}
\sum_{i=1}^{p} \phi_i &< 1 \tag{2.34} \\
|\phi_i| &< 1, \quad i = 1, \ldots, p. \tag{2.35}
\end{align}
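
The difference between a random walk and a stationary AR(1) process can be made visible by simulation. The sketch below is an illustration with hypothetical parameters: it generates many independent paths with Gaussian white noise, starting at $y_0 = 0$, and compares the variance across paths at two points in time. For the random walk the variance should grow roughly like $t \cdot \sigma^2$, whereas for $\phi = 0.5$ it should level off near $\sigma^2 / (1 - \phi^2)$, the well-known stationary variance of an AR(1) process.

```python
import random


def ar1_path(phi, sigma, length, rng):
    """Simulate y_t = phi * y_{t-1} + eps_t with y_0 = 0; returns [y_1, ..., y_length]."""
    y, path = 0.0, []
    for _ in range(length):
        y = phi * y + rng.gauss(0.0, sigma)
        path.append(y)
    return path


def variance_across_paths(paths, t):
    """Sample variance of y_t over independent paths (t counts from 1)."""
    values = [p[t - 1] for p in paths]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


# Hypothetical settings; the seed is fixed only for reproducibility.
rng = random.Random(7)
walks = [ar1_path(1.0, 1.0, 100, rng) for _ in range(4000)]   # random walk, phi = 1
stable = [ar1_path(0.5, 1.0, 100, rng) for _ in range(4000)]  # stationary case, phi = 0.5

for t in (25, 100):
    print(t, round(variance_across_paths(walks, t), 1), round(variance_across_paths(stable, t), 2))
```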

In the moving average (MA) model $y_t$ depends on the previous errors $\epsilon_{t-i}$ instead of
previous observations. An MA model of order $q$ (MA($q$)) is represented by

\[ y_t = \sum_{j=1}^{q} \theta_j \cdot \epsilon_{t-j} + \epsilon_t \tag{2.36} \]

with $\epsilon_t \sim wn(0, \sigma^2)$ for all $t \in \{1, \ldots, T\}$. For MA processes no stationarity conditions have
to be considered, even an infinite MA($\infty$) process remains stationary.71 MA processes
have another special characteristic: their non-uniqueness with respect to the (first lag)
auto-correlation $\rho$. Suppose a time series generated by an MA(1) process with parameter
$\theta$: $y_t = \theta \cdot \epsilon_{t-1} + \epsilon_t$. The autocorrelation $\rho = \rho(y_t, y_{t-1})$ can be expressed as

\[ \rho = \frac{Cov(y_t, y_{t-1})}{Var(y_t)}. \tag{2.37} \]

For an MA(1) process it holds72

\begin{align}
Var(y_t) &= (1 + \theta^2) \cdot \sigma^2, \tag{2.38} \\
Cov(y_t, y_{t-1}) &= \theta \cdot \sigma^2, \tag{2.39} \\
\rho &= \frac{\theta}{1 + \theta^2}. \tag{2.40}
\end{align}
69
The covariance expression can be understood intuitively as the sum of variances due to the error terms
commonly shared by both variables yt and ys . For more detailed deductions see Cryer and Chan
(2008, pp. 12-13) or Brockwell and Davis (2002, pp. 16-17).
70
These conditions are necessary but not sufficient. Precisely, all roots of the characteristic equation $\phi(x) = 0$
must have an absolute value greater than 1 (with the characteristic polynomial $\phi(x) = 1 - \phi_1 \cdot x - \phi_2 \cdot x^2 -
\ldots - \phi_p \cdot x^p$), see Schlittgen and Streitberg (2004, pp. 121-132) or Cryer and Chan (2008, pp. 72-77).
71
See Cryer and Chan (2008, pp. 55-56).
72
See e.g. Cryer and Chan (2008).

However, setting $\theta' = \frac{1}{\theta}$ yields exactly the same auto-correlation $\rho$. Hence, there are at least
two possible values for $\theta$ producing time series with exactly the same auto-correlation
structure.73 This problem is related to the stationarity condition of AR processes. To solve
this problem the invertibility condition is introduced: an MA process must be invertible
into an infinite AR process. This holds if and only if all roots of the
characteristic polynomial $\theta(x) = 1 + \theta_1 \cdot x + \theta_2 \cdot x^2 + \ldots + \theta_q \cdot x^q$ have absolute values
larger than 1.74 Given a specific MA process with known order and unknown parameter
(set), there exists only one parameter (set) such that this MA process is invertible.75
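
The non-uniqueness of the MA(1) parameter can be illustrated with a short sketch. It assumes Gaussian white noise and uses hypothetical function names and values: the theoretical lag-1 autocorrelation is evaluated for $\theta$ and $1/\theta$, both processes are simulated, and the invertibility check uses the fact that the only root of $1 + \theta \cdot x$ is $-1/\theta$.

```python
import math
import random


def ma1_autocorrelation(theta):
    """Theoretical lag-1 autocorrelation theta / (1 + theta^2) of an MA(1) process."""
    return theta / (1.0 + theta ** 2)


def is_invertible_ma1(theta):
    """The root of 1 + theta * x is -1/theta, so invertibility requires |theta| < 1."""
    return abs(theta) < 1.0


def simulate_ma1(theta, length, rng, sigma=1.0):
    """Simulate y_t = theta * eps_{t-1} + eps_t with Gaussian white noise."""
    eps_prev = rng.gauss(0.0, sigma)
    series = []
    for _ in range(length):
        eps = rng.gauss(0.0, sigma)
        series.append(theta * eps_prev + eps)
        eps_prev = eps
    return series


def sample_lag1_autocorrelation(series):
    """Empirical lag-1 autocorrelation of a series."""
    mean = sum(series) / len(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series[1:], series[:-1]))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


rng = random.Random(3)
theta = 0.5  # hypothetical value
rho_direct = sample_lag1_autocorrelation(simulate_ma1(theta, 20_000, rng))
rho_inverse = sample_lag1_autocorrelation(simulate_ma1(1.0 / theta, 20_000, rng))
print(ma1_autocorrelation(theta), ma1_autocorrelation(1.0 / theta), rho_direct, rho_inverse)
```

Both parameter values lead to the same autocorrelation, but only the one with $|\theta| < 1$ satisfies the invertibility condition.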
Both types of auto-correlation models are rarely found in real world problems in genuine
form, but in combination they build a huge class of time series patterns summarized as
so-called ARMA models. An ARMA(p,q) model can be formalized as:
$$y_t = \sum_{i=1}^{p} \phi_i \cdot y_{t-i} + \sum_{j=1}^{q} \theta_j \cdot \epsilon_{t-j} + \epsilon_t \qquad (2.41)$$

with $\epsilon_t \sim wn(0, \sigma^2)$ for all t ∈ {1, ..., T}. To assure stationarity and invertibility the process has to satisfy the stationarity conditions from its AR part and the invertibility conditions from its MA part.76 Basic ARMA models only include ex ante stationary models but, with small extensions, this class can also cover difference stationary time series. Consider again a random walk model (which is clearly non-stationary):

$$y_t = y_{t-1} + \epsilon_t. \qquad (2.42)$$

Taking the first differences results in a stationary time series since only the white noise process remains:

$$\Delta y_t = y_t - y_{t-1} = \epsilon_t. \qquad (2.43)$$

The time series Δyt is stationary. Hence, the original series yt is called difference stationary.77 Differencing (i.e. the concept of taking differences) extends the ARMA class to the most popular ARIMA class where ARIMA stands for autoregressive integrated moving average. The concept of differencing can be extended to higher orders leading to complex mathematical formulations in traditional notation.78 Therefore, the general ARIMA model is formulated compactly as

$$\phi(B)(1 - B)^d y_t = \theta(B) \epsilon_t \qquad (2.44)$$

where φ(B) and θ(B) denote the characteristic polynomials for the AR and MA part as

73
For details see Schlittgen and Streitberg (2004, p. 117).
74
See Schlittgen and Streitberg (2004, pp. 124-132).
75
See e.g. Cryer and Chan (2008, p. 80).
76
For details see Shumway and Stoffer (2006, pp. 93-97).
77
Alternatively, also the term integrated of order 1 is used. This term suggests that this concept can be
applied to differenced time series again, see e.g. Pfaff (2006, pp. 23-24).
78
For more details about higher-order differencing see e.g. Shumway and Stoffer (2006, pp. 98-103) or
Cryer and Chan (2008, pp. 87-98).

described above.
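The effect of the difference operator (1 − B) on a random walk can be illustrated with a few lines of code. This is a minimal sketch using NumPy; the sample size, the seed and the unit error variance are arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
T = 500

# White noise and the random walk y_t = y_{t-1} + eps_t, see (2.42)
eps = rng.normal(loc=0.0, scale=1.0, size=T)
y = np.cumsum(eps)

# First differences (1 - B) y_t recover the white noise, see (2.43)
dy = np.diff(y)

# Second-order differencing (1 - B)^2 y_t = y_t - 2 y_{t-1} + y_{t-2}
d2y = np.diff(y, n=2)

print("max |dy - eps| :", np.max(np.abs(dy - eps[1:])))
print("variance of y  :", y.var())
print("variance of dy :", dy.var())
```

The differenced series coincides with the simulated white noise (up to rounding), whereas the variance of the undifferenced random walk depends on the sample length.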
A useful extension of classic ARIMA models are models incorporating exogenous explanatory variables, so-called ARIMAX models. As the name suggests, additional parameters are added to the ARIMA model. The use of explanatory variables usually results from knowledge about the process’ external dependencies, but sometimes it is also reasonable to model outliers or structural changes of the time series using auxiliary exogenous variables.79
Assume the SISO model (2.25) where ξ is coloured noise and follows an ARMA process
with lag polynomials φξ (B) and θξ (B). Without loss of generality μ is assumed to be
zero and (2.25) changes to

$$\phi_\xi(B) y_t = \sum_{b=0}^{N} \upsilon_b B^b x_t + \theta_\xi(B) \epsilon_t \qquad (2.45)$$

where $\epsilon_t$ is white noise. Incorporating an external regression variable into ARIMA models leads to some complications. Basically, there are two distinct cases: First, the
regression variable x = (x1 , ..., xT )′ is deterministic. In this case, a recursive estimation
procedure starting with a standard regression analysis can be applied leading to the
maximum likelihood estimates of the regression parameters under assumption of Gaussian
(white) noise.80
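In the second case, the regression variable x is a (stochastic) time series, i.e. $\phi_x(B) x_t = \theta_x(B) \varepsilon_t$, such that (2.45) changes to

$$\phi_\xi(B) y_t = \sum_{b=0}^{N} \upsilon_b B^b \frac{\theta_x(B)}{\phi_x(B)} \varepsilon_t + \theta_\xi(B) \epsilon_t \qquad (2.46)$$

where $\varepsilon_t$ is a white noise series.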
A first approach to solve such a problem is to perform a regression analysis between the time series y and the regression variable x, ignoring the fact that both variables are auto-correlated and stochastic. Afterwards, the residuals from this first step could be obtained and a time series model could be fitted to these residuals, e.g. using ARIMA models. Unfortunately, this approach can lead to biased and inefficient estimates even if the sample is large.81 Due to the time series characteristics of the dependent and independent variables, the cross-correlations between x and y might be overlaid by the individual temporal dependencies of x and y. To obtain (reasonable) estimates for the impulse response parameters υ = (υ0, ..., υN), the time series y has to be "cleaned" from the time series effects of the regressor variables x. This procedure is called prewhitening and is performed in the following steps:

1. obtain estimates for $\phi_x$ and $\theta_x$ (by analysing x)


79
See Cryer and Chan (2008, ch. 11).
80
See Shumway and Stoffer (2006, p. 293 ff.) for details.
81
See Cryer and Chan (2008, p. 40) and referred literature for details.

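2. transform $y_t$ and $x_t$ by the estimates $\hat{\phi}_x$ and $\hat{\theta}_x$: $\tilde{y}_t = \frac{\hat{\phi}_x(B)}{\hat{\theta}_x(B)} y_t$ and $\hat{\varepsilon}_t = \frac{\hat{\phi}_x(B)}{\hat{\theta}_x(B)} x_t$

3. obtain estimates $\hat{\upsilon}$ by calculating the cross-correlation between $\tilde{y}_t$ and $\hat{\varepsilon}_t$

4. obtain the residuals $\hat{\xi}_t = y_t - \sum_{b=0}^{N} \hat{\upsilon}_b B^b x_t$

5. based on $\hat{\xi}_t$ determine estimates for $\phi_\xi$ and $\theta_\xi$.82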
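The prewhitening steps can be sketched for a deliberately simple setting. The example below assumes that x follows an AR(1) process (so that θ_x(B) = 1), that N = 1 and that ξ is white noise; all parameter values, the least-squares estimator in step 1 and the helper `cross_covariance` are illustrative assumptions, not the procedure of the cited references. Step 5 (fitting a model to the residuals) is omitted.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
T = 5000
phi_x_true = 0.8                       # hypothetical AR(1) parameter of x
upsilon_true = np.array([1.5, -0.5])   # hypothetical impulse responses

# Simulate x_t = phi_x * x_{t-1} + eps_t (one extra value supplies the first lag)
eps = rng.normal(size=T + 1)
x_full = np.zeros(T + 1)
for t in range(1, T + 1):
    x_full[t] = phi_x_true * x_full[t - 1] + eps[t]
x, x_lag = x_full[1:], x_full[:-1]

# Output y_t = upsilon_0 * x_t + upsilon_1 * x_{t-1} + xi_t (xi white for simplicity)
xi = rng.normal(scale=0.5, size=T)
y = upsilon_true[0] * x + upsilon_true[1] * x_lag + xi

# Step 1: estimate the AR(1) parameter of x by least squares
phi_x_hat = np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])

# Step 2: apply the same filter phi_x_hat(B) to x and y
eps_hat = x[1:] - phi_x_hat * x[:-1]
y_tilde = y[1:] - phi_x_hat * y[:-1]


def cross_covariance(a, b, lag):
    """Sample covariance between a_t and b_{t-lag}."""
    a_c, b_c = a - a.mean(), b - b.mean()
    return np.dot(a_c[lag:], b_c[:len(b_c) - lag]) / len(a_c)


# Step 3: impulse responses from the scaled cross-covariances
upsilon_hat = np.array(
    [cross_covariance(y_tilde, eps_hat, b) for b in range(2)]
) / eps_hat.var()

# Step 4: residuals of the original series
xi_hat = y[1:] - upsilon_hat[0] * x[1:] - upsilon_hat[1] * x[:-1]

print("phi_x_hat   :", phi_x_hat)
print("upsilon_hat :", upsilon_hat)
print("Var(xi_hat) :", xi_hat.var())
```

Because the filtered input is (approximately) white noise, the cross-covariance at lag b isolates the single coefficient υ_b, which is the reason for filtering both series with the model of x.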

In the case of the MISO model with multiple, say M, input variables the prewhitening approach can be easily adapted by iteratively performing steps 1 and 2 of the prewhitening procedure for all input variables to obtain estimates for $\upsilon_i$, i = 1, ..., M. The residual series is finally calculated by $\hat{\xi}_t = y_t - \sum_{b=0}^{N_1} \hat{\upsilon}_{1,b} B^b x_{1,t} - ... - \sum_{b=0}^{N_M} \hat{\upsilon}_{M,b} B^b x_{M,t}$.83

2.3.2 GARCH models


Chemical processes can be modelled in detail as a system of equations and differential equations based on chemical and physical laws. These laws rely on static assumptions about the environment where the corresponding processes take place. In practice, all components are influenced by stochastic factors that can affect the static process behaviour and/or the dynamic process characteristics. Incorporating continuous stochastic processes into a differential equation leads to stochastic differential equations. Linear stochastic differential equations (SDEs) are usually formulated in the following general form using the Wiener process W(t):

$$dy(t) = c_1(t) y(t)\,dt + c_2(t)\,dt + (c_3(t) y(t) + c_4(t))\,dW(t) \qquad (2.47)$$

where W (t) is the Wiener process which is characterized by the following properties:84

• P (W (0) = 0) = 1,

• W (t) has normally distributed increments W (t) − W (s) ∼ N (0, t − s) with 0 ≤ s < t,

• all paths are almost surely continuous.

It can be shown that for a linear SDE with coefficient functions ci (t) (which are con-
tinuous in t) an explicit, unique solution always exists and the SDE has finite first and
second-order moments.85
Time series models and SDEs deal with the same sort of stochastic process. Both differ
only in the domain of variables, which are either discrete or continuous. For instance,
chemical processes are continuous by nature. In practice, however, the condition of a

82
Details about prewhitening can be found e.g. in Box et al. (2008, p. 417 ff.) or Cryer and Chan (2008,
p. 265 ff.).
83
See e.g. Wei (1990, p. 328 ff.).
84
See Hassler (2007, p. 117) and Iacus (2008, p. 18 ff.)
85
See Hassler (2007) and Øksendal (2003).

chemical reaction is measured by various sensors in discrete time intervals. Hence, the
question arises whether discrete time series models can be seen as discretized SDEs.
It can be shown that ARCH and GARCH models are able to approximate stochastic
differential processes if the latter fulfil certain properties.86 Albeit the goodness of fit
is limited,87 both types of methods are related and can be converted into each other.88
Moreover, simple stochastic processes show quite simple auto-correlation structures simi-
lar to basic ARMA models. For instance, the Ornstein-Uhlenbeck process can be seen as
the continuous equivalent of the AR(1) process. In other words, an Ornstein-Uhlenbeck
process measured in discrete intervals can be interpreted/modelled as an AR(1) process
(see also (2.23), (2.60), and (2.61)).89
This subsection briefly introduces GARCH models as discrete counterpart of continuous
stochastic processes. In contrast to ARMA models, the basic idea is that the variance/
volatility in time is no longer deterministic and constant but depends on previous errors
and volatility, i.e.

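$$y_t = \mu + \epsilon_t \qquad (2.48)$$
$$\sigma_t^2 = f(\sigma_{t-1}^2, ..., \sigma_{t-p}^2, \epsilon_{t-1}, ..., \epsilon_{t-q}) \qquad (2.49)$$
$$\epsilon_t \sim wn(0, \sigma_t^2). \qquad (2.50)$$

The model above is called a GARCH(p,q) model if f(⋅) is a linear function such that (2.49) changes to

$$\sigma_t^2 = \omega + \alpha_1 \epsilon_{t-1}^2 + ... + \alpha_q \epsilon_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + ... + \beta_p \sigma_{t-p}^2. \qquad (2.51)$$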

If the volatility depends only on the previous errors $\epsilon_{t-i}$ (i.e. p = 0 in (2.51)), the corresponding models are called ARCH(q). (G)ARCH processes are known to be weakly stationary if $\sum_{i=1}^{q} \alpha_i + \sum_{i=1}^{p} \beta_i < 1$.90
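The volatility recursion and the stationarity condition can be made concrete with a small simulation. The sketch below assumes a GARCH(1,1) model with zero mean and Gaussian conditional errors; the parameter values are hypothetical. It uses the well-known result that, under weak stationarity, the unconditional variance of a GARCH(1,1) process equals ω / (1 − α1 − β1).

```python
import numpy as np


def garch_is_weakly_stationary(alpha, beta):
    """Check the condition sum(alpha_i) + sum(beta_i) < 1."""
    return sum(alpha) + sum(beta) < 1.0


def simulate_garch11(omega, alpha1, beta1, T, rng):
    """Simulate errors eps_t and variances sigma_t^2 of a GARCH(1,1) model."""
    sigma2 = np.empty(T)
    eps = np.empty(T)
    sigma2[0] = omega / (1.0 - alpha1 - beta1)  # start at the unconditional variance
    eps[0] = rng.normal(0.0, np.sqrt(sigma2[0]))
    for t in range(1, T):
        sigma2[t] = omega + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
        eps[t] = rng.normal(0.0, np.sqrt(sigma2[t]))
    return eps, sigma2


omega, alpha1, beta1 = 0.2, 0.1, 0.7   # hypothetical parameters
rng = np.random.default_rng(seed=11)
eps, sigma2 = simulate_garch11(omega, alpha1, beta1, T=20000, rng=rng)

print("weakly stationary      :", garch_is_weakly_stationary([alpha1], [beta1]))
print("unconditional variance :", omega / (1.0 - alpha1 - beta1))
print("sample variance        :", eps.var())
```

For these parameters the sample variance should settle close to the unconditional variance, while the conditional variance σ_t² keeps fluctuating over time.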
(G)ARCH models have become very popular in empirical economic research. In the past decades a vast number of extensions and generalizations have been published.91
These numerous adaptations of the basic model consisting of (2.48), (2.50) and (2.51)
are able to incorporate very specific time series characteristics and are usually driven by
empirical observations and experiences.92 A relevant extension in the context of chemical

86
See Nelson (1990) and Duan (1997).
87
See Wang (2002).
88
E.g. see Fornari and Mele (2001) for an application of GARCH models as diffusion approximations
fitted to financial data sets.
89
See e.g. Hassler (2007, p. 236) about the similarity between both processes and Uhlenbeck and Ornstein
(1930) or Vasicek (1977) for details about the Ornstein-Uhlenbeck process.
90
E.g. see Hassler (2007, p. 96). To assure stationarity for higher moments (>2) more restrictive condi-
tions have to be met, see Ling and McAleer (2002).
91
See Engle (2002) for a review.
92
E.g. see Bauwens et al. (2006) for a review of recent multivariate GARCH models and Bollerslev (2008),
Hansen and Lunde (2005) or Degiannakis and Xekalaki (2004) for a review of univariate (G)ARCH

production processes are so-called GARCH-M and ARMA-GARCH models. For this class
of models the mean process in (2.48) is not a constant, but a stochastic process itself.
The GARCH-M(p,q) mean process is modelled as function of σ:93

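$$y_t = \mu \cdot \sigma_t + \epsilon_t. \qquad (2.52)$$

For μ > 0, an increase in volatility leads to an increase in the mean. Similarly, for an ARMA-GARCH(P,Q)(p,q) model the mean process follows an ARMA(P,Q) process whereas the volatility is modelled by a GARCH(p,q) process:

$$y_t = \sum_{i=1}^{P} \phi_i \cdot y_{t-i} + \sum_{j=1}^{Q} \theta_j \cdot \epsilon_{t-j} + \epsilon_t \qquad (2.53)$$
$$\sigma_t^2 = \sigma_0^2 + \alpha_1 \epsilon_{t-1}^2 + ... + \alpha_q \epsilon_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + ... + \beta_p \sigma_{t-p}^2 \qquad (2.54)$$
$$\epsilon_t \sim wn(0, \sigma_t^2). \qquad (2.55)$$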

This class of models allows formulating a great variety of dependency patterns and is used
in many empirical applications.94 Its most appealing advantage is the possibility to model
both a time-dependency structure for the observed series and for their volatility combined
in one model. For modelling chemical production processes, the latter property is useful
for describing the variations in a chemical process’ equilibrium. In normal operation,
chemical reactions take place under controlled physical and chemical conditions. The
parameters of these reactions such as conversion rates and reaction times depend on
these conditions and determine the stochastic processes of output and input measures.
Variations in process control variables affect these conditions and have an immediate
impact on the parameters of the underlying reactions. This may change the structures
of the input and output processes. As a consequence, a destabilization of the chemical
process might occur resulting in an increase in the measures’ volatility.
The incorporation of exogenous regressors (leading to ARMAX-GARCH models) is less
emphasized in the literature since it is similar to ARMAX models. In this case (2.53) is
substituted by (2.45). For estimation of the corresponding parameters, prewhitening can
be applied to eliminate the influence of the exogenous time series variables. Afterwards,
the ARMA-GARCH model can be estimated on the residuals.95

2.3.3 Multivariate time series models


The methodology for multivariate time series models is similar to the univariate cases
except for the fact that all notation is changed into vectors and matrices such that most

models.
93
See Hassler (2007).
94
See Li et al. (2002) for an extensive review and references therein for applications.
95
It has to be noted that the estimation of ARMA-GARCH models requires some more sophisticated
methods compared to simple ARMA models, see Francq and Zakoian (2004).

concepts can be transferred more or less easily. This section briefly introduces the key multivariate time series models, so-called vector autoregressive moving average processes (VARMA), and their extensions with exogenous regressors (VARMAX). Multivariate GARCH processes (MGARCH) are not discussed in detail here.96
Suppose a multivariate time series of dimension L collected at T points of time, say $Y \in \mathbb{R}^{L \times T}$. The basic types of dependency, auto-regression and moving average, are used to formulate a model explaining the behaviour of this multivariate time series. Suppose that the observations of Y at a given time t, say $y_t \in \mathbb{R}^L$, depend on the previous observations $y_{t-i}$ with i = 1, ..., p and previous error terms, say $\epsilon_{t-j}$ with j = 1, ..., q. This relation can be formulated as a VARMA(p,q) model

$$y_t = \sum_{i=1}^{p} \Phi_i y_{t-i} + \sum_{j=1}^{q} \Theta_j \epsilon_{t-j} + \epsilon_t \qquad (2.56)$$
$$\Leftrightarrow \; y_t = \sum_{i=1}^{p} \Phi_i B^i y_t + \sum_{j=1}^{q} \Theta_j B^j \epsilon_t + \epsilon_t \qquad (2.57)$$
$$\Leftrightarrow \; \Phi_p(B) y_t = \Theta_q(B) \epsilon_t \qquad (2.58)$$

where Φi and Θj are L × L parameter matrices. $\Phi_p(B) = I - \Phi_1 B - ... - \Phi_p B^p$ and $\Theta_q(B) = I + \Theta_1 B + ... + \Theta_q B^q$ denote the corresponding lag polynomials. It is required that all matrices are non-singular and the covariance matrix of $\epsilon_t$ is positive definite. Similar to the univariate case, the process is stationary and invertible if all roots of the determinantal polynomials ∣Φp(B)∣ and ∣Θq(B)∣ lie outside of the unit circle.97
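The root condition for the AR part can be checked numerically without solving the determinantal polynomial directly. The sketch below relies on the standard equivalence that all roots of ∣Φp(B)∣ lie outside the unit circle exactly when all eigenvalues of the companion matrix lie inside it. The function name and the example matrices are illustrative assumptions.

```python
import numpy as np


def var_is_stationary(phis):
    """Check stationarity of a VAR(p) part given [Phi_1, ..., Phi_p] (each L x L).

    The companion matrix stacks the coefficient matrices in its first block row
    and identity blocks below; stationarity holds if its spectral radius is < 1.
    """
    p = len(phis)
    L = phis[0].shape[0]
    companion = np.zeros((L * p, L * p))
    companion[:L, :] = np.hstack(phis)
    if p > 1:
        companion[L:, :-L] = np.eye(L * (p - 1))
    spectral_radius = np.max(np.abs(np.linalg.eigvals(companion)))
    return bool(spectral_radius < 1.0)


# Hypothetical bivariate VAR(1) coefficient matrix
phi_stable = np.array([[0.5, 0.1],
                       [0.2, 0.3]])
# Bivariate random walk: a root on the unit circle, hence not stationary
phi_unit_root = np.eye(2)

print("stable example    :", var_is_stationary([phi_stable]))
print("unit-root example :", var_is_stationary([phi_unit_root]))
```

The invertibility of the MA part can be checked with the same function by passing the matrices −Θ1, ..., −Θq, since ∣I + Θ1 B + ...∣ has the same form with negated coefficients.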
Model (2.58) can easily be extended to a VARMAX model by incorporating exogenous variables. Let $X \in \mathbb{R}^{N \times T}$ be the matrix of explanatory variables, then (2.58) is reformulated to

$$\Phi_p(B) y_t = \mu + \Upsilon_N(B) x_t + \Theta_q(B) \epsilon_t \qquad (2.59)$$

where $\Upsilon_N(B) = \Upsilon_0 + \Upsilon_1 B + ... + \Upsilon_N B^N$ is the polynomial of the transfer matrices. The model is equal to the MIMO model (2.28) if Ξ is assumed to be a VARMA process of order (p, q).98 If N = 1, (2.59) reduces to the SIMO model (2.26).99 For the estimation of VARMAX models integrated estimation procedures are available.100 The use of time series as exogenous variables does not pose a direct problem since these variables can be handled as endogenous101 such that their dependency structure is simultaneously estimated. However, such a procedure complicates the model and increases the

96
For an introduction see e.g. Bauwens et al. (2006).
97
About the stationarity and invertibility conditions see e.g. Wei (1990, pp. 335 ff.).
98
For an application of an MIMO model in chemical process modelling see Barceló et al. (2011).
99
For more details about the relationship between multivariate finite impulse response models and (par-
simonious) VARMAX models in chemical process analysis, see e.g. Seppala et al. (2002).
100
See e.g. Hall and Nicholls (1980) for a direct maximum likelihood estimation or Metaxoglou and Smith
(2007) for estimation based on a transformation into state-space models.
101
I.e. these variables can be included in Y.

number of parameters to be estimated and, hence, may hamper the model’s feasibility.102
If exogeneity can be assumed (e.g. in case of control variables), a direct modelling of the
exogenous variables’ stochastic models is the preferred option. Similar to the prewhiten-
ing approach, a time series model for the exogenous variables has to be fitted which can
be used to improve the estimates of the original VARMA model.103

2.3.4 Data preparation, model specification and residual checking


The previous section points out which structure of a time series model to choose depending on the general type of the production plant to be modelled. Once a model class is selected, the corresponding model’s parameters have to be specified based on historical records of the investigated plant. The model fitted to the historical records then has to be checked for adequacy, i.e. the residuals of the model have to be checked for the assumptions inherent to the chosen model (typically white noise).
In a first step, data describing the input and output flows has to be acquired. In
basic chemical industry, most inputs and outputs are flow rates recorded in discrete time
intervals. For short-term plant control, these intervals are very short.104 For the purpose
of simulation such a short interval is not in all cases advantageous as it determines the
granularity of the subsequent simulation model and, hence, its complexity. Moreover,
the time-dependency structure is immediately affected by the choice of the recording
interval. In principle, the shorter the recording interval, the larger the expected order of the process.105 At the other extreme, an overly long recording interval will most probably not show significant time-dependencies as short-term dependencies diminish or cancel out by summation/averaging.
To give an example for the influence of the length of the recording interval, let y(t) denote the value of an Ornstein-Uhlenbeck process as introduced in (2.23) at time t. For this specific process the autocorrelation ρ for t → ∞ can be expressed as106

$$\rho(y(t), y(t + h)) = e^{-k \cdot h}. \qquad (2.60)$$

Let Δt denote the recording interval and let h = m ⋅ Δt be a multiple of Δt. Then $y_{t+m} = y(t + m \cdot \Delta t)$ is the discrete equivalent of y(t + h). For the autocorrelation of two measures it holds that

$$\rho(y(t), y(t + m \cdot \Delta t)) = \rho(y_t, y_{t+m}) = e^{-k \cdot m \cdot \Delta t} = \left(e^{-k \cdot \Delta t}\right)^m = \phi^m. \qquad (2.61)$$

102
See e.g. Horváth (2003).
103
In the state-space reformulation of the VARMAX model, the time series model of the exogenous
variables is used to predict initial values of the regressor matrix, see Metaxoglou and Smith (2007).
104
Intervals about or shorter than one second are frequently found, see Darby et al. (2009).
105
I.e. the parameters p and q in the ARIMA models introduced before, see e.g. (2.41).
106
For the underlying expressions for asymptotic variance and autocovariance, see Hassler (2007, p. 237).

Note that (2.61) represents the autocorrelation function of an AR(1) process with parameter $\phi = e^{-k \cdot \Delta t}$. That is, with Δt → ∞ the parameter φ converges towards 0. In other words, although the type of the discrete time series process remains unchanged, the corresponding parameter changes with Δt. From an empirical point of view, analysing the process structure by means of the empirical autocorrelation of a finite sample will not allow detecting this pattern if Δt is too large (since empirical autocorrelations will be too small to be distinguished from noise). A reasonable choice with respect to the use of time series models within the simulation environment needs to be defined. Based on numerous experiments with samples from multiple chemical plants, an interval length of 30 minutes up to several hours has proven appropriate in most cases.
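The dependence of the AR(1) parameter on the recording interval can be illustrated numerically. The sketch below evaluates φ = exp(−k ⋅ Δt) for a hypothetical rate k and compares it with the approximate bound 2/√T that is commonly used to judge sample autocorrelations; the value of k, the units (hours) and the sample size are assumptions chosen for illustration only.

```python
import math


def ar1_parameter(k: float, dt: float) -> float:
    """AR(1) parameter phi = exp(-k * dt) implied by (2.61)."""
    return math.exp(-k * dt)


def lag1_detectable(k: float, dt: float, T: int) -> bool:
    """Compare phi with the approximate critical value 2 / sqrt(T)."""
    return ar1_parameter(k, dt) > 2.0 / math.sqrt(T)


k = 0.5      # hypothetical mean-reversion rate per hour
T = 1000     # hypothetical number of recorded observations

for dt in (0.5, 2.0, 12.0):  # recording intervals in hours
    print(f"dt = {dt:5.1f} h  phi = {ar1_parameter(k, dt):.4f}  "
          f"detectable: {lag1_detectable(k, dt, T)}")
```

With these assumed values a half-hourly recording interval yields a clearly visible lag-1 autocorrelation, whereas a twelve-hour interval pushes φ below the noise bound.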
In principle, the measured flow rates may vary either due to fluctuations of the underly-
ing production process or due to measurement errors. For the analysis of the production
system, one is interested in the former variation only. However, the latter cannot be
excluded since there is no infinitely precise and reliable sensor. Flow measurements can be affected by turbulence or contamination. Hence, it has to be assumed implicitly that the sensors’ measurements of the corresponding flow rates are unbiased and that the measurement error does not dominate the error process of the production process. Since these sensors are also used for the automatic control systems, continuous efforts are expended on checking their accuracy. Hence, it can be assumed that the sensors work properly in the sense that they are unbiased and have measurement errors that are as small as (technically) possible.107
Once the data of a chemical production plant is collected, the basic type of model is
specified, i.e. SISO, SIMO, MISO or MIMO. When deciding on the basic model type the
number of relevant measures has to be determined. A lot of variables may affect the
performance of a chemical production plant (e.g. product flows, atmospheric conditions,
energy flows). Among these, the relevant variables need to be extracted. Relevance refers to the use of the time series models within the simulation environment and to the prerequisites for building an appropriate model of the production process. For the final simulation model,
main chemicals (raw, intermediate, and final chemicals) of the studied production system
are fixed parts of the time series models. From the remaining variables (such as energy
flows or auxiliary chemical flows), variables are included which yield a relevant improve-
ment of the accuracy of the final time series model. If a variable cannot improve the final
model’s accuracy, it should be dropped from the analysis to avoid over-specification.108
To decide on a model’s accuracy, various criteria are available quantifying the model’s goodness of fit. These information criteria (IC) are based on the model’s likelihood, typically assuming a Gaussian error process.109 Most information criteria can

107
However, there is typically no possibility to check these assumptions explicitly.
108
In case of over-specification (synonymously over-fitting) additional variables or parameters are incor-
porated in the model which show a spurious correlation with the error terms. This may lead to less
efficiently estimated parameters and poor prediction accuracy, see Cryer and Chan (2008, pp. 185-188)
or Schlittgen and Streitberg (2004, pp. 347-349).
109
See e.g. Schlittgen and Streitberg (2004, pp. 332-345).

be expressed as

$$IC = -2 \cdot \ln(L_T(K)) + \frac{K}{T} \cdot \kappa(T) \qquad (2.62)$$

where K denotes the number of parameters fitted in the model, κ(T) is a function of the number of observations T (this function distinguishes the various criteria), and $L_T(K)$ denotes the model’s likelihood. The number K of parameters to be estimated depends on both the order of the model and the number of exogenous variables. For example, assuming a VARX(p) of dimension d with non-zero mean and k = 1 exogenous variable (which has a contemporaneous effect only), the number of parameters to be estimated is $K = p \cdot d^2 + d + d$, i.e. $p \cdot d^2$ autoregressive coefficients, d mean parameters and d coefficients of the exogenous variable.
From (2.62) it can be taken that the model’s goodness of fit and the number of parame-
ters used are counterbalanced. Among a set of model specifications, the specification with
minimal IC value is recommended. In other words, information criteria aim at minimiz-
ing the residuals’ variance with as few parameters as possible. Often used information
criteria for time series models are the Akaike information criterion (AIC),110 the Schwarz
information criterion (SIC)111 or the Hannan-Quinn information criterion (HQIC)112 with
the following κ functions:




$$\kappa(T) = \begin{cases} 2 & \text{for AIC} \\ \ln(T) & \text{for SIC} \\ \ln(\ln(T)) & \text{for HQIC} \end{cases} \qquad (2.63)$$

Among these IC, the AIC has a non-zero asymptotic probability for over-fitting.113 To avoid an over-fitting of the model order, a corrected AIC (AICc) can be calculated by $AIC_c = AIC + \frac{2 \cdot (K+1) \cdot (K+2)}{T - K - 2}$.114
The SIC is deduced from Bayesian arguments. It consistently estimates the true order
of ARMA(p, q) processes and is probably the most widely used information criterion in
univariate time series analysis.115 The HQIC is the most recent IC and especially designed
for multivariate time series models.116 In practice, multiple ICs are simultaneously cal-
culated which allows the analyst to cross-check the recommendations of the various ICs.
Strongly deviating recommendations may indicate an inappropriate model structure.
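The criteria can be written down compactly in code. The following sketch implements (2.62), (2.63), the AICc correction and the parameter count of the VARX example exactly as stated in the text; the function names are illustrative, and the likelihood term −2 ⋅ ln(L_T(K)) is passed in as a number because its computation depends on the fitted model.

```python
import math


def kappa(criterion: str, T: int) -> float:
    """Penalty function kappa(T) as given in (2.63)."""
    if criterion == "AIC":
        return 2.0
    if criterion == "SIC":
        return math.log(T)
    if criterion == "HQIC":
        return math.log(math.log(T))
    raise ValueError(f"unknown criterion: {criterion}")


def information_criterion(neg2_loglik: float, K: int, T: int, criterion: str) -> float:
    """IC = -2 ln(L_T(K)) + K / T * kappa(T), see (2.62)."""
    return neg2_loglik + K / T * kappa(criterion, T)


def corrected_aic(neg2_loglik: float, K: int, T: int) -> float:
    """AICc = AIC + 2 (K + 1) (K + 2) / (T - K - 2)."""
    aic = information_criterion(neg2_loglik, K, T, "AIC")
    return aic + 2.0 * (K + 1) * (K + 2) / (T - K - 2)


def varx_parameter_count(p: int, d: int) -> int:
    """K = p * d^2 + d + d for a VARX(p) with mean and one exogenous variable."""
    return p * d ** 2 + d + d


# Hypothetical likelihood term for a model with K = 3 parameters and T = 100
for name in ("AIC", "SIC", "HQIC"):
    print(name, information_criterion(4.0, K=3, T=100, criterion=name))
print("AICc", corrected_aic(4.0, K=3, T=100))
```

Among candidate specifications fitted to the same sample, the one with the smallest value of the chosen criterion would be preferred.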
One source of mis-specification is the type of the error process. Homogeneity of vari-
ances is typically assumed by standard time series models. However, in the context of
chemical production processes e.g. an adjustment of a plant’s production rate may lead
to an imbalance of the underlying chemical reaction(s). This instability may materialize
in fluctuations of the output flow rate(s) which may also affect flow rates in subsequent
110
See Akaike (1974).
111
This criterion is often called Bayesian information criterion (BIC), see Schwarz (1978).
112
See Hannan and Quinn (1979).
113
In other words it is biased, see Cryer and Chan (2008, p. 131).
114
See Hurvich and Tsai (1989).
115
See Cryer and Chan (2008, p. 131).
116
See Quinn (1980).

periods. Technically, such a "bull-whip effect" can be modelled by introducing a specific


model for the error component of the model, e.g. a GARCH model. The presence of
time-varying variances can be uncovered by checking the squared and absolute residuals
of an already fitted time series model for auto-correlation.
To decide on the order of time series models as well as to check residuals for the white
noise assumptions, auto-correlations of time series and residuals need to be calculated and
analysed. Standard metrics to analyse for time-dependent correlation structures are the
autocorrelation function (ACF), partial ACF (PACF) and extended ACF (EACF). The
ACF estimates the empirical auto-correlations between lagged observations

$$\operatorname{Corr}(y_t, y_{t-k}) = \frac{\operatorname{Cov}(y_t, y_{t-k})}{\operatorname{Var}(y_t)} = \rho_k \qquad (2.64)$$

of an empirical time series by the standard sample correlation formula.117 These values
are usually displayed in a correlogram. If the auto-correlation for a specific lag exceeds
some critical value,118 a significant auto-correlation is to be assumed. Hence, the ACF
allows detecting the order of pure MA processes which can be easily seen from the formal
definition of MA processes in (2.36). I.e. the ACF of an MA(q) process theoretically shows
significant values for the first q auto-correlations.119
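The cut-off behaviour of the ACF for MA processes can be reproduced with a short simulation. The sketch below computes the sample ACF with the standard sample correlation formula and flags lags exceeding the approximate bound 2/√T; the MA(1) parameter, sample size and seed are assumptions for illustration.

```python
import numpy as np


def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 1, ..., max_lag."""
    y = np.asarray(y, dtype=float)
    y_c = y - y.mean()
    denom = np.dot(y_c, y_c)
    return np.array([
        np.dot(y_c[k:], y_c[:len(y_c) - k]) / denom
        for k in range(1, max_lag + 1)
    ])


rng = np.random.default_rng(seed=3)
T = 2000
theta = 0.4  # hypothetical MA(1) parameter

# MA(1) sample: y_t = eps_t + theta * eps_{t-1}
eps = rng.normal(size=T + 1)
y = eps[1:] + theta * eps[:-1]

acf = sample_acf(y, max_lag=10)
bound = 2.0 / np.sqrt(T)
significant_lags = [k + 1 for k, r in enumerate(acf) if abs(r) > bound]

print("theoretical rho_1 :", theta / (1.0 + theta ** 2))
print("sample ACF        :", np.round(acf, 3))
print("lags above bound  :", significant_lags)
```

Since the bound corresponds roughly to a 95% limit, an occasional spurious exceedance at higher lags is to be expected even for a correctly specified MA(1) process.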
To detect AR-like correlations, the PACF corrects the ACF values for the effect that is caused by the auto-correlations at intermediate lags120

$$\gamma_k = \operatorname{Corr}(y_t, y_{t-k} \mid y_{t-1}, ..., y_{t-k+1}). \qquad (2.65)$$

To estimate this conditional correlation measure, a backward recursion (backward forecast) is applied, the so-called Durbin-Levinson algorithm.121 Again, if the PACF value for a specific lag exceeds a critical bound,122 a significant AR-like correlation is to be presumed.
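The recursion behind the PACF can be sketched as follows. The function below is a textbook form of the Durbin-Levinson recursion that maps autocorrelations ρ1, ..., ρK to partial autocorrelations; it is applied to the theoretical ACF of a hypothetical AR(2) process with φ = (0.7, −0.2), for which the PACF should cut off after lag 2. Names and parameter values are illustrative assumptions.

```python
import numpy as np


def pacf_durbin_levinson(rho):
    """Partial autocorrelations for lags 1..K from autocorrelations rho[0..K-1].

    rho[k - 1] is the autocorrelation at lag k.
    """
    rho = np.asarray(rho, dtype=float)
    K = len(rho)
    pacf = np.zeros(K)
    pacf[0] = rho[0]
    phi_prev = np.array([rho[0]])          # coefficients phi_{k-1, 1..k-1}
    for k in range(2, K + 1):
        numerator = rho[k - 1] - np.dot(phi_prev, rho[k - 2::-1])
        denominator = 1.0 - np.dot(phi_prev, rho[:k - 1])
        phi_kk = numerator / denominator
        phi_new = np.empty(k)
        phi_new[:k - 1] = phi_prev - phi_kk * phi_prev[::-1]
        phi_new[k - 1] = phi_kk
        phi_prev = phi_new
        pacf[k - 1] = phi_kk
    return pacf


# Theoretical ACF of an AR(2) process via the Yule-Walker recursion
phi1, phi2 = 0.7, -0.2
rho = [phi1 / (1.0 - phi2)]
rho.append(phi1 * rho[0] + phi2)
for _ in range(4):
    rho.append(phi1 * rho[-1] + phi2 * rho[-2])

pacf = pacf_durbin_levinson(rho)
print("PACF:", np.round(pacf, 6))
```

Applied to sample autocorrelations instead of theoretical ones, the same recursion yields the sample PACF that is compared against the critical bound.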
Unfortunately, neither PACF nor ACF lead to directly interpretable results for ARMA
processes. The extended ACF tries to overcome this drawback by jointly providing infor-
mation about the order of both components. For each AR order tested, the EACF first
determines estimates of the AR coefficients by a sequence of regression models. After-
wards, the residuals’ ACF is calculated. The results are presented in a table indicating
significant or non-significant auto-correlations (typically denoted by an x and o, respec-
tively). In such a table, the rows represent the AR order p whereas columns represent
117
E.g. see Cryer and Chan (2008, pp. 109-112).
118
For details about this bound, see e.g. Cryer and Chan (2008, p. 112).
119
Contrarily, pure AR processes show an exponentially decaying ACF. For a more precise description see
e.g. Cryer and Chan (2008, pp. 109-112).
120
For details see e.g. Schlittgen and Streitberg (2004, pp. 194-201), Cryer and Chan (2008, pp. 112-115)
or Brockwell and Davis (2002, pp. 94-96).
121
For details see Schlittgen and Streitberg (2004, pp. 196-198).
122
This bound is deduced by assuming a Gaussian AR(p) process. Then, sample auto-correlations are approximately normal and the critical value is $2/\sqrt{T}$, see Cryer and Chan (2008, p. 115).

the MA order q. To determine a time series’ ARMA order, the coordinates of the upper
left vertex of a triangle of os correspond to the order (p, q) of the underlying ARMA
process.123
ACF, PACF and EACF are applied to time series data to initially estimate the type and
order of the underlying time series model. Once an initial model is fitted, ACF, PACF and
EACF are applied to the corresponding residuals to check the white noise assumptions. If
significant auto-correlations are found, the model type or order is adjusted accordingly and
the procedure is repeated until the residuals do not show any serious auto-correlations.
A normality test such as Jarque-Bera’s test or graphical tools such as QQ-plots are also
part of the standard procedure of residual checks. Although normality is often assumed to
derive arguments for theoretical deductions, this is not a necessary condition for consistent
estimates.124 Nonetheless, if normality can be assumed, an identical distribution of the
residuals and variance homogeneity follows immediately. However, if normality cannot be
confirmed, the weaker concept of identical variances can be checked by calculating ACF,
PACF and EACF for the squared and absolute residuals of a model.125 If noticeable
correlations occur, heterogeneous variances must be presumed and the time series model
should be extended by a sub-model for the variances (i.e. an ARCH or GARCH model).
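The check for heterogeneous variances can be illustrated with simulated residuals. In the sketch below, hypothetical residuals follow an ARCH(1) process, so they are serially uncorrelated while their squares are not; the parameters, the seed and the helper `sample_acf` are assumptions for illustration, and the bound 2/√T is only a rough guide for squared series.

```python
import numpy as np


def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 1, ..., max_lag."""
    y = np.asarray(y, dtype=float)
    y_c = y - y.mean()
    denom = np.dot(y_c, y_c)
    return np.array([
        np.dot(y_c[k:], y_c[:len(y_c) - k]) / denom
        for k in range(1, max_lag + 1)
    ])


rng = np.random.default_rng(seed=7)
T = 5000
omega, alpha = 1.0, 0.3   # hypothetical ARCH(1) parameters

# Residuals with conditional variance sigma_t^2 = omega + alpha * resid_{t-1}^2
resid = np.zeros(T)
sigma2 = omega / (1.0 - alpha)
for t in range(T):
    if t > 0:
        sigma2 = omega + alpha * resid[t - 1] ** 2
    resid[t] = rng.normal(0.0, np.sqrt(sigma2))

bound = 2.0 / np.sqrt(T)
acf_raw = sample_acf(resid, 5)
acf_squared = sample_acf(resid ** 2, 5)
acf_absolute = sample_acf(np.abs(resid), 5)

print("bound            :", round(bound, 4))
print("ACF of residuals :", np.round(acf_raw, 3))
print("ACF of squares   :", np.round(acf_squared, 3))
print("ACF of abs values:", np.round(acf_absolute, 3))
```

A clear lag-1 correlation in the squared (or absolute) residuals despite uncorrelated raw residuals is the pattern that would suggest adding an ARCH or GARCH sub-model.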
The following examples illustrate how the above mentioned methods can be used to
find appropriate time series models for empirical samples. The first example shows the
application of the methods by means of a simulated univariate time series. This example
emphasizes the effect of the length of time series and gives an impression how to interpret
the described graphical tools. The second example describes the analysis of a sample of
the input flow rate of a cracker plant. This example focuses on practical problems (such as
outliers) in analysing even simple univariate time series of chemical production processes.
The third example shows how the transformation process of a chemical production plant
can be modelled using multivariate time series methodology based on a real-world sample
from a de-alkylation plant. Again practical problems such as the determination of the
model order and multivariate outliers are discussed.

123
For details about the deduction of the EACF test statistic, see Tsay and Tiao (1984) and for a practical
overview see e.g. Cryer and Chan (2008, pp. 116-118) or Tsay (2002, pp. 51-53).
124
Consistent estimates can be obtained for independent and identically distributed residuals with quasi-
maximum likelihood estimators and Gaussian MLE, see Yao and Brockwell (2006).
125
See Cryer and Chan (2008, pp. 277-280).

Example 1 (Univariate time series analysis). Consider two samples of an ARMA(2,1) model with sizes T = 100 and T = 1000 (assuming Gaussian errors ∼ N(0, 4) and zero mean).126 The parameters are φ = (0.7, −0.2) for the AR part and θ = (0.4) for the MA part. The empirical ACF and PACF are calculated for both samples next to the model’s theoretical ACF and PACF. The corresponding correlograms are depicted in Figures 2.10a - 2.10f.
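The set-up of this example can be reproduced approximately in code. The original samples were generated with R's arima.sim; the following Python sketch is a stand-in with its own random numbers, so it will not reproduce the exact samples or figures. The burn-in length, the seed and the truncation of the ψ-weights used for the theoretical ACF are assumptions.

```python
import numpy as np

phi = (0.7, -0.2)   # AR parameters of the example
theta = 0.4         # MA parameter of the example
sigma = 2.0         # standard deviation of the errors (variance 4)


def simulate_arma21(phi, theta, sigma, T, rng, burn_in=200):
    """Simulate a zero-mean ARMA(2,1) sample; a burn-in removes start-up effects."""
    n = T + burn_in
    eps = rng.normal(0.0, sigma, size=n)
    y = np.zeros(n)
    for t in range(2, n):
        y[t] = phi[0] * y[t - 1] + phi[1] * y[t - 2] + theta * eps[t - 1] + eps[t]
    return y[burn_in:]


def theoretical_acf_arma21(phi, theta, max_lag, n_weights=500):
    """Theoretical ACF (lags 0..max_lag) from truncated MA(infinity) weights."""
    psi = np.zeros(n_weights)
    psi[0] = 1.0
    psi[1] = phi[0] + theta
    for j in range(2, n_weights):
        psi[j] = phi[0] * psi[j - 1] + phi[1] * psi[j - 2]
    gamma = np.array([
        np.dot(psi[:n_weights - k], psi[k:]) for k in range(max_lag + 1)
    ])
    return gamma / gamma[0]


rng = np.random.default_rng(seed=2015)
y_small = simulate_arma21(phi, theta, sigma, T=100, rng=rng)
y_large = simulate_arma21(phi, theta, sigma, T=1000, rng=rng)
rho = theoretical_acf_arma21(phi, theta, max_lag=20)

print("theoretical ACF, lags 1-5:", np.round(rho[1:6], 3))
```

The empirical ACF and PACF of `y_small` and `y_large` can then be compared with the theoretical values to see how the sample size affects the estimates.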
[Figure 2.10: Theoretical and empirical ACF and PACF for an ARMA(2,1) process. Panels: (a) ACF for T = 100, (b) ACF for T = 1000, (c) Theoretical ACF, (d) PACF for T = 100, (e) PACF for T = 1000, (f) Theoretical PACF. Horizontal axes: Lag; vertical axes: ACF and Partial ACF, respectively.]

The number of observations available for estimation considerably influences the em-
pirical ACF and PACF. For the small sample with T = 100 the ACF correlogram (Fig-
ure 2.10a) does not show any similarity to the corresponding theoretical ACF correlogram
(Figure 2.10c). None of the ACF correlograms shows an easily interpretable pattern. In
contrast, for all PACFs (Figures 2.10d - 2.10f) the first two lags show the highest cor-
relation pointing to a second order AR component. Tables 2.3 - 2.5 show the results of
the sample EACFs and the theoretical EACF for this example ("x" denotes a significant
correlation and "o" denotes a non-significant one).
For the small sample, the EACF values (Table 2.3) show an ambiguous pattern. An
ARMA(1,3) or ARMA(3,1) model would probably be chosen as the initial specification. In contrast,
the large sample (Table 2.4) shows a good fit to the theoretical EACF (Table 2.5) and an
ARMA(2,1) model seems an appropriate initial choice. Since the analysis of ACF, PACF
and EACF does not reveal an unambiguous favourite model for both samples, all possible
order specifications with p ∈ {0, ..., 4} and q ∈ {0, ..., 4} are estimated for both samples.127
The corresponding AICc, SIC, and HQIC values are displayed in Table 2.6 where the
minimum for each criterion and time series is set in bold font and the true model order is
coloured in grey.128
126
The samples are simulated by means of the arima.sim function with R 2.15.1.
127
For all models no intercept is estimated.
128
Only a subset of all tested orders is displayed in the table for the sake of brevity (q ≤ 1).

         MA lag
AR lag   0 1 2 3 4 5 6
0        x x o x x x o
1        x x o o o x o
2        o x o x o o o
3        x o o o o o o
4        x x o o x o o
5        x x x o o o o
6        x x o x o o o

Table 2.3: EACF for T = 100

         MA lag
AR lag   0 1 2 3 4 5 6
0        x x o x x o o
1        x x x x o o o
2        x o o o o o o
3        x x o o o o o
4        x x x o o o o
5        x o o x o o o
6        x o o x o o o

Table 2.4: EACF for T = 1000

         MA lag
AR lag   0 1 2 3 4 5 6
0        x x x x x x x
1        x x x x x x x
2        x o o o o o o
3        x x o o o o o
4        x x x o o o o
5        x x x x o o o
6        x x x x x o o

Table 2.5: theoretical EACF

                      T = 100                 T = 1000
AR order  MA order    AICc   SIC    HQIC      AICc   SIC    HQIC
1         0           4.273  4.175  4.160     4.486  4.479  4.476
2         0           4.141  3.943  3.912     4.272  4.258  4.252
3         0           4.330  3.987  3.941     4.272  4.246  4.237
4         0           4.556  4.022  3.960     4.291  4.250  4.238
0         1           4.258  4.160  4.145     4.463  4.456  4.453
1         1           4.197  3.999  3.968     4.277  4.262  4.256
2         1           4.330  3.987  3.940     4.269  4.244  4.234
3         1           4.566  4.032  3.970     4.291  4.250  4.238
4         1           4.835  4.062  3.984     4.317  4.257  4.242

Table 2.6: Information criteria for both sampled time series and various model specifications

For both samples, all three criteria attain their minimum at a unique model order. In
the small sample case, all criteria suggest an ARMA(2,0) model, whereas in the large sample
case an ARMA(2,1) model is suggested (which is the correct order). Fitting these models to both
time series and analysing the corresponding residuals results in diagnostic plots without
remarkable features, as Figures 2.11a - 2.11f show.
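
The order selection can be retraced from the values listed in Table 2.6. The following sketch copies the tabulated criteria (only the displayed subset with q ≤ 1) and picks, for each criterion, the order with the smallest value; the variable and function names are illustrative only.

```python
CRITERIA = ("AICc", "SIC", "HQIC")

# (AR order, MA order) -> (AICc, SIC, HQIC), values copied from Table 2.6
TABLE_T100 = {
    (1, 0): (4.273, 4.175, 4.160), (2, 0): (4.141, 3.943, 3.912),
    (3, 0): (4.330, 3.987, 3.941), (4, 0): (4.556, 4.022, 3.960),
    (0, 1): (4.258, 4.160, 4.145), (1, 1): (4.197, 3.999, 3.968),
    (2, 1): (4.330, 3.987, 3.940), (3, 1): (4.566, 4.032, 3.970),
    (4, 1): (4.835, 4.062, 3.984),
}
TABLE_T1000 = {
    (1, 0): (4.486, 4.479, 4.476), (2, 0): (4.272, 4.258, 4.252),
    (3, 0): (4.272, 4.246, 4.237), (4, 0): (4.291, 4.250, 4.238),
    (0, 1): (4.463, 4.456, 4.453), (1, 1): (4.277, 4.262, 4.256),
    (2, 1): (4.269, 4.244, 4.234), (3, 1): (4.291, 4.250, 4.238),
    (4, 1): (4.317, 4.257, 4.242),
}

def best_orders(table):
    """Return, for each criterion, the (p, q) order with the minimal value."""
    return {name: min(table, key=lambda order: table[order][i])
            for i, name in enumerate(CRITERIA)}

print("T = 100: ", best_orders(TABLE_T100))
print("T = 1000:", best_orders(TABLE_T1000))
```

Within the displayed subset, the minima coincide with the orders named in the text: (2, 0) for the small sample and (2, 1) for the large sample.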
For the residuals’ ACFs, only spurious significant auto-correlations occur at lag 9 for
the small sample and at lag 30 for the large sample. Hence, it can be concluded that no auto-
correlation remains and the model successfully captures the time-dependency pattern.129
For the squared residuals’ ACFs, marginally significant auto-correlations occur at lag 11 for
the small sample and at lags 6, 17, and 26 for the large sample. Since these are few and only
marginally significant, identical variances of the residuals can also be assumed.130 In summary,
identically and independently distributed residuals can be assumed for both models. The QQ-
plots (Figure 2.11c and Figure 2.11f) show a good fit to the normal distribution for
both models, which supports this assumption and indicates adequate models.
Note that for the small sample the true model order

129
Since the ACFs show no anomalies, the PACFs need not be analysed.
130
The ACFs of the absolute residuals show no serious anomalies either.

is not identified even though the residual analysis gives no indication of an inaccurate
model specification.
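
Why isolated significant lags can be regarded as spurious is illustrated by the following sketch. It assumes the common approximate 5% band ±1.96/√T for a single sample autocorrelation of white noise (the band actually used for the correlograms is not stated here) and counts how many exceedances are expected by chance alone when 20 or 30 lags are inspected.

```python
import math

def acf_bound(n_obs, z=1.96):
    """Approximate 95% bound for one sample autocorrelation of white noise."""
    return z / math.sqrt(n_obs)

def expected_spurious(n_lags, level=0.05):
    """Expected number of lags exceeding the bound purely by chance."""
    return n_lags * level

# 20 lags are inspected for T = 100 and 30 lags for T = 1000.
for n_obs, n_lags in ((100, 20), (1000, 30)):
    print(n_obs, round(acf_bound(n_obs), 3), expected_spurious(n_lags))
```

With one to two exceedances expected by chance, a single marginally significant lag per correlogram does not contradict the white noise assumption.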
[Figure 2.11: six diagnostic plots. Panels (a), (b), (d), (e): correlograms with lag on the horizontal axis and ACF on the vertical axis. Panels (c), (f): sample quantiles against theoretical quantiles.]
(a) ACF of residuals for T = 100 (b) ACF of squared residuals for T = 100 (c) QQ-plot of residuals for T = 100
(d) ACF of residuals for T = 1000 (e) ACF of squared residuals for T = 1000 (f) QQ-plot of residuals for T = 1000

Figure 2.11: ACF of residuals, ACF of squared residuals and QQ-plot of residuals for ARMA(2,0) model (small sample, T = 100) and for
ARMA(2,1) model (large sample, T = 1000)

Example 2 (Univariate time series analysis for chemical production data). An application
of univariate time series analysis is the modelling of univariate input flows of chemical
production processes (as observed for SISO or SIMO processes). To apply prewhitening
procedures, the auto-correlation structure of the (exogenous) time series of the input flow
has to be modelled. In this example, the input flow rate of a Naphtha cracker is analysed.131
Figure 2.12 shows the average hourly input flow rate for one week (T = 168).
[Figure 2.12: time series plot with time on the horizontal axis and input flow rate in t/hour on the vertical axis.]

Figure 2.12: Time series plot of Naphtha inflow rate

The time series shows a stationary pattern with a flow rate oscillating homogeneously
around a constant level. However, an additive outlier occurs at time index 128. To
analyse the time dependency pattern of the Naphtha time series, the effect of the outlier
has to be removed. Therefore, the outlier’s effect (i.e. the rise above the mean level)
is estimated by fitting a simple linear regression model and the corresponding value of
the time series is corrected by this estimate. For the corrected Naphtha time series, ACF
and PACF are calculated (Figure 2.13a and Figure 2.13b).
The ACF shows a decaying cyclic pattern whereas the PACF clearly indicates a signifi-
cant partial auto-correlation at lag 1. Therefore, an AR(1) model is chosen as the initial
model. To incorporate the outlier at index 128, an ARX(1) model is fitted to the original
Naphtha time series where a dummy variable xt accounts for the effect of the outlier132

$$y_t = \mu + \phi \cdot y_{t-1} + \beta \cdot x_t + \epsilon_t . \qquad (2.66)$$

The estimated parameters of the ARX(1) model are displayed in Table 2.7.
The corresponding residuals’ standard deviation is estimated as σ̂ = 0.002 and the
model’s AICc is calculated as -8.73. An analysis of the residuals gives no indication of a vio-
lation of the underlying assumption of independent and identically distributed residuals.133
131
Data is provided by Dow Chemical. The scale of the time series has been transformed for reasons
of privacy protection.
132
The dummy variable is binary, i.e. xt = 1 for t = 128 and zero otherwise.
133
Diagnostic plots can be found in the Appendix in Figure A.1.

[Figure 2.13: two correlograms with lag on the horizontal axis and ACF or partial ACF on the vertical axis.]
(a) ACF of corrected Naphtha time series
(b) PACF of corrected Naphtha time series

Figure 2.13: ACF and PACF of corrected Naphtha time series

The Jarque-Bera test for normality confirms that the hypothesis of normally distributed resid-
uals cannot be rejected (at a 5% level), i.e. it is assumed that $\hat{\epsilon} \sim N(0, 0.002^2)$. Since
the ARX(1) model (which has minimal order) already yields convincing results, no
ARMA models of higher order are estimated and (2.66) is taken as the best model for the
Naphtha time series.
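
For illustration, the univariate Jarque-Bera statistic can be computed as in the following sketch. It uses the textbook form JB = T/6 · (S² + (K − 3)²/4) with sample skewness S and sample kurtosis K and compares it with the upper quantile of the chi-square distribution with two degrees of freedom; the residuals are simulated with the estimated standard deviation 0.002 and are not the residuals of the Naphtha model.

```python
import math
import random

def jarque_bera(residuals):
    """Jarque-Bera statistic based on sample skewness and kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

def rejects_normality(residuals, level=0.05):
    """Compare JB with the chi-square(2) quantile, which is -2*ln(level)."""
    critical = -2.0 * math.log(level)
    return jarque_bera(residuals) > critical

# Simulated stand-in for the residuals (T = 168, sigma = 0.002).
rng = random.Random(1)
sim_residuals = [rng.gauss(0.0, 0.002) for _ in range(168)]
print(jarque_bera(sim_residuals), rejects_normality(sim_residuals))
```

The asymptotic chi-square approximation can be inaccurate for small samples, so the decision rule should be read as a rough check.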

parameter μ̂ φ̂ β̂
estimate 41.6 0.79 0.0564
standard error 0.001 0.046 0.002

Table 2.7: Estimated parameters for the Naphtha time series
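
A minimal sketch of fitting a model of the form (2.66) is given below. It is not the estimation procedure used for Table 2.7: the data are synthetic, the parameter values are hypothetical, and ordinary least squares on the lagged series and the pulse dummy is assumed as the estimator.

```python
import random

def simulate_arx1(n, mu, phi, beta, outlier_index, sigma, seed=0):
    """Generate y_t = mu + phi*y_{t-1} + beta*x_t + e_t with a pulse dummy x_t."""
    rng = random.Random(seed)
    y = [mu / (1.0 - phi)]  # start at the stationary mean
    for t in range(1, n):
        x_t = 1.0 if t == outlier_index else 0.0
        y.append(mu + phi * y[-1] + beta * x_t + rng.gauss(0.0, sigma))
    return y

def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_arx1(y, outlier_index):
    """OLS estimates (mu, phi, beta) from the normal equations."""
    rows = [[1.0, y[t - 1], 1.0 if t == outlier_index else 0.0]
            for t in range(1, len(y))]
    target = y[1:]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(3)]
    return solve_linear(xtx, xty)

# Hypothetical parameters on a convenient scale, outlier at index 128.
y = simulate_arx1(168, mu=2.0, phi=0.8, beta=5.0, outlier_index=128, sigma=1.0)
print(fit_arx1(y, outlier_index=128))
```

Because the dummy variable is non-zero at a single time index, its coefficient absorbs the complete deviation at that index, so the remaining parameters are estimated from the undisturbed observations.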



Example 3 (Modelling a MISO production process of a chemical plant). Aromatic hydro-
carbons are among the major chemicals resulting from the cracking of Naphtha. This class of
chemicals comprises Benzene, Toluene and Xylene. For further processing into final chemi-
cals, Benzene is the most important aromatic hydrocarbon. By de-alkylation, Toluene and
Xylene can be transformed into Benzene. The de-alkylation of Toluene and Xylene is a cat-
alytic exothermic reaction in which the alkyl group(s) of the Toluene and Xylene molecules
are removed by reaction with hydrogen, yielding Benzene and Methane. Methane can be converted
into hydrogen and carbon black, where the hydrogen can in turn be used as input for the de-
alkylation.134 The conversion rate of this transformation is about 60 - 90%.135 To operate
efficiently, unconverted Xylene and Toluene are separated from Benzene and cycled
back to the input mixture. The stream of input chemicals also contains longer-chained
hydrocarbons (C9) resulting from the previous cracking process and imperfect separation.
These hydrocarbons are also contained in the output flow and have to be separated from
Benzene. The flow sheet of the chemicals involved in this production process is depicted
in Figure 2.14.

[Figure 2.14: flowsheet with the labelled streams: inflow mixture (aromatics + C9), H2, Benzene, Methane, Toluene + Xylene, Tol. + Xyl. + C9, and C9.]

Figure 2.14: Flowsheet of the de-alkylation plant136

All chemicals entering or leaving this plant are recorded. Figure 2.15a shows the average
hourly inflow and outflow rates for two weeks (T = 336).137 Although hydrogen is
required for the relevant chemical reactions, the external inflow of
hydrogen has shown no individual effect on the outflow rate. Thus, the external hydrogen
supply has been dropped from the further analysis. The outputs are pure Benzene and the
134
See e.g. Ma and Trimm (1996).
135
See Ozokwelu (2006).
136
See e.g. Ozokwelu (2006).
137
This data is provided by Dow Chemical. The scales of all variables are transformed due to privacy
protection reasons.

stream of long-chained hydrocarbons (so-called C9 stream). A dependency between inflow
rate and outflow rates can be presumed. To examine this presumption, Figure 2.15b shows the
scatter plots of the Benzene outflow rate and the C9 outflow rate against the (contemporaneous)
inflow rate.

[Figure 2.15: (a) time series plots of input rate, Benzene rate and C9 rate over time; (b) scatter plots of Benzene rate and C9 rate against input rate.]
(a) Inflow and outflow rates for de-alkylation plant (b) Scatter plots for Benzene and C9 outflow rates
dependent on inflow rate

Figure 2.15: Raw data for a de-alkylation plant

From Figure 2.15b it can be observed that the output rate of Benzene depends linearly
on the input stream, while the C9 outflow rate rather shows a slightly quadratic
dependency pattern. Furthermore, from the general nature of chemical production plants,
an auto-correlation structure of both output components can be presumed.138 Let Y be
the matrix of both output flow rates with dimension 336 × 2, and let $y_t \in \mathbb{R}^2$ be the vector of both
output rates at time index t. The inflow rate x is a vector of length 336 and $x_t$ denotes
the inflow rate at time index t. To model the outflow rates $y_t$, a vector auto-regressive
model with exogenous variables (VARX) is proposed with inflow rate $x$ and squared inflow rate $x^2$ as exogenous
variables. Figure 2.15a clearly indicates a non-stationary time series of inflow rates which
is primarily characterized by multiple changes of its mean. These changes can be interpreted as
actions to control the plant’s performance. An analysis of the stationary parts of the
inflow time series shows no indication of auto-correlation structures. It is therefore assumed that x
follows a white noise process with varying means, and x is handled as an exogenous regressor.
138
A rough initial analysis of the residuals of a simple regression of both output rates on the input
rate confirms this presumption. The results are not shown here for the sake of brevity.

To select an appropriate order for the proposed VARX model, models of various orders are
fitted and compared based on standard information criteria.139 The information criteria
suggest a model order of p = 3 or p = 4.140
For reasons of parsimony a VARX(3) model is chosen, i.e.

$$y_t = \Phi_1 \cdot y_{t-1} + \Phi_2 \cdot y_{t-2} + \Phi_3 \cdot y_{t-3} + \upsilon_1 \cdot x_t + \upsilon_2 \cdot x_t^2 + \epsilon_t . \qquad (2.67)$$

The corresponding estimated parameter matrices can be found in Table 2.8.141

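$$\hat{\Phi}_1 = \begin{pmatrix} 0.12 & 1.67 \\ 0.04 & 0.82 \end{pmatrix}, \quad
\hat{\Phi}_2 = \begin{pmatrix} 0.03 & 0.78 \\ 0.01 & -0.21 \end{pmatrix}, \quad
\hat{\Phi}_3 = \begin{pmatrix} -0.20 & -2.91 \\ -0.02 & 0.15 \end{pmatrix}, \quad
\hat{\upsilon}_1 = \begin{pmatrix} 0.96 \\ -0.03 \end{pmatrix}, \quad
\hat{\upsilon}_2 = \begin{pmatrix} -0.00023 \\ -0.00004 \end{pmatrix}$$

Table 2.8: Coefficients of the initial VARX(3) model for de-alkylation plant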
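
How the coefficient matrices of Table 2.8 enter (2.67) can be seen from a one-step prediction. The sketch below uses the coefficients as printed (rounded), with the first component referring to Benzene and the second to C9; the lagged output rates and the inflow rate in the usage example are hypothetical values, and the function name is illustrative.

```python
# Coefficients as printed in Table 2.8 (rounded); rows: Benzene, C9.
PHI = [
    [[0.12, 1.67], [0.04, 0.82]],     # Phi_1
    [[0.03, 0.78], [0.01, -0.21]],    # Phi_2
    [[-0.20, -2.91], [-0.02, 0.15]],  # Phi_3
]
UPSILON_1 = [0.96, -0.03]             # effect of the inflow rate x_t
UPSILON_2 = [-0.00023, -0.00004]      # effect of the squared inflow rate

def predict_varx3(y_lags, x_t):
    """One-step prediction; y_lags = [y_{t-1}, y_{t-2}, y_{t-3}] as [Benzene, C9]."""
    pred = [UPSILON_1[i] * x_t + UPSILON_2[i] * x_t ** 2 for i in range(2)]
    for phi, y_lag in zip(PHI, y_lags):
        for i in range(2):
            pred[i] += phi[i][0] * y_lag[0] + phi[i][1] * y_lag[1]
    return pred

# Hypothetical state: constant outflow rates over the last three hours.
print(predict_varx3([[105.0, 2.5]] * 3, x_t=120.0))
```

Since the printed coefficients are rounded to two decimals and multiplied by rates of the order of 100, the numerical result is only indicative, in particular for the C9 component.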

The residuals’ covariance matrix is estimated as

$$\hat{\Sigma}^{initial} = \begin{pmatrix} 2.813 & 0.014 \\ 0.014 & 0.005 \end{pmatrix} .$$

Afterwards, the residuals are analysed for deviations from the underlying assumptions
by checking for remaining correlations and normality. Checking for remaining correla-
tions among the residuals is performed by analysing the auto-correlation for both residual
components and the cross-correlation between both components. These are shown in Fig-
ure 2.16a. Checking for normality is done by analysing the quantile-quantile plots (QQ-
plots) for both residual components (shown in Figure 2.16b) and a Jarque-Bera-Test for
normality.
Figure 2.16a shows no clearly significant auto- or cross-correlations. At
lag 16 of the ACF of the C9 residuals and at lags 0 and -6 of the residuals’ CCF, marginally
significant correlations can be found (at a 5% level). It can be conjectured that these
findings are spurious since it is hardly reasonable to assume a delay of 6 or 16 hours
before the outflow rates are affected.
The QQ-plots displayed in Figure 2.16b suggest a good fit to the normal distribution.
However, the multivariate Jarque-Bera test rejects the normality hypothesis (at a 5%
level). This result is probably caused by some fluctuations in the outflow rates occurring
at changes of the production level (e.g. see time intervals 40-70, 100-170 and 270-310).
These outliers might be caused either by process instabilities or by measurement errors. It
can be conjectured that a better fit to the normal distribution can be achieved by explicitly
139
Computations are performed using the function VARselect from the vars package with R 2.15.1. See
Pfaff (2008).
140
The detailed results can be found in Table A.1 in the appendix.
141
Computations are performed using the VAR function from the vars package with R 2.15.1, see Pfaff
(2008). The model’s ANOVA can be found in Table A.3 in the appendix.
[Figure 2.16: (a) correlograms of the Benzene residuals and the C9 residuals (autocorrelation against lag) and of the cross-correlation between Benzene and C9 residuals (lags -20 to 20); (b) sample quantiles against normal quantiles for the Benzene residuals and the C9 residuals.]
(a) Autocorrelation and cross-correlation for residuals of both outflow rates (b) QQ-plots of residuals’ marginal distributions for both outflow rates

Figure 2.16: Residual diagnostic plots for VARX(3) model of the de-alkylation plant.

[Figure 2.17: scatter plot of the C9 residuals (vertical axis) against the Benzene residuals (horizontal axis). Legend: ordinary residual, outlying residual. Labelled time indices of outlying residuals: 20, 51, 61, 64, 65, 109, 155, 156, 158, 188, 199, 202, 207, 249, 297, 302.]

Figure 2.17: Scatterplot of residuals for outlier corrected VARX(3) model of the de-
alkylation plant

modelling these outliers. Therefore, a multivariate outlier detection procedure is applied
based on a robust estimation of the residuals’ covariance matrix142

$$\hat{\Sigma}^{rob} = \begin{pmatrix} 2.255 & 0.008 \\ 0.008 & 0.005 \end{pmatrix} .$$

Based on the estimated robust covariance matrix, the Mahalanobis distances of all residu-
als are calculated. All residuals whose Mahalanobis distances exceed a critical value based
on the 97.5% quantile of the $\chi^2_2$-distribution are treated as outliers.143 For each identi-
fied outlier, a dummy variable is constructed which has entry 1 at the time index of the
identified outlier (and is zero otherwise).144 Figure 2.17 shows the pairs of residuals at
each time index. Grey-coloured, ◾-marked points refer to identified outliers, whose time indices
are superimposed.
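
The detection step can be sketched as follows. The sketch takes the robust covariance matrix reported above as given (the MCD estimation itself is not reproduced), assumes residuals centred at zero, and compares squared Mahalanobis distances with the 97.5% quantile of the chi-square distribution with two degrees of freedom, which has the closed form −2·ln(1 − p). The residual pairs in the usage example are hypothetical.

```python
import math

SIGMA_ROB = [[2.255, 0.008], [0.008, 0.005]]  # robust covariance from the text

def chi2_quantile_2df(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - p)

def squared_mahalanobis(res, cov):
    """Squared distance r' * cov^{-1} * r for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    u, v = res
    return (d * u * u - (b + c) * u * v + a * v * v) / det

def outlier_dummies(residuals, cov=SIGMA_ROB, p=0.975):
    """Return flagged time indices and one pulse dummy per flagged index."""
    crit = chi2_quantile_2df(p)
    flagged = [t for t, r in enumerate(residuals)
               if squared_mahalanobis(r, cov) > crit]
    dummies = {t: [1 if s == t else 0 for s in range(len(residuals))]
               for t in flagged}
    return flagged, dummies

# Hypothetical (Benzene, C9) residual pairs.
flagged, dummies = outlier_dummies([(0.0, 0.0), (6.0, 0.0), (1.0, 0.05), (0.0, 0.2)])
print(flagged)
```

Each dummy column can then be appended to the exogenous regressors of (2.67) before the model is re-fitted.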
Subsequently, (2.67) is extended by the dummy variables for the outliers. To provide the

142
The robust estimator is the minimum covariance determinant (MCD) estimator, computed with the covMcd
function with default settings from the robustbase package, see Rousseeuw and van Driessen (1999).
143
See e.g. Maronna et al. (2006) or Liebscher et al. (2012).
144
I.e. so-called innovative outliers are modelled. The procedure can be seen as a multivariate adaptation
of the univariate standard procedure for outlier detection as described e.g. in Cryer and Chan (2008,
pp. 257-260).

most parsimonious model, only parameters with significant effects are finally selected.145
The corresponding re-fitted coefficients of the selected model can be found in Table 2.9.146

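$$\hat{\Phi}_1 = \begin{pmatrix} -- & 2.51 \\ 0.04 & 0.85 \end{pmatrix}, \quad
\hat{\Phi}_2 = \begin{pmatrix} -- & -- \\ 0.01 & -0.25 \end{pmatrix}, \quad
\hat{\Phi}_3 = \begin{pmatrix} -0.15 & -2.67 \\ -0.02 & 0.18 \end{pmatrix}, \quad
\hat{\upsilon}_1 = \begin{pmatrix} 1.05 \\ -0.03 \end{pmatrix}, \quad
\hat{\upsilon}_2 = \begin{pmatrix} -0.00028 \\ -0.00003 \end{pmatrix}$$

Table 2.9: Coefficients of the VARX(3) model with outlier correction and variable selec-
tion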

While the estimates of the autocorrelation coefficients for the C9 time series (lower rows
in Φ̂1 to Φ̂3) change only slightly, the estimates of the autocorrelation coefficients for the
Benzene time series (upper rows in Φ̂1 to Φ̂3) are clearly affected since three parameters
are dropped from the model. The remaining coefficients are affected, too. In particular,
the lagged cross-correlations to the C9 time series change from 1.67 to 2.51 and from
-2.91 to -2.67 (right upper entries in Φ̂1 and Φ̂3). This confirms the serious effect of even
unobtrusive outliers in multivariate time series analysis. By incorporating the outliers’
effects, the model’s AIC decreases from -4.22 to -4.72. Similarly, SIC decreases from -4.05
to -4.17. The analyses of the residuals show a similar pattern as for the initial model and
give no serious indication of cross- or auto-correlation.147 Now, the multivariate Jarque-
Bera test does not reject the hypothesis of multivariate normally distributed residuals (at
a 5% level). The residuals’ empirical covariance matrix is finally estimated as

$$\hat{\Sigma}^{final} = \begin{pmatrix} 1.868 & 0.011 \\ 0.011 & 0.004 \end{pmatrix} .$$

Using this model, the outflow rates can be fitted quite well, as Figure 2.18 shows.

145
The estimated coefficients for the VARX(3) model with outlier compensation can be found in the
appendix, Table A.2.
146
The model’s ANOVA can be found in Table A.4 in the appendix.
147
The corresponding correlograms can be found in the appendix in Figure A.2.

[Figure 2.18: two time series plots over time, one for the Benzene rate and one for the C9 rate. Legend in both panels: real, fitted, outlier.]

Figure 2.18: Real and fitted time series of both outflow rates (outlier indices superimposed)

3 Distribution planning in chemical industry logistics
The preceding chapter deals with methods describing the behaviour of product flows in
chemical production plants in order to model the core components of chemical production networks
in detail. The next step in modelling product flows in chemical production networks
is to describe product flows between chemical production sites and plants. At a site,
intermediate chemicals are produced by some plants and consumed by other plants,
whereas raw chemicals are only consumed and final chemicals are only produced. To
buffer temporal imbalances of chemical flows, inventories are held at the sites. Figure 3.1
shows the schematic chemical production network with added inventory symbols and
highlighted chemical flows.

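[Figure 3.1: network scheme with suppliers S1 and S2, customers C1 and C2, production plants and storages, connected by intra-site and inter-site flows. Legend: storage, supplier, customer, production plant.]

Figure 3.1: Chemical SC scheme with highlighted inter-site transports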

This chapter deals with methods to plan the distribution of chemicals in chemical
production networks from a focal point of view. Specific planning problems occurring for
the transport modes pipeline, rail and ship are described.

T. Kirschstein, Integrated Supply Chain Planning in Chemical Industry, Produktion und Logistik,
DOI 10.1007/978-3-658-08433-2_3, © Springer Fachmedien Wiesbaden 2015

3.1 Characteristics of chemical industry logistics


Logistics in the basic chemical industry differs in many aspects from logistics in other industries.
This is mainly due to the special properties of the production processes and the handled
products. Intermediate and basic chemicals are typically fluids or gases and often (at
least) environmentally hazardous. Hence, transport and storage processes are strongly
focused on reliability and safety. Furthermore, large quantities of these chemicals have to
be shipped to a relatively small set of destinations. This leads to a tendency towards transport
consolidation. On the other hand, material distribution planning has to cope with high out-of-
stock costs. These are caused by high set-up costs for continuously operated plants and a
high risk of demand losses. This has led to high average stock levels in the chemical industry
(compared to other industries).1 This fact promises opportunities for cost reductions through
improved distribution planning and efficient inventory management.
To exploit these cost reduction potentials, the special requirements of chemical products
and production processes have to be considered in distribution planning. Due to the
vast quantities of raw and intermediate chemicals, high-capacity transport systems are
advantageous. High-capacity transport systems include pipelines, rail cars, and ships/
barges. For the distribution of final chemicals, road transports dominate because the set of
destinations is dispersed and transport quantities are often sufficiently small. This becomes
apparent in Figure 3.2 where the modal splits are depicted (in terms of the total quantity
of transported chemical products in Germany in 2009 categorized by transport mode).

[Figure 3.2: two pie charts of the modal split with the following shares.]

transport mode   (a) all companies   (b) chemical companies
road             70.2%               43.6%
rail             8.0%                23.4%
pipeline         8.4%                14.6%
short sea        5.8%                9.6%
open sea         7.7%                8.8%

(a) Modal split - all companies (total: 313 mill. tons) (b) Modal split - chemical companies (total: 107 mill. tons)

Figure 3.2: Modal split for chemical products in Germany in 2009 (based on total trans-
ported quantity in mill. tons)2

Figure 3.2a shows the modal split for all transports of chemical products in Germany
in 2009. This includes all transports along the complete supply chain, in particular final
chemical transports to customers as well as raw and intermediate chemical transports
1
See Shah (2005).
2
See VCI (2011).

between chemical companies. The modal split is fairly similar to the total modal split
over all products.3 Road transports dominate with a share of about 70% of the
total transported quantity. This picture changes if only transports performed by chemical
companies are considered (Figure 3.2b). Here, the share of road transports is considerably
smaller (~43%). Most transports are handled via high-capacity transport modes. For example, 70%
of the total amount of transported alkenes such as Ethylene and Propylene is shipped
via pipeline.4 This shows the special relevance of these transport modes for the chemical
industry. It can be presumed that this picture is even more biased in favour of high-capacity
transports in the basic chemical industry.
For all considered means of transport, the problem is to plan transport flows and local
stocks such that the total costs for transports and stock holding in the production network are
minimized. Depending on the means of transport, various technical restrictions have to be
considered. In the case of rail cars or ships, transport flows have to be accompanied by a
corresponding flow of transport carriers such that carrier flows are planned simultaneously.
Depending on the modelling of the carriers, the problem structure is similar either to
inventory-routing problems (IRP)5 or to capacitated network design problems (CNDP).
In the former case, routes for each carrier have to be determined whereas in the latter case,
carrier flows are modelled as dynamic arc capacities.6 The literature on IRPs considering
ship transports constitutes the sub-class of maritime IRPs which is briefly reviewed in
subsection 3.4.2. No approach is known addressing IRPs for rail transports.7 This is
probably because in the rail context transports are not performed by a single transport
carrier (such as a ship). Rail transports require a joint motion of locomotives and rail
cars. The routing of both types of equipment would complicate the problem considerably.
Therefore, in subsection 3.3.2 a hybrid CNDP-IRP model for rail transports in chemical
industry is formulated as a multi-layer multi-commodity network design model. Here, the
transport traction, offered by locomotives, is modelled as a binary decision incurring a
fixed charge whereby the transport capacity, offered by rail cars, is modelled by rail car
flows routed through the network.
In contrast to all other means of transport, pipeline transports do not require mobile
transport devices. Therefore, the planning of pipeline transports is not concerned with
the matching of transport flows and transport carrier flows. Nonetheless, other problems
come into scope such as the planning of pumping sequences when the pipeline is used by
multiple products.

3
See Statistisches Bundesamt (2012).
4
Further 20% are transported by ships, see Association of Petrochemicals Producers in Europe (2004).
5
For a recent and encompassing overview on IRPs see Andersson et al. (2010).
6
These capacities are typically associated with fixed charges, see e.g. Gendron et al. (1999) or SteadieSeifi
et al. (2014).
7
See Andersson et al. (2010).

3.2 Planning problems for pipeline operations


Some facts distinguish pipeline systems from all other modes of transport:

• Solid substances cannot be transported via a pipeline.

• Pipeline transports can be performed without a mobile transport device.

• Pipelines are typically privately owned infrastructure objects.8

The last fact implies that investment in and management of pipeline systems are com-
pletely organized by the participating companies and (almost) all costs are internalized.
Investments in pipeline infrastructure make sense under specific circumstances only:

• transport substance/s is/are liquefiable,

• regular transport needs,

• high transport quantities, and

• fixed sources and destinations.

These requirements are met, e.g., for the transport of raw and intermediate chemicals
within a chemical production site. For trans-regional transports, the most prominent exam-
ples of pipeline transports are crude oil and natural gas pipelines. However, some
basic chemicals such as Ethylene or Ammonia are also transported via pipelines over long
distances to connect partners in a chemical supply chain.9

3.2.1 Technical and organizational prerequisites


Pipelines are used for transports of fluids or gases in a tube over long distances. The
tube is segmented by pumping stations into more or less regular parts. In contrast to all
other modes of transport, the transport infrastructure does not need a (mobile) carrier to
operate the transports. Operating costs for pipeline transports are comparatively small
and are mainly driven by maintenance and energy costs. In combination with the high
transport capacity, this leads to small transport cost rates measured on a ton-kilometre
basis.10 This cost advantage comes at the expense of spatial and organisational inflexi-
bility. Pipelines can only be used for liquefiable substances and are designed for a given
set of products. Because most substances to be transported have certain chemical prop-
erties, pipelines are built with regard to these properties, i.e. to minimize deterioration
8
Nonetheless, there exist several projects where the public sector supports pipeline investments in coop-
eration with private companies. See e.g. Association of Petrochemicals Producers in Europe (2004).
9
E.g. see Association of Petrochemicals Producers in Europe (2004) for details about the European
network of Ethylene and Propylene pipelines.
10
E.g. the energy consumption rate for pipeline transports is 0.14 mega-joules [MJ]/t-km compared with
0.35 MJ/t-km and 0.45 MJ/t-km for ship and rail transports. See van Essen et al. (2003).

and corrosion. Using the pipeline for other chemicals is usually not possible or associated
with high preparation and/or repair costs.
To categorize pipeline systems from an organizational point of view, two basic criteria
can be used: the number of products to be transported and the number of access points.
Commonly, pipelines are designed for one product only, e.g. in the case of crude oil trans-
ports. This allows optimizing the pipeline’s technological configuration with respect to
energy consumption. However, in the chemical industry multi-product pipelines are also
found when the chemicals to be transported are chemically similar.11
The structure of the pipeline network is another aspect to be considered. Since the
product flow in pipelines is unidirectional in the vast majority of cases,12 the network structure
can be categorized according to the number of sources and sinks. In the simplest case,
there is a single source and a single sink (one-to-one network). Similarly, if there are
multiple sinks to be supplied, this is labelled as a one-to-many network. Conversely, there
exist many-to-one networks and, finally, for multiple sources and sinks, many-to-many
networks. Table 3.1 displays this categorization scheme.

             system structure
# products   one-to-one        one-to-many        many-to-one           many-to-many
single       continuous flow   continuous split   continuous junction   cont. network flow
multiple     batch flow        batch split        batch junction        batch network flow

Table 3.1: Categorization of pipeline types
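
The categorization scheme of Table 3.1 can be expressed as a small lookup. The function name and its arguments are illustrative only; the labels are those of the table.

```python
def pipeline_type(n_sources, n_sinks, n_products):
    """Classify a pipeline system according to Table 3.1."""
    if min(n_sources, n_sinks, n_products) < 1:
        raise ValueError("at least one source, one sink and one product required")
    structure = (("one" if n_sources == 1 else "many") + "-to-" +
                 ("one" if n_sinks == 1 else "many"))
    mode = "continuous" if n_products == 1 else "batch"
    pattern = {
        "one-to-one": "flow",
        "one-to-many": "split",
        "many-to-one": "junction",
        "many-to-many": "network flow",
    }[structure]
    return structure, mode + " " + pattern

print(pipeline_type(n_sources=1, n_sinks=3, n_products=1))
print(pipeline_type(n_sources=2, n_sinks=2, n_products=4))
```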

Of special relevance among these types are the one-to-many and many-to-many net-
works as these pose the most challenging planning problems. In the single product case,
the main problem is to keep the flow in balance along the pipeline or pipeline segments
such that inflow and outflow are equal at any time.
A more complicated problem arises in the multi-product case where batches of products
have to be routed through the pipeline (network), whereby a batch may also be split into
sub-batches at pipeline forks. In the case of multiple sources, a coordinated
pumping plan is additionally required which takes into account the pumping capacity, the due dates
of the batches, and the segment-wise flow balances.
Pipelines are typically operated with a unidirectional material flow. In typical set-
tings the associated partners along the pipeline can be categorized distinctly into (pure)
11
A necessary condition is that chemicals to be transported do not react with each other. See e.g. Asso-
ciation of Petrochemicals Producers in Europe (2004).
12
However, there exist exceptions as described in Magatão et al. (2004).

providers/sources and (pure) consumers/sinks. Here, the providers are located at the
head of the pipeline and the consumers are located at its end. This is called a seri-
ally operated pipeline.
In contrast, if the partners along the pipeline can act as providers and consumers, each
partner has to be able to receive from and feed into the pipeline. These situations are found
in local pipeline systems, e.g. at large-scale integrated chemical production sites, where
multiple plants and storage facilities are connected to a single-product pipeline. Here,
the storage facilities buffer the total pipeline flow and serve as receivers and suppliers of
products at the same time. This requirement is met by building orbital or cyclic pipelines
connecting all partners. The material flow is still unidirectional, but all partners can take
material from or feed material into the pipeline as long as the (segment-wise) total flow is in balance.
There might be additional chemical or technical restrictions that have to be taken into
account for pipeline transport planning such as the interruptibility of pipeline operation.
E.g. if a product remains too long in a pipeline without movement, chemical reactions
may take place which may cause corrosion or polymerization processes. This would lead
to a failure and cause expensive maintenance and/or cleaning operations. However, the
most obvious technical restriction is probably the pump rate of a pipeline. The pump rate
measures the transport speed on a volume-per-time basis. Typically, there is an upper
bound for the tolerable pressure inside a pipeline determining the maximum pump rate
(say $\rho^{cap}$). The realized pump rate might be smaller and depends on the power of the pumps
at the pumping stations.13

3.2.2 Single-product pipelines


A relevant restriction to be fulfilled for operating pipelines is the flow balance, i.e. the
input flow equals the output flow, implying constant physical conditions (such as pressure)
for all pipeline segments. For instance, assume a serially operated multi-access pipeline
where $\mathcal{N}$ is the set of access points consisting of destination points $\mathcal{N}^{out}$ and feeding
locations $\mathcal{N}^{in}$ such that $\mathcal{N}^{out} \cap \mathcal{N}^{in} = \emptyset$ and $\mathcal{N}^{out} \cup \mathcal{N}^{in} = \mathcal{N}$. The flow rate $\rho$ is measured
in volume-per-time units and $\rho_i$ is the flow rate at location $i$ measured at a point of
time.14 For $i \in \mathcal{N}^{in}$, $\rho_i$ is an inflow and for $i \in \mathcal{N}^{out}$ it is an outflow. Assuming constant
pressure and a completely filled pipeline, the inflow-outflow condition $\sum_{i \in \mathcal{N}^{in}} \rho_i = \sum_{j \in \mathcal{N}^{out}} \rho_j$
holds at each point in time. The flow rates $\rho_i$ can be controlled via valves. For multi-access
pipelines, heterogeneous flow rates may occur at the access points. Figure 3.3
shows an example of a serial multi-access pipeline with $\mathcal{N} = \{1, ..., 4\}$, $\mathcal{N}^{in} = \{1, 2\}$ and
$\mathcal{N}^{out} = \{3, 4\}$.
Due to the flow balance and pipeline capacity it follows $\sum_{i \in \mathcal{N}^{in}} \rho_i = \sum_{j \in \mathcal{N}^{out}} \rho_j \le \rho^{cap}$ and
$\rho_i, \rho_j \le \rho^{cap}$. If local flow capacities exist, it follows additionally $\rho_j \le \rho^{cap}_j \le \rho^{cap}$.

Figure 3.3: Scheme of an exemplary serial multi-access pipeline (feeding locations i = 1, 2 with inflows ρ1, ρ2; destination points j = 3, 4 with outflows ρ3, ρ4)

13
Typically the pumping power can be controlled. In a reasonably planned pipeline system the
pumps’ maximal power typically suffices to exploit the maximum pump rate $\rho^{cap}$.
14
A time indexing of flow rates is dropped for the sake of simplicity. In principle, flow rates are time-continuous
variables, i.e. $\rho_i(t)$ is the precise notation of flow rate functions.

The planning and control of pipelines is a technically challenging task since the flow
parameters have to be controlled continuously to ensure immediate reactions in case of
leakages or similar problems. Therefore, a pipeline model has to represent how the
(outflow) flow rate and pressure respond to control parameters such
as inflow pressure and pumping power. Modelling the pipeline flow leads to a system of
non-linear partial differential equations which cannot be solved analytically.15 However,
numerical methods are available16 that find their way into modern pipeline management
systems.17
Despite the considerable challenges in pipeline control, difficult operational planning
problems do not occur for single-product pipelines from a logistical point of view. An exception
is the planning of maintenance operations which, for instance, includes the planning
of replacement investments to prevent technical failures. When a pipeline is maintained carefully,
failures due to technical problems occur rarely.18 The structural integrity of a pipeline
has to be checked at regular intervals by pipeline inspection gauges (PIGs) that are used
e.g. to scan for structural weaknesses. These PIGs are most often passive, i.e. they have
no driving unit and are transported within the usual material flow. Besides the scheduling
of these PIG runs (in accordance with legal regulations), there is no associated planning problem
as long as the usual pipeline operation does not have to be interrupted.
Besides the control of material flow and the planning of maintenance activities, there
are no operational planning problems involved in running a pipeline system that could
be labelled as logistical problems. The only parameters with logistical relevance to be
planned operationally are the flow rates. Flow rates have to assure that all plants along
the pipeline can work properly, i.e. it has to be ensured that enough material is available
to supply the consuming plants. If the pipeline cannot be stopped (ρ > 0), it has to be

15
See e.g. Matko et al. (2000) for an overview or Herrán-González et al. (2009) for the special case of gas
pipelines.
16
See e.g. Blažič et al. (2004).
17
E.g. see Cameron et al. (2001).
18
See e.g. Association of Petrochemicals Producers in Europe (2004) for empirical figures.

assured additionally that sufficient storage capacity is available to maintain a minimal flow
rate at any time. As the producing plants along the pipeline are not working constantly
at the same level, buffers have to be integrated in the pipeline system to assure material
availability. A crucial question arising is the distribution of stocks along the pipeline. A
typical setting consists of tanks/inventories at each access point.
In contrast to common assumptions in inventory management, the demands for chemicals
at chemical production sites are composed of the input and output flows of the
related plants instead of a total number of orders within a period. This implies that
the uncertainty covered by safety stocks arises from technological sources rather than
from external markets. Moreover, make-to-stock production predominates in the chemical
industry because almost all products are commodities.
In the case of continuously operated single-purpose plants the set of input and output
chemicals is fixed (i.e. there is only one recipe used). However, the throughput of the plant
can vary in most cases. I.e. there is a finite set of production modes each one associated
to specific flow levels of all chemicals. As such plants are typically designed for maximal
efficiency, the production mode associated with the maximum throughput is the intended
state of production. A change to another mode is often forced by disturbances such as
technical failures.19
To sum up, chemical production plants are characterized by a finite number of produc-
tion modes (say ∣S∣ = k) corresponding to a finite number of flow levels ω(s) with s ∈ S
for the associated chemicals.20 Each production mode s lasts for some time (reflecting
e.g. repair times).
The unintended change of a plant’s production mode can be interpreted as a stochastic
process. Two components are necessary to model such a process: If a plant enters a
certain production mode, the sojourn time (the time until the mode changes again) must
be expressed. When the production mode changes, a transition model determines which
mode is entered next. Since a finite number of production modes exists, the transition
from one mode to another is a discrete process with finite state space. The sojourn time
of a production mode is a continuous variable in principle. However, the analysis and
modelling of sojourn times as continuous variables requires a lot of historical records to fit
an accurately parametrized sojourn time distribution for each mode.21 In most cases, this
empirical problem can be circumvented by considering a discrete time scale. A stochastic
model in discrete time assumes that a plant is in exactly one mode/state per period of
time. The sojourn time of a mode can then be reflected as the length of a sequence of
periods without a mode change. Thus, a joint sojourn time and transition model can be

19
The shut-down of a plant can also be seen as a specific production mode with zero throughput. Beside
unintended changes of the optimal production mode, sometimes planned shut-downs occur e.g. for
regular technical inspections or due to a severe drop in demand.
20
Note that the number of flow levels of a certain chemical is smaller than the number of production
modes if multiple modes have the same flow level of the specific chemical.
21
This is because some modes last only for a few days and rarely occur, e.g. plant break downs.

expressed by transition probabilities ruling the transition from one production mode to
another at the beginning of each time period. This constitutes a discrete Markov process
or a Markov chain.22
Assume that the probability that a plant’s production mode changes from mode $s$ to mode $t$
within a period is denoted by $q_{st} \ge 0$. Transitions from mode $s$ to $s$ are allowed:
$q_{ss} \ge 0$ reflects the probability that the plant remains in mode $s$ in the next period
(sojourn time). Assume that transition matrices $Q$ exist for all plants, which are square
matrices composed of the mode transition probabilities $q_{st}$. Such a matrix is called
a stochastic matrix if the sum of each row or each column equals 1, respectively. By
convention, it is assumed that the rows sum up to 1, i.e. $\sum_{t=1}^{k} q_{st} = 1$, $s = 1, ..., k$. A
stochastic matrix constitutes a discrete time Markov process/chain.23 Let $u_0$ be a binary
(row) vector of length $k$ which has entry 1 at some starting position $s$ and zero entries
otherwise, i.e. $u_0$ represents the starting state of the Markov process in period 0.24 Then
the probability vector for period 1 is calculated by $u_1 = u_0 \cdot Q$, whose entries represent the
probabilities that the process enters the corresponding state. If $Q$ constitutes a regular
Markov chain, a steady state vector $\pi = (\pi_1, ..., \pi_s, ..., \pi_k)$ exists where $\pi_s$ represents the
long-term probability that the modelled process is in mode $s$.25 The steady state vector
$\pi$ can be obtained by $\Pi = \lim_{n \to \infty} Q^n$ where each row of $\Pi$ equals the steady state vector
$\pi$.26
Another property of the steady state vector is its temporal invariance: If the Markov
process starts with $\pi$ as the initial distribution ($u_0 = \pi$), the distribution remains
unchanged for all following periods: $u_1 = \pi Q = \pi Q^n = u_n = \pi$ for $n > 0$.27
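The recursion $u_{n+1} = u_n \cdot Q$ and the invariance $\pi Q = \pi$ can be illustrated with a minimal sketch. It assumes a regular chain and approximates the steady state by repeated multiplication; the function names `step` and `steady_state`, the tolerance and the iteration limit are illustrative choices, and the two-mode matrix uses the values given for provider P1 in Example 4.

```python
def step(u, Q):
    """One period: row vector u multiplied by transition matrix Q."""
    k = len(Q)
    return [sum(u[s] * Q[s][t] for s in range(k)) for t in range(k)]


def steady_state(Q, tol=1e-12, max_iter=10_000):
    """Approximate the steady state vector by iterating u_{n+1} = u_n * Q."""
    k = len(Q)
    u = [1.0] + [0.0] * (k - 1)  # binary start vector u_0: process starts in the first mode
    for _ in range(max_iter):
        nxt = step(u, Q)
        if max(abs(a - b) for a, b in zip(u, nxt)) < tol:
            return nxt
        u = nxt
    raise RuntimeError("no convergence; the chain may not be regular")


# Two modes (breakdown, full capacity); each row sums to 1.
Q = [[0.6, 0.4],
     [0.2, 0.8]]
pi = steady_state(Q)
print(pi)           # approximately (1/3, 2/3)
print(step(pi, Q))  # unchanged up to rounding: pi * Q = pi
```

For larger state spaces, solving the linear system $\pi Q = \pi$ with $\sum_s \pi_s = 1$ directly is usually preferable to iterating.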
Let $X^{jm}$ denote the flow rate of the considered chemical of plant $m \in \{1, ..., M^j\}$ at
location $j$. Each plant has a finite number of production states/modes $S^{jm}$ with $|S^{jm}| = k^{jm}$.
Each state is associated with a flow rate of the considered material that is consumed
or produced per period, denoted by $\omega(s)$ with $s \in S^{jm}$ or, more compactly, $\omega^{jm}_s \in O^{jm}$. The
corresponding transition matrix $Q^{jm}$ is of dimension $k^{jm} \times k^{jm}$. The transition probability
from state $s$ to state $t$ is denoted by $q^{jm}_{st}$ for plant $m$ at location $j$.
Let $X^j$ be the total flow rate at location $j$ in a period of time (i.e., $X^j = \sum_{m=1}^{M^j} X^{jm}$).
The set of states of $X^j$, denoted by $O^j$, depends on the combination of the states of all
plants at location $j$. Let $S^j = S^{j1} \times \cdots \times S^{jM^j}$ denote the set of state combinations at
location $j$. Its cardinality is given by $|S^j| = k^j = \prod_{m=1}^{M^j} k^{jm}$. Hence, an element $s^j \in S^j$ is
a vector of states for all plants at location $j$, $s^j = (s^{j1}, ..., s^{jM^j})$ with $s^{jm} \in S^{jm}$. Under


22
Since the term "chain" refers to finite state space, the term "Markov chain" is probably more precise,
see e.g. Ibe (2008, ch. 2).
23
See Grinstead and Snell (1997, pp. 405-407).
24
To ease calculus all vectors are row vectors in the following.
25
The term regular Markov chain, simply spoken, refers to the property that all possible states can be
reached in a finite number of subsequent periods independent from the starting distribution. For
more information about definition, conditions, and properties of regular and ergodic Markov chains
see e.g. Grinstead and Snell (1997, pp. 433 ff.).
26
See e.g. Grinstead and Snell (1997, pp. 435 ff.)
27
More formally, π is the left eigenvector of Q associated with eigenvalue 1, normalized such that its entries sum to 1.

independence of the production plants and their states,28 the transition probability $q^j_{st}$
from state combination $s$ to state combination $t$ is given by

$$q^j_{st} = \prod_{m=1}^{M^j} q^{jm}_{s_m t_m}. \qquad (3.1)$$

As a (complete) convolution of regular Markov processes, the matrices $Q^j$ are stochastic matrices
(with dimension $k^j \times k^j$) and constitute regular discrete Markov processes. Hence, steady
state vectors $\pi^j$ exist for each location $j$ which represent the long-term probabilities of a
site’s state combinations.29
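As a hedged illustration of (3.1), the following sketch builds the joint transition matrix of independent plants by multiplying the plant-wise transition probabilities for every pair of state combinations. The helper name `joint_transition` and the zero-based state indices are assumptions made for this sketch; the two matrices reuse the values of the providers P1 and P2 from Example 4 purely as sample data.

```python
from itertools import product


def joint_transition(matrices):
    """Joint transition matrix of independent plants according to (3.1).

    Returns the list of state combinations and the matrix whose entry
    [a][b] is the product of the plant-wise probabilities q_{s_m t_m}.
    """
    states = list(product(*[range(len(Q)) for Q in matrices]))
    joint = []
    for s in states:
        row = []
        for t in states:
            p = 1.0
            for Q, s_m, t_m in zip(matrices, s, t):
                p *= Q[s_m][t_m]
            row.append(p)
        joint.append(row)
    return states, joint


Q1 = [[0.60, 0.40], [0.20, 0.80]]
Q2 = [[0.85, 0.15], [0.05, 0.95]]
states, Qj = joint_transition([Q1, Q2])

print(states)                        # [(0, 0), (0, 1), (1, 0), (1, 1)]
print([sum(row) for row in Qj])      # every row sums to 1 (up to rounding)
```

The number of state combinations grows as the product of the plants' mode counts, so this explicit construction is only practical for a moderate number of plants per location.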
At integrated chemical production sites, however, plants might not be entirely independent,
e.g. due to a common energy supply. Under such circumstances the states of different
plants are interdependent since the transition probabilities also depend on the states of
other plants. In this case, $X^j$ and $S^j$ are the same as described above, but the transition
matrix $Q^j$ has to be derived directly by analysing the entire production network as a more
complex stochastic process, not as a simple convolution of independent Markov chains.
The associated total material balance of a state combination $s^j \in S^j$ at location $j$ is
defined by

$$\omega^j_{s^j} = \omega(s^j) = \sum_{m=1}^{M^j} \omega(s^{jm})$$

with $\omega^j_{s^j} \in O^j$. Note that there might exist state combinations with identical total
material balance, i.e. $|O^j| \le |S^j|$.30 Without loss of generality define that if $X^j > 0$ there is
a (net) demand, whereas if $X^j < 0$ the considered material is surplus at location $j$. Often,
for a given location $j$, $X^j$ is either exclusively positive ($O^j \subset \mathbb{R}_+$) or exclusively negative
($O^j \subset \mathbb{R}_-$).
The total material balance over all locations associated with the considered pipeline is the
sum of all $X^j$: $X = \sum_{j \in \mathcal{N}} X^j$. Again, $X$ depends on the combination of the locations’ total
production states. Let $S$ denote the set of the locations’ state combinations and $O$ the
set of associated total production balances of $X$. To calculate the transition probabilities
$q_{st}$ from a total state combination $s \in S$ to a total state combination $t \in S$, the same
procedure as mentioned before is applied, assuming independence across locations. As
different locations typically do not share a direct technical relation (in contrast to the
plants at a location), independence is not a restrictive assumption here.
Therefore, $s, t \in S$ are vectors of $\sum_{j=1}^{N} M^j$ elements assigning each plant at each location
a state, i.e. $s = (s^1, ..., s^N)$ with $s^j \in S^j$ and $N = |\mathcal{N}|$. Consequently, $Q$ is constructed


28
I.e. that a mode change at a plant does not depend on the states of all other plants. In other words,
changes of a plant’s production mode are not influenced by the other plants’ modes.
29
This follows immediately for regular Markov chains because the corresponding transition matrices
must contain positive entries only. For more information about the summation of Markov chains see
Rozhkov (2010).
30
Equality holds if all state combinations result in unique total material balances.

as a convolution of regular discrete Markov processes and, hence, is a regular discrete
Markov process whose steady state vector is denoted by $\pi$. Provided that pipeline supply
is the regular transport mode for all destinations, $E(X) = E(\sum_{j \in \mathcal{N}} X^j) = 0$
is an appropriate assumption.31
A critical situation occurs when the supply system is out of balance, i.e. if X ≠ 0. Either
the material is surplus (X < 0) and has to be stored or it is in deficit (X > 0) and has to
be provided from stock. Hence, stock capacities and safety stocks are required to cover
such critical situations.
Since the flow rates in most pipeline systems can be adjusted easily, the precise
distribution of stock capacities and safety stock levels can be compensated by adjusting
the pipeline flow rates accordingly. It therefore suffices to determine a common stock capacity and
safety stock level. The technical and economic impact of an imbalance depends on its
magnitude and its duration. In supply systems, strategic safety stocks are
designed to cover the demand of a certain number of periods of time with a predefined
probability.32 Typically, this time span is the lead time required to replenish inventory,
e.g. by placing an order with an external supplier. In the case of pipeline supply systems,
the supply should be organized by the partners along the pipeline which act as suppliers
and customers. Hence, a desired time span can be assumed during which the supply
system should be operated autonomously (with a certain probability). Let $b$ denote this
critical time span. The demand during this time span is denoted by $Y(b)$ and is the
sum of the total material balances over $b$ periods. The set of all possible state
sequences of length $b$ out of domain $S$ is denoted as $H = \underbrace{S \times \cdots \times S}_{b}$ with $|H| = k^b$ where
$k = |S|$. An element $h \in H$ is a vector of $b$ state combinations, i.e. $h = (s_1, ..., s_b)$ with
$s_i \in S$. The probability $p_h$ of a state sequence $h$ can be calculated straightforwardly by
using the transition matrix $Q$ and the steady state vector $\pi$ as follows33

$$p_h = \pi_{s_1} \cdot \prod_{l=2}^{b} q_{s_{l-1} s_l}. \qquad (3.2)$$

For a certain sequence $h \in H$ the corresponding total material balance $y_h \in Y$ is calculated by

$$y_h = \sum_{l=1}^{b} \omega(s_l) \qquad (3.3)$$

where the total sum over all production quantities at all locations and plants is $\omega(s_l) = \sum_{j=1}^{N} \omega^j_{s^j_l}$.
31
Otherwise, the supply system is unbalanced in expectation, which would be an indication of a badly
managed system or of improperly modelled suppliers/consumers.
32
See e.g. Tempelmeier (2005, pp. 397 ff.).
33
This again implies independence. This can be seen as a special case of a hidden Markov model with
certain observations, see e.g. Ephraim and Merhav (2002). Although the number of sequences grows
exponentially, the effort for calculating the corresponding probabilities can be reduced considerably (to $O(k^2 b)$) by
iteratively calculating the probabilities of sub-sequences (dynamic programming). In the context of
hidden Markov models this is called forward-backward algorithm, see e.g. Yu and Kobayashi (2003).

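Equations (3.2) and (3.3) can be sketched as follows. This is a brute-force illustration that enumerates all $k^b$ sequences, not the dynamic programming variant mentioned in footnote 33; the function names and the symmetric two-state toy chain with balances of -15 and 15 are hypothetical and chosen only so that the result is easy to check by hand.

```python
from collections import defaultdict
from itertools import product


def sequence_probability(h, pi, Q):
    """p_h according to (3.2): steady state start, then b - 1 transitions."""
    p = pi[h[0]]
    for prev, cur in zip(h, h[1:]):
        p *= Q[prev][cur]
    return p


def sequence_balance(h, omega):
    """y_h according to (3.3): sum of the material balances along the sequence."""
    return sum(omega[s] for s in h)


def balance_distribution(pi, Q, omega, b):
    """Distribution of Y(b), obtained by enumerating all sequences in H."""
    dist = defaultdict(float)
    for h in product(range(len(Q)), repeat=b):
        dist[sequence_balance(h, omega)] += sequence_probability(h, pi, Q)
    return dict(sorted(dist.items()))


# Hypothetical toy chain: state 0 = surplus of 15, state 1 = deficit of 15.
Q = [[0.9, 0.1],
     [0.1, 0.9]]
pi = [0.5, 0.5]
omega = [-15, 15]

dist = balance_distribution(pi, Q, omega, b=2)
print(dist)  # balances -30, 0 and 30 with probabilities of about 0.45, 0.10 and 0.45
```

Sequences with identical total balance are merged, so the returned dictionary directly gives the probability mass function of $Y(b)$ in ascending order of the balance.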
Let $y_{h(g)}$ denote the total material balances in ascending order ($y_{h(1)} = \min_{h \in H} y_h$) and $p_{h(g)}$
the associated probabilities. For a given total safety stock level $r$ along the pipeline,
the α-service level of the pipeline storage system is defined by $\sum_{g=1}^{f} p_{h(g)}$ with $f$ determined
such that $r \ge y_{h(f)}$ and $r < y_{h(f+1)}$, i.e. by the probability that the total material balance does not exceed $r$.34 To define a safety stock level satisfying a desired
α-service level (say $r_\alpha$), the quantile $y^\alpha$ with $0 < \alpha \le 1$ needs to be defined by $y^\alpha = y_{h(f)}$
with $f$ determined such that $\sum_{g=1}^{f} p_{h(g)} \ge \alpha$ and $\sum_{g=1}^{f-1} p_{h(g)} < \alpha$. It follows that setting the
safety stock to

$$r_\alpha \ge \max(0, y^\alpha) \qquad (3.4)$$

leads to stock-out situations with a probability of at most $1 - \alpha$ (within the critical time span
$b$).
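A minimal sketch of the quantile rule behind (3.4), assuming the distribution of $Y(b)$ is already available as a mapping from balance to probability. The function name `alpha_safety_stock` and the three-point toy distribution are hypothetical.

```python
def alpha_safety_stock(dist, alpha):
    """Smallest safety stock satisfying (3.4) for a distribution {balance: probability}.

    The quantile y^alpha is the smallest balance whose cumulated probability
    reaches alpha; negative quantiles (surplus) require no safety stock.
    """
    cumulated = 0.0
    for y in sorted(dist):
        cumulated += dist[y]
        if cumulated >= alpha:
            return max(0, y)
    raise ValueError("alpha exceeds the total probability mass")


# Hypothetical distribution of Y(b): surplus of 30, balanced, deficit of 30.
dist = {-30: 0.45, 0: 0.10, 30: 0.45}

print(alpha_safety_stock(dist, 0.50))  # 0: the balanced state already reaches 50%
print(alpha_safety_stock(dist, 0.95))  # 30: only the largest deficit reaches 95%
```

Because probabilities are accumulated in floating point, a target α that coincides exactly with a cumulated probability may need a small tolerance in practice.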
Moreover, the safety stock level can be determined based on the β-service level, i.e. at
least $\beta \cdot 100$ percent of the average net deficit occurring along the pipeline within $b$ periods
can be provided from stock. Here, the average net deficit is defined by $\mu_C = \sum_{g=f}^{k^b} p_{h(g)} \cdot y_{h(g)}$
(with $f$ determined such that $y_{h(f)} > 0$ and $y_{h(f-1)} \le 0$). The expected loss for
a certain stock level $r$ follows as $V(r) = \sum_{g=f}^{k^b} (y_{h(g)} - r) \cdot p_{h(g)}$ (with $f$ determined such that
$y_{h(f)} > r$ and $y_{h(f-1)} \le r$). Then, $1 - V(r)/\mu_C$ defines the β-service level of a stock level $r$ for the
pipeline storage system.35 The total safety stock level satisfying a predefined β-service
level can be determined by

$$r_\beta \ge V^{-1}((1 - \beta) \cdot \mu_C). \qquad (3.5)$$
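The loss function $V(r)$ and the β-service level can be sketched in the same style. Instead of inverting $V$ analytically as in (3.5), this illustration searches a grid of stock levels, which is a simplification; the function names, the grid width and the toy distribution are hypothetical, and the sketch assumes that at least one positive balance exists (otherwise $\mu_C = 0$).

```python
def expected_loss(dist, r):
    """V(r): expected deficit that cannot be covered by a stock level r."""
    return sum((y - r) * p for y, p in dist.items() if y > r)


def beta_service_level(dist, r):
    """1 - V(r) / mu_C, where mu_C = V(0) is the average net deficit."""
    mu_c = expected_loss(dist, 0)
    return 1.0 - expected_loss(dist, r) / mu_c


def beta_safety_stock(dist, beta, step=1):
    """Smallest stock level on a grid of width `step` reaching the beta-service level."""
    r = 0
    while beta_service_level(dist, r) < beta - 1e-12:  # tolerance for rounding errors
        r += step
    return r


# Hypothetical distribution of Y(b): surplus of 30, balanced, deficit of 30.
dist = {-30: 0.45, 0: 0.10, 30: 0.45}

print(expected_loss(dist, 0))         # mu_C = 30 * 0.45 = 13.5
print(beta_safety_stock(dist, 0.90))  # 27, since V(27) = 3 * 0.45 = 1.35 = 0.1 * mu_C
```

Since $V(r)$ is piecewise linear and decreasing, the grid search could be replaced by linear interpolation between neighbouring balance values if an exact inverse is needed.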

The setting of a global safety stock level based on the lower bounds introduced above
implies that

• flow rates at all locations are arbitrarily adjustable and

• ρcap is not exceeded at any location.

The first point is a reasonable assumption provided there is a disjoint categorization


of locations into (net) material providers and (net) consumers such that there is one
unique flow direction within the pipeline, i.e. for each location $j$ either $X^j \ge 0$ or $X^j \le 0$ holds in all states. I.e. if
the pipeline is serially operated, the total safety stock has to be distributed among the
provider locations. Another situation occurs for cyclic pipelines. Here, the material flow is
unidirectional, but each location can act as consumer or provider as long as the total flow
is balanced. The safety stock can be distributed among all participants along the pipeline
as long as all participants can feed surplus material from stock to starving partners.
Safety stocks at the consumer locations have to cover the risks of supply (i.e. pipeline
short-falls) and/or the risk of consumption rates higher than the pipeline’s maximum
34
This definition is similar to the usual α-service level definition, see e.g. Tempelmeier (2005, p. 397 ff.).
35
Again see Tempelmeier (2005, p. 397 ff.) for the usual β-service level definition.

transport capacity. The former risk is neglected in the deductions since pipelines are
typically highly reliable transporters. In the latter case, the described procedure can be
used to determine the total safety stock to be held at the consumer locations.
A further assumption is that the maximum flow rate $\rho^{cap}$ is not exceeded at a single
location, i.e. $X^j \le \rho^{cap}_j \le \rho^{cap}$ at all locations $j$. Otherwise, a local safety stock level has
to be incorporated covering situations when the local demand $X^j$ exceeds the maximum
flow rate $\rho^{cap}_j$. To calculate the local safety stock level, the procedure described above can
be adapted. Consider the transition matrix $Q^j$ and recall the set of state combinations
$S^j$ as well as the corresponding set of local material balances $O^j \ni \omega^j_{s^j} = \omega(s^j)$ with
$s^j \in S^j$. To incorporate the local flow restriction, define a critical material balance by
$\omega^C(s^j) = \max(0, \omega(s^j) - \rho^{cap}_j)$. Subsequently, define the local critical demand surplus $y^{jC}_h$
over $b$ periods for a state sequence $h^j = (s^j_1, ..., s^j_b)$ by $y^{jC}_h = \sum_{i=1}^{b} \omega^C(s^j_i)$. The probability
of a state sequence remains as $p^j_h = \pi^j_{s^j_1} \cdot \prod_{l=2}^{b} q^j_{s^j_{l-1} s^j_l}$. The required local safety stock $r_j$ can
be determined e.g. by assuming that the expected demand surplus is $\mu^j_C = \sum_{g=f}^{|H^j|} \pi^j_{h(g)} \cdot y^{jC}_{h(g)}$
(with $f$ determined such that $y^{jC}_{h(f)} > 0$ and $y^{jC}_{h(f-1)} \le 0$).36 The local α- or β-service level
constraints can be deduced by means of (3.4) and (3.5) accordingly. Note that since
the calculation for the total safety stock along the pipeline does not differentiate states
(w.r.t. the local pipeline capacity), local safety stock levels serve as a part of the global
safety stock as well and are implicitly included in the total safety stock level calculation.
Conversely, the pipeline capacity at location $j$ also restricts the feed rate. Hence, the
distribution of safety stock shares among the provider locations has to take into account
these constraints as well. In general, the number of covered periods should be as equal as
possible for all local safety stocks. I.e. the share of the total safety stock $r$ for a particular
provider location $j$ can be assigned by requiring an identical number of periods covered,
i.e. $r_j = r \cdot \rho^{cap}_j / \sum_{i \in \mathcal{N}^{in}} \min(\rho^{cap}_i, \rho^{cap})$ for all $j \in \mathcal{N}^{in}$.
The following example applies the aforementioned procedure to derive safety stock
levels for a serially operated pipeline system with two provider locations and two consumer
locations.

36
Here, $y^{jC}_{h(g)}$ denotes the surplus demand in ascending order and $\pi^j_{h(g)}$ its corresponding probability.

Example 4 (Serial pipeline supply network). Suppose that the pipeline system depicted
in Figure 3.3 connects four locations consuming or providing Ethylene. The pipeline is
serially operated with two providers of Ethylene at its head and two consumer plants at its
bottom. Provider P 1 is an Ethylene producing plant (e.g. a steam cracker) with a capacity
of 3,000 tons per day. Provider P 2 is a seaport where Ethylene can be unloaded from
tanker ships into tanks feeding the pipeline. Feeding the pipeline is operated in batch mode.
I.e. when the Ethylene stock level is sufficiently high, Ethylene is fed into the pipeline with
a constant rate of 1,000 tons per day. Otherwise, there is no feeding. The consumer plants
are supposed to be continuously operated (e.g. to produce Ethylbenzene) with consumption
rates of 2,500 tons and 1,500 tons per day, respectively. All production plants are restricted
to two production modes, either working at full capacity (s = 2) or a breakdown (s = 1) with
no production/consumption. Tables 3.2a-3.2d show the transition matrices of the providers
($Q^{P1}$ and $Q^{P2}$) and consumers ($Q^{C1}$ and $Q^{C2}$) as well as the corresponding production
modes and associated Ethylene quantities given in 100 tons.

Sub-table     From mode s   ω_s (100 tons)   To mode 1   To mode 2
(a) Q^{P1}    1                  0             0.60        0.40
(a) Q^{P1}    2                -30             0.20        0.80
(b) Q^{P2}    1                  0             0.85        0.15
(b) Q^{P2}    2                -10             0.05        0.95
(c) Q^{C1}    1                  0             0.60        0.40
(c) Q^{C1}    2                 25             0.10        0.90
(d) Q^{C2}    1                  0             0.90        0.10
(d) Q^{C2}    2                 15             0.10        0.90

Table 3.2: Transition matrices and production modes for providers and consumers

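The corresponding steady state probabilities are calculated as

$$\pi^{P1} = \left(\tfrac{1}{3}, \tfrac{2}{3}\right), \quad \pi^{P2} = \left(\tfrac{1}{4}, \tfrac{3}{4}\right), \quad \pi^{C1} = \left(\tfrac{1}{5}, \tfrac{4}{5}\right), \quad \pi^{C2} = \left(\tfrac{1}{2}, \tfrac{1}{2}\right). \qquad (3.6)$$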

Altogether, there are $2^4 = 16$ state combinations. The state space of the total Ethylene
balance $X$ is $O = (-40, -30, -25, -15, -10, -5, 0, 5, 10, 15, 25, 30, 40)$ with $|O| = 13$. The
system is in balance ($\omega_s = 0$) if all locations are working ($s = (2, 2, 2, 2)$) or if all locations
are off ($s = (1, 1, 1, 1)$). Tables 3.3a and 3.3b show the state combinations constituting
all unbalanced total states.
For the total material balance $X$ the corresponding transition matrix $Q$ is calculated
using (3.1). For instance, $q_{99}$ (i.e. the probability that there is no transition from state $s_9 = (2, 2, 2, 2)$ with $\omega_{s_9} = 0$)
is calculated as $q_{99} = \prod_{j=1}^{4} q^j_{l_j l_j} = q^{P1}_{22} \, q^{P2}_{22} \, q^{C1}_{22} \, q^{C2}_{22} = 0.8 \cdot 0.95 \cdot 0.9 \cdot 0.9 \approx 0.62$. Analogously, the
further transition probabilities are calculated and summarized in Table 3.4.

From $Q$ the steady state vector can be derived as $\pi$ = (1/20, 1/60, 1/20, 1/5, 1/60,
1/40, 1/15, 1/120, 1/5, 1/40, 1/15, 1/10, 1/120, 1/30, 1/10, 1/30). Note that $E(X) = \sum_{t=1}^{16} \pi_t \cdot \omega(s_t) = 0$,
i.e. in the long run the supply system is expected to be in balance.
(a) Total surplus states               (b) Total deficit states
t   ω_{s_t}   state comb. s_t          t    ω_{s_t}   state comb. s_t
1   -40       (2, 2, 1, 1)             10    5        (1, 2, 1, 2)
2   -30       (2, 1, 1, 1)             11   10        (2, 1, 2, 2)
3   -25       (2, 2, 1, 2)             12   15        (1, 2, 2, 1)
4   -15       (2, 2, 2, 1)             13   15        (1, 1, 1, 2)
5   -15       (2, 1, 1, 2)             14   25        (1, 1, 2, 1)
6   -10       (1, 2, 1, 1)             15   30        (1, 2, 2, 2)
7    -5       (2, 1, 2, 1)             16   40        (1, 1, 2, 2)

Table 3.3: All combinations of plant production modes

        t      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
 t  ω_{s_t}  -40  -30  -25  -15  -15  -10   -5    0    0    5   10   15   15   25   30   40
 1   -40    0.41 0.02 0.05 0.27 0.00 0.10 0.01 0.01 0.03 0.01 0.00 0.07 0.00 0.00 0.01 0.00
 2   -30    0.06 0.37 0.01 0.04 0.04 0.02 0.24 0.09 0.00 0.00 0.03 0.01 0.01 0.06 0.00 0.01
 3   -25    0.05 0.00 0.41 0.03 0.02 0.01 0.00 0.00 0.27 0.10 0.01 0.01 0.01 0.00 0.07 0.00
 4   -15    0.07 0.00 0.01 0.62 0.00 0.02 0.03 0.00 0.07 0.00 0.00 0.15 0.00 0.01 0.02 0.00
 5   -15    0.01 0.04 0.06 0.00 0.37 0.00 0.03 0.01 0.04 0.02 0.24 0.00 0.09 0.01 0.01 0.06
 6   -10    0.21 0.01 0.02 0.14 0.00 0.31 0.01 0.02 0.02 0.03 0.00 0.21 0.00 0.01 0.02 0.00
 7    -5    0.01 0.06 0.00 0.10 0.01 0.00 0.55 0.02 0.01 0.00 0.06 0.02 0.00 0.14 0.00 0.02
 8     0    0.03 0.18 0.00 0.02 0.02 0.05 0.12 0.28 0.00 0.01 0.01 0.03 0.03 0.18 0.00 0.02
 9     0    0.01 0.00 0.07 0.07 0.00 0.00 0.00 0.00 0.62 0.02 0.03 0.02 0.00 0.00 0.15 0.01
10     5    0.02 0.00 0.21 0.02 0.01 0.03 0.00 0.00 0.14 0.31 0.01 0.02 0.02 0.00 0.21 0.01
11    10    0.00 0.01 0.01 0.01 0.06 0.00 0.06 0.00 0.10 0.00 0.55 0.00 0.02 0.02 0.02 0.14
12    15    0.03 0.00 0.00 0.31 0.00 0.05 0.02 0.00 0.03 0.01 0.00 0.46 0.00 0.02 0.05 0.00
13    15    0.00 0.02 0.03 0.00 0.18 0.01 0.01 0.03 0.02 0.05 0.12 0.00 0.28 0.02 0.03 0.18
14    25    0.01 0.03 0.00 0.05 0.00 0.01 0.28 0.05 0.01 0.00 0.03 0.07 0.01 0.41 0.01 0.05
15    30    0.00 0.00 0.03 0.03 0.00 0.01 0.00 0.00 0.31 0.05 0.02 0.05 0.00 0.00 0.46 0.02
16    40    0.00 0.00 0.01 0.01 0.03 0.00 0.03 0.01 0.05 0.01 0.28 0.01 0.05 0.05 0.07 0.41

Table 3.4: Transition matrix Q (rows: current state combination, columns: state combination in the next period)

Assume that the critical period during which the supply system should be kept autonomous
is set to $b = 4$ days. Hence, all $|H| = 16^4 = 65,536$ sequences of state combinations are
evaluated. The corresponding probabilities and total material balances are calculated using
(3.2) and (3.3). Figure 3.4a shows the states of the total material balance $Y(4)$ and the
associated cumulated probabilities. The corresponding loss function $V(r)$ is depicted as a
function of the safety stock level $r$ in Figure 3.4b.
To assure material availability with a probability of at least $\alpha = 0.95$, a safety stock level
of at least $r_\alpha = 95 \cdot 100$ tons is required. This safety stock level also ensures a β-service
level of at least $\beta = 0.95$ where $\mu_C = 22.45$ (as defined above). Both values are depicted in
Figures 3.4a and 3.4b, respectively.
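The calculation steps of Example 4 can be retraced with the following sketch. It takes the transition matrices and quantities from Table 3.2 and the steady state probabilities from (3.6); everything else (the names, the zero-based mode index with 0 = breakdown and 1 = full capacity, and the forward recursion over pairs of state and running balance in the spirit of footnote 33) is an illustrative assumption rather than the procedure actually used for the reported figures. The script only prints its results so that they can be compared with the values reported in the example.

```python
from collections import defaultdict
from itertools import product

# Table 3.2: (transition matrix, Ethylene quantity per mode in 100 tons per day).
# Index 0 = breakdown, index 1 = full capacity; negative = supply, positive = demand.
PLANTS = {
    "P1": ([[0.60, 0.40], [0.20, 0.80]], [0, -30]),
    "P2": ([[0.85, 0.15], [0.05, 0.95]], [0, -10]),
    "C1": ([[0.60, 0.40], [0.10, 0.90]], [0, 25]),
    "C2": ([[0.90, 0.10], [0.10, 0.90]], [0, 15]),
}
# Steady state probabilities from (3.6).
STEADY = {"P1": [1/3, 2/3], "P2": [1/4, 3/4], "C1": [1/5, 4/5], "C2": [1/2, 1/2]}

names = list(PLANTS)
states = list(product(range(2), repeat=len(names)))  # 16 state combinations


def q(s, t):
    """Joint transition probability according to (3.1)."""
    p = 1.0
    for name, s_m, t_m in zip(names, s, t):
        p *= PLANTS[name][0][s_m][t_m]
    return p


def omega(s):
    """Total material balance of a state combination."""
    return sum(PLANTS[name][1][s_m] for name, s_m in zip(names, s))


def steady(s):
    """Steady state probability of a state combination (independent locations)."""
    p = 1.0
    for name, s_m in zip(names, s):
        p *= STEADY[name][s_m]
    return p


def balance_distribution(b):
    """Distribution of Y(b) via forward recursion over (state, running balance)."""
    layer = defaultdict(float)
    for s in states:
        layer[(s, omega(s))] += steady(s)
    for _ in range(b - 1):
        nxt = defaultdict(float)
        for (s, y), p in layer.items():
            for t in states:
                nxt[(t, y + omega(t))] += p * q(s, t)
        layer = nxt
    dist = defaultdict(float)
    for (_, y), p in layer.items():
        dist[y] += p
    return dict(sorted(dist.items()))


dist = balance_distribution(4)

# alpha-quantile: smallest balance whose cumulated probability reaches 0.95.
cumulated, y_alpha = 0.0, None
for y, p in dist.items():
    cumulated += p
    if cumulated >= 0.95:
        y_alpha = y
        break

mu_c = sum(y * p for y, p in dist.items() if y > 0)  # average net deficit

print("q for (2,2,2,2) -> (2,2,2,2):", q((1, 1, 1, 1), (1, 1, 1, 1)))
print("expected balance E(Y(4)):", sum(y * p for y, p in dist.items()))
print("0.95-quantile of Y(4):", y_alpha)
print("average net deficit:", mu_c)
```

The forward recursion avoids enumerating all 65,536 sequences explicitly, because sequences that end in the same state combination with the same running balance are merged after every period.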

[Figure 3.4a: Cumulative distribution function for Y (4). Horizontal axis: total material balance Y(4), from −150 to 150; vertical axis: cumulated probabilities, from 0.0 to 1.0. The 95% level is marked at the 0.95-quantile y = 95.]

[Figure 3.4b: Loss function V (r). Horizontal axis: safety stock level r, from 0 to 150; vertical axis: expected loss V(r), from 0 to 20. The level μC ⋅ 0.05 = 1.122 is marked at V−1(μC ⋅ 0.05) = 95.]

Figure 3.4: Cumulative distribution function and loss function for Y (4)

3.2.3 Multi-product pipelines


Material flows along multi-product pipelines are far more complicated to plan. The most
prominent field of application for multi-product pipelines is the distribution of refinery
products among a network of distribution terminals.37 The operating principle is essentially
the same as for single-product pipelines. However, as multiple materials pass through the
pipeline sequentially, the different materials have to be separated in some way. Basically,
there are two options to separate batches of different materials:

• separation PIGs or

• interfaces.

For the first option, special PIGs are used. Separation PIGs are technical devices acting
as an impenetrable membrane. This separates subsequent chemicals perfectly at a point
of destination. A disadvantage is the effort required to insert the PIGs whenever a transition occurs.
Besides the investment costs for PIGs, they have to be redistributed to the feed points
and must be maintained constantly. This is economically reasonable only under specific
conditions, e.g. when the transported materials are of high value and comparatively
heterogeneous such that a mixing of materials induces high re-processing costs.
The second option is simply to allow a mixture of successively injected materials. This
transition mixture is often labelled as interface.38 This option is preferable as in many
cases the set of materials to be transported is chemically homogeneous (e.g. crude oil
derivatives). However, the interface is still a problem as it typically does not meet the
chemical specifications of either of the parent materials. There are two ways to deal with
the interface: Either the interface is added to one of the two parent materials where it
does no harm in the further production processes39 or the interface has to be extracted
(e.g. at the end of a serial pipeline) for a special treatment.40 No matter how interfaces are
treated, transition efforts have to be faced. Hence, minimizing the number of interfaces
is one objective to aim at when planning multi-product pipeline schedules.
If transition efforts can be expressed in terms of costs, the total costs for pipeline
transport can be used as the planning objective. In this context, total costs comprise
transition costs, transport costs, and inventory costs. Transport costs account for oper-
ational pipeline costs, such as the energy costs for pumping. These costs are affected by
the pipeline’s pump rate which needs to be explicitly controlled and planned when the
transported materials differ in viscosity or other physical properties.41 Inventory costs are
37
E.g. see Rejowski and Pinto (2008) and references therein.
38
See e.g. Hull (2005).
39
Especially as its concentration declines to 0 when getting mixed with large amounts of the parental
material.
40
E.g. see Hull (2005).
41
See Rejowski and Pinto (2008) for an example. However, there often exists a regular pump rate for which
the operational costs per flow unit are minimized. Deviations from this optimized pump rate usually
incur additional operational costs due to increased (relative) energy consumption. See Hane and
Ratliff (1995) or Hull (2005).
68

distinguished in stock holding costs and shortage costs. Note that in literature primarily
the first category is taken into account.42 According to the classification presented in
Table 3.1 the next subsection deals with the batch flow case, i.e. multi-product pipelines
with only two access points.

3.2.3.1 Batch flow pipelines

In the batch flow system, the pipeline connects two locations where a set of products
is produced and consumed. The subset of products that are used at both locations is
denoted by S. These products have to be balanced between both locations by pipeline
transports.
In the literature on batch flow planning, planning problems from the oil industry are
prevalent. Here, pipeline systems are used more extensively than in any other industry,
mainly to distribute chemicals from a refinery to customer markets. The problem is to meet
the customer demands in time and replenish local inventories.
These planning problems are typical operational planning problems where the production
quantities and customer demands are given for a certain time horizon. At both locations
tanks are used to buffer demand and supply over time. The connecting pipeline is used
to transport the materials to the consuming location and/or to the producing location.
The aim is to find a pumping schedule that meets the demand at the consuming location
while minimizing the operating costs. Technical restrictions to be considered encompass
e.g. settling periods, tank pumping restrictions or interface expenditure approximations.43
In the literature, basically two directions can be distinguished. A concise comparison of
the problem features tackled in both directions is contained in Table 3.5.

Magatão et al. (2004)    category               Relvas et al. (2006)

                         Pipeline system
complete                 fill state             complete
range                    flow rate              fixed
reversible               flow direction         fixed
one-at-a-time            source/destination     one-at-a-time
                         Tank system
finite                   capacities             finite
one-at-a-time            inflow/outflow         one-at-a-time
                         Demand
range                    quantity per period    fixed

Table 3.5: Comparison of problem features for Magatão et al. (2004) and Relvas et al. (2006)

42 This is because in most studied planning problems stock-outs are not allowed at all and, hence,
   shortages are explicitly prohibited. See MirHassani (2008) or Cafaro and Cerdá (2008) for examples
   where both costs are considered. Note that in Cafaro and Cerdá (2008) shortages are defined as
   tardily delivered batches.
43 E.g. see Relvas et al. (2006) or Rejowski and Pinto (2008).

Both directions commonly assume that the pipeline remains completely filled over the
total time horizon. The pipeline is fed by only one tank at a time and can only feed one
tank at a time. Conversely, a tank can either be filled or emptied at a time. It is assumed
that for each product a set of tanks is available and it has to be determined when to use
which tank for pumping.
Magatão et al. (2004) assume a one-to-one pipeline system with a reversible flow option.44
The objective is to minimize interfaces and tank change-overs. The problem is modelled
as a large-scale, time-discrete mixed-integer linear program (MILP). Magatão et al. (2005)
provide a solution approach relying on the decomposition of the monolithic MILP into
two sub-MILPs and a main MILP.
In contrast, Relvas et al. (2006) assume a unidirectional pipeline flow system. The
corresponding problem is formulated as a time-continuous MILP. Time-discrete customer
demands are dynamically assigned to certain batches depending on the batches’ arrival
times. The objective is to maximize the utilization of the pipeline, i.e. to maximize the
product flow from the provider to the customer depot. Originally, the pipeline’s flow
rate is fixed and the pipeline has to be operated continuously. In Relvas et al. (2007)
the original model is extended by relaxing both restrictions. To deal with the increased
complexity, Relvas et al. (2009) propose an efficient heuristic for determining desirable
product sequences.
In principle both types of operational pipeline planning problems can be applied to
chemical pipeline systems, too, as the technical specifications in both industries are
similar. However, in the chemical industry the distribution of final chemicals is rarely based
on multi-product pipelines. More likely, supply processes of raw and intermediate materials
are performed by pipeline transports. This implies that chemical production sites exist
at both locations. As most chemical production sites are continuously operated, it can be
presumed that in normal operation a constant demand or surplus per period of time is
observed, i.e. there is a fixed (net) consumption and a fixed (net) production rate for each
product and site. The flow rate between both sites is then the minimum of consumption
and production rate. The remaining surplus or deficit has to be handled by other means and
does not affect the pipeline operations planning at all. If the production system is in balance,
both rates are equal for each product. In any case, local consumption and production
rates for each product can be assumed to be equal for pipeline planning. Hence, in the
long run a pumping sequence has to be determined minimizing the total costs consisting
of stock holding and transition costs.
Let $\omega$ denote the production and consumption rate for a certain product and $\rho$ denote
the pump capacity at which the pipeline is operated. Furthermore, let $T$ denote the cycle
time between two successive batches of the considered product and $\tau$ denote the time
necessary to transport a batch of a product from provider to consumer location.45 The
total consumption/production during a cycle is defined by $F = T \cdot \omega$ which is transported
for $\tau$ periods in the pipeline. Assuming that $\rho > \omega$, the time span necessary to fill the
pipeline is $T^{\text{fill}} = F/\rho = T \cdot \omega/\rho$. Hence, the stock-up time is
$T^{\text{stock}} = T - T^{\text{fill}} = T \cdot (1 - \omega/\rho)$
and the maximum stock level $L$ obtained at both locations is calculated by multiplying
the stock-up time $T^{\text{stock}}$ with the production rate $\omega$, thus,
$L = \omega \cdot T^{\text{stock}} = \omega \cdot T \cdot (1 - \omega/\rho)$.
The stock levels obtained in pipeline and inventories at both locations are depicted in
Figure 3.5.

44 A many-to-many pipeline network with reversal flows is provided in Moura et al. (2008).
45 I.e. it is assumed that the pipeline is continuously operated at a constant pump rate for all products.
Since production and consumption rates are equal, the total stock in the system is
constant at the level $\bar{L}(T) = L + \tau \cdot \omega = \omega \cdot (\tau + T \cdot (1 - \omega/\rho))$
and depends only on the cycle length $T$. The holding costs decrease with decreasing cycle
length down to the theoretical minimum stock of $\tau \cdot \omega$ in case of a permanent material
flow.46 The total stock level splits into an average stock level of $L/2$ at each of the two sites
and the average stock in the pipeline given by $\tau \cdot \omega$.
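
The quantities derived above can be traced with a few lines of Python. The following is only a minimal sketch: the function name `batch_flow_stock` and the numerical values (a demand rate of 1,000 tons per day, a pump rate of 7,000 tons per day, a two-day cycle and a one-day transport time) are illustrative assumptions and not part of the model description.

```python
def batch_flow_stock(omega, rho, cycle_time, tau):
    """Return batch size, fill time, stock-up time and stock levels for one product."""
    if rho <= omega:
        raise ValueError("the pump rate rho must exceed the rate omega")
    batch = cycle_time * omega                # F = T * omega
    fill_time = batch / rho                   # T_fill = F / rho
    stock_up_time = cycle_time - fill_time    # T_stock = T * (1 - omega / rho)
    max_stock = omega * stock_up_time         # L = omega * T_stock
    total_stock = max_stock + tau * omega     # L_bar(T) = L + tau * omega
    return {
        "batch": batch,
        "fill_time": fill_time,
        "stock_up_time": stock_up_time,
        "max_stock": max_stock,
        "total_stock": total_stock,
    }


# Illustrative values only (assumptions, not taken from the text).
stock = batch_flow_stock(omega=1000.0, rho=7000.0, cycle_time=2.0, tau=1.0)
for name, value in stock.items():
    print(f"{name}: {value:.2f}")
```

Halving the cycle time in this sketch halves the maximum site stock, while the pipeline share of the total stock stays at the product of transport time and rate.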
However, as a counterbalance, the smaller the batches the higher the number of product
transitions and related costs. Since each interface is associated with some financial effort,
smaller batches increase transition costs. Hence, there is the typical structure of lot-sizing
problems and a strong analogy to the economic lot scheduling problem (ELSP).47
In general the ELSP aims at identifying the optimal recurring sequence and size of
product batches on a single machine in order to minimize the sum of holding and
set-up/transition costs. More formally, for a given set of products $S$ there are constant
consumption rates $\omega_s$, production rates $\rho_s$, set-up cost rates $c_s^{\text{set}}$, and holding
cost rates $c_s^{\text{hold}}$.48 The total expected costs per period are calculated by

$$TC^{\text{ELSP}} = \sum_{s \in S} \left( \frac{c_s^{\text{set}}}{T_s} + \frac{c_s^{\text{hold}} \cdot \omega_s \cdot T_s}{2} \cdot \left(1 - \frac{\omega_s}{\rho_s}\right) \right) \tag{3.7}$$

where $\frac{1}{2} \cdot \omega_s \cdot T_s \cdot (1 - \omega_s/\rho_s) = L/2$ is the average stock level of product $s$.
The difficulty in determining the cycle times $T_s$ is caused by the feasibility constraint, i.e.
that the batches scheduled on the machine must not overlap at any time. In fact, it can be
shown that finding the optimal solution and checking a schedule for sequence feasibility are
both NP-hard.49 Hence, there is a lot of literature about heuristics for this problem.50 Sequence
feasibility refers to the fact that based on the cycle times $T_s$ and production/pumping
times $T_s \cdot \omega_s/\rho_s$ a schedule can be built. This fundamental schedule/cycle has a total length
of $T^{\text{cycle}} = \max_{s \in S} T_s$ and is repeated infinitely, i.e. every $T^{\text{cycle}}$ periods this schedule
starts anew. Hence, it has to be assured that each product is scheduled exactly every $T_s$ periods,
also across subsequent fundamental cycles.
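
To make the cost function (3.7) concrete, the following sketch evaluates it for given cycle times. The function name `elsp_cost_per_period`, the two products "A" and "B" and all parameter values are hypothetical and serve only as an illustration; sequence feasibility is not checked here.

```python
def elsp_cost_per_period(products, cycle_times):
    """Evaluate the ELSP cost function (3.7) for given cycle times T_s."""
    total = 0.0
    for name, p in products.items():
        t = cycle_times[name]
        setup_part = p["c_set"] / t  # set-up costs per period
        # average stock level 0.5 * omega_s * T_s * (1 - omega_s / rho_s)
        average_stock = 0.5 * p["omega"] * t * (1.0 - p["omega"] / p["rho"])
        total += setup_part + p["c_hold"] * average_stock
    return total


# Hypothetical two-product instance (all values are assumptions).
products = {
    "A": {"c_set": 100.0, "c_hold": 1.0, "omega": 10.0, "rho": 20.0},
    "B": {"c_set": 50.0, "c_hold": 2.0, "omega": 5.0, "rho": 20.0},
}
cycle_times = {"A": 2.0, "B": 4.0}
cost = elsp_cost_per_period(products, cycle_times)
print(f"cost per period: {cost:.2f}")
```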
46 Though, the theoretical minimum is possible in the one-product case only.
47 See e.g. Narro Lopez and Kingsman (1991) for a review.
48 In analogy to the production states of a production plant introduced above, a product s transported
   via the pipeline defines the current state of the pipeline.
49 I.e. there is no algorithm known that can solve this problem in polynomial time. See Hsu (1983).
50 See e.g. Chatfield (2007) for a concise review.
[Figure 3.5 consists of three panels (Provider, Pipeline, Consumer), each plotting the stock
level over time.]

Figure 3.5: Inventory pattern in a batch flow system

The proposed problem with equal consumption and production rates can be seen as
a special case of the classic ELSP by assuming constant pumping rates for all products,
i.e. $\rho_s = \rho$, and no set-up times. Additionally, in this case the cost function has to be
adjusted according to the total stock level in the entire supply system (provider, pipeline,
and consumer stock). Hence, (3.7) changes to

$$TC^{\text{ELSP-BP}} = \sum_{s \in S} \left( \frac{c_s^{\text{set}}}{T_s} + c_s^{\text{hold}} \cdot \omega_s \cdot \left( \tau + T_s \cdot \left(1 - \frac{\omega_s}{\rho}\right) \right) \right) \tag{3.8}$$

where $\tau$ denotes the pumping time to transport the material through the pipeline and
$\bar{L} = \omega_s \cdot (\tau + T_s \cdot (1 - \omega_s/\rho))$ is the total stock level along the pipeline. If
$\sum_{s \in S} \omega_s \leq \rho$, a feasible schedule always exists.51 If sequence feasibility is neglected,
the optimal cycle times $T_s^*$ can be derived by setting the derivative of (3.8) with respect to
$T_s$ to zero, which results in

$$T_s^* = \sqrt{\frac{c_s^{\text{set}}}{c_s^{\text{hold}} \cdot \omega_s \cdot \left(1 - \frac{\omega_s}{\rho}\right)}} . \tag{3.9}$$

If the set of independent solutions results in a schedule feasible w.r.t. sequence and
capacity, this is the optimal solution for (3.8) at the same time. Otherwise, heuristics for
the classic ELSP can be applied.52
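
As a quick plausibility check of (3.9), the sketch below computes the independent cycle time of one product and compares the cost (3.8) at this value with the cost at neighbouring cycle times. The helper names and the set-up cost of 1,000 are hypothetical assumptions; the remaining values are chosen merely for illustration.

```python
import math


def independent_cycle_time(c_set, c_hold, omega, rho):
    """Cycle time T_s* according to (3.9), ignoring sequence feasibility."""
    return math.sqrt(c_set / (c_hold * omega * (1.0 - omega / rho)))


def elsp_bp_cost_single(t, c_set, c_hold, omega, rho, tau):
    """Contribution of one product to the cost function (3.8)."""
    return c_set / t + c_hold * omega * (tau + t * (1.0 - omega / rho))


# Assumed illustrative parameters for a single product.
params = {"c_set": 1000.0, "c_hold": 0.10, "omega": 3000.0, "rho": 7000.0}
t_star = independent_cycle_time(**params)
cost_at_optimum = elsp_bp_cost_single(t_star, tau=1.0, **params)
print(f"T* = {t_star:.3f}, cost per period = {cost_at_optimum:.2f}")
```

Note that the transport time enters (3.8) only as a constant term, which is why it does not appear in (3.9).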
If subsequent batches in the pipeline are separated by interfaces, the transition costs
may depend on the product sequence as e.g. reprocessing costs may differ for different
mixtures. Therefore, a sequence-dependent economic lot scheduling problem (sELSP)
has to be formulated to represent the case of interface separation. To reformulate (3.8),
let $c_{st}^{\text{trans}}$ denote the transition costs for subsequent batches of products $s$ and $t$.
Then, the set-up cost part per period of time is replaced by the total transition costs
occurring in a fundamental cycle divided by the fundamental cycle’s total length

$$\min \rightarrow TC^{\text{sELSP-BP}} = \frac{1}{T^{\text{cycle}}} \left( \sum_{s \in S} \sum_{t \in S} \sum_{i \in I} \left( c_{st}^{\text{trans}} \cdot x_{sti} \right) \right) + \sum_{s \in S} \left( c_s^{\text{hold}} \cdot \omega_s \cdot \left( \tau + T_s \cdot \left(1 - \frac{\omega_s}{\rho}\right) \right) \right). \tag{3.10}$$

Note that replacing the stock holding costs part in (3.10) by
$\sum_{s \in S} \frac{c_s^{\text{hold}} \cdot \omega_s \cdot T_s}{2} \cdot (1 - \omega_s/\rho)$ leads
to the classic sELSP.53
To assure sequence feasibility, a mixed-integer non-linear program (MINLP) is
formulated for the ELSP-BP and sELSP-BP. Table 3.6 contains the required set of parameters
and variables.

51 This can be seen as an adaptation of the so-called capacity feasibility constraint of the classic ELSP,
   see e.g. Chatfield (2007).
52 For example Dobson’s approach (Dobson, 1987) or Haessler’s approach (Haessler, 1979).
53 See Wagner and Davis (2002) or Dobson (1992).

Sets
  $S = \{1, ..., S\}$       set of chemicals
  $I = \{1, ..., I\}$       set of lot positions
Parameters
  $M$                       large number
  $\tau$                    pumping time
  $\rho$                    pump rate
  $\omega_s$                demand rate of chemical $s \in S$
  $c_{st}^{\text{trans}}$   transition costs for a transition from chemical $s \in S$ to $t \in S$
                            (e.g. due to PIG insertion or interface handling)
  $c_s^{\text{hold}}$       holding costs for chemical $s \in S$
Decision variables
  $b_{si}$                  binary, 1 if at position $i = 0, ..., I$ a chemical $s \in S$ is scheduled
  $T_s$                     cycle time of chemical $s \in S$
  $t^b_i$                   starting time of position $i \in I$
Variables
  $x_{sti}$                 binary, 1 if from position $i-1$ to position $i$ a transition from
                            chemical $s \in S$ to $t \in S$ occurs
  $o_{sij}$                 binary, 1 if chemical $s \in S$ is lastly scheduled $j$ positions before $i \in I$
  $r_{si}$                  binary, 1 if position $i \in I$ is the last position where chemical
                            $s \in S$ is scheduled
  $rr_s$                    binary, 1 if chemical $s$ is the finally scheduled chemical in the
                            fundamental cycle
  $T^{\text{cycle}}$        fundamental cycle time

Table 3.6: Set of parameters and decision variables for the sELSP

Sequence feasibility requires that the lot sizes are determined such that the reach of
a certain chemical’s lot covers the time span until the production of the next lot of this
chemical starts. Therefore, assume that for a set of chemicals $S$ a set of lot positions $I$ is
available, with $|S| \leq |I|$. If the number of lot positions equals the number of chemicals, the
inequality $T_s \geq \sum_{t \in S} T_t \cdot \omega_t/\rho = T^{\text{cycle}}$ holds for each chemical’s cycle time $T_s$.
In this case, each chemical is scheduled at exactly one position in the fundamental cycle. The
sequence with minimum transition costs can be determined by solving a Travelling Salesman
problem where the transition cost rates $c_{st}^{\text{trans}}$ serve as distances between cities. However,
such a solution implies similar production rates $\omega_s$ and similar cost rates for each product. If
these parameters vary, an optimal fundamental cycle can be composed of heterogeneous
individual cycle lengths $T_s$. As a consequence, chemicals with short cycles have to be
scheduled more than once in the fundamental cycle. Hence, either $|I| > |S|$ or $|I| \gg |S|$
must hold for the number of positions. Sequence feasibility in such a schedule requires that
for each chemical the time span between two successive lots of this chemical is equal to
its cycle time $T_s$. Formally, the set of constraints (3.11)-(3.29) assures sequence feasibility
in the general setting.

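\begin{align}
t^b_{i+1} &\geq t^b_i + T_s \cdot \frac{\omega_s}{\rho} - M \cdot (1 - b_{si}) && \forall i \in I,\ s \in S \tag{3.11} \\
T^{\text{cycle}} &\geq t^b_i + T_s \cdot \frac{\omega_s}{\rho} - M \cdot (1 - b_{si}) && \forall i \in I,\ s \in S \tag{3.12} \\
T_s &\geq t^b_i + T^{\text{cycle}} - t^b_j - M \cdot (3 - o_{sii} - b_{si} - r_{sj}) && \forall i, j \in I,\ j \geq i,\ s \in S \tag{3.13} \\
T_s &\geq t^b_i - t^b_{i-j} - M \cdot (3 - o_{sij} - b_{s(i-j)} - b_{si}) && \forall i, j \in I,\ j < i,\ s \in S \tag{3.14} \\
T_s &\leq t^b_i + T^{\text{cycle}} - t^b_j + M \cdot (3 - o_{sii} - b_{si} - r_{sj}) && \forall i, j \in I,\ j \geq i,\ s \in S \tag{3.15} \\
T_s &\leq t^b_i - t^b_{i-j} + M \cdot (3 - o_{sij} - b_{s(i-j)} - b_{si}) && \forall i, j \in I,\ j < i,\ s \in S \tag{3.16} \\
r_{si} &\geq b_{si} - \sum_{l=i+1}^{I} b_{sl} && \forall s \in S,\ i \in I \tag{3.17} \\
\sum_{i \in I} r_{si} &= 1 && \forall s \in S \tag{3.18} \\
\sum_{i \in I} b_{si} &\geq 1 && \forall s \in S \tag{3.19} \\
\sum_{s \in S} b_{si} &\leq 1 && \forall i \in I \tag{3.20} \\
b_{s0} &= 1 && \forall s \in S \tag{3.21} \\
\sum_{s \in S} b_{si} &\leq \sum_{s \in S} b_{s(i-1)} && \forall i \in I \tag{3.22} \\
x_{sti} &\geq b_{s(i-1)} + b_{ti} - 1 && \forall i \in I,\ i > 1,\ s, t \in S \tag{3.23} \\
x_{sti} &\leq \frac{b_{s(i-1)} + b_{ti}}{2} && \forall i \in I,\ i > 1,\ s, t \in S \tag{3.24} \\
x_{st1} &\geq rr_s + b_{t1} - 1 && \forall s, t \in S \tag{3.25} \\
x_{st1} &\leq \frac{rr_s + b_{t1}}{2} && \forall s, t \in S \tag{3.26} \\
o_{sij} &\geq b_{s(i-j)} - \sum_{l=i-j+1}^{i-1} b_{sl} && \forall s \in S,\ i, j \in I \tag{3.27} \\
\sum_{j=1}^{i} o_{sij} &= 1 && \forall s \in S,\ i \in I \tag{3.28} \\
rr_s &\geq r_{si} - \sum_{t \in S} \sum_{j=i+1}^{I} r_{tj} && \forall s \in S,\ i \in I,\ i < I \tag{3.29}
\end{align}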

Inequalities (3.11) assure that the starting time $t^b_{i+1}$ of the lot at position $i+1$ is not
smaller than the previous lot’s starting time $t^b_i$ plus the time required to pump the quantity
of the chemical scheduled at position $i$. Constraints (3.12) bound the fundamental cycle time
$T^{\text{cycle}}$ from below by the maximum of all starting times $t^b_i$ (plus corresponding
pumping time). The time span between two successive lots of a certain chemical $s$ has to be
exactly $T_s$. There are two cases to be distinguished: First, when a new fundamental cycle
starts, the time span between the last position, say $j$, at which this chemical is scheduled
in the old fundamental cycle ($r_{sj} = 1$) and the first position in the new fundamental cycle,
say $i$, at which the chemical is scheduled has to be restricted. Then, $o_{sii} = 1$ and $b_{si} = 1$
indicate that chemical $s$ is scheduled at position $i$ and has been lastly scheduled $i$ positions
before, which is at position 0 or in the previous fundamental cycle. This case is modelled
by constraints (3.13) and (3.15).
Second, when a chemical is scheduled at least at two positions in a fundamental cycle,
say $i$ and $k$ with $i > k$, and nowhere in between, then this chemical is previously scheduled
$j = i - k$ positions before position $i$. Hence, $o_{si(i-k)} = o_{sij} = 1$ and $b_{si} = b_{sk} = b_{s(i-j)} = 1$.
Thus, constraints (3.14) and (3.16) ensure that $t^b_i - t^b_k = T_s$.
Constraints (3.17) and (3.18) define the last position at which each chemical $s$ is
scheduled. Constraints (3.19)-(3.20) assure that each chemical is scheduled at least once in
the fundamental cycle and that no more than one chemical is scheduled at each position.
Constraints (3.21) indicate that all chemicals are scheduled in the (previous) fundamental cycle.
Constraints (3.22) avoid intermediate idle positions within the fundamental cycle.
Constraints (3.23)-(3.26) set the transition variables $x_{sti} = 1$ iff $b_{s(i-1)} = 1$ and $b_{ti} = 1$. Here,
constraints (3.25)-(3.26) define $x_{st1} = 1$ iff chemical $s$ is the last chemical and chemical $t$
is the first chemical scheduled in a fundamental cycle.
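
That the pair of inequalities (3.23)-(3.24) indeed forces the transition variable to equal the logical AND of the two position variables can be verified by enumeration. The sketch below is only an illustration; the function name `feasible_x` is a hypothetical helper.

```python
from itertools import product


def feasible_x(b_prev, b_curr):
    """Binary values x with x >= b_prev + b_curr - 1 and x <= (b_prev + b_curr) / 2."""
    return [
        x
        for x in (0, 1)
        if x >= b_prev + b_curr - 1 and x <= (b_prev + b_curr) / 2
    ]


# Enumerate all combinations of the two binary position variables.
for b_prev, b_curr in product((0, 1), repeat=2):
    print(b_prev, b_curr, feasible_x(b_prev, b_curr))
```

For every combination exactly one value of the transition variable remains feasible, so no additional constraint is needed to fix it.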
Constraints (3.27)-(3.28) set $o_{sij} = 1$ iff chemical $s$ is previously scheduled at position
$i - j$ (i.e. $b_{s(i-j)} = 1$). Finally, constraints (3.29) set $rr_s = 1$ iff chemical $s$ is the chemical
scheduled last in a fundamental cycle.
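
The effect of constraints (3.13)-(3.16) can be illustrated by a simple checker that tests a given schedule for sequence feasibility. This is a sketch under stated assumptions, not the MINLP itself: the function name `check_sequence_feasibility` is hypothetical, and the numbers are the rounded sELSP-BP-IF values reported in Tables 3.10 and 3.11 of Example 5, which is why a tolerance of 0.002 days is used.

```python
def check_sequence_feasibility(sequence, start_times, cycle_times, t_cycle, tol=1e-6):
    """Check that successive lots of each chemical start exactly T_s apart."""
    positions = {}
    for pos, chem in enumerate(sequence):
        positions.setdefault(chem, []).append(pos)
    if set(cycle_times) - set(positions):
        return False  # every chemical has to be scheduled at least once
    for chem, pos_list in positions.items():
        starts = [start_times[p] for p in pos_list]
        # gaps between successive lots within the fundamental cycle
        gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
        # gap from the last lot to the first lot of the next fundamental cycle
        gaps.append(starts[0] + t_cycle - starts[-1])
        if any(abs(gap - cycle_times[chem]) > tol for gap in gaps):
            return False
    return True


# Rounded values of the sELSP-BP-IF solution of Example 5.
sequence = ["Pygas", "Benzene", "Pygas", "Naphtha"]
start_times = [0.0, 0.428, 1.385, 1.583]
cycle_times = {"Naphtha": 2.769, "Pygas": 1.385, "Benzene": 2.769}
feasible = check_sequence_feasibility(sequence, start_times, cycle_times, 2.769, tol=0.002)
print(feasible)
```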
The sELSP-BP constituted by (3.10)-(3.29) refers to the situation when transition costs
are caused by efforts for PIG injection to separate the batches. However, if no PIG is
used for separation, an interface is formed. This interface has to be treated like one of the
two parental chemicals or is specially treated.
In the former case, some adaptations are necessary. First, let $d_{st}$ represent the quantity
of material which is subtracted from ($d_{st} < 0$) or added to ($d_{st} > 0$) a batch due to an
interface of the chemical sequence $s \rightarrow t$. Note that this quantity typically does not depend
on the quantity of the parental batches but on the pressure and diameter of the pipeline.54
As both parameters are assumed to be constant, $d_{st}$ is a parameter. Moreover, it holds
that $d_{st} = -d_{ts}$ as the loss of one parental chemical is the surplus of the other one.55
The size of a batch received at the consumer location now depends on its predecessor
and successor. In turn, the injected batch size has to compensate the correction of the
received batch size due to the interfaces. Let $q_{si}$ denote the total batch size correction at
the consumer location for the batch at position $i$ and material $s$. Then,
$q_{si} = \sum_{t \in S} d_{ts} \cdot x_{tsi} + \sum_{u \in S} d_{su} \cdot x_{su(i+1)}$ holds. Note that due to the specific
structure of $d_{st}$ also $\sum_{s \in S} \sum_{i \in I} q_{si} = 0$
holds, i.e. no quantity is lost. Figure 3.6 illustrates a situation for three subsequent batches
of chemicals $t$, $s$ and $u$ at positions $i - 1$ to $i + 1$, respectively.
54 This holds as long as the batch is larger than the critical mixing quantity. See e.g. Hall and Nicholls
   (1980) or Cafaro and Cerdá (2004).
55 Note that $d_{st} = -d_{ts}$ is not a necessary condition as it implies equal parental shares constituting
   the interface. It suffices to force $\mathrm{sign}(d_{st}) = -\mathrm{sign}(d_{ts})$ where $\mathrm{sign}(\cdot)$ denotes the sign
   function. However, this generalization requires some more notation and, therefore, is not discussed here.

position i − 1          position i                position i + 1
chemical t              chemical s                chemical u
batch − d_ts            batch + d_st − d_su       batch + d_us

interface between t and s: |d_ts| + |d_st|        interface between s and u: |d_su| + |d_us|

Figure 3.6: Illustration of interface calculation in batch pipeline systems.

As Figure 3.6 shows, the interface between chemicals $t$ and $s$ is added to the batch of
chemical $s$ (i.e. $d_{ts} < 0$ and $d_{st} > 0$). In contrast, the interface between chemicals $s$ and $u$ is
added to the batch of chemical $u$ (i.e. $d_{us} < 0$ and $d_{su} > 0$). It holds $b_{t(i-1)} = b_{si} = b_{u(i+1)} = 1$
and it follows from (3.23) and (3.24) that $x_{tsi} = x_{su(i+1)} = 1$. Consequently, $q_{si} = d_{ts} - d_{su} < 0$
holds since $|d_{ts}| < |d_{su}|$ and the batch of chemical $s$ at position $i$ is reduced by $q_{si}$ units.
The batch size of a chemical $s$ is given by $T_s \cdot \omega_s$, i.e. due to the interfaces the realized
batch size is $T_s \cdot \omega_s + q_{si}$. Due to the correction of the batch sizes, the pumping times have
to be adjusted accordingly. I.e. for $q_{si} > 0$ pump times can be reduced such that the pumping
time expression changes to $(T_s \cdot \omega_s - q_{si})/\rho$. Thus, (3.11) and (3.12) change to

\begin{align}
t^b_{i+1} &\geq t^b_i + \frac{T_s \cdot \omega_s - q_{si}}{\rho} - M \cdot (1 - b_{si}) && \forall i \in I,\ s \in S \tag{3.30} \\
T^{\text{cycle}} &\geq t^b_i + \frac{T_s \cdot \omega_s - q_{si}}{\rho} - M \cdot (1 - b_{si}) && \forall i \in I,\ s \in S \tag{3.31}
\end{align}

which implies that the pumping times depend on the pumping sequence, too. Additionally,
the objective function is affected since the batch size corrections depend on the sequence.
I.e. (3.10) changes to

$$\min \rightarrow TC^{\text{sELSP-BP-IF}} = \frac{1}{T^{\text{cycle}}} \left( \sum_{s \in S} \sum_{t \in S} \sum_{i \in I} \left( c_{st}^{\text{trans}} \cdot x_{sti} \right) \right) + \sum_{s \in S} \left( c_s^{\text{hold}} \cdot \left( \sum_{i \in I} q_{si} + \omega_s \cdot \left( \tau + T_s \cdot \left(1 - \frac{\omega_s}{\rho}\right) \right) \right) \right). \tag{3.32}$$

The remaining constraints (3.13)-(3.29) remain unchanged. This constitutes the sELSP
model with global stocks and interface handling (sELSP-BP-IF).
A third option to handle interfaces is extraction and external post-processing. This
implies that all batches are reduced by a certain amount due to the associated interfaces. To
reflect this specific interface handling procedure in the sELSP-BP-IF, the values of $d_{st}$
have to be strictly negative such that the $q_{si}$ are negative, too. Note that this implies a total
loss of material due to the pumping process.
Note further that the MINLPs introduced above are hard to solve to optimality even
for very small problem sizes. Presumably, heuristics developed for the classic ELSP
can be adapted to the sELSP, sELSP-BP, and sELSP-BP-IF with little effort, e.g. by
solving a TSP-like sub-problem to account for the sequence dependency.56 However, the
development of efficient heuristics is not in the scope of this work. For the relevant
problem sizes with a few chemicals discussed here, optimization is still possible within
reasonable time (i.e. within some minutes at most). To show the applicability of the
sELSP-BP and sELSP-BP-IF, the following Example 5 describes a case study from the basic
chemical industry.

56 Another approach which promises a good prospect for adoption is the genetic optimization approach
   of Chatfield (2007).

Example 5 (Batch pipeline sequencing). Suppose a one-to-one serial pipeline connects
two chemical sites. The provider site produces the raw and intermediate materials
for the consumer site. Suppose that the consumer site produces Styrene-based chemicals
such as Polystyrene and Styrene-Butadiene rubber. Raw materials for Styrene production
are Benzene and Ethylene. Among others, a steam cracker provides Ethylene and Pygas,
which is subsequently refined to Benzene. It is assumed that the capacities of these plants
are unbalanced such that the deficit of Pygas and Benzene has to be imported.
Additionally, the cracker feed, Naphtha, has to be imported.
These three chemicals are miscible, chemically related liquids and can be transported
via the same pipeline. Pygas and Naphtha are mixtures of hydrocarbons, whereby Pygas
can be interpreted as a sub-mixture primarily consisting of aromatic chemicals with main
component Benzene. At the provider site these three chemicals are in surplus and, thus, are
transported via a pipeline. Ordering these chemicals by increasing purity57 leads to the
sequence Naphtha, Pygas, Benzene. Assume that two possibilities for pipeline operation
are discussed: Either the batches are separated by PIGs such that no interfaces occur or
the interfaces are "downgraded" to the less pure parental chemical.
The transition costs in case of PIG separation depend on the monetary effort for PIG
handling, e.g. redistribution of PIGs as well as depreciation of the PIG due to capital
commitment. Since Benzene is a highly reactive chemical it is assumed that two kinds of
PIGs are used. For separation of Naphtha and Pygas a PIG with lower investment costs
can be used whereas for the separation of Benzene a more robust device has to be used
which induces higher investments and, thus, higher capital commitment costs. Table 3.7
shows the transition costs $c_{st}^{\text{trans}}$ given for this scenario.

chemical s    transition cost $c_{st}^{\text{trans}}$ [€] to chemical t
              Naphtha    Pygas    Benzene
Naphtha             0      500      1,000
Pygas             500        0      1,000
Benzene         1,000    1,000          0

Table 3.7: Transition costs $c_{st}^{\text{trans}}$ for the PIG insertion scenario

For the downgrading option the interfaces are re-processed. E.g. an interface consisting
of Pygas and Naphtha is handled as Naphtha and stored in the Naphtha tanks where
the concentration of aromatics increases due to the Pygas inflow. The cracking process
separates these components again. This implies in turn that downgraded ingredients
are re-processed, leading to additional processing costs. I.e. an interface of Benzene and
Naphtha has to be downgraded to Naphtha and the Benzene part is re-processed twice.
Hence, sequence-dependent transition costs $c_{st}^{\text{trans}}$ are calculated as the quantity to be
re-processed multiplied by the processing cost rate and the number of re-processing steps.
The processing cost rate is assumed to be 100 € per ton both for cracking of Naphtha and
for separation of Benzene from Pygas. Table 3.8 shows the interface quantities $d_{st}$
and associated transition costs $c_{st}^{\text{trans}}$ for this scenario.

57 Purity refers to the fact that Naphtha and Pygas are mixtures of multiple hydrocarbons whereby
   Pygas is a sub-fraction of Naphtha (obtained after cracking) and Benzene is a pure component of
   Pygas (obtained after distillation).

chemical s    interface quantities $d_{st}$ [tons]     transition cost $c_{st}^{\text{trans}}$ [€]
              Naphtha    Pygas    Benzene             Naphtha    Pygas    Benzene
Naphtha             0       10         50                   0    1,000     10,000
Pygas             -10        0         10               1,000        0      1,000
Benzene           -50      -10          0              10,000    1,000      1,000

Table 3.8: Interface quantities $d_{st}$ and transition costs $c_{st}^{\text{trans}}$ for the interface scenario
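
The transition costs per fundamental cycle implied by Tables 3.7 and 3.8 can be compared for different pumping sequences with the sketch below. The function name `cycle_transition_cost` and the two candidate sequences are illustrative assumptions; only the off-diagonal cost entries of the tables are used.

```python
# Off-diagonal transition costs in EUR, taken from Table 3.7 (PIG separation).
PIG_COST = {
    ("Naphtha", "Pygas"): 500, ("Pygas", "Naphtha"): 500,
    ("Naphtha", "Benzene"): 1000, ("Benzene", "Naphtha"): 1000,
    ("Pygas", "Benzene"): 1000, ("Benzene", "Pygas"): 1000,
}
# Off-diagonal transition costs in EUR, taken from Table 3.8 (interface downgrading).
INTERFACE_COST = {
    ("Naphtha", "Pygas"): 1000, ("Pygas", "Naphtha"): 1000,
    ("Naphtha", "Benzene"): 10000, ("Benzene", "Naphtha"): 10000,
    ("Pygas", "Benzene"): 1000, ("Benzene", "Pygas"): 1000,
}


def cycle_transition_cost(sequence, cost):
    """Sum the transition costs of a cyclically repeated pumping sequence."""
    n = len(sequence)
    return sum(cost[(sequence[i], sequence[(i + 1) % n])] for i in range(n))


once = ["Naphtha", "Pygas", "Benzene"]                 # each chemical once
pygas_twice = ["Benzene", "Pygas", "Naphtha", "Pygas"]  # Pygas as intermediate batch

for label, cost in (("PIG", PIG_COST), ("interface", INTERFACE_COST)):
    print(label, cycle_transition_cost(once, cost), cycle_transition_cost(pygas_twice, cost))
```

In the interface scenario, avoiding the direct Naphtha/Benzene transition by an additional Pygas batch lowers the transition costs per cycle from 12,000 € to 4,000 €, whereas in the PIG scenario the additional batch only adds costs.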

Typically, it can be assumed that the market value of (basic) chemicals increases with
each additional processing step. Hence, stock holding costs increase, too. The net demand
rates in tons per day for Naphtha, Pygas, and Benzene under normal processing
conditions can be found in Table 3.9, accompanied by stock holding cost rates in € per ton
and day.

chemical s    stock cost $c_s^{\text{hold}}$ [€/(ton⋅day)]    net demand rate $\omega_s$ [tons/day]
Naphtha       0.10                                 3,000
Pygas         0.15                                 1,000
Benzene       0.20                                 2,000

Table 3.9: Net demand rates $\omega_s$ and holding cost rates $c_s^{\text{hold}}$ for both scenarios

To complete the example’s setting, the maximum pipeline capacity is assumed to be
$\rho = 7{,}000$ tons per day. Since the total net demand amounts to 6,000 tons per day, the
pipeline capacity suffices to supply the consumer site
with all three raw and intermediate chemicals. The pipeline is assumed to be operated at
maximum capacity (as operating costs are optimized for this pump rate) or to be completely
inactive.
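
The capacity condition can be checked directly from the data of Table 3.9. The following lines are a minimal sketch; the variable names are chosen for illustration only.

```python
RHO = 7000.0  # maximum pipeline capacity in tons per day
NET_DEMAND = {"Naphtha": 3000.0, "Pygas": 1000.0, "Benzene": 2000.0}  # Table 3.9

total_demand = sum(NET_DEMAND.values())
utilization = total_demand / RHO          # share of time the pipeline has to pump
capacity_feasible = total_demand <= RHO   # capacity feasibility condition

print(f"total demand: {total_demand:.0f} t/day, utilization: {utilization:.3f}")
```

The pipeline therefore has to pump roughly six sevenths of the time, which leaves about one seventh of each cycle as idle time.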
In this example two scenarios for pipeline operation are investigated:

• separate subsequent batches by means of separation PIGs

• interface downgrading.

To determine the optimal pumping schedule, an sELSP-BP has to be solved for the first
scenario and an sELSP-BP-IF for the second scenario. To compare these scenarios,
both models are solved to optimality under the parameter setting
provided in Table 3.9 as well as Table 3.7 or Table 3.8, respectively. Up to 15 lot positions
are allowed in each model (i.e. $|I| = 15$).58 Both solutions differ substantially with total
costs per period of 3,270 € and 3,737 € for the sELSP-BP and sELSP-BP-IF, respectively.
Table 3.10 shows the cycle times as well as the starting times for both models.

58 Both MINLPs are solved on an Intel Core 2 Q6700 at 2.66 GHz with BONMIN in about 140 seconds,
   see Bonami and Lee (2007).

model           cycle times $T_s$                   starting times $t^b_i$ of position i
                Naphtha    Pygas    Benzene      1      2        3        4
sELSP-BP        2.066      2.066    2.066        0      0.417    1.023    –
sELSP-BP-IF     2.769      1.385    2.769        0      0.428    1.385    1.583

Table 3.10: Resulting optimal pumping cycles for the sELSP-BP and sELSP-BP-IF
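
The sELSP-BP result can be retraced with a short calculation if all chemicals share one cycle time and each chemical is pumped once per fundamental cycle. The sketch below rests on two assumptions that are not stated in the example: the transport time is set to one day (`TAU = 1.0`), and the common cycle time is obtained by minimizing (3.10) for a single shared cycle time. Under these assumptions it yields a cycle time of about 2.066 days and period costs of roughly 3,270 €.

```python
import math

RHO = 7000.0   # pump rate in tons per day
TAU = 1.0      # ASSUMPTION: transport time in days, not given in the example
CHEMICALS = {  # holding cost rates and net demand rates from Table 3.9
    "Naphtha": {"c_hold": 0.10, "omega": 3000.0},
    "Pygas": {"c_hold": 0.15, "omega": 1000.0},
    "Benzene": {"c_hold": 0.20, "omega": 2000.0},
}
# One cycle with each chemical once (Table 3.7): 500 + 1,000 + 1,000 EUR.
TRANSITION_COST_PER_CYCLE = 500.0 + 1000.0 + 1000.0


def holding_slope():
    """Increase of holding costs per period for each extra day of cycle time."""
    return sum(c["c_hold"] * c["omega"] * (1.0 - c["omega"] / RHO) for c in CHEMICALS.values())


def common_cycle_time():
    """Minimizer of (3.10) when all chemicals share one cycle time."""
    return math.sqrt(TRANSITION_COST_PER_CYCLE / holding_slope())


def period_cost(t, tau=TAU):
    """Objective (3.10) for a common cycle time t."""
    pipeline_stock_cost = tau * sum(c["c_hold"] * c["omega"] for c in CHEMICALS.values())
    return TRANSITION_COST_PER_CYCLE / t + pipeline_stock_cost + holding_slope() * t


t_common = common_cycle_time()
print(f"common cycle time: {t_common:.3f} days, period cost: {period_cost(t_common):.0f} EUR")
```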

Table 3.10 shows that for the sELSP-BP a unique cycle time for all
products is the optimal solution such that each chemical is pumped every 2.066 days.
For the sELSP-BP-IF the fundamental cycle time is 2.769 days which is probably caused
by the generally higher level of transition costs forcing larger batches. The solution of
the sELSP-BP-IF yields more diverse cycle times. Here, Pygas is scheduled twice in a
fundamental cycle. This is caused by the exceedingly high transition costs between Naphtha
and Benzene which induce an intermediate batch of Pygas between Naphtha and Benzene
batches. Table 3.11 shows the pumping times of the chemicals and the idle times for each
batch position.

model           pumping times                      idle time of position i59
                Naphtha    Pygas    Benzene      1          2          3          4
sELSP-BP        0.885      0.295    0.590        0.091/2    0.091/3    0.113/1    –
sELSP-BP-IF     1.186      0.198    0.794        0.231/2    0.162/3    0.001/2    0.001/1

Table 3.11: Pumping times and idle times of the optimal schedules for the sELSP-BP and
sELSP-BP-IF

It can be observed in Table 3.11 that for the sELSP-BP-IF solution pumping times of
Naphtha and Benzene are prolonged in favour of a decrease of the Pygas batch size since
Pygas is scheduled twice. Note that the pumping times for the sELSP-BP-IF solution also
reflect the correction due to the interfaces. Figure 3.7 illustrates the fundamental cycle
and the associated interface quantities.
It can be observed that $q_{31} = -20$ and $q_{13} = 20$ which results in an increase of the
pumping time of Benzene and a decrease of the pumping time of Naphtha by approximately
$\frac{2}{700} \cdot 24 \cdot 60 \approx 4$ minutes which is almost negligible.
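
The pumping times of the sELSP-BP solution and the size of the interface correction can be recomputed from the pumping time expression used in (3.30). The sketch below is illustrative; the function name `pumping_time` is a hypothetical helper, and only the sELSP-BP row of Table 3.11 is recomputed.

```python
RHO = 7000.0  # pump rate in tons per day
OMEGA = {"Naphtha": 3000.0, "Pygas": 1000.0, "Benzene": 2000.0}  # Table 3.9


def pumping_time(cycle_time, omega, q=0.0, rho=RHO):
    """Pumping time (T_s * omega_s - q_si) / rho in days."""
    return (cycle_time * omega - q) / rho


# sELSP-BP: common cycle time of 2.066 days, no interface corrections.
pump_times = {name: round(pumping_time(2.066, w), 3) for name, w in OMEGA.items()}
print(pump_times)

# A correction of 20 tons changes the pumping time by 20 / rho days.
correction_minutes = 20.0 / RHO * 24 * 60
print(f"correction: {correction_minutes:.1f} minutes")
```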

Comparing both scenarios merely based on the period cost, the PIG separation shows a
cost advantage. However, not all costs are included in this model since the redistribution
of the PIGs is not considered in the sELSP-BP. Moreover, investment and maintenance
of the PIGs cause additional costs. As a result, the moderate advantage in period costs
compared to the interface downgrading option may vanish if an encompassing cost analysis
is conducted.

59 Behind the idle time the index of the chemical scheduled at position i is provided separated by a slash.
   Indices of chemicals are: Naphtha...1; Pygas...2; Benzene...3.

chemical                  3              2              1              2
batch size correction     q31 = −20      q22 = 0        q13 = +20      q24 = 0
interface quantities      -10 | -10      +10 | -10      +10 | +10      -10 | +10

Figure 3.7: Illustration of interface calculation in batch pipeline systems

3.2.3.2 Batch split pipelines

As the literature on pipeline scheduling has its roots in the petrochemical industry, a
straight-forward extension from batch flow pipeline systems to batch split systems is
natural. Here, typically a single source has to serve a set of distinct sinks with a set
of (petro-)chemicals. Typically, the source represents a refinery and the sinks represent
local fuel depots from which the local customer demand is served. Conversely, a harbour
may also serve as a supplier of different types of raw materials for a set of refineries.60
These sinks are connected to the source via a serial pipeline system. At the source and
the sinks a set of storage tanks is available for each product. Common assumptions for
the proposed models in this branch of literature are summarized as follows:

• General characteristics
– finite time horizon
– interface re-processing
– known and deterministic demands for each product at each depot
– known and deterministic production rates for each product at the source

• Pipeline system
– unidirectional flow
– pump rate range
– initial pipeline filling
– constant pressure

• Tank system
– tank level ranges
– initial tank levels.
60 See e.g. Más and Pinto (2003).

The main distinctions between the various approaches published are

• the objective function

• the modelling approach (discrete or continuous w. r. t. time and pipeline)

• incorporation of inventory/production planning problems.

In the following, the literature focusing on batch split pipeline scheduling is reviewed.
Integrative approaches containing pipeline transports in a supply chain context are out
of scope in this paragraph.61 Such approaches provide crude pipeline models to reflect
the transports in production networks over time but simplify technical details of pipeline
operation (such as the precise material tracking in time or interface handling) for the sake
of solvability of the entire model.62
Regarding the objectives, all approaches try to minimize the total costs of pipeline
operations over the planning horizon. Four cost categories are typically considered:

• pumping costs (energy consumption multiplied by the energy cost rate)

• inventory holding costs (at source and sinks)

• shortage costs

• interface costs (for reprocessing).

The proposed approaches differ in the considered cost categories and in the type of
formulation which is either a linear or a non-linear model. Inventory costs depend on the
duration and level of stock holding. Hence, if stock levels and the time periods are decision
variables, inventory cost calculation is non-linear as both variables have to be multiplied.
The pumping costs primarily depend on the energy consumption for pumping. Typically,
pipelines are assumed to be operated at constant pressure denoted by the constant pump
rate. In some cases, however, an adjustment of the pump rate is technologically possible
and economically reasonable, e.g. to speed up the transport of the chemicals. If this is the
case, the pump rate determines the duration of pumping. As the size of an injected batch
is in any case a decision variable, the time for pumping is represented by the ratio of batch
size and pump rate which determines the total energy consumption and, consequently,
the pumping costs. Hence, pumping costs are non-linear if the pump rate is a decision
variable. Given these criteria the literature classification scheme is set up as follows:
The indicator κ ⊆ (p, h, b, i) is a tuple representing the considered cost categories where p
indicates pumping costs, h indicates inventory holding costs, b indicates backorder costs,
and i indicates interface costs. Note that not all cost categories are considered in all
61
Examples for such integrative approaches can be found in Pitty et al. (2008); Neiro and Pinto (2004);
Pinto et al. (2000).
62
See Rejowski and Pinto (2008) and the remarks about the literature review therein.

references, i.e. |κ| ≤ 4. The tuple λ ∈ {n, l}^|κ| indicates whether the corresponding cost
category is modelled linearly (l) or non-linearly (n).
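The two sources of non-linearity described above can be made explicit in a short sketch. The symbols are illustrative assumptions and not the notation of the reviewed papers: $I_k$ is a stock level and $\Delta_k$ the length of time slot $k$, $h$ is a holding cost rate, $q_b$ is the size of batch $b$, $v_b$ its pump rate, and $\tau_b$ the resulting pumping time.

```latex
\begin{align}
C^{\mathrm{hold}} &= h \sum_{k} I_k \,\Delta_k && \text{(stock level times slot length)} \\
\tau_b &= \frac{q_b}{v_b} && \text{(batch size over pump rate)}
\end{align}
```

If the slot lengths $\Delta_k$ are fixed (as in time-discrete models) or the pump rate $v_b$ is a constant, each expression is linear in the remaining decision variable.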
The second classification category distinguishes between approaches with a continuous
or a discrete formulation for time and pipeline. In a continuous formulation, continuous
coordinates are used to describe e.g. the position of a batch or a pipeline branch. The
indicator ν ∈ {c, d}^2 shows whether a continuous (c) or a discrete (d) formulation is chosen, where
the first component refers to time and the second to the pipeline. In
general, time-discrete formulations are more intuitive, but at the cost of larger model
dimensions in terms of the number of equations and variables. In contrast, time-continuous
formulations are more compact but less intuitive.63
For the pipeline a discrete formulation implies that the pipeline is represented as a
sequence of discrete segments. This eases the tracking of product batches along the
pipeline and the formulation of product removals at the depots. Table 3.12 summarizes
the relevant literature according to the proposed classification scheme.

Reference                             κ               λ               ν
Rejowski and Pinto (2003)             (p, h, i)       (l, l, l)       (d, d)
Cafaro and Cerdá (2004)               (p, h, i)       (l, l, l)       (c, c)
MirHassani (2008)                     (i)             (l)             (d, d)
Cafaro and Cerdá (2008)               (p, h, b, i)    (l, l, l, l)    (c, c)
Rejowski and Pinto (2008)             (p, h, i)       (n, n, l)       (c, d)
MirHassani and Fani Jahromi (2011)    (p, h, i)       (l, l, l)       (c, d)

Table 3.12: Classification of literature on scheduling of one-to-many pipeline systems
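The classification scheme lends itself to a small data structure. The following sketch encodes the rows of Table 3.12 and checks them against the definitions of κ, λ and ν; the function names are hypothetical and only serve illustration.

```python
# Rows of Table 3.12: reference -> (kappa, lambda, nu)
# kappa: cost categories (p, h, b, i); lambda: l = linear, n = non-linear;
# nu: (time, pipeline) with c = continuous, d = discrete.
TABLE_3_12 = {
    "Rejowski and Pinto (2003)": (("p", "h", "i"), ("l", "l", "l"), ("d", "d")),
    "Cafaro and Cerdá (2004)": (("p", "h", "i"), ("l", "l", "l"), ("c", "c")),
    "MirHassani (2008)": (("i",), ("l",), ("d", "d")),
    "Cafaro and Cerdá (2008)": (("p", "h", "b", "i"), ("l", "l", "l", "l"), ("c", "c")),
    "Rejowski and Pinto (2008)": (("p", "h", "i"), ("n", "n", "l"), ("c", "d")),
    "MirHassani and Fani Jahromi (2011)": (("p", "h", "i"), ("l", "l", "l"), ("c", "d")),
}


def is_consistent(kappa, lam, nu):
    """Check |lambda| = |kappa| <= 4 and nu in {c, d}^2 (hypothetical helper)."""
    return (
        len(kappa) == len(lam) <= 4
        and set(kappa) <= set("phbi")
        and set(lam) <= set("nl")
        and len(nu) == 2
        and set(nu) <= set("cd")
    )


def nonlinear_references(table):
    """References with at least one non-linearly modelled cost category."""
    return [ref for ref, (_, lam, _) in table.items() if "n" in lam]


def time_continuous_references(table):
    """References whose time formulation (first component of nu) is continuous."""
    return [ref for ref, (_, _, nu) in table.items() if nu[0] == "c"]
```

Such an encoding makes it easy to verify, for instance, that only one of the listed references contains non-linear cost terms.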

The literature referred to in Table 3.12 aims at minimizing the total operational costs
for a planning period. Most works take stock holding costs, pumping costs, and costs
for reprocessing the interface into account. Two exceptions are noteworthy: MirHassani
(2008) and Cafaro and Cerdá (2008).
MirHassani (2008) proposes a comparatively simple pipeline scheduling model considering
only interface reprocessing costs to determine optimal schedules. In this model the
pipeline is subdivided into equally-sized segments. Similarly, the time horizon is
subdivided into equal periods of time. However, in contrast to most other models, a branching
pipeline is modelled, i.e. the depots are not located along a serial main pipeline but are
connected via a sub-pipeline that branches from the main pipeline. Figure 3.8 depicts both
types of pipeline systems with one source (S1) and three depots (D1, D2, D3).
Although the problem structure for branching pipelines is generally similar to that of a
pipeline with multiple depots located along one main pipeline, branching complicates the
mathematical formulation because batches have to be tracked along the pipeline branches,
which equals the modelling of additional pipelines. In the follow-up paper (MirHas-
63
I.e. they contain fewer decision variables. See e.g. Rejowski and Pinto (2008) for a brief and Maravelias and
Grossmann (2003) for a more extensive discussion.

Figure 3.8: One-to-many pipeline types: (a) multiple depots along a (main) pipeline, (b) branching pipeline

sani and Fani Jahromi (2011)), the mathematical formulation is improved towards a
time-continuous model that performs better and is able to handle larger
problem instances.
Most of the literature on pipeline transportation considers customer demands for the
products at the depots to be served at the end of the planning horizon. An exception is
proposed in Cafaro and Cerdá (2008) where customer demands at the depots are associ-
ated with due-dates such that backorder costs are incorporated in the objective function.
Moreover, the work proposes a rolling horizon model for updating and re-scheduling pre-
viously determined schedules according to updated demand characteristics.
Besides the incorporation of backorder costs in multi-period models, the handling of
pipeline operation costs differs. In general, pipeline operation costs are driven by the
energy consumption which, in turn, depends on the quantity and speed of pumping. As
stated before, most pipeline models are adapted for a specific pump rate, i.e. models are
based on a fixed pump rate or pump rate range (see Rejowski and Pinto (2003); Cafaro
and Cerdá (2004, 2008)). However, these works incorporate time-varying energy cost
rates depending on the time of day. I.e. in specified periods (typically when total energy
consumption in a region is high) higher energy cost rates apply. Hence, pipeline
operation tends to be shifted to less expensive periods.
In contrast, in Rejowski and Pinto (2008) the speed of pumping is a decision variable
which leads to a non-linear term in the objective function since both pump rate and
batch length have to be determined. Furthermore, this work is noteworthy as it explicitly
accounts for the inventory holding costs, but at the expense of another non-linear term
in the objective function. Note that the time-continuous formulation requires that both
the period’s length and the tank level in a period are decision variables. The product of
these terms determines the holding costs. Cafaro and Cerdá (2004, 2008) approximate
the holding costs by averaging the stock levels at the depots.
Despite the complexity of the scheduling decisions, all proposed models can be solved
with standard solvers in a reasonable amount of time.64 This is because only small problem
instances are solved w.r.t. the number of products to be scheduled and the number of

64
Either e.g. CPLEX in the MILP case (see e.g. Rejowski and Pinto (2003); Cafaro and Cerdá (2008))
or e.g. CONOPT in the MINLP case (see Rejowski and Pinto (2008)).

periods/pipeline segments to be tracked. In practice, multi-product pipelines handle
hardly more than ten products. While the number of products to be transported is
limited, the time domain is not. Most approaches are rather short-term models that have
to be applied in a rolling horizon environment.65

3.2.3.3 Multi-source pipeline systems

To tackle more realistic planning problems of real-world pipeline systems, models for
multi-source pipeline systems have been developed recently. This extension of one-to-many
systems considerably increases complexity. Hence, first approaches rely on a heuristic
decomposition that splits the problem into three blocks:66

1. Allocation decision: determine which locations are potential candidates for injecting
and receiving batches.

2. Batch sizing: determine sizes of the batches to inject and receive.

3. Batch scheduling: determine the exact starting and receiving dates for each batch.

Naturally, the quality of these decomposition approaches is considerably affected by
the neglected interdependency of the decomposed decisions.67 Therefore, monolithic,
time- and volume-continuous MILP formulations have recently been proposed to integrate these
decisions.68 These models are straightforward adaptations of one-to-many approaches.69
The basic setting is adapted from the one-to-many case (see the previous paragraph),
i.e. a serially operated main pipeline is assumed with a refinery at its head and depots
along this pipeline. One key alteration is that depots now can receive and inject batches
in the pipeline. Most other assumptions are inherited from the one-to-many systems. In
particular, the product flow is still unidirectional and only one location at a time can
inject a batch in the pipeline.70
However, the one-at-a-time restriction blocks all depots except for the injecting and the
receiving location. Only pipeline segments connecting these depots are active, whereas
the remaining pipeline segments are idle. In general, however, there are no technical
restrictions prohibiting simultaneous product flows on idle pipeline segments. Hence, si-
multaneous pump runs may cause a drastic increase in the pipeline schedule’s efficiency.
Simultaneous pump runs are only possible if the used pipeline segments do not overlap,
i.e. simultaneous flows do not use the same pipeline segments. Hence, constraints
ensuring that simultaneous pump runs do not overlap have to be incorporated. Despite
65
See e.g. Cafaro and Cerdá (2008).
66
See e.g. Boschetto et al. (2008). Another decomposition approach is proposed by Moura et al. (2008).
Here, in the first stage injection and destination locations as well as batch sizes are determined which
are subsequently scheduled in the second stage.
67
See Boschetto et al. (2008).
68
See Cafaro and Cerdá (2009) and Cafaro and Cerdá (2010).
69
To be precise, these are extensions of Cafaro and Cerdá (2004, 2008).
70
See Cafaro and Cerdá (2009).

this increase in complexity, realistic problem instances of the resulting extended MILP can
still be solved to optimality in a reasonable amount of time. Surprisingly, for some examples
computation times even decrease compared to the one-at-a-time approach.71 More
importantly, the resulting schedules are more compact (increasing the pipeline utilization)
and reduce total operation costs.
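The non-overlap condition for simultaneous pump runs can be illustrated with a minimal sketch. It is not taken from Cafaro and Cerdá (2010); it assumes that locations are numbered 0, 1, 2, ... in flow direction, that segment k connects locations k and k + 1, and that a location may receive one batch while injecting another. All names are hypothetical.

```python
from itertools import combinations


def segments(run):
    """Pipeline segments used by a pump run (inject_at, receive_at).

    Assumption: segment k connects location k with location k + 1.
    """
    inject_at, receive_at = run
    if inject_at >= receive_at:
        raise ValueError("unidirectional flow requires inject_at < receive_at")
    return set(range(inject_at, receive_at))


def can_run_simultaneously(runs):
    """True if no pipeline segment is used by more than one pump run."""
    return all(
        segments(a).isdisjoint(segments(b)) for a, b in combinations(runs, 2)
    )
```

For example, a run from location 0 to 2 and a run from location 2 to 4 use disjoint segments, whereas runs from 0 to 3 and from 2 to 4 both require segment 2.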
In general, many-to-many systems include all other pipeline system configurations as
special cases. Since computational complexity of the models proposed in Cafaro and
Cerdá (2009, 2010) is limited and the general characteristics of pipeline system are the
same as for the previously described systems, the model proposed by Cafaro and Cerdá
(2010) can be seen as a generic basic model encompassing the previously published works
on one-to-many and many-to-many pipeline systems.
With respect to the technical characteristics of pipeline transports in basic chemical
industry the differences to petrochemical industry are limited. However, multi-product
pipelines in basic chemical industry are typically less widespread. Here, single-product
pipelines are dominating. For local material transports at chemical production sites,
multi-product pipelines do not make sense because a constant flow of materials among
the set of local production plants has to be realized. For trans-regional transports,
multi-product pipelines might be used although most intermediate and basic chemicals are
chemically not very similar. This implies that finding a common technical standard
configuration of a multi-product pipeline is a challenging task. Incompatible chemicals
cannot be transported without physical separation. Costs for separation or interface
reprocessing are often very high. Consequently, if multi-product pipelines are used in the
basic chemical industry, the number of different chemicals transported is usually smaller
than in petrochemical industry. Moreover, the number of partners along the pipeline can
be assumed to be smaller, since no depots for serving customer markets are required.
Instead, the pipelines are used for balancing supply and demand of raw and intermediate
chemicals among a set of (interdependent) chemical production sites. As a consequence,
the resulting scheduling problems are expected to be less complex than in petrochemical
industry. Available operational scheduling models, which have been originally designed
for petrochemical industry, can be adapted here. Since pipelines in chemical industry are
used to interconnect chemical production sites, the quantities to be distributed via the
pipeline system can be assumed to be rather constant. Hence, not an operational reactive
pumping schedule but a long-term pumping schedule (as provided by the sELSP-BP) is
more likely to be useful for this specific industry.

71
See Cafaro and Cerdá (2010).

3.3 Planning problems for rail operations


While pipelines are used for (almost) continuous transports of chemicals between and
inside chemical production sites, rail transports are (typically) used for smaller and more
sporadic shipments between chemical production sites and to customers with rail road
access. Rail transports are of special interest in basic chemical industry because large
quantities of material can be transported at once on a dense infrastructure compared
to waterway or pipeline networks.72 Particularly locations without access to navigable
waterways often depend on this mode of transport. Moreover, travelling times for rail
transports are often easier to calculate and uncertainty is smaller in contrast to waterway
transports. Hence, rail transports are typically a reliable and well predictable mode of
transport.
However, transport capacities of ships and barges are generally larger than train
capacities.73 Hence, transport cost rates per volume and distance unit are typically higher for
rail transports. Since most large-scale chemical production sites have waterway access,
raw materials are typically supplied via ship transports, but intermediate and minor basic
chemicals are often transported via rail, e.g. because the distribution structure is dispersed
and/or not all customers/suppliers have waterway access.
Compared to pipeline and ship, rail transports are more flexible,74 but require more
organizational and technical effort. Hence, the next subsection sheds light on technical and
organizational details of rail transports of chemical products. Subsequently, an
operational planning model is proposed that aims to support short-term transport decisions
in a chemical production network connected by rail links.

3.3.1 Technical and organizational prerequisites


Since rail transports depend on the availability of rail cars, transport capacities have to
be managed in accordance with the chemical quantities planned to be shipped. This (often)
forces chemical companies relying on rail transports to manage fleets of rail cars and
organize transports by themselves. Most often these rail car fleets are rented or bought.75
But although the rail cars are managed by chemical companies, the transport itself is
performed by rail transport service providers (or rail operators). Hence, the organization
of rail transports requires the coordination of product demands, transport capacities, and

72
At least in most developed countries (such as most countries in Europe and the USA) a denser railway
network can be assumed.
73
For transcontinental transports, ship and rail transports are not competing. However, in Europe
open sea and rail transports may compete. E.g. transport relations to and from central and western
European countries passing the Mediterranean or Black Sea can also be served by trans-Balkan
rail transports.
74
In the sense that the railway network is denser and, hence, more locations can be reached by rail
transports.
75
For instance, chemical companies operating in the US often own huge rail car fleets for domestic rail
transports, see Closs et al. (2003).

transport services. Besides these organizational issues, technical prerequisites have to be
considered for planning rail transport operations.
Special prerequisites are needed to exchange chemicals by rail. Necessarily, a link to
the common railroad network must exist. Such an access link often exists or can be built
easily. To be able to properly organize rail transports, shunting activities have to take
place. This includes the movement of rail cars for de-/composing and rearranging trains
as well as movements of rail cars to and from loading and unloading stations. The ability
to move and arrange rail cars depends on the available number and layout of shunting rail
tracks as well as the available manpower and shunting equipment (e.g. locomotives). The
interplay of these factors defines the shunting capacity, which can be measured e.g. by the
maximum number of trains to be de-/composed in a certain amount of time. However,
shunting activities are hard to plan and depend on a variety of organizational details,
such as the exact position of rail cars and their intended positions after shunting. Hence,
shunting yards are often bottlenecks when organizing rail transports in chemical industry.
The planning of shunting yard operations is a challenging task.76
The vehicles required to transport products by rail are called rail cars. A vast number
of rail car types exists, often specialized for certain products.77 Since most chemicals are
fluids or gases, the specific type of rail cars considered in chemical industry is the rail tank
car (RTC). Various sub-types of RTCs can be distinguished, where the volume to
be transported typically varies between 20 and 120 m³. Hence, the payload (in tons) carried by a
certain RTC depends on its volume and the density of the transported chemical.
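The relation between volume, density and payload can be sketched as follows. The function name and the density values in the usage note are hypothetical and purely illustrative; complete filling of the tank is assumed.

```python
def payload_tons(volume_m3, density_t_per_m3):
    """Payload of an RTC in tons: tank volume times density of the chemical.

    Assumption: the tank is filled completely.
    """
    if volume_m3 <= 0 or density_t_per_m3 <= 0:
        raise ValueError("volume and density must be positive")
    return volume_m3 * density_t_per_m3
```

With an assumed density of 0.88 t/m³, an 80 m³ RTC carries roughly 70 tons, while a 20 m³ RTC filled with a chemical of 0.6 t/m³ carries about 12 tons.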
Gases are transported liquefied in so-called gas RTCs either under pressure or
supercooled. Fluids are transported in fluid RTCs that can be emptied from the top, the bottom, or
both.78 Transported liquids are typically not under pressure. Depending on the volume,
the payload for both types of RTCs varies between approximately 12 and 70 tons.
The unloading procedure is similar for both RTC types. RTCs are either pumped out
or pressed out by pumping in an inert gas (such as nitrogen). In practice, gas RTCs
are typically pumped out. However, the unloading procedure leads to an equalization of
pressure. Hence, a residue of the transported chemical remains in the RTC if it is not pressed
out.79 Since the removal of the residue is time-consuming, shippers of chemicals often decide
to keep it inside the RTC in favour of faster turnover times. However, this implies
that if this RTC is to be used to transport another chemical, the residue left inside causes
a contamination. Hence, in practice an RTC is often dedicated to a specific chemical.
In general, RTCs can be moved between chemical-specific fleets if they are rinsed,
e.g. with nitrogen. Another option is to simply accept the contamination incurred by

76
See e.g. Boysen et al. (2012) for an overview on shunting yard optimization models.
77
See Schenker (2007).
78
To get an impression about RTC types, see the list of RTC types offered by lessors such as GATX
(GATX (2009)) or VTG (VTG (2011)).
79
The residue that is not unloaded may account for approximately 10% of the total payload depending on the specific
unloading technique, see Compressed Gas Association (1999, pp. 104-115) for details.

transferring RTCs from one fleet to another.80 In this case, no rinsing costs occur for
rearranging the RTC fleets but re-processing costs for the contaminated chemical have to
be faced. It has to be noted that this option is only possible among fleets of the same
RTC type (i.e. gas or fluid RTC).
For transferring the relevant chemicals to and from RTCs, special technical devices
are required. For loading and unloading activities, the RTCs have to be connected to
the local pipeline system at so-called transfer arms. This is done by shunting the RTCs
to the transfer stations. Typically, an RTC has one access valve, i.e. either unloading or
loading can take place at a time. The transfer time depends on the technical specifications
of the local pipeline system and the RTC. Once the transfer procedure is finished, the
RTC is decoupled from the local pipeline system, shunted, and transported to its next
destination. The total process time for unloading/loading of RTCs consists of

1. preparation time for shunting and connecting the RTC to the transfer station,

2. time for chemical transfer,

3. post-processing time for decoupling and re-shunting.
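The three components listed above add up to the total process time. The following sketch additionally rounds the total up to whole planning periods; the rounding rule, the function name and the example durations are assumptions made for illustration only.

```python
import math


def turnover_periods(prep_h, transfer_h, post_h, period_h):
    """Number of planning periods an RTC is occupied by unloading/loading.

    Assumption: the total process time (preparation + transfer +
    post-processing, all in hours) is rounded up to whole periods.
    """
    total_h = prep_h + transfer_h + post_h
    return math.ceil(total_h / period_h)
```

For instance, 1.5 hours of preparation, 3 hours of transfer and 1.5 hours of post-processing fit into a single 8-hour period, whereas a total of 10 hours occupies two periods.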

Steps 1) and 3) depend on the availability of manpower and the layout of the transfer
stations. Since transfer stations are often serially located along dead-end rail tracks, RTCs
can interfere with one another. Hence, the planning of turnover activities may have severe
influence on the total turnover time and on the RTCs’ total cycle time.81 After shunting
and transfer, RTCs have to be transported from some point of origin to some point of
destination.
Often, chemical companies act as shippers of chemicals and manage the RTCs while
the transport is organized by a rail operator.82 Basically, the shipper has two transport
options:

• compose complete block trains or

• hand over single RTCs.

In the first case, the rail operator has to offer the locomotive(s) for hauling and has to
determine route as well as schedule of the block train on a rail network. In the second case,
the RTCs handed over have to be consolidated at central shunting yards with other rail
cars in order to form block trains. Afterwards, block trains have to be routed, scheduled,
and re-composed. Obviously, the latter option is far more complex to plan and organize.
Hence, higher shipping fees per RTC can be expected than for the block train

80
This is similar to interface "downgrading" in pipeline batch planning, see subsection 3.2.3.
81
On the impact of cycle times on supply chain performance see Closs et al. (2003).
82
This differs from the typical organization in road freight transport where the transport operator
manages the transport and owns/manages the vehicles.

option. To save transport costs, chemical companies often prefer to organize complete
block train transports. This option becomes more attractive the larger and more frequent
the transport volumes are.
To sum up, many technical and organizational restrictions have to be considered
for planning rail transports in the chemical industry. Besides technical constraints, the
planning of transport processes is integrated in the production and replenishment planning
procedures since the consolidation of transport volumes is often organized already at the
chemical production sites. Moreover, large chemical companies often manage a widespread
network of production sites. Hence, RTCs can be used for a variety of different
transport tasks. To support decisions on when and in what quantities chemicals are shipped
from which origin to which destination, the following subsection proposes a short-term
rail transportation model.

3.3.2 A short-term rail transportation problem


As stated above, rail transports in chemical industry are often used for distribution of
intermediate and basic chemicals with moderate demand. Often the transported chemicals
serve as raw materials for other chemical plants which either belong to a customer’s or the
shipper’s production network. Since rail transports require cargo consolidation to operate
economically, bundles of chemicals are handled at large-scale chemical production sites.
I.e. rail transports are a viable option even if the individual demand of each
chemical cannot be shipped efficiently by rail.

3.3.2.1 Problem formulation

Chemical production networks consist of a number of chemical production sites where
multiple interdependent plants are located. At each site multiple chemicals are produced
and/or consumed. For each chemical and at each site, tanks are available for intermediate
storage. Local deficits or surpluses of chemicals have to be balanced by imports or exports
of chemicals. It is assumed that all sites are capable of using rail transports for balancing,
i.e. all sites have railroad access as well as shunting and turnover facilities for RTCs. For
each chemical, a fleet of RTCs is available for transportation. Transports are performed
by rail operators offering train capacities at regular intervals. Therefore, complete trains
are composed at the sites and handed over to the rail operator.
On the operational level it is assumed that parameters are known and deterministic. The
short-term distribution of the considered chemicals in the network has to be decided
such that the total operational costs for transport, turnover, and storage are minimized.
More precisely, decisions have to be made on

• the local stock of each chemical at each site,

• the shipments of chemicals and RTCs w.r.t. quantity and time,



• the unloading and loading of RTCs w.r.t. quantity and time,

• the booking of train transports.

The problem is formulated as a time-discrete, multi-commodity, multi-layer network
design model.83 The following basic notation is used: Suppose a set of chemical production
sites I is given for a set of (basic) time periods T. Among all chemicals consumed and
produced at all sites, a subset of these chemicals S exists such that each chemical s ∈ S
is in surplus at at least one site in at least one period and in deficit at at least one other
site and period. The local total balance of chemical s in period t at site i is denoted
by $\omega_{its}$. If there is a total surplus, it holds that $\omega_{its} < 0$. In case of a (net) deficit, it follows that
$\omega_{its} > 0$. For each chemical s and site i ∈ I, tanks are available for storage with maximum
total storage capacity $s^{Cap}_{is}$ given in tons. To safeguard against plant breakdowns due to
material shortages, a local target stock level $s^{Tar}_{is}$ is defined. Note that if customers are
supplied, these can be modelled as production sites with zero target stock levels where
order quantities are modelled as sporadic deficits in periods when orders are due.
Non-zero inventory capacities can be set to reflect an intermediate storage option and allow
premature deliveries. Otherwise, the transports are forced to arrive exactly in time.
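The membership condition for the set S and the sign convention of the local balances can be illustrated by a small sketch. The dictionary layout and the function name are hypothetical; only the sign convention (negative for surplus, positive for deficit) is taken from the text.

```python
def needs_balancing(omega_s):
    """Decide whether a chemical belongs to the set S.

    omega_s maps (site, period) to the net balance of one chemical:
    negative values denote a surplus, positive values a deficit.
    The chemical belongs to S if some site shows a surplus in some
    period and some other site shows a deficit in some period.
    """
    surplus_sites = {site for (site, _), w in omega_s.items() if w < 0}
    deficit_sites = {site for (site, _), w in omega_s.items() if w > 0}
    return any(i != j for i in surplus_sites for j in deficit_sites)
```

A customer can be represented in the same structure as a site whose balance is zero in most periods and positive in the periods in which orders are due.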

3.3.2.2 Components for modelling rail transports

To distribute chemicals among the sites, RTCs are required. Therefore, each chemical
s ∈ S has a fleet of RTCs distributed among the sites. Each fleet is assumed to be
composed of a homogeneous RTC type with maximum payload capacity $r^{Cap}_s$, length $r^{Le}_s$,
and tare weight $r^{We}_s$. At each site, these RTCs can be consolidated into trains, can be
unloaded, and/or can be loaded.
For transports between sites, trains can be composed and dispatched.84 The set of rail
links available for transport is denoted by L ⊆ I × I. Note that each site has access to at
least one rail link. Each site can dispatch a maximum number of trains $y^{Cap}_{it}$ in period t
which acts as a proxy for a site’s shunting capacity. The maximum number of trains to
be chartered for a rail link (i, j) ∈ L, $\bar{y}^{Cap}_{ijt}$, reflects varying contingents negotiated between
the shipper and the rail operator(s).85
Similarly, individual maximum train lengths $tr^{Le}_{ij}$ and train weights $tr^{We}_{ij}$ are assumed
for each rail link (i, j) ∈ L reflecting technical train specifications such as the power and
number of locomotive(s) committed by the rail operator as well as legal restrictions e.g. for
maximum train length. Typically, either the length or the weight constraint is binding.
The former is often binding if only empty RTCs are hauled, whereas the
latter is binding if only loaded RTCs are hauled.
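The interplay of the length and the weight limit can be illustrated with a minimal sketch. It assumes a single homogeneous RTC type, fully loaded RTCs, and limits that apply to the RTCs only (the locomotive is ignored); the function names and all numbers in the usage note are hypothetical.

```python
def train_is_feasible(n_loaded, n_empty, rtc_len, rtc_tare, rtc_payload,
                      max_len, max_weight):
    """Check the length and weight limits for one train.

    Assumptions: one homogeneous RTC type, loaded RTCs carry the full
    payload, and the limits refer to the RTCs only.
    """
    n_total = n_loaded + n_empty
    length = n_total * rtc_len
    weight = n_total * rtc_tare + n_loaded * rtc_payload
    return length <= max_len and weight <= max_weight


def max_rtcs(loaded, rtc_len, rtc_tare, rtc_payload, max_len, max_weight):
    """Largest number of RTCs (all loaded or all empty) in one train."""
    weight_per_rtc = rtc_tare + (rtc_payload if loaded else 0)
    return min(max_len // rtc_len, max_weight // weight_per_rtc)
```

With assumed values of 16 m length, 24 t tare weight and 60 t payload per RTC and limits of 600 m and 1600 t, 37 empty RTCs fit into one train (length is binding), but only 19 loaded ones (weight is binding).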
83
See Newman and Yano (2000) for a similar intermodal transportation planning problem.
84
Loaded and empty RTCs are shunted and composed to a train load. The train load is handed over to
the rail operator who offers the traction (i.e. the locomotive) and performs the transport.
85
This allows incorporating periods without available train dispatches, e.g. on weekends or holidays.

The discretized travelling time of a train operating on rail link (i, j) ∈ L is denoted by
$t^{Trv}_{ij}$, which is an integer value and reflects the number of basic periods elapsing during
shipment. Obviously, this is a notable simplification of real-world conditions and a
drawback of all time-discrete models. In this case, however, the simplification does not severely
impair the model since travelling times in trans-regional rail freight transports are
typically relatively large.86 Transport times include the pure travelling times as well as
waiting and inspection times, e.g. when borders are crossed or when cargo trains have to
wait for passenger trains to pass. Hence, even for comparatively short distances,
travelling time is counted in hours. As a rule of thumb, the basic period can be set to
the greatest common divisor of all time parameters including train
travel times as well as e.g. turnover times. The total planning horizon is restricted since
demand and production rates have to be updated frequently. Planning horizons of 7 to
14 days seem appropriate for most problem instances in practice.
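The rule of thumb for choosing the basic period can be sketched in a few lines. The example assumes that all time parameters are given in whole hours; the function names and the durations are hypothetical.

```python
from functools import reduce
from math import gcd


def basic_period(time_params_h):
    """Greatest common divisor of all time parameters (whole hours)."""
    return reduce(gcd, time_params_h)


def to_periods(duration_h, period_h):
    """Express a duration as an integer number of basic periods."""
    if duration_h % period_h:
        raise ValueError("duration is not a multiple of the basic period")
    return duration_h // period_h
```

For assumed travel times of 24, 36 and 48 hours and a turnover time of 12 hours, the basic period is 12 hours, so that a planning horizon of 7 days comprises 14 periods.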

3.3.2.3 Components for modelling turnover processes

An additional advantage of a time-discrete model is that e.g. work shifts for the turnover
staff are discrete in time. Under regular conditions turnover and transfer activities do not
take place around the clock but e.g. in two shifts of 8 hours each (or one shift of 12
hours). Hence, a time-discrete formulation makes it easy to incorporate such regulations.
Therefore, the unloading and loading capacities, $e^{Cap}_{its}$ and $l^{Cap}_{its}$, are expressed in tons of
chemical s to be handled at site i in period t. Both parameters reflect the work force
availability and technical restrictions such as the number and capacity of transfer arms. It
is assumed that exclusive turnover capacities are provided for each chemical at a site. This
implies single-product transfer stations for each product. This is reasonable if handled
chemicals are dissimilar and a joint unloading/loading is too dangerous or expensive.87
However, in case of similar chemicals, commonly used transfer stations may exist at the
sites.88 In this case, a slightly adjusted turnover capacity formulation is required. In basic
chemical industry, however, individual transfer stations are predominant.
To properly model the RTC handling processes at the sites, the time necessary to
unload/load the RTCs has to be considered. The total processing time for RTC handling
is summarized in parameter $t^{Trn}_{is}$, which is the number of basic periods required to make
an RTC available for transport (after loading or unloading). I.e. if RTCs are
unloaded/loaded, $t^{Trn}_{is}$ is the minimum time span an RTC has to spend at site i. In most cases,
a turnover time span of one work shift, i.e. one basic period, is assumed before an RTC
86
A reasonable choice for the basic period is e.g. 24, 12 or 8 hours. In practice, more granular planning
accuracy is seldom required since travelling times in trans-regional rail transports are not much shorter
than 12 hours. See Newman and Yano (2000) for a more detailed discussion.
87
Dissimilar chemicals are likely to react with each other which is typically unintended as it bears the
risk of debris, hazardous reactions or even explosions.
88
Such stations are often used in crude oil industry when different types of fuel or other oil derivatives
are handled.

is available for transport. Note that RTCs which are not unloaded/loaded are available
for transport immediately.89
If the number of RTCs to be unloaded/loaded in a period at a site exceeds the respective
capacities $e^{Cap}_{its}$ and $l^{Cap}_{its}$, the remaining RTCs have to wait at the site's shunting
yard. Loaded RTCs, denoted by $sr^{L}_{its}$, may wait on rail tracks ready for forwarding.
This equals an additional inventory denoted by $s^{L}_{its}$ which induces stock holding costs
for committed capital and costs for supervising the RTCs. Since most chemicals are
hazardous, storing requires permanent supervision and control. In regular tanks this is
typically assured by sensor systems. Although RTCs are designed for recurring transports
of hazardous chemicals, they typically do not meet e.g. legal requirements for a permanent
technical supervision system.

3.3.2.4 Components for modelling the objective function

To evaluate the operational effort for realizing transports and keeping stocks, three types
of costs have to be considered:

• stock holding costs

• transport costs

• turnover costs.

Stock holding costs are calculated as follows: At each site safety stocks of each handled
chemical are held. They protect against extraordinary demand peaks e.g. caused by plant
breakdowns. With respect to the inventory holding costs, the stock level $s_{its}$ should meet
the desired safety stock level $s^{Tar}_{is}$. An overshooting of the safety stock levels causes
unnecessary stock holding costs (e.g. due to capital commitment) whereas an undershooting
increases expected shortfall costs since in emergency cases plant shutdowns loom due
to a shortfall of supply. Because the costs induced by a missing or surplus ton of a certain
chemical depend on the target stock level $s^{Tar}_{is}$, over- and undershooting ($o_{its}$ and $u_{its}$)
are measured as relative deviations from $s^{Tar}_{is}$. The corresponding cost rates $c^{Os}_{is}$ and $c^{Us}_{is}$
are measured in monetary units per relative deviation of the stock level of chemical s
from its target stock level at site i. Often it holds $c^{Us}_{is} \gg c^{Os}_{is}$ because a plant shutdown
causes much higher costs compared to stock holding.90 Additionally, for chemicals stored
intermediately in (loaded) RTCs, the cost rate $c^{Add}$ reflects the additional supervision and
maintenance effort measured in monetary units per ton and period.
Transport costs reflect the monetary effort to be faced when trains are dispatched, i.e. a
rail operator performs a transport service. Therefore, cost rate $c^{Trv}_{ij}$ accounts for the charge

89
E.g. if empty RTCs arrive in a period at a site and are not to be loaded, they can be forwarded to
another site in the same period.
90
In general, expected shortfall costs increase non-linearly with increasing undershooting of target stock.
However, for the sake of solvability this non-linearity is dropped in this model.

paid when a train is dispatched on rail link (i, j) ∈ L. These charges are typically based
on general agreements with the responsible rail operator and depend on the distances to
be travelled, borders crossed, train specifications and other details.91
The turnover cost rate $c^{Trn}_{is}$ comprises direct costs for unloading and loading RTCs. This
cost rate reflects labour costs for shunting, unloading/loading, and supervision activities.

3.3.2.5 Mathematical model

To sum up, a rail transportation model for multiple products shipped by trains between
chemical production sites can be formulated as a multi-layer, multi-commodity time-
space expanded network flow model. In other words, an operational multi-chemical rail
transportation problem is provided (in short MC-RTP).
In the network model each chemical corresponds to a main layer representing the infras-
tructure of this chemical. The main layer consists of four sub-layers which are dedicated
to

1. the chemical stocks in local tanks,

2. the chemical stocks in loaded RTCs,

3. the stock of loaded RTCs,

4. the stock of empty RTCs.

Flows between two nodes of the same sub-layer correspond to storing or shipping flows
of RTCs or chemicals, respectively. Flows between nodes of different sub-layers indicate
loading or unloading flows. At each node inflows and outflows are in balance. Each flow is
restricted by stock, RTC or train capacities. Transport flows of chemicals are restricted by
the accompanying flow of loaded RTCs, i.e. flows in the second and third sub-layer match
each other. Hence, transport capacities are dynamically modelled as in a classic IRP. In
contrast to classic IRPs, no routes for individual carriers are determined. Instead, the
RTC flow balances ensure that the correct transport capacity is modelled. The available
transport capacity, offered by RTCs, can only be exploited by chartering trains. This
component assigns arc-based flow capacities incurring fixed costs which is a characteristic
of fixed-charge network design models. Hence, the proposed model can be seen as a hybrid
model bearing characteristics of both prominent model classes. Figure 3.9 illustrates an
exemplary flow network for five periods and three sites.
Grey arrows indicate (passive) flows of chemicals and (empty) RTCs. Marked by black
arrows is the following sequence of activities: In period 1 empty RTCs are shipped from
site 2 to site 1 where they are needed to load a certain quantity of the chemical in period
2. I.e. a flow between fourth and third sub-layer indicates a change in the RTCs’ states.

91
Note that additional flexibility can be incorporated by time varying charges e.g. for weekend transports.
[Figure 3.9 (graphic not reproduced): time-space expanded network with nodes for five periods (time axis) and three sites (site axis) on four stacked sub-layers (layer axis): (1st sub-layer) chemicals in tanks, (2nd sub-layer) chemicals in RTCs, (3rd sub-layer) loaded RTCs, (4th sub-layer) empty RTCs. Arcs are labelled storing, loading, shipping, and unloading.]

Figure 3.9: Four-layer flow network for one chemical

Correspondingly, a quantity of the chemical is transferred from the tanks (first sub-layer)
to the mobile stock (second sub-layer). In period 3 the loaded RTCs depart by train from
site 1 carrying the chemical to site 2 where they arrive in period 4 (second and third
sub-layer). In this period the RTCs are unloaded such that the local chemical stocks are
replenished and, simultaneously, the RTCs are emptied. In this sequence two shipments
are performed, i.e. (at least) two trains are chartered.
To formalize the sketched network flow model, the remaining notation is introduced
as follows: A shipment flow of chemical s from site i to site j in period t is denoted by
$x_{ijts}$ (second sub-layer). Shipment flows of loaded and empty RTCs are denoted by $rx^{L}_{ijts}$
and $rx^{E}_{ijts}$, respectively (third and fourth sub-layer). Unloading flows and loading flows of
chemical s in period t at site i are described by $z^{E}_{its}$ and $z^{L}_{its}$ (flows between first and second
sub-layer). These flows are accompanied by the corresponding RTC flows (between third
and fourth sub-layer) $rz^{E}_{its}$ and $rz^{L}_{its}$, respectively. To enable a shipment of loaded and/or
empty RTCs from node i to node j a number of trains ($y_{ijt}$) has to be dispatched.
The local chemical stock in tanks is denoted by $s_{its}$ (first sub-layer) whereas the mobile
stock in RTCs is denoted by $s^{L}_{its}$ (second sub-layer). The numbers of loaded and
empty RTCs at hand are described by $sr^{L}_{its}$ and $sr^{E}_{its}$. Table 3.13 summarizes all notation
introduced above (and some more).
Sets

  $\mathcal{S} = \{1, ..., S\}$                     set of chemicals
  $\mathcal{I} = \{1, ..., I\}$                     set of sites
  $\mathcal{T} = \{1, ..., T\}$                     set of periods
  $\mathcal{L} \subseteq \mathcal{I} \times \mathcal{I}$   set of rail links

Parameters

Cost rates

  $c^{Os}_{is}$              over-shooting cost rate for chemical s ∈ S
  $c^{Us}_{is}$              under-shooting cost rate for chemical s ∈ S
  $c^{Trn}_{s}$              turnover costs for chemical s ∈ S
  $c^{Trv}_{ij}$             costs for chartering a train on rail link (i, j) ∈ L
  $\tilde{c}^{Tr-E}_{ijs}$   (artificial) cost rate for carrying an empty RTC of chemical s ∈ S on rail link (i, j) ∈ L
  $\tilde{c}^{Tr-L}_{ijs}$   (artificial) cost rate for carrying a loaded RTC of chemical s ∈ S on rail link (i, j) ∈ L
  $c^{Add}$                  cost rate for storing chemical in RTCs

Utilization parameters

  $\omega_{its}$             local net balance of chemical s ∈ S at site i ∈ I in period t ∈ T; if $\omega_{its} > 0$ there is a net deficit, otherwise a surplus
  $r^{We}_{s}$               tare weight of RTCs for chemical s ∈ S
  $r^{Le}_{s}$               length of RTCs for chemical s ∈ S
  $s^{Tar}_{is}$             target stock level for chemical s ∈ S at site i ∈ I

Capacities

  $r^{Cap}_{s}$              payload capacity of RTCs for chemical s ∈ S
  $tr^{We}_{ij}$             maximum train weight for trains on rail link (i, j) ∈ L
  $tr^{Le}_{ij}$             maximum train length for trains on rail link (i, j) ∈ L
  $\bar{y}^{Cap}_{ijt}$      maximum number of trains to be dispatched on rail link (i, j) ∈ L in period t ∈ T
  $y^{Cap}_{it}$             maximum number of trains to be dispatched at site i ∈ I in period t ∈ T
  $n^{Cap-E}_{ijs}$          maximum number of empty RTCs of chemical s ∈ S to be carried by a train on rail link (i, j) ∈ L
  $n^{Cap-L}_{ijs}$          maximum number of loaded RTCs of chemical s ∈ S to be carried by a train on rail link (i, j) ∈ L
  $e^{Cap}_{its}$            unloading capacity of chemical s ∈ S at site i ∈ I in period t ∈ T
  $l^{Cap}_{its}$            loading capacity of chemical s ∈ S at site i ∈ I in period t ∈ T
  $s^{Cap}_{is}$             inventory capacity for chemical s ∈ S at site i ∈ I

Initialization parameters

  $s^{Ini}_{is}$                 initial stock level for chemical s ∈ S at site i ∈ I
  $s^{L-Ini}_{is}$               initial stock level stored in RTCs for chemical s ∈ S at site i ∈ I
  $r^{L-Ini}_{is}$               initial loaded RTCs for chemical s ∈ S available at site i ∈ I
  $r^{E-Ini}_{is}$               initial empty RTCs for chemical s ∈ S available at site i ∈ I
  $x^{Ini}_{ij\tau_{ij}s}$       quantity of chemical s ∈ S dispatched in period $\tau_{ij} \in \{1 - t^{Trv}_{ij}, ..., 0\}$ on rail link (i, j) ∈ L
  $rx^{L-Ini}_{ij\tau_{ij}s}$    number of loaded RTCs dispatched in period $\tau_{ij} \in \{1 - t^{Trv}_{ij}, ..., 0\}$ on rail link (i, j) ∈ L
  $rx^{E-Ini}_{ij\tau_{ij}s}$    number of empty RTCs dispatched in period $\tau_{ij} \in \{1 - t^{Trv}_{ij}, ..., 0\}$ on rail link (i, j) ∈ L
  $rz^{L-Ini}_{i\tau_{is}s}$     number of RTCs loaded in period $\tau_{is} \in \{1 - t^{Trn}_{is}, ..., 0\}$ at site i ∈ I
  $rz^{E-Ini}_{i\tau_{is}s}$     number of RTCs unloaded in period $\tau_{is} \in \{1 - t^{Trn}_{is}, ..., 0\}$ at site i ∈ I

Time parameters

  $t^{Trv}_{ij}$             integer, travelling time for trains on rail link (i, j) ∈ L
  $t^{Trn}_{is}$             integer, turnover time for chemical s ∈ S at site i ∈ I

Decision variables

  $x_{ijts}$                 flow of chemical s ∈ S on rail link (i, j) ∈ L in period t ∈ T
  $rx^{L}_{ijts}$            integer, flow of RTCs loaded with chemical s ∈ S on rail link (i, j) ∈ L in period t ∈ T
  $rx^{E}_{ijts}$            integer, flow of empty RTCs for chemical s ∈ S on rail link (i, j) ∈ L in period t ∈ T
  $z^{E}_{its}$              unloading flow of chemical s ∈ S in period t ∈ T at site i ∈ I
  $z^{L}_{its}$              loading flow of chemical s ∈ S in period t ∈ T at site i ∈ I
  $rz^{E}_{its}$             integer, RTCs of chemical s ∈ S unloaded in period t ∈ T at site i ∈ I
  $rz^{L}_{its}$             integer, RTCs of chemical s ∈ S loaded in period t ∈ T at site i ∈ I
  $y_{ijt}$                  integer, number of trains dispatched on rail link (i, j) ∈ L in period t ∈ T

Variables

  $s_{its}$                  stock level of chemical s ∈ S in period t ∈ T at site i ∈ I
  $s^{L}_{its}$              stock of chemical s ∈ S stored in RTCs in period t ∈ T at site i ∈ I
  $sr^{L}_{its}$             number of loaded RTCs for chemical s ∈ S stored in RTCs in period t ∈ T at site i ∈ I
  $sr^{E}_{its}$             number of empty RTCs for chemical s ∈ S in period t ∈ T at site i ∈ I
  $o_{its}$                  relative over-shooting in % of the target stock of chemical s ∈ S in period t ∈ T at site i ∈ I
  $u_{its}$                  relative under-shooting in % of target stock of chemical s ∈ S in period t ∈ T at site i ∈ I

Table 3.13: Sets, parameters, variables, and decision variables for the MC-RTP

Using the aforementioned notation, the total operational costs for balancing the chem-
ical production network by rail transports is expressed in (3.33) which constitutes the
MC-RTP’s objective function to be minimized.

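$$
\begin{aligned}
TC = {} & \sum_{i \in \mathcal{I}} \sum_{t \in \mathcal{T}} \sum_{s \in \mathcal{S}} \left( o_{its} \cdot c^{Os}_{is} + u_{its} \cdot c^{Us}_{is} \right) \\
 & + \sum_{(i,j) \in \mathcal{L}} \sum_{t \in \mathcal{T}} y_{ijt} \cdot c^{Trv}_{ij} \\
 & + \sum_{i \in \mathcal{I}} \sum_{t \in \mathcal{T}} \sum_{s \in \mathcal{S}} \left( rz^{E}_{its} + rz^{L}_{its} \right) \cdot c^{Trn}_{s} \\
 & + \sum_{i \in \mathcal{I}} \sum_{t \in \mathcal{T}} \sum_{s \in \mathcal{S}} s^{L}_{its} \cdot c^{Add}
\end{aligned}
\tag{3.33}
$$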

The first part of (3.33) accounts for the stock holding costs arising from deviations of
stock levels from the desired target stock levels. The second part sums up the total train
charges over all rail links by multiplying the number of dispatched trains $y_{ijt}$ with the
corresponding train cost rate $c^{Trv}_{ij}$. Turnover costs for turnover activities taking place at
the sites are summed up in the third part of (3.33). The last part accounts for costs for
supervision and control of loaded RTCs waiting at the sites which can be interpreted as
costs for dynamic storage capacity extensions.
The considered restrictions can be categorized in balancing and capacity constraints.
Constraints (3.34)-(3.37) represent the balancing constraints.

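$$s_{its} = s_{i(t-1)s} + z^{E}_{its} - z^{L}_{its} - \omega_{its} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.34}$$

$$s^{L}_{its} = s^{L}_{i(t-1)s} - z^{E}_{its} + z^{L}_{its} - \sum_{j \in \mathcal{I}} x_{ijts} + \sum_{j \in \mathcal{I}} x_{ji(t - t^{Trv}_{ji})s} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.35}$$

$$sr^{L}_{its} = sr^{L}_{i(t-1)s} - rz^{E}_{its} + rz^{L}_{i(t - t^{Trn}_{is})s} - \sum_{j \in \mathcal{I}} rx^{L}_{ijts} + \sum_{j \in \mathcal{I}} rx^{L}_{ji(t - t^{Trv}_{ji})s} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.36}$$

$$sr^{E}_{its} = sr^{E}_{i(t-1)s} - rz^{L}_{its} + rz^{E}_{i(t - t^{Trn}_{is})s} - \sum_{j \in \mathcal{I}} rx^{E}_{ijts} + \sum_{j \in \mathcal{I}} rx^{E}_{ji(t - t^{Trv}_{ji})s} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.37}$$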

Here, (3.34) and (3.35) track the quantities of chemical s available at site i in period t
in tanks (3.34) and in loaded RTCs (3.35). Unloading and loading flows ($z^{E}_{its}$ and $z^{L}_{its}$)
connect both inventories inversely, e.g. unloading RTCs (i.e. $z^{E}_{its} > 0$) reduces $s^{L}_{its}$ and
increases $s_{its}$. Furthermore, tank inventories buffer the net consumption or surplus $\omega_{its}$.
In contrast, the stock loaded in RTCs $s^{L}_{its}$ absorbs incoming chemical flows from other
sites ($x_{ji(t - t^{Trv}_{ji})s}$) and provides outgoing chemical flows to other sites ($x_{ijts}$).
Equations (3.36) and (3.37) represent stock balances for loaded and unloaded RTCs.
Both stocks are interconnected by RTC transfer flows $rz^{E}_{its}$ and $rz^{L}_{its}$. E.g. unloading of
$rz^{E}_{its}$ RTCs in period t at site i immediately reduces the number of loaded RTCs. However,
$t^{Trn}_{is}$ periods are required for unloading and shunting before these RTCs can be handled
as empty RTCs again. Both stocks are replenished by incoming RTCs from other sites
($rx^{L}_{ji(t - t^{Trv}_{ji})s}$ and $rx^{E}_{ji(t - t^{Trv}_{ji})s}$) and provide outgoing RTC flows to other sites ($rx^{L}_{ijts}$ and
$rx^{E}_{ijts}$).
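
The interplay of the two chemical balances (3.34) and (3.35) can be traced by hand for a single site and chemical. The following is a minimal sketch; the function name and the numbers (a surplus of 50 tons per period, a load of 120 tons) are assumptions chosen for illustration.

```python
def balance_step(s_prev, sl_prev, unload, load, omega, shipped_out=0.0, arrived=0.0):
    """Advance tank stock and RTC stock by one period for one site and chemical."""
    s = s_prev + unload - load - omega                     # tank stock, cf. (3.34)
    sl = sl_prev - unload + load - shipped_out + arrived   # stock in RTCs, cf. (3.35)
    return s, sl

# Assumed data: initial tank stock 400 t, net surplus of 50 t per period (omega = -50)
s, sl = 400.0, 0.0
s, sl = balance_step(s, sl, unload=0, load=0, omega=-50)                     # period 1: only production
s, sl = balance_step(s, sl, unload=0, load=120, omega=-50)                   # period 2: 120 t loaded into RTCs
s, sl = balance_step(s, sl, unload=0, load=0, omega=-50, shipped_out=120)    # period 3: loaded RTCs depart
print(s, sl)
```

Loading moves quantity from the tank stock to the RTC stock within the same period, and the shipment then removes it from the site's RTC stock.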
Capacity constraints can be separated into two subgroups: Static capacities which refer
to infrastructural conditions (such as tank, train, or turnover capacities) and dynamic
capacities referring to restrictions that are planned simultaneously (such as RTC stocks
and flows). Constraints (3.38)-(3.44) express static capacity constraints.

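$$z^{E}_{its} \le e^{Cap}_{its} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.38}$$

$$z^{L}_{its} \le l^{Cap}_{its} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.39}$$

$$s_{its} \le s^{Cap}_{is} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.40}$$

$$y_{ijt} \le \bar{y}^{Cap}_{ijt} \qquad \forall (i,j) \in \mathcal{L},\ t \in \mathcal{T} \tag{3.41}$$

$$\sum_{j \in \mathcal{I}} y_{ijt} \le y^{Cap}_{it} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T} \tag{3.42}$$

$$\sum_{s \in \mathcal{S}} x_{ijts} + \sum_{s \in \mathcal{S}} r^{We}_{s} \cdot \left( rx^{E}_{ijts} + rx^{L}_{ijts} \right) \le tr^{We}_{ij} \cdot y_{ijt} \qquad \forall (i,j) \in \mathcal{L},\ t \in \mathcal{T} \tag{3.43}$$

$$\sum_{s \in \mathcal{S}} r^{Le}_{s} \cdot \left( rx^{E}_{ijts} + rx^{L}_{ijts} \right) \le tr^{Le}_{ij} \cdot y_{ijt} \qquad \forall (i,j) \in \mathcal{L},\ t \in \mathcal{T} \tag{3.44}$$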

Constraints (3.38) and (3.39) restrict the unloading and loading flows to the unloading and
loading capacities of the corresponding periods and sites. Similarly, (3.40) ensures that in
all periods stock levels are less than or equal to the tank capacities. Constraints (3.41) require
that the maximum number of trains dispatchable on a rail link per period is not exceeded.
Similarly, constraints (3.42) ensure that the sum of all trains composed on site i in period
t does not exceed the corresponding maximum shunting capacity. Constraints (3.43) and

(3.44) refer to technical train specifications assuring that maximum weight and length
are not exceeded. On the right-hand side, the total available capacities are calculated as
the product of maximum train weight and length multiplied with the number of trains
dispatched on a link (i, j) ∈ L. On the left-hand side, the trains’ total hauling weight is
calculated as the sum of all chemicals loaded in RTCs (∑s∈S xijts ) plus tare weights of all
RTCs attached to the train(s). The total length of the trains is the sum of the lengths of
all RTCs attached.
If yijt > 1 for a specific trip (i, j, t) and only one of both constraints is restrictive, an
integer feasible assignment of RTCs to individual trains is assured by constraints (3.43)
and (3.44). However, if both constraints are (almost) restrictive92 and yijt > 1, an integer
feasible assignment of RTCs to trains only exists if the number of RTCs is a multiple of
the number of trains $y_{ijt}$. To illustrate this situation, consider a specific trip with y = 2.
For simplicity assume that a homogeneous fleet of RTCs exists with equal tare weight
$r^{We} = 2$, equal length $r^{Le} = 1$, and equal capacity $r^{Cap} = 8$. Let $tr^{Le} = 20$ and $tr^{We} = 100$.
Then, the maximum number of RTCs due to (3.44) is $\frac{tr^{Le}}{r^{Le}} \cdot y = \frac{20}{1} \cdot 2 = 40$ with an associated
tare weight of $40 \cdot r^{We} = 80$. According to (3.43) the total remaining (weight) capacity for
payload is then $y \cdot tr^{We} - 80 = 120$ which implies a total number of $\frac{120}{r^{Cap}} = 15$ loaded RTCs.
In consequence, 25 empty RTCs and 15 loaded RTCs with a maximum payload of 120
could be carried by two trains according to (3.43) and (3.44). However, the numbers of
loaded and empty RTCs are not multiples of two. The most balanced integer assignment
of loaded and empty RTCs to two trains is displayed in Table 3.14.

train     number of RTCs      total     total
#         empty    loaded     length    weight
1         13       7          20        40 + 7 ⋅ 8 = 96
2         12       8          20        40 + 8 ⋅ 8 = 104
total     25       15         40        200

Table 3.14: Exemplary assignment of RTCs to two trains

Table 3.14 shows that for train # 1 a feasible RTC assignment is made
while for train # 2 the maximum total weight of 100 is exceeded. When multiple
trains can be dispatched on a trip, a reformulation of (3.43) and (3.44) would overcome
this problem. However, to reformulate both constraints additional variables are necessary
indicating whether both constraints are (almost) restrictive for a specific trip. This would
lead to a more complicated model. Since the problem only occurs if the total payload is
close to its maximum, it suffices to introduce a safety buffer for the total train weight.
I.e. a train’s allowed total weight $tr^{We}_{ij}$ is calculated as $tr^{We}_{ij} = \tilde{tr}^{We}_{ij} - \max_{s \in \mathcal{S}} r^{Cap}_{s}$ where $\tilde{tr}^{We}_{ij}$
is the technical maximum of the train’s total weight. Note that in practice the ratio $r^{Cap}_{s} / \tilde{tr}^{We}_{ij}$
is quite small (approximately 1-5 %) such that no serious loss in train utilization is to be
expected.

92
To be precise, it has to hold $tr^{Le}_{ij} \cdot y_{ijt} - \sum_{s \in \mathcal{S}} r^{Le}_{s} \cdot (rx^{E}_{ijts} + rx^{L}_{ijts}) = 0$ and
$tr^{We}_{ij} \cdot y_{ijt} - \left( \sum_{s \in \mathcal{S}} x_{ijts} + \sum_{s \in \mathcal{S}} r^{We}_{s} \cdot (rx^{E}_{ijts} + rx^{L}_{ijts}) \right) \le \max_{s \in \mathcal{S}} r^{Cap}_{s}$.
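
The numbers of the two-train illustration and the effect of the safety buffer can be re-checked with a short script. This is a minimal sketch using the homogeneous fleet data from the illustration (tare weight 2, length 1, payload 8, train length 20, technical train weight 100); the function names are assumptions, and every loaded RTC is assumed to be filled completely.

```python
def train_load(n_empty, n_loaded, r_we=2, r_le=1, r_cap=8):
    """Return (total length, total weight) for a set of RTCs with full payloads."""
    n = n_empty + n_loaded
    return n * r_le, n * r_we + n_loaded * r_cap

def fits(n_empty, n_loaded, tr_le, tr_we, trains=1):
    """Check the aggregated length and weight limits, cf. (3.43) and (3.44)."""
    length, weight = train_load(n_empty, n_loaded)
    return length <= trains * tr_le and weight <= trains * tr_we

TR_LE, TR_WE_TECH, R_CAP = 20, 100, 8

# Without buffer: feasible in aggregate, but the second train is overloaded
aggregate_ok = fits(25, 15, TR_LE, TR_WE_TECH, trains=2)
train1_ok = fits(13, 7, TR_LE, TR_WE_TECH)
train2_ok = fits(12, 8, TR_LE, TR_WE_TECH)

# With buffer: the allowed weight per train is reduced by the largest payload
tr_we_buffered = TR_WE_TECH - R_CAP
buffered_rejects_15 = not fits(25, 15, TR_LE, tr_we_buffered, trains=2)
buffered_accepts_13 = fits(27, 13, TR_LE, tr_we_buffered, trains=2)
# 13 loaded RTCs split 7/6 respect the technical maximum on each train
split_ok = fits(13, 7, TR_LE, TR_WE_TECH) and fits(14, 6, TR_LE, TR_WE_TECH)
print(aggregate_ok, train1_ok, train2_ok, buffered_accepts_13, split_ok)
```

In this instance the buffered aggregate limit admits at most 13 loaded RTCs, and their most balanced split stays within the technical weight limit of each individual train.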
Dynamic capacity constraints are formulated in (3.45)-(3.50).

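$$x_{ijts} \le rx^{L}_{ijts} \cdot r^{Cap}_{s} \qquad \forall (i,j) \in \mathcal{L},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.45}$$

$$x_{ijts} > \left( rx^{L}_{ijts} - 1 \right) \cdot r^{Cap}_{s} \qquad \forall (i,j) \in \mathcal{L},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.46}$$

$$z^{E}_{its} \le rz^{E}_{its} \cdot r^{Cap}_{s} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.47}$$

$$z^{L}_{its} \le rz^{L}_{its} \cdot r^{Cap}_{s} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.48}$$

$$s^{L}_{its} \le sr^{L}_{its} \cdot r^{Cap}_{s} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.49}$$

$$s^{L}_{its} > \left( sr^{L}_{its} - 1 \right) \cdot r^{Cap}_{s} \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.50}$$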

Here, (3.45) and (3.46) ensure that the shipped quantity of a chemical s matches the
total payload capacity of all loaded RTCs to be shipped.93 Constraints (3.47) and (3.48)
restrict the loading and unloading flows to the payload capacities of the simultaneously
loaded and unloaded RTCs. Constraints (3.49) and (3.50) enforce that the quantity of
chemicals designated to be stored in RTCs matches the total payload capacity of loaded
RTCs. Constraints (3.45) and (3.46) in combination with (3.49) and (3.50) also restrict
loading and unloading flows to be matched with the numbers of loaded and unloaded
RTCs in a period.
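
Taken together, (3.45) and (3.46) pin the number of loaded RTCs to the smallest integer whose payload covers the shipped quantity. A minimal sketch of this reading follows; the function names are assumptions, and the sample values (520 tons with a payload of 20 tons per RTC) are used for illustration only.

```python
import math

def loaded_rtcs(x, r_cap):
    """Smallest number of loaded RTCs that can carry quantity x."""
    return math.ceil(x / r_cap)

def satisfies_345_346(x, rx_l, r_cap):
    """Check x <= rx_l * r_cap and x > (rx_l - 1) * r_cap."""
    return (rx_l - 1) * r_cap < x <= rx_l * r_cap

n = loaded_rtcs(520, 20)
print(n, satisfies_345_346(520, n, 20), satisfies_345_346(520, n + 1, 20))
```

Declaring one RTC more than needed violates the strict inequality, so empty RTCs cannot be shipped under the label of loaded ones.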
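To assess relative under- and over-shooting of target stock levels, constraints (3.51)
assign the deviations of the ratio $\frac{s_{its}}{s^{Tar}_{is}}$ from 1 to the variables $o_{its}$ and $u_{its}$. If $s_{its} > s^{Tar}_{is}$,
it follows that $\frac{s_{its}}{s^{Tar}_{is}} - 1 > 0$. Because $o_{its}$ and $u_{its}$ are non-negative variables and both are
associated with positive cost rates, the cost-minimal values for $o_{its}$ and $u_{its}$ satisfying (3.51)
are obtained by $o_{its} = \frac{s_{its}}{s^{Tar}_{is}} - 1$ and $u_{its} = 0$. Otherwise, $u_{its} = 1 - \frac{s_{its}}{s^{Tar}_{is}}$ and $o_{its} = 0$. Finally,
(3.52) and (3.53) define the variables’ domains.

$$o_{its} - u_{its} = \frac{s_{its}}{s^{Tar}_{is}} - 1 \qquad \forall i \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.51}$$

$$x_{ijts},\ z^{L}_{its},\ z^{E}_{its},\ s_{its},\ s^{L}_{its},\ o_{its},\ u_{its} \in \mathbb{R}_{+} \qquad \forall i \in \mathcal{I},\ j \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.52}$$

$$y_{ijt},\ rx^{L}_{ijts},\ rx^{E}_{ijts},\ rz^{E}_{its},\ rz^{L}_{its},\ sr^{L}_{its},\ sr^{E}_{its} \in \mathbb{N}_{0} \qquad \forall i \in \mathcal{I},\ j \in \mathcal{I},\ t \in \mathcal{T},\ s \in \mathcal{S} \tag{3.53}$$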
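
The cost-minimal split of a stock deviation into over- and undershooting can be sketched in a few lines. The function name is an assumption, the deviations are treated as ratios as in (3.51), and the stock and target values are chosen for illustration.

```python
def deviations(stock, target):
    """Return (o, u): relative over- and undershooting of the target stock."""
    ratio = stock / target - 1.0
    return max(ratio, 0.0), max(-ratio, 0.0)

o_over, u_over = deviations(600.0, 500.0)    # stock above target
o_under, u_under = deviations(450.0, 500.0)  # stock below target
print(o_over, u_over, o_under, u_under)
```

At most one of the two values is positive, and their difference always equals the right-hand side of (3.51).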

Constraints (3.33)-(3.53) constitute the MC-RTP. The optimal solution of the MC-

93
Note that dropping (3.46) would circumvent the distinction in empty and loaded RTCs. This would
allow sending empty RTCs declared as loaded ones to another site where these could be loaded
immediately. Hence, the turnover times would not be considered.

RTP minimizes the total costs for transports, turnover activities, and stock holding. The
solution provides flows of RTCs and chemicals within the considered time horizon. At
the end of the planning horizon, no transports are planned because the arrival periods
are beyond the time horizon.94 In practice, therefore, the MC-RTP has to be applied in
a rolling horizon environment with overlapping time horizons to update schedules with
new information e.g. about consumption and production rates.95 I.e. starting with an
initial MC-RTP instance and for a fixed length of the planning horizon, trains and RTCs
are scheduled in accordance with the corresponding (optimal) solution for the first periods
(the re-planning interval).96 After the re-planning interval has elapsed, an updated
MC-RTP instance is set up and solved. Therefore, updated initial values for stocks and
available RTCs etc. and additional production/consumption rates for the new periods to
be planned are incorporated. Additionally, trains that have departed but not arrived yet
have to be considered. To use the MC-RTP in a rolling horizon environment, additional
initialization equations have to be added.
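
The rolling horizon procedure described above can be outlined as a generic loop. This is only a sketch under stated assumptions: `solve` and `advance` are hypothetical callables standing in for an MC-RTP solver and for the update of stocks, RTC positions, and trains in transit; neither is defined in the text.

```python
def rolling_horizon(n_periods, horizon, replan, solve, advance, state):
    """Plan `horizon` periods at a time, but commit only the first `replan` periods."""
    schedule = []
    t = 0
    while t < n_periods:
        plan = solve(t, horizon, state)           # hypothetical: decisions for t .. t+horizon-1
        keep = plan[:min(replan, n_periods - t)]  # re-planning interval
        if not keep:
            raise ValueError("solver returned an empty plan")
        for decision in keep:
            state = advance(state, decision)      # hypothetical: new initial values
        schedule.extend(keep)
        t += len(keep)
    return schedule, state

# Toy stand-ins: a "plan" is just the list of period indices, the state counts periods
toy_solve = lambda t, horizon, state: list(range(t, t + horizon))
toy_advance = lambda state, decision: state + 1
schedule, final_state = rolling_horizon(10, 4, 2, toy_solve, toy_advance, 0)
print(schedule, final_state)
```

Only the committed decisions enter the final schedule; the remaining periods of each plan are discarded and re-planned with updated information.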
At first, initial stock levels in tanks $s^{Ini}_{is}$, quantities available in RTCs $s^{L-Ini}_{is}$, the corresponding
numbers of loaded RTCs $r^{L-Ini}_{is}$, and the available numbers of empty RTCs $r^{E-Ini}_{is}$
for each chemical s and site i at the beginning of the planning horizon are incorporated
(see (3.54)-(3.57)).

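$$s_{i0s} = s^{Ini}_{is} \qquad \forall i \in \mathcal{I},\ s \in \mathcal{S} \tag{3.54}$$

$$sr^{E}_{i0s} = r^{E-Ini}_{is} \qquad \forall i \in \mathcal{I},\ s \in \mathcal{S} \tag{3.55}$$

$$sr^{L}_{i0s} = r^{L-Ini}_{is} \qquad \forall i \in \mathcal{I},\ s \in \mathcal{S} \tag{3.56}$$

$$s^{L}_{i0s} = s^{L-Ini}_{is} \qquad \forall i \in \mathcal{I},\ s \in \mathcal{S} \tag{3.57}$$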

If RTCs and chemicals are dispatched in previous periods (and arrive during the planning
horizon), these have to be considered. This affects variables $x_{ij\tau_{ij}s}$, $rx^{L}_{ij\tau_{ij}s}$, and $rx^{E}_{ij\tau_{ij}s}$
with $1 - t^{Trv}_{ij} \le \tau_{ij} \le 0$ which have to be set to the corresponding initial values denoted by
$x^{Ini}_{ij\tau_{ij}s}$, $rx^{L-Ini}_{ij\tau_{ij}s}$, and $rx^{E-Ini}_{ij\tau_{ij}s}$ as shown in (3.58)-(3.60).

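$$x_{ij\tau_{ij}s} = x^{Ini}_{ij\tau_{ij}s} \qquad \forall (i,j) \in \mathcal{L},\ s \in \mathcal{S},\ \tau_{ij} \in \{1 - t^{Trv}_{ij}, ..., 0\} \tag{3.58}$$

$$rx^{E}_{ij\tau_{ij}s} = rx^{E-Ini}_{ij\tau_{ij}s} \qquad \forall (i,j) \in \mathcal{L},\ s \in \mathcal{S},\ \tau_{ij} \in \{1 - t^{Trv}_{ij}, ..., 0\} \tag{3.59}$$

$$rx^{L}_{ij\tau_{ij}s} = rx^{L-Ini}_{ij\tau_{ij}s} \qquad \forall (i,j) \in \mathcal{L},\ s \in \mathcal{S},\ \tau_{ij} \in \{1 - t^{Trv}_{ij}, ..., 0\} \tag{3.60}$$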

94
A similar problem occurs in pipeline operations planning when batches planned at the end of the
planning horizon are solely injected to keep the pipeline working. Such batches do not arrive at
depots during the planned time horizon. Hence, they cannot satisfy demands at depots.
95
For a similar rolling horizon approach for pipeline operations planning see Cafaro and Cerdá (2008).
96
Note that the re-planning interval should be considerably smaller than the planning horizon, see
e.g. Cafaro and Cerdá (2008).
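Similarly, RTCs already shunting at the yards at the beginning of the planning horizon
have to be considered by setting the variables $rz^{L}_{i\tau_{is}s}$ and $rz^{E}_{i\tau_{is}s}$ for $1 - t^{Trn}_{is} \le \tau_{is} \le 0$ to
their corresponding initial values $rz^{L-Ini}_{i\tau_{is}s}$ and $rz^{E-Ini}_{i\tau_{is}s}$:

$$rz^{E}_{i\tau_{is}s} = rz^{E-Ini}_{i\tau_{is}s} \qquad \forall i \in \mathcal{I},\ s \in \mathcal{S},\ \tau_{is} \in \{1 - t^{Trn}_{is}, ..., 0\} \tag{3.61}$$

$$rz^{L}_{i\tau_{is}s} = rz^{L-Ini}_{i\tau_{is}s} \qquad \forall i \in \mathcal{I},\ s \in \mathcal{S},\ \tau_{is} \in \{1 - t^{Trn}_{is}, ..., 0\} \tag{3.62}$$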

In combination, equations (3.58)-(3.62) allow using the MC-RTP in a rolling horizon


environment. Otherwise, the corresponding variables are set to 0.
The following example illustrates the applicability of the MC-RTP for an artificial prob-
lem instance.

Example 6 (Rail operations planning). Consider a chemical production network consisting
of three sites where three commonly produced/consumed chemicals are planned for
14 periods. A period equals 12 hours. At the beginning of each day, a single train can
be dispatched from every site. Each site is connected with any other site via a rail link,
i.e. $\mathcal{L} = \{(i, j) : i, j \in \mathcal{I} \wedge i \neq j\}$. Table 3.15a shows the travelling times $t^{Trv}_{ij}$ for all rail
links (i, j) ∈ L.

(a) $t^{Trv}_{ij}$

            site j
site i      1     2     3
1           -     2     4
2           2     -     1
3           4     1     -

(b) $r^{E-Ini}_{is}$

            chemical s
site i      1     2     3
1           20    20    25
2           20    20    20
3           25    20    25

(c) $r^{Cap}_{s}$, $r^{We}_{s}$, $r^{Le}_{s}$

                  chemical s
                  1     2     3
$r^{Cap}_{s}$     15    15    20
$r^{We}_{s}$      5     5     10
$r^{Le}_{s}$      20    20    25

Table 3.15: Technical parameters for the rail operations planning example

For each chemical a fleet of RTCs is available. No previous transports are assumed
to be on track such that the initialization variables in constraints (3.58)-(3.62) are set to
zero. Similarly, all available RTCs at the sites at the beginning of the planning horizon
are empty and ready for loading. Hence, initialization variables in constraints (3.56) and
(3.57) are zero, too. Tables 3.15b and 3.15c show the distribution of the RTC fleets among
the sites and the technical configurations for the RTCs in the chemicals’ fleets, respectively.
All turnover times $t^{Trn}_{is}$ are set to one period. Furthermore, train specifications are given
by $tr^{We}_{ij} = 800$ and $tr^{Le}_{ij} = 1{,}000$ for all (i, j) ∈ L. The unloading and loading capacities $e^{Cap}_{its}$
and $l^{Cap}_{its}$ are set to 700 tons such that a complete train’s load can be (un-)loaded within a
period.
For simplicity it is assumed that the chemicals’ consumption/production rates $\omega_{its}$ are
constant over the planning horizon. Table 3.16 displays the assumed consumption/production
rates $\omega_{is}$ as well as stock capacities $s^{Cap}_{is}$, initial stocks $s^{Ini}_{is}$, and target stock levels
$s^{Tar}_{is}$ for each chemical s and site i.

site i             1                    2                    3
chemical s         1     2     3        1     2     3        1     2     3
$\omega_{is}$      -50   25    100      -50   25    -200     100   -50   100
$s^{Cap}_{is}$     3000  2500  2500     1500  1500  5000     5000  5000  5000
$s^{Ini}_{is}$     400   500   700      500   300   1000     700   500   1000
$s^{Tar}_{is}$     500   250   500      500   250   1000     750   500   1000

Table 3.16: Consumption rate $\omega_{is}$, stock capacities $s^{Cap}_{is}$, initial stocks $s^{Ini}_{is}$, and target
stock levels $s^{Tar}_{is}$ for all sites i and chemicals s

Finally, cost rates are defined as follows: $c^{Os}_{is} = 1$, $c^{Us}_{is} = 1{,}000$, $c^{Trv}_{ij} = 100$, and $c^{Trn}_{s} = c^{Add} = 10$
for all i, j ∈ I and s ∈ S. Note that overshooting the target stock levels induces
only a thousandth of the cost for undershooting. The train cost rate $c^{Trv}_{ij}$ is comparatively
small forcing extensive rail activities.
Solving this instance of the MC-RTP leads to the optimal solution after 8 seconds with
an objective value of 20,845.97 Note that network design problems are NP-hard problems.98
However, the computational complexity depends on the relation of the parts of the
objective function and the size of the network (particularly the number of links). Here,
a comparatively small problem instance with a favourable ratio of costs99 is given which
eases solving to optimality.
The optimal transport flows and stock levels are displayed in Figure 3.10 for the first
eight periods by means of a time-space expanded network. Each cell shows the stock levels
at a site in a particular period. Dispatched trains are indicated by arrows between two
nodes where the associated transport quantities (in tons of chemicals) and/or the number
of empty RTCs are assigned to each arrow.100
In t = 1, from sites i = 1 and i = 3 repositioning transports are dispatched enabling
chemical transports in later periods. For example, empty RTCs of chemicals s = 1 and
s = 3 are dispatched in period t = 1 from site i = 3 to site i = 2 arriving there in t = 2.
In period t = 3, empty RTCs of chemical s = 1 are partially forwarded101 (4 out of 20)
to site i = 1 together with some loaded RTCs of chemical s = 3.102 Similar sequences of
re-positioning and supply transports can be found for the remaining chemicals and sites.
These sequences are repeated if the production/consumption rates do not change in time
which results in recurring re-positioning cycles.
97
Calculations are performed on a 2.6 GHz machine with IBM CPLEX 12.5, see IBM (2010).
98
See e.g. Crainic (2000).
99
To be precise, the inventory costs dominate transport costs such that network flow primarily determines
the total costs.
100
For clarity, the indices of variables annotated at the arcs are reduced and indicate the number of the
chemical only.
101
The rest is used for transports of chemical s = 1 back to site i = 3 in periods t = 5 and t = 7.
102
To be precise, 26 loaded RTCs of chemical s = 3 are dispatched in period t = 3 carrying 520 tons from
site i = 2 to site i = 1. The train’s maximum weight is totally exploited.
[Figure 3.10 (graphic not reproduced): time-space expanded network with sites i = 1, 2, 3 as rows and periods t = 1 to 8 as columns. Each cell shows the current stock level, target stock, and inventory capacity for chemicals 1 to 3; arrows between cells are annotated with shipped quantities x and empty RTC flows rx^E per chemical.]

Figure 3.10: Optimal stock levels and network flows for periods 1 to 8 of the MC-RTP example

For settings with realistically sized finite planning horizons, only the beginnings of such
cycles are visible. At the end of the planning horizon no transports are reasonable as
the savings in stock holdings become effective in periods beyond the planning horizon
only. Therefore, stock levels become less balanced the closer the end of the planning horizon
approaches. This end-of-horizon effect becomes strikingly apparent for chemical s = 1 (dark-grey
curves) at sites i = 1 and i = 3 as it is illustrated by Figure 3.11.103 The figure shows
the stock levels for all chemicals, periods, and sites including the corresponding target
stocks (as dashed lines).
It can be observed that the stock levels meet the corresponding target stocks well in most
cases. For chemical s = 2 (medium grey curves) the stock levels are close above the target
stocks at all sites and in (almost) all periods.104 Chemicals s = 3 (light grey curves) and
s = 1 are oscillating around their target stock levels at sites i = 1 and i = 2 whereas at
site i = 3 both chemicals are clearly below their target stocks. Transports of chemical s = 3
from site i = 2 to i = 3 could reduce holding costs at both sites.105 However, this option
is not realized either due to a lack of RTCs or due to economic reasons (e.g. the related
transport costs).
This example provides some first insights into the interplay of re-positioning and stockpiling
transports, which depends crucially on both the availability of RTCs and the levels of the
target stocks. Moreover, it becomes apparent that in practical applications the
MC-RTP has to be applied in a rolling horizon environment to overcome end-of-horizon
problems like stock decreases and omitted repositioning transports.106
In reality, production networks typically consist of more than three sites,107 and
a time horizon of one or two weeks is reasonable in most cases. The time horizon depends
on the forecasting stability of production/consumption estimates as well as on the transport
times. The model's complexity increases with increasing numbers of nodes and
periods. For very large instances, a heuristic procedure can be set up as follows:

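1. Build a relaxed MC-RTP by replacing the train cost part
$\sum_{(i,j)\in\mathcal{L}} \sum_{t\in\mathcal{T}} y_{ijt} \cdot c^{\mathrm{Tr}}_{ij}$ in the
objective function (3.33) with

$$\sum_{(i,j)\in\mathcal{L}} \sum_{s\in\mathcal{S}} \sum_{t\in\mathcal{T}} \left( rx^{L}_{ijts} \cdot \tilde{c}^{\mathrm{Tr-L}}_{ijs} + rx^{E}_{ijts} \cdot \tilde{c}^{\mathrm{Tr-E}}_{ijs} \right)$$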

2. Solve the relaxed MC-RTP with a standard solver.

3. Feed the determined chemical flows x∗ijts to the original MC-RTP.

4. Solve the MC-RTP instance with fixed chemical flows.


103
The effect is also obvious for chemical s = 2 (medium grey curves) at sites i = 1 and i = 3.
104
Except for periods t = 3 and t = 7 at site i = 2 where a slight undershooting is observed.
105
Since this chemical is in deficit at site i = 3 but in surplus at site i = 2.
106
For a more detailed discussion of end-of-horizon problems see Hughes and Powell (1988).
107
For example the production network of Dow Chemical consists of more than 20 sites.
[Figure 3.11: Optimal stock levels for the MC-RTP instance of example 6. Panels: locations
i = 1, 2, 3; x-axis: period (0 to 14); y-axis: stock level; curves: stock level and target
stock for chemicals s = 1, 2, 3.]

A considerable part of the model's complexity is caused by the integer decisions that
determine the number of dispatched trains. Since these decision variables are associated with
fixed charges per train, a variabilization of these fixed costs simplifies the problem
considerably. Variable transport costs depend linearly on the number of RTCs dispatched on a
rail link. To define the variable transport cost rates $\tilde{c}^{\mathrm{Tr-L}}_{ijs}$ and
$\tilde{c}^{\mathrm{Tr-E}}_{ijs}$, calculate the maximal number of loaded and empty RTCs to be
carried, i.e.
$n^{\mathrm{Cap-L}}_{ijs} = \left\lfloor \frac{tr^{\mathrm{We}}_{ij}}{r^{\mathrm{We}}_{s} + r^{\mathrm{Cap}}_{s}} \right\rfloor$
and
$n^{\mathrm{Cap-E}}_{ijs} = \left\lfloor \frac{tr^{\mathrm{Le}}_{ij}}{r^{\mathrm{Le}}_{s}} \right\rfloor$,
respectively. Then, the transport cost rates can be calculated as

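$$\tilde{c}^{\mathrm{Tr-E}}_{ijs} = \frac{c^{\mathrm{Tr}}_{ij}}{n^{\mathrm{Cap-E}}_{ijs}} \tag{3.63}$$

$$\tilde{c}^{\mathrm{Tr-L}}_{ijs} = \frac{c^{\mathrm{Tr}}_{ij}}{n^{\mathrm{Cap-L}}_{ijs}}. \tag{3.64}$$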

The proposed heuristic reduces the computation time by approximately 50% for the
instance described in this example.108 The heuristic solution yields total costs of 21,117.75,
which is about 1.3% above the optimal value.
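
The variabilization in (3.63) and (3.64) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the symbols are read as maximum train weight and length (tr^We, tr^Le) and as tare weight, capacity, and length of an RTC (r^We, r^Cap, r^Le), and all numbers are invented rather than taken from example 6.

```python
from math import floor


def variabilized_rates(train_cost, train_max_weight, train_max_length,
                       rtc_empty_weight, rtc_capacity, rtc_length):
    """Return (n_loaded, n_empty, rate_loaded, rate_empty) for one link and chemical."""
    # Loaded RTCs: limited by the train's maximum weight (assumed reading of tr^We).
    n_loaded = floor(train_max_weight / (rtc_empty_weight + rtc_capacity))
    # Empty RTCs: limited by the train's maximum length (assumed reading of tr^Le).
    n_empty = floor(train_max_length / rtc_length)
    if n_loaded < 1 or n_empty < 1:
        raise ValueError("train cannot carry a single RTC")
    # (3.64) and (3.63): spread the fixed train cost over a fully used train.
    return n_loaded, n_empty, train_cost / n_loaded, train_cost / n_empty


# Hypothetical data: train cost 1,000, 1,600 t weight limit, 600 m length limit,
# RTCs with 25 t tare weight, 55 t capacity and 16 m length.
n_l, n_e, rate_l, rate_e = variabilized_rates(1000.0, 1600.0, 600.0, 25.0, 55.0, 16.0)
print(n_l, n_e, rate_l, round(rate_e, 2))
```

Because the rates assume fully used trains, they tend to underestimate the true cost of partially filled trains, which is one reason why the flows are fed back into the original model in steps 3 and 4.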

3.4 Planning problems for ship operations


Ship transportation is one of the most attractive transport modes for procuring raw chemicals
in the chemical industry. Raw materials and basic chemicals are required constantly
and in huge quantities to supply continuously operating chemical production plants. The
ideal transport mode is pipeline transport. However, pipelines dedicated to basic and
raw chemicals are seldom available due to the large distances to bridge, high investment
costs, and the inflexibility of sourcing. In most cases, raw and basic chemicals are
transported by ship if a sea or river port is available.

3.4.1 Technical and organizational prerequisites


Transporting chemicals in ships (or tankers) leads to a similar technological classification
of transport carriers as for RTCs. Depending on the phase of the chemical to be transported,
two main classes can be distinguished: chemical tankers are used for the transport of
liquids, whereas gas tankers are used for gases.
Chemical tankers are similar to oil tankers but typically considerably smaller. Large-scale
chemical tankers for open sea transports have a deadweight of about 40,000 tons
at maximum, whereas the largest class of (crude) oil tankers has a deadweight range from
320,000 up to 550,000 tons.109 Chemical tankers carry a number of tanks that can be

108
Note that the relaxed MC-RTP was solved to optimality although this is not necessary for a heuristic.
The procedure can be accelerated by terminating optimization at a threshold for computation time or
optimality gap.
109
Chemical tankers for inland waterway transports typically have a deadweight of at most 10,000 tons.

independently loaded with different liquid chemicals. Depending on the hazardousness of
the chemicals to be transported, there exist different classes of chemical tankers which vary
in their safety features.110 Unloading and loading take place at specific port terminals by
pumping the chemicals in or out. On modern chemical tankers, each tank can be handled
separately. To avoid explosions or other chemical reactions, the tanks are always filled
either with the chemicals to be transported or with inert gas (e.g. nitrogen).
Gas tankers are technologically more sophisticated vessels. Similar to transports in
RTCs, gaseous chemicals can only be transported efficiently in a liquefied state. Hence,
gas tankers have to keep the chemicals either under high pressure or under super-cooled
conditions.111 In particular, maintaining super-cooled conditions in all tanks is
technologically challenging on long trips, e.g. over open sea. Despite sophisticated tank
insulation, the temperature of tanks and cargo increases while travelling. This leads to a
vaporization of the super-cooled cargo and an increase of pressure inside the tanks. The
resulting gas (so-called boil-off) has to be released from the tanks to keep the pressure in
an acceptable range. It is either used to produce energy on board or is re-liquefied. Hence,
there is either a loss of cargo or additional energy is required to keep all cargo
liquefied.112 Therefore, the costs for long-haul transports of liquefied gases over open sea
are considerable. Generally, liquefaction by pressure is economically preferred on small
vessels (up to 10,000 m³), whereas liquefaction by refrigeration is carried out on
large-scale tankers (up to 125,000 m³).113 The loading procedure is more complicated than for
chemical tankers since differences in pressure and temperature have to be equalized before
loading can take place. While gas tankers are unloaded, the chemicals on board vaporize
gradually. Hence, similar to gas RTCs, a residue of the chemical remains in the tanks if it is
not pumped out (e.g. by spray pumps). However, since the transport quantities are much larger
compared to RTCs, the tanks are typically emptied completely. Most gas tankers have a
transport capacity of about 125,000 m³.114 The payload in tons depends on the density of the
liquefied chemical.
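
As a rough worked illustration of this last point, the payload follows from tank volume times cargo density. Assuming a density of about 0.45 t/m³ for LNG (a typical order of magnitude, not a value taken from the sources cited here), a tanker of 125,000 m³ would carry roughly

$$m = V \cdot \rho \approx 125{,}000\ \mathrm{m^3} \cdot 0.45\ \mathrm{t/m^3} = 56{,}250\ \mathrm{t}.$$

A denser liquefied chemical would yield a proportionally higher payload for the same tank volume, as long as the ship's deadweight permits it.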
Waterway transports of raw and basic chemicals are organized differently from
RTC transports. Chemical companies usually do not own or rent tankers (for long periods)
themselves, but place transport orders e.g. on spot markets.115 Hence, routing and

For this classification see Hayler et al. (2003) or Stopford (2009).


110
See Stopford (2009) for details.
111
Super-cooled here means temperatures well below 0 °C, depending on the gas to be liquefied;
e.g. liquefied natural gas (LNG) is liquefied at about -160 °C, whereas Ethylene liquefies at
about -100 °C, see Stopford (2009).
112
For instance, the transport of liquefied natural gas requires 10-25% of the energy transported in form
of cargo to maintain super-cooled conditions and re-liquefy the vaporized natural gas, see Schumacher
(2011).
113
This classification is neither distinctive nor complete. There are also tankers providing both types of
liquefaction techniques (so-called semi-refrigerated tankers) as well as small-scaled fully refrigerated
tankers or large-scaled only pressurized tankers. See Stopford (2009) for details.
114
Most recently larger ships are laid down with a capacity of about 250,000 m3 . See Colton (2012) for
an encompassing list of LNG carriers under service or construction.
115
An exception are large petrochemical companies that often own a number of tankers to deliver their

scheduling of tankers is not integrated with inventory and replenishment planning per
se. Transports are organized by ship operators which own fleets with different types of
tankers. Similar to rail transports, the contracts are either quantity-based or tanker-based,
i.e. either the transport fee depends on the quantity to be shipped or a complete
tanker is chartered.116 The former option is typically used for small- to medium-sized
transport orders (say up to 10,000 tons) and organized on large-scale tankers with multiple
tanks. Typically, such large-scale tankers operate on predefined routes, acquiring
numerous transport orders of various materials at various ports.117 Due to scale effects,
such tankers can operate at comparatively low cost rates per m³·km, but they are inflexible
due to the fixed routes. If a single transport order is sufficiently large, chartering
a complete (smaller) tanker is often economically preferable. Chartering smaller
tankers generates comparatively high cost rates but offers more flexibility.
Both types of transport contracts imply a distinction between the shipper (a chemical
company) placing the transport orders and the ship operator realizing these transport
orders. Despite the fact that the chemical companies do not own or rent tankers for long
periods, an integration of transportation and inventory planning is reasonable and can
be organized by a more or less close cooperation between chemical companies and ship
operators. From the perspective of chemical companies, a close cooperation with ship
operators requires a collaborative planning of tanker routes visiting a set of production
sites under consideration of the stock levels at the sites. Such problems are categorized
as maritime inventory routing problems. In contrast, if the cooperation is less close,
only transport orders are placed e.g. on a spot market for chemical shipments. Here, the
problem is to determine start and end node, timing, and composition of transport orders
under consideration of the stock levels at the nodes. Such problems are categorized as
maritime inventory shipping problems. The next two subsections briefly review available
literature in both classes.

3.4.2 Maritime inventory routing problems


Maritime inventory routing and scheduling problems correspond to the class of shipping
problems for industrial operations where the shipper owns or manages the ships used for
the transports.118 In the chemical industry, this class primarily concerns petrochemical
companies with a global network of production sites and customer terminals. The transport
quantities are huge and well predictable, justifying an integrated collaborative planning
of inventory management and routing. Maritime inventory routing problems (MIRP) are

products. See e.g. Persson and Gothe-Lundgren (2005).


116
For rail transports either contracts are based on the number of RTCs (wagonload contract) or complete
block trains.
117
This corresponds to the so-called liner problem, see e.g. Christiansen et al. (2004).
118
See e.g. Christiansen et al. (2004).

subtypes of the classic inventory routing problem which are characterized by the following features:119

• There is a set of ports, where at least one product is consumed or produced (at a
constant rate).

• Each port has inventory capacities to store products.

• Each port has the capability to unload/load ships.

• Ships are used to transport products between the ports in a finite time horizon.

• A heterogeneous fleet of ships is available for transportation.120

If multiple periods are considered, the routes have to be coordinated in space and
time. Such problems are also categorized as maritime inventory routing and scheduling
problems (MIRSP). Typically, the objective is to minimize the total operational costs over a
finite time horizon consisting of

• transport costs for operating the ships (mainly fuel and manpower),

• port costs for loading and unloading, and

• inventory holding costs.

Note that inventory costs are omitted by many approaches121 with the argument that all
inventories are owned by the same company, i.e. the total inventory across all nodes of the
network is assumed to be independent of the routing and scheduling decisions. However,
this argument is not entirely correct in the long run since the frequency of transports
affects the total stock in the network.122 In other words, the higher the transport
frequency (and thus the transport costs), the lower the total stock in the network (and,
hence, the total inventory holding costs). In the short run, however, inventory holding costs
(in the sense of capital commitment costs) are probably sufficiently small to be disregarded
as an optimization criterion.
Contributions on MIRSPs differ mainly in two respects: the number of products
considered and the solution method(s). Basically, the underlying problem structure is
a vehicle routing problem (VRP) and/or a pick-up and delivery problem (PDP) under
maritime-specific constraints such as port and turnover capacities as well as inventory
constraints. Since VRP and PDP are hard to solve to optimality even in their basic
form, solution methods for MIRSPs mainly rely on sophisticated and tailored heuristic
procedures. First contributions (Christiansen and Nygreen (1998); Christiansen (1999))
119
For a comparison with "classic" IRPs for other modes of transport see e.g. Christiansen et al. (2004)
or Andersson et al. (2010).
120
This is not a necessary condition, however most work assumes heterogeneous fleets as this is typical
for real world problems.
121
See Christiansen (1999); Al-Khayyal and Hwang (2007); Siswanto et al. (2011).
122
Also compare the discussion about pipeline scheduling, e.g. see Figure 3.5.

focus on one-product problems and model the underlying real-world problem as an Inventory
PDP with Time Windows (IPDPTW). It can be shown that inventory constraints
can be reformulated as time windows. To solve the IPDPTW, a column generation approach
is proposed. To this end, the IPDPTW is reformulated to allow a decomposition
into ship routing and harbour visiting sub-problems.123 By solving the sub-problems,
feasible ship routes and visiting sequences for all ships and harbours are generated.124
In the master problem, a combination of the identified routes and sequences has to be
determined. For this purpose, a tailored branch-and-bound procedure is proposed to find an
optimal combination of routes and sequences.
MIRSPs for multiple products require a sophisticated modelling of the tanker capacity
which has to be modelled as a set of tanks dedicated to a specific product on a specific trip.
In Persson and Gothe-Lundgren (2005) it is shown that in such a formulation inventory
constraints cannot be converted into time windows. In Al-Khayyal and Hwang (2007)
the problem is simplified by fixing the assignment of tanks to products. Siswanto et al.
(2011) show that for MIRSPs considering non-dedicated tanks and multiple products even
small instances are much harder to solve to optimality than the single product problems
with dedicated tanks. Contributions also differ in the technological restrictions at ports,
e.g. how many ships or products can be served simultaneously. Christiansen and Nygreen
(1998); Christiansen (1999) and Siswanto et al. (2011) allow at most one ship to be handled
at a time in a harbour, whereas Al-Khayyal and Hwang (2007) allow multiple ships to be
docked at the same time.
Almost all contributions to MIRSP use a time-continuous formulation and consider constant
consumption and/or production rates for the products at the sites. One exception
is Persson and Gothe-Lundgren (2005), who use a time-discrete formulation and, moreover,
incorporate production planning decisions besides routing and inventory management.
The second exception is Christiansen et al. (2011), who describe a solution procedure for a
MIRSP with time-varying consumption rates (but no mathematical model).
The literature on MIRSPs is large and still growing, particularly addressing specific
aspects of industrial applications. For example, Christiansen et al. (2011) and Stålhane et al.
(2012) present applications of MIRSPs in the cement industry and for LNG transports.
A related branch of literature focuses on the problem of scheduling shipments when the
ships are not owned/managed by the shipper. These maritime inventory shipping problems
are discussed in the next subsection.

123
For each ship a routing model is formulated which is independent from all other ships. Similarly, for
each harbour a visiting model is proposed which is independent from all other harbours.
124
Promising routes and sequences are identified by solving the LP-relaxation of the master problem
iteratively to determine columns with least reduced costs.

3.4.3 Maritime inventory shipping problems


Maritime inventory shipping models face the same general problem structure as MIRSPs.
However, the organization of shipments is different for this class of problems. In contrast
to MIRSPs, maritime inventory shipping problems (MISP) do not explicitly consider the
routing of a particular ship. Instead, it is assumed that some ships are available for
transport orders, e.g. by hiring tankers on a spot market. Hence, the planning of routes
is left to the ship operator and not integrated in the production and inventory planning.
From the shipper's perspective, the following decisions have to be made:

• when a transport is to be performed,

• what products in which quantity are to be shipped, and

• which ship (type) is to be used.

These decisions are made based on the inventory and production constraints as well as
the cost rates offered for different types of ships by the ship operators.
Ronen (2002) proposes a MISP for a single ship type where the ships’ capacities are
not modelled in detail, e.g. no tanks are modelled and the transport capacity is only
restricted by a port-based upper bound. Only one ship can leave a departure port per
period but multiple ships may arrive at a destination port in a period. The objective
function minimizes the total shipping and inventory costs where the inventory costs are
measured as the weighted sum of absolute deviations from a desired safety stock level,
i.e. no adjustments according to the target stock level are made. The shipping costs
consist of a fixed part per dispatched ship and a variable part depending on the quantity
shipped. To address the lack of detail in the approach presented by Ronen
(2002), the following extensions are made:

• Multiple ship departures at a port are allowed.

• Multiple types of ships are available, characterized by a type-specific number of tanks
and charter rates.

• Different types of tanks on a ship are modelled with varying sets of loadable prod-
ucts.

• Transport costs consist of a daily charter rate (depending on ship type) and fixed
port fees (depending on the departure and arrival port and the ship type).

• Inventory costs are measured as the weighted sum of relative deviations from a
target stock level.

Due to the great variety of ship types available for chemical transports (in particular
among gas tankers) it seems reasonable to reflect this option by modelling a heterogeneous

set of tanker types. The main features characterizing tankers from an economic point of
view are the offered transport capacity and the associated costs. The transport capacity
depends on the size and type of tanks on board. The type of tank also determines the
set of products that can be loaded. In the case of chemical tankers, this set is mainly
determined by the safety class of a tank. For gas tankers the liquefaction technology
plays an additional role.125 Hence, the number and the types of tanks installed on a
tanker determine its flexibility and capacity.
Altogether, a MISP is proposed that minimizes the total costs for shipping and stock
holding. Decisions concern the transport volumes to be shipped, the ship types
used for the shipments and, subsequently, the assignment of chemicals to tanks on a trip.
The resulting maritime inventory shipping problem with ship type and tank assignment
(MISP-STA) uses the notation given in Table 3.17.

Sets
  $\mathcal{S} = \{1, ..., S\}$                      set of chemicals
  $\mathcal{I} = \{1, ..., I\}$                      set of ports/sites
  $\mathcal{T} = \{1, ..., T\}$                      set of periods
  $\mathcal{K} = \{1, ..., K\}$                      set of tank types
  $\mathcal{V} = \{1, ..., V\}$                      set of tanker types
  $\mathcal{L} \subseteq \mathcal{I} \times \mathcal{I}$    set of links
Parameters
  $c^{\mathrm{Os}}_{is}$      over-shooting cost rate
  $c^{\mathrm{Us}}_{is}$      under-shooting cost rate
  $c^{\mathrm{Trn}}_{is}$     turnover costs for chemical s ∈ S at port i
  $c^{\mathrm{Trv}}_{v}$      cost rate for chartering a tanker of type v for one period
  $c^{\mathrm{Port}}_{iv}$    port fee for tanker type v at port i
  $\omega_{its}$              local net balance of chemical s ∈ S at port i ∈ I in period t ∈ T; if
                              $\omega_{its} > 0$ there is a net deficit, otherwise a surplus
  $q^{\mathrm{Cap}}_{sk}$     capacity of tank type k for chemical s ∈ S
  $n^{\mathrm{Cap}}_{vk}$     number of tanks of type k installed on tanker type v
  $s^{\mathrm{Cap}}_{is}$     inventory capacity for chemical s ∈ S at port i ∈ I
  $s^{\mathrm{Tar}}_{is}$     target stock level for chemical s ∈ S at port i ∈ I
  $s^{\mathrm{Ini}}_{is}$     initial stock level for chemical s ∈ S at port i ∈ I
  $x^{\mathrm{Ini}}_{ij\tau s}$   initial flows of chemical s ∈ S from i to j in period
                              $\tau \in \{1 - t^{\mathrm{Trv}}_{ij}, ..., 0\}$
  $t^{\mathrm{Trv}}_{ij}$     integer valued travelling time on link (i, j) ∈ L
Decision variables
125
E.g. Ethylene can be transported efficiently under super-cooled conditions only. For most other basic
chemicals also liquefaction by pressure is possible. See Stopford (2009).

  $x_{ijts}$     flow of chemical s ∈ S from i to j in period t ∈ T
  $y_{ijtv}$     number of tankers of type v used on trip (i, j, t) (integer)
  $z_{ijtsk}$    number of tanks of type k dedicated to chemical s on trip (i, j, t) (integer)
Variables
  $s_{its}$      stock level of chemical s ∈ S in period t ∈ T at port i ∈ I
  $o_{its}$      relative over-shooting in % of the target stock of chemical s ∈ S in
                 period t ∈ T at port i ∈ I
  $u_{its}$      relative under-shooting in % of target stock of chemical s ∈ S in
                 period t ∈ T at port i ∈ I

Table 3.17: Sets, parameters, variables, and decision variables for the MISP-STA

Since it is assumed that tankers are chartered for a specific transport order, it is
reasonable to assume that a time-dependent charter rate is charged when a tanker is booked.
Hence, the charter costs on link (i, j) ∈ L are calculated as the charter rate per period
$c^{\mathrm{Trv}}_{v}$ multiplied by the trip's duration $t^{\mathrm{Trv}}_{ij}$.126
Furthermore, port fees are to be expected depending on the handling time (for unloading or
loading) as well as the size of the ships.127 Together, these constitute the transport costs
depending on the travel time, ship type, and ports (see first two rows of (3.65)).
As a counterpart, inventory costs have to be faced. As discussed for the MC-RTP model,
it is assumed that target stock levels (in the sense of safety stocks) are predefined for each
port/production site and product. Deviations from these target levels are to be minimized
and are included in the objective function multiplied by specific cost rate equivalents.
These cost rate equivalents are higher for negative deviations (i.e. undershooting) and
lower (or zero) for positive deviations (i.e. overshooting). As before, undershooting cost
rate equivalents reflect the expected costs for a plant shutdown due to raw material short-
age whereas overshooting cost rate equivalents reflect unnecessary stock holding costs. As
in the MC-RTP, stock deviations are measured relative to the target stock levels taking
into account varying target stock levels.128 The total costs of transports and inventory
holding are minimized and formulated as

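$$
\begin{aligned}
TC = {} & \sum_{(i,j)\in\mathcal{L}} \sum_{t\in\mathcal{T}} \sum_{v\in\mathcal{V}} \left( y_{ijtv} \cdot \left( t^{\mathrm{Trv}}_{ij} \cdot c^{\mathrm{Trv}}_{v} + c^{\mathrm{Port}}_{iv} + c^{\mathrm{Port}}_{jv} \right) \right) + {} \\
& \sum_{(i,j)\in\mathcal{L}} \sum_{t\in\mathcal{T}} \sum_{s\in\mathcal{S}} \sum_{k\in\mathcal{K}} \left( z_{ijtsk} \cdot \left( c^{\mathrm{Trn}}_{is} + c^{\mathrm{Trn}}_{js} \right) \right) + {} \\
& \sum_{i\in\mathcal{I}} \sum_{s\in\mathcal{S}} \sum_{t\in\mathcal{T}} \left( c^{\mathrm{Os}}_{is} \cdot o_{its} + c^{\mathrm{Us}}_{is} \cdot u_{its} \right).
\end{aligned}
\tag{3.65}
$$

126
To reflect distance-dependent discounts the cost rates can be assumed to be negotiated for each trip
individually, i.e. $c^{\mathrm{Trv}}_{ijv}$.
127
See Stopford (2009) for more details.
128
See the discussion in subsection 3.3.2.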

Constraints of the MISP-STA are formulated in (3.66)-(3.72)

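$$
\begin{aligned}
s_{its} &= s_{i(t-1)s} - \omega_{its} + \sum_{j\in\mathcal{I}} x_{ji(t - t^{\mathrm{Trv}}_{ji})s} - \sum_{j\in\mathcal{I}} x_{ijts} && \forall i\in\mathcal{I},\ s\in\mathcal{S},\ t\in\mathcal{T} && (3.66) \\
x_{ijts} &\le \sum_{k\in\mathcal{K}} \left( z_{ijtsk} \cdot q^{\mathrm{Cap}}_{sk} \right) && \forall (i,j)\in\mathcal{L},\ s\in\mathcal{S},\ t\in\mathcal{T} && (3.67) \\
\sum_{s\in\mathcal{S}} z_{ijtsk} &\le \sum_{v\in\mathcal{V}} n^{\mathrm{Cap}}_{vk} \cdot y_{ijtv} && \forall (i,j)\in\mathcal{L},\ k\in\mathcal{K},\ t\in\mathcal{T} && (3.68) \\
\frac{s_{its}}{s^{\mathrm{Tar}}_{is}} - 1 &= o_{its} - u_{its} && \forall i\in\mathcal{I},\ s\in\mathcal{S},\ t\in\mathcal{T} && (3.69) \\
s_{i,0,s} &= s^{\mathrm{Ini}}_{is} && \forall i\in\mathcal{I},\ s\in\mathcal{S} && (3.70) \\
s_{its} &\le s^{\mathrm{Cap}}_{is} && \forall i\in\mathcal{I},\ t\in\mathcal{T},\ s\in\mathcal{S} && (3.71) \\
x_{ij\tau s} &= x^{\mathrm{Ini}}_{ij\tau s} && \forall (i,j)\in\mathcal{L},\ \tau\in\{-t^{\mathrm{Trv}}_{ij}, ..., 0\},\ s\in\mathcal{S} && (3.72) \\
x_{ijts} &\in \mathbb{R}_{+} && \forall i\in\mathcal{I},\ j\in\mathcal{I},\ t\in\mathcal{T},\ s\in\mathcal{S} && (3.73) \\
y_{ijtv},\ z_{ijtsk} &\in \mathbb{N}_{0} && \forall i\in\mathcal{I},\ j\in\mathcal{I},\ t\in\mathcal{T},\ s\in\mathcal{S},\ k\in\mathcal{K},\ v\in\mathcal{V} && (3.74)
\end{aligned}
$$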

Constraints (3.66) constitute the inventory balance equations consisting of the previous
period’s stock, local consumption/production (ωits ), and in-/outflows from/to other ports.
Constraints (3.67) assure that a sufficient number of tanks is assigned to chemical s in
order to store the shipping quantity xijts on board. Similarly, constraints (3.68) restrict
the total number of dedicated tanks on a trip to the number of tanks installed on board
of the chartered ships. Relative stock deviations from the desired target stock levels
are calculated by constraints (3.69). Constraints (3.70) initialize the stock levels at the
beginning of the planning horizon whereas (3.71) assure that stock capacities are not
exceeded. Finally, constraints (3.72) incorporate flows from previous periods which allows
applying the model in a rolling horizon environment.
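
To make the inventory part of the model concrete, the following Python sketch evaluates the balance (3.66), the capacity limit (3.71), and the relative deviations (3.69) for a single port and chemical, given a fixed shipping plan. It is not the optimization model itself: the function names and all numbers are hypothetical, `arrivals[t]` is assumed to hold the quantity arriving in period t (i.e. dispatched t^Trv periods earlier), and o and u are derived directly from the stock level, which matches what an optimal solution would choose when both cost rates are positive.

```python
def stock_trajectory(s_ini, omega, arrivals, departures, s_cap):
    """Stock levels over time for one port and one chemical."""
    levels, prev = [], s_ini
    for w, x_in, x_out in zip(omega, arrivals, departures):
        level = prev - w + x_in - x_out        # inventory balance (3.66)
        if level > s_cap:                       # capacity limit (3.71)
            raise ValueError("inventory capacity exceeded")
        levels.append(level)
        prev = level
    return levels


def relative_deviation(stock, target):
    """Split stock/target - 1 into over-shooting o and under-shooting u, see (3.69)."""
    dev = stock / target - 1.0
    return max(dev, 0.0), max(-dev, 0.0)


def inventory_cost(levels, target, c_os, c_us):
    """Third sum of the objective (3.65) for one port and one chemical."""
    cost = 0.0
    for stock in levels:
        o, u = relative_deviation(stock, target)
        cost += c_os * o + c_us * u
    return cost


# Hypothetical plan: net deficit of 50 per period, one delivery of 300 in period 3.
levels = stock_trajectory(500, [50, 50, 50, 50], [0, 0, 300, 0], [0, 0, 0, 0], 1000)
cost = inventory_cost(levels, 500, c_os=1.0, c_us=1000.0)
print(levels, round(cost, 2))
```

With under-shooting penalized a thousand times more heavily than over-shooting, the two periods below the target dominate the cost, which illustrates why such a cost structure pushes the optimizer towards early deliveries.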
The following example shows the applicability of the MISP-STA by means of the same
artificial setting as described in example 6.

Example 7 (Maritime inventory shipping). Assume three ports with associated production
sites where three chemicals are produced/consumed. Shipments are planned for a 14-period
scenario.129 Travelling times as well as production/consumption rates and stock
configurations are set as stated in Tables 3.15a and 3.16. Three types of tankers are
available for transportation. Different numbers and types of tanks are installed on the
tankers. Two types of tanks are distinguished. Table 3.18 shows the number of tanks
installed on the tankers and tank capacities for all products.

                          ship type v              chemical s
                          1      2      3          1      2      3
tank k = 1                1      2      4          0      500    0
tank k = 2                1      4      8          450    100    600
$c^{\mathrm{Trv}}_{v}$    200    300    400        -      -      -

Table 3.18: Technical specification of tankers and tanks
($n^{\mathrm{Cap}}_{vk}$, $q^{\mathrm{Cap}}_{sk}$, $c^{\mathrm{Trv}}_{v}$)

The charter rates $c^{\mathrm{Trv}}_{v}$ increase non-linearly with increasing tanker size to
reflect economies of scale for larger tankers. Two tank types are used: k = 2 is an all-round
tank that can be used for all chemicals, whereas type k = 1 is specialized for chemical s = 2.
The remaining cost rates are set as follows: $c^{\mathrm{Port}}_{iv} = 10$ and
$c^{\mathrm{Trn}}_{is} = 1$ for all i ∈ I, v ∈ V, and s ∈ S, respectively. Cost rates for
over- and undershooting target stock levels are set as in example 6:
$c^{\mathrm{Us}}_{is} = 1000$ and $c^{\mathrm{Os}}_{is} = 1$ for all i ∈ I and s ∈ S. This
setting was solved to optimality in about 5.7 hours;130 the corresponding minimal total cost
is 13,514. Again, the MISP-STA can be categorized as a specific network design problem where
arc capacities are associated with fixed costs for chartering ships. Therefore, the same
remarks w.r.t. the model's complexity hold as for the MC-RTP. In this problem instance,
however, the ratio between transport and stock holding cost is less favourable as the
transport cost rates are considerably higher than in the MC-RTP instance. Moreover, the
transport capacity is not restricted by a predefined set of RTCs but purely depends on the
number of chartered ships.
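
The interaction of constraints (3.67) and (3.68) with the data of Table 3.18 can be explored with a small brute-force check. The sketch below is an illustration only: it considers a single tanker per trip (the model itself allows several), ranks tanker types by their charter rate per period, and uses hypothetical shipment quantities; the function names are not part of the original model.

```python
from itertools import product

# Data from Table 3.18: tanks per tanker type, tank capacities, charter rates.
N_CAP = {1: {1: 1, 2: 1}, 2: {1: 2, 2: 4}, 3: {1: 4, 2: 8}}        # n_vk[v][k]
Q_CAP = {1: {1: 0, 2: 450}, 2: {1: 500, 2: 100}, 3: {1: 0, 2: 600}}  # q_sk[s][k]
CHARTER = {1: 200, 2: 300, 3: 400}                                    # c_v per period


def fits(shipment, tanks_free):
    """True if tanks can be dedicated to chemicals such that (3.67) and (3.68) hold."""
    chemicals = [s for s, qty in shipment.items() if qty > 0]

    def assign(idx, free):
        if idx == len(chemicals):
            return True
        s = chemicals[idx]
        tank_types = sorted(free)
        # Enumerate how many tanks of each type are dedicated to chemical s.
        for counts in product(*(range(free[k] + 1) for k in tank_types)):
            capacity = sum(n * Q_CAP[s][k] for n, k in zip(counts, tank_types))
            if capacity >= shipment[s]:
                remaining = {k: free[k] - n for n, k in zip(counts, tank_types)}
                if assign(idx + 1, remaining):
                    return True
        return False

    return assign(0, dict(tanks_free))


def cheapest_single_tanker(shipment):
    """Tanker type with the lowest charter rate that can carry the shipment, or None."""
    for v in sorted(CHARTER, key=CHARTER.get):
        if fits(shipment, N_CAP[v]):
            return v
    return None


print(cheapest_single_tanker({3: 320}))           # chemical 3 alone
print(cheapest_single_tanker({1: 120, 3: 520}))   # chemicals 1 and 3 jointly
```

Under these assumptions, a joint shipment of chemicals s = 1 and s = 3 needs two all-round tanks and therefore cannot be carried by tanker type v = 1, which has only one tank of type k = 2.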
Figure 3.12 shows the resulting optimal transport quantities dispatched in the first eight
periods.131
Mainly tanker type v = 1 is chartered since handled quantities are sufficiently small.
Exceptions can be observed in periods t = 1, t = 4 and t = 7 where chemicals s = 1 and s = 3
are shipped jointly which requires ship type v = 2. Primarily, chemical s = 3 is transported
from port/site i = 2 to i = 1 and i = 3. Chemical s = 1 is shipped from i = 1 and i = 2 to
i = 3. The transports from i = 1 to i = 3 are routed via i = 2. Note that direct transports
from i = 1 to i = 3 last four periods whereas indirect transports via port i = 2 last only
129
Due to the longer transport and handling times, a rougher time scale can be expected in the mar-
itime/waterway context compared to rail context. I.e. a period’s length is expected to be one or two
day(s).
130
The calculation was performed on a 2.6 GHz machine with IBM CPLEX 12.5, see IBM (2010).
131
Associated indices refer to the transported chemical and the used ship type on a trip (i, j, t).
[Figure 3.12: Optimal stock levels and chemical flows for periods 1 to 8 of the MISP-STA
example. Rows: ports/sites i = 1, 2, 3; columns: periods t = 1 to 8; legend: inventory
capacity, current stock level, and target stock for chemicals 1 to 3.]

three periods. Hence, indirect transports induce lower transport costs. Chemical s = 2 is
transported from the only producer site i = 3 to site i = 2.132
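
The cost advantage of the indirect route can be retraced with the transport terms of (3.65). The following Python sketch is a simplified illustration: it assumes one tanker of type v = 1 carrying a single tank on every leg, uses the cost rates of example 7, and splits the three periods of the indirect route into legs of one and two periods, which is an assumption because only the total duration is stated here. The function name is hypothetical.

```python
def trip_cost(duration, charter_rate, port_fee_from, port_fee_to,
              tanks_used=0, turnover_from=0.0, turnover_to=0.0):
    """Transport cost of one trip with one tanker: first two sums of (3.65)."""
    charter = duration * charter_rate                       # t_ij * c_v
    ports = port_fee_from + port_fee_to                     # c_iv + c_jv
    turnover = tanks_used * (turnover_from + turnover_to)   # z * (c_is + c_js)
    return charter + ports + turnover


# Cost rates of example 7: charter rate 200 (v = 1), port fee 10, turnover cost 1.
direct = trip_cost(4, 200, 10, 10, tanks_used=1, turnover_from=1, turnover_to=1)

# Indirect route via port i = 2; the 1 + 2 split of the leg durations is assumed.
indirect = (trip_cost(1, 200, 10, 10, tanks_used=1, turnover_from=1, turnover_to=1)
            + trip_cost(2, 200, 10, 10, tanks_used=1, turnover_from=1, turnover_to=1))

print(direct, indirect)
```

The additional port fees and turnover costs at the intermediate port are small compared to the charter cost saved by the shorter total travel time, so the indirect route comes out cheaper under these assumptions.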
Comparing Figure 3.12 and Figure 3.10, it can be observed that the total quantity shipped
in the first periods is higher for the maritime setting than for the corresponding rail setting
(example 6). This is caused by the restricted number of RTCs in the rail setting. The
RTCs have to be re-positioned to perform material transports. This leads to comparatively
more balanced stock levels in the maritime setting. To illustrate this fact, Figure 3.13
shows the stock levels and target stocks of all chemicals at all ports over the complete
planning horizon.
In the maritime setting, all stock levels are closer to their corresponding target stock
levels compared to Figure 3.11. This is particularly apparent for chemicals s = 1 and s = 3
at site i = 3. In the rail setting the stocks of both chemicals were below their target stocks
(almost) over the complete planning horizon. In the maritime setting, both stocks are
considerably closer to their target stocks. This corresponds to the stock patterns of both
chemicals at site i = 2 where the corresponding stocks’ overshooting is reduced. I.e. in
the maritime setting transports of chemicals s = 1 and s = 3 from site i = 2 to i = 3 are
economically preferable (in contrast to the rail problem).133
In general, the maritime setting offers more transport flexibility in the sense that ships
can be dispatched independent of the availability of RTCs. On the other hand, the freight
consolidation on ships is restricted by the technical constraints of the tanks available on
board. In the given examples, the first fact prevails resulting in smoother stock patterns.
Compared to the MC-RTP, the MISP-STA is more complex due to the larger number
of integer variables to be determined. To solve larger instances, a heuristic procedure
can be set up as follows:

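1. Build a relaxed MISP-STA by dropping the ship flow variables $y_{ijtv}$ and (3.68).
Replace the first part of (3.65) by

$$\sum_{(i,j)\in\mathcal{L}} \sum_{t\in\mathcal{T}} \sum_{s\in\mathcal{S}} \sum_{k\in\mathcal{K}} z_{ijtsk} \cdot t^{\mathrm{Trv}}_{ij} \cdot \tilde{c}^{\mathrm{Trv}}_{k} \tag{3.75}$$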

2. Solve the relaxed MISP-STA with a standard solver.

3. Feed the determined chemical flows $x^{*}_{ijts}$ to the original MISP-STA.

4. Solve the MISP-STA instance with fixed chemical flows.
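
The four steps can be illustrated with a minimal Python sketch on a toy single-link instance. It is not the MISP-STA itself: all parameters (TANK_CAP, TANKS_PER_SHIP, SHIP_COST, HOLD_COST, DEMAND) and the enumeration that stands in for a solver are hypothetical. The relaxed problem charges an averaged rate per tank, whereas the original problem charges per dispatched ship.

```python
from itertools import product
from math import ceil

# Hypothetical toy data (not taken from the examples in the text).
TANK_CAP = 10          # units of chemical per tank
TANKS_PER_SHIP = 3     # tanks available on each ship
SHIP_COST = 90.0       # charge per dispatched ship
HOLD_COST = 1.0        # cost per unit held in stock at the end of a period
DEMAND = [10, 20, 10]  # demand per period at the receiving site


def feasible_flows(demand):
    """Yield all plans (tanks shipped per period) that never cause a shortage."""
    max_tanks = ceil(sum(demand) / TANK_CAP)
    for tanks in product(range(max_tanks + 1), repeat=len(demand)):
        stock, feasible = 0, True
        for n, d in zip(tanks, demand):
            stock += n * TANK_CAP - d
            if stock < 0:
                feasible = False
                break
        if feasible:
            yield tanks


def holding_cost(tanks, demand):
    """Sum of end-of-period stocks, valued at HOLD_COST per unit."""
    cost, stock = 0.0, 0
    for n, d in zip(tanks, demand):
        stock += n * TANK_CAP - d
        cost += HOLD_COST * stock
    return cost


def ship_cost(tanks):
    """Original model: an integer number of ships is paid for in each period."""
    return sum(SHIP_COST * ceil(n / TANKS_PER_SHIP) for n in tanks)


def tank_cost(tanks):
    """Relaxed model: ship variables dropped, averaged cost rate per tank."""
    return sum(SHIP_COST / TANKS_PER_SHIP * n for n in tanks)


def heuristic(demand):
    # Steps 1 and 2: solve the relaxed problem.
    flows = min(feasible_flows(demand),
                key=lambda f: tank_cost(f) + holding_cost(f, demand))
    # Steps 3 and 4: fix the flows and evaluate them in the original problem.
    return flows, ship_cost(flows) + holding_cost(flows, demand)


def optimum(demand):
    # Reference solution: enumerate the original problem directly.
    flows = min(feasible_flows(demand),
                key=lambda f: ship_cost(f) + holding_cost(f, demand))
    return flows, ship_cost(flows) + holding_cost(flows, demand)


h_flows, h_cost = heuristic(DEMAND)
o_flows, o_cost = optimum(DEMAND)
```

In this toy instance the relaxed problem ships exactly the demand of each period, while the original problem consolidates tanks on fewer ships, so the heuristic cost lies above the optimal cost.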

Similar to the MC-RTP, a considerable part of the model’s complexity is inherited from
the integer decisions determining the number of tankers. For variabilization, the cost rates

132 Transports to site i = 1 take place in later periods.
133 Compare also the transport flows displayed in Figure 3.12 and Figure 3.10.

[Figure 3.13: three panels (location i = 1, 2, 3); x-axis: period (0 to 14); y-axis: stock level; series: stock level and target stock for chemicals s = 1, 2, 3]
Figure 3.13: Optimal stock levels for the MISP-STA instance of example 7

$\tilde{c}^{Trv}_{k}$ approximate the shipment charges on a tank basis and are calculated as follows

$$\tilde{c}^{Trv}_{k} = \frac{\sum_{v\in V} n^{Cap}_{vk} \cdot c^{Trv}_{v}}{V \cdot \sum_{k\in K} n^{Cap}_{vk}}. \qquad (3.76)$$

The heuristic delivers a solution for the described problem instance within 8 seconds. The
solution’s total cost is 14,379 which is approximately 6% above the optimum.

4 Integrated planning of chemical supply chains
In the previous chapters isolated planning problems in chemical industry are described, re-
viewed and modelled. These approaches allow analysts to model typical chemical produc-
tion processes and logistical planning problems in chemical production networks. Chemi-
cal production networks consist of many chemical plants clustered at chemical production
sites. Such networks can be seen as an important part of chemical supply chains (SC). In
the scientific literature, there is no unique and concise definition of what an SC is, but some
common features are prevalent in most definitions:

• focus on a product/service: There is a (consumer) product or service to be provided
at a market.

• multiple processing steps: Multiple transformation processes have to be passed until
the product or service can be offered.

• multiple units: Several business units (companies) are responsible for the transfor-
mation processes.

Hence, a chemical’s SC can be defined as all (business) units and processes involved in
the production and distribution of this chemical. From this point of view, chemical
production networks of large-scale chemical companies are major parts of the SCs of many
chemicals. A particular property of chemical SCs is that a comparatively small number of
independent companies is involved in the SC. This is because chemical companies are of-
ten deeply integrated, i.e. many subsequent production steps are processed by a chemical
company. On the supply side, raw material suppliers such as oil-producing companies and
refineries are to be taken into account. Moreover, logistical service providers managing
transport and distribution processes play a vital role in chemical SCs.
To manage SCs, a broad body of (scientific) literature has emerged since the 1980s.1
Supply chain management (SCM) aims at the planning, execution, and control of all
transformation processes in a defined SC such that the SC’s performance is optimized.2
The main focus of SCM is on the coordinated planning of the SC processes. For planning,
1 See Croom et al. (2000) for an overview.
2 This is often operationalized as minimizing the SC’s total costs or maximizing the SC’s total profit.
See Stadtler (2005) for a more detailed discussion and definition of SCM.

T. Kirschstein, Integrated Supply Chain Planning in Chemical Industry, Produktion und Logistik,
DOI 10.1007/978-3-658-08433-2_4, © Springer Fachmedien Wiesbaden 2015

SCM draws many approaches from related disciplines (such as logistics and operations
management), but focuses on their integration and interactions.
In the management of chemical SCs, product flows within chemical production networks
are to be planned from a focal point of view. I.e. local and network-wide processes have to
be managed such that the production network’s total performance is optimized. Figure 4.1
shows the schematic overview of the exemplary chemical SC where relevant elements for
integrated planning are highlighted.

[Figure 4.1: schematic; node labels S1, S2, C1, C2; link labels intra-site and inter-site; legend: storage, supplier, customer, production plant]
Figure 4.1: Chemical SC scheme for integrated planning (relevant elements highlighted)

For integrated planning approaches, all elements in the considered chemical SC are
relevant including transport relations to (external) customers and suppliers. Only external
suppliers or customers themselves are not modelled in detail.
Core elements of chemical SCs considered in integrated planning approaches at the
tactical/strategical level are capacities and parameters of the production systems and
the logistical systems (e.g. turnover or inventory capacities). Since sites and plants are
interconnected, local adjustments more or less immediately affect the remaining sites/-
plants. Integrated approaches capture these spill-over effects by a combined modelling of
the interdependent components.
Integration not only focuses on the spatial or temporal dimension of an SC but also
on separated planning (sub-)problems. These (sub-)problems and their interdependencies
are modelled in a common framework. Basically, two options for integration can be
distinguished:3 The first option is to merge the sub-problems into a monolithic model
(also called deep integration). The second option is to stay with the decomposition into
3 See Dolk and Kottemann (1993) or Geoffrion (1999).

sub-problems and to formulate an interaction scheme between the sub-problems (also
called functional integration). In the latter case, a transfer of data has to be organized
between the sub-problems in order to anticipate spill-over effects between related sub-
problems. Often a monolithic optimization model can be formulated, but its complexity
prohibits a direct solution using standard techniques. To tackle such a problem basically
three options are available:

1. The decomposition of the monolithic model into suitable sub-problems and setting
up a functional integration scheme.4

2. The development of tailored optimization techniques by exploiting specific properties
of the monolithic model.

3. The development of a heuristic procedure for the monolithic model.

The following subsection reviews the literature on integrated planning problems with
special focus on (basic) chemical industry. The papers reviewed are categorized according
to methodical features and problem characteristics.

4.1 Literature review


The management of SCs consists of manifold aspects. These aspects encompass traditional
planning tasks such as distribution planning or vehicle routing as well as network-specific
problems such as the coordination of partners along the SC. Usually, the planning tasks
are solved individually where the results of a specific task serve as input for subsequent
tasks. This is called a sequential planning approach. To provide a categorization for SCM-
related planning tasks, the so-called supply chain planning matrix was developed.5 It cat-
egorizes SCM planning problems according to the planning level (i.e. strategical, tactical,
or operational problems) and the progress of the value-adding process (i.e. procurement,
production, distribution or sales).
In this concept, the strategic network planning comprises network design and structure
decisions e.g. about the location of SC facilities and their corresponding layout.
At the tactical level, the master planning module comprises aggregated models of the
considered SC taking into account capacities of the production and logistics system as
well as demand forecasts. Here, aggregated material flows between the SC facilities are
planned for time horizons up to one year. A rough time discretization (e.g. in months
or weeks) is used to reflect e.g. varying demand patterns. Demand forecasts are made at

4 Note that the suitable sub-problems do not need to be the same as the original sub-problems.
5 See Meyr et al. (2008) and Stadtler (2005). Note that there is a lot of work from numerous scientific
perspectives labelled with SCM, see Croom et al. (2000) for a detailed analysis and categorization of
SCM approaches. Based on this classification scheme this work reviews SCM approaches at a network
level modelling material flows.

the demand planning module which comprises models to compute demand forecasts for
different regions and time horizons by assuming specific types of demand processes.
At the operational level, problems are considered on a detailed perspective dealing with
the short-term planning and control of basic SC operations. The aggregated material flows
determined at the tactical level are broken down to the local level on a short-period basis
like a daily or hourly time frame. For the pre-determined production quantities assigned
to a specific production site on the tactical level, plans and schedules for all plants at this
site are derived in the production planning & scheduling modules. Similarly, replenishment
and transport decisions are determined in the inventory management and distribution &
transport planning module, respectively.
The general SCM matrix can be adapted to better reflect the specific characteristics of
basic chemical industry.6 In the traditional SC planning matrix, the material requirement
planning modules are concerned with determining order quantities of raw and intermediate
materials necessary to realize a pre-defined production plan. These tasks are particularly
challenging if the variety of handled materials is large.7 In basic chemical industry, how-
ever, the variety of raw and intermediate chemicals is comparatively small while their
demand is high and mostly determined by the plants’ technical specifications.
Moreover, the means of transport differ for procurement and distribution transports.
For raw and intermediate transports often pipeline, ship, and rail are used. In contrast,
for the distribution of final chemicals rail and road transports are prevalent. As pipeline,
ship, and rail are comparatively inflexible and differently organized compared to road
transports, an independent planning of procurement and customer transports is justified.
Therefore, the procurement planning module encompasses the planning of procurement
transports of raw and intermediate chemicals and the planning of their local stocks. In
contrast, the distribution transports module comprises the tasks of the classic distribution
& transport planning module for final chemicals.
The transport capacities provided along a chemical SC have to be planned on the tac-
tical level. In the case of ships or RTCs, the corresponding fleet sizes are determined. To
enlarge fleets, equipment can be bought or rented. To reduce fleets, equipment can be sold
or renting contracts may not be prolonged. Such fleets are sometimes used commonly by
multiple chemical companies e.g. by short-term renting contracts or by exchanging RTCs
or ships (so-called swaps). Therefore, the original master production planning module is
complemented by the master transportation planning module capturing all tactical
decisions about the capacities of the logistical system in the SC.8 Pipeline capacities are
determined by technological details and are thus subject to strategic decisions, as they
cannot be adapted easily in the short run.
6 See Zoryk-Schalla et al. (2004) for a report from an aluminium-producing company about implementing
an advanced planning system based on an adapted SCM matrix.
7 See Meyr et al. (2008).
8 Note that this module also contains planning problems determining turnover capacities at the sites if
they limit the transport quantities.

The distinction into a strategical, tactical, and operational level is retained but re-
named into design, configuration, and operations to better reflect the terms most used in
literature.9 Figure 4.2 shows the adapted SCM matrix.

design:         strategic network planning

configuration:  master transportation planning | master production planning | demand planning

operations:     procurement planning | distribution planning | production planning | production scheduling | demand fulfillment

Figure 4.2: Adapted SCM matrix (based on Stadtler (2005))

The basic planning problems corresponding to the specific modules are briefly described
as follows:

• design and re-design:


– Strategic planning: This block comprises planning problems determining the
general structure of the considered SC, e.g. location and structure of produc-
tion sites, type of production processes or types and capacities of logistical
facilities.10

• configuration:
– Master production planning: This module encompasses decisions about the
usage of production capacities of the considered SC, e.g. the determination of
production quantities and their assignment to production sites.11
– Master transportation planning: Determination of aggregated transport flows
among the participants of the SC and provided transport capacities.12
– Demand planning: Determination of demand forecasts for a finite planning
horizon. They are based on historical records of former periods, external pre-

9 See e.g. Melo et al. (2009).
10 See e.g. Vidal and Goetschalckx (1997) or Tsiakis et al. (2001) for an overview.
11 In chemical industry this is often encompassed by campaign planning problems. See Kallrath (2005) or
Grunow et al. (2002).
12 Examples are e.g. rail car fleet sizing models (Cheon et al., 2012), container fleet sizing (Dong and
Song, 2009), or ship fleet sizing (Ronen, 1993).

dictors (e.g. economic growth indicators), and assumptions about the demand
process.13

• operations:
– Production planning: Determination of local production quantities and assign-
ment to local plants.14 For multi-product batch processes, the aim is to define
batch sizes for different chemicals and to assign them to available plants.15 In
case of multi-product continuous processes, the aim is to determine the produc-
tion mode, i.e. the physical conditions and raw material composition specifying
the product mixture.16
– Production scheduling: Determination of mode sequences/schedules of batches
assigned to the plants. The changeovers between production modes typically
induce costs for cleaning, maintenance, and/or lost material.17
– Procurement planning: Determination of transport modes and transport ca-
pacities for procurement transports as well as planning of inventories of raw
and intermediate chemicals at a site.18
– Distribution planning: Planning of distribution processes, e.g. management of
distribution stocks, routing, and scheduling final chemical transports to cus-
tomers.19
– Demand fulfilment: Management of customer orders, e.g. tracking of orders
along the production process, acceptance of new orders, setting of initial order
due dates.20

The modules displayed in Figure 4.2 address different planning problems and associated
planning models. There are plenty of interdependencies between the blocks as the out-
come of a particular module is typically the basis for subsequent lower-level modules. In
hierarchical planning, long-term/strategic decisions form the basis for tactical planning
whose outcome typically guides operational plans. In a hierarchical planning approach,
top-level decisions anticipate the low-level decisions to a certain extent.21 However, the
13 See Stadtler (2005).
14 The distinction between production planning and scheduling problems is not always consistent, see
Kallrath (2002) or Berning et al. (2004).
15 An overview of integrated planning and scheduling, including a detailed description in the context of
multi-product plants, can be found in Maravelias and Sung (2009).
16 This problem is extensively studied for refinery operations planning leading to non-linear process
models, see Zhang et al. (2001), Li et al. (2005), or Alhajri et al. (2008).
17 See e.g. Méndez et al. (2006) for an extensive overview and classification for scheduling of chemical
batch processes.
18 This comprises e.g. inventory routing models for procurement via ship or pipeline (see Ronen (2002)
or Moura et al. (2008)) and local tank management models (see e.g. Saharidis et al. (2009)).
19 This block comprises e.g. the branch of MIRSPs (see e.g. Christiansen et al. (2004)) and most pipeline
scheduling models (see e.g. Cafaro and Cerdá (2010) or Neiro and Pinto (2004)).
20 See Kilger and Meyr (2008).
21 See e.g. Fleischmann and Meyr (2003).

anticipation is never perfect such that the final solution of top- and base-level problems is
typically not (globally) optimal. To overcome this deficiency, integrated planning models
are often built by aggregating planning problems from adjacent blocks either horizontally
or vertically.
To categorize the literature on integrated planning, the modelling technique is an im-
portant criterion. Different types of models are suitable depending on the planning prob-
lem’s requirements. In general, when addressing long-term problems, assumptions about
the occurrence of future events have to be made. As such assumptions are inherently
uncertain, techniques capable of handling uncertainty are prevalent. Such approaches
can be further distinguished according to the stochastic elements considered
(which is typically the demand). For short-term problems, uncertainty is (most often)
not an obstacle. For operational problems, mathematical optimization is widely used,
albeit simulation is still a reasonable option.22 Mathematical optimization models are
categorized w.r.t. linearity (linear vs. non-linear), the domain(s) of decision variables (in-
teger/binary vs. continuous), and stochasticity (stochastic vs. deterministic). Simulation
techniques come to the fore when at least one of the following criteria prevails:23

• The system to model is complex, but mathematically describable.

• Multiple objectives are pursued.

• Multiple stochastic elements are considered.

• The system’s dynamic behaviour in time is of (special) interest.

In general, these criteria often hold for planning problems with a rather long planning
horizon. Therefore, simulation-based models are prevalent especially for SC configuration
problems.24 Special classes of simulation techniques are distinguished.
In this work the term simulation refers to stochastic and dynamic models.25 If all
components are modelled continuously, this is typically called a system dynamics model.26
Such a model is reasonable either at a high aggregation level27 or when all modelled
processes are indeed continuous.28 The system studied is tracked continuously and its
(aggregated) behaviour can be described at any point in time.

22 See Papageorgiou (2009) for details and Adhitya and Srinivasan (2010) for an application of simulation
for a detailed process modelling.
23 See Law (2007) or Carson (2004) for a more detailed discussion about these prerequisites. The sub-class
of deterministic simulation is typically referred to as computer experiments, see Santner et al. (2003)
for more details. In this work the focus is on stochastic simulation models.
24 See Kleijnen and Smits (2003) for an overview.
25 I.e. there exist stochastic process elements to be modelled and the system is studied over time, see Law
(2007) for a more detailed classification.
26 See Ogata (2003) for an introduction.
27 E.g. when (discrete) objects handled in a network can be aggregated to continuous flows.
28 For more details about system dynamics models in SCM see e.g. Kleijnen (2005). For applications see
e.g. Rabelo et al. (2007) or Venkateswaran et al. (2004).

In contrast, if the processes of the modelled system are discrete, the corresponding
model is called a discrete-event simulation. Here, the system’s states are calculated at a
finite number of points in time. This type of simulation is often used for atomic simula-
tions, i.e. when a system is modelled in detail.29
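
To make the event-driven logic concrete, the following minimal Python sketch simulates a single storage tank. All names and numbers (simulate_tank, the reorder rule, the withdrawal times and quantities) are hypothetical and serve only as illustration. The state is updated solely at event times, and processing one event may schedule a further event.

```python
import heapq


def simulate_tank(withdrawals, initial_stock, reorder_point, order_size, lead_time):
    """Discrete-event sketch of one tank with a simple reorder rule (assumed policy).

    withdrawals: list of (time, quantity) pairs.
    Returns the list of (time, stock) states, one entry per processed event.
    """
    # Future-event list, ordered by event time.
    queue = [(time, "withdraw", qty) for time, qty in withdrawals]
    heapq.heapify(queue)
    stock, on_order = initial_stock, False
    trajectory = [(0.0, stock)]
    while queue:
        # The clock jumps directly to the next event; nothing happens in between.
        time, kind, qty = heapq.heappop(queue)
        if kind == "withdraw":
            stock -= qty
            if stock < reorder_point and not on_order:
                # Processing this event schedules a future delivery event.
                heapq.heappush(queue, (time + lead_time, "deliver", order_size))
                on_order = True
        else:
            stock += qty
            on_order = False
        trajectory.append((time, stock))
    return trajectory


trajectory = simulate_tank(
    withdrawals=[(1.0, 20), (2.0, 30), (4.0, 25), (7.0, 60)],
    initial_stock=80, reorder_point=40, order_size=100, lead_time=3.5,
)
```

The returned trajectory contains one state per event, which reflects that the system state is only evaluated at a finite number of points in time.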
A third category comprises so-called agent-based simulation models. This sub-class of
simulation models is characterized by a network of interacting agents. Each agent reacts
to stimuli from other agents and environmental variables depending on mathematically
formulated decision rules.30 This type of simulation is particularly useful to model inter-
actions between interrelated, but independent entities such as business units, companies,
or market actors.31
To categorize the literature on integrated SC planning in chemical industry the following
criteria are used:

• SCM matrix refers to the modules of the adapted SCM matrix (Figure 4.2) addressed
by the reviewed reference. Capital letters indicate the planning level (D for design,
C for configuration, and O for operations) and carry at most two superscripts. The first
superscript is t, p, or d, referring to transportation, production, or demand. The
optional second superscript specifies the sub-problem on the operational level with p, s,
or d referring to procurement/production, scheduling, or distribution/demand fulfilment.
E.g. the combination O^{tp} refers to the procurement planning module whereas the
combination O^{pp} refers to the production planning module.

• Modelling technique refers to the technique used to formalize the planning problem
and comprises mathematical optimization approaches such as (mixed-integer) linear
programs [(MI)LP], (mixed-integer) non-linear programs [(MI)NLP], or stochastic
programs [s(MI)LP] as well as simulation techniques such as agent-based models
[AB], discrete-event models [DE], or system dynamics models [SD]. If mathematical
optimization models are embedded in a simulation framework, this is indicated by
a connector. E.g. DE + MILP indicates that a MILP is embedded in a discrete-event
simulation framework.

• Objective refers to the objective pursued (such as costs, net present value [NPV],
corporate value [CV], or profit). If multiple objectives are considered, this is stated
by [mu].

• Type of production refers to the type of production technology considered. Either
continuous [co] or batch [ba] production is modelled. For aggregated models the
technology might not be modelled in detail, then the production technology is
unspecified [us].
29 See e.g. Law (2007), chap. 1 for more details.
30 See e.g. Sterman (2000) for an encompassing textbook or Kleijnen (2005) for a brief introduction.
31 Therefore, agent-based simulation models are often applied to model market or social interaction
networks, see e.g. Axelrod (2001).

• Time scale describes the way the time dimension is represented. Continuous [co] or
discrete [di] models are distinguished if time is modelled explicitly. Otherwise, the
model refers to a single period [si].

• Transport mode refers to the transport mode(s) considered (ship [sh], pipeline [pi],
road [ro], or rail [ra]). For aggregated models only flows might be relevant. Then,
the mode is unspecified [us].

• Uncertainty summarizes the sources of uncertainty incorporated in the model
(e.g. demand [de], prices [pr], or multiple [mu]).

• Solution method refers to the method applied to solve the proposed model e.g. (com-
mercial) standard solvers [standard] (such as CPLEX or CONOPT), a specific op-
timization method [specific], a hierarchical decomposition approach [decomp.], or a
heuristic procedure [heur.]. For simulation models, often a finite set of scenarios
is evaluated and compared [scen.]. Genetic algorithms are also sometimes used
for simulation optimization [GA].32

If a criterion is not included in the proposed model, this is indicated by a horizontal line.
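
The SCM matrix codes used in this classification can be decoded mechanically, as the following minimal Python sketch suggests. Superscripts are written inline (e.g. "Otp" for O^{tp}). Only the codes for procurement planning and production planning are spelled out in the text; the remaining module assignments, as well as the names LEVELS, AREAS, MODULES, and decode, are assumptions inferred from Figure 4.2.

```python
LEVELS = {"D": "design", "C": "configuration", "O": "operations"}
AREAS = {"t": "transportation", "p": "production", "d": "demand"}

# Codes written with inline superscripts, e.g. "Otp" stands for O^{tp}.
# Only "Otp" and "Opp" are stated explicitly in the text; the other
# entries are inferred from the adapted SCM matrix (assumption).
MODULES = {
    "D": "strategic network planning",
    "Ct": "master transportation planning",
    "Cp": "master production planning",
    "Cd": "demand planning",
    "Otp": "procurement planning",
    "Otd": "distribution planning",
    "Opp": "production planning",
    "Ops": "production scheduling",
    "Odd": "demand fulfilment",
}


def decode(code):
    """Return (planning level, functional area or None, module name)."""
    if code not in MODULES:
        raise ValueError(f"unknown SCM matrix code: {code!r}")
    level = LEVELS[code[0]]
    area = AREAS[code[1]] if len(code) > 1 else None
    return level, area, MODULES[code]


procurement = decode("Otp")
production = decode("Opp")
design = decode("D")
```

A lookup table is used instead of rule-based parsing because the meaning of the second superscript depends on the first one (p denotes procurement after t, but production after p).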
In general, simulation approaches are advantageous if a static framework of entity-
processing units exists, i.e. the system’s general structure is static. But for strategic
problems the structure of an SC is typically the subject to be altered. As this work aims at
a framework relying on simulation, literature primarily focusing on strategical problems
is out of focus.
At the tactical level, however, more simulation-based approaches can be found. Ta-
ble 4.1 shows a classification of literature for integrated SC configuration and operations
planning where 7 out of 20 reviewed articles propose simulation-based approaches.
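
The stated share can be checked against the modelling-technique column of Table 4.1. The following Python sketch lists that column in row order and counts the entries carrying a simulation tag (AB, DE, or SD). The helper names is_simulation_based and is_hybrid are hypothetical.

```python
# Modelling-technique column of Table 4.1, in row order.
techniques = [
    "MILP", "NLP", "MINLP", "sMILP", "NLP", "sMILP", "sLP", "MINLP", "LP",
    "MINLP", "MILP", "sLP", "MINLP",
    "AB+MILP", "DE+MILP", "AB+MILP", "AB+MILP", "DE+MILP", "DE", "DE",
]

SIMULATION_TAGS = {"AB", "DE", "SD"}


def is_simulation_based(technique):
    """True if any part of the technique code is a simulation tag."""
    return any(part in SIMULATION_TAGS for part in technique.split("+"))


def is_hybrid(technique):
    """True if an optimization model is embedded in a simulation framework."""
    return is_simulation_based(technique) and "+" in technique


n_total = len(techniques)
n_simulation = sum(is_simulation_based(t) for t in techniques)
n_hybrid = sum(is_hybrid(t) for t in techniques)
```

Of the simulation-based entries, those joined by a connector are hybrid models, the remaining ones are pure simulation approaches.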
Among the simulation approaches Garcia-Flores and Wang (2002), Mele et al. (2006),
and Puigjaner and Guillén-Gosálbez (2008) propose agent-based simulation models where
the agents (partially) utilize MILPs to determine their decisions (hybrid model).33 These
approaches primarily investigate the organizational structure of an SC, i.e. the interac-
tion of involved decision making units. In Garcia-Flores and Wang (2002) a chemical SC
producing paints and coatings is modelled by using six classes of agents: retailers, lo-
gistic service providers, warehouses, purchasing departments, plants, and suppliers. The
production processes at the plants are organized as multi-purpose, multi-product batch
processes. The precise operational planning and scheduling of batches is carried out by a
MILP implemented in a standard software package. Decision fields are the parameters of
the inventory policy and the parameters of the negotiation scheme between the warehouses
in cases of stock-outs. I.e. in case of a stock-out at a warehouse, the requested order is
32 This classification encompasses the solution methods applied in the reviewed literature. A more detailed
classification (e.g. addressing meta-heuristics, math-heuristics, etc.) is omitted for the sake of brevity.
33 E.g. site managing agents decide about the assignment of production orders to plants whereas financial
agents decide about loans to be taken.

Reference | SCM matrix | modelling | objective | type of production | time scale | transport mode | uncertainty | solution method
--- | --- | --- | --- | --- | --- | --- | --- | ---
McDonald and Karimi (1997) | (O^{pp}, O^{ps}, C^{p}, C^{t}) | MILP | cost | ba | di | us | de | standard
Vidal and Goetschalckx (2001) | (C^{p}, C^{t}) | NLP | profit | co | si | us | — | heur.
Gjerdrum et al. (2002) | (C^{p}, C^{t}) | MINLP | multiple | us | di | us | — | specific
Gupta and Maranas (2003) | (O^{pp}, O^{ps}, C^{p}, C^{t}) | sMILP | cost | ba | di | us | de | decomp.
Jackson and Grossmann (2003) | (C^{p}, C^{t}) | NLP | profit | co | di | us | — | decomp.
Gupta and Maranas (2004) | (C^{p}, C^{t}) | sMILP | NPV/ROV | us | di | us | de | standard
Ryu et al. (2004) | (C^{p}, C^{t}) | sLP | profit | us | si | us | de | specific
Chen and Lee (2004) | (C^{p}, C^{t}) | MINLP | multiple | ba | di | us | de+pr | standard
Oh and Karimi (2006) | (C^{p}, C^{t}) | LP | profit | co | di | us | — | standard
Yi and Reklaitis (2007) | (O^{pp}, O^{ps}, C^{p}, C^{t}) | MINLP | cost | ba | co | us | — | specific
Amaro and Barbosa-Póvoa (2008) | (O^{td}, O^{pp}, O^{ps}, C^{t}, C^{p}) | MILP | profit | ba | di | mu | — | standard
Al-Othmann et al. (2008) | (C^{t}, C^{p}) | sLP | profit | co | di | us | de+pr | standard
Kim et al. (2008) | (O^{pp}, C^{p}, C^{t}, D) | MINLP | profit | co | si | mu | — | standard
Garcia-Flores and Wang (2002) | (O^{pp}, C^{t}, C^{p}) | AB + MILP | cost | ba | di | us | mu | scen.
Jung et al. (2004) | (O^{pp}, O^{ps}, C^{p}) | DE + MILP | cost | ba | di | us | de | specific
Mele et al. (2006) | (O^{ps}, O^{pp}, C^{t}, C^{p}) | AB + MILP | profit | ba | di | us | mu | GA
Puigjaner and Guillén-Gosálbez (2008) | (C^{t}, C^{p}) | AB + MILP | multiple | us | co | us | mu | GA
Jung et al. (2008) | (O^{ps}, O^{pp}, C^{t}, C^{p}) | DE + MILP | inventory | ba | di | us | de | scen.
Pitty et al. (2008) | (O^{tp}, O^{pp}, C^{t}, C^{p}) | DE | profit | co | di | pi/sh | mu | scen.
Adhitya and Srinivasan (2010) | (O^{pp}, O^{ps}, C^{t}, C^{p}) | DE | profit | ba | co | us | mu | scen.

abbr.: ba...batch; co...continuous; de...demand; di...discrete; mu...multiple; pi...pipeline; pr...price; ra...rail; ro...road; sh...ship;
si...single; us...unspecified

Table 4.1: Classification of literature on integrated SC configuration and management planning in chemical industry

delivered from another warehouse and the charge of additional transport costs is subject
of the negotiation process between the agents. However, logistical details determining the
additional transport cost are out of scope.34 The optimization is based on an evaluation
of a pre-defined set of scenarios. Mele et al. (2006) propose a similar simulation approach
with the same assumptions for the logistical processes and production processes as well
as the same optimization model for operational planning and scheduling of batches. This
work proposes an optimization procedure based on a genetic algorithm which seeks for an
inventory parameter constellation maximizing the SC’s expected total profit. Puigjaner
and Guillén-Gosálbez (2008) extend this work by dealing with multiple objectives. The
applied multi-objective genetic algorithm is NSGA-II.35
Another hybrid simulation model is proposed by Jung et al. (2004, 2008). Here, a
discrete-event simulation is chosen to model the stochastic environment. The focus is on
the organization of production processes, i.e. on the optimization of operations. In
Jung et al. (2004) a single-stage production-distribution problem is considered where the
production planning and scheduling of multiple batch plants is embedded in a stochastic
rolling horizon environment.36 The objective is to maximize the expected profit by decid-
ing on the safety stocks at the plants without falling below a minimum customer service
level. The customer demand is the only stochastic component. For optimization a stochas-
tic gradient approach is applied which iteratively increases the safety stocks for each com-
bination of product and plant until the expected profit cannot be increased any further.
Jung et al. (2008) extend this work by considering a multi-stage production-distribution
system with pure inventory holding facilities and combined facilities for production and
inventory holding. For both types of facilities the influence of inventory parameters on
performance measures is analysed by simulation experiments. These approaches incorporate
operational decisions by MILPs. Hence, optimal reactions to varying system states
are modelled on the operational level. However, both models omit logistical details on
the operational level (only flows between facilities are considered).
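
The iterative increase of safety stocks described above for Jung et al. (2004) can be sketched in a strongly simplified form. The Python example below is not their stochastic gradient procedure: the expected profit is replaced by a deterministic toy function (MARGINAL_GAIN and HOLDING_COST are hypothetical data), and the search simply raises one safety stock at a time for as long as the profit improves.

```python
# Hypothetical marginal profit gains of each additional safety stock unit,
# per (product, plant) combination; in the original setting these values
# would be estimated by simulation runs.
MARGINAL_GAIN = {
    ("A", "plant1"): [50, 30, 15, 5, 1],
    ("B", "plant1"): [20, 8, 3, 1, 0],
}
HOLDING_COST = 10  # cost per unit of safety stock (assumed)


def expected_profit(stocks):
    """Toy stand-in for a simulation-based estimate of the expected profit."""
    return sum(
        sum(MARGINAL_GAIN[key][:level]) - HOLDING_COST * level
        for key, level in stocks.items()
    )


def increase_until_no_gain(keys, profit, step=1, max_stock=5):
    """Raise safety stocks one combination at a time while profit improves."""
    stocks = {key: 0 for key in keys}
    best = profit(stocks)
    improved = True
    while improved:
        improved = False
        for key in keys:
            if stocks[key] + step > max_stock:
                continue
            trial = dict(stocks)
            trial[key] += step
            value = profit(trial)
            if value > best:
                stocks, best, improved = trial, value, True
    return stocks, best


stocks, best_profit = increase_until_no_gain(list(MARGINAL_GAIN), expected_profit)
```

With noisy simulation estimates instead of a deterministic function, the comparison `value > best` would have to account for sampling error, which the sketch deliberately ignores.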
Pure simulation approaches are proposed by Pitty et al. (2008) and Adhitya and Srini-
vasan (2010). Pitty et al. (2008) propose a discrete-event simulation model for a refinery
supply chain. Operational decisions such as unloading schedules and production planning
are made based on simple priority rules. Various configurations of the modelled SC are
studied and compared to reveal optimization potentials. This approach explicitly consid-
ers some details of ship and pipeline transports. Adhitya and Srinivasan (2010) describe a
discrete-event simulation model for an SC producing and distributing lubricant additives.
Here, batch production is modelled. Again, operational production decisions are made by
priority rules and a scenario analysis is conducted to evaluate the effects of other priority

34 A classic transportation problem is solved to determine the total transport effort.
35 See Deb et al. (2007).
36 For production planning the MILP proposed by McDonald and Karimi (1997) is used. Production
schedules are determined by heuristics to ease solvability.

rules. Logistical processes are not explicitly modelled.37


Core decisions on the tactical level are the assignment of production quantities to the
available production sites and the distribution of raw, intermediate, and final products
among the various participants in the SC. Of particular importance is the integration of
production planning and final product distribution. Numerous approaches are devoted to
this kind of production-distribution problem.38 Vidal and Goetschalckx (2001), Gjerdrum
et al. (2002), and Chen et al. (2003); Chen and Lee (2004) study effects of financial
instruments and restrictions on production-distribution problems, in particular in multi-
national companies. Vidal and Goetschalckx (2001) study the problem of determining
transfer prices of intermediate products which are transported between subsidiaries of
a multi-national company. This induces tax costs leading to quadratic terms in the
objective function due to the multiplication with the production and transport quantities
to be determined simultaneously. To solve the resulting NLP a sequential fix-and-relax
procedure is proposed. Transports are represented as flows between SC facilities, i.e. no
details of the logistics system are modelled. The same holds for Gjerdrum et al. (2002)
where transfer prices between multiple companies are studied. Here, the problem is to
find a "fair" profit allocation by determining transfer prices. The resulting MINLP is
solved by a branch-and-bound algorithm.39 A fair distribution of the total expected profit
is also pursued in Chen and Lee (2004) next to customer service levels, safety stocks,
and the robustness of solutions. Details of the logistical system are not modelled, but
the transport costs depend on the transport quantity reflecting economies of scale of
different transport modes. This model considers discrete scenarios for customer demand
and product prices.40 A fuzzy approach is proposed which relies on a component-wise
normalization of the vector of multiple objectives. To merge the normalized objective
vector into a single measure, different metrics are discussed such as the minimum of the
normalized objective components or the product of all objective components. None of the
reviewed approaches incorporates transportation aspects in detail.
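The aggregation step described for the fuzzy approach can be illustrated with a minimal sketch. It is not the formulation of Chen and Lee (2004): the linear normalization, the two plans, and all objective values and bounds below are hypothetical and only show how the minimum and the product of normalized objective components can rank alternatives differently.

```python
from math import prod

def normalize(value, worst, best):
    """Map an objective value to [0, 1]: 0 at the worst level, 1 at the best level.

    Works for maximization (best > worst) and minimization (best < worst).
    """
    if best == worst:
        raise ValueError("best and worst levels must differ")
    score = (value - worst) / (best - worst)
    return min(1.0, max(0.0, score))

def aggregate_min(scores):
    """Merge normalized objectives by their minimum component."""
    return min(scores)

def aggregate_product(scores):
    """Merge normalized objectives by the product of all components."""
    return prod(scores)

# Hypothetical (worst, best) levels per objective: profit, service level in %, inventory cost.
bounds = [(50, 100), (90, 100), (40, 20)]

# Hypothetical objective values of two candidate plans.
plans = {"plan_a": (80, 95, 30), "plan_b": (100, 92, 20)}

scores = {
    name: [normalize(v, w, b) for v, (w, b) in zip(values, bounds)]
    for name, values in plans.items()
}

for name, s in scores.items():
    print(name, s, aggregate_min(s), aggregate_product(s))
```

With these illustrative numbers the minimum metric prefers plan_a (no objective falls below 0.5), whereas the product metric prefers plan_b, so the choice of metric is itself a modelling decision.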
As tactical decisions affect an SC’s performance for a comparatively long time span, the
uncertainty of various system parameters has to be considered in the planning process.
Stochasticity of demand is considered by Gupta and Maranas (2003, 2004) as well as Ryu
et al. (2004) and Al-Othmann et al. (2008). None of these approaches considers details of
the logistical system; they determine transport flows between facilities only.41 Gupta and
Maranas (2003, 2004) rely on the tactical planning model proposed by McDonald and
Karimi (1997). In Gupta and Maranas (2003) this model is decomposed into the pro-
duction assignment problem and distribution planning problem. While the distribution
37 The inventory policy governs the generation of transport orders, which are characterized by a random
lead time and pre-determined transport costs.
38 For a recent review of integrated production and distribution planning not restricted to the chemical
industry see Mula et al. (2010).
39 The integer variables reflect a discrete number of transfer price levels.
40 The deterministic version of this model can be found in Chen et al. (2003).
41 Transport flows are associated with a fixed transport cost rate.

planning depends on the realized demand, production planning is based on the estimated
demand. A two-stage stochastic linear program is created seeking a configuration
with minimum expected total costs. In Gupta and Maranas (2004) the objective of the
model is modified by considering future payments. For discounting future payments, two
approaches are discussed: the net present value (NPV) and the real-options-based value
(ROV).42 In Ryu et al. (2004) the hierarchy of production and distribution planning is
inverted, i.e. the distribution problem governs the production planning problem. Furthermore,
both components are affected by the uncertainty of demand. To solve the problem,
a parametric programming approach is described. Al-Othmann et al. (2008) present a
scenario-based stochastic LP for the optimization of a petroleum SC consisting of companies from
the crude oil sector as well as the refinery, petroleum, and basic chemical sector. The
objective is to find a configuration maximizing the expected profit. A two-stage stochas-
tic linear program is proposed whereby decisions about the production quantities in the
crude oil sector constitute the first stage problem43 whereas the decisions for the other
sectors are made based on a discrete number of demand scenarios.
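The two-stage structure described above (a first-stage decision taken before demand is known, second-stage decisions taken per demand scenario) can be sketched as follows. This is a toy illustration under stated assumptions, not any of the cited models: a single product, three hypothetical demand scenarios, hypothetical price and cost rates, and plain enumeration of the first-stage quantity in place of an LP solver.

```python
# Hypothetical demand scenarios: (probability, demand quantity).
SCENARIOS = [(0.3, 80), (0.5, 100), (0.2, 130)]

# Hypothetical rates per unit (illustration only).
PRICE = 10       # sales price
PROD_COST = 6    # first-stage production cost
HOLD_COST = 1    # cost of each unit left unsold

def second_stage_profit(produced, demand):
    """Recourse decision for one scenario: sell as much as demand allows."""
    sold = min(produced, demand)
    return PRICE * sold - HOLD_COST * (produced - sold)

def expected_profit(produced):
    """First-stage cost plus the probability-weighted second-stage profit."""
    recourse = sum(p * second_stage_profit(produced, d) for p, d in SCENARIOS)
    return -PROD_COST * produced + recourse

# Enumerate integer first-stage quantities instead of solving an LP.
best_quantity = max(range(0, 151), key=expected_profit)

print(best_quantity, expected_profit(best_quantity))
```

The first-stage quantity is fixed across all scenarios, while the sold quantity adapts to each realized demand; this separation is what distinguishes the two stages.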
A special part of the chemical industry is the oil-producing industry. Core production
units in this industry are refineries. When refinery operations are modelled, typically
non-linear models are used due to the non-linear production yield curves of distillation
units.44 Approaches for refinery production-distribution planning are proposed by Kim
et al. (2008), Amaro and Barbosa-Póvoa (2008), Jackson and Grossmann (2003), Oh
and Karimi (2006), and Yi and Reklaitis (2007). Among these, only Kim et al. (2008)
and Amaro and Barbosa-Póvoa (2008) incorporate some logistical aspects by considering
multiple transport modes with varying transport cost rates. In Kim et al. (2008) the
production network of a Korean petroleum company is modelled. The model considers
multiple, non-specified modes of transport (with varying cost rates). Amaro and Barbosa-
Póvoa (2008) also deal with unspecific transport modes but the proposed model considers
batch production processes. Jackson and Grossmann (2003) use non-linear expressions
for modelling continuous production processes (similar to Kim et al. (2008)). To solve the
production-distribution problem, a decomposition approach is proposed by splitting the
model into the spatial and temporal dimension. A simplified production process model
is used by Oh and Karimi (2006) where the product fractions are handled as constants
and no reflux streams are considered. This work focuses on financial impacts of duty
drawbacks for planning production and distribution in multi-national companies. An
LP is presented which seeks the production-distribution plan maximizing the total
after-tax profit. The influence of tax differences and exchange rates between different
currencies is considered by Yi and Reklaitis (2007). Here, batch processes are assumed for
production and a periodic continuous-time model is formulated. An optimal configuration
42 Both differ in the interest rate used for discounting and the definition of the expected return. In case
of the NPV the expected rate of return is used whereas the ROV utilizes the risk-free rate of return.
43 I.e. the decisions are independent of demand fluctuations by utilizing expectations.
44 See Li et al. (2005).

of the studied network is obtained by deriving analytical expressions for production and
order lot sizes.
For tactical production planning, roughly assigning production quantities to production
sites suffices for deriving reliable mid-term plans.45 On the operational level, however,
a more detailed view on the involved production processes is needed. In particular, the
planning of batch processes is a challenging task as it involves

• a subdivision of the assigned production quantity into plant-conform batches,46

• an assignment of these operational batches to available plants,47

• a scheduling of batches assigned to a plant.48

Planning of continuous processes faces a similar problem structure except that the batch
size is not limited by technological restrictions. Besides production planning, the planning
of logistical activities needs to be included. Since (operational) production planning typically
focuses on one production site, intra-site transports (which are typically based on the
pipeline mode) need to be considered. In some cases, inter-site transports or customer
deliveries are also important in daily business.
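The three batch planning steps listed above (batch sizing, assignment to plants, scheduling) can be sketched with simple heuristics. The following is an illustrative assumption, not a method taken from the reviewed literature: equal batch sizes, a least-loaded assignment rule, and a fixed, sequence-independent set-up time are chosen purely for brevity, and all names and numbers are hypothetical.

```python
import math

def split_into_batches(quantity, max_batch_size):
    """Step 1: split a production quantity into equal batches within the size limit."""
    n_batches = math.ceil(quantity / max_batch_size)
    return [quantity / n_batches] * n_batches

def assign_to_plants(batches, plants, hours_per_unit):
    """Step 2: greedily assign each batch to the plant with the least processing time so far."""
    load = {plant: 0.0 for plant in plants}
    assignment = {plant: [] for plant in plants}
    for batch in batches:
        plant = min(plants, key=lambda p: load[p])
        assignment[plant].append(batch)
        load[plant] += batch * hours_per_unit
    return assignment

def schedule(assignment, hours_per_unit, setup_hours):
    """Step 3: sequence the batches of each plant; returns (start, end) times per batch."""
    plan = {}
    for plant, batches in assignment.items():
        clock, slots = 0.0, []
        for batch in batches:
            start = clock + setup_hours          # fixed set-up before every batch
            end = start + batch * hours_per_unit
            slots.append((start, end))
            clock = end
        plan[plant] = slots
    return plan

# Hypothetical data: 60 units, reactor volume 25 units, two plants.
batches = split_into_batches(quantity=60, max_batch_size=25)
assignment = assign_to_plants(batches, ["plant_1", "plant_2"], hours_per_unit=0.5)
plan = schedule(assignment, hours_per_unit=0.5, setup_hours=1.0)
print(plan)
```

Sequence-dependent set-up efforts and due dates, which motivate the scheduling step in practice, are deliberately left out of this sketch.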
Table 4.2 shows a literature classification for integrated approaches primarily addressing
operational planning problems in chemical SC management.
All approaches summarized in this table contain a production planning and scheduling
component and are deterministic. The integrated production planning and scheduling
problem (IPPSP)49 is the core component of most approaches. The approaches differ in
details regarding the production system and the incorporation of logistical aspects. The
reviewed approaches can be subdivided into approaches modelling batch and continuous
production.
Batch production is considered by the first seven approaches shown in Table 4.2. Timpe
and Kallrath (2000) propose a time-discrete MILP for the IPPSP with multiple production
sites/plants, multiple production modes, intermediate storages, sales points, and trans-
ports between sites, storages, and sales points. The model uses different time scales for
production and sales planning which allows a more precise planning of production. In a
production period, a site can produce in different modes whereby changeover times and
costs are considered. To store intermediate and raw materials, one-product and multi-
product tanks are available and an assignment of products to tanks is implemented. The
45 Rough assignment refers to an assignment to aggregated production facility groups such as similar
plants or even complete sites.
46 This is due to the fact that typically batch processes have limited capacity, e.g. the volume of the
reactor is limited.
47 This is necessary in case of multi-plant networks which are typical for producing e.g. speciality
chemicals.
48 This might be necessary due to sequence-dependent set-up efforts and due dates.
49 See Maravelias and Sung (2009), Li et al. (2010a) or Shao et al. (2009) for an introduction and literature
review. For an application of the ELSP in the pure batch scheduling context in chemical industry see
Cooke and Rohleder (2006).
Reference                                 SCM matrix                 modelling  objective  type of production  time scale  transport mode  uncertainty  solution method
Timpe and Kallrath (2000)                 (O^td, O^pp, O^ps)         MILP       sales      ba                  di          us              -            standard
Grunow et al. (2002)                      (O^td, O^pp, O^ps)         MILP       cost       ba                  di          us              -            decomp.
Neumann et al. (2002)                     (O^pp, O^ps)               MILP       other      ba                  co          -               -            heur.
Romero et al. (2003)                      (O^pp, O^ps)               MILP       revenue    ba                  di          -               -            standard
Berning et al. (2004)                     (O^pp, O^ps)               other      profit     ba                  di          -               -            GA
Guillén et al. (2007)                     (O^td, O^pp, O^ps)         MILP       equity     ba                  di          us              -            standard
Sung and Maravelias (2009)                (O^pp, O^ps)               MIP        cost       ba                  di          -               -            decomp.
Bok et al. (2000)                         (O^tp, O^td, O^pp, O^ps)   MILP       profit     co                  di          us              -            decomp.
Pinto et al. (2000)                       (O^td, O^pp)               MILP       profit     co                  si          pi              -            standard
Neiro and Pinto (2004)                    (O^tp, O^pp, O^ps)         MINLP      cost       co                  di          pi              -            standard
Erdirik-Dogan and Grossmann (2006, 2008)  (O^pp, O^ps)               MILP       profit     co                  co          -               -            decomp.
Shah and Ierapetritou (2011)              (O^tp, O^pp, O^ps)         MILP       profit     co                  co          pi              -            standard
own approach                              (O^tp, O^pp, C^t, C^p)     DE + MILP  mu         co                  di          pi/ship/rail    mu           specific/GA

abbr.: ba...batch; co...continuous; de...demand; di...discrete; mu...multiple; pi...pipeline; pr...price; ra...rail; ro...road; sh...ship;
si...single; us...unspecified

Table 4.2: Classification of literature on integrated SC management planning in chemical industry



model is solved with standard solvers for multiple objectives. Maximization of sales is
finally recommended as the objective that best fits the intended application in practice.
Grunow et al. (2002) propose a sequential planning approach consisting of the pro-
duction planning step (called campaign planning) as well as the assignment and schedul-
ing step (called assignment model). The focus is on one production site with multiple
multi-purpose processing units. Transports between plants are mentioned but not explicitly
modelled.50 The individual models are solved with standard solvers seeking a
minimum-cost solution. A similar decomposition scheme is proposed by Neumann et al.
(2002). Here, the batch planning problem is formulated as a MILP whereas the schedul-
ing problem is solved by a heuristic considering limited plant capacities. This way, a
re-scheduling step is avoided and intermediate storages are introduced. A quite simi-
lar but integrated model for batch sizing and scheduling is proposed by Berning et al.
(2004). Due to the model’s complexity, a genetic algorithm is used to derive near-optimal
solutions. Sung and Maravelias (2009) propose an algorithm to determine the set of fea-
sible solutions for a production planning problem by evaluating infeasible polytopes of
the underlying scheduling problem. The resulting linear inequalities can be added to the
production planning model to find an optimal and feasible production plan (w.r.t. the
scheduling constraints).
In Romero et al. (2003) a simplified IPPSP for chemical batch production is described
which is enhanced by budget constraints. A one-stage batch production on parallel,
identical plants is considered. The goal is to maximize the total revenue over the planning
horizon without violating budget and resource constraints. Due to the simple production
model, the proposed MILP can be solved with standard solvers. This approach is extended
by Guillén et al. (2007) where multiple plants as well as transports between plants are
incorporated next to cash flow and budget constraints. The objective is to maximize the
accumulated equity during the planning horizon, i.e. the difference between accumulated
cash and outstanding liabilities.
For continuous production planning and scheduling, there is no technological production
capacity limit. The number of production modes is typically smaller for continuous pro-
duction plants which eases the scheduling problem. Erdirik-Dogan and Grossmann (2006,
2008) consider a single production site with continuous single-stage production. Erdirik-
Dogan and Grossmann (2006) present the "basic" production planning and scheduling
problem for a single production unit. The production unit can be set up to multiple
products but only one product can be produced at a time. Deterministic demand fig-
ures serve as lower bounds for the portfolio of chemicals to be produced. The model
is formulated in continuous time51 whereby a rough division of the planning horizon into
time periods is used. This allows the introduction of demand due dates and eases the

50 To overcome resource conflicts in the sequential procedure, a post-optimization step performs a
re-scheduling of the schedule provided by the assignment model.
51 I.e. production times are continuous variables.

calculation of inventory levels. The objective is to maximize total profit over the planning
horizon by considering approximated inventory costs as well as changeover and process-
ing costs. To solve the proposed MILP, an iterated decomposition approach is described
by splitting the planning model into a master problem52 and a sub-problem.53 Here,
the master problem provides an upper bound for the original problem whereas the sub-
problem provides a lower bound. The iteration is stopped if the difference between both
bounds is sufficiently small. Erdirik-Dogan and Grossmann (2008) extend this model by
considering multiple parallel production units. This model additionally requires assigning
production lots to the production units, which complicates the problem considerably. An
iterated decomposition approach similar to that of Erdirik-Dogan and Grossmann (2006)
is presented to solve larger instances of this problem.
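The bounding logic of such an iterated decomposition (the master problem yields an upper bound, the sub-problem a lower bound, and the iteration stops once the gap is sufficiently small) can be sketched on a toy instance. The sketch below is an assumption-laden illustration and not the MILP of Erdirik-Dogan and Grossmann: a single unit, three hypothetical products, a master problem that ignores set-up times, a sub-problem that accounts for them, and enumeration with an exclusion ("integer cut") of already evaluated product sets instead of a solver.

```python
from itertools import combinations

HORIZON = 10.0  # available hours on the single production unit (hypothetical)
# product: (profit per production hour, maximum useful production hours, set-up hours)
PRODUCTS = {"A": (5.0, 6.0, 1.0), "B": (4.0, 5.0, 3.0), "C": (3.0, 4.0, 0.5)}

def fill(products, hours):
    """Allocate the given hours to the most profitable products first; return the profit."""
    profit = 0.0
    for rate, max_hours, _ in sorted((PRODUCTS[p] for p in products), reverse=True):
        used = min(max_hours, max(hours, 0.0))
        profit += rate * used
        hours -= used
    return profit

def master(excluded):
    """Relaxation without set-up times; returns (upper bound, product set) or None."""
    candidates = [
        frozenset(combo)
        for k in range(1, len(PRODUCTS) + 1)
        for combo in combinations(sorted(PRODUCTS), k)
        if frozenset(combo) not in excluded
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda s: fill(s, HORIZON))
    return fill(best, HORIZON), best

def sub_problem(products):
    """Feasible plan for a fixed product set: set-ups reduce the production hours."""
    setup = sum(PRODUCTS[p][2] for p in products)
    return fill(products, HORIZON - setup)

def solve(tolerance=1e-6):
    excluded, lower, best_set = set(), 0.0, None
    while True:
        result = master(excluded)
        if result is None:
            break
        upper, candidate = result
        if upper - lower <= tolerance:   # remaining sets cannot beat the incumbent
            break
        value = sub_problem(candidate)   # feasible solution, hence a lower bound
        if value > lower:
            lower, best_set = value, candidate
        excluded.add(candidate)          # cut: do not propose this set again
    return lower, best_set

print(solve())
```

On this instance the loop evaluates three product sets before the master bound drops below the incumbent, which illustrates why tight master problems matter for the number of iterations.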
In practice, large-scale chemical companies operate multiple different plants at multiple
production sites. Such a situation requires modelling material transports between
plants/sites and is considered by Bok et al. (2000), Neiro and Pinto (2004), Pinto et al.
(2000), and Shah and Ierapetritou (2011). Bok et al. (2000) present a discrete-time production
planning and scheduling model with multiple plants. Each plant can be set up
to a specific production mode once at the beginning of each period. Transports between
sites affect the provider’s and the receiver’s stock levels.54 Raw material transports are
associated with fixed ordering costs. However, further logistical details such as turnover
capacities are not considered. The objective is to maximize the total profit over the plan-
ning horizon. To solve larger problem instances, a two-stage decomposition procedure
is described which partitions the monolithic model into a master problem and a sub-
problem. The master problem determines a production and transport plan by ignoring
changeover calculations. In the sub-problem the production plan is determined given the
transport flows from the solution of the master problem. Based on the solution of the
sub-problem, cuts are added to the master problem in the next iteration.
A related problem for the operational production and distribution planning of refineries
is presented by Pinto et al. (2000). Here, a single-period planning model is formulated
which determines the blending and distribution of refinery products via a network of
pipelines and tanks. Multiple distillation units are assumed to provide a continuous flow
of multiple (raw) materials being transported by multi-product pipelines to tanks. Each
tank is designated to one product type only. Transition processes for switching from one
product to another are modelled.55 The model aims at determining a pumping schedule
maximizing the profit. The resulting MILP is solved with standard solvers. Neiro and
Pinto (2004) extend this work by modelling petroleum supply and production planning
52 The master model determines production quantities and inventories per period.
53 The sub-model determines the precise schedule using the restricted set of products planned by the
master problem.
54 Customer transports are not considered because products are sold ex factory.
55 Each customer market is supplied via a multi-product pipeline from these tanks. The distribution
pipelines can be fed simultaneously from multiple tanks to allow a blending of the (raw) products to
final products.

for multiple periods. Transition processes of multi-product pipelines are neglected. The
production is formulated as a generic transformation model that accounts for multiple
production modes. For each unit, an intended production mode exists. Deviations from
the intended mode generate additional processing costs. Although the model is formulated as
a profit maximization model, the demand is deterministic and no stock-outs are allowed
such that, in fact, the model is a cost minimization problem considering production,
transport, and storage costs.
Shah and Ierapetritou (2011) present an integrated planning model for refinery pro-
duction based on distillation and blending units. Production planning is simplified by
assuming semi-continuous production modes.56 Changeovers of production modes are
penalized in the objective function.57 The proposed MILP is formulated in continuous
time and can be solved with standard solvers for realistically sized problem instances.
The objective is to maximize a profit-like measure considering cost-inducing performance
measures such as the number of setups or the plants’ utilization rates. The weights for
the performance measures have to be interpreted rather as preference measures than as
cost rates. The setting of these weights significantly affects the model’s computational
complexity.
To sum up, the review of existing literature reveals that considerable effort has been devoted to
developing integrated approaches to tackle deficiencies of sequential planning processes.
Prevalent in scientific literature are analytical optimization approaches allowing a high
degree of both portability and conciseness. On the tactical level, however, stochastic
components have to be considered to enable an accurate modelling of real systems. Optimization
approaches typically handle only one source of uncertainty.58 But for an
encompassing model focusing on multiple system components, usually multiple sources
of uncertainty have to be considered. Simulation approaches offer this capability and,
therefore, are prevalent in literature on integrated tactical problems (see Table 4.1).
The literature review also reveals that most simulation-based approaches lack a system-
atic procedure for deriving recommendations of improvements. Instead, most approaches
still bear the stigma of rather descriptive models where a (pre-)selected number of scenar-
ios is evaluated and compared. Potentials for improvement are not revealed directly this
way. However, a systematic procedure for post-analysis of simulation models can lead to
an exploration of the space of possible system configurations which allows deriving rec-
ommendations for improvements on a (more) reliable basis. This topic will be discussed
in section 4.3.4.
A further fact is that most integrated approaches do not consider transport processes
accurately, in particular at the operational level. Most approaches focus on production
planning decisions and handle logistical processes as auxiliary components. Exceptions
56 I.e. for the set of production tasks the product recipes can be varied in certain ranges only.
57 Changeovers are also considered for multi-purpose tanks whereby the changeover implies a downgrade
of the interface of mixed products.
58 This is most often the demand, see Peidro et al. (2009) for more detailed information.

such as Pinto et al. (2000), Neiro and Pinto (2004), and Shah and Ierapetritou (2011)
consider refinery operations where the logistical system differs from basic chemical indus-
try in the sense that primarily the pipeline management and tank management constitute
critical logistical components. In basic chemical industry, however, also rail transports
constitute a relevant logistical aspect, in particular for inter-site transports. Also other
logistical details such as turnover capacity are often not modelled in detail.
These drawbacks are overcome by the simulation framework proposed in this thesis by
integrating MILPs for operational distribution planning considering transport-mode spe-
cific restrictions in a simulation environment. It aims at integrating both the production
and logistics system in basic chemical industry at the same level of aggregation. As the
proposed framework is designed to support tactical decisions, multiple sources of uncer-
tainty are handled. The characteristics of the proposed framework are briefly summarized
in the last row of Table 4.2. The next section categorizes and discusses typical sources of
uncertainty relevant for integrated supply chain planning in chemical industry.

4.2 Sources and effects of uncertainty in chemical industry
Planning a network of integrated chemical production sites causes various problems. Due
to the interdependency of the production processes, local disturbances at a certain level
of a particular production site can affect the site’s performance entirely or at least
downstream (intra-site effect). Moreover, due to the interconnectedness of production sites
in a chemical production network, such a local disturbance can spread across the entire
SC (inter-site effect). This typically leads to increased risk costs. The organization
and central control of chemical production networks aim at limiting the effects of such
disturbances.
In basic chemical industry spill-over effects are intensified by the inflexibility of contin-
uous production processes. Even small disturbances may lead to plant break-downs. In
general, continuously operated plants have to face high set-up or changeover costs.59 This
is because most chemical reactions need some running-in time before the process and pro-
duced products meet the required (quality) specifications. The products produced during
a changeover from one production mode to another are not marketable or require some
post-processing (so-called off-spec products). Additionally, after a complete shut-down
technical inspection and cleaning routines have to take place before the restart. Due to
the high costs of re-start and changeover procedures, plant shutdown and changeover costs
dominate most other costs such as external supply or storage costs. In the worst case,
final-product plants have to be shut down. If such a plant is operated at maximum capacity,
the production losses cannot be caught up, which leads to lost sales.
59 See e.g. Kallrath (2002) or Oh and Karimi (2004).

Although the interdependency of supply risks is well-known in chemical production
networks, it is difficult to quantify the effects arising from certain sources of uncertainties.
This is caused by the complexity of production processes, the interconnectedness
of the production network, and the limited capacities of commonly used logistical in-
frastructures. However, quantitative information is necessary to determine economically
reasonable risk management decisions.
To avoid shutdowns and to increase the local network’s robustness against disturbances,
buffers of raw and intermediate materials have to be at hand or at least immediately acces-
sible. Therefore, one of the most crucial tasks in SC planning is the management of stocks
and material flows in the network.60 This is also necessary for perfectly balanced produc-
tion sites, i.e. when all local plants are coordinated in such a way that no intermediate
material is in deficit or runs in surplus during normal operation. Stocks of intermediate
products are kept to buffer plant shortfalls or unintentional changeovers. However, most
integrated production sites have grown gradually rather than being planned in one step. Hence, in
normal operation, there typically exists a set of intermediate chemicals which are in surplus
or in deficit. Regular logistical activities are necessary to keep the production network
running properly.
Besides uncertainties arising from the production system, other internal sources also
affect or disturb the network’s performance. Basically, there are two interdependent technological
structures in chemical plants. Production sites cannot be operated without an
infrastructure supplying these plants with raw materials and energy. At the same time,
produced materials have to be transferred, stored, or further processed. Intra-site trans-
ports of intermediate and basic chemicals are usually handled via local pipeline systems
where dedicated pipelines for each chemical exist. Most raw and intermediate chemicals
are liquids or gases and are stored in tanks. This allows the decoupling of interconnected
plants to a certain extent by buffering material imbalances. Most final and some intermedi-
ate chemicals are solid substances that are either directly transferred to customer-oriented
packagings or, in the bulk case, transferred into outbound storages such as silos.
Energy in chemical production sites comprises two main sources: electric power and
heat. Electric power is distributed via a cable network. Either a local power plant exists
or the production site is supplied by an external partner. A production site often has a
backup system to bridge shortfalls of the primary source of electric power (at
least for some time). Hence, disturbances due to a power shortfall occur rarely at
chemical production sites.
Heat is often distributed as steam, which is centrally generated and distributed via
heat transfer pipelines at different pressures. Especially the steam supply is a vulnerable
process in each integrated chemical production. Typically, the set of local plants can be
sub-divided into steam providers and consumers. Steam can result as a by-product (e.g. of

60 See Park et al. (2006) for a complex inventory management model for a single site.

exothermic reactions) or as a result of quenching processes.61 On the other hand, steam
is required by endothermic reactions or for distillation processes. That is, if a steam-supplying
plant shuts down, additional external steam supply is required to properly supply
steam-consuming plants at a site. Otherwise, steam-consuming plants are forced to shut down
as well. This supply can either be organized by local backup boilers or by an external
producer such as a power plant. However, if the backup access to an external steam
supplier is not available, a backup boiler needs to be ready for steam supply in an instant
which constitutes an expensive option. The backup boiler requires permanent heating
and, thus, also generates considerable continuous energy costs. Therefore, it is difficult to
decide whether to risk a shutdown of steam-consuming plants or to maintain a backup
boiler permanently.
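The trade-off described above can be made tangible with a back-of-the-envelope expected-cost comparison. The sketch rests on strong simplifying assumptions that are not part of the source: a risk-neutral decision maker, a backup boiler that fully prevents shutdowns of steam-consuming plants, a constant outage rate, and purely hypothetical cost figures.

```python
def annual_cost_with_boiler(standby_cost_per_day, days=365):
    """Cost of keeping the backup boiler permanently heated."""
    return standby_cost_per_day * days

def expected_annual_cost_without_boiler(outages_per_year, shutdown_cost):
    """Expected cost of forced shutdowns of steam-consuming plants."""
    return outages_per_year * shutdown_cost

def break_even_outage_rate(standby_cost_per_day, shutdown_cost, days=365):
    """Outage rate per year at which both options cost the same in expectation."""
    return standby_cost_per_day * days / shutdown_cost

# Hypothetical figures in monetary units (illustration only).
STANDBY_COST_PER_DAY = 2_000
SHUTDOWN_COST = 500_000      # restart, off-spec products, and lost sales per event
OUTAGES_PER_YEAR = 1.2       # expected shutdowns of steam-supplying plants

with_boiler = annual_cost_with_boiler(STANDBY_COST_PER_DAY)
without_boiler = expected_annual_cost_without_boiler(OUTAGES_PER_YEAR, SHUTDOWN_COST)
threshold = break_even_outage_rate(STANDBY_COST_PER_DAY, SHUTDOWN_COST)

print(with_boiler, without_boiler, threshold)
```

With these illustrative numbers the permanently heated boiler only pays off in expectation above roughly 1.46 outages per year; a risk-averse planner, or one facing spill-over effects to other sites, may well decide differently.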
The local pipeline system of chemical production sites needs to be carefully supervised
because disturbances immediately affect all associated plants. Possible disturbances are
manifold. However, pipeline systems are exceptionally reliable from a technological point
of view if they are carefully maintained with respect to corrosion and material weaknesses.
Problems may occur from operating a plant under inappropriate conditions. For example,
if pressure and/or temperature are not within the allowed ranges, chemical reactions may
take place inside pipelines. This may affect the quality of the transported material and
can lead to congestion or damage in the pipeline.
Infrastructure also encompasses all facilities dedicated to material handling and trans-
ferring from external sources. This includes inter-site pipelines, roads, railways, or berths
as well as all facilities needed for turnover processes.
Besides technological problems, disturbances may also arise due to managerial
mistakes. The management of integrated production sites encompasses e.g. decisions
about financial means, the staff, and stocks of all materials relevant for the production
processes. Financial disturbances include mis-estimations of prices and costs and may
lead to inefficient production or sourcing decisions. Uncertainty of workforce arises from
fluctuations in personnel capacity.62 Disturbances of material supply can arise from two
sources: the local stocks are inappropriate and/or the procurement and distribution
processes do not match the realized requirements, e.g. orders are delivered too late or in
the wrong quantity/quality.
An important part of external uncertainty is caused by customer demands.63 However,
in basic chemical industry almost all final chemicals are produced to stock and, thus,
are decoupled from direct market demands. Hence, variations of the customer demand
do not affect the upstream production network immediately. Due to the exceptionally
high re-start and changeover costs for plants, demand variations are mostly reflected by
stock increases or sales price declines. The adaptation of production capacities by means
61 A prominent example are steam crackers (see Figure 2.8) where the quenching process provides
high-pressure steam, see Behr et al. (2010, p. 178).
62 E.g. due to strikes, illnesses etc.
63 See Applequist et al. (2000).

of plant shut-downs or mode changeovers is only considered for a long-term decline in
customer demand, e.g. in a recession. For these reasons, demand uncertainty is not in the
main focus of this work.
Another type of external uncertainty emerges from the suppliers’ side. Suppliers com-
prise all partners providing raw and intermediate materials which are not part of the con-
sidered chemical production network. From these external partners materials are supplied
to keep the production network working. Uncertainties affect the network’s performance
when orders cannot be fulfilled in the expected way, e.g. when shipments arrive too late or
not in the expected quantities. Such uncertainty can hardly be avoided, but has to be
reflected in safety stocks and strategic supply planning.
As a last type, external uncertainties may arise from governmental authorities or service
providers. Here, the potential realizations can be manifold depending on the specific
source. Changing governmental regulations may force a re-organization or re-configuration
of the chemical SC, e.g. due to restrictive environmental regulations, duty drawbacks, or
changed tax rates.64 However, such events typically have a long-term character and,
thus, have no immediate effect on SC operations. In contrast, service contracts support
SC operations on a regular basis. Prominent examples are logistical services provided
by external partners, so-called 3rd party logistics (3PL). Here, logistical processes are
outsourced and only supervised by the SC management. Uncertainty results e.g. from
unknown delivery dates or shortfall times (in case of outsourced maintenance activities).
These variations have to be considered in planning the SC operations.
In general terms, uncertainty refers to the unknown realization of at least one (random)
variable. In SCM, objects considered and planned have different dimensions categorized
into quantitative, qualitative, and temporal measures.65 Examples of uncertain events can
be found for each category discussed above.66 Table 4.3 shows a classification of sources
of uncertainty and examples w.r.t. quantity, quality, and time.67
Depending on the system to model, specific sources and dimensions of uncertainty are
more or less relevant. Typically, the quality dimension is less relevant in the SCM context
because its measures are difficult to incorporate in quantitative models. In contrast,
quantitative measures are crucial in planning SC operations. Temporal aspects become
especially relevant in the logistical context, e.g. when capacities of transport resources are
to be planned and estimations about turn-over times are necessary.
To model uncertainty, knowledge about the nature of the underlying stochastic process
is required. More precisely, assumptions have to be made on how this process is ruled,
which outcomes are possible,68 and what their corresponding probabilities are.69 If the set

64
See e.g. Vidal and Goetschalckx (2001) or Oh and Karimi (2004).
65
See e.g. Van der Vorst and Beulens (2002).
66
For a similar categorization see Klibi et al. (2010).
67
For related examples see also Applequist et al. (2000).
68
The so-called sample space.
69
The probability of an outcome may depend on other variables or decisions made in advance, see e.g. Law

source                                 quantity               quality          time
internal
  financial                            credit line            interests        due date
  personnel                            workforce size         qualification    speed
  materially                           stock levels           decay            evaporation rate
  technological: infrastructure        pumping capacity       reliability      pumping rates
  technological: production system     yield rates            off-spec rate    processing times
external
  demand                               demand variation       product specs    due dates
  supply                               availability           product specs    delivery dates
  3PLs                                 transport capacities   reliability      transport times

Table 4.3: Classification of sources of uncertainties and examples (per dimension)

of outcomes and the process’ basic structure are known, information about the underlying
probabilities can be deduced.
For example, the output rate of a simple SISO reactor depends on various conditions.
To model the transformation of input to output, knowledge about the chemical reaction
and the chemical reactor can be used. E.g., a linear model might be used to describe this
relationship properly on an aggregated level (e.g. the hourly production rate). Neglecting
minor influences70 leads to a simplification of the process model. Additionally, measure-
ment errors may hinder a perfect description of the process and lead to uncertainty in
the observed process measures.71 This uncertainty is expressed e.g. by a (normal) error
process. The resulting linear regression model can be verified using historical records of
the process. Often historical records allow analysts to deduce a proper stochastic model
of such a process. For more complex production processes more sophisticated stochastic
models (as described in section 2.3) can be necessary.
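The regression idea described above can be sketched in a few lines. The following is a minimal illustration, not the model used in this work: the function name `fit_line`, the synthetic inflow and outflow records, the slope of 0.8, and the error standard deviation of 0.5 are all assumptions chosen for demonstration.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # n - 2 degrees of freedom because two coefficients are estimated
    sd = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    return a, b, sd

# Synthetic "historical records" (assumed values, not plant data)
rng = random.Random(1)
inflow = [130 + 0.1 * k for k in range(200)]
outflow = [0.8 * f + rng.gauss(0.0, 0.5) for f in inflow]

a_hat, b_hat, sd_hat = fit_line(inflow, outflow)
print(round(b_hat, 3), round(sd_hat, 3))
```

The estimated slope plays the role of the aggregated input-output relationship, while the residual standard deviation parametrizes the (normal) error process.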
However, not all relevant stochastic processes are typically known at such a level of
detail. For external sources of uncertainty, typically, the underlying processes are not
known (entirely) or not describable in detail.72 Even the set of outcomes might not be
perfectly known. Modelling such processes is often based on historical records and rough
assumptions about the modelled process. E.g. the total customer demand faced by a
company or an entire SC is the aggregate of a multitude of individual orders placed in
a given interval of time. However, neither can models for each individual ordering process
be derived, nor could an aggregate model describing the ordering processes of all potential
customers be handled. Hence, to model and estimate demands, one has to rely
on historical records of total customer orders. A model to predict the
total demand to be faced in a period relies on indicators (such as GDP growth rate)

(2007).
70
E.g. the composition of input flows or precise temperature and pressure within the reactor.
71
For a more detailed discussion and definition about uncertainty in the context of operations manage-
ment see Zimmermann (2000).
72
According to Zimmermann (2000) this refers to a lack of information causing the uncertainty.

and on common assumptions about the customers’ ordering processes.73 However, since
the individual ordering processes cannot be modelled adequately, no reliable
information about the degree of uncertainty inherent in the estimation procedure
is available.74 This phenomenon often occurs
in long-term planning problems, e.g. when an SC configuration is to be found which fits
best to a fuzzy set of scenarios.75
For the purpose of modelling SC operations on a detailed level, precise knowledge about
the stochastic processes to be modelled is required. For an existing SC the status quo
of operations is known and documented in most cases. Hence, internal processes can
be described and information about the processes’ stochasticity is (at least) partly avail-
able. This allows the proper modelling of internal uncertainty. E.g. chemical production
processes can be modelled using the methods described in section 2.3 and illustrated in
example 3. However, such a process model is insufficient to model the production pro-
cess as a whole since only the output streams can be described with respect to the input
stream. To set up a proper model of the entire production process the stochasticity of
the input stream(s) has to be modelled as well. Typically, there exists a finite set of
operation states for a plant which are overlaid by stochastic noise. To model the states
of a plant, the Markov methodology introduced in subsection 3.2.2 can be used, enriched
by a stochastic model for the noise process. The following example extends the analysis
of the de-alkylation plant introduced in example 3 by analysing and modelling the inflow
process.

Example 8 (Modelling of operation states of chemical production plants). Consider the
de-alkylation plant introduced in example 3. This chemical production process requires
streams of aromatic hydrocarbons (mainly Toluene and Xylene) and hydrogen to produce
Benzene (and a stream of long-chained hydrocarbons as a by-product). The limiting input
is the stream of aromatic hydrocarbons. Nonetheless, the supply of hydrogen has to be
assured, although it is typically not limiting. To assess the states of operation, the inflow
of the aromatic hydrocarbon stream is analysed. Because the set of general operation
states of this plant should be analysed, the median inflow rate of aromatic hydrocarbons is
recorded on a daily basis.76 Figure 4.3a shows the daily median inflow rates of aromatic
hydrocarbons recorded on 182 days and Figure 4.3b shows the corresponding histogram.77
It can be observed that two basic states of this plant are prevalent: Either the plant
produces at a level of about 141 units or it is not producing. However, there are some
fluctuations primarily at the beginning of the recording interval. Investigating the reasons
for these perturbations reveals that they are caused by technical re-adjustments due
to preceding repair and maintenance activities. Hence, two general states of operation
are considered as relevant, namely: full capacity production or a complete shutdown. Let
O = {0, 141} denote the set of these two states.

73
See Tsiakis et al. (2001).
74
A prominent example is the occurrence of so-called rare events or hazards such as atomic accidents or
stock exchange crashes. See Taleb (2007) for a more detailed discussion.
75
See Klibi et al. (2010).
76
This is in contrast to the modelling of the short-term process model where more granular data is
required. See example 3 where hourly averages are used.
77
Note that the data used for estimating the detailed process model corresponds to the first 14 observations
depicted here in Figure 4.3a.

(a) Time series of daily median inflow rates of aromatic hydrocarbons (x-axis: time in days, y-axis: median inflow rate per day)
(b) Histogram of daily median inflow rates of aromatic hydrocarbons (x-axis: median inflow rate per day, y-axis: frequency)

Figure 4.3: Daily median inflow rates of aromatic hydrocarbons for a de-alkylation plant

To model the state of the plant, a discrete Markov process is used. To calculate the
transition matrix Q of a discrete Markov process, the transition probabilities between both
states have to be estimated. All transitions in the recorded inflow data are used. The time
series of plant states $\omega_t$ is calculated by

\[
\omega_t = \begin{cases} 141 & f_t > 0 \\ 0 & f_t = 0 \end{cases} \tag{4.1}
\]

where $f_t$ is the median inflow rate at day t depicted in Figure 4.3. The estimated transition
matrix $\hat{Q}$ is then given by

\[
\hat{Q} =
\begin{array}{c|cc}
    & 0     & 141   \\ \hline
0   & 0.875 & 0.125 \\
141 & 0.002 & 0.998
\end{array}
\]
The corresponding steady state vector is π = (0.087, 0.913). In other words, in the long
run, on 8.7% of days, the de-alkylation plant suffers from an (unintended) break-down.
The probability to face a break-down at a day which started in normal operation is 0.2%.
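The estimation step can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names and the short inflow record are made up for illustration and do not reproduce the recorded plant data; the transition probabilities are simply the row-normalised counts of observed day-to-day transitions, and the closed-form steady state holds for two-state chains only.

```python
def to_states(flows, full_level=141):
    """Map daily inflow rates to plant states as in (4.1)."""
    return [full_level if f > 0 else 0 for f in flows]

def estimate_transition_matrix(states, levels):
    """Row-normalised counts of transitions between consecutive days."""
    counts = {a: {b: 0 for b in levels} for a in levels}
    for today, tomorrow in zip(states, states[1:]):
        counts[today][tomorrow] += 1
    matrix = {}
    for a in levels:
        total = sum(counts[a].values())
        matrix[a] = {b: (counts[a][b] / total if total else 0.0) for b in levels}
    return matrix

def stationary_two_state(matrix, levels):
    """Steady state of a two-state chain from its off-diagonal entries.

    Assumes that at least one of the two off-diagonal probabilities is positive.
    """
    low, high = levels
    p_up, p_down = matrix[low][high], matrix[high][low]
    return {low: p_down / (p_up + p_down), high: p_up / (p_up + p_down)}

# Made-up inflow record (12 days, three of them without production)
flows = [141.2, 140.9, 0, 0, 141.0, 141.1, 140.8, 0, 141.3, 141.0, 140.7, 141.1]
levels = (0, 141)

states = to_states(flows)
q_hat = estimate_transition_matrix(states, levels)
pi_hat = stationary_two_state(q_hat, levels)
print(q_hat)
print(pi_hat)
```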
When the plant operates at full capacity, there are still variations in the realized inflow
rates. To model these fluctuations, the time series of average hourly inflow rates is analysed
in an appropriate interval where the plant operated at full capacity. As in Example
3, a two-week interval is chosen corresponding to days 105-119 depicted in Figure 4.3a.
Figure 4.4 shows the time series of the recorded flow rates.

[Time series plot; x-axis: time (hours), y-axis: average inflow rate per hour]

Figure 4.4: Hourly average inflow rate of aromatic hydrocarbons (including confidence
intervals at a significance level of α = 0.1%)

In general, the inflow rates oscillate slightly around an average of 141 units. However,
some outliers occur which are probably caused by measurement errors rather than real variations
of the flow rate. To account for these observations, mean and standard deviation
are robustly estimated by calculating the median and the median absolute deviation from the median
(MAD). This yields the following robust estimates: $\hat{\mu}_{\text{median}} = 141$ and $\hat{\sigma}_{\text{MAD}} = 0.18$.
Assuming a normal distribution of the inflow rate, both estimates are used to calculate a
confidence interval at a significance level of α = 0.1%. The corresponding quantiles are
displayed in Figure 4.4 by the dashed lines. Observations outside this interval are
handled as outliers.
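The robust trimming step may be sketched as follows. The sample values and function names are assumptions made for illustration. It is further assumed that the MAD is scaled by the usual consistency factor 1.4826 for normal data and that the interval is two-sided; the source does not state either detail.

```python
from statistics import NormalDist, median

def robust_location_scale(values):
    """Median and MAD-based scale (MAD times 1.4826, assumed consistency factor)."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    return centre, 1.4826 * mad

def trim_outliers(values, alpha=0.001):
    """Drop observations outside the two-sided normal interval at level alpha."""
    centre, scale = robust_location_scale(values)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = centre - z * scale, centre + z * scale
    kept = [v for v in values if lower <= v <= upper]
    return kept, (lower, upper)

# Made-up hourly inflow rates with one suspicious reading (143.5)
hourly = [141.0, 141.1, 140.9, 141.2, 140.8, 141.0, 143.5, 141.1, 140.9, 141.0]

centre, scale = robust_location_scale(hourly)
kept, bounds = trim_outliers(hourly)
print(centre, round(scale, 4), bounds, len(kept))
```

Because median and MAD are hardly affected by single extreme readings, the interval is not inflated by the very observations it is meant to detect.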
Having removed the outlying observations, the assumed normal distribution is checked with the
Shapiro-Wilk test. Figure 4.5a shows the QQ-plot and Figure 4.5b the autocorrelation
values for the trimmed time series of inflow rates.
The trimmed time series shows a good fit to the normal model and no indication of
autocorrelation.
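The autocorrelation check can be reproduced with a few lines of code. The sketch below computes the sample autocorrelation function for a synthetic series; the series itself (independent normal noise around 141 with standard deviation 0.18) and the approximate 95% band of ±1.96/√n are assumptions for illustration. A normality test such as Shapiro-Wilk would require a statistics library and is not shown here.

```python
import random

def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / denom
        for lag in range(max_lag + 1)
    ]

# Synthetic trimmed series (assumed, not the recorded plant data)
rng = random.Random(7)
noise = [141 + rng.gauss(0.0, 0.18) for _ in range(300)]

acf = autocorrelation(noise, 25)
bound = 1.96 / len(noise) ** 0.5
exceeding = sum(abs(r) > bound for r in acf[1:])
print(exceeding, "of 25 lags exceed the approximate 95% band")
```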
(a) QQ-plot of trimmed inflow rates of aromatic hydrocarbons (x-axis: theoretical quantiles, y-axis: sample quantiles)
(b) Autocorrelations of trimmed inflow rates of aromatic hydrocarbons (x-axis: lag, y-axis: ACF)

Figure 4.5: Diagnostic plots of trimmed inflow rates of aromatic hydrocarbons for a de-
alkylation plant

4.3 A framework for simulation-based integrated planning of supply chains in chemical industry

From the review of literature it can be concluded that simulation-based approaches estab-
lish an opportunity to tackle integrated planning problems when analytic optimization
is not able to properly reflect all aspects of the considered problem. However, finding
an optimal solution for an integrated planning problem by means of a simulation model
is more complicated compared to analytic optimization approaches. This is because the
development of a simulation model is a complex multi-step process. Figure 4.6 shows a
simple sketch of the steps in a simulation study.

[Diagram with the boxes: conceptual model, data analysis, simulation model, perform experiments, meta-model, decision support]

Figure 4.6: Steps in the simulation-based planning projects

A conceptual model is a formal description of the simulation model to be developed,
i.e. the relevant elements of the real system as well as their relations are represented at an
appropriate level of detail. It defines the general structure of the simulation model and
determines the requirements for data analysis.
In the data analysis step historical data of the real system to be modelled are analysed to
extract relevant parameters for the simulation model and the involved stochastic processes.
The simulation model is the implementation of the conceptual model accompanied by the
required parameters obtained from the data analysis step as some sort of a computer
program.
To derive recommendations for improving the system performance, simulation experi-
ments have to be conducted. I.e. control parameters of the simulation model are altered
in a structured way to investigate their effects on the performance measures. A specific
setting of the control parameters is called a configuration of the system.
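As a small illustration of altering control parameters in a structured way, the following sketch enumerates a full factorial design. The parameter names and levels are hypothetical and only serve to show how configurations can be generated; other experimental designs are equally possible.

```python
from itertools import product

# Hypothetical control parameters and levels (illustration only)
control_levels = {
    "rtc_fleet_size": [40, 60, 80],
    "target_stock_days": [5, 10],
    "transfer_arms": [1, 2],
}

def full_factorial(levels):
    """Return every combination of parameter levels as one configuration."""
    names = list(levels)
    return [
        dict(zip(names, combo))
        for combo in product(*(levels[name] for name in names))
    ]

configurations = full_factorial(control_levels)
print(len(configurations))
print(configurations[0])
```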
As a simulation model typically contains stochastic elements, a simulation run is a
stochastic process. Hence, the outcomes of simulation runs are random variables. Therefore,
a certain number of simulation runs has to be conducted to handle the stochasticity
of the simulation outcomes. Keeping run times within acceptable ranges while deriving concise
information about the system performance is a challenging task. Various
approaches are provided in the literature to tackle this problem (often labelled as
simulation optimization).78
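The trade-off between the number of runs and the precision of the estimated performance can be sketched as follows. The function `simulate_once` is a hypothetical stand-in for a real simulation run, and the interval uses a normal approximation (an assumption that is reasonable only for a sufficiently large number of runs).

```python
import random
from statistics import NormalDist, mean, stdev

def simulate_once(config, rng):
    """Hypothetical stand-in for one simulation run returning a noisy measure."""
    return 100.0 - 0.2 * config["rtc_fleet_size"] + rng.gauss(0.0, 5.0)

def replicate(config, runs, seed=0, level=0.95):
    """Mean performance and half-width of an approximate confidence interval."""
    rng = random.Random(seed)
    outcomes = [simulate_once(config, rng) for _ in range(runs)]
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(outcomes) / runs ** 0.5
    return mean(outcomes), half_width

config = {"rtc_fleet_size": 50}
for runs in (100, 400, 1600):
    estimate, half_width = replicate(config, runs)
    print(runs, round(estimate, 2), round(half_width, 2))
```

Quadrupling the number of runs roughly halves the half-width, which makes the conflict between run time and precision explicit.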
In the case of single-objective problems, the aim is to find the optimal configuration of
the system under study. Once found, this configuration is the sole recommendation for
the management. The objective typically represents an expected performance measure.
However, since stochastic events may influence the performance measure, it is advisable
to incorporate measures of variability (like the variance of a performance measure) as a
further decision criterion. This immediately leads to a multi-objective problem.79
In the multi-objective case, a set of Pareto-optimal configurations has to be found (so-
called Pareto set). In many cases not all possible Pareto-optimal configurations can be
determined. Instead, a subset is returned by a multivariate simulation optimization approach.
Analytical models describing the relationship between the control parameters and
performance measures can be fitted to this subset. These analytical
models are often low-order polynomial models and are called meta-models, response surface
models, or surrogate models.80 In other words, meta-models are mathematical functions
representing interdependencies of variables in a (simulated) system. Meta-models can be
used to derive Pareto-optimal configurations without the need to simulate these configu-
rations. Meta-models can help to find the most suitable system configuration in an easy
way (e.g. by weighting input and performance measures with costs and/or income rates).
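Identifying Pareto-optimal configurations among a set of simulated results can be sketched as follows. The result pairs (expected cost, cost variance) are hypothetical, and both objectives are assumed to be minimised.

```python
def pareto_set(points):
    """Keep the points that are not dominated (all objectives minimised)."""
    def dominates(a, b):
        no_worse = all(x <= y for x, y in zip(a, b))
        strictly_better = any(x < y for x, y in zip(a, b))
        return no_worse and strictly_better

    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (expected cost, cost variance) per simulated configuration
results = [(100, 9.0), (95, 12.0), (110, 4.0), (105, 10.0), (95, 15.0)]

front = pareto_set(results)
print(front)
```

In this made-up data set, (105, 10.0) is dominated by (100, 9.0) and (95, 15.0) by (95, 12.0), so three configurations remain as candidates for the decision maker.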
In the remainder of this section the outlined planning process is described step-by-step.
Along these theoretical deductions, an exemplary simulation model of a chemical SC is
successively described, developed, and analysed in the example blocks.

4.3.1 Conceptual modelling & data analysis


Building a conceptual model is the first step in a simulation project. The main idea is to
generate a blueprint of the structure of the system to be modelled. Based on the problem
definition, the scope of the study, and the real-world system, a concept of how to
model and solve the stated problem is derived. The conceptual model forms the backbone of the
subsequent simulation model. Figure 4.7 shows a prototypical development scheme for
simulation projects.
At least two parties are involved in the development of a simulation study. In SCM,
simulation projects are instruments for process optimization. Such projects are initiated
by the perception of process deficiencies and weaknesses. Overcoming these problems is

78
See e.g. Law (2007), Banks et al. (2005) or Kleijnen (2007).
79
In this context Pareto-optimal configurations are often called stochastically dominating, see Klibi et al.
(2010).
80
See e.g. Box and Draper (2007) or Kleijnen (2007).
81
Following Law (2007) and Banks et al. (2005).

[Diagram relating manager, modeller, problem definition, project scope, real-world system, database, conceptual model, and simulation model via the steps modelling, programming, validation, and verification]

Figure 4.7: Development scheme for simulation studies81

the responsibility of SC managers. The perceived deficiencies define the aims and scope of
the project as well as the system components to be improved. Modellers have to propose
a concept for identifying the reasons why the perceived deficiencies occur. In a first
phase this concept description is called a conceptual model or assumptions document.82
In this phase of the project all stakeholders jointly discuss the scope
and capabilities of the model to be developed. From the modeller’s point of view, the
conceptual model is a simplification of the real-world system.83 Simplification can be
achieved by reducing complexity or scope of the model:84

• complexity reduction
– subsume system components
– drop randomness
– approximate dependencies

• scope reduction
– drop system components
– drop variables
– restrict domains of variables.

Together, managers and modellers have to agree upon simplifications and implicit assump-
tions made, finally leading to the definition of a suitable conceptual model. As Figure 4.7
suggests, the development of simulation models is an iterative process, where the appro-
priateness of model specifications is checked in a structured way. Errors in implementing
the conceptual model in a computer program are checked by verification.
82
See Law (2007) or Banks et al. (2005) for the latter term.
83
See Robinson (2006).
84
See Robinson (2006).

In the validation phase the behaviour of the simulation model is compared with the real
system’s behaviour. If errors are revealed, the modelling and programming steps have to
be re-started to ensure the required level of accuracy.
All sub-processes of the development process, beginning with the perception of deficien-
cies, require the analysis of process data. The data characterizes the system’s performance
and behaviour. This is necessary to parametrize the simulation model and build a credible
basis for comparing the model’s behaviour and reality. A common database maintained
by both modellers and managers allows linking necessary and available information. A
lack of data may lead to uncertainty, more complexity, and a loss in accuracy. Uncertainty
is reflected e.g. by stochastic variables, which in turn may lead to a larger number
of simulation experiments and more volatile performance measures.85 On the other hand,
the information collected in the project database needs to be relevant for the project.
Otherwise, a flood of data to be evaluated and analysed hinders an efficient development
process as it is time and resource consuming.
The analysis of available and relevant data typically requires a set of statistical methods.
They are applied to extract deeper knowledge about the processes to be modelled. The
set of methods is vast and the choice of methods depends on the needs of the specific
project. Time series methodology is one prominent branch of methods.
The following example describes the conceptual model of an artificial chemical SC
analysed in this section.

Example 9 (Conceptual model of a chemical supply chain). To illustrate the proposed
framework, this example describes an exemplary chemical SC whose core consists
of two chemical production sites. The example is inspired by the production
network of Dow Chemical in Europe. The underlying real-world processes are not described
in detail here.86 Instead, the conceptual model resulting from the analysis of the real-world
system is presented. All assumptions and models used are described in the following.
Figure 4.8 visualizes the production network.
Both production sites produce basic and intermediate chemicals based on Naphtha. How-
ever, both sites focus on different final products. They also vary in their plants’ capacities.
The Naphtha is steam-cracked at both sites with similar plant configurations. Hence, the
ratios of output flows are similar but the magnitudes of the flow rates are not.
Grey-coloured chemicals and associated arrows indicate auxiliary chemicals which are
not further processed at either site. As common in steam-cracking, the fraction of C1
hydrocarbons produced as a by-product is immediately used for heating the cracker fur-
naces. Similarly, the fraction of long-chained hydrocarbons (C9+ ) is used as fuel for steam
production at both sites.87 Hydrogen (H2 ) is produced and consumed by various plants.

85
See Zimmermann (2000).
86
This is partly due to privacy protection reasons and partly because the proposed example does not
match the real SC entirely.
87
Additionally, natural gas is admixed if the local fuel production is insufficient. In general, cracking is
[Network diagram of site 1 and site 2 connected by rail transports; legend: supplier, storage, customer, production plant]

Figure 4.8: Exemplary chemical supply chain with two sites

Imbalances are compensated either by heating with H2 or by external supply from producers
of industrial gases near both sites.88
Steam crackers produce a fraction of C4 hydrocarbons which is a mixture of differently
structured chemicals all containing four carbon atoms.89 In this mixture, 1,3-Butadiene
is the largest sub-fraction with the highest applicability in downstream processes. Therefore,
this sub-fraction is extracted by rectification.90 The remaining sub-fractions of C4
hydrocarbons (so-called raffinate) are sold to the market ex factory.
At both sites imbalances of the processed intermediate chemicals occur when all plants
produce at full capacity. Unbalanced intermediate chemicals are indicated by a coloured
storage symbol when inter-site transports are possible. Correspondingly coloured arrows
between both sites indicate the direction of the product flow. Table 4.4 quantifies these
imbalances and provides an overview on the chemicals’ associated storage capacities.
A net demand (indicated by negative values in the balance columns in Table 4.4) of
intermediate chemicals has to be satisfied by product flows between both sites or from

a highly endothermic reaction which in turn leads to a net demand of natural gas.
88
Note that this is a typical situation since most chemical production sites have demand for industrial
gases not only for chemical reactions but also e.g. for cooling (N2 ).
89
See Sun and Wristers (2006).
90
Since all C4 -sub-fractions possess similar boiling points, extractive distillation has to be applied, i.e. a
solvent chemical is used to drive the boiling points apart. See Sun and Wristers (2006).
                                          balances          storage capacities
chemical                              site 1    site 2      site 1     site 2
Naphtha                               -3,000    -2,000      30,000     25,000
Ethylene (C2)                           -200       200      20,000     20,000
Propylene (C3)                           250      -300      10,000      8,000
Crude C4 hydrocarbons (C4)               300      -400       4,000      3,500
Pygas (C5+)                                0         0      15,000     10,000
Pyoil (C9+)                               60        40       2,500      2,000
1,3-Butadiene (Butadiene)                  –         0           –      2,500
C4 hydrocarbons (Raffinate)                –       360           –      5,000
Benzene                                    0         0      25,000     20,000
Styrene                                 -400       400       7,500     10,000
Cumene                                   585         –      25,000          –
Polyethylene (PE)                      1,200       380      55,000     10,000
Polypropylene (PP)                         –       600           –     12,500
Polystyrene (PS)                         600         –      17,500          –
Styrene-Butadiene rubber (SBR)             –       320           –     18,000

Table 4.4: Material balances (in t/day) and storage capacities (in t) per site and chemical

external sources. A surplus has to be sold to customers or distributed internally. A
precise overview of the product flow rates per site and plant is displayed in Table A.5 in
the appendix. In addition, the coefficients of the time series models generating the product
flows in detail are provided in Tables A.6 - A.11 in the appendix.
Sources of disturbances considered in this example are categorized into three classes. First,
the production plants are stochastic transformers, i.e. the transformation processes are
modelled by stationary time series models with normally distributed errors. The plants’
states are modelled by Markov models as introduced before. The corresponding transition
matrices are provided in the appendix in Table A.15 and Table A.16. Additionally, normally
distributed errors $\varepsilon \sim N(0, 10/\omega)$ are added to simulate the inflow rates, where $\omega$
is the current state of the plant.91
From Table 4.4 it can be seen that inter-site transports of Ethylene, Propylene, C4 hydrocarbons,
and Styrene suffice to buffer local imbalances and increase plant utilizations.92
Therefore, rail transports are considered between both sites. For each product a centrally
managed fleet of RTCs is available for transports. Each RTC provides a transport capacity
of 50 tons.
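To give a feeling for the resulting transport volumes, the following sketch derives the number of RTC loads needed to shift the reciprocal imbalances listed in Table 4.4. It is a rough illustration only: the quantities refer to one time unit of the balance table, the helper name `intersite_shipment` is made up, and the rule of shipping the smaller of surplus and deficit is an assumption rather than the planning logic of this work.

```python
import math

RTC_CAPACITY_T = 50  # transport capacity per RTC in tons

# (site 1 balance, site 2 balance) as listed in Table 4.4
balances = {
    "Ethylene": (-200, 200),
    "Propylene": (250, -300),
    "Crude C4": (300, -400),
    "Styrene": (-400, 400),
}

def intersite_shipment(balance_1, balance_2):
    """Quantity that can be shifted from the surplus site to the short site."""
    if balance_1 * balance_2 >= 0:
        return 0  # no reciprocal imbalance, nothing to exchange
    return min(abs(balance_1), abs(balance_2))

rtc_loads = {
    chemical: math.ceil(intersite_shipment(*pair) / RTC_CAPACITY_T)
    for chemical, pair in balances.items()
}
print(rtc_loads)
```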
For inter-site transports a regular train shuttle is organized. A train can be chartered
departing in accordance with a predefined schedule. The train is operated by a rail ser-
vice provider. If a train is chartered, the dedicated RTCs are gathered at the departing
site’s shunting yard where they are fetched by the rail service provider at the scheduled

91
In the case of plants with multiple inputs, the variation of each input stream is calculated based on
its current state.
92
This is because reciprocal imbalances exist at both sites for all these chemicals. E.g. Ethylene is short
at site 1 but surplus at site 2. Hence, transports of Ethylene from site 2 to site 1 are reasonable.

time. After an expected travelling time of approximately 24 hours the train arrives at the
arrival site’s shunting yard where the attached RTCs are decoupled and further processed
(i.e. loaded or unloaded). The train’s travelling time is a normally distributed random
variable with N(24, 1). Loaded RTCs become available for transport the day after un-
loading. During this time, turnover processes as well as security checks and shunting
operations take place.
Unloading and loading times (in hours) are normally distributed random variables with
N(2, 0.25). The total turnover capacity depends on the number of transfer arms available
and the number of operating hours. Turnover operations take place 10 hours per day from
Monday to Saturday. Due to strikes, technical failures, etc., operating hours may be reduced
to 5 or 0 hours. A discrete Markov model defined by the following transition matrix is used
to model these disturbances:

\[
Q_{\text{turn-over}} =
\begin{array}{c|ccc}
   & 10   & 5     & 0     \\ \hline
10 & 0.99 & 0.005 & 0.005 \\
5  & 0.15 & 0.80  & 0.05  \\
0  & 0.20 & 0.05  & 0.75
\end{array}
\]

Since both sites are geographically separated and do not share a common workforce, the
Markov model is applied for both sites independently.
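The long-run behaviour implied by this transition matrix can be explored with a short sketch. The function names, the number of iterations, and the seeds are assumptions for illustration; the stationary distribution is approximated by repeatedly multiplying a start vector with the matrix, and each site receives its own random number stream to reflect the independence assumption.

```python
import random

HOURS = (10, 5, 0)
Q_TURNOVER = [
    [0.99, 0.005, 0.005],
    [0.15, 0.80, 0.05],
    [0.20, 0.05, 0.75],
]

def stationary(matrix, iterations=1000):
    """Approximate the stationary distribution by iterating pi <- pi * Q."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi

def simulate_hours(matrix, days, rng, start=0):
    """Sample a path of daily operating hours for one site."""
    state, path = start, []
    for _ in range(days):
        path.append(HOURS[state])
        state = rng.choices(range(len(matrix)), weights=matrix[state])[0]
    return path

pi = stationary(Q_TURNOVER)
expected_hours = sum(p * h for p, h in zip(pi, HOURS))
print([round(p, 4) for p in pi], round(expected_hours, 2))

# Independent paths for both sites (seeds chosen arbitrarily)
site_paths = [simulate_hours(Q_TURNOVER, 30, random.Random(seed)) for seed in (1, 2)]
```

Under these probabilities, full operating hours prevail on roughly 94.5% of operating days in the long run, so the expected number of operating hours per day is about 9.6.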
The number of transfer arms and corresponding turnover capacities can be found in
Table A.12 in the appendix.93
To dynamically generate plans for train departures and RTC flows on the operational
level, the MC-RTP is used. Time is discretized into periods of 24 hours.
The target stocks for the corresponding chemicals are calculated such that the maximal
demand of a chemical can be supplied for approximately 10 days from stock. The target
and initial stock levels as well as the number of initially available empty RTCs can be
found in Table A.12. Cost rates for the MC-RTP instances are shown in Table A.13 in
the appendix. Note that a cost rate of $c^{\text{Add}} = 1$ is assumed for additional storage in RTCs.94
For Styrene, C4 hydrocarbons, Naphtha, and the final chemicals the network is imbal-
anced which requires exports and imports to and from external partners. Naphtha is sup-
plied via pipeline which connects the sites with the exclusive supplier (a refinery nearby).
The supply is organized in batches injected by the supplier when an order is placed by a
production site.
Batch injection and transport are associated with costs for pipeline operation (denoted by
$c^{\text{batch}}$). Stock holding at the sites is associated with a stock holding cost rate $c^{\text{hold}}$. With a
fixed pipeline transport capacity $\rho$ of 7,500 tons per day (which exceeds both consumption
rates $\omega_i$), optimal batch sizes $q_i^{\text{opt}}$ and order intervals $t_i^{\text{opt}}$ can be determined by solving the
93
The turnover capacities are symmetric, i.e. loading capacity at the sending site equals unloading ca-
pacity at the receiving site. In an asymmetric setting both capacities do not match.
94
This allows a virtually cost-neutral storage extension.

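economic order quantity model with finite production rate:95

\[
q_i^{\text{opt}} = \sqrt{\frac{2 \cdot c^{\text{batch}} \cdot \omega_i}{c^{\text{hold}} \cdot \left(1 - \frac{\omega_i}{\rho}\right)}} \quad \text{and} \tag{4.2}
\]
\[
t_i^{\text{opt}} = \sqrt{\frac{2 \cdot c^{\text{batch}}}{c^{\text{hold}} \cdot \omega_i \cdot \left(1 - \frac{\omega_i}{\rho}\right)}}. \tag{4.3}
\]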

Calculating the optimal order quantities $q_i^{\text{opt}}$ and corresponding order intervals $t_i^{\text{opt}}$ requires
information about holding and batch injection costs, $c^{\text{hold}}$ and $c^{\text{batch}}$.
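Formulas (4.2) and (4.3) can be evaluated directly. The sketch below uses purely hypothetical inputs: a batch cost of 1,000 (and, for comparison, 10) monetary units per injection, a holding cost rate of 35/365 per ton and day, a consumption rate of 3,000 tons per day, and the pipeline capacity of 7,500 tons per day. None of the batch cost values are taken from the source.

```python
from math import sqrt

def eoq_finite_rate(c_batch, c_hold, omega, rho):
    """Batch size (4.2) and order interval (4.3) for a finite supply rate rho."""
    if not 0 < omega < rho:
        raise ValueError("consumption rate must be positive and below rho")
    utilisation_gap = 1 - omega / rho
    q_opt = sqrt(2 * c_batch * omega / (c_hold * utilisation_gap))
    t_opt = sqrt(2 * c_batch / (c_hold * omega * utilisation_gap))
    return q_opt, t_opt

# Hypothetical inputs; all values are assumptions for illustration
c_hold_per_day = 35 / 365
for c_batch in (1000.0, 10.0):
    q_opt, t_opt = eoq_finite_rate(c_batch, c_hold_per_day, omega=3000.0, rho=7500.0)
    print(c_batch, round(q_opt, 1), round(t_opt, 2))
```

The order interval always equals the batch size divided by the consumption rate, and both shrink as the fixed batch cost decreases, i.e. small fixed costs favour frequent, small injections.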
Holding costs chold typically consist of the costs for maintaining storage facilities and
committed capital. Maintenance costs of storage facilities mainly encompass labour costs
for the maintenance staff and the facilities’ depreciations. Both are independent from
the stock levels and, hence, do not affect the stock level decision. In contrast, costs for
committed capital are typically assumed to grow linearly with the stock level as they reflect
the opportunity costs for investments which cannot be realized. With a market price for
Naphtha of about 700 € per ton and a return on assets of about 5% in basic chemical
industry, the storage cost rate can be approximated by $c^{\text{hold}} = 35$ € per ton and year.96
Batch injection costs are comparatively hard to assess as they depend on the techni-
cal specification of the pipeline. Pipeline operations primarily induce costs for energy
consumption. A pipeline transport generates energy costs which are proportional to the
transport volume. For modern basic chemical pipelines an energy consumption of about
42 kWh per ton and kilometre is estimated.97 Depending on the energy supply contract,
the costs of a batch transport can be calculated. However, with a contract charging energy
costs proportional to the energy consumption, batch injection and transport costs grow
linearly with increasing batch size. Hence, the fixed cost part for pipeline operations $c^{\text{batch}}$
only consists of operational costs for preparing pipeline operation, which are comparatively
small. It is therefore concluded that $q_i^{\text{opt}} \to \omega_i$, which implies a continuous operation of the
pipeline at the level of the consumption rate. This result is reasonable as long as no
considerable fixed costs for providing a Naphtha batch occur.98 In this situation, both sites are
supplied continuously, i.e. the pipelines’ feed rate is 5,000 tons a day with both crackers
operating at full capacity. The flow rate is reduced accordingly when one or both of the
95
Note that this implies that both pipelines can be operated independently, see e.g. Günther and Tem-
pelmeier (2004).
96
Prices for Naphtha are partially transparent on the spot markets and can be found e.g. in Kellermann
(2012). However, most quantities are traded based on long-term contracts. Hence, the spot mar-
ket prices are potentially slightly over-estimated. The return on assets is a key figure reported in
companies’ economic reports, see e.g. BASF (2012) for one specific example or Fortune (2006) for an
industry-wide figure.
97
See van Essen et al. (2003).
98
Note that typically refineries are operating continuously where Naphtha results as a direct by-product.
Hence, production-specific set-up costs due to a variation of the production process are not to be
expected.

crackers are shut down. The safety stocks to be held at both sites are defined such that
the crackers can be supplied from stock for 5 days, i.e. safety stock levels are set to 15,000
and 10,000 tons for site 1 and 2, respectively. Details on the supply system will be studied
in example 12.
Final chemicals (such as Polystyrene, Polyethylene, etc.) are sold to customers ex
factory (i.e. customer transports are not considered). Styrene and C4 hydrocarbons are
transported via RTCs. To plan these transports the MC-RTP instance is extended by an
artificial sink and an artificial source representing customers and suppliers of Styrene and
C4 hydrocarbons, respectively. These fictional nodes are always able to absorb or provide
a required quantity of chemicals. They represent the spot market for both products where
orders are placed and a certain company accepts (at a certain price). Transports from/to
this source/sink represent transports to ex ante unknown partners and, hence, transport
times are stochastic. Here, a normally distributed transport time is assumed with N(5, 1)
(given in days).
Logistical disturbances are incorporated reflecting RTC defects and variations in the
customers’ pick-up quantities and frequencies (see Table A.14 in the appendix). RTC
defects occur with probability p = 0.025 per day and RTC, i.e. the number of defective RTCs
per day is binomially distributed with $sr_{its}^{fail} \sim \mathrm{Bin}(sr_{its}, p)$ where $sr_{its}$ is the number of
RTCs of chemical s checked at site i in period t.99 The time for repair is given in days
and normally distributed with N(7, 1).
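The RTC defect model can be illustrated by a short sketch. The following Python fragment is not the implementation used in this work (which is written in R); the function name `simulate_rtc_defects`, the rounding of repair times to whole days, and the lower bound of one day are assumptions made only for this illustration.

```python
import random

def simulate_rtc_defects(n_checked, p_defect=0.025,
                         repair_mean=7.0, repair_sd=1.0, rng=None):
    """Draw the number of defective RTCs among n_checked and a repair time for each."""
    rng = rng or random.Random()
    # Binomial draw, written as a sum of independent Bernoulli trials (one per RTC).
    n_defect = sum(rng.random() < p_defect for _ in range(n_checked))
    # Normally distributed repair times in days; rounding and the floor of
    # one day are assumptions of this sketch, not part of the source model.
    repair_days = [max(1, round(rng.gauss(repair_mean, repair_sd)))
                   for _ in range(n_defect)]
    return n_defect, repair_days
```

With 200 inspected RTCs, the expected number of defects per day is 200 · 0.025 = 5.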
Auxiliary processes may fail, e.g. the steam or electric power supply breaks down, which
causes plant breakdowns or a site-wide breakdown. For simplicity, only two events are
modelled here. The first is a breakdown of the steam network which forces all plants with energy-
consuming reactions to shut down. This set includes the cracker, Butadiene extraction,
Benzene production (hydrogenation plant) as well as Cumene and Styrene production.100
To model such an event, a Markov approach is used with the following transition matrix
where state ω = 1 indicates normal operations and ω = 0 indicates a breakdown:

$$
Q^{steam} =
\begin{array}{c|cc}
  & 1 & 0 \\ \hline
1 & 0.995 & 0.005 \\
0 & 0.35 & 0.65
\end{array}
$$

The second auxiliary process affecting the production network is a breakdown of the hy-
drogen network which immediately affects all plants consuming or producing hydrogen.
This includes the cracker, Benzene production, and Styrene production. The following

99
Note that any RTC is inspected before it leaves a site or is loaded. Hence, this encompasses all RTCs
waiting for further processing ($sr_{its}^{E}$).
100
The alkylation reaction taking place in Cumene and Styrene production is exothermic (see Vora et al.
(2006)) which induces a production of steam. The same holds for the cracker where the quenching
process produces steam. Hence, these plants are connected to the site’s steam network and affected
when steam distribution cannot be managed.

transition matrix models these events:

$$
Q^{hydro} =
\begin{array}{c|cc}
  & 1 & 0 \\ \hline
1 & 0.99 & 0.01 \\
0 & 0.50 & 0.50
\end{array}
$$

Note that both auxiliary processes are assumed to be highly reliable and failures can be
bypassed or repaired in comparatively short time.
In the next step, all components and (statistical) models described in this example are
implemented in an R program. The implementation is described in the next example.
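As an illustration of the two Markov models in this example, the following Python sketch generates a daily state sequence and computes the long-run availability of a two-state chain. It reads the rows of the transition matrices as the current state and the columns as the next state (the rows sum to one). The names `P_UP`, `state_sequence`, and `stationary_availability` as well as the start in state 1 are assumptions of this sketch.

```python
import random

# Probability of being in state 1 (normal operation) in the next period,
# given the current state; values are taken from the two transition matrices.
P_UP = {
    "steam": {1: 0.995, 0: 0.35},
    "hydro": {1: 0.99, 0: 0.50},
}

def state_sequence(p_up, horizon, start=1, rng=None):
    """Generate a 0/1 state sequence of length `horizon` for a two-state chain."""
    rng = rng or random.Random()
    states, state = [], start
    for _ in range(horizon):
        states.append(state)
        state = 1 if rng.random() < p_up[state] else 0
    return states

def stationary_availability(p_up):
    """Long-run share of periods spent in state 1."""
    fail = 1.0 - p_up[1]   # P(1 -> 0)
    recover = p_up[0]      # P(0 -> 1)
    return recover / (fail + recover)
```

Under these matrices the long-run availability is 0.35/0.355 ≈ 98.6% for the steam network and 0.50/0.51 ≈ 98.0% for the hydrogen network, which is in line with the assumption of highly reliable auxiliary processes.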

4.3.2 Components of chemical supply chain simulation models


The components necessary to model a chemical SC depend on the type of simulation
model chosen. Basically, the simulation of SCs focuses on material flows among a set of
processors, although flows of money or information may also be relevant.101 Processors
transform and manage the flow of materials depending on specific rules and stimuli from
other model components.
In agent-based simulation models, processors are agents which are interrelated and
directly interact depending on their internal decision rules, stimuli from related agents,
and environmental stimuli. Besides material flows, flows of information are often used
to model the transport of stimuli between agents. Incorporating asymmetric information
is a general advantage of this simulation technique such that this model type is particularly
applicable for systems with a decentralized management.
In system dynamics simulation, processors reflect stocks. Flows between stocks are
continuous in time. Processors are controlled by setting flow rates which are managed
by control cycles as exemplarily depicted in Figure 2.9. I.e. flow rates are adjusted
w.r.t. variations of the realized flow rates from their corresponding target levels.
Discrete-event simulation is probably the most flexible simulation technique. Here, flows are
modelled discretely, i.e. flows are controlled at a discrete set of time indices.
Processors can represent various system elements depending on the type of flows con-
trolled and the processors’ internal decision rules. Processors of material flows represent
e.g. production facilities where flows are transformed and (possibly) delayed to reflect
the time consumption for production. Moreover, processors may also reflect transport or
storage facilities where flows are controlled depending on the available capacities.
Typically, the management of flows is implicitly determined by local rules of the pro-
cessors. To improve the system’s performance, however, it is sometimes advisable to
coordinate these local decisions, i.e. the processors are centrally managed. This implies
that the simulation model requires an additional processor reflecting central information

101
See Kleijnen (2005).

processing. Central control of processors is enforced by decision models (e.g. formulated
as optimization programs) such that a hybrid simulation model is formed.
For modelling chemical SCs using discrete-event simulation, the following types of pro-
cessors are included to reflect material flows:

• production facilities,

• logistical facilities for


– storage (tanks, silos, etc.),
– turnover (transfer stations, quays, etc.),
– transport (rail yards, pipelines, ships, etc.),

• auxiliary processes for


– energy supply (electric power, steam, etc.),
– auxiliary product supply (water, industrial gases, etc.),
– workforce supply.

Production and logistical facilities are represented as processors transforming material
flows according to local restrictions. Processors can be characterized according to general
attributes about the handling of in- and outgoing material flows. Typical categories are
the vergence and volume of flows as well as the handling of time. Table 4.5 displays
examples of production and logistical processors w.r.t. these categories.

                          production              logistical processors
                          processors              storage              turnover                 transport

flow        split         separation              multiple consumers   simultaneous loading     multi-sink pipelines
vergence    merge         synthesis               multiple sources     simultaneous unloading   multi-source pipelines
            route         transformation          one-to-one           one-at-a-time            ship/rail/truck

time        delay         batch production        ✓                    ✓                        ship/rail
            non-delay     continuous production   —                    —                        pipeline

flow        conserving    regular                 regular              regular                  regular
quantity    reducing      off-spec product        decay                rest in RTC              product interface

Table 4.5: Examples for processors according to processor type and processing attributes

Processors may merge, split, or route ingoing flows. For production processors this
corresponds to the general classification of chemical production processes (see Table 2.1).
For logistical processors these attributes correspond to the technical details of the mod-
elled equipment. Splitting storages can be used to supply multiple consuming processors
at the same time. Similarly, splitting turnover processors can handle multiple carriers at
the same time or may feed multiple storages at the same time. Transport processors that
are able to split flows simultaneously are e.g. one-to-many pipelines. Analogous examples
are given for merging and routing processors.
Depending on the real system’s specifications, a crucial attribute of processors is their
handling of time deciding whether ingoing flows are delayed or not. An example of
delaying production processors are batch processes where typically some time elapses until
the chemical transformation process is finished. Storage and turnover processors are also
delaying whereby in the former case the delay is an intended feature and flexible. In the
latter case, the delay occurs due to the processing times for turnover preparation (e.g. for
RTC shunting and connection to the transfer system). The time delay for transport
processors accounts for the transport time which is typical for e.g. rail and ship transports.
In contrast, pipeline transport can be non-delayed if the pipeline is completely filled once
and continuously operated under constant conditions.
Another characteristic is whether a processor conserves the inflowing quantity or not.
If unusable or worthless materials are produced during a transformation process, these
can be dropped from the simulation model for the sake of simplicity such that a total loss
of material is modelled. Examples of such losses are off-spec materials102 produced after
a mode changeover in a multi-purpose plant and interfaces in multi-product pipeline
operation.
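The processor attributes discussed above (vergence, handling of time, conservation of quantity) can be made concrete with a small sketch. The following Python fragment is a hypothetical illustration, not part of the model described in this work; the class `Processor`, its fields `delay` and `yield_rate`, and the helpers `split` and `merge` are names assumed for this purpose.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Processor:
    """Hypothetical discrete-time processor that delays and possibly reduces a flow."""
    delay: int = 0            # periods between inflow and outflow (0 = non-delaying)
    yield_rate: float = 1.0   # 1.0 = flow-conserving, below 1.0 = quantity-reducing
    _pipeline: deque = field(default_factory=deque, init=False)

    def __post_init__(self):
        # Pre-fill the buffer so that the first outputs appear after `delay` periods.
        self._pipeline.extend([0.0] * self.delay)

    def step(self, inflow):
        """Feed one period's inflow and return the outflow of the same period."""
        self._pipeline.append(inflow * self.yield_rate)
        return self._pipeline.popleft()

def split(flow, shares):
    """Splitting vergence: distribute one flow over several sinks."""
    return [flow * s for s in shares]

def merge(flows):
    """Merging vergence: combine several inflows into one flow."""
    return sum(flows)
```

A batch process with two periods of processing time and 10% off-spec loss would correspond to `Processor(delay=2, yield_rate=0.9)`, while a filled, continuously operated pipeline corresponds to the default `Processor()`.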
The kind of modelling of auxiliary processes depends on their structure and importance
for the entire system. Often auxiliary processes are modelled on an aggregated level as
(global) environmental variables affecting the attributes (such as capacities) of processors.
Such processes are often not modelled explicitly by means of processors or flows but rather
as (environmental) events which are ruled by stochastic processes.
Similar to SC planning problems, the elements of simulation models (i.e. flows, pro-
cessors, and environmental variables) can be structured according to the level of detail
involved. Based on the basic structure of a chemical SC as depicted in Figure 4.10, three
aggregation levels are distinguished:

• plant level

• intra-site level

• inter-site level.

Table 4.6 shows a brief overview on the related model components and aggregation levels.
At the plant level one or more production processors form a material transformation
process. Chemical production plants can be modelled as processors with technologically
102
See page 141 for the definition.

              production   logistic                          auxiliary
                           storage   turnover   transport
plant         ✓            —         —          —            (✓)
intra-site    —            ✓         ✓          ✓            ✓
inter-site    —            (✓)       —          (✓)          —

Table 4.6: Relation between aggregation levels and types of processors

fixed rules of operation (e.g. a fixed material flow inside the plant). Here, static models
are used to represent the transformation process. Considered stochastic elements are
e.g. yield rates, general production mode, and processing times. Operating rules at the
plant level are e.g. priority rules for material processing (such as first-come-first-served).
The intra-site level is characterized by a set of multiple plants which are interconnected
by logistical processors (such as pipelines and tanks). Logistical processors enable inter-
mediate storage and a connection to external sources of materials. Examples for local
processor rules are inventory policies for each storage processor. At this level, a coordina-
tion of production processors addresses e.g. production planning and scheduling problems
for multi-product batch processes. In this case, the central control of production proces-
sors tackles an operational planning problem. A similar intra-site coordination problem
occurs in the management of local stocks within a multi-product pipeline system.103 Such
coordination problems can either be solved by applying heuristics locally or by an em-
bedded analytical optimization model.104 Considered stochastic influences in modelling
logistical processors reflect e.g. variations in their capacity due to technical failures or the
processing times of handled objects.
Coordination problems also occur on the inter-site level where multiple integrated pro-
duction sites are interconnected by (material) flows. At the inter-site level, only logistical
processors are added to the model if any at all.105 Here, the main focus is on the man-
agement of inter-site transports by utilizing the logistical facilities located at the sites.
To utilize these facilities, transport carriers have to be modelled, i.e. trucks, rail cars, or
ships. The flow of these entities forms the capacity for inter-site material flows. Hence,
at the inter-site level primarily flows (e.g. of materials, transport carriers, or information)
have to be planned for which operational distribution planning and routing models can
be used. Stochastic influences often affect the transport times but also the quality of
materials which may decay during transport.
If the modelled real system relies on discrete objects, a discrete-event simulation is
the natural choice for simulation. However, in chemical industry production processes
are continuously operated. Hence, material flows are usually continuous with variable

103
See Neiro and Pinto (2004).
104
See Garcia-Flores and Wang (2002), Jung et al. (2004) or Jung et al. (2008) for examples of the latter
case.
105
An example are inter-site pipelines.

flow rates. In contrast, the associated logistical sub-systems (e.g. for rail and ship trans-
ports) rely on discrete entities which requires discrete-event simulation. This implies that
continuous processes have to be simulated for discrete time intervals. Hence, the ideal
simulation model is a continuous time discrete-event model which integrates continuous
chemical process models. If operational planning models are included in the simulation
model, this forces a continuous time formulation of these optimization models. However,
typical operational planning models are complex and continuous-time formulations are
often not adequate to solve such problems efficiently. Often a time-discrete formulation
is preferred to keep the problems solvable for instances of practical size.106 Therefore,
time-discrete, discrete-event formulations are often appropriate for hybrid simulation ap-
proaches.107 Fortunately, continuous chemical production processes can be approximated
by discrete time series models quite accurately as long as the time increment is chosen to
be not too large, see chapter 2.
The following example describes the implementation of the conceptual model described
in example 9.

Example 10 (Implementing an integrated chemical SC simulation model). The conceptual
model of the chemical SC described in example 9 is implemented as a hybrid time-
discrete discrete-event simulation model. It is implemented as an R program. The model
uses two discrete time scales. The basic time scale is given in days whereby each day is
subdivided into 24 hours. The conceptual model’s graphical representation from Figure 4.8
can be extended using the processor notation introduced in Table 4.5. Figure 4.9 shows a
graphical representation of the implemented simulation model for site 1.108
Flows of each production processor are generated in each period depending on the stock
levels of the adjacent stock processors and specific time series models.109 Flows between
production processors use unidirectional single-product pipelines. As non-delayed transport
processors, these pipelines are omitted from the flow chart of Figure 4.9 for the sake of
simplicity. Transport and turnover processors handle RTC and material flows. Both
depend on each other and have to be planned simultaneously.
To generate plans for inter-site rail transport and turnover processes, the MC-RTP model
is used in a rolling horizon environment where the individual MC-RTP instances are solved
by CPLEX 12.3.110 The planning horizon is set to seven days.111 The re-planning interval
is set to three days such that flow decisions are fixed for three days.112 The general
structure of the implemented simulation model is displayed in Figure 4.10. Numbers annotated
to the blocks refer to the lines in the pseudo-code provided in Table 4.7.

106
See e.g. Pinto et al. (2000).
107
See Garcia-Flores and Wang (2002), Jung et al. (2004) or Jung et al. (2008) for examples.
108
Site 2 and information flows are omitted for the sake of brevity.
109
Details are described in example 9 and Table A.6-Table A.11. Flows from sources and sinks (repre-
senting suppliers and customers) are generated as described in example 9 and Table A.14.
110
See IBM (2010).
111
For the configurations used in the simulation study, all MC-RTP instances are solved to optimality in
less than two seconds.
112
To determine the re-planning interval, simulation experiments with varying re-planning intervals are
conducted. In tendency, it holds that the shorter the replanning interval is, the better the system

[Figure 4.9 (flow chart, site 1). Node labels: C2, PE, Benz., Styr., PS, C5+, C3, Cum., C4. Legend: production processor; storage processor; turnover processor; transport processor; storage processor for empty RTCs; storage processor for loaded RTCs; fictional sink/source; product flow; RTC flow.]

Figure 4.9: Flow chart of a part of the exemplary chemical SC using the notation of
Table 4.5
In each iteration (i.e. for each day), it is initially checked whether a new transport
plan is to be determined. If this is the case, initial data is gathered reflecting the current
state of the SC. This comprises the stocks of chemicals and RTCs at the sites (loaded or
empty) as well as the currently flowing chemicals and RTCs. Additionally, the expected
production rates are estimated by linearly extrapolating the last week’s (hourly) production
rates. The capacities of the storage facilities as well as expected values for turnover and
transport times are constants and, hence, fixed parameters for each MC-RTP instance.
The turnover and train capacities follow regular train and workforce schedules which are
assumed to be deterministic and, hence, known for all instances. The cost rates ruling the
MC-RTP are fixed for all instances (see Table A.13).
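The estimation of expected production rates can be sketched as follows. The text only states that last week's hourly rates are extrapolated linearly; fitting a least-squares line through the 168 hourly observations and flooring forecasts at zero are assumptions of this Python illustration, as is the function name `extrapolate_linear`.

```python
def extrapolate_linear(history, steps_ahead):
    """Fit y = a + b*t by ordinary least squares and extend the line."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Negative production rates are not meaningful; flooring at zero is
    # an assumption of this sketch.
    return [max(0.0, intercept + slope * t) for t in range(n, n + steps_ahead)]
```

For a planning horizon of seven days, `steps_ahead` would be 7 · 24 = 168 hourly values, which can then be aggregated to daily rates for the MC-RTP instance.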
After solving an MC-RTP instance, the resulting optimal flows of materials and RTCs
are passed to the simulation model. Whether these flows can be realized depends on the
availability of resources which are affected by stochastic elements. Hence, scheduled flows
can be realized only if sufficient

• RTCs,

• turnover capacities,

• material stocks or storage capacities

are available. Breakdowns of turnover capacities are ruled by a Markov approach and
immediately affect all flows. A loss of empty RTCs due to unexpected maintenance activities
can affect scheduled flows if the remaining number of empty RTCs is insufficient to
transport the scheduled flow. Moreover, dispatched trains may arrive later than estimated
such that scheduled turnover processes are delayed. A breakdown of production plants may
induce e.g. stock-outs such that planned chemical flows cannot be realized.

112 (continued)
performance and the longer the computation time is. An interval length of three days yields a
reasonable compromise between run time and solution quality.

[Figure 4.10 (flow diagram; numbers in parentheses refer to the lines of the pseudo-code in Table 4.7): set-up matrices, initialize indicators (2-10); t = 1; is t a re-planning period? (12); if yes: generate MC-RTP instance (13), solve instance (14), update planned RTC flows (15-16); generate random events (19-28); realize material + RTC flows (29-32); end of horizon? if no: t = t + 1; if yes: evaluate performance.]

Figure 4.10: Aggregated flow diagram for chemical SC simulation model
In any case, when a scheduled flow cannot be realized as intended, it is postponed to the
next period. Hence, stochastic influences typically lead to a postponement of transport
activities. In the next re-planning period all flow schedules are reset to the new optimal
solution such that formerly postponed flows are integrated in the new schedule. Table 4.7
shows the detailed pseudo-code of the implemented simulation model.
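The interplay of rolling-horizon planning and postponement can be illustrated with a strongly simplified sketch. In the following Python fragment a single aggregated flow is planned for seven days, re-planned every three days, and any quantity that cannot be realized is postponed to the next day. The function `naive_plan` is only a hypothetical stand-in for solving an MC-RTP instance, and `capacity_fn` stands in for all stochastic resource restrictions; neither reflects the actual model.

```python
def naive_plan(day, backlog, plan_length, daily_demand=100.0):
    """Hypothetical planner: constant daily flow plus the postponed quantity."""
    plan = [daily_demand] * plan_length
    plan[0] += backlog
    return plan

def rolling_horizon(horizon, plan_fn, capacity_fn, plan_length=7, replan_every=3):
    """Simulate day by day; re-plan every `replan_every` days over `plan_length` days."""
    schedule = {}     # day -> quantity scheduled for dispatch
    backlog = 0.0     # quantity postponed from earlier days
    realized = []
    for day in range(1, horizon + 1):
        if (day - 1) % replan_every == 0:
            # The new plan replaces the old schedule; postponed flows are
            # handed to the planner and thereby integrated in the new plan.
            plan = plan_fn(day, backlog, plan_length)
            schedule = {day + k: q for k, q in enumerate(plan)}
            backlog = 0.0
        wanted = schedule.get(day, 0.0) + backlog
        shipped = min(wanted, capacity_fn(day))
        backlog = wanted - shipped   # unrealized part is postponed to the next day
        realized.append(shipped)
    return realized, backlog
```

If, for example, the turnover capacity is zero on day 2 and otherwise ample, the flow scheduled for day 2 is shipped on day 3 in addition to that day's own flow. Only the first three days of each seven-day plan are executed before the next plan replaces it.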
Configuring the SC can be done by a set of parameters of the simulation model reflecting
the adaptation of capacities (e.g. RTC fleet sizes, inventory, or turnover capacities) and
of operational rules (e.g. target stocks and cost rates for the MC-RTP). Note that the cost
rates ruling the MC-RTP instances reflect the operational costs for transports, turnover,

 1  Function Sim.Chem.SC(time horizon T, production models, initial stocks $s^{ini}$,
        initial rail cars $sr^{ini}$, cost rates, inventory capacities & target stocks,
        turnover capacities, train & handling capacities)
 2    generate result & auxiliary matrices ;
 3    initialize inventory & RTC stock matrices ;
 4    for all sites i do
 5      generate state sequence vectors of length T for hydrogen ($\omega_{it}^{hydro}$), steam
            ($\omega_{it}^{steam}$) and transfer system ($\omega_{it}^{turn}$) ;
 6      for all plants j do
 7        generate state sequence vector $\omega_{it}^{j}$ ;
 8        assign plant states w.r.t. $\omega_{it}^{j}$, $\omega_{it}^{hydro}$ and $\omega_{it}^{steam}$ ;
 9      end
10    end
11    for t = 1 → T do
12      if t is re-planning period then
13        generate MC-RTP instance for next 7 days ;
14        solve MC-RTP instance for next 7 days ;
15        generate transport times for scheduled trains ;
16        update flow schedules according to new solution ;
17      end
18      for all sites i do
19        for all RTC fleets do
20          determine loss of empty RTCs for maintenance ;
21          assign maintenance durations for each RTC under repair ;
22          receive repaired RTCs ;
23          update empty RTC stock ;
24        end
25        for τ = 1 → 24 do
26          for all plants j do
27            generate hourly product flows depending on $\omega_{it}^{j}$ (w.r.t.
                available stocks & inventory capacities) ;
28          end
29          for all products p do
30            realize scheduled transport & transfer flows (w.r.t. available
                RTCs & turnover capacities) ;
31            update stocks $s_{it}$ ;
32          end
33        end
34      end
35    end
36  end

Table 4.7: Pseudo-code for chemical SC simulation model



and stock holding. They can be seen as policy parameters as they control the level of
preference for stock holding or transportation. In particular, the setting of inventory cost
rates for over- and undershooting determines the smoothness of stocks and can be seen as
an alternative for the variation of target stock levels.
To evaluate the performance of a simulation run, the flow and stock matrices are re-
turned by the model. These matrices can be used to calculate a desired number of perfor-
mance measures (e.g. the number of dispatched trains, average stock levels, etc.). These
depend on the purpose of the study. It has to be noted that each simulation run is a stochas-
tic experiment. Hence, multiple replications are necessary to evaluate the performance of
a certain SC configuration properly.
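The need for multiple replications can be illustrated by a small sketch. The following Python fragment runs a simulation function several times with separately seeded random number generators and reports the mean of a performance measure with an approximate 95% confidence interval. The function `toy_run` is a hypothetical stand-in for one run of the simulation model, and the normal quantile 1.96 is an approximation that is too narrow for few replications.

```python
import random
import statistics

def toy_run(rng, days=30):
    """Hypothetical stand-in for one simulation run: number of dispatched trains."""
    return sum(rng.random() < 0.8 for _ in range(days))

def replicate(run_fn, n_replications, base_seed=0):
    """Run `run_fn(rng)` once per replication and summarize the results."""
    if n_replications < 2:
        raise ValueError("at least two replications are required")
    results = [run_fn(random.Random(base_seed + r)) for r in range(n_replications)]
    mean = statistics.mean(results)
    sd = statistics.stdev(results)
    # Normal-approximation interval; a t-quantile would be more appropriate
    # for a small number of replications.
    half_width = 1.96 * sd / n_replications ** 0.5
    return mean, (mean - half_width, mean + half_width)
```

Using fixed seeds makes the replications reproducible, which also helps when two SC configurations are to be compared under the same random disturbances.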

4.3.3 Verification & validation


Verification and validation are instruments to ensure the quality of modelling. Verifica-
tion particularly focuses on the technical aspect ensuring that the programmed model
behaves as intended.113 Validation focuses on the accuracy of the final simulation model
which should reflect the behaviour of the corresponding real-world system sufficiently
precisely.114 The verification and validation process (V&V process) intends to establish
confidence of the managers in the developed simulation model such that the final simula-
tion model can be used for resolving the system’s initial deficiencies efficiently.
Basically, three different groups may supervise the development process: the modellers,
the managers, or independent consultants.115 In any development process the modellers
constantly check the quality of the modelling and programming process by logical checks,
back-tracing, and debugging. However, even a technically perfect implementation of a
mis-specified model cannot compensate for the mis-specification. Additionally, modellers
might not be objective with respect to the accuracy of their "product" and, in principle,
there might be a tendency to conceal their own mistakes.
Therefore, the managers typically have to take a vital part in the V&V process. In
particular, the opinion of the project managers is indispensable for deciding on the applicability
of the final model.116 However, the managers are typically neither directly involved in the
programming process nor experts in the field of programming. Hence, they may lack
the knowledge and expertise for a quick and qualified judgement about the accuracy
of modelling. As a third option, external consultants can be involved in the V&V process
to judge the accuracy of modelling.117 These consultants should be experts in the
field and should not be involved in the development process.
In the V&V process, various techniques are available relying on the comparison of a

113
See Law (2007).
114
See Law (2007).
115
See Sargent (2005).
116
See Sargent (2005).
117
See Balci (1997).

model with a reference system (e.g. the real system or the conceptual model). They can
be categorized according to the type of test and the type of criterion used. Basically,
the model structure can be evaluated by testing the internal logic of the model. I.e. it
is tested whether the relations ruling the model match the assumed relations deduced
from the reference system. On the other hand, the output of a model can be checked for
similarity with the reference system.
The criteria used for V&V can be qualitative or quantitative. Qualitative criteria
involve a subjective assessment on the logic or similarity of the tested model. In contrast,
quantitative criteria measure the similarity or consistency by a numerical comparison
of a reference value from the reference system with a (calculated) test value obtained
from the tested model.118 Table 4.8 shows examples of V&V techniques according to the
categorization criteria discussed.119

type of criterion   type of test
                    logical                    similarity

qualitative         • cause-effect-diagram     • Turing test
                    • animation                • face validation
                    • trace analysis           • path testing

quantitative        • fixed-value test         • predictive validation
                    • extreme value test       • statistical techniques
                    • sensitivity analysis     • trace-driven testing

Table 4.8: Classification of V&V techniques

These techniques are more or less applicable for the different V&V activities. For
verifying the implemented simulation model, software engineering techniques are most
widely used which refer to the techniques displayed in the upper left cell in Table 4.8.
Cause-effect-diagrams visualize relations between events and system outcomes. They
can be compared with commonly accepted assumptions about relations between events
and system outcomes to detect unintended model behaviour. Similarly, animation and
trace analysis often help modellers and managers to check the plausibility of the model’s
behaviour over time. However, not all programming mistakes uncover themselves in an
animation or in the analysed traces. Moreover, rare events are hard to detect since
animation is applied for rather short time frames and trace analysis is applied for a

118
Similarly, both extremes can be labelled with subjective and objective. See Rabe et al. (2008) for a
continuous-scale classification on the degree of subjectivity.
119
For an extensive list of V&V techniques see Balci (1998).

restricted number of elements/simulation runs.120


A more objective way to check the logical structure of a model is to use quantitative
criteria. Fixed-value and extreme-value tests aim at the evaluation of the model’s be-
haviour under specific conditions for which a reference value can be derived ex ante. In
the former case, a fixed setting is created omitting all stochastic influences. Hence, the
deterministic logic of the model can be tested. Similarly, the model can be tested under
extreme conditions.121 Sensitivity analysis carries out a systematic test of the model re-
sponse(s) for a restricted set of configurations to estimate general relations between input
parameters and output variables. These relations can be compared with ex ante known
relations. Note that for V&V purposes typically not the precise estimation of parameter
effects is important, but the general relation (e.g. the sign of an effect or the order among
the variables’ effects).122
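A fixed-value test and an extreme-value test can be sketched for a toy stock balance. In the following Python fragment all names (`simulate_stock`, `fixed_value_test`) and the stock model itself are hypothetical and serve only to show the principle: with all stochastic influences switched off, the model output must match a reference value that can be derived ex ante.

```python
import random

def simulate_stock(days, initial, inflow, outflow, noise_sd=0.0, rng=None):
    """Toy stock balance; noise_sd=0 switches all stochastic influences off."""
    rng = rng or random.Random()
    stock = initial
    for _ in range(days):
        disturbance = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        # Stocks cannot become negative.
        stock = max(0.0, stock + inflow - outflow + disturbance)
    return stock

def fixed_value_test(days=10, initial=500.0, inflow=120.0, outflow=100.0):
    """Compare the deterministic model output with the ex ante reference value."""
    expected = initial + days * (inflow - outflow)
    observed = simulate_stock(days, initial, inflow, outflow, noise_sd=0.0)
    return abs(observed - expected) < 1e-9
```

An extreme-value test in the same spirit would be to set the inflow to zero with an empty initial stock and to check that the stock remains at zero, i.e. `simulate_stock(5, 0.0, 0.0, 100.0)` should return 0.0.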
If data of the real system is available, the developed simulation model can be tested for
similarity with the real system in a quantitative way (bottom-right cell in Table 4.8). For
this purpose, many statistical procedures can be applied depending on the specific object
to be tested. Typically, regression techniques, distribution tests, or time series analysis
methods are used.123 A reliable quantitative approach is to generate a forecast of the near
future by means of the simulation model which is then compared with the real system's
behaviour after the forecast period has expired. This is called predictive validation.124
A mixture of trace analysis and fixed-value test is the trace-driven simulation where a
historical situation is simulated. The model's output is then compared with the historical
records.
If explicit tests for similarity cannot be carried out, rather vague qualitative techniques
should be applied (top-right cell in Table 4.8). The Turing test is carried out by experts
or managers. It evaluates whether the output of a model can be distinguished from real
historical records.125 If not, this is an indicator that the model represents the reality
sufficiently accurately.126 Similarly, face validity is a discussion process where simulation
outputs and/or model components are jointly discussed by the developers, managers,
and/or experts.127
If used solely, each technique has drawbacks and restrictions such that, typically, a set
of V&V techniques is used to verify the simulation model and to check validity. Con-
fidence in the appropriateness of the model is rather created by a well-structured V&V

120
See Sargent (2005) and Rabe et al. (2008) for more information.
121
I.e. conditions which are barely realistic (e.g. when parameters are set to values which are judged as
just tolerable in reality), see Sargent (2000).
122
See Kleijnen and Sargent (2000) and Kleijnen (1995).
123
See e.g. Sargent (2005); Kleijnen (1995); Rabe et al. (2008).
124
See e.g. Rabe et al. (2008).
125
Therefore, real historical records and simulated records are presented and the expert/manager is asked
to identify the simulated record.
126
See Schruben (1980).
127
See Sargent (2000).

process than by a specific technique.128 Once the model development has passed all V&V
steps successfully, the simulation model is assumed to accurately reflect the real system’s
behaviour (w.r.t. the scope of the project). In the next steps, simulation experiments are
planned.

4.3.4 Planning of simulation experiments


By definition, simulation models are descriptive rather than normative. Hence, to reveal
relevant information for decision makers, simulation experiments have to be conducted.
How to plan experiments and how to extract concise information from experimental
results is the subject of the broad field of experimental design. The general aim is to
extract as much information as possible about the simulated system with as little
computational effort as possible. As both aims are contradictory, general assumptions about
the information to be extracted are made and the best way to extract this information
w.r.t. the associated computational effort is sought. This is typically done by assuming
a special type of relation between the variables of the studied system constituting a
mathematical model (e.g. a linear regression model).
The definition and classification of variables depends on the specific questions to be
answered by the experiments. After specification of relevant variables and their type of
relation, an experimental design is set up and simulation runs are performed. Analysing
the simulation results allows the analyst to parametrize the mathematical model. This
results in a meta-model of the simulated system. This simple mathematical model is
finally used to extract quantitative, concise information about the system’s behaviour.
Figure 4.11 refines Figure 4.6 by adding details about the experimental phase.
At the beginning of the planning process, it has to be clarified which questions should
be answered by the study. In most cases, the research questions are qualitative,
unspecific statements rather than precise algebraic formulations. Hence, such statements have
to be operationalized into mathematically manageable terms, i.e. measures have to be
defined that describe the aspects of the system under study that should be investigated.
Such measures are called variables. In the context of experimental studies, variables have
several attributes depending on their purposes.130 Table 4.9 gives an overview of the
most common attributes of variables.
Basically, variables are divided into response variables and explanatory variables
depending on the role defined by the study’s questions and hypotheses. Explanatory

128
See Rabe et al. (2008).
129
Based on Hinkelmann and Kempthorne (2005a).
130
See e.g. Law (2007).

[Figure 4.11: Steps in experimental planning.129 Flowchart with the nodes: questions & hypotheses, simulation model, model building, define variables, classify variables, specify relations, statistical model, experimental design, simulation, perform experiments, optimization, meta-modelling, decision support.]

variables are typically measurable131 while response variables can be unobservable.132 In
this case, they are constructed from observable variables using a predefined model and are
called latent variables.133 To investigate the impacts of explanatory variables on response
variables, the explanatory variables have to be controllable. If they are uncontrollable but
measurable, they are usually handled as nuisance variables. Effects of uncontrollable, latent
variables are typically subsumed in error terms. In simulation experiments, variables are
observable and measurable by definition.134 Nuisance variables are incorporated by stochastic
processes in a simulation model.
From the underlying questions of a study and their operationalization it follows how
variables are defined and which role they can have. When designing and analysing sim-
ulation experiments typically multiple response and explanatory variables exist. When

131
Note that in structural equation models measurable indicator variables are grouped into factors. These
factors are unobservable/latent and influence latent/unobservable response variables. Such models
are typically designed and estimated to confirm a theory about the construction and relations of the
latent factors, see Pearl (2000).
132
I.e. the existence of such a variable is assumed, but it cannot be measured precisely e.g. intelligence or
trustworthiness.
133
See e.g. Borsboom et al. (2003) for an example and definition.
134
In principle, also latent variables might be considered in simulation studies. However, in a simulation
study the subject studied is explicitly modelled in detail and not as a black box. Hence, the intro-
duction of latent variables in a simulation model yields typically no additional insights in the general
relation of its elements.

attribute   scale                  observability   role                        controllability

values      numeric (continuous,   measurable      explanatory (independent)   controllable
            discrete)
            ordinal                latent          response (dependent)        uncontrollable
            nominal                                nuisance

Table 4.9: Attributes of variables

simulating a business system, response variables are usually the system’s performance
variables. These variables are assumed to contribute to the overall success of
the business system. The contributions of performance variables to the overall success
are often operationalized as costs or revenue rates.

4.3.4.1 Performance measures in (chemical) supply chain models

A prominent measure of overall success is the profit obtained with a given set of resources
in a given time span. Basically, the profit depends on costs, revenues and assets employed,
whereby those depend on system-internal and -external influences. In SC simulation the
aim is to investigate the behaviour of the system and explain the relations between its
components. Here, the focus is on studying internal relations under external disturbances.
Let x ∈ R^m denote the vector of controllable independent variables and y ∈ R^n denote the
vector of dependent response variables defined in the model building phase. Furthermore,
e ∈ R^o denotes the vector of uncontrollable variables which are not considered in detail
in the model but are captured as environmental effects. The function H(⋅) describes the
real relation between dependent and independent as well as environmental variables

y = H(x∣e). (4.4)

Typically, the function H(⋅) is unknown and reflects the behaviour of the real system.
Since the controllable variables x are often strategic decisions, experiments in reality are
often too expensive. Hence, simulation models are developed mimicking the real system’s
behaviour. Formally, the simulation model can be seen as a function Ĥ(⋅) such that

ŷ = Ĥ(x∣e). (4.5)

All these components influence the profit u of the real system, i.e.

u = U (x, y∣e) = U (x, H(x)∣e) (4.6)



where U (⋅) is a function describing the relation between influencing variables and obtained
profit u. Typically, U (⋅) is not known in all details, but at least the direction of the influence
of each variable is known.135 As the vector of controllable variables x reflects the set of
options to take, an optimal configuration x^opt is sought that either maximizes U (⋅)
or, if the focus is purely on internal processes, minimizes the associated total costs, where
C(⋅) denotes the cost function of the system under study. I.e.

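$$x^{opt} = \arg\max_{x} U(x, H(x) \mid e) \quad \text{or} \quad x^{opt} = \arg\min_{x} C(x, H(x) \mid e). \tag{4.7}$$

Since H(⋅) is unknown, it is replaced by the simulation model Ĥ(⋅), which yields the estimate

$$\hat{x}^{opt} = \arg\max_{x} U(x, \hat{H}(x) \mid e) \quad \text{or} \quad \hat{x}^{opt} = \arg\min_{x} C(x, \hat{H}(x) \mid e). \tag{4.8}$$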

A regime of external conditions e has to be defined and integrated in the simulation
model. This is referred to as a scenario. Commonly, a scenario comprises both the
stochastic processes reflecting environmental conditions and the general structure of the
modelled system. Typically, the focus of simulation is on the internal processes of a supply
chain under a more or less specific environmental regime. E.g. in the chemical industry an
SC’s revenues depend mainly on the product prices that can be realized. Due to the highly
competitive market for basic chemicals and the inflexibility of (continuous) production
processes in combination with the immense capital commitment for production plants, the
focus for optimization is on the internal processes of a chemical supply chain.
Evaluating the (total) costs of a configuration for a specific scenario e requires knowledge
about the measures inducing costs (cost drivers) and a formal description of the relation
between costs and measures.136 However, a precise mathematical relation between
(performance) measures and costs is often hard to obtain because interdependencies exist
among cost drivers. Moreover, cost rates are unknown and/or vary in time as they
depend e.g. on discounts, quantities, or interest rates. Therefore, the direct evaluation of
the costs of a configuration is not reasonable as it implies deterministic and time-invariant
assumptions about the underlying cost function C(⋅).
Instead, the direct evaluation of cost drivers is in focus for optimization. Cost drivers
are performance measures driving costs or profit but their immediate contribution to
total costs/profit cannot be expressed explicitly.137 For assessing the SC performance
diverse performance measurement systems are proposed with many of them relying on the
balanced scorecard approach138 in combination with the SCOR model.139 Performance
measures can be categorized according to the dimensions of the balanced scorecard or

135
I.e. the signs of the partial derivatives of U (⋅) w.r.t. x and y are known at least in some ranges.
136
In case of a linear relation, only one parameter (the cost rate) is required. But also more complex
relations are imaginable.
137
See Sürie and Wagner (2008).
138
See Kaplan et al. (1992) for the seminal work on the balanced scorecard and Bhagwat and Sharma
(2007) for performance measurement systems in SCM based on the balanced scorecard approach.
139
See Sürie and Wagner (2008) or Gunasekaran et al. (2004, 2001) and Agami et al. (2012) for an overview
of performance measurement systems.

SCOR model as well as their attributes.140 Table 4.10 shows a classification of performance
measures according to aggregation level and type of measure.

                                 operational             tactical               strategical

financial       cost             costs per labour hour   inventory cost         costs for depreciation &
                                                                                amortization
                revenue/         cash flow               total profit           corporate value
                value-based
                relational       cost-to-income ratio    cost-to-asset ratio    return on investment

non-financial   time             order lead times        cash flow cycle time   product development time
                reliability      capacity utilization    forecasting accuracy   customer service level
                quality          quality of produced     delivery quality       process quality
                                 goods

Table 4.10: Examples and classification of performance measures in SCM

Depending on the scope of the simulation study, performance measures should be chosen
w.r.t. the aggregation level as well as the managerial impact. Ideally, an SC evaluation
system containing an already existing system of performance measures can be adapted
to the simulation environment.141 However, this procedure bears the risk that too many
criteria have to be evaluated which increases the computational effort for evaluating SC
configurations. Moreover, traditional performance measurement systems aim at reflecting
the current situation of an SC whereas simulation aims at finding a future configuration.142
Hence, the choice of performance measures to be evaluated in the simulation model should
account for cost and profit drivers instead of a direct evaluation of financial measures
since they typically imply assumptions about environmental financial drivers such as raw
material prices or interest rates. This information, however, is uncertain and complicates
the system’s/model’s analysis. Furthermore, such measures distract from the underlying
processes to be optimized. In general, a configuration of the SC is pursued that works well
under almost every environmental state. This ability is often labelled as flexibility in SC
performance measurement systems.143 To operationalize this dimension in SC planning,
a robust configuration is sought.144
Robustness is often an important feature, in particular when an objective’s variation
depends on the configuration.145 Robustness can be measured in various ways, e.g. by con-

140
For the latter case see e.g. Beamon (1998).
141
See Kleijnen and Smits (2003).
142
See Kleijnen and Smits (2003).
143
For a discussion on flexibility see Gunasekaran et al. (2001).
144
See Kleijnen and Smits (2003).
145
I.e. a configuration x not only influences the mean level of responses H(x) but also their variance.

sidering the inverse coefficient of variation instead of the pure expected performance146 or
by considering both expectation and variance of a measure as independent objectives.147
Another possibility to invoke robustness is to optimize a certain quantile instead of the
expected performance. This has the advantage that also measures with a skewed distri-
bution can be handled.148
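
The three robustness measures named above can be computed directly from the replicated responses of one configuration. The following is a minimal sketch under stated assumptions: the function name `robustness_summary`, the nearest-rank quantile rule, and the sample data are illustrative choices and do not come from the study.

```python
import math
import statistics


def robustness_summary(samples, quantile=0.1):
    """Summarise replicated responses of a single configuration.

    Returns the mean, the sample variance, the inverse coefficient of
    variation (mean / standard deviation) and an empirical quantile.
    """
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)  # sample standard deviation (n - 1)
    ordered = sorted(samples)
    # Nearest-rank rule: smallest value with at least `quantile` of the
    # observations at or below it (an assumption of this sketch).
    rank = max(1, math.ceil(quantile * len(ordered)))
    return {
        "mean": mean,
        "variance": sd ** 2,
        "inverse_cv": mean / sd if sd > 0 else math.inf,
        "quantile": ordered[rank - 1],
    }


# Hypothetical replications of one response for one configuration.
summary = robustness_summary([2, 4, 4, 4, 5, 5, 7, 9], quantile=0.5)
```

Whether a low or a high quantile is the relevant one depends on the direction of preference of the measure: a low quantile for a service level, a high quantile for a cost-like measure.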
From the deductions above it can be concluded that simulation studies on SC con-
figuration put focus on multiple objectives/performance measures which depend on the
configuration x. The configuration and its performance affect the total cost and/or total
profit. The combination of a configuration and its performance z = (x, y) is called a
constellation. In simulation studies on SC configuration typically a set of efficient
constellations is sought.149 By definition, an efficient constellation is not dominated by any
other constellation. A constellation z = (x, y) is (strictly) dominated by a constellation
z̃ (formally z̃ ≻ z) if z̃ is at least as good in all variables (z̃_j ≿ z_j for each j with
j = 1, ..., m + n) and strictly preferred in at least one variable (∃ j′ such that z̃_j′ ≻ z_j′).150
The set of efficient constellations is denoted by Z^eff.
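
The dominance check can be sketched in a few lines. This is an illustrative sketch, not the procedure used in the study: the preference relation of each component is encoded by a hypothetical `senses` vector (+1 for "the larger the better", -1 for "the smaller the better"), and the example constellations are made up.

```python
def dominates(z_tilde, z, senses):
    """True if z_tilde dominates z.

    z_tilde must be at least as good as z in every component and strictly
    better in at least one. senses[j] is +1 if larger values of component j
    are preferred and -1 if smaller values are preferred.
    """
    diffs = [s * (a - b) for a, b, s in zip(z_tilde, z, senses)]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def efficient_set(constellations, senses):
    """Return all constellations that are not dominated by any other one."""
    return [
        z for z in constellations
        if not any(dominates(other, z, senses) for other in constellations)
    ]


# Hypothetical constellations: (number of trains, service level, stock deviation).
senses = (-1, +1, -1)
candidates = [
    (6, 0.95, 0.50),
    (7, 0.97, 0.45),
    (7, 0.95, 0.50),  # dominated by the first entry: more trains, nothing gained
    (6, 0.90, 0.55),  # dominated by the first entry: worse in both responses
]
front = efficient_set(candidates, senses)
```

The pairwise comparison needs a number of checks that is quadratic in the number of constellations, which is unproblematic for the design sizes discussed here.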
To sum up, there are two cases to be distinguished in simulation optimization: If a
unique objective can be defined and calculated, an optimal configuration is searched for.
In case of multiple objectives, the set of efficient constellations has to be determined.
Both tasks are hindered by the following properties of simulation models:

• Simulation models are computationally expensive.

• Simulation models are stochastic.

• A large sample space151 has to be evaluated.

Also, the selection of x^opt or Z^eff is a challenging task and much research has been devoted
to this topic.

4.3.4.2 Experimental designs

From a historical point of view, experimental designs were developed to provide methods
for planning real-world experiments.152 Here, typically the set of controllable variables
is large but their domains are restricted. Since most variables are considered as discrete
variables, they are often simplified to dichotomous variables or their domain is reduced
146
The so-called Taguchi approach see e.g. Shang et al. (2004).
147
See e.g. Kleijnen (2007, sec. 4.6) for an example.
148
See e.g. Law (2007, sec. 9.4).
149
Sometimes also called Pareto front.
150
The notation a ≻ b denotes that a is better than b, whereas a ≿ b declares that a is at least as good
as b. This notation is chosen to allow flexibility in ordering relation. E.g. if for all components of z
holds "the smaller the better", ≻ could be replaced by <. However, using e.g. service level measures or
output measures as response variables, requires a converse ordering relation. Due to this, the common
order relations are generalized by preference relations.
151
The sample space depends on the number and domains of the control variables.
152
See e.g. the seminal work of Fisher (1971).

to a small number of values. Assume that each control variable x_i with i = 1, ..., m,
such that x = (x_1, ..., x_m), has a finite number of states m_i. Then a configuration x is
constituted by a combination of m active states, where each control variable has exactly
one level. Let b_i denote a binary vector of length m_i reflecting the level of variable
x_i by a 1-entry at the corresponding position. Hence, the set of vectors b_i is a binary
representation of configuration x. Furthermore, let y denote the vector of responses of
length n. This allows modelling the outcome of configuration x by a simple linear model
(multivariate analysis of variance):

$$\hat{\hat{y}}_x = E(\hat{y}_x) = \mu + \sum_{i=1}^{m} \Gamma_i \cdot b_i = E(\hat{\hat{H}}(x \mid \mu, \Gamma_1, \ldots, \Gamma_m)) \tag{4.9}$$

where Γ_i are matrices of dimension n × m_i such that the entry in the kth row and lth
column (γ_kli) is the effect of the lth level of control variable x_i on response variable y_k.
Furthermore, μ ∈ R^n denotes the mean vector.
The aim is to (consistently) estimate the coefficients μ and Γi with as little compu-
tational effort as possible. I.e. only the minimal number of configurations153 necessary
to estimate these coefficients should be evaluated. A set of configurations constitutes an
experimental design. A simple experimental design is the full-factorial design generated
by evaluating all possible configurations in the sample space. In this case, ∏_{i=1}^{m} m_i
configurations have to be simulated. Obviously, this approach can only be realized in situations
when the sample space is small. For most realistic simulation models this approach is
computationally intractable since the number of configurations grows exponentially and
computational power is still limited.
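
To make the growth of the full-factorial design concrete, the sketch below enumerates all configurations for a given set of levels. The helper names and the example levels are illustrative assumptions.

```python
from itertools import product
from math import prod


def full_factorial(levels):
    """Enumerate every configuration; `levels` maps variable name -> tuple of levels."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*(levels[n] for n in names))]


def design_size(levels):
    """Number of configurations of the full-factorial design (product of the m_i)."""
    return prod(len(values) for values in levels.values())


# Six dichotomous control variables give 2**6 = 64 configurations.
binary_levels = {f"x{i}": (0, 1) for i in range(1, 7)}
configurations = full_factorial(binary_levels)

# Mixed numbers of levels: 2 * 3 * 4 = 24 configurations.
mixed_levels = {"x1": (0, 1), "x2": ("low", "mid", "high"), "x3": (10, 20, 30, 40)}
```

Each additional dichotomous variable doubles the number of simulation runs, before replications are even taken into account.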
From (4.9) it can be seen that for a particular response variable y_j the number of
coefficients to be estimated is 1 + ∑_{i=1}^{m} m_i.154 Hence, there are only

$$1 + \sum_{i=1}^{m} m_i - m \tag{4.10}$$

configurations necessary to calculate all coefficients of y_j.155 To do so, the configurations


are generated such that all pairs of configurations are orthogonal.156 This constitutes a
resolution III design which is the simplest form of a fractional factorial design.157 If all
response measures yj are assumed to be independent,158 the corresponding coefficients
153
In literature primarily the term "setting" is used instead of "configuration".
154
This is for the mean and one row of all Γi .
155
This is because one coefficient of each control variable is redundant as it equals the overall mean, see
Hinkelmann and Kempthorne (2005b) for details.
156
Orthogonality is useful as it implies pairwise uncorrelated configurations such that the classic estimators
for independent observations (e.g., the least squares estimator) can be applied, see e.g. Kleijnen (2007,
p. 33-35).
157
The term fractional refers to the fact that a subset of the full factorial design is extracted. See
e.g. Hinkelmann and Kempthorne (2005b) or Kleijnen (2007).
158
This is a reasonable assumption since otherwise a response variable could be calculated by the other
responses.

can be estimated simultaneously by a single design’s results such that multiple objectives
do not induce more configurations to be evaluated.159
Resolution III designs are applicable when all interactions between control variables
are assumed to be 0. If only mean and main effects are to be estimated despite the
presence of non-zero two-variable interactions, resolution III designs have to be enhanced
to consistently estimate mean and main effects.160 For enhancing the resolution III design
in the case of dichotomous variables the so-called foldover theorem can be applied.161 The
main idea is that any configuration carries information of both the variables’ main effects
and interaction effects. To separate both effects the mirrored configuration is evaluated
which carries the same interaction effects but the inverse main effects. Hence, the number
of configurations to be evaluated doubles if two-variable interactions are assumed and a
so-called resolution IV design is created.162
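
The foldover idea can be checked numerically. The sketch below assumes a -1/+1 coding and one textbook generator set (D = AB, E = AC, F = BC) for a resolution III design with six dichotomous variables; neither is taken from the study. In the base design the main effect of D is completely confounded with the AB interaction, whereas after adding the mirrored configurations the two columns are orthogonal.

```python
from itertools import product
from math import prod


def resolution_iii_design():
    """2^(6-3) design in -1/+1 coding with generators D = AB, E = AC, F = BC."""
    return [(a, b, c, a * b, a * c, b * c) for a, b, c in product((-1, 1), repeat=3)]


def foldover(design):
    """Append the mirrored configuration (all signs switched) of every run."""
    return design + [tuple(-v for v in run) for run in design]


def effect_dot(design, cols_u, cols_v):
    """Inner product of two effect columns.

    An effect column is the element-wise product of the listed variable
    columns, e.g. (3,) is the main effect of D and (0, 1) the AB interaction.
    """
    return sum(
        prod(run[c] for c in cols_u) * prod(run[c] for c in cols_v)
        for run in design
    )


base = resolution_iii_design()   # 8 configurations
folded = foldover(base)          # 16 configurations

confounded = effect_dot(base, (3,), (0, 1))     # D versus AB in the base design
separated = effect_dot(folded, (3,), (0, 1))    # D versus AB after the foldover
```

A non-zero inner product means the two effects cannot be told apart from the design's results; an inner product of zero means the main effect estimate is no longer biased by that interaction.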
If two-variable interactions are assumed to be non-zero and should be estimated,
the MANOVA model (4.9) changes to

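$$\hat{\hat{y}}_x = E(\hat{y}_x) = \mu + \sum_{i=1}^{m} \Gamma_i \cdot b_i + \sum_{i,j \mid j>i} \begin{pmatrix} b_i' \cdot \Gamma_{1(ij)} \cdot b_j \\ \vdots \\ b_i' \cdot \Gamma_{n(ij)} \cdot b_j \end{pmatrix} \tag{4.11}$$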

where Γ_k(ij) is the matrix of interaction effects between control variables x_i and x_j on
response variable y_k. Matrix Γ_k(ij) has dimension m_i × m_j, but its first row and first
column are always zero.163 Then, the number of coefficients to be estimated for a particular
response variable is given by

$$1 + \sum_{i=1}^{m} \left( (m_i - 1) + \sum_{j>i} (m_i - 1)(m_j - 1) \right). \tag{4.12}$$

This equals the (minimal) number of configurations in a so-called resolution V design.164


For example, assuming m = 6 dichotomous variables requires 7 coefficients to be esti-
mated for each response variable for a main-effect model as (4.9). If no interaction effects
are assumed, for estimating these coefficients a resolution III design with 8 configurations
needs to be evaluated.165 If two-way interactions are assumed to be non-zero but are
not to be estimated, the corresponding resolution IV design consists of 16 configurations.
The explicit modelling of all two-variable interaction effects increases the number of coef-
159
See Kleijnen (2007) and the references therein.
160
This is because main and interaction effects are confounded, see Kleijnen (2007, sec. 2.6) for details.
161
See Box and Wilson (1951) or Hinkelmann and Kempthorne (2005b, sec. 13.6.4).
162
Note that due to this generation scheme the resulting design is still orthogonal, see also Kleijnen (2007,
sec. 2.6).
163
The zero-entries correspond to constellations when at least one control variable is set at their base
value such that no interactions occur.
164
For details how such designs are generated see Hinkelmann and Kempthorne (2005b) or Kleijnen (2007).
165
To ensure orthogonality of a design, the number of settings has to be a power of 2 for dichotomous
variables.

ficients to 22 which is greater than the number of configurations in a resolution IV design


implying that the corresponding resolution V design has 32 configurations.166 Many methods
and algorithms for generating an explicit design have been proposed. They are standard
procedures implemented in statistical software packages and, hence, are not discussed here
in detail.167
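
The numbers in this example follow from (4.10) and (4.12) and can be reproduced with a short calculation. The sketch below only mirrors the counting argument: rounding up to the next power of two reflects the requirement for dichotomous variables and gives a lower bound on the design size. It does not construct a design, and the function names are illustrative.

```python
from math import prod


def n_main_effect_coefficients(levels):
    """Coefficients of the main-effect model, see (4.10): 1 + sum(m_i) - m."""
    return 1 + sum(levels) - len(levels)


def n_two_way_coefficients(levels):
    """Coefficients with all two-variable interactions, see (4.12)."""
    total = 1
    for i, m_i in enumerate(levels):
        total += m_i - 1
        for m_j in levels[i + 1:]:
            total += (m_i - 1) * (m_j - 1)
    return total


def next_power_of_two(k):
    """Smallest power of two that is at least k."""
    n = 1
    while n < k:
        n *= 2
    return n


levels = [2] * 6                                                         # six dichotomous variables
res_iii_runs = next_power_of_two(n_main_effect_coefficients(levels))    # 7 coefficients -> 8 runs
res_iv_runs = 2 * res_iii_runs                                           # foldover doubles the design
res_v_runs = next_power_of_two(n_two_way_coefficients(levels))          # 22 coefficients -> 32 runs
full_runs = prod(levels)                                                 # 64 runs
```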
Fractional factorial designs are useful when the domain of control variables is discrete
(binary at best). However, in case of continuous variables whose influence on the responses
is assumed to be linear and which are restricted to specific intervals, fractional factorial
designs can be applied, too. In fact, (4.9) changes to

$$\hat{\hat{y}}_x = E(\hat{y}_x) = \mu + \Gamma \cdot x \tag{4.13}$$

where Γ is the n × m matrix of regression coefficients. In such a case the control variables’
domains can be reduced to a binary set containing only the corresponding intervals’ ex-
tremes.168 This is because γki (the regression coefficient of control variable xi on response
variable yk ) equals γki ⋅ (xk1 − xk0 ) = γk1i − γk0i where xkl are the chosen interval points of
the control variable and γkli are the coefficients at both levels according to (4.9). Due to
the assumed linearity, efficient constellations $\hat{\hat{z}} = (x, \hat{\hat{y}})$ can only exist on the boundary
of the simplex spanned by the control variables’ intervals. Hence, after estimating all
coefficients, i.e. μ̂ and Γ̂_i, the meta-model $\hat{\hat{H}}(x \mid \hat{\mu}, \hat{\Gamma}_1, \ldots, \hat{\Gamma}_m)$ can be used to derive the
efficient configurations by checking for dominance of the vectors $\hat{\hat{z}} = (x, \hat{\hat{y}}_x)$ of all ∏_{i=1}^{m} m_i
possible configurations.
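
The estimation step can be illustrated with a deliberately small case. The sketch assumes that the interval extremes of each control variable are coded as -1 and +1 (instead of the 0/1 indicator coding of (4.9)), that the design is orthogonal, and that the responses are noise-free; the coefficients 10, 2, -3 and 0.5 are made up. With four of the eight possible configurations the mean and all three main effects are recovered, and the fitted meta-model then predicts a configuration that was never simulated.

```python
def half_fraction():
    """2^(3-1) design in -1/+1 coding with generator x3 = x1 * x2."""
    return [(a, b, a * b) for a in (-1, 1) for b in (-1, 1)]


def estimate_main_effects(design, responses):
    """Least squares estimates for an orthogonal two-level design.

    Because all columns are orthogonal, each estimate is the inner product
    of the column with the responses divided by the number of runs.
    """
    n = len(design)
    mean = sum(responses) / n
    effects = [
        sum(run[i] * y for run, y in zip(design, responses)) / n
        for i in range(len(design[0]))
    ]
    return mean, effects


def predict(mean, effects, run):
    """Evaluate the linear meta-model for a configuration in -1/+1 coding."""
    return mean + sum(g * v for g, v in zip(effects, run))


design = half_fraction()
# Hypothetical noise-free responses of a linear system.
responses = [10 + 2 * a - 3 * b + 0.5 * c for a, b, c in design]
mean, effects = estimate_main_effects(design, responses)
unseen = predict(mean, effects, (1, 1, -1))  # configuration not in the design
```

In this coding an effect is half the response difference between the two extremes, so the slope per natural unit of a variable is twice the effect divided by the interval length.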

Example 11 (Fractional factorial design). To illustrate the benefits of fractional factorial


designs for planning simulation experiments, the chemical SC described in Examples 9
and 10 is considered again. Logistical core components in this model are the chemical
transports between both production sites. Hence, logistical key decisions have to determine
the corresponding capacities for inter-site transports. Relevant capacities are the RTCs at
hand per chemical transported, the transfer arms, the inventory capacities, and the target
levels as well as the number of available trains and the number of working days for RTC
handling.
To apply fractional factorial designs, the control variables have to be of nominal or
ordinal scale or the relations to the performance measures are assumed to be linear. Here,
the influence of the number of available trains, the number of working days for RTC
handling, and the number of transfer arms is investigated. As status quo, the data provided
in example 9 is used: From Monday to Saturday trains can be dispatched and RTCs
can be processed. For each product an equal number of transfer arms is available at
both sites according to Table A.12 in the appendix. As an alternative configuration, the

166
For these numbers see e.g. Hinkelmann and Kempthorne (2005b, p. 535).
167
See Kleijnen (2007, ch. 2) for some generators for dichotomous variables.
168
Or any other distinct pair of values from the interval.

effect of making Sunday available for train dispatching and RTC processing should be
tested to investigate whether an improvement in the network’s robustness and inventory
balancing can be achieved. Furthermore, the impact of doubling the turnover capacity for
all chemicals is investigated, which might reduce RTC cycle times and, hence, reduce the
number of required RTCs. In brief, Table 4.11 shows the values for each variable.

             # trains   # handling days   # unloading arms
                                          C2   C3   C4   Styrene
status quo   6          6                 2    3    4    3
extension    7          7                 4    6    8    6

Table 4.11: Values of the control variables for the experimental design

Since for all variables two states are considered, in total 2^6 = 64 possible configurations
constitute the corresponding full factorial design. It is assumed that a variable’s effects
on the responses depend on the values of the other variables. E.g. it can be assumed that
the individual effect of extending the turnover capacity of a certain chemical potentially
reduces turnover times. In this case, RTCs might be repositioned faster and, hence, in
a given amount of time transport quantities could be increased. This might manifest
in more smoothed inventory levels and a reduced risk for inventory shortfalls. However,
these potentials might only be exploited if the transport capacities are extended, too. I.e. to
realize these effects, train dispatches on Sunday might be necessary.
In other words, interaction effects between the variables are considered to reflect the
systems’ dynamics. To find a balance between modelling accuracy and computational
effort, only two-variable interactions are considered. Note that, since mi = 2 for all
variables, main effects and two-variable interaction effects correspond to 22 coefficients to
be estimated according to (4.12). Hence, a resolution V design is constructed with 25 = 32
configurations. Table A.17 shows the 32 evaluated configurations. For each configuration
a time span of 180 days is simulated. To assess the responses’ variability, 100 replications
are conducted per configuration.
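
A half fraction with 32 configurations that supports all 22 coefficients can be sketched as follows. The generator used here (sixth variable equals the product of the other five, in -1/+1 coding) is a common textbook choice and an assumption of this sketch; the design in Table A.17 may have been generated differently. The check confirms that the mean column, the six main-effect columns and the 15 two-variable interaction columns are mutually orthogonal, so each coefficient can be estimated separately.

```python
from itertools import combinations, product


def resolution_v_design():
    """2^(6-1) design in -1/+1 coding with generator F = ABCDE."""
    return [
        (a, b, c, d, e, a * b * c * d * e)
        for a, b, c, d, e in product((-1, 1), repeat=5)
    ]


def model_columns(design):
    """Columns for the mean, all main effects and all two-variable interactions."""
    m = len(design[0])
    columns = {(): [1] * len(design)}
    for i in range(m):
        columns[(i,)] = [run[i] for run in design]
    for i, j in combinations(range(m), 2):
        columns[(i, j)] = [run[i] * run[j] for run in design]
    return columns


def is_orthogonal(columns):
    """True if every pair of distinct columns has inner product zero."""
    return all(
        sum(u * v for u, v in zip(columns[p], columns[q])) == 0
        for p, q in combinations(columns, 2)
    )


design = resolution_v_design()
columns = model_columns(design)
```

With 100 replications per configuration, such a design requires 3,200 simulation runs instead of the 6,400 runs of the full factorial design.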
The system’s reactions under specific constellations of variables manifest themselves in
various measures. Three classes are distinguished here: First, inventory-related measures
reflect costs incurred by holding stocks. On the other hand, stock-out costs occur when
supplied plants have to be shut down due to a shortage of material. These costs typically
reflect the loss in revenues when customer orders cannot be fulfilled due to a shortage
of final products. It is assumed that the target stock levels reflect the points where stock
holding costs and expected stock-out costs are in balance.169 Hence, any deviation from the
desired target stock level implies an increase in total inventory-related costs. Therefore,

169
Note that a re-configuration of the logistical system may lead to an improved logistical performance
(e.g. faster cycle time) and may influence the determination of target stock levels. However, this effect
is not investigated here.

the relative average target stock deviation rip is calculated as follows


$$r_{ip} = \frac{1}{s^{Tar}_{ip}} \cdot \sqrt{\frac{\sum_{t=1}^{T} \left( s_{itp} - s^{Tar}_{ip} \right)^2}{T}} \tag{4.14}$$

where s_itp and s^Tar_ip denote the stock level and target stock level of chemical p in period
t at site i, respectively.170 In total, eight inventory-related measures are calculated for
each simulation run of this example.171 In (4.14) over- and under-shooting of the target
level enter as squared deviations. This implies that over-shooting and under-shooting have the
same negative effect, i.e. they change costs at the same rate. This might not be true for every
real-world problem. If over- and under-shooting should be handled individually, r_ip has to
be calculated with different weights for under- and over-shooting.
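
A minimal sketch of (4.14) is given below. The optional weights `w_over` and `w_under` are one possible way to treat over- and under-shooting differently; this weighting scheme and the example numbers are assumptions of the sketch and not part of the study. With both weights equal to 1 the function reduces to (4.14).

```python
import math


def relative_target_deviation(stock, target, w_over=1.0, w_under=1.0):
    """Relative root mean squared deviation of stock levels from the target.

    stock   -- stock levels per period for one chemical at one site
    target  -- target stock level of that chemical at that site
    w_over  -- weight of squared deviations above the target (assumption)
    w_under -- weight of squared deviations below the target (assumption)
    """
    total = 0.0
    for level in stock:
        deviation = level - target
        weight = w_over if deviation > 0 else w_under
        total += weight * deviation ** 2
    return math.sqrt(total / len(stock)) / target


# Hypothetical stock trajectory with a target level of 100.
stock = [90, 110, 100, 120]
symmetric = relative_target_deviation(stock, 100)
penalise_shortfall = relative_target_deviation(stock, 100, w_under=4.0)
```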
A second category of measures reflects the logistical effort incurred by a specific system
configuration. Since capacities of the logistical system are handled as input variables
or kept fixed, responses should also measure the utilization of the capacities and, hence,
reflect the (additional) operational costs. In this example, operational logistic costs
encompass RTC handling and maintenance costs, turnover costs, and charges for dispatched
trains. All costs except for the trains’ charges primarily depend on the labour costs of
the workforce. However, in most cases these costs are fixed for a given stock of
personnel and, hence, do not depend on the utilization. Therefore, only the total number
of dispatched trains tr is kept as a response variable

$$tr = \sum_{i \in I} \sum_{j \in I} \sum_{t \in T} tr_{ijt}, \tag{4.15}$$

where trijt denotes the number of trains dispatched in period t on link (i, j).
As a third category production-related performance measures reflect the monetary cash-
flow resulting from production operations. This includes costs for raw and auxiliary mate-
rial consumption, labour costs of operators, opportunity costs for tied capital, and revenues
due to sales of final and intermediate chemicals. A configuration of the network’s logisti-
cal system affects the production system by providing prerequisites for normal production
operations. I.e. a configuration assures that raw materials are available and that final and
intermediate chemicals can be stored and delivered to customers. In this simulation study,
these effects result in the availability or storability of intermediate chemicals. If material is
unavailable or not storable (e.g. due to fully utilized stock capacities), a disturbance of the
production processes results such that plants have to reduce their production outcome or
shut down. An appropriate measure to account for production disturbances is to measure
the gap between planned and realized production for all plants associated to the considered

170
Note that rip can be seen as a relative root mean squared error (RMSE) for chemical p at site i.
171
Since there are two sites and four products in focus.

chemicals. This measure equals the β-service level of each chemical considered:

$$\beta_{ip} = \frac{\sum_{t \in T} \omega^{real}_{ipt}}{\sum_{t \in T} \omega^{plan}_{ipt}}. \tag{4.16}$$

In (4.16), ω^real_ipt and ω^plan_ipt denote the realized and planned balance of chemical p in period t
at site i.172 Note that the planned balance ω^plan_ipt of a chemical also incorporates variations
of the plants’ production states due to technical or other reasons such that the difference
between ω^real_ipt and ω^plan_ipt only accounts for deviations due to logistical reasons.
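
As a small illustration of (4.16), the following sketch computes the β-service level of one chemical at one site from its realized and planned balances. The function name and the numbers are hypothetical.

```python
def beta_service_level(realized, planned):
    """Ratio of total realized to total planned balance over all periods."""
    total_planned = sum(planned)
    if total_planned == 0:
        raise ValueError("planned balance must not sum to zero")
    return sum(realized) / total_planned


# Hypothetical balances: one period without any realized quantity.
planned = [10, 10, 10, 10]
realized = [10, 8, 0, 10]
beta = beta_service_level(realized, planned)
```

Because the balances are summed before the ratio is taken, periods with a large planned quantity weigh more heavily than in an average of per-period ratios.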
To sum up, for the chemicals considered in inter-site transport in total 16 responses
are defined by (4.14) - (4.16).173 To simplify the following analyses it is assumed that all
chemicals at both sites have equal priority. Service level and inventory-related responses
can be merged into average measures by

$$r = \frac{\sum_{i \in I} \sum_{p \in P} r_{ip}}{|I| \cdot |P|} \tag{4.17}$$

and

$$\beta = \frac{\sum_{i \in I} \sum_{p \in P} \beta_{ip}}{|I| \cdot |P|}. \tag{4.18}$$
Then, a scenario comprises six input variables whose effects on three response variables
are evaluated. To calculate the effects, linear models with two-way interaction effects
are estimated for each response variable. The average β-service level is restricted to the
interval [0, 1]. An ordinary linear model is not an appropriate model for probabilistic
measures since its prediction space is not restricted. Therefore, a logit model is used to
describe the input variables’ effects on the average β-service level. I.e.174

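$$\log\left(\frac{\beta}{1-\beta}\right) = \mu^{\beta} + \sum_{i=1}^{6} \left( \gamma_i^{\beta} + \sum_{j>i}^{6} (\gamma_i^{\beta} \gamma_j^{\beta})_{ij} \right) \tag{4.19}$$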

where γiβ is the (main) effect of setting input variable xi to its higher value and (γiβ γjβ )ij
is the two-way interaction effect of setting variables xi and xj to their higher values. Note
that μβ corresponds to the basic configuration where all input variables are set to their
lower/initial values.
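
How the logit model keeps predictions inside the unit interval can be sketched as follows. The coefficients are hypothetical and unrelated to the estimates reported for this example; `x` is a binary vector that indicates which input variables are set to their higher value.

```python
import math
from itertools import combinations


def logit(p):
    """Log-odds of a proportion p with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(eta):
    """Map a linear predictor back to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def predict_service_level(mu, main, interaction, x):
    """Predicted service level for a binary configuration x.

    mu          -- linear predictor of the basic configuration
    main        -- main effects, one per input variable
    interaction -- dict mapping index pairs (i, j) with i < j to interaction effects
    """
    eta = mu
    for i, active in enumerate(x):
        if active:
            eta += main[i]
    for i, j in combinations(range(len(x)), 2):
        if x[i] and x[j]:
            eta += interaction.get((i, j), 0.0)
    return inverse_logit(eta)


# Hypothetical coefficients for three input variables.
mu, main, interaction = 2.0, [0.5, -0.2, 0.1], {(0, 1): 0.3}
base_level = predict_service_level(mu, main, interaction, (0, 0, 0))
extended_level = predict_service_level(mu, main, interaction, (1, 1, 0))
```

Effects are additive on the log-odds scale only. The change in the predicted service level caused by a given effect therefore depends on the level from which it starts.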
For both other responses ordinary linear models are estimated, i.e.

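$$tr = \mu^{tr} + \sum_{i=1}^{6} \left( \gamma_i^{tr} + \sum_{j>i}^{6} (\gamma_i^{tr} \gamma_j^{tr})_{ij} \right) \tag{4.20}$$

$$r = \mu^{r} + \sum_{i=1}^{6} \left( \gamma_i^{r} + \sum_{j>i}^{6} (\gamma_i^{r} \gamma_j^{r})_{ij} \right) \tag{4.21}$$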

172
Note that in the considered scenarios ∑_{t∈T} ω^plan_ipt ≠ 0 for all sites, periods, and products.
173
For C4 at site 1 no consuming plant exists. Hence, no local β-service level exists such that only seven
service levels are calculated.
174
For detailed information about logit models see e.g. Faraway (2005).

where the coefficients (γ) are defined as described above. Note that the relative average
inventory deviation r is restricted to the interval $[0, \max_{i \in I, p \in P} (1, s^{Cap}_{ip} / s^{Tar}_{ip})]$. This violates the
assumption of classic linear models that variables have continuous and unbounded support.
However, an explicit modelling of truncated variables does not reveal a significant
advantage as the simulated values are sufficiently far away from their extremes.
Hence, a linear model is used since the violation of model assumptions is expected to be
negligible. Beside the input variables, a main influence on the responses is the number
and duration of plant break-downs due to technological reasons. To take this influence
into account, the total quantity of the corresponding chemicals produced at both sites is
incorporated as an additional explanatory variable.175 Figures 4.12a - 4.12c show the box-
plots for the three responses. The estimated responses based on the models (4.19)-(4.21)
are superimposed by the symbol ▲.
The corresponding estimated coefficients γ can be found in Table 4.12.

                      β-service level (β)    # trains (tr)          stock deviation (r)
input variable        Estimate Pr(> ∣t∣)     Estimate Pr(> ∣t∣)     Estimate Pr(> ∣t∣)
(Intercept) -13.00 0.00∗∗∗ 205.94 0.00∗∗∗ 0.51 0.00∗∗∗
# arms C2 -0.02 0.77 -3.07 0.00∗∗∗ -0.02 0.01∗∗∗
# arms C3 0.04 0.51 -0.95 0.33 0.01 0.23
# arms C4 0.05 0.40 -0.88 0.37 0.00 0.88
# arms Styrene -0.04 0.49 -0.44 0.66 0.00 0.62
# trains 0.20 0.00∗∗∗ 18.66 0.00∗∗∗ -0.03 0.00∗∗∗
# handling days 0.07 0.20 -0.88 0.37 0.00 0.71
arms.c2:arms.c3 -0.09 0.07† 0.48 0.54 0.00 0.99
arms.c2:arms.c4 0.02 0.72 -0.05 0.95 0.01 0.23
arms.c2:arms.sty 0.05 0.28 -0.21 0.79 0.00 0.83
arms.c2:days.train -0.05 0.34 0.15 0.86 0.00 0.67
arms.c2:days.hand 0.07 0.12 -0.48 0.55 0.00 0.85
arms.c3:arms.c4 -0.02 0.65 -0.50 0.53 0.00 0.68
arms.c3:arms.sty 0.06 0.23 0.34 0.67 0.00 0.77
arms.c3:days.train -0.01 0.88 0.72 0.37 -0.01 0.17
arms.c3:days.hand 0.01 0.87 -0.17 0.83 0.00 0.70
arms.c4:arms.sty 0.02 0.74 0.14 0.86 0.00 0.88
arms.c4:days.train -0.07 0.13 0.17 0.83 0.00 0.90
arms.c4:days.hand -0.04 0.36 0.99 0.22 0.00 0.47
arms.sty:days.train 0.01 0.83 -0.73 0.36 0.00 0.79
arms.sty:days.hand -0.10 0.03∗ 1.40 0.08† -0.01 0.23
days.train:days.hand -0.06 0.22 0.63 0.43 0.00 0.97
adj. R² / pseudo-R² 0.61 0.45 0.05
significance codes: ∗∗∗ ≤ 0.001; ∗∗ ≤ 0.01; ∗ ≤ 0.05; † ≤ 0.1

Table 4.12: Effects of input variables on responses in example 11

The corresponding QQ-plots for all models’ residuals are depicted in Figures A.3a-A.3c.
The QQ-plots displayed in Figure A.3b and Figure A.3c show a good fit to the normal

175 Note that this variable is independent of the experimental design such that no collinearity exists.

[Figure 4.12: Simulated and fitted responses per experimental configuration. Panels:
(a) Boxplots of simulated and fitted β-service levels per scenario; (b) Boxplots of simulated
and fitted number of trains per scenario; (c) Boxplots of simulated and fitted inventory
deviation per scenario. x-axis: scenario (1 to 32); y-axes: β-service level, # trains,
inventory deviation; legend: fitted value per scenario.]

distribution.176 For the logit model (4.19) the QQ-plot has only limited relevance, since the
residuals are not assumed to be normally distributed.177 Thus, Figure A.3a shows a typical
pattern for truncated variables with increasing concentration towards the truncation. From
the QQ-plots no serious violations of the model assumptions can be detected for any of the
three fitted models. All models significantly reduce the corresponding residuals’ variances178
whereby the explained variance is about 50% for (4.19) and (4.20) but only 5% for (4.21).
The low level of explained variance for (4.21) is an indicator that either not all relevant
influences are included in the regression or the investigated variables only slightly affect
this measure.
From Figure 4.12 it can be deduced that the variances of the response variables are
similar in the experimental configurations. This presumption is supported by Bartlett’s test
for variance homogeneity for all three responses (at a significance level of 1%).179 Hence,
an effect of the input variables on the variation of the responses cannot be verified such
that a discussion on the robustness of an experimental configuration is obsolete w.r.t.
the selected three responses. Table 4.12 contains the estimates and t-test probabilities
of the coefficients of models (4.19)-(4.21). It can be observed that the number of trains
dispatchable within a week (shipment capacity) is the only variable that has a significant
effect on all responses. As expected, the effect on the β-service level and the total number
of trains dispatched is positive, whereas the inventory deviation can be significantly reduced
by additionally available train transports. An extension of the turnover capacity for C2
has significant negative effects on the number of trains and the stock deviation.
Only a few (slightly) significant interaction effects can be found. A negative impact on
the β-service level occurs for a simultaneous extension of turnover capacities for C2 and
C3 as well as the simultaneous extension of Styrene’s turnover capacity in combination
with extended RTC processing times. Negative two-way interaction effects are not to be
expected whereas positive ones indicate an effective exploitation of extended capacities.
Because only 100 replications were performed, it is presumed that the (slightly)
significant two-way interaction effects are spurious and reflect a noise effect.
A positive interaction effect on the total number of dispatched trains is observed for the
simultaneous extension of Styrene’s turnover capacity in combination with an extended
RTC processing time. I.e. more trains are dispatched as RTCs can be handled faster.
This might be an indication that the fleet of Styrene RTCs is too small. However, this
176 However, Jarque-Bera tests do not support this finding for either model. A closer inspection of the
QQ-plots reveals indications of slightly fat-tailed distributions of the residuals. Hence, the models
were re-fitted for Box-Cox-transformed dependent variables (with λtrain = 1.4747 and λstock = 1.6363).
For these models Jarque-Bera tests confirm the normality of residuals. For the sake of brevity the
detailed model specifications are not shown here as the general results are the same as for the original
models.
177 See Faraway (2005).
178 This finding is verified by F-tests for models (4.20) and (4.21) as well as Wald’s test for model (4.19)
at a significance level of 1%.
179 Note that the left-skewed distribution of the simulated β-service levels as depicted in Figure 4.12a is
compensated by the logit transformation.

finding might also be caused by the comparatively small number of replications.


Based on the models (4.19)-(4.21) with estimated coefficients as stated in Table 4.12 the
expected outcomes of the remaining 32 (not simulated) configurations can be estimated.
The results are depicted in Table A.18 in the appendix. Among all 64 possible
configurations, 15 can be dropped as they are dominated by at least one other configuration.
Note that the basic configuration (# 1) where all variables are at their initial values is
efficient, but the extreme configuration with all variables at their maximum values (# 64)
is dominated.180 I.e. a simultaneous extension of all variables is not advisable. Based on
the significant main effects it can be concluded that an extension of the turnover capacity
for C2 results in a reduction of the logistical effort measured in terms of the number of
dispatched trains.181 On the other hand, if the focus is rather on the logistical reliability,
the extension of the shipment capacity has a positive effect on both reliability measures
(β-service level and stock deviation) which comes at the cost of 18 additional trains in a
6-month period on average.182 An extension of both capacities improves the stock reliability
only slightly and reduces the additional number of dispatched trains.183
When a re-configuration of the existing SC is to be evaluated, the decision depends on
the costs of the potential re-configurations184 as well as the predicted benefit of the
corresponding performance effects.185
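
The dominance check used to discard configurations can be sketched as follows. This is a minimal illustration in Python, assuming that the β-service level is to be maximized while the number of trains and the stock deviation are to be minimized; the four labelled configurations and their values are hypothetical and are not taken from Table A.18.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly
    better in at least one. All objectives are stated as 'smaller is better'."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def efficient(configs):
    """Return the labels of all configurations not dominated by another one."""
    return [name for name, obj in configs.items()
            if not any(dominates(other, obj) for other in configs.values())]

# Hypothetical responses: (beta-service level, # trains, stock deviation)
raw = {
    "A": (0.90, 206, 0.51),
    "B": (0.92, 225, 0.48),
    "C": (0.90, 230, 0.50),
    "D": (0.91, 203, 0.49),
}
# Negate the service level so that every objective is minimized
as_min = {k: (-b, tr, r) for k, (b, tr, r) in raw.items()}
kept = efficient(as_min)
```

In this toy data set, D is at least as good as A and C in all three responses, so only B and D remain.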

4.3.4.3 Simulation optimization

In case of continuous control variables x and non-linear relations to the responses y,
other methods have to be applied to find efficient constellations. The "classic way" is
the so-called response surface methodology (RSM).186 Here, local linear meta-models are
estimated iteratively using resolution III or IV designs; this step in the RSM is called
response surface approximation (RSA).187 Local refers to a (small) sub-space of the
control variables’ sample space. In the subsequent iteration the investigated sub-space is
re-located according to a given step-width and a gradient-based direction (depending on
the preference relation of the dependent variables).188 If an optimum exists in some region
of the sample space, local linear meta-models will show a serious lack of fit which can be

180 Configuration # 64 is e.g. dominated by configuration # 60.
181 For details see e.g. configuration # 3 in Table A.18.
182 See e.g. configuration # 17 in Table A.18.
183 On average by three trains, see e.g. configuration # 18 in Table A.18.
184 Re-configuration costs are e.g. investments for turnover capacity extensions.
185 Financial impacts may materialize e.g. as cost savings due to smoother stock levels.
186 See Barton and Meckesheimer (2006) for details on meta-modelling using RSM.
187 See Kleijnen (2007).
188 Classic RSM applies a simple steepest descent approach, see Law (2007) or Kleijnen (2007). Typically,
RSM assumes an unconstrained experimental space (except for upper and lower bounds). However,
an extension to incorporate (additional) constraints on the experimental space is provided by Kleijnen
(2008).

quantified by various measures.189 If the lack of fit exceeds a critical value, a local
polynomial of higher order (i.e. second order) is used to approximate the local responses.190
If the local meta-model fits sufficiently accurately, this model is used to determine the
optimal configuration.191
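
One iteration of the classic RSM loop described above can be sketched in Python as follows. The sketch makes several simplifying assumptions that are not part of the original study: a full two-level factorial design in coded units (-1/+1) is used instead of a resolution III or IV fraction, a single response is minimized, and the deterministic function `toy_response` is a hypothetical stand-in for the simulation model.

```python
from itertools import product

def fit_local_linear(coded_design, responses):
    """Least-squares fit of y = b0 + sum_j b_j z_j on a two-level factorial
    design. The columns are orthogonal, so each slope is a contrast average."""
    n = len(coded_design)
    k = len(coded_design[0])
    intercept = sum(responses) / n
    slopes = [sum(z[j] * y for z, y in zip(coded_design, responses)) / n
              for j in range(k)]
    return intercept, slopes

def next_centre(centre, half_width, slopes, step):
    """Re-locate the local sub-space one step against the estimated gradient."""
    grad = [b / h for b, h in zip(slopes, half_width)]  # slope per natural unit
    norm = sum(g * g for g in grad) ** 0.5
    if norm == 0.0:
        return list(centre)
    return [c - step * g / norm for c, g in zip(centre, grad)]

def toy_response(x):
    # Hypothetical response surface with its minimum at (3, -1)
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2

centre, half_width = [0.0, 0.0], [0.5, 0.5]
design = list(product([-1, 1], repeat=2))
runs = [toy_response([c + h * z for c, h, z in zip(centre, half_width, d)])
        for d in design]
b0, slopes = fit_local_linear(design, runs)
new_centre = next_centre(centre, half_width, slopes, step=1.0)
```

In a full RSM procedure this step would be repeated until the local linear model shows a lack of fit, at which point a second-order design would take over.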
Unfortunately, in classic RSM the local RSA relies on local polynomial relations and
requires that all general assumptions of linear models are fulfilled. This includes
homogeneous variances among the configurations and normally distributed errors. In
many simulation experiments these assumptions are violated which hinders a consistent
fit of linear models and interferes with the optimization process. Therefore, in advanced RSM,
local linear meta-models are replaced by other functions known as surrogate functions.
In the literature, five main classes of surrogate functions are distinguished: Kriging
models,192 neural networks,193 regression splines,194 support vector regression,195 and radial
basis functions.196 The general procedure of the RSM remains unchanged, but the gradient
defining the local sub-space to be explored in the next iteration is calculated based
on the aforementioned local meta-model functions.197
As an alternative to RSM, simulation responses can be used directly to explore the
sample space of control variables. To do so, many combinatorial optimization approaches
have been adapted for simulation optimization. In general, there are four main classes of
methods that have shown a particular applicability in (multi-objective) simulation
optimization: meta-heuristics, gradient-based procedures,198 random search,199 and sample
path optimization.200 Of particular interest are meta-heuristics as they have shown a
good performance for a wide range of combinatorial optimization problems. Therefore,
commercial simulation software primarily uses these techniques to incorporate simulation
optimization routines.201 Among meta-heuristics, tabu search, scatter search, and genetic
algorithms are most widely used.202 Table 4.13 provides an overview of all aforementioned
techniques.
In non-commercial applications, genetic algorithms are often proposed.203 In general,
189 Such as the adjusted R² or the Karush-Kuhn-Tucker conditions, see Kleijnen (2007).
190 The most prominent design to fit a second-order polynomial is the so-called central-composite design,
see e.g. Kleijnen (2007). Here, a resolution V design is enhanced by 1 + 2 ⋅ m configurations to be able
to estimate quadratic effects, too.
191 See Wilson et al. (2001) for details on applying RSM to find efficiency frontiers.
192 See e.g. Kleijnen (2009) for details.
193 See e.g. Sabuncuoglu and Touhami (2002) or Fonseca et al. (2003).
194 See Li et al. (2010b) or Barton and Meckesheimer (2006).
195 See e.g. Clarke et al. (2005).
196 See e.g. Barton and Meckesheimer (2006) or Barton (2009) for an overview and Shin et al. (2002) for
an application of radial basis functions.
197 See Li et al. (2010b) for a comparison of these meta-models.
198 See e.g. Fu (2006).
199 See Andradóttir (2006).
200 See Fu et al. (2005) for an overview.
201 See e.g. Law (2007) or Banks et al. (2005).
202 See e.g. the commercial built-in optimizer OptQuest (Laguna and Martí, 2003; Laguna, 2011) or the
WITNESS optimizer (Lanner Group, 2002).
203 Albeit, there is also commercial simulation software using this type of meta-heuristics, see e.g. Barton

Response surface approximation             Direct optimization

• linear & quadratic regression            • meta-heuristics
  (classic RSA)                              – tabu search
• regression splines                         – scatter search
• support vector regression                  – genetic algorithms
• radial basis functions                   • gradient-based methods
• neural networks                          • random search
                                           • sample path optimization

Table 4.13: Overview of simulation optimization methods

genetic algorithms in multi-objective simulation optimization proceed as follows:204

1. construct initial configurations (i.e. a population)

2. evaluate the configurations’ performances (i.e. perform a number of simulation runs)

3. determine non-dominated configurations of the population and assign fitness values

4. create a new population by applying genetic operators (mutation and cross-over) on
   the subset of non-dominated configurations with best fitness scores

5. repeat steps 2 to 4 until the set of non-dominated configurations is sufficiently large.
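
The five steps above can be sketched as a short Python loop. This is a deliberately simplified illustration and not the procedure used in this study: configurations are single real numbers, the fitness assignment of step 3 is reduced to a pure dominance check, the loop stops after a fixed number of generations, and the deterministic function `toy_objectives` is a hypothetical stand-in for the simulation runs.

```python
import random

def dominates(a, b):
    """a dominates b if it is no worse in all objectives and strictly better
    in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_indices(objs):
    """Indices of objective vectors that no other vector dominates."""
    return [i for i, a in enumerate(objs)
            if not any(dominates(b, a) for b in objs)]

def evolve(evaluate, lo, hi, pop_size=20, generations=30, seed=1):
    rng = random.Random(seed)
    # Step 1: construct an initial population
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: evaluate every configuration
        objs = [evaluate(x) for x in pop]
        # Step 3: determine the non-dominated configurations
        elite = [pop[i] for i in pareto_indices(objs)]
        # Step 4: refill the population by cross-over and mutation
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.choice(elite), rng.choice(elite)
            child = 0.5 * (a + b) + rng.gauss(0.0, 0.05 * (hi - lo))
            children.append(min(hi, max(lo, child)))
        pop = elite + children  # Step 5: repeat with the new population
    objs = [evaluate(x) for x in pop]
    return [(pop[i], objs[i]) for i in pareto_indices(objs)]

def toy_objectives(x):
    # Hypothetical conflicting objectives; the efficient set is 0 <= x <= 2
    return (x * x, (x - 2.0) ** 2)

front = evolve(toy_objectives, -5.0, 5.0)
```

The returned list contains pairs of a configuration and its objective vector; by construction no member of the list dominates another.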

In contrast to single-objective optimization, in multi-objective optimization two goals have
to be achieved:

a) Find configurations close or equal to the real efficient configurations (accuracy).

b) Find a heterogeneous set of non-dominated configurations in order to describe the
   set of efficient configurations (i.e. the Pareto front) as completely as possible.205

While the first goal is shared with single-objective optimization, the second goal is
characteristic of multi-objective optimization problems. Hence, multi-objective
optimization procedures try to spread the pool of non-dominated configurations along the
Pareto front. To do so, the diversity of non-dominated configurations has to be incorporated
in the solution procedure.

and Meckesheimer (2006) or Barton (2009).
204 See e.g. Coello et al. (2007).
205 See e.g. Srinivas and Deb (1994) or Coello et al. (2007).

A popular example is the non-dominated-sorting genetic algorithm (NSGA).206 Here,
the more other configurations a specific configuration dominates and the more separated
it is, the higher is its fitness value. Originally, the measure of separateness
depends on the distance between two configurations and a sharing parameter.207
In the improved version (NSGA-II), this measure is replaced by the so-called crowding
distance which is the average distance to a configuration’s component-wise nearest
neighbours.208 The NSGA approach has found a wide field of applications, particularly in
engineering and SCM applications.209 To further improve the algorithm’s performance, the
set of non-dominated configurations is stored in a central, iteration-independent archive
which assures that no non-dominated configuration is lost during the optimization
process.210
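
A minimal Python sketch of a crowding distance calculation is given below. It follows the commonly cited form in which, for each objective, the range-normalized distance between the two neighbours of a configuration is summed up and boundary configurations receive an infinite distance; whether the implementation used in this study normalizes or averages in exactly this way is an assumption.

```python
def crowding_distance(objs):
    """Crowding distance of each objective vector within one front.

    For every objective the front is sorted, the two extreme points get an
    infinite distance, and every interior point accumulates the normalized
    gap between its lower and upper neighbour.
    """
    n = len(objs)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(objs[0])):
        order = sorted(range(n), key=lambda i: objs[i][m])
        f_min, f_max = objs[order[0]][m], objs[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if f_max == f_min:
            continue  # no spread in this objective, nothing to add
        for pos in range(1, n - 1):
            gap = objs[order[pos + 1]][m] - objs[order[pos - 1]][m]
            dist[order[pos]] += gap / (f_max - f_min)
    return dist

# Three hypothetical points on a two-objective front
d = crowding_distance([(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)])
```

Configurations with a larger crowding distance lie in sparsely populated regions of the front and are therefore preferred when the diversity of the population is to be preserved.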
The following example shows an application of the NSGA-II algorithm for a subsystem
of the chemical SC example introduced before.

Example 12 (Simulation optimization using NSGA-II). In example 9 the supply process
of Naphtha was introduced as a continuous supply of both crackers via pipeline from a
refinery. This implies a direct dependency on the supplying refinery. An alternative is
to assume that the supplier in Figure 4.8 represents a sea ship terminal where Naphtha
can be unloaded from ships. At the terminal, tanks are available for intermediate storage
of Naphtha. Transports of Naphtha via ship are now associated with shipment costs to
be charged for a specific transport relation. Often refineries are located ashore or have a
direct connection to a sea ship terminal in order to reduce the dependency on a single
supplier.211 The inventory management of Naphtha is crucial in the configuration of the
SC because Naphtha stocks constitute an important part of the total quantity of stored
chemicals. Hence, Naphtha stocks cause a relevant contribution to the SC’s total stock
holding costs. Therefore, the inventory management primarily aims at keeping the stock
holding and supply costs in balance.
The inventory policy controls the stocks at the three locations (the harbour and both
production sites) and the number of Naphtha shipments. The total demand faced by the
terminal equals the sum of the Naphtha consumption of both sites which is ω^max = 5,000
tons per day in regular operation. The crackers at both sites are modelled as described in
example 9. It is assumed that the harbour’s unloading capacity is ρ^cap_in = 20,000 tons a day.

206 See Deb et al. (2002) and Srinivas and Deb (1994).
207 See Srinivas and Deb (1994).
208 See Deb et al. (2002).
209 See e.g. Deb et al. (2007) or Bin et al. (2010) for engineering applications and Puigjaner and
Guillén-Gosálbez (2008) or Mele et al. (2006) for applications in the SCM context.
210 Otherwise, the maximal number of efficient configurations to be found equals the population size in
each iteration. This loss of non-dominated configurations is called Pareto drift and the improved
algorithm is called NSGA-IIa, see Goel et al. (2007).
211 However, typically contracts with only few suppliers will be made to cover the regular demand of
Naphtha. Nonetheless, the sea ship terminal provides the opportunity to order additional supplies
e.g. in case of a shortfall of the regular supplier(s).

Unloading is possible every day. The tank capacities at both sites are shown in Table 4.4
whereas the tank capacity at the sea port is assumed to be 75,000 tons.
The inventory management follows local (si, Si) policies at both production sites
(i ∈ {1, 2}) and the harbour (i = h). Six parameters have to be determined to fully specify the
SC’s raw material inventory management. If the available stock at location i ∈ {1, 2, h} in
period t (say l′it) falls below the corresponding order point si, an order is placed with the
order quantity qit = Si − l′it. The available stock comprises the local inventory level (lit) and
all non-delivered orders ($\sum_{\tau=t}^{T} q_{i\tau}$). If orders are placed at the production sites, they are
immediately fulfilled via pipeline as long as a sufficient quantity of Naphtha is available
at the harbour tanks.
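
The ordering rule of the local (si, Si) policy can be sketched in a few lines of Python. The function name and the numeric values in the usage lines are illustrative assumptions; only the rule itself (order up to Si whenever the available stock falls below si) is taken from the description above.

```python
def order_quantity(on_hand, outstanding, s, S):
    """Order-up-to rule of an (s, S) policy.

    on_hand     : local inventory level
    outstanding : quantities of orders placed but not yet delivered
    Returns S minus the available stock if the available stock is below s,
    and 0 otherwise.
    """
    available = on_hand + sum(outstanding)
    return S - available if available < s else 0.0

# Hypothetical values: re-order point 5,000 t, order level 8,000 t
q_low = order_quantity(4000.0, [], s=5000.0, S=8000.0)
q_covered = order_quantity(4000.0, [2000.0], s=5000.0, S=8000.0)
```

In the second call the outstanding order lifts the available stock to 6,000 t, which is not below the re-order point, so no new order is triggered.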
The pipeline is assumed to be highly reliable. In certain intervals, however, the pipeline
control system reports failures which are partly caused by misestimations of pressure but
sometimes also due to micro-leakages. In any case, an inspection of the identified pipeline
segment is necessary which forces an interruption of pipeline operations. It is assumed
that the time span between two alarms talarm (given in days) is Weibull distributed with
talarm ∼ WB(k = 1.5, λ = 365/Γ(5/3)) such that E(talarm) = 365. I.e. one year is expected to
elapse before a new alarm occurs.
The inspection time before the pipeline can be used again (tinsp) is also assumed to be
Weibull distributed with (tinsp − 1) ∼ WB(k = 1.5, λ = 3/Γ(5/3)) with a minimum inspection
time of 1 day in case of a false alarm. If the pipeline has to be repaired, the time span
expands considerably. The formula of the density function and a plot of the density for
talarm are given in the appendix in (A.1) and Figure A.4, respectively.
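
The choice of the scale parameter can be checked with a short Python sketch. It relies on the fact that a Weibull variable with shape k and scale λ has mean λ ⋅ Γ(1 + 1/k), so that λ = 365/Γ(5/3) yields a mean of 365 days for k = 1.5. The helper name, the seed, and the sample size are arbitrary choices for illustration.

```python
import math
import random

def weibull_scale(mean, shape):
    """Scale parameter that gives a Weibull variable the requested mean,
    using E(X) = scale * Gamma(1 + 1/shape)."""
    return mean / math.gamma(1.0 + 1.0 / shape)

shape = 1.5
scale_alarm = weibull_scale(365.0, shape)  # equals 365 / Gamma(5/3)

# Monte-Carlo check; random.weibullvariate takes (scale, shape) in this order
rng = random.Random(42)
sample = [rng.weibullvariate(scale_alarm, shape) for _ in range(20000)]
sample_mean = sum(sample) / len(sample)
```

The same helper can be applied to the shifted Weibull variables of this example by calibrating the scale to the mean of the unshifted part and adding the minimum duration afterwards.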
At the harbour, Naphtha orders are placed according to the same inventory policy. Here,
however, the ordered quantity is delivered after some transport time when the ship arrives
at the harbour. This time span depends on the supplier and the shipper organizing the
transport. The order lead time (tship) is considered as a Weibull-distributed random
variable with (tship − 2) ∼ WB(k = 2, λ = 5/Γ(1.5)). It is assumed that under optimal conditions
(i.e. closest supplier and immediately available shipment capacities) at least two days
elapse before an order can be received.
The objective is to minimize the inventory holding costs which comprise a) ordering/shipment
costs, b) stock holding costs and c) shortfall costs. The drivers of these costs
are a) the number δ of shipments, b) the average total inventory ¯l and c) the plant
utilization β.212 As discussed above, the precise cost rates of these three drivers are assumed
to be unknown and time-varying. The aim is to determine efficient configurations of the
inventory parameters and to quantify the trade-off between the three cost drivers.
From example 9 it is known that the pipeline transport is assumed to induce negligible
fixed costs such that a continuous supply of both sites minimizes the cycle stocks. Hence,
it is concluded that Si = si + ω^max_i for i = 1, 2 such that at both sites only safety stocks
212 β is calculated analogously to (4.18) and (4.16). ¯l is calculated as $\bar{l} = \frac{\sum_t \sum_i l_{it}}{T}$. δ simply counts the
number of non-zero orders at the harbour (qht > 0).

are at hand to cover shortfalls of the pipeline supply but no cycle stocks due to batch
building.213 If a pipeline is under repair or inspection, the demand to be covered from
stock depends on the inspection time tinsp (which is Weibull distributed) and production
modes of the crackers during this time (which follow a discrete Markov process). Hence,
a (closed-form) mathematical expression of the distribution of Naphtha consumption at a
site during a pipeline inspection is hard to derive. However, Monte-Carlo simulation can
be used to approximate the distributions of this measure at both sites.
Based on the density estimates, the corresponding loss functions are calculated straight-
forwardly. Since the Markov processes of both crackers are very similar, the loss func-
tions show a very similar pattern, differing primarily in the scale (which corresponds to
the crackers’ capacities). Figures 4.13a-4.13c show the estimated density functions and a
QQ-plot depicting the re-order levels to be held at both sites to achieve a specific β-service
level.214 I.e. each point in Figure 4.13c corresponds to a specific β-service level that is
achieved by setting the associated re-order levels at the sites.215 The loss functions for
both sites are shown in Figure A.5 in the appendix.
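
How a loss function and the associated re-order level can be obtained from a Monte-Carlo sample is sketched below in Python. The sketch works directly on the sample instead of a kernel density estimate, and the sample itself is a strong simplification: a constant daily consumption of 3,000 t is multiplied by a shifted Weibull inspection time, whereas the production modes of the crackers are ignored. These modelling shortcuts, the seed, and the sample size are assumptions made only for illustration.

```python
import math
import random

def expected_loss(sample, s):
    """Empirical loss function V(s) = E[max(X - s, 0)]."""
    return sum(max(x - s, 0.0) for x in sample) / len(sample)

def reorder_level(sample, beta, tol=1.0):
    """Re-order level s (up to tol) with beta-service level 1 - V(s)/E(X) >= beta,
    found by bisection on the non-increasing loss function."""
    mean = sum(sample) / len(sample)
    target = (1.0 - beta) * mean
    lo, hi = 0.0, max(sample)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_loss(sample, mid) > target:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical consumption during an inspection (in t), see assumptions above
rng = random.Random(7)
scale = 3.0 / math.gamma(5.0 / 3.0)
sample = [3000.0 * (1.0 + rng.weibullvariate(scale, 1.5)) for _ in range(5000)]

mean_x = sum(sample) / len(sample)
s95 = reorder_level(sample, 0.95)
s99 = reorder_level(sample, 0.99)
```

Evaluating `reorder_level` for a grid of service levels at both sites would yield pairs of re-order levels of the kind compared in Figure 4.13c.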
Based on the estimated density functions, the mean and the 95%-quantile are superim-
posed by the dashed lines in Figure 4.13a and Figure 4.13b whereby the latter coincides
with the re-order level ensuring a 95% α-service level. For both sites the density functions
of total consumption are bimodal and skewed to the right. The first mode is at zero con-
sumption and is inherited from the Markov chain of the production models. It corresponds
to situations when a pipeline inspection coincides with a cracker shut-down. The second
mode is inherited from the Weibull distribution determining the pipeline inspection time
which also causes the skewness.
From Figure 4.13c it can be seen that a linear relation between the re-order levels at
both sites exists. This relation can be used to reduce the set of parameters to be explored by
defining s2 = s1 ⋅ 2/3 whereby the slope of the fitted regression line in Figure 4.13c corresponds
almost exactly to the ratio of cracker capacities.
Accordingly, the relation between the local re-order level si and the expected loss during a
pipeline inspection period can be quantified.216 Local stocks protect against the shortfall risk
due to a pipeline break-down. Additionally, the variability of supply has to be examined.
Since it is generally advantageous to consolidate inventories (risk pooling), stocks at the
harbour should be held to buffer supply and production risk. Therefore, the parameters
(sh , Sh ) have to be determined controlling the Naphtha availability of both crackers and
the number of shipments for replenishment. The density of the total Naphtha consumption
during the order lead time is a convolution of Markov processes and the Weibull distributed
213 Here, ω^max_i denotes the maximum consumption rate of cracker i.
214 The density functions are derived by Gaussian kernel estimation with the default settings of the density
function from the stats package based on a 10,000-replicates sample, see R Core Team (2012). The
loss functions are calculated straightforwardly from the density functions’ estimates.
215 The leftmost point implies a β-service level of 5% whereas the rightmost point constitutes a β-service
level of 99.5%.
216 This corresponds to the expected cracker utilization and, hence, expected shortfall costs.

[Figure 4.13: Loss functions for Naphtha consumption during pipeline inspection and their
relation. Panels: (a) Density function for site 1; (b) Density function for site 2;
(c) Comparison plot of re-order levels. Axes in (a) and (b): consumption (in t) vs. density,
with dashed lines for the mean and the 95%-quantile; axes in (c): safety stock site 1 vs.
safety stock site 2.]

lead time. Figure 4.14a and Figure 4.14b show the simulated density and loss function of
this measure.217
[Figure 4.14: Density and loss function for total Naphtha consumption during order lead
time. Panels: (a) Density function for total Naphtha consumption, axes: consumption (in t)
vs. density, with dashed lines for the mean and the 95%-quantile; (b) Loss function for total
Naphtha consumption, axes: re-order point sh (in t) vs. expected loss V(r) (in t), with dashed
lines marking E(X) ⋅ (0.05) and V⁻¹(E(X) ⋅ (0.05)).]

Figure 4.14a shows no (clear) bimodality whereby the density for a zero-consumption is
positive. This is because the independent Markov chains of both crackers form a collapsed
Markov process whose state space and steady state vector are smoother. In both figures,
95%-service levels are superimposed by dashed lines (the α-service level in Figure 4.14a
and the β-service level in Figure 4.14b). Using this information about the central re-order
level at the harbour in combination with the information about the local re-order level, the
system’s total expected backlog could be calculated using the density and loss functions of
each stage. However, the interaction of local safety stocks and the central safety stock at
the harbour needs to be formulated explicitly which is quite cumbersome. Since the local
217 Again a Monte-Carlo simulation with a sample size of 10,000 is used. The estimation procedure is the
same as described before.

safety stocks also buffer lacks of supply due to delayed shipments, an independent setting
of the parameters si based on the individual loss functions underestimates the real capacity
utilization (and the β-service level). Furthermore, the deduction of the average stock level
at all three locations cannot be expressed in a closed form.218 Therefore, the complete
inventory system is simulated.
The system is modelled on a daily basis and implemented as an R program. Due to the
simplicity of the model, the program’s pseudocode is deferred to the appendix (Table A.19).
The NSGA-II algorithm is used for optimization with three parameters specifying a
configuration: s1, sh, and Sh. The remaining parameters are set according to the relations
discussed above. I.e. the re-order point of location 2 is given by s2 = s1 ⋅ 2/3 whereas the
order levels are expressed as S1 = s1 + 3,000 and S2 = s2 + 2,000. The shipment quantity
Sh − sh is restricted to be at least 15,000 tons which avoids too frequent deliveries and
ensures a sufficient utilization of the tanker fleets. Furthermore, lower bounds for the
re-order points are set to s1, sh ≥ 5,000 to ensure a minimal safety stock at both crackers.
The parameters’ upper bounds are given by the local tank capacities. The implementation
of the NSGA-II algorithm in the nsga2 function contained in the mco package of the R
environment is used.219 In this optimization procedure, the population size is set to 1,024
which is a sufficiently large sample for further statistical analysis. The number of
generations is set to 50 which results in a fairly good convergence to the Pareto front.220 The
remaining parameters are set to the default values.221
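
The mapping from the three decision variables to the full set of six policy parameters, together with the stated restrictions, can be sketched in Python as follows. The function name is hypothetical, the site tank capacities `cap1` and `cap2` are placeholders for the values of Table 4.4, and applying the upper bounds to the order levels is an assumption; only the harbour capacity of 75,000 tons and the relations between the parameters are taken from the text.

```python
def decode(s1, sh, Sh, cap1, cap2, cap_h=75000.0):
    """Expand (s1, sh, Sh) to all six policy parameters.

    Returns a dict of parameters, or None if a restriction is violated.
    cap1 and cap2 are placeholder tank capacities of the two sites.
    """
    s2 = s1 * 2.0 / 3.0  # linear relation between the local re-order points
    params = {
        "s1": s1, "S1": s1 + 3000.0,
        "s2": s2, "S2": s2 + 2000.0,
        "sh": sh, "Sh": Sh,
    }
    feasible = (
        s1 >= 5000.0 and sh >= 5000.0      # lower bounds of the re-order points
        and Sh - sh >= 15000.0             # minimum shipment quantity
        and params["S1"] <= cap1           # assumed: order levels fit the tanks
        and params["S2"] <= cap2
        and Sh <= cap_h
    )
    return params if feasible else None

# Hypothetical site capacities for illustration
ok = decode(9000.0, 20000.0, 40000.0, cap1=30000.0, cap2=20000.0)
too_small_lot = decode(9000.0, 20000.0, 30000.0, cap1=30000.0, cap2=20000.0)
```

In an optimization run such a function would be called for every candidate before the simulation is started, so that only admissible configurations are evaluated.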
Each configuration is simulated for a 10-year period to account for the rather long
intervals between pipeline inspections. No replications of a configuration are performed
since the variability of the performance measures primarily depends on the length of the
simulated time span. Averaging over a number of replications would in turn overweight
the warm-up phase of the simulation. In the warm-up phase the performance measures
are quite stable due to fixed initial stocks.222 The optimization using NSGA-II generates a
Pareto front consisting of 1,024 observations. Figure 4.15 visualizes the solutions found.
In Figure 4.15 the average total stock in tons per period is plotted against the number
of shipments ordered per period. Each Pareto-optimal solution is represented by a
grey-scaled point whereby the shade indicates the realized β-service level of this solution.
Bright solutions show a β-service level of about 40% whereas dark-coloured ones refer to a
β-service level close to 100%.
This sample can be used to describe the Pareto front by an appropriate surrogate function

218 See Schneider et al. (1995) for an approximation.
219 See Trautmann et al. (2010).
220 Experiments with 100 and 200 generations and smaller population size reveal no significant shifts in
the Pareto front.
221 The defaults are e.g. a mutation probability of 0.2 and a crossover probability of 0.7, see Trautmann
et al. (2010).
222 Initial stocks are 15,000, 10,000, and 50,000 tons for locations 1, 2, and the harbour, respectively.
In general, for such a simulation study there is a trade-off between multiple replications and one long
simulation run, see Law (2007, sec. 9.5).

[Figure 4.15: Scatterplot of Pareto front (β-service level in grey scale). x-axis: average
total stock level; y-axis: number of shipments (per period); legend: β-service level with
marked levels 46%, 84%, and 100%.]

which eases the determination of an optimal configuration in the following decision support
steps.
Among the three performance measures, the realized β-service level depends on the
other two. Let $i$ be the index of the Pareto-optimal solutions and denote by
$\bar{l}_i$, $\delta_i$ and $\beta_i$ the corresponding performance measures. Then the following model can be
used to approximate the observed configurations:

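$$\log\left(\frac{\beta_i}{1-\beta_i}\right) = \mu + \gamma_{11} \cdot \bar{l}_i + \gamma_{12} \cdot \delta_i + \gamma_{21} \cdot \bar{l}_i^{2} + \gamma_{22} \cdot \delta_i^{2} + \gamma_{31} \cdot \log(\bar{l}_i) + \gamma_{32} \cdot \log(\delta_i) + \gamma_{1} \cdot \delta_i \cdot \bar{l}_i + \epsilon_i . \tag{4.22}$$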

The model summary of (4.22), including parameter estimates, can be found in Table A.20.
The model shows a good overall fit, which is indicated by inconspicuous residual
plots (see Figure A.6) and a McFadden pseudo-R² of about 0.99. Using this model, the
Pareto front can be described continuously, as depicted in Figure 4.16.
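Since the left-hand side of (4.22) is the logit of the β-service level, an estimated service level for any pair of average stock and shipment frequency follows by inverting the logit. The symbol $\hat{\eta}$ for the fitted linear predictor is introduced here only for brevity and does not appear in the original model:

```latex
\hat{\eta}(\bar{l},\delta) = \hat{\mu} + \hat{\gamma}_{11}\,\bar{l} + \hat{\gamma}_{12}\,\delta
  + \hat{\gamma}_{21}\,\bar{l}^{2} + \hat{\gamma}_{22}\,\delta^{2}
  + \hat{\gamma}_{31}\log(\bar{l}) + \hat{\gamma}_{32}\log(\delta)
  + \hat{\gamma}_{1}\,\delta\,\bar{l},
\qquad
\hat{\beta}(\bar{l},\delta) = \frac{1}{1+\exp\bigl(-\hat{\eta}(\bar{l},\delta)\bigr)} .
```

By construction, $\hat{\beta}$ lies strictly between 0 and 1 for every admissible pair $(\bar{l},\delta)$ with $\bar{l} > 0$ and $\delta > 0$.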
In Figure 4.16 the Pareto front is evaluated over the area spanned by the sample depicted
in Figure 4.15. The grey scale, representing the β-service level, is the same as before.
Additionally, contour lines of constant β-service level are superimposed.
Figure 4.16 quantifies the trade-off between stock holding and shipping frequency
w.r.t. the β-service level. The shape of the substitution lines remains stable across all
β-service levels, but the substitution area (i.e. the length of the contour line) decreases for
high β-service levels (β > 99.5%). This implies that a decrease in average stock cannot
be compensated arbitrarily by an increase in shipping frequency. The minimum stock
level required to achieve a specific β-service level forms the upper end of each contour line. For

[Figure: level plot. x-axis: average stock level (10,000 to 60,000); y-axis: number of shipments (per period) (0.10 to 0.20); grey-scale legend: β-service level (0.5 to 1.0); labelled contour lines of constant β-service level.]

Figure 4.16: Grey-scaled levelplot of estimated Pareto front

instance, the average stock level required to achieve a β-service level of 95% varies between
approximately 20,000 and 38,000 tons.
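How such a contour line can be traced from a fitted model of the form (4.22) is sketched below. The coefficients are hypothetical and are not the estimates of Table A.20, stock is measured in units of 10,000 tons purely for numerical convenience, and the names `beta_hat` and `stock_for_target` are assumptions made for this illustration. The sketch relies on the estimated service level being increasing in stock, which holds for the chosen coefficients but would have to be checked for real estimates.

```python
import math

# Hypothetical coefficients for a model of the form (4.22); NOT the estimates
# of Table A.20. Stock is measured in units of 10,000 tons (assumption).
COEF = {"mu": -4.0, "g11": 1.0, "g12": 10.0, "g21": 0.0, "g22": 0.0,
        "g31": 1.0, "g32": 0.5, "g1": 2.0}


def beta_hat(stock, ships, c=COEF):
    """Estimated beta-service level: inverse logit of the linear predictor."""
    eta = (c["mu"] + c["g11"] * stock + c["g12"] * ships
           + c["g21"] * stock ** 2 + c["g22"] * ships ** 2
           + c["g31"] * math.log(stock) + c["g32"] * math.log(ships)
           + c["g1"] * ships * stock)
    return 1.0 / (1.0 + math.exp(-eta))


def stock_for_target(beta_target, ships, lo=1.0, hi=6.0, tol=1e-9):
    """Smallest stock in [lo, hi] reaching beta_target, found by bisection.

    Assumes beta_hat is increasing in stock on [lo, hi].
    """
    if beta_hat(hi, ships) < beta_target:
        return None  # target not reachable at this shipment frequency
    if beta_hat(lo, ships) >= beta_target:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_hat(mid, ships) < beta_target:
            lo = mid
        else:
            hi = mid
    return hi


# One point per shipment frequency traces the 95 % contour line.
contour_95 = [(ships, stock_for_target(0.95, ships))
              for ships in (0.05, 0.10, 0.15, 0.20)]
```

With these made-up coefficients the required stock falls as the shipment frequency rises, and a target that cannot be reached within the stock range yields `None`, mirroring the shrinking contour lines at very high service levels.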
Based on the individual loss functions depicted in Figure A.5a and Figure A.5b, local
re-order levels $s_1$ and $s_2$ of 18,000 and 13,000 tons are indicated, respectively. Setting
the re-order level $s_h$ at the harbour to about 42,000 tons (as Figure 4.14b suggests) and
the minimal order-up-to level $S_h$ to 57,000 tons results in an average stock level of
approximately 50,000 tons, accompanied by a β-service level close to 100% and a shipment
frequency of about 0.28 ships per day. As 0.28 exceeds the maximum shipment frequency among the
Pareto-optimal solutions, this configuration is clearly inefficient. Evidently, setting the
inventory parameters component-wise does not capture the system performance with
sufficient accuracy. Based on the estimated relation between the performance measures depicted in
Figure 4.16, however, the optimal configuration for given weights of all three measures
can easily be determined.

4.3.5 Decision support


Decision support systems (DSS) aim at helping the management of an organizational unit to
reach decisions on planning problems.223 Information about the relevant processes has to
be provided and an evaluation of potential solutions has to be possible.
Various types of DSS can be distinguished, of which mainly model-driven DSS are relevant
for the present work. Model-driven DSS use a model (optimization, simulation
or statistical) of some (sub-)system of the company to provide decision-relevant data. In
contrast, knowledge- or communication-driven DSS, for example, provide topic-specific expertise
or support shared working processes.224
Typically, DSS are integrated software packages providing three core components: 1)
a database, 2) a model base and 3) a user interface.225 The database component is
designed to store and analyse system-relevant data. Statistical analysis routines need
to be applicable to huge volumes of data. Data and their fast analysis strongly affect
the calculation of key performance indicators, e.g. by allowing bottlenecks to be detected.226
Furthermore, the data form the basis of the modelling component, for which suitable model
parameters have to be extracted.
The model base contains programs which form a decision- or problem-specific representation
of the system under study.227 Based on this core model, the performance of the
system can be evaluated under various configurations. Moreover, this component also
provides programs that facilitate the decision process, e.g. tools for ranking criteria
in multi-criteria problems228 and tools for analysing a specific configuration
in greater detail.229 The latter particularly address the comparison of multiple
configurations by a set of performance measures.
Finally, the user interface component allows managers to use the DSS without detailed
knowledge of its internal structure.230 Basically, the user interface eases the data
transfer between users and the DSS, whereby the users’ inputs reflect e.g. parameter ranges and
preferences regarding criteria. The outputs of a DSS contain a set of solutions (e.g. Pareto-optimal
solutions) in the multi-criteria case or the optimal solution in the single-objective case.
Information is often exchanged in a spreadsheet format, but graphical representations
of solutions are also provided.231 In the end, the usability of a software program is crucial for
the implementation of a DSS in a business organization.
To illustrate the post-processing of generated information from a simulation model for

223
See Power (2002, ch. 1).
224
For an encompassing classification see e.g. Power (2002, ch. 1).
225
See Power (2002, ch. 1).
226
The fast access to (statistical) analyses of huge databases is often referred to as on-line analytical
processing (OLAP), see e.g. Shim et al. (2002).
227
See Power and Sharda (2007).
228
Such as an analytic hierarchy process, see Power and Sharda (2007).
229
So-called decision analyses, see Power and Sharda (2007).
230
See Power and Sharda (2007).
231
See Shim et al. (2002).

a chemical SC, example 12 is continued.

Example 13 (Post-processing of simulation optimization results). At the end of a simulation
study either a final recommendation (e.g. in the form of an optimal configuration)
or a set of potential alternatives for the system considered has to be provided. In Example 12
a meta-model for the Pareto front was fitted which provides the set of potential
system states from which a specific solution has to be selected. After the management
has selected a specific solution, the corresponding configuration has to be extracted, ideally
without incurring additional computational effort for further simulation runs.
From the Pareto-front meta-model, however, the parameter configuration corresponding
to a specified point cannot be deduced directly. Instead, an approach is necessary which
responds quickly to varying preferences.
Let $x_i$ denote the $i$th configuration of the determined sample of the Pareto front and
$y_i$ the associated performance measures such that $y_i = \hat{H}(x_i)$, where $\hat{H}(\cdot)$ represents the
simulation model. Hence, the set of $z_i = (x_i, y_i)$ constitutes a sub-sample of the set of
efficient constellations, $z_i \in \bar{Z}^{eff} \subset Z^{eff}$.232 Based on the range of and the relations among
the y-components, a Pareto-front meta-model (4.22) is estimated which can be used to
describe the set of efficient constellations $Z^{eff}$ more precisely.
Let $\hat{y}_j = (\hat{\beta}_j, \delta_j, \bar{l}_j)$ denote an estimated performance vector, whereby $\hat{\beta}_j = f(\delta_j, \bar{l}_j)$ and
$f(\cdot)$ denotes the Pareto-front model (4.22). Then, an inverse mapping function $\hat{H}^{-1}$ is
required to obtain the corresponding $x_j$. However, the inverse of a simulation model cannot
be derived directly. Therefore, an approximation of $\hat{H}^{-1}$ is required.
Since simulation models are typically applied when an analytical model is unknown,
finding an appropriate analytic function that accurately mimics $\hat{H}^{-1}$ can be expected to be
impossible.233 Instead, RSM can be adapted. Originally, RSM was designed to find
(Pareto-)optimal configurations by iteratively approximating the simulation model’s local behaviour
using simple analytic functions (such as linear regression models). The idea is that simple
functions can provide a sufficiently accurate local approximation of the simulation model’s
behaviour as long as the region covered is sufficiently small. This idea can also be applied to
the problem at hand.
In classic RSM the region for which the simulation model is approximated moves iteratively
through the sample space (controlled by a gradient rule) to explore the performance
function of the system. Here, the region depends on the performance vector to be tested
($\hat{y}$). However, no additional simulation runs shall be performed. Instead, the already
evaluated constellations $(x_i, y_i)$ can be analysed. In this set, configurations with a performance
similar to $\hat{y}$ can be used to fit a local approximation of $\hat{H}^{-1}$ (say $\hat{\hat{H}}^{-1}$). $\hat{\hat{H}}^{-1}$ can
232
Note that in this example only the y-component of an evaluated constellation z determines the effi-
ciency, since all x values are stock control parameters and are finally reflected by the average stock
level which is a part of y.
233
Otherwise, it would be very likely that a closed-form analytic function also exists to approximate Ĥ
which would be a representation of H(⋅) (the real system). Hence, a simulation study would be
superfluous and this function would be a better analytic model.

then be used to estimate the configuration $\hat{x}$ that corresponds to $\hat{y}$ by evaluating $\hat{\hat{H}}^{-1}$ for
$\hat{y}$, i.e. $\hat{x} = \hat{\hat{H}}^{-1}(\hat{y})$. The procedure is briefly described as follows:

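1. determine a desired (Pareto-optimal) performance vector $\hat{y}$

2. determine a local subset of $\bar{Z}^{eff}$ depending on $\hat{y}$: $\bar{Z}^{eff}_{\hat{y}} \subset \bar{Z}^{eff}$

3. estimate a local (linear) model based on $(x_i, y_i) = z_i \in \bar{Z}^{eff}_{\hat{y}}$, i.e. $x_i = \hat{\hat{H}}^{-1}_{\hat{y}}(y_i) + \epsilon_i$

4. estimate a desired configuration $\hat{x}$ by $\hat{x} = \hat{\hat{H}}^{-1}_{\hat{y}}(\hat{y})$.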
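The four steps can be sketched in a few lines of plain Python. This is a simplified illustration under several assumptions that are not part of the source: the local subset is chosen as the k nearest sample points (a stand-in for the neighbourhood rule used in the example), the local model contains main effects only, there is a single scalar configuration parameter and two performance measures, and the data are made up so that x is exactly linear in y. All function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


def ols(rows, targets):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    return solve(xtx, xty)


def estimate_configuration(sample, y_hat, k=6):
    """Steps 2 to 4 for one scalar configuration parameter x."""
    # Step 2 (stand-in): the k sample points whose performance is closest to y_hat.
    def dist2(z):
        return sum((u - v) ** 2 for u, v in zip(z[1], y_hat))
    local = sorted(sample, key=dist2)[:k]
    # Step 3: local linear model x = b0 + b1*y1 + b2*y2, fitted by OLS.
    coef = ols([[1.0, *y] for _, y in local], [x for x, _ in local])
    # Step 4: evaluate the local model at the desired performance vector.
    return coef[0] + sum(c * v for c, v in zip(coef[1:], y_hat))


# Step 1: desired performance vector; the sample holds made-up pairs (x_i, y_i).
sample = [(2.0 + 3.0 * y1 - y2, (y1, y2))
          for y1 in (0.0, 1.0, 2.0, 3.0) for y2 in (0.0, 1.0, 2.0)]
x_hat = estimate_configuration(sample, y_hat=(1.4, 0.7))
```

No additional simulation run is needed: only already evaluated pairs enter the local fit, so the estimate can be recomputed quickly whenever the desired performance vector changes.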

In the described procedure, two components have to be specified: first, which type of model
to use for the local model $\hat{\hat{H}}^{-1}_{\hat{y}}$ and, second, how to determine $\bar{Z}^{eff}_{\hat{y}}$. In this example, for
the first component $\hat{\hat{H}}^{-1}_{\hat{y}}$ a simple first-order linear regression model with interaction terms
is estimated for each performance measure, i.e. for all $(x_i, y_i) \in \bar{Z}^{eff}_{\hat{y}}$ the following holds:

$$x_i = \mu + \Gamma_1 \cdot y_i + \Gamma_2 \cdot (y_i y_i) + \gamma \cdot (y_i y_i y_i) + \epsilon_i \tag{4.23}$$

whereby $\mu$ is the intercept vector, $\Gamma_1$ is the matrix of (ordinary) regression coefficients
(dimension 3 × 3), $\Gamma_2$ is the matrix of first-order interaction effects (dimension 3 × 3),
$(y_i y_i)$ is the vector of all three first-order interactions (i.e. $(y_i y_i) = (\beta_i \cdot \bar{l}_i, \beta_i \cdot \delta_i, \delta_i \cdot \bar{l}_i)$),
$\gamma$ is the vector of second-order interaction coefficients, $(y_i y_i y_i)$ is the scalar
second-order interaction (i.e. $(y_i y_i y_i) = \beta_i \cdot \bar{l}_i \cdot \delta_i$), and $\epsilon_i$ is the error vector. Note that for each
performance measure eight parameters have to be estimated (one intercept, three main effects,
three first-order interactions and one second-order interaction), which requires $|\bar{Z}^{eff}_{\hat{y}}| > 8$. In
the presented example this restriction was always fulfilled.234 The parameters are estimated
by ordinary least squares regression.
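The count of eight parameters per response can be checked by writing out the regressors of (4.23) for a single performance vector. The helper `design_row` and the ordering $y = (\beta, \bar{l}, \delta)$ are assumptions made for this sketch; the ordering is chosen so that the pairwise products appear in the same order as in the interaction vector given above, and the numbers are made up.

```python
from itertools import combinations
from math import prod


def design_row(y):
    """Regressors of (4.23) for one performance vector y = (beta, l_bar, delta).

    Returns the intercept, the three main effects, all three pairwise
    products and the single triple product, in that order.
    """
    row = [1.0]
    for order in (1, 2, 3):
        row += [prod(c) for c in combinations(y, order)]
    return row


# Made-up performance vector: beta = 0.95, l_bar = 3.0, delta = 0.1.
row = design_row((0.95, 3.0, 0.1))
```

Stacking one such row per neighbouring solution gives the design matrix of the local regression; with fewer rows than columns the least squares problem would no longer have a unique solution.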
To define a local subset of Pareto-optimal solutions close to $\hat{y}$, the Delaunay triangulation
is calculated for the set $\bar{Z} = \bar{Z}^{eff} \cup \{\hat{y}\}$. The Delaunay triangulation subdivides the convex
hull of a set of points into disjoint simplices. Each simplex consists of $d + 1$ points,
whereby $d$ denotes the dimension of the data set.235 A specific property of the Delaunay
triangulation is that for each simplex the circumhypersphere defined by its points is
empty, which implies that the Delaunay triangulation is unique.236 Let $D$ denote the set of
Delaunay simplices where each simplex $d_k$ is a set of $d+1$ points, i.e. $d_k = \{z_{k1}, \ldots, z_{k(d+1)} \in \bar{Z}\}$.
Now select the subset $D' \subset D$ whose simplices all contain $\hat{y}$: $D' = \{d_k \mid \hat{y} \in d_k\}$.
Finally, retain all $z_i \in \bar{Z}^{eff}$ such that $\exists d'_k \in D'$ with $z_i \in d'_k$, i.e. $\bar{Z}^{eff}_{\hat{y}} = \{z_i \mid \exists d'_k \in D' : z_i \in d'_k\}$.
Selecting $\bar{Z}^{eff}_{\hat{y}}$ in this way has the appealing property that neighbouring
points are distributed comparatively evenly around $\hat{y}$, which facilitates an accurate fit

234
If this restriction were violated, either a simpler model (e.g. one without interaction effects) would have
to be chosen or the neighbourhood $\bar{Z}^{eff}_{\hat{y}}$ would have to be enlarged.
235
See Delaunay (1934) for the original paper.
236
As long as the data set is in general position, i.e. no co-circular or degenerate subsets exist; see De Berg
et al. (2008, ch. 9) for a precise definition.

of $\hat{\hat{H}}^{-1}_{\hat{y}}$.237 In the two-dimensional case this is supported by the property that the
Delaunay triangulation maximizes the minimum angle of all simplices.238 Alternatively,
the procedure could be described briefly as finding all points adjacent to $\hat{y}$ in the Delaunay
graph.239
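For the two-dimensional case, the neighbourhood selection can be sketched with a brute-force test of the empty-circumcircle property: a sample pair forms a Delaunay triangle with the point to be tested exactly when no other point lies inside the circle through the three points. The sketch assumes points in general position and made-up data; it is quadratic in the number of candidate pairs and meant for illustration only, and the function names are hypothetical. In higher dimensions a library routine for Delaunay triangulations would be the more practical choice.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Centre and squared radius of the circle through a, b, c (None if collinear)."""
    d = 2.0 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    uy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return (ux, uy), (a[0] - ux) ** 2 + (a[1] - uy) ** 2


def delaunay_neighbours(points, target):
    """Sample points sharing a Delaunay triangle with target (2-D only)."""
    pts = list(points) + [target]
    neighbours = set()
    # Only triangles that contain the target matter for the local subset.
    for a, b in combinations(points, 2):
        circle = circumcircle(a, b, target)
        if circle is None:
            continue
        (ux, uy), r2 = circle
        # Delaunay property: no other point lies strictly inside the circumcircle.
        if all((p[0] - ux) ** 2 + (p[1] - uy) ** 2 >= r2 - 1e-9
               for p in pts if p not in (a, b, target)):
            neighbours.update((a, b))
    return neighbours


# Made-up sample: four points close to the target and four far away.
near = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
far = [(3.0, 3.0), (-3.0, 3.0), (3.0, -3.0), (-3.0, -3.0)]
local_subset = delaunay_neighbours(near + far, target=(0.0, 0.0))
```

In this example only the four nearby points are retained, and they surround the target from all sides, which is the even spread around the point to be tested that the selection rule aims for.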
To illustrate the determination of $\bar{Z}^{eff}_{\hat{y}}$, Figure 4.17 shows a Delaunay triangulation for
an artificial data set drawn from a bivariate uniform distribution.

[Figure: legend entries: sample points ($Z^{eff}$), point to be tested ($\hat{y}$), sub-sample points ($Z^{eff}_{\hat{y}}$).]