
Energy 183 (2019) 776–787


Energy performance analysis of continuous processes using surrogate

Benedikt Beisheim a, *, Keivan Rahimi-Adli a, Stefan Krämer b, Sebastian Engell c
a INEOS Manufacturing Deutschland GmbH, Alte Strasse 201, 50769, Köln, Germany
b Bayer AG, Engineering & Technology, 51368, Leverkusen, Germany
c Department of Biochemical and Chemical Engineering, Process Dynamics and Operations Group, Technische Universität Dortmund, Emil-Figge-Str. 70, 44221 Dortmund, Germany

Article info

Article history:
Received 28 October 2018
Received in revised form 13 May 2019
Accepted 24 May 2019
Available online 18 June 2019

Keywords:
Energy performance indicators
Energy baseline
Energy management systems
Surrogate models
Process monitoring

Abstract

Energy intensity is a commonly used key performance indicator (KPI) for the energy performance of production processes and often serves as an Energy Performance Indicator (EnPI). The energy performance of a process depends on a variety of factors like capacity utilization, ambient temperature and operational performance. Understanding the influence of these factors on the relevant KPI or EnPI helps to distinguish between influenceable and non-influenceable contributions and to identify the improvement potential. By describing the best historically observed performance as a function of the non-influenceable factors, valuable information on the efficiency of the current operation of a plant and the improvement potential is provided to plant managers and operators. In this contribution, a method is proposed to identify a surrogate performance model for the attainable energy performance considering the relevant factors. The modeling method is based solely on the evaluation of historical process data and employs a novel combination of known surrogate modeling techniques using clustering, model fitting and model simplification by backward elimination. The method is applied to real process data of a large industrial production plant and the use of the model for process performance monitoring and reporting in accordance with energy management system requirements is illustrated and discussed.
© 2019 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license

* Corresponding author.

1. Introduction

1.1. Evaluation of the energy performance of industrial processes

The industry in Europe faces enormous challenges regarding productivity and competitiveness. In the chemical industry, significant investments in new production plants are made outside of Europe in countries with lower prices of feedstock and energy [1]. This trend is expected to continue and these new plants are often highly efficient, while the plant inventory is relatively old in Europe. In addition, companies have to respond to the political and societal pressure asking for a more sustainable production. Thus, the European process industry has to increase its energy efficiency to compete with other regions in the world and to comply with goals that are defined on the societal level, e.g. the reduction of the carbon footprint.

In Germany, the problem of the competitive disadvantage for energy intensive industries caused by high energy prices due to the energy transition was addressed by a reduction of levies under certain prerequisites [2]. As one of these prerequisites, energy intensive companies need to operate a certified energy management system according to ISO 50001:2018 [3] or EMAS [4]. Certified sites or companies commit themselves to continuously enhancing the environmental and energy performance. Morrow and Rondinelli [5] point out that the introduction of environmental management systems is motivated by the desire to improve the performance beyond regulatory compliance. However, the attribution of improvements to the adoption and certification of management systems is difficult. Poksinska et al. [6] point out that management systems can be used as a toolbox for environmental performance enhancements but that the certification alone is insufficient.

ISO 50001:2018 demands the use of energy performance indicators (EnPI) which have to be compared with an energy baseline.

Nomenclature

Symbol    Explanation
AIC       Akaike information criterion
BDP       Best demonstrated practice
b         Base function
C_j       Set of points in cluster
c         Cluster center
c         Continuous state
d         Discrete state
EnB       Energy baseline
EnPI      Energy performance indicator
f         Function
i         Counting variable
j         Counting variable
KPI       Key performance indicator
MILP      Mixed-integer linear program
MIQP      Mixed-integer quadratic program
NP        Non-deterministic polynomial time
OIP       Operational improvement potential
OLR       Ordinary least squares regression
UV        Unit variance
B         Set of basis functions
E         Energy
F_k       Model error ratio
H         Stationarity condition
k         Counting variable
lb        Lower bound
ṁ_P       Production rate
N         Number of data points
n         Number
P(·)      Probability
P_i       i-th percentile
PT        Pretreated
R_j       Set of points for the identification of the representative point
r         Representative point of the cluster
S         Subset of active basis functions
S         Number of selected basis functions
S_U       Maximum number of allowed basis functions
T_amb     Ambient temperature
T_ref     Reference temperature
t         Time
U         Data matrix
U         Loose bound for numerical optimization
u_t       Vector of states at time t
u         Data value
ub        Upper bound
V_c       Domain of continuous states
V_d       Domain of discrete states
X_ij      ij-th entry of the transformed input data matrix
x_i       Pretreated and scaled data vector
y         Output variable
z         Data point
β         Regression factor
γ         Backward elimination error threshold
ε         Model error
μ         Mean
σ         Standard deviation
Φ         Objective function
ω         Cluster weight

The indicator and the baseline can be chosen by the users of the management systems. As a result, a variety of indicators and baselines are used. A major challenge is that the performance of production processes is rarely constant as ambient conditions, feedstock quality, throughput and the maintenance status of the processes vary over time. The influence of these variations on the process performance must be understood for a meaningful analysis of the effect of measures to improve the energy efficiency, both for long term analysis (changes of the performance due to production levels) and for short term analysis of the improvement potential under certain conditions, e.g. the outside temperature.

Using the ambient temperature as an illustrative example, it can be seen that a single influencing factor can have different and contradictory effects on process performance. On the one hand, low temperatures have a negative influence on the energy demand of a process since low temperatures increase the heat loss through pipes and very low temperatures even require heat tracing. On the other hand, low ambient temperatures decrease the energy demand in cooling water production and absorption processes perform better at lower temperatures.

The knowledge of these factors and their influence on the process helps to explain performance deviations of the processes and consequently to identify whether fluctuations are induced by non-influenceable external disturbances or are due to sub-optimal operation. The minimization of the duration of periods of sub-optimal operation improves the efficiency, the economic performance and the environmental footprint of the processes.

1.2. Overview of currently available evaluation techniques

The evaluation of process performance and its comparison with modeled, historical or literature data is common practice in the process industry.

Energy efficiency benchmarks for different countries and industrial sectors are available (e.g. Lässig et al. [7] for Germany, Phylipsen et al. [8] for the Netherlands or Makridou et al. [9] on a European scale). There are also several studies available that compare the performance of different production plants of the same kind using a variety of indicators. The meta-study by Saygin et al. [10] revealed the energy saving potential for several processes and countries based on the Best Practice Technology (BPT), which is defined as the top 10% of the available processes with respect to energy efficiency. This evaluation is helpful on high managerial levels or for policy makers to compare the plant portfolio with plants of other companies or in other countries to be able to estimate possible energy saving potentials [11]. However, it is not adequate to use these literature benchmarks to evaluate the current operational performance of an existing plant as the best-in-class efficiency might not be attainable with the given equipment.

The process industry has already developed solutions for the evaluation of the improvement potential and also standardised them in some cases. Bayer developed a holistic methodology to identify the most energy efficient operation of a production plant [12]. In this method, two different categories of inefficiencies for chemical processes are identified: dynamic losses and static losses. Dynamic losses are related to the operation of the given equipment, whereas static losses are related to the choice of suboptimal equipment, a lack of heat integration or the use of outdated

technology. This methodology introduced the term Best Demonstrated Practice (BDP) for the, according to the present knowledge, best possible mode of operation of a plant. It is based on rigorous models or on linear regression of process data after outlier removal. The definition of the best demonstrated practice aims at indicating the optimal mode of operation at a specific set of non-influenceable circumstances like ambient conditions or feedstock quality. The method found its way into the NAMUR Recommendations NE 140 and NE 162 [13,14].

The use of rigorous models for process performance monitoring is the best approach as it provides the most reliable information. Model-based simulation of chemical processes is a well developed field in academia and industry [15,16]. Many approaches for the efficient modeling of processes are documented (e.g. Dowling and Biegler [17]) and commercial software is available and used extensively in industry for process optimization (e.g. AspenPlus or gPROMS and many inhouse simulators). A detailed rigorous model is suitable for plant optimization and improving process automation but the development effort is typically not justified if the model is only used for process monitoring. In general, the complexity of a model has to match the given task [18]. If sufficiently precise rigorous models cannot be developed under the constraints of time and budget, surrogate models are an alternative that gained popularity in recent years.

A surrogate model is a parameterized mathematical structure (also called black-box model) that is fitted to the observations. Such models only represent the data that has been used to parameterize them, so extrapolation usually is not possible and their accuracy depends on the density, accuracy and reproducibility of the available data. Their development also requires considerable efforts in data pre-treatment and in the choice and parameterization of the model structure. As an additional advantage, surrogate models can typically be evaluated quickly, which speeds up process optimization [19].

For process monitoring and reporting, rather simple models are commonly used. ISO 50006:2017 [20] proposes the use of linear regression to account for different factors affecting the energy performance of a process like the daily temperature figure or the process utilization. Linear regression and principal component analysis are effective and established methods to identify models to monitor the performance of batch and continuous processes [21,22].

Linear regression models are relatively easy to fit to data, but they are not able to capture the nonlinearities that often are present in chemical processes. Kriging models [23] and artificial neural networks [24,25] are two of the most popular nonlinear surrogate models. Such surrogate models have been successfully employed in various fields. E.g. Audet et al. [26] use Kriging models in order to optimize the wing planform design of airplanes. Neural networks and Kriging models however may exhibit some "roughness" of the response surfaces which can make it difficult to use them for optimization using derivative-based methods [27].

1.3. Scope of this contribution

The requirements for the certification of energy management systems are continuously refined and the factor-based normalization of energy related indicators is increasingly in the focus of certified companies and certification bodies. However, to the best of the authors' knowledge, concepts beyond multivariate linear regression are not documented in the literature nor are they part of the current standards for energy management systems. In many cases, the use of linear regression models is insufficient since the performance curves of the process data clearly indicate nonlinear behavior. Moreover, the available concepts focus on the average performance of a process. The best possible operation is not considered during the performance evaluation. Thus, these concepts can only illustrate the differences in the performance during different time intervals. They are not able to assist plant personnel in improving the current operation as the average performance is not suitable for this purpose. Although different concepts for the assessment of process performance exist in the literature (e.g. [15,22]), they have not yet been used for energy management systems. Currently, there exists a gap between the methods proposed in the literature and the application in industry [28,29].

In order to close this gap, this contribution proposes a method to identify a model of the best demonstrated practice (the best observed process performance depending on the values of the influencing factors). The ultimate goal is to use this model to analyse the current process performance and to inform plant operators and plant managers where there is most likely room for improvement. Because of this intended use of the model, the best performance is described rather than the average performance, distinguishing the method from other applications of regression models in energy management.

The method is based on a statistical evaluation of historical data and a nonlinear interpolation of the BDP as a function of the non-influenceable factors. Only limited process knowledge is necessary to obtain this "baseline" which can be used to monitor the process behavior and to identify the potential for operational improvements. The resulting model represents the most important influencing variables so that a normalization of the process performance with respect to these factors is possible.

The method provides a basis for the evaluation of complex production plants and sites using decomposition and aggregation methods as proposed in Beisheim et al. [30] and Beisheim et al. [31].

Prior to the investigation of the BDP, a suitable indicator has to be chosen. In the industrial context several frameworks for the choice of indicators are available (e.g. Giljum et al. [32], Huysman et al. [33], Kujanpää et al. [34] or Beisheim et al. [31]). Two main types are applied in these frameworks: efficiency and intensity indicators. In general, the efficiency is the ratio of useful output to total input. The intensity is the reciprocal value of the efficiency. Throughout this paper, the Energy Performance Indicator EnPI is used, which is defined as:

$$ \mathrm{EnPI}_j = \frac{E_j}{m_P}, \tag{1} $$

where $E_j$ denotes the energy demand and $m_P$ the amount of product in specification during a specified time interval. $j$ denotes the type of the energy source. In chemical processes different energy and utility sources are utilized, e.g. steam, electricity, pressurized air or cooling water. BDP models for each of these sources can be identified separately or cumulated indicators can be used (e.g. by using an Energy Currency [31]). The operational improvement potential (OIP) is defined as the difference between the EnPI and the BDP at the given non-influenceable conditions:

$$ \mathrm{OIP} = \mathrm{EnPI} - \mathrm{BDP}. \tag{2} $$

The proposed concept is independent of the chosen source of energy. Generally, indicators should be used that can be derived from measurements of physical flows.

The generation of a BDP model is a multi-step procedure which is visualized in Fig. 1. Since the approach is data driven, data acquisition is the first step of the method. The raw data is pretreated by mean centering and unit variance (UV) scaling to have the same range of variation of all variables. The data is then clustered to identify different operational regimes and to reduce the

Fig. 1. BDP identification process.

number of data points that are used in the model identification step. The model is constructed using surrogate modeling techniques. The different steps of the modeling procedure are explained in detail in Section 2. The application to real plant data from INEOS in Cologne is described in Section 3.

The novelty of this contribution is the combination of state-of-the-art methods for data clustering and model identification which were not yet used in the context of the computation of energy baselines. The use of the best demonstrated practice enhances the usability of the energy baseline for both real-time performance monitoring and reporting. The use of a clustering algorithm responds to the issue of differently populated operational regimes in the model identification. The model identification combined with a statistical backward elimination of influence factors with minor significance generates a simple, non-linear model which is accessible for people with limited mathematical knowledge. Thereby, the gap between concepts in the scientific literature and large-scale application in industry can be closed.

2. Modeling procedure

2.1. Data acquisition

The first step is the acquisition of measurement data. The use of reconciled data is advised to avoid data inconsistency due to measurement uncertainties and gross errors. Data reconciliation is based on physical laws, in particular on conservation laws for mass and energy. If such relationships are not available, the method can also be applied to the raw data, possibly after removal of outliers and smoothing. If data reconciliation is applied to historical data, the same method should be used for the live data when the model is used for monitoring.

In what follows, it is assumed that measured values of the quantities which enter the model are available in sufficient quality. The more sampling points are available, the better the representation of the behavior of the process. The occurrence of significant process modifications has to be taken into account when choosing the time interval for the determination of the BDP. Data prior to modifications may not be suitable for the model identification of the current BDP, but conversely a significant time period is necessary to collect sufficient process data to identify a reliable and accurate BDP model.

Besides data on flows of energy and materials, information on other factors influencing the process performance is needed. Such factors can be process conditions such as the ambient temperature as well as disturbances, e.g., the quality of feedstock. In many cases, plant personnel have a good overview of which external influences affect the process performance. Conducting interviews before starting the modeling effort is a way to narrow down the set of factor candidates. The ambient temperature should be initially included in every analysis in order to be in accordance with normative standards (cf. ISO 50006:2017 [20]).

Measurement data is typically available in a process information management system (PIMS), which archives measurement data continuously at different sampling times (due to internal compression). When queried, the system interpolates the stored data. Therefore, data that was recorded at different sampling rates can be used for the computation of the BDP model.

The sampling time has to suit the purpose. For real-time monitoring, the compression interval of the data for model identification has to be a reasonable fraction (1/2 to 1/5) of the time constant of the process. Much smaller sampling times do not provide additional information as the rate of change of the process is limited by the time constant, whereas longer intervals prevent the early identification of suboptimal operation and increase the reaction time of plant personnel.

The data has to be representative for typical operational scenarios. Abnormal regimes must be excluded from the data to avoid the identification of a non-representative BDP model. The data must be provided with consistent time stamps. For example, if lab data is used, the time when the sample was taken is important, not the time at which it was sent to the lab or when it was analyzed.

The result is a data matrix $U$ where each row represents one time step $t_j$ and each column corresponds to a state variable. In this matrix $c_i$ denotes a continuous state, $d_i$ a discrete state, $n_t$ the number of time steps, $n_c$ the number of continuous variables and $n_d$ the number of discrete variables:

$$ U = \begin{pmatrix} u_{t_1} \\ \vdots \\ u_{t_{n_t}} \end{pmatrix} = \begin{pmatrix} u_{c_1} & \cdots & u_{c_{n_c}} & u_{d_1} & \cdots & u_{d_{n_d}} \end{pmatrix} = \begin{pmatrix} u_{t_1,c_1} & \cdots & u_{t_1,c_{n_c}} & u_{t_1,d_1} & \cdots & u_{t_1,d_{n_d}} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ u_{t_{n_t},c_1} & \cdots & u_{t_{n_t},c_{n_c}} & u_{t_{n_t},d_1} & \cdots & u_{t_{n_t},d_{n_d}} \end{pmatrix}. \tag{4} $$

2.2. Data pretreatment

The next step after data acquisition is data pretreatment. Three steps are performed: outlier removal, classification of data and mean centering and unit variance (UV) scaling.

Outlier removal is performed by filtering data by specifying upper and lower bounds which define the typical operational envelope. All data vectors at $t_i$ in which at least one variable is outside the specified limits are removed. This step is a preliminary one so the limits should not be chosen to be too narrow. After clustering, a second outlier removal step is applied. It is therefore recommended to keep as many data points as possible for clustering and to remove potential outliers by means of selecting the BDP of the clusters (see below).

The available data is classified into continuous and discrete states. Some processes are operated in different distinct regimes which may not be under the influence of the operators. Consequently, the computation of the BDP should consider this information.

For continuous processes, the analysis is based upon the assumption that the process is stationary, unless storage terms are taken into account in the computation of the indicators. The values of the energy KPI during large transients will possibly be misleading and when the data is collected automatically, a stationarity analysis should be performed, either manually or automatically. Methods for this are available, e.g. comparing the variance

obtained from the raw data with the variance obtained from filtered data and performing an F-test [35,36]. If the data was collected at stationary points for most of the time, KPI values in transient situations will also be filtered out by the exclusion of the outermost percentiles of the data. When the actual operation is compared to the BDP, care must be taken to only compare data from periods where the process is stationary.
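
As a rough illustration of such a variance comparison, the following sketch contrasts the variance of a data window about its mean with a noise variance estimated from successive differences, and flags the window as stationary if the ratio stays below a threshold. This is a simplified stand-in and not the exact procedure of [35,36]; the function names, the threshold `f_crit` and the synthetic signals are assumptions made for this example.

```python
import random


def variance_ratio(window):
    """Ratio of total variance to a difference-based noise variance estimate."""
    n = len(window)
    if n < 3:
        raise ValueError("window needs at least three samples")
    m = sum(window) / n
    # Variance about the window mean: inflated by drifts and transients.
    var_total = sum((v - m) ** 2 for v in window) / (n - 1)
    # Half the mean squared successive difference: largely insensitive to slow drift.
    var_noise = sum((b - a) ** 2 for a, b in zip(window, window[1:])) / (2 * (n - 1))
    if var_noise == 0.0:
        raise ValueError("window is constant, ratio undefined")
    return var_total / var_noise


def is_stationary(window, f_crit=2.0):
    """f_crit is a hypothetical threshold; in practice it would come from an F-table."""
    return variance_ratio(window) <= f_crit


rng = random.Random(0)  # fixed seed so the example is reproducible
steady = [10.0 + rng.gauss(0.0, 1.0) for _ in range(400)]
ramp = [10.0 + 0.05 * i + rng.gauss(0.0, 1.0) for i in range(400)]

print(is_stationary(steady), is_stationary(ramp))
```

For pure measurement noise both variance estimates agree and the ratio is close to one, whereas a drift inflates only the first estimate. Autocorrelated noise would bias the ratio, so the threshold has to be tuned for the data at hand.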
The resulting input to the next step in the BDP modeling procedure is a pretreated data matrix $U_{PT}$ as defined below:

$$ U_{PT} = \Big\{ u_t \;\Big|\; u_{lb,c_i} \le u_{t_j,c_i} \le u_{ub,c_i} \;\; \forall c_i \in V_c,\; j \in [1,\dots,n_t] \;\wedge\; u_{t_j,d_k} = u_{d,d_k} \;\; \forall d_k \in V_d,\; j \in [1,\dots,n_t] \Big\} \tag{5} $$

$V_d$ is the domain of the discrete variables and $V_c$ the domain of the continuous variables. $u_{lb,c_i}$ and $u_{ub,c_i}$ denote the lower and upper bounds of the continuous variable $i$. $u_{d,d_k}$ denotes the value of the discrete variable $k$ of the plant at time point $t_j$.
The pretreated data matrix contains those samples of the time
series that represent normal operating conditions of the plant.
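
The filtering expressed by Eq. (5) can be sketched in a few lines. The example below is only an illustration under assumed names: the variables `T_amb`, `load` and `mode`, the bounds and the sample values are hypothetical and not taken from the plant data discussed in the paper.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, float]


def pretreat(
    samples: List[Sample],
    bounds: Dict[str, Tuple[float, float]],
    discrete_state: Dict[str, float],
) -> List[Sample]:
    """Keep the time steps that lie inside the operational envelope.

    A sample is kept only if every continuous variable is within its
    [lower, upper] bound and every discrete variable equals the selected state.
    """
    kept = []
    for s in samples:
        in_bounds = all(lo <= s[name] <= hi for name, (lo, hi) in bounds.items())
        in_regime = all(s[name] == value for name, value in discrete_state.items())
        if in_bounds and in_regime:
            kept.append(s)
    return kept


# Hypothetical data: ambient temperature in degC, relative load, operating mode.
samples = [
    {"T_amb": 12.0, "load": 0.85, "mode": 1},
    {"T_amb": 55.0, "load": 0.80, "mode": 1},  # temperature outside the bounds
    {"T_amb": 8.0, "load": 0.90, "mode": 2},   # different discrete regime
    {"T_amb": -3.0, "load": 0.60, "mode": 1},
]
bounds = {"T_amb": (-20.0, 40.0), "load": (0.5, 1.0)}
u_pt = pretreat(samples, bounds, {"mode": 1})

print(len(u_pt))
```

A whole time step is dropped as soon as one variable violates its limits, which mirrors the removal of complete data vectors described for the outlier removal step.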
One goal of the model identification is to find factors that have a significant influence on the energy performance of the plant. In a later step, a clustering algorithm is applied to the data. Since this algorithm is sensitive to the magnitude of the variation of the process data, data normalization is necessary to prevent variables with large numerical values from dominating the result.
A standard procedure to scale data to the same magnitude is mean centering and standard deviation scaling. Each column vector of the pretreated data matrix $U_{PT}$ is corrected by subtracting its mean and dividing it by the standard deviation:

$$ \hat{u}_{PT,c_i} = \frac{1}{\sigma(u_{PT,c_i})} \left( u_{PT,c_i} - \mu(u_{PT,c_i}) \cdot \mathbf{1} \right) \quad \forall c_i \in V_c \ \text{and} \ \dim \mathbf{1} = \dim u_{PT,c_i}, \tag{6} $$

where $\hat{u}_{PT,c_i}$ denotes the scaled vector of the continuous variables.
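
A minimal sketch of the scaling in Eq. (6) for a single column is given below. It assumes the sample standard deviation (divisor n - 1), which the text does not specify, and the temperature values are made up for the example.

```python
from statistics import mean, stdev


def uv_scale(column):
    """Mean-center a column and scale it to unit variance, as in Eq. (6)."""
    mu = mean(column)
    sigma = stdev(column)  # assumption: sample standard deviation (n - 1)
    if sigma == 0.0:
        raise ValueError("constant column cannot be scaled to unit variance")
    return [(u - mu) / sigma for u in column]


t_amb = [2.0, 4.0, 6.0, 8.0]  # hypothetical ambient temperatures
scaled = uv_scale(t_amb)

print(scaled)
```

The mean and standard deviation obtained from the historical data would have to be stored and reused when live data is scaled for monitoring, so that both are expressed in the same coordinates.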

The resulting data matrix has a zero mean and standard deviation of one for each variable that is used for model identification, which is why the method is called unit variance (UV) scaling. The effect of this pretreatment step is illustrated in Fig. 2. For a simplified presentation, a pretreated time series vector $\hat{u}_{PT,t_i}$ is further denoted as $x_i$. $X$ denotes the domain of the vectors $x_i$.

Fig. 2. Illustration of the effect of mean centering and variance scaling on process data as in Eriksson [37].

2.3. Data clustering

The third step is data clustering. The motivation of clustering is to compress the data to a set of characteristic points in the data space to which the model is then fitted. Two major types of data analysis techniques are found in the literature [38]:

1. Exploratory or descriptive techniques where pre-specified models and hypotheses are not available. The goal is to explore the structure of high-dimensional data sets.
2. Confirmatory or inferential techniques where the validity of a hypothesis or of a set of assumptions is verified.

Two general data analysis approaches for pattern recognition are available [39]: classification methods and clustering methods. Both aim at predicting a behavior based on training data. While the first approach is based on categorized data, the latter uses uncategorized data. The classification of the dataset for the identification of the best demonstrated practice according to the discrete states of the plant operation has been performed during data pretreatment. For each discrete operating state or suitable sets of states a clustering algorithm is applied. For data clustering, two main types can be distinguished: hierarchical and partitional clustering. Jain [39] proposes partitional algorithms for pattern recognition and thus partitional clustering is used for BDP identification.

Here, the kmeans++ algorithm [40] is applied to the data set. It is an extension of the kmeans algorithm [41].

Optimal clustering is NP hard, therefore randomized algorithms are often used. The quality of the clustering then depends on the initial guess for the cluster centers. For kmeans, the initial centers are chosen arbitrarily. The kmeans++ algorithm is a compromise between a better initial guess for the initial points and a justifiable increase in computation time. In this algorithm, the probability of choosing a candidate as a new cluster center is based on a weighting by its distance to the closest cluster center. It is performed as follows:

1. Choose one cluster center $c_1$ randomly from $x \in X$.
2. Choose a new center $c_2$ randomly from $x \in X$ with the probability of choosing $x_i$:

$$ P(c_2 = x_i) = \frac{D(x_i)^2}{\sum_{x \in X} D(x)^2}, \tag{7} $$

where $D(x)$ denotes the distance of $x$ to the nearest cluster center.

3. Repeat step 2 until $k$ initial cluster centers have been chosen.

After finding the initial cluster centers, the kmeans algorithm works as follows:

1. For each $j \in \{1, \dots, k\}$, $C_j$ is the set of points in $X$ where the distance to the $j$th cluster center is the smallest among all centers.
2. For each $j \in \{1, \dots, k\}$, the updated cluster centers $c_j$ are computed by calculating the center of mass of all points in the corresponding cluster $C_j$:

$$ c_j = \frac{1}{|C_j|} \sum_{x \in C_j} x, \tag{8} $$

where $|C_j|$ is the cardinality of $C_j$.

3. Repeat steps 1 and 2 until the cluster centers remain constant.

The algorithm is available in many standard engineering software packages (e.g. MATLAB). For large data sets with a high number of clusters other algorithms are available (e.g. kmeans|| [42]).

The sensitivity of the kmeans++ algorithm to the magnitude of the input data due to the use of the Euclidean distance of the data points from the cluster centers explains the necessity of mean centering and UV scaling.

The resulting clusters represent typical operational domains of a plant. As the idea of the identification of the BDP model is to find the most efficient operation regimes, using the cluster center for model identification is not suitable. Instead, a representative point from the cluster is chosen based on percentile filtering.

For this purpose, the following computation is used:

$$ r_j = \frac{1}{|R_j|} \sum_{x \in R_j} x \tag{9} $$

$$ x \in R_j \;\; \forall \; P_{j,n} \le \mathrm{EnPI}(x) \le P_{j,m} \wedge x \in C_j \tag{10} $$

$$ R_j \subset C_j \subset X \tag{11} $$

$$ \mathrm{EnPI}_{R_j} = \frac{1}{|R_j|} \sum_{x \in R_j} \mathrm{EnPI}(x), \tag{12} $$

where $R_j$ denotes the set of points which are used to identify the representative point of the cluster $r_j$ and $P_{j,n}$, $P_{j,m}$ are the lower and upper percentile bounds for cluster $j$.

The usage of percentiles to remove outliers is a common procedure [43]. Here, the data points between the fifth and the tenth percentile of each cluster are used. The BDP in this context represents good or very good and reproducible operational regimes that are not the best-ever operating points but that are achievable for the plant and the operators. This procedure also excludes dynamic effects that were not removed previously and that falsely indicate a very good or very bad performance, so the usage of percentiles makes the procedure more robust. The smaller the upper percentile is chosen, the more challenging it is to match the BDP during operations. Consequently, it is a tuning parameter for the data analysis. The resulting cluster representatives are used for model identification in the next step.

2.4. Model selection and parametrization using an adapted ALAMO

The choice and parametrization of a surrogate model in the context of determining the best demonstrated practice is performed using an adapted ALAMO approach. Automated Learning of Algebraic Models for Optimization (ALAMO) is a software package developed by Cozad et al. [27] for the efficient development of surrogate models that are suitable for simulation-based optimization. It generates simple and accurate models from simulated or experimental data that are convenient for derivative-based optimization software. In order to overcome the shortcomings of linear regression models, ALAMO selects a combination of simple basis functions that fit the responses with an acceptable accuracy. The set of basis functions is defined by the user based on process knowledge, expected physical or chemical relationships etc. When the number of basis functions is increased, the regression has a larger number of degrees of freedom and starts to capture the noise or other secondary features of the training dataset which results in so-called "overfitting". As an outcome of overfitting, the model has a low bias but a high variance. This means that a small change in the input variables can result in an unrealistic change in the responses [44]. In order to prevent this behavior, ALAMO utilizes a model fitness measure that handles the trade-off between the goodness of the fit and the complexity of the model.

In the following, the mathematical background of model fitting in ALAMO is briefly discussed and an adaptation of this method for the application discussed here is presented.

The general idea behind the model identification step in ALAMO is the determination of the best combination of predefined basis functions and regression factors to represent the process data. The general formulation for a BDP model is given as:

$$ \mathrm{BDP} = \sum_{i=1} \sum_{j=1}^{n} \beta_{ij} f_i(x_j), \tag{13} $$

where $x_j$ denotes the model input variable $j$ and $f_i$ denotes the basis function $i$. $\beta_{ij}$ denotes the regression factor for basis function $i$ and model input $j$.

As presented by Cozad et al. [27], ALAMO solves a nested optimization problem of the following general form:

$$ \min \; \Phi_{\beta,y}(\beta, y)|_S + \Phi_S(S) \tag{14a} $$

$$ \text{s.t.} \;\; S < S_U \tag{14b} $$

$$ \min_{\beta,y} \; \Phi_{\beta,y}(\beta, y)|_S \tag{14c} $$

$$ \text{s.t.} \;\; \sum_{j \in B} y_j = S \tag{14d} $$

$$ \beta^{l} y_j \le \beta_j \le \beta^{u} y_j \quad j \in B \tag{14e} $$

$$ y_j = \{0, 1\} \quad j \in B, \tag{14f} $$

where the objective function of the outer optimization problem (Eq. (14a)) is a model fitness measure, which consists of two terms. The first term represents the accuracy of the model whereas the second term represents the model size (complexity). The inner optimization problem deals with the selection of the basis functions and the corresponding parameters for a given model size. $y_j$ is

a binary variable, which is one if the basis function $j$ is selected and is zero otherwise. $S$ is the number of selected basis functions and $S_U$ is the maximum number of basis functions. $|\mathcal{B}|$ is the number of basis functions in the problem formulation. $\beta_j$ is the coefficient of each basis function $j$. The bounds of $\beta_j$ are given by Eq. (14e). The formulation of these constraints makes sure that $\beta_j$ is zero if the basis function is not selected, i.e. $y_j = 0$, and can take on a non-zero value within its bounds otherwise. The upper and lower bounds are defined using the lasso concept [45] as:

$$\beta_l = -\sum_{j \in \mathcal{B}} \left|\beta_j^{\mathrm{OLR}}\right| \quad \text{and} \quad \beta_u = \sum_{j \in \mathcal{B}} \left|\beta_j^{\mathrm{OLR}}\right|, \tag{15}$$

where $\beta_j^{\mathrm{OLR}}$ are the coefficients calculated by the Ordinary Least Square Regression (OLR).

In the original ALAMO formulation, the "Corrected Akaike Information Criterion" ($\mathrm{AIC}_C$) is used as the fitness measure [46,47], which is an extended version of the "Akaike Information Criterion" [48] tailored for small sample sizes.

Here, as the resulting model is a curve to fit the calculated percentile centers of the clusters, the different weights of the clusters are considered using a Weighted Least Square (WLS) formulation in the objective function rather than the ordinary least square error, similar to the approach proposed by Banks and Joyner [49]:

$$\mathrm{AIC}_{\mathrm{WLSC}} = N \log\left(\frac{1}{N} \sum_{i=1}^{N} \omega_i \Big(z_i - \sum_{j \in \mathcal{B}} \beta_j X_{ij}\Big)^2\right) + 2S + \frac{2S(S+1)}{N - S - 1} \tag{16}$$

$$\text{with} \quad \omega_i = \frac{|\mathcal{C}_i|}{\sum_j |\mathcal{C}_j|}, \tag{17}$$

where $z_i$ is the value of the data point and $X_{ij}$ contains the value of the $j$th basis function of the input variable $x_i$. $N$ is the number of the data points and $S$ is the number of selected functions. $\omega_i$ is the weighting factor of observation (cluster) $i$ and is defined as the ratio of the number of the points in cluster $i$ to the sum of all data points in all clusters. A cluster with a low number of data points indicates an operating range that was not observed often. Consequently, these operating ranges might not have been explored as much as

The extension of the approach is suitable for all types of surrogate model development using clustered data, not only for the use in BDP modeling.

The numerical solution was also adapted. While the original formulation iterates the inner problem for increasing values of $S$ until the value of $\mathrm{AIC}_{\mathrm{WLSC}}$ increases for the first time, here it is solved up to $S = S_{\max}$. The maximum value of $S_{\max}$ is

$$S_{\max} = |\mathcal{B}|. \tag{18}$$

The user can choose a smaller $S_{\max}$ in order to reduce the computational effort and the solution time. It has to be ensured that an increase of the outer objective function is observable when choosing a lower $S_{\max}$ than proposed in Eq. (18).

In order to transform the resulting MIQP problem into a MILP problem that can be solved with freely available solvers (e.g. CBC [50]), the quadratic formulation of the objective function of the inner problem is replaced by the sum of absolute values as proposed in Cozad et al. [27]:

$$\sum_{i=1}^{N} |e_i| = \sum_{i=1}^{N} \Big| z_i - \sum_{j \in \mathcal{B}} \beta_j X_{ij} \Big|. \tag{19}$$

However, the coefficients to be found are those that result from the least squares solution of the curve fitting problem. Therefore, the optimality conditions for the parameters $\beta$ of the selected subset of the basis functions ($\mathcal{S}$) are added to the optimization problem as a constraint:

$$\frac{\mathrm{d}}{\mathrm{d}\beta_j} \sum_{i=1}^{N} \Big(z_i - \sum_{k \in \mathcal{S}} \beta_k X_{ik}\Big)^2 \overset{!}{=} 0 \tag{20}$$

$$\Rightarrow \sum_{i=1}^{N} X_{ij} \Big(z_i - \sum_{k \in \mathcal{S}} \beta_k X_{ik}\Big) = 0 \quad j \in \mathcal{S}. \tag{21}$$

Eq. (21) is incorporated into the optimization problem as a set of big-M constraints (Eq. (22)). These constraints make sure that the first order optimality condition is met for the coefficients of the active basis functions:

$$-U_j (1 - y_j) \le H_j \le U_j (1 - y_j) \quad j \in \mathcal{B} \tag{22}$$

$$H_j = \sum_{i=1}^{N} X_{ij} \Big(z_i - \sum_{k \in \mathcal{S}} \beta_k X_{ik}\Big) \quad j \in \mathcal{B}, \tag{23}$$

where $U_j$ is calculated as the maximum value that $H_j$ can have within the bounds (Eq. (15)) [27].

The inner optimization problem then results as:

$$\min_{\beta,y}\ \sum_{i=1}^{N} e_i \tag{24a}$$

$$\text{s.t.}\quad e_i \ge z_i - \sum_{j \in \mathcal{B}} \beta_j X_{ij} \quad i = 1, \ldots, N \tag{24b}$$

$$e_i \ge \sum_{j \in \mathcal{B}} \beta_j X_{ij} - z_i \quad i = 1, \ldots, N \tag{24c}$$

$$\sum_{j \in \mathcal{B}} y_j = S \tag{24d}$$

$$-U_j (1 - y_j) \le \sum_{i=1}^{N} X_{ij} \Big(z_i - \sum_{k \in \mathcal{B}} \beta_k X_{ik}\Big) \le U_j (1 - y_j) \quad j \in \mathcal{B} \tag{24e}$$

$$\beta_l y_j \le \beta_j \le \beta_u y_j \quad j \in \mathcal{B} \tag{24f}$$

$$y_j \in \{0, 1\} \quad j \in \mathcal{B}. \tag{24g}$$

The application of ALAMO may require a change of the range of the input variables prior to model identification. If the set of the basis functions includes logarithmic terms or functions with exponents which are negative or, e.g., equal to 1/2, it must be ensured that the input variables are not close to zero or negative.

In data preprocessing the values were mean centered and UV scaled. This results in a distribution of the input variables around

zero with a variance equal to one. In order to prevent the aforementioned problem, all of the percentile center variables are shifted to the interval $[1, 2]$ after the clustering step.

2.5. Postprocessing by backward elimination

In the previous subsection, a method to identify a general surrogate model with multiple influencing factors ($x_i$) was provided. The method computes the model by fitting a curve to the averages of the data points between the chosen percentiles of each cluster. The model fitness measure ($\mathrm{AIC}_{\mathrm{WLSC}}$) selects the optimum number of basis functions by solely using the distance of the model from the percentile centers as a measure of accuracy.

In plant operation, the main goal of the development of the BDP is to provide the plant operators with a measure that indicates the distance between the current resource consumption and the best demonstrated practice that was observed under similar conditions in the past, the operational improvement potential (OIP). Under the assumption that the cluster center represents the average operating point of each cluster, the distance between this point and the fitted BDP is an indication of the average operational improvement potential.

For this purpose, it is sufficient to represent the dominating influences on the BDP, so that the models can possibly be simplified. This also reduces the risk of artifacts appearing in the model. There are two possibilities to add this to the method: either the model fitness measure can be adapted by additional terms in the objective function, or it can be checked ex post whether the calculated model can be simplified because some factors have little influence on the OIP. Here, the second option is chosen as a pragmatic extension of the method, and an approach based on backward elimination [51] is applied to eliminate the influencing factors with a minor influence on the OIP.

Firstly, a model is fitted using all of the possible influencing factors and the average OIP is calculated as:

$$\mathrm{OIP}_{\mathrm{avg}} = \sqrt{\sum_{i=1}^{N} \omega_i \Big(\mathrm{EnPI}_{\mathcal{C}_i} - \sum_{j \in \mathcal{S}} \beta_j X_{ij}\Big)^2} \tag{25}$$

$$\mathrm{EnPI}_{\mathcal{C}_i} = \frac{1}{|\mathcal{C}_i|} \sum_{x \in \mathcal{C}_i} \mathrm{EnPI}_{PT}(x), \tag{26}$$

where $\mathrm{EnPI}_{\mathcal{C}_i}$ is the mean EnPI of the preprocessed data points of cluster $i$, $X$ is the matrix of the inputs transformed by the basis functions using the operating conditions of the cluster center and $\omega_i$ is the weight of each cluster. Afterwards, the influencing factors are eliminated one at a time, the model is fitted again at each step and the average error of the new model is calculated as:

$$\varepsilon_k = \sqrt{\sum_{i=1}^{N} \omega_i \Big(z_i - \sum_{j \in \mathcal{B}} \beta_{jk} X_{ijk}\Big)^2}, \tag{27}$$

where $\varepsilon_k$ is the error of the model when the $k$th influencing factor is left out and $z_i$ are the percentile centers used to compute the BDP model. The following ratio $F_k$ is defined as an indicator of the loss of information considering the average improvement potential:

$$F_k = \frac{\varepsilon_k - \varepsilon_B}{\mathrm{OIP}_{\mathrm{avg}}} \le \gamma, \tag{28}$$

where $\varepsilon_B$ is the weighted least squared error of the initial model and $\gamma$ is a predefined threshold for backward elimination. This improvement factor is a modified version of the criteria presented by Hocking [52] for backward elimination.

The procedure is repeated until the number of relevant influence factors has been determined.

While the distance between the percentiles and the cluster centers was not considered during model identification, it is now used for backward elimination. For processes with a small distance between the BDP model and the cluster centers a more complex model is necessary, whereas for large distances a simpler model is sufficient.

The procedure is numerically straightforward and removes the necessity of a multicriterial optimization, which would arise if both criteria were used in one objective function.

2.6. Summary of the developed modeling procedure

The proposed procedure can now be summarized in the following steps:

1. Acquire representative data of the process.
2. Preprocess the data by mean centering and unit variance scaling according to Eq. (6).
3. Cluster the data using kmeans++ and calculate the cluster centers.
4. Calculate the BDP point for each cluster using Eqs. (9)-(12), where the percentile range is a tuning factor. The representative points for each cluster ($\mathrm{EnPI}_{\mathcal{R}_j}$) enter the optimization problem as $z_i$ in step 7. The model is fitted to these data points.
5. Scale the input data $x$ to a range excluding zero, for example from 1 to 2.
6. Define the maximum number of basis functions $S_U$. A definition of $S_U < S_{\max}$ is advised to reduce the computation time for problems with many input variables.
7. Start with $S = 1$ and solve the inner optimization problem in Eq. (24) for increasing values of $S$ until $S = S_U$. The result of the optimization at each step is a set of model parameters $\beta$.
8. The minimum value of $\mathrm{AIC}_{\mathrm{WLSC}}$ determines the optimal solution of the outer optimization problem. The corresponding set of parameters $\beta$ represents the best model.
9. For $|V_c| > 1$, define the threshold of the backward elimination, $\gamma$, and run the backward elimination procedure.
10. The model after the backward elimination step is the final BDP model.

The algorithm uses three tuning factors: the percentiles used for the identification of the BDP model, the number of clusters and $\gamma$. The user has to define a sensible set of basis functions for the curve fitting step.

3. Application to real process data of an industrial plant

3.1. Use of the best demonstrated practice (BDP) in the process industry

In the last section, a general method to identify surrogate models to compute BDP curves or surfaces from clustered data was developed. These identified BDP models can be used for different

On the one hand, the method can be the basis for real-time process monitoring. By visualizing the operational improvement potential of the current operating point, performance deviations are identified and can be reduced by the plant personnel. From an industrial viewpoint, this provides very valuable support for the improvement of the operation of plants regarding their energy

On the other hand, the concept is suitable for reporting purposes. ISO 50006:2017 [20] proposes the analysis of factors influencing the process performance. While this analysis is not an explicit requirement of ISO 50001:2018, a continuous improvement of energy performance is. In this context, it is suggested to make comparisons between different time intervals more transparent by normalizing the energy efficiency with respect to the key influencing factors.

Furthermore, ISO 50003:2016 [53], which defines the requirements for certification bodies, demands the verification of a continuous improvement of the energy performance including energy efficiency, energy use, and consumption. While the total energy consumption is related to the productivity of a company and the energy use is often defined by the processes, the efficiency is subject to continuous optimization in the process industry. The proposed concept can be used in two ways to verify an improvement in efficiency: firstly, to indicate a decrease of the average deviation from the BDP from one time interval to another, which correlates with a more efficient operation, and secondly, to demonstrate a decrease of the BDP by performing another evaluation after a period of time, which indicates a technical improvement of the process either by changing operating conditions or by the use of different equipment. Finally, the proposed method can be applied to redefine the energy baseline of a process after retrofitting (cf. ISO 50001:2018).

Fig. 3 illustrates both of these methods to verify improvements in energy efficiency. The BDP model in this example is a function of the plant load. On the left, the operational improvement potential was halved from period one to period two, resulting in a smaller difference between the sampling points and the BDP model curve. The BDP model is still valid in this case. On the right, a process modification led to a BDP adjustment. The distance between the measurements and their corresponding BDP model remains constant. The overall EnPI level decreased. In both cases the specific energy consumption decreased.

Fig. 3. Verification of improvements of the energy efficiency: Left: reduction of the operational improvement potential, right: decrease of the BDP and of the indicator by technical measures. All variables are scaled to dimensionless units.

3.2. Propylene oxide production

An application of the developed method to real industrial data is provided in this subsection. The data is taken from the Propylene Oxide (PO) production plant of INEOS in Cologne. This plant produces PO using the Chlorohydrin process, which, as shown in Fig. 4, has three main steps:

1. Synthesis of Propylene Chlorohydrin (PCH) from Propylene and Chlorine in the presence of water,
2. Dehydrochlorination of PCH to PO using an alkaline solution,
3. Purification of the product.¹

¹ More detailed information about the process can be found in Kahlich et al. [54].

Fig. 4. A simplified representation of the PO plant.

Both the dehydrochlorination and the purification step consume steam, comprising 80% of the total energy consumption. As the first step, the measurement data of the load and the total 5 bar steam consumption of the plant together with the ambient temperature are collected with a sampling interval of 1 h. This step is followed by the pretreatment of the data as explained in Section 2.2. Due to the confidentiality of the data, only the processed data are presented throughout this section. Fig. 5 shows the preprocessed data for the EnPI of 5 bar steam against the plant load and ambient temperature.

Fig. 5. Preprocessed data points before clustering and scaling.

This data was clustered and the 5-10% percentile centers of the clusters were calculated as presented in Section 2.3. The number of clusters is a tuning parameter for the model identification process and here it was chosen as 16. As shown in Fig. 5, there are a few points which lie far below the typical EnPI of the plant at some operating points, which are considered to be outliers for the corresponding operating point. The occurrence of such points is the reason for defining the lower percentile bound. Fig. 6 shows the data after clustering.

The percentile centers are used for fitting the BDP model. The load and the ambient temperature of the cluster centers are used as input variables ($x_i$) and the EnPI values are used as the response variables ($z_i$). The $x$ variables were shifted to the range of $[1, 2]$ before model identification. A total number of 7 basis functions and an intercept term $\left(1, x^{\pm\{1,2,3\}}, \exp(x)\right)$ were used in the model

fitting algorithm, and the maximum number of basis functions was defined as 10. The MILP problem (Eq. (24)) is solved using the "Coin-OR Branch and Cut" (CBC) solver [50]. All presented results have proven optimality.

Fig. 6. Data after clustering and scaling.

The resulting $\mathrm{AIC}_{\mathrm{WLSC}}$ is depicted in Fig. 7. The minimum value is reached when the number of basis functions is equal to 3. This means that, for more than 3 basis functions, the increase of the model quality due to the addition of new basis functions is not worth the added complexity of the model.

Fig. 7. $\mathrm{AIC}_{\mathrm{WLSC}}$ of the fitted models.

The developed model is:

$$y = -0.6989\,\dot{m}_p^{-2} + 2.9402\,\dot{m}_p^{-1} + 0.4195\,T_{\mathrm{amb}}^{-3} \tag{29}$$

and is plotted in Fig. 8. The red circles represent the cluster centers, the green + signs show the percentile centers. The surface depicts the fitted model. A conservative method for the definition of the validity domain of regression models in higher dimensions is the "Convex Hull" of the input variables [55]. This method was used to calculate the validity bounds of the model. The model is trustworthy only for the operating points inside of the hull and its accuracy outside of the limits cannot be guaranteed.

Fig. 8. Fitted model using three basis functions.

As the second step, the backward elimination procedure described in Section 2.5 was applied to the model to evaluate the importance of the influencing factors in the first step. The trimming criterion $\gamma$ was chosen as 0.05.

The original model with 2 influencing factors has an $\mathrm{OIP}_{\mathrm{avg}}$ of 0.4978 and a value of $\varepsilon$ of 0.1318. First, the load of the plant was left out, which results in a model with $F_k$ equal to 0.9958. This means that the plant load has a high effect on the quality of the model and cannot be left out. Leaving out the ambient temperature results in a model with an $F_k$ of 0.0472, indicating that the loss of information due to its elimination is not significant. The model after the elimination of the ambient temperature is:

$$y = -0.2287\,\dot{m}_p^{-3} + 2.7202\,\dot{m}_p^{-2}. \tag{30}$$

In order to compare the models before and after backward elimination, the percentile centers were shifted to a reference temperature along the BDP model. The chosen reference temperature is the result of an optimization which minimizes the sum of the squared distances of the original and transformed percentile centers in the EnPI space:

$$y = f(\dot{m}_p) + g(T_{\mathrm{amb}}) \tag{31}$$

$$\widetilde{\mathrm{EnPI}}_{\mathcal{C}_i} = \mathrm{EnPI}_{\mathcal{C}_i} - g(T_{\mathrm{amb}}) + g(T_{\mathrm{ref}}) \tag{32}$$

$$T_{\mathrm{ref}} = \underset{T_{\mathrm{ref}}}{\arg\min} \sum_{i=1}^{N} \Big(\mathrm{EnPI}_{\mathcal{C}_i} - \widetilde{\mathrm{EnPI}}_{\mathcal{C}_i}\Big)^2, \tag{33}$$

where $\widetilde{\mathrm{EnPI}}_{\mathcal{C}_i}$ denotes the transformed value of each $\mathrm{EnPI}_{\mathcal{C}_i}$. The BDP model was transformed accordingly.

Fig. 9. Comparison of the models before and after backward elimination. Variables shifted and scaled to dimensionless units.

Fig. 9 shows the fitted models before and after backward elimination, where the dotted line shows the model with both influencing factors and the blue circles show the projected data points.
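The elimination decisions reported for the two influencing factors follow directly from Eq. (28). The short Python sketch below re-traces that arithmetic from the reported numbers. It assumes that the reported error of 0.1318 is the error of the initial model in Eq. (28); the errors of the reduced models are not stated in the text and are only back-calculated here. All function and variable names are illustrative.

```python
THRESHOLD = 0.05   # trimming criterion reported in the text
OIP_AVG = 0.4978   # average OIP of the model with both influencing factors
EPS_B = 0.1318     # reported error, assumed to be the initial-model error

# Ratios F_k reported for leaving out each influencing factor.
REPORTED_F = {"plant load": 0.9958, "ambient temperature": 0.0472}


def elimination_ratio(eps_k: float, eps_b: float, oip_avg: float) -> float:
    """Ratio of Eq. (28): added model error relative to the average OIP."""
    return (eps_k - eps_b) / oip_avg


def implied_error(f_k: float, eps_b: float, oip_avg: float) -> float:
    """Invert Eq. (28) to obtain the error of the reduced model."""
    return eps_b + f_k * oip_avg


def can_eliminate(f_k: float, threshold: float) -> bool:
    """A factor may be removed if its ratio does not exceed the threshold."""
    return f_k <= threshold


if __name__ == "__main__":
    for factor, f_k in REPORTED_F.items():
        eps_k = implied_error(f_k, EPS_B, OIP_AVG)
        decision = "eliminate" if can_eliminate(f_k, THRESHOLD) else "keep"
        print(f"{factor}: implied error {eps_k:.4f}, decision: {decision}")
```

Under these assumptions, leaving out the plant load raises the model error to roughly 0.63, whereas leaving out the ambient temperature raises it only to about 0.16, which is why only the temperature passes the 0.05 criterion.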

The solid line shows the model after backward elimination and the crosses show the data points without the temperature projection. The left and right bounds in the figure indicate the minimum and maximum value of the percentile centers used in the model identification. Within these bounds the model is applicable for comparison with process data, whereas outside the limits the results have to be handled with care due to the lacking extrapolation capabilities of the BDP model.

The data that is used for backward elimination was originally clustered taking both of the influencing factors into consideration. As the temperature has a negligible effect on the model quality, a model was fitted as the final step using only the plant load as the influencing factor. The motivation for doing this is to base the clustering step also only on the plant load.

The parameters of the algorithm were left unchanged compared to the previous case. The algorithm identified a model using 2 basis functions as:

$$y = -0.7484\,\dot{m}_p^{-2} + 3.2778\,\dot{m}_p^{-1}, \tag{34}$$

and the corresponding curve is shown in Fig. 10. The red line and the crosses present the model and the percentile centers after backward elimination, whereas the blue dashed line and the circles present the model fitted and the percentile centers using only the plant load as input. This model is different from the model fitted after backward elimination despite using the same input data and the same settings, as there is a difference in the resulting clusters. Nevertheless, it can be seen that the models are very similar. The final model is depicted against the process data in Fig. 11.

Fig. 10. Comparison of the model after backward elimination and the model based only on the plant load.

Fig. 11. Final model after backward elimination against process data. Variables shifted and scaled to dimensionless units.

4. Conclusion and outlook

In this contribution, a method is proposed to identify a best demonstrated practice model of a process based on historical process data.

Statistical methods and state-of-the-art surrogate modeling techniques are utilized to obtain a model that is composed of simple basis functions and can be used by plant personnel without mathematical knowledge. The models can be used for real-time process monitoring to improve the operational performance and for reporting purposes. The application of the concept to verify energy performance improvements in the context of an ISO 50001:2018 certification was discussed.

The proposed method is designed to handle real data from industrial processes. Robustness is achieved by clustering and by using percentile averages to identify the best demonstrated practice. Because as few basis functions as possible are used, the models are less prone to show local variations that are a result of the interpolation procedure and not a property of the process.

The identification of the best demonstrated practice is the first step towards a more resource efficient production. Since the method only identifies the operational improvement potential, the success depends on the skills and the experience of the plant personnel. The development of data driven rule extraction to support plant operators in deriving efficient process interventions is the next step for processes where expensive modeling and advanced process control are not viable.

The concept was applied to process data of a large production plant of INEOS in Cologne, which demonstrates the practical applicability of the method to real-world imperfect data from a process for which no rigorous model is available. The resulting models are non-linear but as simple as possible to present clear trends to the plant personnel. The method does not incorporate any process knowledge, which facilitates the general application to different processes. The backward elimination step indicates that the ambient temperature has only a minor influence on the process performance and can consequently be removed from the BDP model, which further simplifies the model structure without a significant loss of information. By following the workflow proposed in this contribution, the gap between state-of-the-art methods in the literature and the application in the industry can be closed, providing the basis for better process monitoring and energy efficient operation and a substantiated baseline for energy management systems.

As an area for further research, the disaggregation of a process into smaller sub-processes and the description of each sub-process by a specific BDP model should be considered. The process can afterwards be aggregated hierarchically to obtain consistent BDP models and performance figures on the top level. A concept which considers BDP and performance aggregation was proposed by Beisheim et al. [30]. It will be used at the INEOS site in Cologne with over 20 processing plants to monitor and report the energy and environmental performance. It is a future cornerstone for reporting in the context of the energy management system at INEOS in

The project leading to this publication has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 723575 (CoPro) in the framework of SPIRE PPP and also under the FP 7 grant agreement No 604068 (MORE).

References

[1] The european chemical industry council, facts & figures. 2017. Online, 2017. URL: [Accessed 13 October 2018].
[2] Bundesamt für Wirtschaft und Ausfuhrkontrolle. In: Merkblatt für stromkostenintensive Unternehmen, vol. 2018; 2018.
[3] International Organization for Standardization. ISO 50001: energy management systems – requirements with guidance for use. 2018.
[4] European Commission. Regulation (EC) No 1221/2009 of the European Parliament and of the Council of 25 November 2009 on the voluntary participation by organisations in a community eco-management and audit scheme (EMAS), repealing Regulation (EC) No 761/2001 and Commission Decisions 2001/681/EC and 2006/193/EC. Official Journal of the European Union; 2009.
[5] Morrow D, Rondinelli D. Adopting corporate environmental management systems: motivations and results of ISO 14001 and EMAS certification. Eur Manag J 2002;20:159–71.
[6] Poksinska B, Jörn Dahlgaard J, Eklund JA. Implementing ISO 14000 in Sweden: motives, benefits and comparisons with ISO 9000. Int J Qual Reliab Manag
[7] Lässig J, Schütte T, Riesner W, editors. Energieeffizienz-Benchmark Industrie: Energieeffizienzkennzahlen 2015. Wiesbaden: Springer Fachmedien Wiesbaden; 2017.
[8] Phylipsen D, Blok K, Worrell E, De Beer J. Benchmarking the energy efficiency of Dutch industry: an assessment of the expected effect on energy consumption and CO2 emissions. Energy Policy 2002;30:663–79.
[9] Makridou G, Andriosopoulos K, Doumpos M, Zopounidis C. Measuring the efficiency of energy-intensive industries across European countries. Energy Policy 2016;88:573–83.
[10] Saygin D, Patel MK, Worrell E, Tam C, Gielen DJ. Potential of best practice technology to improve energy efficiency in the global chemical and petrochemical sector. Energy 2011a;36:5779–90.
[11] Saygin D, Worrell E, Patel MK, Gielen D. Benchmarking the energy use of energy-intensive industries in industrialized and in developing countries. Energy 2011b;36:6661–73.
[12] Drumm C, Busch J, Dietrich W, Eickmans J, Jupke A. STRUCTese® – energy efficiency management for the process industry. Chem Eng Process: Process Intens 2013;67:99–110.
[13] NAMUR WG 4.17. Procedure for enhancing energy efficiency in chemical plants – contribution of automation engineering, NAMUR recommendation 140, NAMUR. Leverkusen, Germany: NAMUR; 2012.
[14] NAMUR WG 4.17. Resource Efficiency Indicators for monitoring and improving resource efficiency in processing plants, NAMUR Recommendation 162, NAMUR. Leverkusen, Germany: NAMUR; 2017.
[15] Motard R, Shacham M, Rosen E. Steady state chemical process simulation. AIChE J 1975;21:417–36.
[16] Luyben WL. Process modeling, simulation and control for chemical engineers. McGraw-Hill Higher Education; 1989.
[17] Dowling AW, Biegler LT. A framework for efficient large scale equation-oriented flowsheet optimization. Comput Chem Eng 2015;72:3–20.
[18] Falkenhainer B, Forbus KD. Compositional modeling: finding the right model for the job. Artif Intell 1991;51:95–143.
[19] Søndergaard J. Optimization using surrogate models-by the Space Mapping technique. Ph.D. thesis. Technical University of Denmark; 2003.
[20] International Organization for Standardization. ISO 50006: energy management systems - measuring energy performance using energy baselines (EnB) and energy performance indicators (EnPI) - general principles and guidance. 2014. ISO 50006:2014.
[21] Nomikos P, MacGregor JF. Monitoring batch processes using multiway principal component analysis. AIChE J 1994;40:1361–75.
[22] Ge Z, Song Z. Distributed PCA model for plant-wide process monitoring. Ind Eng Chem Res 2013;52:1947–57.
[23] Krige DG. A statistical approach to some basic mine valuation problems on the Witwatersrand. J S Afr Inst Min Metall 1951;52:119–39.
[24] McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 1943;5:115–33.
[25] Hassoun MH. Fundamentals of artificial neural networks. MIT press; 1995.
[26] Audet C, Dennis J, Moore D, Booker A, Frank P. A surrogate-model-based method for constrained optimization. In: 8th symposium on multidisciplinary analysis and optimization; 2000. p. 4891.
[27] Cozad A, Sahinidis NV, Miller DC. Learning surrogate models for simulation-based optimization. AIChE J 2014;60:2211–27.
[28] Bunse K, Vodicka M, Schönsleben P, Brülhart M, Ernst FO. Integrating energy efficiency performance in production management – gap analysis between industrial needs and scientific literature. J Clean Prod 2011;19:667–79.
[29] Sfez S, Dewulf J, De Soete W, Schaubroeck T, Mathieux F, Kralisch D, De Meester S. Toward a framework for resource efficiency evaluation in industry: Recommendations for research and innovation projects. Resources 2017;6:5.
[30] Beisheim B, Krämer S, Engell S. A hierarchical aggregation concept for resource efficiency in continuous production complexes. In: Espuña A, Graells M, Puigjaner L, editors. 27th european symposium on computer aided process engineering, volume 40 of computer-aided chemical engineering. Elsevier; 2017a. p. 1519–24.
[31] Beisheim B, Kalliski M, Ackerschott D, Engell S, Krämer S. Real-time performance indicators for energy and resource efficiency in continuous and batch processing. In: Engell S, Krämer S, editors. Resource efficiency of processing plants: monitoring and improvement. John Wiley & Sons; 2017b. p. 79–128.
[32] Giljum S, Burger E, Hinterberger F, Lutter S, Bruckner M. A comprehensive set of resource use indicators from the micro to the macro level. Resour Conserv Recycl 2011;55:300–8.
[33] Huysman S, Sala S, Mancini L, Ardente F, Alvarenga RA, De Meester S, Mathieux F, Dewulf J. Toward a systematized framework for resource efficiency indicators. Resour Conserv Recycl 2015;95:68–76.
[34] Kujanpää M, Hakala J, Pajula T, Beisheim B, Krämer S, Ackerschott D, Kalliski M, Engell S, Enste U, Pitarch J. Successful resource efficiency indicators for process industries, vol. 290. VTT Technology; 2017.
[35] Cao S, Rhinehart RR. An efficient method for on-line identification of steady state. J Process Control 1995;5:363–74.
[36] Bhat SA, Saraf DN. Steady-state identification, gross error detection, and data reconciliation for industrial process units. Ind Eng Chem Res 2004;43:
[37] Eriksson L. Introduction to multi- and megavariate data analysis using projection methods (PCA & PLS). Umetrics AB; 1999.
[38] Tukey JW. Exploratory data analysis, Addison-Wesley series in behavioral science: quantitative methods. Reading, Mass.: Addison-Wesley; 1977.
[39] Jain AK. Data clustering: 50 years beyond k-means. Pattern Recogn Lett
[40] Arthur D, Vassilvitskii S. k-means++: The advantages of careful seeding. In: Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics; 2007. p. 1027–35.
[41] MacQueen J, et al. Some methods for classification and analysis of multivariate observations. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol. 14; 1967. p. 281–97. Oakland, CA, USA.
[42] Bahmani B, Moseley B, Vattani A, Kumar R, Vassilvitskii S. Scalable k-means++. Proc VLDB Endow 2012;5:622–33.
[43] Walfish S. A review of statistical outlier methods. Pharmaceut Technol
[44] Wilson ZT, Sahinidis NV. The ALAMO approach to machine learning. Comput Chem Eng 2017;106:785–95.
[45] Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B
[46] Sugiura N. Further analysis of the data by Akaike's information criterion and the finite corrections. Commun Stat Theor Methods 1978;7:13–26.
[47] Burnham KP, Anderson DR. Multimodel inference: understanding AIC and BIC in model selection. Socio Methods Res 2004;33:261–304.
[48] Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control 1974;19:716–23.
[49] Banks H, Joyner ML. AIC under the framework of least squares estimation. Appl Math Lett 2017;74:33–45.
[50] Forrest J, Lougee-Heimer R. CBC user guide. In: Emerging theory, methods, and applications. INFORMS; 2005. p. 257–77.
[51] Draper NR, Smith H. Selecting the best regression equation. In: Applied regression analysis. Wiley-Blackwell; 2014. p. 327–68.
[52] Hocking RR. A Biometrics invited paper. The analysis and selection of variables in linear regression. Biometrics 1976;32:1–49.
[53] International Organization for Standardization. ISO 50003: energy management systems - requirements for bodies providing audit and certification of energy management systems. 2014. ISO 50003:2014.
[54] Kahlich D, Wiechern U, Lindner J. Propylene oxide, Ullmann's encyclopedia of industrial chemistry. 2002.
[55] Brooks DG, Carroll SS, Verdini WA. Characterizing the domain of a regression model, vol. 42. The American Statistician; 1988. p. 187–90. URL: http://www.