
Computers in Industry 82 (2016) 104–118

Contents lists available at ScienceDirect

Computers in Industry
journal homepage: www.elsevier.com/locate/compind

An automated approach for merging business process fragments


Mohamed Anis Zemni a,*, Amel Mammar b, Nejib Ben Hadj-Alouane c
a Ecole Nationale des Sciences de l’Informatique, ENSI, UR/OASIS, Tunisia
b Institut Mines-Télécom/Télécom SudParis, CNRS UMR 5157 SAMOVAR, France
c Ecole Nationale d’Ingénieurs de Tunis, ENIT, UR/OASIS, Tunisia

ARTICLE INFO

Article history:
Received 20 April 2016
Accepted 3 May 2016
Available online 15 June 2016

Keywords:
Business processes
Fragments
Merge approach
Gateway paths
Gateway paths matrix
Process behaviors

ABSTRACT

In the field of business process management, adopting efficient building strategies can improve the quality of companies’ business processes. The reuse of existing business processes or even fragments of them is a practical approach to build complete business processes or coarser-grained process fragments. In the present paper, we deal with the merge of a set of business process fragments for the construction of new complete processes. Our merge mechanism relies on a particular path matrix, that we call gateway path matrix. We use gateway path matrices to represent business process fragments to systematically compose shared components with individual ones. Moreover, our approach ensures that the resulting business process fragments subsume the behavior of initial ones and allows for adding new execution scenarios while controlling undesirable ones. In fact, we detect newly generated behaviors, and alert process designers of undesirable ones through behavioral constraints. We provide extensive experimental results derived from an implementation of our approach applied on a well-known industrial library of business process fragments.

© 2016 Elsevier B.V. All rights reserved.

* Corresponding author.

1. Introduction

Optimizing the development periods of service-oriented software can improve the efficiency of various actors in the software industry. Moreover, modern software applications rely heavily on business process technologies [19]. Therefore, adding a degree of automation to the design of business processes can extensively improve the development period and quality of the underlying applications.

A business process development approach, based solely on constructing business processes completely from scratch, is costly, time consuming and lacks flexibility. In practice, designers need to implement, optimize and test new software applications and thereby business processes in short periods of time. To overcome this hurdle, business process practitioners and architects are now interested in reusing and integrating existing business processes into new ones, rather than completely building them from scratch. It is common knowledge that such practices can increase both the efficiency and the productivity of the development process [13].

In line with business process reuse strategy, some researchers [1,25] propose the use of what are known as configurable business processes [16]. Variations of business processes are obtained by modifying the underlying configuration. Such techniques have a limited efficiency as they are typically used to adapt business processes within restrained perimeters. Adding new activities to such processes generally implies going through the implementation, optimization and test phases.

A more practical emergent approach consists in reusing only particular portions of existing business processes, called business process fragments (BPF). These fragments are generally constrained with fewer business rules than their complete parent processes and are, hence, easier to integrate into new business processes. Moreover, the controlled reuse of well-established, tested and stable fragments can significantly improve the quality of resulting new business processes [27,30]. Several approaches have been proposed to identify reusable fragments [35,31–33]. These fragments are, afterwards, stored in libraries for future use by business process building tools. In this paper, we focus our interest on the development of an automated approach for merging selected business process fragments from an existing fragment library for the purpose of constructing complete viable business processes.

It is well noted in the literature that existing fragments may contain similarities or even overlapping knowledge and structures [7] as they may be retrieved from independent business processes offering similar services or even from the same business process. In the approach developed in this paper, we strive to take advantage

http://dx.doi.org/10.1016/j.compind.2016.05.002
0166-3615/© 2016 Elsevier B.V. All rights reserved.

Fig. 1. Business process fragments represented in BPMN.

of such similarities as a way of interconnecting business process fragments.

The interconnection approach we defined ensures that the resulting interconnected business process fragments subsume the behavior of initial ones and add new execution scenarios, while controlling undesirable ones.

The main contributions of this paper reside in providing a flexible merge mechanism for business process fragments. This mechanism is based on the notion of path matrices [34] used as a way to represent node-based graphs capturing business processes. More specifically, path matrices are used to configure paths between adjacent nodes in node-based graphs. Our merge mechanism relies on path matrices properties to provide correct merged paths between pairs of activities belonging to interconnected business process fragments.

In order to properly handle the undesirable behaviors that may arise during the merge process, we present a mechanism that detects and controls them. As a matter of fact, we integrated into resulting merged fragments behavioral constraints that the designer can use and configure as a way of allowing the desirable behaviors and inhibiting undesirable ones.

To thoroughly evaluate our approach from the points of view of practicality, efficiency, and quality, we provide extensive experimental results derived from an implementation of our approach applied to a well-known industrial library of business process fragments. We show that our approach provides good quality coarse-grained fragments or business processes. The quality of the resulting fragment or business process is evaluated by comparing it to the manually constructed fragments.

The remainder of this paper is organized as follows: Section 2 presents a real-life motivating scenario to show in detail the issues we face when merging a pair of fragments. Section 3 presents the business process artifacts we use throughout this paper. In Section 4, we introduce the path matrices that serve as a foundation for the merge mechanism used by our approach. In Section 5, we detail the merge mechanism based on path matrices. In Section 6, we give the behavioral constraints that are used to control the behavior of the resulting fragments. The experimental results obtained from the software tool are presented in Section 7. In Section 8, we review important and relevant literature and finally we conclude in Section 9. We note also that since our approach is well grounded in mathematically formal concepts, we provide, among others, in the Appendix the necessary proofs for the properties stated throughout this paper.

2. Motivating scenario

Our objective is to compose a set of already selected BPFs into a single coarse-grained BPF. For this aim, each BPF must share at least one activity with another BPF. Moreover, we choose to use the merge principle for the composition, typically used to consolidate several versions of a BP [8,26]. In this section, we present the main issues we face while performing the merge through a real-life scenario.

Let us consider the pair of BPFs illustrated in Fig. 1 and owned by a software development company specialized in supplying applications for bank management software. The BPFs f1 and f2 were initially developed for two distinct services: "local banking" and "foreign banking". Inspired by the BPMN representation [17], activities are represented with rounded boxes and gateways with diamonds. Complete control flows are represented with solid arrows and dangling ones (control flows with either no source or target objects specified) with dashed arrows. Each control flow involves a passing condition that should be evaluated to true to allow passing to the target object. In Fig. 1, control flows with empty passing conditions mean that the latter are fixed to true. The first BPF, f1, depicts the main activities for credit application within local banking. The service is intended for normal customers and bank employees. Approved amounts are credited on the customer’s account. The second BPF, f2, is intended for professional customers carrying out foreign financial transactions. The BPF performs credit application for them. Unlike the first BPF, approved amounts are directly transferred to the foreign party with whom the professional customer realizes the transaction. Let us suppose that the company desires to provide a service, called

Fig. 2. The resulting BPF fM from the merge of the initial BPFs of Fig. 1.

"credit service", as required by a new bank, with a new business process handling credit requests for both normal customers/bank employees and professional customers as depicted by the BPF fM in Fig. 2. We note that the resulting fragment has been composed manually by the company process designers by comparing input fragments, consolidating shared parts and allowing for branching in presence of individual parts. Grey gateways $g^{d}_{a_1}$, $g^{c}_{a_4}$, and $g^{d}_{a_4}$ are added during the merge task to enable the branching to individual portions. They have to be configured by the process designer to finalize the process. Thus, when applying for credit, a professional customer can be asked to take out insurance.

One issue consists in merging the initial BPFs while eliminating knowledge redundancy. In fact, some activities are shared between the input BPFs, e.g., a0, a1, and a4. Designers should find the best way to combine shared elements and individual ones. This task is mainly done manually and can easily be qualified as time consuming and error prone. Although it seems straightforward in the present motivating scenario, the task becomes tedious when merging big BPFs. Therefore, this task needs clear consolidation rules to generate correct fragments in optimal periods of time.

Another issue, a behavioral one, may occur during the execution of the resulting BPF. In fact, we consider that the resulting BPF fM is integrated as part of a complete BP to enable performing it. The activity a0 is performed first, then the activities a1 and a2 and so on until reaching the activity a7. Such behavior exists neither in the initial BPF f1 nor in f2. Let us suppose that the designer has fixed the passing conditions of the control flows involving configurable gateways so that the passing conditions of the control flows $(g^{d}_{a_1}, g_1)$ and $(g^{d}_{a_4}, a_7)$ evaluate to true and the passing condition of the control flow $(g^{d}_{a_1}, a_6)$ evaluates to false. This scenario, however, leads to the execution failure of Activity a7 as it strongly depends on Activity a6. In fact, Activity a7 needs the payment information from Activity a6, which was never executed. This problem never arises in the initial BPF f2 as Activity a6 is necessarily performed before Activity a7. In fact, this behavior is intended by the designer of the initial BPF. Therefore, when merging a pair of BPFs, we have to ensure that their behavior is maintained as defined initially, and that newly generated ones are controlled.

3. Business process artifacts

In this section, we present the basic artifacts related to business processes we use in our work. First, we start by defining the main input of our approach, namely, the business process fragment. Next, we define the paths we handle during the merge task. We finally introduce the behaviors of a business process fragment, which we will need to maintain during the merge task.

3.1. Business process fragment

A business process fragment (fragment for short) is a connected sub-process capturing incomplete business rules and knowledge about a system. It is commonly designed for reuse purposes [28,20]. Generally speaking, BPFs have relaxed completeness criteria [4] compared to complete BPs where all the required activities are fully specified. A BPF is composed of at least one object, and of several edges, representing control flows. Objects are essentially composed of activities and gateways. Each object is identified by a label. Control flows capture the execution order between the objects. During the execution, a control flow is activated if its corresponding passing condition is evaluated to true. Activities represent the tasks to be performed. Gateways are routing constructs which incorporate a type, either divergence or convergence. They are used to control the divergence of control flows, i.e., when a control flow has to be split into a set of control flows, or their convergence, i.e., when a set of control flows have to be joined into a single one [21]. BPFs are constructed with the intention of composing them to build new complete BPs. In contrast to complete BPs, BPFs can depict dangling control flows, i.e., with either no source or target objects specified. Dangling control flows represent gluing points from which they would be attached to other BPFs in order to build complete BPs [11].

We recall the following notations that we use throughout this paper. Given $R \subseteq X \times Y$, $X_1 \subseteq X$ and $Y_1 \subseteq Y$, then

• $X_1 \lhd R = \{(x, y) \mid (x, y) \in R \wedge x \in X_1\}$
• $R \rhd Y_1 = \{(x, y) \mid (x, y) \in R \wedge y \in Y_1\}$
• $R[X_1] = \{y \mid y \in Y \wedge \exists x.(x \in X_1 \wedge (x, y) \in R)\}$
• $R^{-1}[Y_1] = \{x \mid x \in X \wedge \exists y.(y \in Y_1 \wedge (x, y) \in R)\}$

We formally define a BPF model as follows.

Definition 1 (Business process fragment).
A business process fragment is a tuple $f = (O, A, G, C_f)$, where

• $O = A \cup G$ is a set of objects composed of a set of activities, A, and a set of gateways, G,
• o.label depicts the label of object o, and $g.type \in \{\text{‘div’}, \text{‘conv’}\}$ represents the type of the gateway g,
• $C_f \subseteq ((O \cup \{?\}) \times (O \cup \{?\})) \setminus \{(?, ?)\}$ is the complete and/or dangling control flow relation that links objects to each other, where symbol ? represents the missing source or target of a control flow. We define $C_f$ such that:
  – each activity has at most one incoming (resp. outgoing) control flow. Formally, $\forall a.(a \in A \Rightarrow |C_f[\{a\}]| \leq 1 \wedge |C_f^{-1}[\{a\}]| \leq 1)$;
  – each gateway is either a divergence or a convergence. A divergence (resp. convergence) gateway has one incoming (resp. outgoing) control flow and several outgoing (resp. incoming) control flows: $\forall g.(g \in G \Rightarrow (g.type = \text{‘div’} \wedge |C_f[\{g\}]| \geq 1 \wedge |C_f^{-1}[\{g\}]| = 1) \vee (g.type = \text{‘conv’} \wedge |C_f[\{g\}]| = 1 \wedge |C_f^{-1}[\{g\}]| \geq 1))$.
• c.x depicts the passing condition of control flow c, with

  $$c.x = \begin{cases} true, & \text{if } c \in A \lhd C_f \\ cond, & \text{otherwise} \end{cases}$$

  That is, during the execution, a control flow that contains a gateway as a source is activated if its corresponding passing condition is evaluated to true.

Note that when a gateway appears as a source in a control flow whose passing condition is not yet defined, denoted by the symbol "?", then the gateway is called configurable. Such passing conditions must be fixed at build time by the process designer. Moreover, divergence (resp. convergence) gateways with outgoing (resp. incoming) control flows having passing conditions fixed at true represent parallel branching.

3.2. Control flow paths and gateway paths

A BPF can be viewed as (i) a set of control flow paths (CFP) and (ii) a set of gateway paths (GP). A CFP is defined as an alternated sequence of control flows between two objects such that the source of one control flow is the target of the predecessor control flow. Formally, we define a CFP as follows.

Definition 2 (Control flow path).
Given a BPF, f, a control flow path is a tuple $CFP = (o_s, o_t, seq)$, where

• $o_s \in f.O$ is called the source object,
• $o_t \in f.O$ is called the target object,
• $seq = \langle (o_i, o_{i+1}) \rangle_{i \in 1..(n-1)}$ is a sequence of alternated control flows, where
  – $\forall i.(i \in 1..(n-1) \Rightarrow (o_i, o_{i+1}) \in f.C_f)$ and n is the number of objects that appear in seq,
  – $o_1 = o_s$ and $o_n = o_t$, and
  – $\forall i.(i \in 2..(n-1) \Rightarrow o_i \neq o_s \wedge o_i \neq o_t)$.

Please note that an algorithm is given in Appendix A to retrieve the set of CFPs associated to a BPF.

For instance, as shown in the BPF f1 of Fig. 1, $p_1 = (g_1, a_4, \langle (g_1, g_2), (g_2, g_5), (g_5, g_3), (g_3, a_4) \rangle)$ and $p_2 = (g_1, a_4, \langle (g_1, g_4), (g_4, g_3), (g_3, a_4) \rangle)$ are two CFPs between the objects g1 and a4. For a given BPF f, let $P_f$ be the set of all CFPs of f.

Throughout this paper, given a CFP $p = (o_s, o_t, \langle (o_i, o_{i+1}) \rangle_{i \in 1..(n-1)})$, $\langle (o_i, o_{i+1}) \rangle_{i \in 1..(n-1)}$ denotes the sequence p.seq, and the length of p, denoted by p.len, is the number of elements in seq, i.e., $p.len = n - 1$.

A BPF must be connected by depicting a single block in order to ensure that when a BPF is reused in a new BP, all objects can be reached. That is, between each two objects, there must be a CFP or there must be two CFPs each one outgoing from an object and incoming to a third object. This property is formally defined as follows.

Property 1 (Connectivity).
A business process fragment, f, is connected iff $|f.O| = 1$ or
$\forall (o_1, o_2).((o_1, o_2) \in f.O^2 \Rightarrow \exists p.(p \in P_f \wedge p.a_s = o_1 \wedge p.a_t = o_2) \vee \exists (p_1, p_2).((p_1, p_2) \in P_f^2 \wedge ((p_1.a_s = o_1 \wedge p_2.a_s = o_2 \wedge p_1.a_t = p_2.a_t) \vee (p_1.a_t = o_1 \wedge p_2.a_t = o_2 \wedge p_1.a_s = p_2.a_s))))$.

Our work is activity-driven as we aim at extending the features of one BPF with the features of another BPF. Basically, the collaboration between the activities of a BPF exposes the composite features of a BPF. Our work aims to merge a pair of BPFs at the scale of activities while considering the relations between them. In other words, the merge can be achieved over a pair of consecutive similar activities independently from the rest of the activities.

A gateway path (GP) is a particular CFP between a pair of activities whose sequence does not contain any other activity. It has a length of 1 when the activities are directly linked by means of a control flow. Otherwise, it goes through gateways exclusively. Formally, a GP is defined as follows.

Definition 3 (Gateway path).
Given a BPF, f, a gateway path, $GP = (a_s, a_t, \langle (o_i, o_{i+1}) \rangle_{i \in 1..(n-1)})$, is a CFP where

• $a_s \in f.A$ is called the source activity,
• $a_t \in f.A$ is called the target activity,
• $\forall i.(i \in 2..(n-1) \Rightarrow o_i \in f.G)$.

For instance, in the BPF f1 of Fig. 1, $(a_0, a_1, \langle (a_0, a_1) \rangle)$ is a GP between the activity a0 and the activity a1, while there are no GPs between the activity a0 and the activity a5. Let $G_f$ be the set of all GPs in a fragment f and $m : A \times A \rightarrow \mathcal{P}(G_f)$ be a function that returns the gateway paths between a pair of activities. The symbol $\mathcal{P}$ represents the power set operator.

3.3. Behavior of a business process fragment

In the following, we consider that a BPF is integrated as part of a complete BP to enable performing it. The behavior of a BPF consists of a set of scenarios that may be performed during its execution. Each scenario is composed of sequences of activities that are selected during the execution of a BPF. Each sequence of activities corresponds to the activities that appear in a CFP. The behavior is derived from the structure of the BPF, namely the sequence of control flows between the objects, as well as their corresponding passing conditions. That is, during the execution, each control flow between a pair of consecutive objects is activated to allow enacting the target object, assuming that the source object has already been correctly accomplished. It is noteworthy that an object accomplishment refers to the correct action logic of the object (i.e., execution for activities, and switching to a convergence or divergence for gateways). Therefore, a given activity can be activated only if some specific predecessor activities that appear in a single or several CFPs have already been accomplished. Given an activity a in a BPF f, an activity a′ is a predecessor if there is a CFP between them ($\exists p.(p \in P_f \wedge p.a_s = a' \wedge p.a_t = a)$).

As stated above, our objective is to merge a pair of BPFs while preserving their initial behaviors. However, while performing this task, some undesirable new behaviors may occur, causing the execution of the BPF to block. Such behaviors happen when performing a given activity while some predecessors that used to be accomplished in the initial BPF have not been accomplished. To control the newly generated behaviors, we define some constraints over activities which we call behavioral constraints. Each constraint defines whether the corresponding activity respects the behavior of the initial BPF it was retrieved

Fig. 3. Gateway path matrix M1 for fragment f1 in Fig. 1.


Fig. 4. Gateway path matrix M2 for fragment f2 in Fig. 1.

from. These constraints are composed of a set of sequences of activities that must be executed before performing the reached activity. The sequences, whose activities must be accomplished, are selected according to the passing conditions of the initial BPF. Therefore, performing an activity is restricted to that of its predecessors. These behavioral constraints are defined, at design time, either by the designer or automatically, using formal expressions. We formally define the behavioral constraints as follows.

Definition 4 (Behavioral constraints).
Given a BPF f, the behavioral constraint of an activity a, written $B_a$, is composed of a set of activity sequences where all activities of at least one sequence must be performed before performing the activity a. Each sequence has the form $\langle a_1, \ldots, a_n \rangle$ and is such that:

1. $\exists p.(p \in P_f \wedge p.a_s = a_i \wedge p.a_t = a_{i+1})$, with $i \in 1, \ldots, (n-1)$, and
2. $\exists p.(p \in P_f \wedge p.a_s = a_n \wedge p.a_t = a)$.

For instance, given the BPF f1 of Fig. 1, a possible behavioral constraint of the activity a4 is $B_{a_4} = \{\langle a_1, a_2 \rangle\}$, which means that the activity a4 can be performed only if the activities a1 then a2 have already been performed in the right order.

Note that, at design time, the designer may consider these behavioral constraints and incorporate them in the BPF execution logic, or leave them out. This depends on the behaviors the designer intends to provide.

In the following, we present a systematic approach using matrices to merge a set of BPFs while keeping their initial behavior and controlling new ones with behavioral constraints.

4. Gateway path matrix

In order to enable our merge mechanism, we rely on the graph matrices formalism. We chose this specific tool because its properties, more specifically the column and row properties, match our main objectives. In graph theory [34], a graph can be mapped onto a square matrix to represent whether an edge exists between any pair of nodes. In our work, a BPF, which is basically a graph, is mapped onto a particular matrix that we call gateway path matrix (GPM). The GPM is defined by the set of the BPF’s activities, and illustrates the corresponding GPs between pairs of activities. The GPM representing a BPF is formally defined as follows.

Definition 5 (Gateway path matrix).
A BPF f is mapped onto the GPM $M_f = (A, GP, m)$ where $A = f.A$ is a set of activities, $GP = G_f$ is a set of GPs in f, and $m : A \times A \rightarrow \mathcal{P}(GP)$ is a function such that, given a pair of activities $(a, a') \in A^2$, respectively referred to as source and target activities, returns the set of GPs that link them to each other: $m(a, a') = \{p \mid p \in GP \wedge p.a_s = a \wedge p.a_t = a'\}$.

For instance, the GPMs representing the fragments f1 and f2 of Fig. 1 are shown respectively in Figs. 3 and 4. Empty elements represent the absence of GPs between a pair of activities.

To be a GPM, a matrix must fulfill some basic properties that denote the correctness of the GPM (see Appendix B for proof). We derive these properties from the BPF definition (Definition 1).

• When an element of a GPM contains a GP of length 1, then the element denotes a singleton set and the rest of the row and the column must be empty. In fact, the GP source and target activities are directly linked by means of a control flow where an activity has at most one incoming control flow and one outgoing control flow. Formally,

Property 2. Given a fragment matrix $M = (A, GP, m)$, then $\forall (a_1, a_2).((a_1, a_2) \in A \times A \wedge (\exists p.(p \in m(a_1, a_2) \wedge p.len = 1)) \Rightarrow |m(a_1, a_2)| = 1 \wedge \forall a_3.((a_3 \in A \setminus \{a_2\} \Rightarrow m(a_1, a_3) = \emptyset) \wedge (a_3 \in A \setminus \{a_1\} \Rightarrow m(a_3, a_2) = \emptyset)))$.

For instance, in Fig. 4, the element containing the GP p′4 has length 1 while the rest of the column and the row are empty.

• For a given row, all GPs inside an element or in different elements share the outgoing control flow from the row activity. This means that there are many GPs diverging from the row activity to the same column activity (resp. to several column activities). Formally,

Property 3. Given a fragment matrix $M = (A, GP, m)$, then $\forall (a_i, a_j, a_k).((a_i, a_j, a_k) \in A^3 \Rightarrow \forall (p_1, p_2).(p_1 \in m(a_i, a_j) \wedge p_2 \in m(a_i, a_k) \Rightarrow p_1.seq.first = p_2.seq.first))$, where, given a sequence of elements $E = \langle e_1, e_2, \ldots, e_n \rangle$, E.first returns the first element $e_1$.

In Fig. 3, for instance, the elements containing the GPs p2, p3, p4, and p5 share the first control flow (a1, g1).

• For a given column, all GPs inside an element or in different elements share the incoming control flow to the column activity. This means that there are many GPs converging to the column activity from the same row activity (resp. from several row activities). Formally,

Property 4. Given a fragment matrix $M = (A, GP, m)$, then $\forall (a_i, a_j, a_k).((a_i, a_j, a_k) \in A^3 \Rightarrow \forall (p_1, p_2).(p_1 \in m(a_j, a_i) \wedge p_2 \in m(a_k, a_i) \Rightarrow p_1.seq.last = p_2.seq.last))$, where, given a sequence of elements $E = \langle e_1, e_2, \ldots, e_n \rangle$, E.last returns the last element $e_n$.

In Fig. 3, for example, the elements containing the GPs p4, p5, and p6 share the last control flow (g3, a4).

5. Merge mechanism

Given a pair of GPMs, the merge mechanism, shown in Fig. 5, must generate a single GPM that encompasses the executions of the initial GPMs.
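As an illustration of Definitions 3 and 5 and of Properties 3 and 4, the following minimal Python sketch builds a GPM from a small fragment and checks the row and column properties. The toy fragment, the function names (`gateway_paths`, `build_gpm`, `check_row_property`, `check_column_property`) and the dictionary encoding of m are assumptions made for this example only; they are not taken from the implementation evaluated in this paper, and the sketch assumes the fragment contains no gateway-only cycle.

```python
from itertools import product


def gateway_paths(activities, flows):
    """Enumerate GPs: paths between two activities that cross only gateways."""
    succ = {}
    for src, tgt in flows:
        if src is None or tgt is None:
            continue  # dangling control flows cannot be part of a GP
        succ.setdefault(src, []).append(tgt)

    paths = []

    def walk(start, node, seq):
        for nxt in succ.get(node, []):
            step = seq + [(node, nxt)]
            if nxt in activities:
                paths.append((start, nxt, tuple(step)))  # reached the target activity
            else:
                walk(start, nxt, step)  # nxt is a gateway: keep following it

    for a in activities:
        walk(a, a, [])
    return paths


def build_gpm(activities, flows):
    """Map every pair of activities to the set of GP sequences linking them."""
    m = {pair: set() for pair in product(activities, repeat=2)}
    for a_s, a_t, seq in gateway_paths(activities, flows):
        m[(a_s, a_t)].add(seq)
    return m


def check_row_property(m):
    """Property 3: all GPs of a row share their first control flow."""
    firsts = {}
    for (a_s, _), gps in m.items():
        for seq in gps:
            firsts.setdefault(a_s, set()).add(seq[0])
    return all(len(v) == 1 for v in firsts.values())


def check_column_property(m):
    """Property 4: all GPs of a column share their last control flow."""
    lasts = {}
    for (_, a_t), gps in m.items():
        for seq in gps:
            lasts.setdefault(a_t, set()).add(seq[-1])
    return all(len(v) == 1 for v in lasts.values())


# Hypothetical fragment: a0 diverges through g1 to a1 and a2, which converge through g2 to a3.
activities = {"a0", "a1", "a2", "a3"}
flows = [("a0", "g1"), ("g1", "a1"), ("g1", "a2"),
         ("a1", "g2"), ("a2", "g2"), ("g2", "a3")]
gpm = build_gpm(activities, flows)
```

Storing each GP as a tuple of control flows makes identical GPs coincide inside a set, which matches the set-valued function m of Definition 5.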

Fig. 6. Aligned gateway path matrix M1A for the gateway path matrix M1 in Fig. 3.

Fig. 5. Merge mechanism steps.

Fig. 7. Aligned gateway path matrix M2A for the gateway path matrix M2 in Fig. 4.

Our approach is composed of two phases: (i) merge phase and (ii) behavior preserving phase. The merge phase is explained in the current section while the behavior preserving phase is detailed in the next section (Section 6). Merging a pair of GPMs consists in fusing their elements (GPs) that share the same source and target activities in a pairwise fashion, i.e., one element from each GPM. Therefore, the GPMs must contain the same set of activities so as to enable comparing elements of the first GPM with the elements of the second one. The GPM correctness properties are also checked to deliver correct BPFs. However, GPMs do not necessarily fulfill this condition. For instance, the GPM M2 in Fig. 4 does not contain the activities a2, a3, and a5, and the GPM M1 in Fig. 3 does not contain the activities a6 and a7. Therefore, GPMs should first be aligned in order to enable merging them. Finally, a reduction task is performed. In fact, this task is recommended in order to deliver an optimal resulting GPM where no superfluous gateways occur.

5.1. Gateway path matrices alignment

Given a set of GPMs, the alignment of a GPM consists in unifying its corresponding activity set with the rest of GPMs’ corresponding activity sets. In other words, all GPMs must contain the same set of activities. We assume that activities have already been pretreated so as to enable matching an activity in one BPF with an activity in the other BPF. This can be achieved using linguistic similarity measures [22,6] between the activity labels as in [24]. The alignment of GPMs is formally defined as follows.

Definition 6 (Gateway path matrices alignment).
Given a set of GPMs, $M = \{M_1, M_2, \ldots, M_m\}$, the alignment of the GPM, $M_i$, with the remaining GPMs is written $M_i^A = (A, GP, m)$, such that

• $M_i^A.A = \bigcup_{j \in 1, \ldots, m} M_j.A$,
• $M_i^A.GP = M_i.GP$, and
• $\forall (a_1, a_2) \in M_i^A.A^2$, then

  $$M_i^A.m(a_1, a_2) = \begin{cases} M_i.m(a_1, a_2), & \text{if } (a_1, a_2) \in M_i.A \times M_i.A \\ \emptyset, & \text{otherwise} \end{cases}$$

For instance, the alignment of the GPMs M1 and M2, respectively of Figs. 3 and 4, is given in Figs. 6 and 7.

Note that, after aligning a GPM, its corresponding BPF may break the connectivity property as some isolated activities are added. Isolated activities are identified with empty corresponding rows and columns. For instance, in Fig. 6, the activities a6 and a7 are isolated since their corresponding columns and rows are empty.

5.2. Gateway paths matrices merge

Once aligned, GPMs can be merged into a single one. Intuitively, the merge of a pair of GPM elements consists in extending the elements of one GPM with the corresponding elements of the other GPM. We formally define the merge over a pair of aligned GPMs as follows.

Definition 7 (Gateway path matrices merge).
Given a pair of aligned GPMs, $M_1^A$ and $M_2^A$, their merge is a single GPM, $M_M$, where

• $M_M.A = M_1^A.A = M_2^A.A$,
• $M_M.GP = M_1.GP \cup M_2.GP$, and
• $\forall (a_1, a_2) \in M_M.A^2$, then $M_M.m(a_1, a_2) = M_1^A.m(a_1, a_2) \cup M_2^A.m(a_1, a_2)$.

The merge extends the existing GPs between a pair of activities of one GPM with other GPs from the other GPM. Moreover, the merge would create new CFPs by lengthening the CFP between a pair of activities of one GPM with a CFP, from the other GPM, whose source (resp. target) activity is common with the target (resp. source) activity of the former CFP.

The resulting GPM from the merge of the aligned GPMs $M_1^A$ and $M_2^A$ is depicted in Fig. 8. Let us remark that the GPs p1 and p′1 are identical, which is why their union results in a single GP; we arbitrarily choose to keep p1.

Fig. 8. Resulting gateway path matrix $M_M$ of the aligned gateway path matrices $M_1^A$ and $M_2^A$, respectively depicted in Figs. 6 and 7.

After performing the merge task, the resulting GPM may break some correctness properties. For example, the merge of the elements of the row corresponding to Activity a1, namely p2, p3, p4, p5, and p′2, that were retrieved from the aligned

Fig. 9. Gateway path merge mechanism with adding convergence and divergence gateways when the source activity and target activity are shared.

GPM M1A and M2A led to breaking Property 3: the elements’ GPs do Definition 8 (Gateway path alignment).
not share the first control flow. Similarly, the merge of the Consider a pair of not aligned GPMs, M1 and M2. The alignment of a
S
elements of the row corresponding to Activity a4, namely p4, p5, p6, GP p = (o1, on, seq), where p 2 M 1 :GP M 2 :GP, o1 is a source activity
and p03 that were retrieved from the same aligned GPMs, led to and on is a target activity, is as follows:
breaking Property 4: the elements’ GPs do not share the last control
T
flow. 1. if o1 2 M 1 :A M 2 :A, then p:seq ¼ hðo1 ; g do :label Þ; ðg do :label ; o2 Þi ::
1 1
In our work, we propose to merge GPMs elements while dealing hðok ; okþ1 Þik 2 2;...;ðn1Þ i.
T
with the GPM correctness properties. That is, in order to fulfill the 2. if on 2 M 1 :A M2 :A, then p:seq ¼ hðok ; okþ1 Þik 2 1;...;ðn2Þ ::
row property (i.e. Property 3) we systematically insert a config- hðon1 ; g co :label Þ; ðg co :label ; on Þi.
n n
urable divergence gateway, whose label refers to the GP’s source
activity, between the element source activity and the successor
object. This rule is applied when the row activity is shared. Fig. 10 depicts the resulting GPM, MM, after applying the GPs
Similarly, in order to fulfill the column property (i.e. Property 4) we alignment.
systematically insert a configurable convergence gateway, whose Although the resulting GPM respects the correctness
label refers to the GP’s target activity, between the element target properties, they can be further optimized. In fact, the resulting
activity and the predecessor object. This rule is applied when the GPM can probably contain superfluous objects consisting of
column activity is shared. Property 2 holds as well, because the configurable gateways that have a single incoming control flow
length of the GPs is at least 3. The alignment task is illustrated in and a single outgoing control flow. For instance, in Fig. 10,
Fig. 9 where the row activity and column activity are shared. Note the configurable gateways g da and g ca in the GP p1 are
0 1
that the alignment is applied during the merge. superfluous. In the following, we provide an algorithm to
We formally define the GP alignment as follows. remove them.
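The two alignment rules of Definition 8 can be illustrated with a minimal Python sketch. It is not the authors' implementation: a GP is assumed to be stored as a list of control flows (pairs of object names), the function name `align_gp` is hypothetical, and the gateway names `g_d_<activity>` and `g_c_<activity>` are an illustrative naming convention for the configurable divergence and convergence gateways.

```python
from typing import List, Set, Tuple

ControlFlow = Tuple[str, str]  # (source object, target object)


def align_gp(seq: List[ControlFlow], shared: Set[str]) -> List[ControlFlow]:
    """Return an aligned copy of a gateway path's control-flow sequence.

    `shared` is assumed to hold the activities present in both GPMs.
    """
    if not seq:
        return []
    aligned = list(seq)
    source, target = aligned[0][0], aligned[-1][1]
    if source in shared:
        # Rule 1: insert a divergence gateway right after the shared source.
        o1, o2 = aligned[0]
        gate = f"g_d_{o1}"
        aligned[0:1] = [(o1, gate), (gate, o2)]
    if target in shared:
        # Rule 2: insert a convergence gateway right before the shared target.
        o_prev, o_n = aligned[-1]
        gate = f"g_c_{o_n}"
        aligned[-1:] = [(o_prev, gate), (gate, o_n)]
    return aligned


if __name__ == "__main__":
    # Both ends shared: a single control flow becomes a GP of length 3.
    print(align_gp([("a0", "a1")], {"a0", "a1"}))
```

When both the source and the target are shared, a one-flow GP grows to three control flows, which matches the remark that Property 2 holds because aligned GPs have length at least 3.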


Fig. 10. Resulting GPs for GPM MM represented in Fig. 8 after applying the GPs alignment.
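The criterion stated above for a superfluous configurable gateway (a single incoming and a single outgoing control flow) can be checked mechanically. The following Python sketch is a simplified stand-in based on that degree criterion only, not a transcription of Algorithm 1. It assumes GPs are lists of control flows, that the set of configurable gateway names is given explicitly, and that `remove_superfluous` is a hypothetical helper name.

```python
from typing import Dict, List, Set, Tuple

ControlFlow = Tuple[str, str]  # (source object, target object)


def remove_superfluous(gps: List[List[ControlFlow]],
                       configurable: Set[str]) -> List[List[ControlFlow]]:
    """Bypass configurable gateways with one distinct predecessor and successor."""
    incoming: Dict[str, Set[str]] = {g: set() for g in configurable}
    outgoing: Dict[str, Set[str]] = {g: set() for g in configurable}
    # Collect distinct predecessors and successors over all GPs.
    for seq in gps:
        for src, dst in seq:
            if dst in configurable:
                incoming[dst].add(src)
            if src in configurable:
                outgoing[src].add(dst)
    superfluous = {g for g in configurable
                   if len(incoming[g]) == 1 and len(outgoing[g]) == 1}
    reduced: List[List[ControlFlow]] = []
    for seq in gps:
        new_seq: List[ControlFlow] = []
        for src, dst in seq:
            if new_seq and src in superfluous and new_seq[-1][1] == src:
                # Fuse (x, g) and (g, y) into (x, y), dropping gateway g.
                prev_src, _ = new_seq.pop()
                new_seq.append((prev_src, dst))
            else:
                new_seq.append((src, dst))
        reduced.append(new_seq)
    return reduced


if __name__ == "__main__":
    gp = [("a0", "g_d_a0"), ("g_d_a0", "g_c_a1"), ("g_c_a1", "a1")]
    print(remove_superfluous([gp], {"g_d_a0", "g_c_a1"}))
```

A gateway that actually branches (two GPs leaving it toward different objects) has two distinct successors and is therefore kept.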

5.3. Gateway path matrix reduction

After merging the GPMs, the resulting one can be further optimized by applying a reduction rule. This rule consists in removing superfluous gateways that were added during the merge and more specifically the GPs alignment as presented in Definition 8. Such superfluous gateways occur when a pair of corresponding GPs, each one from a GPM, share the same row activity and the column activity and the rest of the row and column elements are empty. For instance, in Fig. 10, the configurable gateways $g^d_{a_0}$ and $g^c_{a_1}$ in the GP $p_1$ are superfluous and can then be removed from the GPM. Indeed, designers are not obliged to fix passing conditions of control flows involving such superfluous gateways and will eventually remove them as no necessary branching constructs exist.

Algorithm 1. Gateway path matrix reduction.

1: function GPMREDUCTION(GatewayPathSet GP)
2: begin
3:   for all $p \in GP$ do
4:     if $p.len \geq 2 \wedge p.seq(2).x = \text{'?'} \wedge \nexists p'.(p' \in GP \setminus \{p\} \wedge p'.a_s = p.a_s \wedge p'.seq(2) \neq p.seq(2))$ then // p contains a configurable divergence gateway appearing in a single control flow
5:       p.seq.removeHead
6:       p.seq.removeHead
7:       $(o_1, o_2) \leftarrow p.seq(1)$
8:       p.seq.addHead($(p.a_s, o_1)$)
9:     end if
10:    if $p.len \geq 2 \wedge p.seq(p.len - 1).x = \text{'?'} \wedge \nexists p'.(p' \in GP \setminus \{p\} \wedge p'.a_t = p.a_t \wedge p'.seq(p.len - 1) \neq p.seq(p.len - 1))$ then // p contains a configurable convergence gateway appearing in a single control flow
11:      p.seq.removeTail
12:      p.seq.removeTail
13:      $(o_1, o_2) \leftarrow p.seq(p.len - 1)$
14:      p.seq.addTail($(o_2, p.a_t)$)
15:    end if
16:  end for
17: end function

Algorithm 1 performs the reduction rule. It receives a set of GPM elements, i.e., consisting of GPs, checks whether there are unnecessary configurable gateways (line 4 and line 10) and removes them from the GPM (lines 4–8 and lines 10–14). The function removeHead (resp. removeTail) removes the first (resp. last) element of a sequence. Fig. 11 depicts the GPM, $M_M$, after performing the reduction rule.

Fig. 11. Correction of the GPs shown in Fig. 10.

6. Behavioral constraint annotations

In Section 3.3, we have mentioned that undesirable behaviors may be introduced when merging fragments. In this section, (i) we identify the undesirable behaviors and (ii) propose a set of annotations to constrain them.
First of all, one should identify newly generated behaviors and more specifically undesirable ones. A newly generated behavior appears in neither of the initial BPFs and encompasses activities among which some initially belong to the first BPF and some others to the second BPF. For instance, the scenario given in Section 2 is a new one. We recall that Activity $a_7$ would fail to execute as it strongly depends on the execution of Activity $a_6$, which has eventually not been executed. This behavior is newly generated and is said to be undesirable. Therefore, when merging a pair of BPFs, we have to ensure that initial behaviors are maintained as they have been defined by the designer and that new ones are controlled. More specifically, every activity can be performed only if its predecessors in the initial BPF it comes from have been accomplished. An intuitive solution consists in defining behavioral constraints for each activity, where each behavioral constraint would contain all the sequences of activities such that activities of a sequence appear in a CFP leading to that activity. For instance, the behavioral constraint for Activity $a_7$ would be the following: $B_{a_7} = \{\langle a_0, a_1, a_6, a_4 \rangle\}$. Consequently, Activity $a_7$ can be executed only if the activities of at least one sequence of $B_{a_7}$ have already been performed.
However, some activities do not need any annotation. Let us consider the activities $a_0$ and $a_1$ which are shared by both BPFs, $f_1$ and $f_2$. Activity $a_0$ is a start activity and does not need annotation. Activity $a_1$ can be performed only if Activity $a_0$ has already been executed. However, Activity $a_0$ is necessarily performed as no other scenario may exist. Therefore Activity $a_1$ does not need annotation. Moreover, for a given activity behavioral constraint, only some specific activities need to be represented. For example, let $B_{a_7} = \{\langle a_0, a_1, a_6, a_4 \rangle\}$ be the behavioral constraint of Activity $a_7$. We can represent either the activity $a_0$ or the activity $a_1$ as there is only one CFP traversing the activities $a_0$ and $a_1$.
In the following, we will define (i) which activities necessarily need annotation, and (ii) which activities must be represented in the behavioral constraints.
To catch the activities precedence rules, we first abstract the GPMs to consider only the activities as well as the adjacency relation between them. That is, an abstraction of a GPM is an adjacency matrix made of activities and illustrating the existence of GPs between them. In fact, this abstraction represents the execution order between the activities independently from the gateways. The abstracted-gateway path matrix (A-GPM) is formally defined as follows.

Definition 9 (Abstracted-gateway path matrix).
Given a GPM $M$, its corresponding abstraction is a matrix $M^* = (A, \{0, 1\}, m)$, where $M^*.A = M.A$, and for a pair of activities $(a_1, a_2) \in M^*.A^2$



Fig. 12. A-GPM M1 (left) and A-GPM M2 (right) of the GPMs M1 of Fig. 6 and M2 of Fig. 7, respectively.


$$M^*.m(a_1, a_2) = \begin{cases} 1, & \text{if } M.m(a_1, a_2) \neq \emptyset \\ 0, & \text{otherwise} \end{cases}$$

Fig. 12 represents the A-GPMs $M_1^*$ and $M_2^*$ of the GPMs $M_1$ and $M_2$ that are represented in Figs. 3 and 4, respectively. We also join the corresponding graphs in Fig. 13 to ease their readability.
By inspection, the behavior inside a shared portion and the behavior inside an individual portion (composed of non-shared activities) respect the behavior of the initial GPMs. We recall that the behavior of individual portions is preserved by the GP merge idempotency property (i.e., $p \cup \emptyset = p$, where $p$ is a GP) and the shared portions encompass the behavior of both initial GPMs. Therefore, an undesirable behavior may occur (i) after terminating the execution of a shared portion and moving to an individual activity of one GPM or (ii) after terminating the execution of an individual portion and moving to a shared activity. However, moving to a shared activity from an individual portion does not lead to any blocking. In fact, executing an individual portion assumes that there was no blocking. On the other hand, moving to a shared activity is performed as if we were executing only one of the initial BPFs. Thus, we can deduce that the problem occurs after terminating the execution of a shared portion and moving to an individual activity of one GPM. The latter activity may fail if some predecessors have not been accomplished according to the behavior of the initial GPM it came from.
In the following, we define regions then maximum common regions, connected portions of A-GPMs, to represent shared portions as well as individual ones and thus detect the activities that need annotation and those that need to appear in the activities' behavioral constraints. Given an A-GPM $M^*$, a region $M^r = (A, \{0, 1\}, m)$ is a connected portion of it, where $M^r.A \subseteq M^*.A$ and $\forall (a_1, a_2).((a_1, a_2) \in M^r.A^2 \Rightarrow M^r.m(a_1, a_2) = M^*.m(a_1, a_2))$. Note that the connectivity of binary graph matrices can be verified based on adjacency matrix properties. Let $R_{M^*}$ be the set of all the regions that an A-GPM $M^*$ is composed of. Given a pair of A-GPMs $M_1^*$ and $M_2^*$, and their respective region sets $R_{M_1^*}$ and $R_{M_2^*}$, the common region set corresponds to the intersection $R_{M_1^*} \cap R_{M_2^*}$.
Considering common regions may lead to duplicating the activities that appear in the behavioral constraints, which is the reason why we define maximum common regions. A Maximum Common Region (MCR) is defined as a common region between a pair of A-GPMs with maximum connected activities. Formally,

Definition 10 (Maximum common region).
Given a pair of A-GPMs, $M_1^*$ and $M_2^*$, a region $M^r \in (R_{M_1^*} \cap R_{M_2^*})$ is a maximum common region iff there are no shared adjacency relations between activities of the region $M^r$ and the other activities of the A-GPMs $M_1^*$ and $M_2^*$:

$$\forall (a_1, a_2).\big(a_1 \in M^r.A \wedge a_2 \in (M_1^*.A \setminus M^r.A) \Rightarrow (M_1^*.m(a_1, a_2) = 0 \vee M_2^*.m(a_1, a_2) = 0) \wedge (M_1^*.m(a_2, a_1) = 0 \vee M_2^*.m(a_2, a_1) = 0)\big).$$

For instance, $(\{a_0, a_1\}, \{0, 1\}, m)$ and $(\{a_4\}, \{0, 1\}, m)$ are MCRs and $(\{a_0\}, \{0, 1\}, m)$ is a common region but not an MCR since it belongs to the MCR $(\{a_0, a_1\}, \{0, 1\}, m)$. MCRs are represented by dashed boxes in Fig. 13. Given a pair of A-GPMs, let $C$ be the set of MCRs between them.
Given a pair of GPMs and the set of MCRs between them, the execution of the activities that are outgoing from the MCRs must be aware of their predecessors and the order between them. Moreover, only a single activity from each precedent MCR needs to be represented in a given activity behavioral constraint, until reaching an individual activity of the same GPM.
Therefore, for each MCR, we retrieve the individual activities of each initial A-GPM that appear in the MCR outgoing adjacency relations. These activities must be annotated with behavioral constraints to control their execution. That is, given a pair of A-GPMs $M_1^*$ and $M_2^*$ and their corresponding MCR set, for each MCR $M^r \in C$, an activity $a$ that must be annotated is such that for all $(i, j) \in \{1, 2\}^2$ where $i \neq j$

• $a \in (M_i^*.A \setminus M_j^*.A)$ and
• $\exists a'.(a' \in M^r.A \wedge M_i^*.m(a', a) = 1)$.

In the following, we provide some rules to define the behavioral constraints for the activities that need annotation. Let us consider the pair of A-GPMs, $M_1^*$ and $M_2^*$ represented in Fig. 12, and the set of MCRs, $C = \{(\{a_0, a_1\}, \{0, 1\}, m), (\{a_4\}, \{0, 1\}, m)\}$. The following rules have to be performed for each individual activity, of each A-GPM, that appears in the MCRs outgoing adjacency relations,

Fig. 13. Corresponding graphs to the A-GPMs M1 and M2 of Fig. 12.



namely $a_5$ and $a_7$. Although the activities $a_2$, $a_3$ and $a_6$ meet the above conditions, they do not need annotation as they are only preceded by an MCR involving the activities $a_0$ and $a_1$. That is, we first start a new sequence for the behavioral constraints of Activity $a_5$, $B_{a_5}$ (the same thing is applied for Activity $a_7$). We also denote by current activity and current sequence the activity we have reached and the sequence of activities we are defining in a given rule, respectively. Note that the sequences are initially empty.

1. Retrieve the activities that are the source of incoming control flows to the activity to annotate ($\{a' \mid a' \in M_1^*.A \cup M_2^*.A \wedge (M_1^*.m(a', a) = 1 \vee M_2^*.m(a', a) = 1)\}$, where $a$ is the activity to annotate and $M_1^*$ and $M_2^*$ are the input A-GPMs). Let $A'$ be the set of such activities.
   In the rest of the rules, we use $a'$ to denote a retrieved activity.
   For instance, Activity $a_4$ appears in Activity $a_5$ incoming control flows.
2. Duplicate the current sequence if there are several retrieved activities (i.e., $|A'| > 1$). Iterate points 3 and 4 for each activity $a' \in A'$ and a sequence (referred to as current sequence).
3. Insert the activity $a'$ as a head of the current sequence ($\langle a' \rangle :: seq$, where $seq$ is the current sequence).
   For example, Activity $a_4$ is inserted in the sequence. Then, we get the behavioral constraint $B_{a_5} = \{\langle a_4 \rangle\}$.
4. • If the activity $a'$ belongs to an MCR ($a' \in \bigcup_{r \in C} r.A$, where $C$ is the set of MCRs) then retrieve the activities that appear in the MCR incoming control flows, and this, from the A-GPM to which the activity to annotate belongs (i.e., $\{a'' \mid a'' \in (M_i^*.A \setminus r.A) \wedge \exists a.(a \in r.A \wedge M_i^*.m(a'', a) = 1)\}$, where $M_i^*$ is the A-GPM to which the activity to annotate belongs and $r$ is the MCR to which the activity $a'$ belongs) and go to step 2.
     For instance, Activity $a_4$ belongs to the MCR $(\{a_4\}, \{0, 1\}, m)$. Then, retrieve the activities $A' = \{a_2, a_1\}$. The sequence $\langle a_4 \rangle$ is duplicated and each activity in $A'$ is inserted as the head of each sequence. We then get the behavioral constraint $B_{a_5} = \{\langle a_2, a_4 \rangle, \langle a_1, a_4 \rangle\}$.
   • End a sequence if the retrieved activity that has been inserted is individual or if there are no more predecessors for the MCR to which the current activity belongs.
     For instance, Activity $a_2$ is individual and belongs to the same A-GPM from which Activity $a_5$ to annotate has been picked. Activity $a_1$ belongs to an MCR which has no more predecessors. Then all the sequences are ended.

For instance, the behavioral constraint of Activity $a_5$ encompasses two activity sequences: $B_{a_5} = \{\langle a_2, a_4 \rangle, \langle a_1, a_4 \rangle\}$. Activity $a_5$ can be executed only if $a_2$ then $a_4$ have been accomplished, or $a_1$ then $a_4$. Given that the second sequence encompasses only activities that belong to MCRs, Activity $a_5$ would never fail and its corresponding behavioral constraint can be left out by the designer. However, the behavioral constraint of Activity $a_7$ encompasses only one activity sequence, $B_{a_7} = \{\langle a_6, a_4 \rangle\}$, and $a_6$ does not belong to any MCR. Therefore, the designer should pay attention while finalizing the configuration of the process, more specifically, Gateway $g^d_{a_4}$.
In this section, we presented which activities should be contained in the behavioral constraints. In fact, no more activities are presented than necessary. Moreover, only key activities are annotated with behavioral constraints. The other activities are performed with respect to the way they appear in the BPF.

7. Experimental results

We have implemented our approach and tested it over an existing industrial BPF repository [12]. The experiments are conducted in order to prove the effectiveness of our approach by evaluating (i) our merge mechanism properties, (ii) the impact of the reduction task on the resulting fragment structure, (iii) the size of the resulting BPFs, as well as (iv) the scalability.
Our experiments have been implemented for a library consisting of a shared collection of 560 BPFs with 8874 different activities among which 1661 activities appear in more than one fragment, and 45507 control flows. A BPF contains on average 24 activities, 81 control flows, and 18 GPs. The biggest BPF contains 239 activities, 326 control flows and 276 GPs, and the smallest one contains 2 activities, 3 control flows and 1 GP.
The tool we have implemented requires a pair of BPFs as input. The structure of the BPFs respects the one presented in Definition 1. The matching between the input BPFs is focused mainly on the similarity between activity labels. The tool then retrieves the corresponding GPs of each BPF, maps the latter onto GPMs and performs the merge task in order to generate a single GPM that encompasses the behavior of the initial fragments. Finally, the tool performs the reduction to remove superfluous gateways and returns the resulting BPF.
Merge properties: In this part, we present how our merge respects a set of essential properties, namely the idempotency, the associativity and the commutativity. The idempotency property is important in that when a BPF is merged with itself, then the resulting BPF is exactly itself. Moreover, no behavioral annotations are added. The associativity and commutativity properties are also important in practice. In fact, given that our approach performs in a pairwise fashion, the order of inserting the BPFs can be chosen randomly and the result should be the same independently from the order. To this end, we have merged several BPFs from our library with themselves and we obtained BPFs that are identical to the initial ones. Similarly, we tested the other properties by taking several BPFs and merging them pairwise while varying their merge order. The result of each merge is always the same for the same initial BPFs.
Impact of the reduction task on the resulting fragment structure: To evaluate the impact of the reduction task, we pre-processed the fragment set in order to keep only pairs of BPFs that share at least 5 activities. We were left with 206 pairs of BPFs. Fig. 14 illustrates the number of control flows of the initial BPFs, the number of the control flows, and the control flow compression factor before and after the reduction. The results show that the reduction task improves marginally the compression factor (average from 112.89 to 112.87). In fact, the bulk of the compression is given by the merge mechanism. We recall that the reasons for adding superfluous gateways have been given in Section 5.2.
Size of the merged BPFs: In this part, we conduct our experiments over the library to compare the size of the initial BPFs with the size of the merge resulting BPF. To this aim, we use the same set as in the previous part. The comparison is performed based on the number of shared activities and the number of control flows as they both influence the complexity and the readability of the fragments. We have computed the ratio between the initial BPFs and the resulting one. This ratio is called compression factor. We distinguish two types of compression factors: the activity compression factor, and the control flow compression factor. To measure the activity compression factor, we use the following formula

$$C_A = \frac{100}{|f_1.A| + |f_2.A|} \cdot |f.A|,$$

where $f_1$ and $f_2$ are initial BPFs and $f$ is the resulting fragment. Note that $f.A = f_1.A \cup f_2.A$. An activity compression factor $C_A$ near to 50% means that the initial BPFs contain almost the same set of activities, while an activity compression factor $C_A = 100\%$ means that the initial BPFs are disjoint. Similarly, we measure the control flow compression factor

$$C_{C_f} = \frac{100}{|f_1.C_f| + |f_2.C_f|} \cdot |f.C_f|,$$

which computes the rate of the control flows that have been factored and/or inserted. A control flow

Fig. 14. Compression rate w.r.t. the control flow compression factor.

Fig. 15. Compression rate w.r.t. activity compression factor and control flow compression factor.

compression factor $C_{C_f}$ close to 50% means that the initial BPFs are very similar.
Fig. 15 represents some meaningful statistics. These statistics are realized to study the relation between the shared activities and the variation of the control flows. The columns correspond to some results of the initial BPFs as well as the merge resulting BPF, namely the number of activities ($\#A_i$), the number of GPs ($\#GP_i$), the number of control flows ($\#C_{f_i}$), and the activity and the control flow compression factors ($C_A$ and $C_{GP}$); $i \in \{1, 2, m\}$. The rows respectively correspond to the minimum, the maximum, the average, and the standard deviation. The minimum and the maximum rows are retrieved according to the global number of activities that have been merged. We also added five meaningful examples to highlight the relation between the shared activities and the variation of control flows. That is, the smaller the activity compression factor is, the higher the control flow compression factor is.
In fact, the number of control flows increases when the number of shared activities also increases. Then, the compression factor of the control flows increases when the activity compression factor decreases. This result is confirmed by Fig. 16. This figure illustrates the activity compression factor on the X axis, the control flow compression factor on the Y axis, and the linear regression represented with a solid line.

Fig. 16. Correlation between the activity compression factor and the control flow compression factor.

Actually, the variation of the control flows also depends on the shared GPs. In fact, shared GPs are made of pairs of shared activities. Thus, the control flows that are contained within those GPs are factored. Consequently, while shared activities that are not involved within shared GPs increase the number of the control flows, those contained within shared GPs decrease the number of control flows. This case is demonstrated by the fifth example (eg. 5) of Fig. 15. From the case illustrated in the average row, we deduce that the control flow compression factor is high compared to the activity compression factor. In other words, it is more likely to have shared activities with individual adjacent activities. This implies inserting new configurable gateways.
Let us consider a couple of BPFs, where $I_a$, $O_a$, and $A_a$ represent the set of shared process input activities, shared process output activities, and overall shared activities, respectively. Let also $P_G$ be the set of shared GPs. Therefore, the control flow variation is

$$\Delta(C_f) = |I_a \cup O_a| + 2\,|A_a \setminus (I_a \cup O_a)| - \Big(2 + \sum_{i \in P_G} i.len\Big).$$

In fact, for each shared input process activity (resp. output process activity) we add a single control flow after (resp. before) the activity. For each shared activity that is neither a process input nor a process output, we add two control flows, before and after the activity. A GP, in turn, involves a pair of shared activities. Thus, we do not need to add a control flow after the source activity and before the target activity. Moreover, corresponding shared GPs are consolidated into a single GP. In fact, when the variation is negative, then there is a loss of control flows and $C_{C_f} < 100$. When the variation is positive, then there is a gain of control flows and $C_{C_f} > 100$. When a BPF is merged with itself, then $\Delta(C_f) = |f_i.C_f|$ and $C_{C_f} = 50$.
Scalability: The last experiments are conducted in order to evaluate the scalability of our merge algorithm and to show that the reduction is not very important in terms of execution time. The experiments are realized on a laptop with an Intel Core i5 processor, 2.27 GHz, 4 GB memory, running Microsoft Windows 7. The results show that our merge mechanism merges BPFs in less than 1 s. It takes about 26 ms on average to perform:

• mapping BPF models onto GPM models (i.e., 9 ms)
• the GP retrieval (i.e., 12 ms)
• the GPM merge (i.e., 5 ms)

The reduction time is slightly smaller than the merge time (i.e., 4 ms on average). While the merge manipulates only the head and the tail of each GP independently from the other GPs, the reduction must in turn compare each GP with the other GPs sharing the same source and/or target and then remove superfluous gateways. However, the reduction is not very important in terms of superfluous elements to remove. In order to evaluate the scalability, we selected three (3) pairs of BPFs from the foreign banking domain. Pairs 2 and 3 were constructed based on smaller BPFs. Small BPFs concern the domiciliation of foreign trade titles, letter of credit, and

Fig. 17. Compression rate w.r.t. activity compression factor and control flow compression factor.

international guarantee modules. The former module intervenes only in the payment phase of the latter two modules. The latter two modules are similar in validation, in collecting commission, and in payment. The first pair involves small BPFs. The second pair involves big BPFs with small similarity between the activities and the third pair involves big BPFs with high similarity between the activities. A fourth pair is simulated and involves a huge (not realistic) set of activities with high similarity. The results are shown in Fig. 17. The execution shows that increasing the number of the activities and the control flows as well as the similarity of the BPFs does not alter the execution duration very much.
Please note that the resulting BPFs are almost similar to those obtained by the company analyst team. Pairs 2 and 3 were compared as if the small BPFs they contain were merged separately. By inspection, we discovered that the differences reside in the analysts' composition choices (removal or addition of activities). We also remarked an optimization of the composition delays. In fact, the analysts reported that a composition of BPFs takes them at least 1 man-hour for a set of small BPFs and much more for bigger BPFs especially when they depict important overlapping structures. The fourth pair shows that the execution is performed in fair delays.

8. Related work

Managing business processes is an important field allowing organizations to manage knowledge that is encapsulated within business processes. Among notable process management areas, we cite similarity search, merging, variant management, and reuse [8]. The first step in working on process collections relies on process matching. La Rosa et al. [23,9] propose to detect approximate clones of a set of process models to manage repositories. Dijkman et al. [5] present similarity search techniques while focusing on activities and control flows.
One of the notable process discovery techniques consists in trace clustering called SMD (Slice, Mine, Dice). Logs are sliced into clusters, each one encompassing similar traces. Then, a process is discovered for each cluster. In [14], authors use SMD to enhance existing techniques by rendering them complexity-aware and fitness-aware. In our work, we suppose that fragments have already been selected.
In the context of merging processes, Assy et al. [1] propose an approach for retrieving and merging existing business process fragments around particular activities. The activities are represented on layers organized in a tree structure. Each layer represents the distance between activities. Activities in adjacent layers are linked by means of an edge, each one describing the existing connectors (i.e., events and gateways) between the activities. Redundant activities on different layers are then merged in the layer having the least layer index. Authors also provide merge rules to fuse edges with similar source and target activities. These rules consist in adding split (resp. merge) connectors after (resp. before) an activity when it is followed (resp. preceded) by activities from several fragments. La Rosa et al. [25] address the problem of merging variants of business processes into a consolidated one. The objective is to provide a consolidated business process which encompasses the behaviors of the input processes and permits tracing them back. The similarity mechanism is used to retrieve 'Maximum Common Regions' consisting of common nodes and edges, and create a unique version in the merged process. Individual regions are then glued to those regions using either a convergence or a divergence connector. In this approach, all edges are annotated with information referring to the process from which they were extracted. The work by Assy et al. [1] lacks precedence constraints and the behaviors of the initial fragments may be lost after the merge task. While the work by La Rosa et al. [25] overcomes this issue through annotating edges with their provenance, there is no need to annotate all of them. This leads to heavily annotated processes, while only particular edges should be annotated to allow deriving the input processes. Moreover, while these approaches manage process variation, our work focuses on fragments composition to generate new value-added fragments.
A work by Eberle et al. [10] proposes a couple of syntactical operations, i.e., composition and decomposition operations, to enable weaving process fragments with each other. That is, given a process fragment model, the composition operation extends it by another fragment model specification. The composition operation however supposes that fragment models to compose must be free of overlapping elements. The decomposition operation is proposed to cut out parts of a fragment model before fragments can be composed. Both operations are enacted given a parametrization function which provides couples of dangling control flows to weave with each other. However, no concrete guidelines have been proposed to generate the parametrization of dangling control flows to weave.
On behavior handling, Gottschalk et al. [15] propose a three-phase approach to merge two business process models into a single one without controlling the behavior of the original models. In their work, authors identify the behavior of the initial process models as well as the newly generated behaviors. That is, the input process models are first reduced to only activities which represent the active behavior of a process and gateway types are reported into arc labels. The resulting graphs are then merged into another super-graph thus comprising the initial behaviors and the added ones. However, even if this approach addresses the behavior of the process models, it loses it during the model reduction and the merge and eventually permits new behaviors to occur without informing the designer. Moreover, all the approaches stated above do not address the problem of BPF behavior inconsistencies. A work by Kunze et al. proposes to compute the processes' similarities through their behavioral profiles [18]. A behavioral profile is an abstract representation of a business process model. It is generated from process execution traces. It aims to catch the order relations

between each pair of process activities. Three relations can be distinguished: strict order (i.e., one-way relation), exclusive order (i.e., no relation) and interleaving order (i.e., two-way relation). The behavioral profiles can be used to identify the similarities between a pair of processes and retrieve the predecessor activities necessary for a given activity execution. However, processes are not always delivered with the traces needed to deduce their behavioral profiles. This issue also applies to BPFs.

In configuring business processes, Assy et al. [3] propose to automate configuration guidance models to guide process designers in finalizing the definition of configurable business processes. Guidance settings are based on past user experience. In [2], the authors provide configuration rules based on selected configuration frequency. Along the same lines, Schunselaar et al. [29] propose an approach to better support designers in configuring process models by using general concepts closely related to the current context in which the configurable process is used.

9. Conclusion

In this work, we provided a novel approach to help process designers in designing new business processes using process fragments. The latter are retrieved, beforehand, from existing processes. Our contribution is twofold. On the one hand, it allows the merge of two business process fragments. On the other hand, it helps overcome the issues raised by undesirable behaviors which may be generated during the merge task.

Our systematic merge mechanism is based on a particular matrix that we call gateway path matrix. Each element of the matrix corresponds to the paths between a pair of adjacent activities. That is, each fragment is represented with a gateway path matrix, and the elements of the two matrices that share similar source and target activities are merged in a pairwise fashion, independently from the rest of the elements. Furthermore, we provided the resulting fragment with a set of behavioral constraints so as to keep the behaviors of the initial fragments and avoid undesirable ones.

It is noteworthy that our merge does not allow any correction until after the new merged process is delivered. A manual approach, on the other hand, permits us to check and control the fragments even during the merge, thus enhancing the resulting fragment.

In the future, we aim to take into account and integrate designers' past experience within the merge task to provide correction suggestions. We also intend to prevent privacy issues in the merge task. These issues come from the fact that merging two fragments may lead to the disclosure of sensitive information compiled from both of them.

Appendix A. Retrieving CFPs algorithm

Algorithm 2. Control flow paths extraction

 1: function CFPExtraction(ControlFlowSet $Cf$): CFPSet
 2:   CFPSet $P \leftarrow \emptyset$, boolean $remaining \leftarrow true$
 3:   $P \leftarrow \{(o_s, o_t, \langle (o_s, o_t) \rangle) \mid (o_s, o_t) \in Cf\}$
 4:   while $remaining$ do
 5:     $remaining \leftarrow false$
 6:     for all $(o_s, o_t, seq) \in P$ do
 7:       ObjectSet $O_{before} \leftarrow Cf^{-1}[\{o_s\}]$, $O_{after} \leftarrow Cf[\{o_t\}]$
 8:       if $O_{before} \neq \emptyset$ then
 9:         for all $o \in O_{before} \wedge \neg\, seq.objectOf(o)$ do
10:           if $(o, o_t, (o, o_s) :: seq) \notin P$ then
11:             $remaining \leftarrow true$
12:             $P \leftarrow P \cup \{(o, o_t, (o, o_s) :: seq)\}$
13:           end if
14:         end for
15:       end if
16:       if $O_{after} \neq \emptyset$ then
17:         for all $o \in O_{after} \wedge \neg\, seq.objectOf(o)$ do
18:           if $(o_s, o, seq :: (o_t, o)) \notin P$ then
19:             $remaining \leftarrow true$
20:             $P \leftarrow P \cup \{(o_s, o, seq :: (o_t, o))\}$
21:           end if
22:         end for
23:       end if
24:     end for
25:   end while
26:   return $P$
27: end function

Given the control flow set of a given BPF, Algorithm 2 generates the corresponding set of CFPs. It first generates trivial CFPs that are made of a single control flow (line 3). Then, it searches for CFPs that are made of several control flows (lines 4–25) by extending the already extracted CFPs with a new control flow among the BPF control flow set, either at the beginning of the selected CFP or at its end. The symbol :: denotes the operator that adds an element at the head/tail of a sequence. To this aim, it handles a Boolean variable, remaining, to check whether there are additional CFPs to generate. The algorithm has remaining CFPs to generate if there are other control flows that can extend some generated CFPs (line 7). A CFP can be extended with a control flow if the source object of the CFP is similar to the target object of the control flow (lines 8–15), or if the target object of the CFP is similar to the source object of the control flow (lines 16–23). The conditions in lines 9 and 17 avoid infinite loops that may be caused by cycles, where the function objectOf checks whether an object appears in a sequence or not. The conditions in lines 10 and 18 check whether the extended CFP already exists in the CFP set or not, which avoids inserting already existing CFPs.

Appendix B. Properties correctness proofs

Correctness properties are hereafter proved.

• When an element of a GPM contains a GP of length 1, then the element contains no more GPs and the rest of the elements of the row and the column must be empty. In fact, the GP source and target activities are directly linked by means of a control flow where an activity has at least one incoming control flow and one outgoing control flow. Formally,

Property 2. Given a fragment matrix $M = (A, GP, m)$, then
$\forall (a_1, a_2).\big((a_1, a_2) \in A \times A \wedge (\exists p.(p \in m(a_1, a_2) \wedge p.len = 1)) \Rightarrow |m(a_1, a_2)| = 1 \wedge \forall a_3.((a_3 \in A \setminus \{a_2\} \Rightarrow m(a_1, a_3) = \emptyset) \wedge (a_3 \in A \setminus \{a_1\} \Rightarrow m(a_3, a_2) = \emptyset))\big)$.

Proof. Let $a_1$ and $a_2$ be a pair of activities of $A$ such that there exists a GP $p_1 \in m(a_1, a_2)$ and $p_1.len = 1$. We recall that $p_1$ is a GP having the form $(a_1, a_2, \langle o_i, o_{i+1} \rangle_{i \in 1..(n-1)})$ with $o_1 = a_1$ and $o_n = a_2$. Therefore, $p_1 = (a_1, a_2, \langle a_1, a_2 \rangle)$.

For any activity $a_3 \in A$, let us suppose that there exists $p_2 \in m(a_1, a_3)$ and $p_2 \neq p_1$. That is, $p_2 = (a_1, a_3, \langle o_i, o_{i+1} \rangle_{i \in 1..(n-1)})$
with $o_1 = a_1$, $o_n = a_3$, and $(o_i, o_{i+1}) \in Cf$. Since $p_1 \neq p_2$, $Cf[\{a_1\}]$ contains at least $a_2$ and $o_2$ ($o_2$ is the object contained in the GP $p_2$). Thus, $|Cf[\{a_1\}]| \geq 2$, which contradicts Definition 1.

Similarly, for any activity $a_3 \in A$, let us suppose that there exists $p_2 \in m(a_3, a_2)$ and $p_2 \neq p_1$. That is, $p_2 = (a_3, a_2, \langle o_i, o_{i+1} \rangle_{i \in 1..(n-1)})$ with $o_1 = a_3$, $o_n = a_2$, and $(o_i, o_{i+1}) \in Cf$. Since $p_1 \neq p_2$, $Cf^{-1}[\{a_2\}]$ contains at least $a_1$ and $o_{n-1}$ ($o_{n-1}$ is the object contained in the GP $p_2$). Thus, $|Cf^{-1}[\{a_2\}]| \geq 2$, which contradicts Definition 1. □

• When several GPs appear in a single element (resp. in several elements) of a given row, then the outgoing control flow from the row activity of each GP is shared by all of them. This means that there are many GPs diverging from the row activity to the same column activity (resp. to several column activities). Formally,

Property 3. Given a fragment matrix $M = (A, GP, m)$, then
$\forall (a_i, a_j, a_k).\big((a_i, a_j, a_k) \in A^3 \wedge m(a_i, a_j) \neq \emptyset \wedge m(a_i, a_k) \neq \emptyset \Rightarrow \forall (p_1, p_2).(p_1 \in m(a_i, a_j) \wedge p_2 \in m(a_i, a_k) \Rightarrow p_1.seq.first = p_2.seq.first)\big)$,
where, given a sequence of elements $E = \langle a_0, e_2, \ldots, e_n \rangle$, $E.first$ returns the first element $a_0$.

Proof. Let $a_1$, $a_2$ and $a_3$ be three activities of $A$ such that there exists $p_1 \in m(a_1, a_2)$, $p_2 \in m(a_1, a_3)$ and $p_1 \neq p_2$. We recall that $p_1$ is a GP that describes a path between $a_1$ and $a_2$ having the form $(a_1, a_2, \langle o_i, o_{i+1} \rangle_{i \in 1..(n-1)})$ with $o_1 = a_1$ and $o_n = a_2$, and $p_2$ is a GP that describes a path between the activities $a_1$ and $a_3$ having the form $(a_1, a_3, \langle o'_i, o'_{i+1} \rangle_{i \in 1..(n-1)})$ with $o'_1 = a_1$ and $o'_n = a_3$. Let us suppose that $p_1.seq.first \neq p_2.seq.first$. That is, $(o_1, o_2) \neq (o'_1, o'_2)$ with $o_1 = o'_1 = a_1$, then $o_2 \neq o'_2$. Therefore, $Cf[\{a_1\}]$ contains at least $o_2$ and $o'_2$, so $|Cf[\{a_1\}]| \geq 2$, which contradicts Definition 1. □

• When several GPs appear in a single element (resp. in several elements) of a given column, then the incoming control flow to the column activity of each GP is shared by all of them. This means that there are many GPs converging to the column activity from the same row activity (resp. from several row activities). Formally,

Property 4. Given a fragment matrix $M = (A, GP, m)$, then
$\forall (a_i, a_j, a_k).\big((a_i, a_j, a_k) \in A^3 \wedge m(a_j, a_i) \neq \emptyset \wedge m(a_k, a_i) \neq \emptyset \Rightarrow \forall (p_1, p_2).(p_1 \in m(a_j, a_i) \wedge p_2 \in m(a_k, a_i) \Rightarrow p_1.seq.last = p_2.seq.last)\big)$,
where, given a sequence of elements $E = \langle a_0, e_2, \ldots, e_n \rangle$, $E.last$ returns the last element $e_n$.

Proof. Let $a_1$, $a_2$ and $a_3$ be three activities of $A$ such that there exists $p_1 \in m(a_1, a_2)$, $p_2 \in m(a_3, a_2)$ and $p_1 \neq p_2$. We recall that $p_1$ is a GP that describes a path between $a_1$ and $a_2$ having the form $(a_1, a_2, \langle o_i, o_{i+1} \rangle_{i \in 1..(n-1)})$ with $o_1 = a_1$ and $o_n = a_2$, and $p_2$ is a GP that describes a path between the activities $a_3$ and $a_2$ having the form $(a_3, a_2, \langle o'_i, o'_{i+1} \rangle_{i \in 1..(n-1)})$ with $o'_1 = a_3$ and $o'_n = a_2$. Let us suppose that $p_1.seq.last \neq p_2.seq.last$. That is, $(o_{n-1}, o_n) \neq (o'_{n-1}, o'_n)$ with $o_n = o'_n = a_2$, then $o_{n-1} \neq o'_{n-1}$. Therefore, $Cf^{-1}[\{a_2\}]$ contains at least $o_{n-1}$ and $o'_{n-1}$, so $|Cf^{-1}[\{a_2\}]| \geq 2$, which contradicts Definition 1. □

References

[1] N. Assy, N.N. Chan, W. Gaaloul, Assisting business process design with configurable process fragments, in: IEEE SCC, 2013, 535–542.
[2] N. Assy, W. Gaaloul, Configuration rule mining for variability analysis in configurable process models, in: Service-oriented Computing – 12th International Conference, ICSOC 2014, Paris, France, November 3–6, 2014, 2014, 1–15.
[3] N. Assy, W. Gaaloul, Extracting configuration guidance models from business process repositories, in: Business Process Management – 13th International Conference, BPM 2015, Innsbruck, Austria, August 31–September 3, 2015, 2015, 198–206.
[4] A. Caetano, A. Assis, J.M. Tribolet, Using business transactions to analyse the consistency of business process models, in: HICSS, 2012, 4277–4285.
[5] R.M. Dijkman, M. Dumas, L. García-Bañuelos, Graph matching algorithms for business process model similarity search, in: Business Process Management, 7th International Conference, BPM 2009, Ulm, Germany, September 8–10, 2009, 2009, 48–63.
[6] R.M. Dijkman, M. Dumas, L. García-Bañuelos, R. Käärik, Aligning business process models, in: Proceedings of the 13th IEEE International Enterprise Distributed Object Computing Conference, Auckland, New Zealand, 2009, pp. 45–53.
[7] R.M. Dijkman, M. Dumas, B.F. van Dongen, R. Käärik, J. Mendling, Similarity of business process models: metrics and evaluation, Inf. Syst. (2011) 498–516.
[8] R.M. Dijkman, M. La Rosa, H.A. Reijers, Managing large collections of business process models – current techniques and challenges, Comput. Ind. 63 (2) (2012) 91–97.
[9] M. Dumas, L. García-Bañuelos, M. La Rosa, R. Uba, Fast detection of exact clones in business process model repositories, Inf. Syst. 38 (4) (2013) 619–633.
[10] H. Eberle, F. Leymann, D. Schleicher, D. Schumm, T. Unger, Process fragment composition operations, in: 5th IEEE Asia-Pacific Services Computing Conference, APSCC, 2010, 157–163.
[11] H. Eberle, T. Unger, F. Leymann, Process fragments, in: On the Move to Meaningful Internet Systems: OTM 2009, vol. 5870 of Lecture Notes in Computer Science, Springer, 2009, pp. 398–405.
[12] D. Fahland, C. Favre, J. Koehler, N. Lohmann, H. Völzer, K. Wolf, Analysis on demand: instantaneous soundness checking of industrial business process models, Data Knowl. Eng. (2011).
[13] W.B. Frakes, K. Kang, Software reuse research: status and future, IEEE Trans. Softw. Eng. 31 (7) (2005) 529–536.
[14] L. García-Bañuelos, M. Dumas, M. La Rosa, J. De Weerdt, C.C. Ekanayake, Controlled automated discovery of collections of business process models, Inf. Syst. 46 (2014) 85–101.
[15] F. Gottschalk, W.M.P. van der Aalst, M.H. Jansen-Vullers, Merging event-driven process chains, in: OTM Conferences (1), 2008, 418–426.
[16] F. Gottschalk, W.M.P. van der Aalst, M.H. Jansen-Vullers, M. La Rosa, Configurable workflow models, Int. J. Cooperative Inf. Syst. 17 (2) (2008) 177–221.
[17] Object Management Group, Business Process Modeling Notation (BPMN), Version 2.0, January 2009.
[18] M. Kunze, M. Weidlich, M. Weske, Behavioral similarity – a proper metric, in: Business Process Management – 9th International Conference, 2011, 166–181.
[19] F. Leymann, D. Roller, Business Processes in a Web Services World: A Quick Overview of BPEL4WS, IBM Software Group, 2002, pp. 2–28.
[20] C. Ouyang, M. Dumas, A.H.M. ter Hofstede, W.M.P. van der Aalst, Pattern-based translation of BPMN process models to BPEL web services, Int. J. Web Serv. Res. (JWSR) 5 (1) (2007) 42–62.
[21] C. Ouyang, M. Dumas, A.H.M. ter Hofstede, W.M.P. van der Aalst, From BPMN process models to BPEL web services, in: ICWS, IEEE, 2006, pp. 285–292.
[22] E. Rahm, P.A. Bernstein, A survey of approaches to automatic schema matching, VLDB J. 10 (4) (2001) 334–350.
[23] M. La Rosa, M. Dumas, C.C. Ekanayake, L. García-Bañuelos, J. Recker, A.H.M. ter Hofstede, Detecting approximate clones in business process model repositories, Inf. Syst. 49 (2015) 102–125.
[24] M. La Rosa, M. Dumas, R. Uba, R.M. Dijkman, Merging business process models, in: OTM Conferences (1), 2010, 96–113.
[25] M. La Rosa, M. Dumas, R. Uba, R.M. Dijkman, Business process model merging: an approach to business process consolidation, ACM Trans. Softw. Eng. Methodol. 22 (2) (2013) 11.
[26] M. La Rosa, H.A. Reijers, W.M.P. van der Aalst, R.M. Dijkman, J. Mendling, M. Dumas, L. García-Bañuelos, APROMORE: an advanced process model repository, Expert Syst. Appl. 38 (6) (2011) 7029–7040.
[27] D. Schumm, D. Karastoyanova, O. Kopp, F. Leymann, M. Sonntag, S. Strauch, Process fragment libraries for easier and faster development of process-based applications, J. Syst. Integr. 2 (1) (2011) 39–55.
[28] D. Schumm, F. Leymann, Z. Ma, T. Scheibler, S. Strauch, Integrating compliance into business processes: process fragments as reusable compliance controls, in: Proceedings of the Multikonferenz Wirtschaftsinformatik (MKWI10), 2010.
[29] D.M.M. Schunselaar, H. Leopold, H.M.W. Verbeek, W.M.P. van der Aalst, H.A. Reijers, Configuring configurable process models made easier: an automated approach, in: Business Process Management Workshops – BPM 2014 International Workshops, Eindhoven, The Netherlands, September 7–8, 2014, Revised Papers, 2014, 105–117.
[30] V. Seidita, M. Cossentino, S. Gaglio, A Repository of Fragments for Agent System Design, WOA, 2006.
[31] S. Smirnov, R.M. Dijkman, J. Mendling, M. Weske, Meronymy-based aggregation of activities in business process models, in: Conceptual Modeling – ER 29th International Conference on Conceptual Modeling, Vancouver, BC, Canada, 2010, 1–14.
[32] S. Smirnov, M. Weidlich, J. Mendling, Business process model abstraction based on synthesis from well-structured behavioral profiles, Int. J. Cooperative Inf. Syst. 21 (1) (2012) 55–83.
[33] S. Smirnov, M. Weidlich, J. Mendling, M. Weske, Action patterns in business process models, in: Service-oriented Computing, 7th International Joint Conference, ICSOC-ServiceWave 2009, Stockholm, Sweden, 2009, 115–129.
[34] R.J. Wilson, An Introduction to Graph Theory, Pearson Education India, 1970.
[35] M.A. Zemni, A. Mammar, N.B. Hadj-Alouane, Formal approach for generating privacy preserving user requirements-based business process fragments, in: ACSC, 2014, 89–98.
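
Supplementary sketch. The control flow path extraction of Algorithm 2 (Appendix A) can be illustrated with a short Python sketch. It is not the authors' implementation: the names `cfp_extraction` and `object_of` are hypothetical, objects are assumed to be plain strings, a control flow is assumed to be a (source, target) pair, and `objectOf` is read here as "the object is an endpoint of some control flow in the sequence". The loop iterates over a snapshot of the CFP set, which is one possible way to realize the "for all" of line 6 while the set grows.

```python
from typing import Set, Tuple

Flow = Tuple[str, str]                    # control flow (source, target)
CFP = Tuple[str, str, Tuple[Flow, ...]]   # (source, target, flow sequence)


def object_of(seq: Tuple[Flow, ...], obj: str) -> bool:
    """Assumed reading of objectOf: obj is an endpoint of a flow in seq."""
    return any(obj in flow for flow in seq)


def cfp_extraction(cf: Set[Flow]) -> Set[CFP]:
    """Sketch of Algorithm 2: build all CFPs from a control flow set."""
    # Line 3: trivial CFPs made of a single control flow.
    paths: Set[CFP] = {(s, t, ((s, t),)) for (s, t) in cf}
    remaining = True
    while remaining:                                  # lines 4-25
        remaining = False
        for (s, t, seq) in list(paths):               # snapshot of P
            before = {a for (a, b) in cf if b == s}   # Cf^-1[{o_s}]
            after = {b for (a, b) in cf if a == t}    # Cf[{o_t}]
            # Lines 8-15: prepend a flow ending at the CFP source.
            candidates = [(o, t, ((o, s),) + seq)
                          for o in before if not object_of(seq, o)]
            # Lines 16-23: append a flow starting at the CFP target.
            candidates += [(s, o, seq + ((t, o),))
                           for o in after if not object_of(seq, o)]
            for cand in candidates:
                if cand not in paths:                 # lines 10 and 18
                    paths.add(cand)
                    remaining = True
    return paths


# Example: an activity "a" followed by a gateway "g" that splits to "b" and "c".
flows = {("a", "g"), ("g", "b"), ("g", "c")}
for source, target, seq in sorted(cfp_extraction(flows)):
    print(source, target, seq)
```

Because of the `object_of` guard, a path is never extended with an object it already visits, so the sketch terminates on cyclic control flow sets, in line with the role of the conditions in lines 9 and 17.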
Mohamed Anis Zemni received the Bachelor's degree in Computer Sciences from the High Institute of Management, Tunis, Tunisia, in 2008 and the Master's degree in computer engineering from the National Institute of Applied Sciences, Lyon, France, in 2009. Currently, he is a PhD student at the University of Manouba, Tunis, Tunisia. He is a member of the OASIS research lab at the National School of Engineers of Tunis, and his research activities focus on Business Process Management.

Amel Mammar received the Ph.D. degree in computer sciences from the National Conservatory of Arts and Crafts (CNAM Paris), Paris, France, in 2002 and the accreditation to supervise research at the University of Créteil, Paris, France. Currently, she holds the position of an Associate Professor at Télécom SudParis School, Evry, France. Her research activities focus on software, languages and model checking and verification, safety engineering, traffic safety, accident analysis, and business process management.

Nejib Ben Hadj-Alouane received the B.S. degree in computer engineering from Syracuse University, Syracuse, NY, in 1984, and the M.S.E. and Ph.D. degrees in computer information and control engineering from the University of Michigan, Ann Arbor, in 1986 and 1994, respectively. Currently, he holds the position of a Professor at the National School of Engineers of Tunis (ENIT), University of Tunis El Manar, Tunisia. His research activities focus on discrete-event and hybrid systems, security in computer networks and systems, and issues in real-time and embedded systems, as well as Web services and their composition.
