Abstract—Secret sharing and erasure coding-based approaches have been used in distributed storage systems to ensure the
confidentiality, integrity, and availability of critical information. To achieve performance goals in data accesses, these data
fragmentation approaches can be combined with dynamic replication. In this paper, we consider data partitioning (both secret sharing
and erasure coding) and dynamic replication in data grids, in which security and data access performance are critical issues. More
specifically, we investigate the problem of optimal allocation of sensitive data objects that are partitioned by using a secret sharing
scheme or an erasure coding scheme and/or replicated. The grid topology we consider consists of two layers. In the upper layer, multiple
clusters form a network topology that can be represented by a general graph. The topology within each cluster is represented by a tree
graph. We decompose the share replica allocation problem into two subproblems: the Optimal Intercluster Resident Set Problem
(OIRSP) that determines which clusters need share replicas and the Optimal Intracluster Share Allocation Problem (OISAP) that
determines the number of share replicas needed in a cluster and their placements. We develop two heuristic algorithms for the two
subproblems. Experimental studies show that the heuristic algorithms achieve good performance in reducing communication cost and
are close to optimal solutions.
Index Terms—Secure data, secret sharing, erasure coding, replication, data grids.
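Before the formal model, the (m, k) partitioning primitive itself can be made concrete. The sketch below is our own minimal illustration of a Shamir-style (m, k) secret sharing scheme (the prime field, function names, and parameters are our choices, not the paper's): any k of the m shares reconstruct d, while fewer than k reveal nothing about it.

```python
# Toy (m, k) secret sharing over a prime field: any k of the m shares
# reconstruct the secret; fewer than k reveal nothing about it.
# Field size and helper names are illustrative choices, not the paper's.
import random

P = 2**61 - 1  # a Mersenne prime, large enough for small integer secrets

def make_shares(secret, m, k):
    """Split `secret` into m shares; any k of them recover it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    # Share i is the degree-(k-1) polynomial evaluated at x = i.
    return [(i, sum(c * pow(i, e, P) for e, c in enumerate(coeffs)) % P)
            for i in range(1, m + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over GF(P)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = make_shares(123456789, m=5, k=3)   # m = 5, k = 3: any 3 of 5 suffice
assert reconstruct(shares[:3]) == 123456789
assert reconstruct(shares[2:]) == 123456789
```

An erasure coding scheme would evaluate a similar polynomial but derive its coefficients from d itself, which is why the text below notes that encryption is then needed for confidentiality.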
1 INTRODUCTION
graph topology G^C = (H^C, E^C). Here, H^C = {H1, ..., HM}, and E^C is the set of edges connecting the clusters. Each edge represents a logical link that may span multiple hops of the physical links. It is likely that the clusters are linked to the backbone and should be modeled by a general graph. Within each cluster, there may be many subnets from the same or multiple institutions. Among all the physical nodes in the cluster, some nodes, such as servers, proxies, and other individual nodes, may be committed to contributing their storage and/or computation resources to some data grid applications. These nodes are connected via logical links. According to [23], Internet message routing is relatively stable over days or even weeks, and the multiple routing paths generally form a tree. Thus, for simplicity, we model the topology inside a cluster as a tree. Consider a cluster Hx. Let Gx = (Vx, Ex) represent the topology graph within the cluster, where Vx = {P_x,1, P_x,2, ..., P_x,Nx} denotes the set of Nx (N if only considering cluster Hx) nodes in cluster Hx, and Ex is the set of edges connecting nodes in Hx. Also, let P_x^root denote the root node in Hx (P_x^root ∈ Vx). We assume that all traffic in Hx goes through the network where P_x^root resides. Let ρ(P_x,i, P_x,j) denote the shortest path between P_x,i and P_x,j in Hx, and |ρ(P_x,i, P_x,j)| denote the distance of ρ(P_x,i, P_x,j). Also, let ρ(Hx, Hy) denote the shortest path between Hx and Hy (actually, between P_x^root and P_y^root), and |ρ(Hx, Hy)| denote the distance of ρ(Hx, Hy). We assume that |ρ(P_x,i, P_x,j)| for any i, j, and x is much less than |ρ(Hy, Hz)| for any y and z, where y ≠ z (i.e., the distance between any two nodes within a cluster is less than the distance between any two clusters).

The data grid (represented by the set of clusters H^C) hosts a set of data objects D (D can contain the application data or keys). One of the clusters is selected as the Master Server Cluster (MSC) for some data objects in D, denoted as H_MSC (different data objects may have different H_MSC). H_MSC hosts these data objects permanently (it may be the original data source). These data objects may be partially replicated in other clusters in H^C to achieve better access performance. Due to the increasing attacks on the Internet, a node hosting some data objects in D has a significant chance of being compromised. If a node is compromised, all the plaintext data objects stored on it are compromised. If a storage node storing some encrypted data is compromised and the nodes maintaining the corresponding encryption keys are also compromised (note that they may be the same nodes), then the data are compromised. We assume that the probabilities of compromising two different storage nodes are not correlated. This is true for many new attacks, such as viruses that are spread through emails and the major buffer overflow attacks.

To cope with potential threats, the data partitioning technique is used. Each data object d (d ∈ D) is partitioned into m shares. Major data partitioning schemes include secret sharing [17], [20], [28], [37] and erasure coding [14], [15], [16], [24], [37]. In an (m, k) secret sharing scheme, m shares are computed from a data object d using a polynomial with k − 1 randomly chosen coefficients and distributed over m servers. d can be determined uniquely with any k shares, and no information about d can be inferred with fewer than k shares. Secret sharing schemes incur an m/(m − k)-fold storage waste (the rest are for improving availability and performance). If the storage space is a concern, then erasure coding schemes can be used. An erasure coding scheme uses the same mathematics except that the k − 1 coefficients of the polynomial are dependent on d. Thus, partial information may be inferred with fewer than k shares, and hence, encryption is needed for confidentiality assurance. Generally, the encryption keys are secret shared and distributed with the data. Erasure coding schemes achieve the best storage efficiency, even when compared with replication [14], [15], [16], [24]. The access performance of the secret sharing, erasure coding, and replication (with secret shared keys) schemes is approximately the same. Here, we do not limit the data partitioning scheme (as long as it is secure). To ensure that the secret data can be reconstructed even when k − 1 nodes are compromised, we require m ≥ 2k − 1. Let l denote the number of distinct shares to be accessed for each read request (it is fixed for all read requests). We have k ≤ l ≤ m. If l > k, the original data can be reconstructed and the validity of the shares can be checked. The parameter l can be determined for each specific application system depending on its needs.

In many applications, data could be read as well as updated by clients from geographically distributed areas. For example, in a major rescue mission or a disaster relief act, the problem areas and resources need to be updated in real time to facilitate dynamic and effective planning. Net-centric command and control systems rely on the GIG for their dynamic information flow, and frequently, critical data need to be updated on the fly to support agility. Also, updates in storage systems for encryption keys can be quite frequent due to the changes in membership and access privileges of the individuals. In our model, for each update request, all shares and share replicas need to be updated using a primary lazy update protocol such as that discussed in [10] and [16]. Generally, eager update is not feasible in widely distributed systems since it takes too long to finish the updates. Also, the large-scale network may be partitioned and some clusters may be temporarily unreachable. Thus, a lazy update is more suitable. Furthermore, a primary copy is frequently used to avoid system delusion when the system size is large or the update ratio is high [10]. Based on the primary lazy update protocol, all update requests are first forwarded to H_MSC for execution, and the updates are then propagated to other clusters along a minimum spanning tree (MST) as described in [34]. Consistency can also be maintained periodically using a distributed vector clock [16], [17] without concerning node failures or network partitioning. Moreover, various update execution protocols can be chosen flexibly to further achieve Byzantine fault tolerance [16] and/or high security assurance [17].

More details of the read and update protocols and their costs will be discussed in the next section. In this paper, we assume secure communication channels for the delivery of data shares. Standard encryption protocols, such as SSL, can be used to achieve this.

2.1 Access Model and Problem Decomposition
Data placement decisions are made based on historical client access patterns. We model access patterns by analyzing the
TU ET AL.: SECURE DATA OBJECTS REPLICATION IN DATA GRID 53
number of read/write accesses from each node or each cluster. Consider a data object d. Let T denote the time period unit for collecting information on access patterns. Let A^r(P_x,i) and A^w(P_x,i) denote the numbers of read and write accesses, respectively, initiated from node P_x,i over time T. Also, let A^r(Hx) and A^w(Hx) denote the numbers of read and write accesses, respectively, initiated from a cluster Hx over time T: A^r(Hx) = Σ_i A^r(P_x,i) and A^w(Hx) = Σ_i A^w(P_x,i). Let w^C denote the total number of update requests on d in G^C, i.e., w^C = Σ_{Hx} A^w(Hx).

Based on the client access patterns, subsets of the m shares of each data object in D may be replicated to the clusters in H^C. The set of clusters that hold shares is defined as the cluster level resident set. Let R^C denote the cluster level resident set, i.e., R^C = {Hx | cluster Hx holds shares}. To minimize the communication cost, we consider that a cluster holds either none or at least l distinct share replicas (this will be proven in Theorem 3.1). Also, we assume that each cluster holds only distinct shares (i.e., at most m) since, otherwise, extra efforts are required to avoid reading duplicated shares (a large m value ensures that sufficient distinct shares can be allocated to each cluster). The intracluster residence set is the set of nodes that hold a share replica within a cluster. Let Rx denote the intracluster residence set of Hx, i.e., Rx = {P_x,i | P_x,i holds a share and P_x,i is in Hx}. Correspondingly, |Rx| denotes the number of share replicas in Hx. We say Rx (or R^C) is connected if and only if every node in Rx (or R^C) has at least one path to any other node in Rx (or R^C), and each node on the path also belongs to Rx (or R^C). Otherwise, Rx (or R^C) is partitioned. In this case, Rx (or R^C) contains multiple subgraphs Rx,1, Rx,2, ..., Rx,n (or R^C_1, R^C_2, ..., R^C_n), n > 1, where each Rx,i (or R^C_i), 1 ≤ i ≤ n, is a connected subgraph, and Rx,i and Rx,j (or R^C_i and R^C_j), i ≠ j, are not connected.

Now, consider the intracluster level. Let ρ(P_x,i, Rx) denote the shortest path from P_x,i to any node in Rx, and ρ(P_x,i, P_x^root) denote the shortest path from P_x,i to the root node in Hx. Also, let ρ(Hx, R^C) denote the shortest path from Hx to the closest cluster in R^C; |ρ(Hx, R^C)| is the distance of path ρ(Hx, R^C) (only counting the cluster level cost). Let ρ(P_x,i, Rx, λ) denote the MST rooted at P_x,i that includes a total of λ nodes hosting shares in cluster Hx; |ρ(P_x,i, Rx, λ)| represents the total distance of the MST. Let ρ^C(R^C) denote the MST from H_MSC to all clusters in R^C at the cluster level. |ρ^C(R^C)| is the total distance of the MST ρ^C(R^C), but only considering the costs at the cluster level (i.e., the distance to the root node of each involved cluster).

The read access protocol tries to read the closest l share replicas. Consider a client C sending a read request to P_x,i. If the local cluster of P_x,i (i.e., Hx) holds shares (note that Hx holds either none or at least l share replicas), then it reads l shares within Hx (the l nodes are selected such that the communication cost is minimal). The access cost in this case is |ρ(P_x,i, Rx, l)| (we assume that the communication cost between the client and P_x,i is negligible). If Hx does not hold share replicas, then P_x,i obtains all l shares from the closest cluster Hy, where Hy ∈ R^C. The algorithm for transferring shares from the source cluster Hy to the requesting cluster Hx is defined as follows: 1) all the nodes in Hy holding the desired shares send the shares to P_y^root (encrypted using the session key between Hy and the client); 2) P_y^root puts the pieces into one message and sends it to P_x^root; and 3) P_x^root sends the shares to the requesting node P_x,i, and P_x,i forwards them to client C. Overall, the read cost is |ρ(P_x,i, P_x^root)| + |ρ(Hx, R^C)| + |ρ(P_y^root, Ry, l)|. Note that |ρ(P_x,i, Rx, l)| = 0 if Hx hosts fewer than l shares, and |ρ(P_x,i, P_x^root)| + |ρ(Hx, R^C)| + |ρ(P_y^root, Ry, l)| = 0 otherwise. Let readCost denote the total read cost in the system. We have

readCost = Σ_{Hx} Σ_i A^r(P_x,i) · { |ρ(P_x,i, Rx, l)|, if Hx holds at least l shares;
                                     |ρ(P_x,i, P_x^root)| + |ρ(Hx, R^C)| + |ρ(P_y^root, Ry, l)|, otherwise. }

As discussed earlier, the update access protocol is primary lazy update. Inside a cluster Hx, updates are always propagated from the root node P_x^root to all other nodes holding share replicas along an MST. The update cost for a client at node P_x,j updating the shares is |ρ(P_x,j, P_x^root)| + |ρ(Hx, H_MSC)| + |ρ^C(R^C)| + Σ_{Hy} |ρ(P_y^root, Ry, |Ry|)|, where Hy ∈ R^C. Let forwardCost denote Σ_{Hx} Σ_i A^w(P_x,i) · (|ρ(P_x,i, P_x^root)| + |ρ(Hx, H_MSC)|).

Let updateCost denote the overall update cost in the system. We have

updateCost = w^C · ( |ρ^C(R^C)| + Σ_{Hx ∈ R^C} |ρ(P_x^root, Rx, |Rx|)| ) + forwardCost.

Note that we do not consider the extra cost required for the detection and recovery of an invalid share. With both the update and read costs, the total cost becomes Tcost = updateCost + readCost. Table 1 gives a summary of the notation used in this paper.

Our goal is to replicate the data shares and allocate them to different nodes in the data grid to minimize Tcost. We decompose the allocation problem into two subproblems, the intercluster and intracluster share allocation problems, and deal with them separately and independently. First, consider the updateCost. All updates need to be sent to a cluster that holds share replicas. Assume that Hx holds share replicas. P_x^root has the knowledge of the total update access number w^C (note that every update needs to be propagated to Hx) and of the topology within the cluster. The forwardCost is a constant and, thus, has no impact on the choice of the resident set. Thus, the local allocation within a cluster Hx is not impacted by the allocation in other clusters (besides needing to know w^C and to have an intercluster level algorithm to determine whether Hx should hold share replicas). Now, consider the readCost. If the local cluster Hx holds the share replicas, then the allocation minimizing Σ_{Hx} Σ_i |ρ(P_x,i, Rx, l)| · A^r(P_x,i) within Hx can be computed locally. Otherwise, Hx does not hold shares and needs to read from a remote cluster Hy. No matter how Ry is decided, |ρ(P_x,i, P_x^root)| + |ρ(Hx, R^C)| is the same; thus, (|ρ(P_x,i, P_x^root)| + |ρ(Hx, R^C)|) · A^r(Hx) has no effect on
54 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 7, NO. 1, JANUARY-MARCH 2010
TABLE 1
Summary of the Frequently Used Notation
the decision of Ry. Now, the effect of A^r(Hx) on the decision of Ry can be computed as A^r(Hx) · |ρ(P_y^root, Ry, l)|. Let UpdateCost^C(G^C, R^C) denote the total update cost in G^C with the resident set R^C; then UpdateCost^C(G^C, R^C) = w^C · |ρ^C(R^C)|.
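To get a concrete feel for these cost terms, the following toy sketch (our own example graph, read counts, and function names; all edges are one hop) evaluates the cluster-level read cost and a simplified update cost for a few candidate resident sets. The update term uses the fact that a connected resident set with unit-hop edges has a spanning tree of |R^C| − 1 edges; the intracluster terms are omitted.

```python
# Toy check of the cluster-level cost model: readCost charges each
# cluster's reads for the hop distance to the nearest resident cluster;
# updateCost charges w^C per spanning-tree edge of the resident set.
# Graph, read counts, and names are our own illustration, not the paper's.
from collections import deque

G = {
    'MSC': ['H2', 'H3'], 'H2': ['MSC', 'H4'], 'H3': ['MSC'],
    'H4': ['H2', 'H5'], 'H5': ['H4'],
}
reads = {'MSC': 0, 'H2': 4, 'H3': 1, 'H4': 6, 'H5': 2}

def hops_from(sources):
    """BFS hop distance from the nearest cluster in `sources`."""
    dist = {s: 0 for s in sources}
    q = deque(sources)
    while q:
        u = q.popleft()
        for v in G[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def read_cost(resident):
    d = hops_from(resident)
    return sum(reads[h] * d[h] for h in G)

def update_cost(resident, w_C):
    # A connected resident set with unit-hop edges has a spanning
    # tree of len(resident) - 1 edges.
    return w_C * (len(resident) - 1)

def total_cost(resident, w_C=3):
    return read_cost(resident) + update_cost(resident, w_C)

costs = [total_cost(rc) for rc in (['MSC'], ['MSC', 'H2'], ['MSC', 'H2', 'H4'])]
# Growing the resident set toward heavy readers lowers the total cost here.
assert costs == sorted(costs, reverse=True)
```

In this example the read savings of admitting a heavily reading cluster outweigh the extra w^C per propagation hop, which is exactly the trade-off the greedy expansion in Section 3.2 exploits.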
Within a single cluster, we simplify the notation: P^root denotes the root node of Hx, ρ(Pi, Pj) represents the shortest path between two nodes inside Hx, and R represents the resident set of Hx. Note that this simplification will be used below and in Section 5. In situations where multiple clusters are considered, the original notation is used. Note that we only need to consider clusters with l or more share replicas for this subproblem (OISAP).

Let ReadCost(R) denote the total read cost from all the nodes in cluster Hx:

ReadCost(R) = Σ_{Pi ∈ Hx} |ρ(Pi, R, l)| · A^r(Pi).

For each update in the system, the root node P^root needs to propagate the update to all other share holders inside Hx. Let WriteCost(R) denote the total update cost in Hx. Then we have

WriteCost(R) = w^C · |ρ(P^root, R, |R|)|.

Let Cost(R) denote the total cost of all nodes in Hx; then

Cost(R) = WriteCost(R) + ReadCost(R).

Our goal is to determine an optimal resident set R to allocate the shares in Hx, such that Cost(R) is minimized. Note that m ≥ |R| ≥ l (we will prove this in the next section). In Section 5, we propose a heuristic algorithm with a complexity of O(N^3) to find a near-optimal solution for this problem, where N is the number of nodes in the cluster.

3 OIRSP SOLUTIONS
In this section, we present a heuristic algorithm for OIRSP. First (in Section 3.1), we discuss some properties that are very useful for the design of the heuristic algorithm. In Section 3.2, we present the heuristic algorithm that decides which clusters should hold share replicas to minimize the access cost.

3.1 Some Useful Properties
We first show that if a cluster Hx is in R^C (an optimal resident set), then Hx should hold at least l share replicas (l is the number of shares to be accessed by a read request). If Hx is in R^C and Hx has fewer than l shares, then read accesses from Hx will still need to go to another cluster to get the remaining shares. If Hx holds no share replicas, then read accesses from Hx may need to get the l shares from multiple clusters. These may result in unnecessary communication overhead. The formal proof is given in Theorem 3.1. Based on this property, the computation of the update and read costs can be simplified. Essentially, for a cluster that is in R^C, all read requests can be served locally. For a cluster that is not in R^C, all read requests can be forwarded to one single cluster in R^C, and all l shares can be obtained from that cluster.

Theorem 3.1. In a general graph G^C, ∀x, Hx ∈ G^C, |Rx| = 0 or |Rx| ≥ l.

Proof. Assume that there exists one cluster Hx in R^C such that |Rx| < l. When the resident set is R^C, a read request from Hx cannot be served locally, and the remaining shares have to be obtained from at least one other cluster in G^C that holds those shares. Thus, |ρ(Hx, R^C)| > 0. Let us construct another resident set R^C'. R^C' is the same as R^C except that, in R^C', Hx holds l distinct shares. Thus, in R^C', |ρ(Hx, R^C')| = 0. So, the read cost for read requests from Hx becomes zero. Also, in G^C, there may be clusters that read from Hx. Assume that Hx is the closest cluster in R^C to Hy (Hy is not in R^C). If the optimal resident set is R^C, then Hy needs to read from Hx and some other clusters, since Hx has fewer than l shares. Thus, we can conclude

ReadCost^C(G^C, R^C) − ReadCost^C(G^C, R^C') ≥ A^r(Hx) · |ρ(Hx, R^C)| and, hence,
ReadCost^C(G^C, R^C') < ReadCost^C(G^C, R^C).

Now let us consider the update cost. Note that UpdateCost^C(G^C, R^C) = w^C · |ρ^C(R^C)|. Because R^C' and R^C are actually composed of the same set of clusters, |ρ^C(R^C')| = |ρ^C(R^C)|. Also, w^C is independent of the resident set. So, we have UpdateCost^C(G^C, R^C') = UpdateCost^C(G^C, R^C).

Since R^C' has the same update cost as, but a lower read cost than, R^C, Cost^C(G^C, R^C') < Cost^C(G^C, R^C). R^C, thus, cannot be an optimal residence set. It follows that ∀Hx in R^C, |Rx| ≥ l. □

We also observe that the clusters in the resident set R^C form a connected graph (which is a subgraph of G^C). This property is formally proven in Theorem 3.2. From this property, we can see that for resident set expansion (considering allocating share replicas to new clusters), only neighboring clusters of the current resident set need to be considered. Thus, we can have a greedy approach to obtain a solution. Note that, in Theorem 3.2, we assume that for each cluster Hx in G^C, A^r(Hx) > 0.

Theorem 3.2. The optimal resident set is a connected graph within the general graph G^C.

Proof. Assume that R^C is an optimal resident set for G^C and it is not connected. Since R^C is not a connected graph, there are two subgraphs R^C_1 and R^C_2 that are not connected. Without loss of generality, assume that cluster H_MSC ∈ R^C_1 and R^C_2 is the closest subgraph to R^C_1 in the update propagation minimal spanning tree of R^C. Since G^C is connected, at least one path exists that connects R^C_1 and R^C_2. Let ρ(R^C_1, R^C_2) denote the path connecting R^C_1 and R^C_2 in G^C with the minimal distance (or the minimum number of hops between R^C_1 and R^C_2, if distance is measured by the number of hops), and let |ρ(R^C_1, R^C_2)| denote the distance. Since R^C_1 and R^C_2 are disconnected, there exists a cluster Hx ∈ ρ(R^C_1, R^C_2) such that Hx ∉ R^C.

Let us consider a new resident set R^C' such that R^C' is the same as R^C, except that all clusters on path ρ(R^C_1, R^C_2) are in R^C'. For each cluster Hx ∈ ρ(R^C_1, R^C_2), |ρ(Hx, R^C')| = 0. Together with Theorem 3.1, we know that ReadCost^C(G^C, R^C') < ReadCost^C(G^C, R^C). For each update in G^C, an update propagation message is propagated from R^C_1 to R^C_2 through ρ(R^C_1, R^C_2), no matter whether R^C or R^C' is the residence set, since ρ(R^C_1, R^C_2) is the shortest path between R^C_1 and R^C_2 in G^C. Thus, UpdateCost^C(G^C, R^C') = UpdateCost^C(G^C, R^C).

Since R^C' yields a lower read cost than and has the same update cost as R^C, we can conclude that
Fig. 2. Sample GC and SP T ðGC ; RC Þ. (a) The original GC with RC ¼ fH1 ; H2 ; H3 g. (b) Super node S and SP T ðGC ; RC Þ constructed by Build_SPT.
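The super-node construction illustrated in Fig. 2 can be sketched as a breadth-first traversal that also accumulates each subtree's read counts onto its vicinity-set root, which is the A^r' counter the greedy expansion step consults. The graph, read counts, and names below are our own hypothetical example, and the re-parenting tie-break for equal distances is omitted:

```python
# BFS from the resident set (collapsed into a virtual super node S):
# each non-resident cluster gets a tree root in the vicinity set and a
# distance to S, and every cluster's reads are accumulated onto its
# root (the A^r' counter used by the greedy expansion step).
from collections import deque

def build_spt(G, resident, reads):
    root, dist, acc = {}, {}, {}
    q = deque()
    for s in resident:                      # vicinity clusters: distance 1
        for v in G[s]:
            if v not in resident and v not in root:
                root[v], dist[v], acc[v] = v, 1, reads[v]
                q.append(v)
    while q:                                # plain BFS below the vicinity
        u = q.popleft()
        for v in G[u]:
            if v in resident or v in root:
                continue
            root[v], dist[v] = root[u], dist[u] + 1
            acc[root[v]] += reads[v]        # A^r(root)' += A^r(v)
            q.append(v)
    return root, dist, acc

# Hypothetical topology loosely in the style of Fig. 2 (not its data).
G = {'H1': ['H4'], 'H2': ['H4', 'H5'], 'H3': ['H6'],
     'H4': ['H1', 'H2', 'H7'], 'H5': ['H2'], 'H6': ['H3'], 'H7': ['H4']}
reads = {'H4': 5, 'H5': 2, 'H6': 1, 'H7': 3}
root, dist, acc = build_spt(G, ['H1', 'H2', 'H3'], reads)
# Greedy step: admit the vicinity cluster with the largest accumulated
# read count if that count exceeds w^C.
best = max(acc, key=acc.get)
```

Here H7 attaches under H4's vicinity tree, so H4's counter absorbs H7's reads, which is what lets the greedy step compare whole subtrees rather than single clusters.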
Cost^C(G^C, R^C') < Cost^C(G^C, R^C). Thus, R^C is not a minimal residence set, and we can conclude that the optimal resident set is a connected graph in G^C. □

Theorem 3.2 is proven based on the assumption that, for each cluster Hx, A^r(Hx) > 0. For the case where A^r(Hx) = 0, there may be multiple optimal resident sets in G^C, and at least one of them is a connected graph. The proof is similar to the proof of Theorem 3.2. For any nonconnected resident set, a connected resident set that incurs the same or lower communication cost can always be found.

3.2 A Heuristic Algorithm for the OIRSP
The goal of OIRSP is to determine the optimal resident set R^C in G^C. G^C is a general graph, and each edge in G^C is considered as one hop. The optimal resident set problem in a general graph is an instance of the problem discussed in [34]. It has been shown that the problem is NP-complete. Thus, we develop a heuristic algorithm to find a near-optimal solution. Our approach is to first build a minimal spanning tree in G^C with R^C being the root and then identify the clusters to be added to R^C based on the tree structure.

The clusters in G^C access data hosted in R^C along the shortest paths, and these paths and the clusters form a set of shortest path trees. Since all the nodes in R^C are connected, we view them as one virtual node S. Then, S, all clusters that are not in R^C, and all the shortest access paths form a tree rooted at S, which is denoted as SPT(G^C, R^C) (an example of the tree is shown in Fig. 2b). We develop an efficient algorithm, Build_SPT, to construct SPT(G^C, R^C) based on the current resident set R^C. To facilitate the identification of a new resident cluster, we also define V^C(G^C, R^C) as the vicinity set of S, where ∀Hx ∈ V^C(G^C, R^C), we have Hx ∉ R^C and Hx is a neighboring cluster of S. Note that, from Theorem 3.2, we know that the clusters in R^C are connected. Thus, we only need to consider clusters in V^C(G^C, R^C) when looking for a potential cluster to be added to R^C.

Build_SPT(G^C, R^C) first constructs V^C(G^C, R^C) by visiting all neighboring clusters of R^C. If a cluster Hx in V^C(G^C, R^C) has more than one neighbor in R^C, then one of them is chosen to be the parent cluster. Next, Build_SPT(G^C, R^C) traverses G^C starting from the clusters in V^C(G^C, R^C). From a cluster Hx, it visits all of Hx's neighboring clusters. Assume that Hy is a neighboring cluster of Hx. When Build_SPT visits Hy from Hx, it assigns Hx as Hy's parent if Hy does not have a parent. In this case, Hy is in the same tree as Hx, and Hy's tree root is set to Hx's (which is a cluster in R^C). Since all read requests from Hy go through the root, say Hz, A^r(Hy) is added to A^r(Hz)' for later use (for new resident cluster identification). In case Hy already has a parent, the distances to S via the original parent and via Hx are compared. If Hx offers a shorter path to S, then Hy's parent is reset to Hx, and the corresponding adjustments are made. To achieve faster convergence for new R^C identification, Hy's parent is also changed to Hx if Hx's tree root Hz has a higher value of A^r(Hz)' when the distances to S via Hy's original parent and via Hx are equal.

The detailed algorithm for Build_SPT is given in the following (we assume that V^C(G^C, R^C) has already been identified). In the algorithm, each cluster Hx has several fields. Hx.root and Hx.parent are the root and parent clusters of Hx, respectively. Hx.dist is the distance from Hx to Hx's root (at the end of the algorithm, it is the shortest distance). We also use Next(Hx) to denote the set of Hx's neighbors.

Build_SPT(G^C, R^C)
{ For all Hx, Hx ∈ V^C(G^C, R^C)
    { Insert Hx into Queue; Hx.root ← Hx; Hx.dist ← 0;
      A^r(Hx)' ← A^r(Hx); }
  While (Queue ≠ ∅)
  { Hx ← Remove a node from Queue;
    For all Hy, Hy ∈ Next(Hx) ∧ Hy ∉ R^C
    { If (Hy is not marked as visited) then
        { Insert Hy into Queue; Hy.dist ← Hx.dist + 1;
          Hy.parent ← Hx; Hy.root ← Hx.root;
          A^r(Hy.root)' ← A^r(Hy.root)' + A^r(Hy); Mark Hy as visited; }
      Else
        { If (Hy.dist > Hx.dist + 1 ∨ ((Hy.dist = Hx.dist + 1) ∧ A^r(Hy.root)' < A^r(Hx.root)')) then
            { A^r(Hy.root)' ← A^r(Hy.root)' − A^r(Hy);
              Hy.dist ← Hx.dist + 1; Hy.parent ← Hx;
              Hy.root ← Hx.root;
              A^r(Hy.root)' ← A^r(Hy.root)' + A^r(Hy); } }
    } } }

Actually, the check for Hy.dist > Hx.dist + 1 in the algorithm is not necessary, since a queue is used (a node is always visited from a neighbor with the shortest distance to S). A sample general graph G^C with the current resident set R^C = {H1, H2, H3} is shown in Fig. 2a. The corresponding SPT(G^C, R^C) is shown in Fig. 2b, where R^C is represented by the super node labeled S. When constructing SPT(G^C, R^C), S's immediate neighbors, including H4, H5, H6, H7, H8, and H9, are visited first. H4 is visited twice, but
H1 is selected as the parent, since H4 is visited from H1 first, and there is no need for adjustment when it is visited the second time. After the clusters nearest to S, the clusters that are two hops away from S, including H10, H11, H12, H13, H14, and H15, are visited. Finally, the nodes that are further away from S are visited.

We develop a heuristic algorithm to find the new resident set for G^C in a greedy manner. We try to find a new resident cluster in V^C(G^C, R^C) and, once found, update R^C accordingly. The algorithm is shown below. R^C is initialized to {H_MSC}. The algorithm first constructs SPT(G^C, R^C) and identifies V^C(G^C, R^C). Then, a cluster Hy with the highest A^r(Hy)' is selected. If A^r(Hy)' > w^C, then Hy is added to R^C. If A^r(Hy)' ≤ w^C, then the algorithm terminates, since no other cluster can be added to R^C while reducing the access communication cost. Note that, in each step, only one cluster can be added into R^C, because SPT(G^C, R^C) and A^r(Hx)' change when R^C changes.

R^C ← {H_MSC};
Repeat
{ Build_SPT(G^C, R^C);
  Select a cluster Hy, where Hy has the maximum A^r(Hy)' among all clusters in V^C(G^C, R^C);
  If A^r(Hy)' > w^C then R^C ← R^C ∪ {Hy}; }
Until (A^r(Hy)' ≤ w^C)

Now, we analyze the complexity of the heuristic resident set algorithm. Build_SPT has a time complexity of O(P · deg), where P is the number of clusters in G^C and deg is the maximal degree of the vertexes (clusters) in G^C. Finding a cluster Hy in V^C(G^C, R^C) with the highest A^r(Hy)' can be done while building SPT(G^C, R^C). Thus, the time complexity of the heuristic resident set algorithm is O(|R^C| · P · deg). Note that the final resident set R^C computed by the heuristic algorithm is not always optimal [32].

The heuristic resident set algorithm works by adding a candidate cluster Hx of G^C into R^C at each step, with A^r(Hx)' > w^C. By adding cluster Hx, the read cost is reduced by at least A^r(Hx)', and the update cost is increased by w^C. Thus, the total cost of G^C with resident set R^C is less than that of G^C with the initial resident set {H_MSC}, if |R^C| > 1. This will be shown in Theorem 3.3.

Theorem 3.3. In a general graph G^C, if |R^C| > 1, then Cost^C(G^C, R^C) < Cost^C(G^C, {H_MSC}). Furthermore, every time a new cluster Hx (where Hx satisfies the cost constraint) is added to the current resident set R^C' (R^C' ⊂ R^C), the communication cost decreases, i.e., Cost^C(G^C, R^C' ∪ {Hx}) < Cost^C(G^C, R^C').

Proof. According to Theorem 3.1, ∀x, Hx ∈ R^C, |Rx| ≥ l. The algorithm works by adding one cluster at a time. Let R^C = {H1, H2, ..., Hn}, |R^C| = n, and H1 = H_MSC. Assume that Hi is added at the (i−1)th step to R^C. If we show that, after adding each cluster, the cost reduces, then we can conclude that Cost^C(G^C, R^C) < Cost^C(G^C, {H_MSC}). We use induction to prove this.

Step 1. We show that Cost^C(G^C, {H_MSC, H2}) < Cost^C(G^C, {H_MSC}). According to the algorithm, H2 ∈ V^C(G^C, {H_MSC}); then UpdateCost^C(G^C, {H_MSC, H2}) = UpdateCost^C(G^C, {H_MSC}) + w^C · |ρ(H2, {H_MSC})|. For each cluster Hx that reads {H_MSC} through H2, ρ(Hx, {H_MSC}) is the shortest path in G^C from Hx to {H_MSC}. It is obvious that H2 ∈ ρ(Hx, {H_MSC}) and H2 is the cluster on ρ(Hx, {H_MSC}) right next to H_MSC, and |ρ(Hx, H2)| = |ρ(Hx, {H_MSC})| − |ρ(H2, {H_MSC})|. Any other path ρ(Hx, H2)' or ρ(Hx, H_MSC)' has a distance no less than |ρ(Hx, H2)|. With resident set {H_MSC, H2}, ρ(Hx, H2) will continue to be the least distance path for cluster Hx to read from H2 in G^C, and |ρ(Hx, {H_MSC, H2})| = |ρ(Hx, {H_MSC})| − |ρ(H2, {H_MSC})|. For any cluster Hx that reads {H_MSC} through H2, |ρ(Hx, {H_MSC})| will, at least, not increase if H2 is added into the resident set. Then, we can easily get ReadCost^C(G^C, {H_MSC, H2}) + A^r(H2)' · |ρ(H2, {H_MSC})| ≤ ReadCost^C(G^C, {H_MSC}).

According to the heuristic resident set algorithm, we know that A^r(H2)' > w^C. Thus, Cost^C(G^C, {H_MSC, H2}) − Cost^C(G^C, {H_MSC}) = UpdateCost^C(G^C, {H_MSC, H2}) − UpdateCost^C(G^C, {H_MSC}) + ReadCost^C(G^C, {H_MSC, H2}) − ReadCost^C(G^C, {H_MSC}) ≤ (w^C − A^r(H2)') · |ρ(H2, {H_MSC})| < 0.

Step 2. Assume that Cost^C(G^C, {H_MSC, H2, ..., Hk}) < Cost^C(G^C, {H_MSC, H2, ..., Hk−1}). We show that Cost^C(G^C, {H_MSC, H2, ..., Hk}) > Cost^C(G^C, {H_MSC, H2, ..., Hk+1}), with k < n. It can be seen that the proof is the same as above, and we will not show it here.

By induction, we know that Cost^C(G^C, {H_MSC, H2, ..., Hn}) < Cost^C(G^C, {H_MSC, H2, ..., Hn−1}). Thus, Cost^C(G^C, R^C) < Cost^C(G^C, {H_MSC}). Also, from the induction process, we can conclude that, every time a new cluster Hi joins R^C, the communication cost decreases, i.e., Cost^C(G^C, R^C(i)) < Cost^C(G^C, R^C(i−1)). □

As shown in Fig. 3, the heuristic resident set algorithm is not always optimal for a general graph G^C. We expect that the heuristic algorithm will obtain the optimal R^C in a tree graph. We show this in Theorem 3.4. In Lemma 3.1, we first show that, when a tree network is considered, the resident set computed by the heuristic algorithm is a subset of the optimal resident set.

Lemma 3.1. Consider a tree network T_MSC rooted at H_MSC. Let R^C(T_MSC)' denote the optimal resident set in T_MSC. Let R^C(T_MSC) be the resident set computed by the heuristic algorithm. If Hx ∈ R^C(T_MSC), then Hx ∈ R^C(T_MSC)'.

Proof. This is also to show that R^C(T_MSC) ⊆ R^C(T_MSC)'. Assume that R^C(T_MSC) ⊄ R^C(T_MSC)'. Let R^C(T_MSC) ∩ R^C(T_MSC)' = S1; S1 ≠ ∅ because H_MSC ∈ S1. Let R^C(T_MSC) − S1 = S2; then S2 ≠ ∅. We will show that there exists another resident set R^C(T_MSC)'' such that Cost^C(T_MSC, R^C(T_MSC)'') < Cost^C(T_MSC, R^C(T_MSC)'). Let R^C(T_MSC)'' = R^C(T_MSC)' ∪ {Hy}, where Hy ∈ S2 and Hy is a neighboring cluster of S1. Let R^C_temp(T_MSC) denote the resident set just before cluster Hy joins R^C(T_MSC); we know that A^r(Hy)' > w^C. Since Hy ∈ S2, let Hz be the parent cluster of Hy in T_MSC; then Hz ∈ R^C(T_MSC), and Hz joins the resident set R^C(T_MSC) before Hy. As we know, in T_MSC, ρ(Hy, Hz) is the unique path through which clusters below Hy read R^C_temp(T_MSC). According to Theorem 3.2, if Hy ∉ R^C(T_MSC)', then no cluster in the subtree rooted at Hy belongs to R^C(T_MSC)'. For each access to R^C(T_MSC)' initiated from any cluster in the subtree, it has to be forwarded to R^C(T_MSC)' by cluster Hy. We know that A^r(Hy)' > w^C. Thus, Cost^C(T_MSC, R^C(T_MSC)'') < Cost^C(T_MSC, R^C(T_MSC)'). It follows that
58 IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, VOL. 7, NO. 1, JANUARY-MARCH 2010
4 OISAP SOLUTIONS
Now, we only consider the cost inside a single cluster Hx .
As discussed in Section 2, the topology of Hx is a tree,
denoted as T . For simplicity, we define the distance of each
edge in T uniformly as one hop. In the following, we first
show two important properties of the OISAP problem with
a tree topology. Then, we give a heuristic algorithm to
decide the numbers of shares needed in Hx and where to
place them.
If the Hx ’s resident set R is not connected, then R consists
of multiple disconnected subresident set R1 ; R2 ; . . . ; Rn ,
where n > 1, and each subresident set is connected. We say
R is j þ connected in Hx , if and only if minðjRi jÞ j, where
j > 0, Ri , for all i n, are subgraphs in Hx , and jRi j is the
number of server nodes in Ri . We define Ri <pos Rj as
f o l l o w s : I f Ri <pos Rj , t h e n 9Py , Pz 2 Hx , w h e r e
Fig. 3. (a) The performance impact of graph size. (b) The performance Py 2 Ri ^ Pz 2 Rj , such that Pz is an ancestor of Py in T .
impact of graph degree. (c) The impact of update/read ratio. Informally, nodes in Rj are closer to the root than nodes in
Ri . Otherwise, Ri pos Rj .
Hy 2 RC ðTMSC Þ0 . Therefore, we conclude that there exists Theorem 4.1. In a tree network, the optimal resident set R in Hx
no such Hy , and it follows that RC ðTMSC Þ RC ðTMSC Þ0 .t
u is lþ connected.
Theorem 3.4. The resident set RC ðTMSC Þ computed by our Proof. The proof is omitted due to space limitation. Please
algorithm is optimal in a tree network TMSC rooted at HMSC . refer to [32] for the full proof. u
t
The corresponding time complexity is OðP Þ.
Proof. According to the algorithm, nodes in RC ðTMSC Þ are Next, we discuss another important property that helps
connected and HMSC 2 RC ðTMSC Þ. Suppose there exists the construction of the allocation algorithm. We show that if
another resident set RC ðTMSC Þ0 which is optimal, i.e., the constraint jRj m is removed, P root should be in R and
costC ðTMSC ; RC ðTMSC Þ0 Þ < costC ðTMSC ; RC ðTMSC ÞÞ. W e R is a connected subgraph in T . This property is shown in
prove that such a RC ðTMSC Þ0 does not exist in TMSC . Theorem 4.2. Based on this property, a share allocation
algorithm can start with R ¼ fP root g, and add one neigh-
Case 1. RC ðTMSC Þ0 RC ðTMSC Þ. According to
boring node in each step.
Theorem 3.2 and the heuristic algorithm, both
RC ðTMSC Þ and RC ðTMSC Þ0 form connected graphs. Theorem 4.2. Without the constraint jRj m, we have P root 2
According to Lemma 3.1, we can easily get R and R is a connected subgraph.
RC ðTMSC Þ0 6 RC ðTMSC Þ. Proof. Since the updates are propagated from P root to nodes
Case 2. RC ðTMSC Þ RC ðTMSC Þ0 . Let RC ðTMSC Þ0 in Hx that hold share replicas. According to the cost
R ðTMSC Þ ¼ S1 . Since RC ðTMSC Þ and RC ðTMSC Þ0 are both
C
models defined in Section 2.2, if a node on the path from
connected, let S2 ¼ S1 \ V C ðTMSC ; RC ðTMSC ÞÞ, then P root to the residence set hosts a share, the update cost
S2 6¼ . For each cluster Hx 2 S2 , Hx is the root of the does not increase and the read cost decreases. Thus, each
subtree that reads RC ðTMSC Þ through Hx . Since of these nodes, including P root , should be in R. We can
Ar ðHx Þ0 wC for all Hx 2 V C ðTMSC ; RC ðTMSC ÞÞ. Thus, use similar proof given in Theorem 3.2 to show that the
CostCðTMSC ; RC ðTMSC ÞÞ CostCðTMSC ; RCðTMSC Þ[fHx gÞ. residence set R is a connected graph in T if the constraint
S o , RC ðTMSC Þ 6 RC ðTMSC Þ0 . Ot h e r w i s e : L e t S1 ¼ jRj < m is not considered. Thus, the theorem follows. t u
RC ðTMSC Þ \ RC ðTMSC Þ0 . S1 6¼ because HMSC 2 S1 . Let Now, we present the algorithm SDP-Tree, which
RC ðTMSC Þ0 S1 ¼ S2 , RC ðTMSC Þ S1 ¼ S3 . Then, S2 6¼ determines the ðm; kÞ secrete sharing residence set in T .
and S3 6¼ , because RC ðTMSC Þ and RC ðTMSC Þ0 are Initially, the residence set R only contains node P root .
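For intuition about the (m, k) threshold setting behind these shares, here is a toy Shamir-style secret sharing sketch (Shamir's scheme is [28]). This is illustrative only: the field size, share encoding, and helper names below are simplifying assumptions, not the paper's exact construction.

```python
# Toy k-of-m Shamir secret sharing over a prime field.
# Any k of the m shares reconstruct the secret; fewer reveal nothing.
import random

PRIME = 2**61 - 1  # a Mersenne prime, large enough for toy secrets


def make_shares(secret, m, k):
    """Split `secret` into m shares of a random degree-(k-1) polynomial."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]

    def f(x):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        return y

    return [(x, f(x)) for x in range(1, m + 1)]


def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = make_shares(123456789, m=5, k=3)
assert reconstruct(shares[:3]) == 123456789  # any k shares suffice
assert reconstruct(shares[2:]) == 123456789
```

A read in this setting must contact at least k share holders, which is why placement cost grows with the read threshold (the role l plays inside a cluster).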
TU ET AL.: SECURE DATA OBJECTS REPLICATION IN DATA GRID 59
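Building on the initialization just described (R = {Proot}), the grow-then-trim placement idea can be sketched in miniature. The tree shape, read counts, and helper names below are invented for illustration, and the trim step omits the connectivity checks, so this is a sketch of the flavor of the approach rather than the paper's exact algorithm.

```python
# Toy sketch of a grow-then-trim share placement on a tree, in the
# spirit of a node-joining phase followed by a node-removal phase.
def subtree_reads(children, reads, node):
    """Total read requests issued from the subtree rooted at `node`."""
    total = reads[node]
    for c in children.get(node, []):
        total += subtree_reads(children, reads, c)
    return total


def place_shares(children, reads, root, w_c, l, m):
    """Grow R greedily from the root, then trim it to at most m nodes."""
    R = [root]
    frontier = list(children.get(root, []))
    while frontier:
        # pick the frontier node whose subtree issues the most reads
        best = max(frontier, key=lambda n: subtree_reads(children, reads, n))
        gain = subtree_reads(children, reads, best)
        if gain <= w_c and len(R) >= l:
            break  # adding more costs more in updates than it saves
        R.append(best)
        frontier.remove(best)
        frontier += children.get(best, [])
    while len(R) > m:
        # trim the non-root holder whose subtree reads least
        # (real removal must also preserve connectivity constraints)
        R.remove(min(R[1:], key=lambda n: subtree_reads(children, reads, n)))
    return R


children = {"root": ["a", "b"], "a": ["c"], "b": []}
reads = {"root": 1, "a": 5, "b": 1, "c": 4}
assert place_shares(children, reads, "root", w_c=2, l=1, m=3) == ["root", "a", "c"]
```

Note how the break condition mirrors the two-sided test in the text: a node is added either because its subtree's read traffic exceeds the update cost wC or because fewer than l holders exist so far.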
SDP-Tree first adds nodes into R; we call this the node joining phase. Similar to Build_SPT(GC, RC), a node that is not in the resident set and that, if allocated a share replica, maximally reduces the cost compared with all other nodes is selected and added to the resident set. Note that since each cluster contains either no share or at least l shares, as shown in Theorem 3.1, SDP-Tree guarantees that the result set contains at least l shares. The second phase is called the node removal phase: it removes nodes from the resident set constructed during the first phase if that set contains more than m nodes. A node that, if removed from R, causes the minimum increase in access cost is selected and removed from the resident set. The process continues until only m nodes are left in R. Note that removed nodes never cause a violation of the l+ connectivity property and never result in more than m/l disconnected subresident sets. The algorithm for the OISAP is given as follows. In the algorithm, VN is the set of neighboring nodes of the resident set R, and Ar(Pi)' is the total number of read requests issued from nodes in the subtree Ti rooted at node Pi, i.e., Ar(Pi)' = Σ_{Pj ∈ Ti} Ar(Pj). MaxR is a temporary variable.

SDP-Tree {
  R ← {Proot}; VN ← {child nodes of Proot}; MaxR ← 0;
  while (VN ≠ ∅) {
    for each Pi ∈ VN {
      if (Ar(Pi)' > MaxR) {
        MaxR ← Ar(Pi)';
        X ← Pi;
      }
    } // find the node with the highest Ar(Pi)'
    if (MaxR ≤ wC) {
      if (|R| < l) { R ← R ∪ {X}; }
      else { X ← null; delete all nodes in VN; }
    } else {
      R ← R ∪ {X}; Y ← parent node of X;
      Ar(Y)' ← Ar(Y)' − Ar(X)';
    }
    delete X from VN;
    if (X ≠ null) { insert all child nodes of X into VN; }
  }
  while (|R| > m) {
    MaxR ← ∞;
    for each Pi ∈ R {
      if (removing Pi retains the l+ connectivity and does not
          result in more than m/l disconnected subresident sets) {
        if (Ar(Pi)' < MaxR) {
          MaxR ← Ar(Pi)';
          X ← Pi;
        }
      }
    } // find the node with the lowest Ar(X)'
    R ← R − {X};
  }
}

Next, we show a property of the solution obtained by the SDP-Tree algorithm in Theorem 4.3. In Lemma 4.1, we first define how to compute the cost difference when a node is added into the resident set.

Lemma 4.1. Let Ti denote the subtree rooted at Pi. Then cost(R ∪ {Pi}) = cost(R) − Σ_{Pj ∈ Ti} ReadCost(Pj) + wC, where Pi is a neighboring node of R. Also, let diff(Pi) = wC − Σ_{Pj ∈ Ti} ReadCost(Pj); then diff(Pk) ≤ diff(Pi) if Pk is an ancestor of Pi in T.

Proof. For a node Pi ∈ T, ReadCost(Pi) = Ar(Pi) · |δ(Pi, R, l)|. According to Theorem 4.1, ReadCost(Pi) = Ar(Pi) · |l − 1 + δ(Pi, R)|. Adding Pi to R decreases the read cost of each node in Ti by one and does not change the read cost of any node Pj ∉ Ti. Also, adding Pi to R increases the update propagation cost of the root by one. According to the cost models defined in Section 2.2, we have cost(R ∪ {Pi}) = cost(R) − Σ_{Pj ∈ Ti} ReadCost(Pj) + wC. Since Ti ⊆ Tk, Σ_{Pj ∈ Ti} ReadCost(Pj) ≤ Σ_{Pj ∈ Tk} ReadCost(Pj), and we have diff(Pk) ≤ diff(Pi). □

Theorem 4.3. Let RS denote the resident set computed by the node joining phase of SDP-Tree. If the constraint |R| < m is removed, then RS is the optimal resident set, i.e., cost(RS) is minimal.

Proof. Assume that RS is not optimal, i.e., there exists an optimal resident set RS' such that cost(RS') < cost(RS). Two cases exist.

Case 1. RS ⊂ RS'. According to Theorem 4.2 and the SDP-Tree algorithm, RS and RS' are both connected, and any node in RS' − RS must be a descendant of some node in RS; otherwise, SDP-Tree would have added the node into RS. Let Py denote a node such that its parent node is in RS and Py ∉ RS while Py ∈ RS'. According to SDP-Tree, if Py is not added to RS, it is only because adding Py would increase the access cost. From Lemma 4.1, we know that adding any subset of descendants of Py together with Py would also increase the cost. Thus, there exists no RS' such that RS ⊂ RS'.

Case 2. RS ⊄ RS' ∧ RS ≠ RS'. Let Py be the first node that SDP-Tree adds such that Py ∈ RS and Py ∉ RS'. Let R' denote the resident set that SDP-Tree computed before adding Py. Note that R' ≠ ∅ (because R' contains at least Proot) and R' ⊆ RS'. According to Theorem 4.2, RS' is a connected subgraph in T. Since Py ∉ RS', no node in the subtree rooted at Py can be in RS'. According to the SDP-Tree algorithm, if Py ∈ RS, then diff(Py) is minimal among the neighboring nodes of R'. Two cases should be considered. 1) cost(R' ∪ {Py}) < cost(R'). The node Py is a neighbor of some node in R'. This means cost(RS' ∪ {Py}) < cost(RS'), which is a contradiction to the assumption. 2) |R'| < l, diff(Py) ≥ 0, and diff(Py) is minimal among the neighboring nodes of R'. According to Lemma 4.1, for any node Px such that Px ∈ RS' and Px ∉ R', 0 ≤ diff(Py) ≤ diff(Px). If diff(Py) = diff(Px) for every node Px such that Px ∈ RS' and Px ∉ RS, then there exists no Pz such that Pz ∈ RS' and diff(Pz) ≤ diff(Px); otherwise, Px would be added into RS before Pz. Thus, cost(RS) ≤ cost(RS'), which is contradictory to the assumption. If there exists some node Px such that Px ∈ RS', Px ∉ RS, and diff(Py) < diff(Px), then let Pz be a leaf node of the tree composed only of nodes in RS' such that Pz is a descendant of Px. According to Lemma 4.1, diff(Py) < diff(Px) ≤ diff(Pz). Now, construct another resident set RS'' = RS' ∪ {Py} − {Pz}. We know that cost(RS'') < cost(RS'), which contradicts the assumption. □

When the number of nodes in the resident set computed by the node joining phase is greater than m, we need to remove some nodes from the set. The greedy removal may not
be optimal. Also, with l < 2, the resident set computed by the SDP-Tree algorithm is always optimal, even if the number of nodes in the resident set computed by the node joining phase is greater than m.

Now consider the time complexity of the SDP-Tree algorithm in a tree T with N nodes. The node joining phase traverses all nodes in T, so its time complexity is O(N). The node removal phase needs to remove |R| − m nodes, which takes |R| − m steps. In each step, it needs to select Pi from R, which in the worst case takes O(N). For each tested Pi, it needs to check whether the removal of Pi retains the l+ connectivity, which takes O(N) time in the worst case. So, the complexity of the node removal phase is O(N³). Thus, the overall time complexity of the SDP-Tree algorithm is O(N³).

5 EXPERIMENTAL STUDY
We conduct experiments to evaluate the performance of the heuristic algorithms for secret share allocation. The OIRSP heuristic algorithm is compared with the randomized K-replication, no-replication allocation, and complete replication strategies, to study its performance and effectiveness in reducing communication cost at the cluster level. In the randomized K-replication, share replicas are randomly allocated among K clusters, where K is the number of clusters holding replicas computed by the OIRSP heuristic algorithm. In the complete replication strategy, share replicas are allocated in every cluster. In the no-replication allocation strategy, there is no replication in any cluster.

The SDP-Tree algorithm for the OISAP is compared with the optimal allocation algorithm and randomized M-replication, to see how well the SDP-Tree algorithm performs in terms of reducing communication cost within a cluster. In the randomized M-replication, M shares are randomly allocated among the nodes in a cluster, where M is computed by SDP-Tree and is the number of nodes that hold shares.

The underlying network topology for the experimental studies is created by using a topology generator, Inet [11]. Inet has a lower bound on the total number of nodes in the network. We removed the bound so that graphs with different numbers of clusters (or nodes) can be created. We have also modified the generator to generate read/write requests for each node. The numbers of read and write requests on the nodes in the system are generated randomly following a uniform distribution.

The metric we consider is the communication cost, which is the product of the number of messages and the number of hops along the message propagation path. To avoid biased access patterns and topology structures, we repeat the experimental steps 100 times. The final result is the average of the 100 trials.

5.1 Performance of the OIRSP Heuristic Algorithm
In this section, we compare the performance of the OIRSP heuristic algorithm with the randomized K-replication, no-replication allocation, and complete replication strategies. We study the impacts of three factors: 1) the graph size, which is the number of clusters in the system; 2) the graph degree, which is the average number of neighbors of a cluster; and 3) the update/read ratio, which is the ratio of the total number of update requests in the entire system to the average number of read requests issued from a single cluster (these are the requests each cluster needs to process).

The results are shown in Fig. 3, in which HEU, RKR, NR, and FR denote the OIRSP heuristic algorithm, the randomized K-replication, the no-replication allocation, and the complete replication algorithms, respectively.

Fig. 3a shows the impact of graph size on the performance of the four algorithms. The parameters are set as follows: cluster size = 100 (i.e., there are 100 nodes in each cluster), graph degree = 5, and update/read = 2. The results show that the OIRSP heuristic algorithm incurs much lower communication cost than the other replication strategies. It can also be seen that with a larger graph size, the OIRSP heuristic algorithm achieves better performance compared to the no-replication allocation strategy. The reason is obvious: with a larger graph size, the number of clusters that need replicas increases. Allocating share replicas to these clusters reduces communication cost, and hence, the heuristic algorithm shows better performance than the no-replication allocation strategy.

The effect of graph degree is shown in Fig. 3b. The other parameters are set as follows: cluster size = 100, graph size = 80, and update/read = 2. From the results, we can see that the performance gain of the heuristic algorithm becomes less significant with increasing graph degree. This is because the graph becomes more complete and the distances between nodes become smaller as the graph degree increases. When the graph degree becomes large, the communication cost for the complete replication strategy drops more significantly than that for the other algorithms. When the graph degree ≥ 20, the communication cost for complete replication becomes stable and stays at about twice that of the heuristic algorithm. With update/read = 2, most clusters will not be allocated share replicas; thus, the result of the heuristic algorithm is closer to (but still better than) those of the other two strategies. Compared to the complete replication strategy, the performance of our heuristic algorithm is much better when the graph degree is small. This is because more clusters get unneeded replicas in the complete replication strategy and the update cost increases significantly.

The effect of the update/read ratio is shown in Fig. 3c. The parameters are set as follows: cluster size = 100, graph size = 80, and graph degree = 5. With an increasing update/read ratio, fewer clusters should get replicas. So, the communication cost of the complete replication strategy increases rapidly and becomes far worse than that of the other two replication strategies.

5.2 The Efficiency of the OISAP SDP-Tree Algorithm
The performance of the SDP-Tree algorithm is compared with the optimal allocation algorithm and the randomized M-replication algorithm. In the experiments, the trees are generated randomly by using the topology generator with varying N, D, and read/update ratio, where N is the total number of nodes in the cluster, D is the maximum node degree, and read/update is the ratio of the average number of read requests in the cluster to the total number of update requests in the system. Two configurations are considered:
Fig. 4. (a) The impact of l with read/update = 3. (b) The impact of l with read/update = 30. (c) The impact of m with read/update = 3. (d) The impact of m with read/update = 30.

1) N = 30, D = 5, read/update = 3 and 2) N = 30, D = 5, read/update = 10. We vary m and l to evaluate their impact on the performance of the algorithms. A larger m value results in higher availability if some of the share holders are unavailable or compromised, whereas a larger l value achieves better data confidentiality. Thus, different m and l values can be chosen based on the requirements of the data. Note that we only show the results with N = 30, because the computation cost for obtaining the optimal solutions is prohibitive. The results are shown in Fig. 4, in which HEU denotes the heuristic algorithm, OPT denotes the optimal solution, and RMR denotes the random M-replication algorithm.

From Fig. 4, we can see that the communication costs using RMR are always the highest. With RMR, the share replicas are randomly allocated, so the shares may not be close to the clients with the most frequent accesses. For all configurations, the heuristic algorithm obtains near-optimal or optimal solutions. In fact, in the worst individual case we have observed, the cost obtained by the heuristic algorithm is only about 10 percent higher than that of the optimal algorithm. In most cases (about 75 percent), the heuristic algorithm obtains the optimal solutions.

From Figs. 4a and 4b, it can be seen that with increasing l, the communication costs of all solutions increase sharply. With higher l, a read or update request needs to access a larger number of nodes that host shares and, hence, incurs a higher access cost. Figs. 4c and 4d show the impact of the m values on the performance gains of the three algorithms. As can be seen, in general, m has little impact on the access performance. Only in the extreme case, with read/update = 30 (most requests are read requests) and m changing from 30 to 3, can we see the effect of increasing m on the access performance. This is because the total read access frequency of each subtree (i.e., the total number of read requests issued by the nodes inside the subtree) is higher than the total update frequency in the entire cluster. Thus, on average, a certain number of shares hosted in the cluster is sufficient to minimize the communication cost for a specific read/update ratio. With a reasonable read/update ratio, the number of shares required to minimize the access cost inside a cluster is usually small (note that a small number of shares can partition the cluster into a much larger number of subtrees such that all of them are rooted at the neighboring nodes of the resident set, i.e., the nodes hosting shares). Thus, a small value of m is big enough to provide sufficient shares to minimize the communication cost inside a cluster. In other words, the value of m has little impact on the access cost inside a cluster.

6 STORAGE LIMITATIONS AND LOAD BALANCING
Replication is a natural solution for reducing the communication cost (as we have discussed) as well as for sharing the access load. In peer-to-peer data grids, replicas can be placed on widely distributed nodes to achieve better access performance and load sharing. In cluster-based data grids, caching data on widely distributed nodes is necessary (in addition to replication on cluster nodes) to achieve improved access performance and load sharing. Data partitioning can contribute to reduced storage cost. It has been shown that erasure coding-based schemes can greatly reduce the overall storage cost and effectively share the storage consumption [16], [24], [37]. However, with these schemes, it is still possible to have unbalanced access load or storage requirements due to very unbalanced access patterns. There may be too many requests inside a cluster, or a server inside a cluster may be a hot spot for many data objects. In these cases, it is necessary to adapt the algorithms proposed in this paper to bound the load and storage cost. Let CAPx^load and CAPx,i^load denote the access load thresholds for cluster Hx and server node Px,i, respectively.
Several future research directions can be investigated. First, the secure storage mechanisms developed in this paper can also be used for key storage. In this alternate scheme, critical data objects are encrypted and replicated. The encryption keys are partitioned, and the key shares are replicated and distributed. To minimize the access cost, allocation of the replicas of a data object and the replicas of its key shares should be considered together. We plan to construct the cost model for this approach and expand our algorithm to find the best placement solutions. Also, the two approaches (partitioning data or partitioning keys) have pros and cons in terms of storage and access cost and have different security and availability implications. We plan to investigate their tradeoffs; some preliminary analysis results are available in [38]. Moreover, it may be desirable to consider multiple factors for the allocation of secret shares and their replicas. Replicating data shares improves access performance but degrades security: having more share replicas may increase the chance of shares being compromised. Thus, it is desirable to determine the placement solutions based on multiple objectives, including performance, availability, and security.

REFERENCES
[1] S. Arora, P. Raghavan, and S. Rao, "Approximation Schemes for Euclidean k-Medians and Related Problems," Proc. 30th ACM Symp. Theory of Computing (STOC), 1998.
[2] M. Baker, R. Buyya, and D. Laforenza, "Grids and Grid Technology for Wide-Area Distributed Computing," Software—Practice and Experience, 2002.
[3] A. Chervenak, E. Deelman, I. Foster, L. Guy, W. Hoschek, C. Kesselman, P. Kunszt, M. Ripeanu, B. Schwartzkopf, H. Stockinger, and B. Tierney, "Giggle: A Framework for Constructing Scalable Replica Location Services," Proc. ACM/IEEE Conf. Supercomputing (SC), 2002.
[4] Y. Deswarte, L. Blain, and J.C. Fabre, "Intrusion Tolerance in Distributed Computing Systems," Proc. IEEE Symp. Research in Security and Privacy, 1991.
[5] http://csepi.utdallas.edu/epc_center.htm, 2008.
[6] I. Foster and A. Lamnitche, "On Death, Taxes, and Convergence of Peer-to-Peer and Grid Computing," Proc. Second Int'l Workshop Peer-to-Peer Systems (IPTPS), 2003.
[7] http://www.ccrl-nece.de/gemss/reports.shtml, 2008.
[8] Global Information Grid, Wikipedia.
[9] www.globus.org, 2008.
[10] J. Gray, P. Helland, P. O'Neil, and D. Shasha, "The Dangers of Replication and a Solution," Proc. ACM SIGMOD, 1996.
[11] C. Jin, Q. Chen, and S. Jamin, "INET: Internet Topology Generator," Technical Report CSE-TR-433-00, EECS Dept., Univ. of Michigan, 2000.
[12] K. Kalpakis, K. Dasgupta, and O. Wolfson, "Optimal Placement of Replicas in Trees with Read, Write, and Storage Costs," IEEE Trans. Parallel and Distributed Systems, vol. 12, no. 6, 2001.
[13] O. Kariv and S.L. Hakimi, "An Algorithmic Approach to Location Problems—II: The p-Medians," SIAM J. Applied Math., vol. 37, no. 3, 1979.
[14] H. Krawczyk, "Distributed Fingerprints and Secure Information Dispersal," Proc. 12th Ann. ACM Symp. Principles of Distributed Computing (PODC), 1993.
[15] H. Krawczyk, "Secret Sharing Made Short," Proc. 13th Ann. Int'l Cryptology Conf. (Crypto), 1993.
[16] J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea, H. Weatherspoon, W. Weimer, C. Wells, and B. Zhao, "OceanStore: An Architecture for Global-Scale Persistent Storage," Proc. Ninth Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2000.
[17] S. Lakshmanan, M. Ahamad, and H. Venkateswaran, "Responsive Security for Stored Data," IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 9, 2003.
[18] J.H. Lala, Foundations of the Intrusion Tolerant Systems (OASIS), IEEE CS, ISBN 076952057X.
[19] V. Matossian and M. Parashar, "Enabling Peer-to-Peer Interactions for Scientific Applications on the Grid," Proc. Ninth Int'l Euro-Par Conf. (Euro-Par), 2003.
[20] A. Mei, L.V. Mancini, and S. Jajodia, "Secure Dynamic Fragment and Replica Allocation in Large-Scale Distributed File Systems," IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 9, 2003.
[21] N. Nagaratnam, P. Janson, J. Dayka, A. Nadalin, F. Siebenlist, V. Welch, I. Foster, and S. Tuecke, The Security Architecture for Open Grid Services, Version 1, 2002.
[22] www.gloriad.org/gloriad/projects/project000053.html, 2008.
[23] V. Paxson, "End-to-End Routing Behavior in the Internet," IEEE/ACM Trans. Networking, vol. 5, no. 5, pp. 601-615, 1997.
[24] M. Rabin, "Efficient Dispersal of Information for Security, Load Balancing, and Fault Tolerance," J. ACM, vol. 36, no. 2, 1989.
[25] K. Ranganathan and I. Foster, "Identifying Dynamic Replication Strategies for a High Performance Data Grid," Proc. Second Int'l Workshop Grid Computing, 2001.
[26] M. Reiter and P. Rohatgi, "Homeland Security," IEEE Internet Computing, 2004.
[27] A. Samar and H. Stockinger, "Grid Data Management Pilot (GDMP): A Tool for Wide Area Replication," Proc. IASTED Int'l Conf. Applied Informatics (AI), 2001.
[28] A. Shamir, "How to Share a Secret," Comm. ACM, vol. 22, 1979.
[29] G. Singh, S. Bharathi, A. Chervenak, E. Deelman, C. Kesselman, M. Manohar, S. Patil, and L. Pearlman, "A Metadata Catalog Service for Data Intensive Applications," Proc. ACM/IEEE Conf. Supercomputing (SC), 2003.
[30] H. Stockinger, "Distributed Database Management Systems and the Data Grids," Proc. 18th IEEE Symp. Mass Storage Systems, 2001.
[31] B.M. Thuraisingham and J.A. Maurer, "Information Survivability for Evolvable and Adaptable Real-Time Command and Control Systems," IEEE Trans. Knowledge and Data Eng., vol. 11, no. 1, Jan. 1999.
[32] M. Tu, "A Data Management Framework for Secure and Dependable Data Grid," PhD dissertation, Univ. of Texas at Dallas, http://www.utdallas.edu/~tumh2000/ref/Thesis-Tu.pdf, July 2006.
[33] http://www.whitehouse.gov/reports/katrina-lessons-learned/, 2008.
[34] O. Wolfson and A. Milo, "The Multicast Policy and Its Relationship to Replicated Data Placement," ACM Trans. Database Systems, vol. 16, no. 1, 1991.
[35] O. Wolfson, S. Jajodia, and Y. Huang, "An Adaptive Data Replication Algorithm," ACM Trans. Database Systems, vol. 22, no. 2, 1997.
[36] T. Wu, M. Malkin, and D. Boneh, "Building Intrusion Tolerant Applications," Proc. DARPA Information Survivability Conf. and Exposition (DISCEX), 2000.
[37] J. Wylie, M. Bakkaloglu, V. Pandurangan, M. Bigrigg, S. Oguz, K. Tew, C. Williams, G. Ganger, and P. Khosla, "Selecting the Right Data Distribution Scheme for a Survivable Storage System," Technical Report CMU-CS-01-120, Carnegie Mellon Univ., 2000.
[38] L. Xiao, I. Yen, Y. Zhang, and F. Bastani, "Evaluating Dependable Distributed Storage Systems," Proc. Int'l Conf. Parallel and Distributed Processing Techniques and Applications (PDPTA), 2007.

Manghui Tu received the PhD degree in computer science from the University of Texas at Dallas in 2006. He is an assistant professor in the Department of Computer Science and Information Systems, Southern Utah University, Cedar City. His research interests include information security, computer forensics, distributed systems, grid computing, and ecological informatics. He is a member of the IEEE.
Peng Li received the BSc and MS degrees in computer science from the Renmin University of China and the PhD degree in computer science from the University of Texas at Dallas. He is the chief software architect of Didiom LLC. Before that, he was a visiting assistant professor in the Department of Computer Science, Western Kentucky University. His research interests include database systems, database security, transaction processing, distributed and Internet computing, and E-commerce.

I-Ling Yen received the BS degree from Tsing-Hua University, Beijing, and the MS and PhD degrees in computer science from the University of Houston. She is currently a professor of computer science in the Department of Computer Science, University of Texas at Dallas, Richardson. Her research interests include fault-tolerant computing, security systems and algorithms, distributed systems, Internet technologies, E-commerce, and self-stabilizing systems. She has published more than 100 technical papers in these research areas and received many research awards from the US National Science Foundation, Department of Defense, National Aeronautics and Space Administration, and several industry companies. She has served as a program committee member for many conferences and as the program chair/cochair for the IEEE Symposium on Application-Specific Software and System Engineering and Technology, IEEE High Assurance Systems Engineering Symposium, IEEE International Computer Software and Applications Conference, and IEEE International Symposium on Autonomous Decentralized Systems. She is a member of the IEEE.

Bhavani Thuraisingham received the degrees from the University of Bristol and the University of Wales. She joined the Department of Computer Science, University of Texas at Dallas (UTD), Richardson, in October 2004 as a professor of computer science and the director of the Cyber Security Research Center in the Erik Jonsson School of Engineering and Computer Science. Her current research interests include assured information sharing and trustworthy semantic web, secure geospatial information management, and security, surveillance, and privacy technologies. She is an elected fellow of three professional organizations: the Institute for Electrical and Electronics Engineers (IEEE), the American Association for the Advancement of Science (AAAS), and the British Computer Society (BCS), for her work in data security. She received the IEEE Computer Society's prestigious 1997 Technical Achievement Award for "outstanding and innovative contributions to secure data management." Prior to joining UTD, she worked for MITRE Corp. for 16 years, which included an Intergovernmental Personnel Act (IPA) assignment at the National Science Foundation as the program director for data and applications security. At MITRE, she conducted research in secure data management and data mining, was the department head in Information and Data Management, and served as a consultant to the DoD, the Intelligence Community, and the Treasury. Her work in information security and information management has resulted in more than 80 journal articles, more than 200 refereed conference papers, and three US patents. She is the author of eight books in data management, data mining, and data security. She teaches courses in data security and digital forensics. She is actively involved in promoting Math and Science for women and underrepresented minorities and gives talks at SWE, WITI, and Career Communications Inc. She is a fellow of the IEEE.