
2013 IEEE Seventh International Symposium on Service-Oriented System Engineering

A Self-Stabilizing Process for Mobile Cloud Computing


Hyun Jung La and Soo Dong Kim*
Department of Computer Science, Soongsil University
511 Sangdo-Dong, Dongjak-Ku, Seoul, Korea 156-743
{hjla80, sdkim777}@gmail.com
* Corresponding Author

Abstract: Mobile computing provides the benefits of mobility, convenience, and convergence. However, mobile devices have limited computing power and resources. An effective approach to remedying this problem is for mobile apps to subscribe to cloud services, called Mobile Cloud Computing (MCC). However, two potential problems exist in MCC: low quality of service (QoS) and limited manageability. Managing MCC is challenging, mainly due to the performance degradation caused by network-based service invocations, the dynamism caused by mobility, the high heterogeneity of mobile platforms and services, and the overhead of managing the various elements of MCC, which are highly distributed in nature. To provide a comprehensive and practical solution for MCC, in this paper we propose a self-stabilizing process and its management-related methods. We first define an extended meta-model of MCC with self-stabilization-related elements. We then define a main process for self-stabilization and a supplementary process for optimal clustering. We conduct experiments with the proposed process and methods, and the results show that self-stabilization and its methods are effective in stabilizing MCC in the presence of QoS degradation and severe resource drain.

Keywords: Mobile Cloud Computing, QoS, Autonomous Management, Self-stabilization

I. INTRODUCTION

Mobile devices such as smartphones and tablet PCs are widely accepted as convergence machines that provide both cell-phone capability and a lightweight computing capability. The potential of mobile devices goes beyond that of traditional PCs due to their support for mobility and context sensing. However, mobile devices have a major drawback of limited resources due to their small form factor [1][2]. Consequently, large-scale applications that consume large amounts of resources cannot be deployed on the devices. An effective approach to overcoming this limitation is for mobile apps to subscribe to cloud services, namely, Mobile Cloud Computing (MCC) [3][4][5].

A key technical challenge in adopting MCC is to effectively manage the quality of service (QoS). More specifically, the following potential problems should be addressed:
- Low performance due to momentary congestion on services
- Application failures due to the failures of subscribed services
- Unstable QoS due to the wireless mobile network and the mobility of mobile app users
- Lack of remedy methods on the service consumer's side when subscribed services reveal faults or low QoS, due to the limited visibility and manageability of cloud services

To effectively handle these technical challenges of MCC, in this paper we propose a comprehensive and practical process for self-stabilizing mobile cloud environments. The process is based on the paradigm of autonomic computing [6], and it deploys a number of practical QoS management schemes which can be implemented with currently available technologies and tools. By adopting the self-stabilizing process, most of the quality problems in MCC can be effectively resolved.

For the organization of the paper, we survey related work in section II and present a formal view of MCC in section III, which becomes the basis for defining the self-stabilization process and schemes. In section IV, the main process for self-stabilizing MCC is defined with its key algorithms. Section V presents our simulation work, which assesses the practical applicability of the proposed process and schemes. By adopting the self-stabilization process for MCC, many of the technical issues are effectively resolved, and mobile cloud environments can maintain consistent levels of quality in an autonomous manner.

II. RELATED WORKS

There are several approaches to statically or dynamically offloading applications at runtime to reduce the computation burden on mobile devices in a mobile cloud computing environment. Chun's work presents an approach to dynamic partitioning of applications between resource-constrained mobile devices and clouds, to adapt workloads dynamically [7]. They present the main issues of dynamic partitioning, such as how to dynamically execute partitioned functionality and how to determine the right configuration at runtime. They elaborate the system, CloneCloud, which enables unmodified mobile applications running in a virtual machine to seamlessly offload computation to clouds [8]. The system consists of three components: a static analyzer, which identifies legal choices for migration and re-integration points in the code; a dynamic profiler, which creates a profile tree representing execution on a single platform together with execution time and energy consumption; and an optimization solver, which determines the most appropriate migration method by using the profile tree. This work only considers offloading pre-determined



components to the designated cloud node, and it presents few details on the methods performed by each component.

Satyanarayanan's work presents an approach to executing computation-intensive software from a mobile device by using virtual machine (VM) technology [9]. The computation is migrated to a cloudlet, a trusted, resource-rich computer or cluster of computers near the user, rather than to a remote cloud. The migration is realized by VM synthesis, where a small VM delivered from the mobile device is overlaid on a cloudlet holding the base VM. This work only conceptually describes how VM synthesis performs, without presenting the overall architecture.

Cuervo's work presents MAUI, a system that enables fine-grained, energy-aware offload of mobile code to the infrastructure [10]. MAUI adopts a client-server architectural style, and both sides embed three components: a proxy, a profiler, and a solver. The profiler determines whether a method invocation runs locally or remotely by using device, program, and network profiling. The solver finds the optimal program partitioning strategy in terms of energy consumption and latency. Using reflection, the server proxy extracts and executes remote methods which are pre-defined at design time. With MAUI, code is offloaded to a designated server, which considerably constrains its application to MCC.

Malek's work presents an architecture-driven framework supporting the entire life-cycle of a mobile software system [11]. It consists of tools for assessing, deploying, and migrating mobile software systems at runtime. This work focuses on elaborating the functionality of the tools without detailed realization mechanisms. Han's work presents an adaptive software architecture supporting component migration and redeployment at runtime [12]. The work proposes four types of connectors for the adaptive architecture: link, pull, copy, and stamp. However, implementation issues of the connectors are not treated.

In addition, there has been research proposing middleware or frameworks to support offloading mechanisms, such as Li's work [13], Yang's work [14], Ye's work [15], and Hung's work [16]. There are also similar approaches based on Cyber Foraging [17], such as Flinn's work [18], Balan's work [19], and Kalasapur's work [20].

III. FORMAL VIEW OF MOBILE CLOUD COMPUTING

Figure 1. Meta-Model of Mobile Cloud Computing (elements: MCC Environment; Node with its four subtypes Mobile Node, Desktop Node, Cloud Node, and Server Node; OS Platform; Middleware; Application with its Components; Service with Service Type and Service Instance; and MCC Manager)

A. Meta-Model of MCC

In this section, we define the key elements of the mobile cloud computing environment and their relationships, which serve as the basis for defining our self-stabilizing process and schemes. The meta-model is derived from our extensive technical survey of the literature on mobile clouds [3][4][17], and the model is consistent with existing mobile cloud platform architectures including OpenShift [21] and Cloud Foundry [22].

As shown in Figure 1, an MCC Environment consists of six types of elements: Node, OS Platform, Middleware, Application, Cloud Service, and MCC Manager. A Node is a physical computing device and can be one of four types: a Mobile Node is a resource-constrained device such as a smartphone or tablet PC; a Desktop Node is a resource-unconstrained stationary node such as a typical personal computer; a Cloud Node is a cloud-based virtual computer such as an Amazon EC2 instance; and a Server Node is a resource-rich, high-powered computer delivering server-side functionality. A Node typically runs an operating system, shown as OS Platform in the figure. On top of the operating system, a middleware such as J2EE or .NET may be deployed to provide system-level common functionality to applications. The Application in the figure is typically a mobile app, and a Cloud Service provides reusable software functionality to service consumers [23]. The type of cloud services in the ecosystem will largely be Component-as-a-Service (CaaS). Each service has a type, Service Type, and instances of a service type (denoted Service Instance) are actually deployed in a service repository [23]. In mobile cloud computing, applications often subscribe to Cloud Services to reuse functionality and to remedy the resource limitations of mobile devices.

A heterogeneous, dynamic environment such as a mobile cloud ecosystem typically employs a management sub-system, denoted MCC Manager in the figure. Its main role is to carry out a set of tasks to monitor the ecosystem in order to maintain a consistent level of QoS. It does so by monitoring the activities of the various elements, diagnosing the causes of faults and low QoS, and applying remedy actions in an autonomous manner. Implementations of the MCC Manager depend largely on the underlying MCC environment and the policies governing QoS.

B. Meta-Model Extended with Self-Stabilization Features

From a typical configuration of these elements, we can make several observations which exhibit technical challenges in realizing MCC:
- Overall complexity due to the extremely high level of heterogeneity
- Increased burden in designing mobile cloud applications, which must consider remote services and the dynamic nature of the ecosystem
- Overhead for maintaining up-to-date states of the various elements in the repository
- Potentially low QoS due to congestion when a large number of invocations are concentrated on a service at a time
- Potentially low reliability of mobile apps due to the limited resources of mobile nodes, such as battery and memory
- Inefficiency of managing the various elements of many nodes with a centralized MCC Manager

To address these technical challenges and provide effective solutions, we extend the initial meta-model with self-stabilization-related elements, as shown in Figure 2, and discuss the essence and roles of the additional elements here. We introduce the unit of a Cluster, which consists of nodes with high coupling among them, where coupling is the relative strength of the dependency between two nodes in terms of direct invocations and shared datasets [21]. Managing the various elements and their qualities in the ecosystem requires a number of tasks: monitoring the activities of nodes, applications, services, and the network; detecting anomalous situations such as faults and QoS degradation; planning to remedy those situations; and performing remedy actions such as service migration, replication, and re-routing, all in an autonomous manner.
Figure 2. Extended Meta-Model of MCC (additional elements: Cluster; the MCC Manager hierarchy of Global Manager, Cluster Manager, and Node Manager; the Remotability interface implemented by Components; and the Configuration Repositories)
When the number of nodes in the ecosystem is large, the overhead of managing the ecosystem to stabilize overall quality is considerably high, and centralized management becomes infeasible. Our approach to balancing the management overhead is to partition the whole ecosystem into smaller regions, i.e., clusters, and to perform both intra-cluster and inter-cluster management. As shown in the figure, a Cluster Manager performs intra-cluster management, while the Global Manager performs inter-cluster management.

A common and effective way to ensure consistent QoS and to remedy faults in a mobile cloud ecosystem is to provide the capability of dynamic deployment and offloading of components and services [3]. However, delivering this capability is highly challenging, mainly due to the heterogeneity discussed above: implementations of components have some dependency on the operating system, middleware, and/or programming language. For instance, an Activity component of Android cannot be deployed and run on iOS-based nodes. The interface Remotability supports the development of components that provide the capability of dynamic deployment and offloading. In the figure, an Application consists of software components, and each Component may optionally implement Remotability, which defines the signatures and semantics of the methods needed to support dynamic configuration of cloud services. Only components that implement Remotability can be dynamically managed by the MCC Manager.

There are three types of MCC Manager: Global Manager, Cluster Manager, and Node Manager, as shown in the figure. A Node Manager runs on a node where applications are deployed; it monitors the activities of the node, communicates with its Cluster Manager, and carries out management tasks instructed by the Cluster Manager. Each application node runs a Node Manager in the background. A Cluster Manager oversees the activities of the nodes in its cluster, communicates with the Global Manager, and carries out management tasks instructed by the Global Manager. With this hierarchy of managers in terms of management scope, parallelism among management tasks is increased and the network overhead for exchanging management data is decreased. Consequently, the overall efficiency of ecosystem management can be greatly enhanced compared to the typical centralized master-slave style of management. The MCC Manager keeps track of the states of the various elements of the ecosystem in Configuration Repositories, including profiles of nodes, clusters, applications, and services, and QoS measures.
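The paper does not spell out Remotability's method signatures; the following Java sketch is one plausible shape, with all method names (captureState, restoreState, redirectInvocations, isRelocatable) chosen here for illustration rather than taken from the paper.

```java
// Hypothetical sketch of the Remotability interface; the method names and
// signatures are illustrative assumptions, not the paper's actual definition.
public interface Remotability {
    /** Capture a snapshot of the component's session state before migration. */
    byte[] captureState();

    /** Restore a previously captured snapshot on the destination node. */
    void restoreState(byte[] snapshot);

    /** Redirect incoming invocations to the component's new location. */
    void redirectInvocations(String destinationNodeId);

    /** Report whether the component can currently be moved (e.g., not mid-transaction). */
    boolean isRelocatable();
}
```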
Based on the extended meta-model, we present an example of a typical deployment in Figure 3.

Figure 3. Example of a Deployment Configuration with the Meta-Model (three clusters of mobile, desktop, server, and cloud nodes; each cluster runs a Cluster Manager with a Cluster Configuration Repository, and the Global Manager maintains the Global Configuration Repository)

In the example configuration, three types of managers are deployed, and the nodes are grouped into clusters for more efficient QoS management. The algorithm and the roles of clusters are described in section IV-B.

IV. THE PROCESSES AND ALGORITHMS

A. Process to Stabilize QoS

Based on the extended meta-model, we define the self-stabilizing process using the activity diagram in Figure 4. The process consists of nine activities and is run by the three types of managers presented in section III-B: Node Manager, Cluster Manager, and Global Manager.
Figure 4. Main Process to Stabilize QoS (swimlanes for Node Manager, Cluster Manager, and Global Manager; activities: Measure Node Activity & Resource, Transfer Measurement, Evaluate QoS & Stability, Plan Cluster-level Remedy, Execute Cluster-level Remedy Plan, Update Cluster Configuration, Plan Global-level Remedy, Execute Global-level Remedy Plan, and Update Global Configuration)

Each node is deployed with a Node Manager, which performs the first two activities. Activity 1 is to measure two types of information:
- Activities of the applications and services deployed on the node, such as ResponseTime(APPi, SVCj) for a mobile application APPi invoking a service SVCj, and AveTurnaroundTime(SVCj) for a service SVCj deployed on a server node.
- Usage and availability of the node's resources, such as available free memory and the amount of battery remaining.

The frequency of measurement can be set on either an interval basis or an event basis. Activity 2 is to transfer the measured data to the node's Cluster Manager, which first stores the data in its Cluster Configuration Repository. To minimize the network communication overhead caused by the transfer of measured data, we apply the following tactics (see the sketch at the end of this subsection):
- Transfer only when the measured data is significantly offset from earlier measurements. That is, a new measurement whose values are nearly the same as the previous ones is not transmitted.
- Encode the various measured values into a single packet whose format is bit-wise rather than byte-wise. This saves a considerable amount of transmitted data.

Each cluster employs a Cluster Manager, which performs Activities 3 through 6. Activity 3 is to evaluate QoS values for the pre-defined quality attributes, using the transmitted measurement data and past measurements retrieved from the Cluster Configuration Repository. It then determines the stability of the nodes from the computed QoS values and preset minimum and maximum threshold values for the quality attributes; the algorithm is given in section IV-C. Activity 4 is to generate a cluster-level remedy plan when the stability of the nodes and the cluster is considerably hampered. If a cluster-level remedy plan is successfully generated, Activity 5 is carried out to stabilize the cluster. In the course of executing the plan, Node Managers are instructed by the Cluster Manager to carry out node-level tasks. For example, the remedy scheme of offloading requires node-level tasks such as capturing a session snapshot, transmitting code/data, and delegating incoming invocation messages. Activity 6 is to update the Cluster Configuration Repository with the new configuration set by the remedy schemes.

Activity 7 is to plan global-level remedy actions; it runs only when generation of a cluster-level remedy plan fails. The typical situation requiring global-level remedy actions is when intra-cluster remedy actions cannot resolve the instability. Hence, the global-level remedy plan includes inter-cluster remedy actions, such as migrating services from one cluster to another. Activity 8 is to execute the global-level remedy plan. In the course of running the plan, Cluster Managers are instructed by the Global Manager to carry out cluster-level tasks, and the Cluster Managers may in turn instruct Node Managers to carry out node-level tasks. Activity 9 is to update the Global Configuration Repository with the new configuration set by the remedy schemes.
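As a concrete illustration of Activities 1 and 2, the sketch below shows how a Node Manager might apply the offset-based transfer tactic. The class name, the 10% offset threshold, and the sendToClusterManager() stub are illustrative assumptions; the paper prescribes the tactic but not an implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the offset-based transfer tactic of Activity 2.
// The 10% threshold and the transfer stub are assumptions for this example.
public class NodeManagerSketch {
    private static final double SIGNIFICANT_OFFSET = 0.10; // 10% relative change
    private final Map<String, Double> lastSent = new HashMap<>();

    /** Called on each measurement tick (interval or event basis). */
    public void onMeasurement(String metric, double value) {
        Double previous = lastSent.get(metric);
        if (previous == null || relativeOffset(previous, value) >= SIGNIFICANT_OFFSET) {
            sendToClusterManager(metric, value); // bit-wise encoded in a real system
            lastSent.put(metric, value);
        }
        // Otherwise the value is nearly the same as the last one sent: skip it.
    }

    private double relativeOffset(double previous, double current) {
        if (previous == 0.0) return Math.abs(current);
        return Math.abs(current - previous) / Math.abs(previous);
    }

    private void sendToClusterManager(String metric, double value) {
        // Placeholder for the network transfer to the Cluster Manager.
        System.out.printf("transfer %s=%.3f%n", metric, value);
    }
}
```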

B. Process to Maintain Optimal Clusters

One of the strategies to minimize the management overhead while increasing the overall QoS in the proposed process is to distribute the management chores among cluster managers. Each cluster is assigned nodes with high mutual dependency/coupling, while inter-cluster dependency is minimized. An example of a clustering configuration is shown in Figure 5, where three clusters and the Global Manager are shown. At the center of each cluster in the figure is a Cluster Manager, marked with a star icon. All the mobile, desktop, server, and cloud nodes in the same cluster are managed by their Cluster Manager.

Figure 5. An Example of a Clustering Configuration

In the course of service invocations, running remedy actions, and deploying or deleting nodes, applications, and services, the overall configuration of the mobile cloud environment is modified. Hence, re-clustering should be performed to reflect the changes and to yield a new optimal set of clusters. We first define the principles for optimal clustering as follows:
- Cohesion of each cluster should be higher than the pre-defined value CohesionThreshold.
- Coupling between any two clusters should be lower than the pre-defined value CouplingThreshold.
- Efficiency of each cluster should be higher than the pre-defined value EffiThreshold.
- Resource availability of each cluster should be higher than the pre-defined value ResourceAvailThreshold.

The first two principles are for the independence of clusters, and the latter two are for the QoS of time efficiency and resource availability. Based on these principles, we define the clustering process in Figure 6.

Figure 6. Supplementary Process for Clustering (activities: Evaluate Cohesion & Coupling, Evaluate Efficiency & Resource, Acquire Locks, Run Clustering Algorithm, Update Configuration Repositories)

Activities 1 and 2 acquire values from the configuration repositories and compute the four values: Cohesion, Coupling, Efficiency, and Resource Availability. If any of the four values is abnormal with respect to its threshold, re-clustering is needed. Activity 3 acquires locks on the elements that the clustering algorithm manipulates, to ensure mutual exclusion between the two processes; until the locks are released by the clustering process, the main process may be put into a paused state. Once the locks are acquired, Activity 4 runs the clustering algorithm, which is elaborated below. Activity 5 updates the configuration repositories with the modified cluster information.

This process runs independently from the main process, since the goals of the two processes are orthogonal, although both contribute to reaching high stability. However, the two processes share the information in the repositories. Hence, we employ mutual exclusion policies on the information manipulated by the clustering process. That is, the process waits until it acquires locks on all the data items it manipulates, performs the clustering, and then releases the locks. This avoids the overwriting and inconsistency problems that could otherwise arise on the shared resources.
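The re-clustering decision of Activities 1 and 2 reduces to comparing the four computed values against their thresholds. The sketch below assumes the threshold names from the principles above; the numeric values are placeholders, and the computation of the metrics themselves is assumed to happen elsewhere.

```java
// Sketch of the re-clustering trigger; the threshold fields mirror the four
// clustering principles. The metric arguments are assumed to be computed
// from the configuration repositories by code not shown here.
public class ReclusteringTrigger {
    double cohesionThreshold = 0.7;      // illustrative values only
    double couplingThreshold = 0.3;
    double effiThreshold = 0.6;
    double resourceAvailThreshold = 0.5;

    public boolean needsReclustering(double minClusterCohesion,
                                     double maxInterClusterCoupling,
                                     double minClusterEfficiency,
                                     double minResourceAvailability) {
        return minClusterCohesion < cohesionThreshold
            || maxInterClusterCoupling > couplingThreshold
            || minClusterEfficiency < effiThreshold
            || minResourceAvailability < resourceAvailThreshold;
    }
}
```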

To maximize the parallelism among cluster managers and to minimize the interactions among clusters, the cohesion of each cluster should be high and the coupling among clusters should be low. For this, it is essential to group nodes which are near enough to each other, and we define and use a function Nearness(NODEi, NODEj), which returns the comprehensive logical distance between two nodes NODEi and NODEj. It returns a value in the range between 0 and 1, where 0 represents identity (i.e., 100% nearness) and 1 represents extreme distance (i.e., 0% nearness). The return value can be calibrated into [0, 1] by considering the maximum possible and measured values. In computing the nearness, a number of nearness-related factors should be considered. For this, we use a list of factors, FLIST, and define the following five factors. Note that this list is not meant to be complete; rather, the factors should be adjusted to the specific domain and characteristics of the target mobile cloud environment.
- FLIST[1] = Geographical distance between two nodes, if their locations can be inferred.
- FLIST[2] = Declared network bandwidth between two nodes.
- FLIST[3] = Measured network bandwidth between two nodes over a pre-defined time period, such as 10 minutes.
- FLIST[4] = Frequency of interaction between two nodes, where an interaction is typically a service invocation.
- FLIST[5] = Average response time for an application on one node invoking a service deployed on the other node during a pre-defined time period, such as 30 minutes.
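The paper leaves open how these factors are combined into a single distance. One plausible implementation, sketched below, takes a weighted mean of the factor distances after each has been normalized into [0, 1], and skips factors whose values are unavailable (marked here with Double.NaN); the weighting scheme is an assumption.

```java
// Illustrative Nearness(): a weighted mean over the available FLIST factors,
// each pre-normalized to [0, 1] where 0 = identical and 1 = maximally far.
// Double.NaN marks a factor that is unavailable for this node pair.
public final class Nearness {
    public static double between(double[] normalizedFactors, double[] weights) {
        double weightedSum = 0.0, usedWeight = 0.0;
        for (int i = 0; i < normalizedFactors.length; i++) {
            if (Double.isNaN(normalizedFactors[i])) continue; // skip missing factor
            weightedSum += weights[i] * normalizedFactors[i];
            usedWeight += weights[i];
        }
        // If nothing is measurable, treat the pair as maximally distant.
        return usedWeight == 0.0 ? 1.0 : weightedSum / usedWeight;
    }
}
```

For example, Nearness.between(new double[]{0.2, Double.NaN, 0.4, 0.1, 0.3}, new double[]{1, 1, 1, 2, 2}) computes the distance from the four available factors and ignores the unavailable FLIST[2] value.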


The function should be implemented with consideration of the availability of the factor values: if the value of a factor is not available or not applicable, the function should calculate the nearness without that factor. With the Nearness function available, we utilize two clustering algorithms: K-Means clustering [24] when there is a relatively stable value of k for the number of clusters, or hierarchical clustering [24] when the value of k may evolve considerably. With the K-Means algorithm, each node is placed in the cluster with the nearest mean, calculated with the function Nearness() as defined earlier. That is, the algorithm aims to partition the n nodes into k sets (k <= n), S = {S1, S2, ..., Sk}, so as to minimize the within-cluster sum of squares.

C. Algorithm to Evaluate Stability

To let Activity 3 determine cluster-level stability, we first define a set of five stability states:
- Normal: The stability is in the normal range; no remedy action is required.
- Poor: The stability has deteriorated; remedy actions for the low stability are expected.
- Exceeding: The stability exceeds the high threshold, indicating a potential waste of resources.
- Likely Poor: The stability is probably deteriorating; no remedy action is needed yet, but subsequent decisions will consider this state. The decision requires historical data or the next measured quality.
- Likely Exceeding: The current state probably exceeds the high threshold; no remedy action is needed yet, but subsequent decisions will consider this state.

In determining the stability state, we use the following threshold values:
- Threshold_high(x), the high-bound threshold value of element x,
- Threshold_low(x), the low-bound threshold value of element x, and
- Range(x), the effective measurement range of element x,
where x is one of the quality attributes measured by a node manager. Range(x) is needed to distinguish the Poor state from Likely Poor, and Exceeding from Likely Exceeding. If the quality x relates to an application or a node, the three values are defined by an end-user. If x indicates the quality of a service, the values are defined by the Service Level Agreement (SLA), the part of a service contract where the level of service is formally defined and which contains the levels of quality that the service provider can guarantee. By comparing the measured quality attributes to the threshold and range values, the cluster manager performs node-level evaluation as shown in Figure 7.

Figure 7. Five Levels of Stability (the states Poor, Likely Poor, Normal, Likely Exceeding, and Exceeding laid out along the measured-quality axis, delimited by Threshold_low(x), Range(x), and Threshold_high(x))
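The decision can be expressed as a comparison of a measured quality x against the two thresholds and the range. The paper does not fully specify how Range(x) delimits the "likely" states; the sketch below assumes that a band of width Range(x) just outside each threshold yields the corresponding "likely" state.

```java
// Sketch of the five-state stability decision of Figure 7, under the
// assumption that a band of width `range` just outside each threshold
// produces only a "likely" state rather than a definite one.
public class StabilityEvaluator {
    public enum Stability { POOR, LIKELY_POOR, NORMAL, LIKELY_EXCEEDING, EXCEEDING }

    public static Stability evaluate(double x, double low, double high, double range) {
        if (x < low - range)  return Stability.POOR;             // clearly deteriorated
        if (x < low)          return Stability.LIKELY_POOR;      // just below the low threshold
        if (x > high + range) return Stability.EXCEEDING;        // clear waste of resources
        if (x > high)         return Stability.LIKELY_EXCEEDING; // just above the high threshold
        return Stability.NORMAL;                                 // within [low, high]
    }
}
```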

At a pre-defined time interval, the cluster manager calculates the average quality of applications, service instances, service types, and nodes.

D. Algorithm for Generating Remedy Plans

We consider two aspects in making a remedy plan, as shown in Figure 8: the Goal and the Coverage of the remedy plan.
Figure 8. Two Aspects in Generating Remedy Plans (Goal: Enhancing Stability or Releasing Resources; Coverage: Cluster-level or Global-level Remedy)

The goal for the Poor state is to enhance stability, while the goal for the Exceeding state is to release the relevant resources. The coverage of a remedy plan can be either cluster-level or global-level; in either case, node-level remedy tasks such as deploying new service instances are performed. The complexity of the planning largely depends on the second aspect. If the QoS problem relates only to a specific node, a remedy plan is made in two steps: selecting one of the remedy actions to be taken, and deciding a target destination node for the action. The procedure for generating remedy plans consists of the following tasks:
- Identifying the causes of the quality problem
- Selecting the most appropriate remedy action
- For the chosen remedy action, defining a concrete remedy plan, including which components or services are moved and which nodes are the targets for them

The first task is to identify causes by analyzing the result of the quality-stability evaluation. From the measured data alone, the cluster manager only knows, via Activity 3 and the algorithm presented in the previous section, whether a quality problem has occurred. To generate a remedy plan, the cluster manager needs to know the underlying causes of the problem. For example, the slow response time of an application can result from a lack of resources or from the slow response time of a service subscribed by the application.

The second task is to choose the most appropriate remedy action among the six candidates shown in Figure 9. Service routing changes the direction of a service invocation by choosing another service instance implementing the same service type. Re-clustering changes the configuration of the current clusters; that is, re-clustering modifies and refines the current clusters by considering logical distances.
Figure 9. Six Types of Remedy Actions (Service Routing, Service Migration, Service Deployment, Service Replication, Component Offloading, and Re-Clustering)

Among them, four remedy actions (service deployment, service migration, service replication, and component offloading) are related to physical deployments. Table 1 summarizes the functionality of these remedy actions.

TABLE 1. FUNCTIONALITY OF THE REMEDY ACTIONS
Remedy Action | Relevant Element | Source Node | Target Node | Task to Perform
Service Migration | Service Instance | Server Node | Server Node, Desktop Node, Cloud Node | Move
Service Replication | Service Instance | Server Node | Server Node, Desktop Node, Cloud Node | Copy
Component Offloading | Mobile Application | Mobile Node | All Types of Nodes | Copy
Service Deployment | Service Instance | Server Node | Mobile Node | Copy

The determination of which remedy action to apply depends on the underlying causes. We specify the dependency relationships between causes and remedy actions in the form of rules, which are used by the managers. Table 2 shows a part of the rules used in making node-level plans. If there are multiple candidate actions, the cluster manager evaluates their potential benefits by using utility functions such as a cost function.

TABLE 2. RULES FOR NODE-LEVEL REMEDY ACTIONS
Fault | Cause | Remedy Actions
Low QoS on Application | Low Resource on Node | Component Offloading
Low QoS on Application | Low QoS of Service Invoked by the Application | Service Routing, Service Replication
Low QoS on Service | Low Resource on Node | Service Migration, Service Routing
Low QoS on Service | Unstable Network, Extremely Large Number of Invocations | Service Replication, Service Routing

Table 3 shows examples of the rules used in generating cluster/global-level plans.

TABLE 3. RULES FOR CLUSTER/GLOBAL-LEVEL REMEDY ACTIONS
Fault | Cause | Remedy Actions
Inequality of QoS on Applications in the Cluster | Low QoS of Service Invoked by the Application | Re-clustering
Low QoS on Applications in the Cluster | Low Resource on Node | Component Offloading on Multiple Nodes
Low QoS on Applications in the Cluster | Normal Resource State on Node | Service Routing, Service Deployment
Inequality of QoS on Services in the Cluster | Unstable Network, Extremely Large Number of Invocations | Service Routing (i.e., Load Balancing)
Low QoS on Services in the Cluster | Low Resource on Node | Service Migration, Re-clustering (if needed)
Low QoS on Services in the Cluster | Unstable Network, Extremely Large Number of Invocations | Re-clustering
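Rules such as those in Table 2 can be encoded as a lookup from (fault, cause) pairs to candidate remedy actions, with a utility or cost function ranking multiple candidates. The sketch below is one possible encoding; the string keys and the preference order within each list are assumptions, not part of the paper.

```java
import java.util.List;
import java.util.Map;

// Sketch of a rule table for node-level remedy planning (after Table 2).
// Keys are "fault|cause" pairs; values are candidate actions. In practice,
// a cost/utility function would rank multiple candidates.
public class RemedyRules {
    static final Map<String, List<String>> NODE_LEVEL_RULES = Map.of(
        "LowQoSOnApplication|LowResourceOnNode",
            List.of("ComponentOffloading"),
        "LowQoSOnApplication|LowQoSOfInvokedService",
            List.of("ServiceRouting", "ServiceReplication"),
        "LowQoSOnService|LowResourceOnNode",
            List.of("ServiceMigration", "ServiceRouting"),
        "LowQoSOnService|UnstableNetworkOrCongestion",
            List.of("ServiceReplication", "ServiceRouting"));

    /** Return the candidate remedy actions for a diagnosed fault and cause. */
    public static List<String> candidates(String fault, String cause) {
        return NODE_LEVEL_RULES.getOrDefault(fault + "|" + cause, List.of());
    }
}
```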

The third task is to make a more concrete decision on the chosen action, including the new destination nodes. This decision is made using Dijkstra's algorithm, the representative mechanism for finding shortest paths. To apply Dijkstra's algorithm, we assign weights to the edges that express logical closeness, defined as a combination of physical location, invocation frequency, bandwidth, available resources, and so on.

V. ASSESSMENT WITH EXPERIMENTS

To show the applicability of the self-stabilizing process and its methods, we ran a number of experiments and interpret the results as an assessment.

A. Experiment Setup

Based on the extended meta-model given in Figure 2, we create and deploy the following elements:
- Nodes: 2 Server Nodes, 2 Desktop Nodes, and 2 Mobile Nodes
- Applications: 4 mobile apps
- Cloud Services: 2 service types
- MCC Managers: 1 Global Manager, 2 Cluster Managers, and a Node Manager for each node

The deployment of the elements is shown as a UML deployment diagram in Figure 10. The initial setup for the experiments has two clusters. Cluster 1 is equipped with four mobile nodes deploying instances of two mobile applications (App1 and App2), one server node deploying one service (S11), and one desktop node deploying the cluster manager. Cluster 2 is equipped with four mobile nodes deploying instances of three mobile applications (App1, App3, and App4), one server node deploying one service (S21), and one desktop node deploying one service (S32). The global manager is not included in any cluster. The current configuration of the clusters is made by considering:
- S11 is invoked by App1 and App2, and S21 is invoked by App3 and App4
- Closeness of the nodes in the network


Figure 10. Deployment of the Elements for Experiments

B. Experiment for Cluster-Level Self-Stabilization

Using a Poisson distribution [25], we generate application executions with service invocations, which result in the faulty situations shown in Figure 11. The upper part of the figure shows the qualities of the four applications, and the lower part shows the qualities of the two services. For the experiment, we measured the response times of the applications and the services to evaluate stability. The x-axis indicates the elapsed time, and the y-axis indicates the quality of the elements, measured as efficiency.
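A Poisson-distributed invocation workload, as used here, is commonly generated by sampling exponentially distributed inter-arrival times. The sketch below illustrates this; the rate parameter and the invoked service name are arbitrary choices for the example.

```java
import java.util.Random;

// Sketch of a Poisson-process workload generator: inter-arrival times of
// service invocations are drawn from an exponential distribution with rate
// lambda (invocations per second). The rate value is illustrative only.
public class PoissonWorkload {
    public static void main(String[] args) {
        double lambda = 5.0;   // mean of 5 invocations per second
        double clock = 0.0;
        Random rng = new Random(42);
        for (int i = 0; i < 10; i++) {
            clock += -Math.log(1.0 - rng.nextDouble()) / lambda; // exponential gap
            System.out.printf("invoke S11 at t=%.3f s%n", clock);
        }
    }
}
```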
Figure 11. Quality of Applications and Services (upper: quality of the four applications over elapsed time; lower: quality of the services; Threshold_high and Threshold_low are marked on both plots)

Once the cluster manager of Cluster 1 receives quality data from the node managers of MNode1, DNode1, and SNode2, it evaluates the current stability in Activity 3 of Figure 4. With the measured data, the cluster manager of Cluster 1 figures out that the executions of App1 and App2 result in faults near the 50-second mark, where the quality of App1 and App2 drops below Threshold_low and the quality of S11, invoked by both applications, also drops below Threshold_low. In Activity 4, the cluster manager diagnoses that the fault results from congestion caused by an extremely high number of invocations of S11 in a short period of time. The cluster manager chooses replication of S11 as the remedy action, since DNode1 has a low level of applicable resources. The cluster manager then makes a more concrete plan, which is to replicate S11 to SNode2, since SNode2 is logically close to DNode1 and MNode1 and has enough resources. After replicating S11, the qualities of App1 and App2 return to normal (about 0.7 and 0.9, respectively), and the quality of S11 also increases. Note that, in the figure, the quality of the new service instance, S12, appears after the service replication. The reason for this stabilization is that the service invocations made by App1 and App2 are now dispersed over S11 and S12.

C. Experiment for Global-Level Self-Stabilization

In Figure 11, the cluster manager of Cluster 2 receives quality data from the node managers of MNode2 and DNode3 and figures out another fault occurrence. The fault occurs near the 211-second mark, where the quality of App3 drops below Threshold_low; no other quality degradation is detected at this time. In Activity 4, the cluster manager of Cluster 2 diagnoses that the fault results from the limited resources of MNode2. The cluster manager chooses offloading of App3's components as the remedy action, since App3 requires a large amount of resources. However, the cluster manager cannot produce a concrete plan, including the offloading target for App3, since none of the nodes in the cluster has enough resources. Hence, the cluster manager requests a global-level plan from the global manager, and the global manager recommends SNode2 as the offloading target for App3: SNode2 is equipped with enough resources and is logically closer to MNode2 and DNode2 than the other nodes. After offloading App3 to SNode2, the quality of App3 returns to normal (about 0.8), and no further fault is detected. The reason for this stabilization is that App3 now executes on a node with enough resources to run the application.

VI. CONCLUDING REMARKS



Mobile computing provides the benefits of mobility, convenience, and convergence, but it also reveals limitations in deploying complex applications. To overcome these limitations, MCC is emerging. However, two potential problems remain in MCC: low QoS and limited manageability. Current approaches to remedying these issues suffer from management overhead, performance degradation, inefficiency in handling dynamism, and difficulties in managing the heterogeneity of MCC.

In this paper, we presented a vision of and a practical approach to enabling self-stabilizing MCC. First, we defined an extended meta-model that takes self-stabilization into consideration. To achieve the stability of MCC, we proposed processes for self-stabilizing MCC built on the notions of clusters, MCC managers, QoS remedy planning, and remedy actions. Through the experiments conducted, the proposed processes and their methods are shown to be effective and applicable to a range of MCC applications.

ACKNOWLEDGMENT

This research was supported by the ICT Standardization program of MKE (the Ministry of Knowledge Economy). This research was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2012R1A6A3A01018389).

REFERENCES
[1] B. König-Ries and F. Jena, "Challenges in Mobile Application Development," it - Information Technology, vol. 52, no. 2, pp. 69-71, 2009, doi: 10.1524/itit.2009.9055.
[2] G.H. Forman and J. Zahorjan, "The Challenges of Mobile Computing," Computer, vol. 27, no. 4, pp. 38-47, 1994, doi: 10.1109/2.274999.
[3] N. Fernando, S.W. Loke, and W. Rahayu, "Mobile Cloud Computing: A Survey," Future Generation Computer Systems, vol. 29, pp. 84-106, 2013, doi: 10.1016/j.future.2012.05.023.
[4] L. Guan, X. Ke, M. Song, and J. Song, "A Survey of Research on Mobile Cloud Computing," Proc. 10th IEEE/ACIS International Conf. on Computer and Information Science (ICIS 2011), pp. 387-392, 2011, doi: 10.1109/ICIS.2011.67.
[5] Y. Natchetoi, V. Kaufman, and A. Shapiro, "Service-Oriented Architecture for Mobile Applications," Proc. 1st Int'l Workshop on Software Architectures and Mobility (SAM 2008), pp. 27-32, 2008, doi: 10.1145/1370888.1370896.
[6] R. Murch, Autonomic Computing, IBM Press, 2004.
[7] B.G. Chun and P. Maniatis, "Dynamically Partitioning Applications between Weak Devices and Clouds," Proc. 1st ACM Workshop on Mobile Cloud Computing & Services: Social Networks and Beyond (MCS 2010), Article No. 7, 2010, doi: 10.1145/1810931.1810938.
[8] B.G. Chun, S. Ihm, P. Maniatis, M. Naik, and A. Patti, "CloneCloud: Elastic Execution between Mobile Device and Cloud," Proc. 6th European Conf. on Computer Systems (EuroSys 2011), pp. 301-314, 2011, doi: 10.1145/1966445.1966473.
[9] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, "The Case for VM-Based Cloudlets in Mobile Computing," IEEE Pervasive Computing, vol. 8, no. 4, pp. 14-23, Oct. 2009, doi: 10.1109/MPRV.2009.82.
[10] E. Cuervo, A. Balasubramanian, D.K. Cho, A. Wolman, S. Saroiu, R. Chandra, and P. Bahl, "MAUI: Making Smartphones Last Longer with Code Offload," Proc. 8th Annual Int'l Conf. on Mobile Systems, Applications, and Services (MobiSys 2010), pp. 49-62, 2010, doi: 10.1145/1814433.1814441.
[11] S. Malek, G. Edwards, Y. Brun, H. Tajalli, J. Garcia, I. Krka, N. Medvidovic, M. Mikic-Rakic, and G.S. Sukhatme, "An Architecture-driven Software Mobility Framework," Journal of Systems and Software, vol. 83, pp. 972-989, 2010, doi: 10.1016/j.jss.2009.11.003.
[12] S. Han, S. Zhang, Y. Zhang, and C. Fan, "An Adaptable Software Architecture Based on Mobile Components in Pervasive Computing," Proc. 6th International Conf. on Parallel and Distributed Computing, Applications, and Technologies (PDCAT 2005), pp. 309-311, 2005, doi: 10.1109/PDCAT.2005.66.
[13] X. Li, H. Zhang, and Y. Zhang, "Deploying Mobile Computation in Cloud Service," Proc. 1st Int'l Conf. on Cloud Computing (CloudCom 2009), LNCS 5931, pp. 301-311, 2009, doi: 10.1007/978-3-642-10665-1_27.
[14] K. Yang, S. Ou, and H.H. Chen, "On Effective Offloading Services for Resource-Constrained Mobile Devices Running Heavier Mobile Internet Applications," IEEE Communications Magazine, vol. 46, no. 1, pp. 56-63, 2008, doi: 10.1109/MCOM.2008.4427231.
[15] Y. Ye, N. Jain, L. Xia, S. Joshi, I. Yen, F. Bastani, K.L. Cureton, and M.K. Bowler, "A Framework for QoS and Power Management in a Service Cloud Environment with Mobile Devices," Proc. 5th IEEE Int'l Symposium on Service Oriented System Engineering (SOSE 2010), pp. 236-243, 2010, doi: 10.1109/SOSE.2010.53.
[16] S.H. Hung, C.S. Shih, J.P. Shieh, C.P. Lee, and Y.H. Huang, "Executing Mobile Applications on the Cloud: Framework and Issues," Computers and Mathematics with Applications, vol. 63, no. 2, pp. 573-587, Jan. 2012, doi: 10.1016/j.camwa.2011.10.044.
[17] M. Sharifi, S. Kafaie, and O. Kashefi, "A Survey and Taxonomy of Cyber Foraging of Mobile Devices," IEEE Communications Surveys & Tutorials, preprint, 17 Nov. 2011, doi: 10.1109/SURV.2011.111411.00016.
[18] J. Flinn, S. Park, and M. Satyanarayanan, "Balancing Performance, Energy, and Quality in Pervasive Computing," Proc. 22nd Int'l Conf. on Distributed Computing Systems (ICDCS 2002), pp. 217-226, 2002, doi: 10.1109/ICDCS.2002.1022259.
[19] R.K. Balan, D. Gergle, M. Satyanarayanan, and J. Herbsleb, "Simplifying Cyber Foraging for Mobile Devices," Proc. 5th Int'l Conf. on Mobile Systems, Applications, and Services (MobiSys 2007), pp. 272-285, 2007, doi: 10.1145/1247660.1247692.
[20] S. Kalasapur and M. Kumar, "Scavenger: Transparent Development of Efficient Cyber Foraging Applications," Proc. IEEE Int'l Conf. on Pervasive Computing and Communications (PerCom 2010), pp. 217-226, 2010, doi: 10.1109/PERCOM.2010.5466972.
[21] D. O'Brien, OpenShift Reference Guide: A Guide to the Inner Workings of OpenShift, Red Hat, 2012.
[22] Cloud Foundry, http://www.cloudfoundry.com/, accessed October 18, 2012.
[23] F. Gillett, "Future View: The New Tech Ecosystems of Cloud, Cloud Services, and Cloud Computing," Technical Report, Forrester Research, August 2008.
[24] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., Springer, 2009.
[25] S.D. Poisson, Research on the Probability of Judgments in Criminal and Civil Matters, Bachelier, 1837.
