Energy-Saving Adaptive Computing and Traffic Engineering For Real-Time-Service Data Centers

CONFERENCE PAPER JUNE 2015
IEEE ICC 2015 - Workshop on Cloud Computing Systems, Networks, and Applications (CCSNA)
Energy-Saving Adaptive Computing and Traffic Engineering for Real-Time-Service Data Centers
Mohammad Shojafar, Nicola Cordeschi, Danilo Amendola and Enzo Baccarelli
Sapienza University of Rome, Rome, Italy
Email: {Shojafar, Cordeschi, Amendola, Enzo.Baccarelli}@diet.uniroma1.it
Abstract: In this paper, we propose a traffic-engineering-based adaptive approach to dynamically reconfigure the computing-plus-communication resources of networked data centers that support, in real time, the service requirements of mobile clients connected by energy-limited TCP/IP wireless backbones. The goal is to maximize energy efficiency while meeting hard QoS requirements on the delivered transmission rate and processing delay. To cope with the (possibly unpredictable) fluctuations of the offered workload, the proposed optimal cross-layer resource controller is adaptive. It jointly performs: i) the balanced control and dispatching of the admitted workload; ii) the dynamic reconfiguration of the Virtual Machines (VMs) instantiated on the parallel computing platform at the data center; and iii) the rate control of the traffic injected into the wireless backbone to deliver the service to the requesting clients. Our experimental results show that the proposed technique reduces the energy consumption of the servers by 25% on average over the entire data center, compared with state-of-the-art approaches.
Keywords: TCP/IP connections; networked data center; energy efficiency; adaptive resource management.
I. INTRODUCTION AND BACKGROUND
The forecast development of adaptive ubiquitous applications (such as iCloud) for highly parallel wireless (possibly mobile) processing platforms demands a novel design approach that integrates both computing and communication aspects and can effectively cope with the inherently stochastic and time-varying nature of the wireless domain. This approach should be characterized by a tight interaction between two still distinct engineering fields, namely Parallel Computing [1] and Wireless Mobile Communication [2]. From a communication perspective, over 50% of current wireless traffic relies on TCP/IP architectures [3]. One of the most challenging tasks in data center technology is resource and energy management for the applications [4]. Therefore, from an application-centered perspective, the computing platforms hosted on Networked Data Centers (NetDCs) must exchange information with the underlying TCP/IP wireless communication infrastructures in order to provide QoS guarantees to (possibly real-time) computing-intensive multimedia applications over energy-limited, congestion-prone TCP/IP mobile connections. In a nutshell, the goal is to minimize the overall energy consumed by the computing-plus-communication resources of NetDCs.
In this paper, we propose a new approach to reduce the energy consumption induced by the computing, communication and reconfiguration costs of virtualized clouds. Our approach takes dynamic load balancing into account, because we consider the state of each server when the next workload enters the admission control system. For resource management, we resort to online job decomposition and scheduling, in which jobs are scheduled according to a global cost function that simultaneously accounts for the energy saving and the running time in the VMs. The energy-saving management of the computing resources in green Clouds is the specific topic of some quite recent contributions [5], [6], [7], [8]. In particular, [5] proposes a computing architecture for green Clouds that exploits Dynamic Voltage and Frequency Scaling (DVFS) techniques to increase the computing energy efficiency. In [6], the authors present a green Cloud architecture for reducing the computing-induced energy consumption while attempting to meet the performance limits requested by the clients. The numerical results reported in [6] show that the proposed green Cloud architecture can save up to 27% of the computing energy. The current state of the art in exploiting DVFS-based techniques for the green Cloud paradigm is well summarized by the recent contribution in [7], which deals with the energy-aware optimized scheduling of jobs in computing clusters equipped with DVFS-enabled processors. Finally, the authors of [8] use the Lyapunov optimization technique to design an algorithm for joint job admission control, routing, and resource allocation in a virtualized data center.
Overall, although the target computing platforms considered in the aforementioned references are parallel and managed by Virtual Machine Monitors (VMMs), their frameworks differ from the one considered in this paper in three main respects. First, they do not consider the temporary input/output buffering of the arriving tasks for efficiently coping with both workload peaks and networking congestion. Second, the computing architectures considered in [5], [6], [7], [8] provide no QoS guarantees in terms of minimum processing rate and maximum allowed processing delay. Third, [5], [6], [7], [8] do not consider the presence of wireless backbones and do not address the effects induced by client mobility.
Turning to the research area of Wireless Mobile Communication, a first research line focuses on the cross-layer analysis and optimization of TCP/IP traffic control mechanisms for single-antenna and multi-antenna mobile connections [9], [10], [11]. These contributions support the conclusion that optimal control of the energy used for wireless transmission is an effective means to improve the resulting TCP goodput. However, this conclusion is partially offset by the fact that [9], [10], [11] neglect the computing aspects. An analogous conclusion holds for the works in [12] and [13], in which optimized schedulers are derived by exploiting nonlinear optimization and queuing theory. Specifically, the scheduler developed in [12] has no adaptive capabilities, while the scheduler in [13] does not account for the limited energy budget available for transmission over the wireless backbone.
The rest of the paper is organized as follows. Section II details the considered model, the proposed method and the mathematical proofs. Section III presents the simulation results. Finally, Section IV summarizes the main results and outlines future research.
II. MODEL AND SOLVING APPROACH
The goal of this section is twofold. First, we define some important elements of data center problems. Second, we introduce a mathematical optimization problem that captures the main issues of several energy minimization problems, and we solve the resulting nonconvex problem in closed form.
A. Basic Definitions
In this subsection, we describe the building blocks of our approach. We apply competitive analysis [14] to the subproblems of energy- and performance-efficient dynamic load balancing/VM consolidation and of data center energy minimization.
Fig. 1 reports the proposed platform for the parallel real-time processing of workloads. It is composed of multiple physical servers, which host $M$ virtual machines (VMs) that are interconnected by a switched rate-adaptive Virtual LAN (VLAN) and are managed by a central controller. Formally speaking, the physical servers are equipped with multi-frequency CPUs. We model each multi-frequency CPU by a frequency range: the minimum frequency considered for the $i$-th VM is denoted by $f_i^{\min}$, and the maximum available frequency is denoted by $f_i^{\max}$.
The targeted system is an Infrastructure-as-a-Service (IaaS) environment. Each computing node comprises $M$ heterogeneous VMs (or CPUs) that can work in the aforementioned frequency ranges, and it has $M$ independent congestion-free half-duplex channels, the $i$-th of which is powered by $P_i^{net}$, $i \in \{1, \ldots, M\}$. We observe that, in order to limit the implementation cost, current data centers use off-the-shelf rackmount physical servers, which are interconnected by commodity Fast/Giga Ethernet switches. For the sake of clarity, we consider a single physical server (host) and the $M$ VMs allocated to it. In this paper, we consider a discrete-time model in which the slot (i.e., time slot) length matches the timescale at which the data center can adjust its capacity. The workload arrives in each time slot (whose duration is fixed) and is processed immediately (i.e., in real time); no queue is considered for the workload entering or leaving the system.
In the considered model, the wireless channel should be modeled at the level of the Transport layer. In particular, all the preparation and execution of the services takes place in the adaptive load dispatcher at the application level, so the wireless channel (which, in general, can be assumed to be multi-hop) is placed between the Transport-layer interface of the dispatcher and the corresponding Transport-layer interface of the (generally mobile) client. Given the structure of the platform described above, we must formulate the resource allocation problem and the wireless backbone model. In particular, we need to model: i) the energy consumed by computing and VM consolidation, denoted by $E_{CPU}$ (J); ii) the energy consumed for switching the VM frequencies (reconfiguration), denoted by $E_{Reconf}$ (J); iii) the energy consumed by the reconfigurable virtual-link LAN (communication), denoted by $E_{LAN}$ (J); and iv) the energy transmitted over the end-to-end wireless backbone, with its traffic patterns, denoted by $E_W$ (J). To sum up,
$$E_{TOT} \triangleq E_{CPU} + E_{Reconf} + E_{LAN} + E_W \ \text{(J)}. \tag{1}$$
We model these terms and minimize the overall energy consumption of the system, $E_{TOT}$.
Fig. 1: Model of the considered communication-plus-computing technological platform.
According to the described model, at the beginning of each time slot a new job of size $L_{tot}$ (bit) arrives at the input of the scheduler (i.e., the VMM) of Fig. 1. The input job is characterized by: i) the size $L_{tot}$ of the workload to be processed; ii) the maximum tolerated processing delay $T_t$; and iii) the job granularity, that is, the (integer-valued) maximum number $M_T \geq 1$ of independent parallel tasks embedded in the submitted job. In principle, each VM may be modeled as a virtual server that processes at rate $f_i$ [15]. The VMM of Fig. 1 must carry out two main operations at run time, namely virtual machine management and load balancing. Specifically, the goal of virtual machine management is to adaptively control the Virtualization Layer of Fig. 1. In particular, the set of VM attributes
$$\{\Delta, f_i^{\max}, \Phi_i(\eta_i), E_i^{\max}, \ i = 1, \ldots, M\} \tag{2}$$
is dictated by the Virtualization Layer and then passed to the VMM of Fig. 1. Furthermore, due to the real-time nature of the considered application scenario, the time allowed for a VM to fully process each submitted task is fixed in advance at $\Delta$ (s), regardless of the actual size $L_i$ of the task currently assigned to the VM. Also, $E_i^{\max}$ (J) is the per-job maximum energy consumed by VM($i$). By definition, the utilization factor of the VM is $\eta_i \triangleq f_i / f_i^{\max} \in [0, 1]$. Then, as in [7], let $E_i \equiv E_i(f_i)$ (J) be the overall energy consumed by the VM to process a single task of duration $\Delta$ at the processing rate $f_i$, and let $E_i^{\max} \equiv E_i(f_i^{\max})$ (J) be the corresponding maximum energy, consumed when the VM operates at the maximum processing rate $f_i^{\max}$. Hence, by definition, the (dimensionless) ratio
$$\Phi_i(\eta_i) \triangleq \frac{E_i(f_i)}{E_i^{\max}} \equiv \Phi_i\!\left(\frac{f_i}{f_i^{\max}}\right) \tag{3}$$
is the so-called Normalized Energy Consumption (NEC) of the considered VM [15]. From an analytical point of view, $\Phi_i(\eta_i): [0, 1] \to [0, 1]$ is a function of the actual value $\eta_i$ of the utilization factor of the VM. Its analytical behavior depends on the specific features of the resource provisioning policy actually implemented by the VMM of Fig. 1. A quite common expression is the quadratic form of [7], [16], that is, $\Phi_i(\eta_i) = \eta_i^2$. So, the computing cost of the system can be summarized as
$$E_{CPU} \triangleq \sum_{i=1}^{M} \Phi_i(\eta_i)\, E_i^{\max} \equiv \sum_{i=1}^{M} \left(\frac{f_i}{f_i^{\max}}\right)^2 E_i^{\max}. \tag{4}$$
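The computing cost in (4) can be illustrated with a minimal Python sketch, assuming the quadratic NEC. The function name `computing_energy` and the operating frequencies in the example are hypothetical; only $f_i^{\max} = 105$ and $E_i^{\max} = 60$ (J) are taken from the simulation setup of Section III.

```python
def computing_energy(freqs, f_max, e_max):
    """Computing energy of eq. (4), assuming the quadratic NEC (f/f_max)**2.

    freqs, f_max and e_max are per-VM sequences of equal length.
    """
    total = 0.0
    for f, fm, em in zip(freqs, f_max, e_max):
        if not 0.0 <= f <= fm:
            raise ValueError("frequency outside [0, f_max]")
        utilization = f / fm              # utilization factor f_i / f_i^max
        total += utilization ** 2 * em    # quadratic NEC times E_i^max
    return total


# Hypothetical operating point: one VM at half speed, one at full speed.
e_cpu = computing_energy([52.5, 105.0], [105.0, 105.0], [60.0, 60.0])
print(e_cpu)  # 0.25 * 60 + 1.0 * 60 = 75.0
```

Because the NEC is quadratic, halving the frequency of a VM cuts its computing energy to one quarter, which is why spreading a job over several slower VMs can pay off when the deadline allows it.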
It is the task of the VMM to implement a suitable frequency-scaling policy, in order to allow the VMs to scale their processing rates $f_i$ up or down in real time at the minimum cost [17]. In this regard, we note that switching from the processing frequency $f_1$ to the processing frequency $f_2$ entails an energy cost of $\mathcal{E}(f_1; f_2)$ (J). Although the actual behavior of the function $\mathcal{E}(f_1; f_2)$ may depend on the adopted DVFS technique, any practical $\mathcal{E}(f_1; f_2)$ typically retains the following general properties: i) it depends on the absolute frequency gap $|f_1 - f_2|$; ii) it vanishes at $f_1 = f_2$ and is non-decreasing in $|f_1 - f_2|$; and iii) it is jointly convex in $f_1$, $f_2$. A quite common practical model that retains these formal properties is the following:
$$E_{Reconf} \triangleq \sum_{i=1}^{M} \mathcal{E}(f_1; f_2) = \sum_{i=1}^{M} k_e (f_1 - f_2)^2 \ \text{(J)}, \tag{5}$$
where $k_e$ (J/(Hz)$^2$) dictates the per-VM reconfiguration cost induced by a unit-size frequency switching. Typical values of $k_e$ for current reconfigurable virtualized computing platforms are limited to a few hundred $\mu$J per (MHz)$^2$ [16]. For the sake of concreteness, we directly adopt the quadratic model in (5). The generalization to $\mathcal{E}(\cdot; \cdot)$ functions that meet the aforementioned (more general) analytical properties is direct.
For the communication cost, the Shannon-Hartley exponential formula
$$P_i^{net}(R_i) = \zeta_i \left(2^{R_i/W_i} - 1\right) \ \text{(W)}, \tag{6}$$
with $\zeta_i \triangleq N_0^{(i)} W_i / g_i$, $i = 1, \ldots, M$, noise spectral power density $N_0^{(i)}$ (W/Hz), transmission bandwidth $W_i$ (Hz) and (nonnegative) gain $g_i$ of the $i$-th link [18], is an instance of the power-rate functions of practical interest that meet the above assumptions. Since the corresponding one-way transmission delay is $D_i = L_i/R_i$, the one-way communication energy $E_{LAN}(i)$ needed to sustain the $i$-th virtual link of Fig. 1 is $E_{LAN}(i) = P_i^{net}(R_i)\,(L_i/R_i)$, where $R_i$ (bit/s) is the communication rate of the $i$-th virtual link and $L_i$ (bit) is the workload (received job) assigned to VM($i$). The per-job execution time at each physical node depends on the (one-way) delays $\{D_i, i = 1, \ldots, M\}$ introduced by the Virtual LAN (VLAN) and on the allowed per-task processing time $\Delta$. Specifically, since the $M$ virtual connections of Fig. 1 are typically activated in parallel, the overall two-way communication-plus-computing delay induced by the $i$-th connection of Fig. 1 equals $2D_i + \Delta$, so that the hard constraint on the overall per-job execution time reads:
$$\max_{1 \leq i \leq M} \{2 D_i\} + \Delta \leq T_t. \tag{7}$$
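The reconfiguration cost (5), the link power (6), the one-way link energy and the deadline constraint (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, `zeta` is the name used here for the coefficient that multiplies the exponential term in (6), and the units (Mbit, Mbit/s, MHz, W) are chosen so that the ratio rate/bandwidth is in bit/s/Hz.

```python
def reconfiguration_energy(f_new, f_old, k_e):
    """Quadratic switching cost of eq. (5), summed over the VMs."""
    return sum(k_e * (fn - fo) ** 2 for fn, fo in zip(f_new, f_old))


def link_power(rate, zeta, bandwidth):
    """Power-rate function of eq. (6): zeta * (2**(R/W) - 1)."""
    return zeta * (2.0 ** (rate / bandwidth) - 1.0)


def lan_energy(load, rate, zeta, bandwidth):
    """One-way link energy: power times the one-way delay D = L / R."""
    return link_power(rate, zeta, bandwidth) * (load / rate)


def meets_deadline(loads, rates, delta, t_t):
    """Hard constraint of eq. (7): max_i(2 * L_i / R_i) + delta <= T_t."""
    worst_round_trip = max(2.0 * load / rate for load, rate in zip(loads, rates))
    return worst_round_trip + delta <= t_t


# Hypothetical single link: 4 Mbit at 2 Mbit/s over 1 MHz, zeta = 0.5 mW.
print(lan_energy(4.0, 2.0, 0.5e-3, 1.0))         # about 3e-3 J
print(meets_deadline([4.0], [2.0], 0.1, 5.0))    # True: 4.0 + 0.1 <= 5
print(meets_deadline([4.0], [1.0], 0.1, 5.0))    # False: 8.0 + 0.1 > 5
```

The example shows the trade-off that drives the optimization: a higher rate shortens the delay and helps meet (7), but the power in (6) grows exponentially with the rate.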
The wireless connection of Fig. 1 is understood to model all the layers of the protocol stack up to the Transport layer. At the physical level, we consider a (typically multi-hop) connection affected by multiple-access interference, noise and fading (the latter assumed constant over a time slot, i.e., a block-faded physical channel operating in the steady state). The resulting state $\sigma(t) \in \mathbb{R}_0^+$ of the overall end-to-end connection at the $t$-th slot is modeled as a nonnegative real random variable. We assume that the state $\sigma(t)$ is known to the controller at the beginning of slot $t$ (this is necessary because the controller has to manage a per-slot resource allocation). Regarding the goodput offered by the wireless connection of Fig. 1 and the energy cost of the transmission, the r.v. $r(t)$ depends on both $E_W(t)$ and $\sigma(t)$ through the rate function $R_W(\cdot; \cdot)$, which measures the instantaneous goodput that the wireless connection is able to offer. In particular, we can write:
$$r(t) \triangleq R_W\left(E_W(t), \sigma(t)\right), \ t \geq 0 \ \text{(byte/slot)}, \tag{8}$$
where $R_W(\cdot; \cdot)$ in (8) is a nonnegative time-invariant function whose arguments ($E_W(t)$ and $\sigma(t)$) are also nonnegative. It depends on multiple factors, including: i) the performance of the modulation and coding adopted at the physical layer; ii) the statistical characteristics of the fading and interference that affect the wireless channel; iii) losses at the MAC layer; iv) the statistical characteristics of the delays introduced by the Network layer; and v) the characteristics of the client, in terms of speed of movement.
It remains to analyze in more detail the goodput that the end-to-end TCP/IP wireless connection is able to offer. We now state the assumptions behind the analysis of the steady-state goodput of a mobile TCP/IP connection with Rayleigh fading. The Network layer of the connection is assumed to be of the unreliable best-effort IP type. This assumption implies that the Network layer introduces packet losses and uncertain time-varying delays [9]. At the Transport layer, we adopt TCP Reno with Congestion Avoidance [9]. In this model, the Transport layer uses Triple Duplicate Acknowledgment (TDACK) to notify packet losses. In the rest of the analysis we assume the following conditions: (1) the underlying physical channel is simultaneously subject to two types of fading, Rayleigh-distributed and log-normally distributed, which we consider flat in the frequency domain and constant over (at least) one time slot; (2) the feedback channel that carries the ACK messages from the client is reliable and delay-free. In addition, when the buffer of the MAC layer is saturated, new incoming frames are directly discarded and definitively considered lost. Finally, if the MAC layer of the client receives a frame incorrectly, the encapsulated segment is irreversibly declared lost and a TDACK message is sent back to the controller; (3) in order to limit the end-to-end transport delay, we do not implement any kind of fragmentation. The (generally multi-hop) IP-based link arising at the Network layer is therefore characterized by packet losses and random delays.
According to the analysis presented in [9], we can model the sequence $\{\Delta_{IP}(t) \in \mathbb{R}_0^+, t \geq 1\}$ of the packet delays (in multiples of the slot period) as an i.i.d. random sequence. Each random variable $\Delta_{IP}$ is uniformly distributed over the interval $[0, \Delta_{IP}^{\max}]$, where $\Delta_{IP}^{\max}$ (measured in multiples of the slot period) is the known maximum packet delay introduced by the IP layer. Under the usual assumption that the segment loss rate $P_L(t)$ at the input of the Transport layer is limited to $10^{-2}$, accounting for Rayleigh-distributed fading gives
$$P_L(t) \approx \left[C + \frac{A}{C B^2}\,\Gamma(1; CB)\right] \frac{z(t)}{E_W(t)}, \ t \geq 1, \tag{9}$$
where $\Gamma(\cdot; \cdot)$ is the incomplete Gamma function, the positive constants $A$, $B$ and $C$ completely describe the error performance of the FEC system in Fig. 1 [10], $E_W(t)$ is the energy needed for transmitting over the wireless channel of Fig. 1 at slot $t$, and $z(t)$ accounts for mobility. The latter is modeled as a time-correlated log-distributed sequence $\{z(t) \in \mathbb{R}_0^+, t \geq 1\}$, as in [19]: $z(t) \triangleq a_0 10^{0.1 x(t)}$, $t \geq 1$, where $a_0 \approx 0.9738$ ensures $E\{z(t)\} = 1$ (J), and $\{x(t), t \geq 1\}$ is a time-correlated, stationary, zero-mean and unit-variance Markov random sequence whose probability density function is uniform over the interval $[-\sqrt{3}, \sqrt{3}]$ [19]. As a result, the goodput $R_W(t)$ (byte/slot) is given by the following formula:
$$R_W(t) = \left(\frac{3}{2b}\right)^{1/2} \left[\frac{MSS}{RTT(t)\,(P_L(t))^{1/2}}\right], \ t \geq 1, \tag{10}$$
where $b = 2$, $MSS$ (byte) (Maximum Segment Size) is the maximum permitted size of a segment, and $RTT(t)$ is the average Round Trip Time (measured in multiples of the slot period). $RTT(t)$ is calculated iteratively using the following formula:
$$RTT(t) = 0.75\,RTT(t-1) + 0.25\,\Delta_{IP}(t), \ t \geq 1, \quad RTT(0) = 0. \tag{11}$$
Finally, by making the appropriate substitutions (e.g., inserting (9) and (10) into (8)), we obtain the following analytical expression for the instantaneous goodput that the wireless TCP/IP connection can provide in the steady state:
$$r(t) = \sigma(t)\,(E_W(t))^{1/2} \ \text{(byte/slot)}, \tag{12}$$
where the state of the connection at slot $t$ is
$$\sigma(t) \triangleq \frac{K_0\,(z(t))^{-1/2}}{RTT(t)}, \ t \geq 1, \tag{13.1}$$
while the positive constant
$$K_0 \triangleq \left(\frac{3}{2b}\right)^{1/2} \frac{MSS}{\left[C + \frac{A}{C B^2}\,\Gamma(1; CB)\right]^{1/2}} \ \text{(byte)} \tag{13.2}$$
describes the performance of the FEC-based error-recovery system implemented at the Physical layer of Fig. 1. To find the optimum energy consumption of the end-to-end wireless backbone, it suffices to obtain the inverse $r(\cdot)^{-1}$ and minimize $E_W(t)$ with respect to $r(t)$. In the steady state, the average workload entering the system is strictly correlated with the average goodput of the channel. Therefore, the average goodput is known, and it is computed from the average workload entering the system in each time slot. As a consequence, $E_W^*(t)$, the optimum energy consumption of the end-to-end wireless backbone, is quadratic, has a closed form in the feasibility region, and can be easily obtained. To this end, we bound $r(t) \in [r_{\min}, r_{\max}]$ (byte/slot) and, taking into account the stability of the network, we look for a goodput such that the energy consumed in the network in each time slot $t$ equals the average energy available for transmission over the wireless/wired network, denoted by $E_{ave}$ (J). It is therefore important to meet this energy budget for the transmission, besides finding $r^*(t)$. To find the optimum transmission energy $E_W^*(t)$, we use the stochastic gradient projection algorithm [20].
B. Optimization Problem
The proposed scheduling algorithm, which determines the $3M$ parameters $\{f_i, L_i, R_i, i = 1, \ldots, M\}$, is completely independent of the size $L_{tot}$ of the arrived workload and of the total capacity $R_t$ of the local LAN.
$$\min_{\{R_i, f_i, L_i\}} \sum_{i=1}^{M} \left\{ \left(\frac{f_i}{f_i^{\max}}\right)^2 E_i^{\max} + k_e \left(f_i - f_i^0\right)^2 + 2 P_i^{net}(R_i)\,\frac{L_i}{R_i} \right\}, \tag{14.1}$$
$$\text{s.t.:} \quad L_i \leq \Delta f_i, \ i = 1, \ldots, M, \tag{14.2}$$
$$\sum_{i=1}^{M} L_i = L_{tot}, \tag{14.3}$$
$$0 \leq f_i \leq f_i^{\max}, \ i = 1, \ldots, M, \tag{14.4}$$
$$L_i \geq 0, \ i = 1, \ldots, M, \tag{14.5}$$
$$\frac{2 L_i}{R_i} + \Delta \leq T_t, \ i = 1, \ldots, M, \tag{14.6}$$
$$\sum_{i=1}^{M} R_i \leq R_t, \tag{14.7}$$
$$R_i \geq 0, \ i = 1, \ldots, M. \tag{14.8}$$
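To make problem (14) concrete, the following Python sketch evaluates the objective (14.1) and checks the constraints (14.2)-(14.8) for a candidate allocation. It is not the solver proposed in this paper, only a helper for inspecting candidate solutions. The function names, the numerical tolerance `tol`, the name `zeta` for the coefficient of the power-rate function (6) and the example allocation are assumptions; loads are in Mbit, rates in Mbit/s and bandwidths in MHz.

```python
def total_energy(f, f0, L, R, f_max, e_max, k_e, zeta, W):
    """Objective (14.1): computing + reconfiguration + two-way link energy."""
    total = 0.0
    for i in range(len(f)):
        computing = (f[i] / f_max[i]) ** 2 * e_max[i]
        reconfiguration = k_e * (f[i] - f0[i]) ** 2
        communication = 0.0
        if L[i] > 0.0 and R[i] > 0.0:
            power = zeta[i] * (2.0 ** (R[i] / W[i]) - 1.0)   # eq. (6)
            communication = 2.0 * power * L[i] / R[i]
        total += computing + reconfiguration + communication
    return total


def is_feasible(f, L, R, f_max, L_tot, delta, T_t, R_t, tol=1e-9):
    """Check constraints (14.2)-(14.8) up to a small tolerance."""
    if abs(sum(L) - L_tot) > tol:                 # (14.3)
        return False
    if sum(R) > R_t + tol:                        # (14.7)
        return False
    for f_i, L_i, R_i, fm_i in zip(f, L, R, f_max):
        if not (-tol <= f_i <= fm_i + tol):       # (14.4)
            return False
        if L_i < -tol or R_i < -tol:              # (14.5), (14.8)
            return False
        if L_i > delta * f_i + tol:               # (14.2)
            return False
        # (14.6); an idle link (L_i = 0) sends nothing, so only active links are checked.
        if L_i > tol and (R_i <= 0.0 or 2.0 * L_i / R_i + delta > T_t + tol):
            return False
    return True


# Hypothetical allocation: 8 Mbit split evenly over two VMs.
f, f0, L, R = [40.0, 40.0], [40.0, 40.0], [4.0, 4.0], [2.0, 2.0]
print(is_feasible(f, L, R, [105.0, 105.0], 8.0, 0.1, 5.0, 100.0))
print(total_energy(f, f0, L, R, [105.0, 105.0], [60.0, 60.0], 0.005,
                   [0.5e-3, 0.5e-3], [1.0, 1.0]))
```

In this example the frequencies equal their previous values, so the reconfiguration term vanishes and the total is dominated by the computing energy.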
In the stated problem, the first two terms in the summation in (14.1) account for the computing-plus-reconfiguration energy $E_c(i)$ consumed by VM($i$), while the third term in (14.1) is the communication energy $E^{net}(i)$ (or $E_{LAN}(i)$) requested by the corresponding point-to-point virtual link for conveying $L_i$ bits at the transmission rate $R_i$ (bit/s). Furthermore, $f_i^0$ and $f_i$ in (14.1) represent the current (i.e., already computed and consolidated) computing rate and the target one, respectively. Formally speaking, $f_i$ is the variable to be optimized, while $f_i^0$ describes the current state of VM($i$) and therefore plays the role of a known constant. Hence, $k_e (f_i - f_i^0)^2$ in (14.1) accounts for the resulting switching cost. The constraint in (14.2) guarantees that VM($i$) executes the assigned task within $\Delta$ seconds, while the (global) constraint in (14.3) ensures that the overall job is partitioned into $M$ parallel tasks. According to (7), the set of constraints in (14.6) forces the platform of Fig. 1 to process the overall job within the assigned hard deadline $T_t$. Finally, the global constraint in (14.7) limits the aggregate transmission rate sustainable by the underlying VLAN of Fig. 1 to $R_t$ (bit/s), so that $R_t$ is directly dictated by the actually considered VLAN standard [6]. The first and second terms of the objective function in (14.1) are convex and non-decreasing. The third term is nonconvex, but replacing $L_i/R_i$ and $R_i$ by $(T_t - \Delta)/2$ and $2L_i/(T_t - \Delta)$, respectively (that is, taking (14.6) with equality), makes the communication cost convex in $f_i$ and $L_i$. As a result, it simplifies to $(T_t - \Delta) \sum_{i=1}^{M} \zeta_i \left(2^{2L_i/((T_t - \Delta) W_i)} - 1\right)$.
We propose an iterative method that computes the optimum $L_i$ and $f_i$ of each VM($i$) for each incoming workload. The iterative method is flexible and reliable across multiple scenarios. After some iterations, it reaches the optimal (in the energy-saving sense) solution of the considered problem, which consists of the set of optimal parameters $\{\nu^*; L_i^*, \lambda_i^*, f_i^*, i = 1, \ldots, M\}$. We iterate the method $n$ times (i.e., $n$ is the loop counter of the search for the optimum workload and frequency of each VM). To reproduce the approach, the following initializations are needed: i) the iteration index satisfies $n \geq 1$; ii) $\beta$ and $\gamma$ are positive constants; iii) the VM index is $i = 1, \ldots, M$; and iv) $\nu^{(0)} = 0$, $\alpha^{(0)} = \beta$, $\lambda_i^{(0)} = 0$ and $L_i^{(0)} = 0$ for $i = 1, \ldots, M$, and $V^{(0)} = 0$.
$$\nu^{(n)} = \left[\nu^{(n-1)} - \alpha^{(n-1)} \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right)\right], \tag{15.1}$$
$$y_i^{(n)} = \left[\frac{(T_t - \Delta)\, W_i}{2} \log_2\!\left(\frac{\nu^{(n)}}{TH(i)}\right)\right]_+, \tag{15.2}$$
$$TH(i) = \frac{2 W_i \ln(2)\, N_0^{(i)}}{g_i}, \tag{15.3}$$
$$\lambda_i^{(n)} = \left[\lambda_i^{(n-1)} + \alpha_{NEW}^{(n-1)} \left(y_i^{(n)} - \Delta f_i^{(n-1)}\right)\right]_+, \tag{15.4}$$
$$f_i^{(n)} = \left[\frac{2 k_e f_i^{(0)} + \Delta \lambda_i^{(n)}}{2 k_e + 2 \dfrac{E_i^{\max}}{(f_i^{\max})^2}}\right]_0^{f_i^{\max}}, \tag{15.5}$$
$$L_i^{(n)} = \min\left\{\Delta f_i^{(n)};\ y_i^{(n)}\right\}, \tag{15.6}$$
$$\alpha^{(n)} = \max\left\{0;\ \min\left\{\beta;\ \alpha^{(n-1)} - \gamma V^{(n-1)} \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right)\right\}\right\}, \tag{15.7}$$
$$\alpha_{NEW}^{(n)} = \max\left\{0;\ \min\left\{\beta_{NEW};\ \alpha_{NEW}^{(n-1)} - \gamma V_{NEW}^{(n-1)} \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right)\right\}\right\}, \tag{15.8}$$
$$V^{(n)} = \left(1 - \alpha^{(n-1)}\right) V^{(n-1)} - \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right), \tag{15.9}$$
$$V_{NEW}^{(n)} = \left(1 - \alpha_{NEW}^{(n-1)}\right) V_{NEW}^{(n-1)} - \left(\sum_{i=1}^{M} L_i^{(n-1)} - L_{tot}\right). \tag{15.10}$$
In (15), we propose two different formulas for the variables $\alpha$ and $V$, which represent the adaptation step and the intermediate parameter for updating the step sizes, so that separate step sizes are used in the calculation of the two dual variables $\nu$ and $\lambda_i$. This choice is justified by the fact that, in optimization problems in general, each variable should have its own step size. In practice, when $\Delta$ approaches $T_t$, the values of $\alpha$ move far from those of $\alpha_{NEW}$ and $V_{NEW}$. Specifically, $\lambda_i$ and $\alpha_{NEW}$ become much larger simultaneously. To conclude the description of the implemented optimization algorithm, we note that, in addition to the sequence of steps just described, our software is composed of three basic parts. The sum of the three costs, together with the wireless backbone cost, gives the total cost. This value is essential to evaluate the performance of our scheduler in terms of energy savings and of its real utility in moving toward the green paradigm. In fact, when we perform the simulations, it is through the evaluation of the total cost that we can appreciate the differences, in terms of energy saving, between one case study and another. Note that, in the following, the index $i$ denotes the $i$-th VM and $n$ denotes the iteration index. In detail, we check three convergence conditions after each iteration of the loop. Formally speaking, at the end of each iterative cycle we check some conditions that determine whether the optimization process should continue or can stop because the solution is good enough. The conditions are processed one after the other, and at each iteration they verify that the solution found, $\{\nu^*; L_i^*, \lambda_i^*, f_i^*, i = 1, \ldots, M\}$, is optimal and admissible. The three conditions are listed below in the order in which they appear inside the iterative algorithm: i) $\left|\left(\sum_{i=1}^{M} L_i\right) - L_{tot}\right| / L_{tot} \leq \epsilon_a$ ensures that the total load ($L_{tot}$ is the total workload size) has been fully allocated with sufficient accuracy (set by the small tolerance $\epsilon_a$); ii) $[L_i \leq \Delta f_i]$ verifies that the working frequency selected for the generic VM is sufficient to process the assigned load $L_i$ within the time limit; iii) the so-called complementary condition verifies that $[(|\lambda_i| \leq \epsilon_b) \text{ or } (|L_i - \Delta f_i| \leq \epsilon_c)]$. If all three conditions are met, the workload scheduler has converged to the optimal resource allocation, i.e., it has minimized the computing-plus-communication energy consumption of the proposed model.
III. PERFORMANCE EVALUATION AND NUMERICAL TEST
This section presents the simulated performance of the proposed scheduler under a synthetic workload and compares it with the no-DVFS technique of [7] and with the well-known Lyapunov-based method of [8], which covers the CPU and reconfiguration costs. The simulations were carried out in MATLAB under Microsoft Windows 8 x64 on an Intel Core i7. We evaluate the average per-job communication-plus-computing energy $\overline{E}_{tot}$ consumed by the system managed by our optimum scheduler. The general scenario considered in this paper is as follows: DVFS frequencies of each VM $f_i \in \{0, 5, 50, 70, 90, f_i^{\max}\}$, $T_t = 5$ (s), $R_t = 100$ (Mb/s), $MSS = 120$ (byte), $k_e \in \{0.005, 0.05\}$ (J/(MHz)$^2$), $f_i^{\max} = 105$ (Mbit/s), $E_i^{\max} = 60$ (J), $\Delta = 0.1$ (s), $W_i = 1$ (MHz), $E_{ave} = 8$ (J), $r_{\max} = 4 r_{\min} = 2000$ (byte/slot), $\#itr_{\max} = 10^4$, $\beta = 0.5$, $\epsilon_a = \epsilon_b = \epsilon_c = 0.01$ and 2000 time slots.
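With the tolerances listed above (all three equal to 0.01), the three convergence conditions of Section II-B can be checked as in the following Python sketch. The function name `has_converged` and the argument names are hypothetical; `lam` holds the per-VM multipliers that appear in the complementary condition, and the example values are illustrative only.

```python
def has_converged(L, f, lam, L_tot, delta, eps_a=0.01, eps_b=0.01, eps_c=0.01):
    """Return True when the three stopping conditions of Section II-B hold."""
    # i) the total load is allocated within the relative tolerance eps_a
    if abs(sum(L) - L_tot) / L_tot > eps_a:
        return False
    for L_i, f_i, lam_i in zip(L, f, lam):
        # ii) the selected frequency processes L_i within delta seconds
        if L_i > delta * f_i:
            return False
        # iii) complementary condition: the multiplier is (almost) zero
        #      or the processing constraint is (almost) tight
        if not (abs(lam_i) <= eps_b or abs(L_i - delta * f_i) <= eps_c):
            return False
    return True


# Hypothetical iterate: 8 Mbit split over two VMs, delta = 0.1 s.
print(has_converged([4.0, 4.0], [50.0, 45.0], [0.0, 0.0], 8.0, 0.1))  # True
print(has_converged([4.0, 4.0], [50.0, 45.0], [0.5, 0.0], 8.0, 0.1))  # False
```

In an iterative loop, such a check would be evaluated after each update and combined with the cap on the number of iterations given in the parameter list.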
We test the performance of the scheduler paying particular attention to the cost of reconfiguration when the workload fluctuates over time. Specifically, the workload sequence $\{L_{tot}(m T_t), m = 0, 1, \ldots\}$ is characterized by an inter-arrival period $T_t$ and by a size that is a uniformly distributed random variable over $[\overline{L}_{tot} - a, \overline{L}_{tot} + a]$, with $\overline{L}_{tot} = 8$ (Mbit) and $a = 2$ (Mbit), that is, a peak-to-mean ratio (PMR) of 1.25. $f_i^0$ represents the initial state of the VMs in terms of frequency (each VM has a working frequency). In general, for every workload after the first one has been served, $f_i^0$ equals the optimal frequency $\{f_i^*\}$ computed for the previous task.
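The synthetic workload described above can be generated with a short Python sketch. The function names and the fixed seed are assumptions made for reproducibility of the illustration; the mean of 8 Mbit, the half-width of 2 Mbit and the 2000 cycles come from the simulation setup.

```python
import random


def workload_sequence(n_slots, mean_load=8.0, half_width=2.0, seed=0):
    """Per-slot job sizes (Mbit), uniform over [mean - a, mean + a]."""
    rng = random.Random(seed)
    low, high = mean_load - half_width, mean_load + half_width
    return [rng.uniform(low, high) for _ in range(n_slots)]


def peak_to_mean_ratio(mean_load, half_width):
    """PMR of the uniform workload: peak (mean + a) divided by the mean."""
    return (mean_load + half_width) / mean_load


loads = workload_sequence(2000)
print(len(loads))                      # 2000
print(peak_to_mean_ratio(8.0, 2.0))    # 1.25
```

Each generated size would then be fed to the scheduler as the total job size of one slot, with the frequencies found for the previous slot used as the starting state.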
[Fig. 2 panels, each plotted against #iterations: (a) dual variable; (b) dual variable; (c) $E_W$ (J) for each $t$.]
Fig. 2: Example of achieved convergence to the optimal values of the two dual variables in the scenario $M = \{2, 100\}$, $\Delta = \{0.1, 4.4\}$ (s) and $k_e = \{0.005\}$ (J/(MHz)$^2$).
Furthermore, we evaluated the average (per-job) energy consumed by the system as the number of available VMs and the channel parameters $\{\zeta_i\}$ vary. In particular, in addition to the two constant values, we carried out the simulation with $\{\zeta_i\}$ increasing with the VM index. This describes the following scenario: every time we allocate a new VM, it has a higher channel cost, for instance because it is physically placed in a server farther away. We can thus see how the mean per-job communication-plus-computing energy $\overline{E}_{TOT}$ changes as we vary the parameters $k_e$ and $\zeta_i$, which represent the reconfiguration cost of the VMs and the channel variations, respectively. The parameters varied in this simulation are $M = 2, \ldots, 15$, $k_e = 0.5$ and $0.005$ (J/(MHz)$^2$) and, for the channel: i) [HMC] VMs with HoMogeneous Channels, with $\zeta_i = 0.5$ (mW); and ii) [HTC] VMs with HeTerogeneous Channels, with $\zeta_i = [0.5 + 0.05(i - 1)]$ and $\zeta_i = [0.5 + 0.1(i - 1)]$ (mW), (i.e., the program execute
140
= 0.1 (s), M = 15 (i.i.d runs), HMC=HeMogenous Channel, HTC=HeTerogeneous Channel

[HMC]ke=0.005(J/(MHz)2), i=0.5 (mW) i=1,,M
[HMC]ke=0.05(J/(MHz)2), i=0.5 (mW) i=1,,M
120
E T OT (Joule)
confer to our simulations, a certain reliability from a statistical

point of view will be used 2000 workloads cycles. The first
simulation demonstrates the convergence rates for the proposed
scheduler facing with various parameters fluctuations. We
can see these results in Figs. 2(a), 2(b) and 2(c). Figs. 2
demonstrate the internal iterative loop convergence for the
mentioned parameters in (15). Specifically, Fig. 2(a) concludes
that: i) the proposed scheduler is able to converge even with
low VMs (i.e., for M = 2 which is hard to gain) and
high computing time near to the maximum tolerated delay Tt
(i.e., = 4.4(s)), and ii) when the number of application
increases dramatically the convergence for the is reached
faster. Fig. 2(b) indicates that while M increases not only
decreases but also the convergence for finding suitable in
the iterative method reach faster, it means that, the scheduler
is able to balance load easier with more handy resources. Fig.
2(c) demonstrates that the iterative gradient method is able
to find proper EW for each incoming workload transmitted
into the end-t-end backbone channel in each time slot in some
iterations. The second simulation presents the goodput and
RTT of the the TCP connection in the case where we vary the
value of the energy available for transmission EW (t). About
code modulation of the system, we used QPSK with coding
parameters (A = 90.2514, B = 3.4998, C = 1.0942, rate=
1.5). The available energy for wireless transmission EW (t) is
modeled as a random variable with a unit mean value and
variance. We performed simulations for three different values
of E W in the case of unit variance (E2W = 1). Here, for
average energy 8 (J) for the wireless backbone channel our
average goodput is approximately 33.83 bit per-slot.
[HTC]ke=0.005(J/(MHz)2), i=0.5+0.05(i1)(mW) i=1,,M
100
[HTC]k =0.05(J/(MHz)2), =0.5+0.1(i1) (mW) i=1,,M

e
80
60
40
20
10
1
10
11
12
13
14
15
VMs
Fig. 3: E tot -vs.-M with different ke for homogeneous and

heterogeneous channels.
15 times for group of VMs independently).
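The simulation inputs described in this section (the uniformly distributed workload trace and the homogeneous and heterogeneous channel-cost profiles) can be sketched as follows. This is only an illustrative sketch, not the authors' simulator: the function names, the fixed random seed, and the use of Python's standard `random` module are assumptions introduced here; the numerical values (8 Mbit mean, 2 Mbit spread, 2000 cycles, 0.5 mW base cost, steps of 0.05 and 0.1) are the ones quoted in the text.

```python
import random

L_MEAN_MBIT = 8.0      # mean workload size (Mbit), from the text
SPREAD_MBIT = 2.0      # half-width a of the uniform interval (Mbit)
NUM_WORKLOADS = 2000   # workload cycles used in the simulations
BASE_COST_MW = 0.5     # channel cost of the first VM (mW)


def peak_to_mean_ratio(mean, spread):
    """PMR of a uniform variable on [mean - spread, mean + spread]."""
    return (mean + spread) / mean


def generate_workloads(n, mean, spread, seed=0):
    """Draw n i.i.d. workload sizes (Mbit), one per inter-arrival period."""
    rng = random.Random(seed)  # fixed seed: illustrative choice only
    return [rng.uniform(mean - spread, mean + spread) for _ in range(n)]


def channel_costs(num_vms, step=0.0, base=BASE_COST_MW):
    """Per-VM channel costs (mW): base + step * (i - 1) for i = 1..num_vms.

    step = 0.0 gives the homogeneous [HMC] profile; step = 0.05 or 0.1
    gives the heterogeneous [HTC] profiles quoted in the text.
    """
    return [base + step * (i - 1) for i in range(1, num_vms + 1)]


workloads = generate_workloads(NUM_WORKLOADS, L_MEAN_MBIT, SPREAD_MBIT)
pmr = peak_to_mean_ratio(L_MEAN_MBIT, SPREAD_MBIT)
hmc = channel_costs(15)            # [HMC]: same cost for every VM
htc = channel_costs(15, step=0.1)  # [HTC]: cost grows with the VM index
```

With these values the peak workload is 10 Mbit against a mean of 8 Mbit, which reproduces the stated PMR of 1.25, and the last of 15 VMs in the steeper [HTC] profile has a channel cost of about 1.9 mW.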

Based on the synthetic workload traces, Fig. 3 compares the HTC plots with the corresponding HMC plots at the same $k_e$ and confirms that increasing the number of VMs reduces the energy, by an amount that ranges from 4% (case of $k_e$ = 0.005, the two lower plots) to 8.5% (case of $k_e$ = 0.05, the two upper plots). These results are consistent with the expectation in [5] that noticeable energy savings may be attained by jointly changing the available computing-plus-communication resources. In the two uppermost plots of Fig. 3, when the number of VMs is 2, $E_{tot}$ increases suddenly because of the higher reconfiguration cost; as M increases, however, the scheduler manages the energy and reduces its components according to (14) and (15).
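To see why a larger $k_e$ raises the reconfiguration share of $E_{tot}$, the following minimal sketch assumes a quadratic per-VM cost $k_e (f_i - f_{i0})^2$. This form is an assumption suggested by the units of $k_e$ (J/(MHz)$^2$) and is not quoted from this section; the function name and the example frequencies are hypothetical.

```python
def reconfiguration_energy(k_e, f_new, f_init):
    """Total reconfiguration energy (J) under the ASSUMED quadratic model.

    k_e    : per-VM reconfiguration cost, in J/(MHz)^2
    f_new  : target frequency of each VM (MHz)
    f_init : frequency of each VM before reconfiguration (MHz)
    """
    if len(f_new) != len(f_init):
        raise ValueError("f_new and f_init must have one entry per VM")
    return sum(k_e * (f - f0) ** 2 for f, f0 in zip(f_new, f_init))


# Hypothetical frequencies for M = 2 VMs (MHz); not taken from the paper.
f_init = [0.0, 0.0]
f_new = [10.0, 10.0]

low = reconfiguration_energy(0.005, f_new, f_init)   # k_e = 0.005
high = reconfiguration_energy(0.05, f_new, f_init)   # k_e = 0.05
```

Under this assumption the cost is linear in $k_e$, so for the same frequency change the $k_e$ = 0.05 setting pays ten times the reconfiguration energy of the $k_e$ = 0.005 setting.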
In the last simulations, in order to evaluate the energy reduction due to scaling the computing and reconfiguration rates up/down as the number of VMs increases (i.e., we process the results of a single run over up to 100 VMs), we have also implemented two well-known recent schedulers [7], [8] under the aforementioned general scenario; the results are presented in Figs. 4 and 5. According to Fig. 4, the average energy savings of the proposed method (i.e., the green continuous plot) compared with the Lyapunov-based method (i.e., the thick light-blue continuous plot) and the no-DVFS method (i.e., the yellow-blue continuous plot with 4 points) are about 60% and 25%, respectively. This confirms that, as the number of VMs increases, the proposed method adapts to the incoming workload faster than the no-DVFS method, which concentrates on the optimum frequency in each time slot. Also, Fig. 5 shows that the
average reconfiguration cost of the proposed method differs negligibly from that of the no-DVFS method [7], but is higher than that of the Lyapunov-based method [8], which is approximately 1000 times lower than ours. Nevertheless, looking at Figs. 4 and 5 together, it is clear that this difference cannot fill the gap in the computing part: even with a lower switching cost, [8] has a much higher computing cost than the proposed method.

[Figure 4: $E_{CPU}$ (J) versus number of VMs (10 to 100), total and per VM, for the proposed DVFS method, the method in [8], and the no-DVFS method in [7]. Settings: $T_t$ = 5 (s), itr_max = $10^4$, #WL = 2000, $f_{max}$ = 105 (Mbit/s), V = 100, $E_{max}$ = 60 (J), $P_{min}$ = 10 (W).]
Fig. 4: $E_{CPU}$ (J) for the proposed method, no-DVFS method in [7], and Lyapunov method in [8].

[Figure 5: $E_{Reconf}$ (J) versus number of VMs (10 to 100), total and per VM, for the proposed method, the method in [7], and the method in [8]. Settings as in Fig. 4.]
Fig. 5: $E_{Reconf}$ (J) for the proposed method (i.e., using DVFS)-vs.-no-DVFS method in [7]-vs.-Lyapunov method in [8].

IV. CONCLUSION
In this paper, we developed an iterative model for the joint control of the admitted workload, the delivered throughput, and the resource reconfiguration of computing-plus-communication platforms equipped with wireless Internet-based connections. The overall goal is the energy-saving support of QoS-demanding, computing-intensive, delay-sensitive services that utilize TCP/IP wireless connections for delivering remotely processed workload to clients. Its implementation indicates that the resulting complexity fully scales with the number of available VMs and that it minimizes the energy consumed by the overall platform for computing, communication and transmission over the wireless connection; moreover, it is capable of providing hard QoS guarantees in terms of minimum delivered instantaneous goodput. The energy-efficient adaptive management of the delay-vs.-throughput trade-off of WAN TCP/IP mobile connections is an additional topic for further research.

REFERENCES
[1] Z. Sanaei, S. Abolfazli, A. Gani, and R. Buyya, "Heterogeneity in mobile cloud computing: taxonomy and open challenges," Communications Surveys & Tutorials, IEEE, vol. 16, no. 1, pp. 369-392.
[2] B. Hayes, "Cloud computing," Commun. ACM, vol. 51, no. 7, pp. 9-11, Jul. 2008.
[3] S. Jin, L. Guo, I. Matta, and A. Bestavros, "A spectrum of TCP-friendly window-based congestion control algorithms," IEEE/ACM Transactions on Networking (TON), vol. 11, no. 3, pp. 341-355, 2003.
[4] G. Aceto, A. Botta, W. De Donato, and A. Pescapè, "Cloud monitoring: A survey," Computer Networks, vol. 57, no. 9, pp. 2093-2115, 2013.
[5] R. Buyya, A. Beloglazov, and J. Abawajy, "Energy-efficient management of data center resources for cloud computing: A vision, architectural elements, and open challenges," arXiv preprint arXiv:1006.0308, 2010.
[6] L. Liu, H. Wang, X. Liu, X. Jin, W. B. He, Q. B. Wang, and Y. Chen, "Greencloud: a new architecture for green data center," in Proceedings of the 6th international conference industry session on Autonomic computing and communications industry session. ACM, 2009, pp. 29-38.
[7] N. Cordeschi, M. Shojafar, and E. Baccarelli, "Energy-saving self-configuring networked data centers," Computer Networks, vol. 57, no. 17, pp. 3479-3491, 2013.
[8] R. Urgaonkar, U. C. Kozat, K. Igarashi, and M. J. Neely, "Dynamic resource allocation and power management in virtualized data centers," in Network Operations and Management Symposium (NOMS), 2010 IEEE. IEEE, 2010, pp. 479-486.
[9] E. Baccarelli and M. Biagi, "Optimized power allocation and signal shaping for interference-limited multi-antenna ad hoc networks," in Personal Wireless Communications. Springer, 2003, pp. 138-152.
[10] Q. Liu, S. Zhou, and G. B. Giannakis, "Cross-layer combining of adaptive modulation and coding with truncated ARQ over wireless links," Wireless Communications, IEEE Transactions on, vol. 3, no. 5, pp. 1746-1755, 2004.
[11] E. Baccarelli and M. Biagi, "Error resistant space-time coding for emerging 4G-WLANs," in Wireless Communications and Networking, 2003. WCNC 2003. 2003 IEEE, vol. 1. IEEE, 2003, pp. 72-77.
[12] D. Mitra and Q. Wang, "Stochastic traffic engineering for demand uncertainty and risk-aware network revenue management," IEEE/ACM Transactions on Networking (TON), vol. 13, no. 2, pp. 221-233, 2005.
[13] S. Faruque, "Traffic engineering for multi rate wireless data," in Electro/Information Technology, 2008. EIT 2008. IEEE International Conference on. IEEE, 2008, pp. 280-283.
[14] A. Borodin and R. El-Yaniv, Online computation and competitive analysis. Cambridge University Press, 1998.
[15] R. Nathuji and K. Schwan, "Virtualpower: coordinated power management in virtualized enterprise systems," in ACM SIGOPS Operating Systems Review, vol. 41, no. 6. ACM, 2007, pp. 265-278.
[16] D. Zhu, R. Melhem, and B. R. Childers, "Scheduling with dynamic voltage/speed adjustment using slack reclamation in multiprocessor real-time systems," IEEE Trans. Parallel Distrib. Syst., vol. 14, no. 7, pp. 686-700, Jul. 2003.
[17] D. Warneke and O. Kao, "Exploiting dynamic resource allocation for efficient parallel data processing in the cloud," Parallel and Distributed Systems, IEEE Transactions on, vol. 22, no. 6, pp. 985-997, 2011.
[18] N. Cordeschi, T. Patriarca, and E. Baccarelli, "Stochastic traffic engineering for real-time applications over wireless networks," Journal of Network and Computer Applications, vol. 35, no. 2, pp. 681-694, 2012.
[19] M. Gudmundson, "Correlation model for shadow fading in mobile radio systems," Electronics Letters, vol. 27, no. 23, pp. 2145-2146, 1991.
[20] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and distributed computation: numerical methods. Prentice-Hall, Inc., 1989.
