

+ Datacenter Trends and Challenges 10
+ Practical Cloud Security 28

MAY 2014
www.computer.org/cloudcomputing


Seeking Editor in Chief

The IEEE Computer Society seeks applicants for the position of editor in chief, serving a two-year term starting 1 January 2016. The EIC would need to be available for training and interim activity beginning 1 October 2015.

Prospective candidates are asked to provide (as PDF files), by 1 August 2014, a complete curriculum vitae, a brief plan for the publication's future, and a letter of support from their institution or employer.

Qualifications and Requirements


Candidates for any IEEE Computer Society editor in chief position should possess a good understanding of industry, academic, and government aspects of the specific publication's field. In addition, candidates must demonstrate the managerial skills necessary to process manuscripts through the editorial cycle in a timely fashion. An editor in chief must be able to attract respected experts to his or her editorial board.
Major responsibilities include

- actively soliciting high-quality manuscripts from potential authors and, with support from publication staff, helping these authors publish their manuscripts;
- identifying and appointing editorial board members, with the concurrence of the Publications Board;
- selecting competent manuscript reviewers, with the help of editorial board members, and managing timely reviews of manuscripts;
- directing editorial board members to seek special-issue proposals and manuscripts in specific areas;
- providing a clear, broad focus through promotion of personal vision and guidance where appropriate; and
- resolving conflicts or problems as necessary.

Applicants should possess recognized expertise in the computer science and computer security community, and must have clear employer support.

Contact Information
For more information on the search process and to submit
application materials for IEEE Security & Privacy, please contact:
Kathy Clark-Fisher at kclark-fisher@computer.org.


EDITOR IN CHIEF

Mazin Yousif, T-Systems International, mazin@computer.org



EDITORIAL BOARD
Zahir Tari, RMIT University
Rajiv Ranjan, CSIRO Computational Informatics
Eli Collins, Cloudera
Kim-Kwang Raymond Choo, University of South Australia
Ivona Brandic, Vienna University of Technology
David Bernstein, Cloud Strategy Partners

Alan Sill, Texas Tech University


Omer Rana, Cardiff University
Beniamino Di Martino, Second University of Naples
Samee Khan, North Dakota State University
J.P. Martin-Flatin, EPFL
Pascal Bouvry, University of Luxembourg

STEERING COMMITTEE
Manish Parashar, Rutgers, the State University of New Jersey
Steve Gorshe, PMC-Sierra (Communications Society
liaison; EIC Emeritus IEEE Communications)
Carl Landwehr, NSF, IARPA (EIC Emeritus IEEE S&P)
Dennis Gannon, Microsoft

V.O.K. Li, University of Hong Kong (Communications Society liaison)
Rolf Oppliger, eSecurity Technologies
Hui Lei, IBM
Kirsten Ferguson-Boucher, Aberystwyth University.

EDITORIAL STAFF
Kathy Clark-Fisher, Managing Editor, kclark-fisher@computer.org
Chris Nelson, Mark Gallaher, Cheryl Baltes, Joan Taylor, and Keri Schreiner, Contributing Editors
Monette Velasco and Jennie Zhu-Mai, Production & Design
Robin Baldwin, Senior Manager, Editorial Services
Jennifer Stout, Manager, Editorial Services
Evan Butterfield, Products and Services Director
Sandy Brown, Senior Business Development Manager
Marian Anderson, Senior Advertising Coordinator

CS MAGAZINE OPERATIONS COMMITTEE
Paolo Montuschi (chair), Erik R. Altman, Maria Ebling, Miguel Encarnação, Lars Heide, Cecilia Metra, San Murugesan, Shari Lawrence Pfleeger, Michael Rabinovich, Yong Rui, Forrest Shull, George K. Thiruvathukal, Ron Vetter, Daniel Zeng

CS PUBLICATIONS BOARD
Jean-Luc Gaudiot (VP for Publications), Alain April, Laxmi N. Bhuyan, Angela R. Burgess, Greg Byrd, Robert Dupuis, David S. Ebert, Frank Ferrante, Paolo Montuschi, Linda I. Shafer, H.J. Siegel, Per Stenström

IEEE Cloud Computing (ISSN 2325-6095) is published quarterly by the IEEE Computer Society. IEEE headquarters: Three Park Ave., 17th Floor, New York, NY 10016-5997. IEEE Computer Society Publications Office: 10662 Los Vaqueros Cir., Los Alamitos, CA 90720; +1 714 821 8380; fax +1 714 821 4010. IEEE Computer Society headquarters: 2001 L St., Ste. 700, Washington, DC 20036.

Subscription rates: IEEE Computer Society members get the lowest rate of US$39 per year. Go to www.computer.org/subscribe to order and for more information on other subscription prices.



CONTENT
What will the future of cloud computing look like? What are some of the issues professionals, practitioners, and researchers need to address when utilizing cloud services? This inaugural issue of IEEE Cloud Computing magazine serves as a forum for the constantly shifting cloud landscape, bringing you original research, best practices, in-depth analysis, and timely columns from the luminaries in the field.

THEME ARTICLES

10 Trends and Challenges in Cloud Datacenters
Kashif Bilal, Saif Ur Rehman Malik, Samee U. Khan, and Albert Y. Zomaya

21 Enabling On-Demand Science via Cloud Computing
Kate Keahey and Manish Parashar

28 Practical Methods for Securing the Cloud
Edward G. Amoroso

FEATURED ARTICLES

40 Cloud Computing Roundtable
Mazin Yousif, Tom Edsall, Johan Krebbers, Stefan Pappe, and Yousef A. Khalidi

50 Standards and Compliance
Setting Cloud Standards in a New World
Alan Sill

54 Cloud Security and Privacy
Security and Privacy in Cloud Computing
Zahir Tari

58 Cloud and Adjacent Technology Trends
Emerging Paradigms and Areas for Expansion
Pascal Bouvry

62 Cloud Economics
The Costs of Cloud Migration
Omer Rana

66 Cloud Management
Challenges in Cloud Management
J.P. Martin-Flatin


May 2014
Volume 1, Issue 1
www.computer.org/cloudcomputing

71 Cloud Experiences and Adoption
Elements of Cloud Adoption
Samee U. Khan

74 Cloud Services
Applications Portability and Services Interoperability among Multiple Clouds
Beniamino Di Martino

COLUMNS

4 From the Editor in Chief
Introducing IEEE Cloud Computing: A Very Timely Magazine
Mazin Yousif

8 Q&A
Q&A with Mazin Yousif, IEEE Cloud Computing Editor in Chief

78 Blue Skies
Streaming Big Data Processing in Datacenter Clouds
Rajiv Ranjan

84 What's Trending?
Intersection of the Cloud and Big Data
Eli Collins

86 StandardsNow
Defining Our Terms
Alan Sill

90 Cloud Tidbits
Today's Tidbit: VoltDB
David Bernstein

93 Cloud and the Law
Legal Issues in the Cloud
Kim-Kwang Raymond Choo

53 IEEE CS Information
70 Advertising Index

Reuse Rights and Reprint Permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for profit; 2) includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-party products or services. Authors and their companies are permitted to post the accepted version of their IEEE-copyrighted material on their own Web servers without permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen of the posted copy. An accepted manuscript is a version which has been revised by the author to incorporate review suggestions, but not the published version with copyediting, proofreading, and formatting added by IEEE. For more information, please go to http://www.ieee.org/publications_standards/publications/rights/paperversionpolicy.html. Permission to reprint/republish this material for commercial, advertising, or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to the IEEE Intellectual Property Rights Office, 445 Hoes Lane, Piscataway, NJ 08854-4141, or pubs-permissions@ieee.org. Copyright © 2014 IEEE. All rights reserved.
Abstracting and Library Use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.
IEEE prohibits discrimination, harassment, and bullying. For more information, visit www.ieee.org/web/aboutus/whatis/policies/p9-26.html.


FROM THE EDITOR IN CHIEF

Introducing IEEE Cloud Computing: A Very Timely Magazine
IT IS A PLEASURE TO WELCOME YOU TO THE FIRST ISSUE OF IEEE CLOUD COMPUTING. Cloud computing, or simply the cloud, is changing how we deploy and run IT. Cloud computing promises that we don't need to worry about running our IT because it will be delivered as a service from inside or outside the enterprise or from the walls of our offices and homes. That's a great vision, and to some extent, it's already happening. So, instead of spending time on our IT, we can focus on more interesting things: the development of our businesses, our customers, or simply on having fun.


You've probably already seen multiple definitions of cloud computing. Put simply, cloud computing is a consumer/delivery model in which IT functions are offered as services, billed based on usage, and accessed with an Internet connection, anytime from anywhere. Its basic premise is that consumers (individuals, industry, government, academia, and so on) pay for IT services while they're using them. In other words, cloud computing turns IT expenses from a capital expenditure into an operational expenditure.
A cloud architecture has four key attributes:

- Elasticity: the ability to scale up or down as workload resource needs increase or decrease;
- Multitenancy: resources are shared by more than one workload and possibly more than one customer;
- Connectivity: the ability to connect to the cloud from anywhere and at any time; and
- Visibility: consumers can monitor their deployment parameters such as usage and cost.
Examples of cloud services include infrastructure as a service (IaaS), platform as a service (PaaS),
and software as a service (SaaS).
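The elasticity attribute above can be sketched as a simple threshold-based scaling rule. This is an illustrative sketch only; the function name, thresholds, and pool sizes are hypothetical and do not correspond to any provider's API.

```python
# Illustrative sketch of elasticity: a threshold-based rule that grows or
# shrinks an instance pool as observed load changes. All names and numbers
# here are hypothetical, not a real cloud API.

def scale_decision(current_instances, cpu_utilization,
                   scale_up_at=0.80, scale_down_at=0.30,
                   min_instances=1, max_instances=10):
    """Return the new instance count for the observed CPU utilization."""
    if cpu_utilization > scale_up_at and current_instances < max_instances:
        return current_instances + 1      # scale up under heavy load
    if cpu_utilization < scale_down_at and current_instances > min_instances:
        return current_instances - 1      # scale down when idle
    return current_instances              # steady state

# A burst of load grows the pool; quiet periods shrink it again.
pool = 2
for load in [0.85, 0.90, 0.60, 0.20, 0.15]:
    pool = scale_decision(pool, load)
```

Real autoscalers layer cooldown periods and predictive signals on top of rules like this, but the core consumer-visible behavior (capacity tracking demand, and billing tracking capacity) is the same.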
Cloud computing has progressed and has been
adopted in the marketplace at an astonishing pace.
Today, the cloud is a reality for millions of users all over the world. However, there is still work to be done to facilitate its use, instill confidence in its promised capabilities, address users' security and privacy concerns, encourage innovation, and discourage lock-in through interoperability among cloud providers.

Why a New Magazine?

MAZIN YOUSIF
T-Systems International
mazin@computer.org

IEEE CLOUD COMPUTING, PUBLISHED BY THE IEEE COMPUTER SOCIETY

The cloud vision is still a work in progress. Consumers, especially enterprises, have not yet put their full faith in cloud computing. There are therefore many opportunities for researchers to improve cloud technologies and elevate them to the promised vision. This is a call to action to all researchers and technologists to push the envelope to address current cloud challenges. IEEE Cloud Computing offers a powerful forum in which to highlight cloud challenges and bring their resolutions to the forefront. The magazine will also be a venue in which to debate the technologies and bring the best to the market.
IEEE Cloud Computing is the newest addition to the IEEE portfolio of magazines. Dedicating a magazine to this topic is the right decision at the right time. The decision was based on many factors, including, but not limited to,

- the increasing adoption of cloud computing as a critical platform in the market;
- the rapid evolution of technologies pertinent to cloud computing;
- the rapid expansion of the open source cloud software community; and
- the need to push for better controls such as security, privacy, and standardization.
Another catalyst that accelerated the decision to establish a magazine on cloud computing is the fast-paced acceptance and growth of the IEEE initiative on cloud computing (http://cloudcomputing.ieee.org/). The initiative, which includes an easy-to-use Web portal, aims to provide a platform for all you need to know about cloud computing. Thus, the magazine will cover a blend of diverse cloud computing topics from all venues, including industry, academia, government, research institutions, and independent professionals.
We plan to have four issues per annum. In 2014, in addition to the current May issue, issues will be published in August, October, and December. The frequency may change in the future, depending on the number of submitted articles and cloud-related news items.

Magazine Structure
IEEE Cloud Computing seeks to foster the evolution of cloud computing and provide a forum for reporting original research, exchanging experiences, and developing best practices. Topics will be presented in an easy-to-read style with the goals of educating, offering insight, broadening horizons, and clearing misconceptions. The magazine will consist of two main sections: areas and columns.
Areas
The magazine is soliciting articles from
industry and academia in the following
areas.
Cloud architecture. If customers are to reap the expected benefits of migrating IT services to the cloud, they must choose the cloud architecture that best matches their business model and IT landscape. This area will look at efforts to further evolve cloud architectures in all their delivery models and deployments.
Cloud management. This area deals
with the command and control of the
cloud. The area is still very much a
work in progress, and considerable
research and development remains
to be done to enable the cloud to become self-managed and self-governed.
The vision here is to enable the cloud
infrastructure (both hardware and
software) to self-configure, self-heal,
self-protect, and self-optimize, thereby
increasing its robustness and performance and reducing its operational
costs. But more importantly, cloud providers must provide enough monitoring
and reporting to give consumers full
visibility and control of their deployments in the cloud.
Cloud security and privacy. Concerns
about cloud security and privacy often
make consumers and enterprises tread
slowly and cautiously with their sensitive and critical data. These concerns

stem from technology, process and


governance, national and international
laws, cloud services crossing international borders, and the slow progression
of legal frameworks to deal with what
technology is enabling us to accomplish.
Cloud services. Consumers adopt cloud
computing differently, typically as a
function of how their business/operational/market models evolve and of
their existing landscapes. That said,
new cloud services will likely evolve as markets develop, the Internet's reach becomes truly global, industries and humans evolve, and so on. This area also covers the specifics of cloud services for various industry sectors.
Cloud experiences and adoption. As they adopt cloud services, consumers' experiences can vary considerably depending on the specifics of the services,
the cloud service provider, and the underlying service-level agreement (SLA).
The cloud experiences and adoption
area will look at topics such as how consumers feel about cloud services, the
pace and challenges of adoption, and
cloud deployment hiccups.
Cloud and adjacent technology trends.

Because there are usually several trends in the market (for example, Big Data, mobility, and social media), exploring their intersection with cloud computing is essential. The cloud and adjacent technology trends area will focus on commonalities between current trends and
cloud computing. For example, cloud
computing could easily serve as the main
infrastructure in Big Data deployments.
Cloud economics. This area deals with

two key issues: the direct and indirect


cost of cloud adoption for customers,
and developing sustainable business
models for providers.



Some of the promises of lower costs


for deploying services in the cloud are
true. But in other cases, cloud computing might not be an option because of
indirect costs. For example, migrating
existing or legacy applications to the
cloud could result in a high total cost
of ownership (TCO), mainly resulting from migration costs. Or, for applications that need to run 24/7, using
a cloud might be more expensive than
keeping them on premise. To incentivize consumers, we need more exotic
economic models for using the cloud,
whether the deployment is short lived
or 24/7. Such models should work with
or without an advanced reservation and
whether the cloud provider targets individuals, enterprise users, or government
employees.
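The on-premise-versus-cloud trade-off described above comes down to simple arithmetic. The sketch below uses entirely made-up figures (the rates, hours, and cost breakdown are hypothetical, for illustration only) to show why an always-on workload can cost more in the cloud while a bursty one costs far less:

```python
# Hypothetical break-even arithmetic for cloud versus on-premise costs.
# All figures are invented for illustration; real TCO also includes
# migration, licensing, and staffing costs mentioned in the text.

HOURS_PER_MONTH = 730  # average hours in a month

def monthly_cloud_cost(hourly_rate, hours_used):
    # Pay-per-use: cost scales directly with hours consumed.
    return hourly_rate * hours_used

def monthly_onprem_cost(amortized_hw=180.0, power_and_ops=120.0):
    # Fixed cost: server amortized over its lifetime plus power/admin.
    return amortized_hw + power_and_ops

always_on = monthly_cloud_cost(0.50, HOURS_PER_MONTH)  # 24/7 workload
bursty = monthly_cloud_cost(0.50, 80)                  # runs ~80 h/month
onprem = monthly_onprem_cost()
```

With these invented numbers the break-even point is 600 hours per month (300.0 / 0.50): below it the cloud wins, above it the on-premise server does, which is exactly the 24/7 case the text describes.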

Cloud standardization and compliance. The best way to encourage cloud interoperability is to facilitate the standardization of cloud technologies and to define test suites for checking compliance. Until recently, however, cloud standardization and compliance have received little attention. Ongoing efforts in industry, academia, and government are appreciated, but they must be backed up by stronger voices and actions by major cloud players. IEEE Cloud Computing will give all parties a forum in which to voice their opinions.

Cloud governance. Cloud governance is increasingly mentioned but seldom enforced. Providing cloud consumers with monitoring and reporting tools lets them visualize and control their deployments in the cloud. But what is the point of providing the right tools when people don't know what governance principles should be enforced or what their enforcement means? Cloud governance goes beyond technology and includes the transparency of the underlying processes, the dos and don'ts when multiple providers are involved, and a clear legal framework outlining what to do when things go wrong. (For example, what happens to the data if the provider or the consumer goes bankrupt?) Much remains to be learned in this area. We'll initially address the topic in a transversal manner, across all areas. In the future, as practitioners increase their understanding of what governance means in the cloud computing context, it might become a separate area in the magazine.

Columns
The columns in IEEE Cloud Computing will seek to provide in-depth analysis of cloud-related topics. We'll start with the following list and expand it as the market demands evolve:

- Cloud Tidbits (cloud technology news), edited by David Bernstein of Cloud Strategy Partners, will highlight industry cloud technologies and innovations such as those that resolve cloud challenges, facilitate cloud adoption, introduce efficiencies, or bring new economic models. The column might also highlight cloud-related events and technologies from specific startups that are deemed to have a deep impact on the cloud industry. We could also dissect the acquisition of a cloud technology or talk about a cloud competition challenge. Cloud Tidbits will appear in every issue and is expected to be the longest column.
- Blue Skies (cloud research news), edited by Rajiv Ranjan of CSIRO, will be similar to Cloud Tidbits, but it will focus on academic and research news.
- CloudServ (cloud computing services) will present, analyze, or evangelize cloud usage and delivery models. It will cover all sectors: industry, healthcare, academia, government, and so on.
- What's Trending? (industry trends intersecting with cloud computing) will look at various industry trends, such as Big Data, social networks, and mobility, and present areas of commonality. It will investigate how cloud computing facilitates such trends, how the trends increase cloud adoption or bring new usage or delivery models, and so on. Eli Collins of Cloudera will coedit this column, focusing on the intersection of Big Data and the cloud.
- StandardsNow (cloud standards), edited by Alan Sill of Texas Tech, will cover standardization and compliance issues in cloud computing.
- Cloud and the Law will cover topics related to cloud security, privacy, cross-border legal constraints, international laws related to cloud data protection, and so on. This column will be edited by Kim-Kwang Raymond Choo of the University of South Australia.
- Interview Corner will feature interviews with opinion leaders or cloud experts involved in research, development, usage, or business. Interviewees will come from industry, academia, government, or any other organization. I will edit this column until a new editor is assigned.

We're currently contemplating adding several other columns:

- a column on cloud misconceptions to clear misleading jargon about cloud computing;
- a column on cloud benchmarking to compare and contrast cloud offerings from multiple providers, taking into consideration multiple dimensions such as performance, functionality, and pricing;
- a column on cloud and governments that looks at how governments position clouds and their use within governments; and
- a column on cloud and industry verticals that explores cloud deployments with specific capabilities tailored to various industry sectors, such as healthcare and automobiles.

These may appear within the next few issues.

Editorial Board
Finally, I am pleased to announce the IEEE Cloud Computing editorial board:

- David Bernstein (Cloud Strategy Partners, USA)
- Ivona Brandic (Vienna University of Technology, Austria)
- Pascal Bouvry (University of Luxembourg)
- Kim-Kwang Raymond Choo (University of South Australia)
- Eli Collins (Cloudera, USA)
- Beniamino Di Martino (Seconda Università di Napoli, Italy)
- Samee U. Khan (North Dakota State University, USA)
- J.P. Martin-Flatin (EPFL, Switzerland)
- Omer Rana (Cardiff University, UK)
- Rajiv Ranjan (CSIRO, Australia)
- Alan Sill (Texas Tech University, USA)
- Zahir Tari (RMIT, Australia)

These well-accomplished individuals have extensive experience in cloud computing. They also have the energy and commitment to deliver an outstanding magazine. I'm very excited to have them onboard.
Members of the board will serve as column editors or area editors. The main role of the area editors is to manage articles submitted in their respective areas. They might also organize
special issues, moderate panels focused on specific cloud themes, or write feature articles. Column editors will manage the write-ups for their respective columns. I'll assign more area and column editors in the next few months to balance the efforts of each editor. I'll also strive to maintain a mix of editors from academia and industry.
We're in the early days of cloud computing. Real challenges still need to be resolved before we can feel comfortable with cloud computing's promise and vision. Together, we'll try to make IEEE Cloud Computing the best platform for solving cloud challenges, promoting the cloud's positive aspects, highlighting inefficiencies in cloud technologies and deployments, and rewarding outstanding cloud technologies.
I encourage you, the cloud community, to embrace this magazine and help us develop it into your reference magazine for the best in cloud computing.

MAZIN YOUSIF is the chief technology


officer and vice president of architecture for the Royal Dutch Shell Global account at T-Systems International. T-Systems
operates information and communication technology (ICT) systems for multinational corporations and public sector
institutions. The company provides integrated solutions for the networked future
of businesses and society. Before joining
T-Systems, Yousif was with IBM Canada, Global Technology Services, where
he focused on cloud computing. He was
also a chief architect at Numonyx, where
he focused on the role of phase change
memory (PCM) in server architectures
and data center optimizations, and a
principal engineer at Intel, leading many
projects on energy optimization, virtualization, and autonomic computing. Before that, he spent some time with IBM

xSeries Division in Research Triangle


Park, North Carolina, working on various architecture and development topics,
and was an assistant professor at Louisiana Tech University.
Yousif chairs the advisory board of the European Research Consortium for Informatics and Mathematics (www.ercim.org). He founded the US National Science Foundation Industry/University Cooperative Research Center for Autonomic Computing and then delegated to professors in three universities (Arizona, Florida, and Rutgers) to develop the necessary documentation and paperwork to establish the center. Yousif has been an adjunct professor at Duke University, North Carolina State University, the University of Arizona, and the Oregon Graduate Institute. He was a principal leader in defining the InfiniBand Architecture and cochaired the management working group in the InfiniBand Trade Association, which was responsible for defining the InfiniBand Architecture. He has served as the general chair or program chair for many conferences and serves on the editorial boards of many journals. He is a frequent speaker at academic and industry conferences on various topics related to cloud, autonomic, and green computing. He has also published extensively (more than 70 publications). He was an IEEE Distinguished Visitors Program speaker from 2008 to 2013.
Yousif has an MSc and PhD in electrical engineering and computer engineering, respectively, from Pennsylvania State University. Contact him at mazin@computer.org.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.


Q&A

Q&A with Mazin Yousif, IEEE Cloud Computing Editor in Chief
The IEEE Cloud Computing Initiative sat down with Mazin Yousif, editor in chief of the new IEEE Cloud Computing magazine, to share his perspective on the future of cloud, key challenges facing the industry today, and drivers for broader adoption.

Why do you think consumers and enterprises


have not put their full faith in cloud computing?
Are cloud security and privacy key areas of concern when considering adoption?

There is certainly some sensitivity surrounding


security and privacy of data that inhibits further
cloud adoption by consumers and enterprises. For
instance, when people send their data to the cloud,
they no longer have physical access to oversee security like they would if it were hosted on a server on
premise. They do not know who has access to their
data or how the service provider is protecting their
data. Therefore, for industries where data is a critical business asset, these unknowns keep enterprises
from storing their data in the cloud.
Currently, service providers do not publish
their internal processes or details of technological
capabilities. There is no transparency, which is another reason that consumers and enterprises have
a hard time placing their data in the cloud. Publishing internal processes would allow for more
transparency and for enterprises to compare the
service providers' processes against their internal
processes. Additionally, people want to know what
will happen to their data if a service provider goes
out of business. Although addressing these concerns should increase cloud adoption, enterprises
are more likely to turn to a hybrid solution, where
they can store some data on premise and some off
premise.

2325-6095/14/$31.00 © 2014 IEEE

What do you see as the key challenges facing the


cloud computing industry today and in the future?
Today, cloud is happening. Organizations, governments, educational institutions, and many others
are adopting or using the cloud in some way, as they
see the positives of increased agility and reduced total cost of ownership. However, the industry is still
trying to figure out what the cloud is and where it is
going, which means there are challenges and concerns that still must be addressed.
Some of these key challenges are around security
and privacy. Interoperability among service providers
is another area that needs attention. People are also
concerned that if they move their workloads and data to a service provider, will they be locked in or will they be able to freely move them somewhere else, if desired? As
the cloud industry moves forward, I am confident that
these challenges will be addressed in the near future,
and that service providers, technology companies, the
open source community, government, and standards
bodies will each play a role in the solutions.


How can the critical challenges in


cloud computing be addressed by
key players in the industry?
Let's look at these critical challenges from three angles: technological,
confidence building, and legal. From
a technology angle, service providers
need to develop innovations that provide
more visibility, transparency, and monitoring capabilities to allow consumers
to know more about their deployments
in the cloud. Globally, service providers
also need to embrace technology trends.
For example, software-defined environments provide a great deal of flexibility
and agility; and although they have been
used to some extent, greater adoption
is needed. I am confident that the open
source community will play a major role
in addressing technological challenges.
Building confidence can be accomplished if service providers publish their
internal processes, governance, and
compliance mechanisms to consumers.
This way consumers can know who can
see or access their data in the cloud and
have a baseline to compare against their
internal on-premise processes. Another
idea to enhance confidence is third-party
auditors that can examine the datacenter
deployments on the consumer's behalf to ensure the service-level agreement is being satisfied. These are just a few examples of ways to build confidence.
We need regulations and laws that
govern cloud services and the data that
is placed in the cloud. Service providers
need to work with legislatures and governments to draft laws that are fair and
can manage and help protect consumers when they put data in the cloud.
IEEE CLOUD COMPUTING WEB PORTAL
With cloud computing significantly impacting today's information and communications ecosystem, the IEEE Cloud Computing Initiative's web portal offers many opportunities to participate, influence, and contribute to this technology. A collaborative source for all things related to cloud computing and big data, the portal features the latest news and a variety of resources, including access to upcoming conferences, education opportunities, publications, developing standards, and the intercloud testbed. Connect with the Cloud Computing Initiative on social media and join our free technical community to learn more about what IEEE is doing in the revolutionary fields of cloud computing and big data. To learn more, check out cloudcomputing.ieee.org.

It's evident that there are synergies with cloud deployments as they relate to big data, mobility, and select industry verticals such as the healthcare and automobile sectors. What are your thoughts on these intersecting industry trends and the potential for increasing cloud adoption?
Cloud is becoming a major catalyst
for the adoption of trends such as big
data and mobility. For example, the cloud
has allowed many small players to make
big dents in these industry trends because they don't need to have large capital investments, just minimal operational
expenses to run their workloads and services in the cloud. Industry verticals such
as healthcare and automotive have also
benefited from using the cloud because a
cloud can easily be designed with specific
features to satisfy just about any compliance, regulatory, or functional need. Another use case is higher education, where
cloud can play a huge role. For example, it
would be easy to build a shared research
cloud that can be used by any number of
universities. A second related example is
building a cloud to provide all the capabilities a university needs, such as online training, outreach, and online courses. Overall, the cloud is a great technology paradigm for all industry sectors.
Are there a few specific areas that you are excited about highlighting in the new publication that will have a tangible, future impact for the cloud industry at large?
We have a threefold approach to the
overall vision of the magazine, which is
to be the best platform available in the
market for understanding cloud and
how to use it well. We want to publish
approaches and methodologies to resolve the current cloud challenges. We
want to highlight cloud-related news,
challenges, experiences, and technological developments. And we want to
enable a greater adoption of cloud computing by sharing the benefits, along
with featuring the numerous ways to
use the cloud.

DATACENTER MANAGEMENT


Trends and Challenges in Cloud Datacenters

Kashif Bilal, Saif Ur Rehman Malik, and Samee U. Khan, North Dakota State University
Albert Y. Zomaya, University of Sydney

Next-generation datacenters (DCs) built on virtualization technologies are pivotal to the effective implementation of the cloud computing paradigm. To deliver the necessary services and quality of service, cloud DCs face major reliability and robustness challenges.
Cloud computing is the next major paradigm shift in information and communication technology (ICT). Today, contemporary society relies more than ever on the Internet and cloud computing. According to a Gartner report published in January 2013, overall public cloud services are anticipated to grow by 18.4 percent in 2014 into a $155 billion market.1 Moreover, the total market is expected to grow from $93 billion in 2011 to $210 billion in 2016. Figure 1 depicts the public cloud service market size along with the growth rates. We've seen cloud computing adopted and used in almost every domain of human life, such as business, research, scientific applications, healthcare, and e-commerce2 (see Figure 2).

The advent and rapid adoption of the cloud paradigm has brought about numerous challenges to the research community and cloud providers, however. Datacenters (DCs) constitute the structural and operational foundations of cloud computing platforms.2 Yet, the legacy DC architectures cannot accommodate the cloud's increasing adoption rate and growing resource demands. Scalability, high cross-section bandwidth, quality of service (QoS) concerns, energy efficiency, and service-level agreement (SLA) assurance are

some of the major challenges faced by today's cloud DC architectures. Multiple tenants with diverse resource and QoS requirements often share the same physical infrastructure offered by a single cloud provider.3 The virtualization of server, network, and storage resources adds further challenges to controlling and managing DC infrastructures.2 Similarly, cloud providers must guarantee reliability and robustness in the event of workload perturbations, hardware failures, and intentional (or malicious) attacks3 and ultimately deliver the anticipated services and QoS.

The cloud computing paradigm promises reliable services delivered through next-generation DCs built on virtualization technologies. This article highlights some of the major challenges faced by cloud DCs and describes viable solutions. Specifically, we focus on architectural challenges, reliability and robustness, energy efficiency, thermal awareness, and virtualization and software-defined DCs.

FIGURE 1. Market and growth rate of public clouds, 2010 to 2017. The market is expected to reach more than $200 billion by 2017.

Cloud Datacenter Architectures

The DC architecture plays a pivotal role in the performance and scalability of the cloud platform. Cloud computing relies on DCs to deliver the expected services.2 The widespread adoption of the cloud paradigm mandates exponential growth in the DCs' computational, network, and storage resources. Increasing the computational capacity of today's DCs is not an issue. However, interconnecting the computational resources to deliver high intercommunication bandwidth and specified QoS are key challenges. Today's DCs are not constrained by computational power but are limited by their interconnection networks.

Legacy, multirooted tree-based network architectures, such as the ThreeTier architecture, cannot accommodate cloud computing's growing demands.4 Legacy DC architectures face several major challenges: scalability, high oversubscription ratio and low cross-section bandwidth, energy efficiency, and fault tolerance.

To overcome these challenges, researchers have proposed various new DC architectures, such as FatTree, DCell, FiConn, Scafida, and JellyFish.2 However, these proposed DC architectures overcome only a fraction of the challenges faced by legacy DC architectures. For instance, the FatTree architecture delivers high bisection bandwidth and a 1:1

oversubscription ratio, but it lacks scalability. The DCell, FiConn, Scafida, and Jellyfish architectures, on the other hand, deliver high scalability but at the cost of low performance and high packet delays with high network loads.
Because of the huge number of interconnected servers in a DC, scalability is a major issue. Tree-structured DC architectures, such as ThreeTier, VL2, and FatTree, offer low scalability. Such DC architectures are capped by the number of network switch ports. Server-centric architectures, such as DCell and FiConn, and freely/randomly connected architectures, such as JellyFish and Scafida, offer high scalability.2

DCell is a server-centric DC architecture, in which the servers act as packet-forwarding devices in addition to performing computational tasks.4 DCell is a recursively built DC architecture consisting of a hierarchy of cells called dcells. The dcell0 is the building block of the DCell topology, which contains n servers connected through a network switch. Multiple lower-level dcells constitute a higher-level dcell; for instance, n + 1 dcell0s build a dcell1. A four-level DCell network with six servers in dcell0 can interconnect approximately 3.26 million servers. However, such DC architectures can't deliver the required performance and cross-section bandwidth within a DC.4
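The recursive construction can be checked numerically. The helper below is an illustrative sketch (not code from the cited work) assuming the standard DCell recurrence, in which a dcell_k is built from t_{k-1} + 1 copies of dcell_{k-1}:

```python
def dcell_size(n, k):
    """Number of servers in a level-k DCell whose dcell0 holds n servers.

    A dcell_k is built from t_{k-1} + 1 copies of dcell_{k-1}, so the
    server count follows t_k = t_{k-1} * (t_{k-1} + 1), with t_0 = n.
    """
    t = n
    for _ in range(k):
        t = t * (t + 1)
    return t

# Four-level DCell (dcell0 through dcell3) with six servers per dcell0:
print(dcell_size(6, 3))  # 3263442 servers, approximately 3.26 million
```

The doubly exponential growth is what lets DCell scale to millions of servers with small switches, at the price of the long inter-dcell paths discussed next.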

FIGURE 2. Adoption of cloud computing in the information and communications technology (ICT) sector. In 2014, the amount spent on clouds is expected to reach $55 billion annually.

Similarly, JellyFish and Scafida are nonsymmetric DC architectures that randomly connect servers to switches for high scalability. In the JellyFish architecture, the servers are connected randomly to switches such that a network switch can be connected to n servers. Each network switch is then connected to k other switches. The Scafida DC architecture has a scale-free network architecture. The servers are connected to switches using the Barabási-Albert network-generation algorithm. Because of the large number of nodes within the network, DC architectures can't use conventional routing algorithms. The customized routing algorithms that DC architectures use, such as DCell Routing, perform poorly under high network loads and many-to-many traffic patterns.
In a previous study,4 we analyzed the network
performance of state-of-the-art DC architectures
with various configurations and traffic patterns.4
Our analysis revealed that server-centric architectures, such as DCell, suffer from high network
delays and low network throughput compared with
tree-structured switch-centric DC architectures, such as FatTree and ThreeTier. Figure 3 shows
that DCell experiences higher network delays and
low throughput as the number of nodes within the
DC architecture increases.4 This is because, for
larger topologies, all the inter-dcell network traffic
must pass through the network link that connects
the dcells at the same level, resulting in increased
network congestion. However, for smaller network
topologies, the traffic load on the inter-dcell links is low and the links serve fewer nodes, resulting in
high throughput. Moreover, DCell routing does not follow the shortest path, which increases the number of intermediate hops between the sender and receiver.
High cross-sectional bandwidth is a necessity for today's DCs. An industry white paper estimated that the cloud will process around 51.6 exabytes (Ebytes) of data in 2015.5 The network traffic pattern within a DC may be one-to-many, one-to-all, or all-to-all. For instance, serving a search query or social networking request, such as group chats

FIGURE 3. Average network throughput and packet delay of datacenter networks (FatTree, ThreeTier, and DCell). As the number of nodes within the DC architecture increases, DCell experiences higher network delays and low throughput.

and file sharing, requires thousands of servers to act in parallel.6 The high oversubscription ratio within some DC architectures, such as ThreeTier and DCell, severely limits the internode communication bandwidth and affects performance. For instance, a typical oversubscription ratio in legacy DC architectures is between 4:1 and 8:1. An oversubscription of 4:1 means that the end host can communicate at only 25 percent of the available network bandwidth. The FatTree architecture offers a 1:1 oversubscription ratio by using a Clos-based interconnection topology. However, the FatTree architecture is not scalable and uses numerous network switches and network cables for interconnection. For example, a FatTree topology of 128 nodes (8-pod) requires 80 network switches to interconnect.
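The switch count quoted above follows directly from the standard k-ary FatTree construction; the sketch below is an illustrative derivation (not code from the cited work) that reproduces the 8-pod figures and the oversubscription arithmetic:

```python
def fat_tree_sizing(k):
    """Servers and switches in a k-ary FatTree built from k-port switches.

    Each of the k pods has k/2 edge and k/2 aggregation switches,
    (k/2)**2 core switches interconnect the pods, and each edge switch
    connects k/2 servers, giving k**3/4 servers in total.
    """
    servers = k ** 3 // 4
    switches = k * k + (k // 2) ** 2  # pod switches plus core switches
    return servers, switches

def bandwidth_fraction(oversubscription):
    """Fraction of link bandwidth an end host can use under full load."""
    return 1.0 / oversubscription

print(fat_tree_sizing(8))     # (128, 80): the 8-pod example above
print(bandwidth_fraction(4))  # 0.25: a 4:1 ratio leaves 25 percent
```

The quadratic growth in switch count (5k²/4) against cubic growth in servers (k³/4) is exactly the cabling and cost burden the paragraph above describes.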
The industry is also considering the use of hybrid DC architectures (optical/electrical and wireless/electrical) to augment DC networks. Optical interconnects offer high bandwidth (up to terabytes per second per fiber), low latency, and high port density. Therefore, optical networks are certainly a possible solution for the ever-increasing bandwidth demands within DCs.7 Various hybrid (optical/electrical) DC architectures, such as Helios, c-Through, and HyPac, have been proposed recently to augment existing electrical DC networks.7 However, optical networks also face numerous challenges:
• high cost;
• high insertion loss;
• large switching and link setup time (usually 10 to 25 ms);
• inefficient packet header processing;
• unrealistic and stringent assumptions, such as network flows without priorities, independent flows, and hashing-based flow distribution, that are effective but not applicable in real-world DC scenarios; and
• optical-electrical-optical (OEO) signal conversion delays caused by the lack of efficient bit-level processing technologies and incurred at every routing node when the optical links are used with electrical devices.6
Similar to the optical interconnects, emerging
wireless technologies, such as 60-GHz communications, are also being considered to overcome various
challenges faced by the current DC networks, such
as cabling costs and complexity.8 However, 60-GHz
technology in DCs is still in its infancy and faces
numerous challenges, such as propagation loss,
short communication range, line of sight, and signal attenuation.8 Hybrid, fully optical, and wireless
DCs may be viable solutions for DC networks, but
the aforementioned open challenges are currently a
barrier to their widespread adoption.

Energy Efficiency in Cloud Datacenters


Concerns about environmental impacts, energy demands, and electricity costs of cloud DCs are intensifying. Various factors, such as the massive amounts
of energy DCs consume, excessive greenhouse gas
(GHG) emissions, and idle DC resources, mandate
that we consider energy efficiency as one of the
foremost concerns within cloud DCs. DCs are one

FIGURE 4. Idle servers in the University of New York at Buffalo datacenter. Careful workload placement and consolidation can result in better resource allocation and thus reduced energy consumption.

of the major energy-consuming entities worldwide. The cloud infrastructure consumed approximately 623 terawatt-hours (TWh) of energy in 2007.9 The estimated energy consumption of the cloud infrastructure in 2020 is 1,963.74 TWh.10 DCs are experiencing a growth rate of around 9 percent every year, and as a result, their energy demands, which have doubled in the last five years,11 are continuing to increase as well.
Because of the increasing energy costs (around 18 percent in the past five years and 10 to 25 percent in the coming years), DC operational expenses are also increasing.11,12 The energy bill of a typical DC dominates its total operational expenses. For instance, approximately 45 percent of one IBM DC's operational expenses went to its energy bill.12 IBM has reported that, over a 20-year period, a DC's operational expenses are three to five times its capital expenditures.12 In certain cases, the energy costs may account for up to 75 percent of operational expenses.12
The enormous GHG emissions produced by DCs and the cloud infrastructure have intensified environmental concerns. The ICT sector's carbon footprint was approximately 227.5 megatonnes (Mt) in 2010, higher than that of the worldwide aviation industry.9 The cloud infrastructure's carbon footprint may be close to 1,034 Mt by 2020.10 Moreover, most of the electricity used by DCs is produced by dirty resources, such as coal.10 Coal-fired power stations are among the biggest sources of GHG emissions worldwide, and the biggest source of GHG emissions in the US.
A typical DC experiences average resource utilization of around 30 percent,13 meaning that DC
resources are overprovisioned to handle peak loads


and workload surges. Therefore, a DC's available
resources remain idle most of the time. In fact, as
much as 85 percent of the computing capacity of distributed systems remains idle.11 As a result, significant energy savings are possible using judicious DC resource optimization techniques. We can achieve
energy efficiency within DCs by exploiting workload
consolidation, energy-aware workload placement,
and proportional computing.
Consolidation techniques exploit resource overprovisioning and redundancy to consolidate workloads on a minimum subset of devices. Idle devices can be transitioned to sleep mode or powered off to save energy by using the Wake on LAN (WoL) feature on network interface cards (NICs). The Xen platform also provides a host power-on feature to sleep/wake devices. Researchers have recently proposed various workload consolidation strategies, such as ElasticTree and DCEERS, for energy savings within DCs.8,13 Such strategies use optimization techniques, such as calculating a minimum subset of devices to service the necessary workload. However, the aforementioned consolidation strategies do not consider two critical issues:
• How will the strategy handle workload surges and resource failures?
• When will the resources be transitioned to sleep/wake mode?
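To make the "minimum subset of devices" idea concrete, here is a simplified first-fit-decreasing sketch of workload consolidation; it is an illustrative stand-in, not the actual ElasticTree or DCEERS algorithm:

```python
def consolidate(loads, capacity):
    """First-fit-decreasing consolidation: pack workloads onto the
    fewest servers so that the remaining servers can sleep.

    loads: per-workload resource demand (e.g., CPU share)
    capacity: usable capacity of each (assumed homogeneous) server
    Returns a list of per-server workload lists.
    """
    servers = []  # each entry holds the loads assigned to one server
    for load in sorted(loads, reverse=True):
        for assigned in servers:
            if sum(assigned) + load <= capacity:
                assigned.append(load)  # fits on an already-active server
                break
        else:
            servers.append([load])  # power on another server
    return servers

# Ten workloads that would keep ten servers about 30 percent busy:
active = consolidate([0.3] * 10, capacity=1.0)
print(len(active))  # 4 active servers; the other 6 can sleep
```

A production strategy would additionally reserve headroom for workload surges and stagger sleep transitions, which is exactly the gap the two questions above point out.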
Activating a powered-off or sleeping device requires a long delay (usually seconds or minutes). Such extra delays are intolerable in SLA-constrained DC environments, so there is an immense need to consider


system robustness while using energy-efficiency techniques. Similarly, energy-aware workload placement
and on-the-fly task migration can help maximize
resource utilization, minimize energy consumption,
and control the network load.
We used a real DC workload from the University of New York at Buffalo to observe the impact of energy-aware workload placement and live migration on energy savings. We observed that careful workload placement and consolidation can result in a high percentage of idle resources within a DC. Figure 4 shows that, with task migrations and proper workload placement, most of the resources remain idle in a DC. Such idle resources can be placed in sleep or low-power mode to attain significant energy savings.
Proportional computing involves consuming energy in proportion to resource utilization. DC resources in an idle or underutilized state consume around 80 to 90 percent of the energy consumed during peak utilization. Proportional computing techniques, such as dynamic voltage and frequency scaling (DVFS) and adaptive link rate (ALR), aim to operate resources (processors and network links) in a scaled-down state to consume less power.8 Such techniques depend on a mechanism for efficiently scaling the power state of the resources up and down and a policy for determining when to alter the power state.
The DVFS technique is applied to processors to scale down a processor's voltage or frequency. However, the scaled-down state increases the execution time of tasks, leading to a larger makespan. Similarly, ALR techniques are applied to network links to scale down the links' data rates for reduced energy consumption. The IEEE 802.3az standard uses the ALR technique to scale down Ethernet link data rates.8 However, the IEEE 802.3az standard only provides a mechanism to change the link's state; true efficiency and energy savings depend on the proportional computing policies that decide when to change the state of the resources. Therefore, efficient and robustness-aware policies are mandatory to exploit the full potential of proportional computing techniques.
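As a rough illustration of why proportionality matters, the sketch below uses a common linear server power model (our assumption for illustration, not a measurement from the cited studies) to compare a server whose idle power is 85 percent of peak against a perfectly energy-proportional one:

```python
def power(utilization, p_peak, idle_fraction):
    """Linear server power model: P = P_idle + (P_peak - P_idle) * u.

    utilization: fraction of peak load (0.0 to 1.0)
    p_peak: power draw at full load (watts)
    idle_fraction: idle power as a fraction of peak power
    """
    p_idle = idle_fraction * p_peak
    return p_idle + (p_peak - p_idle) * utilization

# A hypothetical 300 W server at the typical 30 percent utilization:
legacy = power(0.3, p_peak=300, idle_fraction=0.85)       # 268.5 W
proportional = power(0.3, p_peak=300, idle_fraction=0.0)  # 90.0 W
print(1 - proportional / legacy)  # roughly 0.66: ~66 percent saved
```

Under this model, a nonproportional fleet running at 30 percent load burns nearly 90 percent of its peak energy, which is why DVFS, ALR, and sleep states are worth the policy complexity they introduce.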

Robustness in Cloud Datacenters


As the operational and architectural foundation of
cloud computing, DCs hold a fundamental role in
the operational and economic success of the cloud
paradigm. SLA-constrained cloud environments
must be robust to workload surges and perturbations
as well as software and hardware failure3 to deliver the specified QoS and meet SLA requirements.
However, dynamic and virtualized cloud environments are prone to failures and workload perturbations. A small performance degradation or minor failure in a cloud may have severe operational and economic impacts. In one incident, a small network failure in the O2 network (the UK's leading cellular network provider) affected around seven million customers for three days.3 Similarly, because of a core switch failure in BlackBerry's network, millions of customers lost Internet connectivity for three days. In other incidents, the Bank of America website outage affected around 29 million customers, and Virgin Blue airline lost approximately $20 million because of a hardware failure in its system.14 Major brands faced service outages in 2013, including Google, Facebook, Microsoft, Amazon, Yahoo, Bank of America, and Motorola.
The cloud market is growing rapidly, and the European Network and Information Security Agency (ENISA) has projected that approximately 80 percent of public and private organizations will be cloud dependent by 2014. Many cloud service providers (CSPs) offer 99.9 percent annual availability of their services. However, a 99.9 percent availability rate still translates into 8.76 hours of annual downtime. For any cloud-dependent organization, around-the-clock availability is of utmost importance. Moreover, even a short downtime could result in huge revenue losses. For instance, in a survey of 200 DC managers, USA Today reported that DC downtime costs per hour exceed $50,000.14 The business sector is expected to lose around $108,000 for every hour of downtime. InformationWeek reported that IT outages result in a revenue loss of approximately $26.5 billion per year.14
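The downtime arithmetic above is easy to verify:

```python
def annual_downtime_hours(availability):
    """Hours of downtime per year implied by an availability level."""
    return (1 - availability) * 365 * 24

# The 99.9 percent ("three nines") SLA cited above:
print(annual_downtime_hours(0.999))  # ~8.76 hours per year
```

At the $50,000-per-hour cost the survey reports, those 8.76 hours alone already represent several hundred thousand dollars of annual exposure.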
In addition to huge revenue losses, service
downtimes also result in reputation damage and
customer attrition. Therefore, robustness and
failure resiliency within the cloud paradigm are of
paramount importance. We analyzed the robustness
and connectivity of the major DC architectures
under various types of failures, such as random,
targeted, and network-only failures.3 We found
that the legacy DC architectures lack the required
robustness against random and targeted failures.
A single access layer switch failure disconnects
all the connected servers from the network. The
DCell architecture exhibits better connectivity
and robustness against various types of failures.
However, the DCell architecture cannot deliver the
required QoS and performance necessary for large
networks and heavy network loads.
Using consolidation, dynamic power (sleep/
wake) management, and proportional computing

FIGURE 5. Thermal management strategies (software-driven thermal management, air-flow management, datacenter design, and economization). Cloud DCs can utilize one or more strategies to regulate and manage operating temperatures.

techniques to save energy may also affect a cloud's performance. Google reported a revenue loss of around 20 percent because of an additional response delay of 500 ms.3 Similarly, Amazon reported around 1 percent sales reduction because of an additional delay of 100 ms. In high-frequency trading (HFT) systems, a delay as small as nanoseconds may have huge financial effects.11 Therefore, energy-efficient techniques must not compromise system robustness and availability. Activating sleeping and powered-off devices requires significant time. Similarly, scaling up a processor or network link may also result in an
extra delay and power spike. Therefore, appropriate measures must be taken to avoid any extra delays. The dynamic power management and proportional computing policies and consolidation strategies must be robust enough to handle workload surges and failures. Similarly, prediction-based techniques for forecasting future workloads and failures can also help enhance system robustness.

Thermal Awareness in Cloud Datacenters


As we stated earlier, electricity costs comprise a major portion of overall DC operational expenses. A

further breakdown of the energy consumption within a DC reveals that an overwhelming portion of those costs is incurred to stabilize the DC's thermal dynamics, through equipment such as computer room air conditioning (CRAC) units, chillers, and fans. In a typical DC, the annual electricity cost of cooling alone is $4 to $8 million, including the cost of purchasing and installing the CRAC units.15 High operating temperatures can decrease the reliability of the underlying computing devices. Moreover, inappropriate air-flow management within DCs can create hotspots that may cause servers to throttle down, increasing the possibility of failures. The DC industry uses several strategies to stabilize thermal subtleties. As Figure 5 shows, we can broadly categorize such strategies into four areas:
• software-driven thermal management and temperature-aware strategies,
• DC design strategies,
• air-flow management strategies, and
• economization.
Software-driven thermal management strategies mainly focus on maintaining a thermal balance within the DC. The goal is to reduce the average heat dissipation of the servers to reduce the cost of running the CRAC unit. Such strategies adopt various methods for job allocation. For instance, genetic-algorithm-based job allocation16 attempts to select a set of feasible servers to minimize the thermal impact of job allocation; the integer linear-programming modeling approach17 aims to meet real-time deadlines while minimizing hotspots and spatial temperature differences through job scheduling; and thermodynamic-formulation and thermal-profiling-based strategies optimize the DC's thermal status.18 However, different software-driven thermal strategies produce different thermal footprints, depending on the nature of the workload being processed.
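As a minimal illustration (our own greedy sketch, not the genetic-algorithm or linear-programming formulations cited above), a temperature-aware allocator might place each job on the feasible server whose predicted outlet temperature after placement is lowest:

```python
def place_jobs(jobs, servers, t_max):
    """Greedy temperature-aware job placement.

    jobs: list of (job_id, heat), where heat is the temperature rise
          the job is predicted to add to its host (an assumed model).
    servers: dict of server_id -> current outlet temperature (deg C).
    t_max: outlet temperature no server may exceed.
    Returns a dict of job_id -> server_id (None if nothing is feasible).
    """
    placement = {}
    for job_id, heat in sorted(jobs, key=lambda j: -j[1]):  # hottest first
        feasible = [s for s, t in servers.items() if t + heat <= t_max]
        if not feasible:
            placement[job_id] = None  # would create a hotspot everywhere
            continue
        target = min(feasible, key=lambda s: servers[s] + heat)
        servers[target] += heat  # update the predicted outlet temperature
        placement[job_id] = target
    return placement

servers = {"s1": 30.0, "s2": 33.0}
print(place_jobs([("a", 4.0), ("b", 2.0)], servers, t_max=38.0))
# {'a': 's1', 'b': 's2'}: the load spreads to balance outlet temperatures
```

Even this toy policy exhibits the key property the paragraph describes: jobs are steered away from already-hot servers, trading some placement freedom for a flatter thermal profile.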
DC design strategies aim to build efficient physical DC layouts, such as a raised floor and hot and cold aisles. In a typical air-cooled DC, hot and cold aisles are separated by rows of racks. The CRAC unit's blower pressurizes the under-floor plenum with cold air that is drawn through the vents located in front of the racks in the cold aisle. The hot air coming out of the servers is pushed into the hot aisles. To enhance efficiency, DC managers have added containment systems that isolate hot and cold aisles to avoid air mixing. Initially, physical barriers, such as vinyl plastic sheeting or Plexiglas covers, were used for containment. However, today vendors
offer other commercial options, such as plenums that combine containment with variable fan drives to prevent air from mixing.

Other DC design strategies involve the placement of cooling equipment. For example, the cabinet-based strategy contains the closed-loop cooling equipment within a cabinet; the row-based strategy dedicates CRAC units to a specific row of cabinets; the perimeter-based strategy uses one or more CRAC units to supply cold air through plenums, ducts, or dampers; and the rooftop-based strategy uses central air handling units to cool the DC. Generally, equipment-placement strategies are adopted based on the physical room layout and the building's infrastructure.
The DC cooling system is significantly influenced by the air movement, the cooling delivered to servers, and the removal of the hot air dissipated from the servers. Air-flow management strategies are therefore adopted to appropriately maneuver the hot and cold air within the DC. Three air-flow management strategies are usually followed: open, where no intentional air-flow management is deployed; partial containment, where an air-flow management technique is adopted but there is no complete isolation between hot and cold air flows (using hot and cold aisles); and contained, where the hot and cold air flows are completely isolated.
The economization strategy reduces the cost spent on cooling infrastructure by drawing in cool air from outside and expelling the hot air outdoors. Intel IT conducted an experiment and claimed that an air economizer could potentially reduce annual operating costs by up to $2.87 million for a 10-MW DC.19

A combination of all of the aforementioned strategies could be used to implement an efficient thermal-aware DC architecture.

Virtualization in Cloud Datacenters


The process of abstracting the original physical structure of various technologies, such as a hardware platform, an operating system, a storage device, or other network resources, is called virtualization.20 Virtualization is one of the key techniques used to achieve scalability and flexibility in cloud DCs and is the underlying technology that contributes to the application and adoption of the cloud paradigm. A virtual machine monitor (VMM) serves as an abstraction layer that controls the operations of all the VMs running on top of it. Every physical machine in the cloud hosts multiple VMs, each of which, from a user's perspective, is equivalent to a fully functional machine. Virtualization ensures high resource utilization and thus leads to huge savings in hardware, power consumption, and cooling. Several VM-based cloud management platforms are available, such as Eucalyptus, OpenStack, and Open Nebula.

DATACENTER MANAGEMENT

FIGURE 6. Verification time (in ms) and memory consumed (in Mbytes) by three VM-based cloud management platforms (Eucalyptus, Open Nebula, and Nimbus) for 10 to 100 virtual machines. The exercise to investigate the scalability of the models revealed they functioned appropriately as the number of VMs increased.
Today, the primary focus for virtualization continues to be on servers. However, the virtualization of other components, such as storage and networks, is also evolving as a prominent strategy. Moreover, virtualization is also used in other areas: application virtualization, where every user has an isolated virtual application environment; hardware-layer virtualization, where a VMM runs directly on hardware, controlling and synchronizing access to hardware resources; OS-layer virtualization, where multiple instances of the same OS run in parallel; and full virtualization, where the I/O devices are allotted to the guest machines by imitating the physical devices in the VMM.
Virtualization is experiencing an annual growth rate of around 42 percent.11 According to a Gartner survey, workload virtualization will increase from around 60 percent in 2012 to almost 90 percent in 2014.21 Several major reasons exist for this increase:

• scalability and availability,
• hardware consolidation,
• legacy applications continuing to operate on newer hardware and operating systems,
• simulated hardware and hardware configurations,
• load balancing, and
• easy management of tasks, such as system migration, backup, and recovery.

Despite all the benefits, virtualization technology poses several serious threats and adds further challenges to efficiently and appropriately managing a DC. Moreover, network services in a virtualized environment have to look beyond the physical machine level to a lower, virtual level. The advent of virtual switches and virtual topologies brings further complexity to the DC network topology. A legacy three-tier topology, for example, may grow to four or five tiers, which may be suboptimal and impractical in various cloud environments.11 MAC address management and the scalability of consolidated VMs are a major concern: the MAC tables in network devices must be prevented from overloading.
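The scale of the concern is easy to see with a little arithmetic. The rack size, consolidation ratio, and table capacities below are illustrative assumptions, not measurements:

```python
# Why consolidated VMs pressure switch MAC tables: each VM exposes its
# own virtual MAC address that data plane devices must learn.
# All figures below are illustrative assumptions, not measurements.

servers_per_rack = 48
vms_per_server = 20      # assumed consolidation ratio
racks_per_pod = 16

macs_per_rack = servers_per_rack * vms_per_server   # learned by a top-of-rack switch
macs_per_pod = macs_per_rack * racks_per_pod        # learned by an aggregation switch

print(macs_per_rack)  # 960
print(macs_per_pod)   # 15360

# Commodity switch MAC tables commonly hold on the order of tens of
# thousands of entries, so a few pods of consolidated VMs can exhaust
# an aggregation-layer table that a one-host-per-port design never would.
```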
Specifically, virtualization faces some key challenges, including VM hopping, where an attacker on one VM can access another VM; VM mobility, or the quick spread of vulnerable configurations that can be exploited to jeopardize security; VM diversity, where the range of operating systems creates difficulties when securing and maintaining VMs; and cumbersome management, where managing the configuration, network, and security-specific settings is a difficult task. The inception of the cloud itself is based on distributed (grid and cluster) computing and virtualization.
Previous research has focused on the computing and storage aspects of the cloud, while a crucial aspect, connectivity (networking), usually goes unaddressed.22 In a recent study, we performed formal modeling, analysis, and verification of three state-of-the-art VM-based cloud management platforms: Eucalyptus, Open Nebula, and Nimbus.20 The exercise was to demonstrate the models' flexibility and

scalability. We instantiated 100 VMs and verified whether the models' functionality was affected by the increase in the number of instances. The results from the exercise revealed that the models were functioning appropriately as the number of VMs increased, as Figure 6 shows.
The increasing criticality and complexity of cloud-based operations, such as routing and VM management, that are necessary to deliver QoS have driven the maturation of formal method techniques. Such techniques aim to increase software quality, reveal incompleteness, remove ambiguities, and expose inconsistencies by mathematically proving program correctness as opposed to using test cases. Formal methods have gained a great deal of popularity since the famous Pentium bug that caused Intel to recall faulty chips, resulting in a $475 million loss.23 Most of the well-known names connected to DCs, such as Microsoft, Google, IBM, AT&T, and Intel, have already realized the importance of formal methods and are using such techniques and tools to verify the functionality and reliability of their software and hardware.
As we stated earlier, the occurrence of errors and miscalculations is hazardous and expensive in large-scale computing and critical systems, such as cloud and real-time systems. The New York Times reported one such error in August 2012: the Knight Capital Group lost $440 million in just 45 minutes when newly installed trading software went haywire. Formal method techniques can be adopted to ensure system correctness, reliability, robustness, and safety by introducing rigor and performing proofs and verification of the underlying systems.
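The contrast with test cases can be made concrete with a toy example. The sketch below exhaustively enumerates every reachable state of an invented VM lifecycle model and checks properties over all of them; real verification efforts, including the platform study cited above, use far richer models and dedicated tools:

```python
# A toy taste of model checking: enumerate every reachable state of an
# invented VM lifecycle model and prove properties over all of them,
# rather than sampling a few runs with test cases. The state machine
# itself is made up for illustration.

TRANSITIONS = {
    "pending":    {"running", "failed"},
    "running":    {"paused", "terminated", "failed"},
    "paused":     {"running", "terminated"},
    "failed":     {"pending"},       # retry after failure
    "terminated": set(),             # final state
}

def reachable(start):
    """Exhaustively explore the state space from `start`."""
    seen, frontier = {start}, [start]
    while frontier:
        for nxt in TRANSITIONS[frontier.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

states = reachable("pending")

# Property 1 (safety): a terminated VM never transitions again.
assert not TRANSITIONS["terminated"]
# Property 2: every non-final reachable state can still make progress.
assert all(TRANSITIONS[s] for s in states if s != "terminated")
print(f"verified {len(states)} reachable states")  # verified 5 reachable states
```

Because the exploration covers the entire state space, the assertions hold for every execution the model permits, not only the executions a test suite happens to exercise.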
Software-defined networking (SDN) involves separating the network control plane from the data plane within a network.11 The data plane is the hardware-based portion of the device that sends and receives network packets, whereas the control plane is the software-based portion of the network device that determines how the packets will be forwarded. SDN offers decoupled and centralized network management of the control plane to manage the whole network in a unified way. Control plane management is performed independently of the devices, and the forwarding rules, for example, routing tables, are assigned to the data plane on the fly using communication protocols. SDN offers high flexibility, agility, and control over a network using network programmability and automation.11 The SDN market is expected to grow by $35 billion in 2018.24 Various SDN frameworks, such as Cisco ONE and OpenDaylight, offer APIs, tools, and protocols for configuring and building a centrally controlled programmable network.

SDN-based automated DC networks are a possible solution to the various challenges faced by legacy DC networks, but such technologies are still in their infancy. Moreover, SDN deployment requires network devices that support OpenFlow (or another SDN-based communication protocol), but legacy network devices do not support such protocols. In addition, a central SDN controller creates a single point of failure, and preventing malicious misuse of SDN platforms is a major security concern.
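The control/data-plane split described above can be sketched in a few lines. The classes below mimic the OpenFlow-style "table miss goes to the controller" pattern; the class names and topology map are invented for illustration and are not any framework's actual API:

```python
# The control/data-plane split in miniature: switches only match packets
# against installed rules (data plane), while a central controller with a
# network-wide view computes and installs rules on the fly (control plane).
# Mimics the OpenFlow table-miss idea; not a real framework's API.

class Switch:
    """Data plane: pure match-action lookup, no routing intelligence."""
    def __init__(self, name):
        self.name = name
        self.flow_table = {}              # destination address -> output port

    def forward(self, dst):
        return self.flow_table.get(dst, "send-to-controller")

class Controller:
    """Control plane: holds the global view, installs rules on demand."""
    def __init__(self, topology):
        self.topology = topology          # (switch name, dst) -> port

    def handle_miss(self, switch, dst):
        port = self.topology[(switch.name, dst)]
        switch.flow_table[dst] = port     # push the forwarding rule down
        return port

s1 = Switch("s1")
ctrl = Controller({("s1", "10.0.0.2"): 3})

assert s1.forward("10.0.0.2") == "send-to-controller"  # table miss
ctrl.handle_miss(s1, "10.0.0.2")                       # rule installed on the fly
assert s1.forward("10.0.0.2") == 3                     # subsequent packets: fast path
```

The single `Controller` instance also makes the single-point-of-failure concern visible: lose it, and every table miss goes unanswered.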

Energy efficiency, robustness, and scalability are among the foremost concerns faced by cloud DCs. Researchers and industry are striving to find viable solutions to the challenges facing DCs. Hybrid DC architectures employing optical and wireless technologies are among the strongest feasible solutions today. SDN-based DC architectures are also being considered to handle various network-related problems and to deliver high performance. The hybrid DC architectures and SDN-based DCs are still in their infancy, however. Therefore, serious research efforts are necessary to overcome the limitations and drawbacks of these emerging technologies to deliver the required QoS and performance.
References
1. Gartner, Forecast Overview: Public Cloud Services, Worldwide, 2011–2016, 4Q12 Update, 2013.
2. K. Bilal et al., "A Taxonomy and Survey on Green Data Center Networks," to be published in Future Generation Computer Systems; doi:10.1016/j.future.2013.07.006.
3. K. Bilal et al., "On the Characterization of the Structural Robustness of Data Center Networks," IEEE Trans. Cloud Computing, vol. 1, no. 1, 2013, pp. 64–77.
4. K. Bilal et al., "Quantitative Comparisons of the State of the Art Data Center Architectures," Concurrency and Computation: Practice and Experience, vol. 25, no. 12, 2013, pp. 1771–1783.
5. Bell Labs and Univ. of Melbourne, The Power of Wireless Cloud: An Analysis of the Energy Consumption of Wireless Cloud, Apr. 2013; www.ceet.unimelb.edu.au/pdfs/ceet_white_paper_wireless_cloud.pdf.
6. A. Vahdat et al., "The Emerging Optical Data Center," Proc. Conf. Optical Fiber Comm., 2011; www.opticsinfobase.org/abstract.cfm?URI=ofc-2011-otuh2.
7. C. Kachris and I. Tomkos, "A Survey on Optical Interconnects for Data Centers," IEEE Comm. Surveys & Tutorials, vol. 14, no. 4, 2012, pp. 1021–1036.
8. K. Bilal, S.U. Khan, and A.Y. Zomaya, "Green Data Center Networks: Challenges and Opportunities," Proc. 11th IEEE Int'l Conf. Frontiers of Information Technology, 2013, pp. 229–234.
9. G. Cook and J. Horn, How Dirty Is Your Data? A Look at the Energy Choices That Power Cloud Computing, tech report, Greenpeace Int'l, 2011.
10. Greenpeace Int'l, Make IT Green: Cloud Computing and its Contribution to Climate Change, 2010; www.greenpeace.org/usa/Global/usa/report/2010/3/make-it-green-cloud-computing.pdf.
11. IBM, IBM and Cisco: Together for a World Class Data Center, 2013.
12. S.L. Sams, Discovering Hidden Costs in Your Data Center: A CFO Perspective, IBM, 2010.
13. J. Shuja et al., "Data Center Energy Efficient Resource Scheduling," Cluster Computing, Mar. 2014; http://link.springer.com/article/10.1007%2Fs10586-014-0365-0.
14. Evolven, "Downtime, Outages and Failures: Understanding Their True Costs," Wind of Change blog, 18 Sept. 2013; http://urlm.in/sjhk.
15. E. Pakbaznia and M. Pedram, "Minimizing Data Center Cooling and Server Power Costs," Proc. 14th ACM/IEEE Int'l Symp. Low Power Electronics and Design, 2009, pp. 145–150.
16. Q. Tang, S. Gupta, and G. Varsamopoulos, "Energy-Efficient Thermal-Aware Task Scheduling for Homogeneous High-Performance Computing Data Centers: A Cyber-Physical Approach," IEEE Trans. Parallel and Distributed Systems, vol. 19, no. 11, 2008, pp. 1458–1472.
17. E. Kursun and C.Y. Cher, "Temperature Variation Characterization and Thermal Management of Multicore Architectures," IEEE Micro, vol. 29, 2009, pp. 116–126.
18. J. Moore et al., "Making Scheduling Cool: Temperature-Aware Workload Placement in Data Centers," Proc. Usenix Conf., 2005, pp. 61–75.
19. Intel Information Technology, Reducing Data Center Cost with an Air Economizer, 2008; www.intel.com/it/pdf/Reducing_Data_Center_Cost_with_an_Air_Economizer.pdf.
20. S.U.R. Malik, S.U. Khan, and S.K. Srinivasan, "Modeling and Analysis of State-of-the-Art VM-Based Cloud Management Platforms," IEEE Trans. Cloud Computing, vol. 1, no. 1, 2013, pp. 50–63.
21. M. Warrilow and M. Cheung, Will Private Cloud Adoption Increase by 2015?, research note G00250893, Gartner, May 2013.
22. F. Panzieri et al., "Distributed Computing in the 21st Century: Some Aspects of Cloud Computing," Technology-Enhanced Systems and Tools for Collaborative Learning Scaffolding, Springer, 2011, pp. 393–412.
23. T. Coe et al., "Computational Aspects of the Pentium Affair," IEEE Computational Science and Eng., vol. 2, no. 1, 1995, pp. 18–30.
24. P. Bernier, "Openwave Exec Discusses the Benefits, Challenges of NFV and SDN," SDN Zone Newsletter, 12 Nov. 2013; http://urlm.in/sjij.


KASHIF BILAL is a doctoral student at North Dakota State University. His research interests include cloud computing, datacenter networks, green computing, and distributed systems. Bilal has an MS in computer science from the COMSATS Institute of Information Technology, Pakistan. He is a student member of IEEE. Contact him at kashif.bilal@ndsu.edu.

SAIF UR REHMAN MALIK is a doctoral student at North Dakota State University. His research interests include formal methods, large-scale computing systems, cloud computing, and datacenter networks. Malik has an MS in computer science from the COMSATS Institute of Information Technology, Pakistan. He is a student member of IEEE. Contact him at saif.rehmanmalik@ndsu.edu.
SAMEE U. KHAN is an assistant professor at North Dakota State University. His research interests include optimization, robustness, and security of cloud, grid, cluster, and big data computing; social networks; wired and wireless networks; power systems; smart grids; and optical networks. Khan has a PhD in computer science from the University of Texas at Arlington. He is a senior member of IEEE. Contact him at samee.khan@ndsu.edu.

ALBERT Y. ZOMAYA is a professor at the University of Sydney. His research interests include algorithms, complex systems, and parallel and distributed systems. Zomaya has a PhD from the University of Sheffield, UK. He is a fellow of IEEE. Contact him at albert.zomaya@sydney.edu.au.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.


CLOUDS AND SCIENTIFIC COMPUTING

Enabling On-Demand Science via Cloud Computing

Kate Keahey, Argonne National Laboratory and University of Chicago
Manish Parashar, Rutgers, The State University of New Jersey

The advantages of on-demand resource availability are making cloud computing a viable platform option for research and education that may enable new practices in science and engineering.

Infrastructure cloud computing has emerged as a new, revolutionary resource procurement paradigm that has been widely adopted by enterprises. Clouds provide on-demand access to computing utilities, user control over the computing environment, and an abstraction of unlimited computing resources: overall, a fundamental building block for on-demand scale up, scale out, and scale down. Furthermore, a diverse and dynamically federated marketplace of cloud-of-clouds can accommodate heterogeneous and highly dynamic application requirements by composing appropriate (public and/or private) cloud services and capabilities best suited to the needs of a given application. Clouds are also rapidly joining high-performance computing (HPC) systems, clusters, and grids as viable platforms for scientific exploration and discovery.1
Analogous to their role in enterprise IT, clouds can enable the outsourcing of many of the potentially distracting aspects of research and education, such as procuring, building, housing, and operating infrastructures, thus enabling research institutions to
2325-6095/14/$31.00 © 2014 IEEE
Published by the IEEE Computer Society


make science their primary focus. Furthermore, the ability to represent a computational environment as an appliance that different researchers can publish and then easily share enables the reproducibility of associated computations and thus facilitates sharing not only data but also new algorithms and methods. Additionally, similar to their enterprise role, clouds in the research and education context can democratize access to computational and data resources because institutions and individual researchers can lease powerful resources for a short time at relatively little cost.
It's important, however, to look beyond these benefits and understand application formulations and usage modes that are meaningful in a cloud-centric cyberinfrastructure (CI). We must also determine how a hybrid CI can enable new practices in science and engineering. Not all application patterns or usage patterns common in the scientific community lend themselves to the cloud computing platform, and likewise, not all the potential created by infrastructure clouds is currently being leveraged. As we look to the future needs of scientific platforms, it becomes clear that their characteristics will evolve to place additional requirements on computational support, in particular as extreme data and computing scales continue to transform and drive science and engineering research.
This article discusses the current and future needs of data-driven scientific exploration based on traditional as well as emergent scientific instruments and experiments. We explain how the on-demand resource availability provided by cloud computing can become a vital part of such an instrument and discuss both opportunities and obstacles to cloud adoption in science. We then articulate challenges in research and current practices that must be addressed to leverage those opportunities and overcome those obstacles before cloud computing can develop into a viable scientific platform. We conclude with recommendations for catalyzing the integration of cloud computing opportunities into the current scientific landscape and discuss the significance of such an integration.

Science On Demand
Large-scale experiments, such as the Large Hadron Collider (LHC), equipped with millions of sensors and capable of producing up to petabytes of data per second, have highlighted the importance (or, rather, the criticality) of computational support as an extension of scientific instruments. Data produced in such large quantities often must first be reduced by orders of magnitude in real time to a volume that can be stored at an acceptable cost. Data may even have to be analyzed in real time so that it can provide feedback during the experiment. Additionally, raw data must be processed into derived products that give actual insight into the observed phenomena and can be analyzed by groups of diverse scientists contributing their expertise and generating new scientific insight. Such processing must happen within the context of an experiment, that is, in real or near-real time. Thus, science is performed in bursty cycles, akin to the uptick of shopping during the Christmas season relative to other times of the year. This exploration pattern increasingly places a premium on the on-demand availability of resources, a demand that traditional batch-oriented computational centers can't always satisfy.
The advent of infrastructure cloud computing has had a tremendous, disruptive force in this space; it enables the ability to lease resources on demand with a preconfigured environment that guarantees correct and consistent execution. The transformation and the potential that this capability has opened up are exemplified by the Solenoidal Tracker at RHIC (STAR) nuclear physics experiment.2 Using a traditional approach, a local cluster's computational capacity would have throttled the speed at which experimental results could be processed, and as a result, STAR scientists would have had to wait almost a year to assess the results of the experiment. With cloud computing resources, the STAR scientists were able to reduce this time to just three months, a significant difference in a competitive field. Furthermore, the scientists were able to run the data calibration component of processing concurrently with data collection, opening up the possibility of adaptively tuning the experimental parameters, a highly desirable capability.
Such advances are particularly interesting as we
consider the types of experiments we are likely to
conduct in the future. Inexpensive and increasingly
sophisticated sensor devices now allow scientists to
instrument ecological systems (such as oceans and
rivers) or cities, turning our planet and everything

in it into an instrument at large: dynamic, customized, and often self-organizing groups of sensors with outputs that we can aggregate and correlate to support experiments organized around specific questions. For example, structured deployments, such as the global network of flux towers, are being augmented by the innovative use of personal mobile devices (such as using cell phones to detect earthquakes), data from social networks, and even citizen science. Many of those sensors allow for adaptive feedback or can be combined with actuators that can control the experiment's environment.
Driven by the proliferation of personal sensors marketed at large scales, technological progress (battery life, alternative energy sources, miniaturization, and so on), and economic factors (price), this trend is likely to continue accelerating, offering unprecedented opportunities to science.
The online analysis needs of such instruments at large are more challenging than those of traditional instruments. Although traditional instruments present demanding, but roughly known and finite, requirements for online processing, an instrument at large consists of a dynamic set of sensors that can become active or inactive at different times, for example, as their batteries run out, darkness prevents taking pictures, or social network sources become inaccessible. Furthermore, using carefully selected streams of spatial data from a variety of sources, scientists can uniquely customize experiments to answer specific questions.
Such levels of dynamicity and customization, however, will result in unpredictable and highly volatile requirements for the instruments that provide computational support. Moreover, based on the online inspection of data streams, a user may want to modify an experiment by accessing additional data streams or moving mobile sensors to different locations. Doing so requires a responsive infrastructure, capable of performing such an inspection within the required time constraints. Finally, whereas an experiment on a traditional instrument often has a well-defined beginning and end, experiments supported by instruments at large can, and often do, go on indefinitely. Such experiments require always-on, highly available computational support as incoming data is filtered, reduced, correlated, processed, and stored.
Given these requirements, computational support for this kind of experiment is clearly no longer an option; rather, it constitutes an inherent and indispensable component of an instrument at large.

The groundbreaking possibilities created by such instruments will make them widely useful and a focus of activity over the coming years. The on-demand availability provided by cloud computing will be a fundamental building block for the support of experimental science, but further development is necessary to combine it with the additional infrastructure that satisfies the timeliness, scalability, and reliability requirements of experiment-driven processing. This situation places new emphasis and urgency on investigating the applicability of infrastructure clouds in science and shaping their capabilities and ecosystem into a viable and responsive scientific tool.

Clouds as Enablers of On-Demand Science
The opportunities offered by on-demand and data-driven science are compelling and could dramatically impact science, engineering, and society. Clouds can help make this vision a reality in multiple ways. They can provide resources for running applications on demand when local infrastructures are unavailable. They can also supplement existing systems by providing additional capacity or complementary capabilities to meet heterogeneous or dynamic needs. For example, clouds can serve as accelerators or provide resilience to scientific workflows by moving the workflow execution to alternative resources when a failure occurs.
Current cloud installations can also provide effective platforms for certain classes of computational and data-enabled science and engineering (CDS&E) applications, for example, high-throughput computing (HTC) applications. The cloud abstraction's simplicity can also alleviate some of the problems that scientific applications face in current HPC environments. For example, the increasingly important and growing class of many-task computing (MTC) applications can benefit from the ease of use, the abstraction of elastic and readily accessible resources, and the ability to easily scale up, down, or out. Finally, the abstractions provided by the cloud model will allow scientists to address their problems
more effectively and can even enable them to formulate their applications and algorithms in new ways.
Several early projects have reported successful deployments of such applications on existing clouds.3,4 Running these applications typically involves using virtualized commodity-based hardware, which is provisioned on demand at commercial cloud providers such as the Amazon Elastic Compute Cloud (EC2) or Microsoft Azure. A recent technical report by Geoffrey Fox and Dennis Gannon provided an extensive study of HPC applications in the cloud.5 According to that study, commodity clouds work effectively only for specific classes of HPC applications. Examples include embarrassingly parallel applications (those that are efficiently parallelized into components with little or no communication and can withstand latency across networked systems) that analyze independent data or spawn independent simulations, applications that integrate distributed sensor data, science gateways and portals, and data analytics that can use MapReduce-like formulations.
However, although cloud computing has been around for more than a decade, has developed rapidly in that time, and has been enthusiastically adopted by many branches of industry, its broad adoption in science has been slow. Several factors play a part in the applicability of cloud computing to science.

Moving the Mountain
When we consider whether clouds can provide a suitable platform for HPC, we tend to think in terms of whether the mountain will come to us; in other words, will cloud computing evolve to suit the needs of traditional HPC? Although clouds can accommodate ever-larger computations (the largest known virtual cluster currently stands at 156,000 cores), the characteristics of such supercomputers differ from those of traditional HPC machines. The most frequently cited reasons for the lack of adoption of cloud computing in science are technical factors: inadequate performance and management options, size, and a lack of understanding of reliability concerns of large computations.
The impact of these limitations can be significant.6 As the underlying technology that allows cloud providers to let users project their images onto their infrastructure securely, quickly, and reliably to support on-demand availability, virtualization imposes a performance penalty. Furthermore, research is still ongoing regarding how best to integrate cloud computing with hardware accelerators and fast communication hardware such as InfiniBand. Work on lightweight hypervisors holds out promise that, in the future, virtualized supercomputers can offer performance close to that of HPC resources.7 However, a host of features, ranging from HPC-specific resource management to reliability, makes computing at large scales using infrastructure clouds a challenge.
Other ways of looking at this question may exist, however: we may come to the mountain. As HPC resources grow in scale to include ever-larger numbers of processing elements, sustaining a model in which all of them must reliably move in lockstep is increasingly hard. This situation opens the possibility of turning to more loosely coupled and asynchronous computational models. Key research challenges here include exploring application formulations that can effectively utilize clouds and addressing, at the algorithmic level, the implications of cloud characteristics such as elasticity, on-demand provisioning, virtualization, multitenancy, and failure. For example, asynchronous replica exchange8 is a novel, decentralized, asynchronous, and resilient formulation of the replica exchange algorithm for simulating protein structure, folding, and dynamics.
Another key research challenge is developing appropriate programming models and systems that can enable CDS&E applications to take advantage of clouds. These include developing programming abstractions and tools to support the federation of clouds and CIs to provide elastic access to cloud services and extending existing cloud programming models and platforms (such as MapReduce and BigTable) to support scientific computing. Additionally, entire applications will need to be exported, creating a need for new application patterns and kernels, optimized libraries, and/or specialized middleware as a service. New tools will thus be necessary for application debugging, validation management, and performance engineering.

Wholesale to Retail
A more insidious problem with cloud computing adoption within the scientific community is social

24 | IEEE Cloud Computing | www.computer.org/cloudcomputing


rather than technological. Perhaps the most important aspect of the cloud computing disruption is that it has revolutionized our idea of resource procurement. Instead of buying a system wholesale to run a certain class of computations (an investment that can cost millions of dollars to buy, house, build, and operate), we can now shop retail and spend only a few thousand dollars on a per-computation basis as the need arises. This capability makes the time-capacity product more flexible. For example, instead of buying a small cluster and waiting a year for a computation to complete, researchers can now rent a large cluster for a short time and complete the computation using all the available resources.
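The flexibility of the time-capacity product is easy to see with a little arithmetic. In the sketch below, every number (core counts, prices, ownership costs) is an invented illustration rather than a quote from any provider; only the shape of the trade-off matters.

```python
# Back-of-the-envelope "wholesale vs. retail" comparison for a computation
# that needs 1 million core-hours. All figures are illustrative assumptions.
CORE_HOURS = 1_000_000

# Wholesale: buy and operate a small 128-core cluster.
owned_cores = 128
owned_cost = 500_000  # assumed purchase + housing + power + staff
owned_wall_hours = CORE_HOURS / owned_cores   # 7,812.5 h, roughly 11 months

# Retail: rent 4,096 cores on demand at an assumed $0.05 per core-hour.
rented_cores = 4_096
rented_wall_hours = CORE_HOURS / rented_cores  # ~244 h, roughly 10 days
rented_cost = CORE_HOURS * 0.05                # $50,000 for this one run

# Same time-capacity product (core-hours), very different wall time and
# very different procurement commitment.
print(f"owned:  {owned_wall_hours:7.1f} h wall time, ${owned_cost:,} up front")
print(f"rented: {rented_wall_hours:7.1f} h wall time, ${rented_cost:,.0f} per run")
```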
Making this equation work, however, requires admitting that there is a premium on time: acknowledging, in other words, that the instantaneous, on-demand availability of a resource is worth more than batch cycles and that this should be reflected in the market price of time on said resources. Currently, established funding, procurement, and allocation systems aren't equipped to deal with such a nuanced and multifaceted concept of worth, even if it can bring substantial benefits.
And now that we have computing power on tap, turning the tap on proves to be a nontrivial operation. Previously, maintenance and user support were provided as part of the wholesale purchase; a traditional cluster user would expect it to be configured and upgraded as needed and to include all the standard software. In contrast, the cloud currently provides some features, such as resource availability, but doesn't provide other features, such as virtual machine configuration. Moreover, choosing the optimal configuration among the myriad cloud offerings (including diverse services, instance types, billing models, storage options, and providers) requires special expertise and a significant time commitment.
The Case of the Missing Infrastructure
Many of the challenges we outline here are, arguably, merely the growing pains of a deceptively simple but deeply disruptive innovation. Certainly, many of them can be resolved with a research and development investment in critical ecosystem components and by creating new support relationships that provide the necessary layer between users/applications and cloud services.

A relatively short-term challenge is establishing a cloud ecosystem that can enable and drive research and can address issues related to deployment
and transition to practice. Research issues include the definition of community standards, development of community testbeds and benchmarks, documentation of experiences and best practices, and development of curricula and training modules. Providing appliances that can be automatically rendered as consistent sets of images working across virtual machine image formats (such as Xen and the Kernel-based Virtual Machine, or KVM) and cloud providers (including Amazon Web Services and Microsoft Azure) will provide a solution to the image problem. Similarly, higher-level abstractions for science at the platform-as-a-service (PaaS) and software-as-a-service (SaaS) levels can make clouds more accessible to scientists. Defining a community administrator in charge of the computing environment for a given community will also help. Services and informative benchmarks for service comparison, in terms of performance, consistency, and reliability, together with a better understanding and automated characterization of the load, can all help resolve the new complexity problem.

A platform layer that can take all this information and automatically build and maintain a user- or application-specific platform, clearly indicating the trade-offs and their implications, will make that tuning exercise simple. Middleware stacks and services are essential for supporting CDS&E application formulations and hybrid usage modes targeted to cloud and CI environments, including support for dynamic cloud bursting and infrastructure federation. A key research issue is the provisioning, scheduling, management, and optimization of these hybrid infrastructures with respect to multiple objectives, including performance, energy, cost, and reliability, and autonomic mechanisms that can balance these objectives at runtime. A related challenge is the interoperability between cloud providers and the creation of cloud federations. Data management research challenges exploring the different types of cloud storage solutions and the nature of cloud connectivity are also important; specific issues include support for selecting from among the diverse storage
CLOUDS AND SCIENTIFIC COMPUTING

options with varying service levels, network architectures to support data transport needs and their interaction with cloud storage offerings, and the colocation of computing and data. Combining those two areas of exploration into support for cyberphysical systems will ultimately provide a viable platform for instruments at large.

Lack of understanding of security and privacy issues as they relate to clouds is a critical barrier to adoption, especially in areas dealing with private data, such as biomedical applications. Clouds renegotiate the security space, with new types of attacks proposed all the time, emphasizing the need for high-quality security mechanisms because of the sharing of storage and computing.

In addition to crosscutting cloud security challenges, specific issues related to cloud and CI integration with CDS&E include interoperability with broader CI security mechanisms and policies, such as single sign-on and federated identity management (such as InCommon, CILogon, and SCIM), and security policies and mechanisms for specific applications (including differential privacy and data anonymization requirements for bio/medical informatics applications). Investments in homomorphic or partial homomorphic encryption are driven largely by the needs of those applications.

The Path Forward

What can we do to overcome obstacles to adoption of a promising innovation and catalyze its impact? We propose several ideas that can accelerate the development of cloud computing capabilities relevant to science and promote an understanding of the impact of its power.

Throw Down a Challenge

A well-defined challenge that captures the gradient of missing capabilities can be an effective vehicle of progress. Successive milestones in responding to such a challenge can be a good yardstick for judging the state of the art of a promising technology, as projects such as the Top 500 list have successfully demonstrated. Such challenges are also effective in nucleating a community. On the cloud computing frontier, such challenges to date have been driven more by what we can answer than by what we'd like to know. This approach highlights the strengths of a technology, but it doesn't fully relate it to the context of surrounding requirements. Offering specific problem formulations as well as benchmarks and metrics in collaboration with the scientific community will help address this shortcoming and highlight areas in which additional work is necessary.

Construct Experimental Testbeds

An open, reconfigurable experimental testbed, large enough to reflect the scale appropriate to handle the big data and big compute challenges we face, is as critical to the advancement of computer science as large instruments such as the LHC are to the advancement of the physical sciences. A testbed alone is insufficient, however. Data that can lead to specific problem formulations, such as cloud utilization data, is of critical importance as well. This data is often available only from commercial providers, and thus collaboration between academia and industry emerges as a critical ecosystem element of such a testbed.

Another critical enabler is the deep familiarity with specific usage patterns that can be obtained only by working directly with application scientists. Open access to such resources, problems, and data will create a community operating within the same collaborative context and thus capable of creating research that is more than the sum of its parts. A viable experimental testbed should therefore place emphasis on building such a community.

Standards, Policies, and Practices

Many important standards activities exist, from those specifying the basic virtual machine structure to higher-level standards defining the PaaS/SaaS environment. Although these standards, such as the Open Grid Forum's Open Cloud Computing Interface (OCCI) in OpenNebula and OpenStack, have some support, this area is still under development, with the US National Institute of Standards and Technology (NIST) and IEEE playing leadership roles. In addition, substantial progress is needed to enable the procurement of services through capped purchase orders or subcontracts; subaccount administration; resource and authority delegation; and monitoring, managing, and reporting. Furthermore, the development and modification of codes adapted


to the cloud environment require a unique skill set that necessitates appropriate educational and training structures.

Practice Makes Perfect

Some problems can be fully understood and resolved only by facilitating the use of clouds in practice, in the context of specific applications or application groups, and by experiencing and solving problems on the fly. Encouraging cloud-based application platforms will lead to offerings that solve practical problems and thereby generate more confidence, familiarity, and expertise related to this emergent platform.

Although industry has enthusiastically embraced cloud computing, and it has demonstrated enticing possibilities for various branches of science (particularly those that place a premium on on-demand availability, such as the experimental sciences), cloud computing currently runs the risk of getting stuck crossing the chasm between potential and reality in its broad application to scientific problems. This impasse is due to the computationally demanding nature of scientific applications, both in terms of performance and infrastructure support, as well as the lack of economic flexibility in the scientific environment. Catalyzing progress in this space is essential before the potential of clouds as enablers for science can be realized.

As we look to the future and ponder the needs of technologies underlying future experimental instruments that integrate computation as an inherent component, we can see this will become all the more important. Such computations will rely on the on-demand availability and control over the environment provided by infrastructure clouds. They will also require support for the big compute applications that are currently running in HPC centers. Finding ways to overcome the performance, usage modes, and infrastructure barriers currently dividing clouds and HPC is therefore of primary importance.

References
1. M. Parashar et al., "Cloud Paradigms and Practices for Computational and Data-Enabled Science and Engineering," Computing in Science & Eng., vol. 15, no. 4, 2013, pp. 10-18.
2. J. Balewski et al., "Offloading Peak Processing to Virtual Farm by STAR Experiment at RHIC," J. Physics Conf. Series, vol. 368, 2012.
3. E. Deelman et al., "The Cost of Doing Science on the Cloud: The Montage Example," Proc. 2008 ACM/IEEE Conf. Supercomputing, 2008, pp. 1-12.
4. K. Keahey and T. Freeman, "Science Clouds: Early Experiences in Cloud Computing for Scientific Applications," Proc. Cloud Computing and Its Applications, 2008, pp. 825-830.
5. G. Fox and D. Gannon, "Programming Models for Technical Computing on Clouds and Supercomputers (aka HPC)," Proc. Cloud Futures Workshop, 2012; http://research.microsoft.com/en-us/um/redmond/events/cloudfutures2012/monday/Plenary_ProgrammingParadigms_Geoffrey_Fox.pdf.
6. K. Yelick et al., The Magellan Report on Cloud Computing for Science, Office of Science, Office of Advanced Scientific Computing Research (ASCR), US Dept. of Energy, 2011.
7. J. Lange et al., "Minimal Overhead Virtualization of a Large Scale Supercomputer," Proc. 2011 ACM SIGPLAN/SIGOPS Int'l Conf. Virtual Execution Environments (VEE), 2011, pp. 169-180.
8. Z. Li and M. Parashar, "Grid-Based Asynchronous Replica Exchange," Proc. 8th IEEE/ACM Int'l Conf. Grid Computing, 2007, pp. 201-208.

KATE KEAHEY is a scientist in the Mathematics and Computer Science Division at Argonne National Laboratory and a senior fellow of the Computation Institute at the University of Chicago. She created and leads the Nimbus Project, recognized as the first open source infrastructure-as-a-service implementation, more recently focusing on infrastructure platform tools. Her research interests focus on resource management in cloud computing and cyberphysical systems. Keahey has a PhD in computer science from Indiana University. Contact her at keahey@mcs.anl.gov.

MANISH PARASHAR is a professor of electrical and computer engineering at Rutgers University. He is also the founding director of the Rutgers Discovery Informatics Institute (RDI2), site codirector of the NSF Cloud and Autonomic Computing Center (CAC), and the associate director of the Rutgers Center for Information Assurance (RUCIA). His research interests are in parallel and distributed computing, with a focus on computational and data-enabled science and engineering. He is a fellow of the IEEE Computer Society and AAAS. Parashar has a PhD in computer engineering from Syracuse University. Contact him at parashar@rutgers.edu.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.


CLOUD SECURITY

Practical Methods for Securing the Cloud

Edward G. Amoroso, AT&T

Combining the various methods of securing the cloud infrastructure, services, and content can help meet or exceed the protection benefits of a traditional enterprise perimeter.

The advantages of virtualizing servers, databases, and applications into the cloud are well known: hardware costs are reduced, content becomes more ubiquitous, and IT services can better adapt to an organization's changing needs. Such benefits have led to many new cloud initiatives, ranging from private cloud efforts behind corporate firewalls to the widespread use of publicly accessible cloud services such as Amazon Web Services (AWS). Despite the success of these cloud-based initiatives and services, concerns remain about security protection. The financial services community, for example, is engaged in a vigorous debate about whether public cloud services are secure enough for financial applications.1 The specific cloud threats generally cited include the compromise or unauthorized modification of cloud-resident financial data, as well as the possibility that denial-of-service attacks will cause cloud-resident financial data to become unavailable.
28 | IEEE Cloud Computing | Published by the IEEE Computer Society | 2325-6095/14/$31.00 © 2014 IEEE


Enterprise organizations and cloud service providers today are using several practical methods to secure their cloud infrastructure and services:

• A private cloud with enterprise perimeters is the most common large-enterprise approach to securing cloud content.
• A public cloud with service gateways involves popular cloud services used by millions of individuals and businesses today.
• Content encryption focuses on protecting data stored in the cloud from unauthorized compromise and leakage.
• Session containers ensure that data are properly removed from client devices such as mobile devices after cloud access.
• Cloud access brokers integrate security measures such as authentication or access monitoring for users accessing cloud services.
• Runtime security virtualization integrates dynamic runtime virtual security functions directly into virtual entities in the cloud.
By properly utilizing these practical cloud security methods, an organization can meet or exceed the existing security capabilities offered by its enterprise perimeter. The goal of this survey is to provide cloud decision makers with broader insight into how best to mitigate cloud-specific security threats, thereby making this technology more acceptable to a wider range of industries, environments, and applications.

Private Cloud with Enterprise Perimeters


The most common solution for enterprise organizations seeking to mitigate cloud security threats currently involves building a virtual infrastructure inside an existing corporate firewall (see Figure 1). With this approach, enterprise perimeter-protected datacenters host cloud services and/or are used to virtualize applications. These services and applications are accessible only to users who have been properly authenticated and securely admitted to the corporate intranet. This is a mature security approach that's consistent with existing protection strategies for all other enterprise assets.

Using a private cloud infrastructure within the enterprise, an organization gains the advantages of
software virtualization, such as reduced hardware costs through shared virtual machines with high utilization, but without the security concerns that come from ubiquitous, open access. Enterprise auditors and regulators approve of this architecture because the familiar perimeter remains a primary control for security compliance. The safeguards inherent in the private cloud approach include the following:
• Identity and access. A private cloud available for internal enterprise users is easily integrated with existing identity and access management functions, such as corporate directory services. Products such as the IBM Security Identity Manager and Security Access Manager, for example, provide customizable identity and access management support for private cloud deployments.
• Firewall, IDPS, and DLP. Private clouds mediate external access from untrusted, nonenterprise users via the corporate firewall, an intrusion detection/prevention system (IDPS), and a data loss prevention (DLP) tool. Cisco Systems, for example, offers intrusion detection and prevention signatures that protect private clouds utilizing an enterprise perimeter.
• Encryption. Private clouds can integrate enterprise encryption capabilities, including key management, to further protect cloud-resident content. Encryption solutions from companies such as Check Point Software support integration into cloud-resident data storage.
• SIEM analytics. Integration is usually straightforward between a private cloud and the enterprise security information and event management (SIEM) system, providing data analytics and incident response processes and tools. HP's ArcSight SIEM, for example, is often used in conjunction with a private cloud deployment.
Figure 2 illustrates the security architecture for
a typical private enterprise cloud.
The challenge associated with private cloud
implementations is that, despite the perimeter solutions available to protect the enterprise, the typical
organization is still unable to stop attacks such as
advanced persistent threats (APTs) from the Internet. In addition, complex policy-based decisions

FIGURE 1. Private cloud with enterprise perimeter. As the most common solution for enterprise organizations, this mature security approach is consistent with existing protection strategies.

made over long periods of time to allow a multitude of enterprise services and approved exceptions
through the corporate firewall, combined with the
increasingly common method of bypassing the perimeter using mobile devices, have rendered the
enterprise perimeter essentially useless from an advanced threat perspective.2
An additional fatal issue with private clouds is that enterprise security teams can't stop determined insider attacks. Even in the presence of segregation-of-duty controls, as with Sarbanes-Oxley-relevant systems, the approach is vulnerable to collusion, which is easy to achieve with malware on multiple compromised systems. Thus, by situating a private cloud inside the enterprise and assuming that internal access can be trusted, an organization places its cloud infrastructure at direct risk of compromise.
The result is that private cloud infrastructures have devolved into architectures that are indistinguishable, at least to the security engineer, from public cloud systems. Purveyors of private clouds may have control over vendor selection, cloud service features, degree of sharing between users, and day-to-day system administration, but the idea that they're immune to external attacks because of enterprise perimeter protections is no longer justifiable. As such, private cloud deployments should never rely on an enterprise perimeter as their sole security control.

Public Cloud with Service Gateways


A second approach to cloud security involves using
the native protections in a public cloud service.
Public cloud service providers generally differentiate
their services via the familiar XaaS designation,
where X is a wildcard for infrastructure, compute,
software, storage, or even security. For the
purposes of this discussion, we can abstract such
distinctions to focus on the common underlying
architectures in the various public cloud services.
The primary public cloud security solution
involves dedicated service gateways in front of
the cloud platform, customer data, and business
support systems (see Figure 3). Specifically, cloud
users log into their accounts through dedicated


FIGURE 2. Private cloud security architecture. Private clouds may incorporate additional enterprise safeguards such as encryption and identity management to protect cloud-resident data.

service gateway interfaces, cloud administrators log into their accounts via side-channel infrastructure gateways, and cloud developers gain access to services and infrastructure through API gateways.

Admittedly, the gateway approach to protecting public cloud services is similar to the enterprise perimeter because chokepoints separate trusted, internal networks from untrusted, external users. The difference, however, is that public cloud service gateways are dedicated to the cloud service and aren't subject to the security risks of everyday enterprise usage (email phishing, firewall exceptions, direct mobile access, and so on).

Organizations such as financial services firms (as mentioned earlier) have expressed low confidence in public cloud security because of a perceived loss of infrastructure control. Such lack of confidence is inconsistent with the common reliance of business entities on shared services such as the Domain Name Service (DNS). Similarly, every organization must connect to the Internet through a service provider, which may introduce shared risks. The security controls in public cloud services include the following:

frastructure behind gateways integrated with


perimeter security functions, such as firewalls,
IDPS, and DLP. Public cloud services also typically deploy SIEM functionality inside the provider enterprise.
r User account security. The most basic security
primitive for cloud service provision is the user account. Key security issues include user authentication, provisioning controls, and the administrative
and access controls used to manage accounts.
r User separation. Cloud services include logical
separation mechanisms that prevent cascading
of malware across user accounts or break-ins
from one users cloud assets to another.
r Content distribution. A content distribution
network (CDN) reduces distributed denial-ofservice (DDOS) risk for public cloud services.
DDOS controls offered by Internet service providers can complement a CDN as well.
r Virtualized security capabilities. Public cloud offerings can bundle advanced security functions,
such as incident response, that some smaller enterprise customers or individuals might not be
able to afford.

r Service provider perimeter. Cloud service providers, like all service providers, run an in-

Figure 4 illustrates the typical security architecture for a public cloud service offering.
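The user-separation and gateway-mediation properties described above reduce to a small access-control check. The classes, tenant names, and policy below are illustrative assumptions for the sketch, not any provider's actual implementation:

```python
# Minimal sketch of per-tenant separation enforced at a cloud service gateway.
from dataclasses import dataclass

@dataclass(frozen=True)
class Principal:
    user: str
    tenant: str
    authenticated: bool

@dataclass(frozen=True)
class Resource:
    name: str
    tenant: str

def gateway_allows(p: Principal, r: Resource) -> bool:
    # 1) the gateway admits only authenticated principals;
    # 2) access is confined to resources in the caller's own tenant,
    #    preventing break-ins from one user's cloud assets to another's.
    return p.authenticated and p.tenant == r.tenant

alice = Principal("alice", "acme", authenticated=True)
mallory = Principal("mallory", "other", authenticated=True)
vm = Resource("vm-17", "acme")

print(gateway_allows(alice, vm))    # tenant matches -> True
print(gateway_allows(mallory, vm))  # cross-tenant access refused -> False
```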


FIGURE 3. Public cloud with service gateways. The gateway approach isn't subject to the security risks of everyday enterprise usage.

Many organizations choose to combine public and private clouds into a hybrid arrangement. Hybrid clouds introduce orchestration issues for security mechanisms that differ between the component clouds. For example, identities established in one cloud will require federation to other hybrid elements. The adoption of a public cloud for dedicated or even hybrid arrangements thus requires a degree of transparency on the part of the public cloud service provider.
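The identity-federation step can be sketched as a signed assertion that one cloud issues and another verifies against a shared trust anchor. The token format below is a toy stand-in for real federation standards such as SAML or OpenID Connect; all names and keys are hypothetical:

```python
# Toy identity federation across a hybrid cloud: an identity established in
# one cloud is asserted to the other via an HMAC-signed claims token.
import hashlib, hmac, json

FEDERATION_KEY = b"shared-trust-anchor"  # established between the two clouds

def issue_assertion(home_cloud: str, user: str) -> str:
    claims = json.dumps({"iss": home_cloud, "sub": user}, sort_keys=True)
    sig = hmac.new(FEDERATION_KEY, claims.encode(), hashlib.sha256).hexdigest()
    return claims + "." + sig

def accept_assertion(token: str) -> dict:
    claims, sig = token.rsplit(".", 1)
    expected = hmac.new(FEDERATION_KEY, claims.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("assertion not issued by a federated partner")
    return json.loads(claims)

token = issue_assertion("private-cloud", "alice")  # identity established once
claims = accept_assertion(token)                   # honored by the other cloud
print(claims["sub"], "accepted from", claims["iss"])
```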
Organizations considering the use of public cloud services must analyze whether advantages such as Internet-facing ubiquity outweigh the risks inherent in any shared, external service. These risks will vary between providers, as in, for example, the ability of a given provider to fend off denial-of-service attacks. Dropbox is a popular public cloud service that provides security solutions, including strongly authenticated user accounts through gateways and user control of permission settings through an account management tool. Nevertheless, public cloud users should integrate additional security controls such as the ones described in the remaining sections here.

Content Encryption
To address data confidentiality, cloud encryption is generally designed to ensure that cloud-resident content can't be retrieved as plain text by APT malware or by compromised insiders with direct access inside a perimeter. The encryption algorithm's strength and key management should be based on risk analysis. Encryption tools can be integrated on top of a public or private cloud infrastructure or can be selected from native encryption features offered by the cloud service provider (see Figure 5). The over-the-top encryption approach lets users maintain control of key management and infrastructure, but it usually increases costs.

Cloud encryption works only if the underlying cryptographic algorithm or supporting key management can't be broken. Strong, resilient ciphers that have withstood expert cryptanalysis are readily available, so the primary focus is generally on the security of the underlying key management. Stated simply, if malicious actors can easily gain access to decryption keys, then encrypted cloud storage is useless. The primary security requirements for encrypted content in the cloud are as follows:


FIGURE 4. Public cloud security architecture. Cloud service providers run an infrastructure behind gateways integrated with perimeter security functions.

• Stored data secrecy. Encrypting cloud data prevents backdoor leakage and restricts access to privileged users and administrators. Many companies provide encryption for cloud systems' data at rest, including Pawaa, which encrypts files at the device before they are sent off to the cloud infrastructure for storage.
• Cloud storage malware resistance. Encryption provides malware resistance for stored data, especially in the case of remote access tool (RAT) attacks that target individuals with authorized access to data. Additional tools exist to ensure that malicious users don't insert malware directly into the cloud. Companies such as CipherCloud, for example, include filters that scan inbound and outbound cloud content for the presence of malware.
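The encrypt-before-upload pattern can be sketched as follows. The SHA-256-based keystream and single shared key below are toys for illustration only; a production deployment would use a vetted AEAD cipher such as AES-GCM from an audited library, with separate, properly managed encryption and authentication keys.

```python
# Sketch of client-side "encrypt before upload": the key is generated and
# used on the device, so the cloud provider stores only ciphertext.
import hashlib, hmac, os

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy PRF-based keystream (counter mode over SHA-256), NOT for production.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt_for_upload(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)
    body = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()  # integrity tag
    return nonce + body + tag

def decrypt_after_download(key: bytes, blob: bytes) -> bytes:
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, nonce + body, hashlib.sha256).digest()):
        raise ValueError("ciphertext tampered with in the cloud")
    return bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body))))

key = os.urandom(32)  # never leaves the client device
blob = encrypt_for_upload(key, b"quarterly results")
print(decrypt_after_download(key, blob))  # b'quarterly results'
```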
The functional requirements for most cloud ciphers include maintaining search capabilities for stored data as well as the ability to perform big data analysis. CipherCloud's Searchable Strong Encryption (SSE) is one example. Such interoperability with public, private, or hybrid cloud capabilities and associated business processes is an important requirement for encryption solutions. Cloud federation and orchestration of key management infrastructure in hybrid systems require a bit more attention, but they're still practically workable.
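To make the search requirement concrete, here's a minimal sketch of a blind-index scheme: keywords are hashed with a tenant-held HMAC key, so the provider can match search tokens against the index without ever seeing plaintext keywords or the key. This illustrates the general idea only; it is not CipherCloud's actual SSE design, and all names are our own simplification.

```python
import hmac
import hashlib

def keyword_token(key: bytes, word: str) -> str:
    # Deterministic keyed hash: the provider can match tokens
    # without holding the key or seeing the plaintext keyword.
    return hmac.new(key, word.lower().encode(), hashlib.sha256).hexdigest()

def build_index(key: bytes, doc_id: str, text: str) -> set:
    # The index is stored in the cloud next to the encrypted document.
    return {(keyword_token(key, w), doc_id) for w in text.split()}

def search(index: set, key: bytes, word: str) -> set:
    # The client derives a token; the provider matches it blindly.
    token = keyword_token(key, word)
    return {doc for tok, doc in index if tok == token}

key = b"tenant-held key, never shared with the provider"
index = build_index(key, "doc-1", "quarterly payroll report")
index |= build_index(key, "doc-2", "marketing roadmap draft")

print(sorted(search(index, key, "payroll")))  # ['doc-1']
```

A real deployment would also have to resist frequency analysis of the tokens, which is part of what distinguishes production SSE products from this toy.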

Session Containers
A cloud security solution for mobile access to a public cloud involves a session container (see Figure 6). The idea is that any user interested in obtaining access to cloud services or content would initiate a secure connection that would maintain end-to-end closure, not unlike the way HTML5 sessions are encapsulated between the browser and website. Such closure usually requires a software client-server arrangement with the provision that no residual information exists on the client device after the session has been completed.
A key consideration for session containers involves support for multiple personas. Bring-your-own-device (BYOD) environments, for example, require differentiation between corporate personas, where session-contained access to proprietary applications such as payroll systems is done under a corporate persona. Correspondingly, access to non-business-relevant applications such as games or YouTube is done under non-session-contained access in a noncorporate persona.

Previous Page | Contents | Zoom in | Zoom out | Front Cover | Search Issue | Next Page

33
M
q
M
q

M
q

MqM
q
THE WORLDS NEWSSTAND

Previous Page | Contents | Zoom in | Zoom out | Front Cover | Search Issue | Next Page

M
q
M
q

M
q

MqM
q

CLOUD SECURITY


FIGURE 5. Encrypted content in the cloud. Encryption tools integrated on top of a public or private cloud infrastructure can further protect cloud-resident content.


One additional consideration is the degree to which data that temporarily reside on client systems are properly wiped. Algorithms for secure wipe are available, and session container users should check with their vendor to ensure acceptable implementation of well-known standards.4 Session containers provide security benefits for cloud services in the following functional areas:

• Client system data wipe. Session containers ensure that, once a user has completed access to a cloud-resident object such as an application, the associated data are properly wiped from the client device. Invincea, for example, provides a session container solution that allows for access to cloud applications from a variety of devices, such as mobile smartphones, and wipes the data securely afterward.
• Data separation. Session containers provide dynamic separation of different user activities within the cloud. The separation is enforced at the client and server levels by controls that keep data from being intermingled with resources outside the container. The company Bromium uses hardware assistance to ensure trusted separation during user access to cloud resources.
• Multiple persona support. The idea of compartmentalizing different personas on a client device is long established in computer security. Modern implementations of BYOD programs using session containers generally allow granularity at the persona or application level. AT&T's Toggle product, for example, provides flexible multiple persona support with the ability to create customized server controls.
Most session containers include support for end-to-end encryption, although this may not be required for less critical applications, especially in private clouds. Encryption might incur minor additional overhead and additional key management infrastructure support. When end-to-end encryption is employed, integrity, secrecy, and authentication functions are supported on a per-session basis.
The biggest practical issue with session containers is whether they work with legacy computing. In complex environments, session containers often can't create the runtime support environment required for user access and local computing requirements. Thus, local testing is necessary to determine the feasibility of this approach. Nevertheless, organizations are advised to integrate session containers into their use of public, private, or hybrid clouds.
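The wipe-after-session behavior can be sketched as a context manager that confines all session data to a throwaway workspace and overwrites it on exit. This is an illustrative toy, not a vendor implementation; as noted above, a best-effort overwrite on SSDs and journaling filesystems does not guarantee erasure, which is why checking vendor conformance to standards matters.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def session_container():
    # Confine all session data to a throwaway workspace.
    workdir = tempfile.mkdtemp(prefix="cloud-session-")
    try:
        yield workdir
    finally:
        # Best-effort overwrite before deletion. On SSDs and
        # journaling filesystems this does not guarantee erasure.
        for root, _, files in os.walk(workdir):
            for name in files:
                path = os.path.join(root, name)
                with open(path, "r+b") as f:
                    f.write(b"\x00" * os.path.getsize(path))
        shutil.rmtree(workdir)

with session_container() as wd:
    with open(os.path.join(wd, "cached.dat"), "wb") as f:
        f.write(b"cloud-resident content")

print(os.path.exists(wd))  # False: no residual session data
```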


FIGURE 6. Session container. The user may obtain access to cloud services or content via a secure connection that maintains end-to-end closure.

Cloud Access Broker
A security method that provides additional security capability for cloud application usage involves the use of a broker that either observes or integrates with the authentication path from users (see Figure 7). For example, the Gartner Group has introduced a concept called a Cloud Access Security Broker, which includes more functionality (such as encryption support) than just proxy or simple gateway services.3 Nevertheless, this article uses the terms gateway, proxy, and broker synonymously. The idea behind such man-in-the-middle security functionality is that when any user decides to access a cloud-based application, a special broker, often implemented as a forward or reverse proxy, can be used to provide enhanced security.
Brokers can be passive, in which case indicators and security statistics are provided, or active, in which case in-line mitigation is possible. If the cloud access is encrypted, as in a session container, the cloud access broker will require the ability to interrupt the end-to-end secrecy. Providing certificates and keys to brokers has always been an issue of some debate, because it breaks the end-to-end nature of the client-server secrecy.
Brokers implemented as proxies have been included in security architectures for many years. Positioning proxies at the perimeter has been the basis for several growing successful companies, such as Blue Coat Systems, which offers a proxy solution for enterprises that works well with private clouds. With cloud access to public or hybrid clouds, the proxy must be more ubiquitous and virtual because no perimeter exists. Blue Coat has successfully virtualized its proxy capability to support this type of use.
The specific security advantages of the cloud proxy method include the following:

• Passive security monitoring. Off-line cloud access brokers can passively collect statistics about the use of cloud services. This may be desirable for organizations that want to better understand the intensity and nature of public cloud use from the enterprise. Adallom provides a cloud access proxy tool that resides in the authentication path between clients and cloud applications for the purpose of collecting information for security teams.
• Active security mitigation. Cloud access proxies in active mode can mitigate malware or policy violations in real time. Generally, such a capability is similar to a Web application firewall (WAF), even when the solution resides between clients and cloud solutions, rather than applications on Web sites. Qualys offers a WAF solution called QualysGuard that integrates well with common cloud services such as AWS.
Cloud access brokers allow for flexible integration of new security capabilities. A provider can modify an existing broker collecting information about one attribute to collect information about another property. Similarly, an in-line broker such as a WAF can be adjusted to meet a changing policy or maintain consistency with changes in an application. This is a double-edged sword, however, because changes in applications require that the WAF also be adjusted. Thus, WAF maintenance is more complex than traditional five-tuple firewalls, which often require no rule changes when applications are modified.

FIGURE 7. Cloud access broker. Brokers, often implemented as a forward or reverse proxy, can provide either passive security monitoring or active security mitigation.
Nevertheless, broker solutions for cloud access will likely be important components in cloud security architectures in the coming years. They can help simplify access from an organization to multiple vendor clouds, for example. They're also comparable in their operation to familiar security tools such as firewalls, so compliance auditors should accept brokers as suitable control replacements as organizations virtualize on the cloud.
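The passive/active distinction described above can be sketched in a few lines: the same broker component always records usage statistics, and only in active mode does it additionally enforce policy in line. The class and policy names here are hypothetical illustrations, not any vendor's API.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class CloudAccessBroker:
    # Passive mode only records statistics; active mode also blocks.
    active: bool = False
    blocked_apps: frozenset = frozenset()
    stats: Counter = field(default_factory=Counter)

    def handle(self, user: str, app: str) -> bool:
        self.stats[app] += 1                  # passive monitoring
        if self.active and app in self.blocked_apps:
            return False                      # in-line mitigation
        return True

broker = CloudAccessBroker(active=True,
                           blocked_apps=frozenset({"file-sharing"}))
broker.handle("alice", "crm")
allowed = broker.handle("bob", "file-sharing")
print(allowed, dict(broker.stats))
```

Note that a real in-line broker sits in the authentication or network path and, for encrypted sessions, needs access to certificates and keys, the trade-off discussed above.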

Runtime Security Virtualization
The most innovative security solution in the cloud ecosystem involves the dynamic creation of runtime security virtualization. The idea is that as the computing, storage, and infrastructure are embedded in a virtual runtime system, security functions such as firewall, IDPS, and DLP should be embedded in the same environment (see Figure 8). The result is the dynamic creation of runtime security components that are virtualized alongside the cloud objects they're intended to protect.
An example of runtime virtualization involves the provision of a virtual WAF to protect an HTTP application. Traditionally, the application resides on a physical Web server, accessible by users with browsers. Any WAF can be inserted physically into the network access path, either as a proxy or gateway function. If we port that application to a virtual machine on a hypervisor-based system integrated into a cloud platform, then the WAF can be virtualized as well. In particular, the WAF would exist as a virtual machine appliance woven into the execution, tracing all users accessing the application from their mobile, desktop, or other device.
The primary security controls offered by the runtime virtualization approach to cloud security include the following:

• Security for dynamic objects. As objects such as virtual machines are created in the cloud, the security protections associated with such objects are created dynamically. In essence, they create a customized runtime environment for the cloud object. AWS offers this type of protection for many of its services in conjunction with companies such as Tenable Systems and AlertLogic.
• Tunable policy based on assets. With runtime security virtualization, different assets that reside together in the same cloud can be associated with different security protections. Because providers can customize security, an object with a low security risk might have light functional
FIGURE 8. Runtime virtualization. Runtime security components are virtualized alongside the cloud objects they're intended to protect.

protections, whereas another object with high risk might include multiple, more intense security functions. Catbird provides a cloud security platform that includes virtual machine appliances that allow for customization of protection across different assets.
• Flexible security vendor management. The dynamic nature of virtual runtime protections allows for multiple layers of defense using different security vendor products. In addition, if a vendor is no longer desired or needed, it can be easily decommissioned from the runtime environment by simple changes in API calls.
• Expandable security protections. During an event such as a DDoS attack, a provider can dynamically expand the runtime environment to include more protection. Layer 7 intrusion-detection support for DDoS protection from companies such as Radware can be virtualized to expand during a major attack and contract afterward.
The advantages of this runtime approach have led to the planning and development of security marketplaces for cloud service users. AWS has already established an impressive portfolio of security companies offering dynamic runtime protection. VMware includes support for such runtime protection as part of its native suite of cloud services. Service providers such as AT&T are also in the process of creating similar marketplace offerings for their customers.
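A rough sketch of the provisioning flow in Figure 8: security functions are selected from a catalog according to the object's assessed risk and can be expanded at runtime during an attack. The catalog contents and function names are invented for illustration; a real marketplace would attach actual virtual appliances rather than strings.

```python
# Hypothetical provisioning flow: security functions chosen from a
# catalog by assessed risk, expandable at runtime during an attack.
SECURITY_CATALOG = {
    "low":  ["firewall"],
    "high": ["firewall", "idps", "dlp", "siem"],
}

def provision(name: str, risk: str) -> dict:
    # The cloud object is created together with its protections.
    return {"name": name, "security": list(SECURITY_CATALOG[risk])}

def expand_protection(obj: dict, function: str) -> None:
    # During an event such as a DDoS attack, add capacity in line.
    obj["security"].append(function)

vm = provision("payroll-db", "high")
expand_protection(vm, "layer7-ddos-scrubbing")
print(vm["security"])
```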

The most important consideration in securing cloud services and infrastructure is whether the methods selected can properly mitigate relevant threats. Many readers would put compliance at the top of their priority list, especially with respect to winning customer contracts for cloud services, but compliance frameworks measure attention to management process rather than whether a target system is actually secure. Industry groups such as the Cloud Security Alliance (CSA) have done a good job advancing this notion of security versus compliance.
Determining whether a given arrangement
of practical cloud security methods for some
environment can sufficiently mitigate threats
must include two important thresholds. First, it
must be determined whether the cloud security
methods provide equivalent protection to an existing
perimeter, because the vast majority of practical
cases will include a legacy enterprise demilitarized
zone (DMZ). Second, it must be determined whether
the cloud security methods adequately mitigate
advanced attacks, which implies some target degree
of protection higher than existing cloud solutions.
The possible solutions for threat protection are
either below the state of the practice (solution A in
Figure 9), above the state of the practice but below
target protection levels (solution B in Figure 9), or
above target protection levels, but below perfect
(solution C in Figure 9). Although accurate threat
assessment requires a detailed investigation of local
assets, possible threat vectors, and the consequences


FIGURE 9. Security protection effectiveness. Today, solutions for threat protection range from below the state of the practice (solution A), to above the state of the practice but below target protection levels (solution B), to above target protection levels but below perfect (solution C).

of an attack, we can argue that by combining the practical methods for securing the cloud described here into a cohesive security architecture, cloud providers and users can achieve approaches consistent with solution C.
Specifically, by arranging our cloud methods into broad equivalence classes, motivated by the original Orange Book security criteria,5 a hierarchy emerges. Readers can create classes as they see fit for their environment, but here's one possible approach:
• Cloud security solution A implementation (below the state of the practice): utilize a public, private, or hybrid cloud with no additional protections beyond perimeters and gateways.
• Cloud security solution B implementation (above the state of the practice, but below target): utilize a public, private, or hybrid cloud with full integration of perimeter protections into the cloud infrastructure, encryption of stored data, and session containers for client access to critical data and applications.
• Cloud security solution C implementation (above target, but below perfect): utilize a public, private, or hybrid cloud with full integration of perimeter protections into the cloud infrastructure, encryption of stored data, session containers for client access to critical data and applications, proxy access capabilities for authentication and monitoring, and dynamic runtime protection for all cloud objects based on a threat assessment.
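One way to operationalize these classes is a simple checklist: map the controls an environment actually deploys onto the highest class whose requirements it fully satisfies. The control names below are our own shorthand for the article's criteria, not a standard vocabulary.

```python
# Shorthand control names mapping to the article's solution classes.
CLASS_REQUIREMENTS = {
    "C": {"perimeter_integration", "storage_encryption",
          "session_containers", "access_broker", "runtime_protection"},
    "B": {"perimeter_integration", "storage_encryption",
          "session_containers"},
}

def classify(controls: set) -> str:
    # Highest class whose full requirement set is deployed; else A.
    for level in ("C", "B"):
        if CLASS_REQUIREMENTS[level] <= controls:
            return level
    return "A"

print(classify({"perimeter_integration", "storage_encryption",
                "session_containers"}))  # B
```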
For organizations that currently protect their
data using an enterprise perimeter with presumed
trust for insiders, the cloud security solution
C approach would provide a higher degree of
protection for their data in public, private, or hybrid
clouds because it addresses insider threats and

APTs without dependence on any perimeter. In this


scenario, the migration of data, applications, and
systems to the cloud, even in critical environments
such as financial services, should be immediately
adopted to promote both IT and cybersecurity
objectives.
References
1. "Cloud Computing in the Finance Industry," panel discussion, New York Technology Council, 27 Mar. 2014; https://www.nytech.org/events/Cloud_Finance.
2. E. Amoroso, "From the Enterprise Perimeter to a Mobility-Enabled Secure Cloud," IEEE Security & Privacy, vol. 11, no. 1, 2013, pp. 23-31.
3. Gartner, "IT Glossary," 2013; www.gartner.com/it-glossary/cloud-access-security-brokers-casbs.
4. US Dept. of Defense, National Industrial Security Program Operating Manual, DOD 5220.22-M, Nat'l Industrial Security Program, 28 Feb. 2006.
5. US Dept. of Defense, Trusted Computer System Evaluation Criteria (TCSEC), DOD 5200.28-STD (popularly known as the Orange Book), 15 Aug. 1983 (updated 21 Mar. 1988).

EDWARD G. AMOROSO is the senior vice president and chief security officer at AT&T, where his primary responsibilities lie in the real-time protection of AT&T's vast enterprise, network, and computing infrastructure, including its emerging Long-Term Evolution (LTE) mobile network and cloud services. He also manages AT&T's intellectual property and patent development group. Amoroso has a PhD in computer science from the Stevens Institute of Technology, where he also serves as an adjunct professor of computer science. He was awarded the AT&T Labs Technology Medal and is an AT&T fellow. Contact him at eamoroso@att.com.

Call for Papers

Special Issue on Secure Cloud Computing Techniques for Big Data
For IEEE Cloud Computing's Sep/Oct 2014 issue
Submission Deadline: 20 July 2014

Data explosion is an ever-evolving area of focus in business and research, and cloud-based platforms are increasingly utilized as potential hosts for big data. Current work on big data focuses on information processing such as data mining and analysis. However, security and privacy of big data are vital concerns which have received less research focus. The aim of this special issue is to solicit both original research and tutorial articles that discuss the security and privacy of big data within the cloud.

Example topics of interest include, but are not limited to:

• Access control in big data
• Network security, privacy, and trust in big data
• Collaborative threat detection using big data analytics
• Big data encryption
• Models and languages for big data storage
• Data privacy preservation
• Joint encryption and compression of big data
• Obfuscation of big data
• Watermarking of big data
• Secure and efficient transmission of big data
• Secure storage/retrieval of big data
• Secure database transactions of big data

Submission Guidelines
Submissions will be subject to IEEE Cloud Computing magazine's peer-review process. Articles should be at most 6,000 words, with a maximum of 15 references, and should be understandable to a broad audience of people interested in cloud computing, big data, and related application areas. The writing style should be down to earth, practical, and original. All accepted articles will be edited according to the IEEE Computer Society style guide. Submit your papers through Manuscript Central at https://mc.manuscriptcentral.com/ccm-cs.

Guest Editors
For more information, contact the guest editors:

• Bharat Bhargava, Purdue University, USA, bbshail@purdue.edu
• Ibrahim Khalil, RMIT University, Australia, ibrahim.khalil@rmit.edu.au
• Zahir Tari, RMIT University, Australia, zahir.tari@rmit.edu.au

www.computer.org/cloudcomputing
ROUNDTABLE


Cloud Computing
Roundtable
Mazin Yousif, T-Systems International
Tom Edsall, Cisco
Johan Krebbers, Shell
Stefan Pappe, IBM
Yousef A. Khalidi, Microsoft

In this issue of IEEE Cloud Computing, EIC Mazin Yousif chats


with cloud experts from Cisco, Shell, IBM, and Microsoft about
directions in cloud computing through 2020. They discuss a
range of issues, from standards and compliance, to security and
privacy, to the role of open source.

IEEE Cloud Computing is published by the IEEE Computer Society. 2325-6095/14/$31.00 © 2014 IEEE

Mazin Yousif: Thank you for participating in our roundtable. Let's start with each of you taking a few minutes to tell us about the current state of cloud computing in the market.

Tom Edsall: The industry is trying to figure out what cloud computing is and where it's going. There's a lot of change occurring, as you would expect with any sort of large idea that's relatively new in the mind of the individual consumer and the enterprise, as well as the service provider. Whether you're trying to provide or consume cloud services, what people are looking for is the lowest total cost of ownership (TCO), achieved through agility; that is, the ability


to quickly adapt applications to the cloud, and then deploy and expand them.
Johan Krebbers: Royal Dutch Shell started looking at cloud (meaning public cloud, as I don't believe in private cloud) around four years ago, focusing mainly on infrastructure as a service (IaaS). Now, we're moving to software as a service (SaaS). When we go for a SaaS offering, it's unlikely that we'll retrench back to just an on-premise software model. Latency is not really a major cloud challenge for us; it's our inability to export our data from selected countries. So when we can't use cloud, we rely on local datacenters.
Also, when we use platform as a service (PaaS), we prefer larger providers over small ones for fear that smaller providers will go out of business and we might not be able to retrieve our data.
Stefan Pappe: You don't hear me saying public and private cloud because it means so many things to different people. I would like to distinguish between off-premise and on-premise deployment models. Both can be dedicated or shared. Public cloud is often shared, private cloud is often dedicated, but it does not need to be.
Let me take a client view and reflect on how clients currently adopt cloud. A good share of clients is already adopting cloud services. That drives the growth of managed service providers. Clients often start with an IaaS and with specific workloads, for example, Internet-facing workloads. Another entry point is clients putting test workloads into the cloud, but increasingly we see production workloads as well. Sometimes there are real reasons and other times there are perceived reasons and feelings for not putting certain workloads on the cloud. But regardless, there are workloads that stay on premise. So, what we see is often a mix of on-premise and off-premise cloud models. The resulting organizational pattern is called a hybrid cloud pattern, which is what clients often implement these days.
That said, clients still want to see a single catalog and a single service-management interface, allowing smooth transitions between clouds. This is where APIs and standards come in to enable a variety of integration and delivery models. The cloud services consumed are mainly IaaS, but that is

changing as we speak. The next level of services that are already accepted is PaaS. These services are often implemented as workload patterns. They usually start with simple Web service database patterns, which are then uniformly standardized and offered to all developers across an enterprise. Additionally, we're seeing more complex patterns, such as high-performance computing (HPC). These PaaS services are often based on open source models such as Cloud Foundry. This trend is fueled by the increasing number of what we call systems of engagement, which represent a more agile type of application and application development, different from the traditional system of records. Systems of records include enterprise resource planning (ERP) systems, for example: the transaction systems of the world as we know them, which will always be around, but they're more and more frequently front-ended by the new systems of engagement, which are, for example, mobile and/or social applications. These types of applications require short turnaround times. They react quickly to new client demands and market changes. This means they require very short development cycles and a continuous delivery model.
Yousef Khalidi: If you look back three years, let's say to 2007 or 2008 or 2009, the hype cycle was really high and the workload was still new. Most enterprise customers were asking, "What is the cloud and what does it really mean? What's the difference between hosting and cloud, and why should we even bother?" Startups might have thought, "Yes, we'll go to a big cloud and put our stuff there."
Fast-forward a couple of years and the question shifted from, "What is the cloud?" to "Why should I use the cloud, and how would you meet my requirements, especially requirements related to compliance, security, data governance, and so forth?" If you fast-forward to today, many of those questions are gone. There are still always questions about compliance, security, and control, but the questions have shifted to, "My CIO said use the cloud. What can I use it for? Which of my applications are appropriate for the cloud? When do I adopt the cloud in my technology-refresh cycle? In which geographical location should I do this? In which should I not?"
What I see at the moment is that adoption is definitely there. You see services from vendors left


and right and people are seeing advantages to the cloud. The agility aspect is very useful for many customers. But the core question now is how I should use the cloud vis-à-vis the rules of constraint I might have with my on-premise systems. There's a balanced discussion happening as we speak, and the adoption I'm seeing now spans the spectrum from customer premises to the public cloud.
We're also seeing enterprise mission-critical applications being put in the cloud; perhaps not the back-end transactional applications, but more organizations now depend on the cloud to run their business.
Yousif: Let's have a little discussion on private and public versus on and off premise. Does that mean the industry needs to do a better job defining these terms? Is there market confusion?
Pappe: I think there is confusion and sometimes we confuse them ourselves. I think it's perfectly fine to use public and private cloud if you define what you mean, but sometimes I hear conversations in which different things are just implicitly assumed. Therefore, it's good to say, "Oh, I use a public cloud and I use it in a shared environment, which is off premise in another country, in another datacenter." That defines it.
Edsall: I like the terminology that Stefan was suggesting. I think it makes a lot of sense.
Khalidi: I use the term on premises to refer to data and applications that are controlled by you behind some security wall, within your datacenter. It's yours, customized, and you have direct control over it. Your IT department can see the gear and wires. You have total control.
Yousif: There's also confusion using the term virtualization, which some people use to mean cloud.
Edsall: Virtualization is so overused it's not even a useful term anymore.
Khalidi: I've seen a lot of confusion in the marketplace, and some people confuse virtualization with cloud, which is not the case. Virtualization is a necessary condition for cloud but it's not sufficient. You need to virtualize both the compute infrastructure and the network infrastructure, but the cloud is beyond that. You need to have the on-demand aspect, and, importantly, the scale. A cloud basically includes scale and is truly global, so whatever your business needs, you can get your data from anywhere, and it can reside almost anywhere.

Yousif: I would like you to come up with statements


about the pain versus gain when moving applications to an off-premise cloud. Sometimes people
think they can move everything to the off-premise
cloud and still do better than on premise, not realizing that sometimes it might be painful to move
workloads off premise.
Edsall: There are a lot of considerations, and everybody focuses on the positives of going to a cloud. But if we look at some of the negatives, clearly there are compliance and regulatory concerns, which have already been mentioned. On a related note, it might be easy to move applications off premise, but not so easy to move the data.
Pappe: An interesting topic, because sometimes we see migration to cloud as a technology exercise and miss the transformational aspect. You move to cloud usually to lower cost and increase your agility, which means faster time to value, but that comes with a price. It comes with a transformation need for the client moving to cloud, because you need to clearly articulate the levers that lower cost and increase agility. One aspect is the need to standardize (not necessarily cloud standards, but the number of variants you operate in your environment), starting at the operating system (OS) level. Sounds simple, but there might be dependencies with your middleware and your applications, and changing an OS level might have a far-reaching ripple effect.
This means you need to standardize your stack (OS, middleware, and application) to be able to lower the number of elements in your service catalog, because this will result in greater consistency and fewer causes for error when moving from development to test to production. That drives down your cost. Lowering the number of variants of catalog elements also makes it easier to automate. And then you can automate the heck out of it, because instead of 400 variants you have maybe 10.
With that, your server is up in minutes, but your internal compliance still needs to be executed with manual hand-offs, evidence gathering, and actions like that. If you do not change your post-provisioning processes, you might end up not having improved your overall time for service activation. That type of transformation needs to be clearly spelled out with clients, because it is often overlooked, which means there's a need for internal transformation to fully exploit the benefits of the cloud.
Khalidi: I strongly believe on premise will stay for a long time. If you look at what people have on premise, you'll see large transactional database systems supporting ERP systems and the like. Some move to the cloud just fine. Others will probably stay on premise. You still have systems that are built out of very large machines: half a terabyte of memory, a big cluster of machines, transactional workloads, and so on. Trying to move that to the cloud would be very painful at the moment. The customization you can do on premise isn't really possible in the cloud. You can do it in a hoster, in which case you really need a hosting place, with the same kind of cost structure, anyway, that you might have on premise.
Another example: if you have a lot of data on premise, or latency or governance considerations, you might keep it on premise. These things would be painful to move to the cloud. It won't be cost effective. Therefore, I believe hybrid is the way we're going to live in this space for a long time.
Yousif: How do you envision cloud evolving going forward, let's say to 2020? Are we going to see more diversity in services? What about manageability? What about the degree of automation?
Pappe: Software-defined environments (SDEs) are the drivers for cloud automation, and they're implemented in pockets already. SDE enables abstraction from the infrastructure; it makes your infrastructure programmable. That's great, because you can hide all the specifics of, let's say, our many infrastructure vendors from your cloud automation. For example, you don't have to bother with the switch configuration in your network anymore. OpenStack is a central element of this concept, and our strategy is very much aligned with OpenStack: SDE based on OpenStack to programmatically manage your infrastructure without having to manually configure it or go down to the device. It enables the next level of agility by having a programmable infrastructure. SDE also prevents vendor lock-in, because you can move between infrastructures much more easily.
These are the infrastructure benefits, but it becomes even more interesting if you couple this programmable infrastructure with workload awareness. What that means is you will be able to define your workload characteristics formally, for example in OpenStack HOT (Heat Orchestration Template), including the topology, the network setup, and so on, but also your nonfunctional requirements. For example, you can define thresholds, such as for performance, and with that knowledge built into the application definition, your control layer can automatically react if an incident happens. So, for example, if the performance of application XYZ falls below a certain threshold, you monitor it, you will detect it, and then you can trigger an automated policy. The policy might say: if the performance of application XYZ goes down, scale out. For application XYZ this is the right thing to do, because it scales out nicely and it's written for that. The scale-out happens automatically by using the programmable infrastructure. You might see that on a dashboard, but nobody has to do anything manually, and you'll recover your service-level agreement (SLA) automatically. I think this combination of workload awareness and programmable infrastructure will make a big difference in the future.
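The scenario Pappe describes, a performance threshold wired to an automatic scale-out policy inside the application's own definition, can be sketched in an OpenStack Heat HOT template. The image name, flavor, and threshold values below are illustrative assumptions, not details from the roundtable:

```yaml
heat_template_version: 2013-05-23
description: >
  Illustrative sketch: an application whose scale-out policy is part
  of its formal workload definition, as discussed in the roundtable.

resources:
  app_group:
    # The application tier itself, allowed to grow between 2 and 10 servers.
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 2
      max_size: 10
      resource:
        type: OS::Nova::Server
        properties:
          image: app-image        # hypothetical image name
          flavor: m1.medium

  scale_out_policy:
    # The automated policy: add one instance when triggered.
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: { get_resource: app_group }
      scaling_adjustment: 1
      cooldown: 60

  cpu_alarm_high:
    # The monitored threshold that triggers the policy.
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      threshold: 80               # illustrative performance threshold
      comparison_operator: gt
      alarm_actions:
        - { get_attr: [scale_out_policy, alarm_url] }
```

The point is exactly the one made above: the threshold and the reaction are part of the application definition, so recovery happens through the programmable infrastructure with no manual step.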
Edsall: I agree that what Stefan is talking about is a big part of what we're going to see in the future. One example is Cisco and IBM working together on a group-based policy model that's being pushed into OpenStack and OpenDaylight (ODL, an open source consortium for an SDN controller), which lets customers describe what they need from the infrastructure from the application's perspective. I think this will be big, because it will let you automatically define the infrastructure requirements when you develop applications, instead of handcrafting them.
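To make the idea of application-driven policy concrete, here is a purely hypothetical intent description in the spirit of a group-based policy model; the group names, contract, and syntax are invented for illustration and do not reflect the actual OpenStack GBP or ODL APIs:

```yaml
# Hypothetical group-based policy: the application declares what it
# needs from the infrastructure; a controller renders the intent onto
# the physical and virtual network.
groups:
  web-tier:
    members: [web-vm-1, web-vm-2]
  db-tier:
    members: [db-vm-1]

contracts:
  web-to-db:
    provider: db-tier        # db-tier offers the service
    consumer: web-tier       # web-tier is allowed to consume it
    rules:
      - protocol: tcp
        port: 3306           # MySQL, as an example workload
        action: allow
    # Anything not explicitly allowed is denied by default.
```

The design point is that such a declaration travels with the application, so the same policy can be rendered on any compliant infrastructure instead of being handcrafted per deployment.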
Krebbers: I expect most software will be off-premise
SaaS in 2020, with very little on premise, along with
more standardized integration.
Edsall: 2020 is a long way off. What we don't know now, which I think will be an interesting area of technology development for the future, is when these applications will begin to directly interact with the infrastructure. Today, if my performance falls below some threshold, the infrastructure can react. But what is more interesting is that the application might anticipate that it's going to need more capacity. And it will provision more, or it will proactively ask the infrastructure to do something. Or maybe there's an advantage (I'm thinking strictly from a networking perspective): we always treat all traffic equally. We spend a lot of time trying to be fair. Well, maybe fair isn't the right answer in all cases. And if the application could inform us of how its traffic should be treated, we might see performance improvements in those applications. The applications actually run faster because of the integration with the infrastructure.
And so what we really see is that the line between
the infrastructure and the application is starting to
blur, just as we see the line between development and
quality assurance (QA) and operations, and the line
between compute, networking, and storage blurring.
So all these lines are blurring together.
ROUNDTABLE

Yousif: In terms of manageability and the degree of monitoring that you currently see in the cloud, do you need more visibility, or are you satisfied with the current monitoring and reporting?
Krebbers: We need the ability to see more, especially when business-critical applications that need to comply with certain rules run in the cloud. The same applies for applications with legal restrictions. Current monitoring in the cloud is basic, not as sophisticated as on-premise monitoring. That said, the off-premise cloud needs to have monitoring similar to on-premise deployments.
Pappe: You also see mixed-delivery models with mixed responsibilities between providers and users. Today's outsourcing model, even if it's within a single enterprise's IT shop, is often all or nothing in terms of management. In this more traditional model, a provider manages the workload completely (OS, middleware, and application). With the emerging, agile types of applications, the systems of engagement, this changes. These applications often replace their underlying middleware as they grow, for example. Regularly, the development teams are directly involved in operations, in the spirit of DevOps. The rate of change in these applications is high, so they implement a continuous deployment model. Systems of engagement often need a lightweight, modular way of service management, and associated tools and processes, which lend themselves to different splits in roles and responsibilities between providers and users.
Khalidi: You hear me speak of governance and about who controls what. A common line from enterprise customers is: "When on premise, I know everything. I can walk to the machine, the switch, and so on. Now, when off premise, my system is beyond my security wall and I have no control." Everything is an extension of what they have on premise. So, legitimate questions are: Do you want visibility and assurances? Who's accessing the data? Who's doing configurations? Moreover, customers want control over the policies of who can do what, who can access their systems, and so forth. And they have to work with what they have on premise, too, because in my view, the cloud will be an extension of what they have, and therefore it has to augment in-house systems, not just in terms of network and storage and the like, but also in terms of management.
I think you'll see more automation for load management, moving workloads around and so forth, but the first part, I think, is more visibility: understanding what's going on, and controlling what's going on, is going to be important.
However, and to be candid, almost by definition you will not have the exact visibility you have when on premise.
Yousif: Any insights on how security and privacy
concerns will be dealt with in the next few years?
Edsall: I think the cloud raises a new set of security concerns, and also a whole set of security opportunities. With an application-based model, you can clearly define what you would like from the infrastructure in terms of security, as well as the interaction of the application components, in a way that's comprehensive and not dependent on the physical infrastructure or where the applications are within that infrastructure. This can be a great opportunity for making cloud more compliant and more secure than in the past.
I think off-premise deployments, and especially a shared environment, provide a great economic advantage, but they open a whole new set of questions about security: How do I know that there is real separation? How can I be sure that someone isn't looking at my data? Again, I think recent events are causing a lot of people to ask about the security of their data on a public infrastructure. We'll face new challenges in that space. We might also see more attacks coming from the inside rather than from the outside. On a note related to monitoring, I think we'll see a huge movement in the area of big data analytics pertinent to security and understanding compliance. I do think that will happen way before 2020. I think application-based models for defining the behavior of the infrastructure will be an integral part of the tools used to develop applications, such as Cloud Foundry.
Yousif: Because security isn't just a technology issue, any insights beyond technology, such as processes or governance?
Pappe: When it comes to security, I want to differentiate among infrastructure, platform, and software. On the infrastructure side, you need to protect your cloud infrastructures, securely deploy workloads, and meet the enterprise compliance objectives. If you have a hybrid cloud (on-premise and off-premise deployment), you need to have full visibility across both deployments. When you go up to the PaaS level, you want your developers to be able to develop secure cloud applications with the respective APIs. You also want to protect them against fraud and application threats. On the SaaS level, you want to have complete visibility into the enterprise usage of SaaS, on premise and off premise. You want to create a risk profile and see if off-premise deployments have similar or different risk profiles than on-premise deployments.
Khalidi: There has been a shift from a few years ago, when the notion of putting anything outside my firewall was a no-no. Actually, I'm using the word firewall figuratively; nowadays, it's hard to define what a firewall really is. I see the shift happening now, and the trick, again, is to recognize what makes sense to move to the cloud, or to use the cloud to augment what you have on premise.
There will still be reasons for keeping things within your control. The main reasons aren't technical: the law says so, countries say so, my state says so. You can go around the globe and you'll find good reasons, just reasons that the law says this has to stay here. And that's not really open to debate as far as I'm concerned. The law is the law. So, for such reasons, you'll stay on premise.
Krebbers: I challenge in certain cases whether on
premise is more secure than cloud (off premise). In
many cases, on premise is as insecure as the cloud
or as secure as the cloud.
Edsall: I think that the challenge is, considering the services provided to you, how do you know that the service provider has its house in order? How do you guarantee their compliance, that the provider's standards are as strict as your own, or as you expect them to be? That's a difficult problem.
Krebbers: We need new ways to find out. We need independent parties to monitor services on customers' behalf on an ongoing basis. Service providers also need to find a way to satisfy their customer base, which can be in the thousands, because they can't dedicate an auditor to each individual customer. I don't know what the best model is.
Edsall: This can be an opportunity for a whole new industry, although I think it raises questions around liability. If something goes wrong, is the third-party auditor liable, is the provider liable, or does it ultimately come back to the customer?
Yousif: Too early to tell. We still need to figure that out, but we'll likely need something, because there will always be this trust split between service providers and customers.
Pappe: Let me challenge you. In principle, you're right: why should an off-premise service be different from an on-premise service if the same rules, processes, and policies are applied? But often you don't know the off-premise rules and policies as well as you know your own. Let's take the example of the famous malicious insider. On premise, hopefully you don't have shared privileged IDs, so if there's a malicious insider, you know who it is. Does your off-premise provider follow the same rule set? If not, the probability that they'll catch a malicious insider is much lower.
Krebbers: I'm more careful with statements like "on premise is more secure than off premise."
Pappe: My statement is only that if you don't know the policies, you can't judge. Therefore, a general statement is difficult to make. But if you say, "I know cloud provider XYZ's policies, they publish them, and I'm fine with them," then you have a base on which to judge your security level. If it's not published to the detail you need, then you can't judge.
Krebbers: You need to create a base, but even if you have the base, you still need confidence that people operate against that base. You need to verify it. And my point is that people, internally and externally, don't always operate against that base, even if you have agreed upon it.
Yousif: For third-party independent consultants to do their jobs, do we need to architect additional capabilities into the platforms?
Edsall: That will be a matter of gathering information, providing audit trails, and having standards around what should be done, along with documentation of processes and those sorts of things. I think a lot of this will be worked out. I do believe there is room for differentiation, as cloud providers might differentiate themselves on their level of transparency or level of security.
Yousif: This could be along the lines of auditing performance benchmarks.
Krebbers: Yes, but the point we made goes farther, because certain companies will start offering certain types of services, or certain types of compliance services, that will start to add the hooks you're talking about, but only if you provide the types of service that provide the type of hooks you need. Say I fully focus, in Shell terms, on storing the most confidential data. If there's data you need to keep very secure, you'll need to add another type of hook to your environment, so there will be special companies for that. There are special companies for other types of services, and they'll start driving the hooks you were talking about.
Edsall: Another industry that's seen effective auditing is the finance community, where there are general accounting principles, audits of public companies by specialized firms, and standardized and regulated reporting.
Yousif: Let's move to standardization and compliance. Are we going to see interoperable cloud offerings during that timeframe, with a sufficient degree of standardization?

Edsall: Absolutely. I think we're beginning to see it now.

Yousif: How do we get there?

Edsall: There's a lot happening in the open source community; OpenStack and ODL are a few examples. But they have to be done in such a way that service providers can differentiate among themselves. So there has to be flexibility for innovation and differentiation. I think this will be done through the open community, much more than through standardization bodies such as the IEEE or IETF.

Yousif: Why do you think so, Tom?

Edsall: Because I think the whole industry is so fluid, and there is such a rapid rate of innovation, that those standards committees aren't agile enough, and sometimes they get a bit too mired in politics. The open source community, on the other hand, is all about delivering actual products. I know that's a little bit contentious.

Pappe: This is also where the action is, where the cool developers are. See how well attended the OpenStack conferences have been recently. It's cool to be there, be part of the party, be influential, maybe even be a committer. That's how innovation is driven. If you win the hearts of the developers, you've created a de facto standard. It's happening, for example, with OpenStack. Cloud Foundry is another candidate.

Krebbers: I also agree. It's quite interesting to see open source moving into the mainstream in many areas. If you look at the whole analytics space, open source is becoming dominant.

Khalidi: We already have a fair amount of commonality among different providers and systems, in that the building blocks we rely on are things like TCP/IP and SSL. More and more, most cloud providers have well-defined RESTful interfaces to their services. Many, but not all, of the libraries and APIs are available in open source, so there's actually a fair amount of transparency to at least the big providers. And most of them have a way to get your data in and out quickly if you need it. The question really is, as you move up the stack, how much can you expect to have common interfaces for things like application management? It becomes harder and harder to have standards further up the stack. So I believe you're going to see a lot more standardization at the lower part of the stack, and I think that a fair amount of standardization of data access APIs is already available.
One more thing: it's still early. There are technical arguments for it being too early for standardization.
Pappe: I agree with Tom on the needs; companies need to participate in open source. And the big companies need to develop the ability to innovate with, and sometimes despite, open source. That's the trick the big companies need to learn. But the community also needs to allow that there can be differentiators for the individual companies.
Krebbers: But the difference is in how you apply it. From a supplier viewpoint, it's all about how you apply the software; it can't just be the same software, it's how you apply it. And that will be different depending on whether you receive or produce the services.
Yousif: Are you going to see interoperability among
major service providers such as Amazon Web Services (AWS), Microsoft Azure, and IBM PureSystems?
Edsall: Either you'll have direct interoperability, or we'll see a suite of tools that provide middleware to ensure interoperability. I think you'll definitely be able to have workloads spread across all of those environments. In fact, I believe you'll see that very soon.
Pappe: I think there is a trend toward marketplaces, ecosystems, and service composition frameworks. They're tied together with open standards, enabling interaction and preventing vendor lock-in. But not everybody is participating in all of them, and we will see some evolution and gravitation over time.
Yousif: Another topic is economics. Are we going to
see different economic models, different consumption models from service providers versus what the
customers and consumers are asking for, to entice
more consumers to use the clouds?
Edsall: That will be part of how providers differentiate themselves. For example, they'll provide different levels of security, compliance, or audit capability, or maybe they'll provide content to their customers that you can't get anywhere else. Certainly, if I were running one of these networks, I would be doing everything I could to attract more customers, trying every economic model I could think of.
Pappe: I see a trend away from the pure lowering of IT cost, a trend shifting the value to industry solutions, to industry transformations.
Yousif: What does that mean?
Pappe: I mean using clouds to provide industry-specific applications and value adds that wouldn't be possible without cloud. Take Netflix, for example. Netflix wouldn't be there without cloud; Netflix's entire business model is based on the cloud. Netflix wrote its own open source platform on top of an infrastructure cloud service. Netflix's platform is an industry solution that is a large differentiator for them and their business model. The future drivers of cloud are new business models, and platforms enabling those, which wouldn't work without a cloud.
Yousif: What about cloud use cases, experiences, and adoption? Do you think enterprises will have full faith in the cloud by that timeframe, and that user experiences will always be positive? Are we going to see additional use cases defined?
Edsall: Certainly not everyone is using cloud yet. In fact, most enterprises don't have a cloud. They're developing their cloud strategies. They're experimenting with it, and clearly there are exceptions. There's a whole decision process they're going through right now: whether they want to have the cloud on premise or off.
That will be mostly driven by economics. As the infrastructure community reacts to what's happening in the cloud space, those economics are changing. Until just recently, if you wanted a private cloud, you had to build it yourself using components designed for a different kind of infrastructure. Now we're starting to see products designed for the cloud. So maybe I can build my own cloud economically. I might therefore evaluate the on-premise versus off-premise decision a bit differently.
There is also the question, "If I'm going to use cloud, will it be on premise or off premise?" I recently had some feedback that companies are pulling back a little bit from off premise, primarily because of recent US National Security Agency (NSA) revelations. This is related to concerns about who is really looking at my data when it is off premise.
Pappe: Let me put a different spin on it. I'm not as strict as Johan in terms of cloud vis-à-vis on premise. We're seeing a transformation in how we develop and deploy applications. Say some developers are writing an application using the DevOps method. The process is driven by continuous delivery, so turnaround times are fast. A cloud delivery model is essential for such a process to work, regardless of whether the cloud runs on or off premise. The cloud model is the underlying principle of such a DevOps model. The ability to execute that is a huge value for our enterprise and our clients, because they derive business value out of it. They become more agile and can differentiate themselves better from the competition, and that's more than infrastructure services.
Khalidi: A few years back, it was testing and development, not production. Now we're seeing production in the cloud. We're also seeing new application mixes in the cloud. You'll see more and more services running in the cloud that extend what you have on premise and, importantly, add value to what you have. In the next few years, I predict we'll see a combination of more lift and shift, more extension, and, importantly, new applications. Frankly, I'm surprised at what people are coming up with. When you give them a global computing infrastructure with rich services, going all the way up to SaaS services, and unshackle them from the mundane aspects of putting in datacenters and doing mundane work, people actually come up with very interesting applications.
Yousif: Are you saying that adoption will increase
quite quickly, despite the existing sensitivities about
security and privacy?
Edsall: I agree with Stefan. The adoption of cloud is occurring across the industry. Everybody's moving to cloud, and by cloud, I mean both on and off premise. As I said earlier, there's a lot happening on premise, but that's going to change. There are a lot of considerations for a company when it adopts a cloud strategy, and, as we were saying about the DevOps lifecycle of an application, it changes how we think about applications and how they interact with the infrastructure. I might be developing policies in parallel with my application development, and developing my QA as I'm deploying these applications and iterating on this rapidly. That certainly will happen quite a lot if you're going on premise.
OUR PANELISTS

Tom Edsall is the chief technology officer of Cisco's Insieme Business Unit, a Cisco Fellow, and a cofounder of Insieme Networks. At Insieme (recently spun back into Cisco), Edsall has led the development of the application-centric infrastructure (ACI), which includes a new line of Nexus 9000 switches that form an application-aware switching fabric, along with a centralized controller that manages both virtual and physical network infrastructures. He has been awarded more than 70 patents in the networking industry. Edsall has an MS in electrical engineering from Stanford University.

Stefan Pappe is an IBM Fellow and vice president for Cloud Architecture in IBM's Global Technology Services. In this capacity, he oversees the architecture and design of cloud offerings and client solutions. Pappe received a master's degree in economics from the University of Karlsruhe, Germany, and a PhD in computer science from the University of Kaiserslautern, Germany. He spent most of the 25 years of his IBM career fueling the services business through technical advancements and asset-based innovation. He is an author of several patents and technical papers, including the IBM Cloud Computing Reference Architecture, a comprehensive technical blueprint guiding cloud design and delivery.

Johan Krebbers is the Shell Group IT architect and the lead architect for Shell's Projects & Technology business. As Group IT architect, he is responsible for the IT architecture across the entire group, including business, applications, data, and infrastructure. Previous positions include infrastructure architect in Shell's Exploration and Production business unit and architecture and development manager for the Shell Group Infrastructure Desktop (GID) project, which rolled out the same desktop infrastructure to 130,000 users in more than 130 countries. Krebbers is currently based in the Netherlands.

Yousef A. Khalidi is a distinguished engineer at Microsoft, working on Windows Azure, Microsoft's large-scale cloud system. Khalidi is currently concentrating on Azure networking, including network virtualization, software-defined networks, and hybrid networks. He has worked on and published papers in several areas, including operating systems, cloud systems, distributed systems, networking, memory management, and computer architecture. He holds more than 40 patents in these areas. Khalidi has a PhD in information and computer science from the Georgia Institute of Technology.

Mazin Yousif is the EIC of IEEE Cloud Computing. For his full bio, see page 7.

It's starting to erode traditional organizational boundaries. Most enterprises have a compute organization, a networking organization, and a storage organization. As they go into even the on-premise clouds, those boundaries probably don't make as much sense. In fact, quite often, how you organize your tools, how you develop applications, and how you structure your organization all tend to mirror one another. I don't know what changes first, the organization, the tool sets, or the applications, but they all change together.
And lastly, organizations' skillsets will change, moving from a lot of handcrafted scripts and configurations that are somewhat fragile and static to a process that's much more automated and more software driven.
Yousif: On a related topic, in that timeframe, do you see cloud services delivered by a few 800-pound-gorilla providers, or by a large number of small cloud providers, each delivering specialized cloud services?
Pappe: There might be a consolidation of infrastructure providers because of the need for large investments and continued optimization of operations
down to the tenth of a penny and even smaller. I think we might see consolidation on the infrastructure side. But at the higher levels of the stack (middleware, platforms, and industry solutions), we will see the number of service offerings exploding. New enterprises are coming up that wouldn't be possible without a cloud. We're seeing a lot of SaaS and PaaS providers with exceptional innovations. Of course, often they get acquired by somebody, which leads to some consolidation, but they fuel innovations on all levels, including new industry models.
Edsall: I agree completely. Of course we're going to see a mix. The race to zero will be run by the big guys, and everybody else is going to try to figure out how to inject value so they don't have to race to zero. But we see that with almost everything that happens on the Internet.
Khalidi: Given that building a cloud is a capital-intensive business, not just at the datacenter and server level but also at the global network level, it will favor large scale. I am not an economist, but this pattern will result in fewer providers. Having said that, there are regulations and geopolitical considerations that will make this more than a purely economic argument, which in my opinion means you still need either on-premise private clouds or specialized vendors within some domains. So, in my opinion, we can end up with a handful of large public global cloud providers, augmented with technology providers and on-premise technologies that cater to local governance issues.
As you move up the stack, though, you get more specialization. You'll always see vendors that do special functions, meaning many providers.

Yousif: We touched on the role of open source. I would like you to summarize your thoughts about it.

Edsall: I definitely think you are going to see service providers trying to use open source. When there's a critical mass in the open source community, it moves very, very quickly, and the rate of innovation is hard for private companies to match. You'll see them taking the open source, injecting their own value into it, and integrating it with their own systems to provide their services. I really believe they'll embrace this technology.

Pappe: The open source community drives innovations, and maybe even more standardization than many standards organizations, by creating de facto standards. Supporting open source is essential. Our entire product strategy at IBM is based on open standards and open source. It lets us offer an open environment to clients with no vendor lock-in, allowing for choices, while differentiating with our own value adds and ecosystems. That is the key ingredient of the future cloud.

Edsall: That's also true for Cisco. We see that using open source can be a key differentiator for us, and so we feed into the open source movement around our ACI (Application Centric Infrastructure) architecture.

Khalidi: Open source has an important role. There's a lot of activity in open source for all technologies. Certain technologies are useful for building larger-scale systems for synchronization, data replication, caching, and so forth, and they're very much in the mix. I think everybody supports them at the moment. What I see missing is an infrastructure to make it easy to build cloud applications. You might have Ruby on Rails, so you can do some website type of stuff, but it's still Lego blocks. You have to compose things, relatively low-level stuff, be it open source or otherwise. Just pick up a piece of code and you can say, "Oh, my gosh, configure this. Do this and this and that." So the good news, again, is that there are a lot of building blocks available. Hadoop is a great engine, but to build solutions, a lot of plumbing is still needed.

Yousif: Anything else you want to address here that we haven't touched on?

Edsall: The only thing I can say is that 2020 is six years away. I think that all of what we're talking about can happen in the next three years. If you ask for a prediction for six years out, I think we're all making it up. We really don't know, and that's so far in the future in this industry.

Pappe: I agree with Tom. I hope that everything we talked about happens within the next couple of years, and there's a large likelihood of it. Actually, much is taking place as we speak: innovations, which we evolve and make operationally robust and efficient. For an IT person like me, it's one of the most exciting times in my lifetime, because of the exciting transformation we talked about and the fact that we are in the middle of it, forming our future.

Yousif: Thank you all.


STANDARDS AND COMPLIANCE

Setting Cloud Standards in a New World
Alan Sill, Texas Tech University

IEEE Cloud Computing aims to publish articles that describe not only cloud-specific standards seeing use, but also the process through which they're developed.

A defect in a cryptographic standard might expose you, me, and everyone working with us to financial ruin, catastrophic security vulnerabilities, or other weaknesses. An inconsistent Web
browsing experience between products is no longer
the most unpleasant thing we might expect from inconsistent HTTP frameworks. Such inconsistencies have led server-side developers to spend a great deal of time and effort working around them. Although this
problem still occurs, an even more important consideration now stems from the manner in which
hypermedia and Web APIs are being designed and
integrated deeply into business processes for essentially all new software. The corresponding need for
consistency and reproducibility in the associated
frameworks and standards is very high.
The ongoing process of defining, developing,
and recognizing standards in general is intrinsically a community activity. It might be possible to
define what constitutes a standard, at its most basic level, as anything agreed to by more than one
party, and in such a definition, we could sweep up

a broad range of human intellectual effort. In the IT field, there is and has always been a wide variety of community time and effort expended to define and codify the framework and details of our work.
This great range of activity continues to this day. In the first article in the StandardsNow series, which appears elsewhere in this issue, I describe some of the practical aspects and nuances of the current cloud standards landscape. For reasons detailed here and in that column, the area of standards for cloud computing is now mature enough to merit coverage in IEEE Cloud Computing on an ongoing basis.
Cloud Computing magazine welcomes articles covering standards and compliance. As in all the other editorial sections of the magazine, we want to recruit well-written, clear, succinct articles from leading projects and experts that illuminate the topic at hand for our readers. Although other areas might touch on standards from time to time, and many of the introductory editorials already mention such topics, articles in this part of the magazine will focus in detail on the development process used to create cloud computing standards, on their analysis

and description, and on their practical use to solve real-world problems.

Topics of Interest
Are there really established and emerging standards in the new world of cloud computing? Yes, definitely. You might not know about such efforts yet, so the magazine aims to collect the most coherent explanations available and expand on them wherever possible. This effort will aim to study, explain, document, and give a forum for describing not only the standards that are seeing use, but also the process through which they're developed to the point that they can see the light of day. As we'll see, standards in the cloud computing world aren't new at all; many of these protocols and specifications have been under continuous development for several years, leading to an increasing state of maturity that makes it possible and practical to take on such an effort.
What about standards adoption? We shouldn't miss the opportunity to engage in this topic directly. Articles that document adoption efforts for both established and emerging standards sets are welcome here. Of course, standards that are experiencing substantial uptake are the best ones to document, and possibly the ones that least need documentation, but there's also room to put promising new efforts into the spotlight to provide exposure and possibly improve their uptake.
Because standards are in fact a communal activity, I'll be relying on the community, meaning you, to identify what's of interest. Contributions can be historical or modern in approach, as long as they're focused on creating a successful standards-based framework for cloud computing innovation. I'll also make space available for short tutorials or relevant and revealing use-case examples, if these examples are general in nature and illustrate the solution provided by the standard being described.
Specific topics to be targeted include

• architectural efforts;
• ontology, taxonomy, and definitions;
• standards structured for particular branches of service-oriented architectures (SOAs), such as infrastructure as a service (IaaS), software as a service (SaaS), and platform as a service (PaaS);
• standards intended to cut across or bridge SOA levels;
• proofs of principle;
• use cases and requirements;
• test infrastructures; and
• benchmarking for performance and functionality.


I'm also interested in articles that describe cooperative work across multiple standards-developing organizations, especially work that combines efforts to reduce duplication, promote cooperation, or refine cooperative standards and specification sets to new levels.

Improving Interoperability and Promoting Innovation
It must be acknowledged at the outset that not all parts of the cloud computing world are or will be amenable to treatment by standards. This is true of any field. I explore this topic in more detail in the StandardsNow column, but for the moment let me just say that there are already successful cloud standards, and it's even possible to understand where they'll work best: in brief, standardize at the interfaces to enable innovation between the boundaries of a process or workflow. This approach not only works best, but it can simultaneously improve interoperability and promote innovation in a given cloud project or product.
The process through which standards are developed and studied is obscure to most people, and even to a large fraction of experienced developers in this field, even if they're familiar with the resulting specifications. In my experience, the standardization process works best when there are multiple coordinated avenues for close, iterative communication between people working on the documents comprising the standards and those using them in the field.
Although this much almost goes without saying, it's nonetheless true that there are many ways for such communication to take place, and these methods differ substantially between various organizations developing standards. In fact, they vary so much that there's often little resemblance among them from one organization to the next.
For this reason, it will be interesting to the community to document and describe the procedures used by the various organizations, which range from completely open processes designed to develop working software for a given project, to much more elaborate specification-oriented document production methods that aren't tied to any one particular software product. (Disclosure: I work a lot with organizations in the latter category; in the case of the Open Grid Forum, a specification can't even be promoted to the highest recommendation level without documented evidence of more than one successful implementation in the field, as well as significant uptake.)

Road to Adoption
I understand that the topic of standards might not
be everyone's cup of tea, and that often people need
to encounter this topic multiple times before it even
begins to make sense to them. This is true of many
other aspects of IT development, especially in the
new world of cloud computing (which, again, is actually not that new).
This characteristic makes it especially important to recruit good, clear articles that not only
capture the technical details of a standard set or
specification, but that go beyond such details to
explain the motivation, usage scenarios, value, and
expected interdependency of such standards for the
benefit of the educated reader.


Consider the Open Cloud Computing Interface (OCCI) as an example. (This explanation anticipates a longer discussion to be published in an upcoming issue, as mentioned in my StandardsNow column.) OCCI was one of the first standards to be deployed in the area of infrastructure services control, the IaaS level of SOA.1–3 It was designed as a general boundary-layer protocol to allow RESTful control and communication across that boundary, and it has an extensible design that allows discovery of service aspects at any level of the URI for accessing those services, as well as mix-ins to allow customization of service descriptions and interactions. Although OCCI was initially applied to IaaS control and communication, it's intended to be general in nature and, with its flexible format, can apply equally well to PaaS and SaaS applications.
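To make the boundary-layer idea concrete, here is a rough Python sketch of what an OCCI creation request can look like in the text rendering described by GFD.185. The endpoint host and the attribute values are hypothetical, and the rendering is simplified; consult the specifications for the normative form.

```python
# Illustrative sketch of an OCCI HTTP text rendering (in the style of
# GFD.185); the endpoint and resource attributes here are hypothetical.

def occi_create_compute(cores, memory_gb):
    """Compose the text of an OCCI request asking an IaaS provider to
    create a compute resource. Nothing is sent anywhere; we only render
    the request, to show the shape of the boundary-layer protocol."""
    lines = [
        "POST /compute/ HTTP/1.1",
        "Host: cloud.example.org",  # hypothetical endpoint
        # The Category header identifies the OCCI kind being instantiated.
        'Category: compute; '
        'scheme="http://schemas.ogf.org/occi/infrastructure#"; class="kind"',
        # X-OCCI-Attribute headers each carry one attribute of the resource.
        f"X-OCCI-Attribute: occi.compute.cores={cores}",
        f"X-OCCI-Attribute: occi.compute.memory={memory_gb}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"

print(occi_create_compute(cores=2, memory_gb=4))
```

The Category header names which kind is being instantiated against the published scheme, while a RESTful GET on the same URI space supports the discovery and mix-in mechanisms the column mentions.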
Other standards have also been developed to tackle particular aspects of cloud computing interfaces for specific tasks, often with greater specialization to the task at hand. For example, the Open Virtualization Format (OVF) is a packaging standard designed to improve the portability and deployment of virtual appliances, including machine images and the associated metadata needed to deploy, start, and manage them.4 The implementation of such metadata to carry out detailed machine control once such images are running wouldn't be covered under OVF, but would require an IaaS control standard, either OCCI or a standard such as the Cloud Infrastructure Management Interface (CIMI), which is specifically designed to perform such tasks.5
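As a rough illustration of the packaging role OVF plays, the following Python sketch assembles a drastically simplified descriptor in the OVF envelope namespace. The real DSP0243 schema requires much more (disk and network sections, virtual hardware descriptions, and so on); the element names here follow the envelope, but everything else is illustrative.

```python
# A deliberately simplified sketch of the kind of metadata an OVF
# descriptor carries; not schema-valid against DSP0243.
import xml.etree.ElementTree as ET

OVF_NS = "http://schemas.dmtf.org/ovf/envelope/1"

def make_descriptor(disk_file, vm_id):
    """Build a toy OVF-style envelope: a file reference for the machine
    image plus a virtual system describing the appliance."""
    ET.register_namespace("ovf", OVF_NS)
    env = ET.Element(f"{{{OVF_NS}}}Envelope")
    refs = ET.SubElement(env, f"{{{OVF_NS}}}References")
    ET.SubElement(refs, f"{{{OVF_NS}}}File", {f"{{{OVF_NS}}}href": disk_file})
    vs = ET.SubElement(env, f"{{{OVF_NS}}}VirtualSystem",
                       {f"{{{OVF_NS}}}id": vm_id})
    info = ET.SubElement(vs, f"{{{OVF_NS}}}Info")
    info.text = "A virtual appliance packaged for portability"
    return ET.tostring(env, encoding="unicode")

print(make_descriptor("disk1.vmdk", "web-server"))
```

The point of the exercise is the division of labor: the descriptor travels with the appliance and covers deployment metadata, while runtime control of the resulting machines is left to an IaaS interface such as OCCI or CIMI.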
We can use these standards to orchestrate workflows requiring coordination among multiple machine instances, using communication flows that
are possible within the information that they're designed to convey. For this purpose, there are other
standards either already existing or in process that
are designed specifically to help with such coordination, such as the Topology and Orchestration Specification for Cloud Applications (TOSCA).6 Although
it might be possible to handle such workflow coordination on your own or through a general standard,
specifications that are specifically written for particular cloud tasks will have
features that are customized to make
your life easier when handling the characteristic details of those tasks.
Not all of these standards fit together, and many of them have been
designed independently. (This isn't unusual: often many successful software
products are also designed independently, but any thought directed toward
how they can be used to accomplish different parts
of a given task can be fruitful.) Explaining the relationship between these standards, or even whether
they can in fact be used together within a given
piece of cloud software, is the ongoing goal of this
area of the magazine.
It's also true, as should be obvious to even casual observers of the cloud computing scene, that
not all aspects of the field have yet been covered
by sufficiently mature standards. Leaving aside the
question of uptake, which will be discussed in detail when we get to each standard, there's the question of whether the architecture and landscape of
cloud computing applications is sufficiently settled
to identify all areas in which the application of
standards-based approaches is even sensible.

I need your help identifying areas in which a substantial discussion on cloud standards is now
possible. To be successful, articles must go beyond
normal levels of clarity and readability. Too often,
jargon associated with the standards-development
process can and does exhaust the patience of many
participants in the cloud computing world, so I appeal to you to write topical, lively (but not too argumentative!) articles that will truly illuminate the
subject under discussion.

If you can produce a coherent, readable account of recent work in this area, I'd like to hear from you. For purposes of this area of IEEE Cloud Computing, you can reach me at alan.sill@standards-now.org. On behalf of all of the editors of the magazine, we look forward to your submissions.
References
1. Open Cloud Computing Interface – Core, Open Grid Forum, GFD.183, v1.1, 7 Apr. 2011 (updated 21 June 2011); www.ogf.org/documents/GFD.183.pdf.
2. Open Cloud Computing Interface – Infrastructure, Open Grid Forum, GFD.184, v1.1, 7 Apr. 2011 (updated 21 June 2011); www.ogf.org/documents/GFD.184.pdf.
3. Open Cloud Computing Interface – HTTP Restful Rendering, Open Grid Forum, GFD.185, 21 June 2011; www.ogf.org/documents/GFD.185.pdf.
4. Open Virtualization Format, Distributed Management Task Force, DSP0243, v2.1.0, 3 Jan. 2014; www.dmtf.org/standards/ovf.
5. Cloud Infrastructure Management Interface (CIMI) Model and RESTful HTTP-based Protocol, Distributed Management Task Force, DSP0263, v1.1.0, 25 Oct. 2013; www.dmtf.org/standards/cmwg.
6. Topology and Orchestration Specification for Cloud Applications, Organization for the Advancement of Structured Information Standards (OASIS), TOSCA 1.0, Nov. 2013; http://docs.oasis-open.org/tosca/TOSCA/v1.0/os/TOSCA-v1.0-os.html.

ALAN SILL is an adjunct professor of physics and senior scientist at the High Performance Computing Center and directs the US National Science Foundation Center for Cloud and Autonomic Computing at Texas Tech University. He also serves as the vice president of standards for the Open Grid Forum and cochair of the US National Institute of Standards and Technology's Standards Acceleration to Jumpstart Adoption of Cloud Computing working group. Sill has a PhD in particle physics from American University. He's an active member of the Distributed Management Task Force, IEEE, TM Forum, and other cloud standards working groups, and has served either directly or as liaison for the Open Grid Forum on several national and international standards roadmap committees. Contact him at alan.sill@standards-now.org.



CLOUD SECURITY AND PRIVACY

Security and Privacy in Cloud Computing
Zahir Tari, RMIT University

Significant research and development efforts in both industry and academia aim to improve the cloud's security and privacy. The author discusses related challenges, opportunities, and solutions.

The cloud has fundamentally changed the landscape of computing, storage, and communication infrastructures and services. With strong interest and investment from industry and government, the cloud is being increasingly patronized by both organizations and individuals. From the cloud provider's perspective, cloud computing's main benefits include resource consolidation, uniform management, and cost-effective operation; for the cloud user, benefits include on-demand capacity, low cost of ownership, and flexible pricing. However, the features that bring such benefits, such as sharing and consolidation, also introduce potential security and privacy problems. Security and privacy issues resulting from the illegal and unethical use of information, and causing disclosure of confidential information, can significantly hinder user acceptance of cloud-based services. Recent surveys support this observation, indicating that security and privacy concerns prevent many customers from adopting cloud computing services and platforms.
In response to such concerns, significant research and development efforts in both industry and academia have sought to improve the cloud's security and privacy. Here I give a quick (and incomplete) overview of new challenges, opportunities, and solutions in this area, with the purpose of stimulating more in-depth and extensive discussion on related problems in upcoming issues of this magazine.

Identifying New Threats and Vulnerabilities
An essential task in cloud security and privacy research is to identify new threats and vulnerabilities that are specific to cloud platforms and services. Several recent reports have explored such vulnerabilities. For example, in 2009, researchers from the University of California, San Diego, and the Massachusetts Institute of Technology demonstrated leakage attacks against Amazon's Elastic Compute Cloud (EC2) virtual machines (VMs).1 More specifically, the researchers showed that it's possible to probe and infer the overall placement of VMs in the EC2 infrastructure. Furthermore, an attacker can launch a malicious EC2 instance and then determine whether that instance is physically colocated with a targeted (victim) instance. When the attacker's instance is successfully colocated with the
victim, it can launch a side-channel attack by monitoring the status of shared physical resources, such as level-1 and level-2 caches, and thus infer the victim's computation and I/O activities.
A follow-up study showed that it's possible to extract private keys via the cross-VM side channel in a lab environment.2 In another study, researchers from the College of William and Mary reported that side-channel attacks aren't just a potential risk, but a realistic threat.3 They created a covert channel via another shared resource (the memory bus) that had a level of reliability and throughput of more than 100 bps in both lab and EC2 environments.
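The mechanics of such cache-based inference can be illustrated with a toy prime-and-probe simulation. Real attacks measure access latencies on shared hardware; in this Python sketch the "cache" is just a table recording which tenant last touched each set, so it conveys only the inference logic, not the measurement.

```python
# Toy simulation of a prime-and-probe cache side channel: the attacker
# fills a shared cache, lets the victim run, then checks which sets the
# victim displaced. Purely illustrative; no real timing is involved.

CACHE_SETS = 8

class SharedCache:
    def __init__(self):
        self.owner = {s: None for s in range(CACHE_SETS)}
    def access(self, tenant, addr):
        self.owner[addr % CACHE_SETS] = tenant  # last toucher owns the line

def victim(cache, secret_bits):
    # The victim's memory accesses depend on its secret, the way table
    # lookups in some crypto code depend on key bits.
    for i, bit in enumerate(secret_bits):
        if bit:
            cache.access("victim", i)

def attacker_infer(cache, secret_len):
    for s in range(CACHE_SETS):          # prime: touch every set
        cache.access("attacker", s)
    victim(cache, SECRET)                # victim runs on the shared cache
    # Probe: any set no longer owned by the attacker was used by the victim.
    return [1 if cache.owner[i] != "attacker" else 0 for i in range(secret_len)]

SECRET = [1, 0, 1, 1, 0, 0, 1, 0]
leaked = attacker_infer(SharedCache(), len(SECRET))
print(leaked)  # the attacker recovers the victim's secret bits
```

The colocated tenants never communicate directly; all the information flows through contention on the shared resource, which is exactly why consolidation creates this risk.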
These risks represent a small subset of known cloud-specific vulnerabilities and threats. However, they motivate us to think further about new adversary models, trust relations, and risk factors relative to cloud computing stakeholders. In the examples, the cloud provider isn't trusted because of its resource sharing and VM consolidation practices. Hence, the cloud provider doesn't provide a desirable level of isolation and protection between tenants in the cloud, allowing them to attack each other.

Protecting Virtual Infrastructures
Virtual infrastructures are infrastructure-level (virtual) entities, such as VMs and virtual networks, created in the cloud on behalf of users. Side-channel attacks target these virtual infrastructures. Researchers have proposed several solutions to defend against cross-VM side-channel attacks. Düppel, for example, aims to disrupt cache-based side channels. In this self-defensive approach, the target VM's guest operating system injects cache access noise (that is, flushes) so the colocated attack VM can't infer cache access patterns.4 This solution doesn't require modifying the underlying hypervisor or cloud platform. To defend against memory bus-based side channels, a simple and practical approach is to prevent a VM from locking the memory bus and let the hypervisor emulate the execution of atomic instructions that would otherwise require memory bus locking.5
Other attacks against virtual infrastructures include malware attacks against tenant VMs. The cloud presents a new opportunity to defend against these attacks. More specifically, the cloud provides a uniform and tamper-resistant platform on which to deploy system monitoring and antimalware functions. The uniformity is reflected by the cloud provider's consistent installation, configuration, and update of antimalware services for all hosted tenants. The platform is tamper resistant because monitoring and detection of malware attacks can be performed from outside the hosted VMs, either by the underlying hypervisor or by the more privileged management domain (for example, Domain 0 of Xen). In CloudAV, a production-quality system that reflects the antivirus-as-a-service idea, a group of in-cloud antivirus engines analyzes suspicious files submitted by agents running in client machines (including VMs) and collectively detects malware in them.6 VMwatcher, a virtualization-based malware-monitoring and detection system, moves commodity, off-the-shelf antimalware software from the inside to the outside of each tenant VM.7 This way, the antimalware software is out of the malware's reach, preventing the malware from detecting, disabling, or tampering with it. Malware targeting a tenant VM, at either the user or kernel level, can be detected and prevented using such an out-of-the-box antimalware service.
A networked virtual infrastructure can consist of multiple VMs connected by a virtual network. With the rapid advances in software-defined networking (SDN), the cloud increasingly supports such networked virtual infrastructures. SDN decouples the control and data-forwarding functions of a physical networked infrastructure, such as a datacenter network. The SDN control plane performs control functions such as routing, naming, and firewall policy enforcement, and the SDN data plane follows the control plane's decisions to forward packets belonging to different flows. Such decoupling makes it easy to optimize the control and data planes without them affecting each other. However, the SDN paradigm raises security issues. Researchers have reported that it's possible to launch attacks against the SDN architecture, incurring excessive workload and resource consumption on both the control and the data plane.8 Although researchers are developing defenses against such attacks, we need more generic, scalable solutions that make the SDN architecture secure and robust, which would support virtual infrastructure hosting in the cloud.
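The control/data-plane split can be sketched in a few lines. The classes and field names below are illustrative rather than any real controller API (OpenFlow and production controllers are far more elaborate); note how a flow-table miss forces the switch to consult the controller, which is precisely the path that the resource-consumption attacks cited above can flood.

```python
# Minimal sketch of SDN's control/data-plane split. All names are
# illustrative; this models no real switch or controller interface.

class Controller:
    """Control plane: decides where each flow should go."""
    def __init__(self, policy):
        # policy maps destination -> output port; None means drop (firewall).
        self.policy = policy
    def make_rule(self, dst):
        return self.policy.get(dst)  # consulted only on a table miss

class Switch:
    """Data plane: forwards packets by consulting its local flow table."""
    def __init__(self, controller):
        self.controller = controller
        self.flow_table = {}
    def forward(self, packet):
        dst = packet["dst"]
        if dst not in self.flow_table:  # table miss: ask the control plane
            self.flow_table[dst] = self.controller.make_rule(dst)
        port = self.flow_table[dst]
        return f"out:{port}" if port is not None else "drop"

ctrl = Controller({"10.0.0.2": 1, "10.0.0.3": 2})  # no rule for other hosts
sw = Switch(ctrl)
print(sw.forward({"dst": "10.0.0.2"}))  # out:1 (first packet triggers a miss)
print(sw.forward({"dst": "10.0.0.9"}))  # drop (firewall policy)
```

Because every unmatched flow reaches the controller, a burst of packets with fabricated destinations can exhaust both the controller and the switch's flow table, which is the style of attack reference 8 describes.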

Protecting Outsourced Computation and Services
Many organizations have been increasingly outsourcing services and computation jobs to the cloud. A client that outsources a computation job must verify the correctness of the result returned from the cloud, without incurring significant overhead at its local infrastructure (the extreme being to execute the job locally, which would nullify the benefit of outsourced job execution). Such verifiability is important to achieving cloud service trustworthiness and hence has become a topic of active research. Encouragingly, researchers have in recent years developed
techniques and real systems to bring the vision of a verifiable cloud service closer to reality. For example, the Pantry system composes and outsources proof-based verifiable computation with untrusted storage.9 It achieves theoretically sound verifiability of computation for realistic cloud applications, such as MapReduce jobs and simple MySQL queries.

In addition to computation outsourcing, the cloud can support network service/function outsourcing. Example network functions include traffic filtering, transcoding, firewall policy enforcement, and network-level intrusion detection. Seyed Kaveh Fayazbakhsh and his colleagues noted that, similar to computation outsourcing, a major challenge is to verify (at the end points of network connections) that the middleboxes in the cloud correctly execute outsourced network functions with satisfactory performance.10 They also proposed a framework for verifiable network function outsourcing (vNFO) that aims to achieve verifiability, efficiency, and accountability of outsourced network functions. Such a framework will pave the way for deploying trusted network middleboxes, in addition to end points (that is, VMs), in the cloud, enriching the cloud ecosystem.
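Proof-based systems such as Pantry are intricate, but the weakest form of result verification is easy to state in code: replicate the job across independent workers and accept only agreeing results. The sketch below is a generic baseline, not the article's proof-based approach; all function names are invented, and `run_remotely` is a local stand-in for real job dispatch. It also illustrates the trade-off the section describes: replication roughly doubles cost, whereas proof-based verification aims to stay well below the cost of re-execution.

```python
import hashlib

def run_remotely(worker, job, data):
    # Stand-in for dispatching a job to a cloud worker; runs locally so the
    # example is self-contained.
    return worker(job, data)

def verified_outsource(job, data, workers):
    """Replication-based verification: accept a result only if all independent
    workers agree on its digest. Much weaker (and costlier per result) than
    proof-based verification, but trivial to deploy."""
    digests, results = set(), []
    for w in workers:
        r = run_remotely(w, job, data)
        results.append(r)
        digests.add(hashlib.sha256(repr(r).encode()).hexdigest())
    if len(digests) != 1:
        raise RuntimeError("workers disagree; result rejected")
    return results[0]
```

The scheme only helps when the workers fail (or cheat) independently, which is exactly what a single untrusted provider cannot guarantee.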

Protecting User Data

User data is another important cloud citizen. To protect user data in the cloud, a key challenge is to guarantee the confidentiality of privacy-sensitive data while it's stored and processed in the cloud. This problem assumes a somewhat different trust model, in which the cloud is not fully trusted because of operator errors or software vulnerabilities. As a result, the cloud provider shouldn't be able to see unencrypted or decrypted sensitive data during the data's residence in the cloud. (In other words, sensitive data should remain encrypted while in the cloud.) However, such a requirement can limit the usability of (encrypted) data when a cloud application processes it. Fortunately, researchers at the University of California, Santa Barbara, observed that many cloud applications can process encrypted data without affecting the correctness of the data execution. These researchers proposed Silverline, which identifies data that the application can properly process in encrypted form.11 Such data will remain encrypted and hence maintain its confidentiality to the cloud provider. The cloud user will perform data decryption locally once the encrypted data is returned from the cloud as application output.
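One way an application can "process" data it cannot read, in the spirit of Silverline's classification of encryptable data (though not its actual mechanism), is for the client to blind equality-matched fields with a keyed digest, so the server can index and group records without ever seeing plaintext. All class and method names below are invented for illustration.

```python
import hashlib
import hmac
import os

class Client:
    """Client side: blinds fields the cloud only needs to match on equality.
    The key never leaves the client, so the provider cannot invert digests."""
    def __init__(self):
        self.key = os.urandom(32)

    def blind(self, value: str) -> str:
        return hmac.new(self.key, value.encode(), hashlib.sha256).hexdigest()


class CloudIndex:
    """Server side: stores and looks up records keyed only by blinded values."""
    def __init__(self):
        self.index = {}

    def put(self, blinded_key, record):
        self.index.setdefault(blinded_key, []).append(record)

    def get(self, blinded_key):
        return self.index.get(blinded_key, [])
```

The caveat is deliberate: deterministic blinding preserves equality, so the provider still observes which records are grouped and accessed together. That residual leakage is precisely the access-pattern problem the next paragraph turns to.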
In-cloud data confidentiality poses even greater challenges. For example, even if the application data is encrypted, the access patterns exhibited by the corresponding applications can reveal sensitive information about the nature of the original data, weakening the data's confidentiality. Hence a challenge is to achieve confidentiality of data access patterns in the cloud, a problem called oblivious RAM (ORAM). Recently, researchers reported a breakthrough in achieving both practical and theoretically sound ORAM.12 The solution, called Path ORAM, is elegant by design and efficient in practice.12 In fact, Path ORAM has been implemented as part of a processor prototype called Phantom,13 which achieves realistic performance for real-world applications. This is a significant step toward ultimate deployment of ORAM-enabled machines for sensitive data processing in the cloud.
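Path ORAM is indeed simple enough to sketch. In the toy version below (class and parameter names are mine, and the "server" is a local dict; a real deployment encrypts buckets and pads them with dummy blocks), the server holds a binary tree of fixed-size buckets while the client keeps a stash and a position map. Each access reads one root-to-leaf path into the stash, remaps the block to a fresh random leaf, and greedily writes stash blocks back as deep as their assigned paths allow, so the server only ever sees random-looking path reads and writes.

```python
import random

class PathORAM:
    """Minimal Path ORAM sketch: binary tree of buckets + client-side stash."""
    def __init__(self, height, bucket_size=4):
        self.L = height                      # tree height; 2**height leaves
        self.Z = bucket_size                 # blocks per bucket
        self.tree = {}                       # node index -> [(block_id, value)]
        self.stash = {}                      # client-side overflow
        self.pos = {}                        # block_id -> assigned leaf

    def _path(self, leaf):
        # Heap-style node indices from the root (1) down to the leaf node.
        node, nodes = (1 << self.L) + leaf, []
        while node >= 1:
            nodes.append(node)
            node //= 2
        return nodes[::-1]

    def access(self, block_id, new_value=None):
        # Look up the block's leaf, then immediately remap it at random.
        leaf = self.pos.get(block_id, random.randrange(1 << self.L))
        self.pos[block_id] = random.randrange(1 << self.L)
        path = self._path(leaf)
        # Read every bucket on the path into the stash.
        for node in path:
            for bid, val in self.tree.pop(node, []):
                self.stash[bid] = val
        value = self.stash.get(block_id)
        if new_value is not None:            # None means a plain read
            self.stash[block_id] = new_value
        # Write back, placing each stash block as deep as its path allows.
        for node in reversed(path):          # deepest bucket first
            bucket = []
            for bid in list(self.stash):
                if len(bucket) == self.Z:
                    break
                if node in self._path(self.pos[bid]):
                    bucket.append((bid, self.stash.pop(bid)))
            self.tree[node] = bucket
        return value
```

Every access touches exactly one path regardless of which block is requested, which is the whole point: the sequence of paths read reveals nothing about the logical access sequence.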

Securing Big Data Storage and Access Control

In the recent past, more research has focused on cloud-based big data applications. Many consider the cloud to be the most promising platform for hosting, collaborating on, and sharing big data. The challenge is to secure the storage of and access to this data to preserve its integrity, confidentiality, authenticity, and nonrepudiation while facilitating availability.

Interesting solutions to increase the accountability of data sharing have been proposed for cloud-based distributed systems. Smitha Sundareswaran and colleagues, for example, proposed a decentralized accountability framework with logging capabilities using the programmable capabilities of Java Archive files.14 The advent of many types of big data, such as electronic health records and sensor data, has spurred research on secure access and sharing with greater accountability. Recently, researchers have proposed solutions for increasing accountability and secure access to cloud-based health data,15 as well as robust cryptographic access control methods to increase the storage security of privacy-sensitive big data. Guojun Wang and his colleagues proposed hierarchical attribute-based cryptography to facilitate secure access for users in large-scale cloud storage systems.16 More recently, researchers have designed more advanced solutions (for example, homomorphic cryptography17) for secure cloud-based storage systems to facilitate secure distributed access.

Given emerging trends in big data, we need more research on efficient, scalable, and accountable privacy-preserving mechanisms that can address application-specific requirements.
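The logging-based accountability frameworks discussed above can be approximated with a hash-chained audit log: each entry's MAC also covers its predecessor's MAC, so deleting or editing any record breaks verification of everything after it. This is a generic construction, not the JAR-based design of reference 14; the key, record layout, and class names are all illustrative.

```python
import hashlib
import hmac
import json
import time

class AccountableStore:
    """Key-value store with a tamper-evident, MAC-chained access log."""
    def __init__(self, key: bytes):
        self.key = key
        self.data = {}
        self.log = []                        # [(record, mac)]
        self.last_mac = b"\x00" * 32         # chain anchor

    def _mac(self, record: dict, prev: bytes) -> bytes:
        payload = prev + json.dumps(record, sort_keys=True).encode()
        return hmac.new(self.key, payload, hashlib.sha256).digest()

    def access(self, user, op, item, value=None):
        record = {"user": user, "op": op, "item": item, "ts": time.time()}
        self.last_mac = self._mac(record, self.last_mac)
        self.log.append((record, self.last_mac))
        if op == "put":
            self.data[item] = value
        return self.data.get(item)

    def verify(self) -> bool:
        """Recompute the chain; any edited or dropped entry breaks it."""
        mac = b"\x00" * 32
        for record, stored in self.log:
            mac = self._mac(record, mac)
            if mac != stored:
                return False
        return True
```

An auditor holding the key can thus check after the fact who accessed which item and that no access was scrubbed from the record, which is the accountability property these frameworks target.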

Call for Contributions

The magazine welcomes articles that discuss new challenges, opportunities, and solutions in the area
of cloud security and privacy, in particular, articles that relate to data, storage, computation, and communication. Enabling techniques include cryptography, virtualization, data management and analytics, software-defined networking, fault tolerance and recovery, and forensics. I'd like to hear from practitioners about their lessons and experience in developing, deploying, and using cloud security and privacy solutions and services. I also welcome reports from academia on cutting-edge research and development, new vulnerabilities and challenges, and new or even controversial ideas and visions.
References
1. T. Ristenpart et al., "Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds," Proc. ACM Conf. Computer and Comm. Security (CCS '09), 2009, pp. 199–212.
2. Y. Zhang et al., "Cross-VM Side Channels and Their Use to Extract Private Keys," Proc. 19th ACM Conf. Computer and Comm. Security (CCS '12), 2012, pp. 305–316.
3. Z. Wu, Z. Xu, and H. Wang, "Whispers in the Hyper-space: High-Speed Covert Channel Attacks in the Cloud," Proc. Usenix Security Symp., 2012.
4. Y. Zhang and M.K. Reiter, "Düppel: Retrofitting Commodity Operating Systems to Mitigate Cache Side Channels in the Cloud," Proc. 20th ACM Conf. Computer and Comm. Security (CCS '13), 2013.
5. B. Saltaformaggio, D. Xu, and X. Zhang, "BusMonitor: A Hypervisor-Based Solution for Memory Bus Covert Channels," Proc. 6th European Workshop on Systems Security (EuroSec '13), 2013.
6. J. Oberheide, E. Cooke, and F. Jahanian, "CloudAV: N-Version Antivirus in the Network Cloud," Proc. 17th Usenix Security Symp., 2008, pp. 91–106.
7. X. Jiang, X. Wang, and D. Xu, "Stealthy Malware Detection Through VMM-Based Out-of-the-Box Semantic View Reconstruction," Proc. ACM Conf. Computer and Comm. Security (CCS '07), 2007, pp. 128–138.
8. S. Shin and G. Gu, "Attacking Software-Defined Networks: A First Feasibility Study," Proc. ACM SIGCOMM Workshop Hot Topics in Software Defined Networking (HotSDN '13), 2013, pp. 165–166.
9. B. Braun et al., "Verifying Computations with State," Proc. 24th ACM Symp. Operating Systems Principles (SOSP '13), 2013, pp. 341–357.
10. S.K. Fayazbakhsh, M.K. Reiter, and V. Sekar, "Verifiable Network Function Outsourcing: Requirements, Challenges, and Roadmap," Proc. ACM Workshop Hot Topics in Middleboxes and Network Function Virtualization (HotMiddlebox '13), 2013, pp. 25–30.
11. K.P.N. Puttaswamy, C. Kruegel, and B.Y. Zhao, "Silverline: Toward Data Confidentiality in Storage-Intensive Cloud Applications," Proc. 2nd ACM Symp. Cloud Computing (SoCC '11), 2011, article 10.
12. E. Stefanov et al., "Path ORAM: An Extremely Simple Oblivious RAM Protocol," Proc. ACM Conf. Computer and Comm. Security (CCS '13), 2013, pp. 299–310.
13. M. Maas et al., "PHANTOM: Practical Oblivious Computation in a Secure Processor," Proc. ACM Conf. Computer and Comm. Security (CCS '13), 2013, pp. 311–324.
14. S. Sundareswaran, A.C. Squicciarini, and D. Lin, "Ensuring Distributed Accountability for Data Sharing in the Cloud," IEEE Trans. Dependable and Secure Computing, vol. 9, no. 4, 2012, pp. 556–568.
15. Y. Tong et al., "Cloud-Assisted Mobile-Access of Health Data with Privacy and Auditability," IEEE J. Biomedical and Health Informatics, vol. 18, no. 2, 2014, pp. 419–429.
16. G. Wang, Q. Liu, and J. Wu, "Hierarchical Attribute-Based Encryption for Fine-Grained Access Control in Cloud Storage Services," Proc. 17th ACM Conf. Computer and Comm. Security (CCS '10), 2010, pp. 735–737.
17. W. Lu, A.L. Varna, and M. Wu, "Confidentiality-Preserving Image Search: A Comparative Study between Homomorphic Encryption and Distance-Preserving Randomization," IEEE Access, vol. 2, 2014, pp. 125–141.

ZAHIR TARI is a full professor of distributed systems at RMIT University, Australia. His research interests include system performance (for example, Web servers, P2P, and cloud computing) and system security (for example, SCADA and cloud). Tari received a PhD in computer science from the University of Grenoble, France. In addition to serving on the IEEE Cloud Computing editorial board, he's an associate editor of IEEE Transactions on Computers and IEEE Transactions on Parallel and Distributed Systems. Contact him at zahir.tari@rmit.edu.au.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.

CLOUD AND ADJACENT TECHNOLOGY TRENDS

Emerging Paradigms and Areas for Expansion
Pascal Bouvry, University of Luxembourg

IEEE Cloud Computing seeks articles on emerging cloud and adjacent technologies and their impact on the perception and use of the cloud.

IEEE Cloud Computing is published by the IEEE Computer Society. 2325-6095/14/$31.00 © 2014 IEEE

Cloud computing was born from the opportunity to open major distributed datacenters to end users to provide on-demand services, ranging from infrastructure as a service (IaaS) to software as a service (SaaS). The new pay-per-use business model is so appealing that we're entering an everything-as-a-service (EaaS) era, extending the approach beyond existing borders, such as hardware-as-a-service (HaaS) and business-process-as-a-service (BPaaS), to other areas and dimensions (robot-as-a-service, sensor-as-a-service, and so on). The cloud is like the Borg from Star Trek, making all resistance futile and assimilating all existing technologies and services.

Emerging paradigms and technologies will have a major impact on society and industry. We plan to cover these paradigms in the Cloud and Adjacent Technology Trends area as well as provide a long-term futuristic vision of how the cloud will look and the new opportunities it will offer.

These paradigms and related technologies can be domain-driven (for example, personalized medicine and social networks) or transversal (such as big data and the Internet of Things). In this introductory article, I'll describe some potential major players. By its nature, the list is nonexhaustive, and I expect even more upcoming breakthroughs to revolutionize how we see things.

Growth in Data
With the development of new technologies in biomedicine, researchers gained access to -omics experimental readouts of high dimensionality and volume, such as genomics, transcriptomics, proteomics, and metabolomics.1 Consequently, tremendous amounts of data have become available. One reason is the great reduction of -omics costs. In the last decade, the price of genetic sequencing (genomics) dropped from millions to thousands of US dollars per sequence, and the costs will eventually drop even more.2 At the same time, the scope of collected data keeps growing, from sequencing a family a few years ago, to cohorts today, to entire populations in the future.

Although the prices are dropping, the size of the collected genomics data remains the same: roughly 0.3 terabytes per sequence. With the number of sequences reaching hundreds, current networks can't support their prompt transfer. Instead, major

transport companies (FedEx, UPS, TNT, DHL, and so on) ship disks across the globe. From a broader perspective, these data transfers occur in huge data flows. Newly developed models and techniques will be required to parallelize the information transfer to exploit not only the many paths connecting one point to another but also multipoint communications.

Another biomedical domain observing a rapid increase in the size and volume of collected data is imaging. Here the challenge isn't only to store or analyze the data,3 but also to remotely visualize large images.4

Big data and data analytics are among the biggest technology trends.5 IBM divides big data into four dimensions: volume, velocity, veracity, and variety, or the 4 Vs. Each dimension brings new challenges in terms of required models, methods, algorithms, and technologies. The notion of big data is also tightly coupled with the emergence of the cloud. Indeed, Web 2.0 technologies and social networks present a tremendous amount of data that can't be stored locally to be processed for key information, such as societal or marketing studies.

Tying together zettabytes of data with the required processing power and providing this as a service to potential customers involves some major underlying challenges. We'll need new generations of data warehouses, (no-)file systems, and data-processing techniques. IEEE Cloud Computing will investigate all of these adjacent technologies and explore how they'll impact and shape the cloud's future.

Hardware Advances
Some of the paradigm shifts, such as sustainable computing, and technologies like those developed for mobile computing (for example, low-power CPUs and systems on chip) or advanced networking (such as passive components and network coding) are also expected to revolutionize the cloud's core components. Cyberphysical systems are developing quickly, and cloud computing will help further blur the hardware/software border.6

Because of the mass market (billions of units sold per year), the unit prices of the newest generation of hardware components have dropped low enough to allow an HaaS approach, in which hardware sharing needs, such as CPUs using virtualization, are less crucial. Some of the techniques developed by the grid computing community, such as elastic parallel designs, will be re-explored and further developed in this new context.

Hardware advances boosted by research on mobile computing create new opportunities to enrich the cloud. For example, more than 10 billion ARM processors are sold each year. Moreover, chip and board manufacturers continually announce new generations of low-power chipsets and the coupling of such chipsets at the cache level with GPUs and other accelerators, such as field-programmable gate arrays (FPGAs). We intend to investigate new generations of hardware, how well they work in the cloud paradigm, which category of cloud services they can provide, and upcoming trends.

At the other end of the cloud spectrum, the main reason for the current relatively restrictive use of the cloud for high-performance computing (HPC) resides in the lack of cloud offers featuring high-performance interconnects, such as Infiniband, as well as the lack of efficient cloud driver implementations for such interconnects. Therefore, HPC users typically restrict their use of cloud computing to the bag-of-tasks paradigm, that is, groupings of uncorrelated tasks.

The virtualization and cloud management layers also induce an overhead; however, the pay-per-use paradigm and the cloud's elasticity features are so attractive that users are willing to pay this extra price.

Technology advances in this field that will increase the cloud's appeal for scientific computing in the coming years are another prime area of interest.

Toward a Safer and Trusted Cloud
A key goal is to increase the security level and trust in the cloud. Indeed, the virtualization layer, including the hypervisor and various device drivers, is the source of several newly discovered vulnerabilities. Standard subcontracting approaches involve a service-level agreement (SLA) and trusting the subcontractor. However, trusting the subcontractor's other customers, which is somehow implicit when sharing hardware with them, is rather new and unusual. Dedicated hardware coupled with trusted platform modules will help build chains of trust and attract more customers to the cloud.
Confidentiality and privacy are also of primary concern to many cloud customers. Many applications transport much more information than required because they process data centrally, leading to potential data leakage. Decentralized approaches should let applications call remote services, keep the data where it's produced, and return just the requested result, no more, no less. New cryptographic data processing will allow applications to process data without uncovering unnecessary information.7

Among other issues, the abundance of information and the opportunity to cross-compare it enable
us to rebuild original, missing information. For example, recent stories have reported successful attempts to trace the names of anonymous genetic sequences simply by looking at publicly available information, such as the study location and local phone books. An enhanced legal framework and recommendations are required to bring customers peace of mind. Such frameworks have started to appear,8 but the technologies needed to enforce such rules require further development.

Cloud management techniques must also be improved. Many recent publications have highlighted the problem's multiobjective nature: minimizing cost and energy while maximizing resilience. These aspects are currently handled at various levels, from hardware to middleware to application. Decisions at the various levels could be contradictory, or they might unnecessarily reinforce some requirements, for example, duplicating resources used for fault tolerance. This certainly calls for cross-layer approaches, such as hardware-software codesign, and for the various research communities to join forces.
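The multiobjective trade-off just described (minimize cost and energy, maximize resilience) is commonly handled by keeping only the Pareto-optimal configurations, those not beaten on every objective at once. A minimal filter, with an invented tuple layout of (name, cost, energy, resilience), might look like:

```python
def pareto_front(configs):
    """Return configurations not dominated on (cost, energy, resilience).
    Lower cost and energy are better; higher resilience is better."""
    def dominates(a, b):
        # a dominates b if it is no worse on every objective and strictly
        # better on at least one.
        no_worse = a[1] <= b[1] and a[2] <= b[2] and a[3] >= b[3]
        strictly = a[1] < b[1] or a[2] < b[2] or a[3] > b[3]
        return no_worse and strictly

    return [c for c in configs
            if not any(dominates(o, c) for o in configs if o is not c)]
```

A cross-layer manager would then pick among the surviving configurations by policy, rather than letting each layer optimize one objective in isolation.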

The Last Mile

Ubiquity of services, anywhere, anytime, will be reflected in the cloud's expansion to the mobile devices required to meet the challenges of the last mile.9 At the other end, the cloud is expected to provide the necessary backbone to the Internet of Things. Sensors are now everywhere, enabling new trends of continuous monitoring of individuals provided by fancy hardware such as smart watches, clothes, and glasses. The cloud should let us mine our quantified selves.

Service roaming will let users move from one cloud to another, regardless of the service provider. This trend, initiated by the need for multitenant clouds, will also favor cloud brokering, which aims to find the best match between customers and providers. Cloud brokering will also help create higher-value services by combining services from various providers.

But the client side also becomes more demanding with the appearance of 4K TV, Qualcomm's NexCave, and other emerging high-resolution 3D screens such as the University of California, San Diego (UCSD) SCOPE (Scalable Omnipresent Environment) project, smart surfaces, and smartboards. Thus, the cloud's last mile also requires broad connections to support these technical challenges. To put the cherry on the cake, the new generation of applications will require not only large bandwidth, but also low latency (for gaming, HPC, remote control of robots, and so on).

Other Areas of Development

There is certainly still room in the cloud computing paradigm for theoretical development. The cloud is driving new needs. The many underlying sets of ontologies describing the data form complex networks that can be described using hypergraphs, which are known to turn simple problems into NP-hard ones.10 We need new algorithms and heuristics to meet these new challenges. Also, with millions of devices interconnected through the cloud, some of which will certainly fail, there is a critical need for stochastic and fault-tolerant approaches.
On the economic side, the paradigm change brought by the cloud induces new business models, empowering small- and medium-sized companies, as well as individuals, to operate worldwide businesses. These new business models, coupled with the opportunities of microcredit and community funding, allow anyone to have a major impact. Now anyone can potentially accomplish what only corporations could do in the past.

Finally, because major findings will likely come from multidisciplinary research, major world changes will likely emerge from a mix of technology, law, and economics. This is illustrated in recent actions to fight global warming and in the grid computing community's attempts to provide a sound business model. For example, the government of Luxembourg passed a draft law guaranteeing the conservation of data in the event of a local provider's bankruptcy.
Because clouds are distributed across many countries, international laws and regulations also play a key role. Classical ways of dealing regionally with copyright for technologies, such as zoning, don't hold in distributed cloud services. Watching the emergence of new international laws and regulations facilitating the use of the cloud will also be of prime importance.

IEEE Cloud Computing calls for the academic and research communities to provide exciting articles on emerging paradigms in cloud and adjacent technology trends and their impact on how we perceive and use the cloud in the short, medium, and long term.
References
1. J. Lederberg and A. McCray, "Ome Sweet Omics: A Genealogical Treasury of Words," The Scientist, vol. 15, no. 7, 2001, p. 8.
2. J.C. Roach et al., "Analysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing," Science, vol. 328, no. 5978, 2010, pp. 636–639; www.ncbi.nlm.nih.gov/pubmed/20220176.
3. B. Neumann et al., "Phenotypic Profiling of the Human Genome by Time-Lapse Microscopy Reveals Cell Division Genes," Nature, vol. 464, no. 7289, 2010, pp. 721–727; www.ncbi.nlm.nih.gov/pubmed/20360735.
4. S. Samsi, A.K. Krishnamurthy, and M.N. Gurcan, "An Efficient Computational Framework for the Analysis of Whole Slide Images: Application to Follicular Lymphoma Immunohistochemistry," J. Computer Science, vol. 3, no. 5, 2012, pp. 269–279; www.ncbi.nlm.nih.gov/pubmed/22962572.
5. V. Mayer-Schonberger and K. Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think, Eamon Dolan/Houghton Mifflin Harcourt, 2013.
6. E. Lee, Cyber Physical Systems: Design Challenges, tech. report UCB/EECS-2008-8, Univ. of California, Berkeley, 23 Jan. 2008.
7. M.D. Ryan, "Cloud Computing Security: The Scientific Challenge, and a Survey of Solutions," J. Systems and Software, vol. 86, no. 9, 2013, pp. 2263–2268.
8. Committee on Strategies for Responsible Sharing of Clinical Trial Data, Board on Health Sciences Policy, Institute of Medicine, Discussion Framework for Clinical Trial Data Sharing: Guiding Principles, Elements, and Activities, Nat'l Academies Press, 2014.
9. S. Fowler, "Survey on Mobile Cloud Computing: Challenges Ahead," IEEE CommSoft E-Letters, vol. 2, no. 1, May 2013.
10. C. Berge, Hypergraphs: Combinatorics of Finite Sets, North Holland Mathematical Library/Elsevier, 1989.

PASCAL BOUVRY is a professor in the Computer Science and Communication research unit of the Faculty of Science, Technology and Communication at the University of Luxembourg and a faculty member at the Luxembourg Interdisciplinary Center of Security, Reliability, and Trust. His research interests include cloud and parallel computing, optimization, security, and reliability. Bouvry has a PhD in computer science from the University of Grenoble (INPG), France. He is on the IEEE Cloud Computing editorial board. Contact him at pascal.bouvry@uni.lu.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.


CLOUD ECONOMICS

The Costs of Cloud Migration
Omer Rana, Cardiff University

The cost of outsourcing computing infrastructure requires consideration not only of potential savings in operational and capital expenditures, but also of human and management costs.

Cloud computing is one of the most potent examples of how we can use computing as a utility. The ability to outsource computing infrastructure to one or more providers, with varying levels of trust, has allowed various companies to successfully adopt the cloud computing utility model. Smaller companies unable to afford an in-house computing infrastructure (and operational support for maintaining and managing such an infrastructure) are often cited as potential beneficiaries of this model. By migrating to a cloud infrastructure, such companies can save operational expenditures and focus on their core business rather than their computing systems.

However, the actual costs, especially for long-term use of an outsourced computing infrastructure, can be unfavorable for smaller companies. Of course, human issues can also influence cloud migration decisions, such as concern about job security for existing in-house administrative staff and lack of understanding of how the provider operates. By outsourcing, many companies surrender important systems management skills that they could have developed in house.

Understanding the true cost of outsourcing infrastructure therefore requires more detailed consideration than many organizations undertake when performing cost-saving analyses to decide whether cloud migration would benefit them.
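Such a cost-saving analysis can be sketched as a net-present-value comparison of yearly cash flows, where in-house capital expenditure recurs at each hardware refresh while cloud charges recur every year. The cost categories and all figures below are illustrative assumptions, not data from the column:

```python
def npv(cashflows, rate):
    """Discount a list of yearly costs (year 0 first) to present value."""
    return sum(c / (1 + rate) ** t for t, c in enumerate(cashflows))

def in_house_cost(years, capex, refresh_every, opex, staff):
    """Capex recurs at each hardware refresh; opex and staff recur yearly."""
    flows = []
    for t in range(years):
        c = opex + staff
        if t % refresh_every == 0:   # initial purchase, then each refresh
            c += capex
        flows.append(c)
    return flows

def cloud_cost(years, instance_hours, hourly_rate, egress_gb, egress_rate, mgmt):
    """Pay-per-use charges plus residual in-house management effort."""
    return [instance_hours * hourly_rate + egress_gb * egress_rate + mgmt
            for _ in range(years)]
```

Comparing `npv(in_house_cost(...), r)` with `npv(cloud_cost(...), r)` over the planning horizon makes explicit that the verdict flips with the horizon, the refresh cycle, and the residual management cost, which is the article's point about long-term versus short-term savings.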

A Multicriteria Economic Perspective

Taking a wider, multicriteria economic perspective is essential to ensuring that sustained, long-term use of cloud systems makes sense. Such analyses also offer important insights into which part of the local system should be moved to an external provider and which should remain in house. In this context, it's also useful to understand the organizational changes that cloud computing would generate. For instance, how would departments within a company deal with pay-as-you-go pricing?

Economic models therefore play an essential role in determining whether such outsourcing is likely to benefit the company. They strongly influence the decision of many companies as to whether they'll migrate (either partially or fully)
their infrastructure or services to a cloud provider. Such decisions factor in issues of pricing/cost, reputation/trust, performance/availability, energy savings, and security and privacy. Each of these decision factors impacts both shorter-term revenue and cost savings, and longer-term reputation and strategic operation.

Cloud providers also need to estimate the cost of provisioning infrastructure and services to clients, accounting for their own operational and capital expenditures, as well as potential reputation concerns (such as how potential clients perceive them in terms of reliability and their ability to deliver what they advertise) that impact their long-term survivability in the marketplace. Energy costs are increasingly important in this equation for many cloud providers and have influenced where they build their datacenters as well as potential alliances with energy providers offering special pricing.

There is also often a cloud supply chain, in which a single company uses services that are provisioned by others (in various service mashups), as well as associated dependencies within the supply chain. For instance, a company might run its own website but outsource storage to an infrastructure-as-a-service (IaaS) provider, establishing mutually beneficial service-level agreements (SLAs) that provide financial security for the company running the website (allowing it to establish penalty clauses that could lead to crediting customers in case of unavailability, for instance). Are users who access such websites fully aware of the different providers in the supply chain? Do service providers fully disclose their dependencies within their supply chains to their users/customers? Brokers play an important role in establishing these supply chains, matchmaking service requests to providers based on factors such as cost, operational history (for example, uptime and availability), and user feedback.
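A broker's matchmaking step can be reduced to a weighted scoring of exactly the factors just listed: cost, operational history, and user feedback. The record fields, weights, and normalization below are invented for illustration; real brokers would also filter on hard SLA constraints before scoring.

```python
def rank_providers(providers, weights=(0.5, 0.3, 0.2)):
    """Order providers by a weighted score of price (lower is better),
    historical uptime, and user rating out of 5 (higher is better)."""
    w_price, w_uptime, w_rating = weights
    max_price = max(p["price"] for p in providers)

    def score(p):
        return (w_price * (1 - p["price"] / max_price)   # normalized cheapness
                + w_uptime * p["uptime"]
                + w_rating * p["rating"] / 5)

    return sorted(providers, key=score, reverse=True)
```

Shifting the weights encodes the customer's priorities; a price-insensitive customer might weight uptime most heavily and get a different match from the same provider pool.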

Pricing and Usage Models

Understanding how external infrastructure and service platforms should be compared also remains a challenge for companies. Difficulty arises when cloud providers use different names/terms for computing and storage resources, or bundle provider-specific services, making an overall comparison of providers a nontrivial process. Today, a limited number of providers dominate the market, offering users a range of configurable options.

Understanding how resource requirements can map to products from such providers, often available in a range of pricing bands (current versus older instances) and market models (spot market versus reserved instances, and so on), often requires input from economic and technical experts working in collaboration. Standards (see the StandardsNow column in this magazine) play an important role in creating suitable terminology that can be shared across providers. In the research community, there

is also significant interest in providing auction-based models for improving utilization of spare
capacity in the cloud market. Simulation-based
approaches are generally used to demonstrate the
benefit of these auction-based models and how they
can improve utility for both consumers (cheaper resources) and providers (increased utilization of an
otherwise rarely used resource). However, in practice, cloud providers have been reluctant to adopt
auction-based models. Why are most cloud providers unwilling to offer underutilized capacity in an
auction market?
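For readers who want to experiment, the following toy simulation shows the kind of mechanism such studies model: a uniform-price, sealed-bid auction that allocates spare instance-hours to the highest bidders. The clearing rule and all figures are assumptions for illustration, not any provider's actual mechanism.

```python
# Toy uniform-price auction for spare capacity: bidders submit
# (bid, quantity, id); capacity goes to the highest bids, and every
# winner pays the lowest accepted bid (the clearing price).
# All figures are illustrative.
def clear_auction(bids, capacity):
    winners, price = [], 0.0
    for bid, qty, who in sorted(bids, reverse=True):   # highest bid first
        if capacity <= 0:
            break
        granted = min(qty, capacity)
        winners.append((who, granted))
        capacity -= granted
        price = bid                                    # last accepted bid
    return winners, price

bids = [(0.05, 40, "u1"), (0.09, 30, "u2"), (0.03, 50, "u3"), (0.07, 20, "u4")]
winners, price = clear_auction(bids, capacity=80)
# u2 gets 30, u4 gets 20, u1 gets 30 of its 40; clearing price is 0.05
```

Every winner pays the same clearing price, one of the properties such simulations use to argue improved utility for consumers while the provider monetizes otherwise idle capacity.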
The multitenancy nature of cloud computing
is often a leading reason for revenue generation by
cloud providers. By sharing common aspects of an
infrastructure (using virtualization technologies)
across multiple users, providers can benefit from
economies of scale (and management efficiency).
However, resource sharing across users also introduces
a key limitation if used inefficiently or incorrectly.
Interference between virtual machine (VM) instances, potential data leakage (due to dirty disks), and VM/hypervisor escape (that is, when a VM goes beyond its defined boundaries and interacts directly with the operating system and other VMs) could negatively affect the potential cost benefits for customers. It's important to understand how such data hosting risks can be quantified and presented to users (and providers), so users can factor in these risks when choosing a cloud provider.
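One simple way to make such risks comparable across providers is an annualized expected-loss estimate: the probability of each incident multiplied by its financial impact. The incident classes, probabilities, and impacts below are invented purely for illustration.

```python
# Hypothetical annualized expected-loss estimate for data-hosting risks.
# Incident probabilities (per year) and dollar impacts are made up.
risks = {
    "vm_interference":      {"p": 0.30,  "impact": 2_000},
    "data_leak_dirty_disk": {"p": 0.02,  "impact": 50_000},
    "hypervisor_escape":    {"p": 0.001, "impact": 500_000},
}

def expected_loss(risks):
    # Sum of probability-weighted impacts, in dollars per year.
    return sum(r["p"] * r["impact"] for r in risks.values())

annual_loss = expected_loss(risks)   # roughly $2,100/year in this sketch
```

A figure like this, published per provider, would let customers weigh a cheaper but riskier offering against a dearer, better-isolated one.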
Existing cloud providers generally require users
to register for their services using a credit card, and
the number of instance hours used is charged to
that card. With the emergence of digital currencies
such as Bitcoin, cloud providers might begin offering
exchangeable credit schemes. With this approach,
users could purchase several instance tokens from a cloud marketplace and redeem them at a number of different providers. The mapping between token value and the number of instance hours received could vary depending on the popularity of (and demand for) offerings from a particular provider. More popular providers could charge more tokens for their services than less popular providers, thereby creating a marketplace driven by supply-demand principles. A token exchange, in which unused tokens are auctioned to potential bidders, could create a dynamic marketplace for services/resources.

Another example is the use of provision-point (assurance) cloud provider contracts, in which members of a group pledge to contribute to an action if some prespecified threshold condition is met. Once the threshold point is passed, the action is taken based on the advertised capability (the service or product is delivered); otherwise, no party is required to perform the action and any fees paid are refunded. Examples of this include offers from daily-deal websites (making special-price offers of products/services that must be accepted within a limited time frame). Such contracts allow cloud providers to reduce operational expenditure through economies of scale. It's essentially a form of advance best-effort lease (that is, the provider can't guarantee in advance that a resource will be available). The consumer requests a resource to use in the future, but it is not guaranteed. The provider only delivers the resource if it can benefit from having a certain minimal number of consumers.1

Emerging Issues

Recently, interest in the creation of cloud havens (that is, cloud computing service providers operating in countries with lax or nonexistent security and privacy regulations) creates the potential for the coexistence of multiple marketplaces (with varying degrees of guarantees provided on the privacy of the stored data). Understanding how these cloud havens impact a cloud marketplace, how they influence the operation of cloud providers, and how customers perceive them remain interesting research questions.

Interest in distributed clouds, which attempt to integrate datacenter and edge device capabilities, along with the potential of using software-defined network models for accessing network components such as routers and switches (using GENI OpenFlow, for instance), opens up new possibilities for cloud economics. The marketplace can now extend beyond the datacenter to devices owned by individuals or consortia, along with access to services made available by backbone network providers. Understanding how resource sharing and revenue models can factor into such multilayer capability remains an important research challenge for the future.

The highly configurable nature of cloud computing, especially software-as-a-service (SaaS) offerings, suggests the potential for new business models that have yet to be fully realized. According to a Gartner report, global spending on SaaS will likely reach US$250 billion by 2017.2 How much current SaaS offerings will need to change to reach such spending targets is unclear. Currently, SaaS business models are often similar to traditional models for selling software products (or licensing), and customers have limited ability to negotiate or alter these models. Existing SaaS vendors also limit their products' configurability, requiring clients to use standard capabilities offered by the vendor. This limits the benefit to the client of using the SaaS capability. Delivery of software capability through the Internet (hence, outsourcing deployment and hosting) could offer a range of potential capabilities, such as

• combining software hosting and development as a key part of the business offering;
• establishing long-term strategic relationships with customers through negotiated provisioning agreements; and
• providing subscription-based pricing that attempts to understand how consumers use the system, and how providers can use data from such usage to enhance their offerings.

How companies integrate SaaS capability into their existing in-house systems also remains an important question. For instance, are SaaS offerings primarily obtained for services that are seen as noncritical for a company's operation? Of course, it is useful to remember that cloud service consumers aren't just companies, but also individuals. Should business models for consumers differ from those for providers? How different should they be?


Call for Submissions


Cloud Computing magazine seeks submissions in
a variety of areas, covering both micro- and macro-economic issues. Authors are encouraged to
contribute articles demonstrating novel thinking on how economic models adapted from other
domains could improve the use of multilayered
cloud systems, and how novel resource provisioning strategies could lead to the development of new
economic models. Authors are also encouraged to
report on the practical use of economic models
for provisioning cloud services/infrastructure and
their experiences using pricing/economic models
from commercial providers.
Cloud Computing magazine also seeks articles
covering aspects of risk, especially risk from an economic perspective, given new cloud providers (some
of which use services from more established companies, such as Amazon and Google). Risk-assessment
strategies are often influenced by criteria that have
a particular bearing on a company's operation (because a risk-versus-opportunity assessment is often
needed). Different users are therefore likely to place
varying degrees of emphasis on particular factors
that impact their operation.
Brokerage-based risk assessment also remains
an important challenge. Here, intermediate brokers
can consider concerns relevant to particular users
or companies and subsequently compare these with
offerings (at particular price bands) from various
cloud providers. It's also important to note that an organization's ability to understand such risk factors can change with maturity, that is, with how long it has been using cloud services.

These are still early days for the cloud computing research and development community, in particular in how this community perceives and uses utility computing. The IEEE/ACM Utility and Cloud Computing conference, for instance, is only in its seventh year, compared with conferences in areas such as parallel computing that have been occurring for decades; for example, the International Conference on Parallel Processing is now in its 43rd edition. There is still plenty of room for
innovation. With improved understanding of how
cloud computing systems and services are used in
practice, intermediate (brokerage) organizations
can find numerous opportunities for interacting
with users and cloud providers. Increasing interest
and adoption of cloud standards can also be an important catalyst for generating a more sustainable
cloud market.
References
1. O. Rogers, Improved Public Cloud Capacity Planning through the Sale of Options, Forwards and Provision Point Contracts, PhD thesis, Dept. of Computer Science, Bristol Univ., UK, 2013.
2. Gartner, Forecast: Public Cloud Services, Worldwide, 2011-2017, 4Q13 Update, 26 Dec. 2013; www.gartner.com/doc/2642020/forecast-public-cloud-services-worldwide.

OMER RANA is a professor of performance engineering in the School of Computer Science and Informatics at Cardiff University. He also currently acts as an advisor to CBNine, a company specializing in cloud computing for the architecture, engineering, and construction sector. His research interests include high-performance distributed computing, data analysis/mining, and multiagent systems. Rana has a PhD in neural and parallel computing from Imperial College (London University). He is a member of the IEEE Cloud Computing editorial board and a member of IEEE. Contact him at o.f.rana@cs.cardiff.ac.uk.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.

CLOUD MANAGEMENT

Challenges in Cloud Management

J.P. Martin-Flatin, EPFL

In addition to generic management concerns, cloud management poses particular challenges for researchers and practitioners in areas such as scalability, interoperability, and security.

One of the topic areas covered by this new magazine is cloud management. In this short article, I briefly present the rationale for managing clouds, explain what management in general is about, and describe several challenges posed by cloud environments.

Rationale for Managing Clouds


The key difference between test and production environments is the dependability of the systems and services that they offer. Test labs provide only best-effort services. In contrast, production environments rely on management systems and support staff to offer service guarantees (such as no service downtime of more than 15 minutes during business hours).

These support teams implement processes and follow procedures known as operations. Management systems, known as operations support systems in the telco world and management platforms in the Internet world, are key enablers of operations.1

When new services go into production, management systems and support staff are put in action to ensure that these services are delivered with the expected quality in terms of robustness, performance, and so on. If service-level agreements (SLAs) are specified, these guarantees are translated into measurable quality metrics such as maximum downtime, average uptime per month, maximum response time, average response time per hour, or maximum time to detect and block an intrusion.

Cloud management deals with the operations of cloud infrastructures (software and hardware) and cloud services, and the enforcement of SLAs. The latter specify cloud quality metrics, including user-perceived quality of cloud services.

Some aspects of cloud management are generic management concerns; others pose challenges that are specific to (or particularly acute in) cloud environments.

IEEE CLOUD COMPUTING, PUBLISHED BY THE IEEE COMPUTER SOCIETY. 2325-6095/14/$31.00 © 2014 IEEE

Generic Management: FCAPS

Since the early 1990s, the Internet world has used for management a terminology that originated in the telecommunications world.2 Management functions are traditionally broken down into five areas:3,4 fault management, configuration management, accounting management, performance management, and security management. These five areas are known as FCAPS, for the first letter of each area. Whether dealing with IP network management, server management, Web service management, or cloud management, the same terminology is used for grouping different management functions into coherent sets.
Fault management deals with malfunctioning
cloud resources and services (such as when a server is down or a service is unavailable). It processes
monitoring data and user complaints to detect
anomalies, correlate them, find their root causes,
trigger corrective actions, and restore the systems
or resume the services. The fault management subsystem ensures that cloud resources and services are
up and running, but it does not check whether they
perform satisfactorily.
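As a toy illustration of the detect-and-correlate step described above (the heartbeat threshold and event records are invented for this sketch):

```python
# Minimal fault-detection sketch: mark a resource as down after N missed
# heartbeats, then correlate user complaints with the suspected resources.
# The threshold and all records below are invented for illustration.
MISSED_LIMIT = 3

heartbeats = {"web-1": 0, "web-2": 4, "db-1": 1}   # consecutive missed beats
complaints = [{"user": "u7", "service": "web-2"}]

# Detection: anything past the missed-heartbeat threshold is considered down.
down = [r for r, missed in heartbeats.items() if missed >= MISSED_LIMIT]

# Correlation: complained-about services that are also down are root-cause
# candidates for triggering corrective action.
root_causes = [c["service"] for c in complaints if c["service"] in down]
```

Real fault-management subsystems do far more (event correlation, alarm suppression, automated recovery), but the pipeline shape, monitor, detect, correlate, act, is the same.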
Configuration management is about setting the
parameters that control all cloud resources and services: IP addresses, number of small buffers for a database, virtual machine (VM) image to be launched
for running a given service, and so on. Configuration parameters are usually stored in databases and
pushed to production software and hardware. The
configuration management subsystem plays an essential role in guaranteeing robustness, for instance
by making it possible to resume an entire production
environment from scratch in a new datacenter.
Accounting management is typically associated
with charging (which is handled by business support
systems in the telco world), but it also deals with
usage statistics independent of monetary aspects.
Fine-grained accounting management is crucial for
public cloud providers as most of them charge on a
pay-per-use basis, with small margins, and rely on
high volumes for generating profits. As a result, even
small errors in accounting management systems can
turn expected profits into actual losses.
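A back-of-the-envelope calculation shows how thin the tolerance is; every figure below is an invented assumption, not data from any provider.

```python
# With pay-per-use billing at thin margins, a small systematic under-metering
# error can wipe out the profit. All numbers are illustrative.
hours_billed   = 10_000_000      # instance-hours actually consumed
price_per_hour = 0.10            # dollars charged per instance-hour
margin         = 0.04            # 4% profit margin on revenue

profit_ideal = hours_billed * price_per_hour * margin          # ~$40,000

metering_error = 0.05            # 5% of usage silently unrecorded
revenue_lost = hours_billed * metering_error * price_per_hour  # ~$50,000

profit_actual = profit_ideal - revenue_lost                    # negative: a loss
```

With a 4 percent margin, a 5 percent metering error turns the expected profit into a loss, which is the point made above.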
Performance management consists in regularly
checking the health of all cloud resources and services, and in making sure they perform as expected
by comparing quality measurements (extracted from
monitoring data and user complaints) with a baseline. In performance management, a normal system
is not just a system that is up and running, but a
system that provides the expected service with the
expected quality. When services do not fulfill their
SLAs, the performance management subsystem is in
charge of triggering corrective actions by interacting
with the fault management subsystem, the configuration management subsystem, and so on.
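A minimal sketch of such a baseline comparison might look as follows; the metric names and targets are invented, not taken from any real SLA.

```python
# Compare measured quality metrics against SLA targets and report which
# violations should trigger corrective action. Metrics and targets invented.
sla      = {"uptime_pct": 99.9,  "avg_response_ms": 200, "max_response_ms": 1000}
measured = {"uptime_pct": 99.95, "avg_response_ms": 340, "max_response_ms": 900}

def sla_violations(sla, measured):
    violations = []
    # Uptime must be at least the target; response times at most the target.
    if measured["uptime_pct"] < sla["uptime_pct"]:
        violations.append("uptime_pct")
    for metric in ("avg_response_ms", "max_response_ms"):
        if measured[metric] > sla[metric]:
            violations.append(metric)
    return violations

violated = sla_violations(sla, measured)   # only the average response time
```

Here the system is "up" yet still abnormal in the performance-management sense: uptime meets the SLA, but the average response time does not.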
Security management guarantees the security of
all cloud systems and services, as well as the cloud
management systems themselves. For instance,

FIGURE 1. Multiprovider public clouds (a stack spanning users, SaaS, PaaS add-ons, PaaS, IaaS, networks, and datacenters). Unlike private clouds, they span multiple management domains and require a complete rethinking of management processes and procedures.

the security management subsystem is expected to


detect and block intrusions and denial-of-service
(DoS) attacks using firewalls and other tools. In
contrast to information security terminology,5 where
security is based on the confidentiality-integrity-availability triangle, security management does not
deal with availability, which falls into the realm of
fault management.
Management functions can be classified in many
other ways. In particular, a number of corporations
and government agencies follow the recommendations of the Information Technology Infrastructure
Library (ITIL)6 to structure their IT service management process. To date, however, ITIL and public
clouds rarely coexist because the former increases
costs and bureaucracy for the sake of robustness,
whereas the latter strives to decrease costs, increase
agility, and facilitate the swift deployment of new
functionality at the expense of robustness.

Cloud Management: Six Challenges


Cloud environments pose many management challenges, including versatility, scalability, automation,
interoperability, security, and diagnosis.
Versatility
Behind the term "cloud" lurk polymorphic realities
that pose radically different management problems.
Managing a private cloud based on VMware,
Xen, or the Linux kernel-based VM (KVM) and deployed in a single datacenter is not vastly different
from managing a cluster in a noncloud environment.
You need new software, but management processes
and procedures are essentially the same.
However, as depicted by Figure 1, managing systems and services in a multiprovider public cloud is
entirely different. End users interact with software-as-a-service (SaaS) from different providers, each service might run on a different platform-as-a-service (PaaS) such as Heroku, CloudBees, or EngineYard, and each platform might use a different underlying infrastructure-as-a-service (IaaS) such as Amazon Web Services or CloudSigma. In such environments, you have to cope with multiple management domains and monitoring data silos. As a result, you must rethink your management processes and procedures from the ground up.

Other cloud environments include single-provider public clouds (such as Google App Engine and Google Compute Engine, or Microsoft's Windows Azure), hybrid clouds for enterprise users (comprising, for example, a private cloud on premise and a public cloud provided by a telco), and public clouds for smartphone users (also known as mobile clouds). Cloud management processes, procedures, and systems have little in common in these different cloud environments.

Scalability
In the precloud days, small computer rooms were the rule and large datacenters the exception. With public clouds, the exact opposite is true: large cloud datacenters have become the norm. Having more than 100,000 servers in a single cloud datacenter is increasingly common, whereas having 10,000 was considered exceptional a decade ago. Such an increase in the average datacenter size implies a tremendous increase in measurement points, and hence in the amount of monitoring data to be collected, transferred to the analytics engines, and processed to monitor performance, detect faults and security breaches, and so on.

Precloud IT analytics solutions are not suited to public clouds. To scale to XXL-size cloud datacenters, new management platforms need to be architected, designed, and implemented from scratch. Precloud IT analytics is dead; long live big IT analytics!

Automation
Automation is a classic solution for making management more scalable. Cloud management tools are therefore increasingly adopting techniques that emerged a few years ago under various names: self-managing systems, self-adaptive systems (with local-loop adaptation), self-organizing systems, and autonomic computing. In turn, this adoption is encouraging further work in areas such as self-configuration, self-repair, and self-optimization.

For multiprovider public clouds, the need for more automation is also driven by the development and operations (DevOps) trend, which is becoming widespread in SaaS engineering. DevOps advocates tight links between software development and operations, thereby addressing the "management as an afterthought" syndrome. For example, when software engineers design and produce software, DevOps encourages them to adapt their software at design time to facilitate its operation when it is eventually put into production. At first sight, DevOps looks great; the reality, however, is a bit more prosaic.

These days, the majority of SaaS providers are very small companies that worship two gods: Scrum and DevOps. For them, adopting DevOps means asking the same people to develop SaaS software and support its operation. The sad reality, however, is that few people are outstanding in both software engineering and operations. Darwinism ensures that poor software developers soon disappear from the SaaS market, so we are left with people who, in their vast majority, are not experts in cloud management. When software developers do not know how to instrument their applications to provide enough monitoring information to analytics engines, they rely on management software to do it for them automatically. This market segment is growing rapidly. In the past few years, it focused on monitoring (as with NewRelic). More recently, it began turning to diagnosis, too.

Interoperability
Shortly after people start deploying new services in public clouds, they usually want to integrate them with other services: some running in public clouds, others in private clouds, and yet others in noncloud environments. The first lesson they learn is that systems integration costs are non-negligible. Their second lesson is that anyone can replace one cloud service provider with another, but when doing so, vendor lock-in issues and high migration costs abound.

The solution to these problems is cloud interoperability. As usual in IT, the public cloud market began in a rather anarchic manner, with


many newcomers rushing to the market and


focusing on just one thing: functionality. Now that
the market is maturing and consolidating, people are
gradually reengineering their cloud services to take
nonfunctional aspects into account (a trend known
as SaaS industrialization in the context of SaaS). A
key nonfunctional aspect is interoperability.
Cloud interoperability is primarily driven by
two factors: customer demands and standards. The
former is something that cloud service customers will
learn over time, either the hard way (by trial and error) or through education. The latter is presented in the "Setting Standards in a New World" article and the StandardsNow department's "Defining Our Terms." Some of the first standards that were
released in the cloud management arena
pertain to interoperability. They include
the Open Cloud Computing Interface
(OCCI)7 and the Cloud Infrastructure
Management Interface (CIMI).8 Another
de facto standard is OpenStack.

Security
In the past few years, cloud security has received considerable attention in the press. Data security is one of the main concerns of people who remain hesitant about using public clouds (see the "Securing Cloud Infrastructure, Services, and Content: An Overview of Current Methods" introduction).

In cloud management, security also covers other aspects. In public clouds, for example, monitoring data needs to be transferred from the monitored entity to the analytics engine. How can you secure all these monitoring data exchanges that go across public cloud datacenters, telco networks, Internet backbone switches, transoceanic optical fibers, and so on? Security problems also abound in multiprovider public clouds, when multiple providers are involved in the root-cause analysis of a given problem.

Diagnosis
Diagnosing the causes of a performance problem, a fault, or a security problem requires access to monitoring data, and a lot of it in the case of cloud environments. In private clouds, all cloud resources and services normally run in a single administrative domain, that is, under the control of a single administrative entity that enforces its own management policy. Issues such as access control (who can access what monitoring data?) are therefore easy to solve. The situation is quite different when multiple providers are involved, because we have multiple administrative domains, and thus several administrative entities that enforce their own management policies, irrespective of others.

Suppose, for instance, that a user complains about a service, say, a SaaS running on a PaaS. That SaaS accesses a Database-as-a-Service (DaaS) that is provided as a PaaS add-on. Both the PaaS and the PaaS add-on run atop the same IaaS. Assume that the user's problem is due to the PaaS add-on, which heavily uses a network switch shared by multiple customers, and the switch saturates due to a software design mistake in the PaaS add-on. So, we have at least five companies involved here: the user's employer, the SaaS provider, the PaaS provider, the PaaS add-on provider, and the IaaS provider. For the sake of clarity, suppose that the multiple customers mentioned above all use the same SaaS provided by the same supply chain. How do you perform root-cause analysis automatically across five management domains? How can cloud providers exchange enough monitoring data to debug and solve such a problem, without incurring the risk of sensitive information leaks by sharing too much monitoring data?

Cloud management poses interesting challenges to researchers and practitioners alike. We need to improve the state of the art in this field by testing new approaches, publishing experience reports, and sharing the lessons learned. Step by step, we need to assemble a corpus of best practices for managing different types of cloud environments. This new magazine offers a unique forum for sharing experiences and know-how, and your contributions are very welcome!
References
1. Recommendation M.3400, TMN Management Functions, Int'l Telecommunication Union, 1992.
2. Recommendation X.700, Data Communication Networks - Management Framework for Open Systems Interconnection (OSI) for CCITT Applications, Int'l Telecommunication Union, 1992.
3. H.G. Hegering, S. Abeck, and B. Neumair, Integrated Management of Networked Systems: Concepts, Architectures, and Their Operational Application, Morgan Kaufmann, 1999.
4. M. Sloman, ed., Network and Distributed Systems Management, Addison-Wesley, 1994.
5. ISO/IEC 27000: Information Technology - Security Techniques - Information Security Management Systems - Overview and Vocabulary, 2nd ed., Int'l Org. for Standardization (ISO)/Int'l Electrotechnical Commission (IEC), 2012.
6. ITIL Service Operation, UK Office of Govt. Commerce, 2011.
7. T. Metsch and A. Edmonds, eds., Open Cloud Computing Interface - Infrastructure, GFD-P-R.184, errata update, Open Grid Forum, June 2011.
8. D. Davis, G. Pilz, and A. Zhang, eds., Cloud Infrastructure Management Interface (CIMI) Primer, DSP2027, v. 1.0.1, Distributed Management Task Force, 2012.

J.P. MARTIN-FLATIN is an Academic Guest at EPFL, Switzerland. Throughout his career, he has worked alternatively in research and the real world, often acting as a bridge between academia and industry. His research interests include integrated management, self-managing systems, big IT analytics, and cloud troubleshooting. Martin-Flatin has a PhD in communication systems from EPFL, Switzerland. He is or has been on the editorial boards of IEEE Transactions on Network and Service Management (TNSM) and Journal of Network and Systems Management (JNSM). He also co-founded the SASO conference series (IEEE International Conference on Self-Adaptive and Self-Organizing Systems). Contact him at jp.martin-flatin@ieee.org.



CLOUD EXPERIENCES AND ADOPTION

Elements of Cloud Adoption

Samee U. Khan, North Dakota State University

Sharing experiences in transitioning from traditional computing paradigms to the cloud can provide a blueprint for organizations to gauge the depth and breadth of cloud-enabled technologies.

Cloud computing, a mainstream of research over the last decade, is expected to revolutionize the information and communication technology (ICT) sector. The cloud offers everything as a service (XaaS) on a pay-per-use model. Among the main incentives to adopt the cloud computing paradigm are easy and pervasive (anytime, anywhere) access to data and applications, and cost effectiveness.
Significant savings in initial capital expenditures and operational expenses inspire enterprises and businesses to adopt cloud services for their computing demands. Enterprises using the cloud don't need an enormous budget to deploy a computing infrastructure. Moreover, by implementing a cloud-based system, they can reduce running and operational costs by reducing IT staff, relieving data security and backup concerns, and lowering energy bills. Employees can access cloud-based services anywhere and anytime using handheld devices. Such pervasive and convenient access to enterprise data and applications augments employees' productivity. Furthermore, cloud computing
2325-6095/14/$31.00 © 2014 IEEE

makes computing and storage resources available on the fly, when required. Enterprises can procure and release cloud resources for short-term needs based on the pay-per-use policy.
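To make the pay-per-use argument concrete, a back-of-the-envelope comparison of renting on-demand capacity versus owning hardware can be sketched in a few lines. All prices below are hypothetical, chosen only to illustrate the break-even logic, not taken from any provider's rate card.

```python
# Break-even sketch: renting on demand vs. buying hardware.
# All prices are hypothetical, for illustration only.

def on_demand_cost(hours, price_per_hour):
    """Pay-per-use: cost scales with actual usage."""
    return hours * price_per_hour

def owned_cost(capex, hours, opex_per_hour):
    """Owned hardware: up-front capital plus running costs."""
    return capex + hours * opex_per_hour

if __name__ == "__main__":
    price_per_hour = 0.50      # hypothetical cloud rate ($/hour)
    capex = 10_000.0           # hypothetical server purchase ($)
    opex_per_hour = 0.10       # hypothetical power/admin cost ($/hour)

    # With these numbers, renting is cheaper below ~25,000 hours of use:
    # break-even = capex / (price_per_hour - opex_per_hour).
    for hours in (1_000, 10_000, 25_000, 50_000):
        rent = on_demand_cost(hours, price_per_hour)
        own = owned_cost(capex, hours, opex_per_hour)
        print(f"{hours:>6} h: rent ${rent:>10,.2f}  own ${own:>10,.2f}")
```

The short-term workloads the article describes sit far to the left of any such break-even point, which is exactly where pay-per-use wins.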
According to the US National Institute of Standards and Technology (NIST), "Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."1
As an IT buzzword, cloud computing has been defined in a variety of ways,2,3 yet most definitions include on-demand, pay-per-use, and elastic services provisioning, and access to virtually unlimited shared resources. From a system engineer's perspective, cloud computing is software implemented on a shared pool of interconnected resources in a large-scale datacenter to deliver various cloud services. Such a viewpoint ignites a debate that perhaps the
cloud was available as early as the 1990s, with Web mail as the prime example.
These conflicting and sometimes ambiguous definitions and interpretations mandate the need for a forum in which to share successes (and lessons learned) from cloud experiences and adoption. Such a forum will not only help us understand what works and what doesn't but also help move academia and industry toward this game-changing technology. Here, I describe some of the elements of cloud adoption to set the tone for the types of articles that will be of interest under the "Cloud Experiences and Adoption" emblem.

Growth Areas
In a Market Trends report, Gartner estimates that the cloud-based business services and software-as-a-service (SaaS) markets will increase from US$13.4 billion in 2011 to $32.2 billion in 2016.4 Similarly, the infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS) markets are estimated to grow from $7.6 billion in 2011 to $35.5 billion in 2016.4
In addition to supporting various operations in the business and enterprise sector, cloud computing is transforming many aspects of our social and personal lives. For instance, social networking has minimized the communication gap by helping users connect seamlessly through the cloud. The cloud also facilitates the downloading and updating of various mobile applications and allows people to easily share pictures, videos, files, and product reviews. Moreover, cloud gaming lets users play state-of-the-art online games on low-performance endpoints, such as smartphones. Not only do players have a rich set of online competitors to choose from, but all of the game processing and rendering is performed in the cloud for a real-time gaming experience.
The business sector is overwhelmingly adopting cloud computing. An IBM Institute for Business Value and Economist Intelligence Unit survey of 572 technology and business executives across the globe revealed that around three-fourths of the surveyed companies are using the cloud.5 Moreover, 90 percent of the surveyed executives expect to adopt the cloud paradigm within the next three years.5 The benefits offered by cloud computing, such as unlimited resources at nominal prices, are motivating enterprises and research organizations to use the cloud for their computation and data storage requirements. Cloud computing is also being used widely in e-commerce, agriculture, nuclear science, healthcare, smart grids, and scientific applications.6 For example, pharmaceutical company Eli Lilly executed a complex bioinformatics workload
on a 64-machine cluster within a cloud with a price tag of $6.40.7
Government agencies are also envisioning the cloud as a cost-effective and unified paradigm. In September 2009, the US government announced the Federal Government's Cloud Computing Initiative.8 The US government spends more than $76 billion annually on IT,8 an amount it expects to reduce with the adoption of cloud computing. In 2010, Recovery.gov became the first government-wide system to migrate to the public cloud. The system used the Amazon Elastic Compute Cloud (EC2) infrastructure to provide added security. According to a government report, moving Recovery.gov to the cloud saved $334,800 in 2010 and $420,000 in 2011.8 In collaboration with RightNow, the US Air Force implemented an SaaS-based solution for knowledge management, case tracking, contact center tracking, and customer survey mission needs.8 The cloud-based system empowered the Air Force to reduce manpower and save around $4 million annually. It also led to an overwhelming increase in queries to the knowledge base, to around 2 million per week, raising customer engagement to 70 percent.
The Defense Information Systems Agency (DISA)
launched the Forge.mil project to deliver a software development platform for reusing software code. Through
Forge.mil, DISA provides the tools and services necessary for rapid development, testing, and deployment
of new software and systems to the entire Department of Defense. By using this cloud-based collaborative environment and open development platform,
DISA avoided large start-up costs and increased its
return on investment (ROI) through software reuse.
DISA saves an estimated $200,000 to $500,000 per
project using the Forge.mil environment.8 Moreover,
the agency saves $15 million through software reusability and collaborative development.

Open Issues
Round-the-clock service availability is integral to cloud-based organizations. However, these automated systems are error prone. Regardless of safety measures and infrastructure robustness, many organizations have faced failures. In the cloud, downtime and failures have a huge effect. Organizations pay an average of approximately $5,600 per minute of datacenter downtime.9 For a datacenter outage with a recovery time of 134 minutes, the average loss is around $680,000.9
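A rough way to reason about this exposure is to multiply expected outage frequency, duration, and per-minute cost. The sketch below uses hypothetical figures, not the survey's averages, purely to illustrate the calculation an operator might run.

```python
# Expected annual downtime cost from outage frequency, duration, and
# per-minute cost. All input figures are hypothetical illustrations.

def downtime_cost(outages_per_year, avg_minutes, cost_per_minute):
    """Expected yearly loss from unplanned datacenter outages."""
    return outages_per_year * avg_minutes * cost_per_minute

if __name__ == "__main__":
    # e.g., two outages a year, 90 minutes each, at $5,000 per minute
    print(f"${downtime_cost(2, 90, 5000):,.0f} per year")  # prints: $900,000 per year
```

Even modest assumptions produce losses in the high six figures, which is why availability engineering dominates cloud operations budgets.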
Data privacy and security are among the foremost concerns pertaining to cloud computing. In addition to malicious threats, cloud providers receive
information disclosure requests from government
agencies and courts across the globe. Google received approximately 27,477 requests for user data as of December 2013.10 US government agencies submitted 10,574 data requests, and Google provided data for 83 percent of them, specifying 18,254 user accounts.10 Amazon Web Services (AWS) states in its service-level agreement (SLA) that "AWS reserves the right to refuse service, terminate accounts, remove or edit content in its sole discretion" (https://aws.amazon.com/terms). A UK-based insurance provider claimed to have accessed the medical records of 47 million patients to determine insurance premiums.11
However, lawmaking agencies are suggesting
various laws and opinions to protect user privacy (see
http://epic.org/privacy/cloudcomputing). The Article
29 Working Party, a privacy agency representing European Union countries, states that cloud providers
must abide by the EU Data Protection Directive.
In the US, NIST released a draft of the Guidelines
on Security and Privacy in Public Cloud Computing
for public comment. The US federal government is
vigilant about establishing security standards to secure cloud environments.8 Cloud providers are using state-of-the-art security measures to protect and
secure user data from unintentional access and use.

Cloud computing is poised to penetrate, and to a certain extent has already penetrated (or replaced), mainstream computing paradigms.
We must therefore share as much information as
possible about our experiences pertaining to the
transition from traditional computing paradigms
to the cloud. Such information dissemination will
act as a blueprint for academia, industries, governments, and funding agencies to gauge the depth and
breadth of cloud-enabled technologies. Moreover, it
will help clarify ambiguities pertaining to the definition of cloud computing and related technologies.
I strongly encourage you to consider submissions that highlight various aspects of cloud experiences and adoption. Such write-ups could be deep technical articles, surveys, cloud competitive technology articles, or position papers. As we collectively learn more about cloud computing technology, I encourage you to submit to IEEE Cloud Computing magazine and share your experiences with the rest of the scientific and industrial communities.
References
1. P. Mell and T. Grance, "The NIST Definition of Cloud Computing," NIST Special Publication 800-145, Sept. 2011; http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf.
2. L. Wang et al., "Scientific Cloud Computing: Early Definition and Experience," Proc. IEEE Int'l Conf. High Performance Computing and Comm. (HPCC 08), 2008, pp. 825–830.
3. I. Foster et al., "Cloud Computing and Grid Computing 360-Degree Compared," Proc. Grid Computing Environments Workshop (GCE 08), 2008, pp. 1–10.
4. L. Columbus, "Cloud Computing and Enterprise Software Forecast Update, 2012," Forbes, 8 Nov. 2012; www.forbes.com/sites/louiscolumbus/2012/11/08/cloud-computing-and-enterprise-software-forecast-update-2012.
5. The Power of Cloud: Driving Business Model Innovation, IBM Institute for Business Value, 2012.
6. K. Bilal et al., "A Taxonomy and Survey on Green Datacenter Networks," to be published in Future Generation Computer Systems, 2014; http://dx.doi.org/10.1016/j.future.2013.07.006.
7. M. Bockrath, Cloud Computing and Pharma: A Prescription for Success, Kelly Outsourcing and Consulting Group, 2011.
8. V. Kundra, State of Public Sector Cloud Computing, CIO Council, 2010.
9. "Unplanned IT Outages Cost More Than $5,000 Per Minute: Report," Channel Insider; www.channelinsider.com/c/a/Spotlight/Unplanned-IT-Outages-Cost-More-than-5000-per-Minute-Report-105393.
10. Google, "Transparency Report," 2014; www.google.com/transparencyreport/userdatarequests.
11. A. Keeley, "The Society Which Used Data on Every NHS Patient – and Used It to Guide Insurance Companies on Premiums," Daily Mail, 23 Feb. 2014; www.dailymail.co.uk/news/article-2566397/The-insurance-firms-buy-data-NHS-patient.html.

SAMEE U. KHAN is an assistant professor of electrical and computer engineering at North Dakota State University. His research interests include the optimization, robustness, and security of cloud, grid, cluster, and big data computing; social networks; wired and wireless networks; power systems; smart grids; and optical networks. Khan has a PhD in computer science from the University of Texas at Arlington. He is a Fellow of the Institution of Engineering and Technology (IET, formerly IEE) and a Fellow of the British Computer Society (BCS). Khan is a member of the IEEE Cloud Computing editorial board. Contact him at samee.khan@ndsu.edu.
CLOUD SERVICES

Applications Portability and Services Interoperability among Multiple Clouds
Beniamino Di Martino, Second University of Naples

Although researchers are actively seeking answers, definitive solutions to interoperability and portability issues among multiple cloud environments, especially at the PaaS and SaaS levels, remain elusive due to technical complexity and a lack of standards.

Thanks to the relative ease of managing and configuring resources and the low cost required for setting up and maintaining cloud services, cloud providers are increasingly offering new and different services and steadily incrementing the available cloud services at all levels: infrastructure-, platform-, and software-as-a-service (IaaS, PaaS, and SaaS). The current scenery's complexity is further increased by the introduction of virtual appliances, which are virtual machine images running on virtualization platforms that deliver a complete, fully functional, and immediately available appliance to users. In contrast, a cloud service is offered by a cloud computing platform that users access through the Web by exploiting some kind of interface. In many cases, virtual machines offer functionalities that are very similar to those provided by cloud services.
This context gives rise to two main issues in terms of cloud applications development: services interoperability and portability. Interoperability issues are related to how different cloud platforms and provider offerings interoperate in the presence of multiple clouds, provider federations, or, even worse, cloud providers who won't federate. Portability in the cloud refers to two different but strictly interlinked aspects: legacy software's modernization aimed at exploiting current cloud-based technologies, and the portability of cloud-ready applications among different cloud platforms and providers. These issues affect the cloud computing landscape

in different ways. Brokering, negotiating, managing, monitoring, and reconfiguring cloud services are challenging tasks for cloud application developers and, as a consequence, for the users of those applications.

Promising Approaches and Technologies


Introducing an upper layer of abstraction improves the portability and reusability of cloud resources and services among several clouds. Indeed, even if a system is designed for a specific platform (including framework, middleware, or cloud services), these entities often rely on similar concepts, which can be abstracted from the specificities of each cloud provider. Typically, the topology of the system in the cloud, as well as the minimum hardware resources required to run it, can be defined in a cloud-agnostic way. Thanks to this new abstraction layer, it's possible to map a platform-specific model to one or more cloud providers.
Recently, several initiatives have emerged that
define approaches to support application migration
to the cloud. Some of them, such as initiatives relying on model-driven engineering and semantic
approaches, adopt the cloud-agnostic abstraction
methodology as a key point.
Model-Based Approaches
The Object Management Group (OMG) Model-Driven Architecture (MDA; www.omg.org/mda)1 is a model-based approach for software system development. The MDA's main benefits from the cloud perspective are the ease of portability, interoperability, and reusability of system parts (which makes it easy to move the parts from one platform to another). Another benefit is that system maintenance occurs through human-readable and reusable specifications at various abstraction levels.
In the cloud computing context, model-driven development can be helpful in letting developers design a software system in a cloud-agnostic way and still be supported by model transformation techniques when instantiating the system into specific and multiple clouds. This approach, commonly summarized as "model once, generate anywhere," is particularly relevant when it comes to application design and management across multiple clouds, as well as to migrating applications from one cloud to another. Given this, several research groups and projects are combining model-driven application engineering with cloud computing, including ModaClouds (www.modaclouds.eu), the Advanced Software-Based Service Provisioning and Migration of Legacy Software project (Artist; www.artist-project.eu), and PaaSage (www.paasage.eu).
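The "model once, generate anywhere" idea can be illustrated with a toy transformation from a cloud-agnostic topology model to provider-specific deployment descriptors. The provider names, instance types, and mapping tables below are invented for illustration and don't reflect any of these projects' actual metamodels.

```python
# A cloud-agnostic model names abstract resource tiers; per-provider
# transformations map those tiers onto concrete (hypothetical) offerings.

AGNOSTIC_MODEL = {
    "web":    {"tier": "small",  "count": 2},
    "worker": {"tier": "medium", "count": 4},
}

# Hypothetical provider-specific mappings of the abstract "tier" concept.
TIER_MAP = {
    "provider_a": {"small": "a1.small", "medium": "a1.medium"},
    "provider_b": {"small": "b-std-1",  "medium": "b-std-2"},
}

def transform(model, provider):
    """Instantiate the platform-independent model for one provider."""
    tiers = TIER_MAP[provider]
    return {
        role: {"instance_type": tiers[spec["tier"]], "count": spec["count"]}
        for role, spec in model.items()
    }

if __name__ == "__main__":
    # One agnostic model, two provider-specific deployment descriptors.
    for provider in TIER_MAP:
        print(provider, transform(AGNOSTIC_MODEL, provider))
```

Real model-driven toolchains operate on far richer metamodels, but the shape of the transformation (abstract model in, provider descriptor out) is the same.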
Multiagent Systems
Multiagent systems seem to offer another effective approach. In particular, the outcome of the European Commission's Open Source API and Platform for Multiple Clouds (mOSAIC) research project2,3 demonstrates, in its cloud agency,4 the benefits of adopting multiagent technology in the cloud.
Cloud Patterns
Another promising methodology currently emerging is cloud patterns: sets of prepackaged and preconfigured architectural solutions, exposed through the concepts and mechanisms of software engineering design patterns.5,6 These cloud patterns support cloud application developers in defining, in vendor-agnostic terms, the most viable cloud architectural solutions for their cloud development or porting activity. Patterns describe common aspects of cloud computing environments and application designs and can be useful in understanding the application code changes that might be needed for a successful migration to the cloud.
Several cloud pattern catalogs are emerging, proposed both by academia6 (see also www.cloudcomputingpatterns.org and http://cloudpatterns.org) and by commercial cloud providers such as Amazon Web Services (AWS; see http://en.clouddesignpattern.org), Windows Azure,7 and IBM (www-01.ibm.com/software/ucd/designpatterns.html). Some of these catalogs are closer to a specific cloud platform, and thus they present patterns that are cloud-platform specific in terms of the cloud components that can implement the pattern. They also propose specific, platform-dependent cloud services to use during application development and deployment.
Following a specific cloud pattern or a composition of cloud patterns to migrate and port an application to the cloud represents a best practice: the patterns themselves support the redesign and deployment of applications on the cloud, and because the design pattern solutions are proven, their consistent application tends to naturally improve the quality of system designs.
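To give a flavor of what such catalogs codify, here is a minimal sketch of one widely cataloged pattern, retry with exponential backoff, used to cope with transient cloud service failures. The pattern name comes from the public catalogs; this particular implementation is my own illustrative sketch, not taken from any of them.

```python
import time

# Retry-with-exponential-backoff: a commonly cataloged cloud design
# pattern for surviving transient service failures. Sketch only.

def call_with_retries(operation, max_attempts=5, base_delay=0.1):
    """Run operation(), retrying with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise                      # give up after the last attempt
            time.sleep(base_delay * (2 ** attempt))

if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky():
        attempts["n"] += 1
        if attempts["n"] < 3:              # fail twice, then succeed
            raise RuntimeError("transient error")
        return "ok"

    print(call_with_retries(flaky))        # prints: ok
```

A pattern catalog entry would pair such a solution sketch with the problem context, the forces at play, and the provider services that can implement it.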
Semantic Models
A contributing factor in interoperability and portability issues is the difference in the offered services' semantics: providers use proprietary terms and semantics, without offering uniform representations of services. As Amit Sheth and Ajith Ranabahu stated,8 semantic models are helpful in three ways: functional and nonfunctional definitions, data modeling, and service description enhancement. Metadata added through annotations pointing
to generic operational models would play a key role in consolidating these APIs and enabling interoperability among heterogeneous cloud environments. To address these different aspects, it's possible to use existing technologies inherited from the semantic Web field, such as

• OWL (www.w3.org/TR/owl-features), to define a common, machine-readable dictionary that can express resources, services, APIs and related parameters, service-level agreements, requirements and offers, and related key performance indicators (KPIs);
• OWL-S (www.w3.org/Submission/2004/SUBM-OWL-S-20041122), to add semantics to cloud services that let users and software agents automatically discover, invoke, and compose cloud services;
• SPARQL (www.w3.org/TR/rdf-sparql-query), to perform queries to retrieve services according to particular constraints; and
• Semantic Web Rule Language (SWRL; www.w3.org/Submission/SWRL), to express additional rules and heuristics.

These aspects are also addressed by mOSAIC,2 in particular by its semantic engine9 and dynamic discovery and mapping system10 (see www.mosaic-cloud.eu).

Open Source Application Programming Interfaces and Platforms
Various other commercial and open source solutions have been developed to resolve interoperability and portability issues. This is especially the case with open source solutions, which usually rely on large communities for support and further development. Notable examples here are Openshift (www.openshift.com), supporting application portability for PaaS architectures; Openstack (www.openstack.org), an IaaS cloud platform that is positioning itself as a de facto standard for interoperability; and OpenNebula (http://opennebula.org) and DeltaCloud (http://deltacloud.apache.org), which address IaaS interoperability by letting consumers use APIs from different cloud vendors.

Toward a Standard
We've yet to establish an internationally accepted standard or set of standards that definitively solves interoperability and portability issues, despite the many different efforts to that end. Researchers have conducted studies to collect existing standards and proposals (not exclusively for the cloud) to determine which specific cloud issues they can solve; an example here is the Cloud Standards Coordination initiative (http://csc.etsi.org).11 Researchers have also sought to define how such standards can be used to build a cloud infrastructure. A notable example here is the IEEE's Standard for Intercloud Interoperability and Federation (SIIF) project,12 which aims to define a topology, a set of functionalities, and a governance model for cloud interoperability and federation. The current SIIF standard, still in development and reported only as a draft, focuses on describing the intercloud topology, which refers to the NIST definition of cloud computing,13 defining in detail its components and the relationships among them.

Cloud services interoperability and portability help avoid cloud vendor lock-in and enable services composition.

Of course, many other cloud-specific standards are under development, including the Open Cloud Computing Interface (OCCI; occi-wg.org), which focuses mostly on IaaS but, given its flexibility, could be applied to other service layers. Other standards include the Cloud Management Initiative (http://dmtf.org/standards/cloud); the Topology and Orchestration Specification for Cloud Applications Version 1.0 (TOSCA)14; the Cloud Data Management Interface (CDMI; www.snia.org/cdmi), which addresses IaaS offers, focusing on infrastructure, services, and data-storage management; and the Oasis Cloud Application Management for Platforms (CAMP),15 which relates to PaaS offers, defining a set of basic APIs.

Due to the huge number of vendors, offers, and technologies involved, we're still far from a definitive solution to interoperability and portability issues among multiple cloud environments. Among methodologies, model-based approaches such as MDA and pattern-oriented solutions based on cloud patterns seem to be the most promising. European research projects are producing very good results in defining new frameworks and standards: among
these, mOSAIC clearly shows how semantic and agent-based technologies can ease interoperability and portability issues and lead to an effective and efficient solution. Commercial proposals distributed as open source software and sustained by cloud vendors, such as Openshift for PaaS and Openstack for IaaS, are steadily growing in importance and being adopted by a growing number of consumers, due in part to the support of large developer communities. Cloud vendors also support the creation and adoption of new standards by proposing them to standardization groups; clear examples here are CAMP for PaaS and TOSCA for IaaS.
Because no standard, methodology, technique, or framework stands above the others, more research and technological innovation efforts are hugely needed to solve cloud application portability and services interoperability challenges. Thus, I encourage you to submit research briefs, position papers, and descriptions of practical solutions, challenging case studies, and application domains (such as public administration services) on these and other relevant cloud services challenges.

Acknowledgment
I thank both Giuseppina Cretella and Antonio Esposito for their valuable contributions to this article.
References
1. R. Soley, Model Driven Architecture, white paper, Object Management Group, 2000.
2. B. Di Martino et al., "Building a Mosaic of Clouds," Euro-Par 2010 Parallel Processing Workshops, LNCS 6586, Springer, 2011, pp. 571–578.
3. D. Petcu et al., "Experiences in Building a Mosaic of Clouds," J. Cloud Computing: Advances, Systems, and Applications, vol. 2, no. 1, 2013, p. 12.
4. S. Venticinque, L. Tasquier, and B. Di Martino, "Agents-Based Cloud Computing Interface for Resource Provisioning and Management," Proc. 6th Int'l Conf. Complex, Intelligent and Software Intensive Systems (CISIS 12), 2012, pp. 249–256.
5. C. Baudoin, Migrating Applications to the Cloud: Roadmap for Success, white paper, Cloud Standards Customer Council (CSCC), 2013; www.cloudstandardscustomercouncil.org/Migrating-Apps-to-the-Cloud-Final.pdf.
6. C. Fehling et al., Cloud Computing Patterns, Springer, 2014.
7. J.D. Meier, "Windows Azure Application Patterns," blog, 11 Sept. 2010; http://blogs.msdn.com/b/jmeier/archive/2010/09/11/windows-azure-application-patterns.aspx.
8. A. Sheth and A. Ranabahu, "Semantic Modeling for Cloud Computing, Part 2," IEEE Internet Computing, vol. 14, no. 4, 2010, pp. 81–84.
9. G. Cretella and B. Di Martino, "Towards a Semantic Engine for Cloud Applications Development," Proc. 6th Int'l Conf. Complex, Intelligent, and Software Intensive Systems (CISIS 12), 2012, pp. 198–203.
10. G. Cretella and B. Di Martino, "Semantic and Matchmaking Technologies for Discovering, Mapping and Aligning Cloud Providers' Services," Proc. 15th Int'l Conf. Information Integration and Web-Based Applications and Services (iiWAS 13), 2013, pp. 380–384.
11. Cloud Standards Coordination: Final Report, version 1.1, European Commission, 1 Sept. 2013; http://ec.europa.eu/digital-agenda/en/news/cloud-standards-coordination-final-report.
12. P2302: Standard for Intercloud Interoperability and Federation (SIIF), Intercloud Working Group, IEEE Standards Assoc., 2012; http://standards.ieee.org/develop/project/2302.html.
13. P. Mell and T. Grance, "The NIST Definition of Cloud Computing" (draft), NIST Special Publication 800-145, 2011.
14. Topology and Orchestration Specification for Cloud Applications Version 1.0, OASIS Committee Specification 01, 18 Mar. 2013; http://docs.oasis-open.org/tosca/TOSCA/v1.0/cs01/TOSCA-v1.0-cs01.html.
15. Cloud Application Management for Platforms Version 1.1, OASIS Committee Specification Draft 03, 31 July 2013; http://docs.oasis-open.org/camp/camp-spec/v1.1/csprd01/camp-spec-v1.1-csprd01.html.

BENIAMINO DI MARTINO is a full professor and vice director in the Department of Industrial and Information Engineering at Second University of Naples. He participates in and leads several European Commission projects on cloud computing, including the mOSAIC project; his research interests include cloud and high-performance computing, knowledge engineering, semantics, and software patterns. Di Martino has a PhD in computer science from University Federico II of Naples. He is the editor or associate editor of four international journals, including this publication and IEEE Transactions on Cloud Computing, and is a member of the IEEE P3203 Standard on Cloud Interoperability Working Group, the IEEE Intercloud Testbed Initiative, the Cloud Standards Customer Council, and the EC's Cloud Computing Experts Group. Contact him at beniamino.dimartino@unina.it.

BLUE SKIES

Streaming Big Data Processing in Datacenter Clouds

Rajiv Ranjan, Commonwealth Scientific and Industrial Research Organization, Australia

Despite clear technological advances, research challenges must be solved to realize a standard large-scale, QoS-optimized platform for managing the streaming big data analytics ecosystem.

Today, we live in a digital universe in which information and technology are not only around us but also
play important roles in dictating the quality of our
lives. As we delve deeper into this digital universe,
were witnessing explosive growth in the variety, velocity, and volume of data1,2 being transmitted over
the Internet. A zetabyte of data passed through the
Internet in the past year; IDC predicts that this digital universe will explode to an unimaginable eight
Zbytes by 2015. These data are and will be generated
mainly from Internet search, social media, mobile
devices, the Internet of Things, business transactions, next-generation radio astronomy telescopes,
high-energy physics synchrotron, and content distribution. Government and business organizations

are now overflowing with data, easily aggregating to


terabytes or even petabytes of information.
The above examples demonstrate the rise of big data applications, in which data has grown unrestrainedly. Conventional data processing technologies are unable to process this data within a tolerable elapsed time. Such applications generate datasets that don't fit the data processing models of traditional relational databases (such as Oracle, MySQL, and DB2) and data mining tools (such as Microsoft Excel, Matlab, and R). Relational databases operate on archived data in response to queries such as "commit a credit card transaction" (as in e-commerce). That is, these data processing technologies are designed to maintain an efficient and fault-tolerant collection of data that's accessed and aggregated only when users issue a query or transaction request (and thus the data must be archived prior to processing).
In contrast, all state-of-the-art implementations of data mining algorithms operate by loading the whole training dataset into the main memory (RAM) of a single machine or of simple machine clusters that have static processing and storage capacity configurations. This approach has two key problems.3-6

IEEE CLOUD COMPUTING, PUBLISHED BY THE IEEE COMPUTER SOCIETY. 2325-6095/14/$31.00 © 2014 IEEE

Welcome to the inaugural Blue Skies column of IEEE's flagship cloud computing magazine. This column intends to provide an in-depth analysis of the most recent and influential research related to cloud technologies and innovations, focusing on streaming big data processing in datacenter clouds.

Big Data Computing Paradigm


First, the data can simply grow too big over time to fit into the available RAM. Second, most big data applications produce data spread across multiple distributed data sources (including streaming sources). Moving all the datasets to a centralized machine is thus expensive (due, for example, to network communication and other I/O costs), even if we assume that the machine has a super-large RAM to hold all the data for processing. Further, when the data mining algorithm's computational complexity exceeds the available RAM, the algorithms don't scale well; they either never finish or are unable to process the whole training dataset.
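The streaming alternative keeps only constant-size state, folding each record in as it arrives and then discarding it. As a framework-agnostic illustration (not drawn from the column itself), here is Welford's one-pass algorithm for mean and variance:

```python
class RunningStats:
    """One-pass (streaming) mean/variance via Welford's algorithm.

    Each record is folded into O(1) state and discarded, so the
    dataset never has to fit in RAM, unlike loading a whole
    training set onto a single machine."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        return self._m2 / self.n if self.n else float("nan")

stats = RunningStats()
for x in (2.0, 4.0, 6.0, 8.0):  # stand-in for an unbounded stream
    stats.update(x)
print(stats.mean, stats.variance)  # 5.0 5.0
```

The same fold-in pattern generalizes to counts, histograms, and sketches, which is what lets stream processors bound their memory regardless of stream length.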
To process data as they arrive, the
paradigm has changed from the conventional one-shot data processing
approach to elastic and virtualized
datacenter cloud-based data processing
frameworks that can mine continuous,
high-volume, open-ended datastreams.
This advancement is broadly supported
by two key technologies.
Data Mining/Application Programming Frameworks
Data mining and application programming frameworks enable the creation of a big data analytics application architecture. Broadly, these frameworks can be classified into four categories. Large-scale data mining frameworks, such as GraphLab,3 FlexGP,6 Apache Mahout (http://mahout.apache.org), and MLBase,7 implement a wide range of data mining algorithms (including clustering, decision trees, latent Dirichlet allocation, regression, and Bayesian methods) that can mine datasets in parallel by leveraging a distributed set of machines.
Distributed message queuing frameworks, such as Amazon Kinesis (https://aws.amazon.com/kinesis) and Apache Kafka (http://kafka.apache.org), provide a reliable, high-throughput, low-latency system for queuing real-time datastreams.


Data application programming frameworks, such as Apache Hadoop (http://hadoop.apache.org) and Apache Storm (http://storm.incubator.apache.org), let developers write applications that rapidly process massive amounts of data in parallel on large sets of machines. To speed up the data mining algorithms, these frameworks simplify the process of distributing the training and learning tasks across a parallel set of machines. The frameworks also automatically take care of low-level distributed system management complexities, such as task scheduling, fault management, interprocess communication, and result collection. Finally, NoSQL database frameworks, such as MongoDB (www.mongodb.org), HyperTable (http://hypertable.org), Cassandra (http://cassandra.apache.org), and Amazon DynamoDB (http://aws.amazon.com/dynamodb), allow data access based on predefined access primitives such as key-value pairs. Given the exact key, the value is returned. This well-defined data access pattern results in better scalability and performance predictability, which is suitable for storing and indexing real-time streams of big datasets. These frameworks can scale more naturally to ad hoc and evolving large datasets, as NoSQL databases don't require fixed table schemas or support expensive join operations.

Datacenter Clouds
The second key technology is datacenter clouds,8-10 which promise on-demand access to affordable large-scale resources in computing (such as multicore CPUs, GPUs, and CPU clusters) and storage (such as disks) without substantial upfront investment. Datacenter cloud services are a natural fit1,4,6 for processing big datastreams, because they allow data mining algorithms (and underlying application programming and database frameworks) to run at the scale required for handling uncertain data volume, variety, and velocity. However, to support a complicated, dynamically configurable big data ecosystem, we need to innovate and implement novel services and techniques for orchestrating cloud resource selection, deployment, monitoring, and QoS control.
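The key-value access pattern of the NoSQL frameworks described above can be sketched in a few lines; the class and method names below are generic placeholders for illustration, not any particular database's API:

```python
class KeyValueStore:
    """Minimal sketch of the NoSQL access pattern: given the exact
    key, the value is returned. No ad hoc predicates and no joins,
    which is what makes latency predictable and sharding by key easy."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: object) -> None:
        self._data[key] = value

    def get(self, key: str, default=None):
        return self._data.get(key, default)

store = KeyValueStore()
# Composite keys like this are a common convention for stream data.
store.put("sensor:42:2014-05-01T10:00", {"temp_c": 21.7})
print(store.get("sensor:42:2014-05-01T10:00"))  # {'temp_c': 21.7}
```

Because every operation is addressed by a single key, a real store can hash keys across nodes and answer each request from exactly one of them.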

Big Data Analytics Ecosystem
As Figure 1 shows, a big data ecosystem's high-level architecture consists of three main components or layers:
• Data ingestion accepts data from multiple sources, such as online services and back-end system logs.
• Data analytics consists of many systems (such as stream/batch processing systems and scalable machine learning frameworks) that ease implementation of data analytics use cases such as collaborative filtering and sentiment analysis.
• Data storage consists of next-generation database systems for storing and indexing final as well as intermediate datasets.

The first two layers talk with different databases during execution and, where required, persist or load the data in or from a database. The simple architecture in Figure 1 offers a snapshot of real ones; we encourage passionate readers to also investigate the Lambda Architecture.11

FIGURE 1. A high-level architecture of a large-scale data processing service. The big data analytics architecture has three layers (data ingestion, analytics, and storage), and the first two layers communicate with various databases during execution. Applications such as disaster management, radio astronomy, smart energy grids, healthcare, and telephone fraud detection sit atop the ecosystem, which runs on a scalable datacenter cloud resource layer spanning multiple datacenter providers.

FIGURE 2. A simple instance of a large-scale datastream-processing service. The example service consists of Apache Kafka (data ingestion layer), Apache Storm (data analytics layer), and Apache Cassandra (data storage layer).
Recently, each architectural layer has changed dramatically in terms of the software stack, as companies such as Yahoo!, Twitter, and LinkedIn released open source solutions for dealing with big data. Figure 2 shows an example of the new architecture: Apache Kafka serves as a high-throughput distributed messaging system, Apache Storm as a distributed and fault-tolerant real-time computation system, and Apache Cassandra as a NoSQL database. It's not surprising that real-time stream-processing systems are just one building block in the big data ecosystem; computing arbitrary datasets via arbitrary queries demands a variety of tools and techniques.
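To make the division of labor in Figure 2 concrete, the sketch below imitates the three layers in a single Python process; the queue, the word-count loop, and the dict are stand-ins for Kafka, Storm, and Cassandra respectively, not their real APIs:

```python
from collections import Counter
from queue import Queue

# Ingestion layer: a queue standing in for a Kafka topic.
topic = Queue()
for line in ("error disk full", "ok", "error net down"):
    topic.put(line)
topic.put(None)  # sentinel marking the end of the demo stream

# Analytics layer: a word-count "topology" standing in for Storm.
counts = Counter()
while True:
    msg = topic.get()
    if msg is None:
        break
    counts.update(msg.split())

# Storage layer: a dict standing in for a Cassandra table.
table = dict(counts)
print(table["error"])  # 2
```

In a real deployment each stand-in is a separate distributed service: the topic is partitioned across brokers, the counting loop runs as parallel tasks, and the table is replicated across database nodes.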

Open Source Real-Time Stream Computation Frameworks
Although the stream-processing concept is not new, the available open source stream-processing systems are quite young, and a silver-bullet solution doesn't exist. Therefore, picking an appropriate platform for (near) real-time stream processing is a nontrivial task given the number of offerings and their multiple features. To ease this process, we created an initial list of criteria: architecture, language support, integration with other technologies, and documentation and community support.
We divide the architecture dimension into centralized, distributed, and parallel-distributed systems. Centralized in-memory streaming systems are suitable for handling queries with low-latency requirements, and they don't produce much intermediate state data. For example, Esper (http://esper.codehaus.org) is a streaming system with a centralized architecture that runs on a single node and keeps everything (states, operators, and so on) in memory. However, if the continuous queries have a large window size and might entail millions of tuples per second, a better answer is found in systems with a parallel-distributed architecture, such as Apache Samza (http://samza.incubator.apache.org), which lets you partition the streams and parallelize operators' execution across a cluster of machines.
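Such windowed continuous queries reduce to maintaining state over only the most recent tuples. A minimal count-based sliding-window average (a framework-agnostic sketch; real systems also support time-based windows) might look like this:

```python
from collections import deque

class SlidingWindowAvg:
    """Average over the last `size` tuples of a stream; old tuples
    are evicted so state stays bounded regardless of stream length."""

    def __init__(self, size: int):
        self.window = deque(maxlen=size)

    def push(self, x: float) -> float:
        self.window.append(x)  # deque drops the oldest item when full
        return sum(self.window) / len(self.window)

w = SlidingWindowAvg(size=3)
outputs = [w.push(x) for x in (1.0, 2.0, 3.0, 10.0)]
print(outputs)  # [1.0, 1.5, 2.0, 5.0]
```

When the window holds millions of tuples per second, this per-window state is exactly what a parallel-distributed system shards across machines.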
Language support refers to the framework's flexibility in letting your team develop an application using your choice of language. Apache Storm is a good example here; it supports both JVM and non-JVM languages.
Another salient dimension is technology integration: specifically, the availability of ready-to-plug libraries for connecting the system to various inline technologies. For example, Apache Kafka is a high-throughput distributed in-memory messaging system that complements every stream-processing system, and having a ready component for this integration is a trump card.
The last (but not least) criterion is the framework's documentation and community support, which lets developers employ APIs easily. Here, Esper and Apache Storm have adequate documentation support, which is only expected to grow as more and more end users adopt these systems. Table 1 gives an overview of state-of-the-art open source systems and how they meet the criteria.
Given more space, we would expand the criteria list to include more technical features such as dynamic rebalancing, state management, fault tolerance, built-in monitoring, and metric-reporting capabilities. However, an in-depth analysis of the available open source and commercial frameworks is beyond this column's scope.

Open Challenges and Research Directions
Despite the clear technological advances in machine learning, big data application programming frameworks, and datacenter clouds, we've yet to realize a standard large-scale, QoS-optimized platform, delivered as service-level software, for managing a streaming big data analytics ecosystem. Future efforts will focus on solving the following research challenges.
Understanding an Optimal Analytics System
It's not yet clear how to build an optimal big data application architecture given the abundance of existing frameworks that offer competing functionalities for large-scale data mining, distributed message queuing, data application programming, and NoSQL databases. Frameworks such as Apache Mahout implement several data mining algorithms, but it's not clear which is most suitable for processing both historical and streaming big data in a distributed and parallel setting. Similarly, some data application programming frameworks, such as Apache Hadoop, are suitable for handling historical data, while others, like Spark Streaming12 or Apache S4,13 are better suited to streaming data. Similar complexities exist in choosing NoSQL databases, especially for heterogeneous (structured and unstructured) datatypes. Therefore, we must develop a solid scientific foundation and decision-making techniques that can help us select these key functionalities based on the big data's nature (that is, its volume, variety, and velocity).

Monitoring and Managing End-to-End QoS
Guaranteeing QoS for large-scale data processing across multiple layers and various computing platforms is a nontrivial task. The QoS for each computing platform in the ecosystem isn't necessarily the same; key quality factors include throughput and latency in the distributed messaging system, response time in the batch processing platform, and precision/recall in the scalable data mining platform. Therefore, it is not yet clear
• how these QoS measures could be defined coherently across layers;
• how the various measures should be combined to give a holistic view of the stream of data as it flows end-to-end; or
• how optimization would be realized in cases with large sets of variables and constraints, such as with heterogeneous resources, bursty workloads, and so on.
To this end, future research efforts must take an end-to-end QoS view of the ecosystem and develop techniques and frameworks that cater to all components rather than treating them as silos.
Provisioning Datacenter Cloud Resources for Real-Time Analytics
Handling large volumes of streaming and historical data, ranging from structured to unstructured and from numerical data to micro-blog datastreams, is challenging because the data's volume is heterogeneous and highly dynamic. Although datacenter clouds offer abundant resources, they don't support QoS-driven autonomic resource provisioning or deprovisioning in response to changes in the 3Vs (that is, in the big data application's behavioral uncertainties).
The datacenter cloud resource provisioning's uncertainty14-16 has two aspects. First, from a big data application's perspective, it's difficult to estimate workload behavior in terms of the data volume to be analyzed, data arrival rate, datatypes, data processing time distributions, and I/O system behavior. Second, from a datacenter resource perspective, without knowing the big data's requirements or behaviors, it's difficult to make decisions about the size of resources to be provisioned at any given time. Furthermore, the availability, load, and throughput of datacenter resources can vary in unpredictable ways, due to failure, malicious attacks, or network link congestion. In other words, we need reasonable workload and resource performance prediction models when making provisioning decisions for datacenter resources that host instances of data mining algorithms, distributed message queuing systems, data application programming frameworks, and NoSQL databases.

Table 1. Capability analysis of recent open source stream-processing systems.

Esper
  Architecture: Centralized in-memory
  Language support: Java; .NET; declarative SQL-like query language
  Integration: API for integrating functionalities
  Documentation: Well-documented API and a thorough reference architecture that covers all features with clear-cut examples; active community mailing list

Apache Samza
  Architecture: Parallel-distributed
  Language support: Java Virtual Machine (JVM) languages
  Integration: Apache Kafka; managed by the Apache Yet Another Resource Negotiator (YARN) resource manager
  Documentation: Limited documentation and examples

Spark Streaming
  Architecture: Parallel-distributed
  Language support: Scala; Java; SQL-like query language (Shark)
  Integration: Integrated scalable machine learning library (MLlib); integrated graph-processing algorithms; Apache Kafka; Apache Flume; Twitter; ZeroMQ; Message Queuing Telemetry Transport (MQTT)
  Documentation: Limited documentation and examples

Apache Storm
  Architecture: Parallel-distributed
  Language support: JVM and non-JVM languages; higher-level programming model (Trident)
  Integration: Managed by the Apache YARN resource manager (Storm-YARN); Apache Kafka; Kestrel; RabbitMQ; Java Messaging Services (JMS); Apache HBase (Storm-HBase); Twitter; machine learning integration with the Trident-ML library
  Documentation: Well-documented APIs and online tutorials; several books; active community

Apache S4
  Architecture: Parallel-distributed
  Language support: JVM and non-JVM languages
  Integration: Apache YARN
  Documentation: Limited documentation and examples
Ensuring End-to-End Security and Privacy
Data stored on (and processed by) cloud resources and big data analytics ecosystem components aren't secured at finer granularity levels. The application data managed by these resources and components are vulnerable to theft, because adversaries can gain access to private data and malicious database administrators might capture or leak data. Hence, research efforts must focus on developing techniques that efficiently support the following:
• end-to-end data encryption and decryption without causing additional query and data processing overhead (time and space);
• execution of various traditional SQL queries (such as equality checks, order comparisons, aggregates, and joins) or NoSQL queries over encrypted data; and
• preservation of data security and privacy at each lifecycle stage (including creation, ingestion, analytics, and visualization).
Developing techniques that can ensure end-to-end stream security and privacy remains a challenging research problem.
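As a hint of what querying encrypted data entails, one well-known building block is deterministic keyed tokenization: equal plaintexts map to equal tokens, so a server can evaluate equality predicates without seeing the plaintext. The sketch below is an illustrative assumption, not a method from the column (it is in the spirit of research systems such as CryptDB, supports only equality checks, and uses an HMAC as the keyed function):

```python
import hashlib
import hmac

SECRET = b"demo-key"  # hypothetical tenant key; never hard-code one in practice

def eq_token(value: str) -> str:
    """Deterministic keyed token: equal plaintexts yield equal tokens,
    so the server can test equality without learning the plaintext."""
    return hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()

# "Encrypted" column as stored server-side:
stored = [eq_token(v) for v in ("alice", "bob", "alice")]

# The client issues an equality query by tokenizing the query constant:
matches = [i for i, t in enumerate(stored) if t == eq_token("alice")]
print(matches)  # [0, 2]
```

Order comparisons, aggregates, and joins over encrypted data require further schemes layered on top, which is precisely why this remains an open research area.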

This column also welcomes high-quality position, survey, and review papers from cloud computing and related research areas. Future contributions might also include in-depth reports on innovative research projects in academia and research institutions, cloud and big data challenges at leading international conferences, and open source cloud computing projects.

Acknowledgements
I thank Omer Rana (Cardiff University), Lizhe Wang (Chinese Academy
of Sciences), and Alireza Khoshkbarforoushha (Australian National University) for providing and discussing their
viewpoints on research areas related to
this column. I also thank Khoshkbarforoushha for his instrumental input in
the compilation of Table 1.

References
1. X. Wu et al., "Data Mining with Big Data," IEEE Trans. Knowledge and Data Eng., vol. 26, no. 1, 2013, pp. 97-107.
2. W. Fan and A. Bifet, "Mining Big Data: Current Status, and Forecast to the Future," SIGKDD Explorations Newsletter, vol. 14, no. 2, 2013, pp. 1-5.
3. Y. Low et al., "Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud," Proc. Very Large Database Endowment, vol. 5, no. 8, 2012, pp. 716-727.
4. S.R. Upadhyaya, "Parallel Approaches to Machine Learning: A Comprehensive Survey," J. Parallel and Distributed Computing, vol. 73, no. 3, 2013, pp. 284-292.
5. D. Peteiro-Barral and B. Guijarro-Berdiñas, "A Survey of Methods for Distributed Machine Learning," Progress in Artificial Intelligence, vol. 2, no. 1, 2013, pp. 1-11.
6. O.C. Derby, "FlexGP: A Scalable System for Factored Learning in the Cloud," doctoral dissertation, Dept. Electrical and Computing Eng., MIT, 2013.
7. T. Kraska et al., "MLBase: A Distributed Machine-Learning System," Proc. Sixth Biennial Conf. Innovative Data Systems Research, 2013; www.cidrdb.org/cidr2013/Papers/CIDR13_Paper118.pdf.
8. M. Armbrust et al., "A View of Cloud Computing," Comm. ACM, vol. 53, no. 4, 2010, pp. 50-58.
9. D.A. Patterson, "Technical Perspective: The Data Center Is the Computer," Comm. ACM, vol. 51, no. 1, 2008, p. 105.
10. L. Wang et al., eds., Cloud Computing: Methodology, Systems, and Applications, CRC Press, 2011.
11. N. Marz, Big Data: Principles and Best Practices of Scalable Real-Time Data Systems, O'Reilly Media, 2013.
12. M. Zaharia et al., "Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters," Proc. 4th Usenix Conf. Hot Topics in Cloud Computing, 2012; www.cs.berkeley.edu/~matei/papers/2012/hotcloud_spark_streaming.pdf.
13. L. Neumeyer et al., "S4: Distributed Stream Computing Platform," Proc. IEEE Int'l Conf. Data Mining Workshops, 2010, pp. 170-177.
14. R.N. Calheiros, R. Ranjan, and R. Buyya, "Virtual Machine Provisioning Based on Analytical Performance and QoS in Cloud Computing Environments," Proc. 40th Int'l Conf. Parallel Processing (ICPP 11), 2011, pp. 295-304.
15. J. Schad et al., "Runtime Measurements in the Cloud: Observing, Analyzing, and Reducing Variance," Proc. Very Large Database Endowment, vol. 3, nos. 1-2, 2010, pp. 460-471.
16. A. Iosup et al., "On the Performance Variability of Production Cloud Services," Proc. IEEE/ACM Int'l Symp. Cluster, Cloud, and Grid Computing (CCGrid 11), 2011, pp. 103-113.

RAJIV RANJAN is a senior research scientist and Julius Fellow at the Commonwealth Scientific and Industrial Research Organization (CSIRO), Australia. His research interests include cloud computing, big data, and quality-of-service (QoS) optimization in distributed systems. Ranjan has a PhD in computer science and software engineering from the University of Melbourne. Contact him at rajiv.ranjan@csiro.au and http://www.ict.csiro.au/staff/rajiv.ranjan.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.


WHAT'S TRENDING?

Intersection of the Cloud and Big Data

THE TWO BIGGEST TRENDS IN THE DATA CENTER TODAY ARE CLOUD COMPUTING AND BIG DATA. This column will examine the intersection of the two. Industry hype has resulted in nebulous definitions for each, so I'll start by defining terms.

Defining the Cloud
When defining the cloud and big data, it's helpful to consider both the consumer and producer perspectives. For consumers, the cloud is about consuming hardware or software as a service (SaaS) and the various implications of this approach. For example, pricing models and data governance may change dramatically. In public clouds, the services are run by a third party, while in private clouds, they are owner-operated on premises. Consumers effectively choose the level of vertical integration for their IT; they can choose to own or outsource everything from the data center to the storage, computing, networking, and software infrastructure up to the application.
For producers, on the other hand, the cloud is about the technology that goes into providing service offerings at each level. The technology required to provide an application as a service in the public cloud may differ significantly from the software product that a customer installs to run an internal service. For example, virtual machines are the resource allocation units in most cloud infrastructure offerings, but they might not be used when implementing an application as a public service.

ELI COLLINS
Cloudera
eli@cloudera.com

Defining Big Data
For consumers, big data is about using large datasets from new or diverse sources to provide meaningful and actionable information about how the world works. For example, Netflix can use customer data to produce shows tailored to its audiences.
For producers, however, big data is about the technology necessary to handle these large, diverse datasets. Producers characterize big data in terms of volume, variety, and velocity. How much data is there, of what types, and how quickly can you derive value from it?
Although these are good technical descriptions of big data, they don't fully explain it. Just as adopting a service-oriented approach is the macro trend behind the cloud, there are several macro trends behind big data. The first trend is consumption; we consume data as part of the everyday activities in our personal and working lives. From booking a flight, to finding a partner, to diagnosing disease, data is driving many more decisions today than it has in the past. We live in a relatively new social context in which people increasingly want to make data-driven decisions.
Related to consumption, the second trend is instrumentation. We collect data at each step in many of our activities, and much of it is now produced by machines instead of people. From supply chains to Fitbits, we collect information about all our activities with the intent to measure and analyze them.
The third trend is exploration. The relatively easy access to this abundance of data means we can use it to construct, test, and consume experiments that were previously not feasible. Finally, related to
exploration is the concept that the data itself has value. Data is increasingly an asset, not just an input to or a byproduct of a business process. This isn't a new idea, of course, but in the context of consumption, instrumentation, and exploration, it's driving new business models and applications.
Ultimately, big data is about the change in the relationship between us and our data and, in the context of this column, the implications of this change for cloud technology.

Converging Technologies
So what is the relationship between big data and the cloud? Big data has its origins in the cloud. Apache Hadoop, one of the most widely used big data technologies today, was built on research from Google and initially deployed at Yahoo. Google invented this technology because indexing the Web was infeasible with existing systems. Now companies adopting Hadoop are bringing a cloud architecture into their data centers.
The simultaneous rise of cloud and big data technologies isn't coincidental; they're mutually reinforcing. Big data enables the cloud services we consume. For example, SaaS lets us collect data that was infeasible or impossible to collect in a world of packaged software. An application can record every interaction from millions of users. This service in turn drives demand for big data technologies to store, process, and analyze these interactions and inject the value of the analysis back into the application through query and visualization.
The expansion of the cloud continues to drive both the creation of new big data technologies and big data adoption by making it easier and cheaper to access storage and computing resources. Companies can run their big data platforms on infrastructure provided as a service (IaaS) or consume the big data platform as a service (PaaS). Both models work in the public cloud and in on-premises systems.
The decision for enterprises is thus a familiar one: How vertically or horizontally integrated should your infrastructure be? A spectrum of valid options exists, but cloud technology is already enabling more infrastructure outsourcing, whether it's outsourced to a cloud provider or to an internal centralized IT department.
Big data infrastructures also play a
role in this trend. For example, recent
advances in the Apache Hadoop ecosystem enable more types of workloads
and more tenants to share a cluster.
What were once discrete systems running on their own hardware are now
effectively applications running on Hadoop, sharing the same data and hardware resources. As this abstraction layer
evolves and more projects build on it,
users will be able to run more types of
infrastructures on the same Hadoop
cluster, which itself may be running
on a cloud infrastructure. As big data
infrastructures become more generic,
the cloud infrastructure will add more
specialized services for data storage,
processing, and analysis. Future columns will examine new developments
in both areas and the increasing overlap
between them.
Another area of exploration for this column will be technologies and trends that leverage both cloud computing and big data. The combination of big data, cloud computing, and new algorithms and techniques for visualizing information enables converged analytics: performing analytics on data from many different sources. These new techniques for data delivery and data management also enable cloud-based analytics as a service (AaaS). Upcoming columns will cover the development and use of converged analytics and AaaS.

From security and privacy to pricing models, the combination of big data and cloud computing is having a substantial impact on the nontechnical aspects of our lives as well. There is a tension between our desire for converged analytics and cloud computing (which is about sharing more computing resources and data with increasingly diverse tenants) and our desire for better privacy controls and data protection. Usage-based pricing models are forcing us to rethink how we produce and consume technology. Future columns will look at how policies and economics are being shaped by these technological advances.

THESE ARE EXCITING TIMES FOR BOTH BIG DATA AND THE CLOUD. Future columns will examine how people are using these trends together, big data developments from cloud builders, and how people are making the cloud better through data. I look forward to exploring all these topics here.

ELI COLLINS is Cloudera's chief technologist. His research interests include cloud computing and data management. Collins has an MS in computer science from the University of Wisconsin-Madison. Contact him at eli@cloudera.com.


STANDARDS NOW

Defining Our Terms
WHEN I WAS IN COLLEGE, A MEMBER OF THE DEBATE TEAM ONCE TOLD ME THAT THE BEST METHOD HE HAD FOUND SO FAR TO STOP A SUCCESSFUL ARGUMENT BY MEMBERS OF AN OPPOSING TEAM WAS TO ASK THEM TO DEFINE THEIR TERMS. This always struck me as obscure and unproductive advice: after all, isn't the point of a useful debate not just tactics, but arriving at the truth of the matter?
After much reflection and subsequent experience, I've decided that there might be more value hidden in my debate team friend's observation than I noticed at the time, or perhaps than even he had in mind. Collectively defining our terms is a crucial step toward fully understanding and illuminating any topic under discussion.

Ingredients for Success


In IT standards, we tend to get caught up in particular positioning or viewpoints, becoming captured by a localized team argument such as the debate team example. Sometimes the advocacy of a viewpoint can distort and filter our perceptions of progress. There's a premium, however, that we can gain if, instead of advocating for one particular viewpoint or camp in an argument, we look at the general point of the standard or standards under discussion from multiple viewpoints. A crucial step in pursuing this approach is defining our terms.
As Marvin Waschke, Andre Merzky, David Wallom, and I discuss in our article, "Mapping the Current State and Future Directions of Cloud Standards," which will appear in a subsequent issue of IEEE Cloud Computing, there are several essential ingredients for making progress on this topic in the fast-paced modern world of cloud computing development. These ingredients can occur in a wide variety of combinations, but we find that, taken together, they characterize the success patterns of many current projects. They are:

• promoting innovation,
• identifying solutions and problems,
• defining terms,
• gathering input from multiple communities, and
• taking a broad-based, multitechnology approach toward implementing the identified solutions to real-world problems.
A key element of a successful outcome is not only recognizing areas where standards exist or should exist, but also being fully aware of those areas in which standardization isn't the right approach and in which latitude must be left, or built in by design, for multiple approaches to exist.
This distinction bears closer scrutiny. Winston Bumpus, chair of the Board of the Distributed Management Task Force (DMTF) and senior director of Architecture and Standards at VMware, often says (and I don't mind quoting) that you should "standardize at the interfaces and innovate between the boundaries"1 of a system or ecosystem of products. Such an approach is already central to several successful open source and commercial product sets and is a key feature of the overall framework that makes cloud computing possible: the Internet itself. It can be applied to all large-scale applications, cloud stacks, and computing projects.

Promoting Standards Innovation


Taking such an approach also lets us make short work of dispatching useless arguments regarding architectural use patterns and room for creativity in design. Advocates of Representational State Transfer (REST)/HTTP API design methods, for example (practitioners of which include me), sometimes say that such methods don't use standards or are an "antistandards" alternative, when in fact they simply and appropriately use the set of HTTP standards and the REST architectural design pattern to allow creativity to flourish within and between boundaries defined by HTTP interfaces. Far from being antistandards in nature, modern RESTful API design is a vindication of the standards-based approach and a real-world example of using community-designed standards to solve practical problems in creative ways.
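As a minimal sketch of "standardize at the interfaces, innovate between the boundaries," consider Python's WSGI convention (PEP 3333), which fixes how an HTTP request reaches an application without constraining what happens inside it. The /status resource below is purely hypothetical:

```python
import json

# The interface is fixed by standards (HTTP methods and status codes,
# surfaced through the WSGI calling convention); the implementation
# behind it is free to change without breaking any client.
def app(environ, start_response):
    method = environ["REQUEST_METHOD"]
    path = environ["PATH_INFO"]
    if method == "GET" and path == "/status":
        body = json.dumps({"status": "ok"}).encode()
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]

# Exercise the app without a network, through the same standardized interface.
captured = {}
def start_response(status, headers):
    captured["status"] = status

result = app({"REQUEST_METHOD": "GET", "PATH_INFO": "/status"}, start_response)
```

Any WSGI-compliant server can host this application, and its internals can be rewritten freely as long as the standardized interface holds.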
In this column and in articles covering the area of standards and compliance (for more information, see page 50), I hope to take just such an ecumenical approach in gathering and distributing news on the current state of cloud computing. With your cooperation, I intend to use your input to draw attention to appropriate innovations in standards, API innovations, use case requirements, and groups that are extending these methods. These methods can be historical or modern, and old or new in approach, as long as they contribute to producing a successful standards-based framework for innovation.
Specifically for this column, I'd like to hear from those studying and implementing new techniques from the viewpoint of enabling further developments, as well as news about formal standards progress and publications or calls for input on software projects, standards for commercial products, or notable standards body publications. I hope that you will come to view this portion of the magazine as an opportunity to highlight progress and topics of interest to the broader community and a forum for cloud standards progress. I further hope that you'll be persuaded to submit longer articles that expand on the topics highlighted here. I'll be editing this column from that viewpoint.
The world of cloud computing is moving quickly, but to make the best possible progress, we need to define our terms, highlight successes as well as instructive failures in standardization, and create open channels for communication. Not only is there room in such an approach for both standardization and innovation in standards-related topics, but making such room is required.

Emerging Cloud Standards

Even a light and cursory overview of the cloud computing ecosystem reveals that several large projects and vendor-based products have captured large segments of the market, in terms of both mind share and current financial performance. Developers, vendors, and contributors involved in these projects and products are working hard to build components that can operate at least within their respective boundaries. Work to create cross-product or cross-project standards is noticeably less advanced, but it still has the benefit of several years' worth of ongoing progress, some results of which are beginning to demonstrate noticeable deployment.

To some degree, the need to interoperate among significant components can only become clear once the market coalesces around the major players. Nonetheless, significant cross-cutting standards have begun to emerge, and further work is ongoing in several sectors to lay the groundwork for additional progress.

The US National Institute of Standards and Technology (NIST) and the European Telecommunications Standards Institute (ETSI) have recently drawn up lists of cloud-computing-specific and cloud-computing-related (or relevant) standards for evaluation.2,3 Such lists inevitably produce a mix of well-known, tried-and-true formal specifications that might in principle be applied to any particular task, as well as others that are more specific and either on the cusp of, or already achieving, significant uptake by portions of the cloud computing community.
I'll leave the detailed comparison of such lists to you as the reader. It's worth noting that both lists mentioned share some recent cloud-specific standards that are already being implemented. A short list of such cloud-specific standards that have made their way into multiple software products includes

• Cloud Data Management Interface (CDMI) from the Storage Networking Industry Association (SNIA),
• Open Cloud Computing Interface (OCCI) from the Open Grid Forum (OGF),
• Open Virtualization Format (OVF) from the Distributed Management Task Force (DMTF), and
• Topology and Orchestration Specification for Cloud Applications (TOSCA) from the Organization for the Advancement of Structured Information Standards (OASIS).4
Each of these standards has already received multiple implementations in the field. To this list, I think we'll soon be able to add the DMTF's Cloud Infrastructure Management Interface (CIMI), an infrastructure-as-a-service (IaaS) management and control specification set that has reached a published state and is also beginning to see significant work on implementations. (For references to these, please see the Standards and Compliance area's introductory editorial on page 52.)
Several standards-development organizations are continuing work on additional cloud-specific standards that I hope to draw to your attention in this column. Along these lines, the IEEE's own P2301 and P2302 groups have been defined for some time and have each recently increased their levels of activity and organization. The P2301 group aims to develop a Guide for Cloud Portability and Interoperability Profiles (CPIP)5 as an aid to vendors and users in developing, building, and using standards-based cloud computing products and services. John Messina of NIST chairs this group.
In addition, the P2302 project, also known as the IEEE Intercloud Working Group (ICWG, http://grouper.ieee.org/groups/2302),6 has set out to develop the Standard for Intercloud Interoperability and Federation (SIIF) and, more broadly, has the ambitious goal of operating an intercloud testbed (www.intercloudtestbed.org) in support of this work. Deepak Vij of Huawei Technologies chairs the working group. Membership in each of these groups is open to non-IEEE members.

More general work is also occurring, such as detailed efforts to establish the ontology and terminology of cloud computing. Such work can be valuable. Chris Kemp, the widely praised instigator of the NASA Nebula project, which was an important forerunner and contributor to the formation of OpenStack, includes the content of one such document (the NIST Definition of Cloud Computing7) in nearly every talk he gives, and considers it his "favorite government document ever." The NIST definition has become widely known and nearly universally adopted as a starting point in understanding the landscape of cloud computing.

News on SDO Efforts


Finally, let me draw your attention to work being done in the context of the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) Joint Technical Committee 1 Subcommittee 38 on Distributed Application Platforms and Services (DAPS)8 in its Working Group 3 on cloud service-level agreements (SLAs). The "joint" in this committee's name means that it's a merged effort between these organizations, spanning previous work of several branches of each participating organization, and thus represents an organizational simplification. It sometimes frustrates those who don't deal frequently with these organizations to encounter such fine-grained subdivision of effort, but this granularity is part of how standards organizations often order their internal work to make progress on the widely varying topics that they pursue.

Seungyun Lee of the Electronics and Telecommunications Research Institute of Korea (ETRI) leads the subcommittee's working group. The group is working on a new multipart document set, Service-Level Agreement (SLA) Framework and Terminology, ISO/IEC CD 19086, for which Eric Simmon of NIST serves as the editor. The document builds on the cloud computing Overview and Vocabulary produced by this committee as ISO/IEC 17788, and is intended to be compatible with the cloud Reference Architecture, ISO/IEC 17789, also by Subcommittee 38. Along with the committee's other terminology and ontology products on service-oriented architecture, this work is a literal example of defining our terms in ways that should yield valuable long-term results. Those who have been following the NIST work on the cloud computing reference architecture, terminology, ontology, and metrics will want to read and comment on this output through their national body representatives.
What is a national body representative, you ask? (Apologies in advance to those of you who know this already; we have to start the discussion somewhere.) We'll go into this topic in more detail in the next column, along with a summary of the types and varieties of standards-developing organizations and a review of the roles of open source projects and commercial products in vetting the output of these organizations. Your opinions on all of these topics are welcome.

MEANWHILE, SEND IN YOUR NEWS. And, if you can produce a coherent, readable account of recent work in this area that you would like to call to the attention of the community, please submit it for consideration. I hope you will contribute to this magazine, and I invite you to contact me regarding any and all topics that you think are interesting or that deserve a wider audience. For purposes of this column, I can be reached at alan.sill@standards-now.org.
References
1. G. Hulme, "Cloud Standards Necessary for Portability, Innovation," Cloud Commons, 23 Apr. 2012; www.cloudcommons.com/articles/-/asset_publisher/bY1m/content/cloud-standards-necessary-for-portability-innovation.
2. NIST Cloud Computing Standards Roadmap, NIST Special Publication 500-291, version 2, Nat'l Inst. of Standards and Technology, 2013; www.nist.gov/itl/cloud/upload/NIST_SP500-291_Version-2_2013_June18_FINAL.pdf (see also http://collaborate.nist.gov/twiki-cloud-computing/bin/view/CloudComputing/StandardsInventory).
3. Cloud Standards Coordination Report, version 1.0, European Telecomm. Standards Inst. (ETSI), 2013; www.etsi.org/images/files/Events/2013/2013_CSC_Delivery_WS/CSC-Final_report-013-CSC_Final_report_v1_0_PDF_format-.PDF.
4. Topology and Orchestration Specification for Cloud Applications Version 1.0, OASIS, 2013; http://docs.oasis-open.org/tosca/TOSCA/v1.0/os/TOSCA-v1.0-os.html.
5. "P2301: Guide for Cloud Portability and Interoperability Profiles (CPIP)," IEEE Standards Assoc., 2014; http://standards.ieee.org/develop/project/2301.html.
6. "P2302: Standard for Intercloud Interoperability and Federation (SIIF)," IEEE Standards Assoc., 2014; http://standards.ieee.org/develop/project/2302.html.
7. The NIST Definition of Cloud Computing, NIST Special Publication 800-145, Nat'l Inst. of Standards and Technology, 2011; http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf.
8. "ISO/IEC JTC 1/SC 38: Distributed Application Platforms and Services (DAPS)," standards catalogue, Int'l Org. for Standardization (ISO)/Int'l Electrotechnical Commission (IEC), 2014; www.iso.org/iso/home/store/catalogue_tc/catalogue_tc_browse.htm?commid=601355.

ALAN SILL is an adjunct professor of physics and senior scientist at the High Performance Computing Center, and directs the US National Science Foundation Center for Cloud and Autonomic Computing at Texas Tech University. He also serves as vice president of standards for the Open Grid Forum and cochair of the US National Institute of Standards and Technology's Standards Acceleration to Jumpstart Adoption of Cloud Computing working group. Sill has a PhD in particle physics from American University. He's an active member of the Distributed Management Task Force, IEEE, TM Forum, and other cloud standards working groups, and has served either directly or as liaison for the Open Grid Forum on several national and international standards roadmap committees. Contact him at alan.sill@standards-now.org.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.

IEEE COMPSAC 2014
38th Annual IEEE International Computers, Software and Applications Conference
July 21-25, 2014, Västerås, Sweden

COMPSAC is the IEEE Signature Conference on Computers, Software, and Applications. It is one of the major international forums for academia, industry, and government to discuss research results, advancements, and future trends in computer and software technologies and applications. The theme of the 38th COMPSAC conference is "The Integration of Heterogeneous and Mobile Services in Smart Environments."

Register today! www.compsac.org


CLOUD TIDBITS

Today's Tidbit: VoltDB
WELCOME TO CLOUD TIDBITS! In each issue, I'll be looking at a different tidbit of technology that I consider unique or eye-catching and of particular interest to IEEE Cloud Computing readers. Today's tidbit is VoltDB, a new cloud database.

This system caught my eye for several reasons. First, it's the latest database designed by Michael Stonebraker, the database pioneer best known for Ingres, PostgreSQL, Illustra, StreamBase, and more recently, Vertica. But interestingly, in this go-around, Stonebraker declared that he has thrown all previous database architecture out the window and started over with a complete rewrite.1 What's resulted is something totally different from every other database, including all the column- and table-oriented NoSQL systems. Moreover, VoltDB claims a 50 to 100x speed improvement over other relational database management systems (RDBMSs) and NoSQL systems. It sounds too good to be true.

What we have is nothing short of a whole new class of SQL, as compared to the NoSQL compromises detailed below. This total rearchitecture, called NewSQL, supports 100 percent in-memory operation, supports SQL and stored procedures, and has a loosely coupled scale-out capability perfectly matched to cloud computing platforms.

Wait a minute! That doesn't sound possible. That's precisely why I thought it made for a perfect tidbit.

Early Databases
The first databases used hierarchical data models, in which all data was organized in a tree-like structure. This structure is simple but inflexible because it's confined to a one-to-many relationship. The IBM Information Management System (IMS), one of the first production databases, used this model. The hierarchical data model lost traction as the relational model became the de facto standard used by virtually all mainstream DBMSs.

The relational database uses a data model much more aligned with real-world business models. In this model, each data item has a row of attributes, so the database displays a fundamentally tabular organization. Tables can be related to other tables using a key mechanism. Relational databases displaced hierarchical databases because the ability to add new relations made it possible to add new, valuable information. SQL offered a way to program relational queries, and the database-powered IT marketplace was born.
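As an illustrative sketch of this key mechanism (the schema below is my own, not from the column), Python's built-in sqlite3 module shows how tables relate through keys and how SQL programs a relational query:

```python
import sqlite3

# An in-memory relational database: tables related through a key mechanism.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(id),  -- the relating key
    item TEXT)""")
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, 1, 'disk'), (2, 1, 'memory')])

# SQL expresses the one-to-many relationship as a join on the key.
rows = conn.execute("""SELECT c.name, o.item
                       FROM customer c JOIN orders o ON o.customer_id = c.id
                       ORDER BY o.id""").fetchall()
print(rows)  # [('Ada', 'disk'), ('Ada', 'memory')]
```

The REFERENCES clause declares the relationship, and the join recovers the one-to-many structure that hierarchical systems hard-wired into the tree.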
Implementations of relational databases trace their roots to the original RDBMS designs (IBM System R and follow-ons) of the 1970s. At that time, business data processing was the only DBMS market. The main user interface device then was the dumb terminal, and vendors imagined operators inputting queries through an interactive terminal prompt. Key architectural features of the original DBMSs were disk-oriented storage and indexing structures, multithreading to hide latency, locking-based concurrency control mechanisms, and log-based recovery.

Things Have Changed

In the past 25 years, several other markets have evolved, including data warehousing, text management, the Web, and stream processing. These markets have very different requirements from business data processing. Also in the last 25 years, processors became thousands of times faster and memories grew to be thousands of times larger. Storage volumes have increased enormously. Through the use of scale-out techniques, cloud computing has extended these resources, making them essentially infinite. Finally, systems that use DBMSs rarely run interactive transactions or present users with direct SQL interfaces.
And yet, no RDBMS has had a complete redesign since the technology's inception. Of course, there have been extensions over the years, including support for compression, shared-disk architectures, bitmap indexes, and user-defined data types and operators. However, all the major commercial DBMSs are still built around the original System R architectural features.

As a result, today's RDBMSs are expensive and difficult to scale. Scaling a DBMS typically involves migrating from an inexpensive commodity server to an expensive symmetric multiprocessing (SMP) server; redesigning the database (and corresponding application data access logic); implementing a data partitioning or sharding scheme (that is, manually dividing the database into many smaller databases running on different servers and modifying the application code to coordinate data access across the partitions); and then implementing a key-value (KV) store, thereby forfeiting transactional consistency and the ability to use SQL.
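The manual sharding step can be sketched as follows; the hash-modulo routing rule and four-shard layout are assumptions for illustration, not any particular product's scheme:

```python
# Manual hash-based sharding: the application, not the DBMS, decides
# which of several smaller databases holds a given record.
NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]  # stand-ins for separate servers

def shard_for(key: str) -> int:
    # Every data-access path in the application must route through this.
    return hash(key) % NUM_SHARDS

def put(key: str, value: dict) -> None:
    shards[shard_for(key)][key] = value

def get(key: str) -> dict:
    return shards[shard_for(key)][key]

put("customer:42", {"name": "Ada"})
# A query that spans records must now visit every shard itself,
# which is exactly the coordination burden pushed onto application code.
total = sum(len(s) for s in shards)
```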

Why Not Start Over?

In 2005, Stonebraker noted this lack of technological evolution and predicted the end of "one size fits all" as a commercial RDBMS paradigm. He assembled a group of researchers from Brown University, Carnegie Mellon University, the Massachusetts Institute of Technology, and Yale University. They worked together on a new way to implement a relational database called H-Store and, in 2007, published the paper "The End of an Architectural Era (It's Time for a Complete Rewrite)."1
According to the Stonebraker team, the typical RDBMS engine spends only 12 percent of its time doing useful work (see Figure 1). Traditional DBMSs have five sources of processing overhead:

• Index management: B-tree, hash, and other indexing schemes require significant CPU and I/O.
• Write-ahead logging: Traditional databases write everything twice: once to the database and once to the log. Moreover, the log must be forced to disk to guarantee transaction durability. Logging is, therefore, an expensive operation.
• Locking: Before touching a record, a transaction must set a lock on it in the lock table. This is an overhead-intensive operation.
• Latching: Updates to shared data structures (B-trees, the lock table, resource tables, and so on) must be done carefully in a multithreaded environment. Typically, this is done with short-duration latches, which are another considerable source of overhead.
• Buffer management: Data in traditional systems is stored on fixed-size disk pages. A buffer pool manages which set of disk pages is cached in memory at any given time. Moreover, records must be located on pages and the field boundaries identified. Again, these operations are overhead-intensive.

[Figure 1. General-purpose relational database management system (RDBMS) processing profile: index management, logging, locking, latching, buffer management, and useful work. (© 2014 VoltDB. Used with permission.)]
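The "write everything twice" behavior behind the logging overhead can be sketched in a few lines; the log record format and file location here are made up for illustration:

```python
import os
import tempfile

# Write-ahead logging sketch: every update is written twice, and the
# log write must be forced to disk before the transaction commits.
table = {}  # stand-in for the database proper

log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
log = open(log_path, "a")

def commit(key, value):
    log.write(f"SET {key}={value}\n")   # first write: the log record
    log.flush()
    os.fsync(log.fileno())              # forced to disk for durability
    table[key] = value                  # second write: the database itself

commit("balance", 100)
```

The fsync call is the expensive part: the transaction cannot report success until the disk confirms the log record is durable.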
Stonebraker and his team addressed all these issues, claiming that H-Store was the first implementation of a new class of parallel database management systems, NewSQL, that provide the high throughput and high availability of NoSQL systems, but without giving up support for SQL or the transactional guarantees of a traditional DBMS. Such systems can scale out horizontally across multiple machines to improve throughput, unlike all other SQL DBMSs, which must be scaled up (for example, to multiprocessor, shared-memory machines).

Introducing VoltDB

VoltDB was formed to commercialize the NewSQL technology. The H-Store project found an open source home in VoltDB, completing its implementation:

• Data and the associated processing are partitioned together and distributed across the CPU cores (virtual nodes) in a shared-nothing hardware cluster.
• Data is held in memory for maximum throughput, eliminating the need for buffer management.
• Each single-threaded partition operates autonomously, eliminating the need for locking and latching.
• Data is automatically replicated for intracluster and intercluster high availability.
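A toy sketch of the third point, written entirely as my own illustration rather than VoltDB's actual design: each partition runs one thread that alone touches that partition's data, so no locks or latches are needed, and the routing rule co-locates data with its processing:

```python
import queue
import threading

# Each partition owns its slice of the data and runs a single thread,
# so operations on that slice need no locks or latches.
class Partition:
    def __init__(self):
        self.data = {}                 # in-memory storage, no buffer pool
        self.work = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:                    # the only thread touching self.data
            op, key, value, done = self.work.get()
            if op == "put":
                self.data[key] = value
            done.put(self.data.get(key))   # reply to the caller

    def execute(self, op, key, value=None):
        done = queue.Queue()
        self.work.put((op, key, value, done))
        return done.get()

partitions = [Partition() for _ in range(4)]
route = lambda key: partitions[hash(key) % 4]   # data and work co-located

route("k1").execute("put", "k1", "v1")
result = route("k1").execute("get", "k1")       # 'v1'
```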
VoltDB isn't the first attempt to overcome the performance and scalability limitations of traditional databases. Two alternatives to VoltDB are running conventional databases in memory and using a NoSQL KV store.

In-memory systems can safely operate without buffer management and often without logging (at the expense of durability). However, even with these sources of overhead removed, the maximum performance improvement is roughly double. To achieve VoltDB's 50x speedup, all the legacy online transaction processing (OLTP) time sinks must be removed (that is, buffer management, logging, latching, and locking).

To deliver better performance on scale-out hardware, some databases, such as NoSQL KV stores, eliminate some of this overhead, and SQL and data integrity along with it (delivering eventual consistency). Unfortunately, because KV stores don't execute SQL, functionality that would normally be executed by the database must be implemented in the application layer.
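For example, a lookup that SQL would express as a single join must be hand-written in the application when only get/put operations exist. This sketch uses a plain dictionary as a stand-in KV store, with key names I invented for illustration:

```python
# With only get/put available, the "join" logic lives in the application layer.
kv = {
    "customer:1": {"name": "Ada", "order_ids": [10, 11]},
    "order:10": {"item": "disk"},
    "order:11": {"item": "memory"},
}

def orders_for(customer_key):
    customer = kv[customer_key]                    # one get
    return [(customer["name"], kv[f"order:{oid}"]["item"])
            for oid in customer["order_ids"]]      # one get per order

print(orders_for("customer:1"))  # [('Ada', 'disk'), ('Ada', 'memory')]
```

Every application that needs this join must reimplement it, and nothing in the store enforces that the order keys actually exist, which is the data-integrity loss the column describes.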
Unique Place in the CAP Theorem

According to theoretical computing's CAP theorem (also known as Brewer's theorem), a distributed system cannot satisfy the following guarantees at the same time:

• Consistency, which ensures that all nodes in the system see the same data simultaneously;
• Availability, which ensures that the system returns a response for every request, indicating whether or not it was successful; and
• Partition tolerance, which ensures that the system will continue to operate in the event of arbitrary message loss or partial system failure.

DBMSs that use a relational model (and support SQL) fall into the CA category (that is, they offer both consistency and availability, but not partition tolerance). Examples are MySQL, Oracle, DB2, Postgres, and SQL Server, as well as distributed SQL processors such as Aster Data and Greenplum. These architectures therefore have no partition tolerance; for example, they won't work well on a loosely coupled/distributed system (such as a cloud). But we already knew that.
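The trade-off can be made concrete with a two-replica toy model (entirely my own construction): while the network link between replicas is up, a write can be both consistent and available; once the link is down, the system must either reject the write (the CP choice) or accept it and let the replicas diverge (the AP choice):

```python
# Two replicas of one value; `linked` models the network between them.
class Replica:
    def __init__(self):
        self.value = None

def write(replicas, linked, value, favor_consistency):
    if linked:
        for r in replicas:
            r.value = value          # replicate synchronously: C and A
        return "ok"
    if favor_consistency:
        return "rejected"            # CP: stay consistent, lose availability
    replicas[0].value = value        # AP: stay available, replicas diverge
    return "ok"

a, b = Replica(), Replica()
assert write([a, b], linked=True, value=1, favor_consistency=True) == "ok"
assert a.value == b.value == 1
# During a partition, a CP system refuses the write; an AP system diverges.
assert write([a, b], linked=False, value=2, favor_consistency=True) == "rejected"
assert write([a, b], linked=False, value=2, favor_consistency=False) == "ok"
assert a.value != b.value
```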
Unique Place in Cloud Computing

Databases that will work well with cloud computing must have partition tolerance. And there are plenty of them; the cloud/big data area has no shortage of innovations! If we look at the AP and CP categories (that is, those with partition tolerance), we find all kinds of clever systems. AP examples include Dynamo, Voldemort, Tokyo Cabinet, KAI, Cassandra, SimpleDB, CouchDB, and Riak. CP examples include BigTable, Hypertable, HBase, MongoDB, Terrastore, Scalaris, BerkeleyDB, MemcacheDB, and Redis. However, all these systems use a KV, column-oriented/tabular, or document-oriented data model. None use a relational data model, and none support SQL.

VoltDB is a CP solution, making it the only relational-model, partition-tolerant DBMS I've ever seen. It has a truly unique place in the CAP theorem and, therefore, a unique place in the cloud. This notably unique piece of technology should make a lot of developers' lives a lot easier. When data is piling into the cloud, you need scale-out capability to handle it, and you wish you could use SQL: VoltDB gives you a new tool.

A CLOUD-FRIENDLY, SCALE-OUT ARCHITECTURE RDBMS? We thought that was impossible. And this, my friends, qualified it to be this column's Cloud Tidbit. I hope you enjoyed it!

Reference
1. M. Stonebraker et al., "The End of an Architectural Era (It's Time for a Complete Rewrite)," Proc. Very Large Databases (VLDB), 2007, pp. 1150-1160.

DAVID BERNSTEIN is the managing director of Cloud Strategy Partners, cofounder of the IEEE Cloud Computing Initiative, founding chair of the IEEE P2302 Working Group, and originator and chief architect of the IEEE Intercloud Testbed Project. His research interests include cloud computing, distributed systems, and converged communications. Bernstein was a University of California Regents Scholar and holds BS degrees with highest honors in both mathematics and physics. Contact him at david@cloudstrategypartners.com.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.


The Community for Technology Leaders

Membership Matters to Your Career, Your Technical Excellence, the Profession, and the World.

"IEEE Computer Society membership has helped me advance at each stage of my career. When I was first starting in my field, its magazines and journals were a great source of career-helping information. By staying on top of the leading-edge trends and networking with the top people in computing, I became more established in my career. The publications, digital library, and courses, as well as the ability to participate in conferences, all helped me develop in my field."
— Cecilia Metra, Associate Professor in Electronics, University of Bologna

Computer Society membership offers you a host of ways to stay up-to-date with the technology in your field, as well as learn new foundational and management skills that can advance your career opportunities:

• YOUR CHOICE OF FOCUS AREA and digital magazine: choose from Software and Systems, Security and Privacy, Computer Engineering, or Information and Communication Technologies
• FREE AND DISCOUNTED PROFESSIONAL TRAINING AND PUBLICATIONS, including Computer and other resources to enhance your knowledge
• FREE ONLINE BOOKS from Safari Books Online: a library of 600 tech and business books from top publishers such as O'Reilly Media, Addison-Wesley, Cisco Press, and more
• FREE ONLINE COURSES organized into 16 Knowledge Centers with hundreds of online courses and career resources from Skillsoft, covering topics related to Cisco, IT Security, Java, Project Management, Leadership, Oracle, and MS Office 2010, among others

Join or renew today at www.computer.org/membership


CLOUD AND THE LAW

Legal Issues in the Cloud
ENSURING A SECURE CLOUD SYSTEM (AND ECOSYSTEM) IS A HIGHLY SPECIALIZED AND INTERDISCIPLINARY FIELD. It requires a deep understanding of the underlying technical, social, public policy, regulatory, and legal and law enforcement aspects, as well as intimate knowledge of temporal trends (historical, recent, and emerging). Although the security, privacy, public policy, legal, and forensic challenges associated with cloud computing have attracted academic attention, particularly the issues relating to data sovereignty and confidentiality and to the inadequacy of our existing legislative and regulatory frameworks to protect data from prying eyes,1–6 research on the topic is still in its infancy. To inaugurate this column, I present here a general overview of some of these legal issues.
As cloud computing use grows throughout society, so, too, does its use by criminals.1,7 This is particularly true in sophisticated and organized crime, where ongoing secure communication, dissemination, and data storage are critical to a criminal syndicate's operation. However, emerging technologies such as cloud computing entail various challenges and implications for governments, particularly law
KIM-KWANG RAYMOND CHOO
University of South Australia
raymond.choo@fulbrightmail.org


IEEE CLOUD COMPUTING, PUBLISHED BY THE IEEE COMPUTER SOCIETY

enforcement and regulatory agencies, as well as other key stakeholders in both the public and private sectors.
Crimes involving cloud computing use typically involve an accumulation or retention of data on a digital device (such as a mobile phone) that must be identified, preserved, analyzed, and presented in a court of law, a process known as digital forensics.8–10 Conventional forensic tools often focus on physically accessing the media that stores the target data. However, in a cloud computing environment, it is often impossible or infeasible to access the physical media, and, in many cases, forensic investigators would have to rely on the cloud service provider to locate where the evidential data resides in the cloud.11,12
As Darren Quick and I pointed out,10 not all countries have legal provisions that allow data to be secured when a warrant is served, such as during a search and seizure process. For example, Section 3L of Australia's Crimes Act 1914 (Commonwealth legislation) allows the officer executing a search and seizure warrant to access data, including data not held at the premises, such as data accessible from a computer or data storage device used to access cloud services. Such provisions are designed to overcome the efforts of accused people to conceal data through the use of passwords or encryption, including in cloud services, but the provisions might not be available in other countries.10
Data fragmentation and distribution across numerous international datacenters also present technical and jurisdictional challenges in identifying and seizing the (fragile and elusive) evidential data, for government agencies in criminal investigations as well as for businesses in civil litigation.7,13,14 The technical and legal uncertainties surrounding these questions are, perhaps, why traditional boundaries are now blurred.15 As Australia's Chief Defence Scientist Alexander Zelinsky noted,

Due to the virtual, dynamic, and borderless nature of cloud computing services, government and law enforcement investigations into malicious cyber activities will require cooperation between government agencies from multiple countries. Government and law enforcement investigators face difficulty in accessing the physical hardware to locate evidential data.16
2325-6095/14/$31.00 © 2014 IEEE


Cloud security threats and vulnerability windows evolve over time, partly in response to defensive actions or crime displacement. For example, existing digital forensic techniques are designed to collect evidential data from typical digital devices, in which advanced security features and antiforensic techniques are rarely fully exploited. In contrast, sophisticated and organized criminals often use secure services and devices specifically designed to evade legal interception and forensic collection attempts. Therefore, the digital forensics space can be seen as a race to keep up with
• hardware and software/application releases, such as those released by cloud service providers; and
• software and hardware modifications made by end users, particularly sophisticated and organized criminals, to complicate or prevent digital evidence collection and analysis.
Although a legitimate need exists for cooperation between cloud service providers and government and law enforcement agencies, there are also legitimate concerns about cloud service providers being compelled to hand over user data that reside in the cloud to government agencies without the users' knowledge or consent, due to territorial jurisdiction asserted by a foreign government.17–19 As I noted in a 2010 article,
[F]oreign intelligence services and industrial spies may not disrupt the normal functioning of an information system as they are mainly interested in obtaining information relevant to vital national or corporate interests. They do so through clandestine entry into computer systems and networks as part of their information-gathering activities. Cloud service providers may be compelled to scan or search data of interest to national security and to report on, or monitor, particular types of transactional data as these data may be subject to the laws of the jurisdiction in which the physical machine is located. Overseas cloud service providers may not be legally obliged to notify the clients (owners of the data) about such requests.1

IT MIGHT BE IMPOSSIBLE TO COMPLETELY ERADICATE ILLEGAL AND MALICIOUS CYBER ACTIVITIES, but it's important to maintain persistent pressure on threat actors to safeguard cloud security and a secure cloud ecosystem. To help advance the state of the art in this research area and to address emerging cloud-related risks, this column is actively seeking high-quality technical-, business-, policy-, and legal-oriented submissions related to cloud issues such as
• cloud computing strategies,
• data protection,
• data governance,
• data sovereignty,
• extraterritorial jurisdiction (in theory and practice),
• forensics,
• incident response and management,
• information assurance,
• privacy,
• provenance,
• public–private partnership,
• risk management,
• security,
• service-level agreements (SLAs),
• surveillance, and
• visualizations.
We also welcome high-quality position, survey, and review papers from computer science and interdisciplinary scholars. Examples might include a comparative analysis, survey, or review of legal and privacy issues, such as
• the legislative trends across countries and the interplay between different legal areas (such as privacy, telecommunication interception, national data sovereignty, and national security legislation) and the cloud computing strategies in those countries; and
• the legal implications for cloud service providers and users if data is breached or users suffer an economic loss resulting from a provider's negligent act.


Our goal with this column is to help mitigate emerging and evolving cloud security threats and facilitate informed decisions about cloud security and privacy, as well as keep pace with society's needs and preferences in these areas. We welcome your contributions to this end.

References
1. K.-K.R. Choo, "Cloud Computing: Challenges and Future Directions," Trends & Issues in Crime and Criminal Justice, no. 400, 2010, pp. 1–6.
2. A. Gray, "Conflict of Laws and the Cloud," Computer Law & Security Review, vol. 29, no. 1, 2013, pp. 58–65.
3. K. Irion, "Government Cloud Computing and National Data Sovereignty," Policy & Internet, vol. 4, no. 3–4, 2012, pp. 40–71.
4. W. Maxwell and C. Wolf, "A Global Reality: Governmental Access to Data in the Cloud," white paper, 2012; www.cil.cnrs.fr/CIL/IMG/pdf/Hogan_Lovells_White_Paper_Government_Access_to_Cloud_Data_Paper_1_.pdf.
5. P. Ryan and S. Falvey, "Trust in the Clouds," Computer Law & Security Review, vol. 28, no. 5, 2012, pp. 513–521.
6. K.L. Ter, "Singapore's Personal Data Protection Legislation: Business Perspectives," Computer Law & Security Review, vol. 29, no. 3, 2013, pp. 264–273.
7. C. Hooper, B. Martini, and K.-K.R. Choo, "Cloud Computing and Its Implications for Cybercrime Investigations in Australia," Computer Law & Security Review, vol. 29, no. 2, 2013, pp. 152–163.
8. B. Butler and K.-K.R. Choo, "IT Standards and Guides Do Not Adequately Prepare IT Practitioners to Appear as Expert Witnesses: An Australian Perspective," to be published in Security J., 2014; http://dx.doi.org/10.1057/sj.2013.29.
9. R. McKemmish, "What Is Forensic Computing?" Trends and Issues in Crime and Criminal Justice, no. 118, 1999, pp. 1–6.
10. D. Quick and K.-K.R. Choo, "Forensic Collection of Cloud Storage Data: Does the Act of Collection Result in Changes to the Data or Its Metadata?" Digital Investigation, vol. 10, no. 3, 2013, pp. 266–277.
11. B. Martini and K.-K.R. Choo, "Cloud Storage Forensics: ownCloud as a Case Study," Digital Investigation, vol. 10, no. 4, 2013, pp. 287–299.
12. B. Martini and K.-K.R. Choo, "An Integrated Conceptual Digital Forensic Framework for Cloud Computing," Digital Investigation, vol. 9, no. 2, 2012, pp. 71–80.
13. D. Quick and K.-K.R. Choo, "Digital Droplets: Microsoft SkyDrive Forensic Data Remnants," Future Generation Computer Systems, vol. 29, no. 6, 2013, pp. 1378–1394.
14. D. Quick and K.-K.R. Choo, "Dropbox Analysis: Data Remnants on User Machines," Digital Investigation, vol. 10, no. 1, 2013, pp. 3–18.
15. D. Jones and K.-K.R. Choo, "Should There Be a New Body of Law for Cyber Space?" to be published in Proc. 22nd Euro. Conf. Information Systems (ECIS 2014), 2014.
16. D. Quick, B. Martini, and K.-K.R. Choo, Cloud Storage Forensics, Syngress/Elsevier, 2014.
17. "An Open Letter from US Researchers in Cryptography and Information Security," 24 Jan. 2014; http://masssurveillance.info.
18. D. Castro, "How Much Will PRISM Cost the U.S. Cloud Computing Industry?" Information Technology and Innovation Foundation, 2013.
19. R.A. Clarke et al., Liberty and Security in a Changing World, 12 Dec. 2013; www.whitehouse.gov/sites/default/files/docs/2013-12-12_rg_final_report.pdf.

KIM-KWANG RAYMOND CHOO is a senior lecturer in the School of Information Technology and Mathematical Sciences at the University of South Australia. His research interests include cyber and information security and digital forensics, and his books include Secure Key Establishment (Advances in Information Security) (Springer, 2009) and Cloud Storage Forensics, with Darren Quick and Ben Martini (Elsevier, 2013). Choo has a PhD in information security from Queensland University of Technology, Australia. His honors include a 2009 Fulbright Scholarship, a 2008 Australia Day Achievement Medallion, the 2010 Australian Capital Territory Pearcey Award, the 2010 Consensus IT Professional Award, and the British Computer Society's Wilkes Award. Contact him at raymond.choo@fulbrightmail.org.


Call for Papers

Special Issue on Cloud-Based Smart Evacuation Systems for Emergency Management
For IEEE Cloud Computing's Nov/Dec 2014 issue
Submission Deadline: 20 August 2014

Natural and man-made emergencies, such as tsunamis, earthquakes, floods, and epidemics, pose a significant threat to human societies. Well-coordinated emergency management activities that involve guiding citizens out of danger areas, placing medical teams in the most appropriate locations, and planning evacuation routes before and after a disaster play a significant role in saving lives, protecting critical infrastructures, and minimizing casualties.

The management of evacuation activities, such as guiding people out of dangerous areas and coordinating rescue teams, depends on the availability of historical data as well as on the effective real-time integration and utilization of data streaming from multiple sources, including on-site sensors, social media feeds, and messaging on mobile devices. However, the growing ubiquity of on-site sensors, social media, and mobile devices means there are more sources of outbound traffic, which ultimately results in the creation of a tsunami of data beginning shortly after the onset of emergency events. This data tsunami phenomenon presents a new grand challenge in computing. During the 2010 Haiti earthquake, text messaging via mobile phones and Twitter made headlines as being crucial for disaster response, but only some 100,000 messages were actually processed by government agencies because of the lack of an automated and scalable data processing infrastructure.

Design and development of evacuation systems for emergency management requires a complete information and communication technology (ICT) paradigm shift so that systems do not get overwhelmed by incoming data volume, data rate, data sources, and data types. New cloud-based techniques are needed that can extract meaningful information from large-scale data in real time, while avoiding unnecessary data transmission or storage. Future initiatives should focus on developing cloud-based techniques that improve the performance of multiple data-stream processing systems while balancing computational complexity and quality of service.

This special issue aims to solicit both original research and tutorial articles that discuss cloud computing strategies to enable safer and more effective emergency response.

Submission Guidelines
Submissions will be subject to IEEE Cloud Computing magazine's peer-review process. Articles should be at most 6,000 words, with a maximum of 15 references, and should be understandable to a broad audience of people interested in cloud computing, big data, and related application areas. The writing style should be down to earth, practical, and original. All accepted articles will be edited according to the IEEE Computer Society style guide. Submit your papers through Manuscript Central at https://mc.manuscriptcentral.com/ccm-cs. For more information, contact the guest editors:
• Rajiv Ranjan, CSIRO, Australia, raj.ranjan@csiro.au
• Samee Khan, NDSU, USA, samee.khan@ndsu.edu
• Joanna Kolodziej, CUT, Poland, joanna.kolodziej68@gmail.com
• Albert Zomaya, Sydney University, Australia, zomaya@it.usyd.edu.a

www.computer.org/cloudcomputing

ROCK STARS OF CYBERSECURITY

Speakers:
• Peter Allor, IBM, Cyber Security Strategist, Federal
• Gary McGraw, Cigital, CTO
• Peter Fonash, Dept. of Homeland Security, CTO
• Brett Wahlin, HP, Vice President and CISO
• Peder Jungck, BAE, Vice President and CTO
• Sarath Geethakumar, VISA, Senior Director, Global Information Security

At the Rock Stars of Cybersecurity conference, well-respected cybersecurity authorities from leading companies will deliver case studies and actionable advice that you can immediately put to use. At the conference, you will learn:
• Effective strategies for securing business operations
• New and innovative approaches to responding to today's security threats
• How government agencies are balancing cybersecurity threats and privacy
• Big data's implications for security analytics
• Implications of the cybersecurity skills shortage on the ability to respond to attacks
• How to implement a secure enterprise architecture

24 September 2014
Brazos Hall, Austin, TX

REGISTER NOW
Early pricing now available: $299 (Full price: $399)
IEEE Computer Society Member: $229 (Full price: $329)
Special discounts available for teams of 3 or more.

computer.org/cyber-security