Janakiram MSV
Cloud Computing Strategist
www.janakiramm.net | mail@janakiramm.net
Chapter 1
Defining the Cloud
Evolution of Cloud Computing
Evolution of ISP
Multiple factors led to the evolution of Cloud Computing. One of
the key factors is the way Internet Service Providers (ISPs) have matured over
time. I am borrowing this analogy from Forrester Research.
From the initial days of offering basic internet connectivity to offering software as
a service, the ISPs have come a long way. ISP 1.0 was all about providing internet
access to their customers. ISP 2.0 was the phase where ISPs offered hosting
capabilities. The next step was co-location through which the ISPs started leasing
Virtualization
Virtualization is the most discussed term among CIOs and IT decision makers.
Through Virtualization, the data center infrastructure can be consolidated from
hundreds of servers to just tens of servers. All the physical server roles like Web
Servers, Database Servers and Messaging Servers run as virtualized instances. This
results in lower Total Cost of Ownership (TCO) and brings substantial savings on the
power bills and reduced cost of cooling equipment.
Though the evolution of ISPs, the programmable web and virtualization are independent
trends, they all contributed to the evolution of Cloud Computing.
If you are wondering what is so special about the 'Cloud' in Cloud Computing, here
is the explanation. Traditionally, developers and architects used a picture of a cloud
to illustrate a remote resource connected via the web. Eventually the cloud became
the logical connector between the local and remote resources on the internet.
Cloud OS
Visualize a scenario where the hardware and the Operating System (OS) are
exposed as a Web Service over the public internet. Based on the principles of Web
Services, we could send a request to this service along with a few parameters.
Since the OS is expected to act as an interface to the CPU and the devices, we can
potentially invoke a service that accepts a 'job' that will be processed by the OS
and the underlying hardware. Technically, this Web Service has just turned the OS
+ H/W combination into a 'Service'. We can start consuming this service by
Cloud FX
Developers always develop and deploy their applications on the application
development platforms. Some of the most popular application development
platforms are .NET and Java. In the last scenario, we have seen how the OS + H/W
combination is offered as a service. Now, imagine a scenario where the application
development platform is offered to you as a service. Through this, you will be able
to develop and test your applications on a low-end, inexpensive notebook PC but
will be able to 'submit' your code to run on the most powerful hardware infrastructure.
It is the same programming language, SDK and runtime that run in your
development environment. If the hardware, OS, the language runtime and the SDK
are offered to you as a service, what would you call this? A Cloud Platform or
maybe Cloud FX? We will address this in the next section.
Infrastructure as a Service
In the previous section we discussed the Cloud OS. All that the Cloud OS offers is
the infrastructure services. You may choose to use a REST API to manage this OS or
use SSH or a Remote Desktop console. Technically, when you are able to delegate a
program to execute on a remote OS running on the Web, you are leveraging
Infrastructure as a Service (IaaS). This is different from classic web hosting. Web
hosting only hosts web pages and cannot execute code that needs low-level access
to the OS API. Web hosting cannot dynamically scale on demand. IaaS enables you
to run your computing task on a virtually unlimited number of machines. Remember
that through IaaS, you have just moved a server running in your backyard into the
Platform as a Service
Platform as a Service or PaaS goes one level above the Cloud OS. Through this,
developers can leverage a scalable platform to run their applications. The
advantage of PaaS is that the developers need not worry about installing,
maintaining, securing and patching the server. The PaaS provider takes the
responsibility of the infrastructure and exposes the platform alone as a service.
Through this, developers can achieve a higher level of scalability, reliability and
availability of their applications. Microsoft Azure and Google App Engine are
examples of PaaS.
Software as a Service
Software as a Service (SaaS) is a silent revolution in the world of traditional
software products. With the availability of Intel Atom based Netbooks and
Consumers
Consumers will experience the Cloud through a variety of applications that they
will use in their day to day life. If you have ever used Google Docs or Microsoft Live
Mesh, you are already leveraging the Cloud. Consumers will subscribe to Software
as a Service offerings.
Here are the 4 key capabilities that Cloud Computing offers:
Elasticity
This is the most important attribute of the Cloud. You might start running your
application on just a single server. But in no time, Cloud Computing enables
you to scale your application to run on hundreds of servers. Once the traffic and
usage of your application decrease, you can scale down to tens of servers. All
this happens almost instantly and the best thing is that your application and your
customers don't even notice. This dynamic capability to scale up and
scale down is called Elasticity. Elasticity brings an 'illusion of Infinity'. Though
nothing is infinite in this world, your application can get as many
resources as it demands. This is the biggest unique selling point of the Cloud.
Now, think of web hosting. When you want to add another server to your web
application, your hosting provider has to manually provision it for you. Adding
servers and configuring the network topology introduces a
time lag that your business cannot afford. Most of the Cloud Computing vendors
offer an intuitive way of manipulating your server configuration and topology.
Elasticity is the single most important attribute of the Cloud.
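The scale-up and scale-down decision described above can be sketched in a few lines of Python. This is only an illustration: the function name desired_servers, the capacity of 500 requests per second per server and the minimum of one server are assumptions made for this example, not values taken from any Cloud provider.

```python
import math


def desired_servers(requests_per_second, capacity_per_server=500, minimum=1):
    """Return how many servers the current load calls for.

    capacity_per_server is a hypothetical figure chosen for illustration.
    """
    needed = math.ceil(requests_per_second / capacity_per_server)
    return max(minimum, needed)


# Traffic rises sharply and then falls; the server count follows it.
for load in (200, 48000, 4500):
    print(load, "req/s ->", desired_servers(load), "server(s)")
```

A real Cloud would feed a decision like this into its provisioning API; here the point is only that the server count tracks demand in both directions.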
Pay-By-Use
Elasticity and Pay-By-Use go hand in hand. When you scale up
your application by adding more resources, you know how much it is going to
cost you. Pay-By-Use is a boon for startups. As an entrepreneur, you have to
balance your investment between human resources and IT resources. The
biggest benefit of Pay-By-Use is that it reduces CAPEX and turns your IT
investment into OPEX. The analogy that I typically use is that of a Cable or DTH
TV subscription. During the Cricket World Cup or NBA season, you would
want to subscribe to the sports channels and unsubscribe the moment the
event is over. With Pay-By-Use, you can subscribe to and unsubscribe from IT
infrastructure based on your needs, and you only pay for what you use. This is
the most efficient way of spending your IT budget.
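To make the comparison concrete, here is a small sketch that contrasts paying for peak capacity around the clock with paying only for the server-hours actually used. The usage pattern and the rate of 10 cents per server-hour are invented for this example, and both function names are hypothetical.

```python
def fixed_cost_cents(peak_servers, hours, rate_cents):
    """Cost of keeping enough servers for the peak running all the time."""
    return peak_servers * hours * rate_cents


def pay_by_use_cost_cents(servers_per_hour, rate_cents):
    """Cost when each hour is billed for the servers used in that hour."""
    return sum(servers_per_hour) * rate_cents


# One day: 20 quiet hours on 2 servers and a 4-hour spike on 10 servers.
usage = [2] * 20 + [10] * 4
rate = 10  # hypothetical price in cents per server-hour

print("provisioned for peak:", fixed_cost_cents(max(usage), len(usage), rate), "cents")
print("pay-by-use:", pay_by_use_cost_cents(usage, rate), "cents")
```

With these made-up numbers the peak-provisioned day costs 2400 cents and the pay-by-use day costs 800 cents; the gap grows as the spikes get shorter and sharper.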
Self Service
When you are able to scale up and scale down and
only pay for what you use, you never want to wait for someone in the
data center to add an additional server to your application. The Cloud can deliver on its
promise only when there is Self Service. Through this, you can control the
resources all by yourself without an intermediary. When you add a new CPU
core, a server instance or extra storage, you do it yourself by using the
Console offered by the Cloud provider. This results in a reduction in IT support
and maintenance. Today most organizations have dedicated IT teams to
provision new machines, storage, collaboration portals and mailboxes as a part
of on-boarding new employees. Through Self Service, a fairly non-technical
person can achieve these tasks and you don't need certified system
Programmability
This is a critical parameter of the Cloud. The Cloud makes the developers
extremely important. Developers are familiar with the concepts of
multithreading where they spawn new threads to achieve scalability and the
responsiveness of the application. They incorporate logic to create additional
threads on demand. The programmability aspect of the Cloud adds a new
So, let's summarize what we just discussed. Cloud Computing has 4 key tenets:
1) Elasticity, 2) Pay-By-Use, 3) Self Service, and 4) Programmability.
Having understood the key attributes of the Cloud, you might start wondering how
you can bring these capabilities to your data center in the enterprise. The reality is
that these capabilities can be applied to your data center, and the result is
called the Private Cloud. It is time for us to discuss the various implementations of
the Cloud. We will look at 4 different ways in which the Cloud can be
implemented.
Private Cloud
Simply put, Private Clouds are normal data centers within an enterprise with all
4 attributes of the Cloud: Elasticity, Self Service, Pay-By-Use and
Programmability. By setting up a Private Cloud, enterprises can consolidate their IT
infrastructure. They will need fewer IT staff to manage the data center. They will
also see lower power bills because of reduced electricity consumption and
fewer cooling equipment needs. Private Cloud empowers employees within an
organization through Self Service of their IT needs. It becomes easy to provision
new machines and quickly assign them to project teams. Private Cloud borrows
some of the best practices of the Public Cloud but is limited to an organizational
boundary. A Private Cloud can be set up using a variety of offerings from VMware,
Microsoft, IBM, Sun and others. There are also Open Source
implementations like Eucalyptus and Ubuntu Enterprise Cloud. We will discuss more
about the Private Cloud in the coming episodes.
Hybrid Cloud
There are scenarios where you need a combination of Private Cloud and Public
Cloud. Due to regulations and compliance issues in a few countries, sensitive
data like citizen information, patient medical history, and financial transactions
cannot be stored on servers that are not physically located within the political
boundaries of the country. In some scenarios, enterprise customers want to get
the best of both worlds by logically connecting their Private Cloud and the Public
Cloud. Through this, they can offer seamless scalability by moving some of the
on-premise and Private Cloud based applications to the Public Cloud. Security plays a
critical role in connecting the Private Cloud to the Public Cloud. Realizing its
importance, Amazon Web Services has recently announced Virtual Private Cloud
(VPC) that securely bridges a Private Cloud and Amazon Web Services. It is almost
like extending your infrastructure beyond the organizational boundary and the
firewall in a secure way. Microsoft's recent announcement of Windows AppFabric
brings the concept of Hybrid Cloud to Microsoft's future customers.
Community Cloud
A Community Cloud is implemented when a set of businesses have a similar
requirement and share the same context. It is made available to a
select set of organizations. For example, the Federal government in the US may decide to
set up a government-specific Community Cloud that can be leveraged by all the states.
Through this, individual local bodies like state governments will be freed from
investing in, maintaining and managing their local data centers. Similarly, the
Reserve Bank of India (RBI) may set up a Community Cloud for all the financial
institutions that share common goals and requirements. So, a Community Cloud is a
sort of Private Cloud that goes beyond just one organization.
Server Virtualization
There are many reasons for running Virtualization on the servers running in a
traditional data center. Here are a few:
It is far faster and more flexible to restore a failed web server, app server or
database server that is running as a virtualized instance. Since these instances are
simply files on the hard disk of the host operating system, copying over a
replica of the failed server image is faster than restoring a failed physical server.
Administrators can maintain multiple versions of the VMs that come in handy during
restoration. The best thing about this is that the whole copy and restore process
can be automated as a part of a disaster recovery plan.
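Because a VM is just a file on the host, the restore step can be scripted. The sketch below is a minimal illustration using only the Python standard library; the function restore_vm, the file names and the directory layout are hypothetical, and small throwaway files stand in for real VM images.

```python
import shutil
import tempfile
from pathlib import Path


def restore_vm(replica_image: Path, live_image: Path) -> Path:
    """Overwrite a failed VM's disk image with a known-good replica."""
    live_image.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(replica_image, live_image)  # copies contents and metadata
    return live_image


# Demonstration with throwaway files standing in for real VM images.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    replica = root / "backups" / "web01-v3.img"
    live = root / "vms" / "web01.img"
    replica.parent.mkdir(parents=True)
    replica.write_bytes(b"known-good disk contents")
    live.parent.mkdir(parents=True)
    live.write_bytes(b"corrupted")

    restore_vm(replica, live)
    restored_ok = live.read_bytes() == replica.read_bytes()
    print("restored:", restored_ok)
```

In a real disaster recovery plan the copy would be followed by restarting the VM through the Hypervisor's management tools, which is outside the scope of this sketch.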
It is very common that certain servers in the data center are less utilized while
some servers are maxed out. Through virtualization, the load can be evenly spread
across all the servers. There are management software offerings that will
automatically move VMs to idle servers to dynamically manage the load across the
data center.
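The idea of spreading load evenly can be illustrated with a simple greedy placement: take the VMs from heaviest to lightest and put each one on whichever server currently carries the least load. This is a sketch under stated assumptions, not how any particular management product works; the function spread_vms, the VM names and the load figures are made up, and real products also migrate running VMs rather than computing a placement from scratch.

```python
import heapq


def spread_vms(vm_loads, server_names):
    """Assign each VM to the currently least-loaded server (greedy)."""
    heap = [(0, name) for name in server_names]  # (total load, server)
    heapq.heapify(heap)
    placement = {name: [] for name in server_names}
    # Place the heaviest VMs first so the lighter ones can even things out.
    for vm, load in sorted(vm_loads.items(), key=lambda item: -item[1]):
        total, name = heapq.heappop(heap)
        placement[name].append(vm)
        heapq.heappush(heap, (total + load, name))
    return placement


vms = {"web1": 60, "web2": 50, "db1": 40, "mail1": 30, "app1": 20}
placement = spread_vms(vms, ["server-a", "server-b"])
print(placement)
```

With these numbers server-a ends up with a total load of 110 and server-b with 90, instead of one server being maxed out while the other sits idle.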
Virtualization has a direct impact on the bottom line. First, consolidating the
data center to run on fewer but more powerful servers brings a significant cost
reduction. The power consumed by the data center and the maintenance cost of
the cooling equipment come down drastically. The other problem that
virtualization solves is the migration of servers. When the hardware reaches the
end of its lifecycle, the physical servers need to be replaced. Backing up and
restoring the data and installing software on a production server is very
complex and expensive. Virtualization makes this process extremely simple and
cost effective. The physical servers are replaced and the VMs just get restarted
without any change in the configuration. This has a big impact on IT budgets.
Efficient management
A Hypervisor can potentially replace the OS and can even boot directly from a VM.
This is called the bare metal approach to virtualization. These Hypervisors have a small
footprint of a few megabytes (VMware ESXi is just 32MB in size!) and have an
embedded OS with them. Hypervisors are assisted by the hardware virtualization
features built into the latest Intel and AMD CPUs. This combination of hardware
and Hypervisor turns the server into a lean and mean machine to host multiple
VMs. The VM that is used by the Hypervisor to boot as a host is called a
paravirtualized VM. This concept makes virtualization very powerful. Imagine
a server booting in a few seconds and the required paravirtualized (host) VM being
copied over gigabit Ethernet to run multiple guest VMs. This makes the data center
very dynamic and agile. The Hypervisor can be controlled from a central
console and can be told which host VM to boot and which guest VMs to
run on it.
This product is based on the proven, open source Hypervisor called Xen. Xen's
paravirtualization technology is widely acknowledged as the fastest and most
secure virtualization software in the industry and it is enhanced by taking full
advantage of the latest Intel® VT and AMD-V™ hardware virtualization assist
capabilities. This product is free and can be downloaded from Citrix.com.
VMware ESXi
This product is another bare metal Hypervisor from the virtualization leader,
VMware. This is one of the best Hypervisors, with a footprint of just 32MB. ESXi ships
with a Direct Console User Interface (DCUI) that provides the basic UI required for
administering and managing the Hypervisor. Through its standard Common
Information Model (CIM) system, it also exposes the APIs to control the
infrastructure.
This is a free Hypervisor from Microsoft based on the same Hypervisor that ships
with Microsoft Windows Server Hyper-V edition. This is best suited for Virtual
Desktop Infrastructure (VDI) because of its compatibility with Windows Vista and
Windows 7. Hyper-V does not have any local GUI but can be managed from System
Center Virtual Machine Manager (SCVMM).
Elasticity
We know that the key attribute of the Cloud is Elasticity, which is the ability to
scale up and scale down on the fly. This capability is achieved only through
virtualization. Scaling up is technically adding more server VMs to an application
and scaling down is detaching the VMs from the application.
Self Service
The next attribute is Self Service. The Hypervisor comes with an API and the
required agents to manage it remotely. This functionality can surface through the
Self Service portals that the Cloud vendor offers. So, when you move a slider to
increase the number of servers in your web tier, you are essentially asking the
Hypervisor to carry out that request.
Pay-By-Use
Pay-By-Use is the next attribute of the Cloud. By leveraging the management and
monitoring capabilities of the Hypervisor, metering the usage of resources like the
CPUs, RAM and storage can be easily achieved.
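Metering boils down to adding up resource samples per customer and multiplying by a price. The sketch below assumes the monitoring layer reports one sample per customer per hour; the sample format, the function names meter and bill, and the prices are all hypothetical and chosen only to show the arithmetic.

```python
from collections import defaultdict

# Hypothetical prices, in arbitrary price units per resource-hour.
RATES = {"cpu_core_hours": 5, "ram_gb_hours": 2, "storage_gb_hours": 1}


def meter(samples):
    """Sum hourly (customer, cpu_cores, ram_gb, storage_gb) samples per customer."""
    totals = defaultdict(lambda: {key: 0 for key in RATES})
    for customer, cpu_cores, ram_gb, storage_gb in samples:
        totals[customer]["cpu_core_hours"] += cpu_cores
        totals[customer]["ram_gb_hours"] += ram_gb
        totals[customer]["storage_gb_hours"] += storage_gb
    return dict(totals)


def bill(usage):
    """Turn one customer's metered usage into a charge."""
    return sum(usage[key] * RATES[key] for key in RATES)


samples = [("acme", 2, 4, 100), ("acme", 4, 8, 100), ("globex", 1, 2, 50)]
usage = meter(samples)
for customer, totals in usage.items():
    print(customer, totals, "->", bill(totals), "units")
```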
Programmable Infrastructure
Programmable Infrastructure is the last key tenet of the Cloud. We already saw
how the API wired into Hypervisors can be leveraged. Developers can directly talk
to the Hypervisor through the native APIs or Web Services exposed by the Cloud
vendors. Through this, they can take the control of the VMs.
Clearly, the Cloud relies heavily on virtualization and efficient
Hypervisors to achieve its goal.
Now that we know how Virtualization forms the core of the Cloud, let me put
things in perspective. Let's see what actually goes inside the Cloud.
Geographic location
We start by deciding where to physically run our application. Most of the Cloud
providers give you an option to host your application at a specific location.
Depending on the customer base and the expected user location, you can choose a
Data Center
Though you do not have a direct choice in this, your app will be deployed at a data
center physically located at a place that you have chosen. These data centers
typically run thousands of powerful servers that offer a lot of storage and
computing power.
Server
You never know which physical server is responsible for running your code and the
application. In most of the cases, the app that you deployed may be powered by
more than one server running within the same data center. You cannot assume that
the same physical server will run the next instances of your app. Servers are
treated as a commodity resource to host the VMs. There is no affinity between a
VM and a physical server. Each server in the data center is optimally utilized at any
given point.
Virtual Machine
This is the layer that you will directly interact with. In Platform as a Service
(PaaS), you may not realize that you are dealing with a VM but in reality most of
the Cloud implementations will host your code or app on a VM. VMs are essential to
respect the 4 tenets of the Cloud. Your application runs on a VM that is managed by
the Hypervisor running across all the servers. These VMs are moved across servers
based on the server utilization. There is no guarantee that the VM that you launch
will run on the same physical server. There will be a load balancer which will
ensure that your applications are scalable by exploiting the power of all the VMs
associated with your application.
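One simple way a load balancer can exploit all the VMs behind an application is round-robin routing, where each incoming request goes to the next VM in turn. The class below is a toy illustration of that idea; the class name, the VM names and the choice of round-robin are assumptions for this example, and real Cloud load balancers also weigh health checks and current load.

```python
import itertools


class RoundRobinBalancer:
    """Hand each incoming request to the next VM in a fixed rotation."""

    def __init__(self, vms):
        vms = list(vms)
        if not vms:
            raise ValueError("at least one VM is required")
        self._rotation = itertools.cycle(vms)

    def route(self, request):
        # The request content is ignored; only the rotation decides.
        return next(self._rotation)


balancer = RoundRobinBalancer(["vm-1", "vm-2", "vm-3"])
assignments = [balancer.route(f"request-{i}") for i in range(5)]
print(assignments)
```

The application never needs to know which physical server hosts vm-1, vm-2 or vm-3, which matches the lack of affinity between VMs and servers described above.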
Given that Amazon offers the core capabilities to run a complete web application
or a Line of Business application, it is clearly Infrastructure as a Service
(IaaS). AWS is truly a platform of platforms. You can choose the OS, app
server and programming language of your choice. The AWS SDK and API are available
for most of the popular languages including Java, .NET, Python and Ruby.
Amazon Physical
Infrastructure
Let's take a closer look at some of the major Cloud service offerings from Amazon:
S3
Amazon's Simple Storage Service or S3 is a great way to store data on the Cloud
that can be accessed by any application with access to the internet. S3 can store
any arbitrary data as objects accompanied by metadata. These objects can be
organized into buckets. Every bucket and object has a set of permissions defined in
an Access Control List (ACL). The objects stored in S3 can be anything from a
document or a media file to serialized objects or even Virtual Machine images. Each
object can be up to 5GB in size while the metadata can be up to 2KB. All the objects can
be accessed using simple REST or SOAP calls. This makes S3 an ideal storage
solution to centrally store and retrieve data across multiple clients. S3 can also be
treated as a virtual file system to provide persistent storage capabilities to
applications.
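The bucket, object, metadata and ACL concepts can be modeled in a few lines. The following is a toy in-memory model written for this explanation; it is not the S3 API, the class and method names are invented, and the ACL is simplified to a single list of readers per bucket. The 5GB and 2KB limits are the ones quoted above.

```python
MAX_OBJECT_BYTES = 5 * 1024 ** 3  # 5GB per object, as quoted in the text
MAX_METADATA_BYTES = 2 * 1024     # 2KB of metadata, as quoted in the text


class Bucket:
    """Toy in-memory bucket; a teaching model, not the real S3 API."""

    def __init__(self, name, readers):
        self.name = name
        self.readers = set(readers)  # simplified bucket-level ACL
        self._objects = {}

    def put(self, key, data, metadata=None):
        metadata = dict(metadata or {})
        meta_size = sum(len(k.encode()) + len(v.encode())
                        for k, v in metadata.items())
        if len(data) > MAX_OBJECT_BYTES:
            raise ValueError("object too large")
        if meta_size > MAX_METADATA_BYTES:
            raise ValueError("metadata too large")
        self._objects[key] = (data, metadata)

    def get(self, key, requester):
        if requester not in self.readers:
            raise PermissionError(f"{requester} may not read {self.name}")
        return self._objects[key]


bucket = Bucket("reports", readers={"alice"})
bucket.put("2010/q1.pdf", b"%PDF-...", {"content-type": "application/pdf"})
data, meta = bucket.get("2010/q1.pdf", requester="alice")
print(len(data), meta)
```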
EC2
In simple terms, EC2 is like hiring a server running at a remote location. These servers
are actually Virtual Machines running in Amazon's powerful data
centers. Amazon calls the images from which these virtualized server instances are
launched Amazon Machine Images, or AMIs. The instances come in different sizes that
you can choose from. Please
refer to http://aws.amazon.com/ec2/#instance for more details on the instance
types. There are many pre-configured AMIs that you can choose from. The typical
workflow on EC2 is that you choose a pre-configured AMI, launch that AMI,
customize it by adding additional software and loading an app and, finally, save
it as your custom AMI on S3. You can launch multiple instances of your AMI
and attach them to an IP called the Elastic IP. Because of the dynamic capability of
launching multiple instances of the same AMI to scale up and terminating them to
scale down, it is called Elastic Compute Cloud.
SQS
SQS is the message queue on the Cloud. It supports programmatic sending of
messages via web service applications as a way to communicate over the internet.
Message Oriented Middleware (MOM) is a popular way of ensuring that the messages
are delivered once and only once. Moving that infrastructure to the web by
yourself is expensive and hard to maintain. SQS gives you this capability on-demand
and through the pay-by-use model. SQS is accessible through REST and SOAP based
APIs.
CloudFront
When your web application targets users around the globe, it makes sense to serve
the static content through a server that is closer to the user. One of the solutions
based on this principle is called Content Delivery Network (CDN). But this
infrastructure of geographically spread servers to serve static content can be very
expensive. CloudFront is CDN as a service. Amazon is leveraging its data center
presence across the globe by serving content through these edge locations.
CloudFront utilizes S3 by replicating the buckets across multiple edge servers.
Amazon charges you only for the data that is served through CloudFront and there
is no requirement for upfront payment.
SimpleDB
If S3 offers storage for arbitrary binary data, SimpleDB is a flexible way to store
Name/Value pairs on the Cloud. This dramatically reduces the overhead of
continuously maintaining a relational database. SimpleDB is accessed through REST
and HTTP calls and can be easily consumed by any client that can parse an HTTP
response. Many Web 2.0 applications built using AJAX, Flash and Silverlight can
easily access data from SimpleDB. It is the only service from Amazon that is free up
to a specific threshold.
Scenarios
Scalable Web Application
If you are an aspiring entrepreneur and want to go live with your app without an
upfront investment, Amazon is the place to go. By running your web app on
Amazon, you can dynamically scale your application on demand and only pay for
what you use. This can be the best playground for you to determine your server
capacity needs and assess the peak traffic patterns before the commercial launch of
your web app.
At a high level, the Windows Azure Platform has 4 key services. The first one is
Windows Azure, which is the Cloud OS from Microsoft. The second service is
AppFabric, which enables the integration of on-premise services with the Cloud.
The third service is a database on the Cloud called SQL Azure, which is based on
Microsoft SQL Server. The latest addition to the platform is a service codenamed
“Dallas”, which is a marketplace to publish, discover, consume and analyze premier
content.
I will first explain each of the components of Windows Azure Platform and then
walk you through the scenarios for deploying applications on this platform.
Windows Azure
Windows Azure is the heart and soul of the Azure Platform. It is the OS that runs on
each and every server running in the data centers across multiple geographic
locations. It is interesting to note that the Windows Azure OS is not available as a retail
OS. It is a homegrown version exclusively designed to power Microsoft's Cloud
infrastructure. Windows Azure abstracts the underlying hardware and creates the
illusion that it is just one instance of an OS. Because this OS runs across multiple
physical servers, there is a layer on top that coordinates the execution of
processes. This layer is called the Fabric. In between the Fabric and the Windows
Azure OS, there are hundreds of Virtual Machines (VMs) that actually run the code
and the applications. As a developer, you will only see two services at the top of
this stack. They are 1) Compute and 2) Storage.
You interact with the Compute service when you deploy your applications on
Windows Azure. Applications are expected to run within one of the two roles called
Web Role or Worker Role. Web Role is meant to host typical ASP.NET web
applications or any other CGI web applications. Worker Role is to host long running
processes that do not have any UI. Think of the Web Role as an IIS container and
the Worker Role as the Windows Services container. Web Role and Worker Role can
When you run an application, you definitely need storage to store either simple
configuration data or more complex binary data. Windows Azure Storage comes in
three flavors: 1) Blobs, 2) Tables and 3) Queues.
Blobs can store large binary objects like media files, documents and even serialized
objects. Tables offer flexible name/value based storage. Finally, Queues are used
to deliver messages reliably between applications. Queues are the best mechanism
to communicate between a Web Role and a Worker Role. The data stored in Azure
Storage can be accessed through HTTP and REST calls.
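The Web Role to Worker Role hand-off through a queue can be sketched with Python's standard library. This is only an in-process stand-in: queue.Queue plays the part of an Azure Queue, the two functions play the parts of the roles, and the order numbers are invented. It does not use any Azure API.

```python
import queue
import threading

work_queue = queue.Queue()  # stand-in for a Cloud queue
results = []


def web_role(order_ids):
    """Front end: accept requests quickly and enqueue the slow work."""
    for order_id in order_ids:
        work_queue.put(order_id)
    work_queue.put(None)  # sentinel telling the worker to stop


def worker_role():
    """Background process: pull messages off the queue and process them."""
    while True:
        order_id = work_queue.get()
        if order_id is None:
            break
        results.append(f"processed order {order_id}")


worker = threading.Thread(target=worker_role)
worker.start()
web_role([101, 102, 103])
worker.join()
print(results)
```

The front end returns as soon as the messages are queued, while the long-running work happens elsewhere; that decoupling is the reason a queue sits between the two roles.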
So, we just discussed that Windows Azure offers a Compute and a Storage service.
The Compute service is consumed by deploying a web application in a Web Role and
a long-running process in a Worker Role. Storage can be consumed through Blobs,
Tables and Queues.
AppFabric
Windows Azure platform AppFabric was earlier called .NET Services. This
service enables seamless integration of services that run within an organization
Service Bus provides secure connectivity between on-premise and Cloud services.
It can be used to register, discover and consume services irrespective of their
location. Services hosted behind firewalls and NAT can be registered with the
Service Bus and these services can be then invoked by the Cloud Services. The
Service Bus abstracts the physical location of the Service by providing a URI that
can be invoked by any potential consumer.
SQL Azure
Google App Engine currently supports Python and Java environments. Java
developers will be able to deploy and run JSPs and Servlets while Python
developers can use the standard library. The applications running on GAE live in a
sandbox that provides multi-tenancy and isolation across applications, so not all
operations are possible. For example, opening and listening on sockets is disabled.
Java Runtime – GAE is based on a Java 6 VM and a Servlet 2.5 container. The datastore
can be accessed through the JDO/JPA API. It supports JSR 107 for the MemCache API.
Mail can be accessed through the javax.mail API. java.net.URLConnection provides
access to the URLFetch service. Apart from the core Java language, other
languages that run on the Java VM, like JRuby and Scala, can also be used.
Python Runtime – GAE comes with a rich set of APIs and tools for developing web
applications based on Python. It supports Python 2.5.2, and Python 3 is being
considered for future releases. You can also take advantage of a wide variety
of mature libraries and frameworks for Python web application development, such
as Django. The Python environment provides rich Python APIs for the datastore,
Google Accounts, URL fetch, and email services. App Engine also provides a simple
Python web application framework called webapp to make it easy to start building
applications.
Datastore – App Engine comes with a very powerful data storage that can scale
dynamically. It also features a query engine and support for transactions. The
User Authentication – One of the advantages of using GAE is its integration with
Google Accounts. This empowers the developers to leverage Google's secure
authentication engine for their custom applications. While a user is signed in to the
application, the app can access the user's email address, as well as a unique user
ID. The app can also detect whether the current user is an administrator, making it
easy to implement admin-only areas of the app.
URL Fetch – This service will fetch external web pages using the high bandwidth
that many other Google applications use.
Mail – This will enable developers to programmatically send email messages from
custom web applications.
Scheduled Tasks – Scheduled Tasks are also called “cron jobs”. Other than running
interactive web applications, GAE can also schedule tasks that can be invoked at a
specific time.
To get started on Google App Engine, download the Eclipse plug-in and the SDK.
The SDK emulates the GAE environment locally and enables you to design, develop
and test applications on your machine before finally deploying on GAE.