Cloud Computing: A Bird's Eye View

J. Lakshmi and Sathish S. Vadhiyar
Supercomputer Education and Research Centre
Indian Institute of Science, Bangalore 560 012
{jlakshmi,vss}@serc.iisc.ernet.in

1 Introduction

Cloud computing refers to the latest computing technology that enables utility-based computing, i.e. paying by use rather than owning computing resources. The utility can be hardware, system software or application software that can be accessed from anywhere and used at any time. Typically, the interface used for accessing the utility is web based. Cloud computing is the result of the evolution and convergence of several independent computing trends such as utility computing, virtualization, distributed and grid computing, elasticity, Web 2.0, service-oriented architectures, content outsourcing and internet delivery. Thus, the cloud can be viewed as an extension of the Internet, wherein opportunities for using large-scale distributed computing infrastructure are being explored for tangible solutions to applications relevant to society and its businesses.

The definition given by the National Institute of Standards and Technology (NIST) captures the most comprehensive vision of the cloud computing model: Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (for example, networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction (Pallis, 2010).

Thus, cloud computing is a computing paradigm that abstracts many of the computational, data and software functionalities needed by a community into a virtual, remote and distributed environment. The term cloud refers to both the resources and the associated services that provide effective utilization of, and remote access to, the resources.

2 Cloud Computing: What is It?
One of the core concepts that makes cloud computing an attractive paradigm is virtualization. By virtualizing the entire hardware, software, and network stack, cloud services present the user with a virtual environment of almost limitless capabilities, providing the flexibility to use resources of much larger magnitude than what is locally available. The cloud model promotes availability and is composed of five essential characteristics:

1. On-demand self-service: A cloud user can locate and launch a cloud service without any third-party help.
2. Broad network access: The service can be accessed from anywhere and from any access device such as laptops, mobiles, etc.
Cloud Computing-TR01 SERC, IISc., Bangalore. 2 | 9 17 Sep. 11

3. Resource pooling: The same resource can potentially be used by many different users, including simultaneously.
4. Rapid elasticity: As the demand for the service increases, so does the availability of resources to support the demand. Similarly, as service demand decreases, unused resources are released.
5. Measured service: A service is measured and charged by its usage, as against the current models where the cost of ownership is associated with its use.

A cloud can be designed to deliver three service models, namely:

1. Infrastructure as a Service (IaaS) cloud: An infrastructure cloud consists of hardware resources, aggregated using special infrastructure middleware and projected as a compute service. The user, in this model, can demand, acquire and use resources in the form of CPU cycles or storage space. Amazon Web Services is an example of an infrastructure cloud service. In this model the cloud user gets the hardware resources as a service, over which the user must deploy the system and application software required. The bottom-most layer in Figure 1 depicts this service mode.
2. Platform as a Service (PaaS) cloud: While the infrastructure cloud provides the hardware resources as a service, the platform cloud extends this model by superimposing a runtime system software layer over the hardware that can be used to deploy user applications. The hardware along with the application runtime environment forms the service in this model. Google App Engine and Microsoft Windows Azure are examples of platform cloud services. The Platform-as-a-Service layer in Figure 1 represents this mode of service.
3. Software as a Service (SaaS) cloud: A complete user application, offered as a service, forms the software cloud. Google Docs, SalesForce and Zoho are some examples of this cloud service model.
The Application-as-a-Service layer in Figure 1 represents this mode of service. Above this layer, other abstractions are possible, as represented by the Business Process-as-a-Service layer in Figure 1. The cloud architecture is captured, in its all-encompassing form, in Figure 1:
Figure 1: Different conceptual layers of the Cloud Services Model (Breiter, 2010).

Further, clouds can be deployed as:

1. Private cloud: Ownership and access are restricted to the owner or organisation.
2. Community cloud: Collective ownership and access by the members forming a community based on common interest and use.
3. Public cloud: Built for commercial use and available to the general public, typically on a subscription basis and through publicised channels like the Internet.
4. Hybrid cloud: A mix of any of the three deployment models described above.

Cloud computing also places high emphasis on seamless access through easy-to-use interfaces and on-demand provisioning of resources, aspects that are important for easy adoption of clouds and for effective resource and cost management. Typical cloud middleware components also provide services related to resource discovery, management, mapping, monitoring, replication, accounting, virtualization, problem solving environments, reliability and security.

While a cloud is yet another large-scale distributed systems setup, it is quite different from traditional distributed systems from the perspective of resource access, ownership and usage. Clouds promote self-service with an on-demand usage model. Thus, the user has the freedom to choose the required services and pay only for their usage. This is different from current practices, wherein large data-centres need to be owned in order to be used. The pay-by-use pattern has scope for significant reduction in the total cost of ownership (TCO) for any organisation that is intending to
use the cloud. At the same time, clouds promote better commercial opportunities for the providers by allowing optimized usage of resources through sharing by different users.

3 Usage of Cloud Computing

The key motivators for the cloud computing model are features like availability (anywhere and anytime), elasticity (increase or decrease of service capacity), pay-as-you-go (utility), and reduction in the cost of ownership of compute resources. Cloud computing is highly useful in many scenarios in scientific, administrative (governance), and commercial applications.

Cloud computing infrastructure at the national level can address problems of diverse nature. These problems can be related to e-governance applications including archiving documents, sharing information about national policies, rules and rights, propagating education material, managing health records, processing agricultural information, land documents, urban planning, traffic control and coordination etc. Scientific applications including nanoscience, bioinformatics, climate and weather modelling, molecular simulations, earthquake modelling, homeland security, surveillance, reconnaissance, remote sensing, and signal and image processing can also be addressed effectively using cloud computing.

The storage or data cloud will act as a repository of data belonging to different domains, and will service data requests from the users and from computational resources in the computational cloud. E-governance applications like maintaining health records, UID information, bank and property documents, and voting records of about one billion people can generate very large data volumes of many exabytes. Utility applications like maintaining digital libraries of books and journals, and archives related to different information, can lead to a data explosion. Further, close-knit communities that share vital information of mutual interest through clouds can be formed.
Some interesting areas, in which cloud usage is emerging, worldwide, are depicted in Figure 2 and Figure 3. Figure 2: Emerging Customer patterns for cloud usage (Breiter, 2010).
Figure 3: Some examples of cloud applications in the developing world (Kshetri, 2010).
4 Cloud Computing Solutions and Infrastructures

Numerous commercial solutions and open-source infrastructures exist for enabling cloud computing.

4.1 Commercial Cloud Solutions

Most of the solutions handle core cloud computing tasks including resource discovery, virtualization, problem solving environments, monitoring, and web services. However, the solutions differ in their thrust areas and the associated techniques.

Amazon EC2 (Elastic Compute Cloud) is the most popular, robust, and standard cloud computing solution. It provides a web service through which a user can boot a customized operating system image, called an Amazon Machine Image, to create a virtual machine in the cloud. A user can create, launch and terminate virtual machine instances using simple interfaces. Amazon EC2 supports virtual machine instances of different kinds. Each standard virtual machine instance has a definite computational and storage capacity and an associated pay-per-use price model. For example, the large virtual instance provides 7.5 GB of memory, 4 EC2 compute units and 160 GB of local instance storage with a price model of $0.34 per hour and additional charges for data transfer. Amazon EC2 also enables high performance computing by supporting special instances called cluster compute and cluster GPU instances. The EC2 cloud also provides control over the geographical locations of instances, thereby enabling latency optimization. EC2 also provides replication and reliability by placing instances in multiple locations or availability zones.
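The pay-per-use price model described above can be made concrete with a little arithmetic. The sketch below uses the $0.34 hourly rate quoted for the large instance; the function name, the usage figures and the data-transfer rate parameter are hypothetical, introduced only for illustration, and do not represent an actual Amazon tariff.

```python
from decimal import Decimal

# Hourly rate (USD) quoted in the text for the large instance.
LARGE_INSTANCE_RATE = Decimal("0.34")


def usage_cost(hours, instances=1, rate=LARGE_INSTANCE_RATE,
               transfer_gb=0, transfer_rate=Decimal("0")):
    """Return the pay-per-use charge in USD.

    transfer_rate (USD per GB) is a hypothetical parameter: the text says
    data transfer is charged additionally but does not state the rate.
    """
    compute = Decimal(hours) * instances * rate
    transfer = Decimal(transfer_gb) * transfer_rate
    return compute + transfer


# Ten large instances used for an 8-hour job, ignoring data transfer.
job_cost = usage_cost(hours=8, instances=10)
print(job_cost)  # 27.20

# One large instance kept running for a 30-day month (720 hours).
month_cost = usage_cost(hours=720)
print(month_cost)  # 244.80
```

Decimal arithmetic is used so that currency amounts are not affected by binary floating-point rounding. The same function shows why short, bursty workloads suit the model: the charge stops as soon as the instances are terminated.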
Eucalyptus is an open-source cloud computing solution that provides high-level abstractions over different cloud service mechanisms provided by various vendors. It predominantly uses the Amazon EC2 services for file systems and other utilities. Eucalyptus provides a hierarchical cloud computing architecture consisting of cluster, node and storage controllers. Similar to EC2, Eucalyptus also uses the Xen hypervisor for supporting virtualization. Besides processor virtualization, Eucalyptus also provides network and data storage virtualization. Eucalyptus has demonstrated its solutions for large-scale numerical and data mining applications.

The Microsoft Azure cloud computing solution provides most of the services of Amazon EC2 for remote access to Microsoft clusters and software. The cost model followed is based on the amount of storage, the number of transactions, data transfer to locations etc. The Azure cloud also supports high performance computing, whereby a user can remotely execute parallel applications on the cloud. The Azure cloud solutions have been demonstrated with real scientific and non-scientific applications including seismic solutions, CFD, and financial services.

Yahoo!'s Hadoop cloud computing solution is another important and widely used paradigm. Its primary purpose is to support Yahoo! web analytics, and thus it specializes in processing large data sets in parallel with a special-purpose distributed file system called HDFS. Hadoop's MapReduce framework is a popular model for data flow execution in which the output from a set of map tasks is grouped and pipelined as input to a second layer of reduce tasks. Hadoop supports simple function mechanisms that allow users to specify the functionality of the map and reduce tasks. Hadoop also supports load balancing mechanisms for placing the map and reduce tasks near the needed data, and replication for fault tolerance.
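The map, group and reduce data flow described above can be illustrated with a minimal single-process sketch in plain Python. It is not the Hadoop API: the function names are hypothetical, and the distribution of tasks across nodes, data placement and replication that Hadoop provides are omitted.

```python
from collections import defaultdict


def map_task(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Reduce: combine all values emitted for one key."""
    return key, sum(values)


def map_reduce(lines):
    """Run the map tasks, group their output by key, then run the reduce tasks."""
    intermediate = []
    for line in lines:  # in a real framework each split goes to a separate map task
        intermediate.extend(map_task(line))
    grouped = shuffle(intermediate)
    return dict(reduce_task(key, values) for key, values in grouped.items())


counts = map_reduce(["the cloud is elastic", "the cloud is shared"])
print(counts["cloud"])  # 2
```

The user supplies only `map_task` and `reduce_task`; everything else is the responsibility of the framework, which is what makes the model attractive for large data sets.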
The Hadoop framework also supports a high-level dataflow language and execution framework for parallel computing called Pig.

There are also specialized cloud solutions for high performance computing, like the Nimbus cloud, that use popular batch scheduling mechanisms like PBS or SGE to schedule virtual machines.

All these solutions except Eucalyptus target specific hardware and software, or applications or business models. None of the solutions has been demonstrated for applications belonging to diverse scientific and non-scientific domains.

4.2 Cloud Infrastructures

Many cloud computing infrastructures and testbeds have been created using the above cloud computing solutions. Following are some examples.

NASA's Nebula cloud uses the Eucalyptus cloud solution to enable NASA scientists and researchers to share large, complex data sets with external partners and the public. The primary purpose of Nebula was to save hundreds of staff hours needed for obtaining or providing data and for installing and executing the necessary software for the data. A typical Nebula cloud contains about 15,000 CPU cores and 15 petabytes of data. Another main objective is to use the cloud for effective resource usage and to minimize idling among NASA's large number of computing cores. Nebula provides services on an on-demand basis by commissioning and decommissioning computing capabilities. One good use case of Nebula is an ongoing attempt to make NASA's data accessible through Microsoft's World Wide Telescope platform.
Another important testbed is the OpenCirrus cloud testbed, a collaborative effort supported by HP, Intel and Yahoo! in which the cloud resources are located at ten Centres of Excellence including academic institutes. OpenCirrus currently supports about twenty thousand CPU cores and several petabytes of data. Carnegie Mellon University (CMU), one of the institutes in the OpenCirrus effort, also has a partnership with Yahoo! that allows CMU academic researchers to access about 4000 CPU cores and petabytes of data in Yahoo!'s M45 cluster.

Besides, there are also research efforts related to military applications in which soldiers can use mobile devices and offload computation-intensive tasks like natural language processing and image and voice recognition to clouds.

These infrastructures, however, are small-scale clouds for specific purposes.

5 Economics of Cloud Computing

By leveraging the power of remote cloud resources in seamless ways, end-users or clients can offload to the services in the cloud most of their burden related to planning, procurement, installation, learning to use, adopting best practices and many other complexities associated with software and hardware resources. This results in rapid solutions to problems, significant savings in staff hours, and large cost reductions for resources and manpower. This model also allows the scientific community to spend quality time on major scientific problems without being distracted by the computational means to solve the problems.

On the other hand, cloud providers, by catering to a large community, can adequately justify the procurement of resources and effectively utilize the resources with very little effort. The cloud providers can also employ intelligent cost models to obtain profitable payments from the clients for use of the resources.
Due to these comprehensive benefits and the business logic for all concerned entities, IT companies have become major players in the development and adoption of cloud computing, making it the default computing mechanism and, in general, promoting its wide acceptance.

6 Cloud Computing: Challenges and Opportunities

Many challenges still lie ahead for using the cloud in all its foreseen usage scenarios. Significant challenges include metering of cloud service usage; performance isolation on shared resources; security issues associated with data privacy, protection, accessibility and jurisprudence; cloud interoperability to avoid vendor lock-in and assure service reliability in case of outages; and commercial software availability and licensing on clouds based on metered usage (Armbrust, 2009). Novel cloud computing services related to seamless access mechanisms, automatic management and orchestration of data and computing, dynamic query mechanisms, algorithm building, relationship determination, workflow composition and many others need to be developed to sustain such very large-scale cloud computing. With the increase in cloud adoption, there is a substantial effort in the academic and industrial research and manufacturing sectors to fill the perceived lacunae of the cloud sphere.

7 Impact of Cloud Computing on National Missions

Cloud computing has the potential to change the way information technology is used in the coming years. The impact of cloud computing on an economy is associated with the determinants and drivers of the cloud, including both providers and users, as indicated in Figure 4.
Figure 4: Cloud related indicators in developing countries (Kshetri, 2010).

Cloud computing is highly beneficial, enabling seamless access to complex hardware, software and data environments, easy adoption of large-scale computing, on-demand servicing, flexible computational models, effective resource utilization and huge cost cuts in terms of infrastructure and manpower. Specifically, in the Indian context, cloud computing can be deployed in:

1. E-governance applications, like maintaining health records, UID information, bank and property documents, and voting records of about one billion people;
2. Geographical information systems putting together the available maps, satellite images, geospatial databases, geo-tagged tables, and crowd-sourced data, and developing a series of GIS application services for governance;
3. Very large-scale computational clouds for scientific applications such as design of transport aircraft, nanosecond simulations, military applications where the cloud can act as a command and control centre for facilitating interactions between different teams in the field, earthquake modelling, homeland security, surveillance, reconnaissance, remote sensing, and signal and image processing;
4. Utility applications like maintaining digital libraries of books and journals, and archives related to different information specifically pertaining to education, as a part of the education portal;
5. Facilitating software usage across academic and research institutions by providing software as a service. This will significantly reduce the time and cost burden on the users by avoiding the complex installation procedures associated with the software packages and by meeting the license requirements. Thus non-expert users and users with resource constraints, including undergraduate academic institutions, government agencies and small-scale start-ups, will benefit greatly and be encouraged to solve problems of large magnitude. Examples include the remote use of Matlab and Mathematica functions by scientific users.
8 Cloud Computing as a Thrust Area

The impact of cloud computing on various national missions was discussed in detail in the previous section. In the national scenario today, data centres are deployed in dedicated access mode and the privilege rests with a few high-end universities or R&D organisations. The cost of ownership in such cases is high and inhibits smaller organisations from investing in such facilities. Cloud computing addresses this challenge head-on and, if the volume of users increases, becomes a very cost-effective and viable solution. As a result, it has the potential to open up high-end computing to many smaller organisations at economically feasible pricing.

Cloud computing can provide easy-to-use abstractions and seamless access to diversified software, hardware and storage services. By providing seamless access to resources, clouds can encourage the general public towards large-scale adoption of IT as a fundamental tool for many essential daily services, and the scientific community to target large problems of national importance. By mapping user requirements to a complicated set of tasks and automatically composing workflows behind the scenes, cloud facilities can act as one-stop locations for accessing different and inter-linked services related to e-governance. Further, cloud computing initiatives in various sectors can help avoid replication of infrastructure at multiple locations, and thus help decrease IT expenditure by the government.

Thus, for India to completely realize its IT and scientific potential with economically viable solutions, cloud computing has to be treated as one of the major thrust areas, and large-scale national cloud computing facilities will have to be set up. Several segments of society can benefit from this.
Some of the obvious segments that can directly reap the benefits are listed below:

Schools, Colleges & Universities: Cloud computing can help schools, colleges and universities access the latest technologies at an affordable price.

New Innovative Business Firms: Start-ups and SMBs need not invest upfront in IT infrastructure. With cloud services, they can consume resources as their business grows. In fact, one can run a business on the cloud with an office at home.

Multimedia Content Providers: Multimedia digital content can be distributed to various consumers at a lower price. Entertainment, agriculture and meteorology are some of the areas where compute clouds can provide wider reach.

E-Governance: Many government departments have to deal with huge amounts of data, and mining this data for useful information needs sophisticated computing infrastructure. Cloud computing resolves this issue effectively by enabling access to the required infrastructure. Apart from this, secured application services on the cloud can make such data visible from anywhere. Access to information such as land records, demographic data like that held by UIDAI, health records, tax records, etc., are some of the areas where cloud computing can bring far-reaching reforms.

Cloud computing in academia has been confined to a few isolated groups. Research on cloud computing has been pursued at the Indian Institute of Science both in the Computer Aided Design Laboratory and the Grid Applications Research Laboratory. The computer services centre of IIT Delhi provides a cloud for scientific and high performance computing (HPC) usage by the faculty of the Institute. The
cloud is implemented using 192 processors and virtualizes computing, storage and network resources. It also has provision for automatically switching off nodes during lean periods and switching them on during demand, thereby maintaining high utilization. The faculty can request a specific number of dedicated virtual resources with specific storage, operating system and duration requirements. A similar cloud computing cluster facility has been set up by Yahoo! in IIT Mumbai for research on web analytics by students and faculty. Many of these cloud computing projects cater to a specific community with a limited set of objectives. It will be essential to have large-scale national clouds catering to a large society for use in diverse areas and applications.

The Centre for Development of Advanced Computing (CDAC) is also involved in cloud and grid computing research. In addition, proposals to the planning commission are currently underway, by the National GIS Interim Group, to use cloud computing in a major way in the national Geographical Information Systems (GIS). Another initiative on high performance computing, headed by Prof. N. Balakrishnan, has also submitted a proposal to the planning commission which will use a large-scale cloud for high performance computing with the infrastructure as a service (IaaS) model and for applications with the software as a service (SaaS) model. A complete list of Indian research groups is given in Appendix A.

Appendix - A

Indian Academic Organisations involved in Cloud Computing

IISc-Bangalore: Dr. J. Lakshmi/Prof. S.K. Nandy, http://www.serc.iisc.ernet.in/cadl/
IISc-Bangalore: Prof. Sathish S. Vadhiyar, http://www.serc.iisc.ernet.in/garl/
IIT-Mumbai: Prof. Umesh Bellur, http://www.cse.iitb.ac.in/~umesh/
IIT-Delhi: Dr. Sourav Bansal, http://www.cse.iitd.ernet.in/~sbansal/
IIT-Guwahati: Dr. Diganta Goswami, http://www.iitg.ernet.in/dgoswami/
IIIT-Hyderabad: Search and Information Extraction Lab, LTRC, IIIT-Hyderabad, http://search.iiit.ac.in/cloud-computing
IIT-GandhiNagar; NIT Surat: Prof. Dhiren R. Patel, http://www.iitgn.ac.in/faculty/comp/dhiren.htm
Indian Research Organisations involved in Cloud Computing

CDAC, Hyderabad and CDAC, Bangalore: http://www.cdac.in/
Indian Commercial organisations providing cloud services

Company | Service | Location | Remarks
AppPoint | AppsOnAzure - PaaS | Bangalore | Cloud based application infrastructure using Microsoft Azure as the platform.
Clogeny | Cloud Enabler | Pune | Cloud related services such as: Migration, Deployment, Planning, Consulting.
CtrlS | CtrlS Cloud - IaaS | Hyderabad | On-Demand Private Cloud. 99.995% uptime, Tier 4 datacenter.
EazeWork | EazeHR - SaaS; EazePayroll - SaaS; EazeSales - SaaS | Noida | Cloud SaaS for SMEs/SMBs.
NetMagic Solutions | Cloud 2.0; CloudNet; CloudServe; PrivateCloud | Mumbai | A front runner in the Indian IaaS space.
OrangeScape | OrangeScape Studio - PaaS | Chennai | USP: Visual PaaS. Related items: OrangeScape CEO Interview; OrangeScape Launches into US Market with Persistent Systems Partnership; 2011 TiE50 Software/Cloud Computing Winners.
Ozonetel Systems | KooKoo - PaaS; CTS - SaaS | Hyderabad | In India it definitely has a first-mover advantage in cloud telephony services (CTS).
PK4 Software | Impel CRM - SaaS | Bangalore | USP: a non-western CRM for India. Related item: PK4 CEO Interview.
Ramco | Ramco OnDemand - SaaS | Chennai | An early mover in SaaS. An ERP on the cloud.
Remindo | Remindo - SaaS | Mumbai | Your company branded official social media tool in the cloud (still in Beta, free up to 20 users).
Synage | DeskAway - SaaS | Mumbai | Cloud based project management. Related items: Synage Founder & CEO Speaks; The SaaS Edge by Sahil Parikh.
Tata Communications | InstaCompute - IaaS; InstaOffice - SaaS | Mumbai | Data Centers located at Hyderabad, Singapore. InstaOffice is powered by Google Apps.
TCS | iON - ITaaS | Mumbai | Covers the entire spectrum of business processes for SMBs. Domains: Manufacturing, Wellness, Retail, Education.
Wolf Frameworks | Wolf PaaS | Bangalore | Cloud PaaS with 99.97% SLA. Related item: Director Wolf Frameworks Speaks.
Bibliography

Amazon EC2: http://aws.amazon.com/ec2
Armbrust, M., et al. (2009). Above the Clouds: A Berkeley View of Cloud Computing. http://radlab.cs.berkeley.edu/
Breiter, G. (2010). Cloud Computing Architecture and Strategy. IBM Corporation.
Cloud Computing Use Case Discussion Group. (2010). Cloud Computing Use Cases White Paper, version 4.0.
Eucalyptus: http://www.eucalyptus.com/
Kshetri, N. (2010). Cloud computing in developing economies: Drivers, Effects and Policy Measures. PTC'10 Proceedings, (pp. 1-22).
Microsoft Azure: http://www.microsoft.com/windowsazure/
NASA Nebula Cloud: http://www.nasa.gov/open/plan/nebula.html
National Knowledge Network (NKN): http://www.nkn.in
NetSolve: http://icl.cs.utk.edu/netsolve
Nimbus cloud: http://www.nimbusproject.org/
OpenCirrus: https://opencirrus.org/
Pallis, G. (2010, Sep.-Oct.). Cloud Computing: The New Frontier of Internet Computing. IEEE Internet Computing, 70-73.
Yahoo! Hadoop: http://hadoop.apache.org/
India-based cloud computing companies: http://www.techno-pulse.com/2011/05/india-based-cloud-computing-companies.html