Distributed Systems and Cloud Computing: A Comparative View
Malcolm Bell 09274120 bellmr@iol.ie

Abstract

In the current age of ever-increasing levels of technology, internet availability and broadband speeds, there has been a significant push for organisations, and indeed individuals, to enter the cloud. Whilst the idea of the cloud is not new, the cloud in its current forms can be seen as a relatively recent phenomenon. A distributed computing system is a network of remote, heterogeneous computers, capable through middleware of sharing resources. Grid computing is a form of distributed system where typically all of the processors in the grid perform the same tasks at the same time. This paper aims to provide a comparison of these two similar, sometimes disparate and sometimes intertwined technologies.

Introduction

Distributed Systems

A distributed system can be thought of as a system of autonomous, often geographically separate, computers, connected by a network and using appropriate middleware, which enables the computers to coordinate activities and share the resources of the entire system. To an end user, the entire system appears as a single computer. Distributed systems are said to have the following characteristics [1]:

No common physical clock - This introduces the element of distribution in the system, and gives rise to the inherent asynchrony between the processors.
No shared memory - A key feature that requires message-passing for communication. It also implies the absence of a common physical clock.
Geographical separation - The further apart geographically the processors are, the more indicative this is of a distributed system.
Autonomy and heterogeneity - The processors are loosely coupled in that they have different speeds and each can be running a different operating system.
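
The "no shared memory" characteristic can be illustrated with a minimal sketch in which two endpoints cooperate purely by exchanging messages. This is only an illustration under stated assumptions: the two ends of a local socket pair stand in for two separate machines, and the JSON message layout, the 4-byte length prefix and the names send_msg, recv_msg and demo are hypothetical choices, not part of any system described in this paper.

```python
import json
import socket


def send_msg(sock, obj):
    """Serialise obj as JSON and send it with a 4-byte length prefix."""
    data = json.dumps(obj).encode("utf-8")
    sock.sendall(len(data).to_bytes(4, "big") + data)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv may return fewer than requested."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    size = int.from_bytes(recv_exact(sock, 4), "big")
    return json.loads(recv_exact(sock, size).decode("utf-8"))


def demo():
    # socketpair() stands in for a network link between two machines.
    node_a, node_b = socket.socketpair()
    try:
        # Node A asks node B to do some work; no variables are shared.
        send_msg(node_a, {"op": "add", "args": [2, 3]})
        request = recv_msg(node_b)
        send_msg(node_b, {"result": sum(request["args"])})
        return recv_msg(node_a)["result"]
    finally:
        node_a.close()
        node_b.close()


if __name__ == "__main__":
    print(demo())
```

Each side sees only the bytes that arrive on its socket, which is the essential difference from a shared-memory design.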

Another, perhaps tongue-in-cheek, definition of a distributed system states: "You know you are using one when the crash of a computer you have never heard of prevents you from doing work" [2].

The absence of shared memory between devices within a distributed system necessitates message-passing middleware to enable communication between the individual devices. The possible solutions range from low-level socket communication between the machines to higher-level interfaces such as remote procedure calls (RPC) or remote method invocation (RMI). Other approaches also exist, such as the common object request broker architecture (CORBA) and the distributed component object model (DCOM).

There are a number of transparency requirements for distributed systems to meet, briefly listed as: access, location, migration, relocation, replication, concurrency, failure, and persistence. These requirements hide from the user details such as where the resources are located and whether a resource has been moved during use. Distributed systems are characterised by heterogeneous processors or machines and an irregular network connecting the machines.

Grid Computing

Grid computing is a particular form of the distributed system concept, where the computers on the network function together as a virtual supercomputer to carry out large tasks [3]. The resources within the system are owned by different organisations, which collaborate to provide the services for the end users. Grids are usually developed and used within scientific or academic fields, although this is not always the case. Some examples of grids in the scientific fields are EGEE (Enabling Grids for E-sciencE) and UNOSAT. SETI@home is perhaps the best known public or home-use implementation of a computational grid. In this system, ordinary home computers are connected to the grid, and client software installed on them uses the computers' idle CPU cycles to carry out computations on large data sets.

Pricing structures for grid computing are many and varied. Some of the more common models are listed below [4]:

the commodity market model
the posted price model
the bargaining model
the tendering/contract-net model
the auction model
the bid-based proportional resource sharing model
the community/coalition/bartering model
the monopoly and oligopoly model
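
As a rough illustration of one entry in this list, the bid-based proportional resource sharing model can be sketched as giving each user a share of a resource in proportion to that user's bid relative to the sum of all bids. The function name proportional_shares, the user names and the figures below are hypothetical and chosen only for the example.

```python
def proportional_shares(bids, capacity):
    """Split capacity among users in proportion to their bids.

    bids maps a user name to a non-negative bid value; capacity is the
    total amount of the resource on offer (for example CPU-hours).
    """
    total = sum(bids.values())
    if total <= 0:
        raise ValueError("at least one positive bid is required")
    return {user: capacity * bid / total for user, bid in bids.items()}


if __name__ == "__main__":
    # Hypothetical bids for 200 CPU-hours of grid time.
    shares = proportional_shares({"alice": 30, "bob": 10, "carol": 60}, 200)
    for user, amount in shares.items():
        print(user, amount)
```

Under this rule a user who raises a bid gains capacity at the expense of the others, since the total on offer is fixed.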

Cloud Computing

The term cloud computing was first coined in February 2007 [5]. However, the idea of cloud computing has been around since 1961, when Professor John McCarthy first spoke of utility computing [6]. This vision was for the provision of computing services as a utility, similar to ordinary electricity or water utilities. It soon became apparent that the technologies of the day were not suitable for such a vision, and the idea became less popular. Technological advancements in network hardware, the availability of high-speed broadband connections, and improvements in communications protocols have since made this vision of utility computing possible.

Cloud computing is typically defined as the combination of a computational grid, utility computing, and clients. Utility computing is the provision of computing resources on demand, a computational grid is a large-scale, geographically distributed hardware and software infrastructure [7], and the clients are the end users.

The services provided by cloud computing can be broken down into three distinct models, with a fourth model covering everything else. These are:

IaaS    Infrastructure as a service
PaaS    Platform as a service
SaaS    Software as a service
XaaS    Anything else as a service

These are the business models, or forms of service provision, that cloud computing is based upon, and each describes the level of service that the supplier provides to cloud clients. For example, IaaS is the provision of infrastructural services to the client, which consist of hardware, such as servers, networking components and storage, that the client agrees to purchase based on a fee structure typically related to the amount of resources consumed.

The major characteristics of the cloud computing paradigm are:

Elasticity and scalability - Resource allocation can grow or shrink dependent on demand.
Self-service provisioning - Customers must be able to provision the services they require.
Standardized interfaces - Standardized interfaces let users more easily link cloud services.
Billing and service usage metering - Users pay only for the resources that they consume.

There are many cloud computing providers already established, and new suppliers enter the market regularly. Levels of service vary from free email or social networking services, such as Gmail, Facebook or Twitter, to massive data hosting and virtualised server packages running multiple cloud applications, such as Amazon EC2. Pricing structures in cloud computing are generally based on the amount of resources utilised, the amount of storage needed and the level of bandwidth required.
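
A usage-based charge of this kind can be sketched as a simple sum over the three metered quantities. The rates, field names and figures below are hypothetical and do not reflect any real provider's price list; amounts are kept in whole cents to avoid rounding issues.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices, in cents."""
    cents_per_cpu_hour: int
    cents_per_gb_stored: int
    cents_per_gb_transferred: int


@dataclass(frozen=True)
class Usage:
    """Metered consumption for one billing period."""
    cpu_hours: int
    gb_stored: int
    gb_transferred: int


def charge_in_cents(usage, rates):
    """Pay-per-use: the bill is driven entirely by metered consumption."""
    return (
        usage.cpu_hours * rates.cents_per_cpu_hour
        + usage.gb_stored * rates.cents_per_gb_stored
        + usage.gb_transferred * rates.cents_per_gb_transferred
    )


if __name__ == "__main__":
    rates = Rates(cents_per_cpu_hour=5, cents_per_gb_stored=10,
                  cents_per_gb_transferred=8)
    month = Usage(cpu_hours=100, gb_stored=50, gb_transferred=20)
    print(charge_in_cents(month, rates))
```

A client that consumes nothing in a period is charged nothing, which matches the metering characteristic listed above.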

Parallel Computing

In contrast to the distributed system model of many geographically separate computers, a parallel computing system consists of several processing units within the same physical machine, using shared memory, distributed memory, or a combination of both. Further, whereas the distributed system is heterogeneous and relies on an irregular network, the parallel system is homogeneous and uses a regular network.

Comparison

The cloud computing paradigm relies partially on a computational grid, which is itself a form of distributed system. As such, conceptually at least, the cloud can be thought of as an extension of distributed computing. However, the motivation for the cloud usually differs somewhat from that of a typical distributed system. To provide meaningful comparisons and avoid ambiguity, this section does not compare cloud computing with general distributed computing but rather, where possible, with the specific distributed computing area of grid computing, otherwise known as the computational grid.

Security

In today's IT infrastructures, security is a major concern. This includes the security of users' personal details and corporate data, and also the security of the underlying hardware systems. Numerous techniques and protocols have been developed in parallel with the distributed systems that they serve to ensure the security of the users, their data and details, and the machines, components and networks they are running on. These include, amongst others, authentication protocols, and controls and policies to limit access to specific items or areas. The most common security framework in use today for grid systems is the Globus Toolkit, developed specifically for grid systems and to allow them to operate across multiple disparate firewalls. Prior to the development of the Globus Toolkit, the majority of distributed systems were operated only within a single corporate firewall [8].

Cloud computing, as an extension of distributed computing, must offer the same security levels. In addition, there are a number of essential security concerns unique to the cloud paradigm that must be addressed. Chief amongst these is that client data must be secure from malicious use or viewing, whether by representatives of the cloud provider or by other parties such as client competitors or industrial spies. Another item of major importance is the isolation of data and applications running on a virtual machine from any other virtual machine. This is necessary because several virtual machines can be running on a single physical machine, and each virtual machine should have no knowledge of, or access to, the data of any other virtual machine.

Fault Tolerance

Integral to any distributed system is the notion that machines or components are likely to fail at any time. The concept of fault tolerance has been developed to allow for this, so that if any component were to fail, the system could continue running without major problems. Faults or failures can be overcome in several ways: through fault-tolerant algorithms, which may redirect processes from failed nodes to working nodes; through replication of tasks on more than one node, so that at least one stream of tasks will complete successfully; and through checkpointing, where regular checkpoints are made of the state of each process in the system so that, in the event of a failure, the system can be restarted from the previous consistent, error-free global state [9]. To maintain a fault-tolerant system for persistent storage, multiple redundant storage locations are typically used. Fault-tolerant mechanisms are usually built into grid computing environments, and hence also into cloud computing environments.

Load Balancing

Load balancing is the practice of distributing work or processes between the available nodes within a network. The distribution is determined by the nodes' capabilities and current workloads. Load balancing is crucial to ensuring the best possible runtime for compute-intensive or data-intensive tasks, and is thus another vital component of any grid or cloud computing environment.

Scalability

Scalability is the ability of a system to adapt instantly to current conditions or workloads, so that the appropriate levels of resources, whether processors, storage, memory or others, are available whenever a task requires them. Both the computational grid and the cloud environment strive to achieve scalability.
Within the grid environment, scalability can be achieved by adding resources when required. Whilst this is a scalable system, it cannot be deemed instantly scalable. Scalability is also a major component of the cloud; in this instance, however, the system can be said to be instantly scalable. This instant scalability is provided through the use of virtualisation. As demand for more resources is encountered, the cloud provider will launch new virtual services, such as web servers, to accommodate the demand.
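
A minimal sketch of this behaviour, assuming a hypothetical provider that sizes its pool of virtual servers directly from the measured demand, is given here. The function names, the requests-per-server capacity and the demand figures are all illustrative assumptions; real providers apply their own scaling policies.

```python
def servers_needed(demand, capacity_per_server, min_servers=1):
    """Return how many virtual servers are required to serve the demand.

    demand and capacity_per_server are in the same unit, for example
    requests per second. At least min_servers are always kept running.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = -(-demand // capacity_per_server)  # ceiling division on ints
    return max(min_servers, needed)


def scaling_plan(demand_samples, capacity_per_server):
    """Pool size chosen for each successive demand measurement."""
    return [servers_needed(d, capacity_per_server) for d in demand_samples]


if __name__ == "__main__":
    # Hypothetical demand that rises to a peak and then falls away.
    print(scaling_plan([80, 250, 900, 300, 40], capacity_per_server=100))
```

The same rule that enlarges the pool at the peak also shrinks it once the load drops.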
When the demand has eased, the number of virtual servers is reduced to match the current level of demand. It is common on both types of system to speak of infinite scalability in the eyes of the users. This is something of an abstract concept, in that there is no possibility of the system being infinitely scalable; in the eyes of the users or clients, however, it almost appears to be the case.

Processing Power

Both the grid computing and cloud computing scenarios present to the end user a vision of a virtual supercomputer, so in effect, to most users, the differences between the two architectures can be considered minimal.

Infrastructure Costs

Distributed systems typically require a larger upfront investment, as it is the users or the organisation that own the hardware, whereas in a cloud environment it is the cloud supplier that owns the majority of components. Hence a cloud user has little or no upfront cost. Similarly, the cloud service supplier is responsible for maintaining and updating the hardware, which means that the users have lower ongoing costs relative to non-cloud users.

Maintenance and Upkeep

For users of the cloud, maintenance and updating have been taken out of their responsibility and placed in the hands of the cloud supplier. This leads to less time and expenditure by the user in keeping the applications and systems they use up to date. It also means that, should a component of the cloud fail, it is the cloud supplier's responsibility to have it back in service or replaced as soon as possible.

Ownership

Within a typical collaborative computational grid system, various organisations, ranging from research institutes to state bodies to financial services organisations, pool resources to provide a virtual supercomputer that carries out large distributed computing tasks. In contrast, in the cloud environment, though the aims can sometimes be similar, the cloud providers generally own all of the system and provide their own massively distributed grid for the clients to run their tasks on. It could therefore be said that collaborative grids are mainly used and provided by nonprofit entities, whereas cloud computing providers are almost exclusively commercial entities. As always, however, there are obvious exceptions to this.

Conclusions

It can be seen that cloud computing and grid computing, or distributed computing, in fact have many similarities. This is because the computational grid lies at the foundation of cloud computing. One major area where the two systems differ is in their respective pricing structures. Another is the availability of access to the services. In the cloud it is a simple matter for any individual or company to access the services, and for some small applications the cost can be zero. In the grid computing model, however, there is no such simple browser access and no zero cost. It would also be very difficult for an individual to gain access to some of the grids that are available, as they are exclusively for the use of research organisations or the like.

References

[1] Kshemkalyani, A.D., Singhal, M.: Distributed Computing: Principles, Algorithms, and Systems, Cambridge University Press, 2008.
[2] Lamport, L.: Distribution, email, May 28, 1987, available at: http://research.microsoft.com/users/lamport/pubs/distributed_systems.txt
[3] Rittinghouse, J.W., Ransome, J.F.: Cloud Computing: Implementation, Management, and Security, CRC Press, 2010.
[4] Buyya, R., Abramson, D., Giddy, J., Stockinger, H.: Economic models for resource management and scheduling in Grid computing, Concurrency and Computation: Practice and Experience, Vol. 14, No. 13-15, pp. 1507-1542, 2002.
[5] http://cloudcomputingexpo.com Retrieved 7 March 2011.
[6] Dupre, F. (2008): Utility (Cloud) Computing...Flashback to 1961 Prof. John McCarthy, http://computinginthecloud.wordpress.com/2008/09/25/utility-cloudcomputingflashback-to-1961-prof-john-mccarthy/
[7] Bote-Lorenzo, M.L., Dimitriadis, Y.A., Gomez-Sanchez, E.: Grid Characteristics and Uses: a Grid Definition, Postproc. of the First European Across Grids Conference (ACG03), Springer-Verlag LNCS 2970, pp. 291-298, Santiago de Compostela, Spain, Feb. 2004.
[8] Erlanger, L.: Distributed Computing: An Introduction, http://www.extremetech.com/article2/0,1697,11769,00.asp
[9] Cao, G., Singhal, M.: On Coordinated Checkpointing in Distributed Systems, IEEE Transactions on Parallel and Distributed Systems, Vol. 9, No. 12, December 1998.
