SYNOPSIS ON Going Back and Forth: Efficient Multi deployment and Multi snapshotting on Clouds

SUBMITTED BY
SAMEER BANSOD, NIKHIL RATHOD, MUKESH BURADKAR, KAMLESH ADHAU

UNDER THE GUIDANCE OF MR. SAGAR BADHIYE

DEPARTMENT OF COMPUTER TECHNOLOGY
YESHWANTRAO CHAVAN COLLEGE OF ENGINEERING
HINGNA ROAD, WANADONGRI, NAGPUR-441110
YEAR 2013-2014

Table of Contents
1. Problem Definition
2. Abstract
3. Aim & Objective
4. Literature Review
4.1 Existing System
4.2 Disadvantages
4.3 Cloud Infrastructure
4.4 Application State
5. Scope
5.1 Advantage
6. High Level Design
6.1 Data flow diagram
6.2 System Architecture
7. Plan of Action
8. Software & Hardware Requirements
References

1. Problem Definition
To provide the basic functionality for the use of virtual machines (VMs) over the cloud, given as:
• Multideployment: A common pattern in the operation of Infrastructure as a Service (IaaS) is the need to deploy a large number of VMs on many nodes of a datacentre at the same time, starting from a set of VM images previously stored in a persistent fashion.
• Multisnapshotting: Many VM images that were locally modified need to be concurrently transferred to stable storage with the purpose of capturing the VM's state for later use (e.g. for checkpointing or off-line migration to another cluster or cloud).

2. Abstract
Infrastructure as a Service (IaaS) cloud computing has transformed the way we think of acquiring resources by introducing a simple change: allowing users to lease computational resources from the cloud provider's datacenter for a short time by deploying virtual machines (VMs) on these resources. This new model raises new challenges in the design and development of IaaS middleware. One of those challenges is the need to deploy a large number (hundreds or even thousands) of VM instances simultaneously. Once the VM instances are deployed, another challenge is to simultaneously take a snapshot of many images and transfer them to persistent storage to support management tasks, such as suspend-resume and migration. With datacenters growing rapidly and configurations becoming heterogeneous, it is important to enable efficient concurrent deployment and snapshotting that are at the same time hypervisor independent and ensure maximum compatibility with different configurations. This paper addresses these challenges by proposing a virtual file system specifically optimized for virtual machine image storage. It is based on a lazy transfer scheme coupled with object versioning that handles snapshotting transparently in a hypervisor-independent fashion, ensuring high portability for different configurations.

3. Aim and Objectives
In this project, we are creating a cloud infrastructure that allows users to lease computational resources from a cloud provider. Our aim is to create and implement load-balancing mechanisms for managing a number of users at the same time and within the context of time slices. One resource can be made available to all the users; by anticipating their requirements, we serve them without conflict. We want to reduce the contention on the current system and allow the maximum number of users to access VMs with quick resume, restart and suspend operations. The project focuses on the use of reusable frameworks to provide cost and time benefits, and on solutions that serve customer requirements today and anticipate future needs.

4. Literature Survey
In recent years, Infrastructure as a Service (IaaS) cloud computing has emerged as a viable alternative to the acquisition and management of physical resources. With IaaS, users can lease storage and computation time from large datacentres. Leasing of computation time is accomplished by allowing users to deploy virtual machines (VMs) on the datacentre's resources [1]. Since the user has complete control over the configuration of the VMs using on-demand deployments, IaaS leasing is equivalent to purchasing dedicated hardware but without the long-term commitment and cost. The on-demand nature of IaaS is critical to making such leases attractive, since it enables users to expand or shrink their resources according to their computational needs, by using external resources to complement their local resource base.

This emerging model leads to new challenges relating to the design and development of IaaS systems. One of the commonly occurring patterns in the operation of IaaS is the need to deploy a large number of VMs on many nodes of a datacentre at the same time, starting from a set of VM images previously stored in a persistent fashion. For example, this pattern occurs when the user wants to deploy a virtual cluster that executes a distributed application or a set of environments to support a workflow. We refer to this pattern as multideployment.

Such a large deployment of many VMs at once can take a long time. This problem is particularly acute for VM images used in scientific computing, where image sizes are large (from a few gigabytes up to more than 10 GB). A typical deployment consists of hundreds or even thousands of such images. Conventional deployment techniques broadcast the images to the nodes before starting the VM instances, a process that can take tens of minutes to hours, not counting the time to boot the operating system itself. This can make the response time of the IaaS installation much longer than acceptable and erase the on-demand benefits of cloud computing [1].

Once the VM instances are running, a similar challenge applies to snapshotting the deployment: many VM images that were locally modified need to be concurrently transferred to stable storage with the purpose of capturing the VM state for later use (e.g. for checkpointing or off-line migration to another cluster or cloud). We refer to this pattern as multisnapshotting. Conventional snapshotting techniques rely on custom VM image file formats to store only incremental differences in a new file that depends on the original VM image as the backing file. When taking frequent snapshots for a large number of VMs, such approaches generate a large number of files and interdependencies among them, which are difficult to manage and which interfere with the ease-of-use rationale behind clouds. Furthermore, with growing datacentre trends and tendencies to federate clouds, configurations are becoming more and more heterogeneous. Custom image formats are not standardized and can be used with specific hypervisors only, which limits the ability to easily migrate VMs among different hypervisors. Therefore, multisnapshotting must be handled in a transparent and portable fashion that hides the interdependencies of incremental differences and exposes standalone VM images, while keeping maximum portability among different hypervisor configurations.

In addition to incurring significant delays and raising manageability issues, these patterns may also generate high network traffic that interferes with the execution of applications on leased resources and generates high utilization costs for the user.

This paper proposes a distributed virtual file system specifically optimized for both the multideployment and multisnapshotting patterns. Since the patterns are complementary, we investigate them in conjunction. Our proposal offers a good balance between performance, storage space, and network traffic consumption, while handling snapshotting transparently and exposing standalone, raw image files (understood by most hypervisors) to the outside [6].

4.1 Existing system
The huge computational potential offered by large distributed systems is hindered by poor data sharing scalability. We addressed several major requirements related to these challenges. One such requirement is the need to efficiently cope with massive unstructured data (organized as huge sequences of bytes, BLOBs that can grow to TB) in very large-scale distributed systems while maintaining a very high data throughput for highly concurrent, fine-grain data accesses [1]. Cloud Computing is a "buzz word" around a wide variety of aspects such as deployment, load balancing, provisioning, and data and processing outsourcing. Clouds have been defined just as virtualized hardware and software plus the previous monitoring and provisioning technologies. The role of virtualization in Clouds is also emphasized by identifying it as a key component.

4.2 Disadvantages
The existing system gives lower performance and makes less efficient use of storage space. Network traffic consumption is also very high because the application state is not taken into account. Moreover, it is not possible to build a scalable, high-performance distributed data-storage service that facilitates data sharing at large scale.

4.3 Cloud infrastructure
IaaS platforms are typically built on top of clusters made out of loosely-coupled commodity hardware that minimizes per unit cost and favours low power over maximum speed. Disk storage (cheap hard-drives with capacities in the order of several hundred GB) is attached to each machine, while the machines are interconnected with standard Ethernet links. The machines are configured with proper virtualization technology, in terms of both hardware and software, such that they are able to host the VMs. In order to provide persistent storage, a dedicated repository is deployed either as a centralized or as a distributed storage service running on dedicated storage nodes. The repository is responsible for storing the VM images persistently in a reliable fashion and provides the means for users to manipulate them: upload, download, delete, and so forth. With the recent explosion in cloud computing demands, there is an acute need for scalable storage.

4.4 Application state
The state of the VM deployment is defined at each moment in time by two main components: the state of each of the VM instances and the state of the communication channels between them (opened sockets, in-transit network packets, virtual topology, etc.). Thus, in the most general case (Model 1), saving the application state implies saving both the state of all VM instances and the state of all active communication channels among them. While several methods have been established in the virtualization community to capture the state of a running VM (CPU registers, RAM, state of devices, etc.), the issue of capturing the global state of the communication channels is difficult and still an open problem. In order to avoid this issue, the general case is usually simplified such that the application state is reduced to the sum of states of the VM instances (Model 2). Any in-transit network traffic is discarded, under the assumption that a fault-tolerant networking protocol is used that is able to restore communication channels and resend lost information [5]. Even so, for VM instances that need large amounts of memory, the necessary storage space can explode to huge sizes. For example, saving 2 GB of RAM for 1,000 VMs consumes 2 TB of space, which is unacceptable for a single one-point-in-time deployment checkpoint. Therefore, Model 2 can further be simplified such that the VM state is represented only by the virtual disk attached to it (Model 3), which is used to store only minimal information about the state, such as configuration files that describe the environment and temporary files that were generated by the application [5]. This information is then later used to reboot and reinitialize the software stack running inside the VM instance. Such an approach has two important practical benefits: (i) huge reductions in the size of the state, since the contents of RAM, CPU registers, and the like do not need to be saved; and (ii) portability, since the VM can be restored on another host without having to worry about restoring the state of hardware devices that are not supported or are incompatible between different hypervisors.
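The 2 TB figure quoted above for saving RAM state is a simple multiplication, and the same arithmetic suggests why saving only the virtual disk can be far cheaper. The short sketch below reproduces the figure from the text; the 50 MB of modified disk data per VM used for the disk-only case is an assumed, illustrative value and does not come from the source.

```python
def checkpoint_size_mb(num_vms: int, state_per_vm_mb: int) -> int:
    """Space needed for one point-in-time deployment checkpoint, in MB."""
    return num_vms * state_per_vm_mb


MB_PER_GB = 1000  # decimal units, matching the 2 GB x 1,000 VMs = 2 TB figure
MB_PER_TB = 1000 * MB_PER_GB

# RAM saved for every VM (figures taken from the text).
model2_mb = checkpoint_size_mb(num_vms=1000, state_per_vm_mb=2 * MB_PER_GB)
print(model2_mb / MB_PER_TB, "TB")

# Only virtual disk changes saved.
# ASSUMPTION: 50 MB of modified disk data per VM is a made-up example value.
model3_mb = checkpoint_size_mb(num_vms=1000, state_per_vm_mb=50)
print(model3_mb / MB_PER_GB, "GB")
```

Under this assumed per-VM value, the disk-only checkpoint is 40 times smaller than the RAM checkpoint; the real ratio depends on how much of each disk the application actually modifies.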

5. Scope
A distributed virtual file system specifically optimized for both the multideployment and multisnapshotting patterns. Since the patterns are complementary, we investigate them in conjunction. Our proposal offers a good balance between performance, storage space, and network traffic consumption, while handling snapshotting transparently and exposing standalone, raw image files (understood by most hypervisors) to the outside [4]. We introduce a series of design principles that optimize the multideployment and multisnapshotting patterns and describe how our design can be integrated with IaaS infrastructures. We show how to realize these design principles by building a virtual file system that leverages versioning-based distributed storage services. To illustrate this point, we describe an implementation on top of BlobSeer, a versioning storage service specifically designed for high throughput under concurrency.

5.1 ADVANTAGE
A good balance between performance, storage space, and network traffic consumption, while handling snapshotting transparently and exposing standalone, raw image files.
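One way to picture how a versioning store can save space while still exposing standalone raw images is sketched below. Each snapshot records only the chunks that changed, yet any version can be read back as one complete byte string. The class `VersionedImage` and its methods are hypothetical names chosen for illustration; this is a simplified in-memory model and does not reflect how BlobSeer organizes its data.

```python
class VersionedImage:
    """Stores each snapshot as changed chunks only, but reads back full raw images."""

    def __init__(self, base_chunks):
        # Version 0 holds every chunk; later versions hold only modified chunks.
        self._deltas = [dict(enumerate(base_chunks))]

    def snapshot(self, changed):
        """Record a new version from a {chunk_index: bytes} mapping; return its number."""
        self._deltas.append(dict(changed))
        return len(self._deltas) - 1

    def stored_chunks(self):
        """Number of chunks physically kept across all versions."""
        return sum(len(delta) for delta in self._deltas)

    def raw_image(self, version):
        """Assemble the complete, standalone image for the given version."""
        out = []
        for index in range(len(self._deltas[0])):
            # Walk back from the requested version to the newest delta holding this chunk.
            for delta in reversed(self._deltas[: version + 1]):
                if index in delta:
                    out.append(delta[index])
                    break
        return b"".join(out)


image = VersionedImage([b"AAAA", b"BBBB", b"CCCC"])
v1 = image.snapshot({1: b"bbbb"})
v2 = image.snapshot({2: b"cccc"})
print(image.raw_image(0))
print(image.raw_image(v1))
print(image.raw_image(v2))
print(image.stored_chunks())
```

Three full copies of this image would need nine chunks, while the sketch keeps five, and the caller never sees the dependency between versions.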

6. High Level Design
FIG 1. SYSTEM FLOW DIAGRAM
FIG 2. SYSTEM ARCHITECTURE
[Labels recovered from the two figures: User system, REGISTER, GETTING AUTHORIZATION TO STORE RESOURCES, RESOURCES (RESOURCE 1, RESOURCE 2), DATACENTER, CONTROL API, HYPERVISOR, Request, Requesting files, VM, VM, LOCAL DISK, CENTRALIZED DATA STORAGE]

7. Plan of Action
The simplified architecture of a cloud that integrates our approach is depicted in Figure 1. The typical elements found in the cloud are illustrated with a light background, while the elements that are part of our proposal are highlighted by a darker background. A distributed versioning storage service that supports cloning and shadowing is deployed on the compute nodes and consolidates parts of their local disks into a common storage pool. The cloud client has direct access to the storage service and is allowed to upload and download images from it. Every uploaded image is automatically striped. Furthermore, the cloud client interacts with the cloud middleware through a control API that enables a variety of management tasks, including deploying an image on a set of compute nodes, dynamically adding or removing compute nodes from that set, and snapshotting individual VM instances or the whole set. The cloud middleware [2] in turn coordinates the compute nodes to achieve the aforementioned management tasks.

Each compute node runs a hypervisor [1] that is responsible for running the VMs. The reads and writes of the hypervisor are trapped by the mirroring module [1], which is responsible for on-demand mirroring and snapshotting and relies on both the local disk and the distributed versioning storage service to do so. The cloud middleware interacts directly with both the hypervisor, telling it when to start and stop VMs, and the mirroring module, telling it what image to mirror from the repository, when to create a new image clone (CLONE) [2], and when to persistently store its local modifications (COMMIT) [2]. Both CLONE and COMMIT are control primitives that result in the generation of a new, fully independent VM image that is globally accessible through the storage service and can be deployed on other compute nodes or manipulated by the client.

A global snapshot of the whole application, which involves taking a snapshot of all VM instances in parallel, is performed in the following fashion. The first time the snapshot is taken, CLONE is broadcast to all mirroring modules, followed by COMMIT. Once a clone is created for each VM instance, subsequent global snapshots are performed by issuing each mirroring module a COMMIT to its corresponding clone. CLONE and COMMIT can also be exposed by the cloud middleware at the user level through the control API for fine-grained control over snapshotting.

This approach enables snapshotting to be leveraged in interesting ways. For example, let's assume a scenario where a complex, distributed application needs to be debugged. Running the application repeatedly and waiting for it to reach the point where the bug happens might be prohibitively expensive. However, CLONE and COMMIT can be used to capture the state of the application right before the bug happens. Since all image snapshots are independent entities, they can be either collectively or independently analyzed and modified in an attempt to fix the bug. Once this fix is made, the application can safely resume from the point where it left. If the attempt was not successful, the approach can continue iteratively until a fix is found. Such an approach is highly useful in practice at large scale because complex synchronization bugs [3] tend to appear only in large deployments and are usually not triggered during the test phase, which is usually performed at smaller scale.
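The interplay between on-demand mirroring, CLONE and COMMIT can be sketched with a toy in-memory model. The sketch assumes that images are lists of fixed-size chunks and that the storage service is a single Python object. The names `VersioningStore`, `MirroringModule` and `global_snapshot` are hypothetical and are not the project's actual API. A real deployment would use a distributed service with shadowing instead of copying chunk maps.

```python
class VersioningStore:
    """In-memory stand-in for the distributed versioning storage service."""

    def __init__(self):
        self._images = {}  # image_id -> list of versions, each {chunk_index: bytes}
        self._next_id = 0

    def _new_id(self):
        self._next_id += 1
        return self._next_id - 1

    def upload(self, chunks):
        image_id = self._new_id()
        self._images[image_id] = [dict(enumerate(chunks))]
        return image_id

    def clone(self, image_id):
        """CLONE: create a new, independent image from the latest version."""
        new_id = self._new_id()
        self._images[new_id] = [dict(self._images[image_id][-1])]
        return new_id

    def commit(self, image_id, modified):
        """COMMIT: store modified chunks as a new version; return its number."""
        latest = dict(self._images[image_id][-1])
        latest.update(modified)
        self._images[image_id].append(latest)
        return len(self._images[image_id]) - 1

    def read_chunk(self, image_id, index, version=-1):
        return self._images[image_id][version][index]


class MirroringModule:
    """Traps hypervisor reads and writes for one VM instance."""

    def __init__(self, store, image_id):
        self.store = store
        self.image_id = image_id
        self.local = {}     # chunks already present on the local disk
        self.dirty = set()  # chunks modified locally since the last COMMIT

    def read(self, index):
        if index not in self.local:  # lazy transfer: fetch only on first access
            self.local[index] = self.store.read_chunk(self.image_id, index)
        return self.local[index]

    def write(self, index, data):
        self.local[index] = data     # writes stay local until COMMIT
        self.dirty.add(index)

    def clone(self):
        self.image_id = self.store.clone(self.image_id)

    def commit(self):
        modified = {i: self.local[i] for i in self.dirty}
        self.dirty.clear()
        return self.store.commit(self.image_id, modified)


def global_snapshot(modules, first_time):
    """CLONE then COMMIT on the first snapshot, COMMIT only afterwards."""
    if first_time:
        for module in modules:
            module.clone()
    return [(module.image_id, module.commit()) for module in modules]


store = VersioningStore()
base = store.upload([b"boot", b"data"])
vms = [MirroringModule(store, base) for _ in range(3)]
vms[0].write(1, b"vm0-data")
snap1 = global_snapshot(vms, first_time=True)
vms[0].write(1, b"vm0-data-v2")
snap2 = global_snapshot(vms, first_time=False)
```

Here the base image is never modified, every VM ends up with its own clone, and each entry in `snap1` and `snap2` identifies an image version that can be read back on its own, which is what makes it possible to return to an earlier state when debugging.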

8. Software And Hardware Requirements

8.1 Hardware requirement :
CPU type : Dual Core
Clock speed : 2.65 GHz
Ram size : 2 GB
Hard disk capacity : 40 GB

8.2 Software requirement :
Operating System : Windows
Language : JAVA (JDK-1.6)
Back End : MS-Access
Documentation : Ms-Office

References :
[1] Going Back and Forth: Efficient Multi deployment and Multi snapshotting on Clouds: www.nimbusproject.org/files/nicolae_hpdc2011.pdf, http://hal.inria.fr/docs/00/57/06/82/PDF/final-paper.pdf
[2] Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar (2012). Foundations of Machine Learning. The MIT Press.
[3] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia. A view of cloud computing. Commun. ACM, 53:50.
[4] B. Claudel, G. Huard, and O. Richard. Taktuk, adaptive deployment of remote executions. In HPDC'09: Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing, pages 91.
[5] K. Keahey and T. Freeman. Science clouds: Early experiences in cloud computing for scientific applications. In CCA'08: Proceedings of the 1st Conference on Cloud Computing and Its Applications, 2008.
[6] Hypervisor Alternative: http://siliconangle.com/blog/2013/09/19/red-hat-teams-up-with-dotcloud-topromote-open-hyperviper-alternative/

