Divya Kapil, Emmanuel S. Pilli and Ramesh C. Joshi
Department of Computer Science and Engineering, Graphic Era University, Dehradun, India
(divya.k.rksh, emmshub, chancellor.geu)@gmail.com
Abstract—Cloud computing is an emerging technology in the world of information technology and is built on the key concept of virtualization. Virtualization separates hardware from software and offers the benefits of server consolidation and live migration. Live migration is a useful tool for migrating OS instances across distant physical hosts of data centers and clusters. It facilitates load balancing, fault management, low-level system maintenance and reduction in energy consumption. In this paper, we survey the major issues of virtual machine live migration. We discuss how the key performance metrics, e.g., downtime, total migration time and transferred data, are affected when a live virtual machine is migrated over a WAN, under heavy workload, or when several VMs are migrated together. We classify the techniques and compare the various techniques within each class.

Keywords—Cloud computing; Virtualization; Virtual machine; Live migration; Pre-copy; Post-copy

I. INTRODUCTION

The computational world has become very large and complex. Cloud computing is the latest evolution of computing, where IT capabilities are offered as services. Cloud computing delivers services such as software or applications (SaaS - Software as a Service), infrastructure (IaaS - Infrastructure as a Service), and platform (PaaS - Platform as a Service). Computing is made available to users in a pay-as-you-use manner. Some common examples are Google's App Engine [1], Amazon's EC2 [2], Microsoft Azure [3], and IBM SmartCloud [4]. Cloud-based services are on demand, scalable, device independent and reliable. Many different businesses and organizations have adopted the concept of cloud computing. Cloud computing enables consumers and businesses to use applications without installation and to access their files from any computer with an Internet connection.
A standard definition of cloud computing is "a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" [5]. The cloud model is composed of five essential characteristics: on-demand self-service, broad network access, resource pooling, rapid elasticity and measured service. On-demand self-service ensures that a consumer can unilaterally provision computing capabilities automatically, without requiring human interaction with each service provider. Broad network access gives access to capabilities available over the network through standard mechanisms. Resource pooling pools computing resources to serve multiple consumers. Rapid elasticity is used to elastically provision and release capabilities or resources. Measured service controls and optimizes resource use by leveraging a metering capability. The key concept of cloud computing is virtualization. Virtualization technology has become popular and valuable for the cloud computing environment. Virtualization was first implemented on IBM mainframes in the 1960s. Virtualization is the abstraction of the physical resources needed to complete a request and of the underlying hardware used to provide service. It splits a physical machine into several virtual machines. A virtual machine (VM) is a software implementation of a computing environment in which an operating system or program can be installed and run [6].
Fig. 1. Virtualization
VMware ESX / ESXi [7], Virtual PC [8], Xen [9], Microsoft Hyper-V [10], KVM [11] and VirtualBox [12] are popular virtualization software. Virtualization can run multiple operating systems concurrently, as shown in Fig. 1. A single host can run many smaller virtual machines, each containing an isolated operating system instance. Virtualization technologies have a host program called the Virtual Machine Monitor or hypervisor, which is a logical layer between the underlying hardware and computational processes, and which runs on top of a given host. In cloud computing, storage, applications, servers and network devices can be virtualized. Virtualization provides many benefits, such as resource utilization, portability, application isolation, system reliability, higher performance, improved manageability and fault tolerance.

978-1-4673-4529-3/12/$31.00 © 2012 IEEE

The reasons for VM migration are: load balancing, accomplished by migrating VMs out of overloaded / overheated servers, and server consolidation, where servers can be selectively brought down for maintenance after migrating their workload to other servers [13]. In this paper we survey techniques for VM live migration and their performance. We discuss live migration techniques developed for clusters, grids, etc., well before the concept was applied to cloud computing. We survey the literature on the evaluation of VM migration techniques and identify the performance metrics. The existing live virtual machine migration techniques are studied and classified based on these metrics.

This paper is organized as follows. Section II gives a brief introduction to Virtual Machine Migration (VMM). Section III describes related work on evaluation metrics. Live VMM techniques are surveyed in Section IV. Research challenges are discussed in Section V, and we conclude our work in Section VI with future directions.

II. BACKGROUND

Virtualization technology allows multiple operating systems to run concurrently on the same physical machine.
Virtualization provides the facility to migrate a virtual machine from one physical host (source) to another (destination). Virtual Machine Migration (VMM) is a useful tool for administrators of data centers and clusters: it allows a clean separation between hardware and software. The problems of process-level migration can be avoided by migrating an entire virtual machine, since VMM avoids residual dependencies. Virtual machine migration enables energy saving, load balancing and efficient resource utilization. Virtual machine migration methods are divided into two types: hot (live) migration and cold (non-live) migration. In cold migration, the VM loses its status and the user notices the service interruption. In hot (live) migration, the virtual machine keeps running while migrating and does not lose its status, so the user does not feel any interruption in service. In the live migration process, the state of the virtual machine to be migrated is transferred. The state consists of its memory contents and its local file system; with shared storage, the local file system need not be transferred. First, the VM is suspended, then its state is transferred, and lastly the VM is resumed at the destination host. Live migration facilitates online maintenance, load balancing and energy management:

1. Online maintenance: To improve a system's reliability and availability, it must remain connected to its clients while upgrades and maintenance are carried out; live migration allows all VMs to be migrated away without disconnecting them.
2. Load balancing: VMs can be migrated from heavily loaded hosts to lightly loaded hosts to avoid overloading any one server.
3. Energy management: VMs can be consolidated to save energy. Some underutilized servers are switched off after their VMs are consolidated, ensuring a power-efficient green cloud.

Sapuntzakis et al. [14] demonstrate how to quickly move the state of a running computer across a network, including the state in its disks, memory, CPU registers, and I/O devices.
This hardware state, called a capsule, includes the entire operating system as well as applications and running processes. They developed techniques to reduce the amount of data sent over the network: copy-on-write disks track only the updates to capsule disks, "ballooning" zeros unused memory, demand paging fetches only needed blocks, and hashing avoids sending blocks that already exist at the remote end. The basic live migration algorithm was first proposed by Clark et al. [15]. The hypervisor first marks all pages as dirty, then iteratively transfers dirty pages across the network until the number of pages remaining to be transferred is below a certain threshold or a maximum number of iterations is reached, marking transferred pages as clean. Since the VM keeps operating during live migration, already transferred memory pages may be dirtied during an iteration and must be re-transferred. At some point the VM is suspended at the source to stop further memory writes, and the remaining pages are transferred. After all the memory contents have been transferred, the VM resumes at the destination. Nelson et al. [16] describe the design and implementation of a system that uses virtual machine technology to provide fast, transparent application migration; neither the applications nor the operating systems need to be modified. Performance is measured with a hundred virtual machines migrating concurrently under standard industry benchmarks, and shows that for a variety of workloads, application downtime due to migration is less than a second. A high-performance virtual machine migration design based on Remote Direct Memory Access (RDMA) was proposed by Huang et al. [17]. InfiniBand is an emerging interconnect offering high performance and features such as OS-bypass and RDMA. RDMA is a direct memory access from the memory of one computer into that of another without involving either one's operating system.
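The iterative pre-copy scheme of Clark et al. can be illustrated with a minimal simulation. The page set, the dirtying callback and the thresholds below are illustrative assumptions for the sketch, not details of the Xen implementation:

```python
def precopy_migrate(pages, get_dirty, send, threshold=50, max_iters=30):
    """Simulate iterative pre-copy: send every page once, then keep
    re-sending pages dirtied during each round until few enough remain
    (or the iteration cap is hit), then finish with stop-and-copy."""
    to_send = set(pages)            # round 1: all pages count as dirty
    transferred = 0
    for _ in range(max_iters):
        for p in to_send:
            send(p)                 # copy this round's dirty pages
        transferred += len(to_send)
        to_send = get_dirty()       # pages re-dirtied while copying
        if len(to_send) <= threshold:
            break                   # small enough for the final phase
    # stop-and-copy: VM is suspended, remaining dirty pages are sent
    for p in to_send:
        send(p)
    return transferred + len(to_send)
```

Note how a high dirtying rate keeps `to_send` large across rounds, which is why pre-copy needs the hard stop conditions (threshold, iteration cap) to guarantee termination.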
Using RDMA, remote memory can be read and written (modified) directly, and hardware I/O devices can access memory without involving the OS. Luo et al. [18] describe a whole-system live migration scheme which transfers the whole system run-time state of the virtual machine, including CPU state, memory data, and local disk storage. They propose a three-phase migration (TPM) algorithm as well as an incremental migration (IM) algorithm, which migrates the virtual machine back to the source machine in a very short total migration time. During the migration, all write accesses to the local disk storage are tracked using a block-bitmap, and the local disk storage is synchronized according to this block-bitmap. The migration downtime is around 100 milliseconds, close to shared-storage migration. Using the IM algorithm, total migration time is reduced. The block-bitmap-based synchronization mechanism is simple and effective, and the performance overhead of recording all writes on the migrated VM is very low. Bradford et al. [19] presented a system for supporting transparent, live wide-area migration of virtual machines that use local storage for their persistent state. This approach is transparent to the migrated VM, does not interrupt open network connections to and from the VM during wide-area migration, guarantees consistency of the VM's local persistent state at the source and the destination after migration, and is able to handle highly write-intensive workloads.

2013 3rd IEEE International Advance Computing Conference (IACC)

III. PERFORMANCE METRICS

Researchers have evaluated the issues in live virtual machine migration and suggested various performance metrics. Voorsluys et al. [13] evaluate the effects of live migration of virtual machines on the performance of applications running inside Xen VMs.
Their results show that migration overhead is acceptable but cannot be disregarded, especially in systems where availability and responsiveness are governed by strict SLAs. Kuno et al. [20] present a performance evaluation of both migration methods (live and non-live) and demonstrate that the performance of processes on a migrating virtual machine severely declines. The important reasons for the decline are host OS communication and memory writing. They also analyze the reasons for I/O performance decline; the results demonstrate that one of the important causes is the transmission required for migration. Feng et al. [21] compare the performance of VMotion and XenMotion. VMotion generates less total live migration data than XenMotion when migrating a VM instance. The performance of both VMotion and XenMotion degrades in networks with delay and packet loss, and VMotion performs much worse than XenMotion in networks with moderate delay and packet loss. Existing live migration technology performs well for LAN live migration.

The following metrics are usually used to measure the performance of live migration [22]:
1. Preparation Time: The time between initiating migration and transferring the VM's state to the target node, during which the VM continues to execute and dirty its memory.
2. Downtime: The time during which the migrating VM is not executing. It includes the transfer of processor state.
3. Resume Time: The time between resuming the VM's execution at the target and the end of migration, when all dependencies on the source are eliminated.
4. Pages Transferred: The total number of memory pages transferred, including duplicates, across all of the above time periods.
5. Total Migration Time: The total time from start to finish. Total time is important because it affects the release of resources on both participating nodes as well as within the VMs.
6. Application Degradation: The extent to which migration slows down the applications executing within the VM.

IV. LIVE VM MIGRATION TECHNIQUES IN CLOUD

Live migration is an extremely powerful tool for cluster and cloud administrators. An administrator can migrate OS instances, together with their applications, so that a machine can be freed for maintenance. Similarly, to improve manageability, OS instances may be rearranged across machines to relieve the load on overloaded hosts. To perform live migration of a VM, its runtime state must be transferred from the source to the destination while the VM is still running. There are two major approaches: post-copy and pre-copy memory migration. Post-copy first suspends the migrating VM at the source, copies minimal processor state to the target node, resumes the virtual machine, and begins fetching memory pages over the network from the source. The pre-copy approach has two phases: a warm-up phase and a stop-and-copy phase. In the warm-up phase, the hypervisor copies all memory pages from source to destination while the VM is still running on the source. Pages that change during the copy process (dirty pages) are re-copied in successive rounds until the rate of re-copied pages approaches the page-dirtying rate. In the stop-and-copy phase, the VM is stopped at the source, the remaining dirty pages are copied to the destination, and the VM is resumed at the destination.

A. Post-Copy Approaches

Hines et al. [22] present the design and implementation of a post-copy technique for live migration of virtual machines. Post-copy consists of four key components: demand paging, active pushing, prepaging, and dynamic self-ballooning. They implemented and evaluated post-copy on a Xen and Linux based platform. The evaluations show that post-copy significantly reduces the total migration time and the number of pages transferred compared to pre-copy.
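The demand-paging core of post-copy can be sketched as follows. This is a toy model with hypothetical source/target objects, not Xen's actual interfaces; it also illustrates a simple neighbour-pulling form of prepaging, loosely in the spirit of the "bubbling" idea:

```python
class PostCopySource:
    """Source host: keeps the memory image after the VM has resumed
    at the target, serving pages on request."""
    def __init__(self, memory):
        self.memory = memory
    def fetch(self, page_no):
        return self.memory[page_no]

class PostCopyTarget:
    """Target host: the VM resumes immediately with no resident pages;
    each first access to a page causes a network page fault."""
    def __init__(self, source):
        self.source = source
        self.pages = {}                # resident pages
        self.faults = 0
    def read(self, page_no):
        if page_no not in self.pages:  # network page fault
            self.faults += 1
            self.pages[page_no] = self.source.fetch(page_no)
            # prepaging: also pull neighbours around the faulting page
            for n in (page_no - 1, page_no + 1):
                if 0 <= n < len(self.source.memory) and n not in self.pages:
                    self.pages[n] = self.source.fetch(n)
        return self.pages[page_no]
```

Because pages cross the network at most once, post-copy bounds the pages transferred, at the price of the VM's first accesses stalling on network faults.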
The bubbling algorithm for prepaging significantly reduces the number of network faults incurred during post-copy migration. Hines et al. [23] compare post-copy against the pre-copy approach on top of the Xen hypervisor, showing improvements in several migration metrics, including pages transferred, total migration time and network overhead, over a range of VM workloads. They use post-copy with adaptive prepaging in order to eliminate all duplicate page transmissions, and eliminate the transfer of free memory pages in both migration schemes through a dynamic self-ballooning (DSB) mechanism. DSB periodically releases free pages in a guest VM back to the hypervisor and significantly speeds up migration with negligible performance degradation.

B. Pre-Copy Approaches

There are many categories within the pre-copy approach: existing technologies are combined, existing pre-copy mechanisms are improved, multiple VMs are migrated together, and specific application loads are considered. The techniques are explained below.

1) Combined Technologies: Liu et al. [24] describe a novel approach that combines recovery technology (checkpointing / recovery and trace / replay) with CPU scheduling to provide fast and transparent migration. The target host executes log files generated on the source host to synchronize the states of the source and target hosts, during which a CPU scheduling mechanism is used to adjust the log generation rate. This approach has short downtime and reasonable total migration time. Liu et al. [25] describe a novel approach, CR/TR-Motion, that adopts checkpointing / recovery and trace / replay technology to provide fast, transparent VM migration. This scheme can greatly reduce the migration downtime and network bandwidth consumption.
In a multi-processor (or multi-core) environment, the expensive memory races among different VCPUs must be recorded and replayed, which makes it inherently difficult for this approach to migrate SMP guest OSes. The VCPU hot-plug technique may address this issue by dynamically configuring the migrated VM to use only one VCPU before migration and giving back the VCPUs after the migration. When CPU- and/or memory-intensive VMs are migrated, migration suffers extended downtime that may cause service interruption or even failure, and prolonged total migration time that is harmful for overall system performance. Svard et al. [26] approach this two-fold problem through a combination of techniques. They dynamically adapt the transfer order of VM memory pages during live migration, reducing the risk of re-transfers for frequently dirtied pages, and use a compression scheme that increases migration throughput, so that migration downtime is effectively reduced. Moving a large live VM over a WAN with low bandwidth is a significant problem. Bose et al. [27] propose to combine VM replication with VM scheduling so that migration latencies can be minimized. They compensate for the additional storage requirement due to the increased number of replicas by exploiting commonalities across different VM images using de-duplication techniques. In related work, Bose et al. [28] again combine VM replication with VM scheduling to overcome the migration latencies associated with moving large files (VM images) over relatively low-bandwidth networks. They replicate a VM image selectively across different cloud sites, choose one replica of the VM image to be the primary copy, and propagate the incremental changes at the primary copy to the remaining replicas of the VM. The proposed architecture for integrated replication and scheduling, called CloudSpider, is capable of minimizing the latencies associated with live migration of VM images across WANs.
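The compression idea used by approaches such as Svard et al.'s can be illustrated in a few lines: instead of re-sending a full dirtied page, the source sends a compressed XOR delta against the previously transmitted version. This is a sketch using zlib for illustration, not the actual hypervisor implementation:

```python
import zlib

def make_delta(old: bytes, new: bytes) -> bytes:
    """XOR a dirtied page against its previously sent version and
    compress. Re-dirtied pages usually change in only a few places,
    so the XOR is mostly zeros and compresses extremely well."""
    xored = bytes(a ^ b for a, b in zip(old, new))
    return zlib.compress(xored)

def apply_delta(old: bytes, delta: bytes) -> bytes:
    """Destination side: reconstruct the new page version."""
    xored = zlib.decompress(delta)
    return bytes(a ^ b for a, b in zip(old, xored))

page_v1 = bytes(4096)                  # previously transmitted page
page_v2 = bytearray(page_v1)
page_v2[100:108] = b"modified"         # small dirty region
page_v2 = bytes(page_v2)

delta = make_delta(page_v1, page_v2)
assert apply_delta(page_v1, delta) == page_v2
assert len(delta) < len(page_v2) // 10  # far smaller than the full page
```

The catch is that the source must keep (or hash) previously sent page versions, trading memory and CPU on both ends for network bandwidth.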
2) Improved Pre-Copy Approaches: Jin et al. [29] present the design and implementation of a novel memory-compression based VM migration approach (MECOM). They first use memory compression to provide fast and stable virtual machine migration, though virtual machine services may be slightly affected depending on memory page characteristics. They also design an adaptive zero-aware compression algorithm that balances the performance and cost of virtual machine migration. Pages are quickly compressed in batches on the source and exactly recovered on the target. Experiments demonstrate that, compared with Xen, this system can reduce downtime by 27.1%, total migration time by 32% and total transferred data by 68.8%. Ma et al. [30] improve the pre-copy approach on Xen 3.3.0 by adding a bitmap page that marks frequently updated pages. During the iteration process, frequently updated pages are recorded in the page bitmap and are transmitted only in the last round of the iteration. This ensures that frequently updated pages are transmitted just once. Svard et al. [31] implement a delta-compression live migration algorithm as a modification to the KVM hypervisor. The performance is evaluated by migrating running VMs with different types of workload, and shows a significant reduction in migration downtime. They demonstrate that when VMs are migrated under high workloads and/or over low-bandwidth networks, there is a high risk of service interruption. Using delta compression, this risk can be reduced because data is transferred in the form of changes between versions. To improve performance further, either the dirtying rate has to be reduced or the network throughput increased. Ibrahim et al. [32] present a performance analysis of the current KVM implementation and study the behavior of iterative pre-copy live migration for memory intensive applications.
For scientific applications running on VMs with multiple cores, the memory rate of change is likely to be higher than the migration draining rate. They present a novel algorithm, implemented in KVM, that achieves both low downtime and low application performance impact.

3) Multiple VM Migration: Al-Kiswany et al. present VMFlockMS [33], a migration service optimized for cross-datacenter transfer and instantiation of groups of related VM images that form an application-level solution (e.g., a three-tier web application). VMFlockMS uses two techniques: 1) data deduplication within the VMFlock to be migrated, among the VMs in the VMFlock, and against the data already present at the destination datacenter; and 2) accelerated instantiation of the application at the target datacenter after transferring only a partial set of data blocks, with prioritization of the remaining data based on previously observed access patterns originating from the running VMs. The result is a scalable and high-performance migration service. Ye et al. [34] present a resource-reservation-based live migration framework consisting of a migration decision maker, a migration controller, a resource reservation controller and a resource monitor. The reserved resources on the source machine include CPU (on the Xen virtualization platform) and memory (by dynamically adjusting VM memory size), while on the target machine they include the whole virtual machine's resources. Three metrics quantify the efficiency: downtime, total time, and workload performance overheads. Kikuchi et al. [35] construct a performance model of concurrent live migrations in virtualized datacenters. They first collect data from live migrations executed simultaneously, then construct a performance model representing the performance characteristics of live migration using the PRISM probabilistic model checker.
This approach makes it possible to orchestrate management operations and determine appropriate configurations to avoid undesirable situations, from a probabilistic viewpoint, in cloud systems. Deshpande et al. [36] present the design, implementation, and evaluation of a de-duplication based approach to perform concurrent live migration of co-located VMs. This approach transmits memory content that is identical across VMs only once during migration, significantly reducing both total migration time and network traffic. They use the QEMU/KVM Linux platform for live gang migration of virtual machines. Deshpande et al. [37] present an inter-rack live migration (IRLM) system. IRLM reduces the traffic load on the core network links during mass VM migration through distributed deduplication of the VMs' memory images. The initial IRLM prototype migrates multiple QEMU/KVM VMs within a Gigabit Ethernet cluster with 10GigE core links. For a configuration of 6 hosts per rack and 4 VMs per host, IRLM can reduce the amount of data transferred over the core links during migration.

4) Specific Cloud Environments: Elmroth et al. [38] present two novel interface and architectural contributions that enable cloud computing software to make use of inter- and intra-site VM migration, and improve inter- and intra-site monitoring of VM resources, both on an infrastructural and on an application-specific level. Celesti et al. [39] propose a Composed Image Cloning (CIC) methodology to reduce consumption of bandwidth and cloud resources. This approach does not consider the disk-image of a VM as a single monolithic block, but as a combination of composable and user data blocks. Suen et al. [40] propose and compare techniques that can reduce the transfer bandwidth and storage cost of the data involved in the migration process.
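The content-based deduplication underlying gang migration can be sketched as follows. The hashing scheme and the send callbacks are illustrative assumptions, not the QEMU/KVM implementation:

```python
import hashlib

def gang_migrate(vm_memories, send_page, send_ref):
    """Migrate several co-located VMs together, sending each distinct
    page content only once; later occurrences of the same content send
    a short hash reference instead of the full page."""
    seen = {}                                  # content hash -> first copy
    pages_sent = refs_sent = 0
    for vm_id, pages in vm_memories.items():
        for page_no, content in enumerate(pages):
            h = hashlib.sha256(content).hexdigest()
            if h in seen:
                send_ref(vm_id, page_no, h)    # duplicate: reference only
                refs_sent += 1
            else:
                seen[h] = (vm_id, page_no)
                send_page(vm_id, page_no, content)
                pages_sent += 1
    return pages_sent, refs_sent
```

Co-located VMs often boot from the same image and share kernel and library pages, so the fraction of reference-only transfers can be large.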
5) Application / Workload Specific Technologies: Migration technology has limitations when used on larger application systems such as SAP ERP, which consume a large amount of memory. Hacking and Hudzia [41] present the design, implementation, and evaluation of a system for supporting transparent, live migration of virtual machines running typical large enterprise application workloads. It minimizes service disruption by using a delta compression algorithm for memory transfer, as well as by introducing an adaptive warm-up phase in order to reduce the rigidity of migrating large VMs. Sato et al. [42] present a VM relocation algorithm for data intensive applications on a virtual machine in a geographically distributed environment. The proposed algorithm determines the optimal location of a VM for accessing target files, minimizing the total expected file access time by solving DAG shortest-path problems, on the assumption that the network throughput between sites and the sizes and locations of the target files are given. It achieves higher performance than simple techniques. Piao et al. [43] present a network-aware VM placement and migration approach for data intensive applications in cloud computing environments. The proposed approach places VMs on physical machines with consideration of the network conditions between the physical machines and the data storage. Shrivastava et al. [44] introduce AppAware, a novel, computationally efficient scheme for incorporating (1) inter-VM dependencies and (2) the underlying network topology into VM migration decisions. AppAware is a greedy algorithm with heuristics that assigns VMs to physical machines one at a time, while trying to minimize the cost resulting from the mapping at each step.
The evaluation of AppAware uses metrics such as the total traffic volume transported by the data center network once all overloaded VMs have been assigned to physical machines. Using simulations, they show that AppAware decreases network traffic by up to 81% compared to a well-known alternative VM migration method that is not application-aware. H. Liu et al. [45] construct two application-oblivious models for cost prediction using learned knowledge about the workloads at the hypervisor level. This is the first model of VM migration costs in terms of both performance and energy. They validate the models with a large set of experiments, and the evaluation results demonstrate the effectiveness of model-guided live migration in both performance and energy costs. Modeling the performance of migration involves several factors: the size of VM memory, the workload characteristic (the memory dirtying rate), the network transmission rate, and the migration algorithm (different configurations of the migration algorithm lead to great variations in migration performance). The most important challenge is to correctly characterize the memory access pattern of each running workload.

6) Other Technologies: Nocentino et al. [46] propose a novel dependency-aware approach to live virtual machine migration and present the results of an initial investigation into its ability to reduce migration latency and overhead. The approach uses a tainting mechanism originally developed for intrusion detection; dependency information is used to distinguish processes that create direct or indirect external dependencies during live migration. Akoush et al. [47] show that link speed and page dirty rate are the major factors impacting migration behavior. These factors have a non-linear effect on migration performance, largely because of the hard stop conditions that force migration to its final stop-and-copy stage.
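The interplay between link speed and dirty rate can be seen in a back-of-the-envelope model of iterative pre-copy, assuming (for illustration only, not as Akoush et al.'s model) a constant dirty rate and constant bandwidth:

```python
def precopy_time(vm_mem_mb, bandwidth_mbps, dirty_rate_mbps,
                 stop_threshold_mb=50, max_rounds=30):
    """Estimate total migration time and downtime for iterative
    pre-copy with constant bandwidth B and dirty rate D (MB/s).
    Each round transfers what was dirtied during the previous round,
    so round i moves roughly vm_mem * (D/B)**i megabytes."""
    assert dirty_rate_mbps < bandwidth_mbps, "migration would never converge"
    total_time = 0.0
    to_send = float(vm_mem_mb)
    for _ in range(max_rounds):
        t = to_send / bandwidth_mbps        # time to send this round
        total_time += t
        to_send = dirty_rate_mbps * t       # dirtied meanwhile
        if to_send <= stop_threshold_mb:
            break                           # small enough to stop and copy
    downtime = to_send / bandwidth_mbps     # final stop-and-copy phase
    return total_time + downtime, downtime
```

For a 4 GB VM on a 1000 MB/s link, raising the dirty rate from 100 to 400 MB/s roughly halves the geometric decay per round, so total time grows disproportionately, which mirrors the non-linear behavior Akoush et al. report.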
Migration times should be accurately predicted to enable more dynamic and intelligent placement of VMs without degrading performance. The address-warping problem is one of the difficulties in wide-area migration: the address of the VM warps from the source server to the destination server, which complicates the status of the WAN and of the LANs connected to the WAN. Kanada et al. [48] propose two solutions to this problem: 1) switching an address-translation rule (analogous to paging in memory virtualization), and 2) switching multiple virtual networks. Wood et al. [49] present CloudNet, a cloud framework consisting of cloud computing platforms linked with a VPN-based network infrastructure to provide seamless and secure connectivity between enterprise and cloud data center sites. CloudNet provides optimized support for live WAN migration of virtual machines that is beneficial over low-bandwidth, high-latency Internet links, minimizing the cost of transferring storage and virtual machine memory during migrations. At the heart of CloudNet is a Virtual Cloud Pool (VCP) abstraction that enables server resources across data centers and cloud providers to be logically grouped into a single server pool. A VPN based on Multi-Protocol Label Switching (MPLS) is used in CloudNet to create the abstraction of a private network and address space shared by multiple data centers. CloudNet coordinates the hypervisor's memory migration with a disk replication system so that the entire VM state can be transferred if needed, and is optimized to reduce the amount of data transferred, the total migration time and the application downtime. Huang et al. [50] present a live migration benchmark, Virt-LM, for comparing live migration performance among different software and hardware environments in a data center scenario.
Metrics, Workloads, Impartial Scoring Methodology, Stability, Compatibility, and Usability are the goals to design Virt-LM. Resource availability can help to make better decision on when to migrate VM and how to allocate necessary resources. Wu and Zhao [51] create a performance model using statistical method such as regression. It can be used to predict migration time and guide resource management decision. They did experiment by migrating a xen-based VM running CPU, memory, or I/O intensive application and allocating different amount of CPU share. It shows that the available resources to live migration have an impact on migration time. Jing et. al. [52] propose a optimization migration framework to reduce the migration downtime, which is based on the analysis of the memory transfer in the real-time migration of current Xen virtual machine. This framework makes use of layered copy algorithm and memory compression algorithm, optimizes the time and space complexity of real- time migration, reduces the migration downtime greatly and improves the migration performance. Ashino et. al. [53] propose VM migration method to solve the problems (Guest OS fails to boot up on destination after migration, loading device drivers, or adjusting its device configuration). EDAMP migration method is proposed and is still in development. Method only overwrites the files and does not destroy the device driver. EDAMP can be used in multiple cloud services and integrating to one hypervisor. V. RESEARCH CHALLENGES IN LIVE VM MIGRATION A. Low Bandwidth over WAN A virtual machine can be scheduled for execution at geographically disparate cloud locations depending upon the cost of computation and the load at these locations. However, translocating a live VM across highlatency lowbandwidth wide area networks (WAN) within reasonable time is nearly impossible due to the large size of the VM image [27]. B. 
Virtual Machines with Different Types of Workload
Migration technology has limitations when it is used on larger application systems, such as SAP ERP, that consume a large amount of memory [41].

C. Link Speed and Page Dirty Rate
Link speed and page dirty rate are the major factors impacting migration behavior. Link capacity is inversely proportional to total migration time and downtime. The page dirty rate is the rate at which memory pages in the VM are modified, which in turn directly affects the number of pages transferred in each pre-copy iteration [47].

D. Available Resources
Resource availability can help to make better decisions on when to migrate a VM and how to allocate resources [51].

E. Address Warping
The address-warping problem is one of the difficulties in wide-area migration: the address of the VM warps from the source server to the destination server, which complicates the status of the WAN and of the LANs connected to it [48].

There are some other challenges such as network faults [22], overloaded VMs [44], memory- and data-intensive applications [32, 42, 43], and consumption of bandwidth and cloud resources [39].

VI. CONCLUSION AND FUTURE WORK
This paper surveys live virtual machine migration techniques. Live migration involves transferring a running virtual machine across distinct physical hosts. Many techniques attempt to minimize downtime and to provide better performance in low-bandwidth environments. We have categorized the papers, and the techniques in each category need to be compared to understand their strengths and weaknesses. In future, we plan to propose a performance model based on the research gaps identified through these limitations, which will be helpful for reducing migration time under heavy workload. We also want to parallelize the migration process using MapReduce so that data can be distributed among various places.
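The parallelization idea can be sketched as a toy map/reduce over memory chunks. This is purely illustrative and is not the proposed system: it uses a Python thread pool in place of an actual MapReduce framework, and `transfer` is a hypothetical stand-in for shipping a chunk over the network.

```python
from concurrent.futures import ThreadPoolExecutor

def split_chunks(memory: bytes, chunk_size: int):
    """Map step: split the VM memory image into fixed-size chunks."""
    return [(offset, memory[offset:offset + chunk_size])
            for offset in range(0, len(memory), chunk_size)]

def transfer(chunk):
    """Hypothetical stand-in for sending one chunk to a destination host."""
    offset, data = chunk
    return offset, len(data)          # pretend the bytes were shipped

def parallel_migrate(memory: bytes, chunk_size=4096, workers=4):
    """Transfer chunks concurrently, then reduce to a byte count."""
    chunks = split_chunks(memory, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(transfer, chunks))
    # Reduce step: confirm every byte of the image was accounted for.
    return sum(length for _, length in results)
```

Because each chunk carries its offset, the destination can reassemble the image regardless of the order in which workers complete, which is what makes the transfer embarrassingly parallel.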
Another problem in live migration is low bandwidth; network bandwidth could be better utilized by allocating it dynamically.

REFERENCES
[1] Google, "Google App Engine," (2012), [online]. Available: cloud.google.com [Nov 1, 2012].
[2] Amazon, "Amazon Elastic Compute Cloud (Amazon EC2)," (2012), [online]. Available: aws.amazon.com/ec2/ [Nov 1, 2012].
[3] Microsoft, "Windows Azure," (2012), [online]. Available: windowsazure.com [Nov 1, 2012].
[4] IBM, "SmartCloud," (2012), [online]. Available: ibm.com/cloud-computing [Nov 1, 2012].
[5] P. Mell and T. Grance, "The NIST definition of cloud computing (draft)," NIST Special Publication, vol. 800, p. 145.
[6] A. Desai, "Virtual Machine," (2012), [online]. Available: http://searchservervirtualization.techtarget.com/definition/virtualmachin
[7] VMware, "vSphere ESX and ESXi Info Center," (2012), [online]. Available: vmware.com/products/vsphere/esxi-and-esx [Nov 1, 2012].
[8] Microsoft, "Windows Virtual PC," (2012), [online]. Available: http://www.microsoft.com/windows/virtual-pc/ [Nov 1, 2012].
[9] Xen, "Xen Hypervisor," (2012), [online]. Available: http://www.xen.org/products/xenhyp.html [Nov 1, 2012].
[10] Microsoft, "Hyper-V Server 2012," (2012), [online]. Available: microsoft.com/server-cloud/hyper-v-server/ [Nov 1, 2012].
[11] KVM, "Kernel-based Virtual Machine," (2012), [online]. Available: linux-kvm.org [Nov 1, 2012].
[12] Oracle, "VirtualBox," (2012), [online]. Available: virtualbox.org [Nov 1, 2012].
[13] W. Voorsluys, J. Broberg, S. Venugopal, and R. Buyya, "Cost of virtual machine live migration in clouds: A performance evaluation," in 1st International Conference on Cloud Computing, Berlin, Germany, 2009, pp. 254-265.
[14] P. S. Constantine, C. Ramesh, P. Ben, C. Jim, S. L. Monica, and R. Mendel, "Optimizing the migration of virtual computers," in 5th
Symposium on Operating Systems Design and Implementation, SIGOPS Oper. Syst. Rev., vol. 36, issue SI, pp. 377-390, 2002.
[15] C. Christopher, F. Keir, H. Steven, H. Jacob Gorm, J. Eric, L. Christian, P. Ian, and W. Andrew, "Live migration of virtual machines," in 2nd Symposium on Networked Systems Design & Implementation, vol. 2, USENIX Association, 2005.
[16] N. Michael, L. Beng-Hong, and H. Greg, "Fast transparent migration for virtual machines," in USENIX Annual Technical Conference, Anaheim, CA, USENIX Association, 2005.
[17] H. Wei, G. Qi, L. Jiuxing, and D. K. Panda, "High performance virtual machine migration with RDMA over modern interconnects," in IEEE International Conference on Cluster Computing, 2007, pp. 11-20.
[18] L. Yingwei, Z. Binbin, W. Xiaolin, W. Zhenlin, S. Yifeng, and C. Haogang, "Live and incremental whole-system migration of virtual machines using block-bitmap," in IEEE International Conference on Cluster Computing, 2008, pp. 99-106.
[19] B. Robert, K. Evangelos, F. Anja, and S. Harald, "Live wide-area migration of virtual machines including local persistent state," in 3rd International Conference on Virtual Execution Environments, San Diego, California, USA, ACM, 2007.
[20] Y. Kuno, K. Nii, and S. Yamaguchi, "A study on performance of processes in migrating virtual machines," in 10th International Symposium on Autonomous Decentralized Systems, ISADS 2011, 2011, pp. 567-572.
[21] X. Feng, J. Tang, X. Luo, and Y. Jin, "A performance study of live VM migration technologies: VMotion vs XenMotion," The International Society for Optical Engineering, 2011.
[22] R. H. Michael, D. Umesh, and G. Kartik, "Post-copy live migration of virtual machines," SIGOPS Oper. Syst. Rev., vol. 43, pp. 14-26, 2009.
[23] R. H. Michael and G.
Kartik, "Post-copy based live virtual machine migration using adaptive pre-paging and dynamic self-ballooning," in ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, Washington, DC, USA, ACM, 2009.
[24] L. Weining and F. Tao, "Live migration of virtual machine based on recovering system and CPU scheduling," in 6th IEEE Joint International Information Technology and Artificial Intelligence Conference, Piscataway, NJ, USA, May 2009, pp. 303-307.
[25] L. Haikun, J. Hai, L. Xiaofei, H. Liting, and Y. Chen, "Live migration of virtual machine based on full system trace and replay," in 18th ACM International Symposium on High Performance Distributed Computing, Garching, Germany, ACM, 2009.
[26] P. Svard, J. Tordsson, B. Hudzia, and E. Elmroth, "High performance live migration through dynamic page transfer reordering and compression," in 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011, pp. 542-548.
[27] S. K. Bose, S. Brock, R. Skeoch, and S. Rao, "CloudSpider: Combining replication with scheduling for optimizing live migration of virtual machines across wide area networks," in 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2011, May 2011, pp. 13-22.
[28] S. K. Bose, S. Brock, R. Skeoch, N. Shaikh, and S. Rao, "Optimizing live migration of virtual machines across wide area networks using integrated replication and scheduling," in IEEE International Systems Conference, SysCon 2011, pp. 97-102.
[29] J. Hai, D. Li, W. Song, S. Xuanhua, and P. Xiaodong, "Live virtual machine migration with adaptive memory compression," in IEEE International Conference on Cluster Computing and Workshops, CLUSTER '09, pp. 1-10.
[30] M. Fei, L. Feng, and L. Zhen, "Live virtual machine migration based on improved pre-copy approach," in IEEE International Conference on Software Engineering & Service Sciences (ICSESS), 2010, pp. 230-233.
[31] S. Petter, H. Benoit, T. Johan, and E.
Erik, "Evaluation of delta compression techniques for efficient live migration of large virtual machines," in 7th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, California, USA, ACM, 2011.
[32] K. Z. Ibrahim, S. Hofmeyr, C. Iancu, and E. Roman, "Optimized pre-copy live migration for memory intensive applications," in International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2011, pp. 1-11.
[33] S. Al-Kiswany, D. Subhraveti, P. Sarkar, and M. Ripeanu, "VMFlock: Virtual machine co-migration for the cloud," in IEEE International Symposium on High Performance Distributed Computing, 2011, pp. 159-170.
[34] Y. Kejiang, J. Xiaohong, H. Dawei, C. Jianhai, and W. Bei, "Live migration of multiple virtual machines with resource reservation in cloud computing environments," California, USA, 2011, pp. 267-274.
[35] S. Kikuchi and Y. Matsumoto, "Performance modeling of concurrent live migration operations in cloud computing systems using PRISM probabilistic model checker," in 4th IEEE International Conference on Cloud Computing, CLOUD 2011, July 2011, pp. 49-56.
[36] D. Umesh, W. Xiaoshuang, and G. Kartik, "Live gang migration of virtual machines," in 20th International Symposium on High Performance Distributed Computing, San Jose, California, USA, ACM, 2011.
[37] D. Umesh, K. Unmesh, and G. Kartik, "Inter-rack live migration of multiple virtual machines," in 6th International Workshop on Virtualization Technologies in Distributed Computing, Delft, Netherlands.
[38] E. Elmroth and L. Larsson, "Interfaces for placement, migration, and monitoring of virtual machines in federated clouds," in 8th International Conference on Grid and Cooperative Computing, GCC 2009, pp. 253-260.
[39] A. Celesti, F. Tusa, M. Villari, and A. Puliafito, "Improving virtual machine migration in federated cloud environments," in 2nd International Conference on Evolving Internet, Internet 2010, pp. 61-67.
[40] S. Chun-Hui, M. Kirchberg, and L.
Bu Sung, "Efficient migration of virtual machines between public and private cloud," in 3rd IEEE International Conference on Cloud Computing Technology and Science (CloudCom), Los Alamitos, CA, USA, Nov 2011, pp. 549-553.
[41] H. Stuart and H. Benoit, "Improving the live migration process of large enterprise applications," in 3rd International Workshop on Virtualization Technologies in Distributed Computing, Barcelona, Spain, ACM, 2009.
[42] K. Sato, H. Sato, and S. Matsuoka, "A model-based algorithm for optimizing I/O intensive applications in clouds using VM-based migration," in 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, CCGRID 2009, pp. 466-471.
[43] J. T. Piao and J. Yan, "A network-aware virtual machine placement and migration approach in cloud computing," in 9th International Conference on Grid and Cloud Computing, GCC 2010, pp. 87-92.
[44] V. Shrivastava, P. Zerfos, L. Kang-won, H. Jamjoom, L. Yew-Huey, and S. Banerjee, "Application-aware virtual machine migration in data centers," in IEEE INFOCOM, 2011, pp. 66-70.
[45] L. Haikun, X. Cheng-Zhong, J. Hai, G. Jiayu, and L. Xiaofei, "Performance and energy modeling for live migration of virtual machines," in 20th International Symposium on High Performance Distributed Computing, San Jose, California, USA, ACM, 2011.
[46] N. Anthony and M. R. Paul, "Toward dependency-aware live virtual machine migration," in 3rd International Workshop on Virtualization Technologies in Distributed Computing, Barcelona, Spain, ACM, 2009.
[47] A. Sherif, S. Ripduman, R. Andrew, W. M. Andrew, and H. Andy, "Predicting the performance of virtual machine migration," in IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2010.
[48] Y. Kanada and T. Tarui, "A 'network-paging' based method for wide-area live migration of VMs," in International Conference on Information Networking, ICOIN 2011, Jan 2011, pp. 268-272.
[49] T. Wood, P. Shenoy, K. K.
Ramakrishnan, and J. Van der Merwe, "CloudNet: Dynamic pooling of cloud resources by live WAN migration of virtual machines," in ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE 2011, pp. 121-132.
[50] H. Dawei, Y. Deshi, H. Qinming, C. Jianhai, and Y. Kejiang, "Virt-LM: A benchmark for live migration of virtual machine," in 2nd Joint WOSP/SIPEW International Conference on Performance Engineering, Karlsruhe, Germany, ACM, 2011.
[51] Y. Wu and M. Zhao, "Performance modeling of virtual machine live migration," in 4th IEEE International Conference on Cloud Computing, CLOUD 2011, pp. 492-499.
[52] J. Yang, "Key technologies and optimization for dynamic migration of virtual machines in cloud computing," in International Conference on Intelligent Systems Design and Engineering Applications, ISDEA 2012, pp. 643-647.
[53] Y. Ashino and M. Nakae, "Virtual machine migration method between different hypervisor implementations and its evaluation," in 26th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2012, pp. 1089-1094.
2013 3rd IEEE International Advance Computing Conference (IACC)