Parallelizing Live Migration of Virtual Machines

Presentation transcript:

Parallelizing Live Migration of Virtual Machines Xiang Song, Jicheng Shi, Ran Liu, Jian Yang, Haibo Chen IPADS of Shanghai Jiao Tong University / Fudan University. Hi, everyone. My name is Xiang Song, from the IPADS group of Shanghai Jiao Tong University and Fudan University. Today I want to introduce our work on parallelizing live migration of virtual machines. This work was done with Jicheng Shi, Ran Liu, Jian Yang, and my advisor Haibo Chen.

Virtual Clouds Nowadays, virtual clouds are extremely popular: they provide isolated computing resources as well as rich VM management operations such as VM migration, checkpointing, and so on.

Live VM Migration Live VM migration is one of the major primitive operations for managing virtualized cloud platforms. Such an operation is usually mission critical and disruptive to the running services.

VM Migration is Time-consuming Live VM migration is becoming time-consuming: the resources configured to a VM keep increasing, while the resources available to the migration tools are limited. Migrating a memcached server VM on Xen, 4 GByte vs. 16 GByte memory: migration time 257s vs. 1592s; downtime 80s vs. 400s. Unfortunately, with the increasing amount of resources configured to a VM, such operations are becoming increasingly time-consuming. For example, when migrating a memcached VM with 4 GByte versus 16 GByte of memory, the migration time grows from 257s to 1592s and the downtime from 80s to 400s.

VM Migration Insight Example: migrating a memcached VM with 16 GByte memory on Xen. Migration time: 1592s in total, including 1200s of data transfer and 381s spent mapping guest memory. VM memory size: 16.0 GByte; data transferred: 49.3 GByte during pre-copy, 9.3 GByte during downtime; average CPU usage: 95.4%. We looked into the basic primitives of live VM migration using a memcached VM with 16 GByte memory. The total migration time is 1592 seconds, which includes 1200 seconds spent on data transfer, 381 seconds on mapping guest memory, and 11 seconds on other operations. When migrating such a VM, about 49.3 GByte are pre-copied and 9.3 GByte are sent during VM downtime, and the average CPU utilization reaches 95.4%. With a single-threaded migration toolkit, the migration process has hit its resource limit and cannot achieve better performance.

VM Migration Insight A lot of memory is dirtied during pre-copy: dirty rate vs. transfer rate. Improving the transfer rate depends on the CPU preparation rate and the network bandwidth. From this analysis, we can see that a lot of memory is dirtied during pre-copy. How much memory is dirtied is determined by the memory dirty rate and the data transfer rate: the dirty rate is decided by the application, while the transfer rate is decided by the migration process. The more time it takes to transfer the data, the more memory is dirtied. So we need to improve the transfer rate, which is in turn determined by the CPU preparation rate and the network bandwidth. As we saw, the single-threaded migration process has already reached its maximum performance in the above case.
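A rough way to see this dependence (assuming, purely for illustration, a roughly constant application dirty rate): memory re-dirtied during one pre-copy iteration ≈ dirty rate × (bytes sent in that iteration / transfer rate), so raising the effective transfer rate directly shrinks the newly dirtied memory the next iteration must resend. Plugging in the numbers from the previous slide: about 58.6 GByte (49.3 GByte pre-copy plus 9.3 GByte at downtime) are sent in roughly 1590 s, an effective rate of only about 38 MByte/s, which is why the 16 GByte VM ends up sending well over three times its memory size.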

Parallelizing Live VM Migration With an increasing amount of resources, there are opportunities to leverage those resources to parallelize live VM migration. We design and implement PMigrate (live parallel migration), which parallelizes most basic primitives of migration using data parallelism and pipeline parallelism. With the increasing amount of resources, there exist opportunities to leverage resources (such as network and CPU) to parallelize live VM migration. Thus, we design and implement PMigrate, a parallel migration toolkit for live VM migration that parallelizes most of the basic migration primitives.

Contributions A case for parallelizing live VM migration; the range lock abstraction to scale address space mutation during migration; the design, implementation, and evaluation of PMigrate on Xen and KVM. The contributions of this paper include: a case for parallelizing live VM migration, made by analyzing the available parallelism of migration primitives; the range lock abstraction to scale address space mutation during migration; and the design, implementation, and evaluation of PMigrate on Xen and KVM.

Outline Design of PMigrate, Challenges for PMigrate, Implementation, Evaluation. Here is the outline of today's presentation. I will start with the design of PMigrate.

Analysis of Live VM Migration: Source Node Enter an iteration; get/check the dirty bitmap; handle the data (memory data: map guest VM memory and handle zero/page-table pages; disk data: load disk data); transfer the data; in the final iteration also transfer CPU/device state. Both Xen and KVM use a pre-copy based live VM migration strategy. During the pre-copy iterations, the migration process first gets and checks the dirty bitmap of memory and the disk image, then handles the data, which includes memory data and disk data. When handling memory data, the migration process maps the guest VM memory into its own address space and handles the pages one by one; when handling disk data, it loads the disk data. After handling the data, the data are transferred to the destination node. There are several iterations before the final one; during the final iteration, all remaining dirty memory and disk data are transferred, together with the CPU and device state. The destination node follows similar steps when receiving the migration data.
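To keep the control flow in one place, here is a minimal sketch of the source-side pre-copy loop described above; the helper names are hypothetical stand-ins, not the actual Xen or KVM toolstack functions:

```c
/* Minimal source-side pre-copy sketch (hypothetical helpers, not the actual
 * Xen/KVM toolstack functions). */
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long pfn_t;

/* Stand-ins for the real toolstack primitives. */
extern size_t get_and_clear_dirty_bitmap(pfn_t **dirty_pfns);    /* dirty memory/disk tracking */
extern void  *map_guest_page(pfn_t pfn);                         /* map guest memory into our space */
extern void   unmap_guest_page(void *va);
extern void   send_page(pfn_t pfn, const void *va);              /* push one page to the destination */
extern bool   should_stop_and_copy(size_t remaining, int iter);  /* small working set or iteration cap */
extern void   pause_vm_and_send_remaining_state(void);           /* dirty data + CPU/device state */

void precopy_source_loop(void)
{
    for (int iter = 0; ; iter++) {
        pfn_t *dirty;
        size_t n = get_and_clear_dirty_bitmap(&dirty);

        for (size_t i = 0; i < n; i++) {
            void *va = map_guest_page(dirty[i]);   /* the costly mapping step (381s in the example) */
            send_page(dirty[i], va);               /* the costly transfer step (1200s in the example) */
            unmap_guest_page(va);
        }

        if (should_stop_and_copy(n, iter))
            break;
    }
    pause_vm_and_send_remaining_state();           /* VM paused: downtime begins here */
}
```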

Analysis of Parallelism Data parallelism: no dependency among different portions of data, e.g., mapping guest VM memory. Pipeline parallelism: used when data parallelism is not appropriate, e.g., checking the disk/memory dirty bitmap. When designing PMigrate, we use two kinds of parallelism. The first is data parallelism, applied to migration primitives that have no dependency among different portions of data. The second is pipeline parallelism, applied to migration primitives that cannot be data-parallelized.

Analysis of Parallelism Parallelism chosen for the major primitives (cost in parentheses): check dirty bitmap — pipeline (small); map guest VM memory — data (heavy); handle unused/page-table pages — data (modest); transfer memory data — data; restore memory data — data; transfer disk data — data; load/save disk data — pipeline. This table shows which kind of parallelism is chosen for each major primitive in VM migration. As shown, most operations can be data-parallelized.

PMigration: Source Node Memory data producer and disk data producer → task pool → send consumers (pipeline parallelism between producers and consumers, data parallelism among the send consumers). Here is how data parallelism and pipeline parallelism are applied on the source node. The memory data producer and the disk data producer push data tasks into a task pool. Meanwhile, send-consumer threads pull tasks from the pool and handle them in parallel. The data producers and the send consumers are pipeline-parallelized, while the send consumers themselves are data-parallelized.
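A minimal sketch of this task-pool structure, assuming a simple bounded ring buffer with pthreads; the names and sizes are illustrative, not PMigrate's actual code:

```c
/* Simplified task-pool sketch (illustrative only; not PMigrate's actual code). */
#include <pthread.h>
#include <stddef.h>

#define POOL_CAP 256

struct task { void *data; size_t len; };           /* one chunk of memory or disk data */

static struct task_pool {
    struct task     ring[POOL_CAP];
    size_t          head, tail, count;
    int             done;                          /* set once the producers have finished */
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
} pool = {
    .lock      = PTHREAD_MUTEX_INITIALIZER,
    .not_empty = PTHREAD_COND_INITIALIZER,
    .not_full  = PTHREAD_COND_INITIALIZER,
};

/* Memory/disk data producers (first pipeline stage) push tasks here. */
void pool_push(struct task t)
{
    pthread_mutex_lock(&pool.lock);
    while (pool.count == POOL_CAP)                 /* bounded: gives the pipeline back-pressure */
        pthread_cond_wait(&pool.not_full, &pool.lock);
    pool.ring[pool.tail] = t;
    pool.tail = (pool.tail + 1) % POOL_CAP;
    pool.count++;
    pthread_cond_signal(&pool.not_empty);
    pthread_mutex_unlock(&pool.lock);
}

/* Each send-consumer thread (data-parallel stage) pulls tasks; returns 0 when drained. */
int pool_pop(struct task *out)
{
    pthread_mutex_lock(&pool.lock);
    while (pool.count == 0 && !pool.done)
        pthread_cond_wait(&pool.not_empty, &pool.lock);
    if (pool.count == 0) {                         /* producers done and pool empty */
        pthread_mutex_unlock(&pool.lock);
        return 0;
    }
    *out = pool.ring[pool.head];
    pool.head = (pool.head + 1) % POOL_CAP;
    pool.count--;
    pthread_cond_signal(&pool.not_full);
    pthread_mutex_unlock(&pool.lock);
    return 1;
}

/* Producers call this once when all data has been queued. */
void pool_finish(void)
{
    pthread_mutex_lock(&pool.lock);
    pool.done = 1;
    pthread_cond_broadcast(&pool.not_empty);
    pthread_mutex_unlock(&pool.lock);
}
```

The bounded ring also gives the pipeline natural back-pressure: producers block when the consumers fall behind.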

PMigration: Destination Node Receive consumers (data parallelism) → disk writer (pipeline parallelism). On the destination node, the receive consumers receive and handle data transferred from the source node in a data-parallel fashion, while disk data is handled by the receive consumers and a disk writer in a pipeline.

Outline Design of PMigrate, Challenges for PMigrate, Implementation, Evaluation. After leveraging data parallelism and pipeline parallelism to parallelize the migration, several challenges arise.

Challenge: Controlling Resource Usage Parallel VM migration operations consume more CPU and network resources. Problem: lower the side effects. Solution: resource usage control. The first challenge is how to control the resource usage of migration. Parallel VM migration operations consume more CPU and network resources, so we need to lower the side effects of migration. We use resource usage control to limit the side effects of VM migration.

Resource Usage Control: Network A daemon thread monitors the network usage of each NIC; the migration process adjusts its network usage on each NIC accordingly, and some bandwidth is reserved for migration. For the network resource, we use a daemon thread to monitor the network usage of each NIC and ask the migration process to adjust its network usage on each NIC according to the monitoring result.
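One plausible way to realize such a control loop is sketched below; it is illustrative only, not PMigrate's actual controller, and the one-second refill window, the reserved-bandwidth floor, and the way NIC counters are read are all assumptions:

```c
/* Illustrative per-NIC network-rate controller (assumptions: one-second refill
 * windows, a fixed bandwidth reserve for migration, NIC counters read elsewhere;
 * none of this is PMigrate's actual code). */
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

struct nic_budget {
    pthread_mutex_t lock;
    pthread_cond_t  refilled;
    int64_t         bytes_left;    /* remaining budget in the current window */
    int64_t         allowed_bps;   /* migration bandwidth allowed on this NIC */
};

/* Send consumers call this before writing 'len' bytes on this NIC's socket. */
void throttle_charge(struct nic_budget *b, int64_t len)
{
    pthread_mutex_lock(&b->lock);
    while (b->bytes_left < len)
        pthread_cond_wait(&b->refilled, &b->lock);   /* wait for the next window */
    b->bytes_left -= len;
    pthread_mutex_unlock(&b->lock);
}

/* Daemon thread: once per second, see how much the co-located VMs are using on
 * this NIC and give the rest (never less than the reserve) to migration. */
void *nic_monitor(void *arg)
{
    struct nic_budget *b = arg;
    const int64_t link_bps    = 125LL * 1000 * 1000;  /* ~1 Gbit/s in bytes/s */
    const int64_t reserve_bps = 10LL  * 1000 * 1000;  /* always keep some bandwidth for migration */

    for (;;) {
        sleep(1);
        int64_t other_bps = 0;     /* read from NIC counters, e.g. /proc/net/dev (omitted) */
        pthread_mutex_lock(&b->lock);
        b->allowed_bps = link_bps - other_bps;
        if (b->allowed_bps < reserve_bps)
            b->allowed_bps = reserve_bps;
        b->bytes_left = b->allowed_bps;
        pthread_cond_broadcast(&b->refilled);
        pthread_mutex_unlock(&b->lock);
    }
    return NULL;
}
```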

Resource Usage Control: CPU & Memory CPU rate control: rely on VMM scheduling [L. Cherkasova et al., PER '07] and control the priority of the migration process. Memory rate control: maintain a memory pool for the pipeline stages. For the CPU resource, we rely on the VMM scheduling mechanism and also control the priority of the migration process. For memory rate control, we maintain a memory pool to hold intermediate data between pipeline stages and limit the total amount of memory each stage consumes.
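The memory pool can be as simple as a byte budget with back-pressure; the sketch below is illustrative (the pool structure, cap, and malloc backing are assumptions, not PMigrate's implementation):

```c
/* Byte-budget memory pool sketch (illustrative; not PMigrate's implementation). */
#include <pthread.h>
#include <stdlib.h>

struct mem_pool {
    pthread_mutex_t lock;
    pthread_cond_t  freed;
    size_t          in_use;        /* bytes currently handed out to pipeline stages */
    size_t          cap;           /* hard limit on buffered intermediate data */
};

void *mem_pool_alloc(struct mem_pool *p, size_t len)
{
    pthread_mutex_lock(&p->lock);
    while (p->in_use + len > p->cap)               /* back-pressure on the producing stage */
        pthread_cond_wait(&p->freed, &p->lock);
    p->in_use += len;
    pthread_mutex_unlock(&p->lock);
    return malloc(len);
}

void mem_pool_free(struct mem_pool *p, void *buf, size_t len)
{
    free(buf);
    pthread_mutex_lock(&p->lock);
    p->in_use -= len;
    pthread_cond_broadcast(&p->freed);             /* wake stages waiting for memory */
    pthread_mutex_unlock(&p->lock);
}
```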

Challenge: Scaling Address Space Mutation How is a memory task handled? Map a range of address space; map the target guest VM memory; process the memory; unmap the address space. The corresponding kernel paths are sys_mmap() (down_write(mmap_sem); map_address_space(); up_write(mmap_sem)), privcmd_ioctl_mmap_batch() (... vma = find_vma(mm, m.addr); ret = traverse_pages(...)), and sys_munmap() (unmap_address_space()). The second challenge is how to scale address space mutation. When handling a memory data task, a consumer thread first allocates a free range of address space through mmap, then maps the target memory of the guest VM through an ioctl call. After the memory has been processed, the mapped address space is unmapped through munmap. Looking into each step, all three steps contend on the memory-management semaphore (mmap_sem). An evaluation of migrating a VM with 16 GByte memory using 8 consumer threads shows that about 47.94% of the time is spent on this contention.
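From user space, one memory task therefore looks roughly like the sketch below (hypothetical wrapper names; the real path goes through libxc and the Xen privcmd driver), which makes it clear why every task takes mmap_sem several times:

```c
/* One memory task from the consumer thread's point of view (hypothetical
 * wrappers; the real path goes through libxc and the Xen privcmd driver).
 * Steps 1, 2 and 4 all take the process's mmap_sem in the kernel. */
#include <stddef.h>
#include <sys/mman.h>

#define GUEST_PAGE_SIZE 4096UL

extern int  map_guest_batch(void *va, const unsigned long *pfns, size_t n); /* ioctl into privcmd (stand-in) */
extern void process_pages(void *va, size_t n);                              /* scan/queue pages for sending */

void handle_memory_task(const unsigned long *pfns, size_t n)
{
    /* 1. Reserve a range of our own address space (mmap_sem taken in write mode). */
    void *va = mmap(NULL, n * GUEST_PAGE_SIZE, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (va == MAP_FAILED)
        return;

    /* 2. Map the guest pages into that range via the privcmd ioctl
     *    (mmap_sem again, plus the per-page mapping hypercalls). */
    if (map_guest_batch(va, pfns, n) == 0)
        process_pages(va, n);        /* 3. Handle the pages; no mmap_sem needed here. */

    /* 4. Tear the range down (mmap_sem in write mode once more). */
    munmap(va, n * GUEST_PAGE_SIZE);
}
```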

First Solution: Read-Protecting the Guest VM Map When mapping the target guest memory, holding mmap_sem in write mode is too costly, and it is not necessary: mmap_sem can be held in read mode, so privcmd_ioctl_mmap_batch can run in parallel. After analyzing the operation that maps target guest memory, we found that holding mmap_sem in write mode is too costly and is not necessary for this operation. Thus, we hold mmap_sem in read mode when mapping target guest memory, and the guest memory map function can be executed in parallel.
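A simplified kernel-side sketch of this change (not the literal privcmd driver code; traverse_and_map_pages() is a stand-in for the real page-walking helper that issues the mapping hypercalls):

```c
/* Simplified kernel-side view of the read-mode change (not the literal Xen
 * privcmd driver code; traverse_and_map_pages() is a stand-in helper). */
#include <linux/mm.h>
#include <linux/errno.h>
#include <xen/interface/xen.h>   /* xen_pfn_t, domid_t */

extern long traverse_and_map_pages(struct vm_area_struct *vma, unsigned long addr,
                                   unsigned long nr, xen_pfn_t *mfns, domid_t dom);

static long privcmd_mmap_batch_sketch(struct mm_struct *mm, unsigned long addr,
                                      unsigned long nr, xen_pfn_t *mfns, domid_t dom)
{
    struct vm_area_struct *vma;
    long ret;

    down_read(&mm->mmap_sem);            /* was down_write() in the vanilla path */
    vma = find_vma(mm, addr);
    if (!vma || addr + nr * PAGE_SIZE > vma->vm_end) {
        up_read(&mm->mmap_sem);
        return -EINVAL;
    }
    /* Only reads the VMA layout and installs mappings for an already reserved
     * range, so concurrent consumer threads (touching disjoint ranges) can all
     * be in here at the same time. */
    ret = traverse_and_map_pages(vma, addr, nr, mfns, dom);
    up_read(&mm->mmap_sem);
    return ret;
}
```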

Range Lock There are still serious contentions: mutations to an address space are serialized, and the guest VM memory map operation contends with those mutations. Range lock: a dynamic lock service for the address space. Even after read-protecting the guest VM map, some contention remains: the address space mutation operations (map/unmap) are still serialized, and the guest VM memory map operation still contends with address space mutations. Thus, we propose range lock, a dynamic lock service for the address space.

Range Lock Mechanism A skip-list-based lock service that locks an address range ([start, start + length]), so accesses to different portions of the address space can be parallelized. In range lock, we use a skip list to maintain the status of locked address ranges; thus we can lock a specific range of the address space, and accesses to different portions of the address space can proceed in parallel.
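A much-simplified sketch of such a lock service is shown below, using an unsorted linked list guarded by a pthread mutex instead of the in-kernel skip list the talk describes; it is purely illustrative:

```c
/* Much-simplified range-lock sketch: an unsorted list of held [start, end)
 * ranges guarded by a pthread mutex. The real lock service lives in the
 * kernel and uses a skip list; this is purely illustrative. */
#include <pthread.h>
#include <stdlib.h>

struct held_range { unsigned long start, end; struct held_range *next; };

static struct held_range *held;
static pthread_mutex_t rl_lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  rl_released = PTHREAD_COND_INITIALIZER;

static int overlaps(unsigned long s, unsigned long e)
{
    for (struct held_range *r = held; r; r = r->next)
        if (s < r->end && r->start < e)
            return 1;
    return 0;
}

/* Block until [start, start+len) overlaps no held range, then record it. */
void lock_range(unsigned long start, unsigned long len)
{
    struct held_range *r = malloc(sizeof(*r));
    r->start = start;
    r->end   = start + len;

    pthread_mutex_lock(&rl_lock);
    while (overlaps(r->start, r->end))
        pthread_cond_wait(&rl_released, &rl_lock);
    r->next = held;                       /* the real skip list keeps ranges ordered */
    held = r;
    pthread_mutex_unlock(&rl_lock);
}

void unlock_range(unsigned long start, unsigned long len)
{
    pthread_mutex_lock(&rl_lock);
    for (struct held_range **pp = &held; *pp; pp = &(*pp)->next) {
        if ((*pp)->start == start && (*pp)->end == start + len) {
            struct held_range *dead = *pp;
            *pp = dead->next;
            free(dead);
            break;
        }
    }
    pthread_cond_broadcast(&rl_released); /* waiters re-check for overlap */
    pthread_mutex_unlock(&rl_lock);
}
```

A skip list keeps lookup and insertion logarithmic in the number of held ranges, which matters when many consumer threads lock disjoint ranges concurrently.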

Range Lock Range lock applied to the address space operations:
sys_mmap(): down_write(mmap_sem); obtain the address to map; lock_range(addr, len); update/add VMAs; unlock_range(addr, len); up_write(mmap_sem).
sys_mremap(): down_write(mmap_sem); lock_range(addr, len); do remap; unlock_range(addr, len); up_write(mmap_sem).
munmap(): down_write(mmap_sem); adjust first and last VMA; lock_range(addr, len); detach VMAs; up_write(mmap_sem); clean up page tables; free pages; unlock_range(addr, len).
guest_map: down_read(mmap_sem); find VMA; lock_range(addr, len); up_read(mmap_sem); map guest pages through hypercalls; unlock_range(addr, len).
This shows how range locks are applied to the address space mutation operations. In practice, it parallelizes the most time-consuming parts: (1) mapping target guest VM memory and (2) unmapping the address space, which are highlighted in the figure.

Outline Design of PMigrate, Challenges for PMigrate, Implementation, Evaluation. Next, I will talk about the implementation effort of realizing PMigrate.

Implementing PMigrate Implementation on Xen: based on the Xen tools of Xen 4.1.2 and Linux 3.2.6; range lock: 230 SLOCs; PMigrate: 1,860 SLOCs. Implementation on KVM: based on qemu-kvm 0.14.0; KVM migration: 2,270 SLOCs. We implemented PMigrate on both Xen and KVM. It adds or changes 1,860 lines of code in the Xen tools to implement PMigrate, and 230 lines of code in the Linux kernel to realize range lock. It takes 2,270 lines of code in KVM to implement PMigrate, including 830 SLOCs to change its iteration strategy to image-oriented.

Implementing KVM Vanilla KVM takes an iteration-oriented pre-copy approach, handling 2 MByte of data per iteration, and the qemu daemon is shared by the guest VM and the migration process. PMigrate-KVM takes an image-oriented pre-copy approach, handling the whole memory/disk image per iteration, and separates the migration process from the qemu daemon. Vanilla KVM's iteration-oriented strategy handles 2 MByte of data per iteration. Moreover, because the qemu daemon is shared by the guest VM and the migration process, the guest VM's I/O operations are blocked whenever the qemu daemon is servicing the migration process. Thus, in PMigrate-KVM, we change the iteration strategy to image-oriented, handling the whole memory/disk image in each iteration, and we also separate the migration process from the qemu daemon.

Outline Design of PMigrate, Challenges for PMigrate, Implementation, Evaluation. Now we come to the evaluation.

Evaluation Setup Conducted on two Intel machines, each with two 1.87 GHz six-core Intel Xeon E7 chips, 32 GByte memory, one quad-port Intel 82576 Gigabit NIC, and one quad-port Broadcom Gigabit NIC. We conducted the evaluation on two Intel machines; each has two six-core Intel E7 chips, 32 GByte memory, and two quad-port Gigabit NICs. Another Intel machine is used as the NFS server.

Workload Idle VM; memcached (one Gigabit network connection; throughput: Xen 27.7 MByte/s, KVM 20.1 MByte/s); PostgreSQL and Dbench (in the paper). We evaluate PMigrate-Xen and PMigrate-KVM with four kinds of workloads: an idle VM, memcached, PostgreSQL, and Dbench. Due to time limitations, we only cover the idle VM and memcached in this presentation; please refer to the paper for the other workloads. Each memcached server has one Gigabit network connection.

Idle VM Migration - Xen Vanilla vs. PMigrate: total memory sent 16.2 vs. 16.2 GByte; network usage 39.3 vs. 148.0 MByte/s; migration time 422.8s vs. 112.4s. Here is the performance of PMigrate-Xen migrating an idle VM. Although the amount of data transferred during migration is almost the same for vanilla Xen and PMigrate-Xen, PMigrate leverages more CPU and network resources for migration, so the total migration time is greatly reduced by parallel migration.

Idle VM Migration - KVM Vanilla vs. PMigrate: total data sent 16.4 vs. 16.4 GByte; network usage 84.2 vs. 294.7 MByte/s; migration time 203.9s vs. 57.4s. Here is the performance of PMigrate-KVM migrating an idle VM. Although the amount of data transferred during migration is almost the same for vanilla KVM and PMigrate-KVM, PMigrate leverages more CPU and network resources for migration, so the total migration time is greatly reduced by parallel migration.

Memcached VM Migration - Xen Vanilla vs. PMigrate: migration time 1586.1s vs. 160.5s; non-response time 251.9s vs. < 1s; network usage 38.0 vs. 145.0 MByte/s; total memory sent 58.6 vs. 22.7 GByte; memory sent in the last iteration 9.2 vs. 0.04 GByte; server throughput during migration 74.5% vs. 65.4%. Here is the performance of PMigrate-Xen migrating a VM running a memcached server. The total migration time and the non-response time of the server during migration are greatly reduced by parallel migration. As the average network throughput increases, the total memory dirtied during migration shrinks substantially, so both the memory sent during migration and the memory sent in the last iteration are greatly reduced. Meanwhile, the performance degradation of the memcached server on PMigrate-Xen is only 9% larger than on vanilla Xen.

Memcached VM Migration - KVM Vanilla vs. PMigrate: migration time 348.7s vs. 140.2s; non-response time 163s vs. < 1s; network usage 90.7 vs. 289.1 MByte/s; total data sent 35.3 vs. 39.5 GByte; server throughput during migration 13.2% vs. 91.6%. Here is the performance of PMigrate-KVM migrating a VM running a memcached server. The total migration time and the non-response time of the server during migration are greatly reduced by parallel migration. The memcached server on vanilla KVM drops to only 0.22 MByte/s after the migration has been running for 180s, and this lasts about 163.0s, meaning the server is nearly unresponsive to clients for a long period. The memcached server on PMigrate-KVM does not have this problem, because we changed KVM's pre-copy strategy. With the higher average network throughput, PMigrate is still faster than vanilla even though it transfers more data. PMigrate transfers more data than vanilla because the server throughput on PMigrate is much higher than on vanilla, since PMigrate separates the migration process from the qemu daemon.

Scalability of PMigrate-Xen Migrating an idle VM. [Figure: total migration time vs. number of consumer threads, for PMigrate-Xen without optimization, with the read-lock optimization, and with the range-lock optimization; reported times include 197.4s, 149.3s, 122.92s, and 112.4s.] This figure shows the scalability of PMigrate-Xen with and without the optimizations. The x-axis shows the number of threads used in the evaluation and the y-axis shows the total migration time. The blue bars show PMigrate-Xen without any optimization; it scales poorly once the number of threads reaches 2. After applying the read-lock optimization (red bars), the performance of PMigrate-Xen improves. With the range-lock optimization (green bars), the performance of PMigrate-Xen improves further.

Conclusion A general design of PMigrate that leverages data and pipeline parallelism; range lock to scale address space mutation; implementations for both Xen and KVM; evaluation results: improved VM migration performance and reduced overall resource consumption in many cases. To conclude: in this presentation we provide a general design of PMigrate by leveraging data parallelism and pipeline parallelism; we use range lock to scale address space mutation in PMigrate; and we have implemented PMigrate for both Xen and KVM. The evaluation results show that PMigrate can improve live VM migration performance and reduce overall resource consumption in many cases.

Institute of Parallel and Distributed Systems Thanks. Questions? PMigrate: Parallel Live VM Migration. Thank you for listening. Our source code is available on our project web site. I am glad to answer any questions. http://ipads.se.sjtu.edu.cn/ http://ipads.se.sjtu.edu.cn/pmigrate

Backups

Load Balance – Network Experimental setup: co-locate an Apache VM (throughput 101.7 MByte/s); migrate an idle VM with 4 GByte memory; the migration process uses two NICs (sharing one NIC with the Apache VM). Result: Apache throughput during migration 91.1 MByte/s; migration speed 17.6 MByte/s + 57.2 MByte/s across the two NICs.

Load Balance – CPU Experimental setup (Xen): 1 memcached server VM and 1 idle VM; 4 GByte memory; 4 VCPUs scheduled on 4 physical CPUs; we migrate the idle VM. PMigrate-Xen spawns 4 consumer threads. Two configurations: PMigrate-Xen shares only spare physical CPUs, or PMigrate-Xen is forced to share all physical CPUs.

Load Balance – CPU Memcached server workload: one Gigabit network connection; throughput 48.4 MByte/s; CPU consumption about 100%.

Load Balance – CPU Results: PMigrate-Xen prefers spare CPUs. Columns: Vanilla / Share PCPU / Work / Spare. Total time: 116s / 131s / 39s / 41s. Avg. memcached throughput: 2.9 / 23.3 / 6.2 / 16.9 MByte/s. Avg. throughput lost: 45.5 / 25.1 / 42.2 / 31.5 MByte/s. Total throughput lost: 5276.2 / 3291 / 1637 / 1293 MByte.

Related work