COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited.

Slides:



Advertisements
Similar presentations
Remus: High Availability via Asynchronous Virtual Machine Replication
Advertisements

Virtualization Technology
Remove the hurdles to offsite backup Highly efficient Automatically copies VMs to local or offsite storage location Validation and remediation.
Consistency and Replication Chapter 7 Part II Replica Management & Consistency Protocols.
Multiple Processor Systems
1 Cheriton School of Computer Science 2 Department of Computer Science RemusDB: Transparent High Availability for Database Systems Umar Farooq Minhas 1,
Acknowledgments University of Waterloo University of British Columbia
1 Lecture 12: Hardware/Software Trade-Offs Topics: COMA, Software Virtual Memory.
Towards High-Availability for IP Telephony using Virtual Machines Devdutt Patnaik, Ashish Bijlani and Vishal K Singh.
Disco: Running Commodity Operating Systems on Scalable Multiprocessors Bugnion et al. Presented by: Ahmed Wafa.
G Robert Grimm New York University Disco.
1 Disco: Running Commodity Operating Systems on Scalable Multiprocessors Edouard Bugnion, Scott Devine, and Mendel Rosenblum, Stanford University, 1997.
Remus: High Availability via Asynchronous Virtual Machine Replication.
Computer-System Structures
Cross Cluster Migration Remote access support Adianto Wibisono supervised by : Dr. Dick van Albada Kamil Iskra, M. Sc.
ECE669 L17: Memory Systems April 1, 2004 ECE 669 Parallel Computer Architecture Lecture 17 Memory Systems.
Virtualization for Cloud Computing
Copyright © 2002 Wensong Zhang. Page 1 Free Software Symposium 2002 Linux Virtual Server: Linux Server Clusters for Scalable Network Services Wensong Zhang.
Case Study - GFS.
1 Some Context for This Session…  Performance historically a concern for virtualized applications  By 2009, VMware (through vSphere) and hardware vendors.
Highly Available ACID Memory Vijayshankar Raman. Introduction §Why ACID memory? l non-database apps: want updates to critical data to be atomic and persistent.
Tanenbaum 8.3 See references
1 Design and Performance of a Web Server Accelerator Eric Levy-Abegnoli, Arun Iyengar, Junehwa Song, and Daniel Dias INFOCOM ‘99.
Disco : Running commodity operating system on scalable multiprocessor Edouard et al. Presented by Jonathan Walpole (based on a slide set from Vidhya Sivasankaran)
CS533 Concepts of Operating Systems Jonathan Walpole.
Remus: VM Replication Jeff Chase Duke University.
Computer Architecture Lecture 28 Fasih ur Rehman.
Distributed Shared Memory: A Survey of Issues and Algorithms B,. Nitzberg and V. Lo University of Oregon.
CH2 System models.
CSE 451: Operating Systems Section 10 Project 3 wrap-up, final exam review.
1 Moshe Shadmon ScaleDB Scaling MySQL in the Cloud.
Politecnico di Torino Dipartimento di Automatica ed Informatica TORSEC Group Performance of Xen’s Secured Virtual Networks Emanuele Cesena Paolo Carlo.
Caltech CS184b Winter DeHon 1 CS184b: Computer Architecture [Single Threaded Architecture: abstractions, quantification, and optimizations] Day14:
MapReduce and GFS. Introduction r To understand Google’s file system let us look at the sort of processing that needs to be done r We will look at MapReduce.
Presenters: Rezan Amiri Sahar Delroshan
1 Lecture 12: Hardware/Software Trade-Offs Topics: COMA, Software Virtual Memory.
Supporting Multi-Processors Bernard Wong February 17, 2003.
Distributed Shared Memory Based on Reference paper: Distributed Shared Memory, Concepts and Systems.
Optimizing Live Migration of Virtual Machines across Wide Area Networks using Integrated Replication and Scheduling Sumit Kumar Bose, Unisys Scott Brock,
Disco : Running commodity operating system on scalable multiprocessor Edouard et al. Presented by Vidhya Sivasankaran.
Cache Coherence Protocols 1 Cache Coherence Protocols in Shared Memory Multiprocessors Mehmet Şenvar.
Distributed Shared Memory (part 1). Distributed Shared Memory (DSM) mem0 proc0 mem1 proc1 mem2 proc2 memN procN network... shared memory.
GFS. Google r Servers are a mix of commodity machines and machines specifically designed for Google m Not necessarily the fastest m Purchases are based.
PI System High Availability : Interface Redundancy and Disconnected Interface Startup Andy Singh, Ph.D., OPC Team Tony Cantele, Uniint Team Leader.
Efficient Live Checkpointing Mechanisms for computation and memory-intensive VMs in a data center Kasidit Chanchio Vasabilab Dept of Computer Science,
Optimizing Live Migration of Virtual Machines across Wide Area Networks using Integrated Replication and Scheduling Sumit Kumar Bose, Unisys Scott Brock,
Local and Remote byte-addressable NVDIMM High-level Use Cases
Protection of Processes Security and privacy of data is challenging currently. Protecting information – Not limited to hardware. – Depends on innovation.
Tweak Performance and Improve Availability of your Microsoft Azure VMs Rick
Complete VM Mobility Across the Datacenter Server Virtualization Hyper-V 2012 Live Migrate VM and Storage to Clusters Live Migrate VM and Storage Between.
PI System High Availability : Interface Redundancy and Disconnected Interface Startup Andy Singh, Ph.D., OPC Team Tony Cantele, Uniint Team Leader.
Cloud Computing – UNIT - II. VIRTUALIZATION Virtualization Hiding the reality The mantra of smart computing is to intelligently hide the reality Binary->
Virtual Machine Movement and Hyper-V Replica
Em Spatiotemporal Database Laboratory Pusan National University File Processing : Database Management System Architecture 2004, Spring Pusan National University.
CMSC 611: Advanced Computer Architecture Shared Memory Most slides adapted from David Patterson. Some from Mohomed Younis.
Running Commodity Operating Systems on Scalable Multiprocessors Edouard Bugnion, Scott Devine and Mendel Rosenblum Presentation by Mark Smith.
E Virtual Machines Lecture 6 Topics in Virtual Machine Management Scott Devine VMware, Inc.
Distributed Systems Lecture 5 Time and synchronization 1.
© ExplorNet’s Centers for Quality Teaching and Learning 1 Explain the purpose of Microsoft virtualization. Objective Course Weight 2%
Wiera: Towards Flexible Multi-Tiered Geo-Distributed Cloud Storage Instances Zhe Zhang.
Efficient data maintenance in GlusterFS using databases
AlwaysOn Mirroring, Clustering
Windows Azure Migrating SQL Server Workloads
Disco: Running Commodity Operating Systems on Scalable Multiprocessors
Introduction to Operating Systems
EECS 498 Introduction to Distributed Systems Fall 2017
OS Virtualization.
湖南大学-信息科学与工程学院-计算机与科学系
Outline Midterm results summary Distributed file systems – continued
Distributed Availability Groups
Presentation transcript:

COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service Li Zhijian Fujitsu Limited.

Agenda Background COarse-grain LOck-stepping Summary

Non-Stop Service with VM Replication Typical Non-stop Service Requires Expensive hardware for redundancy Extensive software customization VM Replication: Cheap Application-agnostic Solution

Existing VM Replication Approaches 4 Replication Per Instruction: Lock-stepping Execute in parallel for deterministic instructions Lock and step for un-deterministic instructions Replication Per Epoch: Continuous Checkpoint Secondary VM is synchronized with Primary VM per epoch Output is buffered within an epoch

Problems 5 Lock-stepping Excessive replication overhead  memory access in an MP-guest is un-deterministic Continuous Checkpoint Extra network latency Excessive VM checkpoint overhead

Agenda 6 Background COarse-grain LOck-stepping Summary

Why COarse-grain LOck-stepping (COLO) 7 VM Replication is an overly strong condition Why we care about the VM state ?  The client care about response only Can the control failover without ”precise VM state replication” ? Coarse-grain lock-stepping VMs Secondary VM is a replica, as if it can generate same response with primary so far  Be able to failover without service stop Non-stop service focus on server response, not internal machine state!

Architecture of COLO 8 Pnode: primary node; PVM: primary VM; Snode: secondary node; SVM: secondary VM COarse-grain LOck-stepping Virtual Machine for Non-stop Service

Network topology of COLO 9 [eth0] : client and vm communication [eth1] : migration/checkpoint, storage replication and proxy Pnode: primary node; PVM: primary VM; Snode: secondary node; SVM: secondary VM

Network Process 10 Guest-RX Pnode Receive a packet from client Copy the packet and send to Snode Send the packet to PVM Snode Receive the packet from Pnode Adjust packet’s ack_seq number Send the packet to SVM Guest-TX Snode Receive the packet from SVM Adjust packet’s seq number Send the SVM packet to Pnode Pnode Receive the packet from PVM Receive the packet from Snode Compare PVM/SVM packet Same: release the packet to client Different: trigger checkpoint and release packet to client Base on Qemu’s netfilter and SLIRP

Storage Process 11 Write Pnode Send the write request to Snode Write the write request to storage Snode Receive PVM write request Read original data to SVM cache & write PVM write request to storage(Copy On Write) Write SVM write request to SVM cache Read Snode Read from SVM cache, or storage (SVM cache miss) Pnode Read form storage Checkpoint Drop SVM cache Failover Write SVM cache to storage Base on qemu’s quorum,nbd,backup-driver,backingfile

Memory Sync Process 12 PNode – –Track PVM dirty pages, send them to Snode periodically Snode – –Receive the PVM dirty pages, save them to PVM Memory Cache – –On checkpoint, update SVM memory with PVM Memory Cache

Checkpoint Process 13 Need modify migration process in Qemu to support checkpoint

Why Better 14 Comparing with Continuous VM checkpoint No buffering-introduced latency Less checkpoint frequency  On demand vs. periodic Comparing with lock-stepping Eliminate excessive overhead of un- deterministic instruction execution due to MP- guest memory access

Agenda 15 Background COarse-grain LOck-stepping Summary

16 Performance

Summary 17 COLO status colo frame: patchset v9 had been post(by colo-block: most of patch is reviewed (by colo-proxy: (by netfilter related is reviewed packet compare is developing

Summary 18 Next steps: Redesign based on feedbacks Develop and send out for review Optimize performance

19