P2P File-Systems for Scalable Content Use. Joshua Reich, Princeton University, Department of Computer Science.

Presentation transcript:

Goal: Scalable Content Distribution
In crowd (WAN – gateways) or cloud (LAN – data-center servers)
For use
Not all parts of content are used at the same time!
– Multimedia content
– Executables
– Virtual Appliances

Domain: Cloud
Data-center w/
– physical machines
– network storage
VM: software implementation of a hardware machine [the machine as an executable]
VMM: software layer virtualizing hardware [OS for VMs]

Motivation
VM optimized for specific purpose
– Virtual Appliances
– Virtual Servers
– Virtual Desktops (VDI)
Zero config, isolated, easy to replicate
Shared infrastructure is cheaper
Less IT headache
unique images on EC2 alone!*
Hosted VDI market alone est. $65B in 2013**
* http://thecloudmarket.com/stats#/totals (checked 15 July 2011)
** March, 2009

High-level problem
VM images stored on network
Contention for networked storage results in I/O bottlenecks
I/O bottlenecks significantly delay VM execution

Example: Hosted VDI Boot Storm
VM image stored on SAN or NAS
Accessed by servers hosting VDI instances
Everyone comes to work in the morning, starts up their desktop
SAN overloaded by simultaneous access
Virtual Desktops stall

Specific Challenges
1. Large image size + high demand = contention-induced network bottleneck
2. VMM expects complete image
– Either download image completely
– Or continual remote access
3. Complex VM image access patterns
– Non-linear
– Differ from run to run

Analogy to Streaming Video
Assume (2) & (3) aren’t problems
Begins to look like video streaming
Known approach: P2P Video-on-Demand
– Need to stream a series of ordered pieces
– While maintaining swarm efficiency
– Use mix of earliest-first & rarest-first
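The mix of earliest-first and rarest-first piece selection mentioned on this slide can be sketched as below. This is only an illustration, not the policy of any particular P2P video-on-demand system: the function name `pick_piece`, the `availability` map, and the mixing probability `p_earliest` are assumptions introduced here.

```python
import random

def pick_piece(missing, availability, p_earliest=0.8, rng=random):
    """Choose the next piece to request from the swarm.

    missing:      piece indices not yet downloaded
    availability: dict mapping piece index -> number of peers holding it
    p_earliest:   hypothetical mixing probability (not taken from the slides)
    """
    missing = sorted(missing)
    if not missing:
        return None
    if rng.random() < p_earliest:
        # Earliest-first: the piece the consumer will need soonest.
        return missing[0]
    # Rarest-first: the piece held by the fewest peers, which keeps
    # piece diversity (and so swarm efficiency) high. Ties go to the
    # earlier piece.
    return min(missing, key=lambda i: (availability.get(i, 0), i))
```

Setting `p_earliest` close to 1 favours smooth in-order delivery; lowering it trades some of that for better sharing among peers.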

Novel VMTorrent Architecture
1. Large image/high demand -> P2P
2. Complete VM image req’d -> Quick-Start
3. Non-linear access -> Profile Prefetch

Related Work Matrix
Work | Approach | Problem Addressed | Notes
Mietzner:2008, Shi:2008 | Sequential distribution of VM images | VM Deployment | Slow, doesn’t scale
O’Donnell:2008, Chen:2009 | Naive P2P distribution of VM images | VM Deployment | Slow, scales
Industry | Hardware overprovisioning | VM Deployment | Fast, expensive
Chandra:2005, Moka5 | Content prefetching + on-demand streaming | Virtual Desktop Delivery | Fast, highly structured
Vlavianos:2006, Zhou:2007 | Mix earliest first / random first prefetch | Video Streaming | Fast, scales well
VMTorrent | Quick start + P2P + profile prefetch | VM Deployment | Fast, scales well

VMTorrent Architecture
[Diagram labels: VM, VMM, Hardware/OS, Custom FS, P2P Manager, profile, Swarm; VMTorrent Instance; Unmodified VM & VMM]

Understanding VMTorrent: A Bottom Up Approach

Traditional VM Execution
VM runs on some host
Virtual Machine: software implementation of a computer
Implementation stored in an image
Image stored on host’s local file system
[Diagram labels: VM, Host, VM Image, FS]

Traditional VM Execution
Virtual Machine Monitor virtualizes hardware
Conducts I/O to image through FS
[Diagram labels: VM, VMM, Host Hardware/OS, VM Image, FS]

VM Execution Over Network
Network backend used
– Either to download image
– Or to access via remote FS
[Diagram labels: VM, VMM, Hardware/OS, VM Image, FS, Network Backend]

VM Execution Over Network
Download: one big up-front performance hit
Remote access: smaller hits, but writes and re-reads also go over the network
[Diagram labels: VM, VMM, Hardware/OS, VM Image, FS, Network Backend]

Quick Start with Custom FS
Introduce custom file system
Divide image into pieces
But provide appearance of complete image to VMM
[Diagram labels: VM, VMM, Hardware/OS, Custom FS, VM Image, FS, Network Backend]

Quick Start w/ Custom FS
VMM attempts to read piece 1
Piece 1 is present, read completes

Quick Start w/ Custom FS
VMM attempts to read piece 0
Piece 0 isn’t local, read stalls
VMM waits for I/O to complete
VM stalls

Quick Start w/ Custom FS
FS requests piece from backend
Backend requests from network

Quick Start w/ Custom FS
Later, network delivers piece 0
Custom FS receives, updates piece
Read completes
VMM resumes VM’s execution
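The quick-start read path walked through on the preceding slides can be sketched as a small piece store. This is a minimal illustration under stated assumptions, not the prototype's file system: the class `PieceStore` and the blocking `fetch` callback (standing in for the network backend) are hypothetical names introduced here.

```python
import threading

class PieceStore:
    """Presents a complete image while holding only some pieces locally."""

    def __init__(self, image_size, piece_size, fetch):
        self.image_size = image_size
        self.piece_size = piece_size
        self.fetch = fetch          # hypothetical backend: piece index -> bytes
        self.pieces = {}            # pieces already present locally
        self.lock = threading.Lock()

    def _get_piece(self, index):
        with self.lock:
            if index not in self.pieces:
                # Piece is not local: the read stalls here until the
                # backend delivers it, then the local copy is updated.
                self.pieces[index] = self.fetch(index)
            return self.pieces[index]

    def read(self, offset, length):
        """Read bytes as if the whole image were a local file."""
        end = min(offset + length, self.image_size)
        out = bytearray()
        while offset < end:
            index, start = divmod(offset, self.piece_size)
            chunk = self._get_piece(index)[start:start + (end - offset)]
            if not chunk:           # defensive: backend returned a short piece
                break
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

A read that touches only local pieces returns immediately; a read that touches a missing piece blocks in `fetch`, which corresponds to the VM stall described above. Each piece is fetched at most once, so re-reads never go back to the network.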

Improved Performance w/ Custom FS
No waiting for image download to complete
No more writes or re-reads over network w/ remote FS

Custom FS + Network Backend
[Diagram labels: VM, VMM, Hardware/OS, Custom FS, Network Backend]

Scaling w/ P2P Backend
Alleviate bottleneck to network storage
Exchange pieces w/ swarm
P2P copy remains pristine
[Diagram labels: VM, VMM, Hardware/OS, Custom FS, P2P Manager, Swarm]

Minimizing Stall Time
VMM accesses to non-local pieces trigger high-priority swarm requests
[Diagram labels: VM, VMM, Hardware/OS, Custom FS, P2P Manager, Swarm]
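One way to picture demand requests overtaking background requests is a two-level priority queue. The sketch below is an assumption-laden illustration, not the prototype's P2P manager: the names `RequestQueue`, `DEMAND`, and `PREFETCH` are introduced here.

```python
import heapq
import itertools

DEMAND, PREFETCH = 0, 1   # lower value is served first

class RequestQueue:
    """Orders swarm requests so demand (stalled-read) pieces go first."""

    def __init__(self):
        self._heap = []
        self._count = itertools.count()   # FIFO tie-break within a priority
        self._best = {}                   # piece -> best priority queued so far

    def push(self, piece, priority):
        # Ignore the request if the piece is already queued at an equal
        # or better priority.
        if self._best.get(piece, priority + 1) <= priority:
            return
        self._best[piece] = priority
        heapq.heappush(self._heap, (priority, next(self._count), piece))

    def pop(self):
        while self._heap:
            priority, _, piece = heapq.heappop(self._heap)
            if self._best.get(piece) == priority:
                del self._best[piece]
                return piece
            # Otherwise this entry is stale: the piece was upgraded and
            # has already been served at the higher priority.
        return None
```

If piece 4 is queued for prefetch and the VMM then stalls on it, pushing it again with `DEMAND` moves it ahead of every pending prefetch request.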

Custom FS + P2P Manager
[Diagram labels: VM, VMM, Hardware/OS, Custom FS, P2P Manager, Swarm]

P2P Challenge: Request fulfillment latency
Delays
– Network RTT
– At image source (peer or server)
Impact
– If even occasionally it takes 0.5s to obtain piece
– Over the course of thousands of requests
– 10’s of seconds may be lost
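The arithmetic behind this impact estimate can be made explicit. The request count and slow fraction below are hypothetical values chosen only to match the slide's orders of magnitude; only the 0.5 s latency comes from the slide.

```python
def expected_stall_seconds(num_requests, slow_fraction, slow_latency_s):
    """Total time lost when a fraction of piece requests is slow."""
    return num_requests * slow_fraction * slow_latency_s

# Hypothetical run: 5,000 piece requests, 1% of which take 0.5 s.
lost = expected_stall_seconds(5000, 0.01, 0.5)
print(f"about {lost:.0f} s lost to slow requests")
```

With these assumed numbers the loss is roughly 25 seconds, which is consistent with "10’s of seconds" over thousands of requests.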

P2P Challenge: Network Capacity
[Figure: cumulative demand vs. time (s)]
Mem-cached: ideal access rate for given physical machine
FS: ideal access rate w/ read-once, never write
Prefetching: ideal access rate w/ perfect prefetching
Figure annotations: 100Mb network; delay lower bound; even assuming no latency
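A capacity-only delay bound of the kind this figure points at can be reasoned about as follows: if a run with the image fully cached needs D(t) bytes by time t, a link of capacity C cannot deliver them before D(t)/C, so the run is delayed by at least the largest value of D(t)/C − t, even with zero latency and perfect prefetching. The sketch below encodes that reasoning; the function name, the sample demand points, and the reading of "100Mb" as 100 Mbit/s are assumptions, not values from the slides.

```python
def delay_lower_bound(demand, capacity_bytes_per_s):
    """Lower bound on added runtime imposed by link capacity alone.

    demand: list of (time_s, cumulative_bytes) points taken from a
            memory-cached run (hypothetical measurement format).
    """
    return max([0.0] + [cum / capacity_bytes_per_s - t for t, cum in demand])

# Assumed: "100Mb" means 100 Mbit/s, i.e. 12.5e6 bytes per second.
LINK = 100e6 / 8
sample = [(1.0, 25e6), (10.0, 50e6)]   # hypothetical demand curve
print(delay_lower_bound(sample, LINK))
```

In the sample, 25 MB are needed one second into the run but take two seconds to transfer, so at least one second of delay is unavoidable.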

Solution: Prefetch Perfectly

P2P Challenge: Image Access Highly Nonlinear

Solution: Generate Profile Using Statistical Ordering
Collect access patterns for VM/workload
Determine expected accesses
– Divide accesses into blocks
– Sort by average access time
– Remove blocks accessed in small fraction of runs
Encode new order in profile
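The profile-generation steps listed on this slide can be sketched as below. The input format (one dict per run, mapping block index to its first access time) and the cut-off `min_fraction` are assumptions made for illustration; the slides do not specify either.

```python
from collections import defaultdict

def build_profile(runs, min_fraction=0.5):
    """Order blocks by average access time across recorded runs.

    runs:         list of dicts, block index -> access time in seconds
                  (hypothetical trace format)
    min_fraction: hypothetical threshold; blocks seen in a smaller
                  fraction of runs are dropped from the profile
    """
    times = defaultdict(list)
    for run in runs:
        for block, t in run.items():
            times[block].append(t)
    # Keep only blocks accessed in enough runs, with their mean access time.
    mean_time = {
        block: sum(ts) / len(ts)
        for block, ts in times.items()
        if len(ts) / len(runs) >= min_fraction
    }
    # The profile is the block order sorted by mean access time.
    return sorted(mean_time, key=lambda b: (mean_time[b], b))

runs = [
    {0: 5.0, 7: 1.0, 3: 2.0},
    {0: 7.0, 7: 1.5, 9: 0.1},
    {0: 6.0, 7: 0.5, 3: 4.0},
]
print(build_profile(runs))   # block 9 appears in only one of three runs
```

Dropping rarely accessed blocks keeps the prefetcher from spending bandwidth on pieces that most runs never touch.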

P2P Challenge: In-order Profile Prefetch Inefficient
E.g., during boot storm
⇒ All actively fetch same small set of pieces
⇒ Low piece diversity
⇒ Little opportunity for peers to share
⇒ Low swarming efficiency

Solution: Randomization and Throttling
Randomize prefetch order
Rate limiting (based on priority)
Deadline-based throttling
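The slide does not say how the randomization or the rate limiting is done, so the following is only one plausible sketch. It shuffles the profile within fixed-size windows (so peers fetch different pieces while staying near the expected access order) and applies a token bucket to prefetch requests while letting demand requests through. The names `randomized_order` and `PrefetchLimiter`, the window size, and the token-bucket choice are all assumptions; deadline-based throttling is not covered.

```python
import random

def randomized_order(profile, window=4, rng=random):
    """Shuffle the profile order within consecutive windows."""
    order = []
    for i in range(0, len(profile), window):
        chunk = list(profile[i:i + window])
        rng.shuffle(chunk)
        order.extend(chunk)
    return order

class PrefetchLimiter:
    """Token bucket for prefetch requests; demand requests always pass."""

    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now, is_demand=False):
        if is_demand:
            return True   # priority-based: never throttle a stalled read
        # Refill tokens for the time elapsed since the last prefetch check.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A larger window increases piece diversity across peers at the cost of fetching some pieces further ahead of when the VM is expected to need them.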

VMTorrent Architecture
[Diagram labels: VM, VMM, Hardware/OS, Custom FS, P2P Manager, profile, Swarm]

VMTorrent Architecture
[Diagram labels: VM, VMM, Hardware/OS, Custom FS, P2P Manager, profile, Swarm; VMTorrent Instance; Unmodified VM & VMM]

Evaluation

VMTorrent Prototype
Custom FS: custom C using FUSE
P2P Manager: custom C++ & Libtorrent
[Diagram labels: VM, Hardware/OS, Custom FS, P2P Manager, profile, BT Swarm]

Emulab Testbed*
Up to 101 modern hardware nodes
One VMTorrent instance per node
100Mb LAN
* [White:2002]

VMs

Workloads
VDI-like tasks

Data Presentation
We use normalized runtime (boot through shutdown)
Normalized against memory-cached execution
Allows easy cross-comparison for different VM/workload combinations
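Read literally, the normalization described on this slide is a ratio of a run's duration to the duration of the same VM/workload executed from memory. The helper below is a sketch under that reading; the function name and the sample timings are hypothetical.

```python
def normalized_runtime(runtime_s, memcached_runtime_s):
    """Runtime (boot through shutdown) relative to memory-cached execution."""
    if memcached_runtime_s <= 0:
        raise ValueError("memory-cached runtime must be positive")
    return runtime_s / memcached_runtime_s

# Hypothetical timings: 90 s over the network vs. 60 s memory-cached.
print(normalized_runtime(90.0, 60.0))
```

A value of 1.0 would mean the run was as fast as the memory-cached baseline, which is what makes results comparable across different VM/workload combinations.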

Flash Crowd
Ubuntu VM
Boot-Shutdown task
Immediate peer departure

Flash Crowd

Hypothesis
Larger swarm -> longer time until full swarm efficiency
Demand prioritization -> relative loss of prefetch piece diversity -> lower swarm efficiency

Swarming Efficiency (flash crowd)
[Figure; axis label: n]

Staggered Arrival