
Enabling Cost-Effective Resource Leases with Virtual Machines Borja Sotomayor University of Chicago Ian Foster Argonne National Laboratory/


1 Enabling Cost-Effective Resource Leases with Virtual Machines Borja Sotomayor University of Chicago Ian Foster Argonne National Laboratory/ University of Chicago Tim Freeman Argonne National Laboratory/ University of Chicago Kate Keahey Argonne National Laboratory/ University of Chicago HPDC 2007 Hot Topics Session

2 Motivation
- Leasing resources for short periods of time can be of great value to many applications.
  - Workflows, real-time applications, and applications requiring resource co-scheduling.
- Leasing semantics
  - The glidein approach: Condor glideins, MyCluster, and Falkon
  - Advance reservations: meta-scheduling, deadlines, demos; utilization problems
- We argue that virtualization can make resource leasing cost-effective, despite the overhead of using VMs, thus:
  - Providing an incentive for resource providers to allow short-term leasing of resources.
  - Creating an opportunity for scientific applications (resource consumers) that require multi-level scheduling.

3 Approach
- Separate resource provisioning from execution management.
  - Resource provisioning is handled by a new component called the Lease Manager.
  - Execution management can continue to be handled by a site's current scheduler (PBS/Maui, SGE, Condor, ...).
- All provisioning is handled via the use of VMs, including provisioning resources for a batch job.
- Use VM suspend/resume mechanisms to backfill and suspend non-interactive/batch applications.
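The effect of suspend/resume when planning around an advance reservation (AR) can be sketched on a single node. This is a toy model, not the Lease Manager from the talk: the function names, the first-fit backfill rule, and the assumption that suspending and resuming cost nothing are all hypothetical choices made for illustration.

```python
def makespan_backfill(durations, ar_start, ar_end):
    """Jobs cannot be interrupted: a job may start before the reservation
    only if it also finishes before it (hypothetical first-fit backfill)."""
    pending = list(durations)
    t = 0
    while True:
        # Jobs that still fit in the gap between now and the reservation.
        fitting = [d for d in pending if t + d <= ar_start]
        if not fitting:
            break
        pending.remove(fitting[0])
        t += fitting[0]
    if not pending:
        return t
    # Everything that did not fit runs back to back after the reservation.
    return ar_end + sum(pending)


def makespan_suspend_resume(durations, ar_start, ar_end):
    """Jobs run in VMs: whatever is running at ar_start is suspended and
    resumed at ar_end (suspend/resume cost assumed to be zero)."""
    total = sum(durations)
    if total <= ar_start:
        return total
    return total + (ar_end - ar_start)


long_jobs = [15, 15, 15]   # minutes
short_jobs = [5] * 9       # same total work, shorter requests
print(makespan_backfill(long_jobs, 40, 60))         # 75
print(makespan_suspend_resume(long_jobs, 40, 60))   # 65
print(makespan_backfill(short_jobs, 40, 60))        # 65
print(makespan_suspend_resume(short_jobs, 40, 60))  # 65
```

With long requests the node sits idle for 10 minutes before the reservation unless the running job can be suspended; with short requests backfilling already fills the gap, so suspend/resume gains nothing in this toy case.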

4 [Diagram labels: LRM; Lease Manager; Execution Manager; Short-term leases; Batch computation; VMM-enabled Worker Nodes]

5 [Figure: scheduling a short-term lease without using virtualization vs. scheduling the lease using virtualization]

6 Experiment Setting
- Simulated testbed of 8 nodes connected by a 100 Mbps network, such that at most two VMs can run simultaneously on one node.
- We consider the best and worst cases.
- Traces
  - Artificial traces, combining serial batch requests and advance reservations (ARs)
  - Would require 10h to run on the testbed (assuming perfect utilization)
- VM runtime overhead assumed to be 10%
- Experiments
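The numbers in this setting imply a few derived quantities, worked out below. This is a back-of-the-envelope sketch: the constant and function names are hypothetical, and modelling the 10% runtime overhead as a uniform slowdown of every request is an assumption, not something the slides state.

```python
NODES = 8                  # simulated testbed size (from the slide)
VMS_PER_NODE = 2           # at most two VMs per node (from the slide)
TRACE_MINUTES = 10 * 60    # trace length under perfect utilization
OVERHEAD_PERCENT = 10      # assumed VM runtime overhead


def max_concurrent_vms(nodes=NODES, per_node=VMS_PER_NODE):
    """Upper bound on VMs running at once across the testbed."""
    return nodes * per_node


def node_minutes(nodes=NODES, minutes=TRACE_MINUTES):
    """Total work in the trace, in node-minutes."""
    return nodes * minutes


def with_runtime_overhead(minutes, overhead_percent=OVERHEAD_PERCENT):
    """Running time if every request is slowed down uniformly (assumption).
    Integer arithmetic avoids floating-point rounding."""
    return minutes * (100 + overhead_percent) // 100


print(max_concurrent_vms())                  # 16
print(node_minutes())                        # 4800
print(with_runtime_overhead(TRACE_MINUTES))  # 660
```

Under that assumption, a perfectly utilized 10h trace would need at least 11h (660 minutes) once everything runs inside VMs, which is the gap that better scheduling has to close.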

7 Experiment I
- Is using VMs for suspend/resume backfill worth the overhead?
- Assumption: we are using only one VM image.
- Prototype scheduler supporting batch serial requests and advance reservations, using backfilling or suspend/resume to plan around the ARs.
- Reference: B. Sotomayor, "A Resource Management Model for VM-Based Virtual Workspaces", Master's paper, University of Chicago, February 2007.

8 Best-case trace
- Trace characteristics
  - Duration of batch requests: avg = 15 min
  - AR resource consumption: 75% - 100%
  - Proportion of Batch/AR: 75%/25%
- Benefits from suspend/resume because the large number of relatively long batch requests limits the efficiency of backfilling.

9 One Image (best case) Baseline Not using VMs (no runtime overhead) and backfilling instead of suspend/resume

10 One Image (best case) Add Runtime Overhead Running inside a VM adds runtime overhead, but not a big hit since images are predeployed.

11 One Image (best case) Use Suspend/Resume Allows for better resource utilization than backfilling, even better than baseline (because of long batch requests)

12 Worst-case trace Same as previous trace, but with shorter batch requests (avg=5 minutes) This also entails that there are more batch requests, since the total running time of the trace is still 10h With a large number of relatively short requests, backfilling is already very effective, and little is gained from suspend/resume. Furthermore, many more images have to be deployed in this case, which increases the preparation overhead.
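The claim that shorter requests mean more requests (and therefore more image deployments) is simple arithmetic, sketched below. The batch budget of 3600 node-minutes is a hypothetical placeholder, not a figure from the slides, and one deployment per request is also an assumption; only the two average durations come from the traces.

```python
def request_count(total_batch_minutes, avg_minutes):
    """Number of serial batch requests needed to fill a fixed amount of work."""
    return total_batch_minutes // avg_minutes


BATCH_BUDGET = 3600  # node-minutes; hypothetical value for illustration only

best_case = request_count(BATCH_BUDGET, 15)   # avg = 15 min requests
worst_case = request_count(BATCH_BUDGET, 5)   # avg = 5 min requests

print(best_case)                # 240
print(worst_case)               # 720
print(worst_case // best_case)  # 3
```

Whatever the actual budget, cutting the average duration from 15 to 5 minutes triples the number of requests, so any per-request preparation cost is paid three times as often.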

13 One Image (worst case) Baseline Not using VMs (no runtime overhead) and backfilling instead of suspend/resume

14 One Image (worst case) Add Runtime Overhead Running inside a VM adds runtime overhead, but not a big hit since images are predeployed.

15 One Image (worst case) Use Suspend/Resume Doesn't provide any significant advantage over backfilling because of short batch requests.

16 Experiment II
- How much do we pay for the added flexibility of operating in multiple virtualized environments?
- Assumption: we are using multiple images.
- Scheduler also has application-specific knowledge (i.e., it knows it is scheduling VMs), so it is able to also schedule timely VM image transfer.
- Image reuse strategies: realistically, not all images will be different.
- Modification of Experiment I
  - Use 37 possible 600 MB VM images.
  - 7 images account for 70% of requests.
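A rough estimate of what one image transfer costs helps put the deployment overhead in perspective. The sketch below assumes 1 MB = 10^6 bytes, a transfer that gets the full 100 Mbps link to itself, and no protocol overhead; these are simplifications, and the function names are hypothetical. The 15-minute and 5-minute durations are the average batch request lengths of the best-case and worst-case traces.

```python
def transfer_seconds(image_mb, link_mbps):
    """Ideal transfer time: megabytes -> megabits, divided by link speed."""
    return image_mb * 8 / link_mbps


def overhead_fraction(transfer_s, job_minutes):
    """Transfer time relative to the running time of one request."""
    return transfer_s / (job_minutes * 60)


t = transfer_seconds(600, 100)
print(t)                                    # 48.0
print(round(overhead_fraction(t, 15), 3))   # 0.053
print(round(overhead_fraction(t, 5), 3))    # 0.16
```

Even in this ideal case, a transfer adds about 5% to a 15-minute request but 16% to a 5-minute one, which is consistent with deployment overhead mattering more in the worst-case trace and with image reuse being most valuable there.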

17 Multiple Images (best case) Baseline Not using VMs (no runtime overhead) and backfilling instead of suspend/resume

18 Multiple Images (best case) Transferring images Adds deployment overhead which delays starting time of batch requests.

19 Multiple Images (best case) Adding Runtime Overhead Makes running time even larger

20 Multiple Images (best case) Use Suspend/Resume Better resource utilization compensates for deployment overhead.

21 Multiple Images (best case) Image Reuse Improves performance slightly.

22 Multiple Images (worst case) Baseline Not using VMs (no runtime overhead) and backfilling instead of suspend/resume

23 Multiple Images (worst case) Transferring images Adds deployment overhead which delays starting time of batch requests.

24 Multiple Images (worst case) Adding Runtime Overhead Relatively small performance hit (the least of our concerns here)

25 Multiple Images (worst case) Use Suspend/Resume Doesn't improve significantly over backfilling, which already does a good job thanks to the presence of small batch requests

26 Multiple Images (worst case) Image Reuse Compensates for deployment overhead. Still not as good as baseline, but relatively small difference

27 Conclusions
- Using virtualization can make short-term leasing with interesting semantics cost-effective, even in the presence of runtime overhead.
- Given reasonable strategies for managing deployment overhead, the cost of using multiple images is acceptable.
- However, only artificial stress traces have been used so far.
- Preliminary results with real traces suggest that short-term leases can be integrated into real workloads and still be cost-effective (we will release these results as soon as they're solid).

28 Ongoing Work
- Develop a better scheduler
  - Handle parallel batch submissions
  - Integrate this virtualized resource manager with existing LRM
  - This work is our top-down effort
- We also have a bottom-up effort
  - Better modeling of traces
    - Based on real world batch submissions
    - Non-uniform overhead
  - Understanding VM overhead in practice
    - Virtualization in Practice:

29 Questions? Borja Sotomayor University of Chicago Ian Foster Argonne National Laboratory/ University of Chicago Tim Freeman Argonne National Laboratory/ University of Chicago Kate Keahey Argonne National Laboratory/ University of Chicago

