Model-Based Resource Provisioning for a Web Service Utility Ron Doyle*, Jeff Chase, Omer Asad, Wei Jin, Amin Vahdat Internet Systems and Storage Group.

Presentation transcript:

Model-Based Resource Provisioning for a Web Service Utility. Ron Doyle*, Jeff Chase, Omer Asad, Wei Jin, Amin Vahdat. Internet Systems and Storage Group, Department of Computer Science, Duke University. *

Internet Service Utilities: Shared server cluster. Web hosting centers. Shared reserve capacity to handle surges and failures. Service/load multiplexing. Dynamic provisioning. Service is contractual. Performance isolation. Differentiated service. SLAs.

Utility Resource Management: Goal: meet contractual service quality (SLA) targets under changing load; use resources efficiently. Approach: assign each hosted service a dynamic “slice” of resources. Combine “slivers” of shared servers, i.e., CPU time and memory. Mechanisms: Resource Containers [Banga99], VMware ESX [Waldspurger02], PlanetLab. Assign shares of storage server I/O throughput. Given the mechanisms for performance isolation and proportional sharing, how do we set the knobs?

Adaptive Multi-Resource Provisioning: This work addresses resource allocation policy for multiple resources, with a focus on memory and storage. 1. Provisioning: how much? [Muse SOSP01] 2. Assignment: which servers and storage units? (Figure labels: clients; utility data center; utility OS executive or service manager; actuator (directives); monitor (observations).)

Model-Based Provisioning: Resources interact in complex ways to determine overall service performance. Incorporate a model of application behavior. The model predicts the effects of candidate allotments. Plan allotments that are predicted to yield the desired behavior. Monitor load and adapt as load intensity varies. (Figure labels: resource manager; application models; candidate allotments; performance predictions; workload profiles, e.g., access locality; storage models.)

Goals: Research question: how can a resource manager incorporate these models when they exist? Manage multiple resources for diverse system goals. Meet SLA targets for response time. Use surplus to optimize global average response time, yield, or value. Adjust to constraints discovered during assignment. Storage-aware caching [Forney02]. Demonstrate that even simple models are a powerful basis for dynamic resource management.

Non-goals: We are NOT trying to: build better models (you can plug in your favorite); parameterize or adapt models online from system observations; manage network bandwidth; schedule resources within each slice; solve the assignment problem (bin-packing); allocate resources across the wide area; make probabilistic performance guarantees. Assume stable average-case behavior at each load level, and provision for average response time.

System Context: (Figure labels: clients; reconfigurable redirecting switch; server pool, stateless and interchangeable; storage tier; Muse [SOSP01]; MBRP; load and performance measures; offered load λ per service; configuration commands.)

Enforcing Slices: Our prototype uses the Dash Web server [Asad02] to enforce resource control for slices at user level. Dash is based on Flash [Pai99] and uses DAFS network storage. Asynchronous I/O from user space to a user-level cache. Low overhead (zero-copy, etc.) and user-level control. Fully asynchronous, event-driven server: “SEDA meets Click.” Independently size caches for co-hosted services. Request Windows [Jin03]: control the number of outstanding I/Os on a per-service basis. Dash is part of the utility’s trusted computing base.

A Simple Web Service Model: (Figure: requests arrive at rate λ and pass through the CPU, the object cache of size M, and storage, which sees arrival rate λ_S.) M yields hit rate H, so $\lambda_S = \lambda (1 - H)$. Streams of requests with stable average-case behavior per request class. Varying load intensity λ. Provision each stage, and M. Downstream demand grows and shrinks inversely with M. Bottlenecks limit demand downstream. Generalize to stages or tiers.
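The flow through these stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the optional `cpu_capacity` parameter, and the sample numbers are hypothetical, and a bottleneck is modeled crudely as a hard cap on the rate that reaches the cache.

```python
def storage_request_rate(lam, hit_rate, cpu_capacity=None):
    """Rate of requests that miss the cache and continue to storage.

    lam          -- offered request arrival rate (requests/s)
    hit_rate     -- cache hit rate H in [0, 1], determined by cache size M
    cpu_capacity -- optional CPU-stage limit (requests/s); a hypothetical
                    parameter used here to illustrate a bottleneck stage
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be in [0, 1]")
    # A saturated upstream stage caps the demand seen downstream.
    passed = lam if cpu_capacity is None else min(lam, cpu_capacity)
    # Only misses continue to storage: lambda_S = lambda * (1 - H).
    return passed * (1.0 - hit_rate)


if __name__ == "__main__":
    print(storage_request_rate(200.0, 0.75))                      # no bottleneck
    print(storage_request_rate(200.0, 0.75, cpu_capacity=100.0))  # CPU-bound
```

Raising H (by growing M) shrinks the storage demand, and a saturated CPU stage hides part of the offered load from storage.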

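Web Cache Model: Footprint of T objects with average size S. Size is independent of popularity. Cache holds M objects. Given Zipf popularity with parameter α and an LFU approximation, integrate over the Zipf PDF to obtain the hit rate: $H = \frac{1 - M^{1-\alpha}}{1 - T^{1-\alpha}}$. (Figure: hit rate H plotted against cache size M.)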
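The hit-rate formula follows from treating the Zipf popularity as a continuous density proportional to x^(-α) over ranks 1..T: the integral from 1 to M is (M^(1-α) - 1)/(1 - α), and dividing by the same integral taken up to T gives H = (1 - M^(1-α)) / (1 - T^(1-α)). A minimal Python sketch of that prediction follows; the function name, the clamping of M to [1, T], and the logarithmic limit used at α = 1 are choices made for this sketch, not details from the slides.

```python
import math


def zipf_hit_ratio(m, t, alpha):
    """Predicted hit ratio for an LFU-like cache of m objects.

    m     -- cache size in objects
    t     -- footprint: number of distinct objects (must exceed 1)
    alpha -- Zipf popularity parameter
    """
    if t <= 1:
        raise ValueError("footprint must exceed one object")
    m = min(max(m, 1.0), t)  # the continuous model is defined on [1, t]
    if math.isclose(alpha, 1.0):
        # Limit of the formula as alpha -> 1.
        return math.log(m) / math.log(t)
    e = 1.0 - alpha
    return (1.0 - m ** e) / (1.0 - t ** e)


if __name__ == "__main__":
    for size in (10, 100, 1000, 10000):
        print(size, round(zipf_hit_ratio(size, 100000, 0.9), 3))
```

The curve is concave in M: each additional unit of memory buys less hit rate than the one before it.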

Storage Arrival Rate (IOPS): (Figure: storage arrival rate in IOPS plotted against cache size M.) $\lambda_S = \lambda S (1 - H)$. Each miss requires S I/O operations. S determines the intensity of bulk I/O in this service’s storage load. The model predicts storage response time R_S for load λ_S given a per-service IOPS share φ. Account for prefetching and sequential locality indirectly.
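As a rough illustration, the storage demand can be computed directly from the miss rate. The slides do not spell out the storage response-time model, so the second function below substitutes a plain M/M/1 queue whose service rate equals the IOPS share (written φ here); that substitution, the symbol φ, and all function names are assumptions of this sketch.

```python
import math


def storage_iops_demand(lam, s, hit_rate):
    """lambda_S = lambda * S * (1 - H): each miss costs s I/O operations."""
    return lam * s * (1.0 - hit_rate)


def storage_response_time(lam_s, phi):
    """Stand-in M/M/1 estimate of per-I/O response time (seconds).

    lam_s -- storage arrival rate (I/O operations per second)
    phi   -- IOPS share granted to the service, used as the service rate
    NOTE: an assumed placeholder, not the storage model from the talk.
    """
    if lam_s >= phi:
        return math.inf  # the share is saturated
    return 1.0 / (phi - lam_s)


if __name__ == "__main__":
    demand = storage_iops_demand(lam=100.0, s=2.0, hit_rate=0.75)
    print(demand, storage_response_time(demand, phi=100.0))
```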

An Example Using Dash: IBM 2001 segment. Load λ grows during the trace segment. Dynamic cache resizing. Storage IOPS demand λ_S matches the model prediction (squint). A few transient shifts in request locality.

A Model-Based Allocator: MBRP is a package of three primitives that coordinate with an assignment planner. Candidate: plan an initial allotment vector with CPU share and [M, φ]. LocalAdjust: adjust a vector to adapt to a resource constraint or surplus, while staying on target for response time. GroupAdjust: modify a set of vectors to adapt to a fixed resource constraint or surplus exposed during assignment. Use any surplus to meet system-wide goals.

Candidate: There is a large space of possible allotment vectors to meet a given response time target. Simplify the search space with a simple principle: build a balanced system. Set the CPU share and storage allotment φ to hit a preconfigured target utilization level ρ. The utilization ρ determines response time at storage and CPU. Select the minimum M and H that can hit the SLA target for overall response time. Refine φ based on M and H and the resulting λ_S. Converges quickly.
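One way to picture the Candidate step is the sketch below, written with φ for the storage allotment (IOPS) and ρ for the target utilization. It is a simplified stand-in, not the authors' algorithm: it assumes M/M/1-style response times of demand/(1 - ρ) at the CPU and per I/O, a fixed per-I/O service time, the Zipf cache model H = (1 - M^(1-α)) / (1 - T^(1-α)) with α ≠ 1, and it collapses the refinement of φ into a closed form. All names and parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Allotment:
    cpu_share: float  # CPU capacity granted, in CPUs (assumed unit)
    memory: float     # cache size M, in objects
    iops: float       # storage share phi, in I/O operations per second


def cache_size_for_hit_ratio(h, t, alpha):
    """Invert H = (1 - M^(1-a)) / (1 - T^(1-a)) for M (alpha != 1)."""
    e = 1.0 - alpha
    return (1.0 - h * (1.0 - t ** e)) ** (1.0 / e)


def candidate(lam, r_target, rho, alpha, s, t, cpu_demand, io_time):
    """Plan a balanced allotment that meets r_target at utilization rho.

    cpu_demand -- CPU seconds per request (assumed known)
    io_time    -- device seconds per I/O operation (assumed known)
    """
    if alpha == 1.0:
        raise ValueError("this sketch requires alpha != 1")
    # Response time per visit at the target utilization (M/M/1 assumption).
    r_cpu = cpu_demand / (1.0 - rho)
    r_io = io_time / (1.0 - rho)
    if r_cpu >= r_target:
        raise ValueError("target unreachable at this utilization")
    # Overall response time: r_cpu + (1 - H) * s * r_io <= r_target.
    miss_budget = (r_target - r_cpu) / (s * r_io)
    h = max(0.0, 1.0 - miss_budget)           # minimum hit rate needed
    m = cache_size_for_hit_ratio(h, t, alpha)  # minimum memory for that H
    lam_s = lam * s * (1.0 - h)                # resulting storage load
    # Size CPU and storage shares so both run at utilization rho.
    return Allotment(cpu_share=lam * cpu_demand / rho,
                     memory=m,
                     iops=lam_s / rho)


if __name__ == "__main__":
    print(candidate(lam=100.0, r_target=0.03, rho=0.5, alpha=0.9,
                    s=2.0, t=1e5, cpu_demand=0.005, io_time=0.01))
```

Fixing ρ first removes two degrees of freedom, which leaves memory as the single quantity to solve for.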

LocalAdjust: (Graph labels: Candidate, LocalAdjust.) LocalAdjust adapts to a constraint in one resource by adding more of another. Take as much as you can of the constrained resource, then rebalance to meet the SLA target. E.g., in this graph it grows memory to respond to an IOPS constraint. Note: it’s not linear.
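The memory-for-IOPS trade can be sketched as follows. The sketch assumes the Zipf cache model H = (1 - M^(1-α)) / (1 - T^(1-α)) with α ≠ 1, storage demand λ_S = λS(1 - H), and a storage share that should run at a target utilization ρ; the function name and parameters are hypothetical, and the real primitive also handles surpluses and other resources.

```python
def memory_for_iops_cap(lam, s, t, alpha, rho, iops_cap):
    """Cache size (objects) needed when storage is capped at iops_cap.

    Takes the whole capped IOPS share, then grows memory until the
    remaining miss traffic fits within it at utilization rho.
    """
    if alpha == 1.0:
        raise ValueError("this sketch requires alpha != 1")
    sustainable = rho * iops_cap  # storage load the cap can absorb
    # Need lam * s * (1 - H) <= sustainable, so H >= 1 - sustainable/(lam*s).
    h = max(0.0, 1.0 - sustainable / (lam * s))
    # Invert the cache model to find the memory that yields hit rate h.
    e = 1.0 - alpha
    return (1.0 - h * (1.0 - t ** e)) ** (1.0 / e)


if __name__ == "__main__":
    for cap in (400.0, 200.0, 100.0, 50.0):
        m = memory_for_iops_cap(100.0, 2.0, 1e5, 0.9, 0.5, cap)
        print(cap, round(m))
```

In this model, halving the IOPS cap much more than doubles the memory needed, which is the nonlinearity the slide points out.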

GroupAdjust: Input: a set of allotment vectors, with a group constraint or surplus. E.g., the planner mapped all vectors to a shared server, leaving surplus memory. Adapt the vectors to conform to the constraint, or use the surplus to meet a global goal. E.g., for services with the same profiles (α, S, T), prefer the service with the heaviest load.
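A greedy reading of the surplus case is sketched below: surplus memory is handed out one unit at a time to the service whose load-weighted hit rate improves most. With identical profiles and equal starting allotments this favors the heaviest load, as the slide suggests. The greedy rule, the unit granularity, and all names are assumptions of this sketch; it also assumes every service shares one profile (α, T) with α ≠ 1.

```python
def hit_ratio(m, t, alpha):
    """Zipf/LFU cache model: H = (1 - M^(1-a)) / (1 - T^(1-a))."""
    m = min(max(m, 1), t)
    e = 1.0 - alpha
    return (1.0 - m ** e) / (1.0 - t ** e)


def group_adjust(loads, memory, surplus, unit, t, alpha):
    """Distribute surplus memory greedily across co-hosted services.

    loads   -- request rates, one per service
    memory  -- current cache allotments (objects), one per service
    surplus -- spare memory (objects) exposed during assignment
    unit    -- allocation granularity (objects)
    """
    memory = list(memory)
    for _ in range(int(surplus // unit)):
        # Extra hits per second each service would gain from one more unit.
        gains = [lam * (hit_ratio(m + unit, t, alpha) - hit_ratio(m, t, alpha))
                 for lam, m in zip(loads, memory)]
        best = max(range(len(gains)), key=gains.__getitem__)
        if gains[best] <= 0.0:
            break  # every cache already covers its footprint
        memory[best] += unit
    return memory


if __name__ == "__main__":
    print(group_adjust([100.0, 300.0], [1000, 1000],
                       surplus=2000, unit=100, t=100000, alpha=0.9))
```

Because the hit-rate curve is concave, the greedy choice tends to equalize marginal benefit across services rather than starving the lighter ones indefinitely.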

Example: Differentiated Service. Four identical services: same load λ, same profiles (α, S, T), same storage units. Different SLA targets. Provision memory to meet targets first, then optimize global response time. (Give the next unit of surplus memory to the most constrained service.)

Some Other Results in the Paper: 1. GroupAdjust for services with different profiles and equivalent loads: prefer higher-locality services. 2. Simple dynamic example to optimize for global response time in a storage-aware fashion. 3. “Putting it all together” experiment: adjust to changes in locality, SLA targets, and available resources as well as changes in load. 4. Handle overload by shifting a co-hosted service to another server (bin-packing assignment). 5. Preliminary evaluation of the storage model.

Conclusion: Models are important for self-managing systems. MBRP shows how to use models to adapt proactively: respond to a changing load signal, rather than reacting to off-target performance measures. It’s easy to plug better models into the framework. It seems clear that we can generalize this to a broader class of systems (e.g., multi-tier) and system goals (e.g., availability). But: models may be brittle or just plain wrong (HAL). Self-managing systems will combine proactive and reactive mechanisms.

Assignment Planning: Map services to servers and storage units. Allocator primitives work in concert with assignment planning. Bin-packing services, balancing affinity, migration costs, and local constraints/surplus.

Related Work: Proportional-share schedulers: mechanism to enforce provisioning policies. Resource Containers [Banga99], Cluster Reserves [Aron00]. Response-time schedulers: meet SLA targets without explicit partitioning/provisioning. Neptune [Shen02], Facade [Lumb03]. Adaptive resource management for servers: reactive, feedback-based adjustment of server resources. Web Server Performance Guarantees [Abdelzaher02], Predictable Web Server QoS [Aron-PhD], SEDA [Welsh01]. Memory/storage management: goal-directed allotment of resources to services. Storage-Aware Caching [Forney02], Value-Sensitive Caching [Kelly99], Hippodrome [Anderson02].

Multiple Shared Resources: Bottleneck behavior: non-bottleneck resource adjustments have little effect. Global constraints: services compete for resources in a zero-sum game. Local constraints: service assignment to nodes exposes local resource constraints. Caching: memory allotment affects storage load for a single service, impacting the resources available to other services.

Adaptive Resource Provisioning: Utility OS. Services: predictable average-case response time; resource intensive. Workload models predict resource demand, resource interaction, and the effect of allotment decisions. Framework is reactive to changes in workload characteristics for dynamic adaptation.

Outline: Overview; Resource control mechanisms; Web Service Models; Model-Based Allocator; Conclusions.