
Slide 1: UNIVERSITY OF MASSACHUSETTS, AMHERST – Department of Computer Science
Dynamic Resource Management in Internet Data Centers
Bhuvan Urgaonkar
Laboratory for Advanced Systems Software, University of Massachusetts Amherst
http://www.cs.umass.edu/~bhuvan

Slide 2: Internet Applications
 Proliferation of Internet applications: auction sites, online games, online stores
 Growing significance in personal and business affairs
 Focus: Internet server applications

Slide 3: Internet Workloads Are Dynamic
 Multi-time-scale variations
   Time-of-day, hour-of-day
 Flash crowds
 User threshold for response time: 8-10 seconds
 Key issue: provide good response time under varying workloads
[Figures: request rate (up to ~140K req/min) vs. time of day in hours, and arrivals per minute (up to ~1200) vs. time in days during a flash crowd]

Slide 4: Data Centers
 Clusters of servers
 Hosting platforms:
   Rent resources to third-party applications
   Performance guarantees in return for revenue
 Benefits:
   Applications: no need to maintain their own infrastructure
    o Rent server resources, possibly on demand
   Platform provider: generates revenue by renting resources

Slide 5: Goals of a Data Center
 Satisfy application performance guarantees under dynamic workloads
   E.g., average response time, throughput
 Maximize resource utilization
   E.g., maximize the number of hosted applications
Question: How should a data center manage its resources to meet these goals?

Slide 6: Manual Resource Allocation
 Resource over-provisioning
   Resource wastage
   A bad estimate can still result in under-allocation
 Manual reallocation
   Slow allocation time
Challenge: How to handle dynamic workloads while efficiently utilizing resources?
[Figure: workload trace from the 1998 World Cup soccer site]

Slide 7: Dynamic Resource Management
 How to map an application to servers in the data center? (Application Placement [OSDI'02, PDCS'04])
 How to provide good performance under dynamic workloads? (Dynamic Capacity Provisioning [Autonomic Computing'05])
 How to remain operational under extreme overloads? (Scalable Policing [World Wide Web'05])


Slide 9: Talk Outline
 Motivation
 Data Center Models
 Application Placement
 Dynamic Capacity Provisioning
 Summary and Future Research

Slide 10: Data Center Models
 Small applications
   Require only a fraction of a server
   Shared Web hosting, e.g., $20/month to run your own Web site
 Shared hosting: multiple applications on a server
   Co-located applications compete for server resources

Slide 11: Data Center Models
 Large applications
   May span multiple servers
   The eBay site uses thousands of servers!
 Dedicated hosting: at most one application per server
   Allocation at the granularity of a single server

Slide 12: Application Placement [OSDI'02]
 How to map an application to servers in the data center?
 Step 1: Find the application's resource requirements
   Automatic requirement-inference technique
 Step 2: Identify servers to host the application
   Easy in dedicated hosting
    o Just assign the desired number of available servers!
   Non-trivial in shared hosting
    o Opportunity for statistical multiplexing of resources on a server
    o A multi-dimensional knapsack problem
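The shared-hosting case above is a multi-dimensional knapsack, which is NP-hard; a minimal first-fit sketch conveys the flavor of a placement heuristic. The (cpu, net) requirement pairs and capacities here are illustrative, not taken from the talk's system.

```python
def place(apps, servers):
    """First-fit placement for the multi-dimensional packing problem.

    apps:    list of (cpu, net) requirement pairs
    servers: list of [cpu, net] residual-capacity pairs (mutated in place)
    Returns a list mapping app index -> server index, or None if unplaced.
    """
    assignment = []
    for cpu, net in apps:
        chosen = None
        for i, cap in enumerate(servers):
            # an app fits only if every resource dimension has room
            if cap[0] >= cpu and cap[1] >= net:
                cap[0] -= cpu   # reserve the CPU share
                cap[1] -= net   # reserve the network share
                chosen = i
                break
        assignment.append(chosen)
    return assignment
```

For example, `place([(0.3, 0.2), (0.5, 0.1), (0.4, 0.4)], [[1.0, 1.0], [1.0, 1.0]])` packs the first two applications onto server 0 and spills the third to server 1.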

Slide 13: Resource Requirement Inference
[Figure: an ON-OFF process of fractional resource usage over measurement intervals, and the resulting usage distribution shown as a PDF and as a CDF; the CDF marks the 0.99 quantile]

Slide 14: Requirement Inference Technique
 Profiling: the process of determining resource usage
   Run the application on an isolated server
   Subject the application to a realistic workload
   Determine CPU and network usage
 Uses the Linux trace toolkit [Yaghmour00]
   Tracks scheduling events and packet transmission times
 Implementation on a Linux cluster
   Apache Web server driven by SPECWeb99
   Streaming media server with VBR MPEG-1 clients
   Postgres database server
   Quake game server
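The profiling step above can be sketched in miniature: turn a trace of busy intervals into per-window fractional-usage samples, from which the empirical distribution on the next slide follows. The busy-interval trace format is a simplification I am assuming; the actual trace toolkit output is richer.

```python
def usage_fractions(busy_intervals, window, horizon):
    """Fraction of each `window`-length interval spent busy.

    busy_intervals: list of (start, end) times the resource was in use
    Returns one fractional-usage sample per window in [0, horizon).
    """
    samples = []
    t = 0.0
    while t < horizon:
        # total overlap of busy time with the window [t, t + window)
        busy = sum(max(0.0, min(e, t + window) - max(s, t))
                   for s, e in busy_intervals)
        samples.append(busy / window)
        t += window
    return samples

def percentile(samples, p):
    """p-th percentile (0-100) of the usage samples, nearest-rank."""
    ordered = sorted(samples)
    k = max(0, int(round(p / 100.0 * len(ordered))) - 1)
    return ordered[k]
```

Sorting the samples and reading off a high percentile yields exactly the CDF-based requirement used in the tail-provisioning argument that follows.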

Slide 15: Application Profiles
[Figures: probability distribution of CPU usage for an Apache Web server (50% cgi-bin) and of network bandwidth usage for a streaming media server with 20 clients; both are long-tailed]
 Observation: resource usage can be bursty
   Peak requirement is much higher than a high percentile
 Insight: provisioning for the tail can save resources!
   Under-provisioning of resources
   Occasional violations of resource guarantees

Slide 16: Controlled Resource Under-provisioning
 Allow applications to specify a violation tolerance V
 Provision for the (100-V)th percentile of resource usage
   Requirements do not necessarily peak simultaneously
    o Probability of violations is even less than V
   Similar to overbooking in the airline industry
 Determine which servers have enough capacity
   σ_k: application k's (100-V)th usage percentile; C: server capacity
   Σ_k σ_cpu,k ≤ C_cpu and Σ_k σ_net,k ≤ C_net
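The capacity test on this slide can be sketched directly: a server can host a set of applications if their (100-V)th usage percentiles sum to no more than its capacity on every resource dimension. The dict-based profile format is an assumption for illustration.

```python
def fits(profiles, capacity, tolerance):
    """Overbooking check: sum of (100 - tolerance)th percentiles <= capacity.

    profiles:  per-app dict of resource name -> sorted usage samples
    capacity:  dict of resource name -> server capacity
    tolerance: V, the tolerable violation percentage
    """
    for resource, cap in capacity.items():
        total = 0.0
        for app in profiles:
            samples = app[resource]
            # nearest-rank (100 - V)th percentile of this app's usage
            k = max(0, int(round((100 - tolerance) / 100.0 * len(samples))) - 1)
            total += samples[k]
        if total > cap:
            return False
    return True
```

With V = 0 the check degenerates to provisioning for the observed peak; even V = 1 can drop each bursty application's reservation from its peak to a far smaller 99th percentile, which is the source of the gains on the next slide.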

Slide 17: Resource Utilization Gains
[Figures: Apache Web servers placed (up to ~1400) and streaming media servers placed (up to ~350) vs. data center size (up to 140 servers), comparing no violations against a 1% violation tolerance]
 A 1% violation tolerance can more than double the number of placed applications!
 Small under-provisioning can yield large gains
 Bursty applications yield larger benefits

Slide 18: Impact of Under-provisioning on Application Performance
 Provisioning for the tail results in tolerable degradation
 Large resource savings are possible with small degradation

Slide 19: Application Placement: Summary [OSDI'02, PDCS'04]
 Server applications tend to have bursty resource usage
 Save resources in shared data centers running small applications
   Determine resource usage behavior
   Under-provision resources
   Controlled performance degradation
 Theoretical properties of application placement
   NP-hard; approximation algorithms

Slide 20: Talk Outline
 Motivation
 Data Center Models
 Application Placement
 Dynamic Capacity Provisioning
 Summary and Future Research

Slide 21: Dynamic Capacity Provisioning [Chandra03, Chase01]
 Key idea: increase or decrease allocated resources to handle workload fluctuations
 To handle increased workload:
   Shared hosting: increase the resource share
   Dedicated hosting: start replicas on additional servers
 Focus: dedicated hosting, large applications
Control loop: monitor workload → compute future demand → adjust allocation
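The monitor/predict/adjust loop above can be sketched in a few lines. The peak-of-recent-history predictor and the per-server capacity figure are stand-ins for the model-driven predictor and application models that the following slides describe.

```python
def servers_needed(predicted_rate, per_server_rate):
    """Smallest replica count that covers the predicted request rate."""
    return max(1, -(-predicted_rate // per_server_rate))  # ceiling division

def provisioning_step(history, per_server_rate):
    """One control iteration: predict next-interval load from recent
    observations (here, the peak of the last three samples) and return
    the new server allocation."""
    predicted = max(history[-3:])   # crude peak-based predictor
    return servers_needed(predicted, per_server_rate)
```

For example, if the last three measured rates were 100, 400, and 300 req/s and one replica sustains 100 req/s, the next allocation is 4 servers.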

Slide 22: Dynamic Capacity Provisioning
[Diagram: workload measurements flow from the Servers to a Monitor; Predictors turn the observed workload into a predicted workload; Application Models translate it into resource requirements; the Allocator issues server allocations]


Slide 24: Internet Application Architecture
 Multi-tier architecture (e.g., HTTP → J2EE → Database)
   Each tier uses services provided by its successor
 Session-based workloads
 Caching, replication
[Example: request processing in an online bookstore; a search for "moby" generates queries across the tiers, and the response lists Melville's 'Moby Dick' and music CDs by Moby]

Slide 25: Existing Application Models
 Models for Web servers [Chandra03, Doyle03]
   Do not model the Java server, database, etc.
 Black-box models [Kamra04, Ranjan02]
   Unaware of the bottleneck tier
 Extensions of single-tier models [Welsh03]
   Fail to capture interactions between tiers
 Existing models are inadequate for multi-tier Internet applications

Slide 26: Baseline Application Model [SIGMETRICS'05]
 The model consists of two components
   A sub-system to capture the behavior of clients
   A sub-system to capture request processing inside the application

Slide 27: Modeling Clients
 Clients think between successive requests
 An infinite-server system (Q_0) captures the think time Z
   Captures the independence of Z from processing in the application
[Diagram: N clients, each with think time Z, feeding requests into the application]

Slide 28: Modeling Request Processing
[Diagram: tiers 1 through M as queues Q_1..Q_M with service times S_1..S_M and transition probabilities p_1..p_M, where p_M = 1]
 Transitions are defined to capture the circulation of requests
   A request may move to the next queue or a previous queue
 Multiple requests are processed concurrently at each tier
   Processor-sharing scheduling discipline
 Caching effects are captured implicitly!

Slide 29: Putting It All Together
 A closed queuing model that captures a given number N of simultaneous sessions being served
[Diagram: the client station Q_0 with think time Z feeds the tier queues Q_1..Q_M, with service times S_1..S_M and transition probabilities p_1..p_M, p_M = 1]

Slide 30: Model Solution and Parameter Estimation [SIGMETRICS'05]
 Mean Value Analysis (MVA) algorithm
   Computes the mean response time
 Visit ratios
   Equivalent to transition probabilities for MVA
   V_i ≈ λ_i / λ_req; λ_req measured at the policer, λ_i from tier logs
 Service times
   Use the residence time X_i logged at tier i
   For the last tier, S_M ≈ X_M
   S_i = X_i − (V_(i+1) / V_i) · X_(i+1)
 Think time
   Measured at the entry point of the application
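The MVA step above can be made concrete. This is a textbook exact MVA for a single-class closed network: M tier queues with visit ratios V[i] and service times S[i], plus a delay station for the think time Z; it iterates the population up from 1 to N sessions. It is a sketch of the standard algorithm, not the talk's implementation.

```python
def mva(V, S, Z, N):
    """Exact Mean Value Analysis for a closed queuing network.

    V: visit ratios per tier, S: service times per tier,
    Z: think time, N: number of concurrent sessions.
    Returns the mean response time inside the application.
    """
    M = len(S)
    Q = [0.0] * M                     # mean queue length at each tier
    R = 0.0
    for n in range(1, N + 1):
        # residence time at tier i when one more session is added
        r = [V[i] * S[i] * (1.0 + Q[i]) for i in range(M)]
        R = sum(r)                    # total time inside the application
        X = n / (Z + R)               # session throughput (Little's law)
        Q = [X * r[i] for i in range(M)]
    return R
```

A quick sanity check: with a single tier, unit demand, and no think time, the response time for N sessions is N service times, as expected for a saturated closed queue.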

Slide 31: Evaluation of the Baseline Model
 Auction site RUBiS
 One server per tier: Apache, JBoss, MySQL
 Concurrency limits are not captured
[Figure: measured vs. predicted response time for the three-tier configuration]

Slide 32: Handling Concurrency Limits
 Requests may be dropped due to concurrency limits
 Need to model the finiteness of the queues!
[Diagram: the closed network of Q_0..Q_M, with dropped requests leaving the system]

Slide 33: Handling Concurrency Limits
 Approach: add subsystems to capture dropped requests
 Distinguish the processing of dropped requests
[Diagram: each tier queue Q_i gains a parallel drop path alongside the original closed network]

Slide 34: Response Time Prediction
 The enhanced model can capture concurrency limits

Slide 35: Query Caching at the Database
 Caching effects
   Captured by tuning V_i and/or S_i
 Bulletin-board site RUBBoS
   50 sessions
   SELECT SQL_NO_CACHE causes MySQL to not cache the response to a query
 More model enhancements
   Replication at tiers
   Multiple session classes

Slide 36: Dynamic Capacity Provisioning
[Diagram: the Monitor/Predictors/Application Models/Allocator architecture from slide 22, repeated as a transition]

Slide 37: Handling Unanticipated Workloads [World Wide Web'05]
 The allocation for an application may be insufficient
   Short-term fluctuations are difficult to predict
   Errors in parameter estimation may cause under-allocation
 Reactor: allocate additional servers on a time scale of a few minutes if
   the observed workload exceeds the predicted workload, or
   the request drop rate exceeds a threshold
   Repeated invocations may be needed
 Policer: if the incoming session rate exceeds current capacity
   Turn away excess sessions
   Highly scalable policing
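The policer and the reactor's trigger condition can be sketched as follows. The class and threshold names are illustrative; the actual policer is more elaborate (and, per the talk, highly scalable).

```python
class Policer:
    """Admit a new session only while active sessions stay within the
    capacity computed by the provisioner; turn away the excess."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.active = 0
        self.turned_away = 0

    def admit(self):
        """Try to admit one incoming session."""
        if self.active < self.capacity:
            self.active += 1
            return True
        self.turned_away += 1      # excess session turned away
        return False

    def depart(self):
        self.active = max(0, self.active - 1)

def reactor_should_add_servers(drop_rate, threshold):
    """Reactive trigger: allocate more servers when drops exceed threshold."""
    return drop_rate > threshold
```

In use, the provisioner raises `capacity` as replicas come online, so sessions turned away during a surge are admitted again once the reactor catches up.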

Slide 38: Prototype Data Center
 40+ Linux servers, gigabit switches
 Multi-tier applications
   Auction (RUBiS)
   Bulletin board (RUBBoS)
   Apache, JBoss (replicable); MySQL database
 Control plane: application placement, dynamic provisioning
 Per-server nucleus: request policing, resource monitoring, parameter estimation
[Diagram: a control-plane node coordinating server nodes, each running a nucleus alongside the applications and OS]

Slide 39: Dynamic Capacity Provisioning
 Auction application RUBiS
 Workload increased by a factor of 4 in 30 minutes
 Server allocations increased to match the increased workload
 Response time kept below 2 seconds
[Figures: workload, server allocations, and response time over the experiment]

Slide 40: Talk Outline
 Motivation
 Data Center Models
 Application Placement
 Dynamic Capacity Provisioning
 Summary and Future Research

Slide 41: Summary
Dynamic resource management in data centers:
 Application Placement
   Improve utilization by under-provisioning
 Dynamic Capacity Provisioning
   An analytical model for Internet applications
   Predictive provisioning
   Reactive provisioning
 Handling Extreme Overloads
   Scalable policing

Slide 42: Future Research Directions
Focus: large-scale emerging distributed systems
 Virtual-machine-based hosting
   Trade-off between fast switching and VM overheads
 Malicious flash crowds, DoS attacks
   Security mechanisms
 Sensor networks
   A constrained environment
   How to provide the desired performance to overlying applications?
 Mobile computing
   Resource-deficient clients
   How to design Internet servers for such clients?

Slide 43: Thank you!
More information at: http://www.cs.umass.edu/~bhuvan

Slide 44: Agile Switching Using Virtual Machine Monitors
 VMMs allow multiple "virtual" machines on a server
   E.g., Xen, VMware
 Use VMMs to enable fast switching of servers
   Switching time is limited only by residual sessions
[Diagram: a VMM hosting an active VM 1 alongside dormant VM 2 and VM 3; switching activates a dormant VM]



