
CSE 691: Energy-Efficient Computing Lecture 4 SCALING: stateless vs. stateful Anshul Gandhi 1307, CS building


1 CSE 691: Energy-Efficient Computing Lecture 4 SCALING: stateless vs. stateful Anshul Gandhi 1307, CS building anshul@cs.stonybrook.edu

2 autoscale paper

3 Data Centers. Collection of thousands of servers; stores data and serves user requests. [Figure: Facebook data center in Oregon]

4 Power is expensive. Most power is actually wasted! [energystar.gov, McKinsey & Co., Gartner] Annual US data centers: 100 billion kWh = $7.4 billion. As much CO2 as all of Argentina. Google investing in power plants.

5 Provisioning for peak? [Figure: demand over time] Servers are only busy 30% of the time on average, but they're often left on, wasting power. A lot of power is actually wasted: BUSY server: 200 W; IDLE server: 140 W; OFF server: 0 W (Intel Xeon E5520, dual quad-core, 2.27 GHz). Setup cost: 260 s at 200 W (+ more).

6 Problem statement. [Figure: unpredictable demand over time] Given unpredictable demand, how do we provision capacity to minimize power consumption without violating response time guarantees (95th percentile)? 1. Turn servers off: save power. 2. Release VMs: save rental cost. 3. Repurpose: additional work done.

7 Experimental setup: 28 application servers, 7 caching servers (key-value store), 1 database server (500 GB). Response time: time taken to complete a request. A single request takes 120 ms and touches 3000 KV pairs.

8 Experimental setup (same testbed: 28 application servers, 7 caching servers, 1 database server with 500 GB). Goal: provision capacity to minimize power consumption without violating the response time SLA. SLA: T95 < 400–500 ms.

9 AlwaysOn: a static provisioning policy. Knows the maximum request rate into the entire data center (r_max = 800 req/s). What request rate can each server handle? [Figure: 95th-percentile response time (ms) vs. arrival rate (req/s); one server stays under the 400 ms SLA up to 60 req/s.]
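The AlwaysOn sizing rule follows directly from the two numbers on this slide (r_max = 800 req/s, 60 req/s per server at the 400 ms SLA); a minimal sketch, with illustrative function names:

```python
import math

def always_on_servers(r_max, per_server_rate):
    """AlwaysOn: statically provision enough servers for the peak rate."""
    return math.ceil(r_max / per_server_rate)

# 800 req/s peak, 60 req/s per server -> 14 servers, always powered on
print(always_on_servers(800, 60))  # 14
```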

10 AlwaysOn T 95 = 291ms P avg = 2,323W 10

11 Reactive: provision for the currently observed request rate. T95 = 11,003 ms, P_avg = 1,281 W. With x = 100% extra capacity provisioned: T95 = 487 ms, P_avg = 2,218 W.
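Reactive scaling can be sketched as provisioning for the current rate, optionally padded by x% headroom (the x = 100% case above); this is a simplified illustration, not the paper's exact controller:

```python
import math

def reactive_servers(current_rate, per_server_rate=60, extra_frac=0.0):
    """Provision for the rate observed right now, plus optional headroom.

    With extra_frac = 0 the policy saves power but pays the 260 s setup
    delay whenever load rises; extra_frac = 1.0 (x = 100%) hides the
    delay at the cost of roughly doubled power.
    """
    return math.ceil(current_rate * (1 + extra_frac) / per_server_rate)

print(reactive_servers(300))                   # 5 servers
print(reactive_servers(300, extra_frac=1.0))   # 10 servers
```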

12 Predictive: use a window of observed request rates to predict the request rate at time (t + 260) seconds; turn servers on/off based on this prediction. Linear Regression: T95 = 2,544 ms, P_avg = 2,161 W. Moving Window Average: T95 = 7,740 ms, P_avg = 1,276 W.
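The two predictive policies on this slide can be sketched as follows: both look at a window of recent request rates and extrapolate one setup time ahead. The window contents and sampling interval here are illustrative assumptions:

```python
def moving_window_average(rates):
    """Predict the future rate as the mean of the observed window."""
    return sum(rates) / len(rates)

def linear_regression_forecast(rates, steps_ahead):
    """Fit a least-squares line to the window and extrapolate forward."""
    n = len(rates)
    x_mean = (n - 1) / 2
    y_mean = sum(rates) / n
    cov = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(rates))
    var = sum((x - x_mean) ** 2 for x in range(n))
    slope = cov / var
    return y_mean + slope * ((n - 1) + steps_ahead - x_mean)

window = [100, 120, 140, 160, 180]   # req/s, one sample per minute
print(moving_window_average(window))          # 140.0 -- lags a rising trend
print(linear_regression_forecast(window, 4))  # 260.0 -- extrapolates the trend
```

The lag of the moving-window average on rising load is exactly why it under-provisions (T95 = 7,740 ms above), while regression tracks trends better.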

13 AutoScale [Gandhi et al., Allerton Conference on Communication, Control, and Computing, 2011] [Gandhi et al., Open Cirrus Summit, 2011]. Predictive and Reactive are too quick to turn servers off: if the request rate rises again, they have to wait for the full setup time (260 s). Two new ideas: 1. Wait for some time (t_wait) before turning idle servers off. Heuristic: Energy(wait) = Energy(setup), i.e., P_idle · t_wait = P_max · t_setup. 2. "Un-balance" load: pack jobs on as few servers as possible without violating SLAs. [Figure: 95th-percentile response time vs. jobs at a server; about 10 jobs/server stays within the SLA.]
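The t_wait heuristic equates the energy spent idling with the energy one setup would cost. Plugging in the power numbers from slide 5 (P_idle = 140 W, P_max = 200 W, t_setup = 260 s) gives a concrete timeout; a quick sketch:

```python
def wait_time(p_idle, p_max, t_setup):
    """Solve P_idle * t_wait = P_max * t_setup for t_wait."""
    return p_max * t_setup / p_idle

# Idling for ~371 s costs the same energy as one 260 s setup,
# so an idle server is kept on for about that long before powering off.
print(wait_time(p_idle=140, p_max=200, t_setup=260))  # ~371.4 s
```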

14 Results: Reactive vs. AutoScale. [Gandhi et al., International Green Computing Conference, 2012] [Gandhi et al., HotPower, 2011]

15 cachescale paper

16 Application in the Cloud: Load Balancer → Application Tier → Caching Tier → Database. λ req/s arrive; only λ_DB req/s reach the database (λ_DB << λ). Why have a caching tier? 1. Reduce database (DB) load.

17 Application in the Cloud. Why have a caching tier? 1. Reduce database (DB) load (λ_DB << λ). 2. Reduce latency. But the caching tier is more than 1/3 of the cost [Krioukov '10] [Chen '08] [Ousterhout '10], so: shrink your cache during low load.

18 Will cache misses overwhelm the DB? With hit rate p, λp req/s are served by the caching tier and λ(1 − p) = λ_DB req/s fall through to the database. Goal: keep λ_DB = λ(1 − p) low. If λ drops, (1 − p) can be higher, i.e., p can be lower → fewer caching servers → SAVE $$$.
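The relationship on this slide can be sketched directly: given the load the database can sustain, the minimum hit rate the shrunken cache must maintain follows from λ_DB = λ(1 − p). The rates below are illustrative:

```python
def db_load(total_rate, hit_rate):
    """Misses fall through to the database: lambda_DB = lambda * (1 - p)."""
    return total_rate * (1 - hit_rate)

def min_hit_rate(total_rate, db_capacity):
    """Smallest hit rate p that keeps the DB load within its capacity."""
    return max(0.0, 1 - db_capacity / total_rate)

print(db_load(800, 0.9))      # 80 req/s reach the DB
print(min_hit_rate(800, 80))  # p >= 0.9 needed at peak load
print(min_hit_rate(400, 80))  # p >= 0.8 when load halves -> smaller cache
```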

19 Are the savings significant? It depends on the popularity distribution. [Figure: hit rate p vs. % of data cached, for Uniform and Zipf.] Zipf: a small decrease in hit rate permits a large decrease in caching tier size. Uniform: the same decrease in hit rate permits only a small decrease in caching tier size.
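A rough sketch of why the distribution matters: under a uniform popularity distribution the hit rate equals the fraction of data cached, while under Zipf, caching a small fraction of (hot) items already captures most requests. The catalogue size and Zipf parameter below are illustrative assumptions, not from the slide:

```python
def uniform_hit_rate(frac_cached):
    """All items equally popular: hit rate equals fraction cached."""
    return frac_cached

def zipf_hit_rate(frac_cached, n_items=10_000, s=1.0):
    """Cache the k most popular items of a Zipf(s) catalogue of n items."""
    k = max(1, int(frac_cached * n_items))
    def harmonic(m):
        return sum(1 / i ** s for i in range(1, m + 1))
    return harmonic(k) / harmonic(n_items)

# Caching 10% of the data: 10% hit rate if uniform, far higher if Zipf,
# so a Zipf workload tolerates a much smaller caching tier.
print(uniform_hit_rate(0.10))            # 0.1
print(round(zipf_hit_rate(0.10), 2))     # roughly 0.7-0.8
```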

20 Is there a problem? Performance can temporarily suffer if we lose a lot of hot data. [Figure: mean response time (ms) vs. time (min); response time spikes when we shrink the cache, then stabilizes.]

21 What can we do about the hot data? We need to transfer the hot data before shrinking the cache. [Diagram: start state (full caching tier) → end state (shrunken caching tier).] Option 1: explicitly transfer the hot data from the departing servers to the remaining caching tier. Option 2: split the caching tier into "primary" and "retiring" servers, letting the retiring servers keep serving their hot data while it migrates.

