Presentation is loading. Please wait.

Presentation is loading. Please wait.

VARUN GUPTA Carnegie Mellon University 1 Partly based on joint work with: Anshul Gandhi Mor Harchol-Balter Mike Kozuch (CMU) (CMU) (Intel Research)

Similar presentations


Presentation on theme: "VARUN GUPTA Carnegie Mellon University 1 Partly based on joint work with: Anshul Gandhi Mor Harchol-Balter Mike Kozuch (CMU) (CMU) (Intel Research)"— Presentation transcript:

1 VARUN GUPTA Carnegie Mellon University 1 Partly based on joint work with: Anshul Gandhi Mor Harchol-Balter Mike Kozuch (CMU) (CMU) (Intel Research)

2 2 Demand  (t) Time (t) ? Data Center Capacity PROBLEM: Want to turn servers OFF/ON to match  (t)… … and also minimize setup penalties!

3 3  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W

4 4  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W

5 5  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W

6 6  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W WITH SETUP: E[Power] = 51.9 kW E[Response time] = 13s NO SETUP: E[Power] = 14.4 kW E[Response time] = 1s

7 7  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W Can we do better than ON/OFF?. Add inertia while turning servers OFF

8 8 Turn OFF a server after idle for t wait t wait independent of  (t)! new arrival routed to the most recently busy (MRB) server arriving job turns a server ON if all servers busy

9 9 t wait timers

10 10 t wait timers

11 11 t wait timers

12 12 THEOREM: For Poisson arrivals, as the load, the number of ON servers is concentrated around --- Proof Intuition --- Step 1: An equivalent system view k jobs run on the “first” k servers Step 2: Analysis of idle periods of M/G/∞ Time from a k  (k-1) to the next (k-1)  k transition

13 13 Intuition for idle periods 1. # jobs  Normal with mean and variance  Events happen at rate  4. Mean idle period of server MRB  servers for any constant t wait ! In practice, we choose t wait to amortize setup cost

14 14 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs ON/OFF DELAYEDOFF

15 15 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs ON/OFF DELAYEDOFF

16 16 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs ON/OFF DELAYEDOFF DELAYEDOFF: E[Power] = 18.9 kW E[Response time] = 1.002s NO SETUP: E[Power] = 14.4 kW E[Response time] = 1s.

17 17 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs DELAYEDOFF SQ. ROOT W/ LOOKAHEAD (“optimal offline”)

18 18 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs DELAYEDOFF SQ. ROOT W/ LOOKAHEAD (“optimal offline”)

19 19 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs DELAYEDOFF DELAYEDOFF: E[Power] = 18.9 kW E[Response time] = 1.002s  w/LOOKAHEAD: E[Power] = 15.8 kW E[Response time] = 1.036s SQ. ROOT W/ LOOKAHEAD (“optimal offline”)

20 1. Speed scaling algorithms 2. A simple proxy for MRB routing 3. Heterogeneous servers 4. Managing Virtual Machines in the Cloud 5. Wear-leveling/Performance tradeoffs 20

21 21 Power P(s) Speed (s) Q: Optimal speed to balance energy-performance? A: your favorite metric =

22 22 MRB requires a lot of state updates Proxy policy Assign static ranks to servers Route a new arrival to highest ranked idle server Almost the same performance as MRB + easier to implement than MRB + easy to extend (MRB) Rank Rank- based

23 23 more efficient less efficient Assign ranks to servers based on efficiency (1 = most efficient) Route a new arrival to highest ranked idle server

24 24 2GB RAM 1GB

25 Split physical servers into “virtual” servers Assign static ranks to virtual servers Route a new VM request to highest ranked idle virtual server Turn the physical server OFF after each virtual server has idled for t wait

26  Problem: unpredictable demand and non-trivial setup costs  DELAYEDOFF : A new traffic- oblivious capacity scaling scheme  Extensions to real-world scenarios 26 Demand ? Time Max Capacity

27 27 THEOREM [MRB]: As the load, the number of ON servers is concentrated around THEOREM [Round-Robin]: As the load, for constant job sizes, the number of ON servers is N servers Arrival rate = λ


Download ppt "VARUN GUPTA Carnegie Mellon University 1 Partly based on joint work with: Anshul Gandhi Mor Harchol-Balter Mike Kozuch (CMU) (CMU) (Intel Research)"

Similar presentations


Ads by Google