# VARUN GUPTA Carnegie Mellon University 1 Partly based on joint work with: Anshul Gandhi Mor Harchol-Balter Mike Kozuch (CMU) (CMU) (Intel Research)

## Presentation on theme: "VARUN GUPTA Carnegie Mellon University 1 Partly based on joint work with: Anshul Gandhi Mor Harchol-Balter Mike Kozuch (CMU) (CMU) (Intel Research)"— Presentation transcript:

VARUN GUPTA Carnegie Mellon University 1 Partly based on joint work with: Anshul Gandhi Mor Harchol-Balter Mike Kozuch (CMU) (CMU) (Intel Research)

2 Demand  (t) Time (t) ? Data Center Capacity PROBLEM: Want to turn servers OFF/ON to match  (t)… … and also minimize setup penalties!

3  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W

4  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W

5  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W

6  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W WITH SETUP: E[Power] = 51.9 kW E[Response time] = 13s NO SETUP: E[Power] = 14.4 kW E[Response time] = 1s

7  (t) # Busy servers # Jobs Mean Job Size = 1s Setup delay = 200s Busy power = 240W Can we do better than ON/OFF?. Add inertia while turning servers OFF

8 Turn OFF a server after idle for t wait t wait independent of  (t)! new arrival routed to the most recently busy (MRB) server arriving job turns a server ON if all servers busy

9 t wait timers

10 t wait timers

11 t wait timers

12 THEOREM: For Poisson arrivals, as the load, the number of ON servers is concentrated around --- Proof Intuition --- Step 1: An equivalent system view k jobs run on the “first” k servers Step 2: Analysis of idle periods of M/G/∞ Time from a k  (k-1) to the next (k-1)  k transition

13 Intuition for idle periods 1. # jobs  Normal with mean and variance  2. 3. Events happen at rate  4. Mean idle period of server 1 2 3 MRB  servers for any constant t wait ! In practice, we choose t wait to amortize setup cost

14 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs ON/OFF DELAYEDOFF

15 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs ON/OFF DELAYEDOFF

16 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs ON/OFF DELAYEDOFF DELAYEDOFF: E[Power] = 18.9 kW E[Response time] = 1.002s NO SETUP: E[Power] = 14.4 kW E[Response time] = 1s.

17 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs DELAYEDOFF SQ. ROOT W/ LOOKAHEAD (“optimal offline”)

18 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs DELAYEDOFF SQ. ROOT W/ LOOKAHEAD (“optimal offline”)

19 Mean Job Size = 1s Setup delay = 200s Busy power = 240W  (t) # Busy+idle servers # Jobs DELAYEDOFF DELAYEDOFF: E[Power] = 18.9 kW E[Response time] = 1.002s  w/LOOKAHEAD: E[Power] = 15.8 kW E[Response time] = 1.036s SQ. ROOT W/ LOOKAHEAD (“optimal offline”)

1. Speed scaling algorithms 2. A simple proxy for MRB routing 3. Heterogeneous servers 4. Managing Virtual Machines in the Cloud 5. Wear-leveling/Performance tradeoffs 20

21 Power P(s) Speed (s) Q: Optimal speed to balance energy-performance? A: your favorite metric =

22 MRB requires a lot of state updates Proxy policy Assign static ranks to servers Route a new arrival to highest ranked idle server Almost the same performance as MRB + easier to implement than MRB + easy to extend (MRB) Rank 1 2 3 4 5 Rank- based

23 more efficient less efficient Assign ranks to servers based on efficiency (1 = most efficient) Route a new arrival to highest ranked idle server 1 2 3 4 5

24 2GB RAM 1GB

25 1 2 3 4 5 6 Split physical servers into “virtual” servers Assign static ranks to virtual servers Route a new VM request to highest ranked idle virtual server Turn the physical server OFF after each virtual server has idled for t wait

 Problem: unpredictable demand and non-trivial setup costs  DELAYEDOFF : A new traffic- oblivious capacity scaling scheme  Extensions to real-world scenarios 26 Demand ? Time Max Capacity

27 THEOREM [MRB]: As the load, the number of ON servers is concentrated around THEOREM [Round-Robin]: As the load, for constant job sizes, the number of ON servers is N servers Arrival rate = λ

Download ppt "VARUN GUPTA Carnegie Mellon University 1 Partly based on joint work with: Anshul Gandhi Mor Harchol-Balter Mike Kozuch (CMU) (CMU) (Intel Research)"

Similar presentations