1 Mor Harchol-Balter, CMU, Computer Sci. Alan Scheller-Wolf, CMU, Tepper Business Andrew Young, Morgan Stanley Surprising results on task assignment for.

Slides:

Advertisements

Similar presentations

Computing Truth Value.

Advertisements

Radii and Intervals of Convergence

CPU Scheduling Questions answered in this lecture: What is scheduling vs. allocation? What is preemptive vs. non-preemptive scheduling? What are FCFS,

Surveys and Questionnaires. How Many People Should I Ask? Ask a lot of people many short questions: Yes/No Likert Scale Ask a smaller number.

Lab Assignment 1 COP 4600: Operating Systems Principles Dr. Sumi Helal Professor Computer & Information Science & Engineering Department University of.

Page 1 Alan Scheller-Wolf Lunteren, The Netherlands January 16, 2013 Things I Thought I Knew about Queueing Theory, but was Wrong About (Part 2, Service.

CPU Scheduling CPU Scheduler Performance metrics for CPU scheduling

Scheduling. Main Points Scheduling policy: what to do next, when there are multiple threads ready to run – Or multiple packets to send, or web requests.

Page 1 Alan Scheller-Wolf Lunteren, The Netherlands January 15, 2013 Things I Thought I Knew About Queueing Theory, but was Wrong About: Part 1, Multiserver.

Queuing Theory For Dummies

Simulation Where real stuff starts. ToC 1.What, transience, stationarity 2.How, discrete event, recurrence 3.Accuracy of output 4.Monte Carlo 5.Random.

INFINITE SEQUENCES AND SERIES

OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.

1 Connection Scheduling in Web Servers Mor Harchol-Balter School of Computer Science Carnegie Mellon

Join-the-Shortest-Queue (JSQ) Routing in Web Server Farms

Fundamental Characteristics of Queues with Fluctuating Load VARUN GUPTA Joint with: Mor Harchol-Balter Carnegie Mellon Univ. Alan Scheller-Wolf Carnegie.

OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.

Effect of higher moments of job size distribution on the performance of an M/G/k system VARUN GUPTA Joint work with: Mor Harchol-Balter Carnegie Mellon.

Fundamental Characteristics of Queues with Fluctuating Load (appeared in SIGMETRICS 2006) VARUN GUPTA Joint with: Mor Harchol-Balter Carnegie Mellon Univ.

Effect of higher moments of job size distribution on the performance of an M/G/k system VARUN GUPTA Joint work with: Mor Harchol-Balter Carnegie Mellon.

CS 153 Design of Operating Systems Spring 2015 Lecture 10: Scheduling.

Copyright Kenneth M. Chipps Ph.D. 1 How To Do VLSM Last Update

Power Series Radii and Intervals of Convergence. First some examples Consider the following example series: What does our intuition tell us about the.

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc Chapter 9 Introduction to Hypothesis Testing.

Mechanisms for Making Crowds Truthful Andrew Mao, Sergiy Nesterko.

Multi-Unit Auctions with Budget Limits Shahar Dobzinski, Ron Lavi, and Noam Nisan.

1 Mor Harchol-Balter Computer Science Dept, CMU What Analytical Performance Modeling Teaches Us About Computer Systems Design.

Probability Review Thinh Nguyen. Probability Theory Review Sample space Bayes’ Rule Independence Expectation Distributions.

YOU HAVE TO KNOW WHEN THE LIGHT BULB IS REALLY ON! DID YOU GET THAT? WHAT’S THE ANSWER? The one doing all the work [talking], is the one doing all the.

1 WORkshop on Multiserver Scheduling (WORMS) Carnegie Mellon University Pittsburgh, PA April 18 and 19, 2004 Funded by NSF ALADDIN and Tepper WELCOME EVERYONE!

Tonga Institute of Higher Education Design and Analysis of Algorithms IT 254 Lecture 8: Complexity Theory.

Section 3.1: Proof Strategy Now that we have a fair amount of experience with proofs, we will start to prove more difficult theorems. Our experience so.

IT253: Computer Organization Lecture 3: Memory and Bit Operations Tonga Institute of Higher Education.

CPU Scheduling CSCI 444/544 Operating Systems Fall 2008.

1 The Effect of Heavy-Tailed Job Size Distributions on System Design Mor Harchol-Balter MIT Laboratory for Computer Science.

Hypothesis testing Summer Program Brian Healy. Last class Study design Study design –What is sampling variability? –How does our sample effect the questions.

Carnegie Mellon University Computer Science Department 1 OPEN VERSUS CLOSED: A CAUTIONARY TALE Bianca Schroeder Adam Wierman Mor Harchol-Balter Computer.

Graph Colouring L09: Oct 10. This Lecture Graph coloring is another important problem in graph theory. It also has many applications, including the famous.

CS 267: Automated Verification Lecture 3: Fixpoints and Temporal Properties Instructor: Tevfik Bultan.

LOGICAL REASONING FOR CAT 2009.

Lesson 2: Common Misconceptions. Misconception 1 “Christianity must be proven scientifically; I’ll accept Christianity when you prove it with the scientific.

Scheduling. Main Points Scheduling policy: what to do next, when there are multiple threads ready to run – Or multiple packets to send, or web requests.

E PLURIBUS HUGO “Out of the Many, a Hugo” v0.98. What does E Pluribus Hugo do? EPH is a way of tallying nominations that minimizes the effects of slates.

Analysis of SRPT Scheduling: Investigating Unfairness Nikhil Bansal (Joint work with Mor Harchol-Balter)

CS6045: Advanced Algorithms NP Completeness. NP-Completeness Some problems are intractable: as they grow large, we are unable to solve them in reasonable.

Statistical Techniques

A significance test or hypothesis test is a procedure for comparing our data with a hypothesis whose truth we want to assess. The hypothesis is usually.

Lecture Topics: 11/15 CPU scheduling: –Scheduling goals and algorithms.

Graphing Inequalities 2.8 >,≤,

1 Task Assignment with Unknown Duration Mor Harchol-Balter Carnegie Mellon.

INFINITE SEQUENCES AND SERIES The convergence tests that we have looked at so far apply only to series with positive terms.

1 Mor Harchol-Balter Carnegie Mellon with Nikhil Bansal with Bianca Schroeder with Mukesh Agrawal.

Solving Equations Involving Logarithmic and Exponential Functions

CPU Scheduling Andy Wang Operating Systems COP 4610 / CGS 5765.

1 Interactive Computer Theorem Proving CS294-9 October 19, 2006 Adam Chlipala UC Berkeley Lecture 9: Beyond Primitive Recursion.

Load Balancing and Data centers

Serve Assignment Policies

Andy Wang Operating Systems COP 4610 / CGS 5765

Instructor: Shengyu Zhang

Truth Tables Hurley

Prof. Ramin Zabih Least squares fitting Prof. Ramin Zabih

Indistinguishability by adaptive procedures with advice, and lower bounds on hardness amplification proofs Aryeh Grinberg, U. Haifa Ronen.

Planning and Scheduling

TECHNIQUES OF INTEGRATION

Selfish Load Balancing

If x is a variable, then an infinite series of the form

Psych 231: Research Methods in Psychology

Psych 231: Research Methods in Psychology

Lesson 66 – Improper Integrals

Presentation transcript:

1 Mor Harchol-Balter, CMU, Computer Sci. Alan Scheller-Wolf, CMU, Tepper Business Andrew Young, Morgan Stanley Surprising results on task assignment for high-variability workloads

2 Server farm model Q: What is a good Assignment Policy? (high-variability jobs) incoming jobs Assignment Policy FIFO Poisson( ) Goal: Minimize mean response time: E[T] POLICY MATTERS! Mor Harchol-Balter, CMU

3 Good Answers SITA (Size Interval) Split jobs based on size cutoffs. SITA +Isolation for smalls +Protects against high variability M/G/2 LWL (Least Work Left) Send job to host with least remaining work. LWL +High throughput Mor Harchol-Balter, CMU

4 Prior Work on SITA Supercomputing Centers [Hotovy, Schneider, O’Donnell 96] [Schroeder, Harchol-Balter 00] Manufacturing Centers [Buzacott, Shanthikumar 93] File Server Farms [Cardellini, Colajanni, Yu 01] Supermarkets SITA in Practice [Harchol-Balter 00] [Harchol-Balter 02] [Thomas 08] [Tari,Broberg,Zomaya, Baldoni 05] [Fu, Broberg, Tari 03] SITA variants [Harchol-Balter,Crovella,Murta 98] [Bachmat, Sarfati 08] [Sarfati 08] [Harchol-Balter, Vesilo 10] Optimizing SITA cutoffs [Broberg, Tari, Zeephongsekul 06] [Harchol-Balter 02] [Ciardo, Riska, Smirni 01] [El-Taha, Maddah 06] [Fu, Broberg, Tari 03] [Harchol-Balter, Crovella, Murta 99] [Oida, Shinjo 99] [ Tari, Broberg, Zomaya, Baldoni 05] [Thomas 08] SITA vs. LWL SITA LWL All conclude SITA far superior for high variability

5 Can’t prove anything because it’s not true! In search of a proof of SITA’s total dominance. OK, so not optimal, but definite win for high variability. Should at least beat all commonly used policies when variability is high enough. Months later Years later SITA Mor Harchol-Balter, CMU

The TRUTH about SITA, under very high job size variability Alternative Title : Mor Harchol-Balter, CMU

7 Q: In this talk we will show... as C 2  ∞ a)SITA diverges & LWL diverges? b)SITA converges & LWL diverges ? c)SITA diverges & LWL converges? d)SITA converges & LWL converges? A: All of the above SITA LWL

8 Q: In this talk we will show... as C 2  ∞ Convergent LWL Divergent LWL Convergent SITA Divergent SITA Looking for simple job size distributions to illustrate each. Mor Harchol-Balter, CMU

9 Results (2 server system) p 1-p a b Bimodal Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL or p 1-p Exp(  a ) H2H2 Exp(  b ) depends on pa & (1-p)b Mor Harchol-Balter, CMU

10 Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL or  a Trimodal b c=b m > 1  H3H3 Exp(  a ) Exp(  b ) Exp(  c )  depends on m Results (2 server system) Mor Harchol-Balter, CMU

11 Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL depends on  Results (2 server system) Bounded Pareto(  ) 1<  < 2 Mor Harchol-Balter, CMU

12 Bimodal Results X ~ p 1-p a b pa = QE[X] (1-p)b = (1-Q)E[X] THM: If  a < 1 &  b < 1  Convergent SITA Lemma: As C 2  ∞, but E[X], Q: const, a’s get little smaller  QE[X] b’s get much bigger  ∞ p  1 THM: LWL always diverges. Conv. LWL Diverg LWL Conv. SITA Diverg SITA depends  a &  b Mor Harchol-Balter, CMU

13 Understanding LWL Isn’t LWL always bad for high C 2 ? It depends … Need 2 longs for this to be a problem! So we need: Pr{ 2 longs } * E[T| 2 longs] ? But shorts stuck behind longs, so E[T]  ∞ Suffices to just look at E[X 3/2 ]. LWL Mor Harchol-Balter, CMU

Thm: 14 Understanding LWL Thm: [Scheller-Wolf, Sigman 97], [Scheller-Wolf, Vesilo 06] (2 SERVERS) 1 spare server C2∞C2∞ I can make both happen! LWL Mor Harchol-Balter, CMU

15 Bimodal Results THM: If  a < 1 &  b < 1  Convergent SITA Conv. LWL Diverg LWL Conv. SITA Diverg SITA THM: LWL always diverges. X ~ p 1-p a b pa = QE[X] (1-p)b = (1-Q)E[X] depends  a &  b Lemma: As C 2  ∞, but E[X], Q: const, a  QE[X], b  ∞, p  1 Mor Harchol-Balter, CMU

16 Trimodal Results THM: If m≤3, SITA converges If m>3, SITA diverges Lemma: As C 2  ∞, but E[X]: const, a  E[X] b  ∞, c  ∞ p a  1 THM: LWL always converges for  <1 Conv. LWL Diverg LWL Conv. SITA Diverg SITA X~ b c=b m a papa p b =b -3/2 p c =c -3/2 depends on m Mor Harchol-Balter, CMU

17 Results (2 server system) Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL p 1-p a b Bimodal  a Trimodal b c=b m > 1  Way more complex, because job types overlap! “Separation in the limit” p 1-p Exp(  a ) H2H2 Exp(  b ) H3H3 Exp(  a ) Exp(  b ) Exp(  c )  Mor Harchol-Balter, CMU

Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL E[T]LWL SITA C2C2 E[T] LWL SITA C2C2 E[T] LWL SITA C2C2 18 E[T] LWL SITA C2C2 Mor Harchol-Balter, CMU

19 Bounded Pareto (2 server system) THM: SITA always diverges. Lemma: As C 2  ∞, but E[X],  : const, k   E[X] p  ∞ THM: If  >3/2 and  <1, then LWL converges. Else LWL diverges. Conv. LWL Diverg LWL Conv. SITA Diverg SITA depends on  Extends to n>2 servers when  < n-1 X~ Bounded Pareto (k,p,  ) kp 1 <  < 2

Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL 20 Bounded Pareto Results  = 1.6 LWL SITA C2C2 E[T]  = 1.4 LWL SITA C2C2 E[T] Why was this not noticed ? Mor Harchol-Balter, CMU

21 Summary Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL p 1-p a b Bimodal or p 1-p Exp(  a ) H2H2 Exp(  b ) H3H3 Exp(  a ) Exp(  b ) Exp(  c )   a Trimodal b c=b n > 1  Bounded Pareto(  ) 1<  < 2 Mor Harchol-Balter, CMU

22 Old Nursery Rhyme “There once was a girl, who had a little curl right in the middle of her forehead. When she was good, she was very very good. But when she was bad, she was horrid.” When SITA is good, it is very, very good But when it is bad, it is horrid. Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL Mor Harchol-Balter, CMU

23 Epilogue … Mor Harchol-Balter, CMU

24 Where did SITA go wrong? SITA designed to keep shorts from getting stuck behind longs. Isn’t that good? But stringent segregation of shorts & longs can lead to underutilization of servers. Also, for some distributions, can’t subdivide to avoid infinite variability. Mor Harchol-Balter, CMU

25 Must be a better way… SITA Split jobs based on size cutoffs. SITA +Isolation for smalls +Protects against high variability LWL Send job to host with least remaining work. LWL +High throughput M/G/2 Take next job Take next job CS Take next small Take next job

26 Must be a better way… WIN/WIN ! Shorts have isolation from longs And server utilization is high WRONG! Thm: Whenever SITA diverges, CS diverges too. Mor Harchol-Balter, CMU CS Take next small Take next job

27 Thm: Whenever SITA diverges, CS diverges too. SITA Split jobs by size. SITA PROOF: There are 2 reasons why SITA diverges under given  (1)Any way of slicing leads to (2) There is a way of slicing away variability, but it forces e.g., Pareto(  )  SmallsLarges e.g., Bimodal  S’s L’s CS can’t help. CS should help, but doesn’t. CS Take next small Take next job

28 Thm: Whenever SITA diverges, CS diverges too. SITA Split jobs by size. SITA PROOF: There are 2 reasons why SITA diverges under given  (1)Any way of slicing leads to (2) There is a way of slicing away variability, but it forces Pareto(  )  SmallsLarges Bimodal  S’s L’s CS can’t help. CS should help, but doesn’t. Small job sees L of age ~L e.  Small server has been in overload for ~L e time  Small sees ~L e work  experiences L e delay.  Delay of small  ∞ as C 2  ∞ CS Take next small Take next job

29 Conclusion SITA. LWL M/G/2 Take next job Take next job Maybe isolating short jobs is not the panacea for high-variability workloads after all … Mor Harchol-Balter, CMU CS Take next small Take next job