1 Mor Harchol-Balter, CMU, Computer Sci. Alan Scheller-Wolf, CMU, Tepper Business Andrew Young, Morgan Stanley Surprising results on task assignment for.

Slides:



Advertisements
Similar presentations
Computing Truth Value.
Advertisements

Radii and Intervals of Convergence
CPU Scheduling Questions answered in this lecture: What is scheduling vs. allocation? What is preemptive vs. non-preemptive scheduling? What are FCFS,
Surveys and Questionnaires. How Many People Should I Ask? Ask a lot of people many short questions: Yes/No Likert Scale Ask a smaller number.
Lab Assignment 1 COP 4600: Operating Systems Principles Dr. Sumi Helal Professor Computer & Information Science & Engineering Department University of.
Page 1 Alan Scheller-Wolf Lunteren, The Netherlands January 16, 2013 Things I Thought I Knew about Queueing Theory, but was Wrong About (Part 2, Service.
CPU Scheduling CPU Scheduler Performance metrics for CPU scheduling
Scheduling. Main Points Scheduling policy: what to do next, when there are multiple threads ready to run – Or multiple packets to send, or web requests.
Page 1 Alan Scheller-Wolf Lunteren, The Netherlands January 15, 2013 Things I Thought I Knew About Queueing Theory, but was Wrong About: Part 1, Multiserver.
Queuing Theory For Dummies
Simulation Where real stuff starts. ToC 1.What, transience, stationarity 2.How, discrete event, recurrence 3.Accuracy of output 4.Monte Carlo 5.Random.
INFINITE SEQUENCES AND SERIES
OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.
1 Connection Scheduling in Web Servers Mor Harchol-Balter School of Computer Science Carnegie Mellon
Join-the-Shortest-Queue (JSQ) Routing in Web Server Farms
Fundamental Characteristics of Queues with Fluctuating Load VARUN GUPTA Joint with: Mor Harchol-Balter Carnegie Mellon Univ. Alan Scheller-Wolf Carnegie.
OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.
Effect of higher moments of job size distribution on the performance of an M/G/k system VARUN GUPTA Joint work with: Mor Harchol-Balter Carnegie Mellon.
Fundamental Characteristics of Queues with Fluctuating Load (appeared in SIGMETRICS 2006) VARUN GUPTA Joint with: Mor Harchol-Balter Carnegie Mellon Univ.
Effect of higher moments of job size distribution on the performance of an M/G/k system VARUN GUPTA Joint work with: Mor Harchol-Balter Carnegie Mellon.
CS 153 Design of Operating Systems Spring 2015 Lecture 10: Scheduling.
Copyright Kenneth M. Chipps Ph.D. 1 How To Do VLSM Last Update
Power Series Radii and Intervals of Convergence. First some examples Consider the following example series: What does our intuition tell us about the.
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc Chapter 9 Introduction to Hypothesis Testing.
Mechanisms for Making Crowds Truthful Andrew Mao, Sergiy Nesterko.
Multi-Unit Auctions with Budget Limits Shahar Dobzinski, Ron Lavi, and Noam Nisan.
1 Mor Harchol-Balter Computer Science Dept, CMU What Analytical Performance Modeling Teaches Us About Computer Systems Design.
Probability Review Thinh Nguyen. Probability Theory Review Sample space Bayes’ Rule Independence Expectation Distributions.
YOU HAVE TO KNOW WHEN THE LIGHT BULB IS REALLY ON! DID YOU GET THAT? WHAT’S THE ANSWER? The one doing all the work [talking], is the one doing all the.
1 WORkshop on Multiserver Scheduling (WORMS) Carnegie Mellon University Pittsburgh, PA April 18 and 19, 2004 Funded by NSF ALADDIN and Tepper WELCOME EVERYONE!
Tonga Institute of Higher Education Design and Analysis of Algorithms IT 254 Lecture 8: Complexity Theory.
Section 3.1: Proof Strategy Now that we have a fair amount of experience with proofs, we will start to prove more difficult theorems. Our experience so.
IT253: Computer Organization Lecture 3: Memory and Bit Operations Tonga Institute of Higher Education.
CPU Scheduling CSCI 444/544 Operating Systems Fall 2008.
1 The Effect of Heavy-Tailed Job Size Distributions on System Design Mor Harchol-Balter MIT Laboratory for Computer Science.
Hypothesis testing Summer Program Brian Healy. Last class Study design Study design –What is sampling variability? –How does our sample effect the questions.
Carnegie Mellon University Computer Science Department 1 OPEN VERSUS CLOSED: A CAUTIONARY TALE Bianca Schroeder Adam Wierman Mor Harchol-Balter Computer.
Graph Colouring L09: Oct 10. This Lecture Graph coloring is another important problem in graph theory. It also has many applications, including the famous.
CS 267: Automated Verification Lecture 3: Fixpoints and Temporal Properties Instructor: Tevfik Bultan.
LOGICAL REASONING FOR CAT 2009.
Lesson 2: Common Misconceptions. Misconception 1 “Christianity must be proven scientifically; I’ll accept Christianity when you prove it with the scientific.
Scheduling. Main Points Scheduling policy: what to do next, when there are multiple threads ready to run – Or multiple packets to send, or web requests.
E PLURIBUS HUGO “Out of the Many, a Hugo” v0.98. What does E Pluribus Hugo do? EPH is a way of tallying nominations that minimizes the effects of slates.
Analysis of SRPT Scheduling: Investigating Unfairness Nikhil Bansal (Joint work with Mor Harchol-Balter)
CS6045: Advanced Algorithms NP Completeness. NP-Completeness Some problems are intractable: as they grow large, we are unable to solve them in reasonable.
Statistical Techniques
A significance test or hypothesis test is a procedure for comparing our data with a hypothesis whose truth we want to assess. The hypothesis is usually.
Lecture Topics: 11/15 CPU scheduling: –Scheduling goals and algorithms.
Graphing Inequalities 2.8 >,≤,
1 Task Assignment with Unknown Duration Mor Harchol-Balter Carnegie Mellon.
INFINITE SEQUENCES AND SERIES The convergence tests that we have looked at so far apply only to series with positive terms.
1 Mor Harchol-Balter Carnegie Mellon with Nikhil Bansal with Bianca Schroeder with Mukesh Agrawal.
Solving Equations Involving Logarithmic and Exponential Functions
CPU Scheduling Andy Wang Operating Systems COP 4610 / CGS 5765.
1 Interactive Computer Theorem Proving CS294-9 October 19, 2006 Adam Chlipala UC Berkeley Lecture 9: Beyond Primitive Recursion.
Load Balancing and Data centers
Serve Assignment Policies
Andy Wang Operating Systems COP 4610 / CGS 5765
Instructor: Shengyu Zhang
Truth Tables Hurley
Prof. Ramin Zabih Least squares fitting Prof. Ramin Zabih
Truth Trees.
Indistinguishability by adaptive procedures with advice, and lower bounds on hardness amplification proofs Aryeh Grinberg, U. Haifa Ronen.
Planning and Scheduling
TECHNIQUES OF INTEGRATION
Selfish Load Balancing
If x is a variable, then an infinite series of the form
Psych 231: Research Methods in Psychology
Psych 231: Research Methods in Psychology
Lesson 66 – Improper Integrals
Presentation transcript:

1 Mor Harchol-Balter, CMU, Computer Sci. Alan Scheller-Wolf, CMU, Tepper Business Andrew Young, Morgan Stanley Surprising results on task assignment for high-variability workloads

2 Server farm model Q: What is a good Assignment Policy? (high-variability jobs) incoming jobs Assignment Policy FIFO Poisson( ) Goal: Minimize mean response time: E[T] POLICY MATTERS! Mor Harchol-Balter, CMU

3 Good Answers SITA (Size Interval) Split jobs based on size cutoffs. SITA +Isolation for smalls +Protects against high variability M/G/2 LWL (Least Work Left) Send job to host with least remaining work. LWL +High throughput Mor Harchol-Balter, CMU

4 Prior Work on SITA Supercomputing Centers [Hotovy, Schneider, O’Donnell 96] [Schroeder, Harchol-Balter 00] Manufacturing Centers [Buzacott, Shanthikumar 93] File Server Farms [Cardellini, Colajanni, Yu 01] Supermarkets SITA in Practice [Harchol-Balter 00] [Harchol-Balter 02] [Thomas 08] [Tari,Broberg,Zomaya, Baldoni 05] [Fu, Broberg, Tari 03] SITA variants [Harchol-Balter,Crovella,Murta 98] [Bachmat, Sarfati 08] [Sarfati 08] [Harchol-Balter, Vesilo 10] Optimizing SITA cutoffs [Broberg, Tari, Zeephongsekul 06] [Harchol-Balter 02] [Ciardo, Riska, Smirni 01] [El-Taha, Maddah 06] [Fu, Broberg, Tari 03] [Harchol-Balter, Crovella, Murta 99] [Oida, Shinjo 99] [ Tari, Broberg, Zomaya, Baldoni 05] [Thomas 08] SITA vs. LWL SITA LWL All conclude SITA far superior for high variability

5 Can’t prove anything because it’s not true! In search of a proof of SITA’s total dominance. OK, so not optimal, but definite win for high variability. Should at least beat all commonly used policies when variability is high enough. Months later Years later SITA Mor Harchol-Balter, CMU

The TRUTH about SITA, under very high job size variability Alternative Title : Mor Harchol-Balter, CMU

7 Q: In this talk we will show... as C 2  ∞ a)SITA diverges & LWL diverges? b)SITA converges & LWL diverges ? c)SITA diverges & LWL converges? d)SITA converges & LWL converges? A: All of the above SITA LWL

8 Q: In this talk we will show... as C 2  ∞ Convergent LWL Divergent LWL Convergent SITA Divergent SITA Looking for simple job size distributions to illustrate each. Mor Harchol-Balter, CMU

9 Results (2 server system) p 1-p a b Bimodal Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL or p 1-p Exp(  a ) H2H2 Exp(  b ) depends on pa & (1-p)b Mor Harchol-Balter, CMU

10 Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL or  a Trimodal b c=b m > 1  H3H3 Exp(  a ) Exp(  b ) Exp(  c )  depends on m Results (2 server system) Mor Harchol-Balter, CMU

11 Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL depends on  Results (2 server system) Bounded Pareto(  ) 1<  < 2 Mor Harchol-Balter, CMU

12 Bimodal Results X ~ p 1-p a b pa = QE[X] (1-p)b = (1-Q)E[X] THM: If  a < 1 &  b < 1  Convergent SITA Lemma: As C 2  ∞, but E[X], Q: const, a’s get little smaller  QE[X] b’s get much bigger  ∞ p  1 THM: LWL always diverges. Conv. LWL Diverg LWL Conv. SITA Diverg SITA depends  a &  b Mor Harchol-Balter, CMU

13 Understanding LWL Isn’t LWL always bad for high C 2 ? It depends … Need 2 longs for this to be a problem! So we need: Pr{ 2 longs } * E[T| 2 longs] ? But shorts stuck behind longs, so E[T]  ∞ Suffices to just look at E[X 3/2 ]. LWL Mor Harchol-Balter, CMU

Thm: 14 Understanding LWL Thm: [Scheller-Wolf, Sigman 97], [Scheller-Wolf, Vesilo 06] (2 SERVERS) 1 spare server C2∞C2∞ I can make both happen! LWL Mor Harchol-Balter, CMU

15 Bimodal Results THM: If  a < 1 &  b < 1  Convergent SITA Conv. LWL Diverg LWL Conv. SITA Diverg SITA THM: LWL always diverges. X ~ p 1-p a b pa = QE[X] (1-p)b = (1-Q)E[X] depends  a &  b Lemma: As C 2  ∞, but E[X], Q: const, a  QE[X], b  ∞, p  1 Mor Harchol-Balter, CMU

16 Trimodal Results THM: If m≤3, SITA converges If m>3, SITA diverges Lemma: As C 2  ∞, but E[X]: const, a  E[X] b  ∞, c  ∞ p a  1 THM: LWL always converges for  <1 Conv. LWL Diverg LWL Conv. SITA Diverg SITA X~ b c=b m a papa p b =b -3/2 p c =c -3/2 depends on m Mor Harchol-Balter, CMU

17 Results (2 server system) Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL p 1-p a b Bimodal  a Trimodal b c=b m > 1  Way more complex, because job types overlap! “Separation in the limit” p 1-p Exp(  a ) H2H2 Exp(  b ) H3H3 Exp(  a ) Exp(  b ) Exp(  c )  Mor Harchol-Balter, CMU

Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL E[T]LWL SITA C2C2 E[T] LWL SITA C2C2 E[T] LWL SITA C2C2 18 E[T] LWL SITA C2C2 Mor Harchol-Balter, CMU

19 Bounded Pareto (2 server system) THM: SITA always diverges. Lemma: As C 2  ∞, but E[X],  : const, k   E[X] p  ∞ THM: If  >3/2 and  <1, then LWL converges. Else LWL diverges. Conv. LWL Diverg LWL Conv. SITA Diverg SITA depends on  Extends to n>2 servers when  < n-1 X~ Bounded Pareto (k,p,  ) kp 1 <  < 2

Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL 20 Bounded Pareto Results  = 1.6 LWL SITA C2C2 E[T]  = 1.4 LWL SITA C2C2 E[T] Why was this not noticed ? Mor Harchol-Balter, CMU

21 Summary Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL p 1-p a b Bimodal or p 1-p Exp(  a ) H2H2 Exp(  b ) H3H3 Exp(  a ) Exp(  b ) Exp(  c )   a Trimodal b c=b n > 1  Bounded Pareto(  ) 1<  < 2 Mor Harchol-Balter, CMU

22 Old Nursery Rhyme “There once was a girl, who had a little curl right in the middle of her forehead. When she was good, she was very very good. But when she was bad, she was horrid.” When SITA is good, it is very, very good But when it is bad, it is horrid. Conv. SITA Diverg. SITA Conv. LWL Diverg. LWL Mor Harchol-Balter, CMU

23 Epilogue … Mor Harchol-Balter, CMU

24 Where did SITA go wrong? SITA designed to keep shorts from getting stuck behind longs. Isn’t that good? But stringent segregation of shorts & longs can lead to underutilization of servers. Also, for some distributions, can’t subdivide to avoid infinite variability. Mor Harchol-Balter, CMU

25 Must be a better way… SITA Split jobs based on size cutoffs. SITA +Isolation for smalls +Protects against high variability LWL Send job to host with least remaining work. LWL +High throughput M/G/2 Take next job Take next job CS Take next small Take next job

26 Must be a better way… WIN/WIN ! Shorts have isolation from longs And server utilization is high WRONG! Thm: Whenever SITA diverges, CS diverges too. Mor Harchol-Balter, CMU CS Take next small Take next job

27 Thm: Whenever SITA diverges, CS diverges too. SITA Split jobs by size. SITA PROOF: There are 2 reasons why SITA diverges under given  (1)Any way of slicing leads to (2) There is a way of slicing away variability, but it forces e.g., Pareto(  )  SmallsLarges e.g., Bimodal  S’s L’s CS can’t help. CS should help, but doesn’t. CS Take next small Take next job

28 Thm: Whenever SITA diverges, CS diverges too. SITA Split jobs by size. SITA PROOF: There are 2 reasons why SITA diverges under given  (1)Any way of slicing leads to (2) There is a way of slicing away variability, but it forces Pareto(  )  SmallsLarges Bimodal  S’s L’s CS can’t help. CS should help, but doesn’t. Small job sees L of age ~L e.  Small server has been in overload for ~L e time  Small sees ~L e work  experiences L e delay.  Delay of small  ∞ as C 2  ∞ CS Take next small Take next job

29 Conclusion SITA. LWL M/G/2 Take next job Take next job Maybe isolating short jobs is not the panacea for high-variability workloads after all … Mor Harchol-Balter, CMU CS Take next small Take next job