Scheduling Jobs with Varying Parallelizability
Ravishankar Krishnaswamy, Carnegie Mellon University

Outline
– Motivation
– Formalization of Problems
– A Concrete Example
– Generalizations

Motivation
Consider a scheduler on a multi-processor system:
– Different jobs of varying importance arrive “online”
– Each job is inherently decomposed into stages, and each stage has some degree of parallelism
– The scheduler is not aware of these stages; it only knows when a job completes!
Can we do anything “good”?

A Small Example
Scheduling on a multi-processor system:
– m processors
– 1 job arrives at t=0; it is fully sequential (makes no use of parallelism even when provided)
– several jobs arrive subsequently, all of them completely parallelizable (imagine a single ‘for’ loop whose iterations can all run in parallel)
Bad solution: FCFS, allocating all the machines to the earliest job.
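The gap in this example shows up in a toy simulation. The concrete numbers below (m = 10 processors, job sizes, arrival times) and the `set_aside` policy are illustrative choices of mine, not from the talk:

```python
# Toy discrete-time simulation of the bad example. 'seq' jobs run at
# rate 1 no matter how many processors they get; 'par' jobs run at a
# rate equal to their processor share.

def simulate(alloc, jobs, m):
    """jobs: list of dicts with 'arrival', 'work', 'kind'.
    alloc(active, m) returns {job_index: processor share} summing to <= m.
    Returns the total flowtime."""
    t, remaining, done = 0.0, [j["work"] for j in jobs], {}
    while len(done) < len(jobs):
        active = [i for i in range(len(jobs))
                  if i not in done and jobs[i]["arrival"] <= t]
        for i, p in (alloc(active, m) if active else {}).items():
            rate = 1.0 if jobs[i]["kind"] == "seq" else p
            remaining[i] -= rate          # one unit of time per loop step
            if remaining[i] <= 1e-9:
                done[i] = t + 1.0
        t += 1.0
    return sum(done[i] - jobs[i]["arrival"] for i in range(len(jobs)))

def fcfs(active, m):
    # FCFS: all m processors to the earliest unfinished job
    return {min(active): m}

def set_aside(active, m):
    # one processor for the sequential job (index 0), rest split evenly
    out = {0: 1} if 0 in active else {}
    others = [i for i in active if i != 0]
    free = m - len(out)
    for i in others:
        out[i] = free / len(others)
    return out

jobs = ([{"arrival": 0, "work": 10, "kind": "seq"}]
        + [{"arrival": i, "work": 9, "kind": "par"} for i in range(1, 6)])
```

On this instance, FCFS incurs total flowtime 60 (every parallel job waits behind the sequential one, which gains nothing from holding all 10 machines), while setting one machine aside gets 15.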

Formalization
Scheduling Model: Online, Preemptive, Non-Clairvoyant
Job Types: Varying degrees of parallelizability
Objective: Minimize the average flowtime (or some function of the flowtimes)

Explaining away the terms
Online – we learn of a new job only at the time when it arrives.
Non-Clairvoyance – we know “nothing” about a job, such as its runtime or extent of parallelizability.
– A job tells us “it’s done” once it is completely scheduled.

Explaining away the terms
Varying Parallelizability [ECBD97]
– Each job is composed of different stages
– Each stage r has a speedup function Γ_r(p): how parallelizable the stage is, given p processors
[Figure: a typical Γ(p) curve, rising and then flattening out; beyond the flat point it makes no sense to allocate more cores]

Special Case: Fully Parallelizable Stage
[Figure: Γ(p) = p, a straight line through the origin]

Special Case: Sequential Stage
[Figure: Γ(p) = 1 for all p ≥ 1]
In general, Γ is assumed to be
1) Non-decreasing
2) Sublinear
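The two special cases and the general assumptions on Γ can be sketched as follows; the capped curve is an extra illustrative example of my own, not from the talk:

```python
# Speedup curves Gamma(p) and a check of the two assumed properties:
# nondecreasing, and sublinear (Gamma(p)/p nonincreasing).

def gamma_parallel(p):
    # fully parallelizable stage: Gamma(p) = p
    return p

def gamma_sequential(p):
    # fully sequential stage: Gamma(p) = 1 for any p >= 1
    return 1.0 if p >= 1 else p

def gamma_capped(cap):
    # illustrative in-between case: parallelizable up to 'cap' processors
    return lambda p: min(p, cap)

def is_valid_speedup(gamma, ps):
    """Check nondecreasing and sublinear on an increasing grid ps."""
    vals = [gamma(p) for p in ps]
    nondec = all(a <= b for a, b in zip(vals, vals[1:]))
    subl = all(vals[i] / ps[i] >= vals[i + 1] / ps[i + 1]
               for i in range(len(ps) - 1))
    return nondec and subl
```

A superlinear curve like Γ(p) = p² fails the sublinearity check, which is what rules out "more processors than work" anomalies.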

Explaining away the terms
Objective Function
– Average flowtime: minimize Σ_j (C_j − a_j)
– L_2 norm of flowtime: minimize Σ_j (C_j − a_j)²
– etc.
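Both objectives are computed from arrival times a_j and completion times C_j; the two toy schedules below are illustrative numbers of mine, chosen to show why the squared objective behaves differently:

```python
# Flowtime objectives from arrival times a_j and completion times C_j.

def total_flowtime(arrivals, completions):
    return sum(c - a for a, c in zip(arrivals, completions))

def sum_squared_flowtime(arrivals, completions):
    # minimizing this is equivalent to minimizing the L2 norm of flowtimes
    return sum((c - a) ** 2 for a, c in zip(arrivals, completions))

# Two schedules with equal total flowtime; the squared objective prefers
# the balanced one, since it penalizes a single large flowtime more.
a = [0, 0]
unbalanced = [1, 3]   # flowtimes 1 and 3
balanced = [2, 2]     # flowtimes 2 and 2
```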

Can we do anything at all?
Yes, with a little bit of resource augmentation:
– The online algorithm uses s·m machines to schedule an instance on which OPT can use only m machines.
There is an O(1/ε)-competitive algorithm with (2+ε)-augmentation for minimizing average flowtime [E99].

Outline
– Motivation
– Formalization of Problems
– A Concrete Example
– Generalizations

The Case of Unweighted Flowtimes
– The instance has n jobs arriving online, and m processors (the online algorithm can use s·m machines)
– Each job has several stages, each with its own ‘degree of parallelizability’ curve
– Minimize the average flowtime of the jobs (or, by scaling, the total flowtime)

The Case of Unweighted Flowtimes
Algorithm (EQUI):
– At each time t, let N_A(t) be the set of unfinished jobs in our algorithm’s queue.
– Devote an s·m/|N_A(t)| share of the processors to each such job.
This is O(1)-competitive with O(1) augmentation.
(In their paper, Edmonds and Pruhs get a (1+ε)-augmentation, O(1/ε²)-competitive algorithm.)
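A minimal sketch of this equal-split rule with time discretized into dt-steps. The job list, speedup curves, and parameters in the usage below are illustrative choices of mine:

```python
# Sketch of EQUI with speed augmentation s: at each step, split the
# s*m processors evenly among the unfinished (arrived) jobs.

def equi_schedule(jobs, m, s=2.0, dt=0.1):
    """jobs: list of (arrival, work, gamma); gamma maps processors -> rate.
    Returns the list of flowtimes, in job order."""
    t, rem = 0.0, [w for (_, w, _) in jobs]
    finish = [None] * len(jobs)
    while any(f is None for f in finish):
        alive = [j for j in range(len(jobs))
                 if finish[j] is None and jobs[j][0] <= t]
        if alive:
            share = s * m / len(alive)        # equal split of s*m machines
            for j in alive:
                rem[j] -= jobs[j][2](share) * dt
                if rem[j] <= 1e-9:
                    finish[j] = t + dt
        t += dt
    return [finish[j] - jobs[j][0] for j in range(len(jobs))]
```

For example, with m = 1, s = 2, a sequential-ish job (Γ(p) = min(p, 1)) of size 1 and a fully parallel job (Γ(p) = p) of size 2, both arriving at t = 0, EQUI finishes them at roughly t = 1.0 and t = 1.5.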

High-Level Proof Idea [E99]
General Instance → Restricted Extremal Instance → Solve Extremal Case
– In a restricted extremal instance, each stage is either fully parallelizable or fully sequential.
– A non-clairvoyant algorithm cannot distinguish the two instances.
– Also ensure OPT_R ≤ OPT_G.
– Show that EQUI (with augmentation) is O(1)-competitive against OPT_R.

Reduction to the Extremal Case
Consider an infinitesimally small time interval [t, t+dt).
ALG gives p processors to some job j; OPT gives p* processors (before or after t, over an interval of length dt*) to get the same work done, so Γ(p) dt = Γ(p*) dt*.
– If p ≥ p*, replace this work with dt of “sequential work”.
– If p < p*, replace this work with p·dt of “parallel work”.
1) ALG is oblivious to the change.
2) OPT can fit the new work in-place.

High-Level Proof Idea [E99]
General Instance → Restricted Extremal Instance → Solve Extremal Case

Amortized Competitiveness Analysis
– Each alive job contributes 1 per unit time to its flowtime C_j − a_j.
– So the total rate of increase of the objective at time t is |N_A(t)|.
– We would be done if we could show, for all t, |N_A(t)| ≤ O(1) · |N_O(t)|.
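The "would be done" claim is just the observation that total flowtime integrates the number of alive jobs over time; as a math sketch:

```latex
% Total flowtime as an integral of the alive-job count, and why the
% pointwise bound |N_A(t)| <= O(1) |N_O(t)| would finish the proof:
\begin{align*}
\mathrm{ALG} &= \sum_j (C_j - a_j) \;=\; \int_0^\infty |N_A(t)|\,dt \\
             &\le \int_0^\infty O(1)\,|N_O(t)|\,dt \;=\; O(1)\cdot\mathrm{OPT}.
\end{align*}
```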

Amortized Competitiveness Analysis
Sadly, we can’t show that: there are situations when |N_A(t)| is 100 while |N_O(t)| is 10 (we are way behind OPT), and vice versa (OPT pays a lot more than us).
Way around: use some kind of global accounting.

Banking via a Potential Function
Resort to an amortized analysis: define a potential function Φ(t) which is 0 at t=0 and at t=∞, and show the following:
– At any job arrival, ΔΦ ≤ α·ΔOPT (where ΔOPT is the increase in future OPT cost due to the arrival of the job).
– At all other times, dALG/dt + dΦ/dt ≤ β·dOPT/dt.
This gives an (α+β)-competitive online algorithm.
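The running condition here (the slide's formula was an image in the original deck) is the standard one, dALG/dt + dΦ/dt ≤ β·dOPT/dt; integrating it together with the arrival condition and Φ(0) = Φ(∞) = 0 gives the claimed ratio:

```latex
% Standard amortization: telescoping the potential over the whole run.
\begin{align*}
\mathrm{ALG} &= \mathrm{ALG} + \Phi(\infty) - \Phi(0) \\
 &= \int_0^{\infty}\!\Big(\frac{d\,\mathrm{ALG}}{dt} + \frac{d\Phi}{dt}\Big)\,dt
    \;+\; \sum_{\text{arrivals}} \Delta\Phi \\
 &\le \int_0^{\infty}\! \beta\,\frac{d\,\mathrm{OPT}}{dt}\,dt \;+\; \alpha\,\mathrm{OPT}
 \;=\; (\alpha + \beta)\,\mathrm{OPT}.
\end{align*}
```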

For our Problem
– rank(j) is the position of job j in the sorted order of arrivals (the most recent arrival has the highest rank).
– y_a(j,t) − y_o(j,t) is the ‘lag’ in parallel work of the algorithm behind the optimal solution on job j.
– The potential combines these: Φ(t) = c · Σ_j rank(j) · max(0, y_a(j,t) − y_o(j,t)), for a suitable constant c.
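The slide's formula was rendered as an image in the original deck; a sketch assuming the standard [EP09]-style form Φ(t) = c · Σ_j rank(j) · max(0, y_a(j,t) − y_o(j,t)):

```python
# Sketch of the potential function. Assumption: the standard form
# Phi(t) = c * sum_j rank(j) * max(0, y_a(j,t) - y_o(j,t)), where the
# y's measure parallel work and c is a constant to be tuned.

def potential(lags, c=1.0):
    """lags: values y_a(j,t) - y_o(j,t), listed in arrival order, so the
    job at position r has rank r + 1 (most recent arrival = highest)."""
    return c * sum((r + 1) * max(0.0, lag) for r, lag in enumerate(lags))

# A newly arrived job has y_a = y_o, so appending it with lag 0 changes
# nothing, and the existing ranks are untouched -- which is exactly the
# arrival argument on the next slide.
```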

Arrivals and Departures are OK
Recall Φ(t) = c · Σ_j rank(j) · max(0, y_a(j,t) − y_o(j,t)).
– When a new job arrives, y_a(j,t) = y_o(j,t), so its term adds 0; for all other jobs, rank(j) remains unchanged.
– When our algorithm completes a job, some ranks may decrease, but that can only decrease the potential, which is harmless.

Running Condition
Recall Φ(t) = c · Σ_j rank(j) · max(0, y_a(j,t) − y_o(j,t)).
At any time instant, Φ increases due to OPT working and decreases due to ALG working.
In the worst case, OPT is working on the most recently arrived job, which has the highest rank; hence Φ increases at a rate of at most |N_A(t)|.

What Goes Up Must Come Down…
Φ drops as long as the algorithm is working on jobs in their parallel phase on which OPT is ahead:
– Jobs in a sequential phase contribute no decrease in Φ.
– If the algorithm is ahead of OPT on a job in its parallel phase, then max(0, y_a(j) − y_o(j)) = 0, so working on it gives no drop either.
Suppose only very few jobs (say, |N_A(t)|/100) are ‘bad’ in these ways. Then the algorithm’s work drops Φ through most jobs, and the drop is large enough to counter both ALG’s cost and the increase in Φ due to OPT working.

Putting It All Together
So in the good case, the drop in Φ counters |N_A(t)|, and the running condition holds.
Handling bad jobs:
– If ALG is leading OPT on at least |N_A(t)|/200 jobs, then we can charge the LHS to 400·|N_O(t)|.
– If more than |N_A(t)|/200 jobs are in a sequential phase, OPT must pay 1 for each of these jobs at some point in the past/future (and observe that no point in OPT is double-charged).
Integrating over time, we get c(ALG) ≤ 400·c(OPT).

Outline
– Motivation
– Formalization of Problems
– A Concrete Example
– Generalizations

Minimizing the L_2 Norm of Flowtimes [GIKMP10]
Round-Robin does not work:
– 1 job arrives at t=0 and has some parallel work to do; subsequently, unit-sized sequential jobs arrive every time step.
– The optimal solution just works on the first job.
– Round-Robin wastes a lot of cycles on the subsequent jobs and incurs a much larger cost on job 1, because its flowtime gets squared.

To Fix the Problem
– Consider “weighted” round-robin, where the age of a job is its weight.
– Generalize the earlier potential function to handle ages/weights; sequential parts can’t be charged directly to the optimum (the same sequential work may be executed at different ages in the two schedules, which didn’t matter in L_1 minimization).
– This gives an O(1/ε³)-competitive algorithm with (2+ε)-augmentation.
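A sketch of the age-as-weight idea: processors are split in proportion to each alive job's current age. This is my own illustrative rendering of the rule; the precise share formula in [GIKMP10] may differ.

```python
# Age-weighted round-robin shares: each unfinished job gets processors
# in proportion to its age t - a_j. Illustrative sketch only.

def weighted_shares(arrivals, t, m, s=1.0):
    """arrivals: arrival times of the currently unfinished jobs.
    Returns {index: share of the s*m processors}, proportional to age."""
    ages = {j: t - a for j, a in enumerate(arrivals) if a <= t}
    total = sum(ages.values())
    if total <= 0:
        # every alive job arrived exactly now: fall back to an even split
        return {j: s * m / len(ages) for j in ages} if ages else {}
    return {j: s * m * age / total for j, age in ages.items()}
```

Older jobs get larger shares, which is what prevents the Round-Robin failure mode above: the long-lived first job is no longer starved by a stream of fresh arrivals.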

Other Generalizations
– [CEP09] consider the problem of scheduling such jobs on machines whose speeds can be altered, with the objective of minimizing flowtime + energy; they give an O(α² log m)-competitive online algorithm.
– [BKN10] use a similar potential-function-based analysis to get (1+ε)-speed O(1)-competitive algorithms for broadcast scheduling.

Conclusion
– Looked at a model where jobs have varying degrees of parallelism and scheduling is non-clairvoyant.
– Outlined an O(1)-augmentation, O(1)-competitive analysis.
– Described a couple of recent generalizations.
Open Problems:
– Improve the augmentation requirement, or show a lower bound, for L_p norm minimization.
– Close the gap in the flowtime + energy setting.

Thank You! Questions?

References
[ECBD97] Jeff Edmonds, Donald D. Chinn, Tim Brecht, Xiaotie Deng: Non-clairvoyant Multiprocessor Scheduling of Jobs with Changing Execution Characteristics. STOC 1997.
[E99] Jeff Edmonds: Scheduling in the Dark. STOC 1999.
[EP09] Jeff Edmonds, Kirk Pruhs: Scalably scheduling processes with arbitrary speedup curves. SODA 2009.
[CEP09] Ho-Leung Chan, Jeff Edmonds, Kirk Pruhs: Speed scaling of processes with arbitrary speedup curves on a multiprocessor. SPAA 2009.
[GIKMP10] Anupam Gupta, Sungjin Im, Ravishankar Krishnaswamy, Benjamin Moseley, Kirk Pruhs: Scheduling processes with arbitrary speedup curves to minimize variance. Manuscript.