DP-Fair: A Simple Model for Understanding Optimal Multiprocessor Scheduling Greg Levin † Shelby Funk ‡ Caitlin Sadowski † Ian Pye † Scott Brandt †


DP-Fair: A Simple Model for Understanding Optimal Multiprocessor Scheduling Greg Levin † Shelby Funk ‡ Caitlin Sadowski † Ian Pye † Scott Brandt † († University of California, Santa Cruz; ‡ University of Georgia, Athens)

Backstory: NQ-Fair, by Shelby Funk (rejected from RTSS), and SNS, by Greg, Caitlin, Ian, and Scott (rejected from RTSS), together became DP-Fair.

Multiprocessors Most high-end computers today have multiple processors. In a busy computational environment, multiple processes compete for processor time, and more processors means more scheduling complexity.

4 Real-Time Multiprocessor Scheduling Real-time tasks have workload deadlines; hard real-time means "Meet all deadlines!" Problem: scheduling periodic, hard real-time tasks on multiprocessor systems. This is difficult.


Taxonomy of Multiprocessor Scheduling Algorithms Uniprocessor Algorithms (EDF, LLF); Partitioned Algorithms (Partitioned EDF); Globalized Uniprocessor Algorithms (Global EDF); Dedicated Global Algorithms, including the Optimal Algorithms (pfair, LLREF, EKG).

7 Partitioned Schedulers ≠ Optimal Example: 2 processors; 3 tasks, each with 2 units of work required every 3 time units. Any static partition must place two of these tasks on the same processor, which then owes 4 units of work every 3 time units and must miss a deadline.
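The infeasibility claim can be checked exhaustively. The sketch below (illustrative code, not from the talk) enumerates every static assignment of the three u = 2/3 tasks to the two processors and finds none whose per-processor utilization stays within 1:

```python
from itertools import product

# Three tasks, each needing 2 units of work every 3 time units (u = 2/3).
utilizations = [2/3, 2/3, 2/3]
m = 2  # processors

def partition_feasible(assignment):
    """A static partition is feasible only if each processor's total
    utilization is at most 1 (EDF is optimal on each processor alone)."""
    loads = [0.0] * m
    for task_u, proc in zip(utilizations, assignment):
        loads[proc] += task_u
    return all(load <= 1.0 + 1e-9 for load in loads)

# Try every way of assigning the 3 tasks to the 2 processors.
feasible = [a for a in product(range(m), repeat=len(utilizations))
            if partition_feasible(a)]
print(feasible)  # [] -- no partition works, although total utilization is only 2
```

By the pigeonhole principle, some processor always receives two tasks and a load of 4/3, which is why the list comes back empty.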

8 Global Schedulers Succeed Example: 2 processors; 3 tasks, each with 2 units of work required every 3 time units Task 3 migrates between processors

9 Problem Model n tasks running on m processors. A periodic task T = (p, e) requires a workload e be completed within each period of length p. T's utilization u = e / p is the fraction of each period that the task must execute. [Figure: timeline with job releases at times 0, p, 2p, 3p; each period of length p contains workload e and ends at the next release.]
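The task model above can be written down as a tiny Python class (a sketch; the names `Task` and `u` are mine, not the talk's):

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Task:
    p: int  # period: a new job is released every p time units
    e: int  # workload: execution required within each period

    @property
    def u(self):
        """Utilization u = e / p: the fraction of each period the task must execute."""
        return Fraction(self.e, self.p)

# One of the tasks from the earlier example: 2 units of work every 3 time units.
t = Task(p=3, e=2)
print(t.u)  # 2/3
```

Exact fractions avoid floating-point drift when summing utilizations later.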

10 Fluid Rate Curve [Figure: work completed vs. time over one period, from job release to deadline. The fluid rate curve rises with slope u = e/p from 0 to the workload e; the actual work curve tracks it with slope 0 (not running) or 1 (running).]

11 Feasible Work Region [Figure: the feasible work region between job release and deadline, bounded by the zero-laxity line, the work-complete level (workload e), and the maximum CPU rate of 1.]

12 Assumptions Processor identity: all processors are equivalent. Task independence: tasks are independent. Task unity: tasks run on one processor at a time. Task migration: tasks may run on different processors at different times. No overhead: context switches and migrations are free (in practice, their costs are built into WCET estimates).

13 The Big Goal (Version 1) Design an optimal scheduling algorithm for periodic task sets on multiprocessors. A task set is feasible if there exists a schedule that meets all deadlines; a scheduling algorithm is optimal if it can schedule any feasible task set.

14 Necessary and Sufficient Conditions Any set of tasks needing at most 1) one processor for each task (for all i, u_i ≤ 1), and 2) m processors for all tasks (Σ u_i ≤ m) is feasible. Proof idea: small scheduling intervals can approximate a task's fluid rate curve. Status: solved; pfair (1996) was the first optimal algorithm.
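These two conditions translate directly into a feasibility test. A minimal sketch, assuming utilizations are supplied as exact fractions (the function name is illustrative):

```python
from fractions import Fraction

def feasible(utilizations, m):
    """Necessary and sufficient feasibility test for periodic tasks on m
    identical processors: every u_i <= 1, and the total utilization <= m."""
    return (all(u <= 1 for u in utilizations)
            and sum(utilizations) <= m)

us = [Fraction(2, 3)] * 3
print(feasible(us, m=2))                # True: schedulable by a global (migrating) scheduler
print(feasible([Fraction(3, 2)], m=2))  # False: no single task may exceed u = 1
```

Note that the three u = 2/3 tasks pass this test even though no partitioned scheduler can handle them; feasibility here presumes migration is allowed.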

15 The Big Goal (Version 2) Design an optimal scheduling algorithm with fewer context switches and migrations. (Finding a feasible schedule with the fewest migrations is NP-complete.)

16 The Big Goal (Version 2) Design an optimal scheduling algorithm with fewer context switches and migrations. Status: solved, BUT the existing solutions are complex and confusing. Our contributions: a simple, unifying theory for optimal global multiprocessor scheduling, and a simple optimal algorithm.

17 Greedy Algorithms Fail on Multiprocessors A greedy scheduling algorithm repeatedly selects the m "best" jobs and runs those, e.g. Earliest Deadline First (EDF) or Least Laxity First (LLF). EDF and LLF are optimal on a single processor; neither is optimal on a multiprocessor. Greedy approaches generally fail on multiprocessors.
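The failure can be shown concretely with a small unit-step simulation of global EDF (an illustrative sketch, not the talk's own example; it reuses the three (p = 3, e = 2) tasks on m = 2 processors from slide 7, with ties broken by task index):

```python
def global_edf_first_miss(tasks, m, horizon):
    """Unit-step global EDF: at each time step, run the m pending jobs with
    the earliest deadlines. Returns the time of the first deadline miss,
    or None if no miss occurs before the horizon."""
    jobs = []  # each job is [absolute_deadline, remaining_work, task_index]
    for t in range(horizon):
        for i, (p, e) in enumerate(tasks):
            if t % p == 0:                  # periodic release, implicit deadline
                jobs.append([t + p, e, i])
        jobs.sort()                          # earliest deadline first
        for job in jobs[:m]:
            job[1] -= 1                      # run one unit of work
        jobs = [j for j in jobs if j[1] > 0]
        for d, rem, i in jobs:
            if d == t + 1 and rem > 0:
                return t + 1                 # a job reached its deadline unfinished
    return None

print(global_edf_first_miss([(3, 2)] * 3, m=2, horizon=9))  # 3 -- miss at t = 3
```

EDF greedily runs tasks 1 and 2 over [0, 2], leaving task 3 only one unit of time for its two units of work, whereas a migrating schedule meets all three deadlines.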

18 Example (n = 3 tasks, m = 2 processors) : Why Greedy Algorithms Fail On Multiprocessors

19 Why Greedy Algorithms Fail On Multiprocessors At time t = 0, Tasks 1 and 2 are the obvious greedy choice

20 Why Greedy Algorithms Fail On Multiprocessors Even at t = 8, Tasks 1 and 2 are the only “reasonable” greedy choice

21 Why Greedy Algorithms Fail On Multiprocessors Yet if Task 3 isn’t started by t = 8, the resultant idle time eventually causes a deadline miss

22 Why Greedy Algorithms Fail On Multiprocessors How can we “see” this critical event at t = 8?

23 Proportioned Algorithms Succeed On Multiprocessors Subdivide Task 3 with a period of 10

24 Proportioned Algorithms Succeed On Multiprocessors Now Task 3 has a zero laxity event at time t = 8

25 Proportional Fairness Insight: scheduling is easier when all jobs have the same deadline. Application: apply all deadlines to all jobs, assigning workloads proportional to utilization, so that work completed matches the fluid rate curve at every system deadline.

26 Proportional Fairness Is the Key All known optimal algorithms enforce proportional fairness at all deadlines: pfair (1996), by Baruah, Cohen, Plaxton, and Varvel (proportional fairness at all times); BF (2003), by Zhu, Mossé, and Melhem; LLREF (2006), by Cho, Ravindran, and Jensen; EKG (2006), by Andersson and Tovar. Why do they all use proportional fairness?

27 Scheduling Multiple Tasks Is Complicated [Figure: work-completed curves over time for several tasks with different deadlines.]

28 Scheduling Multiple Tasks with the Same Deadline Is Easy [Figure: work-completed curves over time for several tasks sharing one deadline.]

29 Actual Feasible Regions [Figure: work completed vs. time; each task's feasible region is delimited by its own job deadlines.]

30 Restricted Feasible Regions Under Deadline Partitioning [Figure: work completed vs. time; feasible regions are restricted by slicing time at all system deadlines.]

31 The DP-Fair Scheduling Policy Partition time into slices based on all system deadlines. Allocate each job a per-slice workload equal to its utilization times the length of the slice. Schedule jobs within each slice in any way that obeys three rules: 1. Always run a job with zero local laxity. 2. Never run a job with no workload remaining in the slice. 3. Do not voluntarily allow more idle processor time than (m − Σ u_i) × (length of slice).

32 DP-Fair Work Allocation [Figure: within a time slice, each job's allocated workload grows along a line whose slope equals its utilization, from zero at the slice start to its local workload at the slice end.]

33 DP-Fair Scheduling Rule #1 [Figure: within a time slice, when a job's work-completed curve hits the zero-local-laxity line, run the job until its local workload is complete.]

34 DP-Fair Scheduling Rule #2 [Figure: within a time slice, when a job's work-completed curve reaches the local-work-complete line, stop running the job for the rest of the slice.]

35 DP-Fair Scheduling Rule #3 [Figure: idle time vs. time within a slice; do not voluntarily allow idle time beyond a pool that grows at slope m − Σ u_i, versus the maximum possible slope m.]

36 DP-Fair Guarantees Optimality We say that a scheduling algorithm is DP-Fair if it follows these three rules. Theorem: any DP-Fair scheduling algorithm for periodic tasks is optimal.

37 Proof of Theorem

38 DP-Fair Implications (Partition time into slices) + (Assign proportional workloads) makes optimal scheduling almost trivial. The minimally restrictive rules allow great latitude for algorithm design and adaptability. What is the simplest possible algorithm?

39 The DP-Wrap Algorithm All time slices are equivalent (up to linear scaling), so normalize the slice length to 1; jobs then have workload = utilization. Schedule by "stacking and slicing": line up the jobs' utilization blocks end to end, then slice at integer boundaries.
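The stack-and-slice step within one normalized slice can be sketched as follows. The specific utilizations below are invented to mirror the 3-processor, 7-task example on the next slides; they are not taken from the deck:

```python
from fractions import Fraction

def dp_wrap(utilizations, m):
    """DP-Wrap within one unit-length slice: lay the tasks' utilization
    blocks end to end along [0, m) and cut at integer boundaries, so each
    task runs in at most two intervals on at most two processors."""
    assert sum(utilizations) <= m, "total utilization must fit on m processors"
    assign = []   # (task_index, processor, start, end), with 0 <= start < end <= 1
    pos = Fraction(0)
    for i, u in enumerate(utilizations):
        end = pos + u
        while pos < end:
            proc = int(pos)                        # which unit-length bin we are in
            chunk_end = min(end, Fraction(proc + 1))
            assign.append((i, proc, pos - proc, chunk_end - proc))
            pos = chunk_end
    return assign

# 7 tasks on 3 processors (utilizations sum to exactly 3):
us = [Fraction(x, 10) for x in (5, 5, 4, 4, 4, 4, 4)]
for task, proc, s, e in dp_wrap(us, m=3):
    print(f"task {task}: processor {proc}, runs [{s}, {e}]")
```

Only the tasks cut by an integer boundary (at most m − 1 of them) are split across two processors, which is where the m − 1 migrations per slice come from.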

40 The DP-Wrap Algorithm Example: 3 processors, 7 tasks [Figure: the 7 tasks' utilization blocks are lined up end to end and sliced at integer boundaries onto the 3 processors.]


44 DP-Wrap Performance n−1 context switches and m−1 migrations per time slice: about 1/3 as many as LLREF, similar to BF, and somewhat more than EKG. DP-Wrap also has very low, O(n), computational complexity.

45 Extending DP-Fair: Sporadic Arrivals, Arbitrary Deadlines Sporadic arrivals: jobs from a task arrive at least p time units apart Arbitrary deadlines: A job from T = (p, e, D) has a deadline D time units after arrival; D may be smaller or larger than p DP-Fair and DP-Wrap are easily modified to handle these cases

46 DP-Fair Easily Extended Extending optimal algorithms to sporadic tasks and arbitrary deadlines has previously been difficult: pfair took 10 years; LLREF took 3 years. DP-Fair & DP-Wrap: 3 weeks.

Ongoing Work Extend to uniform multiprocessors (CPUs running at different speeds); extend to a random, unrelated job model; heuristic improvements to DP-Wrap, such as work-conserving scheduling and slice merging.

48 Summary DP-Fair: simple rules that make optimal real-time multiprocessor scheduling easy!! DP-Wrap: the simplest known optimal multiprocessor scheduling algorithm. Extensible: DP-Fair and DP-Wrap easily handle sporadic tasks with arbitrary deadlines, and we plan to extend them to more general problem domains.

49 Thanks for Listening Questions?

50 === EXTRA SLIDES === 1. Sporadic Tasks, Arbitrary Deadlines

51 Sporadic Arrivals, Arbitrary Deadlines We extend the task definition to T_i = (p_i, e_i, D_i), where each job of T_i has a deadline D_i time units after its arrival; D_i may be smaller or larger than p_i (arbitrary deadlines). Sporadic arrivals mean that p_i is now the minimum inter-arrival time for T_i's jobs: a new job arrives at least p_i time units after the previous arrival. Scheduling is on-line, i.e. we do not know ahead of time when new jobs will arrive.

52 Task Density Our expression for utilization must be updated to the density of a task: δ_i = e_i / min{ p_i, D_i }. The conditions δ_i ≤ 1 for all i and Σ δ_i ≤ m are still sufficient for the feasibility of a task set, but they are no longer necessary. Example: T_1 = (2, 1, 1), T_2 = (2, 1, 2) on m = 1 is feasible, although Σ δ_i = 1.5.
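The density computation and the counterexample can be checked directly. A sketch (the feasibility of the example rests on the interleaved schedule described in the comment, not on the code):

```python
from fractions import Fraction

def density(p, e, d):
    """Density of a task T = (p, e, D): delta = e / min(p, D)."""
    return Fraction(e, min(p, d))

# T1 = (2, 1, 1) and T2 = (2, 1, 2) on m = 1 processor:
deltas = [density(2, 1, 1), density(2, 1, 2)]
print(sum(deltas))  # 3/2 -- exceeds m = 1, yet the set is feasible:
# run T1 in [0, 1) and T2 in [1, 2), repeating every 2 time units,
# so the density conditions are sufficient but not necessary.
```

This is why a scheduler that requires Σ δ_i ≤ m, as the updated DP-Fair rules do, cannot be optimal in this domain.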

53 Optimality The updated DP-Fair rules for sporadic tasks with arbitrary deadlines allow an algorithm to schedule any task set with δ_i ≤ 1 for all i and Σ δ_i ≤ m; in fact, we require that these conditions hold. Since feasible task sets exist where these conditions do not hold, DP-Fair algorithms are not optimal in this problem domain. This is an acceptable limitation, since it has recently been proven that no optimal scheduler can exist for sporadic tasks.

54 Updated DP-Fair Scheduling The rules for scheduling within time slices are nearly the same with sporadic tasks and arbitrary deadlines. The third rule now caps the idle time voluntarily allowed before time t within the slice at (m − Σ δ_i)·(t − t_0) plus the terms F_j(t), where t_0 is the start of the slice and the F_j(t) term represents unused capacity between task j's deadlines and next arrivals.

55 Updated DP-Fair Allocation The rules for allocating workloads in time slices become more complicated than simple proportionality. A task is active between its arrival and its deadline; when a task becomes active, it is allocated a workload proportional to its density and the length of its active segment in the slice. If a job arriving during a slice has a deadline within that slice, the remainder of the slice is partitioned into two subslices with a boundary at the deadline, and the remaining work for all tasks is allocated proportionally to the subslice lengths.

56 Handling a Sporadic Arrival [Figure: work completed vs. time; a sporadic job arrives in the middle of a time slice and is allocated a workload upon arrival.]

57 Managing Idle Time [Figure: idle time vs. time within a slice; the original idle-time pool grows at slope m − Σ u_i, versus the maximum possible slope m. Available idle time grows by the utilization of an unarrived sporadic job until that job finally arrives, yielding additional idle time from the late arrival.]

58 Theorem 3 Any DP-Fair scheduling algorithm for sporadic tasks with arbitrary deadlines will successfully schedule any task set where δ_i ≤ 1 for all i and Σ δ_i ≤ m.

59 Modifying DP-Wrap The DP-Wrap algorithm is easily modified to accommodate the DP-Fair scheduling rules for sporadic tasks with arbitrary deadlines. If a job arrives during a time slice, wrap it onto the end of the stack of blocks. If the arriving job has a deadline within the current slice, partition the remainder of the slice into two subslices, as per the previous rule, and apply the usual scheduling rules to both subslices.