
RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor Paul Regnier † George Lima † Ernesto Massa † Greg Levin ‡ Scott Brandt ‡


1 RUN: Optimal Multiprocessor Real-Time Scheduling via Reduction to Uniprocessor. Paul Regnier †, George Lima †, Ernesto Massa †, Greg Levin ‡, Scott Brandt ‡. † Federal University of Bahia, Brazil; ‡ University of California, Santa Cruz.

2 Multiprocessors. Most high-end computers today have multiple processors. In a busy computational environment, multiple processes compete for processor time. More processors means more scheduling complexity.

3 Real-Time Multiprocessor Scheduling. Real-time tasks have workload deadlines; hard real-time = "Meet all deadlines!" Problem: Schedule a set of periodic, hard real-time tasks on a multiprocessor system so that all deadlines are met.

4 Example: EDF on One Processor. On a single processor, Earliest Deadline First (EDF) is optimal (it can schedule any feasible task set). Task 1: 2 units of work for every 10 units of time; Task 2: 6 units of work for every 15 units of time; Task 3: 10 units of work for every 25 units of time. [Figure: EDF schedule of the three tasks on one CPU over time = 0 to 30.]
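
To make the EDF rule concrete, here is a minimal discrete-time sketch (not from the presentation; the task parameters follow the slide, everything else is illustrative) that runs EDF on the three example tasks and checks for deadline misses.

```python
# Minimal sketch (not from the presentation): unit-step EDF on one processor.
# Each task is a (period, workload) pair; a job released at time t must finish
# its workload by t + period.

def edf_uniprocessor(tasks, horizon):
    """Simulate EDF in unit time steps; return True iff no deadline is missed."""
    jobs = []  # each job is [deadline, remaining work]
    for t in range(horizon):
        for period, work in tasks:
            if t % period == 0:                  # new job released at time t
                jobs.append([t + period, work])
        if any(t >= d and rem > 0 for d, rem in jobs):
            return False                         # deadline miss
        jobs = [j for j in jobs if j[1] > 0]     # drop completed jobs
        if jobs:
            job = min(jobs, key=lambda j: j[0])  # earliest deadline first
            job[1] -= 1                          # run it for one time unit
    return True

# The slide's tasks: (10, 2), (15, 6), (25, 10); total utilization is exactly 1,
# so EDF (being optimal on one processor) should meet every deadline.
print(edf_uniprocessor([(10, 2), (15, 6), (25, 10)], horizon=150))  # True
```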

5 Scheduling Three Tasks. Example: 2 processors; 3 tasks, each with 2 units of work required every 3 time units. [Figure: Tasks 1-3 each release a job at time = 0 with a deadline at time 3.]

6 Global Schedule. Example: 2 processors; 3 tasks, each with 2 units of work required every 3 time units. [Figure: schedule on CPU 1 and CPU 2 over time = 0 to 3; Task 1 migrates between processors.]

7 Taxonomy of Multiprocessor Scheduling Algorithms. [Figure: taxonomy diagram. Uniprocessor algorithms (EDF, LLF) extend to the multiprocessor case as partitioned algorithms (e.g., Partitioned EDF), globalized uniprocessor algorithms (e.g., Global EDF), and dedicated global algorithms, which include the optimal algorithms pfair, LLREF, EKG, and DP-Wrap.]

8 Problem Model. n tasks running on m processors. A periodic task T = (p, e) requires a workload e be completed within each period of length p. T's utilization u = e / p is the fraction of each period that the task must execute. [Figure: timeline with job releases at times 0, p, 2p, 3p; each period of length p contains workload e.]
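
A minimal encoding of this task model (a sketch; the class and field names are mine, not the paper's):

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Task:
    """Periodic task T = (p, e): workload e must complete within each period p."""
    period: int    # p
    workload: int  # e

    @property
    def utilization(self) -> Fraction:
        """u = e / p: the fraction of each period the task must execute."""
        return Fraction(self.workload, self.period)

print(Task(period=10, workload=2).utilization)  # 1/5
```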

9 Assumptions. Processor Identity: all processors are equivalent. Task Independence: tasks are independent. Task Unity: tasks run on one processor at a time. Task Migration: tasks may run on different processors at different times. No Overhead: context switches and migrations are free (in practice, built into WCET estimates).

10 The Big Goal (Version 1). Design an optimal scheduling algorithm for periodic task sets on multiprocessors. A task set is feasible if there exists a schedule that meets all deadlines. A scheduling algorithm is optimal if it can schedule any feasible task set.

11 Necessary and Sufficient Conditions. Any set of tasks needing at most (1) one processor for each task (u_i ≤ 1 for all i) and (2) m processors for all tasks (Σ u_i ≤ m) is feasible. Status: Solved. pfair (1996) was the first optimal algorithm.
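
The condition is mechanical to check; a self-contained sketch over (period, workload) pairs (my representation, not the paper's):

```python
from fractions import Fraction

def feasible(tasks, m):
    """tasks: list of (period, workload) pairs; m: number of processors.
    Checks the slide's condition: u_i <= 1 for every task and sum(u_i) <= m."""
    utils = [Fraction(e, p) for p, e in tasks]
    return all(u <= 1 for u in utils) and sum(utils) <= m

# The earlier EDF example has total utilization exactly 1, so it is feasible on m = 1.
print(feasible([(10, 2), (15, 6), (25, 10)], m=1))  # True
```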

12 The Big Goal (Version 2). Design an optimal scheduling algorithm with fewer context switches and migrations. (Finding a feasible schedule with the fewest migrations is NP-complete.)

13 The Big Goal (Version 2). Design an optimal scheduling algorithm with fewer context switches and migrations. Status: Ongoing… All existing improvements over pfair use some form of deadline partitioning…

14 Deadline Partitioning. [Figure: the deadlines of Tasks 1-4 partition the timeline into windows.]

15 Deadline Partitioning. [Figure: the resulting schedule on CPU 1 and CPU 2.]

16 The Big Goal (Version 2). Design an optimal scheduling algorithm with fewer context switches and migrations. Status: Ongoing… All existing improvements over pfair use some form of deadline partitioning… but all the activity in each time window still leads to a large number of preemptions and migrations. Our Contribution: the first optimal algorithm that does not depend on deadline partitioning, and which therefore has much lower overhead.

17 The RUN Scheduling Algorithm. RUN uses a sequence of reduction operations to transform a multiprocessor problem into a collection of uniprocessor problems. Uniprocessor scheduling is solved optimally by Earliest Deadline First (EDF), and the uniprocessor schedules are transformed back into a single multiprocessor schedule. A reduction operation is composed of two steps: the packing operation and the dual operation.

18 The Packing Operation. A collection of tasks with total utilization at most one can be packed into a single fixed-utilization task. Example: Task 1 (u = 0.1), Task 2 (u = 0.3), and Task 3 (u = 0.4) pack into a single packed task with u = 0.8.
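
The presentation does not say which packing heuristic RUN uses, so the sketch below illustrates the idea with plain first-fit over utilizations; on the slide's example it produces the single packed task with u = 0.8.

```python
from fractions import Fraction

def pack(utilizations):
    """First-fit sketch: group utilizations into 'packed tasks' (bins) whose
    totals never exceed 1.  The actual RUN packing heuristic may differ."""
    bins = []  # each bin: [total utilization, list of client utilizations]
    for u in utilizations:
        for b in bins:
            if b[0] + u <= 1:
                b[0] += u
                b[1].append(u)
                break
        else:
            bins.append([u, [u]])
    return bins

# Slide's example: u = 0.1, 0.3, 0.4 fit into a single packed task with u = 0.8.
for total, clients in pack([Fraction(1, 10), Fraction(3, 10), Fraction(2, 5)]):
    print("packed task u =", total, "with clients", clients)
```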

19 Scheduling a Packed Task's Clients. We divide time at the packed task's deadlines and schedule its clients with EDF. In each segment, the packed task consumes processor time according to its total utilization. [Figure: EDF schedule of Task 1 (p=5, u=.1), Task 2 (p=3, u=.3), and Task 3 (p=2, u=.4).]

20 Defining a Packed Task. The packed task temporarily replaces (acts as a proxy for) its clients. It may not be periodic, but it has a fixed utilization in each segment. [Figure: the packed task's execution shown above the EDF schedule of Task 1 (p=5, u=.1), Task 2 (p=3, u=.3), and Task 3 (p=2, u=.4).]
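
One way to read "fixed utilization in each segment" concretely: between consecutive client deadlines, the packed task is allotted a budget equal to its total utilization times the segment length. A small sketch under that reading, using the slide's client periods (the function and its interface are my own, not the paper's):

```python
from fractions import Fraction

def segment_budgets(client_periods, total_u, horizon):
    """Split [0, horizon) at every client deadline; in each segment the packed
    task's budget is total_u * segment_length (illustrative sketch only)."""
    deadlines = sorted({k * p for p in client_periods
                        for k in range(1, horizon // p + 1)} | {horizon})
    budgets, start = [], 0
    for d in deadlines:
        budgets.append((start, d, total_u * (d - start)))
        start = d
    return budgets

# Slide's clients: p = 5 (u=.1), p = 3 (u=.3), p = 2 (u=.4); packed task u = 0.8.
for lo, hi, b in segment_budgets([5, 3, 2], Fraction(4, 5), horizon=10):
    print(f"[{lo}, {hi}): budget {b}")
```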

21 The Dual of a Task. The dual of a task T is a task T* with the same deadlines but complementary utilization/workload. Example: T (u = 0.4, p = 3) has dual T* (u = 0.6, p = 3).

22 The Dual of a System. Given a system of n tasks and m processors, assume full utilization (Σ u_i = m). The dual system consists of the n dual tasks running on n - m processors. Note that the dual utilizations sum to Σ u_i* = n - Σ u_i = n - m. If n < 2m, the dual system has fewer processors. The dual task T* represents the idle time of T: T* is executed in the dual system precisely when T is idle, and vice versa.
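
At the level of utilizations, the dual operation is just complementation; a small sketch using the packed tasks from the example later in the talk (the representation is mine):

```python
from fractions import Fraction

def dual_system(utilizations, m):
    """Dual of a fully utilized system (sum(u_i) == m): each task u becomes a
    dual task with utilization 1 - u and the same deadlines, and the n dual
    tasks need only n - m processors (sketch)."""
    assert sum(utilizations) == m, "RUN assumes full utilization"
    duals = [1 - u for u in utilizations]
    dual_m = len(utilizations) - m
    assert sum(duals) == dual_m            # n - sum(u_i) = n - m
    return duals, dual_m

# Packed tasks with u = .8, .7, .5 on m = 2 processors dualize to
# utilizations 1/5, 3/10, 1/2 on a single processor.
duals, dual_m = dual_system([Fraction(4, 5), Fraction(7, 10), Fraction(1, 2)], m=2)
print(duals, "on", dual_m, "processor(s)")
```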

23 The Dual of a System. [Figure: Tasks 1-3 of the original system.]

24 The Dual of a System. [Figure: the dual system of Tasks 1-3 over time = 0 to 3.]

25 The Dual of a System. Because each dual task represents the idle time of its primal task, scheduling the dual system is equivalent to scheduling the original system. [Figure: the original system and the dual system over time = 0 to 3.]

26 The RUN Algorithm. Reduction = Packing + Dual. Keep packing until the remaining tasks satisfy u_i + u_j > 1 for all pairs of tasks T_i, T_j. A schedule of the dual system yields a schedule for the "primal" system; packed tasks in the schedule are then replaced with their clients, ordered by EDF.
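
Putting the two steps together gives the off-line reduction. The sketch below iterates pack-then-dual on utilizations until at most one (virtual) processor remains; the first-fit packing and the bookkeeping are my assumptions, and the real RUN also records which tasks were packed at each level so the dual schedule can be unpacked back into a primal schedule at run time (as on slides 32-35).

```python
from fractions import Fraction

def pack_step(utils):
    """First-fit packing of utilizations into bins with totals <= 1 (sketch)."""
    bins = []
    for u in utils:
        for i, b in enumerate(bins):
            if b + u <= 1:
                bins[i] = b + u
                break
        else:
            bins.append(u)
    return bins

def count_reductions(utils, m):
    """Apply pack + dual until at most one processor remains; return the number
    of reduction levels.  Sketch only: the real RUN also remembers the packing
    at each level so the dual schedule can be unpacked again."""
    assert sum(utils) == m, "RUN assumes full utilization (else pad with dummy tasks)"
    levels = 0
    while m > 1:
        packed = pack_step(utils)         # packing: group tasks into packed tasks
        utils = [1 - u for u in packed]   # dual: complementary utilizations
        m = len(packed) - m               # dual system runs on n - m processors
        levels += 1
    return levels                         # m == 0 means packing gave a perfect partition

# Slides 27-31: five tasks on two processors need a single reduction level.
example = [Fraction(1, 5), Fraction(3, 5), Fraction(3, 10), Fraction(2, 5), Fraction(1, 2)]
print(count_reductions(example, m=2))  # 1
```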

27 RUN: A Simple Example. Tasks: Task 1 (u=.2), Task 2 (u=.6), Task 3 (u=.3), Task 4 (u=.4), Task 5 (u=.5). Step 1: Pack tasks until all pairs have utilization > 1. Tasks 1 and 2 pack into Task 12 (u=.8).

28 RUN: A Simple Example. Step 1 (continued): Tasks 3 and 4 pack into Task 34 (u=.7), leaving Task 12 (u=.8), Task 34 (u=.7), and Task 5 (u=.5).

29 RUN: A Simple Example. Step 2: Find the dual system on n - m = 3 - 2 = 1 processor for Task 12 (u=.8), Task 34 (u=.7), and Task 5 (u=.5).

30 RUN: A Simple Example. Step 2 (continued): The dual tasks are Task 12* (u=.2), Task 34* (u=.3), and Task 5* (u=.5).

31 RUN: A Simple Example. Step 3: Schedule the uniprocessor dual system with EDF. [Figure: EDF schedule of Task 12* (u=.2), Task 34* (u=.3), and Task 5* (u=.5) on the dual CPU over time = 0 to 6.]

32 RUN: A Simple Example. Step 4: Schedule the packed tasks from the dual schedule. [Figure: the dual CPU schedule over time = 0 to 6.]

33 RUN: A Simple Example. Step 4 (continued). [Figure: the dual CPU schedule determines the packed-task schedule on CPU 1 and CPU 2 over time = 0 to 6.]

34 RUN: A Simple Example. Step 5: Packed tasks schedule their clients with EDF. [Figure: the dual CPU, CPU 1, and CPU 2 schedules over time = 0 to 6.]

35 RUN: A Simple Example. The original task set has been successfully scheduled! [Figure: the final schedule of Task 1 (u=.2), Task 2 (u=.6), Task 3 (u=.3), Task 4 (u=.4), and Task 5 (u=.5) on CPU 1 and CPU 2 over time = 0 to 6.]

36 RUN: A Few Details. Several reductions may be necessary to produce a uniprocessor system. Each reduction reduces the number of processors by at least half. On random task sets with 100 processors and hundreds of tasks, fewer than 1 in 600 task sets required more than 2 reductions. RUN requires Σ u_i = m; when this is not the case, dummy tasks may be added during the first packing step to create a partially or entirely partitioned system.
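
The padding step can be sketched very simply: compute the slack m - Σ u_i and add dummy (always-idle) utilization to cover it before the first packing step. This is only one way to do it; the representation and the choice of whole dummies plus a fractional remainder are my assumptions, and the presentation notes the dummies can be chosen so that packing yields a partially or entirely partitioned system.

```python
from fractions import Fraction

def pad_with_dummies(utilizations, m):
    """RUN requires sum(u_i) == m.  If the task set uses less, add dummy (idle)
    utilizations that make up the slack before the first packing step (sketch)."""
    slack = m - sum(utilizations)
    assert slack >= 0, "task set needs more than m processors"
    dummies = [Fraction(1)] * int(slack)           # whole idle processors
    if slack != int(slack):
        dummies.append(slack - int(slack))         # fractional remainder
    return utilizations + dummies

# Example: u = 1/2 and 3/4 on m = 3 processors leaves slack 7/4, padded here as
# one full dummy plus a 3/4 dummy, so total utilization becomes exactly 3.
print(pad_with_dummies([Fraction(1, 2), Fraction(3, 4)], m=3))
```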

37 Proven Theoretical Performance. RUN is optimal. Reduction is done off-line, prior to execution, and takes O(n log n) time. Each scheduler invocation is O(n), with total scheduling overhead O(jn log m) when j jobs are scheduled. RUN suffers at most (3r+1)/2 preemptions per job on task sets requiring r reductions. Since r ≤ 2 for most task sets, this gives a theoretical upper bound of 4 preemptions per job. In practice, we never observed more than 3 preemptions per job on our randomly generated task sets.

38 Simulation and Comparison. We evaluated RUN in side-by-side simulations with the existing optimal algorithms LLREF, DP-Wrap, and EKG. Each data point is the average of 1000 task sets, generated uniformly at random with utilizations in the range [0.1, 0.99] and integer periods in the range [5, 100]. Simulations were run for 1000 time units each. Values shown are average migrations and preemptions per job.

39 Comparison Varying Processors. The number of processors varies from m = 2 to 32, with 2m tasks and 100% utilization.

40 Comparison Varying Utilization. Total utilization varies from 55% to 100%, with 24 tasks running on 16 processors.

41 Ongoing Work. Heuristic improvements to RUN. Extend RUN to broader problems: sporadic arrivals, arbitrary deadlines, non-identical multiprocessors, etc. Develop general theory and new examples of non-deadline-partitioning algorithms.

42 Summary. The RUN Algorithm: is the first optimal algorithm without deadline partitioning; outperforms existing optimal algorithms by a factor of 5 on large systems; scales well to larger systems; and reduces gracefully to the very efficient Partitioned EDF on any task set that Partitioned EDF can schedule.

43 Thanks for Listening. Questions?

