
1 Power Aware Real-time Systems. Rami Melhem. A joint project with Daniel Mosse, Bruce Childers, and Mootaz Elnozahy.

2 Outline
• Introduction to real-time systems
• Voltage and frequency scaling (no slide)
• Speed adjustment in frame-based systems
• Dynamic speed adjustment in a multiprocessor environment
• Speed adjustment in general periodic systems
• Static speed adjustment for tasks with different power consumptions
• Maximizing computational reward for given energy and deadline
• Tradeoff between energy consumption and reliability
• The Pecan board

3 Real-time systems: a taxonomy. Real-time systems are either hard or soft. Hard RT systems are either periodic or aperiodic (frame-based); scheduling may be preemptive or non-preemptive, and may target a uni-processor or parallel processors.

4 Periodic tasks, Rate Monotonic Scheduling (RMS): static priority scheduling (higher priority to the task with the shorter period). Consider n tasks with maximum computation times C_i and periods T_i, for i = 1, …, n. The Liu and Layland RMS utilization bound is n (2^(1/n) - 1). Example: C_1 = 2, T_1 = 4 and C_2 = 3, T_2 = 7. If C_2 = 3.1, then task 2 will miss its deadline although the utilization 2/4 + 3.1/7 ≈ 0.94 is less than 1.

5 Periodic tasks, Earliest Deadline First (EDF) scheduling: dynamic priority scheduling (higher priority to the task with the earlier deadline). For n tasks with maximum computation times C_i and periods T_i, i = 1, …, n, all tasks meet their deadlines if the utilization is not more than 1. Example: C_1 = 2, T_1 = 4 and C_2 = 3, T_2 = 7.
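
The two schedulability tests on slides 4 and 5 are simple utilization checks. Below is a small illustrative sketch (the function names are mine, the numbers are the slide's example):

```python
def rms_schedulable(wcets, periods):
    """Sufficient RMS test: utilization <= n * (2**(1/n) - 1) (Liu and Layland bound)."""
    n = len(wcets)
    u = sum(c / t for c, t in zip(wcets, periods))
    return u <= n * (2 ** (1.0 / n) - 1)

def edf_schedulable(wcets, periods):
    """Exact EDF test when deadlines equal periods: utilization <= 1."""
    return sum(c / t for c, t in zip(wcets, periods)) <= 1.0

# The example from slide 4: C = (2, 3.1), T = (4, 7).
print(rms_schedulable([2, 3.1], [4, 7]))   # False: U ~ 0.94 exceeds the RMS bound ~ 0.83
print(edf_schedulable([2, 3.1], [4, 7]))   # True: under EDF both deadlines are met
```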

6 Static speed adjustment in frame-based systems. Assumption: all tasks have the same deadline. Select the CPU speed, between S_min and S_max, based on the worst-case execution time (WCET) and the deadline. [Figure: CPU speed versus time, showing execution stretched from S_max down to a constant lower speed so that the WCET just fits within the deadline.]
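
A minimal sketch of this static speed selection, assuming the WCET is expressed in time units at S_max and the returned speed is a fraction of S_max (the function name and the S_min clamp are my own):

```python
def static_speed(wcet_at_smax, deadline, s_min=0.1):
    """Run just fast enough that the worst case finishes exactly at the deadline."""
    s = wcet_at_smax / deadline            # required fraction of S_max
    return min(1.0, max(s_min, s))

# Example: a frame needing 6 time units at S_max, with a deadline of 12.
print(static_speed(6.0, 12.0))   # 0.5, i.e. half of S_max
```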

7 Dynamic speed adjustment techniques for linear code: speed adjustment based on the remaining WCET. Note: a task very rarely consumes its estimated worst-case execution time; the actual execution time (ACET in the figure) is usually much shorter, which leaves slack that can be reclaimed.

8 Speed adjustment based on remaining WCET: at each adjustment point the speed is set so that the remaining WCET fits in the time remaining until the deadline.

9 Speed adjustment based on remaining WCET (animation of the previous figure).

10 Speed adjustment based on remaining WCET (animation of the previous figure).

11 Dynamic speed adjustment for linear code, continued. Greedy speed adjustment: give the reclaimed slack to the next task rather than to all future tasks.
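
Slides 7-11 compare spreading reclaimed slack over all remaining tasks with giving it greedily to the next task only. The sketch below is my own rendering of those two schemes (not the authors' code); task times are given in units at S_max = 1.0 and the s_min floor is an assumption:

```python
def proportional_speeds(wcet, actual, deadline, s_min=0.1):
    """At each task boundary, spread all remaining time over the remaining WCET,
    so reclaimed slack benefits every remaining task."""
    t, speeds = 0.0, []
    remaining = sum(wcet)
    for w, a in zip(wcet, actual):
        time_left = deadline - t
        s = 1.0 if time_left <= remaining else max(s_min, remaining / time_left)
        speeds.append(s)
        t += a / s                       # the task actually runs for a/s time units
        remaining -= w
    return speeds, t

def greedy_speeds(wcet, actual, deadline, s_min=0.1):
    """Greedy variant: all slack reclaimed so far goes to the next task only;
    later tasks are still budgeted at their full WCET."""
    t, speeds = 0.0, []
    for i, (w, a) in enumerate(zip(wcet, actual)):
        budget = (deadline - t) - sum(wcet[i + 1:])   # time this task may use
        s = 1.0 if budget <= w else max(s_min, w / budget)
        speeds.append(s)
        t += a / s
    return speeds, t

# Hypothetical numbers: four tasks with WCET 4 each, deadline 16.
wcet, actual = [4.0, 4.0, 4.0, 4.0], [2.0, 3.0, 1.0, 4.0]
print(proportional_speeds(wcet, actual, 16.0))
print(greedy_speeds(wcet, actual, 16.0))
```

Both schemes meet the deadline by construction, since a task never takes more than the time budgeted for its WCET at the chosen speed.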

12 Speed adjustment based on the remaining average execution time: the speed is set using the remaining average execution time (AV) rather than the remaining WCET, while the remaining worst case (WCE) can still be accommodated at S_max within the remaining time. [Figure: WCET, ACET, and average remaining execution time versus the remaining time.]
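
One plausible formulation of this average-based scheme (my reading of the figure, not necessarily the authors' exact rule): budget the remaining average work at the reduced speed and reserve enough time to finish the remaining worst case at S_max.

```python
def avg_based_speed(remaining_avg, remaining_wcet, remaining_time, s_min=0.1):
    """Choose the lowest speed (fraction of S_max = 1.0) such that running the
    average remaining work slowly still leaves room to finish the worst case
    at full speed before the deadline."""
    reserve = remaining_wcet - remaining_avg      # worst-case work beyond the average
    usable = remaining_time - reserve             # time available for slowed execution
    if usable <= 0:
        return 1.0                                # no room to slow down: run at S_max
    return min(1.0, max(s_min, remaining_avg / usable))

# Hypothetical numbers: 10 time units left, remaining WCET 8, remaining average 4.
print(avg_based_speed(4.0, 8.0, 10.0))   # 4 / (10 - 4) ≈ 0.67 of S_max
```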

13 An alternate point of view: the time reclaimed from tasks that already finished earlier than their WCET is reclaimed slack, while the time speculatively borrowed from future work, on the expectation that it will finish in its average rather than worst-case time, is stolen slack. [Figure: WCET, ACET, and average execution times with the reclaimed and stolen slack marked.]

14 Dynamic speed adjustment techniques for non-linear code. The remaining WCET is based on the longest path; the remaining average-case execution time is based on the branching probabilities p1, p2, p3, … (obtained from trace information). [Figure: a branching flow graph annotated with the minimum, average, and maximum remaining execution time at each node.]
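
A small sketch of how these two remaining-time estimates can be computed over a branching flow graph; the Node class and the numbers are hypothetical:

```python
class Node:
    def __init__(self, cycles, branches=()):
        self.cycles = cycles                 # cycles of this basic block
        self.branches = list(branches)       # list of (successor, probability)

def remaining_wcet(node):
    """Worst case: follow the longest remaining path."""
    return node.cycles + max((remaining_wcet(c) for c, _ in node.branches), default=0.0)

def remaining_avg(node):
    """Average case: weight each branch by its measured probability."""
    return node.cycles + sum(p * remaining_avg(c) for c, p in node.branches)

# Hypothetical example: a 100-cycle block branching with probability 0.9 to a
# 50-cycle path and 0.1 to a 400-cycle path.
root = Node(100, [(Node(50), 0.9), (Node(400), 0.1)])
print(remaining_wcet(root))   # 500 (longest path)
print(remaining_avg(root))    # 100 + 0.9*50 + 0.1*400 = 185
```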

15 2. Periodic, non-frame-based systems.
• Each task has a WCET C_i and a period T_i.
• Earliest Deadline First (EDF) scheduling.
• Static speed adjustment: if the utilization U = Σ C_i/T_i < 1 (with C_i measured at S_max), then we can reduce the speed to U · S_max and still guarantee that all deadlines are met.
• Note: the average utilization U_av can be much less than U.
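
A minimal sketch of this static EDF speed adjustment (names and the S_min floor are mine), assuming WCETs are given in time units at S_max:

```python
def edf_static_speed(wcets, periods, s_max=1.0, s_min=0.1):
    u = sum(c / t for c, t in zip(wcets, periods))   # utilization at S_max
    if u > 1.0:
        raise ValueError("task set not schedulable under EDF even at S_max")
    return max(s_min, u * s_max)                     # slow down by the factor U

# Example from the earlier slides: C = (2, 3), T = (4, 7).
print(edf_static_speed([2, 3], [4, 7]))   # U ≈ 0.93, so run at ≈ 0.93 * S_max
```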

16 Greedy dynamic speed adjustment.
• Giving reclaimed slack to the next ready task is not always a correct scheme.
• Theorem: reclaimed slack has to be associated with a deadline and can safely be given to a task with an earlier or equal deadline.
[Figure: an example schedule where giving slack to a task with a later deadline is incorrect, and one where giving it to a task with an earlier or equal deadline is correct.]

17 Aggressive dynamic speed adjustment.
• Theorem: if tasks 1, …, k are ready and will all complete before the next task arrival, then we can swap the time allocations of the k tasks; that is, we can add stolen slack to the reclaimed slack.
• Experimental rule: do not be overly aggressive; never reduce the speed of a task below a certain speed (the optimal speed determined by U_av). (A sketch of the deadline-tagged slack bookkeeping from slide 16 follows below.)
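
The following sketch shows one way (my own bookkeeping, not the authors' implementation) to tag reclaimed slack with deadlines so that, per the theorem on slide 16, slack is only given to tasks with earlier or equal deadlines; for simplicity, claimed slack is not returned if it ends up unused.

```python
import heapq

class SlackPool:
    def __init__(self):
        self._items = []                       # min-heap of (deadline, amount)

    def release(self, deadline, amount):
        """Record slack reclaimed from a task with the given deadline."""
        if amount > 0:
            heapq.heappush(self._items, (deadline, amount))

    def claim(self, task_deadline):
        """Return the total slack usable by a task with the given deadline."""
        usable = 0.0
        while self._items and self._items[0][0] <= task_deadline:
            usable += heapq.heappop(self._items)[1]
        return usable

# Usage with hypothetical numbers: a task with deadline 20 finishes 3 units early;
# a task with deadline 25 may slow down using those units, one with deadline 15 may not.
pool = SlackPool()
pool.release(deadline=20, amount=3.0)
print(pool.claim(task_deadline=15))   # 0.0
print(pool.claim(task_deadline=25))   # 3.0
```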

18 Speed adjustment in multiprocessors: 1. the case of independent tasks on two processors. Tasks with worst-case times 6, 4, 4, 2 are dispatched from a global queue to P1 and P2; the deadline is 12. Canonical execution ==> all tasks consume their WCET. [Figure: Gantt charts with no speed management (tasks run at S_max) and with static speed management, where the worst-case times are stretched to 9, 6, 6, 3 so that canonical execution finishes exactly at the deadline.]

19 Dynamic speed adjustment. Non-canonical execution ==> tasks consume their actual execution times. If we select the initial speed based on WCET, can we do dynamic speed adjustment and still meet the deadline? [Figure: the task set with (worst-case, actual) execution-time pairs (6,5), (6,6), (9,3), (3,3), scheduled once with Greedy Slack Reclamation (GSR) and once with Slack Sharing (SS).]

20 (Animation of the previous figure: under Greedy Slack Reclamation, which hands all reclaimed slack to the next dispatched task, the schedule misses the deadline.)

21 (Animation of the previous figure: GSR misses the deadline, while Slack Sharing meets it.)
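
A sketch of the slack-sharing idea as I read it from slides 18-21 and 36 (not the authors' exact algorithm): each processor tracks when it would become free in the canonical, all-WCET schedule, and a processor that becomes free early swaps canonical free times with the processor that would dispatch the next task canonically before reclaiming the slack. The example numbers are for illustration only.

```python
def slack_sharing_schedule(tasks, num_procs, deadline):
    """tasks: list of (wcet, actual) in time units at the statically chosen speed."""
    actual_free = [0.0] * num_procs      # when each processor is really free
    canonical_free = [0.0] * num_procs   # when it would be free under all-WCET execution
    finish = 0.0
    for wcet, actual in tasks:
        p = min(range(num_procs), key=lambda i: actual_free[i])     # processor that is idle next
        q = min(range(num_procs), key=lambda i: canonical_free[i])  # canonical dispatcher
        # Share slack: p adopts q's canonical free time (and vice versa).
        canonical_free[p], canonical_free[q] = canonical_free[q], canonical_free[p]
        slack = canonical_free[p] - actual_free[p]   # non-negative by the slide-36 theorem
        budget = wcet + slack                        # time this task may take
        scale = budget / wcet                        # slow-down factor relative to static speed
        actual_free[p] += actual * scale
        canonical_free[p] += wcet
        finish = max(finish, actual_free[p])
    return finish <= deadline, finish

# Hypothetical two-processor example with deadline 12 (numbers for illustration only).
print(slack_sharing_schedule([(9, 3), (6, 5), (6, 6), (3, 3)], 2, 12.0))
```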

22 Dynamic speed adjustment (2 processors): 2. dependent tasks. The tasks form a dependence graph (worst-case times 6, 4, 3, 2, 6, 3, 2, 3) and are dispatched through a Ready_Q using list scheduling; the deadline is 12. [Figure: canonical (all-WCET) execution on P1 and P2.]

23 (Animation of the canonical list schedule.)

24 Assuming that we adjust the speed statically so that the canonical execution meets the deadline, can we reclaim unused slack dynamically and still meet the deadline?

25 Non-canonical execution with slack sharing: one task finishes early (actual time 1 instead of its worst case 3), and the freed slack is reclaimed by the tasks dispatched after it. [Figure: the canonical schedule shown next to the non-canonical execution.]

26 (Animation of the non-canonical execution.)

27 (Animation of the non-canonical execution.)

28 (Animation of the non-canonical execution.)

29 (Animation of the non-canonical execution.)

30 With dependent tasks, however, reclaiming slack this way can change the order in which tasks become ready relative to the canonical schedule, and the non-canonical execution misses the deadline.

31 Dynamic speed adjustment (2 processors). Solution: use a Wait_Q to enforce the canonical order in the Ready_Q. A task is put in the Wait_Q when its last predecessor starts execution; tasks are ordered in the Wait_Q by their expected start time under WCET; only the head of the Wait_Q can move to the Ready_Q. (A sketch of this gating rule is given after slide 35 below.)

32 (Animation of the Wait_Q example.)

33 (Animation of the Wait_Q example.)

34 (Animation of the Wait_Q example.)

35 With the Wait_Q enforcing the canonical dispatch order, the non-canonical execution with slack sharing meets the deadline.
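
A small sketch (my own data structure) of the Wait_Q gating rule from slide 31:

```python
import heapq
import itertools

class WaitQueue:
    """Tasks enter when their last predecessor starts executing and are ordered by
    their expected start time in the canonical (all-WCET) schedule; only the head
    may move on to the Ready_Q."""
    def __init__(self):
        self._heap = []
        self._tie = itertools.count()    # avoids comparing task objects on ties

    def add(self, task, canonical_start):
        heapq.heappush(self._heap, (canonical_start, next(self._tie), task))

    def try_promote(self, ready_queue, all_preds_finished):
        """Move the head (and then the new head, and so on) to the Ready_Q once
        all of its predecessors have actually completed (my reading of when the
        move happens)."""
        while self._heap and all_preds_finished(self._heap[0][2]):
            ready_queue.append(heapq.heappop(self._heap)[2])
```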

36 Theoretical results. For independent tasks, if canonical execution finishes at time T, then non-canonical execution with slack sharing finishes at or before time T. For dependent tasks, if canonical execution finishes at time T, then non-canonical execution with slack sharing and a wait queue finishes at or before time T. Implication: we can optimize energy based on WCET (static speed adjustment), and at run time we can use reclaimed slack to further reduce energy (dynamic speed adjustment) while still guaranteeing deadlines.

37 Simulation results. We simulated task graphs from real applications (matrix operations and the solution of linear equations) and assumed that power consumption is proportional to S^3. Typical results are as follows: [Figure: normalized energy consumption of static versus dynamic speed/voltage adjustment, plotted against the average dynamic slack on 2 processors and against the number of processors at 50% dynamic slack.]

38 4. Static optimization when different tasks consume different power. Let C_i be the number of cycles needed to complete task i. If the power consumption functions are identical for all tasks, then to minimize the total energy all tasks have to execute at the same speed. If, however, the power functions P_i(S) are different for tasks i = 1, …, n, then using the same speed for all tasks does not minimize energy consumption. [Figure: three tasks sharing the interval between the start time and the deadline.]

39 Minimizing energy consumption. Example: three tasks with C_1 = C_2 = C_3 share the interval between the start time and the deadline D. If P_i(S) = a_i S^2 for task i, then the energy consumed by task i is E_i = P_i(S_i) t_i = a_i C_i^2 / t_i, which is proportional to a_i / t_i since the C_i are equal. If a_1 = a_2 = a_3, then t_1 = t_2 = t_3 = D/3 minimizes the total energy.

40 Minimizing energy consumption (continued). If a_1, a_2, and a_3 are different, then t_1 = t_2 = t_3 does not minimize the total energy; the optimal allocation gives more time (a lower speed) to the tasks with larger a_i.

41 The general problem is to find the speeds S_i, i = 1, …, n, that minimize the total energy Σ P_i(S_i) t_i subject to the deadline constraint Σ t_i ≤ D, where t_i = C_i / S_i. We solved this optimization problem, consequently developing a solution for arbitrary convex power functions. Algorithm complexity: O(n^2 log n).
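
For the quadratic example on slides 39-40 the optimal allocation has a closed form (the general convex case requires the O(n^2 log n) algorithm mentioned above): minimizing Σ a_i C_i^2 / t_i subject to Σ t_i = D gives, via a Lagrange multiplier, t_i proportional to C_i · sqrt(a_i). A short sketch (the function name is mine, speeds are not capped at S_max):

```python
from math import sqrt

def quadratic_time_allocation(cycles, a, deadline):
    """Optimal time shares when P_i(S) = a_i * S^2: t_i ~ C_i * sqrt(a_i)."""
    weights = [c * sqrt(ai) for c, ai in zip(cycles, a)]
    total = sum(weights)
    times = [deadline * w / total for w in weights]
    speeds = [c / t for c, t in zip(cycles, times)]
    return times, speeds

# Example: three tasks with equal cycle counts but different power coefficients.
times, speeds = quadratic_time_allocation([1.0, 1.0, 1.0], [1.0, 4.0, 9.0], 6.0)
print(times)    # [1.0, 2.0, 3.0] -- tasks with larger a_i get more time (run slower)
print(speeds)
```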

42 Maximizing the system's reward. General problem assumptions: tasks have different power/speed functions, and tasks have different rewards as functions of the number of executed cycles. [Figure: per-task reward curves as functions of executed cycles C1, C2, C3, and per-task power curves as functions of speed, with operating points S1, S2, S3.]

43 Maximizing the system's reward. Theorem: if the power functions are all of the form a_i S^p (for example, all quadratic or all cubic in S), then the reward is maximized when the power is the same for all tasks. Given the speeds, we know how to maximize the total reward while meeting the deadline. [Figure: power curves with a common power level determining the speeds S1, S2, S3, and the resulting speed schedule over time under the energy and deadline constraints.]

44 Maximizing the system's reward. If the power functions of the tasks are not all of the form a_i S^p:
1) Ignore energy and maximize the reward R within the deadline.
2) If the available energy is exceeded, remove a time increment Δt from a task such that the decrease in R is minimal, and use Δt to decrease the speed of a task so as to maximize the decrease in energy consumption.

45 3) Repeat step 2 until the energy constraints are satisfied. (A sketch of this greedy loop is given below.)
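
A heavily simplified sketch of the greedy loop on slides 44-45, under a model of my own choosing (per-task reward as a function of executed cycles, power as a function of speed); it is meant only to illustrate the donor/receiver trade, not to reproduce the authors' algorithm.

```python
# Each task i currently runs c[i] cycles at speed s[i] in time t[i] = c[i]/s[i];
# reward[i] and power[i] are given callables.  While the energy budget is exceeded,
# a small time increment dt is taken from the task whose reward shrinks least and
# used to slow down the task whose energy shrinks most.

def energy(c, s, power):
    return sum(power[i](s[i]) * (c[i] / s[i]) for i in range(len(c)))

def greedy_reward_energy_trade(c, s, reward, power, e_budget, dt=0.01, max_iter=10000):
    for _ in range(max_iter):
        if energy(c, s, power) <= e_budget:
            break
        # Donor: removing dt of time (i.e. s[i]*dt cycles) costs the least reward.
        donor = min(range(len(c)),
                    key=lambda i: reward[i](c[i]) - reward[i](max(0.0, c[i] - s[i] * dt)))
        # Receiver: stretching its time by dt (same cycles, lower speed) saves the most energy.
        def saving(i):
            t_new = c[i] / s[i] + dt
            return power[i](s[i]) * (c[i] / s[i]) - power[i](c[i] / t_new) * t_new
        receiver = max((i for i in range(len(c)) if i != donor), key=saving, default=donor)
        c[donor] = max(0.0, c[donor] - s[donor] * dt)                  # shrink the donor's work
        s[receiver] = c[receiver] / (c[receiver] / s[receiver] + dt)   # slow the receiver down
    return c, s
```

In each round the total allotted time is conserved: the donor's work shrinks by s[donor]·dt, so its time shrinks by dt, while the receiver keeps its cycles but runs them for dt longer at a lower speed.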

46 Tradeoff between energy & dependability. Basic hypothesis:
• Dependable systems must include redundant capacity in either time or space (or both).
• Redundancy can also be exploited to reduce power consumption.
Time redundancy (checkpointing and rollbacks) versus space redundancy.

47 Exploring time redundancy. The slack before the deadline can be used to 1) add checkpoints, 2) reserve recovery time, and 3) reduce the processing speed below S_max. For a given number of checkpoints, we can find the speed that minimizes energy consumption while guaranteeing recovery and timeliness.

48 Optimal number of checkpoints. More checkpoints mean more overhead but less recovery slack to reserve. For a given load (the ratio C/D) and checkpoint overhead ratio (r/C), where C is the worst-case computation time, D the deadline, and r the overhead of one checkpoint, we can find the number of checkpoints that minimizes energy consumption while guaranteeing recovery and timeliness. [Figure: energy versus number of checkpoints, showing a minimum at an intermediate value.]
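
One simple model of this tradeoff (the assumptions about where recovery runs and how energy is estimated are mine, not necessarily the talk's model): with n checkpoints of overhead r each, reserve time to re-execute one segment (C/n) at S_max, run the rest at the lowest feasible speed, and estimate fault-free energy as C·S^2 for the computation plus n·r for the checkpoints.

```python
def best_checkpoint_count(C, D, r, n_max=100):
    """C: worst-case work in time units at S_max = 1; D: deadline; r: checkpoint overhead."""
    best = None
    for n in range(1, n_max + 1):
        budget = D - n * r - C / n          # time left for slowed-down execution
        if budget <= 0:
            continue
        s = min(1.0, C / budget)            # required speed (cannot exceed S_max)
        if C / s + n * r + C / n > D + 1e-9:
            continue                        # infeasible even at S_max
        energy = C * s ** 2 + n * r         # crude energy estimate (P ~ S^3, t = C/S)
        if best is None or energy < best[1]:
            best = (n, energy, s)
    return best

# Hypothetical example: C = 6, D = 12, checkpoint overhead r = 0.2.
print(best_checkpoint_count(6.0, 12.0, 0.2))   # minimum energy at an intermediate n
```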

49 Non-uniform checkpointing. Observation: we may continue executing at S_max after a recovery. Advantage: recovery in an early section can use the slack created by executing the later sections at S_max. Disadvantage: energy consumption increases when a fault occurs (a rare event). This requires non-uniform checkpoint placement.

50 Non-uniform checkpointing (continued): we can find the number of checkpoints, their distribution, and the CPU speed such that energy is minimized, recovery is guaranteed, and deadlines are met.

51 The Pecan board: an experimental platform for power management research.
• Profiling the power usage of applications.
• Development of active software power control.
• An initial prototype for thin/dense servers based on embedded PowerPC processors.
• A software development platform for PPC405 processors.
[Figure: block diagram of a PCI-based network of boards, each with a PPC405, RAM, Flash, serial, Ethernet, PCI bridge, FPGA, PCMCIA, and RTC.]

52 Block diagram and software.
• Single-board computer as a PCI adapter with a non-transparent bridge.
• Up to 128MB SDRAM; serial, 10/100 Ethernet, boot Flash, RTC, FPGA, PCMCIA.
• Components grouped on voltage islands for power monitoring & control.
• PCI provides easy expandability.
• Software: Linux for the PPC 405GP, available in-house & commercially (MontaVista); leverages open-source drivers for networking over PCI, giving fast access to network-based file systems.
[Figure: board block diagram with serial, Ethernet, PCI bridge, PPC405, RAM, Flash, FPGA, PCMCIA, and RTC.]

