Published by Ernesto Askren. Modified over 9 years ago.
Low-Power Reliable Real-Time Embedded Systems
Dakai Zhu (University of Texas at San Antonio) and Hakan Aydin (George Mason University)
UTSA Low Power Reliable Computing Research Group (LPRC)
Webpage: http://www.cs.utsa.edu/~dzhu/lprc.html
Copyright © 2007, LPRC Group (UTSA). All rights reserved.

Introduction and Background

Power: A Critical Dimension for Embedded Systems
Dynamic power dominates; static/leakage power increases faster.
Common techniques: power off, sleep states and dynamic voltage and frequency scaling (DVFS).
General principle: use low-power states whenever possible.

Reliability Issues: Transient Faults Become More Common
Scaled technology sizes and reduced design margins.

Effects of DVFS on Radiation-Induced Transient Faults
Fault rates increase exponentially when the supply voltage decreases.
Exponential fault rate model with DVFS: Poisson distribution.

Reliability-Aware Power Management (RA-PM)

Preliminary Results

RA-PM for Multiple Tasks with a Common Deadline

Extending RA-PM to Periodic Real-Time Tasks
Static schemes: constructing recovery tasks for the tasks to be scaled.
EDF: utilization-based approaches; NP-hard.
RMS: response-time analysis.
Dynamic schemes: recovery jobs for scaled task instances.
Wrapper-task: used to represent dynamic slack; it allows the slack to be conserved for recovery jobs across preemption points.

Thread-level duplication (TLD) has more chances to save energy on empty memory banks and idle processing cores, but is more liable to failures; process-level duplication (PLD) can enhance reliability with separate code/data sections, but incurs more energy consumption.
[DSN’06]

References
1. Dakai Zhu, Rami Melhem and Daniel Mossé, The Effects of Energy Management on Reliability in Real-Time Embedded Systems, ICCAD 2004.
2. Dakai Zhu, Reliability-Aware Dynamic Energy Management in Dependable Embedded Real-Time Systems, RTAS 2006 and ACM TECS 2006.
3. Dakai Zhu and Hakan Aydin, Energy Management for Real-Time Embedded Systems with Reliability Requirements, ICCAD 2006.
4. Dakai Zhu and Hakan Aydin, Reliability Effects of Energy Management on Reliability in Real-Time Embedded Systems, DSN (Fast Abstract) 2006.
5. Dakai Zhu and Hakan Aydin, Reliability-Aware Energy Management for Periodic Real-Time Tasks, RTAS 2007.

Future Work

Positive Thermal Effects of DVFS on Reliability
DVFS → less energy consumption → lower temperatures → lower failure rates → higher system reliability.

Flexible Task Models and Reliability Requirements
Mixed periodic and sporadic tasks; reliability fairness, etc.

Lifetime-Aware Systems with a Fixed Energy Budget
Maximize reliability by wisely dropping unimportant tasks.

Exploitation of Chip-Multiprocessors (CMP)
For the same performance levels, multi-core processors are more power efficient than single-core processors.
Inherent hardware redundancy is ideal for fault tolerance.
For CMP processors with SMT support: thread-level duplication (TLD) vs. process-level duplication (PLD).

[Figure: TLD vs. PLD on a dual-core processor with two thread running contexts (TRC) each. Diagram labels: free MEM banks in low-power state; idle core is put to sleep; threads share address space; processes have separate memory.]

[Figure: Four tasks (T1 to T4), one unit each, sharing a common deadline of 7, shown on a time axis from 0 to 7 with the slack marked before the deadline D. Three schedules: no management; greedy scheme, which manages the first task (T1 with recovery task RT1); and managing two tasks (T1 and T2 with recovery tasks RT1 and RT2) for better energy savings. Energy consumed: 4E, 3 1/9 E and 2 7/9 E.]

Both energy savings and reliability can be improved via judicious slack management.
However, the optimal allocation of slack to minimize energy while preserving reliability is NP-hard. [ICCAD’06]

By scheduling recovery tasks/jobs, RA-PM schemes preserve system reliability while achieving energy savings comparable to those of ordinary energy management schemes, which can drastically decrease system reliability due to increased fault rates and extended execution times. [RTAS’07]

[Figure: (a) Probability of failure; (b) Normalized energy consumption (%).]
NPM: no power management, used as the baseline for comparison.
CC-EDF: cycle-conserving EDF, an ordinary PM scheme by Pillai and Shin [SOSP’01].
RA-DPM: reliability-aware dynamic PM, using the wrapper-task mechanism.
RA-DPM-DS: RA-DPM with discrete speed levels taken into account.

[Figure: Three schedules of task T_k, plotted as frequency f against time from t to t+5 with deadline D_k: T_k at full speed followed by slack; T_k stretched over the slack; and T_k scaled with a recovery task RT_k reserved in the slack.]

When the amount of available slack is more than a task’s WCET, scheduling a recovery task before utilizing the remaining slack for energy savings preserves the task’s reliability. [RTAS’06]

With a recovery task, the final reliability of T_k is: [equation not recovered in this transcript]

Task T_k has a WCET of 2, and 3 units of slack are available at time t.
Without management, T_k runs at f_max and its original reliability is R_k^0.
The ordinary greedy PM scheme uses all the slack to save energy, but the reliability can be 200 times worse!
The RA-PM greedy scheme reserves part of the slack for a recovery task, while using the remaining slack for DVFS and energy savings.

The Problem: how to manage static and dynamic slack for energy savings without sacrificing system reliability?

At the maximum frequency/voltage level, the average fault arrival rate is λ_0. With DVFS, the rate increases exponentially as the supply voltage decreases, where d indicates how sensitive the fault rate is to supply voltage changes.
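
The fault-rate model and the single-task example (WCET of 2, 3 units of slack) can be sketched numerically. The Python sketch below is illustrative only and rests on several assumptions that are not stated in this transcript: the exponential model is taken to have the form λ(f) = λ_0 · 10^(d(1 − f)/(1 − f_min)) with frequencies normalized to f_max = 1; a task with a recovery is taken to fail only if both the scaled run and the full-speed recovery fail; dynamic energy is taken to scale with f² per unit of work; and the values of λ_0, d and f_min are hypothetical.

```python
import math

# Hypothetical model parameters (not given in the source).
LAMBDA_0 = 1e-6   # average fault rate at f_max (faults per time unit)
D_SENS = 2.0      # exponent d: sensitivity of the fault rate to voltage scaling
F_MIN = 0.1       # lowest normalized frequency; f_max is normalized to 1.0


def fault_rate(f):
    """Fault rate at normalized frequency f.

    Assumed form: LAMBDA_0 * 10 ** (d * (1 - f) / (1 - F_MIN)), which gives
    LAMBDA_0 at f = 1 and a rate 10**d times higher at f = F_MIN.
    """
    if not F_MIN <= f <= 1.0:
        raise ValueError("frequency must lie in [F_MIN, 1.0]")
    return LAMBDA_0 * 10 ** (D_SENS * (1.0 - f) / (1.0 - F_MIN))


def reliability(wcet, f):
    """Probability of zero Poisson fault arrivals while running wcet work at f."""
    execution_time = wcet / f
    return math.exp(-fault_rate(f) * execution_time)


def reliability_with_recovery(r_scaled, r_recovery):
    """Assumed form: success of the scaled run, or its failure followed by a
    successful recovery run."""
    return r_scaled + (1.0 - r_scaled) * r_recovery


def dynamic_energy(wcet, f):
    """Normalized dynamic energy, assuming energy per unit of work scales as f**2."""
    return wcet * f ** 2


WCET, SLACK = 2.0, 3.0

# No power management: run at f_max.
r0 = reliability(WCET, 1.0)

# Ordinary greedy PM: stretch the task over all available slack.
f_greedy = WCET / (WCET + SLACK)
r_greedy = reliability(WCET, f_greedy)

# RA-PM greedy: reserve WCET units of slack for a full-speed recovery task,
# then stretch the task over the slack that remains.
f_rapm = WCET / (WCET + (SLACK - WCET))
r_rapm = reliability_with_recovery(reliability(WCET, f_rapm), r0)

for name, f, r in [("no PM", 1.0, r0),
                   ("greedy PM", f_greedy, r_greedy),
                   ("RA-PM greedy", f_rapm, r_rapm)]:
    print(f"{name:13s} f={f:.3f}  P(failure)={1.0 - r:.3e}  "
          f"energy={dynamic_energy(WCET, f):.3f}")
```

Under these assumptions the ordinary greedy scheme has the lowest energy but a higher probability of failure than running at f_max, while the RA-PM greedy scheme gives up part of the energy savings and keeps the reliability at or above R_k^0. The energy figure for RA-PM counts only the scaled run, that is, the case in which the recovery task is not invoked.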