Minimizing Expected Energy Consumption in Real-Time Systems through Dynamic Voltage Scaling. Ruibin Xu, Daniel Mossé, and Rami Melhem.



Problems Theme
Context: Frame-based hard real-time systems
Given one or more tasks with
– Same period
– Deadline = period
– Order of execution of tasks
Probability distribution of execution cycles of each task
One processor with DVS support
Goal: Schedule tasks (time allocation, speed) to minimize expected energy consumption

Problems & System Models
Problems: Intra-Task DVS, Inter-Task DVS, Hybrid
System Model: Ideal, Realistic

Problems
Intra-Task DVS
– Only one task
– Compute speed of each cycle or group of consecutive cycles
Inter-Task DVS
– Multiple tasks and their order of execution
– Compute fraction of remaining time to allot to each task
– At run time, speed changes only at the boundary of a task
Hybrid
– Combine intra-task and inter-task DVS

System Models
Ideal Model
– Unrestricted continuous speed
– No time or energy overhead for changing speed
– Well-defined power-frequency relation: p(f) = c_0 + c_1 f^α
Realistic Model
– Predefined set of discrete speeds
– Changing speed costs time and energy overhead
– No assumption on power-frequency relation
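
The ideal-model power function can be turned into a small worked example. The sketch below assumes speed is measured in cycles per time unit, so running a fixed number of cycles at constant speed f takes cycles / f time; the function names are hypothetical. Setting the derivative of the per-cycle energy p(f)/f to zero gives the speed below which slowing down no longer saves energy.

```python
def energy_ideal(cycles, f, c0, c1, alpha):
    """Energy to run `cycles` cycles at constant speed f, with p(f) = c0 + c1 * f**alpha."""
    time = cycles / f                      # assumption: f is in cycles per time unit
    return (c0 + c1 * f ** alpha) * time


def energy_efficient_speed(c0, c1, alpha):
    """Speed minimizing energy per cycle p(f)/f = c0/f + c1 * f**(alpha - 1).

    d/df = -c0/f**2 + c1*(alpha - 1)*f**(alpha - 2) = 0
    =>  f**alpha = c0 / (c1 * (alpha - 1))
    """
    return (c0 / (c1 * (alpha - 1))) ** (1.0 / alpha)
```

With c0 = 0 the minimizing speed is 0, which is why a model without a constant power term always favors running as slowly as the deadline allows.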

Intra-Task DVS + Ideal System Minimize, where Subject to Optimal solution is Algorithms: PACE, GRACE.
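
A minimal sketch of the kind of per-cycle speed schedule the ideal model admits, under stated assumptions: c_0 = 0 and c_1 = 1, so a cycle run at speed f costs f^(α−1) energy, and each cycle x has a known, strictly positive probability of being executed. Under those assumptions a Lagrange-multiplier argument gives speeds proportional to that probability raised to the power −1/α. The names pace_speeds, tail_prob, and expected_energy are hypothetical, and this is not claimed to be the exact formulation used by PACE or GRACE.

```python
def pace_speeds(tail_prob, deadline, alpha=3.0):
    """Per-cycle speeds minimizing sum(tail_prob[x] * f_x**(alpha - 1))
    subject to sum(1 / f_x) == deadline (worst case finishes exactly on time).

    tail_prob[x] is the probability that cycle x is executed at all;
    every entry must be > 0. Assumes p(f) = f**alpha (c0 = 0, c1 = 1).
    """
    roots = [p ** (1.0 / alpha) for p in tail_prob]
    total = sum(roots)
    # Time given to cycle x is deadline * roots[x] / total, so speed is its inverse.
    return [total / (deadline * r) for r in roots]


def expected_energy(tail_prob, speeds, alpha=3.0):
    """Expected energy: each cycle costs f**(alpha - 1), weighted by its execution probability."""
    return sum(p * f ** (alpha - 1) for p, f in zip(tail_prob, speeds))
```

Cycles that are unlikely to run are given high speeds, so the task starts slowly and accelerates only if it turns out to be long.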

Intra-Task DVS + Realistic System
First approach: patch the solution obtained under the ideal system
– GRACE: round speed up to closest discrete frequency
– PACE: round speed up or down to closest discrete frequency
Problems?
– Can miss deadline
– Ignores speed change overhead
PACE: scan all phases and adjust speed, subtract maximum time penalty from allotted time

Intra-Task DVS + Realistic System
Second approach: design DVS considering the realistic system model (PPACE)
Change speed time penalty
Change speed energy penalty
Given r transition points, partition the range of execution cycles [1, W] into r+1 phases: [b_0, b_1 − 1], [b_1, b_2 − 1], …, [b_r, b_{r+1} − 1], where b_0 = 1, b_{r+1} = W + 1
Not necessarily the optimal partitioning!

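Intra-Task DVS + Realistic System
[Figure: cdf(x) from 0 to 1 plotted against execution cycles x, with transition points b_0 = 1, b_1, b_2, b_3, …, b_r and the speeds f_0, f_1, f_2, …, f_r used in the phases.]
*Graph borrowed from presentation by Ruibin Xu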

Intra-Task DVS + Realistic System Minimize Subject to Where

Intra-Task DVS + Realistic System
An energy-time label l is a 2-tuple (e, t), where e and t denote energy and time, respectively
*Graph borrowed from presentation by Ruibin Xu

Intra-Task DVS + Realistic System
[Figure: vertices v_0, v_1, v_2, v_3 with label sets |LABEL(0)| = 1, |LABEL(1)| = M, |LABEL(2)| = M^2, |LABEL(3)| = M^3.]
Exponential growth!
*Animation borrowed from presentation by Ruibin Xu

Intra-Task DVS + Realistic System
Use approximation
– Approximations that preserve optimality but do not guarantee polynomial running time
– Approximation based on a factor ε so that the solution is within (1+ε) of optimal, with polynomial running time
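
The two kinds of label-set reduction can be sketched as follows. Both functions are illustrative assumptions rather than the method from the slides: prune_dominated drops a label only when another label is no worse in both energy and time, so it preserves optimality but offers no bound on the set size, and thin_by_energy keeps one label per geometric energy bucket, which is one common way to get a bounded set at the cost of a (1 + delta) energy error per step. How delta relates to ε is not stated in the slides, and energies are assumed to be strictly positive.

```python
import math


def prune_dominated(labels):
    """Keep only labels (e, t) that no other label beats or ties in both energy and time."""
    kept = []
    best_time = float("inf")
    for e, t in sorted(set(labels)):       # ascending energy, then ascending time
        if t < best_time:                  # strictly faster than every cheaper label
            kept.append((e, t))
            best_time = t
    return kept


def thin_by_energy(labels, delta):
    """Keep the fastest label in each energy bucket [(1+delta)**k, (1+delta)**(k+1)).

    Requires e > 0 for every label and delta > 0.
    """
    best = {}
    for e, t in labels:
        k = math.floor(math.log(e) / math.log(1.0 + delta))
        if k not in best or (t, e) < (best[k][1], best[k][0]):
            best[k] = (e, t)
    return sorted(best.values())
```

Applying prune_dominated at every vertex is what gives the hoped-for |LABEL(i)| << M^i, while thin_by_energy caps the count at the number of energy buckets.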

Intra-Task DVS + Realistic System
[Figure: vertices v_0, v_1, v_2, v_3 with label sets |LABEL(0)| = 1, |LABEL(1)| = M, |LABEL(2)| = M^2, |LABEL(3)| = M^3.]
With approximation: |LABEL(1)| << M, hopefully! |LABEL(2)| << M^2, hopefully!
*Animation borrowed from presentation by Ruibin Xu

Intra-Task DVS + Realistic System

Inter-Task DVS + Ideal System
Algorithm: OITDVS
[Figure: a frame of length D split into β_1 D and (1 − β_1) D, with slack marked.]
*Animation borrowed from presentation by Ruibin Xu

Inter-Task DVS + Ideal System
[Figure: tasks T_1, T_2, T_3, T_4 with fractions β_1 = xx%, β_2 = xx%, β_3 = xx%, β_4 = 100%.]
*Animation borrowed from presentation by Ruibin Xu

Inter-Task DVS + Realistic System
Use the ideal system to compute the fraction of system time allotted to each task
Patch the solution to work on the realistic system:
– Before computing the speed of a task, subtract from the remaining time the maximum possible time penalty for all remaining speed changes
– Make sure a task runs at one of the discrete speed steps
Algorithms: PITDVS, PITDVS2
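
The two patches above can be sketched as a run-time speed choice made at each task boundary. This is a minimal illustration, not the PITDVS code: it assumes the speed is sized for the task's worst-case cycle count so the deadline holds, that speeds are in cycles per time unit, and that rounding goes up to the next discrete step. The function and parameter names are hypothetical.

```python
import bisect


def pick_speed(wcec, beta, remaining_time, remaining_changes, max_time_penalty, speeds):
    """Choose a discrete speed for the next task.

    wcec              worst-case execution cycles of the task (assumed sizing rule)
    beta              fraction of the usable remaining time allotted to this task
    remaining_changes number of speed changes that may still occur in the frame
    speeds            available discrete speeds, sorted ascending
    """
    # Patch 1: reserve the worst-case time penalty of all remaining speed changes.
    usable = remaining_time - remaining_changes * max_time_penalty
    allotted = beta * usable
    if allotted <= 0:
        raise ValueError("no time left for this task")
    ideal = wcec / allotted
    # Patch 2: round up to a discrete step so the worst case still fits.
    i = bisect.bisect_left(speeds, ideal)
    if i == len(speeds):
        raise ValueError("task cannot finish in its allotted time at any speed")
    return speeds[i]
```

For example, with steps 200 to 1000, a 300-cycle task holding half of 2.0 time units, and two possible changes costing 0.1 each, the ideal speed is about 333 and the chosen step is 400.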

Hybrid (Intra + Inter-Task DVS) + Ideal System
Combine intra-task and inter-task DVS
Compute fractions per cycle per task, instead of just per task
Algorithm: GOPDVS

Hybrid (Intra + Inter-Task DVS) + Realistic System
Two approaches
First (PGOPDVS):
– Compute time allocation fraction per phase (instead of cycle) per task
– From the above fractions, compute fraction per task
– At run time, use task fractions to allot time to each task
– Compute task speed by applying patches as in inter-task DVS

Hybrid (Intra + Inter-Task DVS) + Realistic System
Second (PIT-PPACE):
– Compute time allocation fraction per task (inter-task DVS)
– At run time, use intra-task DVS to compute the speed schedule of each task according to its allotted time
– Because the above step is time consuming, we can compute a set of intra-task DVS solutions for each task and apply the best one at run time

Hybrid (Intra + Inter-Task DVS) + Realistic System

Questions?

Energy-Aware Scheduling for Streaming Applications on Chip Multiprocessors. Ruibin Xu, Rami Melhem, Daniel Mossé

Problem
Streaming applications operate on streams of data and are compute intensive
Examples: video streaming, automatic target recognition
A stream of data can be abstracted as a sequence of requests; thus a streaming application can be modeled as a periodic task in real-time systems
QoS: Throughput (T) and Deadline (D)

Problem
Streaming applications are highly parallelizable, and thus we can use CMPs to run them
CMPs support:
– Turning off cores to reduce leakage
– DVS to reduce dynamic energy consumption
Goal: Schedule tasks so as to minimize energy consumption and meet QoS requirements

Models
Application is modeled as a DAG where nodes represent tasks, and edges represent precedence relations and communication requirements
Communication cost of transferring B bits
– Delay is
– Energy is
Each processor has M discrete frequency steps

Decompose the Problem

Effect of Static Power
Power function of a processor core
Consider Y-oriented load only
Assume job consists of c cycles, and we use y cores, then each is assigned cycles, and runs with speed
Energy consumption is
To minimize energy consumption
Similar result for X-oriented load

Scheduling for Y-Oriented Load
Solution has recursive nature
Let denote optimal scheduling of tasks i through n, with end-to-end delay = t
[Figure: tasks i, i+1, …, j in one stage with delay d, leaving t − d for the rest; labels: q, WCEC, running time, energy.]

Scheduling for Y-Oriented Load
We need to consider all M frequencies and all n − i + 1 possible mappings of consecutive tasks i through n to the first stage

Scheduling for X-Oriented Load
Use list scheduling to map tasks to cores in the same stage
Perform speed computation for tasks of the stage
Use hill-climbing to improve the solution
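
The first step can be illustrated with a generic list scheduler. This sketch assumes the tasks within one stage are independent, orders them by decreasing cycle count, and places each on the currently least-loaded core; the ordering rule and the names are assumptions, since the slides do not specify which priority list is used. Speed computation and hill-climbing are not shown.

```python
import heapq


def list_schedule(cycles, num_cores):
    """Greedy list scheduling of independent tasks onto identical cores.

    cycles[i] is the cycle count of task i. Returns (assignment, loads), where
    assignment[i] is the core of task i and loads[c] is the total cycles on core c.
    """
    heap = [(0, core) for core in range(num_cores)]   # (current load, core id)
    heapq.heapify(heap)
    assignment = {}
    # Assumed priority list: longest task first.
    for task in sorted(range(len(cycles)), key=lambda i: -cycles[i]):
        load, core = heapq.heappop(heap)              # least-loaded core
        assignment[task] = core
        heapq.heappush(heap, (load + cycles[task], core))
    loads = [0] * num_cores
    for task, core in assignment.items():
        loads[core] += cycles[task]
    return assignment, loads
```

A balanced load matters here because the most loaded core in a stage determines the speed that stage needs to meet its delay.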

Scheduling2D

Simulation Results
Comparing Scheduling2D against baseline
Baseline algorithm
– Period = deadline
– Try all possible numbers of cores to find minimum energy consumption
– Use ETF heuristic to perform task mapping
– Use convex programming approach to obtain execution speed of each task given the task mapping

Simulation Results
Percentage of static power in total power:
– 70nm: 22%
– 50nm: 44%
– 35nm: 67%
As static power increases, the energy savings obtained by Scheduling2D decrease

Simulation Results

As period increases, energy savings decrease
For a given period, increasing the deadline will initially result in increased energy savings

Thanks!