ParaScale: Exploiting Parametric Timing Analysis for Real-Time Schedulers and Dynamic Voltage Scaling


ParaScale: Exploiting Parametric Timing Analysis for Real-Time Schedulers and Dynamic Voltage Scaling
Sibin Mohan (1), Frank Mueller (1), William Hawkins (2), Michael Root (3), Christopher Healy (3), David Whalley (4)
(1) North Carolina State University, Center for Embedded Systems Research; (2) University of Illinois, Urbana-Champaign; (3) Furman University; (4) Florida State University

Sibin Mohan, Real-Time Systems Symposium, Dec 5-8, 2005

Outline
- Motivation
- Static Timing Analysis
- Parametric Timing Analysis
- Using Parametric Timing Analysis
- Framework
- Experiments and Results
- Conclusion

Motivation
- Worst-case execution time (WCET) is obtained by
  - Static timing analysis
  - Dynamic timing methods (proven unsafe)
- Static timing analysis
  - Upper limits on loop bounds must be known at compile time
  - Results in large overestimations
  - Provides a single numeric value
- Places limits on real-time designers
  - Restricts the type of code that may be used in tasks

Static Timing Analysis
Consider the following piece of code:
    for( i = 0 ; i < n ; ++i )
        Loop Body ;
- The value of n (the loop bound) must be known at compile time
- Result: WCET = 1000 cycles (say)

Parametric Timing Analysis
- Calculate: formula / closed form for WCET
- Formula depends on the number of loop iterations
- Evaluated at run-time
- Results used for
  - scheduling decisions
  - power savings, etc.

Parametric Timing Analysis (contd.)
For the previous example:
    for( i = 0 ; i < n ; ++i )
        Loop Body ;
- Result: WCET = 102 * n (say)
- 102: number of cycles to execute Loop Body for the above example
- n: value must be known prior to loop entry

Parametric Timing Analysis (contd.)
In practice:
- Loop in the task code:
    for( i = 0 ; i < n ; ++i )
        Loop Body ;
- Call inserted into the task code:
    IntraTaskScheduler(evaluate_loop_k(n)) ;
- Generated parametric evaluation function:
    int evaluate_loop_k( int loop_bound ) {
        return ( 102 * loop_bound ) ;
    }
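The pieces on this slide can be put together as a small compilable sketch. It assumes the call is placed immediately before the loop, which matches the "prior to entry into loops" wording in the conclusion. The body of `IntraTaskScheduler`, the `last_reported_wcet` variable, the `run_task` wrapper, and the summing loop body are hypothetical stand-ins, not the authors' implementation; `long` is used instead of `int` only to keep the arithmetic wide.

```c
/* Cycles per loop iteration: the illustrative "102" from the slide. */
#define CYCLES_PER_ITERATION 102L

/* Hypothetical stand-in for scheduler state: the last bound it received. */
static long last_reported_wcet = -1;

/* Parametric evaluation function: WCET of the loop for a given bound. */
long evaluate_loop_k(long loop_bound)
{
    return CYCLES_PER_ITERATION * loop_bound;
}

/* Hypothetical scheduler hook: a real one would re-plan frequency/voltage. */
void IntraTaskScheduler(long loop_wcet)
{
    last_reported_wcet = loop_wcet;
}

/* Task fragment: n is known here, so the bound is reported before the loop. */
long run_task(long n)
{
    long sum = 0;

    IntraTaskScheduler(evaluate_loop_k(n));   /* assumed placement: loop entry */
    for (long i = 0; i < n; ++i)
        sum += i;                             /* placeholder for "Loop Body" */
    return sum;
}
```

Calling `run_task(4)` in this sketch reports a loop bound of 408 cycles to the stand-in scheduler before the first iteration runs.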

Flow of Parametric Timing Analysis
1. C source file is passed to the Parametric Timing Analyzer
2. Output: C source file annotated with parametric evaluation functions
3. Has the parametric formula changed?
   - YES: send the annotated C source file to the Parametric Timing Analyzer again
   - NO: use the annotated C source file for execution on the simulator
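The "has the formula changed?" loop above is a fixed-point iteration: embedding the evaluation function adds cycles of its own, so the analysis is repeated until the formula stops changing. The sketch below uses an entirely made-up cost model to show the shape of that loop. The `formula` struct, `eval_overhead`, `analyze`, and every constant other than the slide's 102 are assumptions for illustration, not the behaviour of the real analyzer.

```c
/* WCET formula of the form per_iter * n + constant (toy representation). */
typedef struct {
    long per_iter;
    long constant;
} formula;

/* Toy cost of the embedded evaluation function: 5 cycles base plus
 * 3 cycles per non-zero term. These numbers are invented. */
static long eval_overhead(formula f)
{
    long cost = 5;
    if (f.per_iter != 0) cost += 3;
    if (f.constant != 0) cost += 3;
    return cost;
}

/* Toy "analyzer": the loop body costs 102 cycles per iteration and the
 * constant term accounts for the currently embedded evaluation function. */
static formula analyze(formula embedded)
{
    formula f = { 102, eval_overhead(embedded) };
    return f;
}

/* Re-analyze until the formula no longer changes. Returns the number of
 * analyzer passes used, or -1 if max_passes was not enough. */
int fixed_point(formula *out, int max_passes)
{
    formula current = { 0, 0 };   /* nothing embedded yet */

    for (int pass = 1; pass <= max_passes; ++pass) {
        formula next = analyze(current);
        if (next.per_iter == current.per_iter &&
            next.constant == current.constant) {
            *out = current;       /* formula is stable: done */
            return pass;
        }
        current = next;           /* formula changed: annotate and retry */
    }
    return -1;
}
```

With this toy model the formula settles at 102 * n + 11 on the third pass; a real analyzer would converge on whatever its own cost model dictates.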

Insert Parametric Formulae into Task Code
1. Timing analysis of function with parametric loop
2. Loop analyzed and parametric evaluation function generated
3. Numerical analysis of generated function complete
4. Timing analysis of function with parametric loop complete

Use of Parametric Timing Analysis
- Evaluate parametric expressions at run-time
  - Based on actual loop bounds at run-time
- Transfer control to dynamic scheduler
- Newly-calculated WCET sent to scheduler
  - Calculate savings in execution time
- Savings exploited by scheduler to
  - admit additional tasks
  - reduce operating frequency/voltage, which saves power
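The "calculate savings" step can be sketched as simple bookkeeping: the scheduler compares the cycles the static analysis budgeted for a loop with the freshly evaluated parametric bound and tightens the task's remaining worst-case budget by the difference. The `task_budget` struct and `apply_parametric_bound` function are hypothetical names chosen for this illustration.

```c
/* Hypothetical per-task record kept by the scheduler. */
typedef struct {
    long long remaining_wcet;   /* worst-case cycles still budgeted */
} task_budget;

/* Tighten the budget using a parametric bound evaluated at loop entry.
 * static_loop_wcet:     cycles the static analysis assumed for the loop
 * parametric_loop_wcet: cycles given by the formula for the actual bound
 * Returns the cycles saved (0 if the parametric bound is not tighter). */
long long apply_parametric_bound(task_budget *t,
                                 long long static_loop_wcet,
                                 long long parametric_loop_wcet)
{
    if (parametric_loop_wcet >= static_loop_wcet)
        return 0;                         /* nothing gained, keep the budget */

    long long savings = static_loop_wcet - parametric_loop_wcet;
    t->remaining_wcet -= savings;         /* slack handed to the scheduler */
    return savings;
}
```

For example, with the slide's numbers, a loop budgeted at 1000 cycles that turns out to need 102 * 5 = 510 cycles frees 490 cycles, which the scheduler could then spend on admitting work or lowering the frequency.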

Unique advantages of our work
- Exploit early knowledge of execution of tasks
  - Especially knowledge about future execution of loops
- Tightly bound execution for remainder of task
- Intra-task DVS algorithm slows down processor
  - As execution proceeds, processor is slowed down further
  - Saves power!
- Other DVS algorithms:
  - Task execution sped up as deadlines approach
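One way to picture the intra-task slowdown is a frequency picker that chooses the lowest discrete level still able to finish the remaining worst-case cycles before the deadline. Each time a parametric bound tightens the remaining budget, the picker can return a lower level. The frequency table and the `pick_frequency_mhz` name are assumptions for this sketch; the slides do not list the processor's actual operating points.

```c
#include <stddef.h>

/* Hypothetical frequency levels in MHz, sorted in ascending order. */
static const int FREQ_LEVELS_MHZ[] = { 100, 200, 400, 600, 800, 1000 };
#define NUM_LEVELS (sizeof FREQ_LEVELS_MHZ / sizeof FREQ_LEVELS_MHZ[0])

/* Return the lowest level that completes remaining_cycles within
 * time_left_us microseconds (1 MHz executes 1 cycle per microsecond).
 * Falls back to the highest level if no level is fast enough. */
int pick_frequency_mhz(long long remaining_cycles, long long time_left_us)
{
    for (size_t i = 0; i < NUM_LEVELS; ++i) {
        long long capacity = (long long)FREQ_LEVELS_MHZ[i] * time_left_us;
        if (capacity >= remaining_cycles)
            return FREQ_LEVELS_MHZ[i];
    }
    return FREQ_LEVELS_MHZ[NUM_LEVELS - 1];
}
```

In this sketch a task with 900,000 worst-case cycles left and 1000 microseconds to its deadline needs the 1000 MHz level; once a parametric bound cuts the remaining budget to 350,000 cycles, the 400 MHz level is enough.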

Framework
[Block diagram. Components: gcc PISA Compiler, p-compiler, Numeric Timing Analyzer, Parametric Timing Analyzer, Scheduler, SimpleScalar Simulator, Wattch Power Model. Artifacts passed between them: C source files, assembly, instruction/data info, numeric WCET bound, C source files and parametric functions, energy/power values.]

Framework (contd.)
- p-compiler
  - Works on assembly files as input
  - Extracts information about instructions and data
- Timing analyzers
  - Numeric: provides constant, numeric WCET bound
  - Parametric: provides parametric formulae as WCET
- Schedulers
  - Frequency/voltage lowered during execution
  - Set to levels for last executing task instance
- SimpleScalar simulation framework
  - Capable of handling multiple threads of execution
  - Can be configured for simple processor, SMT, CMP, etc.

Experiments
- Energy measurement techniques
  - Perfect Clock Gating (PCG)
  - Perfect Clock Gating with Leakage (PCGL)
- ParaScale
  - Our combined intra-task and static inter-task DVS

Parameter             Range of Values
Utilization           20%, 50%, 80%
Ratio WCET/PET*       1x, 2x, 5x, 10x, 15x, 20x
Base DVS Algorithms   Static DVS, Parametric, ParaScale

* PET: Parametric Execution Time

Energy Consumption: 2x, PCG
- Energy consumption, assuming no leakage, for various utilizations and DVS schemes
- Least consumption: ParaScale (for all cases)

Energy Consumption: 2x, PCGL
- Energy consumption, leakage considered, for various utilizations and DVS schemes
- Least consumption: ParaScale (for all cases)

Energy Consumption: 10x, PCG
- Energy consumption, assuming no leakage, for various utilizations and DVS schemes at 10x
- Least consumption: ParaScale
- Overall savings are lower due to greater slack

Energy Consumption: 10x, PCGL
- Energy consumption, leakage considered, for various utilizations and DVS schemes at 10x
- Least consumption: ParaScale
- Overall savings are lower due to greater slack

Energy Consumption trends (PCG)
- Energy consumption (hence savings) for ParaScale:
  - Drops as the ratio of WCET/PET is increased
  - Due to greater slack available in the system

Energy Consumption trends (PCGL)
- Energy consumption affected by leakage power prevalent in the system
- Relative savings lower

Scheduler Overhead: Utilization 50%
- Scheduler overheads greatest for ParaScale
- Power savings still significant
  - Due to lowering of voltage/frequency for schedulers

Conclusion
- Fixed-point approach
  - To embed parametric formulae
  - Bounds WCET of application + parametric code
- Provide lower WCET bounds during execution
  - Prior to entry into loops
- Quantify savings in terms of power savings
  - ParaScale: combined inter- and intra-task DVS
- Savings of 66-80%!
  - Power savings over DVS-oblivious techniques
- Mainly due to knowledge about:
  - Past execution
  - Future execution due to parametric formulae

Thank You! Questions?