1 ECE692 Topic Presentation Power/thermal-Aware Utilization Control Xing Fu 22 September 2009.

Presentation transcript:

1 ECE692 Topic Presentation Power/thermal-Aware Utilization Control Xing Fu 22 September 2009

2 Outline of the Presentation Why consider power/thermal in real-time systems? Conflicts between power/thermal management and real-time guarantees. For example, CPU frequency reduction → the tasks' execution times increase. Related work and its limitations. Motivation of the papers to present. 1st: meet all the deadlines despite runtime execution time variations, and save power. 2nd: guarantee both thermal and timeliness requirements.

3 Power-Aware CPU Utilization Control for Distributed Real-Time Systems Xiaorui Wang, Xing Fu, Xue Liu*, Zonghua Gu $ University of Tennessee, Knoxville *McGill University $ Hong Kong Univ. of Sci. and Tech.

4 Recap of Utilization Control CPU utilization: a trade-off –Too high → system overload → possible crash (OS frozen by higher-priority real-time threads) –Too low → poor application QoS, excessive power consumption Schedulable utilization bound –Utilization ≤ bound → meet all deadlines –Highest possible utilization with deadline guarantee Uncertainties –Unpredictable exec times (e.g. influenced by sensor data) –External resource contention (e.g. Denial of Service attacks) → Must maintain desired utilization under uncertainty!

5 Existing Work on Utilization Control Various utilization control algorithms –Single-processor control [Lu 03] –Multi-processor control [Stankovic 01] [Lin 03] [Lu 04] –Hybrid control [Koutsoukos 05] –Decentralized control [Wang 05] –Adaptive control [Yao 08] –Optimization-based control [Chen 07] –Controllability and feasibility [Wang 07] –And more… All the algorithms rely exclusively on rate adaptation –Utilization of a task = exec time / period –Adjust task periods within allowed ranges
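
A minimal Python sketch of the relation stated on this slide (utilization of a task = exec time / period, which equals exec time × rate). The function names and numbers are illustrative assumptions and do not come from any of the cited algorithms.

```python
def processor_utilization(exec_times, rates):
    """Sum exec_time * rate over the subtasks hosted on one processor."""
    return sum(c * r for c, r in zip(exec_times, rates))


def clamp_rate(rate, r_min, r_max):
    """Keep a requested task rate inside its allowed range."""
    return max(r_min, min(r_max, rate))


# Assumed example: two subtasks on one processor.
exec_times = [0.010, 0.020]   # seconds per invocation (hypothetical)
rates = [20.0, 10.0]          # invocations per second (hypothetical)
utilization = processor_utilization(exec_times, rates)  # 0.2 + 0.2
```

Rate adaptation changes `rates` only; when a rate hits `r_min` or `r_max` it saturates, so the utilization can no longer be moved by this knob alone.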

6 Limitations of Rate Adaptation 1. Often infeasible to achieve utilization set points –WCET configuration → underutilization even at highest rates –Under-utilized processors → excessive power consumption –Can save power by frequency scaling 2. Estimated rate ranges may not be accurate –Unexpected rate saturation → feasibility problems 3. Task rates cannot be adapted in some DRE systems –Need another knob for utilization control

7 Power-Aware CPU Utilization Control Util control by rate adaptation + CPU frequency scaling –Utilization of a task = exec time / period Control approach –Controlled variable: utilizations of all the processors in a DRE system –Manipulated variables: task rates and DVFS level –Variations: inaccurate or varying task exec times [Figure: end-to-end tasks T1, T2, T3; subtasks T11, T12, T13 on processors P1, P2, P3, linked by precedence constraints]

8 Recap of End-to-End Task Model in Distributed Real-Time Systems Periodic task T_i = a chain of subtasks {T_ij} on different processors –All subtasks run at the same rate –End-to-end deadline = sum of all sub-deadlines Task rate can be adjusted within a range –Higher rate → better performance CPU frequency can be adjusted within a range –Lower frequency → power savings [Figure: end-to-end tasks T1, T2, T3; subtasks T11, T12, T13 on processors P1, P2, P3, linked by precedence constraints]

9 Control Problem Formulation Control objective (equation not preserved in this transcript), subject to two constraints: 1. task rates: $R_{\min,j} \le r_j(k) \le R_{\max,j}$ $(1 \le j \le m)$ 2. CPU frequency: $F_{\min,i} \le f_i(k) \le F_{\max,i}$ $(1 \le i \le n)$

10 System Model without Freq Scaling New utilization = old utilization + change. Utilization change = actual exec time × task rate change. $c_{jl}$: estimated execution time of subtask $T_{jl}$. $g_i$ = actual execution time / estimation –Models uncertainty in execution times. System model (task $T_1$ with subtask $T_{11}$ on $P_1$ and subtask $T_{12}$ on $P_2$): $u_1(k) = u_1(k-1) + g_1 c_{11} \Delta r_1(k-1)$ and $u_2(k) = u_2(k-1) + g_2 c_{12} \Delta r_1(k-1)$
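
The update rule on this slide can be sketched in Python as follows. The function name and the numeric values are assumptions chosen for illustration; only the form u_i(k) = u_i(k-1) + g_i · c_1i · Δr_1(k-1) comes from the slide.

```python
def update_utilization(u_prev, gains, est_exec, delta_rate):
    """Apply u_i(k) = u_i(k-1) + g_i * c_1i * delta_r_1(k-1) per processor.

    u_prev     -- utilizations u_i(k-1) of the processors hosting task T_1
    gains      -- g_i, ratio of actual to estimated execution time
    est_exec   -- c_1i, estimated execution time of T_1's subtask on P_i
    delta_rate -- change of the rate of T_1 in the last control period
    """
    return [u + g * c * delta_rate
            for u, g, c in zip(u_prev, gains, est_exec)]


# Hypothetical numbers: the rate of T_1 is raised by 5 invocations/s;
# execution times on P_2 are twice their estimates (g_2 = 2).
u_next = update_utilization([0.5, 0.4], [1.0, 2.0], [0.01, 0.02], 5.0)
```

With these numbers the utilization of P_1 rises by 0.05 and that of P_2 by 0.2, which shows how an unknown gain g_i scales the effect of the same rate change.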

11 System Model with Freq Scaling Utilization change = actual exec time × task rate change. Actual exec time = exec time / relative CPU frequency. New system model (equation not preserved in this transcript): –System model is now nonlinear for a single control loop –Solution: separate into two control loops –Each loop assumes that the other control input is constant

12 Two-Layer Control Architecture –Two coordinated control loops –Cluster-level task rate adaptation loop (EUCON) –One CPU frequency scaling loop on each processor –Two loops run on different timescales: rate loop runs much faster [Figure: a model predictive controller connected to the distributed real-time system (m tasks, n processors) through utilization monitors (UM) and rate modulators (RM); each processor P1 … Pn also has a proportional controller (PC) and a frequency modulator (FM)]

13 CPU Frequency Scaling Loop System model of processor P_1, assuming constant task rates r_j(k) = r_j (equation not preserved in this transcript) Controller design –g_1 is unknown and assumed to be 1 (exec times are accurate) –A proportional (P) controller can achieve stability and accuracy Stability analysis for model variations –When g_1 is not 1 (exec times vary at runtime) –Result: the loop is stable when 0 < g_1 < 2 –Actual exec times must stay below twice their estimated values –Need to be pessimistic for exec time estimation
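
The stability range 0 < g_1 < 2 can be illustrated with a small simulation. This is only a sketch under assumptions that are not on the slide: the plant is taken as u(k) = g · w · d(k), where w is the estimated workload at full speed and d(k) = 1 / relative frequency, and the P controller is a deadbeat design for g = 1. Frequency limits are ignored. Under these assumptions the error obeys e(k) = (1 - g) · e(k-1), which shrinks exactly when 0 < g < 2.

```python
def simulate_frequency_loop(g, set_point=0.7, workload=0.5, steps=30):
    """Utilization trace of an assumed deadbeat P controller.

    Assumed plant (not from the slide): u(k) = g * workload * d(k),
    with d = 1 / relative CPU frequency as the control input.
    """
    d = 1.0                      # start at full frequency
    u = g * workload * d
    trace = [u]
    for _ in range(steps):
        # Controller designed for g = 1: one step would remove the error.
        d += (set_point - u) / workload
        u = g * workload * d
        trace.append(u)
    return trace


converging = simulate_frequency_loop(g=1.5)   # inside 0 < g < 2
diverging = simulate_frequency_loop(g=2.5)    # outside the range
```

For g = 1.5 the error alternates in sign and halves every period; for g = 2.5 it grows by a factor of 1.5 per period, which is why pessimistic (large) execution time estimates keep g small and the loop stable.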

14 Coordination Analysis Goal: coordinate the two control loops for global stability Stability of the rate loop under the impact of the freq loop –Result: relative CPU frequency cannot be less than 0.1 –Most real processors have a relative DVFS range of roughly 0.5 to 1 Control period configuration –Period of the frequency loop > settling time of the rate loop –Period of the rate loop = 2 sec (determined based on task periods) –Settling time of the rate loop = 5 periods = 10 sec –Period of the frequency loop = 20 sec > 10 sec

15 System Implementation Implemented based on FC-ORB real-time middleware Test-bed –12 tasks (25 subtasks) on 4 AMD processors (5 freq levels) –openSUSE Linux 11 with real-time support –RMS (Rate Monotonic Scheduling) with release guard Controllers –Rate controller: running on a separate machine –Frequency P controller: running on each processor CPU frequency modulator –How can 5 discrete freq levels approximate the desired continuous level? –For 3.2, use 3, 3, 3, 3, 4 on a smaller timescale (subintervals)
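
The frequency modulator idea (approximating level 3.2 by the sequence 3, 3, 3, 3, 4) can be sketched as below. The function name, the choice of five subintervals, and the rounding rule are assumptions made for illustration; the slide only gives the 3.2 example.

```python
def subinterval_levels(desired, n_sub=5):
    """Approximate a continuous DVFS level with discrete ones.

    Splits one control period into n_sub subintervals and mixes the two
    neighbouring integer levels so that their average is close to
    `desired`. Assumes desired >= 0 and that low + 1 is a valid level
    whenever the fractional part is non-zero.
    """
    low = int(desired)                    # lower neighbouring level
    frac = desired - low                  # share of time at the higher level
    n_high = round(frac * n_sub)          # subintervals spent at low + 1
    return [low] * (n_sub - n_high) + [low + 1] * n_high


levels = subinterval_levels(3.2)
average = sum(levels) / len(levels)
```

More subintervals give a finer approximation of the continuous level, at the cost of more frequency transitions per control period.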

16 Empirical Results for Freq Scaling Loop –No rate adaptation –Utilization set point is the RMS bound –Execution times are increased during the interval from 600 s to 1200 s –Freq scaling can be used for util control and power savings –More results are in the paper
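
For reference, the classic Liu and Layland schedulable utilization bound for rate monotonic scheduling is n(2^(1/n) - 1) for n independent periodic tasks on one processor. The sketch below computes it; whether the experiment used exactly this formula per processor is an assumption, since the slide only says "RMS bound".

```python
def rms_bound(n_tasks):
    """Liu-Layland utilization bound for n_tasks under RMS."""
    if n_tasks < 1:
        raise ValueError("need at least one task")
    return n_tasks * (2 ** (1.0 / n_tasks) - 1)


# Bound for a processor hosting a hypothetical number of subtasks.
bounds = {n: rms_bound(n) for n in (1, 2, 6)}
```

The bound decreases from 1.0 for a single task towards ln 2 (about 0.693) as the number of tasks grows, so a processor hosting more subtasks gets a lower utilization set point.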

17 Frequency Scaling vs. EUCON –EUCON fails due to rate saturation –EUCON leads to power waste –Freq scaling achieves the set points –Freq scaling achieves power savings [Figure annotation: freq scaling loop is activated here]

18 Coordinated Utilization Control –Rate adaptation alone fails –Freq scaling alone fails –Coordinated control achieves the set points –Coordinated control also achieves power savings

19 Conclusions Existing work on utilization control relies exclusively on rate adaptation, which has some limitations. This paper –Formulates a new utilization control problem by rate adaptation + CPU frequency scaling for power saving –Proposes a two-layer control architecture –Provides coordination analysis –Presents empirical results to demonstrate the effectiveness

20 Dynamic Thermal and Timeliness Guarantees for Distributed Real-Time Embedded Systems Department of EECS University of Tennessee, Knoxville Xing Fu, Xiaorui Wang, and Eric Puster

21 Introduction Distributed Real-Time Embedded Systems Examples: mission-critical systems and cyber-physical systems Requirements –Guarantee timeliness –Guarantee thermal limits: 50% of all electronics failures are related to overheating; the lifetime of a processor can be approximately halved if its temperature is increased ℃; a 15 ℃ increase in temperature could double the failure rate of a disk drive Temperature must be explicitly controlled for reliability. An integrated solution

22 Integrated solution Util control by rate adaptation –Utilization of a task = exec time / period Thermal control by CPU frequency scaling –CPU frequency → Power → Temperature [Figure: system diagram]

23 Control Loops End-to-end task model → a cluster-level utilization controller
(1) The utilization monitor on each processor sends its utilization in the last control period to the cluster-level controller.
(2) The controller computes a new rate for every task and sends the new rates to the rate modulators.
(3) The rate modulators change the task rates accordingly.
[Figure: Processor 1 … Processor N, each with a utilization monitor (UM), rate modulator (RM), thermal monitor (TM), thermal controller (TC) and frequency modulator (FM), connected to the cluster-level utilization controller; subtasks are linked by precedence constraints]

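24 Thermal Control Loop PID (Proportional-Integral-Derivative) controller –System modeling –Controller design –Performance analysis [Figure: feedback loop in which the thermal controller compares the processor temperature with the temperature set point and adjusts the CPU frequency. Example annotations: workload variation, temperature 55 ℃, set point 50 ℃, error −5 ℃ → decrease frequency until the temperature returns to 50 ℃]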

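25 System model We use two steps to model the relationship between the temperature $t_i(k)$ and the CPU frequency $f_i(k)$ –The power model relates $f_i(k)$ to the power $P_i(k)$ –The thermal model relates $P_i(k)$ to $t_i(k)$ System model (equation not preserved in this transcript)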

26 Controller Design & Performance PID controller driving the processor temperature to the temperature set point Control performance (e.g. stability, zero steady-state error) If the temperature cannot reach the set point even at the highest frequency, the controller is saturated and the system runs at its highest performance.
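
A minimal sketch of a saturating PID thermal controller follows. It is not the controller from the paper: the class name, the gains, the relative frequency range 0.5 to 1.0, and the positional PID form are all assumptions for illustration. The error is defined as set point minus measured temperature, so a processor that is too hot gets a lower frequency.

```python
class ThermalPID:
    """Positional PID controller mapping temperature to relative frequency."""

    def __init__(self, kp, ki, kd, set_point, base_freq, f_min, f_max):
        self.kp, self.ki, self.kd = kp, ki, kd      # hypothetical gains
        self.set_point = set_point                  # degrees Celsius
        self.base_freq = base_freq                  # output at zero error
        self.f_min, self.f_max = f_min, f_max       # assumed DVFS range
        self.integral = 0.0
        self.prev_error = None

    def update(self, temperature):
        """Return the relative CPU frequency for the next control period."""
        error = self.set_point - temperature
        self.integral += error
        derivative = 0.0 if self.prev_error is None else error - self.prev_error
        self.prev_error = error
        freq = (self.base_freq + self.kp * error
                + self.ki * self.integral + self.kd * derivative)
        # Saturation: the output can never leave the available range.
        return max(self.f_min, min(self.f_max, freq))


controller = ThermalPID(kp=0.02, ki=0.0, kd=0.0, set_point=50.0,
                        base_freq=1.0, f_min=0.5, f_max=1.0)
too_hot = controller.update(55.0)    # error -5 -> frequency is lowered
too_cool = controller.update(45.0)   # wants more than f_max -> saturates
```

The second call shows the saturation case described on the slide: when the temperature stays below the set point even at the highest frequency, the output stays clamped at `f_max`. A production design would normally also add anti-windup for the integral term, which this sketch omits.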

27 Coordination Goal: coordinate the two control loops for global stability Global stability: both control loops remain stable under the impact of the other loop. Robust control: small gain theorem Example: utilization control loop Result: globally stable under our hardware configuration and workload. [Figure: feedback loop with set point, controller, controlled system and uncertainty block]

28 System Implementation Our solution is evaluated on a hardware test-bed, while most existing work uses simulations. Implemented based on FC-ORB real-time middleware Test-bed –12 tasks (25 subtasks) on 4 AMD processors (5 freq levels) –openSUSE Linux 11 with real-time support –RMS (Rate Monotonic Scheduling) with release guard Controllers –Utilization controller –Thermal controller

29 Key components Temperature sensors –Open source software –Thermometer → Machine Specific Register → File System –The measured temperature is the maximum temperature of the processor.

30 Baselines OPEN: a typical open-loop solution that configures the task rates and processor DVFS levels in a static way. Ad Hoc: when the current processor temperature is lower than the set point, Ad Hoc increases the processor's DVFS level by one. When the temperature is higher than the set point, Ad Hoc sets the DVFS level to the lowest one to avoid overheating.
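
The Ad Hoc baseline can be sketched as a single decision function. The sketch assumes that the "lowest level" rule applies when the temperature is above the set point (the reading that matches "to avoid overheating"), that levels are integers from 0 (slowest) to 4 (fastest), and that the level is kept when the temperature equals the set point; none of these details are confirmed by the slide.

```python
def ad_hoc_step(temperature, set_point, level, lowest=0, highest=4):
    """One decision of the Ad Hoc baseline for a single processor."""
    if temperature < set_point:
        # Cool enough: raise the DVFS level by one, up to the maximum.
        return min(level + 1, highest)
    if temperature > set_point:
        # Too hot: drop straight to the lowest level to avoid overheating.
        return lowest
    return level  # assumed behaviour when exactly at the set point


level = 2
level = ad_hoc_step(48.0, 50.0, level)   # below set point -> one level up
level = ad_hoc_step(52.0, 50.0, level)   # above set point -> lowest level
```

Because the rule jumps to the lowest level on every violation and then climbs back one level at a time, it tends to oscillate around the set point, which is the behaviour a feedback controller is meant to avoid.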

31 Empirical Results: Thermal controller –No utilization control –Temperature set point is lowered during the interval from 400 s to 800 s –Our solution achieves better performance

32 Empirical Results: Thermal variations –Temp set point of a single processor is lowered due to a thermal emergency –Temp set points of all processors are lowered due to a thermal emergency –Utilization is guaranteed in spite of the thermal controller

33 Empirical Results: Task execution time variations –The control-based solution guarantees both timeliness and thermal limits –OPEN may violate both the utilization and the thermal bounds

34 Conclusions Existing work studies utilization control and thermal guarantees separately, which has some limitations. This paper –Proposes an integrated solution combining utilization control + thermal control for reliability –Designs a thermal control loop –Provides coordination analysis based on robust control –Presents empirical results to demonstrate the effectiveness

35 Comparison of two papers
What about: Paper 1: power-aware utilization control; Paper 2: simultaneous utilization and thermal control
Focus: Paper 1: CPU frequency scaling loop; Paper 2: thermal control loop
Key idea: Paper 1: CPU frequency can be manipulated to change execution time; Paper 2: integrate conflicting control loops based on robust control theory

36 Critiques for Paper 1 –Is DVFS the only knob to adjust power? Memory hierarchy and I/O? –Overhead of the frequency transition –Alternative ways to create a continuous frequency –Options to handle the nonlinear system model: nonlinear controller –Controllability and feasibility issues –Underutilized real-time systems (courtesy of Khairul)

37 Critiques for Paper 2 –Nearly all existing thermal management work uses DVFS for power management: what are its disadvantages? Other knobs: fan control, Pulse Width Modulation (PWM) –The assumptions of the temperature model: multi-core architecture –Presentation of robust control theory: a possible solution is to learn from [1] how to present the technical details –The current way of modeling uncertainty (introducing a ratio) can be extended [1] Task Scheduling for Control Oriented Requirements for Cyber-Physical Systems, RTSS 2008, from Georgia Institute of Technology
