Workload Clustering for Increasing Energy Savings on Embedded MPSoCs S. H. K. Narayanan, O. Ozturk, M. Kandemir, M. Karakoy.

Outline
Introduction
MPSoC Architecture
Energy Reduction Schemes
Unified Approach
ILP formulation
Results
Conclusion

Introduction
Systems are heading towards Multiprocessor System-on-Chip (MPSoC) designs.
Energy consumption on these systems is a concern, especially in embedded MPSoCs.
Current energy reduction techniques
  –Work independently
  –Are not optimal
This work uses a unified, optimal scheme to reduce the overall energy consumption.

Architectural Details
CMP/MPSoC
Shared memory system
Each processor can operate at a different frequency.
Each processor can operate at a different voltage level.
Scaling can take place between jobs.

Scenario with no energy saving scheme
[Figure: activity timeline for processors P0 to P5. Legend: Processor Busy, Processor Idle.]

Energy Reduction Schemes
There are two primary groups
  –Voltage scaling techniques
  –Processor shutdown schemes
They can be applied using hardware or an optimizing compiler.
They are applied independently.
They are applied in a disjoint manner.

Processor Shutdown
Saves leakage energy.
Not all processors are used, yet the unused processors still spend energy.
Shut off unused processors.
  –Low power mode
  –Another idea
      Turn off processors by detecting that jobs have finished
      Turn them on later if necessary

Scenario with processor shutdown
[Figure: activity timeline for processors P0 to P5. Legend: Processor Busy, Processor Idle, Processor Shutdown.]

Voltage Scaling
Active power ∝ (voltage level)^2
  –Need to reduce the voltage level in order to increase energy savings.
  –But how? Scaling!
Frequency ∝ voltage level
  –Time ∝ 1/frequency
  –So, time ∝ 1/(voltage level)
So, scale the voltage down to take advantage of extra available time and reduce dynamic energy consumption!
But leakage increases!
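The trade-off on this slide can be illustrated with a small sketch. It assumes the common CMOS approximation that dynamic power is proportional to V^2 * f, so the dynamic energy of a job with a fixed cycle count scales with V^2 while its run time scales with 1/V. The function name `scaled_job` and the constants `k_freq` and `k_energy` are hypothetical and are not taken from the slides.

```python
def scaled_job(cycles, voltage, k_freq=1.0, k_energy=1.0):
    """Return (time, dynamic_energy) for a job of `cycles` cycles.

    Assumptions (illustrative only):
      frequency      = k_freq * voltage          (Frequency is proportional to voltage)
      time           = cycles / frequency        (Time is proportional to 1/frequency)
      dynamic energy = k_energy * cycles * V^2   (from power ~ V^2 * f)
    """
    frequency = k_freq * voltage
    time = cycles / frequency
    energy = k_energy * cycles * voltage ** 2
    return time, energy


# Halving the voltage doubles the run time and quarters the dynamic energy.
full_time, full_energy = scaled_job(cycles=100, voltage=1.0)
half_time, half_energy = scaled_job(cycles=100, voltage=0.5)
print(full_time, full_energy)
print(half_time, half_energy)
```

The longer run time is the slack that voltage scaling consumes, and it is also why leakage becomes a concern: the processor stays powered for longer.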

Scenario with voltage scaling
[Figure: activity timeline for processors P0 to P5. Legend: Processor Voltage Scaled, Processor Active, Processor Idle.]

Intuition for a unified approach
Processor shutdown does not try to take advantage of voltage scaling.
Pure voltage scaling will not shut off idle processors.
Job clustering is not being done.
Hence a unified approach that optimally uses a combination of the two schemes on a per-job/processor basis is needed!

Unified approach
Cluster jobs on as few processors as possible
  –Increases the number of completely idle processors
  –They can be shut down
Perform voltage scaling of those processors that have remaining slack.
Question
  –How is it possible to select the optimal voltage level for a particular job?
  –How is it possible to determine the optimal clustering of jobs?
Answer
  –Integer Linear Programming (ILP)

Scenario with voltage scaling and processor shutdown
[Figure: activity timeline for processors P0 to P5. Legend: Processor Voltage Scaled, Processor Active, Processor Idle, Processor Shutdown.]

Scenario with unified approach including workload clustering
[Figure: activity timeline for processors P0 to P5. Legend: Processor Active, Processor Voltage Scaled, Processor Shutdown.]

Integer Linear Programming (ILP)
A Linear Program (LP) is a problem that can be expressed as follows:
  minimize    cx
  subject to  Ax = b
              x >= 0
x is a vector of variables to be solved for.
A is a matrix of known coefficients.
c and b are vectors of known coefficients.
An ILP additionally restricts the variables in x to integer values.

System and Job Model
Tmax: deadline for the jobs to finish
Jmax jobs
Pmax processors
Vnum discrete voltage levels
Job_length(j,v) captures the time taken by job j to execute at voltage level v.
Job_Dynamic(j,v) captures the dynamic energy spent by a processor executing job j at voltage level v.

Mathematical programming model
X(p,j,v)
  –Binary variable
  –Expresses whether processor p runs job j at voltage v
Job assignment constraint
  –A job runs on only one processor
  –All jobs must be run once
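The slide's own formula is not preserved in this transcript. One way the job assignment constraint could be written, as a sketch that follows from the definition of X(p,j,v) and the two bullet points (the exact form is an assumption), is:

```latex
\[
  \sum_{p=1}^{P_{max}} \sum_{v=1}^{V_{num}} X(p,j,v) = 1
  \qquad \text{for every job } j = 1, \dots, J_{max}
\]
```

Because X is binary, a sum of exactly 1 means each job is placed on exactly one processor at exactly one voltage level, which covers both "runs on only one processor" and "must be run once".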

Mathematical programming model
Deadline constraint
  –All jobs finish before the deadline
Deadline must be within bounds
  –Normally Tmax = length of the longest job without voltage scaling
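The formula for this slide is also missing from the transcript. A plausible sketch, assuming that the jobs clustered on one processor run back to back so that their lengths add up, is:

```latex
\[
  \sum_{j=1}^{J_{max}} \sum_{v=1}^{V_{num}}
    X(p,j,v) \cdot \mathrm{Job\_length}(j,v) \;\le\; T_{max}
  \qquad \text{for every processor } p = 1, \dots, P_{max}
\]
```

Under this reading, a lower voltage level lengthens Job_length(j,v) and uses up slack, while clustering more jobs on the same processor uses up slack as well, so the two techniques compete for the same time budget.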

Mathematical programming model
Job clustering and shutdown constraint
  –Clusters jobs if there are more jobs than processors
  –If it reduces the overall energy of the system
If a processor has no job scheduled on it
  –Shut it down
How do we find out if a processor is busy?
  –Busy(p) is a binary variable
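The transcript does not show how Busy(p) is tied to the assignment variables. A standard linking constraint that would give the described behavior (an assumption, not necessarily the authors' formulation) is:

```latex
\[
  \mathrm{Busy}(p) \;\ge\; X(p,j,v)
  \qquad \text{for all } p,\ j,\ v,
  \qquad \mathrm{Busy}(p) \in \{0, 1\}
\]
```

Any job placed on processor p forces Busy(p) to 1. If no job is placed there, a minimizing solver is free to set Busy(p) to 0, provided that a busy processor adds a positive leakage cost to the objective.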

Mathematical programming model
Dynamic energy computation
  –If a processor p executes a job j at voltage level v
      Add the energy spent in doing so to the overall sum
Leakage energy computation
  –If a processor p is not shut down, i.e. it is busy
      It spends leakage energy; add this to the overall sum
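The two sums described here could be written as follows. This is a sketch: the symbol E_leak(p) for the leakage energy of a processor that stays powered on is hypothetical, since the transcript does not name it or say how it is computed.

```latex
\[
  E_{dyn} = \sum_{p=1}^{P_{max}} \sum_{j=1}^{J_{max}} \sum_{v=1}^{V_{num}}
    X(p,j,v) \cdot \mathrm{Job\_Dynamic}(j,v)
\]
\[
  E_{leak} = \sum_{p=1}^{P_{max}} \mathrm{Busy}(p) \cdot E_{leak}(p)
\]
```

Both expressions are linear in the binary variables X and Busy, which is what allows the energy model to be used inside an ILP.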

Mathematical programming model
Objective function
  –Sum of the dynamic and leakage energy
Simple variants
  –Reduce model to job clustering without voltage scaling
  –Reduce model to voltage scaling without job clustering
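To make the objective concrete, the following sketch enumerates every assignment of a processor and voltage level to each job on a tiny instance and keeps the cheapest feasible one. It is an exhaustive search, not an ILP solver, and it is not the authors' implementation. It assumes that jobs on one processor run back to back and that each powered-on processor costs a fixed leakage energy `leak_energy`. The function name and the example numbers are hypothetical.

```python
from itertools import product


def best_schedule(job_length, job_dynamic, p_max, t_max, leak_energy):
    """Exhaustively minimize dynamic + leakage energy.

    job_length[j][v]  : time of job j at voltage level v   (Job_length(j,v))
    job_dynamic[j][v] : dynamic energy of job j at level v (Job_Dynamic(j,v))
    leak_energy       : assumed fixed cost of each processor left on
    Returns (total_energy, assignment) or None if no schedule meets t_max;
    assignment[j] is the (processor, voltage level) pair chosen for job j.
    """
    j_max = len(job_length)
    v_num = len(job_length[0])
    choices = list(product(range(p_max), range(v_num)))
    best = None
    for assignment in product(choices, repeat=j_max):
        load = [0.0] * p_max  # total run time per processor
        dynamic = 0.0
        for j, (p, v) in enumerate(assignment):
            load[p] += job_length[j][v]
            dynamic += job_dynamic[j][v]
        if max(load) > t_max:  # deadline constraint violated
            continue
        busy = {p for p, _ in assignment}  # processors that cannot be shut down
        total = dynamic + leak_energy * len(busy)
        if best is None or total < best[0]:
            best = (total, assignment)
    return best


# Two identical jobs; level 0 is full voltage, level 1 is scaled down.
lengths = [[4, 8], [4, 8]]
dynamic = [[16, 4], [16, 4]]
print(best_schedule(lengths, dynamic, p_max=2, t_max=8, leak_energy=5))
print(best_schedule(lengths, dynamic, p_max=2, t_max=8, leak_energy=30))
```

With cheap leakage the search spreads the jobs over both processors and scales both down. With expensive leakage it clusters both jobs on one processor at full voltage so that the other can be shut down. Restricting the search to one voltage level or to one job per processor gives the two simple variants listed on the slide.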

Results - Energy Savings

Results - Job Assignment

Conclusion
Implemented a unified ILP model that takes advantage of both voltage scaling and processor shutdown.
The model uses voltage scaling to reduce dynamic energy.
The model uses job clustering to reduce leakage energy.
The model can also represent voltage scaling and job clustering individually.