Variation Aware Application Scheduling in Multi-core Systems Lavanya Subramanian, Aman Kumar Carnegie Mellon University {lsubrama,


Document Map
Motivation: Leakage Power and Frequency Variations in CMP
Problem: Application scheduling exploiting frequency variation and leakage per core in CMPs
Related Work
Proposed Scheme: Unified Power Performance Approach
Milestones

Motivation
Variations in chip multiprocessors are a major concern. They have two components:
- the die-to-die component
- the within-die component
At the transistor/device level, these are variations in effective channel length (L_eff) and threshold voltage (V_th). These variations in L_eff and V_th translate into frequency and leakage current variations at the micro-architecture level.

Motivation (contd.)
Why a unified power/performance approach? Among cores that can operate at a given maximum frequency, leakage profiles vary widely. Analogously, among cores with a given leakage power, maximum frequencies are widely spread. [3]

Problem
Viewing a chip multiprocessor as a homogeneous set of cores is therefore not practical. A CMP has to be viewed instead as a collection of heterogeneous cores:
- each core operating at a different frequency
- each core with a different power profile

Related Work
Work at UIUC [1] proposes a set of scheduling algorithms that take either power or performance into account, but not both together. The basic power-efficiency algorithm (VarP) tries to map applications onto the least leaky cores. Its enhanced version (VarP+AppP) tries to map the applications with the highest dynamic power onto the least leaky cores. Similarly, the performance-centric algorithms (VarF) map applications onto the fastest cores.
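The three baseline policies described above can be sketched in a few lines of Python. This is only an illustration of the one-sentence descriptions on this slide, not the implementation from [1]. The dictionary keys (`name`, `max_freq_mhz`, `leakage_w`, `dynamic_power_w`) and the function names are hypothetical, and the order in which VarP and VarF pick applications is an assumption (the given order), since the slide does not specify it.

```python
# Illustrative sketch only; field names and tie-breaking are assumptions.

def var_p(apps, cores):
    """VarP: place applications (in the given order) on the least leaky cores."""
    by_leakage = sorted(cores, key=lambda c: c["leakage_w"])
    return {a["name"]: c["name"] for a, c in zip(apps, by_leakage)}


def var_p_app_p(apps, cores):
    """VarP+AppP: highest dynamic power application goes to the least leaky core."""
    by_leakage = sorted(cores, key=lambda c: c["leakage_w"])
    by_power = sorted(apps, key=lambda a: a["dynamic_power_w"], reverse=True)
    return {a["name"]: c["name"] for a, c in zip(by_power, by_leakage)}


def var_f(apps, cores):
    """VarF: place applications (in the given order) on the fastest cores."""
    by_freq = sorted(cores, key=lambda c: c["max_freq_mhz"], reverse=True)
    return {a["name"]: c["name"] for a, c in zip(apps, by_freq)}
```

Each policy sorts on a single key, which is the limitation the slide points out: power and performance are never considered together.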

Proposed Scheme
1. Rank the cores in order of maximum frequency.
2. Obtain the static leakage power for each core (profiled statically at a nominal temperature).
3. Rank the applications in order of dynamic power (obtained by static profiling on a core).
4. For each application, starting with the one with the highest dynamic power, map the application onto the highest-frequency core with the least leakage. This could be achieved by sorting the cores into frequency and leakage bins/levels.
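The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scheduler: the `Core` and `App` types, the `schedule` function, the 200 MHz bin width, and the one-application-per-core restriction are all hypothetical. It reads step 4 as "highest frequency bin first, least leaky core within a bin", which is one plausible interpretation of the binning remark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    max_freq_mhz: int   # maximum operating frequency of this core
    leakage_w: float    # static leakage, profiled at a nominal temperature


@dataclass(frozen=True)
class App:
    name: str
    dynamic_power_w: float  # dynamic power from static profiling on one core


def schedule(apps, cores, bin_width_mhz=200):
    """Map applications to cores, one application per core (assumed).

    Cores are ordered by frequency bin (fastest first) and, inside a bin,
    by leakage (least leaky first). Applications are ordered by dynamic
    power (highest first) and paired with cores in that order.
    """
    if len(apps) > len(cores):
        raise ValueError("more applications than cores")
    ranked_cores = sorted(
        cores,
        key=lambda c: (-(c.max_freq_mhz // bin_width_mhz), c.leakage_w),
    )
    ranked_apps = sorted(apps, key=lambda a: a.dynamic_power_w, reverse=True)
    return {a.name: c.name for a, c in zip(ranked_apps, ranked_cores)}
```

With a bin width of 200 MHz, a 3100 MHz core and a 3000 MHz core land in the same bin, so the less leaky of the two receives the highest dynamic power application even though it is nominally slower. The bin width controls how strongly leakage is weighed against frequency.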

Milestones
Milestone 1.1: Building variability information into the CMP simulator. Static profiling of applications.
Milestone 2: Building a scheduler into the CMP simulator.
Milestone 3: Implementing and analyzing the proposed scheme against the baseline algorithms.

References
[1] R. Teodorescu and J. Torrellas. Variation-aware application scheduling and power management for chip multiprocessors. In ISCA'08: Proceedings of the 35th Annual International Symposium on Computer Architecture, 2008.
[2] Y. Abulafia and A. Kornfeld. Estimation of FMAX and ISB in microprocessors. IEEE Transactions on VLSI Systems, 13(10), Oct 2006.
[3] S. Borkar et al. Parameter variations and impact on circuits and micro-architecture. Proc. DAC 2003, pp

Questions?