
Princeton University
Runtime Power Monitoring and Phase Analysis Methods for Power Management
Canturk Isci and Margaret Martonosi

Motivation and Research Overview
▪ Power is the primary design constraint for current systems
  - Power density → Cooling / thermal constraints
  - Energy → Battery life
▪ Workloads exhibit drastically different behavior both within applications and among different applications (phases)
  - These can be exploited by workload-directed dynamic management techniques
  - Dynamically reconfigurable hardware
  - Power balancing / activity migration
▪ Need methods to track application power behavior and identify different (repetitive) regions of operation
▪ Live, real-system experiments:
  - Reflect behavior of real, modern processors
  - Observe long time periods
  - Guide on-the-fly adaptations

Our Work
[Figure: overview diagram with blocks labeled Application, Dynamic Program Flow, Hardware Performance Counters, Runtime Monitoring, Real Measurements, Power Estimation & Phase Analysis, Dynamic Management]
▪ Monitor application execution:
  - Performance behavior via performance monitoring counters (PMCs)
  - Control flow via dynamic instrumentation
▪ Employ real power measurements to provide feedback to runtime power estimations and to evaluate phase characterizations
▪ Use application phase information to guide dynamic/adaptive power management techniques
▪ Represent application execution as a stream of PMC and control flow samples
▪ Estimate power behavior from PMC information
▪ Apply phase tracking, detection and prediction strategies under real-system effects, based on PMC and control flow features

Live, Runtime Power Monitoring and Estimation
Counter-Based Power Estimation:
▪ Idealized view: for all components on a chip,
    Power of component I = AccessRate[I] * ArchScaling[I] * MaxPower[I]
  Figure annotations on the terms: CPU performance counters; microarchitectural properties; die area + stressmarks
▪ Realistic view: handle non-linear scaling and add a non-gated term,
    Power of component I = ... + NonGatedPower[I]
  Figure annotations: empirical; multimeter measurements
Total Power Estimates and Measurement Validation:
[Figure: measured and estimated total power for Gcc, Gzip, Vpr, Vortex, Gap, Crafty]
Per-Component Estimates: Ex. Equake
  - Initialization and computation phases
  - Initialization with highly complex IA32 instructions
  - FP-intensive mesh computation phase
+ Fast (real-time)
+ Offers an estimated view of on-chip detail for real systems
+ Real measurement validation

Power Phase Analysis on Real Systems
▪ Phases: distinct and often-recurring regions of program behavior (Ex: Vortex)
▪ Power can also exhibit phase behavior
▪ Phase Tracking: by evaluating the similarity among PMC vectors (PVs)
  - Similarity criterion: L1 distance between PVs
  - PVs achieve within-phase variations below 5 W with fewer than 10 phases
▪ Real-System Effects on Phases: metric and time variability

Phase Detection Under Real-System Variability:
▪ Problem definition: variability effects on phases
[Figure: an ideal phase sequence compared with its glitch, gradient, shift, mutation, and time-dilation variants; sequence labels shown: ABCB, ABCBDB, ABCBDEB, ABCBDEF]
▪ Proposed solution: transition-guided phase detection framework
  - Mutations → Transition-based tracking
  - Glitches and gradients → Glitch/gradient filtering
  - Shifts → ~Binary cross correlations
  - Time dilations → Near-neighbor blurring
[Figure: phase sequences (e.g., ABCB, ABCBDEF) plotted over time t across runs, with detected repetitions marked "Match!"]
▪ Choosing the detection threshold:
  - Very high detect threshold → P{hit} = 0, P{false alarm} = 0
  - Detect threshold of 0 → P{hit} = 1, P{false alarm} = 1
  - Desired operating point → P{hit} ~ 1, P{false alarm} ~ 0
▪ Best detection scheme achieves 100% hit detection with <5% false alarms

Evaluating Control-Flow-Based and Event-Counter-Based Approaches:
▪ Control flow (Basic Block Vectors / BBVs):
  - Perfect repeatability
  - Architectural independence
  - Detail at program level
  - Runtime applicability
  - BBV phases ≢ power phases
  - No physical binding to power
▪ Event counters (PMCs):
  - Runtime monitoring
  - Strong relation to power
  - Imperfect repeatability
  - Lack of detail
▪ Experimentation:
[Figure: experimental setup with Pintool, Application Binary, Application, Dynamic Instrumentation via Pin, OS, OS serial device file, Hardware, Performance Counter Hardware, Power Meas. via Current Probe]
▪ Evaluation:
[Figure: Error vs. Number of Phases]
  - Both approaches bring significant insights to application power behavior
  - PMCs achieve fewer errors than BBVs in power phase characterization (40% fewer on average)

Applications of Power Phase Analysis
▪ Long-Term Value and Duration Prediction of Memory-Bound Phases for DVFS:
  - Can predict >90% of DVFS'able phases, with less than 5% prediction overshoots
▪ Power Balancing for Multiprocessor Systems / Activity Migration:
[Figure: power of Task1 and Task2 on Core/μP 1 and Core/μP 2; annotations: "Swap hot task", "Slow down!", "Speed up!"]

Conclusions
▪ Certain compositions of event counters can provide reasonably accurate runtime estimates of processor power consumption and of the distribution of power among architectural components
▪ Workloads exhibit phases in their performance as well as their power behavior
  - Performance counter vectors help identify different (recurring) power phases of applications
▪ Real-system variability effects impose additional challenges for detecting recurrent phases
  - A phase-transition-guided approach, together with supporting methods such as glitch/gradient filtering and near-neighbor blurring, enables detection of repetitive power phase behavior
▪ Both control-flow and event-counter-based application features provide insight into application power behavior
  - PMC-based approaches generally provide a better proxy for application power phase behavior, due to their strong physical binding to processor power consumption
▪ These phase-oriented methods can be employed to guide a range of applications in current and next-generation systems
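
Illustrative sketch (not part of the original poster): the Python below shows one way the two core ideas could look in code, namely the per-component power estimate (access rate times architectural scaling times maximum power, plus a non-gated term) and phase tracking by L1 distance between PMC vectors. The function names, the first-seen-signature classification rule, the threshold value and the sample numbers are all assumptions made for illustration; the poster does not specify them.

```python
from typing import List, Sequence


def component_power(access_rate: float, arch_scaling: float,
                    max_power: float, non_gated_power: float = 0.0) -> float:
    """Estimated power (W) of one component: the idealized product of
    access rate, architectural scaling and max power, plus a non-gated term."""
    return access_rate * arch_scaling * max_power + non_gated_power


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 (Manhattan) distance between two PMC vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def track_phases(pvs: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Assign a phase id to each PMC vector (PV).

    Assumed rule (hypothetical, for illustration): a PV joins the nearest
    known phase if its L1 distance to that phase's signature is within
    `threshold`; otherwise it starts a new phase and becomes its signature.
    """
    signatures: List[List[float]] = []  # first PV seen for each phase
    labels: List[int] = []
    for pv in pvs:
        best_id, best_dist = -1, float("inf")
        for phase_id, sig in enumerate(signatures):
            dist = l1_distance(pv, sig)
            if dist < best_dist:
                best_id, best_dist = phase_id, dist
        if best_id >= 0 and best_dist <= threshold:
            labels.append(best_id)
        else:
            signatures.append(list(pv))
            labels.append(len(signatures) - 1)
    return labels


if __name__ == "__main__":
    # Made-up, normalized two-counter samples alternating between two behaviors.
    samples = [[0.90, 0.10], [0.85, 0.15], [0.20, 0.80],
               [0.88, 0.12], [0.25, 0.75]]
    print(track_phases(samples, threshold=0.2))
    print(component_power(access_rate=0.5, arch_scaling=1.0,
                          max_power=10.0, non_gated_power=2.0))
```

A smaller threshold yields more, tighter phases; a larger one merges behaviors into fewer phases, which is the trade-off behind keeping within-phase power variation low with a small phase count.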