Power-Aware Compilation, CS 671 (Spring 2008), April 22, 2008

Why Worry about Power Dissipation?
- Environment
- Thermal issues: affect cooling, packaging, reliability, timing
- Battery life

Power Dissipation Trends
[Figure: Intel processors (Pentium, Pentium Pro, Pentium 2, Pentium 3, Pentium 4 (Prescott), Pentium 4) plotted against reference points labeled "Hot Plate" and "Nuclear Reactor"]

Cooking-Aware Computing

Intel vs. Duracell
- No Moore's Law in batteries: 2-3%/year growth
[Figure: improvement (compared to year 0) versus time (years) for processor (MIPS), hard disk (capacity), memory (capacity), and battery (energy stored)]

Environment
- Environmental Protection Agency (EPA): computers account for 10% of commercial electricity consumption
  - Includes peripherals, possibly also manufacturing
- Data center growth was cited as a contributor to the 2000/2001 California energy crisis
- Equivalent power (with only 30% efficiency) needed for air conditioning (AC)
- CFCs used for refrigeration
- Lap burn
- Fan noise

Where Does the Juice Go in Laptops?

Now We Know Why Power is Important
What can we do about it?
- Two components to the problem:
  - #1: Understand where and why power is dissipated
  - #2: Think about ways to reduce it at all levels of the computing hierarchy
- In the past, #1 was difficult to accomplish except at the circuit level
- Consequently, most low-power efforts were circuit related

Power: The Basics
- Dynamic "switching" power vs. static "leakage" power
  - Dynamic power dominates, but static power is increasing in importance
  - Trends in each
- Static power: steady, per-cycle energy cost
- Dynamic power: capacitive and short-circuit
  - Capacitive power: charging/discharging at transitions from 0 → 1 and 1 → 0
  - Short-circuit power: power due to brief short-circuit current during transitions
- Most research focuses on capacitive power, but there is recent work on the others

Power Issues in Microprocessors
[Figure: four power issues: capacitive (dynamic) power, static (leakage) power, di/dt (Vdd/Gnd bounce), and temperature. Panels include an inverter schematic (Vin, Vout, CL, Vdd) and a plot of voltage (V) and current (A) marking the minimum voltage over 20 cycles]

Capacitive Power Dissipation
Power ~ ½ C V² A f
- Capacitance (C): function of wire length, transistor size
- Supply voltage (V): has been dropping with successive fab generations
- Clock frequency (f): increasing…
- Activity factor (A): how often, on average, do wires switch?
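The formula on this slide can be evaluated directly. The following is a minimal sketch; the function name and the operating point (1 nF, 1.2 V, A = 0.1, 2 GHz) are illustrative assumptions, not values from the lecture.

```python
def dynamic_power(c_farads, vdd_volts, activity, freq_hz):
    """Capacitive switching power in watts: P ~ 1/2 * C * V^2 * A * f."""
    return 0.5 * c_farads * vdd_volts ** 2 * activity * freq_hz


# Hypothetical operating point, chosen only to make the arithmetic easy to follow.
baseline = dynamic_power(c_farads=1e-9, vdd_volts=1.2, activity=0.1, freq_hz=2e9)

# Halving the activity factor (e.g. by gating idle units) halves the power.
gated = dynamic_power(c_farads=1e-9, vdd_volts=1.2, activity=0.05, freq_hz=2e9)

print(f"baseline: {baseline:.3f} W, with half the activity: {gated:.3f} W")
```

Each of the four terms is a separate lever: C and V are mostly set by circuit and process choices, while A and f are the terms that architecture and software can influence.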

Lowering Dynamic Power
- Reducing Vdd has a quadratic effect
  - However, it has a negative (~linear) effect on performance
- Lowering CL
  - May improve performance as well
  - Keep transistors small (keeps intrinsic gate and diffusion capacitance small)
- Reduce switching activity
  - A function of signal transition statistics and clock rate
  - Clock gating idle units
  - Impacted by logic and architecture decisions
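To make the quadratic effect of Vdd concrete, here is a small sketch that works in ratios relative to a baseline. It assumes C and the activity factor stay fixed and, following the "~linear" remark above, that the clock frequency is scaled by the same factor as Vdd. The function names are illustrative.

```python
def relative_power(v_scale, f_scale=1.0):
    """Dynamic power relative to baseline when Vdd and f are scaled (C and A fixed)."""
    return v_scale ** 2 * f_scale


def relative_energy_per_task(v_scale):
    """Energy for a fixed number of cycles: power * time, with time ~ 1/f, so f cancels."""
    return v_scale ** 2


v = 0.8  # run at 80% of the nominal supply voltage (assumed example value)
print("power, voltage scaled only:     ", relative_power(v))
print("power, voltage and clock scaled:", relative_power(v, f_scale=v))
print("energy per task:                ", relative_energy_per_task(v))
print("run time:                       ", 1 / v)
```

Under these assumptions, a 20% voltage reduction cuts power roughly in half once the clock is slowed to match, but the energy per task drops only by the quadratic factor, because the task takes longer to finish.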

Power vs. Energy

Power vs. Energy
- Power consumption in watts
  - Rate at which energy is consumed over time
  - Determines battery life in hours
  - Sets packaging limits
- Energy efficiency in joules
  - Energy = power * delay (joules = watts * seconds)
  - A lower energy number means less power to perform a computation at the same frequency

Power vs. Energy Metrics
- Power-delay product (PDP) = P_avg * t
  - PDP is the average energy consumed per switching event
- Energy-delay product (EDP) = PDP * t
  - Takes into account that one can trade increased delay for lower energy/operation
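The two metrics can rank the same pair of designs differently. The sketch below compares two hypothetical design points (the wattages and delays are made-up numbers for illustration).

```python
def pdp(p_avg_watts, delay_s):
    """Power-delay product: energy per operation, in joules."""
    return p_avg_watts * delay_s


def edp(p_avg_watts, delay_s):
    """Energy-delay product: PDP weighted by delay, in joule-seconds."""
    return pdp(p_avg_watts, delay_s) * delay_s


# Hypothetical design points performing the same operation.
designs = {
    "fast": (2.0, 1.0),  # 2.0 W for 1 s
    "slow": (0.8, 2.0),  # 0.8 W for 2 s
}

for name, (p_avg, t) in designs.items():
    print(f"{name}: PDP = {pdp(p_avg, t):.1f} J, EDP = {edp(p_avg, t):.1f} J*s")
```

With these numbers the slow design uses less energy per operation (lower PDP), yet the fast design has the lower EDP, since EDP penalizes the extra delay that bought the energy saving.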

Low-Power Software Strategies
- Code running on CPU: code optimizations for low power
- Code accessing memory objects: SW optimizations for memory
- Data flowing on the buses: I/O coding for low power
- Compiler-controlled power management
[Figure: CPU, cache, memory]

Code Optimizations for Low Power
- High-level operations (e.g. a C statement) can be compiled into different instruction sequences
  - Different instructions and orderings have different power
- Instruction selection
  - Select a minimum-power instruction mix for executing a piece of high-level code
- Instruction packing and dual memory loads
  - Two on-chip memory banks
  - Dual load vs. two single loads
  - Almost 50% energy savings

Code Optimizations for Low Power
- Reorder instructions to reduce switching at functional units and I/O buses
  - Cold scheduling minimizes instruction bus transitions
- Operand swapping
  - Swap the operands at the input of the multiplier
  - Result is unaltered, but power changes significantly!
- Other standard compiler optimizations
  - Intermediate level: software pipelining, dead code elimination, redundancy elimination
  - Low level: register allocation and other machine-specific optimizations
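The idea behind cold scheduling can be sketched by counting bit flips between consecutive instruction words. This is a simplified illustration, not the published algorithm: the 8-bit encodings are invented, and the greedy reordering assumes every instruction in the block is independent, whereas a real scheduler must respect data dependencies.

```python
def hamming(a, b):
    """Number of bit positions that differ between two instruction words."""
    return bin(a ^ b).count("1")


def bus_transitions(words):
    """Total bit flips on the instruction bus when words are fetched in order."""
    return sum(hamming(x, y) for x, y in zip(words, words[1:]))


def greedy_cold_order(words):
    """Greedy reordering: fetch next whichever remaining word flips the fewest bits.

    Assumes all instructions are mutually independent (a hypothetical simplification).
    """
    remaining = list(words)
    order = [remaining.pop(0)]  # keep the first instruction in place
    while remaining:
        nxt = min(remaining, key=lambda w: hamming(order[-1], w))
        remaining.remove(nxt)
        order.append(nxt)
    return order


# Made-up 8-bit instruction encodings.
block = [0b0000_1111, 0b1111_0000, 0b0000_1110, 0b1111_0001]
reordered = greedy_cold_order(block)

print("original order: ", bus_transitions(block), "bit flips")
print("reordered block:", bus_transitions(reordered), "bit flips")
```

The same bit-flip count is a reasonable first-order proxy for operand swapping as well: the result is unchanged, but the switching pattern seen by the hardware differs.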

Code Optimizations for Low Power
- Use processor-specific instruction styles
  - On ARM, the default int type is ~20% more efficient than char or short, as the latter result in sign or zero extension
  - On ARM, conditional instructions can be used instead of branches

ARM vs. THUMB
- ARM: 32-bit, requires fewer instructions
- THUMB: 16-bit, more instructions
- Switching between ARM/THUMB takes time

Minimizing Memory Access Costs
- Reduce memory access, make better use of registers
  - Register access consumes less power than memory access
- Easy way: minimize the number of r/w operations
- Cache optimizations
  - Reorder memory accesses to improve cache hit rates
- Can use existing techniques for high-performance code generation

Minimizing Memory Access Costs
- Loop optimizations such as loop unrolling and loop fusion also reduce memory power consumption
- More effective: explicitly target minimization of switching activity on I/O buses and exploit the memory hierarchy
- Data allocation to minimize I/O bus transitions
  - Map large arrays with known access patterns to main memory to minimize address bus transitions
  - Works in conjunction with coding of address buses
- Exploiting the memory hierarchy
  - Organize video and DSP data to maximize use of the higher (lower-power) levels of the memory hierarchy
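As an illustration of how address-bus coding interacts with a known access pattern, the sketch below compares plain binary addresses with Gray-coded addresses for a sequential array walk. Gray coding is used here as one well-known bus encoding; the slide does not say which coding the lecture had in mind, so treat the choice as an assumption.

```python
def to_gray(n):
    """Binary-reflected Gray code of n: consecutive integers differ in exactly one bit."""
    return n ^ (n >> 1)


def transitions(seq):
    """Total bit flips on the bus when the values in seq are driven in order."""
    return sum(bin(a ^ b).count("1") for a, b in zip(seq, seq[1:]))


# Sequential walk over a 16-element array (addresses 0..15).
addresses = list(range(16))

binary_flips = transitions(addresses)
gray_flips = transitions([to_gray(a) for a in addresses])

print("binary address bus:", binary_flips, "bit flips")
print("Gray-coded bus:    ", gray_flips, "bit flips")
```

The benefit depends on the access pattern being sequential, which is why data allocation and bus coding are described as working together: the compiler arranges for predictable strides, and the encoding makes those strides cheap.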

Observation: Execution-time Variation
- Significant variation in execution time of real-time tasks
- But variation is not random, due to correlation in the underlying signal (speech, sensor, etc.)

Observation: Applications Tolerant to Deadline Misses
- E.g. sensor networks
  - Computation deadline misses lead to data loss
  - Packet loss is common in wireless links
  - Significant probability of error in sensor signals: noisy sensor channels
  - Applications are designed to tolerate noisy/bad data by exploiting spatio-temporal redundancy
    - High transient losses are acceptable if localized in time or space
- If the communication is noisy and applications are loss tolerant, is it worthwhile to strive for perfect, noise-free computing?

Exploiting Execution-time Variation and Tolerance to Deadlines
- Idea: predict the execution time of a task instance and dynamically scale voltage so as to minimize shutdown (idle) time
- Execution time prediction
  - Learn the distribution of execution times (pdf)
  - Provide hints
    - MPEG decode can tell whether a frame is I, P, or B
- But some deadlines are missed!
  - Adaptive control loop to keep missed deadlines < limit
- Provides an adaptive power-fidelity trade-off
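The prediction-plus-control-loop idea can be sketched as follows. Everything specific here is an assumption made for illustration: the last-value predictor, the three frequency levels, the 10% miss limit, and the margin adjustment factors. The lecture describes learning a distribution of execution times, which this sketch replaces with the simplest possible predictor.

```python
def pick_frequency(predicted_cycles, deadline_s, levels_hz):
    """Lowest available frequency that finishes the predicted work by the deadline."""
    for f in sorted(levels_hz):
        if predicted_cycles / f <= deadline_s:
            return f
    return max(levels_hz)  # prediction does not fit anywhere: run flat out


def run_tasks(actual_cycles, deadline_s, levels_hz, miss_limit=0.1):
    """Run task instances back to back, choosing a frequency for each from a prediction."""
    margin, misses, chosen = 1.0, 0, []
    prediction = actual_cycles[0]  # hypothetical: first instance is known in advance
    for i, cycles in enumerate(actual_cycles, start=1):
        f = pick_frequency(prediction * margin, deadline_s, levels_hz)
        chosen.append(f)
        if cycles / f > deadline_s:
            misses += 1
        # Adaptive control: pad predictions while the miss rate is over the limit,
        # otherwise relax the padding back toward 1.0.
        if misses / i > miss_limit:
            margin *= 1.1
        else:
            margin = max(1.0, margin * 0.95)
        prediction = cycles  # last-value predictor, relying on instance-to-instance correlation
    return chosen, misses


levels = [100_000_000, 200_000_000, 400_000_000]  # assumed frequency levels in Hz
workload = [900_000, 900_000, 1_500_000, 1_500_000]  # cycles per task instance
chosen, misses = run_tasks(workload, deadline_s=0.01, levels_hz=levels)
print("frequencies chosen:", chosen)
print("deadlines missed:  ", misses)
```

In this run the jump in workload at the third instance causes one missed deadline, after which the padded prediction pushes the next instance to a higher frequency. The miss limit is the knob that trades power against fidelity.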

Compiler-Controlled DVFS
- MICRO'05, Princeton
- Use the compiler to find (predict) large regions where low frequency won't hurt performance

Sensor Network Compilation
- PLDI 2007, University of Pittsburgh
- 1 bit over the wire == 1000 executed instructions
- Rework binary "patches" to minimize difference from original binary
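The 1-bit-to-1000-instructions ratio suggests a simple cost model for code updates. The sketch below is a rough illustration under stated assumptions: the cost unit ("instruction-equivalents"), the byte-by-byte difference count used as a stand-in for patch size, and the example binaries are all invented here and are not taken from the cited work.

```python
INSTRUCTIONS_PER_BIT = 1000  # ratio quoted on the slide


def update_cost(bytes_sent, instructions_run=0):
    """Cost of a code update in instruction-equivalents: radio bits plus on-node work."""
    return bytes_sent * 8 * INSTRUCTIONS_PER_BIT + instructions_run


def changed_bytes(old, new):
    """Bytes that differ position by position, plus any length difference.

    A crude proxy for patch size; a real patch also has to encode offsets.
    """
    diff = sum(a != b for a, b in zip(old, new))
    return diff + abs(len(old) - len(new))


# Hypothetical 4-byte binaries that differ in a single byte.
old_binary = bytes([0x10, 0x20, 0x30, 0x40])
new_binary = bytes([0x10, 0x21, 0x30, 0x40])

full_image = update_cost(len(new_binary))
patch = update_cost(changed_bytes(old_binary, new_binary), instructions_run=500)

print("send full image:", full_image, "instruction-equivalents")
print("send patch:     ", patch, "instruction-equivalents")
```

Because transmission dominates, it can pay to spend extra compile-time effort, and even extra instructions on the node, to keep the new binary as close as possible to the old one.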

Power-Aware Compilation
- Not all optimizations target performance
- Power-aware optimizations are
  - Most important on embedded systems
  - Most effective on VLIW architectures
  - Still present primarily in the research community
- It's important to rethink many of our notions of "optimization"