© Mircea Stan, Kevin Skadron, David Brooks, 2002

Presentation transcript:

Overview
1. Motivation (Kevin)
2. Thermal issues (Kevin)
3. Power modeling (David)
4. Thermal management (David)
5. Optimal DTM (Lev)
6. Clustering (Antonio)
7. Power distribution (David)
8. What current chips do (Lev)
9. HotSpot (Kevin)

PowerPC G3 Microprocessor
On-chip temperature sensor (junction temperature)
– Based on differential voltage change across 2 diodes of different sizes
– Implemented in PowerPC G3/G4 processors
OS required for control
Instruction cache throttling used to dynamically lower junction temperature

Pentium III
On-die thermal diode
– Coupled with board-level thermal diode sensor
Uses
– Monitor long-term temperature and environmental trends
– Provide indication of catastrophic failure

Pentium 4
Thermal ramp rates ~50°C/second (over whole package)
Much too high for coarse-grained solutions
Thermal Monitor
– Highly accurate on-die temperature sensing circuit
– Fast-acting temperature control circuit (~50 ns)
[Figure labels: Reference Current Source, Temperature Sensing Diode, Current Comparator, PROCHOT#]

Pentium 4 -- Thermal Monitor
Trip point is calibrated at manufacturing time
Simple response
– Turn processor clocks on/off at 50% duty cycle
– For 1.5 GHz processor, ~2 μs on + ~2 μs off?

Pentium 4 -- Results
For 200 traces (TPC-C, SPEC, Microsoft)
– Thermal design point can be reduced to 75% of true “max power” with minimal performance loss

Pentium 4
Thermal monitors allow
– Tradeoff between cost and performance
– Cheaper package: more triggers, less performance
– Expensive package: no triggers, no performance loss

Architecture-level Thermal Management
Dynamically adjust execution to control temperature
Avoid catastrophic failure (heat sink, fan)
Permit use of less expensive package
– Design for less than the worst case
– Package costs ~$1/W above ~40 W
– Heat sinks, heat pipes, thinned wafers, fans
  – Fans reduce battery life
– Peak power as high as 150 W now and > 200 W in 1-2 generations
– Temperatures over 100°C
More fundamentally -- there is a need for architecture-level thermal modeling
– What’s actually going on in there?

HotSpot project
Collaboration between HPLP and LAVA Labs (ECE and CS depts., UVa)
Deal with “hot spots”
– Localized heating occurs much faster than chip-wide heating: microseconds to milliseconds
– Chip-wide treatment is too conservative: seconds to minutes, but there is significant lateral thermal coupling through the package
How do we model this?
Prove temperature will be safely bounded

Hot spots in Power4
Temperature “landscape”: space and time
How to estimate early in the design cycle?

Thermal modeling
Want a fine-grained, dynamic model of temperature
– At a granularity architects can reason about
– That accounts for adjacency and package
– That does not require detailed designs
– That is fast enough for practical use
HotSpot - a compact model based on thermal R, C
– Parameterized to automatically derive a model based on various
  – Architectures
  – Power models
  – Floorplans
  – Thermal packages

Dynamic compact thermal model
Electrical-thermal duality
– V ↔ temperature (T)
– I ↔ power (P)
– R ↔ thermal resistance (Rth)
– C ↔ thermal capacitance (Cth)
– RC time constant ↔ Rth · Cth
Kirchhoff's current law gives the differential equation
– Electrical domain: I = C · dV/dt + V/R
– Thermal domain: P = Cth · dT/dt + T/Rth, where T = T_hot – T_amb
At higher granularities of P, Rth, and Cth: P and T are vectors, and Rth and Cth are circuit matrices
[Figure labels: T_hot, T_amb]
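The single-node thermal equation above has a closed-form step response, which the following minimal Python sketch evaluates. The function name and the numeric values for power, Rth, and Cth are illustrative assumptions, not figures from the slides.

```python
import math

def temperature_rise(power_w, r_th, c_th, t_s):
    """Rise above ambient (K) t_s seconds after a constant power step.

    Closed-form solution of P = Cth * dT/dt + T / Rth with T(0) = 0.
    """
    tau = r_th * c_th  # thermal RC time constant in seconds
    return power_w * r_th * (1.0 - math.exp(-t_s / tau))

# Illustrative values only (assumed, not taken from the slides).
P_W, R_TH, C_TH = 10.0, 2.0, 0.05   # W, K/W, J/K
TAU = R_TH * C_TH                   # 0.1 s

rise_at_tau = temperature_rise(P_W, R_TH, C_TH, TAU)
rise_settled = temperature_rise(P_W, R_TH, C_TH, 50 * TAU)
print(f"after one time constant: {rise_at_tau:.2f} K")
print(f"settled: {rise_settled:.2f} K")
```

After one time constant the node has covered about 63% of its final rise P · Rth, exactly as an electrical RC circuit would.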

Package we model
[Figure labels: Heat sink, Heat spreader, PCB, Die, IC package, Pin, Interface material]

Modeling the package
Thermal management allows for packaging alternatives/shortcuts/interactions
HotSpot needs a model of packaging
Basic thermal model:
– Heat spreader
– Heatsink
– Interface materials (e.g. phase-change films)
– Fan/active cooler (TEC)
Thermal resistance due to convection
Constriction and bulk resistance for fins
Spreading constriction and bulk resistance for heatsink base and heat spreader
Thermal resistance for bonding material
Thermal capacitance of heat spreader and heatsink

“Optimal” package
Default package is found using:
– Power dissipation
– Target temperature on chip
– Chip area
– Clock speed – high or low performance
Power dissipation and target temperature are used to determine the resistance value needed
Needs more work: modern packages are incredibly complex, yet there is still a need to model at higher levels
Now: what can we do with HotSpot?

Equivalent vertical network
[Figure labels: Chip, Interface, Spreader, Interface + Sink, Convection, Peripheral spreader nodes]
Diagram is simplified – peripheral nodes

Vertical network parameters
Resistances
– Determined by the corresponding areas and their cross-sectional thickness
– R = resistivity × thickness / area
Capacitances
– C = specific heat × thickness × area
Peripheral node areas
[Figure labels: Chip, Spreader, North, South, East, West]
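The two formulas above can be sketched in Python as follows. The material constants and block dimensions are assumptions chosen for illustration (roughly silicon-like), and the capacitance formula is read as using a volumetric specific heat in J/(m³·K) so that the units work out to J/K.

```python
def vertical_resistance(resistivity, thickness, area):
    """R = resistivity * thickness / area, in K/W."""
    return resistivity * thickness / area

def thermal_capacitance(vol_specific_heat, thickness, area):
    """C = specific heat * thickness * area, in J/K (volumetric specific heat)."""
    return vol_specific_heat * thickness * area

# Assumed, illustrative material constants (not from the slides).
RESISTIVITY = 0.01        # m*K/W, i.e. conductivity of 100 W/(m*K)
VOL_SPEC_HEAT = 1.75e6    # J/(m^3*K)

# Hypothetical 4 mm x 4 mm block on a 0.5 mm thick die.
thickness_m = 0.5e-3
area_m2 = 4e-3 * 4e-3

r_block = vertical_resistance(RESISTIVITY, thickness_m, area_m2)
c_block = thermal_capacitance(VOL_SPEC_HEAT, thickness_m, area_m2)
print(f"R = {r_block:.4f} K/W, C = {c_block:.4f} J/K, RC = {r_block * c_block * 1e3:.3f} ms")
```

Note that the product RC depends only on thickness and material, not on area, since the area cancels.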

Lateral resistances
Determined by the floorplan and the length of shared edges between adjacent blocks
– "Heat Spreading and Conduction in Compressed Heatsinks", Jaana Behm and Jari Huttunen, in Proceedings of the 10th International Flotherm User Conference, May 2001.

Lateral resistances – contd.
[Figure captions: Lengths used for silicon; Lengths used in the spreader]

Our model (lateral and vertical)
Interface material (not shown)

Temperature equations
Fundamental RC differential equation
– P = C · dT/dt + T/R
Steady state
– dT/dt = 0
– P = T/R
When R and C are network matrices
– Steady state: T = R × P
– Modified transient equation: dT/dt + (RC)^-1 × T = C^-1 × P
– HotSpot software mainly solves these two equations
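As a worked instance of the steady-state matrix equation T = R × P, the sketch below uses a hypothetical two-block network: each block has a vertical resistance to ambient and the two blocks share one lateral resistance. The topology, names, and values are assumptions for illustration; the resistance matrix R is obtained as the inverse of the 2×2 conductance matrix.

```python
def steady_state_two_blocks(power, r_vertical, r_lateral):
    """Return [T1, T2] above ambient for two laterally coupled blocks.

    power:      [P1, P2] in W
    r_vertical: [Rv1, Rv2] in K/W, each block's path to ambient
    r_lateral:  resistance between the two blocks in K/W
    """
    g_lat = 1.0 / r_lateral
    # Conductance matrix G from Kirchhoff's current law at each node.
    g11 = 1.0 / r_vertical[0] + g_lat
    g22 = 1.0 / r_vertical[1] + g_lat
    g12 = -g_lat
    det = g11 * g22 - g12 * g12
    # R = G^-1 (closed-form inverse of a symmetric 2x2 matrix).
    r = [[g22 / det, -g12 / det],
         [-g12 / det, g11 / det]]
    # T = R x P
    return [r[0][0] * power[0] + r[0][1] * power[1],
            r[1][0] * power[0] + r[1][1] * power[1]]

# Illustrative values: only block 1 dissipates power.
temps = steady_state_two_blocks([10.0, 0.0], [2.0, 2.0], 2.0)
print([round(t, 3) for t in temps])
```

Even though block 2 dissipates nothing, it warms up through the lateral path, which is the adjacency effect the model is meant to capture.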

HotSpot
Time evolution of temperature is driven by unit activities and power dissipations averaged over 10K cycles
– Power dissipations can come from any power simulator and act as “current sources” in the RC circuit ('P' vector in the equations)
– Simulation overhead in Wattch/SimpleScalar: < 1%
Requires models of
– Floorplan: important for adjacency
– Package: important for spreading and time constants
– R and C matrices are derived from the above

Implementation
Primarily a circuit solver
Steady-state solution
– Mainly matrix inversion, done in two steps
  – Decomposition of the matrix into lower and upper triangular matrices
  – Successive backward substitution of solved variables
– Implements the pseudocode from CLR (Cormen, Leiserson, and Rivest)
Transient solution
– Inputs: current temperature and power
– Output: temperature for the next interval
– Computed using a fourth-order Runge-Kutta (RK4) method
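The two-step steady-state solve described above can be sketched in pure Python as below. This is a simplified illustration, not HotSpot's code: it omits the row pivoting that the textbook routine includes, which is assumed to be safe here because the example matrix is diagonally dominant. The matrix and power values are made up for the example.

```python
def lu_decompose(a):
    """Doolittle LU decomposition without pivoting: a = lower * upper."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [row[:] for row in a]  # work on a copy, leave the input intact
    for k in range(n):
        for i in range(k + 1, n):
            factor = upper[i][k] / upper[k][k]
            lower[i][k] = factor
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return lower, upper

def lu_solve(lower, upper, b):
    """Solve lower * upper * x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):                      # forward substitution
        y[i] = b[i] - sum(lower[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):            # backward substitution
        tail = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - tail) / upper[i][i]
    return x

# Hypothetical 3-node conductance matrix (W/K) and power vector (W).
G = [[2.0, -1.0, 0.0],
     [-1.0, 3.0, -1.0],
     [0.0, -1.0, 2.0]]
P = [1.0, 0.0, 1.0]

L_mat, U_mat = lu_decompose(G)
temps = lu_solve(L_mat, U_mat, P)
print(temps)
```

Because the decomposition depends only on the network, it can be computed once and reused for every new power vector.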

Transient solution
Solves differential equations of the form dT/dt + AT = B, where A and B are constants
– In HotSpot, A is constant but B depends on the power dissipation
– Solution: assume constant average power dissipation within an interval (10K cycles) and call RK4 at the end of each interval
In RK4, the current temperature (at t) is advanced in very small steps (t+h, t+2h, ...) until the next interval (10K cycles)
RK “4” because the error term is 4th order, i.e., O(h^4)
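A minimal single-node sketch of this scheme follows: power is held constant over one interval and the temperature is advanced through it in small RK4 steps. The function names, the interval length, the step count, and the R and C values are assumptions for illustration; HotSpot itself works on vectors and matrices.

```python
import math

def rk4_step(f, t, y, h):
    """Advance y by one classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def advance_interval(temp, power, r_th, c_th, interval_s, steps):
    """Temperature at the end of one interval with constant average power."""
    def dtemp_dt(_t, y):
        # dT/dt + T/(R*C) = P/C, rearranged for dT/dt
        return (power - y / r_th) / c_th

    h = interval_s / steps
    t = 0.0
    for _ in range(steps):
        temp = rk4_step(dtemp_dt, t, temp, h)
        t += h
    return temp

# Illustrative values only: R = 2 K/W, C = 0.05 J/K, 10 W for 10 ms.
R_TH, C_TH, P_W, INTERVAL = 2.0, 0.05, 10.0, 0.01
numeric = advance_interval(0.0, P_W, R_TH, C_TH, INTERVAL, steps=10)
exact = P_W * R_TH * (1.0 - math.exp(-INTERVAL / (R_TH * C_TH)))
print(f"RK4: {numeric:.6f} K, closed form: {exact:.6f} K")
```

For a single node the closed form is available as a check; RK4 matters once the network has many coupled nodes and no simple closed form.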

Transient solution contd.
4th-order error has to be within the required precision
The step size (h) has to be small enough even for the maximum slope of the temperature evolution curve
Transient solution of the differential equation is of the form A · e^(-Bt), where A and B depend on the RC network
Thus, the maximum value of the slope (A × B) is computed, and the step size is chosen accordingly

Validation
Validated and calibrated using MICRED test chips
– 9x9 array of power dissipators and sensors
– Compared to HotSpot configured with same grid, package
Within 7% for both steady-state and transient step-response
– Interface material (chip/spreader) matters

Current features
Specification of arbitrary floorplans
Format of floorplan file:
– One line per unit
– Line format – \t \t \t \t \n
Takes a power trace file as input and outputs the corresponding temperature trace
Ability to modify package specifications (type of interface material, size and type of heat spreader and heat sink, etc.)

Current floorplan
Modeled after an Alpha 21364

Current floorplan – CPU core

Soon-to-be features
Grid model – RC network per grid cell instead of per block
Temperature models for wires, pads and interface material between heat sink and spreader
Better (more user-friendly) floorplan specification
Automatic floorplan generation using classical floorplanning algorithms

Better floorplan specification
Floorplans of current microprocessors have a structural similarity
Floorplans similar to MIPS R10K, Pentium and the Alpha
Pipeline order corresponds to floorplan adjacency

Better floorplan specification
Sample specification (with % areas) that takes advantage of pipeline order

Automatic floorplan for architects
Why develop an architectural floorplanning tool?
– Thermal modeling requires adjacency information.
– Wire delays make performance depend on the floorplan.
Goal
– Derive a realistic floorplan using only microarchitectural information
– Trade off thermal efficiency against latency
– Simulated-annealing-based floorplan optimization for thermal, delay and combined metrics
Current work. Results will be available soon

Sensors
Caveat emptor: We are not well versed in sensor design; the following is a digest of information we have been able to collect from industry sources and the research literature.

Desirable Sensor Characteristics
Small area
Low power
High accuracy + linearity
Easy access and low access time
Fast response time (slew rate)
Easy calibration
Low sensitivity to process and supply noise

PowerPC G3
(Sanchez et al., Symp. on VLSI Circuits ‘97, COMPCON ‘97)
0.35 μm, 2.5 V
Area: 0.2 mm²
Power: 10 mW
Precision: ±4.5°
Offset: 12° at process corners
Linearity: < ±4°
Based on thermal diodes and current mirrors

Types of Sensors
(In approx. order of increasing ease to build)
Thermocouples – voltage output
– Junction between wires of different materials; voltage at terminals is ∝ T_ref – T_junction
– Often used for external measurements
Thermal diodes – voltage output
– Biased p-n junction; voltage drop for a known current is temperature-dependent
Biased resistors (thermistors) – voltage output
– Voltage drop for a known current is temperature-dependent
  – You can also think of this as varying R
– Example: 1 kΩ metal “snake”
BiCMOS, CMOS – voltage or current output
– Rely on reference voltage or current generated from a reference band-gap circuit; current-based designs often depend on temperature dependence of threshold

Thermal Sensors in PowerPC
On-chip temperature sensor (junction temperature)
– Based on differential voltage change across 2 diodes of different sizes
– Implemented in PowerPC G3/G4 processors
Instruction cache throttling used to dynamically lower junction temperature

Typical Sensor Configuration
PTAT – Proportional to Absolute Temperature

Absolute Sensor 1
Schematics of Delta Vgs Current Reference Generator (left) and Delay Cell (right)
Syal, Lee, Ivanov, Altet, Online Testing Workshop, 2001

Sensors: Problem Issues
Poor control of CMOS transistor parameters
Noisy environment
– Cross talk
– Ground noise
– Power supply noise
These can be reduced by making the sensor larger
– This increases power dissipation
– But we may want many sensors

“Reasonable” Values
Based on conversations with engineers at Sun, Intel, and HP (Alpha)
Linearity: not a problem for range of temperatures of interest
Slew rate: < 1 μs
– This is the time it takes for the physical sensing process (e.g., current) to reach equilibrium
Sensor bandwidth: << 1 MHz, probably kHz
– This is the sampling rate; 100 kHz = 10 μs
– Limited by slew rate but also A/D
  – Consider digitization using a counter

“Reasonable” Values: Precision
Precision
– ± 3° is very reasonable
– ± 2° is reasonable
– ± 1° is feasible but expensive
– < ± 1° is really hard
Mid 1980s: < 0.1° was possible
The limited precision of the G3 sensor seems to have been a design choice involving the digitization
P: 10s of mW

Calibration
Accuracy vs. precision
– Analogous to mean vs. stdev
Calibration deals with accuracy
– The main issue is to reduce inter-die variations in offset
Typically requires per-part testing and configuration
Basic idea: measure offset, store it, then subtract this from dynamic measurements
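The basic calibration idea (measure the offset once, store it, subtract it from later readings) can be sketched as below. The function names and all temperature values are hypothetical; a real part would store the offset in fuses or non-volatile configuration.

```python
def measure_offset(raw_reading, reference_temp):
    """Per-part offset: what the sensor reports minus a known reference."""
    return raw_reading - reference_temp

def corrected_reading(raw_reading, stored_offset):
    """Subtract the stored offset from a dynamic measurement."""
    return raw_reading - stored_offset

# At test time the part is held at a known 25.0 degrees and reads 28.5.
stored_offset = measure_offset(28.5, 25.0)

# Later, in the field, a raw reading of 88.5 is corrected with that offset.
field_temp = corrected_reading(88.5, stored_offset)
print(f"offset = {stored_offset} deg, corrected = {field_temp} deg")
```

This removes a fixed offset (accuracy) but does nothing for reading-to-reading spread (precision).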

Dynamic Offset Cancellation
Rich area of research
Build circuit to continuously, dynamically detect offset and cancel it
Typically uses an op-amp
Has the advantage that it adapts to changing offsets
Has the disadvantage of more complex circuitry

Role of Precision
Suppose:
– Junction temperature is J
– Max variation in sensor is S
– Thermal emergency is T
T = J – S
Spatial gradients
– If sensors cannot be located exactly at hotspots, measured temperature may be G° lower than true hotspot
T = J – S – G
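A small numeric sketch of T = J – S – G follows. The function name and the example temperatures are assumptions chosen only to show how sensor variation and spatial gradient both lower the usable trigger threshold.

```python
def emergency_threshold(junction_limit, sensor_variation, gradient=0.0):
    """T = J - S - G: the reading at which a thermal emergency must be declared."""
    return junction_limit - sensor_variation - gradient

# Hypothetical numbers: 85 degree junction limit, 3 degree sensor variation.
t_sensor_only = emergency_threshold(85.0, 3.0)

# Same sensor placed where it may read 2 degrees below the true hotspot.
t_with_gradient = emergency_threshold(85.0, 3.0, gradient=2.0)

print(t_sensor_only, t_with_gradient)
```

Every degree of S or G is a degree of operating headroom given up, which is why sensor precision and placement affect performance.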

Rate of change of temperature
Our FEM simulations suggest maximum 0.1° in about μs
This is for power density < 1 W/mm², die thickness between 0.2 and 0.7 mm, and contemporary packaging
This means slew rate is not an issue
But sampling rate is!

Sensors Summary
Sensor precision cannot be ignored
– Reducing operating threshold by 1-2 degrees will affect performance
Precision of 1° is conceivable but expensive
– Maybe reasonable for a single sensor or a few
Precision of 2-3° is reasonable even for a moderate number of sensors
Power and area are probably negligible from the architecture standpoint
Sampling period <= μs

HotSpot Summary
HotSpot is a simple, accurate and fast architecture-level thermal model for microprocessors
Over 90 downloads so far
Ongoing active development – architecture-level floorplanning will be available soon
Download site –
Mailing list –

Temperature-aware computing: Optimize performance subject to a thermal constraint