ISCA 2004 Tutorial: Thermal Issues for Temperature-Aware Computer Systems. Saturday, June 19th, 8:00am - 5:00pm. © Mircea Stan, Kevin Skadron, David Brooks

Presentation transcript:

ISCA 2004 Tutorial: Thermal Issues for Temperature-Aware Computer Systems
Saturday, June 19th, 8:00am - 5:00pm
© Mircea Stan, Kevin Skadron, David Brooks, 2002

Presenters:
- Kevin Skadron, CS Department, University of Virginia
- Mircea Stan, ECE Department, University of Virginia
- David Brooks, CS Department, Harvard University
- Antonio Gonzalez, UPC-Barcelona and Intel Barcelona Research Center
- Lev Finkelstein, Intel Haifa

Overview
1. Motivation (Kevin)
2. Thermal issues (Kevin)
3. Power modeling (David)
4. Thermal management (David)
5. Optimal DTM (Lev)
6. Clustering (Antonio)
7. Power distribution (David)
8. What current chips do (Lev)
9. HotSpot (Kevin)


Motivation
Power consumption: first-order design constraint
- unconstrained power is a theoretical max
- peak (roughly instantaneous) power limits power delivery (dI/dt)
- sustained power limits thermal design/packaging
- max sustained power: thermal “virus”
  - same as thermal design power
- average active power and idle power limit mobile battery life, etc.
- Common fallacy: treating instantaneous power as a stand-in for temperature
Power density is increasing even faster:
- thermal effects become more problematic
- Moore’s Law: exponential increase
Need power/temperature-aware computing!

Power density
[Figure: power density trend. From PACT 2000 keynote; source: Intel website]
But this curve is flattening

Power-aware figures of merit
- Power (P): battery time (mobile), packaging (high-performance)
- Energy (PD): battery life (mobile), fundamental limits (kT)
- Energy-delay (PD^2): performance and low power
- Energy-delay^2 (PD^3): emphasis on performance
Power-aware ≠ low power
Similar to “old” VLSI complexity measures (A, AD, AD^2)
None of these is appropriate for thermal issues
Refs:
- R. Gonzales et al., “Supply and threshold voltage scaling for low power CMOS”, JSSC, Aug.
- A. Martin et al., “Design of an Asynchronous MIPS R3000”, ARVLSI’97
- J. Ullman, “Computational aspects of VLSI”, CS Press, 1984
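As a rough illustration of how these figures of merit can rank the same two designs differently, the sketch below computes P, PD, PD^2 and PD^3 for two made-up design points. The function name and all power and delay values are hypothetical and do not come from the tutorial.

```python
def figures_of_merit(power_w, delay_s):
    """Return power, energy (PD), energy-delay (PD^2) and energy-delay^2 (PD^3)."""
    energy = power_w * delay_s
    return {
        "P": power_w,
        "PD": energy,
        "PD^2": energy * delay_s,
        "PD^3": energy * delay_s ** 2,
    }

# Hypothetical design points: power in watts, delay in seconds for a fixed task.
fast = figures_of_merit(power_w=80.0, delay_s=1.0)
slow = figures_of_merit(power_w=30.0, delay_s=2.0)

# Lower is better for every metric; report which design each metric prefers.
for name in ("P", "PD", "PD^2", "PD^3"):
    preferred = "fast" if fast[name] < slow[name] else "slow"
    print(f"{name}: fast={fast[name]:.1f} slow={slow[name]:.1f} -> prefers {preferred}")
```

With these assumed numbers, the metrics weighted toward power favor the slower design, while the ones weighted toward delay favor the faster one, which is why the choice of metric matters.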

Cooking-aware computing
- Boiling water will come soon

Power and temperature are BAD and can be EVIL
Source: Tom’s Hardware Guide


Thermal issues
Temperature affects:
- Circuit performance
- Circuit power (leakage)
- IC reliability
- IC and system packaging cost
- Environment

Performance and leakage
Temperature affects:
- Transistor threshold and mobility
- Subthreshold leakage, gate leakage
- Ion, Ioff, Igate, delay
ITRS: 85°C for high-performance, 110°C for embedded!
[Figure: NMOS Ion and Ioff]

Temperature-aware circuits
- Robustness constraint: sets Ion/Ioff ratio
- Robustness and reliability: Ion/Igate ratio
- Idea: keep ratios constant with T: trade leakage for performance!
Refs:
- Ghoshal et al., “Refrigeration Technologies…”, ISSCC 2000
- Garrett et al., “T3…”, ISCAS 2001

Resulting performance
- 25% - 30% extra performance (110°C to 0°C)
[Figure: regular vs. TAC]

Reliability
The Arrhenius equation: MTF = A · exp(E_a / (K · T))
- MTF: mean time to failure at T
- A: empirical constant
- E_a: activation energy
- K: Boltzmann’s constant
- T: absolute temperature
Failure mechanisms:
- Die metalization (corrosion, electromigration, contact spiking)
- Oxide (charge trapping, gate oxide breakdown, hot electrons)
- Device (ionic contamination, second breakdown, surface-charge)
- Die attach (fracture, thermal breakdown, adhesion fatigue)
- Interconnect (wirebond failure, flip-chip joint failure)
- Package (cracking, whisker and dendritic growth, lid seal failure)
Most of the above increase with T (Arrhenius)
Notable exception: hot electrons are worse at low temperatures
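To get a feel for how strongly the Arrhenius equation depends on temperature, the following sketch computes the ratio of MTF at two temperatures. The empirical constant A cancels in the ratio. The activation energy of 0.7 eV is an assumed, illustrative value, and the function name is hypothetical; neither comes from the tutorial.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant K in eV/K

def mtf_ratio(t_cool_c, t_hot_c, activation_energy_ev):
    """Return MTF(t_cool) / MTF(t_hot) for MTF = A * exp(E_a / (K * T)).

    Temperatures are given in degrees Celsius and converted to kelvin,
    since T in the Arrhenius equation is absolute temperature.
    """
    t_cool_k = t_cool_c + 273.15
    t_hot_k = t_hot_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_cool_k - 1.0 / t_hot_k
    )
    return math.exp(exponent)

# Assumed activation energy (illustrative only); 85 C and 110 C are the
# ITRS temperature figures quoted earlier in the tutorial.
ratio = mtf_ratio(t_cool_c=85.0, t_hot_c=110.0, activation_energy_ev=0.7)
print(f"MTF at 85 C is about {ratio:.1f}x the MTF at 110 C")
```

The result is only as meaningful as the assumed activation energy, which differs between failure mechanisms.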

Arrhenius or Erroneous?
- “Hot” issue in thermal community: is the Arrhenius equation correct/relevant?
- C. Lasance (Philips): “Erroneous” equation
- Claim: what really matters are thermal gradients in space and time, thermal cycling
- Will not solve the dispute here!
- Agreement: thermal issues are key for reliability, whether static or dynamic
- Another famous quote: “We have a headache with Arrhenius” (T. Okada, Sony, when asked about reliability prediction methods)

Packaging cost
From Cray (local power generator and refrigeration)…
Source: Gordon Bell, “A Seymour Cray perspective”

Packaging cost
To today…
- Grid computing: power plants co-located near compute farms
- IBM S/390: refrigeration
Source: R. R. Schmidt, B. D. Notohardjono, “High-end server low temperature cooling”, IBM Journal of R&D

IBM S/390 refrigeration
- Complex and expensive
Source: R. R. Schmidt, B. D. Notohardjono, “High-end server low temperature cooling”, IBM Journal of R&D

IBM S/390 processor packaging
- Processor subassembly: complex!
- C4: Controlled Collapse Chip Connection (flip-chip)
Source: R. R. Schmidt, B. D. Notohardjono, “High-end server low temperature cooling”, IBM Journal of R&D

Intel Itanium packaging
- Complex and expensive (note heatpipe)
Source: H. Xie et al., “Packaging the Itanium Microprocessor”, Electronic Components and Technology Conference 2002

P4 packaging
- Simpler, but still…
Source: Intel web site

Environment
- Environmental Protection Agency (EPA): computers account for 10% of commercial electricity consumption
  - This includes peripherals, possibly also manufacturing
  - A DOE report suggested this percentage is much lower
  - No consensus, but it’s still a lot
- Equivalent power (with only 30% efficiency) for air conditioning (AC)
- CFCs used for refrigeration
- Lap burn
- Fan noise

Heat mechanisms
- Conduction
- Convection
- Radiation
- Phase change
- Heat storage

Conduction
- Similar to electrical conduction (e.g. metals are good conductors)
- Heat flows from high energy to low energy
- Microscopic (vibration, adjacent molecules, electron transport)
- No major displacement of molecules
- Needs a material: typically in solids (in fluids: distance between molecules)
- Typical example: thermal “slug”, spreader, heatsink
Source: CRC Press, R. Remsburg Ed., “Thermal Design of Electronic Equipment”, 2001

Conduction
- Different materials (not a strong function of temperature)
- Si: more variation
Source: CRC Press, R. Remsburg Ed., “Thermal Design of Electronic Equipment”, 2001

Convection
- Macroscopic (bulk transport, mix of hot and cold, energy storage)
- Needs a material (typically in fluids: liquid, gas)
- Natural vs. forced (gas or liquid)
- Typical example: heatsink (fan), liquid cooling
Source: CRC Press, R. Remsburg Ed., “Thermal Design of Electronic Equipment”, 2001

Radiation
- Electromagnetic waves (can occur in vacuum)
- Negligible in typical applications
- Sometimes the only mechanism (e.g. in space)
Source: CRC Press, R. Remsburg Ed., “Thermal Design of Electronic Equipment”, 2001

Surface-to-surface contacts
- Not negligible, heat crowding
- Thermal greases (can “pump-out”)
- Phase Change Films (undergo a transition from solid to semi-solid with the application of heat)
Source: CRC Press, R. Remsburg Ed., “Thermal Design of Electronic Equipment”, 2001

Phase-change
Thermal solutions evolution:
- Natural air cooling
- Forced-air cooling
- Liquid cooling
- Phase change (e.g. heat pipe)
- Refrigeration
Phase change:
a. Solid changing to a liquid: fusion, or melting
b. Liquid changing to a vapor: evaporation, also boiling
c. Vapor changing to a liquid: condensation
d. Liquid changing to a solid: crystallization, or freezing
e. Solid changing to a vapor: sublimation
f. Vapor changing to a solid: deposition

Thermal capacitance
Example:
- ρ (Aluminum) = 2,710 kg/m³
- C_p (Aluminum) = 875 J/(kg·°C)
- V = t·A (in m³)
- C_bulk = V·C_p·ρ (in J/°C)
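The bulk thermal capacitance formula can be worked through numerically as below. The density and specific heat of aluminum are the values quoted on the slide; the plate thickness and area are assumptions chosen only for illustration, and the function name is hypothetical.

```python
RHO_ALUMINUM = 2710.0  # density in kg/m^3 (value quoted on the slide)
CP_ALUMINUM = 875.0    # specific heat in J/(kg*degC) (value quoted on the slide)

def bulk_thermal_capacitance(thickness_m, area_m2, cp=CP_ALUMINUM, rho=RHO_ALUMINUM):
    """Return C_bulk = V * C_p * rho in J/degC, with V = t * A."""
    volume_m3 = thickness_m * area_m2
    return volume_m3 * cp * rho

# Assumed geometry (not from the slide): a 1 mm thick, 3 cm x 3 cm plate.
c_bulk = bulk_thermal_capacitance(thickness_m=1e-3, area_m2=0.03 * 0.03)
print(f"C_bulk = {c_bulk:.3f} J/degC")
```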

Refrigeration
- “Conventional” vs. thermo-electric (TEC)
- Can get T < T_amb (“negative” Rth!)
- TEC: Peltier effect (can use for local cooling)

TEC electro-thermal model
[Figure: TEC electro-thermal model]

Simplistic steady-state model
- All thermal transfer: R = k/A (resistance scales inversely with area)
- Power density matters!
- Ohm’s law for thermals (steady-state): ΔV = I · R → ΔT = P · R
- T_hot = P · Rth + T_amb
Ways to reduce T_hot:
- reduce P (power-aware)
- reduce Rth (packaging)
- reduce T_amb (Alaska?)
- maybe also take advantage of transients (Cth)
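The steady-state relation T_hot = P · Rth + T_amb and the three ways of reducing T_hot can be tried out with a few lines of code. This is a minimal sketch; the function name and all numeric values (power, thermal resistance, ambient temperature) are assumed for illustration.

```python
def hot_temperature(power_w, rth_c_per_w, t_amb_c):
    """Steady-state thermal Ohm's law: T_hot = P * Rth + T_amb."""
    return power_w * rth_c_per_w + t_amb_c

# Hypothetical baseline: 60 W, 0.8 degC/W, 45 degC ambient.
baseline = hot_temperature(power_w=60.0, rth_c_per_w=0.8, t_amb_c=45.0)

# The three knobs from the slide, each changed in isolation.
less_power = hot_temperature(power_w=50.0, rth_c_per_w=0.8, t_amb_c=45.0)
better_package = hot_temperature(power_w=60.0, rth_c_per_w=0.6, t_amb_c=45.0)
cooler_ambient = hot_temperature(power_w=60.0, rth_c_per_w=0.8, t_amb_c=35.0)

for label, value in [
    ("baseline", baseline),
    ("reduce P", less_power),
    ("reduce Rth", better_package),
    ("reduce T_amb", cooler_ambient),
]:
    print(f"{label}: T_hot = {value:.1f} degC")
```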

Simplistic dynamic thermal model
Electrical-thermal duality:
- V ↔ temperature (T)
- I ↔ power (P)
- R ↔ thermal resistance (Rth)
- C ↔ thermal capacitance (Cth)
- RC ↔ time constant
KCL differential equation: I = C · dV/dt + V/R
Difference equation: ΔV = I/C · Δt − V/(RC) · Δt
Thermal domain: ΔT = P/C · Δt − T/(RC) · Δt, where T = T_hot − T_amb
One can compute stepwise changes in temperature at any granularity at which P, T, R, and C are available.
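The stepwise temperature computation can be sketched as a forward-Euler loop over a power trace. The update follows from the KCL equation: solving I = C · dV/dt + V/R for dV/dt gives I/C − V/(RC), which in the thermal domain becomes ΔT = (P/C − T/(RC)) · Δt. The function name, the single lumped R and C, and all numeric values are assumptions for illustration, not the tutorial's own implementation.

```python
def simulate_temperature(power_trace_w, rth, cth, dt, t_amb_c, t_start_c=None):
    """Step a single lumped thermal RC node through a power trace.

    Each step applies dT = (P/C - T/(R*C)) * dt, where T is the rise
    above ambient (T_hot - T_amb). Returns the absolute T_hot per step.
    """
    t_rise = 0.0 if t_start_c is None else t_start_c - t_amb_c
    history = []
    for power in power_trace_w:
        t_rise += (power / cth - t_rise / (rth * cth)) * dt
        history.append(t_rise + t_amb_c)
    return history

# Hypothetical values: Rth = 0.8 degC/W and Cth = 50 J/degC give RC = 40 s.
# The step dt = 0.1 s is kept much smaller than RC so the update stays stable.
trace = [60.0] * 4000 + [20.0] * 4000  # 400 s at 60 W, then 400 s at 20 W
temps = simulate_temperature(trace, rth=0.8, cth=50.0, dt=0.1, t_amb_c=45.0)

print(f"end of 60 W phase: {temps[3999]:.1f} degC")
print(f"end of 20 W phase: {temps[-1]:.1f} degC")
```

After many time constants each phase settles near the steady-state value P · Rth + T_amb, so the dynamic model agrees with the steady-state one in the limit.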

Combined package model
Steady-state:
- Tj: junction temperature
- Tc: case temperature
- Ts: heatsink temperature
- Ta: ambient temperature
Source: CRC Press, R. Remsburg Ed., “Thermal Design of Electronic Equipment”, 2001
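One common way to relate these four temperatures is a series chain of thermal resistances from junction to ambient. The sketch below assumes such a chain; the resistance names (r_jc, r_cs, r_sa), the function name and the numeric values are assumptions for illustration, and the model on the slide may include additional paths.

```python
def package_temperatures(power_w, r_jc, r_cs, r_sa, t_a_c):
    """Walk a series thermal chain from ambient back to the junction.

    Each stage adds a temperature rise of P * R, as in T = P * R.
    """
    t_s = t_a_c + power_w * r_sa  # heatsink temperature
    t_c = t_s + power_w * r_cs    # case temperature
    t_j = t_c + power_w * r_jc    # junction temperature
    return {"Tj": t_j, "Tc": t_c, "Ts": t_s, "Ta": t_a_c}

# Hypothetical values: 50 W dissipated, resistances in degC/W, 40 degC ambient.
pkg = package_temperatures(power_w=50.0, r_jc=0.3, r_cs=0.1, r_sa=0.5, t_a_c=40.0)
for name in ("Ta", "Ts", "Tc", "Tj"):
    print(f"{name} = {pkg[name]:.1f} degC")
```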

Itanium package model
- Example: processor + 4 cache modules
Source: H. Xie et al., “Packaging the Itanium Microprocessor”, Electronic Components and Technology Conference 2002

Thermal issues summary
- Performance, power, reliability
- Architecture-level: conduction only
  - Convection: too complicated
  - Radiation: can be ignored
- Use compact models for package
- Power density is key