PRACTICAL DYNAMIC THERMAL MANAGEMENT ON INTEL DESKTOP COMPUTER Guanglei Liu Department of Electrical and Computer Engineering Florida International University.


1 PRACTICAL DYNAMIC THERMAL MANAGEMENT ON INTEL DESKTOP COMPUTER
Guanglei Liu
Department of Electrical and Computer Engineering, Florida International University
July 12, 2012
Major Professor: Dr. Gang Quan

2 Thermal Design Challenges
- Number of transistors keeps increasing
  - Nearly 40 billion transistors are integrated into a single die [Mizunuma, ICCAD 2009]
- More complicated architectures are built
  - An 80-core single-chip processor has been demonstrated by Intel [Vangal, ISSCC 2007]
- High transistor density increases power density
- High power density raises on-chip temperatures and causes thermal issues
- Environmental concerns
  - In the U.S., 46% of electricity is generated from fossil fuels
- Electric bill
  - U.S. datacenters: 120 billion kilowatt hours in 2012, 9 billion dollars, 15% of all energy in U.S.
Source: Environmental Protection Agency (EPA) Report
Figure from Intel Microprocessor Technology Lab, 2011

3 Thermal Issues
- Increase package/cooling costs
  - 1-3 dollars per watt [Skadron, ISCA 2003]
  - In data centers, each watt spent on computing requires ½ to 1 watt for cooling [Brill, 2007]
- Affect reliability
  - As much as 50% reduction of a device's life span for every 10 °C increase [Yeo, DAC 2008]
- Degrade performance
  - 10-15% more circuit delay for each 15 °C increase [Santarini, EDN 2005]
- Crash the computing system
  - The processor's self-protection mechanism automatically shuts down the processor to avoid physical damage [Rohou, WFDO 1999]
- Increase leakage power consumption
  - Raising the temperature from 65 °C to 110 °C can increase leakage power by 38% in integrated circuits [Santarini, EDN 2005]

Computing system cooling solutions
- Mechanical cooling solutions
  - Air cooling (e.g. fan + heat sink)
    - Cooling takes 51% of the overall server power budget [Lefurgy, COM 2003]
    - Noise level increases by 10 dB as fan speed increases by 50% [Lyon, STMMS 2004]
  - Liquid cooling
    - High-density liquid absorbs 3500 times more heat than air [Chu, DMR 2004]
    - High cooling cost
- Dynamic Thermal Management (DTM)
  - Dynamic voltage and frequency scaling (DVFS) technique [Kim, HPCA 2008]
  - Task migration [Lim, QED 2002]
  - Clock gating [Gunther, ITJ 2001]
  - Fetch toggling [Brooks, HPCA 2001]
  - Sacrifices system performance

4 Related Theoretical Work
- Thermal-aware throughput maximization [Chantem et al., ISLPED 2009] [Zhang et al., ICCAD 2007] [Chatha et al., DAC 2010]
- Peak temperature minimization [Chaturvedi et al., ASPDAC 2011] [Liu et al., RTAS 2010] [Qiu et al., ICESS 2010]
- Overall energy reduction under peak temperature constraints [Bao et al., DATE 2010] [Andrei et al., DAC 2009] [Huang et al., DATE 2011]
- Real-time guarantee under peak temperature constraint [Chaturvedi et al., CIT 2010] [Wang et al., RTS 2006] [Huang et al., RTSS 2009]
These theoretical works are derived from simplified mathematical thermal models and idealized assumptions.
Our research goal: to develop a practical hardware platform that enables us to investigate the limitations of the existing theoretical work, and to develop practical and effective DTM techniques that accommodate those limitations.

5 Major contributions
- Practical hardware platform
  - Intel i5 quad core
  - Linux operating system [SouthEast 2011]
- Thermal management validation [SUSCOM 2012]
  - DTM techniques vs. air cooling
  - DTM vs. DPM algorithm
  - Fundamental DTM principles validation
- Reactive DTM (single-core)
  - Limitations of theoretical works
  - Non-constant sampling period
  - Thermal profiling analysis [GreenCom 2012]
- Proactive DTM algorithm (multi-core) [DATE 2012][ASP2012]
  - Neighbor-aware temperature prediction
  - Algorithm for multicore with task migration

6 Practical Hardware Platform
- Platform: Dell Precision T1500 workstation, Intel i5 quad core, Linux kernel version 2.6.23
- Workload: SPEC CPU2000 benchmark (integer and floating-point operations)
- Temperature capturing: CoreTemp driver reads the on-chip thermal sensor
- Computing system hardware monitoring tool: lm-sensors monitors system information (temperature value, fan speed, voltage value)
- DVFS technique: cpufreq module, 12 different speed levels
- Fan speed control: fancontrol shell script, manually adjusts fan speed
- Task migration: CPU_affinity module migrates processes between cores
- Power measurement: Fluke current clamp and multimeter, for cooling/CPU power consumption
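The three software hooks on this slide (temperature capture, speed levels, task migration) can be sketched in a few lines of Python. This is a minimal illustration, not the scripts used in the presentation. The sysfs locations are assumptions: they match the hwmon and cpufreq layout of many recent Linux kernels, but older kernels such as 2.6.23 expose coretemp elsewhere, and `scaling_available_frequencies` is only present with some cpufreq drivers. The function names are hypothetical.

```python
import os
from pathlib import Path


def read_core_temps(hwmon_root="/sys/class/hwmon"):
    """Return {"hwmonN/label": degrees Celsius} for each temp*_input file.

    Assumes the hwmon sysfs convention of reporting millidegrees Celsius.
    """
    temps = {}
    for input_file in sorted(Path(hwmon_root).glob("hwmon*/temp*_input")):
        label_file = input_file.with_name(
            input_file.name.replace("_input", "_label"))
        if label_file.exists():
            label = label_file.read_text().strip()
        else:
            label = input_file.name  # no label file: fall back to file name
        key = f"{input_file.parent.name}/{label}"
        temps[key] = int(input_file.read_text().strip()) / 1000.0
    return temps


def available_frequencies(cpu=0, cpu_root="/sys/devices/system/cpu"):
    """Return the speed levels (kHz) advertised for one CPU, slowest first."""
    path = (Path(cpu_root) / f"cpu{cpu}" / "cpufreq"
            / "scaling_available_frequencies")
    return sorted(int(f) for f in path.read_text().split())


def pin_to_core(core, pid=0):
    """Restrict a process (0 = the caller) to one core; Linux only."""
    os.sched_setaffinity(pid, {core})
```

Passing the root directories as parameters keeps the sketch testable against a fake directory tree and makes it easy to point at a different sysfs layout.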

7 Our Approach
Enhanced reactive DTM (ERDTM)
- Offline thermal profiling analysis
  - Build a temperature vs. speed lookup table
  - Run benchmarks at different speed levels
  - Collect the corresponding peak temperatures
- Buffer zone and safe region
  - The maximum possible temperature increment is 4 °C
[Figure: temperature vs. time, marking the safe region, the buffer zone, T_safe, and T_threshold]
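One way to read the lookup-table and buffer-zone idea is sketched below. The slide's own definitions of the buffer zone and safe region did not survive, so the policy here is an assumption: the safe region is taken to be everything below the threshold minus the 4 °C maximum increment, full speed is allowed there, and inside the buffer zone the controller falls back to the fastest speed whose profiled peak temperature stays under the threshold. The table values and function names are hypothetical.

```python
def safe_temperature(threshold_c, max_increment_c=4.0):
    """Assumed boundary between the safe region and the buffer zone."""
    return threshold_c - max_increment_c


def fastest_sustainable_speed(lookup, threshold_c):
    """lookup maps speed (kHz) to the peak temperature profiled offline.

    Returns the highest speed whose profiled peak stays at or below the
    threshold, or the lowest speed if none qualifies.
    """
    ok = [speed for speed, peak in lookup.items() if peak <= threshold_c]
    return max(ok) if ok else min(lookup)


def choose_speed(current_temp_c, lookup, threshold_c, max_increment_c=4.0):
    """Reactive decision taken at each temperature sample."""
    if current_temp_c < safe_temperature(threshold_c, max_increment_c):
        return max(lookup)  # safe region: run at full speed
    return fastest_sustainable_speed(lookup, threshold_c)  # buffer zone


# Hypothetical profiling results, not measurements from the presentation.
profile = {1200000: 45.0, 2000000: 52.0, 2667000: 61.0}
speed_when_cool = choose_speed(48.0, profile, threshold_c=55.0)
speed_when_warm = choose_speed(52.0, profile, threshold_c=55.0)
```

With this hypothetical table and a 55 °C threshold, a 48 °C reading keeps the top speed, while a 52 °C reading (inside the assumed buffer zone) drops to the middle level.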

8 Experimental results
Experiment setup
- Four identical tasks are assigned to the four cores to simulate a single-core environment
- Temperature threshold is 55 °C
- The frequency lookup table is constructed offline
DTM algorithm: number of violations
- FSDTM algorithm: 87
- VS-DTM algorithm: 12
- ERDTM algorithm: 0
Performance evaluation
- ERDTM average throughput improvement is 8.1%

9 Neighbor-aware temperature prediction
Our neighbor-aware prediction combines two weighted factors:
- Individual increment factor: the processor's own temperature increment
- Neighbor increment factor: heat transfer from the neighbor processor
The weights are obtained offline by collecting training data.
Training process
- Run the tasks and record temperature information
- Apply least-squares estimation
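The training step can be illustrated with an ordinary least-squares fit of two weights. The prediction formula itself is not recoverable from the slide, so the linear form used here (predicted increment = a × individual term + b × neighbor term) is an assumption, as are the function names and the sample data.

```python
def fit_weights(samples):
    """Least-squares fit of (a, b) in: increment ~ a * own + b * neighbor.

    samples is a list of (own_term, neighbor_term, observed_increment).
    Solves the 2x2 normal equations directly with Cramer's rule.
    """
    sxx = sum(x * x for x, _, _ in samples)
    sxy = sum(x * y for x, y, _ in samples)
    syy = sum(y * y for _, y, _ in samples)
    sxz = sum(x * z for x, _, z in samples)
    syz = sum(y * z for _, y, z in samples)
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("training data cannot separate the two weights")
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det
    return a, b


def predict_increment(a, b, own_term, neighbor_term):
    """Predicted temperature increment under the assumed linear model."""
    return a * own_term + b * neighbor_term


# Hypothetical training records, generated from a = 0.8 and b = 0.3.
training = [(1.0, 0.0, 0.8), (0.0, 1.0, 0.3), (2.0, 1.0, 1.9), (1.0, 2.0, 1.4)]
a, b = fit_weights(training)
```

Because the fit runs offline, the online predictor only needs one multiply-add per factor at each sampling point.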

10 Neighbor-aware Task Migration
Conventional approach: always migrate the task from the hottest core to the coolest core.
NADTM algorithm
- Predict thermal emergency
- Migrate task
- DVFS technique
Heat factor: evaluates the processor hotness
Increasing factor: evaluates the temperature increment
Our migration strategy chooses the migration candidate with the minimum
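The contrast between the two migration policies can be sketched as follows. The conventional rule is taken directly from the slide. The quantity that the neighbor-aware strategy minimizes is cut off in the transcript, so the score used here (heat factor plus increasing factor) is purely a stand-in assumption, and all names and numbers are hypothetical.

```python
def conventional_migration(temps):
    """Hottest-to-coolest rule: returns (source_core, target_core)."""
    source = max(temps, key=temps.get)
    target = min(temps, key=temps.get)
    return source, target


def neighbor_aware_target(source, heat_factor, increasing_factor):
    """Pick the candidate core with the minimum combined score.

    The score heat_factor + increasing_factor is an assumed placeholder
    for the quantity minimized in the presentation.
    """
    candidates = [core for core in heat_factor if core != source]
    return min(candidates,
               key=lambda core: heat_factor[core] + increasing_factor[core])


# Hypothetical readings (°C) and predicted increments for four cores.
temps = {0: 54.0, 1: 47.0, 2: 50.0, 3: 49.0}
increments = {0: 0.0, 1: 5.0, 2: 0.5, 3: 2.0}

baseline = conventional_migration(temps)
target = neighbor_aware_target(0, temps, increments)
```

In this made-up example the conventional rule sends the task to core 1, the coolest core, while the combined score prefers core 2 because core 1 is predicted to heat up quickly.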

11 Performance analysis
Panels: Single task; Multiple task
- The NADTM algorithm can effectively keep the temperature under the threshold
- It has a small temperature oscillation of 1 °C
- An average of 3.6% overall throughput improvement
- An average of 5.8% overall throughput improvement

12 Thank You for Your Attention!

Journals
1. Guanglei Liu, M. Fan, G. Quan, M. Qiu, "On-Line Predictive Thermal Management under Peak Temperature Constraints for Practical Multi-core Platforms", Journal of Low Power Electronics (ASP) (under review), 2012.
2. Guanglei Liu, G. Quan, M. Qiu, "Practical Dynamic Thermal Management on An Intel Desktop Computer", Embedded Software Design, Journal of Sustainable Computing (SUSCOM) (under review), 2012.
3. H. Huang, V. Chaturvedi, Guanglei Liu, G. Quan, "Leakage Aware Scheduling On Maximum Temperature Minimization For Periodic Hard Real-Time Systems", Journal of Low Power Electronics (ASP), 2012.

Peer Reviewed Conferences
1. Guanglei Liu, M. Fan, G. Quan, "Neighbor-Aware Dynamic Thermal Management for Multi-core Platform", The 15th Design, Automation, and Test in Europe (DATE 2012), Dresden, Germany, March 12-16, 2012.
2. Guanglei Liu, G. Quan, M. Qiu, "The Practical On-line Scheduling for Throughput Maximization on Intel Desktop Platform under the Maximum Temperature Constraint", The 2011 IEEE/ACM Green Computing and Communications (GreenCom 2011), Sichuan, China, August 4-5, 2011.
3. Guanglei Liu, G. Quan, "Thermal Aware Scheduling on an Intel Desktop Computer", IEEE SouthEast Conference (SouthEast 2011), Nashville, Tennessee, March 17-20, 2011.
4. Guanglei Liu, J. Fan, "Framework for Statistical Analysis of Homogeneous Multi-core Power Grid Networks", IEEE 8th International Conference on ASIC (ASICON 2009), Changsha, China, October 20-23, 2009.
5. C. Liu, J. Tan, R. Chen, Guanglei Liu, J. Fan, "Thermal Aware Clocktree Optimization in Nanometer VLSI Systems Considering Temperature Variations", IEEE 40th Southeastern Symposium on System Theory (SSST 2008), New Orleans, LA, March 17-18, 2008.

