Computer Structure 2012 – Power Management
Lihu Rappoport and Adi Yoaz. Thanks to Efi Rotem for many of the foils.

1 Computer Structure: Power Management
Lihu Rappoport and Adi Yoaz
Thanks to Efi Rotem for many of the foils

2 Processor Power Components
• The power consumed by a processor consists of
  – Dynamic power: power for toggling transistors and lines from 0 → 1 or 1 → 0
    • αCV²f: α – activity, C – capacitance, V – voltage, f – frequency
  – Leakage power: leakage of transistors under voltage
    • A function of: Z – total size of all transistors, V – voltage, t – temperature
• Peak power must not exceed the thermal constraints
  – Power generates heat
    • Heat must be dissipated to keep transistors within the allowed temperature
  – Peak power determines peak frequency (and thus peak performance)
  – Also affects form factor, cooling solution cost, and acoustic noise
• Average power
  – Determines battery life (for mobile devices), electricity bill, air-conditioning bill
  – Average power = Total energy / Total time
    • Including low-activity and idle time (~90% idle time for client)
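The two formulas on this foil can be tried out numerically. The sketch below is only an illustration: the function names and every number in it (activity factor, capacitance, voltage, frequency, idle power, durations) are assumptions chosen to give round results, not values from the course.

```python
def dynamic_power(alpha, c, v, f):
    """Dynamic power in watts: alpha * C * V^2 * f."""
    return alpha * c * v ** 2 * f


def average_power(phases):
    """Average power = total energy / total time.

    phases is a list of (power_watts, duration_seconds) pairs.
    """
    total_energy = sum(power * duration for power, duration in phases)
    total_time = sum(duration for _, duration in phases)
    return total_energy / total_time


# Assumed values: activity 0.2, 10 nF switched capacitance, 1.0 V, 2.5 GHz.
p_active = dynamic_power(alpha=0.2, c=1e-8, v=1.0, f=2.5e9)

# Assumed usage pattern: 10 s active, 90 s idle at 0.5 W (about 90% idle).
p_avg = average_power([(p_active, 10.0), (0.5, 90.0)])

print(f"active power:  {p_active:.2f} W")
print(f"average power: {p_avg:.2f} W")
```

With mostly idle time, the average ends up much closer to the idle power than to the active power, which is why idle behaviour dominates battery life.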

3 Performance per Watt
• In small form-factor devices the thermal budget limits performance
  – Old target: get max performance
  – New target: get max performance at a given power envelope
    • Performance per Watt
• Increasing f also requires increasing V (~linearly)
  – Dynamic Power = αCV²f = Kf³ → X% performance costs ~3X% power
  – A power efficient feature is better than 1:3 performance : power
    • Otherwise it is better to just increase frequency (and voltage)
• Vmin is the minimal operation voltage
  – Once at Vmin, reducing frequency no longer reduces voltage
  – At this point a feature is power efficient only if it is 1:1 performance : power
• Active energy efficiency tradeoff
  – Energy_active = Power_active × Time_active ∝ Power_active / Perf_active
  – Energy efficient feature: 1:1 performance : power
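The 1:3 and 1:1 rules can be written as a small check. This is a sketch under the foil's own simplifications (voltage scales linearly with frequency above Vmin and stays fixed at Vmin); the function names and the sample percentages are hypothetical.

```python
def power_ratio_for_freq(freq_ratio, at_vmin=False):
    """Relative dynamic power after scaling frequency by freq_ratio.

    Above Vmin, V scales with f, so power ~ f^3.
    At Vmin, V is fixed, so power ~ f.
    """
    return freq_ratio if at_vmin else freq_ratio ** 3


def is_power_efficient(perf_gain_pct, power_cost_pct, at_vmin=False):
    """Apply the foil's rule of thumb to a candidate feature.

    Above Vmin the feature must beat 1:3 (performance : power).
    At Vmin it must reach 1:1.
    """
    if at_vmin:
        return power_cost_pct <= perf_gain_pct
    return power_cost_pct < 3.0 * perf_gain_pct


# A 1% frequency increase above Vmin costs roughly 3% power.
print(f"{(power_ratio_for_freq(1.01) - 1.0) * 100:.2f}% extra power")

# Hypothetical feature: +2% performance for +4% power.
print(is_power_efficient(2.0, 4.0))                # above Vmin
print(is_power_efficient(2.0, 4.0, at_vmin=True))  # at Vmin
```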

4 Platform Power
• Processor average power is <10% of the platform

Component                    Share of platform power
Display (panel + inverter)   33%
CPU                          10%
Power Supply                 10%
MCH                           9%
Misc.                         8%
GFX                           8%
HDD                           8%
CLK                           5%
ICH                           3%
DVD                           2%
LAN                           2%
Fan                           2%

5 Managing Power
• Typical CPU usage varies over time
  – Bursts of high utilization and long idle periods (~90% of time in client)
• Optimize power and energy consumption
  – High power when high performance is needed
  – Low power at low activity or idle
• Enhanced Intel SpeedStep® Technology
  – Multiple voltage/frequency operating points
  – OS changes frequency to meet performance needs and minimize power
  – Referred to as processor Performance states = P-states
• OS notifies CPU when no tasks are ready for execution
  – CPU enters sleep state, called C-state
  – Using the MWAIT instruction, with the C-state level as an argument
  – Tradeoff between power and latency
    • Deeper sleep → more power savings → longer to wake

6 P-states
• Operation frequencies are called P-states = Performance states
  – P0 is the highest frequency
  – P1, P2, P3, … are lower frequencies
  – Pn is the min Vcc point = energy efficient point
• DVFS = Dynamic Voltage and Frequency Scaling
  – Power = CV²f ; f = KV → Power ~ f³
  – Program execution time ~ 1/f
  – E = P×t → E ~ f²
    • Pn is the most energy efficient point
  – Going up/down the cubic curve of power
    • High cost to achieve frequency → large power savings for some small frequency reduction
[Figure: power vs. frequency curve with operating points Pn, P2, P1, P0]
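The DVFS relations (Power ~ f³, time ~ 1/f, E ~ f²) can be tabulated for a few operating points. The relative frequencies assigned to each P-state below are made up for illustration, and the model assumes voltage keeps scaling with frequency all the way down to Pn.

```python
def relative_metrics(freq_ratio):
    """Return (power, time, energy) relative to running at freq_ratio = 1.0."""
    power = freq_ratio ** 3      # P ~ f^3 when V scales with f
    time = 1.0 / freq_ratio      # execution time ~ 1/f
    energy = power * time        # E = P * t ~ f^2
    return power, time, energy


# Hypothetical frequency of each P-state relative to P0.
P_STATES = {"P0": 1.0, "P1": 0.8, "P2": 0.65, "Pn": 0.5}

for name, freq in P_STATES.items():
    power, time, energy = relative_metrics(freq)
    print(f"{name}: f={freq:.2f}  power={power:.3f}  time={time:.2f}  energy={energy:.3f}")

# The lowest-frequency point costs the least energy per task in this model.
most_efficient = min(P_STATES, key=lambda name: relative_metrics(P_STATES[name])[2])
print("most energy efficient:", most_efficient)
```

In this model, halving the frequency doubles the run time but cuts power by a factor of eight, so the same task takes a quarter of the energy.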

7 C-States: C0
• C0: CPU active state
[Figure: active core power, made up of local clocks and logic, clock distribution, and leakage]

8 C-States: C1
• C0: CPU active state
• C1: Halt state
  – Stop core pipeline
  – Stop most core clocks
  – No instructions are executed
  – Caches respond to external snoops
[Figure: active core power, with clock distribution and leakage remaining]

9 C-States: C3
• C0: CPU active state
• C1: Halt state
  – Stop core pipeline
  – Stop most core clocks
  – No instructions are executed
  – Caches respond to external snoops
• C3 state
  – Stop remaining core clocks
  – Flush internal core caches
[Figure: active core power, with only leakage remaining]

10 C-States: C6
• C0: CPU active state
• C1: Halt state
  – Stop core pipeline
  – Stop most core clocks
  – No instructions are executed
  – Caches respond to external snoops
• C3 state
  – Stop remaining core clocks
  – Flush internal core caches
• C6 state
  – Processor saves architectural state
  – Turn off power gate, eliminating leakage
[Figure: active core power; with leakage eliminated, core power goes to ~0]
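The C-state ladder trades power savings against wake latency: deeper states save more but take longer to leave. One way to picture the choice an OS makes is the sketch below. The relative power and latency figures are invented placeholders (real values are processor specific), and `pick_c_state` is a hypothetical helper, not an OS or processor interface.

```python
# Hypothetical table: (state name, relative core power, wake latency in microseconds).
# Ordered from shallowest to deepest idle state.
C_STATES = [
    ("C1", 0.40, 2),
    ("C3", 0.15, 50),
    ("C6", 0.00, 100),
]


def pick_c_state(max_wake_latency_us):
    """Return the deepest idle state whose wake latency is tolerable.

    Falls back to the shallowest state (C1) when nothing else fits.
    """
    chosen = C_STATES[0]
    for state in C_STATES:
        if state[2] <= max_wake_latency_us:
            chosen = state
    return chosen[0]


for tolerance in (1, 60, 1000):
    print(f"tolerated latency {tolerance} us -> {pick_c_state(tolerance)}")
```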

11 Putting it all together
• CPU running at max power and frequency
• Periodically enters C1
[Figure: power [W] vs. time; C0 at P0 with periodic drops to C1]

12 Putting it all together
• Going into idle period
  – Gradually enters deeper C-states
  – Controlled by OS
[Figure: power [W] vs. time; C0 at P0, then C1, C2, C3, C4]

13 Putting it all together
• Tracking CPU utilization history
  – OS identifies low activity
  – Switches CPU to lower P-state
[Figure: power [W] vs. time; C0 at P0, then C1, C2, C3, C4, then C0 at P1]

14 Putting it all together
• CPU enters idle state again
[Figure: power [W] vs. time; C0 at P0, then C1, C2, C3, C4, then C0 at P1, then C2, C3, C4]

15 Putting it all together
• Further lowering the P-state
• DVD play runs at lowest P-state
[Figure: power [W] vs. time; C0 at P0, then C1, C2, C3, C4, then C0 at P1, then C2, C3, C4, then C0 at P2]
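The "track utilization history, then switch P-state" behaviour from the last few foils could look roughly like the toy governor below. This is not how any particular OS implements it; the window length, the thresholds, and the class name are all assumptions made for the example.

```python
from collections import deque


class SimpleGovernor:
    """Toy P-state governor driven by a moving average of CPU utilization."""

    def __init__(self, num_p_states=3, window=4, low=0.3, high=0.8):
        self.p_state = 0                      # 0 means P0, the highest frequency
        self.max_p = num_p_states - 1         # index of the lowest-frequency state
        self.history = deque(maxlen=window)   # recent utilization samples (0.0 to 1.0)
        self.low = low                        # below this average: step to a lower frequency
        self.high = high                      # above this average: step to a higher frequency

    def update(self, utilization):
        """Record one utilization sample and return the selected P-state index."""
        self.history.append(utilization)
        average = sum(self.history) / len(self.history)
        if average < self.low and self.p_state < self.max_p:
            self.p_state += 1
        elif average > self.high and self.p_state > 0:
            self.p_state -= 1
        return self.p_state


governor = SimpleGovernor()
samples = [0.9, 0.2, 0.1, 0.1, 0.1, 0.95, 0.95, 0.95, 0.95]
states = [governor.update(u) for u in samples]
print(states)
```

Averaging over a window keeps a single quiet or busy sample from flipping the P-state, at the cost of reacting a few samples late.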

16 Voltage and Frequency Domains
• Two Independent Variable Power Planes
  – CPU cores, ring and LLC
    • Embedded power gates: each core can be turned off individually
    • Cache power gating: turn off portions or all of the cache at deeper sleep states
  – Graphics processor
    • Can be varied or turned off when not active
• Shared frequency for all IA32 cores and ring
• Independent frequency for PG
• Fixed Programmable power plane for System Agent
  – Optimize SA power consumption
  – System On Chip functionality and PCU logic
  – Periphery: DDR, PCIe, Display
[Figure: power planes VCC Core (four gated regions and one ungated region, with embedded power gates), VCC SA, VCC Graphics, VCC Periphery]

17 Turbo Mode
• P1 is the guaranteed frequency
  – CPU and GFX simultaneous heavy load at worst case conditions
  – Actual power has high dynamic range
• P0 is the max possible frequency: the Turbo frequency
  – P1-P0 has a significant frequency range (GHz)
    • Single thread or lightly loaded applications
    • GFX <-> CPU balancing
  – OS treats P0 as any other P-state
    • Requesting it when it needs more performance
  – P1 to P0 range is fully H/W controlled
    • Frequency transitions handled completely in HW
    • PCU keeps silicon within existing operating limits
  – Systems designed to same specs, with or without Turbo Mode
• Pn is the energy efficient state
  – Lower than Pn is controlled by Thermal-State
[Figure: frequency ladder from the 1C frequency (P0) through P1 down to Pn (LFM); P0 to P1 is "Turbo", under H/W control; P0 to Pn are the OS visible states; P1 to Pn are under OS control; below Pn, T-state and throttle]

18 Turbo Mode
• Workload: lightly threaded
• Power Gating: zero power for inactive cores
[Figure: frequency (F) of cores 0-3, No Turbo vs. power gating]

19 Turbo Mode
• Workload: lightly threaded
• Power Gating: zero power for inactive cores
• Turbo Mode: use thermal budget of inactive cores to increase frequency of active cores
[Figure: frequency (F) of cores 0-3 with No Turbo vs. cores 0-1 in Turbo Mode]

20 Turbo Mode
• Workload: lightly threaded
• Power Gating: zero power for inactive cores
• Turbo Mode: use thermal budget of inactive cores to increase frequency of active cores
[Figure: frequency (F) of cores 0-3 with No Turbo vs. cores 0-1 in Turbo Mode]

21 Turbo Mode
• Active cores running workloads < TDP
• Turbo Mode: increase frequency within thermal headroom
[Figure: frequency (F) of cores 0-3, No Turbo vs. Turbo Mode]

22 Turbo Mode
• Workload: lightly threaded, and active cores < TDP
• Power Gating: zero power for inactive cores
• Turbo Mode: increase frequency within thermal headroom
[Figure: frequency (F) of cores 0-3, No Turbo vs. Turbo Mode with cores 2-3 gated]

23 Thermal Capacitance
• Classic model: steady-state thermal resistance
  – Design guide for steady state
• New model: steady-state thermal resistance AND dynamic thermal capacitance
  – Temperature rises as energy is delivered to the thermal solution
  – Thermal solution response is calculated in real time
[Figure: temperature vs. time, classic model response]
[Figure: temperature vs. time, more realistic response to power changes]
Foil taken from IDF 2011
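The difference between the two models can be seen with a lumped thermal RC circuit. The sketch assumes the new model behaves like a single thermal resistance with a single thermal capacitance, which is a textbook simplification and not necessarily the exact model used in the product. The values of `r_th`, `c_th`, the ambient temperature, and the power level are all invented.

```python
def steady_state_temp(power, r_th, t_ambient):
    """Classic model: temperature jumps straight to T_ambient + P * R_th."""
    return t_ambient + power * r_th


def simulate_temp(power_trace, r_th, c_th, t_ambient, dt=0.1):
    """Model with thermal capacitance, integrated with explicit Euler steps.

    C_th * dT/dt = P - (T - T_ambient) / R_th
    power_trace holds one power sample (watts) per time step of dt seconds.
    """
    temp = t_ambient
    temps = []
    for power in power_trace:
        temp += dt * (power - (temp - t_ambient) / r_th) / c_th
        temps.append(temp)
    return temps


# Assumed values: 1.0 K/W resistance, 10 J/K capacitance, 25 C ambient, 40 W step.
R_TH, C_TH, T_AMB = 1.0, 10.0, 25.0
temps = simulate_temp([40.0] * 1000, R_TH, C_TH, T_AMB)  # 100 s of constant power

print(f"classic model:         {steady_state_temp(40.0, R_TH, T_AMB):.1f} C immediately")
print(f"RC model after 1 s:    {temps[9]:.1f} C")
print(f"RC model after 100 s:  {temps[-1]:.1f} C")
```

The slow rise in the RC model is the headroom that lets power exceed the steady-state limit for a short time without exceeding the temperature limit.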

24 Intel® Turbo Boost Technology 2.0
• After idle periods, the system accumulates "energy budget" and can accommodate high power/performance for a few seconds
  – Build up thermal budget during idle periods
  – Use accumulated energy budget to enhance user experience
  – P > TDP: responsiveness
• In steady state conditions the power stabilizes on TDP
  – Sustained power
[Figure: power vs. time; sleep or low power, then C0/P0 (Turbo) above "TDP", then settling at "TDP"]
Foil taken from IDF 2011
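The energy budget idea can be sketched as a simple accumulator: running below TDP banks energy, running above TDP spends it, and once the bank is empty power is held at TDP. This is a deliberately simplified illustration rather than the actual Turbo Boost 2.0 control algorithm, and every number and name in it (`turbo_limit`, `budget_cap`, the demand trace) is hypothetical.

```python
def turbo_power_trace(demand, tdp, turbo_limit, budget_cap, dt=1.0):
    """Return the power granted at each step for a requested power trace.

    demand      -- requested power per step, in watts
    tdp         -- sustained power limit, in watts
    turbo_limit -- short-term power ceiling, in watts
    budget_cap  -- maximum banked energy, in joules
    dt          -- step length, in seconds
    """
    budget = 0.0
    granted = []
    for want in demand:
        # Power above TDP is allowed only while banked energy can cover it.
        limit = min(turbo_limit, tdp + budget / dt)
        power = min(want, limit)
        # Bank energy when below TDP, spend it when above.
        budget += (tdp - power) * dt
        budget = max(0.0, min(budget, budget_cap))
        granted.append(power)
    return granted


# Assumed scenario: 10 s near idle at 5 W, then a 50 W request for 6 s.
demand = [5.0] * 10 + [50.0] * 6
granted = turbo_power_trace(demand, tdp=35.0, turbo_limit=45.0, budget_cap=30.0)
print(granted)
```

In this scenario the burst runs above TDP for a few steps and then settles at TDP, matching the shape of the curve on the foil.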

25 Core and Graphics Power Budgeting
• Cores and Graphics integrated on the same die with separate voltage/frequency controls; tight HW control
• Full package power specifications available for sharing
• Power budget can shift between Cores and Graphics
• Sandy Bridge Next Gen Turbo for short periods
[Figure: core power [W] vs. graphics power [W]; shows specification core power, specification graphics power, total package power, sum of max power, realistic concurrent max power, and where applications with heavy CPU and heavy graphics workloads fall]
Foil taken from IDF 2011
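Budget shifting between cores and graphics can be illustrated with a proportional split. The foil does not describe the actual policy, so the rule below (scale both domains down equally when the package limit is exceeded) is purely an assumption, as are the function name and all wattages.

```python
def share_budget(core_demand, gfx_demand, core_max, gfx_max, package_limit):
    """Split a package power limit between cores and graphics.

    Each domain is first capped at its own maximum. If the sum still exceeds
    the package limit, both are scaled down by the same factor.
    Returns (core_power, gfx_power) in watts.
    """
    core = min(core_demand, core_max)
    gfx = min(gfx_demand, gfx_max)
    total = core + gfx
    if total <= package_limit:
        return core, gfx
    scale = package_limit / total
    return core * scale, gfx * scale


# Assumed limits: cores up to 30 W, graphics up to 20 W, package limit 35 W.
# The sum of the per-domain maxima (50 W) is above the package limit.
print(share_budget(30.0, 5.0, 30.0, 20.0, 35.0))   # heavy CPU workload
print(share_budget(8.0, 20.0, 30.0, 20.0, 35.0))   # heavy graphics workload
print(share_budget(30.0, 20.0, 30.0, 20.0, 35.0))  # both heavy: scaled to fit
```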

