Power Management Features in Intel Processors

Slides:



Advertisements
Similar presentations
Numbers Treasure Hunt Following each question, click on the answer. If correct, the next page will load with a graphic first – these can be used to check.
Advertisements

Adders Used to perform addition, subtraction, multiplication, and division (sometimes) Half-adder adds rightmost (least significant) bit Full-adder.
Chapter 13: I/O Systems I/O Hardware Application I/O Interface
Process Description and Control
Zhongxing Telecom Pakistan (Pvt.) Ltd
1
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Processes and Operating Systems
Copyright © 2013 Elsevier Inc. All rights reserved.
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 6 Author: Julia Richards and R. Scott Hawley.
Author: Julia Richards and R. Scott Hawley
1 Copyright © 2013 Elsevier Inc. All rights reserved. Chapter 3 CPUs.
Properties Use, share, or modify this drill on mathematic properties. There is too much material for a single class, so you’ll have to select for your.
UNITED NATIONS Shipment Details Report – January 2006.
Business Transaction Management Software for Application Coordination 1 Business Processes and Coordination. Introduction to the Business.
1 RA I Sub-Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Casablanca, Morocco, 20 – 22 December 2005 Status of observing programmes in RA I.
Properties of Real Numbers CommutativeAssociativeDistributive Identity + × Inverse + ×
Custom Statutory Programs Chapter 3. Customary Statutory Programs and Titles 3-2 Objectives Add Local Statutory Programs Create Customer Application For.
Chapter 5 Input/Output 5.1 Principles of I/O hardware
Contents Page Learning targets
1 Click here to End Presentation Software: Installation and Updates Internet Download CD release NACIS Updates.
Our Digital World Second Edition
Figure 12–1 Basic computer block diagram.
Mehdi Naghavi Spring 1386 Operating Systems Mehdi Naghavi Spring 1386.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 2: Managing Hardware Devices.
Turing Machines.
Advance Nano Device Lab. Fundamentals of Modern VLSI Devices 2 nd Edition Yuan Taur and Tak H.Ning 0 Ch9. Memory Devices.
PP Test Review Sections 6-1 to 6-6
Seungmi Choi PlanetLab - Overview, History, and Future Directions - Using PlanetLab for Network Research: Myths, Realities, and Best Practices.
Customer Education .
EIS Bridge Tool and Staging Tables September 1, 2009 Instructor: Way Poteat Slide: 1.
Discrete Mathematical Structures: Theory and Applications
Computer Maintenance Unit Subtitle: CPUs Copyright © Texas Education Agency, All rights reserved.1.
Bellwork Do the following problem on a ½ sheet of paper and turn in.
CS 6143 COMPUTER ARCHITECTURE II SPRING 2014 ACM Principles and Practice of Parallel Programming, PPoPP, 2006 Panel Presentations Parallel Processing is.
IP Multicast Information management 2 Groep T Leuven – Information department 2/14 Agenda •Why IP Multicast ? •Multicast fundamentals •Intradomain.
Operating Systems Operating Systems - Winter 2011 Dr. Melanie Rieback Design and Implementation.
Operating Systems Operating Systems - Winter 2012 Dr. Melanie Rieback Design and Implementation.
Operating Systems Operating Systems - Winter 2010 Chapter 3 – Input/Output Vrije Universiteit Amsterdam.
VOORBLAD.
Copyright © 2012, Elsevier Inc. All rights Reserved. 1 Chapter 7 Modeling Structure with Blocks.
1 RA III - Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Buenos Aires, Argentina, 25 – 27 October 2006 Status of observing programmes in RA.
Basel-ICU-Journal Challenge18/20/ Basel-ICU-Journal Challenge8/20/2014.
1..
CONTROL VISION Set-up. Step 1 Step 2 Step 3 Step 5 Step 4.
1 Processes and Threads Chapter Processes 2.2 Threads 2.3 Interprocess communication 2.4 Classical IPC problems 2.5 Scheduling.
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt Synthetic.
Executional Architecture
KAIST Computer Architecture Lab. The Effect of Multi-core on HPC Applications in Virtualized Systems Jaeung Han¹, Jeongseob Ahn¹, Changdae Kim¹, Youngjin.
Model and Relationships 6 M 1 M M M M M M M M M M M M M M M M
Gursharan Singh Tatla PIN DIAGRAM OF 8086 Gursharan Singh Tatla Gursharan Singh Tatla
Analyzing Genes and Genomes
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
Essential Cell Biology
Clock will move after 1 minute
Intracellular Compartments and Transport
PSSA Preparation.
Essential Cell Biology
Immunobiology: The Immune System in Health & Disease Sixth Edition
Energy Generation in Mitochondria and Chlorplasts
Murach’s OS/390 and z/OS JCLChapter 16, Slide 1 © 2002, Mike Murach & Associates, Inc.
CS 423 – Operating Systems Design Lecture 22 – Power Management Klara Nahrstedt and Raoul Rivas Spring 2013 CS Spring 2013.
1 Overview 1.Motivation (Kevin) 1.5 hrs 2.Thermal issues (Kevin) 3.Power modeling (David) Thermal management (David) hrs 5.Optimal DTM (Lev).5 hrs.
Lev Finkelstein ISCA/Thermal Workshop 6/ Overview 1.Motivation (Kevin) 2.Thermal issues (Kevin) 3.Power modeling (David) 4.Thermal management (David)
Evaluation of Advanced Power Management for ClassCloud based on DRBL Rider Grid Technology Division National Center for High-Performance Computing Research.
Power Management. Outline Why manage power? Power management in CPU cores Power management system wide Ways for embedded programmers to be power conscious.
Overview Motivation (Kevin) Thermal issues (Kevin)
Presentation transcript:

Power Management Features in Intel Processors Shimin Chen Intel Labs Pittsburgh UPitt CS 3150, Guest Lecture, February 24, 2010

Power Management Many components in a computer system: CPU(s) DRAM memory Hard drives Graphics card Monitor Network card …… System-wide power management actions are based on power management features of individual components Our focus: CPUs PC system with Intel core i7

Why CPU Power Management? Save power For mobile devices: longer battery life For servers: lower operational cost More environmentally friendly Thermal management (less obvious but very important) Higher power  more heat  higher temperature Maximum operating temperature Beyond this temperature, transistors may not operate correctly. Then one sees weird bugs, or even system crashes. Running CPU at too high temperature reduces the CPU life.

Many Terms When Reading About CPU Power Management P-states, C-states ACPI Enhanced Intel SpeedStep Dynamic frequency and voltage scaling Halt state Idle state Suspend …

Two Perspectives Hardware perspective ACPI standard perspective Bottom up description Hardware mechanisms E.g., Intel processor manuals take this approach ACPI standard perspective ACPI: Advanced Configuration and Power Interface Top down description Define programming APIs and functionalities Confusions often arise because The same concept may be represented with different terms And the two descriptions do not exactly match

The Description in This Talk Combined approach: Provide a high level overview of ACPI Describe the hardware mechanisms and their relationships to ACPI I hope that this can give you a structured view of the CPU power management, and clarify the aforementioned terms and their relationships

Outline Introduction ACPI Overview Enhanced Intel SpeedStep Technology (P-States) Low-Power Idle States (C-States) Multi-core considerations Summary

What Is ACPI? ACPI (Advanced Configuration and Power Interface) Standard interface specification OS can perform power management using this API Hardware and software drivers support this API Mapping from CPU mechanisms to ACPI is provided by BIOS and software drivers Applications OS Power Management ACPI Software drivers Hardware: CPU, BIOS etc.

ACPI State Hierarchy (1/3) Global system states (g-state) G0 : Working G1 : Sleeping (e.g., suspend, hibernate) G2 : Soft off (e.g., powered down but can be restarted by interrupts from input devices) G3 : Mechanical off Lower number means higher power

ACPI State Hierarchy (2/3) Global system states (g-state) G0 : Working Processor power states (C-state) C0 : normal execution C1 : idle C2 : lower power but longer resume latency than C1 C3 : lower power but longer resume latency than C2 G1 : Sleeping (e.g., suspend, hibernate) Sleep State (S-state) S0 S1 S2 S3: suspend S4: hibernate G2 : Soft off (S5) G3 : Mechanical off

ACPI State Hierarchy (3/3) G0 : Working Processor power states (C-state) C0 : normal execution Performance state (P-State) P0: highest performance, highest power P1 Pn C1, C2, C3 G1 : Sleeping (e.g., suspend, hibernate) Sleep State (S-state): S0, S1, S2, S3, S4 G2 : Soft off (S5) G3 : Mechanical off

Supporting ACPI States ACPI defines data structures to track the states and functions to operate on the states CPUs implement mechanisms to support these states BIOS and software drivers hide the difference of CPU implementations to support the ACPI defined data structures and functions

Outline Introduction ACPI Overview Enhanced Intel SpeedStep Technology (P-States) Low-Power Idle States (C-States) Multi-core considerations Summary

Enhanced Intel SpeedStep Technology (EIST) Enhanced Intel SpeedStep == dynamic frequency and voltage scaling An operation point (frequency, voltage) == P-state Note that the CPU is in normal operation, executing instructions (C0)

Why Dynamic Frequency and Power Scaling? Physics: Lower voltage  slower transistor switch speed  longer latency of CPU operations  lower frequency Larger power savings if reducing frequency and voltage at the same time: P= CV2F P: power; C: capacitance; V: voltage; F: frequency

Example: Intel Pentium M at 1.6GHz Source: Ref[4]

Power vs. Core Voltage of Intel Pentium M at 1.6GHz Source: Ref[4]

Hardware Mechanisms Select voltage Voltage Regulator Processor Components Vcc Frequency multiplier Clock

Enhanced SpeedStep vs. Legacy SpeedStep Supports are mainly in CPU itself as opposed in chipsets Faster transition time (e.g., 10us down from 250us for the Intel Pentium M processor)

How to Control EIST in Software? EIST is available or not? CPUID instruction, ECX feature bit 07 Enable EIST (in OS kernel) Set special register IA32_MISC_ENABLE bit 16 Change operational point (in OS kernel) Write operation point ID to special register IA32_PERF_CTL This ID is processor model specific

EIST Availability Enhanced Intel SpeedStep® Technology is available in Pentium M processor Pentium 4 Intel Xeon Intel® Core™ Solo Intel® Core™ Duo Intel® Atom™ Intel® Core™2 Duo

Outline Introduction ACPI Overview Enhanced Intel SpeedStep Technology (P-States) Low-Power Idle States (C-States) Multi-core considerations Summary

Low-Power Idle State Power saving mechanisms: These are the idle C-State: C1, … CPU is not executing instructions in these C-states Power saving mechanisms: Stop clock signal Flush and shutdown cache Turn off cores

C-State in Intel Core i7 Processor Core C0 State The normal operating state of a core where code is being executed. Core C1/C1E State The core halts; it processes cache coherence snoops. C1E: if possible, reduce voltage and frequency to the lowest

C-State in Intel Core i7 Processor Core C0 State The normal operating state of a core where code is being executed. Core C1/C1E State The core halts; it processes cache coherence snoops. Core C3 State The core flushes the contents of its L1 instruction cache, L1 data cache, and L2 cache to the shared L3 cache, while maintaining its architectural state. All core clocks are stopped at this point. No snoops. C2 not defined. The C-States are processor model specific.

C-State in Intel Core i7 Processor Core C0 State The normal operating state of a core where code is being executed. Core C1/C1E State The core halts; it processes cache coherence snoops. Core C3 State The core flushes the contents of its L1 instruction cache, L1 data cache, and L2 cache to the shared L3 cache, while maintaining its architectural state. All core clocks are stopped at this point. No snoops. Core C6 State Before entering core C6, the core will save its architectural state to a dedicated SRAM on chip. Once complete, a core will have its voltage reduced to zero volts.

C-State Transition hlt or mwait instruction triggers the transition to lower power states Interrupts (among others) triggers the transition to C0

C-State Availability C0 is always available The low power idle C-States are processor model specific Described in processor data sheet.

Outline Introduction ACPI Overview Enhanced Intel SpeedStep Technology (P-States) Low-Power Idle States (C-States) Multi-core considerations P-States C-States Intel Turbo Boost Technology Summary

Multi-core Chip 4-core CPU (Nehalem) Question: can we set the individual core’s p-state and c-state?

P-State: Enhanced Intel SpeedStep Technology Dynamic frequency and voltage scaling Current Intel processors use the same frequency and voltage for all the cores Therefore, it is impossible to actually run different cores at different p-states. Processor p-state = MIN (core desired p-states)

C-State: Low-Power Idle States The actions are: Halting the execution Flushing cache Stopping clock … These actions can be performed on individual cores Different cores can have different C-State

How about C1E? C1E is C1 + the lowest frequency P-state Therefore, C1E is only used when all the cores are in C1E.

How about C-State for Hyper Threading? There can be two hardware threads per core Each thread may use mwait instruction to specify the desired C-state However, the C-state action cannot be performed for individual threads core c-state = MIN (thread c-state)

General Optimization Guideline In general, it is better to use the cores evenly Distribute computations so that the cores have similar utilization Then all the cores can go into the same P-State The processor can actually go into the P-State For single-threaded application, there is a new Intel processor feature

Intel Turbo Boost Technology Basic idea: Processor frequency is fundamentally limited by the operating temperature If there is head-room in operating temperature, one can increase the processor frequency to achieve higher performance Intel Turbo Boost Technology: All but one core are in C3/C6 Automatically increase frequency given temperature and other constraints

Summary ACPI defines a standard interface for operating systems to utilize hardware power features Supported by most OS, e.g., Linux, Windows CPUs, BIOS, and software drivers combined to support the ACPI interface Intel processor power features: Enhanced Intel SpeedStep Technology: P-State Low power idle states: C-State Intel Turbo Boost Technology: not in ACPI standard

References http://www.acpi.info “Intel® 64 and IA-32 Architectures Software Developer’s Manual”. Volume 3A: System Programming Guide. Order Number: 253668-033US. December 2009. Chapter 14. “Intel® 64 and IA-32 Architectures Optimization Reference Manual”. Order Number: 248966-020. November 2009. Chapter 11. “Enhanced Intel® SpeedStep® Technology for the Intel® Pentium® M Processor”. Order Number: 301170-001. March 2004. “Intel® Core™ i7-800 and i5-700 Desktop Processor Series, Datasheet – Volume 1”. September 2009. Chapter 4.

Thank you!

Backup

Summary: ACPI State Hierarchy G0 : Working Processor power states (C-state) C0 : normal execution Performance state (P-State) : Enhanced Intel SpeedStep Technology Other C-state: model-specific low-power idle states G1 : Sleeping (e.g., suspend, hibernate) Sleep State (S-state): S0, S1, S2, S3, S4 G2 : Soft off (S5) G3 : Mechanical off

Clock Duty Cycle Modulation Some Intel processors support an additional mechanism to reduce power consumption:

Use C-State to Reduce Power OS can monitor activity level (e.g., for every 100ms) and determine the desired C-State