1
Power Aware Software Architecture
Rajesh K. Gupta University of California, Irvine
2
The “Nouveau Riche” in Computing
Instrumented wide-area spaces
Personal area spaces
Internet end-points
In-body, in-cell, in-vitro spaces
Generational shift in computing devices
  a lot more of everything, including networking and communications
  a lot less of power, energy, volume, weight, patience
Application is everything; the possibilities are limitless
System architectures are due for an overhaul
  the architectures are (radically) changed/challenged
  the programming context is changed
  the system software contract is changed
  new awareness: location, power, timing, reactivity, stability
3
Outline
The case for power awareness in
  application development
  system software
Managing power in the OS
  knobs and strategies
Making software power aware
  the hardware knobs (DVS, DPM)
  the application knobs (duty cycling, criticality, aesthetics)
An ongoing experiment
4
The Case for Power Awareness
Limited availability
Energy and power use of new devices is markedly different from that of laptops and notebook computers
  much wider dynamic range of power demand
  increasing share of memory, communication and signal processing
  multiple power use modalities depending upon application
    “immortal”, “paging-mode RX”, “lifeline TX”, “mission-mode”
5
Power Management Places
Hardware & firmware don’t know the global state or application-specific knowledge
Users don’t know component characteristics, and can’t make frequent decisions
Applications operate independently, and the OS hides machine information from them
OS plays an important role in the allocation and sharing of critical resources
  it is a logical place for dynamic power management
Application-specific constraints and opportunities for saving energy can be known only at the application level
6
Operating System Directed Power Management
Significant opportunities in power management lie with application-specific “knobs”
  quality of service, timing criticality of various functions
Needs of applications are the driving force for OS power management functions & a power-based API
  collaboration between applications and the OS in setting “energy use policy”
  OS helps resolve conflicts and promote cooperation
OS is the most reasonable place, but…
  OS should incorporate application information in power management
  OS should expose power state and events to applications so that they can adapt
7
Power Savings Mechanisms
Dynamic Power Management (DPM)
  When a device is idle, it can transition to low-power “sleep” states.
Dynamic Voltage Scaling (DVS)
  A device can be run at different speeds at different power levels.
  Execution of jobs can be slowed down to save power as long as all jobs are completed by their deadlines.
Plus any application-level “knobs”
  quality and performance measures, application tolerances
8
Implementing DVS
Often done using slowdown factors
  can be static or dynamic
For example: given a frequency range of $[f_{min}, f_{max}]$
  The slowdown factor $\eta$ is the frequency scaled to $[\eta_{min}, 1]$, where $\eta_{min} = f_{min}/f_{max}$.
  When we use a slowdown factor of $\eta$, we set the frequency to $f = \eta \cdot f_{max}$.
  The voltage is changed to the minimum voltage supported at $f$.
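The mapping from a slowdown factor to a frequency can be sketched in a few lines of C. This is only an illustration under stated assumptions: the slowdown factor is called eta here (the slide's own symbol did not survive extraction), the function names are hypothetical, and frequencies are treated as continuous rather than as the discrete levels real hardware offers.

```c
#include <assert.h>

/* Clamp a requested slowdown factor eta to [eta_min, 1],
 * where eta_min = f_min / f_max. Hypothetical helper. */
double clamp_slowdown(double eta, double f_min, double f_max)
{
    assert(f_min > 0.0 && f_min <= f_max);
    double eta_min = f_min / f_max;
    if (eta < eta_min)
        return eta_min;
    if (eta > 1.0)
        return 1.0;
    return eta;
}

/* Frequency selected by a slowdown factor: f = eta * f_max. */
double scaled_frequency(double eta, double f_min, double f_max)
{
    return clamp_slowdown(eta, f_min, f_max) * f_max;
}
```

A request outside the supported range is clamped rather than rejected, so a caller asking for more slowdown than the hardware allows simply gets the lowest frequency.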
9
Slowdown Factors
Much of the work is in the context of real-time systems
  makes sense, since we need something to trade off against the power saved
Known results
  Essentially use schedulability tests to determine the amount of slowdown possible
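One standard instance of this idea, given here as a sketch and not necessarily the test used in the talk: for periodic tasks scheduled by EDF with deadlines equal to periods, the task set is feasible exactly when the total utilization is at most 1. Running at a slowdown factor eta stretches every execution time by 1/eta, so the smallest feasible static factor is the utilization itself. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A periodic task: worst-case execution time at full speed, and period.
 * Deadlines are assumed equal to periods. */
struct task {
    double wcet;
    double period;
};

/* Smallest static slowdown factor that keeps the set EDF-schedulable,
 * i.e. the total utilization U = sum(wcet_i / period_i).
 * A result above 1 means the set is infeasible even at full speed. */
double edf_static_slowdown(const struct task *tasks, size_t n)
{
    assert(tasks != NULL && n > 0);
    double utilization = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(tasks[i].period > 0.0 && tasks[i].wcet >= 0.0);
        utilization += tasks[i].wcet / tasks[i].period;
    }
    return utilization;
}
```

For two tasks with (WCET, period) of (30, 100) and (20, 50), the utilization is 0.7, so under these assumptions the processor could run at 70% of its maximum frequency.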
10
Our Work In Context
DPM for devices with multiple active and multiple sleep states.
Design and analyze algorithms for systems that allow both DPM and DVS.
11
Dynamic Power Management
When a device becomes idle, it can transition to a lower-power state.
A fixed amount of additional time and energy is required to transition back to the active state when a new request for service arrives.
What is the best time threshold to transition to the sleep state?
  Too soon: pay the start-up cost too frequently.
  Too late: spend too much time in the high-power state.
Generally, transition to the sleep state when the cost of being in the active state is at least the cost of “waking up”.
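The break-even rule can be made concrete with a two-state sketch. It assumes a single sleep state that draws no power and a fixed wake-up energy; the function names are hypothetical.

```c
#include <assert.h>

/* Idle time after which the energy spent staying active equals
 * the energy of one wake-up: T = E_wakeup / P_active. */
double breakeven_threshold(double active_power_w, double wakeup_energy_j)
{
    assert(active_power_w > 0.0 && wakeup_energy_j >= 0.0);
    return wakeup_energy_j / active_power_w;
}

/* Energy spent over one idle period of length idle_s when the device
 * stays active until the threshold and then sleeps (sleep power = 0). */
double idle_energy(double idle_s, double active_power_w, double wakeup_energy_j)
{
    double threshold = breakeven_threshold(active_power_w, wakeup_energy_j);
    if (idle_s < threshold)
        return active_power_w * idle_s;                 /* never slept */
    return active_power_w * threshold + wakeup_energy_j; /* slept, then woke */
}
```

With 2 W active power and 4 J wake-up energy the threshold is 2 s. A 5 s idle period then costs 8 J, which is twice the 4 J an offline policy would pay by sleeping immediately; under these assumptions the threshold policy never does worse than that factor of two.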
12
Multi-state Case
Let there be $k+1$ states
  Let State $k$ be the shut-down state and State 0 be the active state
  Let $\alpha_i$ be the power dissipation rate in State $i$
  Let $\beta_i$ be the total energy dissipated to move back to State 0 (the active state)
States are ordered such that $\alpha_{i+1} \le \alpha_i$
$\alpha_k = 0$ and $\beta_0 = 0$ (without loss of generality).
Power-down energy cost can be incorporated in the power-up cost for analysis (if additive).
13
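Lower Envelope Idea
For each state $i$, plot energy against time: $E_i(t) = \alpha_i t + \beta_i$, where $\alpha_i$ is the power dissipation rate in state $i$ and $\beta_i$ is the energy needed to return to the active state
The Lower Envelope Algorithm (LEA) can be deterministic or probabilistic
The probabilistic version (PLEA) is $e/(e-1)$-competitive.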
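A sketch of the deterministic lower-envelope rule follows. It assumes that each state i is described by a power rate alpha and a return-to-active energy beta (symbol names assumed, since the slide's symbols did not survive extraction), so that spending an idle period of length t in state i costs alpha * t + beta. At elapsed idle time t the rule picks the state whose line is lowest. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One power state: power rate while in the state, and the energy
 * needed to return to the active state from it. */
struct power_state {
    double alpha; /* power dissipation rate */
    double beta;  /* energy to move back to the active state */
};

/* Energy of spending an idle period of length t in state s. */
double state_energy(const struct power_state *s, double t)
{
    return s->alpha * t + s->beta;
}

/* Index of the state on the lower envelope at elapsed idle time t.
 * Ties go to the deeper (higher-index) state. */
size_t lea_state(const struct power_state *states, size_t n, double t)
{
    assert(states != NULL && n > 0 && t >= 0.0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (state_energy(&states[i], t) <= state_energy(&states[best], t))
            best = i;
    }
    return best;
}
```

For three illustrative states with (alpha, beta) of (2, 0), (1, 1) and (0, 4), the lines cross at t = 1 and t = 3, so the device would stay active until 1, use the intermediate state until 3, and shut down after that.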
14
Experimental Study: IBM Mobile Hard Drive
State | Power Consumption (W) | Start-up Energy (Joules) | Transition Time to Active
Active/Busy | 2.4 | |
Idle | 0.9 | 0.56 | 40ms
Stand-by | 0.2 | 1.575 | 1.5s
Sleep | | 4.75 | 5s
Trace data with arrival times of disk accesses from the Auspex file server archive.
15
IBM Mobile Hard Drive
16
Goal
Provide ways by which the application, operating system and hardware can exchange energy/power and performance related information efficiently.
Facilitate continuous dialogue and adaptation between the OS and applications.
Facilitate the implementation of power-aware OS services by providing a software interface to low-power devices.
Provide a power-aware API to the end user that enables one to implement energy-efficient RTOS services and applications.
17
Power-aware API Requirements
Independent of hardware and RTOS implementations
  enables its use on different hardware platforms
    for this, all routines should access the HAL (Hardware Abstraction Layer) rather than the hardware directly
  enables its use in different RTOSes as well as with different scheduling strategies
    do not count on specific RTOS info and/or specific schedulers
Services provided
  processor frequency scaling and low-power state transitions, with the costs of making such transitions
  battery status (if the system is battery based)
  appropriate routines to control energy-speed and energy-accuracy knobs available on I/O devices: network interface, serial interface, LCD, etc.
18
Power-aware API
The application interface provides the following services:
The application is able to
  tell RT information to the OS (period, deadlines, WCET, hardness)
  create new threads
  tell the OS the time predicted to finish a given task instance, depending on the conditions of the environment (application dependent and not yet implemented)
The OS must be able to
  predict and tell applications the estimated time to finish the task
    depends on the scheduling scheme used
A hard task must be killed if its deadline is missed.
19
A Power-Aware Software Architecture
[Figure: layer diagram. Labels, top to bottom: Application; PA-API; PA-Middleware; POSIX; PA-OSL; Operating System; Modified OS Services; Hardware Abstraction Layer; PA-HAL; Hardware]
20
Power Aware Software Architecture
PA-API (Power Aware API) interfaces applications and the OS, making the power-aware OS services available to the application writer.
PA-OSL (Power Aware Operating System Layer) implements modified OS services and active components such as a DPM manager.
PA-HAL (Power Aware Hardware Abstraction Layer) interfaces the OS and the hardware, making the power control knobs available to the OS programmer.
21
Software Architecture
PA-API: power-aware function calls available to the application writer. Some functions of this layer are specific to certain scheduling techniques.
PA-Middleware: power-aware services implemented on top of the OS (power management threads, data handling, etc.).
POSIX: standard interface for OS system calls. This isolates PA-API and PA-Middleware from the OS.
PA-OSL: power-aware OS layer. Calls related to modified OS services should go through this level. It also isolates the OS from PA-API and PA-Middleware.
PA-HAL: Power Aware Hardware Abstraction Layer. Isolates the OS from the underlying power-aware hardware.
Modified OS services: implementation or modification of OS services in a power-related fashion, e.g. scheduler, memory manager, I/O.
22
Layer Functionality
Layer | Function names
PA-API | paapi_dvs_create_thread_type(), paapi_dvs_create_thread_instance(), paapi_dvs_app_started(), paapi_dvs_get_time_prediction(), paapi_dvs_set_time_prediction(), paapi_dvs_app_done(), paapi_dvs_set_adaptive_param(), paapi_dvs_set_policy(), paapi_dpm_register_device()
PA-OSL | paosl_dvs_create_task_type_entry(), paosl_dvs_create_task_instance_entry(), paosl_dvs_killer_thread(), paosl_dvs_killer_thread_alarm_handler(), paosl_dpm_register_device(), paosl_dpm_deamon()
PA-HAL | pahal_dvs_initialize_processor_pm(), pahal_dvs_get_frequency_levels_info(), pahal_dvs_get_current_frequency(), pahal_dvs_set_frequency_and_voltage(), pahal_dvs_pre_set_frequency_and_voltage(), pahal_dvs_post_set_frequency_and_voltage(), pahal_dvs_get_lowpower_states_info(), pahal_dvs_set_lowpower_state(), pahal_dpm_device_check_activity(), pahal_dpm_device_pre_switch_state(), pahal_dpm_device_switch_state(), pahal_dpm_device_post_switch_state(), pahal_dpm_device_get_info(), pahal_dpm_device_get_curr_state(), pahal_battery_get_info()
23
DVS Related Functions
paapi_dvs_create_thread_type(), paapi_dvs_create_thread_instance()
  create the type and an instance of a task, respectively
paapi_dvs_app_started(), paapi_dvs_app_done()
  delimit the execution of useful work in a thread; they tell the OS whether the task has finished execution or not
paapi_dvs_get_time_prediction(), paapi_dvs_set_time_prediction()
  get and set the current execution time prediction for a given thread
paapi_dvs_set_adaptive_param()
  sets the parameters of the adaptive policy (described later) for a given task
paapi_dvs_set_policy()
  chooses the policy to be used for DVS
24
DVS Related Functions (contd.)
paosl_dvs_create_task_type_entry(), ...
  create a type and an instance of a thread in the kernel's internal type and instance tables, respectively
paosl_dvs_killer_thread()
  kills a thread that missed a deadline
pahal_dvs_initialize_processor_pm()
  initializes structures for processor power management
pahal_dvs_get_frequency_levels_info(), pahal_dvs_get_current_frequency(), pahal_dvs_set_frequency_and_voltage(), pahal_dvs_pre_set_frequency_and_voltage(), pahal_dvs_post_set_frequency_and_voltage()
  functions to switch the processor among the possible frequency levels
pahal_dvs_get_lowpower_states_info(), pahal_dvs_set_lowpower_state()
  functions to switch the processor among low-power states
25
DPM Functions
paapi_dpm_register_device()
  just registers the device to be power managed
paosl_dpm_deamon()
  implements the actual policy for a specific device. This daemon uses PA-HAL functions to decide how to switch devices among all possible states.
pahal_dpm_device_switch_state()
  switches the device's state
pahal_dpm_device_check_activity()
  checks whether the device has been idle and for how long. This function needs support from the device driver.
pahal_dpm_device_get_info(), pahal_dpm_device_get_curr_state()
  get information about the device and about its current state, respectively
Other functions help implement power policies. For example:
  pahal_battery_get_info(): gets battery status
26
Current Status
API specification available from
Implementation
  eCos RTOS: open-source, object-oriented and highly configurable RTOS (by means of a scripting language)
Hardware platforms we are currently working with:
  Linux-synthetic (emulation of eCos over Linux, for debugging purposes only)
  Compaq iPAQ Pocket PC: StrongARM SA1110 based platform
  Accelent IDP (Integrated Development Platform): also StrongARM SA1110 based
  LRH Intel evaluation board 80200EVB: Intel XScale based
27
Implementation
[Photos: 80200EVB with voltage scaling board and the host system; Compaq iPAQ running eCos; Maxim board for voltage scaling]
28
Using Power Aware OS: Example
The scheduler adapts the frequency according to the real-time parameters passed in on the thread type. The frequency is adjusted by multiplying it by factors, which normally results in a lower speed (a factor can also speed up the processor if it is > 1).

    void main()
    {
        /* The slide's callouts label the numeric arguments as period, WCET
           and deadline; 30 is presumably the WCET. With `hard`, the thread
           instance is killed when its deadline is missed. */
        mpeg_decoding_t = paapi_dvs_create_thread_type(100, 30, 100, hard);

        /* Selects the DVS policy for all threads. */
        paapi_dvs_set_policy(SHUTDOWN | STATIC | DYNAMIC | ADAPTIVE);

        paapi_dvs_create_thread_instance(mpeg_decoding_t, mpeg_decode_thread);
    }
    ...
    void mpeg_decode_thread()
    {
        for (;;) {
            paapi_dvs_app_started();   /* start of useful work */
            /* original code */
            mpeg_frame_decode();
            paapi_dvs_app_done();      /* end of useful work */
        }
    }
29
An Experiment
Application + OS running on the 80200 XScale board
Altera FPGA board generating interrupts to wake up the processor
Maxim board providing voltage scaling
Host PC for debugging and for loading the application + OS onto the board
30
The Experiment with DVS
Shutdown when idle
  as soon as the CPU becomes idle, shut down the processor
Shutdown + static slowdown
  offline slowdown factors are applied; the CPU is shut down when idle
Shutdown + static slowdown + dynamic slowdown
  run-time slowdown factors are computed from a history of execution times, in addition to the static factors and shutdown
Shutdown + static slowdown + dynamic slowdown + adaptive slowdown
  a deadline-driven factor is also applied, in addition to the other factors and shutdown. This factor adapts itself according to the number of deadlines missed in a previous window of executions.
31
DVS Experiment
Four parameters are defined for the adaptive factor:
  Tolerable % of deadlines missed (D) every W executions
  Window size (W)
  Lower bound for the factor (L)
  Increment and decrement steps (Inc and Dec)
For every W executions:
  if the number of deadlines missed is less than D, lower the adaptive factor by Dec if it is greater than L; otherwise keep it as it was.
  if the number of deadlines missed is greater than D, increment the adaptive factor by Inc.
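The window rule above can be sketched as a single update function. The struct and function names are hypothetical, D is taken as a percentage of the window, and clamping the factor at the lower bound after a decrement is an added assumption (the slide only says the factor is lowered while it is above L). The case of exactly D misses leaves the factor unchanged, since the slide defines only the "less than" and "greater than" cases.

```c
#include <assert.h>

struct adaptive_params {
    unsigned tolerated_pct; /* D: tolerable % of deadlines missed per window */
    unsigned window;        /* W: number of executions per window */
    double lower_bound;     /* L: lower bound for the factor */
    double inc;             /* Inc: increment step */
    double dec;             /* Dec: decrement step */
};

/* New adaptive factor after a window in which `missed` deadlines were missed. */
double update_adaptive_factor(double factor, unsigned missed,
                              const struct adaptive_params *p)
{
    assert(p != 0 && p->window > 0 && missed <= p->window);
    /* Compare missed/W against D/100 in integers to avoid rounding issues. */
    unsigned missed_scaled = missed * 100u;
    unsigned tolerated_scaled = p->tolerated_pct * p->window;

    if (missed_scaled < tolerated_scaled) {
        if (factor > p->lower_bound) {
            factor -= p->dec;
            if (factor < p->lower_bound)   /* assumption: never go below L */
                factor = p->lower_bound;
        }
    } else if (missed_scaled > tolerated_scaled) {
        factor += p->inc;
    }
    return factor;
}
```

With D = 20%, W = 10, L = 0.8, Inc = 0.1 and Dec = 0.05 (illustrative values), a window with no misses lowers a factor of 1.0 to 0.95, while a window with five misses raises it to 1.1.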
32
Application Set
Three different real applications running concurrently:
  An MPEG2 decoder
  An ADPCM (Adaptive Differential Pulse Code Modulation) speech encoder
  A floating-point FFT application
Task | Application | WCET (us) | Std Dev (us)
T1 | MPEG2 (wg_gdo_1.mpg) | 30700 | 3100
T2 | MPEG2 (wg_cs_1.mpg) | 26300 | 2100
T3 | ADPCM | 9300 | 3300
T4 | FFT | 15900 |
T5 | FFT (gaussian distribution) | 13600 | 800
33
Task Set
We used three tasksets based on the applications described earlier, as shown in the table below:
Taskset | Characteristics | Static Factor
A | T1 = (26300, , ), T3 = (9300, , ), T4 = (15900, , ) | 0.9495
B | T2 = (30700, , ), T3 = (9300, , ), T4 = (15900, , ) | 0.8979
C | T1 = (30700, , ), T3 = (9300, , ), T5 = (13600, , ) | 0.9207
34
Frequency & Voltage Scaling
For the 4 schemes and the 3 tasksets, we measured processor power consumption using a shunt resistor and a DAQ board. The voltage of the XScale processor is varied dynamically according to the frequency, as in the table below:
Frequency (MHz) | 733 | 666 | 600 | 533 | 466 | 400 | 333
Voltage (Volts) | 1.5 | 1.4 | 1.3 | 1.25 | 1.2 | 1.1 | 1.0
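A lookup over this table might look like the sketch below. It assumes the frequencies and voltages pair up in the order listed (733 MHz with 1.5 V down to 333 MHz with 1.0 V) and that a requested slowdown factor is rounded up to the next available level, so the chosen frequency is never slower than requested. The names are hypothetical and this is not the PA-HAL implementation.

```c
#include <assert.h>
#include <stddef.h>

/* One frequency/voltage operating point. */
struct op_point {
    unsigned mhz;
    double volts;
};

/* Operating points from the table, in ascending frequency order. */
static const struct op_point levels[] = {
    {333, 1.0}, {400, 1.1}, {466, 1.2}, {533, 1.25},
    {600, 1.3}, {666, 1.4}, {733, 1.5},
};
#define NUM_LEVELS (sizeof levels / sizeof levels[0])

/* Lowest operating point whose frequency is at least eta * f_max.
 * Factors above 1 fall back to the highest level. */
const struct op_point *pick_level(double eta)
{
    assert(eta > 0.0);
    double wanted_mhz = eta * (double)levels[NUM_LEVELS - 1].mhz;
    for (size_t i = 0; i < NUM_LEVELS; i++) {
        if ((double)levels[i].mhz >= wanted_mhz)
            return &levels[i];
    }
    return &levels[NUM_LEVELS - 1];
}
```

Under the rounding-up assumption, a factor of 0.8979 maps to 666 MHz at 1.4 V, while 0.9495 still needs the full 733 MHz, which shows how coarse frequency levels can limit the savings from a static factor close to 1.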
35
Results: Taskset A
The “Deadlines missed” column shows the number of deadlines missed per task (T1, T3, T4) for a total of 415/207/138 executions respectively. For the adaptive algorithm, M varies as the number in parentheses, Inc=0.1, Dec=0.5, W=10 and D=20%.
Scheme | Energy | Power | Ratio | Deadlines missed
Normal | 39.085 | 0.779 | 1 | 0/0/0
Only Shutdown | 31.504 | 0.628 | 0.80 |
Shut./Static | 32.024 | 0.638 | 0.81 |
Shut./Static/Dyn. | 28.496 | 0.568 | 0.72 | 1/1/2
Shut./Static/Dyn./Adapt. (0.95) | 26.581 | 0.527 | 0.68 | 3/2/1
Shut./Static/Dyn./Adapt. (0.90) | 26.258 | 0.522 | 0.67 |
Shut./Static/Dyn./Adapt. (0.85) | 25.251 | 0.502 | 0.64 | 3/1/4
Shut./Static/Dyn./Adapt. (0.80) | 24.835 | 0.494 | 0.63 | 3/2/51
Shut./Static/Dyn./Adapt. (0.75) | 24.330 | 0.483 | 0.62 | 3/2/63
36
Results: Taskset B
The “Deadlines missed” column shows the number of deadlines missed per task (T2, T3, T4) for a total of 130/65/43 executions respectively.
Scheme | Energy | Power | Ratio | Deadlines missed
Normal | 12.546 | 0.798 | 1 | 0/0/0
Only Shutdown | 11.265 | 0.716 | 0.89 |
Shut./Static | 9.819 | 0.624 | 0.78 | 1/0/1
Shut./Static/Dyn. | 9.811 | | |
Shut./Static/Dyn./Adapt. (0.95) | 9.795 | 0.623 | |
Shut./Static/Dyn./Adapt. (0.90) | 8.815 | 0.562 | 0.70 | 1/1/31
Shut./Static/Dyn./Adapt. (0.85) | 8.828 | | |
Shut./Static/Dyn./Adapt. (0.80) | 8.185 | 0.522 | 0.65 | 34/10/34
Shut./Static/Dyn./Adapt. (0.75) | 8.211 | 0.525 | |
37
Results: Taskset C
The “Deadlines missed” column shows the number of deadlines missed per task (T1, T3, T5) for a total of 130/65/43 executions respectively.
Scheme | Energy | Power | Ratio | Deadlines missed
Normal | 13.080 | 0.838 | 1 | 0/0/0
Only Shutdown | 12.342 | 0.772 | 0.94 |
Shut./Static | 12.391 | 0.789 | |
Shut./Static/Dyn. | 10.892 | 0.693 | 0.83 | 0/1/18
Shut./Static/Dyn./Adapt. (0.95) | 10.958 | 0.697 | |
Shut./Static/Dyn./Adapt. (0.90) | 9.875 | 0.627 | 0.75 | 1/8/32
Shut./Static/Dyn./Adapt. (0.85) | 9.990 | 0.637 | 0.76 | 11/16/32
Shut./Static/Dyn./Adapt. (0.80) | 9.889 | 0.631 | |
Shut./Static/Dyn./Adapt. (0.75) | 9.789 | 0.624 | 0.74 |
38
OS-directed DVS Results
39
Using Application-level “knob”
Example: image compression algorithm
  trade off image quality against the energy available by varying compression parameters such as BPP (bits per pixel)
The image compression algorithm is run in a continuous loop, with battery polling every 10 seconds.
A simple power tradeoff policy is added to adapt the image quality to the remaining battery voltage.
Whenever the battery voltage drops by 30 mV, the application adjusts the image BPP, starting at 1.5.
For a cut-off of 4020 mV, the battery life is extended from 290 seconds to 340 seconds.
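A policy of this kind could be sketched as follows. The 30 mV trigger and the starting BPP of 1.5 come from the slide; the step size, the BPP floor, the direction of adjustment (lower BPP as the battery drains) and all names are assumptions made only for illustration.

```c
#include <assert.h>

#define DROP_MV   30    /* from the slide: adjust on every 30 mV drop */
#define START_BPP 1.5   /* from the slide: initial bits per pixel */
#define BPP_STEP  0.1   /* hypothetical step per adjustment */
#define MIN_BPP   0.5   /* hypothetical quality floor */

struct bpp_policy {
    int reference_mv; /* battery voltage at the last adjustment */
    double bpp;       /* current bits-per-pixel setting */
};

void bpp_policy_init(struct bpp_policy *p, int battery_mv)
{
    assert(p != 0);
    p->reference_mv = battery_mv;
    p->bpp = START_BPP;
}

/* Called on each battery poll; lowers BPP once per full 30 mV drop
 * since the last adjustment and returns the BPP to use next. */
double bpp_policy_poll(struct bpp_policy *p, int battery_mv)
{
    assert(p != 0);
    while (p->reference_mv - battery_mv >= DROP_MV) {
        p->reference_mv -= DROP_MV;
        p->bpp -= BPP_STEP;
        if (p->bpp < MIN_BPP)
            p->bpp = MIN_BPP;
    }
    return p->bpp;
}
```

The compression loop would call bpp_policy_poll() at each 10-second battery poll and pass the returned value to the encoder; a drop of less than 30 mV leaves the setting unchanged.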
40
The battery life is extended by 18% with a slight (= “not noticeable by human eye”) degradation of image quality
41
Concluding Remarks
Computers with radios present a very wide range of system optimization opportunities for power, size and performance.
Efficient power and energy management is key to enabling a new range of applications.
Energy efficiency is a system-level concern that cuts across subsystem components, functionality layers and their implementations.
Application programming needs to be energy aware and provide knobs for the system designer to incorporate in DPM.
42
Yes, but Microsoft … and others have already solved the problem?
43
Operating System Power Management (OSPM)
Supported by Microsoft’s desktop operating systems via APM (Advanced Power Management)
OS/BIOS co-operation
  When the OS goes idle, it accesses a register, which causes an SMI#
  The SMI handler puts the system into a low-power state
APM required the OS to trust the system BIOS
44
Current OSPM - ACPI
Advanced Configuration and Power Interface (ACPI)
  OS visible (SCI-based) as opposed to OS invisible (SMI-based)
  OS/drivers/BIOS are in sync regarding power states
Standard way for the system to describe its device configuration & power control h/w interface to the OS
  register interface for common functions
    system control events, processor power and clock control, thermal management, and resume handling
  info on devices, resources, & control mechanisms
Thermal management
47
ACPI Processor Power States
Latency: C1 < C2 < C3
Power: C1 > C2 > C3
[Figure label: Throttling]
48
Overview of ACPI System States
Columns: CPU; Memory; Devices; Wake Up; Context Tracking
G0 (Working)
  CPU: C0: full speed; C1:C3: executing in PM state (i.e. thermal throttle/HLT)
  Memory: retained; power: ON; refresh: normal
  Devices: powered up & down based on demand (D0-D3)
S1 (Sleeping)
  CPU: not executing; context retained; CPU CLK: OFF; System CLK: ON; power: ON
  Memory: retained; power: ON; refresh: normal
  Devices: power down depending on wakeup & power requirements
  Wake up: lowest latency; CS:IP +1
  Context tracking: H/W responsible for saving context of CPU, System I/O, & Memory
S2 (Sleeping)
  CPU: not executing; CPU/Sys cache context lost; CPU CLK: OFF; System CLK: OFF; power: ON
  Memory: retained; power: ON; refresh: standby/auto
  Devices: power down depending on wakeup & power requirements
  Wake up: latency > S1; boot vector
  Context tracking: H/W responsible for saving context of System I/O & Memory; OS responsible for saving CPU context
S3 (Sleeping)
  CPU: not executing; CPU/cache context lost; CPU CLK: OFF; System CLK: OFF; power: OFF
  Memory: retained; power: ON; refresh: standby/auto
  Devices: power down depending on wakeup & power requirements
  Wake up: latency > S2; boot vector
  Context tracking: H/W responsible for saving Memory context; BIOS restores Memory Controller context; OS responsible for saving CPU & System I/O context
S4 / S4BIOS (Sleeping)
  CPU: not executing; CPU/cache context lost; everything: OFF
  Memory: context lost; power: OFF; refresh: N/A
  Devices: power down depending on wakeup & power requirements
  Wake up: latency > S3; boot vector
  Context tracking: OS (S4) / BIOS (S4bios) is responsible for saving and restoring all system context, including memory
G2/S5 (Soft OFF)
  CPU: OFF
  Memory: OFF
  Devices: OFF; a power button press will wake up the system
  Wake up: latency > S4; boot vector
  Context tracking: OS uses S5 to turn the machine off
Notes:
  The OS chooses the lowest supported sleep state in which all enabled wakeup devices still function under the latency requirements from apps.
  ASL binds each Sx state to a SLP_TYP value, which, based on the platform design of power planes & clocking logic, determines what portions of the h/w power down.
  For each device, ASL lists which power resources are needed to maintain a ‘wakeup’ capable state.
  ‘System I/O’ refers to motherboard devices: PIT, PIC, DMAC, NMI state, etc. The OS saves & restores these for S3.
49
Summary of functional areas covered by ACPI
System power management
  ACPI defines mechanisms for putting the computer as a whole in and out of system sleeping states.
Device power management
  ACPI tables describe devices, their power states, the power planes the devices are connected to, and controls for putting devices into different power states.
Processor power management
  While the OS is idle but not sleeping, it will use commands described by ACPI to put processors in low-power states.
Device and processor performance management
  DPM to achieve a desirable balance between performance and energy by transitioning devices and processors into different states when the system is active.
50
ACPI functionalities (cont.)
Plug and Play
  hierarchically arranged device and configuration information
System events
  a general event mechanism for system events such as thermal events, power management events, docking, device insertion and removal, and so on
Battery management
  either through a Smart Battery subsystem interface controlled by the OS directly through the embedded controller interface, or through a Control Method Battery interface
Thermal management
  provides a model to allow OEMs to define thermal zones, thermal indicators, and methods for cooling thermal zones
Embedded controller
  a standard h/w and s/w interface between the OS and the embedded controller allows any OS to provide a standard bus enumerator that can communicate directly with an embedded controller in the system, thus allowing other drivers within the system to communicate with and use the resources of system embedded controllers
51
Microsoft’s OnNow
Win32 API extension allows applications to
  affect the power management decision making
  adapt to power state
    find out if running on batteries so as to reduce processing
    discover disk state & postpone low-priority I/O, e.g. paging
Requires changes in hardware, firmware (BIOS), OS, and application software
  bus & device power management standards for h/w
  ACPI interface standard between OS & hardware
  integration of power management into app control
52
OnNow Components Ref.: Microsoft’s “OnNow Power Management Architecture for Applications”
53
OnNow Architecture
User’s view: system is either on or off
Reality: system transitions among a number of “power states” according to the OS’s power policy
Global power states
  working: apps are executing
  sleep: software is not executing, & CPU is stopped
    OS tracks the user’s activities & application execution states to decide when to enter sleep
      monitors user input and hints from applications
    wake-up is time-based or device-based
  off: system has shut down and must reboot