Integrating Fine-Grained Application Adaptation with Global Adaptation for Saving Energy Vibhore Vardhan, Daniel G. Sachs, Wanghong Yuan, Albert F. Harris,


Integrating Fine-Grained Application Adaptation with Global Adaptation for Saving Energy Vibhore Vardhan, Daniel G. Sachs, Wanghong Yuan, Albert F. Harris, Sarita V. Adve, Douglas L. Jones, Robin H. Kravets, and Klara Nahrstedt Computer Science and Electrical & Computer Engineering University of Illinois at Urbana-Champaign GRACE

Motivation
Goal: energy-efficient mobile multimedia systems.
Opportunity: dynamic resource variations. Use adaptation to respond to changes.
Adapt all system layers: hardware, network, operating system, application, …
All layers must adapt cooperatively to minimize energy while meeting current resource constraints.
→ GRACE: Global Resource Adaptation through CoopEration

Challenges in Cross-Layer Adaptation - I
What to adapt, and when to adapt?
Ideally: all layers and all apps, frequently. This is expensive.
Prior work: either all layers and all apps, infrequently (GRACE-1), or one app or one system layer, frequently.
GRACE solution: hierarchical adaptation with three adaptation levels (global, per-app, and internal), ranging from infrequent to frequent but limited in scope.

Challenges in Cross-Layer Adaptation - II
Implementing cross-layer hierarchical adaptation is difficult: multiple adaptations at multiple time granularities.
What information to expose at each layer?
How and when to communicate information between layers?
→ Interfaces need to be well designed.

Contributions
Implementation of hierarchical adaptation on a real system.
Significant energy savings from hierarchical adaptation.

Overview
GRACE hierarchy: global, per-application, internal
System layers and adaptations for GRACE-2
Adaptation algorithms
Results
Summary

Global Adaptation
Adapts all applications and system layers.
Goal: for all apps, choose the app, CPU, network, … configuration that minimizes system energy subject to the CPU, network, … constraints.
Expensive, so triggered only on large changes, e.g., when an app enters or exits.
Adapts for long-term resource demands.
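
One way to write the global adaptation goal as an optimization problem is sketched below. The symbols are assumptions introduced for illustration and do not appear in the slides: x_{ij} = 1 when application i runs in configuration j, E_{ij} is the energy of that choice, c_{ij} and b_{ij} are its CPU and network demands, and C and B are the available CPU and network capacity. The sketch also assumes that energy adds up across applications, which may not hold exactly on a real system.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i=1}^{k} \sum_{j} E_{ij}\, x_{ij} \\
\text{subject to}\quad
& \sum_{i=1}^{k} \sum_{j} c_{ij}\, x_{ij} \le C, \qquad
  \sum_{i=1}^{k} \sum_{j} b_{ij}\, x_{ij} \le B, \\
& \sum_{j} x_{ij} = 1 \ \text{for each } i, \qquad x_{ij} \in \{0, 1\}.
\end{aligned}
```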

Per-Application Adaptation
Considers one application at a time and adapts all layers.
The global adaptation decision serves as the resource allocation.
Goal: for a single app, choose the app, CPU, network, … configuration that minimizes system energy subject to the CPU, network, … allocation from global adaptation.
Triggered every frame.
Adapts to the resource demand of the next frame.

Internal Adaptation
Adapts a single system layer several times per frame.
Not visible to the rest of the system.
Respects the resource allocation from global adaptation.

The CPU Layer
CPU adaptation: DVFS on a Pentium-M processor.
The processor has discrete DVFS points; continuous DVFS is emulated [Ishihara 98].
Adaptation decisions are made at the global and per-app levels.
A CPU energy model is used by the adaptation algorithm.
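
A common way to emulate a continuous frequency on a processor with discrete DVFS points is to split the execution time between the two supported frequencies adjacent to the target. The Python sketch below illustrates that idea only; the function name and the frequency values in the usage note are hypothetical and are not taken from the slides or from the Pentium-M specification.

```python
def split_between_dvfs_points(cycles, target_hz, points_hz):
    """Split `cycles` of work between the two discrete frequencies that
    bracket `target_hz`, so the work finishes in cycles / target_hz seconds.

    Returns ((f_lo, t_lo), (f_hi, t_hi)) with times in seconds.
    """
    points = sorted(points_hz)
    if not points[0] <= target_hz <= points[-1]:
        raise ValueError("target frequency outside the supported range")
    # Nearest supported frequencies at or below and at or above the target.
    f_lo = max(f for f in points if f <= target_hz)
    f_hi = min(f for f in points if f >= target_hz)
    total_time = cycles / target_hz
    if f_lo == f_hi:
        # The target is itself a supported point; no split is needed.
        return (f_lo, total_time), (f_hi, 0.0)
    # Solve t_lo + t_hi = total_time and f_lo*t_lo + f_hi*t_hi = cycles.
    t_hi = (cycles - f_lo * total_time) / (f_hi - f_lo)
    t_lo = total_time - t_hi
    return (f_lo, t_lo), (f_hi, t_hi)
```

For example, with hypothetical points of 600, 800, and 1000 MHz and a target of 700 MHz, the work is split evenly in time between 600 MHz and 800 MHz.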

The Application Layer
Adaptive H.263 encoder [Sachs 99].
Adaptation decisions are made at the global and per-app levels.
Adaptation trades off network energy against CPU energy: a choice between more or less compression.
Drops DCT and motion search computations based on adaptive thresholds.
No impact on user perception.

The OS Scheduler Layer
Earliest-deadline-first soft real-time scheduler.
Enforces budget allocations for CPU time and bandwidth.
Adapted at the global and internal levels.
The scheduler supports budget sharing [Caccamo 00]: unused budget is shared between applications, which reduces the number of deadline misses.
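
The scheduling policy described on this slide can be illustrated with a small Python sketch: pick the unfinished task with the earliest deadline, and let a task that has exhausted its own budget run only when spare budget from a finished task is available. This is a simplified illustration under assumed names (`Task`, `pick_next`, `finish`); it is not the GRACE-2 scheduler and it does not reproduce the budget-sharing algorithm of [Caccamo 00].

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    deadline: float  # absolute deadline of the current frame, in seconds
    budget: float    # CPU time still allocated for this frame, in seconds
    done: bool = False


def pick_next(tasks, spare=0.0):
    """Return the runnable task with the earliest deadline, or None.

    A task is runnable if it is unfinished and either has budget left
    or spare budget donated by other tasks is available.
    """
    runnable = [t for t in tasks if not t.done and (t.budget > 0 or spare > 0)]
    return min(runnable, key=lambda t: t.deadline, default=None)


def finish(task):
    """Mark a task finished and return its unused budget for sharing."""
    task.done = True
    spare, task.budget = task.budget, 0.0
    return spare
```

In this sketch, a task that finishes early returns its leftover budget, which the caller can pass to `pick_next` so that a task that has overrun its own budget still gets a chance to meet its deadline.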

The Network Layer
Non-adaptive network layer (not implemented).
Fixed (available) network bandwidth for each experiment: 2 Mbps to 11 Mbps in an 802.11b WLAN.
A network energy model is used by the adaptation algorithm.

Adaptations in GRACE-2

                                                                Hierarchy level
Layer        Adaptation                                         Global  Per-app  Internal
CPU          Dynamic voltage and frequency scaling (DVFS)       √       √        X
Application  Drop DCT and motion estimation computations        √       √        X
             based on adaptive thresholds
Scheduler    Change CPU time, network bandwidth budget          √       X        √

Global Adaptation (1 of 2)
Invoked on large changes in the system, e.g., when an application enters or exits.
Goal: for all apps, choose the app + CPU configuration that minimizes CPU + network energy subject to CPU and network bandwidth constraints.
This is a multidimensional multiple-choice knapsack problem (MMKP), solved using heuristics and brute force.

Global Adaptation (2 of 2)
[Diagram: applications App 1 … App k report their CPU time and network bytes (long-term history, 95th percentile) to the global controller, which returns a CPU and network allocation. Each application has app configs 1 … n, each with CPU configs 1 … m.]
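
The brute-force side of the global search can be sketched in Python as follows: summarize each application's long-term demand by its 95th percentile, then try every combination of one configuration per application and keep the feasible combination with the lowest energy. The function names, the tuple layout, and the assumption that energy and demands simply add across applications are illustrative; the heuristics mentioned in the slides are not shown.

```python
from itertools import product


def percentile_95(samples):
    """Nearest-rank 95th percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, -(-95 * len(ordered) // 100))  # ceil(0.95 * n)
    return ordered[rank - 1]


def global_adapt(apps, cpu_capacity, net_capacity):
    """Exhaustively choose one configuration per application.

    `apps` maps an app name to a list of (energy, cpu_demand, net_demand)
    tuples, one per candidate configuration. Returns (choice, energy),
    where `choice` maps each app name to the chosen configuration index,
    or (None, inf) if no combination fits the capacities.
    """
    names = list(apps)
    best_choice, best_energy = None, float("inf")
    for combo in product(*(range(len(apps[n])) for n in names)):
        picked = [apps[n][j] for n, j in zip(names, combo)]
        energy = sum(p[0] for p in picked)
        cpu = sum(p[1] for p in picked)
        net = sum(p[2] for p in picked)
        if cpu <= cpu_capacity and net <= net_capacity and energy < best_energy:
            best_choice, best_energy = dict(zip(names, combo)), energy
    return best_choice, best_energy
```

Exhaustive search grows exponentially with the number of applications, which is one reason an approach like this would only be practical when it runs infrequently or is combined with heuristics.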

Per-app Adaptation (1 of 2)
Invoked at the start of an application frame.
Goal: for a single app, choose the app + CPU configuration that minimizes CPU + network energy subject to the CPU and network allocation from global adaptation.

Per-app Adaptation (2 of 2)
[Diagram: the per-app controller for App i takes the app's CPU time and network bytes (short-term history, linear predictor) and chooses the app and CPU config from app configs 1 … n, each with its own set of CPU configs.]
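
A per-frame decision of this kind can be sketched in Python as follows: predict the next frame's CPU time and network bytes for each candidate configuration from its short-term history, then pick the lowest-energy configuration whose predicted demand fits the allocation from global adaptation. The weighted-sum predictor, its weights, the dictionary layout, and the function names are assumptions for illustration; the slides do not specify the predictor's form beyond calling it linear.

```python
def predict_next(history, weights=(0.5, 0.3, 0.2)):
    """Linear prediction: normalized weighted sum of the most recent samples.

    `history` is a non-empty list, oldest sample first. The first weight
    applies to the newest sample.
    """
    recent = history[-len(weights):][::-1]  # newest sample first
    used = weights[:len(recent)]
    return sum(w * x for w, x in zip(used, recent)) / sum(used)


def per_app_adapt(configs, cpu_alloc, net_alloc):
    """Pick the lowest-energy config whose predicted demand fits the allocation.

    `configs` maps a config name to a dict with keys "energy", "cpu"
    (history of CPU time per frame), and "net" (history of bytes per frame).
    Returns the chosen config name, or None if nothing fits.
    """
    best_name, best_energy = None, float("inf")
    for name, cfg in configs.items():
        cpu_need = predict_next(cfg["cpu"])
        net_need = predict_next(cfg["net"])
        fits = cpu_need <= cpu_alloc and net_need <= net_alloc
        if fits and cfg["energy"] < best_energy:
            best_name, best_energy = name, cfg["energy"]
    return best_name
```

Because the decision is made per frame from recent history, a configuration that would violate the allocation only for occasional large frames can still be used for the frames where it fits.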

GRACE-2 System - Architecture (1/3): global controller in action
[Diagram. Components: Application, Per-app Controller, OS Scheduler, Global Controller, CPU, Network, with Adaptor, Monitor, and Predictor modules. Flows shown: long-term resource demands; allocated time, bandwidth; allocated time, bandwidth, energy.]

GRACE-2 System - Architecture (2/3): per-app controller in action
[Same diagram with additional flows: app config; next frame's resource demands; frequency.]

GRACE-2 System - Architecture (3/3): OS scheduler in action
[Same diagram with additional flows: bandwidth; frequency; status: energy; miss, overrun; cycles usage.]

GRACE-2 System - Implementation
Implemented on a ThinkPad R40 laptop running Linux.
Everything except the network layer is implemented.
All results include global adaptation in all layers.
Global adaptation saves 32% energy on average over the base system.

Experimental Methodology - Workloads
Evaluated remote sensing and teleconferencing type applications.
Combinations of speech and video encoders and decoders, with multiple encoders and/or decoders per workload.
Standard video and audio input streams.
Only the H.263 video encoder is adaptive.
4 resource constraint scenarios (varying period and bandwidth → 16 workloads): unconstrained, only CPU constrained, only network constrained, both constrained.

Experimental Methodology - Energy
Measured entire system energy using a sampling power supply, including display, disk, and memory system.
Modeled network energy is added to the measurements.
Isolated CPU + network energy using the CPU and network models; the models are applied to the implemented system.
The first set of results is based on these models.

Overview
GRACE hierarchy
System layers and adaptations for GRACE-2
Adaptation algorithms
Results: CPU + network, system
Summary

CPU + Network (Model) Energy Savings (1/3)
Per-app CPU adaptation gives modest savings: 4% to 10%, average 7%.

CPU + Network (Model) Energy Savings (2/3)
Per-app application adaptation saves significant energy over global: 9% to 18%, average 14%.

CPU + Network (Model) Energy Savings (3/3)
GRACE-2 = global + per-app CPU + per-app application adaptation.
Saves significant energy over global: 18% to 35%, average 27%.
This exceeds the sum of the savings from only per-app CPU and only per-app application adaptation (7% + 14% = 21% on average).

CPU + Network (Model) - Analysis
CPU energy > network energy.
The app config that does the least compression uses the least energy; this is true for all constraint scenarios.
Bytes generated by some frames > bandwidth → global will not use this config.
Per-app has better predictions, and hence better resource utilization.

Results - Measured Energy Savings
GRACE-2's per-app adaptation saves noticeable system energy.
Network-constrained workloads benefit most.
Savings are between 7% and 14%, with an average of 10%.
This is in addition to the savings from global adaptation.
Measurements include display, disk, and memory system power.

Summary
Goal: energy-efficient mobile multimedia systems.
GRACE uses hierarchical cross-layer adaptations in all layers.
Our focus: per-app adaptations.
Per-app adaptation is effective under network constraints: better utilization of resources based on better predictions, with 27% savings over global.
Combining per-app adaptations gives more than additive savings.

Current/Future Work
Network implementation
Integrating reliability
Other application adaptations
Improving per-app predictors