Realizing Closed-loop, Online Tuning and Control for Configurable-Cache Embedded Systems: Progress and Challenges Islam S. Badreldin*, Ann Gordon-Ross*,

Slides:



Advertisements
Similar presentations
Chapter 3 Embedded Computing in the Emerging Smart Grid Arindam Mukherjee, ValentinaCecchi, Rohith Tenneti, and Aravind Kailas Electrical and Computer.
Advertisements

Feedback Control Real-Time Scheduling: Framework, Modeling, and Algorithms Chenyang Lu, John A. Stankovic, Gang Tao, Sang H. Son Presented by Josh Carl.
1 Fast Configurable-Cache Tuning with a Unified Second-Level Cache Ann Gordon-Ross and Frank Vahid* Department of Computer Science and Engineering University.
1 A Self-Tuning Cache Architecture for Embedded Systems Chuanjun Zhang*, Frank Vahid**, and Roman Lysecky *Dept. of Electrical Engineering Dept. of Computer.
1 A Self-Tuning Configurable Cache Ann Gordon-Ross and Frank Vahid* Department of Computer Science and Engineering University of California, Riverside.
*time Optimization Heiko, Diego, Thomas, Kevin, Andreas, Jens.
University of Minho School of Engineering Centre Algoritmi Uma Escola a Reinventar o Futuro – Semana da Escola de Engenharia - 24 a 27 de Outubro de 2011.
A Self-Tuning Cache Architecture for Embedded Systems Chuanjun Zhang, Vahid F., Lysecky R. Proceedings of Design, Automation and Test in Europe Conference.
A Highly Configurable Cache Architecture for Embedded Systems Chuanjun Zhang, Frank Vahid and Walid Najjar University of California, Riverside ISCA 2003.
1 Using A Multiscale Approach to Characterize Workload Dynamics Characterize Workload Dynamics Tao Li June 4, 2005 Dept. of Electrical.
Adaptive Cache Compression for High-Performance Processors Alaa R. Alameldeen and David A.Wood Computer Sciences Department, University of Wisconsin- Madison.
HW/SW Co-Synthesis of Dynamically Reconfigurable Embedded Systems HW/SW Partitioning and Scheduling Algorithms.
A One-Shot Configurable- Cache Tuner for Improved Energy and Performance Ann Gordon-Ross 1, Pablo Viana 2, Frank Vahid 1, Walid Najjar 1, and Edna Barros.
Research Directions for On-chip Network Microarchitectures Luca Carloni, Steve Keckler, Robert Mullins, Vijay Narayanan, Steve Reinhardt, Michael Taylor.
ECE 510 Brendan Crowley Paper Review October 31, 2006.
1 Presenter: Chien-Chih Chen Proceedings of the 2002 workshop on Memory system performance.
Automatic Tuning of Two-Level Caches to Embedded Applications Ann Gordon-Ross and Frank Vahid* Department of Computer Science and Engineering University.
Exploring the Tradeoffs of Configurability and Heterogeneity in Multicore Embedded Systems + Also Affiliated with NSF Center for High- Performance Reconfigurable.
“Early Estimation of Cache Properties for Multicore Embedded Processors” ISERD ICETM 2015 Bangkok, Thailand May 16, 2015.
CPACT – The Conditional Parameter Adjustment Cache Tuner for Dual-Core Architectures + Also Affiliated with NSF Center for High- Performance Reconfigurable.
Korea Univ B-Fetch: Branch Prediction Directed Prefetching for In-Order Processors 컴퓨터 · 전파통신공학과 최병준 1 Computer Engineering and Systems Group.
Computer Science Open Research Questions Adversary models –Define/Formalize adversary models Need to incorporate characteristics of new technologies and.
1 of 20 Phase-based Cache Reconfiguration for a Highly-Configurable Two-Level Cache Hierarchy This work was supported by the U.S. National Science Foundation.
Embedding Constraint Satisfaction using Parallel Soft-Core Processors on FPGAs Prasad Subramanian, Brandon Eames, Department of Electrical Engineering,
U N I V E R S I T Y O F S O U T H F L O R I D A The basic idea is to start from a difference equation with unknown parameters and orders in the following.
A Single-Pass Cache Simulation Methodology for Two-level Unified Caches + Also affiliated with NSF Center for High-Performance Reconfigurable Computing.
Approximate Dynamic Programming Methods for Resource Constrained Sensor Management John W. Fisher III, Jason L. Williams and Alan S. Willsky MIT CSAIL.
A S ELF -T UNING C ACHE ARCHITECTURE FOR E MBEDDED S YSTEMS Chuanjun Zhang, Frank Vahid and Roman Lysecky Presented by: Wei Zang Mar. 29, 2010.
Survey of Real Time Databases Telvis Calhoun CSc 6710.
Adaptive Multi-Threading for Dynamic Workloads in Embedded Multiprocessors 林鼎原 Department of Electrical Engineering National Cheng Kung University Tainan,
Design, Optimization, and Control for Multiscale Systems
Dynamic Phase-based Tuning for Embedded Systems Using Phase Distance Mapping + Also Affiliated with NSF Center for High- Performance Reconfigurable Computing.
1 Elke. A. Rundensteiner Worcester Polytechnic Institute Elisa Bertino Purdue University 1 Rimma V. Nehme Microsoft.
Memory Hierarchy Adaptivity An Architectural Perspective Alex Veidenbaum AMRM Project sponsored by DARPA/ITO.
1 of 21 Online Algorithms for Wireless Sensor Networks Dynamic Optimization Arslan Munir 1, Ann Gordon-Ross 2+, Susan Lysecky 3, and Roman Lysecky 3 1.
Analysis of Cache Tuner Architectural Layouts for Multicore Embedded Systems + Also Affiliated with NSF Center for High- Performance Reconfigurable Computing.
Thermal-aware Phase-based Tuning of Embedded Systems + Also Affiliated with NSF Center for High- Performance Reconfigurable Computing This work was supported.
PANEL Ubiquitous Systems Ubiquity for Everyone: What is Missing? Ann Gordon-Ross Department of Electrical and Computer Engineering University of Florida,
1 of 20 Low Power and Dynamic Optimization Techniques for Power-Constrained Domains Ann Gordon-Ross Department of Electrical and Computer Engineering University.
1 CMP-MSI.07 CARES/SNU A Reusability-Aware Cache Memory Sharing Technique for High Performance CMPs with Private Caches Sungjune Youn, Hyunhee Kim and.
CISC Machine Learning for Solving Systems Problems Microarchitecture Design Space Exploration Lecture 4 John Cavazos Dept of Computer & Information.
Lightweight Runtime Control Flow Analysis for Adaptive Loop Caching + Also Affiliated with NSF Center for High- Performance Reconfigurable Computing Marisha.
Exploiting Dynamic Phase Distance Mapping for Phase-based Tuning of Embedded Systems + Also Affiliated with NSF Center for High- Performance Reconfigurable.
Criticality Aware Smart Spaces T. Mukherjee Impact Lab ( Department of Computer Science & Engineering Ira A. Fulton School of Engineering.
Resource Optimization for Publisher/Subscriber-based Avionics Systems Institute for Software Integrated Systems Vanderbilt University Nashville, Tennessee.
Closing the Query Processing Loop in Oracle 11g Allison Lee, Mohamed Zait.
Dynamic Resource Allocation for Shared Data Centers Using Online Measurements By- Abhishek Chandra, Weibo Gong and Prashant Shenoy.
Marilyn Wolf1 With contributions from:
A Dynamic Scheduling Framework for Emerging Heterogeneous Systems
Preface to the special issue on context-aware recommender systems
ANTICIPATORY LOGISTICS
Department of Electrical & Computer Engineering
Anne Pratoomtong ECE734, Spring2002
Improved schedulability on the ρVEX polymorphic VLIW processor
Tosiron Adegbija and Ann Gordon-Ross+
TLC: A Tag-less Cache for reducing dynamic first level Cache Energy
Haishan Zhu, Mattan Erez
A Cognitive Approach for Cross-Layer Performance Management
Ann Gordon-Ross and Frank Vahid*
Christophe Dubach, Timothy M. Jones and Michael F.P. O’Boyle
An Adaptive Middleware for Supporting Time-Critical Event Response
Tapestry: Reducing Interference on Manycore Processors for IaaS Clouds
Dynamic Code Mapping Techniques for Limited Local Memory Systems
Tosiron Adegbija and Ann Gordon-Ross+
A Self-Tuning Configurable Cache
Self-Managed Systems: an Architectural Challenge
Automatic Tuning of Two-Level Caches to Embedded Applications
Dzmitry Kliazovich University of Luxembourg, Luxembourg
Presentation transcript:

Realizing Closed-loop, Online Tuning and Control for Configurable-Cache Embedded Systems: Progress and Challenges Islam S. Badreldin*, Ann Gordon-Ross*, Tosiron Adegbija§, and Mohamad Hammam Alsafrjalani+ Department of Electrical & Computer Engineering *University of Florida, §University of Arizona, +University of Miami {ibadreldin,anngordonross}@ufl.edu, tosiron@email.arizona.edu, alsafrjalani@miami.edu

Introduction and Motivation Embedded systems have become ubiquitous Stringent design constraints: energy, performance, Quality of Experience (QoE) expectations, cost Increasingly complex memory and compute requirements Must operate in unknown and changing environments, and demands Application characteristics User expectations Feature complex heterogeneous processors to satisfy execution demands and diversity

The Cache Subsystem Caches can consume up to 50% of total processor power Cache optimization more important in heterogeneous multicore systems Caches must be adaptable to varying execution demands, environments, and user QoE expectations/feedback I.e., configurable caches Cache design space becomes extremely large when considering QoE expectations Cache tuning becomes more complex

Configurable Caches

Cache Tuning in Heterogeneous Cores New dynamic cache tuning methods required When to reconfigure the cache Incorporating runtime time-varying user QoE expectations User-defined optimization metrics Data sharing, cache coherence, etc. Impact of cache tuning on user experience Must not degrade user experience during cache tuning Need online, closed-loop, user-aware cache tuning systems

Overview of Cache Tuning Methods (Design space exploration) Offline DSE Online DSE Simulation-based Subsetting Heuristics Analytical methods Static configs Near-optimal configs Explore 1% of design space Eliminate DSE

Problem Formulation Set of n applications 𝐴={ 𝑎 1 , 𝑎 2 ,…, 𝑎 𝑛 } 𝐴={ 𝑎 1 , 𝑎 2 ,…, 𝑎 𝑛 } Set of m possible cache configurations 𝐶={ 𝑐 1 , 𝑐 2 ,…, 𝑐 𝑚 } Set of ki application phases per application ai 𝑃 𝑎 𝑖 ={ 𝑝 1 , 𝑝 2 ,…, 𝑝 𝑘 𝑖 } Energy-related optimization metric 𝑒( 𝑎 𝑖 , 𝑝 𝑘 , 𝑐 𝑗 ) Performance-related optimization metric 𝑢( 𝑎 𝑖 , 𝑝 𝑘 , 𝑐 𝑗 ) Pareto-optimal design 𝑐 0 ∈𝐶 𝑠.𝑡. 𝑒 𝑎 𝑖 , 𝑝 𝑘 , 𝑐 0 ≤𝑒 𝑎 𝑖 , 𝑝 𝑘 , 𝑐 𝑗 with ‘<‘ for some 𝑐 𝑗 & 𝑢 𝑎 𝑖 , 𝑝 𝑘 , 𝑐 0 ≤𝑢 𝑎 𝑖 , 𝑝 𝑘 , 𝑐 𝑗 with ‘<‘ for some 𝑐 𝑗 Energy Execution time Pareto optimal designs

Online Cache Tuning and Control Goal: continuously monitor and adapt to changing user and environment requirements Must be transparent to both user and application Need a feedback control system for self-monitoring and online adjustment Requirements of a self-tuning cache system 1. Monitor and collect information about system behavior 2. Dynamically reconfigure tunable parameter values 3. Determine when to change configurations in response to phase changes without incurring significant penalties 4. Adapt to user QoE expectations; limit QoE degradation

Proposed Ideal Self-Tuning Cache System Controller (tuner): incorporates QoE awareness; phase-based statistical model (linear/non-linear adaptive filters) Temporal execution data: consider history of execution statistics and performance metrics as time series No need to explicitly perform phase classification Operate at high temporal resolution while dynamically tuning interval Plant (cache subsystem): single level, multilevel, multicore

Challenges of Online Cache Tuning and Control Avoiding QoE degradation due to DSE overheads  use analytical and prediction-based models to limit the DSE to a few best options Performing well even for applications that are unknown at design time  use online adaptation and control to update the model data in real-time Operating at an optimal tuning interval  operate at high temporal resolution using online filtering while dynamically adjusting the tuning interval Tracking variations in QoE expectations  incorporate sensor values and historical data to directly predict best configurations given an estimated QoE expectation Incorporating user feedback  combine both intrusive and non-intrusive techniques to solicit user feedback to identify what the user considers a best configuration

Conclusions Efficient self-tuning caches critical for emerging resource-constrained computers May operate in unknown environments Wide variety of applications Different kinds of users may require different kinds of runtime tuning Several challenges exist for realizing self-tuning cache that is transparent to both the user and the executing applications Minimizing DSE overheads Self-tuning for unknown applications Meeting user QoE expectations Etc. Future work: explore proposed ideal architecture towards addressing the challenges Use an online, adaptive statistical model to predict best configuration given historical data, user feedback, sensor readings, etc. QoE-aware optimization, and dynamic adjustment of the tuning interval