Task Graph Scheduling for RTR Paper Review By Gregor Scott.

Paper Objective
The paper provides an ILP formulation that minimizes schedule length on partially dynamically reconfigurable architectures. It presents a heuristic scheduler designed to exploit hardware module reuse. It analyzes these scheduling methods to address problems in current HW/SW co-design.

Introduction
An FPGA solution loaded with a single configuration at the end of the design phase can be termed Compile Time Reconfiguration (CTR). Technology now allows FPGAs to be reconfigured between different stages of computation. If a hardware application is bigger than the FPGA fabric allows, it must be partitioned into pieces that fit. Classical HW/SW co-design must be improved to take advantage of FPGAs that support dynamic and partial reconfiguration.

Target Device and Context
(a) In 1D reconfiguration, modules are confined to columns. (b) In 2D reconfiguration, modules can consume less space on the FPGA, allowing more efficient use of the fabric. Xilinx Virtex-4 and Virtex-5 FPGAs allow dynamic 2D partial reconfiguration like this.

Target Device and Context
Module reuse means multiple tasks can be completed with a single configuration.
A deconfiguration policy is a set of rules used to decide how to remove a configuration module from the FPGA.
Antifragmentation techniques avoid fragmentation of the FPGA space.
Configuration prefetching means a module is loaded onto the FPGA as soon as possible.

Target Device and Context
A 2D reconfigurable platform is modeled as a grid of reconfigurable units (RUs). Each cell is identified by its (row, column) pair (r,c). An application is provided as a set of tasks in the form of a Directed Acyclic Graph (DAG). Each task can be executed by one of a set of execution units (EUs); each EU corresponds to a particular configuration of RUs on the device and has a configuration bitstream.
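
The model on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the paper's notation: the names ExecutionUnit, Task, cells_covered and fits, and the choice to describe an EU by a rectangular footprint anchored at its top-left RU, are assumptions made here.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ExecutionUnit:
    """One hardware implementation of a task (hypothetical field names)."""
    name: str
    rows: int         # footprint height, in RUs
    cols: int         # footprint width, in RUs
    latency: int      # execution time
    reconf_time: int  # time needed to load the bitstream

@dataclass
class Task:
    """A node of the application DAG."""
    name: str
    implementations: list                              # candidate ExecutionUnits
    predecessors: list = field(default_factory=list)   # names of parent tasks

def cells_covered(eu, r, c):
    """Set of (row, column) RUs used when the EU's top-left corner is at (r, c)."""
    return {(r + dr, c + dc) for dr in range(eu.rows) for dc in range(eu.cols)}

def fits(eu, r, c, grid_rows, grid_cols, occupied):
    """True if the EU stays inside the grid and overlaps no occupied RU."""
    if r + eu.rows > grid_rows or c + eu.cols > grid_cols:
        return False
    return cells_covered(eu, r, c).isdisjoint(occupied)
```

For example, a 2x3 EU placed at (0, 0) on a 4x4 grid covers six RUs, and fits rejects the placement as soon as any one of them is already occupied.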

Target Device and Context
For any task with its set of EU implementations, using the latency, size and reconfiguration time of each implementation, a function can be defined that specifies for the task: the EU needed to execute it; the position on the FPGA at which to place the EU; the time at which the EU can be configured, if reuse is not possible; and the time at which execution can start.
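
The four outputs of that function map naturally onto a small record. The sketch below is illustrative only: ScheduleEntry and is_consistent are hypothetical names, and using None to mark a reused module (so no reconfiguration is scheduled) is an assumption made here, not something taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class ScheduleEntry:
    """What the scheduling function assigns to one task (hypothetical names)."""
    eu: str                      # chosen EU implementation
    position: Tuple[int, int]    # (row, column) where the EU is placed
    reconf_start: Optional[int]  # None when an already-loaded module is reused
    exec_start: int              # time at which execution begins

def is_consistent(entry, reconf_time, preds_finish):
    """Check two basic timing rules for a single entry.

    1. If the EU must be configured, execution cannot start before the
       reconfiguration has completed.
    2. Execution cannot start before every predecessor has finished.
    """
    if entry.reconf_start is not None:
        if entry.exec_start < entry.reconf_start + reconf_time:
            return False
    return all(entry.exec_start >= finish for finish in preds_finish)
```

An entry with reconf_start=0 and exec_start=5 is consistent for a reconfiguration time of 5, whereas exec_start=4 is not; a reused module only has to wait for its predecessors.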

The ILP formulation for 2D reconfiguration and software execution
A processor must exist in a static area to take care of reconfiguration and to act as a processing element. A task may also execute only in software. The processor and the RUs have separate memory. The communication model and the latency between the processor and the RUs must be considered. The algebra defining the parameters and constraints of the integer linear programming formulation can be found in this section of the paper.
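
For orientation only, the core of a makespan-minimizing formulation over a task DAG usually has the shape below. This is a generic textbook sketch, not the paper's formulation: the symbols $s_i$ (start time of task $i$), $l_i$ (its latency), $E$ (the edge set of the DAG) and $T$ (the schedule length) are assumptions introduced here, and the placement, reconfiguration and communication constraints that the paper adds are omitted.

```latex
\begin{aligned}
\min \quad & T \\
\text{s.t.} \quad & s_j \ge s_i + l_i && \forall (i,j) \in E \\
                  & T \ge s_i + l_i   && \forall i \\
                  & s_i \in \mathbb{Z}_{\ge 0} && \forall i
\end{aligned}
```

The first constraint enforces the precedence edges of the DAG, and the second makes $T$ an upper bound on every finish time, so minimizing $T$ minimizes the schedule length.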

Napoleon: a Heuristic Approach
The algorithm sorts the task set based on dependencies and on potential module reuse by later tasks. It then creates a placement list for EUs that does not leave any RUs empty, and replaces EUs before a task is run if space is available.
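
To make the flavor of such a heuristic concrete, here is a heavily simplified list scheduler. It is not Napoleon itself: it assumes a device split into identical slots rather than a 2D grid, a single reconfiguration port, and one module type per task. The function name schedule and all of its parameters are hypothetical. It does show two ideas from the slides: a slot that already holds the right module is reused without reconfiguration, and a module can be loaded before its task's predecessors finish (prefetching).

```python
from graphlib import TopologicalSorter  # Python 3.9+

def schedule(tasks, deps, n_slots, reconf_time):
    """tasks: {name: (module, latency)}; deps: {name: set of predecessor names}.

    Returns {name: (slot, start, finish)}.
    """
    graph = {t: deps.get(t, set()) for t in tasks}
    order = list(TopologicalSorter(graph).static_order())
    slot_module = [None] * n_slots  # module currently configured in each slot
    slot_free = [0] * n_slots       # time at which each slot becomes idle
    port_free = 0                   # single port: one reconfiguration at a time
    result = {}
    for name in order:
        module, latency = tasks[name]
        # The task is ready once all of its predecessors have finished.
        ready = max((result[p][2] for p in deps.get(name, ())), default=0)
        best = None
        for s in range(n_slots):
            if slot_module[s] == module:
                # Module reuse: no reconfiguration needed.
                reconf = None
                start = max(ready, slot_free[s])
            else:
                # Prefetch: reconfigure as soon as the port and slot allow.
                reconf = max(port_free, slot_free[s])
                start = max(ready, reconf + reconf_time)
            if best is None or start < best[0]:
                best = (start, s, reconf)
        start, s, reconf = best
        if reconf is not None:
            port_free = reconf + reconf_time
            slot_module[s] = module
        slot_free[s] = start + latency
        result[name] = (s, start, start + latency)
    return result
```

With a chain of three tasks that use modules fir, fft and fir, two slots let the third task reuse the first task's module, while a single slot forces a reconfiguration before every task and lengthens the schedule.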

Napoleon: a Heuristic Approach
Figure: an example of scheduled tasks.

Napoleon: Task Graph and RU layout

Results ILP vs. Napoleon
The heuristic Napoleon approach achieves schedule times very close to those of the ILP.

Results ILP vs. Napoleon
ILP takes much longer to compute than Napoleon.

Conclusion
The first goal of the paper was to propose a model to assist in hardware/software co-design of runtime reconfigurable systems. The second goal was to introduce a heuristic approach that can obtain good results in a reasonable amount of time. The next step is to extend Napoleon to work online and schedule tasks at runtime.

Review opinions
Long.
Results not very grounded.
Explains problems and premise well.
Presents new concepts well.
Interesting and easy to read.
Thank you. Questions?